DC & CV Lab. CSIE NTU
Chapter 4
Model fitting and optimization
Advanced Computer Vision
Computer Vision:
Algorithms and Applications
Presented by: 傅楸善 (Chiou-Shann Fuh) & 蕭延儒 E-mail: luis862013@gmail.com
Mobile: 0978193029 Course instructor: Prof. Chiou-Shann Fuh
Outline
4.1 Scattered data interpolation
4.2 Variational methods and regularization
4.3 Markov random fields
4.1 Scattered data interpolation
4.1.1 Radial basis functions
4.1.2 Overfitting and underfitting
4.1.3 Robust data fitting
4.1.0 Scattered data interpolation
Example of scattered data interpolation
Given scattered data points {(x_k, d_k)}, produce a function f(x) such that f(x_k) = d_k for all k.
4.1.0 Scattered data interpolation
Example of scattered data interpolation
1. It requires the function to pass through the data points.
2. The data points are irregularly placed throughout the domain.
4.1.0 Scattered data interpolation
Common methods:
1. Triangulation & interpolation
2. Pull-push algorithms
4.1.0 Scattered data interpolation
Triangulation & interpolation
[Figure: scattered data → triangulation → interpolation]
4.1.0 Scattered data interpolation
Triangulation & interpolation
Triangulation: division of the domain into triangles.
A good triangulation should avoid producing long, skinny (sliver) triangles.
4.1.0 Scattered data interpolation
Triangulation & interpolation
Interpolation
Barycentric coordinates are usually used to do interpolation.
If a smoother surface is desired, we can use higher-order splines.
Cubic spline interpolation
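As a rough illustration of the barycentric step, the sketch below interpolates a value at a point inside a single triangle; the triangle coordinates and vertex values are made-up illustration data, and the triangulation and point-location steps are assumed to be done elsewhere.

```python
import numpy as np

def barycentric_interpolate(p, tri, values):
    """Interpolate a value at point p inside triangle tri.

    tri: 3x2 array of triangle vertex coordinates.
    values: the 3 data values at those vertices.
    """
    a, b, c = tri
    # Solve the 2x2 system for the first two barycentric coordinates;
    # the third follows because the three coordinates sum to 1.
    T = np.array([[a[0] - c[0], b[0] - c[0]],
                  [a[1] - c[1], b[1] - c[1]]])
    w = np.linalg.solve(T, np.asarray(p, dtype=float) - c)
    bary = np.array([w[0], w[1], 1.0 - w[0] - w[1]])
    return float(bary @ np.asarray(values, dtype=float))

# The value at the centroid is the average of the vertex values.
tri = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(barycentric_interpolate([1/3, 1/3], tri, [3.0, 6.0, 9.0]))  # ≈ 6.0
```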
4.1.0 Scattered data interpolation
Pull-push algorithms
[Figure: original image → scattered data → pull-push → interpolation result]
The algorithm is very fast but less accurate.
4.1 Scattered data interpolation
4.1.1 Radial basis functions
4.1.2 Overfitting and underfitting
4.1.3 Robust data fitting
4.1.1 Radial basis functions
Mesh-based approaches are limited to low-dimensional domains.
Here we introduce radial basis functions, a mesh-free approach that extends easily to higher dimensions.
f(x) = ∑_k w_k φ(‖x − x_k‖)
Interpolated function using radial basis functions
4.1.1 Radial basis functions
Some commonly used basis functions include:
The scale parameter controls the size (radial falloff) of the basis function, and hence its smoothness.
4.1.1 Radial basis functions
Given scattered data {(x_k, d_k)}, the interpolant must satisfy f(x_i) = ∑_k w_k φ(‖x_i − x_k‖) = d_i,
where the x_k are the locations of the scattered data points, the φs are the radial basis functions, and the w_k are the local weights.
We need to solve for the desired set of weights {w_k}.
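Solving for the weights amounts to an N × N linear system Φw = d, with Φ_{ik} = φ(‖x_i − x_k‖). A minimal 1-D sketch with a Gaussian basis (the data points and σ are made-up illustration values):

```python
import numpy as np

def gaussian_phi(r, sigma=1.0):
    """Gaussian radial basis function evaluated at distance r."""
    return np.exp(-r**2 / (2 * sigma**2))

def rbf_fit(xs, ds, sigma=1.0):
    """Solve the N x N linear system Phi w = d for the weights w_k."""
    r = np.abs(xs[:, None] - xs[None, :])   # pairwise distances ||x_i - x_k||
    Phi = gaussian_phi(r, sigma)
    return np.linalg.solve(Phi, ds)

def rbf_eval(x, xs, w, sigma=1.0):
    """f(x) = sum_k w_k phi(||x - x_k||)."""
    return gaussian_phi(np.abs(x - xs), sigma) @ w

xs = np.array([0.0, 1.0, 2.0, 3.0])
ds = np.array([1.0, 3.0, 2.0, 0.0])
w = rbf_fit(xs, ds)
# The interpolant passes through the data points (up to numerical precision).
print(rbf_eval(1.0, xs, w))  # ≈ 3.0
```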
4.1.1 Radial basis functions
Solution:
1. Minimizing the data constraint energy together with a weight penalty.
2. Kernel regression
4.1.1 Radial basis functions
Minimizing the data constraint energy together with a weight penalty:
Ɛ({w_k}) = Ɛ_D + λ Ɛ_W
4.1.1 Radial basis functions
Minimizing the data constraint energy together with a weight penalty:
Ɛ({w_k}) = Ɛ_D + λ Ɛ_W
When λ = 0, it becomes a pure least squares problem.
4.1.1 Radial basis functions
Kernel regression
We simply set the weights w_k to the data values d_k:
f(x) = ∑_k d_k φ(‖x − x_k‖)
However, this fails to interpolate the data.
4.1.1 Radial basis functions
Kernel regression
So, we divide the data-weighted sum of basis functions by the sum of all the basis functions:
f(x) = ∑_k d_k φ(‖x − x_k‖) / ∑_k φ(‖x − x_k‖)
While not that widely used in computer vision, kernel regression techniques have been applied to a number of low-level image processing operations.
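A minimal sketch of this normalized (Nadaraya–Watson-style) kernel regression in 1-D, with a Gaussian basis and made-up data; note that it smooths the data rather than interpolating it:

```python
import numpy as np

def kernel_regression(x, xs, ds, sigma=0.5):
    """Weighted average of the d_k, normalized by the sum of all
    basis function values at x."""
    phi = np.exp(-((x - xs) ** 2) / (2 * sigma**2))
    return (phi @ ds) / phi.sum()

xs = np.array([0.0, 1.0, 2.0])
ds = np.array([0.0, 1.0, 0.0])
# The estimate at x = 1 is pulled toward the neighboring values,
# so it does NOT reach d_1 = 1 exactly: this fails to interpolate.
print(kernel_regression(1.0, xs, ds))
```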
4.1 Scattered data interpolation
4.1.1 Radial basis functions
4.1.2 Overfitting and underfitting
4.1.3 Robust data fitting
4.1.2 Overfitting and underfitting
Some data are noisy, so interpolating them exactly makes no sense.
Doing so can produce a lot of spurious wiggles, which is a symptom of overfitting.
4.1.2 Overfitting and underfitting
[Figure: polynomial fits of degree M = 0 and M = 1 (underfitting), M = 3 (plausible), M = 9 (overfitting)]
4.1.2 Overfitting and underfitting
How can we quantify the amount of underfitting and overfitting?
How can we get just the right amount?
Adjust the regularization parameter λ to get different results.
4.1.2 Overfitting and underfitting
[Figure: fits with ln λ = −18 (plausible fit) and ln λ = 0 (underfitting)]
4.1.2 Overfitting and underfitting
Validation Set:
We save some data in a validation set in order to see if the function we compute is overfitting or underfitting.
[Figure: data split into a training set and a validation set]
4.1.2 Overfitting and underfitting
As we vary λ, we typically obtain a curve like the one below:
4.1.2 Overfitting and underfitting
Cross validation:
1. Split the data into K folds.
2. We train for K runs, using a different fold as the validation set each time.
3. Estimate the best result by averaging over all K training runs’ results.
[Figure: the K folds, labeled 1, 2, …, K−1, K] (K = 5 is often used.)
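The three steps above can be sketched as follows; the fitting and prediction callbacks (here a degree-1 polynomial fit on made-up noisy data) are placeholders for whatever model is being validated:

```python
import numpy as np

def k_fold_cv(xs, ds, fit, predict, K=5):
    """Average validation error over K runs, each holding out one fold."""
    idx = np.arange(len(xs))
    folds = np.array_split(idx, K)            # 1. split into K folds
    errors = []
    for k in range(K):                        # 2. K runs, one fold held out
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        model = fit(xs[train], ds[train])
        pred = predict(model, xs[val])
        errors.append(np.mean((pred - ds[val]) ** 2))
    return float(np.mean(errors))             # 3. average over all runs

# Example: cross-validate a degree-1 polynomial fit on noisy linear data.
rng = np.random.default_rng(0)
xs = np.linspace(0, 1, 20)
ds = 2 * xs + 0.01 * rng.standard_normal(20)
err = k_fold_cv(xs, ds,
                fit=lambda x, d: np.polyfit(x, d, 1),
                predict=lambda m, x: np.polyval(m, x))
print(err)  # small mean-squared validation error
```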
4.1 Scattered data interpolation
4.1.1 Radial basis functions
4.1.2 Overfitting and underfitting
4.1.3 Robust data fitting
4.1.3 Robust data fitting
Robust loss function
Robust loss functions give lower weight to larger errors, which are more likely to be outliers.
4.1.3 Robust data fitting
Penalty function
The scale parameter controls the range of residual values that correspond to inliers.
It is often determined based on the expected shape of the outlier distribution.
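One common way to apply a robust penalty is iteratively reweighted least squares (IRLS): alternate a weighted least-squares fit with a reweighting that down-weights large residuals. A sketch for a 1-D line fit with a Huber-style weight; the data, inlier scale c, and iteration count are made-up illustration values:

```python
import numpy as np

def irls_line_fit(xs, ds, c=1.0, iters=20):
    """Fit d ~ a*x + b, down-weighting residuals larger than the
    inlier scale c (Huber-style weights: w = min(1, c/|r|))."""
    A = np.stack([xs, np.ones_like(xs)], axis=1)
    w = np.ones_like(ds)
    for _ in range(iters):
        # Weighted least-squares step.
        sw = np.sqrt(w)
        params, *_ = np.linalg.lstsq(sw[:, None] * A, sw * ds, rcond=None)
        # Reweight: large residuals (likely outliers) get small weights.
        r = A @ params - ds
        w = np.minimum(1.0, c / np.maximum(np.abs(r), 1e-12))
    return params

xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ds = 2 * xs + 1
ds[2] += 50.0                       # one gross outlier
a, b = irls_line_fit(xs, ds, c=0.5)
print(a, b)                         # close to the true slope 2, intercept 1
```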
4.2 Variational methods and regularization
4.2.1 Discrete energy minimization
4.2.2 Total variation
4.2.3 Bilateral solver
4.2.4 Application: Interactive colorization
4.2.0 Variational methods and regularization
The methods in the previous section provide reasonable solutions, but:
1. They cannot directly quantify, and hence optimize, the amount of smoothness in the solution.
2. They offer no local control over where the solution should be discontinuous.
4.2.0 Variational methods and regularization
Variational methods:
In one dimension, smoothness can be measured by a functional such as Ɛ_1 = ∫ f_x^2(x) dx.
Such functionals are often called variational methods,
because they measure the variation (non-smoothness) in a function.
4.2.0 Variational methods and regularization
Variational methods in two dimensions:
Ɛ_1 = ∫ [f_x^2(x, y) + f_y^2(x, y)] dx dy (first-order) or Ɛ_2 = ∫ [f_xx^2 + 2 f_xy^2 + f_yy^2] dx dy (second-order).
However, these smoothness functionals cannot model discontinuities.
4.2.0 Variational methods and regularization
Variational methods in 2-dimensions:
Ɛ_CC = ∫ ρ(x, y) { [1 − τ(x, y)] [f_x^2(x, y) + f_y^2(x, y)] + τ(x, y) [f_xx^2(x, y) + 2 f_xy^2(x, y) + f_yy^2(x, y)] } dx dy
ρ(x, y) controls the continuity of the surface.
τ(x, y) controls how flat the surface wants to be.
4.2.0 Variational methods and regularization
In addition to the smoothness term, variational problems also require a data penalty Ɛ_D.
Ɛ_D = ∑_i [f(x_i, y_i) − d_i]^2
For scattered data interpolation, the data penalty measures the distance between the function and a set of data points.
4.2.0 Variational methods and regularization
In addition to the smoothness term, variational problems also require a data penalty Ɛ_D.
Ɛ_D = ∫ [f(x, y) − d(x, y)]^2 dx dy
For a problem like noise removal, a continuous version of this measure can be used.
4.2.0 Variational methods and regularization
Finally, the two energy penalties are added together into a global energy that can be minimized:
Ɛ = Ɛ_D + λ Ɛ_S
Ɛ_S is the smoothness penalty (a first-order or second-order functional, or some weighted blend such as Ɛ_CC).
λ is the regularization parameter, which controls the smoothness of the solution.
(We can use the methods in 4.1.2 to estimate good values for λ.)
4.2 Variational methods and regularization
4.2.1 Discrete energy minimization
4.2.2 Total variation
4.2.3 Bilateral solver
4.2.4 Application: Interactive colorization
4.2.1 Discrete energy minimization
are optional smoothness weights.
They control the location of horizontal and vertical weakness in the surface.
The exact elements they control depend on the problem itself.
4.2.1 Discrete energy minimization
g_x(i, j) and g_y(i, j) are gradient data constraints used by algorithms such as photometric stereo and Poisson blending.
They are set to zero when simply discretizing the conventional first-order smoothness functional.
4.2.1 Discrete energy minimization
h is the size of the finite element grid. It is only important if the energy is being discretized at a variety of resolutions.
4.2.1 Discrete energy minimization
Crease variables control the locations of creases in the surface.
4.2.1 Discrete energy minimization
w(i, j) controls how strongly the data constraint is enforced.
The two-dimensional discrete data energy is written as:
Ɛ_D = ∑_{i,j} w(i, j) [f(i, j) − d(i, j)]^2
4.2.1 Discrete energy minimization
The total energy of the discretized problem can now be written as a quadratic form:
Ɛ = Ɛ_D + λ Ɛ_S = x^T A x − 2 x^T b + c
x is called the state vector.
A is the Hessian; it encodes the second derivative of the energy function.
b is the weighted data vector.
4.2.1 Discrete energy minimization
Minimizing the quadratic form
Ɛ = Ɛ_D + λ Ɛ_S = x^T A x − 2 x^T b + c
is equivalent to solving the following linear system:
A x = b
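A minimal sketch of this pipeline for 1-D first-order regularization: build the Hessian A and weighted data vector b, then solve A f = b. The signal, weights, and λ are made-up illustration values:

```python
import numpy as np

# 1-D first-order regularization:
#   E = sum_i w_i (f_i - d_i)^2 + lambda * sum_i (f_{i+1} - f_i)^2
# Minimizing this quadratic form reduces to solving A f = b.
n = 5
d = np.array([0.0, 1.0, 10.0, 1.0, 0.0])   # noisy data (spike at the center)
w = np.ones(n)                              # data-constraint weights
lam = 2.0

# The data term contributes diag(w) to the Hessian; the smoothness term
# adds a tridiagonal (graph Laplacian) structure D^T D.
D = np.diff(np.eye(n), axis=0)              # finite-difference operator
A = np.diag(w) + lam * D.T @ D
b = w * d

f = np.linalg.solve(A, b)
print(f)  # smoothed signal: the central spike is attenuated
```

For large grids, A is sparse and the system is usually solved with sparse or iterative solvers (e.g., conjugate gradients) instead of a dense solve.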
4.2.1 Discrete energy minimization
4.2.2 Total variation
4.2.3 Bilateral solver
4.2.4 Application: Interactive colorization
4.2 Variational methods and regularization
4.2.2 Total variation
Today, many regularized problems are formulated using the L1 norm, which is often called total variation:
ρ(x) = |x|
4.2.2 Total variation
It tends to better preserve discontinuities, but still results in a convex problem that has a globally unique solution.
ρ(x) = |x|
4.2.2 Total variation
Hyper-Laplacian norms with p < 1 have gained popularity.
They have an even stronger tendency to prefer large discontinuities over small ones.
ρ(x) = |x|^p
4.2.1 Discrete energy minimization
4.2.2 Total variation
4.2.3 Bilateral solver
4.2.4 Application: Interactive colorization
4.2 Variational methods and regularization
4.2.3 Bilateral solver
As discussed in 3.3.2, we can often get better results by looking at a larger spatial neighborhood. We can extend this idea to energy minimization. Recall the bilateral weight function:
w(i, j, k, l) = exp( − [(i − k)^2 + (j − l)^2] / (2 σ_s^2) − ‖c(i, j) − c(k, l)‖^2 / (2 σ_r^2) )
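A direct transcription of this weight function, assuming the guide signal c(i, j) is simply the image itself; the test image, σ_s, and σ_r are made-up illustration values:

```python
import numpy as np

def bilateral_weight(img, i, j, k, l, sigma_s=2.0, sigma_r=0.1):
    """w(i, j, k, l): Gaussian in spatial distance times Gaussian in
    intensity (range) difference."""
    spatial = ((i - k) ** 2 + (j - l) ** 2) / (2 * sigma_s**2)
    rng = np.sum((img[i, j] - img[k, l]) ** 2) / (2 * sigma_r**2)
    return float(np.exp(-spatial - rng))

# A step edge: pixels on the same side of the edge get high weight,
# while pixels across the edge get low weight even when spatially close.
img = np.zeros((4, 4, 1))
img[:, 2:] = 1.0
same_side = bilateral_weight(img, 1, 0, 1, 1)
across_edge = bilateral_weight(img, 1, 1, 1, 2)
print(same_side, across_edge)
```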
4.2.3 Bilateral solver
A wider-neighborhood, bilaterally weighted version of the nearest-neighbor smoothness penalty uses the normalized weights:
ŵ(i, j, k, l) = w(i, j, k, l) / ∑_{k,l} w(i, j, k, l)
4.2.3 Bilateral solver
The bilateral solver has been used in a number of demanding video processing and 3D reconstruction applications, including the stitching of binocular omnidirectional panoramic videos and smartphone AR (Augmented Reality) systems.
4.2.1 Discrete energy minimization
4.2.2 Total variation
4.2.3 Bilateral solver
4.2.4 Application: Interactive colorization
4.2 Variational methods and regularization
4.2.4 Application: Interactive colorization
A good use of edge-aware interpolation techniques is colorization, i.e., manually adding color to a grayscale image.
4.2.4 Application: Interactive colorization
The user draws some scribbles and the system interpolates the specified chrominance values.
4.2.4 Application: Interactive colorization
Then, the interpolated chrominance values are recombined with the luminance channel to produce the final colorized image.
4.2.4 Application: Interactive colorization
The interpolation is performed using locally weighted
regularization introduced in 4.2.1. This approach has
inspired many later algorithms.
4.3 Markov random fields
4.3.1 Conditional random fields
4.3.2 Application: Interactive segmentation
4.3.0 Markov random fields
An alternative technique is to use a probabilistic model:
p(x | y) = p(y | x) p(x) / p(y)
−log p(x | y) = −log p(y | x) − log p(x) + C
E(x, y) = E_D(x, y) + E_P(x)
4.3.0 Markov random fields
An alternative technique is to use a probabilistic model:
E(x, y) = E_D(x, y) + E_P(x)
Output pixels: x = [f(0, 0) … f(m − 1, n − 1)]
Input pixels: y = [d(0, 0) … d(m − 1, n − 1)]
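A sketch of evaluating this negative-log-posterior energy on a tiny image, assuming a quadratic data term and a pairwise quadratic smoothness prior over horizontal and vertical neighbors; the 2×2 image and λ are made-up illustration values:

```python
import numpy as np

def mrf_energy(x, y, lam=1.0):
    """E(x, y) = E_D(x, y) + lam * E_P(x): quadratic data term plus a
    pairwise smoothness prior over vertical and horizontal neighbors."""
    e_data = np.sum((x - y) ** 2)
    e_prior = np.sum(np.diff(x, axis=0) ** 2) + np.sum(np.diff(x, axis=1) ** 2)
    return e_data + lam * e_prior

y = np.array([[0.0, 1.0], [1.0, 0.0]])   # observed (noisy) input pixels
x_flat = np.full((2, 2), 0.5)            # smooth hypothesis
x_copy = y.copy()                        # hypothesis that copies the data
# The smooth hypothesis pays in the data term but nothing in the prior;
# copying the data pays nothing in the data term but a lot in the prior.
print(mrf_energy(x_flat, y), mrf_energy(x_copy, y))  # 1.0 4.0
```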