Assignment #2: Registration




2 VLFeat Installation
(a) Image (b) SIFT
Figure 1: Given an image (a), you will extract SIFT features using VLFeat.
One of the key skills to learn in computer vision is the ability to use other open-source
code, which lets you avoid reinventing the wheel. We will use VLFeat by A. Vedaldi
and B. Fulkerson (2008) for SIFT extraction from your images. Install VLFeat from
Run vl_demo_sift_basic to double-check that the installation is complete.
(Note) You will use this library only for SIFT feature extraction and its visualization.
All following visualizations and algorithms must be done by your code. Using VLFeat,
you can extract keypoints and associated descriptors as shown in Figure 1.
(SIFT visualization) Use VLFeat to visualize SIFT features with scale and orientation
as shown in Figure 1. You may want to follow this tutorial:
CSCI 5561: Assignment #2
3 SIFT Feature Matching
(a) Template (b) Target (c) SIFT matches with ratio test
Figure 2: You will match points between the template and target image using SIFT
A SIFT feature is composed of a scale, an orientation, and a 128-dimensional local feature descriptor
(integer), f ∈ Z^128. You will use the SIFT features to match between two images, I1
and I2. Using the two sets of descriptors from the template and target, find the matches
using nearest-neighbor search with the ratio test. You may use the built-in knnsearch function.
function [x1, x2] = FindMatch(I1, I2)
Input: two input gray-scale images with uint8 format.
Output: x1 and x2 are n × 2 matrices that specify the correspondence.
Description: Each row of x1 and x2 contains the (x, y) coordinate of the point correspondence in I1 and I2, respectively, i.e., x1(i,:) ↔ x2(i,:).
(Note) You can only use VLFeat for the SIFT descriptor extraction. Matching with the
ratio test needs to be implemented by yourself.
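The matching step can be sketched as follows. This is a minimal illustration, not a reference solution: it assumes the keypoint frames and descriptors have already been extracted (for example with vl_sift, which returns 4 × n frames and 128 × n descriptors), and the helper name MatchDescriptors and the ratio parameter are hypothetical. It uses brute-force distances rather than knnsearch.

```matlab
function [x1, x2] = MatchDescriptors(f1, d1, f2, d2, ratio)
% Hypothetical helper (not part of the assignment API).
% f1, f2: 4 x n keypoint frames; rows 1-2 hold the (x, y) position.
% d1, d2: descriptors stored as columns (128 x n for SIFT).
% ratio:  ratio-test threshold, e.g. 0.7 (an assumed value).
d1 = double(d1);
d2 = double(d2);
x1 = zeros(0, 2);
x2 = zeros(0, 2);
for i = 1:size(d1, 2)
    % Squared Euclidean distance from descriptor i to every column of d2.
    delta = d2 - repmat(d1(:, i), 1, size(d2, 2));
    dist2 = sum(delta .^ 2, 1);
    [sorted, idx] = sort(dist2, 'ascend');
    % Ratio test: accept only if the nearest neighbor is clearly closer
    % than the second nearest (compared on distances, not squared ones).
    if numel(sorted) >= 2 && sqrt(sorted(1)) < ratio * sqrt(sorted(2))
        x1(end + 1, :) = f1(1:2, i)';
        x2(end + 1, :) = f2(1:2, idx(1))';
    end
end
end
```

A smaller ratio keeps fewer but more distinctive matches; the remaining outliers are what RANSAC in the next section is meant to remove.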
4 Feature-based Image Alignment
Figure 3: You will compute an affine transform using SIFT matches filtered by
RANSAC. Blue: outliers; Orange: inliers; Red: the boundary of the transformed template.
(Note) From this point, you cannot use any function provided by VLFeat.
The noisy SIFT matches can be filtered by RANSAC with an affine transformation as
shown in Figure 3.
function [A] = AlignImageUsingFeature(x1, x2, ransac_thr, ransac_iter)
Input: x1 and x2 are the correspondence sets (n × 2 matrices). ransac_thr and
ransac_iter are the error threshold and the number of iterations for RANSAC.
Output: A is the 3 × 3 affine transformation.
Description: The affine transform will transform x1 to x2 in homogeneous coordinates, i.e., x2 = Ax1. You may
visualize the inliers and the boundary of the transformed template to validate your
implementation.
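One way to organize the RANSAC loop is sketched below. It is an illustration under stated assumptions rather than the required implementation: the names RansacAffine and FitAffine are hypothetical, each hypothesis is fit to a minimal sample of three correspondences, the error is the Euclidean distance between A·x1 and x2, and the final transform is refit to the largest inlier set.

```matlab
function [A, inliers] = RansacAffine(x1, x2, ransac_thr, ransac_iter)
% Hypothetical sketch: x1, x2 are n x 2 correspondences with n >= 3.
n = size(x1, 1);
inliers = false(n, 1);
for k = 1:ransac_iter
    s = randperm(n, 3);                      % minimal sample
    Ak = FitAffine(x1(s, :), x2(s, :));
    proj = [x1, ones(n, 1)] * Ak';           % rows are (A * [x1 1]')'
    err = sqrt(sum((proj(:, 1:2) - x2) .^ 2, 2));
    cur = err < ransac_thr;                  % NaN errors count as outliers
    if sum(cur) > sum(inliers)
        inliers = cur;
    end
end
% Refit on all inliers of the best hypothesis.
A = FitAffine(x1(inliers, :), x2(inliers, :));
end

function A = FitAffine(x1, x2)
% Least-squares affine fit so that [x2 1]' is approximately A * [x1 1]'.
X = [x1, ones(size(x1, 1), 1)];              % n x 3 homogeneous points
M = X \ x2;                                  % solves X * M = x2, M is 3 x 2
A = [M'; 0 0 1];
end
```

A sample of three collinear points makes the fit degenerate; in this sketch such a hypothesis simply collects few or no inliers and is discarded.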
5 Image Warping
(a) Image (b) Warped image
(c) Template (d) Error map
Figure 4: You will use the affine transform to warp the target image to the template
using the inverse mapping. Using the warped image, the error map |Itpl − Iwrp| can be
computed to validate the correctness of the transformation where Itpl and Iwrp are the
template and warped images.
Given an affine transform A, you will write code to warp an image, I(x) → I(Ax).
function [I_warped] = WarpImage(I, A, output_size)
Input: I is an image to warp, A is the affine transformation from the original coordinate
to the warped coordinate, output_size=[h,w] is the size of the warped image where
w and h are the width and height of the warped image.
Output: I_warped is the warped image with the size of output_size.
Description: Apply the inverse mapping method so that the warped image
contains no empty pixels. You may use the built-in interp2
function in MATLAB for bilinear interpolation.
(Validation) Using the warped image, the error map |Itpl − Iwrp| can be computed to
validate the correctness of the transformation where Itpl and Iwrp are the template and
warped images.
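Inverse mapping can be sketched as below. The sketch assumes the convention I_warped(x) = I(Ax), where x = [u; v; 1] is a pixel of the output image with u the column and v the row; if A is instead defined from the original to the warped coordinates, pass inv(A). The function name WarpImageSketch and the choice to fill out-of-range pixels with 0 are assumptions.

```matlab
function I_warped = WarpImageSketch(I, A, output_size)
% Assumed convention: I_warped(x) = I(A * x) for every output pixel x.
h = output_size(1);
w = output_size(2);
[u, v] = meshgrid(1:w, 1:h);                 % output pixel grid
src = A * [u(:)'; v(:)'; ones(1, h * w)];    % where each output pixel samples I
us = reshape(src(1, :), h, w);               % source columns
vs = reshape(src(2, :), h, w);               % source rows
% Bilinear sampling; pixels that map outside I are set to 0.
I_warped = interp2(double(I), us, vs, 'linear', 0);
end
```

Because every output pixel looks up its own source location, no output pixel is left unassigned, which is the point of inverse mapping.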
6 Inverse Compositional Image Alignment
(a) Template (b) Initialization (c) Aligned image
Figure 5: You will use the initial estimate of the affine transform to align (i.e., track)
the next frame. (a) Template image from the first frame. (b) The second frame
image with the initialization of the affine transform. (c) The second frame image with
the optimized affine transform using the inverse compositional image alignment.
Given the initial estimate of the affine transform A from the feature-based image alignment (Section 4) as shown in Figure 5(b), you will track the next frame using the
inverse compositional method (Figure 5(c)). You will parametrize the affine transform
with 6 parameters p = (p1, p2, p3, p4, p5, p6), i.e.,
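\[
W(x; p) = \begin{bmatrix} p_1 + 1 & p_2 & p_3 \\ p_4 & p_5 + 1 & p_6 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = A(p)\,x \qquad (1)
\]
where W(x; p) is the warping function from the template patch to the target image,
$x = \begin{bmatrix} u & v & 1 \end{bmatrix}^T$ is the coordinate of the point before warping, and A(p) is the affine transform
parametrized by p.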
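As a worked step, the Jacobian of the warp in Eq. (1) with respect to p can be written out explicitly. This sketch assumes the point is written as x = [u, v, 1]^T (the symbols u and v are an assumption) and differentiates only the first two rows of the warp, since the third row is constant:

```latex
W(x; p) = \begin{bmatrix} (1 + p_1)\,u + p_2\,v + p_3 \\ p_4\,u + (1 + p_5)\,v + p_6 \end{bmatrix},
\qquad
\frac{\partial W}{\partial p} = \begin{bmatrix} u & v & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & u & v & 1 \end{bmatrix}
```

Because the warp is linear in p, this 2 × 6 Jacobian does not depend on p, so evaluating it at (x; 0) gives the same matrix.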
function [A_refined] = AlignImage(template, target, A)
Input: gray-scale template image template and target image target; A is the initialization of
the 3×3 affine transform, i.e., x_tgt = A x_tpl, where x_tgt and x_tpl are points in the target and
template images, respectively.
Output: A_refined is the refined affine transform based on inverse compositional image alignment.
Description: You will refine the affine transform using inverse compositional image
alignment, i.e., A→A_refined. The pseudo-code can be found in Algorithm 1.
Tip: You can validate your algorithm by visualizing the error maps as shown in Figure 6(d) and 6(h). You can also plot the error over iterations; the error
should decrease as shown in Figure 6(i).
(a) Template (b) Initial warp (c) Overlay (d) Error map
(e) Template (f) Opt. warp (g) Overlay (h) Error map
(i) Error plot: error ||Itpl − Itgt|| over iterations
Figure 6: (a,e) Template images of the first frame. (b) Warped image based on the initialization of the affine parameters. (c) Template image is overlaid by the initialization.
(d) Error map of the initialization. (f) Optimized warped image using the inverse compositional image alignment. (g) Template image is overlaid by the optimized warped
image. (h) Error map of the optimization. (i) An error plot over iterations.
Algorithm 1 Inverse Compositional Image Alignment
1: Initialize p = p0 from input A.
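2: Compute the gradient of the template image, $\nabla I_{tpl}$.
3: Compute the Jacobian $\frac{\partial W}{\partial p}$ at $(x; 0)$.
4: Compute the steepest descent images $\nabla I_{tpl} \frac{\partial W}{\partial p}$.
5: Compute the $6 \times 6$ Hessian $H = \sum_x \left[ \nabla I_{tpl} \frac{\partial W}{\partial p} \right]^T \left[ \nabla I_{tpl} \frac{\partial W}{\partial p} \right]$.
6: while $\|\Delta p\| > \epsilon$ do
7: Warp the target to the template domain, $I_{tgt}(W(x; p))$.
8: Compute the error image $I_{err} = I_{tgt}(W(x; p)) - I_{tpl}$.
9: Compute $F = \sum_x \left[ \nabla I_{tpl} \frac{\partial W}{\partial p} \right]^T I_{err}$.
10: Compute $\Delta p = H^{-1} F$.
11: Update $W(x; p) \leftarrow W(x; p) \circ W^{-1}(x; \Delta p) = W(W^{-1}(x; \Delta p); p)$.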
12: end while
13: Return A_refined made of p.
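The loop in Algorithm 1 can be sketched as follows. This is one possible reading, not the reference solution: the name AlignImageSketch and the extra parameters max_iter and eps_dp are assumptions, the template gradient comes from MATLAB's gradient function, target pixels that fall outside the image contribute zero error, and the update composes with the inverse of the incremental warp, A ← A · A(Δp)^{-1}.

```matlab
function [A_refined, errs] = AlignImageSketch(template, target, A, max_iter, eps_dp)
% Hypothetical sketch; max_iter and eps_dp are assumed stopping parameters.
template = double(template);
target = double(target);
[h, w] = size(template);
[u, v] = meshgrid(1:w, 1:h);                 % template pixel coordinates

% Steps 2-5: quantities that depend only on the template.
[Ix, Iy] = gradient(template);               % dI/du (columns), dI/dv (rows)
% Each row is [Ix Iy] * [u v 1 0 0 0; 0 0 0 u v 1] for one pixel.
SD = [Ix(:) .* u(:), Ix(:) .* v(:), Ix(:), ...
      Iy(:) .* u(:), Iy(:) .* v(:), Iy(:)];
H = SD' * SD;                                % 6 x 6 Hessian

errs = zeros(0, 1);
for iter = 1:max_iter
    % Step 7: sample the target at W(x; p) = A * [u; v; 1].
    src = A * [u(:)'; v(:)'; ones(1, h * w)];
    I_wrp = interp2(target, reshape(src(1, :), h, w), ...
                    reshape(src(2, :), h, w), 'linear', NaN);
    % Step 8: error image; out-of-range samples are ignored.
    I_err = I_wrp - template;
    I_err(isnan(I_err)) = 0;
    errs(end + 1, 1) = norm(I_err(:));
    % Steps 9-10: F = SD' * I_err and dp = H^{-1} * F.
    dp = H \ (SD' * I_err(:));
    % Step 11: compose with the inverse incremental warp.
    dA = [1 + dp(1), dp(2), dp(3); dp(4), 1 + dp(5), dp(6); 0, 0, 1];
    A = A / dA;                              % same as A * inv(dA)
    if norm(dp) < eps_dp
        break;
    end
end
A_refined = A;
end
```

The second output errs records the error norm at each iteration, which can be plotted to check that the error decreases.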
7 Putting Things Together: Multiframe Tracking
(a) Frame 1 (b) Frame 2
(c) Frame 3 (d) Frame 4
Figure 7: You will use the inverse compositional image alignment to track 4 frames of
Given a template and a set of consecutive images, you will (1) initialize the affine
transform using the feature-based alignment and then (2) track over frames using the
inverse compositional image alignment.
function [A_cell] = TrackMultiFrames(template, image_cell)
Input: template is a gray-scale template. image_cell is a cell structure that contains
a set of consecutive image frames, i.e., image_cell{i} is the i-th frame.
Output: A_cell is the set of affine transforms from the template to each frame,
i.e., A_cell{i} is the affine transform from the template to the i-th image.
Description: You will apply the inverse compositional image alignment sequentially
to track over frames as shown in Figure 7. Note that the template image needs to be
updated at every frame, i.e., template←WarpImage(I, inv(A), size(template)).