EECS 442: Computer Vision (Winter 2024)
Homework 3 – Fitting Models and Image Warping
Instructions
This homework is due at 11:59 p.m. on Wednesday, March 6th, 2024.
The submission includes two parts:
- To Canvas: submit a zip file containing a single directory with your uniqname as its name that contains all your code and anything else asked for on the Canvas Submission Checklist. Don't add unnecessary files or directories.
  - We have indicated questions where you have to do something in code in red. If Gradescope asks for it, also submit your code in the report with the formatting below.
Starter code is given to you on Canvas under the “Homework 3” assignment. You can also download it here. Clean up your submission to include only the necessary files. Pay close attention to filenames for autograding purposes.
Submission Tip: Use the Tasks Checklist and Canvas Submission Checklist at the end of this homework. We also provide a script that validates the submission format here.
- To Gradescope: submit a pdf file as your write-up, including your answers to all the questions and the key choices you made.
  - We have indicated questions where you have to do something in the report in green. Some coding questions also need to be included in the report.
The write-up must be an electronic version. No handwriting, including plotting questions. \(\LaTeX\) is recommended but not mandatory.
For including code, do not use screenshots. Generate a PDF using a tool like this or using this Overleaf LaTeX template. If this PDF contains only code, be sure to append it to the end of your report and match the questions carefully on Gradescope.
Python Environment
Consider referring to the Python standard library docs when you have questions about Python utilities.
We recommend you install the latest Anaconda for Python 3.12. This is a Python package manager that includes most of the modules you need for this course. We will make extensive use of packages such as numpy, matplotlib, and opencv (cv2) throughout this course.
RANSAC and Fitting Models
Task 1: RANSAC Theory
In this section, suppose we are fitting a 3D plane (i.e., \(ax + by + cz + d = 0\)). A 3D plane can be defined by 3 points (2 points define a line). Plane fitting happens when people analyze point clouds to reconstruct scenes from laser scans. To distinguish from other notations that you may find elsewhere, we will refer to the model that is fit within the loop of RANSAC (covered in the lecture) as the putative model.
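As a concrete reminder of how three points determine a plane, here is a minimal numpy sketch (the point values are made up for illustration):

```python
import numpy as np

# Three non-collinear 3D points (made-up values for illustration).
p1, p2, p3 = np.array([0., 0., 1.]), np.array([1., 0., 1.]), np.array([0., 1., 2.])

# The plane normal [a, b, c] is perpendicular to two in-plane directions.
normal = np.cross(p2 - p1, p3 - p1)
a, b, c = normal
# Solve for d using any point on the plane: ax + by + cz + d = 0.
d = -normal.dot(p1)
print(a, b, c, d)  # one valid (a, b, c, d), determined up to scale
```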
- (3 points) Write in your report the minimum number of 3D points needed to sample in an iteration to compute a putative model.
- (3 points) Determine the probability that the sample of points picked to fit the putative model in a single iteration fails (i.e., contains at least one outlier), assuming that the outlier ratio in the dataset is \(0.5\) and we are fitting 3D planes.
- (3 points) Determine the minimum number of RANSAC trials needed to have a \(\geq 98\%\) chance of success, assuming that the outlier ratio in the dataset is \(0.5\) and we are fitting planes. You can do this by explicit calculation or by search/trial and error with numpy (see the sketch below).
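If you take the search route, a minimal numpy sketch like the following works; here p_fail stands for the per-iteration failure probability from the previous question (the value shown is a made-up placeholder, not the answer):

```python
import numpy as np

p_fail = 0.9   # placeholder: probability that a single iteration fails
target = 0.98  # desired overall probability of success

# Smallest k such that the probability that at least one of k trials
# succeeds, 1 - p_fail**k, reaches the target.
k = np.arange(1, 1000)
success = 1.0 - p_fail ** k
print(k[success >= target][0])
```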
Task 2: Fitting Linear Transformations
Throughout, suppose we have a set of 2D correspondences (\([x_i',y_i'] \leftrightarrow [x_i,y_i]\)) for \(1 \le i \le N\).
- (3 points) Suppose we are fitting a linear transformation, which can be parameterized by a matrix \(\MB \in \mathbb{R}^{2\times 2}\) (i.e., \([x',y']^T = \MB [x,y]^T\)). Write in your report the number of degrees of freedom \(\MB\) has and the minimum number of 2D correspondences required to fully constrain or estimate \(\MB\).
- (3 points) Suppose we want to fit \([x_i',y_i']^T = \MB [x_i,y_i]^T\). We would like you to formulate the fitting problem as a least-squares problem of the form:
\[\argmin_{m \in \mathbb{R}^4} \|\AB \mB - \bB\|_2^2\]where \(\mB \in \mathbb{R}^4\) contains all the parameters of \(\MB\), \(\AB\) depends on the points \([x_i,y_i]\) and \(\bB\) depends on the points \([x'_i, y'_i]\).
Write the form of \(\AB\), \(\mB\), and \(\bB\) in your report.
Task 3: Fitting Affine Transformations
Throughout, again suppose we have a set of 2D correspondences \([x_i',y_i'] \leftrightarrow [x_i,y_i]\) for \(1 \le i \le N\).
Files: We give an actual set of points in task3/points_case_1.npy and task3/points_case_2.npy: each row of the matrix contains the data \([x_i,y_i,x'_i,y'_i]\) representing a correspondence. You do not need to turn in your code, but you may want to write a small script task3.py that loads and plots the data.
- (3 points) Fit a transformation of the form:
\[[x',y']^T = \SB [x,y]^T + \tB, ~~~~~ \SB \in \mathbb{R}^{2 \times 2}, \tB \in \mathbb{R}^{2 \times 1}\]
by setting up a problem of the form:
\[\argmin_{\vB \in \mathbb{R}^6} \|\AB \vB - \bB\|^2_2\]
and solving it via least-squares. Report (\(\SB\), \(\tB\)) for points_case_1.npy in your report. This is not a trick question – use the setup from the foreword. Write a small amount of code that does this by loading the matrix, shuffling the data around, and then calling np.linalg.lstsq (see the sketch after this list).
- (3 points) Make a scatterplot of the points \([x_i,y_i]\), \([x'_i,y'_i]\), and \(\SB[x_i,y_i]^T+\tB\) in one figure with different colors. Do this for both points_case_1.npy and points_case_2.npy. In other words, there should be two plots, each of which contains three sets of \(N\) points. Save the figures and put them in your report. Look at plt.scatter and plt.savefig. For drawing the scatterplot, use plt.scatter(xy[:,0],xy[:,1],1). The last argument controls the size of the dots, and you may want this to be small so you can see the pattern. As you make more scatterplot calls, the points accumulate on the current figure. Close the figure with plt.close().
- (5 points) Write in your report how well an affine transform describes the relationship between \([x,y] \leftrightarrow [x',y']\) for points_case_1.npy and points_case_2.npy. You should describe this in two to three sentences. What properties are preserved by each transformation?
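Below is a minimal sketch of the loading and plotting mechanics only (the file path is from this assignment; the least-squares call is shown generically, since building \(\AB\) and \(\bB\) is the point of the exercise):

```python
import numpy as np
import matplotlib.pyplot as plt

data = np.load("task3/points_case_1.npy")   # each row is [x, y, x', y']
xy, xyp = data[:, :2], data[:, 2:]

# Once you have built A and b, np.linalg.lstsq(A, b, rcond=None)[0]
# returns the least-squares solution v minimizing ||Av - b||^2.

plt.scatter(xy[:, 0], xy[:, 1], 1, label="[x, y]")
plt.scatter(xyp[:, 0], xyp[:, 1], 1, label="[x', y']")
# ...scatter your transformed points here as a third set...
plt.legend()
plt.savefig("case_1.png")
plt.close()
```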
Task 4: Fitting Homographies
Files: We have generated 9 cases of correspondences in task4/. These are named points_case_k.npy for \(1 \le k \le 9\). All are the same format as the previous task and are matrices where each row contains \([x_i,y_i,x'_i,y'_i]\). Eight are transformed letters \(M\). The last case (case 9) is copied from task 3. You can use these examples to verify your implementation of fit_homography.
- (5 points) Fill in fit_homography in homography.py. This should fit a homography mapping between the two given sets of points. Remembering that \(\pB_i \equiv [x_i, y_i, 1]\) and \(\pB'_i \equiv [x'_i, y'_i, 1]\), your goal is to fit a homography \(\HB \in \mathbb{R}^{3 \times 3}\) that satisfies:
\[\pB'_i \equiv \HB \pB_i.\]
Most sets of correspondences are not exactly described by a homography, so your goal is to fit one by solving an optimization problem of the form:
\[\argmin_{\|\hB\|_2^2=1} \|\AB \hB\|,~~~\hB \in \mathbb{R}^{9}, \AB \in \mathbb{R}^{2N \times 9}\]
where \(\hB\) contains all the parameters of \(\HB\). Again, this is not meant to be a trick question – use the setup from the foreword (one standard way to solve this kind of constrained problem is sketched after this list).
Important: Please include your implementation in your report using the tools mentioned at the beginning of this assignment.
- (3 points) Report \(\HB\) for points_case_1.npy and points_case_4.npy. You must normalize the last entry of \(\HB\) to \(1\).
- (3 points) Visualize the original points \([x_i,y_i]\), the target points \([x'_i,y'_i]\), and the points after applying the homography transform \(T(H,[x_i,y_i])\) in one figure. Please do this for points_case_5.npy and points_case_9.npy. Thus there should be two plots, each of which contains 3 sets of \(N\) points. Save the figures and put them in the report.
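One standard way to solve a problem of the form \(\argmin_{\|\hB\|_2=1}\|\AB\hB\|\) is via the SVD: the minimizer is the right singular vector of \(\AB\) associated with its smallest singular value. A minimal sketch on a random stand-in matrix (you still have to build the real \(\AB\) from the correspondences):

```python
import numpy as np

A = np.random.randn(20, 9)   # stand-in for the real 2N x 9 matrix

U, S, Vt = np.linalg.svd(A)
h = Vt[-1]                   # unit-norm h minimizing ||A h||
H = h.reshape(3, 3)
H = H / H[2, 2]              # normalize so the last entry is 1
print(H)
```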
Image Warping and Homographies
Task 5: Synthetic Views – Name that book!
We asked the professor what he's reading, and so he sent us a few pictures. They're a bit distorted since he wants you to get used to cv2.warpPerspective before you use it in the next task. He says "it's all the same, right, homographies can map between planes and book covers are planes, no?".
Files: We provide data in task5/, with one folder per book. Each folder has:
- book.jpg – an image of the book taken from an angle;
- corners.npy – a numpy file containing a \(4 \times 2\) matrix where each row is \([x_i, y_i]\), giving the corners of the book in (top-left, top-right, bottom-right, bottom-left) order;
- size.npy – the size of the book cover in inches, stored as [height, width].
- (5 points) Fill in make_synthetic_view(sceneImage,corners,size) in task5.py. This should return the image of the cover viewed head-on (i.e., with the cover parallel to the image plane), where one inch on the book corresponds to 100 pixels.
  Walkthrough: First fit the homography between the book as seen in the image and the head-on book cover. In the new image, the top-left corner will be at \([x,y] = [0,0]\) and the bottom-right corner will be at \([x,y] = [100w-1,100h-1]\). Figure out where the other corners should go. Then read the documentation for cv2.warpPerspective (a usage sketch is given after this list).
- (3 points) Put a copy of both book covers in your report.
- (5 points) One of these images doesn't have perfectly straight lines. Write in your report why you think the lines might be slightly crooked despite the book cover being roughly a plane. You should write about 3 sentences.
- (Suggestion/optional) Before you proceed, see if you can write another function that does the operation in reverse: it should map the synthetic cover back into sceneImage, assuming the same relationship between the corners of synthetic and the listed corners in the scene. In other words, if you were to doodle on the cover of one of the books and send it back into the scene, it should look as if it's viewed from an angle. Pixels that do not have a corresponding source should be set to \(0\). What happens if synthetic contains only ones?
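A minimal cv2.warpPerspective usage sketch on a made-up image and homography, mainly to illustrate that the output-size argument is (width, height) and that the homography maps source coordinates to destination coordinates:

```python
import numpy as np
import cv2

img = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)  # made-up source image

# A made-up homography: scale by 0.5 and translate by (20, 10).
H = np.array([[0.5, 0.0, 20.0],
              [0.0, 0.5, 10.0],
              [0.0, 0.0, 1.0]])

out_w, out_h = 400, 300
warped = cv2.warpPerspective(img, H, (out_w, out_h))  # dsize is (width, height)!
print(warped.shape)   # (300, 400, 3): rows = height, cols = width
```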
Task 6: Stitching Stuff Together
Recall from the introduction that a keypoint has a location \(\pB_i\) and a descriptor \(\dB_i\). There are many types of keypoints in use. Traditionally this course has used SIFT and SURF, but these are subject to patents and installed in only a few versions of opencv. In past semesters, this made homework 3 an exercise in figuring out how to install a very particular version of opencv (and then figuring out some undocumented features).

We provide another descriptor called AKAZE (plus some other features) in common.py. In addition to this descriptor, you are encouraged to look at common.py to see if there are other things you want to use while working on the homework.
The calling convention is: keypoints, descriptors = common.get_AKAZE(image), which will give you an \(N \times 4\) matrix keypoints and an \(N \times F\) matrix descriptors containing a descriptor for each keypoint. The first two columns of keypoints contain \(x,y\); the last two are the angle and (roughly) the scale at which the keypoint was found, in case those are of interest. The descriptor has also been post-processed into something where \(\|\dB_i - \dB_j'\|^2_2\) is meaningful.
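A minimal usage sketch, assuming the starter code's common.py is importable and that some image file exists (the filename below is just an example):

```python
import cv2
import common

img = cv2.imread("scene.jpg")                    # example filename
keypoints, descriptors = common.get_AKAZE(img)

print(keypoints.shape, descriptors.shape)        # (N, 4) and (N, F)
x, y = keypoints[:, 0], keypoints[:, 1]          # keypoint locations
angle, scale = keypoints[:, 2], keypoints[:, 3]  # extra info, if you want it
```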
Files: We provide you with a number of panoramas in task6/ that you can choose to merge together. To enable you to run your code automatically on multiple panoramas without manually editing filenames (see also os.listdir), we provide them in a set of folders. Each folder contains two images: (a) p1.jpg; and (b) p2.jpg. Some also contain additional images (e.g., p3.jpg) which may or may not work. You should be able to match all the provided panoramas; you should be able to stitch all except for florence3 and florence3_alt.
- (3 points) Fill in compute_distance in task6.py. This should compute the pairwise squared L2 distance between two matrices of descriptors. You can and should use the \(\|\xB-\yB\|^2_2 = \|\xB\|^2_2 + \|\yB\|^2_2 - 2 \xB^T \yB\) trick from HW0, numpy test 11 (a sketch of this trick appears at the end of this task).
- (5 points) Fill in find_matches in task6.py. This should use compute_distance plus the ratio test from the foreword to return the matches. You will have to pick a threshold for the ratio test. Something between \(0.7\) and \(1\) is reasonable, but you should experiment with it (look at the output of draw_matches once you complete it). Beware! The ratio values shown in the lecture slides apply to SIFT; the descriptor here is different, so the ratio threshold you should use is different. Look at np.argsort as well as np.take_along_axis.
- (5 points) Fill in draw_matches in task6.py. This should put the images on top of each other and draw lines between the matches. You can use this to debug things. Use cv2.line.
- (3 points) Put a picture of the matches between two image pairs of your choice in your report.
- (10 points) Fill in RANSAC_fit_homography in homography.py. This should RANSACify fit_homography. You should keep track of the best set of inliers you have seen in the RANSAC loop. Once the loop is done, please re-fit the model to these inliers. In other words, if you are told to run \(N\) iterations of RANSAC, you should fit a homography \(N\) times on the minimum number of points needed; this should be followed by a single fitting of a homography on many more points (the inliers for the best of the \(N\) models). You will need to set epsilon's default value: \(0.1\) pixels is too small; \(100\) pixels is too big. You will need to play with this to get the later parts to work. When sampling correspondences, draw without replacement; if you sample with replacement, you may pick the same point repeatedly and then try to (effectively) fit a model to only three points.
- (18 points) Fill in make_warped and warp_and_combine in task6.py. These should take two images as arguments and do the whole pipeline described in the foreword. The result should use cv2.warpPerspective to make a merged image in which both images fit. This merged image should have: (a) image 1's pixel data if only image 1 is present at that location; (b) image 2's pixel data if only image 2 is present at that location; (c) the average of image 1's data and image 2's data if both are present. You can do so by using a warped mask.
  *Walkthrough of warp_and_combine*:
  - There is an information bottleneck in estimating \(\HB\). If \(\HB\) is correct, then you're set; if it's wrong, there's nothing you can do. First make sure your code estimates \(\HB\) correctly.
  - Pick which image you're going to merge to; without loss of generality, pick image 1. Figure out how to make a merged image that's big enough to hold both image 1 and transformed image 2. Think of this as finding the smallest enclosing rectangle of both images. The merged image should therefore be at least as big as each input image. The upper-left corner of this rectangle (i.e., pixel \([0,0]\)) may not be at the same location as in image 1. You will almost certainly need to hand-make a homography that translates image 1 to its location in the merged image. For these calculations, use the fact that the image content is bounded by the image corners: looking at the min and max of the (transformed) corners gives you what you need to create the panorama.
  - Warp both images into the merged image. You can figure out where each image goes by warping an image of ones (instead of the image itself) into the merged image, filling with 0s wherever the image does not reach. These masks also tell you how to create the average.
  - Be careful about the order in which you apply the homography. Think about whether you are warping image 2 into image 1's perspective or the other way around. If you want to reverse the effect of a homography, you can simply invert the homography matrix.
  Debugging Hints:
  - Make a fake pair of images by taking an image, rolling it by \((10,30)\), and then saving it. Debugging is far easier if you know what the answer should be.
  - If you want to debug the warping, you can also provide two images that are crops of the same image (e.g., I[100:400,100:400] and I[150:450,75:375]) where you know the homography (since it is just a translation).
- (3 points) Put merges from two of your favorite pairs in the report. You can either choose pairs we provide or use a pair of images you take yourself.
- (3 points) Put these merges as mypanorama1.jpg and mypanorama2.jpg in your zip submission.
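For reference, here is a minimal sketch of the squared-distance trick from HW0 and of using np.argsort to find nearest neighbors (random descriptors stand in for the real ones; the ratio test itself is left to you):

```python
import numpy as np

desc1 = np.random.randn(500, 64)   # stand-ins for the two descriptor matrices
desc2 = np.random.randn(600, 64)

# Pairwise squared L2 distances: ||x||^2 + ||y||^2 - 2 x^T y, shape (500, 600).
dists = (np.sum(desc1 ** 2, axis=1, keepdims=True)
         + np.sum(desc2 ** 2, axis=1)
         - 2 * desc1 @ desc2.T)

# For each descriptor in desc1, indices of its neighbors in desc2, nearest first.
order = np.argsort(dists, axis=1)
nearest = order[:, 0]              # best match per row
# The ratio test compares the best and second-best distances (order[:, 1]).
```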
Augmented Reality on a Budget
Task 7: Augmented Reality on a Budget
If you can warp images together, you can replace things in your reality. Imagine that we have a template image and this template appears in the world but viewed at an angle. You can fit a homography mapping between the template and the scene. Once you have a homography, you can transfer anything. This enables you to improve things.
Files: We give a few examples of templates and scenes in task7/scenes/. Each folder contains:
- template.png: a viewed-from-head-on / distortion-free / fronto-parallel version of the texture; and
- scene.jpg: an image where the texture appears at some location, viewed at some angle.
We also provide a set of seals (e.g., the UM seal) that you may want to put on things in task7/seals/. You can substitute whatever you like.
- (3 points) Fill in the function improve_image(scene,template,transfer) in task7.py that aligns template to scene using a homography, just as in task 6. Then, instead of warping template to the image, warp transfer. If you want to copy over your functions from task 6, you can either import them or just copy them.
  - The matches that you get are definitely not one-to-one. You'll probably get better results if you match from the template to the scene (i.e., for each template keypoint, find the best match in the scene). Be careful about ordering, though, if you transfer your code!
  - The image to transfer might not be the same size as the template. You can either resize transfer to be the same size as template or automatically generate a homography.
  - If you use the function warp_and_combine from task 6, you may want to change it a little bit, since you should make sure the warped template covers the corresponding areas of the scene completely, as shown in Figure 3.
- (2 points) Do something fun with this. Submit a synthetically done warp of something interesting. If you do something particularly neat to get the system to work, please write this in the report. Include the images in your report. In addition, submit the following files in your zip:
  - myscene.jpg – the scene
  - mytemplate.png OR mytemplate.jpg – the template (submit either png or jpg, but not both)
  - mytransfer.jpg – the thing you transfer
  - myimproved.jpg – your result
Guidelines: If you try this on your own images, here are some suggestions:
- Above all, please be respectful.
- This sort of trick works best with something that has lots of texture across the entire template. The lacroix carton works very well. The aep.jpg image that you saw in dithering does not work so well since it has little texture for the little building at the bottom.
- This trick is most impressive if you apply it to something seen at a very different angle. You may be able to extend how far you can match by pre-generating synthetic warps of the template (i.e., generate \(\textrm{synth}_i = \textrm{apply}(\HB_i,T)\) for a series of \(\HB_i\), then see if you can find a good homography \(\hat{\HB}\) from \(\textrm{synth}_i\) to the scene). The final homography is then \(\hat{\HB} \HB_i\) (see the sketch below).
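A sketch of that composition idea, with a hand-made pre-warp and placeholder values throughout (a random image stands in for the real template, and H_hat stands in for the homography your RANSAC fit would return):

```python
import numpy as np
import cv2

template = np.random.randint(0, 255, (300, 200, 3), dtype=np.uint8)  # stand-in template

# A hand-made pre-warp H_i (arbitrary placeholder values).
H_i = np.array([[0.9,  0.1, 5.0],
                [-0.1, 0.9, 8.0],
                [1e-4, 0.0, 1.0]])
h, w = template.shape[:2]
synth = cv2.warpPerspective(template, H_i, (w, h))   # synth_i = apply(H_i, T)

# If your fit gives H_hat mapping synth_i -> scene, then the
# template -> scene homography is the composition:
H_hat = np.eye(3)                                    # placeholder for the fitted H
H_final = H_hat @ H_i
```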
Tasks Checklist
This section is meant to help you keep track of the many things that go in the report:
- RANSAC Theory:
  - 1.1 - Minimum # of points
  - 1.2 - Probability a single iteration fails
  - 1.3 - Minimum # of RANSAC trials
- Fitting Linear Transformations:
  - 2.1 - Degrees of freedom, minimum # of correspondences
  - 2.2 - Form of \(\AB\), \(\mB\), and \(\bB\)
- Fitting Affine Transformations:
  - 3.1 - (\(\SB\),\(\tB\)) for points_case_1.npy
  - 3.2 - Figures for points_case_1.npy and points_case_2.npy
  - 3.3 - Affinity
- Fitting Homographies:
  - 4.1 - fit_homography
  - 4.2 - \(\HB\) for points_case_1.npy and points_case_4.npy
  - 4.3 - Figures for points_case_5.npy and points_case_9.npy
- Synthetic Views:
  - 5.1 - make_synthetic_view
  - 5.2 - Both book covers
  - 5.3 - Lines crooked?
  - 5.4 - (optional) Reverse
- Stitching Stuff Together:
  - 6.1 - compute_distance
  - 6.2 - find_matches
  - 6.3 - draw_matches
  - 6.4 - Two image pairs
  - 6.5 - RANSAC_fit_homography
  - 6.6 - make_warped
  - 6.7 - Two merged pairs
  - 6.8 - mypanorama1.jpg and mypanorama2.jpg
- Augmented Reality:
  - 7.1 - improve_image
  - 7.2 - myscene.jpg, mytemplate.png/jpg, mytransfer.jpg, myimproved.jpg
Canvas Submission Checklist
In the zip file you submit to Canvas, the directory named after your uniqname should include the following files:
- common.py – do not edit this, though; it may be substituted
- homography.py
- task5.py
- task6.py
- mypanorama1.jpg, mypanorama2.jpg
- task7.py
- myscene.jpg, mytemplate.png/jpg, mytransfer.jpg, myimproved.jpg

The following are optional:
- myfavoritepanorama.jpg