
Written Homework 2

CSSE 461 - Computer Vision

Instructions

Your homework solutions must be typewritten. Upload a single PDF to Gradescope. If you write a little code to answer a question, include the code in your answer.

(Use our preferred OpenCV-based coordinate conventions unless noted otherwise.)

Problems

Problem 1: Parsing Camera Matrices

Suppose we know everything about our camera: the intrinsics matrix \(\mathbf{K}\), the extrinsics, \(\mathbf{R}\) and \(\mathbf{t}\), and even the model (Nikon D500). See the values below – all units are meters or pixels as appropriate. Given these values, compute the following:

(Hint: there are a few ways to do this. One straightforward way is to pick a point directly in front of the camera and compute its world coordinates; a sketch of that camera-to-world mapping follows the matrices below. There’s a simpler way that requires a little more mathematical insight.)

# Here are the camera intrinsics and extrinsics matrices
# (formatted for easy copy-paste into your code):
import numpy as np

K = np.array([[1.18468085e+04, 0.00000000e+00, 2.78400000e+03],
              [0.00000000e+00, 1.18216561e+04, 1.85600000e+03],
              [0.00000000e+00, 0.00000000e+00, 1.00000000e+00]])
# extrinsics E = [R | t], a 3x4 matrix
E = np.array([[-0.31402103, -0.94879008,  0.03446983,  1.92193983],
              [ 0.13790285, -0.08150285, -0.98708667,  0.21731886],
              [ 0.93934743, -0.30521248,  0.15643447, -8.12951535]])
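
As a starting point for the hint's first approach, here is a minimal sketch of the camera-to-world mapping. The specific point chosen (one meter straight ahead of the camera) is only an illustrative assumption; what you do with the result is up to you.

# Sketch only: assumes E (and numpy) from the block above are defined.
R, t = E[:, :3], E[:, 3]         # split the 3x4 extrinsics into rotation and translation

# The extrinsics map world to camera coordinates: Pc = R @ Pw + t,
# so the inverse mapping is Pw = R.T @ (Pc - t).
Pc = np.array([0.0, 0.0, 1.0])   # illustrative choice: a point 1 m straight ahead of the camera
Pw = R.T @ (Pc - t)
print(Pw)                        # world coordinates of that point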

Problem 2: Reprojection Error

You are given two cameras with known intrinsics and extrinsics. You are also given the estimated location of a 3D point and its pixel projections into each camera. Compute the squared reprojection error of this estimated location for the 3D point. That is, sum up the squared pixel distances between where the 3D point was observed in each image, and where it projects to in that image.
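
In symbols, writing \(\tilde{\mathbf{P}}_w\) for the homogeneous world point, \(\mathbf{x}_i\) for the observed pixel location in camera \(i\), and \(\pi(\cdot)\) for perspective division (dividing by the third homogeneous coordinate), the quantity to compute is

\[ e = \sum_{i=1}^{2} \left\lVert \mathbf{x}_i - \pi\!\left(\mathbf{K}\,\mathbf{E}_i\,\tilde{\mathbf{P}}_w\right) \right\rVert^2. \]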

Here are all of the givens, set up to be copy-pasted into your code:

import numpy as np

# same intrinsics for both cameras
K = np.array([[1.18468085e+04, 0.00000000e+00, 2.78400000e+03],
              [0.00000000e+00, 1.18216561e+04, 1.85600000e+03],
              [0.00000000e+00, 0.00000000e+00, 1.00000000e+00]])
# extrinsics for camera 1
E1 = np.array([[-0.31402103, -0.94879008,  0.03446983, -0.94640895],
               [ 0.13790285, -0.08150285, -0.98708667,  1.42010489],
               [ 0.93934743, -0.30521248,  0.15643447,  0.85907637]])
# extrinsics for camera 2
E2 = np.array([[-0.6022098 , -0.79814732, -0.01744177,  0.54587954],
               [-0.01736461,  0.03493795, -0.99923861,  1.4551765 ],
               [ 0.798149  , -0.60144841, -0.0348995 ,  0.75675417]])
# estimated position of world point (meters, in world coords)
Pw = np.array([[ 4.3 ],
               [-1.4 ],
               [ 2.03]])
# observed location of Pw in each camera (col,row pixel units)
px_cam1 = np.array([[ 898.91171569],
                    [2114.24202796]])
px_cam2 = np.array([[488.47708728],
                    [198.14597679]])
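
For reference, here is a minimal sketch of that bookkeeping, assuming the givens above have been pasted in and the standard pinhole projection (multiply by \(\mathbf{K}\,\mathbf{E}\), then divide by the third homogeneous coordinate). The helper name project is my own.

# Sketch only: assumes K, E1, E2, Pw, px_cam1, px_cam2 from above are defined.
def project(K, E, Pw):
    # Pinhole projection: homogenize, apply K [R | t], then divide by the third coordinate.
    Pw_h = np.vstack([Pw, [[1.0]]])   # 4x1 homogeneous world point
    p = K @ (E @ Pw_h)                # 3x1 homogeneous pixel coordinates
    return p[:2] / p[2]               # 2x1 (col, row) pixel location

# Squared reprojection error: sum of squared pixel distances over both cameras.
err = sum(np.sum((project(K, E, Pw) - px) ** 2)
          for E, px in [(E1, px_cam1), (E2, px_cam2)])
print(err)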