CSSE 461 - Computer Vision
Given a 3-channel color image with dimensions width and height stored in a 3-dimensional array
F, write pseudocode to give the image a reddish tint.
Assume that F[r, c, i] is the syntax to access the value of
the ith color channel (where 0 is red, 1 is green, 2 is
blue) of the pixel at the rth row and cth
column. Your answer can, but does not need to, involve color space transformations.
One approach is to simply increase the red channel:
F[:,:,0] += some_constant
However, this approach has limitations. For example, if you start with a perfectly blue image, increasing the red channel alone won’t effectively tint it red.
A better approach is to blend the image with a pure red image:
red = np.zeros_like(F)
red[:,:,0] = 1
F = alpha * F + (1 - alpha) * red
This works well because it can move the image towards red by reducing the other channels in addition to adding to the red channel.
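As a concrete, runnable version of this blend (a sketch assuming F holds floats in [0, 1]; the function name tint_red and the default alpha are illustrative), with a clip at the end to guard against values drifting out of range:

import numpy as np

def tint_red(F, alpha=0.8):
    # Blend a float RGB image in [0, 1] toward pure red.
    # alpha = 1 returns the original image; alpha = 0 returns pure red.
    red = np.zeros_like(F)
    red[:, :, 0] = 1.0                   # pure red: R = 1, G = B = 0
    out = alpha * F + (1 - alpha) * red
    return np.clip(out, 0.0, 1.0)        # stay within the valid range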
Given a grayscale image \(f(x, y)\), how could you increase the contrast? In other words, how could you make the bright stuff brighter and dark stuff darker? As above, your approach should not allow values to go outside their original range from 0 to 1.
One approach is to shift the midpoint, scale, then shift back:
# First temporarily shift the "middle" of grayscale space to make values below 0.5 negative
F[r, c] = F[r, c] - 0.5
# Next scale by the amount of contrast we want, making values below 0.5 darker and above 0.5 brighter
F[r, c] = F[r, c] * contrast_factor
# Finally move the midpoint back into regular grayscale space
F[r, c] = F[r, c] + 0.5
This works well for non-extreme adjustments; with a large contrast factor, values near 0 and 1 will leave the [0, 1] range and need to be clipped.
Another approach used in practice is to apply a nonlinear s-shaped curve (see https://en.wikipedia.org/wiki/Sigmoid_function) to modify the values so that less detail gets clipped to 0 or 1.
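A sketch of that idea (the gain value and the rescaling at the end are illustrative choices, not part of the original exercise): a logistic curve centered at 0.5 spreads out the midtones while letting the extremes saturate smoothly.

import numpy as np

def sigmoid_contrast(f, gain=8.0):
    # Contrast boost for a grayscale image in [0, 1] using an s-curve.
    s = 1.0 / (1.0 + np.exp(-gain * (f - 0.5)))   # logistic curve centered at 0.5
    lo = 1.0 / (1.0 + np.exp(gain * 0.5))         # curve value at f = 0
    hi = 1.0 / (1.0 + np.exp(-gain * 0.5))        # curve value at f = 1
    return (s - lo) / (hi - lo)                   # rescale so the output spans [0, 1]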
In terms of an input image \(f(x, y)\), write a mathematical expression for a new image \(g\) that is shifted four pixels to the left.
\(g(x, y) = f(x + 4, y)\)
In terms of an input image \(f(x, y)\), write a mathematical expression for a new image \(g\) that is twice as big (i.e., larger by a factor of two in both \(x\) and \(y\)).
\(g(x, y) = f(x/2, y/2)\)
To make \(g\) bigger, you need \(x/2, y/2\) instead of \(2x, 2y\). Think of it as a lookup: to become bigger, \(g\) needs to look up its value at smaller coordinates in \(f\).
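As a sketch of this lookup view (nearest-neighbor sampling; the helper name upscale2x is just for illustration):

import numpy as np

def upscale2x(f):
    # g(x, y) = f(x // 2, y // 2): each output pixel looks up its value
    # at the corresponding smaller coordinate in f.
    h, w = f.shape
    rows = np.arange(2 * h) // 2   # output row -> input row
    cols = np.arange(2 * w) // 2   # output col -> input col
    return f[rows][:, cols]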
Recall that \(f \otimes w\) means cross-correlating image \(f\) with filter \(w\). Compute the following cross-correlation using same output size and zero padding. \[ \begin{bmatrix} 0 & 1 & 0\\ 0 & 1 & 0\\ 0 & 1 & 0 \end{bmatrix} \otimes \begin{bmatrix} 1 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 1 \end{bmatrix} \]
\[ \begin{bmatrix} 3 & 6 & 3\\ 4 & 8 & 4\\ 3 & 6 & 3 \end{bmatrix} \]
Perform the same cross-correlation as above, but use repeat padding.
\[ \begin{bmatrix} 4 & 8 & 4\\ 4 & 8 & 4\\ 4 & 8 & 4 \end{bmatrix} \]
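Both results can be checked with scipy.ndimage.correlate, where mode='constant' gives zero padding and mode='nearest' gives repeat padding (a verification sketch, not part of the original exercise):

import numpy as np
from scipy import ndimage

f = np.array([[0, 1, 0],
              [0, 1, 0],
              [0, 1, 0]])
w = np.array([[1, 2, 1],
              [2, 4, 2],
              [1, 2, 1]])

print(ndimage.correlate(f, w, mode='constant', cval=0))  # [[3 6 3], [4 8 4], [3 6 3]]
print(ndimage.correlate(f, w, mode='nearest'))           # [[4 8 4], [4 8 4], [4 8 4]]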
Describe in words the result of applying the following filter using cross-correlation. If you aren’t sure, try applying it to the image above to gain intuition.
\[ \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 1\\ 0 & 0 & 0 \end{bmatrix} \]
This shifts things one pixel to the left. If we were using convolution (instead of cross-correlation), it would shift right.
Compute the following convolution, assuming repeat padding and same output size. \[ \begin{bmatrix} 0 & 0 & 1\\ 0 & 0 & 1\\ 0 & 1 & 1 \end{bmatrix} * \begin{bmatrix} 0 & 0 & 0\\ 1 & 0 & -1\\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} \ & \ & \ \\ \ & \ & \ \\ \hspace{1em} & \hspace{1em} & \hspace{1em} \end{bmatrix} \]
\[ \begin{bmatrix} 0 & 0 & 1\\ 0 & 0 & 1\\ 0 & 1 & 1 \end{bmatrix} * \begin{bmatrix} 0 & 0 & 0\\ 1 & 0 & -1\\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \\ \end{bmatrix} \]
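The same kind of check works here with scipy.ndimage.convolve, which flips the kernel before sliding it; mode='nearest' again gives repeat padding (a verification sketch):

import numpy as np
from scipy import ndimage

f = np.array([[0, 0, 1],
              [0, 0, 1],
              [0, 1, 1]])
w = np.array([[0, 0, 0],
              [1, 0, -1],
              [0, 0, 0]])

print(ndimage.convolve(f, w, mode='nearest'))  # [[0 1 1], [0 1 1], [1 1 0]]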
Does blurring then sharpening an image yield the original, unfiltered image?
No. The information lost when blurring is truly gone and cannot be recovered by sharpening.
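One way to see this empirically (a hedged sketch: a Gaussian blur followed by a simple unsharp-mask sharpen, with illustrative parameters):

import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
f = rng.random((64, 64))                      # random test image in [0, 1]

blurred = ndimage.gaussian_filter(f, sigma=2)
# Unsharp masking: image + amount * (image - a blurred copy of it)
sharpened = blurred + 1.5 * (blurred - ndimage.gaussian_filter(blurred, sigma=2))

print(np.abs(sharpened - f).max())   # stays large: the detail removed by the blur does not come back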
For each of the following, decide whether it’s possible to design a convolution filter that performs the given operation.
Max filter: the output pixel is the maximum value among the pixels in the input window
Threshold: the output pixel is 1.0 if the input pixel is > 0.5, and 0.0 otherwise
\(y\) partial derivative: the output is a finite-differences approximation of the input image’s vertical derivative \(\frac{\partial}{\partial y} f(x, y)\).
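Convolution can only implement operations that are linear and shift-invariant. The max filter and the threshold are both nonlinear, so no convolution kernel reproduces them; the \(y\) partial derivative is linear, and a centered finite-difference kernel does it. A minimal sketch (taking \(y\) as the row direction; the overall sign depends on which way \(y\) is taken to point):

import numpy as np
from scipy import ndimage

dy = np.array([[ 1],
               [ 0],
               [-1]])                 # centered difference in the row direction

f = np.array([[0, 0, 1],
              [0, 0, 1],
              [0, 1, 1]], dtype=float)

print(ndimage.correlate(f, dy, mode='nearest'))  # finite-difference approximation of df/dy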
Compute the following convolution, which results in a new filter kernel, and describe the effect of this new kernel in words. \[ \begin{bmatrix} 1 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 1 \end{bmatrix} * \begin{bmatrix} 0 & 0 & 0\\ 1 & 0 & -1\\ 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} \ & \ & \ \\ \ & \ & \ \\ \hspace{1em} & \hspace{1em} & \hspace{1em} \end{bmatrix} \]
\[ \begin{bmatrix} 2 & 0 & -2\\ 4 & 0 & -4\\ 2 & 0 & -2 \end{bmatrix} \]
or scaled by 1/2 if using repeat padding.
It gives the derivative of a blurred image, making our derivative estimate a little more robust to noise; up to a scale factor, this is the Sobel filter. This filter highlights only vertical edges; a transposed version of the same filter would respond to horizontal edges.
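The composition can be checked with scipy.signal.convolve2d; mode='same' with the default zero-fill boundary reproduces the first answer above (a verification sketch):

import numpy as np
from scipy import signal

blur = np.array([[1, 2, 1],
                 [2, 4, 2],
                 [1, 2, 1]])
deriv = np.array([[0, 0, 0],
                  [1, 0, -1],
                  [0, 0, 0]])

print(signal.convolve2d(blur, deriv, mode='same'))  # [[2 0 -2], [4 0 -4], [2 0 -2]]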
Compute the structure tensor for each of the following image patches.
I have it on good authority that these images are noise-free, so we can
safely skip the Sobel filter and compute gradients using 3x1 and 1x3
centered finite difference filters and repeat padding.
\[
\begin{bmatrix}
2 & 2 & 2\\
2 & 2 & 2\\
0 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
0 & 2 & 2\\
0 & 2 & 2\\
0 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
2 & 2 & 2\\
0 & 2 & 2\\
0 & 0 & 2
\end{bmatrix}
\]
Gradients: \[ X: \begin{bmatrix} 0 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 2 & 2 & 0\\ 2 & 2 & 0\\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 & 0\\ 2 & 2 & 0\\ 0 & 2 & 2 \end{bmatrix} \\ Y: \begin{bmatrix} 0 & 0 & 0\\ 2 & 2 & 2\\ 2 & 2 & 2 \end{bmatrix} \begin{bmatrix} 0 & 0 & 0\\ 0 & 2 & 2\\ 0 & 2 & 2 \end{bmatrix} \begin{bmatrix} 2 & 0 & 0\\ 2 & 2 & 0\\ 0 & 2 & 0 \end{bmatrix} \]
Structure tensor: \[ \begin{bmatrix} 0 & 0 \\ 0 & 24 \end{bmatrix} \begin{bmatrix} 16 & 4 \\ 4 & 16 \end{bmatrix} \begin{bmatrix} 16 & 12 \\ 12 & 16 \end{bmatrix} \]
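A sketch of the whole computation in numpy, using np.pad with mode='edge' for repeat padding; the difference directions below are chosen to match the gradient signs above (flipping either one only flips the sign of the off-diagonal entry and leaves the eigenvalues unchanged):

import numpy as np

def structure_tensor(patch):
    # Centered finite differences with repeat (edge) padding, then sum the
    # products of the gradient components over the whole patch.
    p = np.pad(patch.astype(float), 1, mode='edge')
    ix = p[1:-1, 2:] - p[1:-1, :-2]     # f(r, c+1) - f(r, c-1)
    iy = p[:-2, 1:-1] - p[2:, 1:-1]     # f(r-1, c) - f(r+1, c)
    return np.array([[np.sum(ix * ix), np.sum(ix * iy)],
                     [np.sum(ix * iy), np.sum(iy * iy)]])

patches = [
    np.array([[2, 2, 2], [2, 2, 2], [0, 0, 0]]),
    np.array([[0, 2, 2], [0, 2, 2], [0, 0, 0]]),
    np.array([[2, 2, 2], [0, 2, 2], [0, 0, 2]]),
]
for patch in patches:
    print(structure_tensor(patch))   # [[0, 0], [0, 24]], [[16, 4], [4, 16]], [[16, 12], [12, 16]]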
Using software of your choice (e.g., np.linalg.eigvals, or the closed-form formula for the
eigenvalues of a 2x2 matrix), compute the smallest eigenvalue of each of the structure tensors you
computed in the prior problem.
0, 12, 4
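For example, with np.linalg.eigvalsh (appropriate here because the structure tensors are symmetric):

import numpy as np

tensors = [
    np.array([[0, 0], [0, 24]]),
    np.array([[16, 4], [4, 16]]),
    np.array([[16, 12], [12, 16]]),
]
for M in tensors:
    print(np.linalg.eigvalsh(M)[0])   # smallest eigenvalue: 0.0, 12.0, 4.0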