Ray Generation

Given a camera's position \(\mathbf{pos}\) (written \(\mathbf{e}\) in the equations below) and its orthonormal basis (\(\mathbf{u}, \mathbf{v}, \mathbf{w}\)), the width and height of the desired image in pixels (\(n_x, n_y\)), and the field of view (\(fov\)), we can generate a ray for each pixel in the image.

Ray

We begin by noting that a ray is a parametric line that begins at a point and continues forever in one direction. A ray is defined as \(\mathbf{r} = \mathbf{p} + t\mathbf{d}\), where \(\mathbf{p}\) is the starting position of the ray, \(\mathbf{d}\) is the direction of the ray, and \(t \ge 0\) is the parameter that generates the set \(\mathbf{r}\) of positions along the ray.
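The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed design: the `Ray` class, its field names, and the `at` method are hypothetical names chosen for this example, and vectors are represented as plain 3-tuples.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Ray:
    origin: Vec3     # p, the starting position of the ray
    direction: Vec3  # d, the direction of the ray

    def at(self, t: float) -> Vec3:
        """Return the position p + t*d along the ray."""
        return tuple(p + t * d for p, d in zip(self.origin, self.direction))
```

Calling `Ray((0.0, 0.0, 0.0), (1.0, 2.0, 3.0)).at(2.0)` evaluates the point two direction-lengths from the origin along the ray.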

Perspective Ray Generation

To create a ray for a given camera, note that each ray must pass through a pixel in the image plane. Imagine an image plane in front of the camera. Each pixel has a position in 3D space, and the ray for that pixel must pass through this 3D point. We will call this point \(\mathbf{s}\). The position of \(\mathbf{s}\) can be computed as:

$$ \mathbf{s} = \mathbf{e} + u \mathbf{u} + v \mathbf{v} - d \mathbf{w} $$

Where the scalars \(u\) and \(v\) are related to the image resolution as follows:

$$ u = \ell + \frac{(r - \ell)(i + 0.5)} {n_x} $$ $$ v = b + \frac{(t - b)(j + 0.5)} {n_y} $$

\(i\) and \(j\) are the column and row indices of the pixel the ray is being computed for. \(n_x\) and \(n_y\) are the width and height of the image plane in pixels. \(\ell\), \(r\), \(b\), and \(t\) are the distances from the camera center to the left, right, bottom, and top of the image frame. Note that \(t\) here denotes the top of the frame, not the ray parameter from the previous section. The origin of this image frame is the bottom-left corner. If the camera is centered in the frame (as is typical), then these values follow from \(\frac{n_x}{2}\) and \(\frac{n_y}{2}\). For example, \(\ell\) would be \(-\frac{n_x}{2}\) and \(r\) would be \(\frac{n_x}{2}\); likewise \(b\) would be \(-\frac{n_y}{2}\) and \(t\) would be \(\frac{n_y}{2}\).
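As a rough sketch of the pixel-to-plane mapping, the following Python function evaluates the two formulas for \(u\) and \(v\), assuming the camera is centered in the frame so that \(\ell = -\frac{n_x}{2}\), \(r = \frac{n_x}{2}\), \(b = -\frac{n_y}{2}\), and \(t = \frac{n_y}{2}\). The function name `pixel_to_uv` is a hypothetical one chosen for this example.

```python
def pixel_to_uv(i, j, n_x, n_y):
    """Map pixel indices (i, j) to image-plane coordinates (u, v).

    Assumes a camera centered in the frame, with the frame measured
    in pixel units and its origin at the bottom-left corner.
    """
    l, r = -n_x / 2, n_x / 2  # left and right edges
    b, t = -n_y / 2, n_y / 2  # bottom and top edges
    # The +0.5 offset places the sample at the center of the pixel.
    u = l + (r - l) * (i + 0.5) / n_x
    v = b + (t - b) * (j + 0.5) / n_y
    return u, v
```

For a 4 by 2 image, pixel \((0, 0)\) maps to \((-1.5, -0.5)\) and pixel \((3, 1)\) maps to \((1.5, 0.5)\), so the samples are symmetric about the camera center.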

The scalar \(d\) controls how far the image plane is from the camera origin. Because the frame extents above are fixed by the image resolution, \(d\) is determined by the field of view: a smaller FOV results in a larger \(d\). Given a horizontal FOV, \(d\) can be computed as:

$$ d = \cot\left(\frac{fov}{2}\right) \frac{n_x}{2} $$
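The formula follows from the right triangle formed by the camera, the center of the image plane, and the right edge of the frame: \(\tan\left(\frac{fov}{2}\right) = \frac{n_x / 2}{d}\). A small Python sketch, assuming the FOV is given in radians (the function name `image_plane_distance` is hypothetical):

```python
import math


def image_plane_distance(fov, n_x):
    """Return d = cot(fov / 2) * n_x / 2 for a horizontal FOV in radians."""
    # Python's math module has no cot, so divide by tan instead.
    return (n_x / 2) / math.tan(fov / 2)
```

With a 90 degree FOV, \(\tan\left(\frac{fov}{2}\right) = 1\), so \(d\) is approximately \(\frac{n_x}{2}\); narrowing the FOV to 60 degrees increases \(d\), which matches the statement that a smaller FOV gives a larger \(d\).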

Finally, once \(\mathbf{s}\) is known, the ray direction \(\mathbf{d}\) (the vector, not to be confused with the scalar distance \(d\)) can be computed as \(\mathbf{s} - \mathbf{e}\). The ray origin \(\mathbf{p}\) is simply the camera position \(\mathbf{e}\).
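Putting the steps of this section together, a per-pixel ray could be generated roughly as follows. This is a sketch under stated assumptions rather than a definitive implementation: the camera is centered in the frame, the FOV is horizontal and in radians, vectors are plain 3-tuples, and the function and parameter names (`generate_ray`, `u_axis`, `v_axis`, `w_axis`) are hypothetical.

```python
import math


def generate_ray(i, j, n_x, n_y, fov, e, u_axis, v_axis, w_axis):
    """Return (origin, direction) of the ray through pixel (i, j)."""
    # Frame extents for a camera centered in the frame (pixel units).
    l, r = -n_x / 2, n_x / 2
    b, t = -n_y / 2, n_y / 2

    # Image-plane coordinates of the pixel center.
    u = l + (r - l) * (i + 0.5) / n_x
    v = b + (t - b) * (j + 0.5) / n_y

    # Distance from the camera to the image plane.
    d = (n_x / 2) / math.tan(fov / 2)

    # s = e + u*U + v*V - d*W, the 3D position of the pixel.
    s = tuple(e[k] + u * u_axis[k] + v * v_axis[k] - d * w_axis[k]
              for k in range(3))

    # The ray starts at the camera position and points toward s.
    direction = tuple(s[k] - e[k] for k in range(3))
    return e, direction
```

The returned direction is not normalized; whether to normalize it depends on how the rest of the renderer interprets the ray parameter, so that step is left out here.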