Transformation pipeline

To create 2D images from 3D scenes, OpenGL requires a series of transforms. Each transform changes the coordinate space in which the geometry is expressed.

The common operations are listed below, each followed by the transform it uses and the coordinate space that results.

  1. Move the models around to prepare the world
    Object transforms
    World coordinate space
  2. Adjust for the camera position and orientation
    Camera/Inverse camera/modelview transform
    Eye coordinates
  3. Reshape the viewable scene to fit OpenGL's view
    Projection transform
    Clip coordinate space
  4. Adjust for perspective
    Perspective divide
    Normalized device coordinate space
  5. Reshape OpenGL's view to fit the image resolution
    Viewport/image transform
    Window/image coordinate space

Object transforms

The individual models of the virtual world need to be composed into a coherent scene. Since each model is defined with its own origin and orientation, each one needs its own transforms (typically translations, rotations, and scales) to place it in the shared world coordinate space.
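As a minimal sketch of an object transform, the following composes a rotation and a translation into one model matrix and applies it to a vertex. The helper names (translation, rotation_y, mat_mul, mat_vec) are hypothetical and chosen for illustration; matrices are plain row-major nested lists acting on column vectors.

```python
import math

def translation(tx, ty, tz):
    """4x4 matrix that moves a point by (tx, ty, tz)."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]

def rotation_y(angle):
    """4x4 matrix that rotates a point by `angle` radians about the +y axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0, s, 0],
            [0, 1, 0, 0],
            [-s, 0, c, 0],
            [0, 0, 0, 1]]

def mat_mul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def mat_vec(m, v):
    """Product of a 4x4 matrix and a homogeneous column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

# The rightmost matrix is applied first: rotate the model, then move it.
model = mat_mul(translation(5, 0, 0), rotation_y(math.pi / 2))

# A vertex at (1, 0, 0) in object space lands near (5, 0, -1) in world space.
world_point = mat_vec(model, [1, 0, 0, 1])
```

The order of multiplication matters: swapping the two matrices would translate the vertex first and then swing it around the world origin.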

Camera transform

The OpenGL camera is fixed at the origin, looking down the -z axis, with its up direction along +y. This fixed camera is not useful for general scenes, so a camera transform must be applied. This transform adjusts the scene so that it renders from a specified camera view. A camera matrix can be formed using the camera position and its basis vectors. However, this matrix maps points from the camera's reference frame into world coordinates, which is the opposite of what is needed. Once we form the camera matrix, its inverse is used to move world coordinates into OpenGL's eye coordinates.
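The relationship between the camera matrix and its inverse can be sketched as follows. This assumes the camera basis vectors (right, up, back) are orthonormal, so the inverse can be written directly by transposing the rotation part instead of running a general matrix inversion. The function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def camera_matrix(right, up, back, position):
    """Camera-to-world matrix: basis vectors and position as columns."""
    return [[right[0], up[0], back[0], position[0]],
            [right[1], up[1], back[1], position[1]],
            [right[2], up[2], back[2], position[2]],
            [0, 0, 0, 1]]

def view_matrix(right, up, back, position):
    """World-to-eye matrix: the inverse of camera_matrix.

    Valid only when right, up, and back are orthonormal (an assumption
    of this sketch): the rotation is transposed and the position is
    negated and rotated into the camera frame.
    """
    return [[right[0], right[1], right[2], -dot(right, position)],
            [up[0], up[1], up[2], -dot(up, position)],
            [back[0], back[1], back[2], -dot(back, position)],
            [0, 0, 0, 1]]

def mat_vec(m, v):
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

# Camera sitting at (0, 0, 5) with OpenGL's default orientation
# (back is +z because the camera looks down -z).
right, up, back, position = (1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 5)
view = view_matrix(right, up, back, position)

# The world origin ends up 5 units in front of the camera, at z = -5.
eye_point = mat_vec(view, [0, 0, 0, 1])
```

Applying camera_matrix to eye_point returns the original world point, which is a quick way to check that the two matrices really are inverses.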

Projection transform

OpenGL can only render objects inside the cube spanning \( (-1, -1, -1) \) to \( (1, 1, 1) \). This space is called Normalized Device Coordinates. Our view of the world is often a pyramid-shaped frustum and thus must be reshaped to fit OpenGL's renderable region. The projection matrix warps the viewable scene to fit this region. After this transform, all geometry outside the viewable region is clipped away.
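A minimal sketch of a perspective projection matrix and the clip test is shown below. The matrix layout follows the one conventionally documented for gluPerspective; the function names perspective and inside_clip_volume are hypothetical, and the field of view and plane distances in the example are arbitrary.

```python
import math

def perspective(fovy, aspect, near, far):
    """Perspective projection for a camera looking down -z.

    fovy is the vertical field of view in radians; near and far are
    positive distances to the clipping planes.
    """
    f = 1.0 / math.tan(fovy / 2.0)
    return [[f / aspect, 0, 0, 0],
            [0, f, 0, 0],
            [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
            [0, 0, -1, 0]]

def mat_vec(m, v):
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def inside_clip_volume(clip):
    """A clip-space point is visible when -w <= x, y, z <= w."""
    x, y, z, w = clip
    return all(-w <= c <= w for c in (x, y, z))

proj = perspective(math.radians(90), 1.0, 1.0, 10.0)

# A point between the near and far planes is kept.
visible = mat_vec(proj, [0, 0, -5, 1])

# A point beyond the far plane (z = -20 with far = 10) is discarded.
too_far = mat_vec(proj, [0, 0, -20, 1])
```

Note that the last row copies -z into w, which is what lets the later perspective divide scale each vertex by its depth.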

There are two main projection transforms: orthographic and perspective. Orthographic projection preserves parallel lines and relative object sizes, but far away objects do not appear smaller. It is often used in design tools. Perspective projection makes far away objects appear smaller and matches what humans perceive, but it makes object sizes difficult to compare. It is often used when viewing realistic 3D scenes.
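The difference between the two projections can be illustrated with a small sketch. The orthographic matrix below follows the layout conventionally documented for glOrtho. The perspective side is reduced to a single hypothetical helper, perspective_x, that assumes a focal scale of 1 and a square aspect ratio.

```python
def orthographic(left, right, bottom, top, near, far):
    """Orthographic projection mapping a box onto the [-1, 1] cube."""
    return [[2 / (right - left), 0, 0, -(right + left) / (right - left)],
            [0, 2 / (top - bottom), 0, -(top + bottom) / (top - bottom)],
            [0, 0, -2 / (far - near), -(far + near) / (far - near)],
            [0, 0, 0, 1]]

def mat_vec(m, v):
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def perspective_x(x, z, focal_scale=1.0):
    """Projected x after the perspective divide, for a camera looking down -z."""
    return focal_scale * x / -z

ortho = orthographic(-2, 2, -2, 2, 1, 10)

# Two points with the same x offset but different depths.
near_point = [1, 0, -2, 1]
far_point = [1, 0, -8, 1]

# Orthographic: depth has no effect on the projected x.
ortho_near_x = mat_vec(ortho, near_point)[0]
ortho_far_x = mat_vec(ortho, far_point)[0]

# Perspective: the far point is drawn closer to the center of the image.
persp_near_x = perspective_x(near_point[0], near_point[2])
persp_far_x = perspective_x(far_point[0], far_point[2])
```

Under these assumptions the orthographic x values are equal, while the perspective x of the far point is a quarter of the near point's, since it is four times as far away.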

[Images: orthographic projection; perspective projection]

Perspective divide

Perspective transforms are non-linear, so a matrix multiplication alone cannot complete them and the homogeneous coordinate must be adjusted as well. This step divides the x, y, and z components of every vertex by its w component, which after a perspective projection is proportional to the vertex's distance from the camera. After this operation, the scene is confined to the \( (-1, -1, -1) \) to \( (1, 1, 1) \) range.
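The divide itself is a one-line operation. A minimal sketch, with a hypothetical function name and an arbitrary clip-space point:

```python
def perspective_divide(clip):
    """Convert a clip-space point (x, y, z, w) to normalized device coordinates."""
    x, y, z, w = clip
    return [x / w, y / w, z / w]

# A clip-space point with w = 4 (four units in front of the camera
# under a standard perspective matrix).
ndc = perspective_divide([2.0, 1.0, 3.0, 4.0])
```

For an orthographic projection w stays at 1, so the divide leaves the coordinates unchanged.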

Viewport transform

This transform adjusts the viewable region to fit the output image dimensions. It also produces the window-space depth values that are later used to discard the parts of far objects that are occluded by near objects.
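A minimal sketch of the viewport transform follows. The parameters mirror the arguments of glViewport (x0, y0, width, height) and glDepthRange (depth_near, depth_far), but the function itself is hypothetical and the 800 by 600 image size is only an example.

```python
def viewport_transform(ndc, x0, y0, width, height,
                       depth_near=0.0, depth_far=1.0):
    """Map normalized device coordinates to window coordinates.

    x and y go from [-1, 1] to the pixel rectangle starting at (x0, y0);
    z goes from [-1, 1] to the depth range [depth_near, depth_far].
    """
    x, y, z = ndc
    return [x0 + (x + 1) * width / 2,
            y0 + (y + 1) * height / 2,
            depth_near + (z + 1) * (depth_far - depth_near) / 2]

# The center of the NDC cube maps to the center of an 800x600 image.
center = viewport_transform([0, 0, 0], 0, 0, 800, 600)

# The lower-left-near corner maps to pixel (0, 0) with depth 0.
corner = viewport_transform([-1, -1, -1], 0, 0, 800, 600)
```

Smaller window-space z values are closer to the camera, which is what a depth comparison between two overlapping fragments relies on.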