Primitive Assembly, Clipping, and Rasterization
After the front end of the pipeline has run (which includes vertex shading, tessellation, and geometry shading), a fixed-function part of the pipeline performs a series of tasks that take the vertex representation of our scene and convert it into a series of pixels, which in turn need to be colored and written to the screen. The first step in this process is primitive assembly, which is the grouping of vertices into lines and triangles. Primitive assembly still occurs for points, but it is trivial in that case.
Once primitives have been constructed from their individual vertices, they are clipped against the displayable region, which usually means the window or screen, but can also be a smaller area known as the viewport. Finally, the parts of the primitive that are determined to be potentially visible are sent to a fixed-function subsystem called the rasterizer. This block determines which pixels are covered by the primitive (point, line, or triangle) and sends the list of pixels on to the next stage—that is, fragment shading.
As vertices exit the front end of the pipeline, their position is said to be in clip space. This is one of the many coordinate systems that can be used to represent positions. You may have noticed that the gl_Position variable that we have written to in our vertex, tessellation, and geometry shaders has a vec4 type, and that the positions we have produced by writing to it are all four-component vectors. This is what is known as a homogeneous coordinate. The homogeneous coordinate system is used in projective geometry because much of the math ends up being simpler in homogeneous coordinate space than it does in regular Cartesian space. Homogeneous coordinates have one more component than their equivalent Cartesian coordinate, which is why our three-dimensional position vector is represented as a four-component variable.
Although the output of the front end is a four-component homogeneous coordinate, clipping occurs in Cartesian space. Thus, to convert from homogeneous coordinates to Cartesian coordinates, OpenGL performs a perspective division, which involves dividing all four components of the position by the last, w component. This has the effect of projecting the vertex from the homogeneous space to the Cartesian space, leaving w as 1.0. In all of the examples so far, we have set the w component of gl_Position as 1.0, so this division has not had any effect. When we explore projective geometry in a short while, we will discuss the effect of setting w to values other than 1.0.
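The perspective division described above can be sketched in a few lines of C. This is only an illustration of the arithmetic, not OpenGL's implementation: the vec4 struct and the perspective_divide() function are hypothetical names introduced here, and the sketch assumes that w is nonzero.

```c
#include <assert.h>

/* Hypothetical four-component position, mirroring GLSL's vec4. */
typedef struct { float x, y, z, w; } vec4;

/* Divide all four components by w. The result has w == 1.0 for any
 * finite, nonzero input w. This sketch assumes w is nonzero and only
 * asserts that precondition. */
static vec4 perspective_divide(vec4 clip)
{
    assert(clip.w != 0.0f);
    vec4 ndc = {
        clip.x / clip.w,
        clip.y / clip.w,
        clip.z / clip.w,
        clip.w / clip.w
    };
    return ndc;
}
```

With w set to 1.0, as in the examples so far, the function returns its input unchanged. With w set to 2.0, every component is halved.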
After the perspective division, the resulting position is in normalized device space. In OpenGL, the visible region of normalized device space is by default the volume that extends from −1.0 to 1.0 in the x, y, and z dimensions (calling glClipControl() with GL_ZERO_TO_ONE changes the z range to 0.0 to 1.0). Any geometry that is contained in this region may become visible to the user and anything outside of it should be discarded. The six sides of this volume are formed by planes in three-dimensional space. As a plane divides a coordinate space in two, the volumes on each side of the plane are called half-spaces.
Before passing primitives on to the next stage, OpenGL performs clipping by determining which side of each of these planes the vertices of each primitive lie on. Each plane effectively has an “outside” and an “inside.” If a primitive’s vertices all lie on the “outside” of any one plane, then the whole primitive is thrown away. If all of a primitive’s vertices are on the “inside” of all the planes (and therefore inside the view volume), then it is passed through unaltered. Primitives that are partially visible (which means that they cross one of the planes) must be handled specially. More details about how this works are given in the “Clipping” section in Chapter 7.
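The three-way decision described above can be sketched with per-vertex region codes, in the style of the Cohen-Sutherland outcode technique. This is a swapped-in illustration rather than a description of how any particular OpenGL implementation works. The type and function names are hypothetical, and the lower z bound of the view volume is passed in as the assumed parameter z_min rather than fixed.

```c
/* Hypothetical three-component position in normalized device space. */
typedef struct { float x, y, z; } vec3;

typedef enum { CLIP_REJECT, CLIP_ACCEPT, CLIP_PARTIAL } clip_result;

/* Bit i of the result is set when the vertex lies outside plane i. */
static unsigned outcode(vec3 v, float z_min)
{
    unsigned code = 0u;
    if (v.x < -1.0f) code |= 1u << 0;
    if (v.x >  1.0f) code |= 1u << 1;
    if (v.y < -1.0f) code |= 1u << 2;
    if (v.y >  1.0f) code |= 1u << 3;
    if (v.z < z_min) code |= 1u << 4;
    if (v.z >  1.0f) code |= 1u << 5;
    return code;
}

static clip_result classify_triangle(const vec3 tri[3], float z_min)
{
    unsigned c0 = outcode(tri[0], z_min);
    unsigned c1 = outcode(tri[1], z_min);
    unsigned c2 = outcode(tri[2], z_min);

    /* All three vertices are outside the same plane: discard. */
    if ((c0 & c1 & c2) != 0u) return CLIP_REJECT;

    /* Every vertex is inside every plane: pass through unaltered. */
    if ((c0 | c1 | c2) == 0u) return CLIP_ACCEPT;

    /* Otherwise the triangle needs the special handling. */
    return CLIP_PARTIAL;
}
```

Note that CLIP_PARTIAL is conservative: a triangle whose vertices lie outside different planes lands in this category even if it turns out to miss the view volume entirely.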
After clipping, all of the vertices of the geometry have x and y coordinates that lie between −1.0 and 1.0. Together with the z coordinate, these are known as normalized device coordinates. However, the window that you’re drawing to has coordinates that usually start from (0, 0) at the bottom left and range to (w − 1, h − 1), where w and h are the width and height of the window in pixels, respectively. To place your geometry into the window, OpenGL applies the viewport transform, which applies a scale and offset to the vertices’ normalized device coordinates to move them into window coordinates. The scale and bias to apply are determined by the viewport bounds and depth range, which you can set by calling glViewport() and glDepthRange(). Their prototypes are
void glViewport(GLint x, GLint y, GLsizei width, GLsizei height);
void glDepthRange(GLdouble nearVal, GLdouble farVal);
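This transform takes the following form:

$$
\begin{pmatrix} x_w \\ y_w \\ z_w \end{pmatrix} =
\begin{pmatrix}
\dfrac{p_x}{2}\, x_d + o_x \\[6pt]
\dfrac{p_y}{2}\, y_d + o_y \\[6pt]
\dfrac{f - n}{2}\, z_d + o_z
\end{pmatrix}
$$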
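Here, $x_w$, $y_w$, and $z_w$ are the resulting coordinates of the vertex in window space, and $x_d$, $y_d$, and $z_d$ are the incoming coordinates of the vertex in normalized device space. $p_x$ and $p_y$ are the width and height of the viewport in pixels, and $n$ and $f$ are the near and far values of the depth range set by glDepthRange(). Finally, $o_x$, $o_y$, and $o_z$ are the offsets that locate the center of the viewport: $o_x = x + p_x/2$ and $o_y = y + p_y/2$, where $x$ and $y$ are the lower-left corner passed to glViewport(), and $o_z = (n + f)/2$.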
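As a rough illustration, the viewport transform can be written out in C as follows. The viewport_state struct and viewport_transform() function are hypothetical names used only for this sketch. It assumes the default OpenGL convention in which the normalized device z coordinate spans −1.0 to 1.0, and it takes the offsets to be the center of the viewport and the midpoint of the depth range.

```c
typedef struct { float x, y, z; } vec3;

/* Hypothetical container for the state set by glViewport() and
 * glDepthRange(). */
typedef struct {
    float x, y;          /* lower-left corner of the viewport */
    float width, height; /* viewport size in pixels (p_x and p_y) */
    float n, f;          /* depth range near and far values */
} viewport_state;

/* Scale and offset normalized device coordinates into window space.
 * Assumes ndc.z lies in [-1, 1], the OpenGL default. */
static vec3 viewport_transform(vec3 ndc, viewport_state vp)
{
    vec3 win;
    win.x = 0.5f * vp.width  * ndc.x + (vp.x + 0.5f * vp.width);
    win.y = 0.5f * vp.height * ndc.y + (vp.y + 0.5f * vp.height);
    win.z = 0.5f * (vp.f - vp.n) * ndc.z + 0.5f * (vp.n + vp.f);
    return win;
}
```

For an 800 × 600 viewport at the origin with a depth range of 0.0 to 1.0, the center of normalized device space maps to the window coordinate (400, 300) with a depth of 0.5.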
Before a triangle is processed further, it may optionally pass through a stage called culling, which determines whether the triangle faces toward or away from the viewer and, based on the result, decides whether to draw it. If the triangle faces toward the viewer, then it is considered to be front-facing; otherwise, it is said to be back-facing. It is very common to discard triangles that are back-facing because when an object is closed, any back-facing triangle will be hidden by another front-facing triangle.
To determine whether a triangle is front- or back-facing, OpenGL will determine its signed area in window space. One way to determine the area of a triangle is to take the cross product of two of its edges. The equation for this is

$$
a = \frac{1}{2} \sum_{i=0}^{2} \left( x_w^{i}\, y_w^{i \oplus 1} - x_w^{i \oplus 1}\, y_w^{i} \right)
$$
Here, $x_w^i$ and $y_w^i$ are the coordinates of the ith vertex of the triangle in window space and i ⊕ 1 is (i + 1) mod 3. If the area is positive, then the triangle is considered to be front-facing; if it is negative, then it is considered to be back-facing. The sense of this computation can be reversed by calling glFrontFace() with dir set to either GL_CW or GL_CCW (where CW and CCW stand for clockwise and counterclockwise, respectively). This is known as the winding order of the triangle, and the clockwise or counterclockwise terms refer to the order in which the vertices appear in window space. By default, this state is set to GL_CCW, indicating that triangles whose vertices are in counterclockwise order are considered to be front-facing and those whose vertices are in clockwise order are considered to be back-facing. If the state is GL_CW, then $a$ is simply negated before being used in the culling process. Figure 3.3 illustrates the two winding orders.
Figure 3.3: Clockwise (left) and counterclockwise (right) winding order
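The facing test can be sketched in C as shown here. The sketch assumes the usual signed-area form, which is half the sum over the three vertices of x_i · y_(i⊕1) − x_(i⊕1) · y_i, with i ⊕ 1 meaning (i + 1) mod 3. The names vec2, signed_area(), and is_front_facing() are hypothetical, and the front_is_ccw flag stands in for the state set by glFrontFace().

```c
/* Hypothetical two-component window-space position. */
typedef struct { float x, y; } vec2;

/* Signed area of a triangle in window space: positive when the
 * vertices appear in counterclockwise order, negative when clockwise. */
static float signed_area(const vec2 v[3])
{
    float sum = 0.0f;
    for (int i = 0; i < 3; ++i) {
        int j = (i + 1) % 3;   /* i "circle-plus" 1 */
        sum += v[i].x * v[j].y - v[j].x * v[i].y;
    }
    return 0.5f * sum;
}

/* front_is_ccw: nonzero models GL_CCW (the default), zero models GL_CW.
 * A zero-area triangle is reported as not front-facing, which is an
 * arbitrary choice made for this sketch. */
static int is_front_facing(const vec2 v[3], int front_is_ccw)
{
    float a = signed_area(v);
    if (!front_is_ccw) {
        a = -a;   /* GL_CW reverses the sense of the test */
    }
    return a > 0.0f;
}
```

For the triangle (0, 0), (4, 0), (0, 3), the signed area is 6, so it is front-facing under GL_CCW. Listing the same vertices in the opposite order gives −6.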
Once the direction that the triangle is facing has been determined, OpenGL is capable of discarding either front-facing, back-facing, or even both types of triangles. By default, OpenGL will render all triangles, regardless of which way they face. To turn on culling, call glEnable() with cap set to GL_CULL_FACE. When you enable culling, OpenGL will cull back-facing triangles by default. To change which types of triangles are culled, call glCullFace() with face set to GL_FRONT, GL_BACK, or GL_FRONT_AND_BACK.
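The way the culling state combines with a triangle's facing can be modeled in a few lines of C. This is only a model of the behavior described above, not OpenGL code: the cull_face enumeration and cull_state struct are hypothetical stand-ins for the state controlled by glEnable(GL_CULL_FACE) and glCullFace().

```c
/* Hypothetical stand-ins for GL_FRONT, GL_BACK, and GL_FRONT_AND_BACK. */
typedef enum { CULL_FRONT, CULL_BACK, CULL_FRONT_AND_BACK } cull_face;

typedef struct {
    int       enabled; /* models glEnable(GL_CULL_FACE); off by default */
    cull_face mode;    /* models glCullFace(); CULL_BACK by default */
} cull_state;

/* Returns nonzero if a triangle with the given facing is discarded. */
static int is_culled(cull_state state, int front_facing)
{
    if (!state.enabled) {
        return 0;   /* culling is off: every triangle is drawn */
    }
    switch (state.mode) {
    case CULL_FRONT:          return front_facing;
    case CULL_BACK:           return !front_facing;
    case CULL_FRONT_AND_BACK: return 1;
    }
    return 0;
}
```

With the default state, nothing is culled. Enabling culling without changing the mode discards only back-facing triangles.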
As points and lines don’t have any geometric area, this facing calculation doesn’t apply to them and they can’t be culled at this stage.
Rasterization is the process of determining which fragments might be covered by a primitive such as a line or a triangle. There are myriad algorithms for doing this, but most OpenGL systems will settle on a half-space–based method for triangles, as it lends itself well to parallel implementation. Essentially, OpenGL will determine a bounding box for the triangle in window coordinates and test every fragment inside it to determine whether it is inside or outside the triangle. To do this, it treats each of the triangle’s three edges as a line that divides the window into two half-spaces.
Fragments that lie on the interior of all three edges are considered to be inside the triangle and fragments that lie on the exterior of any of the three edges are considered to be outside the triangle. Because the algorithm to determine which side of a line a point lies on is relatively simple and is independent of anything besides the position of the line’s endpoints and of the point being tested, many tests can be performed concurrently, providing the opportunity for massive parallelism.
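The half-space method can be sketched in C as a serial loop. This is a minimal illustration under several assumptions, not a description of real hardware: the triangle is taken to be counterclockwise in window space, coverage is tested at pixel centers, and the tie-breaking rules that real implementations apply to fragments lying exactly on an edge are ignored. All names are hypothetical.

```c
typedef struct { float x, y; } vec2;

/* Edge function: positive when p lies to the left of the directed
 * line from a to b, negative when it lies to the right. */
static float edge(vec2 a, vec2 b, vec2 p)
{
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

static float min3(float a, float b, float c)
{
    float m = a < b ? a : b;
    return m < c ? m : c;
}

static float max3(float a, float b, float c)
{
    float m = a > b ? a : b;
    return m > c ? m : c;
}

/* Clamp v to [lo, hi] and truncate. Truncation equals floor here
 * because the clamped value is never negative (lo is 0 below). */
static int clamp_to_int(float v, int lo, int hi)
{
    if (v < (float)lo) return lo;
    if (v > (float)hi) return hi;
    return (int)v;
}

/* Marks covered pixels in mask (width * height bytes, zeroed by the
 * caller) and returns how many were covered. Only pixels inside the
 * triangle's bounding box are visited. Assumes a counterclockwise
 * triangle; a clockwise one yields no coverage. */
static int rasterize(const vec2 v[3], int width, int height,
                     unsigned char *mask)
{
    int x0 = clamp_to_int(min3(v[0].x, v[1].x, v[2].x), 0, width - 1);
    int x1 = clamp_to_int(max3(v[0].x, v[1].x, v[2].x), 0, width - 1);
    int y0 = clamp_to_int(min3(v[0].y, v[1].y, v[2].y), 0, height - 1);
    int y1 = clamp_to_int(max3(v[0].y, v[1].y, v[2].y), 0, height - 1);
    int covered = 0;

    for (int y = y0; y <= y1; ++y) {
        for (int x = x0; x <= x1; ++x) {
            vec2 p = { (float)x + 0.5f, (float)y + 0.5f };
            /* Inside means on the interior side of all three edges. */
            int inside = edge(v[0], v[1], p) >= 0.0f &&
                         edge(v[1], v[2], p) >= 0.0f &&
                         edge(v[2], v[0], p) >= 0.0f;
            mask[y * width + x] = (unsigned char)inside;
            covered += inside;
        }
    }
    return covered;
}
```

Each iteration of the inner loop depends only on the three vertices and the position being tested, so the iterations could run in any order or all at once. That independence is the property that makes the method attractive for parallel hardware.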