Introduction to Scene Composition: Elements and Principles

Scene composition is the invisible structure that organizes your shot elements into the imagery needed to tell your story. Composition can be judged by how it serves that end goal and whether it creates the suspension of disbelief. Your audience might not be able to articulate the flaws of a composition in artistic or design terms; they just know that they didn't enjoy the show or couldn't follow the story.


A max artist's work is to give form to feeling, and it is the scene composition that provides the skeleton of the form. If this basic structure is flawed, your imagery—no matter how beautiful—will not have the impact it could have had in a well-designed composition.

The goal of the composition is to create an illusion of 3D space that draws your audience into the magical universe you have created.

The scenes in the short movie Area51.avi are divided into shot elements that are combined to create the whole scene. If you were to create all these elements in a single max file, the result would be a huge, unwieldy file. Eventually it would become impossible to work in the scene, and each individual frame would take too long to render.

To remedy this, you will divide the shot elements into individual layers. This will give you more artistic freedom and control over your final imagery than creating your images from one single max file that includes all the effects and animation.

Layer Division and Structure

All the elements of an individual shot are grouped by where they sit in the depth planes or layers of the image. In a basic composition, there are three depth-plane layers: foreground, middleground, and background.

With minor variations, the digital content of the scenes produced in most studios will be divided in a similar fashion. These depth planes are divided further into individual images. When those images are combined, or composited, the result is a complete scene.
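The idea of combining separately rendered layers into one image can be sketched in a few lines of Python. This is only an illustration of the widely used "over" compositing operation applied back to front, not a description of how max or any particular compositor implements it. The pixel layout (red, green, blue, alpha as floats from 0.0 to 1.0, with non-premultiplied color) and the function names `over` and `composite_pixel` are assumptions made for this sketch.

```python
from typing import List, Tuple

# Assumed layout for this sketch: (red, green, blue, alpha), each 0.0..1.0,
# with straight (non-premultiplied) color values.
Pixel = Tuple[float, float, float, float]


def over(top: Pixel, bottom: Pixel) -> Pixel:
    """Place one pixel over another using the standard "over" operation."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        # Both pixels are fully transparent, so there is no color to keep.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t: float, b: float) -> float:
        # The top color is weighted by its alpha; the bottom color shows
        # through only where the top pixel is transparent.
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def composite_pixel(layers_back_to_front: List[Pixel]) -> Pixel:
    """Stack layers from the background forward, one pixel at a time."""
    result: Pixel = (0.0, 0.0, 0.0, 0.0)  # start from an empty frame
    for layer in layers_back_to_front:
        result = over(layer, result)
    return result


if __name__ == "__main__":
    sky = (0.2, 0.4, 0.8, 1.0)    # BG: opaque sky
    smoke = (1.0, 1.0, 1.0, 0.5)  # MG: half-transparent smoke
    tower = (0.0, 0.0, 0.0, 1.0)  # FG: opaque tower silhouette
    print(composite_pixel([sky, smoke]))
    print(composite_pixel([sky, smoke, tower]))
```

Because each layer is a separate input, replacing the smoke layer with a revised render changes only that entry in the list; the sky and tower images are reused as they are.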

The titles, credits, establishing shot, and interior of the Area 51 labs in Area51.avi will be created from individually rendered sets of still images. Those images will be composited to form the complete movie. Using discrete layers creates two notable advantages:

  • The layer structure gives you the freedom to isolate and modify individual elements of a scene without impacting the animation or lighting of the surrounding elements.

  • The layer structure allows cooperative team workflow and reduces the risk of producing mistakes in the final imagery. This means that if something is wrong with an image, individual layers can be revised and re-rendered in a fraction of the time it would take to do the same with nonlayered imagery.

The first step in understanding how to divide a composition into layers is to understand how the background, middleground, and foreground image layers work together to support the visual story you are trying to tell.

To create believable imagery, you must create the illusion that the scene is actually a smaller visible part of a far larger world. The first step in doing this is to divide the composition into three layers or image planes. The following images are from a short test movie created using the concepts and techniques from 3D Studio MAX 3.0 Workshop and additional shot elements that I created (see Figure 10).

Figure 10 Each image layer has its own overlapping structure. Depth and interest are created in the composition by carefully placing each element in the shot.

The image plane closest to your point of view is the foreground, or FG. In the foreground of this shot are the cooling towers closest to the camera POV. These towers have been rendered separately from all the other elements in the shot (see Figure 11).

Figure 11 The foreground elements were rendered separately using the Matte/Shadow material, a specialized material developed for that purpose.

The plane that is farthest away from you is the background, or BG. The BG in this shot is the sky and the animating cloud plane (see Figure 12).

Figure 12 The animating cloud layer is created by applying an animated texture to the surface of a flat plane.

The elements between the FG and BG are in the middleground, or MG, which includes the remaining cooling towers, the lights and doodads on top of the towers, and the smoke layer. The two remaining planes, the ground and mist planes, appear in all three depth planes, and they are composited separately as well.

Organizing your shot elements into an effective composition is accomplished by applying basic artistic principles to your work. These principles might seem simple, but they are powerful when used effectively; the quality of your imagery will suffer if they are not employed.

Basic Composition Concepts

There are four basic concepts that should be considered in every composition:

  • Point of view (POV)—Determines the viewing position of the audience.

  • Focal point—Used to direct the viewers' eyes to the important storytelling parts of the composition. Focal points are created by the application of the other basic compositional principles.

  • Paths of motion—Shot elements in your scene create paths of motion in and through the shot. These paths are used to attract and hold the attention of the viewer.

  • The illusion of depth—Must be created if you want to draw your audience into the world of the story you are creating.


The first thing you need to consider in creating your composition is the audience's point of view in the scene. The placement of the camera establishes the point of view. Where you place the audience's POV is an important decision because it affects their perception of the world they are viewing.

A low-angle POV (as from a child's perspective) can be used to exaggerate size, such as that of an approaching monster, thus creating tension and dread. A bird's-eye POV can be used to create the illusion of height and vertigo (see Figure 13).

Figure 13 Using an extreme POV can be very effective when you want to create drama and emphasize relative size and height.

Focal Point

When the layers of a composition are well organized, the eyes of the viewers will be directed to the storytelling parts of the scene, called focal points. The purpose of a focal point is to tell the point of the story by inviting the viewers to visually enter into the world and stay long enough for the point to be made.

The creative challenge in the visual imagery for any individual act, scene, sequence, or shot is to overcome the short attention span of your audience. You have about 3 seconds to convince the viewers' subconscious to watch what you have put before them. After that you've got another 7 to 10 seconds to tell the core of the story visually. If you get past those first two perceptual thresholds, you will have the attention of the audience. Figure 14 shows how the composition of a scene uses contrast and the placement of shot elements to lead the eye to a focal point.

Figure 14 Contrast between the light and dark areas of this image draws the eye into the forest and focuses it on the figure in the clearing.

Paths of Motion

When a shot contains animated objects and effects, the path of that animation creates a movement vector or a line that defines the direction and energy of an object's motion in space. Paths of motion created when a shot element moves within the scene direct the audience's attention to specific places in the scene (see Figure 15).

Figure 15 Paths of motion are created by the placement, shape, and animation of an object within the composition of a scene.

"It Looks Like It's Going 150 MPH Sitting Still!"

A car's design or styling can create a visual or implied image of speed. Shot elements have similar vectors that can also create visual or implied motion. Balancing these competing motion vectors is not easy, and when the balance fails, the result is visual confusion or noise.

The Illusion of Depth

The illusion of depth doesn't happen just because max is a 3D program. It is created by the deliberate use of overlap, value and color contrast, and atmospheric perspective. The color plates in the color section of this book are good examples of the illusion of depth. As you read the following descriptions, look at the images and observe how these important concepts were put into practice.


Overlap is created when an element is placed in the composition so that it obscures parts of the elements behind it. There are two types of overlap that can be used in your scenes to create the illusion of depth in your composition:

  • Physical overlap—Used to give the viewer some visual cues to help understand the relative depth position of the objects within a scene. An obvious place of overlap in Area 51 is in Sc-01. The dark mass of the foreground tower is silhouetted against the sky.

  • Screen boundary overlap—Created when an individual element (such as the foreground tower) intersects with the borders of the viewport (part of the tower is off-screen). Psychologically, the viewers know that the rest of the tower that is off-screen is still there even though they can't see it. The audience's subconscious completes a picture that includes more of the same things it has already seen.


Designing your composition so that elements go off-screen in the depth layers of the shot creates the visual height, width, and depth you are looking for. This visual depth cue invites the audience to mentally accept that there is more of the off-screen world to explore.


Overlap is a powerful tool. However, when objects in a scene come close enough to touch each other without overlapping, a tangency in the image is formed. Tangency in a composition negates the illusion of depth, and the viewer becomes confused trying to resolve the relative visual depth of each object. If the camera is moving in the shot, tangency is less of a problem. In shots using a fixed camera or slow camera movements, visual tangency is something to watch out for and is easily fixed either by overlapping the adjacent elements or by moving the tangent elements farther apart.
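As a rough way to think about tangency, you can compare the screen-space rectangles that two elements occupy in the frame. The sketch below is only an illustration under stated assumptions: the `Box` type, the `classify` function, and the 2-pixel tolerance are all hypothetical, and real elements are rarely simple rectangles. It is not a feature of max.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Screen-space rectangle in pixels (hypothetical helper type)."""
    left: float
    top: float
    right: float
    bottom: float


def axis_gap(a_min: float, a_max: float, b_min: float, b_max: float) -> float:
    """Distance between two intervals; negative when they overlap."""
    return max(a_min, b_min) - min(a_max, b_max)


def classify(a: Box, b: Box, tolerance: float = 2.0) -> str:
    """Label two elements as 'overlap', 'tangent', or 'separate'."""
    gap_x = axis_gap(a.left, a.right, b.left, b.right)
    gap_y = axis_gap(a.top, a.bottom, b.top, b.bottom)
    # Rectangles overlap only when they overlap on both axes, so the
    # larger of the two gaps decides the result.
    gap = max(gap_x, gap_y)
    if gap < 0:
        return "overlap"
    if gap <= tolerance:
        return "tangent"  # close enough to touch without overlapping
    return "separate"


if __name__ == "__main__":
    fg_tower = Box(0, 0, 100, 200)
    print(classify(fg_tower, Box(101, 0, 200, 200)))  # edges nearly touch
    print(classify(fg_tower, Box(80, 0, 200, 200)))   # clear overlap
    print(classify(fg_tower, Box(150, 0, 250, 200)))  # clear separation
```

Either of the fixes described above moves a "tangent" pair into one of the other two categories: sliding an element over its neighbor produces an overlap, and pulling the two apart produces a separation.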


Contrast—the light and dark values seen in a composition—is used by your eyes to determine an object's position and movement in 3D space. Color temperature—the relative warmth or coolness of a color—will also reinforce the illusion of depth. As a general rule of thumb, dark objects and cool colors appear to recede (move back) from the viewer, whereas light objects and warm colors appear to advance (come forward) toward the viewer.

There are two basic types of contrast used to reinforce the illusion of depth in a scene:

  • Value contrast—Created by the deliberate overlap of dark and light values in the shot. When using physical overlap to create depth, it is also important to consider the contrasting value of the elements. Value is a term used to describe the relative lightness or darkness of the color or lighting of an element. As you look at the scenes in Area51.avi, you can see the effort made to place dark against light in every depth plane from foreground to background. This iterative placement creates interest and allows the shot elements to be silhouetted against each other.

  • Color temperature—Describes whether an element's color is warm or cool in its relationship to the other objects around it. The interior lab scene in Area51.avi is lit up by the hot glow of the plasma generators. The objects in the shadows are lit with blue and are textured in cool blue tones, as are the upper parts of the interior of the cooling tower. This helps to create visual depth. It also makes the composition more interesting to look at.
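Both kinds of contrast can be put into rough numbers, which is sometimes useful when you are comparing two elements that sit next to each other in a shot. The following sketch is an approximation under stated assumptions: it treats value as relative luminance using the Rec. 709 weights and ignores gamma, and it judges warm versus cool by simply comparing the red and blue channels. The function names are hypothetical, and the warm/cool test in particular is a crude heuristic, not a color-science definition.

```python
from typing import Tuple

# Assumed layout: (red, green, blue), each a float from 0.0 to 1.0.
Color = Tuple[float, float, float]


def value(color: Color) -> float:
    """Approximate lightness: 0.0 is black, 1.0 is white (Rec. 709 weights)."""
    r, g, b = color
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def value_contrast(a: Color, b: Color) -> float:
    """Difference in value between two elements, from 0.0 to 1.0."""
    return abs(value(a) - value(b))


def is_warm(color: Color) -> bool:
    """Crude heuristic: call a color warm when red outweighs blue."""
    r, _, b = color
    return r > b


if __name__ == "__main__":
    plasma_glow = (1.0, 0.5, 0.1)  # hot orange light
    shadow_blue = (0.1, 0.2, 0.5)  # cool blue shadow tone
    print(value_contrast(plasma_glow, shadow_blue))
    print(is_warm(plasma_glow), is_warm(shadow_blue))
```

A large value contrast between overlapping elements suggests that one will silhouette cleanly against the other; a contrast near zero suggests they may merge visually.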

Atmospheric Perspective

Atmospheric perspective provides visual clues about how close or far an object is in relation to your point of view. The world is smothered in an ocean of air full of invisible particles of dirt, smoke, and water vapor. Light passing through the mass of air between mountains or buildings far away from my POV makes them appear hazy. The closer I am to them, the less distance the light has to travel to reach my eyes, so they appear sharper and less hazy. This visual cue tells my brain they are closer.

This is the idea behind atmospheric perspective. Fog or atmospheric haze gives viewers information about depth in the environment around them (see Figure 16).

Figure 16 Both images show how atmospheric perspective can be used to create image depth. Fog creates it by causing your scene to disappear into the mist. Haze, as seen on a sunny day, makes objects far from your POV appear softer and less defined, and reduces their color saturation.
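The effect can be sketched numerically by blending an object's color toward the haze color as its distance from the POV increases. The example below uses a simple exponential falloff, a common model for fog in computer graphics; it is an illustration only and does not describe how the fog and atmosphere effects in max are implemented. The function names, the `density` parameter, and the sample distances are assumptions made for this sketch.

```python
import math
from typing import Tuple

# Assumed layout: (red, green, blue), each a float from 0.0 to 1.0.
Color = Tuple[float, float, float]


def haze_amount(distance: float, density: float) -> float:
    """Fraction of haze seen at a distance: 0.0 up close, approaching 1.0 far away."""
    return 1.0 - math.exp(-density * distance)


def apply_haze(color: Color, haze_color: Color,
               distance: float, density: float) -> Color:
    """Blend an object's color toward the haze color based on distance."""
    amount = haze_amount(distance, density)
    r, g, b = (c + (h - c) * amount for c, h in zip(color, haze_color))
    return (r, g, b)


if __name__ == "__main__":
    tower = (0.30, 0.10, 0.05)  # dark, saturated tower color
    haze = (0.70, 0.75, 0.80)   # pale blue-gray haze
    for distance in (0.0, 50.0, 800.0):
        print(distance, apply_haze(tower, haze, distance, density=0.002))
```

As the distance grows, the tower's color drifts toward the pale haze color, so distant towers lose both contrast and saturation, which is the cue described above.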

It's important to remember that the ideas discussed here are just a few of the concepts I've found to be valuable. There are many other approaches to creating digital content that are just as valid. I encourage you to search them out and study them. Put the ideas and principles that work for you into practice and synthesize your own personal approach to digital content creation in max.

