The creation of 3D images
The process of creating images from scenes is called rendering;
recently the phrase image synthesis has also been applied to the
creation of computer-generated three-dimensional images. The following
components give an overview of the general process:
The Object Model is derived from the modelling
processes that we have examined earlier in this topic.
The Illumination Model describes how light (and other energy components)
is defined and managed.
Texture refers to the group of techniques
(texture mapping, bump mapping, reflection mapping etc.) used to add effects
to the surface of objects.
Shading refers to variations in surface lighting that result from
the application of an illumination model.
Image Formation is the process of bringing together the other components
to form a synthetic image; this stage is usually called rendering, or image synthesis.
Forms of illumination
Different rendering techniques are based on the use
of different illumination (lighting) models. Lighting models are composed
of some combination of a number of different 'types' of lighting; each
model can also employ a different formulation of how each component works.
In simple terms, the more lighting components are incorporated, and
the more effectively they are formulated, the more realistic the generated
image will be. As we shall see, however, there are computational
implications to some components that lead us sometimes to adopt a trade-off
between level of realism and performance.
The first, most general component of a lighting model is ambient
light. Ambient light is diffuse, non-directional light that is the result
of multiple reflections from surrounding surfaces. Put simply, it is light
that has no obvious source; it is 'everywhere'. When a scene has a low
ambient light level, it is going to be rendered as a 'dark' scene (although
this may be offset by more specific point sources).
All surfaces (unless they are true black bodies) reflect light to
some extent; what we call 'shiny' surfaces reflect most of the light falling
on them, and 'dull' surfaces reflect much less. However, unlike the simple
reflection diagrams we see in school physics texts, real surfaces are not
perfectly smooth; light is reflected in slightly different directions by
the variations in small-scale roughness of a surface. This results in a
'cone' of reflection, rather than a single coherent beam. Diffuse
reflection means that the light is reflected equally in all directions;
there is an even spread within the cone.
Shiny surfaces also generate other effects that need to be modelled;
reflection from such surfaces is called specular reflection. Generally,
when light is reflected from a surface the colour of the reflected light
is based on the colour of the object. However, shiny surfaces also
generate highlights (small areas on a curved surface that appear
as bright spots, with the colour of the light source, rather than the object),
and these have to be modelled properly. The colour of the highlight, for
example, is a combination of the original light source and the characteristics
of the surface. The distribution of the reflected light will also be irregular,
requiring a more complex model.
A sophisticated lighting model needs to be able to model these effects
very well; the most commonly-used reflection formulation - and its associated
shading model - is probably that described by Robert Cook and Kenneth Torrance
in 1982, which is (not unreasonably) called the Cook-Torrance model.
Modelling the highlights adds another level of complexity to the system;
the most successful attempt to model highlights was developed in 1975 by
Bui Tuong Phong, and is called the Phong highlighting model.
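As an illustration of how these components combine, here is a minimal Python
sketch of a classic ambient + diffuse + specular (Phong) formulation. The
coefficient values and vectors are illustrative assumptions, not taken from
any particular source:

    import math

    def normalize(v):
        length = math.sqrt(sum(c * c for c in v))
        return tuple(c / length for c in v)

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def phong_intensity(normal, to_light, to_viewer,
                        ia=0.1, il=1.0, ka=0.2, kd=0.6, ks=0.4, n=32):
        nv, lv, vv = normalize(normal), normalize(to_light), normalize(to_viewer)
        ambient = ka * ia                            # non-directional background light
        diffuse = kd * il * max(0.0, dot(nv, lv))    # Lambertian (diffuse) reflection
        # Specular highlight: reflect the light direction about the normal
        # and compare it with the viewing direction; n controls highlight size.
        rv = tuple(2 * dot(nv, lv) * nc - lc for nc, lc in zip(nv, lv))
        specular = ks * il * max(0.0, dot(rv, vv)) ** n
        return ambient + diffuse + specular

    print(phong_intensity((0, 0, 1), (0.5, 0.5, 1), (0, 0, 1)))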
Another significant illumination effect - rather than a form of light -
is attenuation. This is the effect whereby objects at a greater
distance from the observer appear to be 'dimmer' (that is, have lower light
intensity levels). In the physical world this is caused by absorption of
light by material (dust, smoke, water vapour) in the atmosphere; the greater
the distance between the object and the observer, the greater the amount
of absorption. This effect is particularly important in depth cueing, giving
us the illusion of three dimensions in a two-dimensional image.
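A common way to formulate attenuation is to scale intensity by the inverse of
a polynomial in the distance; the short sketch below uses the widely-quoted
1/(a + b*d + c*d^2) form, with illustrative (assumed) coefficients:

    def attenuate(intensity, distance, a=1.0, b=0.05, c=0.01):
        # Clamp to 1 so that very near objects are not over-brightened.
        factor = min(1.0, 1.0 / (a + b * distance + c * distance ** 2))
        return intensity * factor

    for d in (1.0, 10.0, 100.0):
        print(d, attenuate(1.0, d))   # intensity falls with distance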
You can experiment with the detailed components of the 'standard' lighting
model using one of Patrick Min's Java applets. Also, Dino Schweitzer
has created a small [237Kb] Windows
program to allow you to 'play around' with the basic components of
lighting (and shading) models.
Local illumination models generally incorporate ambient light, and
simplified models of reflection; global models attempt to model
all components of the illumination process. Because different objects -
and different parts of the same object - in a scene will almost always
be at varying distances from the various point light sources, objects exhibit
gradations in colour and lightness that we call shading. The next
stage of image synthesis is to develop an applicable shading model.
The simplest shading model - indeed, almost a 'non-shading' model - is
'flat', or constant, shading. In this model all points on the surface
of any polygon in the scene have the same colour value (and light intensity);
the result is that the scene has a matte look that is hardly realistic.
The main advantage of this model is the speed with which images can be
rendered: once hidden-surface calculations are done, all that is needed is
to identify a pixel on each visible polygon, and assign a colour value
to it; simple flood fill techniques will complete the rendering process.
Images are much more solid-looking than when using wireframes, but with
only a marginal increase in render times. Nonetheless, the technique has
the major drawback that the boundaries between polygons are clearly visible.
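The following minimal Python sketch illustrates the flat shading idea: a
single diffuse intensity is computed from each polygon's face normal and
assigned to every pixel of that polygon. The vector helpers and the light
direction are assumptions for illustration:

    import math

    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1]*b[2] - a[2]*b[1],
                a[2]*b[0] - a[0]*b[2],
                a[0]*b[1] - a[1]*b[0])

    def normalize(v):
        length = math.sqrt(sum(c * c for c in v))
        return tuple(c / length for c in v)

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def flat_shade(triangle, light_dir):
        # One intensity per polygon, derived from the face normal;
        # every pixel of the polygon then receives this same value.
        v0, v1, v2 = triangle
        normal = normalize(cross(sub(v1, v0), sub(v2, v0)))
        return max(0.0, dot(normal, normalize(light_dir)))

    print(flat_shade(((0, 0, 0), (1, 0, 0), (0, 1, 0)), (0, 0, 1)))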
The next level of 'realism' is to introduce actual shading by calculating
variations in colour value within a polygon. Whilst this produces
images that are visually more effective, there is obviously a computational
cost. Intra-polygon shading takes basically two forms:
a component that produces the smooth gradations in colour values that result
from parts of a polygon being at different distances from light sources;
the major technique employed is Gouraud shading
a component that produces 'localised' effects within a polygon, such as
highlights; the major technique employed is Phong shading
The Gouraud shading model was developed by Henri Gouraud at the University
of Utah in the early 1970s. Basically the process involves calculating the colour values
for each vertex of the polygon, then the colour value at each point within
a polygon is derived by linear interpolation from these calculated values.
Whilst this approach requires significantly more calculations than constant
shading, they are all (relatively) simple, and the result is markedly more
realistic images of scenes that primarily involve diffuse reflection. However,
the technique still tends to produce visible polygon boundaries, which
show up as bright bands, called Mach bands.
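The interpolation step can be sketched as follows; intensities are assumed to
have been computed at the three vertices, and each interior point receives a
barycentric blend of them (a hypothetical example, not Gouraud's own code):

    def barycentric(p, a, b, c):
        # 2D barycentric coordinates of point p in triangle (a, b, c).
        det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
        u = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
        v = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
        return u, v, 1.0 - u - v

    def gouraud_intensity(p, verts, vertex_intensities):
        # Linearly interpolate the pre-computed vertex intensities.
        u, v, w = barycentric(p, *verts)
        ia, ib, ic = vertex_intensities
        return u * ia + v * ib + w * ic

    verts = ((0, 0), (10, 0), (0, 10))
    print(gouraud_intensity((3, 3), verts, (0.2, 0.9, 0.5)))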
As discussed earlier, curved reflective surfaces tend to generate highlights
when illuminated with point light sources (and thus include some degree
of specular reflection). The most successful shading technique that simulates
highlights was defined by Bui Tuong Phong in 1975. His technique starts
from the same point as the Gouraud process, but interpolates from the surface
normals at the vertices. Also, separate intensity values are calculated
for each pixel. The resulting process is much more complex computationally
(and hence more time-consuming), but the resulting images are yet more realistic.
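The key difference from Gouraud shading can be sketched like this: the vertex
normals (rather than intensities) are interpolated, and the lighting
calculation is re-evaluated at every pixel. The weights, directions, and
coefficients below are illustrative assumptions:

    import math

    def normalize(v):
        length = math.sqrt(sum(c * c for c in v))
        return tuple(c / length for c in v)

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def phong_shade(weights, vertex_normals, light_dir, view_dir,
                    kd=0.7, ks=0.3, n=32):
        # Interpolate the normal, then renormalize (interpolation shortens it).
        nv = normalize(tuple(sum(w * vn[i] for w, vn in zip(weights, vertex_normals))
                             for i in range(3)))
        lv, vv = normalize(light_dir), normalize(view_dir)
        rv = tuple(2 * dot(nv, lv) * nc - lc for nc, lc in zip(nv, lv))
        # Full per-pixel evaluation: diffuse term plus specular highlight.
        return kd * max(0.0, dot(nv, lv)) + ks * max(0.0, dot(rv, vv)) ** n

    normals = ((0, 0, 1), (0.3, 0, 0.95), (-0.3, 0, 0.95))
    print(phong_shade((1/3, 1/3, 1/3), normals, (0, 0, 1), (0, 0, 1)))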
For a clear exposition of shading processes look at the relevant section
in Watt and Watt (1993, pp.127-142). To show how the models function in
practice, Dino Schweitzer has created a [244Kb] Windows
program with which you can investigate the Gouraud and Phong shading models.
The rendering (synthesis) process
The culmination of the rendering process is to generate an image by applying
lighting (and shading) to a scene through a rendering algorithm:

Scene Description + Illumination Model + Rendering Technique = IMAGE
There are a number of significant rendering (image synthesis) algorithms
used in computer graphics. Some are based on local illumination/shading
techniques; they tend to be fast, but lack support for such important scene
characteristics as diffuse shadows and inter-reflections. As a result,
the images they generate tend to lack 'extreme' levels of realism; in many
cases, however - particularly in production environments - this 'trade-off'
between speed and quality is quite acceptable.
Image synthesis techniques that predominantly employ local illumination
are built on a visibility approach. That is, they render scenes
by first defining the visible surfaces in the scene, then applying a flat
(or at the most Gouraud) shading model to 'paint' them. Such an approach
can be very rapid, as the core operation of defining visible surfaces is
'required', and the rendering process (+ Illumination Model + Rendering
Technique) is relatively straightforward. Indeed, if a flat shading
model is employed (even if overlain by texture mapping) this form of rendering
can be carried out in the graphics processor component of the display
sub-system; this is the basis of 'real-time' animation and rendering.
There are a number of algorithms
that have been (and are) used in visible surface determination. These include
back face culling, ray casting (from which is derived ray
tracing), and the z-buffer. The latter is the basis of the scan-line
rendering process. The central idea in using the z-buffer is to test the
"z-depth" (distance from the observer) of each surface to work out the
closest (visible) surface of each object. If two objects (or surfaces of
the same object) have different z-depth values along the same projected
line, the higher value is further away - and thus behind - the nearer
(lower z-depth) surface or object. Applying this approach allows us to
render scenes using scan-line rendering.
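The z-buffer test itself is very simple; the sketch below (with made-up
resolution and fragments) keeps, for each pixel, the colour of the nearest
surface drawn so far:

    WIDTH, HEIGHT = 4, 4
    FAR = float("inf")

    zbuffer = [[FAR] * WIDTH for _ in range(HEIGHT)]
    framebuffer = [["background"] * WIDTH for _ in range(HEIGHT)]

    def plot(x, y, z, colour):
        # Overwrite the pixel only if this surface is nearer (lower z)
        # than whatever has been drawn there so far.
        if z < zbuffer[y][x]:
            zbuffer[y][x] = z
            framebuffer[y][x] = colour

    plot(1, 1, 5.0, "red")     # far surface drawn first
    plot(1, 1, 2.0, "blue")    # nearer surface replaces it
    plot(1, 1, 9.0, "green")   # still further surface is rejected
    print(framebuffer[1][1])   # -> blue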
The development of global illumination models made possible the generation
- albeit very slowly! - of images with a much higher level of realism.
The first (and most widely-used) of these techniques, ray tracing,
was devised in the early 1980s by Turner Whitted. It is based on ray casting
techniques which, as has been suggested, were developed as an alternative
to z-buffer for deriving visible surfaces. The attraction of the ray tracing
algorithm is that it incorporates (indeed, it is inherent within the technique)
such crucial realism elements as visible surface detection, shadowing,
reflection, transparency, mapping, and multiple light sources.
The basic ray tracing algorithm is iterative (a minimal sketch in code
follows the list below):
we 'shoot' one ray per pixel 'through' the screen to produce primary
rays, looking for ray-object intersections (this also gives
us visible surfaces); if no intersection is found, the pixel will have
the 'background' colour
at each intersection we follow any secondary rays - generated by
reflection and transmission, and from shadows - to generate a ray tree,
with a user-defined maximum depth (usually about ten levels)
when the complete tree has been defined, we determine the intensity and
colour of each pixel by 'adding up' from the bottom level of the ray tree
the components of the tree for each pixel
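Here is the promised sketch of that recursive structure: one primary ray per
pixel, with recursion into a reflected secondary ray up to the maximum depth.
The one-sphere scene, colours, and reflectivity are illustrative assumptions,
and shadow and transmission rays are omitted for brevity:

    import math

    MAX_DEPTH = 10                    # user-defined maximum ray tree depth
    BACKGROUND = (0.1, 0.1, 0.3)
    CENTRE, RADIUS = (0, 0, -3), 1.0  # a single assumed sphere

    def normalize(v):
        length = math.sqrt(sum(c * c for c in v))
        return tuple(c / length for c in v)

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def hit_sphere(origin, direction):
        # Standard ray/sphere intersection; returns distance t or None.
        oc = tuple(o - c for o, c in zip(origin, CENTRE))
        b = 2 * dot(oc, direction)
        c = dot(oc, oc) - RADIUS * RADIUS
        disc = b * b - 4 * c
        if disc < 0:
            return None
        t = (-b - math.sqrt(disc)) / 2
        return t if t > 1e-6 else None

    def trace(origin, direction, depth=0):
        if depth >= MAX_DEPTH:
            return (0.0, 0.0, 0.0)
        t = hit_sphere(origin, direction)
        if t is None:
            return BACKGROUND         # no intersection: background colour
        point = tuple(o + t * d for o, d in zip(origin, direction))
        normal = normalize(tuple(p - c for p, c in zip(point, CENTRE)))
        # Secondary (reflected) ray: one branch of the ray tree.
        refl = tuple(d - 2 * dot(direction, normal) * n
                     for d, n in zip(direction, normal))
        reflected = trace(point, normalize(refl), depth + 1)
        local = (0.8, 0.2, 0.2)       # assumed local surface colour
        return tuple(0.7 * lc + 0.3 * rc for lc, rc in zip(local, reflected))

    print(trace((0, 0, 0), normalize((0, 0, -1))))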
The major problem with ray tracing is the significant processing time involved
in generating complex ray trees. Much of this time is associated with surface
intersection calculations, so improving the 'efficiency' of ray tracing
has been a prime aim of much recent research. This can be done by using
various bounding volumes (such as boxes and spheres) to allow the
rapid determination of 'safe' zones where no surfaces exist; the program
can then ignore them.
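A bounding-sphere rejection test can be sketched as follows: a cheap check
against the enclosing sphere filters out rays before any expensive exact
surface intersections are attempted (helper names and scene values here are
assumptions):

    def ray_misses_sphere(origin, direction, centre, radius):
        # direction is assumed normalized; project the centre onto the ray
        # and measure how far the ray passes from it.
        oc = tuple(c - o for c, o in zip(centre, origin))
        t = max(0.0, sum(a * b for a, b in zip(oc, direction)))
        closest = tuple(o + t * d for o, d in zip(origin, direction))
        dist2 = sum((c - p) ** 2 for c, p in zip(centre, closest))
        return dist2 > radius * radius

    # Rays that miss the bounding sphere can skip every exact test inside it.
    print(ray_misses_sphere((0, 0, 0), (0, 0, -1), (5, 0, -3), 1.0))  # True
    print(ray_misses_sphere((0, 0, 0), (0, 0, -1), (0, 0, -3), 1.0))  # False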
The basic ray tracing technique can also produce images of limited
quality, particularly in the form of aliasing. To improve image quality
the technique has been modified to allow extra effort around the edges
of objects, using supersampling and adaptive sampling. In
these approaches extra rays (more than one per pixel) are defined in specific
areas to define the border between colour areas (objects) more carefully.
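The supersampling idea can be sketched in a few lines: several jittered rays
are shot through each pixel and their results averaged, softening the hard
'staircase' edges. The trace() stand-in below is a hypothetical placeholder
for a full ray tracer:

    import random

    def trace(x, y):
        # Placeholder scene: a hard vertical edge at x = 0.5 (black/white).
        return 1.0 if x > 0.5 else 0.0

    def supersample(px, py, samples=16):
        total = 0.0
        for _ in range(samples):
            # Jitter each sample within the pixel's unit square.
            total += trace(px + random.random(), py + random.random())
        return total / samples

    print(supersample(0, 0))   # edge pixel: a grey value near 0.5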
There have been further developments of the basic approach designed
to add other effects. These include distributed ray tracing, which
supports scenes incorporating lens effects, atmospheric effects, and motion blur.
Despite its ability to create highly impressive images, ray tracing has
been criticised for its slowness, and its emphasis on direct reflection
and transmission. Looking for a technique that would more accurately render
environments characterised more by diffuse reflection, Don Greenberg
and his collaborators at Cornell
devised the radiosity method of image synthesis in the mid 1980s.
The basic principles of radiosity can be summarised as a two-stage
calculation (a small numerical sketch follows the two stages below):
We calculate the surface radiosity for each surface 'patch' in the environment
based on the amount of energy interaction (the form factors) between
it and all other surface patches. This is a view-independent
process; whilst fairly time-consuming, it needs to be carried out only
once for each scene (provided the energy balance does not change).
We render the scene taking into account a) visible surfaces and b) interpolative
shading (flat or Gouraud). This is a view-dependent process; moving
the 'camera' position means re-calculating the image - but this can be done
quickly enough to allow (almost) real-time rendering. Certainly, with appropriate
'tricks', radiosity can generate rendered images 'on the fly' at up to
15-20 frames per second (as the basis for walkthroughs).
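Stage one can be illustrated with a tiny numerical sketch: the radiosity
equation B_i = E_i + rho_i * sum_j(F_ij * B_j) is solved by simple iteration
('gathering'). The three-patch form factors, emissions, and reflectivities
are made-up values, not a real scene:

    EMISSION     = [1.0, 0.0, 0.0]      # patch 0 is the light source
    REFLECTIVITY = [0.0, 0.6, 0.8]
    FORM_FACTORS = [                    # F[i][j]: energy reaching i from j
        [0.0, 0.3, 0.2],
        [0.3, 0.0, 0.4],
        [0.2, 0.4, 0.0],
    ]

    def solve_radiosity(iterations=50):
        b = EMISSION[:]                 # initial guess: emitted energy only
        for _ in range(iterations):
            b = [EMISSION[i] + REFLECTIVITY[i] *
                 sum(FORM_FACTORS[i][j] * b[j] for j in range(len(b)))
                 for i in range(len(b))]
        return b                        # view-independent patch radiosities

    print(solve_radiosity())            # stage two displays these values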
Like ray tracing, radiosity has a number of problem areas (and areas of
ongoing research):
Incorporating specular reflection, 'true' reflection and transmission effects
is possible using various "hybrid" approaches; unfortunately, these
approaches are view-dependent.
The radiosity solution has to be recalculated in systems that allow for
interactive adjustment of the lighting parameters (e.g. light colour),
as in theatre and lighting design.
The quality of the radiosity solution (and the resulting image) depends
significantly on how the surface patches are defined (surface subdivision),
and how they are re-defined in critical areas (notably shadows).
As with ray tracing, the computational efficiency of the technique can
be increased in various ways, notably through dynamic re-definition
of the form factors (progressive refinement).
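Progressive refinement can be sketched as a variation on the earlier
radiosity example: rather than solving the whole system, the patch holding
the most unshot energy repeatedly 'shoots' it to the others, so a usable
solution emerges after only a few steps. Patch areas are assumed equal here,
and all values remain illustrative:

    EMISSION     = [1.0, 0.0, 0.0]
    REFLECTIVITY = [0.0, 0.6, 0.8]
    FORM_FACTORS = [
        [0.0, 0.3, 0.2],
        [0.3, 0.0, 0.4],
        [0.2, 0.4, 0.0],
    ]

    def progressive_refinement(steps=20):
        n = len(EMISSION)
        radiosity = EMISSION[:]
        unshot = EMISSION[:]
        for _ in range(steps):
            i = max(range(n), key=lambda k: unshot[k])  # brightest shooter
            shoot, unshot[i] = unshot[i], 0.0
            for j in range(n):
                if j != i:
                    gained = REFLECTIVITY[j] * FORM_FACTORS[j][i] * shoot
                    radiosity[j] += gained
                    unshot[j] += gained
        return radiosity

    print(progressive_refinement())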