Having already written a traditional
scanline rasterizer, I thought an interesting next task
would be writing a more modern-style half-space one. This algorithm is actually fairly old, with Pineda's seminal paper,
A Parallel Algorithm for Polygon Rasterization,
introducing it in 1988. A forum post titled
Advanced Rasterization
by Nicolas "Nick" Capens is one of the most concise and approachable explanations of the half-space rasterization algorithm I've seen so far.
Bonus points for working code.
It's interesting to note how half-space functions,
cross products
and
barycentric coordinates are related, and how they can form the basis for both rasterization and
ray-triangle intersection.
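To make that relationship concrete, here is a minimal sketch (not the code from this project) that assumes counter-clockwise triangles and f32 tuples for points. The names `edge` and `barycentric` are made up for this example. The half-space function of an edge is a 2D cross product, and the three edge values divided by the triangle's doubled area are the barycentric coordinates:

```rust
/// Half-space (edge) function: the z component of the 2D cross product
/// (b - a) x (p - a). Positive when p lies to the left of the edge a -> b.
fn edge(a: (f32, f32), b: (f32, f32), p: (f32, f32)) -> f32 {
    (b.0 - a.0) * (p.1 - a.1) - (b.1 - a.1) * (p.0 - a.0)
}

/// Barycentric coordinates of p with respect to a counter-clockwise
/// triangle (v0, v1, v2), or None if p lies outside of it.
fn barycentric(
    v0: (f32, f32),
    v1: (f32, f32),
    v2: (f32, f32),
    p: (f32, f32),
) -> Option<(f32, f32, f32)> {
    // Twice the signed area of the triangle
    let area = edge(v0, v1, v2);
    if area <= 0.0 {
        return None; // degenerate or clockwise triangle
    }
    // Each weight belongs to the vertex opposite the tested edge
    let w0 = edge(v1, v2, p);
    let w1 = edge(v2, v0, p);
    let w2 = edge(v0, v1, p);
    if w0 < 0.0 || w1 < 0.0 || w2 < 0.0 {
        return None; // p is in the negative half-space of some edge
    }
    Some((w0 / area, w1 / area, w2 / area))
}
```

The inside test and the interpolation weights come out of the same three evaluations, which is what makes the approach attractive for rasterization.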
This rasterizer implements a different
fill convention
than the top-left one favored by Direct3D and OpenGL. With pixel centers being at integer coordinates and my origin at the bottom-left, a bottom-left
fill rule seemed to make more sense. There's a helpful
ASCII art
explanation of this decision in the source.
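As an illustration of what a bottom-left rule can look like, here is a sketch under my own assumptions: integer pixel coordinates, y pointing up and counter-clockwise triangles. The names `owns_boundary` and `covers` are hypothetical, and the exact rule used by the renderer is the one documented in the source. A pixel lying exactly on an edge is only drawn if that edge is a left or a bottom edge:

```rust
/// For a counter-clockwise triangle with y pointing up, left edges run
/// downwards (dy < 0) and bottom edges run horizontally to the right.
fn owns_boundary(a: (i32, i32), b: (i32, i32)) -> bool {
    let (dx, dy) = (b.0 - a.0, b.1 - a.1);
    dy < 0 || (dy == 0 && dx > 0)
}

/// Coverage test for pixel p against the counter-clockwise triangle v.
fn covers(v: [(i32, i32); 3], p: (i32, i32)) -> bool {
    for i in 0..3 {
        let (a, b) = (v[i], v[(i + 1) % 3]);
        // Half-space function of the edge a -> b evaluated at p
        let w = (b.0 - a.0) * (p.1 - a.1) - (b.1 - a.1) * (p.0 - a.0);
        // Strictly outside, or on an edge this triangle does not own
        if w < 0 || (w == 0 && !owns_boundary(a, b)) {
            return false;
        }
    }
    true
}
```

With such a rule, two triangles sharing an edge never both draw the pixels on it, and none of those pixels are left out.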
The application here is not just a rasterizer, but a small yet complete renderer.
Fabian "ryg" Giesen's series of articles
A trip through the Graphics Pipeline 2011
gives an excellent overview of how a modern GPU rendering pipeline works. It's a bit light on the details of the actual rasterizer, but that is remedied
in another series of articles by him. I tried to implement something similar to what a GPU would do, so these articles are recommended reading for understanding
this program.
The same author also published a series of articles called
Optimizing Software Occlusion Culling. Especially part 6 on
barycentric coordinates and the explanations and optimization ideas in parts 7-11 were very valuable and helped me a great deal with this project. The author's
replies in the comment section are also worth checking out.
The material in
Rasterization: a Practical Implementation provides a very gentle and basic introduction to the subject.
I'd also highly recommend
Michael Abrash's writing on the Larrabee rasterizer
(Rasterization on Larrabee). It's mostly interesting for the hierarchical
and parallel optimizations done to the half-space algorithm. I implemented a coarse raster phase myself, but it did not help very much.
Vertex processing, shading and rasterization are all parallelized. Triangles are
distributed over a number of tiles, each processed independently. The
scoped_threadpool library is used throughout the code for
all thread management.
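The binning idea can be sketched roughly like this. This is not the renderer's actual code: the tile size, the `Bounds` type and both function names are assumptions, and `std::thread::scope` stands in for the scoped_threadpool library so the example only needs the standard library. It also assumes a non-zero framebuffer size:

```rust
use std::thread;

const TILE: usize = 64; // hypothetical tile size in pixels

/// Inclusive screen-space bounding box of a triangle, in pixels.
#[derive(Clone, Copy)]
struct Bounds {
    x0: usize,
    y0: usize,
    x1: usize,
    y1: usize,
}

/// For every tile (row-major order), collect the indices of all
/// triangles whose bounding box overlaps it.
fn bin_triangles(tris: &[Bounds], width: usize, height: usize) -> Vec<Vec<usize>> {
    let tiles_x = (width + TILE - 1) / TILE;
    let tiles_y = (height + TILE - 1) / TILE;
    let mut bins = vec![Vec::new(); tiles_x * tiles_y];
    for (i, b) in tris.iter().enumerate() {
        // Clamp to the last tile so off-screen extents stay in range
        let tx1 = (b.x1 / TILE).min(tiles_x - 1);
        let ty1 = (b.y1 / TILE).min(tiles_y - 1);
        for ty in b.y0 / TILE..=ty1 {
            for tx in b.x0 / TILE..=tx1 {
                bins[ty * tiles_x + tx].push(i);
            }
        }
    }
    bins
}

/// Process all tiles independently. The per-tile "work" here is just
/// counting triangles; a real renderer would rasterize them instead.
fn process_tiles(bins: &[Vec<usize>]) -> Vec<usize> {
    thread::scope(|s| {
        let handles: Vec<_> = bins
            .iter()
            .map(|bin| s.spawn(move || bin.len()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Since no two tiles share framebuffer pixels, the tiles need no locking among themselves. Spawning one thread per tile is only for brevity; a pool with a fixed number of workers is the more sensible choice.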
This rasterizer also implements a depth buffer
(Z not W)
and perspective correct interpolation. The
Perspective-Correct Interpolation
paper is good reading material for the required math.
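The core of the math fits in a few lines. The following is a sketch with hypothetical function names, assuming screen-space barycentric weights `b` that sum to one, the clip-space `w` of each vertex and a depth buffer where smaller values are closer. Attributes are divided by w before being interpolated linearly and renormalized afterwards, while post-divide Z can be interpolated linearly as is:

```rust
/// Perspective-correct interpolation of one per-vertex attribute.
fn interp_perspective(b: [f32; 3], w: [f32; 3], attr: [f32; 3]) -> f32 {
    let mut num = 0.0;
    let mut den = 0.0;
    for i in 0..3 {
        let inv_w = 1.0 / w[i];
        num += b[i] * attr[i] * inv_w; // attr / w is linear in screen space
        den += b[i] * inv_w; // and so is 1 / w
    }
    num / den
}

/// Z after the perspective divide is linear in screen space, so plain
/// barycentric interpolation is enough for the depth buffer.
fn interp_depth(b: [f32; 3], z: [f32; 3]) -> f32 {
    b[0] * z[0] + b[1] * z[1] + b[2] * z[2]
}

/// Depth test and write; returns true if the fragment is visible.
fn depth_test(depth_buffer: &mut [f32], idx: usize, z: f32) -> bool {
    if z < depth_buffer[idx] {
        depth_buffer[idx] = z;
        true
    } else {
        false
    }
}
```

With equal w at all three vertices the first function reduces to ordinary linear interpolation, which makes for a handy sanity check.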
I implemented many different shaders for the renderer. Most of them make use of
Image based lighting through
Prefiltered environment maps. The
Cube maps read by this program were generated by the HDR environment map
processing pipeline of another program of mine
(GitHub). Several different
environments are included; a resolution of 6x64x64 was sufficient for the prefiltered representations.
Shading happens in linear color space.
Gamma correction
is applied at the framebuffer level. I use a lookup table with 11-bit precision as a compromise between the cost of a floating-point pow() computation and
the banding caused by doing it in 8 bits. See
this chapter in GPU Gems and
this article
for why this matters.
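A lookup table of that kind could be built along these lines. This is a sketch, not the project's code: it assumes a plain power-law curve with a caller-supplied gamma (the renderer may well use the exact sRGB curve), and `build_gamma_lut` and `encode` are made-up names:

```rust
const LUT_BITS: u32 = 11;
const LUT_SIZE: usize = 1 << LUT_BITS; // 2048 entries

/// Map every quantized linear intensity to its 8-bit gamma-encoded
/// value, so pow() only runs while the table is being built.
fn build_gamma_lut(gamma: f32) -> Vec<u8> {
    (0..LUT_SIZE)
        .map(|i| {
            let linear = i as f32 / (LUT_SIZE - 1) as f32;
            // Round to the nearest 8-bit value
            (linear.powf(1.0 / gamma) * 255.0 + 0.5) as u8
        })
        .collect()
}

/// Encode one linear color channel by table lookup.
fn encode(lut: &[u8], linear: f32) -> u8 {
    // Clamp to [0, 1], then quantize to the nearest 11-bit index
    let idx = (linear.clamp(0.0, 1.0) * (LUT_SIZE - 1) as f32 + 0.5) as usize;
    lut[idx]
}
```

The extra input precision matters most in the dark range, where the gamma curve is steepest and an 8-bit linear input would skip many output values.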
The rasterizer can also render points and has a
DDA
based wireframe mode.
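For reference, a basic DDA line walk looks something like the sketch below. It is a generic textbook version with a hypothetical `dda_line` function that returns the visited pixels, not the wireframe code from this project, which would write into the framebuffer instead:

```rust
/// Walk from (x0, y0) to (x1, y1) in unit steps along the major axis
/// and return the pixel coordinates visited.
fn dda_line(x0: f32, y0: f32, x1: f32, y1: f32) -> Vec<(i32, i32)> {
    let (dx, dy) = (x1 - x0, y1 - y0);
    // One step per pixel along the longer of the two axes
    let steps = dx.abs().max(dy.abs()).ceil() as i32;
    if steps == 0 {
        return vec![(x0.round() as i32, y0.round() as i32)];
    }
    // Constant per-step increments; the major axis moves at most one pixel
    let (sx, sy) = (dx / steps as f32, dy / steps as f32);
    (0..=steps)
        .map(|i| {
            let t = i as f32;
            ((x0 + sx * t).round() as i32, (y0 + sy * t).round() as i32)
        })
        .collect()
}
```

Drawing the three edges of every triangle this way gives a wireframe view.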
A number of different meshes (loaded from a
simple text format)
and camera movements are included in the program, making it possible to test and profile
the rasterizer in different scenarios.
Many of the meshes have baked-in
ambient occlusion or
radiosity, computed
by my own preprocessors (see
here and
here).
Background graphics are also selectable to contrast the shaded meshes optimally.
The entire renderer, including mesh loading, all cubemap code, transform, rasterization and shading is about
2k lines of Rust.
While the rasterizer is implemented as a Rust library, the application framework is written in
Haskell, doing the display, user interaction
and non-inner-loop parts. Efficient use of
OpenGL PBOs
ensures speedy display of the rasterized images.
The Haskell application itself
might also be of interest. It features a pluggable experiment framework, modern OpenGL 3/4.x style rendering, text output,
quad rendering, screenshots, a framebuffer system, an FPS counter, GLSL support, logging etc. A good starting point for your own Haskell + OpenGL adventures.
The
nalgebra library was used for storage of and operations on vectors, points and matrices.
There's plenty of room left for optimizations, but even now, using a single core of a 2009 laptop, the
'Killeroo' mesh can be rendered at 1280x800@60FPS with vertex-level
diffuse + glossy IBL.
The source code for this project is
available on GitHub.