Heightfield raycasting in a P10 architecture
Under Construction!!!
Last modification 3rd of August 2002
This page collects a few experiments I've done on coding a raycaster inside a fragment shader on a Wildcat VP board.
Why raycasting heightfields?
There are so many things that can be done with a programmable architecture, so why did I choose to do raycasting
of heightfields?
- It's a fresh split from the typical bump-map-my-poly fragment shader.
- The algorithms are easy to code (I had my first prototype running in a few nights), require little knowledge and use
simple structures, unlike, for example, raytracing.
- Datasets are easy to obtain. You only need a USGS DEM reader (which is a piece of cake to code). Unfortunately
Digital Orthophoto Quadrangles (the "colormap" or satellite view of a heightmap) matching a DEM are quite difficult to get.
In my case I coloured my DEM datasets by assigning a color to each height.
- It's a bit more challenging than normal 3D volume raycasting. In 3D volume raycasting, you just alphablend all
the voxels in the path of the ray. Using a heightfield as a dataset requires a comparison function and the posibility of
early exiting the loop (OK, you could also test for full opacity when raycasting a 3D volume).
- There are several optimisations which can be easily applied: for example hierarchical heightfields.
- It's actually nice to watch :) (as opposed as raycasting the bowels of some dead animal).
- You can, using the same basic algorithm, apply nifty effects like shadows, reflections, etc.
Algorithms
Basic algorithm
The heightfield is represented by a four component texture, with the height in the alpha channel and the color
of the terrain in the other channels. Because the raycast performs many more height lookups than color ones.
This could be optimised to store the height in a separate one-component texture, and the colormap in a 3-component
texture which is looked up only once we hit the terrain.
| Grand Canyon |
 |  |
| Colormap | Heightfield (grayscale) |
The border of the texture is set to black (height zero) and the clamping mode to GL_CLAMP_TO_BORDER, so full bordercolor
is used when we go outside the map. Fractal maps, because they are built to tile, can be set to GL_REPEAT.
The texture sizes I use range from the 64x64 simple test texture I used for development to the 1024x1024 texture of the
Mount Olympus (WA).
The core of the fragment shader is a simple 3D DDA loop which walks a 2D texture containing height values. The algorithm
is slightly simplified by using orthographic projection.
At each step the fragment shader compares the current height of the ray with the height of the map by accessing
the texel at the x,z texture coordinate. If the current height is lower, then the ray has hit the terrain and
we can set the current fragment to the terrain color at that position. Otherwise we increment the current position
by a given delta in the direction of the ray and test again.
The CPU simply draws a cube with face culling, what we could call the visualization volume,
then, the fragment shader is applied to each face of the cube, raycasting the whole vis volume.
This means that all the CPU has to do is to send 6 quads to the graphics card, the rest of the work is done by the VPU.
 |  |
| Raycasted view | Wireframe view |
Pseudocode:
uniform vec3 dnorm; // Normalised delta vector
// As we are dealing with an orthographic projection,
// the delta is constant for all the fragments, so it can be
// normalised in the vertex shader
// Note that we don't really need an unitary normalised delta
// it suffices with normalising wrt the largest component
uniform vec3 entrypoint; // Entry point of the view ray with this face of the vis volume
// The entry point of the ray is interpolated linearly across the polygon
// in the vertex shader.
// Because we are using orthographic views, we don't need to perform
// perspective correction.
vec3 ray; // Current ray position 'y' is up, 'z' is near and 'x' is right
const int nMaxIterations = 100; // Maximum number of iterations
// The main reason of the maximum number of iterations is not having to check
// for too many exit conditions in the loop, so keeping a tight loop will make
// the code faster. The drawback is that we may end up raycasting beyond the vis
// volume, but we will assume that generally a terrain raycasts more full voxels
// than empty ones.
const TEX_HEIGHTFIELD = 0; // Texture stage for the heighfield (color in RGB, height in A)
// Initialise the current ray position
// decrement by norm because of the way the loop is built
ray = entrypoint - dnorm;
vec4 texel; // Texel we fetch from the texture with the color & height in the alpha channel
for (int i=0; i < nMaxIterations;i++) {
// Increment the raycasted position
ray += dnorm;
// Fetch the current height & color from the heightfield
texel = texture4(TEX_HEIGHTFIELD, ray.xz);
// Extract the height
height = texel.a;
// Compare the current height to the current ray height
if (ray.y < height) {
// We hit a voxel of the terrain
break;
}
}
// If the ray didn't hit, we will be out of the visualization volume and texel
// will contain the border color, which is ok because the border color is set
// to the background color. In that case, we could also kill this fragment.
gl_FragColor.rgb = texel.rgb;
The main problem with this algorithm is that, although fast because it only performs a texture lookup and a vec3 increment
per iteration, there's no guarantee that we won't skip any voxel, and in fact for certain view angles you
can see heavy aliasing, result of skipping voxels we should have raycasted. See the "Hierarchical Heightfield" section for
an attempt to solve this problem efficiently.
Shadow casting
There's little point in using raycasting just to raycast an opaque heightfield (you could do it better by using
rasterisation techniques and displacement maps, for example), the real reason to use raycasting is to apply techniques
not possible in normal rasterisation. In this case shadow casting is a perfect candidate of such techniques which can be
applied to our terrain raycaster.
In my case I decided to use point light sources as opposed to infinite light sources.
This means that we need to calculate the vector from the voxel the basic raycast hits to the light in a perpixel basis (i.e. in the fragment shader).
Had I used infinite lighting, that vector would remain constant per pixel, so would the deltas to the light, so all those factors could have been calculated
either in the vertex shader or just in the CPU and provided as constants to the fragment shader.
Hard shadows
The code for the shadow caster is just a straightforward derivative of the 3D DDA. Once we hit a voxel of the terrain,
we calculate the vector to the point light from that position and walk our way towards it. If we hit the light or we exit the
vis volume, we assume the voxel of the terrain the basic raycaster hit is lit, otherwise we will have hit a voxel in our way to the light,
implying that the original voxel is in shadows.
 |  |
| Diffuse shading, no shadows | Hard shadows |
Pseudocode
void raycastshadows(vec3 ray, output float alpha) {
const int nMaxIterations = 100; // Maximum number of iterations
// Same as in the basic raycasting, maxiterations allows us to avoid
// checking extra exit conditions from the loop.
uniform vec3 light; // Position of the light
vec3 absdelta; // Absolute values of the deltas
vec3 dnorm; // Normalised delta from the place where the ray hit to the light
// Calculate the vector from the current point to the light
dnorm = light - ray;
absdelta = abs(dnorm);
// Find the largest coordinate, assume it's x
maxdelta = absdelta.x;
if ((absdelta.z > absdelta.x) && (absdelta.z > absdelta.y)) {
maxdelta = absdelta.z;
} else {
if (absdelta.y > absdelta.x) {
maxdelta = absdelta.y;
}
}
// Normalise wrt to the largest coordinate (abs value)
// We could also use a properly unitary normalised delta vector
// but normalising the vector this way we ensure we never
// visit the same voxel twice
dnorm = dnorm / maxdelta;
// We have now "maxdelta" increments in "dnorm" steps to the light
// Raycast in the direction to the light, finish
// when we hit the terrain (in shadow) or when we
// hit the light (lit)
// Assume lit
alpha = 1.0;
for (int i=0; i<nMaxIterations;i++) {
// Note that we don't want to test the first voxel
// so incrementing before testing is just fine
ray += dnorm;
// Fetch the current height from the heightfield
texel = texture4(TEXEL_HEIGHTFIELD, ray.xz);
height = texel.a;
maxdelta = maxdelta - 1.0;
if ((ray.y < height) || (maxdelta <= 0)) {
break;
}
}
// Test if we are lit
if (ray.y < height) {
alpha = 0.0;
}
}
TBD
Soft shadows
This is a slight modification of the previous shadowcasting algorithm.
To draw the shadows, once we hit the terrain, we raycast to the light (a point light) until we either hit the light position
or we hit a point in the texture higher than the current raycasted point.
Soft shadows is just a variation of the former by considering the terrain to be "translucid": instead of
just regarding in shadows the voxel when we first hit another voxel when raycasting to the light,
we decrease the diffuse contribution by some amount. Successive hits with the terrain in the way of the ray to the light will
decrease the diffuse contribution to zero, thus causing the point to be in complete shadow.
Yes, this is a real hack, but the visual effect is nice and the cost over the previous shadowcasting algorithm is just
delaying a bit one of the exit conditions of the loop.
 |  |
| Hard shadows | Soft shadows |
Pseudocode
TBD
Hierarchical heightfields
In an attempt to avoid the heavy aliasing, I coded a version performing true ray-voxel intersection.
To minimise the number of interesections to calculate, I added an acceleration structure in the form of
a hierarchical heightfield:
The mipmap levels of the texture store the maximum of the heights for the given texel footprint, thus clustering the original voxels
in several levels of supervoxels.
The 1x1 level is first tested using the ray-voxel intersection code:
- if the ray doesn't hit then it's impossible the ray hits the inside this supervoxel, so we just move to the neighbour voxel of this mipmap level.
- if the ray hits, we recurse to the next mipmap level and continue the intersection tests beginning with the subvoxel closest to the intersection point in
the parent supervoxel.
Later on, I modified the hit-test algorithm so if the ray doesn't hit, we return to the previous mipmap level, in order to avoid cases where we hit
a false positive (a spike in the terrain) which would force us to remain in low mipmap levels forever.
Unfortunately, due to execution time constraints of the fragment shader, there are cases where the algorithm cannot reach the definitive point in the
required time and and another visual artifact will appear in the form of columns.s
This is very similar to the behaviour of a traditional hierarchical-z buffer.
TBD
Fujimoto's 3D DDA
TBD
Reflections
Another nice effect you can achieve with raycasting is reflections on water.
TBD
Examples
All the images in this document have been rendered in realtime using a Wildcat VP870 and the fragment shaders exposed in the previous sections.
Because of the advanced programming capabilities of the P10 architecture, the whole fragment shader
is executed entirely inside the graphics chip, there's no need to multipass in the host, in fact there's no need for
any kind of host processing at all, other than providing the rotation matrix in order to spin the vis volume and the light position.
First test
My First raycasted image: a fractal terrain stored in a 256x256 texture.



And now with colors
It's a static view, you couldn't rotate nor tilt the camera

Havasupai, Grand Canyon
I got some USGS maps in USGS DEM elevation map format and wrote an utility to convert them to PPM files. The program can now load PPM files, this is one of them.
The "dirty" borders are due to not setting the right clamping mode in the program, but now you could control the viewpoint with the mouse.

Grand Teton, Wyoming
Sporting raycasted softshadows. The blurriness of the image comes from the Quincunx-like FSAA filter I turned on in the control panel,
which helps to reduce the aliasing a bit (the current drivers do multisampling FSAA, so any other FSAA setting wouldn't do anything
at all, as they only apply to polygon edges and there are only 3 polygons in this image :) ).