Something that seems to come up again and again is the topic of linear vs. non-linear depth. If you take a look at the standard DirectX projection matrix and do the math for the $z$ component, you'll end up with something like this:

$$z' = \frac{f}{f-n} - \frac{f\,n}{(f-n)\,z}$$

where $z$ is the depth value before projection, $z'$ is the depth value after projection and $n$, $f$ correspond to the near and far planes. So projection actually transforms $z$ into some variation of $1/z$. The reason for this is simple: GPUs rasterize primitives in screen space and interpolate attribute data linearly in screen space as well. Linear depth $z$ in view space, however, becomes non-linear after projection and thus cannot be correctly interpolated by simple linear interpolators. Conversely, it turns out that $1/z$ is linear in screen space. This is actually quite easy to see: Assume a plane in view space

$$Ax + By + Cz = D$$
Perspective projection transforms view space x and y coordinates to

$$x' = \frac{x}{z}, \quad y' = \frac{y}{z}$$

(ignoring the constant scale factors from field of view and aspect ratio). Inserting these equations into the original plane equation yields

$$A x' z + B y' z + C z = D$$

which gives us

$$\frac{1}{z} = \frac{A}{D} x' + \frac{B}{D} y' + \frac{C}{D}$$

clearly showing that $1/z$ is a linear function of screen space $x'$ and $y'$.
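As a minimal sketch of the two mappings (the function names are mine, assuming the standard D3D $[0, 1]$ clip space depth convention used above):

```hlsl
// Projection of view space depth z into [0, 1], matching
// z' = f/(f-n) - f*n/((f-n)*z) from above.
float ProjectDepth(float z, float n, float f)
{
    return f / (f - n) - (f * n) / ((f - n) * z);
}

// Inverse mapping: reconstruct view space depth from a stored z'.
float UnprojectDepth(float zPrime, float n, float f)
{
    return (f * n) / (f - zPrime * (f - n));
}
```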
This is illustrated quite nicely in this blog post by rendering ddx(z') and ddy(z') as color to the screen. The same holds for other generic attributes like texture coordinates: The GPU cannot directly interpolate $u$ and $v$, but will interpolate $u/z$ and $v/z$ instead. The attribute value will then be reconstructed per pixel by multiplying by $z$.
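Here is an illustrative HLSL sketch of what this amounts to per pixel; the barycentric formulation is my own framing of the fixed-function behavior, not an actual hardware API:

```hlsl
// What the interpolator effectively computes for a perspective-correct
// attribute u: u/z and 1/z are interpolated linearly in screen space,
// then u is recovered per pixel by the division.
float InterpolateU(float2 bary,   // screen space barycentrics of the pixel
                   float3 uOverZ, // u_i / z_i at the three vertices
                   float3 invZ)   // 1 / z_i at the three vertices
{
    float3 w  = float3(1.0 - bary.x - bary.y, bary.x, bary.y);
    float  uz = dot(w, uOverZ); // linear in screen space
    float  iz = dot(w, invZ);   // linear in screen space
    return uz / iz;             // perspective-correct u
}
```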
Depth Precision
Now that we have established that the value that ends up in the depth buffer is not the depth $z$ but rather something related to $1/z$, one might ask what kind of effect this will have on depth precision. After all, $1/z$ is a highly non-linear function that will significantly warp the original depth values. Check out the graph below: I plotted the resulting $z'$ over the full view space depth range for different near plane values $n$:
Notice how steep the function is over the first couple of meters: almost the entire output interval $[0, 1]$ is spent there.
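To get a feel for the numbers, here is what the ProjectDepth sketch from above yields for a hypothetical configuration of $n = 0.1$, $f = 1000$ (values rounded):

```hlsl
// ProjectDepth(  1.0, 0.1, 1000.0) ~ 0.9001
// ProjectDepth( 10.0, 0.1, 1000.0) ~ 0.9901
// ProjectDepth(100.0, 0.1, 1000.0) ~ 0.9991
```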
In order to test this result empirically I wrote a small program that samples the view space depth range $[n, f]$ in regular intervals on the GPU, calculates the depth value $z'$ after projection and writes it to a depth buffer of choice. The buffer is then read back to the CPU and view space depth is reconstructed for each sample. This allows us to calculate the error between the original and the reconstructed depth value. Here are the results for the formats DXGI_FORMAT_D16_UNORM and DXGI_FORMAT_D32_FLOAT with a fixed near and far plane configuration; the essence of the per-sample round trip is sketched below:
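This is a condensed sketch reusing the ProjectDepth/UnprojectDepth helpers from above; the 16 bit quantization is simulated here, whereas the actual program writes to a real depth buffer:

```hlsl
// Round trip of one depth sample through a 16 bit unorm buffer
// (what DXGI_FORMAT_D16_UNORM stores).
float RoundTripD16(float z, float n, float f)
{
    float zPrime = ProjectDepth(z, n, f);
    float stored = round(zPrime * 65535.0) / 65535.0; // unorm quantization
    return UnprojectDepth(stored, n, f);
}

// Reconstruction error for one sample; for DXGI_FORMAT_D32_FLOAT the
// quantization step is simply the native float precision.
float DepthError(float z, float n, float f)
{
    return abs(z - RoundTripD16(z, n, f));
}
```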
Note how the error for DXGI_FORMAT_D16_UNORM quickly approaches ridiculous proportions; 16 bit integer depth in combination with a projective transform is definitely a no-go! Here's another plot to illustrate the error of DXGI_FORMAT_D32_FLOAT
in more detail:

Much better, though at the extremes we still get an error of over 100 meters. With some care though, this can be greatly reduced: the shape of the hyperbolic $z'$ curve is largely determined by the near plane distance $n$. Even a slight increase of $n$ reduces the maximal error considerably.
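The same sampling scan can quantify this. Here is a sketch that evaluates the worst-case error for a given near plane, reusing RoundTripD16 from above (the actual numbers depend on the configuration):

```hlsl
// Worst-case reconstruction error over regularly spaced samples
// for a given near plane n.
float MaxError(float n, float f, uint sampleCount)
{
    float maxErr = 0.0;
    for (uint i = 0; i < sampleCount; ++i)
    {
        float z = lerp(n, f, (i + 0.5) / sampleCount);
        maxErr = max(maxErr, abs(z - RoundTripD16(z, n, f)));
    }
    return maxErr;
}
```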
I also tested DXGI_FORMAT_D24_UNORM_S8_UINT, but the results were so close to DXGI_FORMAT_D32_FLOAT that I can only conclude that the driver internally maps the depth format to 32 bit float. Not much of a surprise; this is exactly what the AMD GCN architecture does as well.
Practical Considerations
- First of all: Make sure that your near plane is as far away from the camera as you can afford. This will flatten the hyperbolic $z'$ curve and provide much better depth precision far away from the viewer.
- Unless you are in some crazy setting with hundreds of kilometers of view distance and you are going for sub-centimeter depth resolution, DXGI_FORMAT_D32_FLOAT should be good enough, and on modern GPUs it should come at no additional cost compared to DXGI_FORMAT_D24_UNORM_S8_UINT. DXGI_FORMAT_D16_UNORM isn't really a choice for projective transforms. It can be quite valuable for orthographic projections though (for example sun shadow maps), reducing bandwidth by half compared to a 32 bit format.
Linear Depth
And if you really, really need linear depth you can write it via the SV_DEPTH semantic in the pixel shader. Beware though: you'll lose early Z unless you use one of the variants SV_DepthGreaterEqual or SV_DepthLessEqual. Check out this blog post for more details. In most cases though I would argue that non-linear depth is just fine.
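For illustration, here is a minimal pixel shader sketch that writes linearized depth; the FAR_PLANE constant and the VIEWZ input are assumptions of mine, and swapping SV_Depth for one of the conservative variants is what preserves early Z, provided the stated guarantee actually holds:

```hlsl
static const float FAR_PLANE = 1000.0; // hypothetical far plane distance

float4 main(float4 pos       : SV_Position,
            float  viewZ     : VIEWZ,    // assumed: view space depth from the VS
            out float oDepth : SV_Depth) : SV_Target
{
    oDepth = saturate(viewZ / FAR_PLANE); // linear depth in [0, 1]
    return float4(0.0, 0.0, 0.0, 1.0);
}
```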