Linear Depth

Something that seems to come up again and again is the topic of linear vs. non-linear depth. If you take a look at the standard DirectX projection matrix and do the math for the z component, you’ll end up with something like this:

  z' = \frac{z_f}{z_f - z_n} (1 - \frac{z_n}{z})

where z is the depth value before projection, z' is the depth value after projection and z_n, z_f correspond to the near and far planes. So projection actually transforms z into some variation of 1/z. The reason for this is simple: GPUs rasterize primitives in screen space and interpolate attribute data linearly in screen space as well. Linear depth z in view space, however, becomes non-linear after projection and thus cannot be correctly interpolated by simple linear interpolators. Conversely, it turns out that 1/z is linear in screen space. This is actually quite easy to see: Assume a plane in view space

  Ax + By + Cz = D

Perspective projection transforms view space x and y coordinates to

  x' = \frac{x}{z}, \qquad y' = \frac{y}{z}

Inserting these equations into the original plane equation yields

  A x' z + B y' z + C z = D

which gives us

  \frac{1}{z} = \frac{A}{D} x' + \frac{B}{D} y' + \frac{C}{D}

clearly showing that 1/z is a linear function of screen space x' and y'. This is illustrated quite nicely in this blog post by rendering ddx(z') and ddy(z') as color to the screen. The same holds for other generic attributes like texture coordinates: The GPU cannot directly interpolate u and v, but will interpolate u/z and v/z instead. The attribute value will then be reconstructed per pixel by multiplying by z.
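The derivation above can be checked numerically. The following is a minimal Python sketch, assuming the simplified projection x' = x/z from the text; the edge endpoints, the texture coordinate u and the helper names are made up for illustration. It picks a point on an edge in view space, finds its position in screen space, and compares naive interpolation of z against interpolation of 1/z and u/z.

```python
def lerp(a, b, t):
    """Plain linear interpolation between a and b."""
    return a + (b - a) * t


def perspective_correct(attr0, z0, attr1, z1, t):
    """Interpolate an attribute at the screen space parameter t.

    Both attr/z and 1/z are linear in screen space, so they can be
    interpolated linearly. Dividing the two is the same as multiplying
    the interpolated attr/z by the reconstructed z.
    """
    inv_z = lerp(1.0 / z0, 1.0 / z1, t)
    attr_over_z = lerp(attr0 / z0, attr1 / z1, t)
    return attr_over_z / inv_z


# Hypothetical edge in view space, given as (x, z), with a texture
# coordinate u at each end.
x0, z0, u0 = -1.0, 1.0, 0.0
x1, z1, u1 = 4.0, 10.0, 1.0

# Ground truth: the midpoint of the edge in view space.
s = 0.5
x_true, z_true, u_true = lerp(x0, x1, s), lerp(z0, z1, s), lerp(u0, u1, s)

# Where does that point land between the projected endpoints?
sx0, sx1, sx = x0 / z0, x1 / z1, x_true / z_true
t = (sx - sx0) / (sx1 - sx0)

z_naive = lerp(z0, z1, t)                         # wrong: z is not linear in screen space
z_from_inv = 1.0 / lerp(1.0 / z0, 1.0 / z1, t)    # correct: 1/z is linear
u_correct = perspective_correct(u0, z0, u1, z1, t)

print("true z:", z_true, "naive:", z_naive, "via 1/z:", z_from_inv)
print("true u:", u_true, "perspective correct:", u_correct)
```

Note that the screen space parameter t differs from the view space parameter s, which is exactly why interpolating z directly gives the wrong answer.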

Depth Precision

Now that we have established that the value that ends up in the depth buffer is not the depth but rather something related to 1/z, one might ask what kind of effect this will have on depth precision. After all, 1/z is a highly non-linear function that will significantly warp the original depth values. Check out the graph below: I plotted the resulting z' for the view space depth range z \in [0, 100] for different near plane values z_n:

[Figure: z' as a function of view space depth z for different near plane values z_n]

Notice how steep the function is on the first couple of meters. Almost the entire interval z' \in [0, 0.99] is spent on the first couple of meters.
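To put a number on how quickly the curve saturates, here is a small Python sketch of the projection formula from the first section and its inverse. The far plane z_f = 100 and the list of near plane values are assumptions chosen for illustration, not the values used for the plot.

```python
def project_depth(z, z_near, z_far):
    """z' = z_f / (z_f - z_n) * (1 - z_n / z), as derived above."""
    return z_far / (z_far - z_near) * (1.0 - z_near / z)


def unproject_depth(z_proj, z_near, z_far):
    """Inverse of project_depth: recover view space z from z'."""
    return z_near * z_far / (z_far - z_proj * (z_far - z_near))


z_far = 100.0  # assumed far plane, for illustration only
for z_near in (0.1, 0.5, 1.0):
    # View space depth at which the projected depth has already reached 0.99.
    z_at_99 = unproject_depth(0.99, z_near, z_far)
    print(f"z_n = {z_near}: z' reaches 0.99 at z = {z_at_99:.2f}")
```

The smaller the near plane distance, the earlier z' reaches 0.99, leaving only the last 1% of the depth range for everything beyond that point.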

In order to test this result empirically I wrote a small program that will sample the range z \in \{z_n,\dots,z_f\} in regular intervals on the GPU, calculate the depth value z' after projection and write it to some depth buffer of choice. The buffer is then read back to the CPU and view space depth is reconstructed for each sample. This allows us to calculate the error of original depth value vs. reconstructed depth value. Here are the results for the formats DXGI_FORMAT_D16_UNORM and DXGI_FORMAT_D32_FLOAT with the following configuration: z_n = 0.1, z_f = 10000:
[Figure: depth reconstruction error for DXGI_FORMAT_D16_UNORM and DXGI_FORMAT_D32_FLOAT]

Note how the error for DXGI_FORMAT_D16_UNORM quickly approaches ridiculous proportions; 16 bit integer depth in combination with a projective transform is definitely a no go! Here’s another plot to illustrate the error of DXGI_FORMAT_D32_FLOAT in more detail:

[Figure: depth reconstruction error for DXGI_FORMAT_D32_FLOAT]

Much better, though at the extremes we still get an error of over 100 meters. With some care though, this can be greatly reduced: The shape of the hyperbolic z' curve is largely determined by the near plane distance z_n. Even a slight change from z_n=0.1 to z_n=0.25 reduces the maximal error from 1.4\% down to 0.26\%.
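A rough CPU-side approximation of this experiment can be sketched in Python. This is not the test program described above: it assumes the projection is evaluated in 64 bit floating point and only the stored value is rounded to the target format, so the numbers will differ from the GPU measurements, where the projection itself runs in 32 bit float. All function names are hypothetical.

```python
import struct


def project_depth(z, z_near, z_far):
    """z' = z_f / (z_f - z_n) * (1 - z_n / z)."""
    return z_far / (z_far - z_near) * (1.0 - z_near / z)


def unproject_depth(z_proj, z_near, z_far):
    """Recover view space z from the stored z'."""
    return z_near * z_far / (z_far - z_proj * (z_far - z_near))


def quantize_unorm16(v):
    """Round to the nearest value representable as a 16 bit UNORM."""
    v = min(max(v, 0.0), 1.0)
    return round(v * 65535.0) / 65535.0


def quantize_float32(v):
    """Round a 64 bit Python float to the nearest 32 bit float."""
    return struct.unpack("f", struct.pack("f", v))[0]


def max_abs_error(quantize, z_near, z_far, samples=10000):
    """Largest |reconstructed z - original z| over regularly spaced samples."""
    worst = 0.0
    for i in range(samples + 1):
        z = z_near + (z_far - z_near) * i / samples
        stored = quantize(project_depth(z, z_near, z_far))
        worst = max(worst, abs(unproject_depth(stored, z_near, z_far) - z))
    return worst


for z_near in (0.1, 0.25):
    e16 = max_abs_error(quantize_unorm16, z_near, 10000.0)
    e32 = max_abs_error(quantize_float32, z_near, 10000.0)
    print(f"z_n = {z_near}: max error D16 = {e16:.1f}, float32 = {e32:.3f}")
```

Even in this idealized setting the 16 bit format loses most of the far range, and moving the near plane outwards shrinks the error for both formats.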

I also tested DXGI_FORMAT_D24_UNORM_S8_UINT but the results were so close to DXGI_FORMAT_D32_FLOAT that I can only conclude that the driver internally maps the depth format to 32 bit float. That is not much of a surprise, since this is exactly what the AMD GCN architecture does as well.

Practical Considerations

  • First of all: Make sure that your near plane is as far away from the camera as you can afford. This will flatten the hyperbolic 1/z curve and provide much better depth precision far away from the viewer.
  • Unless you are in some crazy setting with a view distance of hundreds of kilometers and you are going for sub-centimeter depth resolution, DXGI_FORMAT_D32_FLOAT should be good enough and on modern GPUs should come at no additional cost compared to DXGI_FORMAT_D24_UNORM_S8_UINT.
  • DXGI_FORMAT_D16_UNORM isn’t really a choice for projective transforms. It can be quite valuable for orthographic projections though (for example sun shadow maps), reducing bandwidth by half compared to a 32 bit format.
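To illustrate the last point: an orthographic projection maps view space depth linearly to [0, 1], so a 16 bit UNORM buffer distributes its 65536 steps evenly over the depth range. The Python sketch below assumes the usual mapping z' = (z - z_n) / (z_f - z_n) and a made-up shadow map range of 1 to 500 meters; the function names are hypothetical.

```python
def ortho_depth(z, z_near, z_far):
    """Orthographic projection: depth is mapped linearly to [0, 1]."""
    return (z - z_near) / (z_far - z_near)


def ortho_unproject(d, z_near, z_far):
    """Inverse of ortho_depth."""
    return z_near + d * (z_far - z_near)


def quantize_unorm16(v):
    """Round to the nearest value representable as a 16 bit UNORM."""
    v = min(max(v, 0.0), 1.0)
    return round(v * 65535.0) / 65535.0


def max_abs_error(z_near, z_far, samples=10000):
    """Largest round-trip error over regularly spaced depth samples."""
    worst = 0.0
    for i in range(samples + 1):
        z = z_near + (z_far - z_near) * i / samples
        stored = quantize_unorm16(ortho_depth(z, z_near, z_far))
        worst = max(worst, abs(ortho_unproject(stored, z_near, z_far) - z))
    return worst


z_near, z_far = 1.0, 500.0  # assumed shadow map depth range in meters
step = (z_far - z_near) / 65535.0
print(f"step size: {step * 1000:.2f} mm")
print(f"max error: {max_abs_error(z_near, z_far) * 1000:.2f} mm")
```

The error is bounded by half a quantization step everywhere in the range, in contrast to the perspective case where it grows rapidly with distance.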

Linear Depth

And if you really, really need linear depth you can write it via the SV_DEPTH semantic in the pixel shader. Beware though, you’ll lose early Z unless you use one of the conservative variants, SV_DepthGreaterEqual or SV_DepthLessEqual. Check out this blog post for more details. In most cases though I would argue that non-linear depth is just fine.