# Shadow Quality (1)

Let’s face it: even after stabilization, the shadows in my sample still look pretty bad. Very blocky, hard shadow edges and strange shapes everywhere… So let’s do something about it! But first a short disclaimer: the effectiveness of the techniques I propose in this post is highly dependent on the scene and your visual target. There is no one approach that will fix all your shadow artifacts once and for all; it’s rather through trial and error and a combination of well-known techniques that you can get to a state you are satisfied with. I have yet to come across a game that does a perfect job with real-time shadows.

But let’s start now, with something very simple: during shading, we know the surface normal for each position on the geometry. So let’s make use of that: if a surface normal points away from the light source, we know for sure that the point does not receive any light. Hence it is in shadow. Determining whether the surface normal is pointing away from the light is simple: just take the dot product between the surface normal and the light direction (the direction from the surface towards the light) and check if it’s less than 0. So let’s combine that with the shadow map lookup from GetShadowFactor in Cascaded Shadow Maps (3):

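// compute shadow factor: 0 if in shadow, 1 if not.
// (signature reconstructed from the text; the parameter order is assumed)
float GetShadowFactor( ShadowSplitInfo splitInfo, float ndotl )
{
    float storedDepth = tex2Dlod( ShadowMapSampler, float4( splitInfo.TexCoords, 0, 0)).r;

    // lit only if not occluded in the shadow map AND facing the light
    return (splitInfo.LightSpaceDepth < storedDepth) * (ndotl > 0);
}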


where ndotl is assumed to contain the dot product between surface normal and light direction. Note that we are combining the two terms “Pixel is occluded by some occluder in the shadow map” and “Surface is oriented away from the light” with a simple multiplication. This equation can be interpreted in a probabilistic sense, in that we are saying that a point can only be in light if it is not occluded by some object AND the surface is oriented towards the light. Check out the comparison below: shading without the ndotl term on the left, with the ndotl term on the right.

The geometry-based ndotl term is clearly helping a lot with thin surfaces, where the shadow map doesn’t have enough $Z$ precision to tell the front- and back-facing polygons apart. It can, however, only be as good as the surface normals, so correct surface normals are a must.
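To make the AND-combination concrete, here is a minimal CPU-side sketch in C# of the same decision. The class and method names are hypothetical and only mirror the shader logic; they are not part of the sample.

```csharp
using System;

public static class ShadowFactorSketch
{
    // Returns 1 if the point is lit, 0 if it is in shadow.
    // lightSpaceDepth: depth of the shaded point as seen from the light
    // storedDepth:     depth read from the shadow map for that point
    // nDotL:           dot(surface normal, direction towards the light)
    public static float ShadowFactor(float lightSpaceDepth, float storedDepth, float nDotL)
    {
        bool notOccluded = lightSpaceDepth < storedDepth; // nothing closer to the light
        bool facingLight = nDotL > 0.0f;                  // surface oriented towards the light
        return (notOccluded && facingLight) ? 1.0f : 0.0f;
    }
}
```

A thin wall is the interesting case: its back face may pass the depth test because of limited depth precision, but its negative nDotL still forces the result to 0.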

Next, let’s have a look at another geometric approach: depth bias. When rendering the shadow map, we store the scene distance with respect to the light source with limited accuracy. In my case, I requested a SurfaceFormat.Single shadow map format, which means 32 bits (floating point) per pixel. But not all developers/hardware can afford such a surface format; sometimes 24 or even 16 bits per pixel are a requirement. Under such circumstances, the depth precision of the shadow map is quite limited, resulting in a high quantization error for the stored depth. This means that when we read back the stored depth from the shadow map, we get a value that is only more or less close to the real depth of the corresponding surface element. So we can either end up thinking that a given surface point is in shadow because the stored depth value is (due to quantization) closer to the light source than the real depth value, or that the surface point is not in shadow because, due to quantization, the stored depth is further away from the light source than in reality. Since the first problem is much more visible than the latter, a simple workaround has been invented: add some bias to the stored shadow depth.

There is even hardware support for this functionality: the D3DRS_DEPTHBIAS and D3DRS_SLOPESCALEDEPTHBIAS render states, where D3DRS_DEPTHBIAS corresponds to a constant bias and D3DRS_SLOPESCALEDEPTHBIAS is a surface slope based bias. Slope-based bias can be very helpful with surface elements that are oriented at a steep angle with respect to the light source. Unfortunately these render states only apply to the position interpolator, so they are not much use to us because we are passing the depth via a color interpolator. So we have to emulate them in the fragment shader:

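// x: slope scale bias, y: constant bias
float2 DepthBias;

// shadow map pixel shader (function name and input type are assumed here)
float4 PixelShaderFunction(VertexShaderOutput input) : COLOR0
{
    // steepest change of depth across the pixel
    float depthSlopeBias = max(
        abs(ddx(input.Depth)),
        abs(ddy(input.Depth))
    );
    return float4( input.Depth + depthSlopeBias * DepthBias.x + DepthBias.y, 0, 0, 0 );
}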


The DepthBias values are passed to the shader via shader constants. Check out the images below:

On the left you can see the shadowed scene without any bias and on the right with tweaked bias values.
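The bias arithmetic is easy to check on the CPU. The following C# sketch reproduces the formula from the shader with plain floats; the names are hypothetical, and the two derivative arguments stand in for what ddx and ddy would return.

```csharp
using System;

public static class DepthBiasSketch
{
    // depth:        depth value written to the shadow map
    // ddxDepth:     change of depth per pixel in x (what ddx would return)
    // ddyDepth:     change of depth per pixel in y (what ddy would return)
    // slopeScale:   corresponds to DepthBias.x
    // constantBias: corresponds to DepthBias.y
    public static float BiasedDepth(float depth, float ddxDepth, float ddyDepth,
                                    float slopeScale, float constantBias)
    {
        float slope = Math.Max(Math.Abs(ddxDepth), Math.Abs(ddyDepth));
        return depth + slope * slopeScale + constantBias;
    }
}
```

For a surface facing the light head-on both derivatives are close to zero and only the constant bias applies; the steeper the surface, the more the slope term pushes the stored depth away from the light.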

# Stable CSM

So… How can we fix this? First, let’s agree to make the shadow map size in world space constant, say shadowMapSize, and decouple it from light and camera rotation. In my case I chose to align the shadow map to the world space $X$ and $Z$ axes. I do so by exchanging the shadow view matrix for a matrix where the $X$ and $Y$ axes point in the direction of world space $X$ and $Z$, positioned at mLightPosition like so:

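// Remember: XNA uses a right handed coordinate system, i.e. -Z goes into the screen
var look = Vector3.Normalize(arena.BoundingSphere.Center - mLightPosition);

// The rows below are the light's axes and position in world space;
// inverting that matrix yields the view matrix (assignment reconstructed).
mShadowView = Matrix.Invert(
    new Matrix(
        1,              0,              0,             0,
        0,              0,             -1,             0,
        -look.X,        -look.Y,        -look.Z,        0,
        mLightPosition.X, mLightPosition.Y, mLightPosition.Z,     1
    )
);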


Note that the $Y$ axis is flipped in order to preserve culling order in the final view transform. Also note that this approach only works as long as mLightPosition does not lie in the $y=0$ plane in world space as then the resulting matrix becomes singular.

Now let’s tackle camera movement: as outlined before, the problem is that even the slightest movement of the shadow map affects the whole scene, because each scene position changes its position in the shadow map (subpixel-wise speaking). What we need, however, is that the scene positions stay constant (at least relative to their corresponding pixel). So instead of moving the shadow map continuously, let’s move it in fixed increments of one shadow map pixel. When moving the shadow map this way, each world space position might fall into a different shadow map texel than the frame before, but its relative position within the texel stays the same, which means no more moving shadow boundaries.

So how can we implement this? Given our view transform defined like above, all we need to do is adjust the shadow projections: We want to place the shadow map corners at discrete positions only, separated by some value, e.g. quantizationStep. Remember, in one of my previous posts Cascaded Shadow Mapping (1), we defined the extent of the shadow projection matrices based on values min and max which were determined from the view frustum. All we need to do now is make sure the $X$ and $Y$ coordinates of min and max are properly discretized:

var quantizationStep = 1.0f / shadowMapSize;
var qx = (float)Math.IEEERemainder(min.X, quantizationStep);
var qy = (float)Math.IEEERemainder(min.Y, quantizationStep);

min.X -= qx;
min.Y -= qy;



Using the adjusted min and max values we create the shadow projection matrix as described before:

Projection = Matrix.CreateOrthographicOffCenter(min.X, max.X, min.Y, max.Y, minZ, maxZ);


The effect of these few lines of code is dramatic. Check out the video below:

You can clearly see how the scene shadows are stabilized, almost all artifacts during camera movements and rotations are gone. The remaining artifacts stem from transitions between the different shadow splits, as shown in the last part of the video.
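The snapping idea from this section can be checked in isolation. The C# sketch below is a one-dimensional illustration with hypothetical names; it snaps with a floor instead of the Math.IEEERemainder call used in the post, which changes the rounding direction but not the principle. It shows that a fixed point keeps the same position inside its texel as long as the shadow map corner only moves in whole texel steps.

```csharp
using System;

public static class TexelSnapSketch
{
    // Snap a light-space coordinate down to a multiple of the texel size.
    public static float Snap(float value, float texelSize)
    {
        return (float)Math.Floor(value / texelSize) * texelSize;
    }

    // Position (0..1) of a fixed point inside its texel, for a shadow map
    // whose minimum corner sits at 'min'.
    public static float FractionInTexel(float point, float min, float texelSize)
    {
        float t = (point - min) / texelSize;
        return t - (float)Math.Floor(t);
    }
}
```

With a texel size of 0.25, corners at 1.30 and 1.90 place the point 3.1 at different sub-texel offsets, whereas the snapped corners 1.25 and 1.75 give the same offset in both frames.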

# Cascaded Shadow Maps (3)

Having seen how we can render the shadow map for each split in the last post, it’s time to look at the final step: mapping the shadow atlas onto the scene. So, basically, for each pixel in the scene we need to determine if it lies in shadow or not. To do so we first need to determine the appropriate shadow split for that pixel. This can be done in a couple of different ways, for example by determining the split based on the distance to the viewer or by picking the split based on the lookup texture coordinates. I chose the latter, as this makes better use of the higher resolution maps. Check out this article for a nice comparison.

So in order to determine the shadow split we basically transform the current fragment’s world position by each shadow transform and then pick the first shadow map where the coordinates fall within the range of the shadow atlas partition. Let’s define some helper functions first:

#define NumSplits 4

// the "world to shadow atlas partition" transforms
float4x4 ShadowTransform[NumSplits];

// the bounding rectangle (in texture coordinates) for each split
float4 TileBounds[NumSplits];

// the shadow atlas
sampler ShadowMapSampler = sampler_state
{
magfilter = POINT;    minfilter = POINT;    mipfilter = POINT;
};

// Data passed from vertex shader to pixel shader
struct ShadowData
{
    float4 TexCoords_0_1;
    float4 TexCoords_2_3;
    float4 LightSpaceDepth;
};

// compute shadow parameters (tex coords and depth)
// for a given world space position
ShadowData GetShadowData( float4 worldPosition )
{
    ShadowData result;
    float4 texCoords[NumSplits];
    float lightSpaceDepth[NumSplits];

    for( int i=0; i<NumSplits; ++i )
    {
        float4 lightSpacePosition = mul( worldPosition, ShadowTransform[i] );
        texCoords[i] = lightSpacePosition / lightSpacePosition.w;
        lightSpaceDepth[i] = texCoords[i].z;
    }

    // pack two splits per interpolator
    result.TexCoords_0_1 = float4(texCoords[0].xy, texCoords[1].xy);
    result.TexCoords_2_3 = float4(texCoords[2].xy, texCoords[3].xy);
    result.LightSpaceDepth = float4( lightSpaceDepth[0],
                                     lightSpaceDepth[1],
                                     lightSpaceDepth[2],
                                     lightSpaceDepth[3] );

    return result;
}

struct ShadowSplitInfo
{
    float2 TexCoords;
    float  LightSpaceDepth;
    int    SplitIndex;
};

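// find split index, texcoords and light space depth for given shadow data
// (the function and parameter names are assumed)
ShadowSplitInfo GetSplitInfo( ShadowData shadowData )
{
    // unpack the per-split values written by GetShadowData
    float2 shadowTexCoords[NumSplits] =
    {
        shadowData.TexCoords_0_1.xy,
        shadowData.TexCoords_0_1.zw,
        shadowData.TexCoords_2_3.xy,
        shadowData.TexCoords_2_3.zw
    };

    float lightSpaceDepth[NumSplits] =
    {
        shadowData.LightSpaceDepth.x,
        shadowData.LightSpaceDepth.y,
        shadowData.LightSpaceDepth.z,
        shadowData.LightSpaceDepth.w
    };

    // pick the first split whose coordinates fall inside its atlas partition
    for( int splitIndex=0; splitIndex < NumSplits; splitIndex++ )
    {
        if( shadowTexCoords[splitIndex].x >= TileBounds[splitIndex].x &&
            shadowTexCoords[splitIndex].x <= TileBounds[splitIndex].y &&
            shadowTexCoords[splitIndex].y >= TileBounds[splitIndex].z &&
            shadowTexCoords[splitIndex].y <= TileBounds[splitIndex].w )
        {
            ShadowSplitInfo result;
            result.TexCoords = shadowTexCoords[splitIndex];
            result.LightSpaceDepth = lightSpaceDepth[splitIndex];
            result.SplitIndex = splitIndex;

            return result;
        }
    }

    // no split covers this position
    ShadowSplitInfo result = { float2(0,0), 0, NumSplits };
    return result;
}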

// compute shadow factor: 0 if in shadow, 1 if not
float GetShadowFactor( ShadowSplitInfo splitInfo )
{
    float storedDepth = tex2Dlod( ShadowMapSampler, float4( splitInfo.TexCoords, 0, 0)).r;

    return (splitInfo.LightSpaceDepth <  storedDepth);
}


Armed with these definitions we can now add shadowing to any shader. All we need to do is call GetShadowData in order to convert a given world position into shadow atlas texture coordinates and then GetShadowFactor to do the lookup. Note that since the quantities in ShadowData are linear, I would suggest calling GetShadowData in the vertex shader in order to save fragment shader instructions. I collected these functions in a header file Shadow.h, which can be included by any shader.
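The split selection can also be tried out on the CPU. The C# sketch below uses System.Numerics in place of the shader types, and all names are hypothetical. It additionally assumes a 2×2 atlas and that each TileBounds entry is laid out as (min u, max u, min v, max v), which is what the comparisons in the shader suggest.

```csharp
using System.Numerics;

public static class AtlasSplitSketch
{
    // Bounds of tile i in a 2x2 atlas as (minU, maxU, minV, maxV).
    public static Vector4 TileBounds(int splitIndex)
    {
        float u = (splitIndex % 2) * 0.5f;
        float v = (splitIndex / 2) * 0.5f;
        return new Vector4(u, u + 0.5f, v, v + 0.5f);
    }

    // texCoords[i] holds the atlas coordinates of the shaded point under split i.
    // Returns the first split whose coordinates fall inside its own tile,
    // or texCoords.Length if no split covers the point.
    public static int FindSplit(Vector2[] texCoords)
    {
        for (int i = 0; i < texCoords.Length; i++)
        {
            Vector4 b = TileBounds(i);
            Vector2 t = texCoords[i];
            if (t.X >= b.X && t.X <= b.Y && t.Y >= b.Z && t.Y <= b.W)
                return i;
        }
        return texCoords.Length;
    }
}
```

Because the loop returns on the first hit, a point that is covered by several splits always ends up in the one with the lowest index, i.e. the one with the highest resolution.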

What’s with ShadowTransform[i] though? The matrix is assumed to transform a vertex’s world space position into the texture coordinates of its corresponding shadow split $i$. That is actually nothing new, just the combined shadow view and projection matrix, if it weren’t for the nasty coordinate space differences: after projection, a vertex ends up in clip space, which is defined as (ignoring the Z coordinate) $[-1, \dots, 1] \times [-1, \dots, 1]$. But we need texture coordinates for the shadow map lookup, which range over $[0, \dots, 1] \times [0, \dots, 1]$. Even more specifically: we need to index into the sub rectangle of the corresponding shadow map in the shadow atlas. And to make things even worse, the y axis of clip space points upwards while the y axis in texture space points downwards: top left in clip space is $(-1,1)$ whereas top left in texture coordinates is $(0,0)$. Luckily we can extend the combined shadow view and projection matrix to do the remapping:

// compute block index into shadow atlas
int tileX = i % 2;
int tileY = i / 2;

// tile matrix: maps from clip space to shadow atlas block
var tileMatrix = Matrix.Identity;
tileMatrix.M11 = 0.25f;
tileMatrix.M22 = -0.25f;
tileMatrix.Translation = new Vector3(0.25f + tileX * 0.5f, 0.25f + tileY * 0.5f, 0);

// now combine with shadow view and projection


So the first two diagonal elements of tileMatrix scale the clip space $x$ and $y$ coordinates down to $[-\frac{1}{4}, \dots, \frac{1}{4}]$ and the translation component shifts the coordinates to $[0, \dots, \frac{1}{2}]$. The tileX * 0.5f and tileY * 0.5f part then offsets the coordinates into the proper shadow atlas partition. Simple as that. Beware though, this only works for orthographic shadow projections.
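To verify the remapping numerically, here is a small C# sketch that builds the same tile matrix with System.Numerics (which shares XNA's row vector convention) and pushes clip space corners through it. The class and method names are hypothetical.

```csharp
using System.Numerics;

public static class AtlasTileSketch
{
    // Same construction as tileMatrix in the post, for split index i in a 2x2 atlas.
    public static Matrix4x4 TileMatrix(int splitIndex)
    {
        int tileX = splitIndex % 2;
        int tileY = splitIndex / 2;

        var m = Matrix4x4.Identity;
        m.M11 = 0.25f;   // scale x from [-1, 1] to [-1/4, 1/4]
        m.M22 = -0.25f;  // same for y, flipped because texture v points down
        m.Translation = new Vector3(0.25f + tileX * 0.5f, 0.25f + tileY * 0.5f, 0f);
        return m;
    }

    // Maps a clip space (x, y) position to shadow atlas texture coordinates.
    public static Vector2 ClipToAtlas(Vector2 clipXY, int splitIndex)
    {
        Vector3 p = Vector3.Transform(new Vector3(clipXY, 0f), TileMatrix(splitIndex));
        return new Vector2(p.X, p.Y);
    }
}
```

For split 0 the clip space top left corner $(-1, 1)$ lands on $(0, 0)$ and the bottom right corner $(1, -1)$ on $(\frac{1}{2}, \frac{1}{2})$; for split 3 the same corners land on $(\frac{1}{2}, \frac{1}{2})$ and $(1, 1)$.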

Check out the images below, from left to right: The scene with shadows, shadows per split, shadow map resolution per split.

Have a look at the next visualization as well: Each pixel in the shadow map is back-projected into world space again and rendered as a cube. This sort of illustrates how the shadow map ‘sees’ the scene. You can clearly see the stepping on tilted surfaces, as well as discretization errors. Note that this rendering method is *very* demanding on the GPU (and I haven’t had time to put in some optimization), so the video is somewhat choppy.

This concludes the section about cascaded shadow mapping, next time we’ll look into how to stabilize the shadows during camera movement.

# Cascaded Shadow Maps (2)

So last time we saw how we can partition the view frustum into several sub frustums and how to compute a projection matrix for each. This time we’ll look at how the shadow maps are actually rendered.

For starters, I chose to render all shadow maps into a single texture atlas instead of assigning a unique texture to each shadow split. This saves us from switching render targets for each shadow split and also simplifies the shadow mapping shader. In this sample I am using a shadow atlas of 1024×1024 pixels,

mShadowMap = new RenderTarget2D(mGraphicsDevice, 1024, 1024,
false, SurfaceFormat.Single, DepthFormat.Depth24);


and each shadow split is rendered into a 512×512 subrectangle. Ideally we’d render into the depth buffer only (double speed!) and then either resolve to a texture (Xbox 360) or directly bind it as a texture (DirectX). Unfortunately this is not possible in XNA. So I use a 32 bit float render target for the shadow map and a separate depth buffer.

During the shadow map render pass I then bind the atlas as render target and, for each split, set up the viewport to only render into the corresponding shadow atlas partition. Once the viewport is set we can finally render the model using the shadow transform as combined view and projection matrix:

// bind shadow atlas as render target and clear
mGraphicsDevice.SetRenderTarget(mShadowMap);
mGraphicsDevice.Clear(ClearOptions.Target | ClearOptions.DepthBuffer, Color.White, 1.0f, 0);

// get model bone transforms
Matrix[] transforms = new Matrix[arena.Bones.Count];
arena.CopyAbsoluteBoneTransformsTo(transforms);

for (int i = 0; i < mNumShadowSplits; ++i)
{
// set up viewport
{
int x = i % 2;
int y = i / 2;
var viewPort = new Viewport(x * 512, y * 512, 512, 512);

mGraphicsDevice.Viewport = viewPort;
}

// Draw the arena model
foreach (ModelMesh mesh in arena.Meshes)
{
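foreach (var effect in mesh.Effects)
{
    effect.Parameters["World"].SetValue(transforms[mesh.ParentBone.Index]);
    // the split's combined shadow view and projection matrix
    effect.Parameters["ViewProjection"].SetValue(ShadowSplitProjections[i]);
}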

mesh.Draw();
}
}


As for the shader used during this render pass, it’s nothing special: It just transforms each vertex into shadow clip space and then writes out the depth value as color – like any other shadow mapping shader:

float4x4 World;
float4x4 ViewProjection;

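// NOTE: the struct and entry point names below are assumed
struct VertexShaderInput
{
    float4 Position        : POSITION0;
};

struct VertexShaderOutput
{
    float4 Position        : POSITION0;
    float  Depth           : COLOR0;
};

VertexShaderOutput VertexShaderFunction(VertexShaderInput input)
{
    VertexShaderOutput output;

    float4 worldPosition = mul(input.Position, World);
    output.Position = mul(worldPosition, ViewProjection);
    // depth after perspective divide, as seen from the light
    output.Depth = output.Position.z / output.Position.w;

    return output;
}

float4 PixelShaderFunction(VertexShaderOutput input) : COLOR0
{
    return float4( input.Depth, 0, 0, 0 );
}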


Check out the images below: On the left you can see the scene and the view frustum and on the right the corresponding shadow atlas.

Obviously, the split distances don’t match the arena dimensions very well, but I’ll get into that in another post.

# Cascaded Shadow Mapping (1)

As mentioned in my previous post, cascaded shadow mapping is a technique to shadow large areas at reasonable memory and run-time cost. The basic idea is simple: Split the view frustum into several sub frustums and render a shadow map for each split. Since, naturally, splits closer to the viewer cover less area in world space (i.e. under perspective projection) you get better shadow map resolution close to the camera and less resolution further away. Check out the image below: Each split is visualized in a different color and the corresponding shadow map has been filled with a block pattern. All shadow maps are of the same resolution. You can clearly see how shadow maps closer to the viewer (red and green) allocate much higher detail per world space unit than shadow maps further away from the viewer (blue and yellow). This gives us a highly desired quality: lots of resolution/detail close to the viewer and less far away, without the need for a huge shadow map.

So how can we get this done? First we have to take the same steps as in regular shadow mapping: Define the shadow casting light i.e. light direction and position and the corresponding projection into the shadow map. In my case I chose to simulate sun light which can be represented by a simple orthogonal projection. I create the shadow transform like so:

// shadow view matrix
mShadowView = Matrix.CreateLookAt(mLightPosition, mLightPosition + mLightDirection, Vector3.Up);

// determine shadow projection based on model bounding sphere
{
var center =  Vector3.Transform(arena.BoundingSphere.Center, mShadowView);

var min = center - new Vector3(arena.BoundingSphere.Radius);
var max = center + new Vector3(arena.BoundingSphere.Radius);

mShadowProjection = Matrix.CreateOrthographicOffCenter( min.X, max.X, min.Y, max.Y, -max.Z, -min.Z);
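}

// combined shadow view and projection (line reconstructed from the text)
mShadowTransform = mShadowView * mShadowProjection;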

So the shadow view matrix is a simple look-at matrix placed at the position of the light mLightPosition and looking along mLightDirection. As the up vector I just use the y axis, which works fine as long as mLightDirection is not pointing straight up or down. Using this definition I can then transform the model’s bounding sphere center into shadow space (i.e. compute the position of the bounding sphere center relative to the light) and get the minimum axis-aligned bounding box (in shadow space) encompassing the sphere by just adding and subtracting the bounding sphere radius along each coordinate axis. The resulting vectors min and max thus give us the shadow frustum we need to project into our shadow map in order to cover the whole model. Note that view space looks along the negative z axis, so we need to negate the Z values.
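As a worked example with concrete numbers, the following C# sketch repeats the bounding sphere computation with System.Numerics standing in for the XNA math types. The class and method names are hypothetical.

```csharp
using System.Numerics;

public static class ShadowBoundsSketch
{
    // Axis-aligned box (in shadow view space) around a bounding sphere.
    public static (Vector3 Min, Vector3 Max) SphereBounds(
        Vector3 center, float radius, Matrix4x4 shadowView)
    {
        Vector3 c = Vector3.Transform(center, shadowView);
        return (c - new Vector3(radius), c + new Vector3(radius));
    }

    // Orthographic projection covering the box. View space looks down -Z,
    // so the near/far distances are the negated Z bounds.
    public static Matrix4x4 ShadowProjection(Vector3 min, Vector3 max)
    {
        return Matrix4x4.CreateOrthographicOffCenter(
            min.X, max.X, min.Y, max.Y, -max.Z, -min.Z);
    }
}
```

With a light at $(0, 0, 10)$ looking along $(0, 0, -1)$ and a sphere of radius 3 centred at $(1, 2, 0)$, the sphere centre ends up at $(1, 2, -10)$ in shadow space, so the box spans $(-2, -1, -13)$ to $(4, 5, -7)$ and the near and far planes sit at distances 7 and 13.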

The resulting matrix mShadowTransform thus gives us the projection of world space coordinates into our shadow map. Now, in cascaded shadow mapping we have not one, but multiple shadow maps. We therefore need to define multiple versions of mShadowTransform, one for each shadow split. And we also need to align the splits to the view frustum. So let’s start with the view frustum: let’s say we split the frustum at constant distances from the viewer, say the first split should range from -1 to -100, the second split from -100 to -300 and the third split from -300 to -600. Given these split distances we can use the functionality described in the post Frustum Splits to figure out the world space positions of the sub frustum corners for each split. Once we know the world space positions for each split we’re almost done: we just need to adjust the projection matrix to focus on the split frustum.

// determine clip space split distances
var splitDistances = new[] {-1, -100.0f, -300.0f, -600 }
.Select(d =>
{
var c = Vector4.Transform(new Vector3(0, 0, d), camera.projectionMatrix);
return c.Z / c.W;
})
.ToArray();

// determine split projections
var splitData = Enumerable.Range(0, mNumShadowSplits).Select(i =>
{
var n = splitDistances[i];
var f = splitDistances[i + 1];

// get frustum split corners and transform into shadow space
var frustumCorners = splitFrustum(n, f)
.Select(v => Vector3.Transform(v, mShadowView));

var min = frustumCorners.Aggregate((v1, v2) => Vector3.Min(v1, v2));
var max = frustumCorners.Aggregate((v1, v2) => Vector3.Max(v1, v2));

// determine the min/max z values based on arena bounding box
var arenaBB = GeometryHelper.transformBoundingBox(arena.CollisionData.geometry.boundingBox, mShadowView);
var minZ = -arenaBB.Max.Z;
var maxZ = -arenaBB.Min.Z;

// return orthographic projection
return new
{
Distance = f,
Projection = Matrix.CreateOrthographicOffCenter(min.X, max.X, min.Y, max.Y, minZ, maxZ)
};
}).ToArray();

// compute final split transforms
ShadowSplitProjections = splitData.Select(s => mShadowView * s.Projection).ToArray();
ShadowSplitDistances = splitData.Select(s => s.Distance).ToArray();


This gives us an array ShadowSplitProjections of matrices where each matrix projects the scene into a shadow map, as seen from the light. Note that I also store the split distances in clip space, i.e. the view space distances from the camera’s center of projection after they have been run through the camera projection. Have a look at the images below:

The images on the left show the view frustum split into three sub frustums, while the images on the right show the shadow transform’s frustum for each split.
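The conversion of the split distances from view space to clip space is worth looking at numerically, because the result is far from linear. The C# sketch below mirrors the splitDistances computation using System.Numerics; the names are hypothetical, and the projection parameters in the usage note are made up for illustration.

```csharp
using System.Numerics;

public static class SplitDistanceSketch
{
    // Maps a view space z value (negative in front of a right-handed camera)
    // to its depth after projection and perspective divide.
    public static float ViewZToClipDepth(float viewZ, Matrix4x4 projection)
    {
        Vector4 c = Vector4.Transform(new Vector3(0f, 0f, viewZ), projection);
        return c.Z / c.W;
    }

    // Converts a whole list of view space split distances.
    public static float[] ClipSplitDistances(float[] viewZs, Matrix4x4 projection)
    {
        var result = new float[viewZs.Length];
        for (int i = 0; i < viewZs.Length; i++)
            result[i] = ViewZToClipDepth(viewZs[i], projection);
        return result;
    }
}
```

Assuming a projection created with Matrix4x4.CreatePerspectiveFieldOfView and near and far planes at 1 and 600, the distances -1, -100, -300 and -600 map to roughly 0, 0.992, 0.998 and 1. Most of the depth range is spent on the first split, which is why the split distances are easier to choose in view space and converted afterwards.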

This concludes the first step of implementing cascaded shadow mapping. In the next post I’ll show how to finally render the shadow maps and how to project them onto the scene. Hope you find it as interesting as I do 🙂