Rendering dynamic cube maps for omni light shadows with Vulkan API

This post shares my process and findings while implementing a dynamic variance shadow map with a Gaussian pre-filter for omni lights. The first part explains three possible solutions for dynamic cube map rendering. The second part focuses on UV wrapping for the filter pass.

GitHub repo

(I'm in a learning process myself, so please comment or share your thoughts on the methods described in this post. :)

Part 1 Rendering cube maps with Vulkan

1.1 Draw to one framebuffer color attachment with six array layers

A layered image is the common way of storing cube maps, especially for static assets. We can group the following Vulkan objects to represent a cube map entity (a shader-side view is sketched right after the list):

  • A VkImage with 2D image type and 6 array layers
  • A VkImageView with cube view type
  • A VkSampler
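On the shader side, this trio ends up bound as a single cube sampler descriptor. A minimal, hedged GLSL sketch follows; the set/binding numbers and names are my own assumptions, and reading two moments (.rg) assumes a variance shadow map layout:

// Shader-side view of the objects above: the layered VkImage, the cube
// VkImageView, and the VkSampler are bound as one samplerCube
// (set/binding numbers are illustrative assumptions).
layout(set = 0, binding = 1) uniform samplerCube u_shadow_cube;

vec2 sample_cube_moments(vec3 light_to_frag) {
    // Sampling takes a direction vector; the hardware selects the face.
    return texture(u_shadow_cube, light_to_frag).rg;
}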

However, for a dynamic render target, it is not straightforward to write to each of the 6 layers of an attachment. The trouble is that we need a geometry shader to tag triangles with their layers. The good thing is that we need only one draw call for all six layers. This is the method I implemented in the source code.

The geometry shader performs the following tasks:

  • Takes the layer (face) flags from the vertex shader to determine whether a triangle belongs in each of the six layers
  • Emits triangles to their layers using gl_Layer

The copy workload in the geometry shader should be partially relieved compared with duplicating every triangle six times, because triangles outside a face's frustum are not emitted to that layer. A minimal sketch of such a geometry shader is shown below.
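This sketch assumes the vertex shader outputs the world-space position in gl_Position and a per-vertex face bitmask; the names inFaceFlags, FaceMatrices and u_faces are illustrative, not the repo's actual identifiers, and combining the three vertex flags with a union is just one simple choice.

#version 450

layout(triangles) in;
layout(triangle_strip, max_vertices = 18) out;  // at most 3 vertices x 6 faces

// Per-vertex bitmask from the vertex shader: bit i is set if the vertex
// may be visible in face i (hypothetical name and encoding).
layout(location = 0) flat in uint inFaceFlags[];

// One view-projection matrix per cube face (hypothetical UBO layout).
layout(set = 0, binding = 0) uniform FaceMatrices {
    mat4 viewProj[6];
} u_faces;

void main() {
    // One simple way to combine the per-vertex flags: take their union.
    uint flags = inFaceFlags[0] | inFaceFlags[1] | inFaceFlags[2];

    for (int face = 0; face < 6; ++face) {
        if ((flags & (1u << face)) == 0u) {
            continue;  // triangle lies outside this face's frustum, skip the copy
        }
        gl_Layer = face;  // route the emitted triangle to array layer `face`
        for (int i = 0; i < 3; ++i) {
            gl_Position = u_faces.viewProj[face] * gl_in[i].gl_Position;
            EmitVertex();
        }
        EndPrimitive();
    }
}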

We now have our basic omni light shadows:

vsm without pre-filter

1.2 Using a framebuffer with six single-layered attachments

This method seems convenient at first sight, but it is actually more complicated, at least in terms of the number of Vulkan objects we have to create. First, we need six draw calls for the six faces instead of one as in part 1.1. Writing to the framebuffer attachments referenced by the current subpass happens simultaneously, and there is no way to enable some and disable others on the fly. Therefore we need six subpasses in our render pass, and with them come the subpass dependencies (synchronization). This solution essentially performs the tasks serially, which is suitable for devices with a tight limit on the number of framebuffer attachments (bandwidth).

1.3 Patched 2D cube map

With Vulkan we have other options for rendering dynamic cube maps, such as drawing each face to a 2D texture and transferring it to the actual layered image. But I would like to talk about another method - the patched 2D cube map - since texture atlases are dominant in game development. The method uses a single 2D texture to store all six cube faces. We need two treatments to make it work:

  • upon rendering the shadow map, the viewport/scissor is shifted before each draw call, so that the results land in their dedicated positions
  • upon using the shadow map, a direction-to-UV conversion as in the following code is applied to sample the 2D texture
// cube faces +x, -x, +y, -y, +z, -z in a row
vec2 l_to_shadow_map_uv(vec3 v) {
    float faceIndex;
    vec3 vAbs = abs(v);
    float ma;
    vec2 uv;
    if(vAbs.z >= vAbs.x && vAbs.z >= vAbs.y)
    {
        faceIndex = v.z < 0.0 ? 5.0 : 4.0;
        ma = 0.5 / vAbs.z;
        uv = vec2(v.z < 0.0 ? -v.x : v.x, -v.y);
    }
    else if(vAbs.y >= vAbs.x)
    {
        faceIndex = v.y < 0.0 ? 3.0 : 2.0;
        ma = 0.5 / vAbs.y;
        uv = vec2(v.x, v.y < 0.0 ? -v.z : v.z);
    }
    else
    {
        faceIndex = v.x < 0.0 ? 1.0 : 0.0;
        ma = 0.5 / vAbs.x;
        uv = vec2(v.x < 0.0 ? v.z : -v.z, -v.y);
    }
    uv = uv * ma + 0.5;
    uv.x = (uv.x + faceIndex) / 6.f;
    return uv;
}
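For illustration, a lookup in the lighting shader could look like the following; u_shadow_atlas and light_to_frag are hypothetical names, and reading two moments assumes the variance shadow map layout used in this project.

// Hedged usage: `u_shadow_atlas` is the patched 2D map (six faces in a
// row) and `light_to_frag` is the vector from the light to the shaded
// point; both names are illustrative.
vec2 sample_patched_moments(sampler2D u_shadow_atlas, vec3 light_to_frag) {
    vec2 uv = l_to_shadow_map_uv(light_to_frag);
    return texture(u_shadow_atlas, uv).rg;  // VSM moments: depth, depth^2
}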

The shadow map color attachment looks like this (texture view from RenderDoc):

2d

The advantage of this method is that we need only one single-layered color attachment to prepare the cube map. The disadvantage is that the hardware linear texel filter is not usable, because it causes seams along the face borders, and we know that using a nearest texel filter to sample the shadow map causes serious shadow acne. It is still a viable solution, however, if we have our own filter in the shader.

Part 2 Applying Gaussian blur filter on a cube map

Shadow maps need anti-aliasing. It is possible to pre-filter variance shadow maps due to their nature - the data stored in the map are expectation values. In this project, a Gaussian blur filter is applied to the shadow map before it is used on screen.
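To make "expectation values" concrete: a variance shadow map stores the first two moments of depth, E[d] and E[d^2], and estimates visibility with Chebyshev's inequality. The sketch below is the textbook formulation, not necessarily line-for-line what this project's lighting shader does; names are illustrative.

// Standard VSM visibility estimate from the two filtered moments.
// `moments.x` = E[d], `moments.y` = E[d^2]; `receiver_depth` is the
// depth of the shaded point in the light's space.
float vsm_visibility(vec2 moments, float receiver_depth) {
    if (receiver_depth <= moments.x) {
        return 1.0;  // receiver is in front of the average occluder
    }
    float variance = max(moments.y - moments.x * moments.x, 0.00001);
    float d = receiver_depth - moments.x;
    // Chebyshev's inequality gives an upper bound on the lit probability.
    return variance / (variance + d * d);
}

Because the test only needs the filtered moments, blurring the stored values before the lookup is legitimate, which is exactly what the Gaussian pre-filter does.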

We can run the same filter render pass twice - one filter in the x direction and the other in the y direction. Again, we have six faces to process. I chose to simply use a framebuffer with six attachments for the render pass. But unlike part 1.2, we can do it with one draw call and one subpass, since output to all six attachments is simultaneous. A sketch of the horizontal pass is shown below.
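The sketch assumes the six source faces are bound as a sampler2DArray (u_src_faces, an illustrative name), uses a fixed 5-tap Gaussian, and deliberately ignores the face borders for now; handling them is the subject of the rest of this part.

#version 450

// Six color attachments, one per blurred cube face (two VSM moments each).
layout(location = 0) out vec2 outFace0;
layout(location = 1) out vec2 outFace1;
layout(location = 2) out vec2 outFace2;
layout(location = 3) out vec2 outFace3;
layout(location = 4) out vec2 outFace4;
layout(location = 5) out vec2 outFace5;

layout(location = 0) in vec2 inUV;

// Source faces as one array texture (illustrative binding).
layout(set = 0, binding = 0) uniform sampler2DArray u_src_faces;

// 5-tap binomial Gaussian weights (center, +-1, +-2); the vertical pass
// is identical with the offset applied to uv.y.
const float w[3] = float[](0.375, 0.25, 0.0625);

vec2 blur_x(int face) {
    vec2 texel = 1.0 / vec2(textureSize(u_src_faces, 0).xy);
    vec2 sum = w[0] * texture(u_src_faces, vec3(inUV, face)).rg;
    for (int i = 1; i <= 2; ++i) {
        vec2 off = vec2(texel.x * float(i), 0.0);
        sum += w[i] * texture(u_src_faces, vec3(inUV + off, face)).rg;
        sum += w[i] * texture(u_src_faces, vec3(inUV - off, face)).rg;
    }
    return sum;
}

void main() {
    // All six attachments are written in the same fragment invocation.
    outFace0 = blur_x(0);
    outFace1 = blur_x(1);
    outFace2 = blur_x(2);
    outFace3 = blur_x(3);
    outFace4 = blur_x(4);
    outFace5 = blur_x(5);
}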

The major problem we are facing is how to eliminate the seams (discontinuities) across adjacent cube face borders:

cube seams

There is an intuitive idea that if we chose a patched 2D cube map in the first place, we could avoid dealing with the seams. But this is actually not true. No matter how the faces are unfolded, a 2D image will not wrap onto itself at its boundaries. Compare the one-row layout with the implementation that keeps each face in its own layer: the areas with discontinuities across the borders are the same.

Therefore we have no choice but to write our own UV wrapping code in the shader that performs the Gaussian blur. (This project only deals with texel filtering, not the mip map smearing problem that appears when there is more than one mip level.) The wrapping relocates UV coordinates that exceed a face boundary onto the correct adjacent face. The following code shows a wrapping function on texel positions, which is easier to read and debug than, but essentially equivalent to, the actual wrapping function performed on texture UVs:

ivec3 wrap_p(ivec2 p, ivec2 offset, int res, int face) {
    // p, offset are in pixels
    // res is the resolution of the cube face
    // face ranges from 0 to 5
    ivec2 p_new = p + offset;

    int s = p_new.x;
    int t = p_new.y;
    if (s >= res) {
        if (face == 0) { face=5; p_new.x=s - res; }
        else if (face == 1) { face=4; p_new.x=s - res; }
        else if (face == 2) { face=0; p_new.x=res - t - 1; p_new.y=s - res; }
        else if (face == 3) { face=0; p_new.x=t; p_new.y=2 * res - s - 1; }
        else if (face == 4) { face=0; p_new.x=s - res; }
        else if (face == 5) { face=1; p_new.x=s - res; }
    }
    else if (s < 0) {
        if (face == 0) { face=4; p_new.x=s + res; }
        else if (face == 1) { face=5; p_new.x=s + res; }
        else if (face == 2) { face=1; p_new.x=t; p_new.y=-s - 1; }
        else if (face == 3) { face=1; p_new.x=res - t - 1; p_new.y=s + res; }
        else if (face == 4) { face=1; p_new.x=s + res; }
        else if (face == 5) { face=0; p_new.x=s + res; }
    }
    else if (t >= res) {
        if (face == 0) { face=3; p_new.x=2 * res - t - 1; p_new.y=s; }
        else if (face == 1) { face=3; p_new.x=t - res; p_new.y=res - s - 1; }
        else if (face == 2) { face=4; p_new.y=t - res; }
        else if (face == 3) { face=5; p_new.x=res - s - 1; p_new.y=2 * res - t - 1; }
        else if (face == 4) { face=3; p_new.y=t - res; }
        else if (face == 5) { face=3; p_new.x=res - s - 1; p_new.y=2 * res - t - 1; }
    }
    else if (t < 0) {
        if (face==0) { face=2; p_new.x=res + t; p_new.y=res - s - 1; }
        else if (face==1) { face=2; p_new.x=-t - 1; p_new.y=s; }
        else if (face==2) { face=5; p_new.x=res - s - 1; p_new.y=-t - 1; }
        else if (face==3) { face=4; p_new.y=t + res; }
        else if (face==4) { face=2; p_new.y=t + res; }
        else if (face==5) { face=2; p_new.x=res - s - 1; p_new.y=-t - 1; }
    }
    return ivec3(p_new, face);
}
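For reference, here is a hedged example of a single blur tap built on wrap_p with unfiltered texel fetches; binding the six faces as a sampler2DArray passed in as src is an assumption. The project's actual blur uses the equivalent UV-space wrapping instead, for the reason explained next.

// One horizontal Gaussian tap that stays correct across face borders,
// using unfiltered texel fetches (illustrative, not the repo's shader).
vec2 tap_x(sampler2DArray src, ivec2 p, int offset_px, int res, int face) {
    ivec3 w = wrap_p(p, ivec2(offset_px, 0), res, face);  // (x, y, face)
    return texelFetch(src, w, 0).rg;
}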

To wrap UVs instead of texel positions, the above function simply needs a 'one pixel to one divided by texture resolution' conversion. The reason for choosing UV wrapping (texture sampling in the shader) in this project is that we still want the hardware linear texel filter enabled: it provides better blur quality thanks to a more accurate low-pass filter computation. A comparison is shown in the following image - Gaussian filter on texel coordinates (left) vs. linearly filtered sampler UV coordinates (right):

filter_lookup_compare

Finally, we have a pre-filtered shadow map for an omni light:

final small