
Sparse Procedural Volumetric Rendering

India, Oct. 9 -- Sparse Procedural Volumetric Rendering (SPVR) is a technique for rendering real-time volumetric effects. We're excited that the upcoming book "GPU Pro 6" will include an SPVR chapter. This document gives some additional details.

SPVR efficiently renders a large volume by breaking it into smaller pieces and processing only the occupied pieces. We call the pieces "metavoxels"; a voxel is the volume's smallest piece. A metavoxel is a 3D array of voxels. And the overall volume is a 3D array of metavoxels. The sample has compile-time constants for defining these numbers. It's currently configured for a total volume size of 1024³ voxels, in the form of 32³ metavoxels, each composed of 32³ voxels.
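Those compile-time relationships can be sketched in Python. The constant names METAVOXEL_WIDTH and WIDTH_IN_METAVOXELS appear in the sample's shader code; the helper functions here are purely illustrative:

```python
# Illustrative constants mirroring the sample's configuration:
# a 32^3 grid of metavoxels, each metavoxel a 32^3 grid of voxels.
METAVOXEL_WIDTH = 32       # voxels along one edge of a metavoxel
WIDTH_IN_METAVOXELS = 32   # metavoxels along one edge of the volume

def volume_width_in_voxels():
    """Total voxels along one edge of the overall volume."""
    return METAVOXEL_WIDTH * WIDTH_IN_METAVOXELS

def total_voxels():
    """Total voxel count for the whole (cubic) volume."""
    return volume_width_in_voxels() ** 3
```

With the configuration above, each edge is 32 * 32 = 1024 voxels, giving the 1024³ total the sample is configured for.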

The sample also achieves efficiency by populating the volume with volume primitives. Many volume primitive types are possible. The sample implements one of them: a radially displaced sphere. The sample uses a cube map to represent the displacement over the sphere's surface (see below for details). We identify the metavoxels affected by the volume primitives, compute the color and density of the affected voxels, propagate lighting, and ray-march the result from the eye's point of view.

The sample also makes efficient use of memory. Consider the volume primitives as a compressed description of the volume's contents. The algorithm effectively decompresses them on the fly, iterating between populating and ray-marching metavoxels. It can perform this switch for every metavoxel. But, switching between filling and ray marching has costs (e.g., changing shaders), so the algorithm supports filling a list of metavoxels before switching to ray-marching them. It allocates a relatively small array of metavoxels, reusing them as needed to process the total volume. Note also that many typical use cases occupy only a small percentage of the total volume in the first place.

Figure 1. Particles, Voxels, Metavoxels, and the system's various coordinate systems

Figure 1 shows some of the system's participants: voxels, metavoxels, a volume-primitive particle, a light, a camera, and the world. Each has a reference frame, defined by a position P, an up vector U, and a right vector R. The real system is 3D, but Figure 1 is simplified to 2D for clarity.

Algorithm Overview

// Render shadow map
foreach scene model visible from light
    draw model from light view

// Render eye-view Z-Prepass
foreach scene model visible from eye
    draw model from eye view

// Bin particles
foreach particle
    foreach metavoxel covered by the particle
        append particle to metavoxel's particle list

// Draw metavoxels to eye-view render target
foreach non-empty metavoxel
    fill metavoxel with binned particles and shadow map as input
    ray march metavoxel from eye point of view, with depth buffer as input

// Render scene to back buffer
foreach scene model
    draw model from eye view

// Composite eye-view render target with back buffer
draw full-screen sprite with eye-view render target as texture

Note that the sample supports filling multiple metavoxels before ray-marching them. It fills a "cache" of metavoxels, then ray-marches them, repeating until it finishes processing all non-empty metavoxels. It also supports filling the metavoxels only every nth frame, or filling them only once. If an app needs only a static or slowly changing volume, it can run significantly faster by not updating the metavoxels every frame.

Filling the volume

The sample fills the metavoxels with the particles that cover them. "Covered" means the particles' bounds intersect the metavoxel's bounds. The sample avoids processing empty metavoxels. It stores color and density in each of the metavoxel's voxels (color in RGB, and density in alpha). It performs this work in a pixel shader. It writes to the 3D texture as a RWTexture3D Unordered Access View (UAV). The sample draws a two-triangle 2D square, sized to match the metavoxel (e.g., 32x32 pixels for a 32x32x32 metavoxel). The pixel shader loops over each voxel in the corresponding voxel column, computing each voxel's density and lit color from the particles that cover the metavoxel.

The sample determines if each voxel is inside each particle. The 2D diagram shows a simplified particle. (The diagram shows a radially displaced 2D circle. The 3D system implements radially displaced spheres.) Figure 1 shows the particle's bounding radius rP and its displaced radius rPD. The particle covers the voxel if the distance between the particle's center PP and the voxel's center PV is less than the displaced distance rPD. For example: the voxel at PVI is inside the particle, while the voxel at PVO is outside.

Inside = |PV - PP| < rPD
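The coverage test can be sketched in Python (illustrative only; the sample evaluates the cube-map displacement in HLSL, so here the displaced radius rPD is passed in directly):

```python
import math

def voxel_inside_particle(voxel_center, particle_center, displaced_radius):
    """A voxel is inside the particle when the distance from the
    particle's center PP to the voxel's center PV is less than the
    displaced radius rPD toward that voxel. The cube-map displacement
    lookup is replaced here by a caller-supplied radius."""
    deltas = [v - p for v, p in zip(voxel_center, particle_center)]
    distance = math.sqrt(sum(d * d for d in deltas))
    return distance < displaced_radius
```

A voxel half a unit from the center of a unit-radius particle is inside; one two units away is outside.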

Figure 2. Relationship between shadow Z values and indices

The shadowIndex varies from 0 to METAVOXEL_WIDTH as the shadow value varies from the top of the metavoxel to the bottom. Metavoxel local space is centered at (0, 0, 0) and ranges from -1.0 to 1.0. So, the top is at (0, 0, -1), and the bottom is at (0, 0, 1). Transforming to light/shadow space gives:

Top = (LightWorldViewProjection._m23 - LightWorldViewProjection._m22)

Bottom = (LightWorldViewProjection._m23 + LightWorldViewProjection._m22)

Resulting in the following shader code in FillVolumePixelShader.fx:

float shadowZ = _Shadow.Sample(ShadowSampler, lightUv.xy).r;
float startShadowZ = LightWorldViewProjection._m23 - LightWorldViewProjection._m22;
float endShadowZ = LightWorldViewProjection._m23 + LightWorldViewProjection._m22;
uint shadowIndex = METAVOXEL_WIDTH*(shadowZ-startShadowZ)/(endShadowZ-startShadowZ);

Note that the metavoxels are cubes, so METAVOXEL_WIDTH is also the height and depth.
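The index mapping in that shader snippet can be transliterated into Python to show the endpoint behavior (values here are illustrative):

```python
METAVOXEL_WIDTH = 32  # voxels along one edge of a metavoxel

def shadow_index(shadow_z, start_shadow_z, end_shadow_z):
    """Map a sampled shadow-map Z into a voxel index along the
    metavoxel's light-space axis: startShadowZ maps to 0 and
    endShadowZ maps to METAVOXEL_WIDTH. The int() truncation
    mirrors the shader's uint cast."""
    t = (shadow_z - start_shadow_z) / (end_shadow_z - start_shadow_z)
    return int(METAVOXEL_WIDTH * t)
```

A shadow Z at the top of the metavoxel yields index 0; one at the bottom yields METAVOXEL_WIDTH.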

Light-propagation texture

In addition to computing each voxel's color and density, FillVolumePixelShader.fx writes the final propagated light value to a light-propagation texture. The sample refers to this texture by the name "$PropagateLighting". It's a 2D texture, covering the whole volume. For example, the sample, as configured for a 1024³ volume (32³ metavoxels, each with 32³ voxels), would have a 1024x1024 (32*32=1024) light-propagation texture. There are two twists: this texture includes space for each metavoxel's one-voxel border, and the value stored in the texture is the last non-shadowed value.

Each metavoxel maintains a one-voxel border so texture filtering works (when sampling during the eye-view ray march). The sample casts shadows from the volume onto the rest of the scene by projecting the light-propagation texture onto the scene. A simple projection would show visual artifacts where the texture duplicates values to support the one-voxel border. It avoids these artifacts by adjusting the texture coordinates to accommodate the one-voxel border. Here's the code (from DefaultShader.fx):

float oneVoxelBorderAdjust = ((float)(METAVOXEL_WIDTH-2)/(float)METAVOXEL_WIDTH);
float2 uvVol = input.VolumeUv.xy * 0.5f + 0.5f;
float2 uvMetavoxel = uvVol * WIDTH_IN_METAVOXELS;
int2 uvInt = int2(uvMetavoxel);
float2 uvOffset = uvMetavoxel - (float2)uvInt - 0.5f;
float2 lightPropagationUv = ((float2)uvInt + 0.5f + uvOffset * oneVoxelBorderAdjust)
                          * (1.0f/(float)WIDTH_IN_METAVOXELS);

The light-propagation texture stores the light value at the last voxel that isn't in shadow. Once the light propagation process encounters the shadowing surface, the propagated lighting goes to 0 (no light propagates past the shadow caster). But, storing the last light value allows us to use the texture as a light map. Projecting this last-lighting value onto the scene means the shadow casting surface receives the expected lighting value. Surfaces that are in shadow effectively ignore this texture.
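The "last non-shadowed value" idea can be sketched per voxel column in Python (illustrative; the sample does this in FillVolumePixelShader.fx while propagating light through the column):

```python
def last_lit_value(column_light):
    """Walk a voxel column from the light inward. Light propagates
    until it reaches the shadow caster, where it drops to zero; the
    value written to the light-propagation texture is the last
    non-zero light seen. `column_light` lists per-voxel propagated
    light, ordered from nearest-to-light to farthest."""
    last = column_light[0] if column_light else 0.0
    for light in column_light:
        if light <= 0.0:
            break  # shadow caster reached; keep the previous value
        last = light
    return last
```

For a column whose propagated light falls 1.0, 0.7, 0.5 and then hits the caster, the texture stores 0.5, so the casting surface receives the expected lighting when the texture is projected onto the scene.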

Ray Marching

Figure 3. Eye-View Ray March

The sample ray-marches the metavoxel with a pixel shader (EyeViewRayMarch.fx), sampling from the 3D texture as a Shader Resource View (SRV). It marches each ray from far to near with respect to the eye. It performs filtered samples from the metavoxel's corresponding 3D texture. Each sampled color adds to the final color, while each sampled density occludes the final color and the final alpha.

blend = 1/(1 + density)

color_result = color_result * blend + color * (1 - blend)

alpha_result = alpha_result * blend
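These update rules can be written as a far-to-near loop; here is an illustrative Python sketch (the sample implements this in EyeViewRayMarch.fx):

```python
def ray_march(samples):
    """Accumulate (color, density) samples ordered far to near.
    Each sample's density occludes everything behind it: the blend
    factor scales down both the accumulated color and the alpha,
    while the sample's own color is blended in."""
    color, alpha = 0.0, 1.0
    for sample_color, density in samples:
        blend = 1.0 / (1.0 + density)
        color = color * blend + sample_color * (1.0 - blend)
        alpha *= blend
    return color, alpha
```

A single sample with density 1 contributes half its color and halves the transmitted alpha; a nearer zero-color sample of the same density then occludes half of that again.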

The sample processes each metavoxel independently. It ray-marches each metavoxel one at a time, blending the results with the eye-view render target to generate the combined result. It marches each metavoxel by drawing a cube (i.e., 12 triangles) from the eye's point of view. The pixel shader marches a ray through each pixel covered by the cube. It renders the cube with front-face culling so the pixel shader executes only once for each covered pixel. If it rendered without culling, then each ray could be marched twice: once for the front faces and once for the back faces. If it rendered with back-face culling, then when the camera is inside the cube, the pixels would be culled, and the rays wouldn't be marched.

The simple example in Figure 3 shows two rays marched through four metavoxels. It illustrates how the ray steps are distributed along each ray. The distance between the steps is the same when projected onto the look vector. That means the steps are longer for off-axis rays. This approach empirically yielded the best-looking results (versus, for example, equal steps for all rays). Note the sampling points all start on the far plane and not on the metavoxel's back surface. This matches how they would be sampled for a monolithic volume, without the concept of metavoxels. Starting the ray march on each metavoxel's back surface resulted in visible seams at metavoxel boundaries.

Figure 3 also shows how the samples land in the different metavoxels. The gray samples are outside all metavoxels. The red, green, and blue samples land in separate metavoxels.
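The step-spacing rule described above (equal spacing when projected onto the look vector) can be sketched in Python; the function name is illustrative:

```python
def per_ray_step_length(step_along_look, ray_dir, look_dir):
    """Steps are spaced equally when projected onto the look vector,
    so an off-axis ray takes correspondingly longer steps: divide the
    on-axis spacing by the cosine of the angle between the (unit) ray
    direction and the (unit) look direction."""
    cos_theta = sum(r * l for r, l in zip(ray_dir, look_dir))
    return step_along_look / cos_theta
```

A ray straight down the look vector keeps the base spacing, while a ray 60 degrees off-axis doubles it.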

Depth test

The ray-march shader honors the depth buffer by truncating the rays against the depth buffer. It does this efficiently by beginning the ray march at the first ray step that would pass the depth test.

Figure 4. Relationship between depths and indices

The relationship between depth values and ray-march indices is shown in Figure 4. It reads the Z value from the Z buffer and computes the corresponding depth value (i.e., distance from the eye). The indices vary proportionally from 0 to totalRaymarchCount as the depth varies from zMax to zMin. Resulting in this code (from EyeViewRayMarch.fx):

float depthBuffer = DepthBuffer.Sample( SAMPLER0, screenPosUV ).r;
float div = Near/(Near-Far);
float depth = (Far*div)/(div-depthBuffer);
uint indexAtDepth = uint(totalRaymarchCount * (depth-zMax)/(zMin-zMax));

Here zMin and zMax are the depth values at the ray-march extents: zMax at the point furthest from the eye, and zMin at the closest point.
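A direct Python transliteration of that snippet (constants illustrative) shows the endpoint behavior of both steps:

```python
def eye_depth_from_zbuffer(depth_buffer, near, far):
    """Recover the eye-space distance from a sampled hardware Z value,
    matching the shader snippet above."""
    div = near / (near - far)
    return (far * div) / (div - depth_buffer)

def index_at_depth(depth, z_min, z_max, total_raymarch_count):
    """Map a depth to a ray-march index: 0 at the far extent (zMax),
    totalRaymarchCount at the near extent (zMin). The int() truncation
    mirrors the shader's uint cast."""
    return int(total_raymarch_count * (depth - z_max) / (z_min - z_max))
```

The ray march then begins at the first step whose index passes the depth test, truncating the ray against the scene.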


Sort Order

Metavoxel rendering honors two sort orders: one for the light and one for the eye. Light propagation starts at the metavoxels closest to the light and progresses through more distant metavoxels. The metavoxels are semi-transparent, so correct results also require sorting from the eye view. The two choices when sorting from the eye view are: back to front with "over" alpha blending and front to back with "under" alpha blending.

Figure 5. Metavoxel sort order

Figure 5 shows a simple arrangement of three metavoxels, a camera for the eye, and a light. Light propagation requires an order of 1, 2, 3; we must propagate the lighting through metavoxel 1 to know how much light makes it to metavoxel 2. And, we must propagate light through metavoxel 2 to know how much light makes it to metavoxel 3.

We also need to sort the metavoxels from the eye's point of view. If rendering front to back, we would render metavoxel 2 first. The blue and purple lines show how metavoxels 1 and 3 are behind metavoxel 2. We need to propagate lighting through metavoxels 1 and 2 before we can render metavoxel 3. The worst case requires propagating lighting through the entire column before rendering any of them. Rendering that case back-to-front allows us to render each metavoxel immediately after propagating its lighting.

The sample combines both back-to-front and front-to-back sorting to support the ability to render metavoxels immediately after propagating lighting. The sample renders metavoxels above the perpendicular (i.e., the green line) back-to-front with over-blending, followed by the metavoxels below the perpendicular front-to-back with under-blending. This ordering produces the correct results without requiring enough memory to hold an entire column of metavoxels. Note that the algorithm could always sort front-to-back with under-blending if the app can commit enough memory.

Alpha Blending

The sample uses over-blending for the metavoxels that are sorted back-to-front (i.e., the most-distant metavoxel is rendered first, followed by successively closer metavoxels). It uses under-blending for the metavoxels that are sorted front-to-back (i.e., the closest metavoxel is rendered first, with more-distant metavoxels rendered behind).

Over-blend: Color_dest = Color_dest * Alpha_src + Color_src

Under-blend: Color_dest = Color_src * Alpha_dest + Color_dest

The sample blends the alpha channel the same for both over- and under-blending; they both simply scale the destination alpha by the pixel shader alpha.

Alpha_dest = Alpha_dest * Alpha_src
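As a sanity check on these equations, here is an illustrative Python sketch (not from the sample) showing that back-to-front over-blending and front-to-back under-blending produce the same composite for premultiplied layers:

```python
def over_blend(dest, src):
    """Back-to-front 'over': dest and src are (premultiplied color, alpha)."""
    dc, da = dest
    sc, sa = src
    return (dc * sa + sc, da * sa)

def under_blend(dest, src):
    """Front-to-back 'under': the new layer slots in behind what is there."""
    dc, da = dest
    sc, sa = src
    return (sc * da + dc, da * sa)

def composite(layers_front_to_back, use_under):
    """Composite layers starting from an empty target (color 0, alpha 1,
    meaning nothing has been occluded yet)."""
    result = (0.0, 1.0)
    order = layers_front_to_back if use_under else reversed(layers_front_to_back)
    blend = under_blend if use_under else over_blend
    for layer in order:
        result = blend(result, layer)
    return result
```

For a near layer (0.25, 0.5) in front of a far layer (0.5, 0.25), both orders yield the same result, which is why the sample can mix the two schemes on either side of the perpendicular.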

The following are the render states used for over- and under-blending.

Over-Blend Render States:

SrcBlend = D3D11_BLEND_ONE
DestBlend = D3D11_BLEND_SRC_ALPHA
BlendOp = D3D11_BLEND_OP_ADD
SrcBlendAlpha = D3D11_BLEND_ZERO
DestBlendAlpha = D3D11_BLEND_SRC_ALPHA
BlendOpAlpha = D3D11_BLEND_OP_ADD

Under-Blend Render States:


SrcBlend = D3D11_BLEND_DEST_ALPHA
DestBlend = D3D11_BLEND_ONE
BlendOp = D3D11_BLEND_OP_ADD
SrcBlendAlpha = D3D11_BLEND_ZERO
DestBlendAlpha = D3D11_BLEND_SRC_ALPHA
BlendOpAlpha = D3D11_BLEND_OP_ADD


The result of the eye-view ray march is a texture with a pre-multiplied alpha channel. We draw a full screen sprite with alpha blending enabled to composite with the back buffer.

Color_dest = Color_dest * Alpha_src + Color_src

The render states are:

SrcBlend = D3D11_BLEND_ONE
DestBlend = D3D11_BLEND_SRC_ALPHA

The sample supports having an eye-view render target with a different resolution from the back buffer. A smaller render target can significantly improve performance as it reduces the total number of rays marched. However, when the render target is smaller than the back buffer, the naive up-sampling performed during the composite step can generate cracks around silhouette edges. This is a common issue, left for future work; a more careful up-sampling filter in the composite step could address it.

Known Issues

If the eye-view render target is smaller than the back buffer, cracks appear when compositing.

Resolution-mismatch between the light-propagation texture and the shadow map results in cracks.


