**Combining OIT and volumetric fog**

2025.02.13 first publication
2025.02.13 why a single accumulator is wrong, updated conclusion
2025.06.24 more screenshots

# Introduction

## Context and goal

*Challenge Quake 3* (CNQ3), the official *Quake III Arena* engine of the *Challenge ProMode Arena* (CPMA) mod, has a cinematic rendering pipeline (CRP) that is all about pushing pretty pixels for fragmovies, screenshots and promotional material.

When I started working on volumetric lighting for the CRP, order-independent transparency (OIT) was already implemented. The goal was to make the new unified volumetric lighting system work with the existing OIT system and have the existing OIT resolve shader handle the volumetric lighting.

## Preview

The rocket explosion (additive blend) is between the 2 furthest green health bubbles (standard blend). It therefore "lights up" the furthest bubble, while the closest 2 bubbles fully block light from it, as expected.

You can find [more screenshots here](../q3_retrospective_part_2/#workdoneoncnq3/thecrp/volumetrics) showing much more interesting scenarios that involve volumetric lighting.

# Volumetric lighting recap

Volumetric lighting simulates light interactions with participating media. A participating medium is any volume that affects light transport within it. This includes volumes such as fog, smoke, dust, clouds, liquids, fire, explosions, etc.

## Cases

There are 3 main phenomena that need to be accounted for:

| Phenomenon | Description |
|:---|:---|
| Scattering | Light hits a particle and changes direction |
| Absorption | Light hits a particle and dissipates as heat |
| Emission | Light is emitted by a hot particle |

If we look specifically at what can happen along a given line segment in 3D space, we're looking at 4 distinct cases:

- Emission generates light headed towards the destination:

************************************************************************************************
*                                                                                              *
*  o-------------------------------------<----------*--------------------------------o        *
* end                                particle as light source                      start       *
*                                                                                              *
************************************************************************************************

- In-scattering redirects light towards the destination:

************************************************************************************************
*                                                                                              *
*                                                      light source                            *
*                                                     /                                        *
*                                                    /                                         *
*                                                   v                                          *
*  o-------------------------------------<----------o--------------------------------o        *
* end                                             particle                         start       *
*                                                                                              *
************************************************************************************************

- Out-scattering redirects light that was headed towards the destination:

************************************************************************************************
*                                                                                              *
*                                                ^                                             *
*                                                 \                                            *
*                                                  \                                           *
*                                                   \      light source                        *
*  o------------------------------------------------o<----------*--------------------o        *
* end                                             particle                         start       *
*                                                                                              *
************************************************************************************************

- Absorption dissipates the light as heat:

************************************************************************************************
*                                                                                              *
*                                                          light source                        *
*  o------------------------------------------------o<----------*--------------------o        *
* end                                             particle                         start       *
*                                                                                              *
************************************************************************************************

These 4 scenarios can be split into 2 categories:

- In-scattering and emission add energy with respect to the measurement at the end-point.
  In practice, light will be added.
- Out-scattering and absorption remove energy with respect to the measurement at the end-point.
  In practice, light will be multiplied by a normalized scaling factor called transmittance.
More on that in a bit.

## Properties

Participating media have a few properties that are used to describe their interaction with light:

| | Property | Description |
|:---|:---|:---|
| $\sigma_{s}$ | Scattering coefficient | How strongly light gets scattered |
| $\sigma_{a}$ | Absorption coefficient | How strongly light gets dissipated as heat |
| $\sigma_{t}$ | Extinction coefficient | How strongly light gets scattered and/or dissipated as heat: $\sigma_{t} = \sigma_{s} + \sigma_{a}$ |
| $g$ | Anisotropy | How strongly light bounces in a forward/backward direction after hitting a particle |

The anisotropy factor g is a number between -1 and 1:

| g | Description | Note |
|---:|:---|:---|
| 1 | Peak forward scattering | Peak anisotropy |
| 0 | Scattering in all directions equally | Fully isotropic |
| -1 | Peak backward scattering | Peak anisotropy |

## Transmittance

For our use case, transmittance is the percentage of incoming light that makes it through the participating medium over a given line segment. The transmittance over a line segment of length d with an absorption/scattering/extinction coefficient $\sigma$ is:

$ T(d, \sigma) = e^{- \sigma d} $

This formula is derived from the Beer-Lambert law.

Suppose we compute the transmittance for absorption and scattering separately:

- $ T_{a} = T(d, \sigma_{a}) = e^{- \sigma_{a} d} $
- $ T_{s} = T(d, \sigma_{s}) = e^{- \sigma_{s} d} $

The transmittance that accounts for both absorption and scattering is:

\begin{equation}
\begin{split}
T_{e} & = T_{a} T_{s} \\
& = T(d, \sigma_{a}) T(d, \sigma_{s}) \\
& = e^{- \sigma_{a} d} e^{- \sigma_{s} d} \\
& = e^{- \sigma_{a} d - \sigma_{s} d} \\
& = e^{- (\sigma_{a} + \sigma_{s}) d}
\end{split}
\end{equation}

This shows us where the following formula came from:

$ \sigma_{t} = \sigma_{s} + \sigma_{a} $

## Scattering models

Depending on particle sizes and composition, light interacts with participating media differently from a statistical point of view.

| Model | Particle size | Anisotropy | Absorption | Participating media |
|:---|:---|:---|:---|:---|
| Mie scattering | Large particles | High anisotropy | High absorption | Thick fog, smoke |
| Rayleigh scattering | Small particles | Low anisotropy | Low absorption | Air, atmosphere |

## Phase functions

Phase functions are similar to BSDFs: they compute what percentage of light is scattered in a given direction. For Mie scattering with the Henyey-Greenstein function, 2 arguments are needed:

- The angle between the incoming light and the outgoing direction of interest.
- The anisotropy factor g.

Phase functions have to obey the law of conservation of energy. That is, the integral of a phase function over the entire sphere must be 1.

## Formula

Let's simplify things by ignoring emissive participating media and looking at light coming from the following light interactions:

- Some light is reflected off an opaque surface.
- Some light is in-scattered by a local point.

********************************************************************************************
*                                                                                          *
*  Near clip plane                            Light source                                 *
*       |                                            *                                     *
*       |                  /|\  Light source                                               *
*       |                 / | \                                                            *
*       |                /  |  \      \                                                    *
*       |               /   |   \      \                                                   *
*    /  |              v    v    v      v                                                  *
*  +)   |<------------------------------<---o-<---o-<---o-------------------<---o         *
*    \  |                           Particles                   Opaque surface             *
* Camera|                                                                                  *
*       |<---------------------------------------------------------------------->         *
*       |                           d                                                      *
*       |                                                                                  *
********************************************************************************************

\begin{equation}
Li = T(d)Lr(d) + \int_{0}^{d} T(x)Ls(x)dx
\end{equation}

Where:

- T(d) is the transmittance of a segment of length d
- Lr(d) is the amount of light reflected towards the camera by the opaque surface at distance d
- Ls(x) is the amount of light in-scattered towards the camera at distance x

The first term doesn't need to be an integral since the light is reflected at a specific point. The second term needs the integral form since the in-scattered light and the extinction coefficient of the participating medium might be different at each point along the segment.

The final color in this scenario is computed as `C*T + S`, where:

- C is the opaque surface's color
- T is the transmittance between the opaque surface's position and the near clip plane
- S is the in-scattered light between the opaque surface's position and the near clip plane
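Before moving on: the compute shader snippets further down call `Transmittance` and `HenyeyGreenstein` without showing them. Here's a minimal HLSL sketch of what these two helpers can look like (the actual CNQ3 code may differ; the `M_PI` define and the `max` clamp are my own additions):

```hlsl
#define M_PI 3.14159265359

// Beer-Lambert transmittance over a segment of length d
float Transmittance(float d, float sigma)
{
	return exp(-sigma * d);
}

// Henyey-Greenstein phase function,
// normalized so it integrates to 1 over the unit sphere
float HenyeyGreenstein(float cosTheta, float g)
{
	float g2 = g * g;
	float denom = 1.0 + g2 - 2.0 * g * cosTheta;
	return (1.0 - g2) / (4.0 * M_PI * pow(max(denom, 0.0001), 1.5));
}
```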
# Rendering techniques recap

## Per-pixel linked-list OIT

The core idea:

- Opaque surfaces and the skybox are rendered first.
- Transparent surfaces are forward rendered, but instead of writing to a render target, each fragment's information is stored in a per-pixel linked list.
- A pixel or compute shader reads each pixel's linked list into a local fragment array, sorts the fragments back to front and then applies each fragment's contribution in order.

How is the per-pixel linked list stored?

- The *fragment buffer* contains all fragments of the scene view (i.e. a big array of fragments).
- The *counter buffer* has a fragment counter used to allocate fragments from the fragment buffer (using atomic increments).
- The 2D R32_UINT *index texture* stores the first index of each pixel's linked list (indexing into the fragment buffer).

```hlsl
struct OIT_Counter
{
	uint fragmentCount;    // number of allocated fragments
	uint maxFragmentCount; // size of the fragment buffer
	uint overflowCount;    // how many fragments did we try to allocate past the end of the buffer?
	                       // this is useful to know when the buffer size needs a bump
};

struct OIT_Fragment
{
	uint packedColor;
	float viewDepth;        // higher = further from the camera
	uint blendMode;         // the source and destination blend modes
	uint nextFragmentIndex; // index into the fragment buffer, 0 when last in the list
	// extra stuff: material index, depth fade settings, ...
};
```

Pixel shader logic used when forward rendering transparent surfaces:

```hlsl
RWStructuredBuffer<OIT_Counter> counterBuffer = ResourceDescriptorHeap[counterBufferIndex];

// run the usual forward rendering shading logic to compute a color
float4 color = ComputeFinalColor();

// allocate a new fragment
uint fragmentIndex;
InterlockedAdd(counterBuffer[0].fragmentCount, 1, fragmentIndex);
if(fragmentIndex < counterBuffer[0].maxFragmentCount) // did allocation succeed?
{
	RWTexture2D<uint> indexTexture = ResourceDescriptorHeap[indexTextureIndex];
	RWStructuredBuffer<OIT_Fragment> fragmentBuffer = ResourceDescriptorHeap[fragmentBufferIndex];

	// insert the new fragment at the start of the linked list
	uint prevFragmentIndex;
	InterlockedExchange(indexTexture[int2(input.position.xy)], fragmentIndex, prevFragmentIndex);

	// write the payload
	OIT_Fragment fragment;
	fragment.packedColor = PackColor(color);
	fragment.viewDepth = input.depthVS; // from the vertex shader
	fragment.blendMode = blendMode; // from root constants
	fragment.nextFragmentIndex = prevFragmentIndex; // connect our new fragment to the linked list
	fragmentBuffer[fragmentIndex] = fragment;
}
```
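The `PackColor`/`UnpackColor` pair isn't shown anywhere in this post. To keep the snippets self-contained, here's a hypothetical RGBA8 implementation; the engine's actual encoding may well be different (e.g. something with more precision for HDR colors):

```hlsl
// hypothetical 32-bit RGBA8 packing; the real encoding may differ
uint PackColor(float4 color)
{
	uint4 b = uint4(saturate(color) * 255.0 + 0.5);
	return b.r | (b.g << 8) | (b.b << 16) | (b.a << 24);
}

float4 UnpackColor(uint packed)
{
	uint4 b = uint4(packed, packed >> 8, packed >> 16, packed >> 24) & 0xFF;
	return float4(b) / 255.0;
}
```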
"Full-screen" compute/pixel shader logic used to apply transparent fragments to each pixel:

```hlsl
Texture2D<uint> indexTexture = ResourceDescriptorHeap[indexTextureIndex];
StructuredBuffer<OIT_Fragment> fragmentBuffer = ResourceDescriptorHeap[fragmentBufferIndex];
Texture2D backgroundTexture = ResourceDescriptorHeap[backgroundTextureIndex];

// grab this pixel's fragments
uint fragmentIndex = indexTexture[tcPx.xy];
OIT_Fragment sorted[OIT_MAX_FRAGMENTS_PER_PIXEL];
uint fragmentCount = 0;
while(fragmentIndex != 0 && fragmentCount < OIT_MAX_FRAGMENTS_PER_PIXEL)
{
	sorted[fragmentCount] = fragmentBuffer[fragmentIndex];
	fragmentIndex = sorted[fragmentCount].nextFragmentIndex;
	fragmentCount++;
}

// sort the fragments using an insertion sort
// ...exercise left to the reader...

// apply fragments
float4 color = backgroundTexture[tcPx.xy]; // opaque/skybox background color
for(uint i = 0; i < fragmentCount; ++i)
{
	OIT_Fragment fragment = sorted[i];
	color = Blend(UnpackColor(fragment.packedColor), color, fragment.blendMode);
}

return color;
```
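If you'd rather not do the exercise, a straightforward insertion sort along these lines does the job (a sketch that only orders by depth; the real shader also needs the tie-breaking rules mentioned near the end of this post):

```hlsl
// sort back to front: largest viewDepth first, so the furthest fragment
// gets blended onto the opaque background before the closer ones
for(uint i = 1; i < fragmentCount; ++i)
{
	OIT_Fragment key = sorted[i];
	uint j = i;
	while(j > 0 && sorted[j - 1].viewDepth < key.viewDepth)
	{
		sorted[j] = sorted[j - 1];
		j--;
	}
	sorted[j] = key;
}
```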
Before you scream in horror at me for using this method, I'd like to point out a few things:

- There is 25+ years of user-generated content to support that can use any of 72 blend modes, toggle depth writes on/off and set the depth test to lequal or equal. They're not all unique because some modes are equivalent, e.g. `(dst_color, zero)` is the same as `(zero, src_color)`.
- The memory and GPU overheads of this approach are nowhere close to being an issue in practice. There aren't that many transparency layers per pixel in Q3 content because the original engine didn't handle transparency well.
- I don't see how other OIT techniques would offer a better balance of image quality and performance. If you do, please let me know.

Reference:

- *Real-Time Concurrent Linked List Construction on the GPU* by Jason C. Yang, Justin Hensley, Holger Grün, Nicolas Thibieroz

## Clip-space volumetric lighting

The core idea:

- A 3D grid divides the clip-space volume into voxels that are frustum-shaped in world-space. A frustum-shaped voxel is called a froxel.
- A series of 3D compute dispatches (1 thread per froxel) writes to a RGBA_FLOAT16 3D texture that contains the in-scattered light (RGB) and extinction coefficient (A) of each froxel.
- A single 2D dispatch (1 thread per froxel on the X/Y plane) does the final raymarch, traversing the grid along the Z axis and writing to a second RGBA_FLOAT16 3D texture. The output texture has the total in-scattered light (RGB) and final transmittance (A) from each froxel's center to the near clip plane.
- To apply the fog to opaque surfaces:
	- The view-space depth of a pixel is fetched.
	- The corresponding Z slice of the 3D texture is computed.
	- Using the pixel's X/Y coordinates and the Z slice, transmittance and in-scattering are fetched from the raymarched 3D texture.
	- The final color is computed as `C*T + S`, where:
		- C is the opaque surface's color
		- T is the transmittance between the opaque surface's position and the near clip plane
		- S is the in-scattered light between the opaque surface's position and the near clip plane

3D compute shader (`width * height * depth` compute threads) logic to evaluate a fog and scatter sunlight:

```hlsl
RWTexture3D scatterExtTexture = ResourceDescriptorHeap[scatterExtTextureIndex];
SceneView sceneView = GetSceneView(sceneViewIndex);

uint3 textureIndexPx = ...;
float3 scattering;
float extinction;
float anisotropy;
EvaluateHeightFog(scattering, extinction, anisotropy, textureIndexPx);

float vis = EvaluateSunVisibility(textureIndexPx); // e.g. exponential shadow map
float3 cameraRay = sceneView.CameraRay(...);
float cosTheta = dot(-scene.sunDirection, -cameraRay);
float phase = HenyeyGreenstein(cosTheta, anisotropy);
float3 inScatteredLight = vis * scene.sunColor * scene.sunIntensityVL * scattering * phase;
scatterExtTexture[textureIndexPx] = float4(inScatteredLight, extinction);
```

2D compute shader (`width * height` compute threads) logic to raymarch the final 3D texture:

```hlsl
Texture3D scatterExtTexture = ResourceDescriptorHeap[scatterExtTextureIndex];
RWTexture3D resolveTexture = ResourceDescriptorHeap[resolveTextureIndex];
SceneView sceneView = GetSceneView(sceneViewIndex);

float3 accumScatter = ...;
float accumTrans = ...;

// write out resolveTexture at depth slice 0
// ...

// write out resolveTexture at depth slices 1+
for(uint z = 1; z < textureSize.z; z++)
{
	uint3 textureIndexPx = ...;
	float3 froxelScatter;
	float froxelExtinction;
	GetScatterExt(froxelScatter, froxelExtinction, scatterExtTexture[textureIndexPx]);

	float3 currPosition = sceneView.ComputeWorldSpacePosition(..., z);
	float3 prevPosition = sceneView.ComputeWorldSpacePosition(..., z - 1);
	float depthStep = distance(currPosition, prevPosition);
	float froxelTrans = Transmittance(depthStep, froxelExtinction);

	// Frostbite's integration formula
	float froxelTransInteg = (1.0 - froxelTrans) / (froxelExtinction == 0.0 ? 1.0 : froxelExtinction);
	accumScatter += (accumTrans * froxelTransInteg) * froxelScatter;
	accumTrans *= froxelTrans;
	resolveTexture[textureIndexPx] = float4(accumScatter, accumTrans);
}
```
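To make the "apply the fog to opaque surfaces" steps above concrete, here's a sketch of what the per-pixel fetch can look like. This is not CNQ3's actual code: `ViewDepthToTextureZ` (which inverts the depth-slice distribution), `tc01`, `viewDepth`, `opaqueColor` and the sampler are all hypothetical names:

```hlsl
Texture3D resolveTexture = ResourceDescriptorHeap[resolveTextureIndex];

// map the pixel's view-space depth to a Z slice coordinate (engine-specific, usually non-linear)
float z = ViewDepthToTextureZ(viewDepth);
float4 st = resolveTexture.SampleLevel(linearClampSampler, float3(tc01, z), 0);

// final color = C*T + S
float3 finalColor = opaqueColor * st.a + st.rgb;
```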
There are many other interesting ideas and improvements. Here are the important ones:

- The X/Y axes are sub-sampled (4x/8x are common).
- Depth distribution is non-linear (usually exponential) to improve accuracy near the camera. 64/128 depth slices are common.
- Temporal reprojection (jittering with low-discrepancy sequences + fetching data from previous texels + exponential moving averages) can be used to reduce aliasing.
	- This is done on visibility data (e.g. sun shadow), not the final raymarched texture.
	- It is easier to do here than with pure 2D techniques as the reprojection is in world-space.
- A V-Buffer stores participating media properties (scattering/absorption coefficients, anisotropy, etc.):
	- Fog volumes, volumetric particles and VDBs can be injected separately.
	- There is no need to loop over every object of every type in one massive, inefficient shader.
- Light injection can also be done in separate passes: ambient light, sunlight, local point lights, etc.
- Self-shadowing can be achieved by injecting fogs/particles/VDBs into *extinction volumes* (extinction coefficients), from which *transmittance maps* (aka volumetric shadow maps) can be derived.
	- Transmittance maps can be AABBs at the light's position (point lights) or oriented cascades at the camera's position (directional lights like the sun).
	- A transmittance map contains the transmittance from the light source to the voxel's center.

References:

- *Volumetric Fog: Unified Compute Shader-Based Solution to Atmospheric Scattering* by Bart Wronski
- *Physically Based and Unified Volumetric Rendering in Frostbite* by Sébastien Hillaire
- *The Road toward Unified Rendering with Unity's HDRP* by Sébastien Lagarde and Evgenii Golubev
- *Volumetric Fog of The Last of Us Part II* by Artem Kovalovs

# Available data

What are we starting with?

- The existing OIT resolve shader already has access to the fragment index texture and fragment buffer.
- The resolve shader now also needs access to the raymarched volumetric light 3D texture (RGB: in-scattered light, A: transmittance).

Remember that each texel stores the in-scattered light and transmittance values between the froxel's center and the near clip plane. Thinking about the problem, it becomes obvious that to handle transparency, we need the ability to know the in-scattering and transmittance values between any two depth values.

The texture data contains in-scattered light and transmittance for the following useful ranges:

************************************************************************************************
*                                                                                              *
*  Near clip plane                                                                             *
*       |                                                                                      *
*    /  |           Fragment 2        Fragment 1        Background                             *
*  +)   |┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄o┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄o┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄o                                  *
*    \  |<------------------------------------------------->  Sb,Tb                           *
* Camera|<-------------------------------->  S1,T1                                            *
*       |<--------------->  S2,T2                                                             *
*       |                                                                                      *
*                                                                                              *
************************************************************************************************

But when resolving, we need data for the following ranges:

************************************************************************************************
*                                                                                              *
*  Near clip plane                                                                             *
*       |                                                                                      *
*    /  |           Fragment 2        Fragment 1        Background                             *
*  +)   |┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄o┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄o┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄o                                  *
*    \  |                                  <--------------->  Sb1,Tb1                          *
* Camera|                 <--------------->  S12,T12                                           *
*       |<--------------->  S2n,T2n                                                            *
*       |                                                                                      *
*                                                                                              *
************************************************************************************************

Because the plan is to do this:

- Assign the background color to the accumulator
- Apply volumetric lighting between the background and fragment #1
- Blend fragment #1
- Apply volumetric lighting between fragment #1 and fragment #2
- Blend fragment #2
- Apply volumetric lighting between fragment #2 and the near clip plane

Fortunately, the missing ranges are trivial to compute:

- In-scattering: `range = far - near`
- Transmittance: `range = far / near`

We can therefore compute the missing values like so:

- `Sb1 = Sb - S1`
- `Tb1 = Tb / T1`
- `S12 = S1 - S2`
- `T12 = T1 / T2`
- `S2n = S2`
- `T2n = T2`

Of course, we wouldn't go out dividing by zero like savages:

```hlsl
float3 GetRangeInScattering(float3 far, float3 near)
{
	return far - near;
}

float GetRangeTransmittance(float far, float near)
{
	return min(far / max(near, 0.000001), 1.0);
}
```

With just one more texture (and associated sampler), we are now ready to expand the OIT resolve shader.
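One last sanity check before diving in (my own restatement, not from the original post): why division for transmittance but subtraction for in-scattering? Transmittance multiplies across concatenated segments because the Beer-Lambert exponents add up, while the raymarched in-scattering values are all already expressed as light arriving at the near clip plane, so a sub-range's contribution is a plain difference. For the background-to-fragment-#1 range:

\begin{equation}
T_{b} = T_{1} T_{b1} \implies T_{b1} = \frac{T_{b}}{T_{1}}
\end{equation}

\begin{equation}
S_{b} = S_{1} + S_{b1} \implies S_{b1} = S_{b} - S_{1}
\end{equation}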
# First attempt

Here is the general approach taken in the resolve shader:

```C#
List fragments = GetPixelFragmentList();
SortFragmentsByDepth(fragments);

// insert the background (opaque / skybox) at index 0
// with a replace mode blend equation: (one, zero)
fragments.InsertFirst(opaqueFragment);

float3 color; // first fragment is opaque and replaces the destination
for(int i = 0; i < fragments.length; i++)
{
	// apply blending
	Fragment fragment = fragments[i];
	color = Blend(UnpackColor(fragment.packedColor), color, fragment.blendMode);

	// apply volumetric lighting between fragments
	float depthFar = fragment.viewDepth;
	// if no closer fragment is available,
	// FindDepthOfNextFragment returns the view depth of the near clip plane
	float depthNear = FindDepthOfNextFragment(i);
	float4 stFar = FetchInScatterTransmittance(depthFar);
	float4 stNear = FetchInScatterTransmittance(depthNear);
	float3 S = GetRangeInScattering(stFar.rgb, stNear.rgb);
	float T = GetRangeTransmittance(stFar.a, stNear.a);
	color = color * T + S;
}
```

Note that the equation used is `C*T + S` and not `(C+S) * T` because the transmittance of every froxel was already accounted for when the final raymarched in-scattered light was computed.

We're now set up for success. Let's run it!

![Oh no... what's happening?](vl1_s0.jpg)

Here's a closer look at it to make the artifacts clear:

You see these sharp edges? These are the edges of the rotated billboarded quad of the explosion and of the billboarded quad of a light beam on the ceiling.

But wait, hang on! These edge texels are black and the explosion is additive blended. Adding black to a color doesn't change it, so how can the edges be visible?

# The problem

At first, I thought I simply hadn't implemented the algorithm correctly because I'm obviously too much of a dummy to write code that works as expected. I wanted to convince myself that the approach can't possibly generate these artifacts, so I needed some level of proof that it can't happen.

Let's create 2 scenarios that should generate the exact same color:

1. No transparent fragment.
2. A single transparent fragment that does nothing (e.g. additive blending with a black color).
Let's assume the following definitions:

| Variable | Description |
|:---|:---|
| B | Opaque/background surface color |
| C | Color accumulator |
| D | Color accumulator #2 |
| S | In-scattered light from near clip plane to opaque surface |
| T | Transmittance from near clip plane to opaque surface |
| S' | In-scattered light from near clip plane to fragment |
| T' | Transmittance from near clip plane to fragment |
| F | Final computed color |

## 0 fragments

\begin{equation}
F0 = BT + S
\end{equation}

## 1 invisible fragment

************************************************************************************************
*                                                                                              *
*  Near clip plane                                                                             *
*       |                                                                                      *
*    /  |            Fragment        Background                                                *
*  +)   |┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄o┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄o                                                   *
*    \  |<--------------->  S',T' (in 3D texture)                                              *
* Camera|<-------------------------------->  S ,T  (in 3D texture)                             *
*       |                 <--------------->  S0,T0 (computed)                                  *
*       |<--------------->  S1,T1 (computed)                                                   *
*       |                                                                                      *
*                                                                                              *
************************************************************************************************

Apply volumetric lighting between the background and the fragment:

\begin{equation}
\begin{split}
C0 & = BT0 + S0 \\
& = B\frac{T}{T'} + (S - S')
\end{split}
\end{equation}

Blend the fragment:

\begin{equation}
\begin{split}
C1 & = Blend(blackColor, C0, additiveBlend) \\
& = C0 \\
& = B\frac{T}{T'} + (S - S')
\end{split}
\end{equation}

Apply volumetric lighting between the fragment and the near clip plane:

\begin{equation}
\begin{split}
F1 & = C1T1 + S1 \\
& = (B\frac{T}{T'} + (S - S'))T1 + S1 \\
& = (B\frac{T}{T'} + (S - S'))(\frac{T'}{1}) + (S' - 0) \\
& = (B\frac{T}{T'} + (S - S'))T' + S' \\
& = BT + ST' - S'T' + S'
\end{split}
\end{equation}

It's mismatched: we get `ST' - S'T' + S'` instead of `S`.

Where do the new constants 1 and 0 (`T1 = T'/1` and `S1 = S'-0`) come from?

- They're the raymarched transmittance and in-scattered light values *at* the near clip plane.
- Since we deal with raymarched results *from* the near clip plane, we have a raymarched depth range of 0.
- With a depth range of 0, no light can be out-scattered or dissipated into heat, so `T = 1`.
- With a depth range of 0, no light can be in-scattered, so `S = 0`.

Other than not being equal to F0, what jumps out at you in this equation?

$ F1 = BT + ST' - S'T' + S' $

We have two instances of S (in-scattered light) being multiplied by T' (transmittance) even though the in-scattered light contributions we're dealing with already account for transmittance. We are therefore applying transmittance too many times and darkening the results. Woopsie!
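To put a number on the darkening, we can subtract the two results (my own little check, not something from the original derivation):

\begin{equation}
\begin{split}
F0 - F1 & = (BT + S) - (BT + ST' - S'T' + S') \\
& = (S - S')(1 - T')
\end{split}
\end{equation}

Since in-scattering only accumulates with distance ($S \geq S'$) and $T' \leq 1$, the single-accumulator result is never brighter than the reference, and the error grows with the amount of fog between the fragment and the background.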
# Split accumulators

Instead of using a single accumulator like so:

```C#
foreach(var f in fragments)
{
	S,T = GetRangedVolumetricLightData(...);
	C = Blend(f.color, C, f.blendMode);
	C = C * T + S;
}
return C;
```

Let's try adding a separate color accumulator D just for the in-scattered light like so:

```C#
foreach(var f in fragments)
{
	S,T = GetRangedVolumetricLightData(...);
	C = Blend(f.color, C, f.blendMode);
	C = C * T;
	D = Blend(f.color, D, f.blendMode);
	D = D + S;
}
return C + D;
```

That way, the in-scattered light, which is already attenuated towards the camera, never gets multiplied by transmittance again.

## 0 fragments

\begin{equation}
F0 = C + D = BT + S
\end{equation}

Same as before, that's a good start...

## 1 invisible fragment

************************************************************************************************
*                                                                                              *
*  Near clip plane                                                                             *
*       |                                                                                      *
*    /  |            Fragment        Background                                                *
*  +)   |┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄o┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄o                                                   *
*    \  |<--------------->  S',T' (in 3D texture)                                              *
* Camera|<-------------------------------->  S ,T  (in 3D texture)                             *
*       |                 <--------------->  S0,T0 (computed)                                  *
*       |<--------------->  S1,T1 (computed)                                                   *
*       |                                                                                      *
*                                                                                              *
************************************************************************************************

Apply volumetric lighting between the background and the fragment:

\begin{equation}
\begin{split}
C0 & = BT0 \\
& = B\frac{T}{T'}
\end{split}
\end{equation}

\begin{equation}
\begin{split}
D0 & = S0 \\
& = S - S'
\end{split}
\end{equation}

Blend the fragment:

\begin{equation}
\begin{split}
C1 & = Blend(blackColor, C0, additiveBlend) \\
& = C0 \\
& = B\frac{T}{T'}
\end{split}
\end{equation}

\begin{equation}
\begin{split}
D1 & = Blend(blackColor, D0, additiveBlend) \\
& = D0 \\
& = S - S'
\end{split}
\end{equation}

Apply volumetric lighting between the fragment and the near clip plane:

\begin{equation}
\begin{split}
C2 & = C1T1 \\
& = C1T' \\
& = B\frac{T}{T'}T' \\
& = BT
\end{split}
\end{equation}

\begin{equation}
\begin{split}
D2 & = D1 + S1 \\
& = D1 + S' - 0 \\
& = D1 + S' \\
& = S - S' + S' \\
& = S
\end{split}
\end{equation}

\begin{equation}
\begin{split}
F1 & = C2 + D2 \\
& = BT + S
\end{split}
\end{equation}

Now it matches perfectly! Let's see the result.

![No weird edges... wiii](vl1_s1_d1.jpg)

But wait, is it me or is the result too bright now?

# Tripartite blending

The code for blending was:

```C#
C = Blend(f.color, C, f.blendMode);
D = Blend(f.color, D, f.blendMode);
```

Which, expanded, looks like:

```C#
C = f.color * BlendFactor(f.color, C, f.blendMode & GLS_SRCBLEND_BITS) +
    C       * BlendFactor(f.color, C, f.blendMode & GLS_DSTBLEND_BITS);
D = f.color * BlendFactor(f.color, D, f.blendMode & GLS_SRCBLEND_BITS) +
    D       * BlendFactor(f.color, D, f.blendMode & GLS_DSTBLEND_BITS);
```

So you can see that the fragment color's contribution (the GLS_SRCBLEND_BITS term) is added once per accumulator: it gets doubled whenever the source blend factor doesn't involve the destination color. This explains the incorrectly boosted brightness and confirms that the blending is just wrong.
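Here's a concrete case of the doubling (my own illustration): take an additive fragment, i.e. blend mode `(one, one)`, with fragment color $f$. Both accumulators receive the full source term:

\begin{equation}
\begin{split}
C_{new} + D_{new} & = (f + C) + (f + D) \\
& = 2f + C + D
\end{split}
\end{equation}

The destination terms are fine since C and D each scale their own share, but the fragment's color gets added once per accumulator instead of once in total.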
Let's break down the bipartite blend equation with a destination color that's split into 2 parts. Here, Src is the fragment's color, SM the source blend mode and DM the destination blend mode (S would clash with the in-scattered light, hence Src):

\begin{equation}
\begin{split}
D & = Blend(Src, D1 + D2, SM | DM) \\
& = Src * BlendFactor(Src, D1 + D2, SM) + (D1 + D2) * BlendFactor(Src, D1 + D2, DM) \\
& = Src * BlendFactor(Src, D1 + D2, SM) + D1 * BlendFactor(Src, D1 + D2, DM) + D2 * BlendFactor(Src, D1 + D2, DM)
\end{split}
\end{equation}

We can now distribute the 3 parts (source, main destination, in-scattered destination) over the 2 accumulators (main destination, in-scattered destination):

\begin{equation}
\begin{split}
D1 & = Src * BlendFactor(Src, D1 + D2, SM) + D1 * BlendFactor(Src, D1 + D2, DM) \\
D2 & = D2 * BlendFactor(Src, D1 + D2, DM)
\end{split}
\end{equation}

Here's the updated code to reflect the new logic. Note that all the blend factors must see the same pre-blend destination color, so it gets saved off before the accumulators are updated:

```C#
foreach(var f in fragments)
{
	S,T = GetRangedVolumetricLightData(...);
	var dst = C + D; // pre-blend destination color
	C = f.color * BlendFactor(f.color, dst, f.blendMode & GLS_SRCBLEND_BITS) +
	    C       * BlendFactor(f.color, dst, f.blendMode & GLS_DSTBLEND_BITS);
	C = C * T;
	D = D * BlendFactor(f.color, dst, f.blendMode & GLS_DSTBLEND_BITS);
	D = D + S;
}
return C + D;
```

## 0 fragments

\begin{equation}
F0 = BT + S
\end{equation}

Same as before. Let's move on.

## 1 invisible fragment

************************************************************************************************
*                                                                                              *
*  Near clip plane                                                                             *
*       |                                                                                      *
*    /  |            Fragment        Background                                                *
*  +)   |┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄o┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄┄o                                                   *
*    \  |<--------------->  S',T' (in 3D texture)                                              *
* Camera|<-------------------------------->  S ,T  (in 3D texture)                             *
*       |                 <--------------->  S0,T0 (computed)                                  *
*       |<--------------->  S1,T1 (computed)                                                   *
*       |                                                                                      *
*                                                                                              *
************************************************************************************************

Volumetric lighting between the background and the fragment is applied in the same way as before, so we start with the same C0 and D0 values:

\begin{equation}
C0 = B\frac{T}{T'}
\end{equation}

\begin{equation}
D0 = S - S'
\end{equation}

And now we blend the invisible fragment using the new recipe:

\begin{equation}
\begin{split}
C1 & = black * BlendFactor(black, C0 + D0, one) + C0 * BlendFactor(black, C0 + D0, one) \\
& = 0 * 1 + C0 * 1 \\
& = C0 \\
& = B\frac{T}{T'}
\end{split}
\end{equation}

\begin{equation}
\begin{split}
D1 & = D0 * BlendFactor(black, C0 + D0, one) \\
& = D0 * 1 \\
& = D0 \\
& = S - S'
\end{split}
\end{equation}

We get the same C1 and D1 values as previously. Since the volumetric light math in the follow-up step is unchanged, we end up with the same final color.

Here is a full comparison with all results in order:

Here is the same gallery zoomed in:

# Conclusion

As usual, it turns out that the solutions are pretty simple and seem obvious in retrospect. They were not really obvious before diving in, but deriving the results wasn't much work. I don't remember whether I ended up splitting the accumulators because I had realized what the problem was or not.

The initial goal of all this was to make sure a specific type of (very objectionable) visual artifact never happens. It wasn't even to be correct or accurate. In the process of figuring out how to make things artifact-free, a path towards more correct rendering showed up.

But you know what? It doesn't really matter in this case. Being artifact-free and good enough to be believable are the most important things for the CRP. If being artifact-free had meant having to use a less correct approach, I would have taken that trade.
Techniques that are closer to the ground truth but generate annoying artifacts that you can't properly work around are a downgrade, especially when you let the users take control.

Here's the combined OIT/fog resolve shader: [transp_resolve.hlsl](https://bitbucket.org/CPMADevs/cnq3/src/master/code/renderer/shaders/crp/transp_resolve.hlsl). It has to deal with a bunch of extra complications:

- Compiling with and without volumetric light support.
- Different fragments can have the same depth but must still be sorted in a specific order (the lowest Q3 shader stage index is blended first).
- Q3 shader stages can set different depth write and depth test settings even in transparent shaders.
- Fragments can be depth faded.
- The correct Q3 shader index must be written out to a buffer UAV for the built-in shader tracing tool. For that, the fragment has to actually change the color of the main accumulator.

Even accounting for all that, it's still a bit more gnarly than it could have been.
Oh well. ¯\_(ツ)_/¯

Here are some extra examples of OIT and volumetric fog being combined:

--------------------------------------------------------------------------------