Tile frustum calculation for the point light culling

Tile-based deferred shading throught the DirectX Compute Shader for achieving the high performance with many lights by amortizing the light culling overhead over screen tiles as well as grouping lights to avoid wasting memory bandwidth.

As you know, we need to calculate the frustum of each tile for our tile-based rendering – (0) Intel of sample “Deferred Rendering for Current and Future Rendering Pipelines”

#define COMPUTE_SHADER_TILE_GROUP_DIM 10
// Work out scale/bias from [0, 1]
float2 tileScale = float2(mFramebufferDimensions.xy) * rcp(float(2 * COMPUTE_SHADER_TILE_GROUP_DIM));
float2 tileBias = tileScale - float2(groupId.xy);

// Now work out composite projection matrix
// Relevant matrix columns for this tile frusta
float4 c1 = float4(mCameraProj._11 * tileScale.x, 0.0f, tileBias.x, 0.0f);
float4 c2 = float4(0.0f, -mCameraProj._22 * tileScale.y, tileBias.y, 0.0f);
float4 c4 = float4(0.0f, 0.0f, 1.0f, 0.0f);

// Derive frustum planes
float4 frustumPlanes[6];
// Sides
frustumPlanes[0] = c4 - c1; // (-X direction)
frustumPlanes[1] = c1;      // (+ X direction).
frustumPlanes[2] = c4 - c2; // (-Y direction).
frustumPlanes[3] = c2;      // (+ Y direction)
// Near/far
frustumPlanes[4] = float4(0.0f, 0.0f, 1.0f, -minTileZ);
frustumPlanes[5] = float4(0.0f, 0.0f, -1.0f, maxTileZ);

// Normalize frustum planes (near/far already normalized)
[unroll] for (uint i = 0; i < 4; ++i) {
frustumPlanes[i] *= rcp(length(frustumPlanes[i].xyz));
}

In the above code, we have to calculate the normal of 6 planes that constitute the frustum.
The calculation is performed with a view space, but the normal line of the calculation is considered in the tile, not the distance unit.

Meaning of the variables:

mFramebufferDimensions = length and width of the frame buffer size (pixels)
tileScale = the center as (0, 0), one tile = 1.0 to become such a maximum value of the tile coordinate system
tileBias = coordinates of the tile you are dealing now (beginning from +tileScale, ending in -tileScale)

In the following description mFramebufferDimensions = (600,600), and the COMPUTE_SHADER_TILE_GROUP_DIM = 10.
Since the tileScale = 600 / (2 * 10) = 30, the minimum -30.0 and range to a maximum of 30.0, will coordinate system:

time

Now lets try to calculate the normal line (+ Y direction) of the frustum.
float4 c2 = float4 to (0.0f, tileBias.y, mCameraProj._22 * tileScale.y, 0.0f)  (C2 tile coordinate system in the view space).

The value of mCameraProj._22 is, row 2, column 2 of the projection matrix = (1.0f / tan (fovY / 2))

First, consider the case of groupID = 0.

group-id-0

Rotating c2 90 degrees, since the normal direction of the plane of the + Y side of the frustum is understood to be a c2. We have the expression of the following Intel’s sample.
float4 c2 ‘= float4 (0.0f, -mCameraProj._22 * tileScale.y, tileBias.y, 0.0f);

group-id-0-90

Similarly, consider the case of groupID = 1 is next to the tile. (In this case, tileBias = tileScale – 20)

group-id-1

Rotating c2 90 degrees, since the normal direction of the plane of the + Y side of the frustum is understood to be a c2. We have the expression of the following Intel’s sample.
float4 c2 ‘= float4 (0.0f, -mCameraProj._22 * tileScale.y, tileBias.y, 0.0f);

group-id-1-90

Try to calculate the normal line of the lower side of the frustum (-Y direction).
float4 c2′ = float4 (0.0f, -mCameraProj._22 * tileScale.y, tileBias.y, 0.0f);
float4 c4 = float4 (0.0f, 0.0f, 1.0f, 0.0f);
(C2′ and C4 tile coordinate system in the view space)

Then, the normal direction of the lower side of the frustum (-Y direction), c4 – can be represented by c2′.

Consider the case of groupID = 0.

group-id-0-c2

Repeat this for each coordinate of the point light with collision detection of the frustum.

// Cull lights for this tile
for (uint lightIndex = groupIndex; lightIndex < numPointLights; lightIndex += COMPUTE_SHADER_TILE_GROUP_SIZE) {

float3 lightPosition = mul(float4(PointLightDataArray[lightIndex].pos.xyz, 1.0f), matView).xyz;
float cutoffRadius = PointLightDataArray[lightIndex].col_radius.w;
// Cull: point light sphere vs tile frustum
bool inFrustum = true;
[unroll] for (uint i = 0; i < 6; ++i) {
float d = dot(frustumPlanes[i], float4(lightPosition, 1.0f));
inFrustum = inFrustum && (d >= -cutoffRadius);
}

[branch] if (inFrustum) {
// Append light to list
// Compaction might be better if we expect a lot of lights
uint listIndex;
InterlockedAdd(sTileNumLights, 1, listIndex);
sTileLightIndices[listIndex] = lightIndex;
}
}

Visual course

Starting my new course – special for 2D/3D artist, ta and can be interesting to some programmers with this topics:

  • Color – lets talk about color
    • RGB and other models – how the rgb was found
    • Gamma correction – what is gamma correction today, white balance etc.
    • Linear space – what is linear space and why we use it
    • LDR/HDR – what is HDR and LDR what the difference
    • Tonemapping – how to use and why we need to have tonemapping
  • Forward vs Deferred
    • Deferred is everywhere – why everybody switch to deferred
    • Problems – what the problems we have with deferred
    • Combination – how we can use both of them and why do this
    • New approach – Screen space effects
  • PBR workflows – why use PBR
    • Shading models – lambert, burley, etc..
    • Specular vs metallic – what the difference
    • Albedo, glossines, roughes, specular, ao, cavity – different maps
    • Metals and dielectrics – how to choose
    • Lighting models – what models are
  • Lighting – for PBR
    • Brdfs – what is BRDF
    • Different BRDF/BTDF models – what the difference
    • Energy conservation – why it’s important
    • Fresnel – what is Fresnel term and how it affect metallic
    • How IBL fits here – when we use IBL
    • Area lights – what the difference between area and directional
  • Image base lighting – what is IBL
    • IBL in games – how and why we use it
    • Irradiance environment map (IEM) – approaches
    • Spherical harmonics – approaches
    • Volumetric lighting
  • PBR Camera – final image post-processing
    • Camera – why is important (LUV,SHUTTER SPEED OTHER)
    • High values – color space and dynamic range
    • Antialiasing
    • Optical effects – Optic, Lenses, Bokeh, Glare, Anamorphic
    • Motion blur, DOF
  • Custom Lighting – complicated
    • Photon lighting
    • Crytek LPV
    • Unreal voxel cone tracing
  • Ray Tracing and Distance functions
    • Raymarch – what it is and why we need to know
    • Approximations – how and why we should do it
    • Examples of approach – ray traced distance field soft shadows
  • Texture Compression
    • Memory, quality, power, performance – four horsemen of texture compression
    • DXT – S3 compression.
    • BCn – compression
    • Atlases – what is this and why we need to use them
    • UV space – from 0 to 1, mirror, wrap and others

The main purprose – to have a solid understanding what everybody talking about every day but dont know exactly what is it. This is not a final listing here, I’m still WIP and adding a little bit more topics to cover. All is a rather simple form, without any complicated formulas or calculations, the task of the course – learn the concepts with which we operate every day.

3D Fractals

The Mandelbulb is a three-dimensional fractal, constructed by Daniel White and Paul Nylander using spherical coordinates in 2009.

As we know, the canonical 3-dimensional Mandelbrot set doesn’t exist, since there is no 3-dimensional analogue of the 2-dimensional space of complex numbers. It is possible to construct Mandelbrot sets in 4 dimensions using quaternions. However, this set does not exhibit detail at all scales like the 2D Mandelbrot set does (0).

The Mandelbulb is then defined as the set of those {\mathbf c} in 3 for which the orbit of \langle 0, 0, 0\rangle under the iteration {\mathbf v} \mapsto {\mathbf v}^n+{\mathbf c} is bounded. For n > 3, the result is a 3-dimensional bulb-like structure with fractal surface detail and a number of “lobes” depending on n. Usually you can use n = 8 or as an example animate this POWER value.

The Sierpinski triangle is a fractal and attractive fixed set with the overall shape of an equilateral triangle, subdivided recursively into smaller equilateral triangles. It is named after the Polish mathematician Wacław Sierpiński but appeared as a decorative pattern many centuries prior to the work of Sierpiński.

There are many different ways of constructing the Sierpinski triangle, this is a 3D way using raymarching.

And the last on is the Menger sponge (also known as the Menger universal curve) is a fractal curve. It is a three-dimensional generalization of the Cantor set and Sierpinski carpet. It was first described by Karl Menger in 1926, in his studies of the concept of topological dimension.

The Menger sponge simultaneously exhibits an infinite surface area and zero volume.

The construction of a Menger sponge can be described as follows:

  • Begin with a cube (first image).
  • Divide every face of the cube into 9 squares, like a Rubik’s Cube. This will sub-divide the cube into 27 smaller cubes.
  • Remove the smaller cube in the middle of each face, and remove the smaller cube in the very center of the larger cube, leaving 20 smaller cubes.
  • Repeat steps 2 and 3 for each of the remaining smaller cubes.

Radiosity

The energy leaving a surface on unit area and per unit time. Lighting calculation algorithm wich have the same word – “radiosity” comes basically from the law of conservation of energy and “Energy balance” hypothesis.

light-path-simple-room

Let’s consider a simple example (of the 4 walls of the room in 2D). Suppose that one of the walls of a rejected the amount of energy (light) in a very short period of time (and here is extinguished). The law of conservation of energy says that all the energy that was radiated this wall will remain inside the room. For the emitted light, this means that it is all come to the other 3 wall. After this a some amount of a light can be absorbed, it depend on the surface type and other reflected, and one more time repeat, and repeat..  That mean, in a closed room (closed system), all once radiated energy goes into heat with the passage of time. Thus, the term “energy balance in the system” to have actually merely means that the level of illumination in the scene does not change with time. I.e we can ignore the time and consider the “instant” solution!

Radiosity algorithm works in terms of energy transferred between small areas(patches).

  1. Surfaces scenes are divided into a finite number of sites. Patches are small but finite size and is made up of several consecutive steps.
  2. The light sources are also interpreter as patches, radiating energy.
  3. For each pair of pathes are calculated so-called form factor 𝐹𝑖𝑗 – part of the energy transferred from the patch to the patch 𝑖𝑗, way we obtain the matrix. From the law of conservation of energy it follows that all the emitted
    energy of their energy patch i should come to any other patches. I.e amount of form factors in the row must be equal to 1.
  4. Once the form factors are known, it begins the process of solving linear radiosity equation. There is a calculation of what part of the actual energy on what grounds passes. There may be at least 2 ways – direct
    calculation and decision system of linear equations.
  5. Once the total emitted by each site is known, can be visualized lyse scene. As Lambert material radiates energy equally in all parties, in order to calculate the brightness of the areas should be divided the energy of each site on its area.

As the basic geometric element is proposed to use lined on the axes rectangles (Axis Aligned Quad). This primitive defines direct rectangle, located in 1 of 3 planes – XoY, XoZ or YoZ. It can be set as 𝑃 position and size in 2 directions perpendicular to the normal vector 𝑛 – 𝑛 rectangle at normal XoY can either be (0,0,1) or (0,0, -1).

simple-mod

With such a primitive enough to easily search-beam. Suppose you have a ray starting at point and the direction 𝑂 𝐷. And there is a plane YoZ. Then parametric coordinate 𝑡 point of intersection of the beam and the plane will be equal Tℎ𝑖𝑡 = (𝑃.𝑥 – 𝑂.𝑥) /𝐷.𝑥. In order to understand whether or not your beam is rectangular, you will check whether the found point is crossed (expressed as a straight line from the equation 𝑃ℎ𝑖𝑡 = 𝑂 + 𝐷 * Tℎ𝑖𝑡) within the boundaries of the rectangle.

The calculation of form factors

Differential Form Factor for the two sites and 𝑖 𝑗  is written as follows:

$$
FdA_{i}dA_{j} = \frac{cos\theta_{i}cos\theta_{j}}{\pi r^2}
$$

rp-patches

The relative positions of patches.

Complete the form factor for a given patch will be equal to the integral of the differential of the form factor on the area under consideration patch.

$$
F_{ij} = \frac{1}{A_{i}}\int_{x\in p_{j}}\int_{y\in i_{j}}\frac{cos\theta_{i}cos\theta_{j}}{\pi r^2}V(x,d)dydx
$$

Thus, we have made the transition from all points of the scene to the form factor between two specific sites. It remains to learn how to calculate the double integral over the surface. Very simple but long ray tracing and the Monte Carlo method. In the case of form factors should be N times to generate 2 random points on the floors, to calculate for each pair of points differential form factors and the results averaged. The result should be multiplied by the area of the two areas.

mc-method

The integration of form factors using the Monte Carlo method and a trace rays

Form Factor propetries:

  1. The dependence on the geometry of the scene. Form factor depends solely on
    scene geometry, do not depend on the position of the light sources and their intensities
    intensity and position of the camera.
  2. The reversibility. Knowing the area ratio patches, you can switch from one form factor to another: 𝐹𝑖𝑗𝐴𝑖 = 𝐹𝑗𝑖𝐴𝑗
  3. Additivity. Knowing the form factor for a couple of patches i and j and the form factor for the pair
    patches i and the k, j, and by combining the k patches, we can calculate the form factor for the pair patcheyy and patch, which is a union of patch j and k. This property will be useful in the implementation of the hierarchical radiosity, when a group of distant sites can be considered as one big playground.
  4. The sum of all the possible form factor for each patch equal to the scene per unit. All the energy, leaving the patch must come to some other patch.

 Direct radiosity calculation

The algorithm is a direct calculation is quite simple. At each iteration, for each site It is going all the incoming energy. At the end of the iteration is multiplied by koefiftsient reflections and converted to outgoing energy for the next iteration.

struct PatchRadiosity // separateradiosity params from geometry
{
float3 emmision;
float3 reflectance;
float3 incident;
float3 excident;
float3 deposit ; // this is sum of all passes
};
// std :: vector <PatchRadiosity> patchesR(N_PATCHES); 
// init phase
for (PatchRadiosity&p : patchesR)
{
p.excident = p.emmision ;
p.emmision = float3 ( 0, 0, 0 );
p.deposit += p.excident;
}
// do radiosity
for ( int pass = 0; pass < a_numPasses; pass++)
{
// each patch collect slight from the scene
for ( int i = 0; i < patchesR.size(); j++)
{
patchesR[i].incident = float3 ( 0, 0, 0 );
for ( int j = 0; j < patchesR.size(); j++)
patchesR[i].incident += patchesR[j].excident * a_formFactors[i][j];
}
// deposit emitted energy for current bounce
for ( PatchRadiosity& patch : patchesR )
{
patch.excident = patch.incident * patch.reflectance;
patch.deposit += patch.excident ;
}
}

Example of using this algorithm
rad-example

Random vectors directions within a sphere

Way to get the random vectors direction vec3 (x, y, z) within a sphere with a fixed radius 1.0?

Use three angles:
$$
-180° <= a1 < +180° $$ $$
0° <= a2 <= 90° $$ $$
0° <= a3 <= 90°
$$

Steps:
1. Select a random direction in the XY plane: {cos (a 3), sin (a 3) 0}.
2. turn the Z axis to a random angle a2 around the random vector {cos (a1), sin (a1), 0}, which lies in the XY plane.

rotX(p) := matrix(
[1,0,0],
[0,cos(p),-sin(p)],
[0,sin(p),cos(p)]
);
rotZ(p) := matrix(
[cos(p),-sin(p),0],
[sin(p),cos(p),0],
[0,0,1]
);
dir(p) := matrix(
[ cos(p) ],
[ sin(p) ],
[ 0 ]
);
calc_dir(p1, p2, p3) := rotX(p1) . rotZ(p2) . dir(p3);

Current example you can try in Maxima Lisp.
The result of matrix multiplication:
$$
vx = cos (p2) * cos (p3) -sin (p2) * sin (p3) $$ $$
vy = cos (p1) * (cos (p2) * sin (p3) + sin (p2) * cos (p3))$$ $$
vz = sin (p1) * (cos (p2) * sin (p3) + sin (p2) * cos (p3)) $$

And this is a generated image, as you you see no visible sphere poles, wich is great!

sphere-good

We can use it in the raytracing and gi.