26 October 2012

Deferred shader part 2

This is a description of how deferred shading is used in Ephenation. The algorithm is separated into a number of stages, each of which updates one or more data buffers. The stages and the buffers they use are explained below. Most of the links refer to the actual source code.

Overview

Green box: Shader execution
Blue box: Data stored in a bitmap (of varying formats)
Solid arrow: Indicates data is used
Wide orange arrow: Data is updated

Buffers

Depth, diffuse, normal and position buffers are allocated. These are common in many deferred shaders and are together usually called a G buffer, which is described in more detail in part 1.

Light buffer

All lighting effects are added up in a separate channel of type R16F. It is a bitmap texture allocated as follows:
glTexImage2D(GL_TEXTURE_2D, 0, GL_R16F, w, h, 0, GL_RED, GL_FLOAT, NULL);
The advantage of having only one channel is that less memory is needed. The disadvantage is that light contributions cannot have separate effects for the different colors. The values will typically range from 0 to 1, but can exceed 1 if there is more than one light source (e.g. sun and lamp). Players can create any number of light sources, so it is important that this is displayed satisfactorily (see the section about tone mapping below).

The buffer is initialized to 0, which would give a completely dark world unless some light is added.
Light map, using red channel

Blend data

This is a GL_RGBA buffer for separately managing blending. See more details about the blending stage below.

Shadow map

A depth buffer with information about the distance from the sun to the world. This is not described in detail in this article. More information can be found at www.opengl-tutorial.org.
Shadow map
The projection matrix is an orthogonal projection, not a perspective projection.

Note that the shadow map uses variable resolution (defined as a shader function), which explains the distortion at the edges of the picture. It has high resolution in the middle, near the player, and lower resolution far from the player. Even though the sunlight comes in at an angle, matrix shearing is used to transform height values to normalized values in the Y dimension. Otherwise, height values in the upper right would have saturated to white and values in the lower left to black.

Stages

Point lights using tile based rendering

The effect from the point lights does not take into account shadows from objects. This is a shortcoming, but there can be many lamps and the cost of computing shadows would be too high. The fall-off function is a simple linear function, with maximum intensity at the lamp position and 0 at a distance that depends on the lamp size. The physically more correct function, giving an intensity proportional to 1/r^2, would give infinite intensity at r=0 and would never reach 0.
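The difference between the two fall-off functions can be sketched in C as follows. This is only an illustration, not the Ephenation shader code; the names linear_falloff, inverse_square_falloff and radius (the distance derived from the lamp size) are assumptions made for this example.

```c
/* Linear fall-off: 1 at the lamp position, 0 at 'radius' and beyond.
   'radius' is a hypothetical name for the distance given by the lamp size. */
float linear_falloff(float dist, float radius)
{
    if (radius <= 0.0f || dist >= radius)
        return 0.0f;
    return 1.0f - dist / radius;
}

/* Inverse-square fall-off, for comparison only. It grows without bound
   as dist approaches 0 and is still positive at any finite distance. */
float inverse_square_falloff(float dist)
{
    return 1.0f / (dist * dist);
}
```

The linear version has a hard limit on its reach. That limit is what makes it possible to bound each light with a quad of finite size.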

Each point light is rendered separately. A bounding 2D quad is positioned at the place of the point light. The fragment shader will thus be activated only for pixels that are visible (highlighted below). Some of these pixels will then be affected by the light. The position buffer is used to compute the distance from the point light to the pixel, and the normal buffer is used to find the angle.
The highlighted square is used for lighting calculation
As the quad is drawn at the position of the point light, it may be that all pixels are culled because they fail the depth test. This is a great advantage, and will speed up drawing considerably, as lamps are frequently hidden behind walls or other objects. Two adjustments are made. First, the quad is not positioned exactly at the point light, but in front of it. Second, when the camera is inside the light sphere, the quad has to be moved away from the player, or it would be drawn behind the camera and culled completely.

Blending

Blending is usually a problem with deferred shading. If the blending is done before light effects are applied, it will look bad. In Ephenation, semi-transparent objects are drawn separately from the opaque objects. They are still drawn using the FBO, so as to have access to the depth buffer. Because of that, the result is saved in a special blend buffer that is applied in the deferred stage.

Textures used for the opaque objects use the alpha component to indicate either full transparency or full opaqueness. That is handled by the shader, which will discard transparent fragments completely. This will prevent updates of the depth buffer.

Deferred shading

All drawing until now has been done with a Frame Buffer Object as a target. The deferred stage is the one that combines the results from this FBO into a single output, which is the default frame buffer (the screen).

Gamma correction

The colors sent to the screen are clamped to the interval [0,1]. 0 is black, and 1 is as white as you can get. The value can be seen as an energy, where more energy gives more light. However, 0.5 is not half the energy of 1. The reason is that the monitor transforms the output with a gamma correction, which is approximately C^2.2. The constant 2.2 is called the gamma constant. To get a value halfway between black and white, 0.5^0.45=0.73 should be used, to compensate for the non-linear behavior of the monitor.

SRGB input

The exact algorithm is defined by the sRGB format. LCD displays use the sRGB coding automatically. If all bitmaps are in the sRGB format, then the final output will automatically be correct. Or rather, it could be correct, but there are important cases where it is not. As sRGB is not linear, you can't add two values correctly. For example, the average of 0 and 1, which is 0.5, would not give the average energy in the final display on the monitor. So if pixel colors are manipulated, the final colors can be wrong or there can be artifacts. The transformation from sRGB to a linear value is:
if (srgb < 0.04045)
    linear = srgb / 12.92;
else
    linear = pow((srgb + 0.055)/1.055, 2.4);
If this transformation is done on an 8-bit color and the result is stored in 8 bits again, the special case of values less than 0.04045 (8-bit values 0 to 10) will all be rounded to 0 or 1 when divided by 12.92.

When you edit a bitmap in an editor, what you see is what you get. That means that the monitor will interpret the colors as being sRGB. OpenGL has built-in support for conversion from the sRGB format. If the sRGB format is specified for a texture, OpenGL will automatically convert to linear color space when the texture is sampled. If sRGB is not specified, the transform has to be done manually in the shader. In Ephenation, bitmaps are specified as sRGB to get the automatic transformation, which means the equation above isn't needed.

SRGB output

In the last phase, when pixels are written to the default frame buffer, the value has to be manually transformed to non-linear (sRGB). There is actually automatic support for this in OpenGL when using a Frame Buffer Object with a texture target in an sRGB format. However, the final output usually goes to the default frame buffer, which has no such automatic transformation. Regardless, it may be a good idea to implement it in the shader, to make it possible for the end user to calibrate and control it.
if (linear <= 0.0031308)
    CRT = linear * 12.92;
else
    CRT = 1.055 * pow(linear, 1.0/2.4) - 0.055;

HDR

Colors are limited to the range [0,1], but there is no such limitation in the real world. The energy of a color is unlimited. But the limitation is needed, as it represents the maximum intensity of the display hardware. When doing light manipulations, it is easy to get values bigger than 1. One way would be to start with low values, and then make sure there can never be a situation where the final value will saturate. However, that could mean that the normal case will turn out to be too dark.

HDR is short for High Dynamic Range. It basically means images where the dynamic range (the difference between the lowest and highest intensity) is bigger than what can be shown on the display. Eventually, when the image is going to be shown, some mechanism is required to compress the range to something that will not saturate. A simple way would be to scale down the values, but then the lower ranges would again disappear. There are various techniques to prevent this from happening. In the case of gaming, we don't want the high values to saturate too much, and so a simpler algorithm can be used to compress the range.

Tone mapping

There are several ways to do tone mapping, and in Ephenation the Reinhard transformation is used. This will transform almost all values to the range [0,1]. If it is done separately for each color channel, it can give color shifts when one of the components R, G or B is much bigger than the others. Because of that, the transformation is done on the luminance. This can be computed with the following in the deferred shader:
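float lightIntensity; // Summed light for this pixel, from the light buffer
vec3 diffuse;         // Surface color, from the diffuse buffer
vec3 rgb = diffuse * lightIntensity;
// Luminance of the lit color
float L = 0.2126 * rgb.r + 0.7152 * rgb.g + 0.0722 * rgb.b;
// Lwhite2 is Lwhite squared
float Lfact = (1.0 + L/Lwhite2)/(1.0 + L);
vec3 mapped = rgb * Lfact;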
'rgb' is the color after lighting has been applied. This is the value that needs to be adjusted by tone mapping.

One simple solution that is sometimes used is to transform each channel with x/(1+x). That would take away much of the white from the picture, as almost no values will get close to 1. The solution used above is to compute the luminance L of the pixel. This luminance value is then transformed with tone mapping, and used to scale the RGB value. The idea is to set Lwhite to an intensity that shall be interpreted as white. Suppose we set Lwhite to 3.0. The tone mapping filter will transform everything below 3.0 to the range [0,1], and values above 3.0 will saturate.
The formula using white compensation will saturate at 3.0
Note how the transformation x/(1+x) will asymptotically approach 1. Without tone mapping, everything above 1.0 would have saturated, but now saturation starts at 3.0.
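The two curves can be compared numerically with the C sketch below. It mirrors the shader fragment above, but the function and type names are invented for the example, and Lwhite is passed directly rather than as its square.

```c
/* Simple Reinhard curve: approaches 1 asymptotically, never reaches it. */
double reinhard_simple(double L)
{
    return L / (1.0 + L);
}

/* Reinhard curve with white compensation: maps L == Lwhite to exactly 1. */
double reinhard_white(double L, double Lwhite)
{
    return L * (1.0 + L / (Lwhite * Lwhite)) / (1.0 + L);
}

typedef struct { double r, g, b; } Rgb;

/* Scale a lit color by the ratio between the tone mapped luminance and
   the original luminance, as in the shader fragment above. */
Rgb tone_map(Rgb lit, double Lwhite)
{
    double L = 0.2126 * lit.r + 0.7152 * lit.g + 0.0722 * lit.b;
    double Lfact = (1.0 + L / (Lwhite * Lwhite)) / (1.0 + L);
    Rgb mapped = { lit.r * Lfact, lit.g * Lfact, lit.b * Lfact };
    return mapped;
}
```

For L = 3.0 and Lwhite = 3.0, reinhard_simple gives 0.75 while reinhard_white gives 1.0. Values produced by tone_map can still exceed 1 for luminances above Lwhite, so they are expected to be clamped when written to the screen.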
Tone mapping disabled

Tone mapping enabled
The transformation using Lwhite can also be applied to each channel individually. That gives slightly different results with many lights, as the final result will be close to white. Which variant is best is not defined; it depends on the application.

Tone mapping enabled per channel

For reference, diffuse data with no lighting applied

Monster selection

After the deferred shader, data from the G buffer can still be used. In Ephenation, there is a red marker around selected monsters. This is a color added to pixels that are within a limited distance of the monster.
Red selection marker
The same technique is also used to make simple shadows if the dynamic shadow map is disabled.