Chapter 3 Algorithm
3.4 Fake Soft Shadow
Shadows are an important effect for realistic image synthesis. Shadow mapping, which creates shadows by comparing the depth of each pixel in the eye's view with the corresponding depth value in the shadow map, is a popular real-time algorithm for dynamic hard shadows. Unfortunately, it suffers from perspective aliasing and projective aliasing, both of which degrade the perceived image quality. A feasible solution is Percentage Closer Filtering (PCF) [10]. This technique takes several shadow-map samples around each shaded point, compares the depth of the shaded point against each of them, and averages the comparison results to obtain anti-aliased shadows, as shown in Figure 3.18. In general, we can extend this technique to produce soft shadows by taking at least 49 samples. However, this still leads to slight banding artifacts. Fortunately, we can use the cached shadow values to improve the shadow quality, as previous work does [8]. We both rotate and scale the sampling pattern each frame to generate slightly different shadows, and then use the difference between the resulting shadow values as the weighting factor when accumulating them.
Figure 3.18: The depth of a surface point p is compared with the shadow-map samples around it. The average of the comparison results stored in the mask serves as an occlusion rate, which produces anti-aliased shadows.
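The averaging of depth comparisons described above can be sketched in a few lines. This is only an illustrative CPU-side sketch, not the shader used in this thesis: the function name `pcf_shadow`, the square kernel, the border clamping, and the `bias` value are all assumptions made for the example. A `radius` of 3 gives the 7×7 = 49 samples mentioned in the text.

```python
def pcf_shadow(shadow_map, u, v, receiver_depth, radius=1, bias=1e-3):
    """Return the lit fraction in [0, 1] of a (2*radius+1)^2 PCF kernel.

    shadow_map is a 2D list of depths seen from the light (hypothetical layout:
    shadow_map[row][column]); (u, v) is the texel the shaded point projects to.
    """
    height = len(shadow_map)
    width = len(shadow_map[0])
    lit = 0
    taps = 0
    for dv in range(-radius, radius + 1):
        for du in range(-radius, radius + 1):
            # Clamp each tap to the shadow-map border (an assumed policy).
            x = min(max(u + du, 0), width - 1)
            y = min(max(v + dv, 0), height - 1)
            # Binary depth comparison: the tap is lit if nothing closer
            # to the light is stored in the shadow map at this texel.
            if receiver_depth - bias <= shadow_map[y][x]:
                lit += 1
            taps += 1
    # Averaging the binary results yields the anti-aliased shadow value.
    return lit / taps


# Left half of the map holds an occluder at depth 0.2, right half is empty (1.0).
demo_map = [[0.2, 0.2, 1.0, 1.0] for _ in range(4)]
hard = pcf_shadow(demo_map, 1, 1, 0.5, radius=0)      # single tap: hard shadow
filtered = pcf_shadow(demo_map, 1, 1, 0.5, radius=1)  # 3x3 taps: partial shadow
```

With a single tap the point is either fully lit or fully shadowed, while the 3×3 kernel returns a fractional value near the shadow edge.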
3.4.1 Exploiting Cached Information
We first fetch the previous shadow value 𝑆𝑡−1 of each pixel from the history cache.
Before computing the new shadow value 𝑆𝑡 of each pixel, we rotate the sampling pattern and scale its size every frame. Afterwards, we compute the difference between the two values according to Eq. 4. Using this difference as the weighting factor saves us from having to hand-tune the weight used to accumulate the shadow values 𝑆𝑡−1 and 𝑆𝑡. Figure 3.19 and Figure 3.20 show the comparisons.
Figure 3.19: Hard shadows (left). Soft shadows created with 49 samples (middle). Soft shadows created with only 9 samples and the cached data (right).
Figure 3.20: The shadows in the left image, rendered without accumulating cached information, show more apparent aliasing than the shadows in the right image, rendered with accumulated cached information. (Both images use 49 samples.)
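The accumulation step of Section 3.4.1 might be sketched as follows. Eq. 4 is not reproduced in this section, so the sketch assumes one plausible reading: the weight is the absolute difference between the cached and the new shadow value, and it blends linearly between the two. The function names `rotated_scaled_pattern` and `accumulate_shadow` are hypothetical, as is the way the per-frame angle and scale are supplied.

```python
import math


def rotated_scaled_pattern(base_offsets, angle, scale):
    """Rotate a 2D sampling pattern by `angle` radians and scale it by `scale`.

    Assumed usage: call once per frame with a different angle and scale so that
    successive frames sample slightly different shadow-map texels.
    """
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y), scale * (s * x + c * y))
            for x, y in base_offsets]


def accumulate_shadow(s_prev, s_new):
    """Blend the cached value s_prev with the new value s_new.

    ASSUMPTION: the weight is |s_new - s_prev| (not the thesis's Eq. 4 verbatim).
    Similar values keep the cached result; very different values favor the new one.
    """
    w = abs(s_new - s_prev)
    return (1.0 - w) * s_prev + w * s_new


pattern = rotated_scaled_pattern([(1.0, 0.0)], math.pi / 2, 2.0)
blended = accumulate_shadow(0.4, 0.6)
```

Under this assumed weighting no hand-tuned blend constant is needed, which matches the motivation given above: the weight adapts per pixel to how much the shadow value has changed.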
Chapter 4
Implementation and Results
We implement our approach using OpenGL 4.3 and the OpenGL Shading Language. To avoid multiple rendering passes when generating the G-buffer and the RSMs, we use multiple render targets (MRT).
To increase the rendering performance of indirect illumination, we take advantage of the low-frequency nature of diffuse lighting, as the original RSMs [3] do.
Therefore, we first compute the indirect illumination at low resolution. We use half of the full resolution, which gave the best trade-off between performance and quality in our experiments. We then obtain the full-resolution indirect illumination by testing, for each pixel, the four surrounding low-resolution samples. If a sample's world-space normal and world-space position are close to the pixel's normal and position, we compute its contribution by bilinear interpolation.
Otherwise, the sample is discarded and computed later in the final gathering pass.
Figure 4.1 shows the discarded samples.
Figure 4.1: The red pixels are discarded samples and must be completely computed.
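The interpolation test described above can be illustrated with a small sketch. It is not the shader used in our implementation: the thresholds `min_cos` and `max_dist`, the renormalization of the bilinear weights over accepted samples, and the rule that a pixel falls back to full computation only when all four samples are rejected are assumptions made for this example. Illumination is reduced to a scalar for brevity.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def upsample_indirect(pixel_normal, pixel_pos, samples,
                      min_cos=0.9, max_dist=0.1):
    """Interpolate low-resolution indirect illumination for one pixel.

    samples: four tuples (normal, position, value, bilinear_weight).
    Returns the interpolated value, or None if every sample was rejected,
    meaning the pixel must be computed in the final gathering pass.
    """
    total = 0.0
    total_w = 0.0
    for normal, position, value, weight in samples:
        similar_normal = dot(pixel_normal, normal) >= min_cos
        close_position = dist2(pixel_pos, position) <= max_dist ** 2
        if similar_normal and close_position:
            total += weight * value
            total_w += weight
    if total_w == 0.0:
        return None
    # Renormalize so that rejected samples do not darken the result.
    return total / total_w


up = (0.0, 0.0, 1.0)
down = (0.0, 0.0, -1.0)
near = (0.01, 0.0, 0.0)
all_valid = [(up, near, 1.0, 0.25), (up, near, 2.0, 0.25),
             (up, near, 3.0, 0.25), (up, near, 4.0, 0.25)]
one_rejected = all_valid[:3] + [(down, near, 4.0, 0.25)]
```

Here `upsample_indirect(up, (0, 0, 0), all_valid)` blends all four samples, while in `one_rejected` the sample with the opposing normal is skipped.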
To achieve a high frame rate, we reuse the converged results of SSAO, one-bounce indirect illumination, and soft shadows by caching them in a buffer. Note that we still have to refresh them after a few frames, because the shading conditions change when the camera and the light source move.
Furthermore, reverse reprojection of the cached information gradually accumulates reconstruction error. This problem is caused by resampling discrete data: an exact one-to-one pixel mapping between frames is difficult to achieve. The common solution is bilinear interpolation, which is directly supported by modern graphics hardware. Unfortunately, if we keep reusing cached results over several frames, the repeated bilinear interpolation leads to over-blurred results. Therefore, we follow the refresh strategy described by Nehab et al. [9]: we divide the screen into a grid of 𝑛 tiles and randomly select, without repetition, a tile to update. Figure 4.2 and Figure 4.3 compare the results with and without the tiled update strategy. Note that the tile size influences both performance and quality. In our experiments, a tile covers 64×64 pixels for SSAO, 16×16 pixels for one-bounce indirect illumination, and 64×64 pixels for shadows. On the other hand, we found that the shading results are discontinuous at the borders of the refreshed regions, as shown in Figure 4.4. Fortunately, these discontinuities are not obvious enough to affect the visual perception of the results. In addition, we found no remarkable speedup for the PCF soft shadows, because PCF is computationally cheap: it only requires depth comparisons. Nevertheless, we can still take advantage of temporal coherence to improve the shadow quality.
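The tiled refresh schedule can be sketched as a generator that visits every tile once, in random order, before any tile repeats. This is a minimal sketch under stated assumptions: the name `tile_refresh_order`, the one-tile-per-frame rate, the fixed `seed`, and the reshuffle after each full cycle are illustrative choices, not details taken from Nehab et al. [9] or from our implementation.

```python
import random


def tile_refresh_order(width, height, tile_size, seed=0):
    """Yield one tile rectangle (x, y, w, h) per frame, without repetition.

    Every tile of the screen grid is visited exactly once per cycle; the
    order is reshuffled when a cycle completes (an assumed policy).
    """
    cols = (width + tile_size - 1) // tile_size
    rows = (height + tile_size - 1) // tile_size
    tiles = [(cx, cy) for cy in range(rows) for cx in range(cols)]
    rng = random.Random(seed)
    while True:
        rng.shuffle(tiles)
        for cx, cy in tiles:
            x = cx * tile_size
            y = cy * tile_size
            # Clip the last row and column of tiles to the screen bounds.
            yield (x, y, min(tile_size, width - x), min(tile_size, height - y))


# 512x512 screen with 64x64 tiles, as used for SSAO and shadows: 64 tiles.
schedule = tile_refresh_order(512, 512, 64)
first_cycle = [next(schedule) for _ in range(64)]
```

Each frame, the pixels inside the yielded rectangle would be recomputed from scratch instead of being read from the history cache, which bounds how long any pixel can accumulate reprojection blur.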
Figure 4.2: SSAO with 32 samples. The left image does not use the tiled update strategy. The right image uses the tiled update strategy.
Figure 4.3: Indirect illumination with 512 VPLs. The left image does not use the tiled update strategy. The right image uses the tiled update strategy.
Figure 4.4: The shading results are discontinuous on the borders of the refreshed regions.
We run our implementation on a desktop PC with an Intel Core i7 930 CPU at 2.8 GHz, 8 GB of RAM, and an NVIDIA GeForce GTX 480 video card. All images are rendered at a resolution of 512×512, and the tested scenes are static. We compute one-bounce indirect illumination with 512 VPLs at half resolution and interpolate the solution to the full-resolution image. All our SSAO results are rendered with 32 samples.
We also implement the SSAO method of Mittring [7] and accumulate its cached SSAO information. We then compare it with the results obtained using a Gaussian blur, as shown in Figure 4.5. Figure 4.6 compares our proposed SSAO method, which uses the cached information, with the results obtained using a Gaussian blur. Figure 4.7 shows the indirect illumination results combined with direct illumination and compares them with the results that use the cached information. Figure 4.8 shows the pseudo soft shadows computed by the PCF technique with 49 samples and compares them with the results that use the cached information to improve the quality. Figure 4.9 shows the final converged results, composed of ambient occlusion, one-bounce indirect illumination, and direct illumination. Table 4.1 shows the average timings of the results in Figure 4.9 with and without reusing past information. Tables 4.2 to 4.4 show the converged results of large scenes and the frame rates of their walkthroughs with and without using cached information from previous frames.
Figure 4.5: Left column: Buddha & Box scene. Right column: Dragon & Box scene.
Note that the over-blurred results are in (b) and (e), and the finer converged results are in (c) and (f). Our approach also works well for the SSAO technique proposed by Mittring [7].
(a). 16 samples/pixel
(b). 16 samples/pixel + blur
(c). 16 samples/pixel + cached data
(d). 16 samples/pixel
(e). 16 samples/pixel + blur
(f). 16 samples/pixel + cached data
Figure 4.6: Left column: Buddha & Box scene. Right column: Dragon & Box scene.
Note that the over-blurred results are in (b) and (e), and the finer converged results are in (c) and (f). Our approach works well on both scenes.
(a). 32 samples/pixel
(b). 32 samples/pixel + blur
(c). 32 samples/pixel + cached data
(d). 32 samples/pixel
(e). 32 samples/pixel + blur
(f). 32 samples/pixel + cached data
Figure 4.7: Left column: without cached information. Right column: with cached information. The images in the right column have fewer banding artifacts than those in the left column.
(a). 512 VPLs/pixel
(b). 512 VPLs/pixel
(c). 512 VPLs/pixel
(d). 512 VPLs/pixel + cached data
(e). 512 VPLs/pixel + cached data
(f). 512 VPLs/pixel + cached data
Figure 4.8: Left column: Buddha & Box scene. Right column: Sponza scene. The hard shadows have aliasing artifacts in (a) and (d). The smooth shadows are produced by the PCF technique in (b) and (e). The high-quality soft shadows are produced by using the PCF technique and the cached shadow values in (c) and (f).
(a). Hard shadows
(b). 49 samples/pixel
(c). 49 samples/pixel + cached data
(d). Hard shadows
(e). 49 samples/pixel
(f). 49 samples/pixel + cached data
(a). Direct lighting
(b). Direct lighting + SSAO
(c). Direct lighting + SSAO + Indirect lighting
(d). Zoom in
Figure 4.9: Four different scenes, from the top row to the bottom row. All images are computed using our approach. Columns (a) to (c) display different effects such as pseudo soft shadows, ambient occlusion, and one-bounce indirect illumination.
Table 4.1 shows the experimental results of our approach. We compare the difference in performance for various scenes from Figure 4.9 with and without reusing past information stored in history cache. All the scenes have the same camera path for their walkthroughs. We measure the average timings of SSAO, one-bounce
Table 4.1: The comparison of the performance for different scenes with and without reusing past data.
Dabrovic Sponza (66,450 faces). Panels: Direct illumination; SSAO; Indirect illumination; Direct and Indirect illumination + SSAO.
Table 4.2: Characteristic walkthroughs for the Sponza scene. Rendering with cached information performs better in most cases than rendering without it.
Sibenik (75,284 faces). Panels: Direct illumination; SSAO; Indirect illumination; Direct and Indirect illumination + SSAO.
Table 4.3: Characteristic walkthroughs for the Sibenik scene. Rendering with cached information performs better in most cases than rendering without it.
Conference Room (331,179 faces). Panels: Direct illumination; SSAO; Indirect illumination; Direct and Indirect illumination + SSAO.
Table 4.4: Characteristic walkthroughs for the conference room. Rendering with cached information performs better in most cases than rendering without it.
Tables 4.2 to 4.4 show that our approach also works well for large-scale scenes.
Note that more pixels have to be recomputed in a more complex scene when the camera moves, as shown in Figure 4.10.
Figure 4.10: A simple scene composed of a dragon and a box (left) and a complex scene, the conference room (right). When the camera moves left, more pixels (marked red) have to be recomputed in the right image than in the left image.
Chapter 5
Conclusion and Future Work
In this thesis, we propose a novel approach that utilizes temporal coherence to enhance screen-space global illumination algorithms such as reflective shadow maps and screen-space ambient occlusion. We demonstrate that both quality and performance can be improved by reusing previous results. In addition, our algorithm is easy to implement. However, we can only handle one-bounce diffuse indirect lighting, and the light source must be static.
In the future, we want to extend our algorithm to handle dynamic scenes. For example, whenever an object moves, both the global illumination effects on this object and the global illumination effects caused by it change. Therefore, efficiently determining which regions of the screen have to be recomputed is a difficult problem. Even though we can create a movement mask by rasterizing the bounding volumes of the moving objects, this does not fully resolve the problem.
One reason is that if the moving object is big, its projection might cover a large portion of the movement mask, so more pixels have to be updated. The other is that we also need to know which pixels on the screen are influenced by the moving objects.
Finally, we find that over-blurred results occur because we reuse the cached results over many frames. Consequently, we would like to investigate how to better preserve the accuracy of the reused results.
References
[1] Bunnell, M. 2005. Dynamic Ambient Occlusion and Indirect Lighting. In GPU Gems 2, M. Pharr, Ed. Addison Wesley, Mar., ch. 2, 223-233.
[2] Bavoil, L., Sainz, M., and Dimitrov, R. 2008. Image-space horizon-based indirect illumination with clustered visibility. In Proceedings of Vision, Modeling, and Visualization Workshop, 2009.
[5] Keller, A. 1997. Instant Radiosity. In SIGGRAPH '97, 49-56.
[6] Landis, H. 2002. Production-ready global illumination. Course notes for SIGGRAPH 2002 Course 16, RenderMan in Production.
[7] Mittring, M.: Finding next gen: CryEngine 2. In SIGGRAPH '07: ACM SIGGRAPH 2007 courses, ACM, pp. 97-121.
[8] Nehab, D., Sander, P.V., and Isidoro, J.R.: The real-time reprojection cache. In ACM SIGGRAPH 2006 Sketches, ACM Press, p. 185.
[9] Nehab, D., Sander, P.V., Lawrence, J., Tatarchuk, N., Isidoro, J.R.: Accelerating real-time shading with reverse reprojection caching. In: Graphics Hardware (2007), p. 25-35.
[10] Reeves, W. T., Salesin, D. H., and Cook, R. L. 1987. Rendering antialiased shadows with depth maps. In Computer Graphics (Proceedings of SIGGRAPH '87), vol. 21, 283-291.
[11] Ritschel, T., Grosch, T., Kim, M. H., Seidel, H.-P., Dachsbacher, C., and Kautz, J. 2008. Imperfect shadow maps for efficient computation of indirect illumination. ACM Trans. Graph. (Proc. SIGGRAPH Asia) 27, 5, 129:1-129:8.
[12] Scherzer, D., Jeschke, S., Wimmer, M.: Pixel-correct shadow maps with temporal reprojection and shadow test confidence. In Kautz, J., Pattanaik, S., eds.: Rendering Techniques 2007 (Proceedings Eurographics Symposium on Rendering), p. 45-50.
[13] Scherzer, D., Schwärzler, M., Mattausch, O., and Wimmer, M. 2009. Real-Time Soft Shadows Using Temporal Coherence. In Advances in Visual Computing: 5th International Symposium on Visual Computing (ISVC 2009), Springer, Lecture Notes in Computer Science, 13-24.
[14] Schwärzler, M., Luksch, C., Scherzer, D., and Wimmer, M. 2013. Fast Percentage Closer Soft Shadows using Temporal Coherence. In Proceedings of ACM Symposium on Interactive 3D Graphics and Games 2013, pages 79-86.
[15] Shanmugam, P., and Arikan, O. 2007. Hardware Accelerated Ambient Occlusion Techniques on GPUs. In Proceedings of ACM Symposium on Interactive 3D Graphics and Games. ACM, B. Gooch and P.-P.J. Sloan, Eds., 73-80.
[16] Yang, J.C., Hensley, J., Grün, H., and Thibieroz, N.: Real-Time Concurrent Linked List Construction on the GPU. Comput. Graph. Forum 29(4): 1297-1304 (2010)
[17] Zhang, F., Sun, H., Xu, L., and Lun, L. K. 2006. Parallel-Split shadow maps for large-scale virtual environments. In VRCIA '06: Proceedings of the 2006 ACM International Conference on Virtual Reality Continuum and its Applications, ACM, New York, NY, USA, 311-318.