CHAPTER 1 Introduction
1.3 Thesis Organization
The rest of the thesis is structured as follows. Chapter 2 reviews related work on stochastic rasterization of motion and defocus blur. Chapter 3 describes our two-stage culling algorithm for stochastic rasterization. Chapter 4 describes our implementation details, results, and system performance. Finally, Chapter 5 concludes with limitations and future work.
Chapter 2
Related Works
In this chapter, we review related previous work. Existing methods for rendering blur effects often rely on post-processing; we do not discuss them here. We first focus on traditional sampling methods in the 5D space of pixel location (x,y), time t, and lens position (u,v). We then briefly survey stochastic rasterization techniques that reduce the computation cost of rendering blur.
Haeberli and Akeley [7] proposed a hardware-supported method called the accumulation buffer. It renders multiple images into the buffer and averages the pixel values to obtain the final color. If enough images are rendered, this brute-force method produces a good result, but the computation cost is very high. If the number of rendered images is reduced, “ghosting” artifacts can appear.
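The averaging step of the accumulation buffer can be illustrated with a minimal sketch. It assumes each rendered image is a flat array of grayscale values of equal size; the function name averageImages is hypothetical and is not taken from [7].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Average N equally sized grayscale images pixel by pixel,
// as an accumulation buffer would after N render passes.
std::vector<float> averageImages(const std::vector<std::vector<float>>& images) {
    assert(!images.empty());
    std::vector<float> result(images[0].size(), 0.0f);
    for (const auto& image : images) {
        assert(image.size() == result.size());
        for (std::size_t i = 0; i < result.size(); ++i) {
            result[i] += image[i];  // accumulate one render pass
        }
    }
    for (float& value : result) {
        value /= static_cast<float>(images.size());  // normalize by pass count
    }
    return result;
}
```

With few passes, each pass contributes a large, discrete share of the final pixel value, which is where the ghosting mentioned above comes from.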
Stochastic rasterization renders the image once with multiple samples per pixel. Each sample has its own coordinates (x,y,t,u,v) in the 5D sampling space.
Fatahalian et al. [4] presented the InterleaveUVT algorithm. It uses as many samples per pixel as the accumulation buffer approach would use images; for example, 4 images correspond to 4 samples per pixel. These samples, however, come from different time steps and lens locations. For example, if we have 16 images at different time steps, each pixel uses only 4 of them, and each pixel uses a different set of 4 images. This is an enhancement of the accumulation buffer. However, only a discrete set of individual u, v, t values can be used, which causes banding artifacts.
Akenine-Möller et al. [1] create a bounding box in screen space for each moving triangle and test all stochastic samples inside this bounding box. McGuire et al. [13] tighten this bounding box to a 2D convex hull, which makes the tested area smaller. However, most of the samples inside these bounding boxes or 2D convex hulls do not hit the triangle. For example, for a pixel that is inside the triangle at time t = 0, samples with high t values are unlikely to be covered if the motion is large. Solving this sample visibility problem is a major challenge in making the sample test more efficient: if we can reject samples that cannot hit the triangle, the sample test efficiency (STE) increases.
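As a small illustration of the STE measure, the following sketch computes it as the fraction of tested samples that hit the triangle. The function name and the convention of returning 0 when no sample was tested are assumptions made for this example.

```cpp
#include <cassert>

// Sample test efficiency: fraction of tested samples that actually hit
// the triangle. Returns 0 when no sample was tested (an assumed convention).
double sampleTestEfficiency(long hits, long tested) {
    assert(hits >= 0 && hits <= tested);
    return tested == 0 ? 0.0
                       : static_cast<double>(hits) / static_cast<double>(tested);
}
```

For instance, if 8 of 32 tested samples hit, the STE is 0.25; culling 16 of the 24 misses before the full test raises it to 8/16 = 0.5 without changing the image.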
Laine et al. [10] use a dual-space culling algorithm to solve the visibility test. For defocus blur, they use the linear relationship between the lens location and the space location. By interpolating the two axis-aligned bounding boxes of the triangle at lens locations (0,0) and (1,1), they obtain the maximum and minimum lens locations for the pixel. For motion blur, the world space affine motion is not affine in
where γ is the viewing direction. In this equation, δ is linear with and ω. Besides and ω are both linear with time t. So we can say δ is linear with t. They use this linear relationship to get the time bounds by linear interpolation.
Munkberg et al. [15] propose a hyperplane culling algorithm. They use the equation relating object location, lens coordinate, and time coordinate to find linear bounds in ut- and vt-space. They also create a hyperplane in xyuvt-space for each triangle edge to cull samples. This hyperplane culling is selectively enabled when the blurred area is large enough. Our method uses a similar idea to create linear bounds in clip space using the triangle similarity relationship. By using these bounds to reduce the number of sample tests, we obtain better performance.
Chapter 3 Algorithm
In this thesis, we introduce a novel 2-stage sample culling test for stochastic rasterization of motion and defocus blur. We assume that the motion of triangles is linear in world space during the shutter interval. Each pixel is sampled at different 5D sample locations (x,y,t,u,v). For each triangle, the steps of our algorithm are as follows. Note that all steps take place in shaders.
1. For each triangle, create the corresponding bounding geometry in screen space: a 2D convex hull (motion blur only) or a bounding box.
2. For each pixel inside the bounding geometry, test the stochastic 5D samples.
3. For each sample, apply the 1st stage culling test and cull the sample if it lies outside the rough camera lens bounds.
4. If the sample passes the 1st stage test, apply the 2nd stage test. This test uses the triangle relationship in clip space together with the vertex information to find the time bounds for the sample's (u,v) location.
5. Only samples that pass both the 1st and 2nd stages need a complete triangle intersection test.
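The control flow of steps 2 to 5 can be sketched on the CPU as follows. This is only an illustration of the ordering of the tests, not the shader code: the three tests are passed in as callables, and the names Sample, CullStats, and processPixel are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct Sample { float t, u, v; };  // time and lens coordinates of one sample

struct CullStats { int tested = 0; int fullTests = 0; int hits = 0; };

// Per-pixel sample loop: stage 1 (lens bounds), stage 2 (time bounds),
// then the complete intersection test only for the surviving samples.
CullStats processPixel(const std::vector<Sample>& samples,
                       const std::function<bool(const Sample&)>& passesStage1,
                       const std::function<bool(const Sample&)>& passesStage2,
                       const std::function<bool(const Sample&)>& hitsTriangle) {
    assert(passesStage1 && passesStage2 && hitsTriangle);
    CullStats stats;
    for (const Sample& s : samples) {
        ++stats.tested;
        if (!passesStage1(s)) continue;  // culled by the lens bounds
        if (!passesStage2(s)) continue;  // culled by the time bounds
        ++stats.fullTests;               // only now pay for the full test
        if (hitsTriangle(s)) ++stats.hits;
    }
    return stats;
}
```

The cheaper lens-bound test runs first so that the time-bound test is evaluated only for samples that can still be visible.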
The rest of this chapter is organized as follows. Section 3.1 shows how to render motion blur and defocus blur images with a stochastic rasterizer. Section 3.2 describes the initial 2D convex hull/bounding box. Section 3.3 describes the 1st stage culling test and Section 3.4 the 2nd stage test. Finally, Section 3.5 describes an application that controls the defocus blur range, following Munkberg et al. [17].
3.1 Stochastic Rasterization
To render motion-blurred images, we need the triangles at time t = 0 and t = 1 as input, that is, the triangle positions when the shutter opens and closes. Then, for each pixel covered by the moving triangle, we test stochastic samples at different time coordinates. This is similar to a ray intersection test: we shoot a ray from the pixel and test whether it intersects the triangle at time t. If it does, we add the color value at the intersection point; otherwise we discard the sample. Finally, we average the colors of the valid samples to obtain the pixel color.
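The two building blocks of this procedure, moving a vertex to the sample time and averaging the valid samples, can be sketched as below. The sketch assumes linear vertex motion between shutter open and close and uses single-channel colors for brevity; the names Vec3, positionAtTime, and resolvePixel are hypothetical.

```cpp
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

// Vertex position at shutter time t in [0,1], assuming linear motion
// between the position at shutter open (p0) and shutter close (p1).
Vec3 positionAtTime(const Vec3& p0, const Vec3& p1, float t) {
    assert(t >= 0.0f && t <= 1.0f);
    return { (1.0f - t) * p0.x + t * p1.x,
             (1.0f - t) * p0.y + t * p1.y,
             (1.0f - t) * p0.z + t * p1.z };
}

// Average the colors of the samples that hit the triangle.
// Returns false (and leaves outColor untouched) if no sample hit.
bool resolvePixel(const std::vector<float>& hitColors, float& outColor) {
    if (hitColors.empty()) return false;
    float sum = 0.0f;
    for (float c : hitColors) sum += c;
    outColor = sum / static_cast<float>(hitColors.size());
    return true;
}
```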
To render defocus-blurred images, we need the lens radius to compute the circle of confusion (CoC). We use a simple physically based CoC model that depends linearly on the depth w in clip space, i.e., CoC(w) = a + wb. The parameters a and b are constants derived from the camera’s aperture size and focal distance. An out-of-focus vertex appears at different positions from different camera lens coordinates: a vertex at position (x,y,w) in clip space, seen from camera lens position (u,v), moves to (x + uc, y + vc, w), where c = a + wb. For a given pixel, only one lens position can see this vertex. The intersection test is similar to the previous one: the ray is tested against the triangle shifted according to the lens coordinates.
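The linear CoC model and the lens-dependent vertex shift can be written down directly. The following is a minimal sketch that takes a and b as given constants and writes the clip-space depth as w; the names ClipVertex, circleOfConfusion, and shiftByLens are hypothetical.

```cpp
#include <cassert>

struct ClipVertex { float x, y, w; };

// Circle of confusion, linear in clip-space depth w: CoC(w) = a + w * b.
float circleOfConfusion(float w, float a, float b) {
    return a + w * b;
}

// Vertex as seen from lens position (u, v) in [-1,1]^2:
// x and y are shifted by the lens offset scaled by the CoC, depth is unchanged.
ClipVertex shiftByLens(const ClipVertex& p, float u, float v, float a, float b) {
    assert(u >= -1.0f && u <= 1.0f && v >= -1.0f && v <= 1.0f);
    const float c = circleOfConfusion(p.w, a, b);
    return { p.x + u * c, p.y + v * c, p.w };
}
```

With a = -1 and b = 0.5, a vertex at depth w = 2 has a CoC of zero and is therefore in focus, while a vertex at w = 4 has a CoC of 1 and shifts by the full lens offset.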
When both effects are present, the intersection test must find the triangle position for camera lens position (u,v) and time t. Many samples are not visible for a given pixel, and our goal is to increase the hit ratio of the sample test.
3.2 Bounding the Moving/Defocus Range
For each triangle, we need to rasterize a bounding geometry large enough to cover the triangle’s entire motion. The most conservative bound is the whole near plane of the viewing frustum. The tightest bounding geometry is the set of front-facing surfaces of the triangle’s swept volume, but this bound is curved on the sides. We use the 2D bounding box of the moving triangle at the near plane to bound the moving triangle. When rendering in-focus images, we replace the bounding box with a convex hull.
To create the 2D bounding box, we need the vertex positions of the triangle at the beginning and the end of the shutter time. The triangle is converted into a screen-space bounding box in the geometry shader. First we project these 6 vertices into screen space and obtain the maximum and minimum boundaries of the bounding box. Then we extend these boundaries outward using the camera lens radius and the focus depth: we compute the maximum circle of confusion from the minimum and maximum depths of the triangle vertices. Figure 3.1 shows how the 2D bounding box is created. If the triangle crosses the z = 0 plane, we find the intersection points at the near plane from the 6 triangle edges and the 3 paths of the moving vertices. Figure 3.2 shows this case.
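The common case, in which no vertex crosses the z = 0 plane, can be sketched as follows. The sketch assumes the 6 vertices are already projected to screen space and that the maximum circle of confusion radius has been computed beforehand; the names Point2, Box2, and boundMovingTriangle are hypothetical, and the z = 0 crossing case is not handled.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

struct Point2 { float x, y; };
struct Box2 { float minX, minY, maxX, maxY; };

// Bounding box of the 6 projected vertices (3 at t = 0, 3 at t = 1),
// grown on every side by the largest circle of confusion radius.
Box2 boundMovingTriangle(const std::array<Point2, 6>& projected, float maxCoC) {
    assert(maxCoC >= 0.0f);
    Box2 box = { projected[0].x, projected[0].y, projected[0].x, projected[0].y };
    for (const Point2& p : projected) {
        box.minX = std::min(box.minX, p.x);
        box.minY = std::min(box.minY, p.y);
        box.maxX = std::max(box.maxX, p.x);
        box.maxY = std::max(box.maxY, p.y);
    }
    // Extend the box outward to cover the defocus blur.
    box.minX -= maxCoC;
    box.minY -= maxCoC;
    box.maxX += maxCoC;
    box.maxY += maxCoC;
    return box;
}
```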
If we render an in-focus image, only the motion blur effect is added, and we make the bound tighter by using the 2D convex hull. We use the Graham scan algorithm [Graham 1972] [5] to create this 2D convex hull. We do not use this technique for defocus blur because computing all 6 circles of confusion at different depths is too costly, as shown in Figure 3.3.
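For illustration, a convex hull of the projected vertices can be computed as in the sketch below. Note that it uses Andrew's monotone chain, a closely related sort-and-scan method, rather than the Graham scan itself, because it avoids angle computations; the output hull is the same. The names Point2, cross, and convexHull are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Point2 { float x, y; };

// Twice the signed area of triangle (o, a, b); > 0 for a counter-clockwise turn.
static float cross(const Point2& o, const Point2& a, const Point2& b) {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Convex hull in counter-clockwise order (Andrew's monotone chain),
// starting from the point with the smallest x (then smallest y).
std::vector<Point2> convexHull(std::vector<Point2> pts) {
    assert(pts.size() >= 3);
    std::sort(pts.begin(), pts.end(), [](const Point2& a, const Point2& b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    });
    std::vector<Point2> hull(2 * pts.size());
    std::size_t k = 0;
    for (std::size_t i = 0; i < pts.size(); ++i) {    // build the lower hull
        while (k >= 2 && cross(hull[k - 2], hull[k - 1], pts[i]) <= 0.0f) --k;
        hull[k++] = pts[i];
    }
    const std::size_t lowerSize = k + 1;
    for (std::size_t i = pts.size() - 1; i-- > 0;) {  // build the upper hull
        while (k >= lowerSize && cross(hull[k - 2], hull[k - 1], pts[i]) <= 0.0f) --k;
        hull[k++] = pts[i];
    }
    hull.resize(k - 1);  // the last point repeats the first one
    return hull;
}
```

For 6 input vertices the hull has at most 6 corners, so the cost per triangle is small and constant.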
Figure 3.1: Creating the 2D bounding box. (a) Without defocus. (b) With defocus (red: circles of confusion).
Figure 3.2: Crossing the Z=0 plane
Figure 3.3: 2D convex hull
3.3 Culling Stage 1
Consider a vertex of a triangle in clip space. We assume linear motion, p(t) = (1-t)q + tr, where q is the starting point and r is the ending point. For defocus blur, in the signed clip space, we assume that the circle of confusion is linear in time, that is, c(t) = a + w(t)b. In stage 1 culling, the samples outside the camera lens bounds are culled.
The computation of these bounds is relatively simple because the clip-space vertex position is linear in the lens coordinates. We normalize the lens coordinates so that (u,v) ∈ [-1,1]^2. For each pixel covered by the triangle, we generate stochastic samples with different parameters (t,u,v). We then need to test whether the triangle is visible from camera lens position (u,v) at time t.
In Figure 3.4, a ray is shot from the camera through a screen pixel. We take the point on this ray at the focus depth as the focus point. In xw space, we connect each of the 6 triangle vertices with the focus point and obtain 6 points at depth w = 0. We take the maximum and minimum of these as the camera lens bounds.
For each vertex we compute
u = ( nx ⋅ p(t) ) / ( p(t).z - focusDepth ),
where nx = ( -focusDepth , 0 , focusPoint.x ) is the normal vector of the pixel ray. We then divide u by the camera lens radius to obtain the normalized camera lens bound in the u dimension. The bound in the v dimension is obtained in the same way with ny = ( 0 , -focusDepth , focusPoint.y ).
Figure 3.4: A ray is shot from the camera to a screen pixel and intersects the focus point. All 6 triangle vertices are connected with this focus point, and the connecting lines intersect the camera lens at depth w=0
With these rough camera lens bounds, we cull the stochastic samples whose u or v coordinates lie outside the lens bounds, which further increases the STE.
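The stage 1 bound in the u dimension can be sketched as follows, using the expression u = (nx ⋅ p) / (p.z - focusDepth) with nx = (-focusDepth, 0, focusPoint.x) expanded for the xw plane. Two points are assumptions of this sketch and are not stated in the text: a vertex lying exactly at the focus depth makes the expression undefined, so the sketch falls back to the full lens range, and the names VertexXW, LensBounds, lensBoundsU, and passesStage1 are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <limits>

struct VertexXW { float x, w; };        // one vertex in the xw plane
struct LensBounds { float minU, maxU; };

// Lens coordinate at depth w = 0 hit by the line through the vertex and
// the focus point (focusX, focusDepth): (-F * x + focusX * w) / (w - F).
float lensCoordinate(const VertexXW& p, float focusX, float focusDepth) {
    return (-focusDepth * p.x + focusX * p.w) / (p.w - focusDepth);
}

// Rough normalized lens bounds in u from the 6 vertices (3 at t = 0, 3 at t = 1).
LensBounds lensBoundsU(const std::array<VertexXW, 6>& vertices,
                       float focusX, float focusDepth, float lensRadius) {
    assert(lensRadius > 0.0f);
    LensBounds bounds = { std::numeric_limits<float>::max(),
                          std::numeric_limits<float>::lowest() };
    for (const VertexXW& p : vertices) {
        if (std::fabs(p.w - focusDepth) < 1e-6f) {
            return { -1.0f, 1.0f };  // assumed fallback: keep the whole lens
        }
        const float u = lensCoordinate(p, focusX, focusDepth) / lensRadius;
        bounds.minU = std::min(bounds.minU, u);
        bounds.maxU = std::max(bounds.maxU, u);
    }
    return bounds;
}

// A sample survives stage 1 only if its lens coordinate is inside the bounds.
bool passesStage1(float u, const LensBounds& bounds) {
    return u >= bounds.minU && u <= bounds.maxU;
}
```

The v dimension would use the same functions with the y coordinates and focusPoint.y in place of x and focusPoint.x.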
3.4 Culling Stage 2
After the 1st stage culling, samples outside the camera lens bounds have been culled. The remaining samples go into the 2nd stage test. For each sample's (u,v) lens coordinates, we find the corresponding time bounds.
Consider the triangle vertices in clip space. A vertex appears translated when seen from different camera lens positions (u,v). For example, a vertex p = ( x , y , w ) seen from camera lens position ( u , v ) is translated to p’ = ( x + uc , y + vc , w ), where c is the circle of confusion of this vertex: c = a + wb. While the shutter is open, we need to know when this pixel can see the triangle. It is hard to compute the intersection time in screen space because the vertex’s linear motion is not linear with respect to the viewing direction, as shown in Figure 3.5.
Figure 3.5: Ray intersection with moving vertex
In order to obtain the intersection time, we find a triangle relationship in clip space. First, for each sample, we compute the two vertex positions seen from camera lens position (u,v) at time t = 0 and t = 1, that is,
x’(0) = x(0) + uc(0) , x’(1) = x(1) + uc(1).
The y coordinate is obtained in the same way. We then find two more points: the intersections of the ray shot from the camera through the pixel with the depth values of the triangle vertex when the shutter opens and closes.
Figure 3.6: Ray intersection with two similar triangles
Now we have 4 points in xw clip space. Two are the vertex positions seen from camera lens position (u,v), with their own circles of confusion c(t), when the shutter opens and closes. The other two are the intersection points of the viewing direction at the same depth values as the first two. Figure 3.6 shows all four points. We can then find two similar triangles whose corresponding edges are proportional in length:
Then the intersection time t can be obtained as follows:
(1)
We compute this for all 3 pairs of triangle vertices at time t = 0 and t = 1 in xw and yw space, which gives the time bound [ tmin , tmax ]. All stochastic samples whose t coordinate lies outside this bound are culled.
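One way to compute such an intersection time for a single vertex is sketched below. It assumes the lens-shifted vertex moves linearly in xw clip space and takes the signed horizontal offsets between the vertex and the pixel ray at t = 0 and t = 1 as the corresponding edges of the two similar triangles, so that the crossing time is e0 / (e0 - e1). This follows from the linear-motion assumption and is offered as an illustration, not as the exact form of Equation (1); the name crossingTime and the parameter raySlope (the ray's x per unit w) are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Time at which a lens-shifted vertex, moving linearly in xw clip space from
// (x0, w0) at t = 0 to (x1, w1) at t = 1, crosses the pixel ray x = raySlope * w.
// Returns false if the vertex keeps a constant offset from the ray.
bool crossingTime(float x0, float w0, float x1, float w1,
                  float raySlope, float& tOut) {
    assert(std::isfinite(raySlope));
    const float e0 = x0 - raySlope * w0;  // signed offset from the ray at t = 0
    const float e1 = x1 - raySlope * w1;  // signed offset from the ray at t = 1
    if (std::fabs(e0 - e1) < 1e-12f) return false;  // offset never changes
    tOut = e0 / (e0 - e1);  // ratio of corresponding edges
    return true;
}
```

For example, a vertex moving from x = 0 to x = 4 at constant depth w = 2 crosses the ray x = 0.5w (that is, x = 1) at t = 0.25. A result outside [0,1] means the crossing happens outside the shutter interval.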
As described above, we find the similar triangles in clip space and use them to compute the bounds in the time dimension. However, this requires computing each vertex position for every stochastic sample’s (u,v) lens position. That is, for each sample, we have to compute the triangle vertex position p(0) = ( x + uc(0) , y + vc(0) , w ). This is too costly.
Rewriting Equation (1) with each x replaced by x + u c(t), we have:
( ) ( )
This is a rational function in u that is also monotone, so it can be bounded by two linear functions, as shown in Figure 3.7.
Figure 3.7: The rational function is bounded by two linear bounds
Figure 3.8: Combination of all 3 pairs of linear bounds into 2 linear bounds (red lines)
The two linear bounds are as follows:
Green : ( ) ( ) ( ) ( ( ) ( )) Red : ( ) ( ) ( ) ( ) ( )
Each of the 3 pairs of triangle vertices produces 2 linear bounds. We combine all of them into 2 bounds: a maximum bound [t(-1)MAX , t(1)MAX] and a minimum bound [t(-1)min, t(1)min]. With these two linear bounds, we can test all stochastic samples at the same pixel (same viewing direction). If a sample's time coordinate is larger than the upper bound or smaller than the lower bound, the sample is culled. That is, the two linear bounds need to be computed only once per pixel.
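The per-sample test against the two combined bounds is cheap, as the following sketch shows. It assumes each bound is stored by its values at u = -1 and u = 1 and evaluated by linear interpolation in between; the names LinearBound, evaluate, and culledByTimeBounds are hypothetical, and only the u dimension is shown.

```cpp
#include <cassert>

// A bound on t that is linear in the lens coordinate u, stored by its
// values at u = -1 and u = +1.
struct LinearBound { float atMinus1, atPlus1; };

// Value of the bound at lens coordinate u in [-1,1].
float evaluate(const LinearBound& b, float u) {
    assert(u >= -1.0f && u <= 1.0f);
    const float s = 0.5f * (u + 1.0f);  // 0 at u = -1, 1 at u = +1
    return (1.0f - s) * b.atMinus1 + s * b.atPlus1;
}

// Returns true if the sample (u, t) lies outside the time bounds and is culled.
bool culledByTimeBounds(float u, float t,
                        const LinearBound& lower, const LinearBound& upper) {
    return t < evaluate(lower, u) || t > evaluate(upper, u);
}
```

Since the four stored values depend only on the pixel and the triangle, every sample of the pixel reuses them and pays only for two interpolations and two comparisons.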
3.5 Controllable Defocus Blur
Users often want empirical control over depth-of-field parameters, for example to limit foreground blur or to extend the in-focus range while preserving the foreground and background. Munkberg et al. [17] proposed a user-controllable defocus blur method. We use their technique in our system.
We modify the circle of confusion by limiting its size near the focus depth:
C = 0 , if |w - focusDepth| < ε
C = a + wb , otherwise
The clip-space circle of confusion radius is assumed to be linear in the interior of a triangle. We can then easily increase the focus range or decrease the foreground blur. If a triangle vertex is inside the extended focus range, all samples pass the stage 1 test, and we set u = v = 0 to compute the stage 2 time bounds. Finally, we use the modified circle of confusion in the triangle intersection test and render the blurred image with the extended focus range.
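The modified circle of confusion can be sketched as a small function. The sketch assumes the in-focus range is symmetric around the focus depth, that is, the condition is on the absolute distance to the focus depth; the name controlledCoC and the parameter epsilon (half-width of the extended focus range) are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Circle of confusion with an extended in-focus range: zero within
// epsilon of the focus depth (assumed symmetric), a + w * b elsewhere.
float controlledCoC(float w, float a, float b, float focusDepth, float epsilon) {
    assert(epsilon >= 0.0f);
    if (std::fabs(w - focusDepth) < epsilon) {
        return 0.0f;  // inside the extended focus range: no blur
    }
    return a + w * b;
}
```

With a = -1, b = 0.5, a focus depth of 2, and epsilon = 0.5, a vertex at depth 2.25 is treated as perfectly sharp, while a vertex at depth 4 keeps its original CoC of 1.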
Chapter 4
Implementation and Results
We develop our system based on OpenGL and GLSL. Our stochastic sample culling algorithm is implemented in the pixel shader. On the CPU host, we bind the vertex stream and the transform matrix at shutter time t = 0. We also bind the texture coordinates and other shading attributes, such as vertex normals, as usual. The vertex positions at shutter time t = 1 are bound as an additional vertex attribute.
Vertex Shader:
We transform the vertices into clip space at shutter time t = 0 and t = 1. Then we pass all vertex attributes to the geometry shader.
Geometry Shader:
In this shader, we convert each triangle into a 2D bounding box or a 2D convex hull as described in Section 3.2. Since this destroys the triangle’s structure, we store all 6 original vertex positions as output variables. We also pack each vertex’s circle of confusion and the other vertex attributes from the previous shader into every emitted vertex.
Pixel Shader:
In the pixel shader, we process each pixel inside the 2D bounding box. The stochastic sample buffer is created in the CPU host program and passed to the pixel shader as a texture. The buffer has 3 channels: time t and camera lens coordinates u and v. The time values are uniformly distributed floating-point numbers in [0,1], and the camera lens values are uniformly distributed in [-1,1]. We set the buffer size to 128 x 128. Each pixel takes its stochastic samples from this texture.
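A host-side sketch of how such a sample buffer could be filled is shown below. The choice of random number generator, the seed parameter, and the names SampleTexel and makeSampleBuffer are assumptions made for this example; uploading the buffer as a three-channel floating-point texture is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

struct SampleTexel { float t, u, v; };  // one texel: time and lens coordinates

// Fill a width x height buffer with uniformly distributed samples:
// t in [0,1], u and v in [-1,1].
std::vector<SampleTexel> makeSampleBuffer(std::size_t width, std::size_t height,
                                          unsigned seed) {
    assert(width > 0 && height > 0);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<float> timeDist(0.0f, 1.0f);
    std::uniform_real_distribution<float> lensDist(-1.0f, 1.0f);
    std::vector<SampleTexel> buffer(width * height);
    for (SampleTexel& s : buffer) {
        s.t = timeDist(rng);
        s.u = lensDist(rng);
        s.v = lensDist(rng);
    }
    return buffer;
}
```

A call such as makeSampleBuffer(128, 128, 42u) produces the 128 x 128 table described above; a fixed seed keeps the sample pattern reproducible between runs.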
We also use multi-sample antialiasing (MSAA). For each pixel, we use a bit mask to store the hit/miss result of each sample. We use gl_SampleMask[] as the bit mask in OpenGL. The coverage for the current fragment becomes the logical AND of the coverage mask and the output gl_SampleMask. That is, setting a bit in gl_SampleMask to zero causes the corresponding sample to be considered uncovered for the purpose of multisample fragment operations. Finally, we average the pixel values whose corresponding mask bits are equal to one. If the bit mask is all zero, we discard the pixel.
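The mask logic can be mimicked on the CPU to make the bit operations explicit. This sketch only illustrates the arithmetic; in the shader the mask is written to gl_SampleMask[] and the AND is performed by the pipeline. The names buildSampleMask and finalCoverage are hypothetical, and the sketch assumes at most 32 samples per pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a bit mask with bit i set when sample i hit the triangle.
std::uint32_t buildSampleMask(const std::vector<bool>& hit) {
    assert(hit.size() <= 32);
    std::uint32_t mask = 0;
    for (std::size_t i = 0; i < hit.size(); ++i) {
        if (hit[i]) mask |= (std::uint32_t{1} << i);
    }
    return mask;
}

// Final coverage is the logical AND of the rasterizer's coverage mask
// and the mask written by the shader.
std::uint32_t finalCoverage(std::uint32_t rasterCoverage, std::uint32_t shaderMask) {
    return rasterCoverage & shaderMask;
}
```

A mask of zero corresponds to the case in which every sample missed and the pixel is discarded.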
If the stochastic sample passes our culling test and the triangle intersection test, we draw the pixel color with the corresponding texture coordinate or the corresponding color. The GPU program flow is shown in Figure 4.1.
Figure 4.1: GPU program workflow
In the remainder of this chapter, we present our result images, rendered in real time on a desktop PC with an Intel Core i7 950 CPU at 3.07 GHz and an NVIDIA GeForce GTX 580 video card. Our system is implemented in C++ using OpenGL and GLSL. All results are rendered at 1024 x 768 pixels.
We compare our stochastic rasterization culling algorithm with two other culling tests: Laine et al.’s dual-space culling algorithm [10] and the traditional culling algorithm with a bounding box. We test four different scenes. The Hand scene is tested with object motion blur (Figure 5.2). The FlyingDragonfly scene is tested with camera motion blur (Figure 4.4). The FairyForest scene is tested with both motion blur and defocus blur (Figure 4.5). We also show the image with an enlarged focus range (Figure 4.6). With the increased focus range, the whole face of the fairy can be seen clearly. Note that all scenes are rendered with 32 samples per pixel.
Figure 4.2: Defocus blur with different focus depths
Figure 4.3: RacingCar with/without motion blur
Figure 4.4: FlyingDragonfly with/without camera motion blur
Figure 4.5: FairyForest with/without both motion and defocus blur
Figure 4.6: With/without the enlarged focus range
Table 4.1 shows the STE results. We compare our algorithm with the other two culling tests. The Hand scene has 15k triangles, Car has 54k triangles, and Dragonfly and FairyForest have about 85k and 170k triangles, respectively. With motion blur only or defocus blur only, the STE of our algorithm is almost the same as that of the dual-space method, because our stage 1 and stage 2 culling tests act separately, like the uv test and the t test of Laine et al. [10]. However, our algorithm performs better when both effects are present.
Scene Bbox Dual Space Our
Table 4.1: STE results with 16spp. Higher is better.
Table 4.2 shows the computation time for each scene in milliseconds. Our algorithm takes slightly more computation time than Laine et al. [10] when rendering scenes with large motion, but it performs better when both effects are present.
Scene Bbox Dual Space Our
Table 4.2: Computation time results with 16spp. Lower is better.
Chapter 5
Conclusions and Future Work
In this thesis, we propose a 2-stage culling algorithm for stochastic rasterization.
We propose a new way to compute time bounds relative to the camera lens position in clip space, and we use these bounds to cull samples and obtain a higher STE.
In stage 1, we cull samples outside the camera lens bounds using the linear relationship between the camera lens position and the vertex position. In stage 2, we cull samples outside the time bounds. We use triangle similarity in clip space to find the intersection time easily, and we use this idea to compute two linear bounds that need to be computed only once per pixel. Our algorithm can handle motion blur and defocus blur simultaneously.
In the future, we would like to find further improvements that make the bounding geometry easier to compute and more accurate. We will also try to reduce the execution time, for example by extending the test from per-pixel to per-tile. Furthermore we can make the STE