
Experiments and Results

Hardware specification: The simulation code runs on a GPU with CUDA compute capability 1.1 or above. The GPU we used was an Nvidia GeForce 9600 GT, a mid-range consumer product of the GeForce 9 series launched in February 2008, with 64 CUDA cores, a 650 MHz graphics clock, and 57.6 GB/s of memory bandwidth.

The rendering code requires a GPU that supports OpenGL 4.0 or above. We ran the rendering code on an Nvidia GeForce GTX 480, launched in March 2010, which supports OpenGL 4.1 and has 480 CUDA cores, a 700 MHz graphics clock, and 177.4 GB/s of memory bandwidth.

The CPU we used was an Intel Core i7-930, launched in the first quarter of 2010, with 4 cores, 8 threads, and a 2.8 GHz clock speed.

Development environment: The operating system was Microsoft Windows XP Professional with Service Pack 3, and the installed GeForce driver was version 260.99.

The development tool was Microsoft Visual Studio 2008, with version 3.10 of the Nvidia GPU Computing SDK and version 2.0 of the BSGP compiler by Hou [10].

Results: Snapshots from the motion of a character shaking her head are shown in Figure 34. The numbers of key hairs, rendered hairs, and rendered primitives were 1750, 28000, and 560000 respectively. There were three lights in the scene, and two sets of opacity maps were generated for the two spotlights that cast shadows. The number of simulated particles was 10500, the FLIP grid resolution was 32x32x32, and the constraint solver ran 20 iterations per time step.

Figure 34: Snapshots of a character shaking her head.

Table 2: Number of grid cells versus FPS

The preconditioner used in the preconditioned conjugate gradient (PCG) method differed between the two versions: we chose the modified incomplete Cholesky (MIC) preconditioner, as in [11], for the CPU code, and the Jacobi preconditioner for the GPU code. The stronger preconditioner was used only in the CPU code because some of the computations it requires are difficult to realize on the GPU.

Figure 35: FPS versus number of grid cells for the CPU and GPU versions

Table 3: Number of grid cells versus number of PCG iterations


Table 3 shows the number of grid cells versus the number of PCG iterations for the GPU and CPU code. The tolerance for the PCG was 10^-5. As Table 3 shows, the CPU version's preconditioner makes the PCG converge in fewer iterations than the GPU version's. Even so, the overall performance of the GPU version was better than that of the CPU version, despite its inferior preconditioner.
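For reference, the following is a minimal sketch of a PCG solve with a Jacobi preconditioner, the variant we used on the GPU. It is written here as plain C++ for clarity; the matrix-free applyA callback and all names are illustrative, not taken from our BSGP code.

    #include <cmath>
    #include <vector>

    // Minimal PCG with a Jacobi preconditioner (illustrative sketch).
    // A is applied matrix-free through applyA(in, out); diag holds its
    // diagonal (assumed nonzero). Solves A x = b; x must be sized like b.
    // Returns the number of iterations used.
    template <class ApplyA>
    int pcgJacobi(ApplyA applyA, const std::vector<double>& diag,
                  const std::vector<double>& b, std::vector<double>& x,
                  double tol = 1e-5, int maxIter = 1000) {
        const size_t n = b.size();
        std::vector<double> r(b), z(n), p(n), Ap(n);
        applyA(x, Ap);                                   // r = b - A x
        for (size_t i = 0; i < n; ++i) r[i] -= Ap[i];
        for (size_t i = 0; i < n; ++i) z[i] = r[i] / diag[i]; // z = D^-1 r
        p = z;
        double rz = 0, rnorm0 = 0;
        for (size_t i = 0; i < n; ++i) { rz += r[i] * z[i]; rnorm0 += r[i] * r[i]; }
        rnorm0 = std::sqrt(rnorm0);
        for (int it = 0; it < maxIter; ++it) {
            applyA(p, Ap);
            double pAp = 0;
            for (size_t i = 0; i < n; ++i) pAp += p[i] * Ap[i];
            const double alpha = rz / pAp;
            double rnorm = 0;
            for (size_t i = 0; i < n; ++i) {
                x[i] += alpha * p[i];
                r[i] -= alpha * Ap[i];
                rnorm += r[i] * r[i];
            }
            if (std::sqrt(rnorm) < tol * rnorm0) return it + 1; // converged
            double rzNew = 0;
            for (size_t i = 0; i < n; ++i) {
                z[i] = r[i] / diag[i];        // reapply Jacobi preconditioner
                rzNew += r[i] * z[i];
            }
            const double beta = rzNew / rz;
            rz = rzNew;
            for (size_t i = 0; i < n; ++i) p[i] = z[i] + beta * p[i];
        }
        return maxIter;
    }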

Table 4: Computation time percentage in one time step

    Percentage of time spent in one time step    CPU      GPU
    Solving constraints                          31.4%    19.4%
    Volume method                                65.7%    76.6%

The volume method dominated the time step on both platforms; its iterations and the velocity transfer were the more time-consuming parts.
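The velocity transfer moves particle velocities onto the grid every step. A minimal sketch of one particle-to-grid splat with trilinear weights, which hints at why this stage is costly, is given below; the data layout and names are assumptions, not our actual code, and only the x velocity component is shown.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    struct Particle { float px, py, pz, vx, vy, vz; };

    // Illustrative particle-to-grid velocity splat with trilinear weights,
    // the kind of transfer a FLIP step performs each frame. gridVelX and
    // gridW are N*N*N arrays; h is the grid cell size.
    void particlesToGrid(const std::vector<Particle>& parts, int N, float h,
                         std::vector<float>& gridVelX, std::vector<float>& gridW) {
        auto idx = [N](int i, int j, int k) {
            return (std::size_t)(k * N + j) * N + i;
        };
        for (const Particle& p : parts) {
            // Position in grid coordinates and the cell it falls into.
            float gx = p.px / h, gy = p.py / h, gz = p.pz / h;
            int i0 = (int)std::floor(gx);
            int j0 = (int)std::floor(gy);
            int k0 = (int)std::floor(gz);
            float fx = gx - i0, fy = gy - j0, fz = gz - k0;
            // Scatter to the eight surrounding grid nodes.
            for (int dk = 0; dk <= 1; ++dk)
                for (int dj = 0; dj <= 1; ++dj)
                    for (int di = 0; di <= 1; ++di) {
                        int i = i0 + di, j = j0 + dj, k = k0 + dk;
                        if (i < 0 || j < 0 || k < 0 || i >= N || j >= N || k >= N)
                            continue;
                        float w = (di ? fx : 1 - fx) * (dj ? fy : 1 - fy)
                                * (dk ? fz : 1 - fz);
                        gridVelX[idx(i, j, k)] += w * p.vx; // weighted velocity
                        gridW[idx(i, j, k)]    += w;        // accumulated weight
                    }
        }
        // Normalize: each node holds the weighted average of nearby particles.
        for (std::size_t c = 0; c < gridW.size(); ++c)
            if (gridW[c] > 0) gridVelX[c] /= gridW[c];
    }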

We used the viewing distance to adjust the number of hair segments generated in the rendering stage. When the camera viewed the hair model closely, we needed a detailed model; when it viewed the hair model from far away, a coarse model sufficed. We therefore generated fewer segments when the camera was far and more segments when it was near. This was the level-of-detail (LOD) method we used for rendering; a sketch follows the next paragraph.

Table 5 shows the FPS at different viewing distances, and the result is plotted in Figure 36. Because the number of pixels processed affects rendering performance, we measured the FPS at each viewing distance at several screen resolutions. Figure 37 shows snapshots of the hair model adjusted for performance at different camera distances. We can observe that the FPS stopped increasing once the viewing distance grew beyond a certain value. This was because the rendering task and the simulation task ran simultaneously within one frame: even if the rendering task finished first, we had to wait until the simulation task finished before entering the next frame.

Once the simulation cost exceeded the rendering cost, further rendering speedups no longer showed up in the FPS, and the simulation became the performance bottleneck.
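A minimal sketch of the distance-based segment selection described above follows. The breakpoints and segment counts are made-up placeholders, not our tuned values.

    #include <algorithm>

    // Illustrative LOD rule: map camera distance to the number of segments
    // generated per hair in the tessellation stage. All constants below are
    // assumed placeholders; the real thresholds were tuned by hand.
    int segmentsPerHair(float cameraDistance) {
        const float nearDist = 0.5f;  // closer than this: full detail (assumed)
        const float farDist  = 2.5f;  // farther than this: coarsest (assumed)
        const int   maxSegs  = 32;    // segments at full detail (assumed)
        const int   minSegs  = 4;     // segments at coarsest detail (assumed)
        float t = (cameraDistance - nearDist) / (farDist - nearDist);
        t = std::min(1.0f, std::max(0.0f, t));           // clamp to [0, 1]
        return maxSegs - (int)(t * (maxSegs - minSegs)); // fewer when far
    }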

Table 5: Viewing Distance versus FPS


Figure 36: FPS versus viewing distance

Figure 37: Snapshots of the hair models at different levels of detail

We used 2x super-sample antialiasing (SSAA) with a down-sampling filter that takes five samples around each pixel. Figure 38 compares the results: the image with antialiasing turned off is shown on the left side and the image with antialiasing turned on is shown on the right side. The image on the right has fewer jagged lines, and the down-sampling filter makes the hair strands look thinner.
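The down-sampling step can be sketched as follows: each output pixel averages five samples from the super-sampled buffer, the center plus four axis-aligned neighbors. The tap offsets and equal weights here are assumptions for illustration, not necessarily the filter we shipped.

    #include <vector>

    struct RGBA { float r, g, b, a; };

    // Illustrative 5-tap down-sampling filter for 2x SSAA. dst must be
    // pre-sized to dstW * dstH; src is the super-sampled buffer.
    void downsample2x(const std::vector<RGBA>& src, int srcW, int srcH,
                      std::vector<RGBA>& dst, int dstW, int dstH) {
        auto at = [&](int x, int y) -> const RGBA& {
            x = x < 0 ? 0 : (x >= srcW ? srcW - 1 : x);   // clamp to edge
            y = y < 0 ? 0 : (y >= srcH ? srcH - 1 : y);
            return src[y * srcW + x];
        };
        const int dx[5] = { 0, -1, 1, 0, 0 };  // center + 4 neighbors (assumed)
        const int dy[5] = { 0, 0, 0, -1, 1 };
        for (int y = 0; y < dstH; ++y)
            for (int x = 0; x < dstW; ++x) {
                RGBA sum = { 0, 0, 0, 0 };
                for (int t = 0; t < 5; ++t) {
                    const RGBA& s = at(2 * x + dx[t], 2 * y + dy[t]);
                    sum.r += s.r; sum.g += s.g; sum.b += s.b; sum.a += s.a;
                }
                const float w = 1.0f / 5.0f;  // equal weights (assumed)
                dst[y * dstW + x] = { sum.r * w, sum.g * w, sum.b * w, sum.a * w };
            }
    }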

[Figure 36 plot: FPS versus viewing distance, measured at resolutions 800x600, 1024x712, 1152x802, and 1280x968.]

Figure 38: Comparison between antialiasing turned off/on

The comparison between hair curves with and without B-spline tessellation is shown in Figure 39. The curves with B-spline tessellation are shown on the left side and the curves without it on the right side. The curves on the left are smoother than those on the right.

Figure 39: Comparison between using/not using B-spline tessellation
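For reference, a uniform cubic B-spline segment can be evaluated from four consecutive control points with the standard basis, as in the sketch below. This illustrates the kind of tessellation compared above; it is not code from our renderer.

    // Evaluate one uniform cubic B-spline segment at parameter t in [0, 1]
    // from four consecutive control points p0..p3 (standard basis).
    struct Vec3 { float x, y, z; };

    Vec3 bsplineSegment(const Vec3& p0, const Vec3& p1,
                        const Vec3& p2, const Vec3& p3, float t) {
        float t2 = t * t, t3 = t2 * t;
        // Uniform cubic B-spline basis functions (each divided by 6).
        float b0 = (1 - 3*t + 3*t2 - t3) / 6.0f;
        float b1 = (4 - 6*t2 + 3*t3) / 6.0f;
        float b2 = (1 + 3*t + 3*t2 - 3*t3) / 6.0f;
        float b3 = t3 / 6.0f;
        return { b0*p0.x + b1*p1.x + b2*p2.x + b3*p3.x,
                 b0*p0.y + b1*p1.y + b2*p2.y + b3*p3.y,
                 b0*p0.z + b1*p1.z + b2*p2.z + b3*p3.z };
    }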

The comparison between the hair model with interpolated hairs and the hair model without them is shown in Figure 40. The model with interpolated hairs is shown on the left side and the model without them on the right side; the image on the left has many more hairs.

Figure 40: Comparison between with/without interpolated hairs

Images with different numbers of interpolated hairs are shown in Figure 41. The original hair model, with 1750 hairs, is shown in the top left image. The top right image has 4 times as many hairs (7000), the bottom left image 8 times as many (14000), and the bottom right image 16 times as many (28000).

Figure 41: Images of different numbers of interpolated hairs
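A minimal sketch of how interpolated hairs can be generated from key hairs is given below: each rendered hair blends the vertices of a few nearby key hairs with fixed per-hair weights. The three-guide scheme and the weight choice are assumptions for illustration.

    #include <vector>

    struct Vec3 { float x, y, z; };

    // Illustrative multi-strand interpolation: each rendered hair is a
    // weighted blend of three nearby key hairs, with barycentric weights
    // chosen once at setup. All three key hairs have the same vertex count.
    std::vector<Vec3> interpolateHair(const std::vector<Vec3>& keyA,
                                      const std::vector<Vec3>& keyB,
                                      const std::vector<Vec3>& keyC,
                                      float wA, float wB, float wC) {
        std::vector<Vec3> out(keyA.size());
        for (size_t v = 0; v < keyA.size(); ++v) {
            out[v].x = wA * keyA[v].x + wB * keyB[v].x + wC * keyC[v].x;
            out[v].y = wA * keyA[v].y + wB * keyB[v].y + wC * keyC[v].y;
            out[v].z = wA * keyA[v].z + wB * keyB[v].z + wC * keyC[v].z;
        }
        return out;  // one interpolated hair; repeat with different weights
    }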

Images of hair with different lighting directions are shown in Figure 42. The lights were on the left side of the head in the top left image, in front of the head in the top right image, on the right side of the head in the bottom left image, and behind the head in the bottom right image.

Figure 42: Snapshots of hair with different directions of lighting


Images of hair with different colors are shown in Figure 43. The hair is brown in the leftmost image, black in the middle image, and yellow in the rightmost image.

Figure 43: Images of hair with different colors

We shifted the hair tangents to produce two highlights: one white highlight and one dark-brown highlight, as shown in Figure 44.

Figure 44: Two highlights of hair
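The tangent shift can be sketched as follows: the strand tangent is shifted along the surface normal by two different amounts, giving two specular terms that are then tinted white and dark brown. This follows the common shifted-tangent hair specular; the shift amounts, exponents, and colors below are placeholders, not our shader constants.

    #include <algorithm>
    #include <cmath>

    struct Vec3 { float x, y, z; };

    static Vec3 normalize(Vec3 v) {
        float len = std::sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
        return { v.x/len, v.y/len, v.z/len };
    }
    static float dot3(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    // One shifted-tangent specular term: shift tangent T along normal N,
    // then apply a Kajiya-Kay style sin(T, H) falloff (H = half vector).
    float shiftedSpecular(Vec3 T, Vec3 N, Vec3 H, float shift, float exponent) {
        Vec3 Ts = normalize({ T.x + shift * N.x,
                              T.y + shift * N.y,
                              T.z + shift * N.z });
        float TdotH = dot3(Ts, H);
        float sinTH = std::sqrt(std::max(0.0f, 1.0f - TdotH * TdotH));
        return std::pow(sinTH, exponent);
    }

    // Two highlights: a sharp white one and a broader dark-brown one
    // (shift amounts and exponents below are assumed, not our values).
    // float s1 = shiftedSpecular(T, N, H, -0.1f, 120.0f); // white
    // float s2 = shiftedSpecular(T, N, H,  0.1f,  40.0f); // dark brown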

We could reset the hair constraints at runtime to change the hair length. We did not change the lengths of the segments near the hair root, only those near the tip. In Figure 45 the hairs were set shorter on the left side and longer on the right side.

Figure 45: Hair with different lengths changed at runtime
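A minimal sketch of the runtime length change is given below: the rest lengths of the distance constraints near the tip are rescaled, while the segments near the root are left untouched. The data layout and the root fraction are assumptions for illustration.

    #include <vector>

    // Illustrative runtime hair-length change. restLength[i] is the rest
    // length of the distance constraint between particles i and i+1 along
    // one hair. keepFraction (segments near the root left unchanged) and
    // the uniform scale are assumed placeholders.
    void setHairLength(std::vector<float>& restLength, float scale,
                       float keepFraction = 0.3f) {
        size_t firstScaled = (size_t)(keepFraction * restLength.size());
        for (size_t i = firstScaled; i < restLength.size(); ++i)
            restLength[i] *= scale;  // scale < 1 shortens, > 1 lengthens
    }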

We animated a character with long hair shaking her head. The snapshots taken in front of the character are shown in Figure 46. The snapshots taken behind the character are shown in Figure 47.

Figure 46: Snapshots taken in front of a character with long hair

Figure 47: Snapshots taken behind a character with long hair

Because the goal of computer animation is to imitate the real world, we compared the hair animated by our program with real hair. Snapshots of the real hair are shown in Figure 48 (part 1) and Figure 49 (part 2).

The highlight of the hair in the real-world video glistened more smoothly and brightly than the computer-animated highlight, and the majority of the real hairs moved as a group, while the computer-animated hairs seemed to move more independently.


Figure 48: Snapshots of real hair (part 1)

Figure 49: Snapshots of real hair (part 2)

Limitations: There were still gaps between the animated result and the real world. The hair motion had some artifacts. The dynamics model we implemented did not support curly hairs and could not simulate the twisting phenomenon; it could also maintain only part of the input hair style. For hair-body collision detection, only simple shapes (sphere and capsule) were used, and hair-hair collisions were not handled. Only the rendering part had LOD, and the hair interpolation had artifacts.

Chapter 7: Conclusions and Future Work
