Out-of-Core Construction and Occlusion Culling Rendering of Multi-resolution in Hybrid Triangles and Vertices

Wei Chung Hsu
Polytechnic University
b6506051@csie.ntu.edu.tw, whsu01@utopia.poly.edu

Abstract

In this paper we present a method for feature-lossless construction and view-dependent, occlusion-culled rendering of multi-resolution models in hybrid triangles and vertices. The method has two parts. The first part is the preprocess: an out-of-core algorithm keeps all original features at the bottom of an octree and simplifies upward from there to build the preprocessed output model. Using a 13-level octree guarantees that no original feature is lost. The second part is a rendering framework with occlusion culling and view volume culling. During rendering, the system decides for each independent section whether to render vertices or triangles, whether to move up or down the LOD (level-of-detail) hierarchy, and whether the section lies inside the view volume. We develop a new algorithm that performs view volume culling and occlusion culling with matrix operations.

1. Introduction

Demand for large models has grown with advances in hardware performance. One reason is the improvement of scanning technology, which produces ever larger models. However, our main motivation for presenting this out-of-core method is the rapidly falling cost of personal computers and entertainment equipment. Our system runs on a personal computer and performs well in simplification, rendering, and memory utilization. Its goal is to overcome the challenges of interactively displaying and manipulating enormous datasets with limited resources.

Level-of-detail (LOD) methods and view-dependent simplification and rendering have been proposed and work well [Lindstrom et al. 1996; Hoppe 1997; Luebke and Erikson 1997; Xia et al. 1997]. These methods simplify the model from the bottom up and build the output as several levels of detail (LODs). They keep the hierarchy, including all of its LODs, in core and compute in real time an active cutting line, that is, the most appropriate level for each section given the view position, visibility, and other parameters.

However, these methods do not perform well on large datasets. This has led to other techniques, such as out-of-core algorithms, occlusion culling [El-Sana et al. 2001; Lindstrom 2003; Yoon et al. 2004], hybrid point and polygon rendering [Chen et al. 2001], pure point sets, data encoding, and networked visualization [Levoy 2000, 2001; Fleishman et al. 2003].

Our system tries to combine all of these methods without conflict and to run on a personal computer. Section 2 describes previous work. Section 3 introduces how we construct the rendering hierarchy. Section 4 discusses our out-of-core algorithm in the preprocessing phase. Section 5 discusses how we keep the original dataset and simplify upward. Section 6 describes how we compute the solidity used for occlusion culling. Section 7 describes how we construct and use the view volume, since we have a new algorithm for view volume culling and occlusion culling. Section 8 discusses how our rendering system works interactively. The last two sections present our results, our conclusions, and our future work.

2. Previous Work

In this section, we give an overview of each feature of our system that also appears in previous work.

2.1 Out-of-Core View-Dependent Simplification

As discussed in Section 1, view-dependent simplification started in 1997, and many important algorithms from that time are still in use today. The idea of our simplification algorithm comes from Garland and Heckbert's quadric error metrics simplification method [Garland and Heckbert 1997]. Their algorithm is based on contracting vertex pairs. First, they compute the quadric error matrix Q for every initial vertex. Then they iteratively remove the least-cost pair from the pair heap, contract it, and update the costs of all valid pairs involving the two vertices. This is a very good algorithm, but it has two issues for large datasets. First, the simplification is performed in core. Second, each step contracts only one vertex pair and must then update all other pairs involving those two vertices. Both issues are costly for large datasets. In his work on out-of-core construction and visualization of multiresolution surfaces [Lindstrom 2003], Lindstrom solves these two issues with an out-of-core algorithm and by collapsing and expanding nodes in an octree.

Instead of simplifying only one vertex pair at a time, he simplifies all vertices in each node. After reducing the vertices in each node to a single vertex, he connects these simplified vertices into a new simplified model. He then simplifies this model again with the same algorithm until reaching the top of the tree. His method distributes the original dataset over the nodes; however, when a node holds too many vertices and triangles, he simplifies that node instead of keeping its original data.

This method may lose some features, possibly because the octree is not deep enough to distribute the original data evenly. We use a 13-level octree to address this. With 13 levels, our system can keep all of the original data at the bottom of the octree.

The drawback is the difficulty of maintaining a 13-level octree in the active cut and in preprocessing. This is the key point of our out-of-core algorithm.
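The per-node collapse described above can be illustrated with a minimal Python sketch. It is a sketch under stated assumptions, not the system's implementation: the function name `cluster_simplify` and the uniform `cell_size` grid are hypothetical, and the representative vertex is taken as the centroid of the cell, which stands in for the quadric-optimal position discussed in Section 5.

```python
from collections import defaultdict

def cluster_simplify(vertices, triangles, cell_size):
    """Collapse all vertices that fall in the same grid cell to one vertex."""
    # Integer cell coordinates of every vertex (floor division handles negatives).
    cell_of = [tuple(int(c // cell_size) for c in v) for v in vertices]

    # Accumulate coordinate sums and counts per occupied cell.
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for v, cell in zip(vertices, cell_of):
        acc = sums[cell]
        acc[0] += v[0]
        acc[1] += v[1]
        acc[2] += v[2]
        acc[3] += 1

    # One representative per cell. ASSUMPTION: the centroid is used here
    # instead of the quadric-optimal position.
    new_index = {}
    new_vertices = []
    for cell, (sx, sy, sz, n) in sums.items():
        new_index[cell] = len(new_vertices)
        new_vertices.append((sx / n, sy / n, sz / n))

    # Re-index triangles and drop those that collapsed to an edge or a point.
    new_triangles = []
    for a, b, c in triangles:
        ia, ib, ic = (new_index[cell_of[i]] for i in (a, b, c))
        if ia != ib and ib != ic and ia != ic:
            new_triangles.append((ia, ib, ic))
    return new_vertices, new_triangles
```

Applying the same function repeatedly with a doubled `cell_size` mimics moving one octree level up at a time.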

2.2 Occlusion Culling and View Volume Culling Algorithm

The basic view volume culling method visits every node and checks whether it is inside or outside the view volume. This requires computing the view volume in real time and then checking all nodes. We have a new method that quickly finds the nodes inside the view volume. Our key contribution is that this algorithm combines with our occlusion culling method to improve performance. Quick-VDR [Yoon et al. 2004] uses an NVIDIA GPU for occlusion culling. We think occlusion culling could be handled by the GPU in the future, but for now our method focuses on theoretical computation combined with view volume culling. El-Sana et al., in their paper on integrating occlusion culling with view-dependent rendering [El-Sana et al. 2001], use an algorithm that estimates visibility. Their formula "LOD_final = (1 - OP) * LOD_view + OP * LOD_lowest" gives the estimated LOD. First, they sort all nodes by distance from the viewpoint and put them into a list. Then they pick the farthest node from the list, shoot a ray from the viewpoint to this node, and assign LOD_final to all nodes along this line. The issue is that the distances change frequently, and the nodes must be re-sorted whenever they change. Sorting is expensive. Our method follows this visibility-estimation idea but changes parts of the algorithm to avoid the sorting.
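As a small worked example of the formula quoted above, the following Python sketch evaluates the blend for a few values. The function name `blend_lod` is hypothetical, and reading OP as an occlusion probability in [0, 1] is an assumption, since the text does not define it.

```python
def blend_lod(lod_view, lod_lowest, op):
    """LOD_final = (1 - OP) * LOD_view + OP * LOD_lowest.

    ASSUMPTION: `op` is an occlusion probability in [0, 1].
    """
    if not 0.0 <= op <= 1.0:
        raise ValueError("op must lie in [0, 1]")
    return (1.0 - op) * lod_view + op * lod_lowest


# An unoccluded node keeps its view-dependent LOD, a fully occluded node
# drops to the lowest LOD, and partial occlusion interpolates between them.
for op in (0.0, 0.5, 1.0):
    print(op, blend_lod(lod_view=10, lod_lowest=2, op=op))
```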

2.3 Hybrid Triangles and Vertices

There are several rendering formats. The most common is triangle rendering, and much research focuses on point set rendering [Levoy 2000, 2001; Fleishman et al. 2003]. Each has its own benefits. Triangle rendering has three advantages over point set rendering. First, it has hardware support that improves performance. Second, it avoids holes and poor quality when the viewpoint is very close. Third, it saves computation time compared with point set renderers that must compute point size and position in real time. On the other hand, point set rendering has advantages that triangle rendering lacks. First, it saves considerable disk space and memory, because a point structure is much smaller than a triangle structure. Second, in theory, it can achieve arbitrarily smooth quality if new vertex sizes and positions are computed in real time. Some research tries to combine the two to get both benefits [Chen et al. 2001]. In our system, we implement this third approach, using hybrid triangles and vertices.

3. Hierarchy Construction

Before discussing our out-of-core algorithm and other methods, we describe our system hierarchy and how it works. Our system uses a 13-level octree. It is not feasible to build this octree in a single process. Therefore, we divide the construction into two phases. The first phase covers the top down to level 6. The second covers level 6 to level 12. Figure 1 is our system hierarchy diagram.

Only the first phase, from the top to level 6, processes all of the original data. Only bottom nodes that hold too many vertices and triangles go on to second-phase processing. This way, we do not need to start from all nodes at level 12 and build upward, which would be very costly, because level 12 has 8^12 ≈ 6.9 × 10^10 (64 Gi) nodes. Instead, we start building our octree from level 6, which has 8^6 = 262,144 (256 Ki) nodes. This improves performance considerably. We assume that few nodes burst after level 6. The first phase distributes all of the original data into level 6, which is its bottom level. If a node has too few vertices and triangles, the system collects it up into a higher level. On the other hand, if a node has too many vertices and triangles (we call this a burst), it goes on to the second phase, from level 6 to level 12.
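The node counts and the first-phase distribution can be checked with a minimal Python sketch. The names `cells_per_level`, `cell_index`, and `find_burst_cells`, the cubic bounding box, and the value of `burst_threshold` are all assumptions made for illustration; the text does not state the actual threshold.

```python
from collections import Counter

def cells_per_level(level):
    """Number of octree nodes at a level, with the root at level 0."""
    return 8 ** level

def cell_index(point, bbox_min, bbox_size, level):
    """Integer (i, j, k) cell of a point at the given octree level."""
    n = 2 ** level  # cells per axis at this level
    idx = []
    for p, lo in zip(point, bbox_min):
        i = int((p - lo) / bbox_size * n)
        idx.append(min(max(i, 0), n - 1))  # clamp points on the boundary
    return tuple(idx)

def find_burst_cells(points, bbox_min, bbox_size, level=6, burst_threshold=1000):
    """Cells holding more than `burst_threshold` points.

    ASSUMPTION: the threshold value is hypothetical.
    """
    counts = Counter(cell_index(p, bbox_min, bbox_size, level) for p in points)
    return {cell for cell, n in counts.items() if n > burst_threshold}


print(cells_per_level(6))   # 262144
print(cells_per_level(12))  # 68719476736
```

Only the cells returned by `find_burst_cells` would be handed to the second phase; all other level-6 cells are final or are collected upward.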

Since the first phase handles the entire original dataset, it works out of core. In the second phase, we assume that very little data needs to be processed for each node, so we run it in core. The simplification algorithm and hierarchy structures are the same in both phases. The only difference is in-core versus out-of-core processing.

In Figure 1, our hierarchy alternates one level of triangles with one level of vertices. This is what we mean by hybrid triangles and vertices. If the area to render is very small, for example just a few pixels on the screen, it is better to render vertices than several triangles. The tradeoff of point set rendering is that holes and quality problems appear when the viewpoint is too close to the model. Maintaining the topology is another issue. Progressive Point Set Surfaces [Fleishman et al. 2003] first computes MLS surfaces and a base point set. From these two, it then produces the point set that lies on the MLS surfaces. This gives better and smoother quality. But when the viewpoint is too close to the object or surface and the point set is not regenerated in real time during rendering, the result has either holes or large points, both of which look bad. One solution is to use ray shooting to produce the point set on the MLS surfaces in real time, but this is expensive, especially at high screen resolutions. Therefore, we decided to read in a PLY surface model that contains triangle data. We put the original triangles in the bottom nodes. This avoids the problems above, such as holes, topology, and computation time. If the viewpoint is very close, it reaches our bottom nodes, and our system renders the original triangles directly. On the other hand, if the distance is large enough, we can render just a few vertices. That is why we alternate one level of triangles with one level of vertices.
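The choice between vertices and triangles for a section can be sketched as a projected-size test. This is a minimal Python sketch under stated assumptions: the perspective model, the function names, and the default field of view, viewport height, and `pixel_threshold` are hypothetical and are not taken from the system.

```python
import math

def projected_pixels(cell_size, distance, fov_y_deg, viewport_height):
    """Approximate on-screen height in pixels of a cell at a given distance."""
    if distance <= 0.0:
        return float("inf")  # viewpoint inside or touching the cell
    # Height of the view frustum at this distance, in world units.
    frustum_height = 2.0 * distance * math.tan(math.radians(fov_y_deg) / 2.0)
    return cell_size / frustum_height * viewport_height

def choose_primitive(cell_size, distance, fov_y_deg=60.0,
                     viewport_height=1080, pixel_threshold=4.0):
    """Render vertices for sections covering only a few pixels, else triangles.

    ASSUMPTION: the threshold of a few pixels is illustrative only.
    """
    pixels = projected_pixels(cell_size, distance, fov_y_deg, viewport_height)
    return "vertices" if pixels <= pixel_threshold else "triangles"
```

With these defaults, a unit cell seen from distance 2 covers hundreds of pixels and is drawn with triangles, while the same cell at distance 1000 covers less than one pixel and is drawn with vertices.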

4. Out-of-Core Algorithm

In the PLY file format, triangles refer to their vertices by vertex id rather than by vertex position. Using vertex ids can save a factor of three in disk space, but it is a problem for simplification. If we do not have enough memory to hold the entire vertex table, this format is very costly during simplification. Therefore, the first step of our out-of-core algorithm is to write the vertex positions into the triangles.

The original triangle format is (V0id, V1id, V2id), with three vertex id columns. For each column, we run an out-of-core sort with the vertex id as the key. After sorting, a single pass over the vertex table suffices to assign the vertex positions to the triangles and write the output to a file. Each column assignment costs O(N log N + N), and the three columns produce three files in the formats (Triangle_id, V0x, V0y, V0z), (Triangle_id, V1x, V1y, V1z), and (Triangle_id, V2x, V2y, V2z).

We then sort these files with the triangle id as the key and merge them to get the result. The result file has the format (V0x, V0y, V0z, V1x, V1y, V1z, V2x, V2y, V2z), which is what we want. From this output file, we can simplify up to the next level in a single pass with limited memory.
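The sort-and-merge steps above can be sketched in Python as follows. For brevity the sketch runs in memory; in an out-of-core setting each `sorted` or `sort` call would be replaced by an external sort over files. The function names are hypothetical.

```python
def dereference_corner(triangles, vertices, corner):
    """Return (triangle_id, x, y, z) records for one vertex id column."""
    # Sort the column by vertex id (an external sort when data exceeds memory).
    keyed = sorted((tri[corner], tri_id) for tri_id, tri in enumerate(triangles))
    out = []
    # Vertex ids now arrive in non-decreasing order, so the vertex table is
    # read front to back, which corresponds to a single sequential pass.
    for vertex_id, tri_id in keyed:
        x, y, z = vertices[vertex_id]
        out.append((tri_id, x, y, z))
    return out

def build_triangle_soup(triangles, vertices):
    """Produce (V0x, V0y, V0z, V1x, V1y, V1z, V2x, V2y, V2z) per triangle."""
    columns = []
    for corner in range(3):
        records = dereference_corner(triangles, vertices, corner)
        records.sort()  # sort back by triangle id (unique first field)
        columns.append(records)
    soup = []
    # Merge the three sorted streams record by record.
    for r0, r1, r2 in zip(*columns):
        if not (r0[0] == r1[0] == r2[0]):
            raise ValueError("streams are out of step")
        soup.append(r0[1:] + r1[1:] + r2[1:])
    return soup
```

For example, two triangles `(0, 1, 2)` and `(3, 2, 1)` over four vertices yield two nine-value records with the positions written in place of the ids.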

5. Simplification

As discussed above, our hierarchy alternates one level of triangles with one level of vertices. After simplifying the original triangles at the bottom level, we do not reconnect the resulting vertices into triangles for the next level; we simply keep the vertices. We then contract these vertices one more level up and reconnect triangles at that level, which is two levels above the bottom. We repeat this until we reach the root. This produces the hierarchy in Figure 1. The simplification algorithm is the same as Lindstrom's algorithm [Lindstrom 2003]; the only difference is that we simplify vertices directly at all odd levels.

In Lindstrom’s algorithm, he constructs a 4x4 quadric error matrix Qv for each cluster v:

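\[
Q_v = \sum_{t} n_t n_t^{T} =
\begin{pmatrix}
q_{11} & q_{12} & q_{13} & q_{14} \\
q_{21} & q_{22} & q_{23} & q_{24} \\
q_{31} & q_{32} & q_{33} & q_{34} \\
q_{41} & q_{42} & q_{43} & q_{44}
\end{pmatrix}
\]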

After getting this Qv for each cluster v, we use Garland and Heckbert’s formula [Garland and Heckbert 1997]:

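\[
\begin{pmatrix}
q_{11} & q_{12} & q_{13} & q_{14} \\
q_{21} & q_{22} & q_{23} & q_{24} \\
q_{31} & q_{32} & q_{33} & q_{34} \\
0 & 0 & 0 & 1
\end{pmatrix}
V =
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix},
\qquad
V =
\begin{pmatrix}
q_{11} & q_{12} & q_{13} & q_{14} \\
q_{21} & q_{22} & q_{23} & q_{24} \\
q_{31} & q_{32} & q_{33} & q_{34} \\
0 & 0 & 0 & 1
\end{pmatrix}^{-1}
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}
\]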

This gives the simplified vertex V for each cluster v.
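The solve for V can be illustrated with a minimal, dependency-free Python sketch. It is an illustration under stated assumptions: each plane quadric is built from plain plane coefficients (a, b, c, d) in the style of Garland and Heckbert, any per-triangle weighting is omitted, and the function names and the singularity tolerance `eps` are hypothetical.

```python
def plane_quadric(a, b, c, d):
    """4x4 quadric n n^T for the plane ax + by + cz + d = 0."""
    n = (a, b, c, d)
    return [[n[i] * n[j] for j in range(4)] for i in range(4)]

def add_quadrics(q1, q2):
    """Element-wise sum of two 4x4 quadrics."""
    return [[q1[i][j] + q2[i][j] for j in range(4)] for i in range(4)]

def optimal_vertex(q, eps=1e-12):
    """Solve the 4x4 system whose last row is (0, 0, 0, 1) for V = (x, y, z, 1).

    The first three rows read q_i1 x + q_i2 y + q_i3 z + q_i4 = 0, so this is
    a 3x3 system solved by Gauss-Jordan elimination with partial pivoting.
    Returns None when the system is singular.
    """
    m = [[q[i][0], q[i][1], q[i][2], -q[i][3]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < eps:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [m[r][k] - f * m[col][k] for k in range(4)]
    return tuple(m[i][3] / m[i][i] for i in range(3))
```

For the three planes x = 1, y = 2, and z = 3, the summed quadric has its minimum at their intersection, so `optimal_vertex` returns (1.0, 2.0, 3.0). When the matrix is singular, for example for a single plane, a fallback position such as the cell centroid would be needed.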

6. Solidity

In general LOD methods, the rendering framework always maintains an active cutting line. If we add occlusion culling, the cutting line differs from that of the original LOD method. By adding a solidity value to each node, we can compute and decide whether to go higher than the original LOD or to stay at it. El-Sana et al. [El-Sana et al. 2001] use the formula "LOD_final = (1 - OP) * LOD_view + OP * LOD_lowest". LOD_view is the original LOD, and from the solidity parameter and real-time computation we obtain the estimated LOD_final. Section 8 discusses how we use solidity to compute LOD_final in real time. Here, we discuss how to compute and assign the solidity of each node. El-Sana et al. give two methods for computing the solidity: the first uses projection, and the second uses ray shooting. We choose the projection method, which is easier to implement.

\[
\mathrm{Solidity}(c_i) =
\frac{\sum_{t \in \mathrm{poly}(c_i)} \mathrm{proj\_area}(t)}
     {\sum_{f \in \mathrm{face}(c_i)} \mathrm{area}(f)}
\]

A key issue in their method is that all cells must be sorted so that rays can be shot starting from the farthest cell. We introduce a new view volume algorithm that avoids this sorting.
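One possible reading of the projection-based solidity can be sketched in Python. The sketch makes several assumptions that the text does not confirm: the cell is a cube of side `cell_size`, each triangle is projected onto each of the six cell faces along the face normal, and the result is clamped to 1 because overlapping projections are not resolved. The function name is hypothetical.

```python
def solidity(triangles, cell_size):
    """Ratio of summed projected triangle area to summed cell face area.

    ASSUMPTION: axis-aligned projection onto the six faces of a cubic cell.
    """
    total = 0.0
    for p0, p1, p2 in triangles:
        e1 = [p1[k] - p0[k] for k in range(3)]
        e2 = [p2[k] - p0[k] for k in range(3)]
        # Cross product e1 x e2; |n_k| / 2 is the area projected along axis k.
        n = (e1[1] * e2[2] - e1[2] * e2[1],
             e1[2] * e2[0] - e1[0] * e2[2],
             e1[0] * e2[1] - e1[1] * e2[0])
        # Each axis has two opposite faces, so each projection counts twice.
        total += abs(n[0]) + abs(n[1]) + abs(n[2])
    face_area = 6.0 * cell_size ** 2
    return min(1.0, total / face_area)
```

Under this reading, a flat wall that spans a unit cell perpendicular to one axis fully covers two of the six faces and gives a solidity of 1/3.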

7. View Volume Construction and Operation

In 3D graphics, matrices are commonly used to rotate and translate a model. In general view volume culling, every node is checked to decide whether it is inside or outside the view volume. We combine these two ideas to improve performance.

First, we construct an initial view volume whose cells are already sorted by distance. We always keep this initial view volume in memory and use the rotation and translation matrix to move it to the correct position. After the matrix operation, we match the transformed view volume against the object cells and pick the overlapping cells.

Figure 2 explains how our algorithm works in 2D. The 3D case is analogous.
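The 2D case can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the view volume is modeled as a wedge of unit grid cells, the object cells are given as a set of integer coordinates, and the function names, the wedge shape, and the rounding rule are hypothetical.

```python
import math

def build_template(depth, half_fov_deg):
    """Cells of a 2D view wedge looking along +x, sorted nearest first.

    Built and sorted once, then kept in memory.
    """
    slope = math.tan(math.radians(half_fov_deg))
    cells = [(x, y)
             for x in range(1, depth + 1)
             for y in range(-depth, depth + 1)
             if abs(y) <= x * slope + 1e-9]
    cells.sort(key=lambda c: c[0] ** 2 + c[1] ** 2)
    return cells

def visible_cells(template, occupied, eye, angle_deg):
    """Rotate and translate the template, then keep cells that hit the model."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    hits, seen = [], set()
    for x, y in template:
        # Rigid transform of the template cell, snapped to the object grid.
        cell = (round(eye[0] + cos_a * x - sin_a * y),
                round(eye[1] + sin_a * x + cos_a * y))
        # Two template cells may snap to the same object cell; keep the first.
        if cell in occupied and cell not in seen:
            seen.add(cell)
            hits.append(cell)
    return hits
```

Because a rigid transform preserves distances from the eye, the list returned by `visible_cells` stays in the near-to-far order of the template, so no per-frame sort is needed.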

This algorithm has two advantages. First, it can be more efficient than checking all model cells, depending on the size of the view volume. If the initial view volume has too many cells, moving all of them saves little processing time compared with the original method. If the view volume is not large, the algorithm is very efficient. The second advantage is that no sorting is needed. Since we keep the sorted view volume cells in memory, the cell list remains in the same order even after the view volume is rotated, translated, and mapped onto the object cells. Therefore, each frame requires only the rotation and translation matrix operation, with no sorting. This improves performance considerably.

8. View-Dependent Out-of-Core Rendering

In our view-dependent out-of-core algorithm, we define a rendering unit, and every disk fetch is based on this unit. We call this unit a "subtree". As shown in Figure 1, our hierarchy is a 13-level octree that alternates one level of triangles with one level of vertices. Therefore, our base unit has three levels. The bottom level has triangles. The middle level has vertices. The top level is a single vertex. In addition, this top level overlaps with the bottom level of the parent subtree. This means that at the top of a subtree we can choose either to render only one vertex or to fetch the parent subtree and render its triangles. This definition is important: because I/O fetching is expensive, the parent subtree can rarely be fetched immediately. Therefore, our system can first render a single vertex while waiting for the I/O to bring in the parent. Rendering one vertex is better than rendering nothing while waiting for the I/O. Figure 3 shows our subtree structure.

We always keep a 6-level octree in memory. This helps us quickly render portions that are not yet in memory. Our active cutting line is made up of several subtree units. We begin at the top level of the octree and then move down, using the current active line to approach the next frame's active line. Figure 4 shows our rendering structure. If a subtree falls outside the view volume in the next frame, we simply discard it. On the other hand, if some cells were outside in the previous frame and are inside in the next frame, we first render them from the 6-level octree, which is already in memory. Then, using the information in this octree, we fetch the child subtree from disk.
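The three-level subtree and the single-vertex fallback described above can be sketched in Python. The class and function names, the `loaded` flag, and the `request_fetch` callback are assumptions made for illustration; a real system would issue the fetch asynchronously.

```python
from dataclasses import dataclass, field

@dataclass
class Subtree:
    top_vertex: tuple                                      # single vertex at the top level
    middle_vertices: list = field(default_factory=list)    # vertex level
    bottom_triangles: list = field(default_factory=list)   # triangle level
    loaded: bool = False                                   # True once the data is in memory

def subtree_levels(bottom_level):
    """Octree levels (top, middle, bottom) spanned by one subtree."""
    return (bottom_level - 2, bottom_level - 1, bottom_level)

def draw_coarser(subtree, parent, request_fetch):
    """Choose what to draw when moving up from the top of `subtree`.

    The top level overlaps the parent's bottom level, so the parent's
    triangles are used when available; otherwise one vertex is drawn
    while the parent is being fetched.
    """
    if parent is not None and parent.loaded:
        return ("triangles", parent.bottom_triangles)
    if parent is not None:
        request_fetch(parent)  # hypothetical hook; asynchronous in practice
    return ("points", [subtree.top_vertex])
```

With the bottom of the octree at level 12, `subtree_levels(12)` gives `(10, 11, 12)`, and the next subtree up spans levels 8 to 10.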

The following is our fetching algorithm:

1. We start from the root. In other words, we put a seed at the root.

2. In every frame, we use the current LOD to approach the next frame's LOD.

3. We compute "LOD_final = (1 - OP) * LOD_view + OP * LOD_lowest". LOD_view is approached from the current LOD, LOD_current. After the matrix operation, we use Bresenham's algorithm to shoot a line to the farthest cell and assign LOD_final to all cells along this line.

4. After obtaining LOD_final, we decide either to fetch from disk/memory or to keep the current subtree.
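Step 3 can be sketched in 2D with Python. `bresenham` is the standard integer line algorithm. The rule used to obtain OP along the line, where the unblocked fraction is multiplied by (1 - solidity) for each cell already passed, is an assumption made for illustration; the text does not specify how OP is accumulated.

```python
def bresenham(x0, y0, x1, y1):
    """Grid cells on the line from (x0, y0) to (x1, y1), endpoints included."""
    cells = []
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    while True:
        cells.append((x0, y0))
        if (x0, y0) == (x1, y1):
            break
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x0 += sx
        if e2 < dx:
            err += dx
            y0 += sy
    return cells

def assign_lod_along_ray(eye, farthest, solidity, lod_view, lod_lowest):
    """Assign LOD_final to every cell on the line from the eye to the farthest cell.

    ASSUMPTION: OP for a cell is 1 minus the product of (1 - solidity)
    over the cells in front of it on the line.
    """
    result = {}
    unblocked = 1.0  # fraction of the ray assumed unblocked so far
    for cell in bresenham(*eye, *farthest):
        op = 1.0 - unblocked
        result[cell] = (1.0 - op) * lod_view + op * lod_lowest
        unblocked *= 1.0 - solidity.get(cell, 0.0)
    return result
```

Cells behind a fully solid cell receive LOD_lowest, and cells in front of any occluder keep LOD_view.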

9. Results

We present our results in this section. Table 1 gives our preprocessing times. The rendering frame rate is essentially the same for every model, because we use multiple threads. However, the quality is poor for large models when the system has just started up; it takes some time to approach the correct LOD.

10. Summary and Future Work

Our view volume algorithm has an open issue: after rotation and translation, the mapped cells may exhibit two special cases. First, cells at different positions may be mapped to the same position after the transformation. Second, cells that are connected to each other with no hole between them may no longer be connected after the transformation, leaving holes between them. We are trying to prove mathematically that these cases cannot occur. We do not yet have a proof, but we have not observed the problem in our system. On the other hand, is a 100% correct view volume check really necessary? Can one or two cells be missed without hurting the output quality? In our system, the answer is yes.

11. References

[1] B. Chen, and M.X. Nguyen. POP: A Hybrid Point and Polygon Rendering System for Large Data. In IEEE Visualization 2001, October, 2001.

[2] C.T. Silva, Y-J. Chiang, J. El-Sana, and P. Lindstrom. Out-of-Core Algorithm for Scientific Visualization and Computer Graphics. Submitted for a journal publication, August, 2002.

[3] H. Hoppe. View-Dependent Refinement of Progressive Meshes. In ACM SIGGRAPH 1997, August, 1997.

[4] J. El-Sana, and Y-J Chiang. External Memory View-Dependent Simplification. In Eurographics 2000, August, 2000.

[5] J. El-Sana, N. Sokolovsky, and C.T. Silva. Integrating Occlusion Culling with View-Dependent Rendering. In IEEE Visualization 2001, October, 2001.

[6] J.D. Cohen, D.G. Aliaga, and W. Zhang. Hybrid Simplification: Combining Multi-resolution Polygon and Point Rendering. In IEEE Visualization 2001, October, 2001.

[7] M. Garland, and P.S. Heckbert. Surface Simplification Using Quadric Error Metrics. In ACM SIGGRAPH 1997, August, 1997.

[8] M. Alexa, J. Behr, D. Cohen-Or, D. Levin, and C.T. Silva. Point Set Surfaces. In IEEE Visualization 2001, October, 2001.

[9] M. Pauly, M. Gross, and L.P. Kobbelt. Efficient Simplification of Point-Sampled Surfaces. In IEEE Visualization 2002, October, 2002.

[10] M. Alexa, J. Behr, D. Cohen-Or, S. Fleishman, D. Levin, C.T. Silva. Computing and Rendering Point Set Surfaces. In IEEE Visualization 2003, January, 2003.

[11] M. Alexa. Surfaces from Point Samples. Course Tutorial.

[12] P. Lindstrom. Out-of-Core Construction and Visualization of Multiresolution Surfaces. In ACM SIGGRAPH 2003 Symposium on Interactive 3D Graphics, April, 2003.

[13] S. Rusinkiewicz, and M. Levoy. QSplat: A Multiresolution Point Rendering System for Large Meshes. In ACM SIGGRAPH 2000, July, 2000.

[14] S. Fleishman, M. Alexa, D. Cohen-Or, and C.T. Silva. Progressive Point Set Surfaces. In ACM Transactions on Graphics 2003, October 2003.

[15] S-E Yoon, B. Salomon, R. Gayle, and D. Manocha. Quick-VDR: Interactive View-Dependent Rendering of Massive Models. In IEEE Visualization 2004, October, 2004.

Bunny: 3MB

Dragon: 32MB

Buddha: 40.6MB

Blade: 80MB

Lucy: 508MB