Chapter 2 Visual Salience-Guided Mesh Decomposition
would allow us to construct an efficient and valid set of visual parts from a 3-D model. In this way, one could realize “query-by-significant-components” in a 3-D shape retrieval system.
The remainder of this chapter is organized as follows. In Section 2.1, we introduce the theory of part salience and illustrate its importance in the perception of parts. In Section 2.2, we propose the computational processes for realizing the qualitative salient features, and describe in detail how to incorporate visual salience into the mesh decomposition process. The experimental results are presented in Section 2.3. Finally, in Section 2.4, we present our conclusions.
2.1 Review of Hoffman and Singh’s Theory of Part Salience
In [51], Hoffman and Singh proposed the theory of part salience, which states that at least three factors determine the salience of a part: the protrusion, the boundary strength, and the relative size of the part. We now give the quantitative definitions of these salient factors and then describe their importance in visual processes.
Protrusion of A Part. This factor is the degree to which a part protrudes from its main body. For 2-D silhouettes, it can be quantified as the ratio of the perimeter of the part (excluding its bases) to the sum of its base lengths. For 3-D shapes, the base of a part is referred to as the minimal surface formed by the boundary curve of the part. Hence, the protrusion of a 3-D part can be quantified as the ratio of the area of the part’s surface to the area of its base surface.
Strength of A Part’s Boundary. According to the principle of transversality, a part’s boundaries are usually located at the concave creases, as shown in Fig. 2.1(b). In [51], Hoffman and Singh proposed that possible quantitative definitions of the boundary strength include the turning normals and locale turning, as shown in Figs. 2.1(a) and 2.1(c) respectively. Obviously, the normal directions must be oriented consistently over the whole shape so that the boundary strength can be captured precisely. The discriminating capability of turning normals and locale turning is shown by the following examples. For 2-D silhouettes, the two sides of a crease boundary usually have two distinct normals, and the angle between them can, in one sense, represent the strength of that boundary. On the other hand, for potentially smooth boundaries, which are represented by the dotted lines in Fig. 2.1(c), there is only one normal at every point along the curve. To tackle this problem, Hoffman and Singh [51] proposed measuring the turning in an appropriate region near the boundary. As shown in Fig. 2.1(c), the gray region is the so-called locale³, and the normals on its two sides (i.e., the so-called locale turning) are used to characterize the strength of the smooth boundary. For 3-D shapes, the principal curvatures can be used to measure the strength of a part’s boundary.

(a) Turning normals (b) 2-D Silhouette (c) Locale turning
Figure 2.1: Illustration of turning normals and locale turning at the boundary of a 2-D silhouette.

³ By definition [51], a locale is an appropriate region near (but not just infinitesimally near) a negative minimum of the curvature, in which we can explore how the curve evolves.
Relative Size of A Part. This factor indicates the size of a part relative to the whole object. For 2-D silhouettes, it can be defined as the ratio of the area of a part to the area of the whole object. For 3-D shapes, the relative volume can be used to measure a part’s relative size.
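To make the 2-D definitions of protrusion and relative size concrete, the following minimal sketch evaluates them on a toy silhouette. The polygon coordinates, the function names, and the convention that a part is given as an open polyline whose end points lie on its single cut are all assumptions made for illustration; they do not come from [51].

```python
import math

def polygon_area(points):
    """Shoelace area of a simple polygon given as a list of (x, y) tuples."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

def part_protrusion(part):
    """Perimeter of the part excluding its base, divided by the base length.

    Assumption: `part` is an open polyline that starts and ends on the cut,
    so its single base is the segment joining the last point to the first.
    """
    outline = sum(math.dist(p, q) for p, q in zip(part, part[1:]))
    base = math.dist(part[-1], part[0])
    return outline / base

def relative_size(part, whole):
    """Area of the part divided by the area of the whole silhouette."""
    return polygon_area(part) / polygon_area(whole)

# Hypothetical silhouette: a 4 x 2 body with a 1 x 1 "ear" on top.
whole = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 3), (1, 3), (1, 2), (0, 2)]
ear = [(2, 2), (2, 3), (1, 3), (1, 2)]

ear_protrusion = part_protrusion(ear)   # three unit sides over a unit base
ear_size = relative_size(ear, whole)    # area 1 out of a total area of 9
```

In this toy case the ear has a high protrusion but a small relative size, which is the kind of trade-off between salience factors discussed in this section.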
Having reviewed the factors that may be used to determine the salience of a part, we now discuss their effects on both visual and decomposition processes.
For simplicity, the following discussion is based on 2-D silhouettes; however, the concept can be easily extended to 3-D models. Fig. 2.2 shows the boundaries and cuts of parts of a 2-D silhouette, indicated by isolated points and dotted lines respectively. Note that, in Fig. 2.2(a), the four boundaries are used to form possible cuts; and, in Figs. 2.2(b) and 2.2(c), each part is generated by exactly one cut. According to the visually salient properties of interest, a 2-D silhouette may have different interpretations. For example, the 2-D silhouette might be interpreted as an alien’s head with a pair of protruding ears when the salience of the part is determined primarily by its protrusion (i.e., the part’s cuts in Fig. 2.2(b)). On the other hand, the 2-D silhouette might be interpreted as an unidentified flying object when the part salience is determined primarily by its relative size (i.e., the part’s cuts in Fig. 2.2(c)). As a result, the part salience would affect not only the high-level visual processes that determine the interpretation of a shape, but also the low-level visual processes that determine how the shape is actually decomposed. In order to precisely determine a part’s cuts, another independent theory that incorporates a priori knowledge about the shape is usually required.
(a) A part’s boundaries (b) A part’s cuts (c) A part’s cuts
Figure 2.2: A part’s boundaries and cuts in a 2-D silhouette (Re-sketched from [51]).

In the early 1980s, 3-D object recognition was a popular research topic [28]. Among the large number of research issues, 2-D perceptual organization [106] and recognition-by-components (or parts) [11, 50, 114] were two important directions. However, their development was hindered by some ill-posed early vision problems, such as edge detection and image segmentation. Since these problems could not be solved, 2-D perceptual organization and recognition by 2-D components (or parts) could not be converted into “complete” computational processes, so they both failed. Nowadays, there are large numbers of 3-D models distributed worldwide. Since 3-D models (or meshes) are not restricted by the limitations of 2-D images, perceptual organization is now possible in 3-D cases. Furthermore, Hoffman and Singh’s theory of part salience means that a priori knowledge may not be necessary in 3-D shape decomposition processes. However, the quantitative definitions for part salience proposed by Hoffman and Singh [51] were made under the assumption that a part and its boundary are found in advance. In terms of perceptual organization, this is a drawback that, to some extent, limits the power of Hoffman and Singh’s theory. In this chapter, we propose a new mesh decomposition scheme that incorporates the cognitive psychology theory into the mesh decomposition process such that the visually significant components can be extracted from a given 3-D mesh.
2.2 Visual Salience-Guided Mesh Decomposition
We now discuss the computational processes for realizing two of the visually salient features, namely, the protrusion and the boundary strength. We also describe how to incorporate each visually salient feature into the mesh decomposition process. As to the third salient feature, the relative size of components, we can easily calculate it once the protrusion and the boundary strength are known. We shall use the relative size feature in the mesh retrieval process. This section is organized as follows. In Section 2.2.1, we present the computational process for characterizing the protrusion of an arbitrary surface mesh. Based on protrusion characterization, a local maximum approach for choosing the salient representatives of parts is proposed and described in Section 2.2.2. In Section 2.2.3, we describe in detail the proposed computational process for modeling the boundary strength. The proposed measure of boundary strength is used as the guideline to find the locale of a part’s boundary. In Section 2.2.4, a coarse-to-fine approach is proposed for finding the locale of a part’s boundary. In Section 2.2.5, Katz and Tal’s algorithm for determining the boundary of a part (presented in [62]) is described for the purpose of completeness.
2.2.1 Modeling the Protrusion as the Degree of Center
In this section, we propose a suitable way to characterize the protrusion of a shape. It is intuitive that a protrusion is closely related to the skeletal structure of a shape. As a result, some existing skeletonization methods [22, 70, 74, 111] may be useful for characterizing the protrusion. In our investigations, however, we found that the integral function proposed by Hilaga et al. [49] is more suitable for protrusion characterization. The main reasons are as follows. First, the integral function can be constructed on any type of polygonal mesh, including non-orientable, non-closed, and non-manifold surfaces. Second, the function is very stable, so there is no initial point selection problem. Third, the integral can be calculated over the entire surface. As a result, the protrusion of every vertex is accessible to any salience-guided process. Finally, the function is not only invariant to geometrical transformations (such as rotation, translation, and scaling), but is also resistant to noise added to vertex coordinates. Therefore, we adopt the integral function described in [49] to characterize the protrusion of a part.

Figure 2.3: Illustration of the base patch construction for protrusion characterization: The darker region is the base patch occupied by $b_i$.
In [49], the degree of center at the point v on the surface S is defined as follows:

\[ \mu(v) = \int_{p \in S} g(v, p)\, dS, \qquad (2.1) \]

where g(v, p) represents the geodesic distance between v and p on S. The continuous integral function µ(v) is defined as the total sum of geodesic distance from the point v to all points on S. In other words, the value of µ(v) can be interpreted as a distance from the point v to arbitrary points on S. More precisely, a smaller value of µ(v) indicates that the point v is closer to the center of the surface S. On the other hand, a larger value of µ(v) means that the point v is farther from the center of the surface. It can be seen from Eq. (2.1) that calculating the integral based on geodesic distance is computationally prohibitive. To trade off accuracy for computational efficiency, Hilaga et al. employed Dijkstra’s algorithm to approximate geodesic distance based on the edge lengths of a 3-D mesh.

(a) Cactus (b) Dinopet (c) Hummingbird
Figure 2.4: The protrusion degree calculated on different 3-D meshes.
Here, in contrast to [49], the integral function is constructed on the dual graph of a given 3-D mesh, G = (V, E), where V and E represent the set of dual vertices and the set of dual edges respectively. A dual vertex v ∈ V corresponds to the center of mass of a face in the original mesh, while a dual edge (u, v) ∈ E links the centers of mass of two adjacent faces and passes through the midpoint of the edge shared by the two faces. For computational efficiency, we segment the mesh into small patches of approximately equal size, which we call base patches. Each base patch is represented by a single dual vertex, $b_i$, located at its approximate center. Such a base patch is constructed by a modified version of Dijkstra’s algorithm such that the shortest distance between the base vertex and any vertex within the base patch is less than a radius value. As shown in Fig. 2.3, the darker region is the base patch of radius $thr_\mu$ with the base dual vertex $b_i$ at its center. Obviously, by increasing the number of base patches, a more accurate integral can be obtained; however, the drawback is an increase in computation time. Let area(v) denote the area of the mesh face corresponding to a dual vertex v and area(V) denote the total area of the object surface. The protrusion degree at a dual vertex v can be defined as in [49]:
\[ \mu(v) = \sum_{i} g(v, b_i) \cdot \mathrm{area}(P_i), \qquad (2.2) \]

where $P_i$ denotes the base patch represented by $b_i$, so that $\sum_{i} \mathrm{area}(P_i) = \mathrm{area}(V)$, while $g(v, b_i)$ returns the geodesic distance between the dual vertex v and the base vertex $b_i$. Since the function µ(v) defined in Eq. (2.2) is not invariant to scaling transformation, a normalized version of µ(v) is defined as in [49]:

\[ \mathrm{Protrusion}(v) = \frac{\mu(v) - \min_{u \in V} \mu(u)}{\max_{u \in V} \mu(u)}. \qquad (2.3) \]
The calculation of the integral function has the complexity O(|V| log |V|), where |V| is the number of faces on the mesh. Using the normalized protrusion degree defined in Eq. (2.3), we can calculate a numeric value (ranging from 0 to 1) for each dual vertex located on a 3-D mesh. The farther a dual vertex is from the center of a 3-D mesh, the larger the protrusion degree will be. Fig. 2.4 illustrates the protrusion degree calculated on different 3-D meshes. Note that a darker color represents a protrusion degree close to 0, while a lighter color means the protrusion degree is close to 1.
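The computation of the protrusion degree can be sketched as follows. This is only an illustration under stated assumptions: the dual graph is given as an adjacency dictionary with dual-edge lengths, it is connected, and the base vertices with their patch areas are already known (the base patch construction itself is not shown). The names `dijkstra`, `protrusion_degree`, `adj`, and `patches` are hypothetical.

```python
import heapq

def dijkstra(adj, source):
    """Shortest-path distances from `source`; adj maps u -> [(v, length), ...]."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def protrusion_degree(adj, patches):
    """Weighted sum over base patches, then the normalization of Eq. (2.3).

    patches maps each base vertex b_i to the area of its base patch P_i.
    Assumes an undirected, connected dual graph.
    """
    mu = {v: 0.0 for v in adj}
    for b, patch_area in patches.items():
        g = dijkstra(adj, b)  # approximate geodesic distance g(v, b_i) for all v
        for v in adj:
            mu[v] += g[v] * patch_area
    lo, hi = min(mu.values()), max(mu.values())
    return {v: (mu[v] - lo) / hi for v in adj}

# Toy dual graph: a path of five faces with unit dual-edge lengths,
# where every face is treated as its own base patch of unit area.
adj = {
    0: [(1, 1.0)],
    1: [(0, 1.0), (2, 1.0)],
    2: [(1, 1.0), (3, 1.0)],
    3: [(2, 1.0), (4, 1.0)],
    4: [(3, 1.0)],
}
patches = {v: 1.0 for v in adj}
prot = protrusion_degree(adj, patches)
```

On this toy path the middle face receives 0 and the two end faces receive the largest value. Using fewer base vertices would reduce the number of Dijkstra runs at the cost of a coarser approximation, which is the trade-off described above.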
2.2.2 Choosing the Salient Representatives of Parts
Here, we describe how to select a set of salient representatives from a given 3-D mesh. Local maxima of the protrusion degree are the criterion used to select salient features. After the selection process is completed, each identified local maximum can be regarded as a salient representative of a part. Given a dual vertex r ∈ V, the dual vertex is chosen as a salient representative of a part if the following condition is satisfied:

\[ \mathrm{Protrusion}(r) = \max_{v \in W_r} \{ \mathrm{Protrusion}(v) \}, \qquad (2.4) \]
where $W_r = \{ v \in V \mid g(r, v) < thr_p \}$ is an observation window for finding a local maximum of protrusion degrees; and $thr_p$ represents the size of the observation window, with which we can control the range of influence of a protrusive stimulus. The observation window can be constructed using the modified version of Dijkstra’s algorithm mentioned in the previous section. By replacing $b_i$ and $thr_\mu$ in Fig. 2.3 with r and $thr_p$ respectively, the darker region shown in Fig. 2.3 can be interpreted as the observation window for choosing the salient representative. Moreover, if the protrusion degree of the vertex r (i.e., the star shown in Fig. 2.3) is the largest value within the local window, the vertex is chosen as the salient representative. Note that when the observation windows of several local maxima overlap, only one of these maxima is chosen as a salient representative.

Figure 2.5: Illustration of the candidate locales construction: The two darker regions are the first two candidate locales of the salient representative of a part.
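The selection rule of Eq. (2.4) can be sketched as below. The radius-limited Dijkstra search stands in for the "modified version of Dijkstra’s algorithm" mentioned in the text, whose exact form is not given here. The rule used to resolve overlapping windows (keep the representative found first and skip later maxima whose window contains it) is an assumption made for this sketch, as are all function and variable names.

```python
import heapq

def window(adj, r, thr_p):
    """Observation window W_r = {v : g(r, v) < thr_p}, with distances
    approximated by shortest paths that never leave the radius thr_p."""
    dist = {r: 0.0}
    heap = [(0.0, r)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < thr_p and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return set(dist)

def salient_representatives(adj, protrusion, thr_p):
    """Dual vertices that are local maxima of the protrusion degree."""
    reps = []
    # Visit vertices from the most to the least protrusive.
    for r in sorted(adj, key=lambda v: -protrusion[v]):
        w_r = window(adj, r, thr_p)
        is_max = all(protrusion[r] >= protrusion[v] for v in w_r)
        # Assumed overlap rule: skip r if a chosen representative is in W_r.
        if is_max and not any(q in w_r for q in reps):
            reps.append(r)
    return reps

# Toy dual graph: a path of five faces with unit dual-edge lengths.
adj = {
    0: [(1, 1.0)],
    1: [(0, 1.0), (2, 1.0)],
    2: [(1, 1.0), (3, 1.0)],
    3: [(2, 1.0), (4, 1.0)],
    4: [(3, 1.0)],
}
protrusion = {0: 0.4, 1: 0.1, 2: 0.0, 3: 0.1, 4: 0.4}

small_window = salient_representatives(adj, protrusion, thr_p=1.5)
large_window = salient_representatives(adj, protrusion, thr_p=10.0)
```

With the small window both ends of the path are selected. With the large window the two maxima fall into each other's window, so only one survives, which illustrates how $thr_p$ controls the range of influence of a protrusive stimulus.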
2.2.3 Modeling the Boundary Strength based on the Border Area Change
In this section, we describe how to convert the concept of boundary strength into a computational process. Since the boundary of a part is completely unknown, we start from the surface mesh and the salient representatives obtained in the previous section. Motivated by the concept of locale turning (described in Section 2.1), we use Dijkstra’s algorithm [23] to explore how the surface evolves in the locale of a boundary. For clarity, we split the computational process for modeling the boundary strength into two steps:
Step 1. Establishing the Candidate Locales
Given a salient representative of a part, r, a set of candidate locales, $\{L_r^x\} = \{L_r^0, L_r^1, \cdots\}$, is established. For simplicity and later use, we drop the subscript r in subsequent descriptions and denote the xth candidate locale as:

\[ L^x = \{ v \mid \forall v \in V,\; x \cdot e \le D(v) < (x + 1) \cdot e \} \quad \text{for } x \in \{0, \ldots, l - 1\}, \qquad (2.5) \]
where e represents the extent of a candidate locale, in which the boundary evolution is explored; and $l = \lfloor \max_{v \in V} D(v) / e \rfloor$ is the number of candidate locales established. D(v) returns the shortest distance from the source, r, to a dual vertex, v, in terms of geodesic distance and protrusive difference. Fig. 2.5 illustrates that based on the new distance measure D(·), the first two candidate locales, $L^0$ and $L^1$, are established using the modified version of Dijkstra’s algorithm (as in Section 2.2.1). To compute the shortest distance D(·), the weight for each edge (u, v) ∈ E in the dual graph is defined as follows:
\[ \mathrm{Weight}(u, v) = \delta \cdot \frac{\mathrm{Len}(u, v)}{\mathrm{avg}(\mathrm{Len})} + (1 - \delta) \cdot \frac{\mathrm{Prot}(u, v)}{\mathrm{avg}(\mathrm{Prot})}, \qquad (2.6) \]

where Len(u, v) is the length of a dual edge between u and v. Here, Prot(u, v) represents the absolute difference in protrusion degree between two dual vertices, u and v. Also, avg(Len) and avg(Prot) represent the average length and the average protrusion degree difference respectively. The two terms address the proximity and similarity requirements of the Gestalt laws. The first term on the right-hand side of Eq. (2.6) is the one usually used in Dijkstra’s algorithm to determine the shortest distance (or path) on a graph in terms of geodesic proximity. The purpose of the second term is to balance the effect of geodesic proximity while creating candidate locales that contain similar protrusion degrees; δ is the weighting between the two constraints. Moreover, including the protrusive similarity helps to keep a locale’s boundaries approximately parallel to a part’s boundaries.
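The following sketch combines Eq. (2.6) with the locale definition of Eq. (2.5): it blends edge length and protrusion difference into one weight, runs Dijkstra’s algorithm from the salient representative to obtain D(·), and bins the dual vertices by distance. The toy graph, the chosen values of δ and e, and all names are assumptions for illustration; the sketch also assumes that the average protrusion difference is non-zero.

```python
import heapq
import math

def blended_weights(edges, length, protrusion, delta):
    """Edge weights of Eq. (2.6) for undirected dual edges given as (u, v)."""
    prot = {(u, v): abs(protrusion[u] - protrusion[v]) for u, v in edges}
    avg_len = sum(length[e] for e in edges) / len(edges)
    avg_prot = sum(prot.values()) / len(edges)  # assumed to be non-zero
    return {e: delta * length[e] / avg_len + (1 - delta) * prot[e] / avg_prot
            for e in edges}

def candidate_locales(weights, r, extent):
    """Bin dual vertices into locales by their distance D(v) from r."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {r: 0.0}
    heap = [(0.0, r)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    count = math.floor(max(dist.values()) / extent)  # l in the text
    locales = [set() for _ in range(count)]
    for v, d in dist.items():
        x = int(d // extent)
        if x < count:  # vertices beyond the last full locale are left out
            locales[x].add(v)
    return locales

# Toy dual graph: a path of five faces with unit dual-edge lengths.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
length = {e: 1.0 for e in edges}
protrusion = {0: 1.0, 1: 0.25, 2: 0.0, 3: 0.25, 4: 1.0}

weights = blended_weights(edges, length, protrusion, delta=0.5)
locales = candidate_locales(weights, r=0, extent=1.5)
```

In this example the edges with a large protrusion change become more expensive than the others even though all edges have the same length, so the locales tend to group faces with similar protrusion degrees.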
Since each salient representative produces a set of candidate locales that eventually grows to cover the whole 3-D mesh, some of the candidate locales must “march” through the potential region of the corresponding part boundary. However, the sets of candidate locales will overlap one another. To prevent candidate locales from marching into the regions occupied by other parts, a constrained set of candidate locales is constructed such that the region-growing process always ends whenever a termination base is touched. To do so, we first define the termination base, K, as follows:
\[ K = \{ v \mid \forall v \in V,\; \mathrm{Protrusion}(v) \le thr_b \}, \qquad (2.7) \]

where $thr_b$ is the parameter used to collect the set of faces that forms the termination base. Next, the constrained set of candidate locales, L, is defined as the union of (m + 1) consecutive locales,
L =
where ∆b is used to specify that the last (∆b+1) locales in L overlap with the termination base, K. By the above construction, the overlap between a constrained set of locales and the termination base provides a potential region in which to find the correct boundary of a part.
Step 2. Modeling the Boundary Strength
With the constrained set of candidate locales established in Step 1, we now consider two adjacent locales in L to determine how the surface evolves in candidate locales. Let $V_{L^x}$ denote the set of dual vertices in $L^{x+1} \in L$ that have a dual edge joining $L^x \in L$ in the graph G. We then associate the following geometric property with the xth candidate locale in L:
\[ f(x) = \sum_{v \in V_{L^x}} \mathrm{area}(v). \qquad (2.10) \]
Since $V_{L^x}$ is a set of dual vertices that collects the direct neighbors between $L^x$ and $L^{x+1}$, f(x) can be regarded as the total-area-of-border between two adjacent candidate locales. Based on the geometric property defined in Eq. (2.10), we model the boundary strength as the total-area-of-border change in response to the boundary’s evolution. The modeling is reasonable because, at the border of two adjacent parts, the total-area-of-border defined above will usually undergo a significant change. Therefore, using the total-area-of-border change to judge whether a locale contains a boundary is a justifiable choice. As a consequence, the boundary strength at the xth candidate locale can be defined as follows:

\[ \mathrm{Boundary\ Strength}(x) = | f(x + 1) - f(x) |. \qquad (2.11) \]
By obtaining the measure of boundary evolution for the boundary strength, we can explore how the surface evolves in the locale of a part’s boundary. Moreover, by treating f (x) as a one-dimensional function defined in L, we can make the process for finding the locale of a part’s boundary analytic.
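Eqs. (2.10) and (2.11) can be sketched in a few lines. The sketch assumes that the locales are given as an ordered list of sets of dual vertices, that `adj` lists the dual-graph neighbors of each vertex, and that `area` gives the face area of each dual vertex; these names and the toy data are hypothetical.

```python
def border_area(locales, adj, area):
    """f(x) of Eq. (2.10): total face area of the dual vertices in locale
    x + 1 that share a dual edge with locale x."""
    f = []
    for cur, nxt in zip(locales, locales[1:]):
        border = {v for v in nxt if any(u in cur for u in adj[v])}
        f.append(sum(area[v] for v in border))
    return f

def boundary_strength(f):
    """Eq. (2.11): |f(x + 1) - f(x)| for consecutive candidate locales."""
    return [abs(b - a) for a, b in zip(f, f[1:])]

# Toy example: three locales along a chain of four faces.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
area = {0: 1.0, 1: 2.0, 2: 1.0, 3: 0.5}
locales = [{0}, {1, 2}, {3}]

f = border_area(locales, adj, area)
strength = boundary_strength(f)

# A longer, made-up border profile with one sharp drop.
profile_strength = boundary_strength([4.0, 4.5, 1.0, 1.25])
```

In the made-up profile the largest value appears where the border area drops sharply, which is the behavior the model expects at the junction between a part and the main body.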
2.2.4 Finding the Locale of A Part’s Boundary
In this section, we describe how to use the previously defined boundary strength to locate the locale of a part’s boundary. As mentioned in the previous section, the boundary strength is quantified in response to the boundary’s evolutionary process. Hence, the locale of a part’s boundary should possess the maximum boundary strength. However, the function f(x) is very jagged (or noisy), since the faces in the immediate neighborhood of the xth candidate locale do not have uniform areas, owing to the nature of a mesh-based object. This makes finding the locale of a part’s boundary very difficult. To overcome this, based on the Haar wavelet representation [83, 96], the function f(x) is transformed into w different scales, $f_1(x), f_2(x), \cdots, f_w(x)$. Then, the candidate locale that possesses the most significant boundary strength is traced from a coarser scale $f_j(x)$ to a finer scale $f_{j-1}(x)$ until a predefined finer scale is reached. In this way, one can conduct a coarse-to-fine search to identify the locale that contains a part’s boundary.
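The coarse-to-fine idea can be illustrated with the sketch below. It uses unnormalized Haar approximations (pairwise averages) as the coarser scales and, at each refinement step, restricts the search to the finer-scale indices covered by the current coarse index. Both choices are assumptions made for illustration and are not necessarily the tracing rule used in this chapter; the indexing (scale 0 is the original f(x)) and all names are likewise hypothetical.

```python
def haar_scales(f, w):
    """Return [f_0, ..., f_w], where f_0 = f and f_j averages adjacent
    pairs of f_(j-1). A trailing unpaired sample is dropped."""
    scales = [list(f)]
    for _ in range(w):
        prev = scales[-1]
        scales.append([(prev[i] + prev[i + 1]) / 2.0
                       for i in range(0, len(prev) - 1, 2)])
    return scales

def strength(fj, k):
    """Boundary strength |f_j(k + 1) - f_j(k)|, or 0 at the last index."""
    return abs(fj[k + 1] - fj[k]) if k + 1 < len(fj) else 0.0

def coarse_to_fine(f, w, stop=0):
    """Trace the strongest boundary from scale w down to scale `stop`."""
    scales = haar_scales(f, w)
    top = scales[w]
    k = max(range(len(top)), key=lambda i: strength(top, i))
    for j in range(w, stop, -1):
        finer = scales[j - 1]
        # Assumed refinement: a jump between coarse cells k and k + 1 must
        # start at one of these three positions on the finer scale.
        children = [c for c in (2 * k, 2 * k + 1, 2 * k + 2) if c < len(finer)]
        k = max(children, key=lambda c: strength(finer, c))
    return k

# Made-up border profile: a spike at index 1 and a real drop after index 3.
f = [5.0, 9.0, 5.0, 5.0, 1.0, 1.0, 1.0, 1.0]
naive = max(range(len(f)), key=lambda i: strength(f, i))
traced = coarse_to_fine(f, w=2)
```

On this profile a direct search on the finest scale is attracted to the isolated spike, whereas the traced index settles on the sustained drop, which is the motivation for searching from coarse to fine.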
Let $k_{j-1}$ denote the index of the candidate locale that possesses the maximum boundary strength in $f_{j-1}(x)$. Then, the index $k_{j-1}$ is determined by the following