A Generalized Shape-Axis Model for Planar Shapes


Copyright © 15th Int'l. Conf. on Pattern Recognition, vol. 3, pp. 491-495, Barcelona, Spain, September 2000


Tyng-Luh Liu

IIS, Academia Sinica, Taipei, Nankang 115, Taiwan liutyng@iis.sinica.edu.tw

Abstract

We describe a generalized shape-axis (SA) model for representing both open and closed planar curves. The SA model is an effective way to represent shapes by comparing their self-similarities. Given a 2D shape, whether closed or open, we use two different parameterizations of the curve, one oriented clockwise and the other counterclockwise. To study the self-similarity, the two parameterizations are matched to each other via a variational framework, where the self-similarity criterion is defined depending on the class of shapes and human perception factors. Useful self-similarity criteria include symmetry, parallelism, and convexity. A match is allowed to have discontinuities, and the optimal match can be computed by a dynamic programming algorithm in O(N⁴) time, where N is the size of the shape. We use a grouping process on the shape axis to construct a unique SA-tree; when a planar shape is open, however, it is possible to derive an SA-forest. The generalized SA model provides a compact and informative way for 2D shape representation.

1 Introduction

An effective and compact shape representation system is a critical element of various computer vision applications, and consequently the subject has been studied extensively [3, 4, 11, 14, 15, 2, 6, 12, 16, 17, 20, 9, 19, 21, 8].

If shapes are represented with global descriptors, then local deformations cannot be described, so articulated objects tend to appear dissimilar: these deformations may change the global appearance of objects considerably even though the entire deformation is concentrated at specific points. Alternatively, if shapes are described by local boundary descriptors [2], then they do not account for region information, object parts, and global properties such as symmetries.

A good shape representation system should not be sensitive to small changes of appearance; otherwise similar objects of the same class will be represented very differently.

The shape-axis model [9] is a variational framework for 2D closed shapes/curves (see Figure 1) and is considered to be (i) complete, i.e., all the object information is stored in the representation, (ii) simple and compact, that is, redundancies are removed by capturing symmetries, region information, and articulations (local deformations), and by dividing the object into “object parts” with as few parts as needed for any specified level of detail, (iii) stable, i.e., robust under small variations of the shape (including noise variations and articulations), and (iv) easily computable, i.e., its representation is easy to find and to manipulate.

Figure 1. (a) Two shape contours extracted from real images. (b) Matching correspondences and shape axes. (c) SA-trees.

Our goal is to extend the shape-axis model so it can be applied to both open and closed planar shapes. Compared to other recent related work [17, 19, 20, 21], such an extension is significant and is the first shape representation model capable of dealing with open shapes. After the shape axis of a shape contour has been recovered, a grouping process based on analyzing the discontinuities of the shape axis can be carried out to construct the unique SA-tree [9]. When a planar curve is open, it is possible to derive an SA-forest (see Figure 4(c)). With the SA-tree/forest structure, comparing shapes with deformations, articulations and occlusions can be formulated as solving a tree matching problem [8].


Previous Work Blum first proposed to describe shapes by their symmetry and thickness [4]. Binford [3] was also a pioneer and brought attention to generalized cylinders. Other early work includes [1, 11]. Pizer et al. have presented a computational model for object representation via “cores”, or regions of high medialness in intensity images [5]. Leymarie and Levine [10] have simulated the grassfire transform (from the psychology literature). Ogniewicz [12] proposed an efficient Voronoi skeleton algorithm. Siddiqi and Kimia’s work [16] has an interesting mathematical formulation that preserves the intuition of the grassfire idea. It develops a reaction-diffusion equation in which the symmetry axis is obtained and described by the development of shocks, and it is motivated by the framework for shape analysis via shock-diffusion equations [7]. Tari and Shah [19] extended their level-set approach to shapes of arbitrary dimensions. FORMS, by Zhu and Yuille [20], was based on transforming shapes into skeleton graphs for recognition. Recently, Zhu has proposed a stochastic jump-diffusion process for computing medial axes [21]. Liu et al. [9] have addressed the shape representation problem via a variational approach that goes beyond the symmetry axis representation. The resulting shape axis can be partially inside and partially outside a shape, and it has a unique SA-tree structure which can be used for shape recognition [8].

2 Variational Matching of Planar Curves

Given a planar curve, two parameterizations, one oriented counterclockwise and the other clockwise, can be used to describe it (see Figure 2(a)). We denote them as Γ = {x(s) : 0 ≤ s ≤ 1} and ˜Γ = {˜x(t) : 0 ≤ t ≤ 1}.

We may think of ˜Γ as the mirror image of Γ since ˜x(t) = x(1−t), and when the curve is closed we have x(0) = x(1).
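For a discretely sampled contour, the mirror relation ˜x(t) = x(1 − t) amounts to reversing the sample order. The following is a minimal sketch under the assumption that the curve is given as N + 1 samples x(s_0), ..., x(s_N) with s_i = i/N; the function names `two_parameterizations` and `is_closed` are illustrative and do not come from the paper.

```python
def two_parameterizations(points):
    """Return (x, x_tilde) for samples x(s_0), ..., x(s_N) with s_i = i / N.

    x_tilde[j] plays the role of x~(t_j) = x(1 - t_j), i.e. points[N - j].
    """
    x = [tuple(float(c) for c in p) for p in points]
    x_tilde = x[::-1]  # reversed traversal of the same contour
    return x, x_tilde


def is_closed(points):
    """A sampled curve is treated as closed when x(s_0) == x(s_N)."""
    return tuple(points[0]) == tuple(points[-1])


# Illustrative data: a unit square traversed once, first point repeated.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
x, x_tilde = two_parameterizations(square)
```

Both lists describe the same point set; only the direction of traversal differs, which is what allows a curve to be matched against itself.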

By a match between Γ and ˜Γ we mean a monotone correspondence between the two parameterizations. To represent a match we specify a pair of monotone functions s(σ) and t(σ), each defined for 0 ≤ σ ≤ 1, such that x(s(σ)) corresponds with ˜x(t(σ)). Notice that the end points are not required to be mapped to end points in the other parameterization. It is convenient to represent a match between Γ and ˜Γ as a binary function µ(s, t):

$$
\mu(s, t) =
\begin{cases}
1, & \text{if } x(s) \text{ corresponds to } \tilde{x}(t) = x(1 - t), \\
0, & \text{otherwise.}
\end{cases}
$$

The plot of µ is symmetric with respect to the mirror line s + t = 1 since µ(s, t) = µ(1 − t, 1 − s). Therefore, it suffices to consider only the lower triangular part of the (s × t) matching space (see Figure 2(b)). The shape axis is defined to be the locus of the midpoints of the correspondences, i.e.,

$$
\left\{ \frac{x(s) + \tilde{x}(t)}{2} \;\middle|\; \mu(s, t) = 1 \right\}.
$$
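The midpoint definition can be illustrated on sampled data. The sketch below assumes the contour is given as N + 1 points and the match as a list of index pairs (i, j) with µ(s_i, t_j) = 1; the function name `shape_axis` and the toy parabola are assumptions made for illustration only.

```python
def shape_axis(points, match):
    """Midpoints (x(s_i) + x~(t_j)) / 2 for every matched pair (i, j).

    points: samples x(s_0), ..., x(s_N); x~(t_j) = x(1 - t_j) = points[N - j].
    match:  iterable of (i, j) index pairs with mu(s_i, t_j) = 1.
    """
    n = len(points) - 1
    axis = []
    for i, j in match:
        px, py = points[i]
        qx, qy = points[n - j]  # the point that x(s_i) is matched with
        axis.append(((px + qx) / 2.0, (py + qy) / 2.0))
    return axis


# Toy open curve: five samples of a parabola, symmetric about the y-axis.
parabola = [(k - 2, (k - 2) ** 2) for k in range(5)]
# Hypothetical match in the lower triangle (i + j <= N): each point is
# paired with its mirror image; (2, 2) lies on the mirror line (self match).
symmetric_match = [(0, 0), (1, 1), (2, 2)]
axis = shape_axis(parabola, symmetric_match)
```

For this symmetric toy curve every midpoint falls on the line of symmetry, which is the behavior one would expect from a symmetry-favoring match.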

Figure 2. (a) Two parameterizations of a shape contour and its shape axis and SA-tree. (b) The piecewise monotone and continuous correspondences, i.e., the solution path µ(s, t) = 1, plotted in the matching space (s, t) together with the mirror line s + t = 1. Notice that the jumps/discontinuities in the match are grouped together to form the degree-4 bifurcation node in the SA-tree.

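To determine a “good match” we minimize an appropriate variational problem with a matching energy of the form

$$
\int_0^1 F(x, \tau; \tilde{x}, \tilde{\tau}; s', t') \, d\sigma, \tag{1}
$$

where F is the energy density, primes denote derivatives with respect to σ, and τ = x_s/|x_s| and ˜τ = ˜x_t/|˜x_t| are the oriented unit tangent vectors to Γ and ˜Γ at x(s(σ)) and ˜x(t(σ)), respectively. Following [9], we define the energy/cost of a match as

$$
\begin{aligned}
E[t(\sigma), s(\sigma)]
&= \int_0^1 \Big[ F(x, \tau; \tilde{x}, \tilde{\tau}; s', t') + \theta(\sigma) \cdot \mathit{JumpCost} \Big] \, d\sigma \\
&= \int_0^1 \bigg\{ \frac{\big[ (x(s) - \tilde{x}(t)) \cdot (\tau(s) s'(\sigma) + \tilde{\tau}(t) t'(\sigma)) \big]^2}{s'(\sigma) + t'(\sigma)} \\
&\qquad + \frac{\big[ (x(s) - \tilde{x}(t)) \cdot (\tau(s) s'(\sigma) - \tilde{\tau}(t) t'(\sigma))^{\perp} \big]^2}{s'(\sigma) + t'(\sigma)} \\
&\qquad + c \, \frac{|x(s) - \tilde{x}(t)|^2 \, |s'(\sigma) - t'(\sigma)|^2}{s'(\sigma) + t'(\sigma)} + \theta(\sigma) \cdot \mathit{JumpCost} \bigg\} \, d\sigma ,
\end{aligned}
\tag{2}
$$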

where the penalty term θ(σ) · JumpCost is interpreted as follows: at a jump, the values t(σ+) = lim_{δ→0} t(σ + δ) and t(σ−) = lim_{δ→0} t(σ − δ) are different. The associated jump cost can be a function of |t(σ+) − t(σ−)|. In practice we have used a constant cost, JumpCost, for jumps. The energy density F defined in (2) favors symmetry and closeness, so two points are more likely to match each other if the distance between them is not large and the corresponding tangents are symmetric. It can be shown that F has several useful structural and geometric properties, including invariance under a change of variable in σ (parameterization invariance), translation invariance, and rotation invariance.

3 Generalized Shape-Axis Algorithm

We describe a generalized shape-axis algorithm based on dynamic programming for solving the variational matching problem (2) for both open and closed planar curves. The method does not put any restriction on the degree of a bifurcation in a shape axis, so it can be applied to a variety of complex planar shapes.

In practice a planar curve is given discretely, as a sequence of N + 1 points (if it is a closed curve of N points, we just copy the first point and append it to the end, that is, x(s_N) = x(s_0)). Then it is straightforward to derive a discrete matching energy for equation (2):

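$$
E[t(\sigma), s(\sigma)] \approx \frac{1}{N} \sum_{k=0}^{N-1} \big(1 - \theta(\sigma_k)\big) \cdot \hat{F}\big(s(\sigma_k), t(\sigma_k), s(\sigma_{k+1}), t(\sigma_{k+1})\big) + \sum_{\text{jumps}} \mathit{JumpCost} , \tag{3}
$$

where

$$
\begin{aligned}
\hat{F}\big(s(\sigma_k), t(\sigma_k), s(\sigma_{k+1}), t(\sigma_{k+1})\big)
&= \frac{\big[ (x(s(\sigma_k)) - \tilde{x}(t(\sigma_k))) \cdot (\tau(s(\sigma_k)) s'(\sigma_k) + \tilde{\tau}(t(\sigma_k)) t'(\sigma_k)) \big]^2}{s'(\sigma_k) + t'(\sigma_k)} \\
&\quad + \frac{\big[ (x(s(\sigma_k)) - \tilde{x}(t(\sigma_k))) \cdot (\tau(s(\sigma_k)) s'(\sigma_k) - \tilde{\tau}(t(\sigma_k)) t'(\sigma_k))^{\perp} \big]^2}{s'(\sigma_k) + t'(\sigma_k)} \\
&\quad + c \, \frac{|x(s(\sigma_k)) - \tilde{x}(t(\sigma_k))|^2 \, |s'(\sigma_k) - t'(\sigma_k)|^2}{s'(\sigma_k) + t'(\sigma_k)} ,
\end{aligned}
\tag{4}
$$

$$
s'(\sigma_k) = \frac{s(\sigma_{k+1}) - s(\sigma_k)}{N}
\quad \text{and} \quad
t'(\sigma_k) = \frac{t(\sigma_{k+1}) - t(\sigma_k)}{N} .
$$

The indicator function θ(σ_k) is defined to be 1 at σ_k if a jump occurred and 0 otherwise.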

Our task now is to search in {s_i × t_j; i, j = 0, 1, ..., N} for an optimal solution path {µ*_ij = µ*(s_i, t_j) = 1 | i, j = 0, ..., N} that yields the minimum energy of (3).
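To make the three terms of the discrete density in (4) concrete, here is a minimal sketch of one evaluation. Several choices are assumptions made for illustration and are not specified in the text: unit tangents are estimated by finite differences, the perpendicular is a 90-degree counterclockwise rotation, the steps `ds` and `dt` are passed in directly by the caller, and the names `unit_tangents` and `f_hat` are hypothetical. Because each term is homogeneous of degree one in the pair (s′, t′), the scale used for the steps only rescales the result by a constant.

```python
import math


def unit_tangents(points):
    """Finite-difference unit tangents (central inside, one-sided at the ends).

    Assumes points[k - 1] != points[k + 1], so the norm is never zero.
    """
    n = len(points)
    tangents = []
    for k in range(n):
        ax, ay = points[max(k - 1, 0)]
        bx, by = points[min(k + 1, n - 1)]
        norm = math.hypot(bx - ax, by - ay)
        tangents.append(((bx - ax) / norm, (by - ay) / norm))
    return tangents


def f_hat(points, i, j, ds, dt, c=1.0):
    """Sketch of the discrete density (4) at the pair (s_i, t_j).

    ds, dt: the steps s'(sigma_k), t'(sigma_k); need ds + dt > 0.
    c:      weight of the third term (value is an assumption).
    """
    x_tilde = points[::-1]                     # x~(t_j) = x(1 - t_j)
    (ux, uy) = unit_tangents(points)[i]        # tau at s_i
    (vx, vy) = unit_tangents(x_tilde)[j]       # tau~ at t_j
    dx = points[i][0] - x_tilde[j][0]          # x(s_i) - x~(t_j)
    dy = points[i][1] - x_tilde[j][1]
    total = ds + dt
    # First term: projection onto tau * s' + tau~ * t'.
    px, py = ux * ds + vx * dt, uy * ds + vy * dt
    term1 = (dx * px + dy * py) ** 2 / total
    # Second term: projection onto the perpendicular of tau * s' - tau~ * t'.
    mx, my = ux * ds - vx * dt, uy * ds - vy * dt
    term2 = (dx * -my + dy * mx) ** 2 / total
    # Third term: squared distance, weighted by the step imbalance.
    term3 = c * (dx * dx + dy * dy) * (ds - dt) ** 2 / total
    return term1 + term2 + term3


# Toy mirror-symmetric open curve (five samples of a parabola).
parabola = [(float(k - 2), float((k - 2) ** 2)) for k in range(5)]
symmetric_cost = f_hat(parabola, 1, 1, 1.0, 1.0)
lopsided_cost = f_hat(parabola, 1, 1, 1.0, 0.0)
```

On this toy curve, pairing a point with its mirror image using equal steps makes all three terms vanish, while an unbalanced step is charged a positive cost, in line with the statement that the density favors symmetry and closeness.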

3.1 Dynamic Programming

As stated before, due to the mirror property of µ we shall focus only on the lower triangular part of the diagram {s_i × t_j}. Any possible match, including the optimal one, can be obtained by searching and scanning downward along diagonal lines, at 45 degrees, starting from the mirror line t = 1 − s (or j = N − i in the discrete form). Each diagonal line can be indexed by l, which varies over {0, 1, ..., N}, and represented as t = l/N − s. So l = N corresponds to the mirror line and l = 0 corresponds to the origin (s_0, t_0) (see Figure 3). For convenience, we switch from the representation of matching points (i, j) → (s_i, t_j) (or µ_ij = 1) to the representation [i, l] → (s_i, l/N − s_i) (or µ_il = 1).

Dynamic programming sets an initial cost along the mirror line l = N and iterates on l = N − 2, ..., 0 (note that l = N − 1 does not need to be considered since it would cause a path to reach the mirror line at a non-lattice point, i.e., we can simply set the costs to infinity on the line l = N − 1). Along each diagonal line l it visits every (s_i, l/N − s_i), i = 0, ..., l, solving the subproblem “What is the cost of the best path passing through (s_i, l/N − s_i)?”.
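The change of indices from (i, j) to [i, l] and the order in which the diagonals are visited can be sketched as follows. With s_i = i/N and t_j = j/N, the line t = l/N − s gives l = i + j. The helper names below are hypothetical, and the sketch only enumerates subproblems; it does not compute any cost.

```python
def to_diagonal(i, j):
    """(i, j) -> [i, l]: the point (s_i, t_j) lies on diagonal l = i + j."""
    return i, i + j


def from_diagonal(i, l):
    """[i, l] -> (i, j): recover the t-index j = l - i."""
    return i, l - i


def dp_visit_order(n):
    """Subproblems [i, l] in visiting order: the mirror line l = n first,
    then l = n - 2, ..., 0; the diagonal l = n - 1 is skipped."""
    order = []
    for l in [n] + list(range(n - 2, -1, -1)):
        for i in range(l + 1):
            order.append((i, l))
    return order


order = dp_visit_order(4)
```

For N = 4 this visits the five mirror-line points first, then the diagonals l = 2, 1, 0, ending at the origin [0, 0].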

Figure 3. The subproblem being solved is (s_i, a/N − s_i), i.e., to find Cost[i, a]. The candidate predecessors can either come from D_S (panel (a)) to form a smooth transition or from D_J (panel (b)) to form a bifurcation, where the grey areas represent subproblems that have been solved. In both panels the mirror line is j = N − i (l = N). Notice that for the bifurcation case, we only need to search the domain D_J since for each [i, b] ∈ D_J the third point, [N − b + i, N − b + a], is automatically determined.

We use the matrix Cost[i, l] to store the best cost of each subproblem at (s_i, l/N − s_i). With this representation, the second index gives the index l of the diagonal line and the first index fixes the position on the diagonal line l. A sketch of the algorithm is as follows.

1. Initialization: Initially, all Cost[i, l]’s are set to ∞ except those on the mirror line. The mirror line represents “self matches”, i.e., matches between points along the shape contour and themselves. Most often they occur at points of locally maximum/minimum curvature. Thus, instead of setting

Cost[i, N] = 0, for i = 0, ..., N,

one could put a bias toward points along the contour whose curvatures are local maxima/minima, that is, raise the costs for the other points.

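2. Iterations on l: Suppose we have solved the subproblems up to line l = a + 1, i.e., Cost[i, l], i = 0, 1, ..., l and l = N, N − 1, ..., a + 1, are computed. Then Cost[i, a], i = 0, ..., a, on diagonal line l = a can be derived by

$$
\mathit{Cost}[i, a] = \min \big\{ G_S[i, a], \; G_J[i, a] \big\} , \tag{5}
$$

where

$$
G_S[i, a] = \min_{[j, b] \in D_S[i, a]} \Big\{ \hat{F}(s_i, t_{a-i}, s_j, t_{b-j}) + \mathit{Cost}[j, b] \Big\} , \tag{6}
$$

$$
\begin{aligned}
G_J[i, a] = \min_{[i, b] \in D_J[i, a]} \Big\{ & \mathit{Cost}[i, b] + \mathit{Cost}[N - b + i, N - b + a] \\
& + \big[ (3 - 2 \times J[i, b] - 2 \times J[N - b + i, N - b + a]) \times \mathit{JumpCost} \big] \Big\} ,
\end{aligned}
\tag{7}
$$

and

$$
J[i, a] =
\begin{cases}
1, & \text{if } \mathit{Cost}[i, a] = G_J[i, a], \\
0, & \text{if } \mathit{Cost}[i, a] = G_S[i, a],
\end{cases}
$$

$$
\begin{aligned}
D_S[i, a] &= \{ [i + 1, d] : a + 1 < d \le N \} \cup \{ [i + k, a + 1 + k] : 0 < k < N - a \} , \\
D_J[i, a] &= \{ [i, d] : a + 1 < d < N - 1 \} .
\end{aligned}
$$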

The notation G_J represents the cost of grouping the two branches to be part of a bifurcation (due to jumps) and G_S represents the cost of having the predecessor in a smooth path (continuation). D_S and D_J are the domains where dynamic programming searches for possible predecessors to form a smooth path or a bifurcation, respectively (see Figure 3). The binary matrix J[i, a] defined above guarantees that each (vertical) jump is penalized only once, and the penalty term in (7) ensures that a degree-n bifurcation is charged JumpCost exactly n times.
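The search domains and the jump penalty can be written out directly from their definitions. The sketch below is illustrative only: the function names are hypothetical, subproblems are represented as (i, l) tuples, and no costs are computed.

```python
def smooth_domain(i, a, n):
    """D_S[i, a]: candidate predecessors for a smooth continuation."""
    column = {(i + 1, d) for d in range(a + 2, n + 1)}        # a+1 < d <= N
    row = {(i + k, a + 1 + k) for k in range(1, n - a)}       # 0 < k < N-a
    return sorted(column | row)


def jump_domain(i, a, n):
    """D_J[i, a]: candidate first branches [i, d] with a+1 < d < N-1."""
    return [(i, d) for d in range(a + 2, n - 1)]


def third_point(i, a, b, n):
    """Second branch determined by the choice [i, b] in D_J[i, a]."""
    return (n - b + i, n - b + a)


def jump_penalty(j_first, j_second, jump_cost):
    """Penalty term of (7); j_first and j_second are the J flags (0 or 1)
    of the two branches being grouped."""
    return (3 - 2 * j_first - 2 * j_second) * jump_cost


d_s = smooth_domain(1, 2, 6)
d_j = jump_domain(1, 2, 6)
```

The penalty arithmetic shows why a degree-n bifurcation is charged n times. Grouping two smooth branches (both flags 0) costs 3 × JumpCost, one per branch of a degree-3 node. If one branch already ends in a jump (flag 1), that jump was already charged 3 × JumpCost and the new grouping adds only 1 × JumpCost, for 4 in total at a degree-4 node.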

To backtrack the optimal solution, the algorithm also keeps the predecessors

$$
\mathit{Back}[i, a] =
\begin{cases}
\{ [j^*, b^*] \}, & \text{if } \mathit{Cost}[i, a] = G_S[i, a], \\
\{ [i, b^*], \; [N - b^* + i, N - b^* + a] \}, & \text{if } \mathit{Cost}[i, a] = G_J[i, a].
\end{cases}
\tag{8}
$$

3. Optimal solution and backtracking: In the end this iterative process, from l = N to l = 0, will provide Cost[0, 0], the total cost of the optimal match. The final stage, l = 0, requires special attention when the contour is closed and [0, 0] has two “continuous” predecessors, say J[0, b] = J[N − b, N − b] = 0. Then, unlike the usual “jump” case, it does not imply a bifurcation, due to the periodicity of a closed curve, and in this case the origin does not represent any match, but only the end of the algorithm.

To construct the corresponding SA-tree, we start at the origin [0, 0], and the value Back[i, a] takes us at each step to the predecessor(s). There is either one predecessor or two, depending on whether the step is a continuation or a bifurcation. When there is one predecessor, backtracking continues accumulating (retrieving) the segment path. When there are two predecessors, we group them together and conclude that a bifurcation node of the tree has been reached. To complete a bifurcation node, backtracking should be applied to each of the two predecessors in parallel, and each process is again carried out recursively until reaching a point with only one predecessor. Then all the points visited are grouped together to complete the bifurcation node, and the backtracking process starts again from each of them.
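The grouping step of the backtracking can be sketched recursively. Everything about the data layout here is an assumption: `back` is a hypothetical dictionary mapping a subproblem (i, a) to its list of one or two predecessors, points on the mirror line have no entry, and the toy table is hand-made rather than produced by an actual run. The special handling of the origin for closed curves is not covered.

```python
def bifurcation_branches(back, node):
    """Expand jumps recursively from `node`; return the points where the
    outgoing branches of the grouped bifurcation start."""
    preds = back.get(node, [])
    if len(preds) < 2:
        return [node]  # not a jump: a branch starts here
    branches = []
    for p in preds:
        branches.extend(bifurcation_branches(back, p))
    return branches


def trace_segment(back, node):
    """Follow single predecessors from `node` and return the visited path."""
    path = [node]
    while len(back.get(node, [])) == 1:
        node = back[node][0]
        path.append(node)
    return path


def build_tree(back, start):
    """Nested dict: a segment followed by the subtrees of its bifurcation."""
    path = trace_segment(back, start)
    end = path[-1]
    children = []
    if len(back.get(end, [])) == 2:
        children = [build_tree(back, b) for b in bifurcation_branches(back, end)]
    return {"segment": path, "children": children}


# Hand-made table for N = 8: the jump at (1, 2) leads to another jump at
# (3, 4), so the two jumps are grouped into one node with three branches.
back = {
    (0, 0): [(1, 2)],
    (1, 2): [(1, 6), (3, 4)],
    (3, 4): [(3, 6), (5, 6)],
    (1, 6): [(2, 8)],
    (3, 6): [(4, 8)],
    (5, 6): [(6, 8)],
}
tree = build_tree(back, (0, 0))
```

In this toy table the node reached at (1, 2) has three outgoing branches plus the incoming segment, i.e., it is a degree-4 bifurcation formed by grouping two consecutive jumps.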

Complexity: There are O(N²) subproblems to be solved (evaluating Cost[i, l] for i = 0, ..., l and l = N − 1, ..., 0), and the complexity of each subproblem is O(N²) (the total size of the search spaces D_S and D_J is O(N) and the complexity of ˆF(s_i, t_{a−i}, s_k, t_{d−k}) is also O(N), assuming uniform matching). Therefore, the complexity of this algorithm is O(N⁴).

4 Experimental Results and Conclusion

We have proposed a generalized shape-axis model based on dynamic programming for planar shape representation. The approach generalizes the variational framework presented in [9], and can be used for both open and closed planar shape contours. The shape axis obtained can be interpreted as the optimal solution to a well-defined variational matching problem. This is also where the power of the approach lies: by changing the optimality criterion we can obtain different representations. It is worth noting that (i) the number of branches in a bifurcation is allowed to be arbitrary, and is thus data driven, and (ii) the shape-axis solution is rather stable against small perturbations on the boundary of a planar shape. A set of experimental results is provided in Figure 4.

Acknowledgments

The author is supported in part by Academia Sinica of Taiwan, and NSC grant 89-2213-E-001-023.


References

[1] H. Asada and M. Brady, “The Curvature Primal Sketch,” IEEE TPAMI, 8(1):2-14, 1986.

[2] R. Basri, L. Costa, D. Geiger and D. Jacobs, “Determining the Similarity of Deformable Shapes,” Physics Based Modeling Workshop in Computer Vision, 1995.

[3] T. O. Binford, “Visual Perception by Computer,” IEEE Conference on Systems and Control, December 1971.

[4] H. Blum, “Biological Shape and Visual Science,” J. of Theoretical Biology, 38, pp. 205-287, 1973.

[5] C. A. Burbeck and S. M. Pizer, “Object Representation by Cores: Identifying and Representing Primitive Spatial Regions,” Vision Research, Vol. 35, 1995.

[6] Y. Gdalyahu and D. Weinshall, “Measures for Silhouettes Resemblance and Representative Silhouettes of Curved Objects,” ECCV(II), pp. 363-375, Cambridge, UK, April 1996.

[7] B. Kimia, A. Tannenbaum and S. Zucker, “Shapes, Shocks, and Deformations I: The Components of Two-Dimensional Shape and the Reaction-Diffusion Space,” IJCV, 15(3):189-224, July 1995.

[8] T-L. Liu and D. Geiger, “Approximate Tree Matching and Shape Similarity,” ICCV, pp. 456-462, Kerkyra, Greece, September 1999.

[9] T-L. Liu, D. Geiger and R. V. Kohn, “Representation and Self-Similarity of Shapes,” ICCV, pp. 1129-1135, Bombay, India, 1998.

[10] F. Leymarie and M. D. Levine, “Simulating the Grassfire Transform Using an Active Contour Model,” IEEE TPAMI, 14(1):56-75, 1992.

[11] R. Nevatia and T. O. Binford, “Description and Recognition of Curved Objects,” Artificial Intelligence, Vol. 8, pp. 77-98, 1977.

[12] R. L. Ogniewicz, Discrete Voronoi Skeletons, Hartung-Gorre, 1993.

[13] A. Pentland and S. Sclaroff, “Closed-Form Solutions for Physically Based Shape Modeling and Recognition,” IEEE TPAMI, 13(7):715-729, 1991.

[14] J. Ponce and O. D. Faugeras, “An Object Centered Hierarchical Representation for 3D Objects: The Prism Tree,” CVGIP, 38(1):1-28, April 1987.

[15] S. M. Pizer, W. R. Oliver and S. H. Bloomberg, “Hierarchical Shape Description via the Multiresolution Symmetric Axis Transform,” IEEE TPAMI, 9(4):505-511, 1987.

[16] K. Siddiqi and B. B. Kimia, “Parts of Visual Form: Computational Aspects,” IEEE TPAMI, 17(3):239-251, 1995.

[17] K. Siddiqi and B. B. Kimia, “A Shock Grammar for Recognition,” CVPR, pp. 507-513, San Francisco, CA, 1996.

[18] D. Terzopoulos, A. Witkin and M. Kass, “Symmetry-Seeking Models and 3D Object Recovery,” IJCV, 1(3):211-221, October 1987.

[19] S. Tari and J. Shah, “Local Symmetries of Shapes in Arbitrary Dimension,” ICCV, pp. 1123-1128, Bombay, India, 1998.

[20] S. C. Zhu and A. Yuille, “FORMS: A Flexible Object Recognition and Modeling System,” ICCV, pp. 465-472, Cambridge, MA, 1995.

[21] S. C. Zhu, “Stochastic Computation of Medial Axis in the Markov Random Fields,” CVPR, Santa Barbara, CA, 1998.

Figure 4. (a), (b) Experimental results for closed and open shapes. (c) Examples of SA-forests for open shapes.