Perfect Matches of Lung Branch Points via Mass Transport Methods


PREPRINT

Preprint, Department of Mathematics, National Taiwan University

www.math.ntu.edu.tw/~mathlib/preprint/2012-10.pdf


Pengwen Chen, Ching-Long Lin, and I-Liang Chern

July 12, 2012


PENGWEN CHEN∗, CHING-LONG LIN†, AND I-LIANG CHERN‡

Abstract.

In this paper, we focus on point-set matching methods based on Monge-Kantorovich mass transport problems. We apply these methods to match two sets of pulmonary vascular tree branch points whose displacement is caused by lung volume changes in the same human subject.

The nearly perfect matches obtained for the six subjects used in this study verify the effectiveness of this mass transport-based approach for matching lung branch points. A theoretical analysis of the L2 mass transport cost is provided to explain these perfect match results, including the curl-cardinality relation and the invariant properties.

Key words. Mass transport problems, point-set matching problems, Gaussian kernel correlation, lung registration

AMS subject classifications.

1. Introduction.

1.1. Literature reviews. Registration is concerned with matching two or more sets of image data taken at different times or from different sensors. Depending on the type of image data, registration methods can be classified into two groups: intensity-based methods and feature-based methods. Comprehensive surveys of traditional registration methods can be found in [26] and [54]. Here, we provide a brief review of point-set matching problems and mass transport, which are related to the present study.

Point-set representations of image data are commonly employed in computer vision. These points are typically the specific features of an image, such as the locations of corners, boundary points or salient regions. The associated point-set matching (registration) problem is concerned with establishing a consistent point-to-point correspondence between two point-sets and recovering the spatial transformation that yields the best alignment. The correspondence can be regarded as a spatial transformation achieved by either a linear transform or a non-rigid transform procedure.

Several well-known methods have been developed. For example, the iterative closest point (ICP) algorithm is one common approach to the feature-based image registration problem [3] [52]. One limitation of ICP is its local convergence restriction, which requires sufficient overlap between the data-sets for initialization. Another drawback is the algorithm's susceptibility to outliers. To alleviate these challenges, a variety of robust methods have been developed. In [14], Chui and Rangarajan proposed a robust point matching (RPM) method that simultaneously estimates the correspondence and a non-rigid transformation via splines. Whereas an asymmetric correspondence between two point-sets is obtained via the closest point method, a technique called soft-assignment has been proposed to establish a symmetric correspondence and detect outliers.

∗Mathematics, National Taiwan University (pengwen@math.ntu.edu.tw).

†Engineering, University of Iowa (ching@engineering.uiowa.edu).

‡Mathematics, National Taiwan University (chern@math.ntu.edu.tw).

The dissimilarity between two unlabeled point-sets can be measured by various probability distances. One common choice is the Wasserstein distance, which arises from the well-known Monge-Kantorovich (MK) mass transport problem [22]. A survey of theoretical studies on this problem can be found in [15] or [44]. Monge-Kantorovich mass transport has been employed in image retrieval, image registration and image morphing [47], [21], [36], [53], [31], [29]. Image intensity functions are regarded as piles of soil, and image registration involves the task of optimally moving these piles of soil. One advantage of the MK problem is that any local minimizer is a global minimizer as a result of the inherent convexity of the problem. However, two major obstacles hinder its popularity in image registration. First, computation of a minimizer is not an easy task for large point-sets, except in the one-dimensional case.

Second, with regard to practical applications, intensity-based methods are sensitive to intensity changes, which can be caused by noise or variations in illumination [54]. Experiments show that the morphing effect in mass transport seems to be unavoidable and may lead to misregistration for medical images [29]. Recently, mass transport has been incorporated into feature-based methods. The Hellinger distance-based point-set matching model (HD) [11] is one example. The HD model can be regarded as an approximation of the MK problem when the kernel scale tends to infinity. With a finite kernel scale, the measure preserving constraint is relaxed to tolerate the existence of outliers.

There are several research studies related to lung registration. The intensity-based lung registration method matches the intensity patterns of the images retrieved at different lung volumes by minimizing a similarity measure [38][13][32][50][51]. The sum of the squared tissue volume difference (SSTVD) is a new similarity criterion based on preserving tissue mass of the lung at different volumes [51]. SSTVD has been demonstrated to yield an improved registration for large deformations in the lower lobes. However, such methods are known to require high computational costs¹, and a mismatch of important anatomical landmarks may occur when the registration optimization falls into a local minimum. In addition to intensity data, anatomical features have also been used to derive the transformation from one lung dataset to another [16][4][24][10]. To further improve the accuracy of registering intra-subject datasets across large lung volume changes, hybrid methods have been proposed to utilize both anatomical landmark and intensity information [30] [25] [49] [19]. The correspondences between anatomical points, such as the bifurcation points of airway and/or vascular trees and vertebra, were manually designated by experts. Some recently published work has shown that it is possible to generate large numbers of corresponding landmarks (more than 1,000) from a pair of lung CT datasets with semi-automatic tools [28] [10]. Although these tools were developed to accelerate the process, designating the correspondences between the points remains a labor-intensive task.

1.2. Our contributions. In this paper, we apply mass transport-based methods to register two large sets of landmark data acquired from a pair of lung CT images (acquired during breath-holds) and establish their correspondence. These methods are tested on the lung CT datasets of several human subjects measured at the total lung capacity (TLC) and functional residual capacity (FRC). See Fig. 1.1.

To date, there is no universal approach that can match point-sets under large deformations due to the diverse nature of transforms. The effectiveness of matching algorithms generally depends on the transform assumptions. In this work, points sampled from one elastic object are transported by some vector field of a small curl.²

¹Computational time varies from several minutes to several hours depending on the input image size and the implementation method (sampling points, GPU, multi-threading).

²From the Helmholtz decomposition, a vector field on $\mathbb{R}^d$ can be decomposed as a sum of a divergence-free vector field and a curl-free vector field. In addition, according to Brenier's polar factorization theorem ([7]; Theorem 3.8 in [44]), an L2 vector-valued mapping can be expressed as a composition of the gradient of a convex function and a measure-preserving mapping.

Fig. 1.1. Left: 3D CT images of lungs at (a) TLC (total lung capacity) and (b) FRC (functional residual capacity). Branch points are marked by green dots. The lungs, the airway tree, and the vessel tree are marked in cyan, red, and purple, respectively. Right: Two point-sets selected from the CT images. The unit of the coordinates is mm. The z direction is along the lung height, i.e., a small z corresponds to the apex and a large z corresponds to the base. TLC is shown by markers × and FRC is shown by markers •.

The nearly perfect match results show the effectiveness of the mass transport methods in elastic deformation problems. The performance of the methods is further evaluated against other existing point-set registration methods.

The second contribution of this paper is a theoretical analysis of mass transport-based models, which explains their lung matching performance. To this end, we will establish one relation between the point cardinality and the maximum curl $\omega_{\max}$ (Def. 2.1), which indicates one of the fundamental differences between feature-based methods and intensity-based methods. Roughly speaking, under elastic deformation with a nonzero curl, perfect matches can occur if the point cardinality does not exceed some upper bound. On the other hand, the linear elastic theory suggests the occurrence of large curls in the lung periphery. Together, perfect matches are likely to occur at those points at lower generations.

Moreover, we will highlight several correspondence invariants in the L2 mass transport cost. Due to the invariants, the L2 mass transport cost yields a better performance than other mass transport costs, as shown in the experiments. In addition, we will demonstrate the cyclical monotonicity inherent in the HD model. Due to this intimate relationship with mass transport problems, the HD model can be viewed as a mass transport model in which the outlier effect is alleviated. The theoretical analysis and experimental results suggest that the mass transport method is a suitable tool for handling point-set registration problems subject to elastic deformation. To the best of our knowledge, this is the first study revealing the outstanding matching performance of mass transport methods on medical images.

This paper is organized as follows: In sections 2 and 3, we provide several theoretical studies on the robust point matching model, including the curl-cardinality relation, the optimal criteria and the cyclical monotone property. In section 4, we present various matching experiments on lung branch points.


2. Mass transport-based point-set matching models. In this paper, let $\delta$ be the Dirac delta function and $\|\cdot\|_2 = \|\cdot\|$ be the 2-norm. We denote the trace and the transpose of a matrix $X$ by $\mathrm{Tr}(X)$ and $X^\top$, respectively.

2.1. Mass transport problems and point-set matching problems. One central problem in elasticity is related to the determination of the deformation T on a bounded open connected subset Ω of R³ subjected to some applied force. The deformation T must be injective and orientation-preserving (the deformation gradient det(∇T ) > 0) to be physically acceptable.

The main focus of this paper is the inverse problem: determining the correspondence between two unlabeled point-sets $\{x_i\}_{i=1}^n$, $\{y_i\}_{i=1}^n$ sampled from $\Omega$ and $T(\Omega)$, respectively. Here, the correspondence is described by a permutation $\tau$ on $\{1, \dots, n\}$ such that $y_i = T(x_{\tau(i)})$. To proceed, we require a stronger assumption on transforms: $T$ is twice continuously differentiable and has the Helmholtz decomposition $T(x) = \nabla\phi + \nabla\times\psi$, where $\phi$ is strongly convex and $\nabla\cdot\psi = 0$. Note that $\phi$, $\psi$ can be determined by solving Poisson's equations (see pages 238-242 in [37]):

$$\nabla\cdot T = \nabla^2\phi, \qquad \nabla\times T = -\nabla^2\psi.$$
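The two Poisson equations follow from standard vector calculus identities. The short derivation below is a sketch that uses only the decomposition $T = \nabla\phi + \nabla\times\psi$ and the gauge condition $\nabla\cdot\psi = 0$ stated above; sufficient smoothness of $\phi$ and $\psi$ is assumed.

```latex
\begin{align*}
\nabla\cdot T &= \nabla\cdot\nabla\phi + \nabla\cdot(\nabla\times\psi)
              = \nabla^2\phi
              && \text{(the divergence of a curl vanishes)},\\
\nabla\times T &= \nabla\times\nabla\phi + \nabla\times(\nabla\times\psi)
               = \nabla(\nabla\cdot\psi) - \nabla^2\psi
               = -\nabla^2\psi
              && \text{(the curl of a gradient vanishes and } \nabla\cdot\psi = 0\text{)}.
\end{align*}
```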

Suppose that the corresponding point-pairs $\{x_i, y_{\tau(i)}\}$ are nearby. One estimation of $\tau$ is implemented by solving the minimization problem:

$$\min_{\tau} \sum_{i=1}^{n} \|x_i - y_{\tau(i)}\|^{\alpha}, \qquad \alpha \ge 1. \tag{2.1}$$

This is a combinatorial optimization problem, because n! possibilities must be evaluated. This difficulty can be alleviated if we instead consider a relaxed problem,

$$\min_{\mu_{i,j}} \sum_{i,j=1}^{n} \|x_i - y_j\|^{\alpha}\,\mu_{i,j}, \tag{2.2}$$

subject to the unit mass constraint $\sum_{i=1}^{n} \mu_{i,j} = 1 = \sum_{j=1}^{n} \mu_{i,j}$, $\mu_{i,j} \ge 0$. This problem is known as the $L^\alpha$ Monge-Kantorovich mass transport problem. Here, the original permutation $\tau$ is relaxed to a correspondence matrix characterized by the doubly stochastic matrix $\{\mu_{i,j} : i, j = 1, \dots, n\}$ or the measure $\mu := \sum_{i,j=1}^{n} \mu_{i,j}\,\delta(x - x_i, y - y_j)$. More precisely, $\tau(i) = j$ if $\mu_{i,j} = 1$.

The relaxed problem in (2.2) is a convex (in fact, linear) minimization problem, which permits an optimal permutation matrix by Birkhoff's theorem and can be solved by interior point methods [6] or primal-dual algorithms [21] (see also chapter 4 in [8]).

Finally, we say that a perfect match occurs if the underlying correspondence between two point-sets coincides with a minimizer $\{\mu_{i,j}\}_{i,j=1}^{n}$ of Eq. (2.2).
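As a small illustration of Eq. (2.1) and of the notion of a perfect match, the sketch below searches all $n!$ permutations by brute force. It is not the interior point or primal-dual solver cited above: the helper names `matching_cost` and `best_permutation` and the toy point-sets are hypothetical, and exhaustive search is only feasible for very small $n$.

```python
from itertools import permutations
from math import dist


def matching_cost(xs, ys, tau, alpha=2.0):
    """Cost of Eq. (2.1): sum_i ||x_i - y_tau(i)||^alpha."""
    return sum(dist(xs[i], ys[tau[i]]) ** alpha for i in range(len(xs)))


def best_permutation(xs, ys, alpha=2.0):
    """Exhaustive search over all n! permutations; feasible only for small n."""
    n = len(xs)
    return min(permutations(range(n)),
               key=lambda tau: matching_cost(xs, ys, tau, alpha))


# Hypothetical toy data: y_i = 1.5 * x_i + t, i.e. the gradient of a convex
# function, so the underlying pairing x_i -> y_i should be recovered.
xs = [(0.0, 0.0), (1.0, 0.2), (0.3, 1.1), (2.0, 1.5), (1.2, 2.3)]
t = (0.4, -0.3)
ys = [(1.5 * x + t[0], 1.5 * y + t[1]) for x, y in xs]

tau = best_permutation(xs, ys)
print(tau)
```

Because a minimizer of the relaxed problem (2.2) can be chosen as a permutation matrix, the brute-force minimizer over permutations is also a minimizer of (2.2) in this toy setting.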

2.2. Two specialities in the L2 cost. In this paper, the correspondence estimation method based on Eq. (2.2) with $\alpha = 2$ is referred to as the MK method. Here we emphasize two specialities of the L2 mass transport cost.

The first speciality is cyclical monotonicity. A nonempty subset $\{(x_i, y_i)\}_{i=1}^{n}$ in $\mathbb{R}^d$, $d \ge 1$, is said to be cyclically monotone (p. 79 [44]) if for all $m \ge 2$ and for all subsets $\{(x_i, y_i)\}_{i=1}^{m}$,

$$\sum_{i=1}^{m} \|x_i - y_i\|^2 \le \sum_{i=1}^{m} \|x_i - y_{i-1}\|^2, \quad \text{with the convention } y_0 = y_m.$$

In this context, $\mu = \sum_{i=1}^{n} \delta(x - x_i, y - y_i)$ is optimal in the L2 mass transport problem when the correspondence $\{(x_i, y_i)\}_{i=1}^{n}$ is cyclically monotone. Likewise, a permutation $\tau$ corresponding to the permutation matrix $\mu$ is optimal when $\{(x_i, y_{\tau(i)})\}_{i=1}^{n}$ is cyclically monotone. One characterization of the optimality condition is that if the support of the optimal correspondence $\mu$ is cyclically monotone, then $\mu$ is supported in the sub-differential of a proper lower semi-continuous convex function $\phi$ (Theorem 2.27 [44]), i.e., $y_i \in \partial\phi(x_i)$, where $\partial$ refers to sub-gradients [33].³
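The definition of cyclical monotonicity can be checked directly on small examples. The following sketch enumerates every ordered cycle of distinct pairs and tests the inequality above; the function name `is_cyclically_monotone`, the tolerance and the two one-dimensional toy sets are assumptions made for illustration, and the enumeration is exponential in the number of pairs.

```python
from itertools import permutations


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def is_cyclically_monotone(pairs, tol=1e-12):
    """Check sum ||x_i - y_i||^2 <= sum ||x_i - y_{i-1}||^2 on every cycle."""
    n = len(pairs)
    for m in range(2, n + 1):
        for idx in permutations(range(n), m):
            direct = sum(sq_dist(pairs[i][0], pairs[i][1]) for i in idx)
            # idx[k - 1] wraps around for k = 0, matching the convention y_0 = y_m
            shifted = sum(sq_dist(pairs[idx[k]][0], pairs[idx[k - 1]][1])
                          for k in range(m))
            if direct > shifted + tol:
                return False
    return True


# Hypothetical 1D examples: an order-preserving pairing and a crossed pairing.
monotone = [((0.0,), (0.5,)), ((1.0,), (2.0,)), ((3.0,), (2.5,))]
crossed = [((0.0,), (2.0,)), ((1.0,), (0.5,)), ((3.0,), (2.5,))]

print(is_cyclically_monotone(monotone), is_cyclically_monotone(crossed))
```

The first set preserves the spatial ordering of the points and passes the check; the second swaps two targets and fails it on the two-cycle formed by the first two pairs.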

Consider a transform $T : \mathbb{R}^d \to \mathbb{R}^d$ between two point-sets $\{x_i\}_{i=1}^{n}$, $\{y_i\}_{i=1}^{n}$ in $\mathbb{R}^d$ with $y_i = T(x_i)$. When $T$ happens to be the gradient of some convex function, the correspondence can be recovered correctly by solving mass transport problems.

The special classes of the transforms consist of scalings, translations, positive definite affine transforms (T (x) := Ax + t, where A is positive definite) and other curl-free maps. Note that when φ is strongly convex and differentiable, then the Hessian of φ is positive definite, which implies that ∇φ is orientation preserving and injective.

The second speciality of the L2 cost is the correspondence invariant under one additional transform $S$. Here, we discuss two correspondence invariants: (i) between $\{S(x_i)\}_{i=1}^{n}$ and $\{S(y_i)\}_{i=1}^{n}$ and (ii) between $\{S(x_i)\}_{i=1}^{n}$ and $\{y_i\}_{i=1}^{n}$. For one-dimensional point-sets, the cyclically monotone correspondence $\mu$ is the monotone rearrangement (the spatial ordering of points is preserved), page 75 [44]. Thus, the cyclical monotonicity of $\{(x_i, y_i)\}_{i=1}^{n}$ implies the cyclical monotonicity of $\{(S(x_i), S(y_i))\}_{i=1}^{n}$ and $\{(S(x_i), y_i)\}_{i=1}^{n}$, if $S$ is a scalar increasing function. We have both invariants (i) and (ii).

In $\mathbb{R}^d$, $d > 1$, a cyclically monotone correspondence $\{(x_i, y_i)\}_{i=1}^{n}$ generally does not imply the cyclical monotonicity of $\{(S(x_i), S(y_i))\}_{i=1}^{n}$ or $\{(S(x_i), y_i)\}_{i=1}^{n}$.

Instead, for type (i), the correspondence in the $L^\alpha$ cost is invariant under rigid motions and scalings, $S(x) = Qx + t$, $S(x) = ax + t$, where $a$ is a positive scalar, $Q$ is an orthogonal matrix and $t \in \mathbb{R}^d$. In general, the correspondence in the mass transport cost $\|x - y\|^\alpha$ is not invariant under affine transforms.⁴ With regard to invariant (ii), we will first show that the L2 cost possesses one additional (forward-backward) invariant:

Proposition 2.1 (Forward-backward). Suppose that a matrix $\mu$ minimizes the L2 mass transport cost between $\{x_i\}_{i=1}^{n}$ and $\{y_i\}_{i=1}^{n}$. Let $S(x) = Ax + t$ be an affine transform with a nonsingular symmetric matrix $A$ and a vector $t$. Then, the matrix $\mu$ also minimizes the L2 mass transport cost between $\{S(x_i)\}$ and $\{S^{-1}(y_j)\}$:

$$\arg\min_{\mu} \sum_{i,j=1}^{n} \mu_{i,j}\|x_i - y_j\|^2 = \arg\min_{\mu} \sum_{i,j=1}^{n} \mu_{i,j}\|(Ax_i + t) - A^{-1}(y_j - t)\|^2.$$

In particular, the result holds for positive definite matrices A.

³Regard two point-sets as finite realizations of two random variables. According to Theorem 2.32 [44] or [27], for two probability measures $\mu$, $\nu$ with $\mu$ absolutely continuous with respect to the Lebesgue measure, there exists a unique measurable map $T$ such that the push-forward measure $T_{\#}\mu = \nu$ and $T = \nabla\phi$ for some convex function $\phi$.

⁴Here is one example. Let $\{x_i\}_{i=1}^{4} = \{(1, 1), (0, 1), (0, 0), (-1, 0)\}$ and $\{y_i\}_{i=1}^{4} = \{(-1, 1), (0, 1), (0, 0), (1, 0)\}$. Consider two affinely transformed point-sets $\{Ax_i\}$, $\{Ay_i\}$, where $A$ is a diagonal matrix with diagonal entries $[a, 1]$. For $a \in (0, 1)$ and close to zero, $x_1$ is matched with $y_2$ (view $\{x_i\}_{i=1}^{4}$ and $\{y_i\}_{i=1}^{4}$ as $\{x_1, x_2\} \cup \{x_3, x_4\}$ and $\{y_1, y_2\} \cup \{y_3, y_4\}$). However, $x_1$ is matched with $y_4$ for $a$ sufficiently greater than 1 (view $\{x_i\}_{i=1}^{4}$ and $\{y_i\}_{i=1}^{4}$ as $\{x_1\} \cup \{x_2, x_3\} \cup \{x_4\}$ and $\{y_1\} \cup \{y_2, y_3\} \cup \{y_4\}$).

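Proof. Because $\sum_{i=1}^{n} \mu_{i,j} = 1 = \sum_{j=1}^{n} \mu_{i,j}$, any minimizer $\mu_{i,j}$ of the cost

$$\sum_{i,j} \mu_{i,j}\|(Ax_i + t) - A^{-1}(y_j - t)\|^2 = \sum_{i,j} (-2)\,\mu_{i,j}\, y_j^\top x_i + \text{constant terms w.r.t. } \mu_{i,j}$$

also minimizes $\sum_{i,j=1}^{n} \mu_{i,j}\|x_i - y_j\|^2$. Here the symmetry of $A$ gives $(Ax_i)^\top A^{-1} y_j = x_i^\top y_j$, and every term that depends on $i$ alone or on $j$ alone sums to a constant under the unit mass constraint.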
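Proposition 2.1 can be sanity-checked numerically on a small example. The sketch below restricts the minimization to permutation matrices and compares the sets of optimal permutations before and after the forward-backward transform; the helper names, the 2x2 matrix `A`, the shift `t` and the point-sets are all hypothetical choices, with `A` taken symmetric and indefinite to stress that positive definiteness is not needed.

```python
from itertools import permutations


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def optimal_perms(xs, ys, tol=1e-9):
    """All permutations tau minimizing sum_i ||x_i - y_tau(i)||^2 (brute force)."""
    n = len(xs)
    costs = {tau: sum(sq_dist(xs[i], ys[tau[i]]) for i in range(n))
             for tau in permutations(range(n))}
    best = min(costs.values())
    return {tau for tau, c in costs.items() if c <= best + tol}


def affine(A, t, p):
    """S(p) = A p + t for a 2x2 matrix A."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + t[0],
            A[1][0] * p[0] + A[1][1] * p[1] + t[1])


def affine_inverse(A, t, p):
    """S^{-1}(p) = A^{-1} (p - t) for a nonsingular 2x2 matrix A."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    q = (p[0] - t[0], p[1] - t[1])
    return ((A[1][1] * q[0] - A[0][1] * q[1]) / det,
            (-A[1][0] * q[0] + A[0][0] * q[1]) / det)


# Hypothetical data; A is symmetric and nonsingular but not positive definite.
xs = [(0.0, 0.0), (1.0, 0.3), (0.2, 1.4), (2.1, 0.9), (1.5, 2.2)]
ys = [(0.9, 1.8), (0.1, 0.2), (2.4, 1.1), (1.2, 0.1), (0.3, 1.2)]
A = [[2.0, 0.5], [0.5, -1.0]]
t = (0.7, -0.4)

before = optimal_perms(xs, ys)
after = optimal_perms([affine(A, t, x) for x in xs],
                      [affine_inverse(A, t, y) for y in ys])
print(before == after)
```

The two costs differ only by a quantity that is the same for every permutation, so the sets of minimizers coincide even though the individual cost values change.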

Along with the first invariant, the optimal correspondence $\{(x_i, y_i)\}_{i=1}^{n}$ implies the optimal correspondence $\{(ax_i + t, y_i)\}_{i=1}^{n}$ with a positive scalar $a$, which is invariant (ii). In practical applications, this invariant of the L2 cost eliminates the difficulty of estimating the parameters $a$, $t$.

Generally speaking, the match error of the mass transport approach is caused by two factors, nonzero curls and outliers. We first examine the curl effect on the occurrence of mismatch. A discussion regarding outliers will be presented in the following section.

According to Helmholtz decompositions, a three-dimensional smooth vector field can be formulated as a sum of a gradient function and a curl function. From the viewpoint of mass transport methods, point correspondence can be estimated correctly if the transform is the gradient of some convex function. However, these special transforms form a very small class, and they rarely occur in practical applications.

Consider any point-set with finite cardinality $n$ sampled from $\Omega$. The following analysis shows that if the curl magnitude $\omega_{\max}$ (Def. 2.1) of the transform $T$ is less than some upper bound $C/n$, then a perfect match can still be obtained, where $C$ is a positive constant related to the constant in the isoperimetric inequality [9]. Empirical experiments show that the upper bound can be largely improved if the point-set is scattered uniformly over some region of a high-dimensional space rather than being concentrated on a circle.

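Definition 2.1. For each $d \times d$ matrix $B$, let $\Lambda(B; k)$ be the $k$-th largest singular value of $B$. Let $T$ be a transform on $\mathbb{R}^d$, $d \ge 2$. Let $T_S$ and $T_A$ be the symmetric and asymmetric (skew-symmetric) parts of $\nabla T$ at each $x$ in the domain $\Omega$, i.e., $\nabla T(x) = T_S(x) + T_A(x)$. Define the maximum curl $\omega_{\max}$ of $T$ on $\Omega$ as

$$\omega_{\max}(T; \Omega) = 2\tan^{-1}\left(\frac{\max_x \Lambda(T_A(x); 1)}{\min_x \Lambda(T_S(x); d)}\right).$$

In the case where $\Omega \subset \mathbb{R}^3$ and $T = \nabla\phi + \nabla\times\psi$ with $\phi$ strongly convex, we obtain $\Lambda(T_A, 2) = 0$ and $\Lambda(T_A, 1) = -\Lambda(T_A, 3) = 2^{-1}\|\nabla\times T\| = 2^{-1}\|\nabla^2\psi\|_2$. Thus, $\tan(\omega_{\max}/2)$ is proportional to the maximum curl, $\max_x \|\nabla\times T(x)\|$.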

To gain a better understanding of $\omega_{\max}$, we examine a simple rotation example first. This example illustrates the upper bound of $\omega$ for perfect matches. Consider one point-set consisting of $n \ge 2$ points $\{x_i \in \mathbb{R}^2 : i = 1, \dots, n\}$ on a circle (centered at the origin) with radius $r$ and polar angles $\{\theta_i = 2i\pi/n : i = 1, \dots, n\}$, and another point-set $\{y_i : i = 1, \dots, n\}$ on the same circle with polar angles $\theta_i + \omega$. Use the convention $x_{n+1} = x_1$ and $x_0 = x_n$. The perfect match condition is forfeited under $\omega$ if

$$\sum_{i=1}^{n} x_i \cdot y_i = n r^2 \cos\omega \le \max\left(\sum_{i=1}^{n} x_{i+1} \cdot y_i,\ \sum_{i=1}^{n} x_{i-1} \cdot y_i\right),$$

which implies

$$\tan|\omega| \ge \frac{1 - \cos(2\pi/n)}{\sin(2\pi/n)} = \tan(\pi/n),$$

i.e., $|\omega| \ge \pi/n$. One can easily verify that $\Lambda(T_A; 1)/\Lambda(T_S; 2) = \tan|\omega|$ and that $\omega_{\max}$ in Def. 2.1 is exactly $2|\omega|$. Hence, a perfect match can be obtained if the rotation angle $\omega$ lies in $(\omega_-, \omega_+) := (-\pi/n, \pi/n)$ or $\omega_{\max} \le 2\pi/n$. Here, we call the range $(\omega_-, \omega_+)$ the perfect match range. Note that this upper bound (which does not depend on the spatial size $2r$) can be regarded as the angle resolution of the point-set (the average in-between angle of $n$ points $\{x_i\}_{i=1}^{n}$ evenly spaced over the angular range $2\pi$).
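The rotation example can be reproduced with a brute-force check. The sketch below places $n = 6$ points on a unit circle, rotates them by an angle just inside and just outside $\pi/n$, and tests whether the identity pairing still maximizes $\sum_i x_i \cdot y_{\tau(i)}$, which is equivalent to minimizing the L2 cost; the function names and the factors 0.9 and 1.1 are illustrative assumptions.

```python
from itertools import permutations
from math import cos, sin, pi


def polygon(n, r=1.0, shift=0.0):
    """n points on a circle of radius r with polar angles 2*i*pi/n + shift."""
    return [(r * cos(2 * pi * i / n + shift), r * sin(2 * pi * i / n + shift))
            for i in range(1, n + 1)]


def identity_is_optimal(xs, ys, tol=1e-12):
    """True if pairing x_i with y_i maximizes sum_i x_i . y_tau(i)."""
    n = len(xs)

    def score(tau):
        return sum(xs[i][0] * ys[tau[i]][0] + xs[i][1] * ys[tau[i]][1]
                   for i in range(n))

    reference = score(tuple(range(n)))
    return all(score(tau) <= reference + tol for tau in permutations(range(n)))


n = 6
inside = identity_is_optimal(polygon(n), polygon(n, shift=0.9 * pi / n))
outside = identity_is_optimal(polygon(n), polygon(n, shift=1.1 * pi / n))
print(inside, outside)
```

For the smaller angle each $y_i$ is still the closest rotated point to $x_i$, so the identity pairing is optimal; for the larger angle the cyclic shift $x_{i+1} \leftrightarrow y_i$ gives a strictly larger inner-product sum.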

Next, we will show that a perfect match can be obtained for any set of $n$ points in $\mathbb{R}^d$ if the rotation has $\omega_{\max} \le 2\pi/n$. Consider a point-set $\{x_i\}_{i=1}^{n}$ in $\mathbb{R}^d$ and generate another point-set $\{y_i\}_{i=1}^{n}$ by rotating $\{x_i\}_{i=1}^{n}$ by angle $\omega$. A rotation can be described by an orthogonal matrix $R$, $T(x) = Rx$. Decompose $\nabla T = R = R_S + R_A$, where $R_S$, $R_A$ are the symmetric and asymmetric parts of $R$. The magnitude of the rotation can then be measured by the trace $\mathrm{Tr}(R_A R_A^\top)$ or the largest singular value $\Lambda(R_A, 1)$.

Proposition 2.2 (Curl-Cardinality). Suppose that transforms $T$ are rotations.

(i) A perfect match can be obtained for a set of $n$ points in $\mathbb{R}^d$ if $\omega_{\max}(T, \mathbb{R}^d) \le 2\pi/n$.

(ii) Among all of the possible point-sets consisting of $n$ points, the point-set that consists of the vertices of the $n$-sided regular polygon has the smallest perfect match range $(\omega_-, \omega_+) := (-\pi/n, \pi/n)$.

To prove this proposition, we need the following lemma.

Lemma 2.2. Let $A$, $B$ be asymmetric $d \times d$ matrices, where $A$ is of the form $vw^\top - wv^\top$, with $v, w \in \mathbb{R}^d$. Then

$$\max_{A} \mathrm{Tr}(BA) = \mathrm{Tr}(2AA^\top)^{1/2}\,\Lambda(B; 1).$$

The maximum is obtained only when $v$, $w$ are singular vectors of $B$ corresponding to $\Lambda(B; 1)$.

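Proof. According to the matrix theory (page 107), $B$ can be expressed as

$$\sum_{j=1}^{k} \beta_j\left(u_{2j-1}u_{2j}^\top - u_{2j}u_{2j-1}^\top\right),$$

where $\beta_1 \ge \beta_2 \ge \dots \ge \beta_k > 0$ are singular values and $\{u_j\}_{j=1}^{2k}$ are orthonormal singular vectors of $B$. Observe that for $v, w \in \mathbb{R}^d$,

$$\mathrm{Tr}\left(B(wv^\top - vw^\top)\right) = 2\sum_{i=1}^{k} \beta_i\left((v \cdot u_{2i-1})(w \cdot u_{2i}) - (w \cdot u_{2i-1})(v \cdot u_{2i})\right) \le 2\beta_1\left(\|v\|^2\|w\|^2 - (v \cdot w)^2\right)^{1/2} = \beta_1\,\mathrm{Tr}(2AA^\top)^{1/2},$$

where $A := vw^\top - wv^\top$. The equality holds when $v$, $w$ lie in the subspace spanned by $u_1$, $u_2$.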
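The bound in the proof can be checked numerically in the special case $d = 3$, where every skew-symmetric matrix acts as $Bz = b \times z$ for some vector $b$ and its largest singular value is $\|b\|$. The sketch below is an illustration under that assumption only; the function names and the random test vectors are hypothetical.

```python
import random
from math import sqrt


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def trace_term(b, v, w):
    """Tr(B (w v^T - v w^T)) for the 3x3 skew matrix B defined by B z = b x z."""
    B = [[0.0, -b[2], b[1]],
         [b[2], 0.0, -b[0]],
         [-b[1], b[0], 0.0]]
    M = [[w[i] * v[j] - v[i] * w[j] for j in range(3)] for i in range(3)]
    return sum(B[i][k] * M[k][i] for i in range(3) for k in range(3))


def lemma_bound(b, v, w):
    """2 * beta_1 * sqrt(|v|^2 |w|^2 - (v.w)^2), with beta_1 = |b| in 3D."""
    gram = dot(v, v) * dot(w, w) - dot(v, w) ** 2
    return 2.0 * sqrt(dot(b, b)) * sqrt(max(gram, 0.0))


rng = random.Random(0)


def rand_vec():
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(3))


trials = [(rand_vec(), rand_vec(), rand_vec()) for _ in range(1000)]
bound_holds = all(trace_term(b, v, w) <= lemma_bound(b, v, w) + 1e-9
                  for b, v, w in trials)

# Equality case: v and w span the plane orthogonal to b, i.e. the plane of
# the singular vectors of B that belong to its largest singular value.
b0, v0, w0 = (0.0, 0.0, 2.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)
gap = lemma_bound(b0, v0, w0) - trace_term(b0, v0, w0)
print(bound_holds, gap)
```

In the equality case the trace equals $2\|b\|\,\|v \times w\|$, which is the value $\beta_1\,\mathrm{Tr}(2AA^\top)^{1/2}$ appearing in the lemma.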

Proof. [Proof of the Proposition] The case $n = 2$ is obvious. Here, we focus on the case $n \ge 3$. Let $R$ be a rotation matrix with a minimum of $\Lambda(R_A; 1)$ that leads to the occurrence of a mismatch. Then, there exist some $m \in \mathbb{N}$, $m \le n$, and some relabeling on $\{Rx_i\}_{i=1}^{n}$, such that

$$\sum_{i=1}^{m} \|Rx_i - x_i\|^2 \ge \sum_{i=1}^{m} \|Rx_i - x_{i+1}\|^2, \quad \text{i.e.,} \quad \sum_{i=1}^{m} x_i^\top R(x_i - x_{i+1}) \le 0.$$

We focus on the case of $m = n$, which is in fact the worst case, i.e.,

$$\sum_{i=1}^{n} \mathrm{Tr}\left(R\left[(x_i - x_{i+1})(x_i - x_{i+1})^\top - (x_{i+1}x_i^\top - x_i x_{i+1}^\top)\right]^\top\right) \le 0. \tag{2.3}$$

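Denote $S := \sum_{i=1}^{n} S_i$, with $S_i := (x_i - x_{i+1})(x_i - x_{i+1})^\top$. Let

$$\hat{A}_i := x_{i+1}x_i^\top - x_i x_{i+1}^\top, \qquad A_i := (x_{i+1} - x_1)(x_i - x_1)^\top - (x_i - x_1)(x_{i+1} - x_1)^\top,$$

and $A := \sum_{i=1}^{n} A_i = \sum_{i=1}^{n} \hat{A}_i$. The problem can be converted into the task of finding a pair of matrices $(R_S, R_A)$ to minimize $\mathrm{Tr}(R_A R_A^\top)$, subject to the following conditions:

$$RR^\top = R_S R_S^\top + R_A R_A^\top = I, \tag{2.4}$$

$$\mathrm{Tr}(RS^\top) \le \mathrm{Tr}(RA^\top), \quad \text{i.e.,} \quad \mathrm{Tr}(R_S S^\top) \le \mathrm{Tr}(R_A A^\top). \tag{2.5}$$

Observe that the optimality of $(R_S, R_A)$ does not depend on the value of $\mathrm{Tr}(S)$ itself. Note that

$$\min_{R_S} \mathrm{Tr}(R_S S^\top) \le \mathrm{Tr}(R_S S^\top) \le \mathrm{Tr}(R_A A^\top) \le \max_{A} \mathrm{Tr}(R_A A^\top).$$

We will establish a lower bound for $\mathrm{Tr}(R_A R_A^\top)$ by examining two subproblems:

$$\min_{R_S} \mathrm{Tr}(R_S S^\top), \quad \text{subject to } \mathrm{Tr}(R_S R_S^\top) \text{ fixed and } \mathrm{Tr}(S) = 1; \tag{2.6}$$

$$\max_{A} \mathrm{Tr}(R_A A^\top), \quad \text{subject to } \mathrm{Tr}(S) = 1. \tag{2.7}$$

For the first subproblem in Eq. (2.6), let $a$ be the smallest eigenvalue of $R_S$. Then,

$$\min_{R_S} \mathrm{Tr}(S R_S^\top) = \min_{R_S} \sum_{i=1}^{n} (x_{i+1} - x_i)^\top R_S (x_{i+1} - x_i) \ge a\,\mathrm{Tr}(S).$$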

For the second subproblem in Eq. (2.7), observe that

$$\max_{A} \mathrm{Tr}(A R_A^\top) = \sum_{i=1}^{n} \max_{A_i} \mathrm{Tr}(A_i R_A^\top) \le \Lambda(R_A, 1) \max_{A_i} \sum_{i=1}^{n} \mathrm{Tr}(2A_i A_i^\top)^{1/2},$$

where equality holds if and only if the set $\{x_i\}_{i=1}^{n}$ is coplanar and lies in the subspace spanned by the singular vectors of $R_A$ corresponding to $\Lambda(R_A, 1)$ (Lemma 2.2). Note that

$$4^{-1}\sum_{i=1}^{n} \mathrm{Tr}(2A_i A_i^\top)^{1/2} = \sum_{i=1}^{n} 2^{-1}\left[\|x_i - x_1\|^2\|x_{i+1} - x_1\|^2 - \left((x_i - x_1) \cdot (x_{i+1} - x_1)\right)^2\right]^{1/2}$$

is the area of a polygon assembled from $n$ triangles with vertices $\{x_i, x_{i+1}, x_1\}_{i=1}^{n}$. Hence,

$$\frac{\max_{A} \mathrm{Tr}(A R_A^\top)}{4\Lambda(R_A)}$$

is bounded above by the maximum area of the closed polygon with given side lengths.

From [23], the maximum area occurs when the polygon is inscribed in a circle. Let Area be the area of the polygon.

Without loss of generality, let $R_S$ be a diagonal matrix with diagonal entries sorted in increasing order.⁵ Because the set $\{x_i\}_{i=1}^{n}$ is coplanar, we may assume that $R$ is a two-dimensional rotation matrix with $[R(i, j)]_{i,j=1}^{2} = [\cos\theta, \sin\theta; -\sin\theta, \cos\theta]$. Under this circumstance, $a = \cos\theta$. From Eq. (2.5),

$$\frac{a\,\mathrm{Tr}(S)}{4\sqrt{1 - a^2}} = \frac{\mathrm{Tr}(S R_S^\top)}{4\Lambda(R_A)} \le \frac{\mathrm{Tr}(A R_A^\top)}{4\Lambda(R_A)} \le \mathrm{Area}. \tag{2.8}$$

Also, we have $L^2 \le n\,\mathrm{Tr}(S)$ (the Cauchy-Schwarz inequality), where the length $L := \sum_{i=1}^{n} \|x_i - x_{i+1}\|$ reaches a maximum if $\|x_i - x_{i+1}\|$ is constant for each $i$. Together,

$$(\tan|\theta|)^{-1} = \frac{a}{\sqrt{1 - a^2}} \le \frac{4\,\mathrm{Area}}{\mathrm{Tr}(S)} \le \frac{4\,\mathrm{Area}}{L^2/n} \le \frac{n}{\pi},$$

where the isoperimetric inequality $L^2/\mathrm{Area} \ge 4\pi$ (page 33, [9]) is used. In fact, the isoperimetric inequality for polygons [23], $L^2/\mathrm{Area} \ge 4n\tan(\pi/n)$, shows that

$$\tan|\theta| \ge \tan(\pi/n), \quad \text{i.e.,} \quad |\theta| \ge \pi/n.$$

According to the previous example, the lower bound can be reached by choosing {xi}ⁿ_i=1 as vertices of a regular polygon, which completes the proof.
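The polygonal isoperimetric inequality used in the last step, $L^2/\mathrm{Area} \ge 4n\tan(\pi/n)$ with equality for the regular polygon, can be illustrated numerically. The sketch below compares the ratio for a regular heptagon and for randomly perturbed star-shaped heptagons; the helper names, the choice $n = 7$ and the radius range are assumptions made for illustration.

```python
import random
from math import cos, sin, tan, pi, hypot


def perimeter(pts):
    n = len(pts)
    return sum(hypot(pts[(i + 1) % n][0] - pts[i][0],
                     pts[(i + 1) % n][1] - pts[i][1]) for i in range(n))


def area(pts):
    """Shoelace formula for the area of a simple polygon."""
    n = len(pts)
    s = sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
            for i in range(n))
    return abs(s) / 2.0


def star_polygon(radii):
    """Vertices at equally spaced angles with the given positive radii.

    Such a polygon is star-shaped about the origin and therefore simple.
    """
    n = len(radii)
    return [(r * cos(2 * pi * i / n), r * sin(2 * pi * i / n))
            for i, r in enumerate(radii)]


def ratio(pts):
    return perimeter(pts) ** 2 / area(pts)


n = 7
lower_bound = 4 * n * tan(pi / n)
regular = ratio(star_polygon([1.0] * n))

rng = random.Random(1)
perturbed = [ratio(star_polygon([rng.uniform(0.5, 1.5) for _ in range(n)]))
             for _ in range(200)]
print(lower_bound, regular, min(perturbed))
```

The regular polygon attains the bound up to rounding, the perturbed polygons stay above it, and the bound itself exceeds the continuous constant $4\pi$, which is why the polygonal inequality sharpens $n/\pi$ to $1/\tan(\pi/n)$ in the proof.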

Remark 2.3. For a general transform $T$ on $\mathbb{R}^d$, the curl-cardinality relation still holds with the constant $C$ not necessarily $2\pi$. For simplicity, we only consider $d = 3$ here and present the proof in the following. The occurrence of mismatches implies the existence of $n \in \mathbb{N}$ and some subset $\{x_i\}_{i=1}^{n}$, with the convention $x_{n+1} := x_1$, such that

$$\sum_{i=1}^{n} (x_i - x_{i+1}) \cdot T(x_i) = \sum_{i=1}^{n} x_{i+1} \cdot \left(T(x_{i+1}) - T(x_i)\right) \le 0 \quad \text{for some } n.$$

Then

$$\sum_{i=1}^{n} (x_i - x_{i+1}) \cdot \left(T(x_i) - T(x_{i+1})\right) \le -\sum_{i=1}^{n} (x_i - x_{i+1}) \cdot \left(T(x_i) + T(x_{i+1})\right). \tag{2.9}$$

The left-hand side of Eq. (2.9) has a lower bound,

$$\sum_{i=1}^{n} (x_{i+1} - x_i)^\top T_S(\xi_i)(x_{i+1} - x_i) \ge n^{-1}\left(\min_x \Lambda(T_S(x))\right)\left(\sum_{i=1}^{n} |x_{i+1} - x_i|\right)^2, \tag{2.10}$$

which is an approximation of $n^{-1}\left(\min_x \Lambda(T_S(x))\right)L(\partial\Omega)^2$ (the existence of $\xi_i$ is ensured by the mean value theorem). On the other hand, the right-hand side of Eq. (2.9) can be regarded as the Riemann sum of one line integral,

$$\sum_{i=1}^{n} (x_i - x_{i+1}) \cdot \left(T(x_i) + T(x_{i+1})\right) \approx 2\int_{\partial\Omega} T(x) \cdot dx = 2\int_{\Omega} (\nabla\times T(x)) \cdot n\, da,$$

which is bounded above by $2\left(\max_{x\in\Omega}|\nabla\times T(x)|\right)\mathrm{Area}$, where $da$ is the area element. If the above two approximation errors can be neglected, then together with the inequality $L(\partial\Omega)^2 \ge 4\pi\,\mathrm{Area}(\Omega)$, we have

$$4\pi n^{-1}\mathrm{Area}(\Omega) \le n^{-1}L(\partial\Omega)^2 \le 2\,\frac{\max_{x\in\Omega}|\nabla\times T(x)|}{\min_x \Lambda(T_S(x); 3)}\,\mathrm{Area}(\Omega).$$

Hence,

$$\frac{\max_{x\in\Omega}|\nabla\times T(x)|}{\min_x \Lambda(T_S(x); 3)} \ge \frac{2\pi}{n},$$

which verifies the curl-cardinality relation $\omega_{\max} \le C/n$ for perfect matches, where $C$ might be different from $2\pi$ due to the approximation difference of the Riemann sums.

⁵Otherwise, let $R_S = UDU^\top$ be the eigenvalue decomposition of the symmetric matrix $R_S$ and replace the point-set by the point-set $\{U^\top x_i\}_{i=1}^{n}$. The trace $\mathrm{Tr}(S)$ remains unchanged for the rotated point-set.

Corollary 2.4. Here, we would like to mention one special case: the gradient of a general transform $T$ can be decomposed as the sum $\nabla T = T_S + T_A$ on $\mathbb{R}^d$, $d \ge 2$, where the symmetric part $T_S$ is constant, positive definite and can be factored as $T_S^{1/2}T_S^{1/2}$. Define $L$ and Area as above. Then perfect matches occur if $\omega_{\max} \le 2\pi/n$, where

$$\omega_{\max} := 2\tan^{-1}\left(\max_x \Lambda\left(T_S^{-1/2}T_A(x)T_S^{-1/2}; 1\right)\right).$$

Proof. Obviously, the left-hand side of Eq. (2.9) has a lower bound:

$$\sum_{i=1}^{n} a\|x_i - x_{i+1}\|^2 \ge aL^2/n, \quad \text{where } a = \Lambda(T_S(\xi_i); 1)$$

and $\xi_i$ is some point between $x_i$ and $x_{i+1}$ (by the mean value theorem). However, when $T_S$ is constant, we can derive a tighter bound. Let $x_i = T_S^{-1/2}y_i$ for $i = 1, \dots, n$. Then the mean value theorem indicates the existence of $\xi_i$ such that the left-hand side becomes

$$\sum_{i=1}^{n} (x_i - x_{i+1}) \cdot \left(T_S(\xi_i)(x_i - x_{i+1})\right) = \sum_{i=1}^{n} \|y_i - y_{i+1}\|^2.$$

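The right-hand side of Eq. (2.9) can be reformulated as

$$\begin{aligned}
\sum_{i=1}^{n} (x_i + x_{i+1}) \cdot \left(T(x_i) - T(x_{i+1})\right)
&= \sum_{i=1}^{n} (x_i - x_1 + x_{i+1} - x_1) \cdot \left(T(x_i) - T(x_{i+1})\right) \\
&= \sum_{i=1}^{n} (\hat{x}_i + \hat{x}_{i+1}) \cdot \left((T_S + T_A(\eta_i))(\hat{x}_i - \hat{x}_{i+1})\right)
 = \sum_{i=1}^{n} \mathrm{Tr}\left(T_A(\eta_i)(\hat{x}_i\hat{x}_{i+1}^\top - \hat{x}_{i+1}\hat{x}_i^\top)\right) \\
&= \sum_{i=1}^{n} \mathrm{Tr}\left(T_S^{-1/2}T_A(\eta_i)T_S^{-1/2}(\hat{y}_i\hat{y}_{i+1}^\top - \hat{y}_{i+1}\hat{y}_i^\top)\right),
\end{aligned}$$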

where $\eta_i$ is some point lying on the segment between $x_i$ and $x_{i+1}$ (by the mean value theorem), and $\hat{x}_i = x_i - x_1$, $\hat{y}_i = y_i - y_1$. Divide the skew polygon with vertices $\{y_i\}_{i=1}^{n}$ into $n - 2$ triangles, each of which has vertices $\{y_1, y_i, y_{i+1}\}_{i=2}^{n-1}$. The right-hand side can be regarded as

$$\sum_{i=2}^{n-1} \mathrm{Tr}\left(T_S^{-1/2}T_A(\eta_i)T_S^{-1/2}A_i\right) \le \sum_{i=1}^{n} \beta_1(\eta_i)\,\mathrm{Tr}(2A_i A_i^\top)^{1/2},$$

where $A_i = \hat{y}_i\hat{y}_{i+1}^\top - \hat{y}_{i+1}\hat{y}_i^\top$ and $\beta_1(\eta_i)$ is the largest singular value of the asymmetric matrix $T_S^{-1/2}T_A(\eta_i)T_S^{-1/2}$. Following this procedure, the same arguments as in the previous proof yield

$$\tan\frac{\omega_{\max}(T; \Omega)}{2} \ge \frac{L^2}{4n\,\mathrm{Area}},$$

where $\tan(\omega_{\max}(T; \Omega)/2)$ is the largest singular value of $T_S^{-1/2}T_A(x)T_S^{-1/2}$ among all $x$ in $\Omega$. Hence, we have the curl-cardinality relation for perfect matches, $\omega_{\max} \le 2\pi/n$, according to the isoperimetric inequality.

The value $\omega_+ - \omega_-$ for rotating a high-dimensional point-set is usually far greater than $2\pi/n$. In fact, from the proof of the proposition, $n$ can be reduced to the maximum length of the disjoint cycles of the corresponding permutation $\tau$. Let us examine the following special cases. In 2D, suppose the points are located at rectangular grids $x_{i,j} = (i/\sqrt{n}, j/\sqrt{n})$ for $i, j = 1, \dots, \sqrt{n} \in \mathbb{N}$. Along the rectangular boundary, we have $4\sqrt{n}$ points. The angle resolution is approximately $2\pi/(4\sqrt{n}) = \pi/(2\sqrt{n})$. Thus, $\omega_+ - \omega_- \approx 1.57\,n^{-1/2}$.⁶ Similar arguments show that $\omega_+ - \omega_- \approx 1.57\,n^{-1/3}$ for the case of 3D rectangular grids.

Here, we present one simulation of the curl-cardinality relation, in which the coefficient is greater than $\pi/2 \approx 1.57$. Consider a point-set randomly generated from the unit square $[0, 1] \times [0, 1]$ uniformly. Rotate the point-set about the z-axis by angle $\omega$. Each perfect match range $(\omega_-, \omega_+)$ is measured under various point cardinalities, $n = 50, \dots, 200$. The result is reported in Table 2.1 and Fig. 2.1. In the figure, the green solid line reveals a linear relationship between $n^{-1/2}$ and $\omega_+ - \omega_-$, where (approximately) $\omega_+ - \omega_- = C_2 n^{-1/2}$, with $C_2 = 2.90$. Similarly, we measure the perfect match range for 3D point-sets (randomly generated from $[0, 1] \times [0, 1] \times [0, 1]$) with different point cardinalities. The result is shown by the red dashed line, which is (approximately) $\omega_+ - \omega_- = C_3 n^{-1/3}$, with $C_3 = 2.86$.

Table 2.1. Perfect match range $\omega_+ - \omega_-$ (entries are in units of 0.01), based on 10 random samples.

n     50          75          100         125          150          175         200
2D    46 ± 4.4    34.8 ± 1.6  28.8 ± 1.6  25.2 ± 1.88  23.3 ± 1.88  21 ± 0.89   19.2 ± 1.4
3D    84.8 ± 6.9  68.8 ± 4.4  62.8 ± 3.4  57.2 ± 3.9   51.6 ± 2.6   49.2 ± 2.4  46.8 ± 1.6

Consequently, under the same magnitude of curls, the difficulty of matching 1000 randomly sampled 3D points with mass transport models is similar to that of matching 100 2D points, and similar to that of matching 40 points on the four sides of a square. Hence, it is not unexpected that perfect matches occur in matching 3D lung point-sets with 1000 points. See the experimental results for details.
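The arithmetic behind this comparison can be reproduced in a few lines. The sketch below evaluates the grid estimate $\pi/(2\sqrt{n})$ and the fitted relations $C_2 n^{-1/2}$ and $C_3 n^{-1/3}$ with the constants reported above; the function names are hypothetical, and the fitted constants are taken from the text rather than re-estimated.

```python
from math import pi, sqrt


def polygon_range(m):
    """Perfect match range width 2*pi/m for m points evenly spaced on a circle."""
    return 2 * pi / m


def grid_range_2d(n):
    """Grid estimate: 4*sqrt(n) boundary points give 2*pi/(4*sqrt(n))."""
    return pi / (2 * sqrt(n))


def fitted_range_2d(n, c2=2.90):
    """Fitted relation C2 * n^(-1/2) for random 2D point-sets."""
    return c2 * n ** -0.5


def fitted_range_3d(n, c3=2.86):
    """Fitted relation C3 * n^(-1/3) for random 3D point-sets."""
    return c3 * n ** (-1.0 / 3.0)


print(fitted_range_3d(1000))  # 1000 random 3D points
print(fitted_range_2d(100))   # 100 random 2D points
print(grid_range_2d(100))     # 10 x 10 grid: 40 boundary points
print(polygon_range(40))      # 40 points on a closed curve
```

With these constants the fitted ranges for 1000 points in 3D and 100 points in 2D are both close to 0.29, and the 10 x 10 grid estimate coincides with the range of a 40-point polygon, which is the sense in which the three matching tasks are comparable.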

2.3. Large curls near the lung periphery. In this subsection, we will provide one explanation for the occurrence of large curls near the boundary of the domain, namely the lung periphery. First, we briefly describe the mechanics of breathing. The

6To see the factor n^1/2, consider a polygon {xi}^m_i=1with constant side lengths kxi− x_i−1k = h for i = 1, . . . , m and x0 = xm. Let us also fix Area (the area of the enclosed region). Since T r(S) = mh², an upper bound for tan |θ| is mh²/4Area (see Eq. (2.8)). Hence, the tightest bound for ωmaxis given by the polygon with the minimum cardinality m.

(13)

0 0.05 0.1 0.15 0.2 0.25 0.3 0

0.2 0.4 0.6 0.8

Fig. 2.1. Robustness of the extra curl on random sampled point-sets. The graph is generated using Table 2.1. The y-axis is ω+− ω− and the x-axis shows n^−1/2(the green solid line) and n^−1/3(the red dashed line) for 2D and 3D, respectively.

repeated inflation and deflation of the lungs are controlled by the respiratory muscles (the diaphragm and the intercostal muscles). During inhalation, the diaphragm and the intercostal muscles contract and create negative pressure (relative to atmospheric pressure) surrounding the lungs. The expansion decreases the pressure in the chest cavity and allows air to flow in, which inflates millions of alveoli. The region around the lungs in which this negative pressure acts is called the pleural space; it is filled with a very thin layer of lubricating fluid that separates the outer surface of the lungs from the inner surface of the rib cage ([1], page 4).

The lungs, which are made of spongy and elastic tissue, are commonly modeled as a linear, isotropic and homogeneous medium [48]. In this situation, the displacement field u(x) := T(x) − x on the domain Ω (lungs) satisfies the Lamé equilibrium equations in the linear elasticity theory:

µ∇²u + ∇((µ + λ)∇ · u) = f,

subject to several boundary conditions on u, where µ > 0 and λ > 0 are called the Lamé constants and the vector function f is the body force.

We will study the spatial distribution of the curl ∇ × T = ∇ × u. According to the classic uniqueness result for the Lamé equations, u is unique up to some rigid body displacement if traction is prescribed over the entire surface. Based on the superposition principle, the solution u can be constructed as the sum of a particular solution of the inhomogeneous equilibrium equations and a solution of the homogeneous equilibrium equations subject to the desired boundary condition.

To proceed, we assume that the body forces are derived from a scalar potential, i.e., f = ∇ξ. This assumption is valid in many cases, e.g., for gravity.

Consider a particular solution of the gradient form u = ∇φ. Then

∇((λ + 2µ)∇²φ − ξ) = 0.

Hence, one particular solution can be obtained by solving Poisson’s equation:

∇²φ = ξ/(2µ + λ).

Note that this particular solution u = ∇φ is curl-free, and thus the body force does not contribute any curl to the displacement field u. (However, a nonzero f does affect the boundary condition for u.)
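For completeness, the substitution behind this particular solution can be written out. This is only a worked restatement of the steps above, assuming constant µ and λ (homogeneous medium) and f = ∇ξ:

```latex
\begin{align*}
\mu\nabla^2(\nabla\phi) + \nabla\big((\mu+\lambda)\,\nabla\cdot\nabla\phi\big)
  &= \nabla\big((\lambda+2\mu)\,\nabla^2\phi\big) = \nabla\xi
  \quad\Longleftrightarrow\quad
  \nabla\big((\lambda+2\mu)\,\nabla^2\phi - \xi\big) = 0, \\
\nabla^2\phi = \frac{\xi}{2\mu+\lambda}
  &\;\Longrightarrow\; (\lambda+2\mu)\,\nabla^2\phi - \xi = 0, \\
\nabla\times u &= \nabla\times\nabla\phi = 0 .
\end{align*}
```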


According to Helmholtz’s theorem, a vector field u on a bounded domain Ω in R³ can be decomposed into a sum, u = ∇φ̃ + ∇ × ψ with ∇ · ψ = 0. Substituting the decomposition into the homogeneous Lamé equations leads to

µ∇ × ∇²ψ + (λ + 2µ)∇(∇²φ̃) = 0. (2.11)

The desired displacement field is the one satisfying the boundary condition.

Note that ∇ × u = −∇²ψ. By applying the divergence and curl operators to Eq. (2.11), we have ∇²∇²φ̃ = 0 and ∇²∇²ψ = 0. Thus, for each unit vector v ∈ R³, v · ∇²ψ is harmonic, which implies that the extreme values of v · ∇²ψ occur at the boundary ∂Ω (Theorem 2.3, page 15, [17]). Hence, regardless of the boundary condition, the maximum magnitude of the curl ∇ × u occurs at the boundary of the elastic object.
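The two biharmonic identities can be checked directly from Eq. (2.11). The sketch below uses only the standard identities ∇ · (∇ × A) = 0, ∇ × ∇g = 0 and ∇ × (∇ × A) = ∇(∇ · A) − ∇²A, together with ∇ · ψ = 0 and constant µ, λ:

```latex
\begin{align*}
\nabla\times u
  &= \nabla\times(\nabla\times\psi)
   = \nabla(\nabla\cdot\psi) - \nabla^2\psi
   = -\nabla^2\psi, \\
\nabla\cdot\text{(2.11)}:\quad
  & (\lambda+2\mu)\,\nabla^2\nabla^2\tilde\phi = 0, \\
\nabla\times\text{(2.11)}:\quad
  & \mu\,\nabla\times\big(\nabla\times\nabla^2\psi\big)
   = \mu\big(\nabla(\nabla^2\,\nabla\cdot\psi) - \nabla^2\nabla^2\psi\big)
   = -\mu\,\nabla^2\nabla^2\psi = 0 .
\end{align*}
```

Since each component of ∇²ψ is then harmonic, the maximum principle applies to v · ∇²ψ for every fixed unit vector v.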

This theoretical result is consistent with our lung experiments: mismatches occur at the branch points of higher generations.⁷

3. Outliers in matching problems. In this section, we impose the same (small curl) assumption on transforms as in the previous section. Hence, the mass transport method yields the correct correspondence in this ideal case. We will study the outlier effect in mass transport models. In our experiments, outliers are points that appear in one point-set but whose correspondences are missing from the other point-set.

From a mathematical viewpoint, because the points are unlabeled, it is impossible to distinguish outliers from the “inliers”. Hence, perfect matches are unattainable when outliers exist. In fact, the matching result can be much worse. Sometimes, a small number of distant outliers can lead to large mismatches in the aforementioned mass transport model (see experiments).⁸
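The effect of distant outliers can be reproduced on a toy one-dimensional example. The sketch below is an illustration under stated assumptions, not the experiment of the paper: the coordinates are made up, and the L2 transport problem between equal-size point-sets is solved by brute force over all permutations (the hypothetical helper `optimal_matching`), which is only feasible for a handful of points.

```python
from itertools import permutations


def optimal_matching(xs, ys):
    """Brute-force minimizer of sum_i |x_i - y_pi(i)|^2 over all permutations pi."""
    best = min(
        permutations(range(len(ys))),
        key=lambda p: sum((x - ys[j]) ** 2 for x, j in zip(xs, p)),
    )
    return list(best)


# Inliers: point i of the first set corresponds to point i of the second set.
inliers_x = [1.0, 2.0, 3.0, 4.0]
inliers_y = [1.1, 2.1, 3.1, 4.1]
clean = optimal_matching(inliers_x, inliers_y)

# Add one outlier to the right of every x_i and one to the left of every y_i.
xs = inliers_x + [10.0]
ys = inliers_y + [-5.0]
noisy = optimal_matching(xs, ys)

# Fraction of inliers that are no longer matched to their true partner.
mismatches = sum(1 for i in range(len(inliers_x)) if noisy[i] != i)
mismatch_rate = mismatches / len(inliers_x)
print(clean, noisy, mismatch_rate)
```

Without outliers the matching is the identity; with the two outliers the optimal map shifts every inlier by one position, so all inliers are mismatched.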

One possible solution is the HD model [11], where correspondences are estimated by maximizing

\[
\sum_{i,j=1}^{m,n} \sqrt{\gamma^+_{i,j}\,\gamma^-_{i,j}}\; \exp\!\left(-\|x_i - y_j\|^2 / 2\sigma^2\right), \tag{3.1}
\]

with respect to nonnegative unknowns $\gamma^+_{i,j}$, $\gamma^-_{i,j}$, subject to the unit mass constraint, $\sum_{j=1}^{n} \gamma^+_{i,j} = 1 = \sum_{i=1}^{m} \gamma^-_{i,j}$ for all i, j. Here, the correspondence is characterized by the two matrices $\Gamma^+ = (\gamma^+_{i,j})$ and $\Gamma^- = (\gamma^-_{i,j})$. This model is an approximation of the mass transport model as σ tends to infinity (B.3 in [11]). With a finite kernel scale σ, this model is robust to distant outliers (see experiments, section 4). Indeed, with some sufficiently large σ, the correspondence determined by the majority rule (Eq. 4.2) is identical to the correspondence generated by the MK mass transport model, and robustness against outliers is obtained from the finite kernel scales. Moreover, the optimal correspondence in the HD model is cyclically monotone (see Remark 3.4). Hence, the HD model can be viewed as a re-weighted mass transport model, where the weight function is related to the spatial distance of each corresponding point-pair.

3.1. Duality. We will examine the optimal condition in the HD model via the duality structure between the maximization problem Eq. (3.1 ) and its dual problem

⁷ Note that T(x) = x + u. Here, we ignore the variation of ω_max contributed by the denominator 1 + Λ(∇²φ̃).

⁸ For example, consider one-dimensional point-sets {x_i}, {y_i} on the x-axis with perfect matches. Add one outlier x to {x_i} and add one outlier y to {y_i} with x_i < x, y_i > y for all i. We obtain a 100% mismatch rate.


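Eq. (3.3). Let

\[
S = \Big\{(\Gamma^+, \Gamma^-) : \Gamma^+, \Gamma^- \text{ are } m \times n \text{ matrices with entries } \gamma^+_{i,j} \ge 0,\ \gamma^-_{i,j} \ge 0 \text{ satisfying } \sum_{j=1}^{n} \gamma^+_{i,j} = \gamma^+_i,\ \sum_{i=1}^{m} \gamma^-_{i,j} = \gamma^-_j \Big\}.
\]

Let

\[
E(\Gamma^+, \Gamma^-) =
\begin{cases}
\displaystyle\sum_{i=1,j=1}^{m,n} 2\sqrt{\gamma^+_{i,j}\,\gamma^-_{i,j}}\; K(x_i, y_j), & \text{if } (\Gamma^+, \Gamma^-) \in S, \\[4pt]
-\infty, & \text{else.}
\end{cases} \tag{3.2}
\]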

Let T = {(φ, ψ) : vectors φ, ψ have positive entries φ_i, ψ_j with φ_i ψ_j ≥ K(x_i, y_j)²}.

Here, the dual problem of max E is presented:

\[
\min_{(\phi,\psi)\in T} J(\phi, \psi), \quad \text{where } J(\phi, \psi) := \sum_{i=1}^{m} \phi_i \gamma^+_i + \sum_{j=1}^{n} \psi_j \gamma^-_j. \tag{3.3}
\]

Indeed, these two problems are connected by the weak duality:

\[
\max_{(\Gamma^+,\Gamma^-)\in S} E(\Gamma^+, \Gamma^-) \;\le\; \min_{(\phi,\psi)\in T} J(\phi, \psi). \tag{3.4}
\]
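One way to see the weak duality is a term-by-term estimate. The following worked sketch assumes (Γ⁺, Γ⁻) ∈ S and (φ, ψ) ∈ T, writes $K_{i,j} := K(x_i, y_j) \ge 0$, and uses only the constraint $\phi_i\psi_j \ge K_{i,j}^2$ and the arithmetic-geometric mean inequality:

```latex
\begin{align*}
E(\Gamma^+,\Gamma^-)
 &= \sum_{i,j} 2\sqrt{\gamma^+_{i,j}\gamma^-_{i,j}}\;K_{i,j}
 \;\le\; \sum_{i,j} 2\sqrt{\big(\phi_i\gamma^+_{i,j}\big)\big(\psi_j\gamma^-_{i,j}\big)} \\
 &\le \sum_{i,j} \big(\phi_i\gamma^+_{i,j} + \psi_j\gamma^-_{i,j}\big)
 = \sum_{i=1}^{m}\phi_i\gamma^+_i + \sum_{j=1}^{n}\psi_j\gamma^-_j
 = J(\phi,\psi).
\end{align*}
```

Equality in the first step requires $\phi_i\psi_j = K_{i,j}^2$ wherever $\gamma^+_{i,j}\gamma^-_{i,j} > 0$, and equality in the second step requires $\phi_i\gamma^+_{i,j} = \psi_j\gamma^-_{i,j}$ for all i, j.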

We say that (Γ⁺, Γ⁻, φ, ψ) is a saddle point if

E(Γ⁺, Γ⁻) = J (φ, ψ). (3.5)

The weak and strong dualities have been studied in [11], and the strong duality can be established by the strong duality theorem (Prop. 5.2.1 [2]). The following proposition summarizes the optimal conditions.

Proposition 3.1. Assume $K_{i,j} > 0$ for all i, j. Then

\[
\max_{(\Gamma^+,\Gamma^-)\in S} E(\Gamma^+, \Gamma^-) = \min_{(\phi,\psi)\in T} J(\phi, \psi).
\]

This result indicates the absence of a duality gap. That is, if the matrices (Γ⁺, Γ⁻) and the vectors (φ, ψ) are optimal, then from Eq. (3.4), the following conditions hold:

1. For all i, j, $\gamma^+_{i,j}\gamma^-_{i,j}(\phi_i\psi_j - K_{i,j}^2) = 0$. Then, $\gamma^+_{i,j} = 0 = \gamma^-_{i,j}$ for each pair (i, j) with $\phi_i\psi_j > K_{i,j}^2$;

2. $\phi_i\gamma^+_{i,j} = \psi_j\gamma^-_{i,j}$ for all i, j.

Conversely, when these two conditions are fulfilled, (Γ⁺, Γ⁻, φ, ψ) is a saddle point.

From Prop. 3.1, the (i, j)-entry $K_{i,j}^2$ should coincide with the corresponding entry $\phi_i\psi_j$ of the product of the two vectors whenever $\gamma^+_{i,j}\gamma^-_{i,j} > 0$. Thus, given a matrix $\{K_{i,j} : i = 1, \dots, m,\ j = 1, \dots, n\}$ with nonzero minors⁹, the correspondence matrices (Γ⁺, Γ⁻) are highly sparse. Here, a simple block coordinate descent method to compute Γ⁺, Γ⁻ is presented.

Algorithm 3.2 (Correspondence estimation [11]).

1. Let σ be some kernel scale. Initialize the matrix Γ⁻ with entries $\gamma^-_{i,j} = 1/n$ and the matrix K with entries $K_{i,j} = \exp(-\|y_j - x_i\|^{\alpha} / 2\sigma^{\alpha})$.

2. Repeat the following iterations until convergence:

\[
\gamma^+_{i,j} \leftarrow \frac{\gamma^-_{i,j} K_{i,j}^2}{\sum_{j=1}^{n} \gamma^-_{i,j} K_{i,j}^2}, \qquad
\gamma^-_{i,j} \leftarrow \frac{\gamma^+_{i,j} K_{i,j}^2}{\sum_{i=1}^{m} \gamma^+_{i,j} K_{i,j}^2}. \tag{3.6}
\]

By Theorem 4.5 in [12], any limit of the sequences is a maximizer independent of the initial positive matrices Γ⁺ and Γ⁻. Empirically, the convergence speed is generally acceptable for point-sets with hundreds of points and small-sized kernel scales.
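A minimal sketch of the iteration in Eq. (3.6) is given below, assuming small point-sets stored as lists of coordinate tuples. The function name `hd_correspondence`, the fixed iteration count (used in place of a convergence test), the toy coordinates, and the row-wise argmax used to read off a matching are all assumptions made for illustration; they are not taken from [11] or from the majority rule of Eq. (4.2).

```python
import math


def hd_correspondence(xs, ys, sigma, alpha=2.0, iters=200):
    """Alternate the two normalised updates of Eq. (3.6) for a fixed number of sweeps."""
    m, n = len(xs), len(ys)
    # Squared kernel entries K_ij^2, K_ij = exp(-||y_j - x_i||^alpha / (2 sigma^alpha)).
    k2 = [[math.exp(-math.dist(x, y) ** alpha / (2.0 * sigma ** alpha)) ** 2
           for y in ys] for x in xs]
    gamma_minus = [[1.0 / n] * n for _ in range(m)]   # step 1 initialisation
    gamma_plus = [[0.0] * n for _ in range(m)]
    for _ in range(iters):
        # gamma+ update: each row i is normalised over j.
        for i in range(m):
            weights = [gamma_minus[i][j] * k2[i][j] for j in range(n)]
            total = sum(weights)
            gamma_plus[i] = [w / total for w in weights]
        # gamma- update: each column j is normalised over i.
        for j in range(n):
            weights = [gamma_plus[i][j] * k2[i][j] for i in range(m)]
            total = sum(weights)
            for i in range(m):
                gamma_minus[i][j] = weights[i] / total
    return gamma_plus, gamma_minus


# Toy data (assumed): ys is xs shifted by a small displacement.
xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
ys = [(0.05, 0.02), (1.05, 0.02), (0.05, 1.02)]
gamma_plus, gamma_minus = hd_correspondence(xs, ys, sigma=0.5)

# Illustrative readout: match x_i to the column with the largest gamma+ entry.
matches = [max(range(len(ys)), key=lambda j: row[j]) for row in gamma_plus]
print(matches)
```

For very small σ the kernel entries may underflow to zero, so a practical implementation would probably work with logarithms; that refinement is omitted here.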

⁹ All square sub-matrices have a nonzero determinant.


3.2. Cyclical monotonicity in the HD model. Rearrange and partition each pair of discrete masses ν^X, ν^Y properly, such that the matching is “bijective”, i.e.,

\[
\nu^+ = \sum_{i=1}^{n} \gamma^+_i\,\delta(x - x_i), \qquad \nu^- = \sum_{i=1}^{n} \gamma^-_i\,\delta(y - y_i).
\]

Then, the maximizer (Γ⁺, Γ⁻) of E can be expressed as two square diagonal matrices with diagonal entries $\{\gamma^+_i\}_{i=1}^{n}$ and $\{\gamma^-_i\}_{i=1}^{n}$, respectively. The next proposition characterizes the optimal bijective matching described by $\{(\gamma^+_i, \gamma^-_i) : i = 1, \dots, n\}$.

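Proposition 3.3. The above bijective matching is optimal if and only if

\[
\frac{K_{i,i} K_{j,j}}{K_{i,j}^2} \;\ge\; \sqrt{\frac{\gamma^+_i \gamma^-_j}{\gamma^+_j \gamma^-_i}} \;\ge\; \frac{K_{j,i}^2}{K_{i,i} K_{j,j}}, \quad \text{for all } i, j. \tag{3.7}
\]

Note that for either $y_i = y_j$ or $x_i = x_j$ we have

\[
\frac{\gamma^-_i}{\gamma^-_j} = \frac{\gamma^+_i K_{i,i}^2}{\gamma^+_j K_{j,i}^2} \quad \text{or} \quad \frac{\gamma^+_i}{\gamma^+_j} = \frac{\gamma^-_i K_{i,i}^2}{\gamma^-_j K_{i,j}^2}, \quad \text{respectively.} \tag{3.8}
\]

Proof. (The only-if part) Let (φ, ψ) be a minimizer of the dual problem J. Then we have $K_{i,i}^2 = \phi_i\psi_i$, $K_{j,j}^2 = \phi_j\psi_j$ and $K_{i,j}^2 \le \phi_i\psi_j$, which implies

\[
\frac{K_{i,i} K_{j,j}}{K_{i,j}^2} \;\ge\; \sqrt{\frac{\phi_i \phi_j \psi_i \psi_j}{\phi_i^2 \psi_j^2}} = \sqrt{\frac{\psi_i}{\phi_i}\,\frac{\phi_j}{\psi_j}} = \sqrt{\frac{\gamma^+_i \gamma^-_j}{\gamma^-_i \gamma^+_j}},
\]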

where we also used γ_i⁺φ_i = γ_i⁻ψ_i. Thus we proved the first inequality. The second inequality is obtained by exchanging i and j.

(The if part) Suppose that Γ⁺, Γ⁻ and $K_{i,j}$ satisfy the inequality. Let

\[
\phi_i = K_{i,i}\sqrt{\gamma^-_i / \gamma^+_i}, \qquad \psi_i = K_{i,i}\sqrt{\gamma^+_i / \gamma^-_i}.
\]

One can then easily verify $\phi_i\gamma^+_i = \psi_i\gamma^-_i$ and $\phi_i\psi_j \ge K_{i,j}^2$. From Prop. 3.1, (Γ⁺, Γ⁻, φ, ψ) is a saddle point. Thus, the diagonal matrices (Γ⁺, Γ⁻) are an optimal pair.
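The condition in Eq. (3.7) is easy to test numerically for a candidate bijective matching. The sketch below is an illustration only: the function name `satisfies_eq_3_7`, the small tolerance, and the two 2 × 2 kernel matrices are assumptions chosen so that one matching passes and the other fails.

```python
import math


def satisfies_eq_3_7(K, gamma_plus, gamma_minus, tol=1e-12):
    """Check Eq. (3.7) for all i != j, given kernel matrix K and diagonal masses."""
    n = len(K)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the case i == j holds trivially (all three terms equal 1)
            upper = K[i][i] * K[j][j] / K[i][j] ** 2
            middle = math.sqrt(
                gamma_plus[i] * gamma_minus[j] / (gamma_plus[j] * gamma_minus[i])
            )
            lower = K[j][i] ** 2 / (K[i][i] * K[j][j])
            if not (upper + tol >= middle >= lower - tol):
                return False
    return True


unit_masses = [1.0, 1.0]
# Matched pairs are close (large diagonal kernel values): the condition holds.
good_K = [[1.0, 0.1], [0.1, 1.0]]
# Matched pairs are far apart (small diagonal kernel values): the condition fails.
bad_K = [[0.1, 1.0], [1.0, 0.1]]

good_ok = satisfies_eq_3_7(good_K, unit_masses, unit_masses)
bad_ok = satisfies_eq_3_7(bad_K, unit_masses, unit_masses)
print(good_ok, bad_ok)
```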

This proposition yields two consequences.

Remark 3.4 (Cyclical monotonicity). The inequality in Eq. (3.7) yields the following “c”-cyclical monotonicity [45] with the cost function “c” defined as − log K.

For any natural number N and any subset {(x₁, y₁), . . . , (x_N, y_N)} from two point-sets, the following inequality holds:

\[
\frac{K_{1,1}\Big(\prod_{i=2}^{N-1} K_{i,i}^2\Big) K_{N,N}}{\prod_{i=1}^{N-1} K_{i,i+1}^2}
\;\ge\; \prod_{i=1}^{N-1} \sqrt{\frac{\gamma^+_i \gamma^-_{i+1}}{\gamma^-_i \gamma^+_{i+1}}}
= \sqrt{\frac{\gamma^+_1 \gamma^-_N}{\gamma^-_1 \gamma^+_N}}
\;\ge\; \frac{K_{N,1}^2}{K_{1,1} K_{N,N}},
\]

i.e., we obtain the c-cyclical monotonicity,

\[
\sum_{i=1}^{N} \log K_{i,i} \;\ge\; \sum_{i=1}^{N} \log K_{i,i+1}, \quad \text{where } K_{N,N+1} := K_{N,1}.
\]
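For a small kernel matrix, the c-cyclical monotonicity inequality can be verified by brute force over all cycles of distinct indices. The sketch below is illustrative: the helper names `gaussian_kernel` and `is_c_cyclically_monotone`, the toy coordinates, and the restriction to cycles without repeated indices are assumptions, and the enumeration grows factorially with the number of points.

```python
import math
from itertools import permutations


def gaussian_kernel(xs, ys, sigma):
    """Kernel matrix K_ij = exp(-||x_i - y_j||^2 / sigma^2)."""
    return [[math.exp(-math.dist(x, y) ** 2 / sigma ** 2) for y in ys] for x in xs]


def is_c_cyclically_monotone(K, tol=1e-12):
    """Check sum_k log K[i_k][i_k] >= sum_k log K[i_k][i_{k+1}] over all cycles."""
    n = len(K)
    for size in range(2, n + 1):
        for cycle in permutations(range(n), size):
            lhs = sum(math.log(K[i][i]) for i in cycle)
            rhs = sum(
                math.log(K[cycle[k]][cycle[(k + 1) % size]]) for k in range(size)
            )
            if lhs < rhs - tol:
                return False
    return True


xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
ys = [(0.05, 0.02), (1.05, 0.02), (0.05, 1.02)]

# Pairing x_i with y_i (nearby points) passes the check.
monotone = is_c_cyclically_monotone(gaussian_kernel(xs, ys, sigma=1.0))
# Swapping the first two targets pairs distant points and violates it.
swapped = [ys[1], ys[0], ys[2]]
not_monotone = is_c_cyclically_monotone(gaussian_kernel(xs, swapped, sigma=1.0))
print(monotone, not_monotone)
```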

When K(x, y) = exp(−‖x − y‖²/σ²), the cyclical monotonicity implies that {(x_i, y_i) : i = 1, . . . , N} is included in the sub-differential of a proper lower semi-continuous