Fast All Nearest Neighbor Algorithms from Image Processing Perspective
Yuh-Rau Wang
Department of Computer Science and Information Engineering St. John’s & St. Mary’s Institute of Technology
499, Section 4, Tam-King Road, Tamsui, Taipei, Taiwan [email protected]
Shi-Jinn Horng and Hung-Chang Chan
Department of Computer Science and Information Engineering National Taiwan University of Science and Technology
43, Section 4, Kee-Lung Road, Taipei, Taiwan [email protected]
Abstract
In this paper, we solve the k-dimensional all nearest neighbor (kD ANN) problem, where k = 2 or 3, on a linear array with a reconfigurable pipelined bus system (LARPBS) from the image processing perspective. Three scalable O(1) time algorithms are proposed, one for solving the Euclidean distance transform (EDT) problem and the other two for solving the all nearest neighbor (ANN) problem. First, for a two-dimensional (2D) binary image of size N × N, we devise an algorithm for solving the 2D EDT problem using an LARPBS of size $N^{2+\varepsilon}$, where $0 < \varepsilon = \epsilon + \delta = \frac{1}{2^{c+1}-1} + \frac{1}{\kappa} < 1$ and κ and c are constants, and an algorithm for solving the 2D ANN problem using an LARPBS of size $N^{2+\epsilon}$, where $0 < \epsilon = \frac{1}{2^{c+1}-1} \le 1$. Then, for a three-dimensional (3D) binary image of size N × N × N, we devise an algorithm for solving the 3D ANN problem using an LARPBS of size $N^{3+\varepsilon}$, based on the computed 2D EDT and 2D ANN. To the best of our knowledge, all results derived above are the best known O(1) time EDT and ANN algorithms on the LARPBS model.
1. Introduction
The all nearest neighbor (ANN) problem, which has been applied in pattern recognition, geography, mathematical ecology, and solid-state physics [11], can be solved from either the computational geometry or the image processing perspective. From the computational geometry perspective, the ANN problem is defined as follows: given a set of n points, find a nearest neighbor from the set for each point of the set. If the given set of n points is in the two-dimensional (2D) Euclidean space ($E^2$), then we define this ANN problem as the 2D all nearest neighbor (2D ANN) problem. The 3D ANN problem can be defined similarly if the given set of n points is in the three-dimensional (3D) Euclidean space ($E^3$). Several O(1) time 2D ANN algorithms from the computational geometry perspective have been introduced. Eshaghian [1] proposed an O(1) time algorithm on a 2D $n^2 \times n^2$ array of the optical mesh with reconfigurable free-space optical beams (OMC). Tsai et al. [12] proposed an O(1) time algorithm on a 2D hyper-bus broadcast network (HBBN) using n × n processors with $n^{1/c}$-bit bus width, where $n^{1/c} > \log n$ and c is a constant positive integer. Jang et al. [2] proposed an O(1) time algorithm on an n × n reconfigurable mesh. For n points in a k-D Euclidean space ($E^k$), where k ≥ 2, Lai and Sheng [3] proposed an O(1) time algorithm for the ANN problem on an n × n reconfigurable mesh. Wang et al. [15] proposed an O(1) time algorithm for the kD ANN problem, where k = 2 or 3, on an LARPBS of $n^{1.5}$ processors, which is the best previously known result from the computational geometry perspective.
If the domain is a 2D image, then the 2D ANN problem is usually solved from the image processing perspective and is redefined as follows: given a 2D N × N image, find a nearest neighbor (i.e., black pixel) from the image for each black pixel of the image. Assume that there are n black pixels in a 2D N × N image. Clearly, $0 \le n \le N^2$. In the worst case, $n = O(N^2)$. Hence, the notation n used in the previous paragraph should be replaced by $N^2$ when we solve the 2D ANN problem from the image processing perspective. Basically, the ANN problem is considered to be a different problem when it is approached from a different perspective. Usually, the application itself decides which approach should be chosen, and it is difficult to judge in general which choice is better. Suppose that both approaches are suitable for an ANN application. Then the image processing approach might be a good choice if $n = O(N^2)$; however, if n = O(N), n = O(log N), or even n = O(1), then the image processing approach might be very inefficient [15]. Some O(1) time algorithms for solving the 2D ANN problem from the image processing perspective are as follows. Kao and Horng [7] proposed an O(1) time algorithm using an N × N reconfigurable array of processors (RAP) with $N^{2/c}$-bit bus width, where $N^{2/c} > 2 \log N$ and c is a constant positive integer. Pan et al. [10] proposed two O(1) time algorithms on the LARPBS. One runs on $N^{2+\delta}$ processors, where $0 < \delta = \frac{1}{\kappa} < 1$ and κ is a constant. The other runs on $N^2$ processors in O(1) time with high probability; in the worst case, however, it runs in O(log log N) time. Similarly, if the domain is a 3D image and the problem is also solved from the image processing perspective, then the 3D ANN problem is redefined as follows: given a 3D N × N × N binary image, find a nearest neighbor (i.e., black voxel) from the image for each black voxel of the image. Assume that there are n black voxels in the 3D N × N × N binary image. Clearly, $0 \le n \le N^3$. In the worst case, $n = O(N^3)$. To the best of our knowledge, no parallel algorithm has been devised so far for the 3D ANN computation from the image processing perspective.
The Euclidean distance transform (EDT) is an operation that converts an image consisting of 1- and 0-pixels to an image where each pixel has a value that represents the Euclidean distance to its nearest 1-pixel. If a pixel is a 1-pixel, then its nearest 1-pixel is itself. The EDT is the distance transform (DT) based on the Euclidean metric. Because it is rotation invariant, the Euclidean metric is considered to be superior to other metrics. One application of the EDT is robot collision avoidance. Many parallel algorithms have been proposed for constructing the EDT of a 2D binary image of size N × N; two of the most recently published are [13], [14]. Note that the 2D ANN problem appears closely related to the 2D Euclidean distance transform (2D EDT) problem; however, the two problems are very different (e.g., their properties are very different). In this paper, we devise the 3D ANN algorithm based on the computed 2D EDT and 2D ANN.
Because the EDT and ANN problems are global operations, performing them is prohibitively time consuming. In order to provide fast image computations, parallelism is employed. Hence, in this paper, we devise scalable O(1) time algorithms on the LARPBS for solving the EDT and ANN problems. We first devise an algorithm for solving the 2D EDT problem using $N^{2+\varepsilon}$ processors, where $0 < \varepsilon = \epsilon + \delta = \frac{1}{2^{c+1}-1} + \frac{1}{\kappa} < 1$ and both κ and c are constants. Then, we devise two algorithms for solving the ANN problem from the image processing perspective: one for solving the 2D ANN problem using $N^{2+\epsilon}$ processors, where $0 < \epsilon = \frac{1}{2^{c+1}-1} \le 1$, and the other for solving the 3D ANN problem based on the computed 2D EDT and 2D ANN using $N^{3+\varepsilon}$ processors. For brevity, in this version we omit the details of the processor partition, allocation, reconfiguration and data routing. To the best of our knowledge, all results derived in this paper are the best known O(1) time algorithms on the LARPBS model.
The remainder of this paper is organized as follows. In Section 2, we introduce the LARPBS model and some fundamental algorithms which are used as building blocks in this paper. In Section 3, we introduce the definitions and properties of EDT and ANN. In Section 4, we devise a scalable O(1) time algorithm for solving the 2D EDT problem. In Section 5, we devise two scalable O(1) time algorithms: one for solving the 2D ANN problem, and the other for solving the 3D ANN problem. Finally, some concluding remarks are included in the last section.
2. The LARPBS Model
In this paper, we base our algorithms on the linear array with a reconfigurable pipelined bus system (LARPBS) [8], [9]. An LARPBS is an array of N processors $P_0, P_1, \cdots, P_{N-1}$ connected by optical buses as shown in Figure 1. In the LARPBS, each processor has a local memory, is identified by a unique index, and is denoted as $P_i$, where $0 \le i \le N-1$. Optical buses offer numerous advantages including high bandwidth and low interference; however, the two most relevant properties of optical buses are their unidirectional propagation and predictable propagation delay per unit length, which enable synchronized concurrent access to an optical bus in a pipelined fashion [6].
The optical bus of the LARPBS is constructed from three identical waveguides: the message, select and reference waveguides. The message waveguide is used for sending data, and the select and reference waveguides are used for sending address information. Each waveguide is conceptually divided into two segments, the transmitting segment and the receiving segment. Each processor $P_i$ is connected to the optical bus with two couplers. One coupler, which is a 1 × 2 optical switch (denoted as RST(i)), is used to write data on the upper (transmitting) segment of the bus. The other, which is a 2 × 1 optical switch (denoted as RSR(i)), is used to read data from the lower (receiving) segment of the bus. Since each processor $P_i$ has to connect to three waveguides and each waveguide needs a pair of RST(i) and RSR(i), each processor $P_i$ needs six local optical switches, three for connecting to its three transmitting segments and three for connecting to its three receiving segments. These switches are also called reconfigurable switches due to their function. Due to reconfigurability, an LARPBS can be partitioned into k (where k ≥ 2) independent subarrays. All these subarrays can operate independently to solve different subproblems. If all the reconfigurable switches are set to straight, the bus system operates as a regular pipelined bus system as shown in Figure 1(a). If all six reconfigurable switches of processor $P_i$ are set to cross, then the LARPBS is split into two separate systems, one consisting of processors $P_0, P_1, \cdots, P_i$, whose leader processor is $P_0$, and the other consisting of processors $P_{i+1}, \cdots, P_{N-1}$, whose leader processor is $P_{i+1}$, as shown in Figure 1(b).
Figure 1. (a) An LARPBS of size N. (b) An LARPBS is split into two independent subarrays.
As in many other synchronous parallel computing systems, an LARPBS computation is a sequence of alternating global communication and local computation steps. Since the message communication time on the whole optical pipelined bus is comparable to the computation time of any arithmetic or logic operation, we can count a bus cycle as well as a processor cycle as one step, and each step takes O(1) time. This time complexity measure has been adopted widely in the literature [4], [9], [14], [15], and we adopt it in this paper. The coincident pulse technique [5] is used on the LARPBS model for addressing. Lemma 1 summarizes some primitive operations of the LARPBS model. See [4], [9] for the definitions and detailed implementations of these primitive operations. Unlike many theoretical models, such as the PRAM, the LARPBS is more realistic and is likely to become a feasible architecture in the near future [4], [9]. The powerful primitive operations plus the reconfigurability of the LARPBS model make it very attractive for solving the EDT and ANN problems, which are both computation and communication intensive.
Lemma 1 [4], [9]. One-to-one communication, broadcasting, multicasting, multiple multicasting, binary prefix sum, binary summation, and compression can be done in O(1) bus cycles on an LARPBS.
Lemma 2 implies Lemma 3, which we call the basic minimum finding algorithm. Then, based on Lemma 3 and [13], we devise a scalable O(1) time minimum finding algorithm, named Algorithm MFA, as described in Lemma 4.
Lemma 2 [10]. N items can be sorted in O(1) time using an LARPBS of $N^2$ processors.
Lemma 3 The minimum value of N data items can be computed in O(1) time on an LARPBS of size $N^2$.
Lemma 4 (Algorithm MFA). The minimum value of N data items can be computed in O(1) time on an LARPBS of size $N^{1+\epsilon}$, where $0 < \epsilon = \frac{1}{2^{c+1}-1} \le 1$ and c is a constant. Here we skip the proof.
Note that throughout this paper, $\frac{1}{2^{c+1}-1}$ is denoted as $\epsilon$. Clearly, Lemma 3 is a special case of Lemma 4 (when c = 0). Note that in each step of Algorithm MFA all processors are fully utilized, and that $\epsilon$ converges toward zero very fast as c grows. This implies that the number of processors needed drops dramatically even if c increases only slightly. For example, let $N = 10^{15}$. If we invoke Lemma 3 (i.e., c = 0 in Lemma 4), it takes $10^{30}$ processors to find the minimum of these N data items in O(1) time; however, if we invoke Lemma 4 and choose c = 3, then $\epsilon = 1/15$ and it takes only $10^{16}$ processors to find the minimum in O(1) time. Increasing c from 0 to 3 thus reduces the number of processors by a factor of $10^{14}$. Although c can be any non-negative integer, to take full advantage of this algorithm we suggest choosing c ≥ 3 (or c ≥ 5). This lemma can be used as a sub-algorithm to solve many other related parallel computation problems on the LARPBS more efficiently.
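The arithmetic in the example above can be checked with a few lines of Python. This is only a sketch of the processor-count formula $N^{1+\epsilon}$ from Lemma 4, not of Algorithm MFA itself; the function names `mfa_epsilon` and `log10_processors` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def mfa_epsilon(c: int) -> Fraction:
    """Return epsilon = 1 / (2^(c+1) - 1) for a non-negative integer c."""
    if c < 0:
        raise ValueError("c must be a non-negative integer")
    return Fraction(1, 2 ** (c + 1) - 1)


def log10_processors(log10_n: int, c: int) -> Fraction:
    """Return log10 of N^(1 + epsilon) for N = 10^log10_n (exact arithmetic)."""
    return log10_n * (1 + mfa_epsilon(c))


if __name__ == "__main__":
    # For N = 10^15, list the exponent of the processor count for c = 0..5.
    for c in range(6):
        print(c, mfa_epsilon(c), float(log10_processors(15, c)))
```

Working with the base-10 exponent rather than with $N^{1+\epsilon}$ directly keeps the computation exact and shows how quickly the extra exponent shrinks as c grows.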
Lemma 5 introduces the algorithms for computing the 1D EDT and 1D ANN. The 1D EDT and 1D ANN problems are defined in Section 3. Here we skip the proof of Lemma 5.
Lemma 5 The 1D EDT and 1D ANN of N Boolean data can be solved in O(1) time on an LARPBS of N processors.
3. Definitions and Properties of EDT and ANN
Given a set of 1-pixels, finding a nearest neighbor (NN) for each 1-pixel of the set is defined as the all nearest neighbor (ANN) problem. The EDT is an operation that converts an image consisting of black and white pixels to an image where each pixel has a value that represents the Euclidean distance to its nearest 1-pixel. If a pixel is a 1-pixel, then its nearest 1-pixel is itself. Let pixel b with coordinate (i) be denoted as $b_i$. Let $L = \{b_i \mid b_i \text{ is either a 0-pixel or a 1-pixel},\ 0 \le i < N\}$ denote a line of size N. Let $B_{1D} = \{q_x \mid q_x \text{ is a 1-pixel},\ 0 \le x < N\} \subseteq L$ represent the set of the 1-pixels of the line. In the following, we first formally introduce the EDT and ANN problems of a one-dimensional (1D) binary image of size N.
Definition 1 (1D EDT). Let the 1D nearest 1-pixel of $b_i \in L$ be denoted as $E_{b_i}$. Let the 1D Euclidean distance transform (1D EDT) of pixel $b_i$ be denoted as $1d\_edt_{b_i}$ and be defined as the distance from pixel $b_i$ to its nearest 1-pixel $E_{b_i} \in B_{1D}$, i.e., $1d\_edt_{b_i} = \mathrm{dist}(b_i, E_{b_i}) = \min_{q_x \in B_{1D}} \{|i - x|\}$, and the 1D EDT of L is to compute $\{1d\_edt_{b_i} \mid \forall b_i \in L\}$. Clearly, if $b_i$ is a 1-pixel, then $E_{b_i} = b_i$. If no 1-pixel is in line L, let $E_{b_i} = (+\infty)$ and $1d\_edt_{b_i} = +\infty$.
Definition 2 (1D ANN). Let a 1D nearest neighbor of $q_x$ be denoted as $N_{q_x}$. Then, a 1-pixel $q_{x'}$ is defined as $N_{q_x}$ if $\mathrm{dist}(q_x, N_{q_x}) = \min_{q_{x'} \in B_{1D} - \{q_x\}} \{|x' - x|\}$, and the 1D ANN of $B_{1D}$ is to compute $\{N_{q_x} \mid \forall q_x \in B_{1D}\}$. Clearly, $N_{q_x} \ne q_x$. If $q_x$ is the only 1-pixel in line L, let $N_{q_x} = (+\infty)$.
Let a pixel p with coordinates (i, j) be denoted as $p_{i,j}$. Let $I = \{p_{i,j} \mid 0 \le i, j < N,\ p_{i,j} \text{ is either a 0-pixel or a 1-pixel}\}$ denote a 2D binary image of size N × N. Let $B_{2D} = \{q_{x,y} \mid q_{x,y} \text{ is a 1-pixel},\ 0 \le x, y < N\} \subseteq I$ denote the set of the 1-pixels of the 2D binary image. Let $(X_{N_{p_{i,j}}}, Y_{N_{p_{i,j}}})$ denote the coordinates of $N_{p_{i,j}}$. Let column k be denoted as $C_k$. Let row r be denoted as $R_r$. We then formally introduce the EDT and ANN problems of a 2D binary image of size N × N as follows.
Definition 3 (2D EDT). Let $E_{p_{i,j}}$ denote the 2D nearest 1-pixel of $p_{i,j} \in I$. Let the 2D Euclidean distance transform (2D EDT) of pixel $p_{i,j}$ be denoted as $2d\_edt_{p_{i,j}}$ and be defined as the distance from pixel $p_{i,j}$ to its nearest 1-pixel $E_{p_{i,j}} \in B_{2D}$, i.e., $2d\_edt_{p_{i,j}} = \mathrm{dist}(p_{i,j}, E_{p_{i,j}}) = \min_{q_{x,y} \in B_{2D}} \{((i-x)^2 + (j-y)^2)^{1/2}\}$, and the 2D EDT of I is to compute $\{2d\_edt_{p_{i,j}} \mid \forall p_{i,j} \in I\}$. Clearly, if $p_{i,j}$ is a 1-pixel, then $E_{p_{i,j}} = p_{i,j}$. If no 1-pixel is in I, let $E_{p_{i,j}} = (+\infty, +\infty)$ and $2d\_edt_{p_{i,j}} = +\infty$.
Definition 4 For a fixed k, $0 \le k < N$, let $E^{C_k}_{p_{i,j}}$ denote the 2D nearest 1-pixel of the pixel $p_{i,j}$ with respect to all 1-pixels in column k. Clearly, if j = k, then $E^{C_k}_{p_{i,k}}$ reduces to the 1D EDT of $p_{i,k}$ on column k.
Definition 5 (2D ANN). Let $q_{x,y} \in B_{2D}$ be a 1-pixel, and let the 1-pixel $N_{q_{x,y}}$ denote the 2D nearest neighbor of $q_{x,y}$. Then, a 1-pixel $q_{x',y'}$ is defined as $N_{q_{x,y}}$ if $\mathrm{dist}(q_{x,y}, N_{q_{x,y}}) = \min_{q_{x',y'} \in B_{2D} - \{q_{x,y}\}} \{((x'-x)^2 + (y'-y)^2)^{1/2}\}$, and the 2D ANN of $B_{2D}$ is to compute $\{N_{q_{x,y}} \mid \forall q_{x,y} \in B_{2D}\}$. If $q_{x,y}$ is the only 1-pixel in I, let $N_{q_{x,y}} = (+\infty, +\infty)$.
Definition 6 For a fixed k, $0 \le k < N$, let the 1-pixel $N^{C_k}_{q_{x,y}}$ denote the 2D nearest neighbor of the 1-pixel $q_{x,y}$ with respect to all the 1-pixels in column k. Clearly, if y = k, then $N^{C_k}_{q_{x,k}}$ reduces to the 1D ANN of $q_{x,k}$ on column k.
Let a voxel p with coordinates (i, j, k) be denoted as $p_{i,j,k}$. Let $V = \{p_{i,j,k} \mid 0 \le i, j, k < N,\ p_{i,j,k} \text{ is either a 0-voxel or a 1-voxel}\}$ denote a 3D binary image of size N × N × N. Let $B_{3D} = \{q_{x,y,z} \mid q_{x,y,z} \text{ is a 1-voxel},\ 0 \le x, y, z < N\} \subseteq V$ denote the set of the 1-voxels of the 3D binary image. Let $(X_{N_{p_{i,j,k}}}, Y_{N_{p_{i,j,k}}}, Z_{N_{p_{i,j,k}}})$ denote the coordinates of $N_{p_{i,j,k}}$. Finally, we formally introduce the ANN problem of a 3D binary image of size N × N × N as follows.
Definition 7 (3D ANN). Let $q_{x,y,z} \in B_{3D}$ be a 1-voxel, and let the 1-voxel $N_{q_{x,y,z}}$ denote the 3D nearest neighbor of $q_{x,y,z}$. Then, a 1-voxel $q_{x',y',z'}$ is defined to be $N_{q_{x,y,z}}$ if $\mathrm{dist}(q_{x,y,z}, N_{q_{x,y,z}}) = \min_{q_{x',y',z'} \in B_{3D} - \{q_{x,y,z}\}} \{((x'-x)^2 + (y'-y)^2 + (z'-z)^2)^{1/2}\}$. The 3D ANN of $B_{3D}$ is to compute $\{N_{q_{x,y,z}} \mid \forall q_{x,y,z} \in B_{3D}\}$. If $q_{x,y,z}$ is the only 1-voxel in V, let $N_{q_{x,y,z}} = (+\infty, +\infty, +\infty)$.
Definition 8 (Z-plane, Z-column). Let $I_k = \{(x, y, k) \mid 0 \le x, y < N,\ k \text{ is a fixed integer}\}$. Then V can be represented by $\{I_k \mid 0 \le k < N\}$. Let $Z_{x,y}$-column $= \{(x, y, k) \mid 0 \le k < N,\ x \text{ and } y \text{ are fixed integers}\}$. Then V can also be represented by $\{Z_{x,y}\text{-column} \mid 0 \le x, y < N\}$. For short, each $Z_{x,y}$-column is called a Z-column, and each Z-plane $I_k$ is called a Z-plane.
3.1. Some Basic Properties of 2D EDT
Lemma 6 [13]. Let $p_{x,i}$, $p_{x,k}$, $p_{x,j}$, i < k < j, be three pixels located at row x, which is parallel to the Y-axis. Each of these three pixels is either a 0-pixel or a 1-pixel. Let $E_{p_{x,i}} = p_{m,s}$ and $E_{p_{x,j}} = p_{n,t}$ be the 2D nearest 1-pixels of $p_{x,i}$ and $p_{x,j}$, respectively. Then $E_{p_{x,k}} = p_{u,r}$ is located between column s (inclusive) and column t (inclusive). That is, $s \le r \le t$.
Lemma 7 [13]. Let $p_{x,k}$, $0 \le k < N$, be all the pixels located at row x, which is parallel to the Y-axis. Each of these pixels is either a 0-pixel or a 1-pixel. Then, for each $p_{x,k}$, $E_{p_{x,k}} \in \{E^{C_r}_{p_{x,r}} \mid 0 \le r < N\}$, $0 \le k < N$.
Lemma 8 Let $p_{x,i}$, $p_{x,k}$, $p_{x,j}$, i < k < j, be three pixels located at row x, which is parallel to the Y-axis. Each of these three pixels is either a 0-pixel or a 1-pixel. Let $E_{p_{x,i}} = p_{m,s}$ and $E_{p_{x,j}} = p_{n,t}$ be the 2D nearest 1-pixels of $p_{x,i}$ and $p_{x,j}$, respectively. Then, for each $p_{x,k}$, $E_{p_{x,k}} \in \{E^{C_r}_{p_{x,r}} \mid s \le r \le t\}$, $i \le k \le j$. This lemma follows directly from Lemmas 6 and 7.
3.2. Some Basic Properties of 2D ANN
Lemma 9 Let $p_{x,i}$, $p_{x,k}$, $p_{x,j}$, i < k < j, be three 1-pixels located at row x, where the left and right 1D nearest neighbors of the 1-pixel $p_{x,k}$ on row x are $p_{x,i}$ and $p_{x,j}$, respectively. Assume $N_{p_{x,k}} = p_{t,r}$. Then, $p_{t,r}$ is located between columns i and j. That is, $i \le r \le j$. In other words, $i \le Y_{N_{p_{x,k}}} \le j$. This lemma can be proven easily. Here we skip the proof.
Lemma 10 Let $p_{x,i}$, $p_{x,k}$, $p_{x,j}$, i < k < j, be three 1-pixels located at row x, where the left and right 1D nearest neighbors of the 1-pixel $p_{x,k}$ on row x are $p_{x,i}$ and $p_{x,j}$, respectively, as shown in Figure 2. Then $N_{p_{x,k}} \in \{E^{C_r}_{p_{x,r}} \mid i \le r \le j,\ r \ne k\} \cup \{N^{C_k}_{p_{x,k}}\}$.
Proof: Based on Lemma 9, $N_{p_{x,k}}$ is located between columns i and j. First, assume that $N_{p_{x,k}}$, say $p_{q,k}$, is on column k. Since the nearest neighbor of the 1-pixel $p_{x,k}$ on column k is $N^{C_k}_{p_{x,k}}$, $N_{p_{x,k}}$ must be $N^{C_k}_{p_{x,k}}$. Otherwise, $N_{p_{x,k}}$, say $p_{s,r}$, is not on column k, and we must prove that $N_{p_{x,k}} \in \{E^{C_r}_{p_{x,r}} \mid i \le r \le j\}$, where $r \ne k$. This can be proven by contradiction as follows. Assume that $E^{C_r}_{p_{x,r}} = p_{t,r}$, where $s \ne t$. Then $|\overline{p_{x,r} E^{C_r}_{p_{x,r}}}| = |x - t| < |x - s| = |\overline{p_{x,r} N_{p_{x,k}}}|$. Therefore, $|\overline{p_{x,k} E^{C_r}_{p_{x,r}}}| = ((x-t)^2 + (k-r)^2)^{1/2} < ((x-s)^2 + (k-r)^2)^{1/2} = |\overline{p_{x,k} N_{p_{x,k}}}|$. This means that pixel $p_{x,k}$ is closer to the 1-pixel $E^{C_r}_{p_{x,r}}$ with coordinates (t, r) than to the 1-pixel $N_{p_{x,k}}$ with coordinates (s, r), a contradiction. This proves that $N_{p_{x,k}} \in \{E^{C_r}_{p_{x,r}} \mid i \le r \le j,\ r \ne k\} \cup \{N^{C_k}_{p_{x,k}}\}$, i < k < j.
Figure 2. An illustration of Lemma 10.
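The candidate set of Lemma 10 can be checked on small images with a sequential sketch. The code below is an illustration only and not the parallel algorithm of this paper; the image is assumed to be a square list of lists indexed as `img[row][column]`, all function names are hypothetical, and when a 1-pixel has no left or right neighbor on its row the column range is extended to column 0 or N − 1, as is done in the proof of Lemma 14.

```python
def nearest_in_column(img, row, col, exclude_self=False):
    """Row index of the 1-pixel in column `col` closest to `row`, or None."""
    best = None
    for t in range(len(img)):
        if img[t][col] != 1 or (exclude_self and t == row):
            continue
        if best is None or abs(t - row) < abs(best - row):
            best = t
    return best


def row_ann_candidates(img, x, k):
    """Candidate nearest neighbors of the 1-pixel (x, k), following Lemma 10."""
    n = len(img)
    ones = [c for c in range(n) if img[x][c] == 1]
    left = max((c for c in ones if c < k), default=0)
    right = min((c for c in ones if c > k), default=n - 1)
    candidates = []
    for r in range(left, right + 1):
        # Column k contributes the column-wise nearest neighbor of (x, k);
        # every other column r contributes the 1-pixel nearest to (x, r).
        t = nearest_in_column(img, x, r, exclude_self=(r == k))
        if t is not None:
            candidates.append((t, r))
    return candidates


def sq_dist(p, q):
    """Squared Euclidean distance between two (row, column) pairs."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def ann_via_lemma10(img, x, k):
    cands = row_ann_candidates(img, x, k)
    return min(cands, key=lambda q: sq_dist((x, k), q)) if cands else None


def ann_brute_force(img, x, k):
    n = len(img)
    others = [(t, r) for t in range(n) for r in range(n)
              if img[t][r] == 1 and (t, r) != (x, k)]
    return min(others, key=lambda q: sq_dist((x, k), q)) if others else None
```

The two functions may return different pixels when several 1-pixels are equally close, so a comparison should be made on distances rather than on coordinates.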
3.3. Some Basic Properties of 3D ANN
Lemma 11 Let $p_{x,y,i}$, $p_{x,y,k}$, $p_{x,y,j}$, i < k < j, be three 1-voxels located at the $Z_{x,y}$-column, which is parallel to the Z-axis, where the left and right 1D nearest neighbors of the 1-voxel $p_{x,y,k}$ on the $Z_{x,y}$-column are $p_{x,y,i} \in I_i$ and $p_{x,y,j} \in I_j$, respectively. Assume $N_{p_{x,y,k}} = p_{u,v,r}$. Then, $p_{u,v,r}$ is located between Z-planes $I_i$ and $I_j$. That is, $i \le r \le j$. The proof of this lemma is similar to that of Lemma 9. Here we skip the details.
Lemma 12 Let $p_{x,y,i}$, $p_{x,y,k}$, $p_{x,y,j}$, i < k < j, be three 1-voxels located at the $Z_{x,y}$-column, which is parallel to the Z-axis, where the left and right nearest neighbors of the 1-voxel $p_{x,y,k}$ on the $Z_{x,y}$-column are $p_{x,y,i} \in I_i$ and $p_{x,y,j} \in I_j$, respectively. Then $N_{p_{x,y,k}} \in \{E^{I_r}_{p_{x,y,r}} \mid i \le r \le j,\ r \ne k\} \cup \{N^{I_k}_{p_{x,y,k}}\}$. The proof of this lemma is similar to that of Lemma 10. Here we skip the details.
4. Algorithm for Computing 2D EDT
By taking full advantage of the EDT properties introduced in Section 3 and employing exactly the same divide-and-conquer strategy as used in Wang and Horng [14], in this section we first devise Algorithm Row 2D EDT and then devise Algorithm 2D EDT for a 2D N × N binary image. For brevity, we describe these two algorithms only briefly.
4.1. Algorithm for Computing One Row of 2D EDT
Algorithm Row 2D EDT computes the 2D EDT for every pixel (which is either a 0-pixel or a 1-pixel) of a row of a 2D N × N binary image. Based on the pre-computed $E^{C_l}_{b_l}$ and $1d\_edt^{C_l}_{b_l}$ (obtained by invoking Lemma 5), this algorithm computes $E_{b_l}$ and $2d\_edt_{b_l}$ for each pixel $b_l = p_{r,l}$, $0 \le l < N$, of one fixed row r of I. Throughout this paper, let κ be a constant, let $\frac{1}{\kappa}$ be denoted as $\delta$, and let $\epsilon + \delta$ be denoted as $\varepsilon$. For convenience, each processor with index ξ of an LARPBS of size $N^{1+\epsilon+\delta}$ (i.e., $P_\xi$, $0 \le \xi < N^{1+\epsilon+\delta}$) is also represented as P(i, j, k), where $\xi = i \times N^{1+\epsilon} + j \times N + k$, $0 \le i < N^{\delta}$, $0 \le j < N^{\epsilon}$, $0 \le k < N$. We assume that the pixel $b_l$, $0 \le l < N$, is initially given in the first N processors $P_\xi$ (i.e., P(0, 0, l), $0 \le l < N$) of an LARPBS of size $N^{1+\epsilon+\delta}$, one pixel per processor. Based on Lemmas 6, 7, and 8, the 2D EDT of all pixels in a row of size N can be computed efficiently in O(1) time on an LARPBS of $N^{1+\epsilon+\delta}$ processors. This concludes Lemma 13.
Lemma 13 Algorithm Row 2D EDT can be performed in O(1) time on an LARPBS with $N^{1+\varepsilon}$ processors, where $0 < \varepsilon = \epsilon + \delta = \frac{1}{2^{c+1}-1} + \frac{1}{\kappa} < 1$ and both κ and c are constants.
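The processor naming P(i, j, k) used in this subsection is a mixed-radix decomposition of the linear index ξ. A minimal sketch follows, under the assumption that the quantity $N^{\epsilon}$ is replaced by an integer stand-in `n_eps` so that the arithmetic stays exact; the function names and parameters are hypothetical and do not appear in the paper.

```python
def to_index(i, j, k, n, n_eps):
    """Map P(i, j, k) to the linear index xi = i * (n * n_eps) + j * n + k."""
    return i * (n * n_eps) + j * n + k


def from_index(xi, n, n_eps):
    """Inverse mapping: recover (i, j, k) from the linear index xi."""
    i, rest = divmod(xi, n * n_eps)
    j, k = divmod(rest, n)
    return i, j, k
```

With n = 8 and n_eps = 2, for instance, the processors P(0, 0, l) for 0 ≤ l < 8 are exactly the first eight processors, matching the assumption that the input row is stored one pixel per processor at the start of the array.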
4.2. A Scalable O(1) Time Algorithm for Solving the 2D EDT Problem
The high level description of this algorithm is as follows. First, we perform the 1D computation phase to compute the 1D EDT for each pixel along a column; all columns are processed in parallel. Then, based on the computed results, we perform the 2D computation phase by invoking Algorithm Row 2D EDT for each row to compute its corresponding Row 2D EDT in parallel. For computing the 2D EDT, we denote each processor with index ξ of an LARPBS of size $N^{2+\epsilon+\delta}$ (i.e., $P_\xi$, $0 \le \xi < N^{2+\epsilon+\delta}$) as P(v, i, j, k), where $\xi = v \times N^{1+\epsilon+\delta} + i \times N^{1+\epsilon} + j \times N + k$, $0 \le v < N$, $0 \le i < N^{\delta}$, $0 \le j < N^{\epsilon}$, $0 \le k < N$. The P(v, i, j, k) of size $N^{2+\epsilon+\delta}$ can be constructed from the P(i, j, k) of size $N^{1+\epsilon+\delta}$ by adding one coordinate v, where $0 \le v < N$. In the 1D computation phase, according to Lemma 5, the 1D EDT of each column of I can be processed in O(1) time using N processors. So it takes O(1) time using $N^2$ processors for all the N columns to compute their corresponding 1D EDTs in parallel. In the 2D computation phase, according to Lemma 13, Algorithm Row 2D EDT takes O(1) time using an LARPBS of $N^{1+\epsilon+\delta}$ processors for all pixels $b_l = p_{r,l}$ of a fixed row r to compute their 2D EDT. So it takes O(1) time using an LARPBS of $N^{2+\epsilon+\delta}$ processors for all rows r, $0 \le r < N$, to compute their corresponding 2D EDTs in parallel. This concludes Theorem 1.
Theorem 1 The 2D EDT of a binary image of size N × N can be computed in O(1) time on an LARPBS with $N^{2+\varepsilon}$ processors, where $0 < \varepsilon = \epsilon + \delta = \frac{1}{2^{c+1}-1} + \frac{1}{\kappa} < 1$ and both κ and c are constants.
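The two-phase structure (a 1D phase along columns followed by a per-row combination, as justified by Lemma 7) can be illustrated sequentially. The sketch below is not the O(1) time LARPBS algorithm: it runs in O(N³) time on one processor, it does not use the narrowing of Lemmas 6 and 8, and it returns squared distances to stay in exact integer arithmetic. The function names are assumptions made for this example, and the image is assumed to be a square list of lists indexed as `img[row][column]`.

```python
import math


def column_edt(img):
    """Phase 1: vertical distance from each pixel to the nearest 1-pixel in its column."""
    n = len(img)
    d = [[math.inf] * n for _ in range(n)]
    for c in range(n):
        rows = [t for t in range(n) if img[t][c] == 1]
        for r in range(n):
            if rows:
                d[r][c] = min(abs(r - t) for t in rows)
    return d


def edt_2d(img):
    """Phase 2: for each pixel of a row, combine the column results of that row."""
    n = len(img)
    col = column_edt(img)
    out = [[math.inf] * n for _ in range(n)]
    for r in range(n):
        for l in range(n):
            # The nearest 1-pixel of (r, l) is among the column-wise nearest
            # 1-pixels of the pixels (r, c) on the same row (Lemma 7).
            out[r][l] = min(col[r][c] ** 2 + (l - c) ** 2 for c in range(n))
    return out


def edt_2d_brute_force(img):
    """Reference: squared distance to the nearest 1-pixel by exhaustive search."""
    n = len(img)
    ones = [(t, c) for t in range(n) for c in range(n) if img[t][c] == 1]
    return [[min(((r - t) ** 2 + (l - c) ** 2 for t, c in ones), default=math.inf)
             for l in range(n)] for r in range(n)]
```

Comparing `edt_2d` with `edt_2d_brute_force` on small images gives a quick sanity check of the decomposition.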
5. Algorithms for Computing 2D ANN and 3D ANN
In this section, Algorithms Row 2D ANN and Zcol 3D ANN are first devised. Then, based on Algorithm Row 2D ANN and Lemma 5, we devise Algorithm 2D ANN, which computes the 2D ANN for a 2D N × N binary image (i.e., I). Finally, based on Algorithms 2D EDT, 2D ANN and Zcol 3D ANN, we devise Algorithm 3D ANN, which computes the 3D ANN for a 3D N × N × N binary image (i.e., V). We solve the 2D ANN and 3D ANN problems efficiently by taking full advantage of the EDT and ANN properties introduced in Section 3.
5.1. Algorithm for Computing One Row of 2D ANN
Algorithm Row 2D ANN computes the 2D ANN for every 1-pixel of a row r (i.e., $R_r$) of a 2D N × N binary image (i.e., I). This algorithm is devised based on the pre-computed $E^{C_l}_{p_{r,l}}$, $N^{C_l}_{p_{r,l}}$ and $N^{R_r}_{p_{r,l}}$. As described before, we can apply Lemma 5 for every pixel $b_r = p_{r,l}$ of a fixed column l to find its $E^{C_l}_{p_{r,l}}$ (i.e., the 1D nearest 1-pixel within column l), for each 1-pixel $b_r = p_{r,l}$ of a fixed column l to find its $N^{C_l}_{p_{r,l}}$ (i.e., the 1D nearest 1-pixel neighbor within column l), and for each 1-pixel $b_l = p_{r,l}$ of a fixed row r to find its $N^{R_r}_{p_{r,l}}$ (i.e., the 1D nearest 1-pixel neighbor within row r). We describe this algorithm briefly as follows.
Algorithm Row 2D ANN
Input: A fixed $R_r = \{p_{r,l} \mid 0 \le l < N\}$ of I. Every $p_{r,l} \in I$ holds its corresponding $E^{C_l}_{p_{r,l}}$. In addition, if $p_{r,l}$ is a 1-pixel, then it also holds its corresponding $N^{C_l}_{p_{r,l}}$ and $N^{R_r}_{p_{r,l}}$. All of them are pre-computed by invoking Lemma 5. Initially, all $E^{C_l}_{p_{r,l}}$, $N^{C_l}_{p_{r,l}}$ and $N^{R_r}_{p_{r,l}}$ are allocated at the first N processors (i.e., $P_l$, $0 \le l < N$) of an LARPBS of size $N^{1+\epsilon}$.
Output: Each 1-pixel $p_{r,l}$ of row r is assigned $N_{p_{r,l}}$.
Step 1: According to Lemma 9, $N_{p_{r,l}}$ (i.e., the 2D ANN of each 1-pixel $p_{r,l}$) is bounded within its left and right 1D ANNs on row r. Based on Lemma 10, the 2D ANN (i.e., $N_{p_{r,l}}$) of each odd 1-pixel $p_{r,l}$ of row r, $0 \le l < N$, can be computed.
Step 2: Similarly, based on Lemmas 9 and 10, the 2D ANN (i.e., $N_{p_{r,l}}$) of each even 1-pixel $p_{r,l}$ of row r, $0 \le l < N$, can be computed.
This concludes Lemma 14.
Figure 3. An illustration for Algorithm Row 2D ANN.
Lemma 14 Algorithm Row 2D ANN can be performed in O(1) time on an LARPBS with $N^{1+\epsilon}$ processors, where $0 < \epsilon = \frac{1}{2^{c+1}-1} \le 1$ and c is a constant.
Proof: See Figure 3 for reference. Let the set of 1-pixels in row r be denoted as $B_{R_r} = \{b_1, b_2, \ldots, b_k, \ldots, b_m\}$. Clearly, $|B_{R_r}| = m$. Let the set of 1-pixels in row r with odd index k be denoted as $BO_{R_r} = \{o_1, o_2, \ldots, o_{m_o}\}$, and let the set of 1-pixels in row r with even index k be denoted as $BE_{R_r} = \{e_1, e_2, \ldots, e_{m_e}\}$. Clearly, $|BO_{R_r}| = m_o$, $|BE_{R_r}| = m_e$ and $m = m_o + m_e$. Note that the work of indexing the 1-pixels can be done by invoking Lemma 1. Based on Lemma 9, $N_{o_i}$ (i.e., the 2D ANN of 1-pixel $o_i$) is located between columns $e_{i-1}$ and $e_i$ (i.e., $[C_{e_{i-1}}, C_{e_i}]$), where $C_{e_0} = C_0$; if $i > m_e$, then let $C_{e_i} = C_{N-1}$. Let the number of columns in $[C_{e_{i-1}}, C_{e_i}]$ be denoted as $|[C_{e_{i-1}}, C_{e_i}]| = m_{o_i}$. Then it takes $m_{o_i}^{1+\epsilon}$ processors for $o_i$ to find $N_{o_i}$. Since $\sum_{i=1}^{m_o} m_{o_i} \le N$, we have $\sum_{i=1}^{m_o} m_{o_i}^{1+\epsilon} \le \left(\sum_{i=1}^{m_o} m_{o_i}\right)^{1+\epsilon} \le N^{1+\epsilon}$. In the worst case, if only one 1-pixel is on row r, then $N^{1+\epsilon}$ processors are needed. Clearly, it takes at most $N^{1+\epsilon}$ processors for Step 1 to be performed in O(1) time. Similarly, $N_{e_i}$ (i.e., the 2D ANN of 1-pixel $e_i$) is located between columns $o_i$ and $o_{i+1}$ (i.e., $[C_{o_i}, C_{o_{i+1}}]$); if $i + 1 > m_o$, then let $C_{o_{i+1}} = C_{N-1}$. Let $|[C_{o_i}, C_{o_{i+1}}]| = m_{e_i}$. Then it takes $m_{e_i}^{1+\epsilon}$ processors for $e_i$ to find $N_{e_i}$. Since $\sum_{i=1}^{m_e} m_{e_i} \le N$, we have $\sum_{i=1}^{m_e} m_{e_i}^{1+\epsilon} \le \left(\sum_{i=1}^{m_e} m_{e_i}\right)^{1+\epsilon} \le N^{1+\epsilon}$. If only one 1-pixel is on row r, then we skip Step 2. Clearly, it takes at most $N^{1+\epsilon}$ processors for Step 2 to be performed in O(1) time also.
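The inequality between the sum of powers and the power of the sum, which the proof uses for both steps, follows from an elementary argument. The derivation below is a sketch for the odd-indexed case; the shorthand S is introduced here for illustration and is not part of the original proof.

```latex
% S is shorthand (introduced here) for the total number of columns involved.
\[
S = \sum_{i=1}^{m_o} m_{o_i} \le N, \qquad 0 \le m_{o_i} \le S .
\]
% Each term is bounded by replacing one factor m_{o_i}^{\epsilon} with S^{\epsilon}.
\[
\sum_{i=1}^{m_o} m_{o_i}^{1+\epsilon}
  = \sum_{i=1}^{m_o} m_{o_i}\, m_{o_i}^{\epsilon}
  \le \sum_{i=1}^{m_o} m_{o_i}\, S^{\epsilon}
  = S^{1+\epsilon}
  \le N^{1+\epsilon}.
\]
```

The same argument applies to the even-indexed 1-pixels with $m_{e_i}$ in place of $m_{o_i}$.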
5.2. A Scalable O(1) Time Algorithm for Solving the 2D ANN Problem
The high level description of Algorithm 2D ANN is as follows. The parallel algorithm for computing the 2D ANN consists of two phases: the 1D computation phase and the 2D computation phase. During the 1D computation phase, we first compute the 1D EDT of each column in parallel, then compute the 1D ANN of each column in parallel, and finally compute the 1D ANN of each row in parallel. Then, based on the computed results, we perform the 2D computation phase by invoking Algorithm Row 2D ANN for all rows in parallel. For computing the 2D ANN, we denote each processor with index ξ of an LARPBS of size $N^{2+\epsilon}$ (i.e., $P_\xi$, $0 \le \xi < N^{2+\epsilon}$) as P(v, j, k), where $\xi = v \times N^{1+\epsilon} + j \times N + k$, $0 \le v < N$, $0 \le j < N^{\epsilon}$, $0 \le k < N$. The P(v, j, k) of size $N^{2+\epsilon}$ can be constructed from the P(j, k) of size $N^{1+\epsilon}$ by adding one coordinate v, where $0 \le v < N$. (As stated in Section 1, here we omit the details of the processor partition, allocation, reconfiguration and data routing.) We describe this algorithm briefly as follows.
Algorithm 2D ANN
Input: A 2D N × N binary image (i.e., I), each pixel being represented by $p_{r,l} = 0$ or 1, where $0 \le r, l < N$. Every pixel $p_{r,l} \in I$ is initially allocated at processor P(r, 0, l) of an LARPBS of size $N^{2+\epsilon}$.
Output: Each 1-pixel $p_{r,l}$ is assigned the coordinates of its 2D nearest 1-pixel $N_{p_{r,l}}$.
Step 1: The 1D computation phase. As described above, every pixel $p_{r,l}$, $0 \le r < N$, of a fixed column l (i.e., $p_{*,l}$) of I is initially allocated at P(∗, 0, l). In this step, we first apply Lemma 5 for every pixel $b_r = p_{r,l}$ of a fixed column l to find its $E^{C_l}_{p_{r,l}}$ (i.e., the 1D nearest 1-pixel within column l), for each 1-pixel $b_r = p_{r,l}$ of a fixed column l to find its $N^{C_l}_{p_{r,l}}$ (i.e., the 1D nearest 1-pixel neighbor within column l), and for each 1-pixel $b_l = p_{r,l}$ of a fixed row r to find its $N^{R_r}_{p_{r,l}}$ (i.e., the 1D nearest 1-pixel neighbor within row r). At the end of this step, for every column l, $0 \le l < N$, every processor P(r, 0, l) allocated for pixel $b_r = p_{r,l}$ of column l holds $E^{C_l}_{b_r} = (X_{E^{C_l}_{b_r}}, l)$ and $1d\_edt^{C_l}_{b_r} = |\overline{b_r E^{C_l}_{b_r}}| = |X_{E^{C_l}_{b_r}}$