Fast All Nearest Neighbor Algorithms from Image Processing Perspective


Yuh-Rau Wang

Department of Computer Science and Information Engineering St. John’s & St. Mary’s Institute of Technology

499, Section 4, Tam-King Road, Tamsui, Taipei, Taiwan

Shi-Jinn Horng and Hung-Chang Chan

Department of Computer Science and Information Engineering National Taiwan University of Science and Technology

43, Section 4, Kee-Lung Road, Taipei, Taiwan

Abstract

In this paper, we solve the k-dimensional all nearest neighbor (kD ANN) problem, where k = 2 or 3, on a linear array with a reconfigurable pipelined bus system (LARPBS) from an image processing perspective. Three scalable O(1) time algorithms are proposed, one for solving the Euclidean distance transform (EDT) problem and the other two for solving the all nearest neighbor (ANN) problem. First, for a two-dimensional (2D) binary image of size N × N, we devise an algorithm for solving the 2D EDT problem using an LARPBS of size $N^{2+\varepsilon}$, where $0 < \varepsilon = \epsilon + \delta = \frac{1}{2^{c+1}-1} + \frac{1}{\kappa} < 1$ and κ and c are constants, and an algorithm for solving the 2D ANN problem using an LARPBS of size $N^{2+\epsilon}$, where $0 < \epsilon = \frac{1}{2^{c+1}-1} \le 1$. Then, for a three-dimensional (3D) binary image of size N × N × N, we devise an algorithm for solving the 3D ANN problem using an LARPBS of size $N^{3+\varepsilon}$ based on the computed 2D EDT and 2D ANN. To the best of our knowledge, these are the best known O(1) time EDT and ANN algorithms on the LARPBS model.

1. Introduction

The all nearest neighbor (ANN) problem, which has been applied in pattern recognition, geography, mathematical ecology, and solid-state physics [11], can be solved from either the computational geometry or the image processing perspective. From a computational geometry perspective, the ANN problem is defined as: given a set of n points, find a nearest neighbor from the set for each point of the set. If the given set of n points is in the two-dimensional (2D) Euclidean space ($E^2$), then we define this ANN problem as the 2D all nearest neighbor (2D ANN) problem. The 3D ANN problem can be defined similarly if the given set of n points is in the three-dimensional (3D) Euclidean space ($E^3$). Several O(1) time 2D ANN algorithms from the computational geometry perspective have been introduced. Eshaghian [1] proposed an O(1) time algorithm on a 2D $n^2 \times n^2$ array of the optical mesh with reconfigurable free-space optical beams (OMC). Tsai et al. [12] proposed an O(1) time algorithm on a 2D hyper-bus broadcast network (HBBN) using n × n processors with $n^{1/c}$-bit bus width, where $n^{1/c} > \log n$ and c is a constant positive integer. Jang et al. [2] proposed an O(1) time algorithm on an n × n reconfigurable mesh. For n points in a k-D Euclidean space ($E^k$), where k ≥ 2, Lai and Sheng [3] proposed an O(1) time algorithm for the ANN problem on an n × n reconfigurable mesh. Wang et al. [15] proposed an O(1) time algorithm for the kD ANN problem, where k = 2 or 3, on an LARPBS of $n^{1.5}$ processors, which is the best previously known result from the computational geometry perspective.

If the domain is a 2D image, then the 2D ANN problem is usually solved from an image processing perspective and is redefined as: given a 2D N × N image, find a nearest neighbor (i.e., black pixel) from the image for each black pixel of the image. Assume that there are n black pixels in a 2D N × N image. Clearly, $0 \le n \le N^2$. In the worst case, $n = O(N^2)$. Hence, the notation n used in the previous paragraph should be replaced by $N^2$ when we solve the 2D ANN problem from an image processing perspective. Basically, the ANN problem is considered a different problem when it is approached from a different perspective. Usually, the application itself decides which approach should be chosen, and it is very difficult to judge which choice is better. Suppose that both approaches are suitable for an ANN application; then the image processing approach might be a good choice if $n = O(N^2)$; however, if n = O(N), n = O(log N), or even n = O(1), then the image processing approach might be very inefficient [15]. Some O(1) time algorithms for solving the 2D ANN problem from an image processing perspective are as follows. Kao and Horng [7] proposed an O(1) time algorithm using an N × N reconfigurable array of processors (RAP) with $N^{2/c}$-bit bus width, where $N^{2/c} > 2 \log N$ and c is a constant positive integer. Pan et al. [10] proposed two O(1) time algorithms on the LARPBS: one runs on $N^{2+\delta}$ processors, where $0 < \delta = \frac{1}{\kappa} < 1$ and κ is a constant; the other runs on $N^2$ processors with high probability, although it runs in O(log log N) time in the worst case. Similarly, if the domain is a 3D image and the problem is also solved from an image processing perspective, then the 3D ANN problem is redefined as: given a 3D N × N × N binary image, find a nearest neighbor (i.e., black voxel) from the image for each black voxel of this image. Assume that there are n black voxels in the 3D N × N × N binary image. Clearly, $0 \le n \le N^3$. In the worst case, $n = O(N^3)$. To the best of our knowledge, no parallel algorithms have been devised so far for the 3D ANN computation from an image processing perspective.

The Euclidean distance transform (EDT) is an operation that converts an image consisting of 1- and 0-pixels to an image where each pixel has a value that represents the Euclidean distance to its nearest 1-pixel. If a pixel is a 1-pixel, then its nearest 1-pixel is itself. The EDT is the distance transform (DT) based on the Euclidean metric. Because it is rotation invariant, the Euclidean metric is considered superior to other metrics. One application of the EDT, for example, is robot collision avoidance. Many parallel algorithms have been proposed for constructing the EDT of a 2D binary image of size N × N; two of the most recently published are [13], [14]. Note that the 2D ANN problem appears closely related to the 2D Euclidean distance transform (2D EDT) problem; however, the two are very different (e.g., their properties are very different). In this paper, we devise the 3D ANN algorithm based on the computed 2D EDT and 2D ANN.

Because the EDT and ANN problems are global operations, they are prohibitively time consuming when performed sequentially. In order to provide fast image computations, parallelism is employed. Hence, in this paper, we devise scalable O(1) time algorithms on the LARPBS for solving the EDT and ANN problems. We first devise an algorithm for solving the 2D EDT problem using $N^{2+\varepsilon}$ processors, where $0 < \varepsilon = \epsilon + \delta = \frac{1}{2^{c+1}-1} + \frac{1}{\kappa} < 1$ and both κ and c are constants. Then, we devise two algorithms for solving the ANN problem from an image processing perspective: one for solving the 2D ANN problem using $N^{2+\epsilon}$ processors, where $0 < \epsilon = \frac{1}{2^{c+1}-1} \le 1$, and the other for solving the 3D ANN problem, based on the computed 2D EDT and 2D ANN, using $N^{3+\varepsilon}$ processors. To avoid tedious detail, in this version we omit the details of the processor partition, allocation, reconfiguration and data routing. To the best of our knowledge, all results derived in this paper are the best known O(1) time algorithms on the LARPBS model.

The remainder of this paper is organized as follows. In Section 2, we introduce the LARPBS model and some fundamental algorithms which will be used as building blocks of this paper. In Section 3, we introduce the definitions and properties of EDT and ANN. In Section 4, we devise a scalable O(1) time algorithm for solving the 2D EDT problem. In Section 5, we devise two scalable O(1) time algorithms: one for solving the 2D ANN problem, and the other for solving the 3D ANN problem. Finally, some concluding remarks are included in the last section.

2. The LARPBS Model

In this paper, we base our algorithms on the linear array with a reconfigurable pipelined bus system (LARPBS) [8], [9]. An LARPBS is an array of N processors $P_0, P_1, \ldots, P_{N-1}$ connected by optical buses as shown in Figure 1. In the LARPBS, each processor with a local memory is identified by a unique index and denoted as $P_i$, where 0 ≤ i ≤ N − 1. Optical buses offer numerous advantages including high bandwidth and low interference; however, the two most relevant properties of optical buses are their unidirectional propagation and predictable propagation delay per unit length, which enable synchronized concurrent access to an optical bus in a pipelined fashion [6].

The optical bus of the LARPBS is constructed from three identical waveguides: the message, select and reference waveguides. The message waveguide is used for sending data, and the select and reference waveguides are used for sending address information. Each waveguide is conceptually divided into two segments, the transmitting segment and the receiving segment. Each processor $P_i$ is connected to the optical bus with two couplers. One coupler, which is a 1 × 2 optical switch (denoted as RST(i)), is used to write data on the upper (transmitting) segment of the bus. The other, which is a 2 × 1 optical switch (denoted as RSR(i)), is used to read the data from the lower (receiving) segment of the bus. Since each processor $P_i$ has to connect to three waveguides and each waveguide needs a pair of RST(i) and RSR(i), each processor $P_i$ needs six local optical switches, three for connecting to its three transmitting segments and three for connecting to its three receiving segments. These switches are also called reconfigurable switches due to their function. Due to reconfigurability, an LARPBS can be partitioned into k (where k ≥ 2) independent subarrays. All these subarrays can operate independently to solve different subproblems. If all the reconfigurable switches are set to straight, the bus system operates as a regular pipelined bus system as shown in Figure 1(a). If all six reconfigurable switches of processor $P_i$ are set to cross, then the LARPBS is split into two separate systems, one consisting of processors $P_0, P_1, \ldots, P_i$, whose leader processor is $P_0$, and the other consisting of processors $P_{i+1}, \ldots, P_{N-1}$, whose leader processor is $P_{i+1}$, as shown in Figure 1(b).

Figure 1. (a) An LARPBS of size N. (b) An LARPBS is split into two independent subarrays.
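The partitioning rule can be illustrated with a small sequential sketch. It only models which processors end up in which subarray; it does not model the optical bus. The function name `split_subarrays` and the convention that a cut at the last processor has no effect are assumptions made for this illustration.

```python
def split_subarrays(n, cross_at):
    """Return (leader, last) index pairs of the subarrays of an n-processor array.

    cross_at: indices i whose six switches are set to cross. Each such i
    ends a subarray at P_i and makes P_{i+1} the leader of the next one.
    """
    subarrays = []
    leader = 0
    for i in sorted(set(cross_at)):
        if 0 <= i < n - 1:  # assumption: a cut after the last processor changes nothing
            subarrays.append((leader, i))
            leader = i + 1
    subarrays.append((leader, n - 1))
    return subarrays


print(split_subarrays(8, []))      # all switches straight: one array
print(split_subarrays(8, [2]))     # P_2 set to cross: two subarrays
```

With no switch set to cross the whole array is one system led by $P_0$; setting the switches of $P_2$ to cross gives the two systems $P_0, \ldots, P_2$ and $P_3, \ldots, P_7$.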

As in many other synchronous parallel computing systems, an LARPBS computation is a sequence of alternating global communication and local computation steps. Since the message communication time on the whole optical pipelined bus is comparable with the computation time of any arithmetic or logic operation, we can treat a bus cycle, as well as a processor cycle, as one step, and each step takes O(1) time. This time complexity measure has been adopted widely in the literature [4], [9], [14], [15]. In this paper, we adopt the same time complexity measure. The coincident pulse technique [5] is used on the LARPBS model for addressing. Lemma 1 summarizes some primitive operations of the LARPBS model. See [4], [9] for the definitions and detailed implementations of these primitive operations. Unlike many theoretical models, such as the PRAM, the LARPBS is more realistic and is likely to become a feasible architecture in the near future [4], [9]. The powerful primitive operations plus the reconfigurability of the LARPBS model make it very attractive for solving the EDT and ANN problems, which are both computation and communication intensive.

Lemma 1 [4], [9]. One-to-one communication, broadcasting, multicasting, multiple multicasting, binary prefix sum, binary summation, and compression can be done in O(1) bus cycles on an LARPBS.

Lemma 2 implies Lemma 3. Here we name Lemma 3 the basic minimum finding algorithm. Then, based on Lemma 3 and [13], we devise a scalable O(1) time minimum finding algorithm, named Algorithm MFA, as described in Lemma 4.

Lemma 2 [10]. N items can be sorted in O(1) time using an LARPBS of $N^2$ processors.

Lemma 3 The minimum value of N data items can be computed in O(1) time on an LARPBS of size $N^2$.

Lemma 4 (Algorithm MFA). The minimum value of N data items can be computed in O(1) time on an LARPBS of size $N^{1+\epsilon}$, where $0 < \epsilon = \frac{1}{2^{c+1}-1} \le 1$ and c is a constant. Here we skip the proof.

Note that throughout this paper, $\frac{1}{2^{c+1}-1}$ is denoted as $\epsilon$. Clearly, Lemma 3 is a special case of Lemma 4 (when c = 0). Note that in each step of Algorithm MFA, all processors are always fully utilized, and $\epsilon$ converges toward zero very fast as c increases. This implies that the number of processors needed drops dramatically even if c increases very slowly. For example, let $N = 10^{15}$. If we invoke Lemma 3 (i.e., c = 0 in Lemma 4), it takes $10^{30}$ processors to find the minimum of these N data items in O(1) time; however, if we invoke Lemma 4 and choose c = 3, then it takes only $10^{16}$ processors to find the minimum in O(1) time. Obviously, a huge saving in processors (a factor of $10^{14}$) is obtained when we increase c from 0 to 3. Although c can be any non-negative integer, to take full advantage of this algorithm we suggest choosing c ≥ 3, or even c ≥ 5. This lemma can be used as a sub-algorithm to solve many other related parallel computation problems on the LARPBS more efficiently.
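The processor counts in this example follow directly from the exponent $1 + \frac{1}{2^{c+1}-1}$. The short sketch below reproduces the arithmetic with exact fractions; the function name `log10_processors` is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction


def log10_processors(log10_n, c):
    """Return log10 of N**(1 + eps), where eps = 1 / (2**(c + 1) - 1).

    log10_n is log10(N); working with exponents keeps the arithmetic exact.
    """
    eps = Fraction(1, 2 ** (c + 1) - 1)
    return log10_n * (1 + eps)


# Exponent of 10 in the processor count for N = 10**15 and c = 0..5.
for c in range(6):
    print(c, log10_processors(15, c))
```

For N = 10^15 the exponent is 30 when c = 0 and 16 when c = 3, which matches the figures quoted in the text.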

Lemma 5 introduces the algorithms for computing the 1D EDT and 1D ANN. The 1D EDT and 1D ANN problems are defined in Section 3. Here we skip the proof of Lemma 5.

Lemma 5 The 1D EDT and 1D ANN of N Boolean data can be solved in O(1) time on an LARPBS of N processors.

3. Deﬁnitions and Properties of EDT and ANN

Given a set of 1-pixels, finding a nearest neighbor (NN) for each 1-pixel of the set is defined as the all nearest neighbor (ANN) problem. The EDT is an operation that converts an image consisting of black and white pixels to an image where each pixel has a value that represents the Euclidean distance to its nearest 1-pixel. If a pixel is a 1-pixel, then its nearest 1-pixel is itself. Let pixel b with coordinate (i) be denoted as $b_i$. Let $L = \{b_i \mid b_i \text{ is either a 0-pixel or a 1-pixel},\ 0 \le i < N\}$ denote a line of size N. Let $B_{1D} = \{q_x \mid q_x \text{ is a 1-pixel},\ 0 \le x < N\} \subseteq L$ represent the set of the 1-pixels of the line. In the following, we first formally introduce the EDT and ANN problems of a one-dimensional (1D) binary image of size N.

Definition 1 (1D EDT). Let the 1D nearest 1-pixel of $b_i \in L$ be denoted as $E_{b_i}$. Let the 1D Euclidean distance transform (1D EDT) of pixel $b_i$ be denoted as $1d\_edt_{b_i}$ and be defined as the distance from pixel $b_i$ to its nearest 1-pixel $E_{b_i} \in B_{1D}$, i.e., $1d\_edt_{b_i} = dist(b_i, E_{b_i}) = \min_{q_x \in B_{1D}} \{|i - x|\}$, and the 1D EDT of L is to compute $\{1d\_edt_{b_i} \mid \forall b_i \in L\}$. Clearly, if $b_i$ is a 1-pixel, then $E_{b_i} = b_i$. If no 1-pixel is in line L, let $E_{b_i} = (+\infty)$ and $1d\_edt_{b_i} = +\infty$.

Definition 2 (1D ANN). Let a 1D nearest neighbor of $q_x$ be denoted as $N_{q_x}$. Then, a 1-pixel $q_{x'}$ is defined as $N_{q_x}$ if $dist(q_x, N_{q_x}) = \min_{q_{x'} \in B_{1D} - \{q_x\}} \{|x' - x|\}$, and the 1D ANN of $B_{1D}$ is to compute $\{N_{q_x} \mid \forall q_x \in B_{1D}\}$. Clearly, $N_{q_x} \ne q_x$. If $q_x$ is the only 1-pixel in line L, let $N_{q_x} = (+\infty)$.
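To make Definitions 1 and 2 concrete, the following is a plain sequential reference for the 1D EDT and 1D ANN of a line of 0/1 values. It is only a sketch of what is computed, not the O(1) time LARPBS procedure of Lemma 5; the function names `edt_1d` and `ann_1d` are hypothetical.

```python
import math


def edt_1d(line):
    """1D EDT: distance from every pixel to its nearest 1-pixel (inf if none)."""
    n = len(line)
    dist = [math.inf] * n
    last = None
    for i in range(n):                 # nearest 1-pixel at or to the left
        if line[i] == 1:
            last = i
        if last is not None:
            dist[i] = i - last
    last = None
    for i in range(n - 1, -1, -1):     # nearest 1-pixel at or to the right
        if line[i] == 1:
            last = i
        if last is not None:
            dist[i] = min(dist[i], last - i)
    return dist


def ann_1d(line):
    """1D ANN: for every 1-pixel, the index of a nearest *other* 1-pixel."""
    ones = [i for i, b in enumerate(line) if b == 1]
    result = {}
    for pos, x in enumerate(ones):
        candidates = []
        if pos > 0:
            candidates.append(ones[pos - 1])   # left neighbor on the line
        if pos + 1 < len(ones):
            candidates.append(ones[pos + 1])   # right neighbor on the line
        # A lone 1-pixel has no neighbor; Definition 2 assigns +infinity.
        result[x] = min(candidates, key=lambda y: abs(y - x)) if candidates else math.inf
    return result
```

Note the difference between the two problems: in the EDT a 1-pixel is its own nearest 1-pixel (distance 0), whereas in the ANN a 1-pixel is never its own neighbor.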

Let a pixel p with coordinates (i, j) be denoted as $p_{i,j}$. Let $I = \{p_{i,j} \mid 0 \le i, j < N,\ p_{i,j} \text{ is either a 0-pixel or a 1-pixel}\}$ denote a 2D binary image of size N × N. Let $B_{2D} = \{q_{x,y} \mid q_{x,y} \text{ is a 1-pixel},\ 0 \le x, y < N\} \subseteq I$ denote the set of the 1-pixels of the 2D binary image. Let $(X_{N_{p_{i,j}}}, Y_{N_{p_{i,j}}})$ denote the coordinates of $N_{p_{i,j}}$. Let column k be denoted as $C_k$. Let row r be denoted as $R_r$. We then formally introduce the EDT and ANN problems of a 2D binary image of size N × N as follows.

Definition 3 (2D EDT). Let $E_{p_{i,j}}$ denote the 2D nearest 1-pixel of $p_{i,j} \in I$. Let the 2D Euclidean distance transform (2D EDT) of pixel $p_{i,j}$ be denoted as $2d\_edt_{p_{i,j}}$ and be defined as the distance from pixel $p_{i,j}$ to its nearest 1-pixel $E_{p_{i,j}} \in B_{2D}$, i.e., $2d\_edt_{p_{i,j}} = dist(p_{i,j}, E_{p_{i,j}}) = \min_{q_{x,y} \in B_{2D}} \{((i - x)^2 + (j - y)^2)^{1/2}\}$, and the 2D EDT of I is to compute $\{2d\_edt_{p_{i,j}} \mid \forall p_{i,j} \in I\}$. Clearly, if $p_{i,j}$ is a 1-pixel, then $E_{p_{i,j}} = p_{i,j}$. If no 1-pixel is in I, let $E_{p_{i,j}} = (+\infty, +\infty)$ and $2d\_edt_{p_{i,j}} = +\infty$.

Definition 4 For a fixed k, let $E^{C_k}_{p_{i,j}}$, 0 ≤ k < N, denote the 2D nearest 1-pixel of the pixel $p_{i,j}$ with respect to all 1-pixels in column k. Clearly, if j = k, then $E^{C_k}_{p_{i,k}}$ reduces to the 1D EDT of $p_{i,k}$ on column k.

Definition 5 (2D ANN). Let $q_{x,y} \in B_{2D}$ be a 1-pixel. Let 1-pixel $N_{q_{x,y}}$ denote the 2D nearest neighbor of the 1-pixel $q_{x,y}$. Then, a 1-pixel $q_{x',y'}$ is defined as $N_{q_{x,y}}$ if $dist(q_{x,y}, N_{q_{x,y}}) = \min_{q_{x',y'} \in B_{2D} - \{q_{x,y}\}} \{((x' - x)^2 + (y' - y)^2)^{1/2}\}$, and the 2D ANN of $B_{2D}$ is to compute $\{N_{q_{x,y}} \mid \forall q_{x,y} \in B_{2D}\}$. If $q_{x,y}$ is the only 1-pixel in I, let $N_{q_{x,y}} = (+\infty, +\infty)$.

Definition 6 For a fixed k, let 1-pixel $N^{C_k}_{q_{x,y}}$, 0 ≤ k < N, denote the 2D nearest neighbor of the 1-pixel $q_{x,y}$ with respect to all the 1-pixels in column k. Clearly, if y = k, then $N^{C_k}_{q_{x,k}}$ reduces to the 1D ANN of $q_{x,k}$ on column k.

Let a voxel p with coordinates (i, j, k) be denoted as $p_{i,j,k}$. Let $V = \{p_{i,j,k} \mid 0 \le i, j, k < N,\ p_{i,j,k} \text{ is either a 0-voxel or a 1-voxel}\}$ denote a 3D binary image of size N × N × N. Let $B_{3D} = \{q_{x,y,z} \mid q_{x,y,z} \text{ is a 1-voxel},\ 0 \le x, y, z < N\} \subseteq V$ denote the set of the 1-voxels of the 3D binary image. Let $(X_{N_{p_{i,j,k}}}, Y_{N_{p_{i,j,k}}}, Z_{N_{p_{i,j,k}}})$ denote the coordinates of $N_{p_{i,j,k}}$. Finally, we formally introduce the ANN problem of a 3D binary image of size N × N × N as follows.

Definition 7 (3D ANN). Let $q_{x,y,z} \in B_{3D}$ be a 1-voxel. Let 1-voxel $N_{q_{x,y,z}}$ denote the 3D nearest neighbor of the 1-voxel $q_{x,y,z}$. Then, a 1-voxel $q_{x',y',z'}$ is defined to be $N_{q_{x,y,z}}$ if $dist(q_{x,y,z}, N_{q_{x,y,z}}) = \min_{q_{x',y',z'} \in B_{3D} - \{q_{x,y,z}\}} \{((x' - x)^2 + (y' - y)^2 + (z' - z)^2)^{1/2}\}$. The 3D ANN of $B_{3D}$ is to compute $\{N_{q_{x,y,z}} \mid \forall q_{x,y,z} \in B_{3D}\}$. If $q_{x,y,z}$ is the only 1-voxel in V, let $N_{q_{x,y,z}} = (+\infty, +\infty, +\infty)$.

Definition 8 (Z-plane, Z-column). Let $I_k = \{(x, y, k) \mid 0 \le x, y < N,\ k \text{ is a fixed integer}\}$. Then V can be represented by $\{I_k \mid 0 \le k < N\}$. Let $Z_{x,y}$-column $= \{(x, y, k) \mid 0 \le k < N,\ x \text{ and } y \text{ are fixed integers}\}$. Then V can also be represented by $\{Z_{x,y}\text{-column} \mid 0 \le x, y < N\}$. For short, each $Z_{x,y}$-column is called a Z-column, and each $I_k$ is called a Z-plane.

3.1. Some Basic Properties of 2D EDT

Lemma 6 [13]. Let $p_{x,i}$, $p_{x,k}$, $p_{x,j}$, i < k < j, be three pixels located at row x, which is parallel to the Y-axis. Each of these three pixels is either a 0-pixel or a 1-pixel. Let $E_{p_{x,i}} = p_{m,s}$ and $E_{p_{x,j}} = p_{n,t}$ be the 2D nearest 1-pixels of $p_{x,i}$ and $p_{x,j}$, respectively. Then $E_{p_{x,k}} = p_{u,r}$ lies between column s (included) and column t (included). That is, s ≤ r ≤ t.

Lemma 7 [13]. Let $p_{x,k}$, 0 ≤ k < N, be all the pixels located at row x, which is parallel to the Y-axis. Each of these pixels is either a 0-pixel or a 1-pixel. Then, for each $p_{x,k}$, $E_{p_{x,k}} \in \{E^{C_r}_{p_{x,r}} \mid 0 \le r < N\}$, 0 ≤ k < N.

Lemma 8 Let $p_{x,i}$, $p_{x,k}$, $p_{x,j}$, i < k < j, be three pixels located at row x, which is parallel to the Y-axis. Each of these three pixels is either a 0-pixel or a 1-pixel. Let $E_{p_{x,i}} = p_{m,s}$ and $E_{p_{x,j}} = p_{n,t}$ be the 2D nearest 1-pixels of $p_{x,i}$ and $p_{x,j}$, respectively. Then, for each $p_{x,k}$, $E_{p_{x,k}} \in \{E^{C_r}_{p_{x,r}} \mid s \le r \le t\}$, i ≤ k ≤ j. This lemma follows directly from Lemmas 6 and 7.

3.2. Some Basic Properties of 2D ANN

Lemma 9 Let $p_{x,i}$, $p_{x,k}$, $p_{x,j}$, i < k < j, be three 1-pixels located at row x, where the left and right 1D nearest neighbors of the 1-pixel $p_{x,k}$ on row x are $p_{x,i}$ and $p_{x,j}$, respectively. Assume $N_{p_{x,k}} = p_{t,r}$. Then $p_{t,r}$ lies between columns i and j. That is, i ≤ r ≤ j. In other words, $i \le Y_{N_{p_{x,k}}} \le j$. This lemma can be proven easily. Here we skip the proof.

Lemma 10 Let $p_{x,i}$, $p_{x,k}$, $p_{x,j}$, i < k < j, be three 1-pixels located at row x, where the left and right 1D nearest neighbors of the 1-pixel $p_{x,k}$ on row x are $p_{x,i}$ and $p_{x,j}$, respectively, as shown in Figure 2. Then $N_{p_{x,k}} \in \{E^{C_r}_{p_{x,r}} \mid i \le r \le j,\ r \ne k\} \cup \{N^{C_k}_{p_{x,k}}\}$.

Proof: Based on Lemma 9, $N_{p_{x,k}}$ lies between columns i and j. First, assume that $N_{p_{x,k}}$, say $p_{q,k}$, is on column k. Since the nearest neighbor of 1-pixel $p_{x,k}$ on column k is $N^{C_k}_{p_{x,k}}$, $N_{p_{x,k}}$ must be $N^{C_k}_{p_{x,k}}$. Otherwise, if $N_{p_{x,k}}$, say $p_{s,r}$, is not on column k, then we should prove that $N_{p_{x,k}} \in \{E^{C_r}_{p_{x,r}} \mid i \le r \le j\}$, where r ≠ k. This can be proven by contradiction as follows. Assume that $E^{C_r}_{p_{x,r}} = p_{t,r}$, where s ≠ t; then $|p_{x,r} E^{C_r}_{p_{x,r}}| = |x - t| < |x - s| = |p_{x,r} N_{p_{x,k}}|$. Therefore, $|p_{x,k} E^{C_r}_{p_{x,r}}| = ((x - t)^2 + (k - r)^2)^{1/2} < ((x - s)^2 + (k - r)^2)^{1/2} = |p_{x,k} N_{p_{x,k}}|$. This means that pixel $p_{x,k}$ is closer to the 1-pixel $E^{C_r}_{p_{x,r}}$ with coordinates (t, r) than to the 1-pixel $N_{p_{x,k}}$ with coordinates (s, r), a contradiction. This proves that $N_{p_{x,k}} \in \{E^{C_r}_{p_{x,r}} \mid i \le r \le j,\ r \ne k\} \cup \{N^{C_k}_{p_{x,k}}\}$, i < k < j.

Figure 2. An illustration of Lemma 10.
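Lemmas 9 and 10 restrict the search for the nearest neighbor of a 1-pixel to one candidate per column between its left and right row neighbors. The sequential sketch below searches exactly that candidate set and can be compared with an exhaustive search. It is an illustration only, not the parallel algorithm. The function names are hypothetical, the image is assumed to be a square list of lists indexed as `img[row][column]`, and using columns 0 and N − 1 as bounds when a row neighbor is missing is an assumption borrowed from the convention used later for Algorithm Row 2D ANN.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two (row, column) pairs."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def nearest_in_column(img, x, r, exclude_self=False):
    """Row of a 1-pixel in column r nearest to row x, or None if there is none."""
    rows = [t for t in range(len(img))
            if img[t][r] == 1 and not (exclude_self and t == x)]
    return min(rows, key=lambda t: abs(t - x)) if rows else None


def ann_row_pixel(img, x, k):
    """Nearest neighbor of 1-pixel (x, k) using only the Lemma 10 candidates."""
    n = len(img)
    # Lemma 9: the answer lies between the left and right row neighbors.
    i = max((c for c in range(k) if img[x][c] == 1), default=0)
    j = min((c for c in range(k + 1, n) if img[x][c] == 1), default=n - 1)
    candidates = []
    for r in range(i, j + 1):
        # Column k contributes the column neighbor of (x, k) itself; every
        # other column r contributes the nearest 1-pixel of (x, r) in column r.
        t = nearest_in_column(img, x, r, exclude_self=(r == k))
        if t is not None:
            candidates.append((t, r))
    return min(candidates, key=lambda p: sq_dist(p, (x, k))) if candidates else None


def ann_brute_force(img, x, k):
    """Exhaustive reference: scan every other 1-pixel of the image."""
    n = len(img)
    others = [(s, c) for s in range(n) for c in range(n)
              if img[s][c] == 1 and (s, c) != (x, k)]
    return min(others, key=lambda p: sq_dist(p, (x, k))) if others else None
```

When several 1-pixels are equally near, the two functions may return different pixels, so a comparison should be made on distances rather than on coordinates.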

3.3. Some Basic Properties of 3D ANN

Lemma 11 Let $p_{x,y,i}$, $p_{x,y,k}$, $p_{x,y,j}$, i < k < j, be three 1-voxels located at the $Z_{x,y}$-column, which is parallel to the Z-axis, where the left and right 1D nearest neighbors of the 1-voxel $p_{x,y,k}$ on the $Z_{x,y}$-column are $p_{x,y,i} \in I_i$ and $p_{x,y,j} \in I_j$, respectively. Assume $N_{p_{x,y,k}} = p_{u,v,r}$. Then $p_{u,v,r}$ lies between Z-planes $I_i$ and $I_j$. That is, i ≤ r ≤ j. The proof of this lemma is similar to that of Lemma 9. Here we skip the details.

Lemma 12 Let $p_{x,y,i}$, $p_{x,y,k}$, $p_{x,y,j}$, i < k < j, be three 1-voxels located at the $Z_{x,y}$-column, which is parallel to the Z-axis, where the left and right nearest neighbors of the 1-voxel $p_{x,y,k}$ on the $Z_{x,y}$-column are $p_{x,y,i} \in I_i$ and $p_{x,y,j} \in I_j$, respectively. Then $N_{p_{x,y,k}} \in \{E^{I_r}_{p_{x,y,r}} \mid i \le r \le j,\ r \ne k\} \cup \{N^{I_k}_{p_{x,y,k}}\}$. The proof of this lemma is similar to that of Lemma 10. Here we skip the details.
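The 3D lemmas have the same shape as the 2D ones, with Z-planes playing the role of columns: each plane between the two Z-column neighbors contributes one candidate, namely the in-plane nearest 1-voxel of the position (x, y). A sequential sketch of that candidate search follows. It is illustrative only; the names are hypothetical, the volume is assumed to be a cubic nested list indexed as `vol[x][y][z]`, the per-plane search is done by brute force rather than by a 2D EDT or 2D ANN algorithm, and planes 0 and N − 1 are assumed as bounds when a Z-column neighbor is missing.

```python
def sq_dist3(p, q):
    """Squared Euclidean distance between two (x, y, z) triples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_in_plane(vol, x, y, r, exclude_self=False):
    """A 1-voxel of Z-plane r nearest to position (x, y) in that plane, or None."""
    n = len(vol)
    voxels = [(u, v, r) for u in range(n) for v in range(n)
              if vol[u][v][r] == 1 and not (exclude_self and (u, v) == (x, y))]
    return min(voxels, key=lambda p: sq_dist3(p, (x, y, r))) if voxels else None


def ann_voxel(vol, x, y, k):
    """Nearest neighbor of 1-voxel (x, y, k) using only the Lemma 12 candidates."""
    n = len(vol)
    # Lemma 11: the answer lies between the two neighbors on the Z-column.
    i = max((z for z in range(k) if vol[x][y][z] == 1), default=0)
    j = min((z for z in range(k + 1, n) if vol[x][y][z] == 1), default=n - 1)
    candidates = []
    for r in range(i, j + 1):
        # Plane k contributes the in-plane neighbor of (x, y, k) (a 2D ANN
        # result); any other plane r contributes the in-plane nearest
        # 1-voxel of (x, y, r) (a 2D EDT result).
        p = nearest_in_plane(vol, x, y, r, exclude_self=(r == k))
        if p is not None:
            candidates.append(p)
    return min(candidates, key=lambda p: sq_dist3(p, (x, y, k))) if candidates else None


def ann_voxel_brute_force(vol, x, y, k):
    """Exhaustive reference: scan every other 1-voxel of the volume."""
    n = len(vol)
    others = [(u, v, w) for u in range(n) for v in range(n) for w in range(n)
              if vol[u][v][w] == 1 and (u, v, w) != (x, y, k)]
    return min(others, key=lambda p: sq_dist3(p, (x, y, k))) if others else None
```

Within a fixed plane r the term (r − k)² is constant, so the in-plane nearest voxel to (x, y) is also the voxel of that plane nearest to (x, y, k); this is why one candidate per plane suffices.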

4. Algorithm for Computing 2D EDT

By taking full advantage of the EDT properties introduced in Section 3 and employing exactly the same divide-and-conquer strategy as used in Wang and Horng [14], in this section we first devise Algorithm Row 2D EDT and then devise Algorithm 2D EDT for a 2D N × N binary image. To avoid tedious detail, we describe these two algorithms only briefly.

4.1. Algorithm for Computing One Row of 2D EDT

Algorithm Row 2D EDT is an algorithm for any pixel (which is either a 0-pixel or a 1-pixel) of a row of a 2D N × N binary image to compute the 2D EDT. Based on the pre-computed $E^{C_l}_{b_l}$ and $1d\_edt^{C_l}_{b_l}$ (obtained by invoking Lemma 5), this algorithm computes $E_{b_l}$ and $2d\_edt_{b_l}$ for each pixel $b_l = p_{r,l}$, 0 ≤ l < N, of one fixed row r of I. Throughout this paper, let κ be a constant, let $\frac{1}{\kappa}$ be denoted as δ, and let $\epsilon + \delta$ be denoted as $\varepsilon$. For convenience, each processor with index ξ of an LARPBS of size $N^{1+\epsilon+\delta}$ (i.e., $P_\xi$, $0 \le \xi < N^{1+\epsilon+\delta}$) is also represented as P(i, j, k), where $\xi = i \times N^{1+\epsilon} + j \times N + k$, $0 \le i < N^{\delta}$, $0 \le j < N^{\epsilon}$, 0 ≤ k < N. We assume that the pixel $b_l$, 0 ≤ l < N, is initially given in the first N processors $P_\xi$ (i.e., P(0, 0, l), 0 ≤ l < N) of an LARPBS of size $N^{1+\epsilon+\delta}$, one pixel per processor. Based on Lemmas 6, 7, and 8, the 2D EDT of all pixels in a row of size N can be computed efficiently in O(1) time on an LARPBS of $N^{1+\epsilon+\delta}$ processors. This concludes Lemma 13.

Lemma 13 Algorithm Row 2D EDT can be computed in O(1) time on an LARPBS with $N^{1+\varepsilon}$ processors, where $0 < \varepsilon = \epsilon + \delta = \frac{1}{2^{c+1}-1} + \frac{1}{\kappa} < 1$ and both κ and c are constants.
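The three-coordinate processor naming used in this subsection is a mixed-radix decomposition of the linear index ξ. The sketch below converts between the two forms; the helper names `to_tuple` and `to_index` are hypothetical, and the sketch assumes that $N^{\epsilon}$ is an integer (passed as `n_eps`), which holds for instance when N = 2^15 and c = 3, so that $N^{\epsilon} = 2$.

```python
def to_tuple(xi, n, n_eps):
    """Split a linear index xi into (i, j, k) with xi = i*n*n_eps + j*n + k.

    n is N and n_eps is N**eps (assumed to be an integer here), so that
    n * n_eps is N**(1 + eps).
    """
    i, rest = divmod(xi, n * n_eps)
    j, k = divmod(rest, n)
    return i, j, k


def to_index(i, j, k, n, n_eps):
    """Inverse of to_tuple: rebuild the linear processor index."""
    return i * n * n_eps + j * n + k


# Example: N = 2**15, c = 3 (so N**eps = 2) and kappa = 3 (so N**delta = 32).
n, n_eps, n_delta = 2 ** 15, 2, 32
print(to_tuple(n * n_eps * n_delta - 1, n, n_eps))   # the last processor
```

In this example the array has 2^21 processors, and the first N of them, P(0, 0, l), are the ones that initially hold the pixels of the row.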

4.2. A Scalable O(1) Time Algorithm for Solving the 2D EDT Problem

The high level description of this algorithm is as follows. First, we perform the 1D computation phase to compute the 1D EDT for each pixel along a column; all columns are processed in parallel. Then, based on the computed results, we perform the 2D computation phase by invoking Algorithm Row 2D EDT for each row to compute its corresponding Row 2D EDT in parallel. For computing the 2D EDT, we denote each processor with index ξ of an LARPBS of size $N^{2+\epsilon+\delta}$ (i.e., $P_\xi$, $0 \le \xi < N^{2+\epsilon+\delta}$) as P(v, i, j, k), where $\xi = v \times N^{1+\epsilon+\delta} + i \times N^{1+\epsilon} + j \times N + k$, 0 ≤ v < N, $0 \le i < N^{\delta}$, $0 \le j < N^{\epsilon}$, 0 ≤ k < N. The P(v, i, j, k) of size $N^{2+\epsilon+\delta}$ can be constructed from the P(i, j, k) of size $N^{1+\epsilon+\delta}$ by adding one coordinate v, where 0 ≤ v < N. In the 1D computation phase, according to Lemma 5, the 1D EDT of each column of I can be processed in O(1) time using N processors. So it takes O(1) time using $N^2$ processors for all N columns to compute their corresponding 1D EDTs in parallel. In the 2D computation phase, according to Lemma 13, Algorithm Row 2D EDT takes O(1) time using an LARPBS of $N^{1+\epsilon+\delta}$ processors for all pixels $b_l = p_{r,l}$ of a fixed row r to compute their 2D EDT. So it takes O(1) time using an LARPBS of $N^{2+\epsilon+\delta}$ processors for all rows r, 0 ≤ r < N, to compute their corresponding 2D EDTs in parallel. This concludes Theorem 1.

Theorem 1 The 2D EDT of a binary image of size N × N can be computed in O(1) time on an LARPBS with $N^{2+\varepsilon}$ processors, where $0 < \varepsilon = \epsilon + \delta = \frac{1}{2^{c+1}-1} + \frac{1}{\kappa} < 1$ and both κ and c are constants.
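The two-phase structure (a 1D phase along columns, then a row phase that combines the column results as in Lemma 7) can be written down as a short sequential reference. This is a sketch of what the two phases compute, not of the O(1) time LARPBS procedure; the function names are hypothetical, the image is assumed to be a square list of lists indexed as `img[row][column]`, and squared distances are returned to keep the arithmetic exact.

```python
import math


def column_edt(img, col):
    """1D EDT along one column: distance from each row to the nearest 1-pixel in it."""
    n = len(img)
    ones = [t for t in range(n) if img[t][col] == 1]
    return [min((abs(t - r) for t in ones), default=math.inf) for r in range(n)]


def edt_2d(img):
    """Squared 2D EDT computed by a column phase followed by a row phase."""
    n = len(img)
    # Phase 1: every column computes its own 1D EDT independently.
    col_dist = [column_edt(img, c) for c in range(n)]
    # Phase 2: for pixel (r, col), the best 1-pixel of column c is the one
    # nearest to row r within column c, so one candidate per column suffices.
    result = [[math.inf] * n for _ in range(n)]
    for r in range(n):
        for col in range(n):
            result[r][col] = min(col_dist[c][r] ** 2 + (col - c) ** 2
                                 for c in range(n))
    return result
```

The row phase here scans all N columns for every pixel; the properties in Lemmas 6 and 8 are what allow the parallel algorithm to narrow that range.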

5. Algorithms for Computing 2D ANN and 3D ANN

In this section, Algorithms Row 2D ANN and Zcol 3D ANN are first devised. Then, based on Algorithm Row 2D ANN and Lemma 5, we devise Algorithm 2D ANN, which computes the 2D ANN for a 2D N × N binary image (i.e., I). Finally, based on Algorithms 2D EDT, 2D ANN and Zcol 3D ANN, we devise Algorithm 3D ANN, which computes the 3D ANN for a 3D N × N × N binary image (i.e., V). We solve the 2D ANN and 3D ANN problems efficiently by taking full advantage of the EDT and ANN properties introduced in Section 3.

5.1. Algorithm for Computing One Row of 2D ANN

Algorithm Row 2D ANN is an algorithm for any 1-pixel of a row r (i.e., $R_r$) of a 2D N × N binary image (i.e., I) to compute its corresponding 2D ANN. This algorithm is devised based on all the pre-computed $E^{C_l}_{p_{r,l}}$, $N^{C_l}_{p_{r,l}}$ and $N^{R_r}_{p_{r,l}}$. As described before, we can apply Lemma 5 for every pixel $b_r = p_{r,l}$ of a fixed column l to find its $E^{C_l}_{p_{r,l}}$ (i.e., the 1D nearest 1-pixel within column l), for each 1-pixel $b_r = p_{r,l}$ of a fixed column l to find its $N^{C_l}_{p_{r,l}}$ (i.e., the 1D nearest 1-pixel neighbor within column l), and for each 1-pixel $b_l = p_{r,l}$ of a fixed row r to find its $N^{R_r}_{p_{r,l}}$ (i.e., the 1D nearest 1-pixel neighbor within row r). We describe this algorithm briefly as follows.

Algorithm Row 2D ANN

Input: A fixed $R_r = \{p_{r,l} \mid 0 \le l < N\}$ of I. Every $p_{r,l} \in I$ holds its corresponding $E^{C_l}_{p_{r,l}}$. Besides, if $p_{r,l}$ is a 1-pixel, then it also holds its corresponding $N^{C_l}_{p_{r,l}}$ and $N^{R_r}_{p_{r,l}}$. All of them are pre-computed by invoking Lemma 5. Initially, all $E^{C_l}_{p_{r,l}}$, $N^{C_l}_{p_{r,l}}$ and $N^{R_r}_{p_{r,l}}$ are allocated at the first N processors (i.e., $P_l$, 0 ≤ l < N) of an LARPBS of size $N^{1+\epsilon}$.

Output: Each 1-pixel $p_{r,l}$ of row r is assigned $N_{p_{r,l}}$.

Step 1: According to Lemma 9, $N_{p_{r,l}}$ (i.e., the 2D ANN of each 1-pixel $p_{r,l}$) is bounded within its left and right 1D ANNs on row r. Based on Lemma 10, the 2D ANN (i.e., $N_{p_{r,l}}$) of each odd-indexed 1-pixel $p_{r,l}$ of row r, 0 ≤ l < N, can be computed.

Step 2: Similarly, based on Lemmas 9 and 10, the 2D ANN (i.e., $N_{p_{r,l}}$) of each even-indexed 1-pixel $p_{r,l}$ of row r, 0 ≤ l < N, can be computed. This concludes Lemma 14.

Figure 3. An illustration for Algorithm Row 2D ANN.

Lemma 14 Algorithm Row 2D ANN can be computed in O(1) time on an LARPBS with $N^{1+\epsilon}$ processors, where $0 < \epsilon = \frac{1}{2^{c+1}-1} \le 1$ and c is a constant.

Proof: See Figure 3 for reference. Let the set of 1-pixels in row r be denoted as $B_{R_r} = \{b_1, b_2, \ldots, b_k, \ldots, b_m\}$. Clearly, $|B_{R_r}| = m$. Let the set of 1-pixels in row r with odd index k be denoted as $BO_{R_r} = \{o_1, o_2, \ldots, o_{m_o}\}$, and let the set of 1-pixels in row r with even index k be denoted as $BE_{R_r} = \{e_1, e_2, \ldots, e_{m_e}\}$. Clearly, $|BO_{R_r}| = m_o$, $|BE_{R_r}| = m_e$ and $m = m_o + m_e$. Note that the work of indexing the 1-pixels can be done by invoking Lemma 1. Based on Lemma 9, $N_{o_i}$ (i.e., the 2D ANN of 1-pixel $o_i$) lies between columns $e_{i-1}$ and $e_i$ (i.e., $[C_{e_{i-1}}, C_{e_i}]$), where $C_{e_0} = C_0$; if $i > m_e$, then let $C_{e_i} = C_{N-1}$. Let the number of columns in $[C_{e_{i-1}}, C_{e_i}]$ be denoted as $|[C_{e_{i-1}}, C_{e_i}]| = m_{o_i}$. Then it takes $m_{o_i}^{1+\epsilon}$ processors for $o_i$ to find $N_{o_i}$. Since $\sum_{i=1}^{m_o} m_{o_i} \le N$, we have $\sum_{i=1}^{m_o} m_{o_i}^{1+\epsilon} \le (\sum_{i=1}^{m_o} m_{o_i})^{1+\epsilon} \le N^{1+\epsilon}$. In the worst case, if only one 1-pixel is on row r, then $N^{1+\epsilon}$ processors are needed. Clearly, it takes at most $N^{1+\epsilon}$ processors for Step 1 to be performed in O(1) time. Similarly, $N_{e_i}$ (i.e., the 2D ANN of 1-pixel $e_i$) lies between columns $o_i$ and $o_{i+1}$ (i.e., $[C_{o_i}, C_{o_{i+1}}]$); if $i + 1 > m_o$, then let $C_{o_{i+1}} = C_{N-1}$. Let $|[C_{o_i}, C_{o_{i+1}}]| = m_{e_i}$. Then it takes $m_{e_i}^{1+\epsilon}$ processors for $e_i$ to find $N_{e_i}$. Since $\sum_{i=1}^{m_e} m_{e_i} \le N$, we have $\sum_{i=1}^{m_e} m_{e_i}^{1+\epsilon} \le (\sum_{i=1}^{m_e} m_{e_i})^{1+\epsilon} \le N^{1+\epsilon}$. If only one 1-pixel is on row r, then we skip Step 2. Clearly, it also takes at most $N^{1+\epsilon}$ processors for Step 2 to be performed in O(1) time.
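The processor bound in this proof relies on the inequality between a sum of powers and the power of a sum. The derivation below spells out why it holds; the symbols $a_i$ and $S$ are introduced here only for this derivation and stand for the interval sizes and their total.

```latex
% For non-negative a_1, ..., a_m with S = a_1 + ... + a_m and any \epsilon > 0,
% each a_i satisfies 0 <= a_i <= S, hence a_i^{\epsilon} <= S^{\epsilon}.
\[
\sum_{i=1}^{m} a_i^{1+\epsilon}
  = \sum_{i=1}^{m} a_i \cdot a_i^{\epsilon}
  \le \sum_{i=1}^{m} a_i \cdot S^{\epsilon}
  = S \cdot S^{\epsilon}
  = S^{1+\epsilon},
\qquad S = \sum_{i=1}^{m} a_i .
\]
```

Applied with the interval sizes in place of $a_i$ and with S bounded by N, this gives the bound $N^{1+\epsilon}$ used for each of the two steps.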

5.2. A Scalable O(1) Time Algorithm for Solving the 2D ANN Problem

The high level description of Algorithm 2D ANN is as follows. The parallel algorithm for computing the 2D ANN consists of two phases: the 1D computation phase and the 2D computation phase. During the 1D computation phase, we first compute the 1D EDT of each column in parallel, then compute the 1D ANN of each column in parallel, and finally compute the 1D ANN of each row in parallel. Then, based on the computed results, we perform the 2D computation phase by invoking Algorithm Row 2D ANN, with all rows processed in parallel. For computing the 2D ANN, we denote each processor with index ξ of an LARPBS of size $N^{2+\epsilon}$ (i.e., $P_\xi$, $0 \le \xi < N^{2+\epsilon}$) as P(v, j, k), where $\xi = v \times N^{1+\epsilon} + j \times N + k$, 0 ≤ v < N, $0 \le j < N^{\epsilon}$, 0 ≤ k < N. The P(v, j, k) of size $N^{2+\epsilon}$ can be constructed from the P(j, k) of size $N^{1+\epsilon}$ by adding one coordinate v, where 0 ≤ v < N. (As stated in Section 1, here we omit the details of the processor partition, allocation, reconfiguration and data routing.) We describe this algorithm briefly as follows.

Algorithm 2D ANN

Input: A 2D $N \times N$ binary image (i.e., $I$), each pixel being represented by $p_{r,l} = 0$ or $1$, where $0 \le r, l < N$. Every pixel $p_{r,l} \in I$ is initially allocated at processor $P(r, 0, l)$ of the LARPBS of size $N^{2+\epsilon}$.

Output: Each 1-pixel $p_{r,l}$ is assigned the coordinates of its 2D nearest 1-pixel $N_{p_{r,l}}$.

Step 1: The 1D computation phase. As described above, every pixel $p_{r,l}$, $0 \le r < N$, of a fixed column $l$ (i.e., $p_{*,l}$) of $I$ is initially allocated at $P(*, 0, l)$. In this step, we apply Lemma 5 for every pixel $b_r = p_{r,l}$ of a fixed column $l$ to find its $E^{C_l}_{p_{r,l}}$ (i.e., the 1D nearest 1-pixel within column $l$), for each 1-pixel $b_r = p_{r,l}$ of a fixed column $l$ to find its $N^{C_l}_{p_{r,l}}$ (i.e., the 1D nearest 1-pixel neighbor within column $l$), and for each 1-pixel $b_l = p_{r,l}$ of a fixed row $r$ to find its $N^{R_r}_{p_{r,l}}$ (i.e., the 1D nearest 1-pixel neighbor within row $r$). At the end of this step, for every column $l$, $0 \le l < N$, every processor $P(r, 0, l)$ allocated for pixel $b_r = p_{r,l}$ of column $l$ holds $E^{C_l}_{b_r} = (X_{E^{C_l}_{b_r}}, l)$ and $\mathrm{1d\_edt}^{C_l}_{b_r} = |b_r E^{C_l}_{b_r}| = |X_{E^{C_l}_{b_r}} - r|$. If no 1-pixel is in column $l$, let $X_{E^{C_l}_{b_r}} = +\infty$. The $N^{C_l}_{p_{r,l}}$ and $N^{R_r}_{p_{r,l}}$ can be computed similarly. If no other 1-pixel is in column $l$, let $X_{N^{C_l}_{b_r}} = +\infty$. If no other 1-pixel is in row $r$, let $Y_{N^{R_r}_{b_l}} = +\infty$.

Step 2: The 2D computation phase. In this step, for every pixel $b_l = p_{r,l}$ of a fixed row $r$, we use $P(r, *, *)$, $0 \le r < N$, to compute $N_{b_l}$. Since the processors $P(r, *, *)$, $0 \le r < N$, of total size $N^{2+\epsilon}$ can be partitioned into $N$ independent subarrays $P(*, *)$, each of size $N^{1+\epsilon}$, and there are $N$ rows in $I$ in total, the subarrays can perform Algorithm Row 2D ANN on all $N$ rows in parallel. At the end of this step, each 1-pixel $b_l = p_{r,l}$ of $I$ is assigned $N_{b_l}$. If there is no other 1-pixel in the plane, let $N_{b_l} = (+\infty, +\infty)$.
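The two phases can be illustrated with a small sequential sketch. It is not the LARPBS algorithm: it only mimics the data flow, first finding each 1-pixel's nearest 1-pixels on its own row and then searching for the 2D nearest neighbor only inside the column window bounded by those row neighbors, which is how we read the property used by Algorithm Row 2D ANN. The function names and the `image[r][l]` layout are assumptions made for this illustration.

```python
import math


def row_neighbor_columns(row):
    """For each 1-pixel of a row, return (left, right): the columns of its
    nearest 1-pixels on either side, or None where no such pixel exists."""
    ones = [l for l, bit in enumerate(row) if bit == 1]
    out = {}
    for idx, l in enumerate(ones):
        left = ones[idx - 1] if idx > 0 else None
        right = ones[idx + 1] if idx + 1 < len(ones) else None
        out[l] = (left, right)
    return out


def ann_2d_windowed(image):
    """Sequential 2D ANN. For every 1-pixel (r, l) return
    (coordinates of a nearest other 1-pixel, squared distance),
    or ((inf, inf), inf) if the image holds no other 1-pixel."""
    n = len(image)
    result = {}
    for r in range(n):
        for l, (left, right) in row_neighbor_columns(image[r]).items():
            # Column window: fall back to the image border when a row
            # neighbor is missing on that side.
            lo = left if left is not None else 0
            hi = right if right is not None else n - 1
            best, best_d2 = (math.inf, math.inf), math.inf
            for y in range(lo, hi + 1):
                for x in range(n):
                    if image[x][y] == 1 and (x, y) != (r, l):
                        d2 = (x - r) ** 2 + (y - l) ** 2
                        if d2 < best_d2:
                            best, best_d2 = (x, y), d2
            result[(r, l)] = (best, best_d2)
    return result


def ann_2d_brute(image):
    """Reference: squared distance to the nearest other 1-pixel, all pairs."""
    n = len(image)
    ones = [(x, y) for x in range(n) for y in range(n) if image[x][y] == 1]
    return {
        p: min(
            ((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2 for q in ones if q != p),
            default=math.inf,
        )
        for p in ones
    }
```

The window is safe because any 1-pixel lying strictly beyond a row neighbor's column is strictly farther away than that row neighbor itself, so it cannot be a nearest neighbor.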

Time Complexity Analysis. In Step 1, according to Lemma 5, the 1D EDT of each column, the 1D ANN of each column, and the 1D ANN of each row of $I$ can each be processed in O(1) time using $N$ processors. So it takes O(1) time using $N^2$ processors for all the columns $l$, $0 \le l < N$, to compute their corresponding 1D EDTs, and for all the columns $l$, $0 \le l < N$, and all the rows $r$, $0 \le r < N$, to compute their corresponding column 1D ANNs and row 1D ANNs, respectively, in parallel. In Step 2, according to Lemma 13, Algorithm Row 2D ANN takes O(1) time using at most $N^{1+\epsilon}$ processors for all the 1-pixels of a fixed row $r$ to compute their corresponding 2D ANNs. So it takes O(1) time using an LARPBS of at most $N^{2+\epsilon}$ processors for all the $N$ rows to compute their corresponding 2D ANNs in parallel. This concludes Theorem 2.

Theorem 2 The 2D ANN of a binary image of size $N \times N$ can be computed in O(1) time on an LARPBS with $N^{2+\epsilon}$ processors, where $0 < \epsilon = \frac{1}{2^{c+1}-1} \ll 1$ and $c$ is a constant.
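The processor labelling $P(v, j, k)$ used by Algorithm 2D ANN, with $\xi = v \times N^{1+\epsilon} + j \times N + k$, is a mixed-radix numbering. A minimal sketch follows, under the assumption that $N^{\epsilon}$ is an integer, passed in as the hypothetical parameter `n_eps`; the function names are illustrative only.

```python
def to_linear(v, j, k, n, n_eps):
    """Map P(v, j, k) to xi = v * N^(1+eps) + j * N + k.

    n is N and n_eps stands for N^eps (assumed to be an integer here).
    """
    if not (0 <= v < n and 0 <= j < n_eps and 0 <= k < n):
        raise ValueError("tuple component out of range")
    return v * (n_eps * n) + j * n + k


def from_linear(xi, n, n_eps):
    """Inverse mapping: recover (v, j, k) from the linear index xi."""
    if not 0 <= xi < n * n_eps * n:
        raise ValueError("index out of range")
    v, rest = divmod(xi, n_eps * n)  # v selects a subarray of size N^(1+eps)
    j, k = divmod(rest, n)           # j selects a block of N processors, k a position
    return v, j, k
```

For example, with `n = 4` and `n_eps = 2` the array has 32 processors, pixel $p_{2,3}$ starts at `to_linear(2, 0, 3, 4, 2)`, and all processors sharing the same `v` form one contiguous subarray of size $N^{1+\epsilon}$, which is what lets each row be handled independently.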

5.3. Algorithm for Computing One $Z_{s,t}$-column of 3D ANN

For a $Z_{s,t}$-column of a 3D $N \times N \times N$ binary image, if each 0-voxel $b_l = p_{s,t,l}$ of the $Z_{s,t}$-column, $0 \le l < N$, holds $E^{I_l}_{b_l}$, and each 1-voxel $b_l = p_{s,t,l}$ of the $Z_{s,t}$-column, $0 \le l < N$, holds $N^{I_l}_{b_l}$ and the left and right 1D ANNs on the $Z_{s,t}$-column, then the $N_{b_l}$ of each 1-voxel $b_l$, $0 \le l < N$, of this $Z_{s,t}$-column can be computed in O(1) time on an LARPBS of $N^{1+\epsilon}$ processors. We denote this algorithm as Algorithm Zcol 3D ANN. Its proof is similar to that of Algorithm Row 2D ANN, so we omit the details. This concludes Lemma 15.

Lemma 15 Algorithm Zcol 3D ANN can be computed in O(1) time on an LARPBS with $N^{1+\epsilon}$ processors, where $0 < \epsilon = \frac{1}{2^{c+1}-1} \ll 1$ and $c$ is a constant.

5.4. A Scalable O(1) Time Algorithm for Solving the 3D ANN Problem

The parallel algorithm for computing the 3D ANN consists of two phases: the Z-plane phase and the 3D ANN computation phase. During the Z-plane phase, we compute the 2D EDT and the 2D ANN for each Z-plane $I_l$, $0 \le l \le N - 1$. The 2D EDT is computed by Algorithm 2D EDT, and the 2D ANN by Algorithm 2D ANN. There are $N$ independent planes in total, each of size $N \times N$, and all planes are processed in parallel. Then, based on the computed results, we perform the 3D ANN computation phase by invoking Algorithm Zcol 3D ANN on all $N \times N$ Z-columns in parallel. For computing the 3D ANN, we denote each processor with index $\xi$ of an LARPBS of size $N^{3+\epsilon+\delta}$ (i.e., $P_\xi$, $0 \le \xi < N^{3+\epsilon+\delta}$) as $P(u, v, i, j, k)$, where $\xi = u \times N^{2+\epsilon+\delta} + v \times N^{1+\epsilon+\delta} + i \times N^{1+\epsilon} + j \times N + k$, $0 \le u, v, k < N$, $0 \le i < N^{\delta}$, $0 \le j < N^{\epsilon}$. We describe this algorithm briefly as follows.

Algorithm 3D ANN

Input: A 3D $N \times N \times N$ binary image (i.e., $V$), each voxel being represented by $p_{s,t,l} = 0$ or $1$, where $0 \le s, t, l < N$.

Output: Each 1-voxel $p_{s,t,l}$ is assigned the coordinates of its 3D ANN (i.e., $N_{p_{s,t,l}}$).

Step 1: The Z-plane phase. First, we apply Algorithm 2D EDT on each Z-plane $I_l$, where $0 \le l < N$, in parallel. Then, we apply Algorithm 2D ANN on each Z-plane $I_l$, where $0 \le l < N$, in parallel.

Step 2: The 3D ANN computation phase. In this step, we apply Algorithm Zcol 3D ANN on each of the $N^2$ $Z_{s,t}$-columns, where $0 \le s, t < N$, in parallel.
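The dimension reduction behind the two steps can be checked with a small sequential sketch. It is not the LARPBS algorithm and makes no claim about its internals: it only illustrates why per-plane 2D EDT and 2D ANN values suffice. For a 1-voxel at $(s, t, l)$, the nearest other 1-voxel in its own plane is its 2D ANN, and the nearest 1-voxel in any other plane $l'$ is the 2D EDT target of $(s, t)$ in that plane, offset by $(l' - l)^2$. The function names and the `volume[s][t][l]` layout are assumptions of this illustration, and the per-plane values are computed by brute force for brevity.

```python
import math


def plane_nearest_d2(plane, s, t, exclude_self):
    """Squared distance from (s, t) to the nearest 1-pixel of a 2D plane.

    exclude_self=True gives the 2D ANN distance of the pixel at (s, t);
    exclude_self=False gives the 2D EDT value at (s, t).
    """
    best = math.inf
    for x, row in enumerate(plane):
        for y, bit in enumerate(row):
            if bit == 1 and not (exclude_self and (x, y) == (s, t)):
                best = min(best, (x - s) ** 2 + (y - t) ** 2)
    return best


def ann_3d_by_planes(volume):
    """Return {(s, t, l): squared distance to the nearest other 1-voxel}."""
    n = len(volume)
    # planes[l][s][t] is the Z-plane I_l of the volume.
    planes = [[[volume[s][t][l] for t in range(n)] for s in range(n)]
              for l in range(n)]
    result = {}
    for s in range(n):
        for t in range(n):
            for l in range(n):
                if volume[s][t][l] != 1:
                    continue
                # Own plane: 2D ANN (the voxel itself is excluded).
                best = plane_nearest_d2(planes[l], s, t, exclude_self=True)
                # Other planes: 2D EDT at (s, t) plus squared plane offset.
                for l2 in range(n):
                    if l2 != l:
                        d2 = plane_nearest_d2(planes[l2], s, t, exclude_self=False)
                        best = min(best, d2 + (l2 - l) ** 2)
                result[(s, t, l)] = best
    return result


def ann_3d_brute(volume):
    """Reference: all-pairs squared distance to the nearest other 1-voxel."""
    n = len(volume)
    ones = [(s, t, l) for s in range(n) for t in range(n) for l in range(n)
            if volume[s][t][l] == 1]
    return {
        p: min((sum((a - b) ** 2 for a, b in zip(p, q)) for q in ones if q != p),
               default=math.inf)
        for p in ones
    }
```

On small volumes the two functions agree, which is the property the combination step along each Z-column relies on.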

Time Complexity Analysis. In the Z-plane phase (Step 1), according to Theorem 1, the 2D EDT of each Z-plane can be computed in O(1) time using $N^{2+\varepsilon}$ processors. Also, according to Theorem 2, the 2D ANN of each Z-plane can be computed in O(1) time using $N^{2+\epsilon}$ processors. Therefore, all the 2D EDT and 2D ANN computations of the $N$ Z-planes can be processed in parallel in O(1) time using $N^{3+\varepsilon}$ and $N^{3+\epsilon}$ processors, respectively. In Step 2 (i.e., the 3D ANN computation phase), we apply Algorithm Zcol 3D ANN on all $Z_{s,t}$-columns in parallel. According to Lemma 15, each of the $N \times N$ $Z_{s,t}$-columns can be processed in O(1) time using $N^{1+\epsilon}$ processors for all the 1-voxels $b_l = p_{s,t,l}$ of a fixed $Z_{s,t}$-column to compute their corresponding 3D ANNs. So it takes O(1) time using an LARPBS of $N^{3+\epsilon}$ processors for all the $N \times N$ $Z_{s,t}$-columns to be processed in parallel. The properties of Lemmas 11 and 12 show that Step 2 performs the 3D ANN computation correctly. This concludes Theorem 3.

Theorem 3 The 3D ANN of a binary image of size $N \times N \times N$ can be computed in O(1) time on an LARPBS with $N^{3+\varepsilon}$ processors, where $0 < \varepsilon = \epsilon + \delta = \frac{1}{2^{c+1}-1} + \frac{1}{\kappa} < 1$, and both $\kappa$ and $c$ are constants.

6. Concluding Remarks

In this paper, we devise scalable O(1) time algorithms for solving the 2D EDT, 2D ANN and 3D ANN problems from an image processing perspective on the LARPBS model. The algorithms take full advantage of the LARPBS properties: unidirectional propagation and predictable propagation delay per unit length, which enable synchronized concurrent access to an optical bus in a pipelined fashion, and reconfigurability, which makes the LARPBS a powerful architecture for exploiting a very large degree of parallelism in divide-and-conquer problems. We first develop two efficient parallel algorithms: one for solving the 2D EDT problem of a 2D binary image of size $N \times N$ in O(1) time on an LARPBS with $N^{2+\varepsilon}$ processors, where $0 < \varepsilon = \epsilon + \delta = \frac{1}{2^{c+1}-1} + \frac{1}{\kappa} < 1$ and both $\kappa$ and $c$ are constants, and the other for solving the 2D ANN problem of a 2D binary image of size $N \times N$ in O(1) time on an LARPBS with $N^{2+\epsilon}$ processors, where $0 < \epsilon = \frac{1}{2^{c+1}-1} \ll 1$. Consider a binary image of size $10^{15} \times 10^{15}$. With $\kappa = 3$, the best previously published algorithm [10] needs $N^{2+1/3} = 10^{35}$ processors to solve the 2D ANN problem; with $c = 3$, our algorithm needs only $N^{2+1/15} = 10^{31}$ processors for the same problem. That is, our algorithm saves $10^{35} - 10^{31} = 9999 \times 10^{31}$ processors. We then develop an algorithm for solving the 3D ANN problem from an image processing perspective based on the computed 2D ANN and 2D EDT. We solve the 3D ANN problem of a 3D binary image of size $N \times N \times N$ in O(1) time on an LARPBS with $N^{3+\varepsilon}$ processors. This is the first algorithm devised for solving the 3D ANN problem from an image processing perspective. To the best of our knowledge, all results derived in this paper are the best known O(1) time algorithms on the LARPBS computation model. The respective 2D ANN and 3D ANN comparison results are shown in Tables 1 and 2.
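The processor counts quoted above for a $10^{15} \times 10^{15}$ image can be reproduced with exact rational arithmetic. The snippet below is only a check of that arithmetic; the helper name `processor_exponent` is introduced here for illustration.

```python
from fractions import Fraction


def processor_exponent(log10_n, extra):
    """Exponent e such that N^(2 + extra) = 10^e, for N = 10^log10_n."""
    return log10_n * (2 + extra)


log10_n = 15                          # N = 10^15
delta = Fraction(1, 3)                # kappa = 3 in [10], so delta = 1/kappa
eps = Fraction(1, 2 ** (3 + 1) - 1)   # c = 3, so eps = 1/(2^(c+1) - 1) = 1/15

exp_prev = processor_exponent(log10_n, delta)   # N^(2 + delta)
exp_ours = processor_exponent(log10_n, eps)     # N^(2 + eps)

# Both exponents are whole numbers for these parameters.
assert exp_prev.denominator == 1 and exp_ours.denominator == 1

saved = 10 ** int(exp_prev) - 10 ** int(exp_ours)
print(int(exp_prev), int(exp_ours), saved == 9999 * 10 ** 31)
```

The exponents come out as 35 and 31, so the difference is $10^{35} - 10^{31} = 9999 \times 10^{31}$, matching the figure in the text.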

7. Acknowledgments

This work was partially supported by the National Science Council under contract numbers NSC-93-2213-E-129-011 and NSC-92-2213-E-011-084.

References

[1] M.M. Eshaghian, “Parallel algorithms for image processing on OMC,” IEEE Trans. Computers, vol. 40, pp. 827-833, 1991.

[2] J. Jang, M. Nigam, V.K. Prasanna Kumar and S. Sahni, “Constant time algorithms for computational geometry on the reconfigurable mesh,” IEEE Trans. Parallel and Distributed Systems, vol. 8, no. 1, pp. 1-12, 1997.

[3] T.H. Lai and M.J. Sheng, “Constructing Euclidean minimum spanning trees and all nearest neighbors on reconfigurable meshes,” IEEE Trans. Parallel and Distributed Systems, vol. 7, no. 8, pp. 806-817, 1996.

[4] K. Li, Y. Pan and S.-Q. Zheng, “Fast and processor efficient parallel matrix multiplication algorithms on a linear array with a reconfigurable pipelined bus system,” IEEE Trans. Parallel and Distributed Systems, vol. 9, no. 8, pp. 705-720, 1998.

[5] K. Li, Y. Pan and S.-Q. Zheng, “Efficient deterministic and probabilistic simulations of PRAMs on linear arrays with reconfigurable pipelined bus systems,” J. Supercomputing, vol. 15, pp. 163-181, 2000.

[6] R.G. Melhem, D.M. Chiarulli and S.P. Levitan, “Space multiplexing of waveguides in optically interconnected multiprocessor systems,” Computer Journal, vol. 32, no. 4, pp. 362-369, 1989.

[7] T.W. Kao and S.J. Horng, “Efficient algorithms for computing two nearest neighbor problems on a RAP,” Pattern Recognition, vol. 27, no. 12, pp. 1707-1716, Dec. 1994.

[8] Y. Pan and M. Hamdi, “Quicksort on a linear array with a reconfigurable pipelined bus,” Proc. Int’l Symp. Parallel Architectures, Algorithms, and Networks, pp. 313-319, 1996.

[9] Y. Pan and K. Li, “Linear array with a reconfigurable pipelined bus system - concepts and applications,” Journal of Information Sciences, pp. 237-258, 1998.

[10] Y. Pan, K. Li and S.-Q. Zheng, “Fast nearest neighbor algorithms on a linear array with a reconfigurable pipelined bus system,” Parallel Algorithms and Applications, vol. 13, pp. 1-25, 1998.

[11] F.P. Preparata and M.I. Shamos, Computational Geometry: An Introduction. Third Corrected Printing. Berlin: Springer-Verlag, 1990.

[12] H.R. Tsai, S.J. Horng, T.W. Kao, S.S. Lee and S.S. Tsai, “Fundamental data movement operations and its applications on a hyper-bus broadcast network,” Parallel Computing, vol. 25, no. 2, pp. 137-157, 1999.

[13] Y.R. Wang and S.J. Horng, “An O(1) time parallel algorithm for the 3-D Euclidean distance transform on the CRCW PRAM model,” IEEE Trans. Parallel and Distributed Systems, vol. 14, no. 10, pp. 973-982, 2003.

[14] Y.R. Wang and S.J. Horng, “Parallel algorithms for arbitrary dimensional Euclidean distance transforms with applications on arrays with reconfigurable optical buses,” IEEE Trans. Systems, Man and Cybernetics - Part B: Cybernetics, vol. 34, no. 1, pp. 517-532, 2004.

| Algorithm | Architecture | Processors | Time Complexity |
|---|---|---|---|
| Eshaghian [1] | OMC | $n^2 \times n^2$ | O(1) |
| Tsai et al. [12] | HBBN | $n \times n$ | O(1) with $n^{1/c}$-bit bus width |
| Jang et al. [2] | RMESH | $n \times n$ | O(1) |
| Lai and Sheng [3] | RMESH | $n \times n$ | O(1) |
| Wang et al. [15] | LARPBS | $n^{1.5}$ | O(1) |
| Kao and Horng [7]* | RAP | $N \times N$ | O(1) with $N^{2/c}$-bit bus width |
| Pan et al. [10]* | LARPBS | $N^{2+\delta}$, $0 < \delta = \frac{1}{\kappa} < 1$ | O(1) |
| This paper* | LARPBS | $N^{2+\epsilon}$, $0 < \epsilon = \frac{1}{2^{c+1}-1} \ll 1$ | O(1) |

* The algorithms introduced in [7], [10] and this paper are based on the image processing perspective. All others are based on the computational geometry perspective, where $0 \le n \le N^2$.

Table 1. Summary of comparison results for parallel 2D ANN algorithms.

| Algorithm | Architecture | Processors | Time Complexity |
|---|---|---|---|
| Lai and Sheng [3] | RMESH | $n \times n$ | O(1) |
| Wang et al. [15] | LARPBS | $n^{1.5}$ | O(1) |
| This paper* | LARPBS | $N^{3+\varepsilon}$, $0 < \varepsilon = \frac{1}{2^{c+1}-1} + \frac{1}{\kappa} < 1$ | O(1) |

* This algorithm is based on the image processing perspective. The algorithms introduced in [3] and [15] are based on the computational geometry perspective, where $0 \le n \le N^3$