On the Crosstalk-free Rearrangeability of Combined Optical Multistage Interconnection Networks

Academic year: 2021


National Chiao Tung University

Department of Applied Mathematics

Master's Thesis

On the Crosstalk-free Rearrangeability of Combined

Optical Multistage Interconnection Networks

Student: Chih-Wen Huang

Advisor: Professor Chiuyuan Chen

January 2009 (Republic of China year 98)

On the Crosstalk-free Rearrangeability of Combined

Optical Multistage Interconnection Networks

Student: Chih-Wen Huang

Advisor: Chiuyuan Chen

A Thesis

Submitted to Department of Applied Mathematics

College of Science

National Chiao Tung University

in Partial Fulfillment of the Requirements

for the Degree of

Master

in

Applied Mathematics

January 2009

Hsinchu, Taiwan, Republic of China

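On the Crosstalk-free Rearrangeability of Combined Optical Multistage Interconnection Networks

Student: Chih-Wen Huang

Advisor: Professor Chiuyuan Chen

Department of Applied Mathematics, National Chiao Tung University

Abstract (Chinese version)

The rearrangeability of a multistage interconnection network (MIN) refers to whether the network can connect its N inputs to its N outputs in all N! possible input-output permutations, with existing connections rearranged when necessary. In [8], Das proposed an elegant sufficient condition for the rearrangeability of a combined (2n − 1)-stage MIN, where n = log2 N, and presented an O(Nn)-time permutation routing algorithm for MINs that satisfy this condition. However, the above definition of rearrangeability and the results of Das apply only to electronic MINs. Today, because of their high performance, optical MINs have become the network of choice for many. As mentioned in [26], the major difference between electronic MINs and optical MINs is the following: in an electronic MIN, two routing requests can be sent simultaneously when their routing paths are link-disjoint, whereas in an optical MIN, two routing requests can be sent simultaneously only when their routing paths are node-disjoint (meaning that the two paths do not pass through the same switch, so no crosstalk arises). The purpose of this thesis is to redo the work of Das for optical MINs. We give a sufficient condition for the crosstalk-free rearrangeability of a combined (2n − 2)-stage optical MIN and one for a combined (2n − 1)-stage optical MIN, and we propose an O(Nn)-time permutation routing algorithm for optical MINs that satisfy the sufficient condition. In addition, we propose a permutation routing algorithm for the baseline network that uses node-disjoint paths in at most four passes.

Keywords: multistage interconnection network, optical multistage interconnection network, rearrangeability, permutation routing, crosstalk, Benes network, baseline network, reverse baseline network.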

On the Crosstalk-free Rearrangeability of Combined

Optical Multistage Interconnection Networks

Student: Chih-Wen Huang

Advisor: Chiuyuan Chen

Department of Applied Mathematics National Chiao Tung University

Abstract

A multistage interconnection network (MIN) is rearrangeable if it can connect its N inputs to its N outputs in all N! possible ways, by rearranging the existing connections if required. In [8], Das formulated an elegant sufficient condition for the rearrangeability of a combined (2n − 1)-stage MIN, where n = log2 N, and presented an O(N log2 N)-time routing algorithm for MINs that satisfy the sufficient condition. However, the above definition of rearrangeability and the results of Das are for electronic MINs. Recently, optical MINs have become a promising network choice because of their high performance. As was mentioned in [28], the fundamental difference between an electronic MIN and an optical MIN is the following: two routing requests in an electronic MIN can be sent simultaneously if they are link-disjoint, while two routing requests in an optical MIN can be sent simultaneously only when their routing paths are node-disjoint, meaning that the two paths do not pass through the same switching element and therefore there is no crosstalk problem. The purpose of this thesis is to redo the work of Das for optical MINs. In particular, we formulate a sufficient condition for the crosstalk-free rearrangeability of a combined (2n − 2)-stage optical MIN and one for a combined (2n − 1)-stage optical MIN, and we propose an O(N log2 N)-time routing algorithm for optical MINs that satisfy the sufficient condition. We also propose an algorithm that realizes any permutation in a baseline network with node-disjoint paths in four passes.

Keywords: Multistage interconnection network; Optical multistage interconnection network; Rearrangeability; Permutation routing; Crosstalk; Benes network; Baseline network; Reverse baseline network.

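Acknowledgement

For the completion of this thesis I must first and foremost thank Professor Chiuyuan Chen. Ever since I was admitted to the graduate program at National Chiao Tung University, Professor Chen has regularly made time to discuss research directions and academic matters with me, has taught us the right attitude toward learning, and has shared many experiences. Beyond academics, Professor Chen has often asked whether we were facing any problems in our daily lives. To me, Professor Chen is therefore not only a teacher but also a very good friend, and I am truly fortunate and grateful to have had this guidance. In the future I will keep firmly in mind, and work hard to practice, the two phrases that Professor Chen specifically entrusted to me, which are the university motto: 「知新致遠、崇實篤行」 (roughly, "learn the new and reach far; value truth and act earnestly").

I also especially thank Professors 傅恆霖, 黃大原, 翁志文, and 符麥克 of the combinatorics group of the Department of Applied Mathematics for their teaching and encouragement. During my research, I thank my seniors 國元, 柏樹, 鈺傑, and 威雄 for looking after me and for frequently offering useful advice, and I thank 信菖, 子鴻, 松育, 世慶, 宜君, 慧棻, and many other fellow students and junior schoolmates for their suggestions and help with research. Beyond that, the care, the laughter, and the occasional sharing of good food outside of our studies are among my most precious memories. I am glad to have met all of these seniors, classmates, and juniors, and I wish you all smooth sailing in the future.

Besides the help with research, I am very grateful to my family for raising me. My parents' wholehearted support and encouragement have always been the force that drives me forward. That my studies have come all the way to the completion of this thesis is due to you above all, and I hope my future will live up to your expectations.

Finally, there are too many people to thank. To the friends and classmates not mentioned above, I extend my thanks here as well. I dedicate this thesis to my dearest family, teachers, and classmates!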

Contents

Abstract (in Chinese)

Abstract (in English)

Acknowledgement

Contents

List of Figures

1 Introduction

2 Preliminaries

3 Decompose a permutation into semi-permutations

4 CF-rearrangeability of optical MINs

5 A permutation routing algorithm in the baseline (or reverse baseline) network

List of Figures

1 Communications among processors using an MIN.

2 A 5-stage, 8 × 8 MIN; this MIN is an 8 × 8 Benes network.

3 (a) A 2 × 2 switch and its sub ports. (b) The two possible states.

4 An 8 × 8 baseline network with labels and i-mappings.

5 A 16 × 16 dilated Benes network.

6 The routing paths obtained by our algorithm.

7 (a) Routing paths in the first pass. (b) Routing paths in the second pass.

1 Introduction

Permutation routing is an important transmission pattern in parallel and distributed computing systems [27]. The purpose of this thesis is to consider the problem of routing all N! possible permutations in an optical multistage interconnection network (MIN).

Given $N$ processors $P_0, P_1, \ldots, P_{N-1}$, an $N \times N$ MIN can be used for communication among these processors as shown in Figures 1 and 2, where $N \times N$ means $N$ inputs and $N$ outputs. In this thesis an MIN denotes both an electronic MIN and an optical MIN, and unless otherwise specified, an MIN means an $N \times N$ MIN. A column in an MIN is called a stage. The number of stages in an MIN is denoted by $s$. The nodes in a stage are called switches (or switching elements or crossbars). Define

$$n = \log_2 N.$$

Each switch is assumed to be of size $2 \times 2$ (thus $N$ is even); see [4, 6, 7, 10, 15] for switches of other sizes. It is well known that a $2 \times 2$ switch has only two possible states: straight and cross, as shown in Figure 3.

Figure 1: Communications among processors using an MIN.

A permutation of an MIN is a one-to-one mapping between its inputs and its outputs. A permutation is admissible to an MIN if it can be realized on that MIN with link-disjoint paths in one pass. An MIN is rearrangeable if all N! possible permutations are admissible.

Figure 2: A 5-stage, 8 × 8 MIN; this MIN is an 8 × 8 Benes network.

Figure 3: (a) A 2 × 2 switch and its sub ports. (b) The two possible states (straight and cross).

In other words, an MIN is rearrangeable if it can connect its N inputs to its N outputs in all N! possible ways, by rearranging the existing connections if required.

The Benes network has been proposed as a popular architecture for rearrangeable MINs and it uses the theoretically minimum number of stages [2]. More precisely, a Benes network is a $(2n-1)$-stage MIN and it is essentially the concatenation of the baseline network and the reverse baseline network with the last stage of the baseline network overlapped with the first stage of the reverse baseline network. The shuffle-exchange network is also a widely studied architecture for rearrangeable MINs. In [23], Waksman proved that if a shuffle-exchange network is rearrangeable, then it has at least $2n-1$ stages. Later on, Stone [20] showed that an $n^2$-stage shuffle-exchange network is rearrangeable. In [24], Wu and Feng proposed an algorithm for realizing an arbitrary permutation on a $(3n-1)$-stage shuffle-exchange network. In [1] and [9], Babu et al. proved that $3n-3$ is an upper bound on the number of stages for a shuffle-exchange network to be rearrangeable; this upper bound was later improved to $3n-4$ by Linial and Tarsi [13]. In [5], Cam proved that a $(2n-1)$-stage shuffle-exchange network is rearrangeable, but the proof is still doubtful for $n > 4$; see [21].

Note that the above definition of rearrangeability and the results of [1, 2, 5, 9, 13, 20, 23, 24] are for electronic MINs. Recently, optical MINs have become a promising network choice because of their high performance and speed; see [11, 16, 19, 22, 25, 26, 27, 28]. As was mentioned in [28], electronic MINs and optical MINs have many similarities, but there are some fundamental differences between them. The major difference is that optical MINs have the crosstalk problem (see [28]), and therefore two messages cannot pass through the same switch at the same time. As a result, two routing requests can be sent simultaneously in an electronic MIN if their routing paths are link-disjoint, but they can be sent simultaneously in an optical MIN only when their routing paths are node-disjoint.

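Due to the crosstalk problem, the results for electronic MINs may not be applicable to optical MINs. Yang et al. [28] observed that the maximum number of input-output pairs that can be routed simultaneously in an optical MIN is $N/2$. Thus they introduced the definition of semi-permutation, which is a partial permutation with $N/2$ input-output pairs. A partial permutation

$$\begin{pmatrix} a_0 & a_1 & \cdots & a_{N/2-1} \\ b_0 & b_1 & \cdots & b_{N/2-1} \end{pmatrix}$$

of an $N$-element set $\{0, 1, 2, \ldots, N-1\}$, where $a_i, b_i \in \{0, 1, 2, \ldots, N-1\}$, is a semi-permutation of the $N$-element set if

$$\left\{ \left\lfloor \tfrac{a_0}{2} \right\rfloor, \left\lfloor \tfrac{a_1}{2} \right\rfloor, \ldots, \left\lfloor \tfrac{a_{N/2-1}}{2} \right\rfloor \right\} = \left\{ \left\lfloor \tfrac{b_0}{2} \right\rfloor, \left\lfloor \tfrac{b_1}{2} \right\rfloor, \ldots, \left\lfloor \tfrac{b_{N/2-1}}{2} \right\rfloor \right\} = \left\{ 0, 1, \ldots, \tfrac{N}{2} - 1 \right\}.$$

For example, when $N = 8$, $\begin{pmatrix} 1 & 2 & 5 & 6 \\ 7 & 2 & 0 & 4 \end{pmatrix}$ is a semi-permutation since

$$\left\{ \left\lfloor \tfrac{1}{2} \right\rfloor, \left\lfloor \tfrac{2}{2} \right\rfloor, \left\lfloor \tfrac{5}{2} \right\rfloor, \left\lfloor \tfrac{6}{2} \right\rfloor \right\} = \left\{ \left\lfloor \tfrac{7}{2} \right\rfloor, \left\lfloor \tfrac{2}{2} \right\rfloor, \left\lfloor \tfrac{0}{2} \right\rfloor, \left\lfloor \tfrac{4}{2} \right\rfloor \right\} = \{0, 1, 2, 3\}.$$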
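The definition of a semi-permutation can be checked mechanically. The following is a minimal sketch, assuming the partial permutation is given as two equal-length lists of inputs and outputs; the function name `is_semi_permutation` and its parameters are illustrative and do not come from the thesis.

```python
def is_semi_permutation(inputs, outputs, n_ports):
    """Return True if the pairs (inputs[i], outputs[i]) form a semi-permutation.

    n_ports is N, the number of inputs (and outputs) of the MIN.
    """
    half = n_ports // 2
    if len(inputs) != half or len(outputs) != half:
        return False
    targets = set(range(half))
    # Each switch of the first stage (index a // 2) and each switch of the
    # last stage (index b // 2) must be used by exactly one pair.
    return ({a // 2 for a in inputs} == targets
            and {b // 2 for b in outputs} == targets)


# The N = 8 example from the text.
print(is_semi_permutation([1, 2, 5, 6], [7, 2, 0, 4], 8))
```

Because both index sets must equal {0, ..., N/2 − 1} and there are exactly N/2 pairs, the set comparison also rules out two pairs sharing a first-stage or last-stage switch.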

A semi-permutation ensures that there is no crosstalk at the first and the last stages. In [3], Bao and Li defined a routing to be crosstalk-free (conflict-free) if any two paths used in the routing are node-disjoint (link-disjoint). They showed that crosstalk-free routing on any bit permutation network (BPN) is equivalent to conflict-free routing on a BPN of smaller size and with fewer stages. They defined "CF-rearrangeable" and proved that the minimum number of stages for a BPN to be CF-rearrangeable is $2n-2$. In particular, an MIN is crosstalk-free rearrangeable (CF-rearrangeable) if every semi-permutation can be realized with node-disjoint paths in one pass. Since the maximum number of input-output pairs that can be routed simultaneously in an optical MIN is $N/2$, at least two passes are required for realizing a permutation. Yang et al. [28] proved an important result: any permutation can be decomposed into two semi-permutations. Thus a CF-rearrangeable MIN can realize any permutation in two passes, and this is optimal. Yang et al. [28] proved that a Benes network is CF-rearrangeable and proposed a permutation routing algorithm for a Benes network. Lu and Zheng [14] also proposed a permutation routing algorithm for the same network.

In [26], Yang and Wang proposed a permutation routing algorithm for the baseline (or reverse baseline) network with node-disjoint paths in four passes; they noted that the proposed algorithm works efficiently only for long messages. Later, in [27], Yang and Wang presented a permutation routing algorithm for the baseline (or reverse baseline) network with node-disjoint paths in four passes, and they claimed that this algorithm is suitable for messages of any length.

Recently, Das [8] formulated a sufficient condition for the rearrangeability of a combined $(2n-1)$-stage electronic MIN and presented an $O(N \log_2 N)$-time permutation routing algorithm for MINs that satisfy the sufficient condition. However, the results of Das are for electronic MINs. Therefore, due to the crosstalk problem, these results cannot be applied to optical MINs. The purpose of this thesis is to transform the results of Das into results applicable to optical MINs. In particular, we formulate a sufficient condition for the CF-rearrangeability of a combined $(2n-2)$-stage optical MIN and a sufficient condition for the CF-rearrangeability of a combined $(2n-1)$-stage optical MIN. We propose an $O(N \log_2 N)$-time semi-permutation routing algorithm for optical MINs that satisfy the sufficient condition. We also improve the decomposition algorithm proposed in [28] and the permutation routing algorithm proposed in [27].

This thesis is organized as follows: Section 2 lists some preliminaries. Section 3 improves the decomposition algorithm proposed in [28]. Section 4 contains our results on CF-rearrangeability of optical MINs. Section 5 improves the permutation routing algorithm in [27]. Concluding remarks are given in the last section.

2 Preliminaries

An $s$-stage $N \times N$ MIN is represented as follows (see Figure 4 for an illustration):

• label the inputs of the MIN as $0, 1, 2, \ldots, N-1$ and represent each input by an $n$-bit binary number $x_{n-1} x_{n-2} \cdots x_0$;

• label the outputs of the MIN as $0, 1, 2, \ldots, N-1$ and represent each output by an $n$-bit binary number $y_{n-1} y_{n-2} \cdots y_0$;

• label the stages as $0, 1, 2, \ldots, s-1$;

• label the switches of each stage as $0, 1, 2, \ldots, N/2 - 1$ and represent each switch by an $(n-1)$-bit binary number $z_{n-2} z_{n-3} \cdots z_0$; and

• label the upper and the lower output links of every switch $j$, $0 \le j \le N/2 - 1$, as $2j$ and $2j+1$ respectively, and represent each link by an $n$-bit binary number $x_{n-1} x_{n-2} \cdots x_0$.

In an $s$-stage MIN, a path from an input to an output can be described by a sequence $r_0 r_1 \cdots r_{s-1}$ of labels that label the successive links on this path. Such a sequence of labels is called the routing bits [8] (or control tag [17], tag [6], or path descriptor [12]). Routing bits can be used as the header for routing a message: each successive switch uses the first routing bit to route the message, and then discards it. In particular, routing bit $r_k = 0$ (respectively, $r_k = 1$) means that the connection at stage $k$ is made to sub port 0 (respectively, sub port 1). For example, in Figure 4, input 0 can get to output 5 by using routing bits 101, which means the routing request can be fulfilled by a path via sub port 1 at stage 0, sub port 0 at stage 1, and sub port 1 at stage 2.

Figure 4: An 8 × 8 baseline network with labels and i-mappings.

In this thesis, the links connecting inputs and switches of stage 0 are regarded as output links of switches of stage $(-1)$ although there are no switches of stage $(-1)$. The following definition, i-mapping, is crucial for this thesis and was first proposed in [8]. Given an MIN, if for each stage $i$, $0 \le i \le s-1$, there exists a mapping

$$F_i : x_{n-1} x_{n-2} \cdots x_1 x_0 \longrightarrow y_{n-1} y_{n-2} \cdots y_1 r_i$$

between the output links $x_{n-1} x_{n-2} \cdots x_1 x_0$ of switches of stage $(i-1)$ and the output links $y_{n-1} y_{n-2} \cdots y_1 r_i$ of switches of stage $i$, where $y_{n-1} y_{n-2} \cdots y_1$ is a permutation of any $n-1$ bits of $x_{n-1} x_{n-2} \cdots x_1 x_0$ and $r_i$ is the routing bit, such that the link $x_{n-1} x_{n-2} \cdots x_1 x_0$ at stage $(i-1)$ is connected to the link $y_{n-1} y_{n-2} \cdots y_1 r_i$ at stage $i$, then the $F_i$'s are defined as i-mappings for the MIN. For example, the i-mappings for the 8 × 8 baseline network in Figure 4 are:

$$\begin{aligned} F_0 &: x_2 x_1 x_0 \longrightarrow x_2 x_1 r_0, \\ F_1 &: x_2 x_1 x_0 \longrightarrow x_0 x_2 r_1, \\ F_2 &: x_2 x_1 x_0 \longrightarrow x_2 x_0 r_2. \end{aligned}$$

An $s$-stage MIN is said to follow destination tag routing if the routing bits $r_0 r_1 \cdots r_{s-1}$ of every message equal the $n$-bit binary representation $y_{n-1} y_{n-2} \cdots y_0$ of the destination. In other words, if an MIN follows destination tag routing, then $s = n$ and the routing bits of a message sent to output $y_{n-1} y_{n-2} \cdots y_0$ are $y_{n-1} y_{n-2} \cdots y_0$; also, the destination of a message with routing bits $r_0 r_1 \cdots r_{s-1}$ is $r_0 r_1 \cdots r_{s-1}$. Consider the 8 × 8 baseline network shown in Figure 4. From its i-mappings, a message from input $x_2 x_1 x_0$ with routing bits $r_0 r_1 r_2$ will reach output $r_0 r_1 r_2$ since

$$x_2 x_1 x_0 \xrightarrow{F_0} x_2 x_1 r_0 \xrightarrow{F_1} r_0 x_2 r_1 \xrightarrow{F_2} r_0 r_1 r_2.$$

Thus the 8 × 8 baseline network follows destination tag routing. In general, an $N \times N$ baseline network satisfies $s = n$ and has i-mappings:

$$\begin{aligned} F_0 &: x_{n-1} x_{n-2} \cdots x_1 x_0 \longrightarrow x_{n-1} x_{n-2} \cdots x_1 r_0, \\ F_1 &: x_{n-1} x_{n-2} \cdots x_1 x_0 \longrightarrow x_0 x_{n-1} x_{n-2} \cdots x_2 r_1, \\ F_2 &: x_{n-1} x_{n-2} \cdots x_1 x_0 \longrightarrow x_{n-1} x_0 x_{n-2} \cdots x_2 r_2, \\ &\;\;\vdots \\ F_{n-1} &: x_{n-1} x_{n-2} \cdots x_1 x_0 \longrightarrow x_{n-1} x_{n-2} \cdots x_2 x_0 r_{n-1}. \end{aligned}$$

A baseline network follows destination tag routing since

$$x_{n-1} x_{n-2} \cdots x_1 x_0 \xrightarrow{F_0} x_{n-1} x_{n-2} \cdots x_1 r_0 \xrightarrow{F_1} r_0 x_{n-1} x_{n-2} \cdots x_2 r_1 \xrightarrow{F_2} \cdots \xrightarrow{F_{n-1}} r_0 r_1 \cdots r_{n-1}.$$
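The chain of i-mappings above can be replayed in code. The sketch below is only an illustration under stated assumptions: link labels are lists of bits with the most significant bit first, and the helper names `baseline_step` and `route` are hypothetical, not taken from the thesis or from [8].

```python
from itertools import product


def baseline_step(bits, k, r):
    """Apply the baseline i-mapping F_k to a link label.

    bits is [x_{n-1}, ..., x_1, x_0]; r is the routing bit used at stage k.
    """
    n = len(bits)
    if k == 0:
        # F_0 keeps x_{n-1} ... x_1 and replaces x_0 by r_0.
        return bits[:-1] + [r]
    # F_k (k >= 1): x_{n-1} ... x_{n-k+1}, then x_0, then x_{n-k} ... x_2, then r_k.
    return bits[:k - 1] + [bits[-1]] + bits[k - 1:n - 2] + [r]


def route(bits, routing_bits):
    """Follow the routing bits stage by stage and return the output label."""
    for k, r in enumerate(routing_bits):
        bits = baseline_step(bits, k, r)
    return bits


# Input 0 with routing bits 101 reaches output 5 in the 8 x 8 baseline network.
print(route([0, 0, 0], [1, 0, 1]))

# Exhaustive check of destination tag routing for N = 8 and N = 16.
for n in (3, 4):
    for x in product([0, 1], repeat=n):
        for r in product([0, 1], repeat=n):
            assert route(list(x), list(r)) == list(r)
```

The exhaustive loop is feasible only for small n; it is meant as a sanity check of the i-mappings, not as a proof.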

We now define a combined MIN $M_1(n_1) \oplus M_2(n_2)$. Let $M_1(n_1)$ and $M_2(n_2)$ denote an $n_1$-stage MIN and an $n_2$-stage MIN, respectively. The combined MIN $M_1(n_1) \oplus M_2(n_2)$ is the concatenation of $M_1(n_1)$ and $M_2(n_2)$ with the last stage of $M_1(n_1)$ overlapped with the first stage of $M_2(n_2)$. Clearly, $M_1(n_1) \oplus M_2(n_2)$ is an $(n_1 + n_2 - 1)$-stage MIN.

Recall that $n = \log_2 N$. An $n$-stage shuffle-exchange network is the well-known omega network and is usually denoted as $\Omega$. In [8], Das formulated the following sufficient condition for the rearrangeability of a combined $(2n-1)$-stage MIN $\Delta \oplus \Delta'$, where $\Delta$ and $\Delta'$ are two $n$-stage $\Omega$-equivalent networks (notice that an $\Omega$-equivalent network follows destination tag routing).

Theorem 1. [8] In a combined $(2n-1)$-stage MIN $\Delta \oplus \Delta'$, if i-mappings exist for all $i$, $0 \le i \le 2n-2$, and each AR-bit $r_j$, $0 \le j \le n-2$, occurs only in each $S_k$, $j+1 \le k \le 2n-2-j$, then $\Delta \oplus \Delta'$ is rearrangeable.

For the definitions of AR-bit and Sk, see Section 4. The Benes network is an example

3 Decompose a permutation into semi-permutations

The purpose of this section is to improve the decomposition algorithm proposed in [28]. Throughout this section,

$$P = \begin{pmatrix} a_0 & a_1 & \cdots & a_{N-1} \\ b_0 & b_1 & \cdots & b_{N-1} \end{pmatrix}$$

denotes a given permutation. In [28], Yang et al. proposed an efficient algorithm to decompose a given permutation $P$ into two semi-permutations $L$ and $R$. This algorithm first constructs an undirected bipartite graph $G = (V_1, V_2; E)$ for $P$. The vertex sets of $G$ are given by $V_1 = \{A^{[1]}_0, A^{[1]}_1, \ldots, A^{[1]}_{N/2-1}\}$ and $V_2 = \{A^{[2]}_0, A^{[2]}_1, \ldots, A^{[2]}_{N/2-1}\}$, where $A^{[1]}_j$ and $A^{[2]}_j$ correspond to inputs and outputs, respectively, and both $A^{[1]}_j$ and $A^{[2]}_j$ are the 2-element set $\{2j, 2j+1\}$ for all $j$, $0 \le j \le N/2 - 1$. The edge set $E$ is defined by: $(A^{[1]}_{j_1}, A^{[2]}_{j_2}) \in E$ if and only if there exists a pair $\begin{pmatrix} a_i \\ b_i \end{pmatrix}$ in $P$ such that $a_i \in A^{[1]}_{j_1}$ and $b_i \in A^{[2]}_{j_2}$. Clearly, $G$ is 2-regular, $|V_1| = |V_2| = N/2$, and $|E| = N$.

The algorithm in [28] takes $O(N)$ time and is listed in Algorithm 1.

Algorithm 1 DECOMPOSITION ALGORITHM in [28]

Require: A permutation $P$.

Ensure: Two semi-permutations $L$ and $R$ of $P$.

1: Construct a bipartite graph $G = (V_1, V_2; E)$ for $P$.

2: For each connected component of $G$, start from a vertex of this component in $V_1$, traverse through an unvisited edge to the neighbor vertex in $V_2$, back and forth until returning to the starting vertex. (During the traverse, a visited edge is marked "forward" if the traverse direction on this edge is from $V_1$ to $V_2$ and marked "backward" if the direction is opposite.)

3: Take all one-pair mappings corresponding to edges marked with "forward" to form semi-permutation $L$; take all one-pair mappings corresponding to edges marked with "backward" to form semi-permutation $R$.
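The three steps of Algorithm 1 can be sketched as follows. This is an illustrative reading of the listing, not the code of [28]: the permutation is assumed to be an array `perm` with `perm[a] = b`, each edge is identified by its input `a`, and the function name `decompose_with_graph` is hypothetical.

```python
def decompose_with_graph(perm):
    """Split a permutation into two semi-permutations via the bipartite graph G."""
    n = len(perm)
    # Step 1: adjacency lists of G; vertex j on either side stands for {2j, 2j+1}.
    left = [[] for _ in range(n // 2)]
    right = [[] for _ in range(n // 2)]
    for a, b in enumerate(perm):
        left[a // 2].append(a)
        right[b // 2].append(a)

    visited = [False] * n
    forward, backward = [], []
    # Step 2: walk every cycle, alternating forward and backward edges.
    for start in range(n // 2):
        u = start
        while True:
            e = next((x for x in left[u] if not visited[x]), None)
            if e is None:          # both edges at u are used: the cycle is closed
                break
            visited[e] = True
            forward.append((e, perm[e]))
            v = perm[e] // 2
            back = next(x for x in right[v] if not visited[x])
            visited[back] = True
            backward.append((back, perm[back]))
            u = back // 2
    # Step 3: forward edges form L, backward edges form R.
    return sorted(forward), sorted(backward)


L, R = decompose_with_graph([5, 7, 2, 6, 3, 0, 4, 1])
print(L)
print(R)
```

The adjacency lists are exactly the explicit graph construction that the next algorithm in this section avoids.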

In Algorithm 1, a bipartite graph has to be constructed explicitly. We now propose a decomposition algorithm which abandons the requirement of constructing a bipartite graph and still takes $O(N)$ time. Without loss of generality, in our algorithm, assume the given permutation $P$ is of the form

$$P = \begin{pmatrix} 0 & 1 & \cdots & N-1 \\ b_0 & b_1 & \cdots & b_{N-1} \end{pmatrix}$$

and is represented as an array also called $P$ with $P[i] = b_i$. An array $Q$ is used to store the inverse permutation of $P$; that is, $Q[b_i] = i$ if and only if $P[i] = b_i$. Semi-permutation $L$ is represented by two arrays $La$ and $Lb$ such that $La[\lfloor i/2 \rfloor] = i$ and $Lb[\lfloor i/2 \rfloor] = b_i$ if and only if $\begin{pmatrix} i \\ b_i \end{pmatrix} \in L$. Similarly, semi-permutation $R$ is represented by two arrays $Ra$ and $Rb$ such that $Ra[\lfloor i/2 \rfloor] = i$ and $Rb[\lfloor i/2 \rfloor] = b_i$ if and only if $\begin{pmatrix} i \\ b_i \end{pmatrix} \in R$. Define $mate(v) = v + 1$ if $v$ is an even number and $mate(v) = v - 1$ if $v$ is an odd number. Thus 0 and 1 are the mates of each other, 2 and 3 are the mates of each other, and so on. The following is our decomposition algorithm.

Algorithm 2 OUR DECOMPOSITION ALGORITHM

Require: A permutation P.

Ensure: Two semi-permutations L and R of P.

 1: for i ← 0 to N − 1 do
 2:     A[i] ← 0;
 3: end for
 4: for i ← 0 to N − 1 do
 5:     Q[P[i]] ← i;
 6: end for
 7: while there exists i such that A[i] = 0 do
 8:     head ← i;
 9:     next ← i;
10:     repeat
11:         La[⌊next/2⌋] ← next;
12:         Lb[⌊next/2⌋] ← P[next];
13:         A[next] ← 1;
14:         next ← Q[mate(P[next])];
15:         Ra[⌊next/2⌋] ← next;
16:         Rb[⌊next/2⌋] ← P[next];
17:         A[next] ← 1;
18:         next ← mate(next);
19:     until (next = head);
20: end while

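We now give an example for Algorithm 2. Suppose

$$P = \begin{pmatrix} 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 5 & 7 & 2 & 6 & 3 & 0 & 4 & 1 \end{pmatrix}.$$

In the 1st iteration, $L = \begin{pmatrix} 0 & ? & ? & ? \\ 5 & ? & ? & ? \end{pmatrix}$ and $R = \begin{pmatrix} ? & ? & ? & 6 \\ ? & ? & ? & 4 \end{pmatrix}$.

In the 2nd iteration, $L = \begin{pmatrix} 0 & ? & ? & 7 \\ 5 & ? & ? & 1 \end{pmatrix}$ and $R = \begin{pmatrix} ? & ? & 5 & 6 \\ ? & ? & 0 & 4 \end{pmatrix}$.

In the 3rd iteration, $L = \begin{pmatrix} 0 & ? & 4 & 7 \\ 5 & ? & 3 & 1 \end{pmatrix}$ and $R = \begin{pmatrix} ? & 2 & 5 & 6 \\ ? & 2 & 0 & 4 \end{pmatrix}$.

In the final iteration, we obtain the two semi-permutations

$$L = \begin{pmatrix} 0 & 3 & 4 & 7 \\ 5 & 6 & 3 & 1 \end{pmatrix} \quad \text{and} \quad R = \begin{pmatrix} 1 & 2 & 5 & 6 \\ 7 & 2 & 0 & 4 \end{pmatrix}.$$

We now analyze Algorithm 2.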

Theorem 2. Algorithm 2 is correct and takes $O(N)$ time.

Proof. Let $G$ be the bipartite graph in the algorithm in [28]. Our algorithm is based on the observation that, for all $v$, $0 \le v < N$, $\{v, mate(v)\}$ is the vertex $A^{[1]}_i$ and also the vertex $A^{[2]}_i$ in $G$, and $A^{[1]}_i$ is adjacent to $A^{[2]}_j$, where $i = \lfloor v/2 \rfloor$ and $j = \lfloor P[v]/2 \rfloor$. Since $G$ is a 2-regular bipartite graph, each connected component of $G$ is an even cycle. The repeat-loop in lines 10 to 19 corresponds to traversing a cycle in $G$ and inserting edges of the cycle into $L$ and $R$ alternatingly. The while-loop ensures that all the connected components of $G$ are traversed. Hence the resultant $L$ and $R$ are two semi-permutations. Lines 1 to 6 of this algorithm take $O(N)$ time. Lines 7 to 20 take $O(N)$ time since each input-output pair in $P$ is considered exactly once. Thus Algorithm 2 takes $O(N)$ time.
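For readers who want to run Algorithm 2, here is a minimal Python sketch. It follows the listing line by line, with two assumptions that are not in the thesis: the while-loop picks the smallest unvisited index, and the result is returned as two pairs of lists. The names `mate` and `decompose` are chosen for illustration.

```python
def mate(v):
    """0 <-> 1, 2 <-> 3, and so on."""
    return v + 1 if v % 2 == 0 else v - 1


def decompose(P):
    """Decompose permutation P (P[i] = b_i) into semi-permutations L and R."""
    n = len(P)
    visited = [False] * n          # the array A of the listing
    Q = [0] * n                    # inverse permutation of P
    for i, b in enumerate(P):
        Q[b] = i
    La, Lb = [None] * (n // 2), [None] * (n // 2)
    Ra, Rb = [None] * (n // 2), [None] * (n // 2)
    for head in range(n):          # "while there exists i such that A[i] = 0"
        if visited[head]:
            continue
        nxt = head
        while True:                # the repeat-loop of lines 10 to 19
            La[nxt // 2], Lb[nxt // 2] = nxt, P[nxt]
            visited[nxt] = True
            nxt = Q[mate(P[nxt])]  # the input whose output shares a last-stage switch
            Ra[nxt // 2], Rb[nxt // 2] = nxt, P[nxt]
            visited[nxt] = True
            nxt = mate(nxt)        # the input sharing a first-stage switch
            if nxt == head:
                break
    return (La, Lb), (Ra, Rb)


(La, Lb), (Ra, Rb) = decompose([5, 7, 2, 6, 3, 0, 4, 1])
print(La, Lb)
print(Ra, Rb)
```

No adjacency structure is built: the inverse array `Q` and the `mate` function together play the role of the bipartite graph.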

4 CF-rearrangeability of optical MINs

The purpose of this section is to formulate a sufficient condition for the crosstalk-free rearrangeability of a combined $(2n-2)$-stage and a combined $(2n-1)$-stage optical MIN and to propose a routing algorithm for MINs that satisfy the sufficient condition. Before going further, we give three definitions: AR-bits, $OW_k$, and $OS_k$.

Recall that in this thesis, an MIN is an $N \times N$ MIN and $n = \log_2 N$. Let $M_1(n') \oplus M_2(n)$ be a combined optical MIN in which $M_2(n)$ follows destination tag routing and set $s = n' + n - 1$ for easy writing. A path from an input to an output through the MIN is referred to as an input-output path. Since $M_2(n)$ follows destination tag routing, for a particular input-output path, the routing bits for stages $k$, $s-n \le k \le s-1$, are predetermined by the $n$-bit binary representation of the destination (i.e., the output), but the routing bits $r_k$, $0 \le k \le s-n-1$, can be arbitrary and are referred to as arbitrary routing bits (AR-bits).

Suppose the i-mappings of $M_1(n') \oplus M_2(n)$ are $F_0, F_1, \ldots, F_{s-1}$. Then an input-output path from input $x$ to output $y$ can be represented as $L_0 \to L_1 \to \cdots \to L_s$, where $L_0 = x$, $L_s = y$, and $L_k$, $1 \le k \le s-1$, is the output link of stage $k-1$ followed by the path. Note that $L_k = F_{k-1}(L_{k-1})$ for all $k$, $1 \le k \le s$. The path can also be represented as $E_0 \to E_1 \to \cdots \to E_{s-1}$, where $E_k$, $0 \le k \le s-1$, is the switch passed by the path at stage $k$. It is not difficult to see that the binary representation of $E_k$, $0 \le k \le s-1$, can be obtained by deleting the rightmost bit of the binary representation of $L_{k+1}$.

Given a semi-permutation on $M_1(n') \oplus M_2(n)$, at any stage $k$, $0 \le k \le s-1$, the set of switches passed by individual input-output paths can be represented by an $\frac{N}{2} \times (n-1)$ matrix, called optical window $OW_k$, where each row $j$, $0 \le j \le N/2 - 1$, of $OW_k$ is the $(n-1)$-bit binary representation of the switch at stage $k$ that is passed by the path started from input $2j$ (if $2j$ belongs to the semi-permutation) or $2j+1$ (if $2j+1$ belongs to the semi-permutation). Note that each optical window $OW_k$ can be represented uniquely by a string $OS_k$ obtained by deleting the rightmost bit of $S_{k+1}$, where $S_0 = x_{n-1} x_{n-2} \cdots x_0$ and $S_k = F_{k-1}(S_{k-1})$, $1 \le k \le s$. $OS_k$ is called the characteristic string of $OW_k$.

A dilated Benes network is a $(2n-2)$-stage MIN and it is the concatenation of the baseline network and the reverse baseline network with the last two stages of the baseline network overlapped with the first two stages of the reverse baseline network.

Take the 6-stage 16 × 16 dilated Benes network shown in Figure 5 as an example. Suppose the binary representations of the input and the output are $x_3 x_2 x_1 x_0$ and $y_3 y_2 y_1 y_0$, respectively. Then $r_2 r_3 r_4 r_5 = y_3 y_2 y_1 y_0$; the i-mappings, $S_k$'s, and $OS_k$'s are:

$$\begin{array}{lll} & S_0 = x_3 x_2 x_1 x_0 & \\ F_0 : x_3 x_2 x_1 x_0 \longrightarrow x_3 x_2 x_1 r_0 & S_1 = F_0(S_0) = x_3 x_2 x_1 r_0 & OS_0 = x_3 x_2 x_1 \\ F_1 : x_3 x_2 x_1 x_0 \longrightarrow x_0 x_3 x_2 r_1 & S_2 = F_1(S_1) = r_0 x_3 x_2 r_1 & OS_1 = r_0 x_3 x_2 \\ F_2 : x_3 x_2 x_1 x_0 \longrightarrow x_3 x_0 x_2 r_2 & S_3 = F_2(S_2) = r_0 r_1 x_3 r_2 & OS_2 = r_0 r_1 x_3 \\ F_3 : x_3 x_2 x_1 x_0 \longrightarrow x_3 x_2 x_0 r_3 & S_4 = F_3(S_3) = r_0 r_1 r_2 r_3 & OS_3 = r_0 r_1 r_2 \\ F_4 : x_3 x_2 x_1 x_0 \longrightarrow x_3 x_1 x_0 r_4 & S_5 = F_4(S_4) = r_0 r_2 r_3 r_4 & OS_4 = r_0 r_2 r_3 \\ F_5 : x_3 x_2 x_1 x_0 \longrightarrow x_2 x_1 x_0 r_5 & S_6 = F_5(S_5) = r_2 r_3 r_4 r_5 & OS_5 = r_2 r_3 r_4 \end{array}$$
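The strings $S_k$ and $OS_k$ listed above can be recomputed symbolically from the six i-mappings. The sketch below assumes a representation chosen only for illustration: each i-mapping is a tuple of bit names, and the helper `apply_mapping` is hypothetical. It also reports in which characteristic strings each AR-bit occurs.

```python
# Right-hand sides of F_0 ... F_5 for the 16 x 16 dilated Benes network,
# written as tuples of symbols (copied from the table in the text).
F = [
    ("x3", "x2", "x1", "r0"),
    ("x0", "x3", "x2", "r1"),
    ("x3", "x0", "x2", "r2"),
    ("x3", "x2", "x0", "r3"),
    ("x3", "x1", "x0", "r4"),
    ("x2", "x1", "x0", "r5"),
]


def apply_mapping(mapping, s):
    """Apply an i-mapping to the 4-symbol string s (leftmost symbol sits at position x3)."""
    position = {"x3": s[0], "x2": s[1], "x1": s[2], "x0": s[3]}
    # Position names are replaced by the symbol currently there;
    # a routing bit such as "r2" is not a position and is kept as is.
    return tuple(position.get(sym, sym) for sym in mapping)


S = [("x3", "x2", "x1", "x0")]
for mapping in F:
    S.append(apply_mapping(mapping, S[-1]))

# OS_k is S_{k+1} with its rightmost symbol deleted.
OS = [s[:-1] for s in S[1:]]

for k, os_k in enumerate(OS):
    print(f"OS_{k} =", "".join(os_k))

# Indices of the characteristic strings in which each AR-bit occurs.
occurs = {r: [k for k, os_k in enumerate(OS) if r in os_k] for r in ("r0", "r1")}
print(occurs)
```

For this network the AR-bit $r_0$ occurs in $OS_1$ through $OS_4$ and $r_1$ occurs in $OS_2$ and $OS_3$, which is the kind of occurrence pattern the sufficient condition of this section is phrased in terms of.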

The following observation is crucial for the remaining discussions: A semi-permutation can be realized on $M_1(n') \oplus M_2(n)$ if and only if all rows of each optical window $OW_k$, $0 \le k \le s-1$, are distinct.

Figure 5: A 16 × 16 dilated Benes network.

Theorem 3. (Sufficient condition for a combined $(2n-2)$-stage optical MIN) In a combined $(2n-2)$-stage optical MIN $M_1(n-1) \oplus M_2(n)$ in which $M_2(n)$ follows destination tag routing, if i-mappings exist for all $i$, $0 \le i \le 2n-3$, and each AR-bit $r_k$, $0 \le k \le n-3$, occurs only in each $OS_\ell$, for $k+1 \le \ell \le 2n-4-k$, then the network is CF-rearrangeable.

Proof. To prove this theorem, it suffices to prove that $M_1(n-1) \oplus M_2(n)$ can realize each semi-permutation $P$ with node-disjoint paths in one pass. In this proof, $j$ is an integer in $\{0, 1, \ldots, N/2 - 1\}$. By the definition of a semi-permutation, exactly one of inputs $2j$ and $2j+1$ is in $P$; denote the one in $P$ as $j^*$. Let $r_0(j) r_1(j) \cdots r_{2n-3}(j)$ be the routing bits of $j^*$; in particular, $r_k(j)$ is the routing bit of input $j^*$ at stage $k$.

Before going further, we define conjugate rows. By the constraints of this theorem, AR-bit $r_k$, $0 \le k \le n-3$, appears only in $OS_{k+1}, OS_{k+2}, \ldots, OS_{2n-4-k}$. Thus $r_0$ appears only in $OS_1, OS_2, \ldots, OS_{2n-4}$; $r_1$ appears only in $OS_2, OS_3, \ldots, OS_{2n-5}$; $r_2$ appears only in $OS_3, OS_4, \ldots, OS_{2n-6}$; and so on. Therefore, we have two properties.

(i) For all $k$, $1 \le k \le n-2$, $n-2$ columns of $OW_k$ appear in $OW_{k-1}$ and the remaining one column is composed of $r_{k-1}(0), r_{k-1}(1), \ldots, r_{k-1}(N/2 - 1)$.

(ii) For all $k$, $n-1 \le k \le 2n-4$, $n-2$ columns of $OW_k$ appear in $OW_{k+1}$ and the remaining one column is composed of $r_{2n-4-k}(0), r_{2n-4-k}(1), \ldots, r_{2n-4-k}(N/2 - 1)$.

Based on the above two properties, we define conjugate rows as follows.

(i) For all $k$, $1 \le k \le n-2$, two rows $j$ and $j'$ of $OW_k$ are the conjugate row of each other if these two rows are identical except at $r_{k-1}(j)$ and $r_{k-1}(j')$.

(ii) For all $k$, $n-1 \le k \le 2n-4$, two rows $j$ and $j'$ of $OW_k$ are the conjugate row of each other if these two rows are identical except at $r_{2n-4-k}(j)$ and $r_{2n-4-k}(j')$.

Since $M_2(n)$ follows destination tag routing, $r_{n-2}(j) r_{n-1}(j) \cdots r_{2n-3}(j)$ are predetermined by the binary representation of the destination. Hence, to prove this theorem, it suffices to prove that for each $j$, routing bits $r_0(j) r_1(j) \cdots r_{n-3}(j)$ (i.e., AR-bits) exist such that all rows of each optical window $OW_k$, $0 \le k \le 2n-3$, are distinct.

First consider $OW_0$ and $OW_{2n-3}$. Since $OS_0$ is composed of the leftmost $n-1$ bits of the inputs in $P$, all rows of $OW_0$ are distinct. Since $OS_{2n-3}$ is composed of the leftmost $n-1$ bits of the outputs in $P$, all rows of $OW_{2n-3}$ are distinct. In the following, we will show that it is possible to assign the values of $r_0(0), r_0(1), \ldots, r_0(N/2 - 1)$ so that all rows of $OW_1$ are distinct and all rows of $OW_{2n-4}$ are distinct, too. After $r_0(0), r_0(1), \ldots, r_0(N/2 - 1)$ are assigned, we will show that it is possible to assign the values of $r_1(0), r_1(1), \ldots, r_1(N/2 - 1)$ so that all rows of $OW_2$ are distinct and all rows of $OW_{2n-5}$ are distinct, too. In general, after $r_{k-1}(0), r_{k-1}(1), \ldots, r_{k-1}(N/2 - 1)$ are assigned, we will show that it is possible to assign the values of $r_k(0), r_k(1), \ldots, r_k(N/2 - 1)$ so that all rows of $OW_{k+1}$ are distinct and all rows of $OW_{2n-4-k}$ are distinct, too.

Consider the pair of optical windows $OW_{k+1}$ and $OW_{2n-4-k}$ in the order $k = 0, 1, \ldots, n-3$. We now show that it is possible to use the idea of conjugate rows to assign the values of $r_k(0), r_k(1), \ldots, r_k(N/2 - 1)$ so that all rows of $OW_{k+1}$ are distinct and all rows of $OW_{2n-4-k}$ are distinct, too. Among the $n-1$ columns of $OW_{k+1}$, $n-2$ of them appear in $OW_k$ and the remaining one column is composed of $r_k(0), r_k(1), \ldots, r_k(N/2 - 1)$; also, among the $n-1$ columns of $OW_{2n-4-k}$, $n-2$ of them appear in $OW_{2n-3-k}$ and the remaining one column is composed of $r_k(0), r_k(1), \ldots, r_k(N/2 - 1)$. Each row $j$ of the $n-2$ columns of $OW_{k+1}$ that appear in $OW_k$ has a conjugate row $j'$. Hence all rows of $OW_{k+1}$ are distinct if and only if for each pair of conjugate rows $j$ and $j'$, $r_k(j) \ne r_k(j')$. Similarly, each row $j$ of the $n-2$ columns of $OW_{2n-4-k}$ that appear in $OW_{2n-3-k}$ has a conjugate row $j'$. Hence all rows of $OW_{2n-4-k}$ are distinct if and only if for each pair of conjugate rows $j$ and $j'$, $r_k(j) \ne r_k(j')$. We start with an arbitrary row $j$ of $OW_{k+1}$ and set $r_k(j) = 0$. Then we find the conjugate row $j'$ of row $j$ in $OW_{k+1}$ and set $r_k(j') = 1$. In this way, rows $j$ and $j'$ in $OW_{k+1}$ can be made distinct. Next, find the conjugate row $j''$ of row $j'$ in $OW_{2n-4-k}$ and set $r_k(j'') = 0$. Again, in this way, rows $j'$ and $j''$ in $OW_{2n-4-k}$ can be made distinct. Repeat the above process until $r_k(j)$ is assigned for all $j$. Thus it is possible to assign the values of $r_k(0), r_k(1), \ldots, r_k(N/2 - 1)$ so that all rows of $OW_{k+1}$ are distinct and all rows of $OW_{2n-4-k}$ are distinct, too. We now have this theorem.

By using the idea of conjugate rows (defined in the proof of Theorem 3), we now propose an algorithm to determine the AR-bits $r_0 r_1 \cdots r_{n-3}$ for optical MINs that satisfy Theorem 3.

Since AR-bits $r_0 r_1 \cdots r_{n-3}$ together with $r_{n-2} r_{n-1} \cdots r_{2n-3}$ (the predetermined routing bits) can be used to route a given semi-permutation with node-disjoint paths in one pass, Algorithm 3 is called the ROUTING ALGORITHM.

Algorithm 3 ROUTING ALGORITHM

Require: An arbitrary semi-permutation $P$, the characteristic strings $OS_k$, $1 \le k \le 2n-4$, of the combined $(2n-2)$-stage optical MIN, and the optical windows $OW_0$ and $OW_{2n-3}$ derived from $P$.

Ensure: AR-bits $r_0 r_1 \cdots r_{n-3}$, each represented as an $(N/2)$-bit array such that $r_k(j)$, $0 \le j \le N/2 - 1$, represents the routing bit of input $j^*$ at stage $k$, where $j^* = 2j$ if input $2j$ is in $P$ and $j^* = 2j+1$ if input $2j+1$ is in $P$.

 1: for $k = 0$ to $n-3$ do
 2:     use $OS_{k+1}$ and $OW_k$ to form $(n-2)$ of the $(n-1)$ columns of $OW_{k+1}$;
        /* the remaining one column is for AR-bit $r_k$ and is determined below */
 3:     use $OS_{2n-4-k}$ and $OW_{2n-3-k}$ to form $(n-2)$ of the $(n-1)$ columns of $OW_{2n-4-k}$;
        /* the remaining one column is for AR-bit $r_k$ and is determined below */
 4:     $j \leftarrow 0$;
 5:     $r_k(j) \leftarrow 0$;
 6:     set the corresponding entry in the remaining one column of $OW_{k+1}$ to 0;
 7:     find the conjugate row $j'$ in $OW_{k+1}$;
 8:     $r_k(j') \leftarrow 1$;
 9:     set the corresponding entry in the remaining one column of $OW_{k+1}$ to 1;
10:     find the conjugate row $j''$ in $OW_{2n-4-k}$;
11:     $r_k(j'') \leftarrow 0$;
12:     set the corresponding entry in the remaining one column of $OW_{2n-4-k}$ to 0;
13:     repeat lines 7 to 12 until $j'' = j$;
14:     if there exists a row $j$ in $OW_{k+1}$ such that $r_k(j)$ is not assigned then go to line 5;
15: end for

Take the 16 × 16 dilated Benes network shown in Figure 5 and the semi-permutation

$$P = \begin{pmatrix} 0 & 2 & 5 & 7 & 8 & 11 & 13 & 15 \\ 13 & 11 & 2 & 0 & 9 & 14 & 5 & 7 \end{pmatrix}$$

as an example. The first and the last optical windows $OW_0$ and $OW_5$ are as follows.

$$\begin{array}{ccc|ccc} \multicolumn{3}{c|}{OW_0} & \multicolumn{3}{c}{OW_5} \\ x_3 & x_2 & x_1 & y_3 (= r_2) & y_2 (= r_3) & y_1 (= r_4) \\ \hline 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 1 & 1 \end{array}$$

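Algorithm 3 determines $r_0$ from $OW_1$ and $OW_4$ as follows. Note that, for convenience, the columns in $OW_1$ are given in the order $x_3 x_2 r_0$ instead of the order $r_0 x_3 x_2$. Blank entries of $r_0$ are not yet assigned.

\[
\begin{array}{ccc|cc}
\multicolumn{3}{c|}{OW_1} & \multicolumn{2}{c}{OW_4} \\
x_3 & x_2 & r_0 & y_3(=r_2) & y_2(=r_3) \\ \hline
0 & 0 & 0 & 1 & 1 \\
0 & 0 &   & 1 & 0 \\
0 & 1 &   & 0 & 0 \\
0 & 1 &   & 0 & 0 \\
1 & 0 &   & 1 & 0 \\
1 & 0 &   & 1 & 1 \\
1 & 1 &   & 0 & 1 \\
1 & 1 &   & 0 & 1
\end{array}
\]

\[
\begin{array}{ccc|cc}
\multicolumn{3}{c|}{OW_1} & \multicolumn{2}{c}{OW_4} \\
x_3 & x_2 & r_0 & y_3(=r_2) & y_2(=r_3) \\ \hline
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 1 & 0 \\
0 & 1 &   & 0 & 0 \\
0 & 1 &   & 0 & 0 \\
1 & 0 &   & 1 & 0 \\
1 & 0 &   & 1 & 1 \\
1 & 1 &   & 0 & 1 \\
1 & 1 &   & 0 & 1
\end{array}
\]

\[
\begin{array}{ccc|cc}
\multicolumn{3}{c|}{OW_1} & \multicolumn{2}{c}{OW_4} \\
x_3 & x_2 & r_0 & y_3(=r_2) & y_2(=r_3) \\ \hline
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 1 & 0 \\
0 & 1 &   & 0 & 0 \\
0 & 1 &   & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\
1 & 0 &   & 1 & 1 \\
1 & 1 &   & 0 & 1 \\
1 & 1 &   & 0 & 1
\end{array}
\]

\[
\begin{array}{ccc|cc}
\multicolumn{3}{c|}{OW_1} & \multicolumn{2}{c}{OW_4} \\
x_3 & x_2 & r_0 & y_3(=r_2) & y_2(=r_3) \\ \hline
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 1 \\
1 & 1 & 1 & 0 & 1
\end{array}
\]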

Algorithm 3 determines $r_1$ from $OW_2$ and $OW_3$ as follows. Again, for convenience, the columns in $OW_2$ are given in the order $x_3 r_0 r_1$ instead of the order $r_0 r_1 x_3$.

\[
\begin{array}{ccc|c}
\multicolumn{3}{c|}{OW_2} & OW_3 \\
x_3 & r_0 & r_1 & y_3(=r_2) \\ \hline
0 & 0 & 0 & 1 \\
0 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 \\
1 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0
\end{array}
\]

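The routing bits $r_0 r_1 r_2 r_3 r_4 r_5$ for $P$ are listed below, in which $r_0$ and $r_1$ are the AR-bits derived by Algorithm 3, and each of $r_2, r_3, r_4, r_5$ is represented as an $(N/2)$-bit array such that $r_k(j)$ is the routing bit at stage $k$ of input $j^*$ in $P$, where $j^* = 2j$ if input $2j$ is in $P$ and $j^* = 2j+1$ if input $2j+1$ is in $P$.

\[
\begin{array}{cccccc}
r_0 & r_1 & r_2 & r_3 & r_4 & r_5 \\ \hline
0 & 0 & 1 & 1 & 0 & 1 \\
1 & 0 & 1 & 0 & 1 & 1 \\
0 & 1 & 0 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 1 & 1
\end{array}
\]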

Figure 6 shows the routing paths of P when the above r0r1r2r3r4r5 is used.
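The computation of $r_0$ and $r_1$ in this example can be reproduced with a short Python sketch of Algorithm 3. It is an illustration under stated assumptions, not the thesis's own code: the names `partner_table` and `ar_bits` and the encoding of characteristic strings as lists of symbol names are hypothetical, the rows are assumed to be listed in increasing order of input, and every partial window row is assumed to be shared by exactly two rows (which is how the conjugate rows arise in the example).

```python
def partner_table(keys):
    """Pair up rows whose known window columns agree (conjugate rows)."""
    groups = {}
    for j, key in enumerate(keys):
        groups.setdefault(key, []).append(j)
    partner = [None] * len(keys)
    for rows in groups.values():
        a, b = rows  # assumption: exactly two rows share each partial row
        partner[a], partner[b] = b, a
    return partner


def ar_bits(n, inputs, outputs, strings):
    """Sketch of Algorithm 3: returns [r_0, ..., r_{n-3}] as lists of N/2 bits."""
    half = len(inputs)
    col = {}
    for i in range(n):
        col[f"x{i}"] = [(a >> i) & 1 for a in inputs]
        # predetermined routing bits: r_{n-2+i} = y_{n-1-i}
        col[f"r{n - 2 + i}"] = [(b >> (n - 1 - i)) & 1 for b in outputs]
    for k in range(n - 2):
        name = f"r{k}"

        def partners(window):
            known = [s for s in strings[window] if s != name]
            return partner_table(
                [tuple(col[s][j] for s in known) for j in range(half)]
            )

        fwd = partners(k + 1)          # conjugate rows in OW_{k+1}
        bwd = partners(2 * n - 4 - k)  # conjugate rows in OW_{2n-4-k}
        r = [None] * half
        for start in range(half):      # line 14: next unassigned row
            if r[start] is not None:
                continue
            j = start
            while True:                # lines 5 to 13
                r[j] = 0
                r[fwd[j]] = 1
                j = bwd[fwd[j]]
                if j == start:
                    break
        col[name] = r
    return [col[f"r{k}"] for k in range(n - 2)]


# Characteristic strings OS_0..OS_5 of the 16 x 16 example (n = 4).
STRINGS_N4 = [["x3", "x2", "x1"], ["r0", "x3", "x2"], ["r0", "r1", "x3"],
              ["r0", "r1", "r2"], ["r0", "r2", "r3"], ["r2", "r3", "r4"]]
r0, r1 = ar_bits(4, [0, 2, 5, 7, 8, 11, 13, 15],
                 [13, 11, 2, 0, 9, 14, 5, 7], STRINGS_N4)
print(r0)  # [0, 1, 0, 1, 0, 1, 0, 1]
print(r1)  # [0, 0, 1, 1, 1, 1, 0, 0]
```

The dictionary in `partner_table` is one possible way to make each conjugate-row lookup constant time; the thesis does not prescribe a particular data structure.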

Figure 6: The routing paths obtained by our algorithm.

We now prove a theorem.

Theorem 4. Algorithm 3 is correct and takes $O(N \log_2 N)$ time. Moreover, it leads to an $O(N \log_2 N)$-time semi-permutation routing algorithm.

Proof. The correctness of Algorithm 3 follows from the proof of Theorem 3. Since there are $(n-2) \times \frac{N}{2}$ values $r_k(j)$ and each of them can be determined in $O(1)$ time, Algorithm 3 takes $O((\log_2 N - 2) \times \frac{N}{2}) = O(N \log_2 N)$ time. It is not difficult to see that the AR-bits $r_0 r_1 \cdots r_{n-3}$ obtained by Algorithm 3 together with the $n$ predetermined routing bits $r_{n-2} r_{n-1} \cdots r_{2n-3}$ can be used to route a given semi-permutation with node-disjoint paths in one pass.

The following is a sufficient condition for the CF-rearrangeability of a combined $(2n-1)$-stage optical MIN. Since the proof is similar to that of Theorem 3, the proof is omitted.

Theorem 5. (Sufficient condition for a combined $(2n-1)$-stage optical MIN) In a combined $(2n-1)$-stage optical MIN $M_1(n) \oplus M_2(n)$ in which $M_2(n)$ follows destination tag routing, if $i$-mappings exist for all $i$, $0 \le i \le 2n-2$, and each AR-bit $r_k$, $0 \le k \le n-2$, occurs only in $OS_\ell$, for $k+1 \le \ell \le 2n-3-k$, then the network is CF-rearrangeable.

Before ending this section, we list the characteristic strings of a $(2n-2)$-stage dilated Benes network. These strings will be used in the next section.

\[
\begin{aligned}
OS_0 &= x_{n-1} x_{n-2} \cdots x_3 x_2 x_1 \\
OS_1 &= r_0 x_{n-1} x_{n-2} \cdots x_3 x_2 \\
OS_2 &= r_0 r_1 x_{n-1} x_{n-2} \cdots x_3 \\
&\;\;\vdots \\
OS_{n-2} &= r_0 r_1 r_2 \cdots r_{n-3} x_{n-1} \\
OS_{n-1} &= r_0 r_1 r_2 \cdots r_{n-3} r_{n-2} \\
&\;\;\vdots \\
OS_{2n-5} &= r_0 r_1 r_{n-2} r_{n-1} \cdots r_{2n-6} \\
OS_{2n-4} &= r_0 r_{n-2} r_{n-1} \cdots r_{2n-6} r_{2n-5} \\
OS_{2n-3} &= r_{n-2} r_{n-1} \cdots r_{2n-6} r_{2n-5} r_{2n-4}
\end{aligned}
\tag{1}
\]

Here $x_{n-1} x_{n-2} \cdots x_0$ denotes an input and $y_{n-1} y_{n-2} \cdots y_0$ denotes an output. Note that $r_{n-2} r_{n-1} \cdots r_{2n-3} = y_{n-1} y_{n-2} \cdots y_0$. It can be verified that a dilated Benes network satisfies the sufficient condition stated in Theorem 3 and hence is CF-rearrangeable.
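The pattern in list (1) can be generated mechanically, which also gives a quick way to check a set of routing bits. The following Python sketch is only illustrative: the function names `dilated_benes_strings` and `windows_are_distinct` are hypothetical, symbols are encoded as strings such as `"r0"` and `"x3"`, and the check assumes (as the proof of Theorem 3 suggests) that the property to verify is that every optical window has pairwise distinct rows.

```python
def dilated_benes_strings(n):
    """Characteristic strings OS_0..OS_{2n-3}, following the pattern of list (1)."""
    strings = []
    for k in range(2 * n - 2):
        if k <= n - 2:
            # r_0 ... r_{k-1} followed by x_{n-1} ... x_{k+1}
            s = [f"r{i}" for i in range(k)]
            s += [f"x{i}" for i in range(n - 1, k, -1)]
        else:
            # r_0 ... r_{2n-4-k} followed by r_{n-2} ... r_{k-1}
            s = [f"r{i}" for i in range(2 * n - 3 - k)]
            s += [f"r{i}" for i in range(n - 2, k)]
        strings.append(s)
    return strings


def windows_are_distinct(n, inputs, routing_bits):
    """True if every optical window OW_k has pairwise distinct rows.

    routing_bits[k][j] is r_k(j); inputs[j] is the input of row j.
    """
    col = {f"r{k}": bits for k, bits in enumerate(routing_bits)}
    for i in range(n):
        col[f"x{i}"] = [(a >> i) & 1 for a in inputs]
    for s in dilated_benes_strings(n):
        rows = {tuple(col[name][j] for name in s) for j in range(len(inputs))}
        if len(rows) != len(inputs):
            return False
    return True


# Routing bits r_0..r_5 of the 16 x 16 example, one list per column.
example_bits = [
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 0, 1],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
print(windows_are_distinct(4, [0, 2, 5, 7, 8, 11, 13, 15], example_bits))  # True
```

For $n = 4$ the generator returns the six strings used in the example of the previous pages, from $x_3 x_2 x_1$ up to $r_2 r_3 r_4$.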

5 A permutation routing algorithm in the baseline (or reverse baseline) network

Recall that both the Benes network and the dilated Benes network are the concatenation of the baseline network and the reverse baseline network. The Benes network is rearrangeable, whereas the dilated Benes network is CF-rearrangeable. For convenience, call the output links of switches of stage $(n-1)$ followed by routing paths in a permutation (respectively, semi-permutation) of the Benes network (respectively, dilated Benes network) the intermediate destinations. In [27], by using the intermediate destinations of a Benes network, Yang and Wang proposed an algorithm (for convenience, call it Algorithm YW) to route an arbitrary permutation in a baseline (or reverse baseline) network with node-disjoint paths in four passes.

Algorithm YW uses Algorithm 1 to decompose a given permutation P into two semi-permutations L and R. Recall that Algorithm 1 has to construct a bipartite graph explicitly. Also, to use the intermediate destinations of a Benes network, Algorithm YW has to run Algorithm 1 to decompose each of L and R into two semi-permutations, say, LL, LR, RL, and RR. Then, Algorithm YW has to run Algorithm 1 to further decompose each of LL, LR, RL, and RR into two semi-permutations, say, LLL, LLR, LRL, LRR, RLL, RLR, RRL, and RRR. The same process repeats until each semi-permutation contains only one input-output pair.

The purpose of this section is to improve Algorithm YW. To achieve this purpose, Algorithm 2 is used instead of Algorithm 1; also, the intermediate destinations of a dilated Benes network are used instead of the intermediate destinations of a Benes network. See the following for details.

We first use Algorithm 2 to decompose a given permutation P into two semi-permutations L and R; then, route L in a baseline network with node-disjoint paths in two passes, and route R in a baseline network with node-disjoint paths in two passes. In the following, we only present an algorithm to route an arbitrary semi-permutation P in a baseline network with node-disjoint paths in two passes. Obviously, setting P = L and then P = R will route an arbitrary permutation in a baseline network with node-disjoint paths in four passes.

Suppose the semi-permutation is

\[
P = \begin{pmatrix} a_0 & a_1 & a_2 & \cdots & a_{N/2-1} \\ b_0 & b_1 & b_2 & \cdots & b_{N/2-1} \end{pmatrix}.
\]

Note that we have assumed that the links connecting inputs and switches of stage 0 are regarded as output links of switches of stage $(-1)$. Therefore, an input-output path from input $a_i$ to output $b_i$, $0 \le i \le \frac{N}{2} - 1$, in a dilated Benes network can be represented as

\[
L_{i,0} \to L_{i,1} \to L_{i,2} \to \cdots \to L_{i,2n-3} \to L_{i,2n-2},
\]

where $L_{i,0} = a_i$, $L_{i,2n-2} = b_i$, and $L_{i,k}$, $1 \le k \le 2n-3$, is the output link of switches of stage $k-1$ followed by the path. Our algorithm is based on the observation that $L_{0,n}, L_{1,n}, \ldots, L_{N/2-1,n}$ are the intermediate destinations of a dilated Benes network for $P$ and can be obtained by Algorithm 3. Let

\[
P_1 = \begin{pmatrix} a_0 & a_1 & a_2 & \cdots & a_{N/2-1} \\ L_{0,n} & L_{1,n} & L_{2,n} & \cdots & L_{N/2-1,n} \end{pmatrix}
\quad\text{and}\quad
P_2 = \begin{pmatrix} L_{0,n} & L_{1,n} & L_{2,n} & \cdots & L_{N/2-1,n} \\ b_0 & b_1 & b_2 & \cdots & b_{N/2-1} \end{pmatrix}.
\]

The following lemma was proven in [27].

Lemma 6. [27] The set of all semi-permutations realized by a baseline network with node-disjoint paths in one pass is exactly the set of all semi-permutations realized by a reverse baseline network with node-disjoint paths in one pass.

We now prove a lemma.

Lemma 7. Both $P_1$ and $P_2$ can be realized by a baseline (or reverse baseline) network with node-disjoint paths in one pass. Moreover, routing bits for $P_1$ and $P_2$ are the $n$-bit binary representations of $L_{0,n}, L_{1,n}, \ldots, L_{N/2-1,n}$ and $b_0, b_1, \ldots, b_{N/2-1}$, respectively.

Proof. Since a dilated Benes network is CF-rearrangeable, $P$ can be realized in it with node-disjoint paths in one pass. Consider $P_1$. The first $n$ stages of a dilated Benes network form a baseline network. Thus $P_1$ can be realized by a baseline network with node-disjoint paths in one pass. By Lemma 6, $P_1$ can also be realized by a reverse baseline network with node-disjoint paths in one pass. Now consider $P_2$. Since the last $n$ stages of a dilated Benes network form a reverse baseline network, $P_2$ can be realized in the last $n-2$ stages (i.e., stages $2, 3, \ldots, n-1$) of a reverse baseline network with node-disjoint paths in one pass. Thus if we can prove that, in a reverse baseline network, input $L_{i,n}$, $0 \le i \le \frac{N}{2} - 1$, can get to output link $L_{i,n}$ of stage 1, then $P_2$ can be realized by a reverse baseline network with node-disjoint paths in one pass. For input $L_{i,n}$, choose its routing bit at stage 0 to be 0 if $\lfloor L_{i,n}/2 \rfloor$ is even and 1 if $\lfloor L_{i,n}/2 \rfloor$ is odd; choose its routing bit at stage 1 to be 0 if $L_{i,n}$ is even and 1 if $L_{i,n}$ is odd. It is not difficult to see that the above choices of routing bits ensure input $L_{i,n}$ to get to output link $L_{i,n}$ of stage 1. Therefore, $P_2$ can be realized by a reverse baseline network with node-disjoint paths in one pass. By Lemma 6, $P_2$ can also be realized by a baseline network with node-disjoint paths in one pass. Since a baseline (or reverse baseline) network follows destination tag routing, routing bits for $P_1$ and $P_2$ are the $n$-bit binary representations of $L_{0,n}, L_{1,n}, \ldots, L_{N/2-1,n}$ and $b_0, b_1, \ldots, b_{N/2-1}$, respectively.

The following is our algorithm for routing an arbitrary semi-permutation $P$ in an $N \times N$ baseline (or reverse baseline) network with node-disjoint paths in two passes.

Algorithm 4 ROUTING A SEMI-PERMUTATION IN A BASELINE OR REVERSE BASELINE NETWORK

Require: An arbitrary semi-permutation $P$ and the characteristic strings $OS_k$, $1 \le k \le 2n-4$, of a $(2n-2)$-stage dilated Benes network.

Ensure: Routing bits $r_0 r_1 \ldots r_{n-1}$ and $r'_0 r'_1 \ldots r'_{n-1}$, each represented as an $(N/2)$-bit array such that $r_k(j)$, $0 \le k < n$ and $0 \le j \le \frac{N}{2} - 1$, represents the routing bit of input $j^*$ at stage $k$ for the first pass, and $r'_k(j)$, $0 \le k < n$ and $0 \le j \le \frac{N}{2} - 1$, represents the routing bit of input $j^*$ at stage $k$ for the second pass, where $j^* = 2j$ if input $2j$ is in $P$ and $j^* = 2j+1$ if input $2j+1$ is in $P$.

1: use $P$ to derive the optical windows $OW_0$ and $OW_{2n-3}$ for a $(2n-2)$-stage dilated Benes network and use Algorithm 3 to find AR-bits $r_0 r_1 \ldots r_{n-3}$;
2: for $j = 0$ to $\frac{N}{2} - 1$ do
3:   let $y_{n-1} y_{n-2} \cdots y_0$ be the $n$-bit binary representation of the destination of input $j^*$;
4:   $r_{n-2}(j) \leftarrow y_{n-1}$;
5:   $r_{n-1}(j) \leftarrow y_{n-2}$;
6:   for $i = 0$ to $n-1$ do
7:     $r'_i(j) \leftarrow y_{n-1-i}$;
8:   end for
9: end for

In this algorithm, $r_0 r_1 \cdots r_{n-1}$ are the routing bits for the first pass (i.e., for $P_1$), and $r'_0 r'_1 \cdots r'_{n-1}$ are the routing bits for the second pass (i.e., for $P_2$). This algorithm uses the characteristic strings $OS_k$, $1 \le k \le 2n-4$, of a $(2n-2)$-stage dilated Benes network to find $r_0 r_1 \cdots r_{n-3}$. It sets $r_{n-2}$ and $r_{n-1}$ to the leftmost two bits of the $n$-bit binary representation of the destination and sets $r'_0, r'_1, \ldots, r'_{n-1}$ to the $n$-bit binary representation of the destination. Note that $OS_k$, $1 \le k \le 2n-4$, of a $(2n-2)$-stage [...] assume $OS_k$, $1 \le k \le 2n-4$, are inputs, too.
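Lines 2 to 9 of Algorithm 4 can be sketched in Python as follows. This is a minimal illustration, not the thesis's implementation: the names `algorithm4_bits` and `tag_value` are hypothetical, and the AR-bits are assumed to have been computed beforehand by Algorithm 3 (line 1) and are simply passed in. The data are those of the $16 \times 16$ semi-permutation used earlier, with $r_0$ and $r_1$ as derived there.

```python
def algorithm4_bits(n, outputs, ar):
    """Sketch of lines 2-9 of Algorithm 4.

    outputs[j] is the destination of input j*; ar[k][j] is the AR-bit r_k(j),
    k = 0..n-3, assumed to come from Algorithm 3. Returns (first, second),
    where first[k][j] = r_k(j) and second[k][j] = r'_k(j).
    """
    half = len(outputs)
    first = [list(bits) for bits in ar] + [[0] * half for _ in range(2)]
    second = [[0] * half for _ in range(n)]
    for j, dest in enumerate(outputs):
        y = [(dest >> (n - 1 - i)) & 1 for i in range(n)]  # y[i] = y_{n-1-i}
        first[n - 2][j] = y[0]        # line 4: r_{n-2}(j) <- y_{n-1}
        first[n - 1][j] = y[1]        # line 5: r_{n-1}(j) <- y_{n-2}
        for i in range(n):            # lines 6-8: r'_i(j) <- y_{n-1-i}
            second[i][j] = y[i]
    return first, second


def tag_value(bits, j):
    """Read row j of a bit table as a binary number (destination tag)."""
    value = 0
    for column in bits:
        value = 2 * value + column[j]
    return value


outputs = [13, 11, 2, 0, 9, 14, 5, 7]
ar = [[0, 1, 0, 1, 0, 1, 0, 1], [0, 0, 1, 1, 1, 1, 0, 0]]
first, second = algorithm4_bits(4, outputs, ar)
print([tag_value(first, j) for j in range(8)])   # [3, 10, 4, 12, 6, 15, 1, 9]
print([tag_value(second, j) for j in range(8)])  # [13, 11, 2, 0, 9, 14, 5, 7]
```

Reading each row of the first-pass bits as a binary number gives the intermediate destinations, and the second-pass bits are just the destination tags.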

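We now give an example for Algorithm 4. Let

\[
P = \begin{pmatrix} 0 & 2 & 5 & 7 & 8 & 11 & 13 & 15 \\ 13 & 11 & 2 & 0 & 9 & 14 & 5 & 7 \end{pmatrix}
\]

be the given semi-permutation. Then Algorithm 4 obtains the following routing bits.

\[
\begin{array}{cccc|cccc}
r_0 & r_1 & r_2 & r_3 & r'_0 & r'_1 & r'_2 & r'_3 \\ \hline
0 & 0 & 1 & 1 & 1 & 1 & 0 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 1 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 1 & 1 & 1
\end{array}
\]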

Note that the routing bits obtained by Algorithm 4 work for both a baseline and a reverse baseline network. Routing paths in a baseline network are shown in Figure 7; those in a reverse baseline network are shown in Figure 8. It is not difficult to see that

\[
P_1 = \begin{pmatrix} 0 & 2 & 5 & 7 & 8 & 11 & 13 & 15 \\ 3 & 10 & 4 & 12 & 6 & 15 & 1 & 9 \end{pmatrix}
\quad\text{and}\quad
P_2 = \begin{pmatrix} 3 & 10 & 4 & 12 & 6 & 15 & 1 & 9 \\ 13 & 11 & 2 & 0 & 9 & 14 & 5 & 7 \end{pmatrix}.
\]

Figure 7: (a) Routing paths in the first pass. (b) Routing paths in the second pass.

We now analyze Algorithm 4.

Theorem 8. Algorithm 4 takes $O(N \log_2 N)$ time and it can realize any semi-permutation with node-disjoint paths in a baseline (or reverse baseline) network in two passes.

Proof. It is not difficult to see that Algorithm 4 takes $O(N \log_2 N)$ time. By Lemma 7, it suffices to prove that: (i) $r_0 r_1 \cdots r_{n-1}$ are the $n$-bit binary representations of $L_{0,n}, L_{1,n}, \ldots, L_{N/2-1,n}$, and (ii) $r'_0 r'_1 \cdots r'_{n-1}$ are the $n$-bit binary representations of $b_0, b_1, \ldots, b_{N/2-1}$. Statement (ii) follows from lines 3, 6, 7, 8 of Algorithm 4. Since $L_{0,n}, L_{1,n}, \ldots, L_{N/2-1,n}$ are the intermediate destinations of a $(2n-2)$-stage dilated Benes network for $P$, if we set $r_0 r_1 \cdots r_{n-1}$ to the first $n$ routing bits of the $2n-2$ routing bits obtained by Algorithm 3, then we have (i). By lines 1, 3, 4, 5 of Algorithm 4, $r_0 r_1 \cdots r_{n-1}$ are set accordingly. Thus we have (i).

Figure 8: (a) Routing paths in the first pass. (b) Routing paths in the second pass.

6 Concluding remarks

This thesis considers the crosstalk-free rearrangeability of combined optical MINs. In [8], Das formulated an elegant sufficient condition for the rearrangeability of a combined $(2n-1)$-stage electronic MIN and presented an $O(N \log_2 N)$-time permutation routing algorithm for MINs that satisfy the sufficient condition. In this thesis, we have formulated a sufficient condition for the crosstalk-free rearrangeability of a combined $(2n-2)$-stage optical MIN and a sufficient condition for the crosstalk-free rearrangeability of a combined $(2n-1)$-stage optical MIN. We have proposed an $O(N \log_2 N)$-time semi-permutation routing algorithm for optical MINs that satisfy the sufficient condition, and improved the decomposition algorithm in [28] and the permutation routing algorithm in [27].

References

[1] C. S. Babu and C. S. Raghavendra, “On the invariants of shuffle/exchange networks,” in Proceedings of the 1st IEEE Symposium on Parallel and Distributed Processing, IEEE Computer Society, Los Alamitos, CA, pp. 249-256, 1989.

[2] V. E. Benes, “Optimal rearrangeable multistage connecting networks,” Bell System Tech. J., vol. 43, pp. 1641-1656, 1964.

[3] X. Bao and Q. Li, “Crosstalk-free rearrangeable multistage interconnection networks,” Taiwan. J. Math., vol. 10, no. 5, pp. 1225-1243, 2006.

[4] G. J. Chang, F. K. Hwang, and L. D. Tong, “Characterizing bit permutation networks,” Networks, vol. 33, no. 4, pp. 261-267, 1999.

[5] H. Cam, “Rearrangeability of (2n−1)-stage shuffle-exchange network,” SIAM J. Comput., vol. 32, pp. 557-585, 2003.

[6] Z. Chen, Z. Liu, and Z. Qiu, “Bidirectional shuffle-exchange network and tag-based routing algorithm,” IEEE Commun. Lett., vol. 7, no. 3, pp. 121-123, 2003.

[7] C. Chen and J. K. Lou, “An efficient tag-based routing algorithm for the backward network of a bidirectional general shuffle-exchange network,” IEEE Commun. Lett., vol. 10, no. 4, pp. 296-298, 2006.

[8] N. Das, “More on rearrangeability of combined (2n − 1)-stage networks,” J. Syst. Architect., vol. 51, pp. 207-222, 2005.

[9] S. T. Huang and S. K. Tripathi, “Finite state model and compatibility theory: New analysis tools for permutation networks,” IEEE Trans. Comput., vol. 35, pp. 591-601, 1986.


[10] F. K. Hwang, “The mathematical theory of nonblocking switching networks,” Series on Applied Mathematics, vol. 15, ch. 1, pp. 12-22, 2004.

[11] X. Jiang, P. H. Ho, H. Shen and S. Horiguchi, “A class of Benes-based optical multistage interconnection networks for crosstalk-free realization of permutations,” IEICE Trans. Commun., vol. E89-B, no. 1, pp. 19-27, 2006.

[12] C. P. Kruskal, “A unified theory of interconnection network structure,” Theor. Comput. Sci., vol. 48, pp. 75-94, 1986.

[13] N. Linial and M. Tarsi, “Interpolation between bases and the shuffle-exchange network,” Eur. J. Comb., vol. 10, pp. 29-39, 1989.

[14] E. Lu and S. Q. Zheng, “High-speed crosstalk-free routing for optical multistage interconnection networks,” in Proceedings of International Conference on Computer Communications and Networks, 2003.

[15] V. W. Liu, C. Chen, and R. B. Chen, “Optimal all-to-all personalized exchange in d-nary banyan multistage interconnection networks,” J. Comb. Optim., vol. 14, pp. 131-142, 2007.

[16] G. Maier and A. Pattavina, “Design of photonic rearrangeable networks with zero first-order switching-element-crosstalk,” IEEE Trans. Commun., vol. 49, no. 7, pp. 1268-1279, 2001.

[17] K. Padmanabhan, “Design and analysis of even-sized binary shuffle-exchange networks for multiprocessors,” IEEE Trans. Parallel Distrib. Syst., vol. 2, no. 4, pp. 385-397, 1991.

[18] K. Padmanabhan and A. N. Netravali, “Dilated networks for photonic switching,” IEEE Trans. Commun., vol. 35, no. 12, pp. 1357-1365, 1987.


[19] C. Qiao and L. Zhou, “Scheduling switching element disjoint connections in stage-controlled photonic banyans,” IEEE Trans. Commun., vol. 47, no. 1, pp. 139-148, 1999.

[20] H. S. Stone, “Parallel processing with the perfect shuffle,” IEEE Trans. Comput., vol. 20, pp. 153-161, 1971.

[21] X. Tan and S. Y. R. Li, “Rearrangeability of tandem cascade of banyan-type networks,” IEICE Trans. Inf. Syst., vol. E90-D, no. 1, pp. 67-74, 2007.

[22] M. Vaez and C.-T. Lea, “Strictly nonblocking directional-coupler-based switching networks under crosstalk constraint,” IEEE Trans. Commun., vol. 48, no. 2, pp. 316-323, 2000.

[23] A. Waksman, “A permutation network,” J. ACM, vol. 15, pp. 159-163, 1969.

[24] C. Wu and T.-Y. Feng, “The universality of the shuffle-exchange network,” IEEE Trans. Comput., vol. 30, pp. 324-331, 1981.

[25] Y. Yang and J. Wang, “Optimal all-to-all personalized exchange in a class of optical multistage networks,” IEEE Trans. Parallel Distrib. Syst., vol. 12, no. 9, pp. 567-582, 2001.

[26] Y. Yang and J. Wang, “Routing permutations with link-disjoint and node-disjoint paths in a class of self-routable interconnects,” IEEE Trans. Parallel Distrib. Syst., vol. 14, no. 4, pp. 383-393, 2003.

[27] Y. Yang and J. Wang, “Routing permutations on baseline networks with node-disjoint paths,” IEEE Trans. Parallel Distrib. Syst., vol. 16, no. 8, pp. 737-746, 2005.

[28] Y. Yang, J. Wang and Y. Pan, “Permutation capability of optical multistage interconnection networks,” J. Parallel Distrib. Comput., vol. 60, no. 1, pp. 72-91, 2000.

