National Chiao Tung University
Department of Applied Mathematics
Master's Thesis
The Performance Analysis of Three Routing
Algorithms of General Shuffle-exchange Networks
Student: Bruce W. Lin (林威雄)
Advisor: Prof. Chiuyuan Chen (陳秋媛)
January 2008 (Republic of China year 97)
Contents
Abstract (in Chinese)
Abstract (in English)
Acknowledgement
Contents
List of Figures
1 Introduction
2 Preliminaries and the idea of our analysis
3 The analysis of P-algorithm
4 The analysis of CLQ-algorithm
5 The analysis of CL-algorithm
List of Figures
1 The states of a 2 × 2 switching element and the sub ports
2 A 10 × 10 GSEN
3 Percentage of link-conflicts
4 Percentage of node-conflicts
The Performance Analysis of Three Routing
Algorithms of General Shuffle-Exchange Networks
Student: Bruce W. Lin
Advisor: Chiuyuan Chen
Department of Applied Mathematics National Chiao Tung University
Abstract
The shuffle-exchange network is a popular architecture for multistage interconnection networks. The number of nodes in a shuffle-exchange network is usually a power of k if each switching element in the network is of size k × k. In [10], Padmanbhan relaxed the restriction that the number of nodes must be a power of k and proposed the general shuffle-exchange network (GSEN). Padmanbhan also proposed an elegant tag-based routing algorithm for the GSEN. Later, in [2], Chen, Liu, and Qui enhanced the GSEN with bidirectional links and proposed a tag-based routing algorithm for the backward network of the GSEN. Recently, Chen and Lou [3] also proposed a tag-based routing algorithm for the backward network of the GSEN. A multistage interconnection network enables processors to send their messages concurrently. When two routing requests occur simultaneously in the GSEN, a conflict may occur. The purpose of this thesis is to analyze the performance of the above three tag-based routing algorithms when there are two routing requests.
Keywords: multistage interconnection network, parallel and distributed computing, shuffle-exchange network, routing algorithm, conflict.
1 Introduction
The shuffle-exchange network has been proposed as a popular architecture for multistage interconnection networks; see [4, 5, 8, 10, 12]. The number of nodes in a shuffle-exchange network is usually a power of k if each switching element is of size k × k. Since it is desirable to build a multistage interconnection network out of 2 × 2 switching elements instead of larger switching elements, throughout this thesis we assume that all the switching elements are identical and are of size 2 × 2.
It is well known that a 2 × 2 switching element has only two possible states: straight and cross, as shown in Figure 1. As can be seen from Figure 1, a 2 × 2 switching element has two upper and two lower sub ports. We will use sub port 0 (sub port 1) to denote an upper (a lower) sub port.
Figure 1: The states of a 2 × 2 switching element and the sub ports.
In a multistage interconnection network, a path from an input to an output can be described by a sequence of labels that label the successive edges on this path. Such a sequence of labels is called a control tag [10] or tag [2] or path descriptor [6]. The control tag may be used as a header for routing a message: each successive node uses the first element of the sequence to route the message, and then discards it. For example, in Figure 2 (a), input 3 can get to output 4 by using the control tag 6 (0110), which means that the routing is via sub port 0 at stage 0, sub port 1 at stage 1, sub port 1 at stage 2, and sub port 0 at stage 3.
Recall that the number of nodes in a shuffle-exchange network is usually a power of 2 if each switching element is of size 2 × 2. The general shuffle-exchange network (GSEN) was proposed by Padmanbhan in [10] to relax the restriction on the number of nodes in a shuffle-exchange network. More precisely, an $N_0 \times N_0$ general shuffle-exchange network consists of the perfect shuffle operation (defined below) on $N_0$ terminals followed by $N_0/2$ switching elements. The perfect shuffle operation on $N_0$ terminals is the permutation $\pi$ defined by
$$\pi(i) = \left( 2 \cdot i + \left\lfloor \frac{2 \cdot i}{N_0} \right\rfloor \right) \bmod N_0, \quad 0 \le i \le N_0 - 1.$$
In other words, the perfect shuffle operation separates the top $N_0/2$ terminals from the bottom $N_0/2$ terminals and precisely interleaves them, with the bottom terminals still remaining at the bottom. See Figure 2 for an illustration.
Figure 2: A 10 × 10 GSEN.
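The perfect shuffle defined above can be illustrated with a minimal Python sketch. The function names `perfect_shuffle` and `shuffle_table` are hypothetical and chosen only for this illustration; the formula itself is the one given for $\pi$, and the requirement that the number of terminals be even reflects the use of 2 × 2 switching elements.

```python
def perfect_shuffle(i: int, n0: int) -> int:
    """Perfect shuffle on n0 terminals: pi(i) = (2*i + floor(2*i / n0)) mod n0."""
    if n0 <= 0 or n0 % 2 != 0:
        raise ValueError("n0 must be a positive even number")
    if not 0 <= i < n0:
        raise ValueError("i must satisfy 0 <= i < n0")
    return (2 * i + (2 * i) // n0) % n0


def shuffle_table(n0: int) -> list[int]:
    """List pi(0), pi(1), ..., pi(n0 - 1)."""
    return [perfect_shuffle(i, n0) for i in range(n0)]


if __name__ == "__main__":
    # The 10 x 10 GSEN used as the running example in the text.
    print(shuffle_table(10))
```

For $N_0 = 10$ this prints `[0, 2, 4, 6, 8, 1, 3, 5, 7, 9]`: the top five terminals land on the even positions and the bottom five on the odd positions, so the bottom terminal 9 stays at the bottom.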
Clearly, the number of stages in a GSEN is at least as large as $\log_2 N_0$. When it is exactly $\log_2 N_0$, the GSEN is identical to the Omega network defined in [7] and the control tag depends only on the destination. If the number of stages in a GSEN is greater than $\log_2 N_0$, the control tag will depend on both the source and the destination. An elegant tag-based routing algorithm for the GSEN has been proposed by Padmanbhan in [10]. In the remaining part of this thesis, we will call this algorithm P-algorithm for convenience.
After the work of Padmanbhan [10], Chen et al. [2] enhanced the GSEN with bidirectional links. Their reason for the enhancement is that although unidirectional links are widely used, bidirectional links also have many applications, as suggested in [4]. A bidirectional GSEN can be divided into two dependent networks: the forward network and the backward network. The forward network is from the left-hand side of the GSEN to the right-hand side of the GSEN; thus a routing request in it is sent from left to right. On the other hand, the backward network is from the right-hand side of the network to the left-hand side of the network; thus a routing request in it is sent from right to left. The control tags used in the forward (backward) network are called the forward (backward) control tags.
Obviously, P-algorithm can be applied on the forward network. As for the backward network, Chen et al. [2] proposed a tag-based routing algorithm for it; this algorithm is based on the idea of inversely using the forward control tag. More precisely, this algorithm first runs P-algorithm to obtain the forward control tag; then, it runs another procedure to convert the forward control tag to the backward control tag. In the remaining part of this thesis, we will call the algorithm of Chen et al. CLQ-algorithm for convenience. Recently, Chen and Lou [3] also proposed a tag-based routing algorithm for the backward network. Unlike CLQ-algorithm, Chen and Lou’s algorithm does not run P-algorithm first and is not based on the idea of inversely using the forward control tag. In the remaining part of this thesis, we will call Chen and Lou’s algorithm CL-algorithm for convenience.
A multistage interconnection network enables processors to send their messages concurrently. However, routing must be handled carefully so that there is no conflict when messages are sent concurrently. There are two types of conflict-free routings in a multistage interconnection network: one is routing with link-disjoint paths and the other is routing with node-disjoint paths. The former is used in an electronic network and the latter in an optical network. Routing with link-disjoint paths means that no two different messages have their paths share the same link in the network, while routing with node-disjoint paths means that no two different messages have their paths share the same switching element in the network.
The three known routing algorithms for the GSEN are P-algorithm, CLQ-algorithm, and CL-algorithm. P-algorithm can be used to send a message in a GSEN or in the forward network of a bidirectional GSEN. Both CLQ-algorithm and CL-algorithm can be used to send a message in the backward network of a bidirectional GSEN. When two routing requests occur simultaneously, a conflict may occur. Up to now, there is no
analysis for the conflicts of routing requests in the GSEN. For some routing requests, P-algorithm and CLQ-algorithm can provide two control tags, i.e., two paths that can fulfill the routing request. On the other hand, for each routing request, CL-algorithm provides only one control tag, i.e., only one path that can fulfill the routing request. It is also not known which control tag should be used (when two tags are available) to reduce the conflict. The purpose of this thesis is to analyze the performance of the above three routing algorithms of the GSEN. We will focus on the case that there are two routing requests.
This thesis is organized as follows: Section 2 gives some preliminaries and the idea of our performance analysis. Sections 3, 4, and 5 give the analysis of P-algorithm, CLQ-algorithm, and CL-algorithm, respectively. Concluding remarks are given in the final section.
2 Preliminaries and the idea of our analysis
We first introduce some terminology that will be used throughout this thesis. For convenience, GSEN is also used to denote a bidirectional GSEN. $N_0$ denotes the number of inputs and outputs of the given GSEN. Also, $r$ and $n + 1$ denote the number of switching elements in a stage and the number of stages in a GSEN, respectively. A GSEN is a multistage interconnection network with switches aligned in $n + 1$ stages, labelled $0, 1, \ldots, n$. Each stage consists of $r$ switching elements, labelled $0, 1, \ldots, r - 1$. Since every switching element is of size 2 × 2, there are a total of
$$N_0 = 2 \times r$$
ports on each side of a stage, labelled $0, 1, \ldots, N_0 - 1$. The parameters $N_0$, $r$, and $n$ satisfy the following equation:
$$\lceil \log_2 (2 \cdot r) \rceil = \lceil \log_2 N_0 \rceil = n + 1.$$
The following conventions are used in this thesis. Stage 0 is the leftmost stage even if the network is the backward network of a GSEN. The switching elements in a stage are considered cyclic; that is, the switching element labelled 0 is considered to be the successor of the switching element labelled $r - 1$. Also, nodes $i$ and $i'$ ($j$ and $j'$) are assumed to be on the left-hand (right-hand) side of the network. An $(i, j)$-request denotes a request for sending a message from $i$ to $j$ (from $j$ to $i$ if the network is a backward network). An $(i, j)$-path denotes a path between $i$ and $j$. The terms $(i', j')$-request and $(i', j')$-path are defined similarly.
Note that an $(i, j)$-request can be fulfilled by an $(i, j)$-path. Moreover, an $(i, j)$-path can be characterized by its port sequence, which is the sequence of ports
$$(R_{-1}, R_0, R_1, \ldots, R_n)$$
passed by this path such that $R_{-1}$ is defined to be
$$R_{-1} = i$$
and $R_\ell$ is the port to the right of the switching element at stage $\ell$ on this path. Clearly,
$$R_n = j.$$
Take Figure 2 (a) for an example. A (3, 4)-request can be fulfilled by using the (3, 4)-path shown in this figure, and this (3, 4)-path has the port sequence (3, 6, 3, 7, 4).
The purpose of this thesis is to analyze the performance of three existing routing algorithms of GSENs: P-algorithm, CLQ-algorithm, and CL-algorithm. We will focus on the case that two routing requests occur simultaneously. In an electronic GSEN, two routing requests can be sent simultaneously if their routing paths are link-disjoint, meaning that the two paths have no link in common. On the other hand, in an optical GSEN, two routing requests can be sent simultaneously if their routing paths are node-disjoint, meaning that the two paths have no switching element in common; this ensures that only one signal passes through a switching element at a time and thus avoids the crosstalk problem (see also [16]).
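As an illustration of how a port sequence such as (3, 6, 3, 7, 4) arises from a control tag, the following Python sketch applies the perfect shuffle before each stage and then selects the sub port named by the next tag bit. The function name `port_sequence` and the encoding of the tag as an integer whose most significant bit is used at stage 0 (so that 6 is read as 0110) are assumptions made for this sketch.

```python
def port_sequence(i: int, tag: int, n0: int, stages: int) -> list[int]:
    """Return [R_{-1}, R_0, ..., R_n] for input i and an integer control tag.

    The tag is read as `stages` bits, most significant bit first; bit l selects
    sub port 0 or 1 of the switching element reached at stage l.
    """
    ports = [i]
    current = i
    for stage in range(stages):
        bit = (tag >> (stages - 1 - stage)) & 1
        # Perfect shuffle in front of the stage.
        shuffled = (2 * current + (2 * current) // n0) % n0
        # Switching element floor(shuffled / 2); leave through sub port `bit`.
        current = 2 * (shuffled // 2) + bit
        ports.append(current)
    return ports


if __name__ == "__main__":
    # The (3, 4)-request of the 10 x 10 GSEN with control tag 6 (0110).
    print(port_sequence(3, 0b0110, 10, 4))
```

Running the sketch prints `[3, 6, 3, 7, 4]`, which is the port sequence quoted for the (3, 4)-path.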
Two routing paths are said to have a link-conflict (node-conflict) if they are not link-disjoint (node-disjoint). The following two lemmas are obvious and their proofs are omitted.
Lemma 1. If two routing paths have a link-conflict, then they have a node-conflict.
Lemma 2. There are only three possible cases for two routing paths:
(i) they have no node-conflict (hence no link-conflict);
(ii) they have a node-conflict and have no link-conflict;
(iii) they have a link-conflict (hence a node-conflict).
Take Figure 2 (b) for an example. Suppose we have three routing requests: a (3, 4)-request, a (0, 8)-request, and a (9, 9)-request. Also suppose that these three requests are routed along path P with port sequence (3, 6, 3, 7, 4), path Q with port sequence (0, 1, 2, 4, 8), and path U with port sequence (9, 8, 7, 4, 9). Then P and U have no node-conflict and no link-conflict; P and Q have a node-conflict and have no link-conflict; U and Q have a link-conflict and a node-conflict.
We now describe the idea used in our analysis. Suppose the two requests that occur simultaneously are the $(i, j)$-request and the $(i', j')$-request. Also suppose that the $(i, j)$-request is routed along the $(i, j)$-path $P$ with port sequence
$$(P_{-1}, P_0, P_1, \ldots, P_n) \tag{1}$$
and the $(i', j')$-request is routed along the $(i', j')$-path $Q$ with port sequence
$$(Q_{-1}, Q_0, Q_1, \ldots, Q_n). \tag{2}$$
The following two lemmas describe how we detect a link-conflict or a node-conflict.
Lemma 3. $P$ and $Q$ have a link-conflict if and only if
$$P_k = Q_k \quad \text{for some } -1 \le k \le n.$$
Proof. For each $-1 \le k \le n$, port $P_k$ determines a link on $P$. Similarly, for each $-1 \le k \le n$, port $Q_k$ determines a link on $Q$. Thus we have this lemma.
Lemma 4. $P$ and $Q$ have a node-conflict if and only if
$$\left\lfloor \frac{P_k}{2} \right\rfloor = \left\lfloor \frac{Q_k}{2} \right\rfloor \quad \text{for some } 0 \le k \le n.$$
Proof. For each $0 \le k \le n$, $\lfloor P_k / 2 \rfloor$ determines a switching element on $P$. Similarly, for each $0 \le k \le n$, $\lfloor Q_k / 2 \rfloor$ determines a switching element on $Q$. Thus we have this lemma.
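The two tests in Lemmas 3 and 4 translate directly into code. The following Python sketch is only an illustration; the function names are hypothetical, and each port sequence is assumed to be stored as a list whose first entry is the port with index $-1$ (the input) and whose remaining entries are the ports with indices $0$ to $n$.

```python
def has_link_conflict(p: list[int], q: list[int]) -> bool:
    """Lemma 3: some port P_k equals Q_k for -1 <= k <= n."""
    return any(a == b for a, b in zip(p, q))


def has_node_conflict(p: list[int], q: list[int]) -> bool:
    """Lemma 4: floor(P_k / 2) equals floor(Q_k / 2) for some 0 <= k <= n."""
    # Skip the first entry (index -1), which is an input and not a switch output.
    return any(a // 2 == b // 2 for a, b in zip(p[1:], q[1:]))


if __name__ == "__main__":
    # Paths P, Q, U from the Figure 2 (b) example.
    P = [3, 6, 3, 7, 4]
    Q = [0, 1, 2, 4, 8]
    U = [9, 8, 7, 4, 9]
    for name, a, b in (("P,U", P, U), ("P,Q", P, Q), ("U,Q", U, Q)):
        print(name, has_link_conflict(a, b), has_node_conflict(a, b))
```

For the three example paths the sketch reports no conflict for P and U, a node-conflict only for P and Q, and both kinds of conflict for U and Q, matching the three cases of Lemma 2.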
In the following sections, we will further assume that
$$i \ne i' \quad \text{and} \quad j \ne j'.$$
This is because if $i = i'$ or $j = j'$, then a link-conflict and also a node-conflict will occur, and it is impossible to fulfill these two routing requests simultaneously.
3 The analysis of P-algorithm
In this section, we will analyze the performance of P-algorithm. This algorithm was stated in a theorem in [10].
Theorem 5. [10] Any $i$, $0 \le i < N_0$, in a GSEN can set up a path to a $j$, $0 \le j < N_0$, by using the control tag
$$T_1 = (j + 2Mi) \bmod N_0.$$
In addition, if $T_1 + N_0 < 2^{n+1}$, then a second control tag exists and is given by
$$T_2 = T_1 + N_0.$$
Using P-algorithm, each of the two requests, the $(i, j)$-request and the $(i', j')$-request, can be fulfilled by using a $T_1$ or a $T_2$ control tag. Note that the second control tag $T_2$ may not exist. If this occurs, we will set $T_2 = T_1$. Thus there are four possible cases:
T1T1-case: Both requests are fulfilled by using their $T_1$ tags.
T1T2-case: The $(i, j)$-request is fulfilled by using its $T_1$ tag and the $(i', j')$-request by using its $T_2$ tag.
T2T1-case: The $(i, j)$-request is fulfilled by using its $T_2$ tag and the $(i', j')$-request by using its $T_1$ tag.
T2T2-case: Both requests are fulfilled by using their $T_2$ tags.
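The tag computation of Theorem 5 can be sketched in Python as follows. Two points are assumptions of this sketch rather than statements of the text: the constant $M$ is not defined in this part of the thesis and is taken here to be $M = N_0 - 2^n$, a choice that reproduces the control tag 6 for the (3, 4)-request of the 10 × 10 example; and the helper `destination`, like all function names here, is hypothetical. It follows a tag stage by stage using the stepping rule of formula (4) in Section 4.

```python
def stage_count(n0: int) -> int:
    """Number of stages n + 1 = ceil(log2(n0)) for an even n0 >= 2."""
    return (n0 - 1).bit_length()


def control_tags(i: int, j: int, n0: int):
    """Return (T1, T2) as in Theorem 5; T2 is None when no second tag exists."""
    stages = stage_count(n0)
    # ASSUMPTION: M = n0 - 2**n, inferred from the worked example, not stated in the text.
    m = n0 - 2 ** (stages - 1)
    t1 = (j + 2 * m * i) % n0
    t2 = t1 + n0 if t1 + n0 < 2 ** stages else None
    return t1, t2


def destination(i: int, tag: int, n0: int) -> int:
    """Follow the tag bits (most significant bit first) and return the output reached."""
    stages = stage_count(n0)
    port = i
    for stage in range(stages):
        bit = (tag >> (stages - 1 - stage)) & 1
        port = (2 * port) % n0 + bit
    return port


if __name__ == "__main__":
    print(control_tags(3, 4, 10))   # only T1 exists for this request
    print(control_tags(9, 9, 10))   # both T1 and T2 exist
```

Under these assumptions the (3, 4)-request of the 10 × 10 GSEN has $T_1 = 6$ and no second tag, while the (9, 9)-request has $T_1 = 5$ and $T_2 = 15$; both tags of the latter lead from input 9 to output 9.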
We run computer programs to obtain the number of node-conflicts and the number of link-conflicts. For convenience, let
$$x \in \{T_1T_1, T_1T_2, T_2T_1, T_2T_2\}. \tag{3}$$
Let $\mathrm{LCF}(i, j, x)$ denote the number of $(i', j')$-requests that have a link-conflict with the $(i, j)$-request when the $x$-case occurs. Let $\mathrm{NCF}(i, j, x)$ denote the number of $(i', j')$-requests that have a node-conflict with the $(i, j)$-request when the $x$-case occurs. For example, $\mathrm{NCF}(i, j, T_1T_2)$ denotes the number of $(i', j')$-requests that have a node-conflict with the $(i, j)$-request when the T1T2-case occurs. To describe the results of our analysis, we also define
$$\mathrm{TotalLCF}(i, *, x) = \sum_{j=0}^{N_0-1} \mathrm{LCF}(i, j, x), \qquad \mathrm{TotalNCF}(i, *, x) = \sum_{j=0}^{N_0-1} \mathrm{NCF}(i, j, x),$$
$$\mathrm{TotalLCF}(*, j, x) = \sum_{i=0}^{N_0-1} \mathrm{LCF}(i, j, x), \qquad \mathrm{TotalNCF}(*, j, x) = \sum_{i=0}^{N_0-1} \mathrm{NCF}(i, j, x).$$
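The counts LCF and NCF can be obtained by brute force. The Python sketch below is not the program used for the thesis; it is a minimal reconstruction under stated assumptions: $M = N_0 - 2^n$ (not defined in this text), $T_2 = T_1$ when no second tag exists, requests with $i' = i$ or $j' = j$ excluded, and conflicts detected as in Lemmas 3 and 4. All function names are hypothetical.

```python
def stage_count(n0: int) -> int:
    """Number of stages n + 1 = ceil(log2(n0))."""
    return (n0 - 1).bit_length()


def tags(i: int, j: int, n0: int):
    """Return (T1, T2), with T2 = T1 when the second tag does not exist."""
    stages = stage_count(n0)
    m = n0 - 2 ** (stages - 1)  # ASSUMPTION: M = N0 - 2^n
    t1 = (j + 2 * m * i) % n0
    t2 = t1 + n0 if t1 + n0 < 2 ** stages else t1
    return t1, t2


def path(i: int, tag: int, n0: int) -> list[int]:
    """Port sequence [R_{-1}, R_0, ..., R_n] obtained by following the tag."""
    stages = stage_count(n0)
    ports = [i]
    for stage in range(stages):
        bit = (tag >> (stages - 1 - stage)) & 1
        ports.append((2 * ports[-1]) % n0 + bit)
    return ports


def conflict_counts(i: int, j: int, case: str, n0: int):
    """Return (LCF(i, j, case), NCF(i, j, case)); case is 'T1T1', 'T1T2', 'T2T1' or 'T2T2'."""
    first = int(case[1]) - 1    # tag index used by the (i, j)-request
    second = int(case[3]) - 1   # tag index used by the (i', j')-request
    p = path(i, tags(i, j, n0)[first], n0)
    lcf = ncf = 0
    for i2 in range(n0):
        if i2 == i:
            continue
        for j2 in range(n0):
            if j2 == j:
                continue
            q = path(i2, tags(i2, j2, n0)[second], n0)
            if any(a == b for a, b in zip(p, q)):
                lcf += 1
            if any(a // 2 == b // 2 for a, b in zip(p[1:], q[1:])):
                ncf += 1
    return lcf, ncf


if __name__ == "__main__":
    print(conflict_counts(0, 0, "T1T1", 18))
    print(conflict_counts(0, 0, "T1T2", 18))
```

For $N_0 = 18$ the sketch gives $\mathrm{LCF}(0, 0, T_1T_1) = 27$, $\mathrm{NCF}(0, 0, T_1T_1) = 53$ and $\mathrm{LCF}(0, 0, T_1T_2) = 26$, which agree with the corresponding entries of Tables 1, 5 and 2. The totals are then plain sums of these counts over $j$ or over $i$.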
For convenience, if an $(i, j)$-path is obtained from control tag $T_1$, then we say it is a $T_1$-$(i, j)$-path, and if it is obtained from control tag $T_2$, then we say it is a $T_2$-$(i, j)$-path. We have two lemmas.
Lemma 6. If there is a $T_\ell$-$(i, j)$-path with port sequence $(i, R_0, R_1, R_2, \ldots, R_{n-1}, j)$, then there is a $T_{3-\ell}$-$(N_0 - 1 - i, N_0 - 1 - j)$-path with port sequence $(N_0 - 1 - i, N_0 - 1 - R_0, N_0 - 1 - R_1, N_0 - 1 - R_2, \ldots, N_0 - 1 - R_{n-1}, N_0 - 1 - j)$ for $\ell = 1, 2$.
Proof. This lemma follows from the fact that if we fold a GSEN so that its upper boundary coincides with its lower boundary, then a $T_\ell$-$(i, j)$-path with port sequence $(i, R_0, R_1, R_2, \ldots, R_{n-1}, j)$ will coincide with a $T_{3-\ell}$-$(N_0 - 1 - i, N_0 - 1 - j)$-path with port sequence $(N_0 - 1 - i, N_0 - 1 - R_0, N_0 - 1 - R_1, N_0 - 1 - R_2, \ldots, N_0 - 1 - R_{n-1}, N_0 - 1 - j)$.
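The folding argument of Lemma 6 can be checked numerically for small networks. The following Python sketch mirrors every path and compares it with the path obtained from the other tag of the mirrored request. As before, $M = N_0 - 2^n$ and the convention $T_2 = T_1$ when no second tag exists are assumptions of the sketch, and the function names are hypothetical.

```python
def stage_count(n0: int) -> int:
    """Number of stages n + 1 = ceil(log2(n0))."""
    return (n0 - 1).bit_length()


def tags(i: int, j: int, n0: int):
    """Return (T1, T2), with T2 = T1 when the second tag does not exist."""
    stages = stage_count(n0)
    m = n0 - 2 ** (stages - 1)  # ASSUMPTION: M = N0 - 2^n
    t1 = (j + 2 * m * i) % n0
    t2 = t1 + n0 if t1 + n0 < 2 ** stages else t1
    return t1, t2


def path(i: int, tag: int, n0: int) -> list[int]:
    """Port sequence obtained by following the tag bits, most significant bit first."""
    stages = stage_count(n0)
    ports = [i]
    for stage in range(stages):
        bit = (tag >> (stages - 1 - stage)) & 1
        ports.append((2 * ports[-1]) % n0 + bit)
    return ports


def lemma6_holds(n0: int) -> bool:
    """Check that mirroring a T_l path gives the T_(3-l) path of the mirrored request."""
    for i in range(n0):
        for j in range(n0):
            for ell in (0, 1):  # 0 stands for T1, 1 for T2
                mirrored = [n0 - 1 - port for port in path(i, tags(i, j, n0)[ell], n0)]
                other = tags(n0 - 1 - i, n0 - 1 - j, n0)[1 - ell]
                if mirrored != path(n0 - 1 - i, other, n0):
                    return False
    return True


if __name__ == "__main__":
    print(all(lemma6_holds(n0) for n0 in (4, 6, 8, 10, 18)))
```

The check relies on the fact that mirroring a port $R$ to $N_0 - 1 - R$ flips every tag bit, because $N_0$ is even.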
Lemma 7. If there is a link-conflict between the $T_\ell$-$(i, j)$-path and the $T_m$-$(i', j')$-path, then there is a link-conflict between the $T_{3-\ell}$-$(N_0 - 1 - i, N_0 - 1 - j)$-path and the $T_{3-m}$-$(N_0 - 1 - i', N_0 - 1 - j')$-path for $\ell = 1, 2$ and $m = 1, 2$.
Proof. Let the port sequences of the $T_\ell$-$(i, j)$-path $P$ and the $T_m$-$(i', j')$-path $Q$ be $(i, P_0, P_1, P_2, \ldots, P_{n-1}, j)$ and $(i', Q_0, Q_1, Q_2, \ldots, Q_{n-1}, j')$, respectively. By Lemma 3, there exists $-1 \le k \le n$ such that $P_k = Q_k$, which implies $N_0 - 1 - P_k = N_0 - 1 - Q_k$. This lemma now follows from Lemmas 3 and 6.
The above two lemmas lead to the following two corollaries.
Corollary 8.
(A) LCF(i, j, T1T1) = LCF(N0− 1 − i, N0− 1 − j, T2T2)
(B) NCF(i, j, T1T1) = NCF(N0− 1 − i, N0− 1 − j, T2T2)
(C) LCF(i, j, T1T2) = LCF(N0− 1 − i, N0− 1 − j, T2T1)
(D) NCF(i, j, T1T2) = NCF(N0− 1 − i, N0− 1 − j, T2T1)
Proof. The first statement of this corollary follows from Lemma 7. The other statements of this corollary can be proven in a similar way and we omit their proofs.
Corollary 9.
(a) TotalLCF(i, ∗, T1T1) = TotalLCF(N0− 1 − i, ∗, T2T2)
(b) TotalLCF(∗, j, T1T1) = TotalLCF(∗, N0− 1 − j, T2T2)
(c) TotalNCF(i, ∗, T1T1) = TotalNCF(N0− 1 − i, ∗, T2T2)
(d) TotalNCF(∗, j, T1T1) = TotalNCF(∗, N0− 1 − j, T2T2)
(e) TotalLCF(i, ∗, T1T2) = TotalLCF(N0− 1 − i, ∗, T2T1)
(f) TotalLCF(∗, j, T1T2) = TotalLCF(∗, N0− 1 − j, T2T1)
(g) TotalNCF(i, ∗, T1T2) = TotalNCF(N0− 1 − i, ∗, T2T1)
(h) TotalNCF(∗, j, T1T2) = TotalNCF(∗, N0− 1 − j, T2T1)
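Proof.
(a) holds since:
$$\begin{aligned}
\mathrm{TotalLCF}(i, *, T_1T_1) &= \sum_{j=0}^{N_0-1} \mathrm{LCF}(i, j, T_1T_1) && \text{(by definition)} \\
&= \sum_{j=0}^{N_0-1} \mathrm{LCF}(N_0 - 1 - i, N_0 - 1 - j, T_2T_2) && \text{(by (A) of Corollary 8)} \\
&= \mathrm{TotalLCF}(N_0 - 1 - i, *, T_2T_2) && \text{(by definition).}
\end{aligned}$$
(b) holds since:
$$\begin{aligned}
\mathrm{TotalLCF}(*, j, T_1T_1) &= \sum_{i=0}^{N_0-1} \mathrm{LCF}(i, j, T_1T_1) && \text{(by definition)} \\
&= \sum_{i=0}^{N_0-1} \mathrm{LCF}(N_0 - 1 - i, N_0 - 1 - j, T_2T_2) && \text{(by (A) of Corollary 8)} \\
&= \mathrm{TotalLCF}(*, N_0 - 1 - j, T_2T_2) && \text{(by definition).}
\end{aligned}$$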
(c) and (d) can be proven in a similar way except (B), instead of (A), of Corollary 8 is used. (e) and (f) can be proven in a similar way except (C), instead of (A), of Corollary 8 is used. (g) and (h) can also be proven in a similar way except (D), instead of (A), of Corollary 8 is used.
To obtain $\mathrm{LCF}(i, j, x)$ or $\mathrm{NCF}(i, j, x)$, we need to test $(N_0 - 1) \times (N_0 - 1)$ $(i', j')$-requests. We have run computer programs for each $N_0 = 4, 6, \ldots, 46$ and have obtained the following values:
LCF(i, j, x) for all i, j, x;
TotalLCF(i, ∗, x) for all i, x;
TotalLCF(∗, j, x) for all j, x;
NCF(i, j, x) for all i, j, x;
TotalNCF(i, ∗, x) for all i, x;
TotalNCF(∗, j, x) for all j, x.
From our computer output, we have the following observations.
Observation 1. For each $x \in \{T_1T_1, T_2T_2, T_1T_2, T_2T_1\}$,
$$\mathrm{LCF}(i, j, x) = \mathrm{LCF}(0, j + 2Mi, x), \qquad \mathrm{NCF}(i, j, x) = \mathrm{NCF}(0, j + 2Mi, x).$$
Observation 2. For each $x \in \{T_1T_1, T_2T_2, T_1T_2, T_2T_1\}$,
$$\mathrm{TotalLCF}(i_1, *, x) = \mathrm{TotalLCF}(i_2, *, x) = \mathrm{TotalLCF}(*, j_1, x) = \mathrm{TotalLCF}(*, j_2, x),$$
$$\mathrm{TotalNCF}(i_1, *, x) = \mathrm{TotalNCF}(i_2, *, x) = \mathrm{TotalNCF}(*, j_1, x) = \mathrm{TotalNCF}(*, j_2, x).$$
Observation 3. $\mathrm{TotalLCF}(i, *, T_1T_1) = \mathrm{TotalLCF}(i, *, T_1T_2)$ whenever $N_0 = 2^{n+1}$.
Observation 4. $\mathrm{TotalNCF}(i, *, T_1T_1) = \mathrm{TotalNCF}(i, *, T_1T_2)$ whenever $N_0 = 2^{n+1}$.
Observation 6. $\mathrm{TotalNCF}(i, *, T_1T_1) > \mathrm{TotalNCF}(i, *, T_1T_2)$ whenever $N_0 \ne 2^{n+1}$ and $4 \mid N_0$.
Observation 7. $\mathrm{TotalNCF}(i, *, T_1T_1) < \mathrm{TotalNCF}(i, *, T_1T_2)$ whenever $N_0 \ne 2^{n+1}$ and $4 \nmid N_0$.
From the above, we also have the following equalities.
TotalLCF(i, ∗, T1T1) = TotalLCF(0, ∗, T1T1) (by Observation 2)
TotalLCF(∗, j, T1T1) = TotalLCF(0, ∗, T1T1) (by Observation 2)
TotalLCF(i, ∗, T2T2) = TotalLCF(0, ∗, T1T1) (by Observation 2 and Corollary 9)
TotalLCF(∗, j, T2T2) = TotalLCF(0, ∗, T1T1) (by Observation 2 and Corollary 9)
TotalNCF(i, ∗, T1T1) = TotalNCF(0, ∗, T1T1) (by Observation 2)
TotalNCF(∗, j, T1T1) = TotalNCF(0, ∗, T1T1) (by Observation 2)
TotalNCF(i, ∗, T2T2) = TotalNCF(0, ∗, T1T1) (by Observation 2 and Corollary 9)
TotalNCF(∗, j, T2T2) = TotalNCF(0, ∗, T1T1) (by Observation 2 and Corollary 9)
TotalLCF(i, ∗, T1T2) = TotalLCF(0, ∗, T1T2) (by Observation 2)
TotalLCF(∗, j, T1T2) = TotalLCF(0, ∗, T1T2) (by Observation 2)
TotalLCF(i, ∗, T2T1) = TotalLCF(0, ∗, T1T2) (by Observation 2 and Corollary 9)
TotalLCF(∗, j, T2T1) = TotalLCF(0, ∗, T1T2) (by Observation 2 and Corollary 9)
TotalNCF(i, ∗, T1T2) = TotalNCF(0, ∗, T1T2) (by Observation 2)
TotalNCF(∗, j, T1T2) = TotalNCF(0, ∗, T1T2) (by Observation 2)
TotalNCF(i, ∗, T2T1) = TotalNCF(0, ∗, T1T2) (by Observation 2 and Corollary 9)
TotalNCF(∗, j, T2T1) = TotalNCF(0, ∗, T1T2) (by Observation 2 and Corollary 9)
In the following, we list the computer output when $N_0 = 18$. The values of each LCF(i, j, T1T1), LCF(i, j, T1T2), LCF(i, j, T2T1), LCF(i, j, T2T2), NCF(i, j, T1T1), NCF(i, j, T1T2), NCF(i, j, T2T1), and NCF(i, j, T2T2) are given in Tables 1 to 8.
Table 1: Each (i, j)-entry in the table is LCF(i, j, T1T1).
i\j   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
  0  27 27 29 29 31 31 29 29 27 27 29 29 27 27 25 25 25 25
  1  31 31 29 29 27 27 29 29 27 27 25 25 25 25 27 27 29 29
  2  27 27 29 29 27 27 25 25 25 25 27 27 29 29 31 31 29 29
  3  27 27 25 25 25 25 27 27 29 29 31 31 29 29 27 27 29 29
  4  25 25 27 27 29 29 31 31 29 29 27 27 29 29 27 27 25 25
  5  29 29 31 31 29 29 27 27 29 29 27 27 25 25 25 25 27 27
  6  29 29 27 27 29 29 27 27 25 25 25 25 27 27 29 29 31 31
  7  29 29 27 27 25 25 25 25 27 27 29 29 31 31 29 29 27 27
  8  25 25 25 25 27 27 29 29 31 31 29 29 27 27 29 29 27 27
  9  27 27 29 29 31 31 29 29 27 27 29 29 27 27 25 25 25 25
 10  31 31 29 29 27 27 29 29 27 27 25 25 25 25 27 27 29 29
 11  27 27 29 29 27 27 25 25 25 25 27 27 29 29 31 31 29 29
 12  27 27 25 25 25 25 27 27 29 29 31 31 29 29 27 27 29 29
 13  25 25 27 27 29 29 31 31 29 29 27 27 29 29 27 27 25 25
 14  29 29 31 31 29 29 27 27 29 29 27 27 25 25 25 25 27 27
 15  29 29 27 27 29 29 27 27 25 25 25 25 27 27 29 29 31 31
 16  29 29 27 27 25 25 25 25 27 27 29 29 31 31 29 29 27 27
 17  25 25 25 25 27 27 29 29 31 31 29 29 27 27 29 29 27 27
Table 2: Each (i, j)-entry in the table is LCF(i, j, T1T2).
i\j   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
  0  26 26 28 28 30 30 28 28 26 26 28 28 26 26 25 25 25 25
  1  30 30 28 28 26 26 28 28 26 26 25 25 25 25 26 26 28 28
  2  26 26 28 28 26 26 25 25 25 25 26 26 28 28 30 30 28 28
  3  26 26 25 25 25 25 26 26 28 28 30 30 28 28 26 26 28 28
  4  25 25 26 26 28 28 30 30 28 28 26 26 28 28 26 26 25 25
  5  28 28 30 30 28 28 26 26 28 28 26 26 25 25 25 25 26 26
  6  28 28 26 26 28 28 26 26 25 25 25 25 26 26 28 28 30 30
  7  28 28 26 26 25 25 25 25 26 26 28 28 30 30 28 28 26 26
  8  25 25 25 25 26 26 28 28 30 30 28 28 26 26 28 28 26 26
  9  26 26 28 28 30 30 28 28 26 26 28 28 26 26 25 25 25 25
 10  30 30 28 28 26 26 28 28 26 26 25 25 25 25 26 26 28 28
 11  26 26 28 28 26 26 25 25 25 25 26 26 28 28 30 30 28 28
 12  26 26 25 25 25 25 26 26 28 28 30 30 28 28 26 26 28 28
 13  25 25 26 26 28 28 30 30 28 28 26 26 28 28 26 26 25 25
 14  28 28 30 30 28 28 26 26 28 28 26 26 25 25 25 25 26 26
 15  28 28 26 26 28 28 26 26 25 25 25 25 26 26 28 28 30 30
 16  28 28 26 26 25 25 25 25 26 26 28 28 30 30 28 28 26 26
 17  25 25 25 25 26 26 28 28 30 30 28 28 26 26 28 28 26 26
Table 3: Each (i, j)-entry in the table is LCF(i, j, T2T1).
i\j   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
  0  26 26 28 28 26 26 28 28 30 30 28 28 26 26 25 25 25 25
  1  26 26 28 28 30 30 28 28 26 26 25 25 25 25 26 26 28 28
  2  30 30 28 28 26 26 25 25 25 25 26 26 28 28 26 26 28 28
  3  26 26 25 25 25 25 26 26 28 28 26 26 28 28 30 30 28 28
  4  25 25 26 26 28 28 26 26 28 28 30 30 28 28 26 26 25 25
  5  28 28 26 26 28 28 30 30 28 28 26 26 25 25 25 25 26 26
  6  28 28 30 30 28 28 26 26 25 25 25 25 26 26 28 28 26 26
  7  28 28 26 26 25 25 25 25 26 26 28 28 26 26 28 28 30 30
  8  25 25 25 25 26 26 28 28 26 26 28 28 30 30 28 28 26 26
  9  26 26 28 28 26 26 28 28 30 30 28 28 26 26 25 25 25 25
 10  26 26 28 28 30 30 28 28 26 26 25 25 25 25 26 26 28 28
 11  30 30 28 28 26 26 25 25 25 25 26 26 28 28 26 26 28 28
 12  26 26 25 25 25 25 26 26 28 28 26 26 28 28 30 30 28 28
 13  25 25 26 26 28 28 26 26 28 28 30 30 28 28 26 26 25 25
 14  28 28 26 26 28 28 30 30 28 28 26 26 25 25 25 25 26 26
 15  28 28 30 30 28 28 26 26 25 25 25 25 26 26 28 28 26 26
 16  28 28 26 26 25 25 25 25 26 26 28 28 26 26 28 28 30 30
 17  25 25 25 25 26 26 28 28 26 26 28 28 30 30 28 28 26 26
Table 4: Each (i, j)-entry in the table is LCF(i, j, T2T2).
i\j   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
  0  27 27 29 29 27 27 29 29 31 31 29 29 27 27 25 25 25 25
  1  27 27 29 29 31 31 29 29 27 27 25 25 25 25 27 27 29 29
  2  31 31 29 29 27 27 25 25 25 25 27 27 29 29 27 27 29 29
  3  27 27 25 25 25 25 27 27 29 29 27 27 29 29 31 31 29 29
  4  25 25 27 27 29 29 27 27 29 29 31 31 29 29 27 27 25 25
  5  29 29 27 27 29 29 31 31 29 29 27 27 25 25 25 25 27 27
  6  29 29 31 31 29 29 27 27 25 25 25 25 27 27 29 29 27 27
  7  29 29 27 27 25 25 25 25 27 27 29 29 27 27 29 29 31 31
  8  25 25 25 25 27 27 29 29 27 27 29 29 31 31 29 29 27 27
  9  27 27 29 29 27 27 29 29 31 31 29 29 27 27 25 25 25 25
 10  27 27 29 29 31 31 29 29 27 27 25 25 25 25 27 27 29 29
 11  31 31 29 29 27 27 25 25 25 25 27 27 29 29 27 27 29 29
 12  27 27 25 25 25 25 27 27 29 29 27 27 29 29 31 31 29 29
 13  25 25 27 27 29 29 27 27 29 29 31 31 29 29 27 27 25 25
 14  29 29 27 27 29 29 31 31 29 29 27 27 25 25 25 25 27 27
 15  29 29 31 31 29 29 27 27 25 25 25 25 27 27 29 29 27 27
 16  29 29 27 27 25 25 25 25 27 27 29 29 27 27 29 29 31 31
 17  25 25 25 25 27 27 29 29 27 27 29 29 31 31 29 29 27 27
Table 5: Each (i, j)-entry in the table is NCF(i, j, T1T1).
i\j   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
  0  53 53 57 57 57 57 61 61 57 57 61 61 61 61 65 65 81 81
  1  57 57 61 61 57 57 61 61 61 61 65 65 81 81 53 53 57 57
  2  57 57 61 61 61 61 65 65 81 81 53 53 57 57 57 57 61 61
  3  61 61 65 65 81 81 53 53 57 57 57 57 61 61 57 57 61 61
  4  81 81 53 53 57 57 57 57 61 61 57 57 61 61 61 61 65 65
  5  57 57 57 57 61 61 57 57 61 61 61 61 65 65 81 81 53 53
  6  61 61 57 57 61 61 61 61 65 65 81 81 53 53 57 57 57 57
  7  61 61 61 61 65 65 81 81 53 53 57 57 57 57 61 61 57 57
  8  65 65 81 81 53 53 57 57 57 57 61 61 57 57 61 61 61 61
  9  53 53 57 57 57 57 61 61 57 57 61 61 61 61 65 65 81 81
 10  57 57 61 61 57 57 61 61 61 61 65 65 81 81 53 53 57 57
 11  57 57 61 61 61 61 65 65 81 81 53 53 57 57 57 57 61 61
 12  61 61 65 65 81 81 53 53 57 57 57 57 61 61 57 57 61 61
 13  81 81 53 53 57 57 57 57 61 61 57 57 61 61 61 61 65 65
 14  57 57 57 57 61 61 57 57 61 61 61 61 65 65 81 81 53 53
 15  61 61 57 57 61 61 61 61 65 65 81 81 53 53 57 57 57 57
 16  61 61 61 61 65 65 81 81 53 53 57 57 57 57 61 61 57 57
 17  65 65 81 81 53 53 57 57 57 57 61 61 57 57 61 61 61 61
Table 6: Each (i, j)-entry in the table is NCF(i, j, T1T2).
i\j   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
  0  89 89 85 85 85 85 81 81 85 85 81 81 81 81 81 81 65 65
  1  85 85 81 81 85 85 81 81 81 81 81 81 65 65 89 89 85 85
  2  85 85 81 81 81 81 81 81 65 65 89 89 85 85 85 85 81 81
  3  81 81 81 81 65 65 89 89 85 85 85 85 81 81 85 85 81 81
  4  65 65 89 89 85 85 85 85 81 81 85 85 81 81 81 81 81 81
  5  85 85 85 85 81 81 85 85 81 81 81 81 81 81 65 65 89 89
  6  81 81 85 85 81 81 81 81 81 81 65 65 89 89 85 85 85 85
  7  81 81 81 81 81 81 65 65 89 89 85 85 85 85 81 81 85 85
  8  81 81 65 65 89 89 85 85 85 85 81 81 85 85 81 81 81 81
  9  89 89 85 85 85 85 81 81 85 85 81 81 81 81 81 81 65 65
 10  85 85 81 81 85 85 81 81 81 81 81 81 65 65 89 89 85 85
 11  85 85 81 81 81 81 81 81 65 65 89 89 85 85 85 85 81 81
 12  81 81 81 81 65 65 89 89 85 85 85 85 81 81 85 85 81 81
 13  65 65 89 89 85 85 85 85 81 81 85 85 81 81 81 81 81 81
 14  85 85 85 85 81 81 85 85 81 81 81 81 81 81 65 65 89 89
 15  81 81 85 85 81 81 81 81 81 81 65 65 89 89 85 85 85 85
 16  81 81 81 81 81 81 65 65 89 89 85 85 85 85 81 81 85 85
 17  81 81 65 65 89 89 85 85 85 85 81 81 85 85 81 81 81 81
Table 7: Each (i, j)-entry in the table is NCF(i, j, T2T1).
i\j   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
  0  81 81 81 81 85 85 81 81 85 85 85 85 89 89 65 65 81 81
  1  85 85 81 81 85 85 85 85 89 89 65 65 81 81 81 81 81 81
  2  85 85 85 85 89 89 65 65 81 81 81 81 81 81 85 85 81 81
  3  89 89 65 65 81 81 81 81 81 81 85 85 81 81 85 85 85 85
  4  81 81 81 81 81 81 85 85 81 81 85 85 85 85 89 89 65 65
  5  81 81 85 85 81 81 85 85 85 85 89 89 65 65 81 81 81 81
  6  81 81 85 85 85 85 89 89 65 65 81 81 81 81 81 81 85 85
  7  85 85 89 89 65 65 81 81 81 81 81 81 85 85 81 81 85 85
  8  65 65 81 81 81 81 81 81 85 85 81 81 85 85 85 85 89 89
  9  81 81 81 81 85 85 81 81 85 85 85 85 89 89 65 65 81 81
 10  85 85 81 81 85 85 85 85 89 89 65 65 81 81 81 81 81 81
 11  85 85 85 85 89 89 65 65 81 81 81 81 81 81 85 85 81 81
 12  89 89 65 65 81 81 81 81 81 81 85 85 81 81 85 85 85 85
 13  81 81 81 81 81 81 85 85 81 81 85 85 85 85 89 89 65 65
 14  81 81 85 85 81 81 85 85 85 85 89 89 65 65 81 81 81 81
 15  81 81 85 85 85 85 89 89 65 65 81 81 81 81 81 81 85 85
 16  85 85 89 89 65 65 81 81 81 81 81 81 85 85 81 81 85 85
 17  65 65 81 81 81 81 81 81 85 85 81 81 85 85 85 85 89 89
Table 8: Each (i, j)-entry in the table is NCF(i, j, T2T2).
i\j   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
  0  61 61 61 61 57 57 61 61 57 57 57 57 53 53 81 81 65 65
  1  57 57 61 61 57 57 57 57 53 53 81 81 65 65 61 61 61 61
  2  57 57 57 57 53 53 81 81 65 65 61 61 61 61 57 57 61 61
  3  53 53 81 81 65 65 61 61 61 61 57 57 61 61 57 57 57 57
  4  65 65 61 61 61 61 57 57 61 61 57 57 57 57 53 53 81 81
  5  61 61 57 57 61 61 57 57 57 57 53 53 81 81 65 65 61 61
  6  61 61 57 57 57 57 53 53 81 81 65 65 61 61 61 61 57 57
  7  57 57 53 53 81 81 65 65 61 61 61 61 57 57 61 61 57 57
  8  81 81 65 65 61 61 61 61 57 57 61 61 57 57 57 57 53 53
  9  61 61 61 61 57 57 61 61 57 57 57 57 53 53 81 81 65 65
 10  57 57 61 61 57 57 57 57 53 53 81 81 65 65 61 61 61 61
 11  57 57 57 57 53 53 81 81 65 65 61 61 61 61 57 57 61 61
 12  53 53 81 81 65 65 61 61 61 61 57 57 61 61 57 57 57 57
 13  65 65 61 61 61 61 57 57 61 61 57 57 57 57 53 53 81 81
 14  61 61 57 57 61 61 57 57 57 57 53 53 81 81 65 65 61 61
 15  61 61 57 57 57 57 53 53 81 81 65 65 61 61 61 61 57 57
 16  57 57 53 53 81 81 65 65 61 61 61 61 57 57 61 61 57 57
 17  81 81 65 65 61 61 61 61 57 57 61 61 57 57 57 57 53 53
4 The analysis of CLQ-algorithm
In this section, we will analyze the performance of CLQ-algorithm. In [2], Chen et al. proposed the CLQ-algorithm for computing the control tag of the backward network of a GSEN. Suppose a request is to send a message from port $j$ (on the right-hand side) to port $i$ (on the left-hand side). Then the inputs to CLQ-algorithm are $i$ and $j$, and the output of this algorithm is the backward control tag $S$ for sending a message from $j$ to $i$. CLQ-algorithm first computes the forward control tag $T$, which can be used to send a message from $i$ to $j$; then, it converts $T$ into $S$. The following is CLQ-algorithm; note that we assume $k = 2$.
BEGIN-of-CLQ-algorithm
Input: $i$ on the left-hand side and $j$ on the right-hand side of a bidirectional GSEN.
Output: The backward control tag $S$, which can be used to send a message from $j$ to $i$.
1. Use P-algorithm to obtain a forward control tag $T$ $(t_0 t_1 t_2 \cdots t_{n-1} t_n)$.
2. Get the sequence $R_\ell$ $(0 \le \ell \le n)$ in the path based on tag $T$ in the forward direction by the following formula:
$$R_\ell = \begin{cases} (k \cdot i \bmod N_0) + t_0, & \ell = 0, \\ (k \cdot R_{\ell-1} \bmod N_0) + t_\ell, & 1 \le \ell \le n. \end{cases} \tag{4}$$
3. Use the sequence $R_\ell$ and tag $T$ to get tag $S$ by the following formula:
$$s_\ell = \begin{cases} \left\lfloor \dfrac{k \cdot i}{N_0} \right\rfloor, & \ell = 0, \\ \left\lfloor \dfrac{k \cdot R_{\ell-1}}{N_0} \right\rfloor, & 1 \le \ell \le n. \end{cases} \tag{5}$$
END-of-CLQ-algorithm
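Steps 2 and 3 of CLQ-algorithm can be sketched in Python as follows, for the case $k = 2$ assumed in the text. The function name `clq_backward_tag` and the representation of a tag as a list of bits $[t_0, \ldots, t_n]$ are choices made for this sketch; step 1 (P-algorithm) is not repeated here, so the forward tag is passed in as an argument.

```python
K = 2  # size of the switching elements, as assumed in the text


def clq_backward_tag(i: int, forward_bits: list[int], n0: int):
    """Return ([R_0, ..., R_n], [s_0, ..., s_n]) following formulas (4) and (5)."""
    # Formula (4): ports to the right of each stage on the forward path.
    ports = []
    previous = i
    for t in forward_bits:
        previous = (K * previous) % n0 + t
        ports.append(previous)
    # Formula (5): s_0 uses the input i, s_l uses R_(l-1) for l >= 1.
    backward_bits = [(K * i) // n0]
    for level in range(1, len(forward_bits)):
        backward_bits.append((K * ports[level - 1]) // n0)
    return ports, backward_bits


if __name__ == "__main__":
    # Forward tag 0110 of the (3, 4)-request in the 10 x 10 GSEN.
    print(clq_backward_tag(3, [0, 1, 1, 0], 10))
```

For the (3, 4)-request with forward tag 0110, the sketch returns the ports (6, 3, 7, 4) and the backward tag 0101. Each bit $s_\ell$ is the sub port on the left side of the stage-$\ell$ switching element through which the forward path entered, which is the sub port a message travelling from $j$ back to $i$ has to leave by.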
We now analyze the performance of CLQ-algorithm. Suppose the two requests that are sent simultaneously are the $(i, j)$-request and the $(i', j')$-request. For each of the two requests, CLQ-algorithm first uses P-algorithm to obtain a forward control tag; then, it converts the forward control tag into a backward control tag. Let $T$ and $T'$ be the forward control tags obtained by P-algorithm for the $(i, j)$-request and the $(i', j')$-request, respectively. Also, let $P$ and $Q$ be the paths corresponding to $T$ and $T'$, respectively. Since CLQ-algorithm fulfills the $(i, j)$-request by using $P$ in reverse and fulfills the $(i', j')$-request by using $Q$ in reverse, we have the following lemma.
Lemma 10. The reverse path of $P$ and the reverse path of $Q$ have a link-conflict (node-conflict) if and only if $P$ and $Q$ have a link-conflict (node-conflict).
Proof. This lemma follows from the above discussion.
By Lemma 10, the analysis of P-algorithm can be used to obtain the analysis of CLQ-algorithm. In particular, for N0 = 4, 6, · · · , 46, Observations 1 to 7 and Tables 1 to 10
also hold for CLQ-algorithm.
5 The analysis of CL-algorithm
In this section, we will analyze the performance of CL-algorithm. In [3], Chen and Lou proposed the CL-algorithm for computing the control tag of the backward network of a GSEN. They showed that the backward network has a wonderful property: For each destination i, there are two backward control tags associated with it such that every source j can get to i by using one of the two tags.
In CL-algorithm, the switching elements are assumed to be of size $k \times k$. CL-algorithm is based on the following observations: At stage 0, only one switching element can get to i. At stage 1, exactly k switching elements can get to i and these switching elements are consecutive. At stage 2, exactly $k^2$ switching elements can get to i and these switching elements are consecutive. In general, at stage $\ell$, $0 \le \ell \le n - 1$, exactly $k^\ell$ switching elements can get to i and these switching elements are consecutive. At stage n (the last stage), all the switching elements can get to i. Since at stage $\ell$ the switching elements that can get to i are consecutive, CL-algorithm only stores the label of the first one, denoted by $C_\ell$. Let the critical value v(i) associated with i be $v(i) = C_n \times k$. The following is CL-algorithm;
note that we assume k = 2.
BEGIN-of-CL-algorithm
Input: i on the left-hand side of a bidirectional GSEN.
Output: The critical value v(i) and the two backward control tags $s_0 s_1 \cdots s_n$ and $s'_0 s'_1 \cdots s'_n$ associated with i (here $s_\ell$ and $s'_\ell$ are used at stage $\ell$).
1. /* Compute $C_0, C_1, \ldots, C_n$. */
   for $\ell = 0$ to n do $C_\ell \leftarrow (i \times k^\ell) \bmod r$;
2. /* Compute the critical value v(i). */
   $v(i) \leftarrow C_n \times k$;
3. /* Compute $F_0, F_1, \ldots, F_n$. */
   if $(r - C_{n-1}) \times k \ge r$ then
   begin
      for $\ell = 0$ to $n - 1$ do $F_\ell \leftarrow 0$;
      $F_n \leftarrow 1$;
   end
   else
      for $\ell = 0$ to n do if $C_\ell + k^\ell > r$ then $F_\ell \leftarrow 1$ else $F_\ell \leftarrow 0$;
4. /* Compute the tag $s'_0 s'_1 \cdots s'_n$. */
   $s'_0 \leftarrow \lfloor i / r \rfloor$;
   for $\ell = 1$ to n do $s'_\ell \leftarrow \lfloor k \times C_{\ell-1} / r \rfloor$;
5. /* Compute the tag $s_0 s_1 \cdots s_n$. */
   for $\ell = 0$ to n do $s_\ell \leftarrow (s'_\ell + F_\ell) \bmod k$;
END-of-CL-algorithm
Chen and Lou proved that:
Theorem 11. [3] If $j < v(i)$, then j can get to i by using the control tag $s_0 s_1 \cdots s_n$; if $j \ge v(i)$, then j can get to i by using the control tag $s'_0 s'_1 \cdots s'_n$ (here $s_\ell$ and $s'_\ell$ are used at stage $\ell$).
We now analyze the performance of CL-algorithm. Suppose the two requests that are sent simultaneously are the (i, j)-request and the $(i', j')$-request. Also suppose the (i, j)-request is fulfilled by using the control tag S and the $(i', j')$-request by using the control tag $S'$. Let P and Q be the paths corresponding to S and $S'$, respectively. We ran computer programs for $N' = 4, 6, \ldots, 46$ and found that, for these $N'$, the result of using P is identical to the result of using control tag $T_1$ for the (i, j)-request in the forward network; also, the result of using Q is identical to the result of using control tag $T_1$ for the $(i', j')$-request in the forward network. In particular, for $N' = 4, 6, \ldots, 46$, Observations 1 and 2 (when x is $T_1T_1$) and Tables 1 and 5 also hold for CL-algorithm.
6 Concluding remarks
A multistage interconnection network enables processors to send their messages concurrently. When two routing requests occur simultaneously, a link-conflict or a node-conflict may occur. In this thesis, we analyze the performance of three tag-based routing algorithms of GSENs: P-algorithm, CLQ-algorithm, and CL-algorithm. P-algorithm can be used in a GSEN and in the forward network of a GSEN; CLQ-algorithm and CL-algorithm can be used in the backward network of a GSEN. We focus on the case in which two routing requests occur simultaneously, and we consider both the link-conflict and the node-conflict.
We have run computer programs for $N' = 4, 6, \ldots, 46$. From the computer output, the following observations have been obtained for P-algorithm and CLQ-algorithm:
• $\mathrm{TotalNCF}(i, *, T_1T_1) > \mathrm{TotalNCF}(i, *, T_1T_2)$ whenever $N' \neq 2^{n+1}$ and $4 \mid N'$.
• $\mathrm{TotalNCF}(i, *, T_1T_1) < \mathrm{TotalNCF}(i, *, T_1T_2)$ whenever $N' \neq 2^{n+1}$ and $4 \nmid N'$.
The above observations tell us that, when P-algorithm or CLQ-algorithm is used, two requests are more likely to have
• a link-conflict if $N' \neq 2^{n+1}$ and both requests are fulfilled by using the same type of control tag (that is, if both of them are fulfilled by using their $T_1$ control tags or both by using their $T_2$ control tags);
• a node-conflict if $N' \neq 2^{n+1}$, $4 \mid N'$, and both requests are fulfilled by using the same type of control tag;
• a node-conflict if $N' \neq 2^{n+1}$, $4 \nmid N'$, and the two requests are fulfilled by using different types of control tags.
The following table was obtained by Chen and Lou in [3].
The time required to                     CLQ-algorithm    CL-algorithm
find a tag for a j to get to i           $O(n)$           $O(n)$
find the tags for every j to get to i    $O(N'n)$         $O(n)$
construct the routing table              $O(N'^2 n)$      $O(N'n)$
This table tells us that: CL-algorithm is more efficient than CLQ-algorithm when we want to construct a routing table or when more than one control tag to the same destination needs to be found.
In this thesis, the following observation has been obtained for CL-algorithm: for $N' = 4, 6, \ldots, 46$, the result of using CL-algorithm is identical to the result of the $T_1T_1$-case. This observation suggests that, when there are two requests,
• when link-conflict is considered, CLQ-algorithm beats CL-algorithm if $N' \neq 2^{n+1}$;
• when node-conflict is considered, CLQ-algorithm beats CL-algorithm if $N' \neq 2^{n+1}$ and $4 \mid N'$;
• when node-conflict is considered, CL-algorithm beats CLQ-algorithm if $N' \neq 2^{n+1}$ and $4 \nmid N'$.
The percentages of link-conflicts and node-conflicts for $N' = 4, 6, \ldots, 46$ are listed in Tables 9 and 10; we also show these data in Figures 3 and 4. In Table 9 (Table 10), $T_1T_1$ means that both the first and the second request are routed by using their $T_1$ control tags; $T_1T_2$ means that the first request is routed by using its $T_1$ control tag and the second request by using its $T_2$ control tag; “arbitrary” means that a conflict is counted only when all four types, $T_1T_1$, $T_1T_2$, $T_2T_1$, and $T_2T_2$, fail to yield two link-disjoint (node-disjoint) routing paths.
Table 9: Percentage of link-conflicts.
$N'$   $T_1T_1$   $T_1T_2$   arbitrary
4      6.25       6.25       6.25
6      10.19      9.26       1.85
8      7.81       7.81       7.81
10     10.60      10.00      0.40
12     9.95       7.87       4.17
14     7.51       7.43       5.10
16     6.64       6.64       6.64
18     8.54       8.30       0.07
20     8.65       6.50       0.90
22     6.82       6.72       1.31
24     7.58       5.67       3.82
26     5.79       5.76       3.03
28     5.52       5.36       3.90
30     5.09       5.08       4.37
32     4.79       4.79       4.79
34     5.95       5.87       0.01
36     6.46       4.68       0.15
38     5.26       5.21       0.26
40     6.26       4.03       1.23
42     4.72       4.69       0.72
44     4.90       4.24       1.31
46     4.29       4.27       1.21

Table 10: Percentage of node-conflicts.
$N'$   $T_1T_1$   $T_1T_2$   arbitrary
4      31.25      31.25      31.25
6      32.41      36.11      32.41
8      26.56      26.56      26.56
10     26.60      33.00      20.20
12     24.77      23.84      16.44
14     21.50      22.08      19.75
16     19.14      19.14      19.14
18     18.96      25.14      11.01
20     19.85      19.25      9.65
22     17.04      19.37      11.63
24     17.77      15.68      11.98
26     15.08      15.85      12.26
28     14.27      14.19      11.86
30     13.39      13.47      12.56
32     12.60      12.60      12.60
34     12.46      16.82      5.79
36     13.86      13.62      5.39
38     11.85      14.42      5.93
40     13.46      11.31      5.71
42     11.11      12.69      6.32
44     11.21      11.11      5.71
46     10.37      11.39      6.74
[Plot omitted: percentage of link-conflicts (vertical axis, 0% to 12%) versus the number of nodes 4, 6, ..., 46 (horizontal axis), with one series each for T1T1, T1T2, and arbitrary.]
Figure 3: Percentage of link-conflicts.
[Plot omitted: percentage of node-conflicts (vertical axis, 0% to 40%) versus the number of nodes 4, 6, ..., 46 (horizontal axis), with one series each for T1T1, T1T2, and arbitrary.]
Figure 4: Percentage of node-conflicts.
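As a cross-check, the observations stated earlier in this section can be compared with the aggregate percentages in Tables 9 and 10. The Python sketch below copies the T1T1 and T1T2 columns of both tables and tests them row by row. Two points are assumptions of this sketch rather than statements from the thesis: the condition $N' \neq 2^{n+1}$ is read as "the number of ports is not a power of two", and the observations, which are stated for TotalNCF, are compared here with the aggregate percentages. The function name `violations` is hypothetical.

```python
# (T1T1, T1T2) percentages copied from Table 9 (link) and Table 10 (node).
LINK = {
    4: (6.25, 6.25), 6: (10.19, 9.26), 8: (7.81, 7.81), 10: (10.60, 10.00),
    12: (9.95, 7.87), 14: (7.51, 7.43), 16: (6.64, 6.64), 18: (8.54, 8.30),
    20: (8.65, 6.50), 22: (6.82, 6.72), 24: (7.58, 5.67), 26: (5.79, 5.76),
    28: (5.52, 5.36), 30: (5.09, 5.08), 32: (4.79, 4.79), 34: (5.95, 5.87),
    36: (6.46, 4.68), 38: (5.26, 5.21), 40: (6.26, 4.03), 42: (4.72, 4.69),
    44: (4.90, 4.24), 46: (4.29, 4.27),
}
NODE = {
    4: (31.25, 31.25), 6: (32.41, 36.11), 8: (26.56, 26.56), 10: (26.60, 33.00),
    12: (24.77, 23.84), 14: (21.50, 22.08), 16: (19.14, 19.14), 18: (18.96, 25.14),
    20: (19.85, 19.25), 22: (17.04, 19.37), 24: (17.77, 15.68), 26: (15.08, 15.85),
    28: (14.27, 14.19), 30: (13.39, 13.47), 32: (12.60, 12.60), 34: (12.46, 16.82),
    36: (13.86, 13.62), 38: (11.85, 14.42), 40: (13.46, 11.31), 42: (11.11, 12.69),
    44: (11.21, 11.11), 46: (10.37, 11.39),
}


def is_power_of_two(m):
    return m & (m - 1) == 0


def violations(link, node):
    """List the sizes whose table rows contradict the stated observations."""
    bad = []
    for size in sorted(link):
        (l11, l12), (n11, n12) = link[size], node[size]
        if is_power_of_two(size):
            ok = l11 == l12 and n11 == n12   # the two columns coincide in the tables
        elif size % 4 == 0:
            ok = l11 > l12 and n11 > n12     # same tag types conflict more often
        else:
            ok = l11 > l12 and n11 < n12     # node-conflicts favor T1T1 here
        if not ok:
            bad.append(size)
    return bad


print(violations(LINK, NODE))   # []
```

Every row of the two tables agrees with the observations under this reading, and for the sizes 4, 8, 16, and 32 the T1T1 and T1T2 columns are equal.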
References
[1] G. J. Chang, F. K. Hwang, and L. D. Tong, “Characterizing bit permutation networks,” Networks, vol. 33, no. 4, pp. 261-267, 1999.
[2] Z. Chen, Z. J. Liu, and Z. L. Qiu, “Bidirectional shuffle-exchange network and tag-based routing algorithm,” IEEE Commun. Lett., vol. 7, no. 3, pp. 121-123, 2003.
[3] C. Y. Chen and J. K. Lou, “An efficient tag-based routing algorithm for the backward network of a bidirectional general shuffle-exchange network,” IEEE Commun. Lett., vol. 10, no. 4, pp. 296-298, 2006.
[4] M. Gerla, E. Leonardi, F. Neri, and P. Palnati, “Routing in the bidirectional shufflenet,” IEEE-ACM Trans. Netw., vol. 9, no. 1, pp. 91-103, 2001.
[5] F. K. Hwang, “The mathematical theory of nonblocking switching networks,” Series on Applied Mathematics, vol. 15, 2004.
[6] C. P. Kruskal, “A unified theory of interconnection network structure,” Theoretical Computer Science, vol. 48, pp. 75-94, 1986.
[7] D. H. Lawrie, “Access and alignment of data in an array processor,” IEEE Trans. Comput., vol. 24, no. 12, pp. 1145-1155, 1975.
[8] V. W. Liu, C. Y. Chen, and R. B. Chen, “Optimal all-to-all personalized exchange in d-nary banyan multistage interconnection networks,” J. Comb. Optim., vol. 14, pp. 131-142, 2007.
[9] A. Massini, “All-to-all personalized communication on multistage interconnection networks,” Discrete Appl. Math., vol. 128, no. 2, pp. 435-446, 2003.
[10] K. Padmanabhan, “Design and analysis of even-sized binary shuffle-exchange networks for multiprocessors,” IEEE Trans. Parallel Distrib. Syst., vol. 2, no. 4, pp. 385-397, 1991.
[11] C. Qiao and L. Zhou, “Scheduling switching element disjoint connections in stage-controlled photonic banyans,” IEEE Trans. Commun., vol. 47, no. 1, pp. 139-148, 1999.
[12] R. Ramaswami, “Multiwavelength lightwave networks for computer communication,”
[13] Y. Yang and J. Wang, “All-to-all personalized exchange in banyan networks,” Proc. Parallel and Distributed Computing and Systems (PDCS’99), Cambridge, MA, pp. 78-86, 1999.
[14] Y. Yang and J. Wang, “Optimal all-to-all personalized exchange in multistage networks,” Proc. Seventh International Conference on Parallel and Distributed Systems (ICPADS’00), Iwate, Japan, 2000.
[15] Y. Yang and J. Wang, “Optimal all-to-all personalized exchange in self-routable multistage networks,” IEEE Trans. Parallel Distrib. Syst., vol. 11, no. 3, pp. 261-274, 2000.
[16] Y. Yang and J. Wang, “Optimal all-to-all personalized exchange in a class of optical multistage networks,” IEEE Trans. Parallel Distrib. Syst., vol. 12, no. 9, pp. 567-582, 2001.