A rule-based inter-session network coding scheme over IEEE 802.16(d) mesh CDS-mode networks


Shie-Yuan Wang ⇑, Chih-Che Lin, Yu-Chi Chang

Department of Computer Science, National Chiao Tung University, 1001 University Road, Hsinchu, Taiwan

Article info

Article history:
Received 22 April 2010
Received in revised form 13 March 2011
Accepted 12 October 2011
Available online 31 October 2011

Keywords:
Network coding
802.16
Wireless mesh network
Hidden terminal
Inter-session network coding

Abstract

In the literature, opportunistic inter-session network coding schemes need to exchange extra control messages to maintain traffic flow states. Such a requirement, however, increases the implementation complexity of a network coding scheme. In this paper, we propose a rule-based network coding scheme (RNC) that performs opportunistic inter-session network coding using a stateless design. Owing to this stateless design, the proposed coding scheme is easy to implement and deploy.

In addition, based on RNC, we study the hidden terminal problem between different coding structures. This problem may result in severe packet collisions in a network-coding-based network and thus degrade network coding performance. To alleviate it, we propose a smart handshake procedure (called RNC-SHP) over the IEEE 802.16(d) mesh coordinated distributed scheduling (CDS) mode to reduce the number of hidden terminals between pairwise network coding structures. Our simulation results show that the proposed RNC schemes can greatly outperform the original routing-based scheme in terms of end-to-end flow goodput and packet delay.

1. Introduction

Network coding is a packet encoding and decoding mechanism that aims to increase the data transmission efficiency of a network. It was first proposed by Ahlswede et al. [1] and has gained much attention in recent years. A typical network coding mechanism is composed of two main operations: (1) mixing distinct outgoing packets on intermediate nodes; and (2) decoding mixed packets before (or when) they arrive at their destination nodes. By carefully finding and exploiting possible network coding opportunities, network coding can reduce the number of packet transmissions required for disseminating the same amount of data, as compared with traditional routing.

The development of network coding can be classified into two categories based on the traffic patterns that it intends to address. One deals with the traffic generated by one or more multicast flows (referred to as intra-session coding), while the other deals with the traffic generated by multiple unicast flows (referred to as inter-session coding). In this paper, we focus on the inter-session coding problem.

Developing an inter-session network coding scheme that effectively improves the performance of unicast flows is challenging. In a unicast-flow scenario, each flow has only one node as its intended receiver. In this condition, blindly coding packets may waste network bandwidth because packets may be disseminated to nodes that are not interested in them. For this reason, designing an efficient inter-session network coding scheme is difficult because it needs to consider three problems at the same time: (1) routing path selection; (2) traffic load balancing; and (3) network coding decision. In [2–5], the authors formulate the inter-session network coding problem as a resource optimization problem that considers the aforementioned three problems and use linear programming to mathematically obtain the optimized solution to this problem.

doi:10.1016/j.comnet.2011.10.013

⇑Corresponding author.

E-mail address: [email protected] (S.-Y. Wang).

Contents lists available at SciVerse ScienceDirect

Computer Networks


Such theoretical studies have explored the characteristics of the inter-session coding problem; however, they do not propose feasible distributed protocols for scheduling inter-session network coding across a network. Khreishah et al. [5] proposed a theoretical distributed framework to control inter-session coding over a virtualized wireless network model. However, their work focuses on networks with only two unicast flows and is linear-programming based. A linear-programming-based scheduling approach usually requires high computation complexity to find the optimal schedule, which means that a distributed algorithm developed based on this approach needs a long time to converge¹ (and may not converge under a highly changing traffic pattern).

In addition, these previous studies assume that no packet collisions occur in a wireless network, i.e., they do not consider the performance degradation of an inter-session network coding scheme caused by the well-known "hidden terminal (HT) problem." The impacts of the HT problem can be discussed from two aspects. First, the packet collisions caused by the HT problem greatly reduce the number of packets that can be overheard by network nodes. Overhearing packets, however, is fundamental to the operation of network coding. Thus, the presence of HTs can greatly degrade the performance of network coding. Second, using virtual carrier-sensing mechanisms (e.g., the RTS-CTS mechanism used in the IEEE 802.11 network) and reservation-based mechanisms (e.g., the three-way handshake procedure used in the IEEE 802.16 mesh network) to prevent the HT problem from occurring may introduce time and bandwidth overheads for packet transmission. Such extra overheads also decrease the performance benefits achieved by network coding. Therefore, efficiently eliminating the HT problem is essential for a network coding scheme to achieve good performance in real-life networks.

The objective of this paper is to propose a stateless inter-session network coding scheme that can be easily implemented in the real world and at the same time still provides good data forwarding performance. Our work makes two contributions. First, we propose an easy-to-implement and easy-to-deploy inter-session network coding scheme that is based on an opportunistic approach. As compared with previous opportunistic coding schemes, such as COPE [7] and BFLY [8], our proposed rule-based network coding (RNC) scheme need not maintain the states of traffic flows to find coding opportunities or coding structures. Instead, it exploits only the routing and MAC-layer scheduling information that can be locally obtained by a node to find coding opportunities and coding structures. Due to this advantage, our proposed scheme can be easily realized in a real-life network.

Second, we point out a new "extended hidden terminal (EHT) problem," which can frequently occur in a wireless network using network coding. (In this paper, we call such a network a network-coding-based wireless network.) The EHT problem can cause the decoding of a network-coded packet to fail, thus significantly decreasing the performance gain that can be achieved by network coding. To eliminate EHTs, mechanisms for a node to maintain the states of traffic flows and negotiate collision-free data multicasting schedules with neighboring nodes are required. Thus, this problem cannot be completely solved on top of a stateless network coding scheme. However, it is still possible to alleviate this problem over a stateless network coding scheme. In this paper we propose a smart bandwidth reservation mechanism that is extended from the one used in the IEEE 802.16(d) mesh CDS mode and aims at reducing EHTs for a stateless inter-session network coding scheme (such as RNC). Our simulation results show that, by reducing HTs and EHTs, our proposed stateless inter-session network coding scheme can increase end-to-end flow goodput and decrease the end-to-end packet delay in an interference-prone wireless network, as compared with traditional routing.

The remainder of this paper is organized as follows. We first present previous studies related to our work in Section 2 and then present the basic design of our proposed opportunistic inter-session network coding scheme in Section 3. In Section 4, we explain why the EHT problem may occur in a network-coding-based wireless network. Our solution to this problem is presented in Section 5. In Section 6, we evaluate the performance of our proposed scheme using the NCTUns network simulator [6]. In Section 7, we discuss the applicability of our proposed scheme. Finally, we conclude this paper in Section 8.

2. Related work

In the literature, several papers [2–5] have theoretically studied inter-session network coding using linear programming. Although these previous studies have discussed the characteristics of inter-session network coding, turning their mathematical results into feasible distributed real-life protocols is difficult due to the huge computation complexity required by the mathematical operations of linear programming.

On the other hand, another track of inter-session network coding research is opportunistic coding, which aims to develop sub-optimal yet easy-to-implement and easy-to-deploy network coding schemes for unicast-flow wireless networks. In [7], Katti et al. implemented the COPE coding scheme over a real-life IEEE 802.11(b) wireless network. COPE performs network coding in an opportunistic manner. Its main operations are described here.

Using COPE, each node periodically broadcasts a "reception report" message to its neighboring nodes. A reception report message of node i contains the information of the packets that node i currently possesses. Based on received reception report messages, a network node maintains a packet information table that records which packets are currently possessed by its neighboring nodes. Each time a node is going to transmit a data packet, it first looks up the packet information table to check whether a coding opportunity exists. If so, this node mixes several packets that can be coded together to form a new network-coded packet and then sends this new network-coded packet out. If not, data packets are immediately transmitted.

¹ The convergence of a network-coding protocol is defined as follows: the coding and routing decisions of all nodes in a network are consistent and loop-free.

COPE uses the above protocol to find coding opportunities within a node's one-hop neighborhood, meaning that a node using it can only find coding opportunities in a 3-node chain-based topology or a star-based topology. In [8,9], Omiwade et al. propose the BFLY scheme, which allows nodes to find coding opportunities in a butterfly topology. The operation of BFLY is briefly described here.

Using BFLY, each node i periodically broadcasts a hello message to its neighboring nodes. A hello message of node i contains node i's neighboring node list and a neighbor vector describing the relative neighboring relationship (regarding the butterfly structure) of node i and its neighboring nodes. With such information, each node can maintain all of its possible butterfly coding structures based on the hello messages advertised by its neighboring nodes. Upon transmitting data packets, a node first checks whether its outgoing packets can form coding opportunities over the maintained butterfly structures. If so, the node mixes these packets to generate new network-coded packets and then transmits the coded packets out instead of transmitting the original data packets. Also, BFLY can collaborate with COPE (called BFLY-COPE in this paper) to find coding opportunities in chain-based, star-based, and butterfly-based topologies.

The idea of BFLY-COPE is similar to that of our work. Compared with COPE and BFLY, however, our proposed network coding scheme has several advantages. First, COPE and BFLY need to employ extra protocols to exchange the information of either the packets possessed by neighboring nodes (in COPE) or the butterfly-structure relationship of neighboring nodes (in BFLY). In contrast, our proposed network coding scheme utilizes only the routing and MAC-layer information to find coding opportunities and thus need not employ any extra protocols to operate. Thus, the implementation complexity of our proposed scheme is lower than that of BFLY-COPE.

Second, to the best of the authors' knowledge, previous works, such as COPE and BFLY, do not consider the influence of the HT and EHT problems, which may occur in real-life networks and can significantly degrade network coding performance. Thus, these schemes may only work well over an interference-free wireless network, which is uncommon in the real world. In contrast, our work uses a smart bandwidth reservation mechanism to reduce the occurrences of HTs and EHTs, which allows an inter-session network coding scheme to achieve better coding performance in an interference-prone wireless network than previous work.

Regarding solving the extended hidden terminal problem, the objective of [11,12] is similar to that of our work. However, there are two differences between these two studies. First, the scheme proposed in [11,12] (called CORE) uses a two-phase design that aims to provide stable and persistent data schedules to serve network coding before starting it, and it has to start network coding after network flows become stable (unchanged). Thus, it requires longer latencies to start network coding and is more suitable for long-lasting flows. In contrast, our work aims to solve the EHT problem in a one-phase manner. Data schedules obtained using our proposed approach may not be as close to optimal as those obtained using the CORE scheme; however, our approach requires shorter latencies to start network coding and is more suitable for dynamic flows. Second, the work in [11,12] discusses the problems of deploying network coding in cross-flow coding structures (i.e., chain, X, and star coding structures). In this paper, we explain why the EHT problem decreases the performance of network coding in chain and butterfly coding structures and propose a solution to it. We conducted proof-of-concept simulations in chain and grid topologies. Our simulation results show that the proposed scheme can effectively reduce EHTs and thus can further increase the flow goodputs of network coding.

Several previous works proposed network coding schemes for the IEEE 802.16 Point-to-MultiPoint (PMP) mode network [14,15]. A PMP-mode network is essentially a cellular network, which employs a central base station to schedule the network bandwidth and completely differs from the CDS-mode network. These studies focus on adjusting some operational MAC-layer parameters and scheduling policies for PMP-mode networks when cross-flow network coding can be used. In contrast, our work proposes enhancements to the MAC-layer designs specific to the IEEE 802.16 mesh CDS-mode network, which is purely distributed and does not require a central unit to schedule link bandwidth. Such differences make our work unique in the literature. Table 1 shows a qualitative comparison between the previous works and our proposed RNC scheme. The comparison is based on the type of network for which they are designed, the network coding technique used, the design choice, and the issues that they intend to address.

Table 1

Comparisons between previous works and our proposed RNC scheme.

Jin and Li[14] Zhang and Li[15] COPE[7] BFLY[8] Mogre et al.[11,12] Proposed RNC

CDS-mode network ⁄ ⁄ ⁄ ⁄

PMP mode network ⁄ ⁄ ⁄

Opportunistic network coding ⁄ ⁄ ⁄ ⁄ ⁄

Random network coding ⁄

Proactive design ⁄ ⁄ ⁄ ⁄ ⁄

Reactive design ⁄

Stateless design ⁄

Using butterfly structure ⁄ ⁄

Solving the EHT problem ⁄

Seeking for optimization ⁄ ⁄

3. Basic design of our proposed scheme

In this paper, we develop a bit-level network coding scheme that can detect and perform network coding for chain and butterfly structures in an opportunistic and rule-based manner. To enable packet overhearing, which is necessary to perform network coding, we assume that each node's radio operates in the promiscuous mode. This is accomplished by allowing each node to receive all overheard packets regardless of their connection IDs [16] and by disabling the encryption/decryption functions of the security and privacy sub-layer in the IEEE 802.16 mesh CDS-mode MAC layer.

Our proposed network coding scheme comprises two components: (1) a rule-based coding/decoding unit (RCU) and (2) a smart bandwidth reservation mechanism. The former is responsible for examining whether a coding opportunity exists and performing encoding/decoding operations, while the latter is responsible for establishing data schedules without generating HTs and EHTs in a network-coding-based wireless network. (In this paper, a data schedule refers to a set of consecutive minislots reserved for transmitting data packets. The notion of a minislot is explained later in Section 4.) In the following, we first explain the design and implementation of RCU. The details of the proposed smart bandwidth reservation mechanism are explained in Section 5.

RCU generates a network-coded packet by mixing two distinct fresh packets using the exclusive-or (XOR) operation. It uses the simple XOR operation for the following reason. The analytical results presented in [13] suggest that, when only inter-session coding is present, using a good coding function is not necessary to achieve good coding performance. Instead, other factors, such as whether good coding paths can be found in the network, are more important to the coding performance. Thus, for inter-session network coding, the coding performance that can be achieved using simple XOR operations is the same as that using more advanced coding operations, e.g., finite field operations.
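As a minimal sketch of the XOR mixing described above, the following Python snippet codes two packets of unequal length and recovers each one from the other. The function names, the sample payloads, and the zero-padding of the shorter packet are assumptions made for illustration; the text only states that two fresh packets are mixed with XOR.

```python
def xor_encode(pkt1: bytes, pkt2: bytes) -> bytes:
    """XOR two packets; the shorter one is zero-padded (an assumption)."""
    length = max(len(pkt1), len(pkt2))
    a = pkt1.ljust(length, b"\x00")
    b = pkt2.ljust(length, b"\x00")
    return bytes(x ^ y for x, y in zip(a, b))


def xor_decode(coded: bytes, known: bytes, original_len: int) -> bytes:
    """Recover the unknown component from a coded packet and a known one."""
    # XOR is its own inverse, so decoding reuses the encoder and then
    # trims the padding using the (assumed known) original length.
    return xor_encode(coded, known)[:original_len]


pkt_a = b"hello from A"
pkt_c = b"reply from node C"
coded = xor_encode(pkt_a, pkt_c)

recovered_c = xor_decode(coded, pkt_a, len(pkt_c))
recovered_a = xor_decode(coded, pkt_c, len(pkt_a))
```

Because XOR is self-inverse, a node holding either component can recover the other; the coded packet is only as long as the longer component.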

The objective of our proposed coding scheme is not to find the optimal coding path in the network, because doing so may incur large time and computation overheads and may be infeasible in a network with fast-changing traffic. Instead, our proposed network coding scheme aims to provide an easy-to-implement and easy-to-deploy encoding/decoding approach that, on average, achieves better throughput and delay performance for a network in an opportunistic manner.

RCU is implemented at the MAC layer and composed of three parts: (1) the packet transmission procedure; (2) the packet reception procedure; and (3) the coding rule sets, which are explained below. For readers' convenience, we first present the notations frequently used in this paper in Table 2 and the packet header formats used by RCU in Section 3.1.

3.1. Packet header formats used by RCU

Using the proposed RNC-based scheme, each node should prepend an extra network-coding header to each outgoing packet. As shown in Fig. 1, the added network-coding header comprises two parts: (1) the Common Header (ComHdr) and (2) the Component Packet Information (CPI). Following the network-coding header are the IP header and the IP data payload (e.g., a TCP/UDP packet).

ComHdr is composed of three fields: (1) CPIN; (2) PL; and (3) TTL. The CPIN field indicates the number of CPI entries contained in this packet. The details of a CPI entry are explained later. The CPIN value is set to 2 for a network-coded packet, because our proposed coding scheme only mixes two fresh packets to generate a network-coded packet. (In this paper, we call the ingredient packets that are coded together to form a network-coded packet p_nc the component packets of p_nc.) For a fresh packet, the CPIN value is set to 1 because it comprises only one packet.

Table 2

Frequently used notations.

Notation   Meaning
CN         The current node
DST(p)     The destination node of packet p
FHP        The proposed basic four-way handshake procedure
NBR(A)     The set of nodes that are one hop away from node A
NH(p)      The next-hop node of packet p
Pkt(A)     A packet transmitted by node A
SHP        The proposed smart handshake procedure
TX(p)      The transmitting node of packet p
THP        The three-way handshake procedure used in IEEE 802.16 networks

The PL field denotes the length, in bytes, of the data payload contained in this packet. For a fresh packet, the PL field is set to the length of the complete IP packet that it contains. For a network-coded packet, this value is set to the length of the longer "component IP packet" that it contains. The TTL field specifies the maximum number of hops that a packet is allowed to be forwarded. When receiving a packet, a network node should decrement the TTL value of the received packet by one. The TTL value of a packet p should be set based on the following rules: (1) it should be set to 2 if p is a coded packet that is coded using butterfly structure coding; (2) otherwise, it should be set to 1. (The details of the coding rules used in our schemes are explained in Section 3.4.)

The CPI part describes the information of the component packets that are coded together to form this coded packet. Each component packet is described by a CPI entry, which is composed of four fields: (1) SrcIP; (2) DstIP; (3) SeqNum; and (4) NextHop. The SrcIP field denotes the IP address of this component packet's source node; the DstIP field denotes the IP address of the component packet's destination node; the SeqNum field denotes the sequence number of this component packet assigned by its source node; and the NextHop field indicates the IP address of the next-hop node to which this component packet should be transmitted. In our scheme, the SrcIP and SeqNum fields are used together to form a unique ID (SrcIP, SeqNum) for each fresh packet in the network.
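The header layout described above can be sketched as follows. This is an illustrative encoding only: the field widths (1-byte CPIN, 2-byte PL, 1-byte TTL, 4-byte IPv4 addresses, 4-byte SeqNum), the byte order, and all class and function names are assumptions, since the text lists the fields but this section does not give their sizes.

```python
import socket
import struct
from dataclasses import dataclass
from typing import List, Tuple

COMHDR_FMT = "!BHB"     # CPIN, PL, TTL (widths are assumed)
CPI_FMT = "!4s4sI4s"    # SrcIP, DstIP, SeqNum, NextHop (widths are assumed)


@dataclass
class CPIEntry:
    src_ip: str
    dst_ip: str
    seq_num: int
    next_hop: str

    def packet_id(self) -> Tuple[str, int]:
        # (SrcIP, SeqNum) uniquely identifies a fresh packet.
        return (self.src_ip, self.seq_num)

    def pack(self) -> bytes:
        return struct.pack(CPI_FMT, socket.inet_aton(self.src_ip),
                           socket.inet_aton(self.dst_ip), self.seq_num,
                           socket.inet_aton(self.next_hop))


@dataclass
class NCHeader:
    pl: int                 # length of the (longer) component IP packet
    ttl: int                # 2 for butterfly-coded packets, otherwise 1
    cpi: List[CPIEntry]

    @property
    def cpin(self) -> int:
        # CPIN is the number of CPI entries: 1 (fresh) or 2 (coded).
        return len(self.cpi)

    def pack(self) -> bytes:
        head = struct.pack(COMHDR_FMT, self.cpin, self.pl, self.ttl)
        return head + b"".join(entry.pack() for entry in self.cpi)


def unpack_header(data: bytes) -> NCHeader:
    cpin, pl, ttl = struct.unpack_from(COMHDR_FMT, data, 0)
    offset = struct.calcsize(COMHDR_FMT)
    entries = []
    for _ in range(cpin):
        src, dst, seq, nh = struct.unpack_from(CPI_FMT, data, offset)
        entries.append(CPIEntry(socket.inet_ntoa(src), socket.inet_ntoa(dst),
                                seq, socket.inet_ntoa(nh)))
        offset += struct.calcsize(CPI_FMT)
    return NCHeader(pl, ttl, entries)


# Hypothetical butterfly-coded packet with two component packets.
hdr = NCHeader(pl=1500, ttl=2, cpi=[
    CPIEntry("10.0.0.1", "10.0.0.6", 7, "10.0.0.5"),
    CPIEntry("10.0.0.3", "10.0.0.4", 9, "10.0.0.5"),
])
wire = hdr.pack()
```

Deriving CPIN from the number of CPI entries keeps the two from disagreeing, which is a design choice of this sketch rather than something stated in the text.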

3.2. Packet transmission procedure

On a transmitting node, outgoing packets (either generated by itself or forwarded) are placed into the MAC-layer output queue for transmission. The transmitting node first checks whether there is any established data schedule. If so, it transmits the packets in the output queue using Algorithm 1. If not, the transmitting node should first establish a data schedule for the outgoing packets. Only after it has established a data schedule with a neighboring node can it transmit the data packets buffered in the output queue. The 802.16 standard defines a three-way handshake procedure (THP) for nodes to establish data schedules. However, this THP design is only suitable for routing-based wireless networks and is inefficient in a network-coding-based wireless network. This issue is discussed in Section 4 and our solution to this problem is presented in Section 5.

The variables and functions used in Algorithm 1 are explained here. S_out denotes the set of packets currently buffered in the MAC-layer output queue. Packets in S_out are sorted in non-decreasing order of their insertion times. M denotes the MAC-layer burst that is going to be transmitted over the forthcoming data schedule. The dequeue(S) function dequeues the first element of a sorted set S, while the p(S, i) function returns the ith packet in a sorted packet set S. The extract(S, i) function performs the following actions in sequence: (1) copy the ith packet pkt in a sorted packet set S to p_tmp; (2) remove pkt from S; (3) re-sort S; and (4) return p_tmp.

The chain_coding_opp_checker(pkt1, pkt2) and butterfly_coding_opp_checker(pkt1, pkt2) functions check whether pkt1 and pkt2 can form a chain coding opportunity and a butterfly coding opportunity, respectively. They return "PASSED" if the two input packets can form a chain/butterfly coding opportunity. Otherwise, they return "FAILED." The chain_coding(pkt1, pkt2) and butterfly_coding(pkt1, pkt2) functions encode pkt1 and pkt2 to form a coded packet using chain/butterfly structure coding and properly set the ComHdr and CPI entries of the coded packet for its future decoding.

Initially, the transmitting node dequeues the first element of S_out (denoted as p_head). It then iteratively checks whether any chain-structure coding opportunity exists for p_head and the other packets in S_out. If such a coding opportunity exists, it first encodes these two packets to generate a new network-coded packet p_coded using chain structure coding and then inserts p_coded into M. Otherwise, it iteratively checks whether p_head and any other packet in S_out can form a butterfly-structure coding opportunity. If such a packet exists, the transmitting node first encodes p_head and the found packet to generate a network-coded packet p_coded using butterfly structure coding and then inserts p_coded into M. On the other hand, if no coding opportunity exists for p_head, the transmitting node directly inserts p_head into M.

This procedure is repeated until one of the following two conditions is satisfied: (1) S_out becomes empty; or (2) the total length of M equals or exceeds its maximum allowed length. When either condition holds, the transmitting node sends M out to finish its data transmission procedure.

Algorithm 1. Packet encoding procedure for node i

Require: Node i is allowed to transmit M
S_out := {p | p is a fresh packet, sorted by its insertion time}
M := the MAC-layer burst that is going to be transmitted out
while S_out ≠ ∅ do
    p_head ← dequeue(S_out)
    res := UNPASSED
    for j = 1 to |S_out| do
        res ← chain_coding_opp_checker(p_head, p(S_out, j))
        if res = PASSED then
            p_peer ← extract(S_out, j)
            p_coded ← chain_coding(p_head, p_peer)
            Insert p_coded into M
            break
        end if
    end for
    if res ≠ PASSED then
        for j = 1 to |S_out| do
            res ← butterfly_coding_opp_checker(p_head, p(S_out, j))
            if res = PASSED then
                p_peer ← extract(S_out, j)
                p_coded ← butterfly_coding(p_head, p_peer)
                Insert p_coded into M
                break
            end if
        end for
    end if
    if res ≠ PASSED then
        Insert p_head into M
    end if
    if the payload length of M equals or exceeds the maximum allowed length then
        break
    end if
end while
Transmit M
Return SUCCESS
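The control flow of Algorithm 1 can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the Packet record, the build_burst name, and the representation of a coded packet as a tuple are hypothetical, the butterfly check is stubbed out, and a coded item is assumed to occupy the length of its longer component.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Tuple, Union


@dataclass(frozen=True)
class Packet:
    pid: str      # hypothetical packet identifier
    tx: str       # node that transmitted the packet to the current node
    nh: str       # next-hop node of the packet
    length: int   # payload length in bytes


Coded = Tuple[str, Packet, Packet]          # (structure, p_head, p_peer)
Checker = Callable[[Packet, Packet], bool]


def build_burst(s_out: Deque[Packet], max_len: int,
                chain_check: Checker, butterfly_check: Checker
                ) -> List[Union[Packet, Coded]]:
    """Fill one MAC-layer burst M from the output queue S_out."""
    burst: List[Union[Packet, Coded]] = []
    burst_len = 0
    while s_out:
        p_head = s_out.popleft()
        item: Union[Packet, Coded] = p_head
        item_len = p_head.length
        # Chain-structure rules are tried first, then butterfly rules.
        for name, check in (("chain", chain_check),
                            ("butterfly", butterfly_check)):
            peer = next((p for p in s_out if check(p_head, p)), None)
            if peer is not None:
                s_out.remove(peer)
                item = (name, p_head, peer)
                item_len = max(p_head.length, peer.length)
                break
        burst.append(item)
        burst_len += item_len
        if burst_len >= max_len:   # burst is full: stop filling
            break
    return burst


def chain_rule(a: Packet, b: Packet) -> bool:
    # Each packet's transmitter is the other packet's next hop.
    return a.tx == b.nh and b.tx == a.nh


queue = deque([Packet("p1", "A", "C", 100),
               Packet("p2", "A", "C", 80),
               Packet("p3", "C", "A", 120)])
burst = build_burst(queue, max_len=1000,
                    chain_check=chain_rule,
                    butterfly_check=lambda a, b: False)
```

In this example p1 is coded with p3 (opposite directions through the current node), while p2 finds no partner and is sent uncoded.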

3.3. Packet reception procedure

Upon receiving a packet (denoted as p_recv), a receiving node i should perform the packet decoding procedure shown in Algorithm 2. Node i first checks whether p_recv is a fresh packet. If it is, node i invokes the decoding_func() function (shown in Algorithm 3) to decode, in a recursive manner, as many packets stored in the coded packet pool as possible. (The coded packet pool temporarily stores the received coded packets that have not yet been decoded.) The decoding_func() first inserts p_recv into the fresh packet pool (which temporarily stores fresh and decoded packets). It then iteratively tries to decode the packets stored in the coded packet pool. This is accomplished by calling the network_coding_decoder() function, which is explained later.

If any packet in the coded packet pool is successfully decoded, the decoding_func() passes it to another decoding_func() instance. This recursive decoding process terminates when no more coded packets can be decoded. At the end of the decoding_func(), it checks whether node i is the next-hop node of p_recv.² If it is, the decoding_func() passes p_recv to the upper-layer routing protocol for further route dispatching.

On the other hand, if p_recv is a network-coded packet, node i first tries to decode p_recv by invoking the network_coding_decoder() function with p_recv as its input. The network_coding_decoder() function iteratively searches the fresh packet pool to find whether any of p_recv's component packets exists. If so, it decodes the other component packet of p_recv and returns the decoded fresh packet as its output. If p_recv cannot be decoded by the network_coding_decoder() function, node i checks whether the TTL value of p_recv is larger than zero and the next-hop nodes of p_recv's components are node i's neighboring nodes. If these conditions hold, p_recv should be rebroadcast because it was coded using butterfly structure coding and has not yet reached the nodes possessing its remedy packets. (The details of the proposed coding rules and the usage of a packet's TTL value are explained in Section 3.4.) Therefore, node i re-inserts p_recv into the MAC-layer output queue to rebroadcast it. If not, the remedy packet for p_recv is not present in the fresh packet pool. In this condition, node i inserts p_recv into the coded packet pool for possible future decoding.

To prevent stored packets from consuming too much storage space, each packet in the fresh packet pool and the coded packet pool is associated with an expiry timer, which indicates the maximum time that a packet is allowed to reside in the pool. When an expiry timer expires, its associated packet is discarded immediately. In our simulations, the expiration time for a fresh packet is set to 10 s and that for a coded packet is set to 5 s.
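A packet pool with per-packet expiry can be sketched as below. The ExpiringPool class and its method names are hypothetical, and the sketch purges expired entries lazily on access instead of running one timer per packet, which is a simplification of the behavior described above. The 10 s and 5 s lifetimes are the values quoted from the simulations.

```python
import time
from typing import Callable, Dict, Hashable, Optional, Tuple


class ExpiringPool:
    """Packet pool whose entries are dropped after a fixed lifetime."""

    def __init__(self, lifetime_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lifetime = lifetime_s
        self._clock = clock
        self._items: Dict[Hashable, Tuple[float, bytes]] = {}

    def insert(self, packet_id: Hashable, packet: bytes) -> None:
        # Store the packet together with its expiry deadline.
        self._items[packet_id] = (self._clock() + self._lifetime, packet)

    def purge(self) -> None:
        # Discard every packet whose deadline has passed.
        now = self._clock()
        expired = [key for key, (deadline, _) in self._items.items()
                   if deadline <= now]
        for key in expired:
            del self._items[key]

    def get(self, packet_id: Hashable) -> Optional[bytes]:
        self.purge()
        entry = self._items.get(packet_id)
        return entry[1] if entry is not None else None


fresh_pool = ExpiringPool(lifetime_s=10.0)   # fresh and decoded packets
coded_pool = ExpiringPool(lifetime_s=5.0)    # coded packets awaiting decoding
```

Passing the clock in as a parameter makes the expiry logic easy to exercise with a simulated time source.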

Algorithm 2. Packet decoding procedure for node i

Require: Node i receives a MAC-layer data frame M, extracts the packets contained in M, and stores them into a packet set S_recv
while S_recv ≠ ∅ do
    p_recv ← dequeue(S_recv)
    if p_recv is a fresh packet then
        decoding_func(p_recv)
    else
        p_decoded ← network_coding_decoder(p_recv)
        if p_decoded ≠ NULL then
            decoding_func(p_decoded)
        else
            if the TTL value of p_recv > 0 and NH(p_recv's component packets) ∈ NBR(i) then
                Insert p_recv into the MAC-layer connection queue
            else
                Insert p_recv into the coded packet pool
            end if
        end if
    end if
end while
Return SUCCESS

Algorithm 3. decoding_func (input: a fresh packet p_recv)

Insert p_recv into the fresh packet pool
for all packets p_tmp in the coded packet pool do
    p_decoded ← network_coding_decoder(p_tmp)
    if p_decoded ≠ NULL then
        decoding_func(p_decoded)
    end if
end for
if node i = NH(p_recv) then
    Pass p_recv to the upper-layer routing protocol for further route dispatching
end if
Return SUCCESS
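The recursive decoding of Algorithms 2 and 3 can be illustrated with the following Python sketch. It is deliberately simplified and rests on several assumptions: the Fresh, Coded, and Node types are hypothetical, the component lengths are carried explicitly in the coded packet, and the next-hop check, the TTL-based rebroadcast branch, and pool expiry are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple, Union

PacketId = Tuple[str, int]   # (SrcIP, SeqNum)


@dataclass
class Fresh:
    pid: PacketId
    payload: bytes


@dataclass
class Coded:
    pids: Tuple[PacketId, PacketId]
    lengths: Tuple[int, int]     # component lengths (assumed to be known)
    payload: bytes


def xor(a: bytes, b: bytes) -> bytes:
    n = max(len(a), len(b))
    return bytes(x ^ y for x, y in zip(a.ljust(n, b"\0"), b.ljust(n, b"\0")))


@dataclass
class Node:
    fresh_pool: Dict[PacketId, bytes] = field(default_factory=dict)
    coded_pool: List[Coded] = field(default_factory=list)

    def network_coding_decoder(self, c: Coded) -> Optional[Fresh]:
        # Decoding needs exactly one component in the fresh packet pool.
        for known, other in ((0, 1), (1, 0)):
            if (c.pids[known] in self.fresh_pool
                    and c.pids[other] not in self.fresh_pool):
                data = xor(c.payload, self.fresh_pool[c.pids[known]])
                return Fresh(c.pids[other], data[:c.lengths[other]])
        return None

    def decoding_func(self, p: Fresh) -> None:
        self.fresh_pool[p.pid] = p.payload
        for c in list(self.coded_pool):
            if c not in self.coded_pool:
                continue         # already decoded by a nested call
            decoded = self.network_coding_decoder(c)
            if decoded is not None:
                self.coded_pool.remove(c)
                self.decoding_func(decoded)   # may unlock further packets

    def receive(self, pkt: Union[Fresh, Coded]) -> None:
        if isinstance(pkt, Fresh):
            self.decoding_func(pkt)
            return
        decoded = self.network_coding_decoder(pkt)
        if decoded is not None:
            self.decoding_func(decoded)
        elif not all(pid in self.fresh_pool for pid in pkt.pids):
            self.coded_pool.append(pkt)       # keep for future decoding


ida, idb, idc = ("10.0.0.1", 1), ("10.0.0.2", 1), ("10.0.0.3", 1)
a, b, c = b"AAAA", b"BBBBBB", b"CC"
node = Node()
node.receive(Coded((ida, idb), (len(a), len(b)), xor(a, b)))
node.receive(Coded((idb, idc), (len(b), len(c)), xor(b, c)))
node.receive(Fresh(ida, a))   # unlocks b, which in turn unlocks c
```

The last call shows the chain reaction: a single fresh packet lets the node decode the first stored coded packet, and the packet obtained from it decodes the second.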

² Note that the next-hop node information of a packet can be obtained by retrieving the current routing entries exported by the collaborative routing protocol.


3.4. Coding rule sets

The rule set used to find a chain-structure coding opportunity is called the chain-structure rule set (CRS) and that to find a butterfly-structure coding opportunity is called the butterfly-structure rule set (BRS). The details of these two rule sets are explained below.

3.4.1. CRS

The rules of CRS are listed in Table 3. Given two packets p1 and p2, if the transmitting node of p1 is the next-hop node of p2 and the transmitting node of p2 is the next-hop node of p1, then a chain-structure coding opportunity exists. In our implementation, the checking of CRS is implemented in the chain_coding_opp_checker(p1, p2) function, which is used by each node's encoding procedure. The rationale of CRS is simple and explained below.
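The chain-structure condition stated above can be written as a short check. This is a minimal sketch that covers only the condition given in this paragraph (any further rules in Table 3 are not modeled); the PktInfo record and its field names are assumptions that stand in for TX(p) and NH(p) from Table 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PktInfo:
    tx: str   # TX(p): the node that transmitted the packet
    nh: str   # NH(p): the next-hop node of the packet


def chain_coding_opp_checker(p1: PktInfo, p2: PktInfo) -> str:
    """Return "PASSED" if p1 and p2 form a chain coding opportunity."""
    if p1.tx == p2.nh and p2.tx == p1.nh:
        return "PASSED"
    return "FAILED"


# At relay node B in a chain A - B - C:
pkt_from_a = PktInfo(tx="A", nh="C")    # travelling A -> B -> C
pkt_from_c = PktInfo(tx="C", nh="A")    # travelling C -> B -> A
another_from_a = PktInfo(tx="A", nh="C")

opposite = chain_coding_opp_checker(pkt_from_a, pkt_from_c)
same_direction = chain_coding_opp_checker(pkt_from_a, another_from_a)
```

Two packets crossing the relay in opposite directions pass the check, whereas two packets heading the same way do not, because neither next hop already holds the other packet.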

Consider the 3-node multi-hop chain network comprising nodes A, B, and C shown in Fig. 2. In this chain network, the radio coverage of each of nodes A and C contains node B and the node itself. Thus, data packets from node A to node C should go through node B, and vice versa. Suppose that nodes A and C transmit data packets to each other. Packets transmitted by these two nodes will go through node B.

When node B receives a packet transmitted by node A (denoted as Pkt(A)) and a packet transmitted by node C (denoted as Pkt(C)), node B can code this pair of Pkt(A) and Pkt(C) together to generate a new coded packet (Pkt(A) ⊕ Pkt(C)) and send it out instead of sending out Pkt(A) and Pkt(C) separately. Since node A intends to receive Pkt(C) and possesses Pkt(A), and node C intends to receive Pkt(A) and possesses Pkt(C), upon receiving the packet (Pkt(A) ⊕ Pkt(C)), node A can extract Pkt(C) by XOR-ing its possessed Pkt(A) and (Pkt(A) ⊕ Pkt(C)). Similarly, after receiving (Pkt(A) ⊕ Pkt(C)), node C can extract Pkt(A) by XOR-ing Pkt(C) and the received coded packet. Using this chain structure coding to forward data packets can theoretically reduce the number of required packet transmissions from 4 to 3, increasing the packet transmission efficiency by 33%.
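The 33% figure can be checked by comparing the number of packets delivered per transmission with and without coding. The symbols $N_{\text{routing}}$, $N_{\text{coding}}$, and $G$ are introduced here for illustration only and are not part of the original notation; efficiency is assumed to mean delivered packets per transmission.

```latex
\[
G \;=\; \frac{2 / N_{\text{coding}}}{2 / N_{\text{routing}}} - 1
  \;=\; \frac{N_{\text{routing}}}{N_{\text{coding}}} - 1
  \;=\; \frac{4}{3} - 1 \;\approx\; 33\%.
\]
```

Here two packets are delivered in both cases, with $N_{\text{routing}} = 4$ transmissions under store-and-forward routing and $N_{\text{coding}} = 3$ under chain structure coding.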

3.4.2. BRS

The chain structure coding increases the transmission efficiency of a wireless network by at most 33%, whereas butterfly structure coding can theoretically increase it by 50%. Consider the 6-node grid network topology shown in Fig. 3, where each dotted circle denotes the radio coverage of the node at its center. In this network, node A intends to deliver packets to node F and node C intends to deliver packets to node D. Due to the limited radio coverage, the packets transmitted by nodes A and C must be forwarded by nodes B and E to reach their respective destination nodes. When traditional store-and-forward routing is used, six packet transmissions are required to accomplish a packet delivery from node A to node F and one from node C to node D. (The packet transmissions required by the former are A–B, B–E, and E–F, and those required by the latter are C–B, B–E, and E–D.)

In contrast, when butterfly structure coding is used, only four packet transmissions are required to accomplish these two packet deliveries. Upon receiving Pkt(A) and Pkt(C), node B can code these two packets to form a coded packet (Pkt(A) ⊕ Pkt(C)) and then transmit this packet to node E. Later on, node E can broadcast this packet again. Thus, nodes D and F can receive this coded packet (Pkt(A) ⊕ Pkt(C)) at the same time. Since node D is node A’s one-hop neighboring node and node F is node C’s one-hop neighboring node, due to the broadcast nature of wireless transmission, node D can overhear Pkt(A) when node A is transmitting it and node F can overhear Pkt(C) when node C is transmitting it. As a result, node D will have Pkt(A) when receiving (Pkt(A) ⊕ Pkt(C)) and node F will have Pkt(C) when receiving (Pkt(A) ⊕ Pkt(C)). Thus, node D can decode Pkt(C) by XOR-ing Pkt(A) with (Pkt(A) ⊕ Pkt(C)) and node F can decode Pkt(A) by XOR-ing Pkt(C) with (Pkt(A) ⊕ Pkt(C)). The total number of required packet transmissions becomes only four (A–B, C–B, B–E, and a single broadcast from E received by both D and F). Using the butterfly structure coding, a network can increase its packet transmission efficiency by 50%, as compared with one using the traditional routing scheme. A more detailed comparison between the chain structure coding and the butterfly structure coding is given in the Appendix.
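One way to read the 33% and 50% figures, assuming that "efficiency" means packets delivered per transmission (the text does not define it formally), is as the ratio of the transmission counts without and with coding:

```latex
\[
\text{gain}_{\text{chain}} = \frac{4}{3} - 1 \approx 33\%,
\qquad
\text{gain}_{\text{butterfly}} = \frac{6}{4} - 1 = 50\%.
\]
```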

Note that, when the chain structure coding is used, a coded packet is decoded on its next-hop node, which is only one hop away from its transmitting node. However, when the butterfly structure coding is used, a coded packet must be decoded on nodes that are two hops away from its transmitting node to obtain a performance gain. Due to this fundamental difference, the encoding/decoding mechanism for the chain structure differs greatly from that for the butterfly structure. The outline of the encoding and decoding procedures for the butterfly structure coding has been presented in Algorithms 1–3. In this section, we explain the operation of these two procedures in detail and present an example to show how a butterfly structure coding is completed.

The encoding operation of our proposed butterfly structure coding is composed of two parts: (1) rule checking and (2) route designation for the next-hop node. In our implementation, the former is realized in the butterfly_coding_opp_checker() function while the latter is realized in the butterfly_coding() function.

When searching for a butterfly-structure coding opportunity, each node n uses the rules shown in Table 4 to check whether any pair of packets in its MAC-layer output queue can be coded together using the butterfly structure coding. For a pair of packets p1 and p2, satisfying the first rule indicates that p1 and p2 have the same next-hop node on the way to their respective destination nodes, and satisfying the second rule indicates that the two packets belong to different flows and are destined to different destination nodes. Satisfying these two rules means that the routes of p1 and p2 to their respective destination nodes are likely to form a butterfly structure.

Table 3
The rules of CRS.

Rule ID   Rule statement
1         TX(p1) = NH(p2)
2         TX(p2) = NH(p1)

On the other hand, the third rule checks whether the one-hop neighboring node set of p1’s transmitting node and that of p1’s next-hop node have common nodes other than node n and n’s one-hop neighboring nodes; the fourth rule checks whether the one-hop neighboring node set of p2’s transmitting node and that of p2’s next-hop node have common nodes other than node n and n’s one-hop neighboring nodes. For p1 and p2, satisfying these two rules indicates that the coded packet formed by these two packets (p1 ⊕ p2) is very likely to be decoded correctly later, because the remedy packets (i.e., p1 and p2) for (p1 ⊕ p2) are very likely to be overheard by some nodes on the routes of p1 and p2.
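The four BRS rules described above reduce to two equality tests and two set expressions. The sketch below illustrates them on an assumed adjacency map for the 2x3 grid of Fig. 3 (A, B, C in the top row; D, E, F below). The map, the packet field names, and the function names are hypothetical, and CN in Table 4 is read here as the current node n.

```python
# Assumed one-hop adjacency for the grid of Fig. 3 (not taken from the paper).
NBR = {
    "A": {"B", "D"}, "B": {"A", "C", "E"}, "C": {"B", "F"},
    "D": {"A", "E"}, "E": {"B", "D", "F"}, "F": {"C", "E"},
}


def overhearers(pkt: dict, cn: str) -> set:
    """Common neighbors of TX(pkt) and NH(pkt), excluding cn and cn's neighbors."""
    return (NBR[pkt["tx"]] & NBR[pkt["nh"]]) - ({cn} | NBR[cn])


def butterfly_coding_opp(p1: dict, p2: dict, cn: str) -> bool:
    """Return True when the pair (p1, p2) satisfies all four BRS rules on node cn."""
    return (
        p1["nh"] == p2["nh"]              # rule 1: same next-hop node
        and p1["dst"] != p2["dst"]        # rule 2: different destinations
        and bool(overhearers(p1, cn))     # rule 3: some node can overhear p1
        and bool(overhearers(p2, cn))     # rule 4: some node can overhear p2
    )


# Packets queued on node B: flow A -> F and flow C -> D, both via node E.
pkt_a = {"tx": "A", "nh": "E", "dst": "F"}
pkt_c = {"tx": "C", "nh": "E", "dst": "D"}
opportunity = butterfly_coding_opp(pkt_a, pkt_c, cn="B")
```

With this adjacency map, rule 3 yields the set {D} and rule 4 yields {F}, which are the two nodes that later decode the coded packet.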

The second part of our butterfly structure coding is designating the routes of (p1 ⊕ p2)’s component packets used by (p1 ⊕ p2)’s next-hop node, which ensures that the decoding of (p1 ⊕ p2) can be triggered and successfully performed. Also, the proposed route designation mechanism can effectively limit the spreading area of (p1 ⊕ p2)’s broadcast, preventing link bandwidth from being wasted due to unnecessary broadcast packets.

This route designation is accomplished by specifying the NextHop field of each CPI contained in the coded packet using the rules listed in Table 5. As one sees, our butterfly structure coding chooses a node that is likely to possess p1 (and therefore is likely to be able to recover p2) as p2’s next-hop node. Similarly, it chooses a node that is likely to possess p2 (and therefore is likely to be able to recover p1) as p1’s next-hop node.

By doing this, after node n broadcasts (p1 ⊕ p2), it will be received by all of n’s one-hop neighboring nodes. Among these nodes, however, only those that neighbor the next-hop nodes specified in the CPI entries of (p1 ⊕ p2) are allowed to rebroadcast this coded packet. Thus, (p1 ⊕ p2) can reach the nodes that possess its remedy packets (and thus get decoded correctly) while avoiding unnecessary broadcasts for it. In addition, when using the butterfly structure coding, node n should set the TTL field of (p1 ⊕ p2) to 2, which explicitly indicates that (p1 ⊕ p2) is only allowed to be rebroadcast once. This design further reduces the number of broadcast packets for (p1 ⊕ p2).

Fig. 2. An example topology suitable for the chain structure coding.

Fig. 3. An example topology suitable for the butterfly structure coding.

Table 4
The rules of BRS.

Rule ID   Rule statement
1         NH(p1) = NH(p2)
2         DST(p1) ≠ DST(p2)
3         (NBR(TX(p1)) ∩ NBR(NH(p1)) − (CN ∪ NBR(CN))) ≠ ∅
4         (NBR(TX(p2)) ∩ NBR(NH(p2)) − (CN ∪ NBR(CN))) ≠ ∅

We use the topology shown in Fig. 3 as an example to explain how a butterfly structure coding is accomplished. Upon receiving a Pkt(A) and a Pkt(C), node B first checks whether these two packets satisfy the rules of BRS. If they do, it means that Pkt(A) and Pkt(C) have the same next-hop node (indicated by rule (1)) but different destination nodes (indicated by rule (2)). In addition, these two packets are likely to be overheard by some nodes that are on their routes to their respective destination nodes (indicated by rules (3) and (4)).

In this condition, node B first codes Pkt(A) and Pkt(C) to generate (Pkt(A) ⊕ Pkt(C)). It then designates the routes of (Pkt(A) ⊕ Pkt(C))’s component packets (i.e., Pkt(A) and Pkt(C)) for (Pkt(A) ⊕ Pkt(C))’s next-hop node E. That is, node E should forward Pkt(A) and Pkt(C) (and thus (Pkt(A) ⊕ Pkt(C))) using the routes indicated by the NextHop fields of (Pkt(A) ⊕ Pkt(C))’s CPI entries. In this example case, for (Pkt(A) ⊕ Pkt(C))’s next-hop node E, node B designates node F to be Pkt(A)’s next-hop node and node D to be Pkt(C)’s next-hop node. Lastly, node B broadcasts the coded packet (Pkt(A) ⊕ Pkt(C)) out.

After receiving the coded packet, node E will broadcast it again because node E neighbors nodes D and F, which are the next-hop nodes specified in the CPI entries of the coded packet. Finally, on receiving (Pkt(A) ⊕ Pkt(C)), nodes D and F can decode it to obtain Pkt(C) and Pkt(A), respectively, because the former possesses Pkt(A) and the latter possesses Pkt(C).
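The route designation and decoding steps of this example can be traced with the following sketch. It assumes the same 2x3 adjacency map for Fig. 3 as a working hypothesis, uses hypothetical field and function names, and picks the alphabetically first candidate when several nodes qualify (the paper does not state a tie-break).

```python
# Assumed one-hop adjacency for the grid of Fig. 3 (not taken from the paper).
NBR = {
    "A": {"B", "D"}, "B": {"A", "C", "E"}, "C": {"B", "F"},
    "D": {"A", "E"}, "E": {"B", "D", "F"}, "F": {"C", "E"},
}


def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bitwise XOR of two equal-length payloads."""
    return bytes(a ^ b for a, b in zip(x, y))


def likely_holders(pkt: dict, cn: str) -> set:
    """Nodes that probably overheard pkt (the set used in Tables 4 and 5)."""
    return (NBR[pkt["tx"]] & NBR[pkt["nh"]]) - ({cn} | NBR[cn])


def designate_next_hops(p1: dict, p2: dict, cn: str) -> dict:
    """Table 5: p2's next hop is a likely holder of p1, and vice versa."""
    # sorted(...)[0] is only a deterministic tie-break chosen for this sketch.
    return {
        "p1": sorted(likely_holders(p2, cn))[0],
        "p2": sorted(likely_holders(p1, cn))[0],
    }


pkt_a = {"tx": "A", "nh": "E", "payload": b"AAAA"}
pkt_c = {"tx": "C", "nh": "E", "payload": b"CCCC"}

# Node B fills in the NextHop fields of the two CPI entries, then codes.
next_hops = designate_next_hops(pkt_a, pkt_c, cn="B")
coded = xor_bytes(pkt_a["payload"], pkt_c["payload"])

# D overheard Pkt(A) and recovers Pkt(C); F overheard Pkt(C) and recovers Pkt(A).
at_d = xor_bytes(coded, pkt_a["payload"])
at_f = xor_bytes(coded, pkt_c["payload"])
```

Under these assumptions the designation matches the example in the text: node F becomes Pkt(A)’s next hop and node D becomes Pkt(C)’s next hop.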

The operation of RCU on node n only requires (1) the neighborhood information of node n and its one-hop neighboring nodes; and (2) the next-hop node information for each packet on node n. The former can be easily obtained from the control messages of an IEEE 802.16(d) mesh network,³ while the latter can be obtained by consulting the collaborative routing protocol. In modern operating systems, routing protocols usually have standard APIs to export their maintained route information. Therefore, it is easy to integrate RCU with the collaborative routing protocol.

4. The EHT problems

As introduced in Section 1, most previous work assumes that no packet collisions occur in a network-coding-based network [7,10]. This assumption, however, is not always true in a real-life network. For example, the presence of HTs can result in severe packet collisions in a wireless network. To solve the HT problem, the IEEE 802.11 network employs the RTS/CTS mechanism to protect data packets from collisions. Using this mechanism, however, each pair of transmitting and receiving nodes has to exchange RTS and CTS messages before data transmission is carried out. Thus, although the RTS/CTS mechanism can avoid data packet collisions, it consumes much network bandwidth to transmit control messages. For this reason, the performance gain of inter-session network coding in a real-life 802.11(b) network can be quite low.

On the other hand, the IEEE 802.16(d) mesh network uses a reservation-based approach to schedule data transmission. In this network, control messages are transmitted over transmission opportunities (TxOpps) and data packets are transmitted over minislots. The operation of this network can be found in [16,17]. For brevity, we do not present it in this paper. Before transmitting data, the transmitting and receiving nodes have to complete a three-way handshake procedure (THP) to obtain a minislot allocation, during which the transmitting node is ready to transmit data and the receiving node is ready to receive data.

The operation of THP is explained here. First, the transmitting node transmits a request Information Element (IE) and an availability IE to the receiving node using an MSH-DSCH message. The request IE specifies the number of minislots that the requesting node needs to transmit data and the availability IE specifies a set of consecutive minislots on which the transmitting node can transmit data. That is, the request IE indicates the amount of link bandwidth that the transmitting node requests and the availability IE indicates the minislot set from which the receiving node can choose.

On receiving these two IEs, the receiving node first determines whether it can receive data from the transmitting node within the minislot set specified by the received availability IE. If not, the receiving node can simply ignore this bandwidth request. Otherwise, it should schedule a minislot allocation within the indicated minislot set and then transmit a grant IE back to the transmitting node as an acknowledgment using its MSH-DSCH message. The grant IE specifies the minislot set on which the receiving node is willing to receive data from the transmitting node. (The minislot set indicated in the grant IE must be a subset of the minislot set specified in its corresponding availability IE.)
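The subset constraint on the grant can be illustrated with a small first-fit search. This is a sketch only: the first-fit policy, the function name `choose_grant`, and the representation of minislots as integers are assumptions made for illustration; neither the standard's actual scheduling policy nor the authors' implementation is reproduced here.

```python
from typing import Optional, Set


def choose_grant(requested: int, availability: range, busy: Set[int]) -> Optional[range]:
    """Return the first run of `requested` consecutive free minislots inside
    `availability`, or None when the request has to be ignored."""
    run_start = None
    for slot in availability:
        if slot in busy:
            run_start = None          # a busy minislot breaks the current run
            continue
        if run_start is None:
            run_start = slot          # start a new run of free minislots
        if slot - run_start + 1 == requested:
            return range(run_start, slot + 1)
    return None


# The requester offers minislots 10..19; the granting node is busy on 11 and 12.
grant = choose_grant(requested=3, availability=range(10, 20), busy={11, 12})
too_large = choose_grant(requested=5, availability=range(0, 4), busy=set())
```

Because the search only ever visits minislots from the availability set, any returned grant is a subset of that set by construction.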

Upon receiving the grant IE, the transmitting node then broadcasts a confirm IE using its MSH-DSCH message to complete this THP. The confirm IE is a copy of the received grant IE. It is used to notify the transmitting node’s neighboring nodes of the activation information of this minislot allocation. Because grant and confirm IEs are broadcast, nodes neighboring the transmitting and receiving nodes can learn when this minislot allocation will take place and thus suspend their data transmissions for that duration. By using this three-way design, THP can eliminate HTs around the transmitting and receiving nodes. (Note that, in an 802.16(d) mesh network, transmitting MSH-DSCH messages is guaranteed to be collision-free due to the use of a distributed election algorithm [16].)

Table 5
Route designation of the next-hop node of a coded packet using BRS.

Rule ID   Rule statement
1         NH(p2) ∈ (NBR(TX(p1)) ∩ NBR(NH(p1)) − (CN ∪ NBR(CN)))
2         NH(p1) ∈ (NBR(TX(p2)) ∩ NBR(NH(p2)) − (CN ∪ NBR(CN)))

³ The one-hop node information can also be obtained from hello messages or link-state messages received by the collaborative routing protocol.

In our implementation, when network coding is used, a node that receives a confirm IE will not schedule its own data transmissions on the minislots indicated by the received confirm IE, so that it will be idle at that time to overhear neighboring nodes’ packets. Although this constraint may reduce the flexibility of minislot scheduling, our simulation results show that the proposed RNC-based schemes still greatly outperform traditional routing on end-to-end flow goodputs and packet delays by reducing the number of transmitted packets and decreasing packet queuing delays. (The minislot scheduling of traditional routing need not take this constraint into account.)

Unlike the RTS/CTS mechanism used in 802.11 networks, the THP used by 802.16(d) mesh networks can schedule a minislot allocation that lasts at most 1.28 s. Thus, the bandwidth and time overheads introduced by the THP can be amortized over time and are therefore less significant than those introduced by the RTS/CTS mechanism. However, for inter-session network coding, eliminating only the HTs around the transmitting and receiving nodes cannot avoid all types of packet collisions. In the following, we point out a new EHT problem which occurs only in network-coding-based wireless networks. The EHT problem can result in packet collisions different from those caused by the HT problem and significantly decrease the performance gain that can be achieved by inter-session network coding.

Fig. 4 shows an example 5-node chain network, where the dotted circles denote the transmission ranges of nodes B and D, respectively. A solid arrow from node n1 to node n2 represents that a packet transmitted by n1 is destined to n2 while a dotted arrow from node n1 to node n2 represents that a packet transmitted by n1 can be overheard by n2. (This also means that n1 can interfere with n2 if one of them transmits data and the other receives data from another node at the same time.) In this network, two greedy UDP flows are generating traffic. One is from node A to node E and the other is from node E to node A. Packets generated by these two flows have to be forwarded by nodes B, C, and D to reach their respective destination nodes.

Suppose that node B is going to forward a packet to node A and node D is going to forward a packet to node E. In a routing-based wireless network, these two packet transmissions can be scheduled to take place at the same time, because the receiving nodes A and E are not interfered with by other nodes. Although the packets transmitted by nodes B and D may collide on node C, such a packet collision does not affect the packet receptions of nodes A and E. In a network-coding-based wireless network, however, these two simultaneous packet transmissions can significantly decrease the performance of inter-session network coding. The reason is explained below.

In this chain network, when inter-session network coding is used, nodes B and D will use chain structure coding to code packets that are forwarded to different nodes. For example, node B will code Pkt(A) (destined to node C) and Pkt(C) (destined to node A) to generate a coded packet (Pkt(A) ⊕ Pkt(C)) and broadcast it out. Similarly, node D will code Pkt(E) (destined to node C) and Pkt(C) (destined to node E) to generate a coded packet (Pkt(E) ⊕ Pkt(C)) and broadcast it out. If these two coded packets are transmitted simultaneously, they will collide on node C. In such a condition, node C cannot receive these two coded packets and thus cannot decode them to obtain any of the packets that nodes B and D intend to forward to it. Therefore, the packet forwarding performance of network coding will be greatly degraded due to these undesired packet collisions.

The EHT problem can occur not only in chain structures but also in butterfly structures. We explain the latter using an example 9-node grid network shown in Fig. 5. In this example network, traffic is generated by four greedy UDP flows, whose source and destination node pairs are (A, I), (C, G), (I, A), and (G, C), respectively.

The two flows (A, I) and (C, G) select node B as the next-hop node of their packets and form a butterfly structure, while the two flows (I, A) and (G, C) select node H as the next-hop node of their packets and form another butterfly structure. In the first butterfly structure, packets generated by the flow (A, I) should be forwarded to node F to be decoded and those generated by the flow (C, G) should be forwarded to node D to be decoded. Similarly, in the second butterfly structure, packets generated by the flow (G, C) should be forwarded to node F to be decoded and those generated by the flow (I, A) should be forwarded to node D to be decoded.

To transmit data, nodes A and C need to negotiate minislot allocations with node B (denoted as MA_AB and MA_CB, respectively) using THP. Setting up MA_AB and MA_CB ensures that node B can successfully receive Pkt(A) and Pkt(C) without collisions. On the other hand, the flows (G, C) and (I, A) are also transmitting their data using the second butterfly structure at the same time. Similarly, nodes G and I need to negotiate minislot allocations, MA_GH and MA_IH, with node H using THP. Setting up these two minislot allocations ensures that node H can successfully receive Pkt(G) and Pkt(I) without collisions.

However, note that, for the first butterfly structure, node D is required to successfully overhear Pkt(A) to decode (Pkt(A) ⊕ Pkt(C)) and, for the second butterfly structure, it is also required to successfully overhear Pkt(G) to decode (Pkt(G) ⊕ Pkt(I)). Similarly, for the first butterfly structure, node F is required to successfully overhear Pkt(C) to decode (Pkt(A) ⊕ Pkt(C)) and, for the second butterfly structure, it is required to successfully overhear Pkt(I) to decode (Pkt(G) ⊕ Pkt(I)). Since MA_AB and MA_CB only guarantee the success of node B’s packet reception and MA_GH and MA_IH only guarantee the success of node H’s packet reception, it is not guaranteed that node D can successfully receive packets transmitted from nodes A and G or that node F can successfully receive packets transmitted from nodes C and I.

In such a condition, if nodes A and G simultaneously transmit their packets, their packets will collide on node D. Similarly, if nodes C and I transmit their packets at the same time, their packets will collide on node F. Because nodes D and F cannot obtain the necessary remedy packets for (Pkt(A) ⊕ Pkt(C)) and (Pkt(G) ⊕ Pkt(I)), they will fail to decode these coded packets. As a result, such undesired packet collisions between different butterfly structures can greatly decrease the network goodput achieved by inter-session network coding. To solve this EHT problem, we propose a smart handshake procedure (SHP) to replace the original THP for IEEE 802.16(d) mesh networks. The design of SHP is explained in Section 5.
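The collisions in this example can be reproduced with a simple range check. The sketch below assumes that Fig. 5 is a 3x3 grid (A, B, C / D, E, F / G, H, I) in which each node hears only its horizontal and vertical neighbors; this adjacency map and the function name are assumptions for illustration.

```python
# Assumed one-hop adjacency for the 3x3 grid of Fig. 5 (not taken from the paper).
NBR = {
    "A": {"B", "D"}, "B": {"A", "C", "E"}, "C": {"B", "F"},
    "D": {"A", "E", "G"}, "E": {"B", "D", "F", "H"}, "F": {"C", "E", "I"},
    "G": {"D", "H"}, "H": {"E", "G", "I"}, "I": {"F", "H"},
}


def collided_listeners(transmitters: set, listeners: set) -> set:
    """Listeners that are in range of two or more simultaneous transmitters."""
    return {
        n for n in listeners
        if n not in transmitters and len(NBR[n] & transmitters) >= 2
    }


# A and G transmit in the same minislots: B and H receive, D tries to overhear.
first_case = collided_listeners({"A", "G"}, {"B", "H", "D"})

# C and I transmit in the same minislots: B and H receive, F tries to overhear.
second_case = collided_listeners({"C", "I"}, {"B", "H", "F"})
```

The intended receivers B and H are unaffected in both cases, which is why THP alone does not detect the problem; only the overhearing nodes D and F lose their remedy packets.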

5. Smart handshake procedure

Our proposed SHP comprises four parts. The first is a four-way handshake procedure (abbreviated as FHP) extended from the THP defined in the IEEE 802.16 mesh-mode standard; the second is the resolution for scheduling conflicts; the third is the activation time estimation for a minislot allocation; and the last is the timing control for starting an FHP. In this section, we explain the details of these parts in sequence.

5.1. FHP

The operation of FHP is illustrated in Fig. 6, where a solid arrow from n1 to n2 denotes a message transmitted from n1 and destined to n2 and a dotted arrow from n1 to n2 denotes a message that is transmitted from n1 and can be overheard by n2. The operation of FHP is the same as that of THP except that, in FHP, some nodes are required to broadcast a new type of IE called the ‘‘extended confirm’’ IE (denoted as ExtConfirm IE in the figure for brevity) to further disseminate the activation information of a scheduled minislot allocation. The detailed operation of FHP is explained below.

First, the requesting node sends the granting node a request IE. Then, the granting node acknowledges the requesting node with a grant IE. After receiving the grant IE, the requesting node broadcasts a confirm IE to accomplish the original THP. Later on, upon receiving a confirm IE, nodes other than the granting node have to broadcast extended confirm IEs to further distribute the activation information of this minislot allocation. An extended confirm IE is simply a copy of its corresponding confirm IE. By broadcasting extended confirm IEs, nodes that are two hops away from the requesting node can know when this minislot allocation will be activated. Thus, they can suspend their data transmission during that period. Using this design, for two overlapped coding structures, their transmitting nodes can transmit their data at different times to avoid packet collisions resulting from the EHT problem.


Let us revisit the example shown in Fig. 5. If FHP is used, upon receiving node A’s confirm IE, node D will broadcast an extended confirm IE to notify its neighboring nodes of this minislot allocation. Thus, node G can know when node D will overhear packets transmitted by node A and thus restrain its packet transmissions during that period. Similarly, upon receiving node G’s confirm IE, node D will broadcast an extended confirm IE to notify neighboring nodes of node G’s minislot allocation. This allows node A to know when node D overhears node G’s packets. Thus, it can restrain its packet transmissions during that period to avoid packet collisions on node D. As a result, FHP can reduce the number of EHTs and thus reduce packet collisions resulting from them in network-coding-based networks.
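The difference between THP and FHP in this example is which nodes learn about a minislot allocation. The following sketch compares the two, again assuming a 3x3 grid adjacency for Fig. 5 and assuming that every broadcast IE is received by all one-hop neighbors; the function names are hypothetical.

```python
# Assumed one-hop adjacency for the 3x3 grid of Fig. 5 (not taken from the paper).
NBR = {
    "A": {"B", "D"}, "B": {"A", "C", "E"}, "C": {"B", "F"},
    "D": {"A", "E", "G"}, "E": {"B", "D", "F", "H"}, "F": {"C", "E", "I"},
    "G": {"D", "H"}, "H": {"E", "G", "I"}, "I": {"F", "H"},
}


def informed_thp(req: str, grant: str) -> set:
    """Nodes that hear the grant IE (sent by `grant`) or the confirm IE (sent by `req`)."""
    return (NBR[req] | NBR[grant]) - {req, grant}


def informed_fhp(req: str, grant: str) -> set:
    """THP set plus every node that hears an extended confirm IE."""
    relays = NBR[req] - {grant}   # confirm IE receivers other than the granting node
    extended = set().union(*(NBR[r] for r in relays))
    return (informed_thp(req, grant) | extended) - {req, grant}


thp_nodes = informed_thp("A", "B")
fhp_nodes = informed_fhp("A", "B")
newly_informed = fhp_nodes - thp_nodes
```

Under these assumptions the only node that FHP informs in addition to THP is node G, which is exactly the node that must stay silent while node D overhears node A.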

5.2. Scheduling conflict resolution

Due to the holdoff time design of the IEEE 802.16(d) mesh CDS mode [16], it takes some time to finish transmitting the necessary IEs of an FHP (i.e., request, grant, confirm, and extended confirm IEs). This means that two FHPs may overlap in time, so that the two ongoing FHPs may schedule minislots that conflict with each other. On receiving an extended confirm IE, which asks it to suspend transmission during the specified duration, a node may find that its ongoing FHP has chosen some part of the specified duration for transmission. Fig. 7 illustrates an example of such a scheduling conflict.

In this example network, node R1 intends to schedule a minislot allocation (denoted as MA1) with node G1 and node R2 intends to schedule a minislot allocation (denoted as MA2) with node G2. As one sees, nodes R1 and R2 independently initiate their own FHPs with nodes G1 and G2 during the same period. Therefore, node R1 does not know of the existence of MA2 when scheduling MA1 because it has not yet received the extended confirm IE of MA2. Similarly, node R2 does not know of the existence of MA1 when scheduling MA2 because it has not received the extended confirm IE of MA1 at that time. In this condition, nodes R1 and R2 may schedule MA1 and MA2 so that they overlap with each other, resulting in a scheduling conflict.

Note that, after receiving an extended confirm IE of MA2, node R1 will detect that MA1 overlaps with MA2. Similarly, after receiving an extended confirm IE of MA1, node R2 will detect that MA2 overlaps with MA1. At this point, however, the minislot allocations MA1 and MA2 have been scheduled on the sending and receiving nodes and cannot be canceled.

When a scheduling conflict occurs, it is important to determine which node can transmit data and which node should not. For this purpose, we propose a Scheduling Conflict Resolution Algorithm (SCRA). SCRA takes two minislot allocations as its inputs and returns the one that can use the overlapped minislots as its output. Its pseudo code is shown in Algorithm 4, where the function sf takes a minislot allocation MA as its input and returns the starting frame number of MA as its output. The functions sm and nid take a minislot allocation MA as their input and return the starting minislot number of MA within a frame and the ID of MA’s transmitting node as their output, respectively. The details of SCRA are explained below. (Note that the pseudo code of SCRA shown here is for illustration purposes only. The details of SCRA for solving the frame number wrapping problem are omitted in this paper for brevity.)

Upon receiving an extended confirm IE, node i first obtains the minislot allocation MA_ec from the received extended confirm IE and then checks whether an existing minislot allocation conflicts with MA_ec. If no such minislot allocation exists, SCRA immediately terminates and returns MA_ec as its output. Otherwise, SCRA sets the conflicting minislot allocation to MA_ori and performs the following checks. First, if the starting frame number of MA_ec is larger than that of MA_ori, SCRA returns MA_ori as its output. Otherwise, if the starting frame numbers of the two minislot allocations are the same, SCRA returns the minislot allocation that has the larger starting minislot number within a frame. If the sf and sm values of MA_ec and MA_ori are both the same, the tie is resolved by comparing the IDs of their transmitting nodes: SCRA returns the minislot allocation whose transmitting node ID is larger. In the remaining case, where the starting frame number of MA_ec is smaller than that of MA_ori, SCRA returns MA_ec.

Algorithm 4. The Scheduling Conflict Resolution Algorithm for node i

Require: Node i receives an extended confirm IE
1: MA_ec := the minislot allocation obtained in the extended confirm IE
2: MA_ori := the existing minislot allocation conflicting with MA_ec
3: if MA_ori = ∅ then
4:   Return MA_ec
5: end if
6: WinnerMA ← ∅
7: if sf(MA_ec) is larger than sf(MA_ori) then
8:   WinnerMA ← MA_ori
9: else if (sf(MA_ec) = sf(MA_ori)) and (sm(MA_ori) is larger than sm(MA_ec)) then
10:   WinnerMA ← MA_ori
11: else if (sf(MA_ec) = sf(MA_ori)) and (sm(MA_ec) = sm(MA_ori)) and (nid(MA_ori) is larger than nid(MA_ec)) then
12:   WinnerMA ← MA_ori
13: else
14:   WinnerMA ← MA_ec
15: end if
16: Return WinnerMA
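Algorithm 4 translates almost line by line into runnable code. The sketch below is an illustration, not the authors' implementation: the `MinislotAllocation` class and its field names are hypothetical stand-ins for the sf, sm, and nid functions, and, like the pseudo code, it ignores frame number wrapping.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MinislotAllocation:
    start_frame: int      # sf(MA): starting frame number
    start_minislot: int   # sm(MA): starting minislot number within a frame
    tx_node_id: int       # nid(MA): ID of the transmitting node


def scra(ma_ec: MinislotAllocation,
         ma_ori: Optional[MinislotAllocation]) -> MinislotAllocation:
    """Return the allocation allowed to use the overlapped minislots."""
    if ma_ori is None:                       # no existing conflicting allocation
        return ma_ec
    if ma_ec.start_frame > ma_ori.start_frame:
        return ma_ori                        # the earlier-starting allocation wins
    if ma_ec.start_frame == ma_ori.start_frame:
        if ma_ori.start_minislot > ma_ec.start_minislot:
            return ma_ori                    # the larger starting minislot wins
        if (ma_ori.start_minislot == ma_ec.start_minislot
                and ma_ori.tx_node_id > ma_ec.tx_node_id):
            return ma_ori                    # the larger transmitting node ID wins
    return ma_ec


# Same frame and minislot: the tie is broken by the transmitting node ID.
ma_ec = MinislotAllocation(start_frame=5, start_minislot=3, tx_node_id=9)
ma_ori = MinislotAllocation(start_frame=5, start_minislot=3, tx_node_id=2)
winner = scra(ma_ec, ma_ori)
```

Because both conflicting nodes evaluate the same deterministic comparison on the same two allocations, they reach the same decision without exchanging further messages.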

By using SCRA, after receiving an extended confirm IE that indicates minislot allocation conflicts, nodes can use FHP to resolve them. Compared with THP, the FHP design with SCRA (denoted as FHP-SCRA) can increase scheduling performance. However, it may encounter two problems that degrade its scheduling performance. We explain these two problems and our proposed solutions below.

5.3. Activation time estimation for a minislot allocation

The first problem of FHP-SCRA is that bandwidth is wasted during the period before the extended confirm IEs of conflicting minislot allocations can be broadcast. As can be seen in Fig. 8, when using FHP-SCRA, two neighboring requesting-granting node pairs (R1, G1) and (R2, G2) may independently launch two FHPs during the same period. In such a condition, (R1, G1) and (R2, G2) may schedule their respective minislot allocations MA1 and MA2 during the same period (denoted as the rectangles shown in Fig. 8). With the aid of SCRA, nodes R1 and R2 can resolve the scheduling conflicts between MA1 and MA2 after receiving the extended confirm IEs for these two minislot allocations. However, before such extended confirm IEs can be broadcast by the one-hop neighboring nodes of R1 and R2, MA1 and MA2 may have already been activated. In this condition, R1 and R2 will transmit their data at the same time until they receive the extended confirm IEs of MA1 and MA2, causing packet collisions.

To solve this problem, we propose an Activation Time Estimation Algorithm (ATEA) for requesting nodes to estimate a good activation time for their minislot allocations (in units of frame number). As Fig. 9 shows, using ATEA each node can schedule a minislot allocation that is activated after its launched FHP is completed. This will reduce the bandwidth wasted on such collided packets. ATEA takes the next MSH-DSCH TxOpp information of all neighboring nodes as its inputs and returns a suitable starting frame number for a requested minislot allocation as its output. Its pseudo code is shown in Algorithm 5 and explained below. (Note that the pseudo code shown in Algorithm 5 is for illustration purposes only. The details of ATEA for solving the frame number and TxOpp number wrapping problems are omitted in this paper for brevity.)

Algorithm 5. The Activation Time Estimation Algorithm for node i

1: N_G := the granting node
2: MyNextTxOpp := the next MSH-DSCH TxOpp of node i
3: NbrList ← (NBR(i) − N_G)
4: if MyNextTxOpp is larger than the next MSH-DSCH TxOpp of N_G then
5:   EstTxOpp ← MyNextTxOpp
6: else
7:   EstTxOpp ← the next MSH-DSCH TxOpp of N_G
8: end if
9: for node j ∈ NbrList do
10:   TxOpp_j ← the next MSH-DSCH TxOpp of node j
11:   if TxOpp_j is larger than EstTxOpp then
12:     EstTxOpp ← TxOpp_j
13:   end if
14: end for
15: EstFrNum ← fr(EstTxOpp)
16: Return EstFrNum
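Algorithm 5 amounts to taking a maximum over known TxOpp numbers and mapping it to a frame. The sketch below illustrates this under explicit assumptions: TxOpps are numbered globally and consecutively, the fr function is modeled as integer division by a hypothetical `txopps_per_frame` parameter, and wrapping is ignored as in the pseudo code. None of these details is specified in the text.

```python
from typing import Dict, Iterable


def atea(my_next_txopp: int,
         next_txopp: Dict[str, int],
         granting_node: str,
         neighbors: Iterable[str],
         txopps_per_frame: int) -> int:
    """Estimate the starting frame number for a requested minislot allocation."""
    # Lines 4-8: start from the later of the requester's and the granter's TxOpp.
    est = max(my_next_txopp, next_txopp[granting_node])
    # Lines 9-14: account for every other neighbor's next MSH-DSCH TxOpp.
    for j in neighbors:
        if j == granting_node:
            continue
        est = max(est, next_txopp[j])
    # Line 15: fr() is assumed here to be integer division by TxOpps per frame.
    return est // txopps_per_frame


known_txopps = {"G1": 7, "N1": 12, "N2": 19}
est_frame = atea(my_next_txopp=10, next_txopp=known_txopps, granting_node="G1",
                 neighbors=["G1", "N1", "N2"], txopps_per_frame=4)
```

In this example the latest known TxOpp is that of neighbor N2, so the allocation is not activated before the frame in which N2 can broadcast its extended confirm IE.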

To determine the starting frame number for a requested minislot allocation (on the requesting node), ATEA first compares the next MSH-DSCH TxOpp of the requesting node (i.e., the current node) with that of the granting node. If the former is larger than the latter, it means that, after receiving request/availability IEs from the requesting node, the granting node can transmit grant IEs back to the requesting node over a TxOpp that precedes the requesting node’s next TxOpp. From this information, ATEA can estimate when the requesting node can transmit its confirm IE and when the broadcasts of the extended confirm IEs can be finished, and can thus calculate an appropriate starting frame number for the requested minislot allocation. This is accomplished in two steps. First, ATEA chooses the largest MSH-DSCH TxOpp number among the next MSH-DSCH TxOpp of the requesting node and those of all its one-hop neighboring nodes (except the granting node) as the estimated TxOpp number. Second, ATEA uses the fr function to compute the number of the frame that contains the estimated TxOpp number as its output. The rationale of this design is explained below.

If the next MSH-DSCH TxOpp of a neighboring node n is larger than that of the requesting node, it implies that, upon receiving the requesting node’s confirm IE, node n can broadcast the corresponding extended confirm IE on its next MSH-DSCH TxOpp known by the requesting node. In such a condition, ATEA can know when node n will broadcast this extended confirm IE. By exploiting this information, ATEA can more precisely estimate when all of these neighboring nodes will finish broadcasting their extended confirm IEs. This is done by finding the largest TxOpp number among their next MSH-DSCH TxOpps already known by the requesting node.

On the other hand, if the next MSH-DSCH TxOpp of the requesting node is smaller than that of the granting node, the requesting node is not sure when it can transmit the confirm IE because it cannot know its ‘‘next next’’ MSH-DSCH TxOpp at the current point in time. Although the information is not complete, ATEA still chooses the largest MSH-DSCH TxOpp number among the granting node’s next MSH-DSCH TxOpp and those of all the requesting node’s other one-hop neighboring nodes as the estimated TxOpp number and then uses the fr function to compute the number of the frame containing the estimated TxOpp number as its output. In such a case, ATEA cannot precisely determine the best activation time for the requested minislot allocation. However, it is still useful for mitigating the EHT problem because it allows more nodes to know the activation time of this minislot allocation before it is activated.

Fig. 8. Why ATEA is needed.

5.4. Timing control for starting an FHP

The second problem of FHP-SCRA is that the bandwidth is under-utilized due to its long latency. For a requesting node, the best time to activate its next minislot allocation (denoted as MA) is after all its one-hop neighboring nodes (except for the granting node) have broadcast their extended confirm IEs for MA. As shown in Fig. 10, however, under such a policy a requesting node has to wait a long time before its next minislot allocation can be set up. (Note that in Fig. 10, nodes N1, N2, and N3 are all one-hop neighboring nodes of R1.) Such a design causes the network bandwidth to be under-utilized. To solve this problem, we propose an FHP Triggering Algorithm (FTA), which triggers an FHP in advance based on the estimated completion time of an FHP (determined by ATEA, proposed in the previous section). As shown in Fig. 11, by using FTA each node can start an FHP so that the requested minislot allocation is guaranteed (1) to be activated as soon as possible after the existing minislot allocation has elapsed and (2) not to be activated while the existing minislot allocation is still valid. In short, using FTA each node can effectively hide the long latency of FHP from traffic flows to increase bandwidth utilization.

The pseudo code of FTA is shown in Algorithm 6 and explained below. (Note that the pseudo code of FTA shown here is for illustration purposes only. The details of FTA for solving the frame number wrapping problem are omitted in this paper for brevity.)

Algorithm 6. The FHP Triggering Algorithm for node i
Require: node i is allowed to transmit an MSH-DSCH message
 1: frame_cur := the current frame number
 2: MA_req := the minislot allocation to be requested
 3: N_bits_q := the number of data bits currently buffered in the output queue
 4: if there is an active minislot allocation MA_act then
 5:     N_frame_r := the number of frames for which MA_act remains valid
 6:     N_bits_r := the number of data bits that can be transmitted using the remaining minislots of MA_act
 7:     N_frame_dist := ATEA(MA_req) - frame_cur
 8:     if N_frame_r ≤ N_frame_dist and N_bits_r < N_bits_q then
 9:         Start an FHP for MA_req
10:     end if
11: else
12:     if N_bits_q > 0 then
13:         Start an FHP for MA_req
14:     end if
15: end if
16: Return

For node i, each time it is allowed to transmit an MSH-DSCH message, FTA first examines whether an active minislot allocation (denoted as MA_act) exists. If not, it then checks whether it has any data to send. If so, FTA immediately starts an FHP. Otherwise, it simply returns. On the other hand, if MA_act exists, node i first uses ATEA to compute the estimated starting frame number of the minislot allocation to be requested (denoted as ATEA(MA_req)) and then computes the frame distance between the current frame and ATEA(MA_req) (denoted as N_frame_dist). It then compares the number of MA_act’s remaining frames with N_frame_dist. If (1) the former is smaller than or equal to the latter and (2) the remaining scheduled bandwidth cannot accommodate the transmission of the data currently buffered in the output queue, FTA immediately starts an FHP to obtain the next minislot allocation as soon as possible. Otherwise, it defers starting the next FHP to avoid generating an MA_req that overlaps MA_act.

Fig. 10. Why FTA is needed.
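The decision made by FTA at each MSH-DSCH transmission opportunity can be expressed as a small Python predicate. This is a minimal sketch for illustration only: the names `ActiveAllocation` and `fta_should_start_fhp` are hypothetical, the ATEA estimate is assumed to be computed beforehand and passed in as `atea_frame`, and, as in the pseudo code, frame number wrapping is not handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActiveAllocation:
    """State of the currently active minislot allocation (hypothetical type)."""
    frames_remaining: int  # frames for which the allocation remains valid
    bits_remaining: int    # data bits its remaining minislots can still carry


def fta_should_start_fhp(
    frame_cur: int,
    bits_queued: int,
    atea_frame: int,
    active: Optional[ActiveAllocation],
) -> bool:
    """Return True if an FHP should be started at this MSH-DSCH TxOpp."""
    if active is None:
        # No active allocation: start an FHP only if data is waiting.
        return bits_queued > 0

    # Frame distance between now and the estimated activation frame.
    frame_dist = atea_frame - frame_cur

    # Start early only if the new allocation cannot become active before
    # the current one expires and the current one cannot drain the queue.
    return (active.frames_remaining <= frame_dist
            and active.bits_remaining < bits_queued)
```

For example, with the current frame at 100, an ATEA estimate of frame 105, and an active allocation that is valid for 3 more frames and can carry 500 more bits, a queue of 800 bits makes the predicate true, whereas an allocation valid for 8 more frames makes it false because the new allocation would overlap the existing one.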

In summary, by using ATEA each node can set the activation time of requested minislot allocations more properly to reduce bandwidth wastage, and by using FTA each node can start an FHP at a better time to increase bandwidth utilization. Together, they achieve better performance than FHP-SCRA while effectively eliminating HTs and EHTs.

6. Performance evaluation

In this section, the data forwarding performance of three schemes is studied. The first one is the original routing-based 802.16(d) mesh-mode network (denoted as STD); the second one is our proposed rule-based network coding with the original THP design (denoted as RNC-THP); the last one is the RNC using the proposed SHP design (denoted as RNC-SHP). We use the NCTUns network simulator [6] to evaluate the performance of the three schemes. In addition, in Section 6.1 we build an analytical model to derive the optimal coding performance in IEEE 802.16 mesh CDS-mode networks. These theoretical performance results are compared with those of the three evaluated schemes.

6.1. Analytical model

The IEEE 802.16(d) mesh CDS-mode network uses a distributed handshake process to negotiate for the bandwidth resource at the MAC layer in an on-demand manner. In addition, the MAC layer of the 802.16(d) mesh CDS mode can merge/fragment upper-layer packets in a data frame. As a result, data transmissions carried out at the MAC layer are bursty. The time required for a node to complete the handshake process varies depending on several factors: the holdoff times of one-hop and two-hop nodes, the number of competing nodes, and the current frame number. Let T_hx be a random variable denoting the time required for node x to complete the handshake procedure once. The optimal performance of pairwise network coding can be considered as follows.

We use the 3-node chain network shown in Fig. 12 as an example to explain the relationship between the minislot scheduling of the forwarding node (node 2) and those of the involved traffic source nodes (nodes 1 and 3). As shown in Fig. 15, the data transmissions of a forwarding node can be divided into five intervals: I_Q, I_NC, I_D1, I_D3, and I_idle.

Fig. 11. A minislot allocation scheduling example when FTA is used.

In the interval I_Q, node 2 is not ready for forwarding data but either node 1 or node 3 is transmitting data. Thus, node 2 has to queue the received packets within this interval and then transmit them out after it has established a minislot scheduling. In the interval I_NC, all of the three nodes are ready for transmitting/forwarding data. In this condition, node 2 can perform the packet mixing operation and broadcast coded packets. In the interval I_D1 (or I_D3), only node 1 (or node 3) and node 2 are ready for transmit-