Energy-conserving data gathering by mobile mules in a spatially separated wireless sensor network


Published online 15 August 2011 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/wcm.1184

RESEARCH ARTICLE


Fang-Jing Wu* and Yu-Chee Tseng

Department of Computer Science, National Chiao-Tung University, Hsin-Chu, 30010, Taiwan

ABSTRACT

This paper considers a spatially separated wireless sensor network, which consists of a number of isolated subnetworks that could be far away from each other. We address the issue of using mobile mules to collect data from these sensor nodes. In such an environment, both data-collection latency and network lifetime are critical issues. We model this problem as a bi-objective problem, called the energy-constrained mule traveling salesman problem (EM-TSP), which aims at minimizing the traversal paths of mobile mules such that at least one node in each subnetwork is visited by a mule and the maximum energy consumption among all sensor nodes does not exceed a pre-defined threshold. Interestingly, the traversal problem turns out to be a generalization of the classical traveling salesman problem (TSP), an NP-complete problem. Using some geometrical properties of the network, we propose some efficient heuristics for EM-TSP. We then extend our heuristics to multiple mobile mules. Extensive simulations have been conducted, which show that our proposed solutions usually give much better results than most TSP-like approximations. Copyright © 2011 John Wiley & Sons, Ltd.

KEYWORDS

data gathering; data mule; energy conservation; mobile sensor; traveling salesman problem (TSP); wireless sensor network

*Correspondence

Fang-Jing Wu, Department of Computer Science, National Chiao-Tung University, Hsin-Chu, 30010, Taiwan. E-mail: fangjing@cs.nctu.edu.tw

1. INTRODUCTION

The progress of embedded micro-sensing microelectromechanical systems (MEMS) and wireless communications has made wireless sensor networks (WSNs) feasible. A conventional WSN normally consists of a sink and many inexpensive sensor nodes deployed in a sensing field. Each node has the capability of sensing, collecting, processing, and storing environment information, and communicating with neighboring sensor nodes. Many applications such as object tracking, health monitoring, security surveillance, and intelligent transportation [1–4] have been proposed.

This paper considers a spatially separated WSN (SS-WSN), which consists of several isolated subnetworks. These subnetworks are not connected because of reasons such as cost constraints, physical constraints (rivers and mountains), or unavoidable disasters (explosions or earthquakes), and are thus called spatially separated. For example, as a result of geographical constraints, the sensing field could be huge, and deploying a connected WSN is very difficult. Even if the WSN is initially connected, it could be partitioned because of emergencies such as fires. In addition, when random deployment is adopted, the network is not necessarily connected. Therefore, we believe that an SS-WSN may appear frequently in many practical applications, such as ecology-observing systems [5]. However, coordination among these isolated subnetworks is necessary. We thus consider using mobile mules to travel among these subnetworks to collect data. Figure 1 shows an example, where a helicopter serves as a mobile mule to provide connectivity among these isolated subnetworks. In fact, depending on different scenarios and the availability of mules, there may exist various applications of such an SS-WSN architecture. For example, in a long-term monitoring application of a huge forest, deploying an SS-WSN is inevitable. In this case, a set of robotic cars may patrol along footpaths in the forest to collect sensing data of multiple geographical areas, where the moving paths between two subnetworks can be simply modeled as a set of footpaths. In contrast, in underwater monitoring applications, although there are no geographical constraints in an underwater environment, data collection may rely heavily on mobile mules because dramatic signal attenuation can easily partition a sensor network. In this case, submarines may serve as mobile mules [6].

Figure 1. An example of one round of data gathering by a mobile mule in a spatially separated wireless sensor network.

This paper considers the data gathering issue in an SS-WSN. We focus on exploiting mobile mules ('mules' for short) to collect subnetworks' sensory data. For example, in Figure 1, a mule is initially located at the sink node $v_0$. Assuming an identical data generation rate for each sensor, our goal is to dispatch the mule from $v_0$ to visit one sensor node in each subnetwork and then return to $v_0$. The node being visited in each subnetwork is called the landing port of the subnetwork. Any node can be selected as a landing port. (For load balance consideration, it is possible to have multiple landing ports in a subnetwork; this will be addressed later on.) Therefore, the mule will switch between an inter-subnetwork movement state and an intra-subnetwork data gathering state. During the movement state, the mule leaves from its current landing port and moves to the next landing port. During the data gathering state, the mule stays at a landing port, contacts all one-hop neighboring nodes (termed gateway nodes), requests all nodes in this subnetwork to form a data-collection tree rooted at the mule, and commands all nodes to relay their sensory data to the mule along the tree. When the mule returns to $v_0$, it relays all collected data to $v_0$. This completes one round of data gathering.

Two critical issues in the aforementioned scenario are data-collection latency and energy conservation. The former is to quickly collect sensory data from all sensors in the SS-WSN, whereas the latter is to prolong the network lifetime. To minimize the data-collection latency, we need to compute the shortest traversal path for the mule to visit each subnetwork at exactly one landing port. To prolong the network lifetime, we would instead have the mule visit every node in each subnetwork (so that no node needs to relay data for others). Clearly, there is a trade-off between the two. In this paper, we define a new problem called the energy-constrained mule traveling salesman problem (EM-TSP), which is a generalization of the Euclidean traveling salesman problem (ETSP) [7]. The goal of EM-TSP is to find the shortest traversal path of the mule that visits at least one landing port in each subnetwork. The landing ports of each subnetwork will connect all nodes in that subnetwork via some data-collection trees rooted at them, such that the maximum energy consumption among sensor nodes does not exceed a pre-defined threshold $\theta_e$. EM-TSP generalizes ETSP as follows: (i) it is sufficient to visit some landing ports in each subnetwork; (ii) load balance among gateways (due to the selection of landing ports) is essential; and (iii) the construction of intra-subnetwork data-collection trees also matters. On the other hand, if we limit $\theta_e$ such that a node can only transmit the data generated by itself but not relay data for others, EM-TSP degenerates to ETSP. We will give a formal nondeterministic polynomial (NP)-hardness proof.

Clearly, the existing solutions to ETSP cannot directly be applied to EM-TSP efficiently. We thus need to design a new data gathering scheme for EM-TSP. It consists of three phases. The first phase is to properly partition each subnetwork, if necessary, into multiple subnetworks to meet the constraint $\theta_e$. Note that $\theta_e$ is an application-specific parameter that provides flexibility in dealing with different application requirements. For example, one may use $\theta_e$ to balance between minimizing data-collection latency and maximizing the network lifetime. When $\theta_e$ is very large, each original subnetwork will remain as one subnetwork, thus imposing more work on data relay and shortening the network lifetime. On the contrary, when $\theta_e$ is very small, each original subnetwork will be divided into many small subnetworks, thus imposing more work on mules. The second phase is to plan a traversal path for the mule to visit each subnetwork. Several schemes are proposed. In particular, we adopt the approach in [8], which shows that, when nodes are placed in a Euclidean space, the convex hull (CH) of the input nodes in ETSP has some geometrical properties closely related to the optimal solution to ETSP. This motivates us to define a special convex polygon, termed convex container (CC), with some geometrical properties that help find a shorter traversal path. Finally, the third phase is to construct a data-collection tree in each subnetwork to minimize the energy consumption of nodes. In addition, we also extend our proposed scheme to the case of multiple mules. Extensive simulations have been conducted, which show that an exhaustive search usually cannot find the optimal solution in a limited time, whereas our heuristic not only has low computation complexity but also gives much better solutions than several TSP-like approximations.

The rest of this paper is organized as follows. Related work is discussed in Section 2. Section 3 describes our network model and the problem definition. Section 4 presents our algorithms. Simulation results are in Section 5. Finally, Section 6 concludes this paper.

2. RELATED WORK

In this section, we survey some mobility control schemes in mobile WSNs [9] and some path-planning techniques.

For mobility control, researchers have studied the functionality of mobile sensor nodes in both homogeneous and heterogeneous networks. For homogeneous networks, the authors of [10–12] consider that all sensor nodes have identical capability. To address coverage and connectivity issues, Zou and Chakrabarty [10] and Heo and Varshney [11] adopt virtual forces to move sensors, whereas Wang et al. [12] use Voronoi diagrams to detect coverage holes and then move sensors to cover these holes.

For heterogeneous networks, it is normally assumed that there are some resource-richer mobile nodes. In [13] and [14], actors are used to respond to dynamic events and provide appropriate actions. In [15] and [16], mobile actors with long transmission ranges are used to relay data for their local clusters, which are formed by static sensors. Recently, for spatially separated subnetworks, researchers have designed mobile nodes, called data mules, to conduct message relaying. Random mobility is assumed for such mules in [17–19]. Because random mobility may incur unbounded data-collection delay, message ferries with controllable mobility are studied. Reference [20] designs moving paths of robots for search and rescue systems. In [21], there is a ferry moving along a publicly known route. With knowledge of the ferry route, nodes can proactively schedule their transmission/reception with the ferry. An optimization problem is to find a ferry route such that the average message delay is minimized and that the communication time of each node can be met. Two ferrying schemes are proposed in [22]. One allows nodes to periodically move closer to the ferry route. The other allows nodes to request the ferry, via high-power radios, to approach them when they intend to transmit/receive. The optimization goal is to minimize message drops. Multiple ferries are considered in [23], where packets may be relayed by multiple ferries before reaching their destinations. In [24], sensors with different weights are deployed sparsely in an isolated way. Data mules must visit sensors along deterministic paths to collect their data such that sensors with higher weights are visited with lower inter-arrival time and that the total length of paths is minimized. Reference [25] further extends [24] by designing probabilistic paths for data mules. In [26], sensors are modeled as disjoint disks with different radii, and a mobile robot is dispatched to visit each sensor with the shortest traversal path. Note that a sensor is visited once the robot is within its communication range. (In comparison, our work assumes that it is sufficient to visit one representative node in each subnetwork.) To relieve the funneling effect [27] in a connected WSN, researchers have proposed to use mobile collectors. Given a candidate set of rendezvous points, finding moving paths to visit these points is studied in [28]. How to find better rendezvous points is studied in [29,30]. Distributed protocols are designed in [31,32] to navigate these mobile collectors, where a point may be visited by a multi-hop path.

The aforementioned results are highly related to the classical path-planning issues, such as the traveling salesman problem (TSP). In [33], a 2-approximation algorithm with complexity $O(n^2)$ for TSP based on the minimum spanning tree (MST) of nodes is proposed, where $n$ is the number of nodes. In [34], a 1.5-approximation, termed the Christofides heuristic, with complexity $O(n^3)$ based on the MST and the minimum-length matching of nodes is proposed. These two algorithms may find a self-intersecting path. When nodes are scattered in a Euclidean space, for a fixed $c > 1$, Arora [35] proposed a $(1 + 1/c)$-approximation scheme with complexity $O(n(\log n)^{O(c)})$, by putting nodes in a square box, partitioning them into smaller grids, and finding an initial path that enters and exits grids only through some special points, termed portals, such that self-intersections are avoided. Note that these approaches cannot handle EM-TSP because a mobile sensor only needs to be within the communication range of a node or a representative node of a subnetwork. With these path-planning technologies, Meliou et al. [36] designed the paths for query/reply messages in a static and connected WSN, where a path is allowed to split into multiple ones and eventually converge into one before arriving at the sink. Reference [37] extends [36] by considering the set of queried sensor nodes being time varying.
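To make the MST-based 2-approximation mentioned above concrete, the following is a minimal Python sketch of the textbook construction: build an MST with Prim's algorithm in $O(n^2)$ time and visit the nodes in preorder. It is an illustration only, not the code of [33]; the function name `mst_tsp_tour` and the point-list input format are assumptions made for this example.

```python
import math

def mst_tsp_tour(points):
    """Return a closed tour (list of indices) from a preorder walk of an MST.

    `points` is a list of (x, y) tuples; index 0 is the start node.
    Hypothetical helper, written only to illustrate the idea.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known link from the tree to each node
    parent = [-1] * n
    children = [[] for _ in range(n)]
    best[0] = 0.0
    for _ in range(n):
        # Prim's algorithm: pick the closest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    # Preorder walk of the MST; shortcutting repeated nodes gives the tour.
    order, stack = [], [0]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(reversed(children[u]))
    return order + [0]
```

By the triangle inequality, the resulting tour is at most twice the MST weight, which is itself a lower bound on the optimal tour length.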

3. NETWORK MODEL AND PROBLEM DEFINITION

The SS-WSN is modeled as an undirected graph $G = (V, E)$, where $V$ is the set of sensor nodes and $E$ is the set of communication links. A special node $v_0 \in V$ is designated as the sink node and is responsible for collecting sensory data from all other nodes in $V$. Each node has a communication range of $r_c$. Constrained by $r_c$ and the physical environment, the network $G$ is spatially separated in the sense that it is partitioned into multiple connected subnetworks $G_0, G_1, \ldots, G_n$. Without loss of generality, we let sink $v_0 \in G_0$. The location of each $v_i \in V$ is denoted by $(x_i, y_i)$ (how to identify nodes' locations is out of the scope of this work). The Euclidean distance between two nodes $v_i$ and $v_j$ in $V$ is denoted by $d(v_i, v_j)$. We assume a long-term monitoring application, where each sensor node has an identical data generation rate and an identical initial energy $E_{total}$. A mobile mule $m$ is responsible for moving to these subnetworks to collect and deliver their sensing data to $v_0$. We assume that $m$ also has a communication range of $r_c$.

We are interested in the data gathering issue in an SS-WSN $G$. This is achieved by the cooperation among $v_0$, $m$, and all subnetworks. The problem is formulated as follows. Time is divided into rounds. In each round, $m$ will leave from $v_0$, visit each subnetwork, collect all data therein, return to $v_0$, and forward the collected data to $v_0$. Therefore, during a round, $m$ will switch between an inter-subnetwork movement state and an intra-subnetwork data gathering state. A movement state starts when $m$ leaves from the landing port of its current subnetwork and ends when $m$ arrives at the landing port of the next subnetwork. A landing port of a subnetwork can be any sensor node in the subnetwork. In the beginning of a round, $m$ will stay at $v_0$. Then, it will visit one or multiple landing ports in each $G_i \neq G_0$ and return to $v_0$. (The reason for requiring visiting multiple landing ports in $G_i$ will become clear later on.) This completes one round of data gathering. A data gathering state starts after $m$ arrives at a landing port in $G_i$ and ends once it has collected all nodes' newly generated sensory data. Once landed, $m$ should contact all neighboring nodes within its communication range (termed gateway nodes), request all nodes in $G_i$ to form a data-collection tree rooted at $m$, and instruct all nodes in the tree to relay their sensory data to $m$ along the tree. Note that the mule does not need to conduct data collection in $G_0$ because nodes in $G_0$ can report to $v_0$ at any time. Figure 1 gives an example, where sensory data of each $G_i$ is relayed through its tree to the mule and then to $v_0$, except $G_0$.

We make two notes as follows. First, for practical reasons, when a subnetwork contains too many nodes, we may force it to be divided into multiple subnetworks, thus requiring multiple landing ports. This may reduce the energy consumption of gateway nodes and balance their load. We will show how to conduct such partitioning later on. In the following, for ease of presentation, unless stated otherwise, a 'subnetwork' will refer to one after conducting such partitioning. Second, after $m$ has landed at a landing port in $G_i$, the relaying load of this landing port should be regarded as zero (or near zero) because the root of the data-collection tree is the mule, rather than this landing port (the cost of sending its own data to $m$ is negligible). Thus, we will calculate the energy consumption of those gateway nodes associated with the tree. For example, in Figure 1, gateway nodes a, b, and c will take care of sensory data of 1 node, 3 nodes, and 5 nodes, respectively.

We consider two main performance metrics: the data-collection latency of $m$ and the energy consumption of sensor nodes. The former is modeled by the length of the traversal path of $m$ in a round, whereas the latter is modeled by the maximum energy consumption among all sensor nodes in a round. (We do not consider the data-collection latency per subnetwork because it should be relatively much faster than the movement of the mule.†) By putting these two goals together, the objective in a round becomes finding the shortest traversal path of $m$ to visit each subnetwork such that the maximum energy consumption among sensor nodes is minimized. However, these two goals contradict each other. Visiting a subnetwork exactly once is the best for the first metric but the worst for the second metric and vice versa. To resolve this problem, we define an optimization problem, termed EM-TSP, where the goal is to find the shortest path for $m$ to visit each subnetwork at least once and then return to $v_0$ such that the maximum energy consumption among all sensor nodes does not exceed a threshold $\theta_e$. To measure the energy consumption of a node, we consider the data-collection tree when $m$ visits a subnetwork $G_i$. Let $T$ be the data-collection tree of $G_i$. We model the energy cost of $v_k \in G_i$ by $E_T(v_k) = e \times \lambda \times \delta \times |T(v_k)|$, where $e$ is the energy consumption for a node to transmit one unit of sensory data, $\lambda$ is the data generation rate of each sensor node, $\delta$ is the maximum duration of a round, and $T(v_k)$ is the subtree of $T$ rooted at $v_k$. That is, $E_T(v_k)$ includes the energy cost to report the sensory data of $v_k$ and $v_k$'s descendants.
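As a rough illustration of this energy model, the sketch below computes the subtree size of every sensor in a data-collection tree and multiplies it by the per-unit transmission energy, the data generation rate, and the round duration. The parent-map representation and the names `subtree_sizes` and `energy_costs` are assumptions made for this example; the tree root stands for the mule and is excluded from the result because it is not a sensor.

```python
from collections import defaultdict

def subtree_sizes(parent):
    """Map each node to the size of the subtree rooted at it.

    `parent` maps node -> parent node; the root (the mule) maps to None.
    """
    children = defaultdict(list)
    root = None
    for v, p in parent.items():
        if p is None:
            root = v
        else:
            children[p].append(v)
    sizes = {}
    stack = [(root, False)]          # iterative post-order traversal
    while stack:
        v, expanded = stack.pop()
        if expanded:
            sizes[v] = 1 + sum(sizes[c] for c in children[v])
        else:
            stack.append((v, True))
            stack.extend((c, False) for c in children[v])
    return sizes

def energy_costs(parent, unit_energy, rate, round_duration):
    """Energy cost per sensor: unit energy x rate x round duration x subtree size."""
    sizes = subtree_sizes(parent)
    return {v: unit_energy * rate * round_duration * n
            for v, n in sizes.items() if parent[v] is not None}

# A tree shaped like the Figure 1 example: gateways a, b, c carry 1, 3, 5 nodes.
tree = {"m": None, "a": "m", "b": "m", "c": "m", "b1": "b", "b2": "b",
        "c1": "c", "c2": "c", "c3": "c1", "c4": "c1"}
costs = energy_costs(tree, unit_energy=2.0, rate=0.5, round_duration=10.0)
```

The maximum over `costs.values()` is the quantity that the energy threshold constrains.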

†Let $t$ be the packet transmission time on a link. Given a subnetwork $G_i$, if a well-scheduled MAC protocol [38] is adopted, the intra-subnetwork data-collection latency can be approximated by $t \times |G_i|$ (this happens when a pipeline effect occurs such that one packet is delivered to the mule per time unit while the mule is visiting $G_i$). Therefore, the total data-collection latency can be approximated by the mule traveling time plus $\sum_{i=0}^{n} t \times |G_i|$. The latter factor is close to a

Definition 1. Given an SS-WSN $G = (V, E)$, a sink node $v_0$ in $G_0$, and an energy threshold $\theta_e$, the EM-TSP is to find a proper partition of $G$ into subnetworks and a traversal path $P$ starting and ending at $v_0$, visiting each subnetwork $G_i \neq G_0$ at one landing port, and connecting all nodes in $G_i$ via a data-collection tree $T$ rooted at the landing port, such that $\max_{v_k \in V} E_T(v_k) \leq \theta_e$ and the total length $|P|$ is minimized.

To prove that EM-TSP is NP-hard, we define a decision problem as follows.

Definition 2. Given an SS-WSN $G = (V, E)$, a sink node $v_0$ in $G_0$, an energy threshold $\theta_e$, and a positive integer $L$, the length-constrained and energy-constrained mule TSP (LEM-TSP) is to find a proper partition of $G$ into subnetworks and a traversal path $P$ starting and ending at $v_0$, visiting each subnetwork $G_i \neq G_0$ at one landing port, and connecting all nodes in $G_i$ via a data-collection tree $T$ rooted at the landing port, such that $\max_{v_k \in V} E_T(v_k) \leq \theta_e$ and $|P| \leq L$.

Theorem 1. Length-constrained and energy-constrained mule traveling salesman problem is NP-hard.

The proof of Theorem 1 is given in Appendix 6. We reduce ETSP [7], an NP-hard problem, to a special case of LEM-TSP by regarding each subnetwork as a 'macro' node.

4. HEURISTICS TO ENERGY-CONSTRAINED MULE TRAVELING SALESMAN PROBLEM

In this section, we propose some heuristics to solve EM-TSP. Our solutions consist of three phases: (i) subnetwork partition; (ii) path planning; and (iii) balanced tree construction. The first phase partitions each 'original' subnetwork,‡ if necessary, into multiple smaller subnetworks such that each subnetwork can meet the energy requirement $\theta_e$. The second phase is to plan a traversal path of $m$ to visit each subnetwork at one landing port such that the total path length is as small as possible. We will propose three schemes for phase 2. The third phase is to form a data-collection tree rooted at each landing port such that the minimum remaining energy among sensors is maximized. At the end, we will analyze the complexity of our heuristics and discuss how to extend to multiple mules.

‡We use 'original' subnetworks to distinguish from those after partitioning.

4.1. Subnetwork partition

To meet the energy constraint $\theta_e$, this phase tries to partition each original subnetwork into several subnetworks such that (i) the amount of sensory data relayed by each sensor is bounded and (ii) the number of subnetworks after partitioning is minimized. To achieve these goals, we first try to limit the number of sensors in each subnetwork after partitioning within a bound $s = \lfloor \theta_e / (e \times \lambda \times \delta) \rfloor$ such that each original subnetwork $G_i$ is partitioned into $\alpha_i = \lceil |G_i| / s \rceil$ subnetworks. In the following, we propose a modified k-means algorithm to solve this problem. Note that the typical k-means algorithm [39] cannot properly handle this partitioning problem for two reasons. First, it cannot guarantee the number of nodes in each set. Second, we require that each subnetwork is connected by itself (i.e., without passing through other subnetworks). For example, in Figure 2, although the partitioning is perfect, the right subnetwork needs to rely on the left subnetwork to become connected.
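The size bound and the ideal number of partitions follow directly from the energy model: a node whose subtree holds at most $s$ sensors cannot exceed the energy threshold. A small sketch of that arithmetic is given below; the function and parameter names are hypothetical and chosen only for this example.

```python
import math

def max_subnetwork_size(threshold, unit_energy, rate, round_duration):
    """Largest subnetwork size s that keeps every node within the energy threshold."""
    s = math.floor(threshold / (unit_energy * rate * round_duration))
    if s < 1:
        raise ValueError("threshold is too small for a node to send its own data")
    return s

def ideal_partition_count(num_nodes, s):
    """Smallest number of groups of size at most s that can hold num_nodes nodes."""
    return math.ceil(num_nodes / s)

s = max_subnetwork_size(threshold=50.0, unit_energy=1.0, rate=0.5, round_duration=10.0)
alpha = ideal_partition_count(23, s)
```

With these sample values, one sensor spends 5 energy units per round on its own data, so at most 10 sensors can share a tree and a 23-node subnetwork needs at least 3 partitions.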

Our scheme works as follows. For each original subnetwork $G_i$, we set $\alpha_i$ as its ideal number of partitions and initially partition $G_i$ into $\alpha_i$ groups by the following grouping process. Then, we check whether each group $\mathcal{G}_j$ has $|\mathcal{G}_j| \leq s$. If not, we increase $\alpha_i$ by one and repeat the grouping process until each group $\mathcal{G}_j$ satisfies $|\mathcal{G}_j| \leq s$. Otherwise, each group $\mathcal{G}_j$ is regarded as a new subnetwork, and this phase terminates. Note that, after this phase, we will use $\mathcal{G}_j$ to denote a subnetwork. The details of the grouping process are as follows.

(1) For each $G_i$, we randomly select $\alpha_i$ nodes as its initial seeds. Each seed is considered as a trivial tree.

(2) From the tree rooted at each seed, we try to grow the tree by one hop based on the breadth-first search order. Each node that has not joined any tree yet and is one hop away from at least one tree joins the smallest tree among all candidates.

(3) Step 2 is repeated until each node has joined a tree.

(4) For each tree, generate a new seed by choosing the node nearest to the tree's center-of-gravity.

(5) With these new seeds, re-run steps 2–4, until there is no or very little change in the sizes of these trees (a threshold may be set so that this step terminates).

(6) Check if the $\alpha_i$ trees meet our balancing criterion (i.e., each tree has at most $s$ nodes). If so, the algorithm terminates; otherwise, increase $\alpha_i$ by one and repeat steps 1–5 again.

Figure 2. An example of subnetwork partition using a typical k-means algorithm.
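The grouping process above can be sketched in Python as follows. This is one possible reading under stated assumptions, not the authors' implementation: the subnetwork is given as a connected adjacency map `adj` and a position map `pos`, step 5 stops when the sorted tree sizes no longer change (or after `max_rounds` rounds), and all function names are hypothetical.

```python
import math
import random

def grow_trees(adj, seeds):
    """Steps 2-3: grow all trees hop by hop; a free node joins its smallest adjacent tree."""
    owner = {seed: k for k, seed in enumerate(seeds)}
    sizes = [1] * len(seeds)
    while len(owner) < len(adj):
        snapshot = dict(owner)       # trees as they were before this hop
        frontier = [v for v in adj
                    if v not in snapshot and any(u in snapshot for u in adj[v])]
        if not frontier:
            break                    # only reached if the input is disconnected
        for v in frontier:
            candidates = {snapshot[u] for u in adj[v] if u in snapshot}
            k = min(candidates, key=lambda t: (sizes[t], t))
            owner[v] = k
            sizes[k] += 1
    return owner

def reseed(pos, owner, count):
    """Step 4: the new seed of each tree is the node nearest to its center-of-gravity."""
    seeds = []
    for k in range(count):
        members = [v for v, t in owner.items() if t == k]
        cx = sum(pos[v][0] for v in members) / len(members)
        cy = sum(pos[v][1] for v in members) / len(members)
        seeds.append(min(members, key=lambda v: math.dist(pos[v], (cx, cy))))
    return seeds

def partition(adj, pos, s, max_rounds=20, seed=0):
    """Steps 1-6: return groups of at most s nodes, each connected by itself."""
    rng = random.Random(seed)
    nodes = list(adj)
    alpha = math.ceil(len(nodes) / s)
    while True:
        seeds = rng.sample(nodes, alpha)                 # step 1
        previous = None
        for _ in range(max_rounds):                      # step 5
            owner = grow_trees(adj, seeds)
            sizes = sorted(list(owner.values()).count(k) for k in range(alpha))
            if sizes == previous:
                break
            previous = sizes
            seeds = reseed(pos, owner, alpha)
        if max(sizes) <= s:                              # step 6
            return [[v for v in nodes if owner[v] == k] for k in range(alpha)]
        alpha += 1
```

Because a node can only join a tree through a neighbor that already belongs to it, every returned group is connected by itself, which is the property that a plain k-means clustering cannot guarantee.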

4.2. Path planning

In this phase, we propose three heuristics to find a traversal path for $m$. The input is a set of subnetworks $\mathcal{G} = \{\mathcal{G}_0, \mathcal{G}_1, \ldots, \mathcal{G}_q\}$. The goal is to minimize the path length. The first heuristic greedily chooses the next landing port repeatedly. The second one is derived from the CH approach in [8]. The third one relies on a special convex polygon that may lead to an even shorter path.

4.2.1. Greedy scheme.

This scheme is mainly designed for making comparisons. It works in an iterative manner. Initially, $m$ is located at $v_0$. In each iteration, $m$ chooses the next landing port, denoted by $v_l$, to be visited such that $v_l$ is located in an unvisited subnetwork and is closest to the current location of $m$. This process is repeated until all the subnetworks are visited. Finally, $m$ returns to $v_0$. This finds the traversal path $P$.

Figure 3 gives an example of this scheme. It is to be noted that the traversal path may have intersections, which should be avoided, as will be shown later.
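The greedy scheme can be sketched in a few lines of Python. The input format is an assumption made for this example: `sink` is the coordinate of the sink, and `subnetworks` is a list of node-coordinate lists, one per subnetwork other than the sink's own. The names `greedy_path` and `path_length` are hypothetical.

```python
import math

def greedy_path(sink, subnetworks):
    """Repeatedly move to the closest node of any unvisited subnetwork, then return."""
    path = [sink]
    unvisited = set(range(len(subnetworks)))
    current = sink
    while unvisited:
        # The next landing port is the closest node over all unvisited subnetworks.
        i, port = min(((i, v) for i in unvisited for v in subnetworks[i]),
                      key=lambda item: math.dist(current, item[1]))
        path.append(port)
        unvisited.remove(i)
        current = port
    path.append(sink)
    return path

def path_length(path):
    """Total Euclidean length of a path given as a list of points."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

route = greedy_path((0, 0), [[(1, 0), (2, 0)], [(5, 0), (4, 0)], [(0, -3)]])
```

Each step considers only the current position, which is why the resulting path may cross itself.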

4.2.2. Convex hull-based (CH-based) scheme.

This scheme first selects a delegation node in each subnetwork and then constructs a convex hull (CH) of these delegation nodes to provide a base for constructing a traversal path such that intersections can be avoided. To start with, we extend a property raised in [40] as follows.

Theorem 2. An optimal traversal path to EM-TSP has no intersection with itself.

Proof. Referring to Figure 4, an intersection is defined as two line segments that cross each other in a 2D plane. We assume that path $P = v_0 \to \cdots \to v_i \to v_{i+1} \to \cdots \to v_k \to v_{k+1} \to \cdots \to v_0$ is an optimal solution to EM-TSP such that $\overline{v_i v_{i+1}}$ and $\overline{v_k v_{k+1}}$ intersect each other. However, we can find another traversal path $P' = v_0 \to \cdots \to v_i \to v_k \to \cdots \to v_{i+1} \to v_{k+1} \to \cdots \to v_0$ such that $|P'| < |P|$ by the triangle inequality. This is a contradiction. Therefore, the theorem is proven.

A CH has some good geometric properties that can help find a shorter traversal path. First, a CH is a convex polygon that never intersects with itself. Second, Larson and Odoni [40] proved that the boundary nodes of the CH must appear in the optimal solution to ETSP in the same order as on the CH. These observations motivate us to conduct path construction and path improvement based on a CH. In the following, we modify the algorithm in [8] into one fitting our need.

Figure 3. An example of the greedy scheme.

Figure 4. The proof of Theorem 2.

(1) For each subnetwork $\mathcal{G}_i \neq \mathcal{G}_0$, compute a delegation node $d_i$, where $d_i$ is the node closest to the center-of-gravity of $\mathcal{G}_i$, that is, $\left( \frac{\sum_{(x_k, y_k) \in \mathcal{G}_i} x_k}{|\mathcal{G}_i|}, \frac{\sum_{(x_k, y_k) \in \mathcal{G}_i} y_k}{|\mathcal{G}_i|} \right)$. The landing port of $\mathcal{G}_i$ will be $d_i$.

(2) Let $D = \{v_0\} \cup \{d_i \mid \forall \mathcal{G}_i \neq \mathcal{G}_0\}$. Following Theorem 2, we construct a CH of $D$ (such algorithms can be found in [33]). Let the CH be our initial path $P$.

(3) We iteratively add nodes in $D - P$ into the traversal path. Specifically, in each iteration, for each node $d_i \in D - P$, we try to insert $d_i$ into each link $(x, y)$ of $P$. The insertion cost of putting $d_i$ between $x$ and $y$ is $cost(x, d_i, y) = d(x, d_i) + d(d_i, y) - d(x, y)$. Let $d_{\min}$ be the node whose insertion into some link $(x, y)$ incurs the smallest cost $cost(x, d_{\min}, y)$. Then, we update $P$ by inserting $d_{\min}$ between $x$ and $y$ in $P$. We repeat this process until all delegation nodes are included in $P$. This finds the final path $P$.

Figure 5 gives an example of the CH-based scheme. After step 1, $D = \{v_0, d_1, \ldots, d_7\}$. After step 2, the CH of $D$ is taken as the initial path $P$. Step 3 then inserts $d_3$ into link $(d_2, d_4)$ and then inserts $d_5$ into link $(d_3, d_4)$ to form the final $P$.

Figure 5. An example of the convex hull-based scheme.
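The three steps of the CH-based scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of [8] itself: nodes are distinct (x, y) tuples, the hull is computed with Andrew's monotone chain method (any hull algorithm would do), and the names `convex_hull`, `delegation_node`, and `ch_based_path` are hypothetical.

```python
import math

def convex_hull(points):
    """Hull vertices in counterclockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def delegation_node(subnetwork):
    """Step 1: the node closest to the subnetwork's center-of-gravity."""
    cx = sum(x for x, _ in subnetwork) / len(subnetwork)
    cy = sum(y for _, y in subnetwork) / len(subnetwork)
    return min(subnetwork, key=lambda v: math.dist(v, (cx, cy)))

def ch_based_path(sink, subnetworks):
    """Steps 2-3: start from the hull of the delegates, then do cheapest insertion."""
    candidates = [sink] + [delegation_node(g) for g in subnetworks]
    tour = convex_hull(candidates)                   # cyclic initial path
    remaining = [d for d in candidates if d not in tour]
    while remaining:
        best = None
        for d in remaining:
            for k in range(len(tour)):
                x, y = tour[k], tour[(k + 1) % len(tour)]
                cost = math.dist(x, d) + math.dist(d, y) - math.dist(x, y)
                if best is None or cost < best[0]:
                    best = (cost, d, k)
        _, d, k = best
        tour.insert(k + 1, d)
        remaining.remove(d)
    start = tour.index(sink)
    return tour[start:] + tour[:start] + [sink]      # rotate so the path starts at the sink

route = ch_based_path((0, 0), [[(4, 0)], [(4, 4)], [(0, 4)], [(2, 1)]])
```

In the sample call, the four corner nodes form the hull and the interior delegate (2, 1) is inserted into the link where it adds the least length.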

4.2.3. Convex container-based (CC-based) scheme.

We make two observations on the geometric properties needed to form a small convex polygon. First, the landing ports of the subnetworks are usually closer to the inner side of the whole field. Figure 6 shows an example of an SS-WSN after conducting the subnetwork partition phase. Using outer nodes A, B, C, and D as landing ports will incur a longer path than using inner nodes A′, B′, C′, and D′. Second, although inner nodes are generally preferred, sometimes skipping some inner nodes may reduce the path length even further. In the above example, if we replace D′ by D″, we can find an even better path.

The aforementioned observations motivate us to define a special kind of convex polygon, termed convex container (CC), that allows a subnetwork to either contribute a very inner node to the convex polygon or simply have some node(s) inside the convex polygon. The formal definition is as follows.

Figure 6. Examples of selecting inner nodes as landing ports.

Definition 3. Given a set of node-disjoint subnetworks $\mathcal{G} = \{\mathcal{G}_0, \mathcal{G}_1, \ldots, \mathcal{G}_q\}$, $q \geq 3$, a CC of $\mathcal{G}$ is a convex polygon $P = x_1 \to x_2 \to \cdots \to x_r$ composed of $r$ nodes, $r \leq q$, such that for each $\mathcal{G}_i$, $i = 1, \ldots, q$, either of the following conditions is satisfied:

• $\mathcal{G}_i$ has one node belonging to $P$, and this node is the only node of $\mathcal{G}_i$ contained inside $P$.

• $\mathcal{G}_i$ has no node belonging to $P$ but has at least one node contained inside $P$.

Figure 7 shows some examples of Definition 3, where there are seven subnetworks $\mathcal{G}_0, \mathcal{G}_1, \ldots, \mathcal{G}_6$. The one in Figure 7(a) is not a CC because $\mathcal{G}_6$ is outside the polygon. The one in Figure 7(b) is not a CC because $\mathcal{G}_1$ has some extra nodes inside the polygon. The one in Figure 7(c) is a CC. Clearly, CCs are not unique. (Note that to take into account the case of multiple nodes forming a straight line, a node on $P$ is not considered inside $P$.)
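One way to read Definition 3 operationally is the checker sketched below. It rests on assumptions that the text does not spell out: the polygon is given as its vertices in counterclockwise order and is already convex, "belonging to $P$" means being a vertex of $P$, and, following the note above, a node on the boundary is not counted as inside. The function names are hypothetical.

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive if b is left of o->a)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def strictly_inside(polygon, p):
    """True if p lies strictly inside a convex polygon given in counterclockwise order."""
    n = len(polygon)
    return all(cross(polygon[k], polygon[(k + 1) % n], p) > 0 for k in range(n))

def is_convex_container(polygon, subnetworks):
    """Check the two conditions of the CC definition for every subnetwork."""
    vertices = set(polygon)
    for nodes in subnetworks:
        on_polygon = [v for v in nodes if v in vertices]
        inside = [v for v in nodes
                  if v not in vertices and strictly_inside(polygon, v)]
        contributes_vertex = len(on_polygon) == 1 and not inside
        only_inside = len(on_polygon) == 0 and len(inside) >= 1
        if not (contributes_vertex or only_inside):
            return False
    return True

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
nets = [[(0, 0), (-1, -1)], [(4, 0), (5, 0)], [(4, 4)], [(0, 4)], [(2, 2), (6, 6)]]
ok = is_convex_container(square, nets)
```

The two failure cases of Figure 7 correspond to a subnetwork lying entirely outside the polygon and to a subnetwork that owns a vertex but also has another node strictly inside.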

Our CC-based scheme consists of four steps. Step 1 tries to choose a 'min–max' initial node to start our construction. Step 2 is the main step to form a CC. Steps 3 and 4 add nodes of those unvisited subnetworks into the container to form a traversal path. Note that because $v_0$ must be included in the final path $P$, we will imagine that $\mathcal{G}_0$ contains only one node $v_0$.

(1) Let h0D v0, and let hi be the node in Gi,

i D 1; : : : ; q, which has the largest y-coordinate. We then let hmin be the node in fh0; h1; : : : ; hqg

that has the smallest y-coordinate. (In Figure 8(a), hminD h4.)

(2) Initially, let $P$ contain only $h_{\min}$, and consider all subnetworks unvisited except the one containing $h_{\min}$. We then enter an iterative sweeping process that adds nodes into $P$ to form a CC. Intuitively, this step simulates tying the shortest string around all subnetworks in the counterclockwise direction such that $P$ is a CC. We imagine that there is an arrow string $S$ of infinite length pointing at degree 0 with $h_{\min}$ as the origin. The following steps repeatedly rotate $S$ in the counterclockwise direction until a CC is constructed.

(a) Rotate $S$ counterclockwise from its current direction, with the last node in $P$ as its center. Stop rotating when either of the following conditions is encountered: (i) $h_{\min}$ is on string $S$, or (ii) the first unvisited subnetwork (say $G_i$) appears such that all nodes of $G_i$ have been swept by string $S$ and the last node/nodes swept by $S$ is/are now on $S$. Note that condition (i) may happen when multiple nodes form a line. (Figure 8(b) shows an example of the $i$-th iteration during the sweeping process.)

(b) If $h_{\min}$ is on $S$, a CC $P$ is found, and we exit this loop. Otherwise, let $v_j$ be the node of $G_i$ that is on $S$ (if there are multiple such nodes, the one closest to the center of $S$ is selected). We then append $v_j$ to path $P$, mark $G_i$ as visited, and go back to step (a). (In Figure 8(a), in the first iteration, $G_2$ is the first subnetwork whose nodes are all swept by $S$. Because the last node being swept is $v_A$, $v_A$ is appended to $P$.)

Figure 7. The polygons in (a) and (b) are not convex containers, whereas that in (c) is.

Figure 8. The CC-based scheme: (a) an example of the CC-based scheme, and (b) the $i$-th iteration of the sweeping process.

(3) With the CC $P$, we divide the remaining unvisited subnetworks into two sets: $\hat{G}$ contains those subnetworks that are ‘crossed’ by $P$, and $\tilde{G}$ contains those that are completely inside $P$. (For example, in Figure 8(a), $G_3$ is ‘crossed’ by $P$, whereas $G_5$ is inside $P$.) We first deal with set $\hat{G}$. We iteratively choose one node in a subnetwork in $\hat{G}$ and insert it into $P$. Specifically, in each iteration, for each node $v_j$ in each $G_i \in \hat{G}$, we try to insert $v_j$ into each link $(x, y)$ of $P$. The cost of inserting $v_j$ into $(x, y)$ is $\mathit{cost}(x, v_j, y) = d(x, v_j) + d(v_j, y) - d(x, y)$. Let $v_{\min}$ be the node among all candidates in all $G_i \in \hat{G}$ that incurs the least cost. Then, we insert $v_{\min}$ into $P$ and remove the subnetwork containing $v_{\min}$ from $\hat{G}$. This process is repeated until $\hat{G} = \emptyset$. (For example, in Figure 8(a), $v_D$ of $G_3$ is inserted into the link $(h_{\min}, v_A)$.)

(4) Next, we deal with set $\tilde{G}$. We repeat the same process as in step 3 to insert more nodes into $P$ until $\tilde{G} = \emptyset$. The final traversal path is $P$. (For example, in Figure 8(a), $v_E$ of $G_5$ is inserted into $(v_C, h_{\min})$.)
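The insertion procedure of steps 3 and 4 can be illustrated with a minimal Python sketch. It assumes that nodes are plain (x, y) tuples, that the path is a closed tour stored as a list, and that each pending subnetwork is a non-empty list of nodes; the function names `insertion_cost` and `insert_remaining` are hypothetical and are not taken from the paper's implementation.

```python
import math

def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def insertion_cost(x, v, y):
    """cost(x, v, y) = d(x, v) + d(v, y) - d(x, y)."""
    return dist(x, v) + dist(v, y) - dist(x, y)

def insert_remaining(path, subnetworks):
    """Insert one node of every pending subnetwork into the closed tour.

    path is read cyclically (the last node links back to the first).
    In each iteration the globally cheapest (node, link) pair is applied
    and the subnetwork owning that node is removed from the pending set.
    """
    path = list(path)
    pending = [list(g) for g in subnetworks]
    while pending:
        best = None  # (cost, subnetwork index, node, link index)
        for gi, nodes in enumerate(pending):
            for v in nodes:
                for k in range(len(path)):
                    x, y = path[k], path[(k + 1) % len(path)]
                    c = insertion_cost(x, v, y)
                    if best is None or c < best[0]:
                        best = (c, gi, v, k)
        _, gi, v, k = best
        path.insert(k + 1, v)  # place v between path[k] and its successor
        pending.pop(gi)
    return path
```

Calling `insert_remaining` first with the crossed subnetworks and then with the inner ones mirrors the order of steps 3 and 4.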

4.2.4. Some local optimizations.

Finally, we present some local optimization techniques to further reduce the length of $P$. Let $P = v_0 \to v_{p_1} \to \cdots \to v_0$ be a path obtained by any one of the above heuristics. We can examine any two consecutive links $v_{p_i} \to v_{p_{i+1}} \to v_{p_{i+2}}$, try to find another node $v'_{p_{i+1}}$ in the same subnetwork as $v_{p_{i+1}}$ such that $d(v_{p_i}, v'_{p_{i+1}}) + d(v'_{p_{i+1}}, v_{p_{i+2}}) < d(v_{p_i}, v_{p_{i+1}}) + d(v_{p_{i+1}}, v_{p_{i+2}})$, and replace $v_{p_{i+1}}$ by $v'_{p_{i+1}}$. This process can be repeated for all consecutive links of $P$ until no further improvement is possible. This method can be extended to a three-link look-ahead scheme, too. It is applicable to the iterative process of the aforementioned three heuristics.
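A minimal sketch of the two-link look-ahead replacement is given below. It assumes that the tour is a list of (x, y) points with the sink at index 0 and that `candidates[i]` lists all nodes of the subnetwork visited at position `i`; both the data layout and the name `local_optimize` are illustrative assumptions.

```python
import math

def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def local_optimize(path, candidates, eps=1e-12):
    """Two-link look-ahead: swap a landing port for a better node of its subnetwork.

    path[0] is the sink and is never replaced; the tour closes back to path[0].
    candidates[i] must contain path[i] and all alternative nodes for position i.
    """
    path = list(path)
    n = len(path)
    improved = True
    while improved:
        improved = False
        for i in range(1, n):
            prev, nxt = path[i - 1], path[(i + 1) % n]
            current = dist(prev, path[i]) + dist(path[i], nxt)
            best = min(candidates[i], key=lambda c: dist(prev, c) + dist(c, nxt))
            # Replace only on a strict improvement so that the loop terminates.
            if dist(prev, best) + dist(best, nxt) < current - eps:
                path[i] = best
                improved = True
    return path
```

Because every replacement strictly shortens the tour and the candidate sets are finite, the loop stops once no two-link window can be improved.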

4.3. Balanced tree construction

When the mule $m$ arrives at subnetwork $G_i$, it enters this phase to form a data-collection tree rooted at itself to conduct intra-subnetwork data gathering. Specifically, once $m$ lands at the landing port of $G_i$, $m$ switches from the movement state to the data gathering state, requests the sensors within its communication range to be gateway nodes, and instructs all sensors in $G_i$ to report their sensory data through these gateway nodes to $m$. Our goal is to balance the loads among these gateway nodes such that the minimum remaining energy among sensors in $G_i$ is maximized. For this purpose, we extend the centralized algorithm proposed in [41] to a distributed one. We use the remaining energies of nodes to measure the degree of balance of a tree. The protocol has two stages: a tree construction stage and a balancing stage. The first stage forms an initial tree $T$ in a top–down manner based on the remaining energies of nodes. The second stage adjusts $T$ in a top–down manner according to nodes’ balancing degrees.

4.3.1. Tree construction stage.

In this stage, an initial tree is spanned from $m$ to all nodes of $G_i$. A Contact($G_i$) message is broadcast by $m$ to its direct neighbors.§

(1) When a node $v_k$ receives a Contact($G_i$) message, it becomes a gateway node and immediately replies with an Association($v_k$) message to $m$ to become $m$’s child. Then, $v_k$ broadcasts a Form_Tree($e(v_k)$) message to span its subtree, where $e(v_k)$ is $v_k$’s current remaining energy.

(2) When a non-gateway node $v_j$ receives a Form_Tree($\cdot$) message for the first time, it sets a timer $w$. After $w$ expires, from all Form_Tree($\cdot$) messages it has received, $v_j$ chooses the on-tree node $v_p$ with the maximum $e(v_p)$ as its parent by sending an Association($v_j$) message to $v_p$. In case of a tie, the one with the least number of neighbors is chosen. Then, $v_j$ sets itself as an on-tree node and broadcasts a Form_Tree($e(v_j)$) message.

(3) After $v_j$ becomes an on-tree node, it sends a Join($v_j$) message to $m$. After $m$ has received Join($\cdot$) messages from all nodes in $G_i$, tree $T$ is formed, and $m$ broadcasts an Adjustment($T$) message to instruct all nodes in $G_i$ to enter the next stage.

§ In practice, any localization algorithm suffers from some degree of localization error, typically ranging between $0.2r_c$ and $r_c$ [42]. This may lead to the mule missing the landing port or connecting to an incorrect landing port. The mule may need to circle around to locate the landing port or use its neighboring nodes as the landing port. Our simulations show that the protocol still works correctly in such situations but may suffer a slight performance degradation in the minimum remaining energy for the CH-based scheme. Clearly, if we allow the mule to pick a landing port different from the originally planned one (which is at the center of gravity), a larger localization error may cause a less balanced data-collection tree.
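The parent-selection rule of the tree construction stage can be sketched as a centralized, round-based simulation. This is a simplification under stated assumptions: message exchange and the timer are replaced by synchronous rounds, `adj` is a hypothetical adjacency map of the subnetwork (excluding the mule), and `build_tree` is an illustrative name rather than part of the protocol.

```python
def build_tree(adj, energy, gateways, mule="m"):
    """Return a parent map for the data-collection tree rooted at the mule.

    adj:      dict node -> list of neighboring nodes (mule excluded)
    energy:   dict node -> current remaining energy e(v)
    gateways: nodes within the mule's communication range
    """
    parent = {g: mule for g in gateways}  # gateways associate with the mule
    while True:
        joining = {}
        for v in adj:
            if v in parent:
                continue
            # On-tree neighbors whose Form_Tree message v would have heard.
            heard = [u for u in adj[v] if u in parent]
            if heard:
                # Maximum remaining energy first; ties go to fewer neighbors.
                joining[v] = max(heard, key=lambda u: (energy[u], -len(adj[u])))
        if not joining:
            break
        parent.update(joining)  # all nodes of this round join together
    return parent
```

Nodes that never appear in the returned map are unreachable from the gateways, which in the protocol would correspond to the mule not receiving their Join messages.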

4.3.2. Balancing stage.

In this stage, the adjustment is conducted in a top–down manner. Given $T$, a node $v_k$ can compute its remaining energy after executing one round of data collection as $r(v_k) = e(v_k) - E_T(v_k)$. To evaluate the balancing degree of $T$ with respect to $v_k$, we define $B(T, v_k) = \max_{v_j \in C(v_k)} \{r(v_j)\} - \min_{v_j \in C(v_k)} \{r(v_j)\}$, where $C(v_k)$ is the set of $v_k$’s children in $T$. Note that a smaller $B(\cdot)$ means better balance in terms of remaining energy. The adjustment works as follows:

(1) When $v_k$ receives an Adjustment($T$) message from its parent, it computes its current $B(T, v_k)$. Then, let $v_{\min}$ (resp., $v_{\max}$) be the child of $v_k$ that has the smallest (resp., largest) remaining energy $r(\cdot)$. For each subtree of $v_{\min}$, $v_k$ tries to move it to $v_{\max}$ (if the connectivity exists) and computes the new balancing degree $B(T', v_k)$, where $T'$ is the new tree. If some movement yields a balancing degree smaller than the current one, $v_k$ instructs $v_{\min}$ and $v_{\max}$ to perform the movement that yields the smallest balancing degree.

(2) If $v_k$ makes any change to $T$ in step 1, it goes back to step 1 and tries another change. Otherwise, it broadcasts an Adjustment($T$) message to its children and replies with a Complete($v_k$) message to $m$.

(3) After $m$ has collected Complete($\cdot$) messages from all nodes in $G_i$, this phase completes.
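The balancing degree and one adjustment attempt can be sketched as follows. The energy model is an assumption made only for illustration: $E_T(v)$ is taken to be a unit cost multiplied by the size of the subtree rooted at $v$ (a node forwards its own data and that of all its descendants). The names `balancing_degree` and `best_move` and the `children`/`adj` dictionaries are likewise hypothetical.

```python
def subtree_size(children, v):
    """Number of nodes in the subtree rooted at v (v included)."""
    return 1 + sum(subtree_size(children, c) for c in children.get(v, []))

def remaining(children, energy, v, unit_cost):
    """r(v) = e(v) - E_T(v), with the assumed model E_T(v) = unit_cost * subtree size."""
    return energy[v] - unit_cost * subtree_size(children, v)

def balancing_degree(children, energy, vk, unit_cost=1.0):
    """B(T, vk): largest minus smallest remaining energy among vk's children."""
    rs = [remaining(children, energy, c, unit_cost) for c in children.get(vk, [])]
    return max(rs) - min(rs) if rs else 0.0

def best_move(children, energy, adj, vk, unit_cost=1.0):
    """Try moving each subtree of v_min under v_max; return the best improving move."""
    kids = children.get(vk, [])
    if len(kids) < 2:
        return None
    r = {c: remaining(children, energy, c, unit_cost) for c in kids}
    vmin, vmax = min(kids, key=r.get), max(kids, key=r.get)
    best_b, best = balancing_degree(children, energy, vk, unit_cost), None
    for s in children.get(vmin, []):
        if vmax not in adj[s]:  # the move needs a link between s and v_max
            continue
        trial = {p: list(cs) for p, cs in children.items()}
        trial[vmin].remove(s)
        trial.setdefault(vmax, []).append(s)
        b = balancing_degree(trial, energy, vk, unit_cost)
        if b < best_b:
            best_b, best = b, (s, vmin, vmax)
    return best  # (subtree root, old parent, new parent) or None
```

Repeating `best_move` at $v_k$ until it returns `None`, and then descending to the children, corresponds to steps 1 and 2 of the adjustment.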

4.4. Complexity analysis

In this section, we analyze the time complexity of the proposed heuristics. In Section 5, we will further investigate the performance issue through simulations. Let $N = |V|$ be the number of nodes, $E$ be the number of communication links, $n$ be the number of original subnetworks, and $n'$ be the number of subnetworks after partitioning. Normally, $N \gg n' \ge n$.

For the subnetwork partition phase, step 1 takes $O(n + n')$ time to compute the initial seeds. The time complexity of steps 2–3 is the same as that of a breadth-first search, that is, $O(N + E)$. Step 4 takes $O(N)$ time to scan all nodes for updating seeds. Suppose that steps 2–4 are repeated $I$ times. Normally, $I \ll N$ in practice because our subnetwork partition phase is extended from the k-means algorithm [43]. Finally, step 6 will repeat the aforementioned process up to $O(n')$ times. Therefore, the total time complexity of the subnetwork partition phase is $O((n + n') + ((N + E) + N) I n') = O((N + E) I n')$.


For the path-planning phase, there are three schemes. For the greedy scheme, each iteration takes $O(N)$ time to find the next landing port, and there are $n'$ iterations. Thus, its time complexity is $O(N n')$. For the CH-based scheme, step 1 takes $O(N)$ time to scan all nodes for computing the delegation nodes. Finding a CH in step 2 takes $O(n' r)$ time by Jarvis’ march [33], where $r \le n'$ is the number of nodes along the CH. In step 3, at most $O(n')$ unvisited delegation nodes are checked in each iteration, and there are $O(n')$ iterations. In addition, all links in $P$ are checked, giving a cost of $O(n'^3)$. Thus, the total complexity is $O(N + n'^3)$. For the CC-based scheme, step 1 costs $O(N)$. In step 2, finding the next node in the CC takes $O(N)$ time, and there are $O(r)$ iterations. Thus, step 2 takes $O(N r)$ time. In steps 3–4, each insertion tries $O(N)$ nodes. Thus, steps 3–4 take $O(N n'^2)$ time. The total complexity is $O(N n'^2)$.

For the balanced tree construction phase, we first analyze the computational cost and the message complexity of each node. For the computational cost, in the tree construction stage, each node $v_k$ takes $O(N)$ time to check the received Form_Tree($\cdot$) messages for selecting its parent. In the balancing stage, for a pair of $v_{\min}$ and $v_{\max}$, for each possible movement $T'$, each node $v_k$ takes $O(N)$ time to compute a new $B(T', v_k)$ value. Because there are $O(N^2)$ combinations of movements and $O(N^2)$ possible pairs of $v_{\min}$ and $v_{\max}$, the balancing stage takes $O(N^5)$ time. Overall, the balanced tree construction phase takes $O(N^5)$ time. As to the message complexity, the tree construction stage is similar to forming an MST. Thus, the message complexity of this stage is $O(N)$. In the balancing stage, each node $v_k$ sends at most $O(N^2)$ Adjustment($T$) messages to its $v_{\min}$ and $v_{\max}$, and there are $O(N^2)$ pairs of $v_{\min}$ and $v_{\max}$. Therefore, the balancing stage incurs $O(N^4)$ message complexity. Overall, the balanced tree construction phase incurs $O(N^4)$ message complexity.

4.5. Extensions to multiple mobile mules

The aforementioned solutions have assumed that there is only one mule. In the following, we show how to extend EM-TSP to multiple mules.

Given a network $G$, a sink node $v_0$ in $G_0$, an energy threshold $\bar{e}$, and $K$ mobile mules located at $v_0$, the min–max EM-TSP is to find a proper partition of $G$ into subnetworks and $K$ traversal paths $P = \{P_1, P_2, \ldots, P_K\}$, each starting and ending at $v_0$, visiting each subnetwork $G_i \ne G_0$ in one landing port by at least one path, and connecting all nodes in $G_i$ via a data-collection tree $T$ rooted at the landing port, such that $\max_{v_k \in V} E_T(v_k) \le \bar{e}$ and the maximum of these path lengths is minimized.

Our three-phase heuristics can be directly applied to min–max EM-TSP, except that the path-planning phase needs to be extended to $K$ traversal paths. In the following, we propose two solutions. The first one groups subnetworks into $K$ clusters by applying the traditional k-means algorithm [39] before planning paths. First, the centers of gravity of all subnetworks are identified. Then, the k-means algorithm is applied to group all subnetworks, except $G_0$, into $K$ clusters. Then, for each cluster of subnetworks, any one of our earlier path-planning schemes is applied. The k-means algorithm has more sense of the geographic vicinity of subnetworks but little sense of load balance. The second heuristic iteratively merges clusters until only $K$ clusters remain. Initially, each subnetwork is regarded as a cluster. For each cluster, we merge $G_0$ into this cluster and compute a tentative traversal path by any single-mule scheme. If there are sufficient mules (i.e., the number of clusters is less than $K$), the algorithm stops. Otherwise, we merge the cluster with the shortest traversal path with the cluster nearest to it. The distance between two clusters is defined as the minimum distance between any two nodes, one from each cluster. If there are $K$ clusters, then the algorithm terminates. Otherwise, we repeat the aforementioned process to merge more clusters.
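The merging-based heuristic can be sketched in Python as follows. The sketch makes two labeled assumptions: the tentative traversal path is computed by a simple nearest-neighbor tour (a stand-in for "any single-mule scheme", not one of the schemes proposed above), and subnetworks are non-empty lists of (x, y) points. The function names are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tour_length(sink, cluster):
    """Stand-in single-mule scheme: nearest-neighbor tour, one node per subnetwork."""
    pos, total, todo = sink, 0.0, list(cluster)
    while todo:
        d, gi, node = min((dist(pos, v), i, v)
                          for i, g in enumerate(todo) for v in g)
        total, pos = total + d, node
        todo.pop(gi)
    return total + dist(pos, sink)

def cluster_distance(c1, c2):
    """Minimum distance between any two nodes, one from each cluster."""
    return min(dist(u, v) for g1 in c1 for u in g1 for g2 in c2 for v in g2)

def merge_clusters(sink, subnetworks, K):
    """Merge the shortest-path cluster with its nearest cluster until K remain."""
    clusters = [[g] for g in subnetworks]  # initially one subnetwork per cluster
    while len(clusters) > K:
        i = min(range(len(clusters)),
                key=lambda a: tour_length(sink, clusters[a]))
        j = min((b for b in range(len(clusters)) if b != i),
                key=lambda b: cluster_distance(clusters[i], clusters[b]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```

Merging the cluster with the shortest tentative path first tends to keep path lengths comparable across mules, which is the load-balance concern noted above for the k-means-based alternative.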

5. SIMULATION RESULTS

A simulator has been implemented in Java. To simulate an SS-WSN, we randomly deployed $N$ sensor nodes in an $S \times S$ m$^2$ field. The field is divided into grids, each of size $s_g \times s_g$ m$^2$. In order to form a spatially separated network in a systematic way, we imposed a fail probability $P_f$ on each grid. If a grid is determined to fail, all sensor nodes inside it fail. This partitions the network into multiple subnetworks when $P_f$ is sufficiently large. The sink node $v_0$ is randomly selected. Following the energy model of Mica2 [44], we set $e = 100$ mJ. Considering long-term monitoring applications [45], we set the data arrival rate to 10 packets/h, with 10 kb per packet. Table I summarizes all default parameters used in our simulations. We compared our EM-TSP solutions against a modified Christofides heuristic, in which the path-planning phase randomly chooses a landing port for each subnetwork after partitioning and then the Christofides heuristic is applied to find a traversal path of the mule. All simulation results are averages over 100 runs.

Table I. Definitions of parameters and their default values used in our simulations.

Parameter      Meaning                                     Default value
$N$            Number of sensor nodes                      1000
$S$            Area size                                   500 m
$s_g$          Grid size                                   50 m
$P_f$          Fail probability of a grid                  0.5
$r_c$          Transmission range of a node                20 m
$\bar{e}$      Energy consumption threshold                $3 \times 10^6$ mJ
$e$            Energy cost to transmit one unit of data    100 mJ
               Data arrival rate                           10 packets/h
$\delta$       Maximum duration between rounds             100 h
$E_{total}$    Initial energy of a node                    $10^7$ mJ
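The grid-failure deployment described above can be reproduced with a short Python sketch (the simulator in the paper is written in Java; Python is used here only for brevity). The function names `deploy` and `count_subnetworks` are hypothetical, nodes are assumed to be uniformly distributed, and two nodes are assumed to be linked whenever their distance is at most $r_c$.

```python
import math
import random

def deploy(n, side, grid, p_fail, rng):
    """Drop n nodes uniformly in a side x side field and remove failed grid cells."""
    cells = math.ceil(side / grid)
    failed = {(i, j) for i in range(cells) for j in range(cells)
              if rng.random() < p_fail}
    alive = []
    for _ in range(n):
        x, y = rng.uniform(0, side), rng.uniform(0, side)
        cell = (min(int(x // grid), cells - 1), min(int(y // grid), cells - 1))
        if cell not in failed:  # all nodes inside a failed grid fail
            alive.append((x, y))
    return alive

def count_subnetworks(nodes, rc):
    """Count connected components when nodes within distance rc are linked."""
    unseen = set(range(len(nodes)))
    count = 0
    while unseen:
        count += 1
        stack = [unseen.pop()]
        while stack:
            u = stack.pop()
            near = [v for v in unseen
                    if math.hypot(nodes[u][0] - nodes[v][0],
                                  nodes[u][1] - nodes[v][1]) <= rc]
            unseen.difference_update(near)
            stack.extend(near)
    return count

if __name__ == "__main__":
    rng = random.Random(1)
    alive = deploy(1000, 500, 50, 0.5, rng)  # default values of Table I
    print(len(alive), "surviving nodes,", count_subnetworks(alive, 20), "subnetworks")
```

The quadratic neighbor search is adequate for the default $N = 1000$; a spatial index would be preferable for much larger deployments.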

We consider two performance metrics: (i) the path length of a mule and (ii) the minimum remaining energy among sensors. In the following, we first vary the number of sensor nodes $N$, the area size $S$, the fail probability $P_f$, and the transmission range $r_c$ to investigate the performance when there is only one mobile mule. In addition, we simulate a more complicated irregular radio propagation model. Next, we compare our heuristics against an exhaustive search by varying the number of subnetworks before partitioning. Finally, we study the performance when multiple mules coexist.

5.1. Effect of $N$

First, we investigate the effect of the number of nodes. Figure 9(a) shows its impact on path length. The CC-based scheme performs the best, which implies that its local optimization technique can efficiently shorten the path length. When $N$ falls in the range 200–400, the path length is relatively longer. This is because there are too many (original) subnetworks, as reflected by Figure 9(b). As $N$ increases ($N = 400$–$1400$), the path length decreases gradually because there are fewer subnetworks to be visited. Recall that the ideal number of partitions is $\sum_{i=0}^{n} \lceil |G_i| / s \rceil$. Figure 9(b) shows that our schemes do not incur too many subnetworks after partitioning. Figure 10 shows that the CH-based scheme performs the best in terms of the minimum remaining energy of nodes. This is because the CH-based scheme selects the node close to the geometrical centroid of each subnetwork as the landing port. Thus, the average relaying hop counts from sensors to the gateway nodes are reduced. As can be seen, although some of our path-planning policies choose landing ports near the boundaries of subnetworks, the resulting minimum remaining energy among sensors is still slightly less than that of Christofides. Additionally, we can see that the greedy and CC-based schemes have similar performance in terms of the minimum remaining energy among sensor nodes. This is because these schemes choose nodes around the borders of subnetworks as landing ports. Thus, the average amount of sensory data relayed by a sensor becomes higher than that in the CH-based scheme. When $N$ is relatively large, the minimum remaining energy is relatively small and eventually becomes flat. This is because the energy consumption of each sensor is bounded by $\bar{e}$ even when $N$ becomes larger.

5.2. Effect of $S$

Next, we study the effect of the area size by varying $S$ from 400 to 800. Figure 11(a) shows that the mule’s traversal path length increases proportionally with $S$. There are two reasons: (i) there are more and more small subnetworks as the network becomes sparser, as reflected by Figure 11(b), and (ii) the distances between subnetworks become larger. Even when $S$ is large (which means that subnetworks have fewer nodes), our solutions still outperform Christofides. This is evidence that the selection of landing ports is important and that existing TSP heuristics (such as Christofides) cannot be trivially applied to EM-TSP to achieve good performance. In Figure 12, the minimum remaining energy of sensor nodes increases as $S$ increases. This is because a larger $S$ causes each subnetwork to include fewer sensor nodes, which therefore spend less energy on relaying.

5.3. Effect of $P_f$

We now investigate the effect of the fail probability by varying $P_f$ from 0.9 to 0.1. As shown in Figure 13(a), the CC-based scheme still performs better than the other schemes in terms of path length. The gap between the CC-based scheme and the others actually enlarges as $P_f$ decreases. This is because a smaller $P_f$ leads to more available sensors and, thus, larger-scale (original) subnetworks and more (logical) subnetworks. This also indicates that the CC-based scheme is more important when we deploy larger-scale subnetworks. Note that in Figure 13(b), it is reasonable to see that the path length is related to the number of subnetworks after partition (rather than that before partition). From Figure 13(b), we see that the gaps between the number of subnetworks before partition and the ideal number of subnetworks are quite large when $P_f$ is relatively small. This is due to two reasons. First, it is possible to have isolated nodes when nodes are randomly deployed. Second, the connectivity of nodes near the boundary of the field is not as strong as that of inner nodes. In Figure 14, the minimum remaining energy among sensors decreases as $P_f$ decreases because sensors spend more energy on relaying.

[Figure 9: (a) path length (m) versus number of nodes for the 2-link look-ahead CC-based, CC-based, CH-based, greedy, and Christofides schemes; (b) number of subnetworks versus number of nodes before partition, after partition, and for the ideal partition.]

Figure 10. Effect of $N = 200$–$1400$ on the minimum remaining energy of sensor nodes.

5.4. Effect of $r_c$

We vary $r_c$ from 20 to 50 to study the effect of nodes’ transmission range. Figure 15(a) shows that the path length decreases as $r_c$ increases because sensor nodes connect to each other more easily. Figure 15(b) also gives evidence that most subnetworks have many sensor nodes and need to be partitioned when a larger $r_c$ is considered. Figure 16 shows that the minimum remaining energy of sensors is bounded by $\bar{e}$ even if a larger $r_c$ causes each subnetwork to include more sensors. From the aforementioned simulation results, we conclude that our schemes can efficiently solve the bi-objective EM-TSP.

Figure 12. Effect of $S = 400$–$800$ on the minimum remaining energy of sensor nodes.

5.5. Effect of irregular radio propagation

To understand the impact of more complicated radio propagation, we adopt the degree of irregularity (DOI) model in [42,46], which allows us to vary the radio range in different directions. For example, when DOI = 0.1, the radio range is randomly chosen from $[0.9r_c, 1.1r_c]$ in each direction. We vary DOI from 0.1 to 0.5 in the following simulations. Figure 17(a) shows that the path length of the mule decreases as DOI increases. This is because a sensor node may discover more neighbor nodes, leading to more opportunities to reduce the number of subnetworks. Figure 17(b) provides evidence for this argument. Figure 18 shows that the irregularity of radio propagation does not have a serious impact on the energy consumption of sensors because it only changes the topology slightly.

[Figure 11: (a) path length (m) versus area size for the 2-link look-ahead CC-based, CC-based, CH-based, greedy, and Christofides schemes; (b) number of subnetworks versus area size before partition, after partition, and for the ideal partition.]

Figure 13. Effect of $P_f = 0.9$–$0.1$ on (a) path length and (b) number of subnetworks.

Figure 14. Effect of $P_f = 0.9$–$0.1$ on the minimum remaining energy of sensor nodes.
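The DOI example of Section 5.5 (a range drawn from $[(1 - \mathrm{DOI}) r_c, (1 + \mathrm{DOI}) r_c]$ in each direction) can be sketched as below. This is a simplified reading of that description: ranges are drawn independently per one-degree sector, whereas the DOI models in [42,46] may constrain the variation between adjacent directions. The names `directional_ranges` and `can_hear` are hypothetical.

```python
import math
import random

def directional_ranges(rc, doi, rng, directions=360):
    """Draw one radio range per direction from [(1 - doi) * rc, (1 + doi) * rc]."""
    return [rng.uniform((1 - doi) * rc, (1 + doi) * rc) for _ in range(directions)]

def can_hear(sender, receiver, ranges):
    """True if the receiver lies within the sender's range in its direction."""
    dx, dy = receiver[0] - sender[0], receiver[1] - sender[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    sector = int(angle * len(ranges) / 360.0) % len(ranges)
    return math.hypot(dx, dy) <= ranges[sector]
```

With DOI = 0.1 and $r_c = 20$ m, every sampled range lies between 18 m and 22 m, so a node 17 m away is always heard and a node 23 m away never is.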

5.6. Comparison to exhaustive search

To understand how our path-planning heuristics perform against an exhaustive search algorithm, which is able to find the optimal path, we conducted the following simulations. We vary the value of $n$ but skip the subnetwork partition step in Section 4.1 (this ensures that our observation is not affected by the partitioning result). Unfortunately, only a small $n$ (around 10) can be computationally handled by an exhaustive scheme (note that a subnetwork may contain many sensor nodes, which also need to be searched). Figure 19 shows that the CC-based schemes perform very closely to the optimal scheme.

5.7. Effect of $K$

Finally, we consider the case of multiple coexisting mules. We look at the maximum path length among all mules. First, we vary $K$ from 1 to 20 to investigate the effect of the number of mules when the merging-based clustering scheme is adopted. Figure 20(a) shows that, as the number of mules increases, the gaps between different path-planning schemes shrink gradually. This is because each mule is responsible for traversing relatively fewer subnetworks. It is to be noted that, ideally, $K$ mules should be able to reduce the path lengths by a factor of $1/K$. As can be seen in the figure, the reduction in the maximum path length is much less than expected. This shows the importance of load balance among mules. In Figure 20(b), we compare the performance of the merging-based and k-means-based schemes, where the CC-based scheme is adopted. Generally speaking, the k-means-based scheme performs better. As $K$ increases, the gaps decrease because each mule needs to take care of relatively fewer subnetworks.

[Figure 15: (a) path length (m) versus transmission range (m) for the 2-link look-ahead CC-based, CC-based, CH-based, greedy, and Christofides schemes; (b) number of subnetworks versus transmission range (m) before partition, after partition, and for the ideal partition.]

Figure 16. Effect of $r_c = 20$–$50$ on the minimum remaining energy of sensor nodes.

To summarize, the greedy scheme has a low computation cost, the CH-based scheme incurs low energy costs on sensor nodes, and the CC-based scheme can find shorter data-collection paths. Therefore, these three path-planning schemes may be adopted in different scenarios. For example, the greedy scheme is more suitable for a network with a high failure rate because frequent recomputing of the mule’s path may be needed. The CH-based scheme is more suitable for a sparse network because the energy of sensor nodes becomes more critical for connectivity reasons. On the other hand, the CC-based scheme is more suitable for a delay-sensitive network with a stronger real-time requirement.

Figure 18. Effect of DOI $= 0.1$–$0.5$ on the minimum remaining energy of sensor nodes.

Figure 19. Comparing with the optimal solution on path length.

6. CONCLUSIONS

This paper considers the data gathering issue in an SS-WSN, where sensor nodes may form several isolated subnetworks that are far away from one another. Mobile mules are adopted to traverse these subnetworks to conduct data collection. To address the issues of data-collection latency and network lifetime simultaneously, we formulate a new problem, called EM-TSP, to find mules’ traversal paths to visit each subnetwork in at least one landing port such that the energy consumption of sensors is bounded and the traversal path lengths of mules are minimized. We show that EM-TSP is a generalization of the classical TSP. Based on some interesting geometrical properties, we propose several heuristics to solve EM-TSP. In particular, the properties of the CH are explored to solve this problem. Simulation results show that our approaches can find efficient solutions to EM-TSP so as to balance between data-collection latency and network lifetime.

[Figure 17: (a) path length (m) versus DOI for the 2-link look-ahead CC-based, CC-based, CH-based, greedy, and Christofides schemes; (b) number of subnetworks versus DOI before partition, after partition, and for the ideal partition.]

Figure 20. Effect of the number of mules on the maximum path length among mules for (a) different path-planning schemes and (b) different clustering schemes.

APPENDIX A: PROOF OF THEOREM 1

Proof. We prove that LEM-TSP is NP-hard by reducing the ETSP [7], an NP-hard problem, to a special case of LEM-TSP. Note that ETSP is a special case of the TSP in which nodes are given in a Euclidean space. Let $G' = (V', E')$ and a positive integer $L'$ be an arbitrary instance of ETSP, where $G'$ is a complete graph and the distance between any two vertices $v'_i$ and $v'_j$ in $V'$ is defined as their Euclidean distance. ETSP is to determine whether $G'$ has a tour $P'$ visiting all vertices in $V'$ such that $|P'| \le L'$. We can reduce $G' = (V', E')$ to a special case $G = (V, E)$ of LEM-TSP in polynomial time as follows. Let $V = V'$ and $E = \emptyset$ (i.e., the communication range of each sensor node is infinitesimal, and thus, each subnetwork $G_i$, $i = 0, 1, \ldots, |V| - 1$, contains only one node). The distance between any two vertices $v_i$ and $v_j$ in $V$ is also their Euclidean distance. The sink node $v_0$ can be any vertex in $V$. Let the energy threshold $\bar{e}$ be the product of $e$, the data arrival rate, and $\delta$, and let $L = L'$. Clearly, the reduction can be done in polynomial time.

We now show that $G'$ has a tour $P'$ visiting all vertices in $V'$ such that $|P'| \le L'$ iff $G$ has a partition and a traversal path $P$ starting and ending at $v_0$, visiting each $G_i \ne G_0$ in one landing port, and connecting all nodes in $G_i$ via a data-collection tree $T$ rooted at the landing port such that $\max_{v_k \in V} E_T(v_k) \le \bar{e}$ and $|P| \le L$.

We first prove the if part. If $G$ has a solution to LEM-TSP with a traversal path $P$ and $|P| \le L$, $P$ must visit each vertex in $G$ exactly once because the only way to partition $G$ is to let each $G_i$ include only one node $v_i$. Thus, we can find a corresponding path $P'$ in $G'$ with the same visiting order of vertices as $P$ such that $|P'| \le L'$.

Conversely, we prove the only if part. If $G'$ has a solution $P'$ to ETSP with $|P'| \le L'$, we can find a corresponding path $P$ in $G$ with the same visiting order of vertices as $P'$ and a partition of $G$ in which each subnetwork $G_i$ contains only one node $v_i$. Clearly, $P$ starts and ends at $v_0$, visits each subnetwork $G_i$ in one landing port $v_i$, and connects all nodes in $G_i$ via a data-collection tree $T$ (i.e., $T$ is a single-node tree) such that $\max_{v_k \in V} E_T(v_k) \le \bar{e}$ and $|P| \le L$.

ACKNOWLEDGEMENTS

Yu-Chee Tseng’s research is co-sponsored by MoE ATU Plan; by NSC grants 97-3114-E-009-001, 97-2221-E-009-142-MY3, 98-2219-E-009-019, 98-2219-E-009-005, and 99-2218-E-009-005; by ITRI, Taiwan; by III, Taiwan; by D-Link; and by Intel.

REFERENCES

1. Rapaka A, Madria S. Two energy efficient algorithms for tracking objects in a sensor network. Wireless Communications and Mobile Computing 2007; 7(6): 809–819.

2. Hu F, Xiao Y, Hao Q. Congestion-aware, loss-resilient bio-monitoring sensor networking for mobile health

(16)

applications. IEEE Journal Selected Areas in Commu-nications 2009; 27(4): 450–465.

3. Liu H, Wan P, Jia X. Maximal lifetime scheduling for sensor surveillance systems with k sensors to one target. IEEE Transactions on Parallel and Distributed Systems 2006; 17(12): 1526–1536.

4. Tubaishat M, Zhuang P, Qi Q, Shang Y. Wire-less sensor networks in intelligent transportation sys-tems. Wireless Communications and Mobile Comput-ing 2009; 9(3): 287–302.

5. Terrestrial Ecology Observing Systems, Center For Embedded Networked Sensing. http://research.cens. ucla.edu/.

6. Data Mule, Center For Embedded Networked Sensing. http://research.cens.ucla.edu/projects/2005/Actuation/ datamule/.

7. Papadimitriou CH. The Euclidean traveling salesman problem is NP-complete. Theoretical Computer Sci-ence 1977; 4: 237–244.

8. Golden B, Bodin L, Doyle T, Stewart W, Jr. Approximate traveling salesman algorithms. Opera-tions Research 1980; 28(3): 694–711.

9. Wang Y-C, Wu F-J, Tseng Y-C. Mobility manage-ment algorithms and applications for mobile sensor networks. Wireless Communications and Mobile Com-puting. DOI: 10.1002/wcm.886.

10. Zou Y, Chakrabarty K. Sensor deployment and tar-get localization in distributed sensor networks. ACM Transactions on Embedded Computing Systems 2004; 3(1): 61–91.

11. Heo N, Varshney PK. Energy-efficient deployment of intelligent mobile sensor networks. IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans 2005; 35(1): 78–92.

12. Wang G, Cao G, La Porta TF. Movement-assisted sen-sor deployment. IEEE Transactions on Mobile Com-puting 2006; 5(6): 640–652.

13. Akkaya K, Younis M. COLA: a coverage and latency aware actor placement for wireless sensor and actor networks. In Proceedings of IEEE Vehicular Technol-ogy Conference, 2006; 1–5.

14. Koc O, Jaikaeo C, Shen C-C. Navigating actors in mobile sensor actor networks. In Proceedings of the ACM Workshop on Sensor and Actor Networks, 2007; 19–26.

15. Akkaya K, Senel F. Detecting and connecting disjoint sub-networks in wireless sensor and actor networks. Ad Hoc Networks 2009; 7(7): 1330–1346.

16. Yang G, Tong B, Qiao D, Zhang W. Sensor-aided over-lay deployment and relocation for vast-scale sensor networks. In Proceedings of IEEE INFOCOM, 2008; 2216–2224.

17. Shah RC, Roy S, Jain S, Brunette W. Data MULEs: modeling a three-tier architecture for sparse sensor

networks. In Procedings of IEEE Workshop on Sen-sor Network Protocols and Applications (SNPA), 2003; 30–41.

18. Jain S, Shah RC, Brunette W, Borriello G, Roy S. Exploiting mobility for energy efficient data collec-tion in wireless sensor networks. Mobile Networks and Applications 2006; 11(3): 327–339.

19. Luo L, Huang C, Abdelzaher T, Stankovic J. Enviro-Store: a cooperative storage system for disconnected operation in sensor networks. In Proceedings of IEEE INFOCOM, 2007; 1802–1810.

20. Singh A, Krause A, Kaiser WJ. Nonmyopic adaptive informative path planning for multiple robots. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2009; 1843–1850.

21. Zhao W, Ammar MH. Message ferrying: proactive routing in highly-partitioned wireless ad hoc networks. In Proceedings of the IEEE Workshop on Future Trends in Distributed Computing Systems, 2003; 308–314.

22. Zhao W, Ammar M, Zegura E. A message ferrying approach for data delivery in sparse mobile ad hoc networks. In Proceedings of the ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2004; 187–198.

23. Zhao W, Ammar M, Zegura E. Controlling the mobility of multiple data transport ferries in a delay-tolerant network. In Proceedings of IEEE INFOCOM, 2005; 1407–1418.

24. Ngai EC-H, Liu J, Lyu MR. Delay-minimized route design for wireless sensor-actuator networks. In Proceedings of IEEE Wireless Communications and Networking Conference, 2007; 3675–3680.

25. Ngai EC-H, Liu J, Lyu MR. An adaptive delay-minimized route design for wireless sensor–actuator networks. IEEE Transactions on Vehicular Technology 2009; 58(9): 5083–5094.

26. Yuan B, Orlowska M, Sadiq S. On the optimal robot routing problem in wireless sensor networks. IEEE Transactions on Knowledge and Data Engineering 2007; 19(9): 1252–1261.

27. Wan C-Y, Eisenman SB, Campbell AT, Crowcroft J. Siphon: overload traffic management using multi-radio virtual sinks in sensor networks. In Proceedings of the ACM International Conference on Embedded Networked Sensor Systems, 2005; 116–129.

28. Ma M, Yang Y. Data gathering in wireless sensor networks with mobile collectors. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2008; 1–9.

29. Xing G, Wang T, Xie Z, Jia W. Rendezvous planning in wireless sensor networks with mobile elements. IEEE Transactions on Mobile Computing 2008; 7(12): 1430–1443.

30. Xing G, Wang T, Jia W, Li M. Rendezvous design algorithms for wireless sensor networks with a mobile base station. In Proceedings of the ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2008; 231–240.

31. Rao J, Wu T, Biswas S. Network-assisted sink navigation protocols for data harvesting in sensor networks. In Proceedings of the IEEE Wireless Communications and Networking Conference, 2008; 2887–2892.

32. Rao J, Biswas S. Joint routing and navigation protocols for data harvesting in sensor networks. In Proceedings of the IEEE International Conference on Mobile Ad Hoc and Sensor Systems, 2008; 143–152.

33. Cormen TH. Introduction to Algorithms. The MIT Press: US, 2001.

34. Christofides N. Worst-case analysis of a new heuristic for the travelling salesman problem. Report No. 388, GSIA, Carnegie-Mellon University, Pittsburgh, PA, 1976.

35. Arora S. Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems. Journal of the ACM 1998; 45(5): 753–782.

36. Meliou A, Chu D, Guestrin C, Hellerstein J, Hong W. Data gathering tours in sensor networks. In Proceedings of the IEEE International Symposium on Information Processing in Sensor Networks, 2006; 43–50.

37. Meliou A, Krause A, Guestrin C, Hellerstein J. Non-myopic informative path planning in spatio-temporal models. In Proceedings of the Conference on Artificial Intelligence (AAAI), 2007; 602–607.

38. Keshavarzian A, Lee H, Venkatraman L. Wakeup scheduling in wireless sensor networks. In Proceedings of the ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2006; 322–333.

39. Hartigan JA. Clustering Algorithms. Wiley: New York, 1975.

40. Larson RC, Odoni AR. Urban Operations Research. Prentice-Hall: Englewood Cliffs, NJ, 1981.

41. Dai H, Han R. A node-centric load balancing algorithm for wireless sensor networks. In Proceedings of the IEEE Global Telecommunications Conference, 2003; 548–552.

42. Hu L, Evans D. Localization for mobile sensor networks. In Proceedings of the ACM International Conference on Mobile Computing and Networking, 2004; 45–57.

43. Duda RO, Stork DG, Hart PE. Pattern Classification. John Wiley and Sons, Inc.: New York, 2000.

44. Shnayder V, Hempstead M, Chen B-R, Allen GW, Welsh M. Simulating the power consumption of large-scale sensor network applications. In Proceedings of the ACM International Conference on Embedded Networked Sensor Systems, 2004; 188–200.

45. Hartung C, Han R, Seielstad C, Holbrook S. FireWxNet: a multitiered portable wireless system for monitoring weather conditions in wildland fire environments. In Proceedings of the ACM International Conference on Mobile Systems, Applications, and Services, 2006; 28–41.

46. Zhou G, He T, Krishnamurthy S, Stankovic JA. Models and solutions for radio irregularity in wireless sensor networks. ACM Transactions on Sensor Networks 2006; 2(2): 221–262.

AUTHORS’ BIOGRAPHIES

Fang-Jing Wu received the B.S. degree in Mathematics from the Fu Jen Catholic University and the M.S. degree in Computer Science and Information Engineering from the National Chiao-Tung University, Taiwan, in 2001 and 2004, respectively. She was a research assistant in the Department of Communication Engineering, National Chiao-Tung University, Taiwan, in 2004. She is currently pursuing a Ph.D. in the Department of Computer Science, National Chiao-Tung University, Taiwan. Her current research interests are primarily in pervasive computing and wireless sensor networks.

Yu-Chee Tseng received his Ph.D. in Computer and Information Science from the Ohio State University in January of 1994. He is/was Professor (2000–present), Chairman (2005–2009), and Associate Dean (2007–2011) of the Department of Computer Science, National Chiao-Tung University, Taiwan, and Chair Professor, Chung Yuan Christian University (2006–2010).

Dr. Tseng received the Outstanding Research Award (National Science Council, 2001, 2003, and 2009), Best Paper Award (Int’l Conf. on Parallel Processing, 2003), Elite I. T. Award (2004), Distinguished Alumnus Award (Ohio State University, 2005), and Y. Z. Hsu Scientific Paper Award (2009). His research interests include mobile computing, wireless communication, and parallel and distributed computing.

Dr. Tseng serves/served on the editorial boards of IEEE Transactions on Vehicular Technology (2005–2009), IEEE Transactions on Mobile Computing (2006–present), and IEEE Transactions on Parallel and Distributed Systems (2008–present).