
Optimizing Lifetime for Continuous Data Aggregation With Precision Guarantees in Wireless Sensor Networks

Xueyan Tang, Member, IEEE, and Jianliang Xu, Senior Member, IEEE

Abstract—This paper exploits the tradeoff between data quality and energy consumption to extend the lifetime of wireless sensor networks. To obtain an aggregate form of sensor data with precision guarantees, the precision constraint is partitioned and allocated to individual sensor nodes in a coordinated fashion. Our key idea is to differentiate the precisions of data collected from different sensor nodes to balance their energy consumption. Three factors affecting the lifetime of sensor nodes are identified: 1) the changing pattern of sensor readings; 2) the residual energy of sensor nodes; and 3) the communication cost between the sensor nodes and the base station. We analyze the optimal precision allocation in terms of network lifetime and propose an adaptive scheme that dynamically adjusts the precision constraints at the sensor nodes. The adaptive scheme also takes into consideration the topological relations among sensor nodes and the effect of in-network aggregation. Experimental results using real data traces show that the proposed scheme significantly improves network lifetime compared to existing methods.

Index Terms—Data accuracy, data aggregation, energy efficiency, network lifetime, sensor network.

I. INTRODUCTION

WIRELESS sensor networks are used in a wide range of applications to capture, gather and analyze live environmental data [1], [2]. A wireless sensor network typically consists of a base station and a group of sensor nodes (see Fig. 1). The sensor nodes are responsible for continuously sampling physical phenomena such as temperature and humidity. They are also capable of communicating with each other and the base station through radios. The base station, on the other hand, serves as a gateway for the sensor network to exchange data with applications to accomplish their missions.

While the base station can have continuous power supply, the sensor nodes are usually battery-powered. The batteries are inconvenient and sometimes even impossible to replace. When a sensor node runs out of energy, its coverage is lost. The mission of a sensor application would not be able to continue if the coverage loss is remarkable. Therefore, the practical value of a sensor network is determined by the time duration before it fails to carry out the mission due to insufficient number of “alive” sensor nodes. This duration is referred to as the network lifetime [1]. It is both mission-critical and economically desirable to manage sensor data in an energy-efficient way to extend the lifetime of sensor networks.

Manuscript received October 24, 2006; revised February 23, 2007; first published February 2, 2008; last published August 15, 2008 (projected); approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor N. Shroff. This work was supported in part by a grant from Nanyang Technological University under Project RG47/06. The work of J. Xu was supported in part by grants from the Research Grants Council of the Hong Kong under Projects HKBU211505, HKBU211307, and FRG/05-06/II-65. An earlier version of this work was presented at IEEE INFOCOM’2006, Barcelona, Spain, April 23–29, 2006.

X. Tang is with the School of Computer Engineering, Nanyang Technological University, Singapore 639798 (e-mail: [email protected]).

J. Xu is with the Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong (e-mail: [email protected]).

Digital Object Identifier 10.1109/TNET.2007.902699

Fig. 1. System architecture.

The data captured by the sensor nodes are often converted into an aggregate form requested by the applications (e.g., average temperature reading). Primarily designed for monitoring purposes, many sensor applications require continuous aggregation of sensor data [3]. Exact data aggregation requires substantial energy consumption because each sensor node has to report every reading to the base station. In wireless sensor networks, communication is a dominant source of energy consumption [4], [5]. To save energy, data semantics can be relaxed to allow approximate data aggregation with precision guarantees [6]–[9]. The precision can, for example, be specified in the form of quantitative error bounds: “average temperature reading of all sensor nodes within an error bound of 1 °C.” In this way, the sensor nodes do not have to report all readings to the base station. Only the updates necessary to guarantee the desired level of precision need to be sent.

It is, however, a challenging task to optimize network lifetime under approximate data aggregation because the sensor nodes are inherently heterogeneous in energy consumption. First, when the data captured by different sensor nodes change at different magnitudes and frequencies, the sensor nodes may report data at different rates. Second, the wireless communication cost depends on the transmission distance [10], [11]. Due to the geographically distributed nature of sensor networks, the sensor nodes are likely to differ significantly in the energy cost of sending a message to the base station. Even if all sensor nodes report data at the same rate, their energy consumption can be highly unbalanced, thereby reducing network lifetime.

In addition to reporting local sensor readings, the intermediate nodes in a multi-hop network are also responsible for relaying the data originated from other nodes to the base station. The nodes closer to the base station normally relay larger amounts of data than the nodes farther away from the base station.

In this paper, we investigate the optimization of network lifetime for approximate data aggregation. We leverage the semantics of approximate data aggregation in balancing the energy consumption of the sensor nodes. Our key idea is to differentiate the quality of data collected from different sensor nodes by partitioning the precision constraint of data aggregation among the sensor nodes in a coordinated fashion. Our contributions are summarized as follows.

• We identify three factors affecting the lifetime of sensor nodes in the context of approximate data aggregation: 1) the changing pattern of sensor readings; 2) the residual energy of the sensor nodes; and 3) the communication cost between the sensor nodes and the base station. We then analyze the optimal precision allocation in terms of network lifetime.

• We develop a candidate-based method for precision allocation and prove its optimality for single-hop networks. Based on this method, an adaptive scheme is proposed to dynamically adjust the error bounds allocated to the sensor nodes. The adjustment period is also dynamically set to control the communication overhead.

• We derive the hardness results of candidate-based precision allocation in multi-hop networks. We extend the adaptive scheme to work in multi-hop networks by taking into consideration the effect of in-network aggregation and the topological relations among the sensor nodes.

• We present an experimental evaluation using real data traces over a wide range of system configurations. The results show that the proposed scheme significantly improves network lifetime compared to existing methods.

The rest of this paper is organized as follows. Section II summarizes the related work. Section III describes the system model and gives some basic definitions. Section IV analyzes the optimal precision allocation in single-hop networks and then proposes an adaptive precision allocation scheme. Section V extends the adaptive scheme to multi-hop networks. The experimental setup and results are discussed in Section VI. Finally, Section VII concludes the paper.

II. RELATED WORK

Wireless sensor networks have attracted much research effort in recent years. From the networking perspective, researchers have primarily focused on optimizing network related operations such as routing and media access [12]–[15]. From the database perspective, researchers have mainly focused on query processing over sensor data [16]–[19]. However, not much work has looked into trading data quality for energy efficiency.

Recently, several approaches have been proposed to relax data semantics and allow a specified degree of inaccuracy to be tolerated in sensor data collection. To acquire approximate readings of individual sensor nodes, the precision constraints can be set independently for different sensor nodes [8], [20]. In contrast, to collect an aggregate form of sensor data over the network, the precision settings of different sensor nodes should be inter-related. Olston et al. [6] investigated burden-based precision adjustment for continuous queries over distributed data streams. However, they did not model in-network aggregation, which is a commonly used technique to reduce the traffic of data collection in wireless sensor networks [21]. Sharaf et al. [7] implemented a simple uniform precision allocation for in-network sensor data aggregation. Deligiannakis et al. [9] further optimized the allocation to reduce the number of messages transmitted in the network. However, none of these studies has taken energy and lifetime models into consideration. Thus, their proposed techniques are not effective in handling the energy constraints in wireless sensor networks. As shall be shown by our experimental results, minimizing the total network traffic does not necessarily optimize network lifetime. Different from existing work, in this paper, we aim at extending network lifetime for data aggregation with precision guarantees in sensor networks.

Considine et al. [22] and Nath et al. [23] implemented approximate data aggregation in the presence of multi-path routing by means of sketches and synopses. However, they did not make use of temporal locality to suppress data updates. Deshpande et al. [24] and Chu et al. [25] applied statistical techniques to model the distributions of sensor data for approximate data collection. The performance of this approach depends on the quality of the models built. Different from this approach, we do not require the construction of statistical models in advance. Our proposed techniques dynamically adapt to the changing pattern of sensor readings on the fly. Related work on approximate data collection also includes representing sensor readings with sophisticated data structures [26], [27] and exploiting the spatial correlation between sensor readings [28], [29]. These studies are complementary to our work.

III. PRELIMINARIES

We consider data aggregation with precision guarantees in a network of $n$ sensor nodes. The sensor nodes are geographically distributed in an operational area. They periodically sample the local phenomena such as temperature and humidity. Without loss of generality, the sampling period is assumed to be 1 time unit. The base station collects data from the sensor nodes and feeds them to an application. The application specifies the precision constraint of data aggregation by an upper bound $E$ (called the error bound) on the quantitative difference between an approximate result and the exact result [7], [9]. That is, on receiving an aggregate result $A$ from the sensor network, the application would like to be assured that the exact aggregate result lies in the interval $[A - E, A + E]$.

In approximate data aggregation, not all sensor readings have to be sent to the base station. To reduce communication cost, the designated error bound on aggregate data can be partitioned and allocated to individual sensor nodes (we shall call it precision allocation). Each sensor node updates a new reading with the base station only when the new reading significantly deviates from the last update to the base station and violates the allocated error bound. To guarantee the designated precision of aggregate data, the error bounds allocated to individual sensor nodes have to satisfy certain feasibility constraints. Different aggregation functions impose different constraints. In this paper, we consider three commonly used types of aggregations: SUM, COUNT and AVERAGE. For SUM and COUNT aggregations, to guarantee an error bound $E$ on aggregate data, the total error bound allocated to the sensor nodes cannot exceed $E$, i.e.,

$$\sum_{i=1}^{n} e_i \le E \tag{1}$$

where $e_i$ is the error bound allocated to node $i$. For AVERAGE aggregation, the total error bound allocated to the sensor nodes cannot exceed $n \cdot E$, i.e.,

$$\sum_{i=1}^{n} e_i \le n \cdot E \tag{2}$$

where $n$ is the number of sensor nodes.

Eligible precision allocation under the feasibility constraint is not unique. For example, in a network of 10 temperature sensor nodes, if the given error bound on AVERAGE aggregation is 1 °C, we can allocate an error bound of 1 °C to each sensor node. Alternatively, we can also allocate an error bound of 5.5 °C to a selected node and an error bound of 0.5 °C to each of the remaining nodes. This offers the flexibility to adjust the energy consumption of individual sensor nodes by careful precision allocation. In general, to collect the readings of a sensor node at higher precision (i.e., smaller error bound), the sensor node needs to send data updates to the base station more frequently, which introduces higher energy consumption.
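The two allocations in the example above can be checked against constraints (1) and (2) mechanically. The following is a minimal sketch for illustration only; the function name `is_feasible`, its arguments, and the rounding tolerance are assumptions and do not appear in the paper.

```python
import math

def is_feasible(bounds, E, aggregate="SUM"):
    """Return True if the per-node error bounds satisfy the precision constraint.

    bounds:    error bound allocated to each sensor node
    E:         error bound on the aggregate result
    aggregate: "SUM" and "COUNT" use constraint (1), "AVERAGE" uses constraint (2)
    """
    n = len(bounds)
    cap = n * E if aggregate == "AVERAGE" else E
    # fsum limits rounding error; the small tolerance (an assumption) keeps
    # allocations that use the budget exactly from being rejected.
    return math.fsum(bounds) <= cap + 1e-9

uniform = [1.0] * 10          # 1 degree for every node
skewed = [5.5] + [0.5] * 9    # 5.5 degrees for one node, 0.5 for the rest

print(is_feasible(uniform, 1.0, "AVERAGE"))
print(is_feasible(skewed, 1.0, "AVERAGE"))
print(is_feasible(uniform, 1.0, "SUM"))
```

Both allocations use the full AVERAGE budget of $n \cdot E = 10$, while the same uniform allocation would violate a SUM constraint with the same $E$.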

We denote the energy consumed by sensor node $i$ to send and receive a data update by $s_i$ and $v_i$ respectively. They can take different forms to cater for a wide range of factors. In the simplest case, if all sensor nodes use a default radio communication range, $s_i$'s are the same for all nodes. More sophisticatedly, if the sensor nodes know the locations of the receivers [11], [30], [31], they can adapt the power level to the transmission distance. The sensor nodes with longer transmission distances would be associated with higher $s_i$'s. In addition, reliability can also be modeled in the energy cost. The sensor nodes incident to less reliable links are entitled to higher $s_i$'s and $v_i$'s due to possible retransmissions. The exact forms of $s_i$ and $v_i$ are orthogonal to our analysis and beyond the scope of this paper. We simply assume that each sensor node $i$ knows $s_i$ and $v_i$.

Similar to other studies [32]–[35], we define the network lifetime as the time duration before the first sensor node runs out of energy. Our analysis is also applicable to redundant sensor deployment where each location of interest is covered by several sensor nodes. From the viewpoint of network lifetime, the set of sensor nodes monitoring the same location can be converted to an equivalent single node by adding up the energy budgets of these sensor nodes. More generally, if the network lifetime is defined as the time duration before a given portion of sensor nodes run out of energy, our proposed scheme can be applied repeatedly after the exhaustion of a sensor node’s energy.

IV. PRECISION ALLOCATION IN SINGLE-HOP NETWORKS

We start by investigating the precision allocation in a single-hop network where each sensor node sends its local readings to the base station directly. Single-hop networks are preferred in some situations due to a number of reasons [11]. Moreover, the analysis of precision allocation in a single-hop network also provides insights on the allocation in a multi-hop network. The adaptive precision allocation scheme developed for single-hop networks will serve as a building block of the scheme we shall propose for multi-hop networks in Section V.

Note that constraints (1) and (2) share the characteristic that the total error bound of the sensor nodes is capped by a given value. We shall focus on constraint (1) in our discussion. The analysis and algorithms developed in this paper can be adapted to handle constraint (2) in a straightforward manner. They are also directly applicable to SUM and AVERAGE aggregations over any fixed subset of the sensor nodes.

A. Analysis of Optimal Precision Allocation

Consider a snapshot of the network. Let $e_1, e_2, \ldots, e_n$ be the error bounds currently allocated to sensor nodes $1, 2, \ldots, n$ respectively. The quantitative relationship between the rate of data updates sent by a sensor node and its allocated error bound depends on the changing pattern of sensor readings. Without loss of generality, we shall denote the update rate of each sensor node $i$ as a function $r_i(e_i)$ of the allocated error bound $e_i$. $r_i(e_i)$ is essentially the rate at which node $i$'s reading changes by more than $e_i$. Intuitively, $r_i(e_i)$ is a non-increasing function with respect to $e_i$, and $r_i(\infty) = 0$.

Since the sensor nodes in a single-hop network are not involved in relaying data from other sensor nodes to the base station, the energy consumption rate of node $i$ is simply

$$r_i(e_i) \cdot s_i$$

where $s_i$ refers to the energy cost for node $i$ to send a data update to the base station. Suppose the residual energy of node $i$ is $p_i$. Then, the expected lifetime of node $i$ is

$$\frac{p_i}{r_i(e_i) \cdot s_i}.$$

Therefore, the network lifetime is given by

$$\min_{1 \le i \le n} \frac{p_i}{r_i(e_i) \cdot s_i}.$$

The objective of precision allocation is to find a set of error bounds $e_1, e_2, \ldots, e_n$ that maximize the network lifetime under the constraint

$$\sum_{i=1}^{n} e_i \le E.$$

We now analyze the optimal precision allocation. For simplicity, we shall assume functions $r_i$'s are continuous and denote the inverse function of $r_i$ by $r_i^{-1}$.

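Since $r_i$ is non-increasing, the minimum lifetime of sensor node $i$ is given by

$$\frac{p_i}{r_i(0) \cdot s_i}.$$

Without loss of generality, suppose

$$\frac{p_1}{r_1(0) \cdot s_1} \le \frac{p_2}{r_2(0) \cdot s_2} \le \cdots \le \frac{p_n}{r_n(0) \cdot s_n}.$$

For each pair of nodes $i$ and $j$ where $i < j$, consider the error bound

$$e_{i,j} = r_i^{-1}\left(\frac{p_i \cdot r_j(0) \cdot s_j}{p_j \cdot s_i}\right)$$

that makes the lifetime of node $i$ equivalent to the minimum lifetime of node $j$. Since $r_i$ is non-increasing, it follows from $\frac{p_j}{r_j(0) \cdot s_j} \le \frac{p_{j+1}}{r_{j+1}(0) \cdot s_{j+1}}$ that

$$e_{i,j} \le e_{i,j+1}.$$

Thus, given any $k$,

$$\sum_{i=1}^{k-1} e_{i,k} \le \sum_{i=1}^{k-1} e_{i,k+1} \le \sum_{i=1}^{k} e_{i,k+1}.$$

This implies $\sum_{i=1}^{k-1} e_{i,k}$ is non-decreasing with increasing $k$. Note that when $k = 1$, $\sum_{i=1}^{k-1} e_{i,k} = 0$. Therefore, given an error bound $E$ on data aggregation, if $\sum_{i=1}^{n-1} e_{i,n} > E$, there must exist a $k$, where $1 \le k < n$, such that

$$\sum_{i=1}^{k-1} e_{i,k} \le E < \sum_{i=1}^{k} e_{i,k+1}.$$

Since

$$r_k^{-1}\left(\frac{p_k \cdot r_k(0) \cdot s_k}{p_k \cdot s_k}\right) = r_k^{-1}\big(r_k(0)\big) = 0,$$

we have

$$\sum_{i=1}^{k} r_i^{-1}\left(\frac{p_i \cdot r_k(0) \cdot s_k}{p_k \cdot s_i}\right) \le E < \sum_{i=1}^{k} r_i^{-1}\left(\frac{p_i \cdot r_{k+1}(0) \cdot s_{k+1}}{p_{k+1} \cdot s_i}\right).$$

Hence, there also exists an $l^*$, where $\frac{p_k}{r_k(0) \cdot s_k} \le l^* < \frac{p_{k+1}}{r_{k+1}(0) \cdot s_{k+1}}$, such that

$$\sum_{i=1}^{k} r_i^{-1}\left(\frac{p_i}{l^* \cdot s_i}\right) = E. \tag{3}$$

On the other hand, if $\sum_{i=1}^{n-1} e_{i,n} \le E$, since $r_i^{-1}$'s are non-increasing and $\sum_{i=1}^{n} r_i^{-1}\left(\frac{p_i \cdot r_n(0) \cdot s_n}{p_n \cdot s_i}\right) = \sum_{i=1}^{n-1} e_{i,n} \le E$, there exists an $l^*$, where $l^* \ge \frac{p_n}{r_n(0) \cdot s_n}$, such that

$$\sum_{i=1}^{n} r_i^{-1}\left(\frac{p_i}{l^* \cdot s_i}\right) = E. \tag{4}$$

For convenience, we shall denote $k = n$ in this case so that (4) is consistent with (3).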

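Theorem 1: An optimal precision allocation is given by

$$e^*_i = \begin{cases} r_i^{-1}\left(\dfrac{p_i}{l^* \cdot s_i}\right), & 1 \le i \le k, \\ 0, & k < i \le n. \end{cases}$$

This allocation has a lifetime $l^*$.

Proof: It follows from (3) that $e^*_1, e^*_2, \ldots, e^*_n$ satisfies the feasibility constraint of precision allocation. Assume on the contrary that there exists another precision allocation $e'_1, e'_2, \ldots, e'_n$ which has a lifetime $l' > l^*$. The definition of network lifetime implies that for any $1 \le i \le k$,

$$\frac{p_i}{r_i(e'_i) \cdot s_i} \ge l' > l^*.$$

Thus,

$$r_i(e'_i) < \frac{p_i}{l^* \cdot s_i} = r_i(e^*_i).$$

Since $r_i$ is non-increasing, we have

$$e'_i > e^*_i.$$

Therefore,

$$\sum_{i=1}^{n} e'_i \ge \sum_{i=1}^{k} e'_i > \sum_{i=1}^{k} e^*_i = E,$$

which contradicts the feasibility of $e'_1, e'_2, \ldots, e'_n$. Hence, the theorem is proven.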

Theorem 1 implies that the sensor nodes with high residual energy $p_i$, slow change in readings (i.e., low $r_i(e_i)$), and low communication cost $s_i$ may be assigned zero error bounds. The sensor nodes allocated nonzero error bounds in an optimal precision allocation must be equal in the energy consumption rate normalized by the residual energy

$$\frac{r_i(e_i) \cdot s_i}{p_i}.$$

We shall call $\frac{r_i(e_i) \cdot s_i}{p_i}$ the normalized energy consumption rate. To extend network lifetime, it is important to balance the normalized energy consumption rates of the sensor nodes.
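To make the balancing effect of Theorem 1 concrete, the sketch below searches numerically for the common lifetime of the nodes that receive nonzero error bounds. It is an illustration under stated assumptions, not the paper's procedure: the update-rate model `r_i(e) = a[i] / (1 + e)` is hypothetical and chosen only because its inverse has a closed form, and the names `optimal_allocation`, `node_lifetimes`, `a`, `s`, and `p` are made up for this example.

```python
def optimal_allocation(a, s, p, E, iters=200):
    """Bisection on the common lifetime of the nodes with nonzero bounds.

    Hypothetical model: r_i(e) = a[i] / (1 + e), so the error bound that
    gives node i a lifetime l is max(0, a[i] * l * s[i] / p[i] - 1).
    s[i] is the send cost and p[i] the residual energy of node i.
    """
    def bounds_for(l):
        return [max(0.0, ai * l * si / pi - 1.0) for ai, si, pi in zip(a, s, p)]

    lo = 0.0
    # At this lifetime every node would need a bound of at least E,
    # so the total is guaranteed to reach the budget.
    hi = max(pi * (1.0 + E) / (ai * si) for ai, si, pi in zip(a, s, p))
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if sum(bounds_for(mid)) > E:
            hi = mid
        else:
            lo = mid
    return lo, bounds_for(lo)

def node_lifetimes(a, s, p, bounds):
    """Expected lifetime p_i / (r_i(e_i) * s_i) under the same model."""
    return [pi * (1.0 + e) / (ai * si) for ai, si, pi, e in zip(a, s, p, bounds)]

a = [0.9, 0.5, 0.2]        # update rates at zero error bound
s = [1.0, 1.0, 2.0]        # energy cost of sending one update
p = [100.0, 100.0, 100.0]  # residual energy

l_small, bounds_small = optimal_allocation(a, s, p, E=0.5)
l_large, bounds_large = optimal_allocation(a, s, p, E=3.0)
print(l_small, bounds_small, node_lifetimes(a, s, p, bounds_small))
print(l_large, bounds_large, node_lifetimes(a, s, p, bounds_large))
```

With the small budget only the fastest-draining node receives a nonzero bound, while the other two keep zero bounds because their minimum lifetimes already exceed the balanced lifetime. With the larger budget all three nodes end up with the same lifetime.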

B. Candidate-Based Precision Allocation

In practice, the exact forms of $r_i$'s (i.e., the changing patterns of sensor readings) may not be known a priori and they may even change dynamically. Thus, we propose a candidate-based method for precision allocation. The key idea is to let each sensor node estimate and report to the base station the normalized energy consumption rates for a number of candidate error bounds based on historical sensor readings. The base station optimizes precision allocation based on these candidates to extend network lifetime. Since the general relationships between error bounds and update rates are not known, we restrict the error bound allocated to each sensor node to one of its candidates. Such allocations are called candidate precision allocations and the one that maximizes network lifetime is called the optimal candidate precision allocation.

Assume that each sensor node chooses $m$ candidates. For each node $i$, let $c_{i,1} < c_{i,2} < \cdots < c_{i,m}$ be the list of candidate error bounds, and $w_{i,1}, w_{i,2}, \ldots, w_{i,m}$ be the corresponding normalized energy consumption rates. It follows that $w_{i,1} \ge w_{i,2} \ge \cdots \ge w_{i,m}$. Suppose the smallest candidate error bounds for the sensor nodes do not add up to the designated bound on data aggregation, i.e., $\sum_{i=1}^{n} c_{i,1} \le E$.¹ Algorithm 1 presents the pseudocode to compute the optimal candidate precision allocation.

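Algorithm 1 Optimal Candidate Precision Allocation in a Single-Hop Network

Input:

$E$: error bound of data aggregation

$c_{i,j}$, $w_{i,j}$ ($1 \le i \le n$, $1 \le j \le m$): candidate error bounds and normalized energy consumption rates

Output:

$e_i$ ($1 \le i \le n$): error bound of each node in optimal allocation

1: for $i \leftarrow 1$ to $n$ do
2:     $k_i \leftarrow 1$; $e_i \leftarrow c_{i,1}$;
3: end for
4: while $\sum_{i=1}^{n} e_i < E$ do
5:     $j \leftarrow \arg\max_{1 \le i \le n} w_{i,k_i}$;
6:     if $\sum_{i=1}^{n} e_i - c_{j,k_j} + c_{j,k_j+1} > E$ then
7:         break;
8:     end if
9:     $k_j \leftarrow k_j + 1$; $e_j \leftarrow c_{j,k_j}$;
10: end while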

Initially, the error bound of each sensor node is set to its smallest candidate (steps 1–3). In each iteration of steps 4–10, the error bound of the node having the highest normalized energy consumption rate is replaced with its next smallest candidate. The iteration stops if a new replacement would make the total error bound of the sensor nodes exceed the designated bound on data aggregation (steps 6–7). The worst-case time complexity of Algorithm 1 is polynomial in $n$ and $m$.² We show that Algorithm 1 produces an optimal candidate precision allocation.

Theorem 2: The candidate precision allocation computed by Algorithm 1 maximizes network lifetime.

Proof: Let be the precision

allocation computed by Algorithm 1. It is obvious that is feasible. Suppose under such allocation, sensor node has the highest normalized energy consumption rate, i.e.,

The network lifetime is then given by

¹ Our proposed candidate selection method (to be discussed later in this section) satisfies this constraint.

² As shall be shown by our experimental results in Section VI, a small $m$ like 5 is sufficient to achieve near optimal network lifetime.

It is easy to infer from Algorithm 1 that: (i) if ,

and (ii) for each where ,

Assuming on the contrary that there exists another candidate precision allocation with a longer network lifetime, i.e.,

It follows that

Since

we have

Therefore,

and hence,

Based on property (i)

Thus, there must exist a such that

which implies

It follows from property (ii) that

Therefore,


which contradicts the assumption that has a longer network lifetime.

Hence, the theorem is proven.
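The greedy procedure described for Algorithm 1 can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the function name `allocate` and the list-of-lists layout are assumptions, and the extra stop when the bottleneck node has no larger candidate left is an assumption added to keep the sketch well defined.

```python
def allocate(cands, rates, E):
    """Greedy candidate precision allocation for a single-hop network.

    cands[i]: candidate error bounds of node i, in increasing order
    rates[i]: normalized energy consumption rates for those candidates
              (non-increasing, since larger bounds suppress more updates)
    E:        error bound on the aggregate; the smallest candidates are
              assumed to sum to at most E
    Returns the index of the candidate chosen for every node.
    """
    n = len(cands)
    k = [0] * n                               # steps 1-3: smallest candidates
    total = sum(c[0] for c in cands)
    while True:
        # Step 5: node with the highest normalized energy consumption rate.
        j = max(range(n), key=lambda i: rates[i][k[i]])
        if k[j] + 1 == len(cands[j]):
            break                             # assumption: no larger candidate left
        new_total = total - cands[j][k[j]] + cands[j][k[j] + 1]
        if new_total > E:                     # steps 6-7: replacement is infeasible
            break
        k[j] += 1                             # step 9: move to the next candidate
        total = new_total
    return k

example_cands = [[0, 1, 2, 4], [0, 1, 2, 4], [0, 1, 2, 4]]
example_rates = [[9, 6, 4, 2], [5, 4, 3, 1], [3, 2, 2, 1]]
chosen = allocate(example_cands, example_rates, E=4)
print([example_cands[i][chosen[i]] for i in range(3)])  # prints [2, 1, 0]
```

In this small instance the highest remaining rate is 4, and lowering it further would require bounds of at least 4 and 2 on the first two nodes, which exceeds the budget of 4.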

C. Adaptive Precision Allocation

We now present an adaptive precision allocation scheme that works by adjusting the error bounds of the sensor nodes periodically. The interval between two successive adjustments is called an adjustment period. At the beginning of an adjustment period, each sensor node $i$ selects a list of candidate error bounds $c_{i,1}, c_{i,2}, \ldots, c_{i,m}$. The node keeps track of the update counts under these error bounds as it captures new readings.³ At the end of the adjustment period, node $i$ normalizes the counts by the length of period to obtain the data update rate $r_{i,j}$ for each $c_{i,j}$. Node $i$ then computes the normalized energy consumption rate $w_{i,j}$ for each $c_{i,j}$ by

$$w_{i,j} = \frac{r_{i,j} \cdot s_i}{p_i}$$

where $p_i$ is the present residual energy of node $i$. Node $i$ sends a candidate report message including the $c_{i,j}$'s and $w_{i,j}$'s to the base station. On receiving the messages from all sensor nodes, the base station computes the optimal precision allocation $e_1, e_2, \ldots, e_n$ using Algorithm 1. In case $\sum_{i=1}^{n} e_i < E$, the leftover error bound $E - \sum_{i=1}^{n} e_i$ is simply allocated to the node with the highest normalized energy consumption rate since doing so would only extend network lifetime. Finally, the base station sends a precision allocation message to the sensor nodes including the new error bounds for their adjustments.
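The per-node bookkeeping described above can be sketched as a set of shadow counters, one per candidate error bound. The class below is illustrative only; the name `CandidateTracker`, its methods, and the sample readings are assumptions, and the counters only simulate what would have been sent under each candidate, as footnote 3 requires.

```python
class CandidateTracker:
    """Shadow update counters for the candidate error bounds of one node."""

    def __init__(self, candidates, first_reading):
        self.candidates = list(candidates)
        # Value that would have been reported last under each candidate.
        self.last_reported = [first_reading] * len(self.candidates)
        self.counts = [0] * len(self.candidates)
        self.samples = 0

    def observe(self, reading):
        """Process the reading captured in one sampling period."""
        self.samples += 1
        for j, bound in enumerate(self.candidates):
            if abs(reading - self.last_reported[j]) > bound:
                self.counts[j] += 1
                self.last_reported[j] = reading

    def report(self, send_cost, residual_energy):
        """Normalized energy consumption rate for every candidate.

        Assumes at least one reading has been observed.
        """
        return [(count / self.samples) * send_cost / residual_energy
                for count in self.counts]

tracker = CandidateTracker([0.0, 0.5, 1.0, 2.0], first_reading=20.0)
for reading in [20.4, 21.1, 21.0, 22.3, 22.2]:
    tracker.observe(reading)
rates = tracker.report(send_cost=2.0, residual_energy=100.0)
print(tracker.counts)  # prints [5, 2, 2, 1]
print(rates)
```

The candidate bounds and the resulting rates are what a node would place in its candidate report message.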

Algorithm 1 and Theorem 2 are generic in that they are applicable to any list of candidates. In this paper, we propose to choose a set of candidate error bounds that are exponentially spaced. The closer the candidates to the current error bound, the smaller the difference between neighboring candidates. The motivation is to adjust the error bounds at coarse granularity when they are far away from the optimum, and adjust them at fine granularity when they are close to the optimum. Let be the current error bound of sensor node . Then, the candidate error bounds of node range from to . Given the number of candidates , the candidate error bounds are selected as

Note that the network lifetime is determined by the lifetime of the most energy-consuming node. Thus, to control the energy overhead of adjustments, we propose to cap the energy overhead at the most energy-consuming node by a given portion of its energy budget. This is done by dynamically adapting the adjustment period at each adjustment. Specifically, each sensor node counts the number of data updates sent to the base station in the adjustment periods. At an adjustment, node $i$ estimates its energy consumption rate by $\frac{u_i \cdot s_i}{\tau}$, where $u_i$ is the update count in the past adjustment period, $s_i$ is the energy cost for sending, and $\tau$ is the duration of the past adjustment period. Note that at an adjustment, each sensor node needs to send a candidate report message to and receive a precision allocation message from the base station. Thus, the energy cost at node $i$ due to an adjustment is $s_i + v_i$, where $s_i$ and $v_i$ are the sending and receiving costs respectively. To limit it at a portion $\alpha$ of the energy consumed by node $i$, the duration of the next adjustment period $\tau'_i$ should be set such that

$$s_i + v_i = \alpha \cdot \frac{u_i \cdot s_i}{\tau} \cdot \tau'_i,$$

i.e.,

$$\tau'_i = \frac{(s_i + v_i) \cdot \tau}{\alpha \cdot u_i \cdot s_i}.$$

Each sensor node $i$ computes $\tau'_i$ and includes it in the candidate report message sent to the base station at the end of an adjustment period. Among all $\tau'_i$'s received, the base station selects the lowest one $\tau'$ as the next adjustment period so as to cap the adjustment overhead at a portion $\alpha$ of the energy consumed at the most consuming node. $\tau'$ is then included in the precision allocation message sent by the base station to all sensor nodes. We shall investigate the impact of $\alpha$ with simulation experiments in Section VI.

³ Note that the sensor node does not actually send data updates based on these candidate error bounds. It updates the readings with the base station according to the currently allocated error bound only.
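The choice of the next adjustment period can be sketched as below. The sketch assumes that the period is chosen so that the cost of one adjustment (one message sent plus one received) equals a fraction `alpha` of the energy the node is estimated to spend on data updates over that period. The function name `next_period`, the symbol `alpha`, and the sample values are assumptions made for illustration.

```python
import math

def next_period(update_count, send_cost, recv_cost, past_period, alpha):
    """Period over which one adjustment costs a fraction alpha of the
    node's estimated energy spent on data updates."""
    rate = update_count * send_cost / past_period  # estimated energy per time unit
    if rate == 0:
        return math.inf                            # no updates sent: overhead cannot be amortized
    overhead = send_cost + recv_cost               # one report sent, one allocation received
    return overhead / (alpha * rate)

# (update count, send cost, receive cost) for three hypothetical nodes
nodes = [(40, 2.0, 1.0), (10, 2.0, 1.0), (25, 3.0, 1.0)]
periods = [next_period(u, s, v, past_period=100, alpha=0.1) for u, s, v in nodes]
chosen = min(periods)  # the base station selects the lowest proposed period
print(periods, chosen)
```

The first node consumes energy fastest, so it proposes the shortest period, and that proposal is the one the base station adopts.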

V. PRECISION ALLOCATION IN MULTI-HOP NETWORKS

A. Modeling In-Network Aggregation

If the base station is beyond the radio coverage of some sensor nodes, a multi-hop routing infrastructure has to be set up to transport data from the sensor nodes to the base station. A common practice is to organize the sensor nodes into a tree structure rooted at the base station [21]. In-network aggregation is often used to reduce the network traffic of data collection in multi-hop networks [7], [9], [21], [27]. In this approach, each intermediate node aggregates the data received from its children before forwarding them upstream in order to cut down the volume of data sent over the upper-level links in the tree. As a result, the data sent by an intermediate node to its parent is a partial aggregate result of the sensor readings in the subtree rooted at the intermediate node.

Like that in a single-hop network, each sensor node $i$ is allocated an error bound $e_i$ to control its reporting of data updates to the parent. We shall call it node $i$'s local error bound. The operation of a leaf node in a multi-hop network is the same as that in a single-hop network: it updates the parent node with a new reading whenever the new reading differs from the last reported reading by more than the local error bound. For each intermediate node, the local error bound is applied to the partial aggregate results at the node rather than its local readings [9]. To do so, each intermediate node maintains the latest data value reported by each child. At each sampling period, the intermediate node re-aggregates these data values together with its new local reading. It sends the new partial aggregate result to the parent only when the result has changed beyond the local error bound since the last update to the parent. For SUM aggregation, the partial aggregate result is the sum of the sensor readings in the subtree rooted at the intermediate node. In this way, the aggregate result collected by the base station is guaranteed to be within an error bound $E$ from the exact aggregate result over the network [9].

In fact, it can be shown by induction that for each node $i$, the data value maintained by $i$'s parent node for $i$ differs from the exact aggregate result over the subtree rooted at node $i$ by at most $\sum_{j \in T_i} e_j$. The correctness of this claim is trivial for any leaf node. Suppose it is also true for any child of an intermediate node $i$. Let $C_i$ be the set of $i$'s children. Then, for any node $c \in C_i$, we have

$$\left| \sum_{j \in T_c} d_j - a_{i,c} \right| \le \sum_{j \in T_c} e_j$$

where $d_j$ is the sensor reading at node $j$, $a_{i,c}$ is the data value maintained by node $i$ for node $c$, and $T_c$ is the subtree rooted at node $c$. Denote $i$'s parent node by $q$. According to the operation of an intermediate node presented above, we also have

$$\left| d_i + \sum_{c \in C_i} a_{i,c} - a_{q,i} \right| \le e_i$$

where $a_{q,i}$ is the data value maintained by node $q$ for node $i$. Therefore,

$$\left| \sum_{j \in T_i} d_j - a_{q,i} \right| \le \left| d_i + \sum_{c \in C_i} a_{i,c} - a_{q,i} \right| + \sum_{c \in C_i} \left| \sum_{j \in T_c} d_j - a_{i,c} \right| \le e_i + \sum_{c \in C_i} \sum_{j \in T_c} e_j = \sum_{j \in T_i} e_j.$$

It follows from the above claim that the data value maintained by the base station for each of its children $c$ differs from the exact aggregate result over subtree $T_c$ by at most $\sum_{j \in T_c} e_j$. Therefore, the error of the aggregate result computed by the base station is bounded by the sum of the local error bounds at all sensor nodes $\sum_{i=1}^{n} e_i$.
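The operation of leaf and intermediate nodes described above can be simulated to observe the error guarantee empirically. The sketch below is illustrative and makes several assumptions that are not in the paper: the tree is encoded as a `parent` list in which every node has a larger index than its parent, `-1` stands for the base station, every node reports in the first sampling period, and the readings follow a made-up random walk.

```python
import random

def simulate(parent, bounds, readings_per_round):
    """SUM aggregation over a tree with local error bounds.

    parent[i]: parent of node i (-1 for the base station); parent[i] < i,
               so iterating in reverse order visits children first
    bounds[i]: local error bound of node i
    Yields (approximate sum at the base station, exact sum) per period.
    """
    n = len(parent)
    sent = [0.0] * n  # last partial aggregate node i reported to its parent
    first = True
    for readings in readings_per_round:
        partial = list(readings)
        for i in reversed(range(n)):
            # partial[i] is now the local reading plus the latest values
            # reported by the children of node i.
            if first or abs(partial[i] - sent[i]) > bounds[i]:
                sent[i] = partial[i]
            if parent[i] >= 0:
                partial[parent[i]] += sent[i]
        first = False
        approx = sum(sent[i] for i in range(n) if parent[i] < 0)
        yield approx, sum(readings)

rng = random.Random(1)
parent = [-1, 0, 0, 1, 1]
bounds = [0.5, 0.2, 0.3, 0.1, 0.4]
state = [20.0] * 5
rounds = []
for _ in range(200):
    state = [x + rng.uniform(-0.3, 0.3) for x in state]
    rounds.append(state)

worst = max(abs(approx - exact) for approx, exact in simulate(parent, bounds, rounds))
print(worst, sum(bounds))
```

In every sampling period the gap between the collected and the exact sum stays within the total of the local error bounds, which is what the induction argument predicts.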

We denote the rate of data updates sent by a sensor node $i$ to its parent as a function $r_i(e_i)$ of $i$'s local error bound $e_i$.⁴ Taking into consideration the energy consumed in sending and receiving data updates, the energy consumption rate of node $i$ is then given by

$$r_i(e_i) \cdot s_i + \sum_{c \in C_i} r_c(e_c) \cdot v_i$$

where $C_i$ is the set of $i$'s children, $s_i$ refers to the energy cost for node $i$ to send a data update to $i$'s parent, and $v_i$ refers to the energy cost for node $i$ to receive a data update from a child. Therefore, the expected network lifetime is given by

$$\min_{1 \le i \le n} \frac{p_i}{r_i(e_i) \cdot s_i + \sum_{c \in C_i} r_c(e_c) \cdot v_i}$$

where $p_i$ is the residual energy of node $i$. The objective of precision allocation is again to find a set of error bounds $e_1, e_2, \ldots, e_n$ that maximize the network lifetime subject to the constraint

$$\sum_{i=1}^{n} e_i \le E$$

where $E$ is the given error bound on data aggregation. Similar to single-hop networks, to extend network lifetime, it is important to balance the normalized energy consumption rates of the sensor nodes, i.e., to minimize

$$\max_{1 \le i \le n} \frac{r_i(e_i) \cdot s_i + \sum_{c \in C_i} r_c(e_c) \cdot v_i}{p_i}.$$

⁴ Recall that $i$'s local error bound $e_i$ is applied to the partial aggregate results at node $i$. Therefore, strictly speaking, the update rate from node $i$ to its parent also depends on how error bounds are allocated to $i$'s descendants. To simplify the analysis, we assume that the update rate relies on $e_i$ only.

B. Adaptive Precision Allocation

Adaptive precision allocation in a multi-hop network also works by adjusting the error bounds of the sensor nodes periodically. Again, we adopt the candidate-based method for precision allocation. Each sensor node $i$ selects a list of candidate local error bounds $c_{i,1} < c_{i,2} < \cdots < c_{i,m}$. The sensor node keeps track of the data update rates $r_{i,1}, r_{i,2}, \ldots, r_{i,m}$ (to its parent node) for these error bounds as it captures new readings and produces new partial aggregate results. At the adjustment, the local error bound of each node $i$ is to be set to one of its candidates $c_{i,k_i}$ (where $1 \le k_i \le m$). Following the analysis in Section V-A, the objective of precision allocation is to minimize

$$\max_{1 \le i \le n} \frac{r_{i,k_i} \cdot s_i + \sum_{c \in C_i} r_{c,k_c} \cdot v_i}{p_i}$$

subject to the constraint

$$\sum_{i=1}^{n} c_{i,k_i} \le E.$$

This is an NP-hard problem.
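For very small trees the objective above can be evaluated by exhaustive search, which also shows why this approach does not scale. The sketch below is illustrative only; the function names, the `children` adjacency layout, and the three-node instance are assumptions, and the search takes time exponential in the number of nodes.

```python
from itertools import product

def bottleneck_rate(children, choice, rate, send, recv, energy):
    """Highest normalized energy consumption rate under a candidate choice.

    children[i]: list of the children of node i
    choice[i]:   index of the candidate local error bound chosen for node i
    rate[i][k]:  estimated update rate of node i under its k-th candidate
    """
    worst = 0.0
    for i in range(len(children)):
        consumed = rate[i][choice[i]] * send[i]
        consumed += sum(rate[c][choice[c]] for c in children[i]) * recv[i]
        worst = max(worst, consumed / energy[i])
    return worst

def brute_force(children, cands, rate, send, recv, energy, E):
    """Try every feasible combination of candidates and keep the best one."""
    best = None
    for choice in product(*(range(len(c)) for c in cands)):
        if sum(cands[i][k] for i, k in enumerate(choice)) > E:
            continue  # violates the precision constraint
        value = bottleneck_rate(children, choice, rate, send, recv, energy)
        if best is None or value < best[0]:
            best = (value, choice)
    return best

# Node 0 is attached to the base station; nodes 1 and 2 are its children.
children = [[1, 2], [], []]
cands = [[0, 1], [0, 1], [0, 1]]
rate = [[1.0, 0.5], [0.8, 0.2], [0.6, 0.3]]
best = brute_force(children, cands, rate, send=[1.0] * 3, recv=[1.0] * 3,
                   energy=[10.0] * 3, E=2)
print(best)
```

In this instance the budget allows only two of the three nodes to take the larger candidate, and the search gives it to node 0 and to the child whose update rate drops the most.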

Theorem 3: The candidate precision allocation problem defined above is NP-hard.

Proof: We show the allocation problem is NP-hard by a polynomial reduction from the knapsack problem which is known to be NP-complete [36]. The knapsack problem is defined as follows: Given a knapsack of capacity , and

objects of sizes and profits , the

objective of the knapsack problem is to find the largest total profit among all subsets of the objects that fit in the knapsack.

Let be an instance of the knapsack problem. We first con- struct a tree topology including a base station and sensor nodes, where node 1 is a child of the base station and the remaining nodes are children of node 1 (see Fig. 2). An instance of the candidate precision allocation problem is then constructed on this tree by setting (i.e., each node has two candidate

local error bounds); , , ,

, , , , where

, , and are integer constants;

and . It is obvious that the construction time of instance is polynomial to the size of instance . Next, we show that, for any integral bound , there exists a subset of the


Fig. 2. Instance Q of candidate precision allocation problem.

objects fitting in the knapsack with a total profit at least for instance if and only if there exists a feasible candidate precision allocation with highest normalized energy consumption rate at most for instance .

Note that all sensor nodes in instance Q have the same energy cost to send or receive a data update. This implies that the energy consumption rates of the leaf nodes in Fig. 2 cannot exceed that of node 1. Also note that all sensor nodes have the same amount of residual energy. Therefore, regardless of precision allocation, the highest normalized energy consumption rate over all nodes is always that of node 1. It is then easy to establish a one-to-one correspondence between the object subsets fitting in the knapsack and the feasible candidate precision allocations in instance Q. In fact, for any object subset that fits in the knapsack, the corresponding candidate precision allocation is feasible, and in this case the normalized energy consumption rate of node 1 is determined by the total profit of the object subset. Conversely, for any feasible candidate precision allocation, the normalized energy consumption rate of node 1 is determined in the same way, and the corresponding object subset fits in the knapsack. Thus, there exists an object subset whose total profit reaches a given bound if and only if there exists a candidate precision allocation for instance Q whose highest normalized energy consumption rate is at most the corresponding value.

Hence, the theorem is proven.

In the following, we present a distributed algorithm to compute a suboptimal candidate precision allocation. We advocate distributed algorithms because having all nodes report their estimated update rates and residual energy levels to the base station would place a further energy burden on the nodes closer to the base station, which are usually the energy bottlenecks in multi-hop networks.

To facilitate presentation, we shall refer to the sum of the local error bounds at the sensor nodes in the subtree rooted at a node as its gross error bound. In addition to the candidate local error bounds, each sensor node also selects a list of thresholds for its gross error bound to assist the computation. For each threshold, the node computes a locally best precision allocation among itself and its children under the constraint that the gross error bound at the node does not exceed the threshold. The computation in our algorithm is carried out in a bottom-up manner from the leaf sensor nodes to the base station. On computing these allocations, each node sends a candidate report message to its parent. For each threshold, the message lists three values under the corresponding allocation: the gross error bound at the node (which, straightforwardly, does not exceed the threshold), the rate of data updates sent by the node to its parent, and the highest normalized energy consumption rate of the sensor nodes in the node's subtree. A parent node performs the local computation after receiving the candidate report messages from all children.

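If a node is a leaf sensor node, its thresholds are set the same as its candidate local error bounds. Thus, the gross error bounds it reports are simply its candidate local error bounds, and the reported update rates are simply its estimated update rates. Since a leaf node does not receive data updates from any other node, the reported highest normalized energy consumption rates are simply its own normalized energy consumption rates for sending these updates.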

If a node is an intermediate sensor node, it collects the candidate report messages from all of its children. Together with its locally estimated update rates, the node computes a locally best precision allocation for each threshold using Algorithm 2. Given a threshold, only the candidate local error bounds not exceeding the threshold can appear in a feasible precision allocation (step 3; otherwise, the gross error bound at the node would exceed the threshold). For each of these candidates, the best allocation of the remaining error bound among the node's children is computed using Algorithm 1 (step 4). The gross error bound of each child in this best allocation determines the corresponding data update rate from the child to the node and the highest normalized energy consumption rate of the nodes in the child's subtree. On computing the node's own energy consumption rate, the highest normalized energy consumption rate of the nodes in the node's subtree can then be computed (step 6). The candidate local error bound that leads to the minimum highest energy consumption rate is included in the locally best precision allocation (steps 7–13). The corresponding allocations among the node's children are also recorded. The worst-case time complexity of Algorithm 2 depends on the number of candidates m and the number of the node's children.⁵ The node records the computed best allocation for each threshold and sends a candidate report message including the resulting list to its parent node.

Algorithm 2 Locally Best Precision Allocation at a Node in a Multi-Hop Network

Input:

a threshold for the gross error bound of the node

the candidate local error bounds of the node and the estimated data update rates to the node's parent

the gross error bounds, data update rates and highest normalized energy consumption rates received from each child of the node

Output:

the computed best allocation, which includes the local error bound allocated to the node and the gross error bound allocated to each child of the node

the gross error bound at the node under this allocation

the data update rate from the node to its parent under this allocation

the highest normalized energy consumption rate of the nodes in the node's subtree under this allocation

⁵ Again, as will be shown in Section VI, a small m is sufficient to achieve near-optimal network lifetime.

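1: set the best rate found so far to infinity;

2: for each of the m candidate local error bounds of the node do

3: if the candidate does not exceed the threshold then

4: compute the optimal candidate precision allocation for the remaining error bound (the threshold minus the candidate) among the node's children using Algorithm 1 based on the gross error bounds and highest normalized energy consumption rates they reported;

5: for each child of the node, take the error bound of the child in the optimal allocation, the corresponding data update rate from the child to the node, and the corresponding highest normalized energy consumption rate of the nodes in the child's subtree;

6: compute the highest normalized energy consumption rate of the nodes in the node's subtree from the node's own rate and those of its children's subtrees;

7: if this rate is lower than the best rate found so far then

8: update the best rate found so far;

9: record the candidate as the local error bound in the best allocation;

10: record the resulting gross error bound at the node;

11: record the data update rate from the node to its parent;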

12: for each child of the node, record the child's gross error bound in the best allocation;

13: end if

14: end if

15: end for
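The steps of Algorithm 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `Report`, `allocate_among_children` and `locally_best_allocation` are hypothetical, the node's own normalized rate is assumed to be (sending cost × outgoing update rate + receiving cost × incoming update rate) divided by its residual energy, and `allocate_among_children` is a simple greedy stand-in for Algorithm 1, whose listing is not part of this section.

```python
import math
from dataclasses import dataclass


@dataclass
class Report:
    """One entry of a candidate report (hypothetical layout)."""
    gross: float   # gross error bound of the subtree
    update: float  # data update rate sent to the parent
    rate: float    # highest normalized energy consumption rate in the subtree


def allocate_among_children(budget, child_reports):
    """Greedy stand-in for Algorithm 1 (an assumption, not the paper's listing).

    child_reports[c] is a list of Report sorted by increasing gross bound,
    with rates assumed non-increasing. Returns one chosen index per child,
    or None if even the smallest entries exceed the budget.
    """
    choice = [0] * len(child_reports)
    used = sum(reps[0].gross for reps in child_reports)
    if used > budget:
        return None
    while child_reports:
        # Relax the child whose subtree currently has the highest rate.
        worst = max(range(len(choice)),
                    key=lambda c: child_reports[c][choice[c]].rate)
        k = choice[worst]
        if k + 1 >= len(child_reports[worst]):
            break
        extra = child_reports[worst][k + 1].gross - child_reports[worst][k].gross
        if used + extra > budget:
            break
        choice[worst] = k + 1
        used += extra
    return choice


def locally_best_allocation(threshold, candidates, update_rates, child_reports,
                            send_cost, recv_cost, residual_energy):
    """Return the best allocation for one threshold, or None if infeasible."""
    best = None
    best_rate = math.inf                                    # step 1
    for e, u in zip(candidates, update_rates):              # step 2
        if e > threshold:                                   # step 3
            continue
        choice = allocate_among_children(threshold - e, child_reports)  # step 4
        if choice is None:
            continue
        chosen = [reps[k] for reps, k in zip(child_reports, choice)]    # step 5
        incoming = sum(r.update for r in chosen)
        own = (send_cost * u + recv_cost * incoming) / residual_energy
        rate = max([own] + [r.rate for r in chosen])        # step 6
        if rate < best_rate:                                # steps 7-13
            best_rate = rate
            gross = e + sum(r.gross for r in chosen)
            best = {"local": e,
                    "children": [r.gross for r in chosen],
                    "report": Report(gross, u, rate)}
    return best
```

For a leaf node, `child_reports` is empty, so the function reduces to picking the largest candidate that does not exceed the threshold whenever update rates decrease with the error bound, which matches the leaf case described earlier.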

The base station, on receiving the candidate report messages from all of its children, computes a locally best precision allocation among the children using Algorithm 1. The computed error bounds are then sent to the sensor nodes for their adjustments in a top-down manner. The base station sends a precision allocation message to its children including the gross error bounds allocated to them. An intermediate sensor node, on receiving its allocated gross error bound, retrieves the stored corresponding best allocation, which contains a local error bound and a set of gross error bounds for its children. The intermediate node applies the local error bound to its partial aggregate results and sends the gross error bounds to its children in a precision allocation message. A leaf sensor node, on receiving its allocated gross error bound, simply takes it as the local error bound. In case the error bounds in the precision allocation computed by the base station do not add up exactly to the total error bound, the leftover error bound is allocated to the node with the highest normalized energy consumption rate.⁶
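The top-down phase described above can be illustrated with a short recursive sketch. It assumes a hypothetical in-memory representation: `children` maps a node to its child list, and `stored[node][gross_bound]` holds the allocation that the node recorded during the bottom-up phase as a dictionary with a `"local"` bound and a list of `"children"` gross bounds. Message passing and the leftover error bound are not modeled.

```python
def disseminate(node, gross_bound, stored, children, local_bounds):
    """Apply an allocated gross bound at `node` and pass bounds down the tree."""
    kids = children.get(node, [])
    if not kids:
        # A leaf simply takes its allocated gross bound as its local bound.
        local_bounds[node] = gross_bound
        return
    # An intermediate node looks up the allocation it stored bottom-up.
    alloc = stored[node][gross_bound]
    local_bounds[node] = alloc["local"]
    for child, child_bound in zip(kids, alloc["children"]):
        disseminate(child, child_bound, stored, children, local_bounds)


# Usage: node 1 has leaf children 2 and 3 and is allocated a gross bound of 5.
children = {1: [2, 3]}
stored = {1: {5: {"local": 2, "children": [2, 1]}}}
local_bounds = {}
disseminate(1, 5, stored, children, local_bounds)
```

In this sketch the lookup key is the gross error bound itself, which presumes that the parent only ever allocates one of the gross bounds that the child reported.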

Similar to adaptive precision allocation in a single-hop network, the candidate local error bounds of each sensor node and the thresholds for its gross error bound are exponentially spaced around its current local and gross error bounds, respectively.

Let and be the current local and gross error bounds of sensor node , respectively. Given the number of candidates

, the candidate local error bounds are selected as

and the thresholds for ’s gross error bound are selected as
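As an illustration of exponential spacing around a current bound, the sketch below generates m values separated by a constant factor. The factor of 2 and the centring rule are assumptions made for this example only, and `exponential_candidates` is a hypothetical helper name. The same helper can be applied to the current local error bound and to the current gross error bound.

```python
def exponential_candidates(current, m, base=2.0):
    """Return m values spaced by powers of `base` around `current`.

    Assumption: for odd m the current bound is the middle candidate;
    for even m there is one more candidate below it than above it.
    """
    half = m // 2
    return [current * base ** (k - half) for k in range(m)]


# Usage: three candidates around a current bound of 1.0 -> 0.5, 1.0, 2.0.
candidates = exponential_candidates(1.0, 3)
```

With m = 1 the list contains only the current bound, so no change of allocation is possible.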

⁶ To do so, each recorded allocation A includes the child node that roots the subtree containing the node with the highest normalized energy consumption rate R. The allocation of the leftover error bound can then be routed to the intended node along with the precision allocation message.

As in a single-hop network, we dynamically adapt the adjustment period to limit the energy overhead of adjustments at the most energy-consuming node to a portion of its energy budget. Note that at an adjustment in a multi-hop network, a sensor node receives a candidate report message from each child and sends one to its parent. It also receives a precision allocation message from its parent and sends one to its children.

Thus, the energy cost at a node due to an adjustment consists of the cost of sending one message to its parent, the cost of sending one message to its children, and the receiving cost⁷ of one message from its parent and one from each child. At an adjustment, the energy consumption rate of the node is estimated by dividing the energy spent on the data updates sent to its parent and received from its children in the past adjustment period by the duration of that period. The node then suggests a duration for the next adjustment period such that its adjustment cost stays within the given portion of the energy it is expected to consume over the period.

Each leaf node includes the suggested period in the candidate report message sent to its parent. Each intermediate node, on receiving the candidate report messages from its children, chooses the shortest period among that suggested locally and those received from its children. This shortest period is then included in the candidate report message sent by the intermediate node to its parent. Among all suggested periods received, the base station selects the shortest one as the next adjustment period so as to cap the adjustment overhead at a portion of the energy consumed at the most consuming node. The selected period is then included in the precision allocation messages sent to all sensor nodes.
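The period selection can be sketched as below. The function names are hypothetical, and the rule in `suggested_period` (choose the period so that one adjustment costs exactly the given portion of the energy consumed over that period) is an assumption about how the cap is applied, as is the reading that a node sends a single precision allocation message to all of its children.

```python
def adjustment_cost(send_parent, send_children, recv, num_children):
    """Energy spent by a node on one adjustment.

    One report sent up, one allocation message sent down, one report
    received per child, and one allocation message received from the parent.
    """
    return send_parent + send_children + (num_children + 1) * recv


def consumption_rate(send_parent, recv, updates_sent, updates_received, period):
    """Energy per time unit spent on data updates in the past period."""
    return (send_parent * updates_sent + recv * updates_received) / period


def suggested_period(cost, rate, portion):
    """Period T such that cost == portion * rate * T (assumed rule)."""
    return cost / (portion * rate)


def next_period(suggestions):
    """The shortest suggestion wins, as it propagates up to the base station."""
    return min(suggestions)


# Usage: a node with 3 children, 10 updates sent and 20 received in 100 time units.
cost = adjustment_cost(send_parent=2.0, send_children=1.0, recv=0.5, num_children=3)
rate = consumption_rate(2.0, 0.5, updates_sent=10, updates_received=20, period=100)
period = suggested_period(cost, rate, portion=0.01)
```

Taking the minimum at every level of the tree means that only one value per candidate report message is needed, regardless of the subtree size.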

VI. PERFORMANCE EVALUATION

A. Experimental Setup

We developed a simulator based on ns-2 [37] and NRL’s sensor network extension [38] to evaluate the proposed adaptive precision allocation scheme. We used the following energy models [10]. The energy consumed by a sensor node to send a message is proportional to the message size and is the sum of a distance-independent term (with a per-bit coefficient in nJ/b) and a distance-dependent term (with a per-bit coefficient in pJ/b multiplied by the transmission distance raised to a fixed exponent). The energy consumed by a sensor node to receive a data update is proportional to the message size, with a per-bit coefficient in nJ/b that is independent of the transmission distance. In our experiments, the default message size was set at 48 bytes [16]. The initial energy budget at each sensor node was set at 0.5 J.
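The structure of this energy model can be sketched as follows. The default coefficient values (50 nJ/b, 100 pJ/b/m², exponent 2) are placeholders in the style of the first-order radio model of [10] and are assumptions of this example; the function and parameter names are hypothetical.

```python
def send_energy(bits, distance_m, elec=50e-9, amp=100e-12, exponent=2):
    """Energy in joules to send `bits` over `distance_m` meters.

    elec: distance-independent cost per bit (J/b), assumed value.
    amp:  coefficient of the distance-dependent term, assumed value.
    """
    return bits * (elec + amp * distance_m ** exponent)


def receive_energy(bits, elec=50e-9):
    """Energy in joules to receive `bits`, independent of distance."""
    return bits * elec


# Usage: a 48-byte message sent over the 40 m maximum radio range.
message_bits = 48 * 8
tx = send_energy(message_bits, 40.0)
rx = receive_energy(message_bits)
sends_per_budget = int(0.5 / tx)  # messages a 0.5 J budget could pay for
```

Under these assumed coefficients, sending at the maximum range costs roughly four times as much as receiving, which is why transmission distance matters for balancing energy consumption.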

⁷ The receiving cost is normally independent of the sender [10], [11], [33].

Fig. 3. Single-hop network layout.

Fig. 4. Sample multi-hop network layout.

We simulated a single-hop network of 10 sensor nodes and multi-hop networks of 100 sensor nodes. The layout of the single-hop network is shown in Fig. 3. The multi-hop network topologies were generated by randomly placing the base station and 100 sensor nodes in a 200 m × 200 m area. To simulate the spatial irregularity in sensor network deployment [39], we divided the area into a 4 × 4 grid. The probabilities of deploying sensor nodes in the grid cells were assumed to follow a Zipf-like distribution. That is, the 16 grid cells were randomly ordered into a list, and the probability of deploying sensor nodes in a cell decreased with the cell's position on the list according to a power law governed by the Zipf parameter, scaled by a normalization factor [40]. The default value of the Zipf parameter was set at 1. The sensor nodes were assumed to have a maximum radio transmission range of 40 m.

If two sensor nodes were within the radio range of each other, they were considered neighbors in the network connectivity graph. The breadth first search tree rooted at the base station was then computed from the connectivity graph and used as the routing infrastructure for data collection [21], [27]. We have experimented with many randomly generated network topologies and observed similar performance trends. Due to space limitations, we shall only report the results of a sample network topology in this paper. The layout of the topology is shown in Fig. 4, where the solid circle represents the base station, the remaining circles represent the sensor nodes and the lines represent the links in the routing tree.
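The topology generation described above can be sketched in a few functions. This is an illustrative sketch rather than the simulator used in the experiments: the Zipf-like probability of the k-th cell is assumed to be proportional to 1/k^θ, node positions are drawn uniformly within the chosen cell, and all function names are hypothetical.

```python
import math
import random
from collections import deque


def zipf_probabilities(num_cells, theta=1.0):
    """Probability of the k-th cell, assumed proportional to 1 / k**theta."""
    weights = [1.0 / (k ** theta) for k in range(1, num_cells + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def place_nodes(num_nodes, side=200.0, grid=4, theta=1.0, rng=None):
    """Place nodes in a side x side area split into grid x grid cells."""
    rng = rng or random.Random(0)
    cells = [(r, c) for r in range(grid) for c in range(grid)]
    rng.shuffle(cells)  # random ordering of the cells into a list
    probs = zipf_probabilities(len(cells), theta)
    size = side / grid
    nodes = []
    for _ in range(num_nodes):
        r, c = rng.choices(cells, weights=probs)[0]
        nodes.append((c * size + rng.random() * size,
                      r * size + rng.random() * size))
    return nodes


def bfs_tree(positions, root=0, radio_range=40.0):
    """Return {node: parent} for all nodes reachable from `root`.

    Two nodes are neighbors when they are within `radio_range` of each other.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in range(len(positions)):
            if v not in parent and math.dist(positions[u], positions[v]) <= radio_range:
                parent[v] = u
                queue.append(v)
    return parent


# Usage: index 0 plays the role of the base station.
tree = bfs_tree([(0, 0), (30, 0), (60, 0), (200, 200)])
```

Nodes that are out of range of every reachable node, such as the last one in the usage example, do not appear in the returned tree; a generated topology would have to be re-drawn or repaired in that case.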

We made use of the data provided by the LEM project [41] at the University of Washington to simulate the physical phenomena in the immediate surroundings of sensor nodes.

Weather data were collected in the LEM project from several stations in the Washington and Oregon states. We used the temperature (TEMP) and solar radiation (SOLAR) traces logged by the station at the University of Washington from August 2004 to August 2005 in our experiments. Each trace consisted of more than 500 000 readings captured at a sampling period of 1 minute. Fig. 5 shows some representative segments of these traces. The TEMP and SOLAR data both fluctuate over time: their readings are higher in the daytime and lower at night. In particular, the SOLAR readings remain unchanged regularly because the solar radiation is 0 at night. For each of the TEMP and SOLAR traces, we extracted 100 different subtraces starting at randomly selected timepoints and associated them with the sensor nodes in our simulated network. The sampling period between two successive readings in the trace was assumed to be 1 time unit.

Fig. 5. Sample data traces.

Fig. 6. Network lifetime versus number of candidate error bounds in adaptive-PA (TEMP trace, E = 0.6 °F).

The base station computes the AVERAGE aggregation of the readings collected from all sensor nodes with a designated error bound E. As discussed in Section III, in this case, the total error bound allocated to the sensor nodes should be capped by E multiplied by the number of sensor nodes. The experiments started with the error bound uniformly allocated to the sensor nodes, i.e., each node was allocated an error bound of E. The following precision allocation schemes were simulated for performance comparison. We measured the energy consumption of each sensor node and the network lifetime in the experiments.
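The cap on the total error bound for AVERAGE can be checked with a small sketch. It relies only on the fact that if each node's reported value deviates from its true reading by at most its local bound, the average deviates by at most the mean of the local bounds. The function names are hypothetical.

```python
def max_average_error(local_bounds):
    """Worst-case error of AVERAGE when node i may be off by local_bounds[i]."""
    return sum(local_bounds) / len(local_bounds)


def is_feasible(local_bounds, designated_bound):
    """True if the bounds sum to at most designated_bound * number of nodes."""
    # A tiny tolerance absorbs floating-point rounding in the sum.
    return sum(local_bounds) <= designated_bound * len(local_bounds) + 1e-12


# Usage: a uniform allocation and a skewed one with the same total.
uniform = [0.6] * 10
skewed = [0.2] * 5 + [1.0] * 5
```

Both allocations in the usage example meet a designated bound of 0.6, which is what gives an adaptive scheme room to shift precision between nodes without violating the guarantee.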

• Our Adaptive Precision Allocation (Adaptive-PA):

This is the adaptive precision allocation scheme proposed in Sections IV-C and V-B. By default, each sensor node selected m = 7 candidate error bounds, and the energy cost due to adjustments was capped at a fixed percentage of the energy consumed at the most consuming node. The performance impacts of these two parameters are investigated in Section VI-B.

We assumed that each data value in the message (e.g., sensor reading and candidate error bound) took up 2 bytes.

In addition, a timestamp of 2 bytes was included in all messages for ordering and synchronization purposes. The largest messages encountered in our experiments were the candidate report messages in multi-hop networks.

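Recall that the candidate report message includes a list of m triples (gross error bound, data update rate, and highest normalized energy consumption rate) and a suggested next adjustment period. Together with the timestamp, it requires a total of 3 × 7 × 2 + 2 + 2 = 46 bytes when m = 7, which fits into the default message size.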

• Uniform Precision Allocation (Uniform-PA): The error bound is evenly partitioned among all sensor nodes [7], i.e., the precision allocation remains the initial one. This is a simple and static scheme which does not differentiate the sensor nodes by the changing pattern of sensor readings, the residual energy, and the communication cost with the base station.

Fig. 7. Network lifetime versus the adjustment overhead portion (TEMP trace). (a) Single-hop network. (b) Multi-hop network.

• Burden-based Precision Allocation (Burden-PA):

Olston et al. [6] presented a burden-based precision allocation scheme for aggregate queries over distributed data streams. Their objective was to minimize the total communication cost between data sources and the data sink. In our experiments, the energy consumed by each sensor node to send a data update to its parent was taken as a measure of its communication cost.⁸ Burden-PA works by periodically reducing the error bound of each sensor node by a shrink percentage and redistributing the leftover portion among the sensor nodes. As suggested by [6], the shrink percentage was set at 5%. We simulated Burden-PA over a wide range of different adjustment periods (from 144 to 2880 time units, which correspond to 0.1 to 2 days of data traces) and found that no single period provided the best performance for all experimental settings. Thus, to favor Burden-PA, for each experimental setting, we selected the best result obtained over all adjustment periods tested and present it in this paper.

• Potential-Gain-based Precision Allocation (PGain-PA):

To reduce the total number of messages in the network, Deligiannakis et al. [9] presented a precision allocation scheme for sensor data aggregation based on online estimation of potential gains. Similar to Burden-PA, PGain-PA periodically reduces the error bound of each sensor node by a shrink percentage and redistributes the leftover portion among the sensor nodes. As suggested by [9], the shrink percentage was set at 40%. Again, we simulated PGain-PA over a wide range of adjustment periods (from 144 to 2880 time units) and selected the best result obtained to present in this paper.
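The size of the Adaptive-PA candidate report message mentioned above can be checked with a few lines. The sketch assumes the message carries three 2-byte values per candidate, one 2-byte suggested adjustment period and one 2-byte timestamp; the function names are hypothetical.

```python
def candidate_report_bytes(m, value_bytes=2, timestamp_bytes=2):
    """Bytes needed for m (gross bound, update rate, rate) triples,
    one suggested adjustment period and one timestamp."""
    return 3 * m * value_bytes + value_bytes + timestamp_bytes


def largest_m(message_bytes=48):
    """Largest number of candidates whose report fits in one message."""
    m = 0
    while candidate_report_bytes(m + 1) <= message_bytes:
        m += 1
    return m
```

With the default 48-byte message, `candidate_report_bytes(7)` gives 46 bytes and `largest_m()` gives 7, since an eighth candidate would need 52 bytes.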

B. Effect of m and the Adjustment Overhead Portion in Adaptive-PA

First, we investigate the performance impact of the number of candidate error bounds in the proposed Adaptive-PA scheme.

8We have also simulated Burden-PA with the communication cost of each node set to the total energy consumed to send a data update to the base station, which includes the sending and receiving costs at intermediate nodes for relaying purposes if any. This strategy was observed to perform worse than the one above in the main text.

Fig. 6 shows the network lifetime for different m values when the error bound was set at 0.6 for the TEMP trace.⁹ Note that when m = 1, the current error bound is the only candidate. Thus, the optimal candidate precision allocation computed by Algorithm 1 is always the same as the current allocation. Since the experiments started with uniformly allocated error bounds, Adaptive-PA degenerates to Uniform-PA at m = 1. The flexibility of precision allocation increases with m. As seen from Fig. 6, an m value of 3 improves network lifetime significantly compared to m = 1 (by factors of 3.4 and 1.7 in single-hop and multi-hop networks respectively). The network lifetime is generally insensitive to m when m exceeds 5. Since the largest allowable m for a candidate report message to fit into the default message size is 7, m was set at 7 in the remaining experiments.

Recall that Adaptive-PA limits the energy overhead of adjustments at the most energy-consuming node to a portion of its energy budget. The setting of this portion reflects a tradeoff between overhead and adaptivity, both of which increase with it. Fig. 7 shows the network lifetime for different settings of the portion. As expected, the curve of network lifetime is convex for most system configurations tested. In general, the performance of Adaptive-PA is not very sensitive to the portion in the range from 0.1% to 0.5%. Therefore, we shall report only the experimental results for the default setting in the remainder of this paper.

C. Performance Comparison in Single-Hop Networks

Fig. 8 shows the network lifetime as a function of the designated error bound on data aggregation for different precision allocation schemes in the single-hop network of Fig. 3. Note that an error bound E = 0 implies exact data aggregation (the leftmost points in Fig. 8). With exact data aggregation, all sensor nodes must be allocated error bounds of 0. Therefore, in this case, the four precision allocation schemes have similar performance.

As seen from Fig. 8, the network lifetime increases with error bound. When E > 0, the proposed Adaptive-PA scheme significantly outperforms the other schemes for both traces tested.

Even if the readings at all sensor nodes follow similar changing patterns, it is not desirable to allocate the same error bound to all nodes because they are geographically distributed. In a single-hop network, a node farther away from the base station consumes more energy in sending a data update than a node closer to the base station. Among the four precision allocation schemes examined, Uniform-PA and PGain-PA do not take this heterogeneity into consideration. Thus, as shown in Fig. 8, Adaptive-PA improves network lifetime by factors up to 3.4 and 2.6 compared to Uniform-PA and PGain-PA respectively. To show the importance of balancing energy consumption in extending network lifetime, we plot in Fig. 9 the total energy consumed by each sensor node by the time when the first node ran out of energy (i.e., the network lifetime elapsed). Under Adaptive-PA, most nodes were close to exhausting their energy when the network lifetime elapsed. However, under Uniform-PA and PGain-PA, the nodes close to the base station (i.e., nodes 3 and 8 in Fig. 3) consumed as little as 5%–20% of the energy budget.

⁹ Only the experimental results of the TEMP trace are reported in this section to show the effect of m and the adjustment overhead portion. The results of the SOLAR trace have similar trends.

Fig. 8. Network lifetime versus designated error bound (single-hop network). (a) TEMP trace. (b) SOLAR trace.

Fig. 9. Energy consumed at different sensor nodes (single-hop network). (a) TEMP trace, E = 0.6 °F. (b) SOLAR trace, E = 60 W/m².

Burden-PA considers the heterogeneity in communication cost due to transmission distance. However, the objective of Burden-PA is to minimize the total communication cost. Fig. 8 shows that our Adaptive-PA scheme extends network lifetime by a factor up to 1.9 over Burden-PA. This implies minimizing network-wide total energy consumption does not necessarily balance the energy consumption of the sensor nodes. As seen from Fig. 9(b), under Burden-PA, nodes 3 and 8 consumed as low as 12% and 36% of the energy respectively when the network lifetime elapsed.

D. Performance Comparison in Multi-Hop Networks

We have implemented in-network aggregation in the experiments for multi-hop networks. Fig. 10 shows the results for the multi-hop network of Fig. 4. The performance trends remain similar to those in the single-hop network. The network lifetime increases rapidly with error bound. For example, under Adaptive-PA, increasing E from 0 (exact data aggregation) to 0.2 and 20 prolongs the network lifetime by factors of 2.6 and 3.8 for the TEMP and SOLAR traces, respectively. This demonstrates the effectiveness of approximate data aggregation in improving energy efficiency.

Comparing the performance of different precision allocation schemes, Adaptive-PA significantly outperforms the other schemes for both traces tested. As seen from Fig. 10, the improvements over Uniform-PA, Burden-PA and PGain-PA are up to factors 3.7, 1.6 and 1.5 respectively. Comparing Figs. 8 and 10, it is also observed that the relative performance of PGain-PA to Burden-PA improves in the multi-hop network. This is because PGain-PA takes into account the topological relations among the sensor nodes as well as in-network aggregation.

In contrast, Burden-PA treats the sensor nodes separately and does not model in-network aggregation. However, PGain-PA aims at minimizing the total number of messages transmitted in the network without considering the heterogeneity in communication cost. Neither does it attempt to balance the energy consumption at different nodes. Thus, its performance is still much worse than our Adaptive-PA.

Fig. 11 shows the distribution of energy consumed at all sensor nodes when the network lifetime elapsed. Each point on a curve gives the percentage of nodes that consume more than a given amount of energy (in joules) each. It is clear that by balancing the energy consumption at different nodes, the proposed Adaptive-PA scheme makes much better utilization of the energy budgets than the other schemes. For the TEMP trace (see Fig. 11(a)), over 30% of the nodes consume more than 80% of the energy budget (i.e., 0.4 J) in Adaptive-PA, while only 2%, 4%, and 3% of the nodes do so in Uniform-PA, Burden-PA and PGain-PA respectively. This helps Adaptive-PA to improve network lifetime over the other three schemes. Similar trends are observed for the results of the SOLAR trace (see Fig. 11(b)).

Fig. 10. Network lifetime versus designated error bound (multi-hop network). (a) TEMP trace. (b) SOLAR trace.

Fig. 11. Distribution of energy consumed at different sensor nodes (multi-hop network). (a) TEMP trace, E = 0.6 °F. (b) SOLAR trace, E = 60 W/m².
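The kind of distribution plotted in Fig. 11 can be computed from per-node energy figures as sketched below. The function name and the sample values are hypothetical and do not reproduce the experimental data.

```python
def fraction_above(consumed, level):
    """Fraction of nodes whose consumed energy (J) exceeds `level` (J)."""
    return sum(1 for e in consumed if e > level) / len(consumed)


# Usage with made-up energy figures for five nodes and a 0.5 J budget.
consumed = [0.5, 0.45, 0.41, 0.3, 0.1]
share = fraction_above(consumed, 0.8 * 0.5)  # nodes above 80% of the budget
```

Evaluating `fraction_above` over a range of levels yields one curve per scheme; a curve that stays high for large levels indicates that most nodes have used most of their energy by the time the first node dies.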

VII. CONCLUSION

We have investigated adaptive precision allocation to extend the lifetime of data aggregation with precision guarantees in wireless sensor networks. The purpose of precision allocation is to differentiate the quality of data collected from different sensor nodes, thereby balancing their energy consumption. Our proposed schemes effectively exploit the tradeoff between data quality and energy consumption. These schemes dynamically adjust the error bounds allocated to the sensor nodes. The basic scheme for single-hop networks is based on the analysis of an optimal precision allocation in terms of network lifetime. The extended scheme for multi-hop networks takes into consideration the topological relations among the sensor nodes as well as the effect of in-network aggregation. Experimental results using real data traces show that: 1) tolerating just a small degree of inaccuracy in data collection prolongs network lifetime substantially; 2) due to the geographically distributed nature of sensor networks, uniform precision allocation does not perform well even if the readings at all sensor nodes follow similar changing patterns; 3) to extend network lifetime, it is more important to balance the energy consumption of the sensor nodes than to minimize network-wide total energy consumption; and 4) the proposed adaptive precision allocation schemes significantly outperform existing methods over a wide range of system configurations.

REFERENCES

[1] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “A survey on sensor networks,” IEEE Commun. Mag., vol. 40, no. 8, pp. 102–114, Aug. 2002.

[2] R. Szewczyk, E. Osterweil, J. Polastre, M. Hamilton, A. Mainwaring, and D. Estrin, “Habitat monitoring with sensor networks,” Commun. ACM, vol. 47, no. 6, pp. 34–40, Jun. 2004.

[3] J. Gehrke and S. Madden, “Query processing in sensor networks,” IEEE Pervasive Comput., vol. 3, no. 1, pp. 45–55, Jan.–Mar. 2004.

[4] G. J. Pottie and W. J. Kaiser, “Wireless integrated network sensors,” Commun. ACM, vol. 43, no. 5, pp. 51–58, May 2000.

[5] V. Shnayder, M. Hempstead, B. Chen, G. W. Allen, and M. Welsh, “Simulating the power consumption of large-scale sensor network applications,” in Proc. ACM SenSys’04, Nov. 2004, pp. 239–249.

[6] C. Olston, J. Jiang, and J. Widom, “Adaptive filters for continuous queries over distributed data streams,” in Proc. ACM SIGMOD’03, Jun. 2003, pp. 563–574.

[7] M. A. Sharaf, J. Beaver, A. Labrinidis, and P. K. Chrysanthis, “TiNA: A scheme for temporal coherency-aware in-network aggregation,” in Proc. ACM MobiDE’03, Sep. 2003, pp. 69–76.

[8] Q. Han, S. Mehrotra, and N. Venkatasubramanian, “Energy efficient data collection in distributed sensor environments,” in Proc. IEEE ICDCS’04, Mar. 2004, pp. 590–597.

[9] A. Deligiannakis, Y. Kotidis, and N. Roussopoulos, “Hierarchical in-network data aggregation with quality guarantees,” in Proc. EDBT’04, Mar. 2004, pp. 658–675.

[10] W. R. Heinzelman, A. Chandrakasan, and H. Balakrishnan, “Energy-efficient communication protocol for wireless microsensor networks,” in Proc. 33rd Hawaii Int. Conf. on System Sciences, Jan. 2000, pp. 3005–3014.

[11] J. Pan, Y. T. Hou, L. Cai, Y. Shi, and S. X. Shen, “Topology control for wireless sensor networks,” in Proc. ACM MobiCom’03, Sep. 2003, pp. 286–299.

[12] C. Intanagonwiwat, R. Govindan, and D. Estrin, “Directed diffusion for wireless sensor networking,” IEEE/ACM Trans. Netw., vol. 11, no. 1, pp. 2–16, Feb. 2003.

[13] F. Ye, H. Luo, J. Cheng, S. Lu, and L. Zhang, “A two-tier data dissemination model for large-scale wireless sensor networks,” in Proc. ACM MobiCom’02, Sep. 2002, pp. 148–159.

[14] A. Woo, T. Tong, and D. Culler, “Taming the underlying challenges of reliable multihop routing in sensor networks,” in Proc. ACM SenSys’03, Nov. 2003, pp. 14–27.

[15] W. Ye, J. Heidemann, and D. Estrin, “Medium access control with coordinated, adaptive sleeping for wireless sensor networks,” IEEE/ACM Trans. Netw., vol. 12, no. 3, pp. 493–506, Jun. 2004.

[16] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong, “The design of an acquisitional query processor for sensor networks,” in Proc. ACM SIGMOD’03, Jun. 2003, pp. 491–502.

[17] Y. Yao and J. Gehrke, “Query processing for sensor networks,” in Proc. CIDR’03, Jan. 2003.

[18] X. Li, Y. J. Kim, R. Govindan, and W. Hong, “Multi-dimensional range queries in sensor networks,” in Proc. ACM SenSys’03, Nov. 2003, pp. 63–75.

[19] A. Silberstein, R. Braynard, and J. Yang, “Constraint chaining: On energy-efficient continuous monitoring in sensor networks,” in Proc. ACM SIGMOD’06, Jun. 2006, pp. 157–168.

[20] J. Xu, X. Tang, and W.-C. Lee, “A new storage scheme for approximate location queries in object tracking sensor networks,” IEEE Trans. Parallel Distrib. Syst., vol. 19, no. 2, pp. 262–275, Feb. 2008.

[21] S. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong, “TAG: A tiny aggregation service for ad-hoc sensor networks,” in Proc. USENIX OSDI’02, Dec. 2002, pp. 131–146.

[22] J. Considine, F. Li, G. Kollios, and J. Byers, “Approximate aggregation techniques for sensor databases,” in Proc. IEEE ICDE’04, Mar. 2004, pp. 449–460.

[23] S. Nath, P. B. Gibbons, S. Seshan, and Z. R. Anderson, “Synopsis diffusion for robust aggregation in sensor networks,” in Proc. ACM SenSys’04, Nov. 2004, pp. 250–262.

[24] A. Deshpande, C. Guestrin, S. Madden, J. M. Hellerstein, and W. Hong, “Model-driven data acquisition in sensor networks,” in Proc. VLDB’04, Sep. 2004, pp. 588–599.

[25] D. Chu, A. Deshpande, J. M. Hellerstein, and W. Hong, “Approximate data collection in sensor networks using probabilistic models,” in Proc. IEEE ICDE’06, Apr. 2006.

[26] M. B. Greenwald and S. Khanna, “Power-conserving computation of order-statistics over sensor networks,” in Proc. ACM PODS’04, Jun. 2004, pp. 275–285.

[27] N. Shrivastava, C. Buragohain, D. Agrawal, and S. Suri, “Medians and beyond: New aggregation techniques for sensor networks,” in Proc. ACM SenSys’04, Nov. 2004, pp. 188–200.

[28] Y. Kotidis, “Snapshot queries: Towards data-centric sensor networks,” in Proc. IEEE ICDE’05, Apr. 2005, pp. 131–142.

[29] G. Hartl and B. Li, “Infer: A bayesian inference approach towards energy efficient data collection in dense sensor networks,” in Proc. IEEE ICDCS’05, Jun. 2005, pp. 371–380.

[30] A. Savvides, C.-C. Han, and M. B. Srivastava, “Dynamic fine-grained localization in ad-hoc networks of sensors,” in Proc. ACM MobiCom’01, Jul. 2001, pp. 166–179.

[31] D. Niculescu and B. Nath, “Ad hoc positioning (APS) using AoA,” in Proc. IEEE INFOCOM’03, Apr. 2003, pp. 1734–1743.

[32] O. Younis and S. Fahmy, “Distributed clustering in ad-hoc sensor networks: A hybrid, energy-efficient approach,” in Proc. IEEE INFOCOM’04, Mar. 2004, pp. 629–640.

[33] Y. T. Hou, Y. Shi, and H. D. Sherali, “Rate allocation in wireless sensor networks with network lifetime requirement,” in Proc. ACM MobiHoc’04, May 2004, pp. 67–77.

[34] C. Buragohain, D. Agrawal, and S. Suri, “Power aware routing for sensor databases,” in Proc. IEEE INFOCOM’05, Mar. 2005, pp. 1747–1757.

[35] I. Kang and R. Poovendran, “Maximizing network lifetime of broadcasting over wireless stationary ad hoc networks,” Mobile Netw. Appl., vol. 10, no. 6, pp. 879–896, Dec. 2005.

[36] M. Garey and D. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. San Francisco, CA: Freeman, 1979.

[37] The Network Simulator - ns-2 [Online]. Available: http://www.isi.edu/nsnam/ns/

[38] NRL’s Sensor Network Extension to ns-2 [Online]. Available: http://www.nrlsensorsim.pf.itd.nrl.navy.mil/

[39] D. Ganesan, S. Ratnasamy, H. Wang, and D. Estrin, “Coping with irregular spatial-temporal sampling in sensor networks,” ACM SIGCOMM Computer Communications Review, vol. 34, no. 1, pp. 125–130, Jan. 2004.

[40] G. K. Zipf, Human Behavior and the Principles of Least Effort. Reading, MA: Addison-Wesley, 1949.

[41] Live From Earth and Mars (LEM) Project [Online]. Available: http://www.k12.atmos.washington.edu/k12/grayskies/

Xueyan Tang (M’04) received the B.Eng. degree in computer science and engineering from Shanghai Jiao Tong University, Shanghai, China, in 1998 and the Ph.D. degree in computer science from the Hong Kong University of Science and Technology in 2003.

He is currently an assistant professor in the School of Computer Engineering at Nanyang Technological University, Singapore. His research interests include mobile and pervasive computing, wireless sensor networks, Web and Internet, and distributed systems. He has published more than 30 technical papers in the above areas, mostly in prestigious journals and conference proceedings. He is an editor of a book entitled Web Content Delivery published by Springer. He has also served as a program committee member for many international conferences.

Jianliang Xu (S’02–M’03–SM’08) received the B.Eng. degree in computer science and engineering from Zhejiang University, Hangzhou, China, in 1998 and the Ph.D. degree in computer science from the Hong Kong University of Science and Technology in 2002.

He is currently an assistant professor in the Department of Computer Science at Hong Kong Baptist University. His research interests include mobile and pervasive computing, wireless sensor networks, and distributed systems, with an emphasis on data management. He has published more than 50 technical papers in these areas, many in prestigious journals and conferences, including ACM SIGMOD, MobiSys, IEEE ICDE, INFOCOM, IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, and VLDB Journal. He is an editor of a book entitled Web Content Delivery published by Springer. He has also served as a session chair and program committee member for many international conferences, including IEEE INFOCOM.