THE PROPOSED SCHEME - National Science Council, Executive Yuan: Research Project Final Report

Under our chain construction framework, a chain construction algorithm consists of two parts (Figure 2). The first computes and stores the cost of every possible pair of nodes. Given this cost information, the second part constructs a logical chain among all sensor nodes. The issue of leader scheduling is discussed in Section 3.4.

Figure 1. A chain under construction. Node e cannot be included in the chain.

3.1. Costs of node pairs

Conventionally, the cost of every pair of nodes is simply the energy expense of a direct transmission between them [5, 6]. Let $M_d$ be a matrix whose element indexed by $i, j$, denoted $M_d(i, j)$, is the energy expense of a direct transmission between nodes $i$ and $j$. To allow a virtual chain, the costs should be associated with data propagation paths rather than direct links. Let $M_p$ be the minimum cost matrix such that $M_p(i, j) = c(P_{i,j})$ for some $P_{i,j} \in \mathit{mcp}(i, j)$. Such a $P_{i,j}$ for every $i$ and $j$ can be found by running an all-pair shortest-path algorithm (e.g. the Floyd-Warshall algorithm [9, pp. 558–562]) on the input $M_d$. As an example, Figure 3(a) represents $M_d$ graphically for a four-sensor network, where each edge is labelled with the direct transmission cost between its two end nodes. Figure 3(b) shows the $M_p$ that corresponds to the all-pair shortest paths for $M_d$.
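The computation of $M_p$ from $M_d$ can be sketched as follows. This is a minimal illustration of the Floyd-Warshall recurrence only; the function name `min_cost_matrix` and the four-node cost values are hypothetical and are not the values of Figure 3. The sketch returns costs only; recording the paths $P_{i,j}$ themselves would need an additional predecessor table.

```python
from typing import List


def min_cost_matrix(m_d: List[List[float]]) -> List[List[float]]:
    """Floyd-Warshall: return M_p, the all-pair minimum path costs over M_d."""
    n = len(m_d)
    m_p = [row[:] for row in m_d]            # start from direct-transmission costs
    for k in range(n):                       # allow node k as an intermediate relay
        for i in range(n):
            for j in range(n):
                via_k = m_p[i][k] + m_p[k][j]
                if via_k < m_p[i][j]:
                    m_p[i][j] = via_k
    return m_p


# Hypothetical direct costs for four sensors (illustrative values only).
M_D = [
    [0, 3, 8, 12],
    [3, 0, 2, 9],
    [8, 2, 0, 4],
    [12, 9, 4, 0],
]
M_P = min_cost_matrix(M_D)
```

In this made-up instance the direct cost between nodes 0 and 3 is 12, while relaying through nodes 1 and 2 lowers the path cost to 9.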

All-pair shortest-path algorithms are time-consuming ($O(n^3)$ in the case of the Floyd-Warshall algorithm). Alternatively, we may first find a minimum-cost spanning tree (MST) on the weighted complete graph corresponding to $M_d$. Then $P_{i,j}$ is designated to be the shortest path (actually the only path) traversing the tree from $i$ to $j$. We denote the matrix that keeps such costs by $M_t$. With this approach, the data propagation paths found may not be optimal. However, the time complexity of constructing an MST and traversing it from every node is only $O(n^2)$.

Taking Figure 3(a) as an example, Figure 4(a) shows an MST of Figure 3(a). The $M_t$ corresponding to the MST is illustrated in Figure 4(b). Here $M_t(c, d) = 13$ because the data propagation path from $c$ to $d$ is confined to the tree (i.e. $c, a, b, d$). Observe that this is not a minimum-cost path.
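The MST-based alternative can be sketched as below, assuming Prim's algorithm for the MST and one tree traversal per source node for the path costs. The function names and the cost values are hypothetical (they are not read off Figure 3 or Figure 4); nodes a, b, c, d are numbered 0 to 3.

```python
def mst_parents(cost, root=0):
    """Prim's algorithm on a complete graph, O(n^2). Returns parent[v] (-1 for the root)."""
    n = len(cost)
    in_tree = [False] * n
    best = [float("inf")] * n
    parent = [-1] * n
    best[root] = 0
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v] and cost[u][v] < best[v]:
                best[v], parent[v] = cost[u][v], u
    return parent


def tree_cost_matrix(cost, parent):
    """M_t: cost of the unique tree path between every pair (one O(n) traversal per source)."""
    n = len(cost)
    adj = [[] for _ in range(n)]
    for v, p in enumerate(parent):
        if p >= 0:
            adj[v].append(p)
            adj[p].append(v)
    m_t = [[0] * n for _ in range(n)]
    for src in range(n):
        stack = [(src, -1)]
        while stack:
            u, prev = stack.pop()
            for w in adj[u]:
                if w != prev:                # do not walk back along the same edge
                    m_t[src][w] = m_t[src][u] + cost[u][w]
                    stack.append((w, u))
    return m_t


# Hypothetical direct costs for nodes a, b, c, d = 0, 1, 2, 3.
M_D = [
    [0, 5, 3, 12],
    [5, 0, 9, 5],
    [3, 9, 0, 6],
    [12, 5, 6, 0],
]
PARENT = mst_parents(M_D)                    # tree edges: a-b, a-c, b-d
M_T = tree_cost_matrix(M_D, PARENT)
```

In this instance the tree path from c to d runs through a and b and costs 13, although the direct link between c and d costs only 6, which illustrates why paths confined to the MST need not be minimum-cost paths.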

Figure 2. Framework for chain constructions. [Flow: calculate cost of every node pair, producing the node-pair cost matrix; then construct a chain, producing the virtual chain.]

Figure 3. (a) $M_d$; and (b) the corresponding $M_p$.

ENERGY OPTIMIZATION FOR CHAIN-BASED DATA GATHERING

It is interesting and also important to note the property of the triangle inequality in these cost matrices. The triangle inequality states that the cost between any two nodes A and B is at most the cost between A and any other node C plus the cost between C and B. The triangle inequality does not hold if $M_d$ is used as the cost matrix in our problem setting (due to the non-linear attenuation of radio signals). That is, $M_d(i, j)$ can be larger than $M_d(i, k) + M_d(k, j)$ for some $i$, $j$, and $k$. Nevertheless, the triangle inequality does hold in the case of $M_p$, as it is a property of shortest paths [9, p. 520]. For $M_t$ computed based on an MST, the triangle inequality still holds by the following theorem.
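A small numerical illustration of why direct-transmission costs can violate the triangle inequality is given below. It assumes, purely for illustration, that the cost grows with the square of the distance and that any fixed per-message term is ignored; the node layout and the helper `triangle_violations` are hypothetical.

```python
from itertools import permutations


def triangle_violations(m):
    """Return every (i, j, k) for which m[i][j] > m[i][k] + m[k][j]."""
    n = len(m)
    return [(i, j, k) for i, j, k in permutations(range(n), 3)
            if m[i][j] > m[i][k] + m[k][j]]


# Three collinear nodes at positions 0, 1 and 2 (hypothetical layout).
positions = [0.0, 1.0, 2.0]
# Assumed cost model: squared distance, no fixed electronics term.
M_D = [[abs(p - q) ** 2 for q in positions] for p in positions]
VIOLATIONS = triangle_violations(M_D)
```

Here the direct cost between the two outer nodes is 4, whereas relaying through the middle node costs 1 + 1 = 2, so the triple (0, 2, 1) and its mirror image are reported.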

Theorem 1

Let $T_d$ be an MST built on the graph corresponding to $M_d$. If $M_t$ is computed based on $T_d$, we have $M_t(i, j) \le M_t(i, k) + M_t(k, j)$ for any $i$, $j$, and $k$.

Proof

For any two nodes $i$ and $j$ in a tree, there exists exactly one simple path† from $i$ to $j$. The path from $i$ to $k$ and then to $j$ is either the same as the path from $i$ to $j$, for which equality of cost holds, or a non-simple path. In the latter case, an edge incident with $k$ must be included in the path twice, one occurrence immediately followed by the other (one arriving at $k$ and the other leaving $k$). If both occurrences of this edge are removed from the path, the path becomes either the exact simple path from $i$ to $j$ or a non-simple path with lower cost, which can be further shrunk by the above arguments. The conclusion thus follows. □

3.2. Chain construction

Once $M_p$ (or $M_t$) and every $P_{i,j}$ have been obtained, a virtual chain can be formed using any conventional chain construction algorithm such as those proposed in References [5, 6]. The only difference is that the algorithm may run on $M_p$ or $M_t$ instead of $M_d$. Figure 5 shows different chains obtained by running the appending-based chain construction algorithm of PEGASIS [5] on different cost matrices.
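An appending-based construction can be sketched as follows. This is only a generic greedy appending rule under our own assumptions (the chain grows from a given start node, and the cheapest unvisited node is appended to the tail, with ties broken by node index); it is not claimed to reproduce every detail of the algorithm in Reference [5]. The cost matrix is hypothetical, and any of $M_d$, $M_p$ or $M_t$ could be passed in.

```python
def greedy_append_chain(cost, start):
    """Grow a chain from `start` by repeatedly appending the cheapest unvisited node."""
    chain = [start]
    remaining = set(range(len(cost))) - {start}
    while remaining:
        tail = chain[-1]
        # Assumption: ties are broken by the smaller node index.
        nxt = min(remaining, key=lambda v: (cost[tail][v], v))
        chain.append(nxt)
        remaining.remove(nxt)
    return chain


def chain_cost(cost, chain):
    """Sum of the costs of consecutive links along the chain."""
    return sum(cost[a][b] for a, b in zip(chain, chain[1:]))


# Hypothetical cost matrix for four nodes.
COST = [
    [0, 5, 3, 12],
    [5, 0, 9, 5],
    [3, 9, 0, 6],
    [12, 5, 6, 0],
]
CHAIN = greedy_append_chain(COST, start=3)
```

Because the routine only reads the matrix it is given, swapping the cost metric requires no change to the construction itself.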

Although the insertion-based chain construction algorithm [6] generally performs well, here we consider an MST-based chain construction heuristic which is more time efficient. The basic idea is to find an MST first (on the weighted complete graph representing $M_d$, $M_t$, or $M_p$) and then convert it to a chain. A tree can be converted to a chain by traversing it from the root in prefix order (i.e. preorder). The visiting sequence then corresponds to a chain. Figure 6 shows an example.

The time complexity of this approach is $O(n^2)$.
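The MST-based heuristic can be sketched as below: Prim's algorithm builds the tree, and an iterative preorder traversal yields the chain. The function name `mst_chain`, the choice of root, the order in which children are visited, and the cost values are all assumptions made for illustration.

```python
def mst_chain(cost, root=0):
    """Build an MST with Prim's algorithm (O(n^2)) and return its preorder visiting sequence."""
    n = len(cost)
    in_tree = [False] * n
    best = [float("inf")] * n
    parent = [-1] * n
    best[root] = 0
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v] and cost[u][v] < best[v]:
                best[v], parent[v] = cost[u][v], u

    children = [[] for _ in range(n)]
    for v, p in enumerate(parent):
        if p >= 0:
            children[p].append(v)

    # Preorder traversal: visit a node, then its subtrees (children in index order).
    chain, stack = [], [root]
    while stack:
        u = stack.pop()
        chain.append(u)
        stack.extend(reversed(children[u]))
    return chain


# Hypothetical cost matrix for four nodes.
COST = [
    [0, 5, 3, 12],
    [5, 0, 9, 5],
    [3, 9, 0, 6],
    [12, 5, 6, 0],
]
CHAIN = mst_chain(COST)
```

For this instance the MST has edges 0-1, 0-2 and 1-3, and the preorder sequence visits node 3 before backtracking to node 2.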

This heuristic was devised for the travelling salesman problem (TSP), and is often accompanied by the assumption of the triangle inequality. It can be shown that, thanks to the triangle inequality, the heuristic creates a TSP tour whose cost is at most twice the cost of the MST [9, pp. 969–972]. The cost can be further reduced to at most 1.5 times the minimum cost [11]. However, a constant performance ratio is impossible without the triangle inequality.

Figure 4. (a) MST of Figure 3(a); and (b) the corresponding $M_t$.

†A path is simple if it does not include the same edge twice [10].

In summary, we have one design choice among three cost metrics and another design choice among three chain construction algorithms. Table I lists all possible combinations. Among them, the operations of MST-based chain constructions are detailed in Figure 7. The procedure MST-MST can be further simplified by the following theorem.

Theorem 2

Let $T_d$ be an MST built on the graph corresponding to $M_d$. Assume that $M_t$ is the cost matrix computed on $T_d$. Let $T_t$ be an MST on the graph corresponding to $M_t$. The cost of $T_t$ is equal to that of $T_d$.

Proof

For every edge $(i, j) \in T_t$, let $P_{i,j}$ denote the data propagation path from $i$ to $j$ that traverses $T_d$. If $|P_{i,j}| = 1$, edge $(i, j)$ must be an edge of $T_d$ as well. So if we can prove that $|P_{i,j}| = 1$ for every edge $(i, j) \in T_t$, the cost of $T_t$ will be equal to that of $T_d$. Suppose, by contradiction, that there exists an edge $(i, j) \in T_t$ with $|P_{i,j}| > 1$. It follows that there is at least one intermediate node $k$ on $P_{i,j}$. Since $P_{i,j}$ corresponds to the shortest path traversing $T_d$ from $i$ to $j$, it must be a simple path.

Therefore, for any such $k$ we have $M_t(i, k) + M_t(k, j) = M_t(i, j)$.‡ There are four possible cases depending on the relation among $i$, $j$, and $k$:

* Both edges $(i, k)$ and $(k, j)$ are included in $T_t$. This is impossible since the inclusion of these edges plus $(i, j)$ creates a cycle in $T_t$.

* Edge $(i, k)$ but not $(k, j)$ is included in $T_t$. We can form $T_t'$ by first removing $(i, j)$ from $T_t$ and then adding $(k, j)$ into $T_t$. Note that $T_t'$ does not contain a cycle and the cost of $T_t'$ is lower than that of $T_t$ since we swap $(i, j)$ for a lower-cost edge $(k, j)$. It follows that $T_t'$ is a tree with cost lower than that of $T_t$, which contradicts $T_t$ being an MST.

* Edge $(k, j)$ but not $(i, k)$ is included in $T_t$. Similarly, this leads to another tree whose cost is lower than that of $T_t$.

* Neither $(i, k)$ nor $(k, j)$ is included in $T_t$. $T_t$ must contain a path from $i$ to $k$ and another from $k$ to $j$ as $T_t$ is connected. The lengths of these paths must be greater than one. Now consider replacing $(i, j)$ with the combination of $(i, k)$ and $(k, j)$ in $T_t$. Let the result be $T_t'$. Note that $T_t'$ has the same cost as $T_t$ but contains two cycles, one involving the path from $i$ to $k$ and the other the path from $k$ to $j$. We can remove any edge from the first path and any other from the second, resulting in a tree with cost lower than that of $T_t$.

All these cases lead to impossibility or contradiction, so we conclude that there exists no edge $(i, j) \in T_t$ with $|P_{i,j}| > 1$. □

Figure 5. Different chains found by running PEGASIS on: (a) $M_d$ of Figure 3(a); (b) $M_p$ of Figure 3(b); and (c) $M_t$ of Figure 4(b).

Figure 6. (a) A tree rooted at a; and (b) the chain corresponding to the prefix traversal of (a).

Table I. All possible cost-metric/chain construction combinations.

| Cost matrix | Chain construction: greedy appending | Chain construction: greedy insertion | Chain construction: MST traverse |
| --- | --- | --- | --- |
| $M_d$ (direct transmission) | PEGASIS [5] | Direct-insertion [6] | Direct-MST |
| $M_p$ (all-pair shortest paths) | Shortest-appending | Shortest-insertion | Shortest-MST |
| $M_t$ (paths confined to MST) | MST-appending | MST-insertion | MST-MST |

Theorem 2 indicates that, in the case of MST-MST, we may directly convert $T_d$ instead of $T_t$ to a chain. Procedure MST-reduced in Figure 8 thus replaces MST-MST.
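The statement of Theorem 2 can be checked numerically on a random instance, as sketched below. The helper names, the random integer coordinates, and the squared-distance cost model are assumptions made for this check only; integer costs are used so that the two MST costs can be compared exactly.

```python
import random


def mst_parents(cost):
    """Prim's algorithm on a complete graph, O(n^2); parent[v] is -1 for the root (node 0)."""
    n = len(cost)
    in_tree, best, parent = [False] * n, [float("inf")] * n, [-1] * n
    best[0] = 0
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v] and cost[u][v] < best[v]:
                best[v], parent[v] = cost[u][v], u
    return parent


def mst_cost(cost):
    """Total edge cost of an MST of the complete graph described by `cost`."""
    return sum(cost[v][p] for v, p in enumerate(mst_parents(cost)) if p >= 0)


def tree_cost_matrix(cost, parent):
    """M_t: cost of the unique tree path between every pair of nodes."""
    n = len(cost)
    adj = [[] for _ in range(n)]
    for v, p in enumerate(parent):
        if p >= 0:
            adj[v].append(p)
            adj[p].append(v)
    m_t = [[0] * n for _ in range(n)]
    for src in range(n):
        stack = [(src, -1)]
        while stack:
            u, prev = stack.pop()
            for w in adj[u]:
                if w != prev:
                    m_t[src][w] = m_t[src][u] + cost[u][w]
                    stack.append((w, u))
    return m_t


random.seed(7)
POINTS = [(random.randint(0, 100), random.randint(0, 100)) for _ in range(15)]
# Assumed cost model: squared Euclidean distance between hypothetical sensor positions.
M_D = [[(ax - bx) ** 2 + (ay - by) ** 2 for bx, by in POINTS] for ax, ay in POINTS]
M_T = tree_cost_matrix(M_D, mst_parents(M_D))
SAME_COST = mst_cost(M_T) == mst_cost(M_D)
```

A single passing instance is of course not a proof; the check merely illustrates that building a second MST on $M_t$ gains nothing over $T_d$.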

Table II lists the time complexities of all mentioned methods. Among them, PEGASIS, Direct-MST, MST-appending, and MST-reduced are more time efficient than the others.

Figure 7. Operations of MST-based chain constructions.

‡Recall that the equality in Theorem 1 holds when $k$ lies on the path from $i$ to $j$.

3.3. Energy-latency trade-off

As mentioned, one drawback of using chains instead of trees or clusters is the increase of data latency. The situation may be aggravated when using virtual chains, as a virtual chain increases the number of hops to collect sensed data. Therefore, one may want to constrain data latency and meanwhile still make some gains in energy saving.

Given a conventional chain $\{N_i\}_{i=1}^{n}$, if we replace $L_{N_i,N_{i+1}}$, the link between $N_i$ and $N_{i+1}$ ($1 \le i \le n-1$), with the best data propagation path from $N_i$ to $N_{i+1}$, $P_{N_i,N_{i+1}}$, the number of hops will be increased by $|P_{N_i,N_{i+1}}| - 1$ while the energy gain is $c(L_{N_i,N_{i+1}}) - c(P_{N_i,N_{i+1}})$. Therefore, the maximal energy gain with latency constraint (MEGLC) problem can be defined as finding $E \subseteq \{1, 2, \ldots, n-1\}$ that maximizes

$$\sum_{i \in E} \left[ c(L_{N_i,N_{i+1}}) - c(P_{N_i,N_{i+1}}) \right]$$

subject to

$$\sum_{i \in E} \left( |P_{N_i,N_{i+1}}| - 1 \right) \le T$$

where $T$ is the maximal number of additional hops allowed. This problem is also NP-hard, as it can be shown that the 0/1 Knapsack problem reduces to MEGLC. The 0/1 Knapsack problem is to choose a set of items to put into a limited-capacity knapsack, where the $i$th item has a profit $p_i$ and weighs $w_i$. The knapsack capacity is essentially $T$; $w_i$ can be transformed to $|P_{N_i,N_{i+1}}| - 1$, and $p_i$ is $c(L_{N_i,N_{i+1}}) - c(P_{N_i,N_{i+1}})$.

It can also be shown that MEGLC reduces to the 0/1 Knapsack problem. As the 0/1 Knapsack problem can be solved by a dynamic programming algorithm, so can MEGLC.
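A dynamic programming solution along these lines can be sketched as follows, using the standard 0/1 knapsack recurrence over the hop budget. The function name `meglc_dp` and the sample gains and hop counts are hypothetical; `gains[i]` stands for the energy gain of virtualizing link $i$ and `extra_hops[i]` for the number of hops this adds.

```python
def meglc_dp(gains, extra_hops, budget):
    """0/1 knapsack DP: maximize total gain with total extra hops <= budget.

    Returns (best total gain, tuple of chosen link indices).
    """
    # best[t] = (max gain, chosen links) when at most t extra hops are used
    best = [(0.0, ())] * (budget + 1)
    for i, (gain, hops) in enumerate(zip(gains, extra_hops)):
        if gain <= 0:
            continue                         # virtualizing this link saves nothing
        # Descending t ensures each link is selected at most once.
        for t in range(budget, hops - 1, -1):
            candidate = best[t - hops][0] + gain
            if candidate > best[t][0]:
                best[t] = (candidate, best[t - hops][1] + (i,))
    return best[budget]


# Hypothetical instance: three links, at most 3 additional hops allowed.
GAINS = [6.0, 5.0, 4.0]
EXTRA_HOPS = [3, 2, 1]
BEST_GAIN, CHOSEN = meglc_dp(GAINS, EXTRA_HOPS, budget=3)
```

In this instance virtualizing links 1 and 2 (total gain 9 with 3 extra hops) beats virtualizing link 0 alone (gain 6 with 3 extra hops). The running time is proportional to the number of links times $T$.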

Figure 8. Operations of MST-reduced.

Table II. Time complexities of all methods.

| Method | Cost matrix computation | Chain construction | Overall |
| --- | --- | --- | --- |
| PEGASIS [5] | $O(n^2)$ | $O(n^2)$ | $O(n^2)$ |
| Direct-insertion [6] | $O(n^2)$ | $O(n^3)$ | $O(n^3)$ |
| Direct-MST | $O(n^2)$ | $O(n^2)$ | $O(n^2)$ |
| Shortest-appending | $O(n^3)$ | $O(n^2)$ | $O(n^3)$ |
| Shortest-insertion | $O(n^3)$ | $O(n^3)$ | $O(n^3)$ |
| Shortest-MST | $O(n^3)$ | $O(n^2)$ | $O(n^3)$ |
| MST-appending | $O(n^2)$ | $O(n^2)$ | $O(n^2)$ |
| MST-insertion | $O(n^2)$ | $O(n^3)$ | $O(n^3)$ |
| MST-reduced | $O(n^2)$ | $O(n^2)$ | $O(n^2)$ |

Nevertheless, we found through experiments that a greedy method performs well. Given a conventional chain, the greedy method 'virtualizes' the edge that maximizes the ratio of energy gain to the latency raised.
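The greedy rule can be sketched as below. The text only specifies the selection criterion (largest ratio of energy gain to added latency); the stopping rule used here, namely skipping an edge that no longer fits a given hop budget and moving on to the next one, is our own assumption, as are the function name and the sample values.

```python
def greedy_virtualize(gains, extra_hops, budget):
    """Virtualize edges in decreasing order of gain per added hop, within a hop budget.

    Returns (list of chosen edge indices in selection order, total energy gain).
    """
    candidates = [i for i in range(len(gains)) if gains[i] > 0 and extra_hops[i] > 0]
    candidates.sort(key=lambda i: gains[i] / extra_hops[i], reverse=True)
    chosen, used = [], 0
    for i in candidates:
        # Assumption: an edge that would exceed the budget is skipped, not fatal.
        if used + extra_hops[i] <= budget:
            chosen.append(i)
            used += extra_hops[i]
    return chosen, sum(gains[i] for i in chosen)


# Hypothetical instance: three edges, at most 3 additional hops allowed.
CHOSEN, TOTAL_GAIN = greedy_virtualize([6.0, 5.0, 4.0], [3, 2, 1], budget=3)
```

On this small instance the greedy choice happens to be optimal; in general it carries no such guarantee, which is the price paid for avoiding the dynamic programming table.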

Figure 9 shows how the greedy method trades latency for energy. The energy consumed by a conventional chain (without any edge being virtualized) is 0.44 J. In contrast, the energy expense with a virtual chain can be as low as 0.18 J (with over 20 virtualized edges), a reduction of about 60%. On the other hand, the conventional chain incurs no additional latency, while a virtual chain increases the number of hops by up to 40, roughly a 50% increase (a conventional chain consisting of 80 sensors has a fixed length of 79 hops). As a remark, the energy gain becomes marginal after 20 edges have been virtualized; further edge virtualization does not improve energy efficiency significantly.

3.4. Leader scheduling

Given a chain structure, leader scheduling determines which node acts as a leader in each round of the data-collection processes. The goal is to prolong network lifetime, i.e. to maximize the number of data-collection rounds. In the following, we analyse the maximum number of data-collection rounds that can be achieved before any node exhausts its power. To simplify the analysis, we focus on leader scheduling in a conventional chain. Without loss of generality, we assume that nodes in the chain are numbered sequentially as $1, 2, \ldots, n$. We also use the following notations.

* $e_i$: the energy consumed by node $i$ in transmitting a data message to the BS.

* $r_{i,j}$: the energy consumed by $i$ in transmitting a $k$-bit message to node $j$, where $r_{i,j} = k E_{elec} + k \epsilon_{amp} d(i, j)^{\alpha}$.

* $e_r = k E_{elec}$: the energy consumed in receiving a $k$-bit message.

* $E_i$: the amount of energy that node $i$ initially has.
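The transmission and reception costs in this notation can be evaluated as sketched below. The numerical constants are assumptions chosen for illustration (the text above does not state values for the electronics energy, the amplifier energy, or the path-loss exponent), and the function names are hypothetical.

```python
import math

# Assumed radio parameters (illustrative only, not taken from the text).
E_ELEC = 50e-9       # electronics energy per bit, in joules
EPS_AMP = 100e-12    # amplifier energy per bit per m^ALPHA, in joules
ALPHA = 2.0          # path-loss exponent


def tx_energy(k_bits, distance_m):
    """r_{i,j}: energy to transmit a k-bit message over the given distance."""
    return k_bits * E_ELEC + k_bits * EPS_AMP * distance_m ** ALPHA


def rx_energy(k_bits):
    """e_r: energy to receive a k-bit message."""
    return k_bits * E_ELEC


R_10M = tx_energy(2000, 10.0)    # 2000-bit message over 10 m
E_R = rx_energy(2000)
```

With these assumed constants, sending 2000 bits over 10 m costs 0.12 mJ, of which 0.10 mJ is the distance-independent electronics term that reception also incurs.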

Figure 9. Trade-off between energy and latency with the greedy method. The results were obtained with 80 sensors under a 200 × 200 network. [Plot: energy consumption (J) and increased latency (hops) versus number of edges being virtualized.]

When some node $i$ is selected to be the leader, every node numbered $j < i$ (if any) expends energy $r_{j,j+1}$ in sending data to node $j+1$, at which energy $e_r$ is consumed to receive the data. Likewise, every node numbered $k > i$ (if any) expends $r_{k,k-1}$ to send data to node $k-1$, where energy $e_r$ is expended in receiving the data. The leader transmits the collected data to the BS, consuming energy $e_i$. Supposing that every node $i$ is scheduled to be the leader $x_i$ times, Table III shows the energy expense of every sensor node. The optimal leader scheduling problem is to find non-negative integer values of the $x_i$'s so as to maximize $\sum_i x_i$ subject to the following constraints:

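$$
\begin{aligned}
E_1 &\ge (e_1 + e_r)x_1 + r_{1,2}x_2 + r_{1,2}x_3 + \cdots + r_{1,2}x_n \\
&\;\;\vdots \\
E_i &\ge (r_{i,i-1} + e_r)x_1 + \cdots + (r_{i,i-1} + e_r)x_{i-1} + (e_i + 2e_r)x_i + (r_{i,i+1} + e_r)x_{i+1} + \cdots + (r_{i,i+1} + e_r)x_n \\
&\;\;\vdots \\
E_n &\ge r_{n,n-1}x_1 + r_{n,n-1}x_2 + \cdots + (e_n + e_r)x_n
\end{aligned}
$$

These constraints can be reformulated as

$$
A \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \le \begin{pmatrix} E_1 \\ E_2 \\ \vdots \\ E_n \end{pmatrix} \qquad (1)
$$

where

$$
A = \begin{pmatrix}
e_1 + e_r & r_{1,2} & r_{1,2} & \cdots & r_{1,2} \\
r_{2,1} + e_r & e_2 + 2e_r & r_{2,3} + e_r & \cdots & r_{2,3} + e_r \\
r_{3,2} + e_r & r_{3,2} + e_r & e_3 + 2e_r & \cdots & r_{3,4} + e_r \\
\vdots & & & \ddots & \vdots \\
r_{n,n-1} & r_{n,n-1} & r_{n,n-1} & \cdots & e_n + e_r
\end{pmatrix}
$$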
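The coefficient matrix $A$ can be assembled mechanically from the per-node costs, as sketched below with 0-based node indices. The function names and the three-node numbers are hypothetical; the sketch only builds $A$ and tests whether a given schedule satisfies (1), and does not search for an optimal schedule.

```python
def build_A(e, r_left, r_right, e_r):
    """Coefficient matrix of (1): A[i][j] is node i's energy per round led by node j.

    e[i]       -- energy for node i to send to the BS
    r_left[i]  -- energy for node i to send to node i-1 (unused for the first node)
    r_right[i] -- energy for node i to send to node i+1 (unused for the last node)
    e_r        -- energy to receive one message
    """
    n = len(e)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        has_left, has_right = i > 0, i < n - 1
        for j in range(n):
            if j == i:      # node i leads: receives from each neighbour, sends to the BS
                A[i][j] = e[i] + (has_left + has_right) * e_r
            elif j < i:     # leader on the left: receive from the right (if any), send left
                A[i][j] = r_left[i] + (e_r if has_right else 0)
            else:           # leader on the right: receive from the left (if any), send right
                A[i][j] = r_right[i] + (e_r if has_left else 0)
    return A


def energy_spent(A, x):
    """Energy used by every node when node j leads x[j] rounds."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def is_feasible(A, x, E):
    """True if the schedule x satisfies constraint (1) for initial energies E."""
    return all(used <= cap for used, cap in zip(energy_spent(A, x), E))


# Hypothetical three-node chain with made-up integer costs.
A = build_A(e=[5, 4, 5], r_left=[0, 1, 2], r_right=[1, 2, 0], e_r=1)
```

With initial energies of 30 units per node, the schedule (2, 3, 2) is feasible in this instance, whereas leading three rounds each would exhaust the middle node.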

Table III. Energy expense of every sensor.

| Node id. | In sending messages to the BS | In sending messages to neighbours | In receiving neighbours' messages |
| --- | --- | --- | --- |
| $1$ | $e_1 x_1$ | $r_{1,2} \sum_{j=2}^{n} x_j$ | $e_r x_1$ |
| $i$, $2 \le i \le n-1$ | $e_i x_i$ | $r_{i,i-1} \sum_{j=1}^{i-1} x_j + r_{i,i+1} \sum_{j=i+1}^{n} x_j$ | $e_r \left( \sum_{j=1}^{i-1} x_j + \sum_{j=i+1}^{n} x_j + 2x_i \right)$ |
| $n$ | $e_n x_n$ | $r_{n,n-1} \sum_{j=1}^{n-1} x_j$ | $e_r x_n$ |

$x_i$: the number of times node $i$ is selected to be the leader; $e_i$: the amount of energy consumed in transmitting a message from node $i$ to the BS; $r_{i,j}$: the energy consumed by $i$ in transmitting a message to $j$; $e_r$: the energy consumed by any node in receiving a message.

The problem turns out to be a linear programming problem. Some sensors may be ruled out by the BS in the leader scheduling process. If sensor $i$ cannot be selected as a leader for some reason (for example, it has no direct link with the BS), variable $x_i$ in (1) is bound to zero. Therefore, the existence of leader-ineligible sensors eases the scheduling work by reducing the population of leader candidates.

Round-robin leader scheduling (RR) equalizes the values of the $x_i$'s, which is generally far from optimal. In Reference [5], an improvement on RR is proposed. This approach sets up a threshold of distance, and nodes are not allowed to be leaders if their distances to their neighbours along the chain are beyond the threshold.

Instead of finding an optimal solution, we propose a simple rule called maximum residual power first (MRPF) for leader scheduling. As the name suggests, MRPF selects the node that has the maximum residual power to be the leader in each round of data collection. Residual power information can be piggybacked on data messages as part of the aggregated data. If every node attaches its own power level to the data message and lets the BS find the maximum value, this incurs an additional $O(n)$ overhead on every message. A better approach is to let every node compare its power level with the one attached to the incoming data message (if any) and send only the larger. This is similar to existing distributed maximum-finding algorithms on rings [12–15], and the message overhead is only $O(1)$.
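The piggybacked comparison can be simulated as sketched below. Each node forwards a single (power, node id) pair towards the current leader, so the extra field has constant size. The function name, the tie-breaking rule (the pair seen first wins), and the sample power levels are assumptions made for illustration.

```python
def next_leader(residual, current_leader):
    """Return the node with maximum residual power, found by constant-size piggybacking.

    Data flows from both ends of the chain towards `current_leader`; every node
    compares its own level with the incoming one and forwards only the larger.
    """
    def larger(a, b):
        # Assumption: on a tie, the pair already being carried is kept.
        return a if a[0] >= b[0] else b

    left = None                                  # pair carried from node 0 rightwards
    for node in range(0, current_leader):
        mine = (residual[node], node)
        left = mine if left is None else larger(left, mine)

    right = None                                 # pair carried from the last node leftwards
    for node in range(len(residual) - 1, current_leader, -1):
        mine = (residual[node], node)
        right = mine if right is None else larger(right, mine)

    best = (residual[current_leader], current_leader)
    for incoming in (left, right):
        if incoming is not None:
            best = larger(best, incoming)
    return best[1]                               # reported to the BS with the data


# Hypothetical residual power levels (in joules) along a four-node chain.
RESIDUAL = [0.5, 0.9, 0.3, 0.7]
LEADER = next_leader(RESIDUAL, current_leader=2)
```

Only one pair travels on each link, regardless of the chain length, which is where the constant overhead per message comes from.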

Recall that the BS broadcasts the result of leader scheduling to all sensors before each data-collection round. The energy consumed in receiving broadcasts is not taken into account in the above model. If it is to be considered, a slight modification of the model is required.

Suppose that receiving one broadcast consumes $b$ units of energy. As there are $\sum_i x_i$ data-collection rounds in total, all sensors uniformly spend $b \sum_i x_i$ units of energy on receiving broadcasts. Taking account of this quantity, (1) becomes

$$
A \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{pmatrix} \le \begin{pmatrix} E_1/b \\ E_2/b \\ E_3/b \\ \vdots \\ E_n/b \end{pmatrix} \qquad (2)
$$

This formula is essentially the same as (1), the only exception being that the initial energy $E_i$ of each sensor is uniformly divided by $b$. Therefore, if $\langle w_1, w_2, \ldots, w_n \rangle$ is the optimal value for $\langle x_1, x_2, \ldots, x_n \rangle$ that maximizes $\sum x_i$ subject to (1), $\langle w_1/b, w_2/b, \ldots, w_n/b \rangle$ will be the solution that maximizes $\sum x_i$ subject to (2). In other words, the consideration of energy expense on broadcasting only scales down the optimal value by a constant. It does not make the problem harder or easier to deal with.

The same conclusion also applies to other energy dissipation sources that have an equal effect on all sensor nodes. An example is the energy expense in idle mode.
