Storage Management - Mechanism for Mining Object Moving Paths in Wireless Sensor Networks

As mentioned above, in our proposed heterogeneous tracking model, the higher-level cluster heads should store the moving records of objects. Cluster heads are intrinsically sensor nodes as well, although with more computing power and storage space. As such, cluster heads may suffer from the storage saturation problem, in which there is no free storage space available for incoming moving records. Consequently, we propose an efficient storage management scheme for cluster heads. Assume that each level i cluster head is chosen among the level (i-1) cluster heads. It is then possible that a level i cluster head has to maintain the moving records for the emission trees of all lower levels for each object. As a result, higher-level cluster heads suffer a larger storage overhead.

Figure 12: An illustrative example of recovery procedure (steps shown: 1. object miss message; 2. append; 3. predict; 4. append the sensor_ID of the node which detects the object; 5. object found message; 6. append)

Let P_i be the probability that an object leaves its current level i cluster, and let d_i be the updating cost (i.e., hop count) from the level i cluster head to its level (i+1) parent node. Furthermore, suppose that there are (k+1) levels in the hierarchical tracking model (i.e., level 0 to level k). Then the expected value of the communication cost among cluster heads for one object, denoted as E[C], can be evaluated as follows:

E[C] = P_k−1

Given the leaving probabilities P_0 to P_k, it can be seen that the updating costs d_0 to d_{k-1} are the dominant factors in the value of E[C]. The values of d_i depend on the network topology. Therefore, two storage strategies are developed as follows.

Consider the example of strategy one in Figure 13. Suppose that the sensor field is divided into a 2⁴ × 2⁴ grid and the sink (i.e., the level 4 cluster head) is at the center. Each grid cell is a level 0 cluster head, and the level i cluster head nearest to the sink in each 2ⁱ⁺¹ × 2ⁱ⁺¹ cluster is selected to be the level (i+1) cluster head. In Figure 13, the number i marks a cell that serves as the cluster head of level 0 through level i. The arrows show each level i CH sending its location updating messages to its level (i+1) parent. By observation, the expected value E_1[d_i] = 2ⁱ for i in [0, 2], and E_1[d_3] = 1.

Figure 14 shows storage strategy two, where the number i means that the grid cell is both a level 0 cluster head and a level i cluster head. Once a level 0 cluster head has been selected as an upper-level cluster head, it will never be selected again. The hierarchy construction principle is that, among the level 0 cluster heads that have not yet been selected, the one nearest to the level (i+1) cluster head in each 2ⁱ × 2ⁱ cluster is chosen as the child node of that level (i+1) cluster head. It can be verified that the expected value E_2[d_i] is almost the same as E_1[d_i]. Let E[S] be the expected storage cost of a cluster head (i.e., the average number of levels that a CH is in charge of). Then E_1[S] and E_2[S] can be evaluated as follows:

In strategy one, a level i cluster head must maintain the sensor-level, level 0, ..., level (i-1) trees (i.e., (i+1) levels in total). Then,

\[ E_1[S] = \frac{(k+1) + (4^1 - 4^0)\,k + (4^2 - 4^1)(k-1) + \cdots + (4^k - 4^{k-1}) \cdot 1}{4^k} = \frac{4 - 4^{-k}}{3} \]

In strategy two, each cluster head maintains trees for at most 2 levels. Then,

\[ E_2[S] = \frac{2 \cdot \frac{4^k - 1}{3} + \frac{2 \times 4^k + 1}{3}}{4^k} = \frac{4 - 4^{-k}}{3} \]

After the estimation of the average storage cost, we can evaluate the variance of the storage cost in strategy 1 and 2.

\[ V_1[S] = \frac{((k+1) - E_1[S])^2 + 3(k - E_1[S])^2 + \cdots + 3 \times 4^{k-1}(1 - E_1[S])^2}{4^k} \]

\[ = \frac{\left(k + \frac{4^{-k}}{3} - \frac{1}{3}\right)^2 + 3\sum_{i=0}^{k-1} 4^i \left((k-i) + \frac{4^{-k}}{3} - \frac{4}{3}\right)^2}{4^k} \]

\[ = \frac{4}{9} - \frac{(6k+3)\,4^{-k}}{9} - \frac{4^{-2k}}{9} \]

\[ V_2[S] = \frac{(2 - E_2[S])^2 \cdot \frac{4^k - 1}{3} + (1 - E_2[S])^2 \cdot \frac{2 \times 4^k + 1}{3}}{4^k} = \frac{2}{9} - \frac{4^{-k}}{9} - \frac{4^{-2k}}{9} \]

It can be seen that V_1[S] exceeds V_2[S] as k increases, so strategy 2 has better load balance than strategy 1. Load balance is an important issue for the in-network mining and prediction in our work. If a cluster head has to maintain the trees for too many levels, its storage may fill up quickly. Once the storage is exhausted, the trees can no longer be extended, which directly affects the prediction accuracy. In strategy 2, the hot spot problem is solved: each cluster head has much more memory space available for emission tree training at different levels.
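As a sanity check, the mean and variance of the per-cluster-head storage cost can be computed directly from the level counts used in the expressions for the two strategies. The following Python sketch is illustrative only: the function names `level_counts` and `mean_and_variance` are hypothetical, and it assumes 4^k level 0 cluster heads with levels 0 to k, as in the grid example.

```python
from fractions import Fraction


def level_counts(k, strategy):
    """Map 'number of levels maintained' -> 'number of cluster heads'.

    Hypothetical helper; assumes 4**k level 0 cluster heads and levels 0..k.
    """
    if strategy == 1:
        # The single top-level CH maintains k+1 levels; 4**j - 4**(j-1) CHs
        # maintain k+1-j levels, for j = 1..k.
        counts = {k + 1: 1}
        for j in range(1, k + 1):
            counts[k + 1 - j] = 4**j - 4**(j - 1)
        return counts
    # Strategy 2: the (4**k - 1)/3 upper-level CHs maintain 2 levels,
    # all remaining CHs maintain 1 level.
    upper = (4**k - 1) // 3
    return {2: upper, 1: 4**k - upper}


def mean_and_variance(counts):
    """Exact mean and variance of the number of levels per cluster head."""
    total = sum(counts.values())
    mean = Fraction(sum(levels * n for levels, n in counts.items()), total)
    var = sum(n * (levels - mean) ** 2 for levels, n in counts.items()) / total
    return mean, var


for k in range(1, 5):
    m1, v1 = mean_and_variance(level_counts(k, 1))
    m2, v2 = mean_and_variance(level_counts(k, 2))
    print(k, m1, m2, float(v1), float(v2))
```

Both strategies give the same mean, and the two variances coincide at k = 1, where the two hierarchies are identical; for larger k the variance of strategy 1 is the larger one.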

Figure 13: An illustrative example for the storage strategy 1.

Figure 14: An illustrative example for the storage strategy 2.

Figure 15: An illustrative diagram of pruning a node in a level i cluster head when its memory space is full and a new node is to be inserted.

Applying the hierarchy construction approach shown in Figure 14, a level i (i ≥ 1) cluster head only has to maintain two levels of buffers (i.e., the sensor level and the (i-1)-th level) and emission trees for each object. Since there are multiple objects and the storage of each cluster head is limited, it is still possible that the storage requirement exceeds the limit. Hence, a storage management strategy is necessary for dealing with this situation. When the memory space of a cluster head is full and the count of one symbol in the table maintained by an emission tree node becomes ≥ min_sup, we have to decide whether to prune other nodes so that the newborn node can be inserted into the emission trees. We specify a hit-rate threshold for insertion: if the prediction hit rate of the tree to which the newborn node belongs is already larger than this threshold, we can skip the insertion, since this emission tree already achieves a high prediction rate. Otherwise, we must select an appropriate node to be pruned from the other trees. The pruning mechanism consists of two steps: (1) select the tree to be pruned and (2) select the node to be pruned.

1) Select the tree to be pruned. Each object usually has its own moving behavior. An object may stay in some regions more frequently than in others. Hence, the reporting rates of objects in each cluster head will be different. For the tree selection, each tree maintains a counter. Whenever a tree is updated, its counter is increased by one. In addition, each counter is decreased by one every T periods. A tree with a lower counter value is used less often than other trees. Note that since object movements usually exhibit locality, upper-level trees are updated infrequently and grow more slowly than lower-level trees. If we simply selected the tree with the minimal counter value, the upper-level trees would be pruned more often, and the emission trees at higher-level cluster heads would struggle to achieve good accuracy. Hence, we only select among trees of the same level as the tree into which the newborn node will be inserted. To guarantee that the accuracy of each tree is acceptable, we specify a threshold ε. Suppose that the new node will be inserted into a level i emission tree. Among the other level i trees whose prediction hit rates are ≥ ε, the tree with the minimal counter value is selected.

Figure 16: An example of profit maintaining for leaf nodes

Consider the example in Figure 15, where the size of a tree stands for its access counter value. Let the insertion threshold be 0.8 and ε = 0.6. Since the hit rate of tree T1 is < 0.8, we decide to insert the new node into T1 and prune one node in T2~T5. Although T2 has the minimal counter value, its hit rate is < 0.6, so T2 is not selected and T4 is considered next. Since the hit rate of T4 is > 0.6, T4 is selected to be pruned.
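The tree selection rule can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `EmissionTree` class, the function name `select_tree_to_prune`, and all counter values and hit rates in the example are hypothetical, chosen only to reproduce the outcome described for Figure 15.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EmissionTree:
    name: str
    level: int
    counter: int     # access counter: +1 on each update, -1 every T periods
    hit_rate: float  # prediction hit rate of this tree


def select_tree_to_prune(target: EmissionTree,
                         trees: List[EmissionTree],
                         insert_threshold: float,
                         epsilon: float) -> Optional[EmissionTree]:
    """Return the tree to prune, or None if nothing should be pruned.

    None is returned when the target tree already exceeds the insertion
    threshold (insertion is skipped) or when no other same-level tree has
    a hit rate >= epsilon.
    """
    if target.hit_rate > insert_threshold:
        return None
    candidates = [t for t in trees
                  if t is not target
                  and t.level == target.level
                  and t.hit_rate >= epsilon]
    if not candidates:
        return None
    # Among acceptable trees, the least recently used one is pruned.
    return min(candidates, key=lambda t: t.counter)


# Hypothetical values mimicking the Figure 15 scenario.
t1 = EmissionTree("T1", 1, counter=9, hit_rate=0.70)
trees = [
    t1,
    EmissionTree("T2", 1, counter=2, hit_rate=0.50),  # smallest counter, but below epsilon
    EmissionTree("T3", 1, counter=6, hit_rate=0.70),
    EmissionTree("T4", 1, counter=3, hit_rate=0.65),
    EmissionTree("T5", 1, counter=8, hit_rate=0.90),
]
victim = select_tree_to_prune(t1, trees, insert_threshold=0.8, epsilon=0.6)
print(victim.name if victim else None)
```

With these values T2 is filtered out by the ε check, and T4 has the smallest counter among the remaining candidates.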

2) Select the node which will be replaced by the new node. Once the tree is selected, we must prune the node whose removal has the least impact on the selected emission tree. For the node selection, each tree maintains a profit for each leaf node, i.e., each node from which no other node is derived. Let LNode be the set of all these nodes. Consider the example in Figure 16, and assume that nodes CD, CDA and CDB are leaf nodes. Since nodes CDA and CDB are derived from node CD, we do not take node CD into consideration. The probability of a node represents the importance of the node to the tree: a lower probability means that the node is accessed less frequently than other nodes. Thus, we take the node probability as one factor of the node profit. In addition, since only mature nodes are used for prediction and probability estimation, mature nodes bring more profit than immature ones.

Among the immature nodes, there are still differences in the degree of maturity. An immature node that is likely to become mature is more important than a brand-new node. Thus, we must first check whether a node is mature or not. Let N be the number of times that the L_∞ distance of the probability distribution of node x. The mature degree of a node x, denoted by MD(x), is defined as follows:

\[ MD(x) = \frac{N}{\beta} \]

With the above two factors, the profit function of a node x, expressed by Profit(x), is formulated as follows:

Profit(x) = P(x) × (MD(x) + c), where c is a real constant used as the base.

The node with the minimal profit value in LNode will be chosen to be pruned.
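The profit-based node selection can be sketched as follows. This is an illustrative Python sketch under stated assumptions: it reads the mature degree as MD(x) = N/β, the function names are hypothetical, and the node probabilities, N values, β and c in the example are made-up numbers, not values from the experiments.

```python
def mature_degree(n, beta):
    """Mature degree MD(x), read here as N / beta (assumption)."""
    return n / beta


def profit(prob, n, beta, c):
    """Profit(x) = P(x) * (MD(x) + c)."""
    return prob * (mature_degree(n, beta) + c)


def select_node_to_prune(lnode, beta, c):
    """Return the name of the node in LNode with the minimal profit.

    lnode maps a node name to a (P(x), N) pair.
    """
    return min(lnode, key=lambda name: profit(lnode[name][0], lnode[name][1], beta, c))


# Hypothetical LNode for the leaf nodes of Figure 16.
lnode = {
    "CDA": (0.10, 3),
    "CDB": (0.05, 1),
}
print(select_node_to_prune(lnode, beta=5, c=1.0))
```

Here CDB has both the lower probability and the lower mature degree, so it has the smaller profit and is the one chosen for pruning.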

To reduce the cost of maintaining LNode, once a node becomes mature we no longer update the probability entries in the table of that node. Consider the example in Figure 17. The nodes with a bold outline are mature nodes and the nodes with a dotted outline are immature nodes. Node DCAE is a node in LNode. Since nodes DC and DCA are immature, P(DCAE) = P(D) × P(C|D) × P(A|C) × P(E|CA). We do not have to recalculate P(DCAE) until one of nodes DC and DCA becomes mature.

Figure 17: An example of the maintenance of LNode

With the pruning mechanism, no specific tree is always selected to be pruned, since a tree is no longer selected once its prediction hit rate drops below ε. Furthermore, even if the hit rate of a tree becomes lower due to node pruning, the pruned nodes still have a chance to be inserted back.
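The product used for P(DCAE) in the Figure 17 example can be read as conditioning each symbol on the longest mature suffix of its preceding context. The following Python sketch illustrates that reading; it is an assumption about how the fallback works, the function names are hypothetical, and the conditional probabilities are made-up values.

```python
def longest_mature_suffix(context, mature):
    """Longest suffix of `context` that is a mature node; '' denotes the root."""
    for start in range(len(context)):
        if context[start:] in mature:
            return context[start:]
    return ""


def path_probability(path, cond_prob, mature):
    """Multiply P(symbol | longest mature suffix of the preceding symbols)."""
    p = 1.0
    for i, symbol in enumerate(path):
        ctx = longest_mature_suffix(path[:i], mature)
        p *= cond_prob[(ctx, symbol)]
    return p


# Mature nodes of Figure 17: D, C, A and CA (DC and DCA are immature).
mature = {"D", "C", "A", "CA"}
# Hypothetical conditional probabilities keyed by (context, symbol).
cond_prob = {
    ("", "D"): 0.5,    # P(D)
    ("D", "C"): 0.4,   # P(C|D)
    ("C", "A"): 0.5,   # P(A|C), used because DC is immature
    ("CA", "E"): 0.2,  # P(E|CA), used because DCA is immature
}
print(path_probability("DCAE", cond_prob, mature))
```

As long as the set of mature nodes does not change, the same factors are used, which is why the cached value of P(DCAE) stays valid until DC or DCA becomes mature.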

4 Performance Study

In this section, experimental results are presented. The simulation model is described in Section 4.1. The comparison of our scheme with the PES scheme [12] is conducted in Section 4.2. Finally, the sensitivity analysis of the in-network mining approach is described in Section 4.3.

4.1 Simulation Model

There are 3 levels in our proposed heterogeneous tracking model, and we deploy 9 low-end sensors in each level 0 cluster. Hence, there are 16 level 0 CHs, 4 level 1 CHs and one level 2 CH, and the number of low-end sensors is 144. To simulate the object movements, we generate VMM model trees for each object in each cluster head. In addition, the city mobility model [6] is used to simulate object movements with locality. With this model, each object leaves its current level 1 cluster with probability p_1 and stays with probability 1 - p_1. In the former case, it chooses a level 1 cluster as its next position according to its VMM model tree in the level 2 CH (it may stay in the current level 1 cluster). In the latter case, it leaves its current level 0 cluster with probability p_0 and stays with probability 1 - p_0. Similarly, in the former case, it chooses a level 0 cluster as its next position according to its VMM model tree in the parent. In the latter case, it stays in its current level 0 cluster. In all cases above, the VMM model lookup procedure is repeated until the object has decided which low-end sensor's monitored region to move to. The probability p_i is determined by the exponential function p_i = e^{-C·2^{i+1}}, where C is a positive constant. A higher value of C means higher locality.
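The level-by-level leaving decision of this mobility model can be sketched as follows. This is a simplified Python illustration: the function names are hypothetical, the value of C is arbitrary, and the subsequent VMM model tree lookup that picks the actual destination is not modeled.

```python
import math
import random


def leave_probability(i, c):
    """p_i = exp(-C * 2**(i + 1)): probability of leaving the level i cluster."""
    return math.exp(-c * 2 ** (i + 1))


def decide_move_level(c, rng, top_level=1):
    """Return the highest level whose cluster the object leaves.

    Levels are tried from top_level down to 0; -1 means the object stays
    inside its current level 0 cluster.
    """
    for i in range(top_level, -1, -1):
        if rng.random() < leave_probability(i, c):
            return i
    return -1


rng = random.Random(42)  # fixed seed only for repeatability of the sketch
c = 0.5                  # arbitrary positive constant
samples = [decide_move_level(c, rng) for _ in range(10000)]
for level in (1, 0, -1):
    print(level, samples.count(level) / len(samples))
```

Because p_i decreases as either C or i grows, a larger C makes every leave decision less likely, and leaving a level 1 cluster is always less likely than leaving a level 0 cluster.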
