3.2 Algorithm DClocal
3.2.2 The Optimal Number of Hops for Collecting Local Information
In algorithm DClocal, each sensor broadcasts its synopsis to all its neighbors within k hops. The larger k is, the more energy each sensor consumes in broadcasting its synopsis.
However, with a larger k each sensor may be able to cover more neighbors, so fewer r-nodes are needed to cover all the sensors, and energy is saved because fewer r-nodes have to answer queries. Deciding the optimal value of k is therefore an important issue, since k greatly affects the network lifetime.
In this subsection, we build a theoretical model for users to decide the optimal value of the parameter k. The value of k decided by this model can be used to analyze the performance of algorithms DCglobal and DClocal. We apply the ideas of marginal benefit and marginal cost from economics to model the extra energy saving and consumption caused by increasing the value of k.
To decide the optimal value of k, our idea is to increase the value of k until the marginal cost exceeds the marginal benefit.
To facilitate the following discussion, we make two assumptions: (1) the number of sensors follows a spatial Poisson process with parameter $\lambda$, and (2) the difference between the readings of two adjacent sensors is an independent and identically distributed (i.i.d.) Gaussian random variable with mean 0 and variance $\sigma^2$. Note that the value of $\sigma$ varies from one environment to another.
When the value of k is increased from t to t + 1, the additional cost for the network is the additional number of message transmissions each sensor needs to collect the synopses of its (t + 1)-hop neighbors. The message cost for a sensor to collect the synopsis of each (t + 1)-hop neighbor is t + 1. We define the marginal cost as the additional message transmissions that result from increasing the value of k. A formal definition of the marginal cost follows.
Definition 5: Let $n_t$ denote the average number of t-hop neighbors of each sensor, and let $MC_{t+1}$ denote the marginal cost of increasing the value of k from t to t + 1. Then
$$MC_{t+1} = n_{t+1} \times (t + 1). \tag{3.1}$$
Under a spatial Poisson process, the expected number of sensors in a region of area $A$ is $\lambda A$. Let r denote the maximal transmission range of a sensor; the maximal distance between a sensor and its t-hop neighbor is then $tr$. We therefore take the number of neighbors within t hops to be $\lambda \pi (tr)^2$. Thus, the average number of (t + 1)-hop neighbors is
$$n_{t+1} = \lambda \pi (t + 1)^2 r^2 - \lambda \pi t^2 r^2 = (2t + 1) \lambda \pi r^2.$$
From the above derivation, we can rewrite (3.1) as
$$MC_{t+1} = (2t + 1) \lambda \pi r^2 \times (t + 1) = \left(2t^2 + 3t + 1\right) \lambda \pi r^2.$$
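As a quick numerical illustration of the marginal cost, the following Python sketch evaluates the average number of (t + 1)-hop neighbors and the resulting marginal cost directly from the formulas above. The function names and the sample values of the density and the transmission range are illustrative assumptions, not part of the original model.

```python
import math

def avg_new_neighbors(t, lam, r):
    """Average number of (t+1)-hop neighbors: lam*pi*((t+1)r)^2 - lam*pi*(tr)^2."""
    return lam * math.pi * ((t + 1) * r) ** 2 - lam * math.pi * (t * r) ** 2

def marginal_cost(t, lam, r):
    """Marginal cost of raising k from t to t+1: each new neighbor costs t+1 messages."""
    return avg_new_neighbors(t, lam, r) * (t + 1)

if __name__ == "__main__":
    # Hypothetical deployment: 0.01 sensors per square meter, 10 m range.
    for t in range(0, 4):
        print(t, avg_new_neighbors(t, 0.01, 10.0), marginal_cost(t, 0.01, 10.0))
```

Because the cost grows quadratically in t while the number of new neighbors grows only linearly, each additional hop is more expensive per neighbor than the previous one.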
After defining the marginal cost, we are going to define the marginal benefit of increasing the value of k. When the value of k is increased from t to t + 1, the additional benefit for the network is the future saving of message transmissions in answering queries. The formal definition of marginal benefit is shown below.
Definition 6: Let q denote the average number of queries, $\hat{n}_t$ denote the expected number of t-hop neighbors that can be data-covered by each sensor, $dist_{sink}$ denote the average distance between a sensor and the sink, and $MB_{t+1}$ denote the marginal benefit of increasing the value of k from t to t + 1. Then
$$MB_{t+1} = q \times \hat{n}_{t+1} \times dist_{sink}.$$
To compute $\hat{n}_{t+1}$, we need the probability density function of the dissimilarity between a sensor i and its t-hop neighbor j. Since we have assumed that the difference between the readings of two adjacent sensors is an i.i.d. Gaussian random variable with mean 0 and variance $\sigma^2$, we can derive the distribution of the dissimilarity between i and j. Suppose that there is a path $\langle s_0 = i, s_1, s_2, \ldots, s_{t-1}, s_t = j \rangle$ connecting i and j, and that $x_{uv}$ is the random variable representing the difference between the readings of $s_u$ and $s_v$. The difference between the readings of i and j can be written as follows:
$$x_{0t} = x_{01} + x_{12} + \cdots + x_{(t-1)t}.$$
Let $P(x_{0t})$ denote the probability density function of the dissimilarity between a sensor i and its t-hop neighbor j. Recall that $\epsilon$ represents the pre-specified error threshold; we then have $\hat{n}_t = n_t \times P(x_{0t} \le \epsilon)$, and $\hat{n}_{t+1}$ is obtained by replacing t with t + 1.
According to the properties of the Gaussian distribution, $x_{0t}$, being the sum of t i.i.d. Gaussian terms, also follows a Gaussian distribution with mean 0 and variance $t\sigma^2$. Thus, we have the distribution of the difference between the readings of a sensor i and its t-hop neighbor j. Recall that we have defined the sufficient condition for sensor i to data-cover j by the inequality shown below:
$$\sqrt{(v_{i1} - v_{j1})^2 + (v_{i2} - v_{j2})^2 + \cdots + (v_{il} - v_{jl})^2} \le \epsilon. \tag{3.2}$$
If we can derive the probability density function of the dissimilarity between sensor i and j, we can get the probability that sensor i can data-cover j.
Let $x_{ijm} = v_{im} - v_{jm}$, i.e., the difference between the m-th readings of sensors i and j, for $1 \le m \le l$. Then (3.2) becomes:
$$x_{ij1}^2 + x_{ij2}^2 + \cdots + x_{ijl}^2 \le \epsilon^2. \tag{3.3}$$
By normalizing the random variable $x_{ijm}$, we define a new standard Gaussian random variable $y_{ijm} = \frac{x_{ijm}}{\sqrt{t}\,\sigma}$ with mean 0 and variance 1. Thus, we can rewrite (3.3) as follows:
$$y_{ij1}^2 + y_{ij2}^2 + \cdots + y_{ijl}^2 \le \frac{\epsilon^2}{t\sigma^2}.$$
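To facilitate our discussion, we define a new random variable $Y = y_{ij1}^2 + y_{ij2}^2 + \cdots + y_{ijl}^2$. Since $Y$ is a sum of squares of l standard Gaussian random variables (taken to be independent), $Y$ follows a chi-square distribution with l degrees of freedom. Thus, $P(x_{0t} \le \epsilon)$ is equal to
$$P\left(Y \le \frac{\epsilon^2}{t\sigma^2}\right) = \frac{\gamma\left(\frac{l}{2}, \frac{\epsilon^2}{2t\sigma^2}\right)}{\Gamma\left(\frac{l}{2}\right)}, \tag{3.4}$$
where $\Gamma$ denotes the Gamma function and $\gamma$ is the lower incomplete Gamma function.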
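The chi-square step can be sanity-checked by simulation. The sketch below draws, for each of l readings, the sum of t i.i.d. Gaussian per-hop differences, and counts how often the squared distance stays within the squared threshold. For l = 2 the chi-square distribution function has the closed form 1 − exp(−y/2), which gives a reference value without any special-function library. The function names, the sample parameters, and the assumption that the l readings vary independently of one another are illustrative choices, not statements from the original text.

```python
import math
import random

def empirical_cover_probability(t, eps, sigma, l=2, samples=20000, seed=0):
    """Monte Carlo estimate of the probability that a t-hop neighbor is data-covered."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(samples):
        sq_dist = 0.0
        for _ in range(l):
            # Difference of one reading across a t-hop path:
            # the sum of t i.i.d. N(0, sigma^2) per-hop differences.
            x = sum(rng.gauss(0.0, sigma) for _ in range(t))
            sq_dist += x * x
        if sq_dist <= eps * eps:
            covered += 1
    return covered / samples

def closed_form_cover_probability_l2(t, eps, sigma):
    """Chi-square CDF with 2 degrees of freedom evaluated at eps^2 / (t * sigma^2)."""
    return 1.0 - math.exp(-(eps ** 2) / (2.0 * t * sigma ** 2))

if __name__ == "__main__":
    print(empirical_cover_probability(3, 1.0, 0.5))
    print(closed_form_cover_probability_l2(3, 1.0, 0.5))
```

The two printed values should agree up to sampling noise; a large gap would suggest that one of the modeling assumptions (independence across hops or across readings) is being violated in the simulation.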
Thus, the probability that a sensor can data-cover a t-hop neighbor can be evaluated through (3.4). Multiplying this probability by the average number of t-hop neighbors gives the expected number of t-hop neighbors that a sensor can data-cover, from which the marginal benefit of increasing k from t − 1 to t follows by the definition above.
From the above discussion, users can decide the theoretically optimal value of k for their applications by comparing the marginal benefit with the marginal cost.
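The whole procedure can be sketched in a few lines of Python: starting from a small k, keep increasing it while the marginal benefit is at least the marginal cost. This is a minimal sketch under stated assumptions, not the authors' implementation. In particular, the function names, the starting value `k_min`, the cap `k_max`, and the series used for the incomplete Gamma ratio are all choices made here for illustration, and the cover probability for (t + 1)-hop neighbors is evaluated with variance (t + 1)σ², which is how this sketch reads the model.

```python
import math

def regularized_lower_gamma(s, x, max_terms=1000, tol=1e-15):
    """gamma(s, x) / Gamma(s) via the power series
    gamma(s, x) = x^s e^{-x} * sum_{n>=0} x^n / (s (s+1) ... (s+n)).
    Intended for moderate x only; a sketch, not a library-grade routine."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / s
    total = term
    for n in range(1, max_terms):
        term *= x / (s + n)
        total += term
        if term < tol * total:
            break
    log_prefactor = s * math.log(x) - x - math.lgamma(s)
    return min(1.0, total * math.exp(log_prefactor))

def cover_probability(hops, eps, sigma, l):
    """Chi-square CDF with l degrees of freedom at eps^2 / (hops * sigma^2)."""
    return regularized_lower_gamma(l / 2.0, eps ** 2 / (2.0 * hops * sigma ** 2))

def marginal_cost(t, lam, r):
    """MC_{t+1} = (2t^2 + 3t + 1) * lam * pi * r^2."""
    return (2 * t * t + 3 * t + 1) * lam * math.pi * r ** 2

def marginal_benefit(t, lam, r, q, dist_sink, eps, sigma, l):
    """MB_{t+1} = q * n_hat_{t+1} * dist_sink, with n_hat = n * cover probability."""
    n_next = (2 * t + 1) * lam * math.pi * r ** 2
    return q * n_next * cover_probability(t + 1, eps, sigma, l) * dist_sink

def optimal_k(lam, r, q, dist_sink, eps, sigma, l, k_min=1, k_max=50):
    """Increase k until the marginal cost exceeds the marginal benefit (or k_max)."""
    k = k_min
    while k < k_max and (
        marginal_benefit(k, lam, r, q, dist_sink, eps, sigma, l)
        >= marginal_cost(k, lam, r)
    ):
        k += 1
    return k

if __name__ == "__main__":
    # Hypothetical parameters chosen only to exercise the code.
    print(optimal_k(lam=0.01, r=10.0, q=10, dist_sink=5.0, eps=1.0, sigma=1.0, l=2))
```

Under these formulas the ratio of marginal benefit to marginal cost reduces to q × dist_sink × P / (t + 1), where P is the cover probability for (t + 1)-hop neighbors. Both P and 1/(t + 1) decrease as t grows, so the first value of k at which the cost exceeds the benefit is also the only crossover, and a simple linear scan is enough.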