Wiener Index for Variants of Digital Search Trees

4.2.3 Wiener Index for Variants of Digital Search Trees

In this section, we discuss results similar to those of Section 4.2.2 for other digital trees. The proofs follow along the same lines (and are even easier in some cases, since the Laplace transform is not needed) and will not be given. For the reader's convenience, the (differential-)functional equations for the poissonized means, variances and covariances, which are crucial to the proofs, are listed in Appendix A. Our results can be deduced from them with an approach similar to the one used in Section 4.2.2.

The first member of the digital tree family we discuss is the bucket digital search tree. Note that there are two types of total path length in bucket digital search trees: the sum of the distances of all keys to the root and the sum of the distances of all nodes to the root; the former is called the key-wise path length and the latter the node-wise path length (see [74] for more details). Accordingly, we also have a key-wise Wiener index and a node-wise Wiener index. Results for both Wiener indices in random bucket digital search trees will be presented below.

Another member of the digital tree family is the trie. Note that for tries, the number of leaves is n, whereas the number of internal nodes is random. Hence, there are again two different types of Wiener index, namely the external Wiener index, which only uses external nodes, and the internal Wiener index, which uses internal nodes. Again, both of these Wiener indices will be discussed below.
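To make the external quantities concrete, here is a minimal sketch that computes the external path length and the external Wiener index of a binary trie directly from its keys. It is an illustration only: the helper names are hypothetical, and it assumes at least two distinct binary keys, none of which is a prefix of another, so that every key sits in its own leaf.

```python
from itertools import combinations


def lcp(a: str, b: str) -> int:
    """Length of the longest common prefix of two bit strings."""
    k = 0
    for x, y in zip(a, b):
        if x != y:
            break
        k += 1
    return k


def external_path_length_and_wiener(keys):
    """Return (T, W) for the binary trie built on `keys`.

    Assumes at least two distinct keys, none a prefix of another.
    """
    n = len(keys)
    # A key's leaf hangs one level below its deepest branching point.
    depth = [1 + max(lcp(keys[i], keys[j]) for j in range(n) if j != i)
             for i in range(n)]
    total_path_length = sum(depth)
    # Two leaves branch apart at depth lcp, so their distance is
    # depth_i + depth_j - 2 * lcp.
    wiener = sum(depth[i] + depth[j] - 2 * lcp(keys[i], keys[j])
                 for i, j in combinations(range(n), 2))
    return total_path_length, wiener
```

For the keys 000, 010 and 100, the leaves have depths 2, 2 and 1, so the sketch returns the pair (5, 8).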

As a final member of the digital tree family, we consider PATRICIA tries. For binary PATRICIA tries, the number of internal nodes is not random, and hence only the external Wiener index makes sense. However, for m-ary PATRICIA tries with m > 2, the number of internal nodes is no longer deterministic, and hence the internal Wiener index is also of interest. We give the results for the internal Wiener index of m-ary PATRICIA tries at the end of this section.

As in Section 4.2.2, we denote by $T_n$ the total path length (key-wise, node-wise, external or internal, depending on the context) and by $W_n$ the Wiener index (again key-wise, node-wise, external or internal). Moreover, for the node-wise Wiener index and the internal Wiener index, we also need the number of nodes (internal nodes in the case of the internal Wiener index), which will be denoted by $N_n$.

Key-wise Wiener Index of Bucket Digital Search Trees.

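Here, we have the following distributional recurrences for $T_n$ and $W_n$: for $n \ge 0$,

\[
\begin{aligned}
T_{n+b} &\stackrel{d}{=} T_{B_n} + T^*_{n-B_n} + n,\\
W_{n+b} &\stackrel{d}{=} W_{B_n} + W^*_{n-B_n} + (B_n+1)\bigl(T^*_{n-B_n} + n - B_n\bigr) + (n-B_n+1)\bigl(T_{B_n} + B_n\bigr),
\end{aligned}
\]

where notation is as in Section 1 (starred quantities denote independent copies) and initial conditions are given by $T_0 = \cdots = T_{b-1} = W_0 = \cdots = W_{b-1} = 0$.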
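To see how such recurrences are used, the following minimal sketch computes the exact means $E(T_n)$ and $E(W_n)$ for small $n$ by conditioning on $B_n$. It is an illustration only: the function name `keywise_means` is hypothetical, and the splitting variable is assumed to satisfy $B_n \sim \mathrm{Binomial}(n, 1/2)$ (the symmetric Bernoulli model), which this section does not restate.

```python
from math import comb


def keywise_means(n_max: int, b: int = 1):
    """Exact E(T_n) and E(W_n) for 0 <= n <= n_max (key-wise, bucket size b).

    Assumption (not stated in this section): B_n ~ Binomial(n, 1/2).
    """
    mu = [0.0] * (n_max + 1)  # mu[n] = E(T_n); zero for n < b
    om = [0.0] * (n_max + 1)  # om[n] = E(W_n); zero for n < b
    for n in range(0, n_max - b + 1):
        et = float(n)
        ew = 0.0
        for k in range(n + 1):
            p = comb(n, k) / 2 ** n  # P(B_n = k)
            # Given B_n = k, the two subtrees are independent, so the
            # conditional expectations of the products factor.
            et += p * (mu[k] + mu[n - k])
            ew += p * (om[k] + om[n - k]
                       + (k + 1) * (mu[n - k] + n - k)
                       + (n - k + 1) * (mu[k] + k))
        mu[n + b] = et
        om[n + b] = ew
    return mu, om
```

For b = 1 this gives $E(T_3) = 2.5$ and $E(W_3) = 4$, in line with the two possible shapes of a three-node tree: a path (path length 3) and a cherry (path length 2), each occurring with probability 1/2 under the assumed model and each having Wiener index 4.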

From these recurrences, we obtain the following results for mean and variance.

Theorem 4.2.5. We have for the mean of the key-wise path length and key-wise Wiener index of bucket digital search trees,

\[
\begin{aligned}
E(T_n) &= n\log_2 n + nP_1(\log_2 n) + O(\log n),\\
E(W_n) &= n^2\log_2 n + n^2P_1(\log_2 n) - n^2 + O(n\log n),
\end{aligned}
\]

where $P_1(z)$ is a one-periodic function given in the remark below. Moreover, variances and covariances of the key-wise path length and key-wise Wiener index of bucket digital search trees are given by

\[
\begin{aligned}
\operatorname{Var}(T_n) &= nP_2(\log_2 n) + O(1),\\
\operatorname{Cov}(T_n, W_n) &= n^2P_2(\log_2 n) + O(n\log n),\\
\operatorname{Var}(W_n) &= n^3P_2(\log_2 n) + O(n^2\log n),
\end{aligned}
\]

where $P_2(z)$ is again a one-periodic function given in the remark below.
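As a quick consistency check (our own remark, under the additional assumption that $P_2$ is bounded away from zero), the leading terms above force the correlation coefficient of $T_n$ and $W_n$ to tend to one:

```latex
\[
\rho(T_n, W_n)
  = \frac{\operatorname{Cov}(T_n, W_n)}
         {\sqrt{\operatorname{Var}(T_n)\,\operatorname{Var}(W_n)}}
  \sim \frac{n^2 P_2(\log_2 n)}
            {\sqrt{n P_2(\log_2 n)\cdot n^3 P_2(\log_2 n)}}
  = 1 .
\]
```

This suggests that any joint Gaussian limit of the normalized pair must be degenerate, with both coordinates equal to the same standard normal variable.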

Remark 8. The results for the mean and variance of the key-wise path length were first obtained by Hubalek in [89]. In [74], we gave the following expressions for the periodic functions

P1(z) = γ− 1

and ˜f1,0(z) denotes the Poisson generating function of E(Tn).

Note that the result for the mean of the Wiener index also follows from [22].

Moreover, we have the following bivariate central limit theorem.

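Theorem 4.2.6. We have,

\[
\left( \frac{T_n - E(T_n)}{\sqrt{\operatorname{Var}(T_n)}},\ \frac{W_n - E(W_n)}{\sqrt{\operatorname{Var}(W_n)}} \right) \xrightarrow{d} (X, X),
\]

where $X$ is a standard normally distributed random variable and $\xrightarrow{d}$ denotes weak convergence.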

Remark 9. The central limit theorem for the key-wise path length was first proved in [90].

Node-wise Wiener Index of Bucket Digital Search Trees.

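Here, the distributional recurrences for $N_n$, $T_n$ and $W_n$ are given by: for $n \ge 0$,

\[
\begin{aligned}
N_{n+b} &\stackrel{d}{=} N_{B_n} + N^*_{n-B_n} + 1,\\
T_{n+b} &\stackrel{d}{=} T_{B_n} + T^*_{n-B_n} + N_{B_n} + N^*_{n-B_n},\\
W_{n+b} &\stackrel{d}{=} W_{B_n} + W^*_{n-B_n} + (N_{B_n}+1)\bigl(T^*_{n-B_n} + N^*_{n-B_n}\bigr) + (N^*_{n-B_n}+1)\bigl(T_{B_n} + N_{B_n}\bigr),
\end{aligned}
\]

where $B_n$ is as in Section 1, the triplet $(N^*_n, T^*_n, W^*_n)$ denotes an independent copy of $(N_n, T_n, W_n)$, and $(N_n, T_n, W_n)$ is independent of $(B_n)$. Initial conditions are given by $T_0 = \cdots = T_{b-1} = W_0 = \cdots = W_{b-1} = N_0 = 0$ and $N_1 = \cdots = N_{b-1} = 1$.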
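The recurrences can be unfolded directly into a sampler for the triple (number of nodes, node-wise path length, node-wise Wiener index). The sketch below is an illustration only: the function name `sample_nodewise` is hypothetical, and the splitting variable is assumed to be Binomial(m, 1/2), which this section does not restate.

```python
import random


def sample_nodewise(n: int, b: int = 1, rng=random):
    """Draw one sample (N_n, T_n, W_n) by unfolding the recurrences.

    Assumption (not stated in this section): B_m ~ Binomial(m, 1/2).
    """
    if n == 0:
        return 0, 0, 0
    if n < b:
        # Fewer keys than the bucket capacity: a single node.
        return 1, 0, 0
    m = n - b  # keys passed on to the two subtrees
    left = sum(rng.random() < 0.5 for _ in range(m))  # B_m
    n1, t1, w1 = sample_nodewise(left, b, rng)
    n2, t2, w2 = sample_nodewise(m - left, b, rng)
    size = n1 + n2 + 1
    path = t1 + t2 + n1 + n2
    wiener = w1 + w2 + (n1 + 1) * (t2 + n2) + (n2 + 1) * (t1 + n1)
    return size, path, wiener
```

For b = 1 and n = 3, every sample has three nodes and Wiener index 4, while the path length is 2 or 3 depending on whether the tree is a cherry or a path. The recursion depth is at most n, so the sketch is meant for moderate n only.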

From this, we obtain the following result.

Theorem 4.2.7. We have for the mean of the number of nodes, node-wise path length and node-wise Wiener index of bucket digital search trees,

\[
\begin{aligned}
E(N_n) &= nP_1(\log_2 n) + O(1),\\
E(T_n) &= n(\log_2 n)P_1(\log_2 n) + O(n),\\
E(W_n) &= n^2(\log_2 n)P_1(\log_2 n)^2 + O(n^2),
\end{aligned}
\]

where $P_1(z)$ is a one-periodic function given in the remark below. Moreover, variances and covariances of the number of nodes, node-wise path length and node-wise Wiener index of bucket digital search trees are given by

\[
\begin{aligned}
\operatorname{Var}(N_n) &= nP_2(\log_2 n) + O(1),\\
\operatorname{Cov}(N_n, T_n) &= n(\log_2 n)P_2(\log_2 n) + O(n),\\
\operatorname{Var}(T_n) &= n(\log_2 n)^2P_2(\log_2 n) + O(n\log n),\\
\operatorname{Cov}(N_n, W_n) &= 2n^2(\log_2 n)P_1(\log_2 n)P_2(\log_2 n) + O(n^2),\\
\operatorname{Cov}(T_n, W_n) &= 2n^2(\log_2 n)^2P_1(\log_2 n)P_2(\log_2 n) + O(n^2\log n),\\
\operatorname{Var}(W_n) &= 4n^3(\log_2 n)^2P_1(\log_2 n)^2P_2(\log_2 n) + O(n^3\log n),
\end{aligned}
\]

where $P_2(z)$ is again a one-periodic function given in the remark below.
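The constants 2 and 4 in these expansions are exactly what is needed for all three pairwise correlations to tend to one. The following check is our own remark; it writes $L = \log_2 n$, abbreviates $P_i = P_i(\log_2 n)$, and assumes that $P_1$ and $P_2$ are bounded away from zero:

```latex
\[
\rho(N_n, T_n)^2 \sim \frac{(nLP_2)^2}{nP_2 \cdot nL^2P_2} = 1,
\qquad
\rho(N_n, W_n)^2 \sim \frac{(2n^2LP_1P_2)^2}{nP_2 \cdot 4n^3L^2P_1^2P_2} = 1,
\]
\[
\rho(T_n, W_n)^2 \sim \frac{(2n^2L^2P_1P_2)^2}{nL^2P_2 \cdot 4n^3L^2P_1^2P_2} = 1 .
\]
```

In other words, to leading order the covariance matrix of $(N_n, T_n, W_n)$ is singular.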

Remark 10. The results for the number of nodes were first proved in [90].

Moreover, the results were reproved in [74] where in addition we also proved the results for the node-wise path length and gave the following expressions for P1(z) and P2(z)

and ˜f1,0(z) denotes the Poisson generating function of E(Tn).

Theorem 4.2.7 yields the following trivariate central limit theorem.

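Theorem 4.2.8. We have,

\[
\left( \frac{N_n - E(N_n)}{\sqrt{\operatorname{Var}(N_n)}},\ \frac{T_n - E(T_n)}{\sqrt{\operatorname{Var}(T_n)}},\ \frac{W_n - E(W_n)}{\sqrt{\operatorname{Var}(W_n)}} \right) \xrightarrow{d} (X, X, X),
\]

where $X$ is a standard normally distributed random variable and $\xrightarrow{d}$ denotes weak convergence.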

Remark 11. The central limit theorem for the number of nodes was first proved in [90]. Also note that we posed the problem of proving a bivariate central limit law of number of nodes and node-wise path length in Section 5 of [74].

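External Wiener Index of Tries. Here, the distributional recurrences for $T_n$ and $W_n$ are as follows: for $n \ge 2$,

\[
\begin{aligned}
T_n &\stackrel{d}{=} T_{B_n} + T^*_{n-B_n} + n,\\
W_n &\stackrel{d}{=} W_{B_n} + W^*_{n-B_n} + B_n\bigl(T^*_{n-B_n} + n - B_n\bigr) + (n-B_n)\bigl(T_{B_n} + B_n\bigr),
\end{aligned}
\]

where notation is as in Section 1 (starred quantities denote independent copies) and initial conditions are given by $T_0 = T_1 = W_0 = W_1 = 0$.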
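Unlike in the digital search tree case, the index n appears on both sides of these recurrences (when the split is trivial, all keys go to one side), so the means have to be solved for. The sketch below does this for small n. It is an illustration only: the function name `trie_external_means` is hypothetical, and the splitting variable is assumed to be Binomial(n, 1/2), which this section does not restate.

```python
from math import comb


def trie_external_means(n_max: int):
    """Exact E(T_n) and E(W_n) for tries, 0 <= n <= n_max.

    Assumption (not stated in this section): B_n ~ Binomial(n, 1/2).
    """
    mu = [0.0] * (n_max + 1)  # mu[n] = E(T_n)
    om = [0.0] * (n_max + 1)  # om[n] = E(W_n)
    for n in range(2, n_max + 1):
        p_edge = 2.0 ** (1 - n)  # P(B_n = 0) + P(B_n = n)
        et = float(n)            # the toll n is paid for every split
        ew = 0.0
        for k in range(1, n):
            p = comb(n, k) / 2 ** n  # P(B_n = k)
            et += p * (mu[k] + mu[n - k])
            ew += p * (om[k] + om[n - k]
                       + k * (mu[n - k] + n - k)
                       + (n - k) * (mu[k] + k))
        # For B_n in {0, n} the right-hand side contributes mu[n] (resp.
        # om[n]) again; move that term to the left and divide.
        mu[n] = et / (1.0 - p_edge)
        om[n] = ew / (1.0 - p_edge)
    return mu, om
```

For instance, the sketch returns $E(T_2) = 4$ and $E(W_2) = 2$: two keys always end up in sibling leaves at distance 2, at an expected depth of 2.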

From this, we obtain the following theorem.

Theorem 4.2.9. We have for the mean of the external path length and external Wiener index of tries,

\[
\begin{aligned}
E(T_n) &= n\log_2 n + nP_1(\log_2 n) + O(\log n),\\
E(W_n) &= n^2\log_2 n + n^2P_1(\log_2 n) - n^2 + O(n\log n),
\end{aligned}
\]

where $P_1(z)$ is a one-periodic function given in the remark below. Moreover, variances and covariances of the external path length and external Wiener index of tries are given by

\[
\begin{aligned}
\operatorname{Var}(T_n) &= nP_2(\log_2 n) + O(1),\\
\operatorname{Cov}(T_n, W_n) &= n^2P_2(\log_2 n) + O(n\log n),\\
\operatorname{Var}(W_n) &= n^3P_2(\log_2 n) + O(n^2\log n),
\end{aligned}
\]

where $P_2(z)$ is again a one-periodic function given in the remark below.

Remark 12. The result about the mean of the total path length was first obtained in [127]. A detailed analysis of the variance of the total path length was first undertaken by Kirschenhofer, Prodinger and Szpankowski [119] (see also Jacquet and Régnier [94] for preliminary results). In Hwang, Fuchs and Zacharovas [75], we obtained the following expressions for the periodic functions

\[
P_1(z) = \frac{\gamma}{\log 2} + \frac{1}{2} - \frac{1}{\log 2}\sum_{k\ne 0}\Gamma(-\chi_k)e^{2k\pi i z}
\]

and

\[
P_2(z) = \frac{1}{\log 2}\sum_{k} G_2(-1-\chi_k)e^{2k\pi i z},
\]

where

\[
G_2(\omega) = \Gamma(\omega+1)\left(1 - \frac{\omega^2+\omega+4}{2^{\omega+3}}\right) + 2\sum_{l\ge 1}\frac{(-1)^l\,\Gamma(\omega+l+1)}{l!\,(2^l-1)}\bigl(l(\omega+l)-1\bigr).
\]

Note that the result about the mean of the Wiener index also follows from [22].
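The mean value of $P_2$ is its $k = 0$ Fourier coefficient $G_2(-1)/\log 2$, which can be evaluated numerically. The sketch below is our own illustration under stated assumptions: $\chi_0 = 0$ (as for the usual $\chi_k = 2k\pi i/\log 2$), the bracket in $G_2$ is read as $1 - (\omega^2+\omega+4)/2^{\omega+3}$, and the limit $\omega \to -1$ is taken by hand, since $\Gamma(\omega+1)$ has a pole there that the bracket cancels. The function name is hypothetical.

```python
from math import log


def mean_of_P2_external(terms: int = 80) -> float:
    """Approximate G_2(-1) / log 2 for the external path length of tries.

    Hand-computed limits at omega = -1 (our own simplification):
      Gamma(omega+1) * (1 - (omega^2+omega+4)/2^(omega+3)) -> 1/4 + log 2,
      l-th summand -> (-1)^l * (l^2 - l - 1) / (l * (2^l - 1)).
    """
    series = sum((-1) ** l * (l * l - l - 1) / (l * (2 ** l - 1))
                 for l in range(1, terms + 1))
    return (0.25 + log(2) + 2.0 * series) / log(2)
```

Under these assumptions the value comes out at about 4.35, so on average the variance of the external path length grows like 4.35 n.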

From the previous result, we again obtain the following theorem.

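Theorem 4.2.10. We have,

\[
\left( \frac{T_n - E(T_n)}{\sqrt{\operatorname{Var}(T_n)}},\ \frac{W_n - E(W_n)}{\sqrt{\operatorname{Var}(W_n)}} \right) \xrightarrow{d} (X, X),
\]

where $X$ is a standard normally distributed random variable and $\xrightarrow{d}$ denotes weak convergence.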

Remark 13. The central limit theorem for the key-wise path length was first proved in [94].

Internal Wiener Index of Tries.

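Here, the distributional recurrences for $N_n$, $T_n$ and $W_n$ are as follows: for $n \ge 2$,

\[
\begin{aligned}
N_n &\stackrel{d}{=} N_{B_n} + N^*_{n-B_n} + 1,\\
T_n &\stackrel{d}{=} T_{B_n} + T^*_{n-B_n} + N_{B_n} + N^*_{n-B_n},\\
W_n &\stackrel{d}{=} W_{B_n} + W^*_{n-B_n} + (N_{B_n}+1)\bigl(T^*_{n-B_n} + N^*_{n-B_n}\bigr) + (N^*_{n-B_n}+1)\bigl(T_{B_n} + N_{B_n}\bigr),
\end{aligned}
\]

where notation is as for the node-wise Wiener index (starred quantities denote independent copies) and initial conditions are given by $N_0 = N_1 = T_0 = T_1 = W_0 = W_1 = 0$.

Then, we have the following result for mean values, variances and covariances.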

Theorem 4.2.11. We have for the mean of the number of internal nodes, internal path length and internal Wiener index of tries,

\[
\begin{aligned}
E(N_n) &= nP_1(\log_2 n) + O(1),\\
E(T_n) &= n(\log_2 n)P_1(\log_2 n) + O(n),\\
E(W_n) &= n^2(\log_2 n)P_1(\log_2 n)^2 + O(n^2),
\end{aligned}
\]

where $P_1(z)$ is a one-periodic function given in the remark below. Moreover, variances and covariances of the number of internal nodes, internal path length and internal Wiener index of tries are given by

\[
\begin{aligned}
\operatorname{Var}(N_n) &= nP_2(\log_2 n) + O(1),\\
\operatorname{Cov}(N_n, T_n) &= n(\log_2 n)P_2(\log_2 n) + O(n),\\
\operatorname{Var}(T_n) &= n(\log_2 n)^2P_2(\log_2 n) + O(n\log n),\\
\operatorname{Cov}(N_n, W_n) &= 2n^2(\log_2 n)P_1(\log_2 n)P_2(\log_2 n) + O(n^2),\\
\operatorname{Cov}(T_n, W_n) &= 2n^2(\log_2 n)^2P_1(\log_2 n)P_2(\log_2 n) + O(n^2\log n),\\
\operatorname{Var}(W_n) &= 4n^3(\log_2 n)^2P_1(\log_2 n)^2P_2(\log_2 n) + O(n^3\log n),
\end{aligned}
\]

where $P_2(z)$ is again a one-periodic function given in the remark below.

Remark 14. The result for the mean of the number of internal nodes was first proved in [127]. The variance of the number of internal nodes was first derived by Régnier and Jacquet [95] (see also [94], [93]). In [75], we gave the following expressions for the periodic functions

\[
P_1(z) = \frac{1}{\log 2} + \frac{1}{\log 2}\sum_{k\ne 0}\chi_k\Gamma(-1-\chi_k)e^{2k\pi i z}
\]

and

\[
P_2(z) = \frac{1}{\log 2}\sum_{k} G_2(-1-\chi_k)e^{2k\pi i z},
\]

where

\[
G_2(\omega) = (\omega+1)\Gamma(\omega)\left(1 - \frac{\omega^2+4\omega+8}{2^{\omega+3}}\right) + 2\sum_{l\ge 1}\frac{(-1)^l\,l\,\Gamma(\omega+l+1)}{(l+1)!\,(2^l-1)}\bigl(l(\omega+l+1)-1\bigr).
\]

The results for mean and variance of internal path length and covariance with the number of internal nodes are due to Nguyen-The [162].
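The mean value of $P_2$ for the number of internal nodes is its $k = 0$ Fourier coefficient $G_2(-1)/\log 2$, which can be evaluated numerically. The sketch below is our own illustration under stated assumptions: $\chi_0 = 0$ (as for the usual $\chi_k = 2k\pi i/\log 2$), the bracket in $G_2$ is read as $1 - (\omega^2+4\omega+8)/2^{\omega+3}$, and the limit $\omega \to -1$ is taken by hand. The function name is hypothetical.

```python
from math import log


def mean_of_P2_internal(terms: int = 80) -> float:
    """Approximate G_2(-1) / log 2 for the number of internal nodes of tries.

    Hand-computed limits at omega = -1 (our own simplification):
      (omega+1) * Gamma(omega) -> -1 and the bracket -> 1 - 5/4 = -1/4,
      so the first term equals 1/4;
      l-th summand -> (-1)^l * (l - 1) / (2^l - 1).
    """
    series = sum((-1) ** l * (l - 1) / (2 ** l - 1)
                 for l in range(1, terms + 1))
    return (0.25 + 2.0 * series) / log(2)
```

Under these assumptions the value comes out at about 0.846, so on average the variance of the number of internal nodes grows like 0.846 n.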

As before, we have a central limit theorem which now reads as follows.

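Theorem 4.2.12. We have,

\[
\left( \frac{N_n - E(N_n)}{\sqrt{\operatorname{Var}(N_n)}},\ \frac{T_n - E(T_n)}{\sqrt{\operatorname{Var}(T_n)}},\ \frac{W_n - E(W_n)}{\sqrt{\operatorname{Var}(W_n)}} \right) \xrightarrow{d} (X, X, X),
\]

where $X$ is a standard normally distributed random variable and $\xrightarrow{d}$ denotes weak convergence.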

Remark 15. The central limit theorem for the number of internal nodes was first proved in [93] and [94]. The bivariate central limit theorem for the number of internal nodes and the internal path length was wrongly stated in [162] (the author of this work did not observe that the covariance matrix is singular leading to a wrong proof).

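External Wiener Index of Binary PATRICIA tries. Here, we have for $T_n$ and $W_n$: for $n \ge 2$,

\[
T_n \stackrel{d}{=}
\begin{cases}
T_{B_n} + T^*_{n-B_n} + n, & \text{if } B_n \ne 0 \text{ and } B_n \ne n;\\
T_n, & \text{otherwise},
\end{cases}
\]

\[
W_n \stackrel{d}{=}
\begin{cases}
W_{B_n} + W^*_{n-B_n} + B_n\bigl(T^*_{n-B_n} + n - B_n\bigr) + (n-B_n)\bigl(T_{B_n} + B_n\bigr), & \text{if } B_n \ne 0 \text{ and } B_n \ne n;\\
W_n, & \text{otherwise},
\end{cases}
\]

where notation is as in Section 1 (starred quantities denote independent copies) and $T_0 = T_1 = W_0 = W_1 = 0$.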

Then, we have the following result.

Theorem 4.2.13. We have for the mean of the total path length and Wiener index of PATRICIA tries,

\[
\begin{aligned}
E(T_n) &= n\log_2 n + nP_1(\log_2 n) + O(\log n),\\
E(W_n) &= n^2\log_2 n + n^2P_1(\log_2 n) - n^2 + O(n\log n),
\end{aligned}
\]

where $P_1(z)$ is a one-periodic function given in the remark below. Moreover, variances and covariances of the total path length and Wiener index of PATRICIA tries are given by

\[
\begin{aligned}
\operatorname{Var}(T_n) &= nP_2(\log_2 n) + O(1),\\
\operatorname{Cov}(T_n, W_n) &= n^2P_2(\log_2 n) + O(n\log n),\\
\operatorname{Var}(W_n) &= n^3P_2(\log_2 n) + O(n^2\log n),
\end{aligned}
\]

where $P_2(z)$ is again a one-periodic function given in the remark below.

Remark 16. The result for the mean of the external path length was first derived in [127]. The result for the variance of the total path length is due to Kirschenhofer, Prodinger and Szpankowski [118]. In [75], we obtained the following expressions for the periodic functions

P1(z) = γ− 1 log 2 + 1

log 2

k̸=0

Γ(−χk)e2kπiz

and

The latter result again implies the following bivariate central limit theorem.

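Theorem 4.2.14. We have,

\[
\left( \frac{T_n - E(T_n)}{\sqrt{\operatorname{Var}(T_n)}},\ \frac{W_n - E(W_n)}{\sqrt{\operatorname{Var}(W_n)}} \right) \xrightarrow{d} (X, X),
\]

where $X$ is a standard normally distributed random variable and $\xrightarrow{d}$ denotes weak convergence.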

Remark 17. To the best of our knowledge, this result was first obtained by Neininger and Rüschendorf in [160].

Internal Wiener Index of m-ary PATRICIA tries.

First, observe that the number of internal nodes, the internal path length and the internal Wiener index satisfy the following distributional recurrences for $n \ge 2$:

Nn =d

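Theorem 4.2.15. Consider m-ary PATRICIA tries built on strings with digits from the alphabet $\mathcal{S} = \{a_1, \ldots, a_m\}$. Suppose that the probability for a digit of the random string being $a_i$ is $p_i$ for all $1 \le i \le m$. Set $h = -\sum_{i=1}^{m} p_i\log p_i$. Then, for the mean of the number of internal nodes, internal path length and internal Wiener index of m-ary PATRICIA tries, we have, as $n \to \infty$,

\[
\begin{aligned}
E(N_n) &\sim nP_1(\log_{1/a} n),\\
E(T_n) &\sim h^{-1}n\log n\,P_1(\log_{1/a} n),\\
E(W_n) &\sim h^{-1}n^2\log n\,P_1(\log_{1/a} n)^2,
\end{aligned}
\]

where $P_1(z)$ is a one-periodic function. Moreover, variances and covariances of the number of internal nodes, internal path length and internal Wiener index of m-ary PATRICIA tries are given by

\[
\begin{aligned}
\operatorname{Var}(N_n) &\sim nP_2(\log_{1/a} n),\\
\operatorname{Cov}(N_n, T_n) &\sim h^{-1}n\log n\,P_2(\log_{1/a} n),\\
\operatorname{Var}(T_n) &\sim h^{-2}n\log^2 n\,P_2(\log_{1/a} n),\\
\operatorname{Cov}(N_n, W_n) &\sim 2h^{-1}n^2\log n\,P_1(\log_{1/a} n)P_2(\log_{1/a} n),\\
\operatorname{Cov}(T_n, W_n) &\sim 2h^{-2}n^2\log^2 n\,P_1(\log_{1/a} n)P_2(\log_{1/a} n),\\
\operatorname{Var}(W_n) &\sim 4h^{-2}n^3\log^2 n\,P_1(\log_{1/a} n)^2P_2(\log_{1/a} n),
\end{aligned}
\]

where $P_2(z)$ is again a one-periodic function. In particular,

\[
\rho(N_n, T_n) \longrightarrow 1, \qquad \rho(N_n, W_n) \longrightarrow 1, \qquad \rho(T_n, W_n) \longrightarrow 1,
\]

where $\rho(\cdot, \cdot)$ denotes the correlation coefficient.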

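Theorem 4.2.16. We have,

\[
\left( \frac{N_n - E(N_n)}{\sqrt{\operatorname{Var}(N_n)}},\ \frac{T_n - E(T_n)}{\sqrt{\operatorname{Var}(T_n)}},\ \frac{W_n - E(W_n)}{\sqrt{\operatorname{Var}(W_n)}} \right) \xrightarrow{d} (X, X, X).
\]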