Numerical Results - Probabilistic Analysis of Digital Search Trees

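In this section, we discuss the computation of the numerical value of G∗(2). First, we separate the integral into three parts according to the ranges of the two integration variables z and s,

$$\int_0^\infty\!\!\int_0^\infty = \int_0^T\!\!\int_N^\infty + \int_0^T\!\!\int_0^N + \int_T^\infty\!\!\int_0^\infty,$$

where in each term the outer integral is over z and the inner one over s, and the integrand is the same throughout.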

Now we consider the first part of the integral. Since $g(z)$ is given by (3.12) and $\tilde f_1^{(j)}(0) = 0$ for $j \le b$ and $\tilde f_1^{(b+1)}(0) = 1$, we have

for large s. Thus, from this and (2.21) we obtain, as N tends to infinity,

This means that if we choose N large enough, then the first part can be safely neglected.

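Next, we need a good way of computing $g(z)$ for the second integral. We first consider $\tilde f_1(z)$. Since $\tilde f_1(z) = e^{-z}\sum_{n\ge b+1}\mu_n z^n/n!$, we can truncate the series,

$$\tilde f_1(z) \approx e^{-z}\sum_{n=b+1}^{N}\mu_n\frac{z^n}{n!}, \tag{3.16}$$

and the error term introduced by this approximation is of order

$$\frac{e^{-z}(\log N)\,z^{N+1}}{N!},$$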

which is very small if we choose T small enough relative to N, say $N \ge eT$. So, in order to compute $\tilde f_1(z)$, we just have to generate $\mu_{b+1}, \ldots, \mu_N$ (this can be done via the recurrence satisfied by $\mu_n$) and then use (3.16). Since a similar approach also works for the derivatives of $\tilde f_1(z)$, this can be used to compute $g(z)$ as well.
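As a rough illustration of this step, the following Python sketch (it is not the Maple program of the thesis) generates the numbers µ_n from the recurrence exactly as it is written in the Maple program of this section, and then evaluates the truncated series e^{-z} Σ µ_n z^n / n!. The function names `mu_sequence` and `f1_tilde` are hypothetical, and the values b = 2, N = 120 are simply the ones used in that program.

```python
from fractions import Fraction
from math import comb, exp


def mu_sequence(b, n_max):
    """Return [mu_0, ..., mu_{n_max}] as exact fractions.

    Recurrence as written in the Maple program of this section:
        mu_1 = ... = mu_b = 0,
        mu_{y+b} = y + 2^(1-y) * sum_{i=1}^{y} binomial(y, i) * mu_i.
    Index 0 is unused and kept at 0 (an assumption made for convenience).
    """
    mu = [Fraction(0)] * (n_max + 1)
    for y in range(1, n_max - b + 1):
        s = sum(comb(y, i) * mu[i] for i in range(1, y + 1))
        mu[y + b] = y + Fraction(2) ** (1 - y) * s
    return mu


def f1_tilde(z, mu):
    """Evaluate the truncated series e^{-z} * sum_{n>=1} mu_n z^n / n!."""
    total = 0.0
    term = 1.0  # holds z^n / n!, updated incrementally
    for n in range(1, len(mu)):
        term *= z / n
        total += float(mu[n]) * term
    return exp(-z) * total


if __name__ == "__main__":
    mu = mu_sequence(b=2, n_max=120)
    print([str(m) for m in mu[:7]])
    print(f1_tilde(1.0, mu))
```

Exact rational arithmetic is used for the recurrence so that no rounding error accumulates before the series is summed in floating point.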

Finally, we consider the last integral, the one over $z \ge T$. We use Watson’s lemma to get an asymptotic expansion of the Laplace transform of s/ϕ(2s):

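Lemma 9. Consider a Laplace integral $\int_0^\infty f(s)e^{-zs}\,ds$ and assume

(i) $f(s)$ has a power series expansion $f(s) = \sum_{n\ge 0} f_n s^n$ which converges for $|s| < R$;

(ii) there exists an $\alpha > 0$ such that $f(s) = O(e^{\alpha s})$ as $s \to \infty$.

Then, as $z \to \infty$,

$$\int_0^\infty f(s)e^{-zs}\,ds \sim \sum_{n\ge 0} f_n\,\frac{n!}{z^{n+1}}.$$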
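Watson's lemma replaces the Laplace integral, for large z, by the series Σ f_n n!/z^{n+1} built from the Taylor coefficients f_n of f (this is the form used for `watson` in the Maple program of this section). A minimal Python sketch on a toy example, f(s) = 1/(1+s) with f_n = (−1)^n, compares a partial sum of that series with a direct numerical integration. The toy function, the helper names, and the Simpson-rule parameters are all assumptions chosen for illustration only.

```python
from math import exp, factorial


def laplace_numeric(f, z, upper=5.0, steps=20000):
    """Composite Simpson approximation of int_0^upper f(s) e^{-z s} ds.

    `steps` must be even; the tail beyond `upper` is ignored, which is
    harmless here because e^{-z * upper} is negligible for z = 10.
    """
    h = upper / steps
    total = f(0.0) + f(upper) * exp(-z * upper)
    for k in range(1, steps):
        s = k * h
        total += (4 if k % 2 else 2) * f(s) * exp(-z * s)
    return total * h / 3


def watson_partial_sum(coeffs, z):
    """Partial sum  sum_n coeffs[n] * n! / z^(n+1)  of the Watson expansion."""
    return sum(c * factorial(n) / z ** (n + 1) for n, c in enumerate(coeffs))


f = lambda s: 1.0 / (1.0 + s)           # Taylor coefficients (-1)^n for |s| < 1
coeffs = [(-1) ** n for n in range(6)]  # f_0, ..., f_5
z = 10.0
numeric = laplace_numeric(f, z)
asymptotic = watson_partial_sum(coeffs, z)

if __name__ == "__main__":
    print(numeric, asymptotic)
```

For this particular alternating example the difference between the two values stays below the first omitted term, 6!/z^7. In general the series is only asymptotic, so adding more terms at a fixed z need not improve the approximation.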

We still need an asymptotic expansion of $g(z)$ for z large. Again first consider $\tilde f_1(z)$. From this and the expression above, we get

F (s) ≈ϕe₀

Here, we have dropped all terms with $s^i$, $i \ge 0$ (their contribution to $\tilde f_1(z)$ is negligible). Then, by inverse Laplace transform,

fe₁(z) ∼ ϕe₀

By differentiation we obtain similar expansions for the derivatives of $\tilde f_1(z)$ and hence for $g(z)$.

Plugging this expansion, together with the expansion obtained from Watson’s lemma for the Laplace transform of s/ϕ(2s), into (3.17) and integrating then yields an approximation of (3.17) up to arbitrarily large order. Choosing sufficiently many terms will then give a good approximation even for very small T.

Finally, we explain how to compute $\hat G^*(2+\chi_j)$ involved in (3.18). First notice that, due to the very fast decay of $\hat G^*(2+\chi_j)$, only j = 0, 1, −1 are needed. For the computation we use the following result, which is due to Flajolet and Richmond.

Lemma 10. The function $J(s)$ defined in (2.26) admits the representation

$$J(s) = A_0(2^s) + (s-1)A_1(2^s) + \cdots + (s-1)(s-2)\cdots(s-b+1)A_{b-1}(2^s),$$

where the $A_k(x)$ are entire functions, and

$$A_k(x) = \frac{(-1)^k}{k!}\cdot\frac{1}{Q_\infty^b}\sum_{j=0}^{\infty}\frac{(-1)^{jb}\,2^{-bj(j+1)/2}}{Q_j^b}\,Y_{b-1-k}(j)\,\bigl(x\,2^{b-1-k}\bigr)^j,$$

with $Q_m = \prod_{l=1}^{m}(1-2^{-l})$, $Q_\infty = \prod_{l=1}^{\infty}(1-2^{-l})$, and the $Y_\beta(j)$ are defined by

$$\sum_{\beta\ge 0} Y_\beta(j)\,w^\beta = \exp\Biggl(b\sum_{\alpha\ge 1}\frac{(-1)^\alpha w^\alpha}{\alpha}\sum_{l\ne j}\frac{1}{(2^l-2^j)^\alpha}\Biggr).$$
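The constant Q_∞ of Lemma 10 enters the Maple program of this section as `q`, where it is computed from the alternating series Σ_{j≥0} (−1)^j 2^{−j(j+1)/2}/Q_j rather than from the product. A small Python sketch can confirm that the two forms agree numerically. It is only a sanity check under the stated truncations (120 factors, 61 terms), and the function names are hypothetical.

```python
from fractions import Fraction


def Q(m):
    """Q_m = prod_{l=1}^{m} (1 - 2^{-l}), computed exactly."""
    result = Fraction(1)
    for l in range(1, m + 1):
        result *= 1 - Fraction(1, 2 ** l)
    return result


def q_series(terms=60):
    """sum_{j=0}^{terms} (-1)^j 2^{-j(j+1)/2} / Q_j, the form used for q."""
    total = Fraction(0)
    for j in range(terms + 1):
        total += Fraction((-1) ** j, 2 ** (j * (j + 1) // 2)) / Q(j)
    return total


q_from_product = float(Q(120))    # truncated product, as a stand-in for Q_infinity
q_from_series = float(q_series()) # truncated alternating series

if __name__ == "__main__":
    print(q_from_product, q_from_series)
```

The series form converges much faster than the product because its terms decay like 2^{−j²/2}, which may be why the program prefers it.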

Overall, by incorporating all the above ideas, we obtain the following program for computing the required values.

> b := 2; T := 30; N := 4*T;

> # mu[n]: coefficients of the truncated series (3.16), generated by their recurrence
> mu := vector(120);
  for y from 1 to b do mu[y] := 0 end do;
> for y from 1 to 120-b do
    mu[y+b] := y + 2^(1-y)*(sum(binomial(y, i)*mu[i], i=1..y))
  end do;
> fnew := exp(-z)*(sum(mu[i]*z^i/i!, i=1..120));
> f := vector(b);
  for y from 1 to b do f[y] := diff(fnew, z$(y+1)) end do;

> # second part of the integral: z in [0, T], s in [0, N]
> evalf(evalf(Int(s*exp(-z*s)*((2*z+1)*f[1]^2+z*f[2]^2+4*z*f[1]*f[2])/
    evalf(Product(1+s/2^i, i=0..60))^b, [z=0..T, s=0..N]))/ln(2));

0.1300797679

> # power series coefficients needed for the large-z expansion
> with(PolynomialTools);
  phi := sum(sum(b*(-1)^(n-1)*s^n/(n*(2^n-1)), n=1..12)^i/i!, i=0..12);
  coeff_phi := CoefficientVector(expand(phi), s);

> # quantities of Lemma 10: q = Q_infinity, coeff_y = Y_beta(j), a[k+1] = A_k
> q := evalf(add((-1)^j*2^(-binomial(j+1, 2))/evalf(Product(1-2^(-i), i=1..j)), j=0..60));
  e := sum((-1)^i*w^i*(sum(1/(2^d-2^j)^i, d=0..j-1)+sum(1/(2^d-2^j)^i, d=j+1..60))/i, i=1..b);
> coeff_y := CoefficientVector(expand(sum((b*e)^i/i!, i=0..b-1)), w);
> a := vector(b);
  for k from 0 to b-1 do
    a[k+1] := evalf((-1)^k*add((-1)^(b*j)*2^(-(1/2)*b*j*(j+1))*coeff_y[b-k]*2^((b-1-k)*j)
      /(Product(1-2^(-i), i=1..j))^b, j=0..60)/(k!*q^b))
  end do;
> G := s -> evalf(Pi*(sum((product(s-j, j=1..i-1))*a[i], i=1..b))/sin(Pi*s));

> # large-z expansion of fnew, including the periodic terms for j = 1 and j = -1
> chi := (2*Pi*I)/log(2);
  func_vec := vector(13);
  func_vec[1] := z*(-1+gamma+ln(z)) + G(chi)*z^(1+chi)/GAMMA(2+chi) + G(-chi)*z^(1-chi)/GAMMA(2-chi);
  func_vec[2] := gamma+ln(z) + G(chi)*z^chi/GAMMA(1+chi) + G(-chi)*z^(-chi)/GAMMA(1-chi);
  for k from 3 to 13 do
    func_vec[k] := z^(-k+2)*(-1)^(k-3)*(k-3)! + G(chi)*z^(-k+2+chi)/GAMMA(-k+3+chi)
      + G(-chi)*z^(-k+2-chi)/GAMMA(-k+3-chi)
  end do;
> gnew := 0;
  for k from 3 to 13 do
    gnew := gnew + coeff_phi[k]*func_vec[k]
  end do;
  gnew := gnew/ln(2);
> g := vector(b);
  for k from 1 to b do g[k] := diff(gnew, z$(k+1)) end do;

> # Watson's lemma applied to s/phi(2s)
> phi2 := s*(sum((sum(-b*(-1)^(n-1)*2^n*s^n/(n*(2^n-1)), n=1..12))^i/i!, i=0..12));
  coeff_phi_2 := CoefficientVector(expand(phi2), s);
> watson := 0;
  for k from 1 to 13 do
    watson := watson + coeff_phi_2[k+1]*k!/z^(k+1)
  end do;

> # third part of the integral: z in [T, infinity)
> evalf(Int(((2*z+1)*g[1]^2+z*g[2]^2+4*z*g[1]*g[2])*watson/log(2), z=T..infinity, method = _CCquad));

0.002522975851 + 1.220918538 · 10⁻¹⁹ I

> 0.002522975851 + 0.1300797679;

0.1326027438
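The program above truncates the infinite product in the integrand at i = 60. The following Python sketch estimates how much that truncation matters on the range 0 ≤ s ≤ N = 120. It assumes, based on the integrand of the program, that ϕ(2s) stands for the product ∏_{i≥0} (1 + s/2^i)^b. The function name `phi_at_2s` is hypothetical.

```python
def phi_at_2s(s, b, terms):
    """Truncated product  prod_{i=0}^{terms} (1 + s / 2^i)^b."""
    value = 1.0
    for i in range(terms + 1):
        value *= 1.0 + s / 2.0 ** i
    return value ** b


def relative_truncation_gap(s, b, short=60, long=200):
    """Relative difference between the product cut at `short` and at `long`."""
    reference = phi_at_2s(s, b, long)
    return abs(phi_at_2s(s, b, short) - reference) / reference


if __name__ == "__main__":
    for s in (1.0, 30.0, 120.0):
        print(s, relative_truncation_gap(s, b=2))
```

Since the omitted factors are of the form 1 + s/2^i with i > 60, their combined effect is of relative size about s·2^{-60}, far below the ten digits reported by the program.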

Chapter 4 Conclusion

To conclude this thesis, we briefly summarize the main contributions.

Our first goal was to give a self-contained survey of recent results in the analysis of DSTs. These results were previously scattered across the research literature, and to our knowledge this is the first time that they appear in collected form. For clarity of presentation, we started by explaining the main techniques used in the analysis of DSTs in Chapter 1. Then, we showed applications of these techniques in Chapter 2. For this purpose, we used the internal path length as a guiding example and concentrated on the mean value and the variance. It should be stressed that none of the results from Chapter 1 and Chapter 2 are original. However, we improved and shortened several proofs, in particular those concerned with the analysis of asymmetric DSTs.

Our second goal was to introduce a new approach of Fuchs, Hwang, and Zacharovas (which has not yet been published) and to compare it with the previous approaches from Chapter 2.

This new approach offers two major improvements: one is the use of the Laplace transform, which simplifies the overall analysis, and the other is the consideration of $\tilde V(z)$, which makes the computation of the variance easier. Then, we applied the new approach to the variance of the internal path length of b-DSTs and obtained simpler expressions for the leading constant in the asymptotic expansion for b = 1, 2 (larger values of b can be treated as well). Finally, we explained how to obtain numerical values of the leading constant for small values of b via Maple. Our numerical computations suggest that previous computations contain several inaccuracies.

Bibliography

[1] R. Aguech, N. Lasmar and H. Mahmoud, Distances in random digital search trees, Acta Informatica, 43 (2006) 243–264.

[2] P. Flajolet, X. Gourdon and P. Dumas, Mellin transforms and asymptotics: Harmonic sums, Theoret. Comput. Sci., 144 (1995) 3–58.

[3] P. Flajolet and A. Odlyzko, Singularity analysis of generating functions, SIAM J. Discrete Math., 3 (1990) 216–240.

[4] P. Flajolet and B. Richmond, Generalized digital trees and their difference-differential equations, Random Structures and Algorithms, 3 (1992) 305–320.

[5] P. Flajolet and R. Sedgewick, Digital search trees revisited, SIAM J. Comput., 15 (1986) 748–767.

[6] P. Flajolet and R. Sedgewick, Mellin transforms and asymptotics: Finite differences and Rice’s integrals, Theoret. Comput. Sci., 144 (1995) 101–124.

[7] P. Flajolet and R. Sedgewick, Analytic Combinatorics, Cambridge University Press, 2009.

[8] F. Hubalek, On the variance of the internal path length of generalized digital trees - The Mellin convolution approach, Theoret. Comput. Sci., 242 (2000) 143–168.

[9] F. Hubalek, H. Hwang, W. Lew, H. Mahmoud and H. Prodinger, A multivariate view of random bucket digital search trees, Journal of Algorithms, 44 (2002) 121–158.

[10] P. Jacquet and W. Szpankowski, Asymptotic behavior of the Lempel-Ziv parsing scheme and digital search trees, Theoret. Comput. Sci., 144 (1995) 161–197.

[11] P. Jacquet and W. Szpankowski, Analytical depoissonization and its applications, Theoret. Comput. Sci., 201 (1998) 1–62.

[12] P. Kirschenhofer and H. Prodinger, Further results on digital search trees, Theoret. Comput. Sci., 58 (1988) 143–154.

[13] P. Kirschenhofer and H. Prodinger, Eine Anwendung der Theorie der Modulfunktionen in der Informatik, Sitzungsber., Abt. II, Österr. Akad. Wiss., Math.-Naturwiss. Kl., 197, No. 4-7 (1988) 339–366.

[14] P. Kirschenhofer, H. Prodinger and W. Szpankowski, Digital search trees again revisited: The internal path length perspective, SIAM J. Comput., 23 (1994) 598–616.

[15] D.E. Knuth, The Art of Computer Programming, Vol. 3: Sorting and Searching, Addison-Wesley, Reading, MA, 1973.

[16] A.G. Konheim and D.J. Newman, A note on growing binary trees, Discrete Math., 4 (1973) 57–63.

[17] G. Louchard, Exact and asymptotic distributions in digital and binary search trees, RAIRO Theoret. Inform. Appl., 21 (1987) 479–495.

[18] G. Louchard, Digital search trees revisited, Cahiers Centre Études Rech. Oper., 36 (1995) 259–278.

[19] G. Louchard and W. Szpankowski, Average profile and limiting distribution for a phrase size in the Lempel-Ziv parsing algorithm, IEEE Trans. Inform. Theory, 41 (1995) 478–488.

[20] G. Louchard, W. Szpankowski and J. Tang, Average profile of the generalized digital search tree and the generalized Lempel-Ziv algorithm, SIAM J. Comput., 28 (1999) 904–934.

[21] H. Prodinger, External internal nodes in digital search tree via Mellin transforms, SIAM J. Comput., 21 (1992) 1180–1183.

[22] W. Szpankowski, A characterization of digital search trees from the successful search viewpoint, Theoret. Comput. Sci., 85 (1991) 117–134.

[23] W. Szpankowski, Average Case Analysis of Algorithms on Sequences, John Wiley & Sons, Inc., New York, NY, 2001.

[24] E.C. Titchmarsh and D.R. Heath-Brown, The Theory of the Riemann Zeta-function (Oxford Science Publications, Oxford, 2nd ed., 1986).
