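\[
\left| P\left( \frac{X_n - E(X_n)}{\sqrt{V(X_n)}} < x \right) - \Phi(x) \right|
= \begin{cases}
O\left(n^{-3(3/2-\sqrt{2})}\right), & \text{if } \rho < \sqrt{2} - 1; \\
O\left(n^{-3(3/2-\sqrt{2})} (\log n)^3\right), & \text{if } \rho = \sqrt{2} - 1; \\
O\left(n^{-3(1/2-\rho)}\right), & \text{if } \sqrt{2} - 1 < \rho < 1/2.
\end{cases}
\]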
The corresponding local limit theorems can be derived when $Y_n$ assumes only integer values.
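As a small numerical illustration of the three regimes in the bound above, the following Python sketch returns the exponent of $n$ as a function of $\rho$. The helper name `rate_exponent` is ours and purely illustrative; the sketch ignores the implied constants and the extra $(\log n)^3$ factor at the critical value $\rho = \sqrt{2} - 1$.

```python
import math

# Critical value of rho at which the convergence rate changes.
RHO_CRIT = math.sqrt(2) - 1

def rate_exponent(rho: float) -> float:
    """Return e such that the bound is O(n**(-e)), up to logarithmic factors.

    Hypothetical helper: it only encodes the exponents of the display
    above and says nothing about the implied constants.
    """
    if rho >= 0.5:
        raise ValueError("the bound is stated only for rho < 1/2")
    if rho <= RHO_CRIT:
        return 3 * (1.5 - math.sqrt(2))   # constant regime, about 0.2574
    return 3 * (0.5 - rho)                # the rate deteriorates as rho -> 1/2
```

The two branches agree at $\rho = \sqrt{2} - 1$, since $3(1/2 - (\sqrt{2} - 1)) = 3(3/2 - \sqrt{2})$, so the exponent is continuous in $\rho$.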
5 Random d-dimensional grid-trees
In this section we briefly consider the phase changes in random grid-trees and give the required asymptotic transfers.
Grid trees. Devroye [12] extended the $d$-dimensional point quadtrees and $m$-ary search trees as follows.
Instead of choosing the first point as the root, one chooses, say, the first $m-1$ points ($m \ge 2$) and places them at the root. These $m-1$ points then split the space into $m^d$ smaller regions (called grids) when no pair of points is collinear. Each node in the corresponding grid-tree has at most $m^d$ subtrees. When $m = 2$, grid-trees are quadtrees; when $d = 1$, grid-trees reduce to the usual $m$-ary search trees; see [37].
Random grid-trees. Fix $m \ge 2$ and $d \ge 1$ throughout this section. Assume that the input is a sequence of $n$ random points uniformly and independently chosen from $[0,1]^d$. Construct the grid-tree from this sequence. The resulting tree is called a random grid-tree.
Phase changes of the number of leaves. For simplicity of presentation, we consider the number of leaves in random grid-trees, denoted by $X_n$.
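The construction just described can be sketched in Python as follows. This is only a minimal illustration under our own assumptions: the names `GridNode`, `insert`, `build`, `count_leaves` and `random_grid_tree` are hypothetical, and a leaf is taken to be a node without children, a convention the text does not spell out.

```python
import bisect
import random

class GridNode:
    """Node of a grid-tree: holds up to m-1 points and up to m**d children."""
    def __init__(self):
        self.points = []     # points stored at this node
        self.cuts = None     # per-coordinate sorted cut values, set when full
        self.children = {}   # grid index (tuple of d ints in 0..m-1) -> GridNode

def insert(root, point, m):
    """Insert one point; the first m-1 points reaching a node stay there."""
    node = root
    while True:
        if len(node.points) < m - 1:
            node.points.append(point)
            if len(node.points) == m - 1:
                # The m-1 stored points cut every coordinate axis into m intervals.
                node.cuts = [sorted(q[k] for q in node.points)
                             for k in range(len(point))]
            return
        # Locate the grid containing the point, one coordinate at a time.
        cell = tuple(bisect.bisect_left(cuts_k, x)
                     for cuts_k, x in zip(node.cuts, point))
        if cell not in node.children:
            node.children[cell] = GridNode()
        node = node.children[cell]

def build(points, m):
    """Build the grid-tree of a sequence of points (tuples of equal length)."""
    root = GridNode()
    for p in points:
        insert(root, p, m)
    return root

def count_leaves(root):
    """Number of nodes without children (the convention assumed here)."""
    leaves, stack = 0, [root]
    while stack:
        node = stack.pop()
        if node.children:
            stack.extend(node.children.values())
        else:
            leaves += 1
    return leaves

def random_grid_tree(n, m, d, seed=None):
    """Grid-tree built from n independent uniform points in [0,1]^d."""
    rng = random.Random(seed)
    return build([tuple(rng.random() for _ in range(d)) for _ in range(n)], m)
```

For instance, `count_leaves(random_grid_tree(1000, 3, 2, seed=1))` gives one sample of the number of leaves for $m = 3$, $d = 2$ under this convention. With $m = 2$ and $d = 1$ the structure is an ordinary binary search tree.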
m    2            3            4            5, ..., 8    9, ..., 26
d    1, ..., 8    1, ..., 4    1, ..., 3    1, 2         1

Table 4: The set $S$ of all pairs $(m, d)$ for which $X_n$ is asymptotically normally distributed. The two boundary cases are $(m, d) = (26, 1)$ ($m$-ary search trees) and $(2, 8)$ (quadtrees).
Theorem 5. If $(m, d) \in S$, where $S$ is given in Table 4, then
\[
\frac{X_n - E(X_n)}{\sqrt{V(X_n)}} \stackrel{M}{\longrightarrow} N(0, 1);
\]
if $m \ge 2$, $d \ge 1$ and $(m, d) \notin S$, then the sequence of random variables $(X_n - E(X_n))/\sqrt{V(X_n)}$ does not converge to a fixed limit law.
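The set $S$ of Table 4 is easy to encode as a lookup. The following sketch merely transcribes the table as read above (namely $d \le 8$ for $m = 2$, $d \le 4$ for $m = 3$, $d \le 3$ for $m = 4$, $d \le 2$ for $5 \le m \le 8$, and $d = 1$ for $9 \le m \le 26$); the function names are ours.

```python
def max_normal_dimension(m: int) -> int:
    """Largest d with (m, d) in S, transcribed from Table 4 (0 if none)."""
    if m < 2:
        raise ValueError("grid-trees require m >= 2")
    if m == 2:
        return 8
    if m == 3:
        return 4
    if m == 4:
        return 3
    if m <= 8:
        return 2
    if m <= 26:
        return 1
    return 0  # no dimension d >= 1 qualifies when m >= 27

def in_S(m: int, d: int) -> bool:
    """True if Theorem 5 asserts asymptotic normality of X_n for (m, d)."""
    return 1 <= d <= max_normal_dimension(m)
```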
More refined results (and more phase changes) can be derived as in the case of quadtrees.
Recurrence of $X_n$. The recurrence of $X_n$ now has the form
Moreover, the splitting probabilities can be expressed as
\[
\pi_{n,j} = P(J_1 = j_1, \dots, J_{m^d} = j_{m^d})
\]
Recurrence of moments. All moments satisfy recurrences of the form
\[
A_n = B_n + m^d \sum_{0 \le j \le n-m+1} \pi_{n,j} A_j \qquad (n \ge m-1), \tag{72}
\]
where $\pi_{n,j}$ denotes the probability that a specified subtree (say the first) of the root has $j$ nodes.
We now show that $\pi_{n,j}$ can be expressed in the form
\[
\pi_{n,j} = \sum \cdots
\]
To that purpose, we first split $\pi_{n,j}$ as follows.
\[
\pi_{n,j} = \sum_{j \le i_1 \le i_2 \le \cdots \le i_{d-1} \le n-m+1} \varpi_{j;i_1,\dots,i_{d-1}},
\]
where $\varpi_{j;i_1,\dots,i_{d-1}}$ denotes the probability that the $n$ random points are distributed in the $d$-dimensional unit cube in the following way: the first $m-1$ points, denoted by $x_1, \dots, x_{m-1}$, split $[0,1]^d$ into $m^d$ grids and the remaining points are placed in these grids such that grids of the form
h
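By definition, we have
\[
\varpi_{j;i_1,\dots,i_{d-1}}
\]
where $i_0 := j$ and $i_d := n-m+1$. It remains to evaluate integrals of the form
\[
\int_{[0,1]^{m-1}} x_{(1)}^{\rho} \left(1 - x_{(1)}\right)^{\tau} dx_1 \cdots dx_{m-1},
\]
where $\rho, \tau \ge 0$ and $x_{(1)}$ denotes the smallest of $x_1, \dots, x_{m-1}$. By dividing the domain of integration into $(m-1)!$ sets of the form $\{(x_1, \dots, x_{m-1}) : x_{\sigma(1)} < \cdots < x_{\sigma(m-1)}\}$, where $\sigma$ runs through all permutations of $m-1$ elements, we obtain, by symmetry,
\begin{align*}
\int_{[0,1]^{m-1}} x_{(1)}^{\rho} \left(1 - x_{(1)}\right)^{\tau} dx_1 \cdots dx_{m-1}
&= (m-1)! \int_{0 \le x_1 \le \cdots \le x_{m-1} \le 1} x_1^{\rho} (1 - x_1)^{\tau} \, dx_1 \cdots dx_{m-1} \\
&= (m-1) \int_0^1 x_1^{\rho} (1 - x_1)^{\tau+m-2} \, dx_1 \\
&= (m-1) \frac{\Gamma(\rho+1)\Gamma(\tau+m-1)}{\Gamma(\rho+\tau+m)},
\end{align*}
where the second equality follows by integrating out $x_2, \dots, x_{m-1}$, which contributes the factor $(1-x_1)^{m-2}/(m-2)!$. Substituting this expression into (74) gives the desired result (73).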
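The closed form for this integral can be checked numerically. The sketch below compares the Gamma-function expression with a plain Monte Carlo estimate of the integral over $[0,1]^{m-1}$, reading $x_{(1)}$ as the smallest coordinate. The function names, the sample size and the seed are arbitrary choices of ours.

```python
import math
import random

def closed_form(rho, tau, m):
    """(m-1) * Gamma(rho+1) * Gamma(tau+m-1) / Gamma(rho+tau+m)."""
    return ((m - 1) * math.gamma(rho + 1) * math.gamma(tau + m - 1)
            / math.gamma(rho + tau + m))

def monte_carlo(rho, tau, m, samples=100_000, seed=0):
    """Estimate the integral of x_(1)^rho (1 - x_(1))^tau over [0,1]^(m-1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # x_(1) is the minimum of m-1 independent uniform coordinates.
        x_min = min(rng.random() for _ in range(m - 1))
        total += x_min ** rho * (1 - x_min) ** tau
    return total / samples
```

For example, with $\rho = 2$, $\tau = 3$ and $m = 4$ the closed form equals $3 \cdot 2! \cdot 5!/8! = 1/56$, and `monte_carlo(2, 3, 4)` should agree with it up to sampling error.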
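The DE. Let $A(z) = \sum_{n \ge 0} A_n z^n$, $B(z) = \sum_{n \ge 1} B_n z^n$, and $f = A - B$. Then the recurrence (72) translates into the DE
\[
(1-z)^{m-1} D^{m-1} \left( z^{m-1} (1-z)^{m-1} D^{m-1} \right)^{d-1} f(z) = m!^d A(z),
\]
or, in terms of the $\vartheta$-operator,
\[
\vartheta^{\overline{m-1}} \left( z^{m-1} \vartheta^{\overline{m-1}} \right)^{d-1} f(z) = m!^d A(z), \tag{75}
\]
where $\vartheta^{\overline{m-1}} = \vartheta(\vartheta+1) \cdots (\vartheta+m-2)$.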
The normal form. We then rewrite the DE in the form
\[
P_0(\vartheta) f(z) = \sum_{1 \le j \le (m-1)(d-1)} (1-z)^j P_j(\vartheta) f(z) + m!^d B(z),
\]
where the $P_j$'s are polynomials of degree $dm$. In particular,
\[
P_0(\vartheta) = \left( \vartheta^{\overline{m-1}} \right)^d - m!^d = \prod_{1 \le j \le d} \left( \vartheta^{\overline{m-1}} - m!\, e^{2j\pi i/d} \right).
\]
The unique case when the above DE reduces to a pure Cauchy-Euler type is $d = 1$. Also the "linearization" achieved by the Euler transform does not seem to work directly for $m \ge 3$. This says that it is not obvious how to derive an explicit expression such as (38) when $m \ge 3$.
Zeros of $P_0(x)$. Our method of proof for deriving the asymptotic transfers is mostly operational and requires only limited properties of the zeros of the indicial polynomial $P_0(x)$. The proofs of the following properties are straightforward and thus omitted.
• The zero with the largest real part is $x = 2$. All other zeros have real parts strictly less than 2.
• All zeros of $P_0(x)$ are simple (we need only this property for $x = 2$ and the second largest zeros in real part).
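Part of these properties can be confirmed by exact integer arithmetic. The sketch below builds the coefficients of $P_0(x) = (x(x+1)\cdots(x+m-2))^d - m!^d$ and checks that $x = 2$ is a zero at which the derivative does not vanish, that is, a simple zero. It does not verify the statement about the real parts of the other zeros. All function names are ours.

```python
from math import factorial

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def indicial_polynomial(m, d):
    """Coefficients of P0(x) = (x(x+1)...(x+m-2))**d - (m!)**d."""
    rising = [1]
    for i in range(m - 1):
        rising = poly_mul(rising, [i, 1])   # multiply by the factor (x + i)
    p = [1]
    for _ in range(d):
        p = poly_mul(p, rising)
    p[0] -= factorial(m) ** d
    return p

def evaluate(p, x):
    """Horner evaluation (exact for integer x)."""
    value = 0
    for c in reversed(p):
        value = value * x + c
    return value

def derivative(p):
    """Coefficient list of the derivative of p."""
    return [k * c for k, c in enumerate(p)][1:]
```

For $m = 3$ and $d = 2$ this gives $P_0(x) = x^4 + 2x^3 + x^2 - 36$, with $P_0(2) = 0$ and $P_0'(2) = 60 = d\, m!^d (H_m - 1)$.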
Other properties similar to those for the case $d = 1$ ($m$-ary search trees) can be derived as in [37, Ch. 3].
Asymptotic transfers. We state the main asymptotic transfers needed for proving Theorem 5.
Let $H_m := \sum_{1 \le j \le m} 1/j$ denote the harmonic numbers. Define
\[
K_B := \frac{1}{d(H_m - 1)} \sum_{k \ge 0} V_k B^*(k+2), \tag{76}
\]
when the series converges, where $V_k$ is defined recursively by $V_k = 0$ when $k < 0$, $V_0 = 1$, and
\[
V_k = \sum_{1 \le \ell \le (m-1)(d-1)} \frac{P_\ell(k+2)}{P_0(k+2)} V_{k-\ell} \qquad (k \ge 1),
\]
and $B^*(s) := \int_0^1 B(x)(1-x)^{s-1} \, dx$ when the integral converges.
Theorem 6. Let $A_n$ be defined by the recurrence (72) with $A_0$ and $\{B_n\}_{n \ge 1}$ given. Then
(i) (Small toll functions)
\[
A_n \sim K_B n \quad \text{iff} \quad B_n = o(n) \ \text{and} \ \left| \sum_n B_n n^{-2} \right| < \infty,
\]
where the constant $K_B$ is given in (76);
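(ii) (Linear toll functions) Assume that $B_n = cn + u_n$, where $c \in \mathbb{C}$ and $u_n$ is a sequence of complex numbers. Then
\[
A_n \sim \frac{c}{d(H_m - 1)} n \log n + K_1 n \quad \text{iff} \quad u_n = o(n) \ \text{and} \ \left| \sum_n u_n n^{-2} \right| < \infty.
\]
Here $K_1 := cK_2 + K_u$ with $K_u$ defined by replacing the sequence $B_n$ by $u_n$ in (76) and $K_2$ given explicitly by
\[
K_2 := \frac{1}{d(H_m - 1)} \left( \sum_{k \ge 1} \frac{V_k}{k(k+1)} + \gamma - 2 - \frac{d}{2}(H_m - 1) + \frac{H_m^{(2)} - 1}{2(H_m - 1)} \right),
\]
where $H_m^{(2)} := \sum_{1 \le j \le m} 1/j^2$.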
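(iii) (Large toll functions) Assume that $\Re(\upsilon) > 1$ and $c \in \mathbb{C}$. Then
\[
B_n \sim c n^{\upsilon} \quad \text{iff} \quad A_n \sim c \, \frac{\left( (\upsilon+1)^{\overline{m-1}} \right)^d}{\left( (\upsilon+1)^{\overline{m-1}} \right)^d - m!^d} \, n^{\upsilon}.
\]
In particular, if $d = 1$, then $V_k = \delta_{k,0}$ and
\[
K_B = \frac{B^*(2)}{H_m - 1} = \frac{1}{H_m - 1} \sum_{k \ge 0} \frac{B_k}{(k+1)(k+2)};
\]
see [3].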
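For $d = 1$ the small-toll transfer can be illustrated numerically. The sketch below iterates the recurrence (72) for $m$-ary search trees and compares $A_n/n$ with a truncated version of the series for $K_B$. Two points are assumptions of ours rather than statements of the text: the explicit splitting probability $\pi_{n,j} = \binom{n-1-j}{m-2}/\binom{n}{m-1}$, which is what the Beta-type integral evaluated earlier yields when $d = 1$, and the initial conditions $A_n = B_n$ for $n < m-1$. All function names are hypothetical.

```python
from math import comb

def split_prob(n, j, m):
    """pi_{n,j} for d = 1: probability that the first subtree has j nodes."""
    return comb(n - 1 - j, m - 2) / comb(n, m - 1)

def solve_recurrence(toll, m, n_max):
    """Iterate A_n = B_n + m * sum_j pi_{n,j} A_j for n >= m-1 (d = 1).

    Assumption of this sketch: A_n = B_n for n < m-1.
    """
    A = []
    for n in range(n_max + 1):
        if n < m - 1:
            A.append(toll(n))
        else:
            A.append(toll(n) + m * sum(split_prob(n, j, m) * A[j]
                                       for j in range(n - m + 2)))
    return A

def K_B(toll, m, terms=100_000):
    """Truncated series (H_m - 1)^{-1} * sum_k B_k / ((k+1)(k+2))."""
    h = sum(1 / j for j in range(2, m + 1))   # H_m - 1
    return sum(toll(k) / ((k + 1) * (k + 2)) for k in range(terms)) / h
```

With the constant toll $B_n = 1$ and $m = 3$, the series gives $K_B = 1/(H_3 - 1) = 6/5$, and the ratio `solve_recurrence(lambda n: 1.0, 3, 300)[300] / 300` should be close to that value.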
Growth order of $V_k$ for grid-trees. The generating function $V(z)$ of the sequence $V_k$ satisfies the DE
\[
\left( (D_z z + m-2) \cdots (D_z z + 1) D_z z (1-z)^{m-1} \right)^{d-1} (D_z z + m-2) \cdots (D_z z + 1) D_z \left( z^2 V(z) \right) - m!^d z V(z) = 0,
\]
implying that the solution of the form $V(z) = (1-z)^{-s} \phi(1-z)$ has the indicial equation
\[
s^d (s+1)^d \cdots (s+m-2)^d = 0.
\]
Thus we deduce that
\[
V_k = O\left( k^{-1} (\log k)^c \right),
\]
for some $c \ge d-2$. This implies that the series in (76) is convergent for both cases of small and linear toll functions.
Refining the asymptotic transfer for small toll functions. To derive the second-order term for E(Xn) and V(Xn), we also need the following types of transfer.
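Let $\alpha + 1$ denote the real part of the second largest zeros of $P_0(x)$ (all zeros arranged in decreasing order according to their real parts), and $\beta > 0$ denote the absolute value of the imaginary part of either zero.
Proposition 2. Assume that $A_n$ satisfies (72).
(i) If $B_n \sim c n^{\upsilon}$, where $c \in \mathbb{C}$ and $\alpha < \Re(\upsilon) < 1$, then
\[
A_n = K_B n + c \, \frac{\left( (\upsilon+1)^{\overline{m-1}} \right)^d}{\left( (\upsilon+1)^{\overline{m-1}} \right)^d - m!^d} \, n^{\upsilon} + o(n^{\upsilon} + n^{\varepsilon}),
\]
where $K_B$ is defined in (76).
(ii) If $B_n = o(n^{\alpha})$, then
\[
A_n = K_B n + K(\lambda_1) n^{\alpha + i\beta} + K(\lambda_2) n^{\alpha - i\beta} + o(n^{\alpha} + n^{\varepsilon}),
\]
where the $K(\lambda_j)$'s are constants whose expressions are similarly defined as in (48). If the $B_k$'s are all real, then $K(\lambda_1) = \overline{K(\lambda_2)}$.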
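The coefficient of $n^{\upsilon}$ appearing in part (i), and in the large-toll transfer of Theorem 6, is straightforward to evaluate. The sketch below does so, reading $(\upsilon+1)^{\overline{m-1}}$ as the rising factorial $(\upsilon+1)(\upsilon+2)\cdots(\upsilon+m-1)$; the function names are ours.

```python
from math import factorial

def rising_factorial(x, k):
    """x (x+1) ... (x+k-1); works for real or complex x."""
    out = 1
    for i in range(k):
        out *= x + i
    return out

def toll_transfer_factor(upsilon, m, d):
    """Factor multiplying c * n**upsilon in A_n when B_n ~ c * n**upsilon.

    Not defined at upsilon = 1, where the denominator vanishes and the
    linear-toll transfer (with its n log n term) applies instead.
    """
    r = rising_factorial(upsilon + 1, m - 1) ** d
    return r / (r - factorial(m) ** d)
```

As a sanity check, for $m = 2$ and $d = 1$ the factor reduces to $(\upsilon+1)/(\upsilon-1)$, which is what one finds by substituting $A_n \approx a n^{\upsilon}$ into $A_n = B_n + (2/n) \sum_{j<n} A_j$: it equals $3$ for $\upsilon = 2$ and $-3$ for $\upsilon = 1/2$.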
These types of transfer and the inductive arguments used for quadtrees can be applied to prove local limit theorems for $X_n$ with optimal convergence rates. Limit theorems for many other shape parameters can also be derived. We mention only the application to total path length.
Total path length. Neininger and Rüschendorf [40] derived a general limit law for the total path length in random split trees of Devroye (see [12]), which cover in particular grid-trees. Their result is based on the assumption that the expected total path length satisfies asymptotically $cn \log n + c'n$. Our asymptotic transfer for linear toll functions shows that this is the case for grid-trees. This proves the limit law for the total path length in random grid-trees. Note that the limit law can also be derived directly by the method of moments and our asymptotic transfer for large toll functions.
6 Conclusions
In this paper we extended the asymptotic theory for Cauchy-Euler DEs developed in [7] to, essentially, DEs with polynomial coefficients (often referred to as holonomic DEs) for which $z = 0$ is not an irregular singularity.
Not only are the results very general, but the method of proof also requires almost no knowledge of DEs.
Indeed, since all our manipulations are based on linear operators, only properties of first-order DEs are used, and even these can be avoided by operating entirely on recurrences of quicksort type (see [30]). The main feature of such an approach is that all differential operators are regarded as coefficient-transformers, so that no analytic properties are needed for the functions involved.
We applied the general asymptotic transfers developed in this paper to clarify the phase changes of limit laws in quadtrees and more general grid-trees. Further applications to distributional properties of profiles of random search trees will be given elsewhere.
For methodological interest, we conclude this paper by mentioning an alternative approach to proving general asymptotic transfers for $A_n$ (under suitable growth information on $B_n$) based solely on the theory of differential equations. Such an approach was inspired by the series of papers by Flajolet and his coauthors (see [17, 20, 22, 26]). We start from the method of Frobenius and seek solutions of the form $(1-z)^{-\lambda_k} \phi(1-z)$ for the homogeneous DE $(\vartheta(z\vartheta)^{d-1} - 2^d) f(z) = 0$, where $\phi(z)$ is analytic at $z = 0$. Detailed information on the zeros of $P_0(x)$ is needed; in particular, we can show that when $d$ is a multiple of $6$ there are two pairs of non-real zeros differing by integers (in that case, logarithmic terms need to be introduced). Then we use the method of variation of parameters (see [32]) for the non-homogeneous DE; a long and laborious calculation of the Wronskians then leads to the form
\[
f(z) = \sum_{0 \le j < d} \xi_j(z) (1-z)^{-\lambda_j} + 2^d \sum_{0 \le j < d} \eta_j(z) (1-z)^{-\lambda_j} \int_0^z (1-t)^{\lambda_j - 1} B(t) \sum_{0 \le r \le \kappa_d} \zeta_{j,r}(t) \left( \log \frac{z}{t} \right)^r dt, \tag{77}
\]
where $\kappa_d \le (d-1)^2$ and $\xi_j$, $\eta_j$, $\zeta_{j,r}$ are functions analytic in the unit circle satisfying $\sum_n |[z^n] \chi(z)| < \infty$, where $\chi \in \{\xi_j, \eta_j, \zeta_{j,r}\}$. Similar expressions can be derived for $\sum_{1 \le j < d} (1-z)^j P_j(\vartheta) f$. Then the sufficiency proofs of the transfers (12), (13), (15) are reduced to deriving asymptotic transfers for integrals of the form
\[
\xi(z) (1-z)^{-\upsilon} \int_0^z (1-t)^{\upsilon - 1} B(t) \eta(t) \left( \log \frac{z}{t} \right)^r dt.
\]
Such a general approach, although it quickly gives the general form of the solution, does not seem easily amended to yield expressions for the leading constants (as is the case for most asymptotic problems on DEs and linear differential systems); also, for more general DEs such as (75), the precise characterization of the zero locations (or of their differences) requires more delicate analysis.
Acknowledgements
We thank Philippe Flajolet for showing us the phase change phenomena in random fragmentation processes and suggesting the current study. Part of the work of the third author was carried out while he was visiting Institut für Stochastik und Mathematische Informatik, J. W. Goethe-Universität (Frankfurt); he thanks the Institute for its hospitality and support.
References
[1] Z.-D. Bai, H.-K. Hwang and T.-H. Tsai (2003). Berry-Esseen bounds for the number of maxima in planar regions. Electronic Journal of Probability 8 Article No. 9, 26 pp.
[2] B. Chauvin and N. Pouyanne (2004). m-ary search trees when m ≥ 27: a strong asymptotics for the space requirements. Random Structures and Algorithms 24 133–154.
[3] H.-H. Chern and H.-K. Hwang (2001). Phase changes in random m-ary search trees and generalized quicksort. Random Structures and Algorithms 19 316–358.
[4] H.-H. Chern and H.-K. Hwang (2003). Partial match queries in random quadtrees. SIAM Journal on Computing 32 904–915.
[5] H.-H. Chern and H.-K. Hwang (2003). Partial match queries in random k-d trees. Manuscript submitted for publication.
[6] H.-H. Chern and H.-K. Hwang (2004). Limit distribution of the number of consecutive records. Random Structures and Algorithms, accepted for publication.
[7] H.-H. Chern, H.-K. Hwang and T.-H. Tsai (2003). An asymptotic theory for Cauchy-Euler differential equations with applications to the analysis of algorithms. Journal of Algorithms 44 177–225.
[8] D. S. Dean and S. N. Majumdar (2002). Phase transition in a random fragmentation problem with applications to computer science. Journal of Physics. A. Mathematical and General 35 L501–L507.
[9] M. de Berg, M. van Kreveld, M. Overmars, O. Schwarzkopf (2000). Computational Geometry: Algorithms and Applications. Second revised edition, Springer-Verlag.
[10] L. Devroye (1987). Branching processes in the analysis of the heights of trees. Acta Informatica 24 277–298.
[11] L. Devroye (1991). Limit laws for local counters in random binary search trees. Random Structures and Algorithms 2 303–315.
[12] L. Devroye (1998). Universal limit laws for depths in random trees. SIAM Journal on Computing 28 409–432.
[13] L. Devroye and L. Laforest (1990). An analysis of random d-dimensional quad trees. SIAM Journal on Computing 19 821–832.
[14] A. Erdélyi, W. Magnus, F. Oberhettinger and F. Tricomi (1953). Higher Transcendental Functions. Volume I. Robert E. Krieger Publishing Co., Fla.
[15] J. A. Fill and N. Kapur (2004). The space requirement of m-ary search trees: distributional asymptotics for m ≥ 27. In Proceedings of the 7th Iranian Statistical Conference.
[16] R. A. Finkel and J. L. Bentley (1974). Quadtrees: A data structure for retrieval on composite keys. Acta Informatica 4 1–9.
[17] P. Flajolet, G. Gonnet, C. Puech and J. M. Robson (1993). Analytic variations on quadtrees. Algorithmica 10 473–500.
[18] P. Flajolet, X. Gourdon and C. Martínez (1997). Patterns in random binary search trees. Random Structures and Algorithms 11 223–244.
[19] P. Flajolet, G. Labelle, L. Laforest, and B. Salvy (1995). Hypergeometrics and the cost structure of quadtrees. Random Structures and Algorithms 7 117–144.
[20] P. Flajolet and T. Lafforgue (1994). Search costs in quadtrees and singularity perturbation asymptotics. Discrete and Computational Geometry 12 151–175.
[21] P. Flajolet and A. M. Odlyzko (1990). Singularity analysis of generating functions. SIAM Journal on Discrete Mathematics 3 216–240.
[22] P. Flajolet and C. Puech (1986). Partial match retrieval of multi-dimensional data. Journal of the ACM 33 371–407.
[23] P. Flajolet and R. Sedgewick (1995). Mellin transforms and asymptotics: finite differences and Rice’s integrals. Theoretical Computer Science 144, 101–124.
[24] P. Flajolet and R. Sedgewick (1995). Analytic Combinatorics. Book in preparation; current version available via the link algo.inria.fr/flajolet/Publications/AnaCombi1to9.pdf.
[25] D. Foata, G.-N. Han and B. Lass (2001). Les nombres hyperharmoniques et la fratrie du collectionneur de vignettes. Séminaire Lotharingien de Combinatoire 47 Article B47a.
[26] M. Hoshi and P. Flajolet (1992). Page usage in a quadtree index. BIT 32 384–402.
[27] H.-K. Hwang (1998). On convergence rates in the central limit theorems for combinatorial structures. European Journal of Combinatorics 19 329–343.
[28] H.-K. Hwang (2003). Second phase changes in random m-ary search trees and generalized quicksort: convergence rates. Annals of Probability 31 609–629.
[29] H.-K. Hwang (2004). Phase changes in random recursive structures and algorithms (a brief survey). In Proceedings of the Workshop on Probability with Applications to Finance and Insurance. Edited by T. L. Lai, H. Yang and S. P. Yung, World Scientific, pp. 82–97.
[30] H.-K. Hwang and R. Neininger (2002). Phase change of limit laws in the quicksort recurrence under varying toll functions. SIAM Journal on Computing 31 1687–1722.
[31] H.-K. Hwang and T.-H. Tsai (2003). An asymptotic theory for recurrence relations based on minimization and maximization. Theoretical Computer Science 290 1475–1501.
[32] E. L. Ince (1926). Ordinary Differential Equations, Dover, New York.
[33] S. Janson (2004). Functional limit theorems for multitype branching processes and generalized Polya urns. Stochastic Processes and their Applications 110 177–245.
[34] G. Labelle and L. Laforest (1995). Combinatorial variations on multidimensional quadtrees. Journal of Combinatorial Theory, Series A 69 1–16.
[35] G. Labelle and L. Laforest (1995). Sur la distribution de l’arité de la racine d’une arborescence hyperquaternaire à d dimensions. Discrete Mathematics 139 287–302.
[36] G. Labelle and L. Laforest (1996). Étude de constantes universelles pour les arborescences hyperquaternaires de recherche. Discrete Mathematics 153 199–211.
[37] H. M. Mahmoud (1992). Evolution of Random Search Trees. John Wiley & Sons, New York.
[38] C. Martínez, A. Panholzer and H. Prodinger (2001). Partial match queries in relaxed multidimensional search trees. Algorithmica 29 181–204.
[39] H. N. Minh and G. Jacob (2000). Symbolic integration of meromorphic differential systems via Dirichlet functions. Discrete Mathematics 210 87–116.
[40] R. Neininger and L. Rüschendorf (1999). On the internal path length of d-dimensional quad trees. Random Structures and Algorithms 15 25–41.
[41] R. Neininger and L. Rüschendorf (2001). Limit laws for partial match queries in quadtrees. Annals of Applied Probability 11 452–469.
[42] V. V. Petrov (1975). Sums of Independent Random Variables. Springer, New York.
[43] H. Samet (1990a). The Design and Analysis of Spatial Data Structures. Addison-Wesley, Reading, MA.
[44] H. Samet (1990b). Applications of Spatial Data Structures: Computer Graphics, Image Processing, and GIS. Addison-Wesley, Reading, MA.
[45] R. Sedgewick (1980). Quicksort. Garland, New York.