Abstract
We prove convergence in distribution for the profile (the number of nodes at each level), normalized by its mean, of random recursive trees when the limit ratio $\alpha$ of the level and the logarithm of tree size lies in $[0, e)$. Convergence of all moments is shown to hold only for $\alpha \in [0, 1]$ (with convergence of only finitely many moments when $\alpha \in (1, e)$). When the limit ratio is 0 or 1, for which the limit laws are both constant, we prove asymptotic normality for $\alpha = 0$ and a “quicksort type” limit law for $\alpha = 1$, the latter case having additionally a small range where there is no fixed limit law. Our tools are based on the contraction method and the method of moments. Similar phenomena also hold for other classes of trees; we apply our tools to binary search trees and give a complete characterization of the profile. The profiles of these random trees represent concrete examples for which the range of convergence in distribution differs from that of convergence of all moments.
1 Introduction
The profile or height profile of a tree is the sequence of numbers whose $k$-th element enumerates the number of nodes at distance $k$ from the root of the tree (or the number of descendants in the $k$-th generation in branching process terms). Profiles of trees are fine shape characteristics encountered in diverse problems such as breadth-first search, data compression algorithms (Jacquet, Szpankowski, Tang, 2001), random generation of trees (Devroye and Robson, 1995), and the level-wise analysis of quicksort (Chern and Hwang, 2001b, Evans and Dunbar, 1982). In addition to their interest in applications and connections to many other shape parameters, we will show, through recursive trees and binary search trees, that profiles of random trees having roughly logarithmic height are a rich source of many intriguing phenomena. The high concentration of nodes at certain (log) levels results in the asymptotic bimodality for the variance, as already demonstrated in Drmota and Hwang (2005a); our purpose in this paper is to unveil and clarify the diverse phenomena exhibited by the limit distributions of the profiles of random recursive trees and binary search trees. The tools we use, as well as the results we derive, are of some generality.
1 Partially supported by National Science Council of ROC under the Grant NSC-93-2119-M-009-003.
2 Partially supported by a research award of the Alexander von Humboldt Foundation and by National Science Council under the grant NSC-92-2118-M-001-019.
3 Research supported by an Emmy Noether Fellowship of the DFG.
Recursive trees. Recursive trees have been introduced as simple probability models for system generation (Na and Rapoport, 1970), spread of contamination of organisms (Meir and Moon, 1974), pyramid schemes (Bhattacharya and Gastwirth, 1984, Smythe and Mahmoud, 1995), stemma construction in philology (Najock and Heyde, 1982), Internet interface maps (Janic et al., 2002), and stochastic growth of networks (Chan et al., 2003). They are related to some Internet models (van Mieghem et al., 2001, van der Hofstad et al., 2001, Devroye, McDiarmid and Reed, 2002) and some physical models (Tetzlaff, 2002); they also appeared in Hopf algebra under the name of “heap-ordered trees”; see Grossman and Larson (1989). The bijection between recursive trees and binary search trees not only makes the former a flexible representation of the latter but also provides a rich direction for further extensions; see for example Mahmoud and Smythe (1995).
A simple way of constructing a random recursive tree of $n$ nodes is as follows. One starts from a root node with the label 1; at stage $i$ ($i = 2, \dots, n$) a new node with label $i$ is attached uniformly at random to one of the previous nodes ($1, \dots, i-1$). The process stops after node $n$ is inserted. By construction, the labels of the nodes along any path from the root to a node form an increasing sequence; see Figure 2 for a recursive tree of 10 nodes. For a survey of probabilistic properties of recursive trees, see Smythe and Mahmoud (1995).
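The construction just described is easy to simulate. The following Python sketch is illustrative only: the function names `random_recursive_tree` and `profile` are hypothetical, and the tree is stored as a plain parent map rather than in any particular data structure used in the literature.

```python
import random
from collections import Counter

def random_recursive_tree(n, rng):
    """Return a dict parent[i] for labels 1..n; the root 1 has parent None."""
    parent = {1: None}
    for i in range(2, n + 1):
        # node i is attached uniformly at random to one of 1, ..., i-1
        parent[i] = rng.randint(1, i - 1)
    return parent

def profile(parent):
    """Return a list whose k-th entry is the number of nodes at level k."""
    depth = {}
    for i in sorted(parent):            # a parent always has a smaller label
        p = parent[i]
        depth[i] = 0 if p is None else depth[p] + 1
    counts = Counter(depth.values())
    return [counts[k] for k in range(max(counts) + 1)]

rng = random.Random(2024)               # fixed seed, for reproducibility only
tree = random_recursive_tree(10, rng)
print(profile(tree))                    # entries sum to 10; first entry is 1
```

Because labels increase along every root-to-node path, a single pass over the labels in increasing order suffices to compute all depths.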
Known results for the profile of recursive trees. Let $X_{n,k}$ denote the number of nodes at level $k$ in a random recursive tree of $n$ nodes, where $X_{n,0} = 1$ (the root) for $n \ge 1$. Then $X_{n,k}$ satisfies (see van der Hofstad et al., 2002)
\[ X_{n,k} \stackrel{D}{=} X_{I_n,k-1} + X^*_{n-I_n,k}, \tag{1} \]
for $n, k \ge 1$ with $X_{n,0} = 1 - \delta_{n,0}$ ($\delta_{n,0}$ being Kronecker's symbol), where $(X_{n,k})$, $(X^*_{n,k})$ and $(I_n)$ are independent, $X_{n,k} \stackrel{D}{=} X^*_{n,k}$, and $I_n$ is uniformly distributed over $\{1, \dots, n-1\}$.
Meir and Moon (1978) showed (implicitly) that
\[ \mu_{n,k} := E(X_{n,k}) = \frac{s(n, k+1)}{(n-1)!} \qquad (0 \le k < n), \tag{2} \]
where $s(n, k)$ denotes the unsigned Stirling numbers of the first kind; see also Moon (1974) and Dondajewski and Szymański (1982). By the approximations given in Hwang (1995), we then have
\[ \mu_{n,k} = \frac{\lambda_n^k}{\Gamma(1 + \alpha_{n,k})\, k!} \left( 1 + O\left( \lambda_n^{-1} \right) \right), \tag{3} \]
uniformly for $1 \le k \le K\lambda_n$, for any $K > 1$, where, here and throughout this paper,
\[ \lambda_n := \max\{\log n, 1\}, \qquad \alpha_{n,k} := k/\lambda_n, \]
and $\Gamma$ denotes the Gamma function. This approximation implies, in particular, a local limit theorem for the depth (distance of a random node to the root); see Devroye (1998), Szymański (1990), Mahmoud (1991).
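The exact formula (2) and the leading term of (3) can be compared numerically. The sketch below is an illustration under stated assumptions: the helper names `stirling_row`, `exact_mean` and `approx_mean` are hypothetical, the Stirling numbers are obtained by expanding $x(x+1)\cdots(x+n-1)$, and only the main term of (3) is evaluated (the $O(\lambda_n^{-1})$ factor is ignored).

```python
import math
from fractions import Fraction

def stirling_row(n):
    """Unsigned Stirling numbers of the first kind s(n, 0), ..., s(n, n),
    read off as the coefficients of x (x + 1) ... (x + n - 1)."""
    row = [1]                            # empty product: s(0, 0) = 1
    for m in range(n):                   # multiply the polynomial by (x + m)
        new = [0] * (len(row) + 1)
        for j, c in enumerate(row):
            new[j] += m * c
            new[j + 1] += c
        row = new
    return row

def exact_mean(n, k):
    """Exact mean from (2): s(n, k+1) / (n-1)!."""
    return Fraction(stirling_row(n)[k + 1], math.factorial(n - 1))

def approx_mean(n, k):
    """Leading term of (3), without the 1 + O(1/lambda_n) factor."""
    lam = max(math.log(n), 1.0)
    alpha = k / lam
    return lam ** k / (math.gamma(1 + alpha) * math.factorial(k))

n = 200
for k in (1, 3, 5, 8):
    print(k, float(exact_mean(n, k)), approx_mean(n, k))
```

Summing `exact_mean(n, k)` over $0 \le k < n$ returns exactly $n$, since every node lies at some level; this is a convenient sanity check on the Stirling-number computation.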
The second moment is also implicit in Meir and Moon (1978):
\[ E(X_{n,k}^2) = \sum_{0 \le j \le k} \binom{2j}{j} \frac{s(n, k+j+1)}{(n-1)!}; \]
see also van der Hofstad et al. (2002). Precise asymptotic approximations for the variance $V(X_{n,k})$ were derived in Drmota and Hwang (2005a) for all ranges of $k$. In particular, the variance is asymptotically of the same order as $\mu_{n,k}^2$ when $\alpha \in (0, 2)$ except when $k \sim \lambda_n$ (where the profile variance exhibits a bimodal behavior).
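For small $n$ the second-moment formula, read as $E(X_{n,k}^2) = \sum_{0 \le j \le k} \binom{2j}{j} s(n, k+j+1)/(n-1)!$, can be checked against a brute-force enumeration of all $(n-1)!$ equally likely recursive trees. The following Python sketch does this in exact arithmetic; the function names are hypothetical, and the enumeration is only feasible for very small $n$.

```python
import itertools
import math
from fractions import Fraction

def stirling_row(n):
    """Unsigned Stirling numbers of the first kind s(n, 0), ..., s(n, n)."""
    row = [1]
    for m in range(n):                   # multiply the polynomial by (x + m)
        new = [0] * (len(row) + 1)
        for j, c in enumerate(row):
            new[j] += m * c
            new[j + 1] += c
        row = new
    return row

def second_moment_formula(n, k):
    """Sum over 0 <= j <= k of binom(2j, j) s(n, k+j+1) / (n-1)!."""
    row = stirling_row(n)
    total = 0
    for j in range(k + 1):
        idx = k + j + 1
        if idx <= n:                     # s(n, idx) = 0 for idx > n
            total += math.comb(2 * j, j) * row[idx]
    return Fraction(total, math.factorial(n - 1))

def second_moment_bruteforce(n, k):
    """Average of X_{n,k}^2 over all (n-1)! attachment sequences."""
    total, count = 0, 0
    choices = (range(1, i) for i in range(2, n + 1))
    for parents in itertools.product(*choices):
        depth = {1: 0}
        for i, p in enumerate(parents, start=2):
            depth[i] = depth[p] + 1
        x = sum(1 for d in depth.values() if d == k)
        total += x * x
        count += 1
    return Fraction(total, count)

print(second_moment_formula(4, 1), second_moment_bruteforce(4, 1))
```

For example, with $n = 4$ and $k = 1$ both functions return $23/6$, in agreement with a direct computation from the root degree $1 + B_3 + B_4$, where $B_3$ and $B_4$ are independent Bernoulli variables with success probabilities $1/2$ and $1/3$.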
Limit distribution when $0 \le \alpha < e$. From the asymptotic estimate (3), we have
\[ \frac{\log \mu_{n,k}}{\lambda_n} \to \alpha - \alpha \log \alpha, \]
where here and throughout this paper $k = k(n)$ and $\alpha := \lim_{n\to\infty} k(n)/\lambda_n$. Thus $\mu_{n,k} \to \infty$ when $\alpha < e$. Note that the expected height (length of the longest path from the root) of random recursive trees is asymptotic to $e\lambda_n$; see Devroye (1987) or Pittel (1994).
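The limit of $\log \mu_{n,k}/\lambda_n$ follows from (3) by Stirling's formula. A short worked version, assuming $1 \le k \le K\lambda_n$ so that (3) applies and using $\log k! = k \log k - k + O(\log(k+1))$ together with the boundedness of $\log \Gamma(1 + \alpha_{n,k})$:

```latex
\begin{align*}
\log \mu_{n,k}
  &= k \log \lambda_n - \log k! - \log \Gamma(1 + \alpha_{n,k}) + O(\lambda_n^{-1}) \\
  &= k \log \lambda_n - k \log k + k + O(\log \lambda_n) \\
  &= \lambda_n \, \alpha_{n,k} \bigl( 1 - \log \alpha_{n,k} \bigr) + O(\log \lambda_n).
\end{align*}
```

Dividing by $\lambda_n$ and letting $\alpha_{n,k} \to \alpha$ gives $\alpha - \alpha \log \alpha$ (with the convention $0 \log 0 = 0$). For $\alpha > 0$ this limit is positive exactly when $\log \alpha < 1$, that is, when $\alpha < e$, which is the range in which the mean profile grows.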
Define a class of random variables $X(\alpha)$ by the fixed-point equation
\[ X(\alpha) \stackrel{D}{=} \alpha U^{\alpha} X(\alpha) + (1-U)^{\alpha} X(\alpha)^*, \tag{4} \]
with $E(X(\alpha)) = 1$, where $X(\alpha)$, $X(\alpha)^*$, $U$ are independent, $X(\alpha) \stackrel{D}{=} X(\alpha)^*$, and $U$ is uniformly distributed in the unit interval; see Proposition 1 for existence and properties of $X(\alpha)$. Define $X(0) = 1$.
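As a quick consistency check on the fixed-point equation, read as $X(\alpha) \stackrel{D}{=} \alpha U^{\alpha} X(\alpha) + (1-U)^{\alpha} X(\alpha)^*$, the expected values of the two random coefficients add up to one:

```latex
\begin{align*}
E\bigl( \alpha U^{\alpha} \bigr) + E\bigl( (1-U)^{\alpha} \bigr)
  = \alpha \int_0^1 u^{\alpha} \, du + \int_0^1 (1-u)^{\alpha} \, du
  = \frac{\alpha}{\alpha + 1} + \frac{1}{\alpha + 1}
  = 1 .
\end{align*}
```

Taking expectations on both sides and using independence therefore yields only the identity $E(X(\alpha)) = E(X(\alpha))$: the equation is homogeneous and does not by itself fix the mean, which is why the normalization $E(X(\alpha)) = 1$ is imposed separately. For $\alpha = 1$ the coefficients are $U$ and $1 - U$, so the constant 1 solves the equation.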
Theorem 1. (i) If $0 \le \alpha < e$, then
\[ \frac{X_{n,k}}{\mu_{n,k}} \stackrel{D}{\longrightarrow} X(\alpha), \tag{5} \]
where $\stackrel{D}{\longrightarrow}$ denotes convergence in distribution.
(ii) If $0 \le \alpha < m^{1/(m-1)}$, where $m \ge 2$, then $X_{n,k}/\mu_{n,k}$ converges to $X(\alpha)$ with convergence of the first $m$ moments but not the $(m+1)$-st moment.
In particular, convergence of the second moment holds for $0 \le \alpha < 2$.
Corollary 1. If $0 \le \alpha < 2$, then
\[ V(X_{n,k}) \sim \left( \frac{\Gamma(\alpha + 1)^2}{(1 - \alpha/2)\, \Gamma(2\alpha + 1)} - 1 \right) \mu_{n,k}^2 . \]
Note that the coefficient on the right-hand side becomes zero when $\alpha = 0$ and $\alpha = 1$, and the variance indeed exhibits a bimodal behavior when $\alpha = 1$; see Figure 1 for a plot and Drmota and Hwang (2005a) or below for more precise approximations to the variance.
Since $m^{1/(m-1)} \downarrow 1$, the unit interval is the only range where convergence of all moments holds.
Figure 1: A plot of $E(X_{n,k})$ (the unimodal curve), $V(X_{n,k})$ (the bimodal curve with higher valley), and $|E(X_{n,k} - \mu_{n,k})^3|$ (right) of the number $X_{n,k}$ of nodes at level $k$ in random recursive trees of $n = 1100$ nodes, all normalized by their maximum values. Note that the valley of $|E(X_{1100,k} - \mu_{1100,k})^3|$ (when normalized by $n^3$) is deeper than that of $V(X_{1100,k})$ (normalized by $n^2$); see Corollary 5 for the general description.
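Corollary 2. If $0 \le \alpha \le 1$, then
\[ \frac{X_{n,k}}{\mu_{n,k}} \stackrel{M}{\longrightarrow} X(\alpha), \tag{6} \]
where $\stackrel{M}{\longrightarrow}$ denotes convergence of all moments. Convergence of all moments fails for $1 < \alpha < e$.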
Thus the profile of random recursive trees represents a concrete example for which the range of convergence in distribution is different from that of convergence of all moments. We will show that such a property also holds for random binary search trees; it is expected to hold for other trees like ordered (or plane) recursive trees and m-ary search trees, but the technicalities are expected to be much more complicated. We focus at this stage on new phenomena and their proofs, not on generality.
The proof of (5) relies on the contraction method developed in Neininger and Rüschendorf (2004) (see also the survey paper Rösler and Rüschendorf, 2001), and the moment convergence of $X_{n,k}/\mu_{n,k}$ uses the method of moments. Both methods are technically more involved because we are dealing with recurrences with two parameters. We will indeed prove a stronger approximation to (5) by deriving a rate under the Zolotarev metric (see Zolotarev, 1976).
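But why $m^{1/(m-1)}$? This is readily seen by the recurrence of the moments $\nu_m(\alpha) := E(X(\alpha)^m)$ of $X(\alpha)$:
\[ \nu_m(\alpha) = \frac{1}{m - \alpha^{m-1}} \sum_{1 \le h < m} \binom{m}{h} \nu_h(\alpha)\, \nu_{m-h}(\alpha)\, \alpha^{h-1}\, \frac{\Gamma(h\alpha + 1)\, \Gamma((m-h)\alpha + 1)}{\Gamma(m\alpha + 1)} \qquad (m \ge 2), \tag{7} \]
where $\nu_0(\alpha) = \nu_1(\alpha) = 1$. This recurrence is well-defined for $\nu_m(\alpha)$ when $\alpha < m^{1/(m-1)}$, since the denominator $m - \alpha^{m-1}$ is then positive. This explains the special sequence $m^{1/(m-1)}$.
Note that since $E(X(\alpha)^m) = \infty$ for $\alpha \ge m^{1/(m-1)}$, we have $E(X_{n,k}/\mu_{n,k})^m \to \infty$ in that range.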
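The moment recurrence is straightforward to evaluate numerically. The following Python sketch is an illustration, assuming the recurrence is read as $\nu_m(\alpha) = (m - \alpha^{m-1})^{-1} \sum_{1 \le h < m} \binom{m}{h} \nu_h(\alpha) \nu_{m-h}(\alpha) \alpha^{h-1} \Gamma(h\alpha+1)\Gamma((m-h)\alpha+1)/\Gamma(m\alpha+1)$ with $\nu_0 = \nu_1 = 1$; the function name `limit_moments` is hypothetical.

```python
import math

def limit_moments(alpha, m_max):
    """Return [nu_0, ..., nu_{m_max}], the moments of X(alpha), from the
    recurrence for nu_m; raises ValueError once m - alpha^(m-1) <= 0."""
    nu = [1.0, 1.0]
    for m in range(2, m_max + 1):
        denom = m - alpha ** (m - 1)
        if denom <= 0:
            raise ValueError(
                f"moment {m} is not finite for alpha = {alpha}")
        total = 0.0
        for h in range(1, m):
            total += (math.comb(m, h) * nu[h] * nu[m - h]
                      * alpha ** (h - 1)
                      * math.gamma(h * alpha + 1)
                      * math.gamma((m - h) * alpha + 1)
                      / math.gamma(m * alpha + 1))
        nu.append(total / denom)
    return nu[: m_max + 1]

# nu_2 - 1 is the variance coefficient of Corollary 1
for alpha in (0.0, 0.5, 1.0, 1.5):
    print(alpha, limit_moments(alpha, 2)[2] - 1.0)
```

For $\alpha = 0$ and $\alpha = 1$ every moment equals 1, consistent with the degenerate limit laws $X(0) = X(1) = 1$, and for $m = 2$ the recurrence reduces to $\nu_2(\alpha) = \Gamma(\alpha+1)^2 / ((1 - \alpha/2)\Gamma(2\alpha+1))$.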
A “quicksort-type” limit distribution when $\alpha = 1$. Since $X(1) = 1$, we can refine the limit result (5) for $\alpha = 1$ as follows.
Theorem 2. (i) If $k = \lambda_n + t_{n,k}$, where $|t_{n,k}| \to \infty$ and $t_{n,k} = o(\lambda_n)$, then
\[ \frac{X_{n,k} - \mu_{n,k}}{t_{n,k}\, \lambda_n^{k-1}/k!} \stackrel{M}{\longrightarrow} X'(1), \tag{8} \]
where $X'(1) := (d/d\alpha) X(\alpha)|_{\alpha = 1}$ satisfies
\[ X'(1) \stackrel{D}{=} U X'(1) + (1-U) X'(1)^* + U + U \log U + (1-U) \log(1-U), \]
with $X'(1)$, $X'(1)^*$, $U$ independent and $X'(1) \stackrel{D}{=} X'(1)^*$.
(ii) If $k = \lambda_n + O(1)$, then the sequence of random variables $(X_{n,k} - \mu_{n,k})/\sqrt{V(X_{n,k})}$ does not converge to a fixed law.
Although (8) can also be proved by the contraction method, we prove both results of the theorem by the method of moments because the proof for the non-convergence part is readily modified from that for (8); see also Chern et al. (2002) for more examples having no convergence to fixed limit law. On the other hand, since the distribution of X0.1/ is uniquely characterized by its moment sequence (see (41)), we have the convergence in distribution as follows.
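Corollary 3. If $k = \lambda_n + t_{n,k}$, where $|t_{n,k}| \to \infty$ and $t_{n,k} = o(\lambda_n)$, then
\[ \frac{X_{n,k} - \mu_{n,k}}{t_{n,k}\, \lambda_n^{k-1}/k!} \stackrel{D}{\longrightarrow} X'(1). \]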
The same limit law $X'(1)$ also appeared in the total path length (which is $\sum_k k X_{n,k}$) of recursive trees (see Dobrow and Fill, 1999), or essentially the total left path length of random binary search trees, and the cost of an in-situ permutation algorithm; see Hwang and Neininger (2002).
The appearance of the same limit law as the total path length is not a coincidence. Intuitively, almost all nodes lie at the levels $k = \lambda_n + O(\sqrt{\lambda_n})$ (since $E(X_{n,k}) \asymp n/\sqrt{\lambda_n}$ by (3)) and it is these nodes that contribute predominantly to the total path length; see also (9) below for an estimate of the variance.
Analytically, a deeper connection between the profile and the total path length is seen through the level polynomials $\sum_k X_{n,k} z^k$ (properly normalized) for which we can derive, following Chauvin et al. (2001), an almost sure convergence to some (complex-valued) limit random variable. From such a uniform convergence, the profile is quickly linked to the total path length by taking the derivative of the normalized level polynomial with respect to $z$ and substituting $z = 1$. Indeed, limit theorems for weighted path-lengths of the form $\sum_k k^m X_{n,k}$, as well as the width ($\max_k X_{n,k}$), can be obtained as by-products. These and finer results on correlations and expected width are discussed in Drmota and Hwang (2005b).
Asymptotics of the variance. As a consequence of our convergence of all moments, we have the following estimate for the variance.
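Corollary 4. If $k = \lambda_n + t_{n,k}$, where $t_{n,k} = o(\lambda_n)$, then the variance of $X_{n,k}$ satisfies
\[ V(X_{n,k}) \sim p_2(t_{n,k}) \left( \frac{\lambda_n^{k-1}}{k!} \right)^2, \tag{9} \]
where $p_2(t_{n,k}) := c_2 t_{n,k}^2 + 2 c_1 t_{n,k} + c_0$ with
\[ c_2 := 2 - \frac{\pi^2}{6}, \qquad c_1 := c_2 (1 - \gamma) - \zeta(3) + 1, \qquad c_0 := c_2 \left( \gamma^2 - 2\gamma + 3 \right) - 2 (\zeta(3) - 1)(1 - \gamma) - \frac{\pi^4}{360}. \tag{10} \]
Here $\gamma$ denotes Euler's constant and $\zeta(3) := \sum_{j \ge 1} j^{-3}$.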
The expression (9) explains the valley for the variance in Figure 1. Note that $V(X_{n,k})/\mu_{n,k}^2 = O(t_{n,k}^2/\lambda_n^2)$ when $t_{n,k} = o(\lambda_n)$.
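The leading coefficient $c_2 = 2 - \pi^2/6$ is the variance of the limit law $X'(1)$, and this can be checked directly from the fixed-point equation of Theorem 2. Write $T(U) := U + U \log U + (1-U)\log(1-U)$ for the additive term. Since the left-hand side of (8) is centered, $E(X'(1)) = 0$; squaring the fixed-point equation and using independence then gives $E(X'(1)^2) = (E(U^2) + E((1-U)^2))\, E(X'(1)^2) + E(T(U)^2)$, that is, $E(X'(1)^2) = 3\, E(T(U)^2)$. The Python sketch below evaluates the integral by a midpoint rule; the function names and the number of quadrature steps are arbitrary choices made for illustration.

```python
import math

def toll(u):
    """Additive term T(u) = u + u log u + (1 - u) log(1 - u)."""
    return u + u * math.log(u) + (1.0 - u) * math.log(1.0 - u)

def variance_of_limit(steps=200_000):
    """3 * E[T(U)^2] for U uniform on (0, 1), by the midpoint rule.
    The midpoints avoid the endpoints, where log is undefined."""
    h = 1.0 / steps
    second = sum(toll((i + 0.5) * h) ** 2 for i in range(steps)) * h
    return 3.0 * second

print(variance_of_limit(), 2.0 - math.pi ** 2 / 6.0)
```

The two printed numbers agree to many digits, which matches the exact evaluation $E(T(U)^2) = 2/3 - \pi^2/18$.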
Our proof indeed yields the following extremal orders of $|E(X_{n,k} - \mu_{n,k})^m|$ for $m \ge 2$.
Corollary 5. The absolute value of the m-th central moment satisfies max
More refined results can be derived as in Drmota and Hwang (2005a). For example, by (40) below, we have shown to be asymptotically normally distributed. It is known (see Bergeron et al., 1992) that the out-degree of the root Xn;1 satisfies
\[ P(X_{n,1} = j) = \frac{s(n-1, j)}{(n-1)!} \qquad (1 \le j < n); \]
thus $X_{n,1}$ is asymptotically normal with mean and variance both asymptotic to $\lambda_n$. Equivalently, $X_{n,1}$ is the number of nodes on the rightmost branch (the path starting from the root and always going right until reaching an external node) in a random binary search tree of $n-1$ nodes; see the transformation below for more information.
Let $\Phi(x) := (2\pi)^{-1/2} \int_{-\infty}^{x} e^{-t^2/2}\, dt$ denote the distribution function of the standard normal distribution.
Theorem 3. The distribution of the profile Xn;k satisfies
sup
uniformly for 1 k D o.n/, with mean and variance asymptotic to 8
In particular, $X_{n,2}$ is asymptotically normally distributed with mean asymptotic to $\frac{1}{2}\lambda_n^2$ and variance to $\frac{1}{3}\lambda_n^3$. A similar central limit theorem appeared in the logarithmic order of a random element in symmetric groups; see Erdős and Turán (1967).
Unlike previous cases, the proof of this result is based on a polynomial decomposition of the associated generating functions using characteristic functions and singularity analysis (see Flajolet and Odlyzko, 1990), the reasons being (i) this method leads to the optimal Berry-Esseen bound (11), which is not obvious by the method of moments; (ii) it is of independent methodological interest; and (iii) it can also be applied to give an alternative proof of (6).
Figure 2: A recursive tree of 10 nodes and its corresponding transformed binary increasing tree of 9 nodes.
The asymptotic normality of $X_{n,k}$ when $\alpha = 0$ indicates that nodes are generated in a very regular way in recursive trees, at least for the first $o(\lambda_n)$ levels. The rough picture here is that each node at these levels “attracts” about $\lambda_n/k$ new-coming nodes, as is obvious from (3); see also Drmota and Hwang (2005b) for an asymptotic independence property for the number of nodes at two different levels, both being $o(\lambda_n)$ away from the root.
Profiles of random binary search trees. Binary search trees are one of the most studied fundamental data structures in Computer Algorithms. They have also been introduced in other fields under different forms; see Drmota and Hwang (2005a) for more references.
This tree model is characterized by a recursive splitting process in which $n \ge 2$ distinct labels are split into a root and two subtrees formed recursively by the same procedure (one may be empty) of sizes $J_n$ and $n - 1 - J_n$, where $J_n$ is uniformly distributed in $\{0, 1, \dots, n-1\}$. Such a model is isomorphic to binary increasing trees in which a sequence of $n \ge 2$ continuous random variables (independent and identically distributed) is split into a root with the smallest label and two subtrees formed recursively by the same splitting process corresponding to the subsequences to the left and right respectively of the smallest label. Note that when given a random permutation of $n$ elements the size of the left subtree of the binary increasing tree constructed from the permutation equals $j$, $0 \le j \le n-1$, with equal probability $1/n$, the same as in random binary search trees.
A recursive tree can be transformed into a binary increasing tree by the well-known procedure (referred to as the natural correspondence in Knuth, 1997 and the rotation correspondence by others): drop first the root and arrange all subtrees from left to right in increasing order of their root labels; sibling relations are transformed into right branches (of the leftmost node in that generation) and the leftmost branches remain unchanged; a final relabeling (using labels from 1 to $n-1$) of nodes then yields a binary increasing tree of $n-1$ nodes. Such a transformation is invertible; see Figure 2.
Under this transformation, the profile $X_{n,k}$ in recursive trees becomes essentially the number of nodes in random binary search trees of $n-1$ nodes with left-distance $k-1$ ($k \ge 1$), the left-distance of a node being the number of left-branches needed to traverse from the root to that node. This also explains the recurrence (1).
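The transformation and the resulting identity between levels and left-distances can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions: the recursive tree is given as a parent map on labels $2, \dots, n$ (root 1), the first (smallest-label) child of a node becomes its left child, the next sibling becomes the right child, and relabeling is done by subtracting 1 from every label; all function names are hypothetical.

```python
import random
from collections import Counter

def to_binary_increasing_tree(parent):
    """parent maps labels 2..n of a recursive tree (root 1) to parents.
    Returns (left, right) child maps on the relabelled nodes 1..n-1."""
    n = len(parent) + 1
    children = {v: [] for v in range(1, n + 1)}
    for v in range(2, n + 1):
        children[parent[v]].append(v)     # stored in increasing label order
    left, right = {}, {}
    for v in range(1, n + 1):
        kids = children[v]
        if kids and v != 1:               # the root is dropped
            left[v - 1] = kids[0] - 1     # leftmost branch stays a left branch
        for a, b in zip(kids, kids[1:]):
            right[a - 1] = b - 1          # sibling relation becomes a right branch
    return left, right

def left_distance_counts(left, right, root=1):
    """Count nodes of the binary tree by number of left branches from the root."""
    counts = Counter()
    stack = [(root, 0)]
    while stack:
        v, d = stack.pop()
        counts[d] += 1
        if v in left:
            stack.append((left[v], d + 1))
        if v in right:
            stack.append((right[v], d))
    return counts

def level_counts(parent):
    """Profile of the recursive tree: number of nodes at each level."""
    depth = {1: 0}
    for v in sorted(parent):
        depth[v] = depth[parent[v]] + 1
    return Counter(depth.values())

rng = random.Random(7)
n = 10
parent = {v: rng.randint(1, v - 1) for v in range(2, n + 1)}
left, right = to_binary_increasing_tree(parent)
print(level_counts(parent), left_distance_counts(left, right))
```

Node 2 is always the first child of the root, so it becomes the root (label 1) of the binary tree; following a right branch keeps the level in the recursive tree unchanged, while following a left branch increases it by one, which is why the count at left-distance $k-1$ equals the count at level $k$.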
Known and new results for profiles of random binary search trees. We distinguish two types of nodes for binary search trees: external nodes $Y_{n,k}$ (virtual nodes completed so that all nodes are of out-degree either zero or two) and internal nodes $Z_{n,k}$ (nodes holding labels). Chauvin et al. (2001) established almost sure convergence for $Y_{n,k}/E(Y_{n,k})$ and $Z_{n,k}/E(Z_{n,k})$ when $1.2 \le \alpha \le 2.8$, and recently Chauvin et al. (2005) extended the range for $Y_{n,k}/E(Y_{n,k})$ to the optimal range $\alpha_- < \alpha < \alpha_+$, the two numbers $\alpha_- \approx 0.37$, $\alpha_+ \approx 4.31$ being the fill-up and height constants (of binary search trees), namely, $0 < \alpha_- < 1 < \alpha_+$ solving the equation $e^{(z-1)/z} = z/2$; see also Chauvin and Rouault (2004). For other known results on the profiles $Y_{n,k}$, see Drmota and Hwang (2005a) and the references therein.
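The two constants are easy to compute numerically from the defining equation $e^{(z-1)/z} = z/2$. The Python sketch below uses plain bisection; the function names and the bracketing intervals are choices made for this illustration.

```python
import math

def height_equation(z):
    """Left side minus right side of exp((z - 1)/z) = z/2."""
    return math.exp((z - 1.0) / z) - z / 2.0

def bisect(func, lo, hi, tol=1e-12):
    """Bisection on [lo, hi]; requires a sign change of func."""
    flo = func(lo)
    if flo * func(hi) > 0:
        raise ValueError("no sign change on the given interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if flo * fmid <= 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

alpha_minus = bisect(height_equation, 0.1, 0.9)    # root in (0, 1)
alpha_plus = bisect(height_equation, 1.5, 10.0)    # root in (1, infinity)
print(alpha_minus, alpha_plus)
```

Taking logarithms, the equation reads $(z-1)/z - \log(z/2) = 0$; the left-hand side increases on $(0, 1)$, decreases on $(1, \infty)$, and is positive at $z = 1$, so there is exactly one root on each side of 1.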
Our tools for recursive trees also apply to binary search trees. Briefly, we derive convergence in distribution for $Y_{n,k}/E(Y_{n,k})$ and $Z_{n,k}/E(Z_{n,k})$ in the range $\alpha \in (\alpha_-, \alpha_+)$ and convergence of all moments for $\alpha \in [1, 2]$, the degenerate cases $\alpha = 1, 2$ being further refined by more explicit limit laws; see Section 7 for details.
While it is expected that the profiles for both types of nodes have similar behaviors to Xn;k, we will derive finer results showing more delicate structural difference between internal nodes and external nodes.
Organization of the paper. Since most of our asymptotic approximations are based on the solution (exact or asymptotic) of the underlying double-indexed recurrence (in $n$ and $k$), we start from solving the recurrence in the next section. The proof of the convergence in distribution (5) of $X_{n,k}/\mu_{n,k}$ when $0 < \alpha < e$ by the contraction method is given in Section 3. Then we prove the moment convergence part of Theorem 1 in Section 4 and Theorem 2 in Section 5. The asymptotic normality when $\alpha = 0$ is proved in Section 6, where an alternative proof of (6) is also indicated. Our methods of proof can be easily amended for binary search trees, and the results are given in Section 7. We conclude this paper with a few questions.
Notations. Throughout this paper, $\lambda_n := \max\{\log n, 1\}$, $\alpha_{n,k} := k/\lambda_n$ and $\alpha := \lim_{n\to\infty} \alpha_{n,k}$ when the limit exists. The symbol $[z^n] f(z)$ stands for the coefficient of $z^n$ in the Taylor expansion of $f(z)$. The generic symbols $\varepsilon$ and $K$ always represent sufficiently small and large, respectively, positive constants whose values may vary from one occurrence to another. Finally, $U$ represents a uniform $[0, 1]$ random variable.