An analytic approach to the asymptotic variance of trie statistics and related structures


Contents lists available at ScienceDirect

Theoretical Computer Science

www.elsevier.com/locate/tcs

An analytic approach to the asymptotic variance of trie statistics and related structures

Michael Fuchs (a,*), Hsien-Kuei Hwang (b), Vytas Zacharovas (c)

(a) Department of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan
(b) Institute of Statistical Science, Institute of Information Science, Academia Sinica, Taipei 115, Taiwan
(c) Dept. Mathematics & Informatics, Vilnius University, Lithuania

Article info

Article history: Received 7 August 2013. Received in revised form 7 January 2014. Accepted 21 January 2014. Communicated by H. Prodinger.

This paper is dedicated to the memory of Philippe Flajolet, who pioneered the asymptotic study of binomial splitting processes.

Keywords: Digital trees; Binomial splitting process; Mellin transform; Variance; Periodic fluctuations; Contention resolution algorithms

Abstract

We develop analytic tools for the asymptotics of general trie statistics, which are particularly advantageous for clarifying the asymptotic variance. Many concrete examples are discussed for which new Fourier expansions are given. The tools are also useful for other splitting processes with an underlying binomial distribution. We specially highlight Philippe Flajolet’s contribution in the analysis of these random structures.

©2014 Elsevier B.V. All rights reserved.

1. Introduction

Coin-flipping is one of the simplest ways of resolving a conflict, deciding between two alternatives, and generating random phenomena. It has been widely adopted in many daily-life situations and scientific disciplines. There even exists a term, “flippism.” The curiosity of understanding the randomness behind throwing coins or dice was one of the motivating origins of early probability theory, culminating in the classical book “Ars Conjectandi” by Jacob Bernoulli, which was published exactly three hundred years ago in 1713 (many years after its completion; see [92,101]). When a coin is flipped successively, one naturally encounters the binomial distribution, which is pervasive in many splitting processes and branching algorithms whose analysis was largely developed and clarified through Philippe Flajolet’s works, notably in the early 1980s, an important period marking the upsurge in the use of complex-analytic tools in the Analysis of Algorithms.

Technical content of this paper. This paper is a sequel to [55] and we will develop an analytic approach that is especially useful for characterizing the asymptotics of the mean and the variance of additive statistics of random tries under the Bernoulli model; such statistics can often be computed recursively by

\[
X_n \stackrel{d}{=} X_{I_n} + X^*_{n-I_n} + T_n, \tag{1}
\]

with suitable initial conditions, where $T_n$ is known, $X_n^*$ is an independent copy of $X_n$ and $I_n$ is the binomial distribution with mean $pn$, $0 < p < 1$.

* Corresponding author. Tel.: +886 3 5712121x56461; fax: +886 3 5724679.
E-mail addresses: mfuchs@math.nctu.edu.tw (M. Fuchs), hkwhang@stat.sinica.edu.tw (H.-K. Hwang), vytas.zacharovas@mif.vu.lt (V. Zacharovas).
0304-3975/$ – see front matter © 2014 Elsevier B.V. All rights reserved.

Many asymptotic approximations are known in the literature for the variance of $X_n$, which has in many cases of interest the pattern

\[
\frac{\mathbb{V}(X_n)}{n} = c\log n + c' +
\begin{cases}
P(\varpi\log n), & \text{if } \dfrac{\log p}{\log q}\in\mathbb{Q};\\[1ex]
0, & \text{if } \dfrac{\log p}{\log q}\notin\mathbb{Q}
\end{cases}
\;+\; o(1), \tag{2}
\]

where $c$ may be zero, $\varpi$ depends on the ratio $\frac{\log p}{\log q}$ and $P(x) = P(x+1)$ is a bounded periodic function. However, known expressions in the literature for the periodic function $P$ are rare due to the complexity of the problem, and are often either less transparent, or less explicit, or too messy to be stated. In many situations they are given in the form of one periodic function minus the square of the other. The approach developed here, in contrast, provides not only a systematic derivation of the asymptotic approximation (2) but also a simpler, explicit, independent expression for $P$, notably in the symmetric case ($p = q$). Further refinement of the $o(1)$-term lies outside the scope of this paper and can be dealt with by the approach developed by Flajolet et al. in [34].

Binomial splitting processes. In general, the simple splitting idea behind the recursive random variable (1) (0 going to the left and 1 going to the right) has also been widely adopted in many different modeling processes, which, for simplicity, will be vaguely referred to as “binomial splitting processes” (BSPs), where binomial distribution and some of its extensions are naturally involved in the analysis; see Fig. 1 for concrete examples of BSPs that are related to our analysis here. For convenience of presentation, we roughly group these structures in four categories: Data Structures, Algorithms, Collision Resolution Protocols, and Random Models.

To see how such BSPs in different areas can be analyzed, we start from the recurrence ($q = 1 - p$)

\[
a_n = \sum_{0\le k\le n}\pi_{n,k}\,(a_k + a_{n-k}) + b_n, \qquad\text{where}\quad \pi_{n,k} := \binom{n}{k}p^kq^{n-k}, \tag{3}
\]

which results, for example, from (1) by taking expectation. Here the “toll-function” $b_n$ may itself involve $a_j$ ($j = 0, 1, \dots$) but with multipliers that are exponentially small.
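The recurrence (3) can be solved numerically for small $n$. The following is a minimal sketch, not taken from the paper: the function name `trie_recurrence` and its parameters are hypothetical, the toll is assumed not to depend on the $a_j$, and the terms $k=0$ and $k=n$ (which contain $a_n$ itself) are moved to the left-hand side before solving.

```python
from math import comb

def trie_recurrence(n_max, p, toll, a0=0.0, a1=0.0):
    """Return [a_0, ..., a_{n_max}] solving recurrence (3) for n >= 2.

    `toll(n)` plays the role of b_n; a0 and a1 are the assumed
    initial conditions (hypothetical parameter names).
    """
    q = 1.0 - p
    a = [a0, a1][: n_max + 1]
    for n in range(2, n_max + 1):
        # The terms k = 0 and k = n involve a_n itself:
        # q^n (a_0 + a_n) + p^n (a_n + a_0); isolate a_n on the left.
        rhs = toll(n) + (p**n + q**n) * a[0]
        for k in range(1, n):
            rhs += comb(n, k) * p**k * q**(n - k) * (a[k] + a[n - k])
        a.append(rhs / (1.0 - p**n - q**n))
    return a

# Expected number of internal nodes of a symmetric trie (toll 1 for n >= 2).
sizes = trie_recurrence(3, 0.5, lambda n: 1.0)
```

With the toll equal to 1, `sizes[2]` is the expected number of internal nodes needed to separate two random keys in the symmetric case. The floating-point binomial weights limit this sketch to moderate values of `n_max`.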

From an analytic point of view, the trie recurrence (3) translates for the Poisson generating function

\[
\tilde f(z) := e^{-z}\sum_{n\ge0}\frac{a_n}{n!}\,z^n, \tag{4}
\]

into the trie functional equation

\[
\tilde f(z) = \tilde f(pz) + \tilde f(qz) + \tilde g(z), \tag{5}
\]
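As a sketch of why the recurrence (3) turns into the functional equation (5), the following computation fills in a standard step that is not written out in the text. It assumes that $b_0$ and $b_1$ are chosen so that (3) holds for all $n\ge0$, and that $\tilde g$ is then the Poisson generating function of $b_n$; both are our assumptions.

```latex
e^{-z}\sum_{n\ge0}\frac{z^n}{n!}\sum_{0\le k\le n}\binom{n}{k}p^kq^{n-k}a_k
  = e^{-z}\sum_{k\ge0}\frac{a_k(pz)^k}{k!}\sum_{m\ge0}\frac{(qz)^m}{m!}
  = e^{-pz}\sum_{k\ge0}\frac{a_k(pz)^k}{k!}
  = \tilde f(pz).
```

By symmetry, the terms $a_{n-k}$ contribute $\tilde f(qz)$, and the toll contributes $\tilde g(z) = e^{-z}\sum_{n\ge0} b_n z^n/n!$.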

with suitable initial conditions. Such a functional equation is a special case of the more general pattern

\[
\sum_{0\le j\le b}\binom{b}{j}\tilde f^{(j)}(z) = \alpha\tilde f(pz+\lambda) + \beta\tilde f(qz+\lambda) + \tilde g(z), \tag{6}
\]

where $b = 0, 1, \dots$, and $\tilde g$ itself may involve $\tilde f$ and its derivatives $\tilde f^{(j)}$ but with exponentially small factors. When $b = 0$, one has a pure functional equation,

\[
\tilde f(z) = \alpha\tilde f(pz+\lambda) + \beta\tilde f(qz+\lambda) + \tilde g(z), \tag{7}
\]

while when $b\ge1$, one has a differential–functional equation.

It turns out that Eq. (6) covers almost all cases we collected (a few hundred publications) in the analysis of BSPs, the majority of which correspond to the case $b = \lambda = 0$. The cases when $b = 0$ and $\lambda > 0$ are thoroughly treated in [21,22,57,81], and the cases when $b\ge1$ are discussed in detail in [55] (see also the references cited there). We focus on $b = \lambda = 0$ in this paper. Since the literature abounds with Eq. (5) or the corresponding recurrence (3), we content ourselves with listing below some references that are either standard, representative or more closely connected to our study here. See also [17,19,41] for some non-random contexts where (5) appeared.

Data structures. Tries: [74,78,104]; PATRICIA tries: [15,74,70]; Quadtries and k-d tries: [32,43]; Hashing: [20,24,39,83]; Suffix trees: [61,104].

Algorithms. Radix-exchange sort: [74]; Bucket selection and bucket sort: [8,77]; Probabilistic counting schemes: [25,27,28,30,90]; Polynomial factorization: [39]; Exponential variate generation: [35]; Group testing: [47]; Random generation: [31,97].

Collision resolution protocols. Tree algorithms in multi-access channel: [3,21,22,26,80,81,84,105]; Initialization in radio networks: [86,102]; Mutual exclusion in mobile networks: [82]; Broadcast communication model: [7,49,107]; Leader election: [23,64,94]; Tree algorithms in RFID systems: [54,87].

Random models. Random graphs: [2,46,103]; Geometric IID RVs (or order statistics): [18,45,93]; Cantor distributions: [10,48]; Evolutionary trees: [1,76]; Diffusion limited aggregates: [6,79]; Generalized Eden model on trees: [13].

Fig. 1. A tree rendering of the diverse themes pertinent to binomial splitting processes.

Asymptotics of most of the BSPs can nowadays be handled by standard analytic techniques, which we owe largely to Flajolet for initiating and laying down the major groundwork. We focus in this paper on analytic methods. Many elementary and probabilistic methods have also been proposed in the literature with success; see, for example, [14,15,63,88,104] for more information.

Flajolet’s works on BSPs. We begin with a brief summary of Flajolet’s works in the analysis of BSPs. For more information, see the two chapter introductions on Digital Trees (by Clément and Ward) and on Communication Protocols (by Jacquet) in Philippe Flajolet’s Collected Papers, Volume III (edited by Szpankowski).

Flajolet published his first work related to BSP in June 1982 in a paper jointly written with Dominique Sotteau entitled “A recursive partitioning process of computer science” (see [38]). This is indeed a review paper and starts with the sentence:

We informally review some of the algebraic and analytic techniques involved in investigating the properties of a combinatorial process that appears in very diverse contexts in computer science including digital sorting and searching, dynamic hashing methods, communication protocols in local networks and some polynomial factorization algorithms.

Fig. 2. The diverse themes and methodology developed (or mentioned) in Flajolet’s works that are connected to BSPs.

They first drew attention in their Introduction to the generality of the same splitting principle in diverse contexts, followed by a systematic development of generating functions under different models ([38, Section 2: Algebraic methods]). Then a general introduction was given of the saddle-point method and Mellin transform to the Analysis of Algorithms ([38, Section 3: Analytic methods]). They concluded in the last section by giving applications of these techniques to one instance in each of the four areas mentioned above.

Such a synergistic germination of diverse research ideas later expanded into a wide spectrum of applications and research networks (see Fig. 2 for a plot of BSP-related themes). It was also fully developed and explored, and evolved into his theory of Analytic Combinatorics. Many of these objects became in his hands a subject of interest, and many follow-up papers continued and extended them with much ease.

Analysis of algorithms (and particularly BSPs) in the pre-Flajolet era relied mostly on more elementary approaches (including Tauberian theorems; see [41]), with some sporadic exceptions in the use of the “Gamma-function method” (a particular case of Mellin transform): the height of random trees [11], the analysis of radix-exchange sort (essentially the external path length of random tries) given in [74, §5.2.2], PATRICIA tries and digital search trees [74, §6.3], odd–even merging [100], register function of random trees [67], analysis of carry propagation [73], and extendible hashing [20]. See Dumas’s chapter introduction (Chapter 4, Volume III) for a more detailed account.

Many asymptotic patterns such as (2), which most of us take for granted today, were far from being clear in the 1980s, notably in the engineering contexts. For example, the minute periodic fluctuations when $\log p/\log q$ is rational are often invisible in numerical calculations, leading possibly to wrong conclusions; see Fig. 3 for an illustration of the delicacy in visualizing the periodic oscillations. Flajolet pioneered and developed systematic analytic tools in Analysis of Algorithms to fully characterize such tiny perturbations, which he called “wobbles,” a word he learned from Hardy’s Twelve Lectures about Ramanujan [50]. See [29] for more information.

Fig. 3. Delicacy of visualizing the periodic oscillations in $\mu_n$, the expected size of tries in the symmetric case under the Bernoulli model: $\mu_n = 1 + 2^{1-n}\sum_k\binom{n}{k}\mu_k$ with $\mu_1 = 0$. The periodic fluctuation is invisible if plotted naively as in the upper-left figure. The shift by the factor 1 is crucial here because it is the average value of the periodic function appearing in the next order term in the asymptotic expansion of $\mu_n$. Lower right: the periodic function by the analytic expression (31).

Amazingly, most of the items in the big picture of Fig. 2 were already discovered or clarified in the 1980s in Flajolet’s published works, with a few later themes aiming at finer improvements in results or more general stochastic models. Among these, the “digital process” and “probabilistic counting algorithms” became two of his favorite subjects of presentation, as can be seen from his webpage of lectures where about one third of talks are related to these two subjects.

We organize this paper as follows. We briefly introduce tries, functional equations and the analytic tools in the next section. We then develop analytic tools we need in Section 3, the most difficult part being the proof of admissibility under Hadamard product. Then we focus on the characterization of the asymptotic variance of general trie statistics in the following sections. We also include PATRICIA tries in Section 7 and conclude this paper with a few remarks.

2. Random tries, functional equations and asymptotic analysis

The design of an ordinary dictionary according to the alphabetical (or lexicographical) order itself induces a tree structure, which is also the splitting procedure used in many digital tree structures and bucketing algorithms such as tries and radix sort. Tries (coined by Fredkin [40], which is a mixture of “tree” and “retrieval”) were first introduced in computer algorithms by de la Briandais [12] in 1959, the same year when the radix-exchange sort (a digital realization of Quicksort) was proposed by Hildebrandt and Isbitz [52]; see [74, §6.3] for more information. Tries are one of the most widely adopted prototype data structures for words and strings, and admit a large number of extensions and variants.

Given a set of $n$ random binary strings (each being a sequence of Bernoulli random variables with parameter $p$), we can recursively define the random trie associated with this set as follows. If $n = 0$, then the trie is empty; if $n = 1$, then the trie is composed of a single (external) node holding the input-string; if $n > 1$, then the trie contains three parts: a root (internal) node used to direct keys to the left (when the first bit of the string is 0) or to the right (when the first bit of the string is 1), a left sub-trie of the root for keys whose first bits are 0 and a right sub-trie for keys whose first bits are 1; strings directed to each of the two subtrees are constructed recursively as tries (but using subsequent bits successively). Thus tries are ordered prefix trees. See Fig. 4 for the plot of a trie of 7 keys.

Fig. 4. A trie of $n = 7$ records: the circles represent internal nodes and rectangles holding the records are external nodes.
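The recursive definition above can be sketched directly in code. This is a minimal illustration, not the authors' implementation: the names `build_trie` and `internal_nodes` are hypothetical, keys are assumed to be distinct strings over the characters "0" and "1", and each key is assumed long enough to be separated from the others.

```python
def build_trie(keys, depth=0):
    """Build a binary trie from distinct bit strings.

    Returns None for an empty trie, the key itself for an external
    node, and a pair (left, right) for an internal node.
    """
    if not keys:
        return None                      # n = 0: empty trie
    if len(keys) == 1:
        return keys[0]                   # n = 1: single external node
    # n > 1: the bit at position `depth` directs each key left (0) or right (1).
    left = [k for k in keys if k[depth] == "0"]
    right = [k for k in keys if k[depth] == "1"]
    return (build_trie(left, depth + 1), build_trie(right, depth + 1))

def internal_nodes(trie):
    """Count internal nodes (the 'size' statistic of the trie)."""
    if isinstance(trie, tuple):
        return 1 + internal_nodes(trie[0]) + internal_nodes(trie[1])
    return 0

example = build_trie(["0010", "0011", "0100", "1000"])
```

In this example the keys `0010` and `0011` share a three-bit prefix, so a chain of internal nodes is needed to separate them; this is the kind of one-sided split that the binomial weights $\pi_{n,0}$ and $\pi_{n,n}$ account for in the recurrence (3).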

Asymptotic analysis of the trie recurrence (3) is nowadays not difficult, and a typical way of deriving asymptotic estimates starts with the Poisson generating function (4), which satisfies the functional equation (5) (when $a_n$ does not grow faster than, say, exponentially), where $\tilde g(z)$ there depends on $b_n$ and the initial conditions. From this, one sees that the Mellin transform of $\tilde f(z)$,

\[
\mathscr{M}[\tilde f; s] := \int_0^\infty \tilde f(z)\,z^{s-1}\,dz,
\]

which exists in some strip $\alpha < \Re(s) < \beta$ (in the sequel, we will use the notation $\langle\alpha,\beta\rangle$ to denote this set), satisfies

\[
\mathscr{M}[\tilde f; s] = \frac{\mathscr{M}[\tilde g; s]}{1 - p^{-s} - q^{-s}}.
\]

Then the asymptotics of $a_n$ can be manipulated by a two-stage analytic approach: first derive asymptotics of $\tilde f(z)$ for large $|z|$ by the inverse Mellin integral

\[
\tilde f(z) = \frac{1}{2\pi i}\int_{\uparrow}\frac{\mathscr{M}[\tilde g; s]\,z^{-s}}{1 - p^{-s} - q^{-s}}\,ds, \tag{8}
\]

where the integration path $\int_{\uparrow}$ denotes some vertical line, and then apply the saddle-point method to Cauchy’s integral formula (called analytic de-Poissonization and largely developed by Jacquet and Szpankowski [62])

\[
a_n = \frac{n!}{2\pi i}\oint_{|z|=r} z^{-n-1}e^{z}\tilde f(z)\,dz \qquad (r > 0). \tag{9}
\]
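Cauchy's formula (9) can be checked numerically on a simple example. The sketch below is our own illustration, not the saddle-point analysis itself: it discretizes the circle $|z| = r$ with the trapezoidal rule, takes $r = n$ (the saddle point mentioned later in the text), and uses the hypothetical function names `coefficient_from_poisson_gf` and `g_tilde`. The test function $1-(1+z)e^{-z}$ is the Poisson generating function of the sequence that is 0 for $n<2$ and 1 for $n\ge2$.

```python
import cmath
import math

def coefficient_from_poisson_gf(f_tilde, n, r=None, m=512):
    """Approximate a_n = n! [z^n] e^z f_tilde(z) by the trapezoidal rule
    on the circle |z| = r (default r = max(n, 1))."""
    if r is None:
        r = float(max(n, 1))
    total = 0j
    for k in range(m):
        theta = 2.0 * math.pi * k / m
        z = r * cmath.exp(1j * theta)
        # With z = r e^{i theta}, dz/(2 pi i z^{n+1}) = r^{-n} e^{-i n theta} dtheta/(2 pi).
        total += cmath.exp(z) * f_tilde(z) * cmath.exp(-1j * n * theta)
    return (math.factorial(n) / r**n * total / m).real

def g_tilde(z):
    # Poisson generating function of b_n = 1 for n >= 2 and b_0 = b_1 = 0.
    return 1.0 - (1.0 + z) * cmath.exp(-z)

a10 = coefficient_from_poisson_gf(g_tilde, 10)
```

Here `a10` recovers the coefficient $b_{10} = 1$ up to rounding; for large $n$ a direct evaluation like this loses accuracy, which is one reason an asymptotic treatment is preferred.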

This two-stage Mellin-saddle approach has proved very successful and can in many real applications be encapsulated into one, called the Poisson–Mellin–Newton cycle in Flajolet’s papers (see [33,36]),

\[
a_n = \frac{n!}{2\pi i}\int_{\uparrow}\frac{\mathscr{M}[\tilde g; s]}{(1 - p^{-s} - q^{-s})\,\Gamma(n+1+s)}\,ds, \tag{10}
\]

which is formally obtained by substituting (8) into (9) and by interchanging the order of integration.

Note that such a formal representation may be meaningless due to the divergence of the integral. One of the most useful tools in justifying the exponential smallness of $\mathscr{M}[\tilde g; s]$ at $c \pm i\infty$ is Proposition 5 of Flajolet et al.’s survey paper [29] on Mellin transforms. For ease of reference, we call it the Exponential Smallness Lemma in this paper.

Exponential Smallness Lemma. (See [29, Prop. 5].) If, inside the sector $|\arg(z)| \le \theta$ ($\theta > 0$), $f(z) = O(|z|^{-\alpha})$ as $z \to 0$, and $f(z) = O(|z|^{-\beta})$ as $|z| \to \infty$, then

\[
\mathscr{M}[f; s] = O\bigl(e^{-\theta|\Im(s)|}\bigr)
\]

holds uniformly for $\Re(s) \in \langle\alpha,\beta\rangle$.

This simple lemma is crucial in the development of our arguments.

Fig. 5. The two analytic approaches to the asymptotics of $a_n$. Here $\pi_{n,k} := \binom{n}{k}p^kq^{n-k}$.

In various practical cases, the use of the Poisson–Mellin–Newton approach relies mostly on the so-called Rice’s integral formula (or integral representation for finite differences) when the integral converges; see Fig. 5 for a diagrammatic illustration. Under different manipulations and guises, such a Rice-integral approach has proved extremely useful in many situations and was widely studied in the early history of BSPs and related structures (see the survey paper [36]), notably for the asymptotics of the mean value; see also below for more references.

Asymptotics of either of the two integrals (8) and (10) rely heavily on the singularities of the integrand, which in turn depend on the location of the zeros of the equation $1 - p^{-s} - q^{-s} = 0$. A detailed study of the zeros can be found in [21], and later in [17,99]. While the dominant asymptotic terms are often easy to characterize when analytic properties of $\mathscr{M}[\tilde g; s]$ are known (owing largely to the systematic tools Flajolet and his coauthors developed), error analysis turned out to be highly challenging when $\log p/\log q$ is irrational; see [34].

These analytic tools are well-suited for computing the asymptotics of the mean, but soon become very messy when adopted for higher moments, which satisfy the same type of recurrences but involve convolution terms that are often difficult to manipulate analytically. The situation becomes even worse when dealing with the variance or higher central moments because the high concentration of the binomial distribution results in smaller variance, meaning that more complicated cancellations in the desired asymptotic approximations have to be properly taken into account. Much effort along this direction was put forth in several pioneering papers dealing with the asymptotic variance of statistics related to tries and digital search trees; see, for example, [68,70–72] where the authors worked out an approach by considering the second moment and managing the delicate cancellations.

The key, crucial step of our approach to the asymptotic variance of trie statistics is to introduce, as in [55], the corrected Poissonized variance of the form

\[
\tilde V(z) := \tilde f_2(z) - \tilde f_1(z)^2 - z\tilde f_1'(z)^2, \tag{11}
\]

where $\tilde f_1$ and $\tilde f_2$ denote the Poisson generating functions of the first and the second moments, respectively. The manipulation of such an approach is indirect in several previous papers in the sense that the asymptotics of the Poissonized variance $\tilde f_2(z) - \tilde f_1^2(z)$ and that of $z\tilde f_1'(z)^2$ are first worked out separately, and then the asymptotics of the variance can be characterized by canceling the dominant terms; the resulting Fourier series is often expressed in terms of the difference of two Fourier series, different from that obtained by considering directly the asymptotics of (11); see for example [58,61,77,78,91,96] and the references therein. Our Fourier series expansions for the periodic functions are in most cases simpler partly due to the no-cancellation character of the approach, especially in the symmetric case.
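A toy sanity check of the correction term in (11), which is our own example and not taken from the paper: suppose $X_n = n$ deterministically, so that the true variance is zero for every $n$.

```latex
\tilde f_1(z) = e^{-z}\sum_{n\ge0} n\,\frac{z^n}{n!} = z,
\qquad
\tilde f_2(z) = e^{-z}\sum_{n\ge0} n^2\,\frac{z^n}{n!} = z^2 + z,
\\[1ex]
\tilde f_2(z) - \tilde f_1(z)^2 = z,
\qquad
\tilde V(z) = (z^2 + z) - z^2 - z\cdot 1^2 = 0 .
```

In this example the uncorrected Poissonized variance equals $z$, which is just the variance of the Poisson index, while the term $z\tilde f_1'(z)^2$ removes exactly this contribution.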

Several different approaches other than the above-mentioned second-moment approach and the Poissonized variance approach have also been proposed in the literature for the asymptotic variance with different degrees of precision; these include an elementary induction approach (see [7,53]), (bivariate) characteristic function approach [59,61,78], and Schachinger’s differencing approach[98].

We will enhance our corrected Poissonized variance approach by introducing the class of JS-admissible functions as in our previous paper [55], a notion formulated from Jacquet and Szpankowski’s works on analytic de-Poissonization (see [62]) and mostly inspired by Hayman’s classical work [51] on the saddle-point method (see also [37, §VIII.5]), via which many asymptotic approximations can be derived by checking only simple criteria of admissibility. Note that analytic de-Poissonization is a special case of the saddle-point method, and Hayman’s framework on admissible functions is indeed more general than JS-admissible functions. The combined use leads to a very effective, systematic approach that can be easily adapted for diverse contexts where a similar type of analytic nature is encountered; see Sections 5 and 6 for some examples.

In general, polynomial growth rate for $\tilde g(z)$ for large $|z|$ implies the same for $\tilde f$ in a small sector containing the real axis. An exception was recently observed for the functional–differential equation [2]

\[
\tilde f'(z) = \tilde f(qz) + \tilde g(z),
\]

for which the growth is of order $z^{c\log z}$ when $\tilde g$ grows polynomially for large $|z|$. Note that this equation is a special case of the so-called “pantograph equations”; see [2] for more information.

Notations. Throughout this paper, $q = 1 - p$ and $0 < p < 1$. Also $h := -p\log p - q\log q$ denotes the entropy of the Bernoulli distribution. The splitting distribution $I_n$ is a binomial distribution with mean $pn$. For brevity, we introduce the generic symbol $\mathscr{F}[G](x)$ to denote a bounded periodic function of period 1 of the form

\[
\mathscr{F}[G](x) =
\begin{cases}
h^{-1}\displaystyle\sum_{k\in\mathbb{Z}\setminus\{0\}} G(-1+\chi_k)\,e^{2k\pi i x}, & \text{if } \dfrac{\log p}{\log q}\in\mathbb{Q};\\[2ex]
0, & \text{if } \dfrac{\log p}{\log q}\notin\mathbb{Q},
\end{cases} \tag{12}
\]

where $\chi_k = \frac{2rk\pi i}{\log p}$ when $\frac{\log p}{\log q} = \frac{r}{\ell}$ with $\gcd(r,\ell) = 1$. The average value of $\mathscr{F}[G]$ is zero and the Fourier series is always absolutely convergent (it is indeed infinitely differentiable for all cases we study).
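The points $-1+\chi_k$ appearing in (12) are zeros of $1 - p^{-s} - q^{-s}$ on the line $\Re(s) = -1$ in the rational case. The sketch below checks this numerically; it is our own illustration, with hypothetical names `characteristic` and `chi`, and it writes the ratio $\log p/\log q$ as $r/\ell$ with the denominator called `ell` (only `r` enters the formula for $\chi_k$).

```python
import cmath
import math

def characteristic(s, p):
    """Evaluate 1 - p^{-s} - q^{-s} with q = 1 - p."""
    q = 1.0 - p
    return 1.0 - p ** (-s) - q ** (-s)

def chi(k, p, r):
    """chi_k = 2 r k pi i / log p, assuming log p / log q = r / ell in lowest terms."""
    return 2j * r * k * math.pi / math.log(p)

# Symmetric case: p = q = 1/2, so log p / log q = 1/1 and r = 1.
symmetric = [abs(characteristic(-1 + chi(k, 0.5, 1), 0.5)) for k in range(-3, 4)]

# An asymmetric rational case: q is the golden-ratio conjugate and p = q^2 = 1 - q,
# so log p / log q = 2/1 and r = 2.
q_gold = (math.sqrt(5.0) - 1.0) / 2.0
p_gold = 1.0 - q_gold
golden = [abs(characteristic(-1 + chi(k, p_gold, 2), p_gold)) for k in range(-3, 4)]
```

Both lists contain values at rounding-error level, consistent with $p^{-s} = p$ and $q^{-s} = q$ at $s = -1 + \chi_k$.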

3. JS-admissibility, Hadamard product and asymptotic transfer

We collect and develop in this section some technical preliminaries, which are needed later for our asymptotic analysis. The arduous part is the proof for the closure of JS-admissibility under Hadamard product.

3.1. JS-admissible functions

We begin by recalling the definition and a few fundamental properties from [55] of JS-admissibility, in which Hayman’s general framework in [51] is specialized to analytic de-Poissonization (see [62]) with fixed saddle-point (at $z = n$) and with more precise expansions.

Definition 1. Let $\tilde f(z)$ be an entire function. Then we say that $\tilde f(z)$ is JS-admissible and write $\tilde f \in \mathscr{JS}$ (or more precisely, $\tilde f \in \mathscr{JS}_{\alpha,\beta}$, $\alpha,\beta\in\mathbb{R}$) if for some $0 < \theta < \pi/2$ and $|z| \ge 1$ the following two conditions hold.

(I) (Polynomial growth inside a sector) Uniformly for $|\arg(z)| \le \theta$,

\[
\tilde f(z) = O\bigl(|z|^{\alpha}(\log_+|z|)^{\beta}\bigr),
\]

where $\log_+ x := \log(1+x)$.

(O) (Exponential bound) Uniformly for $\theta \le |\arg(z)| \le \pi$,

\[
f(z) := e^{z}\tilde f(z) = O\bigl(e^{(1-\varepsilon)|z|}\bigr),
\]

for some $\varepsilon > 0$.

The major reason for introducing JS-admissible functions is to provide a systematic analytic justification of the Poisson heuristic $a_n \sim \tilde f(n)$, where $\tilde f$ is the Poisson generating function of $a_n$. We do not however pursue optimum conditions here for the sake of simplicity and easy applications. On the other hand, since the conditions of admissibility we impose are strong, we can indeed provide a very precise asymptotic characterization of $a_n$.

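Proposition 3.1. (See [55].) If $\tilde f \in \mathscr{JS}_{\alpha,\beta}$, then $a_n$ satisfies the asymptotic expansion

\[
a_n = \sum_{0\le j<2k}\frac{\tilde f^{(j)}(n)}{j!}\,\tau_j(n) + O\bigl(n^{\alpha-k}\log^{\beta}n\bigr), \tag{13}
\]

for $k = 0, 1, \dots$, where the $\tau_j$’s are polynomials of $n$ of degree $\lfloor j/2\rfloor$ given by

\[
\tau_j(n) = \sum_{0\le l\le j}\binom{j}{l}(-n)^{l}\,\frac{n!}{(n-j+l)!} \qquad (j = 0, 1, \dots).
\]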

Note that the asymptotic expansion is formulated very differently in [62]. Also the Poisson–Charlier expansion

\[
a_n = \sum_{j\ge0}\frac{\tilde f^{(j)}(n)}{j!}\,\tau_j(n)
\]

converges as long as $\tilde f$ is an entire function; see [55]. It is the asymptotic nature of (13) that requires more regularity conditions, which can intuitively be seen by observing that $\tilde f^{(j)}(n) \asymp n^{-j}\tilde f(n)$ when $\tilde f \in \mathscr{JS}$; see [55].

The $\tau_j$’s are closely connected to Charlier and Laguerre polynomials; see [55] for a more detailed discussion. The first few terms are given as follows.

\[
\begin{array}{c|c|c|c|c|c|c}
\tau_0(n) & \tau_1(n) & \tau_2(n) & \tau_3(n) & \tau_4(n) & \tau_5(n) & \tau_6(n)\\ \hline
1 & 0 & -n & 2n & 3n(n-2) & -4n(5n-6) & -5n(3n^2-26n+24)
\end{array}
\]

The fact that $\tau_1 = 0$ indicates that much information is condensed in the dominant term $\tilde f(n)$.
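The polynomials $\tau_j(n)$ can be evaluated directly from the defining sum in Proposition 3.1 and compared with the closed forms for $j \le 6$. The following is a small sketch using exact integer arithmetic; the function name `tau` is ours, and the factor $n!/(n-j+l)!$ is read as the falling factorial of order $j-l$, which vanishes when $j-l > n$.

```python
from math import comb, perm

def tau(j, n):
    """tau_j(n) = sum_{0<=l<=j} C(j,l) (-n)^l n!/(n-j+l)! for integer n >= 0."""
    # perm(n, m) = n!/(n-m)! and returns 0 when m > n.
    return sum(comb(j, l) * (-n) ** l * perm(n, j - l) for l in range(j + 1))

def closed_forms(n):
    """Closed forms of tau_0, ..., tau_6 as functions of n."""
    return [1, 0, -n, 2 * n, 3 * n * (n - 2),
            -4 * n * (5 * n - 6), -5 * n * (3 * n * n - 26 * n + 24)]

# Usage: the Poisson-Charlier expansion is exact for f_tilde(z) = z^2,
# whose coefficients are a_n = n(n-1):  n^2 tau_0 + 2n tau_1 + tau_2.
def charlier_z_squared(n):
    return n * n * tau(0, n) + 2 * n * tau(1, n) + tau(2, n)
```

For $\tilde f(z) = z^2$ the expansion terminates after $j = 2$ and gives $n^2 - n$, which illustrates how the single correction $\tau_2(n) = -n$ accounts for the difference between $\tilde f(n)$ and $a_n$.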

At the generating function level, the usefulness of JS-admissible functions lies in the closure properties under several elementary operations.

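Proposition 3.2. (See [55].) Let $m$ be a non-negative integer and $\alpha \in (0,1)$.

(i) $z^m, e^{-\alpha z} \in \mathscr{JS}$.
(ii) If $\tilde f \in \mathscr{JS}$, then $\tilde P\tilde f \in \mathscr{JS}$ for any polynomial $\tilde P(z)$.
(iii) If $\tilde f \in \mathscr{JS}$, then $\tilde f(\alpha z) \in \mathscr{JS}$.
(iv) If $\tilde f, \tilde g \in \mathscr{JS}$, then $\tilde f + \tilde g \in \mathscr{JS}$.
(v) If $\tilde f, \tilde g \in \mathscr{JS}$, then $\tilde f(\alpha z)\tilde g((1-\alpha)z) \in \mathscr{JS}$.
(vi) If $\tilde f \in \mathscr{JS}$, then $\tilde f^{(m)} \in \mathscr{JS}$.

We will enhance these closure properties by proving that JS-admissibility is also closed under Hadamard product.

3.2. Asymptotic transfer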

For our purposes, we need also a transfer theorem for entire functions satisfying the functional equation(5).

Proposition 3.3. Let $\tilde f(z)$ and $\tilde g(z)$ be entire functions satisfying

\[
\tilde f(z) = \tilde f(pz) + \tilde f(qz) + \tilde g(z),
\]

with $f(0)$ given. Then $\tilde f \in \mathscr{JS}$ if and only if $\tilde g \in \mathscr{JS}$.

Proof. The proof is similar to and simpler than that of Proposition 2.4 in [55]. Thus we only give the proof for (I). Define

\[
\tilde B(r) := \max_{\substack{|z|\le r\\ |\arg(z)|\le\theta}}\bigl|\tilde f(z)\bigr|.
\]

Then

\[
\tilde B(r) \le \tilde B(pr) + \tilde B(qr) + O\bigl(r^{\alpha}(\log_+ r)^{\beta} + 1\bigr).
\]

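Now define a majorant function $\tilde K(r)$ by

\[
\tilde K(r) = \tilde K(pr) + \tilde K(qr) + C\bigl(r^{\alpha}(\log_+ r)^{\beta} + 1\bigr),
\]

where $C > 0$. Then $\tilde B(r) \le \tilde K(r)$ for a sufficiently large $C > 0$, and by standard Mellin argument [29] or by the proof used in [98]

\[
\tilde K(r) =
\begin{cases}
O(r), & \text{if } \alpha < 1;\\
O\bigl(r(\log_+ r)^{\beta+1}\bigr), & \text{if } \alpha = 1;\\
O\bigl(r^{\alpha}(\log_+ r)^{\beta}\bigr), & \text{if } \alpha > 1.
\end{cases}
\]

This completes the proof. □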

We now refine the asymptotic transfer and focus on asymptotically linear functions.

Proposition 3.4. Let $\tilde f$ and $\tilde g$ be entire functions related to each other by the functional equation

\[
\tilde f(z) = \tilde f(pz) + \tilde f(qz) + \tilde g(z)
\]

with $f(0)$ given. Assume $0 < \theta < \pi/2$, $\alpha < 1$ and $\beta \in \mathbb{R}$.

(a) If $\tilde g(z) = O(|z|^{\alpha}(\log_+|z|)^{\beta})$, where the $O$-term holds uniformly for $|z| \ge 1$ and $|\arg(z)| \le \theta$, then, as $|z| \to \infty$ in the same sector,

\[
\frac{\tilde f(z)}{z} = \frac{G(-1)}{h} + \mathscr{F}[G](r\log_{1/p}z) + o(1),
\]

where the notations $\mathscr{F}[G](x)$ and $r$ are defined in (12).

(b) If $\tilde g(z) = cz + O(|z|^{\alpha}(\log_+|z|)^{\beta})$ uniformly for $|z| \ge 1$ and $|\arg(z)| \le \theta$, then, as $|z| \to \infty$ in the same sector,

\[
\frac{\tilde f(z)}{z} = \frac{c}{h}\log z + h_0 + \mathscr{F}[G](r\log_{1/p}z) + o(1),
\]

where

\[
h_0 := \frac{c_0}{h} + \frac{c\,(p\log^2 p + q\log^2 q)}{2h^2}, \tag{14}
\]

$G(s)$ is the meromorphic continuation of $\mathscr{M}[\tilde g; s]$, and

\[
c_0 := \lim_{s\to-1}\Bigl(G(s) + \frac{c}{s+1}\Bigr).
\]

Proof. Without loss of generality, we may assume that $\tilde f(0) = \tilde f'(0) = \tilde g(0) = \tilde g'(0) = 0$. Then both Mellin transforms exist in the strip $\langle -2, -1\rangle$ and

\[
\mathscr{M}[\tilde f; s] = \frac{G(s)}{1 - p^{-s} - q^{-s}}.
\]

Note that $G(s)$ can be extended to a meromorphic function in the strip $\langle -2, -\alpha-\varepsilon\rangle$. In the case of (a), $G(s)$ is analytic on the line $\Re(s) = -1$, while in the case of (b), $G(s)$ has a unique simple pole on $\Re(s) = -1$ at $s = -1$ with the local expansion $G(s) = -c/(s+1) + c_0 + \cdots$. Note that by applying the Exponential Smallness Lemma [29, Prop. 5], we have the estimate

\[
G(\sigma + it) = O\bigl(e^{-\theta|t|}\bigr),
\]

uniformly for large $|t|$ and $\sigma \in \langle -2, -\alpha-\varepsilon\rangle$. Thus the proposition follows from standard Mellin analysis (see [29]) and known properties of the zeros of $1 - p^{-s} - q^{-s}$ (see [21]). □
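As an illustration of part (a) of Proposition 3.4, consider the expected size (number of internal nodes) of a random trie, for which $a_0 = a_1 = 0$ and the toll in (3) is $b_n = 1$ for $n \ge 2$. This specialization and the computation below are our own sketch, offered as a plausibility check rather than as a statement taken from the paper; the Fourier sum is present only when $\log p/\log q$ is rational.

```latex
\tilde g(z) = e^{-z}\sum_{n\ge2}\frac{z^n}{n!} = 1 - (1+z)e^{-z},
\qquad
\mathscr{M}[\tilde g; s] = -\Gamma(s) - \Gamma(s+1) = -(s+1)\Gamma(s)
\quad (-2 < \Re(s) < 0),
\\[1ex]
G(-1) = \lim_{s\to-1}\bigl(-(s+1)\Gamma(s)\bigr) = 1,
\qquad
G(-1+\chi_k) = -\chi_k\,\Gamma(-1+\chi_k),
\\[1ex]
\frac{\tilde f(z)}{z} = \frac{1}{h}
  - \frac{1}{h}\sum_{k\in\mathbb{Z}\setminus\{0\}}\chi_k\,\Gamma(-1+\chi_k)\,
    e^{2k\pi i\,r\log_{1/p}z} + o(1).
```

In the symmetric case $h = \log 2$, so the leading term is $z/\log 2$, in agreement with the classical estimate for the size of symmetric tries; the rapid decay of the Gamma function along vertical lines is what makes the fluctuations so small.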

In the symmetric case when $p = q = 1/2$, both error terms $o(1)$ in the proposition can be improved to $O(\max\{1, |z|^{\alpha-1}(\log|z|)^{\beta}\})$. Indeed, all error terms in such a case in this paper can be improved by standard arguments; we focus instead on the Fourier series expansion in this paper.

3.3. A Hadamard product for Poisson generating functions

We need a new closure property for the analysis of the variance. Given two exponential generating functions

\[
f(z) = \sum_{n\ge0}\frac{a_n}{n!}\,z^n \qquad\text{and}\qquad g(z) = \sum_{n\ge0}\frac{b_n}{n!}\,z^n,
\]

the Hadamard product of these two functions is defined as

\[
h(z) := f(z)\odot g(z) = \sum_{n\ge0}\frac{a_nb_n}{n!}\,z^n.
\]

Then we consider their Poisson generating functions

\[
\tilde f(z) := e^{-z}f(z), \qquad \tilde g(z) := e^{-z}g(z), \qquad\text{and}\qquad \tilde h(z) := e^{-z}h(z).
\]

We show that JS-admissibility is closed under the Hadamard product. The proof is subtle and delicate.
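To make the definition concrete, the sketch below forms the Hadamard product of two coefficient sequences and evaluates the corresponding Poisson generating functions from truncated series. It is an illustration only: the helper names `poisson_gf` and `hadamard` are ours, and the truncation at 80 terms is an assumption that is adequate for small real $z$.

```python
from math import exp, factorial

def poisson_gf(coeffs, z):
    """Evaluate e^{-z} * sum_n coeffs[n] z^n / n! from a truncated sequence."""
    return exp(-z) * sum(c * z**n / factorial(n) for n, c in enumerate(coeffs))

def hadamard(a, b):
    """Termwise product a_n * b_n of two coefficient sequences."""
    return [x * y for x, y in zip(a, b)]

# Example: a_n = b_n = n, so f_tilde(z) = g_tilde(z) = z,
# while the Hadamard product has coefficients n^2 and h_tilde(z) = z^2 + z.
a = list(range(80))
f_at_2 = poisson_gf(a, 2.0)               # close to 2
h_at_2 = poisson_gf(hadamard(a, a), 2.0)  # close to 2^2 + 2 = 6
```

The example shows that $\tilde h$ differs from the ordinary product $\tilde f\tilde g = z^2$, which is why second moments call for this separate closure property.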

Proposition 3.5. If $\tilde f \in \mathscr{JS}_{\alpha_1,\beta_1}$ and $\tilde g \in \mathscr{JS}_{\alpha_2,\beta_2}$, then $\tilde h \in \mathscr{JS}_{\alpha_1+\alpha_2,\,\beta_1+\beta_2}$.

Proof. Let $0 < \theta_0 < \pi/2$ be an angle where (I) holds for both $\tilde f(z)$ and $\tilde g(z)$. Note that conditions (I) and (O) remain true if $\theta_0$ is replaced by an arbitrarily small but fixed angle $0 < \theta \le \theta_0$ with a suitable choice of $\varepsilon = \varepsilon(\theta)$.

We prove the proposition in the special case when $\beta_1 = \beta_2 = 0$ since the proof in the general case remains the same with only additional logarithmic terms in the corresponding error estimates. Define

\[
J(z) := \sum_{n\ge0}\frac{a_nb_n}{(n!)^2}\,z^{2n} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f\bigl(ze^{it}\bigr)\,g\bigl(ze^{-it}\bigr)\,dt.
\]

Substituting here $z \mapsto zu$, multiplying both sides by $ue^{-u^2}$, integrating from 0 to infinity and multiplying the result by $e^{-z^2}$, we obtain

\[
\tilde h\bigl(z^2\bigr) = e^{-z^2}\sum_{n\ge0}\frac{a_nb_n}{n!}\,z^{2n} = 2e^{-z^2}\int_0^{\infty} ue^{-u^2}J(zu)\,du.
\]

We now fix a 0

< θ < θ

0. We first show that h

(

z

)

satisfies condition (O) for z lying outside the sector

|

arg

(

z

)

|  θ

. Assume

θ/

2

 |

y

| 

π

/

2. Then J



reiy

=

1 2

π

π

−π f



rei(t+y)

g



rei(yt)

dt

.

Note that (I) and (O) imply that

f

(

z

)

=

O



|

z

|

α1

+

1

e|z|

and g

(

z

)

=

O



|

z

|

α2

+

1

e|z|

,

(15)

uniformly for z

∈ C

. Now making the change of variables t

→

t

y and taking into account that the function under the integral sign is periodic, we see that

J



reiy

=

1 2

π

π

−π f



reit

g



rei(2 yt)

dt

=

1 2

π

y

y f



reit

g



rei(2 yt)

dt

+

O



2

+

1

e(2−ε)r

.

Here we evaluated the integral over the region $|y| \le |t| \le \pi$ by the estimate $|f(re^{it})| = O(e^{(1-\varepsilon)r})$, which follows from (O), and used the upper bound of (15) for $|g(re^{i(2y-t)})|$. In a similar way, we note that $0 < \theta/2 \le |2y - t| \le 3\pi/2$ whenever $|t| \le |y|$ and $\theta/2 \le |y| \le \pi/2$. This means that $z = re^{i(2y-t)}$ lies inside the sector $|\arg(z)| \ge \theta/2$, and as a consequence we can use the estimates $|g(re^{i(2y-t)})| = O(e^{(1-\varepsilon)r})$ and $|f(re^{it})| = O((r^{\alpha_1} + 1)e^{r})$ to evaluate the integral over the range $|t| \le |y|$.

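Combining these estimates, we get

\[
J\bigl(re^{iy}\bigr) = O\bigl(\bigl(r^{\alpha_1} + r^{\alpha_2} + 1\bigr)e^{(2-\varepsilon)r}\bigr) = O\bigl(e^{2\varepsilon_1 r}\bigr),
\]

where $\varepsilon_1$ is chosen such that $(2-\varepsilon)/2 < \varepsilon_1 < 1$. This estimate yields

\[
e^{r^2 e^{2iy}}\,\tilde h\bigl(r^2 e^{2iy}\bigr)
 = O\biggl(\int_0^{\infty} u\, e^{2\varepsilon_1 r u}\, e^{-u^2}\, du\biggr)
 = O\bigl(r e^{\varepsilon_1^2 r^2}\bigr),
\]

which implies, by replacing $r \mapsto \sqrt{r}$ and $y \mapsto y/2$, the estimate

\[
e^{re^{iy}}\,\tilde h\bigl(re^{iy}\bigr) = O\bigl(\sqrt{r}\, e^{\varepsilon_1^2 r}\bigr),
\]

in the region $\theta \le |y| \le \pi$ for any fixed $\theta > 0$. Thus condition (O) holds.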
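The Gaussian-type bound $\int_0^\infty u\, e^{2au - u^2}\, du = O(a e^{a^2})$ used in this step (with $a = \varepsilon_1 r$) can be made explicit. Completing the square and substituting $v = u - a$ gives $e^{-a^2}\int_0^\infty u\, e^{2au-u^2}\, du = \tfrac12 e^{-a^2} + \tfrac{\sqrt\pi}{2}\, a\,(1 + \operatorname{erf}(a)) \le \tfrac12 + a\sqrt{\pi}$. The Python sketch below compares this closed form with a direct Simpson quadrature; the function names, step count and cutoff are illustrative assumptions.

```python
import math

def scaled_integral(a, steps=20000):
    """Simpson's rule for S(a) = integral_0^inf u * exp(-(u - a)^2) du,
    i.e. exp(-a^2) times integral_0^inf u * exp(2 a u - u^2) du."""
    upper = a + 10.0                  # the Gaussian tail beyond a + 10 is negligible
    h = upper / steps

    def integrand(u):
        return u * math.exp(-(u - a) ** 2)

    acc = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        acc += (4 if i % 2 else 2) * integrand(i * h)
    return acc * h / 3.0

def closed_form(a):
    """Exact value of S(a), obtained from the substitution v = u - a."""
    return (0.5 * math.exp(-a * a)
            + a * (math.sqrt(math.pi) / 2.0) * (1.0 + math.erf(a)))

for a in (0.5, 2.0, 8.0):
    s = scaled_integral(a)
    # s / a stays bounded for a bounded away from 0, which is the O(a e^{a^2}) bound
    print(a, s, closed_form(a), s / a)
```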


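Note that $|\arg(ze^{\pm it})| \ge \theta_0/4$ for all values of $t$ and $z$ such that $|\arg z| \le \theta_0/4$ and $\pi \ge |t| \ge \theta_0/2$, which, by (I) and (O), implies that $f(ze^{it}) = O(e^{\varepsilon_2|z|})$ and $g(ze^{-it}) = O(e^{\varepsilon_2|z|})$ with a suitable choice of $\varepsilon_2 < 1$ for all $z$ and $t$ satisfying such restrictions. It follows that

\[
J(z) = \frac{1}{2\pi}\int_{-\theta_0/2}^{\theta_0/2} e^{2z\cos t}\,\tilde f\bigl(ze^{it}\bigr)\,\tilde g\bigl(ze^{-it}\bigr)\, dt
       + O\bigl(e^{2\varepsilon_2|z|}\bigr),
\]

when $|\arg z| \le \theta_0/4$. Thus

\[
\tilde h\bigl(z^2\bigr)
 = \frac{e^{-z^2}}{\pi}\int_0^{\infty} u e^{-u^2}\int_{-\theta_0/2}^{\theta_0/2}
   e^{2zu\cos t}\,\tilde f\bigl(zue^{it}\bigr)\,\tilde g\bigl(zue^{-it}\bigr)\, dt\, du
   + O\bigl(|z|\, e^{\varepsilon_2^2|z|^2 - \Re(z^2)}\bigr).
\tag{16}
\]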

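Noting that $\Re(z^2) = \cos(2\arg(z))\,|z|^2 \ge (1 - 2\arg(z)^2)\,|z|^2$, we then have

\[
\tilde h\bigl(z^2\bigr) = I(z) + O\bigl(|z|\, e^{-(1 - \varepsilon_2^2 - 2\arg(z)^2)|z|^2}\bigr),
\tag{17}
\]

where $I(z)$ denotes the double integral

\[
I(z) := \frac{1}{\pi}\int_0^{\infty} u \int_{|t|\le\theta_0/2}
        e^{-(u - z\cos t)^2 - z^2\sin^2 t}\,
        \tilde f\bigl(zue^{it}\bigr)\,\tilde g\bigl(zue^{-it}\bigr)\, dt\, du .
\]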

Since $|\arg(z)| \le \theta_0/4$, the arguments $zue^{it}$ and $zue^{-it}$ of the functions $\tilde f$ and $\tilde g$ lie inside the sector $|\arg(z)| \le \theta_0$, which means that $\tilde f(zue^{it}) = O(|z|^{\alpha_1})$ and $\tilde g(zue^{-it}) = O(|z|^{\alpha_2})$.

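Changing the order of integration and making a change of integration path from the interval $u \in (0, \infty)$ to the line $(0, z\infty)$ by mapping $u \mapsto uz\cos t$, we get

\[
\begin{aligned}
I(z) &= \frac{z^2}{\pi}\int_{-\theta_0/2}^{\theta_0/2}\int_0^{\infty}
        e^{-z^2(u-1)^2\cos^2 t - z^2\sin^2 t}\,
        \tilde f\bigl(z^2ue^{it}\cos t\bigr)\,\tilde g\bigl(z^2ue^{-it}\cos t\bigr)\,
        u\, du\, \cos^2 t\, dt \\
     &= O\biggl(|z|^2\int_{-\theta_0/2}^{\theta_0/2}\int_0^{\infty}
        e^{-\Re(z^2)(u-1)^2\cos^2(\theta_0/2) - \Re(z^2)\sin^2 t}\,
        \bigl(|z^2|u + 1\bigr)^{\alpha_1}\bigl(|z^2|u + 1\bigr)^{\alpha_2}\, u\, du\, dt\biggr) \\
     &= O\biggl(|z|^2\biggl(\int_0^{\infty} e^{-\Re(z^2)(u-1)^2\cos^2(\theta_0/2)}
        \bigl(|z|^2u + 1\bigr)^{\alpha_1+\alpha_2}\, u\, du\biggr)
        \biggl(\int_{|t|\le\theta_0/2} e^{-\Re(z^2)\sin^2 t}\, dt\biggr)\biggr) \\
     &= O\biggl(\frac{(|z| + 1)^{2 + 2\alpha_1 + 2\alpha_2}}{\Re(z^2)}\biggr)
      = O\bigl(|z|^{2\alpha_1 + 2\alpha_2}\bigr),
\end{aligned}
\]

uniformly for large $|z|$ in the sector $|\arg(z)| \le \theta_0/4$.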

Applying this estimate to the expression (16) of $\tilde h(z^2)$, we obtain

\[
\tilde h\bigl(z^2\bigr) = O\bigl(|z|^{2\alpha_1 + 2\alpha_2}\bigr).
\]

Thus $\tilde h(z) = O(|z|^{\alpha_1+\alpha_2})$ for all $|\arg z| \le \theta \le \theta_0/2$, where $\theta$ is chosen to be small enough to ensure that the error term in (17) decreases exponentially fast. This proves the proposition. $\square$

We can refine the above argument and obtain a more precise asymptotic estimate.

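Proposition 3.6. If $\tilde f \in \mathscr{JS}_{\alpha_1,\beta_1}$ and $\tilde g \in \mathscr{JS}_{\alpha_2,\beta_2}$, then

\[
\tilde h(z) = \tilde f(z)\,\tilde g(z) + z\,\tilde f'(z)\,\tilde g'(z)
            + O\bigl(|z|^{\alpha_1+\alpha_2-2}\bigl(\log_+|z|\bigr)^{\beta_1+\beta_2}\bigr).
\tag{18}
\]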


Our proof indeed gives an asymptotic expansion for $\tilde h$; we content ourselves with the statement of (18), which is sufficient for our purposes.
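Estimate (18) can be illustrated on a simple polynomial case. The Python sketch below assumes the sample sequences $a_n = b_n = n^2$ (an illustrative choice, not taken from the paper), for which $\tilde f(x) = \tilde g(x) = x^2 + x$, so that $\alpha_1 = \alpha_2 = 2$ and $\beta_1 = \beta_2 = 0$. It uses the standard fact that the derivative of a Poisson generating function is the Poisson generating function of the forward differences $a_{n+1} - a_n$. The residual $\tilde h - \tilde f\tilde g - x\tilde f'\tilde g'$ should then be of order $x^{\alpha_1+\alpha_2-2} = x^2$.

```python
import math

TERMS = 400  # truncation order of the Poisson series (ample for x <= 100)

def poisson_transform(coeff, x):
    """e^{-x} * sum of coeff(n) x^n / n!, for real x > 0."""
    total, weight = 0.0, math.exp(-x)  # weight holds e^{-x} x^n / n!
    for n in range(TERMS):
        total += coeff(n) * weight
        weight *= x / (n + 1)
    return total

def forward_difference(coeff):
    """Return n -> coeff(n+1) - coeff(n); its Poisson transform is the
    derivative of the Poisson transform of coeff."""
    return lambda n: coeff(n + 1) - coeff(n)

def a(n):
    """Hypothetical sample sequence a_n = n^2, so f~(x) = x^2 + x."""
    return float(n * n)

def b(n):
    """Hypothetical sample sequence b_n = n^2, so g~(x) = x^2 + x."""
    return float(n * n)

def residual(x):
    """h~(x) - f~(x) g~(x) - x f~'(x) g~'(x) for the Hadamard product a_n b_n."""
    h = poisson_transform(lambda n: a(n) * b(n), x)
    f = poisson_transform(a, x)
    g = poisson_transform(b, x)
    df = poisson_transform(forward_difference(a), x)
    dg = poisson_transform(forward_difference(b), x)
    return h - f * g - x * df * dg

for x in (10.0, 20.0, 40.0):
    # the ratio residual / x^2 should stay bounded, in line with (18)
    print(x, residual(x), residual(x) / x ** 2)
```

In this example the residual can also be computed by hand: $\tilde h(x) = x^4 + 6x^3 + 7x^2 + x$ (the fourth moment of a Poisson variable), while $\tilde f\tilde g + x\tilde f'\tilde g' = x^4 + 6x^3 + 5x^2 + x$, so the difference is exactly $2x^2$.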

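Proof. To prove (18), we use the Taylor expansion

\[
\tilde f(z) = \sum_{0\le j<N} \frac{\tilde f^{(j)}(w)}{j!}\,(z - w)^j
 + O\Bigl(\max\bigl\{\bigl(|z| + 1\bigr)^{\alpha_1 - N},\, \bigl(|w| + 1\bigr)^{\alpha_1 - N}\bigr\}\,|z - w|^N\Bigr),
\]

for any fixed $N \ge 1$. Note that this estimate remains valid when $\alpha \le 0$ (due specially to the additional factors "1"). Applying this formula with $z \mapsto z^2ue^{it}\cos t$ and $w \mapsto z^2$, we get

\[
\tilde f\bigl(z^2ue^{it}\cos t\bigr)
 = \sum_{0\le j<N} \frac{\tilde f^{(j)}(z^2)}{j!}\, z^{2j}\bigl(ue^{it}\cos t - 1\bigr)^j
 + O\Bigl(\bigl(|z|^2(u+1) + 1\bigr)^{\alpha_1}\,\bigl|ue^{it}\cos t - 1\bigr|^N\Bigr),
\]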

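and a similar expression for $\tilde g$. Substituting these expressions with $N = 4$ into $I(z)$, we get

\[
\begin{aligned}
I(z) &= \frac{z^2}{\pi}\int_{-\theta_0/2}^{\theta_0/2}\int_0^{\infty}
        e^{-z^2(u-1)^2\cos^2 t - z^2\sin^2 t}\,
        \tilde f\bigl(z^2ue^{it}\cos t\bigr)\,\tilde g\bigl(z^2ue^{-it}\cos t\bigr)\,
        u\, du\, \cos^2 t\, dt \\
     &= \sum_{0\le k,l\le 3} \frac{\tilde f^{(k)}(z^2)\,\tilde g^{(l)}(z^2)}{k!\,l!}
        \cdot \frac{z^{2(1+k+l)}}{\pi}\int_{-\theta_0/2}^{\theta_0/2}\int_0^{\infty}
        e^{-z^2(u-1)^2\cos^2 t - z^2\sin^2 t}\,
        \bigl(ue^{it}\cos t - 1\bigr)^k\bigl(ue^{-it}\cos t - 1\bigr)^l\,
        u\, du\, \cos^2 t\, dt \\
     &\qquad + O\biggl(|z|^2 \sum_{\substack{0\le k,l\le 4 \\ k+l>3}} I_{\alpha_1+\alpha_2,\,k+l}\biggr),
\end{aligned}
\tag{19}
\]

where

\[
I_{\rho,\kappa} := \int_{-\theta_0/2}^{\theta_0/2}\int_0^{\infty}
 \Bigl|e^{-z^2(u-1)^2\cos^2 t - z^2\sin^2 t}\Bigr|\,
 \bigl(|z|^2(u+1) + 1\bigr)^{\rho}\,\bigl|ue^{it}\cos t - 1\bigr|^{\kappa}\, u\, du\, dt .
\]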

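Applying now the inequality

\[
\bigl|ue^{it}\cos t - 1\bigr| = \bigl|u\cos t - e^{-it}\bigr| = |u\cos t - \cos t + i\sin t| \le |u - 1| + |t|,
\]

we get

\[
I_{\rho,\kappa} = O\biggl(\int_{-\theta_0/2}^{\theta_0/2}\int_0^{\infty}
 e^{-\Re(z^2)(u-1)^2\cos^2 t - \Re(z^2)\sin^2 t}\, u\,
 \bigl(|z|^2(u+1) + 1\bigr)^{\rho}\bigl(|u - 1|^{\kappa} + |t|^{\kappa}\bigr)\, du\, dt\biggr).
\]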
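The elementary inequality $|ue^{it}\cos t - 1| \le |u - 1| + |t|$ follows from $|(u-1)\cos t + i\sin t| \le |u-1|\,|\cos t| + |\sin t|$. As a quick sanity check, the Python sketch below evaluates both sides on a grid; the grid size, the range of $u$ and the value $\theta_0 = 1$ are arbitrary illustrative choices.

```python
import cmath
import math

def lhs(u, t):
    """|u e^{it} cos t - 1|."""
    return abs(u * cmath.exp(1j * t) * math.cos(t) - 1.0)

def rhs(u, t):
    """|u - 1| + |t|."""
    return abs(u - 1.0) + abs(t)

def max_violation(theta0=1.0, u_max=5.0, grid=400):
    """Largest value of lhs - rhs over a grid with 0 <= u <= u_max and
    |t| <= theta0 / 2; a non-positive value (up to rounding) confirms the bound."""
    worst = -math.inf
    for i in range(grid + 1):
        u = u_max * i / grid
        for j in range(grid + 1):
            t = -theta0 / 2 + theta0 * j / grid
            worst = max(worst, lhs(u, t) - rhs(u, t))
    return worst

print(max_violation())
```

Equality holds at $t = 0$, so the largest difference on the grid is zero up to rounding.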

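Note that $\Re(z^2) \ge |z|^2\cos 2\theta$ and $\sin^2 t \ge c_1 t^2$ when $|\arg(z)| \le \theta$ and $|t| \le \theta_0/2$ for some constant $c_1 > 0$. Thus there exists a positive constant $c > 0$ such that

\[
\Re\bigl(z^2\bigr)(u-1)^2\cos^2 t + \Re\bigl(z^2\bigr)\sin^2 t \ge c\,(u-1)^2|z|^2 + c\,|z|^2t^2,
\]

for $|\arg(z)| \le \theta$ and $|t| \le \theta_0/2$. It then follows that

\[
\begin{aligned}
I_{\rho,\kappa}
 &= O\biggl(\int_{-\theta_0/2}^{\theta_0/2}\int_0^{\infty}
    e^{-c(u-1)^2|z|^2 - c|z|^2t^2}\, u\,
    \bigl(|z|^2(u+1) + 1\bigr)^{\rho}\bigl(|u-1|^{\kappa} + |t|^{\kappa}\bigr)\, du\, dt\biggr) \\
 &= O\biggl(\int_0^{\infty} e^{-c(u-1)^2|z|^2}\bigl(|z|^2(u+1) + 1\bigr)^{\rho}\,|u-1|^{\kappa}\, u\, du
    \int_{-\theta_0/2}^{\theta_0/2} e^{-c|z|^2t^2}\, dt \\
 &\qquad\quad + \int_0^{\infty} e^{-c(u-1)^2|z|^2}\bigl(|z|^2(u+1) + 1\bigr)^{\rho}\, u\, du
    \int_{-\theta_0/2}^{\theta_0/2} e^{-c|z|^2t^2}\,|t|^{\kappa}\, dt\biggr) \\
 &= O\Bigl(\bigl(|z|^2 + 1\bigr)^{\rho}\,|z|^{-\kappa-2}\Bigr).
\end{aligned}
\]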


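Substituting this bound in the error term of (19), we obtain

\[
I(z) = \sum_{0\le k,l\le 3} \frac{\tilde f^{(k)}(z^2)\,\tilde g^{(l)}(z^2)}{k!\,l!}\, S_{k,l}
     + O\bigl(|z|^{2\alpha_1 + 2\alpha_2 - 4}\bigr),
\]

where

\[
S_{k,l} = \frac{z^{2(1+k+l)}}{\pi}\int_{-\theta_0/2}^{\theta_0/2}\int_0^{\infty}
 e^{-z^2(u-1)^2\cos^2 t - z^2\sin^2 t}\,
 \bigl(ue^{it}\cos t - 1\bigr)^k\bigl(ue^{-it}\cos t - 1\bigr)^l\,
 u\, du\, \cos^2 t\, dt .
\]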

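We can approximate the integral $S_{k,l}$ by reversing the order of the procedure by which we obtained it. First, making the change of variables $u \mapsto u/(z\cos t)$, we get

\[
\begin{aligned}
S_{k,l}
 &= \frac{1}{\pi}\int_{-\theta_0/2}^{\theta_0/2}\int_0^{\infty}
    e^{-(u - z\cos t)^2 - z^2\sin^2 t}\,
    \bigl(zue^{it} - z^2\bigr)^k\bigl(zue^{-it} - z^2\bigr)^l\, u\, du\, dt \\
 &= \frac{e^{-z^2}}{\pi}\int_{-\pi}^{\pi}\int_0^{\infty}
    e^{2zu\cos t}\,\bigl(zue^{it} - z^2\bigr)^k\bigl(zue^{-it} - z^2\bigr)^l\, u e^{-u^2}\, du\, dt
    + O\Bigl(\bigl(1 + |z|^{k+l}\bigr)\, e^{\Re(z)^2\cos^2(\theta_0/2) - \Re(z^2)}\Bigr) \\
 &= \frac{e^{-z^2}}{\pi}\int_0^{\infty}\biggl(\int_{-\pi}^{\pi}
    \Bigl(e^{zue^{it}}\bigl(zue^{it} - z^2\bigr)^k\Bigr)
    \Bigl(e^{zue^{-it}}\bigl(zue^{-it} - z^2\bigr)^l\Bigr)\, dt\biggr)\, u e^{-u^2}\, du
    + O\Bigl(\bigl(1 + |z|^{k+l}\bigr)\, e^{\Re(z)^2\cos^2(\theta_0/2) - \Re(z^2)}\Bigr) \\
 &= e^{-z^2}\sum_{n\ge 0} \frac{\nu_{n,k}\,\nu_{n,l}}{n!}\, z^{2n}
    + O\Bigl(\bigl(1 + |z|^{k+l}\bigr)\, e^{\Re(z)^2\cos^2(\theta_0/2) - \Re(z^2)}\Bigr),
\end{aligned}
\]

where the $\nu_{n,k}$'s are defined by

\[
\sum_{n\ge 0} \frac{\nu_{n,k}}{n!}\, w^n = e^{w}\bigl(w - z^2\bigr)^k .
\]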

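In particular,

\[
\begin{aligned}
S_{1,1} &= e^{-z^2}\biggl(z^4 + \sum_{n\ge 1} \frac{(n - z^2)^2}{n!}\, z^{2n}\biggr)
          + O\Bigl(\bigl(1 + |z|^2\bigr)\, e^{\Re(z)^2\cos^2(\theta_0/2) - \Re(z^2)}\Bigr) \\
        &= z^2 + O\Bigl(\bigl(1 + |z|^2\bigr)\, e^{\Re(z)^2\cos^2(\theta_0/2) - \Re(z^2)}\Bigr).
\end{aligned}
\]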

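Similarly,

\[
S_{0,0} = 1 + O\Bigl(e^{\Re(z)^2\cos^2(\theta_0/2) - \Re(z^2)}\Bigr),
\]

and

\[
S_{m,n} = O\Bigl(\bigl(1 + |z|^{m+n}\bigr)\, e^{\Re(z)^2\cos^2(\theta_0/2) - \Re(z^2)}\Bigr),
\]

whenever $m \ne n$. Therefore

\[
\begin{aligned}
\tilde h\bigl(z^2\bigr)
 &= I(z) + O\Bigl(\bigl(1 + |z|^{\alpha_1+\alpha_2}\bigr)\, e^{\Re(z)^2\cos^2(\theta_0/2) - \Re(z^2)}\Bigr) \\
 &= \tilde f\bigl(z^2\bigr)\,\tilde g\bigl(z^2\bigr) + z^2\,\tilde f'\bigl(z^2\bigr)\,\tilde g'\bigl(z^2\bigr)
    + O\bigl(|z|^{2\alpha_1 + 2\alpha_2 - 4}\bigr).
\end{aligned}
\]

Accordingly,

\[
\begin{aligned}
\tilde h(z)
 &= I(z) + O\Bigl(\bigl(1 + |z|^{\alpha_1+\alpha_2}\bigr)\, e^{\Re(z)^2\cos^2(\theta_0/2) - \Re(z^2)}\Bigr) \\
 &= \tilde f(z)\,\tilde g(z) + z\,\tilde f'(z)\,\tilde g'(z) + O\bigl(|z|^{\alpha_1+\alpha_2-2}\bigr).
\end{aligned}
\]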
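The main terms of $S_{0,0}$, $S_{1,1}$ and $S_{m,n}$ with $m \ne n$ used in the proof can be checked numerically for real $x = z^2 > 0$. The Python sketch below is an illustration under assumptions made here: the helper names, the truncation order and the test point $x = 4$ are not from the paper. The coefficients $\nu_{n,k}$ are obtained by expanding $e^{w}(w - x)^k$ with the binomial theorem, which gives $\nu_{n,k} = \sum_{j} \binom{k}{j}(-x)^{k-j}\, n(n-1)\cdots(n-j+1)$.

```python
import math

TERMS = 300  # truncation order of the Poisson series (an illustrative choice)

def nu(n, k, x):
    """Coefficient nu_{n,k} in sum_n nu_{n,k} w^n / n! = e^w (w - x)^k, x = z^2."""
    total = 0.0
    for j in range(k + 1):
        falling = 1.0                  # n (n-1) ... (n-j+1); vanishes when n < j
        for i in range(j):
            falling *= n - i
        total += math.comb(k, j) * (-x) ** (k - j) * falling
    return total

def main_term(k, l, x):
    """e^{-x} * sum_n nu_{n,k} nu_{n,l} x^n / n!, the main term of S_{k,l}."""
    total, weight = 0.0, math.exp(-x)  # weight holds e^{-x} x^n / n!
    for n in range(TERMS):
        total += nu(n, k, x) * nu(n, l, x) * weight
        weight *= x / (n + 1)
    return total

x = 4.0  # corresponds to z = 2
print(main_term(0, 0, x))  # main term of S_{0,0}
print(main_term(1, 1, x))  # main term of S_{1,1}
print(main_term(0, 1, x), main_term(1, 2, x), main_term(0, 3, x))  # cases m != n
```

For $k = 1$ the formula reduces to $\nu_{n,1} = n - x$, which matches the summand $(n - z^2)^2$ in the expression for $S_{1,1}$.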


Fig. 1. A tree rendering of the diverse themes pertinent to binomial splitting processes.
Fig. 2. The diverse themes and methodology developed (or mentioned) in Flajolet’s works that are connected to BSPs.
Fig. 3. Delicacy of visualizing the periodic oscillations in μ n , the expected size of tries in the symmetric case under the Bernoulli model: μ n = 1 + 2 1− n  k  n k
Fig. 4. A trie of n = 7 records: the circles represent internal nodes and rectangles holding the records are external nodes.