
The Cutoff Phenomenon for Ehrenfest Processes


Academic year: 2021


GUAN-YU CHEN1, YANG-JEN FANG2, AND YUAN-CHUNG SHEU3

Abstract. We consider families of Ehrenfest chains and provide a simple criterion for the $L^p$-cutoff and the $L^p$-precutoff with specified initial states for $1\le p<\infty$. For a family with an $L^p$-cutoff, a cutoff time is described and a possible window is given. For a family without an $L^p$-precutoff, the exact order of the $L^p$-mixing time is determined. The result is consistent with the well-known conjecture on cutoffs of Markov chains proposed by Peres in 2004, which says that a cutoff exists if and only if the product of the spectral gap and the mixing time tends to infinity.

1. Introduction

Consider a time-homogeneous Markov chain on a finite set $\Omega$ with one-step transition matrix $K$. Let $K^t(x,\cdot)$ denote the probability distribution of the chain at time $t$ starting from state $x$. It is well-known that if $K$ is ergodic (irreducible and aperiodic), then

\[ \lim_{t\to\infty} K^t(x,y) = \pi(y), \quad \forall x,y\in\Omega, \]

where $\pi$ is the unique invariant probability of $K$ on $\Omega$. Denote by $k_x^t$ the relative density of $K^t(x,\cdot)$ with respect to $\pi$, that is, $k_x^t(y) = K^t(x,y)/\pi(y)$. For $1\le p<\infty$, define the $L^p$-distance by

\[ D_p(x,t) = \|k_x^t - 1\|_{L^p(\pi)} = \Big(\sum_{y\in\Omega} |k_x^t(y)-1|^p\,\pi(y)\Big)^{1/p}. \]

For $p=\infty$, the $L^\infty$-distance is set to be $D_\infty(x,t) = \max_y |k_x^t(y)-1|$. In the case $p=1$, this is exactly twice the total variation distance between $K^t(x,\cdot)$ and $\pi$, which is defined by

\[ D_{TV}(x,t) = \|K^t(x,\cdot)-\pi\|_{TV} = \max_{A\subset\Omega}\{K^t(x,A)-\pi(A)\}. \]

For p = 2, it is the so-called chi-square distance. For any ϵ > 0 and 1≤ p ≤ ∞, define the Lp-mixing time by

Tp(x, ϵ) = min{t ≥ 0 : Dp(x, t)≤ ϵ}.
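The definitions above translate directly into a few lines of numerical code. The following is a minimal sketch, assuming the chain is small enough to store its transition matrix densely; the function names and the two-state toy chain are hypothetical and serve only as an illustration.

```python
import numpy as np

def lp_distance_of(mu, pi, p):
    """L^p(pi)-distance between a distribution mu and pi."""
    gap = np.abs(mu / pi - 1.0)            # |k(y) - 1| with k = mu / pi
    if np.isinf(p):
        return float(gap.max())
    return float(np.sum(gap ** p * pi) ** (1.0 / p))

def lp_distance(K, pi, x, t, p):
    """D_p(x, t) for the chain with transition matrix K started at x."""
    return lp_distance_of(np.linalg.matrix_power(K, t)[x], pi, p)

def mixing_time(K, pi, x, eps, p, t_max=100_000):
    """T_p(x, eps) = min{t >= 0 : D_p(x, t) <= eps}; None if not reached."""
    mu = np.zeros(len(pi))
    mu[x] = 1.0
    for t in range(t_max + 1):
        if lp_distance_of(mu, pi, p) <= eps:
            return t
        mu = mu @ K                        # one more step of the chain
    return None

# Hypothetical two-state ergodic chain with invariant law (2/3, 1/3).
K = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([2.0 / 3.0, 1.0 / 3.0])
```

For this toy chain the second eigenvalue is 0.7, so $D_1(0,t) = \tfrac{2}{3}(0.7)^t$ and the distances can be checked by hand.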

The concept of cutoffs was introduced by Aldous and Diaconis in [1, 2, 3] to capture the fact that many ergodic Markov chains converge abruptly to their stationary distributions (in total variation and separation). We refer the reader to [6, 7, 13, 14, 15] for details and further discussions on variant examples. In a word, when $1<p\le\infty$, a family of finite ergodic Markov chains $(\Omega_n,K_n,\pi_n)$ with specified initial states $x_n$ has an $L^p$-cutoff with cutoff time $t_n$ if

\[ \lim_{n\to\infty} D_{n,p}(x_n,(1+a)t_n) = \begin{cases} 0 & \text{if } a>0,\\ \infty & \text{if } -1<a<0, \end{cases} \]

where $D_{n,p}$ denotes the $L^p$-distance of the $n$th Markov chain. The definition for cutoffs in total variation, separation and $L^1$-distance is the same as above except that the limit $\infty$ is replaced with 1 in total variation and separation and with 2 in $L^1$-distance.

2000 Mathematics Subject Classification. 60J05, 60J25.

Key words and phrases. Cutoff phenomenon, Ehrenfest chains.

1 Partially supported by NSC grant NSC98-2628-M-009-003 and CMMSC and NCTS, Taiwan.
2 Partially supported by NSC grant NSC99-2115-M-009-008 and CMMSC and NCTS, Taiwan.
3 Partially supported by NSC grant NSC99-2115-M-009-010 and CMMSC and NCTS, Taiwan.

In [6], the authors discussed a number of variants of cutoffs and produced, in the reversible case, a necessary and sufficient condition for the existence of a max-$L^p$-cutoff, which is a cutoff in the distance $\max_{x\in\Omega} D_p(x,\cdot)$ with $1<p\le\infty$. In [7], an equivalent condition for the $L^2$-cutoff is established for families of Markov processes with specified initial distributions, assuming the associated semigroups are normal. Also, a formula for the $L^2$-cutoff time was introduced in [7], based on complete information about the spectral decomposition. This is in contrast with the techniques and results in [6], which involve little spectral theory.

Consider the Ehrenfest chains. For $n\ge 1$, let $\Omega_n=\{0,1,...,n\}$ and $K_n$ be the Markov kernel of the Ehrenfest chain on $\Omega_n$ defined by

\[ K_n(i,i+1) = 1-\frac{i}{n}, \quad K_n(i+1,i) = \frac{i+1}{n}, \quad \forall\, 0\le i\le n-1. \tag{1.1} \]

It is a simple exercise to check that the unbiased binomial distribution, $\pi_n(i)=\binom{n}{i}2^{-n}$, is the invariant probability of $K_n$ and the pair $(K_n,\pi_n)$ is reversible, i.e. $\pi_n(i)K_n(i,j)=\pi_n(j)K_n(j,i)$ for all $i,j\in\Omega_n$. By lifting the chain to a random walk on the hypercube, one may use the group representation of $(\mathbb{Z}_2)^n$ to identify the eigenvalues and eigenvectors of $K_n$ as follows.

Lemma 1.1. The matrix defined in (1.1) has eigenvalues

\[ \beta_{n,i} = 1-\frac{2i}{n}, \quad 0\le i\le n, \]

with $L^2(\pi_n)$-normalized eigenvectors

\[ \psi_{n,i}(x) = \binom{n}{i}^{-1/2} \sum_{k=0}^{i} (-1)^k \binom{x}{k}\binom{n-x}{i-k}, \quad 0\le i,x\le n. \tag{1.2} \]

See, e.g., [8] for a proof. Based on the above result, Chen and Saloff-Coste obtained the following theorem.

Theorem 1.2 ([7, Theorem 6.5]). Let $K_n$ be defined in (1.1) and set $K_n' = (I+nK_n)/(n+1)$, $\pi_n(i)=\binom{n}{i}2^{-n}$. Then, the following are equivalent.

(1) The family $\{(\Omega_n,K_n',\pi_n) : n=1,2,...\}$ with starting states $(x_n)_{n=1}^\infty$ has an $L^2$-cutoff.

(2) $|n-2x_n|/\sqrt{n}\to\infty$ as $n\to\infty$.

Moreover, if (2) holds, then

\[ T_{n,2}(x_n,\epsilon) = \frac{n}{2}\log\frac{|n-2x_n|}{\sqrt{n}} + O_\epsilon(n), \quad \forall \epsilon>0. \]

The notation $O_\epsilon(n)$ denotes a sequence in $n$ whose absolute values are bounded above by $C_\epsilon n$ for all $n\ge 1$ with $0<C_\epsilon<\infty$.
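Theorem 1.2 can be explored numerically for moderate $n$. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the chain is the lazy kernel $K_n' = (I+nK_n)/(n+1)$, and the $L^2$-mixing time is found by iterating the chain step by step rather than through the spectral formula.

```python
import math
import numpy as np

def ehrenfest_kernel(n):
    """Transition matrix K_n of (1.1) on {0, ..., n}."""
    K = np.zeros((n + 1, n + 1))
    for i in range(n):
        K[i, i + 1] = 1.0 - i / n
        K[i + 1, i] = (i + 1) / n
    return K

def l2_mixing_time(n, x, eps=1.0):
    """Smallest t with D_{n,2}(x, t) <= eps for the lazy chain K'_n."""
    K_lazy = (np.eye(n + 1) + n * ehrenfest_kernel(n)) / (n + 1)
    pi = np.array([math.comb(n, i) for i in range(n + 1)], dtype=float) / 2.0 ** n
    mu = np.zeros(n + 1)
    mu[x] = 1.0
    t = 0
    while math.sqrt(np.sum((mu / pi - 1.0) ** 2 * pi)) > eps:
        mu = mu @ K_lazy
        t += 1
    return t

n = 100
t_end = l2_mixing_time(n, 0)         # start at an end-point: |n - 2x| / sqrt(n) is large
t_mid = l2_mixing_time(n, n // 2)    # start at the centre: |n - 2x| = 0
t_formula = 0.5 * n * math.log(n / math.sqrt(n))   # (n/2) log(|n - 2x| / sqrt(n)) at x = 0
```

Started from $x=0$, the computed time should differ from `t_formula` by a quantity of order $n$, in line with the $O_\epsilon(n)$ term, whereas the start at the centre mixes in a time of order $n$ only.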

The aim of this paper is to provide a necessary and sufficient condition for the $L^p$-cutoff of Ehrenfest chains with $1\le p<\infty$ and to describe the $L^p$-cutoff time, if any. For $1<p<\infty$, the eigenfunctions are useful in bounding the $L^p$-distance; however, they do not work well in bounding the total variation distance of the associated semigroup from below. A path comparison with the simple random walk on $\mathbb{Z}$ is proposed to obtain a suitable lower bound, and this leads to the following result.

Theorem 1.3. As in the setting of Theorem 1.2, for $p\in[1,\infty)$, the following are equivalent.

(1) The family $\{(\Omega_n,K_n',\pi_n) : n=1,2,...\}$ with starting states $(x_n)_{n=1}^\infty$ has an $L^p$-cutoff.

(2) The family $\{(\Omega_n,K_n',\pi_n) : n=1,2,...\}$ with starting states $(x_n)_{n=1}^\infty$ has an $L^p$-precutoff.

(3) $|n-2x_n|/\sqrt{n}\to\infty$ as $n\to\infty$.

Moreover, if (2) holds, then

\[ T_{n,p}(x_n,\epsilon) = \frac{n}{2}\log\frac{|n-2x_n|}{\sqrt{n}} + O_{\epsilon,p}(n), \quad \forall \epsilon>0,\ p\in(1,\infty). \]

For $p=1$, the above identity remains true with $\epsilon\in(0,2)$.

This theorem is a special case of Theorems 4.1 and 5.1. The concept of precutoff will be introduced in the next section. In the case $p=1$, it has been proved in [7] that (3) is sufficient for (1). As the Ehrenfest chain is a birth-and-death chain, we refer the reader to [9, 10] for more results on cutoffs, where the first article treats the cutoff in separation for chains starting from one end-point and the second article considers the max-total variation cutoff for lazy chains. Both of them introduce a universal criterion for cutoffs, but the Ehrenfest chain falls outside their scope.

The remainder of this article is organized as follows. In Section 2, we recall various notions of cutoffs and quote useful results from [6]. In Section 3, we recall some well-known results for simple random walks on $\mathbb{Z}$, which will be used later, and provide proofs of them. In Section 4, we deal with the total variation cutoff for the Ehrenfest chains in both the continuous time and discrete time cases. The ideas developed in this section are in fact applicable to more general models. In Section 5, we treat the $L^p$-cutoff and spell out the results along with the open problems.

2. Cutoffs

Throughout the following sections, the term $(\Omega,K,\pi,\mu)$ will be used to denote a time-homogeneous irreducible Markov chain on $\Omega$ with one-step transition matrix $K$, invariant probability $\pi$ and initial distribution $\mu$. Write $(\Omega,H_t,\pi,\mu)$ for the continuous time Markov chain associated with $(\Omega,K,\pi,\mu)$, where $H_t = e^{-t(I-K)}$ is the semigroup associated with $K$. If the chain starts at state $x$, we write $(\Omega,K,\pi,x)$ and $(\Omega,H_t,\pi,x)$ instead. For any two sequences of positive numbers, say $t_n, s_n$, the notation $s_n = O(t_n)$ means that there are $N>0$ and $C>0$ such that $s_n\le Ct_n$ for all $n\ge N$. If both $s_n=O(t_n)$ and $t_n=O(s_n)$ hold, we simply write $t_n\asymp s_n$.

In this section, we recall various definitions of cutoffs and a series of related results from [6]. The notion of cutoff can be developed for any family of non-increasing functions taking values in $[0,\infty]$. The following definitions treat the $L^p$-cutoff for families of finite ergodic Markov chains with specified initial distributions in the discrete time case. We refer the reader to [6] for further details and examples.

Definition 2.1. Let $\mathcal{F}=\{(\Omega_n,K_n,\pi_n,\mu_n) : n=1,2,...\}$ be a family of irreducible and aperiodic finite Markov chains. For $p\in(1,\infty]$, the family $\mathcal{F}$ is said to present:

(1) An $L^p$-precutoff if there is a sequence $t_n>0$ and constants $0<A<B$ such that

\[ \lim_{n\to\infty} D_{n,p}(\mu_n,B_n) = 0, \quad \liminf_{n\to\infty} D_{n,p}(\mu_n,A_n) > 0, \]

where $B_n=\inf\{j\ge 0 : j>Bt_n\}$ and $A_n=\sup\{j\ge 0 : j<At_n\}$.

(2) An $L^p$-cutoff if there is a sequence $t_n>0$ such that, for all $\epsilon\in(0,1)$,

\[ \lim_{n\to\infty} D_{n,p}(\mu_n,\overline{k}_n(\epsilon)) = 0, \quad \lim_{n\to\infty} D_{n,p}(\mu_n,\underline{k}_n(-\epsilon)) = \infty, \]

where $\overline{k}_n(\epsilon)=\inf\{j\ge 0 : j>(1+\epsilon)t_n\}$ and $\underline{k}_n(\epsilon)=\sup\{j\ge 0 : j<(1+\epsilon)t_n\}$.

(3) A $(t_n,b_n)$ $L^p$-cutoff if $t_n>0$, $b_n>0$, $b_n=o(t_n)$ and

\[ \lim_{c\to\infty}\overline{F}_p(c) = 0, \quad \lim_{c\to-\infty}\underline{F}_p(c) = \infty, \]

where

\[ \overline{F}_p(c) = \limsup_{n\to\infty} D_{n,p}(\mu_n,\overline{k}(n,c)), \quad \underline{F}_p(c) = \liminf_{n\to\infty} D_{n,p}(\mu_n,\underline{k}(n,c)), \]

and $\overline{k}(n,c)=\inf\{j\ge 0 : j>t_n+cb_n\}$ and $\underline{k}(n,c)=\sup\{j\ge 0 : j<t_n+cb_n\}$.

The definition for the case $p=1$ follows if $\infty$ is replaced by 2.

The definition agrees with that in [6]. In (2) and (3), $t_n$ is called an $L^p$-cutoff time and $b_n$ is a window with respect to $t_n$. In (3), the functions $\overline{F}_p$ and $\underline{F}_p$ give an idea of how the cutoff evolves and are sometimes called the shape of the $(t_n,b_n)$ cutoff.

Remark 2.1. Note that, for $t>0$, the mapping $t\mapsto D_{n,p}(\mu_n,t)$ is non-increasing. This implies that, if $t_n$ tends to infinity (or equivalently $T_{n,p}(\mu_n,\epsilon)\to\infty$ for some $\epsilon>0$) in Definition 2.1, it makes no difference to replace $A_n$ with $\lfloor At_n\rfloor$ or $\lceil At_n\rceil$, and similarly for $B_n$, $\overline{k}_n(\epsilon)$, $\underline{k}_n(\epsilon)$, $\overline{k}(n,c)$ and $\underline{k}(n,c)$.

Remark 2.2. In the continuous time case, the definition of cutoffs in Definition 2.1 carries over in the intuitive way. That is, $A_n=At_n$, $B_n=Bt_n$, $\overline{k}_n(\epsilon)=\underline{k}_n(\epsilon)=(1+\epsilon)t_n$ and $\overline{k}(n,c)=\underline{k}(n,c)=t_n+cb_n$.

Remark 2.3. According to Definition 2.1, if a family has no $L^p$-precutoff (resp. $L^p$-cutoff), then the new family obtained by merging this one with any other still has no $L^p$-precutoff (resp. $L^p$-cutoff). This implies that if a subfamily has no $L^p$-precutoff (resp. $L^p$-cutoff), then the original family has no $L^p$-precutoff (resp. $L^p$-cutoff). However, there might exist another subfamily that has an $L^p$-precutoff (resp. $L^p$-cutoff).

Definition 2.2. Let $(\Omega,K,\pi,\mu)$ be an irreducible finite Markov chain and $p\in[1,\infty]$. For $\epsilon>0$, the $\epsilon$-$L^p$-mixing time (or briefly the $L^p$-mixing time) is defined to be

\[ T_p(\mu,\epsilon) := \inf\{t\ge 0 : D_p(\mu,t)\le\epsilon\}, \]

where the right side is set to be infinity if the infimum is taken over an empty set. If $(\Omega,H_t,\pi,\mu)$ is the continuous time chain associated with $K$, write the $L^p$-mixing time as

\[ T_p^c(\mu,\epsilon) := \inf\{t\ge 0 : D_p^c(\mu,t)\le\epsilon\}, \]

where $D_p^c(\mu,t)$ is the $L^p$-distance between $\mu H_t$ and $\pi$.

The concept of cutoff can also be described using the notion of mixing time. For instance, assuming $T_{n,p}(\mu_n,\epsilon)\to\infty$ for some $\epsilon>0$, a family of irreducible and aperiodic Markov chains has an $L^p$-cutoff if and only if

\[ \lim_{n\to\infty} \frac{T_{n,p}(\mu_n,\epsilon)}{T_{n,p}(\mu_n,\delta)} = 1, \quad \forall \epsilon,\delta\in(0,M_p), \]

where $M_p=\infty$ if $p>1$ and $M_1=2$. See [6, Proposition 2.3-2.4] for further details and relationships.

We end this section by introducing the following lemmas and corollary, which provide an idea of how to prove or disprove cutoffs.

Lemma 2.1 ([7, Proposition 2.1]). Let $\mathcal{F}=\{(\Omega_n,K_n,\pi_n,\mu_n) : n=1,2,...\}$ be a family of irreducible and aperiodic Markov chains. For any subsequence $\xi=(\xi_n)_{n=1}^\infty$ of positive integers, set $\mathcal{F}_\xi=\{(\Omega_{\xi_n},K_{\xi_n},\pi_{\xi_n},\mu_{\xi_n}) : n=1,2,...\}$. Let $p\in[1,\infty]$ and assume $T_{n,p}(\mu_n,\epsilon)\to\infty$ for some $\epsilon>0$. Then, the following are equivalent.

(1) $\mathcal{F}$ has an $L^p$-cutoff (resp. $(t_n,b_n)$ $L^p$-cutoff).

(2) For any subsequence $\xi$, $\mathcal{F}_\xi$ has an $L^p$-cutoff (resp. $(t_{\xi_n},b_{\xi_n})$ $L^p$-cutoff).

(3) For any subsequence $\xi$, there is a further subsequence $\xi'$ such that $\mathcal{F}_{\xi'}$ has an $L^p$-cutoff (resp. $(t_{\xi'_n},b_{\xi'_n})$ $L^p$-cutoff).

Remark 2.4. In Lemma 2.1, (1)⇒(2)⇒(3) also holds true for the $L^p$-precutoff.

Lemma 2.2. Let $\mathcal{F}=\{(\Omega_n,K_n,\pi_n,\mu_n) : n=1,2,...\}$ be a family of irreducible and aperiodic Markov chains and $p\in[1,\infty]$. Suppose that there is $\epsilon>0$ and $a_n\to\infty$ such that $T_{n,p}(\mu_n,\epsilon)\asymp a_n$ and $T_{n,p}(\mu_n,\delta)=O(a_n)$ for all $0<\delta<\epsilon$. Then, the following are equivalent.

(1) $\mathcal{F}$ has no $L^p$-precutoff.

(2) For all $c>0$, $\limsup_{n\to\infty} D_{n,p}(\mu_n,\lfloor ca_n\rfloor) > 0$.

(3) As $\delta\to 0$, $\limsup_{n\to\infty} T_{n,p}(\mu_n,\delta)/a_n \to \infty$.

Proof. (2)⇔(3) is obvious from the definition of the $L^p$-mixing time. By the monotonicity of the $L^p$-distance, the negations of (1) and (2) are exactly

(1)' $\mathcal{F}$ has an $L^p$-precutoff.

(2)' There is $C>0$ such that $\lim_{n\to\infty} D_{n,p}(\mu_n,\lfloor Ca_n\rfloor) = 0$.

We prove the equivalence of (1) and (2) by showing (1)'⇔(2)' instead. First, assume that $\mathcal{F}$ has an $L^p$-precutoff and, according to Remark 2.1, let $t_n>0$ and $0<A<B$ be constants such that

\[ \liminf_{n\to\infty} D_{n,p}(\mu_n,\lfloor At_n\rfloor) = \epsilon_0 > 0, \quad \lim_{n\to\infty} D_{n,p}(\mu_n,\lfloor Bt_n\rfloor) = 0. \]

Let $\delta<\min\{\epsilon,\epsilon_0\}$ and choose $N>0$, $C_1>0$ such that

\[ D_{n,p}(\mu_n,\lfloor At_n\rfloor) > \delta > D_{n,p}(\mu_n,\lfloor Bt_n\rfloor), \quad T_{n,p}(\mu_n,\delta)\le C_1a_n, \quad \forall n\ge N. \]

The former implies $At_n\le T_{n,p}(\mu_n,\delta)\le Bt_n$ and, then,

\[ Bt_n \le \frac{B\,T_{n,p}(\mu_n,\delta)}{A} \le \frac{BC_1}{A}\,a_n. \]

This yields

\[ \limsup_{n\to\infty} D_{n,p}(\mu_n,\lfloor BC_1a_n/A\rfloor) \le \limsup_{n\to\infty} D_{n,p}(\mu_n,\lfloor Bt_n\rfloor) = 0. \]

Second, assume (2)' and choose $C_2>0$ such that $T_{n,p}(\mu_n,\epsilon)\ge C_2a_n$ and $a_n\ge 2/C_2$. Then, for $n\ge 1$,

\[ D_{n,p}(\mu_n,\lfloor C_2a_n/2\rfloor) \ge D_{n,p}(\mu_n,\lfloor C_2a_n-1\rfloor) \ge D_{n,p}(\mu_n,T_{n,p}(\mu_n,\epsilon)-1) > \epsilon > 0. \]

This proves the $L^p$-precutoff. □

The following is a simple corollary of Lemma 2.2, which treats the $L^p$-precutoff in a stricter sense.

Corollary 2.3. As in the setting of Lemma 2.2, the following are equivalent.

(1) No subfamily of $\mathcal{F}$ has an $L^p$-precutoff.

(2) For all $c>0$, $\liminf_{n\to\infty} D_{n,p}(\mu_n,\lfloor ca_n\rfloor) > 0$.

(3) As $\delta\to 0$, $\liminf_{n\to\infty} T_{n,p}(\mu_n,\delta)/a_n \to \infty$.

Remark 2.5. It makes no difference to replace $\lfloor ca_n\rfloor$ with $\lceil ca_n\rceil$ in (2) of Lemma 2.2 and Corollary 2.3.

Remark 2.6. Lemmas 2.1-2.2 and Corollary 2.3 can be generalized to any family of non-increasing functions defined on $\{0,1,2,...\}$ or $[0,\infty)$. In particular, they hold for continuous time Markov chains without the assumptions $T_{n,p}(\mu_n,\epsilon)\to\infty$ and $a_n\to\infty$.

3. Simple random walks on Z

This section is devoted to establishing some frequently used inequalities related to the simple random walk on the integers. A simple random walk is a discrete time Markov chain $(X_n)_{n=0}^\infty$ whose transition matrix is given by

\[ K(i,i+1) = K(i,i-1) = 1/2, \quad \forall i\in\mathbb{Z}. \]

For $m\ge 1$, let $T_m$ be the first passage time to the set $\{\pm m\}$, i.e.

\[ T_m = \inf\{n\ge 0 : X_n=m \text{ or } X_n=-m\}. \tag{3.1} \]

For the continuous time case, let $N(t)$ be a Poisson process with parameter 1, independent of $X_n$, and set $Y_t = X_{N(t)}$. Clearly, $Y_t$ is a realization of the semigroup $H_t = e^{-t(I-K)}$ associated with $K$, and the first passage time to $\{\pm m\}$ is denoted by

\[ \widetilde{T}_m = \inf\{t\ge 0 : Y_t=m \text{ or } Y_t=-m\}. \tag{3.2} \]

Theorem 3.1. Let $T_m$, $\widetilde{T}_m$ be the random times defined in (3.1)-(3.2) and $\mathbb{P}_0$ be the conditional probability given that the initial state is 0. Then, for any $b>1$ and $m\ge 5$,

\[ \min\{\mathbb{P}_0(T_m>bm^2),\ \mathbb{P}_0(\widetilde{T}_m>bm^2)\} \ge e^{-2b}. \]

Remark 3.1. This theorem says that, in both discrete and continuous time, the simple random walk starting from the origin has not reached $\pm m$ by time $bm^2$ with probability at least $e^{-2b}$, uniformly over $m\ge 5$.
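The discrete time half of Theorem 3.1 is easy to probe numerically. The sketch below, with a hypothetical helper name, computes $\mathbb{P}_0(T_m>k)$ exactly by propagating the walk on $\{-(m-1),...,m-1\}$ and discarding the mass that steps onto $\pm m$.

```python
import math
import numpy as np

def srw_survival(m, k):
    """P_0(T_m > k) for the simple random walk on Z, by dynamic programming."""
    # State j in {-(m-1), ..., m-1} is stored at index j + m - 1.
    size = 2 * m - 1
    prob = np.zeros(size)
    prob[m - 1] = 1.0                      # start at the origin
    for _ in range(k):
        new = np.zeros(size)
        new[1:] += 0.5 * prob[:-1]         # step +1; mass leaving m-1 is killed
        new[:-1] += 0.5 * prob[1:]         # step -1; mass leaving -(m-1) is killed
        prob = new
    return float(prob.sum())

# Compare the survival probability with the bound e^{-2b} of Theorem 3.1.
for m in (5, 8, 12):
    for b in (1.5, 2.0, 3.0):
        k = math.floor(b * m * m)          # T_m is integer valued
        print(m, b, srw_survival(m, k), math.exp(-2.0 * b))
```

Since $T_m$ takes integer values, $\mathbb{P}_0(T_m>bm^2)=\mathbb{P}_0(T_m>\lfloor bm^2\rfloor)$, which is why the floor is used above.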

To prove this theorem, we introduce the following proposition.

Proposition 3.2. Let $K$ be the transition matrix of an irreducible birth-and-death chain on $\{0,1,...\}$. For $m\ge 1$, let $\tau_m$ and $\widetilde{\tau}_m$ be respectively the first passage times to state $m$ associated with the discrete time and continuous time chains. Let $\lambda_1,...,\lambda_m$ be the eigenvalues of the submatrix of $I-K$ indexed by $\{0,1,...,m-1\}$. Then, $\lambda_i\in(0,2)$ for $1\le i\le m$, $\lambda_i\ne\lambda_j$ for $i\ne j$, and

\[ \mathbb{P}_0(\tau_m>k) = \sum_{i=1}^{m}\Big(\prod_{j:j\ne i}\frac{\lambda_j}{\lambda_j-\lambda_i}\Big)(1-\lambda_i)^k \tag{3.3} \]

and

\[ \mathbb{P}_0(\widetilde{\tau}_m>t) = \sum_{i=1}^{m}\Big(\prod_{j:j\ne i}\frac{\lambda_j}{\lambda_j-\lambda_i}\Big)e^{-t\lambda_i}. \tag{3.4} \]

Remark 3.2. The right side of (3.4) is exactly $\mathbb{P}(\widetilde{T}>t)$, where $\widetilde{T}$ is a sum of $m$ independent exponential random variables with parameters $\lambda_1,...,\lambda_m$. Assuming $\lambda_i\in(0,1)$ for all $1\le i\le m$, the right side of (3.3) is equal to $\mathbb{P}(T>k)$, where $T$ is a sum of independent geometric random variables with success probabilities $\lambda_1,...,\lambda_m$.
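Formula (3.3) can be compared with a direct computation on a small example. The sketch below is an illustration only: it uses the reflected walk with $K(0,1)=1$ and $K(i,i\pm 1)=1/2$ for $i\ge 1$ as a test chain, takes $m=4$, and obtains the eigenvalues numerically; the function names are hypothetical.

```python
import numpy as np

def reflected_walk_submatrix(m):
    """Restriction to {0, ..., m-1} of K with K(0,1)=1, K(i,i+1)=K(i,i-1)=1/2."""
    Q = np.zeros((m, m))
    if m > 1:
        Q[0, 1] = 1.0
    for i in range(1, m):
        Q[i, i - 1] = 0.5
        if i + 1 < m:
            Q[i, i + 1] = 0.5              # the step from m-1 to m is cut off
    return Q

def survival_direct(Q, k):
    """P_0(tau_m > k): mass still inside {0, ..., m-1} after k steps."""
    v = np.zeros(Q.shape[0])
    v[0] = 1.0
    for _ in range(k):
        v = v @ Q
    return float(v.sum())

def survival_spectral(lam, k):
    """Right side of (3.3) for eigenvalues lam of the submatrix of I - K."""
    total = 0.0
    for i, li in enumerate(lam):
        weight = np.prod([lj / (lj - li) for j, lj in enumerate(lam) if j != i])
        total += weight * (1.0 - li) ** k
    return float(total)

m = 4
Q = reflected_walk_submatrix(m)
# Birth-and-death submatrices have real eigenvalues; drop rounding-level imaginary parts.
lam = np.sort(np.linalg.eigvals(np.eye(m) - Q).real)
```

For this test chain the eigenvalues should agree with the closed form $1-\cos\frac{(2i-1)\pi}{2m}$, and the two survival functions should coincide for every $k\ge 0$.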

Proof of Proposition 3.2. The proof for the continuous time case is available in [4], while the proof for the discrete time case follows in the same spirit. □

Back to the setting of the simple random walk. Observe that

\[ \mathbb{P}_0(T_m>k) = \mathbb{P}_0(|X_i|<m,\ \forall i\le k), \quad \mathbb{P}_0(\widetilde{T}_m>t) = \mathbb{P}_0(|Y_s|<m,\ \forall s\le t). \]

By the symmetry of the walk starting from 0, one may collapse the states $\pm i$ to obtain

\[ \mathbb{P}_0(T_m>k) = \mathbb{P}'_0(\tau_m>k), \quad \mathbb{P}_0(\widetilde{T}_m>t) = \mathbb{P}'_0(\widetilde{\tau}_m>t), \]

where $\mathbb{P}'_0$ is the probability for the birth-and-death chain on $\{0,1,...\}$ with initial state 0 and transition matrix $K'$ given by

\[ K'(0,1) = 1, \quad K'(i,i-1) = K'(i,i+1) = 1/2, \quad \forall i\ge 1. \]

Here, $\tau_m$ and $\widetilde{\tau}_m$ are the first passage times to state $m$ associated with the discrete [...] [11, Section XIV.5], the eigenvalues and eigenvectors for the submatrix of $I-K'$ indexed by $0,1,...,m-1$ are

\[ \lambda_i = 1-\cos\frac{(2i-1)\pi}{2m}, \quad \phi_i(j) = \cos\frac{(2i-1)(j-1)\pi}{2m}, \quad \forall i,j\in\{1,...,m\}. \]

We first treat the continuous time case. Let $S_1,...,S_m$ be independent exponential random variables with parameters $\lambda_i$. As a consequence of Proposition 3.2, replacing $t$ with $bm^2$ yields

\[ \mathbb{P}_0(\widetilde{T}_m>bm^2) = \mathbb{P}(S_1+\cdots+S_m>bm^2) \ge \mathbb{P}(S_1>bm^2) = e^{-bm^2\lambda_1} \ge e^{-2b}, \]

where the last inequality uses the fact $1-\cos t\le t^2/2$. For the discrete time case, the periodicity of $K'$, which is of period 2, implies $\lambda_i>1$ for some $i$. This prevents us from using the same reasoning as in the continuous time case. One way to remove the periodicity of $K'$ is to consider the lazy walk with transition matrix $\frac12(I+K')$, since the eigenvalues of the submatrix of $I-\frac12(I+K')$ indexed by $\{0,...,m-1\}$ are contained in $(0,1)$. In detail, let $(X'_n)_{n=0}^\infty$ be the birth-and-death chain with transition matrix $K'$ and define $Z_n = X'_{2n}/2$. Obviously,

\[ \mathbb{P}'_0(Z_{n+1}=1\,|\,Z_n=0) = \mathbb{P}'_0(X'_{2n+2}=2\,|\,X'_{2n}=0) = 1/2. \]

For $i>0$,

\[ \mathbb{P}'_0(Z_{n+1}=i+1\,|\,Z_n=i) = \mathbb{P}'_0(X'_{2n+2}=2i+2\,|\,X'_{2n}=2i) = 1/4 \]

and

\[ \mathbb{P}'_0(Z_{n+1}=i-1\,|\,Z_n=i) = \mathbb{P}'_0(X'_{2n+2}=2i-2\,|\,X'_{2n}=2i) = 1/4 \]

and, for $i\ge 0$,

\[ \mathbb{P}'_0(Z_{n+1}=i\,|\,Z_n=i) = \mathbb{P}'_0(X'_{2n+2}=2i\,|\,X'_{2n}=2i) = 1/2. \]

This implies that given $X'_0=0$, or equivalently $Z_0=0$, $(Z_n)_{n=0}^\infty$ is a Markov chain on $\{0,1,...\}$ with initial state 0 and transition matrix $\frac12(I+K')$. Furthermore, by the periodicity of $K'$, if $m$ is even and positive, then

\[ \mathbb{P}'_0(\tau_m>k) = \mathbb{P}'_0(X'_i<m,\ \forall i\le k) = \mathbb{P}'_0(Z_i<m/2,\ \forall i\le\lfloor k/2\rfloor). \]

If $m$ is odd and $m>1$, then

\[ \mathbb{P}'_0(\tau_m>k) = \mathbb{P}'_1(X'_i<m,\ \forall i\le k-1) = \mathbb{P}'_0(Z_i<(m-1)/2,\ \forall i\le\lfloor(k-1)/2\rfloor), \]

where the last equality uses the fact that, given $X'_0=1$, the process $(X'_{2n}-1)_{n=1}^\infty$ has the same distribution as $(Z_n)_{n=1}^\infty$ with $Z_0=0$. Let $\tau'_m$ be the first passage time to $m$ of the chain $(Z_n)_{n=0}^\infty$. Putting all the above together yields

\[ \mathbb{P}_0(T_m>k) = \mathbb{P}'_0(\tau_m>k) \ge \mathbb{P}'_0(\tau'_{\lfloor m/2\rfloor}>\lfloor k/2\rfloor). \]

Note that the eigenvalues of the submatrix of $I-\frac12(I+K')$ indexed by $0,1,...,\lfloor m/2\rfloor-1$ are $\lambda_i/2\in(0,1)$, $1\le i\le\lfloor m/2\rfloor$. By Proposition 3.2, if $S'_1,...,S'_{\lfloor m/2\rfloor}$ are independent geometric random variables with success probabilities $\lambda_1/2,...,\lambda_{\lfloor m/2\rfloor}/2$, then, for any positive integer $k$,

\[ \mathbb{P}_0(T_m>k) \ge \mathbb{P}(S'_1+\cdots+S'_{\lfloor m/2\rfloor}>\lfloor k/2\rfloor) \ge \Big(\frac{1+\cos(\pi/(2\lfloor m/2\rfloor))}{2}\Big)^{\lfloor k/2\rfloor}. \]

Replacing $k$ with $\lfloor bm^2\rfloor$, $b>1$ and $m>1$ gives

\[ \mathbb{P}_0(T_m>bm^2) \ge \Big(\frac{1+\cos(\pi/(2\lfloor m/2\rfloor))}{2}\Big)^{\lfloor k/2\rfloor} \ge \Big(\frac{1+\cos(\pi/(m-1))}{2}\Big)^{bm^2/2} = \Big(\cos\frac{\pi}{2(m-1)}\Big)^{bm^2} \ge \Big(1-\frac{\pi^2}{8(m-1)^2}\Big)^{bm^2} \ge e^{-2b}, \]

where the last inequality uses the fact $\log(1-t)\ge -12t/11$ for $t<1/12$ and requires $m\ge 5$.

4. The total variation cutoff of Ehrenfest chains

This section is dedicated to the total variation cutoff of Ehrenfest chains. First, recall the setting in (1.1). For $n\ge 1$, let $\Omega_n=\{0,1,...,n\}$ and $K_n$ be the transition matrix of the Ehrenfest chain on $\Omega_n$ given by

\[ K_n(i,i+1) = 1-\frac{i}{n}, \quad K_n(i+1,i) = \frac{i+1}{n}, \quad \forall\, 0\le i\le n-1. \tag{4.1} \]

It is easy to see that $K_n$ is irreducible and of period 2, with stationary distribution $\pi_n(i)=\binom{n}{i}2^{-n}$ for $0\le i\le n$. In view of the periodicity of $K_n$, consider the lazy version $K'_n$ and the semigroup $H_{n,t}$ associated with $K_n$:

\[ K'_n = \frac{1}{n+1}I + \frac{n}{n+1}K_n, \quad H_{n,t} = e^{-t(I-K_n)} = \sum_{i=0}^{\infty}\Big(e^{-t}\frac{t^i}{i!}\Big)K_n^i. \tag{4.2} \]

The total variation distance between $(K'_n)^t$ (resp. $H_{n,t}$) and $\pi_n$ with initial state $x_n$ is defined by

\[ D_{n,TV}(x_n,t) := \max_{A\subset\Omega_n}|(K'_n)^t(x_n,A)-\pi_n(A)| \]

and

\[ D^c_{n,TV}(x_n,t) := \max_{A\subset\Omega_n}|H_{n,t}(x_n,A)-\pi_n(A)|. \]

The total variation mixing time is set to be

\[ T_{n,TV}(x_n,\epsilon) := \min\{t\ge 0 : D_{n,TV}(x_n,t)\le\epsilon\} \]

and

\[ T^c_{n,TV}(x_n,\epsilon) := \min\{t\ge 0 : D^c_{n,TV}(x_n,t)\le\epsilon\}. \]

For $p\in[1,\infty]$, let $D_{n,p}$, $D^c_{n,p}$ and $T_{n,p}$, $T^c_{n,p}$ be the $L^p$-distances and the $L^p$-mixing times in the discrete and continuous time cases.
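The objects in (4.1)-(4.2) can be assembled numerically as follows. This is a minimal sketch with hypothetical function names; in particular, the semigroup $H_{n,t}$ is approximated by truncating the Poisson series in (4.2) after a fixed number of terms, which is an assumption suitable only for moderate $t$.

```python
import math
import numpy as np

def ehrenfest_kernel(n):
    """Transition matrix K_n of (4.1)."""
    K = np.zeros((n + 1, n + 1))
    for i in range(n):
        K[i, i + 1] = 1.0 - i / n
        K[i + 1, i] = (i + 1) / n
    return K

def binomial_pi(n):
    """Stationary distribution pi_n(i) = C(n, i) 2^{-n}."""
    return np.array([math.comb(n, i) for i in range(n + 1)], dtype=float) / 2.0 ** n

def lazy_kernel(n):
    """K'_n = I/(n+1) + n K_n/(n+1)."""
    return (np.eye(n + 1) + n * ehrenfest_kernel(n)) / (n + 1)

def heat_kernel(n, t, terms=200):
    """H_{n,t} = sum_i e^{-t} t^i / i! K_n^i, truncated after `terms` terms."""
    K = ehrenfest_kernel(n)
    weight = math.exp(-t)                  # Poisson weight for i = 0
    power = np.eye(n + 1)                  # K_n^0
    H = weight * power
    for i in range(1, terms):
        weight *= t / i
        power = power @ K
        H += weight * power
    return H

def tv_distance(row, pi):
    """Total variation distance, i.e. half of the L^1 distance."""
    return 0.5 * float(np.abs(row - pi).sum())

n, x = 10, 2
pi = binomial_pi(n)
tv_discrete = [tv_distance(np.linalg.matrix_power(lazy_kernel(n), t)[x], pi) for t in range(21)]
tv_continuous = tv_distance(heat_kernel(n, 5.0)[x], pi)
```

The maximum over sets $A$ in the definition of the total variation distance is attained at $A=\{y : \mu(y)>\pi_n(y)\}$, which is why `tv_distance` may use half of the $\ell^1$ difference.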

Remark 4.1. The coupling, a classical probabilistic technique, was introduced by Aldous and Diaconis to control and further to identify the total variation distance. See [2] and the references therein for details.

According to the above setting, it is clear that the total variation distance is exactly half of the $L^1$-distance and has 1 as its maximum. In the same spirit, the total variation cutoff is consistent with the $L^1$-cutoff and, thus, the definition is the same as in Definition 2.1 except that $\infty$ is replaced by 1. The following theorem deals with the total variation cutoff of Ehrenfest chains.

Theorem 4.1. For $n\ge 1$, let $x_n\in\Omega_n$. Consider the families $\mathcal{F}=\{(\Omega_n,K'_n,\pi_n,x_n) : n=1,2,...\}$ and $\mathcal{F}_c=\{(\Omega_n,H_{n,t},\pi_n,x_n) : n=1,2,...\}$. Then, the following are equivalent.

(1) $\mathcal{F}$ (resp. $\mathcal{F}_c$) has a total variation precutoff.

(2) $\mathcal{F}$ (resp. $\mathcal{F}_c$) has a total variation cutoff.

(3) $|n-2x_n|/\sqrt{n}\to\infty$.

Furthermore, if (3) holds, then both $\mathcal{F}$ and $\mathcal{F}_c$ have a $(t_n,n)$ total variation cutoff with

\[ t_n = \frac{n}{2}\log\frac{|n-2x_n|}{\sqrt{n}}. \]

Remark 4.2. The window size $n$ is optimal in the sense that, if $\mathcal{F}$ or $\mathcal{F}_c$ has a $(t_n,b_n)$ total variation cutoff, then $n=O(b_n)$. See [6] for details on variants of window optimality.

Proof of Theorem 4.1. (3)⇒(2) and the $(t_n,n)$ total variation cutoff under (3) have been proved in [7]. (2)⇒(1) follows from the definition. For (1)⇒(3), we assume (3) fails and prove that $\mathcal{F}$ and $\mathcal{F}_c$ have no total variation precutoff. By Remark 2.3, it suffices to show that, if $|x_n-n/2|/\sqrt{n}$ is bounded, then no subfamily of $\mathcal{F}$ and $\mathcal{F}_c$ has a total variation precutoff. The proof consists of three steps.

Step 1: Bounding the total variation from above. Note that the total variation distance is bounded above by the chi-square distance. That is,

\[ 2D_{n,TV}(x,t)\le D_{n,2}(x,t), \quad 2D^c_{n,TV}(x,t)\le D^c_{n,2}(x,t). \]

Using the reversibility of $K_n$ and Lemma 1.1, the $L^2$-distance can be expressed as follows.

\[ [D_{n,2}(x,t)]^2 = \sum_{i=1}^{n}|\psi_{n,i}(x)|^2\Big(1-\frac{2i}{n+1}\Big)^{2t} \le 2\sum_{i=1}^{\lfloor n/2\rfloor}|\psi_{n,i}(x)|^2\Big(1-\frac{2i}{n+1}\Big)^{2t} + \Big(1-\frac{2}{n+1}\Big)^{2t} \le 2\sum_{i=1}^{\lfloor n/2\rfloor}|\psi_{n,i}(x)|^2e^{-4it/(n+1)} + e^{-4t/(n+1)}, \]

where $\psi_{n,i}$ is the function defined in (1.2) and the first inequality applies the identity $\psi_{n,n-i}(x)=(-1)^x\psi_{n,i}(x)$ for all $x,i\in\{0,1,...,n\}$. It is worthwhile to note that the summation in the last line is also an upper bound for the continuous time case since

\[ [D^c_{n,2}(x,t)]^2 = \sum_{i=1}^{n}|\psi_{n,i}(x)|^2e^{-4it/n} \le 2\sum_{i=1}^{\lfloor n/2\rfloor}|\psi_{n,i}(x)|^2e^{-4it/(n+1)} + e^{-4nt/(n+1)}. \]

Observe that $\psi_{n,i}(x)=\binom{n}{i}^{1/2}P_i(x,1/2,n)$, where $P_i(x,p,n)$ is the Krawtchouk polynomial, i.e.

\[ P_i(x,p,n) = {}_2F_1\Big(\begin{matrix}-i,\ -x\\ -n\end{matrix}\ \Big|\ \frac{1}{p}\Big). \]

See [12] for the definition. Using the recurrence relation

\[ (n-2x)P_i(x,1/2,n) = (n-i)P_{i+1}(x,1/2,n) + iP_{i-1}(x,1/2,n), \]

we may rewrite

\[ \psi_{n,i+1}(x) = \frac{n-2x}{\sqrt{n}}A_{n,i}\psi_{n,i}(x) - B_{n,i}\psi_{n,i-1}(x), \tag{4.3} \]

where

\[ A_{n,i} = \sqrt{\frac{n}{(i+1)(n-i)}}, \quad B_{n,i} = \sqrt{\frac{i(n-i+1)}{(i+1)(n-i)}}. \]

Obviously, for $n\ge 2$ and $1\le i<n$, $A_{n,i}\le 1$ and $B_{n,i}\le 1$. By setting $r = 1+\sup_n\{|n-2x_n|/\sqrt{n}\}<\infty$, we obtain

\[ |\psi_{n,i+1}(x_n)| \le (r-1)|\psi_{n,i}(x_n)| + |\psi_{n,i-1}(x_n)|, \quad \forall\, 1\le i<n. \]

Along with the boundary condition,

\[ |\psi_{n,0}(x_n)| = 1, \quad |\psi_{n,1}(x_n)| = |n-2x_n|/\sqrt{n} \le r-1, \]

the above inequality yields

\[ |\psi_{n,i}(x_n)| \le r^i, \quad \forall\, 0\le i\le n. \]

Putting this back into the computation of the $L^2$-distance gives, for any positive integer $N\ge\frac14\log(2r^2)$,

\[ \max\{D_{n,TV}(x_n,N(n+1)),\ D^c_{n,TV}(x_n,N(n+1))\} \le \frac12\Big(2\sum_{i=1}^{\lfloor n/2\rfloor}r^{2i}e^{-4iN} + e^{-4nN}\Big)^{1/2} \le \Big(\frac12\sum_{i=1}^{\infty}r^{2i}e^{-4iN}\Big)^{1/2} = \Big(\frac{r^2e^{-4N}}{2(1-r^2e^{-4N})}\Big)^{1/2} \le re^{-2N}, \tag{4.4} \]

where the last inequality uses the fact $e^t\ge 1+t$ for $t\ge 0$. Hence, for all $\epsilon\in(0,1)$ and $n\ge 2$,

\[ \max\{T_{n,TV}(x_n,\epsilon),\ T^c_{n,TV}(x_n,\epsilon)\} \le \Big\lceil\frac12\log\frac{2r}{\epsilon}\Big\rceil(n+1). \]
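The spectral facts used in Step 1 can be verified on a small example. The sketch below evaluates the eigenvectors directly from formula (1.2) and compares them with the kernel (4.1); the function name `psi` and the choice $n=30$ are hypothetical.

```python
import math
import numpy as np

def psi(n, i, x):
    """Eigenvector psi_{n,i}(x) of Lemma 1.1, evaluated from formula (1.2)."""
    s = sum((-1) ** k * math.comb(x, k) * math.comb(n - x, i - k) for k in range(i + 1))
    return s / math.sqrt(math.comb(n, i))

n = 30
K = np.zeros((n + 1, n + 1))
for i in range(n):
    K[i, i + 1] = 1.0 - i / n
    K[i + 1, i] = (i + 1) / n
pi = np.array([math.comb(n, i) for i in range(n + 1)], dtype=float) / 2.0 ** n

# Row i of Psi holds the function x -> psi_{n,i}(x).
Psi = np.array([[psi(n, i, x) for x in range(n + 1)] for i in range(n + 1)])

gram = Psi @ np.diag(pi) @ Psi.T           # should be the identity: L^2(pi_n)-orthonormality
beta = 1.0 - 2.0 * np.arange(n + 1) / n    # eigenvalues beta_{n,i}

def growth_bound_holds(x):
    """Check |psi_{n,i}(x)| <= r^i with r = 1 + |n - 2x| / sqrt(n)."""
    r = 1.0 + abs(n - 2 * x) / math.sqrt(n)
    return all(abs(Psi[i, x]) <= r ** i * (1.0 + 1e-9) for i in range(n + 1))
```

Here `gram` tests the normalization in $L^2(\pi_n)$, `K @ Psi[i]` can be compared with `beta[i] * Psi[i]`, and `growth_bound_holds` checks the estimate $|\psi_{n,i}(x_n)|\le r^i$ for a single starting state, with $r$ computed from that state alone.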

Step 2: Bounding the total variation from below: Discrete time case. In this step, we treat the discrete time case. Note that $K'_n$ can be interpreted in the following way. First, flip a coin that lands on heads with probability $n/(n+1)$, and evolve the chain according to $K_n$ if heads appears. If tails shows up, then the chain stays in its current state. Since the coin strongly favors heads, the periodicity of $K_n$ still plays an important role in the evolution of $K'_n$. This suggests that the sets into which the period partitions the state space are candidates for the testing set for the total variation. In the case of Ehrenfest chains, the set is either the even integers or the odd integers. From the viewpoint of spectral theory, the period of any reversible finite Markov chain is either 1 or 2. Assuming reversibility, a chain is periodic if and only if $-1$ is an eigenvalue of its transition matrix. Intuitively, the eigenvector associated with $-1$ should give a good idea of how to construct a testing set for the total variation. This is not clear for general chains, but it is quite obvious for the Ehrenfest chain. According to Lemma 1.1, $\psi_{n,n}(x)=(-1)^x$ is an eigenvector of $K_n$ associated with the eigenvalue $-1$ and the sets, $\{x\in\Omega_n :$ [...] odd numbers in $\Omega_n$. Due to the above discussion, we set $A_n=\{i\in\Omega_n : i \text{ is even}\}$ and let $1_{A_n}$ be the indicator function of $A_n$. Clearly, $2\cdot 1_{A_n}-1=\psi_{n,n}$ and

\[ D_{n,TV}(x_n,t) \ge |(K'_n)^t(x_n,A_n)-\pi_n(A_n)| = \tfrac12\big|[(K'_n)^t(x_n,\cdot)-\pi_n](2\cdot 1_{A_n}-1)\big| = \tfrac12|(K'_n)^t(x_n,\psi_{n,n})| = \tfrac12\Big(1-\frac{2}{n+1}\Big)^t \ge \tfrac12e^{-4t/(n+1)}, \tag{4.5} \]

for $n\ge 3$, where the last inequality applies the fact $\log(1-t)\ge -2t$ for $t\in[0,1/2]$. This implies, for $0<\epsilon\le 1/(2e^4)$,

\[ T_{n,TV}(x_n,\epsilon) \ge \Big\lfloor\frac14\log\frac{1}{2\epsilon}\Big\rfloor(n+1), \quad \forall n\ge 3. \]

It is worthwhile to note that the lower bound is independent of the initial state. Along with the upper bound in Step 1, we obtain $T_{n,TV}(x_n,1/(2e^4))\asymp n$ and $T_{n,TV}(x_n,\epsilon)=O_\epsilon(n)$ for all $\epsilon<1/(2e^4)$. Using the last inequality of (4.5), it is easy to see that, for any $c\ge 1$ and $n\ge 1$,

\[ D_{n,TV}(x_n,\lfloor cn\rfloor) \ge D_{n,TV}(x_n,\lfloor 2c\rfloor(n+1)) \ge \tfrac12e^{-4\lfloor 2c\rfloor} \ge e^{-9c}. \]

By Corollary 2.3, no subfamily of $\mathcal{F}$ has a total variation precutoff.
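The parity argument of Step 2 can be checked exactly on a small chain. The following sketch, with hypothetical names and $n=12$ chosen for illustration, computes the discrepancy of the lazy chain on the set of even states and compares it with the closed form $\frac12(1-\frac{2}{n+1})^t$ from (4.5).

```python
import math
import numpy as np

def lazy_ehrenfest(n):
    """K'_n = (I + n K_n) / (n + 1) with K_n as in (4.1)."""
    K = np.zeros((n + 1, n + 1))
    for i in range(n):
        K[i, i + 1] = 1.0 - i / n
        K[i + 1, i] = (i + 1) / n
    return (np.eye(n + 1) + n * K) / (n + 1)

def even_set_gap(n, x, t):
    """|(K'_n)^t(x, A_n) - pi_n(A_n)| for A_n = even states."""
    row = np.linalg.matrix_power(lazy_ehrenfest(n), t)[x]
    pi = np.array([math.comb(n, i) for i in range(n + 1)], dtype=float) / 2.0 ** n
    even = np.arange(n + 1) % 2 == 0
    return abs(float(row[even].sum() - pi[even].sum()))

n = 12
for x in (0, 5, 6):
    for t in (0, 5, 20):
        exact = 0.5 * (1.0 - 2.0 / (n + 1)) ** t   # middle term of (4.5)
        lower = 0.5 * math.exp(-4.0 * t / (n + 1)) # last term of (4.5)
        print(x, t, even_set_gap(n, x, t), exact, lower)
```

The printed discrepancy does not depend on the starting state $x$, which reflects the remark that the lower bound is independent of the initial state.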

Step 3: Bounding the total variation from below: Continuous time case. Again, we suppose $|n-2x_n|/\sqrt{n}$ is bounded. It was shown in Step 1 that $T^c_{n,TV}(x_n,\epsilon)=O_\epsilon(n)$ for all $\epsilon\in(0,1)$. By Corollary 2.3, it suffices to show that

\[ \liminf_{n\to\infty} D^c_{n,TV}(x_n,cn) > 0, \quad \forall c>0. \tag{4.6} \]

The trick used in Step 2 does not work for the continuous time case, since, by writing

\[ \exp\{-t(I-K_n)\} = \exp\Big\{-2t\Big[I-\Big(\frac{I+K_n}{2}\Big)\Big]\Big\}, \]

the continuous time Markov chain behaves like the lazy chain, a Markov chain whose transition matrix has diagonal entries at least 1/2. In contrast with $K'_n$, $(I+K_n)/2$ evolves according to a fair coin and $K_n$. That is, if the coin lands on heads, then the chain moves according to $K_n$. If the coin lands on tails, then the chain stays in its current state. For lazy chains, the eigenvalues must be nonnegative and the smallest eigenvalue contributes less to the $L^2$-distance and the total variation. Our strategy for the continuous time case is as follows. First, we compare the original discrete time Ehrenfest chain $K_n$ with the simple random walk on $\mathbb{Z}$. Based on the symmetry of the Ehrenfest chain, the comparison will generate a lower bound on the total variation distance related to the first passage time discussion in Section 3. This will lead to (4.6).

First, observe that, for any $A\subset\Omega_n$ and $t\ge 0$,

\[ D^c_{n,TV}(x_n,t) \ge H_{n,t}(x_n,A)-\pi_n(A) = \sum_{i=0}^{\infty}\Big(e^{-t}\frac{t^i}{i!}\Big)K_n^i(x_n,A)-\pi_n(A). \tag{4.7} \]

By the symmetry of $K_n$ and the boundedness of $|x_n-n/2|/\sqrt{n}$, there is no loss of generality in assuming that $n/4\le x_n\le n/2$ for all $n\ge 0$. Moreover, by Remark 2.4, it suffices to deal with the following subcases.

\[ (n/2-x_n)/\sqrt{n}\to a\in[0,\infty), \quad \text{as } n\to\infty. \tag{4.8} \]

Proposition 4.2. Let $K_n$ be the transition matrix on $\Omega_n$ defined by (4.1). Suppose $\mu_n$ is a probability concentrated on $A=\{0,1,...,\lceil n/2\rceil\}$, i.e., $\mu_n(A)=1$. Then, $\mu_nK_n^t(A)\ge 1/2$ for all $t\ge 0$.

This proposition realizes the intuition that, by the symmetry of Ehrenfest chains, if the initial distribution concentrates on the left half of $\Omega_n$, then so does the distribution of the chain at all times. See the appendix for a proof of this proposition. Now, let $A=\{0,1,...,\lceil n/2\rceil\}$. Clearly, $\pi_n(A)\le 1/2+\pi_n(\lceil n/2\rceil)$ and, by Stirling's formula, $\pi_n(\lceil n/2\rceil)\sim(\pi n/2)^{-1/2}$. Let $T$ be the first passage time to state $\lfloor n/2\rfloor$, the first time (including time 0) to hit $\lfloor n/2\rfloor$, for the Ehrenfest chain $K_n$. The irreducibility of $K_n$ implies $\mathbb{P}_{x_n}(T<\infty)=1$ and the strong Markov property yields

\[ K_n^i(x_n,A) = \sum_{j=0}^{i}K_n^{i-j}(\lfloor n/2\rfloor,A)\,\mathbb{P}_{x_n}(T=j) + \mathbb{P}_{x_n}(T>i) \ge \frac12+\frac12\mathbb{P}_{x_n}(T>i). \]

Putting this back into (4.7), we obtain, for all $m\ge 0$,

\[ D^c_{TV}(x_n,t) \ge \frac12\Big(e^{-t}\sum_{i=0}^{m}\frac{t^i}{i!}\Big)\mathbb{P}_{x_n}(T>m) - \pi_n(\lceil n/2\rceil). \tag{4.9} \]

Next, we use Theorem 3.1 to bound $\mathbb{P}_{x_n}(T>m)$ from below. Consider the simple random walk on $\mathbb{Z}$. For $m\ge 1$, $k\ge 1$ and $i\in\mathbb{Z}$, let $\mathcal{P}(m,k,i)$ be the set containing paths of length $m$ starting from 0, ending at $i$ and staying in $\{0,\pm 1,\pm 2,...,\pm(k-1)\}$ up to time $m$. Clearly,

\[ \mathbb{P}_{x_n}(T>m) \ge \sum_{i=0}^{\lfloor n/2\rfloor-x_n-1}\mathbb{P}_{x_n}(\mathcal{P}(m,\lfloor n/2\rfloor-x_n,i)). \]

Let $\mathbb{P}'$ be the probability measure of the simple random walk on $\mathbb{Z}$ starting from the origin. For any path $w=(w_0,w_1,...,w_m)\in\mathcal{P}(m,k,i)$ with $|i|<k$, one may partition the edges $\{(w_j,w_{j+1}) : 0\le j<m\}$ into two subsets, say $B_1(w)$ and $B_2(w)$, where $B_1(w)=\{(j,j+1) : 0\le j<i\}$ for $i>0$, $B_1(w)=\{(j,j-1) : 0\ge j>i\}$ for $i<0$, and $B_2(w)$ is a disjoint union of pairs of the form $\{(j,j+1),(j+1,j)\}$ with $-k<j<k-1$. Note that, for $2x_n-n/2\le j\le n/2$,

\[ 1-\frac{j}{n} \ge \frac{j}{n} \ge \frac12\Big(\frac{4x_n}{n}-1\Big) = \frac12\Big(1-\frac{2(n-2x_n)}{n}\Big) \]

and

\[ \Big(1-\frac{j}{n}\Big)\frac{j+1}{n} \ge \frac14\Big[1-\Big(\frac{n-2j}{n}\Big)^2\Big] \ge \frac14\Big[1-4\Big(\frac{n-2x_n}{n}\Big)^2\Big]. \]

This leads to $\mathbb{P}_{x_n}(w)\ge c_n(m)\mathbb{P}'(w)$ for all $w\in\mathcal{P}(m,\lfloor n/2\rfloor-x_n,i)$ and $2x_n-n/2\le i\le n/2$, where

\[ c_n(m) = \Big[1-4\Big(\frac{n-2x_n}{n}\Big)^2\Big]^m\Big(1-\frac{2(n-2x_n)}{n}\Big)^{n/2-x_n}. \]

Let $m=Nn$, where $N$ is any positive integer. Using the notation in (3.1) and applying Theorem 3.1, we obtain

\[ \mathbb{P}_{x_n}(T>Nn) \ge c_n(Nn)\,\mathbb{P}'_0(T_{\lfloor n/2\rfloor-x_n}>Nn) \ge c_n(Nn)\exp\Big\{-\frac{2Nn}{(\lfloor n/2\rfloor-x_n)^2}\Big\}, \]

provided $Nn\ge(\lfloor n/2\rfloor-x_n)^2$. Putting this back into (4.9), we obtain

\[ D^c_{TV}(x_n,t) \ge \frac12\Big(e^{-t}\sum_{i=0}^{Nn}\frac{t^i}{i!}\Big)c_n(Nn)\exp\Big\{-\frac{2Nn}{(\lfloor n/2\rfloor-x_n)^2}\Big\} - \pi_n(\lfloor n/2\rfloor), \]

if $Nn\ge(\lfloor n/2\rfloor-x_n)^2$. As a consequence of Lemma A.3, if $a>0$ in the setting of (4.8), then

\[ \liminf_{n\to\infty} D^c_{TV}(x_n,cn) \ge \frac12e^{-(20a^2+2/a^2)N} > 0, \quad \forall N>\max\{c,a^2,1\}. \]

By Corollary 2.3, this proves that if $a>0$, then no subfamily of $\mathcal{F}_c$ has a total variation precutoff.

Finally, we deal with the subcase $a=0$. Obviously, the last inequality provides only a trivial lower bound on the total variation in this case. To get a useful bound, we rewrite the transition density of $K_n^t$ as follows using Lemma 1.1.

\[ K_n^t(x,y)/\pi_n(y)-1 = \sum_{i=1}^{n}\psi_{n,i}(x)\psi_{n,i}(y)\beta_{n,i}^t. \]

See [14, Lemma 1.3.3] for a proof. Applying this identity to $(K'_n)^t$ and $H_{n,t}$ gives

\[ \frac{(K'_n)^t(x,y)}{\pi_n(y)}-1 = \sum_{i=1}^{n}\psi_{n,i}(x)\psi_{n,i}(y)\Big(\frac{1+n\beta_{n,i}}{n+1}\Big)^t \tag{4.10} \]

and

\[ \frac{H_{n,t}(x,y)}{\pi_n(y)}-1 = \sum_{i=1}^{n}\psi_{n,i}(x)\psi_{n,i}(y)e^{-t(1-\beta_{n,i})}. \tag{4.11} \]

For $n\ge 1$, set

\[ H_{n,t}(x_n,y)/\pi_n(y)-1 = f_n(t,y)+g_n(t,y), \]

where

\[ f_n(t,y) = \psi_{n,2}(x_n)e^{-t(1-\beta_{n,2})}\psi_{n,2}(y) \]

and

\[ g_n(t,y) = \sum_{i=1,\,i\ne 2}^{n}\psi_{n,i}(x_n)e^{-t(1-\beta_{n,i})}\psi_{n,i}(y). \]

As a consequence of the triangle inequality and Jensen's inequality, we obtain

\[ 2D^c_{TV}(x_n,t) = \|f_n(t,\cdot)+g_n(t,\cdot)\|_{L^1(\pi_n)} \ge \|f_n(t,\cdot)\|_{L^1(\pi_n)} - \|g_n(t,\cdot)\|_{L^2(\pi_n)}. \]

It remains to prove that, for all $c>0$,

\[ \liminf_{n\to\infty}\big[\|f_n(cn,\cdot)\|_{L^1(\pi_n)} - \|g_n(cn,\cdot)\|_{L^2(\pi_n)}\big] > 0. \]

First, observe that ∥gn(t,·)∥L2 n)= ( n− 2x n n e −4t/n+ ni=3 |ψn,i(xn)|2e−4it/n )1/2 .

Recall the following fact developed in Step 1. If r = 1 + supn{|n − 2xn|/

n} < ∞,

then

|ψn,i(xn)| ≤ ri, ∀0 ≤ i ≤ n.

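Putting this back to the $L^2(\pi_n)$-norm of $g_n(t,\cdot)$ yields

\[
\|g_n(cn,\cdot)\|_{L^2(\pi_n)} \le \Big(\frac{(n - 2x_n)^2}{n}e^{-4c} + \frac{(re^{-4c})^3}{1 - re^{-4c}}\Big)^{1/2},
\]

provided $r < e^{4c}$. Also, it is an easy exercise to compute

\[
\psi_{n,2}(x) = \sqrt{\frac{n}{2(n-1)}}\Big[\Big(\frac{n - 2x}{\sqrt{n}}\Big)^2 - 1\Big].
\]

This implies $|\psi_{n,2}(x_n)| \sim 1/\sqrt{2}$ and

\[
\|\psi_{n,2}\|_{L^1(\pi_n)} \ge \frac{1}{2}\pi_n\big(\{x : |x - n/2| < \sqrt{n}/4\}\big) \sim \frac{1}{\sqrt{2\pi}}\int_0^{1/2}e^{-u^2/2}\,du \ge \frac{1}{12}.
\]

According to the assumption $(n/2 - x_n)/\sqrt{n} \to a = 0$, if $r < e^{4c}$, then

\[
\liminf_{n\to\infty}\big[\|f_n(cn,\cdot)\|_{L^1(\pi_n)} - \|g_n(cn,\cdot)\|_{L^2(\pi_n)}\big] \ge \frac{1}{12\sqrt{2}}e^{-4c} - \frac{r^{3/2}}{\sqrt{1 - re^{-4c}}}e^{-6c} = e^{-4c}\Big(\frac{1}{12\sqrt{2}} - \frac{r^{3/2}}{\sqrt{1 - re^{-4c}}}e^{-2c}\Big) > 0,
\]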

for $c$ large enough. By the monotonicity of the total variation distance, we have

\[
\liminf_{n\to\infty} D^c_{\mathrm{TV}}(x_n, cn) > 0, \quad \forall c > 0.
\]

By Corollary 2.3, no subfamily of $\mathcal{F}_c$ has a total variation precutoff when $a = 0$.

This finishes the proof. 
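The spectral facts used in this step can be checked numerically. The sketch below is not part of the original argument: it assumes the kernel $K_n(i, i+1) = 1 - i/n$, $K_n(i, i-1) = i/n$ (the rates quoted in the proof of Proposition 4.2), the binomial stationary law $\pi_n$, and eigenvalues $\beta_{n,i} = 1 - 2i/n$ as suggested by the exponents $e^{-4it/n}$ above. All function names are illustrative.

```python
from math import comb, sqrt

def ehrenfest_kernel(n):
    """Transition matrix K_n on {0,...,n}: up with prob 1 - i/n, down with prob i/n."""
    K = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        if i < n:
            K[i][i + 1] = 1 - i / n
        if i > 0:
            K[i][i - 1] = i / n
    return K

def stationary(n):
    """Binomial(n, 1/2) weights pi_n(x)."""
    return [comb(n, x) / 2**n for x in range(n + 1)]

def psi1(n):
    """psi_{n,1}(x) = (n - 2x) / sqrt(n)."""
    return [(n - 2 * x) / sqrt(n) for x in range(n + 1)]

def psi2(n):
    """psi_{n,2}(x) = sqrt(n / (2(n-1))) * (((n - 2x)/sqrt(n))^2 - 1), for n >= 2."""
    c = sqrt(n / (2 * (n - 1)))
    return [c * (((n - 2 * x) / sqrt(n)) ** 2 - 1) for x in range(n + 1)]

def apply_kernel(K, f):
    """Compute (K f)(x) = sum_y K(x, y) f(y)."""
    size = len(f)
    return [sum(K[x][y] * f[y] for y in range(size)) for x in range(size)]

def inner(f, g, pi):
    """Inner product in L^2(pi)."""
    return sum(a * b * w for a, b, w in zip(f, g, pi))

def max_eigen_residual(n, i):
    """Largest entry of |K psi_{n,i} - (1 - 2i/n) psi_{n,i}| for i in {1, 2}."""
    K = ehrenfest_kernel(n)
    f = psi1(n) if i == 1 else psi2(n)
    beta = 1 - 2 * i / n
    Kf = apply_kernel(K, f)
    return max(abs(u - beta * v) for u, v in zip(Kf, f))
```

Under these assumptions, `max_eigen_residual(n, 1)` and `max_eigen_residual(n, 2)` should vanish up to rounding, and `inner` should confirm that $\psi_{n,1}$ and $\psi_{n,2}$ are orthonormal in $L^2(\pi_n)$.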

Remark 4.3. In the proof of Theorem 4.1, it has been shown that if $|x_n - n/2|/\sqrt{n}$ is bounded, then no subfamily of $\mathcal{F}$ and $\mathcal{F}_c$ presents a total variation precutoff and the total variation mixing time is of order $n$.

Remark 4.4. In Step 3, the method for $a = 0$ is also valid for $a > 0$ if one replaces $f_n(t,\cdot)$ with $\psi_{n,1}(x_n)e^{-t(1-\beta_{n,1})}\psi_{n,1}$ and changes $g_n(t,\cdot)$ into $H_{n,t}(x_n,\cdot)/\pi_n - 1 - f_n$. The proof for $a > 0$ also works for the discrete time case.

5. The $L^p$-cutoff of Ehrenfest chains

This section is devoted to the $L^p$-cutoff of Ehrenfest chains with $p \in (1, \infty)$. To bound the $L^p$-distance, we have to select suitable test functions in accordance with operator theory; the spectral information suggests good choices, for instance, the eigenfunctions. The main theorem is stated as follows.

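Theorem 5.1. Let $\mathcal{F}$ and $\mathcal{F}_c$ be the families in Theorem 4.1. For $p \in (1, \infty)$, the following are equivalent.

(1) $\mathcal{F}$ (resp. $\mathcal{F}_c$) has an $L^p$-precutoff.

(2) $\mathcal{F}$ (resp. $\mathcal{F}_c$) has an $L^p$-cutoff.

(3) $|x_n - n/2|/\sqrt{n} \to \infty$.

Moreover, if (3) holds, then both $\mathcal{F}$ and $\mathcal{F}_c$ have a $(t_n, n)$ $L^p$-cutoff with

\[
t_n = \frac{n}{2}\log\frac{|n - 2x_n|}{\sqrt{n}}.
\]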
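The cutoff time in Theorem 5.1 can be explored numerically for the continuous-time family. The following is a minimal sketch, not the authors' method. It relies on an assumption that is not stated in this section: $H_{n,t} = e^{-t(I-K_n)}$ is the projection of $n$ independent coordinates, each flipping at rate $1/n$, so that the state at time $t$ started from $x$ is distributed as $\mathrm{Bin}(x, 1-f) + \mathrm{Bin}(n-x, f)$ with $f = (1 - e^{-2t/n})/2$. All function names are illustrative.

```python
from math import comb, exp, log, sqrt

def binom_pmf(n, f):
    """Probability mass function of Binomial(n, f) as a list of length n + 1."""
    return [comb(n, k) * f**k * (1 - f) ** (n - k) for k in range(n + 1)]

def convolve(u, v):
    """Discrete convolution of two probability vectors."""
    out = [0.0] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] += a * b
    return out

def heat_kernel(n, x, t):
    """Row H_{n,t}(x, .) under the independent coordinate-flip assumption."""
    f = (1 - exp(-2 * t / n)) / 2   # probability that a given coordinate has flipped
    stay = binom_pmf(x, 1 - f)      # the x "ones" that are still ones
    flip = binom_pmf(n - x, f)      # the n - x "zeros" that became ones
    return convolve(stay, flip)

def lp_distance(n, x, t, p):
    """L^p(pi_n)-norm of H_{n,t}(x, .)/pi_n - 1 for finite p >= 1."""
    h = heat_kernel(n, x, t)
    total = 0.0
    for y in range(n + 1):
        pi_y = comb(n, y) / 2**n
        total += abs(h[y] / pi_y - 1) ** p * pi_y
    return total ** (1 / p)

def cutoff_time(n, x):
    """t_n = (n/2) log(|n - 2x| / sqrt(n)) from Theorem 5.1 (needs |n - 2x| > sqrt(n))."""
    return (n / 2) * log(abs(n - 2 * x) / sqrt(n))
```

For example, with $x_n = 0$ one can evaluate `lp_distance(n, 0, cutoff_time(n, 0) + a * n, p)` for a few values of `a`; the distance should be large for negative `a` and small for positive `a`. For $p = 2$ and $x_n = 0$ the same assumption gives the closed form $D^c_{n,2}(0, t)^2 = (1 + e^{-4t/n})^n - 1$, which serves as a consistency check.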

Proof. In this proof, we write $\|f\|_p$ for the $L^p(\pi_n)$-norm of $f$. Obviously, (2)⇒(1) follows immediately from Definition 2.1 for all $1 < p < \infty$. For (3)⇒(2) and the $(t_n, n)$ $L^p$-cutoff, we set

\[
\overline{F}_p(a) = \limsup_{n\to\infty} D_{n,p}(x_n, t_n + an), \qquad \underline{F}_p(a) = \liminf_{n\to\infty} D_{n,p}(x_n, t_n + an)
\]

and

\[
\overline{G}_p(a) = \limsup_{n\to\infty} D^c_{n,p}(x_n, t_n + an), \qquad \underline{G}_p(a) = \liminf_{n\to\infty} D^c_{n,p}(x_n, t_n + an).
\]

We consider the following two cases, $p \in (1, 2]$ and $p \in (2, \infty)$.

Case 1: ($1 < p \le 2$) For $p = 2$, (2) and (3) have been proved equivalent in [7]. In detail, by Theorems 6.3-6.5 in [7] and the proofs therein, there are positive constants $A, N$ such that, for $n \ge N$,

\[
\max\{D_{n,2}(x_n, t_n + an), D^c_{n,2}(x_n, t_n + an)\} \le Ae^{-2a} + o(1)
\]

and

\[
\min\{D_{n,2}(x_n, t_n + an), D^c_{n,2}(x_n, t_n + an)\} \ge e^{-2a} + o(1).
\]

This implies

\[
\max\{\overline{F}_2(a), \overline{G}_2(a)\} \le Ae^{-2a}, \qquad \min\{\underline{F}_2(a), \underline{G}_2(a)\} \ge e^{-2a}, \quad \forall a \in \mathbb{R}. \tag{5.1}
\]

Note that the $L^r$-distance is bounded above by the $L^s$-distance for $1 \le r < s \le \infty$. Using the first inequality of (5.1), we obtain, for $p \in (1, 2)$,

\[
\max\{\overline{F}_p(a), \overline{G}_p(a)\} \le Ae^{-2a} \to 0, \quad \text{as } a \to \infty.
\]

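To get a lower bound, consider the test function $\psi_{n,1}$. Set $q = (1 - 1/p)^{-1}$. A simple application of the central limit theorem yields

\[
\|\psi_{n,1}\|_q = \Big(\sum_{x=0}^{n}\Big(\frac{|n - 2x|}{\sqrt{n}}\Big)^q\pi_n(x)\Big)^{1/q} \to C_q := [E(|X|^q)]^{1/q},
\]

where $X$ is a standard normal random variable and $E$ denotes the expectation. It is a simple exercise to show that

\[
C_q = \Big(\sqrt{\frac{2^q}{\pi}}\,\Gamma\Big(\frac{q+1}{2}\Big)\Big)^{1/q} < \infty, \quad \forall q \in (1, \infty),
\]

where $\Gamma$ is the Gamma function defined by $\Gamma(z) = \int_0^\infty e^{-t}t^{z-1}\,dt$. As a consequence of (4.10)-(4.11), we have

\[
\underline{F}_p(a) \ge \liminf_{n\to\infty}\frac{|\langle (K_n')^{t_n+an}(x_n,\cdot)/\pi_n - 1, \psi_{n,1}\rangle_{\pi_n}|}{\|\psi_{n,1}\|_q} = \frac{e^{-2a}}{C_q}
\]

and

\[
\underline{G}_p(a) \ge \liminf_{n\to\infty}\frac{|\langle H_{n,t_n+an}(x_n,\cdot)/\pi_n - 1, \psi_{n,1}\rangle_{\pi_n}|}{\|\psi_{n,1}\|_q} = \frac{e^{-2a}}{C_q}.
\]

Obviously, $\min\{\underline{F}_p(a), \underline{G}_p(a)\} \to \infty$ as $a \to -\infty$. This proves the desired $(t_n, n)$ $L^p$-cutoff.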
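The constant $C_q$ in Case 1 and the convergence $\|\psi_{n,1}\|_q \to C_q$ can be illustrated with a short computation. This is only a sanity check under the definitions quoted above ($\pi_n$ binomial with parameters $(n, 1/2)$ and $\psi_{n,1}(x) = (n-2x)/\sqrt{n}$); the function names are hypothetical.

```python
from math import comb, gamma, pi, sqrt

def gaussian_cq(q):
    """C_q = (sqrt(2^q / pi) * Gamma((q + 1) / 2))^(1/q), the L^q norm of a standard normal."""
    return (sqrt(2**q / pi) * gamma((q + 1) / 2)) ** (1 / q)

def psi1_lq_norm(n, q):
    """L^q(pi_n)-norm of psi_{n,1}(x) = (n - 2x)/sqrt(n) under Binomial(n, 1/2)."""
    total = 0.0
    for x in range(n + 1):
        weight = comb(n, x) / 2**n
        total += weight * (abs(n - 2 * x) / sqrt(n)) ** q
    return total ** (1 / q)
```

Two exact values are convenient reference points: $C_2 = 1$ and $C_4 = 3^{1/4}$, while `psi1_lq_norm(n, 2)` equals 1 for every $n$ because $\psi_{n,1}$ has unit $L^2(\pi_n)$-norm.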

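Case 2: ($2 < p < \infty$) Using the second inequality of (5.1), it is easy to see that

\[
\min\{\underline{F}_p(a), \underline{G}_p(a)\} \ge e^{-2a} + o(1) \to \infty, \quad \text{as } a \to -\infty.
\]

To get an upper bound, we apply the fact $\psi_{n,n-i}(x) = (-1)^x\psi_{n,i}(x)$ to the right sides of (4.10)-(4.11) and get

\[
D_{n,p}(x_n, t) \le 2\sum_{i=1}^{\lceil n/2\rceil}|\psi_{n,i}(x_n)|\,\|\psi_{n,i}\|_p\Big(1 - \frac{2i}{n+1}\Big)^t + \Big(1 - \frac{2}{n+1}\Big)^t \le 2d_p(n, t)
\]

and

\[
D^c_{n,p}(x_n, t) \le 2\sum_{i=1}^{\lceil n/2\rceil}|\psi_{n,i}(x_n)|\,\|\psi_{n,i}\|_p e^{-2it/n} + \Big(1 - \frac{2}{n+1}\Big)^t \le 2d_p(n, t),
\]

where

\[
d_p(n, t) = \sum_{i=1}^{\lceil n/2\rceil}|\psi_{n,i}(x_n)|\,\|\psi_{n,i}\|_p e^{-2it/(n+1)} + e^{-2t/(n+1)}.
\]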

To bound $d_p(n, t)$ from above, one has to compute the $L^p$-norm of $\psi_{n,i}$. This can be very complicated from its definition but, surprisingly, the identity in (4.3) is sufficient to give a reasonable upper bound. In detail, one may derive from (4.3) that, for $i \le n/2$,

\[
|\psi_{n,i+1}(x)| \le \Big(\sqrt{\frac{2}{i+1}}\times\frac{|n - 2x|}{\sqrt{n}}\Big)|\psi_{n,i}(x)| + |\psi_{n,i-1}(x)|.
\]

Along with the initial conditions, $\psi_{n,0} \equiv 1$ and $\psi_{n,1}(x) = (n - 2x)/\sqrt{n}$, an inductive argument yields

\[
|\psi_{n,i}(x)| \le \sqrt{\frac{2^i}{i!}}\prod_{j=1}^{i}\Big(|\psi_{n,1}(x)| + \sqrt{\frac{j}{2}}\Big), \quad \forall x \in \Omega_n,\ i \le n/2. \tag{5.2}
\]

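For convenience, write $i! = \alpha_i i^{i+1/2}e^{-i}$. By Stirling's formula, $\alpha_i \to \sqrt{2\pi}$ as $i \to \infty$. Thus, we may choose $\beta > 1$ such that $\beta^{-1} \le \alpha_i \le \beta$ for all $i \ge 1$. This implies

\[
i^{i+1/2}e^{-i}/\beta \le i! \le \beta i^{i+1/2}e^{-i}, \quad \forall i \ge 1. \tag{5.3}
\]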
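To see that a constant $\beta$ as in (5.3) exists, one can tabulate the Stirling ratio $\alpha_i = i!/(i^{i+1/2}e^{-i})$. The sketch below works on the log scale to avoid overflow; the function name is illustrative, and the range of $i$ tested is an arbitrary choice.

```python
from math import exp, lgamma, log

def stirling_ratio(i):
    """alpha_i = i! / (i^(i + 1/2) * e^(-i)), computed via log-gamma for stability."""
    return exp(lgamma(i + 1) - (i + 0.5) * log(i) + i)
```

Numerically, $\alpha_1 = e$ and the ratio decreases towards $\sqrt{2\pi} \approx 2.5066$, so over the tested range $\beta = e$ is one admissible choice in (5.3).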

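In this setting, (5.2) gives

\[
|\psi_{n,i}(x)| \le (2e)^{i/2}i^{-1/4}\beta^{1/2}\big(|\psi_{n,1}(x)|i^{-1/2} + 1\big)^i \tag{5.4}
\]

and, then, the $L^p$-norm of $\psi_{n,i}$ is bounded above as follows.

\[
\begin{aligned}
\|\psi_{n,i}\|_p^p &\le (2e)^{pi/2}i^{-p/4}\beta^{p/2}\,\pi_n\Big[\big(|\psi_{n,1}|i^{-1/2} + 1\big)^{pi}\Big]\\
&\le (2e)^{pi/2}i^{-p/4}\beta^{p/2}2^{pi}\Big[i^{-pi/2}\pi_n\big(|\psi_{n,1}|^{pi}\big) + 1\Big],
\end{aligned}
\]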

where the last inequality uses the fact $(s + t)^r \le 2^{r-1}(s^r + t^r)$ for any $s > 0$, $t > 0$ and $r \ge 1$. Note that, for fixed $i$, the central limit theorem implies that $\pi_n(|\psi_{n,1}|^{pi})$ converges to the expectation of $|X|^{pi}$, where $X$ is a standard normal random variable. To control this quantity for all $1 \le i \le n$, one may consider the convergence rate of the central limit theorem, but this can be very complicated. Here, we use a direct computation given in Lemma A.4, which says that there exists a constant $C > 1$ such that

\[
\pi_n(|\psi_{n,1}|^{pi}) \le C4^{pi}\,\Gamma\Big(\frac{pi + 1}{2}\Big).
\]

As a consequence of the identity $\Gamma(t + 1) = t\Gamma(t)$,

\[
\Gamma\Big(\frac{pi + 1}{2}\Big) \le 2\prod_{j=1}^{\lfloor (pi-1)/2\rfloor}\frac{pi - 2j + 1}{2} \le pi\times\Big(\Big\lceil\frac{pi - 3}{2}\Big\rceil!\Big) \le 5\beta\,pi\,[(pi)/(2e)]^{pi/2}.
\]

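For $p \ge 2$, the above inequalities give

\[
\|\psi_{n,i}\|_p \le \Big((2e)^{pi/2}i^{-p/4}\beta^{p/2}2^{pi}\big\{20\beta C4^{pi}(pi)[p/(2e)]^{pi/2}\big\}\Big)^{1/p} \le 10\beta Ci^{1/4}(8p)^i.
\]

Plugging the last term and (5.4) back into $d_p(n, t)$, we obtain

\[
d_p(n, t) \le 10\beta^2C\sum_{i=1}^{\lceil n/2\rceil}(20p)^i\big(|\psi_{n,1}(x_n)| + 1\big)^ie^{-2it/(n+1)} + e^{-2t/(n+1)}. \tag{5.5}
\]

Recall that

\[
t_n = \frac{n}{2}\log\frac{|n - 2x_n|}{\sqrt{n}} = \frac{n}{2}\log|\psi_{n,1}(x_n)|.
\]

Clearly, for $a > 1$,

\[
t_n + an \ge \frac{n+1}{2}\log|\psi_{n,1}(x_n)| + (a - 1)n \ge \frac{n+1}{2}\log|\psi_{n,1}(x_n)| + \frac{n+1}{2}(a - 1).
\]

This implies

\[
d_p(n, t_n + an) \le 10\beta^2C\sum_{i=1}^{\lceil n/2\rceil}\Big(\frac{20p}{e^{a-1}}\times\frac{|\psi_{n,1}(x_n)| + 1}{|\psi_{n,1}(x_n)|}\Big)^i + \exp\{-|\psi_{n,1}(x_n)|\}.
\]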

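Under the assumption of (3), that is, $|\psi_{n,1}(x_n)| \to \infty$, if $e^{a-1} > 20p$, then

\[
\max\{\overline{F}_p(a), \overline{G}_p(a)\} \le 2\limsup_{n\to\infty} d_p(n, t_n + an) \le 20\beta^2C\sum_{i=1}^{\infty}(20pe^{1-a})^i = \frac{400\beta^2Cpe^{1-a}}{1 - 20pe^{1-a}}.
\]

Obviously, the last term converges to 0 as $a$ tends to infinity. This proves the $(t_n, n)$ $L^p$-cutoff of $\mathcal{F}$ and $\mathcal{F}_c$ with $2 < p < \infty$.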

For (1)⇒(3), we assume that $|x_n - n/2|/\sqrt{n}$ is bounded and prove that no subfamily of $\mathcal{F}$ and $\mathcal{F}_c$ has an $L^p$-precutoff. Set $M = \sup_{n\ge 1}\{|2x_n - n|/\sqrt{n}\} + 1$. By (5.5), we have, for $p > 2$ and $e^a > 20Mp$,

\[
\max\{D_{n,p}(x_n, \lceil an\rceil), D^c_{n,p}(x_n, an)\} \le 2d_p(n, an) \le 20\beta^2C\sum_{i=1}^{\infty}(20Mpe^{-a})^i + 2e^{-a} = \frac{400M\beta^2Cpe^{-a}}{1 - 20Mpe^{-a}} + 2e^{-a}.
\]

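Again, the right side converges to 0 as $a$ tends to infinity. Since the $L^r$-distance is bounded above by the $L^s$-distance for $r < s$, this implies that, for all $\epsilon > 0$ and $p < \infty$, the $L^p$-mixing times $T_{n,p}(x_n, \epsilon)$ and $T^c_{n,p}(x_n, \epsilon)$ are at most of order $n$.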

Also, by Remark 4.3 and Corollary 2.3, we have

\[
\liminf_{n\to\infty}\min\{D_{n,\mathrm{TV}}(x_n, cn), D^c_{n,\mathrm{TV}}(x_n, cn)\} > 0, \quad \forall c > 0.
\]

This yields, for $p > 1$,

\[
\liminf_{n\to\infty}\min\{D_{n,p}(x_n, cn), D^c_{n,p}(x_n, cn)\} > 0, \quad \forall c > 0.
\]

Consequently, for $1 < p < \infty$, no subfamily of $\mathcal{F}$ and $\mathcal{F}_c$ has an $L^p$-precutoff. This finishes the proof.

Remark 5.1. It is worthwhile to note that if $|n - 2x_n|/\sqrt{n}$ is bounded, then the $L^p$-mixing time of the Ehrenfest chains in (4.2) with $p \in [1, \infty)$ is of order $n$.

Remark 5.2. For $p = \infty$, the equivalence in Theorem 5.1 may fail. Suppose $n$ is even, $x_n = n/2$ and consider the separation distance, which is closely related to the $L^\infty$-distance and is defined by

\[
D_{n,\mathrm{sep}}(x, t) = \max_y\Big\{1 - \frac{(K_n')^t(x, y)}{\pi_n(y)}\Big\}, \qquad D^c_{n,\mathrm{sep}}(x, t) = \max_y\Big\{1 - \frac{H_{n,t}(x, y)}{\pi_n(y)}\Big\}.
\]

For $n \ge 1$, let $L_n$ be a Markov kernel on $\{0, 1, ..., n/2\}$ given by

\[
L_n(i, i) = 0, \ \forall 0 \le i \le n/2, \qquad L_n(i, i+1) = 1 - \frac{i}{n}, \ \forall 0 \le i < n/2,
\]

and

\[
L_n(i+1, i) = \frac{i+1}{n}, \ \forall 0 \le i < n/2 - 1, \qquad L_n(n/2, n/2 - 1) = 1.
\]

It is obvious that $L_n$ is obtained from $K_n$ by collapsing the states $\{i, n - i\}$ and has $\widetilde{\pi}_n(i) = 2^{1-n}\binom{n}{i}$ for $i < n/2$ and $\widetilde{\pi}_n(n/2) = 2^{-n}\binom{n}{n/2}$ as the stationary distribution. Let $\widetilde{D}_{n,\mathrm{sep}}(x, t)$, $\widetilde{D}^c_{n,\mathrm{sep}}(x, t)$ be respectively the separation distances between $(L_n')^t$, $e^{-t(I-L_n)}$ and $\widetilde{\pi}_n$, where $L_n' = (I + nL_n)/(n + 1)$. Then,

\[
D_{n,\mathrm{sep}}(n/2, t) = \widetilde{D}_{n,\mathrm{sep}}(n/2, t), \qquad D^c_{n,\mathrm{sep}}(n/2, t) = \widetilde{D}^c_{n,\mathrm{sep}}(n/2, t).
\]

In fact, the above identities also hold in the $L^p$-distance with $1 \le p \le \infty$. In [9], the authors consider discrete time monotone birth-and-death chains, a condition which is not satisfied by $L_n'$, and continuous time birth-and-death chains without any constraint. It is an easy exercise to check that $I - L_n$ has eigenvalues $4i/n$ and eigenvectors $\phi_{n,i}$ given by $\phi_{n,i}(x) = \psi_{n,2i}(x)$ for $0 \le i \le n/2$. Clearly, the spectral gap of $L_n$ is $\lambda_n = 4/n$. Set

\[
t_n = \sum_{i=1}^{n/2}\frac{n}{4i} = \frac{n\log n}{4} + O(n).
\]

As a consequence of [9, Theorem 5.1-6.1], the family $\mathcal{F}_c$ in Theorem 4.1 has a $(\frac{1}{4}n\log n, n)$ separation cutoff. However, according to Theorem 5.1 and Remark 5.1, $\mathcal{F}_c$ has no $L^p$-precutoff and the exact order of the $L^p$-mixing time is $n$.
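The asymptotics $t_n = \frac{1}{4}n\log n + O(n)$ in Remark 5.2 follow from the growth of the harmonic numbers. A small check, with an illustrative function name and $n$ assumed even:

```python
from math import log

def separation_cutoff_time(n):
    """t_n = sum_{i=1}^{n/2} n/(4i), the sum of reciprocal nonzero eigenvalues 4i/n of I - L_n."""
    return sum(n / (4 * i) for i in range(1, n // 2 + 1))

def normalized_remainder(n):
    """(t_n - (n log n)/4) / n, which should stay bounded as n grows."""
    return (separation_cutoff_time(n) - n * log(n) / 4) / n
```

Since the harmonic number $H_{n/2}$ equals $\log(n/2) + \gamma + o(1)$, where $\gamma$ is Euler's constant, the normalized remainder tends to $(\gamma - \log 2)/4 \approx -0.029$, consistent with the $O(n)$ term.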

Remark 5.3. There is no universal criterion on the total variation cutoff or precutoff, except for specific chains such as lazy birth-and-death chains. Concerning the maximum total variation distance and the related mixing time, define

\[
D_{\mathrm{TV}}(t) = \max_{x\in\Omega} D_{\mathrm{TV}}(x, t), \qquad T_{\mathrm{TV}}(\epsilon) = \min\{t \ge 0 : D_{\mathrm{TV}}(t) \le \epsilon\},
\]

and call the cutoff in the above distance the maximum total variation cutoff. The authors of [10] prove that a family of lazy birth-and-death chains on $\Omega_n = \{0, 1, ..., n\}$ has a maximum total variation cutoff if and only if

\[
\lim_{n\to\infty}\lambda_nT_{n,\mathrm{TV}}(\epsilon) = \infty,
\]

for some $\epsilon \in (0, 1)$, where $1 - \lambda_n$ is the second largest eigenvalue of the transition matrix on $\Omega_n$. Such a criterion was proposed by Peres during the ARCC workshop held by AIM in Palo Alto, December 2004. Under the assumption of reversibility, it has been shown to be true in [6] for the max-$L^p$ distance with $1 < p < \infty$, but disproved in [5] for $p = 1$ using an idea from Aldous. However, none of the above results is clear if the initial states or distributions for a family of ergodic Markov chains are specified. As a consequence of Theorem 4.1, Lemma 1.1 and Remark 4.3, the family in Theorem 4.1 has a total variation cutoff (also for the precutoff) if and only if

\[
\lim_{n\to\infty}\lambda_nT_{n,\mathrm{TV}}(x_n, \epsilon) = \infty,
\]

for some $\epsilon \in (0, 1)$. This provides an example that is consistent with Peres' conjecture.

Appendix A. Techniques and proofs

We consider Proposition 4.2 in a more general setting.

Lemma A.1. Let $K$ be the transition matrix of a periodic birth-and-death chain on $\Omega = \{0, 1, ..., m\}$ with birth rate $p_i$ and death rate $q_i = 1 - p_i$. That is,

\[
K(i, i+1) = p_i, \quad K(i, i-1) = q_i = 1 - p_i, \quad \forall 0 \le i \le m,
\]

with the convention $p_m = q_0 = 0$. Let $l = \lfloor m/2\rfloor$ and $\mu$ be a probability on $\Omega$. Suppose that, for any $i \ge 0$,

\[
\mu(l - 2i) \ge \mu(l + 2i + 2) \ge \mu(l - 2i - 2), \qquad p_{l+2i} \ge q_{l-2i} \ge p_{l+2i+2}, \tag{A.1}
\]

and

\[
p_{l+2i} + q_{l+2i+2} \ge p_{l-2i-2} + q_{l-2i} \ge p_{l+2i+2} + q_{l+2i+4}. \tag{A.2}
\]

Then, for all $i \ge 0$,

\[
\mu K(l + 2i + 1) \ge \mu K(l - 2i - 1) \ge \mu K(l + 2i + 3).
\]

Proof. By the periodicity of $K$,

\[
\mu K(j) = \mu(j - 1)p_{j-1} + \mu(j + 1)q_{j+1}, \quad \forall 0 \le j \le m,
\]

where

\[
\mu(-1) = \mu(m + 1) = p_{-1} = q_{m+1} = 0. \tag{A.3}
\]

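It is easy to check that both (A.1) and (A.2) hold under the extension in (A.3). If $i \le (l - 1)/2$, then $l + 2i + 1 \le 2l \le m$ and

\[
\begin{aligned}
&\mu K(l + 2i + 1) - \mu K(l - 2i - 1)\\
&= [\mu(l + 2i)p_{l+2i} + \mu(l + 2i + 2)q_{l+2i+2}] - [\mu(l - 2i)q_{l-2i} + \mu(l - 2i - 2)p_{l-2i-2}]\\
&\ge \mu(l - 2i)(p_{l+2i} - q_{l-2i}) + \mu(l + 2i + 2)(q_{l+2i+2} - p_{l-2i-2})\\
&\ge \mu(l + 2i + 2)(p_{l+2i} - q_{l-2i} + q_{l+2i+2} - p_{l-2i-2}) \ge 0.
\end{aligned}
\]

If $l + 2i + 3 \le m$, then $l - 2i - 1 \ge 2l + 2 - m \ge 1$ and

\[
\begin{aligned}
&\mu K(l - 2i - 1) - \mu K(l + 2i + 3)\\
&= [\mu(l - 2i)q_{l-2i} + \mu(l - 2i - 2)p_{l-2i-2}] - [\mu(l + 2i + 2)p_{l+2i+2} + \mu(l + 2i + 4)q_{l+2i+4}]\\
&\ge \mu(l + 2i + 2)(q_{l-2i} - p_{l+2i+2}) + \mu(l - 2i - 2)(p_{l-2i-2} - q_{l+2i+4})\\
&\ge \mu(l - 2i - 2)(q_{l-2i} - p_{l+2i+2} + p_{l-2i-2} - q_{l+2i+4}) \ge 0.
\end{aligned}
\]

This finishes the proof.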

Remark A.1. Lemma A.1 also holds for the case that m is even and l = m/2− 1.

The proof goes similarly and is omitted.

The following is a simple corollary of Lemma A.1.

Corollary A.2. Let $K$ be the transition matrix on $\Omega = \{0, 1, ..., m\}$ given by

\[
K(i, i+1) = p_i, \quad K(i, i-1) = q_i = 1 - p_i, \quad \forall 0 \le i \le m,
\]

where $p_m = q_0 = 0$, and let $\mu$ be a probability on $\Omega$. Suppose that

\[
p_i = q_{m-i}, \quad p_i \ge p_{i+1}, \quad \forall i \ge 0,
\]

and

\[
p_i + q_{i+2} \le p_{i+1} + q_{i+3}, \quad \forall 0 \le i \le \lfloor m/2\rfloor - 2.
\]

(1) If $m = 2l$ and

\[
\mu(l + 2i) \ge \mu(l - 2i - 2) \ge \mu(l + 2i + 2), \quad \forall i \ge 0,
\]

then, for all $i \ge 0$ and $t \in \{0, 1, 2, ...\}$,

\[
\mu K^{2t+1}(l - 2i - 1) \ge \mu K^{2t+1}(l + 2i + 1) \ge \mu K^{2t+1}(l - 2i - 3)
\]

and

\[
\mu K^{2t}(l + 2i) \ge \mu K^{2t}(l - 2i - 2) \ge \mu K^{2t}(l + 2i + 2).
\]

(2) If $m = 2l$ and

\[
\mu(l - 2i - 1) \ge \mu(l + 2i + 1) \ge \mu(l - 2i - 3), \quad \forall i \ge 0,
\]

then, for all $i \ge 0$ and $t \in \{0, 1, 2, ...\}$,

\[
\mu K^{2t+1}(l + 2i) \ge \mu K^{2t+1}(l - 2i - 2) \ge \mu K^{2t+1}(l + 2i + 2)
\]

and

\[
\mu K^{2t}(l - 2i - 1) \ge \mu K^{2t}(l + 2i + 1) \ge \mu K^{2t}(l - 2i - 3).
\]

(3) If $m = 2l + 1$ and

\[
\mu(l - 2i) \ge \mu(l + 2i + 2) \ge \mu(l - 2i - 2), \quad \forall i \ge 0,
\]

then, for all $i \ge 0$ and $t \in \{0, 1, 2, ...\}$,

\[
\mu K^{2t+1}(l + 2i + 1) \ge \mu K^{2t+1}(l - 2i - 1) \ge \mu K^{2t+1}(l + 2i + 3)
\]

and

(22)

Proof of Proposition 4.2. For the birth-and-death chain in Proposition 4.2, it is obvious that $p_i = 1 - i/n$ and $q_i = i/n$. This implies

\[
p_i = q_{n-i}, \quad p_i > p_{i+1}, \quad p_i + q_{i+2} = 1 + \frac{2}{n}, \quad \forall i \ge 0.
\]

Applying Corollary A.2 with $K = K_n$ and $\mu = \delta_{\lceil n/2\rceil}$, the Dirac mass on $\lceil n/2\rceil$, yields

\[
K_n^t(\lceil n/2\rceil, A) \ge 1/2, \quad \forall t \ge 0.
\]

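For the general case with $\mu_n(A) \ge 1/2$, let $(X_t)_{t=0}^{\infty}$ be a Markov chain with transition matrix $K_n$ and let $T$ be the first passage time to state $\lceil n/2\rceil$, i.e., $T = \min\{t \ge 0 : X_t = \lceil n/2\rceil\}$. By the irreducibility of $K_n$, $P_{\mu_n}(T < \infty) = 1$. Using the strong Markov property, we obtain, for $t \ge 0$,

\[
\begin{aligned}
\mu_nK_n^t(A) &= \sum_{i=0}^{t}P_{\mu_n}(X_t \in A, T = i) + P_{\mu_n}(X_t \in A, T > t)\\
&= \sum_{i=0}^{t}P(X_{t-i} \in A \mid X_0 = \lceil n/2\rceil)P_{\mu_n}(T = i) + P_{\mu_n}(T > t)\\
&\ge \frac{1}{2}P_{\mu_n}(T \le t) + P_{\mu_n}(T > t) \ge 1/2.
\end{aligned}
\]

Lemma A.3 ([6, Lemma A.1]). For $n > 0$, let $a_n \in \mathbb{R}^+$, $b_n \in \mathbb{Z}^+$, $c_n = \frac{b_n - a_n}{\sqrt{a_n}}$ and $d_n = e^{-a_n}\sum_{i=0}^{b_n}\frac{a_n^i}{i!}$. Assume that $a_n + b_n \to \infty$. Then

\[
\limsup_{n\to\infty} d_n = \Phi\Big(\limsup_{n\to\infty} c_n\Big), \qquad \liminf_{n\to\infty} d_n = \Phi\Big(\liminf_{n\to\infty} c_n\Big), \tag{A.4}
\]

where $\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-t^2/2}\,dt$.

In particular, if $c_n$ converges (the limit can be $+\infty$ and $-\infty$), then $\lim_{n\to\infty} d_n = \Phi\big(\lim_{n\to\infty} c_n\big)$.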
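Lemma A.3 is a central-limit statement for the Poisson distribution function, and it can be illustrated numerically. In the sketch below, `poisson_cdf(a, b)` plays the role of $d_n$ with $a_n = a$ and $b_n = b$; the function names and the sample values of $a$ and $b$ are illustrative choices, not taken from the paper.

```python
from math import erf, exp, lgamma, log, sqrt

def poisson_cdf(a, b):
    """d = e^{-a} * sum_{i=0}^{b} a^i / i!, summed on the log scale for stability (a > 0)."""
    return sum(exp(-a + i * log(a) - lgamma(i + 1)) for i in range(b + 1))

def std_normal_cdf(x):
    """Phi(x) = (2 pi)^{-1/2} * integral_{-inf}^{x} e^{-t^2/2} dt, via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def normalized_gap(a, b):
    """c = (b - a) / sqrt(a), the argument of Phi in Lemma A.3."""
    return (b - a) / sqrt(a)
```

For instance, with $a = 10^4$ and $b = a + \sqrt{a}$ one has $c = 1$, and `poisson_cdf(a, b)` should be close to $\Phi(1) \approx 0.8413$; the discrepancy is of the order $1/\sqrt{a}$.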

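Lemma A.4. For $n \ge 1$, let $\xi_n$ be a binomial random variable with parameters $(n, 1/2)$. Then, there is a universal constant $C > 0$ such that

\[
E\Big(\Big|\frac{n - 2\xi_n}{\sqrt{n}}\Big|^\theta\Big) \le C4^\theta\,\Gamma\Big(\frac{\theta + 1}{2}\Big), \quad \forall \theta > 0,\ n \ge 1,
\]

where $\Gamma$ is the Gamma function.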
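The bound in Lemma A.4 can be probed numerically by computing the binomial moment exactly and dividing by $4^\theta\Gamma((\theta+1)/2)$. This sketch does not determine the constant $C$ of the lemma; it only reports the ratio for chosen $(n, \theta)$, and the function names are hypothetical.

```python
from math import comb, gamma, sqrt

def abs_moment(n, theta):
    """E |(n - 2 xi_n)/sqrt(n)|^theta for xi_n ~ Binomial(n, 1/2), computed exactly."""
    total = 0.0
    for x in range(n + 1):
        weight = comb(n, x) / 2**n
        total += weight * (abs(n - 2 * x) / sqrt(n)) ** theta
    return total

def moment_ratio(n, theta):
    """Ratio of the moment to 4^theta * Gamma((theta + 1)/2), the right side with C = 1."""
    return abs_moment(n, theta) / (4**theta * gamma((theta + 1) / 2))
```

On the sample grid $n \in \{1, 2, 10, 100, 400\}$ and $\theta \in \{0.5, 1, 2, 4, 6\}$ the ratio stays below 1, which is consistent with the lemma but says nothing about the sharp value of $C$.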

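Proof. Set $\Omega_n = \{0, 1, ..., n\}$ and $\pi_n(x) = \binom{n}{x}2^{-n}$. According to the definition of $\xi_n$, $P(\xi_n = x) = \pi_n(x)$ for $x \in \Omega_n$. For $0 \le j < \sqrt{n}$, set

\[
E_{n,j} = \{x \in \Omega_n : |n - 2x|/\sqrt{n} \in (j, j+1]\}, \qquad y_{n,j} = \max\{x \in E_{n,j} : x \le n/2\}.
\]

Clearly, $[n - (j+1)\sqrt{n}]/2 \le y_{n,j} < (n - j\sqrt{n})/2$ and

\[
E\Big(\Big|\frac{n - 2\xi_n}{\sqrt{n}}\Big|^\theta\Big) \le \sum_{j=0}^{\lfloor\sqrt{n}\rfloor}(j + 1)^\theta\pi_n(E_{n,j}). \tag{A.5}
\]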

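Using (5.3), we obtain, for $y_{n,j} \ne 0$,

\[
\pi_n(E_{n,j}) = 2^{-n}\sum_{x\in E_{n,j}}\frac{n!}{x!(n-x)!} \le 2^{1-n}\sqrt{n}\,\frac{n!}{y_{n,j}!(n - y_{n,j})!} \le 2^{2-n}\beta^3\frac{\sqrt{n}\,n^{n+1/2}}{y_{n,j}^{y_{n,j}+1/2}(n - y_{n,j})^{n-y_{n,j}+1/2}} = 8\beta^3/z_{n,j},
\]

where

\[
\begin{aligned}
z_{n,j} &= \Big(\frac{2}{n}\Big)^{n+1}y_{n,j}^{y_{n,j}+1/2}(n - y_{n,j})^{n-y_{n,j}+1/2}\\
&= \Big[\frac{2y_{n,j}}{n}\Big(2 - \frac{2y_{n,j}}{n}\Big)\Big]^{(n+1)/2}\Big(\frac{n - y_{n,j}}{y_{n,j}}\Big)^{n/2-y_{n,j}}\\
&= \Big[1 - \Big(1 - \frac{2y_{n,j}}{n}\Big)^2\Big]^{(n+1)/2}\Big[\frac{1 + (1 - 2y_{n,j}/n)}{1 - (1 - 2y_{n,j}/n)}\Big]^{n/2-y_{n,j}}.
\end{aligned}
\]

Note that the mapping $t \mapsto (1 - t)^{1/t}$ is strictly decreasing on $(0, 1)$. This implies

\[
\Big[1 - \Big(1 - \frac{2y_{n,j}}{n}\Big)^2\Big]^{n/2} \ge \Big[1 - \Big(1 - \frac{2y_{n,j}}{n}\Big)\Big]^{n/2-y_{n,j}}
\]

and, hence,

\[
z_{n,j} \ge \sqrt{1 - \Big(1 - \frac{2y_{n,j}}{n}\Big)^2}\,\Big[1 + \Big(1 - \frac{2y_{n,j}}{n}\Big)\Big]^{n/2-y_{n,j}} \ge \sqrt{\frac{2y_{n,j}}{n}}\,\Big[1 + \Big(1 - \frac{2y_{n,j}}{n}\Big)\Big]^{n/2-y_{n,j}}.
\]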

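In the case $y_{n,j} \ge n/6$, one may use the inequality $\log(1 + t) \ge t/2$ for $t \in [0, 1]$ to get

\[
z_{n,j} \ge \frac{1}{\sqrt{3}}\exp\Big\{\frac{n}{4}\Big(1 - \frac{2y_{n,j}}{n}\Big)^2\Big\} \ge \frac{1}{\sqrt{3}}e^{j^2/4}.
\]

In the case $1 \le y_{n,j} \le n/6$, it is clear that

\[
z_{n,j} \ge \sqrt{\frac{2}{n}}\Big(\frac{5}{3}\Big)^{n/3} \ge \sqrt{\frac{2}{n}}\,e^{n/6} \ge \sqrt{\frac{2}{n}}\,e^{n/24}e^{j^2/8},
\]

where the last inequality applies the fact $j < \sqrt{n}$. Putting both cases together, we may choose a universal constant $C > 1$ such that

\[
z_{n,j} \ge \frac{e^{j^2/8}}{C}, \quad \forall 0 \le j \le \sqrt{n},\ y_{n,j} \ne 0,\ n \ge 1.
\]

Back to the computation of $\pi_n(E_{n,j})$, this gives

\[
\pi_n(E_{n,j}) \le 8C\beta^3e^{-j^2/8}, \quad \forall 0 \le j \le \sqrt{n},\ y_{n,j} \ne 0,\ n \ge 1.
\]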

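In fact, the above inequality also holds for $y_{n,j} = 0$ (which must imply $j = \lfloor\sqrt{n}\rfloor$) since, in such a case, $\pi_n(E_{n,j}) = 2^{1-n} \le 2e^{-(\log 2)j^2} \le 2e^{-j^2/8}$. Applying this to the computation in (A.5), we have

\[
\begin{aligned}
E\Big(\Big|\frac{n - 2\xi_n}{\sqrt{n}}\Big|^\theta\Big) &\le 8C\beta^3\sum_{j=0}^{\lfloor\sqrt{n}\rfloor}(j + 1)^\theta e^{-j^2/8} \le 16C\beta^3\sum_{j=0}^{\lfloor\sqrt{n}\rfloor}(j + 1)^\theta e^{-(j+2)^2/16}\\
&\le 16C\beta^3\sum_{j=0}^{\lfloor\sqrt{n}\rfloor}\int_{j+1}^{j+2}t^\theta e^{-t^2/16}\,dt \le 64C\beta^34^\theta\int_0^\infty s^\theta e^{-s^2}\,ds = 32C\beta^34^\theta\,\Gamma\Big(\frac{\theta + 1}{2}\Big).
\end{aligned}
\]

References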

[1] David Aldous. Random walks on finite groups and rapidly mixing Markov chains. In Seminar on probability, XVII, volume 986 of Lecture Notes in Math., pages 243–297. Springer, Berlin, 1983.

[2] David Aldous and Persi Diaconis. Shuffling cards and stopping times. Amer. Math. Monthly, 93(5):333–348, 1986.

[3] David Aldous and Persi Diaconis. Strong uniform times and finite random walks. Adv. in Appl. Math., 8(1):69–97, 1987.

[4] M. Brown and Y.-S. Shao. Identifying coefficients in the spectral representation for first passage time distributions. Probab. Engrg. Inform. Sci., 1:69–74, 1987.

[5] Guan-Yu Chen. The cutoff phenomenon for finite Markov chains. PhD thesis, Cornell University, 2006.

[6] Guan-Yu Chen and Laurent Saloff-Coste. The cutoff phenomenon for ergodic Markov processes. Electron. J. Probab., 13:26–78, 2008.

[7] Guan-Yu Chen and Laurent Saloff-Coste. The L2-cutoff for reversible Markov processes. J. Funct. Anal., 258(7):2246–2315, 2010.

[8] Persi Diaconis and Phil Hanlon. Eigen-analysis for some examples of the Metropolis algorithm. In Hypergeometric functions on domains of positivity, Jack polynomials, and applications (Tampa, FL, 1991), volume 138 of Contemp. Math., pages 99–117. Amer. Math. Soc., Providence, RI, 1992.

[9] Persi Diaconis and Laurent Saloff-Coste. Separation cut-offs for birth and death chains. Ann. Appl. Probab., 16(4):2098–2122, 2006.

[10] Jian Ding, Eyal Lubetzky, and Yuval Peres. Total variation cutoff in birth-and-death chains. Probab. Theory Related Fields, 146(1-2):61–85, 2010.

[11] William Feller. An introduction to probability theory and its applications. Vol. I. Third edition. John Wiley & Sons Inc., New York, 1968.

[12] R. Koekoek and R. Swarttouw. The Askey-scheme of hypergeometric orthogonal polynomials and its q-analog. http://math.nist.gov/opsf/projects/koekoek.html, 1998.

[13] David A. Levin, Yuval Peres, and Elizabeth L. Wilmer. Markov chains and mixing times. American Mathematical Society, Providence, RI, 2009. With a chapter by James G. Propp and David B. Wilson.

[14] Laurent Saloff-Coste. Lectures on finite Markov chains. In Lectures on probability theory and statistics (Saint-Flour, 1996), volume 1665 of Lecture Notes in Math., pages 301–413. Springer, Berlin, 1997.

[15] Laurent Saloff-Coste. Random walks on finite groups. In Probability on discrete structures, volume 110 of Encyclopaedia Math. Sci., pages 263–346. Springer, Berlin, 2004.

1

Department of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan


2

Department of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan

E-mail address: [email protected] 3

Department of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan
