
SPECTRAL COMPUTATIONS FOR BIRTH AND DEATH CHAINS

GUAN-YU CHEN1 AND LAURENT SALOFF-COSTE2

Abstract. We consider the spectrum of birth and death chains on an $n$-path. An iterative scheme is proposed to compute any eigenvalue with an exponential convergence rate independent of $n$. This allows one to determine the whole spectrum in order $n^2$ elementary operations. Using the same idea, we also provide a lower bound on the spectral gap, which is of the correct order on some classes of examples.

1. Introduction

Let $G = (V, E)$ be the undirected finite path with vertex set $V = \{1, 2, \ldots, n\}$ and edge set $E = \{\{i, i+1\} : i = 1, 2, \ldots, n-1\}$. Given two positive measures $\pi, \nu$ on $V, E$ with $\pi(V) = 1$, the Dirichlet form and variance associated with $\nu$ and $\pi$ are defined by

\[ \mathcal{E}_\nu(f, g) := \sum_{i=1}^{n-1} [f(i) - f(i+1)][g(i) - g(i+1)]\,\nu(i, i+1) \]

and

\[ \mathrm{Var}_\pi(f) := \pi(f^2) - \pi(f)^2, \]

where $f, g$ are functions on $V$. When convenient, we set $\nu(0, 1) = \nu(n, n+1) = 0$. The spectral gap of $G$ with respect to $\pi, \nu$ is defined as

\[ \lambda^G_{\pi,\nu} := \min\left\{ \frac{\mathcal{E}_\nu(f, f)}{\mathrm{Var}_\pi(f)} \;\Big|\; f \text{ is non-constant} \right\}. \]

Let $M^G_{\pi,\nu}$ be the matrix given by $M^G_{\pi,\nu}(i, j) = 0$ for $|i - j| > 1$ and

\[ M^G_{\pi,\nu}(i, j) = -\frac{\nu(i, j)}{\pi(i)}, \quad \forall |i - j| = 1, \qquad M^G_{\pi,\nu}(i, i) = \frac{\nu(i-1, i) + \nu(i, i+1)}{\pi(i)}. \]

Obviously, $\lambda^G_{\pi,\nu}$ is the smallest non-zero eigenvalue of $M^G_{\pi,\nu}$.
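As a concrete illustration of the definitions above, the following minimal sketch builds $M^G_{\pi,\nu}$ with NumPy and reads off the spectral gap as the second-smallest eigenvalue. The function names `build_M` and `spectral_gap`, the 0-based array layout and the uniform example measures are assumptions made for this illustration only.

```python
import numpy as np

def build_M(pi, nu):
    """Return the tridiagonal matrix M^G_{pi,nu} of the n-path.

    pi[i] is the weight of vertex i+1 and nu[i] the weight of the edge
    {i+1, i+2} (0-based arrays for the paper's 1-based vertices).
    """
    pi = np.asarray(pi, dtype=float)
    nu = np.asarray(nu, dtype=float)
    n = len(pi)
    # Pad with the convention nu(0,1) = nu(n,n+1) = 0.
    nu_ext = np.concatenate(([0.0], nu, [0.0]))
    M = np.zeros((n, n))
    for i in range(n):
        M[i, i] = (nu_ext[i] + nu_ext[i + 1]) / pi[i]
        if i + 1 < n:
            M[i, i + 1] = -nu[i] / pi[i]
            M[i + 1, i] = -nu[i] / pi[i + 1]
    return M

def spectral_gap(pi, nu):
    """Smallest non-zero eigenvalue of M^G_{pi,nu} (the spectrum is real)."""
    eigenvalues = np.sort(np.linalg.eigvals(build_M(pi, nu)).real)
    return float(eigenvalues[1])

# Hypothetical example: uniform measures on the 4-path.
pi = np.full(4, 0.25)
nu = np.full(3, 0.25)
print(spectral_gap(pi, nu))
```

For these uniform measures the matrix is the usual path Laplacian, whose smallest non-zero eigenvalue is $2 - 2\cos(\pi/4) = 2 - \sqrt{2}$.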

Undirected paths equipped with measures $\pi, \nu$ are closely related to birth and death chains. A birth and death chain on $\{0, 1, 2, \ldots, n\}$ with birth rate $p_i$, death rate $q_i$ and holding rate $r_i$ is a Markov chain with transition matrix $K$ given by

\[ K(i, i+1) = p_i, \quad K(i, i-1) = q_i, \quad K(i, i) = r_i, \quad \forall\, 0 \le i \le n, \tag{1.1} \]

where $p_i + q_i + r_i = 1$ and $p_n = q_0 = 0$. Under the assumption of irreducibility, that is, $p_i q_{i+1} > 0$ for $0 \le i < n$, $K$ has a unique stationary distribution $\pi$ given by $\pi(i) = c(p_0 \cdots p_{i-1})/(q_1 \cdots q_i)$, where $c$ is the positive constant such that $\sum_{i=0}^{n} \pi(i) = 1$. The smallest non-zero eigenvalue of $I - K$ is exactly the spectral gap of the path on $\{0, 1, \ldots, n\}$ with measures $\pi, \nu$, where $\nu(i, i+1) = \pi(i)p_i = \pi(i+1)q_{i+1}$ for $0 \le i < n$.

2000 Mathematics Subject Classification. 60J10, 60J27.

Key words and phrases. Birth and death chains, spectrum.

1 Partially supported by NSC grant NSC100-2115-M-009-003-MY2 and NCTS, Taiwan.
2 Partially supported by NSF grant DMS-1004771.
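The passage from a birth and death chain to the pair $(\pi, \nu)$ can be sketched as follows. This is a minimal illustration assuming NumPy; the helper name `birth_death_chain` and the example rates are hypothetical and chosen only so that irreducibility holds.

```python
import numpy as np

def birth_death_chain(p, q):
    """Return (K, pi, nu) for the chain on {0, ..., n} with rates p, q.

    Requires q[0] == 0, p[-1] == 0 and p[i] * q[i+1] > 0 (irreducibility).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = len(p)                       # number of states, m = n + 1
    K = np.zeros((m, m))
    for i in range(m):
        if i + 1 < m:
            K[i, i + 1] = p[i]
        if i > 0:
            K[i, i - 1] = q[i]
        K[i, i] = 1.0 - p[i] - q[i]  # holding rate r_i
    # pi(i) = c * (p_0 ... p_{i-1}) / (q_1 ... q_i)
    weights = np.ones(m)
    for i in range(1, m):
        weights[i] = weights[i - 1] * p[i - 1] / q[i]
    pi = weights / weights.sum()
    nu = pi[:-1] * p[:-1]            # nu(i, i+1) = pi(i) p_i
    return K, pi, nu

# Hypothetical rates on {0, 1, 2, 3}.
p = [0.5, 0.3, 0.3, 0.0]
q = [0.0, 0.2, 0.4, 0.5]
K, pi, nu = birth_death_chain(p, q)
# Smallest non-zero eigenvalue of I - K, i.e. the spectral gap.
gap = np.sort(np.linalg.eigvals(np.eye(len(p)) - K).real)[1]
print(pi, nu, gap)
```

With this choice of $\nu$, the matrix $I - K$ coincides entry by entry with $M^G_{\pi,\nu}$ for the path on $\{0, \ldots, n\}$, which is why the two notions of spectral gap agree.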

Note that if $\mathbf{1}$ is the constant function of value 1 and $\psi$ is a minimizer for $\lambda^G_{\pi,\nu}$, then $\psi - \pi(\psi)\mathbf{1}$ is an eigenvector of $M^G_{\pi,\nu}$. This implies that any minimizer $\psi$ for $\lambda^G_{\pi,\nu}$ satisfying $\pi(\psi) = 0$ satisfies the Euler-Lagrange equation,

\[ \lambda^G_{\pi,\nu}\,\pi(i)\psi(i) = [\psi(i) - \psi(i-1)]\nu(i-1, i) + [\psi(i) - \psi(i+1)]\nu(i, i+1), \tag{1.2} \]

for all $1 \le i \le n$. Assuming the connectedness of $G$ (i.e., the superdiagonal and subdiagonal entries of $M^G_{\pi,\nu}$ are non-zero), the rank of $M^G_{\pi,\nu} - \lambda I$ is at least $n - 1$. This implies that all eigenvalues of $M^G_{\pi,\nu}$ are simple. See Lemma A.3 for an illustration. Observe that, by (1.2), any non-trivial eigenvector of $M^G_{\pi,\nu}$ has mean 0 under $\pi$. This implies that all minimizers for the spectral gap are of the form $a\psi + b\mathbf{1}$, where $a, b$ are constants and $\psi$ is a nontrivial solution of (1.2). In 2009, Miclo implicitly obtained the following result.

Theorem 1.1. [15, Proposition 1] If $\psi$ is a minimizer for $\lambda^G_{\pi,\nu}$, then $\psi$ must be monotonic, that is, either $\psi(i) \le \psi(i+1)$ for all $1 \le i < n$ or $\psi(i) \ge \psi(i+1)$ for all $1 \le i < n$.

One aim of this paper is to provide a scheme to compute the spectrum of $M^G_{\pi,\nu}$, in particular, the spectral gap. Based on Miclo's observation, it is natural to consider the following algorithm.

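(A1)

Choose two positive reals $\lambda_0, a$ in advance and set, for $k = 0, 1, \ldots$,

1. $\psi_k(1) = -a$,

2. $\psi_k(i+1) = \psi_k(i) + \dfrac{\{[\psi_k(i) - \psi_k(i-1)]\nu(i-1, i) - \lambda_k\pi(i)\psi_k(i)\}^+}{\nu(i, i+1)}$, for $1 \le i < n$, where $t^+ = \max\{t, 0\}$,

3. $\lambda_{k+1} = \dfrac{\mathcal{E}_\nu(\psi_k, \psi_k)}{\mathrm{Var}_\pi(\psi_k)}$.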
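The three steps of (A1) can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the authors' code: the names `truncated_profile` and `algorithm_A1`, the 0-based arrays (`pi[i]` for vertex $i+1$, `nu[i]` for edge $\{i+1, i+2\}$) and the uniform example are all hypothetical.

```python
import numpy as np

def truncated_profile(lam, pi, nu, a=1.0):
    """Step 1-2 of (A1): build psi from psi(1) = -a with the {.}^+ recursion."""
    n = len(pi)
    psi = np.empty(n)
    psi[0] = -a
    left_flux = 0.0  # [psi(i) - psi(i-1)] nu(i-1, i); zero at i = 1
    for i in range(n - 1):
        step = max(left_flux - lam * pi[i] * psi[i], 0.0)
        psi[i + 1] = psi[i] + step / nu[i]
        left_flux = step  # equals [psi(i+1) - psi(i)] nu(i, i+1)
    return psi

def rayleigh_quotient(psi, pi, nu):
    """Dirichlet form divided by variance."""
    diffs = np.diff(psi)
    energy = float(np.sum(diffs * diffs * nu))
    mean = float(np.dot(pi, psi))
    variance = float(np.dot(pi, psi * psi)) - mean * mean
    return energy / variance

def algorithm_A1(pi, nu, lam0, a=1.0, iters=100):
    """Return the list [lambda_1, ..., lambda_iters] produced by (A1)."""
    lam, history = lam0, []
    for _ in range(iters):
        psi = truncated_profile(lam, pi, nu, a)
        lam = rayleigh_quotient(psi, pi, nu)  # step 3
        history.append(lam)
    return history

# Hypothetical example: uniform measures on the 5-path.
pi = np.full(5, 0.2)
nu = np.full(4, 0.2)
print(algorithm_A1(pi, nu, lam0=1.0)[-1], 2 - 2 * np.cos(np.pi / 5))
```

For this example the first two iterates are $\lambda_1 = 0.625$ and $\lambda_2 \approx 0.4416$, decreasing towards the gap $2 - 2\cos(\pi/5) \approx 0.3820$.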

The following theorems discuss the behavior of λk.

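Theorem 1.2 (Convergence to the exact value). Referring to (A1), if $n = 2$, then $\lambda_k = \lambda^G_{\pi,\nu}$ for all $k \ge 1$. If $n \ge 3$, then the sequence $(\lambda_k, \psi_k)$ satisfies

(1) If $\lambda_0 = \lambda^G_{\pi,\nu}$, then $\lambda_k = \lambda^G_{\pi,\nu}$ for all $k \ge 0$.

(2) If $\lambda_0 \ne \lambda^G_{\pi,\nu}$, then $\lambda_k > \lambda_{k+1} > \lambda^G_{\pi,\nu}$ for $k \ge 1$.

(3) Set $(\lambda^*, \psi^*) = \lim_{k\to\infty}(\lambda_k, \psi_k)$. Then, $\lambda^* = \mathcal{E}_\nu(\psi^*, \psi^*)/\mathrm{Var}_\pi(\psi^*) = \lambda^G_{\pi,\nu}$ and $\pi(\psi^*) = 0$.

Theorem 1.3 (Rate of convergence). Referring to Theorem 1.2, there is a constant $\sigma \in (0, 1)$ independent of the choice of $(\lambda_0, a)$ such that $0 \le \lambda_k - \lambda^G_{\pi,\nu} \le \sigma^{k-1}\lambda_1$ for all $k \ge 1$.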

By Theorem 1.3, we know that the sequence λk generated in (A1) converges to the spectral gap exponentially but the rate (− log σ) is undetermined. The following alternative scheme is based on using more information on the spectral gap and will provide convergence at a constant rate.

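(A2)

Choose $a > 0$, $L_0 < \lambda^G_{\pi,\nu} < U_0$ in advance and set, for $k = 0, 1, \ldots$,

1. $\psi_k(1) = -a$, $\lambda_k = \frac{1}{2}(L_k + U_k)$,

2. $\psi_k(i+1) = \psi_k(i) + \dfrac{\{[\psi_k(i) - \psi_k(i-1)]\nu(i-1, i) - \lambda_k\pi(i)\psi_k(i)\}^+}{\nu(i, i+1)}$, for $1 \le i < n$, where $t^+ = \max\{t, 0\}$,

3. \[ \begin{cases} L_{k+1} = L_k,\ U_{k+1} = \lambda_k & \text{if } \pi(\psi_k) > 0, \\ L_{k+1} = \lambda_k,\ U_{k+1} = U_k & \text{if } \pi(\psi_k) < 0, \\ L_{k+1} = U_{k+1} = \lambda_k & \text{if } \pi(\psi_k) = 0. \end{cases} \]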

Theorem 1.4 (Dichotomy method). Referring to (A2), it holds true that

\[ 0 \le \max\{U_k - \lambda^G_{\pi,\nu},\ \lambda^G_{\pi,\nu} - L_k\} \le (U_0 - L_0)2^{-k}, \quad \forall k \ge 0. \]
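The dichotomy scheme (A2) amounts to a bisection on the sign of $\pi(\psi_k)$. The sketch below is a hedged illustration: the names `truncated_profile` and `algorithm_A2`, the 0-based arrays and the bracket $[0.01, 4]$ are assumptions for the example, and the caller is responsible for supplying $L_0 < \lambda^G_{\pi,\nu} < U_0$.

```python
import numpy as np

def truncated_profile(lam, pi, nu, a=1.0):
    """Step 2 of (A2): psi(1) = -a followed by the {.}^+ recursion."""
    n = len(pi)
    psi = np.empty(n)
    psi[0] = -a
    left_flux = 0.0  # [psi(i) - psi(i-1)] nu(i-1, i); zero at i = 1
    for i in range(n - 1):
        step = max(left_flux - lam * pi[i] * psi[i], 0.0)
        psi[i + 1] = psi[i] + step / nu[i]
        left_flux = step
    return psi

def algorithm_A2(pi, nu, L0, U0, a=1.0, iters=50):
    """Bisection on the sign of pi(psi); returns the final bracket (L, U)."""
    L, U = L0, U0
    for _ in range(iters):
        lam = 0.5 * (L + U)
        mean = float(np.dot(pi, truncated_profile(lam, pi, nu, a)))
        if mean > 0:      # lam lies above the spectral gap
            U = lam
        elif mean < 0:    # lam lies below the spectral gap
            L = lam
        else:             # exact hit
            L = U = lam
    return L, U

# Hypothetical example: uniform measures on the 5-path.
pi = np.full(5, 0.2)
nu = np.full(4, 0.2)
print(algorithm_A2(pi, nu, L0=0.01, U0=4.0))
```

Each pass costs one sweep over the path, so reaching a prescribed accuracy takes a number of elementary operations proportional to $n$ times the number of halvings.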

In Theorem 1.4, the convergence to the spectral gap is exponentially fast with explicit rate, $\log 2$. See Remark 2.2 for a discussion on the choice of $L_0$ and $U_0$. For higher order spectra, Miclo has a detailed description of the shape of eigenvectors in [14] and this will motivate the definition of similar algorithms for every eigenvalue in the spectrum. See (Di) and Theorem 3.4 for a generalization of (A2) and Theorem 3.14 for a localized version of Theorem 1.3.

The spectral gap is an important parameter in the quantitative analysis of Markov chains. The cutoff phenomenon, a sharp phase transition phenomenon for Markov chains, was introduced by Aldous and Diaconis in the early 1980s. It is of interest in many applications. A heuristic conjecture proposed by Peres in 2004 says that the cutoff exists if and only if the product of the spectral gap and the mixing time tends to infinity. Assuming reversibility, this has been proved to hold for $L^p$-convergence with $1 < p \le \infty$ in [2]. For the $L^1$-convergence, Ding et al. [10] prove this conjecture for continuous time birth and death chains. In order to use Peres' conjecture in practice, the orders of magnitude of the spectral gap and the mixing time are required. The second aspect of this paper is to derive a theoretical lower bound on the spectral gap using only the birth and death rates. This lower bound is obtained using the same idea used to analyze the above algorithm. For estimates on the mixing time of birth and death chains, we refer the readers to the recent work [4] by Chen and Saloff-Coste. For illustration, we consider several examples of specific interest and show that the lower bound provided here is in fact of the correct order in these examples.

This article is organized as follows. In Section 2, the algorithms in (A1)-(A2) are explored and proofs for Theorems 1.2-1.4 are given. In Section 3, the spectrum of $M^G_{\pi,\nu}$ is discussed further and, based on Miclo's work [14], Algorithm (A2) is generalized to any specified eigenvalue of $M^G_{\pi,\nu}$. Our method is applicable to paths of infinite length (one-sided) and this is described in Section 4. For illustration, we consider some Metropolis chains and display numerical results of Algorithm (A2) in Section 5. In Section 6, we focus on uniform measures with bottlenecks and determine the correct order of the spectral gap using the theory in Sections 2-3. It is worthwhile to remark that the assumptions in Section 6 can be relaxed using the comparison technique in [7, 8]. As the work in this paper can also be regarded as a stochastic counterpart of the theory of finite Jacobi matrices, we would like to refer the readers to [18, 19] for a complementary perspective.

2. Convergence to the spectral gap

This section is devoted to proving Theorems 1.2-1.4. First, we prove Theorem 1.1 in the following form.

Lemma 2.1. Let $\lambda > 0$ and $\psi$ be a non-constant function on $V$. Suppose $(\lambda, \psi)$ solves (1.2) and $\psi$ is monotonic. Then, $\psi$ is strictly monotonic, that is, either $\psi(i) < \psi(i+1)$ for $1 \le i < n$ or $\psi(i) > \psi(i+1)$ for $1 \le i < n$.

Proof. Obviously, (1.2) implies that $\pi(\psi) = 0$. Without loss of generality, it suffices to consider the case when $\psi(1) < 0$ and $\psi(n) > 0$. Since $\psi$ is non-constant and $\lambda^G_{\pi,\nu} > 0$, we have $\psi(1) < \psi(2)$ and $\psi(n-1) < \psi(n)$. Note that if there are $1 < i < j < n$ such that $\psi(i-1) < \psi(i)$, $\psi(j) < \psi(j+1)$ and $\psi(k) = \psi(i) = \psi(j)$ for $i \le k \le j$, then (1.2) yields

\[ \lambda^G_{\pi,\nu}\pi(i)\psi(i) = [\psi(i) - \psi(i-1)]\nu(i-1, i) + [\psi(i) - \psi(i+1)]\nu(i, i+1) > 0 \]

and

\[ \lambda^G_{\pi,\nu}\pi(j)\psi(j) = [\psi(j) - \psi(j-1)]\nu(j-1, j) + [\psi(j) - \psi(j+1)]\nu(j, j+1) < 0, \]

a contradiction. Thus, $\psi$ is strictly increasing. □

We note the following corollary.

Corollary 2.2. Let $(\lambda, \psi)$ be a pair satisfying (1.2). Then, $\lambda = \lambda^G_{\pi,\nu}$ if and only if $\psi$ is monotonic.

Proof. One direction is obvious from Theorem 1.1. For the other direction, assume that $\psi$ is monotonic and let $\phi$ be a minimizer for $\lambda^G_{\pi,\nu}$ with $\pi(\phi) = 0$. Since $(\lambda, \psi)$ and $(\lambda^G_{\pi,\nu}, \phi)$ are solutions to (1.2), one has

\[ \lambda\pi(\psi\phi) = \mathcal{E}_\nu(\psi, \phi) = \lambda^G_{\pi,\nu}\pi(\phi\psi). \]

By Lemma 2.1, $\psi$ and $\phi$ are strictly monotonic and this implies $\mathcal{E}_\nu(\psi, \phi) \ne 0$. As a consequence of the above equations, we have $\lambda = \lambda^G_{\pi,\nu}$. □

The following proposition is the key to Theorem 1.2.

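Proposition 2.3. Suppose that $(\lambda, \psi)$ satisfies $\lambda > 0$, $\psi(1) < 0$ and, for $1 \le i < n$,

\[ \psi(i+1) = \psi(i) + \frac{\{[\psi(i) - \psi(i-1)]\nu(i-1, i) - \lambda\pi(i)\psi(i)\}^+}{\nu(i, i+1)}, \tag{2.1} \]

where $t^+ = \max\{t, 0\}$. Then, the following are equivalent.

(1) $\mathcal{E}_\nu(\psi, \psi) = \lambda\mathrm{Var}_\pi(\psi)$.

(2) $\pi(\psi) = 0$.

(3) $\lambda = \lambda^G_{\pi,\nu}$.

Furthermore, if $n \ge 3$, then any of the above is equivalent to

(4) $\mathcal{E}_\nu(\psi, \psi) = \lambda^G_{\pi,\nu}\mathrm{Var}_\pi(\psi)$.

Remark 2.1. For $n = 2$, it is an easy exercise to show that $\lambda^G_{\pi,\nu} = \nu(1, 2)/(\pi(1)\pi(2))$. By following the formula in (2.1), one has $\psi(2) = \psi(1)[1 - \lambda\pi(1)/\nu(1, 2)]$, which leads to $\mathcal{E}_\nu(\psi, \psi)/\mathrm{Var}_\pi(\psi) = \lambda^G_{\pi,\nu}$.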
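The computation behind Remark 2.1 can be written out as follows. This is a worked sketch using only (2.1) and the identity $\mathrm{Var}_\pi(f) = \pi(1)\pi(2)[f(1) - f(2)]^2$, valid on two points with $\pi(1) + \pi(2) = 1$; note that the truncation in (2.1) is inactive here because $-\lambda\pi(1)\psi(1) > 0$.

```latex
\begin{align*}
\psi(2) - \psi(1) &= -\frac{\lambda\pi(1)\psi(1)}{\nu(1,2)}, \\
\mathcal{E}_\nu(\psi,\psi) &= [\psi(1)-\psi(2)]^2\,\nu(1,2)
   = \frac{\lambda^2\pi(1)^2\psi(1)^2}{\nu(1,2)}, \\
\mathrm{Var}_\pi(\psi) &= \pi(1)\pi(2)\,[\psi(1)-\psi(2)]^2
   = \frac{\pi(1)\pi(2)\,\lambda^2\pi(1)^2\psi(1)^2}{\nu(1,2)^2}, \\
\frac{\mathcal{E}_\nu(\psi,\psi)}{\mathrm{Var}_\pi(\psi)}
   &= \frac{\nu(1,2)}{\pi(1)\pi(2)} = \lambda^G_{\pi,\nu}.
\end{align*}
```

The ratio does not depend on $\lambda$ or on $\psi(1)$, which is why a single step of (A1) already returns the exact gap when $n = 2$.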

Proof of Proposition 2.3. Set $B = \{1 \le i \le n \mid \psi(i) = \psi(n)\}$ and $B^c = \{1, 2, \ldots, i_0\}$. Since $\psi(1) < 0$ and $\lambda > 0$, $\psi(1) < \psi(2)$ and $B^c$ is nonempty. According to (2.1), $\psi$ is non-decreasing. Note that if $\psi(i) = \psi(i+1)$, then $\psi(i) \ge 0$ and $\psi(i+2) = \psi(i+1)$. This implies $\psi$ is strictly increasing on $\{1, 2, \ldots, i_0 + 1\}$ and, for $1 \le i \le i_0$,

\[ \lambda\pi(i)\psi(i) = [\psi(i) - \psi(i+1)]\nu(i, i+1) + [\psi(i) - \psi(i-1)]\nu(i-1, i). \]

Multiplying both sides by $\psi(i)$ and summing over all $i$ in $B^c$ yields

\begin{align*}
\lambda\sum_{i=1}^{i_0}\psi(i)^2\pi(i) &= \sum_{i=1}^{i_0-1}[\psi(i) - \psi(i+1)]^2\nu(i, i+1) + \psi(i_0)[\psi(i_0) - \psi(i_0+1)]\nu(i_0, i_0+1) \\
&= \mathcal{E}_\nu(\psi, \psi) + \psi(i_0+1)[\psi(i_0) - \psi(i_0+1)]\nu(i_0, i_0+1) \\
&= \mathcal{E}_\nu(\psi, \psi) + \lambda\psi(n)\sum_{i=1}^{i_0}\psi(i)\pi(i).
\end{align*}

This is equivalent to

\[ \mathcal{E}_\nu(\psi, \psi) = \lambda\mathrm{Var}_\pi(\psi) + \lambda\pi(\psi)[\pi(\psi) - \psi(n)], \tag{2.2} \]

which proves (1)⇔(2).
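The step from the summed identity to (2.2) can be made explicit. The following is a worked sketch that uses only the fact that $\psi \equiv \psi(n)$ on $B$, writing $\pi(B) = \sum_{i \in B}\pi(i)$.

```latex
\begin{align*}
\sum_{i=1}^{i_0}\psi(i)^2\pi(i) &= \pi(\psi^2) - \psi(n)^2\pi(B), \qquad
\sum_{i=1}^{i_0}\psi(i)\pi(i) = \pi(\psi) - \psi(n)\pi(B), \\
\lambda\pi(\psi^2) - \lambda\psi(n)^2\pi(B)
  &= \mathcal{E}_\nu(\psi,\psi) + \lambda\psi(n)\pi(\psi) - \lambda\psi(n)^2\pi(B), \\
\mathcal{E}_\nu(\psi,\psi)
  &= \lambda\pi(\psi^2) - \lambda\psi(n)\pi(\psi) \\
  &= \lambda\bigl[\pi(\psi^2) - \pi(\psi)^2\bigr] + \lambda\pi(\psi)\bigl[\pi(\psi) - \psi(n)\bigr].
\end{align*}
```

Since $\psi$ is non-constant and non-decreasing, $\pi(\psi) < \psi(n)$, so the correction term vanishes exactly when $\pi(\psi) = 0$.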

If $\lambda = \lambda^G_{\pi,\nu}$, then $\psi$ is an eigenvector for $M^G_{\pi,\nu}$ associated to $\lambda^G_{\pi,\nu}$. This proves (3)⇒(2). For (2)⇒(3), assume that $\pi(\psi) = 0$. In this case, $\psi$ must be strictly increasing. Otherwise, $\psi(i) = \psi(n) > 0$ for $i \in B$ and, according to (2.1), this implies

\[ \lambda\mathrm{Var}_\pi(\psi) > \lambda\sum_{i=1}^{n-1}\pi(i)\psi^2(i) \ge \sum_{i=1}^{n-1}[\psi(i) - \psi(i+1)]^2\nu(i, i+1) = \mathcal{E}_\nu(\psi, \psi), \]

which contradicts (1). As $\psi$ is strictly increasing and $\pi(\psi) = 0$, $(\lambda, \psi)$ solves (1.2). By Corollary 2.2, $\lambda = \lambda^G_{\pi,\nu}$.

To finish the proof, it remains to show (4)⇒(3) ((3)⇒(4) is obvious from the equivalence among (1), (2) and (3)). Assume that $\mathcal{E}_\nu(\psi, \psi) = \lambda^G_{\pi,\nu}\mathrm{Var}_\pi(\psi)$. By Lemma 2.1, $\psi$ is strictly monotonic and this implies, for $1 \le i < n$,

\[ \lambda\pi(i)\psi(i) = [\psi(i) - \psi(i+1)]\nu(i, i+1) + [\psi(i) - \psi(i-1)]\nu(i-1, i). \]

As $\psi$ is a minimizer for $\lambda^G_{\pi,\nu}$, one has, for $1 \le i \le n$,

\[ \lambda^G_{\pi,\nu}\pi(i)[\psi(i) - \pi(\psi)] = [\psi(i) - \psi(i+1)]\nu(i, i+1) + [\psi(i) - \psi(i-1)]\nu(i-1, i). \]

If $\lambda \ne \lambda^G_{\pi,\nu}$, the comparison of both systems yields

\[ \psi(i) = \frac{\lambda^G_{\pi,\nu}\pi(\psi)}{\lambda^G_{\pi,\nu} - \lambda}, \quad \forall 1 \le i < n. \]

As $n \ge 3$, $\psi(1) = \psi(2)$, a contradiction! This forces $\lambda = \lambda^G_{\pi,\nu}$, as desired. □

The following is a simple corollary of Proposition 2.3, which plays an important role in proving Theorem 1.4.

Corollary 2.4. Let n≥ 3. For λ > 0, let ϕλ be the vector generated by (2.1) with

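Proof. Without loss of generality, we fix $\phi_\lambda(1) = -1$ for all $\lambda > 0$. Set $T(\lambda) = \pi(\phi_\lambda)$. To prove this corollary, it suffices to show that

\[ T(\lambda) \begin{cases} < 0 & \text{if } \lambda < \lambda^G_{\pi,\nu}, \\ > 0 & \text{if } \lambda > \lambda^G_{\pi,\nu}. \end{cases} \]

For $\lambda > 0$, define $L(\lambda) := \mathcal{E}_\nu(\phi_\lambda, \phi_\lambda)/\mathrm{Var}_\pi(\phi_\lambda)$. By (2.2), one has

\[ L(\lambda) - \lambda = \frac{\lambda T(\lambda)[\pi(\phi_\lambda) - \phi_\lambda(n)]}{\mathrm{Var}_\pi(\phi_\lambda)}. \tag{2.3} \]

Since $\phi_\lambda$ is non-constant, $\pi(\phi_\lambda) < \phi_\lambda(n)$. As $L(\lambda) \ge \lambda^G_{\pi,\nu} > \lambda$ for $\lambda \in (0, \lambda^G_{\pi,\nu})$, this implies $T(\lambda) < 0$ for $\lambda \in (0, \lambda^G_{\pi,\nu})$. For $\lambda > \lambda^G_{\pi,\nu}$, set $I = (\lambda^G_{\pi,\nu}, \infty)$. By Proposition 2.3, $T(\lambda) = 0$ if and only if $\lambda = \lambda^G_{\pi,\nu}$. By the continuity of $T$, this implies either $T(I) \subset (-\infty, 0)$ or $T(I) \subset (0, \infty)$. In the case $T(I) \subset (-\infty, 0)$, one has $L(\lambda) > \lambda$ for $\lambda \in I$. As $L(I)$ is bounded, the sequence of iterates $L^k(\lambda)$ is convergent with limit $\tilde\lambda > \lambda^G_{\pi,\nu}$ and this yields

\[ 0 = \lim_{k\to\infty}[L^{k+1}(\lambda) - L^k(\lambda)] = \frac{\tilde\lambda T(\tilde\lambda)[\pi(\phi_{\tilde\lambda}) - \phi_{\tilde\lambda}(n)]}{\mathrm{Var}_\pi(\phi_{\tilde\lambda})} > 0, \]

a contradiction. Hence, $T(\lambda) > 0$ for $\lambda > \lambda^G_{\pi,\nu}$. □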

Proof of Theorem 1.2. The proof for $n = 2$ is obvious from a direct computation and we deal with the case $n \ge 3$ here. By the equivalence of Proposition 2.3 (3)-(4), if $\lambda_0 = \lambda^G_{\pi,\nu}$, then $\lambda_k = \lambda^G_{\pi,\nu}$ for all $k \ge 1$. If $\lambda_0 \ne \lambda^G_{\pi,\nu}$, then $\lambda_k > \lambda^G_{\pi,\nu}$ for $k \ge 1$. Note that $(\lambda_k, \psi_k)$ solves the system in (2.1). By (2.2), this implies

\[ \lambda_{k+1} - \lambda_k = \frac{\lambda_k\pi(\psi_k)[\pi(\psi_k) - \psi_k(n)]}{\mathrm{Var}_\pi(\psi_k)}, \quad \forall k \ge 0. \]

The strict monotonicity of $\lambda_k$ in (2) comes immediately from Corollary 2.4. In (3), the continuity of (2.1) in $\lambda$ implies that $(\lambda^*, \psi^*)$ is a solution to (2.1) and $\mathcal{E}_\nu(\psi^*, \psi^*) = \lambda^*\mathrm{Var}_\pi(\psi^*)$. By Proposition 2.3, $\lambda^* = \lambda^G_{\pi,\nu}$ and $\pi(\psi^*) = 0$, as desired. □

Proof of Theorem 1.3. Recall the notation in the proof of Corollary 2.4: For $\lambda > 0$, let $\phi_\lambda$ be the function defined by (2.1) and $L(\lambda) = \mathcal{E}_\nu(\phi_\lambda, \phi_\lambda)/\mathrm{Var}_\pi(\phi_\lambda)$. By (2.2) and Corollary 2.4, $L(\lambda) \in (\lambda^G_{\pi,\nu}, \lambda)$ for $\lambda > \lambda^G_{\pi,\nu}$. As $L$ is bounded, Theorem 1.3 follows from Lemma A.1. □

Proof of Theorem 1.4. Immediate from Corollary 2.4. □

At the end of this section, we use the following proposition to find how the shape of the function $\psi$ in (2.1) evolves with $\lambda$. In Proposition 2.5, we set $\phi_\lambda = \psi$ when $\psi$ is given by (2.1). It is easy to see from (2.1) that $\phi_\lambda$ is strictly increasing up to some index, say $i_0 = i_0(\lambda)$, and then stays constant equal to $\phi_\lambda(i_0)$ after $i_0$. The proposition shows how the index $i_0(\lambda)$ evolves.

Proposition 2.5. For $\lambda > 0$, let $\phi_\lambda$ be the function generated by (2.1) with $\phi_\lambda(1) = -1$ and, for $1 \le i \le n$, set $T_i(\lambda) = \sum_{j=1}^{i}\phi_\lambda(j)\pi(j)$. For $1 \le i < n$, let

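\[ A_i(\lambda) = \begin{pmatrix}
a_1(\lambda) & 1 & 0 & 0 & \cdots & 0 \\
\frac{\pi(3)}{\pi(2)} & a_2(\lambda) & 1 & 0 & & \vdots \\
0 & \frac{\pi(4)}{\pi(3)} & a_3(\lambda) & \ddots & \ddots & \vdots \\
0 & 0 & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & \ddots & a_{i-1}(\lambda) & 1 \\
0 & \cdots & \cdots & 0 & \frac{\pi(i+1)}{\pi(i)} & a_i(\lambda)
\end{pmatrix}, \tag{2.4} \]

and let $\lambda^{(i)}$ be the smallest root of $\det A_i(\lambda) = 0$. Then,

(1) $\lambda^G_{\pi,\nu} = \lambda^{(n-1)} < \lambda^{(n-2)} < \cdots < \lambda^{(1)}$.

(2) $\phi_\lambda(i) < \phi_\lambda(i+1) = \phi_\lambda(i+2)$ for $\lambda \in [\lambda^{(i)}, \lambda^{(i-1)})$ and $1 \le i \le n-2$, where $\lambda^{(0)} := \infty$.

(3) $\phi_\lambda(n-1) < \phi_\lambda(n)$ for $\lambda \in (0, \lambda^{(n-2)})$.

In particular, $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ for $\lambda \in (0, \lambda^{(i-1)})$ and $(\lambda - \lambda^{(i)})T_{i+1}(\lambda) > 0$ for $\lambda \in (0, \lambda^{(i)}) \cup (\lambda^{(i)}, \infty)$ with $1 \le i \le n-1$.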

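Proof. By Lemma A.2, $\lambda^{(1)} > \lambda^{(2)} > \cdots > \lambda^{(n-1)} > 0$ and, for $1 \le i \le n-1$,

\[ \det A_i(\lambda) \begin{cases} > 0 & \forall\lambda \in (-\infty, \lambda^{(i)}), \\ < 0 & \forall\lambda \in (\lambda^{(i)}, \lambda^{(i-1)}), \end{cases} \tag{2.5} \]

where $\lambda^{(0)} = \infty$. Note that if $T_i(\lambda) < 0$ for some $1 \le i \le n-1$, then

\[ \phi_\lambda(j+1) = \phi_\lambda(j) + \frac{[\phi_\lambda(j) - \phi_\lambda(j-1)]\nu(j-1, j) - \lambda\pi(j)\phi_\lambda(j)}{\nu(j, j+1)}, \quad \forall 1 \le j \le i. \]

This implies

\[ \phi_\lambda(\ell+1) = \phi_\lambda(\ell) - \frac{\lambda}{\nu(\ell, \ell+1)}\sum_{j=1}^{\ell}\pi(j)\phi_\lambda(j), \quad \forall 1 \le \ell \le i. \tag{2.6} \]

Multiplying by $\pi(\ell+1)$ and adding $T_\ell(\lambda)$ yields

\[ T_{\ell+1}(\lambda) = a_\ell(\lambda)T_\ell(\lambda) - \frac{\pi(\ell+1)}{\pi(\ell)}T_{\ell-1}(\lambda), \quad \forall 1 \le \ell \le i. \]

From the above discussion, we conclude that if $T_i(\lambda) < 0$, then

\[ T_{\ell+1}(\lambda) = -\pi(1)\det A_\ell(\lambda), \quad \forall 1 \le \ell \le i. \tag{2.7} \]

When $\ell = i-1$, (2.5) implies $\det A_{i-1}(\lambda) > 0$ for $\lambda < \lambda^{(i-1)}$. By the continuity of $T_i$ and $\det A_{i-1}$, if there is some $\lambda < \lambda^{(i-1)}$ such that $T_i(\lambda) < 0$, then $T_i(\lambda) = -\pi(1)\det A_{i-1}(\lambda)$ for $\lambda < \lambda^{(i-1)}$. As a consequence of (2.7) with $\ell = i$, this will imply $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ for $\lambda < \lambda^{(i-1)}$. Hence, it remains to show that $T_i(\lambda) < 0$ for some $\lambda < \lambda^{(i-1)}$. To see this, according to Corollary 2.4, one can choose a constant $\tilde\lambda < \min\{\lambda^G_{\pi,\nu}, \lambda^{(i-1)}\}$ such that $T_{n-1}(\tilde\lambda) < 0$. Since $\phi_\lambda(i)$ is non-decreasing in $i$, we obtain $T_i(\tilde\lambda) < 0$, as desired. This proves $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ for $\lambda < \lambda^{(i-1)}$. In particular, $T_n(\lambda) = -\pi(1)\det A_{n-1}(\lambda)$ for $\lambda < \lambda^{(n-2)}$. By Corollary 2.4, we have $\lambda^{(n-1)} = \lambda^G_{\pi,\nu}$.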

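Next, observe that, for $\lambda \in (\lambda^{(i)}, \lambda^{(i-1)})$,

\[ \sum_{j=1}^{i+1}\pi(j)\phi_\lambda(j) = T_{i+1}(\lambda) > 0, \qquad \sum_{j=1}^{i}\pi(j)\phi_\lambda(j) = T_i(\lambda) < 0. \]

By (2.6), it is easy to see that $[\phi_\lambda(i+1) - \phi_\lambda(i)]\nu(i, i+1) = -\lambda T_i(\lambda) > 0$ and

\begin{align*}
[\phi_\lambda(i+2) - \phi_\lambda(i+1)]\nu(i+1, i+2) &= \{[\phi_\lambda(i+1) - \phi_\lambda(i)]\nu(i, i+1) - \lambda\pi(i+1)\phi_\lambda(i+1)\}^+ \\
&= \{-\lambda T_{i+1}(\lambda)\}^+ = 0.
\end{align*}

This proves Proposition 2.5 (2). To prove Proposition 2.5 (3), we use (1) to derive

\[ T_{n-1}(\lambda) = -\pi(1)\det A_{n-2}(\lambda) < 0, \quad \forall\lambda \in (0, \lambda^{(n-2)}). \]

Using (2.6), this implies $\phi_\lambda(n-1) < \phi_\lambda(n)$. The last part of Proposition 2.5 follows easily from (2.5) and the fact that

\[ T_i(\lambda) \ge 0 \Rightarrow T_{i+1}(\lambda) > 0 \quad\text{and}\quad T_i(\lambda) \le 0 \Rightarrow T_{i-1}(\lambda) < 0. \]

□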

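Remark 2.2. In Proposition 2.5, if $\lambda > \lambda^{(1)} = \nu(1, 2)[\pi(1)^{-1} + \pi(2)^{-1}]$, then $\phi_\lambda(i) = \phi_\lambda(2)$ for $i = 2, \ldots, n$. Note that, for $\lambda \ge \lambda^{(1)}$, $\phi_\lambda(2) = -1 + \lambda\pi(1)/\nu(1, 2)$ and

\[ \pi(\phi_\lambda) = -1 + \frac{\lambda\pi(1)(1 - \pi(1))}{\nu(1, 2)}, \qquad \mathrm{Var}_\pi(\phi_\lambda) = \frac{\lambda^2\pi(1)^3(1 - \pi(1))}{\nu(1, 2)^2}. \]

By (2.3), this leads to $L(\lambda) = \nu(1, 2)/[\pi(1)(1 - \pi(1))]$ for $\lambda \ge \lambda^{(1)}$. In the case $n = 2$, it is clear that $\nu(1, 2)/[\pi(1)(1 - \pi(1))] = \nu(1, 2)[\pi(1)^{-1} + \pi(2)^{-1}] = \lambda^G_{\pi,\nu}$.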

3. Convergence to other eigenvalues

In this section, we generalize the algorithms (A1) and (A2) so that they can be applied to the computation of any specified eigenvalue.

3.1. Basic setup and fundamental results. Recall that $G$ is a graph with vertex set $V = \{1, 2, \ldots, n\}$ and edge set $E = \{\{i, i+1\} \mid i = 1, 2, \ldots, n-1\}$. Given two positive measures $\pi, \nu$ on $V, E$ with $\pi(V) = 1$, let $M^G_{\pi,\nu}$ be the $n$-by-$n$ matrix defined in the introduction and given by

\[ M^G_{\pi,\nu}(i, j) = \begin{cases} -\nu(i, j)/\pi(i) & \text{if } |i - j| = 1, \\ [\nu(i-1, i) + \nu(i, i+1)]/\pi(i) & \text{if } j = i, \\ 0 & \text{if } |i - j| > 1. \end{cases} \tag{3.1} \]

Since $\nu$ is positive everywhere and $M^G_{\pi,\nu}$ is tridiagonal, all eigenvalues of $M^G_{\pi,\nu}$ have algebraic multiplicity 1. Throughout this section, let $\{\lambda^G_0 < \lambda^G_1 < \cdots < \lambda^G_{n-1}\}$ denote the eigenvalues of $M^G_{\pi,\nu}$ with associated $L^2(\pi)$-normalized eigenvectors $\zeta_0 = \mathbf{1}, \zeta_1, \ldots, \zeta_{n-1}$. Clearly, $\lambda^G_0 = 0$, $\lambda^G_1 = \lambda^G_{\pi,\nu}$ and, for $1 \le k \le n$,

\[ \lambda^G_i\zeta_i(k)\pi(k) = [\zeta_i(k) - \zeta_i(k-1)]\nu(k-1, k) + [\zeta_i(k) - \zeta_i(k+1)]\nu(k, k+1). \tag{3.2} \]

Let $1 \le i \le n-1$. As $\zeta_i$ is non-constant, it is clear that $\zeta_i(1) \ne \zeta_i(2)$ and $\zeta_i(n-1) \ne \zeta_i(n)$. Moreover, if $\zeta_i(k) = \zeta_i(k+1)$ for some $1 < k < n$, then

there are exactly $i$ sign changes for $\zeta_i$ with $1 \le i \le n$. Miclo [14] gives a detailed description of the shape of $\zeta_i$ as follows.

Theorem 3.1. For $1 \le i \le n-1$, let $\zeta_i$ be an eigenvector associated to the $i$th smallest non-zero eigenvalue of the matrix in (3.1) with $\zeta_i(1) < 0$. Then, there are $1 = a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_i < b_i = n$ with $a_{j+1} - b_j \in \{0, 1\}$ such that $\zeta_i$ is strictly increasing on $[a_j, b_j]$ for odd $j$ and is strictly decreasing on $[a_j, b_j]$ for even $j$, and $\zeta_i(a_{j+1}) = \zeta_i(b_j)$ for $1 \le j < i$.

In the following, we make some analysis related to the Euler-Lagrange equations in (3.2).

Definition 3.1. Fix $n \ge 1$ and let $f$ be a function on $\{1, 2, \ldots, n\}$. For $1 \le i \le n-1$, $f$ is called "Type $i$" if there are $1 = a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_i < b_i \le n$ satisfying $a_{j+1} - b_j \in \{0, 1\}$ such that

(1) $f$ is strictly monotonic on $[a_j, b_j]$ for $1 \le j \le i$.

(2) $[f(a_j) - f(a_j + 1)][f(a_{j+1}) - f(a_{j+1} + 1)] < 0$ for $1 \le j < i$.

(3) $f(a_{j+1}) = f(b_j)$, for $1 \le j < i$, and $f(k) = f(b_i)$, for $b_i \le k \le n$.

The points $a_j, b_j$ will be called "peak-valley points" in this paper.

Remark 3.1. Note that the difference between Definition 3.1 and Theorem 3.1 is the requirement $b_i \le n$, instead of $b_i = n$. By Theorem 3.1, any eigenvector associated to the $i$th smallest non-zero eigenvalue of the matrix in (3.1) must be of type $i$ with $b_i = n$.

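Definition 3.2. Let $\pi, \nu$ be positive measures on $V, E$ with $\pi(V) = 1$. For $\lambda \in \mathbb{R}$, let $\xi_\lambda$ be a function on $\{1, 2, \ldots, n\}$ defined by $\xi_\lambda(1) = -1$ and, for $1 \le k < n$,

\[ \xi_\lambda(k+1) = \xi_\lambda(k) + \frac{[\xi_\lambda(k) - \xi_\lambda(k-1)]\nu(k-1, k) - \lambda\pi(k)\xi_\lambda(k)}{\nu(k, k+1)}. \]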
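The recursion of Definition 3.2 is a forward sweep along the path and can be sketched as below. The function name `xi`, the 0-based arrays and the uniform example are assumptions made for illustration; the check at the end relies on the standard closed form $2 - 2\cos(k\pi/n)$ for the eigenvalues in the uniform case.

```python
import numpy as np

def xi(lam, pi, nu):
    """Generate xi_lambda of Definition 3.2 with xi(1) = -1.

    pi[k] is the weight of vertex k+1 and nu[k] the weight of the edge
    {k+1, k+2}; the convention nu(0, 1) = 0 starts the flux at zero.
    """
    n = len(pi)
    out = np.empty(n)
    out[0] = -1.0
    flux = 0.0  # [xi(k) - xi(k-1)] nu(k-1, k)
    for k in range(n - 1):
        flux = flux - lam * pi[k] * out[k]
        out[k + 1] = out[k] + flux / nu[k]
    return out

# Hypothetical example: uniform measures on the 5-path.
pi = np.full(5, 0.2)
nu = np.full(4, 0.2)
for k in range(1, 5):
    lam = 2 - 2 * np.cos(k * np.pi / 5)
    # At an eigenvalue, xi_lambda is an eigenvector and has pi-mean zero.
    print(k, float(np.dot(pi, xi(lam, pi, nu))))
```

Away from the eigenvalues the $\pi$-mean of $\xi_\lambda$ is non-zero, and its sign is what the dichotomy algorithms test.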

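Remark 3.2. Note that $\xi_0 \equiv -1$ and, for $\lambda < 0$, $\xi_\lambda$ is strictly decreasing and of type 1. For $\lambda > 0$, if $\xi_\lambda(k-1) < \xi_\lambda(k) = \xi_\lambda(k+1)$, then $\xi_\lambda(k) > 0$ and this implies $\xi_\lambda(k+2) < \xi_\lambda(k+1)$. Similarly, if $\xi_\lambda(k-1) > \xi_\lambda(k) = \xi_\lambda(k+1)$, then $\xi_\lambda(k) < 0$ and $\xi_\lambda(k+2) > \xi_\lambda(k+1)$. Thus, $\xi_\lambda$ must be of type $i$ for some $1 \le i \le n-1$.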

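Lemma 3.2. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2. Suppose that $\xi_\lambda$ is of type $i$ with $1 \le i \le n-1$.

(1) If $\xi_\lambda(n-1) \ne \xi_\lambda(n)$, then there is $\epsilon > 0$ such that $\xi_{\lambda+\delta}$ is of type $i$ for $-\epsilon < \delta < \epsilon$.

(2) If $\xi_\lambda(n-1) = \xi_\lambda(n)$, then there is $\epsilon > 0$ such that $\xi_{\lambda+\delta}$ is of type $i+1$ and $\xi_{\lambda-\delta}$ is of type $i$ for $0 < \delta < \epsilon$.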

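Proof. Let $a_j, b_j$ be the peak-valley points of $\xi_\lambda$. By the continuity of $\xi_\lambda$ in $\lambda$ and Remark 3.2, one can choose $\epsilon > 0$ such that, for $\delta \in (-\epsilon, \epsilon)$, $\xi_{\lambda+\delta}$ remains strictly monotonic on $[a_j, b_j]$ for $j = 1, \ldots, i$ and

\[ [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(a_{j+1} + 1) - \xi_{\lambda+\delta}(a_{j+1})] > 0, \]

for $1 \le j < i$. In (1), $b_i = n$. Fix $\delta \in (-\epsilon, \epsilon)$ and set $a'_1 = a_1 = 1$, $b'_i = b_i = n$. For $1 < j < i$, set

\[ \begin{cases} b'_j = a'_{j+1} = b_j & \text{if } [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(b_j) - \xi_{\lambda+\delta}(a_{j+1})] < 0, \\ b'_j = a'_{j+1} = a_{j+1} & \text{if } [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(b_j) - \xi_{\lambda+\delta}(a_{j+1})] > 0, \\ b'_j = b_j,\ a'_{j+1} = a_{j+1} & \text{if } [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(b_j) - \xi_{\lambda+\delta}(a_{j+1})] = 0. \end{cases} \]

Clearly, $\xi_{\lambda+\delta}$ is of type $i$ with peak-valley points $a'_j, b'_j$. This proves Lemma 3.2 (1).

For part (2), we consider $i \le n-2$ and $b_i = n-1$. By a similar argument as before, one can choose $\epsilon > 0$ such that the restriction of $\xi_{\lambda+\delta}$ to $\{1, 2, \ldots, n-1\}$ is of type $i$ for $\delta \in (-\epsilon, \epsilon)$. To finish the proof, it remains to compare $\xi_{\lambda+\delta}(n-1)$ and $\xi_{\lambda+\delta}(n)$. Recall that $T_j(\lambda) = \sum_{k=1}^{j}\xi_\lambda(k)\pi(k)$ as in the proof for Proposition 2.5. Using a similar reasoning as for (2.7), one shows that $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ for $1 \le i < n$, where $A_i(\lambda)$ is the matrix in (2.4). This implies that the non-zero eigenvalues of $M^G_{\pi,\nu}$, say $\lambda^G_1, \ldots, \lambda^G_{n-1}$, are the roots of $\det A_{n-1}(\lambda) = 0$. As a consequence of Lemma A.2, $\det A_{n-2}(\lambda) = 0$ has exactly $n-2$ distinct roots, say $\alpha_1 < \alpha_2 < \cdots < \alpha_{n-2}$, and they satisfy the interlacing property $\lambda^G_j < \alpha_j < \lambda^G_{j+1}$ for $1 \le j \le n-2$. Note that $\det A_{n-2}(\lambda)$ and $\det A_{n-1}(\lambda)$ tend to infinity as $-\lambda$ tends to infinity. This leads to the fact that if $\det A_{n-2}(\lambda) = 0$ and $\det A_{n-1}(\lambda) < 0$, then $\det A_{n-2}(\cdot)$ is strictly decreasing in a neighborhood of $\lambda$. If $\det A_{n-2}(\lambda) = 0$ and $\det A_{n-1}(\lambda) > 0$, then $\det A_{n-2}(\cdot)$ is strictly increasing in a neighborhood of $\lambda$.

Back to the proof of (2). Suppose that $\xi_\lambda(n-2) < \xi_\lambda(n-1)$. By Remark 3.2, it is easy to check that $T_{n-1}(\lambda) = 0$ and $T_n(\lambda) > 0$ or, equivalently, $\det A_{n-2}(\lambda) = 0$ and $\det A_{n-1}(\lambda) < 0$. According to the conclusion in the previous paragraph, we can find $\epsilon > 0$ such that $\det A_{n-2}(\cdot)$ is strictly decreasing on $(\lambda - \epsilon, \lambda + \epsilon)$, which yields

\[ \xi_{\lambda+\delta}(n) = \xi_{\lambda+\delta}(n-1) - \frac{(\lambda+\delta)T_{n-1}(\lambda+\delta)}{\nu(n-1, n)} \begin{cases} < \xi_{\lambda+\delta}(n-1) & \text{if } 0 < \delta < \epsilon, \\ > \xi_{\lambda+\delta}(n-1) & \text{if } -\epsilon < \delta < 0. \end{cases} \]

This gives the desired property in Lemma 3.2 (2). The other case, $\xi_\lambda(n-2) > \xi_\lambda(n-1)$, can be proved in the same way and we omit the details. □

The following proposition characterizes the shape of $\xi_\lambda$ for $\lambda > 0$.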

Proposition 3.3. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2. Let $\lambda^G_1 < \cdots < \lambda^G_{n-1}$ be the non-zero eigenvalues of $M^G_{\pi,\nu}$ in (3.1) and $\alpha_1 < \cdots < \alpha_{n-2}$ be the zeros of $\det A_{n-2}(\lambda)$, where $A_{n-2}(\cdot)$ is the matrix in (2.4). Then,

(1) $\lambda^G_j < \alpha_j < \lambda^G_{j+1}$, for $1 \le j \le n-2$.

(2) $\xi_\lambda$ is of type $j$ for $\lambda \in (\alpha_{j-1}, \alpha_j]$ and $1 \le j \le n-1$, where $\alpha_0 := 0$ and $\alpha_{n-1} := \infty$.

Proof. (1) is immediate from Lemma A.2. For (2), note that $\alpha_i$ is an eigenvalue of the submatrix of $M^G_{\pi,\nu}$ obtained by removing the $n$th row and column. This implies $\xi_{\alpha_i}(n-1) = \xi_{\alpha_i}(n)$ for $i = 1, \ldots, n-2$ and $\xi_\lambda(n-1) \ne \xi_\lambda(n)$ for $\lambda > 0$ and $\lambda \notin \{\alpha_1, \ldots, \alpha_{n-2}\}$. By Lemma 3.2, $\xi_\lambda$ is of type $i$ for $\alpha_{i-1} < \lambda \le \alpha_i$. □

Given $\lambda > 0$, the above proposition provides a simple criterion to determine which of the intervals $(\alpha_j, \alpha_{j+1}]$ contains $\lambda$, namely, the type of $\xi_\lambda$. However, knowing the type of $\xi_\lambda$ is not sufficient to determine whether $\lambda$ is bigger or smaller than $\lambda^G_i$. We need the following remark.

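Remark 3.3. Using the same argument as in the proof of Proposition 2.5, one can show that $\pi(\xi_\lambda) = -\pi(1)\det A_{n-1}(\lambda)$, where $A_{n-1}(\lambda)$ is the matrix in (2.4). Clearly, $\pi(\xi_\lambda)$ has zeros $\lambda^G_1, \ldots, \lambda^G_{n-1}$ and tends to minus infinity as $\lambda$ tends to minus infinity. This implies that $\pi(\xi_\lambda) < 0$, for $\lambda < \lambda^G_1$, and

\[ \pi(\xi_\lambda) > 0 \quad \forall\lambda \in (\lambda^G_{2i-1}, \lambda^G_{2i}), \qquad \pi(\xi_\lambda) < 0 \quad \forall\lambda \in (\lambda^G_{2i}, \lambda^G_{2i+1}), \]

for $i \ge 1$, where $\lambda^G_n := \infty$.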

As a consequence of Proposition 3.3 and Remark 3.3, we obtain the following dichotomy algorithm, which is a generalization of (A2). Let 1≤ i ≤ n − 1.

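(Di)

Choose positive reals $L_0 < \lambda^G_i < U_0$ and set, for $\ell = 0, 1, \ldots$,

1. let $\xi_{\lambda_\ell}$ be the function generated by $\lambda_\ell = (L_\ell + U_\ell)/2$ in Definition 3.2,

2. according to Definition 3.1, set

\[ \begin{cases} L_{\ell+1} = L_\ell,\ U_{\ell+1} = \lambda_\ell & \text{if } \xi_{\lambda_\ell} \text{ is of type } j \text{ with } j > i, \text{ or if } \xi_{\lambda_\ell} \text{ is of type } i \text{ and } (-1)^{i-1}\pi(\xi_{\lambda_\ell}) > 0, \\ U_{\ell+1} = U_\ell,\ L_{\ell+1} = \lambda_\ell & \text{if } \xi_{\lambda_\ell} \text{ is of type } j \text{ with } j < i, \text{ or if } \xi_{\lambda_\ell} \text{ is of type } i \text{ and } (-1)^{i-1}\pi(\xi_{\lambda_\ell}) < 0, \\ L_{\ell+1} = U_{\ell+1} = \lambda_\ell & \text{if } \xi_{\lambda_\ell} \text{ is of type } i \text{ and } \pi(\xi_{\lambda_\ell}) = 0. \end{cases} \]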

Theorem 3.4. Referring to (Di),

\[ 0 \le \max\{U_\ell - \lambda^G_i,\ \lambda^G_i - L_\ell\} \le (U_0 - L_0)2^{-\ell}, \quad \forall\ell \ge 0. \]

Proof. Immediate from Proposition 3.3 and Remark 3.3. □
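The dichotomy (Di) can be sketched as a bisection that first looks at the type of $\xi_{\lambda_\ell}$ and then, if the type equals $i$, at the sign of $(-1)^{i-1}\pi(\xi_{\lambda_\ell})$. The code below is an illustration under assumptions, not the authors' implementation: the names `xi`, `monotone_type` and `algorithm_D` are hypothetical, and the type is computed as the number of maximal strictly monotone runs (zero increments ignored), which matches Definition 3.1 for the functions $\xi_\lambda$ by Remark 3.2.

```python
import numpy as np

def xi(lam, pi, nu):
    """xi_lambda of Definition 3.2 (0-based arrays, xi(1) = -1)."""
    n = len(pi)
    out = np.empty(n)
    out[0] = -1.0
    flux = 0.0  # [xi(k) - xi(k-1)] nu(k-1, k)
    for k in range(n - 1):
        flux = flux - lam * pi[k] * out[k]
        out[k + 1] = out[k] + flux / nu[k]
    return out

def monotone_type(f):
    """Number of strictly monotone runs of f, ignoring zero increments."""
    diffs = np.diff(f)
    signs = np.sign(diffs[diffs != 0])
    if signs.size == 0:
        return 0
    return 1 + int(np.count_nonzero(signs[1:] != signs[:-1]))

def algorithm_D(i, pi, nu, L0, U0, iters=60):
    """Bisection (Di) for the i-th smallest non-zero eigenvalue."""
    L, U = L0, U0
    sign = (-1.0) ** (i - 1)
    for _ in range(iters):
        lam = 0.5 * (L + U)
        f = xi(lam, pi, nu)
        j = monotone_type(f)
        mean = float(np.dot(pi, f))
        if j > i or (j == i and sign * mean > 0):
            U = lam                 # lam is above lambda_i
        elif j < i or (j == i and sign * mean < 0):
            L = lam                 # lam is below lambda_i
        else:
            L = U = lam             # type i and pi(xi) = 0
    return L, U

# Hypothetical example: all non-zero eigenvalues of the uniform 6-path.
pi = np.full(6, 1 / 6)
nu = np.full(5, 1 / 6)
for i in range(1, 6):
    L, U = algorithm_D(i, pi, nu, L0=1e-3, U0=5.0)
    print(i, 0.5 * (L + U), 2 - 2 * np.cos(i * np.pi / 6))
```

Running the loop over $i = 1, \ldots, n-1$ recovers the whole spectrum, with each eigenvalue costing a fixed number of sweeps of length $n$.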

Proposition 3.3 (2) bounds the eigenvalues using the shape of $\xi_\lambda$ generated from one end point. We now introduce some other criteria to bound eigenvalues using the shape of $\xi_\lambda$ from either boundary point. Those results will be used to prove Theorem 6.1.

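Proposition 3.5. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2 and $\tilde\xi_\lambda$ be a function given by

\[ \tilde\xi_\lambda(k-1) = \tilde\xi_\lambda(k) + \frac{[\tilde\xi_\lambda(k) - \tilde\xi_\lambda(k+1)]\nu(k, k+1) - \lambda\pi(k)\tilde\xi_\lambda(k)}{\nu(k-1, k)}, \]

for $k = n, n-1, \ldots, 2$ with $\tilde\xi_\lambda(n) = -1$. Let $\lambda^G_0 < \cdots < \lambda^G_{n-1}$ be the eigenvalues of $M^G_{\pi,\nu}$ in (3.1) and let $f|_B$ be the restriction of $f$ to a subset $B$ of $V$. Suppose $1 \le k_0 \le n$.

(1) If $\xi_\lambda|_{\{1,\ldots,k_0\}}$ is of type $i$ with $(-1)^i\xi_\lambda(k_0) > 0$ and $\tilde\xi_\lambda|_{\{k_0,\ldots,n\}}$ is of type $j$ with $(-1)^j\tilde\xi_\lambda(k_0) > 0$, then $\lambda^G_{i+j-2} < \lambda < \lambda^G_{i+j-1}$.

(2) If $\xi_\lambda|_{\{1,\ldots,k_0\}}$ is of type $i$ with $(-1)^i\xi_\lambda(k_0) < 0$ and $\tilde\xi_\lambda|_{\{k_0,\ldots,n\}}$ is of type $j$ with $(-1)^j\tilde\xi_\lambda(k_0) < 0$, then $\lambda^G_{i+j-1} < \lambda < \lambda^G_{i+j+1}$.

(3) If $\xi_\lambda|_{\{1,\ldots,k_0\}}$ is of type $i$ with $(-1)^i\xi_\lambda(k_0) > 0$ and $\tilde\xi_\lambda|_{\{k_0,\ldots,n\}}$ is of type $j$ with $(-1)^j\tilde\xi_\lambda(k_0) < 0$, then $\lambda^G_{i+j-2} < \lambda < \lambda^G_{i+j}$.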

Proof. By Proposition 3.3, $\xi_\lambda(n)$ is a polynomial of degree $n-1$ in $\lambda$ satisfying

\[ (-1)^{i+1}\xi_{\lambda^G_i}(n) > 0, \quad \forall 0 \le i < n, \qquad (-1)^{i+1}\xi_{\beta_i}(n) > 0, \quad \forall 1 \le i < n-1. \]

This implies that there are $w_i \in (\beta_i, \lambda^G_{i+1})$, $0 \le i \le n-2$, such that $(-1)^{i+1}\xi_\lambda(n) > 0$ for $\lambda \in (w_{i-1}, w_i)$ and $0 \le i \le n-1$ with $w_{-1} = -\infty$ and $w_{n-1} = \infty$.

The proofs for (1)-(3) in Proposition 3.5 are similar and we deal with (1) only. By the Euler-Lagrange equations in (3.2), it is easy to see that, for $1 \le l < n$, $\xi_{\lambda^G_l}$ and $\tilde\xi_{\lambda^G_l}$ are eigenvectors of $M^G_{\pi,\nu}$ in (3.1) associated with $\lambda^G_l$, which implies $\xi_{\lambda^G_l} = -\xi_{\lambda^G_l}(n)\tilde\xi_{\lambda^G_l}$. First, assume that $\lambda \le \lambda^G_{i+j-2}$. By Proposition 3.3, $\xi_{\lambda^G_{i+j-2}}|_{\{1,\ldots,k_0\}}$ is of type at least $i$ and $\tilde\xi_{\lambda^G_{i+j-2}}|_{\{k_0,\ldots,n\}}$ is of type at least $j$. This implies that the patching of $\xi_{\lambda^G_{i+j-2}}|_{\{1,\ldots,k_0\}}$ and $-\xi_{\lambda^G_{i+j-2}}(n)\tilde\xi_{\lambda^G_{i+j-2}}|_{\{k_0,\ldots,n\}}$, which equals $\xi_{\lambda^G_{i+j-2}}$, is of type at least $i+j-1$. This is a contradiction.

Next, assume that $\lambda \ge \lambda^G_{i+j-1}$. By Proposition 3.3, we may choose $a_1 < \lambda$ (resp. $a_2 < \lambda$) such that $\xi_\lambda|_{\{1,\ldots,k_0\}}$ (resp. $\tilde\xi_\lambda|_{\{k_0,\ldots,n\}}$) changes the type at $a_1$ (resp. $a_2$). If $\lambda^G_{i+j-1} \le \min\{a_1, a_2\}$, then a similar reasoning as before implies that $\xi_{\lambda^G_{i+j-1}}$ is of type at most $i+j-2$, a contradiction. If $\min\{a_1, a_2\} < \lambda^G_{i+j-1} < \max\{a_1, a_2\}$, then exactly one of $\xi_{\lambda^G_{i+j-1}}|_{\{1,\ldots,k_0\}}$ and $\tilde\xi_{\lambda^G_{i+j-1}}|_{\{k_0,\ldots,n\}}$ does not change its type. This implies that the gluing point $k_0$ cannot be a local extremum and, thus, the patching function is of type at most $i+j-2$, another contradiction! According to the discussion in the first paragraph of this proof, if $\lambda^G_{i+j-1} \ge \max\{a_1, a_2\}$, then none of $\xi_{\lambda^G_{i+j-1}}|_{\{1,\ldots,k_0\}}$ and $\tilde\xi_{\lambda^G_{i+j-1}}|_{\{k_0,\ldots,n\}}$ changes type nor, of course, the sign at $k_0$. Consequently, we obtain $(-1)^{i+j}\xi_{\lambda^G_{i+j-1}}(k_0)\tilde\xi_{\lambda^G_{i+j-1}}(k_0) > 0$, which contradicts the fact $\xi_{\lambda^G_{i+j-1}} = -\xi_{\lambda^G_{i+j-1}}(n)\tilde\xi_{\lambda^G_{i+j-1}}$. □

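Proposition 3.6. For $\lambda > 0$ and $1 \le k \le n-1$, let $s_k(\lambda)$ be the $k$th sign change of $\xi_\lambda$ defined by $s_0 := 0$ and $s_{k+1}(\lambda) := \inf\{l > s_k(\lambda) \mid \xi_\lambda(l)\xi_\lambda(l-1) < 0 \text{ or } \xi_\lambda(l) = 0\}$, where $\inf\emptyset := n+1$. Then, for $0 < \lambda_1 < \lambda_2$, $s_k(\lambda_1) \ge s_k(\lambda_2)$ for all $1 \le k \le n-1$.

Proof. Let $1 \le k \le n-1$. If $s_k(\lambda_1) = n+1$, then it is clear that $s_k(\lambda_1) \ge s_k(\lambda_2)$. Suppose that $s_k(\lambda_1) = \ell \le n$. Obviously, $\xi_{\lambda_1}|_{\{1,\ldots,\ell\}}$ is of type $k$. Referring to (2.4), let $\lambda^\ell_1, \ldots, \lambda^\ell_{\ell-1}$ be the roots of $\det A_{\ell-1}(\lambda) = 0$ and $\alpha^\ell_1, \ldots, \alpha^\ell_{\ell-2}$ be the roots of $\det A_{\ell-2}(\lambda) = 0$. According to the first paragraph of the proof for Proposition 3.5, there are $w^\ell_i \in (\alpha^\ell_{i-1}, \lambda^\ell_i)$ with $1 \le i \le \ell-1$ such that $(-1)^{i+1}\xi_\lambda(\ell) > 0$ for $\lambda \in (w^\ell_i, w^\ell_{i+1})$ and $1 \le i \le \ell-1$, where $\alpha^\ell_0 := 0$. Since $\xi_{\lambda_1}(\ell)\xi_{\lambda^\ell_k}(\ell) \ge 0$, one has $w^\ell_k \le \lambda_1 < \alpha^\ell_k$. As it is assumed that $\lambda_2 > \lambda_1$, if $\lambda_2 > \alpha^\ell_k$, then $\xi_{\lambda_2}|_{\{1,\ldots,\ell\}}$ is of type at least $k+1$ and, consequently, $s_k(\lambda_2) < \ell = s_k(\lambda_1)$. If $\lambda_1 < \alpha^\ell_k$, then $\xi_{\lambda_2}|_{\{1,\ldots,\ell\}}$ is of type $k$ and $\xi_{\lambda_2}(\ell) < 0$. This implies $s_k(\lambda_2) \le \ell = s_k(\lambda_1)$, as desired. □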

3.2. Bounding eigenvalues from below. Motivated by Theorem 3.1, we introduce another scheme generalizing (2.1) to bound the other eigenvalues of $M^G_{\pi,\nu}$ from below.

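Definition 3.3. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2. If $\xi_\lambda$ is of type $i$, $1 \le i \le n-1$, with peak-valley points $1 = a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_i < b_i \le n$, then define

\[ \xi^{(j)}_\lambda(k) = \begin{cases} \xi_\lambda(k) & \text{for } k \le b_j, \\ \xi_\lambda(b_j) & \text{for } k > b_j, \end{cases} \quad \forall 1 \le j < i \]

Remark 3.4. For $\lambda > 0$, if $\xi_\lambda$ is of type $i$, then $\xi^{(j)}_\lambda$ is of type $j$ for $j < i$. Moreover, for $k < b_j$,

\begin{align*}
\xi^{(j)}_\lambda(k+1) &= \xi^{(j)}_\lambda(k) + \frac{[\xi^{(j)}_\lambda(k) - \xi^{(j)}_\lambda(k-1)]\nu(k-1, k) - \lambda\pi(k)\xi^{(j)}_\lambda(k)}{\nu(k, k+1)} \\
&= \xi^{(j)}_\lambda(k) - \frac{\lambda[\pi(1)\xi^{(j)}_\lambda(1) + \cdots + \pi(k)\xi^{(j)}_\lambda(k)]}{\nu(k, k+1)},
\end{align*}

and, for $b_j \le k < n$,

\[ \xi^{(j)}_\lambda(k+1) = \xi^{(j)}_\lambda(k) + \frac{F_j\bigl([\xi^{(j)}_\lambda(k) - \xi^{(j)}_\lambda(k-1)]\nu(k-1, k) - \lambda\pi(k)\xi^{(j)}_\lambda(k)\bigr)}{\nu(k, k+1)}, \]

where $F_j(t) = \max\{t, 0\}$ if $j$ is odd, and $F_j(t) = \min\{t, 0\}$ if $j$ is even. Note that $\xi^{(1)}_\lambda$ is exactly $\phi_\lambda$ in Proposition 2.5.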

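Thereafter, let $\mathcal{L}$ and $\mathcal{L}^{(i)}$ be functions on $(0, \infty)$ defined by

\[ \mathcal{L}(\lambda) = \frac{\mathcal{E}_\nu(\xi_\lambda, \xi_\lambda)}{\mathrm{Var}_\pi(\xi_\lambda)}, \qquad \mathcal{L}^{(i)}(\lambda) = \frac{\mathcal{E}_\nu(\xi^{(i)}_\lambda, \xi^{(i)}_\lambda)}{\mathrm{Var}_\pi(\xi^{(i)}_\lambda)}, \quad \forall 1 \le i \le n-1, \tag{3.3} \]

where $\xi_\lambda$ and $\xi^{(i)}_\lambda$ are the functions in Definitions 3.2-3.3.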

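Remark 3.5. Note that $\mathcal{L} = \mathcal{L}^{(n-1)}$. By a similar reasoning as in the proof for (2.2), one can show that, for $\lambda > 0$,

\[ \mathcal{L}(\lambda) = \lambda + \frac{\lambda\pi(\xi_\lambda)[\pi(\xi_\lambda) - \xi_\lambda(n)]}{\mathrm{Var}_\pi(\xi_\lambda)}, \qquad \mathcal{L}^{(i)}(\lambda) = \lambda + \frac{\lambda\pi(\xi^{(i)}_\lambda)[\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)]}{\mathrm{Var}_\pi(\xi^{(i)}_\lambda)}. \]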
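The first identity of Remark 3.5 is easy to check numerically. The sketch below compares the Rayleigh quotient of $\xi_\lambda$ with the right-hand side for a few values of $\lambda$; the function names and the non-uniform example measures are hypothetical, and NumPy is assumed.

```python
import numpy as np

def xi(lam, pi, nu):
    """xi_lambda of Definition 3.2 (0-based arrays, xi(1) = -1)."""
    n = len(pi)
    out = np.empty(n)
    out[0] = -1.0
    flux = 0.0  # [xi(k) - xi(k-1)] nu(k-1, k)
    for k in range(n - 1):
        flux = flux - lam * pi[k] * out[k]
        out[k + 1] = out[k] + flux / nu[k]
    return out

def moments(f, pi):
    """Return (pi-mean, pi-variance) of f."""
    mean = float(np.dot(pi, f))
    return mean, float(np.dot(pi, f * f)) - mean * mean

def script_L(lam, pi, nu):
    """Left-hand side: Dirichlet form of xi_lambda over its variance."""
    f = xi(lam, pi, nu)
    diffs = np.diff(f)
    return float(np.sum(diffs * diffs * nu)) / moments(f, pi)[1]

def remark_3_5_rhs(lam, pi, nu):
    """Right-hand side: lam + lam * pi(xi) [pi(xi) - xi(n)] / Var(xi)."""
    f = xi(lam, pi, nu)
    mean, var = moments(f, pi)
    return lam + lam * mean * (mean - f[-1]) / var

# Hypothetical non-uniform measures on the 4-path.
pi = np.array([0.1, 0.2, 0.3, 0.4])
nu = np.array([0.05, 0.1, 0.2])
for lam in (0.3, 1.0, 2.5):
    print(lam, script_L(lam, pi, nu), remark_3_5_rhs(lam, pi, nu))
```

The two columns agree up to rounding, and they equal $\lambda$ exactly when $\pi(\xi_\lambda) = 0$ or $\pi(\xi_\lambda) = \xi_\lambda(n)$.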

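From Proposition 3.3, it follows immediately that $\mathcal{L}(\lambda) = \mathcal{L}^{(i)}(\lambda)$ for $\lambda \in (0, \alpha_i]$. To explore $\mathcal{L}$ and $\mathcal{L}^{(i)}$ further, we need more information on $\pi(\xi_\lambda)$, $\pi(\xi^{(i)}_\lambda)$, $\pi(\xi_\lambda) - \xi_\lambda(n)$ and $\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)$.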

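Lemma 3.7. Let $\xi_\lambda$ be the function in Definition 3.2 and $\lambda^G_i, \alpha_i$ be the constants in Proposition 3.3. Then, $\pi(\xi_\lambda) - \xi_\lambda(n) = 0$ has $n-1$ distinct roots, say $\beta_0 < \beta_1 < \cdots < \beta_{n-2}$, which satisfy $\beta_0 = 0$ and $\alpha_i < \beta_i < \lambda^G_{i+1}$ for $1 \le i \le n-2$. Furthermore, $\pi(\xi_\lambda) - \xi_\lambda(n) > 0$ for $\lambda \in (\beta_{2i-1}, \beta_{2i})$ and $\pi(\xi_\lambda) - \xi_\lambda(n) < 0$ for $\lambda \in (\beta_{2i}, \beta_{2i+1})$, with $\beta_{-1} = -\infty$ and $\beta_{n-1} = \infty$.

Proof. Set $u(\lambda) := \pi(\xi_\lambda) - \xi_\lambda(n)$. According to Definition 3.2, $u(\lambda)$ is a polynomial of degree $n-1$ and satisfies $u(0) = 0$. Note that $\pi(\xi_\lambda) = 0$ for $\lambda \in \{\lambda^G_1, \ldots, \lambda^G_{n-1}\}$. If $i$ is odd, then $\xi_{\lambda^G_i}(n-1) < \xi_{\lambda^G_i}(n)$. This implies $\xi_{\lambda^G_i}(n) > 0$ and, hence, $u(\lambda^G_i) < 0$. Similarly, if $i$ is even, then $u(\lambda^G_i) > 0$.

By Lemma 3.2 and Proposition 3.3, if $\lambda = \alpha_i$ with odd $i$, then $\xi_{\alpha_i}$ is of type $i$ with $\xi_{\alpha_i}(n-1) = \xi_{\alpha_i}(n)$. This implies $\xi_{\alpha_i}(n) > 0$ and $\pi(\xi_{\alpha_i}) = \pi(n)\xi_{\alpha_i}(n)$, which yields $u(\alpha_i) < 0$. Similarly, one can show that $u(\alpha_i) > 0$ if $i$ is even. □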

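Remark 3.6. We consider the sign of $\pi(\xi^{(i)}_\lambda)$ and $\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)$ in this remark. By Proposition 3.3, $\xi^{(i)}_\lambda = \xi_\lambda$ for $\lambda \le \alpha_i$. If $\lambda > \alpha_i$ with $1 \le i \le n-2$, then $\xi_\lambda$ is of type $j$ with $j > i$. Fix $1 \le i \le n-2$ and set $k_0 = k_0(\lambda) = \min\{k \mid \xi^{(i)}_\lambda(j) = \xi^{(i)}_\lambda(n), \forall k \le j \le n\}$. Clearly, $k_0(\lambda) \le n-1$ for $\lambda > \alpha_i$. Observe that, for $\lambda > \alpha_i$ with odd $i$, $\xi_\lambda(k_0 - 1) < \xi_\lambda(k_0) \ge \xi_\lambda(k_0 + 1)$, which implies $\sum_{k=1}^{k_0-1}\pi(k)\xi_\lambda(k) < 0$ and $\sum_{k=1}^{k_0}\pi(k)\xi_\lambda(k) \ge 0$. A similar reasoning for the case of even $i$ gives $\sum_{k=1}^{k_0-1}\pi(k)\xi_\lambda(k) > 0$ and $\sum_{k=1}^{k_0}\pi(k)\xi_\lambda(k) \le 0$. Consequently, we obtain

\[ (-1)^{i-1}\pi(\xi^{(i)}_\lambda) > 0, \qquad (-1)^i[\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)] > 0, \tag{3.4} \]

for $\lambda > \alpha_i$ and $1 \le i \le n-2$. Note that, by Proposition 3.3, $\xi^{(i)}_\lambda = \xi_\lambda$ for $\lambda \le \alpha_i$. Together with Remark 3.3, Lemma 3.7 and the continuity of $\xi^{(i)}_\lambda$, the first inequality of (3.4) holds for $\lambda > \lambda^G_i$ and the second inequality of (3.4) holds for $\lambda > \beta_{i-1}$.

According to Lemma 3.7 and Remark 3.6, we derive a generalized version of Proposition 2.3 in the following.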

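Proposition 3.8. Let $n \ge 3$ and $1 \le i \le n-1$. For $\lambda > 0$, let $\xi_\lambda$, $\xi_\lambda^{(i)}$ be the functions in Definition 3.2 and $\beta_i$ be the constants in Lemma 3.7.

(1) For $\lambda > \beta_{i-1}$, the following are equivalent.

(1-1) $\mathcal{E}_\nu(\xi_\lambda^{(i)}, \xi_\lambda^{(i)}) = \lambda \mathrm{Var}_\pi(\xi_\lambda^{(i)})$.

(1-2) $\pi(\xi_\lambda^{(i)}) = 0$.

(1-3) $\lambda = \lambda_i^G$.

(2) For $\beta_{i-1} < \lambda < \beta_i$, the following are equivalent.

(2-1) $\mathcal{E}_\nu(\xi_\lambda, \xi_\lambda) = \lambda \mathrm{Var}_\pi(\xi_\lambda)$.

(2-2) $\pi(\xi_\lambda) = 0$.

(2-3) $\lambda = \lambda_i^G$.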

Proof. The proof for Proposition 3.8 (2) is similar to the proof for Proposition 3.8 (1) and we deal only with the latter. By Lemma 3.7 and Remark 3.6, one has

\[
\pi(\xi_\lambda^{(i)})\,[\pi(\xi_\lambda^{(i)}) - \xi_\lambda^{(i)}(n)]
\begin{cases}
< 0 & \text{for } \lambda > \lambda_i^G, \\
> 0 & \text{for } \beta_{i-1} < \lambda < \lambda_i^G.
\end{cases}
\]

This proves the equivalence of (1-1) and (1-2). Under the assumption of (1-2) and using Remark 3.3, one has $\lambda \le \alpha_i$. This implies $\xi_\lambda^{(i)} = \xi_\lambda$ is an eigenvector for $M^G_{\pi,\nu}$ with associated eigenvalue $\lambda$. As $\lambda \in (\beta_{i-1}, \alpha_i]$, it must be the case $\lambda = \lambda_i^G$. This gives (1-3), while (1-3)⇒(1-2) is obvious and omitted. □

Remark 3.7. It is worthwhile to note that if (1-1) and (2-1) of Proposition 3.8 are removed, then the equivalence in (1) holds for $\lambda > \lambda_{i-1}^G$ and the equivalence in (2) holds for $\lambda \in (\lambda_{i-1}^G, \lambda_{i+1}^G)$. Once $\lambda_{i-1}^G$ is known, we can determine $\lambda_i^G$ using the sign of $\pi(\xi_\lambda^{(i)})$. See Theorem 3.9 for details.

Remark 3.8. Note that condition (4) of Proposition 2.3 is not included in Proposition 3.8. In fact, the equivalence may fail, that is, there may exist some $\lambda \in (\beta_{i-1}, \beta_i) \setminus \{\lambda_i^G\}$ such that $\mathcal{E}_\nu(\xi_\lambda, \xi_\lambda)/\mathrm{Var}_\pi(\xi_\lambda) = \lambda_i^G$. See Example 3.2 for a counterexample.

While Proposition 3.8 focuses on the characterization of zeros of $\mathcal{L}(\lambda) - \lambda$, the following theorem concerns the sign of $\mathcal{L}(\lambda) - \lambda$.

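Theorem 3.9. Let $\lambda_i^G$, $\alpha_i$, $\beta_i$ be the constants in Proposition 3.3 and Lemma 3.7, and $\mathcal{L}$ be the function in (3.3). Then, $\lambda_1^G, \dots, \lambda_{n-1}^G, \beta_1, \dots, \beta_{n-2}$ are fixed points of $\mathcal{L}$ and, for $1 \le i \le n-2$,

(1) $\mathcal{L}(\lambda) < \lambda$ for $\lambda \in (\lambda_i^G, \beta_i)$.

(2) $\mathcal{L}(\lambda) > \lambda$ for $\lambda \in (\beta_i, \lambda_{i+1}^G)$.

(3) $\mathcal{L}^{(i)}(\lambda) < \lambda$ for $\lambda \in (\lambda_i^G, \infty)$.

Proof. Immediate from Lemma 3.7 and Remarks 3.5-3.6. □

By Theorem 3.9, we obtain a lower bound on any specified eigenvalue of $M^G_{\pi,\nu}$.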

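Corollary 3.10. Let $1 \le i \le n-1$ and $\lambda_0 > \lambda_i^G$. Consider the sequence $\lambda_{\ell+1} = \mathcal{L}^{(i)}(\lambda_\ell)$ with $\ell \ge 0$ and set

\[
\lambda^* =
\begin{cases}
\lim_{\ell\to\infty} \lambda_\ell & \text{if } \lambda_\ell \text{ converges}, \\
\sup_{\ell \in I} \lambda_\ell & \text{if } \lambda_\ell \text{ diverges},
\end{cases}
\]

where $I = \{\ell \mid \lambda_{\ell-1} > \lambda_\ell < \lambda_{\ell+1}\}$. Then, $\lambda^* \le \lambda_i^G$.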

It is not clear yet whether the sequence $\lambda_\ell$ in Corollary 3.10 is convergent, even locally. This subject will be discussed in the next subsection. Now, we establish some relations between the roots of $\det A_i(\lambda) = 0$ and the shape of $\xi_\lambda^{(i)}$. This is a generalization of Proposition 2.5.

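Proposition 3.11. For $1 \le i \le n-1$, let $A_i(\lambda)$ be the matrix in (2.4), $\theta_1^{(i)} < \cdots < \theta_i^{(i)}$ be zeros of $\det A_i(\lambda) = 0$ and set $\theta_i^{(i-1)} := \infty$. Referring to the notation in Proposition 3.3, it holds true that, for $1 \le i \le n-1$,

(1) $\lambda_i^G = \theta_i^{(n-1)} < \alpha_i = \theta_i^{(n-2)} < \cdots < \theta_i^{(i)}$.

(2) $\xi_\lambda^{(i)}(j) \ne \xi_\lambda^{(i)}(j+1) = \cdots = \xi_\lambda^{(i)}(n)$ for $\lambda \in [\theta_i^{(j)}, \theta_i^{(j-1)})$ and $i \le j \le n-2$.

(3) $\xi_\lambda^{(i)}(n-1) \ne \xi_\lambda^{(i)}(n)$ for $\lambda \in (\theta_{i-1}^{(n-2)}, \theta_i^{(n-2)})$ and $i \le n-1$.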

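Proof. The order in (1) is a simple application of Lemma A.3. For (2), fix $1 \le i \le n-1$ and set $\gamma(\lambda) = \min\{j \mid \xi_\lambda^{(i)}(k) = \xi_\lambda^{(i)}(n),\ \forall j \le k \le n\}$ and $B(\lambda) = \{1, 2, \dots, \gamma(\lambda)\}$, $B^+(\lambda) = B(\lambda) \cup \{\gamma(\lambda)+1\}$. Clearly, $i+1 \le \gamma(\lambda) \le n$. We use the notation $\xi_\lambda|_C$ to denote the restriction of $\xi_\lambda$ to a set $C$. Suppose that $i$ is odd. By Remark 3.4, $\xi_\lambda^{(i)} = \xi_\lambda$ on $B(\lambda)$ and $\xi_\lambda|_{B(\lambda)}$ is of type $i$ with

\[
\xi_\lambda(\gamma(\lambda)-1) < \xi_\lambda(\gamma(\lambda)) \ge \xi_\lambda(\gamma(\lambda)+1).
\]

By Lemma 3.2(1), if $\xi_\lambda(\gamma(\lambda)+1) < \xi_\lambda(\gamma(\lambda))$, then there is $\epsilon > 0$ such that, for $|\delta| < \epsilon$, $\xi_{\lambda+\delta}|_{B(\lambda)}$ is of type $i$ and

\[
\xi_{\lambda+\delta}(\gamma(\lambda)-1) < \xi_{\lambda+\delta}(\gamma(\lambda)) > \xi_{\lambda+\delta}(\gamma(\lambda)+1).
\]

This implies $\gamma(\lambda+\delta) = \gamma(\lambda)$ for $\delta \in (-\epsilon, \epsilon)$. By Lemma 3.2(2), if $\xi_\lambda(\gamma(\lambda)+1) = \xi_\lambda(\gamma(\lambda))$, then there is $\epsilon > 0$ such that, for $\delta \in (-\epsilon, 0)$, $\xi_{\lambda+\delta}|_{B^+(\lambda)}$ is of type $i$ with

\[
\xi_{\lambda+\delta}(\gamma(\lambda)-1) < \xi_{\lambda+\delta}(\gamma(\lambda)) < \xi_{\lambda+\delta}(\gamma(\lambda)+1),
\]

and, for $\delta \in (0, \epsilon)$, $\xi_{\lambda+\delta}|_{B^+(\lambda)}$ is of type $i+1$ with

\[
\xi_{\lambda+\delta}(\gamma(\lambda)-1) < \xi_{\lambda+\delta}(\gamma(\lambda)) > \xi_{\lambda+\delta}(\gamma(\lambda)+1).
\]

This yields $\gamma(\lambda+\delta) = \gamma(\lambda)$ for $\delta \in (0, \epsilon)$ and $\gamma(\lambda+\delta) = \gamma(\lambda)+1$ for $\delta \in (-\epsilon, 0)$. The proof for the case of even $i$ is similar and we conclude from the above that $\gamma(\lambda)$ is a non-increasing and right-continuous function taking values in $\{i+1, \dots, n\}$.

Let $c_{i+1} > \cdots > c_{n-1}$ be the points of discontinuity of $\gamma(\lambda)$ such that $\gamma(c_j) = j$ for $i+1 \le j \le n-1$. As a consequence of the above discussion, $\xi_{c_j}|_{\{1,\dots,j\}}$ is of type $i$ with $\xi_{c_j}(j) = \xi_{c_j}(j+1)$ and this implies $\sum_{k=1}^{j} \pi(k)\xi_{c_j}(k) = 0$. That means $c_j$ [...] second equality in (1), $\gamma(\lambda) = n$ for $\theta_{i-1}^{(n-2)} < \lambda < \theta_i^{(n-2)}$ and, thus, $c_j \ge \theta_i^{(n-2)}$ for $j \ge i+1$. As a consequence of the interlacing relationship $\theta_i^{(\ell)} < \theta_i^{(\ell-1)} < \theta_{i+1}^{(\ell)}$, it must be $c_j = \theta_i^{(j-1)}$ for $i+1 \le j \le n-1$. This finishes the proof. □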

Remark 3.9. For $1 \le i \le n-1$, $\theta_1^{(i)}, \dots, \theta_i^{(i)}$ are also non-zero eigenvalues of the $(i+1) \times (i+1)$ principal submatrix of (3.1) indexed by $1, \dots, i+1$.

Remark 3.10. In fact, by Proposition 2.5, $\xi_\lambda^{(1)}(n-1) \ne \xi_\lambda^{(1)}(n)$ for $\lambda \in (0, \theta_1^{(n-2)})$, which is better than Proposition 3.11(3).

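3.3. Local convergence of $\mathcal{L}$. This subsection is dedicated to the local convergence of $\mathcal{L}$ in (3.3). Let $\alpha_i$, $\beta_i$, $\lambda_i^G$ be the constants in Proposition 3.3 and Lemma 3.7. As before, let $\zeta_0 = 1, \dots, \zeta_{n-1}$ denote the $L^2(\pi)$-normalized eigenvectors of $M^G_{\pi,\nu}$ associated with $\lambda_0^G, \dots, \lambda_{n-1}^G$. Clearly, $\xi_{\lambda_i^G} = -\zeta_i/\zeta_i(1)$ and $\xi_\lambda = \sum_{i=0}^{n-1} \rho_i(\lambda)\zeta_i$, where $\rho_i(\lambda) = \pi(\xi_\lambda \zeta_i)$ for $0 \le i \le n-1$. Note that $\rho_i(\lambda)$ is a polynomial of degree $n-1$ and satisfies $\rho_i(\lambda_j^G) = -\delta_i(j)/\zeta_i(1)$ for $i, j \in \{0, 1, \dots, n-1\}$. This implies

\[
\rho_0(\lambda) = -\prod_{j=1}^{n-1} \frac{\lambda_j^G - \lambda}{\lambda_j^G}, \qquad
\rho_i(\lambda) = -\frac{\lambda}{\zeta_i(1)\lambda_i^G} \prod_{j=1, j\ne i}^{n-1} \frac{\lambda_j^G - \lambda}{\lambda_j^G - \lambda_i^G}, \tag{3.5}
\]

for all $1 \le i \le n-1$. Moreover, by multiplying (3.2) with $\xi_\lambda(k)$ and summing over $k$, we obtain $\mathcal{E}_\nu(\xi_\lambda, \zeta_i) = \lambda_i^G \rho_i(\lambda)$. In the same spirit, one can show that $\mathcal{E}_\nu(\xi_\lambda, \zeta_i) = \lambda[\rho_i(\lambda) - \zeta_i(n)\rho_0(\lambda)]$ using Definition 3.2. Putting both equations together yields

\[
\rho_i(\lambda) = \frac{\lambda \zeta_i(n)}{\lambda - \lambda_i^G}\,\rho_0(\lambda), \qquad \forall\, 0 \le i \le n-1. \tag{3.6}
\]

As a consequence of Remark 3.5, this gives

\[
\mathcal{L}(\lambda) = \frac{\sum_{i=1}^{n-1} \lambda_i^G \rho_i^2(\lambda)}{\sum_{i=1}^{n-1} \rho_i^2(\lambda)}
= \lambda + \frac{\sum_{i=1}^{n-1} (\lambda_i^G - \lambda)^{-1}\zeta_i^2(n)}{\sum_{i=1}^{n-1} (\lambda_i^G - \lambda)^{-2}\zeta_i^2(n)}, \tag{3.7}
\]

for $\lambda \notin \{\lambda_0^G, \dots, \lambda_{n-1}^G\}$. The next proposition follows immediately from the second equation in (3.5) and (3.6).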

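Proposition 3.12. Let $\lambda_1^G, \dots, \lambda_{n-1}^G$ be the non-zero eigenvalues of $M^G_{\pi,\nu}$ in (3.1) and $\zeta_1, \dots, \zeta_{n-1}$ be the corresponding $L^2(\pi)$-normalized eigenvectors. Then,

\[
\zeta_i(1)\zeta_i(n) = -\prod_{j=1, j\ne i}^{n-1} \frac{\lambda_j^G}{\lambda_j^G - \lambda_i^G}, \qquad \forall\, 1 \le i \le n-1.
\]

Set $u(\lambda) = \sum_{j=1}^{n-1} (\lambda_j^G - \lambda)^{-1}\zeta_j^2(n)$. By Theorem 3.9, $\beta_1, \dots, \beta_{n-2}$ are zeros of $u(\lambda)\prod_{j=1}^{n-1}(\lambda_j^G - \lambda)$, which is a polynomial of degree $n-2$. This implies

\[
u(\lambda) = C \left( \prod_{j=1}^{n-1} \frac{1}{\lambda_j^G - \lambda} \right) \left( \prod_{j=1}^{n-2} (\beta_j - \lambda) \right),
\]

where $C = \dfrac{\lambda_1^G \cdots \lambda_{n-1}^G}{\beta_1 \cdots \beta_{n-2}} \sum_{j=1}^{n-1} \zeta_j^2(n)/\lambda_j^G$. Putting this back into $\mathcal{L}$ yields

\[
\frac{1}{\mathcal{L}(\lambda) - \lambda} = \frac{u'(\lambda)}{u(\lambda)} = \sum_{j=1}^{n-1} \frac{1}{\lambda_j^G - \lambda} - \sum_{j=1}^{n-2} \frac{1}{\beta_j - \lambda}, \tag{3.8}
\]

for $\lambda \notin \{\lambda_0^G, \dots, \lambda_{n-1}^G, \beta_1, \dots, \beta_{n-2}\}$.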
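The identity in Proposition 3.12 can be checked numerically on a small path. The following is a minimal sketch under stated assumptions: the helper name `path_spectrum`, the random choice of $\pi$, $\nu$ and the value $n = 6$ are hypothetical and serve only as an illustration. The helper symmetrizes $M^G_{\pi,\nu}$ by conjugation with $\mathrm{diag}(\pi)^{1/2}$ so that a symmetric eigensolver can be used.

```python
import numpy as np

def path_spectrum(pi, nu):
    """Eigenvalues and L2(pi)-normalized eigenvectors of M^G_{pi,nu} on {1,...,n}.

    pi[k] is the mass of vertex k+1 and nu[k] the weight of the edge {k+1, k+2}.
    """
    pi = np.asarray(pi, dtype=float)
    nu = np.asarray(nu, dtype=float)
    n = len(pi)
    # S = D^{1/2} M D^{-1/2} with D = diag(pi) is symmetric and has the spectrum of M.
    S = np.zeros((n, n))
    for k in range(n - 1):
        S[k, k] += nu[k] / pi[k]
        S[k + 1, k + 1] += nu[k] / pi[k + 1]
        S[k, k + 1] = S[k + 1, k] = -nu[k] / np.sqrt(pi[k] * pi[k + 1])
    lam, U = np.linalg.eigh(S)        # ascending eigenvalues, orthonormal columns
    zeta = U / np.sqrt(pi)[:, None]   # column i now satisfies pi(zeta_i^2) = 1
    return lam, zeta

# Hypothetical test data: random positive measures on a path with 6 vertices.
rng = np.random.default_rng(0)
n = 6
pi = rng.uniform(0.5, 1.5, n)
pi /= pi.sum()
nu = rng.uniform(0.5, 1.5, n - 1)
lam, zeta = path_spectrum(pi, nu)

# Left side: zeta_i(1) * zeta_i(n) for i = 1, ..., n-1 (sign of zeta_i is irrelevant).
lhs = zeta[0, 1:] * zeta[-1, 1:]
# Right side: -prod_{j != i} lambda_j / (lambda_j - lambda_i).
rhs = np.array([
    -np.prod([lam[j] / (lam[j] - lam[i]) for j in range(1, n) if j != i])
    for i in range(1, n)
])
print(np.max(np.abs(lhs - rhs)))
```

The product on the right only involves eigenvalues, so the check also illustrates that the boundary values of the eigenvectors are determined by the spectrum alone.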

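Proposition 3.13. Let $\mathcal{L}$ be the function in (3.3), $\lambda_i^G$ be the eigenvalue of $M^G_{\pi,\nu}$ and $\beta_i$ be the constant in Lemma 3.7. Let

\[
D_i = \sum_{j=1}^{n-2} (\beta_j - \lambda_i^G)^{-1} - \sum_{j=1, j\ne i}^{n-1} (\lambda_j^G - \lambda_i^G)^{-1}
\]

with $1 \le i \le n-1$. Then, for $2 \le i \le n-2$,

(1) If $D_i < 0$, then there is $\tau \in (\lambda_i^G, \beta_i)$ such that $\mathcal{L}$ is strictly increasing on $(\beta_{i-1}, \lambda_i^G) \cup (\tau, \beta_i)$ and strictly decreasing on $(\lambda_i^G, \tau)$.

(2) If $D_i > 0$, then there is $\eta \in (\beta_{i-1}, \lambda_i^G)$ such that $\mathcal{L}$ is strictly increasing on $(\beta_{i-1}, \eta) \cup (\lambda_i^G, \beta_i)$ and strictly decreasing on $(\eta, \lambda_i^G)$.

(3) If $D_i = 0$, then $\mathcal{L}$ is strictly increasing on $(\beta_{i-1}, \beta_i)$.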

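Proof. Using (3.7) and (3.8), one can show that $\mathcal{L}'(\lambda_i^G) = 0$ and

\[
\mathcal{L}''(\lambda_i^G) = \sum_{j=1, j\ne i}^{n-1} \frac{\zeta_j^2(n)}{\lambda_j^G - \lambda_i^G}
= 2\left( \sum_{j=1}^{n-2} \frac{1}{\beta_j - \lambda_i^G} - \sum_{j=1, j\ne i}^{n-1} \frac{1}{\lambda_j^G - \lambda_i^G} \right) = 2D_i. \tag{3.9}
\]

To prove (1) and (2), it suffices to show that if $\mathcal{L}'(\tau) = 0$ for some $\tau \in (\lambda_i^G, \beta_i)$, then $\tau$ is a local minimum of $\mathcal{L}$, and if $\mathcal{L}'(\eta) = 0$ for some $\eta \in (\beta_{i-1}, \lambda_i^G)$, then $\eta$ is a local maximum of $\mathcal{L}$. We discuss the first case, whereas the second case is similar and is omitted. Recall that $u(\lambda) = \sum_{j=1}^{n-1} (\lambda_j^G - \lambda)^{-1}\zeta_j^2(n)$, so that $\mathcal{L}(\lambda) = \lambda + u(\lambda)/u'(\lambda)$. As $\tau$ is a critical point for $\mathcal{L}$, one has $2(u'(\tau))^2 = u(\tau)u''(\tau)$. This implies

\[
\mathcal{L}''(\tau) = \frac{u(\tau)\,[3(u''(\tau))^2 - 2u'(\tau)u'''(\tau)]}{2(u'(\tau))^3} > 0,
\]

where the last inequality uses the fact that $u(\lambda) < 0$, for $\lambda \in (\lambda_i^G, \beta_i)$, and

\[
3(u''(\lambda))^2 - 2u'(\lambda)u'''(\lambda) = -12 \sum_{1 \le i < j \le n-1} \left[ \frac{(\lambda_i^G - \lambda_j^G)\zeta_i(n)\zeta_j(n)}{(\lambda_i^G - \lambda)^2(\lambda_j^G - \lambda)^2} \right]^2 < 0.
\]

This proves (1) and (2).

To see (3), we assume that $D_i = 0$. Computations show that

\[
\frac{\mathcal{L}(\lambda) - \lambda_i^G}{\mathcal{L}(\lambda) - \lambda}
= (\lambda - \lambda_i^G)\left( \sum_{j=1, j\ne i}^{n-1} \frac{1}{\lambda_j^G - \lambda} - \sum_{j=1}^{n-2} \frac{1}{\beta_j - \lambda} \right)
= (\lambda - \lambda_i^G)^2 \left( \sum_{j=1, j\ne i}^{n-1} \frac{1}{(\lambda_j^G - \lambda)(\lambda_j^G - \lambda_i^G)} - \sum_{j=1}^{n-2} \frac{1}{(\beta_j - \lambda)(\beta_j - \lambda_i^G)} \right) < 0,
\]

for $\lambda \in (\beta_{i-1}, \lambda_i^G) \cup (\lambda_i^G, \beta_i)$, where the last inequality uses the fact that $(\lambda_j^G - \lambda)(\lambda_j^G - \lambda_i^G) > (\beta_j - \lambda)(\beta_j - \lambda_i^G)$ for $j < i$ and $(\lambda_j^G - \lambda)(\lambda_j^G - \lambda_i^G) > (\beta_{j-1} - \lambda)(\beta_{j-1} - \lambda_i^G)$ for $j > i$. By Theorem 3.9, this implies $\mathcal{L}(\lambda) > \lambda_i^G$ for $\lambda \in (\lambda_i^G, \beta_i)$ and $\mathcal{L}(\lambda) < \lambda_i^G$ for $\lambda \in (\beta_{i-1}, \lambda_i^G)$. The desired property follows immediately from the discussion in the previous paragraph. □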

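Remark 3.11. Note that $D_1 > 0$ and $D_{n-1} < 0$. Using the same proof as above, this implies that $\mathcal{L}(\lambda)$ is strictly increasing on $(\lambda_1^G, \beta_1) \cup (\beta_{n-2}, \lambda_{n-1}^G)$. Moreover, by (3.7), one may compute

\[
(u'(\lambda))^2 \mathcal{L}'(\lambda) = -2 \sum_{i<j} \frac{(\lambda_i^G - \lambda_j^G)^2\,\zeta_i^2(n)\zeta_j^2(n)}{(\lambda_i^G - \lambda)^3(\lambda_j^G - \lambda)^3} < 0, \qquad \forall \lambda \in (0, \lambda_1^G) \cup (\lambda_{n-1}^G, \infty).
\]

This implies $\mathcal{L}(\lambda)$ is strictly decreasing on $(0, \lambda_1^G) \cup (\lambda_{n-1}^G, \infty)$ and

\[
\lim_{\lambda \to 0} \mathcal{L}(\lambda) = \frac{\sum_{i=1}^{n-1} \zeta_i^2(n)/\lambda_i^G}{\sum_{i=1}^{n-1} \zeta_i^2(n)/(\lambda_i^G)^2}, \qquad
\lim_{\lambda \to \infty} \mathcal{L}(\lambda) = \left( \frac{1}{\pi(n)} - 1 \right)^{-1} \sum_{i=1}^{n-1} \lambda_i^G \zeta_i^2(n).
\]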

The following local convergence is a simple corollary of Theorem 3.9 and Proposition 3.13.

Theorem 3.14 (Local convergence). Let $\lambda_0 > 0$ and set $\lambda_{\ell+1} = \mathcal{L}(\lambda_\ell)$ for $\ell \ge 0$. Then, there is $\epsilon > 0$ such that the sequence $(\lambda_\ell)_{\ell=1}^{\infty}$ is monotonic and converges to $\lambda_i^G$ for $\lambda_0 \in (\lambda_i^G - \epsilon, \lambda_i^G + \epsilon)$ and $1 \le i \le n-1$.
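The local convergence can be observed numerically by iterating the closed form (3.7). The sketch below is illustrative only: it evaluates $\mathcal{L}$ from an eigen-decomposition computed in advance, which is circular as an algorithm (the paper evaluates $\mathcal{L}$ through $\xi_\lambda$ from Definition 3.2 instead). The function names `L_map` and `iterate_L`, the test chain (the reflecting simple random walk on 6 vertices, where $\pi$ is uniform and $M^G_{\pi,\nu}$ is symmetric) and the starting point are assumptions made for this example.

```python
import numpy as np

def L_map(lam, eig, w):
    """Evaluate (3.7); eig holds the non-zero eigenvalues and w[i] = zeta_i(n)^2."""
    d = eig - lam
    if np.any(d == 0.0):          # lam is an eigenvalue, hence a fixed point of L
        return lam
    return lam + np.sum(w / d) / np.sum(w / d**2)

def iterate_L(lam0, eig, w, tol=1e-13, max_iter=100):
    """Return the sequence lam_0, lam_1, ... until successive terms differ by < tol."""
    seq = [lam0]
    for _ in range(max_iter):
        seq.append(L_map(seq[-1], eig, w))
        if abs(seq[-1] - seq[-2]) < tol:
            break
    return seq

# Assumed test case: M = I - K for the reflecting simple random walk, n = 6.
n = 6
M = np.eye(n) - 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))
M[0, 0] = M[-1, -1] = 0.5
vals, U = np.linalg.eigh(M)       # pi is uniform, so M is symmetric
eig = vals[1:]                    # non-zero eigenvalues lambda_1, ..., lambda_{n-1}
w = n * U[-1, 1:] ** 2            # zeta_i(n)^2 for L2(pi)-normalized eigenvectors

# Start slightly to the right of lambda_2 = 1 - cos(2*pi/6).
seq = iterate_L(eig[1] + 0.01, eig, w)
print(seq)
```

Because $\mathcal{L}'(\lambda_i^G) = 0$, the error is roughly squared at each step once the iterate is close to the eigenvalue, which is why only a handful of iterations are needed here.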

We use the following examples to illustrate the different cases in Proposition 3.13.

Example 3.1 (Simple random walks). Let $n > 1$. A simple random walk on $\{1, 2, \dots, n\}$ with reflecting probability $1/2$ at the boundary is a birth and death chain with transition matrix given by $K(i,j) = K(1,1) = K(n,n) = 1/2$ for $|i-j| = 1$. It is easy to see that the uniform probability is the stationary distribution of $K$. In the setting of graph, we have $\nu(i, i+1) = 1/(2n)$ and $\pi(i) = 1/n$. One may apply the method in [11] to obtain the following spectral information.

\[
\lambda_j^G = 1 - \cos\frac{j\pi}{n}, \qquad
\zeta_j(k) = \frac{1}{\sqrt{\lambda_j^G}} \left( \sin\frac{jk\pi}{n} - \sin\frac{j(k-1)\pi}{n} \right), \qquad \forall\, 1 \le j < n.
\]

See, e.g., [3, Section 7]. By (3.9), we get

\[
D_i = \frac{1}{2} \sum_{j=1, j\ne i}^{n-1} \frac{\sin^2(j\pi/n)}{\lambda_j^G(\lambda_j^G - \lambda_i^G)}
= \frac{1}{2} \sum_{j=1, j\ne i}^{n-1} \frac{1 + \cos(j\pi/n)}{\cos(i\pi/n) - \cos(j\pi/n)}.
\]

Clearly, $D_1 > 0$ and $D_{n-1} < 0$. If $n$ is even, then $D_{n/2} < 0$.
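The spectral data quoted in Example 3.1 can be verified numerically. The sketch below, with the arbitrary choice $n = 7$ and variable names that are assumptions of this illustration, builds $M^G_{\pi,\nu} = I - K$ for the reflecting walk and compares it with the closed-form eigenvalues and eigenvectors.

```python
import numpy as np

n = 7
# Reflecting simple random walk on {1,...,n}; here M^G_{pi,nu} = I - K.
K = 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))
K[0, 0] = K[-1, -1] = 0.5
M = np.eye(n) - K

j = np.arange(1, n)               # eigenvalue index 1..n-1
k = np.arange(1, n + 1)           # vertex index 1..n
lam = 1.0 - np.cos(j * np.pi / n)
angle = np.outer(k, j) * np.pi / n
angle_prev = np.outer(k - 1, j) * np.pi / n
# zeta[k-1, j-1] = zeta_j(k) from the closed form in Example 3.1
zeta = (np.sin(angle) - np.sin(angle_prev)) / np.sqrt(lam)

numeric = np.linalg.eigvalsh(M)   # ascending; numeric[0] is the eigenvalue 0
residual = np.abs(M @ zeta - zeta * lam).max()   # checks M zeta_j = lambda_j zeta_j
norms = (zeta ** 2).mean(axis=0)                 # pi(zeta_j^2) with pi uniform
print(residual, norms)
```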

Example 3.2 (Ehrenfest chains). An Ehrenfest chain on $V = \{0, 1, \dots, n\}$ is a Markov chain with transition matrix $K$ given by $K(i, i+1) = 1 - i/n$ and $K(i+1, i) = (i+1)/n$ for $i = 0, \dots, n-1$. The associated stationary distribution is the unbiased binomial distribution on $V$, that is, $\pi(i) = \binom{n}{i} 2^{-n}$ for $i \in V$. For the Ehrenfest chain, the measure $\nu$ is defined by $\nu(i, i+1) = \binom{n-1}{i} 2^{-n}$ for $i = 0, \dots, n-1$. Using the group representation for the binary group $\{0,1\}^n$, one may compute

\[
\lambda_j = \frac{2j}{n}, \qquad
\zeta_j(k) = \binom{n}{j}^{-1/2} \sum_{\ell=0}^{j} (-1)^{\ell} \binom{k}{\ell} \binom{n-k}{j-\ell}, \qquad \forall\, 1 \le j \le n.
\]

Plugging this back into (3.9) yields

\[
D_i = \frac{n}{4} \sum_{j=1, j\ne i}^{n} \frac{\binom{n}{j}}{j-i}
\begin{cases}
> 0 & \text{for } i < n/2, \\
= 0 & \text{for } i = n/2, \\
< 0 & \text{for } i > n/2.
\end{cases}
\]

This example points out the possibility of different signs in $\{D_i \mid i = 1, \dots, n-1\}$ including 0.
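The eigenvalues and eigenvectors stated for the Ehrenfest chain can be checked directly. The following sketch assumes $n = 6$ and introduces the hypothetical helper `zeta(j, k)` for the closed-form eigenvector; it verifies the eigen-equation for $I - K$, the $L^2(\pi)$-normalization and the stationarity of the binomial distribution.

```python
from math import comb
import numpy as np

n = 6
# Ehrenfest chain on {0,...,n}: K(i, i+1) = 1 - i/n, K(i+1, i) = (i+1)/n.
K = np.zeros((n + 1, n + 1))
for i in range(n):
    K[i, i + 1] = 1 - i / n
    K[i + 1, i] = (i + 1) / n
pi = np.array([comb(n, i) for i in range(n + 1)]) / 2.0 ** n

def zeta(j, k):
    """Closed-form eigenvector value zeta_j(k) from Example 3.2."""
    s = sum((-1) ** l * comb(k, l) * comb(n - k, j - l) for l in range(j + 1))
    return s / np.sqrt(comb(n, j))

# Z[k, j-1] = zeta_j(k) for k = 0..n and j = 1..n
Z = np.array([[zeta(j, k) for j in range(1, n + 1)] for k in range(n + 1)])
lam = 2.0 * np.arange(1, n + 1) / n

residual = np.abs((np.eye(n + 1) - K) @ Z - Z * lam).max()  # (I-K) zeta_j = lambda_j zeta_j
norms = pi @ Z ** 2                                          # pi(zeta_j^2)
stationary_gap = np.abs(pi @ K - pi).max()                   # pi K = pi
print(residual, norms, stationary_gap)
```

In particular, the last row of `Z` gives $\zeta_j(n)^2 = \binom{n}{j}$, which is the quantity entering the sum for $D_i$.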

3.4. A remark on the separation for birth and death chains. In this subsection, we give a new proof of a result, Theorem 3.15, which deals with convergence in separation distance for birth and death chains. Let $(X_m)_{m=0}^{\infty}$ be a birth and death chain with transition matrix $K$ given by (1.1). In the continuous time setting, we consider the process $Y_t = X_{N_t}$, where $N_t$ is a Poisson process with parameter 1 independent of $X_m$. Given the initial distribution $\mu$, which is the distribution of $X_0$, the distributions of $X_m$ and $Y_t$ are respectively $\mu K^m$ and $\mu e^{-t(I-K)}$, where $e^A := \sum_{l=0}^{\infty} A^l/l!$. Briefly, we write $H_t = e^{-t(I-K)}$. It is well-known that if $K$ is irreducible, then $\mu H_t$ converges to $\pi$ as $t \to \infty$. If $K$ is irreducible and $r_i > 0$ for some $i$, then $\mu K^m$ converges to $\pi$ as $m \to \infty$. Concerning the convergence, we consider the separations of $X_m$, $Y_t$ with respect to $\pi$, which are defined by

\[
d_{\mathrm{sep}}(\mu, m) = \max_{0 \le x \le n} \left\{ 1 - \frac{\mu K^m(x)}{\pi(x)} \right\}, \qquad
d^c_{\mathrm{sep}}(\mu, t) = \max_{0 \le x \le n} \left\{ 1 - \frac{\mu H_t(x)}{\pi(x)} \right\}.
\]

The following theorem is from [9].

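Theorem 3.15. Let $K$ be an irreducible birth and death chain on $\{0, 1, \dots, n\}$ with eigenvalues $\lambda_0 = 0 < \lambda_1 < \cdots < \lambda_n$.

(1) For the discrete time chain, if $p_i + q_{i+1} \le 1$ for all $0 \le i < n$, then

\[
d_{\mathrm{sep}}(0, m) = d_{\mathrm{sep}}(n, m) = \sum_{j=1}^{n} \left( \prod_{i=1, i\ne j}^{n} \frac{\lambda_i}{\lambda_i - \lambda_j} \right) (1 - \lambda_j)^m.
\]

(2) For the continuous time chain, it holds true that

\[
d^c_{\mathrm{sep}}(0, t) = d^c_{\mathrm{sep}}(n, t) = \sum_{j=1}^{n} \left( \prod_{i=1, i\ne j}^{n} \frac{\lambda_i}{\lambda_i - \lambda_j} \right) e^{-\lambda_j t}.
\]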
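As an illustration of Theorem 3.15(1), the separation can be computed both directly from powers of $K$ and from the eigenvalue formula. The sketch below uses a hypothetical test chain with $p_i = q_i = 1/4$ on $\{0, \dots, 5\}$, for which $\pi$ is uniform and $p_i + q_{i+1} = 1/2 \le 1$; the function names `separation_from_0` and `separation_formula` are assumptions of this example. Here $\lambda_1, \dots, \lambda_n$ are taken to be the non-zero eigenvalues of $I - K$, as in the proof of the theorem.

```python
import numpy as np

def separation_from_0(K, pi, m):
    """d_sep(0, m) computed directly from the m-step transition matrix."""
    row = np.linalg.matrix_power(K, m)[0]
    return np.max(1.0 - row / pi)

def separation_formula(lam, m):
    """Right-hand side of Theorem 3.15(1); lam holds the non-zero eigenvalues of I - K."""
    total = 0.0
    for j, lj in enumerate(lam):
        others = np.delete(lam, j)
        total += np.prod(others / (others - lj)) * (1.0 - lj) ** m
    return total

# Hypothetical test chain on {0,...,n}: p_i = q_i = 1/4, remaining mass is held.
n = 5
K = 0.25 * (np.eye(n + 1, k=1) + np.eye(n + 1, k=-1))
K += np.diag(1.0 - K.sum(axis=1))      # holding probabilities r_i
pi = np.full(n + 1, 1.0 / (n + 1))     # K is symmetric, so pi is uniform
lam = np.linalg.eigvalsh(np.eye(n + 1) - K)[1:]

for m in (0, 3, 5, 10, 20):
    print(m, separation_from_0(K, pi, m), separation_formula(lam, m))
```

For $m < n$ the chain started at 0 cannot have reached $n$, so both expressions equal 1; the formula reproduces this because the weights are Lagrange interpolation coefficients at 0.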

Diaconis and Fill [6, 12] introduce the concept of dual chain to express the separations in Theorem 3.15 as the probability of the first passage time. Brown and Shao [1] characterize the first passage time using the eigenvalues of K for a special class of continuous time Markov chains including birth and death chains. The idea in [1] is also applicable for discrete time chains and this leads to the formula above. See [9] for further discussions. Here, we use Proposition 3.12 and Lemma 3.16 to prove this result directly.

Lemma 3.16. Let $K$ be the transition matrix in (1.1) with stationary distribution $\pi$. Suppose that $\mu$ is a probability distribution satisfying $\mu(i)/\pi(i) \le \mu(i+1)/\pi(i+1)$ for all $0 \le i \le n-1$.

(1) For the discrete time chain, if $p_i + q_{i+1} \le 1$ for all $0 \le i < n$, then $\mu K^m(i)/\pi(i) \le \mu K^m(i+1)/\pi(i+1)$ for all $0 \le i < n$ and $m \ge 0$.

(2) For the continuous time chain, $\mu H_t(i)/\pi(i) \le \mu H_t(i+1)/\pi(i+1)$ for all $0 \le i < n$ and $t \ge 0$.

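Proof. Note that (2) follows from (1) if we write $H_t = \exp\{-2t(I - \frac{I+K}{2})\}$. For the proof of (1), observe that

\[
\frac{\mu K^{m+1}(i)}{\pi(i)} = \frac{\mu K^m(i-1)}{\pi(i-1)}\, q_i + \frac{\mu K^m(i)}{\pi(i)}\, r_i + \frac{\mu K^m(i+1)}{\pi(i+1)}\, p_i, \qquad \forall i.
\]

By induction, if $\mu K^m(i)/\pi(i) \le \mu K^m(i+1)/\pi(i+1)$ for $0 \le i < n$, then

\[
\begin{aligned}
\frac{\mu K^{m+1}(i+1)}{\pi(i+1)}
&= \frac{\mu K^m(i)}{\pi(i)}\, q_{i+1} + \frac{\mu K^m(i+1)}{\pi(i+1)}\, r_{i+1} + \frac{\mu K^m(i+2)}{\pi(i+2)}\, p_{i+1} \\
&\ge \frac{\mu K^m(i)}{\pi(i)}\, q_{i+1} + \frac{\mu K^m(i+1)}{\pi(i+1)}\, (1 - q_{i+1}) \\
&\ge \frac{\mu K^m(i)}{\pi(i)}\, (1 - p_i) + \frac{\mu K^m(i+1)}{\pi(i+1)}\, p_i \ \ge\ \frac{\mu K^{m+1}(i)}{\pi(i)}.
\end{aligned}
\]
□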
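The monotonicity in Lemma 3.16(1) is easy to observe numerically. The sketch below uses hypothetical birth and death rates chosen so that $p_i + q_{i+1} \le 1$, starts from the Dirac mass at $n$ (whose ratio to $\pi$ is non-decreasing) and records whether $\mu K^m/\pi$ stays non-decreasing; the small tolerance only absorbs floating-point rounding.

```python
import numpy as np

n = 5
# Hypothetical rates on {0,...,n} with p_n = q_0 = 0 and p_i + q_{i+1} <= 1.
p = np.array([0.3, 0.2, 0.4, 0.1, 0.25, 0.0])
q = np.array([0.0, 0.5, 0.3, 0.55, 0.2, 0.7])

# K(i, i+1) = p_i, K(i, i-1) = q_i, K(i, i) = r_i = 1 - p_i - q_i.
K = np.diag(p[:-1], 1) + np.diag(q[1:], -1) + np.diag(1.0 - p - q)

# Stationary distribution: pi(i) proportional to (p_0...p_{i-1}) / (q_1...q_i).
w = np.concatenate(([1.0], np.cumprod(p[:-1] / q[1:])))
pi = w / w.sum()

mu = np.zeros(n + 1)
mu[n] = 1.0                        # Dirac mass at n: mu/pi is non-decreasing
monotone = []
for m in range(30):
    monotone.append(bool(np.all(np.diff(mu / pi) >= -1e-12)))
    mu = mu @ K                    # distribution after one more step
print(all(monotone))
```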

Remark 3.12. Lemma 3.16 is also developed in [10], in which it is shown that, for any non-negative function $f$, $K^m f$ is non-decreasing for all $m \ge 0$ if $f$ is non-decreasing. Consider the adjoint chain $K^*$ of $K$ in $L^2(\pi)$. As birth and death chains are reversible, one has $K^* = K$. Using the identity $\mu K/\pi = K^*(\mu/\pi)$, it is easy to see that the above proof is consistent with the proof in [10].

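Proof of Theorem 3.15. Assume that $K$ is irreducible and let $\lambda_0 = 0 < \lambda_1 < \cdots < \lambda_n$ be the eigenvalues of $I - K$ with $L^2(\pi)$-normalized eigenvectors $\zeta_0 = 1, \dots, \zeta_n$. By Lemma 3.16, if $\mu$ satisfies $\mu(i)/\pi(i) \ge \mu(i+1)/\pi(i+1)$ for $0 \le i < n$, then

\[
d^c_{\mathrm{sep}}(\mu, t) = 1 - \frac{\mu H_t(n)}{\pi(n)} = -\sum_{j=1}^{n} \mu(\zeta_j)\zeta_j(n) e^{-\lambda_j t},
\]

where $\mu(\zeta_j) = \sum_{i=0}^{n} \zeta_j(i)\mu(i)$. If $K$ satisfies $p_i + q_{i+1} \le 1$ for all $0 \le i < n$, then

\[
d_{\mathrm{sep}}(\mu, m) = 1 - \frac{\mu K^m(n)}{\pi(n)} = -\sum_{j=1}^{n} \mu(\zeta_j)\zeta_j(n)(1 - \lambda_j)^m.
\]

By Proposition 3.12, setting $\mu$ to be one of the Dirac measures $\delta_0$, $\delta_n$ leads to the desired identities. □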

4. Paths of infinite length

In this section, the graph $G = (V, E)$ under consideration is infinite with $V = \{1, 2, \dots\}$ and $E = \{\{i, i+1\} \mid i = 1, 2, \dots\}$. As before, let $\pi$, $\nu$ be positive measures on $V$, $E$ satisfying $\pi(V) = 1$. The Dirichlet form and the variance are defined in a similar way as in the introduction and the spectral gap of $G$ with respect to $\pi$, $\nu$ is given by

\[
\lambda^G_{\pi,\nu} = \inf\left\{ \frac{\mathcal{E}_\nu(f, f)}{\mathrm{Var}_\pi(f)} \,\middle|\, f \text{ is non-constant and } \pi(f^2) < \infty \right\}.
\]

For $n \ge 2$, let $G_n = (V_n, E_n)$ be the subgraph of $G$ with $V_n = \{1, 2, \dots, n\}$, $E_n = \{\{i, i+1\} \mid 1 \le i < n\}$ and let $\pi_n$, $\nu_n$ be normalized restrictions of $\pi$, $\nu$ to $V_n$, $E_n$. That is, $\pi_n(i) = c_n \pi(i)$, $\nu_n(i, i+1) = c_n \nu(i, i+1)$ with $c_n = 1/[\pi(1) + \cdots + \pi(n)]$. As before, let $M^G_{\pi,\nu}$ be an infinite matrix indexed by $V$ and defined by

\[
M^G_{\pi,\nu}(i, j) = -\frac{\nu(i, j)}{\pi(i)}, \quad \forall |i - j| = 1, \qquad
M^G_{\pi,\nu}(i, i) = \frac{\nu(i-1, i) + \nu(i, i+1)}{\pi(i)}. \tag{4.1}
\]

Clearly, $M^{G_n}_{\pi_n,\nu_n}$ is the principal submatrix of $M^G_{\pi,\nu}$ indexed by $V_n \times V_n$.

Lemma 4.1. Referring to the above setting, $\lambda^{G_{n+1}}_{\pi_{n+1},\nu_{n+1}} < \lambda^{G_n}_{\pi_n,\nu_n}$ for $n > 1$ and

Figure 1. These curves display the mapping $m \mapsto$

Figure 2. The dashed lines denote the weak edges of $\nu$ in Theorem 6.5.

Figure 1. The curves display the total variation distance of the chains in Remark 2.2, where the leftmost curve is for $K_1$ and the rightmost curve is for $K_2$.
