Entropy convergence of Markov chains


SPECTRAL COMPUTATIONS FOR BIRTH AND DEATH CHAINS

GUAN-YU CHEN¹ AND LAURENT SALOFF-COSTE²

Abstract. We consider the spectrum of birth and death chains on an $n$-path. An iterative scheme is proposed to compute any eigenvalue with exponential convergence rate independent of $n$. This allows one to determine the whole spectrum in order $n^2$ elementary operations. Using the same idea, we also provide a lower bound on the spectral gap, which is of the correct order on some classes of examples.

1. Introduction

Let $G = (V, E)$ be the undirected finite path with vertex set $V = \{1, 2, \dots, n\}$ and edge set $E = \{\{i, i+1\} : i = 1, 2, \dots, n-1\}$. Given two positive measures $\pi, \nu$ on $V, E$ with $\pi(V) = 1$, the Dirichlet form and variance associated with $\nu$ and $\pi$ are defined by

\[ \mathcal{E}_\nu(f, g) := \sum_{i=1}^{n-1} [f(i) - f(i+1)][g(i) - g(i+1)]\,\nu(i, i+1) \]

and

\[ \mathrm{Var}_\pi(f) := \pi(f^2) - \pi(f)^2, \]

where $f, g$ are functions on $V$. When convenient, we set $\nu(0, 1) = \nu(n, n+1) = 0$. The spectral gap of $G$ with respect to $\pi, \nu$ is defined as

\[ \lambda^G_{\pi,\nu} := \min\left\{ \frac{\mathcal{E}_\nu(f, f)}{\mathrm{Var}_\pi(f)} \;\middle|\; f \text{ is non-constant} \right\}. \]

Let $M^G_{\pi,\nu}$ be the matrix given by $M^G_{\pi,\nu}(i, j) = 0$ for $|i - j| > 1$ and

\[ M^G_{\pi,\nu}(i, j) = -\frac{\nu(i, j)}{\pi(i)}, \quad \forall\, |i - j| = 1, \qquad M^G_{\pi,\nu}(i, i) = \frac{\nu(i-1, i) + \nu(i, i+1)}{\pi(i)}. \]

Obviously, $\lambda^G_{\pi,\nu}$ is the smallest non-zero eigenvalue of $M^G_{\pi,\nu}$.
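The definitions above can be checked on a small example. The following is a minimal sketch, not part of the original text: it assumes 0-based Python lists (`pi[i]` stands for $\pi(i+1)$ and `nu[i]` for $\nu(i+1, i+2)$), and the measures and test functions are hypothetical values chosen only for illustration. It verifies the identity $\sum_i \pi(i) f(i) (M^G_{\pi,\nu} g)(i) = \mathcal{E}_\nu(f, g)$, which is what links the matrix to the Dirichlet form.

```python
from fractions import Fraction

def dirichlet_form(f, g, nu):
    """E_nu(f, g); nu[i] is the weight of the edge joining vertices i+1 and i+2."""
    return sum((f[i] - f[i + 1]) * (g[i] - g[i + 1]) * nu[i] for i in range(len(nu)))

def variance(f, pi):
    """Var_pi(f) = pi(f^2) - pi(f)^2."""
    mean = sum(p * x for p, x in zip(pi, f))
    return sum(p * x * x for p, x in zip(pi, f)) - mean * mean

def build_matrix(pi, nu):
    """Tridiagonal matrix M^G_{pi,nu}, stored as a list of rows."""
    n = len(pi)
    rows = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        left = nu[i - 1] if i > 0 else Fraction(0)       # nu(0, 1) = 0
        right = nu[i] if i < n - 1 else Fraction(0)      # nu(n, n + 1) = 0
        rows[i][i] = (left + right) / pi[i]
        if i > 0:
            rows[i][i - 1] = -left / pi[i]
        if i < n - 1:
            rows[i][i + 1] = -right / pi[i]
    return rows

def apply_matrix(rows, f):
    return [sum(a * x for a, x in zip(row, f)) for row in rows]

# Hypothetical example data (exact rationals avoid rounding issues).
pi = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]
nu = [Fraction(1, 4), Fraction(1, 5)]
M = build_matrix(pi, nu)
f = [Fraction(-1), Fraction(2), Fraction(5)]
g = [Fraction(3), Fraction(0), Fraction(1)]

# <f, M g>_pi should coincide with the Dirichlet form E_nu(f, g).
pairing = sum(p * x * y for p, x, y in zip(pi, f, apply_matrix(M, g)))
```

Since the rows of the matrix sum to zero, constants lie in its kernel, so the smallest non-zero eigenvalue is the minimum of the Rayleigh quotient over non-constant functions.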

2000 Mathematics Subject Classification. 60J10, 60J27.

Key words and phrases. Birth and death chains, spectrum.

¹ Partially supported by NSC grant NSC100-2115-M-009-003-MY2 and NCTS, Taiwan.
² Partially supported by NSF grant DMS-1004771.

Undirected paths equipped with measures $\pi, \nu$ are closely related to birth and death chains. A birth and death chain on $\{0, 1, 2, \dots, n\}$ with birth rate $p_i$, death rate $q_i$ and holding rate $r_i$ is a Markov chain with transition matrix $K$ given by

\[ K(i, i+1) = p_i, \quad K(i, i-1) = q_i, \quad K(i, i) = r_i, \quad \forall\, 0 \le i \le n, \tag{1.1} \]

where $p_i + q_i + r_i = 1$ and $p_n = q_0 = 0$. Under the assumption of irreducibility, that is, $p_i q_{i+1} > 0$ for $0 \le i < n$, $K$ has a unique stationary distribution $\pi$ given by $\pi(i) = c\,(p_0 \cdots p_{i-1})/(q_1 \cdots q_i)$, where $c$ is the positive constant such that $\sum_{i=0}^{n} \pi(i) = 1$. The smallest non-zero eigenvalue of $I - K$ is exactly the spectral gap of the path on $\{0, 1, \dots, n\}$ with measures $\pi, \nu$, where $\nu(i, i+1) = \pi(i) p_i = \pi(i+1) q_{i+1}$ for $0 \le i < n$.
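As an illustration of the correspondence between birth and death chains and weighted paths, here is a minimal sketch that computes $\pi$ and $\nu$ from the rates. The function name and the sample rates are hypothetical; states are indexed $0, \dots, n$ as in (1.1), and exact rational arithmetic is used so that detailed balance can be checked without rounding.

```python
from fractions import Fraction

def stationary_and_edge_weights(p, q):
    """Given birth rates p[0..n] and death rates q[0..n] with p[n] = q[0] = 0,
    return (pi, nu) where pi is the stationary distribution and
    nu[i] = nu(i, i+1) = pi(i) * p_i."""
    n = len(p) - 1
    weights = [Fraction(1)]
    for i in range(1, n + 1):
        # pi(i) is proportional to (p_0 ... p_{i-1}) / (q_1 ... q_i).
        weights.append(weights[-1] * p[i - 1] / q[i])
    total = sum(weights)
    pi = [w / total for w in weights]
    nu = [pi[i] * p[i] for i in range(n)]
    return pi, nu

# Hypothetical rates on {0, 1, 2, 3}; holding rates are r_i = 1 - p_i - q_i.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4), Fraction(0)]
q = [Fraction(0), Fraction(1, 6), Fraction(1, 2), Fraction(2, 3)]
pi, nu = stationary_and_edge_weights(p, q)
```

The pair `(pi, nu)` can then be used as the measures on the path, with the state $i$ of the chain playing the role of vertex $i + 1$.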

Note that if $\mathbf{1}$ is the constant function of value 1 and $\psi$ is a minimizer for $\lambda^G_{\pi,\nu}$, then $\psi - \pi(\psi)\mathbf{1}$ is an eigenvector of $M^G_{\pi,\nu}$. This implies that any minimizer $\psi$ for $\lambda^G_{\pi,\nu}$ satisfying $\pi(\psi) = 0$ satisfies the Euler-Lagrange equation

\[ \lambda^G_{\pi,\nu}\,\pi(i)\psi(i) = [\psi(i) - \psi(i-1)]\,\nu(i-1, i) + [\psi(i) - \psi(i+1)]\,\nu(i, i+1), \tag{1.2} \]

for all $1 \le i \le n$. Assuming the connectedness of $G$ (i.e., the superdiagonal and subdiagonal entries of $M^G_{\pi,\nu}$ are nonzero), the rank of $M^G_{\pi,\nu} - \lambda I$ is at least $n - 1$. This implies that all eigenvalues of $M^G_{\pi,\nu}$ are simple. See Lemma A.3 for an illustration. Observe that, by (1.2), any non-trivial eigenvector of $M^G_{\pi,\nu}$ has mean 0 under $\pi$. This implies that all minimizers for the spectral gap are of the form $a\psi + b\mathbf{1}$, where $a, b$ are constants and $\psi$ is a nontrivial solution of (1.2). In 2009, Miclo implicitly obtained the following result.

Theorem 1.1. [15, Proposition 1] If $\psi$ is a minimizer for $\lambda^G_{\pi,\nu}$, then $\psi$ must be monotonic, that is, either $\psi(i) \le \psi(i+1)$ for all $1 \le i < n$ or $\psi(i) \ge \psi(i+1)$ for all $1 \le i < n$.

One aim of this paper is to provide a scheme to compute the spectrum of $M^G_{\pi,\nu}$, in particular, the spectral gap. Based on Miclo's observation, it is natural to consider the following algorithm.

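(A1) Choose two positive reals $\lambda_0, a$ in advance and set, for $k = 0, 1, \dots$,

1. $\psi_k(1) = -a$,

2. $\psi_k(i+1) = \psi_k(i) + \dfrac{\{[\psi_k(i) - \psi_k(i-1)]\,\nu(i-1, i) - \lambda_k \pi(i)\psi_k(i)\}^+}{\nu(i, i+1)}$, for $1 \le i < n$, where $t^+ = \max\{t, 0\}$,

3. $\lambda_{k+1} = \dfrac{\mathcal{E}_\nu(\psi_k, \psi_k)}{\mathrm{Var}_\pi(\psi_k)}$.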
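The three steps of (A1) can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: lists are 0-based (`pi[i]` is $\pi(i+1)$, `nu[i]` is $\nu(i+1, i+2)$), the function names are hypothetical, and the test case is the path with $\pi(i) = \nu(i, i+1) = 1/n$, for which $M^G_{\pi,\nu}$ is the standard path Laplacian with known spectral gap $2 - 2\cos(\pi/n)$.

```python
import math

def truncated_vector(lam, pi, nu, a=1.0):
    """Step 2 of (A1): build psi from psi(1) = -a using the (.)^+ truncation."""
    psi = [-a]
    for i in range(len(pi) - 1):
        # [psi(i) - psi(i-1)] nu(i-1, i), with the convention nu(0, 1) = 0.
        prev_flux = (psi[i] - psi[i - 1]) * nu[i - 1] if i > 0 else 0.0
        step = max(prev_flux - lam * pi[i] * psi[i], 0.0)
        psi.append(psi[i] + step / nu[i])
    return psi

def rayleigh_quotient(psi, pi, nu):
    """Step 3 of (A1): E_nu(psi, psi) / Var_pi(psi)."""
    energy = sum((psi[i] - psi[i + 1]) ** 2 * nu[i] for i in range(len(nu)))
    mean = sum(p * x for p, x in zip(pi, psi))
    var = sum(p * x * x for p, x in zip(pi, psi)) - mean ** 2
    return energy / var

def algorithm_a1(pi, nu, lam0, a=1.0, steps=100):
    """Return the list [lam_0, lam_1, ..., lam_steps]."""
    lams = [lam0]
    for _ in range(steps):
        psi = truncated_vector(lams[-1], pi, nu, a)
        lams.append(rayleigh_quotient(psi, pi, nu))
    return lams

# Test case (an assumption of this sketch): uniform measures on a 6-path.
n = 6
pi = [1.0 / n] * n
nu = [1.0 / n] * (n - 1)
lams = algorithm_a1(pi, nu, lam0=1.0)
exact_gap = 2.0 - 2.0 * math.cos(math.pi / n)
```

In this example the first iterate is $\lambda_1 = 4/7$, and the later iterates decrease towards `exact_gap`.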

The following theorems discuss the behavior of λk.

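Theorem 1.2 (Convergence to the exact value). Referring to (A1), if $n = 2$, then $\lambda_k = \lambda^G_{\pi,\nu}$ for all $k \ge 1$. If $n \ge 3$, then the sequence $(\lambda_k, \psi_k)$ satisfies

(1) If $\lambda_0 = \lambda^G_{\pi,\nu}$, then $\lambda_k = \lambda^G_{\pi,\nu}$ for all $k \ge 0$.

(2) If $\lambda_0 \ne \lambda^G_{\pi,\nu}$, then $\lambda_k > \lambda_{k+1} > \lambda^G_{\pi,\nu}$ for $k \ge 1$.

(3) Set $(\lambda^*, \psi^*) = \lim_{k \to \infty} (\lambda_k, \psi_k)$. Then, $\lambda^* = \mathcal{E}_\nu(\psi^*, \psi^*)/\mathrm{Var}_\pi(\psi^*) = \lambda^G_{\pi,\nu}$ and $\pi(\psi^*) = 0$.

Theorem 1.3 (Rate of convergence). Referring to Theorem 1.2, there is a constant $\sigma \in (0, 1)$ independent of the choice of $(\lambda_0, a)$ such that $0 \le \lambda_k - \lambda^G_{\pi,\nu} \le \sigma^{k-1}\lambda_1$ for all $k \ge 1$.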

By Theorem 1.3, we know that the sequence λk generated in (A1) converges to the spectral gap exponentially but the rate (− log σ) is undetermined. The following alternative scheme is based on using more information on the spectral gap and will provide convergence at a constant rate.


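(A2) Choose $a > 0$, $L_0 < \lambda^G_{\pi,\nu} < U_0$ in advance and set, for $k = 0, 1, \dots$,

1. $\psi_k(1) = -a$, $\lambda_k = \frac{1}{2}(L_k + U_k)$,

2. $\psi_k(i+1) = \psi_k(i) + \dfrac{\{[\psi_k(i) - \psi_k(i-1)]\,\nu(i-1, i) - \lambda_k \pi(i)\psi_k(i)\}^+}{\nu(i, i+1)}$, for $1 \le i < n$, where $t^+ = \max\{t, 0\}$,

3. \[ \begin{cases} L_{k+1} = L_k,\ U_{k+1} = \lambda_k & \text{if } \pi(\psi_k) > 0, \\ L_{k+1} = \lambda_k,\ U_{k+1} = U_k & \text{if } \pi(\psi_k) < 0, \\ L_{k+1} = U_{k+1} = \lambda_k & \text{if } \pi(\psi_k) = 0. \end{cases} \]

Theorem 1.4 (Dichotomy method). Referring to (A2), it holds true that

\[ 0 \le \max\{U_k - \lambda^G_{\pi,\nu},\ \lambda^G_{\pi,\nu} - L_k\} \le (U_0 - L_0)\,2^{-k}, \quad \forall\, k \ge 0. \]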
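The dichotomy scheme (A2) is a bisection on the sign of $\pi(\psi_k)$. The sketch below is a hedged illustration with hypothetical function names and 0-based lists (`pi[i]` is $\pi(i+1)$, `nu[i]` is $\nu(i+1, i+2)$). The starting bracket $[0.01, 2]$ and the uniform test path, whose gap is $2 - 2\cos(\pi/n)$, are assumptions made only for this example.

```python
import math

def truncated_vector(lam, pi, nu, a=1.0):
    """Step 2 of (A2): same truncated recursion as in (A1)."""
    psi = [-a]
    for i in range(len(pi) - 1):
        prev_flux = (psi[i] - psi[i - 1]) * nu[i - 1] if i > 0 else 0.0
        step = max(prev_flux - lam * pi[i] * psi[i], 0.0)
        psi.append(psi[i] + step / nu[i])
    return psi

def algorithm_a2(pi, nu, lower, upper, a=1.0, steps=40):
    """Bisection on the sign of pi(psi_k); returns the final bracket (L, U)."""
    for _ in range(steps):
        lam = 0.5 * (lower + upper)
        psi = truncated_vector(lam, pi, nu, a)
        mean = sum(p * x for p, x in zip(pi, psi))
        if mean > 0:        # lam lies above the spectral gap
            upper = lam
        elif mean < 0:      # lam lies below the spectral gap
            lower = lam
        else:               # exact hit
            lower = upper = lam
    return lower, upper

# Test case (an assumption of this sketch): uniform measures on a 6-path.
n = 6
pi = [1.0 / n] * n
nu = [1.0 / n] * (n - 1)
low, high = algorithm_a2(pi, nu, lower=0.01, upper=2.0)
exact_gap = 2.0 - 2.0 * math.cos(math.pi / n)
```

Each pass halves the bracket, so 40 passes shrink a bracket of width about 2 to roughly $2 \cdot 2^{-40}$, in line with Theorem 1.4.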

In Theorem 1.4, the convergence to the spectral gap is exponentially fast with explicit rate $\log 2$. See Remark 2.2 for a discussion on the choice of $L_0$ and $U_0$. For higher order spectra, Miclo has a detailed description of the shape of eigenvectors in [14], and this motivates the definition of similar algorithms for every eigenvalue in the spectrum. See (Di) and Theorem 3.4 for a generalization of (A2) and Theorem 3.14 for a localized version of Theorem 1.3.

The spectral gap is an important parameter in the quantitative analysis of Markov chains. The cutoff phenomenon, a sharp phase transition phenomenon for Markov chains, was introduced by Aldous and Diaconis in the early 1980s. It is of interest in many applications. A heuristic conjecture proposed by Peres in 2004 says that the cutoff exists if and only if the product of the spectral gap and the mixing time tends to infinity. Assuming reversibility, this has been proved to hold for $L^p$-convergence with $1 < p \le \infty$ in [2]. For the $L^1$-convergence, Ding et al. [10] prove this conjecture for continuous time birth and death chains. In order to use Peres' conjecture in practice, the orders of magnitude of the spectral gap and the mixing time are required. The second aspect of this paper is to derive a theoretical lower bound on the spectral gap using only the birth and death rates. This lower bound is obtained using the same idea used to analyze the above algorithm. For estimates on the mixing time of birth and death chains, we refer the readers to the recent work [4] by Chen and Saloff-Coste. For illustration, we consider several examples of specific interest and show that the lower bound provided here is in fact of the correct order in these examples.

This article is organized as follows. In Section 2, the algorithms in (A1)-(A2) are explored and proofs for Theorems 1.2-1.4 are given. In Section 3, the spectrum of $M^G_{\pi,\nu}$ is discussed further and, based on Miclo's work [14], Algorithm (A2) is generalized to any specified eigenvalue of $M^G_{\pi,\nu}$. Our method is applicable to paths of infinite length (one-sided) and this is described in Section 4. For illustration, we consider some Metropolis chains and display numerical results of Algorithm (A2) in Section 5. In Section 6, we focus on uniform measures with bottlenecks and determine the correct order of the spectral gap using the theory in Sections 2-3. It is worthwhile to remark that the assumptions in Section 6 can be relaxed using the comparison technique in [7, 8]. As the work in this paper can also be regarded as a stochastic counterpart of the theory of finite Jacobi matrices, we would like to refer the readers to [18, 19] for a complementary perspective.


2. Convergence to the spectral gap

This section is devoted to proving Theorems 1.2-1.4. First, we prove Theorem 1.1 in the following form.

Lemma 2.1. Let λ > 0 and ψ be a non-constant function on V . Suppose (λ, ψ) solves (1.2) and ψ is monotonic. Then, ψ is strictly monotonic, that is, either ψ(i) < ψ(i + 1) for 1≤ i < n or ψ(i) > ψ(i + 1) for 1 ≤ i < n.

Proof. Obviously, (1.2) implies that $\pi(\psi) = 0$. Without loss of generality, it suffices to consider the case when $\psi(1) < 0$ and $\psi(n) > 0$. Since $\psi$ is non-constant and $\lambda^G_{\pi,\nu} > 0$, we have $\psi(1) < \psi(2)$ and $\psi(n-1) < \psi(n)$. Note that if there are $1 < i < j < n$ such that $\psi(i-1) < \psi(i)$, $\psi(j) < \psi(j+1)$ and $\psi(k) = \psi(i) = \psi(j)$ for $i \le k \le j$, then (1.2) yields

\[ \lambda^G_{\pi,\nu}\,\pi(i)\psi(i) = [\psi(i) - \psi(i-1)]\,\nu(i-1, i) + [\psi(i) - \psi(i+1)]\,\nu(i, i+1) > 0 \]

and

\[ \lambda^G_{\pi,\nu}\,\pi(j)\psi(j) = [\psi(j) - \psi(j-1)]\,\nu(j-1, j) + [\psi(j) - \psi(j+1)]\,\nu(j, j+1) < 0, \]

a contradiction. Thus, $\psi$ is strictly increasing.

We note the following corollary.

Corollary 2.2. Let $(\lambda, \psi)$ be a pair satisfying (1.2). Then, $\lambda = \lambda^G_{\pi,\nu}$ if and only if $\psi$ is monotonic.

Proof. One direction is obvious from Theorem 1.1. For the other direction, assume that $\psi$ is monotonic and let $\phi$ be a minimizer for $\lambda^G_{\pi,\nu}$ with $\pi(\phi) = 0$. Since $(\lambda, \psi)$ and $(\lambda^G_{\pi,\nu}, \phi)$ are solutions to (1.2), one has

\[ \lambda\,\pi(\psi\phi) = \mathcal{E}_\nu(\psi, \phi) = \lambda^G_{\pi,\nu}\,\pi(\phi\psi). \]

By Lemma 2.1, $\psi$ and $\phi$ are strictly monotonic and this implies $\mathcal{E}_\nu(\psi, \phi) \ne 0$. As a consequence of the above equations, we have $\lambda = \lambda^G_{\pi,\nu}$.

The following proposition is the key to Theorem 1.2.

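Proposition 2.3. Suppose that $(\lambda, \psi)$ satisfies $\lambda > 0$, $\psi(1) < 0$ and, for $1 \le i < n$,

\[ \psi(i+1) = \psi(i) + \frac{\{[\psi(i) - \psi(i-1)]\,\nu(i-1, i) - \lambda\pi(i)\psi(i)\}^+}{\nu(i, i+1)}, \tag{2.1} \]

where $t^+ = \max\{t, 0\}$. Then, the following are equivalent.

(1) $\mathcal{E}_\nu(\psi, \psi) = \lambda\,\mathrm{Var}_\pi(\psi)$.

(2) $\pi(\psi) = 0$.

(3) $\lambda = \lambda^G_{\pi,\nu}$.

Furthermore, if $n \ge 3$, then any of the above is equivalent to

(4) $\mathcal{E}_\nu(\psi, \psi) = \lambda^G_{\pi,\nu}\,\mathrm{Var}_\pi(\psi)$.

Remark 2.1. For $n = 2$, it is an easy exercise to show that $\lambda^G_{\pi,\nu} = \nu(1, 2)/(\pi(1)\pi(2))$. By following the formula in (2.1), one has $\psi(2) = \psi(1)[1 - \lambda\pi(1)/\nu(1, 2)]$, which leads to $\mathcal{E}_\nu(\psi, \psi)/\mathrm{Var}_\pi(\psi) = \lambda^G_{\pi,\nu}$.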


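Proof of Proposition 2.3. Set $B = \{1 \le i \le n \mid \psi(i) = \psi(n)\}$ and $B^c = \{1, 2, \dots, i_0\}$. Since $\psi(1) < 0$ and $\lambda > 0$, $\psi(1) < \psi(2)$ and $B^c$ is nonempty. According to (2.1), $\psi$ is non-decreasing. Note that if $\psi(i) = \psi(i+1)$, then $\psi(i) \ge 0$ and $\psi(i+2) = \psi(i+1)$. This implies $\psi$ is strictly increasing on $\{1, 2, \dots, i_0 + 1\}$ and, for $1 \le i \le i_0$,

\[ \lambda\pi(i)\psi(i) = [\psi(i) - \psi(i+1)]\,\nu(i, i+1) + [\psi(i) - \psi(i-1)]\,\nu(i-1, i). \]

Multiplying both sides by $\psi(i)$ and summing over all $i$ in $B^c$ yields

\[
\begin{aligned}
\lambda \sum_{i=1}^{i_0} \psi(i)^2 \pi(i)
&= \sum_{i=1}^{i_0 - 1} [\psi(i) - \psi(i+1)]^2\,\nu(i, i+1) + \psi(i_0)[\psi(i_0) - \psi(i_0+1)]\,\nu(i_0, i_0+1) \\
&= \mathcal{E}_\nu(\psi, \psi) + \psi(i_0+1)[\psi(i_0) - \psi(i_0+1)]\,\nu(i_0, i_0+1) \\
&= \mathcal{E}_\nu(\psi, \psi) + \lambda\psi(n) \sum_{i=1}^{i_0} \psi(i)\pi(i).
\end{aligned}
\]

This is equivalent to

\[ \mathcal{E}_\nu(\psi, \psi) = \lambda\,\mathrm{Var}_\pi(\psi) + \lambda\pi(\psi)[\pi(\psi) - \psi(n)], \tag{2.2} \]

which proves (1)⇔(2).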
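Identity (2.2) holds for every $\lambda > 0$, not only at the spectral gap, which makes it easy to sanity-check. The following sketch uses exact rational arithmetic; the measures and the sampled values of $\lambda$ are hypothetical, lists are 0-based, and $\psi(1) = -1$ is fixed as an assumption of the example.

```python
from fractions import Fraction

def truncated_vector(lam, pi, nu):
    """Recursion (2.1) with psi(1) = -1, in exact arithmetic."""
    psi = [Fraction(-1)]
    for i in range(len(pi) - 1):
        prev_flux = (psi[i] - psi[i - 1]) * nu[i - 1] if i > 0 else Fraction(0)
        step = max(prev_flux - lam * pi[i] * psi[i], Fraction(0))
        psi.append(psi[i] + step / nu[i])
    return psi

def identity_sides(lam, pi, nu):
    """Return (left side, right side) of identity (2.2) for the vector from (2.1)."""
    psi = truncated_vector(lam, pi, nu)
    mean = sum(p * x for p, x in zip(pi, psi))
    var = sum(p * x * x for p, x in zip(pi, psi)) - mean * mean
    energy = sum((psi[i] - psi[i + 1]) ** 2 * nu[i] for i in range(len(nu)))
    return energy, lam * var + lam * mean * (mean - psi[-1])

# Hypothetical measures on a 3-path and a few sample values of lambda.
pi = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]
nu = [Fraction(1, 4), Fraction(1, 5)]
checks = [identity_sides(Fraction(num, 10), pi, nu) for num in (1, 5, 10, 30, 100)]
```

Both entries of each pair in `checks` agree exactly, whether or not the truncation in (2.1) is active for the chosen $\lambda$.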

If $\lambda = \lambda^G_{\pi,\nu}$, then $\psi$ is an eigenvector for $M^G_{\pi,\nu}$ associated to $\lambda^G_{\pi,\nu}$. This proves (3)⇒(2). For (2)⇒(3), assume that $\pi(\psi) = 0$. In this case, $\psi$ must be strictly increasing. Otherwise, $\psi(i) = \psi(n) > 0$ for $i \in B$ and, according to (2.1), this implies

\[ \lambda\,\mathrm{Var}_\pi(\psi) > \lambda \sum_{i=1}^{n-1} \pi(i)\psi^2(i) \ge \sum_{i=1}^{n-1} [\psi(i) - \psi(i+1)]^2\,\nu(i, i+1) = \mathcal{E}_\nu(\psi, \psi), \]

which contradicts (1). As $\psi$ is strictly increasing and $\pi(\psi) = 0$, $(\lambda, \psi)$ solves (1.2). By Corollary 2.2, $\lambda = \lambda^G_{\pi,\nu}$.

To finish the proof, it remains to show (4)⇒(3) ((3)⇒(4) is obvious from the equivalence among (1), (2) and (3)). Assume that $\mathcal{E}_\nu(\psi, \psi) = \lambda^G_{\pi,\nu}\,\mathrm{Var}_\pi(\psi)$. By Lemma 2.1, $\psi$ is strictly monotonic and this implies, for $1 \le i < n$,

\[ \lambda\pi(i)\psi(i) = [\psi(i) - \psi(i+1)]\,\nu(i, i+1) + [\psi(i) - \psi(i-1)]\,\nu(i-1, i). \]

As $\psi$ is a minimizer for $\lambda^G_{\pi,\nu}$, one has, for $1 \le i \le n$,

\[ \lambda^G_{\pi,\nu}\,\pi(i)[\psi(i) - \pi(\psi)] = [\psi(i) - \psi(i+1)]\,\nu(i, i+1) + [\psi(i) - \psi(i-1)]\,\nu(i-1, i). \]

If $\lambda \ne \lambda^G_{\pi,\nu}$, the comparison of both systems yields

\[ \psi(i) = \frac{\lambda^G_{\pi,\nu}\,\pi(\psi)}{\lambda^G_{\pi,\nu} - \lambda}, \quad \forall\, 1 \le i < n. \]

As $n \ge 3$, $\psi(1) = \psi(2)$, a contradiction! This forces $\lambda = \lambda^G_{\pi,\nu}$, as desired.

The following is a simple corollary of Proposition 2.3, which plays an important role in proving Theorem 1.4.

Corollary 2.4. Let n≥ 3. For λ > 0, let ϕλ be the vector generated by (2.1) with


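Proof. Without loss of generality, we fix $\phi_\lambda(1) = -1$ for all $\lambda > 0$. Set $T(\lambda) = \pi(\phi_\lambda)$. To prove this corollary, it suffices to show that

\[ T(\lambda) \begin{cases} < 0 & \text{if } \lambda < \lambda^G_{\pi,\nu}, \\ > 0 & \text{if } \lambda > \lambda^G_{\pi,\nu}. \end{cases} \]

For $\lambda > 0$, define $L(\lambda) := \mathcal{E}_\nu(\phi_\lambda, \phi_\lambda)/\mathrm{Var}_\pi(\phi_\lambda)$. By (2.2), one has

\[ L(\lambda) - \lambda = \frac{\lambda T(\lambda)[\pi(\phi_\lambda) - \phi_\lambda(n)]}{\mathrm{Var}_\pi(\phi_\lambda)}. \tag{2.3} \]

Since $\phi_\lambda$ is non-constant, $\pi(\phi_\lambda) < \phi_\lambda(n)$. This implies $T(\lambda) < 0$ for $\lambda \in (0, \lambda^G_{\pi,\nu})$. For $\lambda > \lambda^G_{\pi,\nu}$, set $I = (\lambda^G_{\pi,\nu}, \infty)$. By Proposition 2.3, $T(\lambda) = 0$ if and only if $\lambda = \lambda^G_{\pi,\nu}$. By the continuity of $T$, this implies either $T(I) \subset (-\infty, 0)$ or $T(I) \subset (0, \infty)$. In the case $T(I) \subset (-\infty, 0)$, one has $L(\lambda) > \lambda$ for $\lambda \in I$. As $L(I)$ is bounded, $L^k(\lambda)$ is convergent with limit $\tilde\lambda > \lambda^G_{\pi,\nu}$ and this yields

\[ 0 = \lim_{k \to \infty} [L^{k+1}(\lambda) - L^k(\lambda)] = \frac{\tilde\lambda\, T(\tilde\lambda)[\pi(\phi_{\tilde\lambda}) - \phi_{\tilde\lambda}(n)]}{\mathrm{Var}_\pi(\phi_{\tilde\lambda})} > 0, \]

a contradiction. Hence, $T(\lambda) > 0$ for $\lambda > \lambda^G_{\pi,\nu}$.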

Proof of Theorem 1.2. The proof for $n = 2$ is obvious from a direct computation and we deal with the case $n \ge 3$ here. By the equivalence of Proposition 2.3 (3)-(4), if $\lambda_0 = \lambda^G_{\pi,\nu}$, then $\lambda_k = \lambda^G_{\pi,\nu}$ for all $k \ge 1$. If $\lambda_0 \ne \lambda^G_{\pi,\nu}$, then $\lambda_k > \lambda^G_{\pi,\nu}$ for $k \ge 1$. Note that $(\lambda_k, \psi_k)$ solves the system in (2.1). By (2.2), this implies

\[ \lambda_{k+1} - \lambda_k = \frac{\lambda_k \pi(\psi_k)[\pi(\psi_k) - \psi_k(n)]}{\mathrm{Var}_\pi(\psi_k)}, \quad \forall\, k \ge 0. \]

The strict monotonicity of $\lambda_k$ in (2) comes immediately from Corollary 2.4. In (3), the continuity of (2.1) in $\lambda$ implies that $(\lambda^*, \psi^*)$ is a solution to (2.1) and $\mathcal{E}_\nu(\psi^*, \psi^*) = \lambda^*\,\mathrm{Var}_\pi(\psi^*)$. By Proposition 2.3, $\lambda^* = \lambda^G_{\pi,\nu}$ and $\pi(\psi^*) = 0$, as desired.

Proof of Theorem 1.3. Recall the notation in the proof of Corollary 2.4: for $\lambda > 0$, let $\phi_\lambda$ be the function defined by (2.1) and $L(\lambda) = \mathcal{E}_\nu(\phi_\lambda, \phi_\lambda)/\mathrm{Var}_\pi(\phi_\lambda)$. By (2.2) and Corollary 2.4, $L(\lambda) \in (\lambda^G_{\pi,\nu}, \lambda)$ for $\lambda > \lambda^G_{\pi,\nu}$. As $L$ is bounded, Theorem 1.3 follows from Lemma A.1.

Proof of Theorem 1.4. Immediate from Corollary 2.4.

At the end of this section, we use the following proposition to describe how the shape of the function $\psi$ in (2.1) evolves with $\lambda$. In Proposition 2.5, we set $\phi_\lambda = \psi$ when $\psi$ is given by (2.1). It is easy to see from (2.1) that $\phi_\lambda$ is strictly increasing up to some index, say $i_0 = i_0(\lambda)$, and then stays constant, equal to $\phi_\lambda(i_0)$, after $i_0$. The proposition shows how the index $i_0(\lambda)$ evolves.

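Proposition 2.5. For $\lambda > 0$, let $\phi_\lambda$ be the function generated by (2.1) with $\phi_\lambda(1) = -1$ and, for $1 \le i \le n$, set $T_i(\lambda) = \sum_{j=1}^{i} \phi_\lambda(j)\pi(j)$. For $1 \le i < n$, let

\[
A_i(\lambda) =
\begin{pmatrix}
a_1(\lambda) & 1 & 0 & \cdots & \cdots & 0 \\
\frac{\pi(3)}{\pi(2)} & a_2(\lambda) & 1 & \ddots & & \vdots \\
0 & \frac{\pi(4)}{\pi(3)} & a_3(\lambda) & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & \ddots & a_{i-1}(\lambda) & 1 \\
0 & \cdots & \cdots & 0 & \frac{\pi(i+1)}{\pi(i)} & a_i(\lambda)
\end{pmatrix},
\tag{2.4}
\]

and let $\lambda^{(i)}$ be the smallest root of $\det A_i(\lambda) = 0$. Then,

(1) $\lambda^G_{\pi,\nu} = \lambda^{(n-1)} < \lambda^{(n-2)} < \cdots < \lambda^{(1)}$.

(2) $\phi_\lambda(i) < \phi_\lambda(i+1) = \phi_\lambda(i+2)$ for $\lambda \in [\lambda^{(i)}, \lambda^{(i-1)})$ and $1 \le i \le n-2$, where $\lambda^{(0)} := \infty$.

(3) $\phi_\lambda(n-1) < \phi_\lambda(n)$ for $\lambda \in (0, \lambda^{(n-2)})$.

In particular, $T_{i+1}(\lambda) = -\pi(1) \det A_i(\lambda)$ for $\lambda \in (0, \lambda^{(i-1)})$ and $(\lambda - \lambda^{(i)})\,T_{i+1}(\lambda) > 0$ for $\lambda \in (0, \lambda^{(i)}) \cup (\lambda^{(i)}, \infty)$ with $1 \le i \le n-1$.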

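Proof. By Lemma A.2, $\lambda^{(1)} > \lambda^{(2)} > \cdots > \lambda^{(n-1)} > 0$ and, for $1 \le i \le n-1$,

\[ \det A_i(\lambda) \begin{cases} > 0 & \forall\, \lambda \in (-\infty, \lambda^{(i)}), \\ < 0 & \forall\, \lambda \in (\lambda^{(i)}, \lambda^{(i-1)}), \end{cases} \tag{2.5} \]

where $\lambda^{(0)} = \infty$. Note that if $T_i(\lambda) < 0$ for some $1 \le i \le n-1$, then

\[ \phi_\lambda(j+1) = \phi_\lambda(j) + \frac{[\phi_\lambda(j) - \phi_\lambda(j-1)]\,\nu(j-1, j) - \lambda\pi(j)\phi_\lambda(j)}{\nu(j, j+1)}, \quad \forall\, 1 \le j \le i. \]

This implies

\[ \phi_\lambda(\ell+1) = \phi_\lambda(\ell) - \frac{\lambda}{\nu(\ell, \ell+1)} \sum_{j=1}^{\ell} \pi(j)\phi_\lambda(j), \quad \forall\, 1 \le \ell \le i. \tag{2.6} \]

Multiplying by $\pi(\ell+1)$ and adding $T_\ell(\lambda)$ yields

\[ T_{\ell+1}(\lambda) = a_\ell(\lambda)\,T_\ell(\lambda) - \frac{\pi(\ell+1)}{\pi(\ell)}\,T_{\ell-1}(\lambda), \quad \forall\, 1 \le \ell \le i. \]

From the above discussion, we conclude that if $T_i(\lambda) < 0$, then

\[ T_{\ell+1}(\lambda) = -\pi(1) \det A_\ell(\lambda), \quad \forall\, 1 \le \ell \le i. \tag{2.7} \]

When $\ell = i - 1$, (2.5) implies $\det A_{i-1}(\lambda) > 0$ for $\lambda < \lambda^{(i-1)}$. By the continuity of $T_i$ and $\det A_{i-1}$, if there is some $\lambda < \lambda^{(i-1)}$ such that $T_i(\lambda) < 0$, then $T_i(\lambda) = -\pi(1) \det A_{i-1}(\lambda)$ for $\lambda < \lambda^{(i-1)}$. As a consequence of (2.7) with $\ell = i$, this will imply $T_{i+1}(\lambda) = -\pi(1) \det A_i(\lambda)$ for $\lambda < \lambda^{(i-1)}$. Hence, it remains to show that $T_i(\lambda) < 0$ for some $\lambda < \lambda^{(i-1)}$. To see this, according to Corollary 2.4, one can choose a constant $\tilde\lambda < \min\{\lambda^G_{\pi,\nu}, \lambda^{(i-1)}\}$ such that $T_{n-1}(\tilde\lambda) < 0$. Since $\phi_\lambda(i)$ is non-decreasing in $i$, we obtain $T_i(\tilde\lambda) < 0$, as desired. This proves $T_{i+1}(\lambda) = -\pi(1) \det A_i(\lambda)$ for $\lambda < \lambda^{(i-1)}$. In particular, $T_n(\lambda) = -\pi(1) \det A_{n-1}(\lambda)$ for $\lambda < \lambda^{(n-2)}$. By Corollary 2.4, we have $\lambda^{(n-1)} = \lambda^G_{\pi,\nu}$.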


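Next, observe that, for $\lambda \in (\lambda^{(i)}, \lambda^{(i-1)})$,

\[ \sum_{j=1}^{i+1} \pi(j)\phi_\lambda(j) = T_{i+1}(\lambda) > 0, \qquad \sum_{j=1}^{i} \pi(j)\phi_\lambda(j) = T_i(\lambda) < 0. \]

By (2.6), it is easy to see that $[\phi_\lambda(i+1) - \phi_\lambda(i)]\,\nu(i, i+1) = -\lambda T_i(\lambda) > 0$ and

\[
\begin{aligned}
[\phi_\lambda(i+2) - \phi_\lambda(i+1)]\,\nu(i+1, i+2)
&= \{[\phi_\lambda(i+1) - \phi_\lambda(i)]\,\nu(i, i+1) - \lambda\pi(i+1)\phi_\lambda(i+1)\}^+ \\
&= \{-\lambda T_{i+1}(\lambda)\}^+ = 0.
\end{aligned}
\]

This proves Proposition 2.5 (2). To prove Proposition 2.5 (3), we use (1) to derive

\[ T_{n-1}(\lambda) = -\pi(1) \det A_{n-2}(\lambda) < 0, \quad \forall\, \lambda \in (0, \lambda^{(n-2)}). \]

Using (2.6), this implies $\phi_\lambda(n-1) < \phi_\lambda(n)$. The last part of Proposition 2.5 follows easily from (2.5) and the fact that

\[ T_i(\lambda) \ge 0 \Rightarrow T_{i+1}(\lambda) > 0 \quad \text{and} \quad T_i(\lambda) \le 0 \Rightarrow T_{i-1}(\lambda) < 0. \]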

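Remark 2.2. In Proposition 2.5, if $\lambda > \lambda^{(1)} = \nu(1, 2)[\pi(1)^{-1} + \pi(2)^{-1}]$, then $\phi_\lambda(i) = \phi_\lambda(2)$ for $i = 2, \dots, n$. Note that, for $\lambda \ge \lambda^{(1)}$, $\phi_\lambda(2) = -1 + \lambda\pi(1)/\nu(1, 2)$ and

\[ \pi(\phi_\lambda) = -1 + \frac{\lambda\pi(1)(1 - \pi(1))}{\nu(1, 2)}, \qquad \mathrm{Var}_\pi(\phi_\lambda) = \frac{\lambda^2 \pi(1)^3 (1 - \pi(1))}{\nu(1, 2)^2}. \]

By (2.3), this leads to $L(\lambda) = \nu(1, 2)/[\pi(1)(1 - \pi(1))]$ for $\lambda \ge \lambda^{(1)}$. In the case $n = 2$, it is clear that $\nu(1, 2)/[\pi(1)(1 - \pi(1))] = \nu(1, 2)[\pi(1)^{-1} + \pi(2)^{-1}] = \lambda^G_{\pi,\nu}$.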
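The plateau described in Remark 2.2 can be confirmed numerically. The sketch below is only an illustration: the measures are hypothetical, lists are 0-based, and exact rationals are used. For these values the threshold is $\lambda^{(1)} = 9/4$ and the constant value of $L$ above it is $9/5$.

```python
from fractions import Fraction

def truncated_vector(lam, pi, nu):
    """Recursion (2.1) with phi(1) = -1, in exact arithmetic."""
    phi = [Fraction(-1)]
    for i in range(len(pi) - 1):
        prev_flux = (phi[i] - phi[i - 1]) * nu[i - 1] if i > 0 else Fraction(0)
        step = max(prev_flux - lam * pi[i] * phi[i], Fraction(0))
        phi.append(phi[i] + step / nu[i])
    return phi

def rayleigh_quotient(f, pi, nu):
    """L(lambda) = E_nu(f, f) / Var_pi(f)."""
    energy = sum((f[i] - f[i + 1]) ** 2 * nu[i] for i in range(len(nu)))
    mean = sum(p * x for p, x in zip(pi, f))
    var = sum(p * x * x for p, x in zip(pi, f)) - mean * mean
    return energy / var

# Hypothetical measures on a 3-path.
pi = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]
nu = [Fraction(1, 4), Fraction(1, 5)]
threshold = nu[0] * (1 / pi[0] + 1 / pi[1])      # lambda^(1)
plateau = nu[0] / (pi[0] * (1 - pi[0]))          # value of L above the threshold
values = [rayleigh_quotient(truncated_vector(lam, pi, nu), pi, nu)
          for lam in (threshold, Fraction(3), Fraction(10))]
```

Above the threshold the vector has a single jump between vertices 1 and 2, so one step of (A1) started from any such $\lambda_0$ lands on the same value `plateau`.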

3. Convergence to other eigenvalues

In this section, we generalize the algorithms (A1) and (A2) so that they can be applied to the computation of any specified eigenvalue.

3.1. Basic setup and fundamental results. Recall that $G$ is a graph with vertex set $V = \{1, 2, \dots, n\}$ and edge set $E = \{\{i, i+1\} \mid i = 1, 2, \dots, n-1\}$. Given two positive measures $\pi, \nu$ on $V, E$ with $\pi(V) = 1$, let $M^G_{\pi,\nu}$ be the $n$-by-$n$ matrix defined in the introduction and given by

\[
M^G_{\pi,\nu}(i, j) =
\begin{cases}
-\nu(i, j)/\pi(i) & \text{if } |i - j| = 1, \\
[\nu(i-1, i) + \nu(i, i+1)]/\pi(i) & \text{if } j = i, \\
0 & \text{if } |i - j| > 1.
\end{cases}
\tag{3.1}
\]

Since $\nu$ is positive everywhere and $M^G_{\pi,\nu}$ is tridiagonal, all eigenvalues of $M^G_{\pi,\nu}$ have algebraic multiplicity 1. Throughout this section, let $\lambda^G_0 < \lambda^G_1 < \cdots < \lambda^G_{n-1}$ denote the eigenvalues of $M^G_{\pi,\nu}$ with associated $L^2(\pi)$-normalized eigenvectors $\zeta_0 = \mathbf{1}, \zeta_1, \dots, \zeta_{n-1}$. Clearly, $\lambda^G_0 = 0$, $\lambda^G_1 = \lambda^G_{\pi,\nu}$ and, for $1 \le k \le n$,

\[ \lambda^G_i\,\zeta_i(k)\pi(k) = [\zeta_i(k) - \zeta_i(k-1)]\,\nu(k-1, k) + [\zeta_i(k) - \zeta_i(k+1)]\,\nu(k, k+1). \tag{3.2} \]

Let $1 \le i \le n-1$. As $\zeta_i$ is non-constant, it is clear that $\zeta_i(1) \ne \zeta_i(2)$ and $\zeta_i(n-1) \ne \zeta_i(n)$. Moreover, if $\zeta_i(k) = \zeta_i(k+1)$ for some $1 < k < n$, then


there are exactly $i$ sign changes for $\zeta_i$ with $1 \le i \le n$. Miclo [14] gives a detailed description of the shape of $\zeta_i$ as follows.

Theorem 3.1. For $1 \le i \le n-1$, let $\zeta_i$ be an eigenvector associated to the $i$th smallest non-zero eigenvalue of the matrix in (3.1) with $\zeta_i(1) < 0$. Then, there are $1 = a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_i < b_i = n$ with $a_{j+1} - b_j \in \{0, 1\}$ such that $\zeta_i$ is strictly increasing on $[a_j, b_j]$ for odd $j$ and is strictly decreasing on $[a_j, b_j]$ for even $j$, and $\zeta_i(a_{j+1}) = \zeta_i(b_j)$ for $1 \le j < i$.

In the following, we make some analysis related to the Euler-Lagrange equations in (3.2).

Definition 3.1. Fix $n \ge 1$ and let $f$ be a function on $\{1, 2, \dots, n\}$. For $1 \le i \le n-1$, $f$ is called "Type $i$" if there are $1 = a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_i < b_i \le n$ satisfying $a_{j+1} - b_j \in \{0, 1\}$ such that

(1) $f$ is strictly monotonic on $[a_j, b_j]$ for $1 \le j \le i$.

(2) $[f(a_j) - f(a_j + 1)][f(a_{j+1}) - f(a_{j+1} + 1)] < 0$ for $1 \le j < i$.

(3) $f(a_{j+1}) = f(b_j)$, for $1 \le j < i$, and $f(k) = f(b_i)$, for $b_i \le k \le n$.

The points $a_j, b_j$ will be called "peak-valley points" in this paper.

Remark 3.1. Note that the difference between Definition 3.1 and Theorem 3.1 is the requirement $b_i \le n$, instead of $b_i = n$. By Theorem 3.1, any eigenvector associated to the $i$th smallest non-zero eigenvalue of the matrix in (3.1) must be of type $i$ with $b_i = n$.

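Definition 3.2. Let $\pi, \nu$ be positive measures on $V, E$ with $\pi(V) = 1$. For $\lambda \in \mathbb{R}$, let $\xi_\lambda$ be a function on $\{1, 2, \dots, n\}$ defined by $\xi_\lambda(1) = -1$ and, for $1 \le k < n$,

\[ \xi_\lambda(k+1) = \xi_\lambda(k) + \frac{[\xi_\lambda(k) - \xi_\lambda(k-1)]\,\nu(k-1, k) - \lambda\pi(k)\xi_\lambda(k)}{\nu(k, k+1)}. \]

Remark 3.2. Note that $\xi_0 = -\mathbf{1}$ and, for $\lambda < 0$, $\xi_\lambda$ is strictly decreasing and of type 1. For $\lambda > 0$, if $\xi_\lambda(k-1) < \xi_\lambda(k) = \xi_\lambda(k+1)$, then $\xi_\lambda(k) > 0$ and this implies $\xi_\lambda(k+2) < \xi_\lambda(k+1)$. Similarly, if $\xi_\lambda(k-1) > \xi_\lambda(k) = \xi_\lambda(k+1)$, then $\xi_\lambda(k) < 0$ and $\xi_\lambda(k+2) > \xi_\lambda(k+1)$. Thus, $\xi_\lambda$ must be of type $i$ for some $1 \le i \le n-1$.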
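To make Definitions 3.1 and 3.2 concrete, the following sketch computes $\xi_\lambda$ exactly and reads off its type. It relies on an assumption drawn from Remark 3.2: for $\lambda > 0$, a flat step of $\xi_\lambda$ occurs only at a turning point or at the right end, so the type equals the number of maximal runs of equal sign among the non-zero increments. The function names and the uniform 3-path example are hypothetical.

```python
from fractions import Fraction

def xi(lam, pi, nu):
    """Definition 3.2: xi(1) = -1 and the untruncated three-term recursion."""
    vals = [Fraction(-1)]
    for k in range(len(pi) - 1):
        prev_flux = (vals[k] - vals[k - 1]) * nu[k - 1] if k > 0 else 0
        vals.append(vals[k] + (prev_flux - lam * pi[k] * vals[k]) / nu[k])
    return vals

def shape_type(f):
    """Number of strictly monotone pieces of f, ignoring flat steps.
    For xi_lambda with lambda > 0 this is taken to be the type of Definition 3.1."""
    increments = [f[k + 1] - f[k] for k in range(len(f) - 1)]
    signs = [(d > 0) - (d < 0) for d in increments if d != 0]
    return 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)

# Uniform measures on a 3-path (illustrative choice).
third = Fraction(1, 3)
pi = [third] * 3
nu = [third] * 2
below = xi(Fraction(3, 2), pi, nu)    # strictly increasing
flat = xi(Fraction(2), pi, nu)        # last two values coincide
above = xi(Fraction(5, 2), pi, nu)    # increases, then decreases
```

In this toy case $\xi_\lambda = (-1,\ \lambda - 1,\ -\lambda^2 + 3\lambda - 1)$, so the type switches from 1 to 2 as $\lambda$ crosses 2, the value at which $\xi_\lambda(2) = \xi_\lambda(3)$.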

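Lemma 3.2. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2. Suppose that $\xi_\lambda$ is of type $i$ with $1 \le i \le n-1$.

(1) If $\xi_\lambda(n-1) \ne \xi_\lambda(n)$, then there is $\epsilon > 0$ such that $\xi_{\lambda+\delta}$ is of type $i$ for $-\epsilon < \delta < \epsilon$.

(2) If $\xi_\lambda(n-1) = \xi_\lambda(n)$, then there is $\epsilon > 0$ such that $\xi_{\lambda+\delta}$ is of type $i+1$ and $\xi_{\lambda-\delta}$ is of type $i$ for $0 < \delta < \epsilon$.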

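Proof. Let $a_j, b_j$ be the peak-valley points of $\xi_\lambda$. By the continuity of $\xi_\lambda$ in $\lambda$ and Remark 3.2, one can choose $\epsilon > 0$ such that, for $\delta \in (-\epsilon, \epsilon)$, $\xi_{\lambda+\delta}$ remains strictly monotonic on $[a_j, b_j]$ for $j = 1, \dots, i$ and

\[ [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(a_{j+1} + 1) - \xi_{\lambda+\delta}(a_{j+1})] > 0, \]

for $1 \le j < i$. In (1), $b_i = n$. Fix $\delta \in (-\epsilon, \epsilon)$ and set $a'_1 = a_1 = 1$, $b'_i = b_i = n$. For $1 < j < i$, set

\[
\begin{cases}
b'_j = a'_{j+1} = b_j & \text{if } [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(b_j) - \xi_{\lambda+\delta}(a_{j+1})] < 0, \\
b'_j = a'_{j+1} = a_{j+1} & \text{if } [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(b_j) - \xi_{\lambda+\delta}(a_{j+1})] > 0, \\
b'_j = b_j,\ a'_{j+1} = a_{j+1} & \text{if } [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(b_j) - \xi_{\lambda+\delta}(a_{j+1})] = 0.
\end{cases}
\]

Clearly, $\xi_{\lambda+\delta}$ is of type $i$ with peak-valley points $a'_j, b'_j$. This proves Lemma 3.2 (1).

For part (2), we consider $i \le n-2$ and $b_i = n-1$. By a similar argument as before, one can choose $\epsilon > 0$ such that the restriction of $\xi_{\lambda+\delta}$ to $\{1, 2, \dots, n-1\}$ is of type $i$ for $\delta \in (-\epsilon, \epsilon)$. To finish the proof, it remains to compare $\xi_{\lambda+\delta}(n-1)$ and $\xi_{\lambda+\delta}(n)$. Recall that $T_j(\lambda) = \sum_{k=1}^{j} \xi_\lambda(k)\pi(k)$ as in the proof of Proposition 2.5. Using a similar reasoning as for (2.7), one shows that $T_{i+1}(\lambda) = -\pi(1) \det A_i(\lambda)$ for $1 \le i < n$, where $A_i(\lambda)$ is the matrix in (2.4). This implies that the non-zero eigenvalues of $M^G_{\pi,\nu}$, say $\lambda^G_1, \dots, \lambda^G_{n-1}$, are the roots of $\det A_{n-1}(\lambda) = 0$. As a consequence of Lemma A.2, $\det A_{n-2}(\lambda) = 0$ has exactly $n-2$ distinct roots, say $\alpha_1 < \alpha_2 < \cdots < \alpha_{n-2}$, and they satisfy the interlacing property $\lambda^G_j < \alpha_j < \lambda^G_{j+1}$ for $1 \le j \le n-2$. Note that $\det A_{n-2}(\lambda)$ and $\det A_{n-1}(\lambda)$ tend to infinity as $-\lambda$ tends to infinity. This leads to the fact that if $\det A_{n-2}(\lambda) = 0$ and $\det A_{n-1}(\lambda) < 0$, then $\det A_{n-2}(\cdot)$ is strictly decreasing in a neighborhood of $\lambda$. If $\det A_{n-2}(\lambda) = 0$ and $\det A_{n-1}(\lambda) > 0$, then $\det A_{n-2}(\cdot)$ is strictly increasing in a neighborhood of $\lambda$.

Back to the proof of (2). Suppose that $\xi_\lambda(n-2) < \xi_\lambda(n-1)$. By Remark 3.2, it is easy to check that $T_{n-1}(\lambda) = 0$ and $T_n(\lambda) > 0$ or, equivalently, $\det A_{n-2}(\lambda) = 0$ and $\det A_{n-1}(\lambda) < 0$. According to the conclusion in the previous paragraph, we can find $\epsilon > 0$ such that $\det A_{n-2}(\cdot)$ is strictly decreasing on $(\lambda - \epsilon, \lambda + \epsilon)$, which yields

\[
\xi_{\lambda+\delta}(n) = \xi_{\lambda+\delta}(n-1) - \frac{(\lambda + \delta)\,T_{n-1}(\lambda + \delta)}{\nu(n-1, n)}
\begin{cases}
< \xi_{\lambda+\delta}(n-1) & \text{if } 0 < \delta < \epsilon, \\
> \xi_{\lambda+\delta}(n-1) & \text{if } -\epsilon < \delta < 0.
\end{cases}
\]

This gives the desired property in Lemma 3.2 (2). The other case, $\xi_\lambda(n-2) > \xi_\lambda(n-1)$, can be proved in the same way and we omit the details.

The following proposition characterizes the shape of $\xi_\lambda$ for $\lambda > 0$.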

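Proposition 3.3. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2. Let $\lambda^G_1 < \cdots < \lambda^G_{n-1}$ be the non-zero eigenvalues of $M^G_{\pi,\nu}$ in (3.1) and $\alpha_1 < \cdots < \alpha_{n-2}$ be the zeros of $\det A_{n-2}(\lambda)$, where $A_{n-2}(\cdot)$ is the matrix in (2.4). Then,

(1) $\lambda^G_j < \alpha_j < \lambda^G_{j+1}$, for $1 \le j \le n-2$.

(2) $\xi_\lambda$ is of type $j$ for $\lambda \in (\alpha_{j-1}, \alpha_j]$ and $1 \le j \le n-1$, where $\alpha_0 := 0$ and $\alpha_{n-1} := \infty$.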

Proof. (1) is immediate from Lemma A.2. For (2), note that $\alpha_i$ is an eigenvalue of the submatrix of $M^G_{\pi,\nu}$ obtained by removing the $n$th row and column. This implies $\xi_{\alpha_i}(n-1) = \xi_{\alpha_i}(n)$ for $i = 1, \dots, n-2$ and $\xi_\lambda(n-1) \ne \xi_\lambda(n)$ for $\lambda > 0$ and $\lambda \notin \{\alpha_1, \dots, \alpha_{n-2}\}$. By Lemma 3.2, $\xi_\lambda$ is of type $i$ for $\alpha_{i-1} < \lambda \le \alpha_i$.

Given $\lambda > 0$, the above proposition provides a simple criterion to determine which of the intervals $(\alpha_j, \alpha_{j+1}]$ contains $\lambda$, namely the type of $\xi_\lambda$. However, knowing the type of $\xi_\lambda$ is not sufficient to determine whether $\lambda$ is bigger or smaller than $\lambda^G_i$. We need the following remark.

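Remark 3.3. Using the same argument as in the proof of Proposition 2.5, one can show that $\pi(\xi_\lambda) = -\pi(1) \det A_{n-1}(\lambda)$, where $A_{n-1}(\lambda)$ is the matrix in (2.4). Clearly, $\pi(\xi_\lambda)$ has zeros $\lambda^G_1, \dots, \lambda^G_{n-1}$ and tends to minus infinity as $\lambda$ tends to minus infinity. This implies that $\pi(\xi_\lambda) < 0$, for $\lambda < \lambda^G_1$, and

\[ \pi(\xi_\lambda) > 0 \quad \forall\, \lambda \in (\lambda^G_{2i-1}, \lambda^G_{2i}), \qquad \pi(\xi_\lambda) < 0 \quad \forall\, \lambda \in (\lambda^G_{2i}, \lambda^G_{2i+1}), \]

for $i \ge 1$, where $\lambda^G_n := \infty$.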
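The sign pattern in Remark 3.3 can be observed directly on a small case. The sketch below is illustrative only (hypothetical function names, 0-based lists, exact rationals). It uses uniform measures on a 3-path, for which $M^G_{\pi,\nu}$ has eigenvalues $0, 1, 3$ and $\pi(\xi_\lambda) = -(\lambda - 1)(\lambda - 3)/3$.

```python
from fractions import Fraction

def xi(lam, pi, nu):
    """Definition 3.2: xi(1) = -1 and the untruncated recursion."""
    vals = [Fraction(-1)]
    for k in range(len(pi) - 1):
        prev_flux = (vals[k] - vals[k - 1]) * nu[k - 1] if k > 0 else 0
        vals.append(vals[k] + (prev_flux - lam * pi[k] * vals[k]) / nu[k])
    return vals

def mean_xi(lam, pi, nu):
    """pi(xi_lambda), a polynomial in lambda whose zeros are the non-zero eigenvalues."""
    return sum(p * x for p, x in zip(pi, xi(lam, pi, nu)))

third = Fraction(1, 3)
pi = [third] * 3
nu = [third] * 2
# Sample points below, at and between the eigenvalues 1 and 3.
samples = {lam: mean_xi(lam, pi, nu) for lam in (Fraction(1, 2), 1, 2, 3, 4)}
```

The values in `samples` are negative below 1, zero at 1 and 3, positive in between and negative above 3, which is the alternation that a bisection on the sign of $\pi(\xi_\lambda)$ can exploit.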

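As a consequence of Proposition 3.3 and Remark 3.3, we obtain the following dichotomy algorithm, which is a generalization of (A2). Let $1 \le i \le n-1$.

(Di) Choose positive reals $L_0 < \lambda^G_i < U_0$ and set, for $\ell = 0, 1, \dots$,

1. $\xi_{\lambda_\ell}$ to be the function generated by $\lambda_\ell = (L_\ell + U_\ell)/2$ in Definition 3.2,

2. According to Definition 3.1, set

\[
\begin{cases}
L_{\ell+1} = L_\ell,\ U_{\ell+1} = \lambda_\ell & \text{if } \xi_{\lambda_\ell} \text{ is of type } j \text{ with } j > i, \text{ or if } \xi_{\lambda_\ell} \text{ is of type } i \text{ and } (-1)^{i-1}\pi(\xi_{\lambda_\ell}) > 0, \\
U_{\ell+1} = U_\ell,\ L_{\ell+1} = \lambda_\ell & \text{if } \xi_{\lambda_\ell} \text{ is of type } j \text{ with } j < i, \text{ or if } \xi_{\lambda_\ell} \text{ is of type } i \text{ and } (-1)^{i-1}\pi(\xi_{\lambda_\ell}) < 0, \\
L_{\ell+1} = U_{\ell+1} = \lambda_\ell & \text{if } \xi_{\lambda_\ell} \text{ is of type } i \text{ and } \pi(\xi_{\lambda_\ell}) = 0.
\end{cases}
\]

Theorem 3.4. Referring to (Di),

\[ 0 \le \max\{U_\ell - \lambda^G_i,\ \lambda^G_i - L_\ell\} \le (U_0 - L_0)\,2^{-\ell}, \quad \forall\, \ell \ge 0. \]

Proof. Immediate from Proposition 3.3 and Remark 3.3.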
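A possible rendering of (Di) in code is sketched below. It is an illustration under assumptions rather than a reference implementation: function names are hypothetical, lists are 0-based, the type of $\xi_\lambda$ is computed by counting sign runs of its non-zero increments (justified for $\lambda > 0$ by Remark 3.2), and the uniform 6-path is used because its eigenvalues $2 - 2\cos(i\pi/6)$ are known in closed form. The bracket $[10^{-3}, 5]$ is an arbitrary choice containing all non-zero eigenvalues of this example.

```python
import math

def xi(lam, pi, nu):
    """Definition 3.2 in floating point: xi(1) = -1, untruncated recursion."""
    vals = [-1.0]
    for k in range(len(pi) - 1):
        prev_flux = (vals[k] - vals[k - 1]) * nu[k - 1] if k > 0 else 0.0
        vals.append(vals[k] + (prev_flux - lam * pi[k] * vals[k]) / nu[k])
    return vals

def shape_type(f):
    """Number of strictly monotone pieces of f, ignoring flat steps."""
    increments = [f[k + 1] - f[k] for k in range(len(f) - 1)]
    signs = [(d > 0) - (d < 0) for d in increments if d != 0]
    return 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def algorithm_d(i, pi, nu, lower, upper, steps=50):
    """Bisection for the i-th smallest non-zero eigenvalue, following (Di)."""
    for _ in range(steps):
        lam = 0.5 * (lower + upper)
        f = xi(lam, pi, nu)
        j = shape_type(f)
        signed_mean = (-1) ** (i - 1) * sum(p * x for p, x in zip(pi, f))
        if j > i or (j == i and signed_mean > 0):
            upper = lam
        elif j < i or (j == i and signed_mean < 0):
            lower = lam
        else:
            lower = upper = lam
    return lower, upper

# Test case (an assumption of this sketch): uniform measures on a 6-path.
n = 6
pi = [1.0 / n] * n
nu = [1.0 / n] * (n - 1)
brackets = [algorithm_d(i, pi, nu, 1e-3, 5.0) for i in range(1, n)]
exact = [2.0 - 2.0 * math.cos(i * math.pi / n) for i in range(1, n)]
```

Each call costs one pass of the recursion per bisection step, so computing all $n - 1$ non-zero eigenvalues to a fixed accuracy takes on the order of $n^2$ operations, as stated in the abstract.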

Proposition 3.3 (2) bounds the eigenvalues using the shape of $\xi_\lambda$ generated from one end point. We now introduce some other criteria to bound eigenvalues using the shape of $\xi_\lambda$ from either boundary point. Those results will be used to prove Theorem 6.1.

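Proposition 3.5. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2 and $\tilde\xi_\lambda$ be a function given by

\[ \tilde\xi_\lambda(k-1) = \tilde\xi_\lambda(k) + \frac{[\tilde\xi_\lambda(k) - \tilde\xi_\lambda(k+1)]\,\nu(k, k+1) - \lambda\pi(k)\tilde\xi_\lambda(k)}{\nu(k-1, k)}, \]

for $k = n, n-1, \dots, 2$ with $\tilde\xi_\lambda(n) = -1$. Let $\lambda^G_0 < \cdots < \lambda^G_{n-1}$ be the eigenvalues of $M^G_{\pi,\nu}$ in (3.1) and let $f|_B$ be the restriction of $f$ to a subset $B$ of $V$. Suppose $1 \le k_0 \le n$.

(1) If $\xi_\lambda|_{\{1,\dots,k_0\}}$ is of type $i$ with $(-1)^i \xi_\lambda(k_0) > 0$ and $\tilde\xi_\lambda|_{\{k_0,\dots,n\}}$ is of type $j$ with $(-1)^j \tilde\xi_\lambda(k_0) > 0$, then $\lambda^G_{i+j-2} < \lambda < \lambda^G_{i+j-1}$.

(2) If $\xi_\lambda|_{\{1,\dots,k_0\}}$ is of type $i$ with $(-1)^i \xi_\lambda(k_0) < 0$ and $\tilde\xi_\lambda|_{\{k_0,\dots,n\}}$ is of type $j$ with $(-1)^j \tilde\xi_\lambda(k_0) < 0$, then $\lambda^G_{i+j-1} < \lambda < \lambda^G_{i+j+1}$.

(3) If $\xi_\lambda|_{\{1,\dots,k_0\}}$ is of type $i$ with $(-1)^i \xi_\lambda(k_0) > 0$ and $\tilde\xi_\lambda|_{\{k_0,\dots,n\}}$ is of type $j$ with $(-1)^j \tilde\xi_\lambda(k_0) < 0$, then $\lambda^G_{i+j-2} < \lambda < \lambda^G_{i+j}$.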

Proof. By Proposition 3.3, $\xi_\lambda(n)$ is a polynomial of degree $n-1$ satisfying

\[ (-1)^{i+1}\,\xi_{\lambda^G_i}(n) > 0, \quad \forall\, 0 \le i < n, \qquad (-1)^{i+1}\,\xi_{\beta_i}(n) > 0, \quad \forall\, 1 \le i < n-1. \]

This implies that there are $w_i \in (\beta_i, \lambda^G_{i+1})$, $0 \le i \le n-2$, such that $(-1)^{i+1}\xi_\lambda(n) > 0$ for $\lambda \in (w_{i-1}, w_i)$ and $0 \le i \le n-1$ with $w_{-1} = -\infty$ and $w_{n-1} = \infty$.

The proofs for (1)-(3) in Proposition 3.5 are similar and we deal with (1) only. By the Euler-Lagrange equations in (3.2), it is easy to see that, for $1 \le l < n$, $\xi_{\lambda^G_l}$ and $\tilde\xi_{\lambda^G_l}$ are eigenvectors of $M^G_{\pi,\nu}$ in (3.1) associated with $\lambda^G_l$, which implies $\xi_{\lambda^G_l} = -\xi_{\lambda^G_l}(n)\,\tilde\xi_{\lambda^G_l}$. First, assume that $\lambda \le \lambda^G_{i+j-2}$. By Proposition 3.3, $\xi_{\lambda^G_{i+j-2}}|_{\{1,\dots,k_0\}}$ is of type at least $i$ and $\tilde\xi_{\lambda^G_{i+j-2}}|_{\{k_0,\dots,n\}}$ is of type at least $j$. This implies that the patching of $\xi_{\lambda^G_{i+j-2}}|_{\{1,\dots,k_0\}}$ and $-\xi_{\lambda^G_{i+j-2}}(n)\,\tilde\xi_{\lambda^G_{i+j-2}}|_{\{k_0,\dots,n\}}$, which equals $\xi_{\lambda^G_{i+j-2}}$, is of type at least $i+j-1$. This is a contradiction.

Next, assume that $\lambda \ge \lambda^G_{i+j-1}$. By Proposition 3.3, we may choose $a_1 < \lambda$ (resp. $a_2 < \lambda$) such that $\xi_\lambda|_{\{1,\dots,k_0\}}$ (resp. $\tilde\xi_\lambda|_{\{k_0,\dots,n\}}$) changes the type at $a_1$ (resp. $a_2$). If $\lambda^G_{i+j-1} \le \min\{a_1, a_2\}$, then a similar reasoning as before implies that $\xi_{\lambda^G_{i+j-1}}$ is of type at most $i+j-2$, a contradiction. If $\min\{a_1, a_2\} < \lambda^G_{i+j-1} < \max\{a_1, a_2\}$, then exactly one of $\xi_{\lambda^G_{i+j-1}}|_{\{1,\dots,k_0\}}$ and $\tilde\xi_{\lambda^G_{i+j-1}}|_{\{k_0,\dots,n\}}$ does not change its type. This implies that the gluing point $k_0$ cannot be a local extremum and, thus, the patching function is of type at most $i+j-2$, another contradiction! According to the discussion in the first paragraph of this proof, if $\lambda^G_{i+j-1} \ge \max\{a_1, a_2\}$, then neither $\xi_{\lambda^G_{i+j-1}}|_{\{1,\dots,k_0\}}$ nor $\tilde\xi_{\lambda^G_{i+j-1}}|_{\{k_0,\dots,n\}}$ changes type nor, of course, the sign at $k_0$. Consequently, we obtain $(-1)^{i+j}\,\xi_{\lambda^G_{i+j-1}}(k_0)\,\tilde\xi_{\lambda^G_{i+j-1}}(k_0) > 0$, which contradicts the fact $\xi_{\lambda^G_{i+j-1}} = -\xi_{\lambda^G_{i+j-1}}(n)\,\tilde\xi_{\lambda^G_{i+j-1}}$.

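Proposition 3.6. For $\lambda > 0$ and $1 \le k \le n-1$, let $s_k(\lambda)$ be the $k$th sign change of $\xi_\lambda$ defined by $s_0 := 0$ and

\[ s_{k+1}(\lambda) := \inf\{l > s_k(\lambda) \mid \xi_\lambda(l)\xi_\lambda(l-1) < 0 \text{ or } \xi_\lambda(l) = 0\}, \]

where $\inf \emptyset := n+1$. Then, for $0 < \lambda_1 < \lambda_2$, $s_k(\lambda_1) \ge s_k(\lambda_2)$ for all $1 \le k \le n-1$.

Proof. Let $1 \le k \le n-1$. If $s_k(\lambda_1) = n+1$, then it is clear that $s_k(\lambda_1) \ge s_k(\lambda_2)$. Suppose that $s_k(\lambda_1) = \ell \le n$. Obviously, $\xi_{\lambda_1}|_{\{1,\dots,\ell\}}$ is of type $k$. Referring to (2.4), let $\lambda^\ell_1, \dots, \lambda^\ell_{\ell-1}$ be the roots of $\det A_{\ell-1}(\lambda) = 0$ and $\alpha^\ell_1, \dots, \alpha^\ell_{\ell-2}$ be the roots of $\det A_{\ell-2}(\lambda) = 0$. According to the first paragraph of the proof of Proposition 3.5, there are $w^\ell_i \in (\alpha^\ell_{i-1}, \lambda^\ell_i)$ with $1 \le i \le \ell-1$ such that $(-1)^{i+1}\xi_\lambda(\ell) > 0$ for $\lambda \in (w^\ell_i, w^\ell_{i+1})$ and $1 \le i \le \ell-1$, where $\alpha^\ell_0 := 0$. Since $\xi_{\lambda_1}(\ell)\,\xi_{\lambda^\ell_k}(\ell) \ge 0$, one has $w^\ell_k \le \lambda_1 < \alpha^\ell_k$. As it is assumed that $\lambda_2 > \lambda_1$, if $\lambda_2 > \alpha^\ell_k$, then $\xi_{\lambda_2}|_{\{1,\dots,\ell\}}$ is of type at least $k+1$ and, consequently, $s_k(\lambda_2) < \ell = s_k(\lambda_1)$. If $\lambda_1 < \alpha^\ell_k$, then $\xi_{\lambda_2}|_{\{1,\dots,\ell\}}$ is of type $k$ and $\xi_{\lambda_2}(\ell) < 0$. This implies $s_k(\lambda_2) \le \ell = s_k(\lambda_1)$, as desired.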

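3.2. Bounding eigenvalues from below. Motivated by Theorem 3.1, we introduce another scheme generalizing (2.1) to bound the other eigenvalues of $M^G_{\pi,\nu}$ from below.

Definition 3.3. For $\lambda > 0$, let $\xi_\lambda$ be a function in Definition 3.2. If $\xi_\lambda$ is of type $i$, $1 \le i \le n-1$, with peak-valley points $1 = a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_i < b_i \le n$, then define

\[ \xi^{(j)}_\lambda(k) = \begin{cases} \xi_\lambda(k) & \text{for } k \le b_j, \\ \xi_\lambda(b_j) & \text{for } k > b_j, \end{cases} \quad \forall\, 1 \le j < i. \]

Remark 3.4. For $\lambda > 0$, if $\xi_\lambda$ is of type $i$, then $\xi^{(j)}_\lambda$ is of type $j$ for $j < i$. Moreover, for $k < b_j$,

\[
\begin{aligned}
\xi^{(j)}_\lambda(k+1) &= \xi^{(j)}_\lambda(k) + \frac{[\xi^{(j)}_\lambda(k) - \xi^{(j)}_\lambda(k-1)]\,\nu(k-1, k) - \lambda\pi(k)\xi^{(j)}_\lambda(k)}{\nu(k, k+1)} \\
&= \xi^{(j)}_\lambda(k) - \frac{\lambda[\pi(1)\xi^{(j)}_\lambda(1) + \cdots + \pi(k)\xi^{(j)}_\lambda(k)]}{\nu(k, k+1)},
\end{aligned}
\]

and, for $b_j \le k < n$,

\[ \xi^{(j)}_\lambda(k+1) = \xi^{(j)}_\lambda(k) + \frac{F_j\big([\xi^{(j)}_\lambda(k) - \xi^{(j)}_\lambda(k-1)]\,\nu(k-1, k) - \lambda\pi(k)\xi^{(j)}_\lambda(k)\big)}{\nu(k, k+1)}, \]

where $F_j(t) = \max\{t, 0\}$ if $j$ is odd, and $F_j(t) = \min\{t, 0\}$ if $j$ is even. Note that $\xi^{(1)}_\lambda$ is exactly $\phi_\lambda$ in Proposition 2.5.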
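The two forms of the recursion in Remark 3.4 (the three-term form and the partial-sum form) can be compared directly for the untruncated function $\xi_\lambda$ of Definition 3.2, for which both hold at every step. The sketch below is illustrative: the function names and the sample measures are hypothetical, lists are 0-based, and exact rationals are used so that the two outputs can be compared for equality.

```python
from fractions import Fraction

def xi_three_term(lam, pi, nu):
    """Definition 3.2: recursion using the previous increment."""
    vals = [Fraction(-1)]
    for k in range(len(pi) - 1):
        prev_flux = (vals[k] - vals[k - 1]) * nu[k - 1] if k > 0 else 0
        vals.append(vals[k] + (prev_flux - lam * pi[k] * vals[k]) / nu[k])
    return vals

def xi_partial_sums(lam, pi, nu):
    """Equivalent form: xi(k+1) = xi(k) - lam * [pi(1) xi(1) + ... + pi(k) xi(k)] / nu(k, k+1)."""
    vals = [Fraction(-1)]
    running = Fraction(0)
    for k in range(len(pi) - 1):
        running += pi[k] * vals[k]
        vals.append(vals[k] - lam * running / nu[k])
    return vals

# Hypothetical measures on a 4-path.
pi = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
nu = [Fraction(1, 7), Fraction(1, 5), Fraction(1, 3)]
lam = Fraction(7, 4)
first = xi_three_term(lam, pi, nu)
second = xi_partial_sums(lam, pi, nu)
```

The partial-sum form shows that the sign of the running sum $\pi(1)\xi_\lambda(1) + \cdots + \pi(k)\xi_\lambda(k)$ decides whether $\xi_\lambda$ goes up or down at step $k$.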

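Thereafter, let $\mathcal{L}$ and $L^{(i)}$ be functions on $(0, \infty)$ defined by

\[ \mathcal{L}(\lambda) = \frac{\mathcal{E}_\nu(\xi_\lambda, \xi_\lambda)}{\mathrm{Var}_\pi(\xi_\lambda)}, \qquad L^{(i)}(\lambda) = \frac{\mathcal{E}_\nu(\xi^{(i)}_\lambda, \xi^{(i)}_\lambda)}{\mathrm{Var}_\pi(\xi^{(i)}_\lambda)}, \quad \forall\, 1 \le i \le n-1, \tag{3.3} \]

where $\xi_\lambda$ and $\xi^{(i)}_\lambda$ are the functions in Definitions 3.2-3.3.

Remark 3.5. Note that $\mathcal{L} = L^{(n-1)}$. By a similar reasoning as in the proof of (2.2), one can show that, for $\lambda > 0$,

\[ \mathcal{L}(\lambda) = \lambda + \frac{\lambda\pi(\xi_\lambda)[\pi(\xi_\lambda) - \xi_\lambda(n)]}{\mathrm{Var}_\pi(\xi_\lambda)}, \qquad L^{(i)}(\lambda) = \lambda + \frac{\lambda\pi(\xi^{(i)}_\lambda)[\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)]}{\mathrm{Var}_\pi(\xi^{(i)}_\lambda)}. \]

From Proposition 3.3, it follows immediately that $\mathcal{L}(\lambda) = L^{(i)}(\lambda)$ for $\lambda \in (0, \alpha_i]$. To explore $\mathcal{L}$ and $L^{(i)}$ further, we need more information on $\pi(\xi_\lambda)$, $\pi(\xi^{(i)}_\lambda)$, $\pi(\xi_\lambda) - \xi_\lambda(n)$ and $\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)$.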
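The first identity of Remark 3.5 can be verified exactly on small examples. In the sketch below, the function names, measures and sample values of $\lambda$ are hypothetical, and lists are 0-based. The second example shows a fixed point of the Rayleigh quotient that is not an eigenvalue: on the uniform 3-path, $\lambda = 5/2$ gives $\pi(\xi_\lambda) = \xi_\lambda(n)$, so the quotient equals $5/2$ although the eigenvalues are $0, 1, 3$.

```python
from fractions import Fraction

def xi(lam, pi, nu):
    """Definition 3.2: xi(1) = -1 and the untruncated recursion."""
    vals = [Fraction(-1)]
    for k in range(len(pi) - 1):
        prev_flux = (vals[k] - vals[k - 1]) * nu[k - 1] if k > 0 else 0
        vals.append(vals[k] + (prev_flux - lam * pi[k] * vals[k]) / nu[k])
    return vals

def both_sides(lam, pi, nu):
    """Return (E/Var, lam + lam * mean * (mean - xi(n)) / Var) for xi_lambda."""
    f = xi(lam, pi, nu)
    mean = sum(p * x for p, x in zip(pi, f))
    var = sum(p * x * x for p, x in zip(pi, f)) - mean * mean
    energy = sum((f[k] - f[k + 1]) ** 2 * nu[k] for k in range(len(nu)))
    return energy / var, lam + lam * mean * (mean - f[-1]) / var

# Hypothetical measures on a 3-path and sample values of lambda.
pi = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]
nu = [Fraction(1, 4), Fraction(1, 5)]
checks = [both_sides(Fraction(num, 4), pi, nu) for num in (1, 3, 7, 12, 40)]

# Uniform 3-path: a fixed point of the quotient at lambda = 5/2.
third = Fraction(1, 3)
fixed = both_sides(Fraction(5, 2), [third] * 3, [third] * 2)
```

The identity shows that the quotient equals $\lambda$ exactly when $\pi(\xi_\lambda) = 0$ or $\pi(\xi_\lambda) = \xi_\lambda(n)$, which is why the roots of $\pi(\xi_\lambda) - \xi_\lambda(n)$ need to be located as well.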

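Lemma 3.7. Let $\xi_\lambda$ be the function in Definition 3.2 and $\lambda^G_i, \alpha_i$ be the constants in Proposition 3.3. Then, $\pi(\xi_\lambda) - \xi_\lambda(n) = 0$ has $n-1$ distinct roots, say $\beta_0 < \beta_1 < \cdots < \beta_{n-2}$, which satisfy $\beta_0 = 0$ and $\alpha_i < \beta_i < \lambda^G_{i+1}$ for $1 \le i \le n-2$. Furthermore, $\pi(\xi_\lambda) - \xi_\lambda(n) > 0$ for $\lambda \in (\beta_{2i-1}, \beta_{2i})$ and $\pi(\xi_\lambda) - \xi_\lambda(n) < 0$ for $\lambda \in (\beta_{2i}, \beta_{2i+1})$, with $\beta_{-1} = -\infty$ and $\beta_{n-1} = \infty$.

Proof. Set $u(\lambda) := \pi(\xi_\lambda) - \xi_\lambda(n)$. According to Definition 3.2, $u(\lambda)$ is a polynomial of degree $n-1$ and satisfies $u(0) = 0$. Note that $\pi(\xi_\lambda) = 0$ for $\lambda \in \{\lambda^G_1, \dots, \lambda^G_{n-1}\}$. If $i$ is odd, then $\xi_{\lambda^G_i}(n-1) < \xi_{\lambda^G_i}(n)$. This implies $\xi_{\lambda^G_i}(n) > 0$ and, hence, $u(\lambda^G_i) < 0$. Similarly, if $i$ is even, then $u(\lambda^G_i) > 0$.

By Lemma 3.2 and Proposition 3.3, if $\lambda = \alpha_i$ with odd $i$, then $\xi_{\alpha_i}$ is of type $i$ with $\xi_{\alpha_i}(n-1) = \xi_{\alpha_i}(n)$. This implies $\xi_{\alpha_i}(n) > 0$ and $\pi(\xi_{\alpha_i}) = \pi(n)\xi_{\alpha_i}(n)$, which yields $u(\alpha_i) < 0$. Similarly, one can show that $u(\alpha_i) > 0$ if $i$ is even.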

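Remark 3.6. We consider the sign of $\pi(\xi^{(i)}_\lambda)$ and $\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)$ in this remark. By Proposition 3.3, $\xi^{(i)}_\lambda = \xi_\lambda$ for $\lambda \le \alpha_i$. If $\lambda > \alpha_i$ with $1 \le i \le n-2$, then $\xi_\lambda$ is of type $j$ with $j > i$. Fix $1 \le i \le n-2$ and set $k_0 = k_0(\lambda) = \min\{k \mid \xi^{(i)}_\lambda(j) = \xi^{(i)}_\lambda(n),\ \forall\, k \le j \le n\}$. Clearly, $k_0(\lambda) \le n-1$ for $\lambda > \alpha_i$. Observe that, for $\lambda > \alpha_i$ with odd $i$, $\xi_\lambda(k_0 - 1) < \xi_\lambda(k_0) \ge \xi_\lambda(k_0 + 1)$, which implies $\sum_{k=1}^{k_0 - 1} \pi(k)\xi_\lambda(k) < 0$ and $\sum_{k=1}^{k_0} \pi(k)\xi_\lambda(k) \ge 0$. A similar reasoning for the case of even $i$ gives $\sum_{k=1}^{k_0 - 1} \pi(k)\xi_\lambda(k) > 0$ and $\sum_{k=1}^{k_0} \pi(k)\xi_\lambda(k) \le 0$. Consequently, we obtain

\[ (-1)^{i-1}\pi(\xi^{(i)}_\lambda) > 0, \qquad (-1)^{i}[\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)] > 0, \tag{3.4} \]

for $\lambda > \alpha_i$ and $1 \le i \le n-2$. Note that, by Proposition 3.3, $\xi^{(i)}_\lambda = \xi_\lambda$ for $\lambda \le \alpha_i$. Together with Remark 3.3, Lemma 3.7 and the continuity of $\xi^{(i)}_\lambda$, the first inequality of (3.4) holds for $\lambda > \lambda^G_i$ and the second inequality of (3.4) holds for $\lambda > \beta_{i-1}$.

According to Lemma 3.7 and Remark 3.6, we derive a generalized version of Proposition 2.3 in the following.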

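Proposition 3.8. Let $n \ge 3$ and $1 \le i \le n-1$. For $\lambda > 0$, let $\xi_\lambda$, $\xi^{(i)}_\lambda$ be the functions in Definition 3.2 and $\beta_i$ be the constants in Lemma 3.7.

(1) For $\lambda > \beta_{i-1}$, the following are equivalent.
(1-1) $\mathcal{E}_\nu(\xi^{(i)}_\lambda, \xi^{(i)}_\lambda) = \lambda\,\mathrm{Var}_\pi(\xi^{(i)}_\lambda)$.
(1-2) $\pi(\xi^{(i)}_\lambda) = 0$.
(1-3) $\lambda = \lambda^G_i$.

(2) For $\beta_{i-1} < \lambda < \beta_i$, the following are equivalent.
(2-1) $\mathcal{E}_\nu(\xi_\lambda, \xi_\lambda) = \lambda\,\mathrm{Var}_\pi(\xi_\lambda)$.
(2-2) $\pi(\xi_\lambda) = 0$.
(2-3) $\lambda = \lambda^G_i$.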

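Proof. The proof of Proposition 3.8 (2) is similar to the proof of Proposition 3.8 (1) and we deal only with the latter. By Lemma 3.7 and Remark 3.6, one has

\[
\pi(\xi^{(i)}_\lambda)\,[\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)]
\begin{cases}
< 0 & \text{for } \lambda > \lambda^G_i, \\
> 0 & \text{for } \beta_{i-1} < \lambda < \lambda^G_i.
\end{cases}
\]

By the formula for $\mathcal{L}^{(i)}$ in Remark 3.5, this proves the equivalence of (1-1) and (1-2). Under the assumption of (1-2) and using Remark 3.3, one has $\lambda \le \alpha_i$. This implies that $\xi^{(i)}_\lambda = \xi_\lambda$ is an eigenvector of $M^G_{\pi,\nu}$ with associated eigenvalue $\lambda$. As $\lambda \in (\beta_{i-1}, \alpha_i]$, it must be the case that $\lambda = \lambda^G_i$. This gives (1-3), while (1-3)$\Rightarrow$(1-2) is obvious and omitted.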

Remark 3.7. It is worthwhile to note that if (1-1) and (2-1) of Proposition 3.8 are removed, then the equivalence in (1) holds for $\lambda > \lambda^G_{i-1}$ and the equivalence in (2) holds for $\lambda \in (\lambda^G_{i-1}, \lambda^G_{i+1})$. Once $\lambda^G_{i-1}$ is known, we can determine $\lambda^G_i$ using the sign of $\pi(\xi^{(i)}_\lambda)$. See Theorem 3.9 for details.

Remark 3.8. Note that condition (4) of Proposition 2.3 is not included in Proposition 3.8. In fact, the equivalence may fail, that is, there may exist some $\lambda \in (\beta_{i-1}, \beta_i) \setminus \{\lambda^G_i\}$ such that $\mathcal{E}_\nu(\xi_\lambda, \xi_\lambda)/\mathrm{Var}_\pi(\xi_\lambda) = \lambda^G_i$. See Example 3.2 for a counterexample.

While Proposition 3.8 characterizes the zeros of $\mathcal{L}(\lambda) - \lambda$, the following theorem concerns the sign of $\mathcal{L}(\lambda) - \lambda$.

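Theorem 3.9. Let $\lambda^G_i$, $\alpha_i$, $\beta_i$ be the constants in Proposition 3.3 and Lemma 3.7, and $\mathcal{L}$ be the function in (3.3). Then, $\lambda^G_1, \dots, \lambda^G_{n-1}, \beta_1, \dots, \beta_{n-2}$ are fixed points of $\mathcal{L}$ and, for $1 \le i \le n-2$,

(1) $\mathcal{L}(\lambda) < \lambda$ for $\lambda \in (\lambda^G_i, \beta_i)$.
(2) $\mathcal{L}(\lambda) > \lambda$ for $\lambda \in (\beta_i, \lambda^G_{i+1})$.
(3) $\mathcal{L}^{(i)}(\lambda) < \lambda$ for $\lambda \in (\lambda^G_i, \infty)$.

Proof. Immediate from Lemma 3.7 and Remarks 3.5-3.6.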

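By Theorem 3.9, we obtain a lower bound on any specified eigenvalue of $M^G_{\pi,\nu}$.

Corollary 3.10. Let $1 \le i \le n-1$ and $\lambda_0 > \lambda^G_i$. Consider the sequence $\lambda_{\ell+1} = \mathcal{L}^{(i)}(\lambda_\ell)$ with $\ell \ge 0$ and set

\[
\lambda^* =
\begin{cases}
\lim_{\ell\to\infty} \lambda_\ell & \text{if } \lambda_\ell \text{ converges}, \\
\sup_{\ell \in I} \lambda_\ell & \text{if } \lambda_\ell \text{ diverges},
\end{cases}
\]

where $I = \{\ell \mid \lambda_{\ell-1} > \lambda_\ell < \lambda_{\ell+1}\}$. Then, $\lambda^* \le \lambda^G_i$.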

It is not yet clear whether the sequence $\lambda_\ell$ in Corollary 3.10 is convergent, even locally. This subject will be discussed in the next subsection. Now, we establish some relations between the roots of $\det A_i(\lambda) = 0$ and the shape of $\xi^{(i)}_\lambda$. This is a generalization of Proposition 2.5.

Proposition 3.11. For $1 \le i \le n-1$, let $A_i(\lambda)$ be the matrix in (2.4), $\theta^{(i)}_1 < \cdots < \theta^{(i)}_i$ be the zeros of $\det A_i(\lambda) = 0$ and set $\theta^{(i-1)}_i := \infty$. Referring to the notation in Proposition 3.3, it holds true that, for $1 \le i \le n-1$,

(1) $\lambda^G_i = \theta^{(n-1)}_i < \alpha_i = \theta^{(n-2)}_i < \cdots < \theta^{(i)}_i$.
(2) $\xi^{(i)}_\lambda(j) \ne \xi^{(i)}_\lambda(j+1) = \cdots = \xi^{(i)}_\lambda(n)$ for $\lambda \in [\theta^{(j)}_i, \theta^{(j-1)}_i)$ and $i \le j \le n-2$.
(3) $\xi^{(i)}_\lambda(n-1) \ne \xi^{(i)}_\lambda(n)$ for $\lambda \in (\theta^{(n-2)}_{i-1}, \theta^{(n-2)}_i)$ and $i \le n-1$.

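Proof. The order in (1) is a simple application of Lemma A.3. For (2), fix $1 \le i \le n-1$ and set

\[
\gamma(\lambda) = \min\{j \mid \xi^{(i)}_\lambda(k) = \xi^{(i)}_\lambda(n),\ \forall j \le k \le n\}, \quad
B(\lambda) = \{1, 2, \dots, \gamma(\lambda)\}, \quad
B^+(\lambda) = B(\lambda) \cup \{\gamma(\lambda) + 1\}.
\]

Clearly, $i + 1 \le \gamma(\lambda) \le n$. We use the notation $\xi_\lambda|_C$ to denote the restriction of $\xi_\lambda$ to a set $C$. Suppose that $i$ is odd. By Remark 3.4, $\xi^{(i)}_\lambda = \xi_\lambda$ on $B(\lambda)$ and $\xi_\lambda|_{B(\lambda)}$ is of type $i$ with

\[
\xi_\lambda(\gamma(\lambda) - 1) < \xi_\lambda(\gamma(\lambda)) \ge \xi_\lambda(\gamma(\lambda) + 1).
\]

By Lemma 3.2(1), if $\xi_\lambda(\gamma(\lambda) + 1) < \xi_\lambda(\gamma(\lambda))$, then there is $\epsilon > 0$ such that, for $|\delta| < \epsilon$, $\xi_{\lambda+\delta}|_{B(\lambda)}$ is of type $i$ and

\[
\xi_{\lambda+\delta}(\gamma(\lambda) - 1) < \xi_{\lambda+\delta}(\gamma(\lambda)) > \xi_{\lambda+\delta}(\gamma(\lambda) + 1).
\]

This implies $\gamma(\lambda + \delta) = \gamma(\lambda)$ for $\delta \in (-\epsilon, \epsilon)$. By Lemma 3.2(2), if $\xi_\lambda(\gamma(\lambda) + 1) = \xi_\lambda(\gamma(\lambda))$, then there is $\epsilon > 0$ such that, for $\delta \in (-\epsilon, 0)$, $\xi_{\lambda+\delta}|_{B^+(\lambda)}$ is of type $i$ with

\[
\xi_{\lambda+\delta}(\gamma(\lambda) - 1) < \xi_{\lambda+\delta}(\gamma(\lambda)) < \xi_{\lambda+\delta}(\gamma(\lambda) + 1),
\]

and, for $\delta \in (0, \epsilon)$, $\xi_{\lambda+\delta}|_{B^+(\lambda)}$ is of type $i + 1$ with

\[
\xi_{\lambda+\delta}(\gamma(\lambda) - 1) < \xi_{\lambda+\delta}(\gamma(\lambda)) > \xi_{\lambda+\delta}(\gamma(\lambda) + 1).
\]

This yields $\gamma(\lambda + \delta) = \gamma(\lambda)$ for $\delta \in (0, \epsilon)$ and $\gamma(\lambda + \delta) = \gamma(\lambda) + 1$ for $\delta \in (-\epsilon, 0)$. The proof for the case of even $i$ is similar and we conclude from the above that $\gamma(\lambda)$ is a non-increasing and right-continuous function taking values in $\{i+1, \dots, n\}$.

Let $c_{i+1} > \cdots > c_{n-1}$ be the points of discontinuity of $\gamma(\lambda)$ such that $\gamma(c_j) = j$ for $i + 1 \le j \le n-1$. As a consequence of the above discussion, $\xi_{c_j}|_{\{1,\dots,j\}}$ is of type $i$ with $\xi_{c_j}(j) = \xi_{c_j}(j+1)$ and this implies $\sum_{k=1}^{j} \pi(k)\xi_{c_j}(k) = 0$. That means $c_j$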


second equality in (1), $\gamma(\lambda) = n$ for $\theta^{(n-2)}_{i-1} < \lambda < \theta^{(n-2)}_i$ and, thus, $c_j \ge \theta^{(n-2)}_i$ for $j \ge i + 1$. As a consequence of the interlacing relationship $\theta^{(\ell)}_i < \theta^{(\ell-1)}_i < \theta^{(\ell)}_{i+1}$, it must be $c_j = \theta^{(j-1)}_i$ for $i + 1 \le j \le n-1$. This finishes the proof.

Remark 3.9. For $1 \le i \le n-1$, $\theta^{(i)}_1, \dots, \theta^{(i)}_i$ are also the non-zero eigenvalues of the $(i+1) \times (i+1)$ principal submatrix of (3.1) indexed by $1, \dots, i+1$.

Remark 3.10. In fact, by Proposition 2.5, $\xi^{(1)}_\lambda(n-1) \ne \xi^{(1)}_\lambda(n)$ for $\lambda \in (0, \theta^{(n-2)}_1)$, which is stronger than Proposition 3.11(3).

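3.3. Local convergence of $\mathcal{L}$. This subsection is dedicated to the local convergence of $\mathcal{L}$ in (3.3). Let $\alpha_i$, $\beta_i$, $\lambda^G_i$ be the constants in Proposition 3.3 and Lemma 3.7. As before, let $\zeta_0 = \mathbf{1}, \dots, \zeta_{n-1}$ denote the $L^2(\pi)$-normalized eigenvectors of $M^G_{\pi,\nu}$ associated with $\lambda^G_0, \dots, \lambda^G_{n-1}$. Clearly, $\xi_{\lambda^G_i} = -\zeta_i/\zeta_i(1)$ and $\xi_\lambda = \sum_{i=0}^{n-1} \rho_i(\lambda)\zeta_i$, where $\rho_i(\lambda) = \pi(\xi_\lambda\zeta_i)$ for $0 \le i \le n-1$. Note that $\rho_i(\lambda)$ is a polynomial of degree $n-1$ and satisfies $\rho_i(\lambda^G_j) = -\delta_i(j)/\zeta_i(1)$ for $i, j \in \{0, 1, \dots, n-1\}$. This implies

\[
\rho_0(\lambda) = -\prod_{j=1}^{n-1} \frac{\lambda^G_j - \lambda}{\lambda^G_j}, \qquad
\rho_i(\lambda) = -\frac{\lambda}{\zeta_i(1)\lambda^G_i} \prod_{j=1,\, j\ne i}^{n-1} \frac{\lambda^G_j - \lambda}{\lambda^G_j - \lambda^G_i}, \tag{3.5}
\]

for all $1 \le i \le n-1$. Moreover, by multiplying (3.2) by $\xi_\lambda(k)$ and summing over $k$, we obtain $\mathcal{E}_\nu(\xi_\lambda, \zeta_i) = \lambda^G_i\rho_i(\lambda)$. In the same spirit, one can show that $\mathcal{E}_\nu(\xi_\lambda, \zeta_i) = \lambda[\rho_i(\lambda) - \zeta_i(n)\rho_0(\lambda)]$ using Definition 3.2. Putting both equations together yields

\[
\rho_i(\lambda) = \frac{\lambda\zeta_i(n)}{\lambda - \lambda^G_i}\,\rho_0(\lambda), \qquad \forall\, 0 \le i \le n-1. \tag{3.6}
\]

As a consequence of Remark 3.5, this gives

\[
\mathcal{L}(\lambda) = \frac{\sum_{i=1}^{n-1} \lambda^G_i\rho_i^2(\lambda)}{\sum_{i=1}^{n-1} \rho_i^2(\lambda)}
= \lambda + \frac{\sum_{i=1}^{n-1} (\lambda^G_i - \lambda)^{-1}\zeta_i^2(n)}{\sum_{i=1}^{n-1} (\lambda^G_i - \lambda)^{-2}\zeta_i^2(n)}, \tag{3.7}
\]

for $\lambda \notin \{\lambda^G_0, \dots, \lambda^G_{n-1}\}$. The next proposition follows immediately from the second equation in (3.5) and (3.6).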

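Proposition 3.12. Let $\lambda^G_1, \dots, \lambda^G_{n-1}$ be the non-zero eigenvalues of $M^G_{\pi,\nu}$ in (3.1) and $\zeta_1, \dots, \zeta_{n-1}$ be the corresponding $L^2(\pi)$-normalized eigenvectors. Then,

\[
\zeta_i(1)\zeta_i(n) = -\prod_{j=1,\, j\ne i}^{n-1} \frac{\lambda^G_j}{\lambda^G_j - \lambda^G_i}, \qquad \forall\, 1 \le i \le n-1.
\]

Set $u(\lambda) = \sum_{j=1}^{n-1} (\lambda^G_j - \lambda)^{-1}\zeta_j^2(n)$. By Theorem 3.9, $\beta_1, \dots, \beta_{n-2}$ are zeros of $u(\lambda)\prod_{j=1}^{n-1}(\lambda^G_j - \lambda)$, which is a polynomial of degree $n-2$. This implies

\[
u(\lambda) = C \left(\prod_{j=1}^{n-1} \frac{1}{\lambda^G_j - \lambda}\right) \left(\prod_{j=1}^{n-2} (\beta_j - \lambda)\right),
\qquad\text{where}\quad
C = \frac{\lambda^G_1 \cdots \lambda^G_{n-1}}{\beta_1 \cdots \beta_{n-2}} \sum_{j=1}^{n-1} \frac{\zeta_j^2(n)}{\lambda^G_j}.
\]

Putting this back into $\mathcal{L}$ yields

\[
\frac{1}{\mathcal{L}(\lambda) - \lambda} = \frac{u'(\lambda)}{u(\lambda)} = \sum_{j=1}^{n-1} \frac{1}{\lambda^G_j - \lambda} - \sum_{j=1}^{n-2} \frac{1}{\beta_j - \lambda}, \tag{3.8}
\]

for $\lambda \notin \{\lambda^G_0, \dots, \lambda^G_{n-1}, \beta_1, \dots, \beta_{n-2}\}$.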
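Formulas (3.7) and (3.8) lend themselves to a quick numerical check. The sketch below is illustrative and makes two assumptions: it takes the simple random walk spectrum $\lambda^G_j = 1 - \cos(j\pi/n)$, and it uses the weights $\zeta_j^2(n) = 1 + \cos(j\pi/n)$, which follow from the eigenvector formula quoted in Example 3.1. The $\beta_j$ are located by bisection as the zeros of $u$ between consecutive eigenvalues; all function names are hypothetical.

```python
import math

N = 5
LAM = [1 - math.cos(j * math.pi / N) for j in range(1, N)]  # lambda_1..lambda_{n-1}
W = [1 + math.cos(j * math.pi / N) for j in range(1, N)]    # zeta_j(n)^2 (assumed data)

def u(lam):
    return sum(w / (l - lam) for w, l in zip(W, LAM))

def du(lam):
    return sum(w / (l - lam) ** 2 for w, l in zip(W, LAM))

def L(lam):
    # Second expression in (3.7): L(lam) = lam + u(lam) / u'(lam).
    return lam + u(lam) / du(lam)

def bisect_root(f, lo, hi, iters=200):
    # f increases from -infinity to +infinity on (lo, hi).
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# beta_j is the unique zero of u in (lambda_j, lambda_{j+1}).
BETA = [bisect_root(u, LAM[j], LAM[j + 1]) for j in range(N - 2)]

def rhs_38(lam):
    # Right-hand side of (3.8).
    return sum(1 / (l - lam) for l in LAM) - sum(1 / (b - lam) for b in BETA)
```

With these data, `1 / (L(lam) - lam)` and `rhs_38(lam)` should agree away from the poles, and the sign pattern of Theorem 3.9 can be read off by evaluating `L(lam) - lam` on either side of each `BETA[j]`.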

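Proposition 3.13. Let $\mathcal{L}$ be the function in (3.3), $\lambda^G_i$ be the eigenvalues of $M^G_{\pi,\nu}$ and $\beta_i$ be the constants in Lemma 3.7. Let

\[
D_i = \sum_{j=1}^{n-2} (\beta_j - \lambda^G_i)^{-1} - \sum_{j=1,\, j\ne i}^{n-1} (\lambda^G_j - \lambda^G_i)^{-1}
\]

with $1 \le i \le n-1$. Then, for $2 \le i \le n-2$,

(1) If $D_i < 0$, then there is $\tau \in (\lambda^G_i, \beta_i)$ such that $\mathcal{L}$ is strictly increasing on $(\beta_{i-1}, \lambda^G_i) \cup (\tau, \beta_i)$ and strictly decreasing on $(\lambda^G_i, \tau)$.

(2) If $D_i > 0$, then there is $\eta \in (\beta_{i-1}, \lambda^G_i)$ such that $\mathcal{L}$ is strictly increasing on $(\beta_{i-1}, \eta) \cup (\lambda^G_i, \beta_i)$ and strictly decreasing on $(\eta, \lambda^G_i)$.

(3) If $D_i = 0$, then $\mathcal{L}$ is strictly increasing on $(\beta_{i-1}, \beta_i)$.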

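Proof. Using (3.7) and (3.8), one can show that $\mathcal{L}'(\lambda^G_i) = 0$ and

\[
\mathcal{L}''(\lambda^G_i) = \sum_{j=1,\, j\ne i}^{n-1} \frac{\zeta_j^2(n)}{\lambda^G_j - \lambda^G_i}
= 2\left[\sum_{j=1}^{n-2} \frac{1}{\beta_j - \lambda^G_i} - \sum_{j=1,\, j\ne i}^{n-1} \frac{1}{\lambda^G_j - \lambda^G_i}\right] = 2D_i. \tag{3.9}
\]

To prove (1) and (2), it suffices to show that if $\mathcal{L}'(\tau) = 0$ for some $\tau \in (\lambda^G_i, \beta_i)$, then $\tau$ is a local minimum of $\mathcal{L}$, and if $\mathcal{L}'(\eta) = 0$ for some $\eta \in (\beta_{i-1}, \lambda^G_i)$, then $\eta$ is a local maximum of $\mathcal{L}$. We discuss the first case; the second case is similar and is omitted. Recall that $u(\lambda) = \sum_{j=1}^{n-1} (\lambda^G_j - \lambda)^{-1}\zeta_j^2(n)$. As $\tau$ is a critical point of $\mathcal{L}$, one has $2(u'(\tau))^2 = u(\tau)u''(\tau)$. This implies

\[
\mathcal{L}''(\tau) = \frac{u(\tau)\,[3(u''(\tau))^2 - 2u'(\tau)u'''(\tau)]}{2(u'(\tau))^3} > 0,
\]

where the last inequality uses the fact that $u(\lambda) < 0$ for $\lambda \in (\lambda^G_i, \beta_i)$, and

\[
3(u''(\lambda))^2 - 2u'(\lambda)u'''(\lambda) = -12 \sum_{1 \le k < l \le n-1} \left[\frac{(\lambda^G_k - \lambda^G_l)\,\zeta_k(n)\zeta_l(n)}{(\lambda^G_k - \lambda)^2(\lambda^G_l - \lambda)^2}\right]^2 < 0.
\]

This proves (1) and (2).

To see (3), we assume that $D_i = 0$. Computations show that

\[
\frac{\mathcal{L}(\lambda) - \lambda^G_i}{\mathcal{L}(\lambda) - \lambda}
= (\lambda - \lambda^G_i)\left[\sum_{j=1,\, j\ne i}^{n-1} \frac{1}{\lambda^G_j - \lambda} - \sum_{j=1}^{n-2} \frac{1}{\beta_j - \lambda}\right]
= (\lambda - \lambda^G_i)^2\left[\sum_{j=1,\, j\ne i}^{n-1} \frac{1}{(\lambda^G_j - \lambda)(\lambda^G_j - \lambda^G_i)} - \sum_{j=1}^{n-2} \frac{1}{(\beta_j - \lambda)(\beta_j - \lambda^G_i)}\right] < 0,
\]

for $\lambda \in (\beta_{i-1}, \lambda^G_i) \cup (\lambda^G_i, \beta_i)$, where the last inequality uses the fact that $(\lambda^G_j - \lambda)(\lambda^G_j - \lambda^G_i) > (\beta_j - \lambda)(\beta_j - \lambda^G_i)$ for $j < i$ and $(\lambda^G_j - \lambda)(\lambda^G_j - \lambda^G_i) > (\beta_{j-1} - \lambda)(\beta_{j-1} - \lambda^G_i)$ for $j > i$. By Theorem 3.9, this implies $\mathcal{L}(\lambda) > \lambda^G_i$ for $\lambda \in (\lambda^G_i, \beta_i)$ and $\mathcal{L}(\lambda) < \lambda^G_i$ for $\lambda \in (\beta_{i-1}, \lambda^G_i)$. The desired property follows immediately from the discussion in the previous paragraph.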

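Remark 3.11. Note that $D_1 > 0$ and $D_{n-1} < 0$. Using the same proof as above, this implies that $\mathcal{L}(\lambda)$ is strictly increasing on $(\lambda^G_1, \beta_1) \cup (\beta_{n-2}, \lambda^G_{n-1})$. Moreover, by (3.7), one may compute

\[
(u'(\lambda))^2\,\mathcal{L}'(\lambda) = -2\sum_{i<j} \frac{(\lambda^G_i - \lambda^G_j)^2\,\zeta_i^2(n)\zeta_j^2(n)}{(\lambda^G_i - \lambda)^3(\lambda^G_j - \lambda)^3} < 0, \qquad \forall \lambda \in (0, \lambda^G_1) \cup (\lambda^G_{n-1}, \infty).
\]

This implies that $\mathcal{L}(\lambda)$ is strictly decreasing on $(0, \lambda^G_1) \cup (\lambda^G_{n-1}, \infty)$ and

\[
\lim_{\lambda\to 0} \mathcal{L}(\lambda) = \frac{\sum_{i=1}^{n-1} \zeta_i^2(n)/\lambda^G_i}{\sum_{i=1}^{n-1} \zeta_i^2(n)/(\lambda^G_i)^2}, \qquad
\lim_{\lambda\to\infty} \mathcal{L}(\lambda) = \left(\frac{1}{\pi(n)} - 1\right)^{-1} \sum_{i=1}^{n-1} \lambda^G_i\zeta_i^2(n).
\]

The following local convergence is a simple corollary of Theorem 3.9 and Proposition 3.13.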

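Theorem 3.14 (Local convergence). Let $\lambda_0 > 0$ and set $\lambda_{\ell+1} = \mathcal{L}(\lambda_\ell)$ for $\ell \ge 0$. Then, there is $\epsilon > 0$ such that the sequence $(\lambda_\ell)_{\ell=1}^{\infty}$ is monotonic and converges to $\lambda^G_i$ for $\lambda_0 \in (\lambda^G_i - \epsilon, \lambda^G_i + \epsilon)$ and $1 \le i \le n-1$.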
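The iteration in Theorem 3.14 is short to try out. The following sketch is an illustration under assumptions, not the authors' implementation: $\mathcal{L}(\lambda)$ is evaluated as the Rayleigh quotient of a shooting vector, assuming Definition 3.2 is the recursion with $\xi_\lambda(1) = -1$ solving rows $1, \dots, n-1$ of $M^G_{\pi,\nu}\xi = \lambda\xi$. The test chain is the simple random walk with $\pi(i) = 1/n$ and $\nu(i, i+1) = 1/(2n)$, and the starting point `0.25` is a hypothetical choice slightly above $\lambda^G_1 = 1 - \cos(\pi/5)$.

```python
import math

def L(lam, pi, nu):
    # Shooting vector (ASSUMED form of Definition 3.2).
    x = [-1.0]
    for k in range(len(pi) - 1):
        left = (x[k] - x[k - 1]) * nu[k - 1] if k > 0 else 0.0
        x.append(x[k] + (left - lam * pi[k] * x[k]) / nu[k])
    # Rayleigh quotient E_nu(x, x) / Var_pi(x).
    m = sum(p * v for p, v in zip(pi, x))
    var = sum(p * (v - m) ** 2 for p, v in zip(pi, x))
    energy = sum(w * (x[k] - x[k + 1]) ** 2 for k, w in enumerate(nu))
    return energy / var

def iterate(lam0, pi, nu, steps=12):
    # lambda_{l+1} = L(lambda_l), returned as the full trajectory.
    seq = [lam0]
    for _ in range(steps):
        seq.append(L(seq[-1], pi, nu))
    return seq

N = 5
PI = [1.0 / N] * N
NU = [1.0 / (2 * N)] * (N - 1)
SEQ = iterate(0.25, PI, NU)
TARGET = 1 - math.cos(math.pi / N)
```

In this example the trajectory `SEQ` decreases towards `TARGET`. Since $\mathcal{L}'(\lambda^G_i) = 0$, the convergence near an eigenvalue is faster than linear, so a handful of steps suffices.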

We use the following examples to illustrate the diﬀerent cases in Proposition 3.13.

Example 3.1 (Simple random walks). Let $n > 1$. A simple random walk on $\{1, 2, \dots, n\}$ with reflecting probability $1/2$ at the boundary is a birth and death chain with transition matrix given by $K(i, j) = K(1, 1) = K(n, n) = 1/2$ for $|i - j| = 1$. It is easy to see that the uniform probability is the stationary distribution of $K$. In the graph setting, we have $\nu(i, i+1) = 1/(2n)$ and $\pi(i) = 1/n$. One may apply the method in [11] to obtain the following spectral information.

\[
\lambda^G_j = 1 - \cos\frac{j\pi}{n}, \qquad
\zeta_j(k) = \frac{1}{\sqrt{\lambda^G_j}}\left(\sin\frac{jk\pi}{n} - \sin\frac{j(k-1)\pi}{n}\right), \qquad \forall\, 1 \le j < n.
\]

See, e.g., [3, Section 7]. By (3.9), we get

\[
D_i = \frac{1}{2}\sum_{j=1,\, j\ne i}^{n-1} \frac{\sin^2(j\pi/n)}{\lambda^G_j(\lambda^G_j - \lambda^G_i)}
= \frac{1}{2}\sum_{j=1,\, j\ne i}^{n-1} \frac{1 + \cos(j\pi/n)}{\cos(i\pi/n) - \cos(j\pi/n)}.
\]

Clearly, $D_1 > 0$ and $D_{n-1} < 0$. If $n$ is even, then $D_{n/2} < 0$.
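The spectral data of Example 3.1 and the sign claims can be checked directly. The sketch below is illustrative; the helper names are hypothetical. It verifies that $\zeta_j$ is an $L^2(\pi)$-normalized eigenvector of $I - K$ for the eigenvalue $1 - \cos(j\pi/n)$, and evaluates the cosine sum $\sum_{j \ne i} (1 + \cos(j\pi/n))/(\cos(i\pi/n) - \cos(j\pi/n))$, which carries the sign of $D_i$.

```python
import math

def srw_matrix(n):
    # Simple random walk on {1,...,n} (0-based here), holding 1/2 at both ends.
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i > 0:
            K[i][i - 1] = 0.5
        if i < n - 1:
            K[i][i + 1] = 0.5
    K[0][0] = K[n - 1][n - 1] = 0.5
    return K

def eigenvalue(j, n):
    return 1 - math.cos(j * math.pi / n)

def zeta(j, n):
    lam = eigenvalue(j, n)
    return [(math.sin(j * k * math.pi / n) - math.sin(j * (k - 1) * math.pi / n))
            / math.sqrt(lam) for k in range(1, n + 1)]

def residual(j, n):
    # max_k |((I - K) zeta_j)(k) - lambda_j zeta_j(k)|
    K, z, lam = srw_matrix(n), zeta(j, n), eigenvalue(j, n)
    return max(abs(z[i] - sum(K[i][m] * z[m] for m in range(n)) - lam * z[i])
               for i in range(n))

def squared_norm(j, n):
    # L^2(pi) norm squared with pi uniform.
    return sum(v * v for v in zeta(j, n)) / n

def sign_sum(i, n):
    # Cosine sum whose sign is the sign of D_i.
    ci = math.cos(i * math.pi / n)
    return sum((1 + math.cos(j * math.pi / n)) / (ci - math.cos(j * math.pi / n))
               for j in range(1, n) if j != i)
```

For even $n$ and $i = n/2$, the terms $1/\cos(j\pi/n)$ cancel in pairs $j \leftrightarrow n - j$, so `sign_sum(n // 2, n)` equals $-(n-2)$ up to rounding.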

Example 3.2 (Ehrenfest chains). An Ehrenfest chain on $V = \{0, 1, \dots, n\}$ is a Markov chain with transition matrix $K$ given by $K(i, i+1) = 1 - i/n$ and $K(i+1, i) = (i+1)/n$ for $i = 0, \dots, n-1$. The associated stationary distribution is the unbiased binomial distribution on $V$, that is, $\pi(i) = \binom{n}{i}2^{-n}$ for $i \in V$. For the Ehrenfest chain, the measure $\nu$ is defined by $\nu(i, i+1) = \binom{n-1}{i}2^{-n}$ for $i = 0, \dots, n-1$. Using the group representation for the binary group $\{0, 1\}^n$, one may compute

\[
\lambda_j = \frac{2j}{n}, \qquad
\zeta_j(k) = \binom{n}{j}^{-1/2} \sum_{\ell=0}^{j} (-1)^{\ell} \binom{k}{\ell}\binom{n-k}{j-\ell}, \qquad \forall\, 1 \le j \le n.
\]

Plugging this back into (3.9) yields

\[
D_i = \frac{n}{4} \sum_{j=1,\, j\ne i}^{n} \frac{\binom{n}{j}}{j - i}
\begin{cases}
> 0 & \text{for } i < n/2, \\
= 0 & \text{for } i = n/2, \\
< 0 & \text{for } i > n/2.
\end{cases}
\]

This example points out the possibility of different signs in $\{D_i \mid i = 1, \dots, n-1\}$, including $0$.
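The eigenvalues and eigenvectors quoted for the Ehrenfest chain can be confirmed numerically. The following sketch is illustrative only, with hypothetical helper names; it checks that $\zeta_j$ satisfies $(I - K)\zeta_j = (2j/n)\zeta_j$ and has unit norm in $L^2(\pi)$ for the binomial distribution $\pi$.

```python
from math import comb, sqrt

def ehrenfest_matrix(n):
    # K(i, i+1) = 1 - i/n and K(i+1, i) = (i+1)/n on {0, ..., n}.
    K = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        K[i][i + 1] = 1 - i / n
        K[i + 1][i] = (i + 1) / n
    return K

def zeta(j, n):
    # Normalized Krawtchouk-type eigenvector from Example 3.2.
    return [sum((-1) ** l * comb(k, l) * comb(n - k, j - l) for l in range(j + 1))
            / sqrt(comb(n, j)) for k in range(n + 1)]

def residual(j, n):
    # max_k |((I - K) zeta_j)(k) - (2j/n) zeta_j(k)|
    K, z, lam = ehrenfest_matrix(n), zeta(j, n), 2 * j / n
    return max(abs(z[i] - sum(K[i][m] * z[m] for m in range(n + 1)) - lam * z[i])
               for i in range(n + 1))

def squared_norm(j, n):
    # L^2(pi) norm squared with pi(k) = C(n, k) / 2^n.
    z = zeta(j, n)
    return sum(comb(n, k) * z[k] ** 2 for k in range(n + 1)) / 2 ** n
```

At the endpoint $k = n$ the sum collapses to a single term, so `zeta(j, n)[n] ** 2` equals `comb(n, j)`, which is the weight $\zeta_j^2(n)$ entering (3.7).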


3.4. A remark on the separation for birth and death chains. In this subsection, we give a new proof of a result, Theorem 3.15, which deals with convergence in separation distance for birth and death chains. Let $(X_m)_{m=0}^{\infty}$ be a birth and death chain with transition matrix $K$ given by (1.1). In the continuous time setting, we consider the process $Y_t = X_{N_t}$, where $N_t$ is a Poisson process with parameter 1 independent of $X_m$. Given the initial distribution $\mu$, which is the distribution of $X_0$, the distributions of $X_m$ and $Y_t$ are respectively $\mu K^m$ and $\mu e^{-t(I-K)}$, where $e^A := \sum_{l=0}^{\infty} A^l/l!$. Briefly, we write $H_t = e^{-t(I-K)}$. It is well-known that if $K$ is irreducible, then $\mu H_t$ converges to $\pi$ as $t \to \infty$. If $K$ is irreducible and $r_i > 0$ for some $i$, then $\mu K^m$ converges to $\pi$ as $m \to \infty$. Concerning the convergence, we consider the separations of $X_m$, $Y_t$ with respect to $\pi$, which are defined by

\[
d_{\mathrm{sep}}(\mu, m) = \max_{0 \le x \le n}\left\{1 - \frac{\mu K^m(x)}{\pi(x)}\right\}, \qquad
d^{c}_{\mathrm{sep}}(\mu, t) = \max_{0 \le x \le n}\left\{1 - \frac{\mu H_t(x)}{\pi(x)}\right\}.
\]

The following theorem is from [9].

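Theorem 3.15. Let $K$ be an irreducible birth and death chain on $\{0, 1, \dots, n\}$ and let $\lambda_0 = 0 < \lambda_1 < \cdots < \lambda_n$ be the eigenvalues of $I - K$.

(1) For the discrete time chain, if $p_i + q_{i+1} \le 1$ for all $0 \le i < n$, then

\[
d_{\mathrm{sep}}(0, m) = d_{\mathrm{sep}}(n, m) = \sum_{j=1}^{n} \left(\prod_{i=1,\, i\ne j}^{n} \frac{\lambda_i}{\lambda_i - \lambda_j}\right)(1 - \lambda_j)^m.
\]

(2) For the continuous time chain, it holds true that

\[
d^{c}_{\mathrm{sep}}(0, t) = d^{c}_{\mathrm{sep}}(n, t) = \sum_{j=1}^{n} \left(\prod_{i=1,\, i\ne j}^{n} \frac{\lambda_i}{\lambda_i - \lambda_j}\right) e^{-\lambda_j t}.
\]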
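The discrete time formula in Theorem 3.15(1) can be compared with a direct computation on a small chain. The sketch below is illustrative; it assumes the test chain is the simple random walk of Example 3.1 placed on $\{0, \dots, n\}$ (so $n + 1$ states, uniform $\pi$, $p_i + q_{i+1} = 1$), for which the eigenvalues of $I - K$ are $1 - \cos(j\pi/(n+1))$. The function names are hypothetical.

```python
import math

def walk_matrix(n):
    # States 0..n, K(i, i+1) = K(i, i-1) = 1/2, holding 1/2 at both ends.
    size = n + 1
    K = [[0.0] * size for _ in range(size)]
    for i in range(size):
        if i > 0:
            K[i][i - 1] = 0.5
        if i < n:
            K[i][i + 1] = 0.5
    K[0][0] = K[n][n] = 0.5
    return K

def separation_direct(n, m):
    # max_x {1 - K^m(0, x) / pi(x)} with pi uniform on n + 1 states.
    K = walk_matrix(n)
    mu = [1.0] + [0.0] * n
    for _ in range(m):
        mu = [sum(mu[i] * K[i][x] for i in range(n + 1)) for x in range(n + 1)]
    return max(1 - v * (n + 1) for v in mu)

def separation_spectral(n, m):
    # Right-hand side of Theorem 3.15(1).
    lam = [1 - math.cos(j * math.pi / (n + 1)) for j in range(1, n + 1)]
    total = 0.0
    for j, lj in enumerate(lam):
        coeff = 1.0
        for i, li in enumerate(lam):
            if i != j:
                coeff *= li / (li - lj)
        total += coeff * (1 - lj) ** m
    return total
```

For $m < n$ the walk started at $0$ cannot reach $n$, so both functions return $1$; for larger $m$ they should agree up to rounding.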

Diaconis and Fill [6, 12] introduce the concept of dual chain to express the separations in Theorem 3.15 as the probability of the ﬁrst passage time. Brown and Shao [1] characterize the ﬁrst passage time using the eigenvalues of K for a special class of continuous time Markov chains including birth and death chains. The idea in [1] is also applicable for discrete time chains and this leads to the formula above. See [9] for further discussions. Here, we use Proposition 3.12 and Lemma 3.16 to prove this result directly.

Lemma 3.16. Let $K$ be the transition matrix in (1.1) with stationary distribution $\pi$. Suppose that $\mu$ is a probability distribution satisfying $\mu(i)/\pi(i) \le \mu(i+1)/\pi(i+1)$ for all $0 \le i \le n-1$.

(1) For the discrete time chain, if $p_i + q_{i+1} \le 1$ for all $0 \le i < n$, then $\mu K^m(i)/\pi(i) \le \mu K^m(i+1)/\pi(i+1)$ for all $0 \le i < n$ and $m \ge 0$.

(2) For the continuous time chain, $\mu H_t(i)/\pi(i) \le \mu H_t(i+1)/\pi(i+1)$ for all $0 \le i < n$ and $t \ge 0$.

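Proof. Note that (2) follows from (1) if we write $H_t = \exp\{-2t(I - \frac{I+K}{2})\}$, since the lazy chain $\frac{I+K}{2}$ always satisfies the condition in (1). For the proof of (1), observe that

\[
\frac{\mu K^{m+1}(i)}{\pi(i)} = \frac{\mu K^m(i-1)}{\pi(i-1)}\,q_i + \frac{\mu K^m(i)}{\pi(i)}\,r_i + \frac{\mu K^m(i+1)}{\pi(i+1)}\,p_i, \qquad \forall i.
\]

By induction, if $\mu K^m(i)/\pi(i) \le \mu K^m(i+1)/\pi(i+1)$ for $0 \le i < n$, then

\[
\begin{aligned}
\frac{\mu K^{m+1}(i+1)}{\pi(i+1)}
&= \frac{\mu K^m(i)}{\pi(i)}\,q_{i+1} + \frac{\mu K^m(i+1)}{\pi(i+1)}\,r_{i+1} + \frac{\mu K^m(i+2)}{\pi(i+2)}\,p_{i+1} \\
&\ge \frac{\mu K^m(i)}{\pi(i)}\,q_{i+1} + \frac{\mu K^m(i+1)}{\pi(i+1)}\,(1 - q_{i+1}) \\
&\ge \frac{\mu K^m(i)}{\pi(i)}\,(1 - p_i) + \frac{\mu K^m(i+1)}{\pi(i+1)}\,p_i
\ \ge\ \frac{\mu K^{m+1}(i)}{\pi(i)}.
\end{aligned}
\]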
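The monotonicity in Lemma 3.16(1) and the role of the condition $p_i + q_{i+1} \le 1$ can be illustrated on small chains. The sketch below is illustrative only; the function names and the sample rates in the usage note are hypothetical. It propagates $\mu K^m$ and checks at every step whether $\mu K^m/\pi$ is still non-decreasing.

```python
def transition_matrix(p, q):
    # p[i], q[i] are birth and death rates on {0, ..., n}; p[n] = q[0] = 0.
    n = len(p) - 1
    K = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        if i < n:
            K[i][i + 1] = p[i]
        if i > 0:
            K[i][i - 1] = q[i]
        K[i][i] = 1.0 - p[i] - q[i]
    return K

def stationary(p, q):
    # pi(i) = c * (p_0 ... p_{i-1}) / (q_1 ... q_i).
    w = [1.0]
    for i in range(1, len(p)):
        w.append(w[-1] * p[i - 1] / q[i])
    total = sum(w)
    return [v / total for v in w]

def ratio_stays_monotone(p, q, mu, steps=30, tol=1e-12):
    # True if mu K^m / pi is non-decreasing for every m = 0, ..., steps.
    K = transition_matrix(p, q)
    pi = stationary(p, q)
    size = len(p)
    for _ in range(steps + 1):
        r = [mu[i] / pi[i] for i in range(size)]
        if any(r[i] > r[i + 1] + tol for i in range(size - 1)):
            return False
        mu = [sum(mu[i] * K[i][x] for i in range(size)) for x in range(size)]
    return True
```

For instance, with `p = [0.5, 0.3, 0.4, 0.0]`, `q = [0.0, 0.2, 0.3, 0.6]` and `mu` the point mass at the last state, the condition of the lemma holds and the check succeeds. For the two-state chain with `p = [1.0, 0.0]`, `q = [0.0, 1.0]`, where $p_0 + q_1 = 2$, the ratio is reversed after one step and the check fails.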

Remark 3.12. Lemma 3.16 is also developed in [10], in which it is shown that, for any non-negative non-decreasing function $f$, $K^m f$ is non-decreasing for all $m \ge 0$. Consider the adjoint chain $K^*$ of $K$ in $L^2(\pi)$. As birth and death chains are reversible, one has $K^* = K$. Using the identity $\mu K/\pi = K^*(\mu/\pi)$, it is easy to see that the above proof is consistent with the proof in [10].

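Proof of Theorem 3.15. Assume that $K$ is irreducible and let $\lambda_0 = 0 < \lambda_1 < \cdots < \lambda_n$ be the eigenvalues of $I - K$ with $L^2(\pi)$-normalized eigenvectors $\zeta_0 = \mathbf{1}, \dots, \zeta_n$. By Lemma 3.16, if $\mu$ satisfies $\mu(i)/\pi(i) \ge \mu(i+1)/\pi(i+1)$ for $0 \le i < n$, then

\[
d^{c}_{\mathrm{sep}}(\mu, t) = 1 - \frac{\mu H_t(n)}{\pi(n)} = -\sum_{j=1}^{n} \mu(\zeta_j)\zeta_j(n)e^{-\lambda_j t},
\]

where $\mu(\zeta_j) = \sum_{i=0}^{n} \zeta_j(i)\mu(i)$. If $K$ satisfies $p_i + q_{i+1} \le 1$ for all $0 \le i < n$, then

\[
d_{\mathrm{sep}}(\mu, m) = 1 - \frac{\mu K^m(n)}{\pi(n)} = -\sum_{j=1}^{n} \mu(\zeta_j)\zeta_j(n)(1 - \lambda_j)^m.
\]

By Proposition 3.12, setting $\mu$ to be one of the Dirac measures $\delta_0$, $\delta_n$ leads to the desired identities.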

4. Paths of infinite length

In this section, the graph $G = (V, E)$ under consideration is infinite with $V = \{1, 2, \dots\}$ and $E = \{\{i, i+1\} \mid i = 1, 2, \dots\}$. As before, let $\pi$, $\nu$ be positive measures on $V$, $E$ satisfying $\pi(V) = 1$. The Dirichlet form and the variance are defined in a similar way as in the introduction and the spectral gap of $G$ with respect to $\pi$, $\nu$ is given by

\[
\lambda^G_{\pi,\nu} = \inf\left\{\frac{\mathcal{E}_\nu(f, f)}{\mathrm{Var}_\pi(f)} \,\middle|\, f \text{ is non-constant and } \pi(f^2) < \infty\right\}.
\]

For $n \ge 2$, let $G_n = (V_n, E_n)$ be the subgraph of $G$ with $V_n = \{1, 2, \dots, n\}$, $E_n = \{\{i, i+1\} \mid 1 \le i < n\}$ and let $\pi_n$, $\nu_n$ be the normalized restrictions of $\pi$, $\nu$ to $V_n$, $E_n$. That is, $\pi_n(i) = c_n\pi(i)$, $\nu_n(i, i+1) = c_n\nu(i, i+1)$ with $c_n = 1/[\pi(1) + \cdots + \pi(n)]$. As before, let $M^G_{\pi,\nu}$ be an infinite matrix indexed by $V$ and defined by

\[
M^G_{\pi,\nu}(i, j) = -\frac{\nu(i, j)}{\pi(i)}, \quad \forall |i - j| = 1, \qquad
M^G_{\pi,\nu}(i, i) = \frac{\nu(i-1, i) + \nu(i, i+1)}{\pi(i)}. \tag{4.1}
\]

Clearly, $M^{G_n}_{\pi_n,\nu_n}$ is the principal submatrix of $M^G_{\pi,\nu}$ indexed by $V_n \times V_n$.

Lemma 4.1. Referring to the above setting, $\lambda^{G_{n+1}}_{\pi_{n+1},\nu_{n+1}} < \lambda^{G_n}_{\pi_n,\nu_n}$ for $n > 1$ and