Spectral computations for birth and death chains


ScienceDirect

Stochastic Processes and their Applications 124 (2014) 848–882

www.elsevier.com/locate/spa


Guan-Yu Chen^{a,∗}, Laurent Saloff-Coste^{b}

^{a} Department of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan
^{b} Malott Hall, Department of Mathematics, Cornell University, Ithaca, NY 14853-4201, United States

Received 22 June 2012; received in revised form 18 September 2013; accepted 1 October 2013
Available online 11 October 2013

Abstract

We consider the spectrum of birth and death chains on an $n$-path. An iterative scheme is proposed to compute any eigenvalue with exponential convergence rate independent of $n$. This allows one to determine the whole spectrum in order $n^2$ elementary operations. Using the same idea, we also provide a lower bound on the spectral gap, which is of the correct order on some classes of examples.

MSC: 60J10; 60J27

Keywords: Birth and death chains; Spectrum

1. Introduction

Let $G = (V, E)$ be the undirected finite path with vertex set $V = \{1, 2, \ldots, n\}$ and edge set $E = \{\{i, i+1\} : i = 1, 2, \ldots, n-1\}$. Given two positive measures $\pi, \nu$ on $V, E$ with $\pi(V) = 1$, the Dirichlet form and variance associated with $\nu$ and $\pi$ are defined by

$$\mathcal{E}_\nu(f, g) := \sum_{i=1}^{n-1} [f(i) - f(i+1)][g(i) - g(i+1)]\,\nu(i, i+1)$$

and

$$\mathrm{Var}_\pi(f) := \pi(f^2) - \pi(f)^2,$$

∗ Corresponding author. Tel.: +886 35712121.

where $f, g$ are functions on $V$. When convenient, we set $\nu(0, 1) = \nu(n, n+1) = 0$. The spectral gap of $G$ with respect to $\pi, \nu$ is defined as

$$\lambda^G_{\pi,\nu} := \min\left\{ \frac{\mathcal{E}_\nu(f, f)}{\mathrm{Var}_\pi(f)} \;\middle|\; f \text{ is non-constant} \right\}.$$

Let $M^G_{\pi,\nu}$ be the matrix given by $M^G_{\pi,\nu}(i, j) = 0$ for $|i - j| > 1$ and

$$M^G_{\pi,\nu}(i, j) = -\frac{\nu(i, j)}{\pi(i)}, \quad \forall\, |i - j| = 1, \qquad M^G_{\pi,\nu}(i, i) = \frac{\nu(i-1, i) + \nu(i, i+1)}{\pi(i)}.$$

Obviously, $\lambda^G_{\pi,\nu}$ is the smallest non-zero eigenvalue of $M^G_{\pi,\nu}$.

Undirected paths equipped with measures $\pi, \nu$ are closely related to birth and death chains. A birth and death chain on $\{0, 1, 2, \ldots, n\}$ with birth rate $p_i$, death rate $q_i$ and holding rate $r_i$ is a Markov chain with transition matrix $K$ given by

$$K(i, i+1) = p_i, \quad K(i, i-1) = q_i, \quad K(i, i) = r_i, \quad \forall\, 0 \le i \le n, \tag{1.1}$$

where $p_i + q_i + r_i = 1$ and $p_n = q_0 = 0$. Under the assumption of irreducibility, that is, $p_i q_{i+1} > 0$ for $0 \le i < n$, $K$ has a unique stationary distribution $\pi$ given by $\pi(i) = c\,(p_0 \cdots p_{i-1})/(q_1 \cdots q_i)$, where $c$ is the positive constant such that $\sum_{i=0}^{n} \pi(i) = 1$. The smallest non-zero eigenvalue of $I - K$ is exactly the spectral gap of the path on $\{0, 1, \ldots, n\}$ with measures $\pi, \nu$, where $\nu(i, i+1) = \pi(i)p_i = \pi(i+1)q_{i+1}$ for $0 \le i < n$.
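The correspondence between a birth and death chain and the weighted path can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the rates `p`, `q` and the helper names `stationary_and_edge_weights` and `path_matrix` are illustrative choices, not taken from the paper. It builds $\pi$ and $\nu$ from the rates and forms the tridiagonal matrix of the introduction, which in this setting coincides with $I - K$.

```python
import numpy as np

def stationary_and_edge_weights(p, q):
    """Return (pi, nu) for a birth and death chain on {0, ..., n}.

    p[i], q[i] are the birth and death rates; nu[i] stands for nu(i, i+1).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    w = np.ones(len(p))
    for i in range(1, len(p)):
        # pi(i) is proportional to (p_0 ... p_{i-1}) / (q_1 ... q_i)
        w[i] = w[i - 1] * p[i - 1] / q[i]
    pi = w / w.sum()
    nu = pi[:-1] * p[:-1]  # nu(i, i+1) = pi(i) p_i
    return pi, nu

def path_matrix(pi, nu):
    """Tridiagonal matrix M of the introduction (0-based indices)."""
    m = len(pi)
    M = np.zeros((m, m))
    for i in range(m - 1):
        M[i, i + 1] = -nu[i] / pi[i]
        M[i + 1, i] = -nu[i] / pi[i + 1]
        M[i, i] += nu[i] / pi[i]
        M[i + 1, i + 1] += nu[i] / pi[i + 1]
    return M

# Hypothetical rates on {0, 1, 2, 3}; p_n = q_0 = 0 as required.
p = np.array([0.5, 0.3, 0.4, 0.0])
q = np.array([0.0, 0.2, 0.3, 0.6])
K = np.diag(1.0 - p - q) + np.diag(p[:-1], 1) + np.diag(q[1:], -1)

pi, nu = stationary_and_edge_weights(p, q)
M = path_matrix(pi, nu)
# Second smallest eigenvalue of M, i.e. the spectral gap.
spectral_gap = np.sort(np.linalg.eigvals(M).real)[1]
print(spectral_gap)
```

A dense eigenvalue solver is used here only as a reference value for small examples; it is not the scheme studied in this paper.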

Note that if $\mathbf{1}$ is the constant function of value 1 and $\psi$ is a minimizer for $\lambda^G_{\pi,\nu}$, then $\psi - \pi(\psi)\mathbf{1}$ is an eigenvector of $M^G_{\pi,\nu}$. This implies that any minimizer $\psi$ for $\lambda^G_{\pi,\nu}$ satisfying $\pi(\psi) = 0$ satisfies the Euler–Lagrange equation,

$$\lambda^G_{\pi,\nu}\,\pi(i)\psi(i) = [\psi(i) - \psi(i-1)]\nu(i-1, i) + [\psi(i) - \psi(i+1)]\nu(i, i+1), \tag{1.2}$$

for all $1 \le i \le n$. Assuming the connectedness of $G$ (i.e., the superdiagonal and subdiagonal entries of $M^G_{\pi,\nu}$ are positive), the rank of $M^G_{\pi,\nu} - \lambda I$ is at least $n - 1$. This implies that all eigenvalues of $M^G_{\pi,\nu}$ are simple. See Lemma A.3 for an illustration. Observe that, by (1.2), any non-trivial eigenvector of $M^G_{\pi,\nu}$ has mean 0 under $\pi$. This implies that all minimizers for the spectral gap are of the form $a\psi + b\mathbf{1}$, where $a, b$ are constants and $\psi$ is a nontrivial solution of (1.2). In 2009, Miclo obtained implicitly the following result.

Theorem 1.1 ([15, Proposition 1]). If $\psi$ is a minimizer for $\lambda^G_{\pi,\nu}$, then $\psi$ must be monotonic, that is, either $\psi(i) \le \psi(i+1)$ for all $1 \le i < n$ or $\psi(i) \ge \psi(i+1)$ for all $1 \le i < n$.

One aim of this paper is to provide a scheme to compute the spectrum of $M^G_{\pi,\nu}$, in particular, the spectral gap. Based on Miclo's observation, it is natural to consider the following algorithm.

Choose two positive reals $\lambda_0, a$ in advance and set, for $k = 0, 1, \ldots$,

1. $\psi_k(1) = -a$,

2. $\psi_k(i+1) = \psi_k(i) + \dfrac{\{[\psi_k(i) - \psi_k(i-1)]\nu(i-1, i) - \lambda_k\pi(i)\psi_k(i)\}^+}{\nu(i, i+1)}$, for $1 \le i < n$, where $t^+ = \max\{t, 0\}$,

3. $\lambda_{k+1} = \dfrac{\mathcal{E}_\nu(\psi_k, \psi_k)}{\mathrm{Var}_\pi(\psi_k)}$.

(A1)
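The three steps of (A1) translate directly into a short loop. The following is a minimal sketch, assuming NumPy and 0-based arrays in which `nu[i]` stands for $\nu(i+1, i+2)$; the function names and the test weights are illustrative and not part of the paper. The recursion is written in terms of the flux $[\psi(i+1) - \psi(i)]\nu(i, i+1)$, which equals the truncated bracket in step 2.

```python
import numpy as np

def truncated_recursion(lam, pi, nu, a=1.0):
    """Steps 1-2 of (A1): build psi from lam; nu[i] stands for nu(i+1, i+2)."""
    n = len(pi)
    psi = np.empty(n)
    psi[0] = -a
    flux = 0.0  # [psi(i) - psi(i-1)] nu(i-1, i); zero at i = 1 since nu(0, 1) = 0
    for i in range(n - 1):
        flux = max(flux - lam * pi[i] * psi[i], 0.0)  # the {.}^+ truncation
        psi[i + 1] = psi[i] + flux / nu[i]
    return psi

def rayleigh_quotient(f, pi, nu):
    """E_nu(f, f) / Var_pi(f)."""
    energy = np.sum(np.diff(f) ** 2 * nu)
    variance = pi @ f**2 - (pi @ f) ** 2
    return energy / variance

def algorithm_a1(pi, nu, lam0, a=1.0, steps=30):
    """Return [lambda_1, ..., lambda_steps] produced by (A1)."""
    lam, history = lam0, []
    for _ in range(steps):
        psi = truncated_recursion(lam, pi, nu, a)
        lam = rayleigh_quotient(psi, pi, nu)  # step 3
        history.append(lam)
    return history

# Hypothetical example: uniform pi and nu on a path with n = 3 vertices.
# Here M = [[1, -1, 0], [-1, 2, -1], [0, -1, 1]] has eigenvalues 0, 1, 3,
# so the spectral gap is 1.
pi = np.full(3, 1.0 / 3.0)
nu = np.full(2, 1.0 / 3.0)
history = algorithm_a1(pi, nu, lam0=5.0)
print(history[:3])
```

In this small example the first iterate is $\lambda_1 = 3/2$, and the later iterates decrease towards the gap 1.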

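Theorem 1.2 (Convergence to the Exact Value). Referring to (A1), if $n = 2$, then $\lambda_k = \lambda^G_{\pi,\nu}$ for all $k \ge 1$. If $n \ge 3$, then the sequence $(\lambda_k, \psi_k)$ satisfies

(1) If $\lambda_0 = \lambda^G_{\pi,\nu}$, then $\lambda_k = \lambda^G_{\pi,\nu}$ for all $k \ge 0$.

(2) If $\lambda_0 \ne \lambda^G_{\pi,\nu}$, then $\lambda_k > \lambda_{k+1} > \lambda^G_{\pi,\nu}$ for $k \ge 1$.

(3) Set $(\lambda^*, \psi^*) = \lim_{k\to\infty}(\lambda_k, \psi_k)$. Then, $\lambda^* = \mathcal{E}_\nu(\psi^*, \psi^*)/\mathrm{Var}_\pi(\psi^*) = \lambda^G_{\pi,\nu}$ and $\pi(\psi^*) = 0$.

Theorem 1.3 (Rate of Convergence). Referring to Theorem 1.2, there is a constant $\sigma \in (0, 1)$ independent of the choice of $(\lambda_0, a)$ such that $0 \le \lambda_k - \lambda^G_{\pi,\nu} \le \sigma^{k-1}\lambda_1$ for all $k \ge 1$.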

By Theorem 1.3, we know that the sequence $\lambda_k$ generated in (A1) converges to the spectral gap exponentially but the rate $(-\log\sigma)$ is undetermined. The following alternative scheme is based on using more information on the spectral gap and will provide convergence at a constant rate.

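Choose $a > 0$, $L_0 < \lambda^G_{\pi,\nu} < U_0$ in advance and set, for $k = 0, 1, \ldots$,

1. $\psi_k(1) = -a$, $\lambda_k = \frac{1}{2}(L_k + U_k)$,

2. $\psi_k(i+1) = \psi_k(i) + \dfrac{\{[\psi_k(i) - \psi_k(i-1)]\nu(i-1, i) - \lambda_k\pi(i)\psi_k(i)\}^+}{\nu(i, i+1)}$, for $1 \le i < n$, where $t^+ = \max\{t, 0\}$,

3. $\begin{cases} L_{k+1} = L_k,\ U_{k+1} = \lambda_k & \text{if } \pi(\psi_k) > 0 \\ L_{k+1} = \lambda_k,\ U_{k+1} = U_k & \text{if } \pi(\psi_k) < 0 \\ L_{k+1} = U_{k+1} = \lambda_k & \text{if } \pi(\psi_k) = 0. \end{cases}$

(A2)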

Theorem 1.4 (Dichotomy Method). Referring to (A2), it holds true that $0 \le \max\{U_k - \lambda^G_{\pi,\nu},\ \lambda^G_{\pi,\nu} - L_k\} \le (U_0 - L_0)2^{-k}$, $\forall k \ge 0$.
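The dichotomy scheme (A2) can be sketched as a bisection on the sign of $\pi(\psi_k)$. The following is a minimal illustration, assuming NumPy; the weights are hypothetical, the function names are not from the paper, and the initial upper bound $\nu(1,2)[\pi(1)^{-1} + \pi(2)^{-1}]$ is the quantity $\lambda^{(1)}$ of Proposition 2.5, which exceeds the spectral gap when $n \ge 3$. The lower bound 0 is an assumption of this sketch.

```python
import numpy as np

def truncated_recursion(lam, pi, nu, a=1.0):
    """Steps 1-2 of (A2): build psi from lam; nu[i] stands for nu(i+1, i+2)."""
    n = len(pi)
    psi = np.empty(n)
    psi[0] = -a
    flux = 0.0  # [psi(i) - psi(i-1)] nu(i-1, i); zero at i = 1
    for i in range(n - 1):
        flux = max(flux - lam * pi[i] * psi[i], 0.0)  # the {.}^+ truncation
        psi[i + 1] = psi[i] + flux / nu[i]
    return psi

def algorithm_a2(pi, nu, lower, upper, a=1.0, steps=40):
    """Bisection of (A2): returns (L_steps, U_steps)."""
    for _ in range(steps):
        lam = 0.5 * (lower + upper)
        mean = pi @ truncated_recursion(lam, pi, nu, a)
        if mean > 0:
            upper = lam
        elif mean < 0:
            lower = lam
        else:
            lower = upper = lam
    return lower, upper

# Hypothetical weights on a path with n = 5 vertices.
pi = np.array([1.0, 2.0, 3.0, 2.0, 1.5])
pi /= pi.sum()
nu = np.array([0.10, 0.20, 0.15, 0.10])

upper0 = nu[0] * (1.0 / pi[0] + 1.0 / pi[1])  # lambda^(1) of Proposition 2.5
lower, upper = algorithm_a2(pi, nu, 0.0, upper0)
print(lower, upper)
```

Each step costs one pass over the path, so reaching a prescribed accuracy takes a number of operations proportional to $n$ times the number of bisection steps.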

In Theorem 1.4, the convergence to the spectral gap is exponentially fast with explicit rate, $\log 2$. See Remark 2.2 for a discussion on the choice of $L_0$ and $U_0$. For higher order spectra, Miclo has a detailed description of the shape of eigenvectors in [14] and this will motivate the definition of similar algorithms for every eigenvalue in the spectrum. See (Di) and Theorem 3.4 for a generalization of (A2) and Theorem 3.14 for a localized version of Theorem 1.3.

The spectral gap is an important parameter in the quantitative analysis of Markov chains. The cutoff phenomenon, a sharp phase transition phenomenon for Markov chains, was introduced by Aldous and Diaconis in the early 1980s. It is of interest in many applications. A heuristic conjecture proposed by Peres in 2004 says that the cutoff exists if and only if the product of the spectral gap and the mixing time tends to infinity. Assuming reversibility, this has been proved to hold for $L^p$-convergence with $1 < p \le \infty$ in [2]. For the $L^1$-convergence, Ding et al. [10] prove this conjecture for continuous time birth and death chains. In order to use Peres' conjecture in practice, the orders of magnitude of the spectral gap and the mixing time are required. The second aspect of this paper is to derive a theoretical lower bound on the spectral gap using only the birth and death rates. This lower bound is obtained using the same idea used to analyze the above algorithm. For estimates on the mixing time of birth and death chains, we refer the readers to the recent work [4] by Chen and Saloff-Coste. For illustration, we consider several examples of specific interest and show that the lower bound provided here is in fact of the correct order in these examples.

This article is organized as follows. In Section 2, the algorithms in (A1)–(A2) are explored and proofs for Theorems 1.2–1.4 are given. In Section 3, the spectrum of $M^G_{\pi,\nu}$ is discussed further and, based on Miclo's work [14], algorithm (A2) is generalized to any specified eigenvalue of $M^G_{\pi,\nu}$. Our method is applicable for paths of infinite length (one-sided) and this is described in Section 4. For illustration, we consider some Metropolis chains and display numerical results of algorithm (A2) in Section 5. In Section 6, we focus on uniform measures with bottlenecks and determine the correct order of the spectral gap using the theory in Sections 2 and 3. It is worthwhile to remark that the assumptions in Section 6 can be relaxed using the comparison technique in [7,8]. As the work in this paper can also be regarded as a stochastic counterpart of the theory of finite Jacobi matrices, we would like to refer the readers to [18,19] for a complementary perspective.

2. Convergence to the spectral gap

This section is devoted to proving Theorems 1.2–1.4. First, we prove Theorem 1.1 in the following form.

Lemma 2.1. Let $\lambda > 0$ and $\psi$ be a non-constant function on $V$. Suppose $(\lambda, \psi)$ solves (1.2) and $\psi$ is monotonic. Then, $\psi$ is strictly monotonic, that is, either $\psi(i) < \psi(i+1)$ for $1 \le i < n$ or $\psi(i) > \psi(i+1)$ for $1 \le i < n$.

Proof. Obviously, (1.2) implies that $\pi(\psi) = 0$. Without loss of generality, it suffices to consider the case when $\psi(1) < 0$ and $\psi(n) > 0$. Since $\psi$ is non-constant and $\lambda^G_{\pi,\nu} > 0$, we have $\psi(1) < \psi(2)$ and $\psi(n-1) < \psi(n)$. Note that if there are $1 < i < j < n$ such that $\psi(i-1) < \psi(i)$, $\psi(j) < \psi(j+1)$ and $\psi(k) = \psi(i) = \psi(j)$ for $i \le k \le j$, then (1.2) yields

$$\lambda^G_{\pi,\nu}\,\pi(i)\psi(i) = [\psi(i) - \psi(i-1)]\nu(i-1, i) + [\psi(i) - \psi(i+1)]\nu(i, i+1) > 0$$

and

$$\lambda^G_{\pi,\nu}\,\pi(j)\psi(j) = [\psi(j) - \psi(j-1)]\nu(j-1, j) + [\psi(j) - \psi(j+1)]\nu(j, j+1) < 0,$$

a contradiction. Thus, $\psi$ is strictly increasing.

We note the following corollary.

Corollary 2.2. Let $(\lambda, \psi)$ be a pair satisfying (1.2). Then, $\lambda = \lambda^G_{\pi,\nu}$ if and only if $\psi$ is monotonic.

Proof. One direction is obvious from Theorem 1.1. For the other direction, assume that $\psi$ is monotonic and let $\phi$ be a minimizer for $\lambda^G_{\pi,\nu}$ with $\pi(\phi) = 0$. Since $(\lambda, \psi)$ and $(\lambda^G_{\pi,\nu}, \phi)$ are solutions to (1.2), one has

$$\lambda\,\pi(\psi\phi) = \mathcal{E}_\nu(\psi, \phi) = \lambda^G_{\pi,\nu}\,\pi(\phi\psi).$$

By Lemma 2.1, $\psi$ and $\phi$ are strictly monotonic and this implies $\mathcal{E}_\nu(\psi, \phi) \ne 0$. As a consequence of the above equations, we have $\lambda = \lambda^G_{\pi,\nu}$.

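Proposition 2.3. Suppose that $(\lambda, \psi)$ satisfies $\lambda > 0$, $\psi(1) < 0$ and, for $1 \le i < n$,

$$\psi(i+1) = \psi(i) + \frac{\{[\psi(i) - \psi(i-1)]\nu(i-1, i) - \lambda\pi(i)\psi(i)\}^+}{\nu(i, i+1)}, \tag{2.1}$$

where $t^+ = \max\{t, 0\}$. Then, the following are equivalent.

(1) $\mathcal{E}_\nu(\psi, \psi) = \lambda\,\mathrm{Var}_\pi(\psi)$.

(2) $\pi(\psi) = 0$.

(3) $\lambda = \lambda^G_{\pi,\nu}$.

Furthermore, if $n \ge 3$, then any of the above is equivalent to

(4) $\mathcal{E}_\nu(\psi, \psi) = \lambda^G_{\pi,\nu}\,\mathrm{Var}_\pi(\psi)$.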

Remark 2.1. For $n = 2$, it is an easy exercise to show that $\lambda^G_{\pi,\nu} = \nu(1, 2)/(\pi(1)\pi(2))$. By following the formula in (2.1), one has $\psi(2) = \psi(1)[1 - \lambda\pi(1)/\nu(1, 2)]$, which leads to $\mathcal{E}_\nu(\psi, \psi)/\mathrm{Var}_\pi(\psi) = \lambda^G_{\pi,\nu}$.
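The computation behind Remark 2.1 can be spelled out as follows. This is a short worked derivation under the remark's own assumptions ($n = 2$, $\lambda > 0$, $\psi(1) < 0$, and $\pi(1) + \pi(2) = 1$), added only as an illustration. Since $\nu(0,1) = 0$, formula (2.1) gives

$$\psi(1) - \psi(2) = \frac{\lambda\pi(1)}{\nu(1,2)}\,\psi(1) \ne 0, \qquad \mathcal{E}_\nu(\psi,\psi) = [\psi(1) - \psi(2)]^2\,\nu(1,2), \qquad \mathrm{Var}_\pi(\psi) = \pi(1)\pi(2)\,[\psi(1) - \psi(2)]^2,$$

so that

$$\frac{\mathcal{E}_\nu(\psi,\psi)}{\mathrm{Var}_\pi(\psi)} = \frac{\nu(1,2)}{\pi(1)\pi(2)} = \lambda^G_{\pi,\nu},$$

whatever the value of $\lambda > 0$.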

Proof of Proposition 2.3. Set $B = \{1 \le i \le n \mid \psi(i) = \psi(n)\}$ and $B^c = \{1, 2, \ldots, i_0\}$. Since $\psi(1) < 0$ and $\lambda > 0$, $\psi(1) < \psi(2)$ and $B^c$ is nonempty. According to (2.1), $\psi$ is non-decreasing. Note that if $\psi(i) = \psi(i+1)$, then $\psi(i) \ge 0$ and $\psi(i+2) = \psi(i+1)$. This implies $\psi$ is strictly increasing on $\{1, 2, \ldots, i_0+1\}$ and, for $1 \le i \le i_0$,

$$\lambda\pi(i)\psi(i) = [\psi(i) - \psi(i+1)]\nu(i, i+1) + [\psi(i) - \psi(i-1)]\nu(i-1, i).$$

Multiplying $\psi(i)$ on both sides and summing over all $i$ in $B^c$ yields

$$\begin{aligned}
\lambda\sum_{i=1}^{i_0}\psi(i)^2\pi(i) &= \sum_{i=1}^{i_0-1}[\psi(i) - \psi(i+1)]^2\nu(i, i+1) + \psi(i_0)[\psi(i_0) - \psi(i_0+1)]\nu(i_0, i_0+1) \\
&= \mathcal{E}_\nu(\psi, \psi) + \psi(i_0+1)[\psi(i_0) - \psi(i_0+1)]\nu(i_0, i_0+1) \\
&= \mathcal{E}_\nu(\psi, \psi) + \lambda\psi(n)\sum_{i=1}^{i_0}\psi(i)\pi(i).
\end{aligned}$$

This is equivalent to

$$\mathcal{E}_\nu(\psi, \psi) = \lambda\,\mathrm{Var}_\pi(\psi) + \lambda\pi(\psi)[\pi(\psi) - \psi(n)], \tag{2.2}$$

which proves (1)⇔(2). If $\lambda = \lambda^G_{\pi,\nu}$, then $\psi$ is an eigenvector for $M^G_{\pi,\nu}$ associated to $\lambda^G_{\pi,\nu}$. This proves (3)⇒(2).

For (2)⇒(3), assume that $\pi(\psi) = 0$. In this case, $\psi$ must be strictly increasing. Otherwise, $\psi(i) = \psi(n) > 0$ for $i \in B$ and, according to (2.1), this implies

$$\lambda\,\mathrm{Var}_\pi(\psi) > \lambda\sum_{i=1}^{n-1}\pi(i)\psi^2(i) \ge \sum_{i=1}^{n-1}[\psi(i) - \psi(i+1)]^2\nu(i, i+1) = \mathcal{E}_\nu(\psi, \psi),$$

which contradicts (1). As $\psi$ is strictly increasing and $\pi(\psi) = 0$, $(\lambda, \psi)$ solves (1.2). By Corollary 2.2, $\lambda = \lambda^G_{\pi,\nu}$.

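To finish the proof, it remains to show (4)⇒(3) ((3)⇒(4) is obvious from the equivalence among (1), (2) and (3)). Assume that $\mathcal{E}_\nu(\psi, \psi) = \lambda^G_{\pi,\nu}\mathrm{Var}_\pi(\psi)$. By Lemma 2.1, $\psi$ is strictly monotonic and this implies, for $1 \le i < n$,

$$\lambda\pi(i)\psi(i) = [\psi(i) - \psi(i+1)]\nu(i, i+1) + [\psi(i) - \psi(i-1)]\nu(i-1, i).$$

As $\psi$ is a minimizer for $\lambda^G_{\pi,\nu}$, one has, for $1 \le i \le n$,

$$\lambda^G_{\pi,\nu}\,\pi(i)[\psi(i) - \pi(\psi)] = [\psi(i) - \psi(i+1)]\nu(i, i+1) + [\psi(i) - \psi(i-1)]\nu(i-1, i).$$

If $\lambda \ne \lambda^G_{\pi,\nu}$, the comparison of both systems yields

$$\psi(i) = \frac{\lambda^G_{\pi,\nu}\,\pi(\psi)}{\lambda^G_{\pi,\nu} - \lambda}, \quad \forall\, 1 \le i < n.$$

As $n \ge 3$, $\psi(1) = \psi(2)$, a contradiction! This forces $\lambda = \lambda^G_{\pi,\nu}$, as desired.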

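The following is a simple corollary of Proposition 2.3, which plays an important role in proving Theorem 1.4.

Corollary 2.4. Let $n \ge 3$. For $\lambda > 0$, let $\phi_\lambda$ be the vector generated by (2.1) with $\phi_\lambda(1) < 0$. Then, $(\lambda - \lambda^G_{\pi,\nu})\pi(\phi_\lambda) > 0$ for $\lambda > 0$ and $\lambda \ne \lambda^G_{\pi,\nu}$.

Proof. Without loss of generality, we fix $\phi_\lambda(1) = -1$ for all $\lambda > 0$. Set $T(\lambda) = \pi(\phi_\lambda)$. To prove this corollary, it suffices to show that

$$T(\lambda)\begin{cases} < 0 & \text{if } \lambda < \lambda^G_{\pi,\nu} \\ > 0 & \text{if } \lambda > \lambda^G_{\pi,\nu}. \end{cases}$$

For $\lambda > 0$, define $L(\lambda) := \mathcal{E}_\nu(\phi_\lambda, \phi_\lambda)/\mathrm{Var}_\pi(\phi_\lambda)$. By (2.2), one has

$$L(\lambda) - \lambda = \frac{\lambda T(\lambda)[\pi(\phi_\lambda) - \phi_\lambda(n)]}{\mathrm{Var}_\pi(\phi_\lambda)}. \tag{2.3}$$

Since $\phi_\lambda$ is non-constant, $\pi(\phi_\lambda) < \phi_\lambda(n)$. This implies $T(\lambda) < 0$ for $\lambda \in (0, \lambda^G_{\pi,\nu})$.

For $\lambda > \lambda^G_{\pi,\nu}$, set $I = (\lambda^G_{\pi,\nu}, \infty)$. By Proposition 2.3, $T(\lambda) = 0$ if and only if $\lambda = \lambda^G_{\pi,\nu}$. By the continuity of $T$, this implies either $T(I) \subset (-\infty, 0)$ or $T(I) \subset (0, \infty)$. In the case $T(I) \subset (-\infty, 0)$, one has $L(\lambda) > \lambda$ for $\lambda \in I$. As $L(I)$ is bounded, the iterates $L^k(\lambda)$ converge to a limit $\tilde\lambda > \lambda^G_{\pi,\nu}$ and this yields

$$0 = \lim_{k\to\infty}[L^{k+1}(\lambda) - L^k(\lambda)] = \frac{\tilde\lambda\,T(\tilde\lambda)[\pi(\phi_{\tilde\lambda}) - \phi_{\tilde\lambda}(n)]}{\mathrm{Var}_\pi(\phi_{\tilde\lambda})} > 0,$$

a contradiction. Hence, $T(\lambda) > 0$ for $\lambda > \lambda^G_{\pi,\nu}$.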

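Proof of Theorem 1.2. The proof for $n = 2$ is obvious from a direct computation and we deal with the case $n \ge 3$ here. By the equivalence of Proposition 2.3(3)–(4), if $\lambda_0 = \lambda^G_{\pi,\nu}$, then $\lambda_k = \lambda^G_{\pi,\nu}$ for all $k \ge 1$. If $\lambda_0 \ne \lambda^G_{\pi,\nu}$, then $\lambda_k > \lambda^G_{\pi,\nu}$ for $k \ge 1$. Note that $(\lambda_k, \psi_k)$ solves the system in (2.1). By (2.2), this implies

$$\lambda_{k+1} - \lambda_k = \frac{\lambda_k\pi(\psi_k)[\pi(\psi_k) - \psi_k(n)]}{\mathrm{Var}_\pi(\psi_k)}, \quad \forall k \ge 0.$$

The strict monotonicity of $\lambda_k$ in (2) comes immediately from Corollary 2.4. In (3), the continuity of (2.1) in $\lambda$ implies that $(\lambda^*, \psi^*)$ is a solution to (2.1) and $\mathcal{E}_\nu(\psi^*, \psi^*) = \lambda^*\mathrm{Var}_\pi(\psi^*)$. By Proposition 2.3, $\lambda^* = \lambda^G_{\pi,\nu}$ and $\pi(\psi^*) = 0$, as desired.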

Proof of Theorem 1.3. Recall the notation in the proof of Corollary 2.4: for $\lambda > 0$, let $\phi_\lambda$ be the function defined by (2.1) and $L(\lambda) = \mathcal{E}_\nu(\phi_\lambda, \phi_\lambda)/\mathrm{Var}_\pi(\phi_\lambda)$. By (2.2) and Corollary 2.4, $L(\lambda) \in (\lambda^G_{\pi,\nu}, \lambda)$ for $\lambda > \lambda^G_{\pi,\nu}$. As $L$ is bounded, Theorem 1.3 follows from Lemma A.1.

Proof of Theorem 1.4. Immediate from Corollary 2.4.

At the end of this section, we use the following proposition to find how the shape of the function $\psi$ in (2.1) evolves with $\lambda$. In Proposition 2.5, we set $\phi_\lambda = \psi$ when $\psi$ is given by (2.1). It is easy to see from (2.1) that $\phi_\lambda$ is strictly increasing before some constant, say $i_0 = i_0(\lambda)$, and then stays constant equal to $\phi_\lambda(i_0)$ after $i_0$. The proposition shows how the constant $i_0(\lambda)$ evolves.

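Proposition 2.5. For $\lambda > 0$, let $\phi_\lambda$ be the function generated by (2.1) with $\phi_\lambda(1) = -1$ and, for $1 \le i \le n$, set $T_i(\lambda) = \sum_{j=1}^{i}\phi_\lambda(j)\pi(j)$. For $1 \le i < n$, let

$$a_i(\lambda) = 1 + \frac{\pi(i+1)}{\pi(i)} - \frac{\lambda\pi(i+1)}{\nu(i, i+1)},$$

$$A_i(\lambda) = \begin{pmatrix}
a_1(\lambda) & 1 & 0 & \cdots & \cdots & 0 \\
\frac{\pi(3)}{\pi(2)} & a_2(\lambda) & 1 & \ddots & & \vdots \\
0 & \frac{\pi(4)}{\pi(3)} & a_3(\lambda) & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & \ddots & a_{i-1}(\lambda) & 1 \\
0 & \cdots & \cdots & 0 & \frac{\pi(i+1)}{\pi(i)} & a_i(\lambda)
\end{pmatrix}, \tag{2.4}$$

and let $\lambda^{(i)}$ be the smallest root of $\det A_i(\lambda) = 0$. Then,

(1) $\lambda^G_{\pi,\nu} = \lambda^{(n-1)} < \lambda^{(n-2)} < \cdots < \lambda^{(1)}$.

(2) $\phi_\lambda(i) < \phi_\lambda(i+1) = \phi_\lambda(i+2)$ for $\lambda \in [\lambda^{(i)}, \lambda^{(i-1)})$ and $1 \le i \le n-2$, where $\lambda^{(0)} := \infty$.

(3) $\phi_\lambda(n-1) < \phi_\lambda(n)$ for $\lambda \in (0, \lambda^{(n-2)})$.

In particular, $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ for $\lambda \in (0, \lambda^{(i-1)})$ and $(\lambda - \lambda^{(i)})T_{i+1}(\lambda) > 0$ for $\lambda \in (0, \lambda^{(i)}) \cup (\lambda^{(i)}, \infty)$ with $1 \le i \le n-1$.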
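The identity $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ can be checked numerically on a small example. The sketch below assumes NumPy and uses a hypothetical uniform path with $n = 5$ and $\nu = \pi$, for which $M^G_{\pi,\nu}$ is the standard path Laplacian with eigenvalues $2 - 2\cos(k\pi/n)$; the function names and the conventions $\det A_0 = 1$, $\det A_{-1} = 0$ are choices made for this illustration. The determinants are obtained by expanding (2.4) along its last row.

```python
import numpy as np

def truncated_recursion(lam, pi, nu):
    """phi_lambda of (2.1) with phi(1) = -1; nu[i] stands for nu(i+1, i+2)."""
    n = len(pi)
    phi = np.empty(n)
    phi[0] = -1.0
    flux = 0.0  # [phi(i) - phi(i-1)] nu(i-1, i); zero at i = 1
    for i in range(n - 1):
        flux = max(flux - lam * pi[i] * phi[i], 0.0)
        phi[i + 1] = phi[i] + flux / nu[i]
    return phi

def det_sequence(lam, pi, nu):
    """[det A_1(lam), ..., det A_{n-1}(lam)] by a three-term recurrence."""
    dets = []
    d_prev2, d_prev = 0.0, 1.0  # conventions det A_{-1} = 0, det A_0 = 1
    for i in range(1, len(pi)):  # i is the 1-based index of a_i
        a_i = 1.0 + pi[i] / pi[i - 1] - lam * pi[i] / nu[i - 1]
        sub = pi[i] / pi[i - 1]  # subdiagonal entry pi(i+1)/pi(i) in row i
        d = a_i * d_prev - sub * d_prev2
        dets.append(d)
        d_prev2, d_prev = d_prev, d
    return np.array(dets)

# Hypothetical example: uniform weights on a path with n = 5 vertices.
n = 5
pi = np.full(n, 1.0 / n)
nu = np.full(n - 1, 1.0 / n)
gap = 2.0 - 2.0 * np.cos(np.pi / n)  # smallest non-zero Laplacian eigenvalue

lam = 0.2  # below the gap, so no truncation occurs in (2.1)
partial_sums = np.cumsum(pi * truncated_recursion(lam, pi, nu))  # T_1, ..., T_n
dets = det_sequence(lam, pi, nu)
print(partial_sums[1:], -pi[0] * dets)
print(det_sequence(gap, pi, nu)[-1])  # det A_{n-1} vanishes at the gap
```

The two printed vectors agree, and the last determinant is zero up to rounding, in line with item (1) of the proposition.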

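Proof. By Lemma A.2, $\lambda^{(1)} > \lambda^{(2)} > \cdots > \lambda^{(n-1)} > 0$ and, for $1 \le i \le n-1$,

$$\det A_i(\lambda)\begin{cases} > 0 & \forall \lambda \in (-\infty, \lambda^{(i)}) \\ < 0 & \forall \lambda \in (\lambda^{(i)}, \lambda^{(i-1)}), \end{cases} \tag{2.5}$$

where $\lambda^{(0)} = \infty$. Note that if $T_i(\lambda) < 0$ for some $1 \le i \le n-1$, then

$$\phi_\lambda(j+1) = \phi_\lambda(j) + \frac{[\phi_\lambda(j) - \phi_\lambda(j-1)]\nu(j-1, j) - \lambda\pi(j)\phi_\lambda(j)}{\nu(j, j+1)}, \quad \forall\, 1 \le j \le i.$$

This implies

$$\phi_\lambda(\ell+1) = \phi_\lambda(\ell) - \frac{\lambda}{\nu(\ell, \ell+1)}\sum_{j=1}^{\ell}\pi(j)\phi_\lambda(j), \quad \forall\, 1 \le \ell \le i. \tag{2.6}$$

Multiplying $\pi(\ell+1)$ and adding up $T_\ell(\lambda)$ yields

$$T_{\ell+1}(\lambda) = a_\ell(\lambda)T_\ell(\lambda) - \frac{\pi(\ell+1)}{\pi(\ell)}T_{\ell-1}(\lambda), \quad \forall\, 1 \le \ell \le i,$$

where $T_0(\lambda) := 0$. From the above discussion, we conclude that if $T_i(\lambda) < 0$, then

$$T_{\ell+1}(\lambda) = -\pi(1)\det A_\ell(\lambda), \quad \forall\, 1 \le \ell \le i. \tag{2.7}$$

When $\ell = i-1$, (2.5) implies $\det A_{i-1}(\lambda) > 0$ for $\lambda < \lambda^{(i-1)}$. By the continuity of $T_i$ and $\det A_{i-1}$, if there is some $\lambda < \lambda^{(i-1)}$ such that $T_i(\lambda) < 0$, then $T_i(\lambda) = -\pi(1)\det A_{i-1}(\lambda)$ for $\lambda < \lambda^{(i-1)}$. As a consequence of (2.7) with $\ell = i$, this will imply $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ for $\lambda < \lambda^{(i-1)}$. Hence, it remains to show that $T_i(\lambda) < 0$ for some $\lambda < \lambda^{(i-1)}$. To see this, according to Corollary 2.4, one can choose a constant $\lambda < \min\{\lambda^G_{\pi,\nu}, \lambda^{(i-1)}\}$ such that $T_{n-1}(\lambda) < 0$. Since $\phi_\lambda(i)$ is non-decreasing in $i$, we obtain $T_i(\lambda) < 0$, as desired. This proves $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ for $\lambda < \lambda^{(i-1)}$. In particular, $T_n(\lambda) = -\pi(1)\det A_{n-1}(\lambda)$ for $\lambda < \lambda^{(n-2)}$. By Corollary 2.4, we have $\lambda^{(n-1)} = \lambda^G_{\pi,\nu}$. This proves Proposition 2.5(1).

Next, observe that, for $\lambda \in (\lambda^{(i)}, \lambda^{(i-1)})$,

$$\sum_{j=1}^{i+1}\pi(j)\phi_\lambda(j) = T_{i+1}(\lambda) > 0, \qquad \sum_{j=1}^{i}\pi(j)\phi_\lambda(j) = T_i(\lambda) < 0.$$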

By (2.6), it is easy to see that $[\phi_\lambda(i+1) - \phi_\lambda(i)]\nu(i, i+1) = -\lambda T_i(\lambda) > 0$ and

$$[\phi_\lambda(i+2) - \phi_\lambda(i+1)]\nu(i+1, i+2) = \{[\phi_\lambda(i+1) - \phi_\lambda(i)]\nu(i, i+1) - \lambda\pi(i+1)\phi_\lambda(i+1)\}^+ = \{-\lambda T_{i+1}(\lambda)\}^+ = 0.$$

This proves Proposition 2.5(2). To prove Proposition 2.5(3), we use (1) to derive

$$T_{n-1}(\lambda) = -\pi(1)\det A_{n-2}(\lambda) < 0, \quad \forall \lambda \in (0, \lambda^{(n-2)}).$$

Using (2.6), this implies $\phi_\lambda(n-1) < \phi_\lambda(n)$. The last part of Proposition 2.5 follows easily from (2.5) and the fact that

$$T_i(\lambda) \ge 0 \Rightarrow T_{i+1}(\lambda) > 0 \quad \text{and} \quad T_i(\lambda) \le 0 \Rightarrow T_{i-1}(\lambda) < 0.$$

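Remark 2.2. In Proposition 2.5, if $\lambda > \lambda^{(1)} = \nu(1, 2)[\pi(1)^{-1} + \pi(2)^{-1}]$, then $\phi_\lambda(i) = \phi_\lambda(2)$ for $i = 2, \ldots, n$. Note that, for $\lambda \ge \lambda^{(1)}$, $\phi_\lambda(2) = -1 + \lambda\pi(1)/\nu(1, 2)$ and

$$\pi(\phi_\lambda) = -1 + \frac{\lambda\pi(1)(1 - \pi(1))}{\nu(1, 2)}, \qquad \mathrm{Var}_\pi(\phi_\lambda) = \frac{\lambda^2\pi(1)^3(1 - \pi(1))}{\nu(1, 2)^2}.$$

By (2.3), this leads to $L(\lambda) = \nu(1, 2)/[\pi(1)(1 - \pi(1))]$ for $\lambda \ge \lambda^{(1)}$. In the case $n = 2$, it is clear that $\nu(1, 2)/[\pi(1)(1 - \pi(1))] = \nu(1, 2)[\pi(1)^{-1} + \pi(2)^{-1}] = \lambda^G_{\pi,\nu}$.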

3. Convergence to other eigenvalues

In this section, we generalize the algorithms (A1) and (A2) so that they can be applied to the computation of any specified eigenvalue.

3.1. Basic setup and fundamental results

Recall that $G$ is a graph with vertex set $V = \{1, 2, \ldots, n\}$ and edge set $E = \{\{i, i+1\} \mid i = 1, 2, \ldots, n-1\}$. Given two positive measures $\pi, \nu$ on $V, E$ with $\pi(V) = 1$, let $M^G_{\pi,\nu}$ be the $n$-by-$n$ matrix defined in the introduction and given by

$$M^G_{\pi,\nu}(i, j) = \begin{cases} -\nu(i, j)/\pi(i) & \text{if } |i - j| = 1 \\ [\nu(i-1, i) + \nu(i, i+1)]/\pi(i) & \text{if } j = i \\ 0 & \text{if } |i - j| > 1. \end{cases} \tag{3.1}$$

Since $\nu$ is positive everywhere and $M^G_{\pi,\nu}$ is tridiagonal, all eigenvalues of $M^G_{\pi,\nu}$ have algebraic multiplicity 1. Throughout this section, let $\{\lambda^G_0 < \lambda^G_1 < \cdots < \lambda^G_{n-1}\}$ denote the eigenvalues of $M^G_{\pi,\nu}$ with associated $L^2(\pi)$-normalized eigenvectors $\zeta_0 = \mathbf{1}, \zeta_1, \ldots, \zeta_{n-1}$. Clearly, $\lambda^G_0 = 0$, $\lambda^G_1 = \lambda^G_{\pi,\nu}$ and, for $1 \le k \le n$,

$$\lambda^G_i\,\zeta_i(k)\pi(k) = [\zeta_i(k) - \zeta_i(k-1)]\nu(k-1, k) + [\zeta_i(k) - \zeta_i(k+1)]\nu(k, k+1). \tag{3.2}$$

Let $1 \le i \le n-1$. As $\zeta_i$ is non-constant, it is clear that $\zeta_i(1) \ne \zeta_i(2)$ and $\zeta_i(n-1) \ne \zeta_i(n)$. Moreover, if $\zeta_i(k) = \zeta_i(k+1)$ for some $1 < k < n$, then $\zeta_i(k) \ne \zeta_i(k-1)$ and $\zeta_i(k+1) \ne \zeta_i(k+2)$. Gantmacher and Krein [13] showed that there are exactly $i$ sign changes for $\zeta_i$ with $1 \le i \le n$. Miclo [14] gives a detailed description of the shape of $\zeta_i$ as follows.

Theorem 3.1. For $1 \le i \le n-1$, let $\zeta_i$ be an eigenvector associated to the $i$th smallest non-zero eigenvalue of the matrix in (3.1) with $\zeta_i(1) < 0$. Then, there are $1 = a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_i < b_i = n$ with $a_{j+1} - b_j \in \{0, 1\}$ such that $\zeta_i$ is strictly increasing on $[a_j, b_j]$ for odd $j$ and is strictly decreasing on $[a_j, b_j]$ for even $j$, and $\zeta_i(a_{j+1}) = \zeta_i(b_j)$ for $1 \le j < i$.

In the following, we make some analysis related to the Euler–Lagrange equations in (3.2).

Definition 3.1. Fix $n \ge 1$ and let $f$ be a function on $\{1, 2, \ldots, n\}$. For $1 \le i \le n-1$, $f$ is called "Type $i$" if there are $1 = a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_i < b_i \le n$ satisfying $a_{j+1} - b_j \in \{0, 1\}$ such that

(1) $f$ is strictly monotonic on $[a_j, b_j]$ for $1 \le j \le i$.

(2) $[f(a_j) - f(a_j + 1)][f(a_{j+1}) - f(a_{j+1} + 1)] < 0$ for $1 \le j < i$.

(3) $f(a_{j+1}) = f(b_j)$, for $1 \le j < i$, and $f(k) = f(b_i)$, for $b_i \le k \le n$.

The points $a_j, b_j$ will be called "peak-valley points" in this paper.

Remark 3.1. Note that the difference between Definition 3.1 and Theorem 3.1 is the requirement $b_i \le n$, instead of $b_i = n$. By Theorem 3.1, any eigenvector associated to the $i$th smallest non-zero eigenvalue of the matrix in (3.1) must be of type $i$ with $b_i = n$.

Definition 3.2. Let $\pi, \nu$ be positive measures on $V, E$ with $\pi(V) = 1$. For $\lambda \in \mathbb{R}$, let $\xi_\lambda$ be a function on $\{1, 2, \ldots, n\}$ defined by $\xi_\lambda(1) = -1$ and, for $1 \le k < n$,

$$\xi_\lambda(k+1) = \xi_\lambda(k) + \frac{[\xi_\lambda(k) - \xi_\lambda(k-1)]\nu(k-1, k) - \lambda\pi(k)\xi_\lambda(k)}{\nu(k, k+1)}.$$

Remark 3.2. Note that $\xi_0 \equiv -1$ and, for $\lambda < 0$, $\xi_\lambda$ is strictly decreasing and of type 1. For $\lambda > 0$, if $\xi_\lambda(k-1) < \xi_\lambda(k) = \xi_\lambda(k+1)$, then $\xi_\lambda(k) > 0$ and this implies $\xi_\lambda(k+2) < \xi_\lambda(k+1)$. Similarly, if $\xi_\lambda(k-1) > \xi_\lambda(k) = \xi_\lambda(k+1)$, then $\xi_\lambda(k) < 0$ and $\xi_\lambda(k+2) > \xi_\lambda(k+1)$. Thus, $\xi_\lambda$ must be of type $i$ for some $1 \le i \le n-1$.
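Definitions 3.1 and 3.2 are easy to experiment with. The following is a minimal sketch, assuming NumPy; the helper names `xi` and `type_of` are illustrative, and the type is computed by counting maximal strictly monotone pieces after discarding flat steps, which matches Definition 3.1 for the functions $\xi_\lambda$ with $\lambda > 0$ in view of Remark 3.2. The example uses a hypothetical uniform path with $n = 5$ and $\nu = \pi$, whose eigenvalues $2 - 2\cos(k\pi/n)$ are known in closed form.

```python
import numpy as np

def xi(lam, pi, nu):
    """xi_lambda of Definition 3.2 (no truncation); nu[k] stands for nu(k+1, k+2)."""
    n = len(pi)
    f = np.empty(n)
    f[0] = -1.0
    flux = 0.0  # [xi(k) - xi(k-1)] nu(k-1, k); zero at k = 1 since nu(0, 1) = 0
    for k in range(n - 1):
        flux -= lam * pi[k] * f[k]
        f[k + 1] = f[k] + flux / nu[k]
    return f

def type_of(f):
    """Number of strictly monotone pieces of f, ignoring flat steps."""
    signs = np.sign(np.diff(f))
    signs = signs[signs != 0]
    if signs.size == 0:
        return 0  # constant function: no type in the sense of Definition 3.1
    return 1 + int(np.count_nonzero(signs[1:] != signs[:-1]))

# Hypothetical example: uniform weights on a path with n = 5 vertices.
n = 5
pi = np.full(n, 1.0 / n)
nu = np.full(n - 1, 1.0 / n)
eigenvalues = [2.0 - 2.0 * np.cos(k * np.pi / n) for k in range(1, n)]

types = [type_of(xi(lam, pi, nu)) for lam in eigenvalues]
means = [pi @ xi(lam, pi, nu) for lam in eigenvalues]
print(types)
print(means)
```

At the $i$th non-zero eigenvalue the function $\xi_\lambda$ is an eigenvector, so it has type $i$ (Remark 3.1) and mean zero under $\pi$ up to rounding.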

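Lemma 3.2. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2. Suppose that $\xi_\lambda$ is of type $i$ with $1 \le i \le n-1$.

(1) If $\xi_\lambda(n-1) \ne \xi_\lambda(n)$, then there is $\epsilon > 0$ such that $\xi_{\lambda+\delta}$ is of type $i$ for $-\epsilon < \delta < \epsilon$.

(2) If $\xi_\lambda(n-1) = \xi_\lambda(n)$, then there is $\epsilon > 0$ such that $\xi_{\lambda+\delta}$ is of type $i+1$ and $\xi_{\lambda-\delta}$ is of type $i$ for $0 < \delta < \epsilon$.

Proof. Let $a_j, b_j$ be the peak-valley points of $\xi_\lambda$. By the continuity of $\xi_\lambda$ in $\lambda$ and Remark 3.2, one can choose $\epsilon > 0$ such that, for $\delta \in (-\epsilon, \epsilon)$, $\xi_{\lambda+\delta}$ remains strictly monotonic on $[a_j, b_j]$ for $j = 1, \ldots, i$ and

$$[\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(a_{j+1} + 1) - \xi_{\lambda+\delta}(a_{j+1})] > 0,$$

for $1 \le j < i$. In (1), $b_i = n$. Fix $\delta \in (-\epsilon, \epsilon)$ and set $a'_1 = a_1 = 1$, $b'_i = b_i = n$. For $1 < j < i$, set

$$\begin{cases} b'_j = a'_{j+1} = b_j & \text{if } [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(b_j) - \xi_{\lambda+\delta}(a_{j+1})] < 0 \\ b'_j = a'_{j+1} = a_{j+1} & \text{if } [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(b_j) - \xi_{\lambda+\delta}(a_{j+1})] > 0 \\ b'_j = b_j,\ a'_{j+1} = a_{j+1} & \text{if } [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(b_j) - \xi_{\lambda+\delta}(a_{j+1})] = 0. \end{cases}$$

Clearly, $\xi_{\lambda+\delta}$ is of type $i$ with peak-valley points $a'_j, b'_j$. This proves Lemma 3.2(1).

For part (2), we consider $i \le n-2$ and $b_i = n-1$. By a similar argument as before, one can choose $\epsilon > 0$ such that the restriction of $\xi_{\lambda+\delta}$ to $\{1, 2, \ldots, n-1\}$ is of type $i$ for $\delta \in (-\epsilon, \epsilon)$. To finish the proof, it remains to compare $\xi_{\lambda+\delta}(n-1)$ and $\xi_{\lambda+\delta}(n)$. Recall that $T_j(\lambda) = \sum_{k=1}^{j}\xi_\lambda(k)\pi(k)$ as in the proof for Proposition 2.5. Using a similar reasoning as for (2.7), one shows that $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ for $1 \le i < n$, where $A_i(\lambda)$ is the matrix in (2.4). This implies that the non-zero eigenvalues of $M^G_{\pi,\nu}$, say $\lambda^G_1, \ldots, \lambda^G_{n-1}$, are the roots of $\det A_{n-1}(\lambda) = 0$. As a consequence of Lemma A.2, $\det A_{n-2}(\lambda) = 0$ has exactly $n-2$ distinct roots, say $\alpha_1 < \alpha_2 < \cdots < \alpha_{n-2}$, and they satisfy the interlacing property $\lambda^G_j < \alpha_j < \lambda^G_{j+1}$ for $1 \le j \le n-2$. Note that $\det A_{n-2}(\lambda)$ and $\det A_{n-1}(\lambda)$ tend to infinity as $-\lambda$ tends to infinity. This leads to the fact that if $\det A_{n-2}(\lambda) = 0$ and $\det A_{n-1}(\lambda) < 0$, then $\det A_{n-2}(\cdot)$ is strictly decreasing in a neighborhood of $\lambda$. If $\det A_{n-2}(\lambda) = 0$ and $\det A_{n-1}(\lambda) > 0$, then $\det A_{n-2}(\cdot)$ is strictly increasing in a neighborhood of $\lambda$.

Back to the proof of (2). Suppose that $\xi_\lambda(n-2) < \xi_\lambda(n-1)$. By Remark 3.2, it is easy to check that $T_{n-1}(\lambda) = 0$ and $T_n(\lambda) > 0$ or, equivalently, $\det A_{n-2}(\lambda) = 0$ and $\det A_{n-1}(\lambda) < 0$. According to the conclusion in the previous paragraph, we can find $\epsilon > 0$ such that $\det A_{n-2}(\cdot)$ is strictly decreasing on $(\lambda - \epsilon, \lambda + \epsilon)$, which yields

$$\xi_{\lambda+\delta}(n) = \xi_{\lambda+\delta}(n-1) - \frac{(\lambda+\delta)T_{n-1}(\lambda+\delta)}{\nu(n-1, n)}\begin{cases} < \xi_{\lambda+\delta}(n-1) & \text{if } 0 < \delta < \epsilon \\ > \xi_{\lambda+\delta}(n-1) & \text{if } -\epsilon < \delta < 0. \end{cases}$$

This gives the desired property in Lemma 3.2(2). The other case, $\xi_\lambda(n-2) > \xi_\lambda(n-1)$, can be proved in the same way and we omit the details.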

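The following proposition characterizes the shape of $\xi_\lambda$ for $\lambda > 0$.

Proposition 3.3. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2. Let $\lambda^G_1 < \cdots < \lambda^G_{n-1}$ be the non-zero eigenvalues of $M^G_{\pi,\nu}$ in (3.1) and $\alpha_1 < \cdots < \alpha_{n-2}$ be the zeros of $\det A_{n-2}(\lambda)$, where $A_{n-2}(\cdot)$ is the matrix in (2.4). Then,

(1) $\lambda^G_j < \alpha_j < \lambda^G_{j+1}$, for $1 \le j \le n-2$.

(2) $\xi_\lambda$ is of type $i$ for $\alpha_{i-1} < \lambda \le \alpha_i$.

Proof. (1) is immediate from Lemma A.2. For (2), note that $\alpha_i$ is an eigenvalue of the submatrix of $M^G_{\pi,\nu}$ obtained by removing the $n$th row and column. This implies $\xi_{\alpha_i}(n-1) = \xi_{\alpha_i}(n)$ for $i = 1, \ldots, n-2$ and $\xi_\lambda(n-1) \ne \xi_\lambda(n)$ for $\lambda > 0$ and $\lambda \notin \{\alpha_1, \ldots, \alpha_{n-2}\}$. By Lemma 3.2, $\xi_\lambda$ is of type $i$ for $\alpha_{i-1} < \lambda \le \alpha_i$.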

Given $\lambda > 0$, the above proposition provides a simple criterion to determine which of the intervals $(\alpha_j, \alpha_{j+1}]$ $\lambda$ belongs to, that is, the type of $\xi_\lambda$. However, knowing the type of $\xi_\lambda$ is not sufficient to determine whether $\lambda$ is bigger or smaller than $\lambda^G_i$. We need the following remark.

Remark 3.3. Using the same argument as the proof of Proposition 2.5, one can show that $\pi(\xi_\lambda) = -\pi(1)\det A_{n-1}(\lambda)$, where $A_{n-1}(\lambda)$ is the matrix in (2.4). Clearly, $\pi(\xi_\lambda)$ has zeros $\lambda^G_1, \ldots, \lambda^G_{n-1}$ and tends to minus infinity as $\lambda$ tends to minus infinity. This implies that $\pi(\xi_\lambda) < 0$, for $\lambda < \lambda^G_1$, and

$$\pi(\xi_\lambda) > 0 \quad \forall \lambda \in (\lambda^G_{2i-1}, \lambda^G_{2i}), \qquad \pi(\xi_\lambda) < 0 \quad \forall \lambda \in (\lambda^G_{2i}, \lambda^G_{2i+1}),$$

for $i \ge 1$, where $\lambda^G_n := \infty$.

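As a consequence of Proposition 3.3 and Remark 3.3, we obtain the following dichotomy algorithm, which is a generalization of (A2). Let $1 \le i \le n-1$.

Choose positive reals $L_0 < \lambda^G_i < U_0$ and set, for $\ell = 0, 1, \ldots$,

1. $\xi_{\lambda_\ell}$ be the function generated by $\lambda_\ell = (L_\ell + U_\ell)/2$ in Definition 3.2,

2. According to Definition 3.1, set

$$\begin{cases} L_{\ell+1} = L_\ell,\ U_{\ell+1} = \lambda_\ell & \text{if } \xi_{\lambda_\ell} \text{ is of type } j \text{ with } j > i, \text{ or if } \xi_{\lambda_\ell} \text{ is of type } i \text{ and } (-1)^{i-1}\pi(\xi_{\lambda_\ell}) > 0 \\ U_{\ell+1} = U_\ell,\ L_{\ell+1} = \lambda_\ell & \text{if } \xi_{\lambda_\ell} \text{ is of type } j \text{ with } j < i, \text{ or if } \xi_{\lambda_\ell} \text{ is of type } i \text{ and } (-1)^{i-1}\pi(\xi_{\lambda_\ell}) < 0 \\ L_{\ell+1} = U_{\ell+1} = \lambda_\ell & \text{if } \xi_{\lambda_\ell} \text{ is of type } i \text{ and } \pi(\xi_{\lambda_\ell}) = 0. \end{cases}$$

(Di)

Theorem 3.4. Referring to (Di), $0 \le \max\{U_\ell - \lambda^G_i,\ \lambda^G_i - L_\ell\} \le (U_0 - L_0)2^{-\ell}$, $\forall \ell \ge 0$.

Proof. Immediate from Proposition 3.3 and Remark 3.3.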
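The dichotomy (Di) combines the type of $\xi_\lambda$ with the sign of $\pi(\xi_\lambda)$. The following is a minimal sketch, assuming NumPy; the function names are illustrative, the test case is a hypothetical uniform path with $n = 5$ and $\nu = \pi$ (eigenvalues $2 - 2\cos(k\pi/n)$), and the initial bounds are choices made here: the upper bound 4 is twice the largest diagonal entry of $M^G_{\pi,\nu}$, which bounds every eigenvalue by the Gershgorin circle theorem.

```python
import numpy as np

def xi(lam, pi, nu):
    """xi_lambda of Definition 3.2; nu[k] stands for nu(k+1, k+2)."""
    n = len(pi)
    f = np.empty(n)
    f[0] = -1.0
    flux = 0.0  # [xi(k) - xi(k-1)] nu(k-1, k); zero at k = 1
    for k in range(n - 1):
        flux -= lam * pi[k] * f[k]
        f[k + 1] = f[k] + flux / nu[k]
    return f

def type_of(f):
    """Number of strictly monotone pieces of f, ignoring flat steps."""
    signs = np.sign(np.diff(f))
    signs = signs[signs != 0]
    if signs.size == 0:
        return 0
    return 1 + int(np.count_nonzero(signs[1:] != signs[:-1]))

def algorithm_d(i, pi, nu, lower, upper, steps=40):
    """Bisection (Di) for the i-th smallest non-zero eigenvalue."""
    sign_i = (-1.0) ** (i - 1)
    for _ in range(steps):
        lam = 0.5 * (lower + upper)
        f = xi(lam, pi, nu)
        j, signed_mean = type_of(f), sign_i * (pi @ f)
        if j > i or (j == i and signed_mean > 0):
            upper = lam  # lam lies above the target eigenvalue
        elif j < i or (j == i and signed_mean < 0):
            lower = lam  # lam lies below the target eigenvalue
        else:
            lower = upper = lam
    return lower, upper

# Hypothetical example: uniform weights on a path with n = 5 vertices.
n = 5
pi = np.full(n, 1.0 / n)
nu = np.full(n - 1, 1.0 / n)
brackets = [algorithm_d(i, pi, nu, 1e-9, 4.0) for i in range(1, n)]
exact = [2.0 - 2.0 * np.cos(i * np.pi / n) for i in range(1, n)]
print(brackets)
print(exact)
```

Running the bisection once per index $i$ gives the whole spectrum, each eigenvalue at a cost proportional to $n$ per step.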

Proposition 3.3(2) bounds the eigenvalues using the shape of $\xi_\lambda$ generated from one end point. We now introduce some other criteria to bound eigenvalues using the shape of $\xi_\lambda$ from either boundary point. Those results will be used to prove Theorem 6.1.

Proposition 3.5. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2 and $\tilde\xi_\lambda$ be a function given by

$$\tilde\xi_\lambda(k-1) = \tilde\xi_\lambda(k) + \frac{[\tilde\xi_\lambda(k) - \tilde\xi_\lambda(k+1)]\nu(k, k+1) - \lambda\pi(k)\tilde\xi_\lambda(k)}{\nu(k-1, k)},$$

for $k = n, n-1, \ldots, 2$ with $\tilde\xi_\lambda(n) = -1$. Let $\lambda^G_0 < \cdots < \lambda^G_{n-1}$ be the eigenvalues of $M^G_{\pi,\nu}$ in (3.1) and let $f|_B$ be the restriction of $f$ to a subset $B$ of $V$. Suppose $1 \le k_0 \le n$.

(1) If $\xi_\lambda|_{\{1,\ldots,k_0\}}$ is of type $i$ with $(-1)^i\xi_\lambda(k_0) > 0$ and $\tilde\xi_\lambda|_{\{k_0,\ldots,n\}}$ is of type $j$ with $(-1)^j\tilde\xi_\lambda(k_0) > 0$, then $\lambda^G_{i+j-2} < \lambda < \lambda^G_{i+j-1}$.

(2) If $\xi_\lambda|_{\{1,\ldots,k_0\}}$ is of type $i$ with $(-1)^i\xi_\lambda(k_0) < 0$ and $\tilde\xi_\lambda|_{\{k_0,\ldots,n\}}$ is of type $j$ with $(-1)^j\tilde\xi_\lambda(k_0) < 0$, then $\lambda^G_{i+j-1} < \lambda < \lambda^G_{i+j+1}$.

(3) If $\xi_\lambda|_{\{1,\ldots,k_0\}}$ is of type $i$ with $(-1)^i\xi_\lambda(k_0) > 0$ and $\tilde\xi_\lambda|_{\{k_0,\ldots,n\}}$ is of type $j$ with $(-1)^j\tilde\xi_\lambda(k_0) < 0$, then $\lambda^G_{i+j-2} < \lambda < \lambda^G_{i+j}$.

Proof. By Proposition 3.3, $\xi_\lambda(n)$ is a polynomial of degree $n-1$ satisfying

$$(-1)^{i+1}\xi_{\lambda^G_i}(n) > 0 \quad \forall\, 0 \le i < n, \qquad (-1)^{i+1}\xi_{\beta_i}(n) > 0 \quad \forall\, 1 \le i < n-1.$$

This implies that there are $w_i \in (\beta_i, \lambda^G_{i+1})$, $0 \le i \le n-2$, such that $(-1)^{i+1}\xi_\lambda(n) > 0$ for $\lambda \in (w_{i-1}, w_i)$ and $0 \le i \le n-1$ with $w_{-1} = -\infty$ and $w_{n-1} = \infty$.

The proofs for (1)–(3) in Proposition 3.5 are similar and we deal with (1) only. By the Euler–Lagrange equations in (3.2), it is easy to see that, for $1 \le l < n$, $\xi_{\lambda^G_l}$ and $\tilde\xi_{\lambda^G_l}$ are eigenvectors of $M^G_{\pi,\nu}$ in (3.1) associated with $\lambda^G_l$, which implies $\xi_{\lambda^G_l} = -\xi_{\lambda^G_l}(n)\,\tilde\xi_{\lambda^G_l}$.

First, assume that $\lambda \le \lambda^G_{i+j-2}$. By Proposition 3.3, $\xi_{\lambda^G_{i+j-2}}|_{\{1,\ldots,k_0\}}$ is of type at least $i$ and $\tilde\xi_{\lambda^G_{i+j-2}}|_{\{k_0,\ldots,n\}}$ is of type at least $j$. This implies that the patching of $\xi_{\lambda^G_{i+j-2}}|_{\{1,\ldots,k_0\}}$ and $-\xi_{\lambda^G_{i+j-2}}(n)\,\tilde\xi_{\lambda^G_{i+j-2}}|_{\{k_0,\ldots,n\}}$, which equals $\xi_{\lambda^G_{i+j-2}}$, is of type at least $i+j-1$. This is a contradiction.

Next, assume that $\lambda \ge \lambda^G_{i+j-1}$. By Proposition 3.3, we may choose $a_1 < \lambda$ (resp. $a_2 < \lambda$) such that $\xi_\lambda|_{\{1,\ldots,k_0\}}$ (resp. $\tilde\xi_\lambda|_{\{k_0,\ldots,n\}}$) changes the type at $a_1$ (resp. $a_2$). If $\lambda^G_{i+j-1} \le \min\{a_1, a_2\}$, then a similar reasoning as before implies that $\xi_{\lambda^G_{i+j-1}}$ is of type at most $i+j-2$, a contradiction. If $\min\{a_1, a_2\} < \lambda^G_{i+j-1} < \max\{a_1, a_2\}$, then exactly one of $\xi_{\lambda^G_{i+j-1}}|_{\{1,\ldots,k_0\}}$ and $\tilde\xi_{\lambda^G_{i+j-1}}|_{\{k_0,\ldots,n\}}$ does not change its type. This implies that the gluing point $k_0$ cannot be a local extremum and, thus, the patching function is of type at most $i+j-2$, another contradiction! According to the discussion in the first paragraph of this proof, if $\lambda^G_{i+j-1} \ge \max\{a_1, a_2\}$, then none of $\xi_{\lambda^G_{i+j-1}}|_{\{1,\ldots,k_0\}}$ and $\tilde\xi_{\lambda^G_{i+j-1}}|_{\{k_0,\ldots,n\}}$ changes type nor, of course, the sign at $k_0$. Consequently, we obtain $(-1)^{i+j}\xi_{\lambda^G_{i+j-1}}(k_0)\,\tilde\xi_{\lambda^G_{i+j-1}}(k_0) > 0$, which contradicts the fact $\xi_{\lambda^G_{i+j-1}} = -\xi_{\lambda^G_{i+j-1}}(n)\,\tilde\xi_{\lambda^G_{i+j-1}}$.

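Proposition 3.6. For $\lambda > 0$ and $1 \le k \le n-1$, let $s_k(\lambda)$ be the $k$th sign change of $\xi_\lambda$ defined by $s_0 := 0$ and $s_{k+1}(\lambda) := \inf\{l > s_k(\lambda) \mid \xi_\lambda(l)\xi_\lambda(l-1) < 0 \text{ or } \xi_\lambda(l) = 0\}$, where $\inf\emptyset := n+1$. Then, for $0 < \lambda_1 < \lambda_2$, $s_k(\lambda_1) \ge s_k(\lambda_2)$ for all $1 \le k \le n-1$.

Proof. Let $1 \le k \le n-1$. If $s_k(\lambda_1) = n+1$, then it is clear that $s_k(\lambda_1) \ge s_k(\lambda_2)$. Suppose that $s_k(\lambda_1) = \ell \le n$. Obviously, $\xi_{\lambda_1}|_{\{1,\ldots,\ell\}}$ is of type $k$. Referring to (2.4), let $\lambda^\ell_1, \ldots, \lambda^\ell_{\ell-1}$ be the roots of $\det A_{\ell-1}(\lambda) = 0$ and $\alpha^\ell_1, \ldots, \alpha^\ell_{\ell-2}$ be the roots of $\det A_{\ell-2}(\lambda) = 0$. According to the first paragraph of the proof for Proposition 3.5, there are $w^\ell_i \in (\alpha^\ell_{i-1}, \lambda^\ell_i)$ with $1 \le i \le \ell-1$ such that $(-1)^{i+1}\xi_\lambda(\ell) > 0$ for $\lambda \in (w^\ell_i, w^\ell_{i+1})$ and $1 \le i \le \ell-1$, where $\alpha^\ell_0 := 0$. Since $\xi_{\lambda_1}(\ell)\xi_{\lambda^\ell_k}(\ell) \ge 0$, one has $w^\ell_k \le \lambda_1 < \alpha^\ell_k$. As it is assumed that $\lambda_2 > \lambda_1$, if $\lambda_2 > \alpha^\ell_k$, then $\xi_{\lambda_2}|_{\{1,\ldots,\ell\}}$ is of type at least $k+1$ and, consequently, $s_k(\lambda_2) < \ell = s_k(\lambda_1)$. If $\lambda_1 < \alpha^\ell_k$, then $\xi_{\lambda_2}|_{\{1,\ldots,\ell\}}$ is type $k$ and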

3.2. Bounding eigenvalues from below

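Motivated by Theorem 3.1, we introduce another scheme generalizing (2.1) to bound the other eigenvalues of $M^G_{\pi,\nu}$ from below.

Definition 3.3. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2. If $\xi_\lambda$ is of type $i$, $1 \le i \le n-1$, with peak-valley points $1 = a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_i < b_i \le n$, then define

$$\xi^{(j)}_\lambda(k) = \begin{cases} \xi_\lambda(k) & \text{for } k \le b_j \\ \xi_\lambda(b_j) & \text{for } k > b_j, \end{cases} \quad \forall\, 1 \le j < i,$$

and set $\xi^{(j)}_\lambda = \xi_\lambda$ for $i \le j \le n-1$.

Remark 3.4. For $\lambda > 0$, if $\xi_\lambda$ is of type $i$, then $\xi^{(j)}_\lambda$ is of type $j$ for $j < i$. Moreover, for $k < b_j$,

$$\begin{aligned}
\xi^{(j)}_\lambda(k+1) &= \xi^{(j)}_\lambda(k) + \frac{[\xi^{(j)}_\lambda(k) - \xi^{(j)}_\lambda(k-1)]\nu(k-1, k) - \lambda\pi(k)\xi^{(j)}_\lambda(k)}{\nu(k, k+1)} \\
&= \xi^{(j)}_\lambda(k) - \frac{\lambda[\pi(1)\xi^{(j)}_\lambda(1) + \cdots + \pi(k)\xi^{(j)}_\lambda(k)]}{\nu(k, k+1)},
\end{aligned}$$

and, for $b_j \le k < n$,

$$\xi^{(j)}_\lambda(k+1) = \xi^{(j)}_\lambda(k) + \frac{F_j\big([\xi^{(j)}_\lambda(k) - \xi^{(j)}_\lambda(k-1)]\nu(k-1, k) - \lambda\pi(k)\xi^{(j)}_\lambda(k)\big)}{\nu(k, k+1)},$$

where $F_j(t) = \max\{t, 0\}$ if $j$ is odd, and $F_j(t) = \min\{t, 0\}$ if $j$ is even. Note that $\xi^{(1)}_\lambda$ is exactly $\phi_\lambda$ in Proposition 2.5.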

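Thereafter, let $L$ and $L^{(i)}$ be functions on $(0, \infty)$ defined by

$$L(\lambda) = \frac{\mathcal{E}_\nu(\xi_\lambda, \xi_\lambda)}{\mathrm{Var}_\pi(\xi_\lambda)}, \qquad L^{(i)}(\lambda) = \frac{\mathcal{E}_\nu(\xi^{(i)}_\lambda, \xi^{(i)}_\lambda)}{\mathrm{Var}_\pi(\xi^{(i)}_\lambda)}, \quad \forall\, 1 \le i \le n-1, \tag{3.3}$$

where $\xi_\lambda$ and $\xi^{(i)}_\lambda$ are the functions in Definitions 3.2–3.3.

Remark 3.5. Note that $L = L^{(n-1)}$. By a similar reasoning as in the proof for (2.2), one can show that, for $\lambda > 0$,

$$L(\lambda) = \lambda + \frac{\lambda\pi(\xi_\lambda)[\pi(\xi_\lambda) - \xi_\lambda(n)]}{\mathrm{Var}_\pi(\xi_\lambda)}, \qquad L^{(i)}(\lambda) = \lambda + \frac{\lambda\pi(\xi^{(i)}_\lambda)[\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)]}{\mathrm{Var}_\pi(\xi^{(i)}_\lambda)}.$$

From Proposition 3.3, it follows immediately that $L(\lambda) = L^{(i)}(\lambda)$ for $\lambda \in (0, \alpha_i]$.

To explore $L$ and $L^{(i)}$ further, we need more information on $\pi(\xi_\lambda)$, $\pi(\xi^{(i)}_\lambda)$, $\pi(\xi_\lambda) - \xi_\lambda(n)$ and $\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)$.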

Lemma 3.7. Let $\xi_\lambda$ be the function in Definition 3.2 and $\lambda^G_i, \alpha_i$ be the constants in Proposition 3.3. Then, $\pi(\xi_\lambda) - \xi_\lambda(n) = 0$ has $n-1$ distinct roots, say $\beta_0 < \beta_1 < \cdots < \beta_{n-2}$, which satisfy $\beta_0 = 0$ and $\alpha_i < \beta_i < \lambda^G_{i+1}$ for $1 \le i \le n-2$. Furthermore, $\pi(\xi_\lambda) - \xi_\lambda(n) > 0$ for $\lambda \in (\beta_{2i-1}, \beta_{2i})$

Proof. Set $u(\lambda) := \pi(\xi_\lambda) - \xi_\lambda(n)$. According to Definition 3.2, $u(\lambda)$ is a polynomial of degree $n-1$ and satisfies $u(0) = 0$. Note that $\pi(\xi_\lambda) = 0$ for $\lambda \in \{\lambda^G_1, \ldots, \lambda^G_{n-1}\}$. If $i$ is odd, then $\xi_{\lambda^G_i}(n-1) < \xi_{\lambda^G_i}(n)$. This implies $\xi_{\lambda^G_i}(n) > 0$ and, hence, $u(\lambda^G_i) < 0$. Similarly, if $i$ is even, then $u(\lambda^G_i) > 0$.

By Lemma 3.2 and Proposition 3.3, if $\lambda = \alpha_i$ with odd $i$, then $\xi_{\alpha_i}$ is of type $i$ with $\xi_{\alpha_i}(n-1) = \xi_{\alpha_i}(n)$. This implies $\xi_{\alpha_i}(n) > 0$ and $\pi(\xi_{\alpha_i}) = \pi(n)\xi_{\alpha_i}(n)$, which yields $u(\alpha_i) < 0$. Similarly, one can show that $u(\alpha_i) > 0$ if $i$ is even.

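Remark 3.6. We consider the sign of $\pi(\xi_\lambda^{(i)})$ and $\pi(\xi_\lambda^{(i)})-\xi_\lambda^{(i)}(n)$ in this remark. By Proposition 3.3, $\xi_\lambda^{(i)} = \xi_\lambda$ for $\lambda \le \alpha_i$. If $\lambda > \alpha_i$ with $1\le i\le n-2$, then $\xi_\lambda$ is of type $j$ with $j > i$. Fix $1\le i\le n-2$ and set $k_0 = k_0(\lambda) = \min\{k \mid \xi_\lambda^{(i)}(j) = \xi_\lambda^{(i)}(n),\ \forall k\le j\le n\}$. Clearly, $k_0(\lambda) \le n-1$ for $\lambda > \alpha_i$. Observe that, for $\lambda > \alpha_i$ with odd $i$, $\xi_\lambda(k_0-1) < \xi_\lambda(k_0) \ge \xi_\lambda(k_0+1)$, which implies $\sum_{k=1}^{k_0-1}\pi(k)\xi_\lambda(k) < 0$ and $\sum_{k=1}^{k_0}\pi(k)\xi_\lambda(k) \ge 0$. A similar reasoning for the case of even $i$ gives $\sum_{k=1}^{k_0-1}\pi(k)\xi_\lambda(k) > 0$ and $\sum_{k=1}^{k_0}\pi(k)\xi_\lambda(k) \le 0$. Consequently, we obtain

\[
(-1)^{i-1}\pi(\xi_\lambda^{(i)}) > 0, \qquad (-1)^{i}[\pi(\xi_\lambda^{(i)})-\xi_\lambda^{(i)}(n)] > 0, \tag{3.4}
\]

for $\lambda > \alpha_i$ and $1\le i\le n-2$. Note that, by Proposition 3.3, $\xi_\lambda^{(i)} = \xi_\lambda$ for $\lambda \le \alpha_i$. Combining this with Remark 3.3, Lemma 3.7 and the continuity of $\xi_\lambda^{(i)}$, the first inequality of (3.4) holds for $\lambda > \lambda_i^G$ and the second inequality of (3.4) holds for $\lambda > \beta_{i-1}$.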

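According to Lemma 3.7 and Remark 3.6, we derive a generalized version of Proposition 2.3 in the following.

Proposition 3.8. Let $n \ge 3$ and $1\le i\le n-1$. For $\lambda > 0$, let $\xi_\lambda$, $\xi_\lambda^{(i)}$ be the functions in Definition 3.2 and $\beta_i$ be the constants in Lemma 3.7.

(1) For $\lambda > \beta_{i-1}$, the following are equivalent.

(1-1) $\mathcal{E}_\nu(\xi_\lambda^{(i)},\xi_\lambda^{(i)}) = \lambda\operatorname{Var}_\pi(\xi_\lambda^{(i)})$.

(1-2) $\pi(\xi_\lambda^{(i)}) = 0$.

(1-3) $\lambda = \lambda_i^G$.

(2) For $\beta_{i-1} < \lambda < \beta_i$, the following are equivalent.

(2-1) $\mathcal{E}_\nu(\xi_\lambda,\xi_\lambda) = \lambda\operatorname{Var}_\pi(\xi_\lambda)$.

(2-2) $\pi(\xi_\lambda) = 0$.

(2-3) $\lambda = \lambda_i^G$.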

Proof. The proof for Proposition 3.8(2) is similar to the proof for Proposition 3.8(1) and we deal only with the latter. By Lemma 3.7 and Remark 3.6, one has

\[
\pi(\xi_\lambda^{(i)})[\pi(\xi_\lambda^{(i)})-\xi_\lambda^{(i)}(n)]
\begin{cases}
< 0 & \text{for } \lambda > \lambda_i^G,\\
> 0 & \text{for } \beta_{i-1} < \lambda < \lambda_i^G.
\end{cases}
\]

This proves the equivalence of (1-1) and (1-2). Under the assumption of (1-2) and using Remark 3.3, one has $\lambda \le \alpha_i$. This implies $\xi_\lambda^{(i)} = \xi_\lambda$ is an eigenvector for $M_{\pi,\nu}^G$ with associated eigenvalue $\lambda$. As $\lambda \in (\beta_{i-1},\alpha_i]$, it must be the case $\lambda = \lambda_i^G$. This gives (1-3), while (1-3)⇒(1-2)

Remark 3.7. It is worthwhile to note that if (1-1) and (2-1) of Proposition 3.8 are removed, then the equivalence in (1) holds for $\lambda > \lambda_{i-1}^G$ and the equivalence in (2) holds for $\lambda \in (\lambda_{i-1}^G,\lambda_{i+1}^G)$. Once $\lambda_{i-1}^G$ is known, we can determine $\lambda_i^G$ using the sign of $\pi(\xi_\lambda^{(i)})$. See Theorem 3.9 for details.

Remark 3.8. Note that condition (4) of Proposition 2.3 is not included in Proposition 3.8. In fact, the equivalence may fail, that is, there may exist some $\lambda \in (\beta_{i-1},\beta_i)\setminus\{\lambda_i^G\}$ such that $\mathcal{E}_\nu(\xi_\lambda,\xi_\lambda)/\operatorname{Var}_\pi(\xi_\lambda) = \lambda_i^G$. See Example 3.2 for a counterexample.

While Proposition 3.8 characterizes the zeros of $L(\lambda)-\lambda$, the following theorem concerns the sign of $L(\lambda)-\lambda$.

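Theorem 3.9. Let $\lambda_i^G$, $\alpha_i$, $\beta_i$ be the constants in Proposition 3.3 and Lemma 3.7, and $L$ be the function in (3.3). Then, $\lambda_1^G,\dots,\lambda_{n-1}^G,\beta_1,\dots,\beta_{n-2}$ are fixed points of $L$ and, for $1\le i\le n-2$,

(1) $L(\lambda) < \lambda$ for $\lambda \in (\lambda_i^G,\beta_i)$.

(2) $L(\lambda) > \lambda$ for $\lambda \in (\beta_i,\lambda_{i+1}^G)$.

(3) $L^{(i)}(\lambda) < \lambda$ for $\lambda \in (\lambda_i^G,\infty)$.

Proof. Immediate from Lemma 3.7 and Remarks 3.5–3.6.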

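By Theorem 3.9, we obtain a lower bound on any specified eigenvalue of $M_{\pi,\nu}^G$.

Corollary 3.10. Let $1\le i\le n-1$ and $\lambda_0 > \lambda_i^G$. Consider the sequence $\lambda_{\ell+1} = L^{(i)}(\lambda_\ell)$ with $\ell \ge 0$ and set

\[
\lambda^* =
\begin{cases}
\lim_{\ell\to\infty}\lambda_\ell & \text{if } \lambda_\ell \text{ converges},\\
\sup_{\ell\in I}\lambda_\ell & \text{if } \lambda_\ell \text{ diverges},
\end{cases}
\]

where $I = \{\ell \mid \lambda_{\ell-1} > \lambda_\ell < \lambda_{\ell+1}\}$. Then, $\lambda^* \le \lambda_i^G$.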

It is not yet clear whether the sequence $\lambda_\ell$ in Corollary 3.10 is convergent, even locally. This subject will be discussed in the next subsection. Now, we establish some relations between the roots of $\det A_i(\lambda) = 0$ and the shape of $\xi_\lambda^{(i)}$. This is a generalization of Proposition 2.5.

Proposition 3.11. For $1\le i\le n-1$, let $A_i(\lambda)$ be the matrix in (2.4), $\theta_1^{(i)} < \cdots < \theta_i^{(i)}$ be the zeros of $\det A_i(\lambda) = 0$ and set $\theta_i^{(i-1)} := \infty$. Referring to the notation in Proposition 3.3, it holds true that, for $1\le i\le n-1$,

(1) $\lambda_i^G = \theta_i^{(n-1)} < \alpha_i = \theta_i^{(n-2)} < \cdots < \theta_i^{(i)}$.

(2) $\xi_\lambda^{(i)}(j) \ne \xi_\lambda^{(i)}(j+1) = \cdots = \xi_\lambda^{(i)}(n)$ for $\lambda \in [\theta_i^{(j)},\theta_i^{(j-1)})$ and $i\le j\le n-2$.

(3) $\xi_\lambda^{(i)}(n-1) \ne \xi_\lambda^{(i)}(n)$ for $\lambda \in (\theta_{i-1}^{(n-2)},\theta_i^{(n-2)})$ and $i\le n-1$.

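Proof. The order in (1) is a simple application of Lemma A.3. For (2), fix $1\le i\le n-1$ and set $\gamma(\lambda) = \min\{j \mid \xi_\lambda^{(i)}(k) = \xi_\lambda^{(i)}(n),\ \forall j\le k\le n\}$ and $B(\lambda) = \{1,2,\dots,\gamma(\lambda)\}$, $B^+(\lambda) = B(\lambda)\cup\{\gamma(\lambda)+1\}$. Clearly, $i+1 \le \gamma(\lambda) \le n$. We use the notation $\xi_\lambda|_C$ to denote the restriction of $\xi_\lambda$ to a set $C$. Suppose that $i$ is odd. By Remark 3.4, $\xi_\lambda^{(i)} = \xi_\lambda$ on $B(\lambda)$ and $\xi_\lambda|_{B(\lambda)}$ is of type $i$ with

By Lemma 3.2(1), if $\xi_\lambda(\gamma(\lambda)+1) < \xi_\lambda(\gamma(\lambda))$, then there is $\epsilon > 0$ such that, for $|\delta| < \epsilon$, $\xi_{\lambda+\delta}|_{B(\lambda)}$ is of type $i$ and

\[
\xi_{\lambda+\delta}(\gamma(\lambda)-1) < \xi_{\lambda+\delta}(\gamma(\lambda)) > \xi_{\lambda+\delta}(\gamma(\lambda)+1).
\]

This implies $\gamma(\lambda+\delta) = \gamma(\lambda)$ for $\delta \in (-\epsilon,\epsilon)$. By Lemma 3.2(2), if $\xi_\lambda(\gamma(\lambda)+1) = \xi_\lambda(\gamma(\lambda))$, then there is $\epsilon > 0$ such that, for $\delta \in (-\epsilon,0)$, $\xi_{\lambda+\delta}|_{B^+(\lambda)}$ is of type $i$ with

\[
\xi_{\lambda+\delta}(\gamma(\lambda)-1) < \xi_{\lambda+\delta}(\gamma(\lambda)) < \xi_{\lambda+\delta}(\gamma(\lambda)+1),
\]

and, for $\delta \in (0,\epsilon)$, $\xi_{\lambda+\delta}|_{B^+(\lambda)}$ is of type $i+1$ with

\[
\xi_{\lambda+\delta}(\gamma(\lambda)-1) < \xi_{\lambda+\delta}(\gamma(\lambda)) > \xi_{\lambda+\delta}(\gamma(\lambda)+1).
\]

This yields $\gamma(\lambda+\delta) = \gamma(\lambda)$ for $\delta \in (0,\epsilon)$ and $\gamma(\lambda+\delta) = \gamma(\lambda)+1$ for $\delta \in (-\epsilon,0)$. The proof for the case of even $i$ is similar and we conclude from the above that $\gamma(\lambda)$ is a non-increasing and right-continuous function taking values in $\{i+1,\dots,n\}$. Let $c_{i+1} > \cdots > c_{n-1}$ be the points of discontinuity of $\gamma(\lambda)$ such that $\gamma(c_j) = j$ for $i+1\le j\le n-1$. As a consequence of the above discussion, $\xi_{c_j}|_{\{1,\dots,j\}}$ is of type $i$ with $\xi_{c_j}(j) = \xi_{c_j}(j+1)$ and this implies $\sum_{k=1}^{j}\pi(k)\xi_{c_j}(k) = 0$. That means $c_j$ is a root of $\det A_{j-1}(\lambda) = 0$ for $j = i+1,\dots,n-1$. By Proposition 3.3 and the second equality in (1), $\gamma(\lambda) = n$ for $\theta_{i-1}^{(n-2)} < \lambda < \theta_i^{(n-2)}$ and, thus, $c_j \ge \theta_i^{(n-2)}$ for $j \ge i+1$. As a consequence of the interlacing relationship $\theta_i^{(\ell)} < \theta_i^{(\ell-1)} < \theta_{i+1}^{(\ell)}$, it must be $c_j = \theta_i^{(j-1)}$ for $i+1\le j\le n-1$. This finishes the proof.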

Remark 3.9. For $1\le i\le n-1$, $\theta_1^{(i)},\dots,\theta_i^{(i)}$ are also the non-zero eigenvalues of the $(i+1)\times(i+1)$ principal submatrix of (3.1) indexed by $1,\dots,i+1$.

Remark 3.10. In fact, by Proposition 2.5, $\xi_\lambda^{(1)}(n-1) \ne \xi_\lambda^{(1)}(n)$ for $\lambda \in (0,\theta_1^{(n-2)})$, which is better than Proposition 3.11(3).

3.3. Local convergence of L

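This subsection is dedicated to the local convergence of $L$ in (3.3). Let $\alpha_i$, $\beta_i$, $\lambda_i^G$ be the constants in Proposition 3.3 and Lemma 3.7. As before, let $\zeta_0 = 1,\dots,\zeta_{n-1}$ denote the $L^2(\pi)$-normalized eigenvectors of $M_{\pi,\nu}^G$ associated with $\lambda_0^G,\dots,\lambda_{n-1}^G$. Clearly, $\xi_{\lambda_i^G} = -\zeta_i/\zeta_i(1)$ and $\xi_\lambda = \sum_{i=0}^{n-1}\rho_i(\lambda)\zeta_i$, where $\rho_i(\lambda) = \pi(\xi_\lambda\zeta_i)$ for $0\le i\le n-1$. Note that $\rho_i(\lambda)$ is a polynomial of degree $n-1$ and satisfies $\rho_i(\lambda_j^G) = -\delta_i(j)/\zeta_i(1)$ for $i,j \in \{0,1,\dots,n-1\}$. This implies

\[
\rho_0(\lambda) = -\prod_{j=1}^{n-1}\frac{\lambda_j^G-\lambda}{\lambda_j^G}, \qquad \rho_i(\lambda) = -\frac{\lambda}{\zeta_i(1)\lambda_i^G}\prod_{j=1,\,j\ne i}^{n-1}\frac{\lambda_j^G-\lambda}{\lambda_j^G-\lambda_i^G}, \tag{3.5}
\]

for all $1\le i\le n-1$. Moreover, by multiplying (3.2) with $\xi_\lambda(k)$ and summing over $k$, we obtain $\mathcal{E}_\nu(\xi_\lambda,\zeta_i) = \lambda_i^G\rho_i(\lambda)$. In the same spirit, one can show that $\mathcal{E}_\nu(\xi_\lambda,\zeta_i) = \lambda[\rho_i(\lambda)-\zeta_i(n)\rho_0(\lambda)]$ using Definition 3.2. Putting both equations together yields

\[
\rho_i(\lambda) = \frac{\lambda\zeta_i(n)}{\lambda-\lambda_i^G}\,\rho_0(\lambda), \quad \forall 1\le i\le n-1. \tag{3.6}
\]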

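As a consequence of Remark 3.5, this gives

\[
L(\lambda) = \frac{\sum_{i=1}^{n-1}\lambda_i^G\rho_i^2(\lambda)}{\sum_{i=1}^{n-1}\rho_i^2(\lambda)} = \lambda + \frac{\sum_{i=1}^{n-1}(\lambda_i^G-\lambda)^{-1}\zeta_i^2(n)}{\sum_{i=1}^{n-1}(\lambda_i^G-\lambda)^{-2}\zeta_i^2(n)}, \tag{3.7}
\]

for $\lambda \notin \{\lambda_0^G,\dots,\lambda_{n-1}^G\}$. The next proposition follows immediately from the second equation in (3.5) and (3.6).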

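Proposition 3.12. Let $\lambda_1^G,\dots,\lambda_{n-1}^G$ be the non-zero eigenvalues of $M_{\pi,\nu}^G$ in (3.1) and $\zeta_1,\dots,\zeta_{n-1}$ be the corresponding $L^2(\pi)$-normalized eigenvectors. Then,

\[
\zeta_i(1)\zeta_i(n) = -\prod_{j=1,\,j\ne i}^{n-1}\frac{\lambda_j^G}{\lambda_j^G-\lambda_i^G}, \quad \forall 1\le i\le n-1.
\]

Set $u(\lambda) = \sum_{j=1}^{n-1}(\lambda_j^G-\lambda)^{-1}\zeta_j^2(n)$. By Theorem 3.9, $\beta_1,\dots,\beta_{n-2}$ are zeros of $u(\lambda)\prod_{j=1}^{n-1}(\lambda_j^G-\lambda)$, which is a polynomial of degree $n-2$. This implies

\[
u(\lambda) = C\left(\prod_{j=1}^{n-1}\frac{1}{\lambda_j^G-\lambda}\right)\left(\prod_{j=1}^{n-2}(\beta_j-\lambda)\right),
\]

where $C = \dfrac{\lambda_1^G\cdots\lambda_{n-1}^G}{\beta_1\cdots\beta_{n-2}}\sum_{j=1}^{n-1}\zeta_j^2(n)/\lambda_j^G$. Putting this back into $L$ yields

\[
\frac{1}{L(\lambda)-\lambda} = \frac{u'(\lambda)}{u(\lambda)} = \sum_{j=1}^{n-1}\frac{1}{\lambda_j^G-\lambda} - \sum_{j=1}^{n-2}\frac{1}{\beta_j-\lambda}, \tag{3.8}
\]

for $\lambda \notin \{\lambda_0^G,\dots,\lambda_{n-1}^G,\beta_1,\dots,\beta_{n-2}\}$.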

Proposition 3.13. Let $L$ be the function in (3.3), $\lambda_i^G$ be the eigenvalues of $M_{\pi,\nu}^G$ and $\beta_i$ be the constants in Lemma 3.7. Let

\[
D_i = \sum_{j=1}^{n-2}(\beta_j-\lambda_i^G)^{-1} - \sum_{j=1,\,j\ne i}^{n-1}(\lambda_j^G-\lambda_i^G)^{-1}, \quad 1\le i\le n-1.
\]

Then, for $2\le i\le n-2$,

(1) If $D_i < 0$, then there is $\tau \in (\lambda_i^G,\beta_i)$ such that $L$ is strictly increasing on $(\beta_{i-1},\lambda_i^G)\cup(\tau,\beta_i)$ and strictly decreasing on $(\lambda_i^G,\tau)$.

(2) If $D_i > 0$, then there is $\eta \in (\beta_{i-1},\lambda_i^G)$ such that $L$ is strictly increasing on $(\beta_{i-1},\eta)\cup(\lambda_i^G,\beta_i)$ and strictly decreasing on $(\eta,\lambda_i^G)$.

(3) If $D_i = 0$, then $L$ is strictly increasing on $(\beta_{i-1},\beta_i)$.

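Proof. Using (3.7) and (3.8), one can show that $L'(\lambda_i^G) = 0$ and

\[
L''(\lambda_i^G) = \frac{2}{\zeta_i^2(n)}\sum_{j=1,\,j\ne i}^{n-1}\frac{\zeta_j^2(n)}{\lambda_j^G-\lambda_i^G} = 2\left(\sum_{j=1}^{n-2}\frac{1}{\beta_j-\lambda_i^G} - \sum_{j=1,\,j\ne i}^{n-1}\frac{1}{\lambda_j^G-\lambda_i^G}\right) = 2D_i. \tag{3.9}
\]

To prove (1) and (2), it suffices to show that if $L'(\tau) = 0$ for some $\tau \in (\lambda_i^G,\beta_i)$, then $\tau$ is a strict local minimum of $L$ and, similarly, that any critical point $\eta \in (\beta_{i-1},\lambda_i^G)$ is a strict local maximum of $L$. We discuss the first case, whereas the second case is similar and is omitted. Recall that $u(\lambda) = \sum_{j=1}^{n-1}(\lambda_j^G-\lambda)^{-1}\zeta_j^2(n)$, so that $L(\lambda) = \lambda + u(\lambda)/u'(\lambda)$ by (3.7). As $\tau$ is a critical point for $L$, one has $2(u'(\tau))^2 = u(\tau)u''(\tau)$. This implies

\[
L''(\tau) = \frac{u(\tau)[3(u''(\tau))^2 - 2u'(\tau)u'''(\tau)]}{2(u'(\tau))^3} > 0,
\]

where the last inequality uses the fact that $u(\lambda) < 0$ for $\lambda \in (\lambda_i^G,\beta_i)$, and

\[
3(u''(\lambda))^2 - 2u'(\lambda)u'''(\lambda) = -12\sum_{1\le i<j\le n-1}\left[\frac{(\lambda_i^G-\lambda_j^G)\zeta_i(n)\zeta_j(n)}{(\lambda_i^G-\lambda)^2(\lambda_j^G-\lambda)^2}\right]^2 < 0.
\]

This proves (1) and (2).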

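To see (3), we assume that $D_i = 0$. Computations show that

\[
\frac{L(\lambda)-\lambda_i^G}{L(\lambda)-\lambda} = (\lambda-\lambda_i^G)\left(\sum_{j=1,\,j\ne i}^{n-1}\frac{1}{\lambda_j^G-\lambda} - \sum_{j=1}^{n-2}\frac{1}{\beta_j-\lambda}\right) = (\lambda-\lambda_i^G)^2\left(\sum_{j=1,\,j\ne i}^{n-1}\frac{1}{(\lambda_j^G-\lambda)(\lambda_j^G-\lambda_i^G)} - \sum_{j=1}^{n-2}\frac{1}{(\beta_j-\lambda)(\beta_j-\lambda_i^G)}\right) < 0,
\]

for $\lambda \in (\beta_{i-1},\lambda_i^G)\cup(\lambda_i^G,\beta_i)$, where the last inequality uses the fact that $(\lambda_j^G-\lambda)(\lambda_j^G-\lambda_i^G) > (\beta_j-\lambda)(\beta_j-\lambda_i^G)$ for $j < i$ and $(\lambda_j^G-\lambda)(\lambda_j^G-\lambda_i^G) > (\beta_{j-1}-\lambda)(\beta_{j-1}-\lambda_i^G)$ for $j > i$. By Theorem 3.9, this implies $L(\lambda) > \lambda_i^G$ for $\lambda \in (\lambda_i^G,\beta_i)$ and $L(\lambda) < \lambda_i^G$ for $\lambda \in (\beta_{i-1},\lambda_i^G)$. The desired property follows immediately from the discussion in the previous paragraph.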

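Remark 3.11. Note that $D_1 > 0$ and $D_{n-1} < 0$. Using the same proof as above, this implies that $L(\lambda)$ is strictly increasing on $(\lambda_1^G,\beta_1)\cup(\beta_{n-2},\lambda_{n-1}^G)$. Moreover, by (3.7), one may compute

\[
(u'(\lambda))^2L'(\lambda) = -2\sum_{i<j}\frac{(\lambda_i^G-\lambda_j^G)^2\zeta_i^2(n)\zeta_j^2(n)}{(\lambda_i^G-\lambda)^3(\lambda_j^G-\lambda)^3} < 0, \quad \forall\lambda \in (0,\lambda_1^G)\cup(\lambda_{n-1}^G,\infty).
\]

This implies $L(\lambda)$ is strictly decreasing on $(0,\lambda_1^G)\cup(\lambda_{n-1}^G,\infty)$ and

\[
\lim_{\lambda\to 0}L(\lambda) = \frac{\sum_{i=1}^{n-1}\zeta_i^2(n)/\lambda_i^G}{\sum_{i=1}^{n-1}\zeta_i^2(n)/(\lambda_i^G)^2}, \qquad \lim_{\lambda\to\infty}L(\lambda) = \left(\frac{1}{\pi(n)}-1\right)^{-1}\sum_{i=1}^{n-1}\lambda_i^G\zeta_i^2(n).
\]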

The following local convergence is a simple corollary of Theorem 3.9 and Proposition 3.13.

Theorem 3.14 (Local Convergence). Let $\lambda_0 > 0$ and set $\lambda_{\ell+1} = L(\lambda_\ell)$ for $\ell \ge 0$. Then, there is $\epsilon > 0$ such that the sequence $(\lambda_\ell)_{\ell=1}^{\infty}$ is monotonic and converges to $\lambda_i^G$ for $\lambda_0 \in (\lambda_i^G-\epsilon,\lambda_i^G+\epsilon)$ and $1\le i\le n-1$.
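The iteration in Theorem 3.14 can be tried numerically. The following is a minimal sketch, not the authors' implementation: it assumes the recursion $\xi_\lambda(1) = -1$, $\xi_\lambda(k+1) = \xi_\lambda(k) + \{[\xi_\lambda(k)-\xi_\lambda(k-1)]\nu(k-1,k) - \lambda\pi(k)\xi_\lambda(k)\}/\nu(k,k+1)$ for $\xi_\lambda$, and the function names `xi`, `rayleigh`, `L` and `iterate_L` are hypothetical. The test path is the three-point simple random walk ($\pi \equiv 1/3$, $\nu \equiv 1/6$), for which $\lambda_1^G = 1/2$ and a direct computation gives $\beta_1 = 5/4$, so the starting point $\lambda_0 = 1$ lies in $(\lambda_1^G,\beta_1)$.

```python
import math


def xi(lam, pi, nu):
    """Assumed recursion for xi_lambda on the path 1..n (0-based lists).

    pi[k] stores pi(k+1); nu[k] stores nu(k+1, k+2), so len(nu) == len(pi) - 1.
    """
    x = [-1.0]
    flow = 0.0  # equals [xi(k) - xi(k-1)] * nu(k-1, k); zero at the left end
    for k in range(len(pi) - 1):
        flow -= lam * pi[k] * x[k]
        x.append(x[k] + flow / nu[k])
    return x


def rayleigh(x, pi, nu):
    """Dirichlet form divided by variance for a function x on the path."""
    mean = sum(p * v for p, v in zip(pi, x))
    var = sum(p * v * v for p, v in zip(pi, x)) - mean * mean
    energy = sum(w * (x[k] - x[k + 1]) ** 2 for k, w in enumerate(nu))
    return energy / var


def L(lam, pi, nu):
    """The map L of (3.3) evaluated at lam."""
    return rayleigh(xi(lam, pi, nu), pi, nu)


def iterate_L(lam0, pi, nu, steps=20):
    """Apply lam -> L(lam) a fixed number of times."""
    lam = lam0
    for _ in range(steps):
        lam = L(lam, pi, nu)
    return lam


n = 3
pi = [1.0 / n] * n
nu = [1.0 / (2 * n)] * (n - 1)
gap = 1.0 - math.cos(math.pi / n)  # lambda_1^G for the simple random walk
print(iterate_L(1.0, pi, nu), gap)
```

Since $L'(\lambda_i^G) = 0$, the iterates approach the eigenvalue very quickly once they are close to it; a starting point outside $(\beta_{i-1},\beta_i)$ is not covered by this sketch.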

We use the following examples to illustrate the different cases in Proposition 3.13.

Example 3.1 (Simple Random Walks). Let $n > 1$. A simple random walk on $\{1,2,\dots,n\}$ with reflecting probability $1/2$ at the boundary is a birth and death chain with transition matrix given by $K(i,j) = 1/2$ for $|i-j| = 1$ and $K(1,1) = K(n,n) = 1/2$. It is easy to see that the uniform probability is the stationary distribution of $K$. In the graph setting, we have $\nu(i,i+1) = 1/(2n)$ and $\pi(i) = 1/n$. One may apply the method in [11] to obtain the following spectral information.

\[
\lambda_j^G = 1-\cos\frac{j\pi}{n}, \qquad \zeta_j(k) = \frac{1}{\sqrt{\lambda_j^G}}\left[\sin\frac{jk\pi}{n} - \sin\frac{j(k-1)\pi}{n}\right], \quad \forall 1\le j<n.
\]

See, e.g., [3, Section 7]. Since $\zeta_j^2(n) = \sin^2(j\pi/n)/\lambda_j^G = 1+\cos(j\pi/n)$, (3.9) gives

\[
D_i = \frac{1}{\zeta_i^2(n)}\sum_{j=1,\,j\ne i}^{n-1}\frac{\sin^2(j\pi/n)}{\lambda_j^G(\lambda_j^G-\lambda_i^G)} = \frac{1}{1+\cos(i\pi/n)}\sum_{j=1,\,j\ne i}^{n-1}\frac{1+\cos(j\pi/n)}{\cos(i\pi/n)-\cos(j\pi/n)}.
\]

Clearly, $D_1 > 0$ and $D_{n-1} < 0$. If $n$ is even, then $D_{n/2} < 0$.
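The spectral data of Example 3.1 can be checked numerically. The sketch below is illustrative only: it reads the normalizing factor of $\zeta_j$ as $(\lambda_j^G)^{-1/2}$, which is an assumption about the intended formula, and the helper names `srw_eigenpair`, `apply_M` and `residuals` are hypothetical. It verifies the eigenvalue equation for $M_{\pi,\nu}^G$ with $\pi \equiv 1/n$, $\nu \equiv 1/(2n)$, together with $\pi(\zeta_j) = 0$ and $\pi(\zeta_j^2) = 1$.

```python
import math


def srw_eigenpair(n, j):
    """Candidate eigenpair (lambda_j, zeta_j) of the simple random walk on 1..n."""
    lam = 1.0 - math.cos(j * math.pi / n)
    zeta = [
        (math.sin(j * k * math.pi / n) - math.sin(j * (k - 1) * math.pi / n))
        / math.sqrt(lam)
        for k in range(1, n + 1)
    ]
    return lam, zeta


def apply_M(f):
    """(M f)(k) for pi = 1/n and nu = 1/(2n): each edge contributes (f(k) - f(neighbor)) / 2."""
    n = len(f)
    out = []
    for k in range(n):
        val = 0.0
        if k > 0:
            val += 0.5 * (f[k] - f[k - 1])
        if k < n - 1:
            val += 0.5 * (f[k] - f[k + 1])
        out.append(val)
    return out


def residuals(n, j):
    """Return (max |M zeta - lambda zeta|, |pi(zeta)|, |pi(zeta^2) - 1|)."""
    lam, zeta = srw_eigenpair(n, j)
    mz = apply_M(zeta)
    eig_err = max(abs(a - lam * b) for a, b in zip(mz, zeta))
    mean = sum(zeta) / n
    norm = sum(v * v for v in zeta) / n
    return eig_err, abs(mean), abs(norm - 1.0)


for j in range(1, 7):
    print(j, residuals(7, j))
```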

Example 3.2 (Ehrenfest Chains). An Ehrenfest chain on $V = \{0,1,\dots,n\}$ is a Markov chain with transition matrix $K$ given by $K(i,i+1) = 1-i/n$ and $K(i+1,i) = (i+1)/n$ for $i = 0,\dots,n-1$. The associated stationary distribution is the unbiased binomial distribution on $V$, that is, $\pi(i) = \binom{n}{i}2^{-n}$ for $i \in V$. For the Ehrenfest chain, the measure $\nu$ is defined by $\nu(i,i+1) = \binom{n-1}{i}2^{-n}$ for $i = 0,\dots,n-1$. Using the group representation for the binary group $\{0,1\}^n$, one may compute

\[
\lambda_j = \frac{2j}{n}, \qquad \zeta_j(k) = \binom{n}{j}^{-1/2}\sum_{\ell=0}^{j}(-1)^{\ell}\binom{k}{\ell}\binom{n-k}{j-\ell}, \quad \forall 1\le j\le n.
\]

Since $\zeta_j^2(n) = \binom{n}{j}$, plugging this back into (3.9) yields

\[
D_i = \frac{n}{2\binom{n}{i}}\sum_{j=1,\,j\ne i}^{n}\frac{\binom{n}{j}}{j-i}
\begin{cases}
> 0 & \text{for } i < n/2,\\
= 0 & \text{for } i = n/2,\\
< 0 & \text{for } i > n/2.
\end{cases}
\]

This example points out the possibility of different signs in $\{D_i \mid i = 1,\dots,n-1\}$, including 0.

3.4. A remark on the separation for birth and death chains

In this subsection, we give a new proof of a result, Theorem 3.15, which deals with convergence in separation distance for birth and death chains. Let $(X_m)_{m=0}^{\infty}$ be a birth and death chain with transition matrix $K$ given by (1.1). In the continuous time setting, we consider the process $Y_t = X_{N_t}$, where $N_t$ is a Poisson process with parameter 1 independent of $X_m$. Given the initial distribution $\mu$, which is the distribution of $X_0$, the distributions of $X_m$ and $Y_t$ are respectively $\mu K^m$ and $\mu e^{-t(I-K)}$, where $e^A := \sum_{l=0}^{\infty}A^l/l!$. Briefly, we write $H_t = e^{-t(I-K)}$. It is well-known that if $K$ is irreducible, then $\mu H_t$ converges to $\pi$ as $t\to\infty$. If $K$ is irreducible and $r_i > 0$ for some $i$, then $\mu K^m$ converges to $\pi$ as $m\to\infty$. Concerning the convergence, we consider the separations of $X_m$, $Y_t$ with respect to $\pi$, which are defined by

\[
d_{\mathrm{sep}}(\mu,m) = \max_{0\le x\le n}\left\{1-\frac{\mu K^m(x)}{\pi(x)}\right\}, \qquad d_{\mathrm{sep}}^c(\mu,t) = \max_{0\le x\le n}\left\{1-\frac{\mu H_t(x)}{\pi(x)}\right\}.
\]

The following theorem is from [9].


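Theorem 3.15. Let $K$ be an irreducible birth and death chain on $\{0,1,\dots,n\}$ with eigenvalues $\lambda_0 = 0 < \lambda_1 < \cdots < \lambda_n$.

(1) For the discrete time chain, if $p_i + q_{i+1} \le 1$ for all $0\le i<n$, then

\[
d_{\mathrm{sep}}(0,m) = d_{\mathrm{sep}}(n,m) = \sum_{j=1}^{n}\left(\prod_{i=1,\,i\ne j}^{n}\frac{\lambda_i}{\lambda_i-\lambda_j}\right)(1-\lambda_j)^m.
\]

(2) For the continuous time chain, it holds true that

\[
d_{\mathrm{sep}}^c(0,t) = d_{\mathrm{sep}}^c(n,t) = \sum_{j=1}^{n}\left(\prod_{i=1,\,i\ne j}^{n}\frac{\lambda_i}{\lambda_i-\lambda_j}\right)e^{-\lambda_jt}.
\]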

Diaconis and Fill [5,12] introduce the concept of dual chain to express the separations in Theorem 3.15 as the probability of the first passage time. Brown and Shao [1] characterize the first passage time using the eigenvalues of $K$ for a special class of continuous time Markov chains including birth and death chains. The idea in [1] is also applicable to discrete time chains and this leads to the formula above. See [9] for further discussions. Here, we use Proposition 3.12 and Lemma 3.16 to prove this result directly.

Lemma 3.16. Let $K$ be the transition matrix in (1.1) with stationary distribution $\pi$. Suppose that $\mu$ is a probability distribution satisfying $\mu(i)/\pi(i) \le \mu(i+1)/\pi(i+1)$ for all $0\le i\le n-1$.

(1) For the discrete time chain, if $p_i + q_{i+1} \le 1$ for all $0\le i<n$, then $\mu K^m(i)/\pi(i) \le \mu K^m(i+1)/\pi(i+1)$ for all $0\le i<n$ and $m \ge 0$.

(2) For the continuous time chain, $\mu H_t(i)/\pi(i) \le \mu H_t(i+1)/\pi(i+1)$ for all $0\le i<n$ and $t \ge 0$.

Proof. Note that (2) follows from (1) if we write $H_t = \exp\{-2t(I-\frac{I+K}{2})\}$. For the proof of (1), observe that

\[
\frac{\mu K^{m+1}(i)}{\pi(i)} = \frac{\mu K^m(i-1)}{\pi(i-1)}q_i + \frac{\mu K^m(i)}{\pi(i)}r_i + \frac{\mu K^m(i+1)}{\pi(i+1)}p_i, \quad \forall i.
\]

By induction, if $\mu K^m(i)/\pi(i) \le \mu K^m(i+1)/\pi(i+1)$ for $0\le i<n$, then

\[
\begin{aligned}
\frac{\mu K^{m+1}(i+1)}{\pi(i+1)} &= \frac{\mu K^m(i)}{\pi(i)}q_{i+1} + \frac{\mu K^m(i+1)}{\pi(i+1)}r_{i+1} + \frac{\mu K^m(i+2)}{\pi(i+2)}p_{i+1}\\
&\ge \frac{\mu K^m(i)}{\pi(i)}q_{i+1} + \frac{\mu K^m(i+1)}{\pi(i+1)}(1-q_{i+1})\\
&\ge \frac{\mu K^m(i)}{\pi(i)}(1-p_i) + \frac{\mu K^m(i+1)}{\pi(i+1)}p_i \ge \frac{\mu K^{m+1}(i)}{\pi(i)}.
\end{aligned}
\]

Remark 3.12. Lemma 3.16 is also developed in [10] in which it is shown that, for any non-negative function f , Kmf is non-decreasing if f is non-decreasing for all m ≥ 0. Consider the adjoint chain K∗ of K in L2(π). As birth and death chains are reversible, one has K∗ = K. Using the identityµK/π = K∗(µ/π), it is easy to see that the above proof is consistent with the proof in [10].


Proof of Theorem 3.15. Assume that $K$ is irreducible and let $\lambda_0 = 0 < \lambda_1 < \cdots < \lambda_n$ be the eigenvalues of $I-K$ with $L^2(\pi)$-normalized eigenvectors $\zeta_0 = 1,\dots,\zeta_n$. By Lemma 3.16, if $\mu$ satisfies $\mu(i)/\pi(i) \ge \mu(i+1)/\pi(i+1)$ for $0\le i<n$, then

\[
d_{\mathrm{sep}}^c(\mu,t) = 1-\frac{\mu H_t(n)}{\pi(n)} = -\sum_{j=1}^{n}\mu(\zeta_j)\zeta_j(n)e^{-\lambda_jt},
\]

where $\mu(\zeta_j) = \sum_{i=0}^{n}\zeta_j(i)\mu(i)$. If $K$ satisfies $p_i + q_{i+1} \le 1$ for all $0\le i<n$, then

\[
d_{\mathrm{sep}}(\mu,m) = 1-\frac{\mu K^m(n)}{\pi(n)} = -\sum_{j=1}^{n}\mu(\zeta_j)\zeta_j(n)(1-\lambda_j)^m.
\]

By Proposition 3.12, setting $\mu$ to be one of the Dirac measures $\delta_0$, $\delta_n$ leads to the desired identities.
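As a numerical sanity check of Theorem 3.15(1), the following sketch compares the separation computed from powers of $K$ with the eigenvalue formula. It is an illustration under stated assumptions rather than part of the proof: the chain is the walk on $\{0,\dots,n\}$ with $p_i = q_i = 1/2$ and holding probability $1/2$ at both ends (so $p_i + q_{i+1} = 1$), whose eigenvalues for $I-K$ are taken to be $1-\cos(j\pi/(n+1))$, in line with Example 3.1 applied to $n+1$ states. The function names are hypothetical.

```python
import math


def lazy_end_walk(n):
    """Transition matrix on {0,...,n}: steps of +-1 w.p. 1/2, holding 1/2 at the ends."""
    size = n + 1
    K = [[0.0] * size for _ in range(size)]
    for i in range(size):
        if i + 1 < size:
            K[i][i + 1] = 0.5
        if i >= 1:
            K[i][i - 1] = 0.5
        K[i][i] = 1.0 - sum(K[i])
    return K


def sep_direct(K, pi, start, m):
    """max_x (1 - mu K^m(x) / pi(x)) with mu the point mass at `start`."""
    size = len(K)
    mu = [0.0] * size
    mu[start] = 1.0
    for _ in range(m):
        mu = [sum(mu[x] * K[x][y] for x in range(size)) for y in range(size)]
    return max(1.0 - mu[y] / pi[y] for y in range(size))


def sep_formula(eigs, m):
    """Right-hand side of Theorem 3.15(1) for the non-zero eigenvalues `eigs`."""
    total = 0.0
    for j, lj in enumerate(eigs):
        coeff = 1.0
        for i, li in enumerate(eigs):
            if i != j:
                coeff *= li / (li - lj)
        total += coeff * (1.0 - lj) ** m
    return total


n = 4
K = lazy_end_walk(n)
pi = [1.0 / (n + 1)] * (n + 1)
eigs = [1.0 - math.cos(j * math.pi / (n + 1)) for j in range(1, n + 1)]
for m in range(6):
    print(m, sep_direct(K, pi, 0, m), sep_formula(eigs, m))
```

For $m < n$ both columns equal 1, reflecting that the chain started at 0 cannot reach $n$ in fewer than $n$ steps.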

4. Paths of infinite length

In this section, the graph $G = (V,E)$ under consideration is infinite with $V = \{1,2,\dots\}$ and $E = \{\{i,i+1\} \mid i = 1,2,\dots\}$. As before, let $\pi$, $\nu$ be positive measures on $V$, $E$ satisfying $\pi(V) = 1$. The Dirichlet form and the variance are defined in a similar way as in the introduction and the spectral gap of $G$ with respect to $\pi$, $\nu$ is given by

\[
\lambda_{\pi,\nu}^G = \inf\left\{\frac{\mathcal{E}_\nu(f,f)}{\operatorname{Var}_\pi(f)} \,\middle|\, f \text{ is non-constant and } \pi(f^2) < \infty\right\}.
\]

For $n \ge 2$, let $G_n = (V_n,E_n)$ be the subgraph of $G$ with $V_n = \{1,2,\dots,n\}$, $E_n = \{\{i,i+1\} \mid 1\le i<n\}$ and let $\pi_n$, $\nu_n$ be normalized restrictions of $\pi$, $\nu$ to $V_n$, $E_n$. That is, $\pi_n(i) = c_n\pi(i)$, $\nu_n(i,i+1) = c_n\nu(i,i+1)$ with $c_n = 1/[\pi(1)+\cdots+\pi(n)]$. As before, let $M_{\pi,\nu}^G$ be an infinite matrix indexed by $V$ and defined by

\[
M_{\pi,\nu}^G(i,j) = -\frac{\nu(i,j)}{\pi(i)}, \ \forall |i-j| = 1, \qquad M_{\pi,\nu}^G(i,i) = \frac{\nu(i-1,i)+\nu(i,i+1)}{\pi(i)}. \tag{4.1}
\]

Clearly, $M_{\pi_n,\nu_n}^{G_n}$ is the principal submatrix of $M_{\pi,\nu}^G$ indexed by $V_n\times V_n$.

Lemma 4.1. Referring to the above setting, $\lambda_{\pi_{n+1},\nu_{n+1}}^{G_{n+1}} < \lambda_{\pi_n,\nu_n}^{G_n}$ for $n > 1$ and $\lambda_{\pi,\nu}^G = \lim_{n\to\infty}\lambda_{\pi_n,\nu_n}^{G_n}$.

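Proof. Briefly, we write $\lambda$ for $\lambda_{\pi,\nu}^G$ and $\lambda_n$ for $\lambda_{\pi_n,\nu_n}^{G_n}$. Note that $\lambda_n$ is the smallest non-zero eigenvalue of the principal submatrix of $M_{\pi,\nu}^G$ indexed by $V_n\times V_n$. As a consequence of Proposition 3.11(1) and Remark 3.9, $\lambda_{n+1} < \lambda_n$. For $n > 1$, let $\phi_n$ be a minimizer for $\lambda_n$ and define $\psi_n(i) = \mathbf{1}_{V_n}(i)\phi_n(i)$ for $i \ge 1$. Clearly, one has $\mathcal{E}_{\nu_n}(\phi_n,\phi_n) = c_n\mathcal{E}_\nu(\psi_n,\psi_n)$ and $\operatorname{Var}_{\pi_n}(\phi_n) = c_n\operatorname{Var}_\pi(\psi_n)$. This implies $\lambda \le \lambda_n$ for $n \ge 2$. Let $\lambda^* = \lim_{n\to\infty}\lambda_n$. Note that it remains to show $\lambda^* = \lambda$. For $\epsilon > 0$, choose a function $f$ on $V$ such that $\mathcal{E}_\nu(f,f) < (\lambda+\epsilon/2)\operatorname{Var}_\pi(f)$ with $\pi(f^2) < \infty$. For $\delta > 0$, we choose $N > 0$ such that $\operatorname{Var}_{\pi_N}(g) > (1-\delta)\operatorname{Var}_\pi(f)$ and $\mathcal{E}_{\nu_N}(g,g) < (1+\delta)\mathcal{E}_\nu(f,f)$, where $g = f|_{V_N}$, the restriction of $f$ to $V_N$. This implies

\[
\lambda^* \le \lambda_N \le \frac{\mathcal{E}_{\nu_N}(g,g)}{\operatorname{Var}_{\pi_N}(g)} \le \frac{(1+\delta)\mathcal{E}_\nu(f,f)}{(1-\delta)\operatorname{Var}_\pi(f)}.
\]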

Remark 4.1. Silver [17] contains a discussion of the (weak*) convergence of the spectral measure for $G_n$ to the spectral measure for $G$ in a very general setting. Lemma 4.1 can also be proved using Theorem 4.3.4 in [17].

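Proposition 4.2. For $\lambda > 0$, let $\phi_\lambda(1) = -1$ and

\[
\phi_\lambda(i+1) = \phi_\lambda(i) + \frac{\{[\phi_\lambda(i)-\phi_\lambda(i-1)]\nu(i-1,i) - \lambda\pi(i)\phi_\lambda(i)\}^+}{\nu(i,i+1)}, \quad \forall i \ge 1.
\]

Set $\lambda_1 = \infty$ and $\lambda_n = \lambda_{\pi_n,\nu_n}^{G_n}$ for $n \ge 2$.

(1) For $i \ge 2$ and $\lambda \in [\lambda_i,\lambda_{i-1})$, $\phi_\lambda(i-1) < \phi_\lambda(i) = \phi_\lambda(i+1)$.

(2) For $\lambda \in (0,\lambda_{\pi,\nu}^G]$, $\phi_\lambda(i) < \phi_\lambda(i+1)$ for all $i \ge 1$.

Proof. Immediate from Proposition 3.11 and Remarks 3.9–3.10.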

Remark 4.2. By Proposition 4.2, one may generate a dichotomy algorithm for $\lambda_{\pi,\nu}^G$ using the shape of $\phi_\lambda$. See (Di).

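The following theorem extends Theorem 1.1 to infinite paths.

Theorem 4.3. If $\lambda_{\pi,\nu}^G > 0$ and $\mathcal{E}_\nu(\psi,\psi)/\operatorname{Var}_\pi(\psi) = \lambda_{\pi,\nu}^G$ for some function $\psi$ on $V$ with $\pi(\psi) = 0$, then $\psi$ is strictly monotonic and satisfies

\[
\lambda_{\pi,\nu}^G\pi(i)\psi(i) = [\psi(i)-\psi(i+1)]\nu(i,i+1) + [\psi(i)-\psi(i-1)]\nu(i-1,i), \quad \forall i \ge 1.
\]

Theorem 4.4. For $\lambda > 0$, let $\phi_\lambda$ be the function in Proposition 4.2 and set $L(\lambda) = \mathcal{E}_\nu(\phi_\lambda,\phi_\lambda)/\operatorname{Var}_\pi(\phi_\lambda)$. Then,

(1) $\lambda_{\pi,\nu}^G < L(\lambda) < \lambda$ for $\lambda \in (\lambda_{\pi,\nu}^G,\infty)$.

(2) $L^n(\lambda) \to \lambda_{\pi,\nu}^G$ as $n\to\infty$ for $\lambda \in (\lambda_{\pi,\nu}^G,\infty)$, where $L^n$ denotes the $n$-fold iterate of $L$.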

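Proof. Let $\lambda > \lambda_{\pi,\nu}^G$. By Lemma 4.1, $\lambda_i \le \lambda < \lambda_{i-1}$ for some $i \ge 2$. By Proposition 4.2(1), one has $\phi_\lambda(i-1) < \phi_\lambda(i) = \phi_\lambda(i+1)$. As in (2.2), we obtain

\[
L(\lambda) = \lambda + \frac{\lambda\pi(\phi_\lambda)[\pi(\phi_\lambda)-\phi_\lambda(i)]}{\operatorname{Var}_\pi(\phi_\lambda)}, \qquad \sum_{j=1}^{i}\phi_\lambda(j)\pi(j) \ge 0.
\]

This leads to $\pi(\phi_\lambda) > 0$ and $\pi(\phi_\lambda) < \phi_\lambda(i)$, which implies $L(\lambda) < \lambda$. That means $L$ has no fixed point on $(\lambda_{\pi,\nu}^G,\infty)$. The lower bound of (1) follows immediately from Theorem 4.3. For (2), set $\lambda^* = \lim_{n\to\infty}L^n(\lambda) \ge \lambda_{\pi,\nu}^G$. As a consequence of (1), $L$ is continuous on $(\lambda_{\pi,\nu}^G,\infty)$. If $\lambda^* > \lambda_{\pi,\nu}^G$, then $\lambda^*$ is a fixed point of $L$, a contradiction. Hence, $\lambda^* = \lambda_{\pi,\nu}^G$.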
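The monotone iteration behind Theorem 4.4 can be illustrated on a finite path, where the analogous statement is Theorem 3.9(3) with $i = 1$. The sketch below is a hedged illustration, not the algorithm (A2) of the paper: it applies the truncated recursion of Proposition 4.2 to a finite path (an assumption, motivated by the identification of $\xi_\lambda^{(1)}$ with $\phi_\lambda$), and the names `phi`, `rayleigh`, `L1` and `gap_iterates` are hypothetical.

```python
import math


def phi(lam, pi, nu):
    """Truncated recursion: increments use the positive part {.}^+ (0-based lists)."""
    x = [-1.0]
    flow = 0.0  # truncated value of [phi(i) - phi(i-1)] * nu(i-1, i)
    for k in range(len(pi) - 1):
        flow = max(flow - lam * pi[k] * x[k], 0.0)
        x.append(x[k] + flow / nu[k])
    return x


def rayleigh(x, pi, nu):
    """Dirichlet form divided by variance for a function x on the path."""
    mean = sum(p * v for p, v in zip(pi, x))
    var = sum(p * v * v for p, v in zip(pi, x)) - mean * mean
    energy = sum(w * (x[k] - x[k + 1]) ** 2 for k, w in enumerate(nu))
    return energy / var


def L1(lam, pi, nu):
    """Rayleigh quotient of phi_lambda, the map iterated in this sketch."""
    return rayleigh(phi(lam, pi, nu), pi, nu)


def gap_iterates(lam0, pi, nu, steps):
    """Return [lam0, L1(lam0), L1(L1(lam0)), ...] with `steps` applications."""
    seq = [lam0]
    for _ in range(steps):
        seq.append(L1(seq[-1], pi, nu))
    return seq


n = 10
pi = [1.0 / n] * n
nu = [1.0 / (2 * n)] * (n - 1)
seq = gap_iterates(3.0, pi, nu, 30)
print(seq[-1], 1.0 - math.cos(math.pi / n))  # last iterate vs. the spectral gap
```

Each iterate is a Rayleigh quotient of a non-constant function, so it can never fall below the spectral gap; the truncation is what keeps the sequence decreasing from an arbitrary starting point above the gap.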

5. A numerical experiment

In this section, we illustrate the algorithm (A2) on a specific Metropolis chain. The Metropolis algorithm introduced by Metropolis et al. in 1953 is a widely used construction that produces a Markov chain with a given stationary distribution $\pi$. Let $\pi$ be a positive probability measure on $V$ and $K$ be an irreducible Markov transition matrix on $V$. For simplicity, we assume that $K(x,y) = K(y,x)$ for all $x,y \in V$. The Metropolis chain evolves in the following way. Given the initial state $x$, select a state, say $y$, according to $K(x,\cdot)$ and compute the ratio $A(x,y) = \pi(y)/\pi(x)$. If $A(x,y) \ge 1$, then move to $y$. If $A(x,y) < 1$, then flip a coin with probability $A(x,y)$ of heads and move to $y$ if the coin lands on heads. If the coin lands on tails, stay at $x$. Accordingly, if $M$ is the transition matrix of the Metropolis chain, then

\[
M(x,y) =
\begin{cases}
K(x,y) & \text{if } A(x,y) \ge 1,\ x \ne y,\\
K(x,y)A(x,y) & \text{if } A(x,y) < 1,\\
K(x,x) + \sum_{z:\,A(x,z)<1}K(x,z)(1-A(x,z)) & \text{if } x = y.
\end{cases}
\]

It is easy to check that $\pi(x)M(x,y) = \pi(y)M(y,x)$. As $K$ is irreducible, $M$ is irreducible. Moreover, if $\pi$ is not uniform, then $M(x,x) > 0$ for some $x \in V$. This implies that $M$ is aperiodic and, consequently, $M^t(x,y) \to \pi(y)$ and $e^{-t(I-M)}(x,y) \to \pi(y)$ as $t\to\infty$. For further information on Metropolis chains, see [6] and the references therein.

For $n \ge 1$, let $G_n = (V_n,E_n)$ be a graph with $V_n = \{0,\pm1,\dots,\pm n\}$ and $E_n = \{\{i,i+1\} : i = -n,\dots,n-1\}$. Suppose that $K_n$ is the transition matrix of the simple random walk on $V_n$, that is, $K_n(-n,-n) = K_n(n,n) = 1/2$ and $K_n(i,i+1) = K_n(i+1,i) = 1/2$ for all $-n\le i<n$. For $a > 0$, let $\check\pi_{n,a}$, $\hat\pi_{n,a}$ be probabilities on $V_n$ given by

\[
\check\pi_{n,a}(i) = \check c_{n,a}(|i|+1)^a, \qquad \hat\pi_{n,a}(i) = \hat c_{n,a}(n-|i|+1)^a,
\]

where $\check c_{n,a}$ and $\hat c_{n,a}$ are normalizing constants. It is easy to compute that

\[
c_{n,a}/2 \le 1/\hat c_{n,a} < 1/\check c_{n,a} \le 2c_{n,a}, \tag{5.1}
\]

where $c_{n,a} = \dfrac{(n+1)^{a+1}}{a+1} + (n+1)^a$.

The Metropolis chains, $\check K_{n,a}$ and $\hat K_{n,a}$, for $\check\pi_{n,a}$ and $\hat\pi_{n,a}$ based on the simple random walk $K_n$ have transition matrices given by

\[
\check K_{n,a}(i,j) = \check K_{n,a}(-i,-j), \qquad \hat K_{n,a}(i,j) = \hat K_{n,a}(-i,-j)
\]

and

\[
\check K_{n,a}(i,j) =
\begin{cases}
\dfrac{1}{2} & \text{if } j = i+1,\ i \in [0,n-1],\\[4pt]
\dfrac{i^a}{2(i+1)^a} & \text{if } j = i-1,\ i \in [1,n],\\[4pt]
\dfrac{(i+1)^a - i^a}{2(i+1)^a} & \text{if } j = i,\ i \notin \{0,n\},\\[4pt]
1-\dfrac{n^a}{2(n+1)^a} & \text{if } i = j = n,
\end{cases}
\]

and

\[
\hat K_{n,a}(i,j) =
\begin{cases}
\dfrac{1}{2} & \text{if } j = i-1,\ i \in [1,n],\\[4pt]
\dfrac{(n-i)^a}{2(n-i+1)^a} & \text{if } j = i+1,\ i \in [0,n-1],\\[4pt]
\dfrac{(n-i+1)^a - (n-i)^a}{2(n-i+1)^a} & \text{if } j = i \ne 0,\\[4pt]
1-\dfrac{n^a}{(n+1)^a} & \text{if } i = j = 0.
\end{cases}
\]
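The closed forms above can be cross-checked against the general Metropolis construction. The following sketch is illustrative only and all function names are hypothetical: it builds the Metropolis chain from the simple random walk $K_n$ by the rule $M(x,y) = K(x,y)\min\{1,\pi(y)/\pi(x)\}$ for $x \ne y$, and compares every entry with the displayed formulas for $\check K_{n,a}$ and $\hat K_{n,a}$, extended to negative states by the symmetry $(i,j) \mapsto (-i,-j)$.

```python
def metropolis(states, K, pi):
    """Metropolis chain for pi from a symmetric proposal K (dict of dicts)."""
    M = {x: {} for x in states}
    for x in states:
        for y, kxy in K[x].items():
            if y != x:
                M[x][y] = kxy * min(1.0, pi[y] / pi[x])
        M[x][x] = 1.0 - sum(M[x].values())  # rejected proposals stay at x
    return M


def simple_walk(n):
    """Simple random walk on {-n,...,n} with holding 1/2 at both ends."""
    states = list(range(-n, n + 1))
    K = {x: {} for x in states}
    for i in range(-n, n):
        K[i][i + 1] = K[i + 1][i] = 0.5
    K[-n][-n] = K[n][n] = 0.5
    return states, K


def fold(i, j):
    """Use the symmetry (i, j) -> (-i, -j) to reduce to i >= 0."""
    return (-i, -j) if i < 0 or (i == 0 and j < 0) else (i, j)


def k_check(i, j, n, a):
    i, j = fold(i, j)
    if j == i + 1 and i <= n - 1:
        return 0.5
    if j == i - 1 and i >= 1:
        return i ** a / (2.0 * (i + 1) ** a)
    if j == i and 0 < i < n:
        return ((i + 1) ** a - i ** a) / (2.0 * (i + 1) ** a)
    if j == i == n:
        return 1.0 - n ** a / (2.0 * (n + 1) ** a)
    return 0.0


def k_hat(i, j, n, a):
    i, j = fold(i, j)
    if j == i - 1 and i >= 1:
        return 0.5
    if j == i + 1 and i <= n - 1:
        return (n - i) ** a / (2.0 * (n - i + 1) ** a)
    if j == i and i != 0:
        return ((n - i + 1) ** a - (n - i) ** a) / (2.0 * (n - i + 1) ** a)
    if j == i == 0:
        return 1.0 - n ** a / (n + 1.0) ** a
    return 0.0


def max_discrepancy(n, a, weight, closed_form):
    """Largest entrywise gap between the generic construction and a closed form."""
    states, K = simple_walk(n)
    w = {i: weight(i, n) ** a for i in states}
    total = sum(w.values())
    pi = {i: w[i] / total for i in states}
    M = metropolis(states, K, pi)
    return max(abs(M[i].get(j, 0.0) - closed_form(i, j, n, a))
               for i in states for j in states)


print(max_discrepancy(5, 1.5, lambda i, n: abs(i) + 1.0, k_check))
print(max_discrepancy(5, 1.5, lambda i, n: n - abs(i) + 1.0, k_hat))
```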


Saloff-Coste [16] discussed the above chains and obtained the correct order of the spectral gaps. Let $\check\lambda_{n,a}$, $\hat\lambda_{n,a}$ denote the spectral gaps of $\check K_{n,a}$, $\hat K_{n,a}$. Referring to the recent work in [4], one has

\[
1/(4C) \le \lambda \le 1/C,
\]

where $(\lambda,C)$ is any of $(\check\lambda_{n,a},\check C_n(a))$ and $(\hat\lambda_{n,a},\hat C_n(a))$, and

\[
\check C_n(a) = 2\max_{1\le i\le n}\left(\sum_{j=0}^{i-1}(j+1)^{-a}\right)\left(\sum_{j=i}^{n}(j+1)^{a}\right),
\]

and

\[
\hat C_n(a) = 2\max_{1\le i\le n}\left(\sum_{j=0}^{i-1}(j+1)^{a}\right)\left(\sum_{j=i-1}^{n-1}(j+1)^{-a}\right).
\]

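Theorem 5.1. Let $\check\lambda_{n,a}$, $\hat\lambda_{n,a}$ be the spectral gaps for $\check K_{n,a}$, $\hat K_{n,a}$. Then,

\[
\frac{1}{8\eta_{-a}(1,n)\eta_a(2,n+1)} \le \check\lambda_{n,a} \le \frac{2}{\eta_{-a}(1,n)\eta_a(2,n+1)},
\]

and

\[
\frac{1}{64\eta_a(1,\lceil n/2\rceil)\eta_{-a}(\lceil n/2\rceil,n)} \le \hat\lambda_{n,a} \le \frac{1}{2\eta_a(1,\lceil n/2\rceil)\eta_{-a}(\lceil n/2\rceil,n)},
\]

where $\eta_a(k,l) = \sum_{i=k}^{l}i^a$.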

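Proof of Theorem 5.1. The bound for $\check\lambda_{n,a}$ follows immediately from the fact

\[
\frac{\eta_{-a}(1,n)\eta_a(2,n+1)}{2} \le \check C_n(a) \le 2\eta_{-a}(1,n)\eta_a(2,n+1).
\]

For $\hat\lambda_{n,a}$, note that

\[
\hat C_n(a) = 2\max_{n/2\le i\le n}\left(\sum_{j=0}^{i-1}(j+1)^{a}\right)\left(\sum_{j=i-1}^{n-1}(j+1)^{-a}\right).
\]

Taking $i = \lceil n/2\rceil$ yields the upper bound. For the lower bound, we write

\[
\hat C_n(a) = 2\max_{n/2\le i\le n}\left(\sum_{j=0}^{i-1}\left(1-\frac{j}{i}\right)^{a}\right)\left(\sum_{j=0}^{n-i}\left(1-\frac{j}{i+j}\right)^{a}\right).
\]

For $i \ge n/2$, it is clear that

\[
\sum_{j=0}^{i-1}\left(1-\frac{j}{i}\right)^{a} \ge \sum_{j=0}^{i-1}\left(1-\frac{2j}{n}\right)^{a} \ge \frac{1}{2}\sum_{j=0}^{n-1}\left(1-\frac{j}{n}\right)^{a}.
\]

Observe that, for $a > 0$,

\[
\frac{C_{i,n}'(a)}{2} \le \sum_{j=0}^{n-i}\left(1-\frac{j}{i+j}\right)^{a} \le C_{i,n}'(a), \tag{5.2}
\]

where

\[
C_{i,n}'(a) = 1 +
\begin{cases}
\dfrac{i[(i/n)^{a-1}-1]}{1-a} & \text{if } a \ne 1,\\[4pt]
i\log\dfrac{n}{i} & \text{if } a = 1.
\end{cases}
\]

It is clear that, for $i \ge n/2$, $C_{i,n}'(a) \le 2C_{\lceil n/2\rceil,n}'(a)$ and this leads to

\[
\sum_{j=0}^{n-i}\left(1-\frac{j}{i+j}\right)^{a} \le 4\sum_{j=0}^{n-\lceil n/2\rceil}\left(1-\frac{j}{\lceil n/2\rceil+j}\right)^{a}.
\]

Summarizing all of the above gives the desired lower bound.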

Remark 5.1. Compared with [16, Theorem 9.5], the bounds for $\check\lambda_{n,a}$ given in Theorem 5.1 have a similar lower bound and an upper bound improved by a multiple of about $1/4$. For $\hat\lambda_{n,a}$, observe that

\[
\frac{C_i''(a)}{2} \le \sum_{j=0}^{i-1}\left(1-\frac{j}{i}\right)^{a} \le C_i''(a), \quad \text{where } C_i''(a) = 1 + \frac{i-i^{-a}}{1+a}.
\]

Recall the constant $C_{i,n}'(a)$ in the proof of Theorem 5.1. Note that

\[
\frac{n+a}{2(1+a)} \le C_{\lceil n/2\rceil}''(a) \le \frac{2(n+a)}{1+a},
\]

and, for $a > 0$, $a \ne 1$ and $n \ge 3$,

\[
C_{\lceil n/2\rceil,n}'(a) \le 1 + \frac{n+1}{2(1+a)}\sup_{a>0,\,a\ne 1}\frac{(2^{1-a}-1)(1+a)}{1-a} \le \frac{3(n+a)}{1+a},
\]

where the last inequality is obtained by considering the subcases $a < 2$ and $a \ge 2$. The above computation also applies for $a = 1$ and $n \in \{1,2\}$. In the same spirit, one can show that $C_{\lceil n/2\rceil,n}'(a) \ge \dfrac{n+a}{6(1+a)}$. This yields

\[
\frac{(n+a)^2}{6(1+a)^2} \le \hat C_n(a) \le \frac{12(n+a)^2}{(1+a)^2}, \quad \forall n \ge 1. \tag{5.3}
\]

Hence, we have ˆλn,a ≍(1 + a)2/(n + a)2. As a consequence of(5.1)and(5.2), we obtain that,

uniformly for a> 0, 1/ˇλn,a ≍na  1 + 1 n a + n 1 + a  (1 + v(n, a)) as n → ∞, wherev(n, 1) = log n and v(n, a) = (n1−a−1)/(1 − a) for a ̸= 1.

Remark 5.2. Note that the lower bound in Theorem 6.1 provides the correct order of the spectral gap for the chain $\check K_{n,a}$ uniformly in $a$ but not for $\hat K_{n,a}$. For instance, if $a$ grows with $n$, say $a = n$,