Stochastic Processes and their Applications 124 (2014) 848–882
Spectral computations for birth and death chains
Guan-Yu Chen a,∗, Laurent Saloff-Coste b
a Department of Applied Mathematics, National Chiao Tung University, Hsinchu 300, Taiwan
b Malott Hall, Department of Mathematics, Cornell University, Ithaca, NY 14853-4201, United States
Received 22 June 2012; received in revised form 18 September 2013; accepted 1 October 2013 Available online 11 October 2013
Abstract
We consider the spectrum of birth and death chains on an $n$-path. An iterative scheme is proposed to compute any eigenvalue with exponential convergence rate independent of $n$. This allows one to determine the whole spectrum in order $n^2$ elementary operations. Using the same idea, we also provide a lower bound on the spectral gap, which is of the correct order on some classes of examples.
© 2013 Elsevier B.V. All rights reserved.
MSC: 60J10; 60J27
Keywords: Birth and death chains; Spectrum
1. Introduction
Let $G = (V, E)$ be the undirected finite path with vertex set $V = \{1, 2, \ldots, n\}$ and edge set $E = \{\{i, i+1\} : i = 1, 2, \ldots, n-1\}$. Given two positive measures $\pi, \nu$ on $V, E$ with $\pi(V) = 1$, the Dirichlet form and variance associated with $\nu$ and $\pi$ are defined by
\[ \mathcal{E}_\nu(f, g) := \sum_{i=1}^{n-1} [f(i) - f(i+1)][g(i) - g(i+1)]\,\nu(i, i+1) \quad\text{and}\quad \mathrm{Var}_\pi(f) := \pi(f^2) - \pi(f)^2, \]
∗ Corresponding author. Tel.: +886 35712121.
where $f, g$ are functions on $V$. When convenient, we set $\nu(0, 1) = \nu(n, n+1) = 0$. The spectral gap of $G$ with respect to $\pi, \nu$ is defined as
\[ \lambda^G_{\pi,\nu} := \min\left\{ \frac{\mathcal{E}_\nu(f, f)}{\mathrm{Var}_\pi(f)} \,\middle|\, f \text{ is non-constant} \right\}. \]
Let $M^G_{\pi,\nu}$ be the matrix given by $M^G_{\pi,\nu}(i, j) = 0$ for $|i - j| > 1$ and
\[ M^G_{\pi,\nu}(i, j) = -\frac{\nu(i, j)}{\pi(i)}, \ \forall |i - j| = 1, \qquad M^G_{\pi,\nu}(i, i) = \frac{\nu(i-1, i) + \nu(i, i+1)}{\pi(i)}. \]
Obviously, $\lambda^G_{\pi,\nu}$ is the smallest non-zero eigenvalue of $M^G_{\pi,\nu}$.
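As a small numerical illustration of these definitions, the following Python sketch evaluates the Dirichlet form, the variance and the action of $M^G_{\pi,\nu}$ on a function. The names `dirichlet_form`, `variance` and `apply_M`, and the 0-based lists `pi` and `nu` (with `nu[i]` the weight of the edge between vertices $i+1$ and $i+2$), are assumptions of this example and not notation from the paper.

```python
from fractions import Fraction

def dirichlet_form(f, g, nu):
    """E_nu(f, g): sum over edges of [f(i)-f(i+1)][g(i)-g(i+1)] nu(i, i+1)."""
    return sum((f[i] - f[i + 1]) * (g[i] - g[i + 1]) * nu[i] for i in range(len(nu)))

def variance(f, pi):
    """Var_pi(f) = pi(f^2) - pi(f)^2."""
    mean = sum(w * x for w, x in zip(pi, f))
    return sum(w * x * x for w, x in zip(pi, f)) - mean ** 2

def apply_M(f, pi, nu):
    """Return the vector M f for the tridiagonal matrix M^G_{pi,nu}."""
    n = len(pi)
    out = []
    for i in range(n):
        left = nu[i - 1] * (f[i] - f[i - 1]) if i > 0 else 0       # nu(0, 1) = 0
        right = nu[i] * (f[i] - f[i + 1]) if i < n - 1 else 0      # nu(n, n+1) = 0
        out.append((left + right) / pi[i])
    return out

# Illustrative data (an assumption): uniform measures on the 3-point path.
pi = [Fraction(1, 3)] * 3
nu = [Fraction(1, 3)] * 2
f = [Fraction(-1), Fraction(0), Fraction(1)]
ratio = dirichlet_form(f, f, nu) / variance(f, pi)
```

With these uniform weights $M^G_{\pi,\nu}$ is the Laplacian of the 3-point path, whose eigenvalues are 0, 1 and 3, so the Rayleigh quotient of `f` coincides with the spectral gap.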
Undirected paths equipped with measures $\pi, \nu$ are closely related to birth and death chains. A birth and death chain on $\{0, 1, 2, \ldots, n\}$ with birth rate $p_i$, death rate $q_i$ and holding rate $r_i$ is a Markov chain with transition matrix $K$ given by
\[ K(i, i+1) = p_i, \quad K(i, i-1) = q_i, \quad K(i, i) = r_i, \quad \forall\, 0 \le i \le n, \tag{1.1} \]
where $p_i + q_i + r_i = 1$ and $p_n = q_0 = 0$. Under the assumption of irreducibility, that is, $p_i q_{i+1} > 0$ for $0 \le i < n$, $K$ has a unique stationary distribution $\pi$ given by $\pi(i) = c\,(p_0 \cdots p_{i-1})/(q_1 \cdots q_i)$, where $c$ is the positive constant such that $\sum_{i=0}^{n} \pi(i) = 1$. The smallest non-zero eigenvalue of $I - K$ is exactly the spectral gap of the path on $\{0, 1, \ldots, n\}$ with measures $\pi, \nu$, where $\nu(i, i+1) = \pi(i) p_i = \pi(i+1) q_{i+1}$ for $0 \le i < n$.
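The passage from a birth and death chain to the pair $(\pi, \nu)$ can be sketched in Python as follows. The function name `stationary_and_edge_weights` and the list-based encoding of the rates are assumptions made for this illustration; exact rational arithmetic is used so that the detailed balance identity can be checked without rounding.

```python
from fractions import Fraction

def stationary_and_edge_weights(p, q):
    """Given birth rates p[0..n] and death rates q[0..n] (with q[0] = p[n] = 0),
    return the stationary distribution pi and the edge weights
    nu[i] = pi(i) p_i for the edge {i, i+1}."""
    n = len(p) - 1
    weights = [Fraction(1)]
    for i in range(1, n + 1):
        # pi(i) is proportional to (p_0 ... p_{i-1}) / (q_1 ... q_i)
        weights.append(weights[-1] * p[i - 1] / q[i])
    total = sum(weights)
    pi = [w / total for w in weights]
    nu = [pi[i] * p[i] for i in range(n)]
    return pi, nu

# Illustrative rates (an assumption) on the state space {0, 1, 2}.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(0)]
q = [Fraction(0), Fraction(1, 3), Fraction(1, 2)]
pi, nu = stationary_and_edge_weights(p, q)
```

The returned weights satisfy $\nu(i, i+1) = \pi(i) p_i = \pi(i+1) q_{i+1}$, which is the reversibility of the chain.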
Note that if $\mathbf{1}$ is the constant function of value 1 and $\psi$ is a minimizer for $\lambda^G_{\pi,\nu}$, then $\psi - \pi(\psi)\mathbf{1}$ is an eigenvector of $M^G_{\pi,\nu}$. This implies that any minimizer $\psi$ for $\lambda^G_{\pi,\nu}$ satisfying $\pi(\psi) = 0$ satisfies the Euler–Lagrange equation,
\[ \lambda^G_{\pi,\nu}\,\pi(i)\psi(i) = [\psi(i) - \psi(i-1)]\nu(i-1, i) + [\psi(i) - \psi(i+1)]\nu(i, i+1), \tag{1.2} \]
for all $1 \le i \le n$. Assuming the connectedness of $G$ (i.e., the superdiagonal and subdiagonal entries of $M^G_{\pi,\nu}$ are positive), the rank of $M^G_{\pi,\nu} - \lambda I$ is at least $n - 1$. This implies that all eigenvalues of $M^G_{\pi,\nu}$ are simple. See Lemma A.3 for an illustration. Observe that, by (1.2), any non-trivial eigenvector of $M^G_{\pi,\nu}$ has mean 0 under $\pi$. This implies that all minimizers for the spectral gap are of the form $a\psi + b\mathbf{1}$, where $a, b$ are constants and $\psi$ is a nontrivial solution of (1.2). In 2009, Miclo obtained implicitly the following result.
Theorem 1.1 ([15, Proposition 1]). If $\psi$ is a minimizer for $\lambda^G_{\pi,\nu}$, then $\psi$ must be monotonic, that is, either $\psi(i) \le \psi(i+1)$ for all $1 \le i < n$ or $\psi(i) \ge \psi(i+1)$ for all $1 \le i < n$.
One aim of this paper is to provide a scheme to compute the spectrum of $M^G_{\pi,\nu}$, in particular, the spectral gap. Based on Miclo's observation, it is natural to consider the following algorithm.
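Choose two positive reals $\lambda_0, a$ in advance and set, for $k = 0, 1, \ldots$,
1. $\psi_k(1) = -a$,
2. $\psi_k(i+1) = \psi_k(i) + \dfrac{\{[\psi_k(i) - \psi_k(i-1)]\nu(i-1, i) - \lambda_k \pi(i)\psi_k(i)\}^+}{\nu(i, i+1)}$, for $1 \le i < n$, where $t^+ = \max\{t, 0\}$,
3. $\lambda_{k+1} = \dfrac{\mathcal{E}_\nu(\psi_k, \psi_k)}{\mathrm{Var}_\pi(\psi_k)}$.
(A1)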
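The three steps of (A1) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `truncated_profile`, `rayleigh` and `spectral_gap_a1`, the default values of `lam0`, `a` and `iterations`, and the 0-based lists `pi`, `nu` are all choices of this example. The variable `flux` stores $[\psi_k(i+1) - \psi_k(i)]\nu(i, i+1)$, which by step 2 equals the positive part of the previous flux minus $\lambda_k \pi(i)\psi_k(i)$.

```python
import math

def truncated_profile(lam, pi, nu, a=1.0):
    """Steps 1-2 of (A1): psi(1) = -a, then the recursion with t^+ = max(t, 0)."""
    psi = [-a]
    flux = 0.0  # [psi(i) - psi(i-1)] nu(i-1, i); zero at i = 1 because nu(0, 1) = 0
    for i in range(len(nu)):
        flux = max(flux - lam * pi[i] * psi[i], 0.0)
        psi.append(psi[i] + flux / nu[i])
    return psi

def rayleigh(psi, pi, nu):
    """Step 3 of (A1): E_nu(psi, psi) / Var_pi(psi)."""
    energy = sum((psi[i] - psi[i + 1]) ** 2 * nu[i] for i in range(len(nu)))
    mean = sum(w * x for w, x in zip(pi, psi))
    var = sum(w * x * x for w, x in zip(pi, psi)) - mean ** 2
    return energy / var

def spectral_gap_a1(pi, nu, lam0=1.0, a=1.0, iterations=100):
    """Iterate lambda_{k+1} = E_nu(psi_k, psi_k) / Var_pi(psi_k)."""
    lam = lam0
    for _ in range(iterations):
        lam = rayleigh(truncated_profile(lam, pi, nu, a), pi, nu)
    return lam

# Illustrative check (an assumption): uniform measures on the 5-point path,
# where M is the path Laplacian with eigenvalues 2 - 2 cos(k pi / 5).
n = 5
pi = [1.0 / n] * n
nu = [1.0 / n] * (n - 1)
estimate = spectral_gap_a1(pi, nu)
exact = 2.0 - 2.0 * math.cos(math.pi / n)
```

Each iteration costs order $n$ operations, since both the recursion and the Rayleigh quotient are single passes over the path.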
Theorem 1.2 (Convergence to the Exact Value). Referring to (A1), if $n = 2$, then $\lambda_k = \lambda^G_{\pi,\nu}$ for all $k \ge 1$. If $n \ge 3$, then the sequence $(\lambda_k, \psi_k)$ satisfies
(1) If $\lambda_0 = \lambda^G_{\pi,\nu}$, then $\lambda_k = \lambda^G_{\pi,\nu}$ for all $k \ge 0$.
(2) If $\lambda_0 \ne \lambda^G_{\pi,\nu}$, then $\lambda_k > \lambda_{k+1} > \lambda^G_{\pi,\nu}$ for $k \ge 1$.
(3) Set $(\lambda^*, \psi^*) = \lim_{k\to\infty}(\lambda_k, \psi_k)$. Then, $\lambda^* = \mathcal{E}_\nu(\psi^*, \psi^*)/\mathrm{Var}_\pi(\psi^*) = \lambda^G_{\pi,\nu}$ and $\pi(\psi^*) = 0$.
Theorem 1.3 (Rate of Convergence). Referring to Theorem 1.2, there is a constant $\sigma \in (0, 1)$ independent of the choice of $(\lambda_0, a)$ such that $0 \le \lambda_k - \lambda^G_{\pi,\nu} \le \sigma^{k-1}\lambda_1$ for all $k \ge 1$.
By Theorem 1.3, we know that the sequence $\lambda_k$ generated in (A1) converges to the spectral gap exponentially but the rate $(-\log \sigma)$ is undetermined. The following alternative scheme is based on using more information on the spectral gap and will provide convergence at a constant rate.
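Choose $a > 0$, $L_0 < \lambda^G_{\pi,\nu} < U_0$ in advance and set, for $k = 0, 1, \ldots$,
1. $\psi_k(1) = -a$, $\lambda_k = \frac{1}{2}(L_k + U_k)$,
2. $\psi_k(i+1) = \psi_k(i) + \dfrac{\{[\psi_k(i) - \psi_k(i-1)]\nu(i-1, i) - \lambda_k \pi(i)\psi_k(i)\}^+}{\nu(i, i+1)}$, for $1 \le i < n$, where $t^+ = \max\{t, 0\}$,
3. \[ \begin{cases} L_{k+1} = L_k,\ U_{k+1} = \lambda_k & \text{if } \pi(\psi_k) > 0 \\ L_{k+1} = \lambda_k,\ U_{k+1} = U_k & \text{if } \pi(\psi_k) < 0 \\ L_{k+1} = U_{k+1} = \lambda_k & \text{if } \pi(\psi_k) = 0. \end{cases} \]
(A2)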
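The bisection in (A2) can be sketched as follows. As before, this is an illustration under assumptions rather than the authors' code: the names `truncated_profile` and `spectral_gap_a2`, the 0-based lists `pi`, `nu`, and the default number of iterations are choices made here. The caller is responsible for supplying a valid bracket $L_0 < \lambda^G_{\pi,\nu} < U_0$.

```python
import math

def truncated_profile(lam, pi, nu, a=1.0):
    """Steps 1-2 of (A2): psi(1) = -a, then the recursion with t^+ = max(t, 0)."""
    psi = [-a]
    flux = 0.0  # [psi(i) - psi(i-1)] nu(i-1, i); zero at i = 1 because nu(0, 1) = 0
    for i in range(len(nu)):
        flux = max(flux - lam * pi[i] * psi[i], 0.0)
        psi.append(psi[i] + flux / nu[i])
    return psi

def spectral_gap_a2(pi, nu, lower, upper, a=1.0, iterations=60):
    """Step 3 of (A2): halve the bracket [lower, upper] using the sign of pi(psi_k)."""
    for _ in range(iterations):
        lam = 0.5 * (lower + upper)
        psi = truncated_profile(lam, pi, nu, a)
        mean = sum(w * x for w, x in zip(pi, psi))
        if mean > 0:
            upper = lam          # lam lies above the spectral gap
        elif mean < 0:
            lower = lam          # lam lies below the spectral gap
        else:
            lower = upper = lam  # lam is the spectral gap
    return lower, upper

# Illustrative check (an assumption): uniform measures on the 5-point path.
n = 5
pi = [1.0 / n] * n
nu = [1.0 / n] * (n - 1)
# nu(1,2) / [pi(1)(1 - pi(1))] is the Rayleigh quotient of an indicator-type
# function, hence an upper bound on the gap.
upper0 = nu[0] / (pi[0] * (1.0 - pi[0]))
low, high = spectral_gap_a2(pi, nu, 0.0, upper0)
exact = 2.0 - 2.0 * math.cos(math.pi / n)
```

Unlike (A1), this variant only needs the sign of $\pi(\psi_k)$, so no Rayleigh quotient is evaluated.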
Theorem 1.4 (Dichotomy Method). Referring to (A2), it holds true that
\[ 0 \le \max\{U_k - \lambda^G_{\pi,\nu},\ \lambda^G_{\pi,\nu} - L_k\} \le (U_0 - L_0)2^{-k}, \quad \forall k \ge 0. \]
In Theorem 1.4, the convergence to the spectral gap is exponentially fast with explicit rate, $\log 2$. See Remark 2.2 for a discussion on the choice of $L_0$ and $U_0$. For higher order spectra, Miclo has a detailed description of the shape of eigenvectors in [14] and this will motivate the definition of similar algorithms for every eigenvalue in the spectrum. See (Di) and Theorem 3.4 for a generalization of (A2) and Theorem 3.14 for a localized version of Theorem 1.3.
The spectral gap is an important parameter in the quantitative analysis of Markov chains. The cutoff phenomenon, a sharp phase transition phenomenon for Markov chains, was introduced by Aldous and Diaconis in the early 1980s. It is of interest in many applications. A heuristic conjecture proposed by Peres in 2004 says that the cutoff exists if and only if the product of the spectral gap and the mixing time tends to infinity. Assuming reversibility, this has been proved to hold for $L^p$-convergence with $1 < p \le \infty$ in [2]. For the $L^1$-convergence, Ding et al. [10] prove this conjecture for continuous time birth and death chains. In order to use Peres' conjecture in practice, the orders of magnitude of the spectral gap and the mixing time are required. The second aspect of this paper is to derive a theoretical lower bound on the spectral gap using only the birth and death rates. This lower bound is obtained using the same idea used to analyze the above algorithm. For estimates on the mixing time of birth and death chains, we refer the readers to the recent work [4] by Chen and Saloff-Coste. For illustration, we consider several examples of specific interest and show that the lower bound provided here is in fact of the correct order in these examples.
This article is organized as follows. In Section 2, the algorithms in (A1)–(A2) are explored and proofs for Theorems 1.2–1.4 are given. In Section 3, the spectrum of $M^G_{\pi,\nu}$ is discussed further and, based on Miclo's work [14], algorithm (A2) is generalized to any specified eigenvalue of $M^G_{\pi,\nu}$. Our method is applicable for paths of infinite length (one-sided) and this is described in Section 4. For illustration, we consider some Metropolis chains and display numerical results of algorithm (A2) in Section 5. In Section 6, we focus on uniform measures with bottlenecks and determine the correct order of the spectral gap using the theory in Sections 2 and 3. It is worthwhile to remark that the assumptions in Section 6 can be relaxed using the comparison technique in [7,8]. As the work in this paper can also be regarded as a stochastic counterpart of the theory of finite Jacobi matrices, we would like to refer the readers to [18,19] for a complementary perspective.
2. Convergence to the spectral gap
This section is devoted to proving Theorems 1.2–1.4. First, we prove Theorem 1.1 in the following form.
Lemma 2.1. Let $\lambda > 0$ and $\psi$ be a non-constant function on $V$. Suppose $(\lambda, \psi)$ solves (1.2) and $\psi$ is monotonic. Then, $\psi$ is strictly monotonic, that is, either $\psi(i) < \psi(i+1)$ for $1 \le i < n$ or $\psi(i) > \psi(i+1)$ for $1 \le i < n$.
Proof. Obviously, (1.2) implies that $\pi(\psi) = 0$. Without loss of generality, it suffices to consider the case when $\psi(1) < 0$ and $\psi(n) > 0$. Since $\psi$ is non-constant and $\lambda^G_{\pi,\nu} > 0$, we have $\psi(1) < \psi(2)$ and $\psi(n-1) < \psi(n)$. Note that if there are $1 < i < j < n$ such that $\psi(i-1) < \psi(i)$, $\psi(j) < \psi(j+1)$ and $\psi(k) = \psi(i) = \psi(j)$ for $i \le k \le j$, then (1.2) yields
\[ \lambda^G_{\pi,\nu}\,\pi(i)\psi(i) = [\psi(i) - \psi(i-1)]\nu(i-1, i) + [\psi(i) - \psi(i+1)]\nu(i, i+1) > 0 \]
and
\[ \lambda^G_{\pi,\nu}\,\pi(j)\psi(j) = [\psi(j) - \psi(j-1)]\nu(j-1, j) + [\psi(j) - \psi(j+1)]\nu(j, j+1) < 0, \]
a contradiction. Thus, $\psi$ is strictly increasing.
We note the following corollary.
Corollary 2.2. Let $(\lambda, \psi)$ be a pair satisfying (1.2). Then, $\lambda = \lambda^G_{\pi,\nu}$ if and only if $\psi$ is monotonic.
Proof. One direction is obvious from Theorem 1.1. For the other direction, assume that $\psi$ is monotonic and let $\phi$ be a minimizer for $\lambda^G_{\pi,\nu}$ with $\pi(\phi) = 0$. Since $(\lambda, \psi)$ and $(\lambda^G_{\pi,\nu}, \phi)$ are solutions to (1.2), one has
\[ \lambda\,\pi(\psi\phi) = \mathcal{E}_\nu(\psi, \phi) = \lambda^G_{\pi,\nu}\,\pi(\phi\psi). \]
By Lemma 2.1, $\psi$ and $\phi$ are strictly monotonic and this implies $\mathcal{E}_\nu(\psi, \phi) \ne 0$. As a consequence of the above equations, we have $\lambda = \lambda^G_{\pi,\nu}$.
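Proposition 2.3. Suppose that $(\lambda, \psi)$ satisfies $\lambda > 0$, $\psi(1) < 0$ and, for $1 \le i < n$,
\[ \psi(i+1) = \psi(i) + \frac{\{[\psi(i) - \psi(i-1)]\nu(i-1, i) - \lambda\pi(i)\psi(i)\}^+}{\nu(i, i+1)}, \tag{2.1} \]
where $t^+ = \max\{t, 0\}$. Then, the following are equivalent.
(1) $\mathcal{E}_\nu(\psi, \psi) = \lambda\,\mathrm{Var}_\pi(\psi)$.
(2) $\pi(\psi) = 0$.
(3) $\lambda = \lambda^G_{\pi,\nu}$.
Furthermore, if $n \ge 3$, then any of the above is equivalent to
(4) $\mathcal{E}_\nu(\psi, \psi) = \lambda^G_{\pi,\nu}\,\mathrm{Var}_\pi(\psi)$.
Remark 2.1. For $n = 2$, it is an easy exercise to show that $\lambda^G_{\pi,\nu} = \nu(1, 2)/(\pi(1)\pi(2))$. By following the formula in (2.1), one has $\psi(2) = \psi(1)[1 - \lambda\pi(1)/\nu(1, 2)]$, which leads to $\mathcal{E}_\nu(\psi, \psi)/\mathrm{Var}_\pi(\psi) = \lambda^G_{\pi,\nu}$.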
Proof of Proposition 2.3. Set $B = \{1 \le i \le n \mid \psi(i) = \psi(n)\}$ and $B^c = \{1, 2, \ldots, i_0\}$. Since $\psi(1) < 0$ and $\lambda > 0$, $\psi(1) < \psi(2)$ and $B^c$ is nonempty. According to (2.1), $\psi$ is non-decreasing. Note that if $\psi(i) = \psi(i+1)$, then $\psi(i) \ge 0$ and $\psi(i+2) = \psi(i+1)$. This implies $\psi$ is strictly increasing on $\{1, 2, \ldots, i_0 + 1\}$ and, for $1 \le i \le i_0$,
\[ \lambda\pi(i)\psi(i) = [\psi(i) - \psi(i+1)]\nu(i, i+1) + [\psi(i) - \psi(i-1)]\nu(i-1, i). \]
Multiplying $\psi(i)$ on both sides and summing over all $i$ in $B^c$ yields
\begin{align*}
\lambda \sum_{i=1}^{i_0} \psi(i)^2\pi(i) &= \sum_{i=1}^{i_0 - 1} [\psi(i) - \psi(i+1)]^2\nu(i, i+1) + \psi(i_0)[\psi(i_0) - \psi(i_0+1)]\nu(i_0, i_0+1) \\
&= \mathcal{E}_\nu(\psi, \psi) + \psi(i_0+1)[\psi(i_0) - \psi(i_0+1)]\nu(i_0, i_0+1) \\
&= \mathcal{E}_\nu(\psi, \psi) + \lambda\psi(n) \sum_{i=1}^{i_0} \psi(i)\pi(i).
\end{align*}
This is equivalent to
\[ \mathcal{E}_\nu(\psi, \psi) = \lambda\,\mathrm{Var}_\pi(\psi) + \lambda\pi(\psi)[\pi(\psi) - \psi(n)], \tag{2.2} \]
which proves (1)⇔(2). If $\lambda = \lambda^G_{\pi,\nu}$, then $\psi$ is an eigenvector for $M^G_{\pi,\nu}$ associated to $\lambda^G_{\pi,\nu}$. This proves (3)⇒(2).
For (2)⇒(3), assume that $\pi(\psi) = 0$. In this case, $\psi$ must be strictly increasing. Otherwise, $\psi(i) = \psi(n) > 0$ for $i \in B$ and, according to (2.1), this implies
\[ \lambda\,\mathrm{Var}_\pi(\psi) > \lambda \sum_{i=1}^{n-1} \pi(i)\psi^2(i) \ge \sum_{i=1}^{n-1} [\psi(i) - \psi(i+1)]^2\nu(i, i+1) = \mathcal{E}_\nu(\psi, \psi), \]
which contradicts (1). As $\psi$ is strictly increasing and $\pi(\psi) = 0$, $(\lambda, \psi)$ solves (1.2). By Corollary 2.2, $\lambda = \lambda^G_{\pi,\nu}$.
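To finish the proof, it remains to show (4)⇒(3) ((3)⇒(4) is obvious from the equivalence among (1), (2) and (3)). Assume that $\mathcal{E}_\nu(\psi, \psi) = \lambda^G_{\pi,\nu}\,\mathrm{Var}_\pi(\psi)$. By Lemma 2.1, $\psi$ is strictly monotonic and this implies, for $1 \le i < n$,
\[ \lambda\pi(i)\psi(i) = [\psi(i) - \psi(i+1)]\nu(i, i+1) + [\psi(i) - \psi(i-1)]\nu(i-1, i). \]
As $\psi$ is a minimizer for $\lambda^G_{\pi,\nu}$, one has, for $1 \le i \le n$,
\[ \lambda^G_{\pi,\nu}\,\pi(i)[\psi(i) - \pi(\psi)] = [\psi(i) - \psi(i+1)]\nu(i, i+1) + [\psi(i) - \psi(i-1)]\nu(i-1, i). \]
If $\lambda \ne \lambda^G_{\pi,\nu}$, the comparison of both systems yields
\[ \psi(i) = \frac{\lambda^G_{\pi,\nu}\,\pi(\psi)}{\lambda^G_{\pi,\nu} - \lambda}, \quad \forall\, 1 \le i < n. \]
As $n \ge 3$, $\psi(1) = \psi(2)$, a contradiction! This forces $\lambda = \lambda^G_{\pi,\nu}$, as desired.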
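The following is a simple corollary of Proposition 2.3, which plays an important role in proving Theorem 1.4.
Corollary 2.4. Let $n \ge 3$. For $\lambda > 0$, let $\phi_\lambda$ be the vector generated by (2.1) with $\phi_\lambda(1) < 0$. Then, $(\lambda - \lambda^G_{\pi,\nu})\pi(\phi_\lambda) > 0$ for $\lambda > 0$ and $\lambda \ne \lambda^G_{\pi,\nu}$.
Proof. Without loss of generality, we fix $\phi_\lambda(1) = -1$ for all $\lambda > 0$. Set $T(\lambda) = \pi(\phi_\lambda)$. To prove this corollary, it suffices to show that
\[ T(\lambda) \begin{cases} < 0 & \text{if } \lambda < \lambda^G_{\pi,\nu} \\ > 0 & \text{if } \lambda > \lambda^G_{\pi,\nu}. \end{cases} \]
For $\lambda > 0$, define $L(\lambda) := \mathcal{E}_\nu(\phi_\lambda, \phi_\lambda)/\mathrm{Var}_\pi(\phi_\lambda)$. By (2.2), one has
\[ L(\lambda) - \lambda = \frac{\lambda T(\lambda)[\pi(\phi_\lambda) - \phi_\lambda(n)]}{\mathrm{Var}_\pi(\phi_\lambda)}. \tag{2.3} \]
Since $\phi_\lambda$ is non-constant, $\pi(\phi_\lambda) < \phi_\lambda(n)$. As $L(\lambda) \ge \lambda^G_{\pi,\nu} > \lambda$ for $\lambda \in (0, \lambda^G_{\pi,\nu})$, this implies $T(\lambda) < 0$ for $\lambda \in (0, \lambda^G_{\pi,\nu})$.
For $\lambda > \lambda^G_{\pi,\nu}$, set $I = (\lambda^G_{\pi,\nu}, \infty)$. By Proposition 2.3, $T(\lambda) = 0$ if and only if $\lambda = \lambda^G_{\pi,\nu}$. By the continuity of $T$, this implies either $T(I) \subset (-\infty, 0)$ or $T(I) \subset (0, \infty)$. In the case $T(I) \subset (-\infty, 0)$, one has $L(\lambda) > \lambda$ for $\lambda \in I$. As $L(I)$ is bounded, $L^k(\lambda)$ is convergent with limit $\tilde\lambda > \lambda^G_{\pi,\nu}$ and this yields
\[ 0 = \lim_{k\to\infty} [L^{k+1}(\lambda) - L^k(\lambda)] = \frac{\tilde\lambda\, T(\tilde\lambda)[\pi(\phi_{\tilde\lambda}) - \phi_{\tilde\lambda}(n)]}{\mathrm{Var}_\pi(\phi_{\tilde\lambda})} > 0, \]
a contradiction. Hence, $T(\lambda) > 0$ for $\lambda > \lambda^G_{\pi,\nu}$.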
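Proof of Theorem 1.2. The proof for $n = 2$ is obvious from a direct computation and we deal with the case $n \ge 3$ here. By the equivalence of Proposition 2.3(3)–(4), if $\lambda_0 = \lambda^G_{\pi,\nu}$, then $\lambda_k = \lambda^G_{\pi,\nu}$ for all $k \ge 1$. If $\lambda_0 \ne \lambda^G_{\pi,\nu}$, then $\lambda_k > \lambda^G_{\pi,\nu}$ for $k \ge 1$. Note that $(\lambda_k, \psi_k)$ solves the system in (2.1). By (2.2), this implies
\[ \lambda_{k+1} - \lambda_k = \frac{\lambda_k\pi(\psi_k)[\pi(\psi_k) - \psi_k(n)]}{\mathrm{Var}_\pi(\psi_k)}, \quad \forall k \ge 0. \]
The strict monotonicity of $\lambda_k$ in (2) comes immediately from Corollary 2.4. In (3), the continuity of (2.1) in $\lambda$ implies that $(\lambda^*, \psi^*)$ is a solution to (2.1) and $\mathcal{E}_\nu(\psi^*, \psi^*) = \lambda^*\mathrm{Var}_\pi(\psi^*)$. By Proposition 2.3, $\lambda^* = \lambda^G_{\pi,\nu}$ and $\pi(\psi^*) = 0$, as desired.
Proof of Theorem 1.3. Recall the notation in the proof of Corollary 2.4: for $\lambda > 0$, let $\phi_\lambda$ be the function defined by (2.1) and $L(\lambda) = \mathcal{E}_\nu(\phi_\lambda, \phi_\lambda)/\mathrm{Var}_\pi(\phi_\lambda)$. By (2.2) and Corollary 2.4, $L(\lambda) \in (\lambda^G_{\pi,\nu}, \lambda)$ for $\lambda > \lambda^G_{\pi,\nu}$. As $L$ is bounded, Theorem 1.3 follows from Lemma A.1.
Proof of Theorem 1.4. Immediate from Corollary 2.4.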
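At the end of this section, we use the following proposition to find how the shape of the function $\psi$ in (2.1) evolves with $\lambda$. In Proposition 2.5, we set $\phi_\lambda = \psi$ when $\psi$ is given by (2.1). It is easy to see from (2.1) that $\phi_\lambda$ is strictly increasing before some index, say $i_0 = i_0(\lambda)$, and then stays constant equal to $\phi_\lambda(i_0)$ after $i_0$. The proposition shows how the index $i_0(\lambda)$ evolves.
Proposition 2.5. For $\lambda > 0$, let $\phi_\lambda$ be the function generated by (2.1) with $\phi_\lambda(1) = -1$ and, for $1 \le i \le n$, set $T_i(\lambda) = \sum_{j=1}^{i} \phi_\lambda(j)\pi(j)$. For $1 \le i < n$, let
\[ a_i(\lambda) = 1 + \frac{\pi(i+1)}{\pi(i)} - \frac{\lambda\pi(i+1)}{\nu(i, i+1)}, \]
\[ A_i(\lambda) = \begin{pmatrix} a_1(\lambda) & 1 & 0 & \cdots & 0 \\ \frac{\pi(3)}{\pi(2)} & a_2(\lambda) & 1 & \ddots & \vdots \\ 0 & \frac{\pi(4)}{\pi(3)} & a_3(\lambda) & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & 1 \\ 0 & \cdots & 0 & \frac{\pi(i+1)}{\pi(i)} & a_i(\lambda) \end{pmatrix}, \tag{2.4} \]
and let $\lambda^{(i)}$ be the smallest root of $\det A_i(\lambda) = 0$. Then,
(1) $\lambda^G_{\pi,\nu} = \lambda^{(n-1)} < \lambda^{(n-2)} < \cdots < \lambda^{(1)}$.
(2) $\phi_\lambda(i) < \phi_\lambda(i+1) = \phi_\lambda(i+2)$ for $\lambda \in [\lambda^{(i)}, \lambda^{(i-1)})$ and $1 \le i \le n - 2$, where $\lambda^{(0)} := \infty$.
(3) $\phi_\lambda(n-1) < \phi_\lambda(n)$ for $\lambda \in (0, \lambda^{(n-2)})$.
In particular, $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ for $\lambda \in (0, \lambda^{(i-1)})$ and $(\lambda - \lambda^{(i)})T_{i+1}(\lambda) > 0$ for $\lambda \in (0, \lambda^{(i)}) \cup (\lambda^{(i)}, \infty)$ with $1 \le i \le n - 1$.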
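Proof. By Lemma A.2, $\lambda^{(1)} > \lambda^{(2)} > \cdots > \lambda^{(n-1)} > 0$ and, for $1 \le i \le n - 1$,
\[ \det A_i(\lambda) \begin{cases} > 0 & \forall \lambda \in (-\infty, \lambda^{(i)}) \\ < 0 & \forall \lambda \in (\lambda^{(i)}, \lambda^{(i-1)}), \end{cases} \tag{2.5} \]
where $\lambda^{(0)} = \infty$. Note that if $T_i(\lambda) < 0$ for some $1 \le i \le n - 1$, then
\[ \phi_\lambda(j+1) = \phi_\lambda(j) + \frac{[\phi_\lambda(j) - \phi_\lambda(j-1)]\nu(j-1, j) - \lambda\pi(j)\phi_\lambda(j)}{\nu(j, j+1)}, \quad \forall\, 1 \le j \le i. \]
This implies
\[ \phi_\lambda(\ell+1) = \phi_\lambda(\ell) - \frac{\lambda}{\nu(\ell, \ell+1)} \sum_{j=1}^{\ell} \pi(j)\phi_\lambda(j), \quad \forall\, 1 \le \ell \le i. \tag{2.6} \]
Multiplying $\pi(\ell+1)$ and adding up $T_\ell(\lambda)$ yields
\[ T_{\ell+1}(\lambda) = a_\ell(\lambda)T_\ell(\lambda) - \frac{\pi(\ell+1)}{\pi(\ell)}\,T_{\ell-1}(\lambda), \]
with the convention $T_0(\lambda) := 0$. From the above discussion, we conclude that if $T_i(\lambda) < 0$, then
\[ T_{\ell+1}(\lambda) = -\pi(1)\det A_\ell(\lambda), \quad \forall\, 1 \le \ell \le i. \tag{2.7} \]
When $\ell = i - 1$, (2.5) implies $\det A_{i-1}(\lambda) > 0$ for $\lambda < \lambda^{(i-1)}$. By the continuity of $T_i$ and $\det A_{i-1}$, if there is some $\lambda < \lambda^{(i-1)}$ such that $T_i(\lambda) < 0$, then $T_i(\lambda) = -\pi(1)\det A_{i-1}(\lambda)$ for $\lambda < \lambda^{(i-1)}$. As a consequence of (2.7) with $\ell = i$, this will imply $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ for $\lambda < \lambda^{(i-1)}$. Hence, it remains to show that $T_i(\lambda) < 0$ for some $\lambda < \lambda^{(i-1)}$. To see this, according to Corollary 2.4, one can choose a constant $\lambda < \min\{\lambda^G_{\pi,\nu}, \lambda^{(i-1)}\}$ such that $T_{n-1}(\lambda) < 0$. Since $\phi_\lambda(i)$ is non-decreasing in $i$, we obtain $T_i(\lambda) < 0$, as desired. This proves $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ for $\lambda < \lambda^{(i-1)}$. In particular, $T_n(\lambda) = -\pi(1)\det A_{n-1}(\lambda)$ for $\lambda < \lambda^{(n-2)}$. By Corollary 2.4, we have $\lambda^{(n-1)} = \lambda^G_{\pi,\nu}$. This proves Proposition 2.5(1).
Next, observe that, for $\lambda \in (\lambda^{(i)}, \lambda^{(i-1)})$,
\[ \sum_{j=1}^{i+1} \pi(j)\phi_\lambda(j) = T_{i+1}(\lambda) > 0, \qquad \sum_{j=1}^{i} \pi(j)\phi_\lambda(j) = T_i(\lambda) < 0. \]
By (2.6), it is easy to see that $[\phi_\lambda(i+1) - \phi_\lambda(i)]\nu(i, i+1) = -\lambda T_i(\lambda) > 0$ and
\begin{align*}
[\phi_\lambda(i+2) - \phi_\lambda(i+1)]\nu(i+1, i+2) &= \{[\phi_\lambda(i+1) - \phi_\lambda(i)]\nu(i, i+1) - \lambda\pi(i+1)\phi_\lambda(i+1)\}^+ \\
&= \{-\lambda T_{i+1}(\lambda)\}^+ = 0.
\end{align*}
This proves Proposition 2.5(2). To prove Proposition 2.5(3), we use (1) to derive
\[ T_{n-1}(\lambda) = -\pi(1)\det A_{n-2}(\lambda) < 0, \quad \forall \lambda \in (0, \lambda^{(n-2)}). \]
Using (2.6), this implies $\phi_\lambda(n-1) < \phi_\lambda(n)$. The last part of Proposition 2.5 follows easily from (2.5) and the fact that
\[ T_i(\lambda) \ge 0 \Rightarrow T_{i+1}(\lambda) > 0 \quad\text{and}\quad T_i(\lambda) \le 0 \Rightarrow T_{i-1}(\lambda) < 0. \]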
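Remark 2.2. In Proposition 2.5, if $\lambda > \lambda^{(1)} = \nu(1, 2)[\pi(1)^{-1} + \pi(2)^{-1}]$, then $\phi_\lambda(i) = \phi_\lambda(2)$ for $i = 2, \ldots, n$. Note that, for $\lambda \ge \lambda^{(1)}$, $\phi_\lambda(2) = -1 + \lambda\pi(1)/\nu(1, 2)$ and
\[ \pi(\phi_\lambda) = -1 + \frac{\lambda\pi(1)(1 - \pi(1))}{\nu(1, 2)}, \qquad \mathrm{Var}_\pi(\phi_\lambda) = \frac{\lambda^2\pi(1)^3(1 - \pi(1))}{\nu(1, 2)^2}. \]
By (2.3), this leads to $L(\lambda) = \nu(1, 2)/[\pi(1)(1 - \pi(1))]$ for $\lambda \ge \lambda^{(1)}$. In the case $n = 2$, it is clear that $\nu(1, 2)/[\pi(1)(1 - \pi(1))] = \nu(1, 2)[\pi(1)^{-1} + \pi(2)^{-1}] = \lambda^G_{\pi,\nu}$.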
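The closed form in Remark 2.2 can be checked on a small example. The sketch below is illustrative only: the measures `pi`, `nu` are arbitrary test data chosen here, and the helper names `truncated_profile` and `rayleigh` are assumptions of this example. Exact rational arithmetic is used so that the flat profile and the value of $L(\lambda)$ can be compared without rounding.

```python
from fractions import Fraction

def truncated_profile(lam, pi, nu):
    """phi_lambda from (2.1) with phi_lambda(1) = -1 and t^+ = max(t, 0)."""
    phi = [Fraction(-1)]
    flux = Fraction(0)  # [phi(i) - phi(i-1)] nu(i-1, i), zero at i = 1
    for i in range(len(nu)):
        flux = max(flux - lam * pi[i] * phi[i], Fraction(0))
        phi.append(phi[i] + flux / nu[i])
    return phi

def rayleigh(f, pi, nu):
    """L(lambda) = E_nu(f, f) / Var_pi(f)."""
    energy = sum((f[i] - f[i + 1]) ** 2 * nu[i] for i in range(len(nu)))
    mean = sum(w * x for w, x in zip(pi, f))
    return energy / (sum(w * x * x for w, x in zip(pi, f)) - mean ** 2)

# Hypothetical test data on the 3-point path.
pi = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
nu = [Fraction(1, 8), Fraction(1, 8)]
lam_1 = nu[0] * (1 / pi[0] + 1 / pi[1])          # lambda^{(1)} of Remark 2.2
plateau = nu[0] / (pi[0] * (1 - pi[0]))          # claimed value of L for lambda >= lambda^{(1)}
values = [rayleigh(truncated_profile(lam, pi, nu), pi, nu)
          for lam in (lam_1, Fraction(1), Fraction(5))]
```

For every $\lambda \ge \lambda^{(1)}$ the profile is flat from vertex 2 onward, so the Rayleigh quotient no longer depends on $\lambda$; this makes $\nu(1, 2)/[\pi(1)(1 - \pi(1))]$ a natural candidate for the upper bound $U_0$ in (A2).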
3. Convergence to other eigenvalues
In this section, we generalize the algorithms (A1) and (A2) so that they can be applied to the computation of any specified eigenvalue.
3.1. Basic setup and fundamental results
Recall that $G$ is a graph with vertex set $V = \{1, 2, \ldots, n\}$ and edge set $E = \{\{i, i+1\} \mid i = 1, 2, \ldots, n-1\}$. Given two positive measures $\pi, \nu$ on $V, E$ with $\pi(V) = 1$, let $M^G_{\pi,\nu}$ be the $n$-by-$n$ matrix defined in the introduction and given by
\[ M^G_{\pi,\nu}(i, j) = \begin{cases} -\nu(i, j)/\pi(i) & \text{if } |i - j| = 1 \\ [\nu(i-1, i) + \nu(i, i+1)]/\pi(i) & \text{if } j = i \\ 0 & \text{if } |i - j| > 1. \end{cases} \tag{3.1} \]
Since $\nu$ is positive everywhere and $M^G_{\pi,\nu}$ is tridiagonal, all eigenvalues of $M^G_{\pi,\nu}$ have algebraic multiplicity 1. Throughout this section, let $\{\lambda^G_0 < \lambda^G_1 < \cdots < \lambda^G_{n-1}\}$ denote the eigenvalues of $M^G_{\pi,\nu}$ with associated $L^2(\pi)$-normalized eigenvectors $\zeta_0 = \mathbf{1}, \zeta_1, \ldots, \zeta_{n-1}$. Clearly, $\lambda^G_0 = 0$, $\lambda^G_1 = \lambda^G_{\pi,\nu}$ and, for $1 \le k \le n$,
\[ \lambda^G_i\,\zeta_i(k)\pi(k) = [\zeta_i(k) - \zeta_i(k-1)]\nu(k-1, k) + [\zeta_i(k) - \zeta_i(k+1)]\nu(k, k+1). \tag{3.2} \]
Let $1 \le i \le n - 1$. As $\zeta_i$ is non-constant, it is clear that $\zeta_i(1) \ne \zeta_i(2)$ and $\zeta_i(n-1) \ne \zeta_i(n)$. Moreover, if $\zeta_i(k) = \zeta_i(k+1)$ for some $1 < k < n$, then $\zeta_i(k) \ne \zeta_i(k-1)$ and $\zeta_i(k+1) \ne \zeta_i(k+2)$. Gantmacher and Krein [13] showed that there are exactly $i$ sign changes for $\zeta_i$ with $1 \le i \le n$. Miclo [14] gives a detailed description of the shape of $\zeta_i$ as follows.
Theorem 3.1. For $1 \le i \le n - 1$, let $\zeta_i$ be an eigenvector associated to the $i$th smallest non-zero eigenvalue of the matrix in (3.1) with $\zeta_i(1) < 0$. Then, there are $1 = a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_i < b_i = n$ with $a_{j+1} - b_j \in \{0, 1\}$ such that $\zeta_i$ is strictly increasing on $[a_j, b_j]$ for odd $j$ and is strictly decreasing on $[a_j, b_j]$ for even $j$, and $\zeta_i(a_{j+1}) = \zeta_i(b_j)$ for $1 \le j < i$.
In the following, we make some analysis related to the Euler–Lagrange equations in (3.2).
Definition 3.1. Fix $n \ge 1$ and let $f$ be a function on $\{1, 2, \ldots, n\}$. For $1 \le i \le n - 1$, $f$ is called "Type $i$" if there are $1 = a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_i < b_i \le n$ satisfying $a_{j+1} - b_j \in \{0, 1\}$ such that
(1) $f$ is strictly monotonic on $[a_j, b_j]$ for $1 \le j \le i$.
(2) $[f(a_j) - f(a_j + 1)][f(a_{j+1}) - f(a_{j+1} + 1)] < 0$ for $1 \le j < i$.
(3) $f(a_{j+1}) = f(b_j)$, for $1 \le j < i$, and $f(k) = f(b_i)$, for $b_i \le k \le n$.
The points $a_j, b_j$ will be called "peak-valley points" in this paper.
Remark 3.1. Note that the difference between Definition 3.1 and Theorem 3.1 is the requirement $b_i \le n$, instead of $b_i = n$. By Theorem 3.1, any eigenvector associated to the $i$th smallest non-zero eigenvalue of the matrix in (3.1) must be of type $i$ with $b_i = n$.
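Definition 3.2. Let $\pi, \nu$ be positive measures on $V, E$ with $\pi(V) = 1$. For $\lambda \in \mathbb{R}$, let $\xi_\lambda$ be a function on $\{1, 2, \ldots, n\}$ defined by $\xi_\lambda(1) = -1$ and, for $1 \le k < n$,
\[ \xi_\lambda(k+1) = \xi_\lambda(k) + \frac{[\xi_\lambda(k) - \xi_\lambda(k-1)]\nu(k-1, k) - \lambda\pi(k)\xi_\lambda(k)}{\nu(k, k+1)}. \]
Remark 3.2. Note that $\xi_0 = -\mathbf{1}$ and, for $\lambda < 0$, $\xi_\lambda$ is strictly decreasing and of type 1. For $\lambda > 0$, if $\xi_\lambda(k-1) < \xi_\lambda(k) = \xi_\lambda(k+1)$, then $\xi_\lambda(k) > 0$ and this implies $\xi_\lambda(k+2) < \xi_\lambda(k+1)$. Similarly, if $\xi_\lambda(k-1) > \xi_\lambda(k) = \xi_\lambda(k+1)$, then $\xi_\lambda(k) < 0$ and $\xi_\lambda(k+2) > \xi_\lambda(k+1)$. Thus, $\xi_\lambda$ must be of type $i$ for some $1 \le i \le n - 1$.
Lemma 3.2. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2. Suppose that $\xi_\lambda$ is of type $i$ with $1 \le i \le n - 1$.
(1) If $\xi_\lambda(n-1) \ne \xi_\lambda(n)$, then there is $\epsilon > 0$ such that $\xi_{\lambda+\delta}$ is of type $i$ for $-\epsilon < \delta < \epsilon$.
(2) If $\xi_\lambda(n-1) = \xi_\lambda(n)$, then there is $\epsilon > 0$ such that $\xi_{\lambda+\delta}$ is of type $i + 1$ and $\xi_{\lambda-\delta}$ is of type $i$ for $0 < \delta < \epsilon$.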
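Proof. Let $a_j, b_j$ be the peak-valley points of $\xi_\lambda$. By the continuity of $\xi_\lambda$ in $\lambda$ and Remark 3.2, one can choose $\epsilon > 0$ such that, for $\delta \in (-\epsilon, \epsilon)$, $\xi_{\lambda+\delta}$ remains strictly monotonic on $[a_j, b_j]$ for $j = 1, \ldots, i$ and
\[ [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(a_{j+1} + 1) - \xi_{\lambda+\delta}(a_{j+1})] > 0, \]
for $1 \le j < i$. In (1), $b_i = n$. Fix $\delta \in (-\epsilon, \epsilon)$ and set $a'_1 = a_1 = 1$, $b'_i = b_i = n$. For $1 < j < i$, set
\[ \begin{cases} b'_j = a'_{j+1} = b_j & \text{if } [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(b_j) - \xi_{\lambda+\delta}(a_{j+1})] < 0 \\ b'_j = a'_{j+1} = a_{j+1} & \text{if } [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(b_j) - \xi_{\lambda+\delta}(a_{j+1})] > 0 \\ b'_j = b_j,\ a'_{j+1} = a_{j+1} & \text{if } [\xi_{\lambda+\delta}(b_j - 1) - \xi_{\lambda+\delta}(b_j)][\xi_{\lambda+\delta}(b_j) - \xi_{\lambda+\delta}(a_{j+1})] = 0. \end{cases} \]
Clearly, $\xi_{\lambda+\delta}$ is of type $i$ with peak-valley points $a'_j, b'_j$. This proves Lemma 3.2(1).
For part (2), we consider $i \le n - 2$ and $b_i = n - 1$. By a similar argument as before, one can choose $\epsilon > 0$ such that the restriction of $\xi_{\lambda+\delta}$ to $\{1, 2, \ldots, n-1\}$ is of type $i$ for $\delta \in (-\epsilon, \epsilon)$. To finish the proof, it remains to compare $\xi_{\lambda+\delta}(n-1)$ and $\xi_{\lambda+\delta}(n)$. Recall that $T_j(\lambda) = \sum_{k=1}^{j} \xi_\lambda(k)\pi(k)$ as in the proof for Proposition 2.5. Using a similar reasoning as for (2.7), one shows that $T_{i+1}(\lambda) = -\pi(1)\det A_i(\lambda)$ for $1 \le i < n$, where $A_i(\lambda)$ is the matrix in (2.4). This implies that the non-zero eigenvalues of $M^G_{\pi,\nu}$, say $\lambda^G_1, \ldots, \lambda^G_{n-1}$, are the roots of $\det A_{n-1}(\lambda) = 0$. As a consequence of Lemma A.2, $\det A_{n-2}(\lambda) = 0$ has exactly $n - 2$ distinct roots, say $\alpha_1 < \alpha_2 < \cdots < \alpha_{n-2}$, and they satisfy the interlacing property $\lambda^G_j < \alpha_j < \lambda^G_{j+1}$ for $1 \le j \le n - 2$. Note that $\det A_{n-2}(\lambda)$ and $\det A_{n-1}(\lambda)$ tend to infinity as $-\lambda$ tends to infinity. This leads to the fact that if $\det A_{n-2}(\lambda) = 0$ and $\det A_{n-1}(\lambda) < 0$, then $\det A_{n-2}(\cdot)$ is strictly decreasing in a neighborhood of $\lambda$. If $\det A_{n-2}(\lambda) = 0$ and $\det A_{n-1}(\lambda) > 0$, then $\det A_{n-2}(\cdot)$ is strictly increasing in a neighborhood of $\lambda$.
Back to the proof of (2). Suppose that $\xi_\lambda(n-2) < \xi_\lambda(n-1)$. By Remark 3.2, it is easy to check that $T_{n-1}(\lambda) = 0$ and $T_n(\lambda) > 0$ or, equivalently, $\det A_{n-2}(\lambda) = 0$ and $\det A_{n-1}(\lambda) < 0$. According to the conclusion in the previous paragraph, we can find $\epsilon > 0$ such that $\det A_{n-2}(\cdot)$ is strictly decreasing on $(\lambda - \epsilon, \lambda + \epsilon)$, which yields
\[ \xi_{\lambda+\delta}(n) = \xi_{\lambda+\delta}(n-1) - \frac{(\lambda+\delta)T_{n-1}(\lambda+\delta)}{\nu(n-1, n)} \begin{cases} < \xi_{\lambda+\delta}(n-1) & \text{if } 0 < \delta < \epsilon \\ > \xi_{\lambda+\delta}(n-1) & \text{if } -\epsilon < \delta < 0. \end{cases} \]
This gives the desired property in Lemma 3.2(2). The other case, $\xi_\lambda(n-2) > \xi_\lambda(n-1)$, can be proved in the same way and we omit the details.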
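The following proposition characterizes the shape of $\xi_\lambda$ for $\lambda > 0$.
Proposition 3.3. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2. Let $\lambda^G_1 < \cdots < \lambda^G_{n-1}$ be the non-zero eigenvalues of $M^G_{\pi,\nu}$ in (3.1) and $\alpha_1 < \cdots < \alpha_{n-2}$ be the zeros of $\det A_{n-2}(\lambda)$, where $A_{n-2}(\cdot)$ is the matrix in (2.4). Then,
(1) $\lambda^G_j < \alpha_j < \lambda^G_{j+1}$, for $1 \le j \le n - 2$.
(2) $\xi_\lambda$ is of type $i$ for $\alpha_{i-1} < \lambda \le \alpha_i$ and $1 \le i \le n - 1$, where $\alpha_0 := 0$ and $\alpha_{n-1} := \infty$.
Proof. (1) is immediate from Lemma A.2. For (2), note that $\alpha_i$ is an eigenvalue of the submatrix of $M^G_{\pi,\nu}$ obtained by removing the $n$th row and column. This implies $\xi_{\alpha_i}(n-1) = \xi_{\alpha_i}(n)$ for $i = 1, \ldots, n - 2$ and $\xi_\lambda(n-1) \ne \xi_\lambda(n)$ for $\lambda > 0$ and $\lambda \notin \{\alpha_1, \ldots, \alpha_{n-2}\}$. By Lemma 3.2, $\xi_\lambda$ is of type $i$ for $\alpha_{i-1} < \lambda \le \alpha_i$.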
Given $\lambda > 0$, the above proposition provides a simple criterion to determine which of the intervals $(\alpha_j, \alpha_{j+1}]$ $\lambda$ belongs to, that is, the type of $\xi_\lambda$. However, knowing the type of $\xi_\lambda$ is not sufficient to determine whether $\lambda$ is bigger or smaller than $\lambda^G_i$. We need the following remark.
Remark 3.3. Using the same argument as the proof of Proposition 2.5, one can show that $\pi(\xi_\lambda) = -\pi(1)\det A_{n-1}(\lambda)$, where $A_{n-1}(\lambda)$ is the matrix in (2.4). Clearly, $\pi(\xi_\lambda)$ has zeros $\lambda^G_1, \ldots, \lambda^G_{n-1}$ and tends to minus infinity as $\lambda$ tends to minus infinity. This implies that $\pi(\xi_\lambda) < 0$, for $\lambda < \lambda^G_1$, and
\[ \pi(\xi_\lambda) > 0 \ \ \forall \lambda \in (\lambda^G_{2i-1}, \lambda^G_{2i}), \qquad \pi(\xi_\lambda) < 0 \ \ \forall \lambda \in (\lambda^G_{2i}, \lambda^G_{2i+1}), \]
for $i \ge 1$, where $\lambda^G_n := \infty$.
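The identity $\pi(\xi_\lambda) = -\pi(1)\det A_{n-1}(\lambda)$ of Remark 3.3 can be checked numerically. The sketch below is an illustration under assumptions: the names `xi_profile` and `det_A` and the test measures are choices of this example, and the determinant is evaluated through the standard three-term recurrence for tridiagonal matrices, $\det A_\ell = a_\ell \det A_{\ell-1} - \frac{\pi(\ell+1)}{\pi(\ell)}\det A_{\ell-2}$, with $\det A_0 := 1$ and $\det A_{-1} := 0$.

```python
from fractions import Fraction

def xi_profile(lam, pi, nu):
    """xi_lambda from Definition 3.2 (no truncation), with xi_lambda(1) = -1."""
    xi = [Fraction(-1)]
    flux = Fraction(0)  # [xi(k) - xi(k-1)] nu(k-1, k), zero at k = 1
    for k in range(len(nu)):
        flux = flux - lam * pi[k] * xi[k]
        xi.append(xi[k] + flux / nu[k])
    return xi

def det_A(lam, pi, nu):
    """det A_{n-1}(lambda) for the tridiagonal matrix in (2.4)."""
    prev, cur = Fraction(0), Fraction(1)  # det A_{-1}, det A_0
    for l in range(len(nu)):
        ratio = pi[l + 1] / pi[l]                       # sub-diagonal entry
        a = 1 + ratio - lam * pi[l + 1] / nu[l]         # diagonal entry a_{l+1}(lambda)
        prev, cur = cur, a * cur - ratio * prev
    return cur

# Hypothetical test data on the 3-point path.
pi = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
nu = [Fraction(1, 8), Fraction(1, 8)]
lam = Fraction(1)
mean = sum(w * x for w, x in zip(pi, xi_profile(lam, pi, nu)))
```

Since both sides are polynomials in $\lambda$ of degree $n - 1$, the non-zero eigenvalues of $M^G_{\pi,\nu}$ are exactly the roots of either expression.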
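As a consequence of Proposition 3.3 and Remark 3.3, we obtain the following dichotomy algorithm, which is a generalization of (A2). Let $1 \le i \le n - 1$.
Choose positive reals $L_0 < \lambda^G_i < U_0$ and set, for $\ell = 0, 1, \ldots$,
1. $\xi_{\lambda_\ell}$ be the function generated by $\lambda_\ell = (L_\ell + U_\ell)/2$ in Definition 3.2,
2. According to Definition 3.1, set
\[ \begin{cases} L_{\ell+1} = L_\ell,\ U_{\ell+1} = \lambda_\ell & \text{if } \xi_{\lambda_\ell} \text{ is of type } j \text{ with } j > i, \text{ or if } \xi_{\lambda_\ell} \text{ is of type } i \text{ and } (-1)^{i-1}\pi(\xi_{\lambda_\ell}) > 0 \\ U_{\ell+1} = U_\ell,\ L_{\ell+1} = \lambda_\ell & \text{if } \xi_{\lambda_\ell} \text{ is of type } j \text{ with } j < i, \text{ or if } \xi_{\lambda_\ell} \text{ is of type } i \text{ and } (-1)^{i-1}\pi(\xi_{\lambda_\ell}) < 0 \\ L_{\ell+1} = U_{\ell+1} = \lambda_\ell & \text{if } \xi_{\lambda_\ell} \text{ is of type } i \text{ and } \pi(\xi_{\lambda_\ell}) = 0. \end{cases} \]
(Di)
Theorem 3.4. Referring to (Di),
\[ 0 \le \max\{U_\ell - \lambda^G_i,\ \lambda^G_i - L_\ell\} \le (U_0 - L_0)2^{-\ell}, \quad \forall \ell \ge 0. \]
Proof. Immediate from Proposition 3.3 and Remark 3.3.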
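The dichotomy (Di) can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation: the names `xi_profile`, `shape_type` and `eigenvalue_bisect`, the 0-based lists `pi`, `nu` and the default iteration count are assumptions of this example. The helper `shape_type` counts the monotone runs of a function by counting direction changes among its non-zero increments; this simplification relies on Remark 3.2, according to which a flat step of $\xi_\lambda$ with $\lambda > 0$ can only occur at a turning point or at the right end.

```python
import math

def xi_profile(lam, pi, nu):
    """xi_lambda from Definition 3.2 (no truncation), with xi_lambda(1) = -1."""
    xi = [-1.0]
    flux = 0.0  # [xi(k) - xi(k-1)] nu(k-1, k), zero at k = 1
    for k in range(len(nu)):
        flux -= lam * pi[k] * xi[k]
        xi.append(xi[k] + flux / nu[k])
    return xi

def shape_type(f):
    """Number of strictly monotone runs of f (the 'type' of Definition 3.1)."""
    signs = [1 if y > x else -1 for x, y in zip(f, f[1:]) if y != x]
    return 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def eigenvalue_bisect(i, pi, nu, lower, upper, iterations=60):
    """Bisection (Di) for the i-th smallest non-zero eigenvalue, 1 <= i <= n-1."""
    for _ in range(iterations):
        lam = 0.5 * (lower + upper)
        xi = xi_profile(lam, pi, nu)
        j = shape_type(xi)
        signed_mean = (-1) ** (i - 1) * sum(w * x for w, x in zip(pi, xi))
        if j > i or (j == i and signed_mean > 0):
            upper = lam
        elif j < i or (j == i and signed_mean < 0):
            lower = lam
        else:
            lower = upper = lam
    return lower, upper

# Illustrative check (an assumption): uniform measures on the 5-point path,
# where the non-zero eigenvalues are 2 - 2 cos(i pi / 5), i = 1, ..., 4.
n = 5
pi = [1.0 / n] * n
nu = [1.0 / n] * (n - 1)
brackets = [eigenvalue_bisect(i, pi, nu, 1e-6, 4.0) for i in range(1, n)]
exact = [2.0 - 2.0 * math.cos(i * math.pi / n) for i in range(1, n)]
```

Each bisection step costs order $n$ operations, so computing all $n - 1$ non-zero eigenvalues to a fixed precision takes order $n^2$ operations, in line with the complexity stated in the abstract.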
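Proposition 3.3(2) bounds the eigenvalues using the shape of $\xi_\lambda$ generated from one end point. We now introduce some other criteria to bound eigenvalues using the shape of $\xi_\lambda$ from either boundary point. Those results will be used to prove Theorem 6.1.
Proposition 3.5. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2 and $\tilde\xi_\lambda$ be a function given by
\[ \tilde\xi_\lambda(k-1) = \tilde\xi_\lambda(k) + \frac{[\tilde\xi_\lambda(k) - \tilde\xi_\lambda(k+1)]\nu(k, k+1) - \lambda\pi(k)\tilde\xi_\lambda(k)}{\nu(k-1, k)}, \]
for $k = n, n-1, \ldots, 2$ with $\tilde\xi_\lambda(n) = -1$. Let $\lambda^G_0 < \cdots < \lambda^G_{n-1}$ be the eigenvalues of $M^G_{\pi,\nu}$ in (3.1) and let $f|_B$ be the restriction of $f$ to a subset $B$ of $V$. Suppose $1 \le k_0 \le n$.
(1) If $\xi_\lambda|_{\{1,\ldots,k_0\}}$ is of type $i$ with $(-1)^i\xi_\lambda(k_0) > 0$ and $\tilde\xi_\lambda|_{\{k_0,\ldots,n\}}$ is of type $j$ with $(-1)^j\tilde\xi_\lambda(k_0) > 0$, then $\lambda^G_{i+j-2} < \lambda < \lambda^G_{i+j-1}$.
(2) If $\xi_\lambda|_{\{1,\ldots,k_0\}}$ is of type $i$ with $(-1)^i\xi_\lambda(k_0) < 0$ and $\tilde\xi_\lambda|_{\{k_0,\ldots,n\}}$ is of type $j$ with $(-1)^j\tilde\xi_\lambda(k_0) < 0$, then $\lambda^G_{i+j-1} < \lambda < \lambda^G_{i+j+1}$.
(3) If $\xi_\lambda|_{\{1,\ldots,k_0\}}$ is of type $i$ with $(-1)^i\xi_\lambda(k_0) > 0$ and $\tilde\xi_\lambda|_{\{k_0,\ldots,n\}}$ is of type $j$ with $(-1)^j\tilde\xi_\lambda(k_0) < 0$, then $\lambda^G_{i+j-2} < \lambda < \lambda^G_{i+j}$.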
Proof. By Proposition 3.3, $\xi_\lambda(n)$ is a polynomial of degree $n - 1$ satisfying
\[ (-1)^{i+1}\xi_{\lambda^G_i}(n) > 0 \ \ \forall\, 0 \le i < n, \qquad (-1)^{i+1}\xi_{\beta_i}(n) > 0 \ \ \forall\, 1 \le i < n - 1. \]
This implies that there are $w_i \in (\beta_i, \lambda^G_{i+1})$, $0 \le i \le n - 2$, such that $(-1)^{i+1}\xi_\lambda(n) > 0$ for $\lambda \in (w_{i-1}, w_i)$ and $0 \le i \le n - 1$ with $w_{-1} = -\infty$ and $w_{n-1} = \infty$.
The proofs for (1)–(3) in Proposition 3.5 are similar and we deal with (1) only. By the Euler–Lagrange equations in (3.2), it is easy to see that, for $1 \le l < n$, $\xi_{\lambda^G_l}$ and $\tilde\xi_{\lambda^G_l}$ are eigenvectors of $M^G_{\pi,\nu}$ in (3.1) associated with $\lambda^G_l$, which implies $\xi_{\lambda^G_l} = -\xi_{\lambda^G_l}(n)\,\tilde\xi_{\lambda^G_l}$.
First, assume that $\lambda \le \lambda^G_{i+j-2}$. By Proposition 3.3, $\xi_{\lambda^G_{i+j-2}}|_{\{1,\ldots,k_0\}}$ is of type at least $i$ and $\tilde\xi_{\lambda^G_{i+j-2}}|_{\{k_0,\ldots,n\}}$ is of type at least $j$. This implies that the patching of $\xi_{\lambda^G_{i+j-2}}|_{\{1,\ldots,k_0\}}$ and $-\xi_{\lambda^G_{i+j-2}}(n)\,\tilde\xi_{\lambda^G_{i+j-2}}|_{\{k_0,\ldots,n\}}$, which equals $\xi_{\lambda^G_{i+j-2}}$, is of type at least $i + j - 1$. This is a contradiction.
Next, assume that $\lambda \ge \lambda^G_{i+j-1}$. By Proposition 3.3, we may choose $a_1 < \lambda$ (resp. $a_2 < \lambda$) such that $\xi_\lambda|_{\{1,\ldots,k_0\}}$ (resp. $\tilde\xi_\lambda|_{\{k_0,\ldots,n\}}$) changes the type at $a_1$ (resp. $a_2$). If $\lambda^G_{i+j-1} \le \min\{a_1, a_2\}$, then a similar reasoning as before implies that $\xi_{\lambda^G_{i+j-1}}$ is of type at most $i + j - 2$, a contradiction. If $\min\{a_1, a_2\} < \lambda^G_{i+j-1} < \max\{a_1, a_2\}$, then exactly one of $\xi_{\lambda^G_{i+j-1}}|_{\{1,\ldots,k_0\}}$ and $\tilde\xi_{\lambda^G_{i+j-1}}|_{\{k_0,\ldots,n\}}$ does not change its type. This implies that the gluing point $k_0$ cannot be a local extremum and, thus, the patching function is of type at most $i + j - 2$, another contradiction! According to the discussion in the first paragraph of this proof, if $\lambda^G_{i+j-1} \ge \max\{a_1, a_2\}$, then none of $\xi_{\lambda^G_{i+j-1}}|_{\{1,\ldots,k_0\}}$ and $\tilde\xi_{\lambda^G_{i+j-1}}|_{\{k_0,\ldots,n\}}$ changes type nor, of course, the sign at $k_0$. Consequently, we obtain $(-1)^{i+j}\xi_{\lambda^G_{i+j-1}}(k_0)\,\tilde\xi_{\lambda^G_{i+j-1}}(k_0) > 0$, which contradicts the fact $\xi_{\lambda^G_{i+j-1}} = -\xi_{\lambda^G_{i+j-1}}(n)\,\tilde\xi_{\lambda^G_{i+j-1}}$.
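Proposition 3.6. For $\lambda > 0$ and $1 \le k \le n - 1$, let $s_k(\lambda)$ be the $k$th sign change of $\xi_\lambda$ defined by $s_0 := 0$ and $s_{k+1}(\lambda) := \inf\{l > s_k(\lambda) \mid \xi_\lambda(l)\xi_\lambda(l-1) < 0 \text{ or } \xi_\lambda(l) = 0\}$, where $\inf \emptyset := n + 1$. Then, for $0 < \lambda_1 < \lambda_2$, $s_k(\lambda_1) \ge s_k(\lambda_2)$ for all $1 \le k \le n - 1$.
Proof. Let $1 \le k \le n - 1$. If $s_k(\lambda_1) = n + 1$, then it is clear that $s_k(\lambda_1) \ge s_k(\lambda_2)$. Suppose that $s_k(\lambda_1) = \ell \le n$. Obviously, $\xi_{\lambda_1}|_{\{1,\ldots,\ell\}}$ is of type $k$. Referring to (2.4), let $\lambda^\ell_1, \ldots, \lambda^\ell_{\ell-1}$ be the roots of $\det A_{\ell-1}(\lambda) = 0$ and $\alpha^\ell_1, \ldots, \alpha^\ell_{\ell-2}$ be the roots of $\det A_{\ell-2}(\lambda) = 0$. According to the first paragraph of the proof for Proposition 3.5, there are $w^\ell_i \in (\alpha^\ell_{i-1}, \lambda^\ell_i)$ with $1 \le i \le \ell - 1$ such that $(-1)^{i+1}\xi_\lambda(\ell) > 0$ for $\lambda \in (w^\ell_i, w^\ell_{i+1})$ and $1 \le i \le \ell - 1$, where $\alpha^\ell_0 := 0$. Since $\xi_{\lambda_1}(\ell)\xi_{\lambda^\ell_k}(\ell) \ge 0$, one has $w^\ell_k \le \lambda_1 < \alpha^\ell_k$. As it is assumed that $\lambda_2 > \lambda_1$, if $\lambda_2 > \alpha^\ell_k$, then $\xi_{\lambda_2}|_{\{1,\ldots,\ell\}}$ is of type at least $k + 1$ and, consequently, $s_k(\lambda_2) < \ell = s_k(\lambda_1)$. If $\lambda_1 < \alpha^\ell_k$, then $\xi_{\lambda_2}|_{\{1,\ldots,\ell\}}$ is type $k$ and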
3.2. Bounding eigenvalues from below
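Motivated by Theorem 3.1, we introduce another scheme generalizing (2.1) to bound the other eigenvalues of $M^G_{\pi,\nu}$ from below.
Definition 3.3. For $\lambda > 0$, let $\xi_\lambda$ be the function in Definition 3.2. If $\xi_\lambda$ is of type $i$, $1 \le i \le n - 1$, with peak-valley points $1 = a_1 < b_1 \le a_2 < b_2 \le \cdots \le a_i < b_i \le n$, then define
\[ \xi^{(j)}_\lambda(k) = \begin{cases} \xi_\lambda(k) & \text{for } k \le b_j \\ \xi_\lambda(b_j) & \text{for } k > b_j, \end{cases} \quad \forall\, 1 \le j < i, \]
and set $\xi^{(j)}_\lambda = \xi_\lambda$ for $i \le j \le n - 1$.
Remark 3.4. For $\lambda > 0$, if $\xi_\lambda$ is of type $i$, then $\xi^{(j)}_\lambda$ is of type $j$ for $j < i$. Moreover, for $k < b_j$,
\begin{align*}
\xi^{(j)}_\lambda(k+1) &= \xi^{(j)}_\lambda(k) + \frac{[\xi^{(j)}_\lambda(k) - \xi^{(j)}_\lambda(k-1)]\nu(k-1, k) - \lambda\pi(k)\xi^{(j)}_\lambda(k)}{\nu(k, k+1)} \\
&= \xi^{(j)}_\lambda(k) - \frac{\lambda[\pi(1)\xi^{(j)}_\lambda(1) + \cdots + \pi(k)\xi^{(j)}_\lambda(k)]}{\nu(k, k+1)},
\end{align*}
and, for $b_j \le k < n$,
\[ \xi^{(j)}_\lambda(k+1) = \xi^{(j)}_\lambda(k) + \frac{F_j\big([\xi^{(j)}_\lambda(k) - \xi^{(j)}_\lambda(k-1)]\nu(k-1, k) - \lambda\pi(k)\xi^{(j)}_\lambda(k)\big)}{\nu(k, k+1)}, \]
where $F_j(t) = \max\{t, 0\}$ if $j$ is odd, and $F_j(t) = \min\{t, 0\}$ if $j$ is even. Note that $\xi^{(1)}_\lambda$ is exactly $\phi_\lambda$ in Proposition 2.5.
Thereafter, let $L$ and $L^{(i)}$ be functions on $(0, \infty)$ defined by
\[ L(\lambda) = \frac{\mathcal{E}_\nu(\xi_\lambda, \xi_\lambda)}{\mathrm{Var}_\pi(\xi_\lambda)}, \qquad L^{(i)}(\lambda) = \frac{\mathcal{E}_\nu(\xi^{(i)}_\lambda, \xi^{(i)}_\lambda)}{\mathrm{Var}_\pi(\xi^{(i)}_\lambda)}, \quad \forall\, 1 \le i \le n - 1, \tag{3.3} \]
where $\xi_\lambda$ and $\xi^{(i)}_\lambda$ are the functions in Definitions 3.2–3.3.
Remark 3.5. Note that $L = L^{(n-1)}$. By a similar reasoning as in the proof for (2.2), one can show that, for $\lambda > 0$,
\[ L(\lambda) = \lambda + \frac{\lambda\pi(\xi_\lambda)[\pi(\xi_\lambda) - \xi_\lambda(n)]}{\mathrm{Var}_\pi(\xi_\lambda)}, \qquad L^{(i)}(\lambda) = \lambda + \frac{\lambda\pi(\xi^{(i)}_\lambda)[\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)]}{\mathrm{Var}_\pi(\xi^{(i)}_\lambda)}. \]
From Proposition 3.3, it follows immediately that $L(\lambda) = L^{(i)}(\lambda)$ for $\lambda \in (0, \alpha_i]$.
To explore $L$ and $L^{(i)}$ further, we need more information on $\pi(\xi_\lambda)$, $\pi(\xi^{(i)}_\lambda)$, $\pi(\xi_\lambda) - \xi_\lambda(n)$ and $\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)$.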
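Lemma 3.7. Let $\xi_\lambda$ be the function in Definition 3.2 and $\lambda^G_i, \alpha_i$ be the constants in Proposition 3.3. Then, $\pi(\xi_\lambda) - \xi_\lambda(n) = 0$ has $n - 1$ distinct roots, say $\beta_0 < \beta_1 < \cdots < \beta_{n-2}$, which satisfy $\beta_0 = 0$ and $\alpha_i < \beta_i < \lambda^G_{i+1}$ for $1 \le i \le n - 2$. Furthermore, $\pi(\xi_\lambda) - \xi_\lambda(n) > 0$ for $\lambda \in (\beta_{2i-1}, \beta_{2i})$
Proof. Set $u(\lambda) := \pi(\xi_\lambda) - \xi_\lambda(n)$. According to Definition 3.2, $u(\lambda)$ is a polynomial of degree $n - 1$ and satisfies $u(0) = 0$. Note that $\pi(\xi_\lambda) = 0$ for $\lambda \in \{\lambda^G_1, \ldots, \lambda^G_{n-1}\}$. If $i$ is odd, then $\xi_{\lambda^G_i}(n-1) < \xi_{\lambda^G_i}(n)$. This implies $\xi_{\lambda^G_i}(n) > 0$ and, hence, $u(\lambda^G_i) < 0$. Similarly, if $i$ is even, then $u(\lambda^G_i) > 0$.
By Lemma 3.2 and Proposition 3.3, if $\lambda = \alpha_i$ with odd $i$, then $\xi_{\alpha_i}$ is of type $i$ with $\xi_{\alpha_i}(n-1) = \xi_{\alpha_i}(n)$. This implies $\xi_{\alpha_i}(n) > 0$ and $\pi(\xi_{\alpha_i}) = \pi(n)\xi_{\alpha_i}(n)$, which yields $u(\alpha_i) < 0$. Similarly, one can show that $u(\alpha_i) > 0$ if $i$ is even.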
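Remark 3.6. We consider the sign of $\pi(\xi^{(i)}_\lambda)$ and $\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)$ in this remark. By Proposition 3.3, $\xi^{(i)}_\lambda = \xi_\lambda$ for $\lambda \le \alpha_i$. If $\lambda > \alpha_i$ with $1 \le i \le n - 2$, then $\xi_\lambda$ is of type $j$ with $j > i$. Fix $1 \le i \le n - 2$ and set $k_0 = k_0(\lambda) = \min\{k \mid \xi^{(i)}_\lambda(j) = \xi^{(i)}_\lambda(n),\ \forall k \le j \le n\}$. Clearly, $k_0(\lambda) \le n - 1$ for $\lambda > \alpha_i$. Observe that, for $\lambda > \alpha_i$ with odd $i$, $\xi_\lambda(k_0 - 1) < \xi_\lambda(k_0) \ge \xi_\lambda(k_0 + 1)$, which implies $\sum_{k=1}^{k_0 - 1} \pi(k)\xi_\lambda(k) < 0$ and $\sum_{k=1}^{k_0} \pi(k)\xi_\lambda(k) \ge 0$. A similar reasoning for the case of even $i$ gives $\sum_{k=1}^{k_0 - 1} \pi(k)\xi_\lambda(k) > 0$ and $\sum_{k=1}^{k_0} \pi(k)\xi_\lambda(k) \le 0$. Consequently, we obtain
\[ (-1)^{i-1}\pi(\xi^{(i)}_\lambda) > 0, \qquad (-1)^{i}[\pi(\xi^{(i)}_\lambda) - \xi^{(i)}_\lambda(n)] > 0, \tag{3.4} \]
for $\lambda > \alpha_i$ and $1 \le i \le n - 2$. Note that, by Proposition 3.3, $\xi^{(i)}_\lambda = \xi_\lambda$ for $\lambda \le \alpha_i$. Together with Remark 3.3, Lemma 3.7 and the continuity of $\xi^{(i)}_\lambda$, the first inequality of (3.4) holds for $\lambda > \lambda^G_i$ and the second inequality of (3.4) holds for $\lambda > \beta_{i-1}$.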
According toLemma 3.7andRemark 3.6, we derive a generalized version ofProposition 2.3
in the following.
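Proposition 3.8. Let $n\ge 3$ and $1\le i\le n-1$. For $\lambda>0$, let $\xi_\lambda,\xi_\lambda^{(i)}$ be the functions in Definition 3.2 and $\beta_i$ be the constants in Lemma 3.7.
(1) For $\lambda>\beta_{i-1}$, the following are equivalent.
(1-1) $\mathcal{E}_\nu(\xi_\lambda^{(i)},\xi_\lambda^{(i)})=\lambda\,\mathrm{Var}_\pi(\xi_\lambda^{(i)})$.
(1-2) $\pi(\xi_\lambda^{(i)})=0$.
(1-3) $\lambda=\lambda_i^G$.
(2) For $\beta_{i-1}<\lambda<\beta_i$, the following are equivalent.
(2-1) $\mathcal{E}_\nu(\xi_\lambda,\xi_\lambda)=\lambda\,\mathrm{Var}_\pi(\xi_\lambda)$.
(2-2) $\pi(\xi_\lambda)=0$.
(2-3) $\lambda=\lambda_i^G$.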
Proof. The proof of Proposition 3.8(2) is similar to that of Proposition 3.8(1), and we deal only with the latter. By Lemma 3.7 and Remark 3.6, one has
\[
\pi(\xi_\lambda^{(i)})\bigl[\pi(\xi_\lambda^{(i)})-\xi_\lambda^{(i)}(n)\bigr]
\begin{cases}
<0 & \text{for } \lambda>\lambda_i^G,\\
>0 & \text{for } \beta_{i-1}<\lambda<\lambda_i^G.
\end{cases}
\]
This proves the equivalence of (1-1) and (1-2). Under the assumption of (1-2) and using Remark 3.3, one has $\lambda\le\alpha_i$. This implies that $\xi_\lambda^{(i)}=\xi_\lambda$ is an eigenvector of $M_{\pi,\nu}^G$ with associated eigenvalue $\lambda$. As $\lambda\in(\beta_{i-1},\alpha_i]$, it must be the case that $\lambda=\lambda_i^G$. This gives (1-3), while (1-3)⇒(1-2)
Remark 3.7. It is worthwhile to note that if (1-1) and (2-1) of Proposition 3.8 are removed, then the equivalence in (1) holds for $\lambda>\lambda_{i-1}^G$ and the equivalence in (2) holds for $\lambda\in(\lambda_{i-1}^G,\lambda_{i+1}^G)$. Once $\lambda_{i-1}^G$ is known, we can determine $\lambda_i^G$ using the sign of $\pi(\xi_\lambda^{(i)})$. See Theorem 3.9 for details.
Remark 3.8. Note that condition (4) of Proposition 2.3 is not included in Proposition 3.8. In fact, the equivalence may fail, that is, there may exist some $\lambda\in(\beta_{i-1},\beta_i)\setminus\{\lambda_i^G\}$ such that $\mathcal{E}_\nu(\xi_\lambda,\xi_\lambda)/\mathrm{Var}_\pi(\xi_\lambda)=\lambda_i^G$. See Example 3.2 for a counterexample.
While Proposition 3.8 characterizes the zeros of $L(\lambda)-\lambda$, the following theorem concerns the sign of $L(\lambda)-\lambda$.
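Theorem 3.9. Let $\lambda_i^G,\alpha_i,\beta_i$ be the constants in Proposition 3.3 and Lemma 3.7, and $L$ be the function in (3.3). Then, $\lambda_1^G,\dots,\lambda_{n-1}^G,\beta_1,\dots,\beta_{n-2}$ are fixed points of $L$ and, for $1\le i\le n-2$,
(1) $L(\lambda)<\lambda$ for $\lambda\in(\lambda_i^G,\beta_i)$.
(2) $L(\lambda)>\lambda$ for $\lambda\in(\beta_i,\lambda_{i+1}^G)$.
(3) $L^{(i)}(\lambda)<\lambda$ for $\lambda\in(\lambda_i^G,\infty)$.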
Proof. Immediate from Lemma 3.7 and Remarks 3.5–3.6.
By Theorem 3.9, we obtain a lower bound on any specified eigenvalue of $M_{\pi,\nu}^G$.
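Corollary 3.10. Let $1\le i\le n-1$ and $\lambda_0>\lambda_i^G$. Consider the sequence $\lambda_{\ell+1}=L^{(i)}(\lambda_\ell)$ with $\ell\ge 0$ and set
\[
\lambda^*=
\begin{cases}
\lim_{\ell\to\infty}\lambda_\ell & \text{if } \lambda_\ell \text{ converges},\\
\sup_{\ell\in I}\lambda_\ell & \text{if } \lambda_\ell \text{ diverges},
\end{cases}
\]
where $I=\{\ell\mid \lambda_{\ell-1}>\lambda_\ell<\lambda_{\ell+1}\}$. Then, $\lambda^*\le\lambda_i^G$.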
It is not yet clear whether the sequence $\lambda_\ell$ in Corollary 3.10 is convergent, even locally. This subject will be discussed in the next subsection. Now, we establish some relations between the roots of $\det A_i(\lambda)=0$ and the shape of $\xi_\lambda^{(i)}$. This is a generalization of Proposition 2.5.
Proposition 3.11. For $1\le i\le n-1$, let $A_i(\lambda)$ be the matrix in (2.4), $\theta_1^{(i)}<\cdots<\theta_i^{(i)}$ be the zeros of $\det A_i(\lambda)=0$ and set $\theta_i^{(i-1)}:=\infty$. Referring to the notation in Proposition 3.3, it holds true that, for $1\le i\le n-1$,
(1) $\lambda_i^G=\theta_i^{(n-1)}<\alpha_i=\theta_i^{(n-2)}<\cdots<\theta_i^{(i)}$.
(2) $\xi_\lambda^{(i)}(j)\ne\xi_\lambda^{(i)}(j+1)=\cdots=\xi_\lambda^{(i)}(n)$ for $\lambda\in[\theta_i^{(j)},\theta_i^{(j-1)})$ and $i\le j\le n-2$.
(3) $\xi_\lambda^{(i)}(n-1)\ne\xi_\lambda^{(i)}(n)$ for $\lambda\in(\theta_{i-1}^{(n-2)},\theta_i^{(n-2)})$ and $i\le n-1$.
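Proof. The order in (1) is a simple application of Lemma A.3. For (2), fix $1\le i\le n-1$ and set $\gamma(\lambda)=\min\{j\mid \xi_\lambda^{(i)}(k)=\xi_\lambda^{(i)}(n),\ \forall j\le k\le n\}$ and $B(\lambda)=\{1,2,\dots,\gamma(\lambda)\}$, $B^+(\lambda)=B(\lambda)\cup\{\gamma(\lambda)+1\}$. Clearly, $i+1\le\gamma(\lambda)\le n$. We use the notation $\xi_\lambda|_C$ to denote the restriction of $\xi_\lambda$ to a set $C$. Suppose that $i$ is odd. By Remark 3.4, $\xi_\lambda^{(i)}=\xi_\lambda$ on $B(\lambda)$ and $\xi_\lambda|_{B(\lambda)}$ is of type $i$ with
\[
\xi_\lambda(\gamma(\lambda)-1)<\xi_\lambda(\gamma(\lambda))\ge\xi_\lambda(\gamma(\lambda)+1).
\]
By Lemma 3.2(1), if $\xi_\lambda(\gamma(\lambda)+1)<\xi_\lambda(\gamma(\lambda))$, then there is $\epsilon>0$ such that, for $|\delta|<\epsilon$, $\xi_{\lambda+\delta}|_{B(\lambda)}$ is of type $i$ and
\[
\xi_{\lambda+\delta}(\gamma(\lambda)-1)<\xi_{\lambda+\delta}(\gamma(\lambda))>\xi_{\lambda+\delta}(\gamma(\lambda)+1).
\]
This implies $\gamma(\lambda+\delta)=\gamma(\lambda)$ for $\delta\in(-\epsilon,\epsilon)$. By Lemma 3.2(2), if $\xi_\lambda(\gamma(\lambda)+1)=\xi_\lambda(\gamma(\lambda))$, then there is $\epsilon>0$ such that, for $\delta\in(-\epsilon,0)$, $\xi_{\lambda+\delta}|_{B^+(\lambda)}$ is of type $i$ with
\[
\xi_{\lambda+\delta}(\gamma(\lambda)-1)<\xi_{\lambda+\delta}(\gamma(\lambda))<\xi_{\lambda+\delta}(\gamma(\lambda)+1),
\]
and, for $\delta\in(0,\epsilon)$, $\xi_{\lambda+\delta}|_{B^+(\lambda)}$ is of type $i+1$ with
\[
\xi_{\lambda+\delta}(\gamma(\lambda)-1)<\xi_{\lambda+\delta}(\gamma(\lambda))>\xi_{\lambda+\delta}(\gamma(\lambda)+1).
\]
This yields $\gamma(\lambda+\delta)=\gamma(\lambda)$ for $\delta\in(0,\epsilon)$ and $\gamma(\lambda+\delta)=\gamma(\lambda)+1$ for $\delta\in(-\epsilon,0)$. The proof for the case of even $i$ is similar, and we conclude from the above that $\gamma(\lambda)$ is a non-increasing and right-continuous function taking values in $\{i+1,\dots,n\}$. Let $c_{i+1}>\cdots>c_{n-1}$ be the discontinuity points of $\gamma(\lambda)$ such that $\gamma(c_j)=j$ for $i+1\le j\le n-1$. As a consequence of the above discussion, $\xi_{c_j}|_{\{1,\dots,j\}}$ is of type $i$ with $\xi_{c_j}(j)=\xi_{c_j}(j+1)$ and this implies $\sum_{k=1}^{j}\pi(k)\xi_{c_j}(k)=0$. That means $c_j$ is a root of $\det A_{j-1}(\lambda)=0$ for $j=i+1,\dots,n-1$. By Proposition 3.3 and the second equality in (1), $\gamma(\lambda)=n$ for $\theta_{i-1}^{(n-2)}<\lambda<\theta_i^{(n-2)}$ and, thus, $c_j\ge\theta_i^{(n-2)}$ for $j\ge i+1$. As a consequence of the interlacing relationship $\theta_i^{(\ell)}<\theta_i^{(\ell-1)}<\theta_{i+1}^{(\ell)}$, it must be that $c_j=\theta_i^{(j-1)}$ for $i+1\le j\le n-1$. This finishes the proof.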
Remark 3.9. For $1\le i\le n-1$, $\theta_1^{(i)},\dots,\theta_i^{(i)}$ are also the non-zero eigenvalues of the $(i+1)\times(i+1)$ principal submatrix of (3.1) indexed by $1,\dots,i+1$.
Remark 3.10. In fact, by Proposition 2.5, $\xi_\lambda^{(1)}(n-1)\ne\xi_\lambda^{(1)}(n)$ for $\lambda\in(0,\theta_1^{(n-2)})$, which is better than Proposition 3.11(3).
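3.3. Local convergence of L
This subsection is dedicated to the local convergence of $L$ in (3.3). Let $\alpha_i,\beta_i,\lambda_i^G$ be the constants in Proposition 3.3 and Lemma 3.7. As before, let $\zeta_0=1,\zeta_1,\dots,\zeta_{n-1}$ denote the $L^2(\pi)$-normalized eigenvectors of $M_{\pi,\nu}^G$ associated with $\lambda_0^G,\dots,\lambda_{n-1}^G$. Clearly, $\xi_{\lambda_i^G}=-\zeta_i/\zeta_i(1)$ and $\xi_\lambda=\sum_{i=0}^{n-1}\rho_i(\lambda)\zeta_i$, where $\rho_i(\lambda)=\pi(\xi_\lambda\zeta_i)$ for $0\le i\le n-1$. Note that $\rho_i(\lambda)$ is a polynomial of degree $n-1$ and satisfies $\rho_i(\lambda_j^G)=-\delta_i(j)/\zeta_i(1)$ for $i,j\in\{0,1,\dots,n-1\}$. This implies
\[
\rho_0(\lambda)=-\prod_{j=1}^{n-1}\frac{\lambda_j^G-\lambda}{\lambda_j^G},\qquad
\rho_i(\lambda)=-\frac{\lambda}{\zeta_i(1)\lambda_i^G}\prod_{j=1,\,j\ne i}^{n-1}\frac{\lambda_j^G-\lambda}{\lambda_j^G-\lambda_i^G}, \tag{3.5}
\]
for all $1\le i\le n-1$. Moreover, by multiplying (3.2) with $\xi_\lambda(k)$ and summing over $k$, we obtain $\mathcal{E}_\nu(\xi_\lambda,\zeta_i)=\lambda_i^G\rho_i(\lambda)$. In the same spirit, one can show that $\mathcal{E}_\nu(\xi_\lambda,\zeta_i)=\lambda[\rho_i(\lambda)-\zeta_i(n)\rho_0(\lambda)]$ using Definition 3.2. Putting both equations together yields
\[
\rho_i(\lambda)=\frac{\lambda\,\zeta_i(n)\,\rho_0(\lambda)}{\lambda-\lambda_i^G},\qquad 1\le i\le n-1. \tag{3.6}
\]
As a consequence of Remark 3.5, this gives
\[
L(\lambda)=\frac{\sum_{i=1}^{n-1}\lambda_i^G\rho_i^2(\lambda)}{\sum_{i=1}^{n-1}\rho_i^2(\lambda)}
=\lambda+\frac{\sum_{i=1}^{n-1}(\lambda_i^G-\lambda)^{-1}\zeta_i^2(n)}{\sum_{i=1}^{n-1}(\lambda_i^G-\lambda)^{-2}\zeta_i^2(n)}, \tag{3.7}
\]
for $\lambda\notin\{\lambda_0^G,\dots,\lambda_{n-1}^G\}$. The next proposition follows immediately from the second equation in (3.5) and (3.6).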
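Proposition 3.12. Let $\lambda_1^G,\dots,\lambda_{n-1}^G$ be the non-zero eigenvalues of $M_{\pi,\nu}^G$ in (3.1) and $\zeta_1,\dots,\zeta_{n-1}$ be the corresponding $L^2(\pi)$-normalized eigenvectors. Then,
\[
\zeta_i(1)\zeta_i(n)=-\prod_{j=1,\,j\ne i}^{n-1}\frac{\lambda_j^G}{\lambda_j^G-\lambda_i^G},\qquad \forall\,1\le i\le n-1.
\]
Set $u(\lambda)=\sum_{j=1}^{n-1}(\lambda_j^G-\lambda)^{-1}\zeta_j^2(n)$. By Theorem 3.9, $\beta_1,\dots,\beta_{n-2}$ are zeros of $u(\lambda)\prod_{j=1}^{n-1}(\lambda_j^G-\lambda)$, which is a polynomial of degree $n-2$. This implies
\[
u(\lambda)=C\Bigl(\prod_{j=1}^{n-1}\frac{1}{\lambda_j^G-\lambda}\Bigr)\Bigl(\prod_{j=1}^{n-2}(\beta_j-\lambda)\Bigr),\qquad\text{where } C=\frac{\lambda_1^G\cdots\lambda_{n-1}^G}{\beta_1\cdots\beta_{n-2}}\sum_{j=1}^{n-1}\frac{\zeta_j^2(n)}{\lambda_j^G}.
\]
Substituting this into $L$ yields
\[
\frac{1}{L(\lambda)-\lambda}=\frac{u'(\lambda)}{u(\lambda)}=\sum_{j=1}^{n-1}\frac{1}{\lambda_j^G-\lambda}-\sum_{j=1}^{n-2}\frac{1}{\beta_j-\lambda}, \tag{3.8}
\]
for $\lambda\notin\{\lambda_0^G,\dots,\lambda_{n-1}^G,\beta_1,\dots,\beta_{n-2}\}$.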
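The identity in Proposition 3.12 can be checked numerically on a chain whose spectrum is known in closed form. The sketch below is only an illustration: it assumes the reflecting simple random walk on $\{1,\dots,n\}$ with $\lambda_j=1-\cos(j\pi/n)$ and eigenvectors normalized so that $\pi(\zeta_j^2)=1$ (prefactor $1/\sqrt{\lambda_j}$); the helper names `srw_spectrum` and `endpoint_product_rhs` are hypothetical and not part of the paper.

```python
import math

def srw_spectrum(n):
    """Non-zero eigenvalues and L2(pi)-normalized eigenvectors (assumed closed
    form) of the reflecting simple random walk on {1, ..., n}, pi uniform."""
    lam, zeta = [], []
    for j in range(1, n):
        theta = j * math.pi / n
        l = 1.0 - math.cos(theta)
        lam.append(l)
        zeta.append([(math.sin(k * theta) - math.sin((k - 1) * theta)) / math.sqrt(l)
                     for k in range(1, n + 1)])
    return lam, zeta

def endpoint_product_rhs(lam, i):
    """Right-hand side of Proposition 3.12: -prod_{j != i} lam_j / (lam_j - lam_i).
    The index i is 0-based here (i = 0 corresponds to lambda_1)."""
    prod = 1.0
    for j, lj in enumerate(lam):
        if j != i:
            prod *= lj / (lj - lam[i])
    return -prod

n = 7  # arbitrary illustrative size
lam, zeta = srw_spectrum(n)
for i in range(n - 1):
    lhs = zeta[i][0] * zeta[i][-1]  # zeta_i(1) * zeta_i(n)
    rhs = endpoint_product_rhs(lam, i)
    print(i + 1, lhs, rhs)
```

For this chain both sides reduce to $(-1)^i(1+\cos(i\pi/n))$, so the two printed columns should agree up to rounding.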
Proposition 3.13. Let $L$ be the function in (3.3), $\lambda_i^G$ be the eigenvalues of $M_{\pi,\nu}^G$ and $\beta_i$ be the constants in Lemma 3.7. Let
\[
D_i=\sum_{j=1}^{n-2}\frac{1}{\beta_j-\lambda_i^G}-\sum_{j=1,\,j\ne i}^{n-1}\frac{1}{\lambda_j^G-\lambda_i^G},\qquad 1\le i\le n-1.
\]
Then, for $2\le i\le n-2$,
(1) If $D_i<0$, then there is $\tau\in(\lambda_i^G,\beta_i)$ such that $L$ is strictly increasing on $(\beta_{i-1},\lambda_i^G)\cup(\tau,\beta_i)$ and strictly decreasing on $(\lambda_i^G,\tau)$.
(2) If $D_i>0$, then there is $\eta\in(\beta_{i-1},\lambda_i^G)$ such that $L$ is strictly increasing on $(\beta_{i-1},\eta)\cup(\lambda_i^G,\beta_i)$ and strictly decreasing on $(\eta,\lambda_i^G)$.
(3) If $D_i=0$, then $L$ is strictly increasing on $(\beta_{i-1},\beta_i)$.
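Proof. Using (3.7) and (3.8), one can show that $L'(\lambda_i^G)=0$ and
\[
L''(\lambda_i^G)=\frac{2}{\zeta_i^2(n)}\sum_{j=1,\,j\ne i}^{n-1}\frac{\zeta_j^2(n)}{\lambda_j^G-\lambda_i^G}
=2\Bigl[\sum_{j=1}^{n-2}\frac{1}{\beta_j-\lambda_i^G}-\sum_{j=1,\,j\ne i}^{n-1}\frac{1}{\lambda_j^G-\lambda_i^G}\Bigr]=2D_i. \tag{3.9}
\]
To prove (1) and (2), it suffices to show that if $L'(\tau)=0$ for some $\tau\in(\lambda_i^G,\beta_i)$, then $\tau$ is a strict local minimum of $L$, and that if $L'(\eta)=0$ for some $\eta\in(\beta_{i-1},\lambda_i^G)$, then $\eta$ is a strict local maximum of $L$. We discuss the first case, whereas the second case is similar and is omitted. Recall that $u(\lambda)=\sum_{j=1}^{n-1}(\lambda_j^G-\lambda)^{-1}\zeta_j^2(n)$, so that $L(\lambda)=\lambda+u(\lambda)/u'(\lambda)$. As $\tau$ is a critical point of $L$, one has $2(u'(\tau))^2=u(\tau)u''(\tau)$. This implies
\[
L''(\tau)=\frac{u(\tau)\bigl[3(u''(\tau))^2-2u'(\tau)u'''(\tau)\bigr]}{2(u'(\tau))^3}>0,
\]
where the last inequality uses the facts that $u'(\lambda)>0$, that $u(\lambda)<0$ for $\lambda\in(\lambda_i^G,\beta_i)$, and that
\[
3(u''(\lambda))^2-2u'(\lambda)u'''(\lambda)=-12\sum_{1\le i<j\le n-1}\Bigl[\frac{(\lambda_i^G-\lambda_j^G)\zeta_i(n)\zeta_j(n)}{(\lambda_i^G-\lambda)^2(\lambda_j^G-\lambda)^2}\Bigr]^2<0.
\]
This proves (1) and (2).
To see (3), we assume that $D_i=0$. Computations show that
\[
\frac{L(\lambda)-\lambda_i^G}{L(\lambda)-\lambda}
=(\lambda-\lambda_i^G)\Bigl[\sum_{j=1,\,j\ne i}^{n-1}\frac{1}{\lambda_j^G-\lambda}-\sum_{j=1}^{n-2}\frac{1}{\beta_j-\lambda}\Bigr]
=(\lambda-\lambda_i^G)^2\Bigl[\sum_{j=1,\,j\ne i}^{n-1}\frac{1}{(\lambda_j^G-\lambda)(\lambda_j^G-\lambda_i^G)}-\sum_{j=1}^{n-2}\frac{1}{(\beta_j-\lambda)(\beta_j-\lambda_i^G)}\Bigr]<0,
\]
for $\lambda\in(\beta_{i-1},\lambda_i^G)\cup(\lambda_i^G,\beta_i)$, where the last inequality uses the fact that $(\lambda_j^G-\lambda)(\lambda_j^G-\lambda_i^G)>(\beta_j-\lambda)(\beta_j-\lambda_i^G)$ for $j<i$ and $(\lambda_j^G-\lambda)(\lambda_j^G-\lambda_i^G)>(\beta_{j-1}-\lambda)(\beta_{j-1}-\lambda_i^G)$ for $j>i$. By Theorem 3.9, this implies $L(\lambda)>\lambda_i^G$ for $\lambda\in(\lambda_i^G,\beta_i)$ and $L(\lambda)<\lambda_i^G$ for $\lambda\in(\beta_{i-1},\lambda_i^G)$. The desired property follows immediately from the discussion in the previous paragraph.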
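Remark 3.11. Note that $D_1>0$ and $D_{n-1}<0$. Using the same proof as above, this implies that $L(\lambda)$ is strictly increasing on $(\lambda_1^G,\beta_1)\cup(\beta_{n-2},\lambda_{n-1}^G)$. Moreover, by (3.7), one may compute
\[
(u'(\lambda))^2L'(\lambda)=-2\sum_{i<j}\frac{(\lambda_i^G-\lambda_j^G)^2\,\zeta_i^2(n)\zeta_j^2(n)}{(\lambda_i^G-\lambda)^3(\lambda_j^G-\lambda)^3}<0,\qquad \forall\lambda\in(0,\lambda_1^G)\cup(\lambda_{n-1}^G,\infty).
\]
This implies that $L(\lambda)$ is strictly decreasing on $(0,\lambda_1^G)\cup(\lambda_{n-1}^G,\infty)$ and
\[
\lim_{\lambda\to 0}L(\lambda)=\frac{\sum_{i=1}^{n-1}\zeta_i^2(n)/\lambda_i^G}{\sum_{i=1}^{n-1}\zeta_i^2(n)/(\lambda_i^G)^2},\qquad
\lim_{\lambda\to\infty}L(\lambda)=\Bigl(\frac{1}{\pi(n)}-1\Bigr)^{-1}\sum_{i=1}^{n-1}\lambda_i^G\zeta_i^2(n).
\]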
The following local convergence is a simple corollary of Theorem 3.9 and Proposition 3.13.
Theorem 3.14 (Local Convergence). Let $\lambda_0>0$ and set $\lambda_{\ell+1}=L(\lambda_\ell)$ for $\ell\ge 0$. Then, there is $\epsilon>0$ such that the sequence $(\lambda_\ell)_{\ell=1}^\infty$ is monotonic and converges to $\lambda_i^G$ for $\lambda_0\in(\lambda_i^G-\epsilon,\lambda_i^G+\epsilon)$ and $1\le i\le n-1$.
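As an illustration of Theorem 3.14, the following sketch iterates $\lambda_{\ell+1}=L(\lambda_\ell)$ using the spectral form of $L$, that is, $L(\lambda)=\lambda+u(\lambda)/u'(\lambda)$ with $u(\lambda)=\sum_j\zeta_j^2(n)/(\lambda_j^G-\lambda)$. It is not the algorithm of the paper, which evaluates $L$ through $\xi_\lambda$ without knowing the spectrum; here the spectrum of the reflecting simple random walk is assumed in closed form ($\lambda_j=1-\cos(j\pi/n)$, $\zeta_j^2(n)=1+\cos(j\pi/n)$) purely to make the local behaviour visible. The names `srw_weights`, `L_map` and `iterate` are hypothetical.

```python
import math

def srw_weights(n):
    """Assumed closed form for the reflecting simple random walk on {1..n}:
    non-zero eigenvalues lam_j and weights w_j = zeta_j(n)^2."""
    lam = [1.0 - math.cos(j * math.pi / n) for j in range(1, n)]
    w = [1.0 + math.cos(j * math.pi / n) for j in range(1, n)]
    return lam, w

def L_map(x, lam, w):
    """Spectral form of L: x + u(x)/u'(x), u(x) = sum_j w_j / (lam_j - x)."""
    d = [l - x for l in lam]
    if any(di == 0.0 for di in d):
        return x  # x is (numerically) an eigenvalue, hence a fixed point
    u = sum(wi / di for wi, di in zip(w, d))
    du = sum(wi / di ** 2 for wi, di in zip(w, d))
    return x + u / du

def iterate(x0, lam, w, steps=30):
    """Return the orbit x0, L(x0), L(L(x0)), ..."""
    xs = [x0]
    for _ in range(steps):
        xs.append(L_map(xs[-1], lam, w))
    return xs

lam, w = srw_weights(6)
orbit = iterate(lam[1] + 0.01, lam, w)  # start close to lambda_2
print(orbit[:5], lam[1])
```

Since $L'(\lambda_i^G)=0$, the error is roughly squared at each step once the iterate is close to $\lambda_i^G$, which is why a handful of iterations suffices in this example.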
We use the following examples to illustrate the different cases in Proposition 3.13.
Example 3.1 (Simple Random Walks). Let $n>1$. A simple random walk on $\{1,2,\dots,n\}$ with reflecting probability 1/2 at the boundary is a birth and death chain with transition matrix given by $K(i,j)=K(1,1)=K(n,n)=1/2$ for $|i-j|=1$. It is easy to see that the uniform probability is the stationary distribution of $K$. In the graph setting, we have $\nu(i,i+1)=1/(2n)$ and $\pi(i)=1/n$. One may apply the method in [11] to obtain the following spectral information.
\[
\lambda_j^G=1-\cos\frac{j\pi}{n},\qquad
\zeta_j(k)=\frac{1}{\sqrt{\lambda_j^G}}\Bigl[\sin\frac{jk\pi}{n}-\sin\frac{j(k-1)\pi}{n}\Bigr],\qquad \forall\,1\le j<n.
\]
See, e.g., [3, Section 7]. By (3.9), we get
\[
D_i=\frac{1}{\zeta_i^2(n)}\sum_{j=1,\,j\ne i}^{n-1}\frac{\sin^2(j\pi/n)}{\lambda_j^G(\lambda_j^G-\lambda_i^G)}
=\frac{1}{1+\cos(i\pi/n)}\sum_{j=1,\,j\ne i}^{n-1}\frac{1+\cos(j\pi/n)}{\cos(i\pi/n)-\cos(j\pi/n)}.
\]
Clearly, $D_1>0$ and $D_{n-1}<0$. If $n$ is even, then $D_{n/2}<0$.
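The spectral data of Example 3.1 can be checked directly against the matrix $M_{\pi,\nu}^G$. The sketch below is a minimal check under the assumption that the eigenvectors carry the normalizing factor $1/\sqrt{\lambda_j^G}$, so that $\pi(\zeta_j^2)=1$; it also evaluates the sum $\sum_{j\ne i}(1+\cos(j\pi/n))/(\cos(i\pi/n)-\cos(j\pi/n))$, which has the same sign as $D_i$ because it differs from $D_i$ only by a positive factor. The function names are hypothetical.

```python
import math

def srw_generator(n):
    """M = I - K for the reflecting simple random walk on {1, ..., n},
    built from nu(i, i+1) = 1/(2n) and pi(i) = 1/n (0-based lists)."""
    ratio = (1.0 / (2 * n)) / (1.0 / n)  # nu / pi = 1/2
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                M[i][j] = -ratio
                M[i][i] += ratio
    return M

def srw_eigenpair(n, j):
    """Closed-form eigenvalue and L2(pi)-normalized eigenvector, 1 <= j < n."""
    theta = j * math.pi / n
    lam = 1.0 - math.cos(theta)
    zeta = [(math.sin(k * theta) - math.sin((k - 1) * theta)) / math.sqrt(lam)
            for k in range(1, n + 1)]
    return lam, zeta

def residual(M, lam, z):
    """Largest entry of |M z - lam z|."""
    n = len(z)
    return max(abs(sum(M[r][c] * z[c] for c in range(n)) - lam * z[r])
               for r in range(n))

def d_sign_sum(n, i):
    """Sum with the same sign as D_i (positive prefactor dropped)."""
    ci = math.cos(i * math.pi / n)
    return sum((1.0 + math.cos(j * math.pi / n)) / (ci - math.cos(j * math.pi / n))
               for j in range(1, n) if j != i)

n = 6  # arbitrary illustrative size
M = srw_generator(n)
for j in range(1, n):
    lam, z = srw_eigenpair(n, j)
    print(j, lam, residual(M, lam, z), d_sign_sum(n, j))
```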
Example 3.2 (Ehrenfest Chains). An Ehrenfest chain on $V=\{0,1,\dots,n\}$ is a Markov chain with transition matrix $K$ given by $K(i,i+1)=1-i/n$ and $K(i+1,i)=(i+1)/n$ for $i=0,\dots,n-1$. The associated stationary distribution is the unbiased binomial distribution on $V$, that is, $\pi(i)=\binom{n}{i}2^{-n}$ for $i\in V$. For the Ehrenfest chain, the measure $\nu$ is defined by $\nu(i,i+1)=\binom{n-1}{i}2^{-n}$ for $i=0,\dots,n-1$. Using the group representation for the binary group $\{0,1\}^n$, one may compute
\[
\lambda_j=\frac{2j}{n},\qquad
\zeta_j(k)=\binom{n}{j}^{-1/2}\sum_{\ell=0}^{j}(-1)^\ell\binom{k}{\ell}\binom{n-k}{j-\ell},\qquad \forall\,1\le j\le n.
\]
Plugging this back into (3.9) yields
\[
D_i=\frac{n}{2\binom{n}{i}}\sum_{j=1,\,j\ne i}^{n}\frac{\binom{n}{j}}{j-i}
\begin{cases}
>0 & \text{for } i<n/2,\\
=0 & \text{for } i=n/2,\\
<0 & \text{for } i>n/2.
\end{cases}
\]
This example points out the possibility of different signs in $\{D_i\mid i=1,\dots,n-1\}$, including 0.
3.4. A remark on the separation for birth and death chains
In this subsection, we give a new proof of a result, Theorem 3.15, which deals with convergence in separation distance for birth and death chains. Let $(X_m)_{m=0}^\infty$ be a birth and death chain with transition matrix $K$ given by (1.1). In the continuous time setting, we consider the process $Y_t=X_{N_t}$, where $N_t$ is a Poisson process with parameter 1 independent of $X_m$. Given the initial distribution $\mu$, which is the distribution of $X_0$, the distributions of $X_m$ and $Y_t$ are respectively $\mu K^m$ and $\mu e^{-t(I-K)}$, where $e^A:=\sum_{l=0}^\infty A^l/l!$. Briefly, we write $H_t=e^{-t(I-K)}$. It is well-known that if $K$ is irreducible, then $\mu H_t$ converges to $\pi$ as $t\to\infty$. If $K$ is irreducible and $r_i>0$ for some $i$, then $\mu K^m$ converges to $\pi$ as $m\to\infty$. Concerning the convergence, we consider the separations of $X_m,Y_t$ with respect to $\pi$, which are defined by
\[
d_{\mathrm{sep}}(\mu,m)=\max_{0\le x\le n}\Bigl\{1-\frac{\mu K^m(x)}{\pi(x)}\Bigr\},\qquad
d^c_{\mathrm{sep}}(\mu,t)=\max_{0\le x\le n}\Bigl\{1-\frac{\mu H_t(x)}{\pi(x)}\Bigr\}.
\]
The following theorem is from [9].
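Theorem 3.15. Let $K$ be an irreducible birth and death chain on $\{0,1,\dots,n\}$ with eigenvalues $\lambda_0=0<\lambda_1<\cdots<\lambda_n$.
(1) For the discrete time chain, if $p_i+q_{i+1}\le 1$ for all $0\le i<n$, then
\[
d_{\mathrm{sep}}(0,m)=d_{\mathrm{sep}}(n,m)=\sum_{j=1}^{n}\Bigl(\prod_{i=1,\,i\ne j}^{n}\frac{\lambda_i}{\lambda_i-\lambda_j}\Bigr)(1-\lambda_j)^m.
\]
(2) For the continuous time chain, it holds true that
\[
d^c_{\mathrm{sep}}(0,t)=d^c_{\mathrm{sep}}(n,t)=\sum_{j=1}^{n}\Bigl(\prod_{i=1,\,i\ne j}^{n}\frac{\lambda_i}{\lambda_i-\lambda_j}\Bigr)e^{-\lambda_j t}.
\]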
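The discrete time formula of Theorem 3.15(1) can be compared with a direct computation of the separation. The sketch below assumes, for illustration only, the reflecting simple random walk on $\{0,\dots,n\}$ (so $p_i+q_{i+1}=1$ and $\pi$ is uniform), whose eigenvalues of $I-K$ are $1-\cos(j\pi/(n+1))$; the function names are hypothetical.

```python
import math

def srw_kernel(n):
    """Reflecting simple random walk on {0, ..., n}: K(i, i±1) = 1/2 and
    holding probability 1/2 at both end points."""
    size = n + 1
    K = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in (i - 1, i + 1):
            if 0 <= j <= n:
                K[i][j] = 0.5
            else:
                K[i][i] += 0.5
    return K

def separation_direct(K, m):
    """max_x {1 - K^m(0, x)/pi(x)} with pi uniform, by repeated multiplication."""
    size = len(K)
    mu = [1.0] + [0.0] * (size - 1)
    for _ in range(m):
        mu = [sum(mu[x] * K[x][y] for x in range(size)) for y in range(size)]
    return max(1.0 - mu[x] * size for x in range(size))

def separation_formula(lam, m):
    """sum_j prod_{i != j} lam_i/(lam_i - lam_j) * (1 - lam_j)^m."""
    total = 0.0
    for j, lj in enumerate(lam):
        coeff = 1.0
        for i, li in enumerate(lam):
            if i != j:
                coeff *= li / (li - lj)
        total += coeff * (1.0 - lj) ** m
    return total

n = 4  # arbitrary illustrative size
lam = [1.0 - math.cos(j * math.pi / (n + 1)) for j in range(1, n + 1)]
K = srw_kernel(n)
for m in range(8):
    print(m, separation_direct(K, m), separation_formula(lam, m))
```

For $m<n$ the walk started at 0 cannot have reached $n$, so both columns equal 1; afterwards they decay together.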
Diaconis and Fill [5,12] introduce the concept of dual chain to express the separations in Theorem 3.15 as the probability of the first passage time. Brown and Shao [1] characterize the first passage time using the eigenvalues of $K$ for a special class of continuous time Markov chains including birth and death chains. The idea in [1] is also applicable for discrete time chains and this leads to the formula above. See [9] for further discussions. Here, we use Proposition 3.12 and Lemma 3.16 to prove this result directly.
Lemma 3.16. Let $K$ be the transition matrix in (1.1) with stationary distribution $\pi$. Suppose that $\mu$ is a probability distribution satisfying $\mu(i)/\pi(i)\le\mu(i+1)/\pi(i+1)$ for all $0\le i\le n-1$.
(1) For the discrete time chain, if $p_i+q_{i+1}\le 1$ for all $0\le i<n$, then $\mu K^m(i)/\pi(i)\le\mu K^m(i+1)/\pi(i+1)$ for all $0\le i<n$ and $m\ge 0$.
(2) For the continuous time chain, $\mu H_t(i)/\pi(i)\le\mu H_t(i+1)/\pi(i+1)$ for all $0\le i<n$ and $t\ge 0$.
Proof. Note that (2) follows from (1) if we write $H_t=\exp\{-2t(I-\tfrac{I+K}{2})\}$. For the proof of (1), observe that
\[
\frac{\mu K^{m+1}(i)}{\pi(i)}=\frac{\mu K^m(i-1)}{\pi(i-1)}q_i+\frac{\mu K^m(i)}{\pi(i)}r_i+\frac{\mu K^m(i+1)}{\pi(i+1)}p_i,\qquad\forall i.
\]
By induction, if $\mu K^m(i)/\pi(i)\le\mu K^m(i+1)/\pi(i+1)$ for $0\le i<n$, then
\[
\begin{aligned}
\frac{\mu K^{m+1}(i+1)}{\pi(i+1)}
&=\frac{\mu K^m(i)}{\pi(i)}q_{i+1}+\frac{\mu K^m(i+1)}{\pi(i+1)}r_{i+1}+\frac{\mu K^m(i+2)}{\pi(i+2)}p_{i+1}\\
&\ge\frac{\mu K^m(i)}{\pi(i)}q_{i+1}+\frac{\mu K^m(i+1)}{\pi(i+1)}(1-q_{i+1})\\
&\ge\frac{\mu K^m(i)}{\pi(i)}(1-p_i)+\frac{\mu K^m(i+1)}{\pi(i+1)}p_i
\ge\frac{\mu K^{m+1}(i)}{\pi(i)}.
\end{aligned}
\]
Remark 3.12. Lemma 3.16 is also developed in [10] in which it is shown that, for any non-negative function f , Kmf is non-decreasing if f is non-decreasing for all m ≥ 0. Consider the adjoint chain K∗ of K in L2(π). As birth and death chains are reversible, one has K∗ = K. Using the identityµK/π = K∗(µ/π), it is easy to see that the above proof is consistent with the proof in [10].
Proof of Theorem 3.15. Assume that $K$ is irreducible and let $\lambda_0=0<\lambda_1<\cdots<\lambda_n$ be the eigenvalues of $I-K$ with $L^2(\pi)$-normalized eigenvectors $\zeta_0=1,\dots,\zeta_n$. By Lemma 3.16, if $\mu$ satisfies $\mu(i)/\pi(i)\ge\mu(i+1)/\pi(i+1)$ for $0\le i<n$, then
\[
d^c_{\mathrm{sep}}(\mu,t)=1-\frac{\mu H_t(n)}{\pi(n)}=-\sum_{j=1}^{n}\mu(\zeta_j)\zeta_j(n)e^{-\lambda_j t},
\]
where $\mu(\zeta_j)=\sum_{i=0}^{n}\zeta_j(i)\mu(i)$. If $K$ satisfies $p_i+q_{i+1}\le 1$ for all $0\le i<n$, then
\[
d_{\mathrm{sep}}(\mu,m)=1-\frac{\mu K^m(n)}{\pi(n)}=-\sum_{j=1}^{n}\mu(\zeta_j)\zeta_j(n)(1-\lambda_j)^m.
\]
By Proposition 3.12, setting $\mu$ to be one of the Dirac measures $\delta_0,\delta_n$ leads to the desired identities.
4. Paths of infinite length
In this section, the graph $G=(V,E)$ under consideration is infinite with $V=\{1,2,\dots\}$ and $E=\{\{i,i+1\}\mid i=1,2,\dots\}$. As before, let $\pi,\nu$ be positive measures on $V,E$ satisfying $\pi(V)=1$. The Dirichlet form and the variance are defined in a similar way as in the introduction and the spectral gap of $G$ with respect to $\pi,\nu$ is given by
\[
\lambda_{\pi,\nu}^G=\inf\Bigl\{\frac{\mathcal{E}_\nu(f,f)}{\mathrm{Var}_\pi(f)}\Bigm| f \text{ is non-constant and } \pi(f^2)<\infty\Bigr\}.
\]
For $n\ge 2$, let $G_n=(V_n,E_n)$ be the subgraph of $G$ with $V_n=\{1,2,\dots,n\}$, $E_n=\{\{i,i+1\}\mid 1\le i<n\}$ and let $\pi_n,\nu_n$ be the normalized restrictions of $\pi,\nu$ to $V_n,E_n$. That is, $\pi_n(i)=c_n\pi(i)$, $\nu_n(i,i+1)=c_n\nu(i,i+1)$ with $c_n=1/[\pi(1)+\cdots+\pi(n)]$. As before, let $M_{\pi,\nu}^G$ be an infinite matrix indexed by $V$ and defined by
\[
M_{\pi,\nu}^G(i,j)=-\frac{\nu(i,j)}{\pi(i)},\ \forall |i-j|=1,\qquad
M_{\pi,\nu}^G(i,i)=\frac{\nu(i-1,i)+\nu(i,i+1)}{\pi(i)}. \tag{4.1}
\]
Clearly, $M_{\pi_n,\nu_n}^{G_n}$ is the principal submatrix of $M_{\pi,\nu}^G$ indexed by $V_n\times V_n$.
Lemma 4.1. Referring to the above setting, $\lambda_{\pi_{n+1},\nu_{n+1}}^{G_{n+1}}<\lambda_{\pi_n,\nu_n}^{G_n}$ for $n>1$ and $\lambda_{\pi,\nu}^G=\lim_{n\to\infty}\lambda_{\pi_n,\nu_n}^{G_n}$.
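Proof. Briefly, we write $\lambda$ for $\lambda_{\pi,\nu}^G$ and $\lambda_n$ for $\lambda_{\pi_n,\nu_n}^{G_n}$. Note that $\lambda_n$ is the smallest non-zero eigenvalue of the principal submatrix of $M_{\pi,\nu}^G$ indexed by $V_n\times V_n$. As a consequence of Proposition 3.11(1) and Remark 3.9, $\lambda_{n+1}<\lambda_n$. For $n>1$, let $\phi_n$ be a minimizer for $\lambda_n$ and define $\psi_n(i)=\mathbf{1}_{V_n}(i)\phi_n(i)$ for $i\ge 1$. Clearly, one has $\mathcal{E}_{\nu_n}(\phi_n,\phi_n)=c_n\mathcal{E}_\nu(\psi_n,\psi_n)$ and $\mathrm{Var}_{\pi_n}(\phi_n)=c_n\mathrm{Var}_\pi(\psi_n)$. This implies $\lambda\le\lambda_n$ for $n\ge 2$. Let $\lambda^*=\lim_{n\to\infty}\lambda_n$. Note that it remains to show $\lambda^*=\lambda$. For $\epsilon>0$, choose a function $f$ on $V$ such that $\mathcal{E}_\nu(f,f)<(\lambda+\epsilon/2)\mathrm{Var}_\pi(f)$ with $\pi(f^2)<\infty$. For $\delta>0$, we choose $N>0$ such that $\mathrm{Var}_{\pi_N}(g)>(1-\delta)\mathrm{Var}_\pi(f)$ and $\mathcal{E}_{\nu_N}(g,g)<(1+\delta)\mathcal{E}_\nu(f,f)$, where $g=f|_{V_N}$, the restriction of $f$ to $V_N$. This implies
\[
\lambda^*\le\lambda_N\le\frac{\mathcal{E}_{\nu_N}(g,g)}{\mathrm{Var}_{\pi_N}(g)}\le\frac{(1+\delta)\mathcal{E}_\nu(f,f)}{(1-\delta)\mathrm{Var}_\pi(f)}.
\]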
Remark 4.1. Silver [17] contains a discussion of the (weak*) convergence of the spectral measure for $G_n$ to the spectral measure for $G$ in a very general setting. Lemma 4.1 can also be proved using Theorem 4.3.4 in [17].
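Proposition 4.2. For $\lambda>0$, let $\phi_\lambda(1)=-1$ and
\[
\phi_\lambda(i+1)=\phi_\lambda(i)+\frac{\{[\phi_\lambda(i)-\phi_\lambda(i-1)]\nu(i-1,i)-\lambda\pi(i)\phi_\lambda(i)\}^+}{\nu(i,i+1)},\qquad\forall i\ge 1.
\]
Set $\lambda_1=\infty$ and $\lambda_n=\lambda_{\pi_n,\nu_n}^{G_n}$ for $n\ge 2$.
(1) For $i\ge 2$ and $\lambda\in[\lambda_i,\lambda_{i-1})$, $\phi_\lambda(i-1)<\phi_\lambda(i)=\phi_\lambda(i+1)$.
(2) For $\lambda\in(0,\lambda_{\pi,\nu}^G]$, $\phi_\lambda(i)<\phi_\lambda(i+1)$ for all $i\ge 1$.
Proof. Immediate from Proposition 3.11 and Remarks 3.9–3.10.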
Remark 4.2. By Proposition 4.2, one may generate a dichotomy algorithm for $\lambda_{\pi,\nu}^G$ using the shape of $\phi_\lambda$. See (Di).
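To make the dichotomy idea concrete, here is a minimal sketch for a finite truncation $G_n$; it is not the algorithm (Di) of the paper, which is not reproduced in this section. It runs the recursion of Proposition 4.2 on vertices $1,\dots,n$ and looks at the quantity $[\phi_\lambda(n)-\phi_\lambda(n-1)]\nu(n-1,n)-\lambda\pi(n)\phi_\lambda(n)$, which would drive the next increment if the path were extended. By Proposition 4.2(1) this quantity is positive exactly when $\lambda<\lambda_n$, so bisection on its sign locates the spectral gap $\lambda_n$ of $G_n$. The function names, the upper bound `hi` and the test chain (the simple random walk of Example 3.1) are assumptions made for illustration; the normalizing constant $c_n$ cancels and is omitted.

```python
import math

def boundary_flux(lam, pi, nu):
    """Run phi_lambda on vertices 1..n (0-based lists: pi[k] = pi(k+1),
    nu[k] = nu(k+1, k+2), len(nu) = len(pi) - 1) and return
    [phi(n) - phi(n-1)] nu(n-1, n) - lam pi(n) phi(n)."""
    phi = -1.0
    flux = 0.0  # [phi(i) - phi(i-1)] nu(i-1, i), zero at i = 1
    for i in range(len(pi) - 1):
        flux = max(flux - lam * pi[i] * phi, 0.0)  # positive part
        phi += flux / nu[i]
    return flux - lam * pi[-1] * phi

def gap_by_bisection(pi, nu, hi, iters=60):
    """Bisection on (0, hi); hi is assumed to exceed the spectral gap."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if boundary_flux(mid, pi, nu) > 0.0:
            lo = mid  # phi would still increase: mid is below the gap
        else:
            hi = mid
    return 0.5 * (lo + hi)

n = 6
pi = [1.0 / n] * n
nu = [1.0 / (2 * n)] * (n - 1)
print(gap_by_bisection(pi, nu, hi=2.0), 1.0 - math.cos(math.pi / n))
```

Each evaluation of `boundary_flux` costs order $n$ operations, so the accuracy after a fixed number of bisection steps does not depend on $n$ beyond this linear cost.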
The following theorem extends Theorem 1.1 to infinite paths.
Theorem 4.3. If $\lambda_{\pi,\nu}^G>0$ and $\mathcal{E}_\nu(\psi,\psi)/\mathrm{Var}_\pi(\psi)=\lambda_{\pi,\nu}^G$ for some function $\psi$ on $V$ with $\pi(\psi)=0$, then $\psi$ is strictly monotonic and satisfies
\[
\lambda_{\pi,\nu}^G\pi(i)\psi(i)=[\psi(i)-\psi(i+1)]\nu(i,i+1)+[\psi(i)-\psi(i-1)]\nu(i-1,i),\qquad\forall i\ge 1.
\]
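Theorem 4.4. For $\lambda>0$, let $\phi_\lambda$ be the function in Proposition 4.2 and set $L(\lambda)=\mathcal{E}_\nu(\phi_\lambda,\phi_\lambda)/\mathrm{Var}_\pi(\phi_\lambda)$. Then,
(1) $\lambda_{\pi,\nu}^G<L(\lambda)<\lambda$ for $\lambda\in(\lambda_{\pi,\nu}^G,\infty)$.
(2) $L^n(\lambda)\to\lambda_{\pi,\nu}^G$ as $n\to\infty$ for $\lambda\in(\lambda_{\pi,\nu}^G,\infty)$.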
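Proof. Let $\lambda>\lambda_{\pi,\nu}^G$. By Lemma 4.1, $\lambda_i\le\lambda<\lambda_{i-1}$ for some $i\ge 2$. By Proposition 4.2(1), one has $\phi_\lambda(i-1)<\phi_\lambda(i)=\phi_\lambda(i+1)$. As in (2.2), we obtain
\[
L(\lambda)=\lambda+\frac{\lambda\pi(\phi_\lambda)[\pi(\phi_\lambda)-\phi_\lambda(i)]}{\mathrm{Var}_\pi(\phi_\lambda)},\qquad
\sum_{j=1}^{i}\phi_\lambda(j)\pi(j)\ge 0.
\]
This leads to $\pi(\phi_\lambda)>0$ and $\pi(\phi_\lambda)<\phi_\lambda(i)$, which implies $L(\lambda)<\lambda$. That means $L$ has no fixed point on $(\lambda_{\pi,\nu}^G,\infty)$. The lower bound of (1) follows immediately from Theorem 4.3. For (2), set $\lambda^*=\lim_{n\to\infty}L^n(\lambda)\ge\lambda_{\pi,\nu}^G$. As a consequence of (1), $L$ is continuous on $(\lambda_{\pi,\nu}^G,\infty)$. If $\lambda^*>\lambda_{\pi,\nu}^G$, then $\lambda^*$ is a fixed point of $L$, a contradiction. Hence, $\lambda^*=\lambda_{\pi,\nu}^G$.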
5. A numerical experiment
In this section, we illustrate the algorithm (A2) on a specific Metropolis chain. The Metropolis algorithm introduced by Metropolis et al. in 1953 is a widely used construction that produces a Markov chain with a given stationary distribution $\pi$. Let $\pi$ be a positive probability measure on $V$ and $K$ be an irreducible Markov transition matrix on $V$. For simplicity, we assume that $K(x,y)=K(y,x)$ for all $x,y\in V$. The Metropolis chain evolves in the following way. Given the initial state $x$, select a state, say $y$, according to $K(x,\cdot)$ and compute the ratio $A(x,y)=\pi(y)/\pi(x)$. If $A(x,y)\ge 1$, then move to $y$. If $A(x,y)<1$, then flip a coin that lands on heads with probability $A(x,y)$ and move to $y$ if heads appears. If the coin lands on tails, stay at $x$. Accordingly, if $M$ is the transition matrix of the Metropolis chain, then
\[
M(x,y)=
\begin{cases}
K(x,y) & \text{if } A(x,y)\ge 1,\ x\ne y,\\
K(x,y)A(x,y) & \text{if } A(x,y)<1,\\
K(x,x)+\sum_{z:\,A(x,z)<1}K(x,z)(1-A(x,z)) & \text{if } x=y.
\end{cases}
\]
It is easy to check that $\pi(x)M(x,y)=\pi(y)M(y,x)$. As $K$ is irreducible, $M$ is irreducible. Moreover, if $\pi$ is not uniform, then $M(x,x)>0$ for some $x\in V$. This implies that $M$ is aperiodic and, consequently, $M^t(x,y)\to\pi(y)$ and $e^{-t(I-M)}(x,y)\to\pi(y)$ as $t\to\infty$. For further information on Metropolis chains, see [6] and the references therein.
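The construction above translates almost line by line into code. The following sketch builds $M$ from a symmetric proposal matrix $K$ and a positive probability vector $\pi$, and then applies it to the simple random walk on $\{-n,\dots,n\}$ with holding probability 1/2 at the end points and target weights proportional to $(|i|+1)^a$. The function names and the particular values of $n$ and $a$ are illustrative assumptions.

```python
def metropolis(K, pi):
    """Metropolis transition matrix for target pi and symmetric proposal K."""
    size = len(pi)
    M = [[0.0] * size for _ in range(size)]
    for x in range(size):
        moved = 0.0
        for y in range(size):
            if y != x:
                # accept a proposed move x -> y with probability min(1, pi(y)/pi(x))
                M[x][y] = K[x][y] * min(1.0, pi[y] / pi[x])
                moved += M[x][y]
        M[x][x] = 1.0 - moved  # rejected proposals and K(x, x) stay at x
    return M

def srw_symmetric_interval(n):
    """Simple random walk on {-n, ..., n}; state i is stored at index i + n."""
    size = 2 * n + 1
    K = [[0.0] * size for _ in range(size)]
    for s in range(size):
        for t in (s - 1, s + 1):
            if 0 <= t < size:
                K[s][t] = 0.5
            else:
                K[s][s] += 0.5
    return K

n, a = 3, 2.0  # arbitrary illustrative parameters
weights = [(abs(i) + 1.0) ** a for i in range(-n, n + 1)]
total = sum(weights)
pi = [w / total for w in weights]
M = metropolis(srw_symmetric_interval(n), pi)
for row in M:
    print([round(v, 4) for v in row])
```

Setting the diagonal to one minus the off-diagonal row sum is equivalent to the third case of the displayed formula because the rows of $K$ sum to 1.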
For $n\ge 1$, let $G_n=(V_n,E_n)$ be a graph with $V_n=\{0,\pm 1,\dots,\pm n\}$ and $E_n=\{\{i,i+1\}: i=-n,\dots,n-1\}$. Suppose that $K_n$ is the transition matrix of the simple random walk on $V_n$, that is, $K_n(-n,-n)=K_n(n,n)=1/2$ and $K_n(i,i+1)=K_n(i+1,i)=1/2$ for all $-n\le i<n$. For $a>0$, let $\check\pi_{n,a},\hat\pi_{n,a}$ be probabilities on $V_n=\{0,\pm 1,\dots,\pm n\}$ given by
\[
\check\pi_{n,a}(i)=\check c_{n,a}(|i|+1)^a,\qquad \hat\pi_{n,a}(i)=\hat c_{n,a}(n-|i|+1)^a,
\]
where $\check c_{n,a}$ and $\hat c_{n,a}$ are normalizing constants. It is easy to compute that
\[
c_{n,a}/2\le 1/\hat c_{n,a}<1/\check c_{n,a}\le 2c_{n,a},\qquad\text{where } c_{n,a}=\frac{(n+1)^{a+1}}{a+1}+(n+1)^a. \tag{5.1}
\]
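The Metropolis chains, $\check K_{n,a}$ and $\hat K_{n,a}$, for $\check\pi_{n,a}$ and $\hat\pi_{n,a}$ based on the simple random walk $K_n$ have transition matrices given by
\[
\check K_{n,a}(i,j)=\check K_{n,a}(-i,-j),\qquad \hat K_{n,a}(i,j)=\hat K_{n,a}(-i,-j)
\]
and
\[
\check K_{n,a}(i,j)=
\begin{cases}
\dfrac{1}{2} & \text{if } j=i+1,\ i\in[0,n-1],\\[4pt]
\dfrac{i^a}{2(i+1)^a} & \text{if } j=i-1,\ i\in[1,n],\\[4pt]
\dfrac{(i+1)^a-i^a}{2(i+1)^a} & \text{if } j=i,\ i\notin\{0,n\},\\[4pt]
1-\dfrac{n^a}{2(n+1)^a} & \text{if } i=j=n,
\end{cases}
\]
and
\[
\hat K_{n,a}(i,j)=
\begin{cases}
\dfrac{1}{2} & \text{if } j=i-1,\ i\in[1,n],\\[4pt]
\dfrac{(n-i)^a}{2(n-i+1)^a} & \text{if } j=i+1,\ i\in[0,n-1],\\[4pt]
\dfrac{(n-i+1)^a-(n-i)^a}{2(n-i+1)^a} & \text{if } j=i\ne 0,\\[4pt]
1-\dfrac{n^a}{(n+1)^a} & \text{if } i=j=0.
\end{cases}
\]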
Saloff-Coste [16] discussed the above chains and obtained the correct order of the spectral gaps. Let $\check\lambda_{n,a},\hat\lambda_{n,a}$ denote the spectral gaps of $\check K_{n,a},\hat K_{n,a}$. Referring to the recent work in [4], one has
\[
1/(4C)\le\lambda\le 1/C,
\]
where $(\lambda,C)$ is any of $(\check\lambda_{n,a},\check C_n(a))$ and $(\hat\lambda_{n,a},\hat C_n(a))$, and
\[
\check C_n(a)=2\max_{1\le i\le n}\Bigl(\sum_{j=0}^{i-1}(j+1)^{-a}\Bigr)\Bigl(\sum_{j=i}^{n}(j+1)^{a}\Bigr),\qquad
\hat C_n(a)=2\max_{1\le i\le n}\Bigl(\sum_{j=0}^{i-1}(j+1)^{a}\Bigr)\Bigl(\sum_{j=i-1}^{n-1}(j+1)^{-a}\Bigr).
\]
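Theorem 5.1. Let $\check\lambda_{n,a},\hat\lambda_{n,a}$ be the spectral gaps for $\check K_{n,a},\hat K_{n,a}$. Then,
\[
\frac{1}{8\eta_{-a}(1,n)\eta_a(2,n+1)}\le\check\lambda_{n,a}\le\frac{2}{\eta_{-a}(1,n)\eta_a(2,n+1)},
\]
and
\[
\frac{1}{64\,\eta_a(1,\lceil n/2\rceil)\eta_{-a}(\lceil n/2\rceil,n)}\le\hat\lambda_{n,a}\le\frac{1}{2\,\eta_a(1,\lceil n/2\rceil)\eta_{-a}(\lceil n/2\rceil,n)},
\]
where $\eta_a(k,l)=\sum_{i=k}^{l}i^a$.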
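The first pair of bounds rests on comparing $\check C_n(a)$ with the product $\eta_{-a}(1,n)\eta_a(2,n+1)$. A small sketch, with hypothetical function names, evaluates both quantities and the resulting interval for $\check\lambda_{n,a}$ obtained from $1/(4C)\le\lambda\le 1/C$. It uses the rewriting $\sum_{j=0}^{i-1}(j+1)^{-a}=\eta_{-a}(1,i)$ and $\sum_{j=i}^{n}(j+1)^{a}=\eta_a(i+1,n+1)$.

```python
def eta(a, k, l):
    """eta_a(k, l) = sum_{i=k}^{l} i^a."""
    return sum(float(i) ** a for i in range(k, l + 1))

def C_check(n, a):
    """2 * max_{1<=i<=n} eta_{-a}(1, i) * eta_a(i+1, n+1)."""
    return 2.0 * max(eta(-a, 1, i) * eta(a, i + 1, n + 1) for i in range(1, n + 1))

def gap_interval_check(n, a):
    """Interval [1/(8 P), 2/P] with P = eta_{-a}(1, n) * eta_a(2, n+1)."""
    P = eta(-a, 1, n) * eta(a, 2, n + 1)
    return 1.0 / (8.0 * P), 2.0 / P

for n, a in [(5, 0.5), (10, 1.0), (20, 2.0)]:
    P = eta(-a, 1, n) * eta(a, 2, n + 1)
    print(n, a, P / 2.0, C_check(n, a), 2.0 * P, gap_interval_check(n, a))
```

In each printed row the middle value $\check C_n(a)$ should lie between $P/2$ and $2P$, which is the inequality used in the proof.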
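Proof of Theorem 5.1. The bound for $\check\lambda_{n,a}$ follows immediately from the fact
\[
\frac{\eta_{-a}(1,n)\eta_a(2,n+1)}{2}\le\check C_n(a)\le 2\eta_{-a}(1,n)\eta_a(2,n+1).
\]
For $\hat\lambda_{n,a}$, note that
\[
\hat C_n(a)=2\max_{n/2\le i\le n}\Bigl(\sum_{j=0}^{i-1}(j+1)^{a}\Bigr)\Bigl(\sum_{j=i-1}^{n-1}(j+1)^{-a}\Bigr).
\]
Taking $i=\lceil n/2\rceil$ yields the upper bound. For the lower bound, we write
\[
\hat C_n(a)=2\max_{n/2\le i\le n}\Bigl(\sum_{j=0}^{i-1}\Bigl(1-\frac{j}{i}\Bigr)^a\Bigr)\Bigl(\sum_{j=0}^{n-i}\Bigl(1-\frac{j}{i+j}\Bigr)^a\Bigr).
\]
For $i\ge n/2$, it is clear that
\[
\sum_{j=0}^{i-1}\Bigl(1-\frac{j}{i}\Bigr)^a\ge\sum_{j=0}^{i-1}\Bigl(1-\frac{2j}{n}\Bigr)^a\ge\frac{1}{2}\sum_{j=0}^{n-1}\Bigl(1-\frac{j}{n}\Bigr)^a.
\]
Observe that, for $a>0$,
\[
\frac{C'_{i,n}(a)}{2}\le\sum_{j=0}^{n-i}\Bigl(1-\frac{j}{i+j}\Bigr)^a\le C'_{i,n}(a), \tag{5.2}
\]
where
\[
C'_{i,n}(a)=1+
\begin{cases}
\dfrac{i\bigl((i/n)^{a-1}-1\bigr)}{1-a} & \text{if } a\ne 1,\\[6pt]
i\log\dfrac{n}{i} & \text{if } a=1.
\end{cases}
\]
It is clear that, for $i\ge n/2$, $C'_{i,n}(a)\le 2C'_{\lceil n/2\rceil,n}(a)$ and this leads to
\[
\sum_{j=0}^{n-i}\Bigl(1-\frac{j}{i+j}\Bigr)^a\le 4\sum_{j=0}^{n-\lceil n/2\rceil}\Bigl(1-\frac{j}{\lceil n/2\rceil+j}\Bigr)^a.
\]
Summarizing all of the above gives the desired lower bound.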
Remark 5.1. Compared with [16, Theorem 9.5], the bounds for $\check\lambda_{n,a}$ given in Theorem 5.1 have a similar lower bound and an upper bound improved by a multiple of about 1/4. For $\hat\lambda_{n,a}$, observe that
\[
\frac{C''_i(a)}{2}\le\sum_{j=0}^{i-1}\Bigl(1-\frac{j}{i}\Bigr)^a\le C''_i(a),\qquad\text{where } C''_i(a)=1+\frac{i-i^{-a}}{1+a}.
\]
Recall the constant $C'_{i,n}(a)$ in the proof of Theorem 5.1. Note that
\[
\frac{n+a}{2(1+a)}\le C''_{\lceil n/2\rceil}(a)\le\frac{2(n+a)}{1+a},
\]
and, for $a>0$, $a\ne 1$ and $n\ge 3$,
\[
C'_{\lceil n/2\rceil,n}(a)\le 1+\frac{n+1}{2(1+a)}\sup_{a>0,\,a\ne 1}\frac{(2^{1-a}-1)(1+a)}{1-a}\le\frac{3(n+a)}{1+a},
\]
where the last inequality is obtained by considering the subcases $a<2$ and $a\ge 2$. The above computation also applies for $a=1$ and $n\in\{1,2\}$. In the same spirit, one can show that $C'_{\lceil n/2\rceil,n}(a)\ge\frac{n+a}{6(1+a)}$. This yields
\[
\frac{(n+a)^2}{6(1+a)^2}\le\hat C_n(a)\le\frac{12(n+a)^2}{(1+a)^2},\qquad\forall n\ge 1. \tag{5.3}
\]
Hence, we have $\hat\lambda_{n,a}\asymp(1+a)^2/(n+a)^2$. As a consequence of (5.1) and (5.2), we obtain that,
uniformly for $a>0$,
\[
1/\check\lambda_{n,a}\asymp n^a\Bigl[\Bigl(1+\frac{1}{n}\Bigr)^a+\frac{n}{1+a}\Bigr](1+v(n,a))\qquad\text{as } n\to\infty,
\]
where $v(n,1)=\log n$ and $v(n,a)=(n^{1-a}-1)/(1-a)$ for $a\ne 1$.
Remark 5.2. Note that the lower bound inTheorem 6.1provides the correct order of the spectral gap for the chain ˇKn,auniformly in a but not for ˆKn,a. For instance, if a grows with n, say a = n,