Jacobi-Davidson methods for cubic eigenvalue problems

(1)

Published online 6 September 2004 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/nla.423

Jacobi–Davidson methods for cubic eigenvalue problems

Tsung-Min Hwang

1

, Wen-Wei Lin

2;∗;†

, Jinn-Liang Liu

3

and Weichung Wang

4

1_{Department of Mathematics; National Taiwan Normal University; Taipei 116; Taiwan} 2_{Department of Mathematics; National Tsing Hua University; Hsinchu 300; Taiwan} 3_{Department of Applied Mathematics; National Chiao Tung University; Hsinchu 300; Taiwan} 4_{Department of Applied Mathematics; National University of Kaohsiung; Kaohsiung 811; Taiwan}

SUMMARY

Several Jacobi–Davidson type methods are proposed for computing interior eigenpairs of large-scale cubic eigenvalue problems. To successively compute the eigenpairs, a novel explicit non-equivalence deation method with low-rank updates is developed and analysed. Various techniques such as locking, search direction transformation, restarting, and preconditioning are incorporated into the methods to im-prove stability and eciency. A semiconductor quantum dot model is given as an example to illustrate the cubic nature of the eigenvalue system resulting from the nite dierence approximation. Numeri-cal results of this model are given to demonstrate the convergence and eectiveness of the methods. Comparison results are also provided to indicate advantages and disadvantages among the various

methods. Copyright ?2004 John Wiley & Sons, Ltd.

KEY WORDS: cubic eigenvalue problem; cubic Jacobi–Davidson method; non-equivalence deation; 3D Schrodinger equation

1. INTRODUCTION A cubic eigenvalue problem of order n can be dened as

A()F≡(3A3+ 2A2+ A1+ A0)F = 0 (1) where ∈ C, F∈ Cn, and Ai∈ Rn×n for i = 0; 1; 2; 3. In applications, a set of the eigenvalues embedded in the interior of the spectrum of a large-scale eigenvalue problem are often of interest. For example, a semiconductor quantum dot model with non-parabolic band structure described by the three-dimensional (3D) Schrodinger equation [1–3] can result in a cubic eigenvalue problem of (1) with order up to 211 400 from the nite dierence approximation

∗_{Correspondence to: W.-W. Lin, Department of Mathematics; National Tsing Hua University; Hsinchu 300; Taiwan.} †_{E-mail: [email protected]}

Contract=grant sponsor: National Science Council

(2)

(see Section 3). And we are concerned only with several smallest positive real eigenvalues (energy states) and their associated eigenvectors (wave functions). Motivated by this model, various methods based on the Jacobi–Davidson (JD) and explicit deation techniques are proposed here for calculating the interior eigenpairs of the cubic eigenvalue problem (1).

A classical approach that can be used for computing the solutions of (1) is to consider the linearization of (1), ⎡ ⎢ ⎢ ⎣ 0 I 0 0 0 I A0 A1 A2 ⎤ ⎥ ⎥ ⎦ ⎡ ⎢ ⎢ ⎣ F F 2_F ⎤ ⎥ ⎥ ⎦= ⎡ ⎢ ⎢ ⎣ I 0 0 0 I 0 0 0 −A3 ⎤ ⎥ ⎥ ⎦ ⎡ ⎢ ⎢ ⎣ F F 2_F ⎤ ⎥ ⎥ ⎦ (2)

This enlarged linear eigenvalue problem can then be solved by various Lanczos or Arnoldi methods [4]. These methods are well established in many aspects of numerical algorithms, convergence properties, and stability analysis [5–7]. However, disadvantages of such an approach still exist. First of all, the order of the matrix is tripled and its condition number may increase signicantly since the set of admissible perturbations for (2) is larger than that of (1) [8]. Secondly, the performance of these methods may be reduced for the enlarged problem in terms of convergence, eciency, and accuracy. Thirdly, Lanczos and Arnoldi methods re-quire the use of the shift-and-invert technique for such a large sparse eigenvalue problem since the desired eigenpairs are located in the interior of the spectrum of the problem. Consequently, the computational cost for solving linear system is excessive.

Another approach is a direct solution of (1) by means of the JD method. Although this method has been developed for linear and quadratic eigenvalue problems [4, 9–12], it is far less studied than its classical counterpart. To our knowledge, there appears no numerical algorithms or computational experiences being reported in the literature for the cubic eigenvalue problems. In this paper, we extend the JD method presented in References [4, 9–12] to solve the cubic problems and propose various forms of the method to improve stability and eciency in calculating the interior eigenpairs.

In order to compute the interior eigenpairs successively, it is necessary to incorporate the JD method with some deation techniques. For linear eigenvalue problems, it is well known that a combination of JD and implicit deation techniques based on the Schur form can lead to eective algorithms (see e.g. Reference [4, Sections 4.7 and 8.4]). For quadratic eigen-value problems, Meerbergen [13] proposes a JD method by using the locking and restarting scheme based on the Schur form of the linearized problem. This method illustrates the es-sential ingredients for the extension of the JD method from the linear case to the quadratic case. Furthermore, Guo et al. develop a deation method for large sparse quadratic eigenvalue problems [14] and examine several deation strategies for analytic non-defective matrix func-tion [15]. Ruhe [16] suggests using the smallest eigenvalue as an initial guess for computing the second eigenvalue and using the sum of the rst two eigenvalues as an initial guess for the third eigenvalue in Newton’s method.

However, it is not clear how to incorporate an implicit deation scheme with the JD method for the cubic eigenvalue problems since the Schur form is not dened for a cubic matrix pencil in general. We propose here a cubic version of the JD method and an explicit non-equivalence deation method with low-rank updates to deal with these problems. Several algorithms are then given to illustrate various modications of these two methods. The main

(3)

procedure of the algorithms is as follows. The standard cubic JD (CJD) method is rst used to nd the rst smallest eigenpair. The current eigenvalue is then deated to innity and a new (deated) cubic eigenvalue problem is subsequently formed. The CJD method itself or its variant is applied again for the next eigenpair. This procedure is repeated until all the desired eigenpairs are found.

The main results of this paper are briey summarized as follows:

• The explicit non-equivalence deation method is proved to deate the computed eigen-values to innity while all other unknown eigeneigen-values remain unchanged.

• Several variants of the CJD method are developed for the deated cubic eigenproblem to improve the stability and eciency of the method in cases that the two consecutive eigenvalues are too close to each other and that the computational cost is expensive due to the accumulative low-rank updates as deations increase.

• Among all the CJD methods, we nd that the combination of the CJD, the locking, and the explicit deation o (see Section 2) is shown numerically to be most robust and ecient in terms of accuracy and computational cost.

This paper is organized as follows. In Section 2, we rst present the cubic Jacobi–Davidson method and a primitive locking technique based on Reference [13] for computing the desired eigenvalues. An explicit non-equivalence deation method is then given and analysed for the rest of the desired eigenpairs. The variants of the CJD method for the deated cubic eigenproblem are also given in this section. In Section 3, the quantum dot model is described and discretized by the nite dierence method using non-uniform grids. A brief derivation of the resulting cubic eigenvalue problem (1) from the discretization then follows. Numerical results are given in Section 4. Some concluding remarks are made in Section 5. Note that throughout the paper, when we specify the order of an eigenpair such as the smallest (rst) positive eigenpair, we mean the smallest positive eigenvalue and the associated eigenvector.

2. CUBIC JACOBI–DAVIDSON AND EXPLICIT DEFLATION METHODS In this section, we rst present the CJD method incorporated with a locking technique for the desired eigenpairs in Section 2.1. The explicit non-equivalence deation method is presented and analysed in Section 2.2. The deation method is then generalized to deal with more practical situations to improve its stability and eciency in Section 2.3. We summarize and compare all the proposed algorithms in Section 2.4.

2.1. A CJD method for desired eigenpairs

We rst propose a CJD method incorporated with a simple locking technique in Algorithm 2.1. The algorithm adopts the same framework of the quadratic JD method presented in References [17, 18]. The locking technique used here is similar to the techniques suggested in Reference [13] for quadratic eigenvalue problems. However, our locking scheme is rather primitive in the following sense. Unlike the schemes in Reference [13], we do not perform the reordering of the partial Schur form. We simply append the convergent eigenvectors into the trial subspace V = [Vini] as shown in Steps (2.3) and (2.4) of Algorithm 2.1.

(4)

Algorithm 2.1 (CJD-Lk).

CJD method with locking for cubic eigenproblem.

(0)Given A() = 3i=0iAi and the number k of desired eigenvalues.

(1)Choose an n-by-m orthonormal matrix V = [Vini] and set VF= [].

(2)For ‘ = 1; : : : ; k

(2.1)Compute Wi= AiV and Mi= V∗Wi for i = 0; : : : ; 3. (2.2)Iterate until convergence

(i) Compute the eigenpairs (; s) of (3_{i = 0}iMi)s = 0

by using QZ algorithm [6] for solving the generalized linear eigenproblem ⎡ ⎢ ⎢ ⎣ 0 I 0 0 0 I M0 M1 M2 ⎤ ⎥ ⎥ ⎦ ⎡ ⎢ ⎣ s s 2_s ⎤ ⎥ ⎦= ⎡ ⎢ ⎢ ⎣ I I −M3 ⎤ ⎥ ⎥ ⎦ ⎡ ⎢ ⎢ ⎣ s s 2_s ⎤ ⎥ ⎥ ⎦

(ii) Select the desired (target) eigenpair (; s) with s2 = 1:

(iii) Compute u = Vs, p = A_{()u, r = A()u:}

(iv) If (r2¡ ), Set ‘= , F‘= u, Goto Locking Steps (2.3) and (2.4).

(v) Solve (approximately) for t ⊥u from

I−pu_u_∗_p∗

A()(I−uu∗)t = −r

(vi) Orthogonalize t against V , v = t=t2:

(vii) Compute wi= Aiv, Mi= _M i V∗wi v∗_W_i _v∗_w_i for i = 0; : : : ; 3. (viii)Expand V = [V; v] and Wi= [Wi; wi]

(2.3)Orthogonalize _F‘ against VF; Compute F‘= F‘=F‘ ;

Update VF= [VF; F‘].

(2.4)Choose an orthonormal matrix Vini⊥VF; Set V = [VF; Vini].

End for

(3)Output the approximated eigenpairs (‘; F‘) for ‘ = 1; : : : ; k.

It is worth mentioning following practical considerations. As suggested in References [11, 12], the correction equation

I−pu_u_∗_p∗

A()(I−uu∗)t = −r (3)

needs to be solved. Since the vector t is supposed to be orthogonal to the vector u, Equation (3) can be rewritten as

(5)

with

= u∗A()−1r u∗_A()−1_p

In Step (2.2.v) of Algorithm 2.1, the correction equation (3) is solved approximately by choosing a preconditioner M_A≈A() so that the vector t is approximated by

t≈ −M−1

A r + MA−1p (5)

Since t is ideally orthogonal to the vector u, the scalar can be obtained by = u∗MA−1r

u∗_M−1

A p

(6) In Section 4, we give some suggestions on how to choose the preconditioner M_A for the model problem. The numerical results show that the algorithm can be very ecient if the preconditioner is suitably chosen.

2.2. An explicit non-equivalence deation method

After the smallest, or a few smallest, positive eigenpairs have been computed, we proceed to compute the rest of eigenpairs by an explicit non-equivalence deation method in a consecutive manner. This method is modied from that of Reference [14].

Let (; VF)∈ Rr×r × Rn×r be an eigenmatrix pair of A() with VFTVF= Ir and 0 =∈(), where () denotes the spectrum of . In other words, we have

A3VF3+ A2VF2+ A1VF + A0VF= 0 (7) Now we dene a new deated cubic eigenvalue problem by

˜ A()F = (3A˜3+ 2A˜2+ ˜A1+ ˜A0)F = 0 (8) where ˜ A0= A0 ˜ A1= A1−(A1VFVFT+ A2VFVFT+ A3VF2VFT) ˜ A2= A2−(A2VFVFT+ A3VFVFT) ˜ A3= A3−A3VFVFT (9)

Note that the superscript tilde is used to denote the variant coecient matrices associated with the deated cubic eigenvalue problem. In the following we rst prove a useful lemma and then, in Theorem 2, we show that the computed eigenvalues are deated to innity in the new deated cubic eigenproblem ˜A() while the rest of the unknown eigenvalues remain unchanged.

Lemma 1

Let A() and ˜A() be cubic pencils given by (1) and (8), respectively. Then it holds ˜

(6)

Proof

Using (9) and (7), and the fundamental matrix calculation, we have ˜

A() = A()−(2A3VFVFT+ A2VFVFT+ A3VFVFT+ A1VFVFT +A2VFVFT+ A3VF2VFT)

= A()−(A3VF(Ir−)3(Ir−)−1VFT

+3A3VF(Ir−)2(Ir−)−1VFT+ 3A3VF2(Ir−)(Ir−)−1VFT +A2VF(Ir−)2(Ir−)−1VFT+ 2A2VF(Ir−)(Ir−)−1VFT +A1VF(Ir−)(Ir−)−1VFT)

= A()−{[A3VF(3Ir−3) + A2VF(2Ir−2) + A1VF(Ir−) + A0VF −A0VF](Ir−)−1VFT}

= A()−[A()VF(Ir−)−1VFT] = A()[In−VF(Ir−)−1VFT]

Theorem 2

Let (; VF)∈ Rr×r× Rn×r be an eigenmatrix pair of A() as in (7) with VFTVF= Ir. Then (i) the new deated cubic pencil ˜A() in (8) has the same eigenvalues as those of A()

except that the r eigenvalues of are replaced by innity, i.e. ((A())())∪ {∞}= ( ˜A()):

(ii) Let (; z) be an eigenpair of A() with z2 = 1 and =∈(). Dene

˜z = (In−VF−1VFT)z≡T ()z (11) Then (; ˜z) is an eigenpair of ˜_A().

Proof

(i) Using the identity (see e.g. Reference [19, pp. 53]) det(In+ RS) = det(Im+ SR) and Lemma 1, we have

det( ˜A()) = det(A())det(In−VF(Ir−)−1VFT) = det(A())det(In−(Ir−)−1) = det(A())det(Ir−)−1det(−)

(7)

Since 0 =∈(), det(−)= 0. Thus, ˜A() and A() have the same nite spectrum except the eigenvalues in (). Furthermore, dividing Equation (8) by 3 _{and using the fact that}

˜

A3VF= (A3−A3VFVFT)VF= 0

we see that (diag_r{∞; : : : ;∞}; VF) is an eigenmatrix pair of ˜A() corresponding to innite eigenvalues.

(ii) Since =∈(), the matrix T () = (I−VF−1VFT) in (11) is invertible with the inverse T ()−1_{= I}_n₋_V_F_(I_r₋₎−1_VT

F (12)

From Lemma 1, we have ˜

A()˜z = A()[In−VF(Ir−)−1VFT][In−VF−1VFT]z = 0 This completes the proof.

Theorem 2 suggests that the explicit non-equivalence deation scheme can be applied re-peatedly to compute all desired interior eigenpairs. To be specic, Algorithm 2.1 is modied to achieve the goal as follows. We refer this modied algorithm as CJD-D.

1. The locking steps (2.3) and (2.4) in Algorithm 2.1 are replaced with the following two updating steps. Note that in (2.4), the convergent eigenvectors in VF are not appended to the trial subspace V .

(2.3)Orthonormalize _F‘ against current VF. Update VF= [VF; F‘] and by the upper triangular matrix in the Gram–Schmidt process. (2.4)Choose an orthonormal matrix Vini⊥VF. Set V = Vini.

2. In the rst iteration on Step (2), the matrices A() in (2.2.iii) and (2.2.v) are dened by the original eigenvalue problem (1). Starting from the second iteration, the matrices are dened by the deated system (8).

However, there are some drawbacks with the deation transformation matrix T () in (11). For example, if is close to the eigenvalue of , the matrix T () may be ill-conditioned and hence the transformation (11) may be inaccurate. Moreover, the computational cost for solving the deated cubic eigenproblem (8) becomes more expensive when the number of columns of VF in (9) is getting larger. Fortunately, the drawbacks can be avoided by the observations in the next subsection.

2.3. Variants of the CJD method for deated cubic eigenproblems

To overcome these disadvantages, the main idea is to avoid the use of the deated cubic eigenproblem ˜_{A() in (8) and the deation transformation T () in (11), directly. The goal can} be achieved by rewriting the correction equation in the CJD-D method involving the matrices

˜

A() and T () so that the new equivalent correction equation depends only on the original vectors and matrices. Consequently, the computational cost can be reduced signicantly and the scheme becomes more stable.

(8)

Using the CJD-D method for solving the deated cubic eigenproblem (8), we rst note that it is required to compute

˜

r = ˜A() ˜u and p = ˜A˜ () ˜u (13) where is a Ritz value. By the denition of T () ≡(I −VF−1VFT) and (12), Lemma 1 implies

˜

A()T () = A() (14)

Dierentiating ˜A() with respect to and using (14), we get ˜

A()T () = A()−A()T−1()T()

= A()−A(){VF[−−1+ (Ir−)−1−1]VFT} (from (12)) = A()−A()VF(Ir−)−1VFT = A()−(3A3+ 2A2+ A1)VF(Ir−)−1VF (from (7)) + (A3VF3+ A2VF2+ A1VF)(Ir−)−1VF = A()−[A3VF(2+ + 2Ir)VFT+ A2VF( + Ir)VFT+ A1VFVFT] (15) By dening u = T ()−1_u_˜ ₍₁₆₎

Theorem 2 (ii) shows that if (; ˜u) is an eigenpair of ˜A(), then the vector (; u) is an eigenpair of A(). Furthermore, from (14) and (16) the residual ˜r of the eigenpair (; ˜u) for the deated cubic eigenproblem can be rewritten as

˜

r = ˜A() ˜u = ˜A()T() u = A() u = r (17) which is also the residual of the eigenpair (; u) of the original cubic eigenproblem. Moreover, by (15) and (16), the skew orthogonalization vector ˜p in (13) for CJD-D method can then be computed by

˜

p = A₍₎_u₋_[A

3VF(2+ + 2Ir)VFT+ A2VF( + Ir)VFT+ A1VFVFT] u (18) In other words, by using (17) and (18) rather than (13), we can achieve signicant saving on computing ˜r and ˜p as the size of and VF becomes large.

We can further reduce the cost of computation of the vector ˜

t = ˜A−1() ˜r + ˜ ˜A−1() ˜p (19) by dening

(9)

and using (14) such that

t = −A−1() ˜r + ˜A−1() ˜p The vector t can therefore be approximated by

t≈ −M−1

A r + ˜M˜ A−1p˜ (21)

with a preconditioner MA≈A(), which is preferable since it is in general more cost ecient than the matrix MA˜≈A().˜

We next note that, by neglecting the low-rank updates in the deated matrix ˜A(), the matrices ˜Wi and the vectors ˜wi can also be computed by using the original matrices, i.e.

˜

Wi= AiV and w˜i= Aiv (22)

for i = 0; 1; 2; 3. This heuristic scheme results in that the Ritz vector u of A() can be obtained without using the transformation in (16). In other words, with Equations (16), (20) and (22), there is no need to explicitly compute ˜u and ˜t when applying Algorithm 2.1 to the deated cubic eigenproblem.

Finally, based on the previous observation, we consider two dierent choices of the param-eter ˜ for approximating the vector t in (21) for the deated cubic eigenproblem.

1. The vector ˜t dened in (19) should be orthogonal to the vector ˜u = T () u, i.e. ˜u∗t = 0.˜ Consequently, ˜ can be chosen as

˜ = ˜D= u ∗_T∗_{()T ()M}−1 A r˜ u∗_T∗_{()T ()M}−1 A p˜ (23) where T∗_{()T () = I}_n₋_V_F₍−T₊−1₋−T−1_)VT F

Here, the subscript ‘D’ in (23) is used to indicate that the vectors ˜u and ˜t involve the deation transformation T ().

2. Since we have simplied the computation of ˜t by replacing it with t, it is natural to require t dened in (21) be orthogonal to the vector u, i.e. u∗t = 0. We can thus choose

˜ = ˜O= u ∗_M−1 A r˜ u∗_M−1 A p˜ (24) By doing so, we further relax the need of computing T∗()T (). The subscript ‘O’ in (24) is used to emphasize that the computation of u and t involve only the original cubic eigenproblem.

In short, by introducing the vectors u and t, we have shown that the computation of ˜r, ˜

Wi and ˜wi in the process of applying the CJD-D method to the deated eigenproblem can involve only the original system A(). The vector ˜p computed by (18) still depends on the matrices and VF, but not on the transformation matrix T ().

We summarize previous discussions in the following algorithm for the computation of all desired eigenpairs of the deated cubic eigenproblem.

(10)

Algorithm 2.2 (CJD-Lk-D-D and CJD-Lk-D-O). CJD method with locking and variant explicit deations.

(0)Given A() = 3i=0iAi and the number k of desired eigenvalues.

(1)Choose an n-by-m orthonormal matrix V = [Vini]; Set VF= [] and = [].

(2)For ‘ = 1; : : : ; k

(2.1)Compute Wi= AiV and Mi= V∗Wi for i = 0; : : : ; 3. (2.2)Iterate until convergence

(i) Compute the eigenpairs (; s) of (3_{i = 0}iMi)s = 0 by

using QZ algorithm for solving the enlarged generalized linear eigenproblem as in (2.2.i) of Algorithm 2.1.

(ii) Select the desired (target) eigenpair (; s) with s2 = 1:

(iii) Compute u = Vs, r = A()u and p by Equation (18). (iv) If (r2¡ ), Set ‘= , F‘= u, Goto Locking

Steps (2.3), (2.4). (v) Compute t = −M_A−1r + M_A−1p by = D= u ∗_{T ()}∗_{T ()M}−1 A r u∗_{T ()}∗_{T ()M}−1 A p or = O= u ∗_M−1 A r u∗_M−1 A p

(vi) Orthogonalize t against V , v = t=t2:

(vii) Compute wi= Aiv, Mi= _M i V∗wi v∗_W_i _v∗_w_i for i = 0; : : : ; 3. (viii)Expand V = [V; v] and Wi= [Wi; wi].

(2.3)Update and VF by the Gram–Schmidt process. That is,

orthonormalize F‘ against current VF, expand VF= [VF; F‘], update by the upper triangular matrix in the Gram–Schmidt process. (2.4)Choose an orthonormal matrix Vini⊥VF; Set V = [VF; Vini].

End for

(3)Output the approximated eigenpairs (‘; F‘) for ‘ = 1; : : : ; k.

2.4. A summary of the algorithms

We have proposed several ideas for computing all desired interior eigenpairs of the cubic eigenvalue problems. These ideas have led to the following four algorithms. We discuss the advantages and disadvantages of the methods, which elaborate the motivations regarding the developments of the methods. Furthermore, these considerations will be veried by the numerical experiments in Section 4.

CJD-Lk (proposed in Section 2.1):

This method is described in Algorithm 2.1, which includes the primitive locking technique. In general, a Schur form does not exist for a cubic eigenvalue problem. Two spurious Ritz values (which have no meaning) thus will be obtained when the convergent eigenvectors are appended to the subspace V and the small cubic eigenvalue problems in step (2.2.i) of Algorithm 2.1 are solved. These two spurious Ritz values could aect the choice of the next desired eigenvalue. An incorrect choice of the Ritz value will slow down the overall convergence. Neglecting this disadvantage, however, CJD-Lk needs less computational cost.

(11)

CJD-D (proposed in Section 2.2):

Without the locking steps, this scheme solves the original cubic eigenvalue problem (1) and the deated cubic eigenvalue problems (8)–(9).

In this case, the convergent eigenpairs have been deated to the innity (Theorem 2). Therefore, the method would not produce any spurious Ritz value to aect the convergence. However, the computational cost for solving (8) becomes more and more expensive when the number of the desired eigenpairs is getting larger.

CJD-Lk-D-Dand CJD-Lk-D-O (proposed in Section 2.3):

Aiming to improve the performance of CJD-D, these two methods are described in Algorithm 2.2. The primitive locking technique is used.

Two variant choices of = D in (23) or = O in (24) are derived to predict the new search direction t in (21). The only dierence between (21) and (5) is the choice of the skew orthogonalization vector ˜p in (18). This new search direction ˜p involves only the original cubic eigenvalue problem matrices, Ai, but not ˜Ai. Consequently, we can achieve signicant saving on the computation of ˜r and ˜p as the size of the desired eigenpair becomes large by using (17), (18), (21), (23) and (24). We would like to emphasize that the choice of ˜p and then t (by D or O) does share the same concept with the explicit non-equivalence deation in the deated cubic eigenvalue problem, the choice further gains the saving on computation.

On the other hand, performing the locking steps will also benet Algorithm 2.2. Since the new search direction t in (21) is solved approximately by a suitable chosen preconditioner MA, the inexact search direction t might slow down the convergence of the rest desired eigenpairs. Furthermore, neglecting the low-rank updates in (22) leads to slow convergences. We therefore suggest applying the locking technique to yield better overall performance in Algorithm 2.2.

3. A QUANTUM DOT MODEL PROBLEM

Semiconductor quantum dot (QD) is a structure in which the carriers are conned in all three dimensions. In many physics and engineering applications, it is essential to estimate the discrete energy states (eigenvalues) and wave functions (eigenvectors) of the QD structure. Specically, we consider that a single electron is conned by a cylindrical InAs QD embedded in the centre of a cylindrical GaAs matrix with the same rotation axis. Figure 1 illustrates the schema of the QD structure. Moreover, the model is based on the eective-mass envelope-function approximation with one conduction band, the BenDaniel–Duke boundary conditions, and non-parabolic eective mass depending on both energy and position [1–3]. On the bound-ary of the QD, the nite hard-wall 3D connement potential is induced by real discontinuity of the conduction band.

The QD model can be described by the following time-independent Schrodinger equation [1, 2] in the cylindrical co-ordinate (r; ; z)

−˜2 2m‘() @2_F @r2 + 1 r @F @r + 1 r2 @2_F @2 + @2_F @z2 + c‘F = F (25)

where ˜ is the reduced Plank constant, is the total electron energy, and F = F(r; ; z) is a wave function. The index ‘ depends on r and z and is used to make a distinction

(12)

btm Z top Z mtx Z 0 dot R Rmtx 0 R Z dot (l=1) matrix (l=2)

Figure 1. The quantum dot structure schema showing that a cylindrical quantum dot is embedded in the hetero-structure matrix.

between the region of the dot (‘ = 1) and the matrix (‘ = 2). Here the eective mass m‘() is given as 1 m‘()= P2 ‘ ˜2 2 + g‘−c‘ + 1 + g‘−c‘+ ‘ (26) where P‘, g‘, c‘, and ‘ are momentum element, energy gap, connement potential, and spin-orbit splitting in the ‘th region, respectively. Equation (25) is equipped with Dirichlet boundary conditions

F(r; ; Zmtx) = F(r; ; 0) = F(Rmtx; ; z) = 0 (27) and the interface conditions

−˜2 m1() @F @r R− dot = −˜ 2 m2() @F @r R+ dot −˜2 2m2() @F @z Z− btm = −˜ 2 2m1() @F @z Z+ btm −˜2 2m1() @F @z Z− top = −˜ 2 2m2() @F @z Z+ top (28)

where Zmtx, Ztop, and Zbtm denote the co-ordinate of the top of the matrix, the top of the dot, and the bottom of the dot, respectively. The radii of the dot and the matrix are denoted as Rdot and Rmtx, respectively.

To discretize the 3D cylindrical model (25), we choose non-uniform mesh points with ne meshes around the heterojunction (interface). Furthermore, the mesh points are shifted with a half mesh width in the radial direction, so that no pole conditions need to be imposed [20]. Based on the mesh points, we use the standard centred seven-point nite dierence

(13)

method and two-point nite dierence method to approximate Equation (25) and the interface conditions (28), respectively.

Due to the non-parabolic eective mass (26), the discretization results in a large sparse cubic eigenvalue problem of (1) with a matrix size -by- , where , , and denote the mesh point numbers in the radial (r), azimuthal (), and axial (z) direction, respectively. However, by exploring the periodicity in the azimuth direction and applying suitable permutations and the Fourier transformation, the 3D eigenvalue problem can be decoupled into independent -by- 2D eigenproblems as ⎡ ⎢ ⎢ ⎢ ⎢ ⎣ ˜ G1() . .. ˜ G() ⎤ ⎥ ⎥ ⎥ ⎥ ⎦ ⎡ ⎢ ⎢ ⎢ ⎢ ⎣ ˜ F1 .. . ˜ F ⎤ ⎥ ⎥ ⎥ ⎥ ⎦= ⎡ ⎢ ⎢ ⎢ ⎢ ⎣ ˜ D1() . .. ˜ D() ⎤ ⎥ ⎥ ⎥ ⎥ ⎦ ⎡ ⎢ ⎢ ⎢ ⎢ ⎣ ˜ F1 .. . ˜ F ⎤ ⎥ ⎥ ⎥ ⎥ ⎦ (29)

where ˜Gj() and ˜Dj() are -by- matrices for j = 1; : : : ; . Note that the mesh points associated with a certain azimuthal index number j (i.e. with the unknown vector ˜Fj) have the same value. That is, these mesh points are all located on a certain vertical 2D half-plane. It is worth pointing out that only several 2D eigenproblems associated with the rst j-indices need to be solved to obtain the smallest eigenvalues which are of interest in application.

The decoupled 2D eigenproblems in (29) can be straightforwardly formulated as a non-linear eigenvalue problem

G()F = DF (30)

where G() is a -by- matrix with entries containing in rational form (see (26)), D is the corresponding diagonal matrix, and F is the jth part of the associated eigenvector. By multiplying the common denominator of (26) and then simplifying the equation, we obtain a reduced version of (1), i.e.

A()F = (3A3+ 2A2+ A1+ A0)F = 0 (31) where A0, A1, A2, and A3 are n×n real coecient matrices.

The decoupling scheme dramatically reduces computational cost without losing accuracy. For an example as will be used in Section 4.2, a partition of the domain with 755, 280, and 360 grid points in the radial, axial, and azimuthal direction, respectively, results in a 3D system with the matrix size about 76 million. It is then reduced to several (three, for instance, in the next section) decoupled cubic eigenvalue systems (29) with the size of 211 400. The reduction from the 3D formulation to the 2D formulation (29) and full description of the matrices in these formulations are rather complicated and tedious. We refer readers to Reference [21] for more details. Nevertheless, we present the sparsity patterns of the matrices A0, A1, A2, and A3 for = 8 and = 12 in Figure 2 to provide more characteristic insights about the cubic eigenvalue problems. Furthermore, the spectrum of a specic cubic eigenvalue problem with the matrices Ai∈ R169×169, i = 0; 1; 2; 3, is illustrated in Figure 3. All the computed eigenvalues are plotted on the complex plane with the plus symbol. For this specic example, the target eigenvalues are located within the interval [0; 0:35], and they are emphasized by the symbol ⊕. It is clear that the target eigenvalues are embedded in the interior of the spectrum. In the

(14)

0 20 40 60 80 0 10 20 30 40 50 60 70 80 90 A₁, A₀ 0 20 40 60 80 0 10 20 30 40 50 60 70 80 90 A₃, A₂ (a) (b)

Figure 2. Sparsity patterns of the matrices A3 and A2, as well as A1 and A0 are shown in (a) and (b),

respectively. Note that the rows containing non-zeros in the o-diagonal of A3 are associated with the

interface of the hetero-structure. For other Ai’s, the corresponding rows have the same property.

next section, we explore the performance of the algorithms for solving the cubic eigenvalue problem (31) with more details.

4. NUMERICAL EXPERIMENTS

We implemented the proposed algorithms by Fortran 90 for the numerical experiments. All the numerical tests were performed on a Linux (Red Hat release 7.3) based workstation equipped with 2.2 GHz Xeon CPU and four gigabytes main memory. Absoft Pro Fortran [22] compiler was used to compile the programs. The timing results are in seconds.

The diameter and the height of the cylindrical QD considered here are 15 and 2:5 nm, re-spectively, whereas that of the matrix are 75 and 12:5nm, respectively. The QD size is chosen so that it is approximately comparable with that of the experimental model presented in Refer-ence [23] and the non-parabolic eect of the band structure is signicant [3]. Furthermore, the semiconductor band structure parameters used in the numerical computations are c1= 0:000, g1= 0:235, 1= 0:81, P1= 0:2875, c2= 0:350, g2= 1:590, 2= 0:80, and P2= 0:1993.

4.1. Choice of the parameters

The rst part of the numerical experiments shows that the timing performance can be signif-icantly improved by tuning the following two parameters:

• The rst one is the number of Ritz vectors used to span the initial search subspace whenever restarting occurs in Step (2.2.viii) of Algorithm 2.1 or 2.2. We perform the restarting scheme to keep the matrix V in reasonable sizes. The Ritz vectors extracted

(15)

−4 −3 −2 −1 0 0.35 1 2 3 −5 −4 −3 −2 −1 0 1 2 3 4 5x 10

-3 Matrix size of A₃, A₂, A₁, A₀ = 169. No. of interested eigenvalues = 4.

Real part of the eigenvalues

Imaginary part of the eigenvalues

Figure 3. The spectrum of a cubic eigenvalue problem with Ai∈ R169×169, for i = 0; 1; 2; 3. The desired

eigenvalues marked by⊕ are located in the interior of the spectrum (namely, the interval [0; 0:35]).

to form the new V are those associated with the Ritz values that are closest to the target eigenvalue.

• The second one is a parameter involved in the preconditioner. To compute M_A−1r and M−1

A p in Equation (5) or (21), we use SSOR(!) as a preconditioner, i.e. we set M_A:= SSOR(!) = (D−!L)D−1(D−!U ); !∈(0; 2)

where D, L, and U are the diagonal, strictly lower triangular, and strictly upper triangular matrices of A(), respectively.

To explore the eect of the parameters, we solve the three eigenvalue problems in the form of (31) that are corresponding to the rst three azimuthal indices j = 1; 2,and 3. We compute the smallest positive eigenvalues by running through the cases for ! chosen to be 0:1 to 1:9. The number of Ritz vectors in the initial search subspace is set to 1 to 5. The matrix size of the eigenvalue problems are 107 055.

From the computational results illustrated in Figure 4, we observe that the best choice for ! is around 1:6. Convergence is slow if we restart with only one Ritz vector even for better

(16)

0 0.5 1 1.5 2 0 200 400 600 800 1000 Parameterω Time (sec.) Azimuthal index: 1 0 0.5 1 1.5 2 0 200 400 600 800 1000 Parameterω Time (sec.) Azimuthal index: 2 Number of Ritz vectors = 1 Number of Ritz vectors = 2 Number of Ritz vectors = 3 Number of Ritz vectors = 4 Number of Ritz vectors = 5

0 0.5 1 1.5 2 0 200 400 600 800 1000 Parameterω Time (sec.) Azimuthal index: 3 1 1.2 1.4 1.6 50 100 150 200 250 Parameterω Time (sec.)

Azimuthal index: 1 (zoom-in)

(c) (d)

(b) (a)

Figure 4. Comparison results of Algorithm 2.1 with various relaxation values and Ritz vectors. The

timing results are marked by pluses,×-marks, triangles, and squares for the algorithms CJD-D, CJD-Lk,

CJD-Lk-D-D, and CJD-Lk-D-O, respectively.

!. The results obtained by using two to ve Ritz vectors are quite similar. However, a closer look at part (d) shows that the case of using four Ritz vectors is most ecient for all three eigenvalue problems. In summary, to span the initial subspace when restarting, our numerical results suggest the use of four Ritz vectors associated with the four Ritz values that are closest to the target eigenvalue (which is equal to zero here).

4.2. Variants of the CJD methods for cubic eigenvalue problems

We solve the cubic eigenvalue problems by CJD-D, CJD-Lk, CJD-Lk-D-D, and CJD-Lk-D-O. Comparison results of the four variants are given in Figures 5 and 6. All programmes are terminated if the residual is less than 5:0×10−12 or the iteration number is greater than 6000. The matrix size of the cubic eigenvalue problems (31) are 211 400.

Parts (a)–(c) of Figure 5 show the results of the slices with the index of azimuthal an-gle j = 1, 2, and 3, respectively. The timing results are marked by pluses, ×-marks, tri-angles, and squares for CJD-D, CJD-Lk, CJD-Lk-D-D, and CJD-Lk-D-O, respectively. Results are not shown in the gure if the method fails to converge in a reasonable time

(17)

1 2 3 4 5 6 0 2000 4000 6000 8000 10000 Order of eigenvalue Time (second)

The first slice (j=1) CJD−Dfl CJD−Lk CJD−Lk−Dfl−ε_D CJD−Lk−DFl− ε_O 1 2 3 4 5 6 0 1000 2000 3000 4000 5000 6000 Order of eigenvalue Time (second)

The second slice (j=2)

1 2 3 4 5 6 0 1000 2000 3000 4000 5000 6000 Order of eigenvalue Time (second)

The third slice (j=3)

1 2 3 2 3 4 5 6

Order of the slice (value of j)

No. of prob. solved (higher is better)

Comparison of robustness CJD−Dfl CJD−Lk CJD−Lk−Dfl−ε_D CJD−Lk−Dfl− ε_O (b) (a) (c) (d)

Figure 5. Comparison results in times of four methods are shown in parts (a)–(c).

Numbers of eigenvalue subproblems solved successfully in the three eigenvalue problems

for j = 1; 2; 3 are shown in part (d).

(10 000 s for the case j = 1 and 6000 s for the cases j = 2; 3). The timing marks of CJD-Lk-D-O are connected by dotted lines to serve as a base line. Parts (a)–(c) in Figure 5 shows that, in almost all of the cases, the CJD-Lk-D-O method is the quickest one among the four methods. Moreover, it can be observed from part (d) that the CJD-Lk-D-O method is the most robust in the sense that it converges within the iteration limit for all six eigenpairs in all three cases. Part (d) also suggests that the methods based on the explicit deation scheme (CJD-D, CJD-Lk-D-D, and CJD-Lk-D-O) are more robust than the CJD-Lk method that no explicit deation scheme is involved.

In order to further explore the overall performances among the dierent methods, the ‘average’ timing results are presented in Figure 6. The average times are calculated by the following ways. In part (a), for each one of the three cubic eigenproblems (31) corresponding to j = 1; 2; 3, we consider only the computing times for the eigenpairs that all four methods converge. That is, we take the arithmetic mean of the times for the rst ve, ve, and three eigenpairs corresponding to the problems for j = 1, 2, and 3, respectively. In part (b), we take the arithmetic mean of the six computing times for the rst six eigenpairs as the aver-age time if the method converges. Otherwise, the computing time is taken as the maximum

(18)

1 2 3 0 1000 2000 3000 4000 5000 6000

Time in seconds (lower is better)

Average times for the convergent eigenpairs CJD−Dfl CJD−Lk CJD−Lk−Dfl−ε_D CJD−Lk−Dfl− ε_O 1 2 3 0 1000 2000 3000 4000 5000 6000

Time in seconds (lower is better)

Average times for all the eigenpairs

(b) (a)

Figure 6. Comparison of eciency. Average timing results calculated by two ways are compared for the four methods.

(i.e. 6000) iteration time for each failed eigenpair, i.e. the method fails to converge for this eigenpair. Based on Figure 6, we highlight following observations. First, if we ignore the results beyond the iteration limit, the CJD-Lk,CJD-Lk-D-D, and CJD-Lk-D-O meth-ods are basically comparable to each other, while CJD-Lk-D-O is the fastest one in al-most all cases. Second, without using the transformation T () in (23), the CJD-Lk-D-O method is better than the CJD-Lk-D-D method. Considering the overall performance, we recommend using CJD-Lk-D-O for the target eigenvalue problems due to its eciency and robustness.

We nally demonstrate the computational results of the desired eigenpairs by CJD-Lk-D-O. The decoupled system (29) allows us to solve several cubic eigenvalue problems independently to obtain all bound states. Table I shows all the computed eigenvalues that are less than 0:35, which is the dierence between the connement potentials c1 and c2. The table also presents the values of azimuthal index j, the order of the smallest eigenvalues of each slice (denoted as Ord.), and the convergent residuals of the eigensystems. The mesh size is so chosen that at least three signicant digits of the computed eigenvalues remain unchanged whenever the domain is further rened.

(19)

Table I. Computational results of the discrete eigenvalues, the azimuthal indices j, the order of the smallest eigenvalues of the slices (denoted as Ord.), and the

conver-gent residuals of the eigensystems.

j Ord. Residual 0.0873 1 1 2.84e-12 0.1101 2 1 3.38e-12 0.1386 3 1 4.36e-12 0.1503 1 2 4.88e-12 0.1708 4 1 4.61e-12 0.1931 2 2 3.01e-12 0.2054 5 1 4.56e-12 0.2370 3 2 4.57e-12 0.2412 6 1 4.78e-12 0.2459 1 3 4.18e-12 0.2777 7 1 4.81e-12 0.2811 4 2 4.25e-12 0.2971 2 3 4.82e-12 0.3141 8 1 4.08e-12 0.3245 5 2 4.99e-12 0.3305 1 4 4.82e-12 0.3384 2 4 4.46e-12 0.3454 3 3 4.43e-12 0.3485 3 4 4.00e-12 0.3495 9 1 4.27e-12 5. CONCLUSION

Numerical methods that can be used to eectively compute multiple eigenvalues embedded in the interior of the spectrum of an eigenvalue system together with their associated eigen-vectors are of great interests in a wide range of engineering and scientic areas. And yet many challenging issues in this regard remain to be explored, especially for large-scale and non-linear problems. Based on the framework of the Jacobi–Davidson method, we propose and compare several numerical algorithms for the cubic eigenvalue problems in this article. Moreover, an explicit non-equivalence deation method for computing successive eigenpairs is developed and analysed. Several improvements by using eective preconditioners and locking and restarting techniques on these methods are also provided to yield better performance.

All numerical results are generated by using a semiconductor quantum dot model which exhibits both non-linear and large-scale properties in the resulting eigenvalue systems from the nite dierence approximation in cylindrical co-ordinates. These systems are decoupled into 2D subsystems by rotational symmetry of the model problem. The order of energy levels (eigenvalues) depends critically on the number of subsystems, i.e. on the partition number in the azimuthal direction.

Based on our intensive numerical investigation, we conclude that the cubic Jacobi–Davidson method combined with the explicit deation method and the primitive locking technique, but without using the transformation matrix T () (i.e. CJD-Lk-D-O) is the most favourable for the QD model in terms of robustness and eciency. This method is most robust in the sense that it converges for all tested subsystems and for all desired eigenpairs. It is most ecient in the sense that the average computing time required for all the eigenpairs is minimal.

(20)

ACKNOWLEDGEMENTS

The authors thank O. Voskoboynikov for motivating their attention to the quantum dot model discussed in this paper. The authors are grateful for the anonymous referees’ comments. This work is partially supported by the National Science Council and the National Center for Theoretical Sciences in Taiwan.

REFERENCES

1. de Andrada e Silva EA, La Rocca GC, Bassani F. Spin-split subbands and magneto-oscillations in III-V asymmetric heterostructures. Physical Review B 1994; 50(12):8523–8533.

2. Voskoboynikov O, Liu SS, Lee CP. Spin-dependent electronic tunnelling at zero magnetic eld. Physical Review B 1998; 58(23):15397–15400.

3. Li Y, Liu J-L, Voskoboynikov O, Lee CP, Sze SM. Electron energy level calculations for cylindrical narrow gap semiconductor quantum dot. Computer Physics Communications 2001; 140:399–404.

4. Bai Z, Demmel J, Dongarra J, Ruhe A, van der Vorst H. Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide. SIAM: Philadelphia, 2000.

5. Anderson E, Bai Z, Bischof C, Blackford S, Demmel J, Dongarra J, Du Croz J, Greenbaum A, Hammarling S, McKenney A, Sorensen D. LAPACK Users’ Guide (3rd edn). SIAM: Philadelphia, 1999.

6. Golub GH, Van Loan CF. Matrix Computations (3rd edn). Johns Hopkins University Press: Baltimore, 1996. 7. Stewart GW, Sun J-G. Matrix Perturbation Theory. Academic Press: New York, 1990.

8. Tisseur F. Backward error analysis of polynomial eigenvalue problems. Linear Algebra and its Applications 2000; 309:339–361.

9. Fokkema DR, Sleijpen GLG, van der Vorst HA. Jacobi–Davidson style QR and QZ algorithms for the reduction of matrix pencils. SIAM Journal on Scientic Computing 1998; 20(1):94–125.

10. Hochstenbach ME, van der Vorst HA. Alternatives to the Rayleigh quotient for the quadratic eigenvalue problem. Technical Report 1212, Department of Mathematics, University of Utrecht, P.O. Box 80.010, NL-3508 TA Utrecht, The Netherlands, 2001.

11. Sleijpen GLG, Booten AGL, Fokkema DR, van der Vorst HA. Jacobi–Davidson type methods for generalized eigenproblems and polynomial eigenproblems. BIT 1996; 36(3):595–633.

12. Sleijpen GLG, van der Vorst HA. A Jacobi–Davidson iteration method for linear eigenvalue problems. SIAM Journal on Matrix Analysis and Applications 1996; 17(2):401–425.

13. Meerbergen K. Locking and restarting quadratic eigenvalue solvers. SIAM Journal on Scientic Computing 2001; 22(5):1814–1839.

14. Guo J-S, Lin W-W, Wang C-S. Numerical solutions for large sparse quadratic eigenvalue problems. Linear Algebra and its Applications 1995; 225:57–89.

15. Guo J-S, Lin W-W, Wang C-S. Nonequivalence deation for the solution of matrix latent value problems. Linear Algebra and its Applications 1995; 231:15–45.

16. Ruhe A. Algorithms for the non-linear eigenvalue problem. SIAM Journal on Numerical Analysis 1973; 10: 674–689.

17. Bai Z, Sleijpen G, van der Vorst H. Nonlinear eigenvalue problems. In Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide, Chapter 9, Bai Z, Demmel J, Dongarra J, Ruhe A, van der Vorst H (eds), SIAM: Philadelphia, 2000.

18. Sleijpen GLG, van der Vorst HA, van Gijzen M. Quadratic eigenproblems are no problem. SIAM News 1996; 29:8–9.

19. Horn RA, Johnson CA. Matrix Analysis. Cambridge University Press: Cambridge, 1985.

20. Lai M-C. A note on nite dierence discretizations for Poisson equation on a disk. Numerical Methods for Partial Dierential Equations 2001; 17(3):199–203.

21. Wang W, Hwang T-M, Lin W-W, Liu J-L. Numerical methods for semiconductor heterostructures with band non-parabolicity. Journal of Computational Physics 2003; 190(1):141–158.

22. Absoft Corporation. Pro Fortran Linux User Guide. Rochester Hills: MI, U.S.A., 2001.

23. Schoenfeld WV. Spectroscopy of the electronic structure of coupled quantum dots systems. Ph.D. Thesis, Materials Department, University of California, Santa Barbara, July 2000.