
NUMERICAL SOLUTION OF QUADRATIC EIGENVALUE PROBLEMS WITH STRUCTURE-PRESERVING METHODS

TSUNG-MIN HWANG†, WEN-WEI LIN‡, AND VOLKER MEHRMANN§

Abstract. Numerical methods for the solution of large scale structured quadratic eigenvalue problems are discussed. We describe a new extraction procedure for the computation of eigenvectors and invariant subspaces of skew-Hamiltonian/Hamiltonian pencils using the recently proposed skew-Hamiltonian isotropic implicitly restarted Arnoldi method (SHIRA).

As an application we discuss damped gyroscopic systems. For this problem we first solve the eigenvalue problem for the undamped system using the structure-preserving method and then use the quadratic Jacobi–Davidson method as a correction procedure. We also illustrate the properties of the new approach for several other application problems.

Key words. quadratic eigenvalue problems, skew-Hamiltonian/Hamiltonian pencils, invariant subspace, gyroscopic system, quadratic Jacobi–Davidson method, nonequivalence deflation technique

AMS subject classifications. 15A18, 47A15, 47J10

PII. S106482750139220X

1. Introduction. In this paper we study numerical methods for computing eigenpairs or invariant subspaces of the quadratic eigenvalue problem

$$(\lambda^2 M + \lambda(G + \varepsilon D) + K)x = 0, \qquad (1)$$

where M, G, D, K are square n × n real matrices with M = Mᵀ, G = −Gᵀ, D = Dᵀ, and K = Kᵀ. Typically, the matrices M and K or −K represent the mass matrix and the stiffness matrix, respectively; G and εD represent the gyroscopic forces and the damping of the system, respectively. We not only concentrate on the case in which ε is a small parameter, but also discuss in detail the undamped case ε = 0.

Quadratic eigenvalue problems (1) arise in the solution of initial or boundary value problems for second order systems of the form

$$M\ddot{x} + (G + \varepsilon D)\dot{x} + Kx = f \qquad (2)$$

and in numerous other applications. These include finite element discretizations in structural analysis [34], the acoustic simulation of poro-elastic materials [22, 32, 35], and the elastic deformation of anisotropic materials [2, 19, 33]. See also [42] for a recent survey.

The classical approach to solving quadratic eigenvalue problems is to turn them into linear eigenvalue problems by introducing the new vector y = λx. In the case

Received by the editors July 10, 2001; accepted for publication (in revised form) August 19, 2002; published electronically February 6, 2003. This research was partially supported by the National Center for Theoretical Science, National Tsing Hua University, Hsinchu, Taiwan.

http://www.siam.org/journals/sisc/24-4/39220.html

†Department of Mathematics, National Taiwan Normal University, Taipei 116, Taiwan, Republic of China (min@math.ntnu.edu.tw).

‡Department of Mathematics, National Tsing Hua University, Hsinchu 300, Taiwan, Republic of China (wwlin@am.nthu.edu.tw).

§Institut für Mathematik, MA 4-5, Technische Universität Berlin, Str. des 17. Juni 136, D-10623 Berlin, Germany (mehrmann@math.tu-berlin.de). This author was also partially supported by Deutsche Forschungsgemeinschaft research grant Me 790/11-3.


of (1) this leads to the linearized generalized eigenvalue problem

$$\left(\lambda \begin{bmatrix} M & G + \varepsilon D \\ 0 & I \end{bmatrix} - \begin{bmatrix} 0 & -K \\ I & 0 \end{bmatrix}\right)\begin{bmatrix} y \\ x \end{bmatrix} = 0. \qquad (3)$$
If (λ, [y; x]) is an eigenpair of (3), then x is an eigenvector of (1) associated with the eigenvalue λ. The approach (3) allows us to determine eigenpairs numerically, since for generalized eigenvalue problems like (3) the mathematical theory, the numerical methods, as well as the perturbation theory are well established [1, 11, 39]. However, a difficulty of the linearization approach is that, due to the embedding into a problem of double size, the condition number, i.e., the sensitivity of the eigenvalues and eigenvectors with respect to perturbations in the data matrices M, G, D, K, may increase. This is because the set of admissible perturbations for (3) is larger than for (1). If, however, the perturbations respected the specific structure of the blocks in (3), i.e., the zero and identity blocks and the symmetries, then the perturbation results would be the same. Furthermore, there are many different ways to perform the linearization, with different conditioning, as has been demonstrated in [40]. In view of these remarks it would be ideal to have a numerical method that works directly with the original data of the quadratic eigenvalue problem and that avoids the problem of increased condition numbers. It is difficult to develop such a method, and there are only a few methods that partially fulfill these requirements, such as the quadratic or polynomial Jacobi–Davidson method [14, 36, 37].
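To make the linearization step concrete, the following is a minimal dense sketch (our own illustration, not code from the paper; the function name is hypothetical) of forming the pencil (3) with NumPy and solving it with SciPy's generalized eigensolver:

```python
import numpy as np
from scipy.linalg import eig

def qep_by_linearization(M, G, D, K, eps):
    """Solve (lam^2 M + lam (G + eps D) + K) x = 0 via the pencil (3)."""
    n = M.shape[0]
    Z, I = np.zeros((n, n)), np.eye(n)
    A = np.block([[Z, -K], [I, Z]])           # constant matrix of the pencil (3)
    B = np.block([[M, G + eps * D], [Z, I]])  # matrix multiplying lambda in (3)
    lam, V = eig(A, B)                        # solves A v = lam B v, v = [y; x]
    return lam, V[n:, :]                      # eigenvectors x of (1): lower block
```

This is exactly the approach whose conditioning caveats are discussed above; it ignores all structure and is shown only as a reference point.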

In many applications the original quadratic eigenvalue problem has extra structure that should be reflected in its linearization. A typical structure is the Hamiltonian eigenvalue symmetry, i.e., that the spectrum is symmetric with respect to the real and imaginary axes. This symmetry occurs if ε = 0 in (1), which is the case, for example, in gyroscopic systems [18] and in elasticity problems [2, 26]; see [42] for a recent survey. A similar eigenvalue symmetry arises in optimal control problems, for which the variational principle leads to a skew-Hamiltonian/Hamiltonian (SHH) pencil that has properties similar to (3) in the undamped case; see [5, 25, 26]. In these applications the generalized eigenvalue problem arises directly and does not stem from the linearization of a quadratic eigenvalue problem.

Another structure that should be reflected in a proper method is preservation of the sparsity structure inherited from finite element or finite difference discretizations [3, 26, 34].

In this paper we first discuss the solution of large sparse generalized eigenvalue problems that have a Hamiltonian eigenvalue symmetry and then allow perturbations of this structure. The use of structure-preserving linearizations for problem (1) in the undamped case was first suggested and successfully employed in [10, 26] to design structure-preserving methods for large sparse linearized quadratic eigenvalue problems with a Hamiltonian eigenvalue symmetry. We make use of these techniques, in particular the skew-Hamiltonian isotropic implicitly restarted Arnoldi algorithm (SHIRA) proposed in [26]. We briefly recall this algorithm in section 2.

The key feature of the SHIRA algorithm is that it inherits the convergence properties, the implicit restart, and the reorthogonalization techniques of the standard shift-and-invert Arnoldi algorithm [20, 31, 38], while being, in general, more efficient. Furthermore, it generates isotropic Krylov subspaces to guarantee that the computed spectrum has the correct eigenvalue symmetry. In [41] the stability properties of methods that preserve the Hamiltonian structure are analyzed (although not for this algorithm).

A disadvantage of the SHIRA algorithm is that it does not directly generate the eigenvectors or invariant subspaces of the linearization. Instead, they are determined in [26] using a few steps of inverse iteration or subspace iteration [11, 44], respectively. For these iterations one has to solve a linear system in every step, and hence extra sparse matrix factorizations are needed. This results in a bottleneck in the computation and also limits the possible system size. Even though, in practice, only a few iterations are needed, these extra factorizations add substantial computational overhead to the method.

In this paper we develop a new extraction method for invariant subspaces in the SHIRA algorithm that avoids extra sparse matrix factorizations. This method is a modification of an idea suggested in [45, 46] for the computation of the stable invariant subspace of a Hamiltonian matrix. We compare the performance of this extraction procedure with that of classical inverse or subspace iteration in section 5.

As an application we then discuss numerical methods for problems like (1) that have a damping term with ε ≠ 0. For such problems the Hamiltonian eigenvalue symmetry is destroyed, and therefore structure-preserving methods cannot be used directly. On the other hand, if the damping is small, then we can expect that the damped system has a spectrum and invariant subspaces close to those of the structured system.

We discuss this topic in section 4 and introduce a method that first computes the eigenvalues, eigenvectors, or invariant subspaces of the structured system and then uses them as initial vectors in the quadratic Jacobi–Davidson procedure [3, 36] to compute the eigenvalues, eigenvectors, or invariant subspaces of the damped system.

A similar approach for problems with small damping in structural mechanics was suggested in [27]. In order to treat eigenvalue problems with clusters of eigenvalues, a nonequivalence low-rank transformation technique [9, 12] is used to deflate already computed eigenpairs. We compare this approach with the classical shift-and-invert subspace iteration (SISI) for several application problems in section 5.

2. Structure-preserving algorithms. In this section we briefly recall the SHIRA algorithm of [26], which is designed for the computation of a small number of specified eigenvalues of real large scale generalized eigenvalue problems for SHH pencils of the form

$$\alpha N - \beta H = \alpha \begin{bmatrix} F_1 & G_1 \\ H_1 & F_1^T \end{bmatrix} - \beta \begin{bmatrix} F_2 & G_2 \\ H_2 & -F_2^T \end{bmatrix}, \qquad (4)$$

where G₁ = −G₁ᵀ, H₁ = −H₁ᵀ, G₂ = G₂ᵀ, and H₂ = H₂ᵀ.

Matrices N and H in (4) are called skew-Hamiltonian and Hamiltonian, respectively. If J = [0, In; −In, 0], with In the n × n identity matrix, then skew-Hamiltonian matrices satisfy (NJ)ᵀ = −NJ and Hamiltonian matrices satisfy (HJ)ᵀ = HJ. The special structure of SHH pencils ensures [23, 24, 25] that the eigenvalues occur in quadruples λ, λ̄, −λ, −λ̄ if λ is complex, or in pairs λ, −λ if λ is real. Moreover, if x is a right eigenvector associated with an eigenvalue λ, then Jx is a left eigenvector associated with −λ.

If the skew-Hamiltonian matrix N in the SHH pencil αN − βH in (4) is invertible and given in the factored form N = Z₁Z₂ with Z₂ᵀJ = ±JZ₁, then the pencil is equivalent to a pencil of the form αI − βW, where

W = ±Z₁⁻¹HZ₂⁻¹

is Hamiltonian. The factorization of N can easily be determined via a Cholesky-like decomposition for skew-symmetric matrices developed in [7]; see also [4].

Suppose that one wishes to compute the eigenvalues of W nearest to some target value λ0. Then one would typically use a shift-and-invert transformation of the form (W − λ0I)⁻¹ to obtain the eigenvalues near λ0 via a Krylov subspace method. But this matrix is no longer Hamiltonian. In order to preserve the structure, [26] suggests a rational transformation with the four shifts (λ0, λ̄0, −λ0, −λ̄0), given by

$$R_1(\lambda_0, W) = (W - \lambda_0 I)^{-1}(W + \lambda_0 I)^{-1}(W - \bar\lambda_0 I)^{-1}(W + \bar\lambda_0 I)^{-1}. \qquad (5)$$

If the target λ0 is either real or purely imaginary, one may use the simpler transformation

$$R_2(\lambda_0, W) = (W - \lambda_0 I)^{-1}(W + \lambda_0 I)^{-1}. \qquad (6)$$

If the matrix W is real and Hamiltonian, then both matrices R₁ and R₂ are real and skew-Hamiltonian, and hence their eigenvalues have algebraic multiplicity of at least two if they are not purely imaginary. For a real skew-Hamiltonian matrix K and a given vector q1, an Arnoldi iteration applied to K would generate the Krylov space

V(K, q1, k) = span{q1, Kq1, …, K^{k−1}q1}.

Using an appropriate orthogonal basis of this space, given by the columns of an orthogonal matrix Qk, one produces a "Ritz"-projection

Kk = QkᵀKQk.

The "Ritz"-values, i.e., the eigenvalues of Kk, are then used as eigenvalue approximations.
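In SHIRA the operator K to which the Arnoldi recursion is applied is R₂(λ0, W) or R₁(λ0, W). A minimal sketch (our own, with a dense LU standing in for the sparse factorizations used in practice) of realizing the action of R₂ with a single factorization per target is:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def make_R2_operator(W, lam0):
    """Return v -> R2(lam0, W) v = (W^2 - lam0^2 I)^{-1} v, cf. (6).

    For lam0 real or purely imaginary, lam0^2 is real, so the operator
    is real (and skew-Hamiltonian); one LU factorization serves all
    Arnoldi steps with this target."""
    n = W.shape[0]
    shift2 = np.real(lam0 ** 2)                # real by the assumption on lam0
    lu = lu_factor(W @ W - shift2 * np.eye(n))
    return lambda v: lu_solve(lu, v)
```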

In order to obtain a structure-preserving method, one needs a "Ritz"-projection that is again skew-Hamiltonian. For this one needs an isotropic subspace V, i.e., a subspace for which xᵀJy = 0 for all x, y ∈ V; see [26]. Let Vk ∈ R^{2n×k} and let

$$Q^T V_k = \begin{bmatrix} R_k \\ 0 \\ T_k \\ 0 \end{bmatrix} \qquad (7)$$

be the symplectic QR-factorization [8] of Vk, where Q ∈ R^{2n×2n} is orthogonal and symplectic and where Rk and Tk ∈ R^{k×k} are upper and strictly upper triangular, respectively. If Vk is isotropic and of full column rank, then it is shown in [8] that Rk⁻¹ exists and that Tk = 0. This means, in particular, that if Vn is of full column rank, then there exists an orthogonal symplectic matrix Q ≡ [Qn | JQn] with Qn ∈ R^{2n×n}, such that

$$\begin{bmatrix} Q_n^T \\ Q_n^T J^T \end{bmatrix} K\, [\,Q_n \ \ JQ_n\,] = \begin{bmatrix} H_n & K_n \\ 0 & H_n^T \end{bmatrix}. \qquad (8)$$

Here Hn = QnᵀKQn is an upper Hessenberg matrix and Kn = QnᵀKJQn is skew-symmetric. This idea is the basis for the structure-preserving Arnoldi algorithm SHIRA introduced in [26], which is given by the recursion

$$K Q_k = Q_k H_k + q_{k+1} h_{k+1,k} e_k^T, \qquad (9)$$

where Hk is the leading k × k principal submatrix of Hn and ek = [0, …, 0, 1]ᵀ ∈ R^{k×1}. Since Qk is orthogonal and QkᵀJQk = 0 for k < n, it is easily seen that [Qk | JQk] ∈ R^{2n×2k} is orthogonal. This implies that

$$\begin{bmatrix} Q_k^T \\ Q_k^T J^T \end{bmatrix} K\, [\,Q_k \ |\ JQ_k\,] = \begin{bmatrix} H_k & G_k \\ 0 & H_k^T \end{bmatrix} \in \mathbb{R}^{2k \times 2k} \qquad (10)$$

is skew-Hamiltonian and Gk = QkᵀKJQk is skew-symmetric.

Let (θi, vi) be an eigenpair of Hk, i.e., Hkvi = θivi, and let xi = Qkvi be a "Ritz"-vector of the eigenvalue problem Kx = µx corresponding to the "Ritz"-value θi. Then, using Hkvi = θivi and ‖qk+1‖₂ = 1, we have the following residual bound (see [31]) for the "Ritz"-pair (θi, xi):

$$\|Kx_i - \theta_i x_i\|_2 = \|KQ_k v_i - \theta_i Q_k v_i\|_2 = \|(Q_k H_k + q_{k+1} h_{k+1,k} e_k^T)v_i - \theta_i Q_k v_i\|_2 = \|q_{k+1} h_{k+1,k}(e_k^T v_i)\|_2 = |h_{k+1,k}|\,|e_k^T v_i|.$$
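The bound shows that the residuals of all Ritz pairs are available essentially for free from the last components of the eigenvectors of Hk. A small helper (ours, purely illustrative) is:

```python
import numpy as np

def ritz_pairs(Qk, Hk, h_next):
    """Ritz pairs from the Arnoldi relation (9), with the cheap residual
    estimate |h_{k+1,k}| |e_k^T v_i|; no further products with K needed."""
    theta, V = np.linalg.eig(Hk)            # Ritz values theta_i
    X = Qk @ V                              # Ritz vectors x_i = Qk v_i
    res = abs(h_next) * np.abs(V[-1, :])    # |h_{k+1,k}| |e_k^T v_i| per pair
    return theta, X, res
```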

Remark 1.

(a) The transformations R₁(λ0, W) and R₂(λ0, W) can be viewed as a structure-preserving shift-and-invert strategy to accelerate the convergence of the desired eigenvalues.

(b) In exact arithmetic, the values ti,k = (Jqi)ᵀKqk in the kth step of the SHIRA algorithm are zero, but in practice roundoff errors cause them to be nonzero. It has been shown in [26] how to design an isotropic orthogonalization scheme that ensures that the spaces span{q1, …, qk} are isotropic to working precision.

(c) To avoid the reorthogonalization, after a number of steps the SHIRA algorithm also uses implicit restarts as in [38].

(d) A minor disadvantage of the SHIRA method is that, due to the rational transformations (5) or (6), the eigenvectors associated with pairs of eigenvalues λ and −λ in the real case, or λ and −λ̄ in the complex case, are both mapped to the same invariant subspace associated with |λ|². As a consequence, in many applications the desired invariant subspace associated with the stable eigenvalues is not obtained directly from the method. In section 3 we discuss an extraction method that allows us to compute this subspace directly.

3. An extraction method for stable eigenspaces. In this section we discuss a method for computing specific invariant subspaces of a Hamiltonian matrix W directly from an isotropic invariant subspace of the skew-Hamiltonian matrix W² without using inverse or subspace iteration. In many applications we are interested in the subspace associated with the eigenvalues in the left half plane. The same method can also be applied to the skew-Hamiltonian functions R₁, R₂ introduced in section 2.

The method that we describe is a modification of a technique first suggested in [45, 46]. Suppose that W has no eigenvalues on the imaginary axis and let Qn be as in (8) for K = W². Then

W²Qn = QnHn. (11)

Let Qk ∈ R^{2n×k} (k ≤ n) be a basis of an isotropic invariant subspace of W² such that

W²Qk = QkΩk, (12)

where for the spectrum σ we have σ(Ωk) ⊆ σ(Hn) and σ(Ωk) ∩ (σ(Hn)\σ(Ωk)) = ∅, i.e., the complete multiplicity of a multiple eigenvalue is included. Since we have assumed that W has no purely imaginary eigenvalues, it follows that there exist two bases Vk⁻ and Vk⁺, both of dimension k, associated with the stable and unstable isotropic invariant subspaces of W, respectively, such that

WVk⁻ = Vk⁻Λk⁻,  WVk⁺ = Vk⁺Λk⁺, (13)

where

σ(Λk⁻) = {λ | Re(λ) < 0, λ² ∈ σ(Ωk)},  σ(Λk⁺) = {λ | Re(λ) > 0, λ² ∈ σ(Ωk)}. (14)

Since by assumption W has no purely imaginary eigenvalues, Ωk has no eigenvalues on the closed negative real axis, and it follows that Ωk has a unique square root Xk [16] satisfying Xk² = Ωk such that all its eigenvalues have positive real part. We call this root the positive square root. It can, for example, be computed via the MATLAB function sqrtm [13].
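The SciPy analogue of this MATLAB function is scipy.linalg.sqrtm, which returns the principal square root; a quick check on an arbitrary 2 × 2 example of ours:

```python
import numpy as np
from scipy.linalg import sqrtm

Omega = np.array([[0.25, 1.0],
                  [0.0,  0.04]])     # no eigenvalues on the closed negative axis
X = sqrtm(Omega)                     # principal square root: X @ X = Omega
assert np.allclose(X @ X, Omega)
assert (np.linalg.eigvals(X).real > 0).all()   # spectrum in the right half plane
```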

The following theorem shows how to extract the stable eigenspace span(Vk⁻) from the isotropic invariant subspace span(Qk).

Theorem 1. Let W ∈ R^{2n×2n} be Hamiltonian and let Qk ∈ R^{2n×k} be as in (12). Suppose that there exist nonsingular matrices Lk⁻ and Lk⁺ ∈ R^{k×k} such that

Qk = Vk⁺Lk⁺ + Vk⁻Lk⁻, (15)

where Vk⁺ and Vk⁻ are as in (13). Then for Xk, the positive square root of Ωk, we have

span(WQk − QkXk) = span(Vk⁻). (16)

Proof. From (13) and (15) we have

WQk − QkXk = W(Vk⁺Lk⁺ + Vk⁻Lk⁻) − (Vk⁺Lk⁺ + Vk⁻Lk⁻)Xk = Vk⁺(Λk⁺Lk⁺ − Lk⁺Xk) + Vk⁻(Λk⁻Lk⁻ − Lk⁻Xk).

If we can prove that Λk⁺Lk⁺ − Lk⁺Xk = 0 and that Λk⁻Lk⁻ − Lk⁻Xk is nonsingular, then assertion (16) follows. To do this, we show first that

Xk = (Lk⁺)⁻¹Λk⁺Lk⁺. (17)

From (15) and (13) we have

W²Qk = W²(Vk⁺Lk⁺ + Vk⁻Lk⁻) = Vk⁺(Λk⁺)²Lk⁺ + Vk⁻(Λk⁻)²Lk⁻. (18)

On the other hand, from (12) and (15) we have

W²Qk = QkΩk = (Vk⁺Lk⁺ + Vk⁻Lk⁻)Ωk. (19)

Subtracting (19) from (18) we obtain

$$[\,V_k^+ \ \ V_k^-\,]\begin{bmatrix} (\Lambda_k^+)^2 L_k^+ - L_k^+\Omega_k \\ (\Lambda_k^-)^2 L_k^- - L_k^-\Omega_k \end{bmatrix} = 0.$$

Since [Vk⁺ Vk⁻] has full rank, it follows immediately that (Λk⁺)²Lk⁺ − Lk⁺Ωk = 0, and (17) follows due to the uniqueness of Xk. We also have (Λk⁻)²Lk⁻ − Lk⁻Ωk = 0, and the uniqueness of the positive square root of Ωk implies that

Xk = (Lk⁻)⁻¹ZkLk⁻, (20)

where Zk denotes the unique positive square root of (Λk⁻)². Hence

Λk⁻Lk⁻ − Lk⁻Xk = (Λk⁻ − Zk)Lk⁻

is nonsingular, since the spectra of Λk⁻ and Zk lie in the open left and right half planes, respectively. This completes the proof.

Theorem 1 indicates a way to determine the desired stable invariant subspace. We apply SHIRA to the skew-Hamiltonian operators R₁(λ0, W) and R₂(λ0, W) as in (5) and (6), respectively, and determine the associated isotropic invariant subspace Qk.

If the target λ0 ∉ σ(W) is real or purely imaginary, and if the entry hk+1,k of Hk is negligible, then the space

span(Qk) = span{q1, …, qk}

generated by SHIRA is a good approximation of an isotropic invariant subspace of R₂(λ0, W). Then Qk satisfies (approximately)

R₂(λ0, W)Qk = (W² − λ0²I)⁻¹Qk = QkHk, (21)

where Hk is an upper Hessenberg matrix. This implies that

Ωk = QkᵀW²Qk = Hk⁻¹ + λ0²I. (22)

Applying Theorem 1, we can extract the stable isotropic eigenspace Vk⁻ of W from Ωk by computing its positive square root.

If the target λ0 ∉ σ(W) is neither real nor purely imaginary and if hk+1,k is negligible, then the space span(Qk) = span{q1, …, qk} generated by SHIRA is a good approximation to an isotropic invariant subspace of R₁(λ0, W). However, as has been shown in [26], it may happen that this subspace fails to be invariant under W². It is therefore necessary to check whether or not span(Qk) is invariant under W² by computing the residual

‖W²Qk − QkΩk‖F,

with Ωk = QkᵀW²Qk. If this residual is small, then the procedure to compute the invariant subspace is as before.

We summarize the extraction method in the following algorithm.

Algorithm 1 (extraction method (EM) for the stable invariant subspace or eigenvectors of a Hamiltonian matrix).

Input: Hamiltonian matrix W and a target value λ0 with negative real part.
Output: Approximate invariant subspace V⁻ of W associated with the ℓ eigenvalues of negative real part nearest to λ0.

(i) IF λ0 = α or λ0 = iα for α ∈ R, THEN apply SHIRA to R₂(λ0, W), i.e.,
  (a) generate Arnoldi vectors as columns of Qk = [q1, …, qk] and an upper Hessenberg matrix Hk ∈ R^{k×k} such that (W² − λ0²I)⁻¹Qk = QkHk;
  (b) compute Ωk = Hk⁻¹ + λ0²I.
(ii) ELSE IF λ0 = α + iβ, where α, β ∈ R, THEN apply SHIRA to R₁(λ0, W), i.e.,
  (a) generate Arnoldi vectors as columns of Qk = [q1, …, qk] and an upper Hessenberg matrix Hk ∈ R^{k×k} such that R₁(λ0, W)Qk = QkHk;
  (b) compute Ωk = QkᵀW²Qk.
(iii) END. Compute the real Schur decomposition of Ωk as Ωk = UkTkUkᵀ.
(iv) Reorder the ℓ stable eigenvalues of Tk to the top of Tk using the reordering method for the Schur form of [1], i.e.,

$$T_k = V_k \begin{bmatrix} T_{11} & T_{12} \\ 0 & T_{22} \end{bmatrix} V_k^T,$$

where T11 ∈ R^{ℓ×ℓ} has the stable eigenvalues.
(v) Set Q̃ℓ := QkUkVk [Iℓ; 0].
(vi) Compute the unique positive square root √T11 of T11 using the method sqrtm of [21].
(vii) Compute the stable invariant subspace V⁻ = WQ̃ℓ − Q̃ℓ√T11.

The computational effort for the computation of the square root of Ωk is of order k3. If k is small compared to n, then this cost is negligible and it does not add much to the cost of the SHIRA iteration.
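For orientation, a dense prototype of the extraction (our own sketch; it takes ℓ = k and therefore skips the Schur reordering in steps (iii)-(iv)) reads:

```python
import numpy as np
from scipy.linalg import sqrtm

def extract_stable_subspace(W, Qk):
    """Extraction method of Algorithm 1 with ell = k (dense sketch).

    Qk is an orthonormal basis of an isotropic invariant subspace of W^2,
    e.g. produced by SHIRA; W must have no purely imaginary eigenvalues,
    so that the positive square root of Omega_k exists."""
    Omega = Qk.T @ (W @ (W @ Qk))          # Omega_k = Qk^T W^2 Qk, cf. (12)
    X = np.real_if_close(sqrtm(Omega))     # positive square root, X @ X = Omega_k
    V = W @ Qk - Qk @ X                    # span(V) = span(Vk-) by Theorem 1, (16)
    Q, _ = np.linalg.qr(V)                 # orthonormalize; no factorization of W
    return Q
```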

Remark 2. In a worst case analysis this procedure may suffer from a loss of accuracy due to the fact that we are computing the square root of the Schur form of a function of W that contains at least quadratic terms in W. Thus, effects similar to those of the square reduced method for the Hamiltonian eigenvalue problem may occur (see [43]), where the squaring of W leads to a loss of accuracy for small eigenvalues. See also the analysis in [45]. We will demonstrate this in Example 2 and also show that one step of inverse iteration fixes this accuracy loss.

Algorithm 1 can be applied directly to large sparse problems of the form (4).

An important application here is the linear quadratic optimal control problem for descriptor systems, where the pencil has the form [5, 6, 25]

$$\alpha \begin{bmatrix} E & 0 \\ 0 & E^T \end{bmatrix} - \beta \begin{bmatrix} A & BB^T \\ C^TC & -A^T \end{bmatrix}. \qquad (23)$$

Such problems arise in the optimal control of semidiscretized parabolic partial differential equations [15, 28, 29, 30].

Assume that the skew-Hamiltonian matrix N in (4) is given in factored form

N = Z₁Z₂, where Z₂ᵀJ = ±JZ₁. (24)

Such a factorization (called a J-Cholesky factorization) exists for all real skew-Hamiltonian matrices [5], and it can be obtained either trivially or numerically via a Cholesky-like factorization of skew-symmetric matrices [7].

If N is invertible, then using this factorization we can (at least formally) transform the SHH pencil αN − βH to a standard eigenvalue problem αI − βW, where W = ±Z₁⁻¹HZ₂⁻¹ is Hamiltonian. If in (23) the matrix E is invertible, then the resulting Hamiltonian eigenvalue problem is

$$\alpha \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} - \beta \begin{bmatrix} E^{-1}A & E^{-1}BB^TE^{-T} \\ C^TC & -A^TE^{-T} \end{bmatrix}. \qquad (25)$$

Numerical examples of this type are presented in section 5.

4. Damped gyroscopic systems. In this section we show how the extraction procedure given by Algorithm 1 can be used to solve quadratic eigenvalue problems for systems of the form (1). To do this we first study the undamped case (ε = 0), i.e.,

λ²Mx + λGx + Kx = 0, (26)

with M = Mᵀ and K = Kᵀ positive definite and G = −Gᵀ. Using the linearization (3) and the factorization

$$N = Z_1 Z_2 = \begin{bmatrix} I & \tfrac12 G \\ 0 & M \end{bmatrix}\begin{bmatrix} M & \tfrac12 G \\ 0 & I \end{bmatrix}$$

yields the Hamiltonian eigenvalue problem

(λI − W)x = (λI − Z₁⁻¹HZ₂⁻¹)x = 0. (27)

The matrix (W − λI)⁻¹ can be factored as

$$(W - \lambda I)^{-1} = \begin{bmatrix} M & \tfrac12 G \\ 0 & I \end{bmatrix}\begin{bmatrix} I & \lambda I \\ 0 & I \end{bmatrix}\begin{bmatrix} 0 & M^{-1} \\ -Q(\lambda)^{-1} & 0 \end{bmatrix}\begin{bmatrix} I & \tfrac12 G + \lambda M \\ 0 & M \end{bmatrix}, \qquad (28)$$

where

Q(λ) = λ²M + λG + K. (29)

To this problem we can directly apply the extraction procedure given by Algorithm 1.
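In code, applying (W − λI)⁻¹ to a vector via (28) needs one factorization of Q(λ) per shift plus one factorization of M that can be reused for all shifts. A dense sketch (ours; sparse LU solvers would replace lu_factor/lu_solve in practice):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def make_shifted_inverse(M, G, K, lam, lu_M):
    """Return v -> (W - lam I)^{-1} v using the block factorization (28);
    lu_M is a precomputed LU factorization of M."""
    n = M.shape[0]
    lu_Q = lu_factor(lam * lam * M + lam * G + K)   # Q(lam) as in (29)

    def apply(v):
        v1, v2 = v[:n], v[n:]
        w1 = v1 + (0.5 * G + lam * M) @ v2          # factor [I, G/2 + lam M; 0, M]
        w2 = M @ v2
        u1 = lu_solve(lu_M, w2)                     # factor [0, M^{-1}; -Q^{-1}, 0]
        u2 = -lu_solve(lu_Q, w1)
        u1 = u1 + lam * u2                          # factor [I, lam I; 0, I]
        return np.concatenate([M @ u1 + 0.5 * (G @ u2), u2])   # [M, G/2; 0, I]

    return apply
```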

If we include the damping term in the gyroscopic system as in (1), then we cannot use the structured SHIRA method directly. But we can still linearize the system (1) and obtain the perturbed SHH system (3) or, in the different form,

$$\left(\lambda \begin{bmatrix} M & G \\ 0 & M \end{bmatrix} - \begin{bmatrix} -\varepsilon D & -K \\ M & 0 \end{bmatrix}\right)\begin{bmatrix} y \\ x \end{bmatrix} = 0, \qquad (30)$$

which is now a pencil with one skew-Hamiltonian and one perturbed Hamiltonian matrix. We can still use the factorization of the skew-Hamiltonian matrix as in (27) and obtain the perturbed Hamiltonian eigenvalue problem

(λI − Ŵ)x = (λI − Z₁⁻¹ĤZ₂⁻¹)x = 0, (31)

with

$$\hat W = \begin{bmatrix} -(\varepsilon D + \tfrac12 G)M^{-1} & -K + \tfrac14 GM^{-1}G + \tfrac12 \varepsilon DM^{-1}G \\ M^{-1} & -\tfrac12 M^{-1}G \end{bmatrix} = W + \varepsilon \begin{bmatrix} -DM^{-1} & \tfrac12 DM^{-1}G \\ 0 & 0 \end{bmatrix}.$$

If the perturbation ε is small, then eigenvalue/eigenvector pairs (or invariant subspaces) of the problem (1) can be regarded as small perturbations of eigenvalue/eigenvector pairs (or invariant subspaces) of problem (26). In this situation it is natural to use an eigenvalue/eigenvector pair of the undamped problem (26), computed via Algorithm 1, as the start for a correction method. This could be any method that can be used for eigenvalue, eigenvector, or subspace correction, such as subspace iteration, inverse iteration, Newton's method, or the Jacobi–Davidson method. Here we present results that are obtained with the quadratic Jacobi–Davidson method [3, 36] as correction. This method is typically very efficient and well suited for the given problem. However, if the desired eigenvalues of (1) form a cluster of nearby eigenvalues, then the quadratic Jacobi–Davidson method sometimes has difficulties in detecting and resolving such a cluster. The undesired effect is that, in this case, for different starting values (eigenvalue/eigenvector pairs) obtained from the unperturbed problem, it converges to the same eigenvalue/eigenvector pair of (1). It is known that implicit deflation techniques based on Schur forms (see, e.g., sections 4.7 and 8.4 of [3]) combined with the Jacobi–Davidson method perform well for linear eigenvalue problems. However, for the quadratic eigenvalue problem it is not clear how to incorporate an implicit deflation technique, because a quadratic Schur form does not exist for general quadratic matrix polynomials.

For this reason we have also analyzed the use of an explicit nonequivalence low-rank transformation deflation technique that was suggested in [9, 12] for quadratic eigenvalue problems. Let us briefly recall this technique for the polynomial

L(λ) = λ²M + λC + K with C = G + εD. (32)

We study the two cases of real eigenvalues or complex conjugate pairs separately.

Suppose that we have computed a real eigenvalue λ1 as well as the associated right and left eigenvectors x1 and z1, respectively, with z1ᵀKx1 = 1, such that L(λ1)x1 = 0 and z1ᵀL(λ1) = 0. Let θ1 = (z1ᵀMx1)⁻¹. We then introduce a new deflated quadratic eigenproblem as in [12] via

L̃(λ)x ≡ (λ²M̃ + λC̃ + K̃)x = 0, (33)

where

M̃ = M − θ1 Mx1z1ᵀM,
C̃ = C + (θ1/λ1)(Mx1z1ᵀK + Kx1z1ᵀM), (34)
K̃ = K − (θ1/λ1²) Kx1z1ᵀK.

Suppose that we have computed a complex eigenvalue λ1 = α1 + iβ1 as well as the associated right and left eigenvectors x1 = x1R + ix1I and z1 = z1R + iz1I, respectively, such that Z1ᵀKX1 = I2, where X1 = [x1R, x1I] and Z1 = [z1R, z1I]. Let Θ1 = (Z1ᵀMX1)⁻¹. We then introduce a new deflated quadratic eigenproblem as in [9] via (33), where

M̃ = M − MX1Θ1Z1ᵀM,
C̃ = C + MX1Θ1Λ1⁻ᵀZ1ᵀK + KX1Λ1⁻¹Θ1ᵀZ1ᵀM, (35)
K̃ = K − KX1Λ1⁻¹Θ1Λ1⁻ᵀZ1ᵀK,

in which Λ1 = [α1, β1; −β1, α1].

Note that if the matrices M and K are symmetric, then the matrices M̃ and K̃ in (34) or (35) are symmetric as well. The results in [9, 12] then imply the following proposition.

Proposition 1. (i) Let λ1 be a simple real eigenvalue of L(λ) as in (32), and let x1 and z1 with z1ᵀKx1 = 1 be the associated right and left eigenvectors, respectively. Then the spectrum of L̃(λ) in (33) with coefficients as in (34) is given by (σ(L(λ)) \ {λ1}) ∪ {∞} = σ(L̃(λ)), provided that λ1² ≠ θ1.

(ii) Let λ1 be a simple complex eigenvalue of L(λ) as in (32), and let X1 = [x1R, x1I] and Z1 = [z1R, z1I] with Z1ᵀKX1 = I2, where x1 = x1R + ix1I and z1 = z1R + iz1I are the associated right and left eigenvectors, respectively. Then the spectrum of L̃(λ) in (33) with coefficients as in (35) is given by (σ(L(λ)) \ {λ1, λ̄1}) ∪ {∞, ∞} = σ(L̃(λ)), provided that Λ1Λ1ᵀ ≠ Θ1.

Furthermore, in both cases (i) and (ii), if λ2 ≠ λ1 and (λ2, x2) is an eigenpair of L(λ), then (λ2, x2) is also an eigenpair of L̃(λ).
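For the real-eigenvalue case the deflated coefficients (34) are plain rank-one updates, so the deflation is cheap to set up. A minimal sketch (our own helper):

```python
import numpy as np

def deflate_real_eigenvalue(M, C, K, lam1, x1, z1):
    """Nonequivalence low-rank deflation (33)-(34) for a simple real
    eigenvalue lam1 with right/left eigenvectors x1, z1 (a sketch)."""
    x1 = x1 / (z1 @ (K @ x1))            # enforce the normalization z1^T K x1 = 1
    theta1 = 1.0 / (z1 @ (M @ x1))       # theta_1 = (z1^T M x1)^{-1}
    Mx, Kx = M @ x1, K @ x1              # columns of the rank-one updates
    zM, zK = z1 @ M, z1 @ K              # rows of the rank-one updates
    Mt = M - theta1 * np.outer(Mx, zM)
    Ct = C + (theta1 / lam1) * (np.outer(Mx, zK) + np.outer(Kx, zM))
    Kt = K - (theta1 / lam1 ** 2) * np.outer(Kx, zK)
    return Mt, Ct, Kt
```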

Using this deflation procedure we have now presented all the ingredients for our algorithm. We solve the damped gyroscopic system (1) by the quadratic Jacobi–Davidson method combined with the explicit nonequivalence deflation technique (33), (34) or (33), (35), using eigenpairs computed via Algorithm 1 as starting values. We summarize this approach in the following algorithm.

Algorithm 2 (quadratic Jacobi–Davidson method with deflation).

Input: Matrices M, G, D, and K and parameter ε as in (1); target shift λ0 and number ℓ of desired eigenvalue/eigenvector pairs nearest to λ0; tolerance Tol for the stopping criterion.
Output: The ℓ eigenpairs {(λj, xj)}, j = 1, …, ℓ, of

L(λ)x = (λ²M + λ(G + εD) + K)x = 0,

associated with the eigenvalues {λj}, j = 1, …, ℓ, that are closest to the target λ0.

(i) Compute the ℓ eigenpairs {(λj⁽⁰⁾, xj⁽⁰⁾)}, j = 1, …, ℓ, of

(λ²M + λG + K)x = 0

by Algorithm 1, where λ1⁽⁰⁾, …, λℓ⁽⁰⁾ are the eigenvalues closest to the target λ0 and (xj⁽⁰⁾)ᴴxj⁽⁰⁾ = 1.

(ii) For j = 1, …, ℓ:
  Compute the eigenpair (λj, (xj, zj)) of L(λ) by the quadratic Jacobi–Davidson method with target λj⁽⁰⁾ and initial vector xj⁽⁰⁾, where xj and zj are the associated right and left eigenvectors satisfying the relations in (i) or (ii), respectively, of Proposition 1.
  If ‖xj − xi‖ ≤ Tol for some i < j, then
    compute the eigenpair (λj, (xj, zj)) of L̃(λ), defined as in (33), (34) if λi is real and as in (33), (35) if λi is complex, by the quadratic Jacobi–Davidson method with target λj⁽⁰⁾ and initial vector xj⁽⁰⁾.
  End if
End for

Note that the locking technique recently proposed in [22] for the solution of quadratic eigenvalue problems might also be adapted and used instead of our nonequivalence low-rank deflation technique in Algorithm 2.

5. Numerical results. In this section we present some numerical tests for the algorithms proposed in this paper. All computations were done in MATLAB 6.0 [21] or in Fortran 90 on a Compaq DS20E workstation.

We first discuss the new extraction method and compare it with several other iterative methods. These methods are as follows:

EM: the SHIRA algorithm combined with the extraction method given by Algorithm 1;
SISI(q): the SHIRA algorithm followed by q steps of shift-and-invert subspace iteration;
EM SISI: the SHIRA algorithm combined with the extraction method EM, followed by one step of shift-and-invert subspace iteration;
IPI: one step of the inverse power iteration.

Here, in the SISI the target value λ0 with negative real part is taken as a fixed shift, and the iteration starts with the subspace span(Qk) generated by SHIRA, which includes the eigenvectors associated with eigenvalues near λ0 and −λ0. SISI then converges to a subspace span(Q̂k) associated with stable eigenvalues, and the eigenpairs are computed from the "Ritz"-pairs of Q̂kᵀWQ̂k.

In the EM SISI variant, the target value with negative real part is again taken as a fixed shift, and one extra step of SISI is performed starting with the subspace span(V⁻) obtained by the new extraction method EM. With the resulting subspace span(V̂k), the eigenpairs are then computed from the "Ritz"-pairs of V̂kᵀWV̂k.

In the inverse power iteration (IPI), an approximate eigenvalue computed by SHIRA is taken as the shift and the associated Arnoldi vector qi is used as the starting vector.

In the following we present results for three problem classes: quadratic eigenvalue problems from elasticity theory, linear quadratic optimal control problems, and damped gyroscopic systems.

5.1. Quadratic eigenproblems from elasticity theory. Consider the quadratic eigenvalue problem

λ²Mx + λGx + Kx = 0, (36)

where M, G, and K are defined as follows. As in [26], let

$$B = \begin{bmatrix} 0 & & & 0 \\ 1 & \ddots & & \\ & \ddots & \ddots & \\ 0 & & 1 & 0 \end{bmatrix} \in \mathbb{R}^{m \times m},$$

M̃ = (1/6)(4Im + B + Bᵀ), G̃ = B − Bᵀ, and K̃ = −(2Im − B − Bᵀ). Define

M = c11 Im ⊗ M̃ + c12 M̃ ⊗ Im,
G = c21 Im ⊗ G̃ + c22 G̃ ⊗ Im, (37)
K = c31 Im ⊗ K̃ + c32 K̃ ⊗ Im,

where the cij are positive constants. Then we have M = Mᵀ > 0, G = −Gᵀ, and K = Kᵀ.

[Fig. 1. Eigenvalues and the corresponding residuals: (a) the computed eigenvalues and the target value; (b) residuals by EM, SISI(10), and IPI.]

Example 1 (see [26, Example 6.2]). We take m = 90, so that W is a 16200 × 16200 matrix, and

c11 = 1.00, c12 = 1.30, c21 = 0.10, c22 = 1.10, c31 = 1.00, c32 = 1.20. (38)
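These test matrices are straightforward to reproduce; a sketch using sparse Kronecker products (our own construction, following (37)):

```python
import numpy as np
import scipy.sparse as sp

def gyroscopic_test_matrices(m, c):
    """Test matrices (36)-(37); c = (c11, c12, c21, c22, c31, c32).

    Returns sparse M, G, K of order n = m^2 with M = M^T > 0, G = -G^T."""
    B = sp.diags(np.ones(m - 1), -1, format="csr")   # ones on the subdiagonal
    I = sp.identity(m, format="csr")
    Mt = (4 * I + B + B.T) / 6.0                     # M~ = (1/6)(4 I + B + B^T)
    Gt = B - B.T                                     # G~ = B - B^T
    Kt = -(2 * I - B - B.T)                          # K~ = -(2 I - B - B^T)
    M = c[0] * sp.kron(I, Mt) + c[1] * sp.kron(Mt, I)
    G = c[2] * sp.kron(I, Gt) + c[3] * sp.kron(Gt, I)
    K = c[4] * sp.kron(I, Kt) + c[5] * sp.kron(Kt, I)
    return M.tocsr(), G.tocsr(), K.tocsr()

# Example 1 data: m = 90 gives n = 8100, so W in (27) is 16200 x 16200
M, G, K = gyroscopic_test_matrices(90, (1.00, 1.30, 0.10, 1.10, 1.00, 1.20))
```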

For this example, the twelve eigenvalues computed by the SHIRA algorithm closest to the target λ0 = −0.1 are depicted in Figure 1(a). The residuals ‖(λi²M + λiG + K)xi‖₂ for the eigenpairs (λi, xi), where xi is computed via EM, SISI(10), and IPI, respectively, are given in Figure 1(b).

We see from Figure 1(b) that the methods EM and IPI yield residuals of the same magnitude, ≈ 10⁻¹⁰, while SISI(10) yields smaller residuals for some eigenvalues, but the residuals of the eigenvalues further away from the target are much larger.

In order to evaluate the computational complexity for this problem class, we compare the major computational tasks per iteration step. One iteration of the inverse power iteration for (36) requires one forward/backward substitution for evaluating Q(λ)⁻¹z (assuming that an LU-factorization of Q(λ) is available). On the other hand, from (29), we see that the matrix W can be factored as

$$W = \begin{bmatrix} I & -\tfrac12 GM^{-1} \\ 0 & M^{-1} \end{bmatrix}\begin{bmatrix} 0 & -K \\ I & 0 \end{bmatrix}\begin{bmatrix} I & -\tfrac12 G \\ 0 & I \end{bmatrix}.$$

The dominant cost of the extraction method is the determination of the eigenvectors in step (vii) of Algorithm 1. It consists of matrix-vector products for evaluating Kz and Gz, respectively, forward/backward substitutions for evaluating M⁻¹z (assuming that an LU- or Cholesky-factorization of M is available), as well as 2 + k SAXPY operations.

Table 5.1
Computational costs for EM and IPI; here k = ℓ = 12.

  EM                                    | IPI
  0 evaluations of Q(λi)                | ℓ evaluations of Q(λi)
  1 sparse LU factorization, of M       | ℓ sparse LU factorizations, of Q(λi)
  ℓ f/b substitutions M⁻¹z              | ℓ f/b substitutions Q(λi)⁻¹z
  ℓ matrix-vector products Kz           | 0 matrix-vector products Kz
  2ℓ matrix-vector products Gz          | 0 matrix-vector products Gz
  (2 + k)ℓ SAXPY operations             | 0 SAXPY operations

The costs for the evaluation of Ωk, Uk, and Vk in Algorithm 1 are negligible if k ≪ n.

In Table 5.1 we compare the computational cost for the computation of eigenvectors using the EM and one step of the IPI, respectively. This comparison shows that the computational cost of the new extraction method compares favorably with that of one step of inverse power iteration. The major savings in computational time arise from the fact that no further factorizations of Q(λ) are needed.

5.2. Optimal control problems. The second class of problems arises from continuous-time linear quadratic control problems of the form

$$\min_{x,u} \int_0^\infty (u^T R u + x^T C^T C x)\,dt \quad \text{subject to} \quad \dot x = Ax + Bu, \quad y = Cx, \qquad (39)$$

where A ∈ R^{n×n}, B ∈ R^{n×m}, C ∈ R^{p×n}, R ∈ R^{m×m} with p, m ≪ n, and where R is symmetric positive definite.

As test cases we consider the spatial (central difference) discretization in polar coordinates of a reaction-diffusion equation with Dirichlet boundary conditions on the two-dimensional unit disk; see [17] for details.

Performing a semidiscretization in space, one obtains a continuous-time system as in (39) with

A = I + T, (40)

and there exists an orthogonal matrix W (see [17]) such that the matrix A in (40) can be transformed to tridiagonal form WᵀAW. Thus, the LU-factorizations below are easily computed in O(n) operations [11]. Let

$$W^T(A - \lambda_0 I)W = L_1U_1, \qquad -W^T(A^T + \lambda_0 I)W = L_2U_2 \qquad (41)$$

be LU-factorizations of Wᵀ(A − λ0I)W and −Wᵀ(Aᵀ + λ0I)W, respectively.

be LU-factorizations of WT(A − λ0I)W and −WT(AT + λ0I)W , respectively. For this problem, (W − λ0I)−1 can be factored as

(W − λ0I)−1 =

 W 0

0 W

  F−1 0

0 I

  I − ˜B ˜BT

0 I

  F 0 0 D−1



×

 I 0

− ˜CTC I˜

  F−1 0

0 I

  WT 0

0 WT

 , (42)

where F = Wᵀ(A − λ0I)W, B̃ = WᵀB, C̃ = CW, and D = −Wᵀ(Aᵀ + λ0I)W − C̃ᵀC̃F⁻¹B̃B̃ᵀ. To evaluate D⁻¹, we use the LU-factorizations in (41) combined with the Sherman–Morrison–Woodbury formula [11] to obtain

$$D^{-1} = U_2^{-1}\left(I + L_2^{-1}\tilde C^T S^{-1}\tilde C U_1^{-1} L_1^{-1}\tilde B\tilde B^T U_2^{-1}\right) L_2^{-1}, \qquad (43)$$

where

$$S = I - \tilde C U_1^{-1} L_1^{-1}\tilde B\tilde B^T U_2^{-1} L_2^{-1}\tilde C^T. \qquad (44)$$

[Fig. 2. Eigenvalues and corresponding residuals: (a) real and imaginary parts of the eigenvalues near the target value; (b) residuals by EM, SISI(6), and EM SISI.]
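Formulas (43) and (44) turn into an operator for D⁻¹ in a few lines; in the sketch below (ours) dense triangular solves stand in for the O(n) tridiagonal ones, and only the small matrix S is formed explicitly:

```python
import numpy as np

def make_D_inverse(L1, U1, L2, U2, Bt, Ct):
    """Return v -> D^{-1} v via the Sherman-Morrison-Woodbury formula (43)-(44).

    L1, U1 and L2, U2 are the LU factors from (41); Bt = W^T B, Ct = C W."""
    solve = np.linalg.solve                  # stand-in for banded/sparse solves
    T1 = solve(U1, solve(L1, Bt))            # U1^{-1} L1^{-1} Bt, reused below
    S = np.eye(Ct.shape[0]) - Ct @ T1 @ (Bt.T @ solve(U2, solve(L2, Ct.T)))

    def apply(v):
        w = solve(L2, v)                     # L2^{-1} v
        t = Bt.T @ solve(U2, w)              # Bt^T U2^{-1} L2^{-1} v
        s = solve(S, Ct @ (T1 @ t))          # S^{-1} Ct U1^{-1} L1^{-1} Bt (...)
        return solve(U2, w + solve(L2, Ct.T @ s))

    return apply
```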

Example 2. With the data in [17] we get a system of size n = 8100. We chose B, C ∈ R^{n×2} randomly with uniform distribution in the interval (0, 1) and used SHIRA to compute the 16 eigenvalues closest to λ0 = −0.048. The eigenvalues are depicted in Figure 2(a). The associated eigenvectors are then computed by EM, EM SISI, SISI(6), and IPI, respectively. The residuals for the eigenpairs computed by EM, EM SISI, and SISI(6), respectively, are shown in Figure 2(b).

In this example the IPI had convergence problems for the vectors xj associated with the eigenvalues with index j = 2, 3, 5, 6, 7, 8, 10, 11, 15, 16, because these complex conjugate pairs have relatively small imaginary parts and their use as shifts is not adequate. Here we see the loss of accuracy in the extraction method, and we also observe that, as expected, this is easily compensated by adding one step of SISI.

Let us briefly compare the costs for computing a 5-dimensional invariant subspace associated with stable eigenvalues for optimal control problems via EM SISI and SISI.

The matrix S in (44) is computed in SHIRA; it can be reused for SISI. It follows from the factorizations (42) and (43) that one iteration of SISI with reorthogonalization requires 55 forward/backward substitutions by assuming that the
