Rank Revealing LU Factorizations Tsung-Min Hwang, Wen-Wei Lin,* and Eugene K. Yang

(1)

Rank Revealing LU Factorizations

Tsung-Min Hwang, Wen-Wei Lin,* and Eugene K. Yang Institute of Applied Mathematics

National Tsing Hua University Hsinchu

Taiwan 30043, R. 0. C.

Submitted by Richard A. Brualdi

ABSTRACT

We consider permutations of any given squared matrix and the generalized LU( r ) factorization of the permuted matrix that reveals the rank deficiency of the matrix.

Chan has considered the case with nearly rank deficiency equal to one. This paper extends his results to the case with nearly rank deficiency greater than one. Two applications in constrained optimization are given. We are primarily interested in the existence of such factorizations. In addition to the theories, we also present an efficient two-pass rank revealing LU( r) algorithm.

1. INTRODUCTION

Let A be an n-by-n matrix. We shall consider the generalized LU(r) factorization P, AQ, = LU which will reveal the nearly rank deficiency of ^A (herein P, and Qr always denote permutation matrices, L unit lower triangular and U upper triangular except for a small r X r block at its last r X r position; see the definition of LU(r) factorization in Section 2 below).

Our main interest is on the nearly singular case. Chan [4] considers the case when the nearly rank deficiency of A is one. In this paper, we extend his results to the case with higher-dimensional rank deficiency. Such a rank

*E-mail: [email protected].

LINEAR ALGEBRA AND ITS APPLICATIONS 175: 115-141(1992) 0 Elsevier Science Publishing Co., Inc., 1992

115

655 Avenue of the Americas, New York, NY 10010 0024-3795/92/$5.00

(2)

116 T.-M. HWANG, W.-W. LIN, AND E. K. YANG

revealing (henceforth, RR) IN(r) f ac orization t is faster than either SVD (singular-value decomposition) or RRQR(r) factorization (see Chan [3] and Foster [lo]) and is important in matrix theory and linear algebra for its wide applications.

One kind of applications of such RR factorization arises from constrained optimization (see Chan [4, 51, Chan and Resasco [7]). When an equality constrained problem is solved by the Lagrange-multiplier approach, we have a symmetric but not positive definite system with the Hessian in the (1,1) block and the constraints constituting the borders. If the Hessian is singular at the solution, then our RR factorization together with deflated block elimination [7] can be used to solve the problem (Chan [6]). The following is another application. If the active-set method is used to solve an inequality constrained optimization problem and the problem is (nearly) degenerate at an intermediate iteration, then the RR factorization is essential to make the method successful (Fletcher [9]). B esi es, d it can also be used to solve least-squares problems following the method proposed by BjGrk [l, 21.

Let T > 1 be the nearly rank deficiency of A. That is, the rth smallest singular value a;, _ r + 1 of A is of small magnitude, and the (r. + I)th smallest singular value a,- r is of order one. We will show that, for this matrix A, there always exists a generalized LU(r) factorization with an T X r position of U. Here “small” means O(a,_,+ l). Chan [4] notes that the usual partial pivoting cannot guarantee to produce small pivots, i.e., to reveal the rank deficiency. In this paper, we shall concern ourselves with the theoretical questions of the existence of such RRLU(r) f ac orizations but shall also give t a practical (i.e. efficient) algorithm for computing such factorizations.

The following notation will be used throughout. n(A) and det A denote the nullity and the determinant of A, respectively. A[i,, . . . , i,(j,, . . . , j,]

denotes the p X p submatrix of A obtained from the intersection of rows 21,. . . > i, withcolumnsjl,...,jp. When the two sets of indices are the same, we write A[i,, , ip] for short. A(i,, . . . , iplj,, . . . ,j,) and A(i,, . . , ip>

denote the determinants of A[i,, . . . , i,lj,, . . . ,j,] and A[i,, . . , i,], respectively.

In this paper some lemmas are simple extensions of those in 141, but others are not simple. In Section 2, we outline the ways to find permutations P and Q such that there exists a generalized LU(r) factorization for PAQ. In Section 3, we discuss the exactly singular case; in Section 4, we discuss the nonsingular case, In Section 5, we present an efficient two-pass algorithm, RRLU(r), that utilizes the theories in the previous sections. We give some numerical results to illustrate that the first pass of Algorithm RRLU(r) fails but the second pass succeeds in revealing the nearly rank deficiency of a given nearly singular matrix.

(3)

RANK REVEALING LU FACTORIZATION!3 117

2. EXISTENCE OF GENERALIZED M(T) FACTORIZATIONS

In this section, we first define the generalized LU(r) factorization of a given matrix A E R”’ “, and then we give an equivalence condition for the existence of the LU(r) factorization.

DEFINITION. Let A be an n X n matrix and 0 < r < n. If there exist permutations P, and Qr which, respectively, permute only the first n - T rows and columns of A such that

(2.1)

where Us, E RrXr (not necessary upper triangular), U,, E R(“-‘)x(“-‘) is upper triangular, and L,, E R(“-‘!x(n-r) is unit lower triangular, then we say that A has a generalized LU(r) factorization.

Note that the generalized LU(0) factorization of A is the usual LU factorization of A (which always exists).

We first prove two lemmas for the fundamental existence theorem.

LEMMA 2.1.

(a) Zf A i-s nonsingular and has a generalized LU(r1 factorization as in (2.0, then we can perturb the submatrix

P,AQ,[n-r+ l,...,n]

by U, to make A singular with nullity equal to r.

(b) Zf A is singular with nullity r and has a generalized LU(r) factoriza- tion (2.1) with U,, = 0, then we can perturb the submatrix P, AQ,[n - r +

1,. . , n] by a nonsingular r X r matrix to make A nonsingular.

Proof. (a): Write

(4)

118

Let

T.-M. HWANG, W.-W. LIN, AND E. K. YANG

A = P,AQ, - dkg{0,_,,U2,} = -Lull JLu12

L u

21 11

L u

21 12

1

Thus n(A) = r. The proof for part (b) is similar. n LEMMA 2.2. Let A, with nullity equal to 0

partitioned form

or r, be represented in the

4, A,, A=A

_[ ²¹

A >

22

1

where A,, E

R(n-r)x(n-r)

and A,, E

R"'.

Then we can change the nullity of A (from 0 to r or vice versa) by perturbing the submatrix A,, if and only if A,, is nonsingular.

Proof. The “only if” part: Suppose that A,, is singular and rank A,, = k.

Let

A,, =U ; ; VT

[ I

be the SVD of A,,, where C = diag(u,, . . . , Us) > 0. Define

S(A22) = [” () 1, “]A[: qT= [s;l s!2 ;g

where S,, and S& are in

R'X("-k-').

For the case rank A = n - r:

(i) If S,, has deficient column rank, then rank S( A,,) < n - ¹for all A

=iii) If S, has full column rank, then

(5)

RANK REVEALING LU FACTORIZATIONS 119

and since rank A = n - r, we have S,, = 0. Thus

rankS(Azz) <n - 1 for all A,, .

From (i) and (ii), we know that there does not exist A, such that rank S( A,) = n, which is contradictory. Thus A,, is nonsingular.

For the case rank A = n: Since rank A = n, we have rank S,, = n - k - r (full column rank), rank S,, = n-k-r(fullrowrank),andn_2r<

k < n - r - 1. Thus

rankS(A,) >k + ( n - k - r) + (n - k - r) = 2n - k - 2r.

From n - 2r < k < n - r - 1, we have n - r + 1 < 2n - k - 2r G n.

Therefore, rank S( A,,) 2 n_- r + 1 for all A,,; i.er there does not exist a perturbation of A,, say A,,, such that rank S( A,,) = n - r, which is contradictory. Thus A,, is nonsingular.

The “if” part: Suppose that Al, is nonsingular. Then

From (2.2), we have that

All

⁰

rank A = rank

0 A,, ^- &&‘4,

1

.

Therefore, we can change the nullity of A (from 0 to r or vice versa) by perturbing the submatrix A,. This complete the proof. n

On the basis of Lemma 2.1 and 2.2, we want to find a submatrix H of A such that if we perturb H, then n(A) is changed either from 0 to r or vice versa.

DEFINITION. Let C,(r) denote the set of all r ^x r submatrices

of A, where n(A) can be changed (from 0 to r or vice versa) by perturbing the submatrix H of A alone.

(6)

When an element H in C,(r) is found, we use permutations, called P and Q, to permute H to the last r X r position of PAQ. In the following we will prove the fundamental theorem on the existence of a generalized LU(r) factorization for PAQ.

THEOREM 2.3. Let P and Q be two permutations. Then PAQ has a generalized LU(r) factorization as in (2.1) if and only if P and Q permute a submatrix H in C,(r) to the last r-by-r position of PAQ.

Proof. The “only if” part is Lemma 2.1. We prove the “if” part as follows. Let

All Al,

PAQ= 1

^A ^A

₁

^, ^where A,, E Rrx’.

21 22

By Lemma 2.2, A,, is nonsingular. So A,, has an LU(0) factorization, say

~~~~~~~ = L,,U,,, where rri, 8, are permutations and Vi, has nonzero diagonal elements. Let

P, = “0’ 0

[ ^I

^r 1

Thus

where U,, = A,, - A,181U~~1L~1~lAl, = A,, - A,,A,‘A,,. Therefore,

PAQ has a generalized LU(r)-factorization. w

In the following two sections, we shall establish some subsets of C,(r) which will give further equivalent and sufficient conditions for the existence of the LU(r) factorization of a singular or a nonsingular matrix. This information will lead to a practical algorithm in Section 5.

3. THE SINGULAR CASE

In this section, let A be a singular matrix with n(A) = r. We shall show how to find P and Q such that PAQ h as a generalized LU(r) factorization (2.1) with U, = 0. First, we need the following two preliminary lemmas.

(7)

RANK REVEALING LU FAGTORIZATIONS 121

LEMMA

3.1. Let D =

diag(d, ,..., d,_,,O ,..., 01, U, z

[ul,....url, and

V, = [u,,

. , II,] E R"".

Then

= ,G d,U,( n - r + 1,. . . , n(1,. , r)V,( n - r + 1,. . . , n(1,. . . , r).

Proof. See appendix. W

Now, let A = X CY T be the SVD of A, where c = diag(ar ,

. . , (T

_n--r*

0 ,..., O>,X=[xl ,...,

r,],andY=[yr ,..., y,]areorthogonahFrom now on, let H denote an r x r matrix [a,,]. Then

LEMMA 3.2.

A + 2 k &e,,ei l=l k=l

=(detXdetYdetH)X(i, ,..., i,(n-r+l,..., n)

n-r

XY(jl,. .

.,jrln-r+ l,..., n)ngi.

i=l

Proof. See appendix. n

The next lemma shows that Cl(r) is related to the left and right singular vectors corresponding to 0, _ r+ i = *** = a, = 0, respectively.

DEFINITION. Define the set

C,(r) = {H: H isin C,(r)

satisfying

X(i,,...,i,ln -

r + 1,

. . ..n) # 0

andY(j,,... ,j,ln - r + 1,.

. . , n) # 0).

Note that we use nonzero determinants instead of nonzero components to generalize the definition of C,(r) in Chan [4].

LEMMA 3.3. Zf A is singular with n(A) = r, then C,(r) = C,(r) and is rwnempty.

(8)

122 T.-M. HWANG, W.-W. LIN, AND E. K. YANG Proof. By Lemma 3.2, we know that for a nonsingular matrix H = [a,,]

E RrXr

A + i i 6,, e,,eJ? # 0 if and only if

I=1 k=l

X(i,, . .

^.) _i,ln-r+I _,..., _n)ZO and Y(ji ,..., j,ln-r+l,..., n)#O.

Thus C,(r) = C,(r).

It remains to show that C,(r) is nonempty. If

X(i, ,

. . , i,ln - r + 1 1 . . . , n) =

0

for any possible set of i, : 1 = 1,. . , T, then {x,,_,+ ,, . . . , x,,) is linearly dependent, which is contradictory. This can be similarly shown for the determinant involving Y. Therefore, C, is nonempty. n

Combining Theorem 2.3 and Lemma 3.3, we have the following primary result of this section.

THEOREM 3.4. Suppose that n(A) = r. Then PAQ has a generalized LU(r) factorization (2.1) with U,, =

0

if and only q P and Q permute an element in C,(r) to the last r x r position of PAQ. Moreover, there always exists at least one such j&torization for any singular A.

4. THE NONSINGULAR CASE

In this section, we assume that A is nonsingular but with r small singular values. We shall show how to find permutations P and Q such that PAQ has a generalized LU(r) f ac t orization

(2.1)

with such a small U,, E

R" r

that the factorization reveals the nearly rank r deficiency of A. First, we show that C,(r) is related to r-by-r submatrices of A-’ with nonzero determinants.

DEFINITION. Let M = A-l. Define

cd?-)

E

{H = A[i,,...,i,Ij,,...,j,]:/3 =

M(j,,

. . ..jrlil....,ir) # 0).

LEMMA 4.1. Let A be nonsingular. Zf H E C,(r) and H = [ 6,,], for 1, k = 1,. . . , r, is the inverse of M[j,, . . . ,j,li,, . , i,], i.e.,

‘1, =

( -

l)‘+k

P M(j,,...,.fk, . . . . j,lil ,..., i, ,..., i,), (4.1)

(9)

RANK REVEALING LU FAGTORIZATIONS

where i means “omit i,” then

n A - i k &eireI~

( I=1 k=l In other words, we have C,(r) C C,(r).

Proof. See appendix.

123

r.

n THEOREM 4.2. Zf P and Q pemute a submatrix A[i,, . . , i,lj,, . . . , j,]

E C,(r) to the last r X r position of PAQ, and

A - i i Slkei,ejT

Z=l k=l

then PAQ has a generalized LU(r) factorization (2.1) with U, = [t&l.

Proof. Let

PAQ=

4,

^A

4,

[ ²¹ ^A ²²

1

^> ^where ^A²² ^E ^R”‘.

By Lemma 2.2 and the definition of C,(r), we have that A,, is nonsingular.

By Theorem 2.3, PAQ has a generalized LU(r) factorization. Let it be

If P, = diaar,, Z,) and Q1 = diag(8,, I,), then

Let li=A,

where -

“141% = 4% ami v,, = A22 - L2,42.

(4.2)

&2 = A22 - [slkl~

(4.3)

(10)

On the other hand,

where

ti22 = AZ2 -

L,,U,, .

(4.4)

n(A) = r implies tiss = 0. From (4.2), (4.31, and (4.41, we then have

Q!s =

&,I.

ⁿ

Combining Lemma 4.1 and Theorem 4.2, we have the following theorem immediately.

THEOREM 4.3. Zf A is nonsingular and P and Q permute an element of C,(r) to the last r x r position of PAQ, then PAQ has a generalized LU(r) factorization (2.1). Moreouer, the (1, k)th entry of U,, , say 6,, , is equal to

( -

l)‘+k

P M(j,, . . . ?ik,. . . , j,bl,. . . ,i,, . . . ,ir) (4.5) for 1, k = 1,. . . , r, where p = M(j,, . . . , j,lil,. . , i,).

Now, let XrAY = x = diag(u,, . . . , u,,) be the SVD of A with cri > ..*

> a, > 0, and denote the columns of X and Y by xi and yi, respectively.

In the following, we will characterize the generalized LU(r) factorization (2.1) of PAQ with small U,, E Rrx’ by the singular vectors x,_,+i, . . . , x, and Yn-,+l>...> Yn corresponding to ~“_~+i,. . . , a,,, where on_, * ~“_~+i

= O(E) (small in magnitude).

DEFINITION. Define

C,(r) = H e A[i,, . . . , irljl,. . . ,jr] :

absY(j,,... , jrln - r + 1,. . . , n) >

( r!(nn;r)! r2

and abs X(i,, . . . , i,ln - r + 1,. . . , n) >

r!(n - r)!

n!

where abs denotes the absolute value.

(11)

RANK REVEALING LU FACTORIZATIONS 125

THEOREM 4.4. C,(r) is empty.

Proof. Let Y, = Y[l, . . . , nln - r + 1,. . . , n]. Since Y, is orthogonal, by the Binet-Cauchy formula we have

1 = det Y,‘Y, = c Y,(k,, . . . , k,ln - r + 1,. . .) n)‘.

l<k,< ... <k,<n

Hence, there exist indices j,, . . . , j, such that

absY(j, ,...,

j,ln -

r + l,..., n) > ( r!‘nn; r)! r2,

Similarly, there exist i,, , i, such that

abs X(i,, . , i,)n - r + 1,. . ., n) > ( r!(nn; r)! r2. ,

Next, we show our main result in this section. The next theorem, using the Binet-Cauchy formula, is also the main contribution of this paper.

THEOREM 4.5. Let A be nonsingular. Zf P and Q permute an element in C,(r) to the last r x r position of PAQ, then PAQ has a generalized LU(r) factorization (2.1) with U, = [ 6,,] satisfying the following upper bound:

IS,,1 < n!

[

1 ^-1

n. q-r+1

r!(n - r)! s-r+1 ’ - r!(n - r)! q_,

1

for 1, k = 1,. . . , r, provided that th e quantity inside the bracket is positive.

Proof. For ease of exposition, we shall first use the case of r = 2 to illustrate the derivation. Let A = XXY T and M = A-’ = Et=, uilykxl.

Then

M[j, kli,Z] = Y[j, kll,. ..,n]diag(a,‘,..., a;‘)X[i,Zll,..., n]‘.

(12)

By the Binet-Cauchy formula, we have

P = M(j,klQ) = c ~(j,klp,9)X(~,~lp,q)~,-‘a,-’

l<p<9<n

= Y(j,

kin -

1, n)X(i,Zln - l,n)a,.,a,’

+

c y(j,klp,q)X(~,~lp,q)~,-‘~~‘. (4.6)

l<p<g<n (p,q)zh-l,n)

In the following we will show the upper bound for the second term in (4.6) is o-;lo--?,:

abs c Y(j,klp,9)X(i,~lp,q)a~‘~~’

l<p<q<n (p,q)+(n-l,n)

Q

a,-lu;J, c abs[Y(j,klp,q)X(i,Ilp,q)]

l<p<q<n (p,g)f(n-1.n)

i

1

^l/2

Q

a, ‘u;Jg c Y(j,klp,9)’

l<p<q<n (p,q)z(n-l,n)

i

^c x(i,zlp,9)2 1

^l/2

l,<p<q<n (p,q)+(n-l,n)

(by the Cauchy-Schwarz inequality)

< a, lu”-12 .

The last inequality follows from applying the Binet-Cauchy formula to the matrices

I2 =

Y [ j,

kll, . . . , n]Y [ j,

kll, . . . , nl’

(4.7a) and

z2 =

X[i, 111,. . *

)

n]x[i, 111,. . . , n]‘. (4.7.b)

(13)

RANK REVEALING LU FACTORIZATION!3 127

By the definition of C,(r) and the assumption that the quantity inside the bracket is positive, we have

Hence C,(r) G C,(r) and

n(n - 1)

IPI-‘<

2 anun_, 1 -

[

n(n -

l)U"_,

-l

1

2u,_, .

We show this similarly for the general case, i.e., when the nearly rank deficiency is r. By the Binet-Cauchy formula,

p = M(j,, . . . ,j&,, . . . ) iJ

= Y(&,. . &In - r + 1,. . , rr)

x X(i,, . . . ) i,ln - ?- + 1,. .) f+n-J,+l a*’ o;,’

+ c Y(jl>. . .

J,lp,, . . . > pr)

l<p,< ..’ <p,<n (p I,.‘., p,)#(n-r+1,.. ,n)

x(i1,. . .) i,lp,, . .) p,)up;l 0.. up;‘. (4.8)

The second term of (4.8) can be estimated as follows by using the Cauchy- Schwarz inequality and the Binet-Cauchy formula:

abs (

c

Y(jl> .

^.J,l p,, . f. > pJ

l<pl< .‘. <p,<n (pl ,...> p,)+(n-r+ I,..., n)

xx@,,. . .)

i,lp,, . . ., p,)c$yl --* up;’

< U” -1 -1

... un-r+2un-?r c

l<p,< “’ <pr4n (pl ,..., p,)+(n-r+l,..., n)

1

^l/2

qj1,. . . J,l p,, . . . , pt.)2

(14)

128 T.-M. HWANG, W.-W. LIN, AND E. K. YANG

X

1 ^c x(i,, . . .

^I

i,l p,, . . . , pJ2

^l/2

l<p,< .‘. <p,<n

(PI,. .., p,)+(n-r+l,..., n)

i

Hence, by the definition of C,(r), (4.8), (4.9), and the assumption of the theorem, a lower bound on I p I is given by

r!(n - r)!

ii

[

n!

IPI, n! u;’ 1 - - ^q-r+1 >o.

p=n-r+l r!(n - r)! u,_, I

Thus, we have C,(r) c C,(r) and

IpI-’

Q

n!

ii

r!(n - r)! p_n-r+l ^up 1 -

[

n! _q-r+1 ^-1

r!(n - r)! - u,_,

1

^. ^(4.10)

By Theorem 4.3, PAQ h as a generalized LU(r) factorization (2.1). We have estimated an upper bound (4.10) for I P 1-l. To estimate an upper bound for I S,, I (see Theorem 4.3 for S,, ), we only need to estimate an upper bound for the numerator of /6,,1. By using the Cauchy-Schwarz inequality and the Binet-Cauchy formula in a similar way, we have

abs(M(j,,.. .,I’~,.. .,j,li,,. ..,i, ,..., ir))

= abs

( c Y(jr 1..., j.k >...>

j,lpw

_Pr-

1)

l=sp,< .” <p,_,<n

xx@,, . . . ,

il, .

. . ,

i,l p,, . . . , p,_ &qy 52, 1

< a, -1 -1

*** q-*+2. (4.11)

By (4.10) and (4.11) we complete the proof. n

Theorem 4.4 and Theorem 4.5 together establish the existence of a generalized W(r) f ac orization with small U, t for any rank r deficient square matrix A. In addition to being nonempty, C,(r) is applicable in practical algorithms. In the next section, we propose such an algorithm for

(15)

RANK REVEALING ^LU FACI-ORIZATIONS 129

finding an element in C,(r), which leads to a generalized W(r) factorization with small U, as bounded by the bound of Theorem 4.5.

5. ALGORITHM AND EXAMPLES

Suppose the matrix A E

Rnx"

has nearly rank deficiency r (r > 1) and the quantity [n!/r!(n - r)!]a,_,.+, is sufficiently small, where a, _ r+ i is the rth smallest singular value of A. By using Theorem 4.4 and Theorem 4.5, we give an efficient algorithm for finding a rank revealing LU(r) factorization for the matrix A.

ALGORITHM RRLU(r). Given A E

R""'

with nearly rank deficiency r, I where r > 1 (but r is unknown a priori). Let X rAY = diag(o,, . . . , q,,) be the SVD of A with cri > **a > a,_, * ~“_~+i > 9.. > a;, > 0. This algorithm computes permutations P, Q and a generalized LU(r) factorization of

PAQ

PdPAQ)Ql = [ ::; ;][ “;I’ ;I]

with a small Us, = [S,,] E

R"',

which reveals the rank r deficiency of A.

Step 1: Compute the LU(0) factorization of A by some conventional pivoting strategy (e.g., partial pivoting):

Step 2: Determine a temporary rank deficiency i. Determine the index i (0 < ; < n) such that ltii j 1 < tolerance for all i, j = n - i + 1 . . > n and there exists an’index k E (n - i,. . . , n} with I&,_, k]

>: ’ tolerance.

Step 3: Use inverse iteration to determine the true rank deficiency r and compute the approximate singular values Us_ r+ i(A), , a,( A), the corresponding approximate right singular vectors Y, =

r+i>“‘r t::I,+,,...,x:l:

y ] and the approximate left singular vectors X, =

If i = 0, then set m = 1; else set m := i.

For k = 1,2,. . . m:= 2m.

Given an orthonormal matrix 2 E

Rnx"

(16)

Set Q = I, A, = 1.

While A1 > tolerance, do Set U = ZQ.

Solve AW = U by using the factorization (5.1).

Solve ATV = W by using the factorization (5.1).

Let 2 = V(VTV)-1/2 (e.g., Gram-Schmidt).

Compute the eigenvalues of ZTAATZ. Since ZTAATZ = ZTU(VTV)-‘/2, we compute an orthogonal Q E

Rmx" so

that

QTZTU(VTV)-1’2 Q = hag(h,,...,&), (5.2) where 0 Q A, < .*. < A,,,.

Endwhile

If there exists index r (1 < r < m) such that A, Q tolerance and A r+ 1 > tolerance then stop; else continue;

Endfor k.

Let X, = ZQr, where Qr is the first r columns of Q.

Remark : The convergence rate for the above inverse iteration process without adding (5.2) is (c”_ .+Ju”_~)~. The convergence rate is now accelerated by the correction step (5.2) (see [8] for details).

Solve A? = X,.

Let Y, = yl(ylr+?

Compute the singular values of X,?AY,, which are approximations to the smallest singular values a, _ r+ 1(A), . , q,(A).

If r = i, then done (first pass).

Step 4: Determine an element in the set C,(r).

Comment: Indeed, the maximal elements of the sets

(absY,.(j,, . ,j,ll,. . . , r): 1 <j, < ... <jr < n} =y and

{abs X,(i,, . , i,(l,. . , r): 1 < i, < *** < i, < n} E-?- satisfy the conditions in C,(r). That is, the corresponding indices

(j,, . . . , jr>

^{and (i,,} ^{. . , i,)} yield a submatrix

H = A[i,, . . . ,

i,lj,, . . ..j.l

ⁱⁿ^C,(r). Unfortunately, for an r > 2 it is not econom- ical to find the maximums of y and z’, because that needs n!/[r!(n - r)!] flop counts. In the following we give an efficient algorithm to find an element in C,(r).

(17)

RANK REVEALING LU FAGTORIZATIONS 131

Compute the LU(O) factorizations with complete pivoting of Y, and

X, ,

respectively:

Q’Y,e, = L,

2 , [

¹

PTX,8, =

L, RX

[

0 1

’

^(5.3b)

where PT, Q’ E R”“‘, 13,) 0, E R” r are permutations, L, , Lx E R “‘” are unit lower triangular, and R, , Rx are upper triangular.

Comment: Let (1,. . . , n)Q = (j,, . . ,j,) and (1,.

. . , n>P = (i,,

. . . , i,). It is easily seen that Y,.(ji, . . . ,j,ll, . . . , r) = det R, and X,(i,, . . . ,

i,ll, .

. . , r) = det Rx. Although we cannot prove that both ldet Rrl and ldet Rx1 are larger than (r!(n - r)!/n!)“‘, a statistical result shows that there is no counterexample in up to a total of about 60,000 randomly generated tested orthonormal matrices Y,. and X, E Rnx’ for n = 10,12,. . . ,100 and r = 2,. . . , n/2.

That is, the corresponding submatrix H = A[i,, . . . ,

i,lj,, . . ,j,l

produced by (5.3) is always in C,(r).

Step 5: Compute the generalized LU(r) factorization (2.1) of PAQ: Per- form the Gaussian eliminations by using the partial pivoting only on the 1st row to the (n - r)th row of the current matrix.

Comment: By Theorem 4.5, we have that the entries 6,, of U,, satisfy

n! n! -1

l&l G

CT

[

n-r+1

r!(n - r)! ^@i-r+1 ^{1 -} r!(n - r)! u”,-~

I

for 1, k = 1, . . . , r.

The main work of this algorithm consists of two parts. The first part (steps 1, 2, and 3), referred to as the first pass of the algorithm, computes an initial LU(0) factorization of A. It requires n3/3 + 0(n2) flops. If and only if r z i in step 3, then we perform the second part of the algorithm (steps 4 and 5), which is referred as the second pass of Algorithm RR LU(r). Step 3 computes the left and right singular vectors corresponding to u .-.+1(A), . . . > a,(A) by using the inverse iteration method. It needs about 2Jn2r flops, where J is the number of inverse iterations used for the computation of X, and Y,. Usually / < 5 is sufficient in practice. Step 4 computes the complete LU(O)-factorizations of X, and Y,. as in (5.3). It

(18)

132 T.-M. HWANG, W.-W. LIN, AND E. K. YANG

requires about 2nr ’ flops. Step 5 refactors the matrix PAQ. Here, we compute the Gaussian eliminations from the 1st column to the (n - r)th column by forcing the partial pivoting only on the submatrix Ack)[k, . . . , n - rlk,..., n - r], where Ack’ denotes the current matrix in the elimination process with initial matrix A (i) = PAQ. It requires about n3/3 flops. There- fore, the total flop count C(r) for Algorithm RRLU(t-) is given by

C(r) = +n” + 2]n2r + 2nr2

= $n” + 4n2r + 2nr2, assuming that J = 2.

Therefore, if r 4 n, the total work for Algorithm RRLU(r) is the same as that for RRQR(r) [3, lo]. H owever, Algorithm RRLU(r) has the following practical advantage over RRQR(r). Chan [4] notes that the rank deficiencies of almost all nearly singular matrices can be successfully detected by the first pass of RRLU(r) algorithm, unless A is nearly singular and A-’ has a very skew distribution of the sizes of its elements. The first pass requires in” + 0(n2), which is only half the cost of computing the QR factorization in Algorithm RRQR(r) [3, lo].

Chan [4] presented numerical results for two well-known matrices:

T,, = ]

-yEp

^(5.4a)

. -1 1 1 and

w=

1 10 1

1 9 1

1 1

10 1

1 -1 1

i -9 1

1 - 10

E R21X21

(5.4b)

(19)

RANK REVEALING LU FACTORIZATIONS 133

which are nearly singular with nearly rank 1 deficiency. In the following, we construct some nearly singular matrices with higher-dimensional rank deficiency based on a direct sum of matrices T, or W. All computations were performed on a PC ^MATLAB.

EXAMPLE 4.1. Let A = diag(T,,, Tdo). Compute elementary row and column operations of A by the following steps:

fori = 1:40

A(1:40,81-i)=A(1:40,81-i)+A(1:40,i)

A(i, 41: 80) = A(i, 41: 80) + A(81 - i, 41: 80) end

Then the resulting matrix is

A=

1 -1

0

. -1 -1

-1 -1

1 2

1 -1 -1

-1 2

2 -1 .

. -1 . -1 . -1

1

which has nearly rank 2 deficiency.

The three smallest singular values of A are 6.214 X lo-‘, 1.93 X lo-‘“, and 1.93 x 1~‘~. Algorithm RRLU(r) produces the generalized LU(2) factorization (2.1) with

u, =

1.05 x lo-i5 3.638 x lo-l2 -3.638 x lo-l2 1.323 x 1O-23 I ’

The upper bound of elements of U, in Theorem 4.5 is 6.098 ^X lo-‘.

EXAMPLE 4.2. Let A = diag(TN, T30, T30). Compute elementary row

(20)

and column operations of A by the following steps:

fir i = 1: 30

A(1:90,61 - i) = A(1:90,61 - i) + A(1:90,i) A(i, 1: 90) = A(i, 1: 90) + A(61 - i, 1: 90) end

for i = 1: 30

A(1 : 90,91 - i) = A(1 : 90,91 - i> + A(1 : 90,30 + i) A(30+i,1:90)=A(30+i,1:90)+A(91-i,1:90)

end

Then the resulting matrix is

A=

1 -1 -1 . . . -1 -1 . . . -1 2 2 -1 -1 . . . -1

1 -1 . . . -1 -1 . . . 2 -1 -1 2 -1 . . . -1

1’. : :

. . -i -i 2 -i -i i -i

1 2 -1.. . -1 . . . -1 2

1 -1.. . -1 . . . -1 2

. . . 2 -1

. . . . . .

The smallest four singular values of A are 4.0204 X lo-‘, 2.794 X lo-‘, 1.397 x 1o-g, and 1.397 X lo-‘. Algorithm RRLU(r) produces the generalized LU(3) factorization (2.1) with

-3.725 x lo-’ -3.868 x 10-l’ 1.388 x 10-l’

u, = [

5.454 x 10-l’ -3.726 x lo-’ 3.725 x lo-’ .

-1.585 x 10-l’ 3.725 x lo-’ 5.906 x 1o-26

1

The upper bound of elements of U, in Theorem 4.5 is 3.285 X 10e4.

EXAMPLE 4.3. Let A = diag(W, W). Compute elementary row and

. -1-1 2 . .

0 ₁ 2 -1 . . . -1 -1

1 -1 . . . -1 -1 1 . .

. . -1 -i 1 -1

1

which has nearly rank 3 deficiency.

(21)

RANK REVEALING LU FACTORIZATION!5

column operations of A by the following steps:

j&i = 1:21

A(1:21,43 - i) =A(l:21,43 -i) +A(1:21,i) A(i, 22 : 42) = A(i, 22 : 42) + A(43 - i, 22 : 42) end

Then the resulting matrix is

A=

1

1 1

-1

1.

2 0 2 0 2 2 0 2

.

. .

.

-9

120 2

1 -10 0 2 10 1

19 1

1 1’ 1

10 1

1 -1 1

l’-i 1

1 -10 135

which has nearly rank deficiency r = 2.

The three smallest singular values of A are 5.523 X 10-r, 8.675 X lo-r6, and 2.669 x lo- 16. Step 1 of Alg orithm

submatrix of fi as follows:

RRLU(r) produces the last 3 x 3

1.0 x loo -9.0 x 10” 1.0 x loo

ri[n - 2,n] = 0 1.0 x loo -1.0 x 10’ ,

0 0 8.1552 x 10P”

1

which gives the temporary rank deficiency i = 1. To illustrate in a general case, suppose that the true rank deficiency r is unknown a priori. Step 3 of the algorithm always determines correctly r. The second pass of Algorithm

(22)

136 T.-M. HWANC, W.-W. LIN, AND E. K. YANG

RR LLJ(2) produces the generalized

LU(2)

factorization (2.1) with

u22 =

^-0.20970.1106 x lo-l6 ^{x lo-l6} ^-0.6775^-0.2944 ^{x lo-i6}^{x lo-r6} I

The theoretical upper bound in Theorem 4.5 is 7.4692 X 10-13. Note that only if the first pass produces a nearly rank deficiency r which is larger than i is the second pass required in order to obtain the correct RRLU(r) factorization.

Although the theoretical bound in Theorem 4.5 is not always tight, it works well if [ n!/r!(n - r)!]a,_ r+ 1 is small [i.e., smaller than the tolerance we desire in Algorithm RR

LU(r)].

Under this condition, Algorithm RR

LU(r)

can produce small U,, even though the conventional partial-pivoting

LU

factorization fails to do so.

6. CONCLUSION

The main contribution of this paper is to extend the theory of Chan 141 for rank revealing

LU

factorizations to the general case when the nearly rank deficiency is greater than one. We have also proposed an efficient two-pass algorithm for finding an RR

LU(r)

factorization which usually succeeds in the first pass, thus taking $” + 0(n2) flops. If the first pass fails (i.e., r # F), then the current efficient implementation of the second pass, taking another

$” flops, finds RRLU(r) f ac orizations t for all of our 60,000 test problems.

In comparison, Algorithm RRQR( r needs 97~” flops in the first pass and ) O(n’) flops in the second pass. The rank deficiency of most randomly generated nearly singular matrices can be detected by the first pass of both algorithms. Therefore, for the square matrix A, Algorithm RRLU(r) is in most cases twice as efficient as RRQR(r). In the extreme case that the first pass fails and step 4 also fails to find an element in C,(r) (which never happened in our tests), we can switch to Algorithm RRQR(r) of [4, 71 as a last resort. This hybrid strategy can take advantage of both the efficiency of our Algorithm RR

LU( r )

and the hundred-percent guarantee of finding an RR(r) factorization given by Algorithm RRQR(r).

APPENDIX

In this appendix, we shall prove Lemma 3.1, Lemma 3.2, and Lemma 4.1.

(23)

RANK REVEALING LU FACTORIZATIONS

Proof of Lemma 3.1. Let

D, = diag(d, ,..., d,_,,l,..., l),

I, = diag(l,..., 1,0 ,..., 0).

Then

D + i up: = ^D,

^{I, +} ⁱ

(

D,'ui)u,T

i=l i=l I

Z + i (D;‘u,)u~ - i

e,_i+le;_i+l

i=l i=l

1

= D,(Z + [D,‘u,,. . . ,

D&.,e,_,+,,. we,]

x[u, ,...> u,,--e,-,+,,..., -“IT)

= D,B.

Let

c = z + [Ul,. . . ,u,, -e”_r+l,. . .) --enIT

XID;‘ul ,..., D;lu,,e,_,+l ,..., e,].

137

(‘4 J)

Then

C= ^{I, +}VTTD;‘U, V,[n - r + 1,. . . , nil, . . . , f-1

-u,[n - r + 1,. . .) nil,.

. .) ?-I 0, 1.

Thus

det C = Ur(fl - r + 1,. . . , nil,. . . , r)Vr(n - r + 1,. . , 41, . . , r).

(A J)

(24)

138 T.-M. HWANG, W.-W. LIN, AND E. K. YANG

From (A.l), (A.2), and the well-known result det B = det C, the conclusion

follows. n

Proof of Lemma 3.2. From

= det (X[Z + 2 (Xrei,)( i S,,(Yrejk)‘)]Yr} (A .3)

I=1 k=l

and Lemma 3.1, we have

(A.31

=det Xdet Ydet(C + X’[l,. . .,nli,, .., i,.]HY[j,,. . . ,j,ll,. ., n])

=(det XdetYdet H)XT(n - r + l,..., nli, ,..., i,.)

XY(j,,

^.^.^.,jrln^{- r +}l,...,n)nai.

i=l

w

Proof of Lemma 4.1. By defining U and V as

’ E [ ‘ll(

^Mei,), ^{. . .}⁾

slr( Me,,), . . . , a,,(

Me,,), . . . ,

S,,(

A4eir)]

and

VE

[-e. ,,, . . . ,

-ejr,.

. . ,

-ej,,

. . ,

-ejrlT we have

A - i i Slkei,ejT, =

A Z

- i k ^a,,(

^hfe,,)ei

Z=l k=l I=1 k=l I

=

A[Z - UVT].

(25)

RANK REVEALING LU FACTORIZATIONS 139

Let W = Z - V TU. We have n(W) = n( A[ Z - W T]). Since H is nonsingu- lar, for each i = l,..., r there exists j such that aij # 0. By a careful computation, S,, (1, k = 1,. , r.) g ‘ven by (4.1) in Lemma 4.1 solve the equations

det

and

-&kmjli, m.. _JI’I+I _‘..

= 0 for all I = 1,. . , r.

Therefore, we have n(W) z r.

Next we need to show that n(W) < r. After row subtractions on the matrix

w=

-1 - s,,mjltl ... -

s,,mj,i, ...

-%lm,$, ...

-km,,,,

-6.

^llmj,i, “’ 1 - Slrmj r ^il ... -S,;mjrz, ... - Cmj,,,

- S,;mj,i, ... -S,,mili, ... - S,,.mjl,, .‘. - %imj,zr

- S,,'mjvi, ... - S,,mjri, ... -

%lytF ... -

^%Jmj.,_

-S,,mj,i, ... - Slrmjli, ^... 1 - tir,mj,i, ... - ^hmj,i,

-6 Ilmj,i, .” - S,Jm,ril ... - S,;mjri, ... 1 - irmjri,

(26)

we have the new transformed matrix

1 ^-1 ⁰

0

1 (j -‘I

*= 0 1 -1 0

1 0 -'l

-S,,mj,i, “’ -6,,m,,,, . . -Sr-l,rmj,i,_, 1 - hlmj,,, “’ -hm,,+

-6 Ilm,,il ... - Slrmj,t, ’ .

-Gm,,,,.,

-Srlm,,zr ... 1 - krm,,,,

Thus rank W = rank W > r(r. - l), which implies n(W) < r and

A - 2 i Slkei,e,: =n(W) =r.

I=1 k=l

Therefore, C,(r) c C,(r). 4

We are grateful to Professor Tony Chan and an anonymous referee for their many helpful and detailed suggestions.

REFERENCES

A. Bjijrck, Least squares methods, in Handbook of Numerical Analysis (P. G.

Ciarlet and J. L. Lions, Eds.), Vol. 1, North-Holland, 1990, pp. 465-652.

A. BjGrck, A direct method for the solution of sparse linear least squares problems, in Large Scale Matrix Problems (A. Bjiirck, R. J. Plemmons, and H.

Schneider, Eds.), North-Holland, 1981.

T. F. Chan, Rank revealing QR factorizations, Linear Algebra Appl. 88/89:67-82 (1987).

T. F. Chan, On the existence and computation of LU-factorizations with small pivots, Math. Comp. 42:535-547 (1984).

T. C. Chan, An Efficient Modular Algorithm for Coupled Nonlinear Systems, Research Report YALEU/DCS/RR-328, Sept. 1984.

T. F. Chan, Private communication, 1990.

T. F. Chan and D. C. Resasco, Generalized deflated block-elimination, SZAM J.

Numer. Anal. 23:913-924 (1986).

(27)

RANK REVEALING LU FACTORIZATIONS 141

8 M. Clint and A. Jennings, The evaluation of eigenvalues and eigenvectors of real symmetric matrices by simultaneous iteration, Cornput. J. 13:76-80 (1970).

9 Ft. Fletcher, Practical Methods of Optimization, Vol. 2, Wiley, 1981, pp. 38-39, 86-87, 103.

10 L. V. Foster, The probability of large diagonal elements in the QR factorization, SIAM]. Sci. Statist. Cornput. 11:531-544 (1990).

Received 8 October 1990; final manuscript accepted 9 ]uly 1991

Rank Revealing LU Factorizations Tsung-Min Hwang, Wen-Wei Lin,* and Eugene K. Yang