Rank Revealing LU Factorizations
Tsung-Min Hwang, Wen-Wei Lin,* and Eugene K. Yang Institute of Applied Mathematics
National Tsing Hua University Hsinchu
Taiwan 30043, R. 0. C.
Submitted by Richard A. Brualdi
ABSTRACT
We consider permutations of any given squared matrix and the generalized LU( r ) factorization of the permuted matrix that reveals the rank deficiency of the matrix.
Chan has considered the case with nearly rank deficiency equal to one. This paper extends his results to the case with nearly rank deficiency greater than one. Two applications in constrained optimization are given. We are primarily interested in the existence of such factorizations. In addition to the theories, we also present an efficient two-pass rank revealing LU( r) algorithm.
1. INTRODUCTION
Let A be an n-by-n matrix. We shall consider the generalized LU(r) factorization P, AQ, = LU which will reveal the nearly rank deficiency of A (herein P, and Qr always denote permutation matrices, L unit lower triangular and U upper triangular except for a small r X r block at its last r X r position; see the definition of LU(r) factorization in Section 2 below).
Our main interest is on the nearly singular case. Chan [4] considers the case when the nearly rank deficiency of A is one. In this paper, we extend his results to the case with higher-dimensional rank deficiency. Such a rank
*E-mail: wwlin@AM.nthu.edu.tw.
LINEAR ALGEBRA AND ITS APPLICATIONS 175: 115-141(1992) 0 Elsevier Science Publishing Co., Inc., 1992
115
655 Avenue of the Americas, New York, NY 10010 0024-3795/92/$5.00
116 T.-M. HWANG, W.-W. LIN, AND E. K. YANG
revealing (henceforth, RR) IN(r) f ac orization t is faster than either SVD (singular-value decomposition) or RRQR(r) factorization (see Chan [3] and Foster [lo]) and is important in matrix theory and linear algebra for its wide applications.
One kind of applications of such RR factorization arises from constrained optimization (see Chan [4, 51, Chan and Resasco [7]). When an equality constrained problem is solved by the Lagrange-multiplier approach, we have a symmetric but not positive definite system with the Hessian in the (1,1) block and the constraints constituting the borders. If the Hessian is singular at the solution, then our RR factorization together with deflated block elimination [7] can be used to solve the problem (Chan [6]). The following is another application. If the active-set method is used to solve an inequality constrained optimization problem and the problem is (nearly) degenerate at an intermediate iteration, then the RR factorization is essential to make the method successful (Fletcher [9]). B esi es, d it can also be used to solve least-squares problems following the method proposed by BjGrk [l, 21.
Let T > 1 be the nearly rank deficiency of A. That is, the rth smallest singular value a;, _ r + 1 of A is of small magnitude, and the (r. + I)th smallest singular value a,- r is of order one. We will show that, for this matrix A, there always exists a generalized LU(r) factorization with an T X r position of U. Here “small” means O(a,_,+ l). Chan [4] notes that the usual partial pivoting cannot guarantee to produce small pivots, i.e., to reveal the rank deficiency. In this paper, we shall concern ourselves with the theoretical questions of the existence of such RRLU(r) f ac orizations but shall also give t a practical (i.e. efficient) algorithm for computing such factorizations.
The following notation will be used throughout. n(A) and det A denote the nullity and the determinant of A, respectively. A[i,, . . . , i,(j,, . . . , j,]
denotes the p X p submatrix of A obtained from the intersection of rows 21,. . . > i, withcolumnsjl,...,jp. When the two sets of indices are the same, we write A[i,, , ip] for short. A(i,, . . . , iplj,, . . . ,j,) and A(i,, . . , ip>
denote the determinants of A[i,, . . . , i,lj,, . . . ,j,] and A[i,, . . , i,], respectively.
In this paper some lemmas are simple extensions of those in 141, but others are not simple. In Section 2, we outline the ways to find permutations P and Q such that there exists a generalized LU(r) factorization for PAQ. In Section 3, we discuss the exactly singular case; in Section 4, we discuss the nonsingular case, In Section 5, we present an efficient two-pass algorithm, RRLU(r), that utilizes the theories in the previous sections. We give some numerical results to illustrate that the first pass of Algorithm RRLU(r) fails but the second pass succeeds in revealing the nearly rank deficiency of a given nearly singular matrix.
RANK REVEALING LU FACTORIZATION!3 117
2. EXISTENCE OF GENERALIZED M(T) FACTORIZATIONS
In this section, we first define the generalized LU(r) factorization of a given matrix A E R”’ “, and then we give an equivalence condition for the existence of the LU(r) factorization.
DEFINITION. Let A be an n X n matrix and 0 < r < n. If there exist permutations P, and Qr which, respectively, permute only the first n - T rows and columns of A such that
(2.1)
where Us, E RrXr (not necessary upper triangular), U,, E R(“-‘)x(“-‘) is upper triangular, and L,, E R(“-‘!x(n-r) is unit lower triangular, then we say that A has a generalized LU(r) factorization.
Note that the generalized LU(0) factorization of A is the usual LU factorization of A (which always exists).
We first prove two lemmas for the fundamental existence theorem.
LEMMA 2.1.
(a) Zf A i-s nonsingular and has a generalized LU(r1 factorization as in (2.0, then we can perturb the submatrix
P,AQ,[n-r+ l,...,n]
by U, to make A singular with nullity equal to r.
(b) Zf A is singular with nullity r and has a generalized LU(r) factoriza- tion (2.1) with U,, = 0, then we can perturb the submatrix P, AQ,[n - r +
1,. . , n] by a nonsingular r X r matrix to make A nonsingular.
Proof. (a): Write
118
Let
T.-M. HWANG, W.-W. LIN, AND E. K. YANG
A = P,AQ, - dkg{0,_,,U2,} = -Lull JLu12
L u
21 11
L u
21 12
1
Thus n(A) = r. The proof for part (b) is similar. n LEMMA 2.2. Let A, with nullity equal to 0
partitioned form
or r, be represented in the
4, A,, A=A
[ 21A >
221
where A,, E
R(n-r)x(n-r)
and A,, ER"'.
Then we can change the nullity of A (from 0 to r or vice versa) by perturbing the submatrix A,, if and only if A,, is nonsingular.Proof. The “only if” part: Suppose that A,, is singular and rank A,, = k.
Let
A,, =U ; ; VT
[ I
be the SVD of A,,, where C = diag(u,, . . . , Us) > 0. Define
S(A22) = [” () 1, “]A[: qT= [s;l s!2 ;g
where S,, and S& are in
R'X("-k-').
For the case rank A = n - r:
(i) If S,, has deficient column rank, then rank S( A,,) < n - 1 for all A
=iii) If S, has full column rank, then
RANK REVEALING LU FACTORIZATIONS 119
and since rank A = n - r, we have S,, = 0. Thus
rankS(Azz) <n - 1 for all A,, .
From (i) and (ii), we know that there does not exist A, such that rank S( A,) = n, which is contradictory. Thus A,, is nonsingular.
For the case rank A = n: Since rank A = n, we have rank S,, = n - k - r (full column rank), rank S,, = n-k-r(fullrowrank),andn_2r<
k < n - r - 1. Thus
rankS(A,) >k + ( n - k - r) + (n - k - r) = 2n - k - 2r.
From n - 2r < k < n - r - 1, we have n - r + 1 < 2n - k - 2r G n.
Therefore, rank S( A,,) 2 n_- r + 1 for all A,,; i.er there does not exist a perturbation of A,, say A,,, such that rank S( A,,) = n - r, which is contradictory. Thus A,, is nonsingular.
The “if” part: Suppose that Al, is nonsingular. Then
From (2.2), we have that
All
0rank A = rank
0 A,, - &&‘4,
1
.Therefore, we can change the nullity of A (from 0 to r or vice versa) by perturbing the submatrix A,. This complete the proof. n
On the basis of Lemma 2.1 and 2.2, we want to find a submatrix H of A such that if we perturb H, then n(A) is changed either from 0 to r or vice versa.
DEFINITION. Let C,(r) denote the set of all r x r submatrices
of A, where n(A) can be changed (from 0 to r or vice versa) by perturbing the submatrix H of A alone.
120 T.-M. HWANG, W.-W. LIN, AND E. K. YANG
When an element H in C,(r) is found, we use permutations, called P and Q, to permute H to the last r X r position of PAQ. In the following we will prove the fundamental theorem on the existence of a generalized LU(r) factorization for PAQ.
THEOREM 2.3. Let P and Q be two permutations. Then PAQ has a generalized LU(r) factorization as in (2.1) if and only if P and Q permute a submatrix H in C,(r) to the last r-by-r position of PAQ.
Proof. The “only if” part is Lemma 2.1. We prove the “if” part as follows. Let
All Al,
PAQ= 1
A A1
, where A,, E Rrx’.21 22
By Lemma 2.2, A,, is nonsingular. So A,, has an LU(0) factorization, say
~~~~~~~ = L,,U,,, where rri, 8, are permutations and Vi, has nonzero diagonal elements. Let
P, = “0’ 0
[ I
r 1
Thus
where U,, = A,, - A,181U~~1L~1~lAl, = A,, - A,,A,‘A,,. Therefore,
PAQ has a generalized LU(r)-factorization. w
In the following two sections, we shall establish some subsets of C,(r) which will give further equivalent and sufficient conditions for the existence of the LU(r) factorization of a singular or a nonsingular matrix. This information will lead to a practical algorithm in Section 5.
3. THE SINGULAR CASE
In this section, let A be a singular matrix with n(A) = r. We shall show how to find P and Q such that PAQ h as a generalized LU(r) factorization (2.1) with U, = 0. First, we need the following two preliminary lemmas.
RANK REVEALING LU FAGTORIZATIONS 121
LEMMA
3.1. Let D =
diag(d, ,..., d,_,,O ,..., 01, U, z[ul,....url, and
V, = [u,,. , II,] E R"".
Then= ,G d,U,( n - r + 1,. . . , n(1,. , r)V,( n - r + 1,. . . , n(1,. . . , r).
Proof. See appendix. W
Now, let A = X CY T be the SVD of A, where c = diag(ar ,
. . , (T
n--r*0 ,..., O>,X=[xl ,...,
r,],andY=[yr ,..., y,]areorthogonahFrom now on, let H denote an r x r matrix [a,,]. ThenLEMMA 3.2.
A + 2 k &e,,ei l=l k=l
=(detXdetYdetH)X(i, ,..., i,(n-r+l,..., n)
n-r
XY(jl,. .
.,jrln-r+ l,..., n)ngi.i=l
Proof. See appendix. n
The next lemma shows that Cl(r) is related to the left and right singular vectors corresponding to 0, _ r+ i = *** = a, = 0, respectively.
DEFINITION. Define the set
C,(r) = {H: H isin C,(r)
satisfying
X(i,,...,i,ln -
r + 1,. . ..n) # 0
andY(j,,... ,j,ln - r + 1,.
. . , n) # 0).
Note that we use nonzero determinants instead of nonzero components to generalize the definition of C,(r) in Chan [4].
LEMMA 3.3. Zf A is singular with n(A) = r, then C,(r) = C,(r) and is rwnempty.
122 T.-M. HWANG, W.-W. LIN, AND E. K. YANG Proof. By Lemma 3.2, we know that for a nonsingular matrix H = [a,,]
E RrXr
A + i i 6,, e,,eJ? # 0 if and only if
I=1 k=l
X(i,, . .
.) i,ln-r+I ,..., n)ZO and Y(ji ,..., j,ln-r+l,..., n)#O.Thus C,(r) = C,(r).
It remains to show that C,(r) is nonempty. If
X(i, ,
. . , i,ln - r + 1 1 . . . , n) =0
for any possible set of i, : 1 = 1,. . , T, then {x,,_,+ ,, . . . , x,,) is linearly dependent, which is contradictory. This can be similarly shown for the determinant involving Y. Therefore, C, is nonempty. nCombining Theorem 2.3 and Lemma 3.3, we have the following primary result of this section.
THEOREM 3.4. Suppose that n(A) = r. Then PAQ has a generalized LU(r) factorization (2.1) with U,, =
0
if and only q P and Q permute an element in C,(r) to the last r x r position of PAQ. Moreover, there always exists at least one such j&torization for any singular A.4. THE NONSINGULAR CASE
In this section, we assume that A is nonsingular but with r small singular values. We shall show how to find permutations P and Q such that PAQ has a generalized LU(r) f ac t orization
(2.1)
with such a small U,, ER" r
that the factorization reveals the nearly rank r deficiency of A. First, we show that C,(r) is related to r-by-r submatrices of A-’ with nonzero determinants.DEFINITION. Let M = A-l. Define
cd?-)
E{H = A[i,,...,i,Ij,,...,j,]:/3 =
M(j,,. . ..jrlil....,ir) # 0).
LEMMA 4.1. Let A be nonsingular. Zf H E C,(r) and H = [ 6,,], for 1, k = 1,. . . , r, is the inverse of M[j,, . . . ,j,li,, . , i,], i.e.,
‘1, =
( -
l)‘+kP M(j,,...,.fk, . . . . j,lil ,..., i, ,..., i,), (4.1)
RANK REVEALING LU FAGTORIZATIONS
where i means “omit i,” then
n A - i k &eireI~
( I=1 k=l In other words, we have C,(r) C C,(r).
Proof. See appendix.
123
r.
n THEOREM 4.2. Zf P and Q pemute a submatrix A[i,, . . , i,lj,, . . . , j,]
E C,(r) to the last r X r position of PAQ, and
A - i i Slkei,ejT
Z=l k=l
then PAQ has a generalized LU(r) factorization (2.1) with U, = [t&l.
Proof. Let
PAQ=
4,
A4,
[ 21 A 22
1
> where A 22 E R”‘.By Lemma 2.2 and the definition of C,(r), we have that A,, is nonsingular.
By Theorem 2.3, PAQ has a generalized LU(r) factorization. Let it be
If P, = diaar,, Z,) and Q1 = diag(8,, I,), then
Let li=A,
where -
“141% = 4% ami v,, = A22 - L2,42.
(4.2)
&2 = A22 - [slkl~
(4.3)
124 T.-M. HWANG, W.-W. LIN, AND E. K. YANG
On the other hand,
where
ti22 = AZ2 -
L,,U,, .(4.4)
n(A) = r implies tiss = 0. From (4.2), (4.31, and (4.41, we then have
Q!s =
&,I.
nCombining Lemma 4.1 and Theorem 4.2, we have the following theorem immediately.
THEOREM 4.3. Zf A is nonsingular and P and Q permute an element of C,(r) to the last r x r position of PAQ, then PAQ has a generalized LU(r) factorization (2.1). Moreouer, the (1, k)th entry of U,, , say 6,, , is equal to
( -
l)‘+kP M(j,, . . . ?ik,. . . , j,bl,. . . ,i,, . . . ,ir) (4.5) for 1, k = 1,. . . , r, where p = M(j,, . . . , j,lil,. . , i,).
Now, let XrAY = x = diag(u,, . . . , u,,) be the SVD of A with cri > ..*
> a, > 0, and denote the columns of X and Y by xi and yi, respectively.
In the following, we will characterize the generalized LU(r) factorization (2.1) of PAQ with small U,, E Rrx’ by the singular vectors x,_,+i, . . . , x, and Yn-,+l>...> Yn corresponding to ~“_~+i,. . . , a,,, where on_, * ~“_~+i
= O(E) (small in magnitude).
DEFINITION. Define
C,(r) = H e A[i,, . . . , irljl,. . . ,jr] :
absY(j,,... , jrln - r + 1,. . . , n) >
( r!(nn;r)! r2
and abs X(i,, . . . , i,ln - r + 1,. . . , n) >
r!(n - r)!
n!
where abs denotes the absolute value.
RANK REVEALING LU FACTORIZATIONS 125
THEOREM 4.4. C,(r) is empty.
Proof. Let Y, = Y[l, . . . , nln - r + 1,. . . , n]. Since Y, is orthogonal, by the Binet-Cauchy formula we have
1 = det Y,‘Y, = c Y,(k,, . . . , k,ln - r + 1,. . .) n)‘.
l<k,< ... <k,<n
Hence, there exist indices j,, . . . , j, such that
absY(j, ,...,
j,ln -
r + l,..., n) > ( r!‘nn; r)! r2,Similarly, there exist i,, , i, such that
abs X(i,, . , i,)n - r + 1,. . ., n) > ( r!(nn; r)! r2. ,
Next, we show our main result in this section. The next theorem, using the Binet-Cauchy formula, is also the main contribution of this paper.
THEOREM 4.5. Let A be nonsingular. Zf P and Q permute an element in C,(r) to the last r x r position of PAQ, then PAQ has a generalized LU(r) factorization (2.1) with U, = [ 6,,] satisfying the following upper bound:
IS,,1 < n!
[
1 -1
n. q-r+1
r!(n - r)! s-r+1 ’ - r!(n - r)! q_,
1
for 1, k = 1,. . . , r, provided that th e quantity inside the bracket is positive.
Proof. For ease of exposition, we shall first use the case of r = 2 to illustrate the derivation. Let A = XXY T and M = A-’ = Et=, uilykxl.
Then
M[j, kli,Z] = Y[j, kll,. ..,n]diag(a,‘,..., a;‘)X[i,Zll,..., n]‘.
126 T.-M. HWANG, W.-W. LIN, AND E. K. YANG
By the Binet-Cauchy formula, we have
P = M(j,klQ) = c ~(j,klp,9)X(~,~lp,q)~,-‘a,-’
l<p<9<n
= Y(j,
kin -
1, n)X(i,Zln - l,n)a,.,a,’+
c y(j,klp,q)X(~,~lp,q)~,-‘~~‘. (4.6)
l<p<g<n (p,q)zh-l,n)
In the following we will show the upper bound for the second term in (4.6) is o-;lo--?,:
abs c Y(j,klp,9)X(i,~lp,q)a~‘~~’
l<p<q<n (p,q)+(n-l,n)
Q
a,-lu;J, c abs[Y(j,klp,q)X(i,Ilp,q)]
l<p<q<n (p,g)f(n-1.n)
i
1
l/2Q
a, ‘u;Jg c Y(j,klp,9)’
l<p<q<n (p,q)z(n-l,n)
i
c x(i,zlp,9)2 1
l/2l,<p<q<n (p,q)+(n-l,n)
(by the Cauchy-Schwarz inequality)
< a, lu”-12 .
The last inequality follows from applying the Binet-Cauchy formula to the matrices
I2 =
Y [ j,
kll, . . . , n]Y [ j,kll, . . . , nl’
(4.7a) andz2 =
X[i, 111,. . *
)n]x[i, 111,. . . , n]‘. (4.7.b)
RANK REVEALING LU FACTORIZATION!3 127
By the definition of C,(r) and the assumption that the quantity inside the bracket is positive, we have
Hence C,(r) G C,(r) and
n(n - 1)
IPI-‘<
2 anun_, 1 -[
n(n -
l)U"_,-l
1
2u,_, .
We show this similarly for the general case, i.e., when the nearly rank deficiency is r. By the Binet-Cauchy formula,
p = M(j,, . . . ,j&,, . . . ) iJ
= Y(&,. . &In - r + 1,. . , rr)
x X(i,, . . . ) i,ln - ?- + 1,. .) f+n-J,+l a*’ o;,’
+ c Y(jl>. . .
J,lp,, . . . > pr)
l<p,< ..’ <p,<n (p I,.‘., p,)#(n-r+1,.. ,n)
x(i1,. . .) i,lp,, . .) p,)up;l 0.. up;‘. (4.8)
The second term of (4.8) can be estimated as follows by using the Cauchy- Schwarz inequality and the Binet-Cauchy formula:
abs (
c
Y(jl> .
. J,l p,, . f. > pJl<pl< .‘. <p,<n (pl ,...> p,)+(n-r+ I,..., n)
xx@,,. . .)
i,lp,, . . ., p,)c$yl --* up;’
< U” -1 -1
... un-r+2un-?r c
l<p,< “’ <pr4n (pl ,..., p,)+(n-r+l,..., n)
1
l/2qj1,. . . J,l p,, . . . , pt.)2
128 T.-M. HWANG, W.-W. LIN, AND E. K. YANG
X
1 c x(i,, . . .
Ii,l p,, . . . , pJ2
l/2l<p,< .‘. <p,<n
(PI,. .., p,)+(n-r+l,..., n)
i
Hence, by the definition of C,(r), (4.8), (4.9), and the assumption of the theorem, a lower bound on I p I is given by
r!(n - r)!
ii
[
n!
IPI, n! u;’ 1 - - q-r+1 >o.
p=n-r+l r!(n - r)! u,_, I
Thus, we have C,(r) c C,(r) and
IpI-’
Qn!
iir!(n - r)! p_n-r+l up 1 -
[
n! q-r+1 -1
r!(n - r)! - u,_,
1
. (4.10)By Theorem 4.3, PAQ h as a generalized LU(r) factorization (2.1). We have estimated an upper bound (4.10) for I P 1-l. To estimate an upper bound for I S,, I (see Theorem 4.3 for S,, ), we only need to estimate an upper bound for the numerator of /6,,1. By using the Cauchy-Schwarz inequality and the Binet-Cauchy formula in a similar way, we have
abs(M(j,,.. .,I’~,.. .,j,li,,. ..,i, ,..., ir))
= abs
( c Y(jr 1..., j.k >...>
j,lpw
Pr-1)
l=sp,< .” <p,_,<n
xx@,, . . . ,
il, .
. . ,i,l p,, . . . , p,_ &qy 52, 1
< a, -1 -1
*** q-*+2. (4.11)
By (4.10) and (4.11) we complete the proof. n
Theorem 4.4 and Theorem 4.5 together establish the existence of a generalized W(r) f ac orization with small U, t for any rank r deficient square matrix A. In addition to being nonempty, C,(r) is applicable in practical algorithms. In the next section, we propose such an algorithm for
RANK REVEALING LU FACI-ORIZATIONS 129
finding an element in C,(r), which leads to a generalized W(r) factorization with small U, as bounded by the bound of Theorem 4.5.
5. ALGORITHM AND EXAMPLES
Suppose the matrix A E
Rnx"
has nearly rank deficiency r (r > 1) and the quantity [n!/r!(n - r)!]a,_,.+, is sufficiently small, where a, _ r+ i is the rth smallest singular value of A. By using Theorem 4.4 and Theorem 4.5, we give an efficient algorithm for finding a rank revealing LU(r) factorization for the matrix A.ALGORITHM RRLU(r). Given A E
R""'
with nearly rank deficiency r, I where r > 1 (but r is unknown a priori). Let X rAY = diag(o,, . . . , q,,) be the SVD of A with cri > **a > a,_, * ~“_~+i > 9.. > a;, > 0. This algo- rithm computes permutations P, Q and a generalized LU(r) factorization ofPAQ
PdPAQ)Ql = [ ::; ;][ “;I’ ;I]
with a small Us, = [S,,] E
R"',
which reveals the rank r deficiency of A.Step 1: Compute the LU(0) factorization of A by some conventional pivoting strategy (e.g., partial pivoting):
Step 2: Determine a temporary rank deficiency i. Determine the index i (0 < ; < n) such that ltii j 1 < tolerance for all i, j = n - i + 1 . . > n and there exists an’index k E (n - i,. . . , n} with I&,_, k]
>: ’ tolerance.
Step 3: Use inverse iteration to determine the true rank deficiency r and compute the approximate singular values Us_ r+ i(A), , a,( A), the corresponding approximate right singular vectors Y, =
r+i>“‘r t::I,+,,...,x:l:
y ] and the approximate left singular vectors X, =
If i = 0, then set m = 1; else set m := i.
For k = 1,2,. . . m:= 2m.
Given an orthonormal matrix 2 E
Rnx"
130 T.-M. HWANG, W.-W. LIN, AND E. K. YANG
Set Q = I, A, = 1.
While A1 > tolerance, do Set U = ZQ.
Solve AW = U by using the factorization (5.1).
Solve ATV = W by using the factorization (5.1).
Let 2 = V(VTV)-1/2 (e.g., Gram-Schmidt).
Compute the eigenvalues of ZTAATZ. Since ZTAATZ = ZTU(VTV)-‘/2, we compute an orthogonal Q E
Rmx" so
that
QTZTU(VTV)-1’2 Q = hag(h,,...,&), (5.2) where 0 Q A, < .*. < A,,,.
Endwhile
If there exists index r (1 < r < m) such that A, Q tolerance and A r+ 1 > tolerance then stop; else continue;
Endfor k.
Let X, = ZQr, where Qr is the first r columns of Q.
Remark : The convergence rate for the above inverse iteration process without adding (5.2) is (c”_ .+Ju”_~)~. The convergence rate is now accelerated by the correction step (5.2) (see [8] for details).
Solve A? = X,.
Let Y, = yl(ylr+?
Compute the singular values of X,?AY,, which are approximations to the smallest singular values a, _ r+ 1(A), . , q,(A).
If r = i, then done (first pass).
Step 4: Determine an element in the set C,(r).
Comment: Indeed, the maximal elements of the sets
(absY,.(j,, . ,j,ll,. . . , r): 1 <j, < ... <jr < n} =y and
{abs X,(i,, . , i,(l,. . , r): 1 < i, < *** < i, < n} E-?- satisfy the conditions in C,(r). That is, the corresponding indices
(j,, . . . , jr>
and (i,, . . , i,) yield a submatrixH = A[i,, . . . ,
i,lj,, . . ..j.l
in C,(r). Unfortunately, for an r > 2 it is not econom- ical to find the maximums of y and z’, because that needs n!/[r!(n - r)!] flop counts. In the following we give an efficient algorithm to find an element in C,(r).RANK REVEALING LU FAGTORIZATIONS 131
Compute the LU(O) factorizations with complete pivoting of Y, and
X, ,
respectively:Q’Y,e, = L,
2 , [
1PTX,8, =
L, RX[
0 1’
(5.3b)where PT, Q’ E R”“‘, 13,) 0, E R” r are permutations, L, , Lx E R “‘” are unit lower triangular, and R, , Rx are upper triangular.
Comment: Let (1,. . . , n)Q = (j,, . . ,j,) and (1,.
. . , n>P = (i,,
. . . , i,). It is easily seen that Y,.(ji, . . . ,j,ll, . . . , r) = det R, and X,(i,, . . . ,i,ll, .
. . , r) = det Rx. Although we cannot prove that both ldet Rrl and ldet Rx1 are larger than (r!(n - r)!/n!)“‘, a statistical result shows that there is no counterexample in up to a total of about 60,000 randomly generated tested orthonormal matri- ces Y,. and X, E Rnx’ for n = 10,12,. . . ,100 and r = 2,. . . , n/2.That is, the corresponding submatrix H = A[i,, . . . ,
i,lj,, . . ,j,l
produced by (5.3) is always in C,(r).
Step 5: Compute the generalized LU(r) factorization (2.1) of PAQ: Per- form the Gaussian eliminations by using the partial pivoting only on the 1st row to the (n - r)th row of the current matrix.
Comment: By Theorem 4.5, we have that the entries 6,, of U,, satisfy
n! n! -1
l&l G
CT[
n-r+1
r!(n - r)! @i-r+1 1 - r!(n - r)! u”,-~
I
for 1, k = 1, . . . , r.
The main work of this algorithm consists of two parts. The first part (steps 1, 2, and 3), referred to as the first pass of the algorithm, computes an initial LU(0) factorization of A. It requires n3/3 + 0(n2) flops. If and only if r z i in step 3, then we perform the second part of the algorithm (steps 4 and 5), which is referred as the second pass of Algorithm RR LU(r). Step 3 computes the left and right singular vectors corresponding to u .-.+1(A), . . . > a,(A) by using the inverse iteration method. It needs about 2Jn2r flops, where J is the number of inverse iterations used for the computation of X, and Y,. Usually / < 5 is sufficient in practice. Step 4 computes the complete LU(O)-factorizations of X, and Y,. as in (5.3). It
132 T.-M. HWANG, W.-W. LIN, AND E. K. YANG
requires about 2nr ’ flops. Step 5 refactors the matrix PAQ. Here, we compute the Gaussian eliminations from the 1st column to the (n - r)th column by forcing the partial pivoting only on the submatrix Ack)[k, . . . , n - rlk,..., n - r], where Ack’ denotes the current matrix in the elimination process with initial matrix A (i) = PAQ. It requires about n3/3 flops. There- fore, the total flop count C(r) for Algorithm RRLU(t-) is given by
C(r) = +n” + 2]n2r + 2nr2
= $n” + 4n2r + 2nr2, assuming that J = 2.
Therefore, if r 4 n, the total work for Algorithm RRLU(r) is the same as that for RRQR(r) [3, lo]. H owever, Algorithm RRLU(r) has the following practical advantage over RRQR(r). Chan [4] notes that the rank deficiencies of almost all nearly singular matrices can be successfully detected by the first pass of RRLU(r) algorithm, unless A is nearly singular and A-’ has a very skew distribution of the sizes of its elements. The first pass requires in” + 0(n2), which is only half the cost of computing the QR factorization in Algorithm RRQR(r) [3, lo].
Chan [4] presented numerical results for two well-known matrices:
T,, = ]
-yEp
(5.4a). -1 1 1 and
w=
1 10 1
1 9 1
1 1
10 1
1 -1 1
i -9 1
1 - 10
E R21X21
(5.4b)
RANK REVEALING LU FACTORIZATIONS 133
which are nearly singular with nearly rank 1 deficiency. In the following, we construct some nearly singular matrices with higher-dimensional rank defi- ciency based on a direct sum of matrices T, or W. All computations were performed on a PC MATLAB.
EXAMPLE 4.1. Let A = diag(T,,, Tdo). Compute elementary row and column operations of A by the following steps:
fori = 1:40
A(1:40,81-i)=A(1:40,81-i)+A(1:40,i)
A(i, 41: 80) = A(i, 41: 80) + A(81 - i, 41: 80) end
Then the resulting matrix is
A=
1 -1
0
. -1 -1
-1 -1
1 2
1 -1 -1
-1 2
2 -1 .
. -1 . -1 . -1
1
which has nearly rank 2 deficiency.
The three smallest singular values of A are 6.214 X lo-‘, 1.93 X lo-‘“, and 1.93 x 1~‘~. Algorithm RRLU(r) produces the generalized LU(2) factorization (2.1) with
u, =
1.05 x lo-i5 3.638 x lo-l2 -3.638 x lo-l2 1.323 x 1O-23 I ’The upper bound of elements of U, in Theorem 4.5 is 6.098 X lo-‘.
EXAMPLE 4.2. Let A = diag(TN, T30, T30). Compute elementary row
134 T.-M. HWANG, W.-W. LIN, AND E. K. YANG
and column operations of A by the following steps:
fir i = 1: 30
A(1:90,61 - i) = A(1:90,61 - i) + A(1:90,i) A(i, 1: 90) = A(i, 1: 90) + A(61 - i, 1: 90) end
for i = 1: 30
A(1 : 90,91 - i) = A(1 : 90,91 - i> + A(1 : 90,30 + i) A(30+i,1:90)=A(30+i,1:90)+A(91-i,1:90)
end
Then the resulting matrix is
A=
1 -1 -1 . . . -1 -1 . . . -1 2 2 -1 -1 . . . -1
1 -1 . . . -1 -1 . . . 2 -1 -1 2 -1 . . . -1
1’. : :
. . -i -i 2 -i -i i -i
1 2 -1.. . -1 . . . -1 2
1 -1.. . -1 . . . -1 2
. . . 2 -1
. . . . . .
The smallest four singular values of A are 4.0204 X lo-‘, 2.794 X lo-‘, 1.397 x 1o-g, and 1.397 X lo-‘. Algorithm RRLU(r) produces the generalized LU(3) factorization (2.1) with
-3.725 x lo-’ -3.868 x 10-l’ 1.388 x 10-l’
u, = [
5.454 x 10-l’ -3.726 x lo-’ 3.725 x lo-’ .
-1.585 x 10-l’ 3.725 x lo-’ 5.906 x 1o-26
1
The upper bound of elements of U, in Theorem 4.5 is 3.285 X 10e4.
EXAMPLE 4.3. Let A = diag(W, W). Compute elementary row and
. -1-1 2 . .
0 1 2 -1 . . . -1 -1
1 -1 . . . -1 -1 1 . .
. . -1 -i 1 -1
1
which has nearly rank 3 deficiency.
RANK REVEALING LU FACTORIZATION!5
column operations of A by the following steps:
j&i = 1:21
A(1:21,43 - i) =A(l:21,43 -i) +A(1:21,i) A(i, 22 : 42) = A(i, 22 : 42) + A(43 - i, 22 : 42) end
Then the resulting matrix is
A=
1
1 1
-1
1.
2 0 2 0 2 2 0 2
.
. .
.
-9
120 21 -10 0 2 10 1
19 1
1 1’ 1
10 1
1 -1 1
l’-i 1
1 -10 135
which has nearly rank deficiency r = 2.
The three smallest singular values of A are 5.523 X 10-r, 8.675 X lo-r6, and 2.669 x lo- 16. Step 1 of Alg orithm
submatrix of fi as follows:
RRLU(r) produces the last 3 x 3
1.0 x loo -9.0 x 10” 1.0 x loo
ri[n - 2,n] = 0 1.0 x loo -1.0 x 10’ ,
0 0 8.1552 x 10P”
1
which gives the temporary rank deficiency i = 1. To illustrate in a general case, suppose that the true rank deficiency r is unknown a priori. Step 3 of the algorithm always determines correctly r. The second pass of Algorithm
136 T.-M. HWANC, W.-W. LIN, AND E. K. YANG
RR LLJ(2) produces the generalized
LU(2)
factorization (2.1) withu22 =
-0.2097 0.1106 x lo-l6 x lo-l6 -0.6775 -0.2944 x lo-i6 x lo-r6 IThe theoretical upper bound in Theorem 4.5 is 7.4692 X 10-13. Note that only if the first pass produces a nearly rank deficiency r which is larger than i is the second pass required in order to obtain the correct RRLU(r) factorization.
Although the theoretical bound in Theorem 4.5 is not always tight, it works well if [ n!/r!(n - r)!]a,_ r+ 1 is small [i.e., smaller than the tolerance we desire in Algorithm RR
LU(r)].
Under this condition, Algorithm RRLU(r)
can produce small U,, even though the conventional partial-pivoting
LU
factorization fails to do so.
6. CONCLUSION
The main contribution of this paper is to extend the theory of Chan 141 for rank revealing
LU
factorizations to the general case when the nearly rank deficiency is greater than one. We have also proposed an efficient two-pass algorithm for finding an RRLU(r)
factorization which usually succeeds in the first pass, thus taking $” + 0(n2) flops. If the first pass fails (i.e., r # F), then the current efficient implementation of the second pass, taking another$” flops, finds RRLU(r) f ac orizations t for all of our 60,000 test problems.
In comparison, Algorithm RRQR( r needs 97~” flops in the first pass and ) O(n’) flops in the second pass. The rank deficiency of most randomly generated nearly singular matrices can be detected by the first pass of both algorithms. Therefore, for the square matrix A, Algorithm RRLU(r) is in most cases twice as efficient as RRQR(r). In the extreme case that the first pass fails and step 4 also fails to find an element in C,(r) (which never happened in our tests), we can switch to Algorithm RRQR(r) of [4, 71 as a last resort. This hybrid strategy can take advantage of both the efficiency of our Algorithm RR
LU( r )
and the hundred-percent guarantee of finding an RR(r) factorization given by Algorithm RRQR(r).APPENDIX
In this appendix, we shall prove Lemma 3.1, Lemma 3.2, and Lemma 4.1.
RANK REVEALING LU FACTORIZATIONS
Proof of Lemma 3.1. Let
D, = diag(d, ,..., d,_,,l,..., l),
I, = diag(l,..., 1,0 ,..., 0).
Then
D + i up: = D,
I, + i(
D,'ui)u,Ti=l i=l I
Z + i (D;‘u,)u~ - i
e,_i+le;_i+l
i=l i=l
1
= D,(Z + [D,‘u,,. . . ,
D&.,e,_,+,,. we,]
x[u, ,...> u,,--e,-,+,,..., -“IT)
= D,B.
Let
c = z + [Ul,. . . ,u,, -e”_r+l,. . .) --enIT
XID;‘ul ,..., D;lu,,e,_,+l ,..., e,].
137
(‘4 J)
Then
C= I, + VTTD;‘U, V,[n - r + 1,. . . , nil, . . . , f-1
-u,[n - r + 1,. . .) nil,.
. .) ?-I 0, 1.
Thus
det C = Ur(fl - r + 1,. . . , nil,. . . , r)Vr(n - r + 1,. . , 41, . . , r).
(A J)
138 T.-M. HWANG, W.-W. LIN, AND E. K. YANG
From (A.l), (A.2), and the well-known result det B = det C, the conclusion
follows. n
Proof of Lemma 3.2. From
= det (X[Z + 2 (Xrei,)( i S,,(Yrejk)‘)]Yr} (A .3)
I=1 k=l
and Lemma 3.1, we have
(A.31
=det Xdet Ydet(C + X’[l,. . .,nli,, .., i,.]HY[j,,. . . ,j,ll,. ., n])
=(det XdetYdet H)XT(n - r + l,..., nli, ,..., i,.)
XY(j,,
. . .,jrln - r + l,...,n)nai.i=l
w
Proof of Lemma 4.1. By defining U and V as
’ E [ ‘ll(
Mei,), . . . )slr( Me,,), . . . , a,,(
Me,,), . . . ,S,,(
A4eir)]and
VE
[-e. ,,, . . . ,
-ejr,.. . ,
-ej,,. . ,
-ejrlT we haveA - i i Slkei,ejT, =
A Z- i k a,,(
hfe,,)eiZ=l k=l I=1 k=l I
=
A[Z - UVT].
RANK REVEALING LU FACTORIZATIONS 139
Let W = Z - V TU. We have n(W) = n( A[ Z - W T]). Since H is nonsingu- lar, for each i = l,..., r there exists j such that aij # 0. By a careful computation, S,, (1, k = 1,. , r.) g ‘ven by (4.1) in Lemma 4.1 solve the equations
det
and
-&kmjli, m.. JI’I+I ‘..
= 0 for all I = 1,. . , r.
Therefore, we have n(W) z r.
Next we need to show that n(W) < r. After row subtractions on the matrix
w=
-1 - s,,mjltl ... -
s,,mj,i, ...
-%lm,$, ...-km,,,,
-6.
llmj,i, “’ 1 - Slrmj r il ... -S,;mjrz, ... - Cmj,,,- S,;mj,i, ... -S,,mili, ... - S,,.mjl,, .‘. - %imj,zr
- S,,'mjvi, ... - S,,mjri, ... -
%lytF ... -
%Jmj.,_-S,,mj,i, ... - Slrmjli, ... 1 - tir,mj,i, ... - hmj,i,
-6 Ilmj,i, .” - S,Jm,ril ... - S,;mjri, ... 1 - irmjri,
140 T.-M. HWANG, W.-W. LIN, AND E. K. YANG
we have the new transformed matrix
1 -1 0
0
1 (j -‘I
*= 0 1 -1 0
1 0 -'l
-S,,mj,i, “’ -6,,m,,,, . . -Sr-l,rmj,i,_, 1 - hlmj,,, “’ -hm,,+
-6 Ilm,,il ... - Slrmj,t, ’ .
-Gm,,,,.,
-Srlm,,zr ... 1 - krm,,,,Thus rank W = rank W > r(r. - l), which implies n(W) < r and
A - 2 i Slkei,e,: =n(W) =r.
I=1 k=l
Therefore, C,(r) c C,(r). 4
We are grateful to Professor Tony Chan and an anonymous referee for their many helpful and detailed suggestions.
REFERENCES
A. Bjijrck, Least squares methods, in Handbook of Numerical Analysis (P. G.
Ciarlet and J. L. Lions, Eds.), Vol. 1, North-Holland, 1990, pp. 465-652.
A. BjGrck, A direct method for the solution of sparse linear least squares problems, in Large Scale Matrix Problems (A. Bjiirck, R. J. Plemmons, and H.
Schneider, Eds.), North-Holland, 1981.
T. F. Chan, Rank revealing QR factorizations, Linear Algebra Appl. 88/89:67-82 (1987).
T. F. Chan, On the existence and computation of LU-factorizations with small pivots, Math. Comp. 42:535-547 (1984).
T. C. Chan, An Efficient Modular Algorithm for Coupled Nonlinear Systems, Research Report YALEU/DCS/RR-328, Sept. 1984.
T. F. Chan, Private communication, 1990.
T. F. Chan and D. C. Resasco, Generalized deflated block-elimination, SZAM J.
Numer. Anal. 23:913-924 (1986).
RANK REVEALING LU FACTORIZATIONS 141
8 M. Clint and A. Jennings, The evaluation of eigenvalues and eigenvectors of real symmetric matrices by simultaneous iteration, Cornput. J. 13:76-80 (1970).
9 Ft. Fletcher, Practical Methods of Optimization, Vol. 2, Wiley, 1981, pp. 38-39, 86-87, 103.
10 L. V. Foster, The probability of large diagonal elements in the QR factorization, SIAM]. Sci. Statist. Cornput. 11:531-544 (1990).
Received 8 October 1990; final manuscript accepted 9 ]uly 1991