Applied Multivariate
Quantitative Methods
Factor Analysis
By Jen-pei Liu, PhD
Division of Biometry, Department of Agronomy,
National Taiwan University
and
Wei-Chie Chie, MD, PhD
Department of Public Health
National Taiwan University
Factor Analysis
Introduction
Exploratory Factor Analysis
Principal Components Factor Analysis
Maximum Likelihood Method
Other Methods
Examples
Confirmatory Factor Analysis and LISREL
Procedures
Examples
Introduction
Correlation between Test Scores
          Classics  French  English  Math   DoP    Music
Classics  1.00      0.83    0.78     0.70   0.66   0.63
French    0.83      1.00    0.67     0.67   0.65   0.57
English   0.78      0.67    1.00     0.64   0.54   0.51
Math      0.70      0.67    0.64     1.00   0.45   0.51
DoP       0.66      0.65    0.54     0.45   1.00   0.40
Music     0.63      0.57    0.51     0.51   0.40   1.00
DoP: Discrimination of Pitch; Source: Manly (2005) and Spearman (1904)
Introduction
Structure and property of the correlation (covariance) matrix
Any two rows are almost proportional if the diagonals are ignored
For Classics (row 1) and English (row 3):
$\frac{0.83}{0.67} \approx \frac{0.70}{0.64} \approx \frac{0.66}{0.54} \approx \frac{0.63}{0.51} \approx 1.2$
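The near-constant ratio can be checked numerically. The following is a minimal Python sketch, assuming NumPy is available; the helper name `row_ratios` is hypothetical, and only the correlations are taken from the test-score table.

```python
import numpy as np

# Correlations between test scores
# (order: Classics, French, English, Math, DoP, Music).
R = np.array([
    [1.00, 0.83, 0.78, 0.70, 0.66, 0.63],
    [0.83, 1.00, 0.67, 0.67, 0.65, 0.57],
    [0.78, 0.67, 1.00, 0.64, 0.54, 0.51],
    [0.70, 0.67, 0.64, 1.00, 0.45, 0.51],
    [0.66, 0.65, 0.54, 0.45, 1.00, 0.40],
    [0.63, 0.57, 0.51, 0.51, 0.40, 1.00],
])

def row_ratios(R, i, j):
    """Ratios R[i, k] / R[j, k] over all columns k other than i and j
    (this skips the diagonal entries of both rows)."""
    keep = [k for k in range(R.shape[0]) if k not in (i, j)]
    return R[i, keep] / R[j, keep]

# Classics (row index 0) against English (row index 2).
ratios = row_ratios(R, 0, 2)
print(np.round(ratios, 2))
```

The four ratios all lie between roughly 1.1 and 1.25, which is the pattern that motivates a one-factor model.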
Introduction
A constant ratio between the rows of the correlation matrix indicates that there might be a model for the variation of the data
The six standardized test scores can be described as
$X_i = a_i F + e_i, \quad i = 1, \ldots, 6$
where $X_i$ is the ith standardized test score, $a_i$ is a constant, $F$ is a “factor” value that has mean of 0 and variance of 1, and $e_i$ is the part of $X_i$ that is specific to the ith test
Introduction
If $F$ and $e_i$ are assumed independent,
$Var(X_i) = Var(a_i F + e_i)$
$= Var(a_i F) + Var(e_i)$
$= a_i^2 Var(F) + Var(e_i)$
$= a_i^2 + Var(e_i)$
Since $X_i$ is a standardized score,
$1 = a_i^2 + Var(e_i)$
$a_i$ is called the factor loading
$a_i^2$: proportion of variance of $X_i$ that is accounted for by the factor
Introduction
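One-factor Model
$X_1 = a_1 F + e_1$
$X_2 = a_2 F + e_2$
$\vdots$
$X_p = a_p F + e_p$
$X_i$ is referred to as an observed indicator such as a test score, and $F$ is referred to as a common or latent factor or an unobservable construct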
Introduction
Examples
Education
Indicators
test scores on Classics, French, English,
Mathematics, Discrimination of Pitch, and Music
Factors
Intelligence
study attitude
Introduction
Examples
Quality of Life
Indicators
Scores from 1 to 5 for each of questions in the
questionnaires
Factors (domain)
Physical
Mental
Introduction
Examples
Hospital Anxiety and Depression Scale
Indicators
Scores from each of 14 questions in the
questionnaires
Factors (construct)
Anxiety
Depression
Introduction
Two-factor model
$X_1 = a_{11} F_1 + a_{12} F_2 + e_1$
$X_2 = a_{21} F_1 + a_{22} F_2 + e_2$
$\vdots$
$X_p = a_{p1} F_1 + a_{p2} F_2 + e_p$
Introduction
Objectives of factor analysis
To reduce the number of variables (indicators) to the smallest number of common factors
To identify the common factors (latent constructs) that best explain the intercorrelation among indicators and to build the most parsimonious factor model
Principal component analysis also tries to reduce the number of variables. However, factor analysis is based on a model
Introduction
Example:
Four correlated variables: X1, X2, X3, X4
X1=X1
X2=X2
X3=2X1+3X2
X4=5X1+4X2
Introduction
If there are two independent variables
Y1 and Y2,
X1=Y1, X3=2Y1+3Y2
X2=Y2, X4=5Y1+4Y2
Introduction
m-factor model
$X_1 = a_{11} F_1 + a_{12} F_2 + \cdots + a_{1m} F_m + e_1$
$X_2 = a_{21} F_1 + a_{22} F_2 + \cdots + a_{2m} F_m + e_2$
$\vdots$
$X_p = a_{p1} F_1 + a_{p2} F_2 + \cdots + a_{pm} F_m + e_p$
$\mathbf{X} = (X_1, \ldots, X_p)'$, $\mathbf{F} = (F_1, \ldots, F_m)'$, and $\mathbf{e} = (e_1, \ldots, e_p)'$
Introduction
$X_i$ is the ith score with mean 0 and unit variance
$a_{i1}$ to $a_{im}$ are the factor loadings for the ith test
$F_1$ to $F_m$ are m uncorrelated common factors, each with mean 0 and unit variance
$e_i$ is a factor specific only to the ith test that is uncorrelated with any of the common factors and has zero mean
Introduction
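The factor model can be expressed in matrix form as follows
$\mathbf{X} = \mathbf{A}\mathbf{F} + \mathbf{e}$, where
$\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pm} \end{pmatrix}$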
Introduction
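$E(\mathbf{X}) = \mathbf{0}$, $Cov(\mathbf{X}) = \Sigma$
$E(\mathbf{F}) = \mathbf{0}$, $Cov(\mathbf{F}) = \mathbf{I}$
$E(\mathbf{e}) = \mathbf{0}$, $Cov(\mathbf{e}) = \mathbf{W} = diag(V(e_1), V(e_2), \ldots, V(e_p))$
$Cov(\mathbf{X}) = \Sigma = \mathbf{A}\mathbf{A}' + \mathbf{W}$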
Introduction
$V(X_i) = 1$
$= a_{i1}^2 V(F_1) + a_{i2}^2 V(F_2) + \cdots + a_{im}^2 V(F_m) + V(e_i)$
$= a_{i1}^2 + a_{i2}^2 + \cdots + a_{im}^2 + V(e_i)$
The variation of $X_i$ can be partitioned in two parts:
(1) $h_i^2 = a_{i1}^2 + a_{i2}^2 + \cdots + a_{im}^2$: the communality or common variance of $X_i$ (the part of variation of $X_i$ that is related to the common factors), and
(2) $V(e_i)$: the specificity of $X_i$ (the part of variation of $X_i$ that is unrelated to the common factors)
Introduction
$Cov(X_i, F_j) = a_{ij}$
If the indicator variables are standardized, the factor loading $a_{ij}$ is the correlation between the indicator $X_i$ and the common factor $F_j$.
Because the communality satisfies $0 \le h_i^2 \le 1$, it follows that $-1 \le a_{ij} \le +1$.
Introduction
The correlation between two indicators $X_i$ and $X_{i'}$:
$r_{ii'} = cov(X_i, X_{i'}) = a_{i1} a_{i'1} + a_{i2} a_{i'2} + \cdots + a_{im} a_{i'm}$
The correlation between two indicators $X_i$ and $X_{i'}$ is the sum of cross-products of their respective factor loadings.
Two indicator variables can be highly correlated only if they have high factor loadings on the same factors
Introduction
Exploratory factor analysis (EFA)
Little or no knowledge about the factor structure:
The number of factors
The number of indicators for each factor
Which indicators represent which factors
To collect data and explore or search for a factor structure or theory which can explain the correlations among the indicators
Introduction
Confirmatory factor analysis (CFA)
Factor structure is known or hypothesized (specified) a priori
Complete factor structure with their respective indicators
Nature of the pattern of factor loadings
To empirically verify or confirm the factor structure
Exploratory Factor Analysis
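Principal Components Factor (PCF) Analysis
p variables $X_1, X_2, \ldots, X_p$, and p principal components $Z_1, Z_2, \ldots, Z_p$
$Z_1 = b_{11} X_1 + b_{12} X_2 + \cdots + b_{1p} X_p$
$Z_2 = b_{21} X_1 + b_{22} X_2 + \cdots + b_{2p} X_p$
$\vdots$
$Z_p = b_{p1} X_1 + b_{p2} X_2 + \cdots + b_{pp} X_p$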
Exploratory Factor Analysis
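$\begin{pmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_p \end{pmatrix} = \begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1p} \\ b_{21} & b_{22} & \cdots & b_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ b_{p1} & b_{p2} & \cdots & b_{pp} \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{pmatrix}$, that is, $\mathbf{Z} = \mathbf{B}\mathbf{X}$
The row vectors of $\mathbf{B}$ are the eigenvectors obtained from the correlation matrix of $\mathbf{X}$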
Exploratory Factor Analysis
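The inverse relationship between X and Z
Because $\mathbf{B}$ is orthogonal, $\mathbf{X} = \mathbf{B}'\mathbf{Z}$:
$X_1 = b_{11} Z_1 + b_{21} Z_2 + \cdots + b_{p1} Z_p$
$X_2 = b_{12} Z_1 + b_{22} Z_2 + \cdots + b_{p2} Z_p$
$\vdots$
$X_p = b_{1p} Z_1 + b_{2p} Z_2 + \cdots + b_{pp} Z_p$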
Exploratory Factor Analysis
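For factor analysis, only m of the principal components are retained
$X_1 = b_{11} Z_1 + b_{21} Z_2 + \cdots + b_{m1} Z_m + e_1$
$X_2 = b_{12} Z_1 + b_{22} Z_2 + \cdots + b_{m2} Z_m + e_2$
$\vdots$
$X_p = b_{1p} Z_1 + b_{2p} Z_2 + \cdots + b_{mp} Z_m + e_p$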
Exploratory Factor Analysis
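For factor analysis, factors have zero mean and unit variance.
$Z_i$ is divided by its standard deviation $\sqrt{\lambda_i}$, the square root of the corresponding eigenvalue
$X_1 = \sqrt{\lambda_1} b_{11} F_1 + \sqrt{\lambda_2} b_{21} F_2 + \cdots + \sqrt{\lambda_m} b_{m1} F_m + e_1$
$X_2 = \sqrt{\lambda_1} b_{12} F_1 + \sqrt{\lambda_2} b_{22} F_2 + \cdots + \sqrt{\lambda_m} b_{m2} F_m + e_2$
$\vdots$
$X_p = \sqrt{\lambda_1} b_{1p} F_1 + \sqrt{\lambda_2} b_{2p} F_2 + \cdots + \sqrt{\lambda_m} b_{mp} F_m + e_p$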
Exploratory Factor Analysis
The unrotated factor model is
$X_1 = a_{11} F_1 + a_{12} F_2 + \cdots + a_{1m} F_m + e_1$
$X_2 = a_{21} F_1 + a_{22} F_2 + \cdots + a_{2m} F_m + e_2$
$\vdots$
$X_p = a_{p1} F_1 + a_{p2} F_2 + \cdots + a_{pm} F_m + e_p$
where $F_i = Z_i / \sqrt{\lambda_i}$ and $a_{ij} = \sqrt{\lambda_j} b_{ji}$
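The relation $a_{ij} = \sqrt{\lambda_j} b_{ji}$ can be illustrated with a short Python sketch, assuming NumPy is available. The correlation matrix is the one from the cooking-method example later in this section; the function name `pcf_loadings` and the sign convention (first entry of each eigenvector made positive) are assumptions made only for this illustration.

```python
import numpy as np

# Correlation matrix of X1..X4 from the cooking-method example (n = 20).
R = np.array([
    [1.00000, 0.62938, 0.43116, 0.33543],
    [0.62938, 1.00000, 0.22779, 0.30179],
    [0.43116, 0.22779, 1.00000, 0.75061],
    [0.33543, 0.30179, 0.75061, 1.00000],
])

def pcf_loadings(R, m):
    """Unrotated principal components factor loadings for m factors."""
    eigvals, eigvecs = np.linalg.eigh(R)          # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # sort eigenvalues descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Eigenvector signs are arbitrary; assumed convention: first entry positive.
    signs = np.where(eigvecs[0, :] < 0, -1.0, 1.0)
    eigvecs = eigvecs * signs
    # Column j of A is sqrt(lambda_j) times the j-th eigenvector.
    A = eigvecs[:, :m] * np.sqrt(eigvals[:m])
    return eigvals, A

eigvals, A = pcf_loadings(R, m=2)
communalities = (A ** 2).sum(axis=1)
print(np.round(eigvals, 5))
print(np.round(A, 5))
print(np.round(communalities, 5))
```

The columns of `A` are the loadings on the two retained factors, and the row sums of squared loadings are the communalities.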
Exploratory Factor Analysis
Varimax Rotation
Rotation that maximizes the variation of the squared loadings within each factor
Objectives
To have a factor structure in which each variable loads highly on one and only one factor
A given variable should have a high loading on one factor and near-zero loadings on others
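A varimax rotation can be sketched in Python as follows, assuming NumPy is available. This is a commonly used SVD-based iteration for the raw varimax criterion (no Kaiser row normalization), so its output need not match the figures printed by a particular statistics package; the function names and the convergence settings are assumptions. The loading matrix is the unrotated two-factor solution of the cooking-method example.

```python
import numpy as np

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings."""
    return float(np.var(L ** 2, axis=0).sum())

def varimax(A, max_iter=500, tol=1e-10):
    """Rotate the p x m loading matrix A; returns (rotated loadings, rotation T)."""
    p, m = A.shape
    T = np.eye(m)
    obj_old = 0.0
    for _ in range(max_iter):
        L = A @ T
        # Gradient of the raw varimax criterion with respect to the rotation.
        G = A.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                      # closest orthogonal matrix to G
        obj_new = s.sum()
        if obj_new < obj_old * (1.0 + tol):
            break
        obj_old = obj_new
    return A @ T, T

# Unrotated loadings of X1..X4 on F1, F2 (cooking-method example).
A = np.array([
    [0.77459,  0.45133],
    [0.67786,  0.61960],
    [0.80738, -0.48164],
    [0.79593, -0.47834],
])
A_rot, T = varimax(A)
print(np.round(A_rot, 4))
```

Because `T` is orthogonal, the communalities (row sums of squared loadings) are the same before and after rotation; only the way the common variance is split between the factors changes.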
Exploratory Factor Analysis
Principal Axis Factor Analysis (PAF): an
iterative PCF method
Step 1: Assume that the prior estimates of
the communalities are one. A PCF is
obtained. Based on the number of factors
retained, estimates of factor loadings are
obtained and then are used to re-estimate
the communalities.
Exploratory Factor Analysis
Principal Axis Factor (PAF) Analysis: an
iterative PCF method
Step 2: Compute the maximum change in
estimated communalities among all indicators
between two iterations.
Step 3: If the maximum change in communalities
is greater than a pre-specified convergence
criterion, then replace the diagonal elements of
the original correlation matrix by the estimated
communalities
Exploratory Factor Analysis
Principal Axis Factor (PAF) Analysis: an
iterative PCF method
Step 4: A new principal components factor
analysis is performed using the modified
correlation matrix and Step 2 is repeated.
Step 2 to Step 4 are repeated until the
maximum change in the communalities is
smaller than the pre-specified convergence
criterion
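Steps 1 to 4 can be sketched as a short loop in Python, assuming NumPy is available. The function name `principal_axis`, the default tolerance, and the iteration cap are assumptions; the correlation matrix is the test-score matrix used in the PAF example later in this section.

```python
import numpy as np

def principal_axis(R, m, tol=1e-3, max_iter=100):
    """Iterated principal components factoring with m factors."""
    R_work = R.copy()
    h2 = np.ones(R.shape[0])              # Step 1: prior communalities are one
    for n_iter in range(1, max_iter + 1):
        np.fill_diagonal(R_work, h2)      # Step 3: diagonal <- communalities
        eigvals, eigvecs = np.linalg.eigh(R_work)
        top = np.argsort(eigvals)[::-1][:m]
        lam = np.clip(eigvals[top], 0.0, None)
        A = eigvecs[:, top] * np.sqrt(lam)          # PCF loadings (Step 4)
        h2_new = (A ** 2).sum(axis=1)               # re-estimated communalities
        change = float(np.max(np.abs(h2_new - h2))) # Step 2: maximum change
        h2 = h2_new
        if change < tol:
            break
    return A, h2, n_iter, change

# Upper triangle of the test-score correlation matrix (M, P, C, E, H, F).
upper = np.array([
    [1.00, 0.62, 0.54, 0.32, 0.284, 0.370],
    [0.00, 1.00, 0.51, 0.38, 0.351, 0.430],
    [0.00, 0.00, 1.00, 0.36, 0.336, 0.405],
    [0.00, 0.00, 0.00, 1.00, 0.686, 0.730],
    [0.00, 0.00, 0.00, 0.00, 1.000, 0.7345],
    [0.00, 0.00, 0.00, 0.00, 0.000, 1.000],
])
R = upper + upper.T - np.eye(6)

A, h2, n_iter, change = principal_axis(R, m=2)
print(n_iter, round(change, 5))
print(np.round(h2, 3))
```

In the first pass the diagonal is still one, so the first set of loadings is exactly the PCF solution; later passes work on the modified correlation matrix.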
Exploratory Factor Analysis
Factor Scores
$F^* = XG(G'G)^{-1}$
where $F^*$ is the n×m matrix of factor scores, with one row for each of the n rows of data, X is the n×p matrix of original data, and G is the p×m matrix of factor loadings
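The factor-score formula can be illustrated with a minimal Python sketch, assuming NumPy is available. The data matrix here is randomly generated and purely hypothetical; the loading matrix G is the unrotated two-factor solution of the cooking-method example, and the function name `factor_scores` is an assumption.

```python
import numpy as np

def factor_scores(X, G):
    """F* = X G (G'G)^{-1}, computed with solve() instead of an explicit inverse."""
    GtG = G.T @ G                                  # m x m, symmetric
    return np.linalg.solve(GtG, (X @ G).T).T       # n x m

# Hypothetical standardized data: n = 20 rows, p = 4 indicators.
rng = np.random.default_rng(0)
raw = rng.normal(size=(20, 4))
X = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)

# p x m loading matrix (cooking-method example, unrotated).
G = np.array([
    [0.77459,  0.45133],
    [0.67786,  0.61960],
    [0.80738, -0.48164],
    [0.79593, -0.47834],
])

F_star = factor_scores(X, G)
print(F_star.shape)
```

Each row of `F_star` is the least-squares fit of the corresponding data row on the columns of G.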
Exploratory Factor Analysis
Maximum Likelihood (ML) Method
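Assumption: multivariate normal
$Cov(\mathbf{X}) = \Sigma = \mathbf{A}\mathbf{A}' + \mathbf{W}$
Maximize the log-likelihood function
$\ln L = -\frac{n}{2}\{\ln|\mathbf{A}\mathbf{A}' + \mathbf{W}| + tr[(\mathbf{A}\mathbf{A}' + \mathbf{W})^{-1}\mathbf{C}]\} + k$
where $\mathbf{C}$ is the sample covariance matrix
Equivalent to minimizing
$f(\mathbf{A}, \mathbf{W}) = \ln|\mathbf{A}\mathbf{A}' + \mathbf{W}| + tr[(\mathbf{A}\mathbf{A}' + \mathbf{W})^{-1}\mathbf{C}]$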
Exploratory Factor Analysis
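Differentiate f with respect to $\mathbf{A}$ and $\mathbf{W}$ and set the derivatives to 0:
$\partial f / \partial \mathbf{A} = 2\Sigma^{-1}(\Sigma - \mathbf{C})\Sigma^{-1}\mathbf{A} = \mathbf{0}$
$\partial f / \partial \mathbf{W} = diag(\Sigma^{-1}(\Sigma - \mathbf{C})\Sigma^{-1}) = \mathbf{0}$
Iterations are required to find estimates of $\mathbf{A}$ and $\mathbf{W}$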
Exploratory Factor Analysis
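For a given $\mathbf{W}$, find the solutions to
$\mathbf{C}\mathbf{W}^{-1}\mathbf{A} = \mathbf{A}(\mathbf{I} + \mathbf{A}'\mathbf{W}^{-1}\mathbf{A})$
Let $\mathbf{A}_o$ be the solution; a solution for $\mathbf{W}$ is given as
$\mathbf{W} = diag(\mathbf{C} - \mathbf{A}_o\mathbf{A}_o')$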
Exploratory Factor Analysis
Selection of The Number of Factors
Cumulative proportion of variation of the r common factors > 80% or 90%
When the eigenvalue of the rth common factor > average variation. When the correlation matrix is used, the average variation is 1. Choose the r common factors whose eigenvalues are greater than 1
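Both rules can be applied in a few lines of Python, assuming NumPy is available. The eigenvalues are those of the cooking-method example later in this section; the variable names and the 80% threshold chosen here are illustrative assumptions.

```python
import numpy as np

# Eigenvalues of the 4 x 4 correlation matrix in the cooking-method example.
eigvals = np.array([2.34486, 1.04840, 0.40394, 0.20280])

proportion = eigvals / eigvals.sum()
cumulative = np.cumsum(proportion)

# Rule 1: smallest r whose cumulative proportion exceeds 80%.
r_cumulative = int(np.argmax(cumulative > 0.80)) + 1

# Rule 2: number of eigenvalues above the average variation
# (the average is 1 when a correlation matrix is analysed).
r_average = int((eigvals > eigvals.mean()).sum())

print(np.round(cumulative, 4), r_cumulative, r_average)
```

For these eigenvalues both rules point to two factors.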
Exploratory Factor Analysis
Selection of The Number of Factors
Scree plot: number of factors vs. eigenvalues; choose the number of factors at which the curve becomes flat
Maximum likelihood method:
Test the hypothesis with an increasing number of factors until the null hypothesis is no longer rejected
Exploratory Factor Analysis
Selection of The Number of Factors
Test Statistic: $M = k \ln[|\mathbf{A}\mathbf{A}' + \mathbf{W}| / |\mathbf{R}|]$,
where $k = n - (2p + 4r + 11)/6$.
Reject $H_0$ if $M > \chi^2_{\alpha, df}$,
$df = [(p - r)^2 - p - r]/2$,
r = the number of factors
110/07/16 Copyright by Jen-pei Liu, PhD and Wei-Chu Chie, MD, PhD
Exploratory Factor Analysis
Selection of The Number of Factors
Step 1: Start with r=2. If the test fails to reject Ho, stop and select 2 as the number of factors.
Step 2: Otherwise let r=3, re-compute AA'+W and perform the test again. If the test fails to reject Ho, stop and select 3 as the number of factors.
Step 3: Otherwise keep increasing r by 1 and repeating the test until it fails to reject the null hypothesis
Exploratory Factor Analysis
When to rotate
Find the factor loadings >= 0.5 (ignore the sign)
The large and moderate factor loadings indicate
relatively good correlation between indicators and
factors
Try to avoid the situation where a large number of
indicators is related strongly to only a few factors
Rotate the factors such that there is no
overlapping of large or moderate factor loadings
among factors
Exploratory Factor Analysis
Examples (PCF)
Evaluation of two cooking methods for fish: Shen (1998) and Rencher (1995)
Evaluation items (score: 1-10)
aroma (X1)
taste (X2)
texture (X3)
moisture (X4)
Exploratory Factor Analysis
Examples (PCF)
Evaluation of two cooking methods for fish
Correlation Matrix (n=20)
      X1    X2        X3        X4
X1    1     0.62938   0.43116   0.33543
X2          1         0.22779   0.30179
X3                    1         0.75061
X4                              1
Exploratory Factor Analysis
Examples (PCF)
Evaluation of two cooking methods for fish
Eigenvalues and Eigenvectors
               1          2          3          4
Eigenvalue     2.34486    1.04840    0.40394    0.20280
Eigenvector
X1             0.50584    0.44079   -0.64930   -0.35810
X2             0.44267    0.60513    0.58061    0.31742
X3             0.52726   -0.47040   -0.26945    0.65432
Exploratory Factor Analysis
Examples (PCF)-Continued
The first 2 principal components account for 84.83% of total variation. We take 2 factors
Computation of factor loading for the first factor and X3
$a_{31} = \sqrt{\lambda_1} b_{13} = \sqrt{2.34486} \times 0.52726 = 0.80738$
$h_1^2 = a_{11}^2 + a_{12}^2 = 0.77459^2 + 0.45133^2 \approx 0.8037$
Exploratory Factor Analysis
Examples (PCF)
Evaluation of two cooking methods for fish
Factor Loadings and Communality
Variable     F1         F2         Communality   Specific Variance
X1           0.77459    0.45133    0.80370       0.19630
X2           0.67786    0.61960    0.84340       0.15660
X3           0.80738   -0.48164    0.88385       0.11615
X4           0.79593   -0.47834    0.86232       0.13768
Variance     2.34486    1.04840    3.39327
Proportion   0.5862     0.2621     0.8483
Exploratory Factor Analysis
Examples (PCF)-Continued
The communalities are quite high (> 0.80) for all variables. Most of the variation for the 4 variables can be accounted for by the 2 common factors
The factor loadings for the unrotated first factor are very large and have the same sign. The first factor seems to represent the overall quality of the cooking method
Exploratory Factor Analysis
Examples (PCF)-Continued
The absolute values of the factor loadings for the unrotated second factor are of the same magnitude. However, aroma and taste have the opposite sign from texture and moisture. This factor is a contrast between aroma and taste on one side and texture and moisture on the other.
However, the factor loadings within each factor have similar magnitudes. Therefore, some of the 4 variables are strongly related to both factors, an undesirable property of the factor solution. We need to see whether a rotation can help.
Exploratory Factor Analysis
Examples (PCF)-Continued
            Unrotated              Varimax Rotation
Variable    F1         F2          F1         F2
X1          0.77459    0.45133     0.27554    0.85310
X2          0.67786    0.61960     0.09120    0.91383
X3          0.80738   -0.48164     0.92269    0.18025
Exploratory Factor Analysis
Examples (PCF)-Continued
It can be verified that the communalities are
unchanged based on the factor loading after
rotation
Factor loadings greater than 0.5 (ignore the sign)
are underlined. These large and moderate
loadings indicate how the variables are related to
the factors
Exploratory Factor Analysis
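Estimation of Correlation Matrix
$\mathbf{A} = \begin{pmatrix} 0.7746 & 0.4513 \\ 0.6779 & 0.6196 \\ 0.8074 & -0.4816 \\ 0.7959 & -0.4783 \end{pmatrix}$, $\mathbf{W} = diag(0.1963, 0.1566, 0.1162, 0.1377)$
$\mathbf{A}\mathbf{A}' + \mathbf{W} = \begin{pmatrix} 1 & 0.8047 & 0.4081 & 0.4006 \\ & 1 & 0.2489 & 0.2432 \\ & & 1 & 0.8730 \\ & & & 1 \end{pmatrix}$
$\mathbf{R} - (\mathbf{A}\mathbf{A}' + \mathbf{W}) = \begin{pmatrix} 0 & -0.17532 & 0.02306 & -0.06517 \\ & 0 & -0.02111 & 0.05859 \\ & & 0 & \cdots \\ & & & 0 \end{pmatrix}$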
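The reproduced and residual correlation matrices can be recomputed with a short Python sketch, assuming NumPy is available. The loadings and correlations are those of the cooking-method example; the variable names are illustrative.

```python
import numpy as np

# Unrotated loadings (rows X1..X4, columns F1, F2).
A = np.array([
    [0.77459,  0.45133],
    [0.67786,  0.61960],
    [0.80738, -0.48164],
    [0.79593, -0.47834],
])

# Observed correlation matrix.
R = np.array([
    [1.00000, 0.62938, 0.43116, 0.33543],
    [0.62938, 1.00000, 0.22779, 0.30179],
    [0.43116, 0.22779, 1.00000, 0.75061],
    [0.33543, 0.30179, 0.75061, 1.00000],
])

communalities = (A ** 2).sum(axis=1)
W = np.diag(1.0 - communalities)      # specific variances on the diagonal

reproduced = A @ A.T + W              # AA' + W
residual = R - reproduced             # R - (AA' + W)

print(np.round(reproduced, 4))
print(np.round(residual, 5))
```

The diagonal of the reproduced matrix is exactly 1 by construction, so all of the lack of fit shows up in the off-diagonal residuals.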
Exploratory Factor Analysis
Example (PCF) – European Employment (n=30)
              Un-rotated                     Varimax Rotation
Indicators    F1     F2     F3     F4       F1     F2     F3     F4
AGR           0.90  -0.03  -0.34   0.02     0.85   0.10   0.27  -0.36
MIN           0.66   0.00   0.63   0.12     0.11   0.30   0.86  -0.10
MAN          -0.43   0.58  -0.61   0.06    -0.03   0.32  -0.89  -0.09
PS           -0.56   0.15  -0.36   0.02    -0.19  -0.04  -0.64   0.14
CON          -0.39  -0.33   0.09   0.81    -0.02   0.08  -0.04   0.95
SER          -0.67  -0.55   0.08   0.17    -0.35  -0.48  -0.15   0.65
FIN          -0.23  -0.74  -0.12  -0.50    -0.08  -0.98   0.00  -0.01
SPS          -0.76   0.07   0.44  -0.33    -0.91  -0.17  -0.12   0.04
Exploratory Factor Analysis
Example (PCF) – European Employment
Factor 1: high positive loading on AGR, high negative loadings on SPS and TC: rural industry rather than social service and communication
Factor 2: moderate positive loading on TC and high negative loading on FIN: lack of finance industries
Exploratory Factor Analysis
Example (PCF) – European Employment
Factor 3: high positive loading on MIN, high negative loading on MAN, and moderate negative loading on PS: mining rather than manufacturing
Factor 4: high positive loading on CON and moderately high positive loading on SER: construction and service industry
Exploratory Factor Analysis
Example: Test scores (PAF)
Correlation Matrix (n=200)
      M      P      C      E      H       F
M     1.00   0.62   0.54   0.32   0.284   0.37
P            1.00   0.51   0.38   0.351   0.43
C                   1.00   0.36   0.336   0.405
E                          1.00   0.686   0.730
H                                 1.00    0.7345
F                                         1.00
M: mathematics, P: physics, C: chemistry
Exploratory Factor Analysis
Example (PAF): Test scores
History of PAF analysis (estimated communalities by iteration)
Iter.   Change   M      P      C      E      H      F
1       .359     .766   .714   .641   .797   .812   .829
2       .128     .698   .626   .513   .725   .744   .784
3       .042     .679   .598   .471   .698   .719   .774
.
.
Exploratory Factor Analysis
Example (PAF): Test scores
No.   Eigenvalue   Cum. Prop.
1      3.0281      0.7826
2      0.8410      1.0000
3      0.0004      1.0004
4      0.0003      1.0007
5     -0.0003      1.0004
6     -0.0004      1.0000
Indicator   Factor 1   Factor 2
M           0.63584     0.52255
P           0.65784     0.38549
C           0.59812     0.30447
E           0.76233    -0.31509
H           0.74908    -0.36797
F           0.83129    -0.30329
Exploratory Factor Analysis
Example (PAF): Test scores
            Unrotated               Varimax
Indicator   Factor 1   Factor 2     Factor 1   Factor 2   Communality
M           0.63584     0.52255     0.15200    0.80886    0.677354
P           0.65784     0.38549     0.25687    0.71790    0.581356
C           0.59812     0.30447     0.26309    0.61744    0.450447
E           0.76233    -0.31509     0.78676    0.24786    0.680426
H           0.74908    -0.36797     0.81055    0.19881    0.696517
F           0.83129    -0.30329     0.83205    0.30118    0.783020
Exploratory Factor Analysis
Example (ML): Meteorological Measurements
X1: daily high temperature (F)
X2: daily low temperature (F)
X3: daily soil high temperature (F)
X4: daily soil low temperature (F)
X5: relative humidity for daily high temperature
X6: relative humidity for daily low temperature
X7: daily wind speed
Exploratory Factor Analysis
Example (ML): Meteorological Measurements (n=20)
Correlation matrix
      X1     X2      X3      X4      X5       X6       X7       X8
X1    1      0.737   0.931   0.446   -0.479   -0.551    0.384    0.879
X2           1       0.827   0.865   -0.255   -0.019    0.536    0.597
X3                   1       0.616   -0.381   -0.341    0.466    0.848
X4                           1       -0.154    0.382    0.683    0.375
X5                                    1        0.500   -0.106   -0.539
X6                                             1        0.299   -0.568
X7                                                      1        0.494
X8                                                               1
Exploratory Factor Analysis
Example (ML): Meteorological Measurements
$H_0$: 2 factors are sufficient
$H_a$: More factors are needed
Test Statistic $M = [n - (2p + 4r + 11)/6] \ln[|\mathbf{A}\mathbf{A}' + \mathbf{W}| / |\mathbf{R}|] = 24.789$
degrees of freedom $= [(p - r)^2 - p - r]/2 = [(8 - 2)^2 - 8 - 2]/2 = 26/2 = 13$
$M = 24.789 > \chi^2_{0.05, 13} = 22.362$
Reject $H_0$ at the 5% significance level.
Two factors are not sufficient.
Exploratory Factor Analysis
Example (ML): Meteorological Measurements
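$H_0$: 3 factors are sufficient
$H_a$: More factors are needed
Test Statistic $M = [n - (2p + 4r + 11)/6] \ln[|\mathbf{A}\mathbf{A}' + \mathbf{W}| / |\mathbf{R}|] = 12.186$
degrees of freedom $= [(p - r)^2 - p - r]/2 = [(8 - 3)^2 - 8 - 3]/2 = 14/2 = 7$
$M = 12.186 < \chi^2_{0.05, 7} = 14.067$
Fail to reject $H_0$ at the 5% significance level.
Three factors are sufficient.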
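The degrees of freedom and the multiplier of the log-determinant ratio in these two tests can be checked with a small Python sketch that uses only built-in arithmetic. The function names are hypothetical, the multiplier is written as $n - (2p + 4r + 11)/6$, and the critical values are hard-coded from a chi-square table rather than computed.

```python
def lr_test_df(p, r):
    """Degrees of freedom [(p - r)^2 - p - r] / 2 (the numerator is always even)."""
    return ((p - r) ** 2 - p - r) // 2

def bartlett_multiplier(n, p, r):
    """Multiplier k = n - (2p + 4r + 11) / 6 applied to the log-determinant ratio."""
    return n - (2 * p + 4 * r + 11) / 6

# Statistics quoted in the meteorological example (n = 20, p = 8);
# 5% critical values taken from a chi-square table.
tests = [
    {"r": 2, "M": 24.789, "critical": 22.362},
    {"r": 3, "M": 12.186, "critical": 14.067},
]

for t in tests:
    t["df"] = lr_test_df(8, t["r"])
    t["k"] = bartlett_multiplier(20, 8, t["r"])
    t["reject"] = t["M"] > t["critical"]
    print(t["r"], t["df"], round(t["k"], 3), t["reject"])
```

The sequential procedure stops at the first r for which `reject` is false, which here is r = 3.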
Exploratory Factor Analysis
Example (ML): Meteorological Measurements
Un-rotated Factor Loadings
Indicator   F1        F2        F3        Communality
X1           0.3843    0.8937   -0.0972   0.9558
X2           0.5363    0.6462   -0.4964   0.9503
X3           0.4658    0.8362    0.0989   0.9259
X4           0.6827    0.2868    0.6171   0.9291
X5          -0.1063   -0.4919    0.2275   0.3050
X6           0.2990   -0.6779    0.6104   0.9215
X7           1.0000   -0.0000   -0.0000   1.0000
X8           0.4942    0.7564   -0.3091   0.9119
Exploratory Factor Analysis
Example (ML): Meteorological Measurements
Factor Loadings by Varimax
Indicator   F1        F2        F3        Communality
X1           0.8223    0.5062    0.1528   0.9558
X2           0.2980    0.9058    0.2025   0.9503
X3           0.6717    0.6617    0.1920   0.9259
X4          -0.0554    0.8759    0.3992   0.9291
X5          -0.5416   -0.1034   -0.0318   0.3050
X6          -0.8788    0.2523    0.2925   0.9215
X7           0.0728    0.3602    0.9300   1.0000
Exploratory Factor Analysis
Differences between principal components analysis and factor analysis
PCA emphasizes explaining the variation in the data
Factor analysis aims to explain the correlation among indicators
For PCA, indicators form an index such as the Consumer Price Index or the Dow Jones Industrial Average
For factor analysis, indicators reflect the presence of unobservable (latent) factors
Exploratory Factor Analysis
Factor Indeterminacy
The factor analysis solution is not unique
Due to factor rotation problem
Infinite ways to rotate the factors
Due to estimation of communality
Different methods for estimation of communality
provide different solutions
Exploratory Factor Analysis
Are the data appropriate for factor analysis?
Examine the correlation matrix: look for a pattern of groups of variables with high correlations
Examine the partial correlations after controlling for all other variables: they should be low
Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy > 0.6
The overall square root of mean square residuals
Exploratory Factor Analysis
Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy
KMO measure   Recommendation
>= 0.90       Marvelous
0.80+         Meritorious (desirable)
0.70+         Middling
0.60+         Mediocre (tolerable)
0.50+         Miserable
Below 0.50    Unacceptable
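The overall KMO measure compares squared correlations with squared partial correlations. The sketch below, assuming NumPy is available, uses the usual definition (sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations, with the partial correlations taken from the inverse correlation matrix); the function name `kmo` is hypothetical, and the matrix is the cooking-method correlation matrix.

```python
import numpy as np

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix R."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    # Partial correlation of each pair, controlling for all other variables.
    partial = -inv / np.outer(d, d)
    off = ~np.eye(R.shape[0], dtype=bool)          # off-diagonal mask
    r2 = (R[off] ** 2).sum()
    q2 = (partial[off] ** 2).sum()
    return float(r2 / (r2 + q2))

R_fish = np.array([
    [1.00000, 0.62938, 0.43116, 0.33543],
    [0.62938, 1.00000, 0.22779, 0.30179],
    [0.43116, 0.22779, 1.00000, 0.75061],
    [0.33543, 0.30179, 0.75061, 1.00000],
])
kmo_fish = kmo(R_fish)

# With only two variables there is nothing to partial out, so KMO is 0.5.
kmo_pair = kmo(np.array([[1.0, 0.6], [0.6, 1.0]]))
print(round(kmo_fish, 3), round(kmo_pair, 3))
```

Low partial correlations relative to the raw correlations push the measure toward 1, which is why a high KMO is read as support for a common-factor structure.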
Exploratory Factor Analysis
The overall square root of mean square residuals (RMSR)
Reproduced correlation matrix = AA'
Residual correlation matrix = R – AA'
RMSR is the square root of the average squared values of off-diagonal entries of the residual correlation matrix
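The RMSR definition translates directly into a few lines of Python, assuming NumPy is available. The function name `rmsr` is hypothetical; the inputs are the correlation matrix and unrotated two-factor loadings of the cooking-method example.

```python
import numpy as np

def rmsr(R, A):
    """Square root of the mean squared off-diagonal residual of R - AA'."""
    residual = R - A @ A.T
    upper = np.triu_indices_from(residual, k=1)   # each pair counted once
    return float(np.sqrt(np.mean(residual[upper] ** 2)))

R = np.array([
    [1.00000, 0.62938, 0.43116, 0.33543],
    [0.62938, 1.00000, 0.22779, 0.30179],
    [0.43116, 0.22779, 1.00000, 0.75061],
    [0.33543, 0.30179, 0.75061, 1.00000],
])
A = np.array([
    [0.77459,  0.45133],
    [0.67786,  0.61960],
    [0.80738, -0.48164],
    [0.79593, -0.47834],
])

value = rmsr(R, A)
print(round(value, 4))
```

Because the residual matrix is symmetric, averaging over the upper triangle gives the same result as averaging over all off-diagonal entries.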
Exploratory Factor Analysis
Principal Axis Factor (PAF) Analysis: an
iterative PCF method
SAS default: 30 iterations with a
convergence criterion = 0.001.
If more iterations are required, the data
may not be suitable for factor analysis
Exploratory Factor Analysis
Factor loadings are not unique
The variance in common between the factor and the indicators is not unique either
The variance in common between the factor and the indicators is not a meaningful measure of factor importance
However, the objective of factor analysis is to explain the intercorrelations among indicators and is not to account for the total variation in the data
Exploratory Factor Analysis
PCF analysis labels the first few principal components, which account for most of the variation, as common factors and the other principal components as the specific part.
PCF analysis cannot really be called factor analysis
PAF implicitly assumes a factor model and is the method preferred by most researchers
Confirmatory Factor Analysis
Correlation vs. Covariance Matrix?
The objective of exploratory factor analysis (EFA) is to investigate the intercorrelations among indicators
The correlation matrix is used almost exclusively in EFA
The correlation matrix is the covariance matrix of the standardized variables
Confirmatory Factor Analysis
Correlation vs. Covariance Matrix?
The standardized variables are unitless and
hence are not scale invariant
Most confirmatory factor models are scale invariant
The maximum likelihood procedure for CFA is derived from the covariance matrix
Confirmatory Factor Analysis
Objectives
Given the covariance matrix, to estimate
the parameters of the hypothesized factor
model
To determine the fit of the hypothesized factor model
Confirmatory Factor Analysis
The Models
$X_1 = a_{11} F_1 + a_{12} F_2 + \cdots + a_{1m} F_m + e_1$
$X_2 = a_{21} F_1 + a_{22} F_2 + \cdots + a_{2m} F_m + e_2$
$\vdots$
$X_p = a_{p1} F_1 + a_{p2} F_2 + \cdots + a_{pm} F_m + e_p$
$\mathbf{X} = (X_1, \ldots, X_p)'$, $\mathbf{F} = (F_1, \ldots, F_m)'$
Confirmatory Factor Analysis
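The factor model can be expressed in matrix form as follows
$\mathbf{X} = \mathbf{A}\mathbf{F} + \mathbf{e}$, where
$\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pm} \end{pmatrix}$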
Confirmatory Factor Analysis
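$E(\mathbf{X}) = \mathbf{0}$, $Cov(\mathbf{X}) = \Sigma$
$E(\mathbf{F}) = \mathbf{0}$, $Cov(\mathbf{F}) = \Phi$
$E(\mathbf{e}) = \mathbf{0}$, $Cov(\mathbf{e}) = \mathbf{W}$
$Cov(\mathbf{X}) = \Sigma = \mathbf{A}\Phi\mathbf{A}' + \mathbf{W}$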
Confirmatory Factor Analysis
$\Sigma$ is a p×p matrix of variances and covariances of indicators
$\mathbf{A}$ is a p×m matrix of factor loadings
$\Phi$ is an m×m matrix of the variances and covariances of the latent constructs
$\mathbf{W}$ is a p×p matrix of the variances and covariances of the error terms
Confirmatory Factor Analysis
Confirmatory factor analysis is one of the linear structural relation (LISREL) models
Parameter Matrix   Order
A                  p×m
Φ                  m×m
Confirmatory Factor Analysis
The parameters of the models
Free parameters: the parameters to be estimated
Fixed parameters: the parameters not to be estimated, with values to be provided
Constrained parameters: the parameters with values constrained to be equal to other free parameters
Confirmatory Factor Analysis
The parameters of the models
Constrained parameters: for example, the variances of the errors of all indicators are constrained to be equal
Most latent constructs, such as attitudes, intelligence, or excellence, do not have a natural measurement scale. We need to define the metric or scale for the latent construct
In general, the scale of the latent construct is
Confirmatory Factor Analysis
Estimation of the parameters of the hypothesized factor model
Example: one-factor model
$X_1 = a_1 F + e_1$
$X_2 = a_2 F + e_2$
$\vdots$
$X_p = a_p F + e_p$
$X_i$ is referred to as an observed indicator such as a test score, and $F$ is referred to as a common or latent factor or an unobservable construct
Confirmatory Factor Analysis
Estimation of the parameters of the hypothesized factor model
Example: one-factor model: assume that p=2 and variance of F is 1
$X_1 = a_1 F + e_1$
$X_2 = a_2 F + e_2$
$\Sigma(\theta) = \begin{pmatrix} a_1^2 + w_1^2 & a_1 a_2 \\ a_1 a_2 & a_2^2 + w_2^2 \end{pmatrix}$
4 parameters to be estimated in 3 equations
$\theta = (a_1, a_2, w_1^2, w_2^2)'$
Confirmatory Factor Analysis
Estimation of the parameters of the hypothesized factor model
Example: one-factor model: assume that p=3 and variance of F is 1
$\Sigma(\theta) = \begin{pmatrix} a_1^2 + w_1^2 & a_1 a_2 & a_1 a_3 \\ a_1 a_2 & a_2^2 + w_2^2 & a_2 a_3 \\ a_1 a_3 & a_2 a_3 & a_3^2 + w_3^2 \end{pmatrix}$
6 parameters to be estimated in 6 equations
$\theta = (a_1, a_2, a_3, w_1^2, w_2^2, w_3^2)'$
Confirmatory Factor Analysis
Estimation of the parameters of the hypothesized factor model
Example: one-factor model: assume that p=4 and variance of F is 1
$\Sigma(\theta) = \begin{pmatrix} a_1^2 + w_1^2 & a_1 a_2 & a_1 a_3 & a_1 a_4 \\ a_1 a_2 & a_2^2 + w_2^2 & a_2 a_3 & a_2 a_4 \\ a_1 a_3 & a_2 a_3 & a_3^2 + w_3^2 & a_3 a_4 \\ a_1 a_4 & a_2 a_4 & a_3 a_4 & a_4^2 + w_4^2 \end{pmatrix}$
8 parameters to be estimated in 10 equations
Confirmatory Factor Analysis
Estimation of the parameters of the hypothesized factor model
The number of parameters > the number of equations: unidentified model - one-factor model with p=2 (setting $w_1^2 = w_2^2 = w$ gives three equations for three parameters)
The number of parameters = the number of equations: just-identified model - one-factor model with p=3
The number of parameters < the number of equations: over-identified model - one-factor model with p=4
Degrees of freedom = number of over-identifying equations
= p(p+1)/2 - # of free parameters:
p=4: df = 4(4+1)/2 - 8 = 2
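The counting argument for the one-factor model with the factor variance fixed at 1 can be written as a small Python sketch using only built-in arithmetic. The function names are hypothetical; the parameter count assumes p loadings and p error variances with no equality constraints.

```python
def one_factor_df(p):
    """Equations minus free parameters for a one-factor model, Var(F) fixed at 1."""
    equations = p * (p + 1) // 2     # distinct variances and covariances
    free_parameters = 2 * p          # p loadings + p error variances
    return equations - free_parameters

def identification_status(df):
    """Classify the model from its degrees of freedom."""
    if df < 0:
        return "unidentified"
    if df == 0:
        return "just-identified"
    return "over-identified"

for p in (2, 3, 4):
    df = one_factor_df(p)
    print(p, df, identification_status(df))
```

A negative value means there are more parameters than equations, zero means the model is just-identified, and a positive value is the number of over-identifying equations available for testing fit.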
Confirmatory Factor Analysis
Estimation of the parameters of the
hypothesized factor model
Maximum likelihood estimation procedure for all
free and constrained parameters
Convergence
Statistical significance
Inadmissible estimates
Factor loadings outside [-1,1]
Confirmatory Factor Analysis
Evaluation of Model Fit
$H_0: \Sigma = \Sigma(\theta)$ vs. $H_a: \Sigma \neq \Sigma(\theta)$
The $\chi^2$ test
Test Statistic: $M = \ln[|\hat{\Sigma}| / |\mathbf{C}|] + tr(\mathbf{C}\hat{\Sigma}^{-1}) - p$
where $\hat{\Sigma}$ is the covariance matrix estimated by MLE under the assumed factor model.
Reject $H_0$ if $M > \chi^2_{\alpha, df}$,
df = p(p+1)/2 - # of free parameters
Confirmatory Factor Analysis
Evaluation of Model Fit
Other measures
Goodness-of-fit index (GFI)
$GFI = 1 - \frac{tr[(\hat{\Sigma}^{-1}\mathbf{C} - \mathbf{I})^2]}{tr[(\hat{\Sigma}^{-1}\mathbf{C})^2]}$
$0 \le GFI \le 1$
The approximate expected value of GFI
$EGFI = \frac{1}{1 + (2df/pn)}$
Confirmatory Factor Analysis
Evaluation of Model Fit
Other measures
Adjusted Goodness-of-fit index (AGFI)
$AGFI = 1 - \left[\frac{p(p+1)}{2df}\right][1 - GFI]$
$0 \le AGFI \le 1$
The approximate expected value of AGFI
$EAGFI = 1 - \left[\frac{p(p+1)}{2df}\right]\left[1 - \frac{1}{1 + (2df/pn)}\right]$
$RAGFI = \frac{AGFI}{EAGFI}$
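These adjustments are simple arithmetic and can be sketched in Python with built-in operations only. The function names are hypothetical; the inputs (GFI = 0.822, p = 6, n = 200, df = 9) are the values reported for the test-score CFA example later in this section, and the relative index RGFI = GFI/EGFI follows the way that example computes it.

```python
def expected_gfi(p, n, df):
    """Approximate expected GFI: 1 / (1 + 2 df / (p n))."""
    return 1.0 / (1.0 + 2.0 * df / (p * n))

def adjusted_gfi(gfi, p, df):
    """AGFI = 1 - [p(p+1) / (2 df)] (1 - GFI)."""
    return 1.0 - (p * (p + 1) / (2.0 * df)) * (1.0 - gfi)

def expected_agfi(p, n, df):
    """EAGFI: the AGFI adjustment applied to the expected GFI."""
    return adjusted_gfi(expected_gfi(p, n, df), p, df)

gfi, p, n, df = 0.822, 6, 200, 9
egfi = expected_gfi(p, n, df)
agfi = adjusted_gfi(gfi, p, df)
rgfi = gfi / egfi                          # relative GFI
ragfi = agfi / expected_agfi(p, n, df)     # relative AGFI
print(round(egfi, 3), round(agfi, 3), round(rgfi, 3), round(ragfi, 3))
```

The relative indices rescale GFI and AGFI by what would be expected under a correct model, so they can be compared against the same cut-offs across sample sizes.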
Confirmatory Factor Analysis
Evaluation of Model Fit
Other measures
Rescaled noncentrality parameter
$NCP = \frac{\chi^2 - df}{n}$
McDonald's transformation of NCP (MDN)
$MDN = e^{-0.5 NCP}$
Confirmatory Factor Analysis
Evaluation of Model Fit
Other measures
Tucker-Lewis index (TLI)
$TLI = \frac{NCP_n/df_n - NCP_h/df_h}{NCP_n/df_n}$
Relative noncentrality index (RNI) or Bentler comparative fit index (CFI)
$RNI = \frac{NCP_n - NCP_h}{NCP_n}$
where $NCP_n$ is the NCP for the null model (no factor model is assumed), and $NCP_h$ is the NCP for the hypothesized model
Confirmatory Factor Analysis
Evaluation of Model Fit
Other measures
The residual matrix $\mathbf{C} - \hat{\Sigma}$ contains the variances and covariances that cannot be explained by the assumed factor model.
No more than 5% of the standardized residuals should be greater than 1.96 in absolute value
Squared multiple correlation between the indicators and the construct: estimated variances of the construct (estimated communalities)
Confirmatory Factor Analysis
Evaluation of Model Fit
Other measures
Total coefficient of determination
$1 - \frac{|\hat{\mathbf{W}}|}{|\mathbf{C}|}$
where $|\hat{\mathbf{W}}|$ is the determinant of the estimated covariance matrix of the error variance and $|\mathbf{C}|$ is the determinant of the sample covariance matrix
A value great
Confirmatory Factor Analysis
Alternative measures of fit
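Population Discrepancy Function (PDF)
$PDF = \max\left\{\frac{\chi^2 - df}{n}, 0\right\}$
Bentler-Bonett Normed Fit index (NFI)
$NFI = \frac{F_n - F_h}{F_n}$,
where $F_n$ is the discrepancy function under the null model and $F_h$ is the discrepancy function under the hypothesized model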
Confirmatory Factor Analysis
Alternative Measures of Fit
Bentler-Bonett Non-normed Fit index (NNFI)
$NNFI = \frac{F_n/df_n - F_h/df_h}{F_n/df_n - 1/n}$
$= \frac{\chi^2_n/df_n - \chi^2_h/df_h}{\chi^2_n/df_n - 1}$
where $\chi^2_n$ ($df_n$) is $\chi^2$ (df) under the null model and $\chi^2_h$ ($df_h$) is $\chi^2$ (df) under the hypothesized model
Confirmatory Factor Analysis
Evaluation Criteria
Fit indices      Criteria
Chi-square       Not significant (NS) and as small as possible
Chi-square/df    < 3
GFI              > 0.9
AGFI             > 0.8
Confirmatory Factor Analysis
Evaluation Criteria
Fit indices      Criteria
TLI              > 0.9
RNI              > 0.9
NFI              > 0.9
NNFI             > 0.9
Confirmatory Factor Analysis
Example: Test Scores (CFA)
Covariance Matrix (n=200)
      M      P      C      E      H       F
M     4.00   2.48   2.16   1.28   1.136   1.48
P            4.00   2.04   1.52   1.404   1.72
C                   4.00   1.44   1.334   1.62
E                          4.00   2.744   2.92
H                                 4.00    2.94
F                                         4.00
Confirmatory Factor Analysis
Example:
Test Scores
(CFA)
Model: one-factor model with 6 indicators
p=6 and m=1
Covariance matrices of the construct and error are assumed to be symmetric
The scale of the latent construct is assumed to be the same as that of the first indicator variable (score of mathematics)
Confirmatory Factor Analysis
Example:
Test Scores
(CFA)
Standardized factor loadings
Estimates of parameters are standardized such
that the variance of the construct is 1
Completely standardized factor loadings
Estimated factor loadings are standardized such
that both variances of the construct and
indicators are 1.
Confirmatory Factor Analysis
Example: Test Scores (CFA)
Estimates of factor loading (MLE)
                                                 Completely
Indicator   Unstandardized   Standardized   Standardized   t-statistics
M           1.000            0.914          0.457          -
P           1.134            1.037          0.518          5.21
C           1.073            0.981          0.491          5.046
E           1.786            1.633          0.816          6.393
H           1.770            1.618          0.809          6.375
F           1.937            1.771          0.886          6.533
Confirmatory Factor Analysis
Estimated error covariance matrix
      M       P       C       E       H       F
M     0.791   0       0       0       0       0
P             0.731   0       0       0       0
C                     0.759   0       0       0
E                             0.333   0       0
H                                     0.345   0
F                                             0.216
Confirmatory Factor Analysis
Estimates:
Estimated factor loadings are between -1 and 1
No negative estimates of variances of the construct and error terms
Variances of the error terms are all smaller than 1
Estimates of factor loadings and covariance matrices of the construct and error terms are admissible
Confirmatory Factor Analysis
Goodness of Model Fit
Measures      Value                   Recommendation
χ² test       113.02 (9 df)           p-value < 0.001, reject the model
GFI           0.822                   < 0.9, poor fit
EGFI          0.985
RGFI          0.835 (0.822/0.985)     < 0.9, poor fit
AGFI          0.584                   < 0.8, poor fit
Confirmatory Factor Analysis
Goodness of Model Fit
Measures                  Value    Recommendation
RMSR                      0.507    Poor fit
(Multiple correlation)²
M                         0.209    < 0.5, no good for construct
P                         0.269    < 0.5, no good for construct
C                         0.241    < 0.5, no good for construct
E                         0.667    > 0.5, OK for construct
H                         0.655    > 0.5, OK for construct
F                         0.784    > 0.5, OK for construct
Confirmatory Factor Analysis
Goodness of Model Fit
Measures   Value    Recommendation
NCP_h      0.520    > 0.10, poor fit
MDN        0.771    < 0.90, poor fit
NCP_n      2.748    < 0.90, poor fit
TLI        0.685    < 0.90, poor fit
Confirmatory Factor Analysis
Degrees of freedom under one-factor model
Number of free parameters: 5 factor loadings, 1 variance of construct and 6 variances of error terms = 5+1+6 = 12
$df_h = 6(6+1)/2 - 12 = 9$
Degrees of freedom under null model
Number of free parameters: 6 variances of error terms = 6
$df_n = 6(6+1)/2 - 6 = 15$
Confirmatory Factor Analysis
$\chi^2_n = 564.67$ and $\chi^2_h = 113.02$
$NCP_n = (564.67 - 15)/200 = 2.748$
$NCP_h = (113.02 - 9)/200 = 0.520$
$MDN = \exp[-0.520/2] = 0.771$
$TLI = [2.748/15 - 0.520/9] / [2.748/15] = 0.685$
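The arithmetic for these indices can be reproduced with a short Python sketch that uses only the standard library. The function name `ncp` and the variable names are hypothetical; the chi-square values, degrees of freedom, and n = 200 are those of the test-score example, and RNI is computed from its definition as an additional illustration.

```python
import math

def ncp(chi2, df, n):
    """Rescaled noncentrality parameter (chi-square - df) / n."""
    return (chi2 - df) / n

n = 200
chi2_null, df_null = 564.67, 15     # null model
chi2_hyp, df_hyp = 113.02, 9        # hypothesized one-factor model

ncp_n = ncp(chi2_null, df_null, n)
ncp_h = ncp(chi2_hyp, df_hyp, n)

mdn = math.exp(-0.5 * ncp_h)                                   # McDonald's transformation
tli = (ncp_n / df_null - ncp_h / df_hyp) / (ncp_n / df_null)   # Tucker-Lewis index
rni = (ncp_n - ncp_h) / ncp_n                                  # relative noncentrality index

print(round(ncp_n, 3), round(ncp_h, 3), round(mdn, 3), round(tli, 3), round(rni, 3))
```

All three summary indices fall below 0.90, in line with the poor fit of the one-factor model reported in the tables.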
Confirmatory Factor Analysis
Example: Test Scores (CFA)