Applied Multivariate Quantitative Methods-Factor Analysis

Academic year: 2021

(1)

Applied Multivariate

Quantitative Methods

Factor Analysis

By Jen-pei Liu, PhD

Division of Biometry, Department of Agronomy,

National Taiwan University

and

Wei-Chie Chie, MD, PhD

Department of Public Health

National Taiwan University

(2)

Factor Analysis

Introduction

Exploratory Factor Analysis

Principal Components Factor Analysis

Maximum Likelihood Method

Other Methods

Examples

Confirmatory Factor Analysis and LISREL

Procedures

Examples

(3)

Introduction

Correlation between Test Scores

Classics French English Math DoP Music

Classics 1.00 0.83 0.78 0.70 0.66 0.63

French 0.83 1.00 0.67 0.67 0.65 0.57

English 0.78 0.67 1.00 0.64 0.54 0.51

Math 0.70 0.67 0.64 1.00 0.45 0.51

DoP 0.66 0.65 0.54 0.45 1.00 0.40

Music 0.63 0.57 0.51 0.51 0.40 1.00

DoP: Discrimination of Pitch; Source: Manly (2005) and Spearman

(1904)

(4)

Introduction

Correlation between Test Scores

Classics French English Math DoP Music

Classics 1.00 0.83 0.78 0.70 0.66 0.63

French 0.83 1.00 0.67 0.67 0.65 0.57

English 0.78 0.67 1.00 0.64 0.54 0.51

Math 0.70 0.67 0.64 1.00 0.45 0.51

DoP 0.66 0.65 0.54 0.45 1.00 0.40

Music 0.63 0.57 0.51 0.51 0.40 1.00

(5)

Introduction

Structure and properties of the correlation (covariance) matrix

Any two rows are almost proportional if the diagonals are ignored

For Classics (1) and English (3):

$0.83/0.67 \approx 0.70/0.64 \approx 0.66/0.54 \approx 0.63/0.51 \approx 1.2$
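The near-proportionality of two rows can be checked numerically. The following is a minimal sketch using NumPy; the helper name `row_ratios` is made up for illustration, and the matrix is the test-score correlation table from the earlier slide.

```python
import numpy as np

# Correlations between the six test scores, in the order
# Classics, French, English, Math, DoP, Music.
R = np.array([
    [1.00, 0.83, 0.78, 0.70, 0.66, 0.63],
    [0.83, 1.00, 0.67, 0.67, 0.65, 0.57],
    [0.78, 0.67, 1.00, 0.64, 0.54, 0.51],
    [0.70, 0.67, 0.64, 1.00, 0.45, 0.51],
    [0.66, 0.65, 0.54, 0.45, 1.00, 0.40],
    [0.63, 0.57, 0.51, 0.51, 0.40, 1.00],
])

def row_ratios(R, i, j):
    """Ratios R[i, k] / R[j, k] over all columns k other than i and j."""
    keep = [k for k in range(R.shape[0]) if k not in (i, j)]
    return R[i, keep] / R[j, keep]

ratios = row_ratios(R, 0, 2)  # Classics (row 1) against English (row 3)
print(np.round(ratios, 2))
```

The four ratios are close to each other but not identical, which is what "almost proportional" means here.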

(6)

Introduction

A constant ratio between the rows of the correlation matrix indicates that there might be a model for the variation of the data

The six standardized test scores can be described as

$X_i = a_i F + e_i, \quad i = 1, \ldots, 6$

where $X_i$ is the $i$th standardized test score, $a_i$ is a constant, $F$ is a "factor" value that has mean of 0 and variance of 1, and $e_i$ is the part of $X_i$ that is specific to the $i$th test

(7)

Introduction

If $F$ and $e_i$ are assumed independent

$\mathrm{Var}(X_i) = \mathrm{Var}(a_i F + e_i)$

$= \mathrm{Var}(a_i F) + \mathrm{Var}(e_i)$

$= a_i^2 \mathrm{Var}(F) + \mathrm{Var}(e_i)$

$= a_i^2 + \mathrm{Var}(e_i)$

Since $X_i$ is a standardized score,

$1 = a_i^2 + \mathrm{Var}(e_i)$

$a_i$ is called the factor loading

$a_i^2$: proportion of variance of $X_i$ that is accounted for by the factor
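The variance decomposition can be illustrated by simulation. This is a sketch under stated assumptions: the loading value 0.8, the sample size, and the normal distributions are arbitrary choices for illustration and do not come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
a = 0.8  # hypothetical factor loading

F = rng.standard_normal(n)                      # factor: mean 0, variance 1
e = rng.standard_normal(n) * np.sqrt(1 - a**2)  # specific part: variance 1 - a^2
X = a * F + e                                   # one-factor model

var_x = X.var()                  # should be close to a^2 + Var(e) = 1
corr_xf = np.corrcoef(X, F)[0, 1]  # should be close to the loading a
print(round(var_x, 3), round(corr_xf, 3))
```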

(8)

Introduction

One-factor Model

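$X_1 = a_1 F + e_1$

$X_2 = a_2 F + e_2$

$\vdots$

$X_p = a_p F + e_p$

$X_i$ is referred to as an observed indicator, such as a test score, and $F$ is referred to as a common or latent factor or an unobservable construct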

(9)

Introduction

Examples

Education

Indicators

test scores on Classics, French, English,

Mathematics, Discrimination of Pitch, and Music

Factors

Intelligence

study attitude

(10)

Introduction

Examples

Quality of Life

Indicators

Scores from 1 to 5 for each of questions in the

questionnaires

Factors (domain)

Physical

Mental

(11)

Introduction

Examples

Hospital Anxiety and Depression Scale

Indicators

Scores from each of 14 questions in the

questionnaires

Factors (construct)

Anxiety

Depression

(12)

Introduction

Two-factor model

$X_1 = a_{11} F_1 + a_{12} F_2 + e_1$

$X_2 = a_{21} F_1 + a_{22} F_2 + e_2$

$\vdots$

$X_p = a_{p1} F_1 + a_{p2} F_2 + e_p$

(13)

Introduction

Objectives of factor analysis

To reduce the number of variables (indicators) to the smallest number of common factors

To identify the common factors (latent constructs) that best explain the intercorrelation among indicators and to build the most parsimonious factor model

Principal component analysis also tries to reduce the number of variables. However, factor analysis is based on a model

(14)

Introduction

Example:

Four correlated variables: X1, X2, X3, X4

X1=X1

X2=X2

X3=2X1+3X2

X4=5X1+4X2

(15)

Introduction

If there are two independent variables

Y1 and Y2,

X1=Y1, X3=2Y1+3Y2

X2=Y2, X4=5Y1+4Y2
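The point of this example is that four correlated variables can carry only two dimensions of information. A small sketch, assuming for illustration that Y1 and Y2 are uncorrelated with unit variance (so that Cov(X) = BB'):

```python
import numpy as np

# Rows express X1..X4 as linear combinations of (Y1, Y2), as on the slide.
B = np.array([
    [1.0, 0.0],   # X1 = Y1
    [0.0, 1.0],   # X2 = Y2
    [2.0, 3.0],   # X3 = 2*Y1 + 3*Y2
    [5.0, 4.0],   # X4 = 5*Y1 + 4*Y2
])

# Assumption: Cov(Y) = I, hence Cov(X) = B B'.
cov_x = B @ B.T
rank = np.linalg.matrix_rank(cov_x)
eigenvalues = np.linalg.eigvalsh(cov_x)
print(rank, np.round(eigenvalues, 6))
```

Only two eigenvalues are non-zero, so two underlying variables reproduce the covariance of all four.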

(16)

Introduction

m-factor model

$X_1 = a_{11} F_1 + a_{12} F_2 + \cdots + a_{1m} F_m + e_1$

$X_2 = a_{21} F_1 + a_{22} F_2 + \cdots + a_{2m} F_m + e_2$

$\vdots$

$X_p = a_{p1} F_1 + a_{p2} F_2 + \cdots + a_{pm} F_m + e_p$

$\mathbf{X} = (X_1, \ldots, X_p)'$, $\mathbf{F} = (F_1, \ldots, F_m)'$, and $\mathbf{e} = (e_1, \ldots, e_p)'$

(17)

Introduction

$X_i$ is the $i$th score with mean 0 and unit variance

$a_{i1}$ to $a_{im}$ are the factor loadings for the $i$th test

$F_1$ to $F_m$ are $m$ uncorrelated common factors, each with mean 0 and unit variance

$e_i$ is a factor specific only to the $i$th test that is uncorrelated with any of the common factors and has zero mean

(18)

Introduction

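The factor model can be expressed in matrix form as follows

$\mathbf{X} = \mathbf{A}\mathbf{F} + \mathbf{e}$

$\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pm} \end{pmatrix}$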

(19)

Introduction

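$E(\mathbf{X}) = \mathbf{0}$, $\mathrm{Cov}(\mathbf{X}) = \boldsymbol{\Sigma}$

$E(\mathbf{F}) = \mathbf{0}$, $\mathrm{Cov}(\mathbf{F}) = \mathbf{I}$

$E(\mathbf{e}) = \mathbf{0}$, $\mathrm{Cov}(\mathbf{e}) = \mathbf{W} = \mathrm{diag}(V(e_1), V(e_2), \ldots, V(e_p))$

$\mathrm{Cov}(\mathbf{X}) = \mathbf{A}\mathbf{A}' + \mathbf{W}$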

(20)

Introduction

$V(X_i) = 1$

$= a_{i1}^2 V(F_1) + a_{i2}^2 V(F_2) + \cdots + a_{im}^2 V(F_m) + V(e_i)$

$= a_{i1}^2 + a_{i2}^2 + \cdots + a_{im}^2 + V(e_i)$

The variation of $X_i$ can be partitioned in two parts:

(1) $h_i^2 = a_{i1}^2 + a_{i2}^2 + \cdots + a_{im}^2$: the communality or common variance of $X_i$ (the part of variation of $X_i$ that is related to the common factors), and

(2) $V(e_i)$: the specificity of $X_i$ (the part of variation of $X_i$ that is not related to the common factors)

(21)

Introduction

$\mathrm{Cov}(X_i, F_j) = a_{ij}$

If the indicator variables are standardized, the factor loading $a_{ij}$ is the correlation between the indicator $X_i$ and the common factor $F_j$.

Because the communality satisfies $0 \le h_i^2 \le 1$, $-1 \le a_{ij} \le +1$.

(22)

Introduction

The correlation between two indicators $X_i$ and $X_{i'}$

$r_{ii'} = \mathrm{cov}(X_i, X_{i'}) = a_{i1} a_{i'1} + a_{i2} a_{i'2} + \cdots + a_{im} a_{i'm}$

The correlation between two indicators $X_i$ and $X_{i'}$ is the sum of cross-products of their respective factor loadings.

The two indicator variables can be highly correlated only if they have high factor loadings on the same factors

(23)

Introduction

Exploratory factor analysis (EFA)

Little or no knowledge about the factor

structure

The number of factors

The number of indicators for each factor

Which indicators represent which factors

To collect data and explore or search for a

factor structure or theory which can

(24)

Introduction

Confirmatory factor analysis (CFA)

Factor structure is known or hypothesized

(specified)

a priori

Complete factor structure with their respective

indicators

Nature pattern of factor loadings

To empirically verify or confirm the factor

(25)

Exploratory Factor Analysis

Principal Components Factor (PCF) Analysis

$p$ variables $X_1, X_2, \ldots, X_p$, and $p$ principal components $Z_1, Z_2, \ldots, Z_p$

$Z_1 = b_{11} X_1 + b_{12} X_2 + \cdots + b_{1p} X_p$

$Z_2 = b_{21} X_1 + b_{22} X_2 + \cdots + b_{2p} X_p$

..

(26)

Exploratory Factor Analysis

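$\begin{pmatrix} Z_1 \\ Z_2 \\ \vdots \\ Z_p \end{pmatrix} = \begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1p} \\ b_{21} & b_{22} & \cdots & b_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ b_{p1} & b_{p2} & \cdots & b_{pp} \end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{pmatrix}$

$\mathbf{Z} = \mathbf{B}\mathbf{X}$

The row vectors of $\mathbf{B}$ are the eigenvectors obtained from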

(27)

Exploratory Factor Analysis

The inverse relationship between X and Z

$X_1 = b_{11} Z_1 + b_{21} Z_2 + \cdots + b_{p1} Z_p$

$X_2 = b_{12} Z_1 + b_{22} Z_2 + \cdots + b_{p2} Z_p$

..

(28)

Exploratory Factor Analysis

For factor analysis, only $m$ of the principal components are retained

$X_1 = b_{11} Z_1 + b_{21} Z_2 + \cdots + b_{m1} Z_m + e_1$

$X_2 = b_{12} Z_1 + b_{22} Z_2 + \cdots + b_{m2} Z_m + e_2$

..

(29)

Exploratory Factor Analysis

For factor analysis, factors have zero mean and unit variance.

$Z_i$ is divided by its standard deviation $\sqrt{\lambda_i}$, the square root of the corresponding eigenvalue

$X_1 = \sqrt{\lambda_1}\, b_{11} F_1 + \sqrt{\lambda_2}\, b_{21} F_2 + \cdots + \sqrt{\lambda_m}\, b_{m1} F_m + e_1$

$X_2 = \sqrt{\lambda_1}\, b_{12} F_1 + \sqrt{\lambda_2}\, b_{22} F_2 + \cdots + \sqrt{\lambda_m}\, b_{m2} F_m + e_2$

..

(30)

Exploratory Factor Analysis

The unrotated factor model is

$X_1 = a_{11} F_1 + a_{12} F_2 + \cdots + a_{1m} F_m + e_1$

$X_2 = a_{21} F_1 + a_{22} F_2 + \cdots + a_{2m} F_m + e_2$

..

$X_p = a_{p1} F_1 + a_{p2} F_2 + \cdots + a_{pm} F_m + e_p$

where $F_i = Z_i / \sqrt{\lambda_i}$ and $a_{ij} = \sqrt{\lambda_j}\, b_{ji}$

(31)

Exploratory Factor Analysis

Varimax Rotation

Rotation that produces the maximum variation

Objectives

To have a factor structure in which each variable loads highly on one and only one factor

A given variable should have a high loading on one factor and near zero loadings on others

(32)

Exploratory Factor Analysis

Principal Axis Factor Analysis (PAF): an

iterative PCF method

Step 1: Assume that the prior estimates of

the communalities are one. A PCF is

obtained. Based on the number of factors

retained, estimates of factor loadings are

obtained and then are used to re-estimate

the communalities.

(33)

Exploratory Factor Analysis

Principal Axis Factor (PAF) Analysis: an

iterative PCF method

Step 2: Compute the maximum change in

estimated communalities among all indicators

between two iterations.

Step 3: If the maximum change in communalities

is greater than a pre-specified convergence

criterion, then replace the diagonal elements of

the original correlation matrix by the estimated

communalities

(34)

Exploratory Factor Analysis

Principal Axis Factor (PAF) Analysis: an

iterative PCF method

Step 4: A new principal components factor

analysis is performed using the modified

correlation matrix and Step 2 is repeated.

Step 2 to Step 4 are repeated until the

maximum change in the communalities is

smaller than the pre-specified convergence

criterion
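Steps 1 to 4 can be sketched as a short loop. This is an illustrative NumPy implementation, not the SAS procedure itself; the function name `principal_axis` and its defaults (tolerance 0.001, at most 30 iterations) are assumptions, and the correlation matrix is taken from the test-score example that appears later in these slides.

```python
import numpy as np

# Test-score correlations (n = 200), order: M, P, C, E, H, F.
R = np.array([
    [1.00, 0.62, 0.54,  0.32,  0.284, 0.37],
    [0.62, 1.00, 0.51,  0.38,  0.351, 0.43],
    [0.54, 0.51, 1.00,  0.36,  0.336, 0.405],
    [0.32, 0.38, 0.36,  1.00,  0.686, 0.730],
    [0.284, 0.351, 0.336, 0.686, 1.00, 0.7345],
    [0.37, 0.43, 0.405, 0.730, 0.7345, 1.00],
])

def principal_axis(R, m, tol=1e-3, max_iter=30):
    """Iterated principal components factoring with m factors."""
    h = np.ones(R.shape[0])              # Step 1: prior communalities of one
    history = []
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h)     # Step 3: communalities on the diagonal
        eigvals, eigvecs = np.linalg.eigh(reduced)
        idx = np.argsort(eigvals)[::-1][:m]
        lam = np.clip(eigvals[idx], 0.0, None)
        A = eigvecs[:, idx] * np.sqrt(lam)   # loadings from the m largest roots
        h_new = (A ** 2).sum(axis=1)         # re-estimated communalities
        change = np.max(np.abs(h_new - h))   # Step 2: maximum change
        history.append(change)
        h = h_new
        if change < tol:                     # stop once the change is small
            break
    return A, h, history

A, h, history = principal_axis(R, m=2)
print(np.round(h, 3), len(history))
```

The signs of the loading columns depend on the eigenvector routine, so only the communalities are directly comparable across software.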

(35)

Exploratory Factor Analysis

Factor Scores

$F^* = XG(G'G)^{-1}$

where $F^*$ is the $n \times m$ matrix of factor scores, with one row for each of the $n$ rows of data, $X$ is the $n \times p$ matrix of original data, $G$ is the $p \times m$ matrix of factor loadings
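A minimal sketch of the factor score formula, assuming NumPy. The loading matrix is the unrotated fish-example loadings given later in these slides; the three rows of "true" scores are made-up values used only to check that the formula recovers them when the data contain no specific part.

```python
import numpy as np

def factor_scores(X, G):
    """F* = X G (G'G)^{-1} for n x p data X and p x m loadings G."""
    return X @ G @ np.linalg.inv(G.T @ G)

# p x m loadings (fish example, unrotated).
G = np.array([
    [0.77459,  0.45133],
    [0.67786,  0.61960],
    [0.80738, -0.48164],
    [0.79593, -0.47834],
])

# Hypothetical factor scores for three observations.
F_true = np.array([[1.0, -0.5], [0.0, 2.0], [-1.5, 0.3]])
X = F_true @ G.T          # noiseless standardized data implied by the model

scores = factor_scores(X, G)
print(np.round(scores, 4))
```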

(36)

Exploratory Factor Analysis

Maximum Likelihood (ML) Method

Assumption: multivariate normal

$\mathrm{Cov}(\mathbf{X}) = \Sigma = AA' + W$

Maximize the log-likelihood function

$\ln(l) = -\frac{n}{2}\left\{\ln|AA' + W| + \mathrm{tr}\left[(AA' + W)^{-1} C\right]\right\} + k$

Equivalent to minimize

(37)

Exploratory Factor Analysis

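Differentiate $f$ with respect to $A$ and $W$ and set them to 0:

$\frac{\partial f}{\partial A} = 2\,\Sigma^{-1}(\Sigma - C)\,\Sigma^{-1} A = 0$

$\frac{\partial f}{\partial W} = \mathrm{diag}\left(\Sigma^{-1}(\Sigma - C)\,\Sigma^{-1}\right) = 0$

Require iterations to find estimates of $A$ and $W$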

(38)

Exploratory Factor Analysis

For a given $W$, find the solutions to

$C W^{-1} A = A(I + A' W^{-1} A)$

Let $A_o$ be the solution, a solution for $W$ is given as

(39)

Exploratory Factor Analysis

Selection of The Number of Factors

Cumulative proportion of variation of the r common factors > 80% or 90%

When the eigenvalue of the rth common factor > average variation. When the correlation matrix is used, the average variation is 1. Choose the r common factors whose eigenvalues are greater than 1

(40)

Exploratory Factor Analysis

Selection of The Number of Factors

Scree plot: number of factors vs. eigenvalues – choose the number of factors at which the curve becomes flat

Maximum likelihood method:

Test the hypothesis until failure to reject the null hypothesis

(41)

Exploratory Factor Analysis

Selection of The Number of Factors

Test Statistic: $M = k \ln[\,|AA' + W| / |R|\,]$,

where $k = n - (2p + 4r + 11)/6$.

Reject Ho if $M > \chi^2_{\alpha, df}$,

$df = [(p-r)^2 - p - r]/2$,

r = the number of factors

(42)

110/07/16 Copyright by Jen-pei Liu, PhD an

d Wei-Chu Chie, MD, PhD 42

Exploratory Factor Analysis

Selection of The Number of Factors

Step 1: Start r=2, if fail to reject Ho, stop and

select 2 as the number of factors,

Step 2: otherwise let r=3, re-compute $AA' + W$ and perform the test again. If fail to reject Ho, stop and select 3 as the number of factors

Step 3: Repeat Step 1 and Step 2 until failing

to reject the null hypothesis
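The bookkeeping for this sequential test is easy to get wrong by hand, so a small sketch may help. It assumes the usual Bartlett-corrected multiplier k = n − (2p + 4r + 11)/6; the function names are made up for illustration, and the values p = 8, n = 20 correspond to the meteorological example later in these slides.

```python
def lr_test_df(p, r):
    """Degrees of freedom [(p - r)^2 - p - r] / 2 for testing r factors."""
    return ((p - r) ** 2 - p - r) // 2

def bartlett_multiplier(n, p, r):
    """Multiplier k = n - (2p + 4r + 11) / 6 applied to the log determinant ratio."""
    return n - (2 * p + 4 * r + 11) / 6

df_two = lr_test_df(8, 2)
df_three = lr_test_df(8, 3)
k_two = bartlett_multiplier(20, 8, 2)
print(df_two, df_three, round(k_two, 4))
```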

(43)

Exploratory Factor Analysis

When to rotate

Find the factor loadings >= 0.5 (ignore the sign)

The large and moderate factor loadings indicate

relatively good correlation between indicators and

factors

Try to avoid the situation where a large number of

indicators is related strongly to only a few factors

Rotate the factors such that there is no

overlapping of large or moderate factor loadings

among factors

(44)

Exploratory Factor Analysis

Examples (PCF)

Evaluation of two cooking methods for fish: Shen (1998) and Rencher (1995)

Evaluation items (score:1-10)

aroma (X1)

taste (X2)

texture (X3)

(45)

Exploratory Factor Analysis

Examples (PCF)

Evaluation of two cooking methods for fish

Correlation Matrix (n=20)

      X1       X2       X3       X4
X1    1        0.62938  0.43116  0.33543
X2             1        0.22779  0.30179
X3                      1        0.75061
X4                               1

(46)

Exploratory Factor Analysis

Examples (PCF)

Evaluation of two cooking methods for fish

Eigenvalues and Eigenvectors

Component    1        2         3         4
Eigenvalue   2.34486  1.04840   0.40394   0.20280
X1           0.50584  0.44079  -0.64930  -0.35810
X2           0.44267  0.60513   0.58061   0.31742
X3           0.52726 -0.47040  -0.26945   0.65432

(47)

Exploratory Factor Analysis

Examples (PCF)-Continued

The first 2 principal components account for 84.83% of total variation. We take 2 factors

Computation of factor loading for the first factor and X3

$a_{31} = \sqrt{\lambda_1}\, b_{13} = \sqrt{2.34486} \times 0.52726 = 0.80738$

$h_1^2 = a_{11}^2 + a_{12}^2$
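The loading computation can be reproduced from the correlation matrix. This is a sketch with NumPy; the function name `pcf_loadings` is made up, and the sign convention (each eigenvector scaled so its entries sum to a positive number) is an assumption chosen so the output is comparable with the slides.

```python
import numpy as np

# Correlation matrix of the four fish-evaluation scores (n = 20).
R = np.array([
    [1.00000, 0.62938, 0.43116, 0.33543],
    [0.62938, 1.00000, 0.22779, 0.30179],
    [0.43116, 0.22779, 1.00000, 0.75061],
    [0.33543, 0.30179, 0.75061, 1.00000],
])

def pcf_loadings(R, m):
    """Principal components factoring: a_ij = sqrt(lambda_j) * b_ji."""
    eigvals, eigvecs = np.linalg.eigh(R)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:m]       # indices of the m largest
    lam, vecs = eigvals[order], eigvecs[:, order]
    signs = np.sign(vecs.sum(axis=0))           # eigenvector signs are arbitrary
    signs[signs == 0] = 1.0
    return lam, (vecs * signs) * np.sqrt(lam)

lam, A = pcf_loadings(R, m=2)
communalities = (A ** 2).sum(axis=1)
print(np.round(lam, 5))
print(np.round(A, 5))
print(np.round(communalities, 5))
```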

(48)

Exploratory Factor Analysis

Examples (PCF)

Evaluation of two cooking methods for fish

Factor Loadings and Communality

Variable    F1       F2        Communality  Specific Variance
X1          0.77459  0.45133   0.80370      0.19630
X2          0.67786  0.61960   0.84340      0.15660
X3          0.80738 -0.48164   0.88385      0.11615
X4          0.79593 -0.47834   0.86232      0.13768
Variance    2.34486  1.04840   3.39327
Proportion  0.5862   0.2621    0.8483

(49)

Exploratory Factor Analysis

Examples (PCF)-Continued

The communalities are quite high (> 0.80) for all variables. Most of the variation for the 4 variables can be accounted for by the 2 common factors

The factor loadings for the unrotated first factor are very large and have the same sign. The first factor seems to represent the overall quality of the cooking method

(50)

Exploratory Factor Analysis

Examples (PCF)-Continued

The absolute values of the factor loadings for the unrotated second factor are of the same magnitude. However, aroma and taste have the opposite sign from texture and moisture. This factor is a contrast between aroma with taste and texture with moisture.

However, the factor loadings within each factor have the same magnitudes. Therefore, some of the 4 variables are strongly related to both factors, an undesirable property of the factors. We need to see whether a rotation can help.

(51)

Exploratory Factor Analysis

Examples (PCF)-Continued

      Unrotated            Varimax Rotation
      F1       F2          F1       F2
X1    0.77459  0.45133     0.27554  0.85310
X2    0.67786  0.61960     0.09120  0.91383
X3    0.80738 -0.48164     0.92269  0.18025

(52)

Exploratory Factor Analysis

Examples (PCF)-Continued

It can be verified that the communalities are

unchanged based on the factor loading after

rotation

Factor loadings greater than 0.5 (ignore the sign)

are underlined. These large and moderate

loadings indicate how the variables are related to

the factors
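That communalities survive an orthogonal rotation can be verified with a small varimax routine. The sketch below uses the widely circulated SVD-based iteration without Kaiser normalization, so its numbers need not match the rotated loadings reported on the slides, and columns may come out in a different order or sign; the function names are made up for illustration.

```python
import numpy as np

def varimax(A, max_iter=100, tol=1e-8):
    """Raw varimax rotation of a p x m loading matrix (no Kaiser normalization)."""
    p, m = A.shape
    T = np.eye(m)
    prev = 0.0
    for _ in range(max_iter):
        L = A @ T
        G = A.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                      # orthogonal update of the rotation
        if s.sum() < prev * (1 + tol):  # stop when the objective stalls
            break
        prev = s.sum()
    return A @ T, T

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings."""
    return (L ** 2).var(axis=0).sum()

# Unrotated loadings from the fish example.
A = np.array([
    [0.77459,  0.45133],
    [0.67786,  0.61960],
    [0.80738, -0.48164],
    [0.79593, -0.47834],
])

rotated, T = varimax(A)
print(np.round(rotated, 4))
print(np.round((rotated ** 2).sum(axis=1), 5))  # communalities after rotation
```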

(53)

Exploratory Factor Analysis

Estimation of Correlation Matrix

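$AA' + W = \begin{pmatrix} 0.7746 & 0.4513 \\ 0.6779 & 0.6196 \\ 0.8074 & -0.4816 \\ 0.7959 & -0.4783 \end{pmatrix} \begin{pmatrix} 0.7746 & 0.6779 & 0.8074 & 0.7959 \\ 0.4513 & 0.6196 & -0.4816 & -0.4783 \end{pmatrix} + \begin{pmatrix} 0.1963 & 0 & 0 & 0 \\ 0 & 0.1566 & 0 & 0 \\ 0 & 0 & 0.1162 & 0 \\ 0 & 0 & 0 & 0.1377 \end{pmatrix}$

$= \begin{pmatrix} 1 & 0.8047 & 0.4081 & 0.4006 \\ & 1 & 0.2489 & 0.2432 \\ & & 1 & 0.8730 \\ & & & 1 \end{pmatrix}$

$R - (AA' + W) = \begin{pmatrix} 0 & -0.17532 & 0.02306 & -0.06517 \\ & 0 & -0.02111 & 0.05859 \\ & & 0 & -0.12239 \\ & & & 0 \end{pmatrix}$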

(54)

Exploratory Factor Analysis

Example (PCF) – European Employment (n=30)

            Un-rotated                   Varimax Rotation
Indicators  F1     F2     F3     F4      F1     F2     F3     F4
AGR         0.90  -0.03  -0.34   0.02    0.85   0.10   0.27  -0.36
MIN         0.66   0.00   0.63   0.12    0.11   0.30   0.86  -0.10
MAN        -0.43   0.58  -0.61   0.06   -0.03   0.32  -0.89  -0.09
PS         -0.56   0.15  -0.36   0.02   -0.19  -0.04  -0.64   0.14
CON        -0.39  -0.33   0.09   0.81   -0.02   0.08  -0.04   0.95
SER        -0.67  -0.55   0.08   0.17   -0.35  -0.48  -0.15   0.65
FIN        -0.23  -0.74  -0.12  -0.50   -0.08  -0.98   0.00  -0.01
SPS        -0.76   0.07   0.44  -0.33   -0.91  -0.17  -0.12   0.04

(55)

Exploratory Factor Analysis

Example (PCF) – European Employment

Factor 1: +high loading in AGR, -high loadings in SPS and TC: rural industry rather than social service and communication

Factor 2: +moderate loading in TC and -high loading in FIN: lack of finance industries

(56)

Exploratory Factor Analysis

Example (PCF) – European Employment

Factor 3: +high loading in MIN, -high loading in MAN, -moderate loading in PS: mining rather than manufacturing

Factor 4: +high loading in CON and +moderate loading in SER: construction and service industry

(57)

Exploratory Factor Analysis

Example: Test scores

(PAF)

Correlation Matrix (n=200)

     M     P     C     E     H      F
M    1.00  0.62  0.54  0.32  0.284  0.37
P          1.00  0.51  0.38  0.351  0.43
C                1.00  0.36  0.336  0.405
E                      1.00  0.686  0.730
H                            1.00   0.7345
F                                   1.00

M: mathematics, P: physics, C: chemistry

(58)

Exploratory Factor Analysis

Example (PAF): Test scores

History of PAF analysis

Iter.  Change  M     P     C     E     H     F
1      .359    .766  .714  .641  .797  .812  .829
2      .128    .698  .626  .513  .725  .744  .784
3      .042    .679  .598  .471  .698  .719  .774
...

(59)

Exploratory Factor Analysis

Example (PAF): Test scores

Factor  Eigenvalue  Cum. Prop.
1        3.0281     0.7826
2        0.8410     1.0000
3        0.0004     1.0004
4        0.0003     1.0007
5       -0.0003     1.0004
6       -0.0004     1.0000

Indicator  Factor 1  Factor 2
M          0.63584   0.52255
P          0.65784   0.38549
C          0.59812   0.30447
E          0.76233  -0.31509
H          0.74908  -0.36797
F          0.83129  -0.30329

(60)

Exploratory Factor Analysis

Example (PAF): Test scores

Unrotated

Varimax

Factor 1 Factor2 Factor 1 Factor 2 Communality

M 0.63584 0.52255 0.15200 0.80886 0.677354

P 0.65784 0.38549 0.25687 0.71790 0.581356

C 0.59812 0.30447 0.26309 0.61744 0.450447

E 0.76233 -0.31509 0.78676 0.24786 0.680426

H 0.74908 -0.36797 0.81055 0.19881 0.696517

F 0.83129 -0.30329 0.83205 0.30118 0.783020

(61)

Exploratory Factor Analysis

Example (ML): Meteorological Measurements

X1: daily high temperature (F)

X2: daily low temperature (F)

X3: daily soil high temperature (F)

X4: daily soil low temperature (F)

X5: relative humidity for daily high temperature

X6: relative humidity for daily low temperature

X7: daily wind speed

(62)

Exploratory Factor Analysis

Example (ML): Meteorological Measurements (n=20)

Correlation matrix

      X1     X2      X3      X4      X5      X6      X7      X8
X1    1      0.737   0.931   0.446  -0.479  -0.551   0.384   0.879
X2           1       0.827   0.865  -0.255  -0.019   0.536   0.597
X3                   1       0.616  -0.381  -0.341   0.466   0.848
X4                           1      -0.154   0.382   0.683   0.375
X5                                   1       0.500  -0.106  -0.539
X6                                           1       0.299  -0.568
X7                                                   1       0.494
X8                                                           1

(63)

Exploratory Factor Analysis

Example (ML): Meteorological Measurements

Ho: 2 factors are sufficient

Ha: More factors are needed

Test Statistic $M = [n - (2p + 4r + 11)/6] \ln[\,|AA' + W| / |R|\,] = 24.789$

degrees of freedom $= [(p-r)^2 - p - r]/2 = [(8-2)^2 - 8 - 2]/2 = 26/2 = 13$

$M = 24.789 > \chi^2_{0.05,13} = 22.362$

Reject Ho at the 5% significance level.

Two factors are not sufficient.

(64)

Exploratory Factor Analysis

Example (ML): Meteorological Measurements

Ho: 3 factors are sufficient

Ha: More factors are needed

Test Statistic $M = [n - (2p + 4r + 11)/6] \ln[\,|AA' + W| / |R|\,] = 12.186$

degrees of freedom $= [(p-r)^2 - p - r]/2 = [(8-3)^2 - 8 - 3]/2 = 14/2 = 7$

$M = 12.186 < \chi^2_{0.05,7} = 14.067$

(65)

Exploratory Factor Analysis

Example (ML): Meteorological Measurements

Un-rotated Factor Loadings

Indicator  F1       F2       F3       Communality
X1         0.3843   0.8937  -0.0972   0.9558
X2         0.5363   0.6462  -0.4964   0.9503
X3         0.4658   0.8362   0.0989   0.9259
X4         0.6827   0.2868   0.6171   0.9291
X5        -0.1063  -0.4919   0.2275   0.3050
X6         0.2990  -0.6779   0.6104   0.9215
X7         1.0000  -0.0000  -0.0000   1.0000
X8         0.4942   0.7564  -0.3091   0.9119

(66)

Exploratory Factor Analysis

Example (ML): Meteorological Measurements

Factor Loadings by Varimax

Indicator  F1       F2       F3       Communality
X1         0.8223   0.5062   0.1528   0.9558
X2         0.2980   0.9058   0.2025   0.9503
X3         0.6717   0.6617   0.1920   0.9259
X4        -0.0554   0.8759   0.3992   0.9291
X5        -0.5416  -0.1034  -0.0318   0.3050
X6        -0.8788   0.2523   0.2925   0.9215
X7         0.0728   0.3602   0.9300   1.0000

(67)

Exploratory Factor Analysis

Differences between principal components analysis and factor analysis

PCA emphasizes explaining the variation in the data

Factor analysis is to explain the correlation among indicators

For PCA, indicators form an index such as the Consumer Price Index or the Dow Jones Industrial Average

For factor analysis, indicators reflect the presence

(68)

Exploratory Factor Analysis

Factor Indeterminacy

The factor analysis solution is not unique

Due to factor rotation problem

Infinite ways to rotate the factors

Due to estimation of communality

Different methods for estimation of communality

provide different solutions

(69)

Exploratory Factor Analysis

Are the data appropriate for factor analysis?

Examine the correlation. Pattern of groups of variables with high correlations

Partial correlations after controlling all other variables. Low partial correlations

Kaiser-Meyer-Olkin (KMO) measure for sampling adequacy > 0.6

The overall square root of mean square residuals

(70)

Exploratory Factor Analysis

Kaiser-Meyer-Olkin (KMO) measure for sampling adequacy

KMO measure   Recommendation
>= 0.90       Marvelous
0.80+         Meritorious (desirable)
0.70+         Middling
0.60+         Mediocre (tolerable)
0.50+         Miserable
Below 0.50    Unacceptable

(71)

Exploratory Factor Analysis

The overall square root of mean square

residuals (RMSR)

Reproduced correlation matrix = AA'

Residual correlation matrix = R – AA'

RMSR is the square root of the average

squared values of off-diagonal entries of

the residual correlation matrix
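As a worked illustration, RMSR can be computed for the two-factor PCF solution of the fish example. This is a sketch with NumPy; the function name `rmsr` is made up, and the loading matrix is written with p rows and m columns so that the reproduced correlation matrix is the product of the loadings with their transpose.

```python
import numpy as np

# Fish example: observed correlations and unrotated two-factor loadings.
R = np.array([
    [1.00000, 0.62938, 0.43116, 0.33543],
    [0.62938, 1.00000, 0.22779, 0.30179],
    [0.43116, 0.22779, 1.00000, 0.75061],
    [0.33543, 0.30179, 0.75061, 1.00000],
])
A = np.array([
    [0.77459,  0.45133],
    [0.67786,  0.61960],
    [0.80738, -0.48164],
    [0.79593, -0.47834],
])

def rmsr(R, A):
    """Root mean square of the off-diagonal residual correlations."""
    residual = R - A @ A.T
    off_diag = residual[np.triu_indices_from(residual, k=1)]
    return np.sqrt(np.mean(off_diag ** 2))

value = rmsr(R, A)
print(round(value, 4))
```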

(72)

Exploratory Factor Analysis

Principal Axis Factor (PAF) Analysis: an

iterative PCF method

SAS default: 30 iterations with a

convergence criterion = 0.001.

If more iterations are required, the data

may not be suitable for factor analysis

(73)

Exploratory Factor Analysis

Factor loadings are not unique

The variance in common between the factor and the indicators is not unique either

The variance in common between the factor and the indicators is not a meaningful measure of factor importance

However, the objective of factor analysis is to

explain the intercorrelations among indicators

and is not to account for the total variation in

the data

(74)

Exploratory Factor Analysis

PCF analysis labels the first few principal components, which account for most of the variation, as common factors and the other principal components as the specific part.

PCF analysis cannot really be called factor analysis

PAF implicitly assumes a factor model and is the method preferred by most researchers

(75)

Confirmatory Factor Analysis

Correlation vs. Covariance Matrix?

The objective of exploratory factor analysis

(EFA) is to investigate the intercorrelations

among indicators

The correlation matrix is always exclusively

used in the EFA

The correlation is the covariance matrix of

(76)

Confirmatory Factor Analysis

Correlation vs. Covariance Matrix?

The standardized variables are unitless and hence are not scale invariant

Most confirmatory factor models are scale invariant

The maximum likelihood procedure for CFA is derived from the covariance matrix

(77)

Confirmatory Factor Analysis

Objectives

Given the covariance matrix, to estimate

the parameters of the hypothesized factor

model

To determine the fit of the hypothesized

(78)

Confirmatory Factor Analysis

The Models

$X_1 = a_{11} F_1 + a_{12} F_2 + \cdots + a_{1m} F_m + e_1$

$X_2 = a_{21} F_1 + a_{22} F_2 + \cdots + a_{2m} F_m + e_2$

$\vdots$

$X_p = a_{p1} F_1 + a_{p2} F_2 + \cdots + a_{pm} F_m + e_p$

$\mathbf{X} = (X_1, \ldots, X_p)'$, $\mathbf{F} = (F_1, \ldots, F_m)'$,

(79)

Confirmatory Factor Analysis

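The factor model can be expressed in matrix form as follows

$\mathbf{X} = \mathbf{A}\mathbf{F} + \mathbf{e}$

$\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pm} \end{pmatrix}$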

(80)

Confirmatory Factor Analysis

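$E(\mathbf{X}) = \mathbf{0}$, $\mathrm{Cov}(\mathbf{X}) = \boldsymbol{\Sigma}$

$E(\mathbf{F}) = \mathbf{0}$, $\mathrm{Cov}(\mathbf{F}) = \boldsymbol{\Phi}$

$E(\mathbf{e}) = \mathbf{0}$, $\mathrm{Cov}(\mathbf{e}) = \mathbf{W}$

$\mathrm{Cov}(\mathbf{X}) = \boldsymbol{\Sigma} = \mathbf{A}\boldsymbol{\Phi}\mathbf{A}' + \mathbf{W}$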

(81)

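Confirmatory Factor Analysis

$\boldsymbol{\Sigma}$ is a $p \times p$ matrix of variances and covariances of indicators

$\mathbf{A}$ is a $p \times m$ matrix of factor loadings

$\boldsymbol{\Phi}$ is an $m \times m$ matrix of the variances and covariances of the latent constructs

$\mathbf{W}$ is a $p \times p$ matrix of the variances and covariances of the error terms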

(82)

Confirmatory Factor Analysis

Confirmatory factor analysis is one of the linear structural relations (LISREL) models

Parameter Matrix   Order
A                  p×m
Φ                  m×m

(83)

Confirmatory Factor Analysis

The parameters of the models

Free parameters: the parameters to be estimated

Fixed parameters: the parameters that are not estimated and whose values are provided

Constrained parameters: the parameters with values constrained to be equal to other free parameters

(84)

Confirmatory Factor Analysis

The parameters of the models

Constrained parameters: the variances of the

errors of all indicators are constrained to be equal

Most of the latent construct such as attitudes,

intelligence, or excellence do not have a natural

measurement scale. We need to define the metric

or scale for the latent construct

In general, the scale of the latent construct is

(85)

Confirmatory Factor Analysis

Estimation of the parameters of the

hypothesized factor model

Example: one-factor model

$X_1 = a_1 F + e_1$

$X_2 = a_2 F + e_2$

$\vdots$

$X_p = a_p F + e_p$

$X_i$ is referred to as an observed indicator, such as a test score, and $F$ is referred to as a common or latent factor or an unobservable construct

(86)

Confirmatory Factor Analysis

Estimation of the parameters of the hypothesized

factor model

Example: one-factor model: assume that p=2 and variance of F is 1

$X_1 = a_1 F + e_1$

$X_2 = a_2 F + e_2$

$\Sigma(\boldsymbol{\theta}) = \begin{pmatrix} a_1^2 + w_1^2 & a_1 a_2 \\ a_1 a_2 & a_2^2 + w_2^2 \end{pmatrix}$

4 parameters to be estimated in 3 equations

(87)

Confirmatory Factor Analysis

Estimation of the parameters of the hypothesized

factor model

Example: one-factor model: assume that p=3 and variance of F is 1

$\Sigma(\boldsymbol{\theta}) = \begin{pmatrix} a_1^2 + w_1^2 & a_1 a_2 & a_1 a_3 \\ a_1 a_2 & a_2^2 + w_2^2 & a_2 a_3 \\ a_1 a_3 & a_2 a_3 & a_3^2 + w_3^2 \end{pmatrix}$

6 parameters to be estimated in 6 equations

$\boldsymbol{\theta} = (a_1, a_2, a_3, w_1^2, w_2^2, w_3^2)'$

(88)

Confirmatory Factor Analysis

Estimation of the parameters of the hypothesized

factor model

Example: one-factor model: assume that p=4 and variance of F is 1

$\Sigma(\boldsymbol{\theta}) = \begin{pmatrix} a_1^2 + w_1^2 & a_1 a_2 & a_1 a_3 & a_1 a_4 \\ a_1 a_2 & a_2^2 + w_2^2 & a_2 a_3 & a_2 a_4 \\ a_1 a_3 & a_2 a_3 & a_3^2 + w_3^2 & a_3 a_4 \\ a_1 a_4 & a_2 a_4 & a_3 a_4 & a_4^2 + w_4^2 \end{pmatrix}$

8 parameters to be estimated in 10 equations

(89)

Confirmatory Factor Analysis

Estimation of the parameters of the hypothesized

factor model

The number of parameters > the number of equations: unidentified model - one-factor model with p=2 → set $w_1^2 = w_2^2 = w$: three equations for three parameters

The number of parameters = the number of equations: just-identified model - one-factor model with p=3

The number of parameters < the number of equations: over-identified model - one-factor model with p=4

Degrees of freedom = over-identifying equations

= p(p+1)/2 - # of free parameters:

p=4: df = 4(4+1)/2 – 8 = 2
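The counting argument can be written out as a few lines of Python. This is a sketch for the one-factor model with the factor variance fixed at 1; the function name and the `equal_error_variances` flag are made up for illustration.

```python
def one_factor_identification(p, equal_error_variances=False):
    """Count equations and free parameters for a one-factor model with Var(F) = 1."""
    equations = p * (p + 1) // 2                     # distinct variances and covariances
    free = p + (1 if equal_error_variances else p)   # loadings plus error variances
    df = equations - free
    if df < 0:
        status = "unidentified"
    elif df == 0:
        status = "just-identified"
    else:
        status = "over-identified"
    return equations, free, df, status

for p in (2, 3, 4):
    print(p, one_factor_identification(p))
print(2, one_factor_identification(2, equal_error_variances=True))
```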

(90)

Confirmatory Factor Analysis

Estimation of the parameters of the

hypothesized factor model

Maximum likelihood estimation procedure for all

free and constrained parameters

Convergence

Statistical significance

Inadmissible estimates

Factor loadings outside [-1,1]

(91)

Confirmatory Factor Analysis

Evaluation of Model Fit

Ho: $\Sigma = \Sigma(\boldsymbol{\theta})$ vs. Ha: $\Sigma \neq \Sigma(\boldsymbol{\theta})$

The $\chi^2$ test

Test Statistic: $M = \ln[\,|\hat{\Sigma}| / |C|\,] + \mathrm{tr}(C\hat{\Sigma}^{-1}) - p$

where $\hat{\Sigma}$ is the covariance matrix estimated using MLE under the assumed factor model.

Reject Ho if $M > \chi^2_{\alpha, df}$,

df = [p(p+1)/2 - # of free parameters],

(92)

Confirmatory Factor Analysis

Evaluation of Model Fit

Other measures

Goodness-of-fit index (GFI)

$GFI = 1 - \dfrac{\mathrm{tr}[(\hat{\Sigma}^{-1} C - I)^2]}{\mathrm{tr}[(\hat{\Sigma}^{-1} C)^2]}$

$0 \le GFI \le 1$

The approximate expected value of GFI

$EGFI \approx \dfrac{1}{1 + (2\,df / pn)}$

(93)

Confirmatory Factor Analysis

Evaluation of Model Fit

Other measures

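Adjusted Goodness-of-fit index (AGFI)

$AGFI = 1 - \left[\dfrac{p(p+1)}{2\,df}\right][1 - GFI]$

$0 \le AGFI \le 1$

The approximate expected value of AGFI

$EAGFI \approx 1 - \left[\dfrac{p(p+1)}{2\,df}\right]\left[1 - \dfrac{1}{1 + (2\,df / pn)}\right]$

$RAGFI = AGFI / EAGFI$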
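These indices are simple arithmetic once GFI is known. The sketch below plugs in the values reported for the one-factor test-score example later in these slides (GFI = 0.822, p = 6, df = 9, n = 200); the function names are made up for illustration.

```python
def expected_gfi(df, p, n):
    """Approximate expected GFI: 1 / (1 + 2 df / (p n))."""
    return 1.0 / (1.0 + 2.0 * df / (p * n))

def adjusted_gfi(gfi, p, df):
    """AGFI = 1 - [p (p + 1) / (2 df)] (1 - GFI)."""
    return 1.0 - (p * (p + 1) / (2.0 * df)) * (1.0 - gfi)

gfi, p, df, n = 0.822, 6, 9, 200
egfi = expected_gfi(df, p, n)
rgfi = gfi / egfi                  # GFI relative to its expected value
agfi = adjusted_gfi(gfi, p, df)
print(round(egfi, 3), round(rgfi, 3), round(agfi, 3))
```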

(94)

Confirmatory Factor Analysis

Evaluation of Model Fit

Other measures

Rescaled noncentrality parameter

$NCP = \dfrac{\chi^2 - df}{n}$

McDonald's transformation of NCP (MDN)

$MDN = e^{-0.5\,NCP}$

(95)

Confirmatory Factor Analysis

Evaluation of Model Fit

Other measures

Tucker-Lewis index (TLI)

$TLI = \dfrac{NCP_n/df_n - NCP_h/df_h}{NCP_n/df_n}$

Relative noncentrality index (RNI) or Bentler Comparative index (CFI)

$RNI = \dfrac{NCP_n - NCP_h}{NCP_n}$

where $NCP_n$ is the NCP for the null model (no factor model is assumed), and $NCP_h$ is the NCP for the hypothesized model

(96)

Confirmatory Factor Analysis

Evaluation of Model Fit

Other measures

The residual matrix: $C - \hat{\Sigma}$ contains the variances and covariances that cannot be explained by the assumed factor model.

No more than 5% of the standardized residuals

should be greater than 1.96

Squared multiple correlation between the indicators

and construct– estimated variances of construct

(estimated communalities)

(97)

Confirmatory Factor Analysis

Evaluation of Model Fit

Other measures

Total coefficient of determination

$1 - \dfrac{|\hat{W}|}{|C|}$

where $|\hat{W}|$ is the determinant of the estimated covariance matrix of the error variance and $|C|$ is the determinant of the sample covariance matrix

A value great

(98)

Confirmatory Factor Analysis

Alternative measures of fit

Population Discrepancy Function (PDF)

$PDF = \max\left\{\dfrac{\chi^2 - df}{n},\, 0\right\}$

Bentler-Bonnet Normed Fit index (NFI)

$NFI = \dfrac{F_n - F_h}{F_n}$,

where F is the discrepancy function under

(99)

Confirmatory Factor Analysis

Alternative Measures of Fit

Bentler-Bonnet Non-normed Fit index (NNFI)

$NNFI = \dfrac{F_n/df_n - F_h/df_h}{F_n/df_n - 1/n} = \dfrac{\chi^2_n/df_n - \chi^2_h/df_h}{\chi^2_n/df_n - 1}$

where $\chi^2_n$ ($df_n$) is $\chi^2$ (df) under the null model and $\chi^2_h$ ($df_h$) is $\chi^2$ (df) under

(100)

Confirmatory Factor Analysis

Evaluation Criteria

Fit indices     Criteria
Chi-square      NS and as small as possible
Chi-square/df   <3
GFI             >0.9
AGFI            >0.8

(101)

Confirmatory Factor Analysis

Evaluation Criteria

Fit indices     Criteria
TLI             >0.9
RNI             >0.9
NFI             >0.9
NNFI            >0.9

(102)

Confirmatory Factor Analysis

Example: Test Scores

(CFA)

Covariance Matrix (n=200)

     M     P     C     E     H      F
M    4.00  2.48  2.16  1.28  1.136  1.48
P          4.00  2.04  1.52  1.404  1.72
C                4.00  1.44  1.334  1.62
E                      4.00  2.744  2.92
H                            4.00   2.94
F                                   4.00

(103)

Confirmatory Factor Analysis

Example:

Test Scores

(CFA)

Model: one-factor model with 6 indicators

P=6 and m=1

covariance matrices of construct and error

are assumed to be symmetric

The scale of the latent construct is assumed to be the same as that of the first indicator variable (score of mathematics)

(104)

Confirmatory Factor Analysis

Example:

Test Scores

(CFA)

Standardized factor loadings

Estimates of parameters are standardized such

that the variance of the construct is 1

Completely standardized factor loadings

Estimated factor loadings are standardized such

that both variances of the construct and

indicators are 1.

(105)

Confirmatory Factor Analysis

Example:

Test Scores

(CFA)

Estimates of factor loading (MLE)

     Unstandardized  Standardized  Completely Standardized  t-statistics
M    1.000           0.914         0.457                    -
P    1.134           1.037         0.518                    5.21
C    1.073           0.981         0.491                    5.046
E    1.786           1.633         0.816                    6.393
H    1.770           1.618         0.809                    6.375
F    1.937           1.771         0.886                    6.533

(106)

Confirmatory Factor Analysis

Estimated error covariance matrix

     M      P      C      E      H      F
M    0.791  0      0      0      0      0
P           0.731  0      0      0      0
C                  0.759  0      0      0
E                         0.333  0      0
H                                0.345  0
F                                       0.216

(107)

Confirmatory Factor Analysis

Estimates:

Estimated factor loadings are between -1 and 1

No negative estimates of variances of the construct and error terms

Variances of the error terms are all smaller than 1

Estimates of factor loadings and covariance matrices of the construct and error terms are admissible

(108)

Confirmatory Factor Analysis

Goodness of Model Fit

Measures   Value           Recommendation
χ² test    113.02 (9 df)   p-value < 0.001: reject the model
GFI        0.822           < 0.9: poor fit
EGFI       0.985
RGFI       0.835           (0.822/0.985) < 0.9: poor fit
AGFI       0.584           < 0.8: poor fit
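The RGFI and AGFI values can be reproduced from GFI with a short sketch. RGFI = GFI/EGFI is shown on the slide; the AGFI formula used here, 1 - [p(p+1)/(2 df)](1 - GFI), is the commonly used definition and is an assumption, since the slide does not restate it. The function names are hypothetical.

```python
def rgfi(gfi, egfi):
    """Relative GFI: observed GFI divided by its expected value."""
    return gfi / egfi


def agfi(gfi, p, df):
    """Adjusted GFI (assumed standard definition): penalizes GFI by the
    ratio of distinct variances/covariances, p(p+1)/2, to the model df."""
    distinct_moments = p * (p + 1) / 2
    return 1.0 - (distinct_moments / df) * (1.0 - gfi)


# One-factor model: p = 6 indicators, 9 degrees of freedom.
one_factor_rgfi = rgfi(0.822, 0.985)
one_factor_agfi = agfi(0.822, p=6, df=9)

print(f"RGFI = {one_factor_rgfi:.3f}")
print(f"AGFI = {one_factor_agfi:.3f}")
```

Since GFI is itself rounded to three decimals, the computed AGFI can differ from the reported value in the last digit.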

(109)

Confirmatory Factor Analysis

Goodness of Model Fit

Measures   Value   Recommendation
RMSR       0.507   Poor fit

Squared multiple correlation (R²)
M          0.209   < 0.5: no good for construct
P          0.269   < 0.5: no good for construct
C          0.241   < 0.5: no good for construct
E          0.667   > 0.5: OK for construct
H          0.655   > 0.5: OK for construct
F          0.784   > 0.5: OK for construct
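In the completely standardized solution, the squared multiple correlation of an indicator is the square of its loading, and the error variance is one minus that value. A minimal sketch using the completely standardized loadings of the one-factor model (the variable names are hypothetical):

```python
# Completely standardized loadings of the one-factor model.
loadings = {"M": 0.457, "P": 0.518, "C": 0.491,
            "E": 0.816, "H": 0.809, "F": 0.886}

# R^2: share of each indicator's variance explained by the construct.
r_squared = {name: value ** 2 for name, value in loadings.items()}

# Error variance in the completely standardized metric.
error_variance = {name: 1.0 - r2 for name, r2 in r_squared.items()}

# Rule of thumb used on the slide: R^2 above 0.5 is acceptable.
adequate = [name for name, r2 in r_squared.items() if r2 > 0.5]

for name in loadings:
    print(f"{name}: R^2={r_squared[name]:.3f}, "
          f"error variance={error_variance[name]:.3f}")
print("Indicators with R^2 > 0.5:", adequate)
```

The values match the reported R² and error variances up to rounding of the loadings.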

(110)

Confirmatory Factor Analysis

Goodness of Model Fit

Measures   Value   Recommendation
NCP_h      0.520   > 0.10: poor fit
MDN        0.771   < 0.90: poor fit
NCP_n      2.748   (null model; baseline for TLI)
TLI        0.685   < 0.90: poor fit

(111)

Confirmatory Factor Analysis

Degrees of freedom under one-factor model

Number of free parameters: 5 factor loadings, 1 variance of the construct and 6 variances of error terms = 5+1+6 = 12

$df_h = 6(6+1)/2 - 12 = 9$

Degrees of freedom under null model

Number of free parameters: 6 variances of error terms = 6

$df_n = 6(6+1)/2 - 6 = 15$
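The degrees of freedom are the number of distinct variances and covariances, p(p+1)/2, minus the number of free parameters. A small sketch of this count (the function name is hypothetical; the parameter count for the two-factor model is an assumption inferred from the 8 df reported for it later):

```python
def cfa_degrees_of_freedom(p, free_parameters):
    """Distinct elements of a p x p covariance matrix minus free parameters."""
    return p * (p + 1) // 2 - free_parameters


# One-factor model: 5 free loadings + 1 construct variance + 6 error variances.
df_one_factor = cfa_degrees_of_freedom(6, 5 + 1 + 6)

# Null model: only the 6 error variances are estimated.
df_null = cfa_degrees_of_freedom(6, 6)

# Assumed count for the two-factor model: 4 free loadings + 2 construct
# variances + 1 construct covariance + 6 error variances.
df_two_factor = cfa_degrees_of_freedom(6, 4 + 2 + 1 + 6)

print(df_one_factor, df_null, df_two_factor)
```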

(112)

Confirmatory Factor Analysis

$\chi^2_n = 564.67$ and $\chi^2_h = 113.02$

$NCP_n = (564.67 - 15)/200 = 2.748$

$NCP_h = (113.02 - 9)/200 = 0.520$

$MDN = \exp(-0.520/2) = 0.771$

$TLI = [2.748/15 - 0.520/9] / [2.748/15] = 0.685$
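The same arithmetic as a short sketch. The function names are hypothetical, and the division by n = 200 follows the slide (some references divide by n - 1 instead).

```python
import math

N = 200  # sample size used on the slides


def ncp(chi2, df, n=N):
    """Rescaled noncentrality parameter: (chi-square - df) / n."""
    return (chi2 - df) / n


def mdn(ncp_h):
    """McDonald's measure of centrality for the hypothesized model."""
    return math.exp(-ncp_h / 2)


def tli(ncp_n, df_n, ncp_h, df_h):
    """Tucker-Lewis index from the null (n) and hypothesized (h) models."""
    return (ncp_n / df_n - ncp_h / df_h) / (ncp_n / df_n)


ncp_null = ncp(564.67, 15)        # null model
ncp_one_factor = ncp(113.02, 9)   # one-factor model
mdn_one_factor = mdn(ncp_one_factor)
tli_one_factor = tli(ncp_null, 15, ncp_one_factor, 9)

print(f"NCP_n = {ncp_null:.3f}")
print(f"NCP_h = {ncp_one_factor:.3f}")
print(f"MDN   = {mdn_one_factor:.3f}")
print(f"TLI   = {tli_one_factor:.3f}")
```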

(113)

Confirmatory Factor Analysis

Example: Test Scores (CFA)

Model: two-factor model with 6 indicators and correlated constructs. p = 6 and m = 2

$M = a_{11} F_1 + e_1$

$P = a_{21} F_1 + e_2$

$C = a_{31} F_1 + e_3$

$E = a_{42} F_2 + e_4$

$H = a_{52} F_2 + e_5$

$F = a_{62} F_2 + e_6$

(114)

Confirmatory Factor Analysis

Example: Test Scores (CFA)

The covariance matrices of the constructs and the error terms are assumed to be symmetric

The two constructs are allowed to be correlated

The scales of the latent constructs are assumed to be the same as those of the first and fourth indicator variables (scores of mathematics and English)

(115)

Confirmatory Factor Analysis

Example: Test Scores (CFA)

Estimates of factor loadings (MLE)

     Completely          Completely
     Standardized (F1)   Standardized (F2)   t-statistic
M    0.776               0                   -
P    0.785               0                   9.210
C    0.684               0                   8.610
E    0                   0.823               -
H    0                   0.822               12.933
F    0                   0.894               13.939

The loadings of M and E are fixed to set the scale of each construct, so no t-statistic is reported for them

(116)

Confirmatory Factor Analysis

Estimated error covariance matrix

     M       P       C       E       H       F
M    0.398   0       0       0       0       0
P            0.384   0       0       0       0
C                    0.533   0       0       0
E                            0.323   0       0
H                                    0.324   0
F                                            0.200

The constructs have been standardized to have unit variance

(117)

Confirmatory Factor Analysis

Estimated covariance matrix of the two constructs

               Quantitative   Verbal
Quantitative   1              0.568
Verbal                        1

The constructs have been standardized to have unit variance
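With the completely standardized estimates, the correlation matrix implied by the two-factor model is ΛΦΛ' + Θ, where Λ holds the loadings, Φ the construct covariance matrix and Θ the diagonal error variances. The sketch below is one way to compute it with plain Python lists; the function name is hypothetical and the numbers are the rounded estimates from the slides.

```python
names = ["M", "P", "C", "E", "H", "F"]

# Completely standardized loadings: column 0 = Quantitative, column 1 = Verbal.
loadings = [[0.776, 0.0], [0.785, 0.0], [0.684, 0.0],
            [0.0, 0.823], [0.0, 0.822], [0.0, 0.894]]

# Covariance (here correlation) matrix of the two constructs.
phi = [[1.0, 0.568], [0.568, 1.0]]

# Error variances (diagonal of the error covariance matrix).
theta = [0.398, 0.384, 0.533, 0.323, 0.324, 0.200]


def implied_matrix(lam, phi, theta):
    """Return Lambda * Phi * Lambda' + diag(theta) as a list of lists."""
    p, m = len(lam), len(phi)
    sigma = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            sigma[i][j] = sum(lam[i][a] * phi[a][b] * lam[j][b]
                              for a in range(m) for b in range(m))
            if i == j:
                sigma[i][j] += theta[i]
    return sigma


sigma = implied_matrix(loadings, phi, theta)
for name, row in zip(names, sigma):
    print(name, " ".join(f"{value:6.3f}" for value in row))
```

Dividing the observed covariances by 4.00 gives the observed correlations, for example 2.48/4.00 = 0.62 for M and P, which can be compared with the corresponding implied entry.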

(118)

Confirmatory Factor Analysis

Goodness of Model Fit

Measures   Value         Recommendation
χ² test    6.05 (8 df)   p-value = 0.642: fail to reject the model
GFI        0.990         > 0.9: good fit
EGFI       0.990
RGFI       1.000         (0.990/0.990) > 0.9: good fit
AGFI       0.974         > 0.8: good fit
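The p-value of the χ² test can be checked without a statistics library when the degrees of freedom are even, because the upper tail probability then has a closed form: exp(-x/2) times the sum of (x/2)^k / k! for k = 0, ..., df/2 - 1. The sketch below uses this identity for the two-factor model (8 df); the function name is hypothetical, and the shortcut does not apply to the one-factor model, which has 9 df.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper tail probability P(X > x) of a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k   # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


p_value = chi2_sf_even_df(6.05, 8)
print(f"p-value = {p_value:.3f}")
```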

(119)

Confirmatory Factor Analysis

Goodness of Model Fit

Measures   Value   Recommendation
RMSR       0.111   Fair fit

Squared multiple correlation (R²)
M          0.602   > 0.5: OK for construct
P          0.616   > 0.5: OK for construct
C          0.467   < 0.5: no good for construct
E          0.677   > 0.5: OK for construct
H          0.676   > 0.5: OK for construct
F          0.800   > 0.5: OK for construct

(120)

Confirmatory Factor Analysis

Goodness of Model Fit

Measures   Value   Recommendation
NCP_h      0.000   < 0.10: good fit
MDN        1.000   > 0.90: good fit
NCP_n      2.748   (null model; baseline for TLI)
TLI        1.000   > 0.90: good fit

(121)

Confirmatory Factor Analysis

If the null hypothesis is true, the expected value of NCP is 0

Due to sampling error, it is possible to obtain negative estimates of NCP

Here (6.05 - 8)/200 is negative, so NCP is set to 0 and MDN, TLI and RNI are all 1

This represents an almost perfect model fit
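The truncation at zero can be sketched for the two-factor model as follows. The function name is hypothetical, and the RNI expression 1 - NCP_h/NCP_n is the usual definition, stated here as an assumption because the slides do not give it explicitly.

```python
import math

N = 200  # sample size used on the slides


def truncated_ncp(chi2, df, n=N):
    """Rescaled noncentrality parameter, set to 0 when the estimate is negative."""
    return max((chi2 - df) / n, 0.0)


raw_ncp_two_factor = (6.05 - 8) / N        # negative because chi-square < df
ncp_null = truncated_ncp(564.67, 15)       # null model
ncp_two_factor = truncated_ncp(6.05, 8)    # two-factor model, truncated to 0

mdn_two_factor = math.exp(-ncp_two_factor / 2)
tli_two_factor = (ncp_null / 15 - ncp_two_factor / 8) / (ncp_null / 15)
rni_two_factor = 1.0 - ncp_two_factor / ncp_null   # assumed definition of RNI

print(f"raw NCP_h = {raw_ncp_two_factor:.5f}, truncated NCP_h = {ncp_two_factor:.3f}")
print(f"MDN = {mdn_two_factor:.3f}, TLI = {tli_two_factor:.3f}, RNI = {rni_two_factor:.3f}")
```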

(122)

Confirmatory Factor Analysis

Reliability of a given construct

$\text{Reliability} = \dfrac{(\sum \text{factor loadings})^2}{(\sum \text{factor loadings})^2 + \sum \text{error variances}}$

Quantitative: $\dfrac{(0.776+0.785+0.684)^2}{(0.776+0.785+0.684)^2 + (0.398+0.384+0.533)} = 0.793$

Verbal: $\dfrac{(0.823+0.822+0.894)^2}{(0.823+0.822+0.894)^2 + (0.323+0.324+0.200)} = 0.884$
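The reliability calculation can be reproduced with a few lines. This is a minimal sketch with a hypothetical function name; the loadings are the completely standardized estimates of the two-factor model and the error variances are taken from the estimated error covariance matrix (0.398, 0.384, 0.533 for Quantitative and 0.323, 0.324, 0.200 for Verbal).

```python
def construct_reliability(loadings, error_variances):
    """(sum of loadings)^2 / [(sum of loadings)^2 + sum of error variances]."""
    squared_sum = sum(loadings) ** 2
    return squared_sum / (squared_sum + sum(error_variances))


quantitative = construct_reliability([0.776, 0.785, 0.684],
                                     [0.398, 0.384, 0.533])
verbal = construct_reliability([0.823, 0.822, 0.894],
                               [0.323, 0.324, 0.200])

print(f"Quantitative reliability = {quantitative:.3f}")
print(f"Verbal reliability       = {verbal:.3f}")
```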

(123)

Summary

Difference between principal component analysis and factor analysis

Exploratory factor analysis

Confirmatory factor analysis

Linear structural relations
