Applied Multivariate Quantitative Methods-Canonical Correlation Analysis

Academic year: 2021

(1)

Applied Multivariate

Quantitative Methods

Canonical Correlation Analysis

By Jen-pei Liu, PhD

Division of Biometry, Department of Agronomy,

National Taiwan University

and

(2)

Canonical Correlation Analysis

Introduction

Procedures

Examples

(3)

Introduction

Examples

The health department is interested in

determining a relationship between housing

quality – measured by a number of

variables such as type of housing, heating,

and cooling conditions, availability of

running water, and kitchen and toilet

facilities, and incidences of minor and

(4)

Introduction

Examples

A medical researcher is interested in

determining whether individuals’ lifestyles and

eating habits have an effect on their health

measured by a number of health-related

variables such as hypertension, weight,

anxiety, and tension levels

(5)

Introduction

Examples

The marketing manager is interested in

determining whether there is a relationship between

types of products purchased and consumers’

lifestyles and personalities

Any relationship between four environment

(6)

Introduction

Objectives

To determine whether there is a

relationship between two sets of variables

To identify relationships between two sets

of variables

To determine if the predictor set of

variables affects the dependent set of

variables

(7)

Procedures

Example: Hotelling (1936)

Reading speed (X1)

Reading power (X2)

Arithmetic speed (Y1)

Arithmetic power (Y2)

(8)

Procedures

A linear combination of X1 and X2

U = a1X1+a2X2

A linear combination of Y1 and Y2

V = b1Y1 + b2Y2

Coefficients in U and V are chosen as

(9)

Procedures

The procedure is very similar to

principal component analysis (PCA)

PCA is to maximize the variance of

selected components

Canonical analysis is to maximize the

(10)

Procedures

Example (Continued)

U = -2.78X1 + 2.27X2

V = -2.44Y1 + 1.00Y2

U and V have a correlation of 0.67

U measures the difference between reading power

and speed

V measures the difference between arithmetic power

and speed

A large difference between X1 and X2 tends to go

with a corresponding difference between Y1 and Y2

This is the aspect of reading and arithmetic that

(11)

Procedures

X1, X2, …, Xp measured on n objects

Y1, Y2, …, Yq measured on the same n objects

There can be up to the minimum of p a

(12)

Procedures

$U_1 = a_{11}X_1 + a_{12}X_2 + \dots + a_{1p}X_p$

$U_2 = a_{21}X_1 + a_{22}X_2 + \dots + a_{2p}X_p$

$\vdots$

$U_r = a_{r1}X_1 + a_{r2}X_2 + \dots + a_{rp}X_p$

(13)

Procedures

$V_1 = b_{11}Y_1 + b_{12}Y_2 + \dots + b_{1q}Y_q$

$V_2 = b_{21}Y_1 + b_{22}Y_2 + \dots + b_{2q}Y_q$

$\vdots$

$V_r = b_{r1}Y_1 + b_{r2}Y_2 + \dots + b_{rq}Y_q$

(14)

Procedures

Maximum correlation between U1 and

V1

Maximum correlation between U2 and

V2 subject to these variables being

uncorrelated to (U1, V1)

Maximum correlation between U3 and

V3 subject to these variables being

uncorrelated to (U1, V1, U2, V2)
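
The sequential constraints above can be checked numerically. The sketch below does not follow the eigenvalue procedure of these slides; it uses an equivalent whitening-plus-SVD route, and the data, seed, and variable names are assumptions made up for illustration. It shows that each canonical variable is uncorrelated with every other one except its own partner.

```python
import numpy as np

rng = np.random.default_rng(1)            # synthetic data, illustration only
n, p = 300, 3
shared = rng.normal(size=(n, 2))          # hypothetical common factors
X = shared @ rng.normal(size=(2, p)) + rng.normal(size=(n, p))
Y = shared @ rng.normal(size=(2, 3)) + rng.normal(size=(n, 3))

# Correlation matrix of all variables, partitioned into A (X), B (Y), C (X with Y).
R = np.corrcoef(np.hstack([X, Y]), rowvar=False)
A, B, C = R[:p, :p], R[p:, p:], R[:p, p:]

# Whiten each set with a Cholesky factor, then take the SVD of the
# whitened cross-correlation matrix K = La^-1 C Lb^-T.
La = np.linalg.cholesky(A)                # A = La La'
Lb = np.linalg.cholesky(B)                # B = Lb Lb'
K = np.linalg.solve(La, C) @ np.linalg.inv(Lb).T
P, s, Qt = np.linalg.svd(K)               # s holds the canonical correlations
a = np.linalg.solve(La.T, P)              # coefficients of U1, U2, U3 (columns)
b = np.linalg.solve(Lb.T, Qt.T)           # coefficients of V1, V2, V3 (columns)

# Canonical variables computed from the standardized data.
Zx = (X - X.mean(axis=0)) / X.std(axis=0)
Zy = (Y - Y.mean(axis=0)) / Y.std(axis=0)
U, V = Zx @ a, Zy @ b

# Correlations among (U1, U2, U3, V1, V2, V3): identity blocks on the
# diagonal, canonical correlations on the diagonal of the off-diagonal blocks.
corr_UV = np.corrcoef(np.hstack([U, V]), rowvar=False)
print(np.round(corr_UV, 3))
```

In the printed matrix, the only nonzero off-diagonal entries are the correlations of U1 with V1, U2 with V2, and U3 with V3, in decreasing order.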

(15)

Procedures

Each of (U1,V1), (U2,V2), …, (Ur,Vr) represents

an independent dimension in the relationship

between two sets of variables

(X1,X2,…,Xp) and (Y1,Y2,…,Yq)

The first pair (U1,V1): highest correlatio

(16)

Procedures

Xi and Yj are standardized to have mean 0 and variance 1

Compute the (p+q) × (p+q) correlation matrix for X1,X2,…,Xp and Y1,Y2,…,Yq

A is the p × p correlation matrix of X1, X2, …, Xp

B is the q × q correlation matrix of Y1, Y2, …, Yq

C is pxq correlation matrix between X1,X2,…,

(17)
(18)

Procedures

Compute $B^{-1}C'A^{-1}C$

Find the eigenvalues and eigenvectors

$(B^{-1}C'A^{-1}C - \lambda I)b = 0$

$\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_r$; the eigenvalues are the squares of the correlations between the canonical variables (U1,V1), (U2,V2), …, (Ur,Vr)

(19)

Procedures

$b_1, b_2, \dots, b_r$ provide the coefficients for $V_j$ based on Y1,…,Yq

$a_i = A^{-1}Cb_i$ gives the coefficients for Ui

$U_i = a_i'X = a_{i1}X_1 + a_{i2}X_2 + \dots + a_{ip}X_p$
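
The computational steps on the last few slides (correlation matrix, partition into A, B, and C, eigen-decomposition of B⁻¹C’A⁻¹C, then a = A⁻¹Cb) can be sketched with NumPy as follows. This is a minimal sketch, not a reference implementation: the function name `canonical_analysis` and the synthetic data are assumptions introduced for illustration.

```python
import numpy as np

def canonical_analysis(X, Y):
    """Eigenvalues and coefficient vectors for X (n x p) and Y (n x q)."""
    p, q = X.shape[1], Y.shape[1]
    # (p+q) x (p+q) correlation matrix, partitioned into A, B, and C.
    R = np.corrcoef(np.hstack([X, Y]), rowvar=False)
    A, B, C = R[:p, :p], R[p:, p:], R[:p, p:]
    # B^-1 C' A^-1 C, computed with linear solves instead of explicit inverses.
    M = np.linalg.solve(B, C.T) @ np.linalg.solve(A, C)
    eigvals, eigvecs = np.linalg.eig(M)
    # Keep the r = min(p, q) largest eigenvalues, in decreasing order.
    order = np.argsort(eigvals.real)[::-1][:min(p, q)]
    lam = eigvals.real[order]
    b = eigvecs.real[:, order]            # columns are b_1, ..., b_r
    a = np.linalg.solve(A, C @ b)         # columns are a_i = A^-1 C b_i
    return lam, a, b

# Synthetic data (an assumption for illustration): one shared latent factor.
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=n)
X = np.column_stack([latent + rng.normal(size=n), rng.normal(size=n)])
Y = np.column_stack([latent + rng.normal(size=n),
                     rng.normal(size=n), rng.normal(size=n)])

lam, a, b = canonical_analysis(X, Y)

# First pair of canonical variables from the standardized data.
Zx = (X - X.mean(axis=0)) / X.std(axis=0)
Zy = (Y - Y.mean(axis=0)) / Y.std(axis=0)
U1, V1 = Zx @ a[:, 0], Zy @ b[:, 0]
r1 = np.corrcoef(U1, V1)[0, 1]

print("square roots of eigenvalues:", np.sqrt(lam))
print("correlation of U1 and V1:   ", r1)
```

The correlation between U1 and V1 printed at the end should equal the square root of the largest eigenvalue. The coefficient vectors are determined only up to scale, so U and V are not scaled to unit variance here.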

(20)

Procedures

Test of significance

$X^2 = -\{n - (p+q+3)/2\} \sum_{i=1}^{r} \ln(1 - \lambda_i)$

Reject the null hypothesis of no correlation at the $\alpha$ significance level if

$X^2 > \chi^2_{\alpha,\,pq}$

(21)

Procedures

Test of significance for individual contribution

$X^2_i = -\{n - (p+q+3)/2\} \ln(1 - \lambda_i)$

Reject the null hypothesis of no correlation at the $\alpha$ significance level if

$X^2_i > \chi^2_{\alpha,\,p+q-2i+1}$

(22)

Procedures

Test of significance from the (i+1)th to the rth contribution

Sum from the (i+1)th $X^2_{i+1}$ to the rth $X^2_r$, where

$X^2_i = -\{n - (p+q+3)/2\} \ln(1 - \lambda_i)$

Reject the null hypothesis of no correlation at the $\alpha$ significance level if

$X^2_{i+1} + \dots + X^2_r > \chi^2_{\alpha,\,(p-i)(q-i)}$
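
The three chi-square statistics on the preceding slides share the same per-eigenvalue term, so they can be sketched with a few small helpers. The function names and the input values below are hypothetical, chosen only for illustration; each function returns the statistic together with its degrees of freedom, and the critical value still has to be looked up in a chi-square table.

```python
import math

def bartlett_terms(eigenvalues, n, p, q):
    """X2_i = -{n - (p+q+3)/2} ln(1 - lambda_i) for each eigenvalue."""
    scale = n - (p + q + 3) / 2
    return [-scale * math.log(1 - lam) for lam in eigenvalues]

def overall_test(eigenvalues, n, p, q):
    """Statistic and df for all r canonical correlations together."""
    return sum(bartlett_terms(eigenvalues, n, p, q)), p * q

def individual_test(eigenvalues, n, p, q, i):
    """Statistic and df for the i-th canonical correlation (i counts from 1)."""
    return bartlett_terms(eigenvalues, n, p, q)[i - 1], p + q - 2 * i + 1

def residual_test(eigenvalues, n, p, q, i):
    """Statistic and df for the (i+1)th to the r-th correlations together."""
    return sum(bartlett_terms(eigenvalues, n, p, q)[i:]), (p - i) * (q - i)

# Hypothetical inputs, for illustration only.
lams = [0.50, 0.20, 0.05]
n, p, q = 40, 3, 4

print(overall_test(lams, n, p, q))        # all three correlations
print(individual_test(lams, n, p, q, 1))  # first correlation alone
print(residual_test(lams, n, p, q, 1))    # second and third together
```

With i = 0 the residual test reduces to the overall test, which is a quick consistency check on the degrees of freedom.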

(23)

Examples

Environmental and genetic correlations

16 colonies of the butterfly Euphydryas editha in Oregon and California

Four environmental variables (Xs)

Altitude (X1), annual precipitation (X2), annual maximum temperature (X3), and annual minimum temperature (X4)

(24)

Examples

Environmental and genetic correlations

Issue: the sum of 6 Pgi gene frequencies is 100%

- linear dependence among Pgi gene frequencies

Solution: Remove one of the Pgi gene frequencies (% of 1.30-mobility genes) and combine the lower-frequency genes (0.40- and 0.60-mobility genes)

Y1 = % of 0.40- and 0.60-mobility genes, Y2 = % of 0.80-mobility genes, Y3 = % of 1.00-mobility genes, Y4 = % of 1.16-mobility genes
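
The linear-dependence issue can be illustrated with a small NumPy sketch. The data here are synthetic percentages drawn from a Dirichlet distribution, not the butterfly data, and the sketch only drops one category (it does not also combine the two low-frequency categories as the slide does). When all six percentages are kept, the correlation matrix has a zero eigenvalue and cannot be inverted; after one category is dropped, it becomes invertible.

```python
import numpy as np

rng = np.random.default_rng(2)                    # synthetic data, illustration only
# 50 hypothetical colonies, 6 frequencies per colony that sum to 100%.
freq = 100 * rng.dirichlet(np.ones(6), size=50)

full = np.corrcoef(freq, rowvar=False)            # all six categories kept
reduced = np.corrcoef(freq[:, :5], rowvar=False)  # last category dropped

# eigvalsh returns eigenvalues in ascending order; take the smallest.
smallest_full = np.linalg.eigvalsh(full)[0]
smallest_reduced = np.linalg.eigvalsh(reduced)[0]

print("smallest eigenvalue, 6 categories:", smallest_full)
print("smallest eigenvalue, 5 categories:", smallest_reduced)
```

The first value is zero up to floating-point error, so the inverse needed in the canonical analysis would not exist; the second is clearly positive.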

(25)

Examples

Correlation Matrix of X1, X2, X3, and X4

         X1       X2       X3       X4
X1    1.000    0.568   -0.828   -0.936
X2             1.000   -0.479   -0.705

(26)

Examples

Correlation Matrix of Y1, Y2, Y3, and Y4

         Y1       Y2       Y3       Y4
Y1    1.000    0.638   -0.561   -0.584
Y2             1.000   -0.824   -0.127
Y3                      1.000   -0.264

(27)

Examples

Correlation Matrix of Y1, Y2, Y3, and Y4 and X1, X2, X3, and X4

         Y1       Y2       Y3       Y4
X1   -0.201   -0.573    0.727   -0.458
X2   -0.468   -0.550    0.699   -0.138

(28)

Examples

The eigenvalues are 0.7425, 0.2049, 0.1425 and 0.0069

The corresponding canonical correlations are the square roots of the eigenvalues: 0.8617, 0.4527, 0.3775, and 0.0833

$X^2 = 18.34 < \chi^2_{0.05,16} = 26.2962$; fail to reject the null hypothesis of zero correlation
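
The arithmetic on this slide can be reproduced from the four eigenvalues. The sketch below assumes n = 16 (the number of colonies) and p = q = 4, and takes the critical value 26.2962 as quoted on the slide rather than computing it.

```python
import math

eigenvalues = [0.7425, 0.2049, 0.1425, 0.0069]   # from the slide
n, p, q = 16, 4, 4                               # 16 colonies, 4 X and 4 Y variables
critical_value = 26.2962                         # chi-square, alpha = 0.05, 16 df (quoted)

# Canonical correlations are the square roots of the eigenvalues.
canonical_correlations = [math.sqrt(lam) for lam in eigenvalues]

# Overall test: X2 = -{n - (p+q+3)/2} * sum of ln(1 - lambda_i), with pq df.
x2 = -(n - (p + q + 3) / 2) * sum(math.log(1 - lam) for lam in eigenvalues)

print([round(c, 4) for c in canonical_correlations])
print(round(x2, 2), "reject" if x2 > critical_value else "fail to reject")
```

The statistic comes out at about 18.34, matching the slide, and the square roots agree with the quoted canonical correlations up to rounding of the eigenvalues.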

(29)

Examples

U1 = -0.09X1-0.29X2+0.48X3+0.29X4

V1 = 0.54Y1+0.42Y2-0.10Y3+0.82Y4

U2 = 2.31X1-0.73X2+0.45X3+1.27X4

V2 = -1.66Y1-2.20Y2+0.45Y3+2.77Y4

U3 = 3.02X1+1.33X2+0.57X3+3.58X4

V3 = -3.56Y1-1.35Y2-3.86Y3-2.86Y4

(30)

Examples

U1 is a contrast between maximum and minimum

temperature on one side and precipitation on the other

V1 mainly reflects the 0.60-, 0.80-, and 1.16-mobility

genes

Correlations between environmental

variables

(31)

Examples

Correlations between environmental variables

Altitude               -0.92
Precipitation          -0.77
Maximum Temperature     0.90
Minimum Temperature     0.92

(32)

Examples

Correlations between genetic variables

Mobility 0.40/0.60      0.38
Mobility 0.80           0.74
Mobility 1.00          -0.96
Mobility 1.16           0.49

V1 indicates a lack of mobility-1.00 genes

(33)
(34)

Examples

Soil and Vegetation

Prehistoric Maya sites in Belize in Central

America

Four soil variables

Four vegetation variables

(35)

Examples

Soil and Vegetation

Soil variables

X1: % of soil with constant lime enrichment

X2: % of meadow soil with calcium

groundwater

X3: % of soil with coral bedrock under

(36)

Examples

Soil and Vegetation

Vegetation variables

Y1: % of deciduous seasonal broadleaf forest

Y2: % of high and low marsh forest,

herbaceous marsh, and swamp

Y3: % of cohune palm forest

Y4: % mixed forest

(37)

Examples

The eigenvalues are 0.580, 0.320, 0.059 and 0.0149

The canonical correlations are 0.762, 0.566, 0.243 and 0.122

$X^2 = 193.63 > \chi^2_{0.05,16} = 26.2962$,

(38)

Examples

U1 = 1.34X1+0.34X2+1.33X3+0.59X4

V1 = 1.71Y1+1.07Y2+0.22Y3+0.52Y4

U2 = 0.41X1+0.90X2+0.23X3+0.89X4

V2 = 0.64Y1+1.47Y2+0.27Y3+0.28Y4

U3 = -0.44X1-0.51X2+0.18X3+0.93X4

V3 = -0.18Y1-0.24Y2+0.93Y3+0.22Y4

U4 = -0.44X1-0.02X2+0.72X3+0.15X4

V4 = 0.12Y1+0.01Y2+0.26Y3-0.93Y4

(39)

Examples

Correlation between the canonical

variables and soil variables

         U1       U2       U3       U4
X1     0.55    -0.23     0.00    -0.80
X2    -0.02     0.73    -0.68    -0.04

(40)

Examples

Correlation between the canonical

variables and Vegetation variables

         V1       V2       V3       V4
Y1     0.77    -0.58    -0.08     0.24
Y2    -0.36     0.91    -0.19    -0.03
Y3     0.03     0.13     0.95     0.28

(41)

Examples

The most important relationships between soil and

vegetation variables are described by the first two

pairs of canonical variables

The presence of soil types 1 and 3 and the absence of soil

type 4 are associated with the presence of vegetation type 1

The presence of soil types 2 and 4 is associated with the

presence of vegetation type 2 and absence of vegetation

type 1

151 observations are not truly independent since

adjacent squares tend to have similar soil and

vegetation characteristics – a problem for spatial

(42)
(43)

Summary

Objectives of canonical analysis

Statistical methods

Examples of environmental and genetic

variables in 16 colonies of a butterfly

Examples of soil types and vegetation
