Chapter 3 Research Methodology
3.4 Partial Least Squares (PLS)
Partial least squares (PLS) is a classical soft-modeling statistical method. It was originally proposed by Wold (1966) for econometrics, where it integrated path analysis and causal modeling to address complex economic problems. PLS combines features from, and generalizes, principal component analysis (PCA) and multiple linear regression (Abdi, 2010). PLS regression gained its first major success in chemometrics. Currently, PLS is applied extensively in fields such as computer and information science, technological prediction, marketing, management, and human behavior (Berghman, Matthyssens, & Vandenbempt, 2012; Chiang, 2013; Y.-T. H. Chiu, Lee, & Chen, 2014; Land Jr et al., 2011; Martínez, De Andrés, & García, 2014; Moon, Park, Jung, & Choe, 2010; Venkatesh, Thong, & Xu, 2012). For example, Joe F Hair, Ringle, and Sarstedt (2011) compared structural equation modeling (SEM) with PLS and used PLS techniques to estimate causal models across many theoretical models and empirical data situations.
The purpose of PLS is to explain or predict a set of dependent variables from a set of independent variables. It does so by extracting from the variables a set of orthogonal factors, called latent variables, that have the best predictive power (Abdi, 2010). In addition, PLS can model latent constructs that are uncontaminated by measurement error (Joseph F Hair, 2009) under conditions of non-normality and small to medium sample sizes (Chiang, 2013). In other words, it mitigates the small-sample problem of linear model analysis. It therefore offers analytical advantages over techniques, such as regression, that assume error-free measurement (Chiang, 2013). In general, PLS has two possible applications (Chin, 1998): (1) theory confirmation and (2) theory development. Because of these advantages, researchers often consider PLS a good alternative to traditional covariance-based SEM techniques.
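The extraction of orthogonal latent factors described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a single dependent variable and a NIPALS-style PLS regression (an algorithmic choice not specified in this chapter); the function names `dot`, `center`, and `pls1_scores` and the toy data are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def center(col):
    """Subtract the mean from every entry of a column."""
    m = sum(col) / len(col)
    return [c - m for c in col]


def pls1_scores(x_cols, y, n_components):
    """Return latent score vectors for predictors stored column-wise.

    Assumption: one dependent variable y, NIPALS-style deflation.
    """
    X = [center(col) for col in x_cols]  # centred predictor columns
    y = center(y)                        # centred response
    n = len(y)
    scores = []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance with y.
        w = [dot(col, y) for col in X]
        norm = math.sqrt(dot(w, w))
        if norm == 0:
            break  # nothing left to explain
        w = [wi / norm for wi in w]
        # Latent score t = X w (weighted sum of the predictor columns).
        t = [sum(w[j] * X[j][i] for j in range(len(X))) for i in range(n)]
        tt = dot(t, t)
        # Loadings of X and y on t, then remove what t explains.
        p = [dot(col, t) / tt for col in X]
        q = dot(y, t) / tt
        X = [[X[j][i] - t[i] * p[j] for i in range(n)] for j in range(len(X))]
        y = [y[i] - q * t[i] for i in range(n)]
        scores.append(t)
    return scores


if __name__ == "__main__":
    # Hypothetical toy data: three predictors, six observations.
    x_cols = [[1, 2, 3, 4, 5, 6], [2, 1, 4, 3, 6, 5], [1, 3, 2, 5, 4, 7]]
    y = [1.0, 2.1, 2.9, 4.2, 4.8, 6.3]
    t1, t2 = pls1_scores(x_cols, y, 2)
    print("t1 . t2 =", dot(t1, t2))  # numerically close to zero
```

Because each predictor block is deflated by the part its score explains, successive score vectors are mutually orthogonal, which is the property the paragraph above relies on.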
To simplify the notation of the model, and in line with conventional descriptions of PLS, the latent and manifest variables are assumed to be standardized so that the location parameters can be discarded in the following equations, which follow Henseler, Ringle, and Sinkovics (2009).
Figure 3-4 Example of a PLS Path Model (outer model in formative mode, outer model in reflective mode, and inner model).
The inner model of relationships between latent variables can be written as:
\[ \xi = B\xi + \zeta \qquad (18) \]
where $\xi$ is the vector of latent variables, $B$ denotes the matrix of coefficients of their relationships, and $\zeta$ stands for the inner model residuals. The basic PLS design assumes a recursive inner model that is subject to predictor specification. Therefore, the inner model constitutes a causal chain system. Predictor specification reduces equation (18) to:
\[ \mathrm{E}(\xi \mid \xi) = B\xi \qquad (19) \]
PLS path modeling comprises two main outer modes: the reflective mode and the formative mode (see Figure 3-4). The reflective mode has causal relationships from the latent variable to the manifest variables in its block. Each manifest variable in a given measurement model is therefore assumed to be generated as a linear function of its latent variable and the residual $\varepsilon_x$:
\[ X_x = \Lambda_x \xi + \varepsilon_x \qquad (20) \]
where $\Lambda_x$ denotes the loading coefficients. The outer relationships are also subject to predictor specification, which implies that there are no correlations between the outer residuals and the latent variable of the same block and reduces equation (20) to:
\[ \mathrm{E}(X_x \mid \xi) = \Lambda_x \xi \qquad (21) \]
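As a hedged illustration of equations (18) and (20), consider a hypothetical model with three latent variables in which $\xi_1$ is exogenous, $\xi_2$ depends on $\xi_1$, and $\xi_3$ depends on both. The coefficient names $\beta_{21}$, $\beta_{31}$, $\beta_{32}$ and the three-indicator reflective block for $\xi_1$ are assumptions introduced for this example only. When the latent variables are ordered along the causal chain, a recursive inner model corresponds to a strictly lower-triangular $B$:

```latex
\[
\begin{pmatrix} \xi_1 \\ \xi_2 \\ \xi_3 \end{pmatrix}
=
\begin{pmatrix}
0 & 0 & 0 \\
\beta_{21} & 0 & 0 \\
\beta_{31} & \beta_{32} & 0
\end{pmatrix}
\begin{pmatrix} \xi_1 \\ \xi_2 \\ \xi_3 \end{pmatrix}
+
\begin{pmatrix} \zeta_1 \\ \zeta_2 \\ \zeta_3 \end{pmatrix}
\]
% Reflective block of \xi_1 with three manifest variables:
\[
x_{1k} = \lambda_{1k}\,\xi_1 + \varepsilon_{1k}, \qquad k = 1, 2, 3
\]
```

Written out row by row, the inner model gives $\xi_2 = \beta_{21}\xi_1 + \zeta_2$ and $\xi_3 = \beta_{31}\xi_1 + \beta_{32}\xi_2 + \zeta_3$, so no latent variable feeds back into one that precedes it.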
The formative mode has causal relationships from the manifest variables to the latent variable. For those blocks, the linear relationships are given as follows:
\[ \xi = \Pi_x X_x + \varepsilon_x \qquad (22) \]
where $\Pi_x$ denotes the weight coefficients of the manifest variables. In the formative mode, predictor specification is also in effect, reducing equation (22) to:
\[ \mathrm{E}(\xi \mid X_x) = \Pi_x X_x \qquad (23) \]
The assessment of the model follows Urbach and Ahlemann (2010), who suggest criteria for the measurement of reflective and formative indicators. First, the reflective model, which is termed the outer model and functions like the measurement model in covariance-based SEM, is often employed to establish the construct validity and reliability of the measurement items. Its four stages are shown below:
1. Unidimensionality is examined with exploratory factor analysis (EFA), using eigenvalues to check whether each block of indicators reflects only one construct in multi-construct empirical research.
2. Reliability assesses the internal consistency of a construct; two common indices are composite reliability and Cronbach's alpha.
3. Convergent validity measures the correlation among a construct's multiple indicators. It is acceptable if the following criteria are met (Joseph F Hair, 2009): (i) the statistical significance of each factor loading is confirmed by a p-value below 0.05, (ii) construct reliability exceeds 0.7, and (iii) average variance extracted (AVE) is greater than 0.5 (Fornell & Larcker, 1981).
4. Discriminant validity concerns the degree to which the various constructs are distinct from one another.
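The reliability and validity checks in the list above can be sketched as follows. This is a minimal illustration, assuming standardized outer loadings and the usual textbook formulas for composite reliability, AVE, and Cronbach's alpha. The Fornell-Larcker comparison used for discriminant validity is one common operationalization assumed here, not one prescribed in this chapter, and all function names and input values are hypothetical.

```python
import math


def composite_reliability(loadings):
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    s = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)  # error variance of each indicator
    return s ** 2 / (s ** 2 + error)


def average_variance_extracted(loadings):
    """Mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of item-score columns."""
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    k = len(items)
    n = len(items[0])
    totals = [sum(col[i] for col in items) for i in range(n)]  # per-respondent sum
    return k / (k - 1) * (1 - sum(var(col) for col in items) / var(totals))


def fornell_larcker_ok(ave_by_construct, correlations):
    """True if sqrt(AVE) of every construct exceeds its correlations with others.

    `correlations` maps (construct_a, construct_b) pairs to a correlation.
    """
    for (a, b), r in correlations.items():
        limit = min(math.sqrt(ave_by_construct[a]), math.sqrt(ave_by_construct[b]))
        if abs(r) >= limit:
            return False
    return True


if __name__ == "__main__":
    loadings = [0.80, 0.75, 0.85]  # hypothetical standardized loadings
    cr = composite_reliability(loadings)
    ave = average_variance_extracted(loadings)
    print("CR > 0.7:", cr > 0.7, "| AVE > 0.5:", ave > 0.5)
```

The thresholds compared in the usage example (0.7 for construct reliability, 0.5 for AVE) are the ones quoted in the list above.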
Second, the formative model, which is termed the inner model, is also often used to measure and check the model of research outcomes. To assess the validity of the structural model effectively, Urbach and Ahlemann (2010) suggest the four criteria shown below:
1. Coefficient of determination ($R^2$), which measures the explained variance of a latent variable relative to its total variance. Values of approximately 0.670 are viewed as substantial, values around 0.333 as moderate, and values around 0.190 as weak (Chin, 1998; Ringle, 2004).
2. Path coefficients between the latent variables should be analyzed in terms of their algebraic sign, magnitude, and significance (Huber, Herrmann, Meyer, Vogel, &
Vollhardt, 2007).
3. Effect size ($f^2$), which assesses whether an independent latent variable has a substantial impact on a dependent latent variable. Values of 0.020, 0.150, and 0.350 indicate that the predictor variable has a small, medium, or large effect, respectively, in the structural model (Chin, 1998; Cohen, 1988; Ringle, 2004).
4. Predictive relevance ($Q^2$), where the $Q^2$ statistic measures the predictive relevance of a block of manifest variables. The proposed threshold is $Q^2 > 0$. The relative impact of the predictive relevance can be assessed by means of the $q^2$ measure (Fornell & Cha, 1994; Geisser, 1975; Stone, 1974).
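The structural-model criteria above can be sketched as small helper functions. This is a minimal illustration under stated assumptions: the approximate benchmark values from the list are treated as hard cut-offs, the effect size uses the standard formula that compares $R^2$ with and without the predictor, $Q^2$ is computed as one minus the ratio of squared prediction errors to squared observations from a blindfolding run, and $q^2$ reuses the effect-size formula with $Q^2$ in place of $R^2$. All function names and numeric inputs are hypothetical.

```python
def classify_r2(r2):
    """Label R^2 with the 0.67 / 0.33 / 0.19 benchmarks (used here as cut-offs)."""
    if r2 >= 0.67:
        return "substantial"
    if r2 >= 0.33:
        return "moderate"
    if r2 >= 0.19:
        return "weak"
    return "below weak"


def effect_size(included, excluded):
    """f^2 (or q^2): change in R^2 (or Q^2) when one predictor is dropped."""
    return (included - excluded) / (1 - included)


def classify_effect(f2):
    """Label an effect size with the 0.02 / 0.15 / 0.35 benchmarks."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"


def q_squared(sse, sso):
    """Q^2 = 1 - SSE / SSO, with sums taken over the omitted data points."""
    return 1 - sse / sso


if __name__ == "__main__":
    # Hypothetical results for one endogenous latent variable.
    r2_full, r2_reduced = 0.60, 0.50
    f2 = effect_size(r2_full, r2_reduced)
    print(classify_r2(r2_full), classify_effect(f2))
    print("Q^2 above threshold:", q_squared(sse=40.0, sso=100.0) > 0)
```

Passing two $Q^2$ values instead of two $R^2$ values to `effect_size` yields the $q^2$ measure mentioned in the fourth criterion.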
In summary, the reasons for using PLS as the analytic technique for the structural model include the following (Urbach & Ahlemann, 2010): PLS makes fewer demands on sample size than other approaches; PLS does not require normally distributed input data; PLS can be applied to complex structural equation models with a large number of dimensions; PLS can handle both reflective and formative constructs; PLS is better suited to theory development than to theory testing; and PLS is especially useful for prediction.