
5. Meta-Analysis

5.3 Meta-analytic approaches

Several significant figures need to be mentioned when discussing meta-analysis. Glass and colleagues (1976), who focused on methods for aggregating mean differences, dealt with (quasi-)experimental designs comparing psychotherapies; Hunter and Schmidt (1990), who focused on correlations, were concerned with the problem of predictive validity in personnel selection; Rosenthal (1978) presented methods for combining probabilities as study results and was the first to consider the so-called file-drawer problem in meta-analysis in depth (Rosenthal, 1979). Another major effort to specify the (statistical) methods of meta-analysis, arguably the most detailed and statistically elaborate in the behavioral sciences to date, was presented by Hedges and Olkin (for a comprehensive overview, see Hedges & Olkin, 1985). Here the main focus was not a substantive problem but a precise statistical formulation of the models in meta-analysis and the presentation of corresponding proofs for the situations arising in meta-analyses (Schulze, 2004).

Because this research mainly uses Hedges and Olkin's method, the other approaches to meta-analysis are introduced only briefly.

5.3.1 Glass, McGraw, and Smith

In 1976, Gene Glass (Glass, 1976, 1977) proposed a method to integrate and summarize the findings from a large body of research, which he called "meta-analysis". His approach was developed at a time when studies of psychotherapy were producing a mixture of positive, null, and negative results, and narrative reviews of these studies had failed to resolve the discrepancies. His desire to interpret these results led him to standardize and average treatment-control comparisons across 375 psychotherapy studies. Glass was criticized for combining findings from distinctly different therapies, such as cognitive-behavioral and psychodynamic therapy.

Glass defended his work by explaining that his overall interest was in the effectiveness of psychotherapy as a whole, across its different types.

In 1981, Glass and his colleagues published Meta-Analysis in Social Research.

The book begins by stating the problem of research review and integration and then tells the reader what meta-analysis is: the approach to research integration referred to as "meta-analysis" is nothing more than the attitude of data analysis applied to quantitative summaries of individual experiments. By recording the properties of studies and their findings in quantitative terms, the meta-analysis of research invites one who would integrate numerous and diverse findings to apply the full power of statistical methods to the task. Thus it is not a technique; rather, it is a perspective that uses many techniques of measurement and statistical analysis.

The primary property of Glass's meta-analysis is its strong emphasis on effect sizes rather than significance levels. Glass believed the purpose of research integration is more descriptive than inferential, and he felt that the most important descriptive statistics are those that indicate most clearly the magnitude of effects. His meta-analyses typically employed estimates of the Pearson r or of d (Glass's Δ, defined below), where X̄_E and X̄_C are the means of the experimental and control groups, respectively, and SD is the standard deviation of the control group. Glass (1977) presented a number of useful formulas for converting the statistics reported in studies to estimates of r or d. The initial product of a Glassian meta-analysis is the mean and standard deviation of effect sizes across studies (e.g., see Smith & Glass, 1977).

$$\Delta = \frac{\bar{X}_E - \bar{X}_C}{SD}$$

where $\bar{X}_E$ is the mean of the experimental group, $\bar{X}_C$ is the mean of the control group, and $SD$ is the standard deviation of the control group.
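As a minimal illustration of this computation (the function name and the numbers are hypothetical, not taken from Glass's data), Glass's Δ can be obtained directly from the group summary statistics:

```python
# Glass's delta: standardized mean difference using the control group's SD.
def glass_delta(mean_exp, mean_ctrl, sd_ctrl):
    # (experimental mean - control mean) divided by the control group's SD.
    return (mean_exp - mean_ctrl) / sd_ctrl

# Hypothetical example: experimental mean 105, control mean 100, control SD 15.
print(glass_delta(105.0, 100.0, 15.0))  # about 0.33, i.e. a third of a control SD
```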

They pointed out that the characteristics of meta-analysis are: meta-analysis is quantitative; meta-analysis does not prejudge research findings in terms of research quality; meta-analysis seeks general conclusions.

5.3.2 Hedges and Olkin

Hedges and Olkin (1985) stated that the purpose of their book was to address the statistical issues involved in integrating independent studies, so their treatment, as one would expect, focuses on statistics.

They argued that conventional analyses are problematic because they lack two important features of the best-case analysis: first, the consistency of effect sizes across studies cannot be tested directly, and second, the amount of variation among observed effect sizes that is systematic is unknown. They also quoted Presby's (1978) criticism that Glass's analysis was flawed because it used overly broad categories, and they pointed out that conventional analyses have statistical problems.

In conducting a meta-analysis, they are mainly interested in estimating the mean effect size and testing the homogeneity of the effect sizes. Meta-analysis uses numerical indexes called effect sizes to combine data from multiple studies. Examples of effect sizes include the correlation coefficient, the odds ratio, the response ratio, and the standardized mean difference. These effect size estimates are then combined across studies to produce a summary of all of the findings (Hedges, Gurevitch & Curtis, 1999).

In order for the effect size to have a consistent interpretation across studies, it must be expressed in a common metric. Because different studies often use different measures, the raw mean difference between groups is not directly comparable across studies. The standardized mean difference avoids this problem by dividing the mean difference by the within-group standard deviation. This removes differences due to the scaling of the dependent variables and promotes comparability of effect sizes across studies. The effect size quantifies the size of the difference between two groups and can therefore be regarded as a measure of the substantive significance of the difference (Hedges et al., 1999).

Hedges and Olkin's (1985) equations are shown in Table 2.4. After the effect size of each article is calculated, the data are coded and the statistical analysis is performed.

The calculation shown in Table 2.4 is the most accurate. If a study does not report means and standard deviations but instead provides an F or t statistic and the sample sizes of the experimental and control groups, the researcher calculates the ES from t (or F) and the sample sizes.
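A minimal sketch of these conversions for a two-group design, assuming the standard relations d = t·√(1/n₁ + 1/n₂) and t = √F for a one-degree-of-freedom F (the function names and example values are illustrative only):

```python
import math

def d_from_t(t, n1, n2):
    # Standardized mean difference from an independent-samples t statistic.
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)

def d_from_f(f, n1, n2):
    # For a two-group design F = t**2, so take the square root first; the sign
    # of the difference must be recovered from the study's description.
    return d_from_t(math.sqrt(f), n1, n2)

# Hypothetical example: t = 2.10 with 25 participants per group.
print(d_from_t(2.10, 25, 25))  # about 0.59
```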

In order to correct for small-sample bias, the following equation should be used:

$$d_i = \left(1 - \frac{3}{4N - 9}\right) g_i, \qquad N = n_1 + n_2$$
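A minimal sketch of this correction, assuming g has already been computed (all names and values are hypothetical):

```python
def correct_small_sample_bias(g, n1, n2):
    # d_i = (1 - 3 / (4N - 9)) * g_i, with N = n1 + n2 (total sample size).
    n_total = n1 + n2
    return (1.0 - 3.0 / (4.0 * n_total - 9.0)) * g

# Hypothetical example: g = 0.60 from two groups of 20 participants each.
print(correct_small_sample_bias(0.60, 20, 20))  # about 0.588
```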

The corrected effect sizes are then aggregated to form an overall weighted mean estimate of the treatment effect (d+), with each effect size weighted by the inverse of its sampling variance σ²(d_i); the sampling variance of d+, σ²(d+), is defined as the inverse of the sum of these weights.

The significance of the mean ES was judged by its 95% confidence interval (95% CI). A significantly positive (+) mean ES indicated that the results favored the experimental group; a significantly negative (–) mean ES indicated that the results favored the control group. The equation is shown below.

$$d_+ - C_{\alpha/2}\,\sigma(d_+) \;\le\; \delta \;\le\; d_+ + C_{\alpha/2}\,\sigma(d_+)$$
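The sketch below pulls these steps together: inverse-variance weights, the weighted mean d+, its variance, and the 95% confidence interval. The per-study variance formula used here, var(d) ≈ (n₁ + n₂)/(n₁n₂) + d²/(2(n₁ + n₂)), is the usual large-sample approximation and is stated as an assumption; all data are hypothetical.

```python
import math

def d_variance(d, n1, n2):
    # Large-sample approximation to the sampling variance of d.
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2))

def weighted_mean_d(ds, variances):
    # d+ = sum(w_i * d_i) / sum(w_i) with w_i = 1 / var_i;
    # the variance of d+ is the inverse of the sum of the weights.
    weights = [1.0 / v for v in variances]
    d_plus = sum(w * d for w, d in zip(weights, ds)) / sum(weights)
    return d_plus, 1.0 / sum(weights)

# Hypothetical example with three studies.
ds = [0.45, 0.30, 0.60]
ns = [(20, 22), (35, 35), (15, 18)]
variances = [d_variance(d, n1, n2) for d, (n1, n2) in zip(ds, ns)]
d_plus, var_plus = weighted_mean_d(ds, variances)
half_width = 1.96 * math.sqrt(var_plus)   # C_{alpha/2} = 1.96 for a 95% CI
print(d_plus - half_width, d_plus, d_plus + half_width)
```

If the resulting interval excludes zero, the mean ES is judged significant in the direction of its sign.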

To determine whether the findings in each dataset shared a common ES, the set of ESs was tested for homogeneity with the homogeneity statistic (QT). When all findings share the same population ES, QT has an approximate χ² distribution with k – 1 degrees of freedom, where k is the number of ESs. If the obtained QT was larger than the critical value, the findings were judged to be significantly heterogeneous, meaning that there was more variability in the ESs than chance fluctuation would allow.

The homogeneity statistic is calculated to examine whether all studies share a common treatment effect; the homogeneity test determines whether at least one of the effect sizes in a series of comparisons differs from the rest (Wang et al., 2001). The test examines the null hypothesis that all studies estimate the same effect size. If homogeneity is rejected, the distribution of effect sizes is taken to be heterogeneous, that is, the effect of at least one treatment differs from the rest. The equation is:

$$Q = \sum_{i=1}^{k} \frac{(d_i - d_+)^2}{\sigma^2(d_i)}$$
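A minimal, self-contained sketch of this homogeneity test; the effect sizes and variances are hypothetical, and scipy is used only to obtain the χ² p-value:

```python
from scipy.stats import chi2

def homogeneity_q(ds, variances, d_plus):
    # Q_T = sum over studies of (d_i - d+)^2 / var_i.
    return sum((d - d_plus) ** 2 / v for d, v in zip(ds, variances))

# Hypothetical effect sizes, their sampling variances, and the weighted mean d+.
ds = [0.45, 0.30, 0.60]
variances = [0.096, 0.058, 0.124]
d_plus = sum(d / v for d, v in zip(ds, variances)) / sum(1.0 / v for v in variances)

q_total = homogeneity_q(ds, variances, d_plus)
df = len(ds) - 1                    # k - 1 degrees of freedom
p_value = chi2.sf(q_total, df)      # heterogeneous if Q_T exceeds the critical value
print(q_total, df, p_value)
```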

If the homogeneity test is rejected and the moderators are categorical variables, Hedges and Olkin (1985) suggest the following equations to examine which moderator may account for the heterogeneity. The equations are:

Table 2.5 presents these equations. The statistic has an approximate χ² distribution with k – 1 degrees of freedom, where k is the number of ESs. If the obtained QT is larger than the critical value, the findings are determined to be significantly heterogeneous, meaning that there is more variability in the ESs than chance fluctuation would allow (Hedges & Olkin, 1985).

Next, a series of subgroup moderator-variable analyses will be conducted. Each coded study feature with sufficient variability will be tested with two homogeneity statistics: between-class homogeneity (Qbetween, or QB) and within-class homogeneity (Qwithin, or QW). QB tests the homogeneity of ESs across classes; it has an approximate χ² distribution with k – 1 degrees of freedom, where k is the number of classes. If QB is greater than the critical value, it indicates a significant difference among the classes of ESs. When a moderator has more than two classes, Scheffé's post hoc comparisons will be performed to control Type I error.
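A sketch of this partition under the usual fixed-effects formulas, in which QB is the total Q minus the pooled within-class Q; the class labels and numbers are hypothetical, and scipy is an added dependency used only for the p-value:

```python
from collections import defaultdict
from scipy.stats import chi2

def weighted_mean(ds, variances):
    # Inverse-variance weighted mean effect size.
    weights = [1.0 / v for v in variances]
    return sum(w * d for w, d in zip(weights, ds)) / sum(weights)

def q_statistic(ds, variances, center):
    return sum((d - center) ** 2 / v for d, v in zip(ds, variances))

# Hypothetical coded studies: (moderator class, d_i, var_i).
studies = [("lecture", 0.45, 0.096), ("lecture", 0.30, 0.058),
           ("game", 0.60, 0.124), ("game", 0.75, 0.110)]

by_class = defaultdict(list)
for label, d, v in studies:
    by_class[label].append((d, v))

ds_all = [d for _, d, _ in studies]
vs_all = [v for _, _, v in studies]
q_total = q_statistic(ds_all, vs_all, weighted_mean(ds_all, vs_all))

# Q_within: homogeneity inside each class, pooled over classes.
q_within = 0.0
for members in by_class.values():
    ds = [d for d, _ in members]
    vs = [v for _, v in members]
    q_within += q_statistic(ds, vs, weighted_mean(ds, vs))

q_between = q_total - q_within                    # Q_B = Q_T - Q_W
df_between = len(by_class) - 1                    # number of classes minus 1
print(q_between, chi2.sf(q_between, df_between))  # small p -> classes differ
```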

5.3.3 Hunter and Schmidt

Hunter and Schmidt's method is also known as validity generalization, and its effect sizes are expressed as correlations. Unlike Glassian meta-analysis, Schmidt-Hunter meta-analysis does not take the variance of observed effect sizes (S²ES) at face value. Instead, the step after determining the mean effect size is to test the hypothesis that S²ES is entirely due to various statistical artifacts. These artifacts include (1) sampling error, (2) study differences in the reliability of independent and dependent variable measures, (3) study differences in range restriction, (4) study differences in instrument validity, and (5) computational, typographical, and transcription errors. Hunter and Schmidt developed methods for estimating and subtracting the variance due to the first three of these five artifacts.

Meta-analysis can address sampling error in a way that a traditional quantitative literature review cannot. If the population correlation is assumed to be constant over studies, then the best estimate of that correlation is not the simple mean across studies but a weighted average in which each correlation is weighted by the number of persons in that study. Here, r_i is the correlation in study i and N_i is the number of persons in study i. Thus, the best estimate of the population correlation is:

$$\bar{r} = \frac{\sum N_i r_i}{\sum N_i}$$
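A minimal sketch of this sample-size-weighted average (the values are hypothetical):

```python
def weighted_mean_r(correlations, sample_sizes):
    # r_bar = sum(N_i * r_i) / sum(N_i): each r weighted by its study's sample size.
    return sum(n * r for n, r in zip(sample_sizes, correlations)) / sum(sample_sizes)

# Hypothetical example: three studies with different sample sizes.
print(weighted_mean_r([0.25, 0.40, 0.10], [50, 200, 120]))  # about 0.28
```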

5.3.4 Rosenthal and Rubin

The methods proposed by Rosenthal and Rubin were described in Rosenthal (1978, 1991, 1993) as well as Rosenthal and Rubin (1979, 1982). Their primary effect size estimator is the correlation coefficient r. Fisher (1928) devised a transformation (zr) that is distributed nearly normally. In virtually all the meta-analytic procedures discussed here, whenever we are interested in r we actually carry out most of the computations not on r but on its transformation zr. The relationship between r and zr is given by:

$$z_r = \frac{1}{2}\log_e\!\left[\frac{1 + r}{1 - r}\right]$$

So, when we ask whether two studies are telling the same story, what we usually mean is whether the results (in terms of the estimated effect size) are reasonably consistent with each other or whether they are significantly heterogeneous. For each of the two studies to be compared we compute the effect size r and find for each of these r's the associated Fisher zr, defined as ½ loge[(1 + r)/(1 − r)].
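As a sketch of this comparison, the two correlations can be transformed to Fisher z and the difference tested as a normal deviate; the standard error √(1/(n₁ − 3) + 1/(n₂ − 3)) is the standard large-sample result, assumed here rather than quoted from Rosenthal, and the numbers are hypothetical:

```python
import math

def fisher_z(r):
    # z_r = 0.5 * ln((1 + r) / (1 - r)), approximately normal with variance 1/(n - 3).
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

def compare_two_rs(r1, n1, r2, n2):
    # Normal-deviate test for the difference between two independent correlations.
    z_diff = fisher_z(r1) - fisher_z(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return z_diff / se

# Hypothetical example: r = .50 (n = 80) versus r = .20 (n = 120).
print(compare_two_rs(0.50, 80, 0.20, 120))  # about 2.36, beyond 1.96 -> heterogeneous
```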

Tables to convert obtained r's to Fisher zr's are available in most introductory textbooks of statistics. Rosenthal (1984) has brought together the various effect size indicators. Table 2.6 serves as a summary.

The index r/k is included in Table 2.6 here because it is an effect size estimate that needs only to be multiplied by √df to yield the associated test of significance, t. The index r/k also turns out to be related to Cohen's d in an interesting way: it equals d/2 for situations in which the two populations being compared can be thought of as equally numerous (Cohen, 1977; Friedman, 1968). The indicator zr is likewise not typically employed as an effect size estimate, although it could be. However, it is frequently used as a transformation of r in a variety of meta-analytic procedures. Cohen's q indexes the difference between two correlation coefficients in units of zr.

The next three indicators of Table 2.6 are all standardized mean differences; they differ from each other only in the standardizing denominator. Cohen's d employs the σ computed from both groups, using N rather than N − 1 as the within-group divisor for the sums of squares. Glass's Δ and Hedges's g both employ N − 1 divisors for the sums of squares. Glass, however, computes S only for the control group, while Hedges computes S from both the experimental and control groups.

The last three indicators of Table 2.6 include two from Cohen (1977). Cohen's g is the difference between an obtained proportion and a proportion of .50. The index d′ is the difference between two obtained proportions. Cohen's h is also the difference between two obtained proportions, but only after the proportions have been transformed to angles (measured in radians; one radian equals about 57.3 degrees).
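A sketch of these three proportion-based indexes; the arcsine (angular) transformation φ = 2·arcsin(√p) used for Cohen's h is the standard definition, assumed here, and the example proportions are hypothetical:

```python
import math

def cohens_g(p):
    # Cohen's g: distance of an obtained proportion from .50.
    return p - 0.50

def d_prime(p1, p2):
    # d': raw difference between two obtained proportions.
    return p1 - p2

def cohens_h(p1, p2):
    # Cohen's h: difference between the proportions after the angular
    # transformation phi = 2 * arcsin(sqrt(p)), measured in radians.
    return 2.0 * math.asin(math.sqrt(p1)) - 2.0 * math.asin(math.sqrt(p2))

# Hypothetical example: success rates of .65 and .50.
print(cohens_g(0.65), d_prime(0.65, 0.50), cohens_h(0.65, 0.50))
```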

In sum, different groups of authors with different substantive and technical focuses have dealt with the methods of meta-analysis, many of them simultaneously, and each arrived at a pre-packaged, comprehensive treatment of the topic. Such packages, associated with different author names, focuses, and procedures, are called approaches in the following. The publications corresponding to these approaches soon became standard references in certain sub-disciplines of psychology. For example, the work of Hunter et al. (1982) became a quasi-standard in the field of industrial and organizational (I/O) psychology, whereas the work of Glass et al. (1981) was the main reference for meta-analytic research in educational psychology. Within the field of meta-analysis, different approaches entail different procedures, computations, and interpretations of results.

It is therefore important that researchers explicitly point out which approach was implemented in their respective studies (Schulze, 2004). Accordingly, this research adopts the Hedges and Olkin method, and the ES is calculated as $d = \dfrac{\bar{X}_1 - \bar{X}_2}{S_{pooled}}$, with d as the indicator.

5.4 Strengths and weaknesses of meta-analysis