
Journal of Taiwan Normal University: Education, 2004, 49(1), 171-186

The Results of Student Ratings: Paper vs. Online

Te-Sheng Chang
Graduate Institute of Compulsory Education, National Hualien Teachers College

Abstract

The purpose of this study was to compare the results of student ratings of their instructors collected via paper and online surveys. The sample consisted of students in 624 undergraduate courses at National Hualien Teachers College in the fall semester of 2001: 198 (31.73%) freshman courses, 161 (25.80%) sophomore courses, 146 (23.40%) junior courses, and 119 (19.07%) senior courses. The instrument was the Student Ratings of Instruction (SRI) form developed in 1995 at this college. The SRI form was composed of 13 questions rated on a 5-point Likert scale, ranging from "strongly agree" (5 points) to "strongly disagree" (1 point). These 13 items were clustered around four teaching factors: Preparation/Planning, Material/Content, Method/Skill, and Assignment/Examination. The scores on these four factors were added to give the total score (rating) for a faculty member. The paper scores are significantly higher than the online scores for all of the evaluation items. There are 573 (91.8%) courses for which the average total paper evaluation score is higher than the average total online evaluation score, but only 51 (8.2%) courses for which the average total online score is higher than the average total paper score. These results indicate that the majority of students in all courses give the instructors a higher score when they evaluate them using the more traditional method: sitting in the classroom and using paper forms.

Key words: student ratings of instruction, paper survey, online survey

Introduction

Student ratings of instruction are widely used to evaluate teaching effectiveness in higher education (Seldin, 1999), and they are an ongoing and ubiquitous rite on college and university campuses throughout the United States and around the world. In the United States, for example, Wagenaar (1995) stated that well over 90 percent of schools currently use student ratings for assessing the teaching staff, and Wilson (1998) predicted most colleges would consider student ratings of instruction as a measure of "teaching quality" for their faculty. In a comprehensive review of research concerning student ratings of college and university instructors, Wachtel (1998) noted that "after nearly seven decades of research on the use of student evaluations of teaching effectiveness, it can safely be stated that the majority of researchers believe that student ratings are a valid, reliable, and worthwhile means of evaluating teaching" (p. 2). Yet there are many others who would argue against this point, as more recent research has indicated that methodological factors and situational characteristics can affect the validity of student evaluations.

For example, it has been shown that student ratings may be affected by student perceptions of grade leniency (Nimmer & Stone, 1991), by instructor enthusiasm (Williams & Ceci, 1997), and by survey procedures (Layne, DeCristoforo, & McGinty, 1999). Traditionally, student ratings are administered in class using paper questionnaires, a costly, time-consuming process that is inconvenient for faculty and often restricts the thoughtfulness and the depth of student responses (Johnson, 2002). The increasing use of technology in education, especially the World Wide Web, has led to the possibility of online administration and reporting of student ratings. In a recent survey of the 200 most wired colleges in the United States, 25% of respondents said they were already using or planned to convert to online student ratings (Hmieleski, 2000). A web search of institutions using online evaluations yielded over 80 universities converting paper into online student ratings for some courses (e.g., specific colleges or departments, or online courses) and three universities using online ratings for all courses on campus (Johnson, 2002). Like the universities in the United States, some colleges and universities in Taiwan have used online ratings systems for all courses on campus. In a survey of 76 colleges by Chang (2001), ten schools had completed online ratings systems for all courses on campus and nine schools were planning to convert their paper ratings systems to online systems. For example, National Hualien Teachers College has run all of its course evaluation surveys on a web-based system since the fall semester of 1998.

Administering an online survey rather than a paper one offers a number of advantages. For data collection, there are four advantages: (1) survey data entry is automated, (2) missing data can be eliminated, (3) there are no out-of-range responses, and (4) complex item branching, transparent to the respondent, can be used (Rosenfeld & Booth-Kewley, 1993). When surveys are administered via computer or online, data entry takes place as the respondent completes the survey. This saves time and leads to more accurate data, because one of the stages at which errors are normally introduced into the database is skipped. In general, survey data collected via computer or online will not contain missing responses. In addition, online surveys have many economic benefits, including reduced processing time and costs, ease of administration, and more detailed, faster reports (Johnson, 2002; Llewellyn, 2002).

Though the online survey method offers a number of advantages over the paper method, little is known about the extent to which its results are comparable to those obtained through the traditional mode, the paper method. Without answering this question, an online student ratings program's accountability demands cannot be met. Therefore, the major purpose of this study is to compare the results of student ratings obtained via paper surveys with those obtained via online surveys. The research questions guiding the study are as follows:
1. Is there any significant difference in average ratings between the paper survey and the online survey method?
2. Do classes give their courses the same rating scores across the paper and online survey methods?

Literature Review

Student ratings of instruction have served several purposes. Marsh and Roche (1993) cited three general reasons for evaluation of teaching by students: (1) to provide formative feedback to faculty for improving teaching, course content, and structure; (2) to provide a summative evaluation of teaching effectiveness for promotion and tenure decisions; and (3) to provide information to students for the selection of courses and teachers. As Wilson (1998) predicted, most colleges use student ratings of instruction as part of the evidence of teaching quality for faculty promotion. Since summative evaluation of instruction has profound implications for faculty with respect to promotion, tenure, and merit pay increases, care should be exercised in the development of any rating instrument and in the procedure for administering that instrument.

Concerning the development of student ratings instruments, many studies have examined issues such as the development and validation of evaluation instruments (e.g., Marsh, 1987), the validity (e.g., Cohen, 1981) and reliability (e.g., Feldman, 1977) of student ratings in measuring teaching effectiveness, and the potential bias of student ratings (e.g., Centra & Gaubatz, 2000; Chang, 2000; Feldman, 1993).

Concerning the procedure for administering the ratings instrument, the paper survey is the traditional and most common method for collecting data in organizations. Because of their longer history and more frequent use, paper surveys are usually considered the standard against which other forms of survey administration, such as face-to-face interviews, telephone surveys, or electronic surveys, are compared. Both face-to-face interview surveys and telephone surveys have a number of disadvantages relative to paper and computer surveys. They are expensive to conduct, both because of the costs of training interviewers and because of the hours needed to gather data, which is very time consuming. If interviewers are not trained or are poorly trained, the quality of the data may be variable (Miller, 1991).

Paper surveys have a number of important strengths: they are easy and efficient to administer, inexpensive, and familiar to those being surveyed, and it is easy to establish confidentiality on a paper survey. Paper surveys, however, have a number of drawbacks. Individuals completing paper surveys may skip items or choose multiple responses (Doherty & Thomas, 1986). Also, translating the survey responses from paper to the computer for analysis is time-consuming and may introduce additional errors into the database (Rosenfeld & Booth-Kewley, 1993). The major weakness of student ratings via paper survey is the difficulty of ensuring the integrity of the survey procedures. With the increasing use of student ratings, particularly as a reference for personnel decisions, the lack of standardized survey administration procedures is crucial and very troubling (Layne, DeCristoforo, & McGinty, 1999). Ory (1990) indicated that his campus had reported violations of survey administration integrity "due to instructors administering the forms as they walked around their classrooms or collecting and reading the evaluations before sending them to the campus office" (p. 65).

An online survey, on the other hand, can be administered without any involvement of the instructor (Layne, DeCristoforo, & McGinty, 1999), since it can be completed outside the classroom. Evan and Miller (1969) were the first researchers to compare an electronic survey with a paper self-administered survey. The results of their study indicated that the electronic survey provoked more honest and candid responses. Erdman, Klein, and Greist (1983), Martin and Nagao (1989), Kiesler and Sproull (1986), and Layne, DeCristoforo, and McGinty (1999) also found that respondents who used the electronic method left fewer blank items and gave fewer socially desirable responses.

There are only a few studies comparing electronic student ratings with traditional paper ratings. Layne, DeCristoforo, and McGinty (1999) compared the difference in average ratings between the two survey methods by investigating 66 classes at a southeastern university in the United States. In their study, students in each of the participating classes were assigned randomly to either the traditional survey group or the electronic group. The results of their study indicated that the rating scores were not influenced by the survey method. Another study, conducted by Hardy (2002) at Northwestern University, investigated a group of 31 classes in which the same instructor had taught the same class multiple times with both paper and online evaluation of the class. He found 10 classes with higher online scores, 16 classes with higher paper scores, and 5 classes with mixed results, some of the online scores being higher and some of the paper scores being higher.

There are some problems in the Layne, DeCristoforo, and McGinty (1999) and Hardy (2002) studies. First, sampling error due to individual differences arises in the Layne, DeCristoforo, and McGinty (1999) study. In that study, students in the classes were assigned to either the traditional survey group or the electronic group; that means different groups of students used the different survey methods, so the effect of the survey method was confounded with differences between individuals. Second, the inferences that can be drawn from the previous research findings are limited, since the sample sizes are quite small: there are only 66 classes in Layne, DeCristoforo, and McGinty's study and only 31 in Hardy's study. Third, the response rates for online ratings in the Layne, DeCristoforo, and McGinty (1999) study were not high; the total response rate was 47.8% for the electronic group and 60.6% for the traditional survey group.

This study at National Hualien Teachers College differed from the previously mentioned studies in three ways. First, instead of using different raters (different students) as in the previous studies, this study asked all participants (the same students for the same instructor and the same course) to respond to both survey methods. That is, this study used a dependent-sample design to reduce sampling error due to individual differences. Second, the sample size is much larger than those of the previous research: this study consisted of 624 courses across a college campus. Third, the response rate for online ratings in this study was high, because online evaluation was a requirement for all participants before they could register for the following semester.

Method

Data

The sample consisted of 624 undergraduate courses: 198 (31.73%) freshman courses, 161 (25.80%) sophomore courses, 146 (23.40%) junior courses, and 119 (19.07%) senior courses from National Hualien Teachers College in Taiwan in the fall semester of 2001. At the time of the study, the school offered approximately 1,052 courses in total. The students in all the courses were asked to complete the evaluation by both the paper and the online methods. A high response rate was achieved: 78.99% (831/1052) for the paper survey and 95.25% (1002/1052) for the online system. The two data files were matched by teacher identification numbers and course section codes; the matched data included 793 courses. For representativeness, courses with a class size of less than 5 were deleted from the study. The final data file consisted of 624 courses ranging from 5 to 51 students.

Measure

The instrument used in the study was the Student Ratings of Instruction (SRI) form developed in 1995 at the college. The rating form was composed of 13 questions rated on a 5-point Likert scale ranging from strongly agree (5 points) to strongly disagree (1 point). These 13 items were clustered around four teaching effectiveness factors: Preparation/Planning (Items 1-3), Material/Content (Items 4-6), Method/Skill (Items 7-10), and Assignment/Examination (Items 11-13). The sum of these four factors was considered the total rating score for a faculty member. A panel of 15 evaluation experts validated all items for content validity.

The online survey mirrored the paper version in content. The question section consisted of three basic information items and the 13 rating items used for data analysis in this study. In addition to the question section, both versions included another two sections, one with five items for student self-evaluation and the other with one item for open-ended comments; thus the total number of items on the SRI administered in this study was 19. Since this study focused on the student ratings of instruction, the data from these two sections were not analyzed.
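As an illustration of the scoring rule just described, the sketch below aggregates one student's responses to the 13 items into the four factor scores and a total. It is a minimal sketch rather than the author's scoring program, and it assumes that factor and total scores are reported as means on the 1-to-5 item scale, which is how the tables later present them.

    # Minimal sketch (not the study's actual code) of aggregating the 13 SRI
    # items into four factor scores and a total. Responses are assumed to be
    # integers from 1 (strongly disagree) to 5 (strongly agree), and factor
    # and total scores are assumed to be means on the same 1-5 scale.
    FACTORS = {
        "Preparation/Planning":   [1, 2, 3],
        "Material/Content":       [4, 5, 6],
        "Method/Skill":           [7, 8, 9, 10],
        "Assignment/Examination": [11, 12, 13],
    }

    def score_sri(responses):
        """responses: dict mapping item number (1-13) to a rating of 1-5."""
        factor_scores = {
            name: sum(responses[i] for i in items) / len(items)
            for name, items in FACTORS.items()
        }
        total = sum(responses[i] for i in range(1, 14)) / 13
        return factor_scores, total

    # Example: one student's ratings for one course.
    ratings = {i: 4 for i in range(1, 14)}
    ratings[12] = 3  # a lower rating on "gives fair grades"
    print(score_sri(ratings))

Class-average scores of the kind analyzed in this study would then be obtained by averaging such per-student scores within each course.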

Survey Administration

During the last four weeks of the fall semester of 2001, all the instructors on campus were sent an invitation letter that explained the purpose of the study and the mechanics of the survey. Each instructor was asked to designate an individual (e.g., the head of the class) to administer the survey. All the survey administrators attended a half-hour training session, during which they were provided with materials for the traditional survey and a list of all the courses in the department. Because the course opinion survey had already been converted to an online system for the school from the fall semester of 1998, only the paper evaluation form, with an explanatory cover letter, was mailed to all the courses in the last two weeks of the semester. Approximately 10 minutes before the end of the last class in each course, the instructor left the room, and the survey administrator explained to the entire class that a simple comparison was to be made between the paper survey and the online survey. All of the students completed both the paper and the online surveys. They completed the paper evaluation form together in the classroom setting; they completed the online evaluation survey outside the classroom, either before or after they completed the paper survey. Students in both survey formats completed the surveys under conditions of complete anonymity, and no coding scheme was employed. Although students in the online format had to use their identity number to gain access to the system, rating results were stored in a separate database with no identifying information.

Analytic Strategy

All analyses were performed on class-average responses for the sample. A preliminary principal components analysis and the α coefficient of internal consistency reliability on the 13 items were computed separately for the paper and the online survey methods. The correlation coefficients between the two factor patterns were computed using the Pearson correlation formula in order to determine the degree of similarity between the underlying factor structures of the paper and the online survey groups. Descriptive data were provided to assist with the interpretation of the differences between the paper and the online survey results: the mean and standard deviation for the paper and the online survey methods were computed for each item, each factor, and the total rating of the SRI questionnaire. To investigate whether the ratings were influenced by the survey method, each rating item, the four factors, and the total score were analyzed using a dependent t test (same instructors, same courses, and same raters, but different survey modes). In addition, the score difference between the two methods was computed by subtracting the online score from the paper score for each of the 624 courses. Courses scoring above zero were classified as paper-higher courses, those scoring below zero as online-higher courses, and those scoring exactly zero as paper-online-equal courses. The percentage for each category was computed for each course level and for all of the courses.
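To make the dependent t test concrete, the sketch below pairs each course's class-average paper and online scores and tests the mean difference. It is illustrative only: the data frame and its column names (paper_total, online_total) are assumptions, not the study's actual files, and the demo values are made up.

    # Sketch of the dependent (paired) t test described above, assuming a
    # pandas DataFrame with one row per course and hypothetical columns
    # "paper_total" and "online_total" holding class-average scores.
    import pandas as pd
    from scipy import stats

    def paired_t(df, paper_col, online_col):
        diff = df[paper_col] - df[online_col]
        t, p = stats.ttest_rel(df[paper_col], df[online_col])
        return {"mean_paper": df[paper_col].mean(),
                "mean_online": df[online_col].mean(),
                "mean_diff": diff.mean(),
                "t": t, "p": p}

    # Tiny made-up example; the study itself analyzed 624 courses.
    courses = pd.DataFrame({
        "paper_total":  [4.10, 3.95, 4.30, 4.02],
        "online_total": [3.80, 3.70, 4.05, 3.90],
    })
    print(paired_t(courses, "paper_total", "online_total"))

The same call can be repeated for each of the 13 items and the four factor scores; because the paper and online ratings come from the same courses and the same raters, the paired form of the test is the appropriate one.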

Results and Discussion

Before the primary research questions were addressed, preliminary factor analyses on the 13 core teaching items were performed separately for the paper and the online survey methods. The factor loadings for the paper evaluation items designed to measure each factor are large, between .850 and .937, and the four factors account for 91.25% of the total variance of the paper evaluation. Similarly, the factor loadings for the online evaluation items designed to measure each factor are consistently large, between .864 and .950, and the four factors account for 92.28% of the total variance of the online evaluation. The α coefficients of internal consistency reliability are .9792 and .9805 for the paper and the online evaluation scores, respectively. The factor loadings and the α coefficients confirm that the SRI is a valid and reliable instrument regardless of the survey administration method. The correlation coefficients between the two factor patterns are .790, .839, .837, .790, and .834 for Preparation/Planning, Material/Content, Method/Skill, Assignment/Examination, and the total, respectively.
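The reliability and similarity checks reported above can be computed with a few lines of code. The sketch below is a generic illustration rather than the study's analysis script: the loading vectors are placeholder numbers, and Cronbach's α is computed directly from its standard formula.

    # Illustrative computations (not the original analysis): Cronbach's alpha
    # for a set of items under one survey mode, and a Pearson correlation
    # between the factor loadings obtained from the paper and online data.
    import numpy as np
    from scipy.stats import pearsonr

    def cronbach_alpha(items):
        """items: 2-D array, rows = classes (or respondents), cols = items."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_vars / total_var)

    # Made-up responses: 4 classes by 3 items, just to show the call.
    demo = np.array([[4, 4, 5], [3, 4, 4], [5, 5, 5], [4, 3, 4]])
    print("alpha:", round(cronbach_alpha(demo), 3))

    # Placeholder loading vectors for one factor (values are illustrative).
    paper_loadings  = [0.91, 0.88, 0.93]
    online_loadings = [0.90, 0.87, 0.95]
    r, _ = pearsonr(paper_loadings, online_loadings)
    print("similarity of factor patterns:", round(r, 3))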

Table 1 lays out the means and standard deviations of the paper and the online evaluation scores, and the mean differences between these two scores, for each evaluation item of the SRI. The item means range from 3.966 to 4.219 for the paper survey and from 3.711 to 3.986 for the online survey. The highest scored item for both survey methods is Item 4, "relates the material of this course with other areas of knowledge"; the lowest scored item for both survey methods is Item 12, "gives fair grades". It seems that most students give their instructors above-average evaluation scores. Why is there such a large cluster of scores above the average? One of the possible reasons is a positive-biasing effect. Researchers in the general survey research area have recognized the positive-biasing effect that self-report surveys have on survey responses. This positive-biasing effect exists when respondents exhibit a tendency to deny undesirable characteristics and instead respond in a socially desirable manner (Phillips & Clancy, 1970). Likert-type scales, the scale most frequently applied in student ratings of instruction (Berk, 1979), are particularly vulnerable to this yea-saying or acquiescence response set (Couch & Keniston, 1960). In fact, Centra (1979) summarized a 1975 Educational Testing Service study of student response tendencies for over 400,000 faculty members and determined that only 12% of this group received below-average ratings, due to positive-response bias. As Arreola and Aleamoni (1990) emphasized, research on student ratings has shown an obvious positive-response bias, which needs to be taken into account in interpreting and using results; they advocated the use of normative data to help counteract this rater leniency effect.

Previous research (e.g., DeCristoforo, 1992; Sproull & Kiesler, 1991) has revealed that students completing electronic surveys may feel more anonymous and thus become more involved with the nature of the electronic survey method. They may also respond more honestly than when responding with other survey techniques, such as the paper survey. It was initially assumed that students who completed online surveys might become so involved with the interactive online survey method that their responses would negate this positive-response bias. This obviously does not occur in this study, since this feature of the results does not vary according to survey method: scores remain clustered above the average for both the paper and the online surveys.

Based on Table 1, the paper scores are significantly higher than the online scores on each item. The t values for the individual rating items range from 19.028 to 29.411, and the p values of all the dependent t tests are less than .001. That is, paper evaluation scores are significantly higher than online evaluation scores on each item. The summary of the dependent t tests between these two scores for each factor and for the total score is shown in Table 2. The factor scores range from 3.980 to 4.095 for the paper survey and from 3.741 to 3.851 for the online survey. The highest scored factor for both survey methods is Material/Content; the lowest scored factor for both survey methods is Method/Skill.

Table 1. The Summary of Dependent t Tests for the Survey Method on Each Rating Item (N = 624)
For each item: Paper M (SD); Online M (SD); Mean diff; t

Preparation/Planning
  1. is concerned about the effectiveness of his/her teaching: 4.028 (.339); 3.836 (.347); .192; 19.028***
  2. provides a detailed course syllabus: 4.018 (.380); 3.811 (.373); .207; 20.371***
  3. states course objectives for each class section: 4.019 (.375); 3.800 (.365); .222; 22.136***
Material/Content
  4. relates the material of this course with other areas of knowledge: 4.219 (.320); 3.986 (.332); .233; 25.152***
  5. demonstrates knowledge and makes it clear how each topic fits into the course: 4.146 (.394); 3.888 (.398); .258; 28.158***
  6. is aware when students are having difficulty in understanding a topic: 3.977 (.398); 3.733 (.367); .244; 23.994***
Method/Skill
  7. establishes and maintains an interaction: 4.036 (.439); 3.794 (.442); .242; 24.376***
  8. keeps the course moving rapidly enough for the material: 3.993 (.377); 3.753 (.346); .239; 25.575***
  9. explains material clearly: 3.981 (.418); 3.733 (.393); .248; 24.760***
  10. is helpful with difficulties: 3.967 (.420); 3.738 (.408); .223; 23.558***
Assignment/Examination
  11. gives good comments on written work: 3.987 (.406); 3.747 (.380); .240; 21.913***
  12. gives fair grades: 3.966 (.350); 3.711 (.324); .255; 26.615***
  13. gives exams and papers appropriate for the course: 4.069 (.312); 3.814 (.305); .255; 29.411***
Note. *** p < .001.

From Table 2, the paper scores are significantly higher than the online scores on each of the four factors and on the total evaluation score. The t values are 23.693, 30.011, 27.563, 28.957, and 30.270 for Preparation/Planning, Material/Content, Method/Skill, Assignment/Examination, and the total, respectively. The p values of all these dependent t tests are less than .001. Like the findings in Table 1, the findings in Table 2 indicate that paper evaluation scores are significantly higher than online evaluation scores.

Table 3 shows the number and percentage of courses at each course level classified by the direction of the difference between the two survey methods. When all courses are taken into account, for the total evaluation there are 573 (91.8%) courses with higher paper scores, 51 (8.2%) courses with higher online scores, and 0 (0.0%) courses with equal scores. Similarly, there are 536 (85.9%), 568 (91.0%), 554 (88.8%), and 564 (90.4%) courses with higher paper scores for Preparation/Planning, Material/Content, Method/Skill, and Assignment/Examination, respectively.

Table 2. The Summary of Dependent t Tests for the Survey Method on Each Factor and the Total Score (N = 624)
For each factor: Paper M (SD); Online M (SD); Mean diff; t

Preparation/Planning: 4.023 (.334); 3.815 (.338); .207; 23.693***
Material/Content: 4.095 (.361); 3.851 (.356); .244; 30.011***
Method/Skill: 3.980 (.388); 3.741 (.365); .238; 27.563***
Assignment/Examination: 4.008 (.341); 3.758 (.322); .250; 28.957***
Total: 4.031 (.341); 3.796 (.333); .236; 30.270***
Note. *** p < .001.

In terms of the total score, there are 170 (85.9%), 152 (94.4%), 140 (95.9%), and 111 (93.9%) courses with higher paper scores for freshman, sophomore, junior, and senior courses, respectively. Although the percentage of paper-higher courses is much greater than that of online-higher courses at every level, the percentage of paper-higher courses at the freshman level is relatively lower than at the other three course levels. For freshman courses, there are 159 (80.3%), 167 (84.3%), 159 (80.3%), 168 (84.8%), and 170 (85.9%) courses with higher paper scores for Preparation/Planning, Material/Content, Method/Skill, Assignment/Examination, and the total, respectively. For the other course levels, the percentages are higher than 90.0% for most of the evaluation factors and for the total score. These results indicate that the majority of classes give their instructors higher scores when they do the faculty teaching evaluation in the traditional paper-and-pencil way in the classroom.

Why are the paper evaluation scores significantly higher than the online evaluation scores? Does this mean that instructors rated by the paper format teach more effectively than instructors rated by the online format? The answer is negative, since the instructors are rated by the same students with both the paper and the online survey methods. One of the main reasons that paper evaluation scores are significantly higher than online evaluation scores is the way the survey is administered. The findings indicate that it is more probable for students to give instructors higher evaluation scores when they complete the rating form in a paper survey. It seems that most students felt the paper method afforded a lower degree of anonymity than the online method. One of the possible reasons is that online respondents feel more secure and free to write the honest truth, since they do not feel as though their instructor will find out what they have written about him or her. The other reason is that the online survey was administered without any involvement of the instructor (Layne, DeCristoforo, & McGinty, 1999). Previous research related to electronic survey technology has revealed that respondents who used the electronic method gave fewer socially desirable responses (Erdman, Klein, & Greist, 1983; Kiesler & Sproull, 1986; Martin & Nagao, 1989).

Table 3. The Number and Percentage of Courses at Each Course Level on Student Ratings by Survey Method
Entries are N (%).

                        Freshman     Sophomore    Junior       Senior       All
                        (N = 198)    (N = 161)    (N = 146)    (N = 119)    (N = 624)
Preparation/Planning
  Paper-higher          159 (80.3)   149 (92.5)   128 (87.7)   100 (84.0)   536 (85.9)
  Paper-online-equal      1 (0.5)      0 (0.0)      1 (0.7)      2 (1.7)      4 (0.6)
  Online-higher          38 (19.2)    12 (7.5)     17 (11.6)    17 (14.3)    84 (13.5)
Material/Content
  Paper-higher          167 (84.3)   148 (91.9)   143 (97.9)   110 (92.4)   568 (91.0)
  Paper-online-equal      1 (0.5)      0 (0.0)      0 (0.0)      1 (0.8)      2 (0.3)
  Online-higher          30 (15.2)    13 (8.1)      3 (2.1)      8 (6.7)     54 (8.7)
Method/Skill
  Paper-higher          159 (80.3)   148 (91.9)   137 (93.8)   110 (92.4)   554 (88.8)
  Paper-online-equal      0 (0.0)      1 (0.6)      0 (0.0)      1 (0.8)      2 (0.3)
  Online-higher          39 (19.7)    12 (7.5)      9 (6.2)      8 (6.7)     68 (10.9)
Assignment/Examination
  Paper-higher          168 (84.8)   150 (93.2)   134 (91.8)   112 (94.1)   564 (90.4)
  Paper-online-equal      0 (0.0)      0 (0.0)      1 (0.7)      1 (0.8)      2 (0.3)
  Online-higher          30 (15.2)    11 (6.8)     11 (7.5)      6 (5.0)     58 (9.3)
Total
  Paper-higher          170 (85.9)   152 (94.4)   140 (95.9)   111 (93.9)   573 (91.8)
  Paper-online-equal      0 (0.0)      0 (0.0)      0 (0.0)      0 (0.0)      0 (0.0)
  Online-higher          28 (14.1)     9 (5.6)      6 (4.1)      8 (6.7)     51 (8.2)
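The classification underlying Table 3 follows directly from the per-course paper-minus-online differences. The sketch below is hypothetical (the column names and the tiny demo data set are assumptions, not the study's files); it labels each course by the sign of the difference and then tabulates the labels by course level.

    # Sketch (with assumed column names) of classifying courses as
    # paper-higher, online-higher, or paper-online-equal and tabulating
    # the counts by course level, as in Table 3.
    import numpy as np
    import pandas as pd

    def classify(df, paper_col="paper_total", online_col="online_total"):
        diff = df[paper_col] - df[online_col]
        labelled = df.assign(result=np.select(
            [diff > 0, diff < 0], ["paper-higher", "online-higher"],
            default="paper-online-equal"))
        counts = labelled.groupby(["level", "result"]).size().unstack(fill_value=0)
        shares = counts.div(counts.sum(axis=1), axis=0).round(3)
        return counts, shares

    # Tiny illustrative data set; the study itself classified 624 courses.
    demo = pd.DataFrame({
        "level":        ["freshman", "freshman", "senior", "senior"],
        "paper_total":  [4.1, 3.9, 4.2, 4.0],
        "online_total": [3.9, 4.0, 4.0, 4.0],
    })
    counts, shares = classify(demo)
    print(counts, shares, sep="\n")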

Conclusions and Recommendations

The purpose of this study has been to ascertain whether there is any significant difference in average ratings across survey administration methods. This study confirms that either the paper or the online method can be a valid and reliable way of collecting student ratings of faculty teaching effectiveness. However, the mean scores of the paper evaluation are significantly higher than those of the online evaluation on each evaluation item, each evaluation factor, and the total rating score. It seems that classes give their instructors lower teaching-effectiveness scores when they do student ratings in an online mode. A possible explanation may be that the online survey provides a safer and more candid response environment; therefore, students who used the online method gave fewer socially desirable responses.

The results of this study suggest that online student ratings of instruction can be successfully administered at colleges where the student body is fairly computer literate and familiar with accessing the campus computer network. In addition, the study provided strong evidence that the validity and the reliability of student ratings do not vary according to whether an online or a paper survey is used. This is consistent with the results of previous studies (e.g., Layne, DeCristoforo, & McGinty, 1999).

There are many survey methods that institutions can implement in order to obtain student ratings of instruction. If institutions continue to believe in the importance of the student voice in evaluating faculty, it is necessary to pay attention to the influence of the survey method on the rating scores. Which scores are the real representation of faculty teaching effectiveness, the paper scores or the online scores? Educational institutions need to reflect deeply before they make decisions about their faculty members on this basis.

If the paper survey method is to continue to be implemented in most colleges as the mode of data collection for student ratings of faculty, the perception that it offers less anonymity must be countered. Students still believe that their instructors will find out what they have written about them on the paper evaluation form. Therefore, there is clearly an educational process that needs to take place in order to convince students that paper evaluations are legitimately anonymous. An endorsement by the student association, government, and other reputable agencies could help to allay fears about the paper survey system. In addition, an institutional guarantee of confidentiality might be necessary, similar to the guarantee that students' records are confidential. Another way to avoid students giving socially desirable ratings is to ask instructors to leave the classroom while students are evaluating them. Allowing students to do the evaluation off campus is yet another way to administer the paper evaluation without the involvement of faculty.

Though the response rate in the online group in this study was high, incentives may have to be offered to encourage students to respond when the online method is used in a non-forced implementation. A number of possible incentives could be tested experimentally to determine their effectiveness. For example, students who complete the online survey could be assigned earlier registration times for the next semester based on the number or proportion of courses that they evaluated during the current semester. Additionally, immediate access to course evaluation information could be made available to those students participating in the evaluation process. Higher response rates may also be achieved by allowing students to complete surveys somewhat earlier in the semester, when students are less pressured and the online resources are less likely to be overloaded; student rating research has shown that rating results can be relatively stable from the midterm point to the end of the term (Costin, 1968; Feldman, 1979).

Overall, although the results are informative, they should be taken as a preliminary investigation. The generality of the findings should be strengthened with replications at different universities and in other disciplines.

Acknowledgment

The author would like to thank the five AERA (American Educational Research Association) anonymous reviewers for their comments, and National Hualien Teachers College in Taiwan for its willingness to participate in this research. This research was supported by a research grant (NSC-90-2413-H-026-011) from the National Science Council.

References

Arreola, R. A., & Aleamoni, L. M. (1990). Practical decisions in developing and operating a faculty evaluation system. In M. Theall & J. Franklin (Eds.), Student ratings of instruction: Issues for improving practice (pp. 37-56). San Francisco: Jossey-Bass.

Berk, R. A. (1979). The construction of rating instruments for faculty evaluation. Journal of Higher Education, 50(5), 651-669.

Centra, J. A. (1979). Determining faculty effectiveness. San Francisco: Jossey-Bass.

Centra, J. A., & Gaubatz, N. B. (2000). Is there gender bias in student evaluations of teaching? Journal of Higher Education, 70(1), 17-33.

Chang, T. (2000). Student ratings: What are teachers college students telling us about them? Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA.

Chang, T. (2001). The comparison between web-based and paper-and-pencil surveys of student ratings of instruction (NSC-90-2413-H-026-011). Hualien, Taiwan: National Hualien Teachers College.

Cohen, P. A. (1981). Student ratings of instruction and student achievement: A meta-analysis of multisection validity studies. Review of Educational Research, 51, 281-309.

Costin, F. (1968). A graduate course in the teaching of psychology: Description and evaluation. The Journal of Teacher Education, 19(4), 425-432.

Couch, A., & Keniston, K. (1960). Yeasayers and naysayers: Agreeing response set as a personality variable. Journal of Abnormal and Social Psychology, 60, 151-174.

DeCristoforo, J. R. (1992). Electronic versus traditional administration of student ratings of instruction at the Georgia Institute of Technology: A summative analysis. Unpublished doctoral dissertation, Georgia State University, Atlanta.

Doherty, L., & Thomas, M. D. (1986). Effects of an automated survey system upon responses. In O. Brown, Jr., & H. W. Hendrick (Eds.), Human factors in organizational design and management-II. North Holland: Elsevier Science.

Erdman, H., Klein, M., & Greist, J. (1983). The reliability of a computer interview for drug use/abuse information. Behavior Research Methods and Instrumentation, 15, 66-68.

Evan, W. M., & Miller, J. R. (1969). Differential effects on response bias of computer versus conventional administration of a social science questionnaire: An exploratory methodological experiment. Behavioral Science, 14, 216-227.

Feldman, K. A. (1977). Consistency and variability among college students in rating their teachers and courses: A review and analysis. Research in Higher Education, 6, 223-274.

Feldman, K. A. (1979). The significance of circumstances for college students' ratings of their teachers and courses. Research in Higher Education, 10(2), 149-172.

Feldman, K. A. (1993). College students' views of male and female college teachers: Part II. Evidence from students' evaluations of their classroom teachers. Research in Higher Education, 34, 151-211.

Hardy, N. (2002). Perceptions of online evaluations: Fact and fiction. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA.

Hmieleski, K. (2000). Barriers to online evaluation: Surveying the nation's top 200 most wired colleges. Unpublished report, Rensselaer Polytechnic Institute, Troy, NY.

Johnson, T. (2002). Online student ratings: Will students respond? Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA.

Kiesler, S., & Sproull, L. (1986). Response effects in the electronic survey. Public Opinion Quarterly, 50, 402-413.

Layne, B. H., DeCristoforo, J. R., & McGinty, D. (1999). Electronic versus traditional student ratings of instruction. Research in Higher Education, 40(2), 221-232.

Llewellyn, D. C. (2002). Online reporting of student course survey results: Methods, benefits and concerns. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA.

Marsh, H. W. (1987). Students' evaluations of university teaching: Research findings, methodological issues, and directions for future research. International Journal of Educational Research, 11, 253-388.

Marsh, H. W., & Roche, L. (1993). The use of students' evaluations and an individually structured intervention to enhance university teaching effectiveness. American Educational Research Journal, 30, 217-251.

Martin, C., & Nagao, H. (1989). Some effects of computerized interviewing on job applicant responses. Journal of Applied Psychology, 74(1), 72-80.

Miller, D. C. (1991). Handbook of research design and social measurement. Newbury Park, CA: Sage.

Nimmer, J. G., & Stone, E. F. (1991). Effects of grading practices and time of ratings on student ratings of faculty performance and student learning. Research in Higher Education, 32, 195-215.

Ory, J. (1990). Student ratings of instruction: Ethics and practice. In M. Theall & J. Franklin (Eds.), Student ratings of instruction: Issues for improving practice (pp. 63-74). San Francisco: Jossey-Bass.

Phillips, D. L., & Clancy, K. J. (1970). Some effects of social desirability in survey studies. American Journal of Sociology, 77(5), 921-940.

Rosenfeld, P., & Booth-Kewley, S. (1993). Computer-administered surveys in organizational settings: Alternatives, advantages, and applications. American Behavioral Scientist, 36(4), 485-511.

Seldin, P. (1999). Current practices, good and bad, nationally. In P. Seldin (Ed.), Changing practices in evaluating teaching: A practical guide to improved faculty performance and promotion/tenure decisions (pp. 1-25). Bolton, MA: Anker.

Sproull, L., & Kiesler, S. (1991). Connections: New ways of working in the networked organization. Boston: MIT Press.

Wachtel, H. K. (1998). Student evaluation of college teaching effectiveness: A brief review. Assessment and Evaluation in Higher Education, 23(2), 191-211.

Wagenaar, T. C. (1995). Student evaluation of teaching: Some cautions and suggestions. Teaching Sociology, 64, 64-68.

Williams, W. M., & Ceci, S. (1997). "How'm I doing?" Problems with student ratings of instructors and courses. Change, 29(5), 12-23.

Wilson, R. (1998). New research casts doubt on value of student evaluations of professors. The Chronicle of Higher Education, 44(19), 12-14.

About the Author

Te-Sheng Chang (張德勝) is a Professor at the Graduate Institute of Compulsory Education, National Hualien Teachers College. E-mail: achang@sparc2.nhltc.edu.tw

Manuscript received September 18, 2003; revised January 5, 2004; accepted January 9, 2004.

Chinese Abstract (中文摘要)

Journal of Taiwan Normal University: Education, 2004, 49(1), 171-186

The Results of "Student Ratings of Instruction": A Comparison of Paper and Web-Based Surveys

Te-Sheng Chang
Graduate Institute of Compulsory Education, National Hualien Teachers College

This study compared the results of student ratings of instruction obtained with a paper survey and with an online survey. The sample consisted of the 624 undergraduate courses offered by National Hualien Teachers College in the first semester of the 2001 academic year, including 198 (31.73%) freshman, 161 (25.80%) sophomore, 146 (23.40%) junior, and 119 (19.07%) senior courses. The instrument was the National Hualien Teachers College Teaching Opinion Survey, which comprises four dimensions: Preparation/Planning, Material/Content, Method/Skill, and Assignment/Examination. The results show that on all four dimensions and on the total score, the paper survey yielded significantly higher ratings than the online survey. Among all classes, 573 (91.8%) courses had higher average scores on the paper survey than on the online survey, whereas only 51 (8.2%) courses had higher scores on the online survey. The results indicate that students are more likely to give the course instructor a higher evaluation score on a paper survey than on an online survey.

Key words: student ratings of instruction, paper survey, online survey
