
Chapter 5. Conclusion and Suggestion


Finally, this study hopes that future work will further investigate the cognitive processes involved in cube enumeration. Such work could analyze the contributing factors, examine examinees' solving processes in detail, and use these data to infer what makes an item difficult. The results could then be used to construct better teaching and evaluation tools and, ultimately, to make cube enumeration tests evaluate examinees more precisely.

5.3. Limitation

1. To generate items of appropriate difficulty, the maximum size of the cube arrangement in each item was set to 4 by 4 by 4. According to our pilot study, items larger than this limit would be too difficult for sixth graders.

2. The subjects of this study were sixth graders only. Students in other grades were not included, so it is unclear whether the approach is feasible for them.
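The size cap in limitation 1 can be illustrated with a short sketch. The code below is not the item generator used in this study. It is a hypothetical example that assumes an arrangement is stored as a grid of column heights (`heights[y][x]`, with `y` increasing toward the viewer), that the picture is an exact isometric view from the front-right-top corner, and that a cube counts as invisible when no part of its top, front, or right face can be seen in that view. The names `validate`, `count_cubes`, and `MAX_SIZE` are assumptions made for illustration.

```python
MAX_SIZE = 4  # size cap from limitation 1 (4 by 4 by 4)
AXES = ((1, 0, 0), (0, 1, 0), (0, 0, 1))  # right, front, up


def validate(heights, max_size=MAX_SIZE):
    """Raise ValueError if the arrangement exceeds max_size in any dimension."""
    if not heights or not heights[0]:
        raise ValueError("arrangement must contain at least one column")
    width = len(heights[0])
    if any(len(row) != width for row in heights):
        raise ValueError("all rows must have the same length")
    if len(heights) > max_size or width > max_size:
        raise ValueError("footprint exceeds the size cap")
    if any(h < 0 or h > max_size for row in heights for h in row):
        raise ValueError("column height outside 0..max_size")


def occupied_cells(heights):
    """Return the set of (x, y, z) cells filled by cubes."""
    return {
        (x, y, z)
        for y, row in enumerate(heights)
        for x, h in enumerate(row)
        for z in range(h)
    }


def shift(cell, *steps):
    """Return the cell moved by the sum of the given unit steps."""
    return tuple(c + sum(s[i] for s in steps) for i, c in enumerate(cell))


def blocked(cells, start, limit):
    """True if a cube lies at start or further along the sight line (1, 1, 1)."""
    x, y, z = start
    while x < limit and y < limit and z < limit:
        if (x, y, z) in cells:
            return True
        x, y, z = x + 1, y + 1, z + 1
    return False


def is_invisible(cells, cell, limit):
    """True if no part of the cube's right, front, or top face can be seen."""
    # A nearer cube on the same sight line covers the whole cube.
    if blocked(cells, shift(cell, *AXES), limit):
        return True
    # Otherwise each face is drawn as two triangles; every one must be covered.
    for face in AXES:
        for other in AXES:
            if other == face:
                continue
            covered = blocked(cells, shift(cell, face), limit) or blocked(
                cells, shift(cell, face, other), limit
            )
            if not covered:
                return False
    return True


def count_cubes(heights, max_size=MAX_SIZE):
    """Return (total, invisible) cube counts for an arrangement within the cap."""
    validate(heights, max_size)
    cells = occupied_cells(heights)
    invisible = sum(is_invisible(cells, cell, max_size) for cell in cells)
    return len(cells), invisible
```

Under these assumptions, a solid 2 by 2 by 2 block gives `count_cubes([[2, 2], [2, 2]])` equal to `(8, 1)`, and any arrangement with a side or column taller than 4 is rejected with a `ValueError`. The definitions of invisible cubes and integrity level used for the appendix items may differ from this simplified view model.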



IV. Calculate the volume of each figure below: 20%

(1) [Figure: solid with edges labeled 6 cm, 6 cm, and 6 cm]

( ) cubic centimeters

(2) [Figure: solid with edges labeled 11 cm, 3 cm, and 5 cm]

( ) cubic centimeters

V. Word problems: 40%

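1. The figure below is built by stacking identical cubic blocks, each with an edge length of 5 cm. What is its volume in cubic centimeters?

2. Each edge of a cubic die is 1 cm long. A paper box has inner dimensions of 10 cm in length, 6 cm in width, and 3 cm in height. What is the maximum number of dice the box can hold?

3. Calculate the volume of the solid figure below in cubic centimeters. (Every angle is a right angle.)

4. Calculate the capacity of the container below. (Every angle is a right angle; the container is 8 cm thick.)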


Appendix B: Items of invisible cube enumeration test

3 invisible cubes and


Appendix C: Items of cube enumeration test

3 invisible cubes,


8 invisible cubes, integrity level was 8, 20 total cubes.

8 invisible cubes, integrity level was 9, 20 total cubes.

8 invisible cubes, integrity level was 7, 20 total cubes.

8 invisible cubes, integrity level was 6, 24 total cubes.

8 invisible cubes, integrity level was 10, 21 total cubes.

8 invisible cubes, integrity level was 6, 20 total cubes.