This empirical study explored the effects of three caption modes (i.e., full, partial, and real-time captioning) on the comprehension of L2 learners with different modality preferences. The insights generated from the current findings are listed below:
What this study adds:
L2 learners’ modality preference serves as a crucial factor moderating the efficacy of different caption modes on L2 learners’ comprehension of L2 audiovisual materials.
Full captioning was shown to benefit visual L2 learners’ comprehension more than auditory L2 learners’ because of its complete textual support, namely, the transcription of the corresponding oral discourse in the video.
Partial captioning, because of its limited textual support and unpredictable display, has the potential to redirect L2 learners’ attention during captioned-video watching.
Notably, although partial captioning tends to assist auditory L2 learners’ comprehension, it slightly disrupts visual L2 learners’ understanding.
Real-time captioning proved to be an effective L2 comprehension aid, especially for visual L2 learners. Its dynamic display of captions could facilitate visual L2 learners’ comprehension without dampening their information processing.
Together, these major findings confirmed the researcher’s hypothesis that L2 learners’ modality preference serves as a modulating factor of the efficacy of different caption modes, especially when the aim is to optimize L2 learners’ comprehension. This not only calls for more attention to individual differences in L2 caption research, but also underscores the importance of differentiated instruction when utilizing captioning as a facilitative L2 tool. The ensuing section focuses on two major pedagogical implications regarding the implementation of differentiated caption support in both L2 classroom and self-learning scenarios.
6.2 Pedagogical implications
6.2.1 Differentiated caption support is warranted and should be implemented based on L2 learners’ modality preference.
Though high-intermediate L2 learners could understand the gist of the video content without having to rely on caption support, their comprehension performance was enhanced with the provision of captions. However, captions exerted differential and selective effects on L2 learners with different learning styles. Specifically, while visual L2 learners benefited the most from full and real-time captioning, auditory L2 learners performed best when assisted by partial captioning. This finding implies that providing only one type of captioning is not the most desirable way to present captioned videos to all L2 learners. Instead, differentiated caption support based on L2 learners’ modality preference is warranted, especially when L2 listening comprehension is the aim.
In L2 classroom scenarios, where individual differences are commonly observed, L2 instructors are encouraged to use a fast but effective instrument (such as a CRT) to determine their students’ learning styles, and to consider this information when using captioned videos as instructional materials. This ensures that both visual and auditory L2 learners have an equal chance to receive the most effective comprehension aids in class. The provision of different caption modes is especially helpful when the video content becomes challenging and/or when the instructor intends to use the video as a language acquisition tool (e.g., learning new words while viewing the video); in this case, ensuring that the video content is indeed comprehensible input for all learners is the key to further language development. Since L2 learners’ capacity for real-time processing is limited and their L2 aural decoding skills are not yet automatic, they would focus even more on the modality they prefer when the incoming information is difficult to comprehend. Their comprehension would then be compromised, because videos provide input from several modalities simultaneously and learners attending mainly to one of them may miss the rest. That is, as the difficulty level of the video increases, the processing difference caused by L2 learners’ modality preferences is likely to become more salient. Under this circumstance, L2 learners would be more in need of their preferred textual aids to optimize their comprehension performance.
Nevertheless, differentiated caption support can be difficult to implement in a classroom setting without certain adjustments and technical support. Because it is impossible for L2 instructors to present a video with different caption modes to the whole class at once, they might prepare different types of captioned video materials (e.g., full vs. partial vs. real-time) and alternate among them in class, rather than relying on only one type of captioned video, whenever appropriate. For instance, L2 instructors can incorporate mobile devices (e.g., tablets) into the classroom, which allows them to turn video watching into a group task. By separating visual and auditory L2 learners into different groups and presenting the video with the suitable caption mode for each type of L2 learner, instructors can ensure that every individual receives caption aids that optimize their listening comprehension. Specifically, visual L2 learner groups could watch the video with full or real-time captioning while auditory L2 learner groups watch the same content with partial captioning. If every L2 learner in the class has access to a smartphone or other handheld device, captioned video watching can even be done individually, and the selection of captioning can thus be more flexible. L2 instructors can also advise students to use a particular type of captioned video as their after-class listening material based on their preferred learning styles.
Another possible difficulty faced by L2 instructors when implementing differentiated caption support may arise from partial caption production. To determine and select the keywords for this viewing environment, the current study drew on professional judgement. However, this protocol took time and might not be the most practical option in the classroom setting. Instead, L2 instructors might select the keywords based on their understanding of their students’ profiles, displaying words that are novel, unfamiliar, or difficult for their students. Words or phrases that are essential for L2 learners’ general understanding should also be considered. By following the above criteria, L2 instructors could produce tailor-made partial captioning for the target auditory L2 learners based on their proficiency level and the difficulty level of the selected material.
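As a rough illustration of how an instructor-selected keyword list could be turned into partial captions, the following Python sketch keeps only the listed keywords from each caption line. The function name, the keyword list, and the sample sentence are hypothetical, and showing the keywords alone (rather than, for example, blanks for omitted words) is an assumption rather than the display format used in the current study.

```python
import re


def partial_caption(line: str, keywords: set[str]) -> str:
    """Return only the words of a caption line that are in the keyword set.

    Matching is case-insensitive; the original spelling of each kept word
    is preserved. Punctuation is dropped.
    """
    kept = []
    for token in re.findall(r"[A-Za-z']+", line):
        if token.lower() in keywords:
            kept.append(token)
    return " ".join(kept)


# Hypothetical keyword list chosen by the instructor (stored in lowercase).
keywords = {"photosynthesis", "chlorophyll", "sunlight"}
line = "Plants use sunlight and chlorophyll to drive photosynthesis."
print(partial_caption(line, keywords))
```

Because the keyword list is an ordinary set, an instructor could maintain a different list for each class or proficiency level and regenerate the partial captions from the same full transcript.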
The insight that L2 learners with different modality preferences should be assisted with different caption modes is crucial not only for L2 instructors, but also for developers of L2 self-learning websites and software. To cater to the needs of visual and auditory L2 learners, these developers are encouraged to provide different caption mode options on their platforms. By enabling users to select from full, partial, and real-time captioning based on their modality preferences, differentiated caption support can be realized in L2 self-learning scenarios. Such caption mode selection on websites and software could also serve as an in-class instructional tool, providing technical support for L2 instructors who want to bring differentiated caption instruction into their classrooms.
6.2.2 CRT serves as a feasible tool to determine L2 learners’ modality preferences in L2 classroom and self-learning scenarios.
The implementation of differentiated caption support requires the identification of L2 learners’ modality preferences in the contexts of both L2 classrooms and self-learning platforms. Consequently, L2 instructors and website or software developers need to adopt professional methods to help L2 learners identify their preferred modality during real-time processing. One possible way is to design a level-appropriate CRT for the target L2 learners. CRT has been shown in several existing studies and in the current study to be a valid real-time measurement of L2 learners’ modality preferences.
Notably, CRT, which took the form of multiple-choice questions, is not only a reliable assessment that can be utilized in the lab setting but also a feasible pedagogical tool that can be easily implemented by L2 instructors in the classroom setting. By creating unobtrusive incongruence between the audio and visual input in the selected videos, test designers can infer test takers’ preferred modalities from the proportion of audio and visual answers.
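The scoring step described above can be sketched in a few lines of Python. The snippet is only an illustration under stated assumptions: the response labels ('audio', 'visual', 'other'), the function name, and the 0.1 margin used to call a preference are hypothetical and are not the scoring rule of the current study.

```python
from collections import Counter


def modality_preference(answers: list[str], margin: float = 0.1) -> tuple[str, float, float]:
    """Infer a modality preference from coded CRT responses.

    Each answer is coded as 'audio' (matches the soundtrack), 'visual'
    (matches the on-screen text), or 'other' (matches neither).
    Returns (label, proportion_audio, proportion_visual), with proportions
    taken over all items. The margin is a hypothetical cut-off.
    """
    if not answers:
        return ("undetermined", 0.0, 0.0)
    counts = Counter(answers)
    p_audio = counts["audio"] / len(answers)
    p_visual = counts["visual"] / len(answers)
    if p_visual - p_audio > margin:
        label = "visual"
    elif p_audio - p_visual > margin:
        label = "auditory"
    else:
        label = "undetermined"
    return (label, p_audio, p_visual)


# Hypothetical test taker: 7 visual answers, 2 audio answers, 1 other.
answers = ["visual"] * 7 + ["audio"] * 2 + ["other"]
label, p_audio, p_visual = modality_preference(answers)
print(label, p_audio, p_visual)
```

Reporting the two proportions alongside the label lets an instructor see how strong a preference is, instead of relying on the label alone.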
For L2 instructors, a tailor-made CRT for target L2 learners can be easily created with accessible technical support. By using the caption editing function on YouTube or other video-editing software, L2 instructors can change the transcription of the spoken content, producing incongruence between the audio and visual input in any video. After that, L2 instructors could design several multiple-choice comprehension questions to develop their own CRT. This kind of CRT can be administered both in class and outside of the classroom, allowing flexibility in implementation.
As for L2 self-learning websites or software, developers could consider designing level-appropriate CRTs for L2 learners of different proficiency levels to help them identify their own modality preference. After L2 learners take the CRT, it would be helpful to give them suggestions on which kind of caption aid they should receive for optimal comprehension outcomes. Such individualized learning guidance can not only make L2 self-learning websites or software more effective, but also allow L2 learners to know more about their learning styles, which may in turn optimize other aspects of their L2 learning. While an online CRT is crucial for differentiated caption instruction, it is suggested that each L2 learner take the CRT only once, so that they do not become aware of the incongruence in the items, which should remain unobtrusive.
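The kind of post-CRT guidance described above could be expressed as a simple lookup, sketched here in Python. The mapping follows the comprehension results reported in this study (full or real-time captioning for visual learners, partial captioning for auditory learners); the function name, the labels, and the fallback of offering all three modes when no preference is identified are assumptions made for illustration.

```python
# Caption modes suggested for each modality preference, following the
# comprehension results reported in this study.
RECOMMENDED_MODES = {
    "visual": ["full", "real-time"],
    "auditory": ["partial"],
}

# Hypothetical fallback: offer every mode when no preference is identified.
ALL_MODES = ["full", "partial", "real-time"]


def recommend_caption_modes(preference: str) -> list[str]:
    """Return the caption modes to suggest for a given modality preference."""
    return RECOMMENDED_MODES.get(preference, ALL_MODES)


print(recommend_caption_modes("visual"))
print(recommend_caption_modes("auditory"))
print(recommend_caption_modes("undetermined"))
```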
6.3 Limitations and suggestions for future research
Although the current study yielded valuable insights into captioned-video instruction for L2 learners, several limitations should be noted.
First, the video selected in the current study was a TED talk that involved only the speaker and a simple backdrop, so limited visual cues were provided (i.e., the speaker’s facial expressions, hand gestures, and the graph). Other video materials include many more visual cues, such as animation or dynamic images, to help viewers make sense of the content. Since animation and dynamic images could easily attract attention and might either assist or disrupt L2 learners’ information processing, these visual aids are likely to affect visual and auditory L2 learners’ comprehension to different extents. Accordingly, whether the current findings can be generalized to other types of video materials is yet to be established. Further investigation is needed to shed light on how these ‘rich’ and ‘additional’ visual cues in captioned videos affect the viewing behaviors and comprehension performance of visual and auditory L2 learners respectively.
Second, the results of the current study suggested that different caption presentation modes affect L2 learners’ attention allocation during video watching, thus contributing to the different reactions of visual and auditory L2 learners. In particular, partial captioning demonstrated the greatest potential for directing L2 learners’ attention to captions during video watching. However, the current study proposed this view mainly on the basis of off-line data, including the comprehension test scores and the questionnaire results. To further confirm the current proposal, on-line eye-tracking data could provide more concrete evidence as to the attention allocation of L2 learners under different caption conditions. Such real-time processing data might also provide a more comprehensive explanation as to why visual and auditory L2 learners reacted so differently when receiving the same caption support.
Third, while the current findings provided useful guidance as to which caption mode serves as the most desirable aid for visual and auditory L2 learners, it is important to note that such guidance mainly concerned L2 comprehension rather than L2 acquisition. In the current one-shot experiment, captions were utilized as L2 comprehension aids and the main goal was to investigate the effects of each caption mode on L2 learners’ audiovisual processing through their comprehension performance.
Therefore, the results cannot be generalized to L2 acquisition. Future research on the use of captioned videos as an L2 acquisition tool is warranted to shed light on the most effective caption mode for visual and auditory L2 learners.