CHAPTER 4 RESULTS
4.2 Qualitative data
This study used both a questionnaire and interviews as secondary data sources to support the quantitative results. Since the questionnaire was designed to probe the participants’ perceptions of and experience with captioned video content, the responses reported below mainly focus on the participants assigned to the full caption (FC) condition. For the participants in the no caption (NC) condition, perceptions of captions are reported by drawing on their qualitative comments.
Table 4
Descriptive statistics of the questionnaire data with five-point Likert scale items (1 = Strongly disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly agree)
Statements                                                                  FC
                                                                            V      A      V-H    A-H
I focused more on the audio than the captions while watching the video.    2.41   4.6    2.7    4.93
I focused more on the captions than the audio while watching the video.    4.47   3.57   4.87   3.87
Note. (1) FC = Full captions; NC = No captions; A = Auditory learners; V = Visual learners; H = High working memory capacity.
Auditory learners under the full caption condition
As shown in Table 4, the auditory learners under the full caption condition tended to agree that their attention was directed mainly to the oral discourse of the video (M = 4.6), particularly those with higher working memory capacity (M = 4.93). Nevertheless, their average rating for the second questionnaire item (“I focused more on the captions than the audio”) was 3.57, which falls between “neutral” and “agree” and suggests that, despite their auditory modality preference, they were aware of attending to the captions. This was especially the case for the auditory learners with higher working memory capacity (M = 3.87). The qualitative comments lend further support to this finding: eight of the 15 auditory learners in the full caption condition expressed that captions engaged a lot of their attention, and four of them even expressed a strong dislike of captions.
Additionally, when asked about their online viewing preference, two thirds of the auditory learners under the full caption condition preferred not having captions on. Specifically, seven of them indicated that captions could draw their attention in unwanted ways, two found that captions could be distracting, and one stressed that captions could impose more cognitive load on their understanding of the multimodal information. The following are a few examples from the interview data:
“…Captions can attract a lot of attention from me, and I don't like it. When this happens, I feel like I'm not polishing my listening skills. I was paying a lot of attention on captions during video watching….”
“…Captions can be distracting…”
“…I feel captions could attract a lot of my attention during video watching…”
“…Captions can attract a lot of attention, and I don't really like my attention being controlled by captions…”
“…Even though I'd normally turn captions on, I still don't like my attention being controlled by captions…”
“…But with captions, it can impose more cognitive load because you have to constantly attend to so many things. It can also be a little distracting…”
Visual learners under the full caption condition
In contrast, the visual learners’ average rating for the first item was 2.41, indicating that they tended to disagree that they attended mainly to the oral discourse while viewing the captioned video; this held even for the visual learners with high working memory capacity (M = 2.7). For the second item, the visual learners’ average rating was 4.47, falling between “agree” and “strongly agree” and suggesting that they attended mainly to the captions; this tendency was even more apparent among the visual learners with high working memory capacity (M = 4.87).
Unlike the auditory learners, whose comments were largely negative, the visual learners exhibited a more positive attitude towards having captions during their online viewing experience. Six of them explained that captions would not be distracting; five regarded captions as facilitative to their understanding of the audiovisual cues; and two stressed that captions made them feel secure by allowing them to verify whether the input they received was accurate. Specifically, having captions on made viewing feel less cognitively demanding, because they relied heavily on reading the captions. The following are a few examples from the interview data:
“…I don't find captions distracting, it can help me understand difficult terms…”
“…Watching video with captions is much easier (less cognitive load) for me; whereas without captions could be more demanding.”
“…Captions allow me to make sure that what I listen is correct. I really enjoy having captions because it allows me to verify whether my listening is correct…”
“When I turn captions on, it's less cognitively challenging because I can just read them. Without captions, however, may be more cognitively challenging for me because I have to force myself to concentrate and to not space out.”
Notably, five visual learners seemed to treat captions as an optional asset, namely, a multimodal input that they could draw on when needed:
“…I feel I pay more attention to listening, and only briefly attend to captions when I need details to facilitate my comprehension…”
“Captions can be helpful for understanding the content. If I don't want to look at it, I just ignore it.”
Auditory learners under the no caption condition
Seven auditory learners felt that captions were attention-demanding during video viewing. They were specifically concerned that captions directed their attention to specifics rather than to the overall tenor of the content. Accordingly, all of them preferred not having captions on during online viewing, because they were concerned that captions were not conducive to their listening comprehension:
“…Captions may be helpful for understanding technical terms. I wouldn't turn captions on for this video because it's more focusing on understanding ‘central ideas’…”
“…I think if we were to train our listening comprehension skills, having captions on may be less desirable…”
“…Captions can draw a lot of attention and maybe this is not helpful for training L2 listening...”
Visual learners under the no caption condition
Twelve visual learners stated that they found captions attention-demanding, but not distracting. Importantly, five of them explicitly noted “I typically find captions helpful in promoting my comprehension, but viewing non-captioned videos (i.e., the TED talk video) also does not bother me at all because the video content already provides a lot of contextual visual clues.” Notably, these comments all came from the visual learners with high working memory capacity.
CHAPTER 5 DISCUSSION
The two research questions addressed in this study ask whether L2 learners’ preferred modality and working memory capacity modulate the effect of captioning. In particular, the extent to which the investigated variables interacted with each other is discussed in the ensuing paragraphs, with the aim of furthering the understanding of captioning in the context of differentiated teaching and learning.
Before turning to the research questions, one finding deserves mention because it reinforces the need to consider L2 learners’ input processing profiles (i.e., preferred modality and working memory capacity), especially when video instructional materials are used. The quantitative results showed that when these profiles were not taken into consideration, L2 learners, irrespective of their caption viewing condition, did not differ significantly in their listening comprehension performance. The comparable outcomes from the caption and no-caption conditions indicated that the presence or absence of captions did not significantly influence L2 learners’ listening comprehension.
Although the overall effect of captions did not reach significance in this study, a larger variance in listening scores was observed in the full caption condition (SD = 1.77). Notably, when L2 learners’ input processing profiles were considered, this variance became greater, especially in the conditions with full captions. The larger variance in the full caption conditions indicated that the effect of captions may still exist, albeit selectively. Such selectiveness corroborated a point made in Chapter 2, namely that not all L2 learners benefit from captions. As also pointed out in the previous sections, there is still a discrepancy in the literature regarding whether captions are truly beneficial to L2 listening comprehension. The ensuing sections therefore present the insights obtained from this study, with the aim of helping to resolve this discrepancy.
5.1 RQ1: Does preferred modality modulate the effect of captioning?
The first research question concerns whether participants’ preferred modality (i.e., visual or auditory) modulates the effect of captioning on L2 learners’ listening comprehension. Statistical results revealed that this input processing factor significantly modulated the effect of captioning. The results also showed that modality preference alone resulted in a statistically significant difference in the L2 learners’ performance data. Both findings lent support to the hypothesis proposed in Chapter 2 that modality predilections, as an input processing factor, could modulate the effect of captions. Since preferred modality is theoretically and empirically pertinent to processing multimodal input, the following section discusses the finding vis-à-vis both Mayer’s CTML and existing empirical evidence from L2 caption research.
5.1.1 Effects of caption mode on auditory learners
The quantitative data revealed that L2 learners who were classified as auditory learners performed best under the no caption condition. When they were compared with their counterparts in the captioned condition, this no-caption advantage became even more statistically salient. It therefore became clear that L2 auditory learners understood the audiovisual information best when immersed in a multimodal environment without captions.
The prominent no-caption advantage manifested in the auditory learners may be accounted for by Mayer’s active processing assumption in CTML. In theory, multimodal processing requires efficient learner control in selecting the most relevant input to aid comprehension (Hasler, Kersten, & Sweller, 2007; Mayer, 2001). Similarly, L2 learning contexts require learners to do the same, that is, to allocate attention efficiently to cues that are directly beneficial to listening comprehension (Taylor, 2005).
In this case, L2 learners are most likely to depend on the input that matches their modality preference. Furthermore, Oxford (2003) hypothesized that, in order to generate the most desirable learning outcome, what is preferred has to match what is presented. As the auditory learners in this study were found to be a case in point, it is reasonable to infer that the no-caption condition may be a more favorable environment for them to optimize their multimodal listening outcome.
The observation that having no captions appears to be the optimal listening condition for the auditory learners indicates that visual support (in this case, any form of captions) disrupted, rather than facilitated, auditory learners’ understanding of multimodal video content. This may explain why their comprehension was impaired when captions were provided. This finding contradicts previous studies in which captions were found to aid the listening comprehension process (Chai & Erlam, 2008; Danan, 2004; Markham & Peter, 2003; Sydorenko, 2010; Taylor, 2005; Vanderplank, 2010; Winke et al., 2010). In fact, prior research maintained that captions were widely used as a strategy to facilitate perceptual processing in listening (Goh, 2000). If captions are designed to facilitate the encoding of the acoustic message, why were they detrimental to the auditory learners’ listening comprehension?
One possible explanation may lie in the disturbance of L2 learners’ optimal processing channel. The auditory learners’ preferred processing channel is what they hear rather than what they see. This speculation is supported by the qualitative data reported earlier: in a captioned video viewing environment, the presence of captions may have reoriented their attention to the visual input, the channel that is not their processing preference.
Moreover, the auditory learners’ questionnaire responses suggested that the presence of captions may have divided their attentional resources. With limited attentional resources at hand (Mayer, 2001), it is possible that such demand from captions made it difficult to process information from their optimal and most preferred channel. Furthermore, two thirds of the auditory learners did not express a positive attitude towards receiving captions during their video watching. Both data sources, quantitative and qualitative, jointly indicated that the presence of captions may not create the optimal multimodal processing environment for the auditory learners. This finding, in turn, helps shed light on how to recalibrate listening strategies for those who prefer to “listen” in a multimodally enhanced environment.
5.1.2 Effects of caption modes on visual learners
Although captions had a notable effect on the auditory learners, no such effect manifested in the visual learners’ listening outcome. This study found that the listening scores of the visual learners in the caption condition did not differ significantly from those in the no caption condition. Without the assistance of captions, visual learners were able to achieve a level of understanding similar to that of those assisted by the visualized text. This finding indicated that for visual learners, captions may not be the only scaffold facilitative to their listening comprehension.
Since visual learners tend to rely on what they see, it is possible that even without captions, they would still rely on other available elements presented visually. Nonlinguistic input (e.g., facial expressions and body language), for example, may contain subtle but rich contextual cues that provide assistance equal to the effect of captions. As evidenced in Sueyoshi and Hardison’s (2005) study, paralinguistic elements, such as gestures and facial cues, can assist the understanding of videotaped lectures, a type of video content similar to the TED video used in this study. This may explain why, without captions, L2 visual learners were still able to comprehend at a depth comparable to that of their captioned counterparts in a multimodal listening environment.
Aside from the comparable performances of L2 visual learners under the full and no caption conditions, closer examination of the standard deviation revealed visible variance in the visual learners’ listening outcome. Such variance was also manifest in the qualitative records. Unlike the auditory learners, whose qualitative data were more consistent in rejecting the effect of captions, the visual learners’ views of captions were more varied: while some questioned the benefits of captions, others held a more positive attitude towards the online visual aid. Specifically, some reported that captions were (1) not distracting, (2) facilitative to their understanding of novel words (see Winke et al., 2013), and (3) useful for resolving ambiguity, which helped relieve their anxiety (see Vanderplank, 2010). Notably, of all the participants in this study, only the visual learners reported that having captions on made viewing feel less cognitively demanding. The statistical variance, along with the visual learners’ qualitative remarks, raised the question of whether the effect of captions was in fact “selective” among L2 visual learners, depending on other intra-learner factors.
5.2 RQ2: Does working memory capacity modulate the effect of captioning?
With the aforementioned speculation in mind, the results for the second research question showed that working memory capacity is an important intra-learner factor at play during multimodal input processing. Although working memory alone did not modulate the effect of captioning, the quantitative data revealed a significant interaction among working memory, L2 learners’ modality predilection, and caption mode. Specifically, working memory significantly determined the role of learners’ preferred modality in their listening performance under the different caption conditions (i.e., full and no caption).
The deciding role working memory played indicated its prominent impact on comprehension in a multimodal L2 video viewing environment, a finding that corroborated Miyake and Friedman (1998). Such an impact lent support to the hypothesis of this study, which stemmed largely from Mayer’s CTML. Based on Mayer’s cognitive theory, the degree to which learners benefit from multimodal input depends on how proficiently they utilize the incoming sources of information. Since listening is multimodal and transient in nature, L2 learners have to be efficient in their online input processing. It can then be hypothesized that those who are less automated in storing and processing information simultaneously may find it difficult to utilize multimodal input, a phenomenon that may undermine the effect of captions, as discussed in the ensuing paragraphs.
5.2.1 Effects of caption modes on L2 learners with lower working memory capacity
As hypothesized, the effect of captions was absent from the listening performance of both the visual and the auditory L2 learners with lower working memory capacity. In fact, the presence of captions was neither facilitative nor debilitative for either type of L2 learner with lesser cognitive capacity. This phenomenon resonates with Mayer’s assumptions in CTML, in which he claimed that although multimedia-enhanced learning has the potential to enhance L2 learning outcomes, its promising effect is constrained by the limited capacity of working memory (Mayer, 2001, 2005a).
With limited capacity to simultaneously store and process ongoing information during video watching (Baddeley, 2003), those with lower working memory capacity may struggle to execute multimodal input processing. They may not be able to effectively and actively allocate attention to utilize enriched input while watching the video. Given this limited capacity, it is reasonable to postulate that the demand of attending to various inputs in a multimodal learning environment might have exceeded their cognitive capacity, as the presence of captions made little difference to either visual or auditory L2 learners’ listening comprehension. Consequently, the multimodal nature of the video, irrespective of caption conditions and modality predilections, may not have made a significant difference for those with lesser working memory and, in turn, did not benefit their listening comprehension outcome.
5.2.2 Effects of caption modes on L2 learners with higher working memory capacity
Contrary to the findings on L2 learners with lower working memory, the interaction between the effect of captioning and modality predilections became salient in the listening performances of L2 learners with higher working memory capacity. Both auditory and visual learners with higher working memory performed best under their preferred input combinations. In other words, the visual learners significantly benefited from the captioned video, while the auditory learners performed best without captions. The fact that the effect of captions was determined by their idiosyncratic modality preference again reinforced Mayer’s active processing assumption in CTML: learners tend to rely on their most preferred input for the most desirable learning outcomes (Mayer, 2001, 2005a). This match-making phenomenon generated optimal results for both types of learners with higher (i.e., average and/or above average) working memory capacity in this study, which again supported the hypothesis proposed in Oxford (2003). Specifically, the quantitative results obtained from the auditory learners with higher working memory capacity concurred with the findings discussed in the earlier paragraphs, as the presence of captions may have disrupted their listening comprehension, as evidenced in their listening scores.
As for their visual counterparts, full textual support led to a more advantageous listening outcome, a finding contrary to what was observed in the auditory learners. Such an effect may be attributed to their reported tendency to pay more attention to their most preferred source of input, visualized text, rather than the aural stream, a match that generated the best multimodal input processing scenario for them. Aside from the advantage of being immersed in their most preferred processing environment, having higher attentional control could also explain the visual learners’ achievement. Based on the qualitative data, some of them reported that captions were not distracting or cognitively demanding. They also assessed themselves as more capable of attending to the multimodal input, which allowed them to dexterously utilize any given input during online processing (Colflesh & Conway, 2007). More control in processing implies more use of captions to enhance form-meaning mapping that aids listening comprehension (Chai & Erlam, 2008; Danan, 2004; Markham & Peter, 2003; Sydorenko, 2010; Taylor, 2005;