Chapter 2 Literature Review
2.7 Fluency and Pauses
2.7.5 Pauses in interpreting
In Rennert (2010), factors of fluency, including pauses, were manipulated in two versions of an interpretation. In that study, filled pauses such as “umm” and “ahh” were termed “hesitations”, so it can be presumed that the “pauses” in the study referred to silent pauses. In the fluent version, pauses in non-syntactic positions were shortened or removed and additional pauses were inserted at syntactic positions, while in the non-fluent version, pauses were added at non-syntactic positions and existing pauses were lengthened. A survey was conducted to see how 47 business students perceived the fluency of the interpretation. The disfluent version was indeed perceived as less fluent by the listeners. Moreover, the results showed a link between perceived fluency and perceived accuracy, suggesting that lower fluency may have a negative impact on listeners’ impression of interpreting quality.
Macías (2006) also explored the relation between quality perception and disfluency, focusing solely on silent pauses. In her experiment, subjects were shown three videos of simulated simultaneous interpreting. Two of the videos had 7 and 13 additional pauses, respectively, inserted into the interpretation. The video without any inserted pauses received the highest ranking in fluency, among other parameters. However, it must be noted that none of the results reached statistical significance.
Both Rennert (2010) and Macías (2006) explored pauses as a parameter of fluency from the listeners’ perspective. Other researchers have focused on how pauses in interpreters’ performance may be a function of source speech rate (Cecot, 2001; Piccaluga, Nespoulous and Harmegnies, 2005), noise in the source speech (Piccaluga et al., 2005), directionality of the interpreting task (Chiang et al., 2009; Mead, 2000; Piccaluga et al., 2005) and the amount of training or expertise (Chiang et al., 2009; Piccaluga et al., 2005). Given the established relation between pauses and speakers’ cognitive processes in spontaneous speech, pauses have been viewed as “a
window on cognitive processing” in interpreting studies (Piccaluga et al., 2005, p. 151), reflecting the cognitive load imposed by features of the source speech and by directionality, or the load faced by interpreters at different stages of training.
Cecot (2001) conducted a descriptive analysis of the occurrence of non-fluencies. Occurrences of disfluencies and unfilled pauses in the output of 11 professional interpreters were categorized and counted. The source texts for interpreting were recorded at two different speech rates. Pause duration and function were compared between the source speech and the translation. A questionnaire was administered at the end to see whether the interpreters were aware of the pauses in their delivery and how they perceived the use of pauses. More disfluencies than unfilled pauses were found in the translations of both texts. Segmentation pauses were used more often than hesitation pauses. Most of the interpreters were not aware of the use of pauses in their translation. Interestingly, it was also found that women tended to produce more disfluencies than men did.
Unlike Macías (2006), Tissi (2000) did not manipulate pauses in the source speech, as she used real speeches delivered in a conference context as her experimental materials. She observed only the differences between the disfluencies occurring in the source speech and those in the oral translation. The translation was found to contain fewer silent pauses, which were nevertheless longer in total. In addition, the translation had a slightly higher number of grammatical pauses.
Studies on sight translation and simultaneous interpreting have also shown that disfluencies, including pauses, in interpreters’ output increase with their cognitive load (Shreve et al., 2011). Shreve et al. (2011) found that syntactically more complex areas, which presumably call for more cognitive effort from the subjects, saw more and longer disfluencies in sight translation output. The higher cognitive demand from syntactic complexity was supported by eye-tracking metrics,
such as greater numbers of fixations, longer fixation durations, and greater numbers of regressions. They also found disfluency rates in their experimental sight translation three times higher than the rates in spontaneous speech (Fox Tree, 1995), suggesting that sight translation is more cognitively demanding than spontaneous speech.
Moreover, complex syntax was found to be more disruptive to sight translation than to written translation.
In Mead (2000), both the total number of pauses and the number of filled pauses were found to be significantly higher in English (the students’ B language), though silent pauses alone did not differ significantly. Piccaluga et al. (2005) also found that pauses increased when cognitive load (from noise in the original speech or from translating into a less familiar language) increased. However, they also found that when the expertise/task difficulty ratio was high, the number of pauses was low. This finding seems to suggest that experts, i.e. those with more interpreting experience and/or language skills, are less cognitively strained when facing difficult tasks.
Chiang et al. (2009), on the other hand, showed how training reduced pause frequency and duration as well as the influence of directionality. They compared advanced interpreting students’ silent pause patterns during two-way sight translation with those of beginners. Beginners’ pauses were found to be significantly longer and more frequent than those of the advanced students. The beginners also paused more frequently when translating into English, their second language, while the advanced students showed no difference in pause duration or frequency between directions. In addition to the global patterns of pauses, the researchers also studied the distribution of pauses. More specifically, they singled out and examined “within-in constituent pauses”, i.e. “inappropriate pauses” or pauses “at ungrammatical positions.” Although the pauses were not described in further detail, the names indicate that they did not occur at syntactic junctures, suggesting that they could very likely be “hesitation pauses”, as defined in the previous section. It was discovered that advanced students produced significantly
fewer hesitation pauses than beginners, suggesting that training and practice may have helped students to avoid stopping at ungrammatical positions.
Many researchers have studied pauses in interpreting and their link with fluency and cognitive processes, but the biggest issue with many of these studies is that they did not distinguish between juncture and hesitation pauses (e.g. Cecot, 2001; Macías, 2006; Mead, 2000; Piccaluga et al., 2005; Shreve et al., 2011). Given the dissimilarity between the two types of pauses, looking only at global pause patterns would be a simplistic approach to the study of fluency.
Furthermore, although Piccaluga et al. (2005) found relations between pause patterns and variables related to the subjects’ cognitive load, i.e. language directionality of the interpreting tasks, linguistic and interpreting expertise, and noise interference in the source speeches, they did not elaborate on the “cognitive processing” involved, despite the title of the study. In Mead (2000), subjects were asked to listen to the recordings of their interpretation output and explain why they had paused. As in all retrospection studies, the subjects’ explanations may shed light on some of the cognitive difficulties they had encountered during the process of interpreting, but it is unlikely that they could remember the reasons for all the pauses. They might not even have been aware of some of the reasons, or even of some of the pauses, in the first place.
In view of these issues, Su (2013) distinguished between juncture pauses and hesitation pauses when studying pauses in novice interpreters’ sight translation performance. The novices’ oral output data were examined and then triangulated with their eye-movement data, taken as a reflection of cognition, to better understand the interpreters’ cognitive processing during the two types of pauses.
Chapter 3 Oral Data Analyses
3.1 Data source and collection
This research is an extension of three previous studies on eye movements during sight translation: Huang (2011), Chen (2013), and Su (2013). Huang asked 18 novice interpreters to take part in an experiment involving silent reading, reading aloud and sight translation. The subjects were all students aged 23 to 40 in translation and interpreting programs in Taiwan who had completed sight-translation courses in the first year of their program. The three tasks were given in random order to avoid learning or fatigue effects. For each task, after a practice session with one paragraph of text, the subjects were given two passages of approximately 150 Chinese characters to read or sight-translate. For the formal experiment, a total of six paragraphs were grouped into three blocks, which were assigned to the three tasks in rotating order. The texts were not manipulated linguistically. They were taken from authentic Chinese speeches on general topics that required no prior knowledge.
The interpreters’ eye movements and oral output were recorded. Eye-movement data from the three tasks were examined and systematically compared to better understand the interpreters’ reading processes and, possibly, their cognitive processes. Theoretically, all three stages of the interpreting process (comprehension, reformulation and production) are involved in sight translation. Silent reading, on the other hand, involves only comprehension, while reading aloud requires additional effort for production. It was therefore argued that through systematic comparison of eye-movement indices from the three tasks, the three components could be isolated for a better understanding of their cognitive significance and their role in the whole task.
Chen (2013) replicated Huang’s (2011) experiment with 18 experienced interpreters as her subjects. At the time of the experiment, they were all active interpreters, aged 29 to 58, with 7.6 years of experience on average. All of them had at least 150 days of work experience, meeting the criterion for “experienced interpreters” (AIIC, n.d.). Fifteen of them had received training at graduate translation and interpreting (T&I) institutes or programs, and fourteen of them had an MA in T&I. As with the novices, the experienced interpreters’ eye-movement indices were examined for more information on the effort put into the three tasks. The results were then compared with the data from the novices (Huang, 2011). Differences in the two groups’ eye-movement patterns would reflect differences in their cognitive processes at a certain stage during the tasks. In addition, Chen had the oral output of all novice and experienced interpreters evaluated to verify whether the experienced interpreters’ performance quality was indeed better than the novices’.
Su (2013) focused on pauses in the novices’ sight translation output (Huang, 2011). She marked out silent pauses in the novices’ output recordings, and further divided them into juncture pauses and hesitation pauses based on the pauses’ position in the output. Then she triangulated the oral output data with the eye-movement data to find fixations that occurred during the pauses. By setting and examining parameters for the tangible eye-movement data, cognitive characteristics of the two types of pauses could be verified.
Following the methodology of Su (2013), the current research examined pauses in the experienced interpreters’ oral output (Chen, 2013), triangulated them with the interpreters’ eye-movement data, and compared the results with those of Su (2013). As in Su (2013), only data collected from the sight-translation task were used and examined.
3.2 Data Processing
3.2.1 Selection of observation points
Following Su (2013), all silent pauses longer than 200ms, i.e. periods of time without any linguistic output, including fillers such as “uh” or “um”, were marked. The threshold was based on Goldman-Eisler (1968), according to whom pauses shorter than 200ms are neither easily noticeable to listeners nor cognitively significant. Moreover, Boomer and Dittmann (1962) found that listeners could readily detect hesitation pauses longer than 200ms.
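The 200ms criterion can be illustrated with a minimal sketch. It assumes that the stretches of linguistic output (including fillers) have already been marked as onset/offset times in milliseconds; the function name `find_silent_pauses` and the sample values are hypothetical and are not taken from the study's data.

```python
MIN_PAUSE_MS = 200  # threshold stated in the text


def find_silent_pauses(speech_segments, min_pause_ms=MIN_PAUSE_MS):
    """Return (onset, offset, duration) for every silent gap longer than the threshold.

    speech_segments: list of (onset_ms, offset_ms) tuples covering all
    linguistic output, fillers such as "uh" and "um" included.
    """
    ordered = sorted(speech_segments)
    pauses = []
    # A pause is the gap between the end of one segment and the start of the next.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end
        if gap > min_pause_ms:  # strictly longer than 200ms
            pauses.append((prev_end, next_start, gap))
    return pauses


# Invented example: gaps of 150ms, 297ms and exactly 200ms.
segments = [(0, 1500), (1650, 3000), (3297, 5200), (5400, 6000)]
pauses = find_silent_pauses(segments)
print(pauses)
```

Only the 297ms gap qualifies in this example; a gap of exactly 200ms is excluded because the criterion is "longer than 200ms".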
Audacity, a digital audio editor, was used to display and, if necessary, amplify the waveforms of the subjects’ output recordings (see Figure 3-1). The onset and offset times as well as the duration of each pause were noted down. These data would later be compared with the onset time and duration of fixations in the eye-movement data to determine which eye fixation(s) occurred during the pauses (see Chapter 4, Eye-movement Data Analyses).
Figure 3-1. Selection of Pauses in Audacity
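The comparison of pause times with fixation times can be sketched as an interval-overlap check. This is only an illustration under stated assumptions: the text does not specify the exact matching rule, so the sketch assumes that a fixation counts as occurring during a pause if the two intervals overlap at all. The function name, the dictionary keys and the sample values are hypothetical.

```python
def fixations_during_pause(pause, fixations):
    """Return the fixations whose time span overlaps the pause.

    pause: (onset_ms, offset_ms)
    fixations: list of dicts with "onset" and "duration" in ms
    Assumption: any temporal overlap counts as "during the pause".
    """
    pause_onset, pause_offset = pause
    hits = []
    for fix in fixations:
        fix_end = fix["onset"] + fix["duration"]
        # Two intervals overlap when each starts before the other ends.
        if fix["onset"] < pause_offset and fix_end > pause_onset:
            hits.append(fix)
    return hits


# Invented example values, not taken from the recordings.
pause = (16697, 16994)
fixations = [
    {"word": "A", "onset": 16500, "duration": 300},  # ends at 16800: overlaps
    {"word": "B", "onset": 16810, "duration": 250},  # starts inside the pause
    {"word": "C", "onset": 17000, "duration": 200},  # starts after the pause
]
hits = fixations_during_pause(pause, fixations)
print([f["word"] for f in hits])
```

A stricter rule, for instance requiring the fixation to start within the pause, would change only the condition in the `if` statement.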
The number of hesitation pauses and juncture pauses in the experienced interpreters’ performance was calculated so that the distribution of the two types of pauses within the group of subjects could be determined and that inter-group comparison could be made between the experienced interpreters and the novices.
Su (2013) examined 200 observation points, namely 87 juncture pauses and 113 hesitation pauses, collected from the output of 11 of the 18 subjects. Of the six paragraphs, the selected data covered three, namely Paragraphs 1, 4 and 6. Each participant sight-translated one of the three paragraphs. However, during the course of the current study, it was discovered that the 200 observation points in Su’s (2013) oral data analysis had not been selected solely on the basis of the acoustic and temporal features of pauses in the oral data, i.e. a period of silence more than 200ms long, but had been juxtaposed with the eye-movement data so that some of the points were excluded. The pauses were excluded because they came with “out of range” fixations, i.e. fixations whose positions were not recorded by the eye-tracker because the subjects blinked or fixated somewhere outside the Regions of Interest (ROIs) assigned by the experiment designers.
Yet, if “all the observation points were first analyzed with the oral data only” (Su, 2013, p.46), why should some of the points be excluded because of errors that occurred in the eye-movement data? Moreover, the exclusion may have posed a problem: it could change the proportion of juncture pauses to hesitation pauses, which was an important element in the analysis of pause patterns.
Based on this reasoning, pre-exclusion pause data from 18 experienced interpreters (Chen, 2013) and 11 novice interpreters (Huang, 2011; Su, 2013) were used for the oral output analysis in the current study. All sight-translation output data from the experienced interpreters for the above-mentioned Paragraphs 1, 4 and 6 were examined. Each of the 18 participants sight-translated one of the three pieces of text, making a total of 18 tasks to be analyzed. The raw data of Su’s research were also retrieved and re-examined so that the number of pre-exclusion observation points could be found. In the end, for the oral output analysis, 300 observation points, i.e. pauses, were found in the experienced interpreters’ data, and 298 were found in the novices’.
3.2.2 Annotated protocols
The outputs were then transcribed by the researcher. All disfluencies, from silent pauses and filled pauses to false starts, were identified through repeated listening and written into the transcript, which became a thorough annotated protocol.
With the output transcribed, it was easier to categorize a pause as a “juncture pause” or a “hesitation pause”, based on its position in the transcription. Juncture pauses were pauses that occurred at syntactic junctures, “namely where punctuations like comma and period were noted” (Su, 2013, p.43). All other pauses were marked as hesitation pauses.
In the transcripts, a pause mark “^” that came after a comma or a period indicated a juncture pause, while the remaining marks, placed between words, showed the positions of hesitation pauses. Ex. 3-1 and 3-2 are excerpts from the annotated protocols. The pause types can be inferred from the positions of the pause marks in the text. Information about each pause is given in brackets in the following order: the sequential number of the pause, its onset and offset times, the pause duration, and the words fixated during the pause. Disfluencies such as filled pauses (“uh”, “um”) and false starts (“do not um lay all the”) have all been noted down in the transcripts.
Ex. 3-1
Different people fit different kinds of investment method.
^[5:16697-16994:297ms;的→方式] For example, aggressive people
^[6:18758-19824:565ms;人→積極型→積極型→積極型] fit uh
^[7:20649-21607:956ms;適合→機會] it’s the uh investment opportunities that are more profitable but also more risky. ^[8:27079-27587:507ms;風險]
Ex. 3-2
Those aggressive people ^[7:19991-21091:1099ms;比如→積極型→的→適合]
are good for those high ^[8:23092-23424:332ms;適合] risk
^[9:23913-24223:309ms;機會] but high ^[10:24862-24548:686ms;風險→也→機會] um return ^[11:26542-26803:260ms;機會] investment
^[12:27605-28437:831ms;獲利→機會→獲利→獲利] product.
^[13:29085-29739:654ms;機會→獲利→東西] But also, do not um lay all the
^[14:32663-33102:439ms;雞蛋] do not keep all the eggs in the same
basket.^[15:35928-37493:1564ms;都→同一→籃子→裡;反過來說→反過來說]
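The annotation format and the position-based classification described above can be sketched as a small parser. This is an illustration only: the regular expression and the function name `parse_pauses` are assumptions about how such a protocol could be processed, not the procedure actually used in the study. The sample transcript is Ex. 3-1 joined into a single string.

```python
import re

# Matches marks such as ^[5:16697-16994:297ms;的→方式]
PAUSE_MARK = re.compile(r"\^\[(\d+):(\d+)-(\d+):(\d+)ms;([^\]]*)\]")


def parse_pauses(transcript):
    """Extract pause marks and classify each by its position in the transcript."""
    pauses = []
    for match in PAUSE_MARK.finditer(transcript):
        number, onset, offset, duration, words = match.groups()
        # Text before the mark, ignoring trailing spaces and line breaks.
        preceding = transcript[:match.start()].rstrip()
        # A mark right after a comma or period is a juncture pause;
        # every other mark is treated as a hesitation pause.
        is_juncture = preceding.endswith((",", "."))
        pauses.append({
            "number": int(number),
            "onset_ms": int(onset),
            "offset_ms": int(offset),
            "duration_ms": int(duration),
            "fixated": words.split("→") if words else [],
            "type": "juncture" if is_juncture else "hesitation",
        })
    return pauses


transcript = (
    "Different people fit different kinds of investment method. "
    "^[5:16697-16994:297ms;的→方式] For example, aggressive people "
    "^[6:18758-19824:565ms;人→積極型→積極型→積極型] fit uh "
    "^[7:20649-21607:956ms;適合→機會] it's the uh investment opportunities "
    "that are more profitable but also more risky. ^[8:27079-27587:507ms;風險]"
)
pauses = parse_pauses(transcript)
for p in pauses:
    print(p["number"], p["type"], p["duration_ms"])
```

Counting the `type` values over a whole protocol would give the per-subject numbers of juncture and hesitation pauses used in the analysis that follows.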
3.3 Oral data analysis results
As explained in the previous section, all silent pauses over 200ms in the oral output were noted and kept for analysis. When the oral output data alone were considered, a total of 300 pauses were found in the 18 experienced interpreters’ performance: 152 (50.67%) juncture pauses and 148 (49.33%) hesitation pauses. The 11 novice interpreters, on the other hand, made 298 pauses: 125 (41.95%) juncture pauses and 173 (58.05%) hesitation pauses. (See Table 3-1.)
Table 3-1. Number of pauses in the oral data analysis
Group Juncture Pauses Hesitation Pauses All Pauses
Novice (11) 125 (41.95%) 173 (58.05%) 298 (100%)
Exp. (18) 152 (50.67%) 148 (49.33%) 300 (100%)
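The percentages in Table 3-1 and the per-subject means reported later follow directly from the raw counts. The short sketch below reproduces that arithmetic; the counts and group sizes are those given in the table, while the function name `summarize` and the dictionary layout are illustrative choices.

```python
counts = {
    "Novice": {"n": 11, "juncture": 125, "hesitation": 173},
    "Exp.": {"n": 18, "juncture": 152, "hesitation": 148},
}


def summarize(group):
    """Compute pause-type percentages and mean pauses per subject for one group."""
    total = group["juncture"] + group["hesitation"]
    return {
        "total": total,
        "juncture_pct": round(100 * group["juncture"] / total, 2),
        "hesitation_pct": round(100 * group["hesitation"] / total, 2),
        "mean_per_subject": round(total / group["n"], 2),
    }


summary = {name: summarize(group) for name, group in counts.items()}
for name, values in summary.items():
    print(name, values)
```

Although the two group totals are nearly identical, dividing by group size shows that the number of pauses per subject differs considerably between the groups.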
The first thing that caught the eye was that the 11 novices made almost as many pauses as the 18 experienced interpreters did, meaning that each novice paused much more often on average than each experienced interpreter. In fact, the novices not only made more pauses overall on average; their mean numbers of juncture pauses and of hesitation pauses were also higher (see Table 3-2).
Table 3-2. Average juncture pauses and hesitation pauses made by the experienced interpreters and the novices
Group Statistics
Measure Group N Mean Std. Deviation Std. Error Mean
JPs Exp. 18 8.44 2.617 .617
JPs Novice 11 11.36 2.580 .778
HPs Exp. 18 8.22 5.786 1.364
HPs Novice 11 15.73 3.901 1.176
All Pauses Exp. 18 16.67 7.776 1.833
All Pauses Novice 11 27.09 5.752 1.734
Abbreviations: JPs (Juncture Pauses); HPs (Hesitation Pauses); Exp. (Experienced Interpreters)
The inter-group difference was confirmed by an independent-samples t-test. The novices made significantly more pauses (M = 27.09) when sight-translating a 150-character Chinese paragraph than the experienced interpreters did (M = 16.67), t(27) = -3.839, p < .01. A significant inter-group difference was also found in the number of juncture pauses (Novices: M = 11.36; Experienced interpreters: M = 8.44), t(27) = -2.930, p < .01. The same held for the hesitation pauses made by the two groups (Novices: M = 15.73; Experienced interpreters: M = 8.22), t(27) = -3.794, p < .01. (See Table 3-3.)
Table 3-3. Independent t-test results for inter-group comparison of juncture and hesitation pauses
Independent Samples Test
Abbreviations: JPs (Juncture Pauses); HPs (Hesitation Pauses); Exp. (Experienced Interpreters)
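The reported t values can be approximately recovered from the group statistics in Table 3-2. The sketch below assumes a pooled-variance (equal variances assumed) t-test, which is consistent with the reported 27 degrees of freedom; the function name `pooled_t` is illustrative, and small discrepancies from the reported values are expected because the inputs are rounded.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance (equal variances assumed)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Experienced interpreters first, novices second, values from Table 3-2.
t_all, df_all = pooled_t(16.67, 7.776, 18, 27.09, 5.752, 11)
t_jp, df_jp = pooled_t(8.44, 2.617, 18, 11.36, 2.580, 11)
t_hp, df_hp = pooled_t(8.22, 5.786, 18, 15.73, 3.901, 11)

for label, t, df in [("All", t_all, df_all), ("JPs", t_jp, df_jp), ("HPs", t_hp, df_hp)]:
    print(f"{label}: t({df}) = {t:.3f}")
```

The negative sign simply reflects the order of subtraction (experienced minus novice): the novices' means are the larger ones.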
Not only did the experienced interpreters make fewer juncture and hesitation pauses; paired t-tests also showed that the distribution of the two types of pauses in their performance differed from that of the novices. While the novices made significantly fewer juncture pauses (M = 11.36) than hesitation pauses (M = 15.73), t(10) = -4.434, p < .01, there was no significant difference between the numbers of the two types of pauses in the experienced interpreters’ data (Juncture pauses: M = 8.44; Hesitation pauses: M = 8.22), t(17) = 0.210, p > .05. (See Table 3-4.)
Table 3-4. Paired t-test for intra-group comparison between juncture and hesitation pauses
Paired Samples Correlations
Group Pair N Correlation Sig.
Novices JPs and HPs 11 .557 .075
Exp. JPs and HPs 18 .665 .003
Paired Samples Test
Group Pair Mean (Paired Differences) Std. Deviation (Paired Differences) t df Sig. (2-tailed)
Novices JPs - HPs -4.364 3.264 -4.434 10 .001
Exp. JPs - HPs .222 4.493 .210 17 .836
Abbreviations: JPs (Juncture Pauses); HPs (Hesitation Pauses); Exp. (Experienced Interpreters)
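The paired t values in Table 3-4 follow from the mean and standard deviation of the within-subject differences (juncture pauses minus hesitation pauses). The sketch below shows this relationship; the function name `paired_t` is illustrative, and the inputs are the rounded figures from the table.

```python
import math


def paired_t(mean_diff, sd_diff, n):
    """Paired-samples t statistic from the mean and SD of the paired differences."""
    std_error = sd_diff / math.sqrt(n)
    return mean_diff / std_error, n - 1


t_novice, df_novice = paired_t(-4.364, 3.264, 11)
t_exp, df_exp = paired_t(0.222, 4.493, 18)

print(f"Novices: t({df_novice}) = {t_novice:.3f}")
print(f"Exp.:    t({df_exp}) = {t_exp:.3f}")
```

The same table also explains the degrees of freedom given for the correlation in the experienced group: with 18 paired observations, a Pearson correlation has 18 - 2 = 16 degrees of freedom.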
Two additional findings emerged from the statistics. First, the experienced interpreters (SD = 5.786) showed greater variability than the novices (SD = 3.901) in the number of hesitation pauses they made (see Table 3-2). Second, the numbers of hesitation pauses and juncture pauses in the experienced interpreters’ data showed a moderate positive correlation, r(16) = 0.665, p < .01 (see Table 3-4).
Chapter 4 Eye-movement Data Analyses
4.1 Data Collection and Preparation
All silent periods longer than 200ms had been marked out as pauses in the oral data analysis, and had been categorized as either juncture pauses or hesitation pauses based on their position in the translation.
For the eye-movement data analysis, when the experienced interpreters’ oral output data were juxtaposed with their eye-movement data, it was discovered that the eye-movement data of one of the 18 subjects were incomplete. This subject was therefore dropped at this stage of the analysis, leaving 17 tasks to be analyzed. Data on the 17 subjects’ fixations, including their onset time, duration, position and the reading passes in which they occurred, were extracted from the raw data recorded by the eye tracker and put into Excel. The onset time and duration of the fixations were set against the onset and offset times of the pauses to determine which fixations fell within the pauses. The fixations that occurred during pauses were marked out on the Excel spreadsheets for