Chapter 1 Introduction
1.2 Research Questions
The purpose of the present study was to explore the process of sight translation through
tracking the eye movements of interpreting students during three tasks: silent reading of a
Chinese speech, reading aloud a Chinese speech, and sight translation of a Chinese speech
interpreting, they could be analyzed and compared to sight translation so that certain
components of interpreting could be further understood. In detail, the primary research
questions that we addressed are as follows:
(1) To investigate how and when the comprehension component in interpreting occurs in
sight translation;
(2) To examine whether the comprehension and reformulation components overlap in sight
translation, that is, to explore the validity of the vertical and horizontal perspectives in
interpreting;
(3) To ascertain whether the conventional wisdom of “reading ahead” is sound in sight
translation, that is, to determine whether the comprehension and production components
overlap during sight translation.
The remainder of this thesis is organized into four sections. Chapter 2 provides basic
definitions of important concepts and reviews previous studies on sight translation and eye
tracking. Chapter 3 describes the methodology, procedures, and results of an eye-tracking
experiment on silent reading, reading aloud, and sight translation. Finally, Chapter 4 offers
a general discussion of how the findings of the experiment can be applied to T&I and
makes suggestions for future research.
Chapter 2
Literature Review
2.1 Definitions: types of interpreting and language processing
It is vital to understand the types of interpreting, as they explain why sight
translation is of enormous value to interpretation training. Generally speaking, interpreting
can be categorized into two major modes: simultaneous interpreting (SI) and consecutive
interpreting (CI). Simultaneous interpreting (SI) means new input is continuously presented
and the interpreter comprehends the continuously incoming input while simultaneously
reformulating the message and producing it orally in the target language. In contrast to the
immediacy of SI, output in consecutive interpreting (CI) begins only after the speaker has
verbalized a group of words or sentences. The interpreter alternates between listening and
speaking, and only starts to translate after the speaker has finished speaking. In SI, the
interpreter multitasks and coordinates various efforts while in CI, the interpreter utilizes
note-taking skills and capitalizes on short-term memory skills. The two modes of
interpreting require distinct language processing skills, and T&I schools design different
training programs to equip interpreters with the relevant capabilities.
2.1.1 Simultaneous Interpreting (SI)
In real-life simultaneous interpreting (SI) settings, the speaker speaks continuously and
does not pause for the interpreter to render his/her oral translation. In conference settings,
the interpreter usually sits inside a sound-proof booth and wears headphones to listen to the
speaker’s delivery. While the interpreter listens to the speaker, he/she talks into a
microphone to the audience, who are also wearing headphones. The interpreter listens to the
source language and orally produces the target language while at the same time still
listening to the speaker’s continuously incoming segments. In other words, the interpreter
continuously hears new input while simultaneously comprehending the input and storing
segments of it in memory. While this is happening, an earlier segment has to be
reformulated mentally into the target language, and an even earlier segment has to be orally
produced (Christoffels & De Groot, 2005; Liu, Schallert, & Carroll, 2004). Furthermore,
the interpreter, while already listening to the source language and producing it in the target
language, has to listen and monitor his/her own speech production to ensure no mistakes are
made. The simultaneity of comprehension and production imposes a severe strain on
cognitive processing capacity. This is one of the reasons that SI is such a cognitively
demanding task, which also explains why professional interpreters normally work in pairs or
groups of three for 20-minute periods each (Christoffels & De Groot, 2004; Lambert, 2004).
same time, yet the interpreter’s orally produced content falls behind the speaker as
he/she needs to listen to and understand the message before producing the oral output.
Gile (1995) proposed an effort model for simultaneous interpreting:
Simultaneous Interpreting = listening and analysis effort
+ short term memory effort
+ speech production effort
+ coordination effort
The linguistic input is oral in SI, and therefore listening and analysis efforts both play
critical roles. The interpreter needs to store information that is heard in his/her short-term
memory for the time interval between the moment the speech is heard and the completion
of its target language production (Agrifoglio, 2004) while all of the aforementioned efforts
need to be coordinated.
Gile’s (1995) effort model suggests that SI is a cognitively demanding task since the
coordination of many types of efforts is required. Gerver (1976) also pointed out that SI is a
complex task for it involves perception, storage, retrieval, transformation, and transmission
of verbal information. As the most widely used form of interpreting in international
conference settings, SI has been a core subject in T&I training programs while students
have had to undergo rigorous training and extensive practice to master its skills.
2.1.2 Sight Translation (ST)
Sight translation (ST) is a form of interpreting in which the interpreter’s linguistic
input is in the written rather than the oral form. During the process of ST, the interpreter
reads the source text while rendering the oral interpretation in the target language (Weber,
1990). Unlike simultaneous interpreting, in which the interpreter has no control over the
speed of the input, the interpreter in ST can control the speed at which the written input is
perceived. However, the task of ST is still challenging because the demand on the quality of
oral production is very high. ST is perceived as an oral translation of a written text that
should sound as smooth as if the interpreter were merely reading a document written in the
target language (Angellini, 1999). Any pause of over 2 seconds would be considered an
error (楊承淑, 2005). The difficulty of ST lies in the fact that the interpreter needs to read
the source text, comprehend its content, translate and produce the speech in another
language while monitoring his/her oral production (Syysnummi, 2003). In this regard, ST
resembles simultaneous interpreting because it also involves multitasking.
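The fluency criterion above (a pause of over 2 seconds counts as an error) can be illustrated with a minimal sketch. It assumes the interpreter's output has already been segmented into stretches of speech with start and end times in seconds. The function name, the data layout, and the sample values are hypothetical and are not taken from the cited sources.

```python
PAUSE_THRESHOLD_S = 2.0  # from the text: a pause of over 2 seconds is an error


def find_long_pauses(speech_segments, threshold_s=PAUSE_THRESHOLD_S):
    """Return (pause_start, pause_end) pairs for silent gaps longer than threshold_s.

    speech_segments is assumed to be a list of (start, end) times in seconds,
    sorted by start time and non-overlapping (a hypothetical annotation format).
    """
    pauses = []
    # Compare the end of each segment with the start of the next one.
    for (_, prev_end), (next_start, _) in zip(speech_segments, speech_segments[1:]):
        if next_start - prev_end > threshold_s:
            pauses.append((prev_end, next_start))
    return pauses


# Made-up example: gaps of 0.5 s and 2.5 s between three stretches of speech.
segments = [(0.0, 3.5), (4.0, 7.0), (9.5, 12.0)]
print(find_long_pauses(segments))  # [(7.0, 9.5)]
```

Only the second gap exceeds the threshold, so only that pause would be flagged under this reading of the criterion.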
Gile (1995) proposed the effort model of sight translation:
Sight Translation = reading and analysis effort
+ speech production effort
According to Gile’s effort model, ST consumes reading and analysis efforts as well as
speech production efforts. Hence, ST is regarded as a combination of interpretation and
translation, which echoes Lambert’s views (1989). It has been argued that ST is difficult not
because of the written form of the source text but because the interpreter needs to
smoothly coordinate the reading, memory and production efforts while working to avoid
the interference of the source language.
Gile (1995) further explained that the listening and analysis effort becomes a reading
effort in ST while the production effort remains. Since information is always available on
paper, there does not seem to be a memory effort similar to the one in simultaneous or
consecutive interpreting. In contrast, Agrifoglio (2004) claimed that there seems to be a
memory effort involved in ST, which is similar to the short-term memory demands of
simultaneous interpreting, because the syntactic differences between languages may force
the interpreter to store some information in memory until it could be appropriately
produced in the target language. The two opposing claims still need to be tested by further
evidence and the results may vary between different language combinations. However, both
of these assumptions shed some light on the necessary efforts of ST, which serve as a
foundation for further research.
T&I scholars such as Weber (1990), Moser-Mercer (1994), Lambert (2004), and
Sampaio (2007) have highlighted the benefits of ST, which has been considered an ideal
pedagogy in interpreter training programs for several reasons: (1) the interpreter becomes
familiarized with the technical terms in context and develops immediate reflexes of these
terms; (2) the interpreter can rehearse speech texts thoroughly in advance before the actual
interpreting assignment; (3) the interpreter develops skills of speed reading and gives more
fluent production after reading the source text (Weber, 1990).
In terms of applications, ST is usually, though not exclusively, used in judicial and
medical interpreting. It is also an essential skill applied when the speaker reads from a
prepared speech. In the US, the National Association of Judiciary Interpreters and
Translators (NAJIT) offers a rigorous examination including two sight translation tests from
the first language into the second language and vice versa. In Brazil, professionals have to
qualify for exams administered by the Board of Trade which include sight translation to
become qualified sworn-in interpreters (Sampaio, 2007).
2.1.3 Simultaneous Interpreting with Text
with text). In SI with text, the interpreter receives two sources of input: listening to the
speaker’s oral presentation and also reading a written text. SI with text could be defined as
simultaneous interpreting with the extra task of sight translation. As opposed to sight
translation, SI with text is one step closer to simultaneous interpretation as the source
language is presented both orally and visually (Lambert, 2004). Usually in authentic
interpreting settings, the interpreter obtains the speaker’s text beforehand and the speaker
reads aloud the text during the actual speech. Although SI with text is not performed by
participants in this study, it is still worthwhile to mention since it is an extension of sight
translation and even one step closer to authentic interpreting settings.
In SI with text, the interpreter devotes efforts to both listening and reading. Gile (1995)
did not propose an effort model for sight interpretation, but judging from the efforts needed
in both sight translation and simultaneous interpreting, the efforts needed in SI with text
include the listening and analysis effort, reading and analysis effort, production effort, and
coordination effort.
To sum up, the benefits and importance of sight translation are clear.
ST encompasses all the essential abilities of a conference interpreter and
enhances the cognitive processing speed of the interpreter (Weber, 1990). At the same time,
the rapid and efficient visual-brain-vocal coordination required by ST standards serves as
the foothold which helps an interpreter master consecutive and simultaneous interpreting
skills (Sampaio, 2007).
2.1.4 Skills of Sight Translation
Weber (1990) pointed out that the guidelines for sight translation include the following:
(1) analyzing a text rapidly; (2) producing the meaning rather than a word-for-word
interpretation; (3) rapid conversion of information from one cultural setting (language) to
another; (4) public speaking techniques. Before actually proceeding with sight translation,
student interpreters should skim through the speech quickly while conducting segmentation
and making marks to indicate the order in which the speech will be interpreted (何慧玲,
1997).
From the author’s experience as an interpreting student and practitioner, strategies of
sight translation often taught by instructors of interpreting include the following:
(1) Scanning a document rapidly for content and style;
(2) Analyzing units of meaning which form each sentence;
(3) Anticipating syntactic rearrangement necessary in the target language;
(4) Rendering sight translation in the target language while reading ahead to prepare to
produce next units of meaning;
(5) Rendering sight translation with accuracy and fidelity to the text;
appropriate pauses and intensity, delivers the message with fluidity, and in a well
modulated voice.
Although certain guidelines have been proposed for training of ST, Sampaio (2007)
noted that the literature documenting sight translation pedagogy has been scant. This is
very likely due to the fact that no research findings are yet available concerning the
cognitive process of ST, which will be discussed in the following section.
2.2 The interpretation process and the comprehension phase
Theories of interpretation have noted the importance of the comprehension process in the
interpretation task (Dillinger, 1994). However, beyond the comprehension process,
interpreters perform a reformulation or code-switching process between the two languages
and produce the output in the target language. Generally speaking, interpreting can be
categorized into three components which include comprehension, reformulation (also
referred to as code-switching), and target language production (Gerver, 1976; Seleskovitch,
1976).
2.2.1 The vertical and the horizontal perspectives
Despite the fact that theorists agree about the components of interpreting
(comprehension, reformulation, and production), there exist two different views on the way
these operations occur, namely, the vertical perspective and the horizontal perspective
(Macizo & Bajo, 2004, 2006).
The vertical perspective is also referred to as the meaning-based strategy. The
interpreter is thought to retain the meaning of information chunks during comprehension to
reformulate the meaning, and to produce it in the target language (Fabbro & Gran, 1994).
Meaning-based interpreting is conceptually mediated and the input is fully comprehended
in a way similar to ordinary comprehension. The interpreter’s job is to give lexical
expression to the meaning extracted from the full comprehension of the input. The vertical
perspective is also in line with the deverbalization theory proposed by Seleskovitch (1976).
The theory claimed that interpreting involves first the processing of information in the
source language to obtain its meaning. Second, after the comprehension process is
complete, the message is restructured according to target language grammar while specific
linguistic form of the source language is discarded. This is the so-called deverbalization
process, which occurs only after the comprehension process has been completed. The
message is then reformulated to be produced in the target language. According to this
strategy, interpreting involves full comprehension of the source language in a way similar
to common comprehension of speech (Christoffels & De Groot, 2005; Macizo & Bajo,
2006). Therefore, from the vertical perspective, comprehension and reformulation are
performed sequentially rather than concurrently without any direct links between the source
language and target language at the lexical/syntactic levels of analysis (Macizo & Bajo,
2004, 2006).
Figure 1 (Macizo & Bajo, 2004, 2006) shows the sequence of processes involved in
interpreting under the vertical perspective/meaning-based strategy. The left hand side refers
to the interpreter’s understanding in the source language (SL) while the “abstract” part
indicates the extraction of the meaning of the SL. The right hand side shows the production
in the target language (TL) after obtaining the meaning of the original message.
Figure 1. The vertical perspective/meaning-based strategy (Macizo & Bajo, 2004)
In contrast to the vertical perspective, there is an alternative view called the horizontal
perspective, or the transcoding strategy. The horizontal perspective sees interpreting as the
direct processes of recoding from one linguistic code to another. The interpreter may
engage in partial reformulation and seek the equivalent of the smallest meaningful unit in
the TL while still reading and comprehending the source text. The lexical units in the TL
are supposed to be activated continuously in a parallel manner, before the source language
(SL) meaning units are fully comprehended. The horizontal approach has also been referred
to as a word-based or word-for-word strategy (Fabbro, Gran, Basso, & Bava, 1990).
However, it does not mean literally that words per se serve as the transcoding unit in
interpreting. Rather, it indicates that each meaning unit is reformulated before the
comprehension process of that meaning unit has been completed. In other words,
comprehension and reformulation occur concurrently rather than serially, which is opposed
to the claim of the vertical perspective.
Figure 2 (Macizo & Bajo, 2004) shows the sequence of processes involved in
interpreting from the horizontal perspective/transcoding strategy. The left hand side refers
to the interpreter’s understanding in the source language (SL) while the arrows pointing
from the TL to the SL at the lexical, syntactic, and discourse levels indicate the ongoing
transcoding or reformulation processes at different levels during the course of the
comprehension process. The right hand side shows the production in the TL after
reformulation of the original message.
Figure 2. The horizontal perspective/transcoding strategy (Macizo & Bajo, 2004)
It should be noted that the horizontal and the vertical perspectives are possible
accounts of interpretation strategies rather than proven theories. Also, the unit of processing
for these two perspectives has not yet been specified in the literature. However, what is certain
is that for the vertical perspective, SL comprehension plays a pivotal role in interpreting.
There is no parallel access to the TL as the interpreter receives SL input. Because
reformulation only occurs after interpreters have extracted the SL meaning, normal reading
and reading for the purpose of interpreting should impose similar demands on the
interpreter’s working memory. In contrast, for the horizontal perspective, partial
reformulation already takes place while the interpreter reads the SL. The partial
reformulation process consumes working memory and places a greater load on
cognitive resources than normal comprehension does. As a result, reading processes
would be more demanding when reading for the purpose of interpreting because of the
extra demands on working memory. Also, the increased cognitive load would be especially
high when comprehension of the SL is difficult.
Macizo and Bajo (2006) conducted two types of self-paced reading experiments to
determine which of the horizontal and vertical perspectives was valid. Their prediction was
that if reading for the purpose of interpreting took longer than normal reading, this
would be evidence for the horizontal perspective. In one type of experiment, the task
(reading for repetition or reading for interpreting from Spanish into English) and the lexical
ambiguity of the target word (ambiguous: homograph or unambiguous) were manipulated
within participants. Memory load (low or high) was a between-groups variable, which was
manipulated by varying the number of words between the target word and the
disambiguating context (5 words versus 7 words). Sixteen professional translators were
divided into two groups composing the two memory load conditions. The stimuli were
sentences which appeared word-by-word in the middle of a computer screen. Participants
were told to repeat the sentence or to interpret the sentence. They could read at their own
pace by pressing the space bar every time they wanted to see new words. The time between
consecutive key presses was taken as an index of the processing time for the displayed
words. The same experiment was repeated on 16 Spanish-English bilinguals.
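The reading-time measure described above can be sketched as follows. This is only an illustration, assuming key-press timestamps are logged in milliseconds. The function name and the sample values are hypothetical and do not come from Macizo and Bajo (2006).

```python
from statistics import mean


def word_reading_times(key_press_times_ms):
    """Return the intervals between consecutive key presses, in milliseconds.

    Each interval is taken as an index of the processing time for the
    word(s) displayed between those two presses.
    """
    pairs = list(zip(key_press_times_ms, key_press_times_ms[1:]))
    if any(later < earlier for earlier, later in pairs):
        raise ValueError("key press times must be non-decreasing")
    return [later - earlier for earlier, later in pairs]


# Made-up timestamps (ms) for one sentence under each task.
repetition_presses = [0, 400, 790, 1210, 1600]
interpreting_presses = [0, 450, 930, 1480, 1990]

repetition_rt = word_reading_times(repetition_presses)
interpreting_rt = word_reading_times(interpreting_presses)

print(repetition_rt)    # [400, 390, 420, 390]
print(interpreting_rt)  # [450, 480, 550, 510]
# With these made-up values, reading for interpreting is slower on average.
print(mean(interpreting_rt) > mean(repetition_rt))  # True
```

Under the prediction stated above, longer mean intervals when reading for interpreting than when reading for repetition would be the pattern expected from the horizontal perspective.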
The findings suggested that when participants read and interpreted sentences, global
comprehension and the speed of the reading processes were affected by the presence of
lexical ambiguity and memory load. Reading for interpreting became slower and
understanding became less accurate when the sentences contained ambiguous words and
the distance between the ambiguous word and the disambiguating context was large (high
memory load condition). In contrast, when participants were instructed to only read,
understand, and repeat the sentences, the presentation of an ambiguous word did not affect
reading times in either of the two memory load conditions. Macizo and Bajo claimed that
whereas reading for interpreting requires working memory resources for parallel activation
of the TL lexical entries and switching the two languages involved, reading for repetition
does not need these additional resources. The results were in agreement with the predictions
of the horizontal perspective.
In the other type of experiment, Macizo and Bajo (2006) sought to show that there was
parallel activation of TL lexical entries when reading for the purpose of interpreting.
Sixteen professional translators were asked to read sentences which contained cognate
words (words that resemble their target-language equivalents, e.g., “cebra” in Spanish vs.
“zebra” in English) at the beginning and at the end of sentences for the purpose of