Chapter 2 Literature Review
2.1 Stylometry
2.1.3 Discussion on Quantitative-Statistical Approach
To assess how well measures of vocabulary richness (word types) perform in authorship attribution, and how far they are affected by text length, Tweedie and Baayen (1998) conducted a comprehensive study of a variety of measures of vocabulary richness and repeat rate, which were popular as ‘constants’ for discriminating individual authors in authorship attribution studies. Many of these measures of lexical richness had been claimed to be independent, or roughly independent, of text length. However, after testing them both theoretically and empirically, Tweedie and Baayen concluded that most of these indices are highly dependent on text length. They reported that almost all of the proposed ‘constants’ of an author’s style change considerably, and in systematic ways, with the length of the text. Their results thus call into question the efficacy of including such ‘constants’ in authorship attribution studies (e.g. Holmes, 1992; Holmes and Forsyth, 1995). In short, despite the large number of measures available, none of the common measures of vocabulary richness is truly constant with respect to text length.
Intuitively, this is not difficult to understand. As a text grows longer, it is more likely to cover a wider range of topics and thus to contain more word types. Text length is therefore an important yet difficult factor to control. No one can say with assurance which text length is more appropriate and efficient than the others, and should therefore be the norm, when the variety of word types (lexical richness) serves as the ‘constant’, and the only measurement, in authorship attribution.
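The length effect can be illustrated with the simplest richness measure, the type-token ratio (distinct word types divided by total tokens). The following is a minimal sketch rather than a reproduction of Tweedie and Baayen’s experiments: the token stream is synthetic, drawn from a hypothetical fixed vocabulary of 500 words with Zipf-like weights, and the function names are illustrative only.

```python
import random

def type_token_ratio(tokens):
    """Number of distinct word types divided by the number of tokens."""
    return len(set(tokens)) / len(tokens)

def ttr_by_prefix(tokens, lengths):
    """Type-token ratio of the first n tokens, for each n in lengths."""
    return {n: type_token_ratio(tokens[:n]) for n in lengths}

# Hypothetical toy "text": 5,000 tokens sampled from a 500-word vocabulary
# whose weights fall off as 1/rank (an assumption made for illustration).
rng = random.Random(0)
vocabulary = [f"w{rank}" for rank in range(1, 501)]
weights = [1 / rank for rank in range(1, 501)]
tokens = rng.choices(vocabulary, weights=weights, k=5000)

ratios = ttr_by_prefix(tokens, [100, 500, 1000, 5000])
for n, ratio in ratios.items():
    print(n, round(ratio, 3))
```

In this sketch the number of types keeps growing as more tokens are read, but more slowly than the number of tokens, so the ratio falls as the sample lengthens. Real texts have an open vocabulary, so the exact shape of the curve will differ, but the same sample taken at different lengths still yields different values of the supposed ‘constant’.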
Therefore, Burrows’ analysis of the relative frequencies of words from different frequency strata has become a popular method in stylometry, especially the measures based on function words. Despite the fact that function words prove
to be very effective in many stylometric studies, it is unfortunate that in Burrows’ studies, as he himself stated, words from different frequency strata could not be combined and integrated under this method to give a more detailed picture of an author’s writing preferences. A further defect is that the occurrence rate of word types and that of word tokens cannot work together under this kind of single-index measurement. On the one hand, when observing an individual’s writing preferences, the distribution of the most common function words, the occurrence of common content words, and the use of rare words are all important and interesting aspects of writing style. If the behaviors of these different words cannot be measured within the same scope, we can observe and analyze an author’s writing style from only one specific angle, and may lose the cumulative effect that emerges only when every trivial variation is taken into consideration. On the other hand, when analyzing an author’s style and comparing it with that of others, whether the author adopts certain words and how often he or she uses them in the texts are of equal importance. If we can choose only the occurrence rate of word types (the distinct words used) or that of word tokens (the frequency of words), then, again, some crucial information is lost because of the restrictions of the experimental design. Therefore, we need a method that includes all the words in an author’s text, both function words and content words, no matter how frequent they are, and that simultaneously measures the appearance of words (word types) and the frequency of words (word tokens).
In brief, the quantitative-statistical approach I have discussed so far has two further limitations. The first is the need to construct a ‘norm’ model against which every tested item is compared. The norm
has to be a huge collection of texts to guarantee that it represents an unbiased standard, so that any biased use of words resulting from an author’s particular writing style can be detected by comparison with the ‘norm’. In practice, however, constructing a ‘qualified norm’ is difficult and questionable, because nobody can precisely describe the criteria for building a proper norm. Second, the Delta approach that Burrows proposed assumes that every individual author has a particular pattern in the use of every function word (each z-score measures how far the local frequency of a function word deviates from its mean frequency in the norm, in units of the norm’s standard deviation). That is, it assumes that every author has an individualized, frequency-based frame for using function words, and that this frame recurs unaltered in the author’s other works. This approach may well work for lengthy literary works, in which the behavior of a set of function words is observable. However, we can confidently infer that it cannot work on modern genres of text, for example texts collected in electronic format, which are typically shorter and more casual than literary works. Therefore, another discipline, inspired by and descended from stylometry², arose from the field of information theory to tackle authorship identification for modern texts and to meet the practical needs of modern text classification. The following section introduces other methods of tackling authorship identification on modern texts.
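The Delta computation referred to above can be sketched as follows, taking Delta as it is commonly described: the mean absolute difference between the z-scores of two texts over a set of function words. This is an illustrative sketch only. The three-word list, the tiny ‘norm’ and the two test texts are hypothetical, the helper names are not taken from Burrows, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

def relative_frequencies(tokens, words):
    """Relative frequency of each listed word among the tokens of one text."""
    return [tokens.count(w) / len(tokens) for w in words]

def norm_statistics(norm_texts, words):
    """Per-word mean and standard deviation of frequency across the norm texts."""
    per_text = [relative_frequencies(t, words) for t in norm_texts]
    columns = list(zip(*per_text))  # one column of frequencies per word
    return [mean(c) for c in columns], [stdev(c) for c in columns]

def z_scores(tokens, words, means, sds):
    """Deviation of each word's frequency from the norm, in standard deviations."""
    freqs = relative_frequencies(tokens, words)
    return [(f - m) / s for f, m, s in zip(freqs, means, sds)]

def delta(z_a, z_b):
    """Mean absolute difference between two z-score profiles."""
    return mean([abs(a - b) for a, b in zip(z_a, z_b)])

# Hypothetical toy data; a real norm would be a very large corpus.
function_words = ["the", "of", "and"]
norm_texts = [
    "the cat and the dog of the house".split(),
    "the king of the land and of the sea".split(),
    "and so the tale of woe and of joy ends".split(),
]
means, sds = norm_statistics(norm_texts, function_words)

disputed = "the end of the road and the start of the day".split()
candidate = "a song of joy and a song of woe and of hope".split()

z_disputed = z_scores(disputed, function_words, means, sds)
z_candidate = z_scores(candidate, function_words, means, sds)
print(delta(z_disputed, z_candidate))
```

A smaller Delta indicates that two texts use the function words more similarly relative to the norm. With texts this short, a single extra occurrence of a word shifts its relative frequency, and therefore its z-score, substantially, which is the difficulty with short texts raised above.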
2 Readers interested in other intriguing stylometric research may refer to Whissell’s (1996) emotional stylometry of John Lennon’s and Paul McCartney’s lyrics from 1962 to 1970, and to Burrows’ (1987) authorial and chronological tests on segments of Jane Austen’s texts.