Taiwan Humanities Citation Index
Kuang-hua Chen
The author
Kuang-hua Chen is an Associate Professor in the Department of Library and Information Science, National Taiwan University, Taipei, Taiwan.
Keywords
Software tools, Research, Taiwan
Abstract
The Taiwan Humanities Citation Index (THCI) is Taiwan’s effort to construct a search, research, and evaluation tool for research in the arts and humanities. This article describes the design, framework, features, and policies and rules of the THCI. Citation analysis has been regarded as a systematic way to investigate research developments and trends. Since the Arts & Humanities Citation Index (A&HCI) indexes mostly English journals, the THCI could become an auxiliary citation index of the A&HCI for Taiwanese researchers.
Electronic access
The Emerald Research Register for this journal is available at www.emeraldinsight.com/researchregister
The current issue and full text archive of this journal is available at www.emeraldinsight.com/1468-4527.htm
Research and innovation build on preceding research. Researchers usually “cite” the ideas and papers of pioneers. This behavior is called “citation”. A citation involves two objects: one is the source work; the other is the cited work. Although there are many reasons for a citation (Weinstock, 1971), the cited work has something to do with the source work. For general users, cited works provide a guide to further reading. For researchers, the collected information about cited works is a treasure for citation analysis. Citation analysis permits the researcher to comprehend current developments in subject fields, the properties of bibliographical usage, and research trends.
The ISI has produced the Science Citation Index (SCI), Social Science Citation Index (SSCI), and Arts & Humanities Citation Index (A&HCI) for many years. The Library of the Chinese Academy of Sciences has produced the Chinese Science Citation Database (CSCD) since 1989 (Meng, 1995). Nanjing University in mainland China and the Hong Kong University of Science and Technology produce the Chinese Social Science Citation Index (CSSCI) (Su et al., 2001). These citation index databases have a great impact on research. Some researchers apply citation data to evaluate the contributions of research fellows, journals, or institutes (Garfield, 1972); some apply citation data to analyze the structures of specified research fields (Henry, 1973); some apply citation data to retrospectively evaluate research trends, and to make projections about future trends (Garfield, 1979).
Most journals in ISI’s citation indexes are published in English. In Taiwan, we need local citation index databases for local requirements. The CSCD and CSSCI provide citation data for researchers in mainland China. If we want to have an overall picture of Taiwanese research in the arts and humanities, we have to construct a Taiwan Humanities Citation Index (THCI).
The purpose of the THCI project is to construct a citation index for humanities journals published in Taiwan. At the same time, we hope researchers will be able to apply the THCI to understand humanities research, to analyse the changing patterns of humanities studies, and to predict future trends in the humanities.
Online Information Review
Volume 28 · Number 6 · 2004 · pp. 410-419
© Emerald Group Publishing Limited · ISSN 1468-4527
DOI 10.1108/14684520410570535
Revised article received 22 July 2004
Accepted for publication 29 July 2004
The author is grateful to the staff of the THCI for their hard work on this project. The THCI project is supported by the Centre for Humanities Research of the National Science Council, Republic of China (www.hrc.ntu.edu.tw/)
The goals of the THCI project are as follows:
. to construct the humanities citation index;
. to provide essential search features (i.e. journal title search, author search, cited author search, source article title search, and cited article title search); and
. to provide citation analysis functions (the THCI can identify the frequency of citation for journals and authors: researchers could use this data to calculate impact factor and immediacy).
“Citation” is widely used in bibliometrics. Price (1986, p. 284) has said:
It seems to me a great pity to waste a good technical term by using the words citation and reference interchangeably. I therefore propose and adopt the convention that if paper R contains a bibliographic footnote using and describing paper C, then R contains a reference to C, and C has a citation from R.
However, it is very easy to confuse readers with the word “citation”. Therefore, the word “citation” in this article is regarded as an action where authors make references to other works. The two objects involved in a citation will be called the “source work” and the “cited work”, respectively.
Related works
In 1958, Eugene Garfield set up the Institute for Scientific Information (ISI) to provide information retrieval services. ISI is now the most authoritative citation index company in the world.
ISI provides academic data search services covering the life sciences, clinical medicine, physics, chemistry, agriculture, biology, veterinary medicine, engineering and technology, the social, environmental and behavioral sciences, the arts and humanities, etc. These databases cover more than 16,000 journals, proceedings and monographs in the sciences, social sciences, and the arts and humanities. The service includes complete bibliographies and references.
ISI’s products and services include Citation Index, Current Awareness, Specialized Content, Evaluation/Analytical Tools, Information Management Tools, ISI Custom Marketing and Intelligence, Research Community, etc. Moreover, these products and services are provided in print, on diskette, on CD-ROM, on magnetic tape, and online via the internet.
ISI began publication of its first citation index, the Science Citation Index (SCI) in 1961. Since then, the company has developed other indexes, such as the Social Science Citation Index (SSCI) and the Arts & Humanities Citation Index (A&HCI). These citation indexes are not only
important search tools for the general public, but are also used as tools for faculty evaluation, journal evaluation, and institute evaluation.
The Chinese Academy of Sciences in mainland China established the Chinese Science Citation Database (CSCD) in 1989. It indexes 315 science and technology journals published in mainland China, as a supplement to the SCI (Meng, 1995). The first paper-based CSCI based on the CSCD was published in 1995, and the first CSCI CD-ROM (CSCD-CD-96) was published in 1996. By contrast, Nanjing University has focused on social science and arts and humanities journals, and has produced the Chinese Social Science Citation Database since 1997. In 1999, Nanjing University signed a contract for the cooperative development of the CSSCI with the Hong Kong University of Science and Technology. The first CSSCI CD-ROM (CSSCI-1998) was published in 2000 (Su et al., 2001).
Several citation index projects are also under way in Taiwan. In 1997, the National Science Council (NSC) of the Republic of China
supported a pilot project for producing the Taiwan Science Citation Index (TSCI). The pilot project created a prototype database, the Taiwan Science Citation Index Database, which indexed several science and technology journals published in Taiwan from August 1996 to July 1997 (Chui, 1998). In 1999, NSC established two project-based research centres: the Social Science Research Centre (SSRC) and the Centre for Humanities Research (HRC). The main tasks of the two centres are to produce the Taiwan Social Science Citation Index (TSSCI) and the Taiwan Humanities Citation Index (THCI), respectively. This article will describe the THCI project and its achievements to date. (Note that THCI is a stand-alone citation index database, which is produced locally in Taiwan and contains no data from other citation databases.)
Design of the THCI
The main goal of the THCI is to produce a citation index of arts and humanities journals published in Taiwan. In addition to being a search tool and an evaluation tool, a citation database can be regarded as a research tool, from which can be learned the characteristics, developments, and research trends in arts and humanities research in Taiwan. Use of the THCI as a research tool is the core philosophy behind the project. The following criteria are used to determine whether a journal should be indexed by the THCI:
. the subject fields of indexed journals should be in the arts and humanities; and
. the indexed journals should be published in Taiwan.
We first made a list of arts and humanities journals published in Taiwan. The final list was then decided by a committee composed of members from different disciplines, and each journal was assigned a priority. The final list contained 245 titles with three different priorities. Since the positioning of the THCI as a research tool is the main concern, the THCI must index as many journals as possible. Therefore, few journals are filtered out. Table I shows the statistics of the final list.
Although the THCI indexes many journals, not every journal article is indexed. Only research papers or academic articles are indexed. Therefore, short stories, prose, poetry, jottings, news articles, editorials, letters, book reviews, bibliographies, catalogues, speech transcriptions, interviews, minutes, event diaries, dictation sketches, visiting reports, compliments, obituaries, and the like are screened out. For convenience, and to avoid misunderstanding, the words “source work(s)” and “cited work(s)” are used for the two objects involved in a citation (see Figure 1). Since the source work(s) in the THCI are journal articles, we will also use the words “source articles” or “source journals” in the following sections.
The data in a citation index database are basic bibliographic data for source works and cited works, and the citation relationships. We have defined numerous attributes for all of the possible uses for the THCI. The following paragraphs will describe these attributes and their corresponding values.
Table II is used to describe the journals. Since the source works and the cited works are stored in the same table, the “PublicationType” is for cited works (books, proceedings, journals, reports, theses and dissertations). When the
“PublicationType” is “book”, the corresponding “ISXN” field will record an ISBN rather than an ISSN. Table III stores the data on the source articles and the cited articles. The “Funding” field is used to identify funding sources such as the National Science Council and the Ministry of Education. As a result, we can estimate how much funded research is published. The “JournalID” field is a foreign key.
We distinguished six types of citation used in Chinese arts and humanities articles. Since some articles might use more than one type of citation simultaneously, we defined a priority for each citation type. For each source work, the THCI records only those cited works that appear under the highest-priority citation type used in that source work. These types of citation are shown in Table IV. Table V shows the core relationship in the THCI, the relationship between “SourceID” and “CitedID”, which are foreign keys borrowed from the “Article” relationship.
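The priority rule can be illustrated with a short Python sketch. It is not the THCI implementation: the input format (a list of citation type and cited work pairs for one source work) is hypothetical, and reading priority 1 in Table IV as the highest priority is an assumption.

```python
# Priorities copied from Table IV. Treating 1 as the highest priority
# is an assumption; the article does not state the direction explicitly.
CITATION_PRIORITY = {
    "Citation": 1,
    "Notes": 2,
    "Footnotes": 3,
    "References": 4,
    "Notes in articles": 5,
    "Quotation": 6,
}


def select_cited_works(entries):
    """Keep only the entries whose citation type ranks highest.

    `entries` is a list of (citation_type, cited_work) pairs found in a
    single source work (a hypothetical input format).
    """
    if not entries:
        return []
    best = min(CITATION_PRIORITY[ctype] for ctype, _ in entries)
    return [(ctype, work) for ctype, work in entries
            if CITATION_PRIORITY[ctype] == best]


# A source work that uses both footnotes and a reference list:
found = [
    ("Footnotes", "Work A"),
    ("References", "Work B"),
    ("Footnotes", "Work C"),
]
kept = select_cited_works(found)
print(kept)
```

Under these assumptions only the two footnoted works are kept, because "Footnotes" outranks "References".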
In addition to taking citation types into account, THCI identifies different types of cited work. Generally speaking, THCI only indexes journal articles, conference papers, books, field investigation reports, technical reports, and theses and dissertations. In arts and humanities
Table I Statistics of indexed journals by discipline
Discipline Number of titles
Arts 16
Literature 82
History 48
Language 46
Library Science 23
Philosophy 17
Religion 13
Total 245
Figure 1 A citation
Table II Journal table
Field tag Field name Data type
JournalID Bno VARCHAR(15)
JournalName Bname VARCHAR(255)
ISXN Isxn VARCHAR(20)
Discipline Bdomain VARCHAR(3)
PublicationFreq Frequency VARCHAR(8)
PublicationDate Syear VARCHAR(10)
PublicationType Btype VARCHAR(10)
Publisher Publisher VARCHAR(50)
SubPublisher SubPublisher VARCHAR(50)
PublisherAdd Badd VARCHAR(80)
PublisherWWW Website VARCHAR(80)
NationalLibID NCLno VARCHAR(8)
Remarks Bnotes VARCHAR(255)
Table III Article table
Field tag Field name Data type
ArticleID Pno VARCHAR(15)
ArticleTitle Title VARCHAR(255)
ArticleLang Language VARCHAR(10)
ArticleDiscipline PDomain VARCHAR(10)
ArticleVolNo Volno VARCHAR(10)
ArticleDate Pdate VARCHAR(22)
PageNo Pageno VARCHAR(10)
Keywords Keywords VARCHAR(255)
Funding Funding VARCHAR(50)
JournalID Bno VARCHAR(15)
disciplines, the use of languages other than English is a common phenomenon. Therefore, the languages of cited works are recorded as well. Languages are recorded as English, Chinese, Japanese, French, German, Italian, Spanish, Russian, Korean or Other.
In order to allow for the possibility of investigating the cross-impact of different
disciplines using the THCI, we have identified five main disciplines and eight sub-disciplines of the humanities. Table VI shows these disciplines and sub-disciplines.
Since author(s) may move to new affiliations, we have to keep track of such changes. Therefore, two relationships are related to persons, as shown by Table VII. The first relationship is for the author(s) of source or cited works; the second for the researchers themselves. Researchers may publish different articles while they are in different institutes or organizations. The “Researcher” table keeps up-to-date information for the researchers. We trace the historical data of a researcher’s publications, rather than their transitions from institute to institute. Figure 2 shows the relationships between the tables.
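To make the table relationships concrete, the following Python sketch builds a reduced version of the Journal, Article and Citation tables (Tables II, III and V) in an in-memory SQLite database. It is an illustration only: the THCI itself runs on a SQL server, only a few of the listed fields are included, the primary keys are assumed, and the sample rows and the use of the Table IV priority number as the "CType" value are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE Journal (
    Bno   VARCHAR(15) PRIMARY KEY,  -- JournalID (primary key is assumed)
    Bname VARCHAR(255),             -- JournalName
    Isxn  VARCHAR(20),              -- ISSN, or ISBN when Btype is 'book'
    Btype VARCHAR(10)               -- PublicationType
);
CREATE TABLE Article (
    Pno   VARCHAR(15) PRIMARY KEY,  -- ArticleID (primary key is assumed)
    Title VARCHAR(255),             -- ArticleTitle
    Bno   VARCHAR(15) REFERENCES Journal(Bno)      -- JournalID foreign key
);
CREATE TABLE Citation (
    Cno      VARCHAR(15) PRIMARY KEY,              -- CitationID
    SourceNo VARCHAR(15) REFERENCES Article(Pno),  -- SourceID
    CitedNo  VARCHAR(15) REFERENCES Article(Pno),  -- CitedID
    CType    CHAR(1)                               -- CitationType
);
""")

# Hypothetical sample rows: source and cited works share the Article table.
conn.executemany("INSERT INTO Journal VALUES (?, ?, ?, ?)", [
    ("J1", "Journal A", None, "journal"),
    ("B1", "Book X", None, "book"),
])
conn.executemany("INSERT INTO Article VALUES (?, ?, ?)", [
    ("A1", "Source article 1", "J1"),
    ("A2", "Source article 2", "J1"),
    ("A3", "Cited book", "B1"),
])
conn.executemany("INSERT INTO Citation VALUES (?, ?, ?, ?)", [
    ("C1", "A1", "A3", "4"),
    ("C2", "A2", "A3", "4"),
    ("C3", "A2", "A1", "4"),
])


def cited_by(source_no):
    """Works cited by one source article."""
    rows = conn.execute(
        "SELECT CitedNo FROM Citation WHERE SourceNo = ? ORDER BY CitedNo",
        (source_no,))
    return [r[0] for r in rows]


def citing(cited_no):
    """Source articles that cite one work (the reverse lookup)."""
    rows = conn.execute(
        "SELECT SourceNo FROM Citation WHERE CitedNo = ? ORDER BY SourceNo",
        (cited_no,))
    return [r[0] for r in rows]


print(cited_by("A2"))
print(citing("A3"))
```

Storing source and cited works in one table means a single Citation row is enough to navigate in both directions, from an article to its cited works and from a cited work back to the articles that cite it.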
Framework of the THCI
The whole THCI framework is composed of three components, as shown in Figure 3. The first is an indexing system, the second is a checking system, and the third is a searching system. The indexing
Figure 2 THCI tables relationships
Table IV Types of citation
Citation type Priority
Citation 1
Notes 2
Footnotes 3
References (bibliography) 4
Notes in articles 5
Quotation 6
Table V Citation table
Field tag Field name Data type
CitationID Cno VARCHAR(15)
SourceID SourceNo VARCHAR(15)
CitedID CitedNo VARCHAR(15)
CitationType CType CHAR(1)
Table VI Disciplines
Main discipline Sub-discipline
Humanities: Arts, Chinese Literature, Foreign Literature, History, Language, Library Science, Philosophy, Religion
Social Sciences
Natural Sciences
Biological Sciences
Engineering & Technology
Table VII Researchers and authors
RESEARCHER Table AUTHOR Table
ResearcherNo (P.K.) AuthorNo (P.K.)
ResearcherName AuthorName
ResearcherTitle AuthorTitle
ResearcherOrganization AuthorOrganization
ResearcherDepartment AuthorDepartment
ResearcherAddress AuthorRemarks
ResearcherPhone ResearcherNo (F.K.)
ResearcherFax
ResearcherEmail
ResearcherRemarks
system consists of a back-end working database system and a front-end client system. The checking system also consists of a back-end working database system and a front-end client system. The searching system has a typical three-tier architecture and is composed of a web browser, a web server, and a searching database system. The indexing system and the checking system use the same working database. The searching database is periodically imported from the working database. All databases in the THCI framework are implementations of the relational data model. The platforms for the indexing and checking systems in the THCI framework are Intel-based machines with a Windows 2000 server and a SQL server. The front-end clients of the two systems are implemented using Borland C++ Builder.
The searching system provides Basic
Information Retrieval (BI Retrieval) and Citation Analysis Retrieval (CA Retrieval). BI Retrieval provides journal title search, author search, source article title search, and cited article title search. The functions of CA Retrieval are complete but are not available online at the present time. CA Retrieval will provide an immediacy index, an impact factor with different time windows, and frequency of citation, after the data-checking task is completed.
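As a rough illustration of two of the planned CA Retrieval measures, the Python sketch below counts the frequency of citation per journal and computes an immediacy index. The records are hypothetical, and the immediacy index follows the commonly used definition (citations made in a year to articles published in that same year, divided by the number of articles published in that year); the article does not spell out the exact formula the THCI will use.

```python
from collections import Counter

# Hypothetical records: (article_id, journal, publication_year)
articles = [
    ("A1", "Journal A", 2003),
    ("A2", "Journal A", 2003),
    ("A3", "Journal B", 2003),
    ("A4", "Journal B", 2002),
    ("A5", "Journal B", 2003),
]
# Hypothetical citations: (year of the citing article, cited article_id)
citations = [
    (2003, "A1"), (2003, "A1"), (2003, "A3"), (2003, "A4"), (2004, "A2"),
]

journal_of = {aid: journal for aid, journal, _ in articles}


def citation_frequency(citations):
    """Number of citations received by each journal."""
    return Counter(journal_of[cited] for _, cited in citations)


def immediacy_index(journal, year):
    """Citations made in `year` to the journal's articles of that year,
    divided by the number of articles the journal published in `year`.
    Returns None when the journal published nothing in `year`."""
    published = {aid for aid, j, y in articles if j == journal and y == year}
    if not published:
        return None
    cites = sum(1 for citing_year, cited in citations
                if citing_year == year and cited in published)
    return cites / len(published)


freq = citation_frequency(citations)
print(freq["Journal A"], freq["Journal B"])
print(immediacy_index("Journal A", 2003))
print(immediacy_index("Journal B", 2003))
```

The same counting can be done per author instead of per journal by replacing the `journal_of` lookup with an author lookup.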
Policies and rules of the THCI
Because the format for citations in humanities journals in Taiwan is very inconsistent, we could not draw up a complete set of rules for the THCI in advance. Nevertheless, some policies and rules have been defined.
Policies
. The indexed articles are research-oriented and humanities-based.
. The “broad indexing” principle is adopted.
. THCI is a research-based database rather than an evaluation-based database.
. The source journals are published in Taiwan.
. The cited works are restricted to journal articles, books, article collections, research reports, theses and dissertations, and others (explained in the Rules subsection).
Rules
Source works
(1) Title:
. Use the exact title shown in the journal or article.
. Use double-byte BIG5 code for Chinese punctuation marks; use single-byte ASCII code for English punctuation marks.
. Use “=” to separate the versions of a title presented in more than one language. The leading title is in the same language as the article.
. Romanized phonetic transcription of Chinese may be used when the cited works use it.
(2) Author:
. Use the “last name first” principle, e.g. “Scholes, Robert”.
. Use the name listed in the article first, followed by “=” and the transliterated name, if available. For example, the listed author is “ ” and if the transliterated name is available, “ = Chan, Sucheng” will be recorded. By contrast, “Scholes, Robert = . ” will be used if the listed name is “Scholes, Robert”.
. If no author information is available but editor information is, use the editor name instead.
. Record the translator name in the “Remarks” field rather than the “Author” field.
. An institutional author is recorded in the “Author” field as well.
(3) Date:
. Use the Gregorian calendar.
. Use “/” for cross-year notation, e.g. 1998/1999.
(4) Place:
. Use traditional form for Chinese character
rather than simple form, e.g. “ ” vs “ ” for Taiwan.
(5) Publisher:
. Use complete name for Publisher, e.g.
“ ” vs “ ” for “Department of Foreign Languages and Literature”.
. Use the first publisher in the publisher list
for group publication. Other publishers are recorded in the “Remarks” field.
(6) Start/end pages:
. For source articles, the starting and the ending page numbers have to be recorded completely.
. For cited articles, rules depend on the citation type (please refer to Table IV).
. Record the starting page numbers for “Notes”, “Footnotes”, “Notes in articles” and “Quotation”.
. Record both starting and ending page numbers for “Citation” and “References”.
(7) Remarks: Record other important data:
. Translator names.
. Themes for special issues.
. Use “W” to separate data in the “Remarks” field.
(8) Use “A” for non-encoding Chinese characters.
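A few of the source-work rules above are mechanical enough to sketch in Python. The function names and the "start-end" output format are hypothetical and chosen only for illustration; they are not taken from the THCI indexing client.

```python
# Citation types from Table IV, grouped by the page rule in (6).
START_ONLY = {"Notes", "Footnotes", "Notes in articles", "Quotation"}
START_AND_END = {"Citation", "References"}


def pages_to_record(start, end, citation_type=None):
    """Return the page field for one record.

    citation_type is None for a source article; otherwise it is one of
    the citation types of Table IV. The "start-end" format is assumed.
    """
    if citation_type is None or citation_type in START_AND_END:
        return f"{start}-{end}"
    if citation_type in START_ONLY:
        return str(start)
    raise ValueError(f"unknown citation type: {citation_type}")


def author_last_name_first(given, family):
    """Rule (2): apply the "last name first" principle."""
    return f"{family}, {given}"


def cross_year(first, last):
    """Rule (3): use "/" for cross-year notation."""
    return f"{first}/{last}"


print(pages_to_record(12, 30))               # source article
print(pages_to_record(12, 30, "Footnotes"))  # cited work found in a footnote
print(author_last_name_first("Robert", "Scholes"))
print(cross_year(1998, 1999))
```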
Cited works
(1) Record cited works based on the priority listed in Table IV.
(2) Cited works are journal articles:
. Rules for source articles are applied here.
(3) Cited works are books:
. Record book title rather than chapter title.
. Do not record page number.
. If the books are separated into numerous volumes, volume numbers are seen as part of book title, e.g. “Chinese history (1)”.
(4) Cited works are collections of articles:
. Articles in the collection are regarded as the “cited works”.
. Rules for source articles are applied here.
(5) Cited works are research reports:
. Rules for cited books are applied here.
. Project title, project grant number, and funding organization are recorded in “Remarks” field, if available.
(6) Cited works are theses and dissertations:
. Rules for cited books are applied here.
. The granting institution is recorded in the “Publishers” field.
(7) Cited works are “Others”:
. “Others” includes ancient books, historical
materials, gazettes, archives, standards, laws, patents, electronic articles, etc.
. Books published before 1911 are regarded
as ancient books.
. Basically, rules for cited books are applied
here with some exceptional rules shown in the following.
. No matter how many times “others” are
cited in one source article, it would only be considered as one citation.
. Newspaper titles are recorded in the book/
journal title field.
. The caption of a news article is recorded in
the article title field.
This is a partial description of the rules set, which is being revised continuously.
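Two of the rules for “Others” can be sketched in Python as follows. The once-per-source-article rule is read here as "each distinct “Others” work is recorded once per source article, however often it is cited"; this reading is an assumption, since the rule could also be taken to mean that all “Others” together count as a single citation. The record format and type labels are hypothetical.

```python
def classify_book(publication_year):
    """Books published before 1911 are regarded as ancient books,
    which fall under "Others"."""
    return "Others" if publication_year < 1911 else "Book"


def dedupe_others(cited):
    """Keep every regular cited work, but only the first occurrence of
    each "Others" title within one source article.

    `cited` is a list of (publication_type, title) pairs in citing order.
    """
    seen = set()
    kept = []
    for ptype, title in cited:
        if ptype == "Others":
            if title in seen:
                continue
            seen.add(title)
        kept.append((ptype, title))
    return kept


cited_in_one_article = [
    ("Others", "Gazette X"),
    ("Book", "Book Y"),
    ("Others", "Gazette X"),
    ("Others", "Archive Z"),
]
print(classify_book(1900), classify_book(1911))
print(dedupe_others(cited_in_one_article))
```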
Applications of the THCI
Up to April 2004, the THCI had indexed 3,550 journals, 35,097 source titles and 499,932 cited titles. In addition to being a search tool for humanities bibliography, the THCI is intended to be a research tool which is useful for
understanding research trends, for depicting the citation map, for finding important objects (institutes, journals, researchers, and articles), and for identifying research sub-disciplines (Henry, 1993). As mentioned above, a major goal of the THCI project is for the index to function as a research tool, although it also can be regarded as an evaluation tool.
Search tool
As a search tool, BI Retrieval and CA Retrieval provide various features, as mentioned previously. The current version of the THCI is available online at www.hrc.ntu.edu.tw/thci/thci.htm. Figure 4 shows a snapshot of the THCI web page. This page provides BI Retrieval. If users wish to use other features, they can click on the “Citation Analysis Retrieval System” (although this feature is not available at present) and “Related Articles on Citation Analysis” listed on the right of the red triangle. There are two sections in Figure 4. The left section is BI Retrieval; the right section allows browsing of journals by discipline.
If users wish to search for a particular author, they key in the name in the search box with a check mark in the corresponding radio button, for example, “ ”, as shown in Figure 5. Figure 6 shows the search results. Click on one of the hits, for example, “ ”, and the articles are displayed accordingly. Figure 7 shows not only the articles list, but also basic information about the author, the number of that author’s articles that are cited in the THCI, and the frequency of their citation.
By clicking on one of the articles, the cited works for this article are shown. Figure 8 shows the basic information for this article and the
corresponding cited works. By clicking on one of the cited works, a list of articles which cite the cited work will be displayed (see Figure 9).
Research tool
The THCI online service does not provide research features directly. Interested researchers must first identify the scope of their research. Researchers then need to search for the required data, download it, and arrange the downloaded data for further analysis. For example, the co-citation matrix (Garfield, 2001) might be constructed in the arrangement stage.
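The arrangement stage can be illustrated with a small Python sketch that builds a co-citation matrix from downloaded data. Two works are co-cited when the same source article cites both of them. The input format (a mapping from each source article to the works it cites) and the identifiers are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(references_by_source):
    """Count, for every pair of cited works, how many source articles
    cite both of them."""
    counts = Counter()
    for cited in references_by_source.values():
        # A work cited twice by the same source still counts once.
        for a, b in combinations(sorted(set(cited)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical downloaded data: source article -> cited works
refs = {
    "S1": ["C1", "C2", "C3"],
    "S2": ["C1", "C2"],
    "S3": ["C2", "C3"],
}
counts = cocitation_counts(refs)

# Arrange the pair counts as a symmetric matrix with a zero diagonal.
works = sorted({w for cited in refs.values() for w in cited})
matrix = [[0 if a == b else counts[tuple(sorted((a, b)))] for b in works]
          for a in works]

print(works)
for row in matrix:
    print(row)
```

In this example C1 and C2 are co-cited by S1 and S2, C2 and C3 by S1 and S3, and C1 and C3 only by S1.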
Evaluation tool
Before using any citation index as an evaluation tool, we have to understand the motivation behind citation. Weinstock (1971) has discussed this issue in detail. However, a few citations are still made in an incorrect way. Therefore, use of a citation index
as an evaluation tool is still controversial. Even assuming that all of the citations are fair and correct, there are still problems (e.g. the use of inappropriate metrics for evaluation).
Figure 4 THCI homepage
In order to overcome as many inaccuracies as possible, we are planning to provide a metric, the “impact factor with time window”, IF(ts,yr). ISI uses a fixed two-year time window. For engineering-oriented disciplines, two years is appropriate, but it is too short for other disciplines. Researchers at Leiden University in The Netherlands have found that even six years is short for the disciplines of biology and medicine. The appropriate time window may be even longer for disciplines in the arts and humanities. The time-dependent IF is a much more informative metric. Reporting an IF together with a clear description of its time window makes clear the basis of the calculation and enables us to understand the potential problems.
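The effect of the time window can be shown with a minimal Python sketch. It generalises the standard two-year impact factor (citations made in a year to items published in the two preceding years, divided by the number of those items) to a window of any length. Mapping the `window` and `year` parameters onto IF(ts,yr) is an assumption, since the exact definition planned for the THCI is not given here, and the data are hypothetical.

```python
def impact_factor(journal_articles, citations, year, window=2):
    """Impact factor of one journal for `year` with a `window`-year window.

    journal_articles: {article_id: publication_year} for the journal
    citations: list of (citing_year, cited_article_id) pairs
    Returns None when the journal published nothing inside the window.
    """
    target_years = range(year - window, year)  # year-window .. year-1
    published = {aid for aid, pub_year in journal_articles.items()
                 if pub_year in target_years}
    if not published:
        return None
    cites = sum(1 for citing_year, cited in citations
                if citing_year == year and cited in published)
    return cites / len(published)


# Hypothetical journal whose older articles are still heavily cited.
journal_articles = {"A1": 2001, "A2": 2002, "A3": 1998, "A4": 1999}
citations = [
    (2003, "A1"), (2003, "A2"), (2003, "A2"),
    (2003, "A3"), (2003, "A3"), (2003, "A3"), (2003, "A4"),
]

print(impact_factor(journal_articles, citations, 2003, window=2))
print(impact_factor(journal_articles, citations, 2003, window=5))
```

With the two-year window the citations to the 1998 and 1999 articles are ignored; the five-year window includes them, which is the kind of difference a stated time window makes visible.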
Figure 6 Search results for author
Difficulties and problems
We have encountered many difficulties and problems in producing the THCI. The first is the varied formats of the citations. Different
disciplines have different formats; different journals use different formats; even different articles in the same journal use different formats.
The second problem is the lack of skilled staff. We are a project-based team and employ part-time students to key in the citation data. Some students have difficulty in recognizing the components of a
citation. Often these students cannot read the language in which the articles are written, which may be English, French, Japanese, Korean, Russian, Spanish, and even Sanskrit.
The third problem is one of incorrect or incomplete citations. In this case, our staff have to check the citations using various online
bibliographic services. This is time-consuming and costly.
The fourth problem is identifying the discipline of source articles and cited works. Some source articles cover more than one discipline, and staff have difficulty in identifying the main discipline. Further, bibliographic data about cited works often offers the only clue as to the discipline of the article, making accurate judgments difficult.
Figure 8 Cited works of an article
Conclusions
The THCI project has been under way since January 2000. Although various problems and difficulties have occurred, we have solved them one by one and completed the following tasks:
. constructing THCI’s working environment;
. setting up THCI’s construction procedures;
. drawing up THCI’s policies and rules;
. implementing THCI’s framework systems; and
. fulfilling various search features.
Since most indexed journals of A&HCI are English journals, THCI could be an auxiliary citation index for Taiwanese researchers to gain an overall picture of Taiwanese research in the arts and humanities. The THCI’s online service can be found at www.hrc.ntu.edu.tw/thci.htm
A CD-ROM version of the THCI is under development and will be released in the near future.
References
Chui, S.-L. (1998), The Applications and Construction of Taiwan Science and Technology Citation Index Database, National Science Council, Taipei (in Chinese).
Garfield, E. (1972), “Citation analysis as a tool in journal evaluation”, Science, No. 178, pp. 471-9.
Garfield, E. (1979), Citation Indexing: Its Theory and Application in Science, Technology, and Humanities, Wiley, New York, NY.
Garfield, E. (2001), “From bibliographic coupling to co-citation analysis via algorithmic historio-bibliography”, speech delivered at Drexel University, Philadelphia, PA, November 27, available at: www.garfield.library.upenn.edu/papers/drexelbelvergriffith92001.pdf
Henry, S. (1973), “Co-citation in the scientific literature”, Journal of the American Society for Information Science, Vol. 24 No. 4, pp. 265-9.
Henry, S. (1993), “Macro-level changes in the structure of co-citation clusters: 1983-1989”, Scientometrics, Vol. 26 No. 1, pp. 5-20.
Meng, L.S. (1995), “Construction of Chinese Science Citation Database and its application prospects”, Journal of the China Society for Scientific and Technical Information, Vol. 14 No. 30, pp. 206-11.
Price, D.J.D. (1986), Little Science, Big Science: And beyond, Columbia University Press, New York, NY.
Su, X.N., Han, X.M. and Han, X.N. (2001), “Developing the Chinese Social Science Citation Index”, Online Information Review, Vol. 25 No. 6, pp. 365-9.
Weinstock, M. (1971), “Citation index”, in Kent, A. (Ed.), Encyclopedia of Library and Information Science, Vol. 5, Marcel Dekker, New York, NY, pp. 16-40.