
Chapter 3 The Leading Relationship between Conferences and Journals


National Chengchi University

well-known concepts that have already reached consensus. A long period of time would be required for domain experts to reach a consensus about a concept.

Therefore, this study concludes that keywords do not describe the content of a paper as well as its title does.

• The Full Text: The full text includes every concept the researcher used concerning the subject, yet each word carries very little substance. The full text covers the content in its entirety, but it compiles far more information than titles, abstracts, and keywords do, so the degree to which any single phrase expresses the concepts of the paper is small. Using the full text to describe the content of a paper would waste resources and time.

Therefore, we conclude that only phrases with a high density of knowledge should be employed to represent the full content of a paper. The titles, abstracts, and keywords of research papers all have this quality. In particular, titles and keywords are composed of short strings of words that express the concepts discussed in the paper. This study aims to discover whether conference papers represent new trends in academic research, and it must therefore identify concepts that represent these new trends. Keywords cannot fulfill this requirement, as explained previously. Therefore, the titles of research papers are adopted as descriptors to express the full content of a paper.

3.3 Datasets Properties

The datasets for the two categories of papers, conference papers and journal papers, are described below. To reduce the effect of the databases' different search algorithms, the search criteria were designed to minimize the differences among them.

3.3.1 Search Conference Papers

The search criteria were entered into each database's search engine to retrieve papers. ACM and IEEE were adopted as the two main databases for searching conference papers. Retrieved papers were required to contain the specified keywords in the article title. Although the ACM and IEEE databases contain many papers that the two societies published themselves, some of the papers they index were published by other organizations. This research uses only the conference papers published by ACM and IEEE. Papers from 1990 to the end of July 2007 were collected. Papers added to the databases after the cutoff date of 31 July 2007 were not included in the analysis.
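The selection rule for conference papers can be sketched in code. This is only an illustration under stated assumptions, not the procedure actually run against the ACM and IEEE search engines: the keyword list, the `Paper` record, its `added` field, and the sample titles are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date

# Assumed keywords: the text only says titles must contain "the specified keywords".
KEYWORDS = ("data mining", "information retrieval")
PUBLISHERS = {"ACM", "IEEE"}
START = date(1990, 1, 1)    # start of the collection period
CUTOFF = date(2007, 7, 31)  # papers added after this date are excluded


@dataclass
class Paper:
    title: str
    publisher: str
    added: date  # hypothetical field: date the record entered the database


def is_selected(paper: Paper) -> bool:
    """Return True if the paper meets all three conference-paper criteria."""
    title = paper.title.lower()
    return (
        any(keyword in title for keyword in KEYWORDS)  # keyword in the title
        and paper.publisher in PUBLISHERS              # published by ACM or IEEE
        and START <= paper.added <= CUTOFF             # inside the time window
    )


# Made-up records, for illustration only.
papers = [
    Paper("Scalable Data Mining for Web Logs", "ACM", date(2003, 6, 1)),
    Paper("Query Optimization in Relational Systems", "IEEE", date(1999, 3, 15)),
    Paper("Information Retrieval on Mobile Devices", "IEEE", date(2007, 9, 1)),
    Paper("A Survey of Information Retrieval", "Springer", date(2001, 5, 1)),
]

selected = [p for p in papers if is_selected(p)]
for p in selected:
    print(p.title)
```

Only the first record passes: the second lacks a keyword in its title, the third was added after the cutoff, and the fourth was not published by ACM or IEEE.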

After the search for conference papers was completed, conferences were first selected as the criterion for identifying representative papers. Table 3-1 lists the conferences from which papers were retrieved once the search was restricted to the selected conferences. The number of conference papers obtained from each database is recorded in Table 3-2. A total of 632 conference papers were collected for this research.

Table 3-1 Names of Conferences Responsible for the Publishing of All Conference Papers Collected.

Publisher  Abbreviation  Full Name of Conference
ACM        CIKM          Conference on Information and Knowledge Management
IEEE       COMPSAC       Computer Software and Applications Conference
IEEE       DASFAA        Database Systems for Advanced Applications
IEEE       DEXA          Database and Expert Systems Applications
ACM        DMKD          Data Mining and Knowledge Discovery
IEEE       HICSS         Hawaii International Conference on System Sciences
IEEE       ICDE          International Conference on Data Engineering
ACM        ICDM          International Conference on Data Mining
IEEE       IDEAS         International Database Engineering and Applications Symposium
IEEE       IPDPS         International Parallel and Distributed Processing Symposium
IEEE       ISDA          Intelligent Systems Design and Applications
ACM        SAC           Symposium on Applied Computing
ACM        SIGIR         Conference on Research and Development in Information Retrieval
ACM        SIGKDD        Conference on Knowledge Discovery in Data
ACM        SIGMOD        International Conference on Management of Data
IEEE       SSDBM         Scientific and Statistical Database Management
IEEE       WI            Web Intelligence
ACM        WIDM          Workshop On Web Information And Data Management
IEEE       WIRI          Web Information Retrieval and Integration
IEEE       WISE          Web Information Systems Engineering

Table 3-2 Number of Conferences Recorded in the Databases for Conference Papers and the Number of Papers Cited.

Name of Database  Number of Conferences Recorded in Database  Number of Papers Cited in this Research
ACM               85 conferences                              345
IEEE              Over 1700 conferences                       287


3.3.2 Search Journal Papers

Journal papers were collected from four databases. The collected papers came from journals listed in SCIE (Science Citation Index Expanded) or SSCI (Social Sciences Citation Index), and they were required to contain the specified keywords. Although the ACM and IEEE databases also include many papers published by other organizations, this research uses only papers published by ACM and IEEE themselves.

Database and search engine of SDOS: After logging in to the SDOS front page, the “Expanded Search” button was clicked to run the search. The selected data mining and information retrieval papers involved several fields, such as business, management, accounting, strategy, and computer science, indicating that the proposed technique may be applied to all of these domains. To verify this assumption, the following three groups of SDOS journals were selected in this analysis.

Database and search engine of ProQuest: ProQuest provides a variety of databases. Papers from the ABI, Academic Research Library, and ProQuest Science Journals were selected from this database, using the “Advanced Search” option of the web interface. Although each of the four search engines has its own search format, the search criteria were aligned across them as closely as possible.

The above method was used to collect journal papers from 1990 to 2007 that matched the search criteria in the four databases and appeared in journals listed in SCIE or SSCI. Table 3-3 presents the total number of journals collected in each database and the number of papers included in this analysis.

Table 3-3 The Data of Journals Collected in Databases and the Number of Papers Cited.

Name of Database  Number of Journals Collected  Number of Papers Cited in this Research
ACM               25                            59
IEEE              35                            121
SDOS              2161                          642
ProQuest          3942                          231
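As a quick consistency check, the per-database counts in Tables 3-2 and 3-3 can be summed. The conference total reproduces the 632 papers stated in Section 3.3.1. The journal total below is only a plain sum of the Table 3-3 column. It is an assumption that no paper is counted in more than one database, so it may overstate the number of distinct journal papers.

```python
# Paper counts copied from Table 3-2 (conference papers) and Table 3-3 (journal papers).
conference_papers = {"ACM": 345, "IEEE": 287}
journal_papers = {"ACM": 59, "IEEE": 121, "SDOS": 642, "ProQuest": 231}

conference_total = sum(conference_papers.values())
# Plain sum; assumes no paper appears in more than one database.
journal_total = sum(journal_papers.values())

print(conference_total)  # 632, matching the total stated in Section 3.3.1
print(journal_total)     # 1053
```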

The filtering rule was to select papers whose titles contain keywords related to information retrieval and data mining. The observation period runs from 1990 to 2007. ACM and IEEE are two important societies in computer science, and the quality of the conference papers they publish is trustworthy. The criteria for journal papers also required that the publishing journal appear on the SCI/SSCI list maintained by ISI. After filtering, we selected 632 conference papers
