
A Database Schema Integration Methodology for Data Warehouse 陳明毅、黃士銘


Academic year: 2022


E-mail: 9121492@mail.dyu.edu.tw

ABSTRACT

Designing a data warehouse system requires integrating distributed, heterogeneous databases into a reconciled data platform for strategic use. Establishing this platform requires, during the development phase, a global schema or global data views that carry richer semantics and integrity constraints than the local schemas do. In this dissertation, we describe a novel methodology for integrating independent local logical database schemas into a global semantic database schema. The local database schemas are first converted to Extended Entity Relationship (EER) models. Equivalences of domains, attributes, entities, and relationships are then identified between two of these EER models. A discovery and resolution mechanism based on these equivalence definitions handles the naming conflicts and structural conflicts between the two EER models, which establishes a reconciled semantic view of them. A semantic merge mechanism based on data analysis techniques is also applied to extract additional semantics during the merge phase, after which the two local database schemas are integrated. These steps are repeated until the schemas of all databases to be integrated have been consolidated into a single global schema.
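The pairwise, repeated integration described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the toy schema model (entity name mapped to a set of attribute names), the `synonyms` equivalence table, and the function names `resolve_naming`, `merge_pair`, and `integrate_all` are all assumptions made for illustration, and only naming conflicts are handled (structural conflicts and the data-analysis merge step are omitted).

```python
from functools import reduce

# Hypothetical toy model: a schema maps an entity name to its attribute names.
Schema = dict[str, set[str]]


def resolve_naming(schema: Schema, synonyms: dict[str, str]) -> Schema:
    """Rename entities and attributes to their canonical names.

    `synonyms` stands in for the equivalence definitions: it maps a local
    name to the name it is considered equivalent to.
    """
    resolved: Schema = {}
    for entity, attrs in schema.items():
        canonical = synonyms.get(entity, entity)
        resolved.setdefault(canonical, set()).update(
            synonyms.get(attr, attr) for attr in attrs
        )
    return resolved


def merge_pair(left: Schema, right: Schema, synonyms: dict[str, str]) -> Schema:
    """Integrate two schemas: resolve naming conflicts, then union entities."""
    merged = resolve_naming(left, synonyms)
    for entity, attrs in resolve_naming(right, synonyms).items():
        merged.setdefault(entity, set()).update(attrs)
    return merged


def integrate_all(schemas: list[Schema], synonyms: dict[str, str]) -> Schema:
    """Repeat the pairwise step until every schema has been folded in."""
    return reduce(lambda acc, s: merge_pair(acc, s, synonyms), schemas, {})


# Illustrative local schemas (invented for this example).
sales = {"Customer": {"cust_id", "name"}}
billing = {"Client": {"client_id", "address"}}
hr = {"Employee": {"emp_id", "name"}}
synonyms = {"Client": "Customer", "client_id": "cust_id"}

global_schema = integrate_all([sales, billing, hr], synonyms)
```

Here `Client` and `Customer` are declared equivalent, so their attributes end up on a single `Customer` entity in the global schema, while `Employee` is carried over unchanged.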

Keywords: data warehouse; database schema; schema integration

Table of Contents

Abstract
Contents
List of Figures

Chapter 1 Introduction
  1.1 Overview
  1.2 Research Motivation and Objective
  1.3 Organization of the Dissertation

Chapter 2 Related Work
  2.1 Formulation of Schema Integration Problem
  2.2 Taxonomy of Schema Conflict
  2.3 Related Schema Integration Methodology
    2.3.1 Superviews: An Approach Based on Integration Operators
    2.3.2 An Approach That Preserves Semantic Relativism
    2.3.3 Extended ER Clustering for Integration Process
  2.4 Summary

Chapter 3 The Formalism
  3.1 Motivation
  3.2 The EER Hierarchies
  3.3 The Principles of EER Schema Integration
    3.3.1 Attributes Level Equivalence
    3.3.2 Entity Level Equivalence
    3.3.3 Relationship Level Equivalence
  3.4 Summary

Chapter 4 A Methodology for Database Schema Integration
  4.1 Phase I: Discover and resolve the database schema conflicts
  4.2 Phase II: Strategies for schema integration
  4.3 A Case Study

Chapter 5 System Implement
  5.1 Overview
  5.2 The Meta-data
  5.3 The Schema Integration Tool
  5.4 A Data Warehouse Case Study
    5.4.1 Schema translation from various data models to EER model
    5.4.2 Schema integration of EER model into a global schema
    5.4.3 Data Conversion from existing databases to data warehouse
  5.5 Create Data Warehouse Application

Chapter 6 Conclusion and Future Work
  6.1 Conclusion
  6.2 Future Work

References
Appendix A. The Author’s Publication List

REFERENCES

[1] Berson, A., and Smith, S. J., “Data Warehousing, Data Mining, and OLAP,” McGraw-Hill, pp. 14-21, 1997.

[2] S. B. Navathe and S. G. Gadgil, “A methodology for view integration in logical data base design,” in Proc. 8th Int. Conf. Very Large Data Bases, Mexico City, Sept. 1982, pp. 142-152.

[3] C. Batini, M. Lenzerini, and S. B. Navathe, “A Comparative Analysis of Methodologies for Database Schema Integration,” ACM Computing Surveys, Vol. 18, No. 4, pp. 323-364, Dec. 1986.

[4] Tse-Min Hung, “A Novel Data Warehouse Architecture: A Database Proxy Server Approach,” Master Thesis, Department of Computer Science and Engineering, Tatung Institute of Technology, June 1999.

[5] Philippe-Pierre Dornier, Ricardo Ernst, Michel Fender, and Panos Kouvelis, “Global Operations and Logistics,” John Wiley & Sons, Inc., ISBN 0-471-12036-7, pp. 1-5, 1998.

[6] Prashant C. Palvia, “Research Issues in Global Information Technology Management,” Information Resource Management Journal, Vol. 11, No. 2, 1998.

[7] Larson, J. A., Navathe, S. B., and Elmasri, R., “A Theory of Attribute Equivalence in Databases with Application to Schema Integration,” IEEE Transactions on Software Engineering, Vol. 15, No. 4, pp. 449-463, Apr. 1989.

[8] Gengo Suzuki, Masashi Yamamuro, “Schema Integration Methodology Including Structural Conflict Resolution and Checking Conceptual Similarity,” Database Reengineering and Interoperability, Plenum Press, New York, pp. 247-260, 1996.

[9] Shoval, P. and Zohn, A., “Binary-relationship integration methodology,” Data & Knowledge Engineering, Vol. 6, pp. 225-250, 1996.

[10] Ernst, R., Fender, M. and Kouvelis, P., “Global Operations and Logistics: Text and Cases,” John Wiley & Sons, Inc., pp. 1-5, 1998.

[11] Stefano Spaccapietra, Christine Parent, and Yann Dupont, “Model Independent Assertions for Integration of Heterogeneous Schemas,” The VLDB Journal, Vol. 1, No. 1, pp. 81-126, July 1992.

[12] Omran A. Bukhres and Ahmed K. Elmagarmid, “Object-Oriented Multidatabase Systems: A Solution for Advanced Applications,” Prentice Hall, pp. 105-202, 1996.

[13] Giuseppe Santucci, “Semantic schema refinements for multilevel schema integration,” Data & Knowledge Engineering, Vol. 25, pp. 301-326, 1998.

[14] Li S. H., Huang S. M., and Chen H. H., “Discover Missing Semantic from Existing Relational Databases,” 8th International Database Workshop, Hong Kong, pp. 275-286, 1997.

[15] Shing-Han Li, “Translating A Relational Database Schema into An Extended Entity-Relationship Database Schema: A Data Mining and Data Dictionary Approach,” Master Thesis, Department of Computer Science and Engineering, Tatung Institute of Technology, Jun. 1997.

[16] Joseph Fong and Shi-Ming Huang, “Architecture of a Universal Database: A Frame Model Approach,” International Journal of Cooperative Information Systems, World Scientific Pub., Jun. 1999.

[17] Joseph Fong and Shi-Ming Huang, “Information Systems Reengineering,” Springer-Verlag Singapore Pte. Ltd., 1997.
