Many studies [30] claim that the similarity of two entities is influenced by their neighboring entities. Hence, when mapping a foreign key to candidate object properties, considering only the information carried by the foreign key itself leads to low mapping accuracy. Besides, since some of the semantics of foreign keys can be obtained by classifying tables, a foreign key can be matched at different levels depending on the semantics it has.
Combining the points above, two principles for mapping foreign keys to object properties are formulated as follows:
• Principle 1: The similarity between two entities should be influenced by their neighbor entities.
• Principle 2: Foreign keys can be matched at different levels depending on the semantics they have.
Therefore, these principles and the sources used to compute similarity are taken into account when mapping foreign keys to object properties. In the following sections, we introduce how to map the various kinds of foreign keys to object properties, except IS-A FKs. OWL ontologies do not use object properties to declare subclass and superclass relationships between classes; they use the built-in vocabulary “rdfs:subClassOf” instead, and hence IS-A FKs are ignored in the matching process.
4.5.1 Mapping Base FKs
As defined in 4.3.5, the semantics of a base FK explicitly express the fact that it relates one table to another, just as an object property links individuals of its domain classes to individuals of its range classes. Therefore, for a base foreign key and an object property, the total similarity can combine the local similarity and the external similarity to improve the quality of the mapping results. These similarities are defined as follows.
Definition 4.5.1 (Local Similarity for Two Entities). For two entities ei and ej, their local similarity is denoted Simlocal(ei, ej) and is the result of applying matchers to their local information, such as their names.
For a foreign key ci, the table in which ci is stored and the table to which ci refers are its main structures. Their counterparts for an object property opi are its domain classes and range classes. Hence, the external similarity of ci and opi can combine their domain similarity Simdomain(ci, opi) and range similarity Simrange(ci, opi). These two similarities are defined as follows.
Definition 4.5.2 (External Similarity of FKs and OPs). For foreign key ci of table ti referring to table tj and object property opi which has n related domain classes {dc1, dc2, . . . , dcn} and m related range classes {rc1, rc2, . . . , rcm}, the external
According to Principle 2, foreign keys can be matched at different levels depending on the semantics they have. This approach uses two matching levels, level one and level two. Level one matching considers only the explicit information that the entities have. In contrast, level two matching also uses implicit information to match entities. Since a base FK is regarded in this approach as an entity with no implicit information, it is matched only up to level one, and its similarity is defined as follows.
Definition 4.5.3 (Similarity for Base FKs and Object Properties). For base foreign key ci and object property opi, the similarity between them can be computed as:
Simone(ci, opi) = Simlocal(ci, opi) + Simexternal(ci, opi), where
• Simlocal(ci, opi) is the local similarity.
• Simexternal(ci, opi) is the external similarity.
Since each base foreign key has many possible object properties to match, we prioritize the matching candidates by the rank of their similarity. For example, the best matching candidate for a base foreign key ci is defined as follows.
Definition 4.5.4. For a base foreign key ci and n object properties {op1, op2, . . . , opn}, the best matching candidate for ci among these object properties is opi if:
Sim(ci, opi) = max_{1≤x≤n} Simone(ci, opx)
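The level one similarity and the choice of the best candidate can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the approach itself: the `ForeignKey` and `ObjectProperty` records, the string-ratio matcher `name_sim`, and in particular the way `sim_external` combines domain and range similarity (the best-matching domain class plus the best-matching range class) are assumptions made only so that the example runs.

```python
from __future__ import annotations

from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Sequence


@dataclass(frozen=True)
class ForeignKey:
    name: str
    table: str      # table the foreign key is stored in
    ref_table: str  # table the foreign key refers to


@dataclass(frozen=True)
class ObjectProperty:
    name: str
    domains: tuple[str, ...]  # names of the domain classes
    ranges: tuple[str, ...]   # names of the range classes


def name_sim(a: str, b: str) -> float:
    """Stand-in matcher (assumption): case-insensitive string ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def sim_local(fk: ForeignKey, op: ObjectProperty) -> float:
    """Local similarity: compares only the names of the two entities."""
    return name_sim(fk.name, op.name)


def sim_external(fk: ForeignKey, op: ObjectProperty) -> float:
    """External similarity.

    ASSUMPTION: domain similarity is the best match between the storing
    table and any domain class, range similarity is the best match between
    the referenced table and any range class, and the two are summed.
    """
    dom = max((name_sim(fk.table, d) for d in op.domains), default=0.0)
    rng = max((name_sim(fk.ref_table, r) for r in op.ranges), default=0.0)
    return dom + rng


def sim_one(fk: ForeignKey, op: ObjectProperty) -> float:
    """Level one similarity: local similarity plus external similarity."""
    return sim_local(fk, op) + sim_external(fk, op)


def best_candidate(fk: ForeignKey, ops: Sequence[ObjectProperty]) -> ObjectProperty:
    """Return the object property whose level one similarity is maximal."""
    return max(ops, key=lambda op: sim_one(fk, op))


if __name__ == "__main__":
    fk = ForeignKey(name="author_id", table="book", ref_table="author")
    ops = [
        ObjectProperty("hasAuthor", ("Book",), ("Author",)),
        ObjectProperty("hasPublisher", ("Book",), ("Publisher",)),
    ]
    print(best_candidate(fk, ops).name)
```

With the hypothetical schema in the usage part, the foreign key `author_id` of table `book` is ranked against two object properties, and the one with the highest combined score is returned as its best matching candidate.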
4.5.2 Mapping Part-Of FKs
For any two individuals I1 and I2 in an OWL ontology, if I1 is part of I2 and I2 has I1 as a part, then I1 and I2 are in an inverse relationship, and this kind of relation can easily be declared in OWL as two object properties that are inverses of each other. In a relational database schema, however, an inverse relationship between tables cannot be expressed explicitly. For any two tables ti and tj that have an inverse relationship in the real world, database designers usually treat the relationship as an ordinary binary relationship and let ti use a foreign key to refer to tj; the fact that tj could also use a foreign key to refer to ti is not captured.
As mentioned above, the fact that two tables have an inverse relationship is represented incompletely; in other words, some implicit information is not expressed in the relational database schema. Hence, extracting this implicit information is necessary to improve the quality of the mapping results. For this purpose, our approach takes advantage of Part-Of FKs to find inverse relationships between tables: for a table ti and its Part-Of FK ci referring to table tj, ti and tj have an inverse relationship, since both of them share the common primary key that describes them.
Mapping a Part-Of FK to object properties can adopt level two matching, since this kind of FK implicitly expresses that an inverse relationship exists between the two tables. In level one matching for a Part-Of FK ci, the similarity between ci and the object properties is computed in the same way as for a base FK. In level two matching, we create a conceived foreign key cj to express the implicit information of ci. The name of cj is the same as that of ci, but cj is a foreign key of the table that ci refers to, and cj refers to the table in which ci is stored. Hence, combining level one and level two matching for ci and n object properties yields 2n matching results for ci: n of them are obtained by matching with the explicit information of ci, and the other n by matching with the implicit information of ci. Now we define the best matching candidate of ci.
Definition 4.5.5. For tables ti and tj, if ti has a Part-Of foreign key ci referring to tj and ci is to be matched against n object properties {op1, op2, . . . , opn}, a conceived foreign key cj of tj referring to ti is created, and the best matching candidate for ci among these object properties is opi if:
When calculating the external similarity between foreign keys and object properties, we usually take the tables in which the foreign keys are stored and the domain classes of the object properties as input to compute the domain similarity, but for foreign keys of relationship tables we do not always do so. Relationship tables are created to facilitate “many-to-many” relationships between tables, not to represent entities in the real world. Furthermore, a relationship table can represent an n-ary relationship, whereas OWL ontologies support only unary or binary relationships.
Hence, for a relationship table ti with a set of foreign keys {c1, c2, . . . , cn}, two cases are considered when mapping these foreign keys to object properties, in order to capture the semantics of the relationship table as far as possible.
• Case 1: If n = 2 and COL(ti) = FK(ti), the table used to compute the domain similarity for c1 will be the table referred to by c2, and the table used to compute the domain similarity for c2 will be the table referred to by c1.
• Case 2: If n > 2 or COL(ti) ≠ FK(ti), the table used to compute the domain similarity for every foreign key of ti will be the relationship table itself.
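The two cases can be sketched as a small Python function that selects, for each foreign key of a relationship table, the table whose name feeds the domain similarity. This is only an illustration under stated assumptions: the `ForeignKey` and `RelationshipTable` records and the function `domain_tables` are hypothetical, COL(ti) is read as the set of all column names of ti and FK(ti) as the set of its foreign key column names, each foreign key is assumed to span a single column, and Case 2 is treated as covering every relationship table that does not satisfy Case 1.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ForeignKey:
    name: str       # column name of the foreign key (single column assumed)
    ref_table: str  # table the foreign key refers to


@dataclass(frozen=True)
class RelationshipTable:
    name: str
    columns: frozenset[str]               # COL(t): all column names (assumed reading)
    foreign_keys: tuple[ForeignKey, ...]  # the foreign keys c1, ..., cn


def domain_tables(rt: RelationshipTable) -> dict[str, str]:
    """Map each foreign key name to the table used for its domain similarity."""
    fks = rt.foreign_keys
    fk_columns = frozenset(fk.name for fk in fks)  # FK(t) (assumed reading)

    # Case 1: a pure binary relationship table. Each foreign key takes the
    # table referred to by the other foreign key as its domain table.
    if len(fks) == 2 and rt.columns == fk_columns:
        c1, c2 = fks
        return {c1.name: c2.ref_table, c2.name: c1.ref_table}

    # Case 2 (assumed to be every other relationship table): the
    # relationship table itself is the domain table of every foreign key.
    return {fk.name: rt.name for fk in fks}


if __name__ == "__main__":
    enrollment = RelationshipTable(
        name="enrollment",
        columns=frozenset({"student_id", "course_id"}),
        foreign_keys=(
            ForeignKey("student_id", "student"),
            ForeignKey("course_id", "course"),
        ),
    )
    print(domain_tables(enrollment))
```

In the hypothetical `enrollment` example the table holds nothing but its two foreign keys, so each key is paired with the table referred to by the other one; adding a further column such as a grade, or a third foreign key, would make the relationship table itself the domain table.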