4.4 A unified interface for integrating information retrieval

As mentioned above, exploring the information on the Internet intensively makes a unified interface for integrating information retrieval an urgent need. In this section, we outline the unified-interface approach and a flexible architecture for querying various information sources on the Internet and the WWW, using both a popular object model and a popular data model. We have proposed an Integrated Information Retrieval (IIR) service based on the Common Object Services Specification (COSS) for the Common Object Request Broker Architecture (CORBA), and we apply the Document Type Definition (DTD) of the eXtensible Markup Language (XML) to define the metadata of information sources, so that the mediator and the wrappers share one ontology. The objective of the IIR design is not only to provide programmers with a uniform interface for coding applications that query a variety of information sources on the Internet, but also to create a flexible and extensible environment in which system developers can easily add new or updated wrappers.

The mechanism is intended to achieve the following objectives:

1. To propose a uniform interface for information retrieval and gathering in an approved standard distributed object-oriented environment. This offers a programming interface through which applications retrieve what they want, and uses agent technology to implement the IIR infrastructure.

2. Each type of information source has its own query language, schema and attributes. The approach therefore has to support an extensible environment into which various information sources can be integrated in the future.

3. The Document Type Definition (DTD), published by the World Wide Web Consortium (W3C), is a popular schema description language. We apply the DTD of the eXtensible Markup Language (XML) to define the schema of information sources and to provide the IIR interface for managing metadata.

4. Because the interface is unified, a service provider can easily implement a wrapper for its own information source, which speeds up system development.

5. Both the CORBA object model and the XML data model are approved standards that are widely accepted by industry, and so will be accepted by users and programmers.

6. We adopt the Structured Query Language (SQL) in the IIR for transparently querying information from various sources on the WWW and the Internet. The IIR can seamlessly combine references to the Web with references to relational databases. Anyone familiar with SQL can easily create programs that use the IIR.

A flexible architecture and framework can improve access transparency, system scalability and extensibility. We dedicate our IIR design to such participants as data providers and information inquirers, who can dynamically and flexibly join the system, no matter

dynamically into the system. An IIR client can obtain the information about information sources by inquiring their metadata. A service provider can also replace, access, and maintain the metadata of information sources and provide an adaptable environment.

Metadata management, together with the extensibility and scalability of the system, is critical to the IIR.

In addition, the approach lets every local query system that joins the IIR retain its autonomy; that is, the IIR and the local query systems co-exist.

The IIR architecture is simple and complete. From the client's perspective, the requirements are a uniform access interface and a unified data model for representing the results of queries. With the IIR, clients use a standard query interface, based on the CORBA object model, to acquire information.

Figure 4.8 depicts the IIR architecture. It comprises InformationRetriever, MetaData, Wrapper and Collector. InformationRetriever acts as a mediator that dispatches query requests to the wrappers of the information sources and collects the results. A client program sends a query request to the information sources by first obtaining the InformationRetriever object from a Factory object. Owing to access transparency, the operations for querying all information sources through the IIR are the same. A client queries information sources by invoking InformationRetriever, which activates the corresponding wrapper(s) according to the query string passed as the parameter of the query operation. Access to the information sources is therefore completely transparent to clients, which only invoke an InformationRetriever object.

Figure 4.8: Architecture of IIR.
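The dispatching role of the mediator can be illustrated with a minimal Python sketch. It is not the CORBA implementation of the IIR: the class names follow Figure 4.8, but the method names (`create_retriever`, `query`, `put`), the in-memory wrappers and the source names `SourceA` and `SourceB` are assumptions made only for illustration.

```python
import re
from typing import Dict, List

Row = Dict[str, str]


class Wrapper:
    """Hypothetical wrapper that answers queries for one information source."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows

    def query(self, sql: str) -> List[Row]:
        # A real wrapper would translate `sql` into the source's own request
        # format; this stand-in simply returns its canned rows.
        return list(self._rows)


class Collector:
    """Gathers the results returned by the activated wrappers."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def put(self, rows: List[Row]) -> None:
        self.rows.extend(rows)


class InformationRetriever:
    """Mediator: selects wrappers from the table names in the query string."""

    def __init__(self, wrappers: Dict[str, Wrapper]) -> None:
        self._wrappers = wrappers

    def query(self, sql: str) -> List[Row]:
        match = re.search(r"\bFROM\s+(.+?)(?:\s+WHERE\b|$)", sql,
                          re.IGNORECASE | re.DOTALL)
        if match is None:
            raise ValueError("query has no FROM clause")
        collector = Collector()
        for name in (n.strip() for n in match.group(1).split(",")):
            if name not in self._wrappers:
                raise KeyError(f"no wrapper registered for source {name!r}")
            collector.put(self._wrappers[name].query(sql))
        return collector.rows


class Factory:
    """Hands out InformationRetriever objects to client programs."""

    def __init__(self, wrappers: Dict[str, Wrapper]) -> None:
        self._wrappers = wrappers

    def create_retriever(self) -> InformationRetriever:
        return InformationRetriever(self._wrappers)


factory = Factory({
    "SourceA": Wrapper([{"title": "a"}]),
    "SourceB": Wrapper([{"title": "b"}]),
})
retriever = factory.create_retriever()
rows = retriever.query("SELECT * FROM SourceA, SourceB")
```

The client code is the same whichever sources are named in the query, which is the access transparency described above.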

MetaData manages the metadata. Its purpose is to minimize the complexity of federating heterogeneous information sources. In the IIR it has the following three functions. First, a client that is unfamiliar with the schema and the semantics of the information sources it accesses can query MetaData, construct a world view and formulate the query string. Second, the metadata is the ontology of the information sources. The InformationRetriever and the Wrapper share the metadata when querying information sources and translating the content of a query. When the InformationRetriever receives a query request, it refers to the metadata to interpret the query string and to determine the related wrapper. MetaData thus gives the IIR its access transparency.
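As a rough illustration of the first two functions, the sketch below loads source descriptions from an XML document and answers schema and wrapper lookups. The element and attribute names (`sources`, `source`, `attribute`, `wrapper`), the source names and the `MetaData` methods are all hypothetical; the actual DTD of the IIR is not reproduced here, and the Python standard library parses the XML without validating it against a DTD.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical metadata document; the layout is an assumption for this sketch.
METADATA_XML = """
<sources>
  <source name="ExampleSearch" wrapper="ExampleSearchWrapper">
    <attribute name="title" type="string"/>
    <attribute name="url" type="string"/>
  </source>
  <source name="ExampleLibrary" wrapper="ExampleLibraryWrapper">
    <attribute name="author" type="string"/>
    <attribute name="year" type="integer"/>
  </source>
</sources>
"""


class MetaData:
    """Holds the schema and the wrapper name of every information source."""

    def __init__(self, xml_text: str) -> None:
        self._schemas: Dict[str, Dict[str, str]] = {}
        self._wrappers: Dict[str, str] = {}
        for source in ET.fromstring(xml_text).findall("source"):
            name = source.get("name")
            self._wrappers[name] = source.get("wrapper")
            self._schemas[name] = {
                attr.get("name"): attr.get("type")
                for attr in source.findall("attribute")
            }

    def sources(self) -> List[str]:
        """Lets a client discover which sources exist."""
        return sorted(self._schemas)

    def schema(self, source: str) -> Dict[str, str]:
        """Lets a client learn the attributes before writing a query."""
        return dict(self._schemas[source])

    def wrapper_for(self, source: str) -> str:
        """Lets the mediator determine the wrapper related to a source."""
        return self._wrappers[source]


metadata = MetaData(METADATA_XML)
available = metadata.sources()
library_schema = metadata.schema("ExampleLibrary")
```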

Finally, the IIR must be able to manage the metadata when the source dimension changes, for example when a wrapper of an information source is added or deleted. Neither the query operations in the client nor the objects in the IIR need to be changed: the InformationRetriever refers to the metadata and interprets the meaning of the query operation. The IIR is therefore extensible and scalable. Managing the metadata in the IIR requires some supporting methods.
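What such supporting methods might look like can be sketched as follows. The registry class, its methods `add_source`, `remove_source` and `lookup`, and the function `run_query` are hypothetical names chosen for this example, and a wrapper is reduced to a plain callable. The point is only that the client-side call stays the same while sources come and go.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
WrapperFn = Callable[[str, str], List[Row]]


class MetaDataRegistry:
    """Hypothetical management interface for the metadata of the sources."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[List[str], WrapperFn]] = {}

    def add_source(self, name: str, attributes: List[str],
                   wrapper: WrapperFn) -> None:
        self._entries[name] = (list(attributes), wrapper)

    def remove_source(self, name: str) -> None:
        del self._entries[name]

    def lookup(self, name: str) -> Tuple[List[str], WrapperFn]:
        return self._entries[name]


def run_query(registry: MetaDataRegistry, source: str,
              attribute: str, value: str) -> List[Row]:
    """Client-side call; it never changes when sources are added or removed."""
    attributes, wrapper = registry.lookup(source)
    if attribute not in attributes:
        raise ValueError(f"{source} has no attribute {attribute!r}")
    return wrapper(attribute, value)


def example_wrapper(attribute: str, value: str) -> List[Row]:
    # Stand-in for a real wrapper: echoes the condition it was asked about.
    return [{attribute: value}]


registry = MetaDataRegistry()
registry.add_source("ExampleSearch", ["title", "url"], example_wrapper)
found = run_query(registry, "ExampleSearch", "title", "agents")
registry.remove_source("ExampleSearch")
```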

The Wrapper is responsible for translating the query request into the request format of the information source, and for translating the results from the local system's data representation into that of the IIR system. If the results come from multiple similar sources, they are filtered: the wrapper activates the Filter object according to the kind of source. The result is packed into a standard format, for example XML, and put into the Collector object.
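The three wrapper steps (translate, filter, pack) can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the URL `http://search.example/cgi-bin/query`, the parameter names, the duplicate-removal rule used as the filter, and the `result`/`record` XML layout are not taken from the IIR design.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List
from urllib.parse import urlencode

Row = Dict[str, str]


def to_source_request(base_url: str, conditions: Dict[str, str]) -> str:
    """Translate query conditions into a CGI-style request URL."""
    return base_url + "?" + urlencode(conditions)


def filter_duplicates(rows: List[Row], key: str) -> List[Row]:
    """Hypothetical filter: keep the first row seen for each value of `key`."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def pack_as_xml(source: str, rows: List[Row]) -> str:
    """Pack rows into an XML document; the element names are assumptions."""
    root = ET.Element("result", {"source": source})
    for row in rows:
        record = ET.SubElement(root, "record")
        for field, value in row.items():
            ET.SubElement(record, field).text = value
    return ET.tostring(root, encoding="unicode")


request_url = to_source_request(
    "http://search.example/cgi-bin/query",
    {"q": "mobile agent", "lang": "en"},
)
raw_rows = [
    {"title": "A", "url": "http://a.example/"},
    {"title": "A again", "url": "http://a.example/"},
    {"title": "B", "url": "http://b.example/"},
]
packed = pack_as_xml("ExampleSearch", filter_duplicates(raw_rows, "url"))
```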

Finally, the Collector collects the results and translates them into an export view. For the client, the IIR supports a unified invocation approach for querying sources and obtaining results.
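A possible shape for this step is sketched below, assuming that each wrapper delivers an XML document with a `result` root and `record` children and that the export view is a flat list of rows. Both the XML layout and the methods `put` and `export_view` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


class Collector:
    """Collects XML results from wrappers and exposes one flat export view."""

    def __init__(self) -> None:
        self._documents: List[str] = []

    def put(self, xml_text: str) -> None:
        self._documents.append(xml_text)

    def export_view(self) -> List[Dict[str, str]]:
        rows = []
        for xml_text in self._documents:
            result = ET.fromstring(xml_text)
            for record in result.findall("record"):
                # Tag every row with the source it came from.
                row = {"source": result.get("source", "")}
                for field in record:
                    row[field.tag] = field.text or ""
                rows.append(row)
        return rows


collector = Collector()
collector.put('<result source="SourceA"><record><title>a</title></record></result>')
collector.put('<result source="SourceB"><record><title>b</title></record></result>')
view = collector.export_view()
```

Whatever the source, the client reads rows of the same form from `view`.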

Because of its benefits for integrating the various information sources that may be configured on the WWW or the Internet, the approach adopts SQL as the query language rather than inventing a new one. In this way, the query language of the IIR gives programmers the illusion that the information sources are stored and organized in a relational database.

As we know, a schema describes the structure of a relational database, i.e. the tables, the fields, and the relationships between them. In general, a Web-based document or a Web-based processing system can be treated as a single table even if it has multiple physical back-end databases, because it exposes a single interface for querying its data through a Common Gateway Interface (CGI) program. Examples are search engines and biographical query systems. For such systems, we suppose that the whole system contains only one table, whose name is defined as the service name. Some query examples of this approach are shown in Table 4.1.
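The single-table view means that the table name in a query identifies the service to contact. The sketch below splits a very restricted SELECT statement into service name, fields and equality conditions. It is an assumption-laden illustration, not the IIR parser: it accepts only one table, `AND`-joined `field = 'value'` conditions, and values that do not themselves contain the word AND. The service name `ExampleSearch` and its fields are hypothetical.

```python
import re
from typing import Dict, List, TypedDict


class ParsedQuery(TypedDict):
    service: str
    fields: List[str]
    conditions: Dict[str, str]


_SELECT = re.compile(
    r"^\s*SELECT\s+(?P<fields>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<where>.+?))?\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)
_CONDITION = re.compile(r"^\s*(\w+)\s*=\s*'([^']*)'\s*$")


def parse_query(sql: str) -> ParsedQuery:
    """Split a restricted SELECT into service name, fields and conditions."""
    match = _SELECT.match(sql)
    if match is None:
        raise ValueError("unsupported query: " + sql)
    fields = [f.strip() for f in match.group("fields").split(",")]
    conditions: Dict[str, str] = {}
    if match.group("where"):
        parts = re.split(r"\s+AND\s+", match.group("where"),
                         flags=re.IGNORECASE)
        for part in parts:
            cond = _CONDITION.match(part)
            if cond is None:
                raise ValueError("unsupported condition: " + part)
            conditions[cond.group(1)] = cond.group(2)
    # The table name doubles as the service name of the information source.
    return {"service": match.group("table"), "fields": fields,
            "conditions": conditions}


parsed = parse_query(
    "SELECT title, url FROM ExampleSearch "
    "WHERE keyword = 'mobile agent' AND lang = 'en'"
)
```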

Table 4.1: Some query examples of the integrated information retrieval approach

Querying string Purpose

SELECT * FROM Yahoo WHERE