XML Data Manipulations - Model-Mapping Schema Design for XML-Relational Database Systems

Data stored in a database system is routinely manipulated by users through insertion, deletion, updating, and querying. In this section, we discuss how to perform XML data manipulations under the XPred schema. Several query languages can be used to manipulate XML data, such as XQuery[16] and XPath[15]. We focus only on the parsing and processing of XQuery commands because of XQuery's popularity. Because we store XML data in a relational database, an XQuery command issued by a user must be translated into corresponding SQL commands.

3.3.1 XML Query Processing

We use the following example to explain how an XQuery command is translated into corresponding SQL commands.

FROM Path P1, Data D1, Data D2, Path P2 WHERE D1.PathID=P1.PathID

Figure 3.4: An example XQuery command in Figure 2.1 and its corresponding SQL command.

Example 3 Query Translation (from XQuery to SQL commands)

Consider the XML document in Figure 1.1. Figure 3.4(a) shows the XQuery command that retrieves the titles of the videos directed by Lance Rivera. To process this query in a relational database, we must translate the XQuery command into a corresponding SQL command. Two label-paths are involved in this example: "./Bib/Video/Director" and "./Bib/Video/Title". The condition is on the former label-path, and the results come from the latter. Because all data are stored in tables in a relational database, we must search all (Director, Title) pairs and select the titles from those pairs that belong to the same video and whose director is Lance Rivera. Figure 3.4(b) shows the corresponding SQL command under the XPred schema.

An algorithm that translates an XQuery command into a corresponding SQL command is described as follows. First, we must determine which tables need to be joined. Each label-path requires a join of two tables: the Table Path with either the Table Node or the Table Data. If a label-path ends at an element node, the Tables Path and Node are joined. If a label-path ends at a text or an attribute node, the Tables Path and Data are joined. When several label-paths are involved, the relationships between these label-paths determine which tables must be joined with each other. If two label-paths are siblings, a single equijoin on the attribute PredID between the two corresponding Data or Node tables suffices. The relationships between the label-paths can thus be translated into the FROM clause of the SQL command.

Next, the WHERE clause of the XQuery command can be translated into the WHERE clause of the SQL command. In Example 3, the third line (i.e., WHERE $y="Lance Rivera") in Figure 3.4(a) is translated into a selection (i.e., WHERE D1.Value='Lance Rivera') in the WHERE clause in Figure 3.4(b). The ORDER BY clause of the XQuery command can be translated into the ORDER BY clause of the SQL command. In Example 3, the fourth line (i.e., ORDER BY $x/Title) in Figure 3.4(a) is translated into the ninth line (i.e., ORDER BY D2.Value ASC) of the SQL command in Figure 3.4(b). Finally, the RETURN clause of the XQuery command can be translated into the SELECT clause of the SQL command, as in the SELECT clause in Figure 3.4(b).
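The translation of Example 3 can be tried out on a small in-memory database. The following is a minimal sketch, not a reproduction of Figure 3.4(b): it uses SQLite through Python's sqlite3 module in place of Oracle, the column name PathExp in the Table Path is a hypothetical name for the stored label-path, the sample rows are made up for illustration, and it assumes that a tuple in the Table Data records the parent element (here, the Video node) in PredID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table layout; PathExp is a hypothetical column name.
conn.executescript("""
CREATE TABLE Path (PathID INTEGER PRIMARY KEY, PathExp TEXT);
CREATE TABLE Node (NodeID INTEGER PRIMARY KEY, PathID INTEGER,
                   Ordinal INTEGER, PredID INTEGER);
CREATE TABLE Data (NodeID INTEGER PRIMARY KEY, PathID INTEGER,
                   Ordinal INTEGER, PredID INTEGER, Value TEXT);
""")
conn.executemany("INSERT INTO Path VALUES (?, ?)", [
    (1, "./Bib"), (2, "./Bib/Video"),
    (3, "./Bib/Video/Director"), (4, "./Bib/Video/Title"),
])
# One Bib element with three Video children (made-up sample data).
conn.executemany("INSERT INTO Node VALUES (?, ?, ?, ?)", [
    (1, 1, 1, None), (2, 2, 1, 1), (3, 2, 2, 1), (4, 2, 3, 1),
])
conn.executemany("INSERT INTO Data VALUES (?, ?, ?, ?, ?)", [
    (5, 3, 1, 2, "Lance Rivera"), (6, 4, 1, 2, "Zeta"),
    (7, 3, 1, 3, "Another Director"), (8, 4, 1, 3, "Middle"),
    (9, 3, 1, 4, "Lance Rivera"), (10, 4, 1, 4, "Alpha"),
])

query = """
SELECT D2.Value                                  -- RETURN  -> SELECT
FROM Path P1, Data D1, Data D2, Path P2          -- one Path/Data pair per label-path
WHERE D1.PathID = P1.PathID
  AND P1.PathExp = './Bib/Video/Director'
  AND D2.PathID = P2.PathID
  AND P2.PathExp = './Bib/Video/Title'
  AND D1.PredID = D2.PredID                      -- sibling label-paths: equijoin on PredID
  AND D1.Value = 'Lance Rivera'                  -- WHERE   -> selection
ORDER BY D2.Value ASC                            -- ORDER BY -> ORDER BY
"""
titles = [row[0] for row in conn.execute(query)]
print(titles)  # ['Alpha', 'Zeta']
```

The equijoin on PredID is what restricts the (Director, Title) pairs to the same video; without it, the query would return every title in the document whenever any video is directed by Lance Rivera.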

3.3.2 XML Data Insertion and Deletion

In this section, we explain how the XPred schema performs the insertion and deletion of XML data in a relational database. The insertion and deletion operations can be represented as follows:

InsertNode(Path(Node_i), PredID(Node_i), Value(Node_i)).

DeleteNode(Path(Node_i), NodeID(Node_i)).

Note that deleting a node Node_i must also delete all of its descendant nodes. Under the XPred schema, the insertion and deletion of XML data in a relational database proceed as follows:

XPred Insertion Approach:

Step1: We look up the ID of the path (i.e., PathID_i) of the node Node_i in the Table Path according to the given Path(Node_i). If it does not exist in the Table Path, we insert a record for the new path into the Table Path.

Step2: First, we determine whether the node Node_i is of the element type or the text type. If Node_i is of the element type, we insert a new tuple into the Table Node. We assign a new sequence number NodeID_i to Node_i, and we take the ID of the predecessor node Node_j of Node_i (i.e., PredID_i) from the given PredID(Node_i). We then find the maximum ordinal MaxOrdinal_i over all tuples in the Table Node whose PredID and PathID are equal to PredID_i and PathID_i, respectively, and set the ordinal of Node_i (i.e., Ordinal_i) to MaxOrdinal_i plus one. Finally, we insert the tuple (NodeID_i, PathID_i, Ordinal_i, PredID_i) into the Table Node. If Node_i is of the text type, we insert a new tuple into the Table Data instead. Inserting into the Table Data is very similar to inserting into the Table Node, except that the tuple inserted into the Table Data must also include the attribute Value of the node Node_i.
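The two insertion steps can be sketched as a small routine. This is an illustrative sketch under stated assumptions rather than the implementation used in the experiments: SQLite through Python's sqlite3 module stands in for Oracle, the function name insert_node and the column name PathExp are hypothetical, a missing value is taken to mean an element node, and new NodeIDs are drawn from one sequence shared by the Tables Node and Data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table layout; PathExp is a hypothetical column name.
conn.executescript("""
CREATE TABLE Path (PathID INTEGER PRIMARY KEY, PathExp TEXT);
CREATE TABLE Node (NodeID INTEGER PRIMARY KEY, PathID INTEGER,
                   Ordinal INTEGER, PredID INTEGER);
CREATE TABLE Data (NodeID INTEGER PRIMARY KEY, PathID INTEGER,
                   Ordinal INTEGER, PredID INTEGER, Value TEXT);
""")

def insert_node(conn, path, pred_id, value=None):
    """Insert an element node (value is None) or a text node (value given)."""
    # Step 1: find the PathID of the label-path, adding the path if it is new.
    row = conn.execute(
        "SELECT PathID FROM Path WHERE PathExp = ?", (path,)).fetchone()
    if row is None:
        path_id = conn.execute(
            "INSERT INTO Path (PathExp) VALUES (?)", (path,)).lastrowid
    else:
        path_id = row[0]

    # Step 2: choose the target table from the node type.
    table = "Node" if value is None else "Data"
    # New sequence number; sharing it across both tables is an assumption.
    node_id = conn.execute(
        "SELECT COALESCE(MAX(NodeID), 0) + 1 FROM "
        "(SELECT NodeID FROM Node UNION ALL SELECT NodeID FROM Data)"
    ).fetchone()[0]
    # Maximum ordinal among tuples with the same PredID and PathID, plus one.
    # "IS" is used so that a root node with PredID NULL also matches.
    max_ordinal = conn.execute(
        f"SELECT COALESCE(MAX(Ordinal), 0) FROM {table} "
        "WHERE PredID IS ? AND PathID = ?", (pred_id, path_id)).fetchone()[0]
    ordinal = max_ordinal + 1

    if value is None:
        conn.execute("INSERT INTO Node VALUES (?, ?, ?, ?)",
                     (node_id, path_id, ordinal, pred_id))
    else:
        conn.execute("INSERT INTO Data VALUES (?, ?, ?, ?, ?)",
                     (node_id, path_id, ordinal, pred_id, value))
    return node_id

bib = insert_node(conn, "./Bib", None)
video1 = insert_node(conn, "./Bib/Video", bib)
video2 = insert_node(conn, "./Bib/Video", bib)      # second sibling: Ordinal 2
title = insert_node(conn, "./Bib/Video/Title", video1, "Sample Title")
```

In this sketch the second Video node reuses the existing PathID and receives Ordinal 2, while the text node is stored in the Table Data together with its value.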

XPred Deletion Approach:

Step1: If the node Node_i to be deleted is only a text node, we select and delete from the Table Path the label-path that is owned only by Node_i. If Node_i is an element node, we select and delete from the Table Path the label-paths that lead only to Node_i and its descendant nodes Node_j. These label-paths are identified through the given Path(Node_i).

Step2: If the node Node_i to be deleted is only a text node, we find the tuple in the Table Data whose PathID is equal to PathID_i and whose NodeID is equal to NodeID_i, and then delete that tuple from the Table Data. If Node_i is an element node, we find all tuples of Node_i and its descendant nodes Node_j in the Table Node or the Table Data, where PathID is equal to PathID_i or PathID_j and NodeID is equal to NodeID_i or PredID_i. We then delete these tuples from the Table Node or the Table Data. These tuples are identified through the given NodeID(Node_i).
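The deletion approach can be sketched in the same style. The sketch below is an assumption-laden illustration, not the authors' code: it uses SQLite through Python's sqlite3 module, the function name delete_node and the column name PathExp are hypothetical, the sample rows are made up, and descendants are found by following PredID level by level. For simplicity it removes the tuples first and then drops the label-paths that no remaining node uses, which reverses the order of the two steps but is intended to have the same effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table layout and made-up sample rows; PathExp is hypothetical.
conn.executescript("""
CREATE TABLE Path (PathID INTEGER PRIMARY KEY, PathExp TEXT);
CREATE TABLE Node (NodeID INTEGER PRIMARY KEY, PathID INTEGER,
                   Ordinal INTEGER, PredID INTEGER);
CREATE TABLE Data (NodeID INTEGER PRIMARY KEY, PathID INTEGER,
                   Ordinal INTEGER, PredID INTEGER, Value TEXT);
INSERT INTO Path VALUES (1, './Bib'), (2, './Bib/Video'),
                        (3, './Bib/Video/Director'), (4, './Bib/Video/Title');
INSERT INTO Node VALUES (1, 1, 1, NULL), (2, 2, 1, 1), (3, 2, 2, 1);
INSERT INTO Data VALUES (4, 3, 1, 2, 'Director A'), (5, 4, 1, 2, 'Title A'),
                        (6, 4, 1, 3, 'Title B');
""")

def delete_node(conn, path, node_id):
    """Delete a node and all of its descendants; return how many were removed."""
    # Check that the node exists and is reached by the given label-path.
    found = conn.execute(
        "SELECT 1 FROM Path P WHERE P.PathExp = ? AND P.PathID IN ("
        "SELECT PathID FROM Node WHERE NodeID = ? "
        "UNION ALL SELECT PathID FROM Data WHERE NodeID = ?)",
        (path, node_id, node_id)).fetchone()
    if found is None:
        return 0

    # Collect the node and its descendants by following PredID.
    doomed, frontier = [node_id], [node_id]
    while frontier:
        marks = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT NodeID FROM Node WHERE PredID IN ({marks}) "
            f"UNION ALL SELECT NodeID FROM Data WHERE PredID IN ({marks})",
            frontier + frontier).fetchall()
        frontier = [r[0] for r in rows]
        doomed.extend(frontier)

    # Delete the collected tuples from both tables.
    marks = ",".join("?" * len(doomed))
    conn.execute(f"DELETE FROM Node WHERE NodeID IN ({marks})", doomed)
    conn.execute(f"DELETE FROM Data WHERE NodeID IN ({marks})", doomed)
    # Drop label-paths that were owned only by the deleted nodes.
    conn.execute(
        "DELETE FROM Path WHERE PathID NOT IN "
        "(SELECT PathID FROM Node UNION SELECT PathID FROM Data)")
    return len(doomed)

removed = delete_node(conn, "./Bib/Video", 2)
remaining_paths = [r[0] for r in conn.execute(
    "SELECT PathID FROM Path ORDER BY PathID")]
print(removed, remaining_paths)  # 3 [1, 2, 4]
```

Deleting the first Video node removes the node itself and its two text children. The Director label-path disappears because no other node uses it, while the Title label-path survives because the second video still has a title.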

Chapter 4 Performance Evaluation

The experiments described in this chapter assess the capabilities of the proposed approach in storing and manipulating XML documents in an XML storage system. In simulation experiments, the performance of the XPred schema was compared with that of the XParent schema in a relational database management system. The experiments were conducted on a PC with an Intel(R) Core(TM)2 Quad 2.40 GHz processor and 1024 MB of DDR2 memory running at 800 MHz; the relational database management system was Oracle10g. All algorithms of the XPred schema were implemented in C++. We used Xerces for C++ to parse XML documents. The experimental results are based on two performance diagnostic tools in Oracle10g: SQL trace and TKPROF. The rest of this chapter describes the performance metrics, data sets, and experimental results.

4.1 Performance Metrics

The primary performance metric is the response time of an operation, referred to as ResponseTime. The response time of each operation O_i is the time interval from its start to its finish. Let StartTime_i and FinishTime_i be the times at which O_i starts and finishes, respectively. The ResponseTime of an operation O_i is calculated as FinishTime_i − StartTime_i. Other metrics of interest are CPUTime, I/OTime, NumPhysicalIO, and NumLogicalIO.

CPUTime and I/OTime are the CPU computation time and the I/O execution time of an operation, respectively. NumPhysicalIO and NumLogicalIO are the total number of data blocks physically read from disk and the total number of buffers retrieved from memory, respectively.
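The definition of ResponseTime can be illustrated with a small timing wrapper. This is only a sketch of the formula FinishTime_i − StartTime_i measured with a wall-clock timer inside the client process; it is not how the reported figures were obtained, since those come from SQL trace and TKPROF. The helper name timed and the sample operation are hypothetical.

```python
import time

def timed(operation):
    """Run an operation and return (result, ResponseTime in seconds)."""
    start_time = time.perf_counter()    # StartTime_i
    result = operation()
    finish_time = time.perf_counter()   # FinishTime_i
    return result, finish_time - start_time

# Hypothetical operation standing in for a query, insertion, or deletion.
result, response_time = timed(lambda: sum(range(1000)))
print(f"ResponseTime = {response_time:.6f} s")
```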

Figure 4.1: The database description of an XML document generated by XMark.
