Data-Oriented XML Processing

From the document: Building Web Services with Java (pages 106-110)

When you’re thinking about applications working with XML, it’s important to note that all the mechanisms for parsing and generating XML described so far are syntax-oriented.

They force the application to work with concepts such as elements, attributes, and pieces of text. This is similar to applications that use text files for storage being forced to work with characters, lines, carriage returns (CR), and line feeds (LF).

Typically, applications want a higher-level view of their data. They aren’t concerned with the physical structure of the data, be it characters and lines in the case of text files or elements and attributes in the case of XML documents. They want to abstract this away and expose the meaning or semantics of the data. In other words, applications don’t want to work with syntax-oriented APIs; they want to work with data-oriented APIs. Therefore, typical data-oriented XML applications introduce a data abstraction layer between the syntax-oriented parsing and output APIs and application logic (see Figure 2.5).

Figure 2.5 Data abstraction layer in XML applications (diagram labels: Syntax-oriented APIs; Application; Data Abstraction Layer; Application Logic)
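As a rough illustration of such a layer, the following sketch keeps every DOM call inside one reader class, so the application logic sees only a plain Java type. The `<customer>` document shape and the names `Customer`, `CustomerReader`, and `Greeter` are assumptions made for this example; they do not come from the text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Application data: a plain Java type that exposes no XML concepts.
// (Hypothetical type, assumed for illustration.)
final class Customer {
    final int id;
    final String name;

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Data abstraction layer: the only code that touches the syntax-oriented DOM API.
// Assumes a document shaped like <customer id="7"><name>Ada</name></customer>.
final class CustomerReader {
    static Customer read(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        int id = Integer.parseInt(root.getAttribute("id"));
        String name = root.getElementsByTagName("name").item(0).getTextContent().trim();
        return new Customer(id, name);
    }
}

// Application logic: works with Customer only and never sees elements or attributes.
final class Greeter {
    static String greet(Customer customer) {
        return "Hello, " + customer.name + " (#" + customer.id + ")";
    }
}
```

If the XML format changes, only `CustomerReader` needs to change in this arrangement; `Greeter` is unaffected.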

When working with XML in a data-oriented manner, you’ll typically use one of two approaches: operation-centric and data-centric.

The Operation-Centric Approach

The operation-centric approach works in terms of custom-built APIs for certain operations on the XML document. The implementation of these APIs hides the details of XML processing. Only non-XML types are passed through the APIs.

Consider, for example, the task of SkatesTown trying to independently check the total amount on the invoices it’s sending to its customers. From a Java application perspective, a good way to implement an operation like this would be through the interface shown in Listing 2.29.

Listing 2.29 InvoiceChecker Interface

package com.skatestown.invoice;

import java.io.InputStream;

/**
 * SkatesTown invoice checker
 */
public interface InvoiceChecker {
    /**
     * Check invoice totals.
     *
     * @param invoiceXML Invoice XML document
     * @exception Exception Any exception returned during checking
     */
    void checkInvoice(InputStream invoiceXML) throws Exception;
}

The implementation of checkInvoice() must do the following:

1. Obtain an XML parser.

2. Parse the XML from the input stream.

3. Initialize a running total to zero.

4. Find all order items, and calculate item subtotals by multiplying quantities and unit prices. Add the item subtotals to the running total.

5. Add tax to the running total.

6. Add shipping and handling to the running total.

7. Compare the running total to the total on the invoice.

8. If there is a difference, throw an exception.

9. Otherwise, return.

The most important aspect of this approach is that any XML processing details are hidden from the application. It can happily deal with the InvoiceChecker interface, never knowing or caring about how checkInvoice() works.
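The nine steps listed above can be sketched with a DOM parser roughly as follows. This is an illustration under assumptions, not the book's implementation: the element and attribute names (`item`, `quantity`, `unitPrice`, `tax`, `shippingAndHandling`, `totalCost`) are guessed from the field names used elsewhere in this section, the class name `DomInvoiceCheckerSketch` is hypothetical, and namespaces are ignored.

```java
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Local copy of the interface from Listing 2.29 (package declaration omitted)
// so that this sketch compiles on its own.
interface InvoiceChecker {
    void checkInvoice(InputStream invoiceXML) throws Exception;
}

// Hypothetical DOM-based implementation; all XML names below are assumptions.
class DomInvoiceCheckerSketch implements InvoiceChecker {
    private static final double TOLERANCE = 0.001;

    @Override
    public void checkInvoice(InputStream invoiceXML) throws Exception {
        // Steps 1-2: obtain a parser and parse the input stream.
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(invoiceXML);

        // Step 3: initialize the running total.
        double runningTotal = 0.0;

        // Step 4: add quantity * unitPrice for every order item.
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            double quantity = Double.parseDouble(item.getAttribute("quantity"));
            double unitPrice = Double.parseDouble(item.getAttribute("unitPrice"));
            runningTotal += quantity * unitPrice;
        }

        // Steps 5-6: add tax, then shipping and handling.
        runningTotal += numberIn(doc, "tax");
        runningTotal += numberIn(doc, "shippingAndHandling");

        // Steps 7-9: compare with the stated total; throw on a mismatch, otherwise return.
        double statedTotal = numberIn(doc, "totalCost");
        if (Math.abs(runningTotal - statedTotal) >= TOLERANCE) {
            throw new Exception("Invoice total mismatch: computed " + runningTotal
                    + " but invoice states " + statedTotal);
        }
    }

    // Reads the text of the first element with the given name as a number.
    private static double numberIn(Document doc, String elementName) throws Exception {
        NodeList nodes = doc.getElementsByTagName(elementName);
        if (nodes.getLength() == 0) {
            throw new Exception("Missing element: " + elementName);
        }
        return Double.parseDouble(nodes.item(0).getTextContent().trim());
    }
}
```

A caller only needs an `InputStream` and the `InvoiceChecker` interface; nothing about DOM leaks out of the class.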

The Data-Centric Approach

An alternative is the data-centric approach. Data-centric XML computing reduces the problem of working with XML documents to that of mapping the XML to and from application data and then working with the data independently of its XML origins.

Application data covers the common datatypes developers work with every day: boolean values, numbers, strings, date-time values, arrays, associative arrays (dictionaries, maps, hash tables), database recordsets, and complex object types. Note that in this context, DOM tree objects aren’t considered true application data because they’re tied to XML syntax. The process of converting application data to XML is called marshalling. The XML is a serialized representation of the application data. The process of generating application data from XML is called unmarshalling.

For example, the XML invoice markup could be mapped to the set of Java classes introduced in the schema section (see Listing 2.30).

Listing 2.30 Java Classes Representing Invoice Data

class Address { ... }

class Item { ... }

class InvoiceItem extends Item { ... }

class Invoice
{
    int id;
    Date submitted;
    int customerId;
    Address billTo;
    Address shipTo;
    InvoiceItem order[];
    double tax;
    double shippingAndHandling;
    double totalCost;
}
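Once the invoice has been unmarshalled into objects like these, the total check no longer involves XML at all. The sketch below uses trimmed-down, hypothetical stand-ins (`LineItem`, `InvoiceData`) rather than the classes from Listing 2.30; in particular, the `quantity` and `unitPrice` fields are assumptions, because the listing elides the body of `Item`.

```java
import java.util.List;

// Hypothetical stand-in for InvoiceItem; quantity and unitPrice are assumed fields.
final class LineItem {
    final int quantity;
    final double unitPrice;

    LineItem(int quantity, double unitPrice) {
        this.quantity = quantity;
        this.unitPrice = unitPrice;
    }
}

// Hypothetical stand-in for Invoice, reduced to the fields the check needs.
final class InvoiceData {
    final List<LineItem> order;
    final double tax;
    final double shippingAndHandling;
    final double totalCost;

    InvoiceData(List<LineItem> order, double tax, double shippingAndHandling, double totalCost) {
        this.order = order;
        this.tax = tax;
        this.shippingAndHandling = shippingAndHandling;
        this.totalCost = totalCost;
    }
}

// Pure application logic: no parser, no elements, no attributes.
final class InvoiceTotals {
    // Sums the item subtotals, then adds tax and shipping and handling.
    static double computeTotal(InvoiceData invoice) {
        double total = 0.0;
        for (LineItem item : invoice.order) {
            total += item.quantity * item.unitPrice;
        }
        return total + invoice.tax + invoice.shippingAndHandling;
    }

    // True when the computed total matches the stated total to within half a cent.
    static boolean isConsistent(InvoiceData invoice) {
        return Math.abs(computeTotal(invoice) - invoice.totalCost) < 0.005;
    }
}
```

The same `InvoiceTotals` code would work whether the objects came from XML, a database, or a test fixture, which is the point of the data-centric approach.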

Schema Compilers

The traditional approach for generating XML from application data has been to custom-code the way data values become elements, attributes, and element content. The traditional approach of working with XML to produce application data has been to parse it using a SAX or a DOM parser. Data structures are built from the SAX events or the DOM tree using custom code. However, there are better ways to map data to and from XML using technologies specifically built for marshalling and unmarshalling data to and from XML. Enter schema compilation tools.

Schema compilers are tools that analyze an XML schema and code-generate marshalling and unmarshalling modules specific to that schema. These modules work with data structures tuned to the schema. Figure 2.6 shows the basic process for working with schema compilers. The schema compiler needs to be invoked only once; then the application can use the code-generated modules like any other API.
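To make the traditional, hand-coded approach concrete, here is a small sketch that marshals a list of order lines to XML with string-building code and unmarshals it again from SAX events. The `OrderLine` type, the `<order>`/`<item>` layout, and the choice to write every field as an attribute are all assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical application data type.
final class OrderLine {
    final String sku;
    final int quantity;

    OrderLine(String sku, int quantity) {
        this.sku = sku;
        this.quantity = quantity;
    }
}

final class HandCodedOrderMapping {
    // Marshalling: custom code decides that each field becomes an attribute of <item>.
    static String marshal(List<OrderLine> lines) {
        StringBuilder xml = new StringBuilder("<order>");
        for (OrderLine line : lines) {
            xml.append("<item sku=\"").append(escape(line.sku))
               .append("\" quantity=\"").append(line.quantity).append("\"/>");
        }
        return xml.append("</order>").toString();
    }

    // Unmarshalling: custom code rebuilds the data structure from SAX events.
    static List<OrderLine> unmarshal(String xml) throws Exception {
        final List<OrderLine> lines = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("item".equals(qName)) {
                    lines.add(new OrderLine(atts.getValue("sku"),
                            Integer.parseInt(atts.getValue("quantity"))));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return lines;
    }

    // Escapes the characters that are unsafe inside a double-quoted attribute value.
    private static String escape(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }
}
```

Every new field or structural change means editing both `marshal` and `unmarshal` by hand, which is the maintenance burden that schema compilers aim to remove.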

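Figure 2.6 Using a schema compiler (diagram labels: Target Schema; Binding Customizations; Schema Compiler; codegen; Marshaller Module; Unmarshaller Module; Target XML Format; Application Data)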

Binding Customization

In some cases, the object types generated by the schema compiler offer a good enough API for working with the types and elements described in the target schema. The application can use these classes directly. Other cases may require customization of the default binding of XML types to object types. That is where the binding customizations come in: They provide additional information to the schema compiler about how the binding between XML and application structures should happen.

There are two main reasons for applying customization:

• To deal with predefined application data structures: This reason applies in environments where the application already has defined object types to represent the concepts described in the schema. An example is a PO processing system that was designed to receive inputs from a human-facing UI and an EDI data feed. Now, the system must be extended to handle XML POs. The task is to map the XML of POs to the existing application data structures. There is zero chance that the default mapping defined by the schema compiler will do this in a satisfactory manner.

• To simplify the API: This reason for applying customization is driven by programming convenience. Sometimes the conventions of schema design don’t map well to the conventions of object-oriented design. For example, localized text is often represented in schema as a subelement with an xml:lang attribute identifying the language. Most applications represent this construct as a string object property whose value is determined by the active internationalization locale. Further, there is often more than one way to express a schema type in a programming language. For example, should an xsd:decimal be mapped to a BigDecimal, double, or float in Java? The right answer depends on the application.
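The xsd:decimal question in the last bullet can be made concrete with a small sketch. It sums the same lexical values once as `BigDecimal` and once as `double`; the class and method names are hypothetical and exist only for this illustration.

```java
import java.math.BigDecimal;

// Hypothetical helper comparing two possible Java mappings for xsd:decimal values.
final class DecimalMappingDemo {
    // Exact decimal arithmetic on the lexical values as written in the XML.
    static BigDecimal sumAsBigDecimal(String... lexicalValues) {
        BigDecimal sum = BigDecimal.ZERO;
        for (String value : lexicalValues) {
            sum = sum.add(new BigDecimal(value));
        }
        return sum;
    }

    // Binary floating-point arithmetic: faster and more compact,
    // but many decimal fractions cannot be represented exactly.
    static double sumAsDouble(String... lexicalValues) {
        double sum = 0.0;
        for (String value : lexicalValues) {
            sum += Double.parseDouble(value);
        }
        return sum;
    }
}
```

With the inputs "0.10" and "0.20", the `BigDecimal` sum compares equal to 0.30, whereas the `double` sum is not equal to the literal `0.3`. An application that handles money would probably prefer `BigDecimal`, while one that feeds values into numerical code may accept `double` for speed.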

Common examples of customizations include the following:

• Changing the names of namespaces, object types, and object properties; for example, mapping the customerID attribute to the _cid object property.

• Defining the type mapping, especially for simple types, as the previous xsd:decimal mapping example suggested.


• Choosing which subelements to map to object properties and whether to map them as simple types or as properties that are objects themselves, as the localized text example suggested.

• Specifying how repeated types should be mapped to collection types in programming languages. For example, should the order items in POs be represented by a simple array type, a dynamic array type, or some other data structure such as a list?
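The localized text case mentioned in the list above might be hand-coded along the following lines: several `description` subelements, each tagged with xml:lang, collapse into a single string chosen by locale. The `<item>`/`<description>` layout, the class name, and the fallback rule (use the first description when no language matches) are assumptions for this sketch, not behavior defined by any schema compiler.

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical mapping of repeated, language-tagged subelements to one string property.
final class LocalizedDescription {
    // Returns the description whose xml:lang equals the locale's language code,
    // the first description if none matches, or null if there are no descriptions.
    static String forLocale(String itemXml, Locale locale) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(itemXml)));
        NodeList descriptions = doc.getElementsByTagName("description");
        if (descriptions.getLength() == 0) {
            return null;
        }
        for (int i = 0; i < descriptions.getLength(); i++) {
            Element description = (Element) descriptions.item(i);
            if (locale.getLanguage().equals(description.getAttribute("xml:lang"))) {
                return description.getTextContent();
            }
        }
        return descriptions.item(0).getTextContent();
    }
}
```

A binding customization would let the generated classes expose this kind of single string property directly, instead of a collection of language-tagged objects.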

The Java community has defined a standard set of tools and APIs for mapping schema types to Java data structures, called the Java Architecture for XML Binding (JAXB). JAXB took a long time to develop because the problems it was trying to address were very complex. Initially, the work targeted DTD-to-Java mapping. This, and the fact that JAXB was JSR-31 in the Java Community Process (JCP), gives you an idea of how long JAXB has taken to evolve. Because of its long history, JAXB isn’t yet fully aligned with the latest thinking about XML type mapping for Web services. The good news is that JAXB now supports a significant part of XML Schema and is ready for production use. JAXB 2.0 will synchronize JAXB with JAX-RPC (the Java APIs for remote procedure calls using Web services), which will make JAXB even better suited for use with Web services.

Chapters 3 and 5 (“Implementing Web Services with Apache Axis”) introduce advanced data-mapping concepts specific to Web services as well as more sophisticated mechanisms for working with XML. The rest of this section will offer a taste of XML processing by implementing the checkInvoice() API described earlier using both a SAX and a DOM parser as well as JAXB.
