DOM-Based checkInvoice() - Building Web Services with Java

The basic architecture of the JAXP DOM parsing APIs is shown in Figure 2.8. This architecture uses the same factory design pattern as the SAX API. An application uses the javax.xml.parsers.DocumentBuilderFactory class to get a DocumentBuilder object instance, and uses that to produce a document that conforms to the DOM specification.

The value of the system property javax.xml.parsers.DocumentBuilderFactory determines which factory implementation produces the builder. This is how JAXP enables applications to work with different DOM parsers.
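As a minimal sketch of the factory pattern just described, the following parses a small in-memory document and reports which factory implementation JAXP selected. The class name FactoryDemo and the sample XML are made up for illustration, and the implementation class that gets printed depends on your JDK and classpath.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;

// Hypothetical demo class; not part of the book's example code.
class FactoryDemo {
    /** Parse an XML string with whatever DOM parser JAXP selects. */
    static Document parse(String xml) throws Exception {
        // newInstance() consults the javax.xml.parsers.DocumentBuilderFactory
        // system property first, then falls back to other lookup steps.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // The concrete class name varies by JDK and classpath.
        System.out.println(
            DocumentBuilderFactory.newInstance().getClass().getName());

        Document doc = parse("<invoice><tax>89.89</tax></invoice>");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

To try a different parser, you could set the system property on the command line with -Djavax.xml.parsers.DocumentBuilderFactory=, followed by the factory class name that your chosen parser documents.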

The important packages for working with JAXP and DOM are as follows:

• org.w3c.dom: Defines the DOM programming interfaces for XML (and, optionally, HTML) documents, as specified by the W3C

• javax.xml.parsers: Defines the DocumentBuilder and DocumentBuilderFactory classes

The DOM defines APIs that allow applications to navigate XML documents and to manipulate their content and structure. The DOM defines interfaces, not a particular implementation. These interfaces are specified using the Interface Definition Language (IDL) so that any language can define bindings for them. Separate Java bindings are provided to make working with the DOM in Java easy.

The DOM has several levels and various facets within a level. In the fall of 1998, DOM Level 1 was released. It provided the basic functionality to navigate and manipulate XML and HTML documents. DOM Level 2 builds upon Level 1 with more and better-segmented functionality:

• The DOM Level 2 Core APIs build on Level 1, fix some problem spots, and define additional ways to navigate and manipulate the content and structure of documents. These APIs also provide full support for namespaces.

• The DOM Level 2 Views API specifies interfaces that let programmers view alternate presentations of the XML or HTML document.

• The DOM Level 2 Style API specifies interfaces that let programmers dynamically access and manipulate style sheets.

• The DOM Level 2 Events API specifies interfaces that give programmers a generic event system.

• The DOM Level 2 Traversal-Range API specifies interfaces that let programmers traverse a representation of the XML document.

• The DOM Level 2 HTML API specifies interfaces that let programmers work with HTML documents.

[Figure: a DocumentBuilderFactory creates a DocumentBuilder, which reads XML data and produces a Document (DOM) tree of objects.]

Figure 2.8 DOM parsing architecture

All interfaces (apart from the Core ones) are optional. This is the main reason most applications rely entirely on the DOM Core. You can expect parsers to support more of the DOM soon. In fact, the W3C is currently working on DOM Level 3.
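Because the non-Core modules are optional, an application can ask the parser's DOM implementation which ones it supports before relying on them. The sketch below uses the standard DOMImplementation.hasFeature() call with DOM Level 2 feature names. The class name FeatureCheck and the choice of features to probe are illustrative, and the answers depend entirely on the parser in use.

```java
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.DOMImplementation;

// Hypothetical demo class; the feature list below is illustrative.
class FeatureCheck {
    /** Ask the default parser's DOM implementation about a feature. */
    static boolean supports(String feature, String version) throws Exception {
        DOMImplementation impl = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .getDOMImplementation();
        return impl.hasFeature(feature, version);
    }

    public static void main(String[] args) throws Exception {
        String[] features = { "Core", "Views", "Events", "Traversal", "Range" };
        for (String feature : features) {
            // Results depend on the parser; no particular output is implied.
            System.out.println(feature + " 2.0: " + supports(feature, "2.0"));
        }
    }
}
```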

The DOM originated as an API for XML processing at a time when the majority of XML applications were document-centric. As a result, the interfaces in the DOM describe low-level syntax constructs in XML documents. This makes working with the DOM for data-oriented applications somewhat cumbersome and is one of the reasons the Java community is working on the JDOM APIs.

To better understand the XML DOM, you need to be familiar with the core interfaces and their most significant methods. Figure 2.9 shows a Unified Modeling Language (UML) diagram describing some of them.

The root interface is Node. It contains methods for working with the node name (getNodeName()), type (getNodeType()), and attributes (getAttributes()). Node types cover various XML syntax elements: document, element, attribute, character data, text node, comment, processing instruction, and so on. All of these extend Node, but not all are shown in Figure 2.9. To traverse the document hierarchy, nodes can access their parent (getParentNode()) as well as their children (getChildNodes()). Node also has several convenience methods for retrieving the first and last child as well as the previous and following sibling.
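The traversal methods on Node can be sketched with a small recursive walk. The example below is illustrative only: the class name TreeWalker and the sample XML are invented, and it uses getFirstChild() and getNextSibling() to visit every node, printing each node's name and numeric type code and counting the element nodes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical demo class; not part of the book's example code.
class TreeWalker {
    /** Count element nodes in the subtree rooted at node (inclusive). */
    static int countElements(Node node) {
        int count = (node.getNodeType() == Node.ELEMENT_NODE) ? 1 : 0;
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            count += countElements(child);
        }
        return count;
    }

    /** Print each node's name and type code, indented by depth. */
    static void dump(Node node, String indent) {
        System.out.println(indent + node.getNodeName()
            + " (type " + node.getNodeType() + ")");
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            dump(child, indent + "  ");
        }
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<invoice><item quantity=\"2\"/><tax>1.50</tax></invoice>");
        dump(doc, "");
        System.out.println("elements: " + countElements(doc));
    }
}
```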

93 Processing XML

Figure 2.9 Key DOM interfaces and operations

The most important operations in Document involve creating nodes (at least one for every node type); assembling these nodes into the tree (not shown); and locating elements by name, regardless of their location in the DOM (getElementsByTagName()).

This last API is convenient because it can save you from having to traverse the tree to get to a particular node.
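The following sketch illustrates these Document operations together: it creates nodes, assembles them into a tree, and then locates elements by name without walking the tree. The class name BuildAndFind and the order wrapper element are assumptions made for the illustration and say nothing about the real invoice layout.

```java
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; the document layout is invented.
class BuildAndFind {
    /** Build a tiny invoice-like tree entirely through the DOM API. */
    static Document buildInvoice() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .newDocument();

        // Create nodes with the Document factory methods ...
        Element invoice = doc.createElement("invoice");
        Element order = doc.createElement("order");

        // ... and assemble them into the tree with appendChild().
        doc.appendChild(invoice);
        invoice.appendChild(order);
        for (int i = 1; i <= 2; i++) {
            Element item = doc.createElement("item");
            item.setAttribute("quantity", String.valueOf(i));
            order.appendChild(item);
        }

        Element tax = doc.createElement("tax");
        tax.appendChild(doc.createTextNode("89.89"));
        invoice.appendChild(tax);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = buildInvoice();
        // Finds both items even though they are nested under <order>.
        System.out.println(doc.getElementsByTagName("item").getLength());
    }
}
```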

The rest of the interfaces in the figure are simple. Elements, attributes, and character data each offer a few methods for getting and setting their data members. NodeList and NamedNodeMap are convenience interfaces for dealing with collections of nodes and attributes, respectively.

What Figure 2.9 doesn’t show is that DOM Level 2 is fully namespace-aware and that the DOM APIs that work with element and attribute names have versions that take namespace URIs. Typically, their name is the same as the name of the original API with NS appended, such as Element’s getAttributeNS(String nsURI, String localName).
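A short sketch of the NS variants follows. The namespace URI, the inv prefix, and the class name NamespaceDemo are hypothetical. Note that a JAXP factory is not namespace-aware unless setNamespaceAware(true) is called, and that an unprefixed attribute belongs to no namespace, so it is looked up with a null URI.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical demo class; not part of the book's example code.
class NamespaceDemo {
    // Hypothetical namespace URI, used only for this illustration.
    static final String NS = "http://example.com/ns/invoice";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // JAXP factories are not namespace-aware unless asked to be.
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Return the quantity attribute of the first item in namespace NS. */
    static String firstItemQuantity(Document doc) {
        // Matches on namespace URI and local name; the prefix is irrelevant.
        NodeList items = doc.getElementsByTagNameNS(NS, "item");
        Element item = (Element) items.item(0);
        // Unprefixed attributes are in no namespace, so the URI is null.
        return item.getAttributeNS(null, "quantity");
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<inv:invoice xmlns:inv=\"" + NS + "\">"
            + "<inv:item quantity=\"5\"/></inv:invoice>");
        System.out.println(firstItemQuantity(doc));
    }
}
```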

With this information in mind, building a class to check invoice totals becomes relatively simple. The DOM implementation of InvoiceChecker is shown in Listing 2.32.

Listing 2.32 DOM-Based Invoice Checker (InvoiceCheckerDOM.java)

    package com.skatestown.invoice;

    import java.io.InputStream;

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;

    import org.w3c.dom.CharacterData;
    import org.w3c.dom.Document;
    import org.w3c.dom.Element;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    /**
     * Check SkatesTown invoice totals using a DOM parser.
     */
    public class InvoiceCheckerDOM implements InvoiceChecker {
        /**
         * Check invoice totals.
         *
         * @param invoiceXML Invoice XML document
         * @exception Exception Any exception returned during checking
         */
        public void checkInvoice(InputStream invoiceXML) throws Exception {
            // Invoice running total
            double runningTotal = 0.0;

            // Obtain parser instance and parse the document
            DocumentBuilderFactory factory =
                DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(invoiceXML);

            // Calculate order subtotal
            NodeList itemList = doc.getElementsByTagName("item");
            for (int i = 0; i < itemList.getLength(); i++) {
                // Extract quantity and price
                Element item = (Element)itemList.item(i);
                Integer qty = Integer.valueOf(
                    item.getAttribute("quantity"));
                Double price = Double.valueOf(
                    item.getAttribute("unitPrice"));

                // Add subtotal to running total
                runningTotal += qty.intValue() * price.doubleValue();
            }

            // Add tax
            Node nodeTax = doc.getElementsByTagName("tax").item(0);
            runningTotal += doubleValue(nodeTax);

            // Add shipping and handling
            Node nodeShippingAndHandling =
                doc.getElementsByTagName("shippingAndHandling").item(0);
            runningTotal += doubleValue(nodeShippingAndHandling);

            // Get invoice total
            Node nodeTotalCost =
                doc.getElementsByTagName("totalCost").item(0);
            double total = doubleValue(nodeTotalCost);

            // Use delta equality check to prevent cumulative
            // binary arithmetic errors. In this case, the delta
            // is one half of one cent
            if (Math.abs(runningTotal - total) >= 0.005) {
                throw new Exception(
                    "Invoice error: total is " + Double.toString(total) +
                    " while our calculation shows a total of " +
                    Double.toString(Math.round(runningTotal * 100) / 100.0));
            }
        }

        /**
         * Extract a double from the text content of a DOM node.
         *
         * @param node A DOM node with character content.
         * @return The double representation of the node's content.
         * @exception Exception Could be the result of either a node
         *            that doesn't have text content being passed in
         *            or a node whose text content is not a number.
         */
        private double doubleValue(Node node) throws Exception {
            // Get the character data from the node and parse it
            String value = ((CharacterData)node.getFirstChild()).getData();
            return Double.valueOf(value).doubleValue();
        }
    }

InvoiceCheckerDOM must implement the InvoiceChecker interface in order to provide the checkInvoice() functionality. Apart from this, it’s a standalone class. Also, note that the class has no member data, because there is no need to maintain parsing context. The context is implicit in the hierarchy of the DOM tree that will be the result of the parsing process.

The factory pattern used here to parse the invoice is the same as the one from the SAX implementation; it just uses DocumentBuilderFactory and DocumentBuilder. Although the SAX parse method returns no data (it starts firing events instead), the DOM parse() method returns a Document object that holds the complete parse tree of the invoice document.


Within the parse tree, the call to getElementsByTagName("item") retrieves a node list of all order items. The loop iterates over the list, extracting the quantity and unitPrice attributes for every item, obtaining an item subtotal, and adding this to the running total.

The same getElementsByTagName() API combined with the utility function doubleValue() extracts the amounts of tax, the shipping and handling, and the invoice total cost.

Just as in the SAX example, the code has to use a difference check instead of a direct equality check to guard against inexact decimal-to-binary conversions.
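The need for a difference check can be seen in a few lines. This sketch is illustrative: the class and method names are invented, and the half-cent delta mirrors the one used in the listing.

```java
// Hypothetical demo class; the delta matches the half cent used in the listing.
class DeltaCheck {
    /** True when two amounts agree to within half a cent. */
    static boolean sameAmount(double a, double b) {
        return Math.abs(a - b) < 0.005;
    }

    public static void main(String[] args) {
        // 0.1 and 0.2 have no exact binary representation, so their
        // sum is not exactly 0.3 in double arithmetic.
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);            // false
        System.out.println(sameAmount(sum, 0.3));  // true
    }
}
```

An alternative design for monetary amounts is java.math.BigDecimal, which represents decimal values exactly and so avoids the conversion problem rather than tolerating it.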

The class also defines a convenient utility function that takes in a DOM node that should have only character content and returns the numeric representation of that content as a double. Any nontrivial DOM processing will typically require utility functions of this kind, which goes to show that the DOM is very syntax-oriented and not concerned with data.
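One possible variation on such a utility, offered as a sketch rather than as the book's method, relies on Node.getTextContent(). That method comes from DOM Level 3, so it assumes a parser and Java version that support it. It concatenates all descendant text, which copes with content that the parser has split into several nodes, for example around a comment. The class name NodeValues is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper class; assumes DOM Level 3 support (getTextContent).
class NodeValues {
    /** Parse the text content of a node as a double. */
    static double doubleValue(Node node) {
        if (node == null) {
            throw new IllegalArgumentException("node is missing");
        }
        // getTextContent() joins all descendant text and skips comments,
        // so it does not depend on the first child being a text node.
        String text = node.getTextContent().trim();
        // Throws NumberFormatException if the text is not a number.
        return Double.parseDouble(text);
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // The comment splits the text into two separate text nodes.
        Document doc = parse("<tax> 89<!-- cents follow -->.89 </tax>");
        System.out.println(doubleValue(doc.getDocumentElement()));
    }
}
```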

That’s all it takes to process the invoice using DOM. Of course, this is a simple example; just as in the SAX example, a real application would have done at least three things differently:

• It would have used namespace information and prefixed element names instead of using local names.

• It would have defined its own exception type to communicate invoice validation information. It would have implemented try-catch logic inside the checkInvoice() method in order to report more meaningful errors.

• It would have either explicitly turned on validation of the incoming XML document or traversed the DOM tree step by step from the document root to all the elements of interest. Using getElementsByTagName() presumes that the structure of the document (relative positions of elements) has already been validated. If this is the case, it’s okay to ask for all item elements regardless of where they are in the document. The example implementation took this approach for code readability purposes.
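To give a feel for the second and third points, here is a rough sketch that reads a single value by stepping down from the document root and reports problems through a dedicated exception. Everything in it is an assumption made for illustration: the InvoiceException and StrictInvoiceReader names, and in particular the premise that tax is a direct child of the root element, which the text above does not state.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical exception type for reporting invoice problems.
class InvoiceException extends Exception {
    InvoiceException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical reader; assumes <tax> sits directly under the root element.
class StrictInvoiceReader {
    /** Return the named direct child element, or fail with a clear message. */
    static Element child(Element parent, String name) throws InvoiceException {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE
                    && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        throw new InvoiceException(
            "Missing <" + name + "> under <" + parent.getNodeName() + ">", null);
    }

    /** Read the tax amount by stepping down from the document root. */
    static double readTax(String xml) throws InvoiceException {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    xml.getBytes(StandardCharsets.UTF_8)));
            Element tax = child(doc.getDocumentElement(), "tax");
            return Double.parseDouble(tax.getTextContent().trim());
        } catch (InvoiceException e) {
            throw e;  // already meaningful; pass it through unchanged
        } catch (Exception e) {
            // Wrap parser and number-format failures in the invoice exception.
            throw new InvoiceException(
                "Could not read tax: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readTax("<invoice><tax>89.89</tax></invoice>"));
    }
}
```

Unlike getElementsByTagName(), the child() helper rejects a tax element that appears anywhere other than the expected position, which is the structural check the last point describes.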

These changes aren’t complex, but they would have increased the size and complexity of the example beyond its goals as a basic introduction to DOM processing.

From: Building Web Services with Java (pages 116-121)