This chapter is from the book

Implementation Anomalies

This section details findings from exercising each of the aforementioned DOM implementations. Each implementation has its own warts and idiosyncrasies. Keep in mind that the implementations comply with the specifications to varying degrees, and that any of these issues may be addressed in newer releases.

Processing Instructions

					
<?xml version="1.0" encoding="UTF-8" ?>

Only the Oracle implementation returned a node for this line: a ProcessingInstruction node with the appropriate contents.
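One way to check how a given parser treats this line is to count the ProcessingInstruction nodes directly under the Document. The sketch below is an illustration only: it assumes the JAXP parser bundled with a current JDK (not one of the three implementations tested here), and the class name PiProbe and the xml-stylesheet sample are hypothetical. Strictly speaking, the XML 1.0 grammar treats the XML declaration as distinct from a processing instruction, so a parser that omits it is not necessarily wrong.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: counts processing instructions at the top of a document.
final class PiProbe {

    // Parses XML text with whatever JAXP parser is on the classpath.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Counts ProcessingInstruction nodes that are direct children of the Document.
    static int countTopLevelProcessingInstructions(String xml) {
        NodeList children = parse(xml).getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.PROCESSING_INSTRUCTION_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample input (an assumption): the declaration plus one genuine PI.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>"
                   + "<?xml-stylesheet type=\"text/xsl\" href=\"style.xsl\"?>"
                   + "<catalog/>";
        System.out.println("top-level PIs: " + countTopLevelProcessingInstructions(xml));
    }
}
```

Running the same probe against each implementation under test shows whether the declaration line is surfaced as a node or consumed silently by the parser.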

Unexpected Child Nodes

The IBM implementation returned a number of child nodes under the DOCUMENT_TYPE_NODE object, whereas both the Sun and Oracle implementations returned no children for this node. The IBM implementation listed the entities declared in the XML document as children of this node, along with a number of other nodes that appear to represent the structure of the DTD.
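A quick way to see what a parser hangs under the document type node is to count its children. This is a minimal sketch, assuming a JAXP parser is available; the class name DoctypeProbe and the sample DTD are hypothetical, and the count it reports will depend on the implementation, which is exactly the point of the comparison above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

// Hypothetical helper: inspects the children of the DOCTYPE node.
final class DoctypeProbe {

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Returns the number of child nodes under the DOCTYPE node,
    // or -1 when the document declares no DOCTYPE at all.
    static int doctypeChildCount(String xml) {
        DocumentType doctype = parse(xml).getDoctype();
        return doctype == null ? -1 : doctype.getChildNodes().getLength();
    }

    public static void main(String[] args) {
        // Sample document with a small internal DTD subset (an assumption).
        String xml = "<!DOCTYPE catalog [\n"
                   + "  <!ELEMENT catalog (#PCDATA)>\n"
                   + "  <!ENTITY copy \"Copyright\">\n"
                   + "]>\n"
                   + "<catalog>&copy;</catalog>";
        System.out.println("DOCTYPE children: " + doctypeChildCount(xml));
    }
}
```

Code that needs the declared entities is better served by DocumentType.getEntities() than by walking these children, since the DOM Core defines the former and says nothing about the latter.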

Results Using toString

Many Java developers, myself included, use the toString method to examine object contents. Calling it on different objects in the DOM hierarchy produced varying results. Do not depend on the output of this method, because the DOM Core does not specify what it should return. With that said, the following results were observed.

node.getAttributes().toString() returned differing results.

  • Sun returned 'discount="wholesale" cur="us"'.

  • IBM returned '[retail, us]'.

  • Oracle returned what appears to be the result of calling toString on each of the underlying attribute objects.
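Because toString output is unspecified, a portable alternative is to walk the NamedNodeMap and format the attributes explicitly. The following is a sketch under stated assumptions: the class name AttributeDump and the price element are hypothetical, and the attributes are sorted by name because the DOM does not guarantee any order within a NamedNodeMap.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: prints attributes without relying on toString().
final class AttributeDump {

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Formats the root element's attributes as name="value" pairs, sorted by name.
    static String describeAttributes(String xml) {
        NamedNodeMap attributes = parse(xml).getDocumentElement().getAttributes();
        Map<String, String> sorted = new TreeMap<String, String>();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            sorted.put(attribute.getNodeName(), attribute.getNodeValue());
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : sorted.entrySet()) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(entry.getKey()).append("=\"")
               .append(entry.getValue()).append('"');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<price discount=\"wholesale\" cur=\"us\">9.95</price>";
        System.out.println(describeAttributes(xml));
    }
}
```

Only getLength, item, getNodeName, and getNodeValue are used, all of which the DOM Core defines, so the output should be the same on any conforming implementation.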

CR/LF in XML Document Text

One requirement of the DOM is that implementations report structurally isomorphic results: two documents are identical from a processing perspective if they differ only in formatting that makes no structural difference, such as whitespace outside real content. In plain English, what goes in should come out. However, different implementations handle Carriage Return/Line Feed (CR/LF) pairs differently. Specifically, the CR/LF pair between lines in the XML document is discarded by the Sun parser but returned as a text node by the IBM parser. The Oracle implementation returned CR/LF pairs where expected.
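Code that walks child nodes can protect itself from this difference by counting or visiting only the node types it cares about. The sketch below assumes a JAXP parser that keeps inter-element whitespace as text nodes; the class name WhitespaceProbe and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: compares raw child counts with element-only counts.
final class WhitespaceProbe {

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Counts every child of the root element, including whitespace text nodes.
    static int countAllChildren(String xml) {
        return parse(xml).getDocumentElement().getChildNodes().getLength();
    }

    // Counts only element children, which is stable whether or not
    // the parser reports line breaks between elements as text nodes.
    static int countElementChildren(String xml) {
        NodeList children = parse(xml).getDocumentElement().getChildNodes();
        int count = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Two child elements separated by CR/LF pairs.
        String xml = "<order>\r\n<item/>\r\n<item/>\r\n</order>";
        System.out.println("all children:     " + countAllChildren(xml));
        System.out.println("element children: " + countElementChildren(xml));
    }
}
```

Filtering by node type keeps traversal code independent of whether a parser behaves like the Sun implementation or like the IBM and Oracle implementations described above.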

Comments

Comments are another area where implementations differed significantly.

  • Sun—Lost comments

  • IBM—Shown in appropriate places as comment nodes

  • Oracle—Shown in appropriate places as comment nodes
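With JAXP, whether comments reach the tree is a factory setting, so the behavior can be made explicit instead of left to the parser's default. This is a minimal sketch, assuming a JAXP parser that honors DocumentBuilderFactory.setIgnoringComments; the class name CommentProbe and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: counts comment nodes with comments kept or dropped.
final class CommentProbe {

    // Parses the XML, asking the factory to keep or drop comments.
    static Document parse(String xml, boolean ignoreComments) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setIgnoringComments(ignoreComments);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Recursively counts comment nodes at or below the given node.
    static int countComments(Node node) {
        int count = node.getNodeType() == Node.COMMENT_NODE ? 1 : 0;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            count += countComments(children.item(i));
        }
        return count;
    }

    static int countComments(String xml, boolean ignoreComments) {
        return countComments(parse(xml, ignoreComments));
    }

    public static void main(String[] args) {
        String xml = "<order><!-- first --><item><!-- second --></item></order>";
        System.out.println("comments kept:    " + countComments(xml, false));
        System.out.println("comments ignored: " + countComments(xml, true));
    }
}
```

Setting the flag explicitly documents the intent and avoids depending on whichever default a particular implementation ships with.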

Entities

Because the DOM Core allows for validating and non-validating parsers, entities can be expected to be handled slightly differently between implementations. The following results were observed.

  • Sun—Entities returned but values shown always as null

  • IBM—Entity class cast exception when casting to entity

  • Oracle—Returned as expected
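Given the class cast problem noted above, a defensive way to read entities is to test each node with instanceof before casting. The following sketch assumes a JAXP parser and a document with an internal DTD subset; the class name EntityProbe and the copy entity are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.Entity;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: lists declared entities without unchecked casts.
final class EntityProbe {

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
    }

    // Returns the names of all nodes in the DOCTYPE's entity map that
    // really implement Entity; anything else is skipped instead of cast.
    static List<String> entityNames(String xml) {
        List<String> names = new ArrayList<String>();
        DocumentType doctype = parse(xml).getDoctype();
        if (doctype == null) {
            return names; // no DOCTYPE, so no declared entities
        }
        NamedNodeMap entities = doctype.getEntities();
        for (int i = 0; i < entities.getLength(); i++) {
            Node node = entities.item(i);
            if (node instanceof Entity) {
                Entity entity = (Entity) node; // safe after the instanceof check
                names.add(entity.getNodeName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE catalog [<!ENTITY copy \"Copyright\">]>"
                   + "<catalog>&copy;</catalog>";
        System.out.println("entities: " + entityNames(xml));
    }
}
```

The Entity interface exposes getPublicId, getSystemId, and getNotationName but no direct accessor for replacement text, which may be part of why implementations differ in what they report as an entity's value.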

As we can see, the implementations differ. However, we can assume that, as of this writing, all of the implementations are beta releases and that many of these issues will be addressed.
