
Using SOAP to Send Binary Data

Our example messages to date have been fairly small, but we can easily imagine wanting to use SOAP to send large binary blobs of data. For example, consider an automated insurance claim registry: remote agents might use SOAP-enabled software to submit new claims to a central server, and part of the data associated with a claim might be digital images recording damages or the environment around an accident. Since XML can't directly encode true 8-bit binary data at present, a simple way to do this kind of thing might be to use the XML Schema type base64Binary and encode your images as base64 text inside the XML:

<soap:Envelope
 xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <soap:Body>
 <submitClaim>
  <accountNumber>5XJ45-3B2</accountNumber>
  <eventType>accident</eventType>
  <image imageType="jpg" xsi:type="xsd:base64Binary">
   4f3e9b0...(rest of encoded image)
  </image>
 </submitClaim>
 </soap:Body>
</soap:Envelope>

This technique works, but it's not particularly efficient in terms of bandwidth: base64 emits four text characters for every three bytes of input, so the encoded image is about a third larger than the original. It also takes processing time to encode and decode bytes to and from base64. Email has been using the Multipurpose Internet Mail Extensions (MIME) standard for some time now to do this job, and MIME allows the encoding of 8-bit binary. MIME is also the basis for some of the data encoding used in HTTP; since HTTP software can usually deal with MIME, it would be nice to integrate SOAP with this standard and gain a more efficient way of sending binary data.
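To make the bandwidth cost concrete, here is a minimal sketch that measures base64 expansion with the JDK's java.util.Base64. The class name Base64Overhead and the 300,000-byte stand-in image are assumptions made for illustration; they are not part of the insurance example.

```java
import java.util.Base64;

class Base64Overhead {
    // Length in characters of the padded base64 encoding of n bytes:
    // every group of up to 3 input bytes becomes 4 output characters.
    static int encodedLength(int n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        // Stand-in for a JPEG; the contents do not affect the encoded size.
        byte[] image = new byte[300_000];
        String text = Base64.getEncoder().encodeToString(image);
        System.out.println(image.length + " bytes -> " + text.length() + " base64 characters");
        // prints "300000 bytes -> 400000 base64 characters"
    }
}
```

The figure ignores any line breaks a MIME-style encoder might insert, which would add a little more.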

SOAP with Attachments and DIME

In late 2000, HP and Microsoft released a specification called "SOAP Messages with Attachments." The spec describes a simple way to use the multiref encoding in SOAP 1.1 to reference MIME-encoded attachment parts. We won't go into much detail here; if you want to read the spec, you can find it at http://www.w3.org/TR/2000/NOTE-SOAP-attachments-20001211.

The basic idea behind SOAP with Attachments (SwA) is that you use the same href trick you saw in the section "Object Graphs" to insert a reference to the data in the SOAP message instead of directly encoding it. In the SwA case, however, the reference is a cid: URI that names the Content-ID of the MIME part containing the data you're interested in, instead of the ID of some XML. So, the message encoded earlier would look something like this:

MIME-Version: 1.0
Content-Type: Multipart/Related; boundary=MIME_boundary;
 type=application/soap+xml; start="<claim@insurance.com>"

--MIME_boundary
Content-Type: application/soap+xml; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-ID: <claim@insurance.com>

<soap:Envelope
 xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
 <soap:Body>
 <submitClaim>
  <accountNumber>5XJ45-3B2</accountNumber>
  <eventType>accident</eventType>
  <image href="cid:image@insurance.com"/>
 </submitClaim>
 </soap:Body>
</soap:Envelope>

--MIME_boundary
Content-Type: image/jpeg
Content-Transfer-Encoding: binary
Content-ID: <image@insurance.com>

...binary JPG image...
--MIME_boundary--
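The packaging in the message above can be sketched in a few lines of plain Java. This is an illustration under stated assumptions, not how any particular SOAP toolkit does it: the class SwaPackager and its build method are hypothetical, the boundary and Content-ID values are copied from the example, and the top-level MIME-Version and Content-Type headers are left to whatever transport carries the body.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class SwaPackager {
    // Boundary string copied from the example message.
    static final String BOUNDARY = "MIME_boundary";

    // Append header or boundary text; MIME uses ASCII with CRLF line ends.
    private static void ascii(ByteArrayOutputStream out, String s) {
        byte[] b = s.getBytes(StandardCharsets.US_ASCII);
        out.write(b, 0, b.length);
    }

    // Build the multipart/related body: the SOAP envelope as the root part,
    // then the image as raw bytes with no base64 step.
    static byte[] build(String envelopeXml, byte[] imageBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ascii(out, "--" + BOUNDARY + "\r\n"
                + "Content-Type: application/soap+xml; charset=UTF-8\r\n"
                + "Content-Transfer-Encoding: 8bit\r\n"
                + "Content-ID: <claim@insurance.com>\r\n\r\n");
        byte[] xml = envelopeXml.getBytes(StandardCharsets.UTF_8);
        out.write(xml, 0, xml.length);
        ascii(out, "\r\n--" + BOUNDARY + "\r\n"
                + "Content-Type: image/jpeg\r\n"
                + "Content-Transfer-Encoding: binary\r\n"
                + "Content-ID: <image@insurance.com>\r\n\r\n");
        out.write(imageBytes, 0, imageBytes.length);
        ascii(out, "\r\n--" + BOUNDARY + "--\r\n");
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String envelope = "<submitClaim><image href='cid:image@insurance.com'/></submitClaim>";
        // Placeholder bytes standing in for a real JPEG.
        byte[] fakeJpeg = {(byte) 0xFF, (byte) 0xD8, 0x00, (byte) 0xFF, (byte) 0xD9};
        System.out.println(build(envelope, fakeJpeg).length + " bytes in multipart body");
    }
}
```

A real implementation would also have to pick a boundary that cannot occur inside any part; the sketch skips that check.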

Another technology called Direct Internet Message Encapsulation (DIME), from Microsoft and IBM, used a similar technique, except that the on-the-wire encoding was smaller and more efficient than MIME. DIME was submitted to the IETF in 2002 but has since lost the support of even its original authors.

SwA and DIME are great technologies, and they get the job done, but there are a few problems. The main issue is that both SwA and DIME introduce a data structure that is explicitly outside the realm of the XML data model. In other words, if an intermediary received the earlier MIME message and wanted to digitally sign or encrypt the SOAP body, it would need rules that told it how the content in the MIME attachment was related to the SOAP envelope. Those rules weren't formalized for SwA/DIME. Therefore, tools and software that work with the XML data model need to be modified in order to understand the SwA/DIME packaging structure and have a way to access the data embedded in the MIME attachments.
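The following sketch illustrates the gap, under loud assumptions: it uses a bare SHA-256 digest rather than real XML Signature, and both method names are hypothetical. A tool that only sees the XML computes the same digest no matter what the attachment contains; covering the attachment takes an extra rule that says which MIME part belongs with the envelope, which is exactly what SwA and DIME left unspecified.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class EnvelopeOnlyDigest {
    // Hex-encoded SHA-256 over the given chunks, fed to the digest in order.
    static String sha256Hex(byte[]... chunks) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] chunk : chunks) {
                md.update(chunk);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // What an XML-only tool covers: the envelope text, including the
    // cid: reference, but none of the bytes that reference points at.
    static String digestEnvelopeOnly(String envelopeXml, byte[] attachment) {
        return sha256Hex(envelopeXml.getBytes(StandardCharsets.UTF_8));
    }

    // A made-up packaging-aware rule (an assumption, not from any spec):
    // digest the envelope followed by the attachment bytes.
    static String digestEnvelopeAndAttachment(String envelopeXml, byte[] attachment) {
        return sha256Hex(envelopeXml.getBytes(StandardCharsets.UTF_8), attachment);
    }

    public static void main(String[] args) {
        String envelope = "<image href='cid:image@insurance.com'/>";
        byte[] original = {1, 2, 3};
        byte[] tampered = {9, 9, 9};
        System.out.println("envelope-only digest unchanged: "
                + digestEnvelopeOnly(envelope, original).equals(digestEnvelopeOnly(envelope, tampered)));
        System.out.println("packaging-aware digest unchanged: "
                + digestEnvelopeAndAttachment(envelope, original)
                        .equals(digestEnvelopeAndAttachment(envelope, tampered)));
    }
}
```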

Various XML and Web service visionaries began discussing the general issue of merging binary content with the XML data model in earnest. As a result, several proposals are now evolving to solve this problem in an architecturally cleaner fashion.

PASWA, MTOM, and XOP

In April 2003, the "Proposed Infoset Addendum to SOAP With Attachments" (PASWA) document was released by several companies including Microsoft, AT&T, and SAP. PASWA introduced a logical model for including binary content directly into a SOAP infoset. Physically, the messages that PASWA deals with look almost identical to our two earlier examples (the image encoded first as base64 inline with the XML and then as a MIME attachment); the difference is in how we think about the attachments. Instead of thinking of the MIME-encoded image as a separate entity that is explicitly referred to in the SOAP envelope, we logically think of it as if it were still inline with the XML. In other words, the MIME packaging is an optimization, and implementations need to ensure that processors looking at the SOAP data model for purposes of encryption or signing still see the actual data as if it were base64-encoded in the XML.

In July 2003, after a long series of conversations between the XML Protocol Group and the PASWA supporters, the Message Transmission Optimization Mechanism (MTOM) was born, owned by the XMLP group. It reframed the ideas in PASWA into an abstract feature to better sync with the SOAP 1.2 extensibility model, and then offered an implementation of that feature over HTTP. The serialization mechanism is called XML-Binary Optimized Packaging (XOP); it was factored into a separate spec so that it could also be used in non-SOAP contexts.

As an example, we slightly modified the earlier insurance claim by augmenting the XML with a content-type attribute (from the XOP spec) that tells us what MIME content type to use when serializing this infoset using XOP. Here's the new version:

<soap:Envelope
 xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xmlns:xop-mime="http://www.w3.org/2003/12/xop/mime">
 <soap:Body>
 <submitClaim>
  <accountNumber>5XJ45-3B2</accountNumber>
  <eventType>accident</eventType>
  <image xop-mime:content-type="image/jpeg"
      xsi:type="xsd:base64Binary">
   4f3e9b0...(rest of encoded image)
  </image>
 </submitClaim>
 </soap:Body>
</soap:Envelope>

An MTOM/XOP version of our modified insurance claim looks like this:

MIME-Version: 1.0
Content-Type: Multipart/Related; boundary=MIME_boundary;
 type=application/soap+xml; start="<claim@insurance.com>"

--MIME_boundary
Content-Type: application/soap+xml; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-ID: <claim@insurance.com>

<soap:Envelope
 xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
 xmlns:xop='http://www.w3.org/2003/12/xop/include'
 xmlns:xop-mime='http://www.w3.org/2003/12/xop/mime'>
 <soap:Body>
 <submitClaim>
  <accountNumber>5XJ45-3B2</accountNumber>
  <eventType>accident</eventType>
  <image xop-mime:content-type='image/jpeg'>
   <xop:Include href="cid:image@insurance.com"/>
  </image>
 </submitClaim>
 </soap:Body>
</soap:Envelope>

--MIME_boundary
Content-Type: image/jpeg
Content-Transfer-Encoding: binary
Content-ID: <image@insurance.com>

...binary JPG image...
--MIME_boundary--

Essentially, it's the same on the wire as the SwA version, but it uses the <xop:Include> element instead of just the href attribute. The real difference is architectural, since we imagine tools and APIs will manipulate this message exactly as if it were an XML data model.
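One way to picture "as if it were an XML data model" is a receiver that rebuilds the logical infoset before handing it to other tools. The sketch below does this with the JDK's DOM API, replacing each xop:Include with the base64 text of the part it references. It rests on several assumptions: XopInliner and its inline method are hypothetical names, the MIME parts are presumed already parsed into a map keyed by Content-ID without angle brackets, the namespace URI is the one from the example above (other versions of the XOP spec may use a different one), and URL-escaping in cid: references is ignored.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Base64;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class XopInliner {
    // Namespace of xop:Include as it appears in the example message.
    static final String XOP_NS = "http://www.w3.org/2003/12/xop/include";

    static Document inline(String envelopeXml, Map<String, byte[]> partsByContentId) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(envelopeXml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse envelope", e);
        }
        // Copy the matches first: the NodeList is live and shrinks as we edit.
        NodeList live = doc.getElementsByTagNameNS(XOP_NS, "Include");
        List<Element> includes = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            includes.add((Element) live.item(i));
        }
        for (Element include : includes) {
            String href = include.getAttribute("href");
            if (!href.startsWith("cid:")) {
                throw new IllegalArgumentException("not a cid: reference: " + href);
            }
            byte[] data = partsByContentId.get(href.substring("cid:".length()));
            if (data == null) {
                throw new IllegalArgumentException("no MIME part for " + href);
            }
            // Replace the parent's children (the Include element and any
            // surrounding whitespace) with the base64 form of the part.
            include.getParentNode().setTextContent(Base64.getEncoder().encodeToString(data));
        }
        return doc;
    }

    public static void main(String[] args) {
        String xml = "<submitClaim xmlns:xop='" + XOP_NS + "'>"
                + "<image><xop:Include href='cid:image@insurance.com'/></image>"
                + "</submitClaim>";
        // Placeholder bytes standing in for the JPEG part.
        Map<String, byte[]> parts =
                Collections.singletonMap("image@insurance.com", new byte[] {1, 2, 3});
        Document doc = inline(xml, parts);
        System.out.println(doc.getElementsByTagName("image").item(0).getTextContent());
    }
}
```

Materializing the base64 text like this gives up the size savings in memory; an implementation could instead expose the binary part lazily while still presenting it to signing or encryption code as element content.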

MTOM and XOP are on their way to being released by the XML Protocol Working Group sometime in 2004, and it remains to be seen how well they will be accepted by the broader user community. Early feedback has been very positive, however, and the authors of this book are behind the idea of a unified data model for XML and binary content.
