SQL Server Reference Guide

SQL Server I/O: Creating XML Output

Last updated Mar 28, 2003.

My grandpa used to say, "Life is short. Eat dessert first." I'll take that let's-do-the-easy-stuff-first approach in this week's installment of my XML series on SQL Server Input and Output.

It's fairly simple to get XML output from a query in SQL Server. In fact, every major database vendor now supports XML output of some sort. The hard part is inserting data from an XML document into a database. So let's eat dessert first; we'll cover inserts in next week's article.

You can make any database that supports ANSI SQL create XML documents. Here's a simple script that would create an XML document from the pubs database in SQL Server:

/*
Name:        ANSIXML.sql
Purpose:     Manually creates an XML file from selected
             database fields. To be used on disparate
             database platforms that do not have an XML
             engine. Note that you have to code your own element tags.
Author:      Buck Woody
Last Edited: 04/02/01
*/

SET NOCOUNT ON 
SELECT 
'<?xml version ="1.0" encoding = "UTF-8"?>' +CHAR(13)+
'<authors>'
GO
SELECT
'  <author id = "' + au_id+'">'+CHAR(13)+
'   <FirstName>'+au_fname+'</FirstName>'+CHAR(13)+
'   <LastName>'+au_lname+'</LastName>'+CHAR(13)+
'  </author>'+CHAR(13)
FROM authors

SELECT 
'</authors>'+CHAR(13)

This script isn't bulletproof, as some XML parsing engines will bark at the XML declaration at the top of the output. The XML specification accepts either single or double quotes there, but if your parser objects, render the line with single quotes instead:

<?xml version ='1.0' encoding = 'UTF-8'?>

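If you use this method, remember that those quotes sit inside a SQL string literal, so escape them with whatever syntax your platform requires (in T-SQL, double each single quote). The same goes for the + concatenation operator, which standard SQL spells ||. You might also run into trouble with the CHAR(13) part I have here: CHAR(13) is a carriage return and CHAR(10) is a line feed, so use whatever your platform needs to end a line. To be technical about it, you don't even need the line breaks anyway; whitespace between elements doesn't change the data.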
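One more caveat with hand-built tags: nothing escapes the data itself, so a name containing & or < would break the document. Here's a minimal Python sketch of the same concatenation approach with escaping added. The sample rows are made up for illustration, and `build_authors_xml` is just a name I picked.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Hypothetical sample rows standing in for (au_id, au_fname, au_lname).
ROWS = [
    ("111-11-1111", "Tom & Jerry", "O'Brien"),
    ("222-22-2222", "Ann", "a<b"),
]

def build_authors_xml(rows):
    """Concatenate element strings by hand, escaping every data value."""
    parts = ['<?xml version="1.0" encoding="UTF-8"?>', "<authors>"]
    for au_id, first, last in rows:
        # quoteattr() escapes the value and adds the surrounding quotes.
        parts.append(f"  <author id={quoteattr(au_id)}>")
        # escape() turns &, < and > into &amp;, &lt; and &gt;.
        parts.append(f"    <FirstName>{escape(first)}</FirstName>")
        parts.append(f"    <LastName>{escape(last)}</LastName>")
        parts.append("  </author>")
    parts.append("</authors>")
    return "\n".join(parts)

document = build_authors_xml(ROWS)
# Parsing succeeds only if the text is well-formed.
root = ET.fromstring(document.encode("utf-8"))
print(root[0].findtext("FirstName"))
```

The parser hands back the original value, ampersand and all, which is the round trip you want from any hand-coded approach.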

You'll notice that I formatted this output in a very "element-centric" fashion. That means each column becomes its own element, named after the column heading. That's my preference in this situation; someone else might require something different. In XML, elements can repeat within a parent, but an attribute name can appear only once per element. For instance, this is legal:

<FirstName>Buck</FirstName>
<FirstName>John</FirstName>
<FirstName>Jane</FirstName>

But this isn't:

<Author FirstName="Buck" FirstName ="John" FirstName ="Jane">Woody</Author>
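If you want to see a parser enforce that rule, here's a small Python sketch using the standard library's ElementTree. The `<Authors>` wrapper and the `is_well_formed` helper are my own additions for the test, not part of the examples above.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Repeated elements need a single parent; <Authors> is a hypothetical wrapper.
repeated_elements = (
    "<Authors>"
    "<FirstName>Buck</FirstName>"
    "<FirstName>John</FirstName>"
    "<FirstName>Jane</FirstName>"
    "</Authors>"
)

# The same attribute name appears three times on one element.
repeated_attributes = (
    '<Author FirstName="Buck" FirstName="John" FirstName="Jane">Woody</Author>'
)

print(is_well_formed(repeated_elements))    # repeated elements are accepted
print(is_well_formed(repeated_attributes))  # duplicate attributes are rejected
```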

In this case, I reserved attributes for metadata and used elements for the column data. It's completely acceptable to make the same output attribute-centric instead, creating a single element that contains an entire row of data, like this:

<Author id="123-45-6789" FirstName="Buck" LastName="Woody"/>

The advantage of this approach is that the document is smaller, and the XML parsing engine doesn't have to descend another level in the tree (each item in the tree is called a "node") while keeping track of where it is in the structure. Attributes can also appear in any order, and don't need to be "nested" like elements are. Again, the situation will dictate your design choices.
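To make the size difference concrete, here's a rough Python sketch that renders one row both ways and compares the lengths. The two function names are mine, and the exact serialization (spacing, attribute order) depends on the library you use, so treat the numbers as illustrative only.

```python
import xml.etree.ElementTree as ET

row = {"id": "123-45-6789", "FirstName": "Buck", "LastName": "Woody"}

def attribute_centric(record):
    """One empty element per row; every column becomes an attribute."""
    return ET.Element("Author", record)

def element_centric(record):
    """id stays an attribute; every other column becomes a child element."""
    author = ET.Element("Author", {"id": record["id"]})
    for name, value in record.items():
        if name != "id":
            ET.SubElement(author, name).text = value
    return author

attr_xml = ET.tostring(attribute_centric(row), encoding="unicode")
elem_xml = ET.tostring(element_centric(row), encoding="unicode")

print(attr_xml)
print(elem_xml)
print(len(attr_xml), len(elem_xml))
```

The attribute-centric string comes out shorter because each column name is written once instead of twice.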

Let's get back to the process for creating the XML. Why did Microsoft bother creating an engine in SQL Server to handle XML creation, if you can just hard-code some strings to do the same thing? For one thing, hard-coding requires touching the code each time the database structure changes. Microsoft includes a few ways to get data out of a database and into XML. You can select data with an extension to T-SQL, or use a Web interface to talk with SQL Server to create the documents.

Most T-SQL SELECT statements can return XML with a simple clause called FOR XML. Here's a simple script to get the same type of XML out of the pubs database:

/* Simple XML output, 
attribute-centric
*/
SELECT 
au_id
, au_fname
, au_lname
FROM authors
FOR XML AUTO

Here's an abbreviated output from that statement:

<authors au_id="409-56-7008" au_fname="Abraham" au_lname="Bennet"/>
<authors au_id="648-92-1872" au_fname="Reginald" au_lname="Blotchet-Halls"/>
<authors au_id="238-95-7766" au_fname="Cheryl" au_lname="Carson"/>
<authors au_id="722-51-5454" au_fname="Michel" au_lname="DeFrance"/>
<authors au_id="712-45-1867" au_fname="Innes" au_lname="del Castillo"/>
<authors au_id="427-17-2319" au_fname="Ann" au_lname="Dull"/>
<authors au_id="213-46-8915" au_fname="Marjorie" au_lname="Green"/>

You might notice a couple of things right away about this output. First, it seems to violate the syntax rules I mentioned in the last two articles, since there are no separate opening and closing tags. There's just one tag, and it's terminated with a slash at the end. That's actually OK for "empty" elements, that is, elements with no content between the tags; all of the data here lives in the attributes.
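Something else worth knowing about output shaped like this: it's a run of sibling elements with no single root, so a strict parser treats it as a fragment rather than a complete document. Here's a Python sketch that checks both points, using two rows copied from the output above. The `<root>` wrapper name is arbitrary and my own choice.

```python
import xml.etree.ElementTree as ET

# Two rows copied from the output above.
fragment = (
    '<authors au_id="409-56-7008" au_fname="Abraham" au_lname="Bennet"/>'
    '<authors au_id="427-17-2319" au_fname="Ann" au_lname="Dull"/>'
)

# The self-closing form and the open/close pair parse to the same empty element.
short_form = ET.fromstring('<authors au_id="409-56-7008"/>')
long_form = ET.fromstring('<authors au_id="409-56-7008"></authors>')
same = (short_form.attrib == long_form.attrib
        and short_form.text is None and long_form.text is None
        and len(short_form) == len(long_form) == 0)

# Several top-level elements are not a well-formed document on their own.
try:
    ET.fromstring(fragment)
    fragment_parses = True
except ET.ParseError:
    fragment_parses = False

# Wrapping the rows in one root element (the name "root" is arbitrary) fixes that.
wrapped = ET.fromstring("<root>" + fragment + "</root>")
names = [a.get("au_lname") for a in wrapped.findall("authors")]

print(same, fragment_parses, names)
```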

Also, the output is not element-centric but attribute-centric. The tag repeats over and over, with the attributes completely inside each tag. The qualifier on the query that created the XML is FOR XML AUTO. (There are other qualifiers to create the data as well, such as FOR XML RAW, which creates a "row" tag, and FOR XML EXPLICIT, which allows you to specify the "shape" of the XML data. We'll look at those two qualifiers later.)

The FOR XML AUTO qualifier has options to help us specify the XML output even further. The order of the columns in the SELECT list determines the nesting of the XML document. This nesting refers to which element (or node) is the "parent" and which are the "children."
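To picture what parent and child nesting means for joined data, here's a rough Python sketch that folds flat rows into a tree. It's only an illustration of the idea, not SQL Server's actual algorithm; the sample rows, the `nest` function, and the `<root>` wrapper are all my own assumptions.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical flat rows from a join: (au_lname, title),
# with each author's rows sitting next to each other.
rows = [
    ("Green", "The Busy Executive's Database Guide"),
    ("Green", "You Can Combat Computer Stress!"),
    ("Carson", "But Is It User Friendly?"),
]

def nest(flat_rows):
    """Turn flat (parent, child) rows into parent elements with child elements."""
    root = ET.Element("root")  # arbitrary wrapper so the result has one root
    # groupby() only merges adjacent rows, so the input order matters.
    for au_lname, group in itertools.groupby(flat_rows, key=lambda r: r[0]):
        parent = ET.SubElement(root, "authors", {"au_lname": au_lname})
        for _, title in group:
            ET.SubElement(parent, "titles", {"title": title})
    return root

tree = nest(rows)
for author in tree:
    print(author.get("au_lname"), [t.get("title") for t in author])
```

Each author appears once as a parent, with that author's titles tucked underneath as children.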

If what we're after is element-centric output, the FOR XML AUTO qualifier has an option to do that:

/* Simple XML output, 
element-centric
*/
SELECT 
au_id
, au_fname
, au_lname
FROM authors
FOR XML AUTO, ELEMENTS

And here's a partial output from that command:

<authors>
  <au_id>409-56-7008</au_id>
  <au_fname>Abraham</au_fname>
  <au_lname>Bennet</au_lname>
</authors>

The , ELEMENTS option is what does the trick.
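On the receiving end, element-centric output maps back to rows quite naturally. Here's a small Python sketch that reads the partial output shown above into dictionaries. I've wrapped it in an arbitrary `<root>` element so it parses as a complete document, and `rows_from_elements` is a name I made up.

```python
import xml.etree.ElementTree as ET

# The partial output above, wrapped in an arbitrary <root> element
# so that it can be parsed as a complete document.
xml_text = """<root>
<authors>
  <au_id>409-56-7008</au_id>
  <au_fname>Abraham</au_fname>
  <au_lname>Bennet</au_lname>
</authors>
</root>"""

def rows_from_elements(text):
    """Return one dict per <authors> element, keyed by child element name."""
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in author}
            for author in root.findall("authors")]

print(rows_from_elements(xml_text))
```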

As I mentioned, you can use almost any T-SQL query to create the XML document, but there are some pretty serious limitations. For one, you can't use any aggregate functions in the query when you use FOR XML AUTO. This makes sense, because it's difficult to think of a result set like that as a tree. And that's the crux: what you're really doing is mapping relational data to a hierarchical structure. Keep that concept in mind when you build your queries.

Another limitation is that you can't use a GROUP BY clause in the SELECT. You also can't use FOR XML in a subselect (which only makes sense), in a cursor, or in a view definition. You can, however, select from a view. In other words, this won't work:

CREATE VIEW MyTestXMLView
AS 
SELECT * 
FROM authors 
FOR XML AUTO

But this will:

SELECT * 
FROM SomeOtherView
FOR XML AUTO

We'll continue our XML journey next week.

Online Resources

Ronald Bourret has a fantastic treatise on XML and databases. First class!