SQL Server Reference Guide

XML Overview

Last updated Mar 28, 2003.

Although there's been a lot of hype about Extensible Markup Language (XML), in reality it isn't very complex. At its core, an XML document is a text file with special markers (tags) similar to those of Hypertext Markup Language (HTML). That doesn't mean XML isn't powerful; it is.

HTML has a finite set of tags, and is used to provide special instructions to web browsers to display text and graphics. XML tags, on the other hand, are not predefined. You define them to mean what you want, and it's left up to the consuming program to figure out what they're used for.

That flexibility allows XML to be used to transfer data between various software packages — from databases to word processors, from spreadsheets to databases. Not only that, but the systems running the software don't even have to be similar — for example, you can use the same XML file (called a document) between Macintosh equipment, Sun servers, and Microsoft operating systems.

XML files are structured documents. What that means is that the document is laid out in a particular way, and uses tags to indicate special words or sections of the document. You're probably used to seeing HTML documents with tags indicating bold, underlining, or other presentation methods. These tags "bookend" a word (or even a letter), causing the browser to interpret the tag as a formatting code. You've probably seen raw HTML like this:

This is <B>Bold</B>.

The <B> tag starts the boldfacing, and the </B> tag ends it. XML is similar, but tags are used for data, not just formatting. That means that you can have an XML document that defines the last-name field from the pubs database found in SQL Server like this:

<au_lname>Woody</au_lname>

The above example is just a snippet of what an XML document might contain. A more complete document might look like this:

<?xml version='1.0' encoding='UTF-8'?>
<authors>
  <author>
    <au_id>123-12-1234</au_id>
    <au_lname>Woody</au_lname>
    <au_fname>Buck</au_fname>
  </author>
</authors>

You can type this XML document with any text editor. What makes it XML is the first line — and an interpreter. You see, until the file is read by something, it's just text. The client application (Internet Explorer, for example) has to know how to deal with the file.

Just for fun, open Notepad and type or copy the lines you see above. Save the file with the extension .XML and then double-click that file (if you have an Internet browser installed). You'll notice that the file arranges itself in a self-collapsing hierarchy — which is ideal for database-type applications.

One final note about XML. Although HTML is quite forgiving about tag matching and case sensitivity, XML is not. If you don't close every tag and nest the tags in the correct order in your XML document, it won't be "well-formed" and it won't work properly. Also, XML is case-sensitive, meaning that <Name> is different from <name>.

XML and SQL Server 2000

Microsoft SQL Server 2000 has XML interpreters built into a dynamic link library (DLL). This DLL can read and write XML. Although SQL Server 2000 ships with native XML support, you'll need to obtain and apply various service packs to your installation if you want to exploit the latest XML features. You'll also need Microsoft's Internet Information Server (IIS) to use the HTTP-based XML features; the FOR XML clause and OPENXML work in any Transact-SQL session.

Using the web server, data can be queried directly from the database in the user's browser. Normally this is done within an HTML page. The data can be formatted using XML style sheets to produce a very rich client experience.

XML can also be queried out of the database with SELECT statements, and SQL Server 2000 supports the XPath XML query language. To go the other way and read an XML document as relational data, you load the document into memory with a stored procedure, and then select data out of the resulting rowset as you would from a table. When you're done, you release the document from memory.
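The load, query, and release cycle described above can be sketched roughly as follows. This is a minimal illustration that assumes the sp_xml_preparedocument and sp_xml_removedocument system stored procedures and the OPENXML rowset function; the sample document and the column sizes are made up for the example.

```sql
-- Hypothetical sample document; element names follow the pubs authors table.
DECLARE @hDoc INT
DECLARE @xml VARCHAR(1000)

SET @xml = '<authors>
  <author>
    <au_id>123-12-1234</au_id>
    <au_lname>Woody</au_lname>
    <au_fname>Buck</au_fname>
  </author>
</authors>'

-- 1. Parse the document into memory and get a handle to it.
EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml

-- 2. Query the in-memory document like a table.
--    The flag value 2 asks for element-centric mapping, so each column
--    in the WITH clause is filled from the child element of the same name.
SELECT au_id, au_lname, au_fname
FROM OPENXML(@hDoc, '/authors/author', 2)
     WITH (au_id    VARCHAR(11),
           au_lname VARCHAR(40),
           au_fname VARCHAR(20))

-- 3. Release the memory held by the parsed document.
EXEC sp_xml_removedocument @hDoc
```

The handle is only valid for the current connection, so the three steps normally run together in one batch or stored procedure.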

SQL Server 2000 can also store XML data directly in the database, using the dt:type attribute. While this isn't done a great deal, it allows an entire document to reside in one field of a database!

SQL Server 2000 includes XML support, and Yukon (the code name for the next release, SQL Server 2005) extends that support. With the support currently available in SQL Server 2000, you can do the following:

  • Use HTTP to access SQL Server
  • Use XDR (XML-Data Reduced) schemas and XPath queries
  • Access XML data using the SELECT statement and the FOR XML clause
  • Write XML data using OPENXML
  • Use SQLOLEDB to set XML documents as command text and to return the result sets as a stream

Creating and Accessing XML Data with SQL Server 2000

Here's the flow for producing XML documents using SQL Server 2000:

Before you can use XML with SQL Server, you need to configure the SQL Server 2000 XML product with extensions for Internet Information Server (IIS). You can install IIS on the same computer as SQL Server 2000, or on another machine.

Create three directories. (Anywhere is fine, but jot down where they are.)

The first directory acts as the main web site for the XML access (called the root).

The second directory (usually made under the first) is for the templates that SQL Server 2000 uses in IIS for XML. Templates are special XML documents that store queries.

Finally, create a directory (also under the first) that will house the schema. Schemas are references for data that XML documents need.

After you create the directories, tie them to IIS using virtual directories. To do this, click Configure SQL XML Support in IIS in the SQL Server Tools program group. This starts a wizard that points IIS to the directories you made earlier. The wizard also configures the security for how the queries will be handled, along with other important settings. (The default settings are fine for this test, but you can check Books Online later for more information about what they do.)

Now that the SQL Server and web site are configured, users can enter queries in the address bar of their web browser to display the data, or you can create predefined queries (called templates) stored on the server. This second method is the type seen most often.

Creating XML queries isn't difficult to learn. The key concept is adding FOR XML AUTO to the end of most Transact-SQL queries. The FOR XML clause tells SQL Server to return the result as XML, and the AUTO part is an XML mode. The mode determines the shape of the XML, so you can pick the one that the application needs. Here's a quick chart of those modes:

Mode      Description
AUTO      Returns a nested XML tree. Each table in the FROM clause becomes an XML element.
RAW       Each row in the query result becomes an XML element with the generic identifier row.
EXPLICIT  Lets you specify the shape of the XML tree.
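To make the difference between the modes more concrete, here is a small sketch of the same query written three ways. It assumes the authors table in the pubs sample database; the column choice and the author element name in the EXPLICIT query are only illustrative.

```sql
-- RAW: one generic <row> element per result row, with columns as attributes.
SELECT au_id, au_lname, au_fname
FROM authors
FOR XML RAW
GO

-- AUTO: elements are named after the table (here <authors>),
-- with columns as attributes.
SELECT au_id, au_lname, au_fname
FROM authors
FOR XML AUTO
GO

-- EXPLICIT: the Tag and Parent columns, plus column aliases of the form
-- ElementName!TagNumber!AttributeName, describe the shape of the tree.
-- The "element" directive emits au_lname as a child element
-- instead of an attribute.
SELECT 1        AS Tag,
       NULL     AS Parent,
       au_id    AS [author!1!au_id],
       au_lname AS [author!1!au_lname!element]
FROM authors
FOR XML EXPLICIT
GO
```

RAW and AUTO cover most simple cases; EXPLICIT takes more typing but is the mode to reach for when the consuming application expects a specific layout.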

For this test, enter the following query in your browser's address bar, passed as the sql parameter on the URL of the virtual directory you configured (assuming that you pointed your data source to the pubs database earlier):

SELECT * FROM authors FOR XML AUTO

The XML implementation in SQL Server 2000 continues to evolve, so even if you follow the instructions I've given, there's more to do. Microsoft has released several service packs to enhance the XML extensions for SQL Server. When you perform these upgrades, you'll get new versions of the XML documentation as well.

XML and SQL Server 2005

SQL Server 2005 fully implements XML right inside the engine of the product. The primary improvements you'll see in this version are the inclusion of XML as a native data type, enhancements to the FOR XML clause, and an implementation of the XQuery language.

The ability to store data natively as XML is a great new feature. Although XML can be stored out on the file system, there are times when it makes sense to store XML inside of the database. You have great control over the security of the data in a database, and the XML data is part of the maintenance schedule when it is in the database. You can even index the XML data for speedy access.
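As a rough sketch of the indexing point, the following creates a table with an XML column and a primary XML index on it. The table, constraint, and index names are hypothetical; the one firm requirement shown is that the table needs a clustered primary key before a primary XML index can be created.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE CustomerDocs
(
    DocId INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_CustomerDocs PRIMARY KEY CLUSTERED,
    Doc   XML NOT NULL
)
GO

-- The primary XML index stores a shredded form of the XML values,
-- so queries do not have to parse each document at run time.
CREATE PRIMARY XML INDEX PXML_CustomerDocs_Doc
    ON CustomerDocs (Doc)
GO
```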

But not everything is well represented as XML. If your data is hierarchical, has a sparse structure, and you want to query or update the data based on its structure, then it may make sense to store data as an XML document. Otherwise, you should stick with standard columns.

It's easy to create and use an XML column in your database. You declare the column with the XML data type, using the same syntax as for any other column, like this:

CREATE TABLE TestTable (C1 XML)
GO

To insert data into that table, you can use a variable, read an XML document, or just enter an element manually, like this:

INSERT INTO TestTable
VALUES('<Customer Name="Buck Woody" />')
GO
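The other two options mentioned, a variable and an XML document on disk, might look something like the sketch below. It assumes the TestTable created earlier; the file path is hypothetical and must be readable by the SQL Server service.

```sql
-- Insert from a variable of type XML.
DECLARE @doc XML
SET @doc = '<Customer Name="Buck Woody" />'

INSERT INTO TestTable (C1)
VALUES (@doc)
GO

-- Insert from a file. OPENROWSET with the BULK option and SINGLE_BLOB
-- returns the file contents in a single column named BulkColumn.
-- The path below is a placeholder, not a real file.
INSERT INTO TestTable (C1)
SELECT CAST(BulkColumn AS XML)
FROM OPENROWSET(BULK 'C:\Temp\customer.xml', SINGLE_BLOB) AS SourceFile
GO
```

In both cases SQL Server checks that the text is well-formed XML before storing it, so a mismatched tag makes the INSERT fail.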

With SQL Server 2005, you can still use the FOR XML clause on the SELECT statement, along with the sp_xml_preparedocument stored procedure and the OPENXML function, to work with XML documents in your queries.

XQuery is a new standard way of querying XML data. SQL Server 2005 includes XQuery language constructs that you can use within a SELECT statement. I won’t cover that here, but I will show you how to use it in other tutorials.
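Purely as a small preview, and not a full treatment, here is a sketch of XQuery expressions inside a SELECT statement. It assumes the TestTable and the Customer element from the earlier example, and uses the value(), query(), and exist() methods of the XML data type.

```sql
-- value(): extract a single scalar value. The expression must return
-- at most one item, which is why it ends in [1].
SELECT C1.value('(/Customer/@Name)[1]', 'nvarchar(50)') AS CustomerName
FROM TestTable
GO

-- query(): return an XML fragment.
-- exist(): returns 1 when the expression finds a match, so it can be
-- used as a filter in the WHERE clause.
SELECT C1.query('/Customer') AS CustomerElement
FROM TestTable
WHERE C1.exist('/Customer[@Name="Buck Woody"]') = 1
GO
```

Note that the method names are case-sensitive and must be written in lowercase.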
