SQL Server Reference Guide

SQL Server I/O: XML in Database Terms

Last updated Mar 28, 2003.

As a database professional, periodically you'll be asked to step outside the comfort of SQL Server knowledge and into a more data-centric view. You need to be familiar with many data technologies, especially the ones to which SQL Server can talk natively.

SQL Server has many ways to output and receive data. The most common way is in a data stream to an application, but you can also send data to, and receive it from, a text file, an Excel file, Oracle tables, a backup, and XML. Each has a use, and you should be conversant with them all.

In this series of articles, I explain methods, other than programming, that get data into and out of a SQL Server database. Here, I start with XML.

I covered XML briefly in an article on SQL Server features, which serves as a brief overview of how SQL Server uses XML. In the next few articles, I'll dive a bit deeper into the topics introduced in that earlier reference. In addition, the references section at the end of this article points out some great XML resources right here on InformIT.

Let's begin this exploration with the database reasons for XML. By now, you've probably heard a lot about this file format, which is all XML really is. There are many reasons to store data in a format other than SQL Server, but the most compelling I've seen is data interchange.

The beauty of XML is that it is self-defining, and it allows you a great deal of flexibility to create custom tags. In addition, many standard tags have been adopted by disciplines as varied as mathematics and insurance claims. I work in a large company, and we often have to exchange data with client and vendor systems. Some of these systems are connected with Business-to-Business (B2B) software, but some are smaller or more complicated hookups; XML transfers have been a real help in those situations.

To properly frame this discussion, let's define a few terms. By now you've probably read at least a few articles describing XML layouts. I won't repeat that information here. What may be more useful for our purposes is to put a few definitions into database terms. Before I do that, we need to get the ground rules straight for making a comparison like this.

XML is a specification, but it lends itself to creating hierarchical data arrangements. SQL Server stores data using relational concepts, so the two don't meld together seamlessly. There are things you can do in XML that you can't in a SQL Server database, and vice-versa. So take the following information in that spirit. I'll refer back to this information fairly often throughout this series.

XML Term: Document

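An XML document is a set of text characters (plain ASCII in the simplest case) with two parts: a special set of characters called a declaration, and an opening and closing root tag. Strictly speaking, XML 1.0 treats the declaration as optional, but you should always include it. The document can be this simple: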

<?xml version="1.0" ?>
<authors> </authors>

Note that the above example is the same as:

<?xml version="1.0" ?>
<authors> 
</authors>

In the database world, the counterpart to the XML document is a table. You could actually have several tables within an XML document, but that makes it more difficult to structure the document so that it acts like a relational database.
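To see that the two minimal documents above really are equivalent, here is a small sketch. It assumes Python's standard xml.etree.ElementTree module as the XML processor; that choice is mine for illustration, and any conforming parser should behave the same way.

```python
import xml.etree.ElementTree as ET

# The two minimal documents from the text; only the whitespace differs.
one_line = '<?xml version="1.0" ?>\n<authors> </authors>'
two_lines = '<?xml version="1.0" ?>\n<authors> \n</authors>'

root_a = ET.fromstring(one_line)
root_b = ET.fromstring(two_lines)

# Both parse to a single root element named "authors" with no children.
print(root_a.tag, len(root_a))
print(root_b.tag, len(root_b))

# In both cases the content between the tags is whitespace only.
same_content = root_a.text.strip() == root_b.text.strip() == ""
print(same_content)
```

In table terms, both documents describe an empty authors table: the structure exists, but there are no rows yet.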

XML Term: Processing Instructions

Processing Instructions are commands to the XML processing engine, or consumer. Most Web browsers have an XML processor, as do SQL Server and other tools.

Processing Instructions start and end this way: <? ?>. There are several, but the one you'll see at the top of nearly every document is the declaration, which uses the same syntax:

<?xml version="1.0" ?>

The database counterpart for a processing instruction might be the file storage format.

XML Term: Element (Tag)

Elements in XML are the delimiters of data. Almost any word can be used as a tag, and case matters. Once a tag is opened, it must be closed with a matching tag that begins with a slash (/), such as </fname>, or the document won't parse correctly. Although HTML is fairly forgiving about un-closed tags and case, XML is not.

Element tags are analogous to a database column. You can structure the document to use attributes (more on that in a moment) as columns instead, but elements can repeat and attributes cannot. In the following example, a declaration starts an XML document, the root element (authors) is set, and the first-name (fname) elements are repeated:
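A quick way to see how strict XML is about closing tags and case is to hand a parser a few variations. This is only a sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed is my own invention.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

closed = "<authors><fname>Buck</fname></authors>"
unclosed = "<authors><fname>Buck</authors>"             # fname is never closed
wrong_case = "<authors><fname>Buck</FNAME></authors>"   # case does not match

results = {
    "closed": is_well_formed(closed),
    "unclosed": is_well_formed(unclosed),
    "wrong_case": is_well_formed(wrong_case),
}
print(results)
# {'closed': True, 'unclosed': False, 'wrong_case': False}
```

An HTML browser would probably render all three. An XML processor rejects the second and third outright.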

<?xml version="1.0" ?>
<authors> 
  <fname>Buck</fname>
  <fname>John</fname>
  <fname>Dianna</fname>
  <fname>Terri</fname>
</authors>
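To make the element-as-column idea concrete, the sketch below reads the repeated fname elements and loads them into a one-column table. It assumes Python's standard library throughout, and it uses the built-in sqlite3 module purely as a stand-in for SQL Server so that the example runs anywhere. The table layout is my assumption, not something SQL Server generates.

```python
import sqlite3
import xml.etree.ElementTree as ET

doc = """<?xml version="1.0" ?>
<authors>
  <fname>Buck</fname>
  <fname>John</fname>
  <fname>Dianna</fname>
  <fname>Terri</fname>
</authors>"""

root = ET.fromstring(doc)

# Each repeated <fname> element supplies one value for the fname column.
first_names = [element.text for element in root.findall("fname")]

# sqlite3 stands in for SQL Server here; the root element names the table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (fname TEXT)")
conn.executemany(
    "INSERT INTO authors (fname) VALUES (?)",
    [(name,) for name in first_names],
)
rows = conn.execute("SELECT fname FROM authors ORDER BY rowid").fetchall()
conn.close()

print(rows)
# [('Buck',), ('John',), ('Dianna',), ('Terri',)]
```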

XML Term: Element (Data)

Enclosed between element tags are the data they hold. In this example:

<fname>Buck</fname>

The Element tag is fname and the data it contains is Buck.

XML Term: Entity

An XML entity is basically an escape sequence. Since certain characters, such as the ampersand (&), can't appear literally in XML data, an entity provides a way to represent them. You aren't limited to special characters; you can also define your own entities.

Using entities in this way is similar to either character string escapes in T-SQL or User Defined Data Types (UDTs).
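Here is a brief sketch of both uses, again assuming Python's standard xml.etree.ElementTree and xml.sax.saxutils modules. The sample text, the entity name pub, and its value are all made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Predefined entity: &amp; stands in for a literal ampersand.
title = ET.fromstring("<title>Data &amp; Databases</title>")
print(title.text)
# Data & Databases

# Going the other way, escape() produces the entity form for output.
escaped = escape("Data & Databases")
print(escaped)
# Data &amp; Databases

# A custom entity declared inside the document itself (hypothetical name).
doc = '<!DOCTYPE note [<!ENTITY pub "InformIT">]><note>Published by &pub;</note>'
note = ET.fromstring(doc)
print(note.text)
# Published by InformIT
```

The parser swaps each entity for its value as it reads the document, much as T-SQL resolves a doubled single quote inside a string literal.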

You can also use entities to include binary data in an XML file. In this use they are similar to the BLOB data type in SQL Server.

Another use for entities is to stand in as "placeholders" that are filled at XML render-time. In this way, they are similar to table variables, temp tables, and the like.

Another way entities can be used is to link other XML documents, in-line. In this use, they are similar to a subquery.

XML Term: Attributes

Attributes are descriptors of an element. In the following example, author is now the element, and fname and lname are attributes:

<author fname="Buck" lname="Woody">1234</author>

You often see this arrangement when a part number is more significant than the text that describes it.

There's no direct counterpart among standard relational database objects for attributes, assuming you've used elements as columns, although you could make the argument they are like check constraints. In future articles, I'll show you how to leverage this useful construct.
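The sketch below shows how a parser separates an element's attributes from its data, using a well-formed version of the author example. It also checks the earlier point that attributes cannot repeat. As before, Python's standard xml.etree.ElementTree module is my assumption for the processor.

```python
import xml.etree.ElementTree as ET

author = ET.fromstring('<author fname="Buck" lname="Woody">1234</author>')

print(author.tag)           # author
print(author.attrib)        # {'fname': 'Buck', 'lname': 'Woody'}
print(author.get("lname"))  # Woody
print(author.text)          # 1234

# Unlike elements, an attribute may appear only once per element.
try:
    ET.fromstring('<author fname="Buck" fname="John">1234</author>')
    duplicate_allowed = True
except ET.ParseError:
    duplicate_allowed = False

print(duplicate_allowed)    # False
```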

XML Term: XSLT

XSLT is a language, itself written as an XML document, that transforms and formats the contents of another XML file. XSLT is similar to formatting commands in T-SQL, but it's more powerful.

XML Term: DTD

A Document Type Definition (DTD) is a set of declarations, paired with an XML document, that specifies how tags should be used. Unlike a schema, a DTD is not itself written in XML syntax. It's used to provide a limited form of referential integrity. It's more like programmatic referential integrity than declarative referential integrity, and is the earliest form of this kind of enforcement.

XML Term: Schema

A schema document is similar to a DTD, but it is newer, is itself written in XML, and has more features. Like DTDs, schema documents enforce referential integrity.

XML Term: XPath

XPath is a query specification for selecting parts of an XML document, and it is used heavily by style sheets. While it is fairly basic, its purpose is loosely similar to that of a SELECT statement in the Structured Query Language (SQL).
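To give a feel for the comparison, here is a sketch that runs two XPath expressions and notes the rough SQL equivalent of each. It assumes Python's standard xml.etree.ElementTree module, which supports only a subset of XPath. The second author row is made-up sample data.

```python
import xml.etree.ElementTree as ET

# Sample data; the second author's last name and number are invented.
doc = """<authors>
  <author fname="Buck" lname="Woody">1234</author>
  <author fname="John" lname="Doe">5678</author>
</authors>"""

root = ET.fromstring(doc)

# Roughly: SELECT * FROM author WHERE lname = 'Woody'
matches = root.findall("./author[@lname='Woody']")
ids = [match.text for match in matches]
print(ids)
# ['1234']

# Roughly: SELECT fname FROM author
fnames = [author.get("fname") for author in root.findall("./author")]
print(fnames)
# ['Buck', 'John']
```

The path plays the role of the FROM clause, and the bracketed predicate plays the role of the WHERE clause.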

XML Term: Namespace

An XML namespace uniquely identifies the tags in a document. By declaring a namespace based on your company's Web location, you get the same type of uniqueness that a globally unique identifier (GUID) provides in SQL Server.
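The following sketch shows what a namespace does to a tag name once a parser reads it. It assumes Python's standard xml.etree.ElementTree module, and the address http://www.example.com/authors is a placeholder, not a real company location.

```python
import xml.etree.ElementTree as ET

# The namespace URI below is a placeholder chosen for this example.
doc = """<authors xmlns="http://www.example.com/authors">
  <fname>Buck</fname>
</authors>"""

root = ET.fromstring(doc)

# The parser qualifies every tag with the namespace it belongs to.
print(root.tag)
# {http://www.example.com/authors}authors

# A plain, unqualified name no longer matches anything.
unqualified = root.find("fname")
print(unqualified)
# None

# Searching with the namespace finds the element.
ns = {"a": "http://www.example.com/authors"}
qualified = root.find("a:fname", ns)
print(qualified.text)
# Buck
```

Two companies can both define an fname tag, and the qualified names keep them from colliding.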

There's an inherent danger in comparing a hierarchical tagged file to a relational database, but I think it's useful to compare something with which you're familiar to something which you aren't. These comparisons will serve us well as we use XML as a data transfer mechanism in the next few articles.

Online Resources

Microsoft has a great XML site that has many references you can find here.

If you need a basic tutorial on XML, check out the earlier article I mentioned and then go here. This site provides a great place to start.

InformIT Tutorials and Sample Chapters

The best resource for XML? Why, it's our very own Nicholas Chase. He has an entire section on XML, and deals with databases specifically here.