If you have to store loosely structured data (or data in which the structure is expected to change significantly in the future) in a relational database, it might be best to store the less rigid part of the data in XML format, using the traditional table columns to store just that part of the data that’s well-defined, refers to other tables, or participates extensively in database queries.
My article "Store XML Data in Relational Databases" describes how you can combine the XML markup with any relational database, but also warns that doing so might result in considerable resource utilization if you have to modify or query the XML data.
The XML data type and additional XML features introduced in Microsoft SQL Server 2005 significantly reduce the performance impact of using the XML markup in a relational database environment:
- The XML data stored in the xml data type is pre-parsed, resulting in faster queries and updates.
- The query performance can be increased further by indexing the XML data. You can index the tags, values, and paths in the XML markup, resulting in significantly faster queries—but slower updates (due to index maintenance activities).
- You can define computed table columns on XML data to further optimize the query processing.
- SQL Server 2005 introduced XML Data Modification Language (DML) that allows you to modify parsed binary representation of the XML markup directly. Using XML DML in stored procedures is obviously much faster than transferring the XML markup to the SQL client, parsing and modifying it there, and re-storing it in the database.
In this article, I’ll describe the capabilities of the XML data type in SQL Server 2005 from the developer’s perspective; we’ll leave the in-depth performance issues to the database administrators. Before going into the details, it’s worth repeating the recommendations on XML data usage in relational databases:
- If you simply need to store XML data along with other column values, but don’t need to perform queries or modifications on it, treat it as a Binary Large Object (BLOB) and store it into the VARCHAR(MAX) or NVARCHAR(MAX) data type. You can always change the variable character data types into xml data type if needed.
- If your data is highly structured, store it in table columns. Even if you receive data in XML format but don’t have to retrieve it in its original form, it’s better to parse the data into individual values and store them in table columns in the data-insertion process.
- Use the xml data type if you need to perform queries or updates on the XML data.