Home > Articles > Data > SQL Server

This chapter is from the book

As we saw in the earlier discussions of updategrams and OPENXML, inserting XML data into a SQL Server database is relatively easy. However, both of these methods of loading data have one serious drawback: They're not suitable for loading large amounts of data. In the same way that using the Transact-SQL INSERT statement is suboptimal for loading large numbers of rows, using updategrams and OPENXML to load large volumes of XML data into SQL Server is slow and resource intensive.

SQLXML provides a facility intended specifically to address this problem. Called the XML Bulk Load component, it is a COM component you can call from OLE Automation–capable languages and tools such as Visual Basic, Delphi, and even Transact-SQL. It presents an object-oriented interface to loading XML data in bulk in a manner similar to the Transact-SQL BULK INSERT command.

Architecturally, XML Bulk Load is an in-process COM component named SQLXMLBulkLoad that resides in a DLL named XBLKLDn.DLL. When it bulk loads data to SQL Server, it does so via the bulk load interface of SQL Server's SQLOLEDB native OLE DB provider. If you have a Profiler trace running while the bulk load is occurring, you'll see an INSERT BULK language event show up in the trace. INSERT BULK is indicative of a special TDS packet type designed especially for bulk loading data. It's neither a true language event nor an RPC event; instead, it is a distinct type of data packet that bulk load facilities send to the server when they want to initiate a bulk copy operation.

Using the Component

The first step in using the XML Bulk Load component is to define a mapping schema that maps the XML data you're importing to tables and columns in your database. When the component loads your XML data, it will read it as a stream and use the mapping schema to decide where the data goes in the database.

The mapping schema determines the scope of each row added by the Bulk Load component. As the closing tag for each row is read, its corresponding data is written to the database.

You access the Bulk Load component itself via the SQLXMLBulkLoad interface on the SQLXMLBulkLoad COM object. The first step in using it is to connect to the database using an OLE DB connection string or by setting its ConnectionCommand property to an existing ADO Command object. The second step is to call its Execute method. The VBScript code in Listing 18.70 illustrates.

Listing 18.70

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

You can also specify an XML stream (rather than a file) to load, making cross-DBMS data transfers (from platforms that feature XML support) fairly easy.

XML Fragments

Setting the XMLFragment property to True allows the Bulk Load component to load data from an XML fragment (an XML document with no root element, similar to the type returned by Transact-SQL's FOR XML extension). Listing 18.71 shows an example.

Listing 18.71

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.XMLFragment = True
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

Enforcing Constraints

By default, the XML Bulk Load component does not enforce check and referential integrity constraints. Enforcing constraints as data is loaded slows down the process significantly, so the component doesn't enforce them unless you tell it to. For example, you might want to do that when you're loading data directly into production tables and you want to ensure that the integrity of your data is not compromised. To cause the component to enforce your constraints as it loads data, set the CheckConstraints property to True, as shown in Listing 18.72.

Listing 18.72

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.CheckConstraints = True
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

Duplicate Keys

Normally you'd want to stop a bulk load process when you encounter a duplicate key. Usually this means you've got unexpected data values or data corruption of some type and you need to look at the source data before proceeding. There are, however, exceptions. Say, for example, that you get a daily data feed from an external source that contains the entirety of a table. Each day, a few new rows show up, but, for the most part, the data in the XML document already exists in your table. Your interest is in loading the new rows, but the external source that provides you the data may not know which rows you have and which ones you don't. They may provide data to lots of companies—what your particular database contains may be unknown to them.

In this situation, you can set the IgnoreDuplicateKeys property before the load, and the component will ignore the duplicate key values it encounters. The bulk load won't halt when it encounters a duplicate key—it will simply ignore the row containing the duplicate key, and the rows with nonduplicate keys will be loaded as you'd expect. Listing 18.73 shows an example.

Listing 18.73

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.IgnoreDuplicateKeys = True
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

When IgnoreDuplicateKeys is set to True, inserts that would cause a duplicate key will still fail, but the bulk load process will not halt. The remainder of the rows will be processed as though no error occurred.

IDENTITY Columns

SQLXMLBulkLoad's KeepIdentity property is True by default. This means that values for identity columns in your XML data will be loaded into the database rather than being generated on-the-fly by SQL Server. Normally, this is what you'd want, but you can set KeepIdentity to False if you'd rather have SQL Server generate these values.

There are a couple of caveats regarding the KeepIdentity property. First, when KeepIdentity is set to True, SQL Server uses SET IDENTITY_ INSERT to enable identity value insertion into the target table. SET IDENTITY_ INSERT has specific permissions requirements—execute permission defaults to the sysadmin role, the db_owner and db_ddladmin fixed database roles, and the table owner. This means that a user who does not own the target table and who also is not a sysadmin, db_owner, or DDL administrator will likely have trouble loading data with the XML Bulk Load component. Merely having bulkadmin rights is not enough.

Another caveat is that you would normally want to preserve identity values when bulk loading data into a table with dependent tables. Allowing these values to be regenerated by the server could be disastrous—you could break parent-child relationships between tables with no hope of reconstructing them. If a parent table's primary key is its identity column and KeepIdentity is set to False when you load it, you may not be able to resynchronize it with the data you load for its child table. Fortunately, KeepIdentity is enabled by default, so normally this isn't a concern, but be sure you know what you're doing if you choose to set it to False.

Listing 18.74 illustrates setting the KeepIdentity property.

Listing 18.74

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.KeepIdentity = False
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

Another thing to keep in mind is that KeepIdentity is a very binary option—either it's on or it's not. The value you give it affects every object into which XML Bulk Load inserts rows within a given bulk load. You can't retain identity values for some tables and allow SQL Server to generate them for others.

NULL Values

For a column not mapped in the schema, the column's default value is inserted. If the column doesn't have a default, NULL is inserted. If the column doesn't allow NULLs, the bulk load halts with an error message.

The KeepNulls property allows you to tell the bulk load facility to insert a NULL value rather than a column's default when the column is not mapped in the schema. Listing 18.75 demonstrates.

Listing 18.75

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.KeepNulls = True
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

Table Locks

As with SQL Server's other bulk load facilities, you can configure SQLXMLBulkLoad to lock the target table before it begins loading data into it. This is more efficient and faster than using more granular locks but has the disadvantage of preventing other users from accessing the table while the bulk load runs. To force a table lock during an XML bulk load, set the ForceTableLock property to True, as shown in Listing 18.76.

Listing 18.76

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.ForceTableLock = True
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

Transactions

By default, XML bulk load operations are not transactional—that is, if an error occurs during the load process, the rows loaded up to that point will remain in the database. This is the fastest way to do things, but it has the disadvantage of possibly leaving a table in a partially loaded state. To force a bulk load operation to be handled as a single transaction, set SQLXMLBulkLoad's Transaction property to True before calling Execute.

When Transaction is True, all inserts are cached in a temporary file before being loaded onto SQL Server. You can control where this file is written by setting the TempFilePath property. TempFilePath has no meaning unless Transaction is True. If TempFilePath is not otherwise set, it defaults to the folder specified by the TEMP environmental variable on the server.

I should point out that bulk loading data within a transaction is much slower than loading it outside of one. That's why the component doesn't load data within a transaction by default. Also note that you can't bulk load binary XML data from within a transaction.

Listing 18.77 illustrates a transactional bulk load.

Listing 18.77

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.Transaction = True
objBulkLoad.TempFilePath = "c:\temp\xmlswap"
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

In this example, SQLXMLBulkLoad establishes its own connection to the server over OLE DB, so it operates within its own transaction context. If an error occurs during the bulk load, the component rolls back its own transaction.

When SQLXMLBulkLoad uses an existing OLE DB connection via its ConnectionCommand property, the transaction context belongs to that connection and is controlled by the client application. When the bulk load completes, the client application must explicitly commit or roll back the transaction. Listing 18.78 shows an example.

Listing 18.78

On Error Resume Next
Err.Clear
Set objCmd = CreateObject("ADODB.Command")
objCmd.ActiveConnection= _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.Transaction = True
objBulkLoad.ConnectionCommand = objCmd
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
If Err.Number = 0 Then
  objCmd.ActiveConnection.CommitTrans
Else
  objCmd.ActiveConnection.RollbackTrans
End If
Set objBulkLoad = Nothing
Set objCmd = Nothing

Note that when using the ConnectionCommand property, Transaction is required—it must be set to True.

Errors

The XML Bulk Copy component supports logging error messages to a file via its ErrorLogFile property. This file is an XML document itself that lists any errors that occurred during the bulk load. Listing 18.79 demonstrates how to use this property.

Listing 18.79

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.ErrorLogFile = "c:\temp\xmlswap\errors.xml"
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

The file you specify will contain a Record element for each error that occurred during the last bulk load. The most recent error message will be listed first.

Generating Database Schemas

In addition to loading data into existing tables, the XML Bulk Copy component can also create target tables for you if they do not already exist, or drop and recreate them if they do exist. To create nonexistent tables, set the component's SchemaGen property to True, as shown in Listing 18.80.

Listing 18.80

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.SchemaGen = True
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

Since SchemaGen is set to True, any tables in the schema that don't already exist will be created when the bulk load starts. For tables that already exist, data is simply loaded into them as it would normally be.

If you set the BulkLoad property of the component to False, no data is loaded. So, if SchemaGen is set to True but BulkLoad is False, you'll get empty tables for those in the mapping schema that did not already exist in the database, but you'll get no data. Listing 18.81 presents an example.

Listing 18.81

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.SchemaGen = True
objBulkLoad.BulkLoad = False
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

When XML Bulk Load creates tables, it uses the information in the mapping schema to define the columns in each table. The sql:datatype annotation defines column data types, and the dt:type attribute further defines column type information. To define a primary key within the mapping schema, set a column's dt:type attribute to id and set the SGUseID property of the XML Bulk Load component to True. The mapping schema in Listing 18.82 illustrates.

Listing 18.82

<ElementType name="Orders" sql:relation="Orders">
  <AttributeType name="OrderID" sql:datatype="int" dt:type="id"/>
  <AttributeType name="ShipCity" sql:datatype="nvarchar(30)"/>

  <attribute type="OrderID" sql:field="OrderID"/>
  <attribute type="ShipCity" sql:field="ShipCity"/>
</ElementType>

Listing 18.83 shows some VBScript code that sets the SGUseID property so that a primary key will automatically be defined for the table that's created on the server.

Listing 18.83

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.SchemaGen = True
objBulkLoad.SGUseID = True
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

Here's the Transact-SQL that results when the bulk load executes:

CREATE TABLE Orders
(
  OrderID int NOT NULL,
  ShipCity nvarchar(30) NULL,
  PRIMARY KEY CLUSTERED (OrderID)
)

In addition to being able to create new tables from those in the mapping schema, SQLXMLBulkLoad can also drop and recreate tables. Set the SGDropTables property to True to cause the component to drop and recreate the tables mapped in the schema, as shown in Listing 18.84.

Listing 18.84

Set objBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
objBulkLoad.ConnectionString = _
  "provider=SQLOLEDB;data source=KUFNATHE;database=Northwind;" & _
  "Integrated Security=SSPI;"
objBulkLoad.SchemaGen = True
objBulkLoad.SGDropTables = True
objBulkLoad.Execute "d:\xml\OrdersSchema.xml",
    "d:\xml\OrdersData.xml"
Set objBulkLoad = Nothing

InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

Overview


Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

Collection and Use of Information


To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

Newsletters

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

Other Collection and Use of Information


Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

Cookies and Related Technologies

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

Do Not Track

This site currently does not respond to Do Not Track signals.

Security


Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.

Children


This site is not directed to children under the age of 13.

Marketing


Pearson may send or direct marketing communications to users, provided that

  • Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
  • Such marketing is consistent with applicable law and Pearson's legal obligations.
  • Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
  • Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

Correcting/Updating Personal Information


If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.

Choice/Opt-out


Users can always make an informed choice as to whether they should proceed with certain services offered by InformIT. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.informit.com/u.aspx.

Sale of Personal Information


Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

Supplemental Privacy Statement for California Residents


California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

Sharing and Disclosure


Pearson may disclose personal information, as follows:

  • As required by law.
  • With the consent of the individual (or their parent, if the individual is a minor)
  • In response to a subpoena, court order or legal process, to the extent permitted or required by law
  • To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
  • In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
  • To investigate or address actual or suspected fraud or other illegal activities
  • To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
  • To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
  • To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.

Links


This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.

Requests and Contact


Please contact us about this Privacy Notice or if you have any requests or questions relating to the privacy of your personal information.

Changes to this Privacy Notice


We may revise this Privacy Notice through an updated posting. We will identify the effective date of the revision in the posting. Often, updates are made to provide greater clarity or to comply with changes in regulatory requirements. If the updates involve material changes to the collection, protection, use or disclosure of Personal Information, Pearson will provide notice of the change through a conspicuous notice on this site or other appropriate way. Continued use of the site after the effective date of a posted revision evidences acceptance. Please contact us if you have questions or concerns about the Privacy Notice or any objection to any revisions.

Last Update: November 17, 2020