Home > Articles > Web Services > XML

  • Print
  • + Share This
From the author of


When we talked about XML documents in Chapter 1, we mentioned certain special characters that must be treated differently than other characters because they are interpreted differently depending on their location in documents. For more information, go to http://www.landfield.com/rfcs/rfc1738.html.

We must also be concerned now about special characters in URLs. Certain characters have a functionality all their own when used in URLs.

Entities in XML

The characters listed in Table 4.1 should not be used between tags in an XML document. These characters have special meaning to XML and will cause misinterpretation during parsing. The appropriate substitution entities that should be used in their place are provided in the table.

For a more thorough discussion of these characters and why they affect things the way they do, see RFC 2396. This RFC is freely available on the Internet at http://www.landfield.com/rfcs/rfc2396.html.

Table 4.1 Entity Substitutions



& (ampersand)

Use &

' (apostrophe)

Use '

< (less than)

Use &lt;

> (greater than)

Use &gt;

" (quote)

Use &quot;

Let's look at the sample template file in Listing 4.2. You'll see why these entities are necessary. Listing 4.3 shows the result.

Listing 4.2 Entities in Template Files

<ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql">
   SELECT CustomerID, OrderDate, Freight
   FROM  Orders
   WHERE Freight &gt; 800      <!--&gt; substituted for '>' -->

Listing 4.3 Results of Entity Substitution

<ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql">
  <Orders CustomerID="QUEEN" OrderDate="1996-12-04T00:00:00"
     Freight="890.78" />
  <Orders CustomerID="QUICK" OrderDate="1997-05-19T00:00:00"
     Freight="1007.64" />
  <Orders CustomerID="QUICK" OrderDate="1997-10-03T00:00:00"
     Freight="810.05" />
  <Orders CustomerID="SAVEA" OrderDate="1998-04-17T00:00:00"
     Freight="830.75" />

Entities in URLs

When executing a query in a URL, you must be careful when using the characters listed in Table 4.2. They are interpreted according to the description in the table. All these characters are required at one point or another in the interpretation of URLs. Again, RFC 2396 discusses these characters in greater detail.

Table 4.2 Special Characters in URLs



Hexadecimal Value


Indicates a space (spaces cannot be used in a URL).



Separates directories and subdirectories.



Separates the URL from the parameters.



Specifies special characters.



Indicates bookmark anchors.



Separates parameters specified in the URL.


Here's an example of a direct SQL query in a URL:


Here we are trying to retrieve all information concerning employees whose last names start with D. Because the % character is one of the special characters for URLs, trying this query directly in a URL results in several errors. To fix the problem, you need to substitute the hexadecimal value of the special character in its place, like this:


There's one more point to make here. There are instances when it might become necessary to use a combination of XML and URL special characters. Look at the following sample template that could be specified directly in a URL. See if you can spot the problem.

<ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql"><sql:query>SELECT+

Hopefully, by now you know that the > character won't work here, so we'll change it to the required XML entity.

<ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql"><sql:query>SELECT+

Did you make it this far? If so, good; but if you stopped here, you didn't go quite far enough. The & character is a URL special character, and a substitution needs to be made here also. Replace the & with the hexadecimal value %26.

<ROOT xmlns:sql="urn:schemas-microsoft-com:xml-sql"><sql:query>SELECT+
  • + Share This
  • 🔖 Save To Your Account