This chapter is from the book

3.6 Navigation Complexities

This section covers the remaining navigation topics, each of which was either too esoteric or too complex to include in the previous sections.

3.6.1 Namespaces

XPath 1.0 doesn't have a way to introduce namespace declarations and doesn't use the prefixes of the data it is navigating. Consequently, any namespace prefixes used in an XPath expression must be defined outside of it. In XQuery, namespaces can be declared in the query prolog (covered in Chapter 5), or using namespace declarations in XML elements (Chapter 7).

Table 3.4. Qualified names versus expanded names

                 Prefix   Local   Namespace   Example
Qualified name   Yes      Yes     No          foo:bar
Expanded name    No       Yes     Yes         {urn:baz}bar

Recall that a qualified name consists of two parts, the prefix and the local name, separated by a colon (:). The prefix is bound to a namespace, but is otherwise unimportant for the purposes of navigation. Instead, it is better to think in terms of expanded names. An expanded name is the namespace and local name parts of a name (ignoring the prefix). Table 3.4 summarizes the differences between the two.

The second example in Table 3.4 is fabricated; XML and XQuery don't have a syntax for expanded names. Instead, they always associate the namespace with a prefix, and then use a qualified name.

In prose descriptions, expanded names are often written with the namespace part in curly braces ({}), like this: {namespace}local-name. When there isn't a namespace (because the name was unqualified or had an empty namespace), then the expanded name is written as just {}local-name. Again, this syntax isn't used in XML or XQuery, just in descriptions of how they work.
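The mapping from a qualified name to its expanded name can be sketched in a few lines. The following is an illustration in Python rather than XQuery, and the function name expand, its parameters, and the bindings dictionary are all hypothetical names chosen for this sketch. It assumes the usual XML rule that the default namespace applies to unprefixed element names but not to unprefixed attribute names.

```python
def expand(qname, bindings, is_attribute=False):
    """Turn a qualified name into the '{namespace}local-name' prose notation.

    `bindings` maps prefixes to namespace URIs; the key "" holds the
    default namespace, which applies to elements but not to attributes.
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        # No colon: the whole string is the local name.
        local = prefix
        namespace = "" if is_attribute else bindings.get("", "")
    else:
        namespace = bindings[prefix]  # raises KeyError if the prefix is unbound
    return "{" + namespace + "}" + local


bindings = {"foo": "urn:baz"}
print(expand("foo:bar", bindings))  # {urn:baz}bar
print(expand("bar", bindings))      # {}bar
```

Note that the prefix itself never appears in the result; only the namespace it is bound to survives.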

Listing 3.17. XML with namespaces

<root xmlns:x="uri1">
  <x:one fish="red"/>
  <two x:fish="blue" xmlns="uri2"/>
</root>

For example, in the XML shown in Listing 3.17, there are three elements with qualified names: root, x:one, and two. The expanded names of these elements are {}root, {uri1}one, and {uri2}two, respectively. There are two attributes with qualified names: fish and x:fish. The first of these has the expanded name {}fish, the second has the expanded name {uri1}fish.
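One way to check these expanded names is to parse the listing with a library that reports names in a similar curly-brace notation. The sketch below uses Python's standard xml.etree.ElementTree module as an illustration; the variable names are assumptions of this example. ElementTree writes a name with no namespace as plain root rather than {}root.

```python
import xml.etree.ElementTree as ET

# The XML from Listing 3.17.
LISTING_3_17 = """<root xmlns:x="uri1">
  <x:one fish="red"/>
  <two x:fish="blue" xmlns="uri2"/>
</root>"""

root = ET.fromstring(LISTING_3_17)

# Walk the elements in document order, printing each expanded
# element name followed by its expanded attribute names.
for element in root.iter():
    print(element.tag, sorted(element.attrib))

# Prints:
# root []
# {uri1}one ['fish']
# {uri2}two ['{uri1}fish']
```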

Expanded names are more verbose than qualified names, which explains why qualified names are used instead. However, most operations—including validation and navigation—operate only on the namespace and local name parts of XML names, usually completely ignoring the prefixes that were used in the original XML serialization.

Suppose doc("sample.xml") accesses the XML shown in Listing 3.18.

Listing 3.18. sample.xml

<this xmlns="urn:default" xmlns:ns1="urn:one">
  <is a="complex">
    <ns1:example ns2:attr="42" xmlns:ns2="urn:two"/>
  </is>
</this>

Then you could navigate into it using the XQuery shown in Listing 3.19.

Listing 3.19. Path using namespaces

declare namespace x = "urn:default";
declare namespace y = "urn:one";
declare namespace z = "urn:two";
doc("sample.xml")/x:this/x:is/y:example/@z:attr

The query prolog introduces three namespace declarations, binding the prefixes x, y, and z to the namespaces urn:default, urn:one, and urn:two, respectively. The path then uses these namespaces to perform its qualified name tests: x:this, x:is, y:example, and z:attr. Again, notice that the prefixes in the document and the prefixes in the XQuery are completely unrelated to one another; all that matters are the local name and namespace parts of the names.
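The same independence between document prefixes and query prefixes can be seen outside XQuery. The following sketch uses Python's xml.etree.ElementTree as an illustration; the NS dictionary plays the role of the query prolog, and its name is an assumption of this example. ElementTree paths cannot select attribute values directly, so the attribute is read by its expanded name instead.

```python
import xml.etree.ElementTree as ET

# The XML from Listing 3.18.
SAMPLE_XML = """<this xmlns="urn:default" xmlns:ns1="urn:one">
  <is a="complex">
    <ns1:example ns2:attr="42" xmlns:ns2="urn:two"/>
  </is>
</this>"""

# Query-side prefixes, deliberately different from the document's.
NS = {"x": "urn:default", "y": "urn:one", "z": "urn:two"}

root = ET.fromstring(SAMPLE_XML)           # the {urn:default}this element
example = root.find("x:is/y:example", NS)  # path is relative to the root element
value = example.get("{urn:two}attr")       # attribute looked up by expanded name
print(value)  # 42
```

The document never uses the prefixes x, y, or z, yet the lookup succeeds because only the namespace and local name are compared.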

XQuery provides two functions that can access the namespaces in scope on a node. The get-in-scope-prefixes() function takes one argument, an element node, and returns a list of strings (in any order) that are the namespace prefixes in scope for that element. An empty string is listed for the default namespace declaration, if any.

The get-namespace-uri-for-prefix() function looks up the namespace bound to a prefix. It takes two arguments, an element node and a string prefix, and returns the string that is the namespace bound to that prefix (use the empty string to look up the default namespace declaration). If there is no namespace bound to that prefix, then this function returns the empty sequence. Both functions are demonstrated by the examples in Listing 3.20.

Listing 3.20. Querying the namespaces in scope

declare namespace x = "urn:default";
get-in-scope-prefixes(doc("sample.xml")/x:this)
=>
("ns1", "")

declare namespace x = "urn:default";
get-namespace-uri-for-prefix(doc("sample.xml")/x:this, "ns1")
=>
"urn:one"

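For comparison, the in-scope namespaces of an element can be approximated in Python by tracking the namespace events that xml.etree.ElementTree.iterparse reports. This is a rough sketch, not an equivalent of the XQuery functions: the helper in_scope_namespaces is a hypothetical name, and it returns a prefix-to-namespace dictionary whose keys correspond to the prefix list and whose lookups correspond to resolving a single prefix.

```python
import io
import xml.etree.ElementTree as ET

# The XML from Listing 3.18, as bytes so it can be read like a file.
SAMPLE_XML = b"""<this xmlns="urn:default" xmlns:ns1="urn:one">
  <is a="complex">
    <ns1:example ns2:attr="42" xmlns:ns2="urn:two"/>
  </is>
</this>"""


def in_scope_namespaces(xml_bytes, expanded_name):
    """Return {prefix: namespace} in scope at the first element with this name."""
    scopes = []  # stack of (prefix, namespace) declarations currently open
    events = ("start-ns", "end-ns", "start")
    for event, payload in ET.iterparse(io.BytesIO(xml_bytes), events=events):
        if event == "start-ns":
            scopes.append(payload)   # payload is a (prefix, namespace) pair
        elif event == "end-ns":
            scopes.pop()             # a declaration went out of scope
        elif payload.tag == expanded_name:
            return dict(scopes)      # inner declarations override outer ones
    return None


in_scope = in_scope_namespaces(SAMPLE_XML, "{urn:default}this")
print(sorted(in_scope))      # ['', 'ns1']
print(in_scope["ns1"])       # urn:one
print(in_scope.get("ns2"))   # None, since ns2 is declared only on the inner element
```

As in the XQuery functions, the empty string stands for the default namespace declaration.
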
3.6.2 Node Identity

Navigation has some interesting interactions with node identity. When navigating over constructed XML, it's important to realize that the construction process copies nodes used as content, thereby “losing” their node identity.

For example, in the expression <x>{doc("y.xml")//y}</x>, the nodes selected by the path are copied into the x element. If you then navigate into that constructed XML, you get different nodes (by identity) than the originals, as demonstrated by Listing 3.21. (See Chapter 7 for more information.)

Listing 3.21. Navigating over constructed XML

(<x>{doc("y.xml")//y}</x>)//y is doc("y.xml")//y   => false
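The distinction between equal content and identical nodes can be illustrated by analogy in Python, where the is operator also tests object identity. In this sketch the copy is made explicitly with copy.deepcopy to mimic what an XQuery constructor does implicitly; the inline source document and variable names are assumptions of the example.

```python
import copy
import xml.etree.ElementTree as ET

# Stand-in for the contents of y.xml (hypothetical data).
source = ET.fromstring("<doc><y>1</y></doc>")
original = source.find("y")

# Mimic a constructor: the content is copied into the new element.
wrapper = ET.Element("x")
wrapper.append(copy.deepcopy(original))
copied = wrapper.find("y")

print(copied is original)                            # False: a different node
print(ET.tostring(copied) == ET.tostring(original))  # True: same content
print(source.find("y") is original)                  # True: same node as before
```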

The doc() function is special in that whenever the same string value is passed to it, the same node (by identity) is returned. This special behavior prevents you from writing a user-defined function that fully emulates doc() using construction, because every time your function is invoked it constructs a new node instance. The difference is demonstrated in Listing 3.22.

Listing 3.22. The doc() function can't be completely simulated by your own function

declare namespace my = "http://www.awprofessional.com";
declare function my:doc($dummy as xs:string) as node() {
  document {
    element root { () }
  }
};

doc("a.xml") is doc("a.xml")       => true (if a.xml exists)
my:doc("a.xml") is my:doc("a.xml") => false
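A loose analogy in Python may help show why identity is stable in one case and not the other: a loader that caches its result per URI returns the same object every time, while a function that constructs its result returns a new object on every call. The names DOCUMENTS, cached_doc, and constructed_doc are hypothetical, and the in-memory dictionary stands in for real files.

```python
import functools
import xml.etree.ElementTree as ET

# Hypothetical in-memory "file system" so the sketch needs no real files.
DOCUMENTS = {"a.xml": "<root/>"}


@functools.lru_cache(maxsize=None)
def cached_doc(uri):
    """Parse once per URI, so repeated calls return the same object."""
    return ET.fromstring(DOCUMENTS[uri])


def constructed_doc(uri):
    """Build a fresh element on every call, like the my:doc() function."""
    return ET.Element("root")


print(cached_doc("a.xml") is cached_doc("a.xml"))            # True
print(constructed_doc("a.xml") is constructed_doc("a.xml"))  # False
```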

3.6.3 Other Context Information

In addition to the context items listed in Section 3.4, XQuery provides several other less commonly used values in the expression context.

The base URI property is part of the static context and is used by the doc() function when resolving relative URIs. This property can also be accessed using the base-uri() function.

The current XML space policy is part of the static context and can be changed by the query prolog. It determines how whitespace characters are handled in XML constructors (see Chapter 7).

The static context may also provide schema definitions from imported schemas and a default validation mode and/or validation context. These determine what user-defined type names are available for use in type tests and other type operators. See Chapter 9 for examples.

Finally, the current date/time and the implicit timezone properties are part of the evaluation context. Despite their names, these don't really provide the current time, but just some point in time determined by the implementation. The value doesn't change during the execution of a query. These values can be accessed using the current-date(), current-time(), current-dateTime(), and implicit-timezone() functions (see Appendix C).
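The stability of these values can be pictured as a snapshot taken once per query evaluation. The following Python sketch is only an illustration of that idea; the EvaluationContext class and its field name are hypothetical and do not correspond to any XQuery API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class EvaluationContext:
    """Hypothetical per-query context: the timestamp is captured once."""
    current_datetime: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


ctx = EvaluationContext()
first_read = ctx.current_datetime
second_read = ctx.current_datetime
print(first_read == second_read)  # True: the value is fixed for this context
```

Every read within one context sees the same instant, whereas a new context (a new query execution) would capture its own.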
