Navigation in XQuery

This chapter is from the book

3.2 Paths

Navigation involves starting from one part of an XML document and moving to another part of the document (or a different document). XQuery performs navigation using paths. Paths originated in hierarchical file systems, where they have long been used to locate files within nested directories. The path concept has been so generally useful that it has found broad application in a variety of systems, including XML query processing.

In XQuery, every path consists of a sequence of steps which, conceptually at least, are executed in order from left to right. A step consists of three parts, illustrated in Figure 3.1:

  • A direction of travel, called the axis

  • A description of the nodes to select upon arrival, called the node test

  • Zero or more filters to further narrow that selection (each filter is called a predicate)

Figure 3.1. Anatomy of a path

By allowing some of these parts to be abbreviated or omitted entirely, XQuery keeps paths very concise. Each of these parts is described next, and then Section 3.5 has many examples demonstrating how to use paths to accomplish common tasks.
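
As a minimal sketch of the three parts of a step, the query below writes the same step with and without the explicit axis. The sample element (x with y and w children carrying a z attribute) is hypothetical and is built inline only so that the query is self-contained.

```xquery
(: Hypothetical sample data, constructed inline. :)
let $x := <x><y z="1"/><y z="2"/><w z="2"/></x>
return (
  count($x/child::y[@z = 2]),  (: axis child, node test y, predicate [@z = 2]; gives 1 :)
  count($x/y[@z = 2]),         (: the same step with the default child axis omitted; gives 1 :)
  count($x/child::y)           (: no predicate, so both y children are kept; gives 2 :)
)
```

The w element also has z="2", but the node test y excludes it before the predicate is ever applied.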

Each step affects the evaluation context for the next step. This context and how it changes with each step are described in Section 3.4, but for now it's enough to know that there is a current context item that affects—and is affected by—each step in the path. Except for predicates, navigation steps can be applied only when the current context item is a node (in which case it is often called the current context node).

3.2.1 Beginnings

Every path starts somewhere. For the purpose of XQuery navigation, there are effectively three places from which a path can begin:

  • The current context node

  • The root of the tree in which the current context node resides

  • Any other node set, such as a variable or an XML constructor

With each successive step, the path may move to other nodes or alter the context.

The root of the tree in which the current context node resides is selected by a lone forward slash (/) or equivalently using the built-in root() function. Paths beginning from the root are absolute. In contrast, paths starting from the current context node are relative. Paths may also start from certain other expressions, such as variables, function calls, or parenthesized expressions (XQuery does not give a name to such paths).

From these humble beginnings, paths may navigate anywhere in the document, or even to other documents, step by step. Listing 3.1 shows a few paths. In a path, individual steps are almost always separated by one forward slash (/). (The exception, two forward slashes (//), is described in Section 3.2.4.)

Listing 3.1. Absolute, relative, and other paths

/AbsolutePath/First/Second
RelativePath[. = "fun"]
$other//x
id("other")[@y > 1]/z
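
The following sketch shows paths that begin from a variable and from the root of a node's tree. It assumes a small hypothetical document bound to $doc; a standalone query often has no context item, so the root is reached here with root() rather than a leading slash.

```xquery
(: Hypothetical sample document, constructed inline. :)
let $doc := document { <x><y z="1"/><y z="2"/></x> }
let $y := $doc/x/y[1]
return (
  root($y) is $doc,   (: true: root() climbs from $y to the document node :)
  count($y/../y),     (: 2: a path starting at $y, one step up and one step down :)
  count($doc//y)      (: 2: a path starting at a variable, using // :)
)
```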

Paths with more than one step always result in a (possibly empty) sequence of nodes, sorted in document order. To sort nodes in some other order, you must use a FLWOR expression (see Chapter 6).
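
To illustrate the ordering rule, the sketch below builds a sequence whose items are deliberately out of document order and then applies one more step to it. The document and the name $reversed are hypothetical.

```xquery
(: Hypothetical sample document, constructed inline. :)
let $doc := document { <x><y z="1"/><y z="2"/></x> }
let $reversed := ($doc/x/y[2], $doc/x/y[1])   (: second y first, then first y :)
return (
  (: Iterating over the sequence itself keeps its order: "2", "1". :)
  for $y in $reversed return string($y/@z),
  (: Adding a path step re-sorts the result into document order: "1", "2". :)
  for $z in $reversed/@z return string($z)
)
```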

3.2.2 Axes

Each step consists of three parts: the axis (optional), the node test, and zero or more predicates. XPath defines a total of thirteen axes, and all but the namespace axis appear in XQuery. Of these, the four simplest and most commonly used ones are child, attribute, parent, and self (see Table 3.1). The other axes are explained in Section 3.2.4.

Table 3.1. The four basic axes and their abbreviations

Axis name    Abbreviation    Equivalent examples
attribute    @               x/attribute::y      x/@y
child        (none)          x/child::y          x/y
parent       ..              x/parent::node()    x/..
self         .               x/self::node()      x/.

The child axis is so common that it is the default axis if no axis name is specified explicitly. The other three common axes all have shorthand abbreviations for convenience. XPath gets much of its succinctness from these shorthand forms. When the non-abbreviated name is used, it is followed by two colons (::) to distinguish axis names from XML qualified names (which contain at most one colon).

These four axes behave exactly as their names suggest:

  • The child axis navigates into the children of the current context node.

  • The attribute axis navigates into the attributes of the current context node.

  • The self axis essentially goes nowhere (navigating into the current context node itself).

  • The parent axis navigates to the parent of the current context node.

For example, x, which is short for child::x, selects the child elements named x from the current context node, while x/y, which is short for child::x/child::y, first selects the child elements named x from the current context node just like the previous example, and then from those selects the child elements named y.
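
As a quick check of these equivalences, the sketch below compares each abbreviated form with its full form over a hypothetical inline element. The is operator tests node identity, so true means that both paths reached the very same node.

```xquery
(: Hypothetical sample data, constructed inline. :)
let $x := <x a="1"><y/><y/></x>
return (
  count($x/child::y) eq count($x/y),         (: true: child is the default axis :)
  $x/attribute::a is $x/@a,                  (: true: @ abbreviates attribute:: :)
  $x/y[1]/parent::node() is $x/y[1]/..,      (: true: .. abbreviates parent::node() :)
  $x/self::node() is $x/.                    (: true: . abbreviates self::node() :)
)
```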

3.2.3 Node Tests

Following the axis is the second part of the step, the node test. Node tests come in three varieties: names (qualified or unqualified), node kinds, and wildcards.

3.2.3.1 Name Tests

By far the most common node test is the name test. A name test selects only those nodes with the same name. Names in XQuery, as in XML, are case-sensitive. For example, the absolute path /x/y/@z starts at the root of the current document, navigates to the top-level elements named x, navigates to their child elements named y, and finally navigates to their attribute nodes named z. If you were to execute this XQuery over the XML document in Listing 3.2, it would select the two attributes named z and no other nodes.

Name tests can also select names that are in an XML namespace. However, this process is fairly complicated, so this description is deferred until Section 3.6.1.

Listing 3.2. A sample XML document

<x thisAttribute="isNotSelected">
  <y z="1"/>
  <y z="2" thisAttribute="alsoIsNotSelected"></y>
</x>
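
The sketch below runs the path from the text over this sample document. Because a standalone query may have no context item, the document is bound to a hypothetical variable $doc and the path starts from it instead of from a leading slash.

```xquery
let $doc := document {
  <x thisAttribute="isNotSelected">
    <y z="1"/>
    <y z="2" thisAttribute="alsoIsNotSelected"></y>
  </x>
}
return (
  for $z in $doc/x/y/@z return string($z),   (: "1", "2": only the z attributes :)
  count($doc/X)                              (: 0: names are case-sensitive :)
)
```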

3.2.3.2 Node Kind Tests

Name tests are not the only node tests available in navigation steps. In fact, some kinds of XML nodes (for example, text, comment, and document nodes) have no names at all. To select nodes by kind, XQuery uses the same node kind tests used by sequence type matching (described in Chapter 2). Listing 3.3 shows several node kind tests.

Listing 3.3. Examples of node kind tests

x/comment()                (: select all comment children of x :)
x/attribute()              (: select all attributes of x :)
attribute(*, xs:integer)   (: select all integer attributes :)
attribute(y)               (: select all attributes named y :)
attribute(y, xs:integer)   (: select integer attributes named y :)

Recall from Chapter 2 that the node() node test matches any kind of node, including the document node. The text() and comment() node kind tests match text nodes and comment nodes, respectively. The processing-instruction() node test accepts an optional name argument. When no name is specified, it matches all processing instruction nodes; otherwise, it matches only those with the same name.

The document-node() test matches the invisible document node that occurs at the root of any tree loaded from an XML document using doc() (or constructed using the document constructor—see Chapter 7). It accepts an optional argument specifying an element node kind test, in which case it matches the document node only if its element content matches that element test.

And finally, the element() and attribute() node kind tests accept optional name and type arguments. Without these extra arguments, they match all elements and attributes, respectively; with these arguments, they match only elements or attributes that have the specified name and/or type. The name can also be *, in which case it matches all names. The name in an attribute() test is written without a leading @ symbol (for example, attribute(y)), because the test itself already restricts the match to attributes.
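
The sketch below counts the nodes matched by several kind tests over one hypothetical element that contains a comment, a processing instruction, a text node, and a child element. The names x, y, a, and target are illustrative only.

```xquery
(: Hypothetical sample data, constructed inline on one line to avoid whitespace text nodes. :)
let $x := <x a="1"><!-- note --><?target data?>text<y/></x>
return (
  count($x/comment()),                        (: 1 :)
  count($x/processing-instruction()),         (: 1: any processing instruction :)
  count($x/processing-instruction(target)),   (: 1: only those named target :)
  count($x/processing-instruction(other)),    (: 0: no processing instruction has this name :)
  count($x/text()),                           (: 1 :)
  count($x/element()),                        (: 1: the y child :)
  count($x/node()),                           (: 4: comment, processing instruction, text, y :)
  count($x/attribute())                       (: 1: an attribute test switches the default axis to attribute :)
)
```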

3.2.3.3 Wildcards

Sometimes you want to select all nodes whose name is in a particular namespace, or conversely all nodes with the same local name regardless of the namespace. There are two equivalent ways to accomplish this goal. One is to use predicates; in fact, as you will see later, predicates can be used to perform all kinds of tests.

A more succinct way is to use the third kind of node test, the wildcard. Wildcard node tests combine aspects of both name and node kind tests; the names matched depend on the wildcard, and the node kind matched depends on the axis. The attribute axis by default selects attribute nodes; all other XQuery axes select elements by default. The default node kind is called the principal node kind for the axis.

XQuery supports three wildcard node tests. Two of these come from XPath 1.0: the star (*), which matches any name at all, and a qualified star (prefix:*) that matches all names in the namespace to which the prefix is bound. XQuery adds a third wildcard node test, *:local-name, which matches all names with the given local name and any namespace.

The only difference between the star wildcard * and the node() node kind test is that node() matches every kind of node with any name, while * matches only nodes of the principal node kind (with any name).
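
As an illustration of the three wildcards and of the difference between * and node(), the sketch below uses two made-up namespaces. The prefixes p and q and their URIs are assumptions chosen for the example.

```xquery
declare namespace p = "http://example.com/p";   (: hypothetical namespace :)
declare namespace q = "http://example.com/q";   (: hypothetical namespace :)

let $x := <x a="1"><p:item/><q:item/><p:other/>text</x>
return (
  count($x/*),        (: 3: every child element, whatever its name :)
  count($x/node()),   (: 4: the three elements plus the text node :)
  count($x/@*),       (: 1: on the attribute axis, * matches attributes :)
  count($x/p:*),      (: 2: p:item and p:other :)
  count($x/*:item)    (: 2: p:item and q:item :)
)
```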

3.2.4 Other Axes

XQuery supports two more axes from XPath 1.0, called descendant and descendant-or-self. The descendant axis matches all descendants of the current context node. (It is the transitive closure of the child axis: children, their children, and so on.) The descendant-or-self axis includes the current context node as well, and so is equivalent to the union of the descendant and self axes.

The descendant-or-self axis is so commonly used that it has its own abbreviation: // is short for /descendant-or-self::node()/. Some caution should be observed when using it; it's easy to make mistakes when using predicates with // (see Chapter 11 for examples).
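
One such mistake can be sketched as follows. Because // expands to a descendant-or-self step followed by a child step, a positional predicate after it is applied per parent, not to the whole result. The document below is hypothetical.

```xquery
(: Hypothetical sample document, constructed inline. :)
let $doc := document { <r><a><y n="1"/></a><b><y n="2"/></b></r> }
return (
  count($doc/descendant::y),      (: 2 :)
  count($doc//y),                 (: 2 :)
  count($doc//y[1]),              (: 2: each y is the first y child of its own parent :)
  count(($doc//y)[1]),            (: 1: the first of all y elements :)
  count($doc/descendant::y[1])    (: 1: the first y on the descendant axis :)
)
```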

Additionally, implementations are allowed but not required to support the other six axes from XPath: ancestor, ancestor-or-self, following, following-sibling, preceding, and preceding-sibling. The first two of these are the inverses of descendant and descendant-or-self axes. They select all the ancestors of the current node (ancestor-or-self includes the node itself).

The following and preceding axes select all the nodes in the same document as the current context node that occur after and before it, respectively. There's really no reason to use them in XQuery, because the >> and << node comparison operators allow you to express the same meaning more compactly (see Chapter 5).

Finally, the following-sibling and preceding-sibling axes restrict their selections to the siblings of the current context node (that is, those nodes having the same parent as it).
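
Where an implementation does not support these optional axes, similar selections can be sketched with the node comparison operators alone. The document and the variable $b below are hypothetical. Note that a plain >> comparison also admits descendants of the node, which the following axis in XPath would exclude.

```xquery
(: Hypothetical sample document, constructed inline. :)
let $doc := document { <r><a/><b><c/></b><d/></r> }
let $b := $doc/r/b
return (
  for $n in $b/../*[. >> $b] return name($n),   (: "d": siblings after b :)
  for $n in $b/../*[. << $b] return name($n),   (: "a": siblings before b :)
  for $n in $doc//*[. >> $b] return name($n)    (: "c", "d": everything after b, including its descendant c :)
)
```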

3.2.5 Predicates

The third and final part of each navigation step consists of zero or more predicates. Like the node test, each predicate acts as a filter on the selected nodes, eliminating some from consideration and keeping the rest. For each node selected by the current step, the current context item is set to that node and then the predicate condition is evaluated with that context.

Any XQuery expression may be used inside a predicate; the meaning of the predicate depends on the type of the expression it contains. There are two cases: numeric and boolean predicates.

3.2.5.1 Numeric Predicates

Numeric predicates select nodes by their position in the current context. For example, /x/y[1] selects the first y child element of each x element. As this example demonstrates, predicates bind tightly to the current step. To apply a predicate to the entire results of a path, you must use parentheses. For example, (/x/y)[1] selects the first y element out of all the nodes selected by /x/y.

Because paths can start with other kinds of expressions, such as parenthesized expressions, predicates can be applied to more than just sequences of nodes. For example, the expression ("a", "b", "c")[2] selects the second item in the sequence, the string "b".
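
The difference between the two forms can be checked with a short sketch over a hypothetical document in which the x elements have different numbers of y children.

```xquery
(: Hypothetical sample document, constructed inline. :)
let $doc := document { <r><x><y>1</y><y>2</y></x><x><y>3</y></x></r> }
return (
  for $y in $doc/r/x/y[1] return string($y),     (: "1", "3": the first y of each x :)
  for $y in ($doc/r/x/y)[1] return string($y),   (: "1": the first y overall :)
  ("a", "b", "c")[2]                             (: "b": a predicate on a sequence of strings :)
)
```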

Numeric predicates, like the ones in Listing 3.4, filter by position. In general, when a predicate evaluates to a number N, it's as if the predicate were actually the boolean-valued predicate position()=N. For example, the path /x[1] is equivalent to the path /x[position() = 1]. This expansion applies not only to numeric constants, but also to any numeric-typed expression. For example, the path /x[@y + 1] is equivalent to the path /x[position() = @y + 1].

Listing 3.4. Numeric predicates filter by position

(//Customer)[2]
Fruit[@index + 1]

The position is 1-based (the first item in the sequence is at position 1). When the predicate evaluates to a non-integral value, a value less than 1, or a value greater than the length of the sequence, then the predicate will be false for all items in the sequence and the result will be the empty sequence. In other words, it isn't an error to select an index that is out of bounds for the sequence.
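
These rules can be sketched in one query. The Fruit elements and their index attribute are hypothetical data chosen so that the computed predicate matches some positions and not others.

```xquery
let $s := ("a", "b", "c")
(: Hypothetical sample data, constructed inline. :)
let $f := <f><Fruit index="0">apple</Fruit><Fruit index="5">pear</Fruit><Fruit index="2">plum</Fruit></f>
return (
  $s[2],                (: "b" :)
  $s[position() = 2],   (: "b": the expanded form of the same predicate :)
  count($s[0]),         (: 0: positions start at 1 :)
  count($s[4]),         (: 0: past the end of the sequence, but not an error :)
  count($s[1.5]),       (: 0: no item has a fractional position :)
  (: @index + 1 is numeric, so each Fruit is kept only if its position equals that value. :)
  for $x in $f/Fruit[@index + 1] return string($x)   (: "apple", "plum" :)
)
```

The pear element sits at position 2 but its predicate evaluates to 6, so it is filtered out.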

3.2.5.2 Boolean Predicates

All other kinds of predicate expressions, such as the ones in Listing 3.5, filter a sequence so that only those items for which the predicate evaluates to true are kept. The predicate is converted to a boolean value by computing the Effective Boolean Value of the expression.

Listing 3.5. All other predicates filter as boolean conditions

/x[@a=1 and @b=1]
/x[@a=1]/y[@b < 2]

As described in Section 2.6.2, the Effective Boolean Value acts as an existence test on sequences. Consequently, when the predicate is itself a path, the predicate evaluates to true if and only if the node(s) selected by that path exist. For example, x[y] matches all x elements that have a y child element, and x[not(@y)] matches all x elements that don't have a y attribute.
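
The existence tests described above can be sketched over a hypothetical element with three x children that differ in their attributes and content.

```xquery
(: Hypothetical sample data, constructed inline. :)
let $r := <r><x a="1" b="1"><y/></x><x a="1"/><x y="k"/></r>
return (
  count($r/x[@a = 1 and @b = 1]),   (: 1: only the first x has both attributes equal to 1 :)
  count($r/x[y]),                   (: 1: only the first x has a y child element :)
  count($r/x[not(@y)]),             (: 2: the first two x elements have no y attribute :)
  count($r/x[@a = 1]/y)             (: 1: the y child of the first x :)
)
```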

3.2.5.3 Successive and Nested Predicates

Several predicates can be applied to a step, with the effect that each predicate is evaluated with respect to the nodes remaining after the previous predicate. The order of evaluation of the predicates is always left to right, which matters only when computing positional predicates. For example, the path x[1][@y=2] selects the first x element (if there is one), and then only if that element has a y attribute whose value is 2; while the path x[@y=2][1] selects all x elements that have a y attribute whose value is 2, and then from that set selects the first one. Over the XML <x y="3"/><x y="2"/> the first path selects nothing (because the first x element has y="3"), while the second path selects the second element.
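
The example from the text can be run as the following sketch. The wrapper element r is an assumption added so that the two x elements have a common parent.

```xquery
(: The two x elements from the text, wrapped in a hypothetical parent r. :)
let $r := <r><x y="3"/><x y="2"/></r>
return (
  count($r/x[1][@y = 2]),       (: 0: the first x has y="3", so nothing survives :)
  count($r/x[@y = 2][1]),       (: 1: the first of the x elements whose y is 2 :)
  $r/x[@y = 2][1] is $r/x[2]    (: true: that node is the second x element :)
)
```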

Predicates can also be nested. For example, the path x[y[@z=1] = 2] selects all x elements where there exists a y element with a z attribute equal to 1 and the value of the y element itself equals 2.
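
A minimal sketch of the nested predicate, over hypothetical data in which only the first x element satisfies both the inner and the outer condition:

```xquery
(: Hypothetical sample data, constructed inline. :)
let $r :=
  <r>
    <x><y z="1">2</y></x>
    <x><y z="1">5</y><y z="0">2</y></x>
    <x><y z="0">2</y></x>
  </r>
return count($r/x[y[@z = 1] = 2])   (: 1 :)
```

In the second x element the y whose z attribute is 1 has the value 5, and in the third no y has z equal to 1, so neither is selected.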
