XML Schema Patterns
The 'pattern' facet requires more explanation than the brief description in Section 14.6 provides. This XML Schema feature is based on the regular expression capabilities of the Perl programming language. It is therefore very powerful, but that power comes at the cost of some complexity.
15.1 Introduction
Although the XML Schema language has a large number of built-in data types that can be used, restricted, and extended, some requirements demand much finer control over the exact structure of a value. For example, a simple code might need to consist of three lowercase letters:
<Code>abc</Code> <!-- OK -->
<Code>ABC</Code> <!-- ERROR -->
<Code>abcd</Code> <!-- ERROR -->
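As a minimal sketch, the three-letter constraint above might be expressed with a 'pattern' facet as follows. The 'xs' namespace prefix and the choice of 'string' as the base type are assumptions made for illustration, not declarations taken from this chapter.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Code: exactly three lowercase letters in the range a to z -->
  <xs:element name="Code">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <!-- [a-z] matches one lowercase letter; {3} requires exactly three -->
        <xs:pattern value="[a-z]{3}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

A pattern in an XML Schema must match the whole value, so 'abcd' is rejected even though it begins with three valid letters.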
Similarly, when an element or attribute contains an ISBN (International Standard Book Number), it should be possible to apply constraints that reflect the structure of the code. A ten-digit ISBN is composed of three identifiers (location, publisher, and book) and a check digit, separated by hyphens or spaces. Valid values include '0-201-41999-8' and '963-9131-21-0'. The schema processor should detect any error in an ISBN attribute:
<Book ISBN="0-201-77059-8" ...> <!-- OK -->
<Book ISBN="X-999999-" ...> <!-- ERRORS -->
Some programming languages, such as Perl, include a regular expression language for defining a pattern against which a series of characters can be compared. Typically, this feature is used to search for fragments of text in a document, but the XML Schema language has co-opted it for sophisticated validation of element content and attribute values.
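To illustrate this kind of validation, the following sketch shows one way a ten-digit ISBN value might be constrained. The type name 'ISBNType', the 'xs' prefix, and the digit-count limits for each identifier are hypothetical choices made for this example. The pattern checks only the shape of the value, and does not verify that the check digit is arithmetically correct.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- ISBNType (hypothetical name): three numeric identifiers and a
       check digit, each separated by a hyphen or a space -->
  <xs:simpleType name="ISBNType">
    <xs:restriction base="xs:string">
      <!-- 10 digits plus 3 separators gives 13 characters in total -->
      <xs:length value="13"/>
      <!-- The check digit may be 0 to 9 or X; the upper limits on the
           identifier lengths are assumptions for this sketch -->
      <xs:pattern value="[0-9]{1,5}[\- ][0-9]{1,7}[\- ][0-9]{1,6}[\- ][0-9X]"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

An ISBN attribute declared with type="ISBNType" would accept '0-201-77059-8' and '963-9131-21-0' but reject 'X-999999-', which has neither the required length nor the required sequence of identifiers. Because the 'length' and 'pattern' facets must both be satisfied, the three identifiers are forced to contain nine digits between them.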