Simple Types

This article, written by Priscilla Walmsley, a member of the W3C working group that created XML Schema, provides a high-level overview of W3C XML Schema. It describes the basic features of XML Schema and provides specific examples of its functionality.
This article is excerpted from Definitive XML Schema (Prentice Hall PTR, 2001, ISBN: 0130655678), by Priscilla Walmsley.
This chapter is from the book

Both element and attribute declarations can use simple types to describe the data content of the components. This article introduces simple types, and explains how to define your own atomic simple types for use in your schemas.

Simple Type Varieties

There are three varieties of simple type: atomic types, list types, and union types.

  • Atomic types have values that are indivisible, such as 10 and large.

  • List types have values that are whitespace-separated lists of atomic values, such as <availableSizes>10 large 2</availableSizes>.

  • Union types may have values that are either atomic values or list values. What differentiates them is that the set of valid values, or "value space," for the type is the union of the value spaces of two or more other simple types. For example, to represent a dress size, you may define a union type that allows a value to be either an integer from 2 through 18, or one of the string values small, medium, or large.
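
The three varieties can be sketched in a small schema. This is a minimal illustration rather than a definitive design: the type names NumericSizeType, NamedSizeType, DressSizeType, and AvailableSizesType and the size element are assumptions made for this example, and the schema has no target namespace, so the type references are unprefixed.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Atomic type (hypothetical name): an integer from 2 through 18 -->
  <xs:simpleType name="NumericSizeType">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="2"/>
      <xs:maxInclusive value="18"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Atomic type (hypothetical name): one of three string values -->
  <xs:simpleType name="NamedSizeType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="small"/>
      <xs:enumeration value="medium"/>
      <xs:enumeration value="large"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Union type: its value space is the union of the two types above -->
  <xs:simpleType name="DressSizeType">
    <xs:union memberTypes="NumericSizeType NamedSizeType"/>
  </xs:simpleType>

  <!-- List type: a whitespace-separated list of dress sizes -->
  <xs:simpleType name="AvailableSizesType">
    <xs:list itemType="DressSizeType"/>
  </xs:simpleType>

  <xs:element name="size" type="DressSizeType"/>
  <xs:element name="availableSizes" type="AvailableSizesType"/>

</xs:schema>
```

Under these definitions, <availableSizes>10 large 2</availableSizes> is valid, because each item in the list is either an integer in range or one of the named sizes.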

Design Hint: How Much Should I Break Down My Data Values?

Data values should be broken down to the most atomic level possible. This allows them to be processed in a variety of ways for different uses, such as display, mathematical operations, and validation. It is much easier to concatenate two data values back together than it is to split them apart. In addition, more granular data is much easier to validate.

It is a fairly common practice to put a data value and its units in the same element, for example <length>3cm</length>. However, the preferred approach is to have a separate data value, preferably an attribute, for the units, for example <length units="cm">3</length>.

Using a single concatenated value is limiting because

  • It is extremely cumbersome to validate. You have to apply a complicated pattern that would need to change every time a unit type is added.

  • You cannot perform comparisons, conversions, or mathematical operations on the data without splitting it apart.

  • If you want to display the data item differently (for example, as "3 centimeters" or "3 cm" or just "3"), you have to split it apart. This complicates the stylesheets and applications that process the instance document.
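
The validation difference can be sketched by comparing the two approaches in a schema. This is an illustration under stated assumptions: the type names LengthWithUnitsType and UnitsType, and the set of units (mm, cm, m), are hypothetical and chosen only for this example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Concatenated approach, as in <length>3cm</length>.
       The pattern mixes number syntax and unit names, and it must be
       edited whenever a unit is added. The value is only a string,
       so it cannot be compared or range-checked as a number. -->
  <xs:simpleType name="LengthWithUnitsType">
    <xs:restriction base="xs:token">
      <xs:pattern value="[0-9]+(\.[0-9]+)?(mm|cm|m)"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Separated approach, as in <length units="cm">3</length>.
       The units are a plain enumeration (hypothetical unit list). -->
  <xs:simpleType name="UnitsType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="mm"/>
      <xs:enumeration value="cm"/>
      <xs:enumeration value="m"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- The element content is a decimal number; the units attribute
       is validated independently against UnitsType. -->
  <xs:element name="length">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:decimal">
          <xs:attribute name="units" type="UnitsType" use="required"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

In the separated version, adding a unit means adding one enumeration value, and the numeric content keeps its decimal type for comparisons and calculations.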

It is possible to go too far, though. For example, you may break a date down as follows:

<orderDate>
 <year>2001</year>
 <month>06</month>
 <day>15</day>
</orderDate> 

This is probably overkill unless you have a special need to process these items separately.
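
One alternative, sketched here as an assumption rather than a prescribed design, is to keep the date as a single atomic value of the built-in xs:date type, whose lexical form is YYYY-MM-DD. The orderDate declaration below is hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical declaration: the whole date is one atomic value.
       A valid instance would be <orderDate>2001-06-15</orderDate> -->
  <xs:element name="orderDate" type="xs:date"/>

</xs:schema>
```

A schema processor can then validate the date as a whole, while applications that need the year, month, or day separately can still extract them from the single value.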
