Modeling XML Vocabularies with UML: Part II

This article presents a list of design choices and alternative approaches for mapping UML to W3C XML Schema. The UML model presented in the first article in this series is refined to reflect the design choices made by the authors of the W3C's XML Schema Primer, where this example originated.

In Part 1 of this article series, I emphasized that models are an inevitable part of system analysis and design, even if a model is sometimes only in the developer's mind. By using UML to capture a conceptual model of the planned vocabulary, we can clarify the essential terms and relationships without getting caught up in the syntactic issues of the chosen schema language. In fact, industry standards groups may want to use UML as the primary definition for their vocabularies and leave the final choice of schema language(s) to implementing vendors.

I also want to emphasize that choosing a model-driven approach to schema design does not force you into a drawn-out waterfall development process. The approach described in these articles illustrates an evolutionary and incremental development process. The first schema produced using default mapping rules from this purchase order model may not be ideal, but it accurately captures the domain semantics that were modeled. Part 3 of this series describes how the model may be specialized to capture design characteristics that are unique to XML schema generation. This approach is compatible with the contemporary methodologies for agile programming and modeling, where the models fulfill a very pragmatic role in the development process. (See XMLmodeling.com, which is a web portal that I have created to gather case studies and modeling resources.)

To achieve these rather lofty objectives, it's essential that we have a complete, flexible mapping specification between UML and XML schemas. The following examples do not present the complete picture, but attempt to ease you into a maze of terminology from UML and the W3C XML Schema definition language (XSD).

Mapping UML Models to XML Schema

This is where the rubber meets the road when using UML in the development of XML schemas. A primary goal guiding the specification of this mapping is to allow sufficient flexibility to encompass most schema design requirements, while retaining a smooth transition from the conceptual vocabulary model to its detailed design and generation.

A related goal is to allow a valid XML schema to be automatically generated from any UML class diagram, even if the modeler has no familiarity with the XML schema syntax. Having this ability enables a rapid development process and supports reuse of the model vocabularies in several different deployment languages or environments, because the core model is not overly specialized to XML structure.

Please note that the schema examples in this article are not fully compatible with the corresponding example in the XML Schema Primer. Nonetheless, the following schema fragments are still valid interpretations of the conceptual model. The third article in this series will continue the refinement process to its logical conclusion, where the resulting schema can validate the XSD Primer example.

The conceptual model for purchase orders shown in Figure 1 is duplicated with very slight modification from the first article. We'll dissect this diagram into all of its major structures and map each part to the W3C XML Schema definition language. I'll note several situations in which other alternatives are possible and also point out where the schema differs from the XSD Primer example.

Figure 1 Conceptual model of purchase order vocabulary.

Class and Attribute

A class in UML defines a complex data structure (and associated behavior) that maps by default to a complexType in XSD. As a first step, the PurchaseOrder class and its UML attributes produce the following XML Schema definition:

<xs:complexType name="PurchaseOrder">
 <xs:all>
  <xs:element name="orderDate" type="xs:date"
        minOccurs="0" maxOccurs="1"/>
  <xs:element name="comment" type="xs:string"
        minOccurs="0" maxOccurs="1"/>
 </xs:all>
</xs:complexType>

The attributes in a UML class are not restricted to a particular order, so an XSD <xs:all> element is used to create an unordered model group. In addition, a UML class creates a distinct namespace for its attribute names (that is, two classes can contain attributes having the same name), so these are produced as local element definitions in the schema. See A New Kind of Namespace for more explanation of this topic.

Both of these UML attributes are optional, indicated by [0..1] in Figure 1. These are mapped to minOccurs and maxOccurs attributes in the XSD. The UML attributes are defined using primitive datatypes from the XSD specification, so these are written directly to the generated schema using the appropriate namespace prefix. If other datatypes are used in the UML model, an XSD type library can be created to define these types for use in a schema. For example, I have created an XSD type library for the Java primitive types and common Java classes such as Date, String, Boolean, and so on.

As a useful default, a top-level element is automatically created for each complexType in the schema. The default name for this element is the same as the class name; this is allowed in the W3C XML Schema because it uses separate namespaces within the schema itself for complexTypes and top-level elements. For PurchaseOrder, the top-level schema element is created as follows:

<xs:element name="PurchaseOrder" type="PurchaseOrder"/>
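To make the default mapping concrete, here is a minimal sketch of an instance document that should validate against this first-step definition (the complexType with only orderDate and comment, plus the top-level element). It assumes that the schema declares no target namespace, and the values are purely illustrative. Because <xs:all> is unordered, the two children may appear in either order, and either one may be omitted.

```xml
<?xml version="1.0"?>
<!-- Illustrative instance; assumes a schema with no targetNamespace -->
<PurchaseOrder>
  <!-- comment precedes orderDate here; xs:all accepts either order -->
  <comment>Hurry, my lawn is going wild!</comment>
  <orderDate>1999-10-20</orderDate>
</PurchaseOrder>
```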

Referring to the XSD Primer example, orderDate is modeled as an XML attribute, not a local element in PurchaseOrder. It also uses a <sequence> model group instead of <all>. And the top-level element is defined in the Primer using a lowercase first letter; that is, purchaseOrder (often called "lower camel case" format). All of these differences are addressed in the third article in this series by using a UML profile to expand the mapping to XML schemas.

Association

The PurchaseOrder type is specified not only by its UML attributes but also by its associations to other classes in the model. Figure 1 includes three associations that originate at PurchaseOrder, as indicated by the navigation arrows at their opposite ends. Each association has a role name and multiplicity to specify how the target class is related. These associations are added to the model group of the XSD complexType, along with the elements created from the UML attributes.

<xs:complexType name="PurchaseOrder">
 <xs:all>
  <xs:element name="orderDate" type="xs:date"
        minOccurs="0" maxOccurs="1"/>
  <xs:element name="comment" type="xs:string"
        minOccurs="0" maxOccurs="1"/>
  <xs:element name="shipTo">
   <xs:complexType>
    <xs:sequence>
     <xs:element ref="Address"/>
    </xs:sequence>
   </xs:complexType>
  </xs:element>
  <xs:element name="billTo">
   <xs:complexType>
    <xs:sequence>
     <xs:element ref="Address"/>
    </xs:sequence>
   </xs:complexType>
  </xs:element>
  <xs:element name="items" minOccurs="0" maxOccurs="1">
   <xs:complexType>
    <xs:sequence>
     <xs:element ref="Item"
           minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
   </xs:complexType>
  </xs:element>
 </xs:all>
</xs:complexType>

Because the UML attributes for orderDate and comment have primitive datatypes, the schema embeds these values as element content. However, the default mapping for associations creates a wrapper element in XSD corresponding to the role name in UML. This element then contains the instances of the associated class, which the schema refers to using the top-level element created for each complexType.

If you want to create a W3C XML Schema using the <all> content model, a wrapper element is necessary whenever the associated class has more than one occurrence. This is because <all> can be used only when the contained elements have either [0..1] or [1..1] multiplicity. So when generating the wrapper element for the association to Item, the element named items allows zero or one instance, which holds zero or more Item elements.
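If the wrapper element is not wanted, one alternative is to give up the unordered group. The following is only a sketch of that alternative, not the default mapping described in this article: it replaces <xs:all> with <xs:sequence>, which permits a repeating Item element directly in the PurchaseOrder content. The costs are a fixed element order and the loss of the items role name.

```xml
<!-- Sketch of an alternative design, not the generator's default output -->
<xs:complexType name="PurchaseOrder">
 <xs:sequence>
  <xs:element name="orderDate" type="xs:date" minOccurs="0"/>
  <xs:element name="comment" type="xs:string" minOccurs="0"/>
  <!-- the shipTo and billTo wrapper elements would appear here unchanged -->
  <!-- Item may repeat directly: xs:sequence does not limit maxOccurs to 1 -->
  <xs:element ref="Item" minOccurs="0" maxOccurs="unbounded"/>
 </xs:sequence>
</xs:complexType>
```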

The difference between this default schema generated from UML and the schema included in the XSD Primer is that the Primer's shipTo and billTo roles contain the address content directly, without use of an element for the associated class. In other words, child elements for name, street, city, and so on are contained directly within shipTo and billTo. This design alternative is covered in the extensions presented in the third article in this series.

User-Defined Datatype

The default mapping to XSD would produce a complexType definition for SKU and QuantityType, but we want these to become user-defined simple datatypes in the XML Schema. This is easily achieved by adding a UML stereotype to each of these two classes, which is shown as <<XSDsimpleType>> in Figure 1. This ability to include stereotypes is an integral part of the UML standard and is used to specify additional model characteristics that are usually unique to a particular domain; in this case, unique to XML schema design.

Using the stereotype, the schema generator knows to create the following definition for SKU:

<xs:simpleType name="SKU">
 <xs:annotation>
  <xs:documentation>Stock Keeping Unit, a code for identifying
           products</xs:documentation>
 </xs:annotation>
 <xs:restriction base="xs:string">
  <xs:pattern value="\d{3}-[A-Z]{2}"/>
 </xs:restriction>
</xs:simpleType>

A UML model may also include documentation for any of its model elements, which is passed through to the XML schema definition as shown in this example. The UML generalization relationship indicates which existing simple datatype should be used as the base for this user-defined type. Finally, the pattern attribute on SKU is mapped to an XSD facet that constrains the SKU string value.
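The effect of the pattern facet can be seen with a few sample values. The element name partNum in the fragments below is a hypothetical declaration introduced only for illustration. An XSD pattern must match the entire value, so no anchors are needed in the expression.

```xml
<!-- Hypothetical declaration that uses the SKU type -->
<xs:element name="partNum" type="SKU"/>

<!-- Accepted: three digits, a hyphen, two uppercase letters -->
<partNum>926-AA</partNum>

<!-- Rejected: lowercase letters do not match [A-Z]{2} -->
<partNum>926-aa</partNum>

<!-- Rejected: only two digits before the hyphen -->
<partNum>92-AA</partNum>
```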

The second module in the purchase order schema definition represents a reusable set of specifications for addresses, as shown in Figure 2. These definitions are taken directly from section 4.1 of the XSD Primer. Two additional schema constructs are required by this model, in addition to those used when producing a schema from Figure 1.

Figure 2 Modularized Address schema component.

Generalization

A fundamental and pervasive concept in object-oriented analysis and design is generalization from one class to another. The specialized subclass inherits attributes and associations from all of its parent classes. This is easily represented in W3C XML Schema, although it requires more indirect mechanisms when producing other XML schema languages.

In Figure 2, the Address class is shown in italic font, which is used in UML to indicate that this is an abstract class that is only intended to be used for deriving other specialized classes. Following the same default rules used for PurchaseOrder, the complexType definitions for Address and USAddress are produced as follows:

<xs:element name="Address" type="Address" abstract="true"/>
<xs:complexType name="Address" abstract="true">
 <xs:sequence>
  <xs:element name="name" type="xs:string"/>
  <xs:element name="street" type="xs:string"/>
  <xs:element name="city" type="xs:string"/>
 </xs:sequence>
</xs:complexType>

<xs:element name="USAddress" type="USAddress"
      substitutionGroup="Address"/>
<xs:complexType name="USAddress">
 <xs:complexContent>
  <xs:extension base="Address">
   <xs:sequence>
    <xs:element name="state" type="USState"/>
    <xs:element name="zip" type="xs:positiveInteger"/>
   </xs:sequence>
  </xs:extension>
 </xs:complexContent>
</xs:complexType>

There are three areas of difference from previous examples:

  • The top-level element and complexType definitions for Address include the XSD attribute abstract="true".

  • The USAddress element includes substitutionGroup="Address", which means that whenever the Address element is required as a content element, USAddress may be substituted in its place. Thus, we can use USAddress (or, similarly, UKAddress) as the content of shipTo and billTo in the PurchaseOrder.

  • The complexType definition for USAddress is extended from the base complexType named Address. But there is a significant point of difference in how this inheritance structure is interpreted in UML versus in XSD. In UML, the order of attributes and associations within a class is not specified and the features inherited from parent classes are freely intermingled with locally defined attributes and associations in a subclass. In XSD, a complexType defined using an <all> group is not allowed to be extended with additional element content in a child complexType. As a result of this limitation in XSD, the complexType for Address and each of its specialized subtypes must use a <sequence> group definition for their element content. So the three elements inherited from Address are an ordered group in USAddress, followed in sequence by another ordered group of the two elements defined in USAddress. You cannot define an unordered group of the five elements when one or more is inherited.
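The substitution group can be illustrated with an instance document. This is a sketch that assumes the PurchaseOrder definition from the Association section, a schema with no target namespace, and made-up address values. Because Address is abstract, it cannot appear in an instance itself; USAddress, a member of its substitution group, takes its place inside each wrapper element. Note the fixed order within USAddress: the three inherited elements come first, followed by state and zip.

```xml
<?xml version="1.0"?>
<!-- Illustrative instance; orderDate, comment, and items are optional and omitted -->
<PurchaseOrder>
  <shipTo>
    <!-- USAddress substitutes for the abstract Address element -->
    <USAddress>
      <!-- inherited from Address, in declared order -->
      <name>Alice Smith</name>
      <street>123 Maple Street</street>
      <city>Mill Valley</city>
      <!-- added by the USAddress extension -->
      <state>PA</state>
      <zip>90952</zip>
    </USAddress>
  </shipTo>
  <billTo>
    <USAddress>
      <name>Robert Smith</name>
      <street>8 Oak Avenue</street>
      <city>Old Town</city>
      <state>PA</state>
      <zip>95819</zip>
    </USAddress>
  </billTo>
</PurchaseOrder>
```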

Enumerated Datatype

The state element of USAddress refers to a simple type definition for USState, which is generated from a UML enumeration. In Figure 2, USState is shown with an <<enumeration>> stereotype that notifies the schema generator to create an XSD enumeration value for each of the attributes defined for this class. An enumerated type in XSD is just a specialized kind of simpleType definition, so it must also specify a superclass in UML to use as a base type in XSD. The schema is generated as follows:

<xs:simpleType name="USState">
 <xs:restriction base="xs:string">
  <xs:enumeration value="AK"/>
  <xs:enumeration value="AL"/>
  <xs:enumeration value="AR"/>
  <xs:enumeration value="PA"/>
 </xs:restriction>
</xs:simpleType>
