Introduction to SVG

This chapter is from the book

XML and SVG

We've talked a lot about how SVG is written in XML, based on XML, etc., and you're probably wondering whether you need to know XML to write or edit SVG code. In fact, SVG is a type of XML, and knowing a bit of XML document structure and syntax will help you to understand the logic, conventions, and limitations you might encounter in SVG. In this chapter, we give you a little taste of XML. If you really get interested, we go further into XML in Chapter 10.

So, what is XML? XML is an extensible and flexible markup language. What does that mean? Let's find out.

What is Markup?

HTML (Hypertext Markup Language) is an example of a markup language that you may be familiar with. In an HTML document, if you run across something that looks like this:

<p>

you know that it is a markup tag that indicates a new paragraph. HTML has built-in markup tags, which are mostly for formatting, such as <p> for paragraph and <b> for bold. HTML markup tells browsers, processors, and editors how to format and organize content. It consists of start and end tags, processing instructions, etc. Markup describes content.

Notice the angle brackets around the <p>. A good way to think of markup is that anything between angle brackets is markup <This is markup>. Anything described by that markup is content.

<p>This is content</p>

What is Extensible?

So what about XML? Well, like HTML, XML has two types of building blocks, markup and content. Content is the character or image data that resides inside the markup. What makes XML so different and so powerful is that, unlike HTML, which comes with built-in markup tags, XML is extensible, which means that you can define your own markup. The potential for creating markup is therefore practically unlimited. You can create markup that describes the content of your own document rather than being limited to the tags that HTML gives you. Or you can use preset definitions, called DTDs, or document type definitions. For now, let's take a look at some really basic XML code (Example 1–1).

Example 1–1 Script 1–2

1.  <?xml version="1.0"?>
2.  <document>
3.  <title>My first experiment</title>
4.  <sentence>This is fun</sentence>
5.  </document>

This is a very simple XML document. What do we see here?

  1. The processing statement, or XML declaration. We'll go into this a little later in the chapter.

  2. An opening tag named <document> that we created. That means we did not get it from any other source than ourselves.

  3. An opening tag named <title> that we created, content for that tag (My first experiment), and closing </title> tag.

  4. An opening tag named <sentence> that we created, content for that tag (This is fun), and closing </sentence> tag.

  5. Closing </document> tag.

Elements

So what are all these opening and closing tags about? Well in XML, we build documents with elements. An element has an opening tag, <tag>, and a closing tag, </tag>. The element includes the tags and what is in between them. In the above code snippet,

<title>My first experiment</title> 

<title> and </title> are the opening and closing tags, and <title>My first experiment</title> is the entire title element. Note that a closing tag differs from an opening tag in that it has a forward slash (/) inserted into it before the tag name.

In the above code snippet, the element <document> is the root element. Every XML document needs one root element. The root element contains the other elements inside of it. Another way of saying this is that the other elements are nested within the root element <document>. Some basic markup symbols and their uses are listed in Table 1–1.

TABLE 1–1 Some Basic XML Markup Symbols

Symbol     Use
?          paired with angle brackets (<? and ?>) to begin and end a processing instruction
<          delimiter that begins a tag
>          delimiter that ends a tag
</         delimiter that begins an end tag
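
If you happen to have Python 3 installed, a minimal sketch like the following can show the root element and its nested elements from Example 1–1. Python is our assumption here, not something the chapter requires, and describe is a name made up for illustration.

```python
import xml.etree.ElementTree as ET

# The chapter's first example, typed without the printed line numbers.
XML_TEXT = """<?xml version="1.0"?>
<document>
<title>My first experiment</title>
<sentence>This is fun</sentence>
</document>"""


def describe(xml_text):
    """Return the root element's name and a (tag, content) pair for each child."""
    root = ET.fromstring(xml_text)
    children = [(child.tag, child.text) for child in root]
    return root.tag, children


root_name, children = describe(XML_TEXT)
print("root element:", root_name)
for tag, content in children:
    print("nested element:", tag, "with content:", content)
```

The root element comes back first, and the elements nested inside it follow in document order.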


Processing Statement

Now to get to that mysterious first line of code:

<?xml version="1.0"?>

This line of code is the XML declaration. It is part of the prolog of the XML document. It must contain the version number but can also contain other information, such as the character encoding used (see Chapter 12 on Unicode) or whether the XML document is standalone (it "stands alone" and does not depend on an external document) or not standalone (it references an external document). This is particularly important to note because most SVG documents reference an external document, usually the SVG DTD published by the W3C.

The above code statement looks like, and is often loosely called, a processing instruction because it tells the computer to do something. (Strictly speaking, the XML specification treats the XML declaration as its own construct that borrows the processing-instruction syntax.) In this case, it tells the browser and related software that this is an XML file and what version of XML it is. At this writing, there is only one version of XML, but in the future that is sure to change! Processing instructions begin with an angle bracket and a question mark and end with a question mark and an angle bracket: <?processing instruction?>.
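
To see what a parser actually reads out of the declaration, here is a small sketch that assumes Python 3 and its bundled expat parser. The function name read_declaration is hypothetical; XmlDeclHandler is the hook expat provides for this line.

```python
import xml.parsers.expat


def read_declaration(xml_text):
    """Return the version, encoding, and standalone values from the XML declaration."""
    found = {}

    def on_declaration(version, encoding, standalone):
        # expat reports standalone as 1 ("yes"), 0 ("no"), or -1 (not stated).
        found["version"] = version
        found["encoding"] = encoding
        found["standalone"] = {1: "yes", 0: "no"}.get(standalone)

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_declaration
    parser.Parse(xml_text.encode("utf-8"), True)
    return found


print(read_declaration('<?xml version="1.0"?><document/>'))
print(read_declaration(
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><document/>'
))
```

Values that the declaration leaves out come back as None, which matches the idea that only the version number is required.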

Well Formed or Valid?

An XML document can be well formed only or both well formed and valid. What does this mean? Let's look at well-formed documents first.

For an XML document to be well formed, it must follow some simple syntax rules:

  • Each opening tag must have a corresponding closing tag.

  • XML is case-sensitive; thus, <title>, <Title>, and <TITLE> are three different tags.

  • There must be at least one element in a well-formed XML document.

  • There can be only one root element.

  • Elements and their tags must nest correctly.

  • Element names must conform to the following naming rules:

    • They must start with a letter or an underscore.

    • They can contain letters, digits, periods (.), underscores (_), or hyphens (-).

    • Whitespace is not allowed.

    • They cannot begin with the sequence xml, in any mix of upper- and lowercase letters; such names are reserved.
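
The naming rules in this list can be approximated in a few lines of Python. This is a simplified sketch under our own assumptions: it handles plain ASCII names only (the full XML rules also admit many non-English letters), and follows_naming_rules is a name invented for this example.

```python
import re

# Simplified, ASCII-only pattern: a letter or underscore first, then any
# mix of letters, digits, periods, underscores, and hyphens.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def follows_naming_rules(name):
    """Return True if name obeys the simplified element naming rules."""
    if name.lower().startswith("xml"):
        # Names beginning with "xml" in any letter case are reserved.
        return False
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["firstName", "_note", "1stName", "first name", "xmlData"]:
    print(candidate, follows_naming_rules(candidate))
```

A check like this is handy for catching a bad name before you type it into a document, but the parser remains the final judge.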

Unlike HTML, XML is, unfortunately, totally unforgiving of syntax errors: a single missing tag or a stray space inside a tag name is enough to stop the parser. Triple-check your XML code. XML is case-sensitive, so pick one style of naming, using either all or no caps, and stick with it. By convention, XML tags are usually written in lowercase. All XML parsers check to make sure that start and end tags exist and check delimiters and characters, as well.

Well formed means the document and syntax structure is correct, according to the above bulleted list.
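
One hedged way to test these rules without a browser is to hand the text to a parser and see whether it complains. The sketch below assumes Python 3; is_well_formed is our own helper name, and the parser it relies on is non-validating, so it checks well-formedness only.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it reports a syntax error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


samples = {
    "correct": "<document><title>My first experiment</title></document>",
    "missing closing tag": "<document><title>My first experiment</title>",
    "mismatched case": "<document><title>My first experiment</Title></document>",
    "two root elements": "<document></document><document></document>",
    "bad nesting": "<document><title></document></title>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```

Only the first sample passes; each of the others breaks one rule from the list above.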

Let's experiment with this now. Open a simple text editor, such as Notepad for Windows or BBEdit for Macs.

Type in the following code, exactly as you see it:

<?xml version="1.0"?>
<document>
<title>My first experiment</title>
<sentence>This is fun</sentence>
</document>

Now save the file to your hard drive with a name such as myfirst.xml, making sure you save it with the .xml extension. To do this in Windows, you will need to change the file type dropdown to All Files. It is important to save this as an .xml file; otherwise, you won't be able to see it as XML code. We saved our file as myfirst.xml.

Now open a browser window, either by connecting to the Internet or choosing Work Offline. Either click File, Open, Browse and find myfirst.xml in your own drive path specification or type in the file path in the location bar.

If you open the file in IE, you will see the code colored to denote syntax (Figure 1–5, left). Congratulations! You have a well-formed XML document!

NOTE

If you open the file in Netscape, you will see the text content alone, without the markup (Figure 1–5, right). This is correct; you are still on track. Netscape doesn't show the code tree the way IE does.

FIGURE 1–5 The left image is myfirst.xml in IE 5; the right image is myfirst.xml in Netscape 6.


Now let's see what happens when we deliberately make a mistake in coding. Go to your Notepad file and delete the closing </document> tag. Save the file as myFirstError.xml, again making sure that you save it with the .xml extension. Now open the file in a browser and see what happens.

Note that you get an error telling you exactly what you did wrong in both Netscape and IE! Pretty darned cooperative, aren't they? Sometimes fixing errors is that easy, sometimes not. Often, you have to hunt around a bit for what you did wrong in the code. Usually in IE 5 or Netscape 6, you will get at least a line number where the error is supposed to have occurred.
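
Browsers are not the only tools that report where an error sits. As a rough sketch, assuming Python 3, the following prints the line and column at which the parser gave up on a copy of the file with the closing </document> tag removed. The name find_error is hypothetical.

```python
import xml.etree.ElementTree as ET

# The experiment file with the closing </document> tag deleted.
BROKEN = """<?xml version="1.0"?>
<document>
<title>My first experiment</title>
<sentence>This is fun</sentence>
"""


def find_error(xml_text):
    """Return (line, column, message) for the first syntax error, or None if there is none."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


print(find_error(BROKEN))
print(find_error(BROKEN + "</document>"))
```

As with the browsers, the reported position is where the parser noticed the problem, which is not always where the mistake was made.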

Now, put back the </document> tag to end the code properly. Save and view it. All is well.

Now let's get adventuresome here. Let's add another element. Open your file in your text editor of choice and, after the code line:

<title>My first experiment</title>

Press Enter and type in:

<author>
        <firstName>Jane</firstName>
        <lastName>Jones</lastName>
</author>

Notice that we've added a new element, but that something is different here. This new element, author, has two elements nested within it: firstName and lastName. Each nested element has an opening and closing tag, and the entire author element ends with the </author> closing tag. In XML, we say that author is the parent element of firstName and lastName, and that firstName and lastName are child elements of author.

NOTE

With nested elements, each child element must begin and end completely within the parent element.
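
The parent and child relationships can be listed mechanically. Here is a minimal sketch, assuming Python 3, that walks the document and records each parent with each of its direct children; parent_child_pairs is a name chosen for this example.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<?xml version="1.0"?>
<document>
<title>My first experiment</title>
<author>
        <firstName>Jane</firstName>
        <lastName>Jones</lastName>
</author>
<sentence>This is fun</sentence>
</document>"""


def parent_child_pairs(xml_text):
    """Return a (parent tag, child tag) pair for every directly nested element."""
    root = ET.fromstring(xml_text)
    pairs = []
    for parent in root.iter():      # every element, in document order
        for child in parent:        # only the elements directly inside it
            pairs.append((parent.tag, child.tag))
    return pairs


for parent, child in parent_child_pairs(XML_TEXT):
    print(parent, "is the parent of", child)
```

Note that document is the parent of author, while author is in turn the parent of firstName and lastName.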

We're going to add one more thing: a comment. A comment is a piece of text in the file that does nothing; it is there only for you to describe or annotate the code. It is quite useful to comment your code, because often you will do something in the code and forget why you did it! It is also very helpful when viewing another person's code to be able to see his or her comments.

A comment looks like this:

<!-- This is a comment-->

Open up your file again, and, below the line of code:

<title>My first experiment</title>

Type in:

<!-- firstName and lastName are child elements nested within the parent author element -->

Save the file, and view it as usual. You will see that, in Internet Explorer, you can see the comment as part of the code, but in Netscape, you still see only the content of the code, not the markup or comment.
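
A parser can also pull the comments out on their own. The sketch below assumes Python 3 and its minidom module, which keeps comments as separate nodes; list_comments is a made-up helper name, and it looks only at comments placed directly inside the root element.

```python
from xml.dom import minidom

XML_TEXT = """<?xml version="1.0"?>
<document>
<title>My first experiment</title>
<!-- firstName and lastName are child elements nested within the parent author element -->
<author>
        <firstName>Jane</firstName>
        <lastName>Jones</lastName>
</author>
<sentence>This is fun</sentence>
</document>"""


def list_comments(xml_text):
    """Return the text of each comment that sits directly inside the root element."""
    document = minidom.parseString(xml_text)
    root = document.documentElement
    return [
        node.data.strip()
        for node in root.childNodes
        if node.nodeType == node.COMMENT_NODE
    ]


for comment in list_comments(XML_TEXT):
    print("comment:", comment)
```

The comment is reported separately from the elements, which fits the idea that it annotates the code without being part of the content.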

If you want, play around with this code some more until you get comfortable with the "well-formed" concept. You can edit a Notepad file while it is open. Just remember to save it by going to File, Save. Play around! Try adding an element or two or some other content. To view the newly edited version in IE or Netscape, just click the Refresh button on the toolbar after saving your file. This will reload the updated file.

Now that you're pretty comfortable with well-formed documents, we're going to up the ante a little bit. Most XML documents need to be valid in addition to being well formed. What defines a valid XML document? Quite simply, a valid XML document must contain or reference a DTD and must follow the rules that the DTD lays down.

DTD

An important aspect of XML structure is the XML DTD. To understand the DTD a bit, let's go back to our HTML example. HTML code contains a lot of markup that is already defined, such as our favorite, <p>. How does HTML know that <p> is a legitimate tag and where it may be used? Simple, that information is defined in HTML's DTD. HTML's DTD declares the p element and the places it may appear; the browser then supplies the behavior, starting a new paragraph every time it runs across <p>.

As we said before, XML is extensible. That means that we define our own elements and markup, and it also means that we must define our own DTD for our XML documents. The DTD file holds all of the allowable parameters for the XML file that references it. All valid XML files must have a DTD.

Let's say that you want to build a house. The blueprint of the "house" would be the DTD. The DTD defines what is and what is not allowed in the XML document that references it. This is like saying you are going to build a house and allow the following: doors, walls, a roof, and a floor. (Of course, this is way too skimpy for a real house, but you get the idea.) So the door element would be one of the elements in the house, defined by us.

Once a DTD is named and the elements are specified, you can define specific characteristics of those elements, which are called attributes. Attributes can describe color, height, width, etc. The door element in our house example might have an attribute describing what it is made of, with a value such as wood, metal, glass, aluminum, or rubber. Wood is, therefore, a value of an attribute that we have just given the element door. So think of the DTD as the blueprint, an element as a door, and the door's attribute value as wood.
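
In a document, an attribute is written inside the opening tag as a name and a quoted value. The following sketch assumes Python 3, and both the attribute name material and the helper door_material are hypothetical, chosen only to illustrate the house example.

```python
import xml.etree.ElementTree as ET

# "material" is a hypothetical attribute name; "wood" is its value.
HOUSE = '<house><door material="wood">front door</door></house>'


def door_material(xml_text):
    """Return the value of the material attribute on the first door element."""
    door = ET.fromstring(xml_text).find("door")
    return door.get("material")


print(door_material(HOUSE))
```

The element is door, the attribute is material, and wood is the value the attribute holds.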

How does the valid XML document read the DTD? In one of two ways. The DTD can either be internal, which means that it is written into the XML document, or it can be external, in which case, you must include a reference to the external DTD.

What does a DTD look like? The following is a partial example of how a DTD is set up. We are using our "house" example.

<!DOCTYPE house [
<!ELEMENT house (doors,walls,roof,floor)>
<!ELEMENT doors (#PCDATA)>
<!ELEMENT walls (#PCDATA)>
<!ELEMENT roof (#PCDATA)>
<!ELEMENT floor (#PCDATA)>
]>

We used the idea of house as the DTD. Obviously, a rabbit is not an allowable item in a blueprint of a house. A rabbit is an animal. So if our DTD is about a house, and we don't want to include a rabbit in it, then in our valid XML file, we wouldn't be able to include a rabbit. In other words, everything that appears in a valid XML document must be declared in a referenced or inline DTD.

Note that the above code includes no attributes.

DTD Code

Take a look at the DTD code above.

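First, DOCTYPE is the document type declaration, not to be confused with the DTD, or document type definition. The declaration names the root element and either contains the DTD between square brackets, as it does here, or points to a DTD stored in a separate file.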

Again, each element you are going to use for your XML document must be declared in the DTD that it references. The element declaration must start with an exclamation point (!), and the name of the element must start with a letter or underscore character. The ! tells the browser or processor that this is an ELEMENT declaration and not just a word. For the document to be valid, the name given in the DOCTYPE must match the root element, and it is a good idea, though not necessary, to make that element the first ELEMENT statement. ELEMENTS are written as !ELEMENT.

DTD Element Tags

<          Start delimiter
</         End delimiter
!ELEMENT   Element declaration (all caps necessary)
>          Close delimiter
/>         Empty Tag (mostly used as a placeholder)
!ATTLIST   Attribute declaration (all caps necessary)

(#PCDATA)

Parsed character data, (#PCDATA), tells the processor that characters are allowed in an ELEMENT, as opposed to elements or other instructions to the computer. (#PCDATA) is most often used for text content. You must tell the processor what type of data is allowed in each ELEMENT statement. Is an image allowed? Is just text allowed? More than one type of data is allowed in an element, but it all must be declared.

In the example above, in !ELEMENT house (which is a parent), we say that the elements doors, walls, roof, and floor (which are child elements of house) are allowed; then we break it down further by saying what can be included in each child element. We say (and this is only a partial list of the full statement) that in the !ELEMENTs doors, walls, roof, and floor, we can use text. Looks simple, right? Remember, we set up the parameters for our own documents, and we get to decide what we will include, but everything we use in our valid XML document must be declared in our DTD.

NOTE

(#PCDATA)

(#PCDATA) is always enclosed in parentheses.

A DTD can be saved as a separate file from the XML document, then referenced in the XML document with a line of code, like so:

<!DOCTYPE house SYSTEM "j://myHouse/house.dtd"> 

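That line of code references the house.dtd. SYSTEM introduces a system identifier: an address, either a path on your own system or a location on the Web, that tells the processor where to find the DTD. PUBLIC is the identifier you would use for a widely published DTD; it supplies a public identifier, which is a well-known name for the DTD, followed by a system identifier giving its location.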
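
To see how a parser separates these pieces, here is a sketch that assumes Python 3 and its bundled expat parser. The name read_doctype is our own; the SVG line in the usage is the standard SVG 1.1 document type declaration, included as an example of PUBLIC. The parser only reads the identifiers and does not fetch either DTD.

```python
import xml.parsers.expat


def read_doctype(xml_text):
    """Return the root name, system identifier, and public identifier from the DOCTYPE."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        found["name"] = name
        found["system_id"] = system_id
        found["public_id"] = public_id

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text.encode("utf-8"), True)
    return found


HOUSE = '<!DOCTYPE house SYSTEM "j://myHouse/house.dtd"><house/>'
SVG = (
    '<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" '
    '"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg/>'
)

print(read_doctype(HOUSE))
print(read_doctype(SVG))
```

The SYSTEM form yields only a location, while the PUBLIC form yields both a public name and a location.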

The DTD can also be included in the XML document as an "internal" DTD.

Adding a DTD to myFirst

Open up Notepad for Windows or BBEdit for Macs, and reopen your myfirst.xml file.

Type or copy in the following after the XML declaration (remember what that was?). It was the line of code that looks like this: <?xml version="1.0"?>

<!DOCTYPE document [
<!ELEMENT document (title,author,sentence)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (firstName,lastName)>
<!ELEMENT firstName (#PCDATA)>
<!ELEMENT lastName (#PCDATA)>
<!ELEMENT sentence (#PCDATA)>
]>

Save the document again as myfirst.xml. Again, when saving the document, make sure to use the .xml three-character extension and save it as type All Files, or it will be saved as a .txt document. In Windows 95/98, be careful to put quotes around it ("myfirst.xml"), or it will be saved as myfirst.xml.txt.

Now view the file in a browser. You should see something similar to Figure 1–6 or Figure 1–7.

FIGURE 1–6 myfirst.xml in Internet Explorer 5.

FIGURE 1–7 myfirst.xml in Netscape 6.

Once again, if the code is colored in IE or if just the text shows up in Netscape, the document is well formed, but we want to find out whether it is valid, as well. How do we do that? To validate the document, you have to have a validating XML parser. IE 5 has a parser but you may have to download parts of it, depending on your operating system. If you don't have a validating parser, there are some on the Web that let you paste in your code and tell you whether it is valid. One is located at www.stg.brown.edu/service/xmlvalid/. Simply paste in your code where you are given the "text" form, and press the Validate button. You will be returned to a page that tells you whether the document validates. If you have errors in the document, this program will also list them.
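
Short of full validation, you can at least check that every element you used is declared in the internal DTD. The sketch below assumes Python 3, whose bundled expat parser reads the declarations but does not validate against them. It is only a partial check (it ignores order, nesting, and content rules), and undeclared_elements is a name invented for this example.

```python
import xml.parsers.expat

MYFIRST = """<?xml version="1.0"?>
<!DOCTYPE document [
<!ELEMENT document (title,author,sentence)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (firstName,lastName)>
<!ELEMENT firstName (#PCDATA)>
<!ELEMENT lastName (#PCDATA)>
<!ELEMENT sentence (#PCDATA)>
]>
<document>
<title>My first experiment</title>
<author>
<firstName>Jane</firstName>
<lastName>Jones</lastName>
</author>
<sentence>This is fun</sentence>
</document>"""


def undeclared_elements(xml_text):
    """Return the names of elements used in the document but not declared in its internal DTD."""
    declared = set()
    used = set()

    def on_element_declaration(name, model):
        declared.add(name)      # one call per <!ELEMENT ...> statement

    def on_start_element(name, attributes):
        used.add(name)          # one call per opening tag in the document

    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = on_element_declaration
    parser.StartElementHandler = on_start_element
    parser.Parse(xml_text.encode("utf-8"), True)
    return used - declared


print(undeclared_elements(MYFIRST))
with_rabbit = MYFIRST.replace("</document>", "<rabbit>hop</rabbit></document>")
print(undeclared_elements(with_rabbit))
```

An empty result does not prove the document is valid; a validating parser is still needed for that.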

Parsers

XML parsers, including the ones contained in some but not all browsers, check the framework and content of an XML document for well-formedness; and, if they are validating parsers, they also validate the XML file against its DTD. Depending on the company, some of these programs also offer extensive planning and editing tools. This is a developing field, and there are new products out constantly.
