Day Six: Well-Formedness and Syntactical Rules in XHTML 1.0
How has XML influenced HTML's familiar syntax? XHTML expert Molly Holzschlag discusses several key concepts that are inherent to XHTML as a result of XML's influence.
In previous days, we've looked at what XHTML 1.0 is, why it makes sense, and how to create basic document templates that conform to its three available document type definitions (DTDs).
But what about the ways in which XML has influenced HTML's familiar syntax? Several key concepts are inherent to XHTML as a result of XML's influence, and they may differ significantly from the way you've been authoring HTML.
First, the concept of well-formedness is key. This means that any document that you write must follow the correct order of elements and the correct method of writing those elements. As mentioned on Day 1, browsers forgive. So, if I were to write the following in HTML, a browser is likely to display my text as being both bold and italic:
<b><i>Welcome to my Web site!</b></i>
However, look at the markup. I open with the opening bold tag and then the italics tag. But instead of nesting my tags properly, I close the bold tag first! This is improper nesting, and, as a result, the code is considered poorly formed. To be well-formed, the code must be properly ordered:
<b><i>Welcome to my Web site!</i></b>
This is a well-formed bit of markup. Well-formedness is a critical concept in XHTML 1.0, and you must get used to following logical order within your documents.
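One way to see the difference for yourself is to hand both snippets to an XML parser, which does not forgive the way a browser does. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module as a stand-in for any XML-aware tool; the helper name is_well_formed is hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the fragment (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # The parser stops at the first well-formedness error it meets.
        return False


bad = "<b><i>Welcome to my Web site!</b></i>"
good = "<b><i>Welcome to my Web site!</i></b>"

print(is_well_formed(bad))   # False: </b> arrives while <i> is still open
print(is_well_formed(good))  # True: tags close in the reverse order they opened
```

This only tests well-formedness of a single fragment; it says nothing about whether the markup is valid against one of the XHTML 1.0 DTDs.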
There are some other issues related to markup that are necessary in XHTML 1.0. They include the following:
- All element and attribute names must appear in lower case. HTML is not case-sensitive: you can write HTML element and attribute names in lower case (<p align="right">), upper case (<P ALIGN="RIGHT">), or mixed case (<P aLiGn="right">). All of those mean the same thing in HTML. But in XHTML, every element and attribute name must be in lower case: <p align="right">. Note that attribute values (such as right in this example, and especially case-sensitive file names in URLs) can be in mixed case.
- All attribute values must be quoted. In HTML, you can get away without quoting values. So, I can have the following:
<img src="my.gif" height=55 width=65 alt="picture of me">
I've quoted some attribute values and left others unquoted. But when writing XHTML, you must quote all attribute values. There are no exceptions to this:
<img src="my.gif" height="55" width="65" alt="picture of me">
- All nonempty elements require end tags, and empty elements must be properly terminated. A nonempty element is an element that contains content. A paragraph is nonempty because text, images, or other media exist within its tags. In HTML, you could open a paragraph but not close it. In XHTML 1.0, you must close any nonempty element.
So, this is correct:
<p>This text is content within my non-empty paragraph element.</p>
This is not correct:
<p>This text is content within my non-empty paragraph element.
Another good example of this is the list item element, <li>. In HTML, you can simply open the list item and never close it—it's optional. But in XHTML, you must close it.
This is correct:
<li>This is the first item in my list.</li>
This is not correct:
<li>This is the first item in my list.
Empty elements are those elements that do not contain content. Good examples are breaks, horizontal rules, and images. Empty elements must be terminated. In XML this is done by placing a slash after the element name, so <br> becomes <br/>. But because bugs in some browsers cause pages containing that form to render improperly, in XHTML 1.0 we add a space before the final slash to ensure the page is readable: <br />.
Remember that image element just a few paragraphs earlier? Well, even with all the attributes quoted, it's not proper XHTML 1.0. Because it's an empty element, it must be terminated accordingly:
<img src="my.gif" height="55" width="65" alt="picture of me" />
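The rules in this list can be spot-checked mechanically. The following is a minimal sketch, not a validator: it assumes Python's standard xml.etree.ElementTree parser, which rejects unquoted attribute values and unterminated elements as not well-formed, and it adds a separate lower-case check because an XML parser by itself accepts upper-case names as long as the start and end tags match. The function name check_fragment is hypothetical.

```python
import xml.etree.ElementTree as ET


def check_fragment(markup):
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError as err:
        # Unquoted values, missing end tags, and unterminated empty
        # elements all surface here as parse errors.
        return ["not well-formed: " + str(err)]

    problems = []
    for element in root.iter():
        if element.tag != element.tag.lower():
            problems.append("element name not lower case: " + element.tag)
        for name in element.attrib:
            if name != name.lower():
                problems.append("attribute name not lower case: " + name)
    return problems


# The image element at each stage of the discussion above.
unquoted = '<img src="my.gif" height=55 width=65 alt="picture of me">'
unterminated = '<img src="my.gif" height="55" width="65" alt="picture of me">'
finished = '<img src="my.gif" height="55" width="65" alt="picture of me" />'

print(check_fragment(unquoted))       # one "not well-formed" entry
print(check_fragment(unterminated))   # one "not well-formed" entry
print(check_fragment(finished))       # []
print(check_fragment('<P ALIGN="RIGHT">text</P>'))
# ['element name not lower case: P', 'attribute name not lower case: ALIGN']
```

Because the parser stops at the first error, a fragment with several problems reports only one of them per run; fix it and check again.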
As you can see, the rules here are not so daunting really. It just takes a little knowledge and a little precision, and you can easily author documents that are readable by current browsers and adhere to the XHTML 1.0 standard.
About the Author
An author, instructor, and designer, Molly E. Holzschlag brings her irrepressible enthusiasm to books, classrooms, and Web sites. Honored as one of the Top 25 Most Influential Women on the Web, Molly has worked in the online world for an almost unprecedented decade. She has written and contributed to more than 10 books about the Internet and, in particular, the Web.
Molly holds a B.A. in communications and writing and an M.A. in media studies from the New School for Social Research. You can visit her Web site at http://www.molly.com/.
Molly's most recent publications are Special Edition Using XHTML (Que, November 2000), Sams Teach Yourself Adobe LiveMotion in 24 Hours (Sams, June 2000), and Special Edition Using HTML 4.0, Sixth Edition (Que, December 1999).