What HTML Files Look Like
Pages written in HTML are plain text files (ASCII), which means that they contain no platform- or program-specific information. Any editor that supports text (which should be just about any editor--more about this subject in "Programs to Help You Write HTML" later in this chapter) can read them. HTML files contain the following:
The text of the page itself
HTML tags that indicate page elements, structure, formatting, and hypertext links to other pages or to included media
Most HTML tags look something like the following:
<thetagname> affected text </thetagname>
The tag name itself (here, thetagname) is enclosed in angle brackets (< >). HTML tags generally come in pairs, with a beginning and an ending tag surrounding the text they affect. The beginning tag "turns on" a feature (such as headings, bold, and so on), and the ending tag turns it off. Closing tags have the tag name preceded by a slash (/). The opening tag (for example, <p> for paragraphs), the closing tag (for example, </p>), and the text between them compose what is officially called an HTML element.
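As a quick illustration of the pattern just described, the fragment below marks up a heading, a paragraph, and a bold phrase. The sentences themselves are invented for this sketch; only the tags (h1, p, and b) matter.

```html
<!-- Opening tag, affected text, closing tag: together they form one element -->
<h1>A Heading Element</h1>

<!-- Elements can sit inside other elements; close the inner one first -->
<p>This paragraph contains <b>bold text</b> and then continues normally.</p>
```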
Be aware of the difference between the forward slash (/) mentioned with relation to tags, and backslashes (\), which are used by DOS and Windows in directory references on hard drives (as in C:\window or other directory paths). If you accidentally use the backslash in place of a forward slash in HTML, the browser won't recognize the ending tags.
Not all HTML tags have both an opening and closing tag. Some tags are only one-sided, and still other tags are "containers" that hold extra information and text inside the brackets. XHTML 1.0, however, requires that all tags be closed. You'll learn the proper way to open and close the tags as the book progresses.
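The fragment below is a minimal sketch of those two special cases, written the way XHTML 1.0 expects. The file name logo.gif and the address http://www.example.com/ are placeholders invented for illustration.

```html
<!-- One-sided (empty) tags have no separate closing tag.
     In XHTML 1.0 they close themselves with a trailing slash. -->
<p>First line<br />
second line</p>
<hr />

<!-- Some tags carry extra information (attributes) inside the brackets -->
<img src="logo.gif" alt="Company logo" />
<a href="http://www.example.com/">A link to another page</a>
```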
Another difference between HTML 4.0 and XHTML 1.0 relates to usage of lowercase tags and attributes. HTML tags are not case sensitive; that is, you can specify them in uppercase, lowercase, or in any mixture. So, <HTML> is the same as <html>, which is the same as <HtMl>. This is not the case for XHTML 1.0, where all tag and attribute names must be written in lowercase. To get you thinking in this mindset, the examples in this book display tag and attribute names in bold lowercase text.
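To make the case rules concrete, here is a small sketch. A browser reading the page as HTML 4.0 treats all three paragraphs below identically, but only the last one follows the lowercase rule of XHTML 1.0. The sentence text is invented for illustration.

```html
<!-- Accepted by HTML 4.0, but not valid XHTML 1.0 -->
<P>Tags in uppercase.</P>
<P>Opening tag in uppercase, closing tag in lowercase.</p>

<!-- Accepted by both HTML 4.0 and XHTML 1.0 -->
<p>Tags in lowercase.</p>
```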
Exercise 3.1: Creating Your First HTML Page
Now that you've seen what HTML looks like, it's your turn to create your own Web page. Start with a simple example so that you can get a basic feel for HTML.
To get started writing HTML, you don't need a Web server, a Web provider, or even a connection to the Web itself. All you really need is an application where you can create your HTML files and at least one browser to view them. You can write, link, and test whole suites of Web pages without even touching a network. In fact, that's what you're going to do for the majority of this book. I'll talk later about publishing everything on the Web so that other people can see your work.
To get started, you'll need a text editor. A text editor is a program that saves files in ASCII format. ASCII format is just plain text, with no font formatting or special characters. In Windows, Notepad and Microsoft WordPad are good basic text editors (and free with your system). Shareware text editors are also available for various operating systems, including DOS, Windows 3.1, Windows 95/98, Windows NT, Macintosh, and Linux. If you point your Web browser to http://www.download.com and enter Text Editors as a search term, you will find many resources available to download.
If you prefer to work in a word processor such as Microsoft Word, don't panic. You can still write pages in word processors just as you would in text editors, although doing so is more complicated. When you use the Save or Save As command, you'll see a menu of formats you can use to save the file. One of them should be Text Only, Text Only with Line Breaks, or DOS Text. All these options will save your file as plain ASCII text, just as if you were using a text editor. For HTML files, if you have a choice between DOS Text and just Text, use DOS Text, and use the Line Breaks option if you have it.
If you do use a word processor for your HTML development, be very careful. Many recent word processors include HTML modes or mechanisms for creating HTML or XML code. These features can produce unusual results or files that simply don't behave as you expect. If you run into trouble with a word processor, try using a text editor and see whether it helps.
What about the plethora of free and commercial HTML editors that claim to help you write HTML more easily? Most of them are text editors that simplify common tasks associated with HTML coding. If you've got one of these editors, go ahead and use it. If you've got a fancier editor that claims to hide all the HTML for you, put it aside for the next couple of days and try using a plain text editor just for a little while. Appendix A, "Sources for Further Information," lists many URLs where you can download free and commercial HTML editors that are available for different platforms. They appear in the section titled "HTML Editors and Converters."
Open your text editor, and type the following code. You don't have to understand what any of it means at this point. You'll learn more about much of this today and tomorrow. This simple example is just to get you started.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<title>My Sample HTML page</title>
</head>
<body>
<h1>This is an HTML Page</h1>
</body>
</html>
Note that the <!DOCTYPE> tag in the previous example doesn't appear in lowercase like the rest of the tags. This tag is an exception to the XHTML rule and should appear in uppercase. This is explained in detail on Day 12.
After you create your HTML file, save it to your hard disk. Remember that if you're using a word processor, choose Save As and make sure you're saving it as text only. When you choose a name for the file, follow these two rules:
The filename should have an extension of .html (.htm on DOS or Windows systems that have only three-character extensions)--for example, myfile.html, text.html, or index.htm. Most Web software will require your files to have these extensions, so get into the habit of doing it now.
Use small, simple names. Don't include spaces or special characters (bullets, accented characters)--just letters and numbers are fine.
Exercise 3.2: Viewing the Result
Now that you have an HTML file, start up your Web browser. You don't have to be connected to the network because you're not going to be opening pages at any other site. Your browser or network connection software might complain about the lack of a network connection, but usually it will give up and let you use it anyway.
After your browser is running, look for a menu item or button labeled Open Page, Open File, or maybe just Open. Choosing it will enable you to browse your local disk. The Open command (or its equivalent) opens a document from the Web or from your local disk, parses it, and displays it. By using your browser and the Open command, you can write and test your HTML files on your computer in the privacy of your own home.
If you don't see something similar to what is shown in Figure 3.2 (for example, if parts are missing or if everything looks like a heading), go back into your text editor and compare your file to the example. Make sure that all your tags have closing tags and that all your < characters are matched by > characters. You don't have to quit your browser to do so; just fix the file and save it again under the same name.
Figure 3.2 The sample HTML file.
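The two mistakes mentioned above might look like the following fragments, each shown next to its fix. This is only a sketch: the fragments are separate snippets rather than one page, and exactly how a browser displays the broken versions varies, so treat the comments as typical symptoms rather than guaranteed behavior.

```html
<!-- Broken: the heading is never closed, so the text that
     follows it typically ends up looking like a heading too -->
<h1>This is an HTML Page
<p>This text was meant to be an ordinary paragraph.</p>

<!-- Fixed: the closing tag ends the heading -->
<h1>This is an HTML Page</h1>
<p>This text was meant to be an ordinary paragraph.</p>

<!-- Broken: the closing tag is missing its > character -->
<h1>This is an HTML Page</h1

<!-- Fixed: every < is matched by a > -->
<h1>This is an HTML Page</h1>
```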
Next, go back to your browser. Locate and choose a menu item or button called Reload (for Netscape users) or Refresh (for Internet Explorer users). The browser will read the new version of your file, and voilà! You can edit and preview and edit and preview until you get the file right.
If you're getting the actual HTML text repeated in your browser rather than what's shown in Figure 3.2, make sure that your HTML file has an .html or .htm extension. This file extension tells your browser that it is an HTML file. The extension is important.
If things are going really wrong--if you're getting a blank screen or you're getting some really strange characters--something is wrong with your original file. If you've been using a word processor to edit your files, try opening your saved HTML file in a plain text editor (again Notepad or SimpleText will work just fine). If the text editor can't read the file, or if the result is garbled, you haven't saved the original file in the right format. Go back into your original editor, and try saving the file as text only again. Then try viewing the file again in your browser until you get it right.
A Note About Formatting
When an HTML page is parsed by a browser, any formatting you might have done by hand--that is, any extra spaces, tabs, returns, and so on--is ignored. The only thing that formats an HTML page is an HTML tag. If you spend hours carefully editing a plain text file to have nicely formatted paragraphs and columns of numbers but don't include any tags, when you read the page into an HTML browser, all the text will flow into one paragraph. All your work will have been in vain.
The one exception to this rule is a tag called <pre>. You'll learn about this tag on Day 6, "More Text Formatting with HTML."
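As a rough sketch of this behavior, the first fragment below relies on hand formatting alone, and the second marks each line as its own paragraph. The sample text is invented for illustration.

```html
<!-- No tags: the returns and extra spaces are ignored,
     so a browser runs all three lines together -->
Apples      3
Oranges    12
Pears       7

<!-- With tags: each line becomes its own paragraph -->
<p>Apples 3</p>
<p>Oranges 12</p>
<p>Pears 7</p>
```

In the first fragment, a browser collapses each run of spaces and returns into a single space, so the three lines flow together as Apples 3 Oranges 12 Pears 7.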
The advantage of having all whitespace (spaces, tabs, returns) ignored is that you can put your tags wherever you want.
The following examples all produce the same output. Try them!
<h1>If music be the food of love, play on.</h1>

<h1>
If music be the food of love, play on.
</h1>

<h1>
If music be the food of love, play on.     </h1>

<h1>   If   music   be   the   food   of   love,
play      on.   </h1 >