- All the XML that You Need to Know
- Understanding the RSS Specification
- Setting Up an Aggregator
- Making RSS Work for Your Organization
Blogging, formally known as web logging, is becoming as pervasive as email. These days it seems as if everybody, from worldwide news services to soccer moms and dads, has a blog. Newspapers and magazines with an online presence use blogging technology to publish the work of their journalists. Individual users employ blogs as a personal publishing tool to post online diaries, essays, poetry, humor, and special-interest topics. Blogging is here to stay!
With everybody blogging on a daily basis, there is a lot of interesting information being generated. Yet how is one to keep track of everything and find what's important? If you are interested in only one or two blogs, making a visit to the blog's web site each day and reading what is posted is a reasonable thing to do. But what do you do if you are interested in 10 blogs or 20 blogs? Clearly, keeping track of all those blogs can become a laborious, time-consuming activity. Wouldn't it be nice if there were a way to keep a list of your favorite blogs and be automatically notified when a new entry is made? This is where RSS comes in.
RSS (Really Simple Syndication) is a protocol that allows bloggers to publish their information in a standardized format that can be read by a special type of software called an RSS Aggregator (a.k.a. a news reader). An RSS Aggregator understands the structure of RSS and can present a particular blog in much the same way that your email reader presents your email. Most RSS Aggregators contain a window that lists a brief summary line for each entry, telling you the subject of the blog entry and the time it was posted. If you want to read the entire blog entry, you select the summary to see the complete contents of the post. Thus, you can keep track of many blogs, reading only the specific items that you find to be of interest.
Figure 1 An RSS Aggregator allows you to keep track of a number of blogs and notifies you when a blog has been updated.
Given that blogs are here to stay, knowing a thing or two about RSS technology can be quite useful. In this article I will give you a brief overview of the RSS protocol, tell you how it relates to XML, and show you how to set up your blog so that it supports RSS. Then I'll walk you through the steps of setting up typical RSS Aggregators to keep track of a variety of blogs. Lastly I'll offer some suggestions as to how you and your organization can benefit from using blogs.
I'll assume that you have a working understanding of how email is structured and how to use an email reader. I'll also assume that you have heard about blogging and may have read a blog.
All the XML that You Need to Know
The RSS format is based on XML, Extensible Markup Language. So to understand RSS you need to have a fundamental knowledge of XML.
XML is a markup language similar to HTML, the language that is used to render web pages in your browser. HTML works by using "tags" to describe how a set of characters should be presented on a web page. For example, the following HTML code presents the words Hello World in boldface type:
<b>Hello World</b>
The tag set <b></b> tells the browser that the characters that come after the "begin tag" <b> and before the "end tag" </b> are to be made bold. The important thing to understand is that in a tag set, a begin tag takes the form <tag_name> and an end tag takes the form </tag_name>, where tag_name is the name of the tag. The HTML standard has a number of predefined tag sets. (Tag sets are also referred to as elements.) Table 1 shows some examples of a few common HTML tags.
Table 1: Examples of Common HTML Tags
End tag is included in the begin tag
Note that the last entry in Table 1, the <br /> tag, does not have an end tag because the tag represents a line break. HTML that is coded like this:
A line.<br /> Another line.
will be displayed in the browser this way:
A line.
Another line.
A line break is a stand-alone HTML tag. It does not format any characters. In cases of stand-alone tags, HTML allows you to combine a set of begin and end tags by using the <tag_name /> format.
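The three tag forms described above can be observed with a short Python sketch. It uses the standard library's html.parser module; the class name TagRecorder and the function record_tags are made up for this illustration and are not part of HTML or of any blogging tool.

```python
from html.parser import HTMLParser


class TagRecorder(HTMLParser):
    """Records each tag the parser sees, in document order (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # A begin tag such as <b>.
        self.events.append(("begin", tag))

    def handle_endtag(self, tag):
        # An end tag such as </b>.
        self.events.append(("end", tag))

    def handle_startendtag(self, tag, attrs):
        # The combined stand-alone form, such as <br />.
        self.events.append(("stand-alone", tag))


def record_tags(html):
    """Return the list of (kind, tag_name) pairs found in an HTML snippet."""
    recorder = TagRecorder()
    recorder.feed(html)
    recorder.close()
    return recorder.events


print(record_tags("<b>Hello World</b>"))
print(record_tags("A line.<br /> Another line."))
```

The first call reports a begin tag and an end tag for b, while the second reports a single stand-alone br tag, which matches the distinction drawn in the text.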
The begin tag-end tag structure is a fundamental building block of XML, which allows you to use the tag structure to create your own elements. Whereas in HTML the tag structure is used to define how a set of characters is displayed, in XML the tags allow you to describe something about the characters in a more abstract fashion. For example, let's say that you have a set of characters Bob Reselman. You can use XML to describe that the characters Bob are something we call a first name and the characters Reselman are something that we call a last name. Listing 1 shows you how to make this description in XML.
Listing 1: Two simple XML elements.
<first_name>Bob</first_name>
<last_name>Reselman</last_name>
We could also describe first name and last name as shown in Listing 2.
Listing 2: Two more simple XML elements.
<firstName>Bob</firstName>
<lastName>Reselman</lastName>
Please be advised that the only standard in play here is the <tag_name></tag_name> specification. XML has no defined way to describe a first name or a last name. I simply made the tag names up. That the characters between <first_name></first_name> and <firstName></firstName> are a first name is because I say that they are.
Let's take the elegance of XML one step further. The XML in Listing 1 and Listing 2 indicates that there are two elements in play. One is an element that describes a first name, and the other is an element that describes a last name. However, there is nothing in the XML structure that indicates that these elements are part of an entire name structure. Grouping elements together is easy in XML: you nest elements within elements. Please take a look at Listing 3. An XML document must have only one base tag that includes all other tags. The tag writer_name is the base tag for this XML document.
Listing 3: You group elements together in XML by nesting them within a tag.
<writer_name>
  <first_name>Bob</first_name>
  <last_name>Reselman</last_name>
</writer_name>
In Listing 3, you can see that the elements <first_name>Bob</first_name> and <last_name>Reselman</last_name> are placed within the element <writer_name></writer_name>.
What XML is saying is that there is an element writer_name, and it is made up of two subsidiary elements: first_name and last_name.
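To see the nesting from a program's point of view, here is a minimal Python sketch that reads the XML of Listing 3 with the standard library's xml.etree.ElementTree module. The helper names describe and is_well_formed are invented for this example. The second helper illustrates the single-base-tag requirement: two top-level elements with no enclosing tag are rejected by the parser.

```python
import xml.etree.ElementTree as ET

# The XML from Listing 3.
LISTING_3 = """
<writer_name>
  <first_name>Bob</first_name>
  <last_name>Reselman</last_name>
</writer_name>
"""


def describe(xml_text):
    """Return the base tag name and a list of (tag, text) pairs for its children."""
    root = ET.fromstring(xml_text)
    return root.tag, [(child.tag, child.text) for child in root]


def is_well_formed(xml_text):
    """Return True if the text parses as a single XML document, else False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(describe(LISTING_3))
# Two elements with no base tag around them do not form an XML document.
print(is_well_formed("<first_name>Bob</first_name><last_name>Reselman</last_name>"))
```

The parser reports writer_name as the base tag with first_name and last_name beneath it, and it refuses the version that lacks an enclosing tag.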
What is very cool about XML is that it is a self-describing language. In other words, all you need to really understand to make sense of an XML structure are three rules: (1) tags take the form <></>; (2) you can nest tags; (3) an XML document must have only one base tag set that is not a subsidiary of any other tag, but includes all other tags. Please take a look at Listing 4.
Listing 4: A more complex XML structure.
<speeding_ticket>
  <ticket_number>87849934541</ticket_number>
  <officer>
    <first_name>Jim</first_name>
    <last_name>Smith</last_name>
    <badge_number>11295</badge_number>
  </officer>
  <offender>
    <first_name>Mike</first_name>
    <last_name>Jones</last_name>
    <drivers_lic_number>457-111-1478</drivers_lic_number>
    <state>MA</state>
  </offender>
  <vehicle>
    <make>Nissan</make>
    <model>Altima</model>
    <year>2001</year>
    <plate_number>WSP-987</plate_number>
    <state>WA</state>
  </vehicle>
  <speed>80</speed>
  <date>July 8, 2004</date>
  <time>16:30</time>
  <location>
    <street>Sunset Blvd.</street>
    <cross_street>La Cienega Blvd</cross_street>
    <city>Los Angeles</city>
    <state>CA</state>
    <zip>90027</zip>
  </location>
</speeding_ticket>
Using only the three XML rules described above, we can figure out the following about Listing 4.
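The XML structure describes a thing called a speeding ticket.
The speeding ticket element is made up of these elements:
- Ticket Number
- Officer
- Offender
- Vehicle
- Speed
- Date
- Time
- Location
An Officer element has a:
- First Name
- Last Name
- Badge Number
An Offender element has a:
- First Name
- Last Name
- Drivers License Number
- State
A Vehicle element has a:
- Make
- Model
- Year
- Plate Number
- State
A Location element has a:
- Street
- Cross Street
- City
- State
- Zip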
Thus, to turn the XML above into friendly language we can say:
On July 8, 2004 at 4:30 PM, Officer Jim Smith, badge number 11295, issued speeding ticket number 87849934541 to Mike Jones, driver's license number 457-111-1478, issued in Massachusetts. Mr. Jones was driving a 2001 Nissan Altima with Washington plates numbered WSP-987. The ticket was issued on the corner of Sunset Blvd. and La Cienega Blvd in Los Angeles, CA. Mr. Jones was driving at 80 MPH.
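The same translation from XML to friendly language can be sketched in Python, again using xml.etree.ElementTree. This is only an illustration under stated assumptions: the helper names to_12_hour and summarize are made up, the sentence covers just part of the ticket, and the time value 16:30 in Listing 4 is treated as 24-hour time, which corresponds to 4:30 PM.

```python
import xml.etree.ElementTree as ET

# The XML from Listing 4.
LISTING_4 = """
<speeding_ticket>
  <ticket_number>87849934541</ticket_number>
  <officer>
    <first_name>Jim</first_name>
    <last_name>Smith</last_name>
    <badge_number>11295</badge_number>
  </officer>
  <offender>
    <first_name>Mike</first_name>
    <last_name>Jones</last_name>
    <drivers_lic_number>457-111-1478</drivers_lic_number>
    <state>MA</state>
  </offender>
  <vehicle>
    <make>Nissan</make>
    <model>Altima</model>
    <year>2001</year>
    <plate_number>WSP-987</plate_number>
    <state>WA</state>
  </vehicle>
  <speed>80</speed>
  <date>July 8, 2004</date>
  <time>16:30</time>
  <location>
    <street>Sunset Blvd.</street>
    <cross_street>La Cienega Blvd</cross_street>
    <city>Los Angeles</city>
    <state>CA</state>
    <zip>90027</zip>
  </location>
</speeding_ticket>
"""


def to_12_hour(time_24):
    """Convert an HH:MM 24-hour string to 12-hour form (hypothetical helper)."""
    hour, minute = (int(part) for part in time_24.split(":"))
    suffix = "AM" if hour < 12 else "PM"
    return f"{hour % 12 or 12}:{minute:02d} {suffix}"


def summarize(xml_text):
    """Build a one-sentence summary from the nested ticket elements."""
    ticket = ET.fromstring(xml_text)
    get = ticket.findtext  # accepts paths such as 'officer/first_name'
    return (
        f"On {get('date')} at {to_12_hour(get('time'))}, "
        f"Officer {get('officer/first_name')} {get('officer/last_name')}, "
        f"badge number {get('officer/badge_number')}, issued speeding ticket "
        f"number {get('ticket_number')} to {get('offender/first_name')} "
        f"{get('offender/last_name')}, who was driving a {get('vehicle/year')} "
        f"{get('vehicle/make')} {get('vehicle/model')} at {get('speed')} MPH."
    )


print(summarize(LISTING_4))
```

Note that the code relies only on the tag names and their nesting. It needs no other knowledge about speeding tickets, which is the point of a self-describing format.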
As you can see, XML can describe a lot of information with very few rules. What's even nicer is that most browsers can read XML and display it in a hierarchical fashion, as shown in Figure 2.
Figure 2 Most modern web browsers show XML data in a hierarchical display.