
Cleaning Your Web Pages with HTML Tidy

Even with the push for web standards, many web pages are still plagued by sloppy coding. Technical writer Scott Nesbitt looks at fixing problems with HTML files using HTML Tidy.

Introduction

After many years and the efforts of countless evangelists, web standards are finally being taken seriously by people who build web pages or any other kind of HTML document. But badly formed HTML, the kind that doesn't conform to the standards laid down by the World Wide Web Consortium, is still a problem. You've probably seen what I'm talking about all over the web: missing closing tags, deprecated presentational tags like <font> and <center>, and other constructs that break in all but one or two browsers.

So how do you get around the problem of bad HTML? You could use one of the many applications or online services that validate HTML syntax. More often than not, though, these tools stop at diagnosis: most will check HTML but not correct it. If you have a lot of files, you must check each file and make corrections by hand. This takes a lot of time and effort.

Or you could turn to HTML Tidy.

HTML Tidy (hereafter just Tidy) is free software, weighing in at under 500KB. It doesn't just check HTML files; it fixes the problems it finds, and does a whole lot more. Tidy is an anachronism in the world of the graphical user interface: it's a command-line application, meaning that you type a command, usually followed by one or more options, to get Tidy to run. That sounds like an old-fashioned way of doing things; in fact, it's anything but. The command-line interface gives Tidy a great deal of flexibility.
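
To give a feel for what running a command-line tool like this looks like, here is a minimal sketch at a Linux shell. The file names and the sample markup are hypothetical, made up for illustration, and the sketch assumes the tidy executable is on your PATH; it uses only the -q (quiet) and -o (output file) options.

```shell
# Create a small, deliberately sloppy HTML file (hypothetical sample content).
cat > sample.html <<'EOF'
<html>
<head><title>Sample</title>
<body>
<center><font color="red">Hello
<p>An unclosed paragraph
</body>
</html>
EOF

# Run Tidy only if it is installed.
# -q suppresses nonessential output; -o names the file that receives the cleaned markup.
if command -v tidy >/dev/null 2>&1; then
    tidy -q -o sample-tidy.html sample.html
    # Tidy exits with 1 if it issued warnings and 2 if it found errors.
    echo "tidy exit status: $?"
else
    echo "tidy not found on PATH; install it to try this example"
fi
```

Writing to a separate output file, rather than overwriting the original, leaves sample.html untouched so you can compare the two versions side by side.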

This tutorial teaches you the basics of working with Tidy. I can't cover all of the aspects of Tidy in this article, but I can give you enough information to set you on the road to mastering the software. You'll learn how to run the program, use Tidy's options at the command line, and use Tidy with configuration files to make your work more efficient. I'll even point you to some web editing software in which Tidy is integrated.

NOTE

This article only looks at using Tidy at the Windows or Linux command line. However, the syntax for other operating systems is the same.

