Converting with Dreamweaver
Acting on the advice of colleagues, I tried the Dreamweaver function called Clean Up Word HTML, which is pretty useful if you already have Dreamweaver. If you don't, you can download the 30-day free trial, which contains this feature; but at $399 retail, you're unlikely to find Dreamweaver a bargain for this purpose alone.
I opened the original file in Dreamweaver and tried Commands, Clean Up Word HTML. The free Dreamweaver download didn't have Word 2003 as an option on the drop-down menu, so I chose Word 2000/2002. I was warned that there might be problems because my version of Word wasn't listed, but I never saw any. On the contrary, Dreamweaver cleaned up plenty of junk. It offered choices about specific tags to clean out, too, as shown in Figure 2.
Dreamweaver's cleanup reduced the original 71KB file to a tidy 42KB and showed me the result in a text editor, ready for any further tidying up. This turned out to be useful because, while links stayed linked and tables held their own, extra spaces appeared in the code sections and had to be repaired by hand. In Dreamweaver, though, the repair takes only a few clicks in the editor to pull out the extra sets of </pre><pre> tags. Being able to do all of this without leaving the editor is one of the nicest things about doing Word HTML cleanup here.
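If you would rather script that last repair than click through it, the idea can be sketched in a few lines of Python. This is only an illustration, not anything Dreamweaver does internally: the function name merge_adjacent_pre is made up for this example, and it assumes that each back-to-back </pre><pre> pair marks a line break inside what should be a single code listing.

```python
import re

# A closing </pre>, optional whitespace, then an opening <pre ...> tag.
# Assumption: such a seam splits one code listing into separate blocks.
_PRE_SEAM = re.compile(r"</pre\s*>\s*<pre\b[^>]*>", re.IGNORECASE)


def merge_adjacent_pre(html: str) -> str:
    """Join back-to-back <pre> blocks, leaving a newline at each seam."""
    return _PRE_SEAM.sub("\n", html)


if __name__ == "__main__":
    sample = "<pre>int x = 1;</pre>\n<pre>int y = 2;</pre>"
    print(merge_adjacent_pre(sample))
```

Blocks separated by anything other than whitespace, such as a paragraph of ordinary text, are left alone, so only the stray seams inside a listing are removed. Check the result by eye afterward, since a regular expression knows nothing about the rest of the page's structure.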
I also tried the cleanup on a copy of the file saved as filtered HTML. It came out slightly smaller, at 41KB, with the same formatting issues.
Dreamweaver offers no batch processing, however, and each file must be saved as a web page in Word before Dreamweaver can open it.