Saving Word Files as HTML
"When I use the 'Save As' feature in Microsoft Word to save a document as a web page, the resulting HTML is a bloated mess. Is all that formatting stuff really needed? If not, is there a way to get rid of it?"

I'd Like to Phone a Friend, Regis
I asked my friend Allen Wyatt, who is an internationally recognized author, software expert, and publisher of the WordTips newsletter, to handle this question. Here's what he says:
When Word 2000 creates a Web document, it saves quite a bit of Word-specific information in the HTML file. Your Web browser doesn't need this information; it is only useful if you plan to load the HTML document back into Word 2000 at a later date. One thing it records is font sizes. The Web, by default, doesn't support a large number of different font sizes and typographical conventions, and certainly not as many as Word does. Word 2000 stores that information in the HTML document anyway, tucked away so that it can decipher it when you later load the document back into Word.
Some people don't like the way font formatting is done by Word, and prefer to take advantage of the "relative" font sizing that is natural to the Web. Relative font sizing allows the browser--and the user through the browser--to specify the relative size of the text that appears on-screen. This can be a great feature to some people. Word, however, doesn't use relative font sizing; instead, it tries to make the font appear as close as possible to what the document author used.
If you are not going to load the document back into Word, you can get rid of all that extra baggage. You can either do this the tedious way, or the somewhat-less-tedious way. The tedious way, of course, involves opening the HTML file in a text editor and removing all but the bare HTML code that is necessary for displaying your information. This requires that you be fairly conversant in HTML coding.
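To give a rough idea of what that by-hand cleanup involves, here is a minimal Python sketch. It is not Allen's method and not what any Microsoft tool does internally; the function name `strip_word_markup` and the list of patterns are illustrative assumptions based on markup that Word-generated HTML commonly contains (conditional comments, `o:` namespace tags, `Mso` classes, inline styles). Regular expressions are not a real HTML parser, so treat this as a starting point and check the result by eye.

```python
import re

# Illustrative, non-exhaustive patterns for markup Word adds for its own use.
# Each entry is (compiled pattern, replacement text).
WORD_ONLY_PATTERNS = [
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    (re.compile(r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", re.DOTALL | re.IGNORECASE), ""),
    # Namespaced tags such as <o:p> and </o:p>
    (re.compile(r"</?[a-z]+:[^>]*>", re.IGNORECASE), ""),
    # class=MsoNormal, class="MsoBodyText", and similar
    (re.compile(r"\s+class=(\"Mso[^\"]*\"|'Mso[^']*'|Mso\w+)", re.IGNORECASE), ""),
    # Inline style attributes (assumption: you are happy to drop ALL inline styles)
    (re.compile(r"\s+style=(\"[^\"]*\"|'[^']*')", re.IGNORECASE), ""),
    # xmlns declarations on the <html> tag
    (re.compile(r"\s+xmlns(:\w+)?=\"[^\"]*\"", re.IGNORECASE), ""),
]


def strip_word_markup(html: str) -> str:
    """Return html with the Word-specific markup listed above removed."""
    for pattern, replacement in WORD_ONLY_PATTERNS:
        html = pattern.sub(replacement, html)
    return html
```

You would read the saved file into a string, pass it through `strip_word_markup`, and write the result to a new file, keeping the original in case the cleanup removes something you wanted.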
The somewhat-less-tedious way involves the use of a Microsoft add-in for Word 2000 (called the Office 2000 HTML Filter) that will remove all the Word-specific HTML code for you. The add-in is free; you can learn more about it (and download it) at the following address:
Even after running the Office 2000 HTML Filter, you may still want to open the file and examine the resulting HTML code to make sure it displays information exactly as you intend. While this may require some knowledge of HTML, it doesn't require all the tedious steps of doing the removal and recoding yourself.
Thanks, Allen!
Posted by Bob Rankin on October 5, 2005 12:17 PM
Most recent comments on "Saving Word Files as HTML"
When a document is saved in HTML format, a subdirectory is created to store the graphics. A cleaner solution is to save as MHT (MHTML). EDITOR'S NOTE: Great, another non-standard Microsoft standard. So how do you upload that to your web server? Looks like you can't. :-(
Nice advice about getting rid of Word HTML bloat, but I have Office 2003, and there is no advice for that product. EDITOR'S NOTE: I'd guess that the Word 2000 tool would still work. Give it a try...
Check out this link: http://office.microsoft.com/en-us/assistance/HP030852791033.aspx It seems that Microsoft has incorporated it directly into Office 2003, so there is no need to install any add-ins.
Word versions later than 2000 can optionally save only the HTML and omit the bloat. No need for an add-in.
Using the 2000 HTML filter still produces nonstandard code. You should also run the output through Tidy, which needs to be configured to catch Word's garbage code. Tidy can be found at: http://tidy.sourceforge.net
Great article. Is there a similar tool for Word 2002 and Word 2003? Or does the Word 2000 tool work on later versions?
Could someone please tell me what to do so that my form is sent as soon as I click the submit button?
Copyright © 2005 - Bob Rankin - All Rights Reserved
Article information: AskBobRankin -- Saving Word Files as HTML (Posted: October 5, 2005 12:17 PM)
Printed from: http://askbobrankin.com/saving_word_files_as_html.html