Features overview

Output

  • The best output of all existing Word to HTML converters
  • Unicode (UTF-8) output with full multilingual support
  • HTML entities where possible: &copy;, &trade;, &ndash;, etc.
  • Good for further processing:
    • Valid XHTML 1.0 Transitional (with a few exceptions)
    • Valid XML (always)
  • Good for manual editing:
    • Hierarchical indentation
    • Strips unnecessary attributes such as lang=en
    • Removes empty or unneeded <span> tags and other similar annoyances
  • Configurable defaults:
    • Strips colours by default
    • Joins adjacent bold, italic, underline and other similar tags
    • Converts <b> to <strong> and <i> to <em>
  • Equations and charts are converted to images, saved into a separate folder and linked from the HTML
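
The tag clean-up listed above (joining adjacent tags, converting <b> to <strong> and <i> to <em>) can be illustrated with a short Python sketch. This is not the converter's own code: the function name tidy_inline_tags is hypothetical, and the sketch assumes tags without attributes.

```python
import re

def tidy_inline_tags(html: str) -> str:
    """Illustrative clean-up only; assumes tags carry no attributes."""
    # Rename presentational tags to their semantic equivalents.
    # The pattern requires ">" right after the name, so <br> is not touched.
    html = re.sub(r"<(/?)b>", r"<\1strong>", html)
    html = re.sub(r"<(/?)i>", r"<\1em>", html)
    # Join adjacent identical tags such as </strong><strong>,
    # keeping any whitespace that sat between them.
    html = re.sub(r"</(strong|em|u)>(\s*)<\1>", r"\2", html)
    return html

if __name__ == "__main__":
    print(tidy_inline_tags("<b>Hello</b> <b>world</b> and <i>more</i>"))
    # prints: <strong>Hello world</strong> and <em>more</em>
```

A regular-expression pass is enough to show the idea; a real converter would more likely work on a parsed document tree so that nested tags and attributes are handled correctly.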

Input

  • Converts many documents at once
  • Converts RTF to HTML and XHTML
  • Converts .doc files created with any version of Word into HTML and XHTML
    • Word 97
    • Word 2000
    • Word 2003
    • and all other versions

Interface

  • Accessible from context menu for selected documents
  • GUI application
  • Supports command line execution
    • Can be used in scripts
    • Accepts multiple files at once, e.g.: word2html report1.doc report2.doc
    • Accepts file masks, e.g.: word2html *report*.doc
  • Supports DDE (adds files to conversion queue on the fly)
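
As a sketch of scripted use, the Python snippet below builds a word2html command line from file names and masks and runs it. The helper names build_command and convert are hypothetical, and the snippet assumes that word2html is on the PATH. The tool's exit codes are not described here, so the return code is passed back uninterpreted.

```python
import glob
import subprocess

def build_command(patterns, exe="word2html"):
    """Expand file masks in the script and build the argument list."""
    files = []
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        # Keep the pattern unchanged when nothing matches,
        # so the tool itself can report the problem.
        files.extend(matches or [pattern])
    return [exe, *files]

def convert(patterns):
    # Assumption: word2html is on PATH. The exit-code convention is
    # unknown here, so the raw return code is handed to the caller.
    return subprocess.run(build_command(patterns)).returncode

if __name__ == "__main__":
    # Shows the command that would be run, without running it.
    print(build_command(["report1.doc", "report2.doc"]))
```

Expanding the masks in the script is optional, since word2html accepts masks directly; doing it up front simply makes the list of files explicit for logging.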