Convert HTML to Word in Java


HTML is a markup language used to create web pages, and it is normally viewed in a web browser. If you want a version that is easier to read, edit, or print outside the browser, converting the HTML file to Word is a good option. This article shows how to convert a simple HTML file to a Word document using Free Spire.Doc for Java.

Install the Library (Two Methods)

Method 1: Download the free Java library and unzip it. Then add the Spire.Doc.jar file to your Java application as a dependency.
Method 2: If you use Maven, add the following repository and dependency to your project's pom.xml.

<repositories>
   <repository>
      <id>com.e-iceblue</id>
      <name>e-iceblue</name>
      <url>https://repo.e-iceblue.com/nexus/content/groups/public/</url>
   </repository>
</repositories>
<dependencies>
   <dependency>
      <groupId>e-iceblue</groupId>
      <artifactId>spire.doc.free</artifactId>
      <version>5.2.0</version>
   </dependency>
</dependencies>

Sample Code

Converting an HTML file to a Word document with Free Spire.Doc for Java takes two calls. Load the HTML file with the Document.loadFromFile(String fileName, FileFormat fileFormat, XHTMLValidationType validationType) method, then save it as a .doc/.docx document with the Document.saveToFile(String fileName, FileFormat fileFormat) method. The complete sample code is shown below.

import com.spire.doc.*;
import com.spire.doc.documents.XHTMLValidationType;

public class HtmlFileToWord {
    public static void main(String[] args) {
        // Load the HTML file (example path; change it to your own file)
        Document document = new Document();
        document.loadFromFile("E:\\Files\\input.html", FileFormat.Html, XHTMLValidationType.None);

        // Save as a .docx file; the relative path resolves against the working directory
        document.saveToFile("htmlFileToWord.docx", FileFormat.Docx);
    }
}
    }
}
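
If you don't have a test page to hand, the sketch below writes a small input.html that you can point the sample code at. It uses only the Java standard library, not Spire.Doc. The class name SampleHtmlInput, the write helper, and the markup itself are hypothetical and exist only for illustration; any simple, well-formed HTML file should serve the same purpose.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class SampleHtmlInput {
    // Hypothetical sample markup: one heading and one paragraph
    static final String SAMPLE_HTML = String.join("\n",
            "<!DOCTYPE html>",
            "<html>",
            "<head><meta charset=\"UTF-8\"><title>Sample</title></head>",
            "<body>",
            "<h1>Hello, Word</h1>",
            "<p>This paragraph should appear in the converted document.</p>",
            "</body>",
            "</html>");

    // Writes input.html into the given directory (creating it if needed) and returns the file path
    static Path write(Path dir) {
        try {
            Files.createDirectories(dir);
            Path file = dir.resolve("input.html");
            Files.write(file, SAMPLE_HTML.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Write the sample into the current working directory
        Path file = write(Paths.get("").toAbsolutePath());
        System.out.println("Wrote " + file);
    }
}
```

Pass the printed path as the first argument of loadFromFile in the sample code in place of E:\\Files\\input.html.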