How to convert PDF to HTML in C# code?

A user asked: “We are trying to use the VeryPDF command line utility to convert PDF to HTML in C# code. It works well for our requirements. But there is a question about the command line utility's usage: once the conversion is successfully done, I want to continue to my other code in that method, but VeryPDF doesn't complete and hangs on my machine even after the conversion, and my code execution does not proceed at all.”

This article introduces software that converts PDF to HTML and can be called from C# code, and tries to help with this user's problem.

The software is PDF to HTML v2.0, which converts PDF files to HTML files while seeking to preserve the original page layout (as best as technically possible). PDF2HTML enables layouts originally designed for paper to be used on the Internet. The software combines a GUI version and a command line version. Below I show how to convert a PDF to HTML from the command line, which is the same call you would make from C# code.

  • Visit the PDF to HTML v2.0 homepage.

To make full use of this software, visit its homepage to learn more about it. At the least, you should be familiar with the software's parameters. The homepage is: http://www.verypdf.com/pdf2htm/index.html

  • Download PDF to HTML v2.0

Since you want to call it from C# code, you should download the Server License version, which can easily be called from server-side applications written in ASP, PHP, C#, .NET, and so on.

  • Open an MS-DOS (command prompt) window in the usual way.

Run the "cmd" command from "Start Menu" -> "Run", then run the PDF to HTM command from the command line window. (The PDF to HTM program can be found in the directory where you uncompressed the PDF to HTM command line package.)

  • Add the file and enter the command as the usage shows.

The usage is: PDF2HTML [Option] <PDF File> [<HTM File>]. You can check the parameters here: http://www.verypdf.com/pdf2htm/help/help.html

I will take a random PDF file as an example to show how to convert PDF to HTML.

[Screenshot: Snap9]

The command I entered is shown below. As I only converted three pages, three pages are processed, as shown in the picture. A few seconds later, the htm file shows up in the folder C:\Documents and Settings\admin\Desktop\New Folder

C:\Documents and Settings\admin>"C:\Documents and Settings\admin\My Documents\Downloads\pdf2htlm commandline\pdf2html_cmd\pdf2html.exe" -f 3 -l 5 "E:\2011-9-28\pdfeditor-1.pdf" "C:\Documents and Settings\admin\Desktop\New Folder\p.htm"
Processing page 1 of 3...
Processing page 2 of 3...
Processing page 3 of 3...
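
When the same command is assembled in C#, the paths need the same double quotes as in the command above, because they contain spaces. The following is only a minimal sketch: Pdf2HtmlArgs and its Build method are hypothetical helper names, not part of the VeryPDF package. It follows the usage line PDF2HTML [Option] <PDF File> [<HTM File>] and uses only the -f and -l options that appear in the example command.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of VeryPDF): builds the argument string
// for pdf2html.exe following "PDF2HTML [Option] <PDF File> [<HTM File>]".
static class Pdf2HtmlArgs
{
    // Wrap a path in double quotes so that spaces (e.g. "New Folder") survive.
    static string Quote(string path)
    {
        return "\"" + path + "\"";
    }

    // firstPage/lastPage map to the -f and -l options used in the example
    // command; htmFile is optional, as in the usage line.
    public static string Build(string pdfFile, string htmFile = null,
                               int? firstPage = null, int? lastPage = null)
    {
        var parts = new List<string>();
        if (firstPage.HasValue) parts.Add("-f " + firstPage.Value);
        if (lastPage.HasValue) parts.Add("-l " + lastPage.Value);
        parts.Add(Quote(pdfFile));
        if (!string.IsNullOrEmpty(htmFile)) parts.Add(Quote(htmFile));
        return string.Join(" ", parts);
    }
}

static class Program
{
    static void Main()
    {
        // Paths copied from the example command in this article.
        string args = Pdf2HtmlArgs.Build(
            @"E:\2011-9-28\pdfeditor-1.pdf",
            @"C:\Documents and Settings\admin\Desktop\New Folder\p.htm",
            3, 5);
        Console.WriteLine(args);
    }
}
```

Keeping the quoting in one place avoids the most common mistake when moving from the command window to code, which is passing a path with spaces unquoted so that the tool sees it as several arguments.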

  •  Check the htm file we have made.

[Screenshot: Snap10]

You can run the same command from C# code to convert PDF to HTML.
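
As for the hang described in the question, its actual cause cannot be determined from the question alone. One common reason a C# caller stalls on a console tool is that it redirects the tool's output without reading it, or waits for the process with no timeout. The sketch below shows one way to guard against both, using the standard System.Diagnostics.Process class. Pdf2HtmlRunner and Run are hypothetical names, the 60 second timeout is an arbitrary choice, and the paths are the ones from the example above and should be adjusted to your installation.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical wrapper (not part of VeryPDF): runs a console tool, collects
// its output, and gives up after a timeout instead of waiting forever.
static class Pdf2HtmlRunner
{
    public static int Run(string exePath, string arguments, int timeoutMs, out string output)
    {
        var log = new StringBuilder();
        var psi = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = arguments,
            UseShellExecute = false,      // required for redirection
            CreateNoWindow = true,
            RedirectStandardOutput = true,
            RedirectStandardError = true
        };

        using (var process = new Process { StartInfo = psi })
        {
            // Read both streams asynchronously so the tool never blocks
            // on a full output pipe while we are waiting for it to exit.
            process.OutputDataReceived += (s, e) => { if (e.Data != null) lock (log) log.AppendLine(e.Data); };
            process.ErrorDataReceived += (s, e) => { if (e.Data != null) lock (log) log.AppendLine(e.Data); };

            process.Start();
            process.BeginOutputReadLine();
            process.BeginErrorReadLine();

            if (!process.WaitForExit(timeoutMs))
            {
                // The tool did not finish in time: stop it and report the failure.
                try { process.Kill(); } catch (InvalidOperationException) { /* already exited */ }
                throw new TimeoutException("Tool did not exit within " + timeoutMs + " ms.");
            }

            // The parameterless overload also waits for the async readers to finish.
            process.WaitForExit();
            lock (log) output = log.ToString();
            return process.ExitCode;
        }
    }
}

static class Program
{
    static void Main()
    {
        // Paths and options copied from the example command in this article.
        string exe = @"C:\Documents and Settings\admin\My Documents\Downloads\pdf2htlm commandline\pdf2html_cmd\pdf2html.exe";
        string args = "-f 3 -l 5 \"E:\\2011-9-28\\pdfeditor-1.pdf\" " +
                      "\"C:\\Documents and Settings\\admin\\Desktop\\New Folder\\p.htm\"";

        int exitCode = Pdf2HtmlRunner.Run(exe, args, 60000, out string output);
        Console.WriteLine(output);
        Console.WriteLine("Exit code: " + exitCode);
        // Execution continues here once the tool has exited.
    }
}
```

If the tool still does not exit within the timeout, the TimeoutException at least lets the calling method carry on or report the problem, and the captured output may show what the tool was waiting for.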
