ocr products, pdf to text converter

Does VeryPDF PDF to Text Converter extract text in Unicode?

We are looking for a command line application for converting PDF to text. Does VeryPDF PDF to Text Converter extract text in Unicode? We deal with scientific documents that contain characters such as α, β, γ, etc.

The application we currently use does this very efficiently, but we are planning to replace it because it can’t maintain the format.

It would be kind of you to answer my queries. Also, please let me know if a demo is available for us to test.
===========================
Yes, our PDF2TXT software supports both command line operation and Unicode output.

Please run the following command line to convert your PDF file to a text file (the -breaker parameter inserts a page break marker into the converted .txt file),

"C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe" C:\in.pdf C:\out.txt -unicode -breaker

You can also run the following command line to convert a PDF file to a text file without page break markers,

"C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe" C:\in.pdf C:\out.txt -unicode

We hope the "-unicode" parameter works better for you; please give it a try.
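
To check whether characters such as α, β, γ survived the conversion, something along these lines may help. It is only a sketch: this thread does not say which encoding -unicode writes, so the code guesses from the byte order mark and otherwise falls back to UTF-8, and the function names are illustrative rather than part of any VeryPDF tool.

```python
import codecs

# BOM -> codec pairs. These are guesses, since the thread does not state
# which encoding pdf2txt writes when -unicode is given.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_text(raw: bytes) -> str:
    """Decode converter output, using the BOM if present, else UTF-8."""
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            # Both codecs consume the BOM, so it does not appear in the text.
            return raw.decode(encoding)
    return raw.decode("utf-8")


def missing_chars(text: str, expected: str) -> list:
    """Return the expected characters that do not occur in the text."""
    return [ch for ch in expected if ch not in text]
```

Reading C:\out.txt as bytes and calling missing_chars(decode_text(raw), "αβγ") should return an empty list when all three letters are present in the output.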

VeryPDF
===========================
Thanks for your reply.

Another query: can it convert a directory containing PDF files, with something like,

C:\MyPDFFiles\*.pdf D:\MyConvertedTXT\ -unicode

Or do we have to make a .bat file with a command for each individual file?

As per your instructions, I shall try it out and see the output.

One more query: if we go for the OCR command line version, will it be able to extract the text from PDFs that have embedded fonts?
===========================
You can run the following command line to batch convert all PDF files in a folder to text files (each text file is written to the current folder under the same base name as its PDF),

for %F in (C:\MyPDFFiles\*.pdf) do "C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe" "%F" "%~nF.txt" -unicode -breaker

If you wish to put the above command line into a .bat file, you need to use %% instead of the % character,

for %%F in (C:\MyPDFFiles\*.pdf) do "C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe" "%%F" "%%~nF.txt" -unicode -breaker
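
The same loop can be sketched in a script when the output should go to a separate folder, as in the question. This is a rough illustration, not a VeryPDF tool: build_commands is a made-up helper, the install path is the one quoted in the reply, and only the -unicode and -breaker switches mentioned in this thread are used.

```python
from pathlib import PureWindowsPath

# Install path as quoted in the reply above.
PDF2TXT = r"C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe"


def build_commands(pdf_paths, out_dir, page_breaks=True):
    """Return one pdf2txt argument list per input PDF.

    The output keeps the input's base name (what %~nF expands to in the
    cmd loop), gets a .txt extension, and is placed in out_dir.
    """
    commands = []
    for pdf in pdf_paths:
        name = PureWindowsPath(pdf).stem + ".txt"
        out_file = str(PureWindowsPath(out_dir) / name)
        args = [PDF2TXT, str(pdf), out_file, "-unicode"]
        if page_breaks:
            args.append("-breaker")
        commands.append(args)
    return commands
```

Each returned list can be passed to subprocess.run(args, check=True) on the machine where pdf2txt.exe is installed; the destination folder, for example D:\MyConvertedTXT, has to exist beforehand.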

Yes, the PDF to Text OCR version is able to extract text from PDFs that have embedded fonts; that's no problem.

VeryPDF
===========================
I tried a sample PDF with the demo PDF to TXT OCR, and the output was jumbled. Can you please have a look and see why it fails? Please also check the layout.
======================
You can run the following command line to convert your PDF file to a text file properly,

pdf2txtocr.exe -ocr -bitcount 1 "D:\temp\EKA_US_EN_48.pdf" "D:\temp\EKA_US_EN_48.pdf.txt"

For example,

D:\temp>"E:\pdf2txtocrcmd\pdf2txtocr.exe" -ocr -bitcount 1 "D:\temp\EKA_US_EN_48.pdf" "D:\temp\EKA_US_EN_48.pdf.txt"
You have 297 times to evaluate this product, you may purchase a full version from 'https://www.verypdf.com'.
==========================
The test version can only convert PDF files in the first few pages, if you need
to convert more of the page, please purchase the full version from
https://www.verypdf.com site.
==========================
[OCR] Processing page 1 of 3...
[OCR] Processing page 2 of 3...
[OCR] Processing page 3 of 3...

VeryPDF

pdf editor

Can I print a PDF file from PDF Viewer OCX Control?

I am currently running Windows XP and evaluating PDF viewers that we can use inside some of the applications we distribute. I have been working with the PDF Viewer control and I really like it, but I do not see a way to print from the control. Is there functionality to do this? Also, do you have a reference manual for the control?
=====================
Our PDF Viewer OCX Control can only view PDF files; it doesn't support PDF printing. If you wish to print PDF files, you can download the PDF Print Command Line product from our website to try,

http://www.verydoc.com/pdf-viewer-ocx.html

You can call the PDF Print Command Line application from your code to print PDF files easily.

However, we have another product, VeryPDF PDF Editor&Viewer OCX Control v2.6, which supports PDF viewing, printing, editing, commenting, etc. You can download the trial version of VeryPDF PDF Editor&Viewer OCX Control v2.6 from the following web page to try,

https://www.verypdf.com/pdf-editor/index.html#dl

VeryPDF

docconverter com

Can’t convert MHTML files to PDF files, cannot open any mhtml file in IE

Without html2pdf running I am unable to replicate the issue. Is there anything in the Process logs from yesterday that would point to where the problem lies?
===================
I have just conducted some testing which indicates that while VeryPDF’s HTML2pdf application is running, Internet Explorer cannot open MHT files. This was done in the following way:

1. Create a batch file that calls html2pdf to create 10 different PDFs from 10 different MHT files.
2. Run the above batch file.
3. While the above batch file is running, attempt to open another MHT file in Internet Explorer. If html2pdf is running, Internet Explorer will display “This page cannot be displayed”.

This would explain why the “page cannot be displayed” error was seemingly random, as it is potentially only a problem during the few seconds that html2pdf is running.

VeryPDF Support: can you please try to replicate this issue using the above steps on a system running “Windows Server 2003 R2 x86”?

===================
Can you please email us the sample MHTML files in question? Once we have reproduced your problem on our system, we will work out a solution for you shortly.

VeryPDF
===================
I have included all the MHT files used to carry out the testing. I have also included the batch file used to create the PDF files.

I look forward to hearing a response.

===================
Please follow these steps and try again,

1. Launch a CMD window with administrator privileges and change the current folder to the "doc2pdf_com_full_v2.5" folder,
2. Run "install_as_exe.bat" to install html2pdf.exe, pdfout.dll and DocConverter COM on your system,
3. Make sure "doc2pdf_service.exe" is running; if it is not, run the following command line to start it,

doc2pdf_service.exe -exe

4. Modify the "CreatePDFs.bat" file to contain the following command lines,

html2pdf.exe "%CD%\0a15c570-fc10-439b-abfc-6318fc711ee1.mht" "%CD%\1.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
html2pdf.exe "%CD%\2fd9d2ab-d630-4091-ae77-910921b7bb0a.mht" "%CD%\2.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
html2pdf.exe "%CD%\3b96a5e5-e13e-4a1c-bc8d-4643fdfc987b.mht" "%CD%\3.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
html2pdf.exe "%CD%\6ce07278-f670-4a61-94d8-9145a733e0c3.mht" "%CD%\4.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
html2pdf.exe "%CD%\7bd9bc69-482a-41fb-8c71-80c7ce813fd5.mht" "%CD%\5.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
html2pdf.exe "%CD%\139ae7f5-f165-492b-ae80-8cb49ff4fa2b.mht" "%CD%\6.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
html2pdf.exe "%CD%\7792050e-2565-4486-936c-076c7b19b245.mht" "%CD%\7.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
html2pdf.exe "%CD%\56889532-72af-4f56-a87a-db29db3ca84d.mht" "%CD%\8.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
html2pdf.exe "%CD%\90801926-11d3-44b0-adf3-4444fc2f182c.mht" "%CD%\9.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
html2pdf.exe "%CD%\a7bad16c-dd87-4825-a4f3-a07a830e7894.mht" "%CD%\10.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
html2pdf.exe "%CD%\ea28d711-50be-4f26-a4ae-bab366fb49c8.mht" "%CD%\11.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
html2pdf.exe "%CD%\f5e5ffa5-34e5-4a9c-943d-dc77f6892bc4.mht" "%CD%\12.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
html2pdf.exe "%CD%\f14054a3-e553-4753-8223-95090d1989e1.mht" "%CD%\13.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
pause

5. Run the "CreatePDFs.bat" file to do the batch conversion; your MHTML files should then convert to PDF files properly.
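
If the list of MHT files changes often, the contents of CreatePDFs.bat could be generated rather than typed by hand. The following is a minimal sketch under that assumption; batch_lines is a hypothetical helper, and the option string is copied unchanged from step 4.

```python
# Option string copied from the command lines in step 4.
OPTIONS = "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"


def batch_lines(mht_names):
    """Return the lines of a batch file that converts each .mht in turn.

    Outputs are numbered 1.pdf, 2.pdf, ... in input order, and every path
    is anchored at %CD%, like the hand-written file in step 4.
    """
    lines = []
    for number, name in enumerate(mht_names, start=1):
        lines.append(
            f'html2pdf.exe "%CD%\\{name}" "%CD%\\{number}.pdf" "{OPTIONS}"'
        )
    lines.append("pause")
    return lines
```

Joining the result with "\r\n" and saving it as CreatePDFs.bat in the folder that holds the MHT files gives a file of the same shape as the one in step 4.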

You can also run the following single command line to batch convert all of your MHTML or HTML files to PDF files at one time,

for %F in (D:\test\*.htm) do html2pdf.exe "%F" "%~dpnF.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"

You can also refer to the following steps to batch convert lots of HTML or DOC files to PDF files at one time,
~~~~~~~~~~~~~~~~~~~~~~~
Please use the following solution to batch convert lots of HTML, DOC or other documents to PDF files at one time,

1. Run doc2pdf_service.exe as a normal Windows application with the following command line,

doc2pdf_service.exe -exe

2. Run the following command line to convert all HTML files in the D:\test folder and its sub folders to PDF files,

for /r D:\test %F in (*.htm) do html2pdf.exe "%F" "%~dpnF.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"

You can replace the "D:\test" folder with the correct folder on your system.

All of your HTML files will be converted to PDF files at one time by the above command line.

If you don't need to convert HTML files in sub folders, you can run the following command line to convert the HTML files in the D:\test folder only,

for %F in (D:\test\*.htm) do html2pdf.exe "%F" "%~dpnF.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"

You can also call the following command line from a Windows Service or your web applications to convert a DOC file to a PDF file properly,

html2pdf.exe "C:\file.Doc" "C:\file.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"
~~~~~~~~~~~~~~~~~~~~~~~
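
The difference between the "for /r" loop and the plain "for" loop can also be sketched in a script. This is an illustration only: html2pdf_commands is a made-up name, html2pdf.exe is assumed to be on the PATH, and the option string is the one used throughout this thread.

```python
from pathlib import Path

# Option string copied from the command lines above.
OPTIONS = "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"


def html2pdf_commands(root, pattern="*.htm", recursive=True):
    """Return html2pdf argument lists for the matching files under root.

    recursive=True mirrors "for /r" (sub folders included) and
    recursive=False mirrors the plain "for" loop. Each PDF is placed next
    to its source file, which is what "%~dpnF.pdf" does in cmd.
    """
    root = Path(root)
    matches = root.rglob(pattern) if recursive else root.glob(pattern)
    return [
        ["html2pdf.exe", str(src), str(src.with_suffix(".pdf")), OPTIONS]
        for src in sorted(matches)
    ]
```

Each argument list can be handed to subprocess.run once doc2pdf_service.exe is running as described above.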

VeryPDF
===================
Thanks for the information. However, we had already run install_as_exe.bat as outlined in steps 1, 2 and 3 prior to the testing.

As you are specifying to run as a service within the batch file, shouldn’t steps 1 to 3 install the exe as a service instead?
==================
Please don't install doc2pdf_service.exe as a service; launch it as a normal Windows application. The owner of doc2pdf_service.exe should be the currently logged-in user account, such as the administrator account or your own account. Once doc2pdf_service.exe is running as a normal Windows application, you can run the following command line to batch convert your MHTML files to PDF files,

for %F in (D:\test\*.mht) do html2pdf.exe "%F" "%~dpnF.pdf" "Overwrite=yes,KillProcess=no,doc2pdf_service=yes"

VeryPDF

pdf to text converter

Index or search keywords in PDF files

Hi, I have a licence for VeryPDF PCL Converter. I need a product to create an index for PDF files, for example indexing a PDF invoice by its invoice number. Do you have any product to do this?
====================
We have the PDF2TXT and PDF2TXT OCR software; you can download them from the following web page to try,

https://www.verypdf.com/pdf2txt/pdf2txt.htm#dl

You can use the PDF2TXT or PDF2TXT OCR software to convert your PDF files to text files first; then you can search for "invoice number" in these text files easily.
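
Once the text files exist, a small index from invoice number to file name could be built roughly as follows. This is a sketch with stated assumptions: the pattern is hypothetical and has to be adapted to the wording on your own invoices, and build_index is an illustrative name, not part of PDF2TXT.

```python
import re
from collections import defaultdict

# Hypothetical pattern: matches e.g. "Invoice Number: INV-2041" or
# "Invoice No. 2041". Adjust it to the wording of your own invoices.
INVOICE_RE = re.compile(
    r"invoice\s+(?:number|no\.?)(?![a-z])\s*[:#]?\s*([A-Z0-9][A-Z0-9-]*)",
    re.IGNORECASE,
)


def build_index(texts):
    """Map each invoice number to the sorted names of files mentioning it.

    texts is a dict of {file name: text extracted from that PDF}.
    """
    index = defaultdict(set)
    for name, text in texts.items():
        for match in INVOICE_RE.finditer(text):
            index[match.group(1).upper()].add(name)
    return {number: sorted(names) for number, names in index.items()}
```

The texts dict can be filled by reading every converted .txt file in the output folder; looking up a number in the returned dict then gives the invoices that mention it.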

You can also use our PDF Parser SDK to parse the text content of an existing PDF file,

http://www.verydoc.com/pdfparsersdk.html

You can get the following text information from your PDF pages,

<div style="position:absolute;left:666;top:78;width:84;height:14"><span style="font-style:normal;font-weight:700;font-size:13px;font-family:Arial;color:#000000;">MORGAN</span></div>

<div style="position:absolute;left:761;top:78;width:90;height:14"><span style="font-style:normal;font-weight:700;font-size:13px;font-family:Arial;color:#000000;">STANLEY</span></div>

You can index or search keywords in these XML files easily.
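
As a rough sketch of such a search, the positioned output shown above can be read with Python's standard html.parser module. The class and function names are illustrative, and the code assumes every text run sits in a div whose style carries unit-less left and top values, as in the two sample lines.

```python
from html.parser import HTMLParser


class PositionedText(HTMLParser):
    """Collect (left, top, text) triples from div/span output like the above."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._pos = None

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        # Turn "position:absolute;left:666;top:78;..." into a dict.
        style = dict(
            part.split(":", 1)
            for part in (dict(attrs).get("style") or "").split(";")
            if ":" in part
        )
        if "left" in style and "top" in style:
            self._pos = (int(style["left"]), int(style["top"]))

    def handle_data(self, data):
        if self._pos is not None and data.strip():
            self.items.append((self._pos[0], self._pos[1], data.strip()))

    def handle_endtag(self, tag):
        if tag == "div":
            self._pos = None


def find_keyword(markup, keyword):
    """Return the (left, top, text) entries whose text contains keyword."""
    parser = PositionedText()
    parser.feed(markup)
    parser.close()
    return [item for item in parser.items if keyword.lower() in item[2].lower()]
```

Because each hit comes back with its left and top values, a match can be tied to a position on the page as well as to a file.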

VeryPDF

docprint pro

docPrint PDF Driver Client Machine Registration

3. How will my client's VeryPDF printer know it is not a trial, or does this happen automatically when my program with the SDK registers the printer on their workstation?
=========================

After you purchase it, you will get a license key. Please set the license key as the "RegisterNO" value in the registry, and docPrint Pro will be registered automatically,

HKEY_CURRENT_USER\Software\verypdf\pdfcamp
RegisterNO="Your License Key"

VeryPDF
=========================

loWSH = createobject("wscript.shell")
loWSH.RegWrite("HKEY_CURRENT_USER\Software\verypdf\pdfcamp\RegisterNO", MyLicense)

I put it in the registry on a client machine, but it still shows the purchase license prompt. Please help.
=========================
You can set the following license keys in the registry and try again,

For 32-bit Windows systems,

HKEY_LOCAL_MACHINE\SOFTWARE\verypdf\pdfcamp
RegisterNO="Your License Key"

HKEY_CURRENT_USER\Software\verypdf\pdfcamp
RegisterNO="Your License Key"

For 64-bit Windows systems,

HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432node\verypdf\pdfcamp
RegisterNO="Your License Key"

HKEY_CURRENT_USER\SOFTWARE\WOW6432node\verypdf\pdfcamp
RegisterNO="Your License Key"

You can right-click under the above keys, create a "RegisterNO" value of "string" type, and set its value to "Your License Key".

After you set your license key in these fields, please print a document to docPrint PDF Driver; you should then no longer get the registration message box or the demo watermark in the output PDF files.
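
For deployment to several client machines, the four entries above could be written into a .reg file and imported there. The sketch below only builds the file text; reg_file is a hypothetical helper, the key paths are copied from this reply, and the layout follows the standard "Windows Registry Editor Version 5.00" format.

```python
# Key paths copied from the reply above.
KEY_PATHS = [
    r"HKEY_LOCAL_MACHINE\SOFTWARE\verypdf\pdfcamp",
    r"HKEY_CURRENT_USER\Software\verypdf\pdfcamp",
    r"HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432node\verypdf\pdfcamp",
    r"HKEY_CURRENT_USER\SOFTWARE\WOW6432node\verypdf\pdfcamp",
]


def reg_file(license_key, key_paths=KEY_PATHS):
    """Return .reg file text that sets RegisterNO under each key path."""
    # Backslashes and double quotes must be escaped inside .reg string values.
    value = license_key.replace("\\", "\\\\").replace('"', '\\"')
    lines = ["Windows Registry Editor Version 5.00", ""]
    for path in key_paths:
        lines += [f"[{path}]", f'"RegisterNO"="{value}"', ""]
    return "\r\n".join(lines)
```

The text can be saved to a file such as license.reg (the name is arbitrary) and imported on the client; writing under HKEY_LOCAL_MACHINE normally needs administrator rights.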

VeryPDF
=========================
Is there a way to do this for any computer? How can I know if a workstation is 32 or 64 bit? Can't I just use one entry and do it the same way on 32 and 64 bit?
=========================
Yes, you can always use the following item in the registry,

HKEY_LOCAL_MACHINE\SOFTWARE\verypdf\pdfcamp

On a 64-bit system with the UAC option enabled, if you access the above registry item from a 32-bit EXE application, it will be redirected to the following item automatically, so you needn't change anything in your source code,

HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432node\verypdf\pdfcamp
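
The redirection rule, and one common way to tell 32-bit from 64-bit Windows, can be sketched as below. Both functions are illustrative: effective_key assumes the usual WOW64 behaviour, in which only 32-bit processes on 64-bit Windows are redirected, and windows_is_64bit relies on the standard PROCESSOR_ARCHITECTURE and PROCESSOR_ARCHITEW6432 environment variables.

```python
def windows_is_64bit(env):
    """Guess OS bitness from a Windows environment mapping such as os.environ."""
    # PROCESSOR_ARCHITEW6432 is only set for 32-bit processes on 64-bit Windows.
    if env.get("PROCESSOR_ARCHITEW6432"):
        return True
    arch = env.get("PROCESSOR_ARCHITECTURE", "").upper()
    return arch in {"AMD64", "ARM64", "IA64"}


def effective_key(path, os_is_64bit, process_is_32bit):
    """Return the key a process actually reaches for an HKLM\\SOFTWARE path.

    Assumption: a 32-bit process on 64-bit Windows is sent to the
    WOW6432Node branch; every other combination uses the path as written.
    """
    prefix = "HKEY_LOCAL_MACHINE\\SOFTWARE\\"
    upper = path.upper()
    under_software = upper.startswith(prefix.upper())
    already = upper.startswith(prefix.upper() + "WOW6432NODE\\")
    if os_is_64bit and process_is_32bit and under_software and not already:
        return prefix + "WOW6432Node\\" + path[len(prefix):]
    return path
```

This is why a single entry written by a 32-bit installer ends up under WOW6432node on a 64-bit workstation without any change to the calling code.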

VeryPDF
