Does VeryPDF PDF to Text Converter extract text in Unicode?

We are looking for a command line application for converting PDF to text. Does VeryPDF PDF to Text Converter extract text in Unicode? We deal with scientific documents that contain characters such as α, β, γ, etc.

The application we currently use does this very efficiently, but we are planning to replace it because it can't maintain the format.

It would be kind of you to answer my queries. Also, please let me know if a demo is available for us to test.
===========================
Yes, our PDF2TXT software supports both the command line and Unicode.

Please run the following command line to convert your PDF file to a text file (the -breaker parameter inserts page breaks into the converted .txt file):

"C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe" C:\in.pdf C:\out.txt -unicode -breaker

You can also run the following command line to convert a PDF file to a text file without page break symbols:

"C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe" C:\in.pdf C:\out.txt -unicode

We hope the "-unicode" parameter will work better for you; please give it a try.
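To confirm that characters such as α, β, γ survived the conversion, the output file can be checked with a short script. The following is only a minimal Python sketch: the encoding that -unicode writes is not stated in this thread, so the script assumes the file either starts with a byte-order mark or is UTF-8. The function names are made up for illustration.

```python
import codecs
from pathlib import Path

# Assumption: the encoding of the -unicode output is not documented in this
# thread, so look for a byte-order mark and fall back to UTF-8 without one.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_text(raw):
    """Decode raw bytes using the byte-order mark if one is present."""
    for bom, encoding in BOMS:
        if raw.startswith(bom):
            return raw.decode(encoding)
    return raw.decode("utf-8", errors="replace")


def greek_letters(text):
    """Return the set of characters from the Greek and Coptic Unicode block."""
    return {ch for ch in text if "\u0370" <= ch <= "\u03ff"}


def check_file(path):
    """Read a converted .txt file and report which Greek letters it contains."""
    return greek_letters(decode_text(Path(path).read_bytes()))
```

For example, check_file(r"C:\out.txt") returns the Greek letters found in the converted file; an empty set for a document known to contain them suggests the symbols were lost or written in a different encoding.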

VeryPDF
===========================
Thanks for your reply.

Another query: can it convert a directory containing PDF files, with something like

C:\MyPDFFiles\*.pdf D:\MyConvertedTXT\ -unicode

Or do we have to make a .bat file with a command for each individual file?

I shall try it out as per your instructions and see the output.

One more query: if we go for the OCR command line version, will it be able to extract the text from PDFs that have embedded fonts?
===========================
You can run the following command line to batch convert all PDF files in a folder to text files. The text files are written to the current directory; prefix the output name with a folder such as D:\MyConvertedTXT\ to write them elsewhere.

for %F in (C:\MyPDFFiles\*.pdf) do "C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe" "%F" "%~nF.txt" -unicode -breaker

If you wish to put the above command line into a .bat file, you need to use %% instead of the % character:

for %%F in (C:\MyPDFFiles\*.pdf) do "C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe" "%%F" "%%~nF.txt" -unicode -breaker
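If a script is preferred over a .bat file, the same loop can be sketched in Python. This is only an illustration and not part of the product: it reuses the pdf2txt.exe path and the -unicode and -breaker parameters from the commands above, writes one .txt file per PDF into a separate output folder, and assumes that a non-zero exit code means the conversion failed, which this thread does not confirm. The function names are hypothetical.

```python
import subprocess
from pathlib import Path

# Path taken from the commands above; adjust it to your installation.
PDF2TXT = r"C:\Program Files\VeryPDF PDF2TXT v3.2\pdf2txt.exe"


def build_commands(src_dir, dst_dir, page_breaks=True):
    """Return one pdf2txt argument list per PDF file found in src_dir."""
    commands = []
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        out = Path(dst_dir) / (pdf.stem + ".txt")
        cmd = [PDF2TXT, str(pdf), str(out), "-unicode"]
        if page_breaks:
            cmd.append("-breaker")
        commands.append(cmd)
    return commands


def convert_folder(src_dir, dst_dir, page_breaks=True):
    """Run the converter once per PDF and return the inputs that failed."""
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for cmd in build_commands(src_dir, dst_dir, page_breaks):
        result = subprocess.run(cmd)
        # Assumption: a non-zero exit code signals a failed conversion.
        if result.returncode != 0:
            failed.append(cmd[1])
    return failed
```

Calling convert_folder(r"C:\MyPDFFiles", r"D:\MyConvertedTXT") would then process every PDF in the first folder. Passing an argument list to subprocess.run avoids quoting problems with spaces in paths such as "Program Files".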

Yes, the PDF to Text OCR version is able to extract the text from PDFs with embedded fonts; that's no problem.

VeryPDF
===========================
I tried a sample PDF with the demo PDF to TXT OCR, and the output was jumbled. Can you please have a look and see why it fails? The layout is wrong as well.
======================
You can run the following command line to convert your PDF file to a text file properly:

pdf2txtocr.exe -ocr -bitcount 1 "D:\temp\EKA_US_EN_48.pdf" "D:\temp\EKA_US_EN_48.pdf.txt"

For example:

D:\temp>"E:\pdf2txtocrcmd\pdf2txtocr.exe" -ocr -bitcount 1 "D:\temp\EKA_US_EN_48.pdf" "D:\temp\EKA_US_EN_48.pdf.txt"
You have 297 times to evaluate this product, you may purchase a full version from 'http://www.verypdf.com'.
==========================
The test version can only convert PDF files in the first few pages, if you need
to convert more of the page, please purchase the full version from
http://www.verypdf.com site.
==========================
[OCR] Processing page 1 of 3...
[OCR] Processing page 2 of 3...
[OCR] Processing page 3 of 3...

VeryPDF

