We are trying to convert scanned PDFs/images (TIFF/JPG) to searchable PDF files

We are trying to convert scanned PDFs and images (TIFF/JPG) to searchable PDFs. Please suggest a better option that runs from the command prompt. I tried the trial version of pdf2txtocrcmd, but it could not OCR 100% of the text, and some strings are not searchable.

PDF to Text OCR Converter Command Line,
https://www.verypdf.com/app/pdf-to-text-ocr-converter/try-and-buy.html

Please find attached the two images I am trying to convert to searchable PDFs.

I ran two steps for the TIFF file:

1. Convert the TIFF to PDF

pdf2txtocr.exe -ocrmode 3 D:\Jhansi\pdf2txtocrcmd\test\testtif1.tif D:\pdf2txtocrcmd\test\out_testtif1_ocrmode3.pdf

2. Convert the image PDF (output of the step above) to a searchable PDF

pdf2txtocr.exe -ocr -ocrmode 1 -res 72 D:\pdf2txtocrcmd\test\out_testtif1_ocrmode3.pdf D:\pdf2txtocrcmd\test\ser_out_testtif1_ocrmode3.pdf

I followed a similar process for the JPEG file too.

Thanks,
Customer
--------------------------------------------------
Thanks for your sample files. We have examined your "testtif1.tif" file carefully, and the OCR engine in pdf2txtocr.exe does not handle this TIFF file very well.

However, the OCR engine in "VeryPDF OCR to Any Converter Command Line" works well for this TIFF file. You may download "VeryPDF OCR to Any Converter Command Line" from this web page to try it:

https://www.verypdf.com/app/ocr-to-any-converter-cmd/try-and-buy.html#buy
https://www.verypdf.com/dl2.php/ocr2any_cmd.zip

After you download it, you can run the following command lines to convert the testtif1.tif file to a text file, a Word (RTF) document, and an HTML document:

ocr2any.exe -ocr2 D:\downloads\testtif1.tif D:\downloads\testtif1.txt

ocr2any.exe -ocr2 D:\downloads\testtif1.tif D:\downloads\testtif1.rtf

ocr2any.exe -ocr2 D:\downloads\testtif1.tif D:\downloads\testtif1.html
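
The three commands above differ only in the output extension, so they can be generated for any input file. Below is a minimal Python sketch of that idea. The helper name `build_ocr2_commands` and the choice to write each output next to the input file are illustrative assumptions, not part of the VeryPDF software. The script only prints the command lines; it uses no flag other than the `-ocr2` option shown above.

```python
import subprocess
from pathlib import PureWindowsPath

# Output extensions taken from the three example commands above.
OUTPUT_EXTENSIONS = (".txt", ".rtf", ".html")


def build_ocr2_commands(input_file, exe="ocr2any.exe"):
    """Return one 'ocr2any.exe -ocr2 <input> <output>' argument list per format.

    Hypothetical helper: each output file is placed next to the input file,
    with only the extension changed.
    """
    source = PureWindowsPath(input_file)
    return [
        [exe, "-ocr2", str(source), str(source.with_suffix(ext))]
        for ext in OUTPUT_EXTENSIONS
    ]


if __name__ == "__main__":
    for command in build_ocr2_commands(r"D:\downloads\testtif1.tif"):
        # Print the command line; on Windows it could instead be executed
        # with subprocess.run(command, check=True).
        print(subprocess.list2cmdline(command))
```

To process several scanned files, call `build_ocr2_commands` once per input path and run the resulting commands in a loop.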

This is the original TIFF file before OCR:

[image]

This is the Word document after OCR:

[image]

This is the plain text file after OCR:

[image]

This is the HTML file after OCR:

[image]

VeryPDF
--------------------------------------------------
Hi,

I am looking for PDF output. You mentioned that ocr2any will produce .txt, .rtf and .html files. Can it also convert to searchable PDF?

Thanks,
Customer
--------------------------------------------------
Thanks for your message. The -ocr option in ocr2any.exe does support searchable PDF creation; however, the -ocr2 option does not support it yet. We will add searchable PDF creation to the -ocr2 option in a future release of ocr2any.exe. Thanks for your patience.
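
As a rough illustration of the -ocr route, the Python sketch below builds a command line that requests a PDF output. It assumes that -ocr accepts the same "input file, output file" arguments as the -ocr2 examples earlier in this thread; that argument order is an assumption and should be checked against the ocr2any.exe documentation. The helper name `build_searchable_pdf_command` is hypothetical.

```python
import subprocess
from pathlib import PureWindowsPath


def build_searchable_pdf_command(input_file, exe="ocr2any.exe"):
    """Build an 'ocr2any.exe -ocr <input> <output.pdf>' argument list.

    Assumption: -ocr takes input and output paths in the same order as the
    -ocr2 examples; verify this against the tool's own documentation.
    """
    source = PureWindowsPath(input_file)
    return [exe, "-ocr", str(source), str(source.with_suffix(".pdf"))]


command = build_searchable_pdf_command(r"D:\downloads\testtif1.tif")
# Print the command line; on Windows it could be executed with
# subprocess.run(command, check=True).
print(subprocess.list2cmdline(command))
```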

VeryPDF
