Convert scanned PDF file to a new PDF file with OCR and despeckle processing
We're trying to use pdf2txtocrcmd to OCR a PDF. Here are the parameters we're using:
pdf2txtocrcmd.exe -imageopt -ocrmode 3 c:\filein.pdf c:\fileout.pdf
The input file is three pages. When we don't use the -imageopt flag, the PDF OCR works as expected. However, we're trying to get better results from the OCR, so we also want to use the -imageopt flag to despeckle the PDF before OCR. When we use the -imageopt flag, the first two pages of the output PDF are blank (except for your watermark). The third page does have the OCR.
We obviously need all of the input pages in the output.
Thanks for your message. We will research this problem and try to fix it in a new version of the PDF to Text OCR Command Line software shortly.
In the meantime, please download the "Image to PDF OCR Converter Command Line" software from the following web page and give it a try.
After you download and unzip it to a folder, you can run the following command line to convert your scanned PDF file to a new PDF file with OCR and despeckling applied:
img2pdfnew.exe -ocr 1 -tsocr -despeckle D:\downloads\noOCR_sub.pdf D:\downloads\newOCR_despeckle.pdf
The speckles are removed from the output PDF file, which looks clear enough.
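If you need to run the same conversion over several scanned PDF files, the command above can be wrapped in a small script. The following is only a rough sketch in Python, not an official VeryPDF tool: the function names, the folder arguments, and the "_ocr" output suffix are made up for illustration, and it assumes img2pdfnew.exe is on your PATH and returns a non-zero exit code on failure. It uses only the options shown in the command line above.

```python
import subprocess
from pathlib import Path


def build_command(src, dst, exe="img2pdfnew.exe"):
    """Build the argument list for one conversion.

    Only the options quoted in this thread are used:
    -ocr 1 -tsocr -despeckle <input.pdf> <output.pdf>
    """
    return [exe, "-ocr", "1", "-tsocr", "-despeckle", str(src), str(dst)]


def convert_folder(in_dir, out_dir, exe="img2pdfnew.exe"):
    """Run the conversion for every *.pdf file found in in_dir.

    Hypothetical helper: the output naming scheme is an assumption,
    and check=True assumes the tool signals failure via its exit code.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(in_dir).glob("*.pdf")):
        dst = out_dir / (src.stem + "_ocr.pdf")
        subprocess.run(build_command(src, dst, exe), check=True)


# Example call (paths are placeholders):
# convert_folder(r"D:\downloads\scans", r"D:\downloads\ocr")
```

Passing the arguments as a list rather than one string avoids quoting problems when a file path contains spaces.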
Here is the source PDF file. It contains speckles, and its text is not selectable.
Here is the OCRed PDF file. As you can see, the speckles are removed and the text is selectable, so you can select the text and copy it into MS Word easily.