Convert image and scanned PDF to searchable PDF

This article introduces a way to convert images and scanned PDFs to searchable PDFs while retaining the original color and layout. The recommended tool is VeryPDF PDF to Text OCR Converter Command Line V 3.0.

The previous version of VeryPDF PDF to Text OCR Converter could only generate black-and-white PDFs from scanned PDFs or images. To meet customers' needs, the new version adds an option, -ocrmode <int>, that preserves the original color when converting an image or scanned PDF to a searchable PDF.

You can upgrade the product for free. The new feature of V 3.0 lets you convert images and scanned PDFs to black-and-white or color PDFs with searchable text. -ocrmode <int> accepts five values: 0, 1, 2, 3, and 4.

How to convert scanned PDF to searchable PDF?

If the input is a scanned PDF like the one below, you need to use -ocrmode <int> together with -ocr.

Figure 1. Original Scanned PDF

Choose among the five OCR modes according to the output you need:

  • -ocrmode 0: To convert a scanned PDF to TXT, use -ocrmode 0 as in the following command line: pdf2txtocr.exe -ocr -ocrmode 0 input.pdf output.txt where
      • pdf2txtocr.exe is the executable file;
      • -ocr is used when the input is a scanned PDF;
      • -ocrmode 0 generates a text file;
      • input.pdf is the input file; and
      • output.txt is the output file.
  • -ocrmode 1: To convert a scanned PDF to a searchable PDF that retains the original color, use -ocrmode 1 as in pdf2txtocr.exe -ocr -ocrmode 1 input.pdf 1.pdf

Figure 2. Result color PDF

  • -ocrmode 2: To create a black-and-white searchable PDF without images, use -ocrmode 2 as in pdf2txtocr.exe -ocr -ocrmode 2 input.pdf 2.pdf

Figure 3. Result B&W PDF without images

  • -ocrmode 3: To create a black-and-white searchable PDF with images, use -ocrmode 3 as in pdf2txtocr.exe -ocr -ocrmode 3 input.pdf 3.pdf

Figure 4. Result B&W PDF with images

  • -ocrmode 4: To convert a scanned PDF or an image to a searchable PDF in color, use -ocrmode 4 as in pdf2txtocr.exe -ocr -ocrmode 4 input.tif 4.pdf
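The five modes listed above can be collected into a small lookup table. The following Python sketch is not part of the VeryPDF documentation; it only assembles the command lines shown in this article as argument lists. The function name scanned_pdf_command, the OCR_MODES dictionary, and the output file names are illustrative assumptions, and the mode descriptions are paraphrased from the list above.

```python
# Mode descriptions paraphrased from the article; the dictionary is illustrative.
OCR_MODES = {
    0: "plain text file",
    1: "searchable PDF with the original color retained",
    2: "black-and-white searchable PDF without images",
    3: "black-and-white searchable PDF with images",
    4: "searchable PDF in color",
}


def scanned_pdf_command(mode, input_pdf, output_file, exe="pdf2txtocr.exe"):
    """Return the argument list for converting a scanned PDF in the given mode."""
    if mode not in OCR_MODES:
        raise ValueError(f"-ocrmode must be one of {sorted(OCR_MODES)}, got {mode!r}")
    # -ocr is included because the input is a scanned PDF.
    return [exe, "-ocr", "-ocrmode", str(mode), input_pdf, output_file]


if __name__ == "__main__":
    # Print one command line per mode, with a hypothetical output name.
    for mode, description in OCR_MODES.items():
        output = "output.txt" if mode == 0 else f"{mode}.pdf"
        command = scanned_pdf_command(mode, "input.pdf", output)
        print(" ".join(command), " # ", description)
```

On a machine where the converter is installed, the returned list could be passed to subprocess.run(command, check=True) to run the conversion; the sketch itself only builds and prints the commands.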

Choose whichever of the five modes suits you to convert a scanned PDF to a searchable text file or PDF. The rest of this article shows how to convert an image to a searchable PDF.

How to convert image to searchable PDF?

The option -ocrmode <int> is required to convert an image to a searchable PDF, but -ocr is not necessary when the input is an image. Taking a TIF image as an example, you can convert it to different types of searchable PDF with the following command lines:

  • pdf2txtocr.exe -ocrmode 0 input.tif 0.pdf
  • pdf2txtocr.exe -ocrmode 2 input.tif 2.pdf
  • pdf2txtocr.exe -ocrmode 3 input.tif 3.pdf
  • pdf2txtocr.exe -ocrmode 4 input.tif 4.pdf
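The difference between the two kinds of input (-ocr for a scanned PDF, no -ocr for an image) can be captured in a few lines. This is a minimal Python sketch under stated assumptions: the helper name ocr_command is hypothetical, and treating every file with a .pdf extension as a scanned PDF is a simplification made for illustration, not documented behavior of the tool.

```python
from pathlib import PureWindowsPath


def ocr_command(input_file, output_file, mode, exe="pdf2txtocr.exe"):
    """Build the argument list, adding -ocr only when the input is a PDF."""
    command = [exe]
    # Assumption: a .pdf extension means a scanned PDF, which needs -ocr.
    if PureWindowsPath(input_file).suffix.lower() == ".pdf":
        command.append("-ocr")
    command += ["-ocrmode", str(mode), input_file, output_file]
    return command


if __name__ == "__main__":
    # An image input omits -ocr; a scanned PDF input includes it.
    print(" ".join(ocr_command("input.tif", "3.pdf", 3)))
    print(" ".join(ocr_command("input.pdf", "3.pdf", 3)))
```

Keeping the output file name as an explicit parameter avoids guessing what the tool writes for each mode.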

I strongly recommend using -ocrmode 3 when converting images that contain tables. The following figures compare the original TIF (Figure 5) with the resulting searchable PDF (Figure 6).

Figure 5. Original TIF image

Figure 6. Result PDF

How to get VeryPDF PDF to Text OCR Converter?

If you want to try VeryPDF PDF to Text OCR Converter Command Line V 3.0, please click here to download.

Please feel free to leave a message if you have any questions. For more information, you can contact the VeryPDF support team.


3 Replies to “Convert image and scanned PDF to searchable PDF”

  1. Hello,

    I thank you for your article,

    I have one question: is this command-line tool free or not?

    Thank you.

  2. >>What products can be used to convert scanned PDF to searchable PDF file?

    Thanks for your message. The following products can all convert scanned PDF files to searchable PDF files. The output PDF files contain a hidden text layer, so you can open the OCRed PDF files in Adobe Reader and search their text content:

    Image to PDF OCR Converter Command Line,
    http://www.verypdf.com/app/image-to-pdf-ocr-converter/try-and-buy.html#buy-ocr-cmd

    PDF to Text OCR Converter Command Line,
    http://www.verypdf.com/app/pdf-to-text-ocr-converter/try-and-buy.html#buy

    VeryPDF OCR to Any Converter Command Line,
    http://www.verypdf.com/app/ocr-to-any-converter-cmd/try-and-buy.html
