Convert scanned PDF to searchable PDF without losing color

This article shares tips on how to convert scanned PDF to searchable PDF without losing color. VeryPDF PDF to Text OCR Converter Command Line v3.0 offers different ways to create searchable PDF depending on the input file. Some input PDFs contain both scanned pages and editable pages; others contain only scanned pages. To convert such PDFs to searchable PDF without losing the original color, you can try the option -ocrmode <int>.

-ocrmode 1 vs -ocrmode 4

Four values are permitted by -ocrmode <int>. To retain color, you can use -ocrmode 1 or -ocrmode 4. The following compares the two modes, both of which can generate a color PDF where text is searchable:

                              -ocrmode 1                  -ocrmode 4
supported input formats       scanned PDF                 PDF and images
text in output PDF            vector-based, searchable    raster-based, searchable
quality of magnified text     high quality                loses clarity
text layer                    under original PDF pages    hidden
original PDF pages            retained                    removed
original color                retained                    retained (scanned PDF needs -bitcount 24)

When to use -ocrmode 1?

If the input PDF contains only scanned pages, -ocrmode 1 is recommended, as in pdf2txtocr.exe -ocr -ocrmode 1 ocr.pdf ocr1.pdf, where

  • pdf2txtocr.exe calls VeryPDF PDF to Text OCR Converter.
  • -ocr calls the built-in OCR engine. This option is required when converting scanned PDF.
  • -ocrmode 1 recognizes the text in the scanned PDF and inserts a new text layer under the original PDF pages.
  • ocr.pdf represents the input file.
  • ocr1.pdf stands for the output file.

The illustrations below show the effects of conversion from a scanned PDF to searchable PDF. The text in the result PDF can be magnified by any amount without lowering quality.

Fig.1 Input scanned PDF

Fig. 2 After using -ocrmode 1      Fig. 3 Magnified 16 times

[Tips] -ocrmode 1 only recognizes text in scanned PDF. It can't recognize text in images. If the input PDF has editable pages, two text layers may appear: one newly created, and one belonging to the original editable pages. Such problems can be solved by using -ocrmode 4.
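
The recommendation above can be written down as a small decision rule. The Python snippet below is only a sketch: the function name choose_ocrmode and its two parameters are hypothetical and not part of the VeryPDF tool. It simply encodes the advice in this article, and you need to determine yourself whether the input is an image or contains editable pages.

```python
def choose_ocrmode(is_image: bool, has_editable_pages: bool = False) -> int:
    """Return the -ocrmode value suggested by this article.

    is_image:            the input is an image file rather than a PDF
    has_editable_pages:  the input PDF mixes scanned and editable pages
    Both flags are supplied by the caller; nothing is detected automatically.
    """
    # -ocrmode 1 cannot read images, and it may produce two text layers
    # when the PDF already has editable pages, so fall back to mode 4.
    if is_image or has_editable_pages:
        return 4
    # A PDF with only scanned pages keeps vector-quality text with mode 1.
    return 1


if __name__ == "__main__":
    print(choose_ocrmode(is_image=False))                           # scanned-only PDF
    print(choose_ocrmode(is_image=False, has_editable_pages=True))  # mixed PDF
    print(choose_ocrmode(is_image=True))                            # image input
```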

When to use -ocrmode 4?

-ocrmode 4 is provided to convert images to searchable PDF, scanned PDF to searchable PDF, and PDF with some searchable pages to editable PDF. When you use -ocrmode 4 to convert scanned PDF, the text in the result PDF loses clarity when magnified. The illustrations below show the effects:

Fig. 4 After using -ocrmode 4      Fig. 5 Magnified 16 times

The following are two command lines for conversion from scanned PDF to searchable PDF:

  • pdf2txtocr.exe -ocr -ocrmode 4 -bitcount 24 ocr.pdf color.pdf
  • pdf2txtocr.exe -ocr -ocrmode 4 ocr.pdf grey.pdf

The illustrations below show the effects of the two command lines:

Fig. 7 1st command line     Fig.8 2nd command line

The following command lines convert images to PDF:

  • pdf2txtocr.exe -ocrmode 4 ocr.tif color.pdf
  • pdf2txtocr.exe -ocrmode 4 ocr.png color.pdf

[Tips] When converting images to PDF, -ocr is not required, as in the last two command lines above. When converting scanned PDF, -ocr must appear, as in the first two command lines above. Moreover, to retain the original color when creating searchable PDF from scanned PDF, you need to use -bitcount 24. Otherwise, the result PDF will be grey, as in Fig. 8.
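
If you call the converter from a script, the rules in the tips above can be collected in one place. The following Python snippet is a minimal sketch, not an official wrapper: the function name build_mode4_command, its parameters, and the IMAGE_EXTENSIONS set are assumptions made for illustration. It only assembles the same command lines shown in this article (options -ocr, -ocrmode 4 and -bitcount 24) and does not run the converter.

```python
from pathlib import Path

# Only .tif and .png inputs appear in this article; other image
# formats are not covered here, so they are rejected below.
IMAGE_EXTENSIONS = {".tif", ".png"}


def build_mode4_command(input_file, output_file, keep_color=True,
                        exe="pdf2txtocr.exe"):
    """Assemble an -ocrmode 4 command line as a list of arguments."""
    suffix = Path(input_file).suffix.lower()
    if suffix != ".pdf" and suffix not in IMAGE_EXTENSIONS:
        raise ValueError(f"input type not covered by this sketch: {suffix!r}")

    args = [exe]
    if suffix == ".pdf":
        # -ocr must appear when the input is a scanned PDF.
        args.append("-ocr")
    args += ["-ocrmode", "4"]
    if suffix == ".pdf" and keep_color:
        # Without -bitcount 24 the result PDF from a scanned PDF is grey.
        args += ["-bitcount", "24"]
    args += [str(input_file), str(output_file)]
    return args


if __name__ == "__main__":
    # Print the command lines instead of running them.
    print(" ".join(build_mode4_command("ocr.pdf", "color.pdf")))
    print(" ".join(build_mode4_command("ocr.pdf", "grey.pdf", keep_color=False)))
    print(" ".join(build_mode4_command("ocr.tif", "color.pdf")))
```

The returned list can be passed to subprocess.run on a machine where pdf2txtocr.exe is installed and on the PATH.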

Download Link

If you want to use VeryPDF PDF to Text OCR Converter Command Line v3.0, please click here to download. If you have any questions, please consult the Support Group of VeryPDF.


One Reply to “Convert scanned PDF to searchable PDF without losing color”

  1. Thanks for your message. The following products can all convert scanned PDF files to searchable PDF files. The output PDF files contain a hidden text layer, so you can open the OCRed PDF files in Adobe Reader and search the text contents properly:

    Image to PDF OCR Converter Command Line,
    http://www.verypdf.com/app/image-to-pdf-ocr-converter/try-and-buy.html#buy-ocr-cmd

    PDF to Text OCR Converter Command Line,
    http://www.verypdf.com/app/pdf-to-text-ocr-converter/try-and-buy.html#buy

    VeryPDF OCR to Any Converter Command Line,
    http://www.verypdf.com/app/ocr-to-any-converter-cmd/try-and-buy.html

    Please see the following web pages for more information:

    http://www.verypdf.com/wordpress/201211/convert-scanned-pdf-to-searchable-pdf-without-losing-color-32937.html

    http://www.verypdf.com/wordpress/201312/bulk-scanned-pdf-files-to-searchable-pdf-files-batch-converter-40025.html

    http://www.verypdf.com/wordpress/201211/convert-image-and-scanned-pdf-to-searchable-pdf-32896.html
