How to OCR images, PDFs, and scanned files with one program?

   OCR stands for optical character recognition: the mechanical or electronic conversion of scanned images of handwritten, typewritten, or printed text into machine-encoded text. The technology is essential when you need to extract text from an image file, an image-based PDF file, or a scanned file. There is a lot of OCR software on the market, but few programs can OCR all of the formats mentioned above.

VeryPDF OCR to Any Converter Command Line can handle all of the tasks mentioned above. To learn more about what this software can do, please visit its homepage. In the following part, I will show you how to use it.

Step 1. Download OCR to Any Converter Command Line v2.0 

  • There are two versions of this software: a GUI version and a command line version. Here I will use the command line version as the example. If you are not familiar with command line operation, please download the GUI version.
  • When the command line version has finished downloading, you will have a zip file. Extract it to a folder, where you will find the executable file and the help documents.
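
If you prefer to script the extraction step, the following is a minimal Python sketch. The function name `extract_package` and any archive or folder names you pass to it are hypothetical; the real name of the downloaded zip file is not given in this post.

```python
import zipfile
from pathlib import Path


def extract_package(zip_path, dest_dir):
    """Unpack the downloaded archive and return the .exe files found inside.

    zip_path and dest_dir are placeholders: use the real location of the
    zip file you downloaded and the folder you want to extract it to.
    """
    dest = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Search the extracted tree for executables such as ocr2any.exe.
    return sorted(dest.rglob("*.exe"))
```

The returned list should point you to the executable; the help documents (for example readme.txt) sit in the same extracted folder.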

Step 2. OCR images, PDFs, and scanned files with one program

  • When you use this software, please refer to the usage and the examples.
  • Here is the usage for your reference: ocr2any.exe [options] <PDF-file> <Text-file>
  • To OCR image files, here are some examples for your reference.
  • ocr2any.exe C:\in.tif C:\out.txt
    ocr2any.exe C:\in.jpg C:\out.txt
    ocr2any.exe C:\in.bmp C:\out.txt
    ocr2any.exe C:\in.png C:\out.txt

These are just a few examples; please check readme.txt for more. When you OCR an image, the software writes the recognized content to a text file.
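
When there are many images to process, the commands above can be generated rather than typed by hand. The sketch below only builds the argument lists; it does not run the tool. The helper name `build_image_ocr_commands` is hypothetical, the extension list is limited to the formats shown in the examples above, and writing each text file next to its input is an assumption.

```python
from pathlib import PureWindowsPath

# Image extensions taken from the examples above; the tool may accept more.
IMAGE_EXTENSIONS = {".tif", ".jpg", ".bmp", ".png"}


def build_image_ocr_commands(image_paths, exe="ocr2any.exe"):
    """Return one argument list per supported image file.

    Each command follows the pattern 'ocr2any.exe <input> <output.txt>'.
    The output path reuses the input name with a .txt extension
    (an assumption made for this sketch).
    """
    commands = []
    for raw in image_paths:
        src = PureWindowsPath(raw)
        if src.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip files the examples above do not cover
        dst = src.with_suffix(".txt")
        commands.append([exe, str(src), str(dst)])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ocr2any.exe is installed and on the PATH.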

  • To OCR a PDF file, please refer to the following command line templates.
  • ocr2any.exe -ocr -lang deu -ocrmode 1 C:\in.pdf C:\out.pdf
    ocr2any.exe -ocr -lang eng -ocrmode 2 C:\in.pdf C:\out.pdf
    ocr2any.exe -ocr -lang eng -ocrmode 3 C:\in.pdf C:\out.pdf
    Here are some parameters:

    -ocr                    : enable OCR function for scanned PDF file
    -lang <string>          : choose the language for OCR engine
    -ocrmode <int>          : set OCR mode
      -ocrmode 0: output to text file
      -ocrmode 1: OCR PDF pages and insert new text layer under original PDF pages
      -ocrmode 2: output to plain text based PDF file
      -ocrmode 3: output to OCRed PDF file (BW) with hidden text layer
      -ocrmode 4: output to OCRed PDF file (Color) with hidden text layer

When you OCR a PDF file, you can output a searchable PDF file or text, Word, HTML, Excel, and other editable file formats.
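
The PDF templates and the -ocrmode values listed above can be combined in a small helper that rejects an unknown mode before the tool is called. This is a sketch only: the function name `build_pdf_ocr_command` is hypothetical, and the mode descriptions are copied from the parameter list in this post rather than from the tool's own documentation.

```python
# Mode numbers and descriptions copied from the parameter list above.
OCR_MODES = {
    0: "output to text file",
    1: "insert new text layer under original PDF pages",
    2: "output to plain text based PDF file",
    3: "OCRed PDF file (BW) with hidden text layer",
    4: "OCRed PDF file (Color) with hidden text layer",
}


def build_pdf_ocr_command(src, dst, lang="eng", mode=1, exe="ocr2any.exe"):
    """Build 'ocr2any.exe -ocr -lang <lang> -ocrmode <mode> <src> <dst>'."""
    if mode not in OCR_MODES:
        raise ValueError(
            f"unsupported -ocrmode {mode}; expected one of {sorted(OCR_MODES)}"
        )
    return [exe, "-ocr", "-lang", lang, "-ocrmode", str(mode), src, dst]
```

For example, `build_pdf_ocr_command(r"C:\in.pdf", r"C:\out.pdf", lang="deu", mode=1)` reproduces the first template shown above as an argument list.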

  • When you OCR a scanned file, it is best to scan in black and white, as this can greatly improve the recognition rate. The software also lets you preprocess the scanned input before OCR: for example, you can change the bit count, rotate the image, and adjust the threshold. Here are some examples.
    ocr2any.exe -imageopt -threshold 0 C:\in.tif C:\out.bmp
    ocr2any.exe -threshold 240 C:\in.tif C:\out.bmp
    ocr2any.exe -dither 0 C:\in.bmp C:\out.png
    ocr2any.exe -dither 7 C:\in.bmp C:\out.png
    ocr2any.exe -imageopt -resizewidth 800 -resizeheight 600 C:\in.gif C:\out.tga
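
To show what a threshold does in general when an image is converted to black and white, here is a minimal Python sketch. It illustrates the common technique, not the tool's internals: this post does not document how ocr2any.exe applies -threshold, or whether a value such as 0 has a special meaning, so the comparison rule below is an assumption.

```python
def binarize(pixels, threshold):
    """Map each grayscale value (0-255) to black (0) or white (255).

    Assumed rule for this sketch: values at or above the threshold
    become white, everything darker becomes black.
    """
    return [
        [255 if value >= threshold else 0 for value in row]
        for row in pixels
    ]
```

With a high threshold such as 240, only near-white pixels stay white, so faint grey marks are turned into solid black; a lower threshold does the opposite and drops faint marks.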

You can therefore treat this software as an all-in-one tool: it can OCR most common file formats. If you have any questions while using it, please contact us.
