How to convert an image to text from the command line?

     Sometimes we need to extract text from an image file, which is hard to do without suitable software at hand. There are many GUI programs on the market for this, but command line versions are rare. This article introduces a tool that can convert an image to text from the command line: VeryPDF PDF to Text OCR Converter Command Line, which converts TIFF, BMP, PNG, JPG, PCX, and TGA images to text. Please check the software homepage for more information. The following steps show how to use this software.

Step 1. Download PDF to Text OCR Converter Command Line

  • This is a Windows application. When the download finishes you will have a zip file. Extract it to a folder, and then you can call the executable file from an MS-DOS (Command Prompt) window.
  • When using this software, please refer to its usage and examples.

Step 2. Convert image to text by command line.

  • There is also a software component version that provides tools and libraries so that programmers can quickly integrate PDF to Text OCR Converter, or some of its functions, into other applications. If you are a developer, please use that version.
  • In the following part, I will take the standard command line version as the example.
  • Here is the usage for your reference. Usage: pdf2txtocr.exe [options] <PDF-file> <Text-file>
  • When converting an image to text, please refer to the following command line templates.
  • pdf2txtocr.exe C:\in.tif C:\out.txt
    pdf2txtocr.exe C:\in.jpg C:\out.txt
    pdf2txtocr.exe C:\in.bmp C:\out.txt
    pdf2txtocr.exe C:\in.png C:\out.txt
    pdf2txtocr.exe C:\in.pcx C:\out.txt
    pdf2txtocr.exe C:\in.tga C:\out.txt
    You do not need to add any other parameters; simply give the full path of the input image file and the full path of the output text file.

  • When converting an image to text, this software lets you preprocess the image first. For example, you can rotate pages or adjust the lightness threshold used to convert the image to black and white. The related parameters are listed here.
    -bitcount <int>     : set color depth when render PDF page to image data, it can be set 1, 8, 24, default is 8bit
    -rotate <int>       : rotate pages before OCR
    -threshold <int>    : lightness threshold that used to convert image to B&W
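
The command line templates and options above can also be driven from a script when many images need converting. The following is only a minimal sketch in Python, not part of the VeryPDF product: the function names build_command and convert_folder are hypothetical, the script assumes pdf2txtocr.exe can be found on the PATH, and it assumes each option is followed by its integer value exactly as shown in the parameter list. Example values such as 90 for -rotate and 128 for -threshold are illustrative, because the article does not document the accepted ranges.

```python
import subprocess
from pathlib import Path

# Image extensions used in the command line templates of this article.
IMAGE_EXTENSIONS = {".tif", ".jpg", ".bmp", ".png", ".pcx", ".tga"}


def build_command(image_path, text_path, exe="pdf2txtocr.exe",
                  bitcount=None, rotate=None, threshold=None):
    """Build the argument list for one conversion.

    Follows the documented usage:
        pdf2txtocr.exe [options] <PDF-file> <Text-file>
    Options are only added when a value is given.
    """
    cmd = [exe]
    if bitcount is not None:
        # The article states that -bitcount can be set to 1, 8, or 24.
        if bitcount not in (1, 8, 24):
            raise ValueError("bitcount must be 1, 8, or 24")
        cmd += ["-bitcount", str(bitcount)]
    if rotate is not None:
        cmd += ["-rotate", str(rotate)]
    if threshold is not None:
        cmd += ["-threshold", str(threshold)]
    cmd += [str(image_path), str(text_path)]
    return cmd


def convert_folder(folder, exe="pdf2txtocr.exe", **options):
    """Run one conversion per supported image in a folder.

    Each output text file is written next to its image with a .txt
    extension. Returns a list of (image name, exit code) pairs; what a
    given exit code means is not documented in this article.
    """
    results = []
    for image in sorted(Path(folder).iterdir()):
        if image.suffix.lower() in IMAGE_EXTENSIONS:
            cmd = build_command(image, image.with_suffix(".txt"),
                                exe=exe, **options)
            completed = subprocess.run(cmd)
            results.append((image.name, completed.returncode))
    return results
```

For instance, build_command(r"C:\in.tif", r"C:\out.txt") produces the same arguments as the first template above, and convert_folder(r"C:\scans", threshold=128) would process every supported image in a hypothetical C:\scans folder with the same threshold.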

So this software is well suited to converting black and white scanned files to text. You can use it together with a scanner and extract the content directly from the scanned image file. If the resolution of the input image file is too low, the recognition quality will suffer, so when scanning a file, try to use a higher image resolution.
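
To see why a lightness threshold matters for scanned pages, here is a small generic illustration in Python of converting grayscale values to black and white. It is an assumption-laden sketch, not the algorithm used inside PDF to Text OCR Converter: the function name to_black_and_white is hypothetical, pixel values are assumed to run from 0 (black) to 255 (white), and the article does not say how the tool interprets the -threshold value.

```python
def to_black_and_white(gray_rows, threshold):
    """Map grayscale values (0-255) to 1-bit pixels.

    A pixel becomes 1 (white) when its value is at or above the
    threshold, otherwise 0 (black).
    """
    return [[1 if value >= threshold else 0 for value in row]
            for row in gray_rows]


# A tiny 2x4 "scan": bright paper (about 240-250) with darker ink strokes.
sample = [
    [250, 240, 60, 245],
    [248, 90, 40, 250],
]

print(to_black_and_white(sample, 128))  # [[1, 1, 0, 1], [1, 0, 0, 1]]
print(to_black_and_white(sample, 50))   # [[1, 1, 1, 1], [1, 1, 0, 1]]
```

With the lower threshold, the fainter strokes (values 60 and 90) turn white and disappear, which is the kind of loss that makes text harder to recognize.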

Now let us check the conversion result in the following snapshots. If you have any questions while using the software, please contact us as soon as possible.

[Snapshot: the input TIFF file]

[Snapshot: the output text file]
