Convert PDF to text and add page number in text file using OCR

In this article, I will show you how to convert an image PDF to text and add page numbers to the output text file using OCR technology. Plenty of software on the market can convert PDF to text, but few tools both include an OCR function and let you add page numbers in various forms to the output text file. With OCR, you can convert either an image PDF or a text-based PDF file to an editable text file.

VeryPDF PDF to Text OCR Converter CMD has these functions. To learn more about it, please visit its homepage. In the following part, I will show you how to use this software.

Step 1. Download PDF to Text OCR Converter Command Line

  • VeryPDF software is free to download and comes with a free trial, so you can try it without worrying about being charged extra fees without your permission.
  • When the download finishes, you will have a zip file. Extract it to a folder, then call the executable file from the MS-DOS command prompt window.

Step 2. Convert PDF to text and add page number in text file

  • Here is the usage for your reference:  pdf2txtocr.exe [options] <PDF-file> <Text-file>
  • When converting a text-based PDF file to text and adding page numbers, please refer to the following command line template.
    pdf2txtocr.exe -text "PageText %PageNumber% of %PageCount%" C:\in.pdf C:\out.txt
    With this template, you can convert PDF to text and add page information such as the current page number and the total page count.
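
    The text-based template can also be driven from a script. Here is a minimal Python sketch, assuming pdf2txtocr.exe is in the current folder or on the PATH. The helper name build_text_command and the file paths are only illustrative and are not part of the VeryPDF software.

```python
import subprocess


def build_text_command(pdf_path, txt_path,
                       footer="PageText %PageNumber% of %PageCount%",
                       exe="pdf2txtocr.exe"):
    """Return the argument list for the text-based template from this article.

    Usage from the article: pdf2txtocr.exe [options] <PDF-file> <Text-file>
    """
    # Passing a list (not one string) means no shell parses the % characters.
    return [exe, "-text", footer, pdf_path, txt_path]


if __name__ == "__main__":
    cmd = build_text_command(r"C:\in.pdf", r"C:\out.txt")
    print(cmd)
    # Uncomment on a machine where the converter is installed:
    # subprocess.run(cmd, check=True)
```

    If you put the same template into a .bat file instead, note that the command prompt treats %...% as a variable reference inside batch files, so the percent signs normally have to be doubled (for example %%PageNumber%%).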

  • When converting an image PDF to text and adding page numbers, you need to add the OCR parameter and related parameters, as in the following command line template.
    pdf2txtocr.exe -ocr -lang eng -text "PageText %PageNumber% of %PageCount%" C:\in.pdf C:\out.txt
    Please choose the proper OCR language according to the content of the PDF file.
  • Here are some parameters you may use:
    -ocr                : enables the OCR function for scanned PDF files
    -lang <string>      : sets the language for the OCR engine; follow it with the corresponding language package, for example eng
    -ocrmode <int>      : sets the OCR mode
        -ocrmode 0: this mode outputs to a text file
    -text <string>      : adds additional text at the end of each text page; this parameter supports the following variables:
        %PageNumber%: current page number
        %PageCount% : total page count of the PDF file
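
To make the parameters listed here more concrete, here is a small Python sketch. build_ocr_command assembles the OCR template from this article, and preview_footer imitates how the two variables would presumably be filled in for a given page. Both function names are hypothetical helpers, and preview_footer only illustrates the expected result; it is not the converter's own code.

```python
def build_ocr_command(pdf_path, txt_path, lang="eng", footer=None,
                      ocr_mode=None, exe="pdf2txtocr.exe"):
    """Assemble the OCR command line using only options named in this article."""
    cmd = [exe, "-ocr", "-lang", lang]
    if ocr_mode is not None:
        # The article documents only -ocrmode 0 (output to text file).
        cmd += ["-ocrmode", str(ocr_mode)]
    if footer is not None:
        cmd += ["-text", footer]
    cmd += [pdf_path, txt_path]
    return cmd


def preview_footer(template, page_number, page_count):
    """Illustration only: substitute the two documented variables."""
    return (template.replace("%PageNumber%", str(page_number))
                    .replace("%PageCount%", str(page_count)))


if __name__ == "__main__":
    template = "PageText %PageNumber% of %PageCount%"
    print(build_ocr_command(r"C:\in.pdf", r"C:\out.txt", footer=template))
    # Preview the line expected at the end of each page of a 3-page file.
    for page in range(1, 4):
        print(preview_footer(template, page, 3))
```

The preview is handy for checking a -text template before running a long OCR job, since recognition of a large scanned file can take a while.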

Now let us check the page numbering effect in the following snapshots.

[Snapshot: output text file with page numbers]
[Snapshot: input image PDF file]

As the snapshots above show, this software can convert an image PDF to text and add page numbers. If you have any questions while using it, please contact us as soon as possible.

