Convert Scanned Images to a Text-Based E-book by Command Line

   When a scanner is connected to converter software, you can convert scanned image files to a PDF E-book directly. This is a good way to create E-books from paper books or documents. However, there is a problem: when an image is converted to a PDF E-book, you cannot copy and paste or search in the output E-book. To meet this need, VeryPDF developed Scan to E-book OCR Maker, which lets you create a searchable E-book from either a scanned image or an image PDF. In the following part, I will show you how to do it.

Step 1. Download Scan to E-book OCR Maker

  • There are two versions of this software: GUI version and command line version. In this article, I will show you the command line version.
  • When the download finishes, there will be a zip file in the download folder. Please extract it to a folder, then call img2pdfnew.exe in an MS-DOS command prompt window.
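
The extraction step can also be scripted. The following Python sketch is only an illustration: the function name extract_and_locate is hypothetical, the archive layout is not documented in this article (so the script searches the extracted folder recursively), and the default executable name is taken from the command line templates in Step 2.

```python
import zipfile
from pathlib import Path


def extract_and_locate(zip_path, dest_dir, exe_name="img2pdfnew.exe"):
    """Extract the downloaded archive and return the path of the tool, or None."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Assumption: the folder layout inside the zip is unknown, so search recursively.
    matches = sorted(
        p for p in dest.rglob("*")
        if p.is_file() and p.name.lower() == exe_name.lower()
    )
    return matches[0] if matches else None
```

Pass the path of the downloaded zip file and a target folder; a result of None means no file with that name was found in the archive.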

Step 2. Convert Scanned Image to PDF E-book

  • Here is the usage for your reference.
  • Usage: img2pdf [options] <Image-file> [<PDF-file>]
  • When you convert a scanned image to a searchable PDF, please refer to the following command line templates.
  • img2pdfnew.exe -ocr 1 -combineword 1 -bitcount 1 C:\in.tif C:\out.pdf
    img2pdfnew.exe -ocr 1 -combineword 1 -bitcount 1 C:\in.pdf C:\out.pdf
    img2pdfnew.exe -ocr 1 -tsocr C:\in.tif C:\out.pdf
    img2pdfnew.exe -ocr 1 -tsocr C:\in.jpg C:\out.pdf
    img2pdfnew.exe -ocr 1 -tsocr C:\in.png C:\out.pdf
    img2pdfnew.exe -ocr 1 -tsocr C:\in.pdf C:\out.pdf
    When converting a scanned file to a searchable E-book, the scanned file can be saved as PDF, BMP, GIF, TIFF, JPEG, PNG, PCX or TGA. All of these file formats can be converted to a searchable PDF E-book.
    Related parameters:
    -ocr <int>: set to 1 when you need to create a full-text searchable PDF file.
    -combineword <int>: add this parameter when you need to combine OCRed characters into words.
    -tsocr: use the tesseract-ocr engine
    -tsocrlang <string>: set the language for the tesseract-ocr engine
    -ocrtxt <string>: export OCRed text to a text file
    -ocrtxtxy <string>: export OCRed text with X, Y coordinates to a text file
    -bitcount <int>: set color depth (PDF to Image)
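
When there are many scans to convert, the command line templates above can be driven from a script. The following Python sketch is only an illustration under stated assumptions: the helper names build_command and convert_folder are hypothetical, only options listed above are used, and success is judged by whether the output PDF exists afterwards, because this article does not describe the exit codes of the tool.

```python
import subprocess
from pathlib import Path

# Input formats named in this article (with common extension variants).
SUPPORTED = {".pdf", ".bmp", ".gif", ".tif", ".tiff",
             ".jpg", ".jpeg", ".png", ".pcx", ".tga"}


def build_command(exe, src, dst, use_tesseract=True):
    """Build an argument list that mirrors the templates in Step 2."""
    cmd = [str(exe), "-ocr", "1"]
    if use_tesseract:
        cmd.append("-tsocr")
    else:
        cmd += ["-combineword", "1", "-bitcount", "1"]
    cmd += [str(src), str(dst)]
    return cmd


def convert_folder(exe, in_dir, out_dir, use_tesseract=True, run=subprocess.run):
    """Convert each supported file in in_dir; return the output PDFs that exist."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for src in sorted(Path(in_dir).iterdir()):
        if not src.is_file() or src.suffix.lower() not in SUPPORTED:
            continue
        # Note: two inputs with the same base name would share one output name.
        dst = out_dir / (src.stem + ".pdf")
        run(build_command(exe, src, dst, use_tesseract))
        if dst.exists():
            created.append(dst)
    return created
```

For example, convert_folder(r"C:\tools\img2pdfnew.exe", r"C:\scans", r"C:\ebooks") would process every supported file in C:\scans; all three paths are placeholders. Passing use_tesseract=False switches to the -combineword 1 -bitcount 1 template.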

Now let us check the conversion result in the following snapshot.

Input TIFF and output searchable PDF

When you make an E-book from a searchable PDF file, the E-book is also searchable. Whether you read it on an iPhone, iPad or another smartphone, you can search the text freely.

This software has more functions than I can list here. To learn more, please visit our website or see the readme.txt file. If you have any questions while using it, please contact us.
