How to extract German text from a tiff file?

In this article, I will show you how to extract text saved in a tiff file to a text document from the command line. Normally, conversion from tiff to text is easy, but when the tiff contains German text, the conversion accuracy cannot be guaranteed, as few OCR engines support the German language. VeryPDF Image to PDF OCR Converter Command Line can help you extract German text saved in a tiff file to a text document. You can also convert the tiff to a searchable PDF file. In the following part, I will show you how to do it.

Step 1. Download Image to PDF OCR Converter Command Line

  • This software is a Windows application; for now it does not work on Mac or Linux systems. There is also a GUI version available if you are not familiar with the OCR function.
  • When the download finishes, you will have a zip file. Please extract it to a folder, and then you can call the executable file.
  • If you already own this software, you do not need to buy an image to PDF converter, as this software can also help you convert images to normal PDF files.

Step 2. Extract German Text from Tiff file to Text Document

Usage: img2pdf [options] <Image-file> [<PDF-file>]

  • When you need to extract text from a tiff file, please refer to the following command line template.
    img2pdfnew.exe -ocr 1 -ocrtxt bw-ocr.txt bw.tif bw-ocr.pdf
    The above command line produces both a text document and a searchable PDF. Now let us check the related parameters.
  • -ocr <int>: creates a full-text searchable PDF file from tiff or other image file formats.
    -tsocr: add this parameter when you need to launch the tesseract-ocr engine.
    -tsocrlang <string>: add this parameter and the corresponding value when you need to set the language for the tesseract-ocr engine.
    -ocrtxt <string>: add this parameter when you need to export the OCRed text to a text file.
    -ocrtxtxy <string>: add this parameter when you need to export the OCRed text with X, Y coordinates to a text file.
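
The parameters above can be assembled into a command line from a script. The following is only a minimal sketch in Python, not part of the product: the helper name `build_ocr_command` is hypothetical, the value `deu` is the language code Tesseract itself uses for German, and it is an assumption that `-tsocrlang` accepts that value and that `-tsocr` can be combined with `-ocr 1`. Please check the help output of the executable before relying on it.

```python
from typing import List, Optional


def build_ocr_command(image: str, txt_out: str, pdf_out: str,
                      lang: Optional[str] = None,
                      exe: str = "img2pdfnew.exe") -> List[str]:
    """Build the argument list for the OCR command line (hypothetical helper)."""
    # Options come first, as in: img2pdf [options] <Image-file> [<PDF-file>]
    cmd = [exe, "-ocr", "1"]
    if lang is not None:
        # Assumption: the language value is a Tesseract code such as "deu".
        cmd += ["-tsocr", "-tsocrlang", lang]
    # Export the OCRed text, then give the input image and the output PDF.
    cmd += ["-ocrtxt", txt_out, image, pdf_out]
    return cmd


if __name__ == "__main__":
    cmd = build_ocr_command("bw.tif", "bw-ocr.txt", "bw-ocr.pdf", lang="deu")
    print(" ".join(cmd))
    # To actually run it on Windows, in the folder with the executable:
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Without the `lang` argument, the helper reproduces the template command shown earlier in this step.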

    Now let us check the conversion result in the following snapshot. For easy checking, I pasted the content from the text file into a Word document.

    Snapshot: input tiff with German text and the output text

    From the above snapshot, we can see that all the German characters were converted correctly. This software also supports many other languages, and it can produce searchable PDF as output. It accepts the following input file formats: PDF, BMP, GIF, TIFF, JPEG, PNG, PCX and TGA, which means you can extract content from any of them.
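
If you have a folder with mixed files, a small script can pick out the ones in the supported formats and plan the output names before calling the command line. This is only a sketch: the helper name `plan_outputs` is hypothetical, the list of file extensions is my assumption about how the formats above are usually named, and the `-ocr` suffix simply follows the `bw-ocr.txt` naming used in this article.

```python
from pathlib import PurePath
from typing import List, Tuple

# Assumed file extensions for the input formats listed in the article.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".bmp", ".gif", ".tif", ".tiff",
    ".jpg", ".jpeg", ".png", ".pcx", ".tga",
}


def plan_outputs(names: List[str]) -> List[Tuple[str, str, str]]:
    """Return (input, text output, PDF output) for each supported file name."""
    plan = []
    for name in names:
        path = PurePath(name)
        # Skip files whose extension is not in the assumed list.
        if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
            continue
        txt_out = str(path.with_name(path.stem + "-ocr.txt"))
        pdf_out = str(path.with_name(path.stem + "-ocr.pdf"))
        plan.append((name, txt_out, pdf_out))
    return plan


if __name__ == "__main__":
    for item in plan_outputs(["bw.tif", "photo.PNG", "notes.docx"]):
        print(item)
```

Each resulting triple can then be placed into the command line template from Step 2.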

    If you have any questions while using it, please contact us as soon as possible.
