How to install language data for VeryPDF Image to PDF OCR Command Line V5?

Compared with the previous version, the latest version of VeryPDF Image to PDF OCR Command Line (V5) is more powerful. The newly added OCR engine improves OCR accuracy, speeds up processing, and makes it possible to convert scanned images in different languages to PDF.

How to use the command line?

VeryPDF Image to PDF OCR Command Line v5 provides two options that recognize text in scanned images and create searchable PDF files from them: -ocr 1 and -tsocr. You can use either one or both in a command line. The main difference is that -tsocr calls the OCR engine that supports multiple languages, while -ocr 1 calls the other OCR engine, which recognizes only English.

To keep the setup file small, VeryPDF Image to PDF OCR Command Line v5 ships with only two built-in language packages: English and German. To recognize a language other than English, use -tsocr together with -tsocrlang <string>, where the string enclosed in angle brackets is the file name of a language package. For example, when the text of the input image is in German, you can use -tsocrlang deu as in the following command line:

 img2pdfnew.exe -tsocr -tsocrlang deu source.tif export.pdf

  • img2pdfnew.exe calls VeryPDF Image to PDF OCR Command Line v5.
  • -tsocr runs the OCR engine which supports multiple languages.
  • -tsocrlang deu specifies the language package. deu is the file name of the German language package.
  • source.tif is the input image.
  • export.pdf is the output PDF file.
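
If you call the tool from a script, the command line above can be assembled programmatically. The following is only a minimal Python sketch: the helper name build_ocr_command is hypothetical, the executable name and the options -tsocr and -tsocrlang are taken from this article, and the sketch only builds and prints the command without running it.

```python
def build_ocr_command(source, output, lang=None, exe="img2pdfnew.exe"):
    """Build the argument list for a multi-language OCR conversion.

    Hypothetical helper for illustration; the option names come from
    this article and are not checked against the installed program.
    """
    command = [exe, "-tsocr"]
    if lang:
        # lang is the file name of a language package, e.g. "deu".
        command += ["-tsocrlang", lang]
    command += [source, output]
    return command


# Reproduces the German example from the article.
print(" ".join(build_ocr_command("source.tif", "export.pdf", lang="deu")))
# prints: img2pdfnew.exe -tsocr -tsocrlang deu source.tif export.pdf
```

On a machine where the software is installed, the returned list could be passed to subprocess.run from the Python standard library.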

How to install more language data?

The following section explains how to download and install language data for VeryPDF Image to PDF OCR Command Line V5. Assuming you want to create a PDF from a scanned image whose text is in French, you can do as follows:

Step 1. Download the French language package
First, click OCR Language List to visit the webpage. Second, download the language package you need. For example, you can click the link tesseract-ocr-3.02.fra.tar.gz to download the French language package.

    Step 2. Uncompress the language package
    After downloading the French language package, uncompress it. Open the uncompressed folder, then open the folder named tessdata inside it, and you will find two files there.

    Step 3. Copy the two data files to the tessdata folder of the installation
    Copy the two files and paste them into the tessdata folder under the installation folder of VeryPDF Image to PDF OCR Command Line V5.
    Then, you can use -tsocr and -tsocrlang <string> to convert an image with French text to PDF. The value for French for the option -tsocrlang <string> is fra, which you can find before .tar.gz in the language package name tesseract-ocr-3.02.fra.tar.gz.
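
Steps 2 and 3 can also be scripted. The following is a rough Python sketch, not an official installer: the function name install_language_pack is hypothetical, and it assumes the downloaded archive is a .tar.gz file that stores its data files directly inside a folder named tessdata, as described above. The installation folder is whatever path you pass in.

```python
import shutil
import tarfile
from pathlib import Path, PurePosixPath


def install_language_pack(archive_path, install_dir):
    """Copy the data files from the archive's tessdata folder into
    <install_dir>/tessdata and return the copied file names.

    Hypothetical helper for illustration only.
    """
    target = Path(install_dir) / "tessdata"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            path = PurePosixPath(member.name)
            # Keep only regular files that sit directly inside a tessdata folder.
            if not member.isfile() or path.parent.name != "tessdata":
                continue
            source = tar.extractfile(member)
            with source, open(target / path.name, "wb") as out:
                shutil.copyfileobj(source, out)
            copied.append(path.name)
    return sorted(copied)
```

For example, install_language_pack("tesseract-ocr-3.02.fra.tar.gz", install_dir) would copy the French data files, where install_dir is the installation folder of the software on your machine.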

    Values for -tsocrlang <string>

    You can find the values for other languages in the same way after you download more language packages. The following table lists the values for some other languages:

    Value  Language    Download Link
    spa    Spanish     tesseract-ocr-3.02.spa.tar.gz
    fra    French      tesseract-ocr-3.02.fra.tar.gz
    ita    Italian     tesseract-ocr-3.02.ita.tar.gz
    nld    Dutch       tesseract-ocr-3.02.nld.tar.gz
    ell    Greek       tesseract-ocr-3.02.ell.tar.gz
    swe    Swedish     tesseract-ocr-3.02.swe.tar.gz
    por    Portuguese  tesseract-ocr-3.02.por.tar.gz
    More…
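
The rule for reading the value out of a package name (the part just before .tar.gz) can be expressed as a small Python sketch. The function name language_code is hypothetical, and the sketch assumes package names follow the pattern shown in the table.

```python
def language_code(package_name):
    """Return the part of a language package name just before '.tar.gz'.

    Hypothetical helper; assumes names like 'tesseract-ocr-3.02.fra.tar.gz'.
    """
    suffix = ".tar.gz"
    if not package_name.endswith(suffix):
        raise ValueError("expected a .tar.gz package name: " + package_name)
    # Drop the suffix, then take the text after the last remaining dot.
    return package_name[: -len(suffix)].rsplit(".", 1)[-1]


print(language_code("tesseract-ocr-3.02.fra.tar.gz"))  # prints: fra
```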

    We hope this article is helpful. If you have any questions about how to convert images to PDF or about VeryPDF Image to PDF OCR Command Line V5, please drop us a line. You can also contact the VeryPDF support group for help.

    To download VeryPDF Image to PDF OCR Command Line V5, please click here.