VeryPDF PDF to Text OCR Converter Command Line

  • Extract text from scanned PDF.
  • Convert image to editable textual file.
  • Recognize text from scanned documents in batches.
Version 3.0
$195.00

VeryPDF PDF to Text OCR SDK for .NET is a software component that provides tools and libraries that let developers quickly integrate PDF to Text OCR Converter, or individual functions of it, into other applications. The converter uses OCR (Optical Character Recognition) technology to convert scanned PDF files and images to plain text (TXT).

System Requirements

Operating Systems: all Windows systems, such as Windows 2000, XP, Vista, 7, and Windows Server 2003 and 2008, in both 32-bit and 64-bit editions.

Version: v2.0

Program UI Language: English

Input: Text based PDF file, Scanned PDF file, scanned TIFF file, JPEG, JPG, BMP, GIF, PNG

Output: Text file with layout, Text file with reading order, Searchable PDF file with color information, Searchable PDF file without color information

Main Features

Convert normal and scanned PDF to text
VeryPDF PDF to Text OCR SDK for .NET can convert PDF files to text files with or without the original layout, and it can also recognize and extract text from scanned PDF files with OCR.

 

Extract text from scanned TIFF and image
The application can recognize text in scanned TIFF files and other images and extract it to a text file with OCR technology.

 

Create searchable PDF
This application can create searchable PDF files from scanned TIFF, image, and PDF files. It can also set an open password, owner password, permissions, and key length on the output PDF file.

 

Support OCR
The OCR engine supports more than ten languages and five different modes.

 

Support various settings
It supports settings such as automatic deskewing and despeckling of images, keeping the coordinate information of text in the original PDF, inserting page breaks (0x0C) between pages in the text file, rotating pages, and setting the lightness threshold.
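
Because page breaks are written as the form-feed character 0x0C, a program that reads the text output can recover per-page text by splitting on that character. The following is a minimal sketch and not part of the SDK: the names `PageSplitter` and `SplitPages` are hypothetical, and it assumes the text file was produced with the page-break setting enabled.

```csharp
using System;

// Hypothetical helper for illustration; not part of the VeryPDF SDK.
public static class PageSplitter
{
    // Splits converter text output into pages on the form-feed character (0x0C).
    public static string[] SplitPages(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        return text.Split('\f');
    }
}
```

A caller would typically read the output file into a string first (for example with `System.IO.File.ReadAllText`) and pass it to `SplitPages`. If a file happens to end with a page break, the last array element will be an empty string.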

 

Support various programming languages
It provides a COM interface that can be called easily from VB, VB.NET, C#, and ASP.NET. Developers can integrate the program's code and APIs into their own applications.

Feature List of VeryPDF PDF to Text OCR SDK for .NET

  • Convert PDF files to text files and keep original layout;
  • Convert PDF files to text files and keep reading order (without original layout);
  • Provide a COM interface that can be called easily from VB, VB.NET, C#, and ASP.NET;
  • Convert scanned PDF files to text files;
  • Convert scanned TIFF and Image files to text files;
  • Support multi-page TIFF and PDF files as input format;
  • Support more than ten languages;
  • Able to create searchable PDF files from scanned TIFF files, image files and PDF files;
  • Create searchable PDF with original color retained;
  • Create searchable black-and-white PDF without image;
  • Create searchable black-and-white PDF with image;
  • Create searchable PDF with a specific color bit depth, such as a color or grayscale PDF file;
  • Create a text file containing the coordinate information of text in the original PDF, including [X, Y, Width, Height] for each word recognized by OCR;
  • Able to set open password, owner password, permissions, and key length on the output PDF file;
  • Able to insert page breaks (0x0C) between pages in the text file;
  • Able to rotate pages before OCR;
  • Support a threshold option that controls the lightness threshold used to convert a color image to a black-and-white image;
  • Able to deskew and despeckle images automatically;
  • Support several OCR modes:

    -ocrmode <int> : set the OCR mode when converting text-based PDF files and scanned PDF files to searchable PDF files
    -ocrmode 0: output to plain text file
    -ocrmode 1: OCR PDF pages and insert a new text layer under the original PDF pages
    -ocrmode 2: output to plain text-based PDF file (pure text-based PDF file)
    -ocrmode 3: output to OCRed PDF file (black and white) with hidden text layer
    -ocrmode 4: output to OCRed PDF file (color) with hidden text layer
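
The mode is selected by placing `-ocrmode <int>` ahead of the quoted input and output paths, as the C# sample on this page does. The following is a minimal sketch of assembling such a command string. The `OcrMode` enum and the `OcrCommand.Build` helper are hypothetical names introduced for illustration; only the `-ocrmode` values 0 to 4 come from the list above, and registration options are left out.

```csharp
using System;

// Hypothetical names for illustration; only the numeric -ocrmode values come from this page.
public enum OcrMode
{
    PlainText = 0,          // plain text file
    TextLayerUnderPage = 1, // new text layer under the original PDF pages
    PureTextPdf = 2,        // plain text-based PDF file
    SearchableBwPdf = 3,    // OCRed black-and-white PDF with hidden text layer
    SearchableColorPdf = 4  // OCRed color PDF with hidden text layer
}

public static class OcrCommand
{
    // Builds: -ocrmode N "input" "output"
    // Both paths are quoted so that paths containing spaces stay intact.
    public static string Build(OcrMode mode, string inFile, string outFile)
    {
        if (string.IsNullOrEmpty(inFile))
            throw new ArgumentException("Input path is required.", nameof(inFile));
        if (string.IsNullOrEmpty(outFile))
            throw new ArgumentException("Output path is required.", nameof(outFile));

        return "-ocrmode " + (int)mode + " \"" + inFile + "\" \"" + outFile + "\"";
    }
}
```

The resulting string could then be passed as the single argument to `com_PDFToTextOCRSDKShell`, in the same way the sample passes `strCmd`. Using an enum instead of a bare integer keeps the meaning of each mode visible at the call site.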

Sample C# project:

using System;
using System.Reflection;
using System.Windows.Forms;

namespace CSharp_WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            string strStartupPath = System.Windows.Forms.Application.StartupPath + "\\";

            // Look up the COM class by ProgID; null means it is not registered on this machine.
            System.Type pdf2vecName = Type.GetTypeFromProgID("pdfcom.pdfclass");
            if (pdf2vecName != null)
            {
                object pdf2vec = Activator.CreateInstance(pdf2vecName);
                string strInFile = strStartupPath + "test-color.tif";
                string strOutFile = strStartupPath + "_test-color.pdf";

                // The X's after -$ are a placeholder; -ocrmode 4 outputs a color OCRed PDF.
                // Both file paths are quoted so that paths with spaces stay intact.
                string strCmd = "-$ XXXXXXXXXXXXXXXXXXXX -ocrmode 4 \"" + strInFile + "\" \""
                        + strOutFile + "\"";
                MessageBox.Show(strCmd);

                // Late-bound call: the whole command line is passed as a single string argument.
                object[] argn = new object[1];
                argn[0] = strCmd;
                int nRet = (int)pdf2vecName.InvokeMember("com_PDFToTextOCRSDKShell",
                            BindingFlags.InvokeMethod, null, pdf2vec, argn);
                MessageBox.Show("Return Value is: " + string.Format("{0}", nRet));
            }
        }
    }
}

To learn more about using this .NET package, download VeryPDF PDF to Text OCR SDK for .NET and try it.

To get the full version of this .NET package, purchase VeryPDF PDF to Text OCR SDK for .NET.
