VeryPDF Document Processing Technologies for Document Scans.
Document Processing comprises a range of features that aim to enhance the functionality and automation of scanned documents or improve their image quality. Some of these features include OCR, batch splitting, blank page removal, despeckle and deskew, and cloud-based or self-hosted document processing.
1. OCR
2. Batch Splitting
3. Blank Page Removal
4. Despeckle and Deskew
5. Document Processing in the cloud or self-hosted
VeryPDF OCR to Any Converter SDK/COM:
https://www.verypdf.com/app/ocr-to-any-converter-cmd/try-and-buy.html#buysdk
1. OCR
OCR, or Optical Character Recognition, is a process that involves taking an image, such as a scanned document, and reconstructing its text. This process enables scanned documents to become searchable and editable. By making documents searchable, users can easily locate and copy specific content within the document. Additionally, if the document has been added to a document management system, users can find it by searching for its content.
Although OCR is a useful tool, it is a resource-intensive process that can add seconds or even tens of seconds per page to the time it takes to deliver a document. As a result, it is recommended to enable OCR on scan actions where it is most useful rather than where fast delivery is a priority.
Currently, VeryPDF software can produce two text-searchable file types: PDF (PDF v1.4, compliant with PDF/A-1 as defined by the PDF/A standard) and DOCX.
OCR supports text extraction in approximately 100 languages, of which users can select up to 10 at a time. For optimal performance, however, it is recommended to select no more than four languages.
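The language limits and the per-page cost described above can be illustrated with a short Python sketch. The function names and messages are hypothetical and are not part of any VeryPDF API. The delay estimate assumes a constant cost per page, which real documents will only approximate.

```python
MAX_LANGUAGES = 10         # upper limit stated in the text
RECOMMENDED_LANGUAGES = 4  # recommended ceiling stated in the text


def check_languages(languages):
    """Validate an OCR language selection (hypothetical helper).

    Returns a tuple (allowed, message).
    """
    unique = list(dict.fromkeys(languages))  # drop duplicates, keep order
    if len(unique) > MAX_LANGUAGES:
        return False, f"too many languages: {len(unique)} > {MAX_LANGUAGES}"
    if len(unique) > RECOMMENDED_LANGUAGES:
        return True, f"allowed, but more than {RECOMMENDED_LANGUAGES} languages may slow OCR"
    return True, "ok"


def estimate_ocr_delay(pages, seconds_per_page):
    """Rough extra delivery time in seconds, assuming a constant per-page cost."""
    return pages * seconds_per_page


print(check_languages(["en", "de", "fr"]))
print(estimate_ocr_delay(pages=50, seconds_per_page=5), "seconds added")
```

At an assumed 5 seconds per page, a 50-page scan would be delayed by roughly four minutes, which is why OCR is best reserved for scan actions where searchable output matters more than fast delivery.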
2. Batch Splitting
Batch Splitting is a powerful feature that allows you to split a large input document into multiple output documents. This feature is particularly useful for high-capacity document feeders and when scanning batches of forms or invoices.
When using Batch Splitting, you have two options: splitting every N pages or splitting on blank separator pages. The first option lets you set the number of pages to split after. The second option detects a blank page and uses it as a separator to split the current document into multiple output documents.
All output documents will have the same image and quality settings, such as DPI, color, and orientation. They will also be delivered to the same destination with the same root file name appended numerically for each document. For example, if your root file name is "Invoice," your output documents will be named "Invoice_1," "Invoice_2," and so on.
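As a rough illustration, splitting every N pages and the numeric file naming could look like the following Python sketch. Pages are represented as plain strings, and the function names are hypothetical. How the product handles a final group with fewer than N pages is not stated in the text, so this sketch assumes it becomes a shorter last document.

```python
def split_every_n(pages, n):
    """Split a list of pages into consecutive documents of at most n pages."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [pages[i:i + n] for i in range(0, len(pages), n)]


def output_names(root, count):
    """Name outputs root_1, root_2, ... following the "Invoice" example."""
    return [f"{root}_{i}" for i in range(1, count + 1)]


batch = ["p1", "p2", "p3", "p4", "p5"]
documents = split_every_n(batch, 2)
names = output_names("Invoice", len(documents))

for name, doc in zip(names, documents):
    print(name, doc)
```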
You can also use Batch Splitting in combination with Blank Page Removal. This means that the blank pages will be detected and used as a trigger for splitting, and then removed from the resulting individual documents.
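The combined behavior can be sketched in Python as follows. This is a minimal illustration, not VeryPDF's implementation: pages are plain strings, the `is_blank` callback stands in for the real blank page detector, and the sketch assumes that several separators in a row never produce an empty output document.

```python
def split_on_blank_separators(pages, is_blank):
    """Split at blank pages and drop the blank pages from the output."""
    documents = []
    current = []
    for page in pages:
        if is_blank(page):
            # A blank page closes the current document and is not kept.
            if current:
                documents.append(current)
                current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents


BLANK = ""  # stand-in for a blank separator sheet
scan = ["inv A p1", "inv A p2", BLANK, "inv B p1", BLANK, BLANK, "inv C p1"]
docs = split_on_blank_separators(scan, lambda page: page == BLANK)

for number, doc in enumerate(docs, start=1):
    print(f"Invoice_{number}", doc)
```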
3. Blank Page Removal
The Blank Page Removal feature offered by VeryPDF Software is designed to streamline the process of creating digital copies of paper documents. By detecting and removing pages that contain no content, Blank Page Removal can reduce the size of scanned documents and improve the overall user experience.
Blank Page Removal can be turned ON or OFF for each scan action. The feature analyzes each page and compares its amount of white space against a threshold. Pages that meet or exceed the threshold are deleted. A default threshold is provided, but users can adjust it through a configuration key to fine-tune sensitivity.
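The threshold comparison can be sketched in Python as shown below. This is an illustration under assumptions, not the product's algorithm: a page is modeled as rows of grayscale pixel values from 0 to 255, and the `white_level` and `threshold` defaults are invented for the example. The actual default and the name of the configuration key are not given in the text.

```python
def white_ratio(page, white_level=250):
    """Fraction of pixels at or above white_level in a grayscale page."""
    total = sum(len(row) for row in page)
    if total == 0:
        return 1.0  # assumption: a page with no pixels counts as blank
    white = sum(1 for row in page for pixel in row if pixel >= white_level)
    return white / total


def is_blank(page, threshold=0.99):
    """A page is blank when its white space meets or exceeds the threshold."""
    return white_ratio(page) >= threshold


empty_page = [[255] * 10 for _ in range(10)]
printed_page = [[255] * 10 for _ in range(10)]
for x in range(5):
    printed_page[0][x] = 0  # five dark pixels of content

print(is_blank(empty_page))
print(is_blank(printed_page))
print(is_blank(printed_page, threshold=0.95))
```

Lowering the threshold makes detection more aggressive, so faintly marked pages are more likely to be removed. Raising it makes detection more conservative.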
One benefit of Blank Page Removal is that it operates at the page level, not the sheet level, so the blank side of a sheet can be removed while the printed side is kept. This makes it particularly well suited to reducing the size of scans of single-sided content. When several blank pages occur in a row, Blank Page Removal treats them as a single blank page and removes them all.
Blank Page Removal is a valuable tool for anyone looking to create digital copies of paper documents in a more efficient and streamlined manner. By removing unnecessary pages and reducing file sizes, this feature can help save time and improve the quality of digital documents.
4. Despeckle and Deskew
The Despeckle and Deskew features are essential tools for improving scanned image quality. The Despeckle feature removes pixel noise and is particularly useful for documents that have been printed or copied multiple times, such as student forms or older documents. The Deskew feature detects crooked pages and corrects skew of up to 45 degrees, which makes it well suited to flatbed scanning, where paper is easily misaligned.
Using these features can greatly enhance the accuracy of OCR and other Document Processing features. They can be used alone or in conjunction with other features to achieve the best results possible.
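The general ideas behind both features can be sketched in Python. The text does not describe the algorithms VeryPDF uses, so this sketch substitutes two generic techniques: an isolated-pixel filter for despeckling, and a least-squares line fit for estimating the skew angle. The image is modeled as rows of 0 (white) and 1 (black), the baseline points are assumed to have been detected already, and all names are hypothetical.

```python
import math

MAX_DESKEW_DEGREES = 45  # correction limit stated in the text


def despeckle(image, min_neighbors=1):
    """Clear black pixels with fewer than min_neighbors black neighbors."""
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            neighbors = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbors += image[ny][nx]
            if neighbors < min_neighbors:
                cleaned[y][x] = 0  # isolated speck: treat as noise
    return cleaned


def estimate_skew_degrees(baseline_points):
    """Angle of the least-squares line through (x, y) baseline points.

    The sign depends on the image coordinate convention in use.
    """
    n = len(baseline_points)
    mean_x = sum(x for x, _ in baseline_points) / n
    mean_y = sum(y for _, y in baseline_points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in baseline_points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in baseline_points)
    if sxx == 0:
        raise ValueError("points must span more than one x position")
    return math.degrees(math.atan2(sxy, sxx))


noisy = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],  # lone pixel: noise
    [0, 0, 0, 0],
    [0, 0, 1, 1],  # two adjacent pixels: kept
]
clean = despeckle(noisy)

angle = estimate_skew_degrees([(0, 0), (10, 1), (20, 2)])
correctable = abs(angle) <= MAX_DESKEW_DEGREES
print(clean, round(angle, 2), correctable)
```

A production deskew step would also rotate the image by the negative of the estimated angle. That step is omitted here to keep the sketch short.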
5. Document Processing in the cloud or self-hosted
VeryPDF Software offers two options for running Document Processing: the VeryPDF Cloud Document Processing service, or a self-hosted infrastructure. The self-hosted solution is only available for Windows.
If you choose the VeryPDF Cloud Document Processing service, the processing takes place on VeryPDF's servers. This option is ideal for organizations that don't have a high-performance application server or multiple Document Processing servers.
The self-hosted Document Processing option is intended for organizations that need to keep their data within their own infrastructure, typically for regulatory or compliance reasons. It requires a high-performance application server or the resources to configure multiple Document Processing servers.
However, keep in mind that the self-hosted option requires you to install the service on selected infrastructure and keep it updated by installing new versions. For more information, check out the Document Processing FAQs or visit the Set up self-hosted Document Processing page to get started.
➤ Want to buy this product from VeryPDF?
Should you be interested in acquiring a license for our product or require assistance in developing a custom software solution based on it, please do not hesitate to reach out to us. Our team is always ready to assist you and provide you with the necessary support.
We look forward to the opportunity of working with you and providing developer assistance if required.