Extract tables from low-quality scanned PDFs

VeryPDF Table Extractor OCR, with its built-in OCR engines and preprocessing tools, can help you extract tables from scanned PDFs and images. This article explains how to use its Clean preprocessing tools, which remove unwanted dots, frames, and colored cell backgrounds from the original file so that the text in tables can be extracted more accurately.

The Advanced OCR dialog box, shown below, contains the preprocessing tools. The options under Clean, marked by the red rectangle, are used to clean pages.

[Screenshot: Advanced OCR tools for extracting tables from PDFs and images]

The following two examples show the effects of the Clean tools provided by VeryPDF Table Extractor OCR.

Example 1

The first screenshot shows part of the input PDF. After the De-speckle button is used, the same area contains far fewer dots, as the second screenshot shows.

Before                                                        After

[Screenshot: pair 1, original]    [Screenshot: pair 1, after de-speckle]
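
For readers curious about what de-speckling does to the pixels, the general idea can be sketched in a few lines of Python. This is only an illustration using a 3x3 median filter, a common noise-removal technique. It is an assumption made for this sketch and not necessarily the method VeryPDF Table Extractor OCR uses. The function name `despeckle` and the tiny sample page are hypothetical.

```python
from statistics import median


def despeckle(image):
    """Apply a 3x3 median filter to a grayscale image (rows of 0-255 ints).

    Illustrative sketch only; not the algorithm of any particular product.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            # Collect the pixel and its eight neighbours, clamping at the border.
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            result[y][x] = int(median(window))
    return result


# Hypothetical sample: a white 5x5 page (255) with one black speck (0).
page = [[255] * 5 for _ in range(5)]
page[2][2] = 0
cleaned = despeckle(page)
```

An isolated dot is outvoted by its white neighbours and disappears. A median filter can also erase very thin strokes, so it is worth checking the cleaned page before running OCR.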

Example 2

The second pair of screenshots is shown below. The first one is part of an image file. After VeryPDF Table Extractor OCR removes the table frames and the background of some cells, the image contains only text, as the second screenshot shows.

Before                                                        After

[Screenshot: pair 2, before]    [Screenshot: pair 2, after filling with white]
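
The "filling with white" effect can be illustrated with a similar sketch. It assumes that text is black or gray (its red, green, and blue values are nearly equal) while cell backgrounds and frames are colored, and it paints every colored pixel white. The function name `fill_color_with_white` and the `threshold` parameter are hypothetical, and the product may well work differently.

```python
WHITE = (255, 255, 255)


def fill_color_with_white(image, threshold=40):
    """Replace pixels whose RGB channels differ by more than `threshold` with white.

    Illustrative sketch only: gray and black pixels (text) are left untouched.
    """
    return [
        [WHITE if max(pixel) - min(pixel) > threshold else pixel for pixel in row]
        for row in image
    ]


# Hypothetical sample row: yellow cell background, black text pixel, blue frame pixel.
row = [(255, 255, 0), (0, 0, 0), (0, 0, 255)]
whitened = fill_color_with_white([row])
```

This simple rule cannot remove black frames, because they look the same as black text to it; handling those would need extra logic such as line detection.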

The two tools under Clean work as follows:

  • De-speckle: if a scanned PDF or image contains spots and dots, this tool removes them automatically.
  • The second tool removes color blocks such as colored cell backgrounds, frames, and background color.

Are you interested in using VeryPDF Table Extractor OCR to extract tables and text from low-quality images and scanned PDFs? Contact the VeryPDF Support Group to ask for a trial version of VeryPDF Table Extractor OCR.
