VeryPDF Table Extractor OCR is a professional tool that extracts tables from scanned PDFs and images and saves the resulting tables as HTML, Excel, TXT, or CSV. After the conversion, you can also upload the original files and the result files to the cloud. The tool recognizes more than 20 languages and is available in a Windows version and a Mac version.
Normally, the resolution of input files should be higher than 200 DPI. To improve OCR accuracy, especially for files with a resolution between 150 and 200 DPI, VeryPDF Table Extractor OCR provides some advanced OCR options.
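If you are not sure what resolution a scan has, you can estimate it from the pixel size and the physical page size. The following is a minimal sketch, not part of VeryPDF Table Extractor OCR; the function names and the sample pixel widths are assumptions made for illustration, and the 150 and 200 DPI limits are the ones given in this article.

```python
def effective_dpi(pixels, inches):
    """Dots per inch of a scan: pixel count divided by physical length."""
    return pixels / inches


def suggested_workflow(dpi):
    """Map a resolution to the workflow this article describes (hypothetical helper)."""
    if dpi > 200:
        return "plain OCR"
    if dpi >= 150:
        return "advanced OCR options"
    return "below the range covered here"


# An A4 page is 210 mm wide; 25.4 mm make one inch.
a4_width_in = 210 / 25.4

sharp_scan = effective_dpi(2480, a4_width_in)   # roughly 300 DPI
coarse_scan = effective_dpi(1400, a4_width_in)  # roughly 169 DPI

print(round(sharp_scan), suggested_workflow(sharp_scan))
print(round(coarse_scan), suggested_workflow(coarse_scan))
```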
You can click buttons to de-skew and de-speckle the original input files automatically. You can also set a threshold to turn color images or PDFs into black and white, set rotation angles to rotate input PDF pages, fill colored cell backgrounds or table frame lines with white, and so on.
This article mainly introduces the advanced OCR options of VeryPDF Table Extractor OCR and the buttons you will need when you use the OCR function.
When you process a PDF or image with a resolution higher than 200 DPI, you can just click the OCR button. When the input pages or images are not straight, have spots or marks in the background, have a background that is not white, or have a resolution between 150 and 200 DPI, you may need to use the advanced OCR options. Please click the down arrow beside the OCR button. A dialog box will then appear with the following options:
1. Set the Threshold
You can turn color files into black and white ones by setting the options under Threshold. The default threshold value is 200. You can either click the arrows in the spin box to change the value or drag the handle of the slider.
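To see what a threshold does, here is a minimal sketch of generic global thresholding in Python. It is not the application's own code: how VeryPDF Table Extractor OCR converts colors internally is not documented here, so the function name, the sample pixel values, and the choice to treat a value equal to the threshold as white are all assumptions.

```python
def binarize(gray, threshold=200):
    """Map each grayscale value (0-255) to black (0) or white (255).

    Assumption: values at or above the threshold become white.
    """
    return [[255 if pixel >= threshold else 0 for pixel in row] for row in gray]


# Hypothetical grayscale patch: dark text (30), light cell shading (180),
# and near-white paper (199-250).
page = [
    [30, 180, 250],
    [210, 199, 200],
]

at_default = binarize(page)       # threshold 200, the default mentioned above
at_lower = binarize(page, 150)    # a lower threshold

print(at_default)
print(at_lower)
```

In this sketch the light shading (180) turns black at the default value of 200 but turns white at 150, which is why adjusting the threshold can help when cells have a colored background.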
2. Rotate Pages
If some pages of your input file are not straight, you can use the buttons under Rotation or change the value in the spin box under Rotation.
- One button rotates the selected page automatically.
- Two further buttons turn the page by 90 degrees, one clockwise and one counter-clockwise.
- Furthermore, you can change the number in the spin box and then click the button after the spin box to rotate the page clockwise by the specified number of degrees.
- Another button lets you draw a line on the selected page to rotate it. For instance, when you draw a line from the lower left to the upper right at 75 degrees, the page turns clockwise by 75 degrees.
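The relationship between a drawn line and a rotation angle can be sketched as follows. This is only an illustration under stated assumptions: the angle is measured from the horizontal, image coordinates have y growing downward, and the function name and sample points are hypothetical. How the application itself measures the line is not documented here.

```python
import math


def line_angle_degrees(x1, y1, x2, y2):
    """Angle of a line above the horizontal, in image coordinates (y grows downward)."""
    dx = x2 - x1
    dy = y1 - y2  # flip the sign so that "up on the screen" counts as positive
    return math.degrees(math.atan2(dy, dx))


# Hypothetical drag from the lower left toward the upper right at 75 degrees.
start = (100.0, 500.0)
length = 400.0
end = (
    start[0] + length * math.cos(math.radians(75)),
    start[1] - length * math.sin(math.radians(75)),
)

angle = line_angle_degrees(start[0], start[1], end[0], end[1])
print(round(angle, 6))  # the number of degrees the page would be turned clockwise
```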
3. Clean the Background
If there are spots on the background, you can use the buttons under Clean to improve OCR accuracy. You can de-speckle the selected page either automatically or manually.
- One button removes the speckles automatically.
- To remove table frame lines or a cell background, click the manual button under Clean and then click the lines or background to replace them with white.
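As an illustration of what automatic de-speckling means, the sketch below removes isolated black pixels from a black-and-white page. It is a generic technique chosen for this example, not the method the application is known to use; the function name and the tiny sample page are assumptions.

```python
def despeckle(bw):
    """Turn black pixels that have no black neighbor (8-connected) white."""
    height, width = len(bw), len(bw[0])
    cleaned = [row[:] for row in bw]  # work on a copy, read from the original
    for y in range(height):
        for x in range(width):
            if bw[y][x] != 0:
                continue  # only black pixels can be speckles
            has_black_neighbor = any(
                bw[ny][nx] == 0
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_black_neighbor:
                cleaned[y][x] = 255
    return cleaned


W, B = 255, 0
# Hypothetical page: one isolated speck and one small connected stroke.
page = [
    [W, W, W, W, W],
    [W, B, W, W, W],  # isolated speck
    [W, W, W, B, B],  # part of a stroke
    [W, W, W, B, W],  # part of the same stroke
]

cleaned = despeckle(page)
for row in cleaned:
    print(row)
```

The isolated speck disappears, while the connected stroke, which stands in for text or a table line, is kept.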
After setting all the options in the advanced OCR dialog box, you can click Apply to apply the settings to the current page. You can click Revert to undo all the settings before closing the dialog box. Together, the advanced OCR options can help you improve OCR accuracy.
4. Support Multiple Languages
To make the OCR engines recognize another language, do as follows:
- Click the language button after the OCR button to open the Language dialog box.
- In the Language dialog box, double-click the radio button in front of the language you want to recognize, such as Swedish.
- A dialog box will appear, asking whether you want to download the corresponding language package.
- Click Yes to start downloading.
- After the language package is downloaded, click Apply in the Language dialog box. The text on the language button then changes to the selected language.
Then, once you click the OCR button, the OCR engines can recognize Swedish.
------------------------------------------------------------
How to Get VeryPDF Table Extractor OCR?
Do you want to try this product right away? Please contact the VeryPDF Support Group by sending an email to support@verypdf.com to ask for a beta version. You can also visit the VeryPDF homepage to get updates. As soon as the product passes testing, it will be published, and you can get news about it on the homepage.
------------------------------------------------------------