How to extract tables from scanned or image-based PDF files?

I can't seem to extract the table inside certain PDF files. When I use other software, however, the conversion is almost flawless. Could you help me find the settings that give the best result here? Thanks!

Attached is the file that contains the table I'm trying to extract.

Customer
----------------------------------------
Thanks for the sample PDF file. We noticed that it was created by a scanner: it is an image-based PDF file and doesn't contain any text information.

We have worked out a solution for you. Please follow the steps below to extract the tables from this scanned PDF file as plain-text tables.

1. Download PDF to Text OCR Converter Command Line from one of these links:

https://www.verypdf.com/app/pdf-to-text-ocr-converter/try-and-buy.html#buy
https://www.verypdf.com/pdf2txt/pdf2txtocrcmd.zip

2. After you download the package and unzip it to a folder, run the following command line to convert the scanned PDF file to a text-based PDF file:

pdf2txtocr.exe -ocrmode 2 -forcefontsize 20 D:\downloads\file01.pdf D:\downloads\file02.pdf

The command line above converts your scanned PDF file to a pure text-based PDF file.

3. Run the following command line to convert the new text-based PDF file to a text file:

pdf2txtocr.exe -layout D:\downloads\file02.pdf D:\downloads\file02.txt
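
If you need to process many files, the two command lines from steps 2 and 3 can be chained from a script. The following Python sketch is only an illustration: it reuses exactly the options shown above, and it assumes that pdf2txtocr.exe is in the current folder or on PATH. The function names and the "_ocr" output naming are hypothetical, not part of the product.

```python
import subprocess
from pathlib import Path

# Assumption: pdf2txtocr.exe is in the current folder or on PATH.
OCR_EXE = "pdf2txtocr.exe"


def build_commands(scanned_pdf, work_dir):
    """Build the step 2 and step 3 command lines for one scanned PDF file."""
    scanned = Path(scanned_pdf)
    out_dir = Path(work_dir)
    # Hypothetical naming scheme for the intermediate and final files.
    text_pdf = out_dir / (scanned.stem + "_ocr.pdf")
    text_file = out_dir / (scanned.stem + "_ocr.txt")
    # Step 2: scanned PDF -> text-based PDF (options copied from the command above).
    ocr_cmd = [OCR_EXE, "-ocrmode", "2", "-forcefontsize", "20",
               str(scanned), str(text_pdf)]
    # Step 3: text-based PDF -> layout-preserving text file.
    layout_cmd = [OCR_EXE, "-layout", str(text_pdf), str(text_file)]
    return ocr_cmd, layout_cmd


def convert(scanned_pdf, work_dir):
    """Run both steps in order and stop if either command fails."""
    for cmd in build_commands(scanned_pdf, work_dir):
        subprocess.run(cmd, check=True)
```

For example, convert(r"D:\downloads\file01.pdf", r"D:\downloads") would run step 2 and then step 3 on the sample file.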

The resulting text file contains text columns, which you can easily import into MS Excel.
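
As an alternative to importing the text file by hand, the columns can be split with a short script. The Python sketch below is a rough illustration, not part of our software: it assumes that the columns in the text file are separated by two or more spaces and that the file can be read as UTF-8. The function names are hypothetical. Rows with empty cells may end up misaligned, so please check the result.

```python
import csv
import re


def split_columns(line):
    """Split one layout-preserved line on runs of two or more whitespace characters."""
    return [cell.strip() for cell in re.split(r"\s{2,}", line.strip())]


def text_to_csv(txt_path, csv_path):
    """Write every non-empty line of the text file as one CSV row."""
    # Assumption: the text file is UTF-8; undecodable bytes are replaced.
    with open(txt_path, encoding="utf-8", errors="replace") as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for line in src:
            if line.strip():
                writer.writerow(split_columns(line))
```

For example, text_to_csv(r"D:\downloads\file02.txt", r"D:\downloads\file02.csv") would write a CSV file that MS Excel can open directly. Single spaces inside a cell are kept, because only wider gaps are treated as column separators.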

4. You can also use PDF to Excel Converter Command Line or PDF Table Extractor to extract the tables from the text-based PDF file to a CSV file. You can download these products from the following web pages:

https://veryutils.com/pdf-to-excel-converter
https://veryutils.com/pdf-to-excel-converter-command-line

The PDF to Excel Converter Command Line software lets you extract tables from this PDF file and save them to CSV files from the command line.

In the PDF Table Extractor software, you can draw rectangles to select the tables you want to extract.

After you extract the data to a CSV file, you can easily reuse the table data in MS Excel.

We suggest downloading these products from our website to try them. If you encounter any problem, please feel free to let us know; we are glad to assist you as soon as possible.

VeryPDF
