I can't seem to extract the table inside certain PDF files. When I use other software, however, the conversion is almost flawless. Can you help me find which settings I should use to get the best result here? Thanks!
Attached is the file containing the table that I'm trying to extract.
Thanks for your sample PDF file. We noticed that it was created by a scanner, so it is an image-based PDF file and doesn't contain any text information.
We have worked out a solution for you. Please follow the steps below to extract the tables from this scanned PDF file as plain-text tables.
1. Please download PDF to Text OCR Converter Command Line from this web page.
2. After you download it and unzip it to a folder, run the following command line to convert the scanned PDF file to a text-based PDF file:
pdf2txtocr.exe -ocrmode 2 -forcefontsize 20 D:\downloads\file01.pdf D:\downloads\file02.pdf
The above command line converts your scanned PDF file to a purely text-based PDF file.
3. Then run the following command line to convert the new text-based PDF file to a text file:
pdf2txtocr.exe -layout D:\downloads\file02.pdf D:\downloads\file02.txt
The resulting text file contains text columns, which you can easily import into MS Excel.
4. You can also use Converter Command Line or PDF Table Extractor to extract the tables from the text-based PDF file to a CSV file. Both products can be downloaded from our website to try.
The Converter Command Line software lets you extract tables from this PDF file and save them to CSV files from the command line.
In the PDF Table Extractor software, you can draw rectangles to select the tables that you want to extract.
After you extract the data to a CSV file, you can easily reuse the table data in MS Excel.
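If you need to repeat steps 2 and 3 for many files, they can be scripted. Below is a minimal Python sketch, not an official tool: it only wraps the two command lines shown above and then writes the text columns to a CSV file. It assumes pdf2txtocr.exe can be found on your PATH (or in the working folder), and it assumes the columns in the -layout output are separated by two or more spaces, which you should check against your own output. The function names are hypothetical and chosen for illustration.

```python
import csv
import re
import subprocess

# Assumption: pdf2txtocr.exe is on PATH or in the current working folder.
OCR_EXE = "pdf2txtocr.exe"


def build_commands(scanned_pdf, text_pdf, text_file):
    """Return the command lines from steps 2 and 3 as argument lists."""
    ocr_cmd = [OCR_EXE, "-ocrmode", "2", "-forcefontsize", "20",
               str(scanned_pdf), str(text_pdf)]
    layout_cmd = [OCR_EXE, "-layout", str(text_pdf), str(text_file)]
    return ocr_cmd, layout_cmd


def split_columns(line):
    """Split one line of layout text into cells.

    Assumption: columns are separated by two or more spaces, so a
    single space inside a cell (e.g. "Blue pen") is preserved.
    """
    return re.split(r" {2,}", line.strip())


def text_to_csv(text_file, csv_file):
    """Write each non-empty line of the text file as one CSV row."""
    # The encoding of the tool's output is not known here; undecodable
    # bytes are replaced rather than raising an error.
    with open(text_file, encoding="utf-8", errors="replace") as src, \
         open(csv_file, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for line in src:
            if line.strip():
                writer.writerow(split_columns(line))


def run_pipeline(scanned_pdf, text_pdf, text_file, csv_file):
    """Run step 2, then step 3, then convert the text columns to CSV."""
    for cmd in build_commands(scanned_pdf, text_pdf, text_file):
        # check=True stops the pipeline if the tool exits with an error.
        subprocess.run(cmd, check=True)
    text_to_csv(text_file, csv_file)
```

For the sample file, this could be called as run_pipeline(r"D:\downloads\file01.pdf", r"D:\downloads\file02.pdf", r"D:\downloads\file02.txt", r"D:\downloads\file02.csv"), where file02.csv is a hypothetical output name. If the OCR output uses single spaces between some columns, the splitting rule will need adjusting, and the table extraction tools in step 4 may give better results.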
We suggest that you download these products from our website and try them. If you encounter any problems with them, please feel free to let us know, and we will be glad to assist you as soon as possible.