How to get key value pairs from fillable PDF files?

We have some PDF files that contain fillable forms on top of a background image. Users fill data into the form fields, which are placed over the image. We are currently using "VeryPDF PDF Parser & Modify Component for .NET Developer License" to extract the text contents from these PDF files.

http://www.verypdf.com/app/pdftoolbox/pdf-parse-modify.html

image

In this PDF form, part of the content is available as text in the htm output. The only remaining challenge is to get the labels, which are present only as an image.

Converting the PDF to an image and running OCR on it will reduce accuracy. As you are the PDF experts, please suggest a way to get the label content, which is present in this PDF as an image/metadata.

Since this is a form PDF, is there any way to get the key value pairs from it?

Regards
Customer
-------------------------------
>>In this form, part of the content is available in the htm output as text. The only challenge left is to get the labels, which are present as an image.

We have double-checked your PDF file. Yes, it contains a full-page background image with some fillable form fields placed on top of it. Please look at the background image below:

image

The filled-in data is text content, but the background is one big image, so users fill in the data directly on top of the image.

We have a solution for getting key value pairs from this form PDF file:

Step 1. Extract the user-filled text contents and their coordinates from the PDF file.

Step 2. Use OCR to convert the entire PDF page (or only the background image) to text.

Step 3. Combine the data from Step 1 and Step 2: label text from OCR + user-filled text from the PDF Parser SDK. With this approach, we will be able to get the final key value pairs from this form PDF properly.
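The combining step can be sketched in a few lines of Python. This is only an illustration of the idea, not the PDF Parser SDK or OCR SDK API: the `Box` class, the `nearest_label` and `pair_labels_with_values` functions, the sample coordinates, and the `max_row_offset` tolerance are all hypothetical. It assumes that the OCR step also reports a bounding box for each label, that each label sits to the left of its field on the same row, and that both sets of coordinates have already been converted to the same system (PDF coordinates normally start at the bottom-left corner of the page, while OCR results usually start at the top-left).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Box:
    """A piece of text with its bounding box (top-left origin, same units for all boxes)."""
    text: str
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    @property
    def right(self) -> float:
        return self.x + self.width

    @property
    def center_y(self) -> float:
        return self.y + self.height / 2


def nearest_label(value: Box, labels: List[Box],
                  max_row_offset: float = 10.0) -> Optional[Box]:
    """Return the closest label that lies left of the value on the same row."""
    candidates = [
        label for label in labels
        if label.right <= value.x
        and abs(label.center_y - value.center_y) <= max_row_offset
    ]
    if not candidates:
        return None
    # The smallest horizontal gap between label and value wins.
    return min(candidates, key=lambda label: value.x - label.right)


def pair_labels_with_values(labels: List[Box], values: List[Box]) -> Dict[str, str]:
    """Build key value pairs: OCR label text -> user-filled text."""
    pairs: Dict[str, str] = {}
    for value in values:
        label = nearest_label(value, labels)
        if label is not None:
            pairs[label.text.rstrip(":").strip()] = value.text
    return pairs


if __name__ == "__main__":
    # Hypothetical sample data standing in for the output of Step 1 and Step 2.
    ocr_labels = [
        Box("Name:", 50, 100, 60, 12),
        Box("Date of Birth:", 50, 130, 110, 12),
    ]
    filled_values = [
        Box("Jane Doe", 170, 99, 120, 14),
        Box("1990-01-31", 170, 129, 120, 14),
    ]
    print(pair_labels_with_values(ocr_labels, filled_values))
```

Forms that place labels above the fields, or that pack several fields into one row, would need a different matching rule, so the row tolerance and the "label to the left" assumption should be adjusted to the actual layout.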

In short, we think the PDF Parser SDK + OCR SDK can do this work for you: the PDF Parser SDK provides the user-filled data and the OCR software provides the text labels. Combining the two gives the final key value pairs from this form PDF file.

If you have any questions about this solution, please feel free to let us know.

VeryPDF

Related software in this article:

VeryPDF PDF Parse & Modify Component for .NET
http://www.verypdf.com/app/pdftoolbox/pdf-parse-modify.html

VeryPDF OCR to Any Converter Command Line
http://www.verypdf.com/app/ocr-to-any-converter-cmd/index.html
