How to read text in a rectangle from a PDF file?

I have the PDF Editor Toolkit. I would like to capture the electronic text
in selected areas of a PDF. How do I go about that?

Regards,
Customer

-------------------------------------------------------

Thanks for your message. PDF Editor Toolkit does not have an option to capture the electronic text in selected areas of a PDF. If you need this function, you can download "PDF to TEXT Converter SDK" or "PDF Parse & Modify Component for .NET" from our website and try them. "PDF to TEXT Converter SDK" converts an entire PDF page to a text file; it does not support rectangles. "PDF Parse & Modify Component for .NET" outputs the PDF page contents to an XML file that contains the X, Y, Width, and Height information for each word, so you can easily write a simple function to get all the words in a rectangle.

"PDF to TEXT Converter SDK" can be downloaded from the following web page:

http://www.verypdf.com/app/pdf-to-txt-converter/try-and-buy.html

"PDF Parse & Modify Component for .NET" can be downloaded from the following web pages:

http://www.verypdf.com/app/pdftoolbox/try-and-buy.html#buy-pmc
http://www.verypdf.com/dl2.php/pdfparsersdk.zip

This is the user guide for the "PDF Parse & Modify Component for .NET" product:

http://www.verypdf.com/app/pdftoolbox/parse-modify-guide.html

You will get the text contents of a PDF page like this, one word per line in the order X, Y, Width, Height, Word:

666,78,84,14,MORGAN;
761,78,90,14,STANLEY;
862,78,57,14,FIXED;
930,78,76,14,INCOME;
1017,78,107,14,RESEARCH;

Or, with the same positions expressed as HTML:

<div style="position: absolute;top:2;left:-1"><img width="1275" height="1650" src="out_pg_0001.png"></img></div>
<div style="position:absolute;left:666;top:78;width:84;height:14"><span style="font-style:normal;font-weight:700;font-size:13px;font-family:Arial;color:#000000;">MORGAN</span></div>
<div style="position:absolute;left:761;top:78;width:90;height:14"><span style="font-style:normal;font-weight:700;font-size:13px;font-family:Arial;color:#000000;">STANLEY</span></div>
<div style="position:absolute;left:862;top:78;width:57;height:14"><span style="font-style:normal;font-weight:700;font-size:13px;font-family:Arial;color:#000000;">FIXED</span></div>

Using these positions, you can easily write a function that returns the words located inside a given rectangle.
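As a rough illustration of such a function, here is a minimal Python sketch that works on the comma-separated output shown above. It is not part of any VeryPDF product: the names `Word`, `parse_words`, and `words_in_rect` are hypothetical, the rectangle is assumed to use the same units and top-left origin as the sample output, and a word is counted only when its whole bounding box lies inside the rectangle.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    """One word and its bounding box (hypothetical helper type)."""
    x: int
    y: int
    width: int
    height: int
    text: str


def parse_words(raw: str) -> List[Word]:
    """Parse lines of the form 'X,Y,Width,Height,Word;' into Word records."""
    words = []
    for line in raw.splitlines():
        line = line.strip()
        if line.endswith(";"):
            line = line[:-1]  # drop the trailing record terminator
        if not line:
            continue
        # Split on the first four commas only, so the word itself may contain commas.
        x, y, w, h, text = line.split(",", 4)
        words.append(Word(int(x), int(y), int(w), int(h), text))
    return words


def words_in_rect(words: List[Word], left: int, top: int,
                  width: int, height: int) -> List[str]:
    """Return the text of every word whose box lies fully inside the rectangle."""
    right = left + width
    bottom = top + height
    return [
        w.text
        for w in words
        if w.x >= left and w.y >= top
        and w.x + w.width <= right and w.y + w.height <= bottom
    ]


SAMPLE = """666,78,84,14,MORGAN;
761,78,90,14,STANLEY;
862,78,57,14,FIXED;
930,78,76,14,INCOME;
1017,78,107,14,RESEARCH;"""

if __name__ == "__main__":
    # Rectangle covering x 650..950 and y 70..100.
    print(words_in_rect(parse_words(SAMPLE), 650, 70, 300, 30))
```

With the sample data, the rectangle at left 650, top 70, width 300, height 30 selects MORGAN, STANLEY, and FIXED; INCOME is excluded because its box extends past the right edge. If partially overlapping words should also be returned, the comparison can be relaxed to an intersection test instead.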

VeryPDF
