How to extract text, images, graphics, color spaces, and other elements from a PDF file?


I am using C# to create a web application, and I need access to all of the elements of a PDF, especially text and paths.

I want to read a PDF, find a path object, check its CMYK fill color and/or its stroke width and color, and change them if necessary based on my criteria. I will do the same with text elements.

We currently use another software package for some other tasks, but we are limited by it.

Can your program do this?

I'm looking for a solution or API (e.g. something like PDFLib) that can extract (and remove) a drawn path from a graphic PDF: for example, a path that outlines a picture or logo, drawn in Illustrator or InDesign (not a JPG clipping path), that is set to a specific spot color (e.g. "CutContour"). I need to get the data that makes up that path so that I can use it in a cutting system.

While PDFLib can extract text, it cannot extract graphic elements. I'm even open to solutions outside of PHP!

Thanks in advance!

Thanks for your message. We suggest that you download "VeryPDF PDF Extract Tool Command Line" from its web page and give it a try.

You can use "VeryPDF PDF Extract Tool Command Line" to extract all of the information from a PDF file and save it to XML and text files, including text elements, path elements, color spaces, graphics, and other objects. You can then parse the XML file to get the information you need and reuse these elements easily.
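As background on what "path elements" and "color spaces" mean at the PDF level: in a page's content stream, paths are built with operators such as `m`, `l`, `c`, `re` and `h`, painted with `S`, `f`, `B` and their variants, and colored with `k`/`K` (CMYK fill/stroke) or `CS` plus `SCN` (a named color space, which is how a spot color such as "CutContour" is referenced). The Python sketch below does not show how the VeryPDF tool works and does not read its XML output. It is only a minimal illustration, assuming an already decompressed content stream whose tokens are separated by whitespace and which contains no strings, arrays, or inline images. The `spot_resources` mapping and all class and function names are hypothetical; the mapping stands in for the page's `/ColorSpace` resource dictionary.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Tuple

# Path construction operators and the number of numeric operands each takes.
PATH_OPS = {"m": 2, "l": 2, "c": 6, "v": 4, "y": 4, "re": 4, "h": 0}
# Path painting operators (stroke, fill, or both).
PAINT_OPS = {"S", "s", "f", "F", "f*", "B", "B*", "b", "b*"}


@dataclass
class GraphicsState:
    fill_cmyk: Optional[Tuple[float, ...]] = None    # set by the "k" operator
    stroke_cmyk: Optional[Tuple[float, ...]] = None  # set by the "K" operator
    stroke_space: Optional[str] = None  # "DeviceCMYK" or a resource name like "CS0"
    line_width: float = 1.0             # PDF default line width


@dataclass
class PaintedPath:
    segments: List[Tuple[str, Tuple[float, ...]]]
    paint_op: str
    state: GraphicsState


def collect_paths(stream: str) -> List[PaintedPath]:
    """Collect every painted path together with the state it was painted in."""
    state, saved = GraphicsState(), []
    operands: list = []
    segments: list = []
    painted: List[PaintedPath] = []
    for token in stream.split():
        if token.startswith("/"):          # name operand, e.g. /CS0
            operands.append(token[1:])
            continue
        try:
            operands.append(float(token))  # numeric operand
            continue
        except ValueError:
            pass                           # not a number: treat as an operator
        if token in PATH_OPS:
            count = PATH_OPS[token]
            segments.append((token, tuple(operands[len(operands) - count:])))
        elif token in PAINT_OPS:
            painted.append(PaintedPath(segments, token, replace(state)))
            segments = []
        elif token == "n":                 # end the path without painting it
            segments = []
        elif token == "k":
            state.fill_cmyk = tuple(operands[-4:])
        elif token == "K":
            state.stroke_cmyk, state.stroke_space = tuple(operands[-4:]), "DeviceCMYK"
        elif token == "CS":
            state.stroke_cmyk, state.stroke_space = None, operands[-1]
        elif token == "w":
            state.line_width = operands[-1]
        elif token == "q":                 # save the graphics state
            saved.append(replace(state))
        elif token == "Q" and saved:       # restore the graphics state
            state = saved.pop()
        operands.clear()                   # each operator consumes its operands
    return painted


def paths_stroked_with_spot(paths, spot_resources: Dict[str, str], spot_name: str):
    """Keep only the paths stroked in the color space mapped to spot_name."""
    stroking = {"S", "s", "B", "B*", "b", "b*"}
    return [p for p in paths
            if p.paint_op in stroking
            and spot_resources.get(p.state.stroke_space or "") == spot_name]


EXAMPLE_STREAM = """
q
0 0 0 1 k
10 10 100 50 re f
Q
/CS0 CS 1 SCN
0.25 w
20 20 m 120 20 l 120 80 l h S
"""

if __name__ == "__main__":
    paths = collect_paths(EXAMPLE_STREAM)
    for path in paths_stroked_with_spot(paths, {"CS0": "CutContour"}, "CutContour"):
        print(path.state.line_width, path.segments)
```

Each collected path also carries its fill CMYK values and line width, so the same structure could be checked against criteria like those in the first question. Real PDF files need a full parser that handles compressed streams, strings, arrays, and nested resources, so treat this only as a way to read the extracted data with the right vocabulary.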

"VeryPDF PDF Extract Tool Command Line" is a command-line application, so you can call it from PHP code to parse PDF files easily.
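Calling a command-line program from application code follows the same pattern in most languages (for example `exec` or `proc_open` in PHP, or `Process.Start` in C#). The sketch below shows that pattern in Python. It is only an illustration: the function name is hypothetical, and the tool's executable name and arguments are deliberately not shown here, so take them from the tool's own documentation. The demonstration runs the Python interpreter itself as a stand-in command so that the example works without the tool installed.

```python
import subprocess
import sys
from typing import Sequence


def run_command_line_tool(command: Sequence[str], timeout_seconds: float = 120.0) -> str:
    """Run an external program and return its standard output as text.

    `command` is the executable followed by its arguments, passed as a list
    so that no shell quoting is involved.
    """
    completed = subprocess.run(
        list(command),
        capture_output=True,   # collect stdout and stderr instead of printing them
        text=True,             # decode the output as text
        timeout=timeout_seconds,
        check=False,           # the exit code is inspected manually below
    )
    if completed.returncode != 0:
        raise RuntimeError(
            f"{command[0]} exited with code {completed.returncode}: "
            f"{completed.stderr.strip()}"
        )
    return completed.stdout


if __name__ == "__main__":
    # Stand-in command: replace it with the extract tool's executable and the
    # arguments given in its documentation.
    print(run_command_line_tool([sys.executable, "-c", "print('ok')"]))
```

Passing the command as a list and checking the exit code makes failures visible to the calling web application instead of silently producing an empty or missing output file.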

If you encounter any problems with "VeryPDF PDF Extract Tool Command Line", please feel free to let us know and we will assist you as soon as possible.

