How to extract text, images, graphics, color spaces, and other elements from a PDF file?

Hi,

I am using C# to create a web application and I need to have access to all of the elements of a pdf and especially text and paths.

I want to read a PDF, find a path object, check its CMYK fill color and/or its stroke size and color, and change them if necessary based on my criteria. I will do the same with text elements.

We currently use other software for some other actions, but it is limited.

Can your program do this?
Thanks,
Customer

-------------------------------------------
I'm looking for a solution / API (e.g. like PDFLib) that can extract (and remove) a drawn path from a graphic PDF. For example, a path that outlines a picture or logo, that was drawn in Illustrator or InDesign (not a JPG clipping path), and that is set to a specific spot color (e.g. "CutContour"). I need to get the data that makes up that path so I can extract it for use in a cutting system.

While PDFLib can extract text, it cannot extract graphic elements. I'm even open to solutions outside of PHP!

Thanks in advance!
Customer
-------------------------------------------

Thanks for your message. We suggest downloading "VeryPDF PDF Extract Tool Command Line" from the following web page and trying it:

http://www.verypdf.com/app/pdf-extract-tool/try-and-buy.html
http://www.verypdf.com/dl2.php/verypdf_pdf_extract_tool.zip

You can use "VeryPDF PDF Extract Tool Command Line" to extract all information from a PDF file and save it to XML and text files, including text elements, path elements, color spaces, graphics, and other objects. You can then parse the XML file to get the information you need and reuse these elements.

"VeryPDF PDF Extract Tool Command Line" is a command line application, so you can call it from your PHP code to parse PDF files.

If you encounter any problem with "VeryPDF PDF Extract Tool Command Line", please feel free to let us know and we will assist you as soon as possible.

VeryPDF

Posted in PDF Parser & Modify SDK

Information on VeryPDF Cloud API. What is the difference between Cloud API Plans?

For the VeryPDF Cloud API Platform,

http://www.verypdf.com/online/cloud-api/try-and-buy.html

I need information on the following:

1. How many files can it convert simultaneously?
2. What is the cost per file for conversion after the monthly limit of 2000 files has been used?
3. Are my files going to be stored on your cloud/server as well? If yes, how will I be able to delete them after the conversion process?
4. What are the terms of service and agreements to use this API?
5. What is the difference between online storage and 2000 files/month? Say I use the cloud API through my PHP application to convert 100 files and they are 500MB in total, will my limit be reached? Or will I be able to use the API until I reach 2000 files?
6. What level of technical or customer support will I be receiving?
7. How will I be able to track the number of files I have already converted in a month?

I need to move ahead with the service for my production usage once I get clarity on these points.

Thanks.
Customer
-------------------------------------------------------
>>1. How many files can it convert simultaneously?

VeryPDF Cloud API does not limit the number of files that can be converted simultaneously. However, if you run too many conversion jobs at the same time, they will consume too much CPU, each job will take longer than normal, and our Server Monitor will have to kill some heavy jobs to keep all conversions running smoothly.

In general, we suggest converting your files one by one; that way, all of your files will be converted smoothly.

Here is a rough calculation of the average conversion rate for each plan, assuming a 30-day month:

Startup Plan: 2000 files per month, 2000/30/24 = 2.78, so you can convert about 2.78 files per hour on average.

Business Plan: 4000 files per month, 4000/30/24 = 5.56, so you can convert about 5.56 files per hour on average.

Unlimited Plan: no limit on the number of files per month. If we assume you convert one file per minute, 1*60*24*30 = 43200, so you can convert about 43200 files per month.

You can choose whichever plan matches the number of files you need to convert.
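
The arithmetic above can be reproduced with a short script. This is only a sketch in Python: the function names are made up for illustration, and the 30-day month and one-file-per-minute rate are the same assumptions used in the figures above.

```python
# Average conversion rates implied by the monthly quotas quoted above.
HOURS_PER_MONTH = 30 * 24  # assumes a 30-day month, as in the figures above


def files_per_hour(monthly_quota: int) -> float:
    """Average number of files per hour if the quota is spread over the month."""
    return monthly_quota / HOURS_PER_MONTH


def monthly_total(files_per_minute: int) -> int:
    """Files converted in a 30-day month at a steady per-minute rate."""
    return files_per_minute * 60 * 24 * 30


if __name__ == "__main__":
    for plan, quota in (("Startup", 2000), ("Business", 4000)):
        print(f"{plan} Plan: {files_per_hour(quota):.2f} files per hour")
    print(f"Unlimited Plan at 1 file/minute: {monthly_total(1)} files per month")
```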

>>2. What is the cost per file for conversion after the monthly limit of 2000 files has been used?

You can choose the "Unlimited Plan", which costs $99.95 per month and has no limit on the number of files converted per month.

>>3. Are my files going to be stored on your cloud/server as well? If yes, how will I be able to delete them after the conversion process?

VeryPDF Cloud API Server only stores converted files temporarily. You should download the converted files to your local server after conversion, and you can then use the "DeleteFile" API to delete them from the VeryPDF Server. Please see the following web page for more information:

http://www.verypdf.com/wordpress/201309/verypdf-cloud-api-app-name-deletefile-delete-temporary-files-and-convert-pdf-files-to-as3-swf-files-38685.html

>>4. What are the terms of service and agreements to use this API?

Please refer to them at the following URLs:

http://www.verypdf.com/terms-of-use.html
http://www.verypdf.com/privacy.html

>>5. What is the difference between online storage and 2000 files/month? Say I use the cloud API through my PHP application to convert 100 files and they are 500MB in total, will my limit be reached? Or will I be able to use the API until I reach 2000 files?

In the "Startup Plan", Online Storage is 500 MB and "Document Number Limits / Month" is 2000. This means that you can store up to 500MB of files on the VeryPDF Server and convert up to 2000 files per month.

If you have already stored 500MB of files on the VeryPDF Server, you can still continue to use the Cloud API until you reach 2000 files.

>>6. What level of technical or customer support will I be receiving?

Please refer to our support options at the following web page:

http://www.verypdf.com/custom/maintenance.htm

>>7. How will I be able to track the number of files I have already converted in a month?

The statistics web page is not available yet, but we are planning to release it in the future.

For now, you can simply add a counter to your PHP code; this is easy to do.
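
A client-side counter of this kind could look roughly like the following. It is a sketch in Python rather than PHP, and the function name, the JSON counter file, and the idea of calling it once after each successful conversion are all assumptions for illustration, not part of the Cloud API.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional


def record_conversion(counter_file: Path, today: Optional[date] = None) -> int:
    """Increment this month's conversion count and return the new total.

    Counts are kept in a small JSON file keyed by "YYYY-MM"
    (hypothetical storage; a database table would work the same way).
    """
    today = today or date.today()
    month_key = today.strftime("%Y-%m")
    counts = json.loads(counter_file.read_text()) if counter_file.exists() else {}
    counts[month_key] = counts.get(month_key, 0) + 1
    counter_file.write_text(json.dumps(counts, indent=2))
    return counts[month_key]


if __name__ == "__main__":
    # Call once after each successful conversion request.
    used = record_conversion(Path("conversions.json"))
    print(f"Files converted this month: {used} of 2000")
```

Keying the counts by month means the total starts again from zero at the beginning of each month without any extra cleanup step.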

VeryPDF

Posted in VeryPDF Cloud API

Can PDF Stamper app stamp PDF files with the filename WITHOUT the .pdf file extension?

Dear Tech Support,

Can this app stamp PDF files with the filename, WITHOUT the .pdf file extension?

Also, can it place leading zeros on the page numbers?

I need to stamp hundreds of PDF files with their file name, without the .pdf extension, and 3 digit page numbers.

Example:

SOURCE = DE1000001.PDF

Desired Stamp = DE1000001-001, DE1000001-002, etc.

Thank you very much,
Customer
--------------------------------------------------

>>Can this app stamp PDF files with the filename, WITHOUT the .pdf file extension?

You need to use the PDF Stamp Command Line software, which has a "\e" option that stamps the filename without the ".pdf" extension. PDF Stamp Command Line can be downloaded from the following web page:

http://www.verypdf.com/app/pdf-stamp/try-and-buy.html#buy-cmd

You can run the following command lines to see what each option produces:

pdfstamp -PDF "D:\test\example.pdf" -o "D:\test\fullpath_without_extension.pdf" -AT "\E"

pdfstamp -PDF "D:\test\example.pdf" -o "D:\test\filename_without_extension.pdf" -AT "\e"

pdfstamp -PDF "D:\test\example.pdf" -o "D:\test\filename_with_extension.pdf" -AT "\f"

pdfstamp -PDF "D:\test\example.pdf" -o "D:\test\fullpath_with_extension.pdf" -AT "\F"

>>Also, can it place leading zeros on the page numbers?

Yes, this function is called "Bates Numbers", and PDF Stamp Command Line supports it. You can run the following command line to add "Bates Numbers" to your PDF pages:

pdfstamp -PDF "example.pdf" -o "bates-numbers.pdf" -AT "Bates Numbers \B(0000105)" -p3 -mlr-30 -mtb30 -fs10 -fn300 -c#FF0000

You can combine the filename stamp and "Bates Numbers" in one command line, e.g.,

pdfstamp -PDF "example.pdf" -o "_bates-numbers.pdf" -AT "FileName: \e, Bates Numbers: \B(0000)" -p3 -mlr-30 -mtb30 -fs10 -fn300 -c#FF0000
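
To check the stamped output against the requested format (DE1000001-001, DE1000001-002, etc.), the expected text for each page can be computed independently of pdfstamp. The following is a sketch in Python; the function name is hypothetical and it does not call or emulate pdfstamp itself.

```python
from pathlib import PureWindowsPath


def stamp_label(pdf_path: str, page_number: int, digits: int = 3) -> str:
    """Return '<filename without extension>-<zero-padded page number>'."""
    # PureWindowsPath handles paths such as D:\test\example.pdf on any platform.
    stem = PureWindowsPath(pdf_path).stem
    return f"{stem}-{page_number:0{digits}d}"


if __name__ == "__main__":
    for page in (1, 2, 10):
        print(stamp_label("DE1000001.PDF", page))
```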

VeryPDF

Posted in PDF Stamp

PDF Form Data Extractor for PDF files which contain XDP data or Dynamic XML Forms

Do you have a version that works for 1.7 format PDF files? I tried your demo using the following command line and it creates an empty file.

pdftoolbox test1.pdf -outformdata -outfile test.txt

http://www.verypdf.com/app/pdftoolbox/try-and-buy.html#buy

I am trying to extract data from a PDF form. For reference I have attached an example; please let me know if you need any more information.
(See attached file: test1.pdf)

Customer
------------------------------------------------------------
Your PDF file contains a dynamic XFA XML form layer. You need to convert this PDF file to a static PDF file first; then you can extract data from it.

VeryPDF
------------------------------------------------------------
Thanks for the update.

I tried the following command line, but it failed with no error message.

pdftoolbox.exe test1.pdf -flattenform -outfile test1FLAT.pdf

Can you please provide me with the correct syntax for converting a file to Static PDF?

Customer
------------------------------------------------------------
You need to open this PDF file in "Adobe LiveCycle Designer ES 8.2" and save it as a new PDF file with the "Adobe Static PDF Forms (*.pdf)" option; you will then get a new PDF file with static forms. Here are the settings to use when saving LiveCycle forms in order to merge XDP data with LiveCycle forms:

[Screenshot: LiveCycle_Setup_SaveAs_StaticOnly]

After you get the new PDF file, you can run the following command line to extract form data from it:

pdftoolbox test1.pdf -outformdata -outfile test.txt

VeryPDF

Posted in PDF Form Filler, PDF Toolbox Command Line

Find and Replace Text Issues in PDF Text Replacer software

Hi there,

I am evaluating the PDF Text Replacer, and so far so good, other than the fact that it doesn't seem to be able to replace certain things..

For example, I am currently trying to replace the telephone number "01788 548855" with "0845 2300201", so I have set up the following rules under 'According to Content':
1. Find: 01788 - Replace With: 0845
2. Find: 548855 - Replace With: 2300201

The result: it replaces 01788 with 0845, but does not replace 548855 with 2300201.

I tried removing rule #1 so that the only rule being applied is to replace 548855 with 2300201 and it still doesn't work.

Can you advise? I am looking to buy this software if it works, immediately..

Customer
------------------------------------


We suggest using "PDF Text Replacer Command Line Standalone Version (pdftr.exe)" instead of the GUI version, because the command line version works better than the GUI version:

http://www.verypdf.com/app/pdf-text-replacer/search-and-replace-pdf-text-command-line.html#pdftr
http://www.verypdf.com/dl2.php/pdftextreplacer_cmd.zip

You can use the "-searchandoverlaytext" option to replace text in PDF pages:

pdftr.exe -searchandoverlaytext "PDFcamp Printer=>VeryPDF Printer" -overlaytextfontsize 8 D:\in.pdf D:\out.pdf

pdftr.exe -searchandoverlaytext "PDFcamp Printer=>VeryPDF Printer" -overlaytextfontsize 80% D:\in.pdf D:\out.pdf
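
To apply the same rule to a whole folder of PDF files, the command line above can be assembled and run from a script. This is a sketch in Python that uses only the options shown above; the helper names and folder paths are hypothetical, pdftr.exe is assumed to be on PATH, and whether it returns a non-zero exit code on failure is not confirmed here.

```python
import subprocess
from pathlib import Path
from typing import List

PDFTR_EXE = "pdftr.exe"  # assumption: the executable is reachable via PATH


def build_pdftr_command(find: str, replace: str, src, dst,
                        font_size: str = "8") -> List[str]:
    """Build the argument list for one search-and-overlay replacement."""
    return [
        PDFTR_EXE,
        "-searchandoverlaytext", f"{find}=>{replace}",
        "-overlaytextfontsize", font_size,
        str(src), str(dst),
    ]


def replace_in_folder(find: str, replace: str, in_dir: Path, out_dir: Path) -> None:
    """Run pdftr.exe once per PDF, one file at a time."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(in_dir.glob("*.pdf")):
        cmd = build_pdftr_command(find, replace, src, out_dir / src.name)
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical folders; only prints the command for a single file.
    print(build_pdftr_command("PDFcamp Printer", "VeryPDF Printer",
                              r"D:\in.pdf", r"D:\out.pdf"))
```

Passing the arguments as a list avoids having to quote the "find=>replace" rule by hand when it contains spaces.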

VeryPDF
------------------------------------
Hi,

unfortunately I am still having issues using pdftr.exe - replacing 'website.co.uk' with 'myotherwebsite.com' is sometimes (but not always) missing out various characters when replacing, resulting in something like 'myothwebite.com' ...

Are you aware of anything that causes this? So far yours is the best software I can find for doing this, but I still cannot purchase unless I can achieve accurate results. I am required to convert several thousand PDFs, so I can't afford for there to be many (if any) anomalies, because there's no way I can check them all before publishing.

Please let me know your thoughts.

Customer
------------------------------------
Please try again using -searchandoverlaytext instead of -contentreplace, e.g.,

pdftr.exe -searchandoverlaytext "PDFcamp Printer=>VeryPDF Printer" -overlaytextfontsize 8 D:\in.pdf D:\out.pdf

The -contentreplace option reuses the original embedded font, so if that font does not contain certain characters (for example the "er" characters), those characters will be missing from the output PDF file. The -searchandoverlaytext option solves this problem, so please try again with -searchandoverlaytext.
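
The effect of a subsetted embedded font can be illustrated with a simplified model: only characters that have a glyph in the font survive the replacement. This is a sketch in Python with hypothetical function names. It is not how pdftr.exe is implemented, and it assumes the font contains only the characters of the original text, whereas a real embedded font usually covers more of the document.

```python
from typing import List, Set


def render_with_subset(replacement: str, available: Set[str]) -> str:
    """Keep only the characters that have a glyph in the embedded font subset."""
    return "".join(ch for ch in replacement if ch in available)


def missing_glyphs(replacement: str, available: Set[str]) -> List[str]:
    """List the characters of the replacement that the subset cannot draw."""
    return sorted(set(replacement) - available)


if __name__ == "__main__":
    # Assumption: the font was subsetted to exactly the characters of the old text.
    subset = set("website.co.uk")
    new_text = "myotherwebsite.com"
    print(render_with_subset(new_text, subset))
    print(missing_glyphs(new_text, subset))
```

A check like missing_glyphs can flag risky replacements before a large batch is processed.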

VeryPDF
------------------------------------
Hi there, that indeed is the answer to our problems..

Will be placing an order as soon as I receive approval from my superiors :)

Thanks a lot for the help!

Customer

Posted in PDF Text Replacer