VeryPDF has released the PDF Extract Tool Command Line software today. This command line application lets you extract fonts, images, text, metadata, forms, and drawings from PDF files. If you want to extract something that is not included in the current version of PDF Extract Tool Command Line, please feel free to let us know; we will be glad to add the function for you as soon as possible.
PDF Extract Tool Command Line Home Page:
https://www.verypdf.com/app/pdf-extract-tool/index.html
PDF Extract Tool Command Line Online Demo:
https://www.verypdf.com/app/pdf-extract-tool/online.html
PDF Extract Tool Command Line has the following features:
- Extract the fonts to TTF, CFF, and AFM files;
- Extract images to TIFF, JPG, PNG, PBM, PPM files;
- Extract text to TXT files;
- Extract metadata to XMP file;
- Extract forms to FDF file;
- Extract drawing to XML file;
- etc.
PDF Extract Tool Command Line is a command line application, so you can call it from your own source code with CreateProcess(), exec(), system(), or other similar functions.
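As an illustration of calling the tool from source code, here is a minimal Python sketch that uses the standard subprocess module in place of CreateProcess() or system(). The helper names build_command and run_extract are hypothetical. The argument order copies the sample commands in this post, and the optional trailing log name is passed through unchanged on the assumption that the tool accepts it in that position.

```python
import subprocess
from typing import List, Optional, Tuple


def build_command(exe: str, out_folder: str, pdf_path: str,
                  log_name: Optional[str] = None) -> List[str]:
    """Build an argument list shaped like the sample commands in this post."""
    cmd = [exe, "-outfolder", out_folder, pdf_path]
    if log_name is not None:
        # Assumption: the trailing log name is passed exactly as it
        # appears in the sample commands.
        cmd.append(log_name)
    return cmd


def run_extract(cmd: List[str]) -> Tuple[int, str]:
    """Run the command and return (exit code, text printed to the screen)."""
    result = subprocess.run(cmd, capture_output=True, text=True,
                            errors="replace")
    return result.returncode, result.stdout


if __name__ == "__main__":
    cmd = build_command("pdfextract.exe", "_test-form", "test-form.pdf",
                        "_test-form.pdf.log")
    try:
        code, output = run_extract(cmd)
        print("exit code:", code)
        print(output)
    except FileNotFoundError:
        print("pdfextract.exe was not found; adjust the path first")
```

Passing the arguments as a list avoids shell quoting problems when a PDF path contains spaces.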
To see the command line options, open a Command Prompt window and run "pdfextract.exe" in it.
Run the following command lines to extract various information from a PDF file and save the extracted material to an output folder:
pdfextract.exe -outfolder _annotstamp annotstamp.pdf _annotstamp.pdf.log
pdfextract.exe -outfolder _test-long-page test-long-page.pdf _test-long-page.pdf.log
pdfextract.exe -outfolder _test-form test-form.pdf _test-form.pdf.log
pdfextract.exe -outfolder _test-embedded-fonts test-embedded-fonts.pdf _test-embedded-fonts.pdf.log
pdfextract.exe -outfolder _test-embedded-fonts2 test-embedded-fonts2.pdf _test-embedded-fonts2.pdf.log
pdfextract.exe -outfolder __test-embedded-fonts test-embedded-fonts.pdf __test-embedded-fonts.log
pdfextract.exe -outfolder __test-embedded-fonts2 test-embedded-fonts2.pdf __test-embedded-fonts2.log
The tool prints information to the screen, such as:
================== VeryPDF PDF-Spy Section #1 ==================
Document Info
-------------
File: test-embedded-fonts2.pdf
PDF Version: 1.5
Page Count: 1
Page Size: 612 x 792 pts
Fast Web View Enabled: No
Tagged: No
Encrypted: No
Printing Allowed: Yes
Modification Allowed: Yes
Copy&Paste Allowed: Yes
Add/Modify Annotations Allowed: Yes
Fill&Sign Allowed: Yes
Accessibility Allowed: Yes
Document Assembly Allowed: Yes
High Quality Print Allowed: Yes
Catalog
----------------
<<
/Type /Catalog
/Lang (en-US)
/MarkInfo <<
/Marked true
>>
/Pages 2 0 R
/StructTreeRoot 27 0 R
>>
Classic Metadata
----------------
Title:
Subject:
Author:
Keywords:
Creator: Microsoft? Word 2010
Producer: Microsoft? Word 2010
CreationDate: D:20120606113549+08'00'
ModDate: D:20140102100701+08'00'
Trapped:
Page Info
---------
Page Count: 1
Page 0:
->Internal Number:1
->Object Number:3 0 R
MediaBox: [ 0.000000 0.000000 612.000000 792.000000 ]
CropBox: [ 0.000000 0.000000 612.000000 792.000000 ]
TrimBox: [ 0.000000 0.000000 612.000000 792.000000 ]
BleedBox: [ 0.000000 0.000000 612.000000 792.000000 ]
ArtBox: [ 0.000000 0.000000 612.000000 792.000000 ]
Rotation: 0
# of Annotations: 0
Outlines
--------
None Found
Names
-----
None Found
================== VeryPDF PDF-Spy Section #2 ==================
name type emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\original-ABCDEE+Arial Black-61.ttf" file.
ABCDEE+Arial Black TrueType yes yes no 5 0
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\original-ABCDEE+Arial Rounded MT Bold-63.ttf" file.
ABCDEE+Arial Rounded MT Bold TrueType yes yes no 7 0
[Warning] Can't find 'Arial' font.
Arial TrueType no no no 9 0
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\original-ABCDEE+Calibri-66.ttf" file.
ABCDEE+Calibri TrueType yes yes no 11 0
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\original-ABCDEE+BankGothic Md BT-68.ttf" file.
ABCDEE+BankGothic Md BT TrueType yes yes no 13 0
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\original-ABCDEE+Candara-70.ttf" file.
ABCDEE+Candara TrueType yes yes no 15 0
ABCDEE+DejaVu Sans Mono TrueType yes yes no 17 0
ABCDEE+Eras Demi ITC TrueType yes yes no 19 0
ABCDEE+Franklin Gothic Demi TrueType yes yes no 21 0
ABCDEE+Georgia TrueType yes yes no 23 0
ABCDEE+High Tower Text TrueType yes yes no 25 0
Please purchase full version from 'https://www.verypdf.com' to extract following fonts from your PDF file,
ABCDEE+DejaVu Sans Mono
ABCDEE+Eras Demi ITC
ABCDEE+Franklin Gothic Demi
ABCDEE+Georgia
ABCDEE+High Tower Text
[List All Fonts], The number of fonts in this PDF file is: 11
ABCDEE+Arial Black TrueType yes yes no 5 0
ABCDEE+Arial Rounded MT Bold TrueType yes yes no 7 0
Arial TrueType no no no 9 0
ABCDEE+Calibri TrueType yes yes no 11 0
ABCDEE+BankGothic Md BT TrueType yes yes no 13 0
ABCDEE+Candara TrueType yes yes no 15 0
ABCDEE+DejaVu Sans Mono TrueType yes yes no 17 0
ABCDEE+Eras Demi ITC TrueType yes yes no 19 0
ABCDEE+Franklin Gothic Demi TrueType yes yes no 21 0
ABCDEE+Georgia TrueType yes yes no 23 0
ABCDEE+High Tower Text TrueType yes yes no 25 0
================== VeryPDF PDF-Spy Section #3 ==================
File: test-embedded-fonts2.pdf
PageCount: 1
Page 1 [0.000000 0.000000 612.000000 792.000000], Rotate=0
MediaBox: 0.00 0.00 612.00 792.00
CropBox: 0.00 0.00 612.00 792.00
BleedBox: 0.00 0.00 612.00 792.00
TrimBox: 0.00 0.00 612.00 792.00
ArtBox: 0.00 0.00 612.00 792.00
================== VeryPDF PDF-Spy Section #4 ==================
Creator: Microsoft? Word 2010
Producer: Microsoft? Word 2010
CreationDate: 06/06/12 11:35:49
ModDate: 06/06/12 11:35:49
Tagged: yes
Form: none
Pages: 1
Encrypted: no
Page 1 size: 612 x 792 pts (letter)
Page 1 MediaBox: 0.00 0.00 612.00 792.00
Page 1 CropBox: 0.00 0.00 612.00 792.00
Page 1 BleedBox: 0.00 0.00 612.00 792.00
Page 1 TrimBox: 0.00 0.00 612.00 792.00
Page 1 ArtBox: 0.00 0.00 612.00 792.00
File size: 270439 bytes
Optimized: no
PDF version: 1.5
================== VeryPDF PDF-Spy Section #5 ==================
List All Embedded Files in PDF file: [0] embedded files
================== VeryPDF PDF-Spy Section #6 ==================
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\TextFile.txt" file.
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\TextFileWithPosition.txt" file.
================== VeryPDF PDF-Spy Section #7 ==================
================== VeryPDF PDF-Spy Section #8 ==================
Dump Document Summaries from PDF file: "test-embedded-fonts2.pdf"
================== VeryPDF PDF-Spy Section #9 ==================
Dump Form Name, Form Type and Form Values from PDF file: "test-embedded-fonts2.pdf"
================== VeryPDF PDF-Spy Section #10 ==================
Dump Annotations from PDF file: "test-embedded-fonts2.pdf"
================== VeryPDF PDF-Spy Section #11 ==================
Generate FDF from fillable PDF file: X:\test\_test-embedded-fonts2\pdfforms.fdf
================== VeryPDF PDF-Spy Section #12 ==================
Generate Page Content XML file from PDF file: X:\test\_test-embedded-fonts2\PageContents.xml
================== VeryPDF PDF-Spy Section #13 ==================
[Waning] The Gen Numer of obj 1 is 0, we will reset it to 0
[INFO] 'test-embedded-fonts2.pdf' file is 'NOT Encrypted'.
[Waning] The Gen Numer of obj 1 is 0, we will reset it to 0
[ContentParserExport] Processing page 1 of 1...
================== VeryPDF PDF-Spy Section End ==================
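The font table in Section #2 of the output above can be turned into structured records. The following is a minimal Python sketch, not part of the tool. It assumes every font row has the shape seen in the sample: a name that may contain spaces, a one-word type such as TrueType, three yes/no flags, and two integers (assumed to be the object and generation numbers). The names FONT_ROW and parse_font_rows are hypothetical.

```python
import re
from typing import Dict, List

# Columns: name, type, emb, sub, uni, object ID (two integers).
FONT_ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<type>\S+)\s+"
    r"(?P<emb>yes|no)\s+(?P<sub>yes|no)\s+(?P<uni>yes|no)\s+"
    r"(?P<obj>\d+)\s+(?P<gen>\d+)$"
)


def parse_font_rows(log_text: str) -> List[Dict[str, object]]:
    """Return one record per font, skipping repeated rows for the same object."""
    rows: List[Dict[str, object]] = []
    seen = set()
    for line in log_text.splitlines():
        match = FONT_ROW.match(line.strip())
        if match is None:
            continue  # headers, warnings, and "Save to" lines
        key = (match.group("obj"), match.group("gen"))
        if key in seen:
            continue  # the log lists each font more than once
        seen.add(key)
        rows.append({
            "name": match.group("name"),
            "type": match.group("type"),
            "embedded": match.group("emb") == "yes",
            "subset": match.group("sub") == "yes",
            "unicode": match.group("uni") == "yes",
            "object": int(match.group("obj")),
            "generation": int(match.group("gen")),
        })
    return rows


if __name__ == "__main__":
    sample = "\n".join([
        "ABCDEE+Arial Black TrueType yes yes no 5 0",
        "[Warning] Can't find 'Arial' font.",
        "Arial TrueType no no no 9 0",
    ])
    for row in parse_font_rows(sample):
        print(row["name"], row["embedded"])
```

Font types that contain a space would need a wider pattern; only the single-word type from the sample is covered here.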
Open the output folder to see the following extracted files:
cnt*.txt contain the extracted text contents.
metadata.xmp contains the extracted metadata.
*.cff, *.ttf, and *.afm are the extracted font files.
PageContents.xml contains the extracted drawings. This XML file includes text, drawing, colorspace, font name, matrix, graphics state, and other information.
pdfforms.fdf contains the form information of the input PDF file.
TextFile.txt contains the text.
TextFileWithPosition.txt contains the text contents with positions.
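The file list above can be used to sort the contents of an output folder before further processing. The following Python sketch groups files by kind using only the file names and patterns named in this post. The labels, the KINDS table, and the function name classify_outputs are hypothetical, and any file that matches no pattern is placed under "other".

```python
from fnmatch import fnmatchcase
from pathlib import Path
from typing import Dict, List

# (label, pattern) pairs taken from the file list in this post.
# The first matching pattern wins, so specific names come first.
KINDS = [
    ("text with positions", "TextFileWithPosition.txt"),
    ("text", "TextFile.txt"),
    ("text", "cnt*.txt"),
    ("metadata", "metadata.xmp"),
    ("font", "*.cff"),
    ("font", "*.ttf"),
    ("font", "*.afm"),
    ("drawing", "PageContents.xml"),
    ("form", "pdfforms.fdf"),
]


def classify_outputs(out_folder: str) -> Dict[str, List[str]]:
    """Group the files in out_folder by the kind of extracted data."""
    groups: Dict[str, List[str]] = {}
    for path in sorted(Path(out_folder).iterdir()):
        if not path.is_file():
            continue
        kind = "other"
        for label, pattern in KINDS:
            # Compare in lower case so the result is the same on every OS.
            if fnmatchcase(path.name.lower(), pattern.lower()):
                kind = label
                break
        groups.setdefault(kind, []).append(path.name)
    return groups


if __name__ == "__main__":
    folder = Path("_test-embedded-fonts2")
    if folder.is_dir():
        for kind, names in classify_outputs(str(folder)).items():
            print(kind, names)
```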
You can parse this information for further processing.
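As one example of such processing, the CreationDate and ModDate values in the console output above use the PDF date format: "D:" followed by YYYYMMDDHHmmSS and a UTC offset. The Python sketch below converts such a value into a datetime object. The function name parse_pdf_date is hypothetical, and the sketch handles only the full form seen in the sample output, not the shortened forms that the PDF format also permits.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSS, optionally followed by Z or an offset such as +08'00'
PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:(Z)|([+-])(\d{2})'(\d{2})'?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Convert a full-length PDF date string into a datetime."""
    match = PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError("unsupported PDF date: %r" % value)
    groups = match.groups()
    year, month, day, hour, minute, second = (int(g) for g in groups[:6])
    zulu, sign, off_hours, off_minutes = groups[6:]
    if sign:
        offset = timedelta(hours=int(off_hours), minutes=int(off_minutes))
        tz = timezone(-offset if sign == "-" else offset)
    elif zulu:
        tz = timezone.utc
    else:
        tz = None  # no offset given; the result is a naive datetime
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)


if __name__ == "__main__":
    print(parse_pdf_date("D:20120606113549+08'00'").isoformat())
```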
If you need information that the current version of PDF Extract Tool Command Line does not extract, please feel free to let us know; we will be glad to assist you as soon as possible.
Please feel free to contact us via the VeryPDF Ticket System.