[VeryPDF Release Notes] VeryPDF Releases PDF Extract Tool Command Line software

VeryPDF has released PDF Extract Tool Command Line software today. This command line application extracts fonts, images, text, metadata, forms, drawings, and other content from PDF files. If you want to extract something that is not included in the current version of PDF Extract Tool Command Line software, please feel free to let us know; we will be glad to add this function for you as soon as possible.

PDF Extract Tool Command Line Home Page:

http://www.verypdf.com/app/pdf-extract-tool/index.html

PDF Extract Tool Command Line Online Demo:

http://www.verypdf.com/app/pdf-extract-tool/online.html

PDF Extract Tool Command Line software has the following features:

  • Extract fonts to TTF, CFF, and AFM files;
  • Extract images to TIFF, JPG, PNG, PBM, and PPM files;
  • Extract text to TXT files;
  • Extract metadata to an XMP file;
  • Extract forms to an FDF file;
  • Extract drawings to an XML file;
  • etc.

PDF Extract Tool Command Line is a command line application, so you can call it easily from your own source code with CreateProcess(), exec(), system(), or other similar functions.
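
As a minimal sketch of calling the tool from source code, the Python snippet below launches it with subprocess.run. It assumes pdfextract.exe can be found on the PATH and simply copies the argument order of the example commands shown later in this post (-outfolder, output folder, input PDF, log file name). The helper names build_command and run_extract are hypothetical and are not part of the product.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path, exe="pdfextract.exe"):
    """Build an argument list that mirrors the example commands in this post.

    Assumption: the output folder is "_<name without .pdf>" and the last
    argument is "_<file name>.log", exactly as written in those examples.
    """
    pdf = Path(pdf_path)
    out_folder = "_" + pdf.stem
    log_name = "_" + pdf.name + ".log"
    return [exe, "-outfolder", out_folder, str(pdf), log_name]


def run_extract(pdf_path, exe="pdfextract.exe"):
    """Run the tool and return (exit code, text printed to the screen)."""
    result = subprocess.run(
        build_command(pdf_path, exe),
        capture_output=True,  # keep the console output for later parsing
        text=True,
    )
    return result.returncode, result.stdout


# Example use (requires pdfextract.exe and the PDF file to exist):
#   code, output = run_extract("test-form.pdf")
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.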

You can open a command line window and run "pdfextract.exe" in it to see the command line options:

[Screenshot: pdfextract.exe command line options]

You can run the following commands to extract various information from PDF files and save the extracted material to an output folder:

pdfextract.exe -outfolder _annotstamp annotstamp.pdf _annotstamp.pdf.log

pdfextract.exe -outfolder _test-long-page test-long-page.pdf _test-long-page.pdf.log

pdfextract.exe -outfolder _test-form test-form.pdf _test-form.pdf.log

pdfextract.exe -outfolder _test-embedded-fonts test-embedded-fonts.pdf _test-embedded-fonts.pdf.log

pdfextract.exe -outfolder _test-embedded-fonts2 test-embedded-fonts2.pdf _test-embedded-fonts2.pdf.log

pdfextract.exe -outfolder __test-embedded-fonts test-embedded-fonts.pdf __test-embedded-fonts.log

pdfextract.exe -outfolder __test-embedded-fonts2 test-embedded-fonts2.pdf __test-embedded-fonts2.log

Information such as the following will be shown on screen:

================== VeryPDF PDF-Spy Section #1 ==================

Document Info
-------------
    File: test-embedded-fonts2.pdf
    PDF Version: 1.5
    Page Count: 1
    Page Size: 612 x 792 pts

    Fast Web View Enabled: No
    Tagged: No
    Encrypted: No
    Printing Allowed: Yes
    Modification Allowed: Yes
    Copy&Paste Allowed: Yes
    Add/Modify Annotations Allowed: Yes
    Fill&Sign Allowed: Yes
    Accessibility Allowed: Yes
    Document Assembly Allowed: Yes
    High Quality Print Allowed: Yes

Catalog
----------------
<<
/Type /Catalog
/Lang (en-US)
/MarkInfo <<
/Marked true
>>
/Pages 2 0 R
/StructTreeRoot 27 0 R
>>

Classic Metadata
----------------
    Title:
    Subject:
    Author:
    Keywords:
    Creator: Microsoft? Word 2010
    Producer: Microsoft? Word 2010
    CreationDate: D:20120606113549+08'00'
    ModDate: D:20140102100701+08'00'
    Trapped:

Page Info
---------
Page Count: 1
Page 0:
->Internal Number:1
->Object Number:3 0 R
    MediaBox: [ 0.000000 0.000000 612.000000 792.000000 ]
    CropBox: [ 0.000000 0.000000 612.000000 792.000000 ]
    TrimBox: [ 0.000000 0.000000 612.000000 792.000000 ]
    BleedBox: [ 0.000000 0.000000 612.000000 792.000000 ]
    ArtBox: [ 0.000000 0.000000 612.000000 792.000000 ]
    Rotation: 0
    # of Annotations: 0
Outlines
--------
    None Found
Names
-----
    None Found

================== VeryPDF PDF-Spy Section #2 ==================

name                                 type              emb sub uni object ID
------------------------------------ ----------------- --- --- --- ---------
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\original-ABCDEE+Arial Black-61.ttf" file.
ABCDEE+Arial Black                   TrueType          yes yes no       5  0
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\original-ABCDEE+Arial Rounded MT Bold-63.ttf" file.
ABCDEE+Arial Rounded MT Bold         TrueType          yes yes no       7  0
[Warning] Can't find 'Arial' font.
Arial                                TrueType          no  no  no       9  0
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\original-ABCDEE+Calibri-66.ttf" file.
ABCDEE+Calibri                       TrueType          yes yes no      11  0
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\original-ABCDEE+BankGothic Md BT-68.ttf" file.
ABCDEE+BankGothic Md BT              TrueType          yes yes no      13  0
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\original-ABCDEE+Candara-70.ttf" file.
ABCDEE+Candara                       TrueType          yes yes no      15  0
ABCDEE+DejaVu Sans Mono              TrueType          yes yes no      17  0
ABCDEE+Eras Demi ITC                 TrueType          yes yes no      19  0
ABCDEE+Franklin Gothic Demi          TrueType          yes yes no      21  0
ABCDEE+Georgia                       TrueType          yes yes no      23  0
ABCDEE+High Tower Text               TrueType          yes yes no      25  0

Please purchase full version from 'http://www.verypdf.com' to extract following fonts from your PDF file,
ABCDEE+DejaVu Sans Mono
ABCDEE+Eras Demi ITC
ABCDEE+Franklin Gothic Demi
ABCDEE+Georgia
ABCDEE+High Tower Text

[List All Fonts], The number of fonts in this PDF file is: 11
ABCDEE+Arial Black                   TrueType          yes yes no       5  0
ABCDEE+Arial Rounded MT Bold         TrueType          yes yes no       7  0
Arial                                TrueType          no  no  no       9  0
ABCDEE+Calibri                       TrueType          yes yes no      11  0
ABCDEE+BankGothic Md BT              TrueType          yes yes no      13  0
ABCDEE+Candara                       TrueType          yes yes no      15  0
ABCDEE+DejaVu Sans Mono              TrueType          yes yes no      17  0
ABCDEE+Eras Demi ITC                 TrueType          yes yes no      19  0
ABCDEE+Franklin Gothic Demi          TrueType          yes yes no      21  0
ABCDEE+Georgia                       TrueType          yes yes no      23  0
ABCDEE+High Tower Text               TrueType          yes yes no      25  0

================== VeryPDF PDF-Spy Section #3 ==================

File: test-embedded-fonts2.pdf
PageCount: 1
Page 1 [0.000000 0.000000 612.000000 792.000000], Rotate=0
MediaBox:           0.00     0.00   612.00   792.00
CropBox:            0.00     0.00   612.00   792.00
BleedBox:           0.00     0.00   612.00   792.00
TrimBox:            0.00     0.00   612.00   792.00
ArtBox:             0.00     0.00   612.00   792.00

================== VeryPDF PDF-Spy Section #4 ==================

Creator:        Microsoft? Word 2010
Producer:       Microsoft? Word 2010
CreationDate:   06/06/12 11:35:49
ModDate:        06/06/12 11:35:49
Tagged:         yes
Form:           none
Pages:          1
Encrypted:      no
Page    1 size: 612 x 792 pts (letter)
Page    1 MediaBox:     0.00     0.00   612.00   792.00
Page    1 CropBox:      0.00     0.00   612.00   792.00
Page    1 BleedBox:     0.00     0.00   612.00   792.00
Page    1 TrimBox:      0.00     0.00   612.00   792.00
Page    1 ArtBox:       0.00     0.00   612.00   792.00
File size:      270439 bytes
Optimized:      no
PDF version:    1.5
================== VeryPDF PDF-Spy Section #5 ==================
List All Embedded Files in PDF file: [0] embedded files
================== VeryPDF PDF-Spy Section #6 ==================
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\TextFile.txt" file.
[Extract Text File] Save to "X:\test\_test-embedded-fonts2\TextFileWithPosition.txt" file.
================== VeryPDF PDF-Spy Section #7 ==================
================== VeryPDF PDF-Spy Section #8 ==================
Dump Document Summaries from PDF file: "test-embedded-fonts2.pdf"
================== VeryPDF PDF-Spy Section #9 ==================
Dump Form Name, Form Type and Form Values from PDF file: "test-embedded-fonts2.pdf"
================== VeryPDF PDF-Spy Section #10 ==================
Dump Annotations from PDF file: "test-embedded-fonts2.pdf"
================== VeryPDF PDF-Spy Section #11 ==================
Generate FDF from fillable PDF file: X:\test\_test-embedded-fonts2\pdfforms.fdf
================== VeryPDF PDF-Spy Section #12 ==================
Generate Page Content XML file from PDF file: X:\test\_test-embedded-fonts2\PageContents.xml
================== VeryPDF PDF-Spy Section #13 ==================
[Waning] The Gen Numer of obj 1 is 0, we will reset it to 0
[INFO] 'test-embedded-fonts2.pdf' file is 'NOT Encrypted'.
[Waning] The Gen Numer of obj 1 is 0, we will reset it to 0
[ContentParserExport] Processing page 1 of 1...
================== VeryPDF PDF-Spy Section End ==================
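
If you capture the screen output shown above, the font table in Section #2 can be turned into structured records. The Python sketch below is based only on the sample rows in this post: it assumes each font row ends with three yes/no columns followed by two numbers, and that the font name is separated from the type by at least two spaces. The function name parse_font_rows is hypothetical. Because the table is printed twice in the sample output, rows are keyed by object number so that each font appears once.

```python
import re

# One font row: name, type, emb, sub, uni, object number, generation number.
FONT_ROW = re.compile(
    r"^(?P<name>\S.*?)\s{2,}(?P<type>\S.*?)\s+"
    r"(?P<emb>yes|no)\s+(?P<sub>yes|no)\s+(?P<uni>yes|no)\s+"
    r"(?P<obj>\d+)\s+(?P<gen>\d+)\s*$"
)


def parse_font_rows(output_text):
    """Return one dict per font found in the captured console output."""
    fonts = {}
    for line in output_text.splitlines():
        match = FONT_ROW.match(line)
        if not match:
            continue  # header, separator, or log message such as "[Warning] ..."
        obj = int(match.group("obj"))
        fonts[obj] = {
            "name": match.group("name"),
            "type": match.group("type"),
            "emb": match.group("emb") == "yes",
            "sub": match.group("sub") == "yes",
            "uni": match.group("uni") == "yes",
            "object": obj,
            "generation": int(match.group("gen")),
        }
    return list(fonts.values())
```

A caller could then, for example, list the fonts whose "emb" value is False to find fonts that are not embedded in the PDF file.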

Please go to the output folder, where you will see the following extracted files:

[Screenshot: extracted files in the output folder]

cnt*.txt files contain the extracted text contents.

metadata.xmp contains the extracted metadata.

*.cff, *.ttf, and *.afm are the extracted font files.

PageContents.xml contains the extracted drawings. This XML file includes text, drawing, colorspace, font name, matrix, graphics state, and other information.

pdfforms.fdf contains the form information of the input PDF file.

TextFile.txt contains text information.

TextFileWithPosition.txt contains text contents with positions.

You can parse this information for further processing.
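
As a starting point for such processing, the Python sketch below sorts the files in an output folder into the groups described above. The file name patterns are taken from this post. The group labels and the function names classify and group_output_folder are hypothetical, and file names are compared without regard to case, which is an assumption.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# File name patterns as listed in this post; the group labels are illustrative.
PATTERNS = {
    "text": ["cnt*.txt", "textfile.txt", "textfilewithposition.txt"],
    "metadata": ["metadata.xmp"],
    "fonts": ["*.cff", "*.ttf", "*.afm"],
    "drawings": ["pagecontents.xml"],
    "forms": ["pdfforms.fdf"],
}


def classify(filename):
    """Return the group label for one file name, or "other" if nothing matches."""
    lowered = filename.lower()
    for kind, patterns in PATTERNS.items():
        if any(fnmatchcase(lowered, pattern) for pattern in patterns):
            return kind
    return "other"


def group_output_folder(folder):
    """Map each group label to the sorted file names found in the folder."""
    groups = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            groups.setdefault(classify(path.name), []).append(path.name)
    return groups


# Example use:
#   groups = group_output_folder("_test-embedded-fonts2")
#   print(groups.get("fonts", []))
```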

If you need some useful information that is not extracted by the current version of PDF Extract Tool Command Line software, please feel free to let us know; we will be glad to assist you as soon as possible.

Please feel free to contact us via the VeryPDF Ticket System:

http://support.verypdf.com
