Extract Statistics from PDF and encode output as UTF-8

   VeryPDF PDF Toolbox Command Line can extract statistics, metadata, bookmarks and page labels from an input PDF file to a given output file, and it can encode that output as UTF-8. The software also has many other functions: it can merge, split and encrypt PDF files, fill PDF forms, edit PDF description information, and optimize PDFs. In the following part, I will show you how to use it to extract statistics.

Step 1. Download PDF Toolbox

  • This software runs on multiple platforms, including Mac, Linux and Windows. Please download the version that matches your system, or else the software will not work.
  • When the download finishes, there will be a zip file in your download folder. Extract it to a folder, and then you can check the files in it.

Step 2. Extract Statistics from PDF and save them as UTF-8

Usage: pdftoolbox <input files> [options] <-outfile output >

  • UTF-8 is the default encoding for XML and has become the dominant character encoding on the Web. The name is short for Unicode Transformation Format 8-bit; it is a variable-width encoding that can represent every character in the Unicode character set.
  • Because this encoding is so widely used, we added UTF-8 output to this software.
  • When you need to extract statistics from a PDF and save them as UTF-8, please refer to the following command line template.
    pdftoolbox sample_in1.pdf -getinfoutf8 -outfile "_getinfoutf8_out.txt"
    With the above command line, I extract statistics from sample_in1.pdf and save them as "_getinfoutf8_out.txt". Now let us check the result in the following snapshot.

[Snapshot: input PDF and output data]
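
If you want to confirm that an output file really is UTF-8, a small script can check it. The following Python code is only a minimal sketch and is not part of PDF Toolbox; the helper name `is_valid_utf8` is hypothetical, and the file name in the last comment is the one used in the command line above.

```python
from pathlib import Path


def is_valid_utf8(path):
    """Return True if the file at `path` decodes cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# UTF-8 is variable-width: ASCII characters take 1 byte,
# other characters take 2 to 4 bytes.
for ch in ("A", "é", "中", "😀"):
    print(repr(ch), len(ch.encode("utf-8")), "byte(s)")

# Hypothetical usage, assuming the extraction command has already been run:
# print(is_valid_utf8("_getinfoutf8_out.txt"))
```

The check reads the file as raw bytes and attempts a strict decode, so any byte sequence that is not legal UTF-8 makes it return False.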

Related parameters:

-getinfo          : extract statistics, metadata, bookmarks and page labels from the input PDF file to the given output file.
-getinfoutf8      : same as -getinfo, except that the output is encoded as UTF-8.
-outformdata      : extract field statistics from the input PDF to the given output file.
-outformdatautf8  : same as -outformdata, except that the output is encoded as UTF-8.
-setinfo          : set metadata in the PDF's Info dictionary.
-setinfoutf8      : set UTF-8 metadata in the PDF's Info dictionary.
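
The four extraction parameters above all fit the usage pattern pdftoolbox <input files> [options] <-outfile output>. As an illustration only, the following Python sketch assembles such a command and runs it. The names `build_extract_command` and `run_extract` are hypothetical, the sketch assumes that the `pdftoolbox` executable is on your PATH, and it assumes that -outformdata takes the same argument layout as -getinfo. Only flags from the list above are used.

```python
import shutil
import subprocess

# Extraction flags taken from the parameter list above.
EXTRACT_FLAGS = {
    ("info", False): "-getinfo",
    ("info", True): "-getinfoutf8",
    ("form", False): "-outformdata",
    ("form", True): "-outformdatautf8",
}


def build_extract_command(input_pdf, output_file, kind="info", utf8=True):
    """Build the argument list for one extraction run (hypothetical helper)."""
    flag = EXTRACT_FLAGS[(kind, utf8)]
    return ["pdftoolbox", input_pdf, flag, "-outfile", output_file]


def run_extract(input_pdf, output_file, kind="info", utf8=True):
    """Run the command; assumes the pdftoolbox executable is on PATH."""
    if shutil.which("pdftoolbox") is None:
        raise FileNotFoundError("pdftoolbox was not found on PATH")
    cmd = build_extract_command(input_pdf, output_file, kind, utf8)
    subprocess.run(cmd, check=True)


# Show the command that would be run for the example in Step 2.
print(" ".join(build_extract_command("sample_in1.pdf", "_getinfoutf8_out.txt")))
```

Passing the arguments as a list avoids shell quoting problems when a file name contains spaces.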

With this software, you can also replace the statistics in one PDF with the statistics extracted from another PDF file. When you do not need UTF-8 output, the corresponding non-UTF-8 parameters listed above are available. If you have any questions while using the software, please contact us as soon as possible.
