How can I read properties and metadata such as Title, Author, Subject and Keywords stored in a PDF file using Python? And how can I set these properties and metadata on PDF files using Python? We often get questions like these from customers who want to manipulate their PDF files with the Python programming language.
Here is a question one of our customers asked in the past:
I'm looking for a very fast, lightweight Python library to read PDF metadata. I don't need any write capabilities. It would be better if only the metadata information is loaded, not the entire file.
I realize an interpreted language like Python isn't the best choice for speed, but as this solution needs to be cross platform and work with an existing Python application there doesn't seem to be much of a choice.
I checked out some Python PDF libraries, but I am ideally looking for something lighter and faster, suitable for processing tens of thousands of files in one go.
Fortunately, VeryUtils has a PythonPDF Library product that is small and fast. It can be used to manipulate PDF files: rotate PDF pages, split and merge PDF files, stamp PDF files and more. With VeryUtils PythonPDF Library, you can easily retrieve and set the following attributes of PDF files: Title, Author, Subject, Keywords, Creator, Producer, CreationDate and ModDate.
Please follow these steps to use VeryUtils PythonPDF Library,
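As a rough illustration of the retrieval side, here is a minimal sketch that reads the Info dictionary into a plain Python dict. It assumes the library is already importable under the name pdfrw (as in the sample script later in this article) and that its string objects provide a decode() method. The helper names normalize_info and read_metadata are made up for this example and are not part of the product.

```python
def normalize_info(info):
    """Turn a pdfrw-style Info dictionary into a plain dict of Python strings."""
    result = {}
    # A PDF without an Info dictionary yields None, so fall back to an empty dict
    for key, value in (info or {}).items():
        # Keys are PDF names such as '/Title'; drop the leading slash
        name = str(key).lstrip('/')
        # Assumption: PDF string objects expose decode(); other values are shown via str()
        text = value.decode() if hasattr(value, 'decode') else str(value)
        result[name] = text
    return result


def read_metadata(path):
    """Open a PDF and return its metadata as a plain dict (hypothetical helper)."""
    # Imported here so the helper above can be used without the library installed
    from pdfrw import PdfReader
    return normalize_info(PdfReader(path).Info)


# Example call, assuming testcmd.pdf exists in the current folder:
# print(read_metadata('testcmd.pdf'))
```

The sketch makes no claim about how much of the file the library parses internally, so it is worth timing on your own files before processing a large batch.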
1. You may buy VeryUtils PythonPDF Library from this web page,
2. After you buy it, you will get a package with the full Python source code. VeryUtils PythonPDF Library is a pure Python product and doesn't require any third-party application.
3. Download PythonPDF Library and unzip it to a folder, such as D:\downloads\python-pdfrw.
4. Run the following command line to add the main folder to the "PYTHONPATH" environment variable,
5. Go to the "examples" folder and run the following command line to test the PDF properties/metadata modification function,
python alter.py testcmd.pdf
Please look at the following screenshot of the modified PDF file,
6. The sample Python source code is shown below,
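import os
import sys

from pdfrw import PdfReader, PdfWriter

# Input PDF comes from the command line; output is written as alter.<input name>
inpfn, = sys.argv[1:]
outfn = 'alter.' + os.path.basename(inpfn)

# Load the PDF and overwrite the fields of its Info dictionary
trailer = PdfReader(inpfn)
trailer.Info.Title = 'My New Title Goes Here'
trailer.Info.Producer = 'My Producer'
trailer.Info.Author = 'My Author'
trailer.Info.Creator = 'My Creator'
trailer.Info.Subject = 'My Subject'
trailer.Info.Keywords = 'My Keywords'
trailer.Info.CreationDate = 'D:20150803195603Z'
trailer.Info.ModDate = 'D:20150803195603Z'

# Write the modified document to the output file
writer = PdfWriter()
writer.trailer = trailer
writer.write(outfn)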
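The CreationDate and ModDate values in the sample are hard-coded PDF date strings. If you would rather stamp the current time, a small helper along these lines could build the string for you. This is only a sketch: pdf_date is a hypothetical name, and it assumes that a datetime without timezone information is already in UTC.

```python
from datetime import datetime, timezone


def pdf_date(moment=None):
    """Format a datetime as a UTC PDF date string such as D:20150803195603Z."""
    if moment is None:
        # Default to the current time in UTC
        moment = datetime.now(timezone.utc)
    elif moment.tzinfo is None:
        # Assumption: naive datetimes are treated as already being in UTC
        moment = moment.replace(tzinfo=timezone.utc)
    # Convert to UTC and emit the D:YYYYMMDDHHMMSS form with a trailing Z
    return moment.astimezone(timezone.utc).strftime('D:%Y%m%d%H%M%SZ')


# Reproduces the value used in the sample script
print(pdf_date(datetime(2015, 8, 3, 19, 56, 3)))  # D:20150803195603Z
```

The result can be assigned in place of the literal, for example trailer.Info.ModDate = pdf_date().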
If you have any questions about the VeryUtils Python library product, please feel free to let us know. We are glad to assist you as soon as possible.