Extract a font and its corresponding cmap from a PDF

Question: I have tried several ways to extract fonts from a PDF, viz. fontforge, mupdf, pdfparser in C#, and also some Python scripts. But I am confused about how to get the exact pair of a font and its cmap embedded in the PDF. Please point me to the right approach for getting exact pairs of fonts and their cmaps.

Answer: The right place to start looking is the PDF specification, ISO 32000-1:2008. That, in combination with a PDF library that gives you access to low-level PDF objects (e.g. iText and iTextSharp), allows you to match embedded fonts with their PDF CMaps and extract them. If you tell us more about your requirements, more detailed answers may follow. Also, be aware that, depending on the font in question, you may have to acquire a license to be allowed to use the data for anything other than PDF display.
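
To make the matching concrete, here is a minimal Python sketch of the pairing logic, assuming the font dictionaries have already been read from a page's /Resources /Font entry by whatever low-level library you use. In the PDF specification, a font dictionary may carry a /ToUnicode CMap stream, a composite (Type0) font names or embeds its CMap under /Encoding, and the embedded font program sits in the font descriptor under /FontFile, /FontFile2, or /FontFile3. The function name pair_font_with_cmaps and the resolve hook are hypothetical, introduced only for this illustration; resolve stands in for your library's way of dereferencing indirect objects.

```python
from typing import Any, Callable, Dict

# Font descriptor keys that may hold the embedded font program:
# /FontFile (Type 1), /FontFile2 (TrueType), /FontFile3 (CFF and others).
FONT_FILE_KEYS = ("/FontFile", "/FontFile2", "/FontFile3")


def _lookup(d: Dict[str, Any], key: str, resolve: Callable[[Any], Any]) -> Any:
    """Return the resolved value for key, or None if the key is absent."""
    value = d.get(key)
    return None if value is None else resolve(value)


def pair_font_with_cmaps(font: Dict[str, Any],
                         resolve: Callable[[Any], Any] = lambda obj: obj
                         ) -> Dict[str, Any]:
    """Collect the font program and CMap-related entries of one font dictionary.

    `resolve` is a hypothetical hook: pass your library's function for
    dereferencing indirect objects. The default assumes plain dictionaries.
    """
    font = resolve(font)
    subtype = font.get("/Subtype")

    # For a composite (Type0) font, the descriptor belongs to the single
    # descendant CIDFont rather than to the top-level font dictionary.
    holder = font
    if subtype == "/Type0":
        descendants = _lookup(font, "/DescendantFonts", resolve) or []
        if descendants:
            holder = resolve(descendants[0])

    descriptor = _lookup(holder, "/FontDescriptor", resolve) or {}
    font_file_key = next((k for k in FONT_FILE_KEYS if k in descriptor), None)

    return {
        "base_font": font.get("/BaseFont"),
        "subtype": subtype,
        "font_file_key": font_file_key,  # None means the font is not embedded
        "font_file": _lookup(descriptor, font_file_key, resolve) if font_file_key else None,
        "to_unicode": _lookup(font, "/ToUnicode", resolve),  # CMap stream, if any
        "encoding": _lookup(font, "/Encoding", resolve),  # CMap name or stream for Type0
    }
```

Running this over every entry of each page's font resources gives one record per font, so the font program and its CMaps stay together instead of being extracted separately and matched up afterwards.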

If you feel that you need a tool that makes extracting fonts from PDF files easier, you can try a free trial of VeryPDF PDF Font Extractor Command Line, which supports font file formats such as TTF (TrueType), CFF (Compact Font Format), and AFM (Adobe Font Metrics). If you need to call it from C#, please use the Developer License.

This software does not require any other PDF software to be installed, and it supports all versions of PDF files. It can extract embedded PDF fonts to font files and render TrueType (TTF) fonts to GIF images. Please check the software's homepage for more information; in the following part, let us look at some of its parameters.

Usage: pdffont [options] <PDF-file>
  -f <int>     : first page to examine
  -l <int>     : last page to examine
  -opw <string>: owner password (for encrypted files)
  -upw <string>: user password (for encrypted files)
  -img         : convert TTF fonts to image files
  -h           : print usage information
  -$ <string>  : input your license key
Example:
   pdffont.exe C:\in.pdf C:\out
   pdffont.exe -f 1 -l 1 C:\in.pdf C:\out
   pdffont.exe -opw 123 -upw 456 C:\in.pdf C:\out
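
If you want to drive the tool from a script rather than typing commands by hand, the options above can be assembled programmatically. The following is only a sketch in Python: the helper name build_pdffont_args is hypothetical, and the argument order (options, then input PDF, then output folder) is an assumption taken from the examples above rather than from any official API.

```python
from typing import List, Optional


def build_pdffont_args(pdf_path: str,
                       out_dir: str,
                       first_page: Optional[int] = None,
                       last_page: Optional[int] = None,
                       owner_password: Optional[str] = None,
                       user_password: Optional[str] = None,
                       render_images: bool = False,
                       exe: str = "pdffont.exe") -> List[str]:
    """Build an argument list for pdffont using only the documented options."""
    args = [exe]
    if first_page is not None:
        args += ["-f", str(first_page)]      # first page to examine
    if last_page is not None:
        args += ["-l", str(last_page)]       # last page to examine
    if owner_password is not None:
        args += ["-opw", owner_password]     # owner password (encrypted files)
    if user_password is not None:
        args += ["-upw", user_password]      # user password (encrypted files)
    if render_images:
        args.append("-img")                  # convert TTF fonts to image files
    # Assumed from the examples: input file first, output folder second.
    args += [pdf_path, out_dir]
    return args
```

The resulting list can be passed to subprocess.run(args, check=True) on a machine where pdffont.exe is on the PATH; passing a list instead of a single string avoids quoting problems with paths that contain spaces.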

From these parameters we can tell that, when extracting fonts from a PDF, you can control the page range. The software can also extract fonts from password-protected PDF files. The command line and its usage are quite simple: even without experience in command line operation, you will know how to use the software once you see the command line. If you have any questions during use, please contact us.
