My university expects submitted PDF files to be in the PDF/A format (which I had never heard of...). I tried to find a converter, but they were all very expensive and/or complicated. How can I convert my existing PDF file into a PDF/A?
Thank you very much!
There are a lot of PDFs out there which claim to be PDF/A but fail a real smoke test. That claim is just a tag in the file's metadata. The tag can make a viewer such as Acrobat Reader display a special hint when rendering the file, but it does not prove that the file actually conforms.
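To illustrate how thin that claim is, here is a minimal sketch that reads the claimed level straight out of the raw bytes. It assumes the claim is stored in the XMP metadata as `pdfaid:part` and `pdfaid:conformance` and that the XMP packet is stored uncompressed; the function name `claimed_pdfa_level` is made up for this example. It reports only what the file says about itself and validates nothing.

```python
import re

# XMP can serialise a property as an element (<pdfaid:part>1</pdfaid:part>)
# or as an attribute (pdfaid:part="1"); both forms are matched here.
_PART = re.compile(rb"pdfaid:part(?:>|=['\"])\s*(\d)")
_CONFORMANCE = re.compile(rb"pdfaid:conformance(?:>|=['\"])\s*([A-Za-z])")


def claimed_pdfa_level(pdf_bytes: bytes):
    """Return the level the file claims (e.g. '1b'), or None if it claims nothing.

    This only reads the self-declared tag; it does not check conformance.
    """
    part = _PART.search(pdf_bytes)
    if part is None:
        return None
    level = part.group(1).decode("ascii")
    conformance = _CONFORMANCE.search(pdf_bytes)
    if conformance is not None:
        level += conformance.group(1).decode("ascii").lower()
    return level


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        print(claimed_pdfa_level(handle.read()))
```

A file that prints `1b` here may still fail a real validator, which is exactly the problem described above.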
A PDF/A document is just a PDF document that uses a specific subset of PDF designed to ensure it is 'self-contained'. I.e., it is not permitted to rely on information from external sources (e.g. font programs and hyperlinks).
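One free route that nobody in this thread has mentioned is Ghostscript, whose `pdfwrite` device can target PDF/A. The sketch below only builds and runs the command line; it assumes a `gs` executable on the PATH (on Windows it is usually called `gswin64c`) and a `PDFA_def.ps` prologue file, shipped with Ghostscript, that has been edited to point at an ICC profile. The helper names are made up for this example. Depending on the Ghostscript version, extra flags may be needed, and the result should still be checked with a validator.

```python
import subprocess


def ghostscript_pdfa_command(src, dst, pdfa_def="PDFA_def.ps", part=1):
    """Build a Ghostscript command line that asks pdfwrite for PDF/A output."""
    return [
        "gs",                             # assumed executable name
        f"-dPDFA={part}",                 # requested PDF/A part
        "-dBATCH",
        "-dNOPAUSE",
        "-sColorConversionStrategy=RGB",  # convert to one device-independent space
        "-sDEVICE=pdfwrite",
        "-dPDFACompatibilityPolicy=1",    # drop content that cannot be made compliant
        f"-sOutputFile={dst}",
        pdfa_def,                         # prologue must come before the input file
        src,
    ]


def convert_to_pdfa(src, dst, **options):
    """Run the conversion; raises CalledProcessError if Ghostscript fails."""
    subprocess.run(ghostscript_pdfa_command(src, dst, **options), check=True)


if __name__ == "__main__":
    print(" ".join(ghostscript_pdfa_command("thesis.pdf", "thesis_pdfa.pdf")))
```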
VeryPDF has "PDF to PDF/A Converter GUI" and "PDF to PDF/A Converter Command Line" products, which are also sold under the name PDF/A Manager. You can download them from the product pages listed at the end of this post.
PDF/A Manager is a PDF/A (ISO 19005-1) validation and conversion software. It is available as a command-line tool and as a development toolkit.
The conversion option analyses the content of existing PDF files and performs a sequence of modifications in order to produce a PDF/A compliant document. Features that are not suitable for long-term archiving (such as encryption, obsolete compression schemes, missing fonts, or device-dependent color) are replaced with their PDF/A compliant equivalents. Because the conversion process applies only necessary changes to the source file, the information loss is minimal. Also, because the converter provides a detailed report for each change, it is simple to inspect changes and to determine whether the conversion loss is acceptable.
The validation option in PDF/A Manager can be used to quickly determine whether a PDF file fully complies with the PDF/A specification according to the international standard ISO 19005-1. For files that are not compliant, the validation option can be used to produce a detailed report of compliance violations as well as a list of relevant error objects.
PDF/A Manager has the following features:
- Checks if a PDF file is compliant with PDF/A (ISO 19005-1) specification.
- Converts any PDF to a PDF/A compliant document.
- Supports both PDF/A-1a and PDF/A-1b.
- Produces a detailed report of compliance violations and associated PDF objects.
- Keeps the required changes to a minimum, preserving the consistency of the original.
- Tracks all changes to allow for automatic assessment of data loss.
- Allows user to customize compliance checks or omit specific changes during PDF/A conversion.
- Supports user-defined color profiles.
- Offers automatic font substitution, embedding, and subsetting options.
- Supports automation and batch operation. PDF/A Converter is designed to be used in unattended mode in high-throughput server or batch environments.
Frequent Questions about PDF/A
What is PDF/A?
Portable Document Format/Archive (PDF/A) is a version of PDF standardized by the International Organization for Standardization (ISO) and specialized for the digital preservation of electronic documents. PDF/A ensures that documents can be reproduced in exactly the same way in years to come. The format forbids dynamic content and restricts the use of PDF objects; everything required to render the document (fonts, color profiles, images, etc.) is fully self-contained in the PDF/A file.
Currently, there are two variations of PDF/A: PDF/A-1a and PDF/A-1b. They correspond to the two levels of compliance that the standard specifies:
- PDF/A-1a: Level "A" compliance: exact visual reproduction, mapping text to Unicode and structuring of the document content, preservation of a document's logical structure and content text stream in natural reading order, especially important when the document must be displayed on a mobile device
- PDF/A-1b: Level "B" compliance: exact visual reproduction
Why use PDF/A?
PDF/A stores objects (e.g. text, graphics), allowing for an efficient full-text search in an entire archive. Files stored as Tagged Image File Format (TIFF) cannot be searched directly. TIFF is a raster format and must first be processed with an OCR (optical character recognition) engine.
PDF/A files require only a fraction of the storage space of original or TIFF files, without loss of quality. The smaller file size is especially advantageous for electronic file transfers (FTP, e-mail attachment, etc.).
PDF/A format can be optimized. The optimization can be focused on images (e.g. scanned checks) or extracting structured data (e.g. voucher information). TIFF treats all file information the same.
Metadata like title, author, creation date, modification date, subject, keywords, etc., can be stored in a PDF/A file. PDF/A files can be automatically classified based on the metadata, without requiring human intervention.
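As a rough sketch of what metadata-based classification could look like, the snippet below pulls a few Dublin Core fields out of an XMP packet and assigns a category by keyword. It assumes the XMP packet has already been extracted from the PDF as an XML string; the function names and the keyword rules are invented for this example and are not part of any product mentioned here.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_fields(xmp_xml, names=("title", "creator", "rights")):
    """Collect the text of every rdf:li found under the requested dc:* elements."""
    root = ET.fromstring(xmp_xml)
    fields = {}
    for name in names:
        values = []
        for element in root.iter(f"{{{DC_NS}}}{name}"):
            for item in element.iter(f"{{{RDF_NS}}}li"):
                if item.text and item.text.strip():
                    values.append(item.text.strip())
        if values:
            fields[name] = values
    return fields


def classify(fields, rules):
    """Return the first category whose keyword occurs in the title.

    `rules` maps a category name to a keyword (hypothetical scheme).
    """
    title = " ".join(fields.get("title", [])).lower()
    for category, keyword in rules.items():
        if keyword.lower() in title:
            return category
    return "unclassified"
```

For instance, `classify(dublin_core_fields(xmp), {"invoices": "invoice"})` would file any document whose title contains "invoice" under `invoices`.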
What is the difference between PDF and PDF/A?
PDF files may include videos, GIS data and 3-dimensional images. PDF/A, however, is highly restrictive and does not support dynamic PDF content.
PDF/A-1 allows various different metadata, including bookmarks, to be saved in the document. Extensible Metadata Platform (XMP), a technology that unifies different metadata methods, is used for the metadata in a PDF/A-1 file. However, PDF files with dynamic objects like audio and video cannot be converted to PDF/A because they rely on an external player that may not be available in the future.
Generally speaking, NARA-acceptable PDF records must comply with PDF versions 1.0 through 1.4, have no encryption or security settings, have the fonts embedded, follow the NARA transfer guidance for scanned images, and have multimedia/special features negotiated beforehand in a "notification process".
What are the PDF/A-1 restrictions?
One of the key differences between PDF and PDF/A is the restrictions that PDF/A places on PDF.
PDF/A-1 files must include:
- Embedded fonts
- Device-independent color
- Extensible Metadata Platform (XMP) metadata
PDF/A-1 files may not include:
- LZW Compression
- Embedded files
- External content references
- PDF Transparency
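A crude first check for some of these restrictions can be done by searching the raw file for tell-tale PDF names. The sketch below looks for LZW compression and embedded files from the list above, plus encryption, which is mentioned earlier as unsuitable for archiving. It is only a heuristic under a strong assumption: names stored inside compressed object streams are invisible to it, so an empty result proves nothing, and it is no substitute for a real validator. The function and table names are made up for this example.

```python
# PDF name objects that hint at features PDF/A-1 does not allow.
FORBIDDEN_MARKERS = {
    b"/LZWDecode": "LZW compression",
    b"/EmbeddedFiles": "embedded files",
    b"/Encrypt": "encryption",
}


def suspicious_features(pdf_bytes: bytes) -> list:
    """Return sorted labels for every marker found in the raw bytes."""
    return sorted(
        label
        for marker, label in FORBIDDEN_MARKERS.items()
        if marker in pdf_bytes
    )


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        for label in suspicious_features(handle.read()):
            print("possible PDF/A-1 violation:", label)
```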
Are PowerPoint files good candidates for converting to PDF/A?
Yes. You might have to take preparatory steps such as ensuring that annotations are also carried over into the PDF/A file before saving.
Can PDF/A files contain copyright information, like TIFF can?
Yes. PDF/A gives you the possibility to save various different metadata (for example, the copyright) in the document. XMP, a technology that unifies different metadata methods, is used for the metadata in a PDF/A file.
How can you best make PDF/A files text searchable?
If a PDF/A file is created from a digital text document, the text will automatically be recognized. For a scanned paper document or image, OCR can be used to make it searchable. In this case, only PDF/A-1b is possible, not the more stringent PDF/A-1a.
What about "mixed" objects in PDF, like audio and video? Can these be used in PDF/A?
PDF files with dynamic objects like audio and video cannot be converted to PDF/A. PDF/A must guarantee an exact reproducibility, which is not possible with embedded objects like sound or movies. These types of objects require an external player (and quite often in a specific version). There is no guarantee that the player application will be available in the future.
Which text recognition software works well together with Acrobat 8/9/X?
Acrobat 8/9/X Professional comes with its own OCR software that can be used to convert scanned pages into searchable text. Note: most EPA staff do not have this version of Adobe Acrobat.
Can PDF files be converted to PDF/A?
Yes, but not all PDF features are transferable to PDF/A. PDF/A-1 is based on PDF 1.4, and some features are not permitted in it: transparency, although introduced with PDF 1.4, is prohibited, and layers only arrived in later PDF versions. In such cases the transparency has to be removed and the layers flattened in order to create a PDF/A-1 document. The next version of PDF/A, PDF/A-2, is based on the PDF specification 1.7 and will allow a lot of the newer features.
What are the different ways to create a PDF/A file?
- Print to PDF/A on a client computer
- Print to PDF/A using a print stream on a server
- Scan to PDF/A (paper to PDF/A)
- Convert existing image files to PDF/A
- Convert existing PDF files to PDF/A
- Export a document to PDF/A format
- Create PDF/A "on-the-fly" from data or a database
How can I find out if a font is embedded?
When a PDF/A file is created, the program ensures that the fonts are embedded. If the fonts are not embedded, you don't have a valid PDF/A file. You can verify if fonts (and which ones) are embedded in any PDF file by checking under "Properties" in Acrobat and the Adobe Reader. In addition, PDF/A validation tools will inform you if fonts have or have not been embedded, and whether the files conform to PDF/A or not.
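For those who prefer to script this check, the rule of thumb is that an embedded font has a font descriptor containing one of the keys `/FontFile`, `/FontFile2` or `/FontFile3`. The sketch below applies that rule to font descriptors represented as plain dictionaries; it assumes some PDF library has already extracted them, and the function names are invented for this example. Type 3 fonts, whose glyphs are defined inside the PDF itself, have no font file and would need separate handling.

```python
# Keys under which a font descriptor references an embedded font program.
FONT_FILE_KEYS = ("/FontFile", "/FontFile2", "/FontFile3")


def is_embedded(font_descriptor) -> bool:
    """True if the descriptor references an embedded font program."""
    if not font_descriptor:
        return False
    return any(key in font_descriptor for key in FONT_FILE_KEYS)


def unembedded_fonts(descriptors: dict) -> list:
    """Return the sorted names of fonts that lack an embedded font program.

    `descriptors` maps a font name to its descriptor dict, or to None
    when the font has no descriptor at all.
    """
    return sorted(
        name
        for name, descriptor in descriptors.items()
        if not is_embedded(descriptor)
    )
```

Any name this returns would make the file fail PDF/A validation, so it is a quick way to find out which fonts need attention before converting.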
What are special font considerations?
Many fonts have restrictions on use, embedding and exchange. PDF/A requires fonts to be embedded. Therefore, organizations using PDF/A-1 must take extra precautions to be sure that the fonts they use are properly licensed to allow embedding.
Can PDF/A files contain an electronic signature?
Yes. There are a number of tools, strategies and software solutions available for this. Even Acrobat Professional can be used to digitally sign PDF/A files.
What does PDF/A mean by "long term"?
PDF/A defines long-term as: "the period of time long enough for there to be concern about the impacts of changing technologies, including support for new media and data formats, and of a changing user community, on the information being held in a repository, which may extend into the indefinite future."
Does PDF/A-1 replace other archival file formats?
No. PDF/A-1 was developed to allow PDF to be used as an archival format in a well-defined and robust manner.
What long-term preservation needs does PDF/A-1 address?
Characteristics identified as objectives for PDF/A:
- Device Independent - Can be reliably and consistently rendered without regard to the hardware or software platform
- Self-contained - Contains all resources necessary for rendering
- Self-documenting - Contains its own description
- Unfettered - Absence of technical file protection mechanisms
- Available - Authoritative specification publicly available
- Adoption - Widespread use may be the best deterrent against preservation risk
When should PDF/A be used?
PDF/A should be used as a way to standardize the use of PDF for electronic document storage and ensure that these documents will be available well into the future. This is important to support business needs that require reliable rendering of electronic documents over the long term.
Because PDF/A is only a file format specification, users will need to establish their own capture methodology that meets domain-specific policies and procedures (e.g., for reliability, integrity, compliance, comprehensiveness).
For permanent records in PDF, federal agencies will need to implement PDF/A-1 in conjunction with additional requirements identified in guidance from the National Archives and Records Administration (NARA) for transferring permanent PDF records.
It is important to be aware that:
- PDF/A-1 alone does not guarantee preservation
- PDF/A-1 alone does not guarantee exact replication of source material
Does NARA accept PDF/A files?
NARA may accept transfers of permanent records in PDF/A format that additionally meet the current transfer requirements for electronic records in PDF.
How did PDF/A get started and who is involved?
The PDF/A activity was initiated through the joint sponsorship of AIIM (the Association for Information and Image Management) and NPES (The Association for Suppliers of Printing, Publishing, and Converting Technologies). Under the auspices of TC-171 (Document Management Application), Subcommittee 2 (Application Issues), a Joint Working Group (WG5) was formed with representatives from ISO Technical Committees 42, 46, 130, and 171. Librarians, archivists, PDF software developers, government agencies, imaging experts, graphics experts and others collaborated to develop PDF/A. Initial meetings were held in mid-2002 and the standard was approved in June 2005. Technical experts from 15 national standards bodies provided input throughout the development process.
PDF to PDF/A Converter GUI,
PDF to PDF/A Converter Command Line,