CHAPTER 10
946
Document Interchange
10.9 Web Capture
Web Capture is a PDF 1.3 feature that allows information from Internet-based or
locally resident HTML, PDF, GIF, JPEG, and ASCII text files to be imported into
a PDF file. This feature is implemented in Acrobat 4.0 and later viewers by a Web
Capture plug-in extension (sometimes called AcroSpider). The information in
the Web Capture data structures enables viewer applications to perform the
following operations:
• Save locally and preserve the visual appearance of material from the Web
• Retrieve additional material from the Web and add it to an existing PDF file
• Update or modify existing material previously captured from the Web
• Find source information for material captured from the Web, such as the URL
  (if any) from which it was captured
• Find all material in a PDF file that was generated from a given URL
• Find all material in a PDF file that matches a given digital identifier (MD5
  hash)
The information needed to perform these operations is recorded in two data
structures in the PDF file:
• The Web Capture information dictionary holds document-level information
  related to Web Capture.
• The Web Capture content database keeps track of the material retrieved by Web
  Capture and where it came from, enabling Web Capture to avoid downloading
  material that is already present in the file.
The following sections provide a detailed overview of these structures. See
Appendix C for information about implementation limits in Web Capture.
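As a rough illustration of the role the content database plays, the following
Python sketch models it as two in-memory indexes, one keyed by source URL and one
keyed by MD5 digest. It is only a conceptual model under stated assumptions: the
names CapturedItem, ContentDatabase, find_by_url, find_by_digest, and
needs_download are hypothetical and do not appear in the PDF specification, and
the sketch says nothing about how the information is actually stored in a PDF
file.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CapturedItem:
    """One piece of captured material (hypothetical record, not a PDF object)."""
    data: bytes
    url: Optional[str]  # None when the material has no source URL
    digest: str = field(init=False)

    def __post_init__(self) -> None:
        # Digital identifier: the MD5 hash of the raw bytes, as hex text.
        self.digest = hashlib.md5(self.data).hexdigest()


class ContentDatabase:
    """Conceptual stand-in for the Web Capture content database."""

    def __init__(self) -> None:
        self.by_url: Dict[str, List[CapturedItem]] = {}
        self.by_digest: Dict[str, List[CapturedItem]] = {}

    def add(self, data: bytes, url: Optional[str] = None) -> CapturedItem:
        """Record captured material under its digest and, if known, its URL."""
        item = CapturedItem(data, url)
        if url is not None:
            self.by_url.setdefault(url, []).append(item)
        self.by_digest.setdefault(item.digest, []).append(item)
        return item

    def find_by_url(self, url: str) -> List[CapturedItem]:
        """All material that was generated from the given URL."""
        return list(self.by_url.get(url, []))

    def find_by_digest(self, digest: str) -> List[CapturedItem]:
        """All material whose MD5 digest matches the given identifier."""
        return list(self.by_digest.get(digest, []))

    def needs_download(self, url: str) -> bool:
        """True if nothing captured from this URL is present yet."""
        return url not in self.by_url
```

In this model, a capture tool would call needs_download before fetching a URL
and skip the retrieval when it returns False, which corresponds to avoiding the
download of material that is already present in the file.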
Note: The following discussion centers on HTML and GIF files, although Web
Capture handles other file types as well.
10.9.1 Web Capture Information Dictionary
The optional SpiderInfo entry in the document catalog (see Section 3.6.1,
“Document Catalog”) holds an optional Web Capture information dictionary
containing document-level information related to Web Capture. Table 10.37 shows
the contents of this dictionary.
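Because the SpiderInfo entry is optional, a consumer has to handle its absence.
The minimal Python sketch below assumes the document catalog has already been
parsed into a plain dict keyed by entry name. That representation and the
function name web_capture_info are assumptions made for illustration and are
not part of any particular PDF library.

```python
from typing import Any, Dict, Optional


def web_capture_info(catalog: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the Web Capture information dictionary, or None if absent.

    `catalog` is a hypothetical, already-parsed document catalog.
    """
    info = catalog.get("SpiderInfo")
    # Treat a missing or non-dictionary value as "no Web Capture information".
    return info if isinstance(info, dict) else None
```

A caller would treat a None result as meaning that the document carries no
Web Capture information dictionary.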