

                                               884
       CHAPTER 10                                                    Document Interchange



       • Structure types (Section 10.7.3, “Standard Structure Types”). A set of standard
         structure types defines the meaning of structure elements, such as paragraphs,
         headings, articles, and tables.
       • Structure attributes (Section 10.7.4, “Standard Structure Attributes”). Standard
         structure attributes preserve styling information used by the authoring
         application in laying out content on the page.

       A Tagged PDF document must also contain a mark information dictionary (see
       Table 10.8) with a value of true for the Marked entry.
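
       As an illustration only, the check implied by this requirement can be sketched
       in Python. The sketch assumes the document catalog has already been parsed into
       a plain dict and that the mark information dictionary is the catalog's MarkInfo
       entry; the function name and the sample catalogs are hypothetical. A true Marked
       entry is a necessary condition here, not proof that the document follows every
       Tagged PDF convention.

```python
def has_marked_flag(catalog: dict) -> bool:
    """Return True if the catalog carries a MarkInfo dictionary with Marked set to true.

    Assumption: `catalog` is an already-parsed document catalog represented as a
    plain dict, with PDF booleans mapped to Python bools.
    """
    mark_info = catalog.get("MarkInfo")
    if not isinstance(mark_info, dict):
        return False
    # Only an explicit boolean true counts; a missing entry is treated as false.
    return mark_info.get("Marked") is True


# Hypothetical sample catalogs, for illustration only.
tagged_catalog = {"Type": "Catalog", "MarkInfo": {"Marked": True}}
untagged_catalog = {"Type": "Catalog"}

print(has_marked_flag(tagged_catalog))
print(has_marked_flag(untagged_catalog))
```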

       Note: The types and attributes defined for Tagged PDF are intended to provide a set
       of standard fallback roles and minimum guaranteed attributes to enable consumer
       applications to perform operations such as those mentioned above. Producer
       applications are free to define additional structure types as long as they also
       provide a role mapping to the nearest equivalent standard types, as described in
       Section 10.6.2, “Structure Types.” Likewise, producer applications can define
       additional structure attributes using any of the available extension mechanisms.
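
       The role-mapping fallback described in the note can be sketched as follows. This
       is a minimal illustration, not a definitive implementation: it assumes the role
       map has been read into a dict from custom type names to replacement names, it
       lists only a partial set of standard structure types, and the custom names
       Sidebar and BoxedText are hypothetical.

```python
from typing import Dict, Optional, Set

# Partial, illustrative subset of the standard structure types (Section 10.7.3).
STANDARD_TYPES: Set[str] = {
    "Document", "Part", "Art", "Sect", "Div",
    "P", "H", "H1", "H2", "H3", "H4", "H5", "H6",
    "L", "LI", "Table", "TR", "TH", "TD",
    "Span", "Figure",
}


def resolve_role(struct_type: str,
                 role_map: Dict[str, str],
                 standard: Set[str] = STANDARD_TYPES) -> Optional[str]:
    """Follow the role map until a standard structure type is reached.

    Returns None if the chain ends at an unmapped custom type or loops back
    on itself, in which case a consumer has no standard fallback role.
    """
    seen: Set[str] = set()
    current = struct_type
    while current not in standard:
        if current in seen or current not in role_map:
            return None
        seen.add(current)
        current = role_map[current]
    return current


# Hypothetical producer-defined types mapped toward a standard type.
example_role_map = {"Sidebar": "BoxedText", "BoxedText": "Div"}
print(resolve_role("Sidebar", example_role_map))
```

       Tracking the names already visited keeps a circular role map from causing an
       endless loop; whether a given consumer reports such a map as an error is left
       open here.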


10.7.1 Tagged PDF and Page Content

       Like all PDF documents, a Tagged PDF document consists of a sequence of
       self-contained pages, each of which is described by one or more page content
       streams (including any subsidiary streams such as form XObjects and annotation
       appearances). Tagged PDF defines some further conventions for organizing and
       marking content streams so that additional information can be derived from them:

       • Distinguishing between the author’s original content and artifacts of the layout
         process (see “Real Content and Artifacts” on page 885)
       • Specifying a content order to guide the layout process if the page content must
         be reflowed (see “Page Content Order” on page 889)
       • Representing text in a form from which a Unicode representation and
         information about font characteristics can be unambiguously derived (see
         “Extraction of Character Properties” on page 891)
       • Representing word breaks unambiguously (see “Identifying Word Breaks” on
         page 894)
       • Marking text with information for making it accessible to users with visual
         impairments (see Section 10.8, “Accessibility Support”)
