SECTION 10.7
891
Tagged PDF
various technical and historical reasons, however, many such fonts follow the
same conventions as those designed for Western writing systems, with glyph origins
at the lower left and positive widths, as shown in Figure 5.4 on page 394.
Consequently, showing text in such right-to-left writing systems requires either
positioning each glyph individually (which is tedious and costly) or representing
text with show strings (see “Organization and Use of Fonts” on page 388) whose
character codes are given in reverse order. When the latter method is used, the
character codes’ correct page content order is the reverse of their order within the
show string.
The marked-content tag ReversedChars informs the Tagged PDF consumer application
that show strings within a marked-content sequence contain characters in
the reverse of page content order. If the sequence encompasses multiple show
strings, only the individual characters within each string are reversed; the strings
themselves are in natural reading order. For example, the sequence
/ReversedChars BMC
( olleH ) Tj
-200 0 Td
( .dlrow ) Tj
EMC
represents the text Hello world.
The show strings may have a space character at the beginning or end to indicate a
word break (see “Identifying Word Breaks” on page 894) but may not contain
interior spaces. This limitation is not serious, since a space provides an opportunity
to realign the typography without visible effect, and it serves the valuable
purpose of limiting the scope of reversals for word-processing consumer applications.
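The reversal rule can be illustrated with a minimal sketch of what a consumer application might do with the show strings of a ReversedChars sequence. It assumes the strings have already been pulled out of the content stream and decoded to ordinary Python strings; the function names recover_text and has_interior_space are hypothetical and are not part of any PDF library. Each string is reversed on its own, while the order of the strings is left unchanged.

```python
from typing import List


def has_interior_space(show_string: str) -> bool:
    """Return True if a space occurs anywhere other than in a leading
    or trailing run of spaces (this sketch tolerates more than one)."""
    return " " in show_string.strip(" ")


def recover_text(show_strings: List[str]) -> str:
    """Reverse the characters inside each show string and concatenate
    the results, keeping the strings themselves in their given order."""
    parts = []
    for s in show_strings:
        if has_interior_space(s):
            # Assumption of this sketch: treat an interior space as an error.
            raise ValueError(f"interior space in show string: {s!r}")
        parts.append(s[::-1])
    return "".join(parts)


# Illustrative input: the leading space of the first string becomes the
# word break after "Hello" once the string is reversed.
print(recover_text([" olleH", ".dlrow"]))  # prints "Hello world."
```

Because a string may carry a space only at its beginning or end, each reversal is confined to a single word, which is the scope limitation described above.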
Extraction of Character Properties
It is a requirement of Tagged PDF that character codes can be unambiguously
converted to Unicode values representing the information content of the text.
There are several methods for doing this; a Tagged PDF document must conform
to at least one of them (see “Unicode Mapping in Tagged PDF,” below).