

                                       454
CHAPTER 5                                                                     Text



The codespace ranges in the CMap (delimited by begincodespacerange and
endcodespacerange) determine how many bytes are extracted from the string for
each successive character code. A codespace range is specified by a pair of codes
of some particular length giving the lower and upper bounds of that range. A
code is considered to match the range if it is the same length as the bounding
codes and the value of each of its bytes lies between the corresponding bytes of
the lower and upper bounds. The code length cannot exceed the number of bytes
representable in an integer (see Appendix C).
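
As an illustration of the matching rule above, the following minimal Python sketch compares a candidate code against one codespace range byte by byte. The function name code_matches_range and the sample range <8140> <9FFC> are assumptions chosen for the example, not part of any CMap defined here.

```python
def code_matches_range(code: bytes, low: bytes, high: bytes) -> bool:
    """Return True if `code` falls within the codespace range <low> <high>."""
    if len(low) != len(high):
        raise ValueError("range bounds must have the same length")
    # A code can match only a range whose bounding codes have the same length.
    if len(code) != len(low):
        return False
    # Each byte is checked independently against the corresponding bound bytes.
    return all(lo <= b <= hi for b, lo, hi in zip(code, low, high))


# Hypothetical two-byte range <8140> <9FFC>.
low, high = bytes.fromhex("8140"), bytes.fromhex("9FFC")
print(code_matches_range(bytes.fromhex("8E50"), low, high))  # True
print(code_matches_range(bytes.fromhex("8E30"), low, high))  # False
print(code_matches_range(bytes.fromhex("8E"), low, high))    # False
```

Note that <8E30> is rejected even though its numeric value lies between the two bounds: its second byte, 30, is below the lower bound's second byte, 40. The comparison is per byte, not on the code taken as a single number.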

A sequence of one or more bytes is extracted from the string and matched against
the codespace ranges in the CMap. That is, the first byte is matched against 1-byte
codespace ranges; if no match is found, a second byte is extracted, and the 2-byte
code is matched against 2-byte codespace ranges. This process continues for successively
longer codes until a match is found or all codespace ranges have been
tested. There will be at most one match because codespace ranges do not overlap.
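
The extraction procedure just described can be sketched in Python as follows. The function name split_codes and the four sample codespace ranges are assumptions made for illustration. What to do when no range matches is not covered by the paragraph above, so this sketch simply raises an error at that point.

```python
from typing import List, Tuple

CodespaceRange = Tuple[bytes, bytes]  # (lower bound, upper bound), same length


def _in_range(code: bytes, low: bytes, high: bytes) -> bool:
    # Same length, and every byte within the corresponding bound bytes.
    return len(code) == len(low) == len(high) and all(
        lo <= b <= hi for b, lo, hi in zip(code, low, high)
    )


def split_codes(data: bytes, ranges: List[CodespaceRange]) -> List[bytes]:
    """Split a string into character codes using the codespace ranges."""
    max_len = max((len(low) for low, _ in ranges), default=0)
    codes: List[bytes] = []
    pos = 0
    while pos < len(data):
        matched = None
        # Try a 1-byte code first, then successively longer codes.
        for n in range(1, max_len + 1):
            candidate = data[pos:pos + n]
            if len(candidate) < n:
                break  # the string ended before a full n-byte code
            if any(_in_range(candidate, lo, hi) for lo, hi in ranges):
                matched = candidate
                break
        if matched is None:
            # Placeholder behavior for this sketch only.
            raise ValueError(f"no codespace range matches at offset {pos}")
        codes.append(matched)
        pos += len(matched)
    return codes


# Hypothetical codespace: two 1-byte ranges and two 2-byte ranges.
ranges = [
    (bytes.fromhex("00"), bytes.fromhex("80")),
    (bytes.fromhex("8140"), bytes.fromhex("9FFC")),
    (bytes.fromhex("A0"), bytes.fromhex("DF")),
    (bytes.fromhex("E040"), bytes.fromhex("FCFC")),
]
print(split_codes(bytes.fromhex("418E50A5"), ranges))
# [b'A', b'\x8eP', b'\xa5']
```

In the sample string, the byte 8E matches no 1-byte range, so a second byte is taken and the 2-byte code <8E50> is matched instead.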

The code extracted from the string is looked up in the character code mappings
for codes of that length. (These are the mappings defined by beginbfchar,
endbfchar, begincidchar, endcidchar, and corresponding operators for ranges.)
Failing that, it is looked up in the notdef mappings, as described in the next
section.
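
A minimal sketch of this two-stage lookup follows. It assumes, for simplicity, that both the character code mappings and the notdef mappings have already been expanded into plain dictionaries keyed by the code's bytes; the name lookup_cid and all CID values are hypothetical and not taken from any real CMap.

```python
from typing import Dict, Optional


def lookup_cid(
    code: bytes,
    char_map: Dict[bytes, int],
    notdef_map: Dict[bytes, int],
) -> Optional[int]:
    """Look up a code in the character mappings, then in the notdef mappings."""
    # Keys are byte strings, so a 1-byte code never hits a 2-byte mapping.
    cid = char_map.get(code)
    if cid is None:
        cid = notdef_map.get(code)
    return cid  # None means neither table has an entry (not handled here)


# Illustrative tables only.
char_map = {b"\x41": 34, b"\x8e\x50": 2500}
notdef_map = {b"\x01": 1, b"\x00\x41": 1}

print(lookup_cid(b"\x41", char_map, notdef_map))      # 34
print(lookup_cid(b"\x00\x41", char_map, notdef_map))  # 1
print(lookup_cid(b"\x7f", char_map, notdef_map))      # None
```

Keying the tables by the full byte string is one way to honor the requirement that a code is looked up only among mappings for codes of the same length: <41> and <0041> are distinct keys.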

The results of the CMap mapping algorithm are a font number and a character
selector. The font number is used as an index into the Type 0 font’s
DescendantFonts array to select a CIDFont. In PDF, the font number is always 0
and the character selector is always a CID; this is the only case described here.
The CID is then used to select a glyph in the CIDFont. If the CIDFont contains
no glyph for that CID, the notdef mappings are consulted, as described in the
next section.
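
The glyph selection step can be sketched as below. This is an illustration under stated assumptions, not a definitive implementation: each CIDFont is modeled merely as the set of CIDs for which it has glyphs, the notdef mappings are a dictionary keyed by code, and the name select_glyph is hypothetical. The case where the substitute CID also has no glyph is outside the passage above, so the sketch returns None there.

```python
from typing import Dict, List, Optional, Set, Tuple


def select_glyph(
    code: bytes,
    cid: int,
    notdef_map: Dict[bytes, int],
    descendant_fonts: List[Set[int]],
) -> Optional[Tuple[int, int]]:
    """Return (font number, CID) of the glyph to use, or None if unresolved."""
    font_number = 0  # in PDF the font number is always 0
    available = descendant_fonts[font_number]
    if cid in available:
        return font_number, cid
    # No glyph for this CID: consult the notdef mappings for a substitute.
    substitute = notdef_map.get(code)
    if substitute is not None and substitute in available:
        return font_number, substitute
    return None


# One descendant CIDFont with glyphs for CIDs 0, 1, and 34 (made-up values).
fonts = [{0, 1, 34}]
print(select_glyph(b"\x41", 34, {}, fonts))                        # (0, 34)
print(select_glyph(b"\x8e\x50", 2500, {b"\x8e\x50": 1}, fonts))    # (0, 1)
print(select_glyph(b"\x8e\x50", 2500, {}, fonts))                  # None
```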


Handling Undefined Characters

A CMap mapping operation can fail to select a glyph for a variety of reasons. This
section describes those reasons and what happens when they occur.

If a code maps to a CID for which no glyph exists in the descendant
CIDFont, the notdef mappings in the CMap are consulted to obtain a substitute
character selector. These mappings (so called by analogy with the .notdef character
mechanism in simple fonts) are delimited by the operators beginnotdefchar,
endnotdefchar, beginnotdefrange, and endnotdefrange. They always map to a
