SECTION F.1
Background and Assumptions
After a transaction has completed, obtaining more data requires a new
request-response transaction. The connection between client and server does not
ordinarily persist beyond the end of a transaction, although some implementations
may attempt to cache the open connection to expedite subsequent transactions
with the same server.
Round-trip delay can be significant. A request-response transaction can take
up to several seconds, independent of the amount of data requested.
The data rate may be limited. A typical bottleneck is a slow modem link
between the client and the Internet service provider.
These properties are generally shared by other wide-area network architectures
besides the Web. Also, CD-ROMs share some of these properties, since they have
relatively slow seek times and limited data rates compared to magnetic media.
The remainder of this appendix focuses on the Web.
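The effect of round-trip delay and limited data rate can be illustrated with a
rough cost model. The following is a minimal sketch, not a measurement: the
function name transfer_time and all of the numbers are assumptions chosen for
illustration, and the model assumes that transactions run one after another and
ignores connection setup.

```python
def transfer_time(round_trips: int, rtt_s: float,
                  total_bytes: int, rate_bytes_per_s: float) -> float:
    """Estimate elapsed seconds as latency cost plus bandwidth cost."""
    return round_trips * rtt_s + total_bytes / rate_bytes_per_s


# Hypothetical link: 2-second round trip, 3,500 bytes per second.
RTT_S = 2.0
RATE = 3500.0

# One transaction that retrieves an entire 700,000-byte file.
whole_file = transfer_time(1, RTT_S, 700_000, RATE)

# Three transactions that together retrieve only 35,000 bytes.
one_page = transfer_time(3, RTT_S, 35_000, RATE)

print(whole_file, one_page)
```

Under these assumed figures, the extra round trips cost a few seconds, whereas
the amount of data transferred dominates the total. With a smaller file or a
faster link the balance shifts toward the round trips.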
Some additional properties of the HTTP protocol are relevant to the problem of
accessing PDF files efficiently. These properties may not all be shared by other
protocols or network environments.
When a PDF file is initially accessed (such as by following a URL hyperlink
from some other document), the file type is not known to the client. Therefore,
the client initiates a transaction to retrieve the entire document and then
inspects the MIME type of the response as it arrives. Only at that point is the
document known to be PDF. Additionally, with a properly configured server
environment, the length of the document becomes known at that time.
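The inspection described above can be sketched as follows. This is an
illustration under stated assumptions, not a prescribed implementation: the
function name inspect_response_headers is hypothetical, and the sketch assumes
that the response headers have already been parsed into a mapping of names to
values. Content-Type and Content-Length are the standard HTTP headers that carry
the MIME type and the length.

```python
from typing import Mapping, Optional, Tuple


def inspect_response_headers(
    headers: Mapping[str, str],
) -> Tuple[bool, Optional[int]]:
    """Return (is_pdf, length) from a mapping of HTTP response headers.

    length is None when the server did not supply a usable Content-Length.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Drop any parameters such as "; charset=..." before comparing.
    media_type = lowered.get("content-type", "").split(";", 1)[0]
    is_pdf = media_type.strip().lower() == "application/pdf"

    raw_length = lowered.get("content-length", "").strip()
    length = int(raw_length) if raw_length.isdigit() else None
    return is_pdf, length


print(inspect_response_headers(
    {"Content-Type": "application/pdf", "Content-Length": "12345"}
))
```

A client would call such a function as soon as the headers arrive, before
deciding what to do with the rest of the response.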
The client can abort a response while the transaction is still in progress if it
decides that the remainder of the data is not of immediate interest. In HTTP,
aborting the transaction requires closing the connection, which interferes with
the strategy of caching the open connection between transactions.
The client can request retrieval of portions of a document by specifying one or
more byte ranges (by offset and count) in the HTTP request headers. Each
range can be relative to either the beginning or the end of the file. The client
can specify as many ranges as it wants in the request, and the response consists
of multiple blocks, each properly tagged.
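A byte-range request can be sketched as follows. The HTTP Range header writes
each range as a first and last byte position (both inclusive), or as a count of
trailing bytes for a range relative to the end of the file, so an offset and
count must be converted. The function name range_header and its parameters are
assumptions made for this illustration.

```python
from typing import Iterable, Tuple


def range_header(from_start: Iterable[Tuple[int, int]] = (),
                 from_end: Iterable[int] = ()) -> str:
    """Build the value of an HTTP Range header.

    from_start: (offset, count) pairs measured from the beginning of the file.
    from_end:   byte counts measured back from the end of the file.
    """
    parts = []
    for offset, count in from_start:
        if offset < 0 or count <= 0:
            raise ValueError("offset must be >= 0 and count must be > 0")
        # HTTP ranges name the first and last byte, both inclusive.
        parts.append(f"{offset}-{offset + count - 1}")
    for count in from_end:
        if count <= 0:
            raise ValueError("count must be > 0")
        # A suffix range: the final `count` bytes of the file.
        parts.append(f"-{count}")
    if not parts:
        raise ValueError("at least one range is required")
    return "bytes=" + ",".join(parts)


# First 1024 bytes of the file plus its final 512 bytes, in one request.
print(range_header(from_start=[(0, 1024)], from_end=[512]))
# prints "bytes=0-1023,-512"
```

When more than one range is requested, a server that honors the request
typically answers with a multipart/byteranges response in which each part
carries its own Content-Range header. A server may also ignore the Range header
and return the entire file, so the client must check the response status.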
The client can initiate multiple concurrent transactions in an attempt to
obtain multiple responses in parallel. This is commonly done, for instance, to
retrieve inline images referenced from an HTML document. This strategy is not