Web Protocols and Practice: HTTP/1.1, Networking Protocols, Caching, and Traffic Measurement

Book

  • Sorry, this book is no longer in print.
Not for Sale

Description

  • Copyright 2001
  • Dimensions: 7-5/8" x 9-1/2"
  • Pages: 672
  • Edition: 1st
  • Book
  • ISBN-10: 0-201-71088-9
  • ISBN-13: 978-0-201-71088-5

Just as TCP/IP is a central protocol for the Internet, HTTP is a central protocol for the Web, and both are critical for Web networking. Web Protocols and Practice is the most authoritative and comprehensive guide to the Web's technical underpinnings. Authored by legendary AT&T Labs researcher Bala Krishnamurthy and renowned Web networking expert Jennifer Rexford, this book offers exceptionally thorough coverage of core Web protocols, including the most detailed discussion of HTTP/1.1 and its relationship to TCP/IP networking ever presented. The authors begin with a broad overview of the evolution of the Web, including its naming infrastructure, HTML document language, and HTTP message exchange protocol. Next, they introduce the inner workings of clients, proxies, and servers, as well as scripts, handlers, search engines, cookies, and authentication. The heart of the book is a detailed discussion of the core Web protocols: DNS, TCP/IP, and HTTP. It is an essential resource for all networking and Internet professionals and for all developers building Internet applications.

Sample Content

Table of Contents

I. GETTING STARTED.

1. Introduction.

Origin and Growth of the World Wide Web.

Historical Evolution of the Web.

The State of the Web.

Semantic Components of the Web.

Uniform Resource Identifier (URI).

HyperText Markup Language (HTML).

HyperText Transfer Protocol (HTTP).

Terms and Concepts.

Content on the Web.

Software Components.

Underlying Network.

Standardization.

Web.

Web Applications.

Topics Not Covered.

Tour Through the Book.

II. WEB SOFTWARE COMPONENTS.

2. Web Clients.

Client as a program.

Evolution of the browser.

Web-related browser functions.

Canonical Web transfer example.

Issuing a request from a browser.

Browser caching.

Request message headers.

Response handling.

Browser configuration.

Physical appearance.

Semantic choices.

Configuring browser for non-protocol functions.

Browser security issues.

Cookies.

Motivations for cookies.

Use of cookies in a browser.

User's control over cookies.

Privacy problems with cookies.

Spiders.

Searching on the Web.

Spider client.

Use of spiders in search engines.

Intelligent agents and special-purpose browsers.

Intelligent agents.

Special-purpose browsers.

Summary.

3. Web Proxies.

History and evolution of intermediaries.

High-level classification of proxies.

Caching proxy.

Transparent proxy.

Proxy applications.

Sharing access to the Web.

Caching responses.

Anonymizing clients.

Transforming requests and responses.

Gateway to non-HTTP systems.

Filtering requests and responses.

HTTP-related proxy roles.

Steps in request-response exchange with a proxy.

Handling HTTP requests and responses.

Proxy as Web server.

Proxy as Web client.

Example use of a proxy.

Proxy chaining and hierarchies.

Proxy configuration.

Proxy privacy issues.

Other kinds of proxies.

Reverse proxies or surrogates.

Interception proxies.

Summary.

4. Web Servers.

Web Site Versus Web Server.

Web Site.

Web Server.

Handling a Client Request.

Steps in Handling a Client Request.

Access Control.

Dynamically Generated Responses.

Creating and Using Cookies.

Sharing Information Across Requests.

Sharing HTTP Responses Across Requests.

Sharing Metadata Across Requests.

Server Architecture.

Event-Driven Server Architecture.

Process-Driven Server Architecture.

Hybrid Server Architecture.

Server Hosting.

Multiple Web Sites on a Single Machine.

Multiple Machines for a Single Web Site.

Case Study: Apache Web Server.

Resource Management.

HTTP Request Processing.

Summary.

III. WEB PROTOCOLS.

5. Protocols Underlying HTTP.

Internet Protocol.

Evolution of the Internet Architecture.

IP Design Goals.

IP Addresses.

IP Header Details.

Transmission Control Protocol.

Socket Abstraction.

Ordered Reliable Byte Stream.

Opening and Closing a TCP Connection.

Sliding-Window Flow Control.

Retransmission of Lost Packets.

TCP Congestion Control.

TCP Header Details.

Domain Name System.

DNS Resolver.

DNS Architecture.

DNS Protocol.

DNS Queries and the Web.

DNS-Based Web Server Load Balancing.

Application-Layer Protocols.

Telnet Protocol.

File Transfer Protocol.

Simple Mail Transfer Protocol.

The Network News Transfer Protocol.

Properties of Application-Layer Protocols.

Summary.

6. HTTP Protocol Design and Description.

Overview of HTTP.

Protocol properties.

Protocol influences.

HTTP language elements.

HTTP terms.

HTTP/1.0 Request Methods.

HTTP/1.0 Headers.

HTTP/1.0 Response classes.

HTTP Extensibility.

SSL and Security.

SSL.

HTTPS: Using SSL in Web exchanges.

Security in HTTP/1.0.

Protocol compliance and interoperability.

Version number and interoperability.

MUST, SHOULD, MAY requirement levels.

Summary.

7. HTTP/1.1.

The evolution of HTTP/1.1 protocol.

History of evolution.

Problems with HTTP/1.0.

New concepts in HTTP/1.1.

Methods, headers, response codes in 1.0 and 1.1.

Old and new request methods.

Old and new headers.

Old and new response codes.

Caching.

Caching related terms.

Caching in HTTP/1.0.

Caching in HTTP/1.1.

Bandwidth optimization.

The Range request.

The Expect/Continue mechanism.

Compression.

Connection Management.

The Connection: Keep-Alive mechanism of HTTP/1.0.

Evolution of HTTP/1.1 persistent connection mechanism.

The Connection header.

Pipelining on persistent connections.

Closing persistent connections.

Message transmission.

Extensibility.

Learning about the server.

Learning about intermediate servers.

Upgrading to other protocols.

Internet address conservation.

Content negotiation.

Security, authentication, and integrity.

Security and authentication.

Integrity.

The role of proxies in HTTP/1.1.

Types of proxies.

Syntactic requirements on an HTTP/1.1 proxy.

Semantic requirements on an HTTP/1.1 proxy.

Other miscellaneous changes.

Method-related miscellaneous changes.

Header-related miscellaneous changes.

Response-code-related miscellaneous changes.

Summary.

8. HTTP/TCP Interaction.

TCP Timers.

Retransmission Timer.

Slow-Start Restart.

The TIME_WAIT State.

HTTP/TCP Layering.

Aborted HTTP Transfers.

Nagle's Algorithm.

Delayed Acknowledgments.

Multiplexing TCP Connections.

Motivation for Parallel Connections.

Problems with Parallel Connections.

Server Overheads.

Combining System Calls.

Managing Multiple Connections.

Summary.

IV. MEASURING AND CHARACTERIZING WEB TRAFFIC.

9. Web Traffic Measurement.

Motivation for Web Measurement.

Motivation for Content Creators.

Motivation for Web Hosting Companies.

Motivation for Network Operators.

Motivation for Web/Networking Researchers.

Measurement Techniques.

Server Logging.

Proxy Logging.

Client Logging.

Packet Traces.

Active Measurement.

Proxy/Server Logs.

Common Log Format (CLF).

Extended Common Log Format (ECLF).

Preprocessing Measurement Data for Analysis.

Parsing Measurement Data.

Filtering Measurement Data.

Transforming Measurement Data.

Drawing Inferences From Measurement Data.

Limitations of HTTP Header Information.

Ambiguous Client/Server Identity.

Inferring User Actions.

Detecting Resource Modifications.

Case Studies.

Saskatchewan Server Log Study.

British Columbia Proxy Log Study.

Boston University Client Log Study.

AT&T Packet-Trace Study.

Summary.

10. Web Workload Characterization.

Workload Characterization.

Applications of Workload Models.

Selecting Workload Parameters.

Statistics and Probability Distributions.

Mean, Median, and Variance.

Probability Distributions.

HTTP Message Characteristics.

HTTP Request Methods.

HTTP Response Codes.

Web Resource Characteristics.

Content Types.

Resource Sizes.

Response Sizes.

Resource Popularity.

Resource Changes.

Temporal Locality.

Number of Embedded Resources.

User Behavior Characteristics.

Session and Request Arrivals.

Clicks per Session.

Request Interarrival Time.

Applying Workload Models.

Combining Workload Parameters.

Validating the Workload Model.

Generating Synthetic Traffic.

User Privacy.

Access to User-Level Data.

Information Available to Software Components.

Application of User-Level Data.

Summary.

V. WEB APPLICATIONS.

11. Web Caching.

The origins and goals of Web Caching.

Why cache?

What is cacheable?

Protocol-specific considerations.

Content-specific considerations.

Where is caching done?

How is caching done?

Deciding whether a message is cacheable.

Cache replacement and storing response in cache.

Returning a cached response.

Maintaining a cache.

Cache replacement.

Cache coherency.

Rate of change of resources.

Cache-related protocols.

Internet Cache Protocol (ICP).

Cache Array Resolution Protocol (CARP).

Cache Digest Protocol.

Web Cache Coordination Protocol (WCCP).

Cache software and hardware.

Cache software: The Squid cache.

Caching hardware.

Impediments to caching.

Cache busting.

Privacy issues in caching.

Caching versus replication.

Content distribution.

Content adaptation.

Summary.

12. Delivering Multimedia Streams.

Multimedia Streaming.

Audio and Video Data.

Multimedia Streaming Applications.

Properties of Multimedia Applications.

Delivering Multimedia Content.

Performance Requirements.

Limitations of IP Networks.

Multimedia-on-Demand Over HTTP.

Protocols for Multimedia Streaming.

Data Transport.

Session Establishment.

Session Description.

Presentation Description.

Real Time Streaming Protocol.

Similarities and Differences.

RTSP Request Methods.

RTSP Headers.

RTSP Status Codes.

Summary.

VI. RESEARCH PERSPECTIVES.

13. Research Perspectives in Caching.

Cache revalidation and invalidation.

Costs associated with revalidation.

Prevalidation.

The Piggybacking approach.

Server-driven invalidation.

End-to-end information exchange.

Server volumes.

Proxy filters.

Volumes and Filters: Practical details.

Volume Construction Algorithms.

Evaluation of volume construction algorithms.

End-to-end information exchange summary.

Prefetching.

DNS Prefetching.

Connection Prefetching.

HTTP Prefetching.

Trade-offs in Prefetching.

Summary.

14. Research Perspectives in Measurement.

Packet Monitoring of HTTP Traffic.

Tapping a Link.

Capturing Packets.

Demultiplexing Packets.

Reconstructing the Ordered Stream.

Extracting HTTP Messages.

Generating HTTP Traces.

Analyzing Web Server Logs.

Parsing and Filtering.

Transforming.

Publicly-Available Logs and Traces.

Measuring Multimedia Streams.

Static Analysis of Multimedia Resources.

Multimedia Server Logs.

Packet Monitoring of Multimedia Streams.

Multi-Layer Packet Monitoring.

Summary.

15. Research Perspectives in Protocol Issues.

Multiplexing HTTP Transfers.

WebMux: An experimental multiplexing Protocol.

TCP Control Block Interdependence.

Integrated Congestion Management.

Adding a Differencing Mechanism to HTTP/1.1.

Motivations for a differencing mechanism for HTTP messages.

Evaluation of delta algorithms.

Deployment issues of delta mechanism in HTTP/1.1.

Status of adding delta mechanism to HTTP/1.1.

HTTP/1.1 Protocol compliance.

Motivation for protocol compliance study.

Testing compliance of clients and proxies.

Methodology of testing compliance.

PROCOW: A large scale compliance study.

Summary of protocol compliance.

End-to-end measurements to study Web performance.

Identifying the factors in end-to-end performance.

Report on an end-to-end performance study.

Summary of end-to-end performance study.

Other extensions to HTTP.

Transparent Content Negotiation.

WebDAV: Web Distributed Authoring and Versioning.

An HTTP Extension Framework.

Summary.

Bibliography.
Index.

Preface

Introduction

This book describes the technical underpinnings of the World Wide Web. We discuss the technology for transferring, caching, and measuring the messages that carry the content between Web sites and end users. The messages are exchanged between clients, proxies, and servers, the three main software components of the Web. The format and transfer of these messages are dictated by communication protocols codified in standards documents over a period of years. Evaluating and improving Web performance relies on having effective techniques for collecting and analyzing measurements of the message traffic. By moving Web content closer to the end users, caching reduces user-perceived latency, as well as load on the Web servers and the underlying network. Web traffic is moving from delivery of text and image content to include audio and video streaming. Multimedia streaming has its own suite of communication protocols. These topics, constituting the technical core of the Web, are discussed in detail in this book.

This book provides a comprehensive treatment of the systems and protocols responsible for the transfer of Web content. The audience for this book includes Web technologists, Web site administrators, developers who rely on the Web infrastructure, students in networking and the Web, and the Web research community. The book focuses on the mature and stable aspects of the Web. In contrast to the rapidly changing techniques for creating and displaying Web content, the standardized communication protocols discussed in the book change relatively slowly. A variety of examples, state-of-the-art reports, and case studies are used to illustrate the operation of the Web and the interplay among the various components. The book includes detailed examples of the HTTP protocol, a state-of-the-art overview of Web caching and multimedia streaming, and case studies of the Apache Web server, the Squid proxy, and traffic measurement techniques. The book is a valuable resource for understanding the technology and current practices of the Web.

Organization of the Book

The first section of the book consists of an opening chapter that provides a broad overview of the evolution of the World Wide Web and discusses the Web's naming infrastructure, document language, and message exchange protocol. The remainder of the book is divided into five sections consisting of 14 chapters:

  • Software components: These three chapters present the inner workings of clients, proxies, and servers, including a discussion of related topics such as scripts, handlers, search engines, cookies, and authentication.
  • Web protocols: The core of the book, these four chapters present the networking protocols underlying the Web (Internet Protocol, Transmission Control Protocol, and the Domain Name System), the design of HTTP/1.0, a comprehensive overview of HTTP/1.1, and the interaction between HTTP and TCP.
  • Traffic measurement and workload characterization: These two chapters describe the various techniques for measuring and analyzing Web traffic and give an overview of the key parameters of Web workload models used in evaluating Web performance.
  • Web caching and multimedia streaming: These two chapters provide a state-of-the-art overview of key Web applications. Web caching involves moving content closer to the user to reduce user-perceived latency and the load on the server and the network. Multimedia streaming involves overlapping the transfer of audio and video data with the playback at the receiver.
  • Research perspectives: These three chapters present research perspectives on caching, measurement, and protocols to provide a glimpse of the evolving technology in these areas and reinforce the material presented in the earlier parts of the book.

Intended Audience

The book is self-contained and does not assume any prior knowledge of Web or networking technology. An extensive bibliography points readers to additional information on specific topics. The book has several audience segments, including:

  • Students: Undergraduate students in advanced courses and graduate students can use the book as an introduction to the protocol, network, and measurement aspects of the Web. The book is self-contained and does not assume the student is familiar with network protocols. We do assume a basic familiarity with computer science concepts. The book includes case studies and research perspectives to guide students in applying the ideas they have learned. The book's focus on core concepts and protocol evolution ensures that the student acquires knowledge that has broad applications beyond any particular realization of Web technology.
  • Web technologists: The book provides developers with an in-depth treatment of the various protocols and software components in the Web. A developer can learn about HTTP and the related networking protocols, such as IP, TCP, and DNS, and their relationship to Web clients, proxies, and servers. In addition, the book includes an extensive treatment of Web traffic measurement and workload characterization that can aid developers in evaluating and improving the performance of their software in realistic settings.
  • Web and networking researchers: Academic and industrial researchers can use the book as a primary source of information about the technical underpinnings of the Web and its relationship to the Internet. The core portions of the book highlight the mature technologies underlying the Web, to provide the necessary context for research work in this area. The advanced material on research perspectives provides a timely view of ongoing work that may influence the evolution of the Web, and the extensive bibliography points the reader to research publications and standards documents with additional details.
  • Web administrators: Administrators of Web proxies and servers can develop a deeper understanding of the operation of these software components. The book can serve as a reference for key concepts and protocol features. The emphasis on performance issues can aid administrators in tuning the configuration of a proxy or server, complementing other texts that present detailed guidelines of how to configure a particular hardware or software platform. The material on Web measurement and the interaction between HTTP and networking protocols can help administrators in diagnosing performance problems.

The book can be used as a reference, a self-study guide, or part of a one- or two-semester course on Web technology or networking. Readers may follow a variety of paths through the book, depending on their backgrounds and interests. Some readers may skip the elementary chapters, whereas other readers may skip the research perspectives material.

