
Understanding Application Layer Protocols


In this chapter, we'll move further up the OSI Seven Layer Model and take an in-depth look at the workings of some of the Application layer protocols that are most commonly used in content switching. These include TCP-based services such as HTTP, UDP services like DNS, and applications that use a combination of TCP and UDP, such as the Real Time Streaming Protocol (RTSP). Finally, we'll look at how these types of applications can be secured using Secure Sockets Layer (SSL).

HyperText Transfer Protocol (HTTP)

The HyperText Transfer Protocol, or HTTP, is arguably the most widely used Application layer protocol in the world today. It forms the basis of what most people understand the Internet to be: the World Wide Web. Its purpose is to provide a lightweight protocol for the retrieval of HyperText Markup Language (HTML) and other documents from Web sites throughout the Internet. Each time you open a Web browser to surf the Internet, you are using HTTP over TCP/IP.

HTTP first appeared in the early 1990s and has been through three main iterations:

  • HTTP/0.9: A simplistic first implementation of the protocol that only supported the option to get a Web page.

  • HTTP/1.0: Published by the IETF as RFC 1945 in 1996. This version added many supplemental data fields, known as headers, to the specification. These allow other information to pass between the client and server alongside the request and the resulting page.

  • HTTP/1.1: Defined by the IETF in RFC 2068, version 1.1 made a number of improvements over the 1.0 specification. Chief among them were persistent TCP connections, pipelining, and cache control, all intended to improve the performance of HTTP-based applications.

Most browsers these days support both the 1.0 and 1.1 implementations, with new browsers using 1.1 by default but able to fall back to earlier versions if required. The RFCs are careful to point out that all implementations of HTTP should be backward compatible. That is to say, a browser implementing the HTTP/1.1 specification should be capable of receiving a 1.0 response from a server. Conversely, a 1.1 implementation on the server side should be capable of responding to requests from a 1.0 browser.

It is well outside the bounds of this book to cover the HTTP protocols in huge detail, so let's concentrate on those elements most relevant to content switching.

Basic HTTP Page Retrieval

Let's start at the beginning and see how a basic browser retrieves a Web page from a Web server. The first important point to note is that a Web page is typically made up of many dozens of objects, ranging from the HTML base through to the images that are present on the page. The HTML can be thought of as the template for the page overall, instructing the browser on the layout of the text, font sizes and colors, background color of the page, and which other images need to be retrieved to make up the page.

The process takes place in the following order:

  1. The client sends a request for the required page to the Web server.

  2. The server analyzes the request and sends back an acknowledgment to the client along with the HTML code required to make the page.

  3. The client begins interpreting the HTML and building the page.

  4. The client, in subsequent requests, retrieves any embedded objects, such as images or other multimedia sources.

Once all elements of the page have been retrieved, the client browser will display the completed Web page. The order and timing of the process described previously depend largely on which implementation of HTTP is used (1.0 or 1.1), although all browsers follow this pattern of request and response.
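The retrieval steps above can be sketched in Python. This is only an illustration under stated assumptions, not how any particular browser works: the class name EmbeddedObjectFinder, the function requests_for_page, and the sample HTML are all invented for the example, and only img tags are treated as embedded objects.

```python
from html.parser import HTMLParser


class EmbeddedObjectFinder(HTMLParser):
    """Collect the URLs of embedded images in an HTML page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.objects = []

    def handle_starttag(self, tag, attrs):
        # Step 3: while interpreting the HTML, note each embedded object.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.objects.append(value)


def requests_for_page(page_path, html):
    """Return the GET request lines a client would issue, base page first."""
    finder = EmbeddedObjectFinder()
    finder.feed(html)
    # Step 1 requests the base page; step 4 requests each embedded object.
    return [f"GET {path} HTTP/1.0" for path in [page_path] + finder.objects]


# Sample page invented for this example.
SAMPLE_HTML = (
    '<html><body><img src="/images/logo.gif">'
    '<p>Welcome</p><img src="/images/banner.jpg"></body></html>'
)
```

Calling requests_for_page("/index.html", SAMPLE_HTML) yields one request line for the HTML base followed by one for each image, which mirrors the way a single page turns into several object retrievals.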

HTTP Methods

HTTP offers not only a mechanism for the client to receive data from the server, but also other communication types, such as the passing of data from the client to the server. These mechanisms are known within the HTTP specifications as methods. Table 3-1 shows the supported method types in HTTP/1.0 and 1.1.

Table 3-1. The HTTP Method Headers in HTTP/1.0 and HTTP/1.1

METHOD    HTTP/1.0   HTTP/1.1   DESCRIPTION
GET       Yes        Yes        Retrieve the information specified.
HEAD      Yes        Yes        Identical to the GET request, but the server must not return any page content other than the HTTP headers.
POST      Yes        Yes        Allows the client to submit information to the server, used for submitting information from a form, etc.
PUT       No         Yes        Allows the client to place an item on the server in the location specified.
DELETE    No         Yes        Allows the client to delete the item specified in the request.
TRACE     No         Yes        Allows the client to see the request it made to the server. This acts as a loopback in effect.
OPTIONS   No         Yes        Allows the client to determine the communications options available on the server.

In terms of general Web browsing, the GET and POST methods are by far the most commonly used. For a browser to build a standard Web page, the GET method is used to retrieve each object individually, whereas for transactional Web sites implementing shopping cart style applications, the POST method will also be used.
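Because the method is the first token of every request, a device inspecting HTTP traffic can read it directly from the request line. The following is a minimal sketch; the function name parse_request_line is an assumption made for this example, and the set of methods is simply the list from Table 3-1.

```python
# Methods listed in Table 3-1.
KNOWN_METHODS = {"GET", "HEAD", "POST", "PUT", "DELETE", "TRACE", "OPTIONS"}


def parse_request_line(line):
    """Split an HTTP request line into (method, target, version)."""
    method, target, version = line.strip().split(" ")
    if method not in KNOWN_METHODS:
        raise ValueError(f"unknown method: {method}")
    return method, target, version
```

For example, parse_request_line("POST /cart/add HTTP/1.1\r\n") returns the tuple ("POST", "/cart/add", "HTTP/1.1"), which is enough to tell a page retrieval from a form submission.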

The HTTP URL

The URL is the most important piece of information that the client browser includes in any GET request. The URL is defined as being a combination of the host where the site is located, the scheme used to retrieve the page, and the full path and filename. Optionally, the URL may include information such as the TCP port number to be used or a unique reference point within a larger page. Figure 3-1 shows the breakdown of an example URL.

Figure 3-1. An example URL and its components.
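The same breakdown can be reproduced with Python's standard urllib.parse module. The URL below is a hypothetical one, chosen so that it includes the optional port number and reference point described above.

```python
from urllib.parse import urlsplit

# Hypothetical URL, chosen to include the optional port and reference point.
parts = urlsplit("http://www.foocorp.com:8080/directory/page.html#section2")

scheme = parts.scheme      # scheme used to retrieve the page
host = parts.hostname      # host where the site is located
port = parts.port          # optional TCP port number
path = parts.path          # full path and filename
fragment = parts.fragment  # optional reference point within the page
```

Note that port is None when the URL does not specify one, in which case a client falls back to the default port for the scheme.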


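The term URI is also commonly used when referencing the location of documents within HTTP. Formally, URI is the broader term: every URL is a URI that also describes how to locate the resource it identifies. Within HTTP itself, however, the URI carried in a request is often just the path portion of the URL, without the scheme or host.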

Persistent Connections in HTTP

One of the other major differences in operation between HTTP/1.0 and HTTP/1.1 is the handling of TCP connections required to retrieve a full Web page. Given that a client will typically have to retrieve multiple objects to make up a single Web page, it is often inefficient to open and close TCP sessions repeatedly when retrieving objects from the same server. To improve the overall performance of HTTP in this instance, the protocol defines the Connection: header that communicates to the server whether the TCP session should be closed or remain open once the object has been retrieved. The Connection: header has two options:

  • Connection: close: The default for HTTP/1.0

  • Connection: Keep-Alive: The default for HTTP/1.1

The close option indicates that the server should close the TCP connection once the request has been fulfilled. The Keep-Alive option indicates that the server should keep the TCP connection open after the request has been fulfilled. Along with an obvious performance increase from removing the need to open and close TCP connections, the Keep-Alive option also allows the implementation of pipelining. Pipelining allows a client to send multiple HTTP GET requests over the same TCP connection without needing to wait for individual responses after each. Figure 3-2 shows the difference in these connection types.

Figure 3-2. The difference in TCP handling between HTTP/1.0 and HTTP/1.1.
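The decision a server or intermediary device makes here can be sketched as a small function. This is an illustration only: the function name connection_persists is an assumption, and the sketch assumes the on-the-wire header values close and Keep-Alive, compared without regard to case.

```python
def connection_persists(version, connection_header=None):
    """Return True if the TCP connection should stay open after the response."""
    token = (connection_header or "").strip().lower()
    if token == "close":
        return False
    if token == "keep-alive":
        return True
    # No explicit header: fall back to the default for the protocol version.
    return version == "HTTP/1.1"
```

An HTTP/1.0 client that sends Connection: Keep-Alive therefore gets a persistent connection, while an HTTP/1.1 client can still ask for the connection to be closed.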


The final piece in the puzzle of interaction between client and server is in opening multiple TCP connections. We've already seen that a client can open a persistent TCP connection to the server and pipeline HTTP requests. To further improve performance of the HTTP operation, many browsers will open several simultaneous connections. Figure 3-3 gives examples of pipelining and multiple connections.

Figure 3-3. Implementing pipelining and multiple connections as performance mechanisms.
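To make pipelining concrete, the sketch below builds several GET requests into a single buffer that a client could write to one persistent connection without waiting for each response. The function name build_pipeline and the paths used are assumptions made for this example; no network traffic is sent.

```python
def build_pipeline(host, paths):
    """Concatenate several GET requests so they can be sent back to back."""
    requests = []
    for path in paths:
        requests.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "\r\n"
        )
    # The whole buffer can be written to one persistent connection;
    # the server is expected to answer in the same order.
    return "".join(requests).encode("ascii")
```

A browser using multiple connections would simply divide the list of paths between several such buffers, one per TCP connection.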


Other HTTP Headers

HTTP includes definitions for dozens of headers that can be included in client-to-server requests and server-to-client responses. We will not attempt to list and describe them all here; for a full description, the RFCs for HTTP/1.0 and HTTP/1.1 are a better source. The RFCs define a series of standard headers, which can be complemented by user-defined headers added on either the client or server side.

Because headers are carried as readable ASCII text in every HTTP request and response, they can prove very useful in the implementation of content switching. Let's look at some of the HTTP headers most commonly used in content switching.

The "Accept:" Header

The client browser uses the "Accept:" header to indicate to the server which content and media types can be accepted. Examples of the "Accept:" header include:

Accept: */*                                  Accept anything
Accept: text/plain, text/html                Accept plain text and HTML
Accept: text/html, image/jpeg, image/bmp     Accept HTML, JPEG images, and bitmap images

The "Accept:" header is useful in the context of content switching to be able to determine the capabilities of a particular client. If the client browser cannot accept images, for example, the request can be directed to a server optimized to deliver text-only versions of the Web pages.
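A content switch making this decision needs only a simple parse of the header. The sketch below assumes comma-separated media types, as in the HTTP specifications, and ignores any parameters that follow a semicolon. The function names and the two pool names are hypothetical labels invented for the example.

```python
def accepted_types(accept_header):
    """Return the media types in an Accept header, ignoring any parameters."""
    types = []
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type:
            types.append(media_type)
    return types


def accepts_images(accept_header):
    """True if any listed type could match an image."""
    return any(
        t == "*/*" or t.startswith("image/")
        for t in accepted_types(accept_header)
    )


def choose_pool(accept_header):
    # Pool names are hypothetical labels for two groups of servers.
    return "full-content" if accepts_images(accept_header) else "text-only"
```

With this in place, a request carrying Accept: text/plain, text/html would be sent to the text-only servers, while one carrying Accept: */* would not.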

The "Host:" Header

One of the main problems in the original HTTP/1.0 specification was that a user's request as typed into the browser (e.g., http://www.foocorp.com/index.html) would not contain the host (www.foocorp.com) element in the GET request sent to the server. This represents a problem if virtual hosting is used within a Web server farm, where the server is potentially hosting multiple Web sites and needs to use this host information to determine which path and page the user is requesting.

Within the HTTP/1.1 specification, and subsequently in many new HTTP/1.0 browsers, support was added for the "Host:" header. This allows the user's requested URL, typed into the browser, to be converted into a GET request containing the full path and filename along with the host from which the content is being fetched. The following is an example of translating a full URL into its component parts.

URL: http://www.foocorp.com/directory/somewhere/page.html

GET /directory/somewhere/page.html HTTP/1.0\r\n
Host: www.foocorp.com\r\n

The "Host:" header has many uses within content switching, examples of which are shown in Chapter 6, Content-Aware Server Load Balancing.
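As a taste of how this header can be used, the following sketch extracts the "Host:" header from a raw request and maps it to a group of servers. It is a simplified illustration: the function names, the VIRTUAL_HOSTS table, and the group names are assumptions made for the example, not the configuration of any real device.

```python
def parse_headers(raw_request):
    """Return (request_line, headers) from a raw HTTP request string."""
    head = raw_request.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers


# Hypothetical mapping from virtual host to a group of real servers.
VIRTUAL_HOSTS = {
    "www.foocorp.com": "foocorp-servers",
    "intranet.foocorp.com": "intranet-servers",
}


def select_server_group(raw_request, default="default-servers"):
    """Pick a server group from the Host header, ignoring any port suffix."""
    _, headers = parse_headers(raw_request)
    host = headers.get("host", "").split(":")[0].lower()
    return VIRTUAL_HOSTS.get(host, default)
```

A request with no "Host:" header, as an original HTTP/1.0 browser would send, falls through to the default group, which is exactly the virtual hosting problem described above.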

The "User-Agent:" Header

The "User-Agent:" header indicates to the server the type of browser being used by the client. In the context of content switching, it can be used to direct the request to a resource offering content optimized for that browser. The following is an example of the "User-Agent:" header.

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
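A simple way to act on this header is an ordered list of substring rules. In the sketch below, the rules, the pool names, and the function name are all hypothetical; the order matters because a User-Agent string like the one above contains both "Mozilla" and "MSIE".

```python
# Ordered (substring, pool) rules; both the substrings and the pool names
# are hypothetical examples, not a complete browser catalogue.
RULES = [
    ("MSIE", "ie-optimized"),
    ("Mozilla", "generic-browser"),
]


def pool_for_user_agent(user_agent, default="default-pool"):
    """Return the pool of the first rule whose substring appears in the header."""
    lowered = user_agent.lower()
    for needle, pool in RULES:
        if needle.lower() in lowered:
            return pool
    return default
```

Placing the more specific rule first ensures that the example header is matched as MSIE rather than as a generic Mozilla-compatible browser.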

Cookies—The HTTP State Management Mechanism

As we'll see in later chapters, one of the biggest challenges in HTTP environments, whether content switched or not, is maintaining some form of client-side state that enables Web servers and intermediary devices to recognize the client session and understand the current status of the user session. This issue was tackled in RFC 2109, which defined the Set-Cookie and Cookie HTTP headers, used to set and return cookies, respectively. In HTTP, a cookie takes the form of a small piece of text information that is implanted into the user's browser either permanently or temporarily. The term cookie is commonly used in computing to describe an opaque piece of information held during a session and, unfortunately, seems to have no more interesting origin than that. Once the backend server has implanted the cookie into the user's browser, the information can be used for a number of different applications, ranging from content personalization and user session persistence for online shopping to the collection of demographic and statistical information on Web site usage.

The server can post a cookie to the client at any time during an HTTP session by including a Set-Cookie header in any HTTP response. The Set-Cookie header has the following syntax:

Set-Cookie: <name>=<value>; expires=<date>; path=<path>; domain=<domain>; secure

The name and value fields are the only ones that are mandatory when issuing a cookie. As the names suggest, these define the name of the cookie and its value, such as UserID=Phil. The expires field identifies, down to the second, the date and time at which a cookie will expire and be deleted from the client computer. The domain and path fields indicate the domain, such as www.foocorp.com, and the URL path, such as /home/brochures/, for which the cookie should be used. Both can effectively be wild-carded: specifying foocorp.com as the domain, for example, matches both www.foocorp.com and intranet.foocorp.com. Finally, the secure field indicates to the client that the cookie should only be used when a secure connection (SSL secured HTTP or HTTPS) is used between the client and server. Figure 3-4 shows the interaction between a client and server as two different cookies are inserted and used.

Figure 3-4. The interaction between a client and a server when two different cookies are implanted and used.
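The way the domain and path fields limit where a cookie is used can be sketched as follows. This is a deliberately simplified check, not the full matching rules of the cookie specifications, and the function name cookie_applies is an assumption made for the example.

```python
def cookie_applies(cookie_domain, cookie_path, request_host, request_path):
    """Simplified check of whether a cookie should be sent with a request."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    # A domain of foocorp.com matches the host itself and any subdomain.
    domain_ok = host == domain or host.endswith("." + domain)
    # The cookie path must be a prefix of the requested path.
    path_ok = request_path.startswith(cookie_path)
    return domain_ok and path_ok
```

Under these rules a cookie set for foocorp.com with a path of /docs would accompany a request to intranet.foocorp.com for /docs/a.html, but not a request for /home/brochures/ on the same host.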


The following code shows the HTTP responses from the server in more detail. Note that the second cookie includes the Path field, which limits the use of the cookie to requests for URLs whose path begins with /docs.

Hypertext Transfer Protocol
    HTTP/1.1 200 OK\r\n
    Set-Cookie: UserID=Phil\r\n
    Connection: Keep-Alive\r\n
    Content-Type: text/html\r\n
    \r\n

Hypertext Transfer Protocol
    HTTP/1.1 200 OK\r\n
    Set-Cookie: UserType=Gold; Path=/docs\r\n
    Connection: Keep-Alive\r\n
    Content-Type: text/html\r\n
    \r\n

The mechanism that governs whether a cookie is permanent (i.e., stored on the hard disk of the user's machine) or temporary (i.e., removed once the user closes the browser application) is the Expires field in the Set-Cookie header. If the server does not issue an Expires directive when implanting the cookie, it is considered temporary, whereas if the Expires directive is used, then the cookie will be stored on the client machine until the expiry date has passed.
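Python's standard http.cookies module can parse a Set-Cookie value, which makes the temporary versus permanent distinction easy to demonstrate. In this sketch the function name describe_cookie and the shape of the returned dictionary are assumptions made for the example, and the expiry date shown in the usage note is arbitrary.

```python
from http.cookies import SimpleCookie


def describe_cookie(set_cookie_value):
    """Parse the value of a Set-Cookie header into a small dictionary."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    # Each Set-Cookie header carries one cookie, so take the first entry.
    name, morsel = next(iter(jar.items()))
    return {
        "name": name,
        "value": morsel.value,
        "path": morsel["path"],
        # No Expires directive means the cookie is temporary.
        "persistent": bool(morsel["expires"]),
    }
```

Passing "UserType=Gold; Path=/docs" produces a cookie marked as temporary, whereas a value such as "UserID=Phil; expires=Wed, 01-Jan-2031 00:00:00 GMT; path=/" is marked as persistent.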

Cookies are among the most useful additions made to the HTTP specifications and, as we'll see in later chapters, can be used in conjunction with content switching to enable a whole host of new experience-enhancing services.

HTTP—Further Reading

It is outside the scope of this book to cover HTTP in its entirety; the RFC for HTTP/1.1 alone is over 160 pages. For more in-depth detail on the protocol, it's worth looking at the following RFCs:

  • RFC 1945 Hypertext Transfer Protocol -- HTTP/1.0

  • RFC 2068 Hypertext Transfer Protocol -- HTTP/1.1

  • RFC 2109 HTTP State Management Mechanism
