
Understanding Application Layer Protocols

Date: Mar 5, 2004

Sample Chapter is provided courtesy of Prentice Hall Professional.


In this chapter, we'll move further up the OSI Seven Layer Model and take an in-depth look at the workings of some of the Application layer protocols that are most commonly used in content switching. These include TCP-based services such as HTTP, UDP services like DNS, and applications that use a combination of TCP and UDP, such as the Real Time Streaming Protocol (RTSP). Finally, we'll look at how these types of applications can be secured using Secure Sockets Layer (SSL).

HyperText Transfer Protocol (HTTP)

The HyperText Transfer Protocol, or HTTP, must be the most widely used Application layer protocol in the world today. It forms the basis of what most people understand the Internet to be—the World Wide Web. Its purpose is to provide a lightweight protocol for the retrieval of HyperText Markup Language (HTML) and other documents from Web sites throughout the Internet. Each time you open a Web browser to surf the Internet, you are using HTTP over TCP/IP.

HTTP was first ratified in the early 1990s and has been through three main iterations:

  1. HTTP/0.9, the original implementation, supporting only simple GET requests.

  2. HTTP/1.0, defined in RFC 1945.

  3. HTTP/1.1, the current version, defined in RFC 2616.

Most browsers these days offer support for both the 1.0 and 1.1 implementations, with new browsers using 1.1 as a default but supporting the ability to fall back to earlier versions if required. One thing the RFC definitions are careful to point out is that all implementations of the HTTP protocol should be backward compatible. That is to say, a browser implementing the HTTP/1.1 specification should be capable of receiving a 1.0 response from a server. Conversely, a 1.1 implementation on the server side should also be capable of responding to requests from a 1.0 browser.

It is well outside the bounds of this book to cover the HTTP protocols in huge detail, so let's concentrate on those elements most relevant to content switching.

Basic HTTP Page Retrieval

Let's start at the beginning and see how a basic browser retrieves a Web page from a Web server. The first important point to note is that a Web page is typically made up of many dozens of objects, ranging from the HTML base through to the images that are present on the page. The HTML can be thought of as the template for the page overall, instructing the browser on the layout of the text, font sizes and colors, background color of the page, and which other images need to be retrieved to make up the page.

Think of the process, taking place in the following order:

  1. Client sends a request for the required page to the Web server.

  2. The server analyzes the request and sends back an acknowledgment to the client along with the HTML code required to make the page.

  3. The client will begin interpreting the HTML and building the page.

  4. The client, in subsequent requests, will retrieve any embedded objects, such as images or other multimedia sources.

Once all elements of the page have been retrieved, the client browser will display the completed Web page. The order and timing of the process described previously depends largely on which implementation of HTTP is used—1.0 or 1.1—although all browsers work in this way of request and response.
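The first step of this exchange, composing the GET request for the base HTML page, can be sketched in a few lines of Python; the host and page names here are purely illustrative.

```python
# A minimal sketch of the request a browser sends in step 1.
# Host and path are illustrative, not taken from a real site.

def build_get_request(host: str, path: str) -> bytes:
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",           # mandatory in HTTP/1.1
        "Connection: Keep-Alive",  # ask the server to keep the TCP session open
        "",
        "",                        # blank line terminates the header block
    ]
    return "\r\n".join(lines).encode("ascii")

request = build_get_request("www.foocorp.com", "/index.html")
print(request)
```

Sending these bytes over a TCP connection to port 80 and reading the response back would complete steps 1 and 2 of the sequence above.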

HTTP Methods

HTTP does not only offer a mechanism for the client to receive data from the server, but also other communication types such as the passing of data from the client to the server. Such mechanisms are known within the HTTP specifications as a method. Table 3-1 shows the supported method types in HTTP/1.0 and 1.1.

Table 3-1. The HTTP Method Headers in HTTP/1.0 and HTTP/1.1

METHOD   DESCRIPTION                                             HTTP/1.0  HTTP/1.1

GET      Retrieve the information specified.                     Yes       Yes

HEAD     Identical to the GET request, but the server must       Yes       Yes
         not return any page content other than the HTTP
         headers.

POST     Allows the client to submit information to the          Yes       Yes
         server, used for submitting information from a
         form, etc.

PUT      Allows the client to place an item on the server                  Yes
         in the location specified.

DELETE   Allows the client to delete the item specified in                 Yes
         the request.

TRACE    Allows the client to see the request it made to                   Yes
         the server. This acts as a loopback in effect.

OPTIONS  Allows the client to determine the communications                 Yes
         options available on the server.

In terms of general Web browsing, the GET and POST methods are by far the most commonly used. For a browser to build a standard Web page, the GET method is used to retrieve each object individually, whereas for transactional Web sites implementing shopping cart style applications, the POST method will also be used.

The HTTP URL

The URL is the most important piece of information that the client browser includes in any GET request. The URL is defined as being a combination of the host where the site is located, the scheme used to retrieve the page, and the full path and filename. Optionally, the URL may include information such as the TCP port number to be used or a unique reference point within a larger page. Figure 3-1 shows the breakdown of an example URL.

Figure 3-1. An example URL and its components.


The term URI is also commonly used when referencing the location of documents within HTTP. Formally, a URL is one type of URI; in content switching, however, the term URI is commonly used to refer to what is left of the URL once the scheme and host have been removed, in other words, the path and filename of the object being requested.
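The URL components described above can be picked apart with Python's standard urllib.parse module; the URL below extends the earlier foocorp.com example with an illustrative port number and reference point.

```python
from urllib.parse import urlsplit

# Decompose an example URL into the components described above.
url = "http://www.foocorp.com:8080/directory/somewhere/page.html#section2"
parts = urlsplit(url)

print(parts.scheme)    # the scheme: http
print(parts.hostname)  # the host: www.foocorp.com
print(parts.port)      # the optional TCP port: 8080
print(parts.path)      # the full path and filename
print(parts.fragment)  # the reference point within the page
```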

Persistent Connections in HTTP

One of the other major differences in operation between HTTP/1.0 and HTTP/1.1 is the handling of TCP connections required to retrieve a full Web page. Given that a client will typically have to retrieve multiple objects to make up a single Web page, it is often inefficient to open and close TCP sessions repeatedly when retrieving objects from the same server. To improve the overall performance of HTTP in this instance, the protocol defines the Connection: header that communicates to the server whether the TCP session should be closed or remain open once the object has been retrieved. The Connection: header has two options:

Connection: Close

Connection: Keep-Alive

The Close option indicates that the server should close the TCP connection once the request has been fulfilled. The Keep-Alive option indicates that the server should keep the TCP connection open after the request has been fulfilled. Along with the obvious performance increase gained by removing the need to repeatedly open and close TCP connections, the Keep-Alive option also allows the implementation of pipelining. Pipelining allows a client to send multiple HTTP GET requests over the same TCP connection without waiting for the individual responses after each. Figure 3-2 shows the difference in these connection types.

Figure 3-2. The difference in TCP handling between HTTP/1.0 and HTTP/1.1.


The final piece in the puzzle of interaction between client and server is in opening multiple TCP connections. We've already seen that a client can open a persistent TCP connection to the server and pipeline HTTP requests. To further improve performance of the HTTP operation, many browsers will open several simultaneous connections. Figure 3-3 gives examples of pipelining and multiple connections.

Figure 3-3. Implementing pipelining and multiple connections as performance mechanisms.
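The pipelining idea is easy to see in code: the client simply concatenates several GET requests and writes them over a single persistent TCP connection before reading any responses back. A sketch in Python, with invented object names:

```python
def build_pipelined_requests(host: str, paths) -> bytes:
    """Concatenate one GET per object so that all of them can be
    written to a single TCP connection in one go, before any
    response has been read back (i.e., pipelining)."""
    out = []
    for path in paths:
        out.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: Keep-Alive\r\n"
            "\r\n"
        )
    return "".join(out).encode("ascii")

# Three image objects fetched over one connection, one write.
data = build_pipelined_requests("www.foocorp.com",
                                ["/a.gif", "/b.gif", "/c.gif"])
print(data.count(b"GET "))  # three requests in a single buffer
```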


Other HTTP Headers

The HTTP protocol includes definitions for dozens of headers that can be included in the client-to-server and server-to-client requests and responses. We will not attempt to list and describe all those available here; for a full description, the RFC for HTTP/1.0 and HTTP/1.1 offers a better source. The RFCs define a series of standard headers, which can be complemented by adding user-defined headers from either the client or server side.

As headers are ASCII readable text in every HTTP request and response pair, they can prove very useful in the implementation of content switching. Let's look at some of the HTTP headers most commonly used in content switching.

The "Accept:" Header

The client browser uses the "Accept:" header to indicate to the server which content and media types can be accepted. Examples of the "Accept:" header include:

Accept: */*

Accept anything

Accept: text/plain, text/html

Accept plain text and HTML

Accept: text/html, image/jpeg, image/bmp

Accept HTML and JPEG and bitmap images

The "Accept:" header is useful in the context of content switching to be able to determine the capabilities of a particular client. If the client browser cannot accept images, for example, the request can be directed to a server optimized to deliver text-only versions of the Web pages.
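A content switch can make exactly this decision by inspecting the Accept: header. The sketch below shows the idea; the server-pool names are invented for illustration.

```python
def choose_pool(accept_header: str) -> str:
    """Direct image-capable browsers to the normal server pool and
    text-only clients to a text-optimized pool.
    Pool names are invented for this sketch."""
    media_types = [t.strip() for t in accept_header.split(",")]
    accepts_images = any(
        t == "*/*" or t.startswith("image/") for t in media_types
    )
    return "standard-pool" if accepts_images else "text-only-pool"

print(choose_pool("*/*"))                             # standard-pool
print(choose_pool("text/plain, text/html"))           # text-only-pool
print(choose_pool("text/html, image/jpeg, image/bmp"))  # standard-pool
```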

The "Host:" Header

One of the main problems in the original HTTP/1.0 specification was that a user's request as typed into the browser (e.g., http://www.foocorp.com/index.html) would not contain the host (www.foocorp.com) element in the GET request sent to the server. This represents a problem if virtual hosting is used within a Web server farm, where the server is potentially hosting multiple Web sites and needs to use this host information to determine which path and page the user is requesting.

Within the HTTP/1.1 specification, and subsequently in many new HTTP/1.0 browsers, support was added for the "Host:" header. This allows the user's requested URL, typed into the browser, to be converted into a GET request containing the full path and filename along with the host from which the content is being fetched. The following is an example of translating a full URL into its component parts.

URL : http://www.foocorp.com/directory/somewhere/page.html

GET /directory/somewhere/page.html HTTP/1.0\r\n
Host: www.foocorp.com\r\n

The "Host:" header has many uses within content switching, examples of which are shown in Chapter 6, Content-Aware Server Load Balancing.
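A minimal sketch of the virtual-hosting use case: extract the Host: header from a raw request and map it to a server farm. The hostname-to-farm mapping below is invented for illustration.

```python
# Invented mapping of hostnames to back-end server farms.
FARMS = {
    "www.foocorp.com": "farm-a",
    "intranet.foocorp.com": "farm-b",
}

def route_by_host(raw_request: bytes, default: str = "farm-a") -> str:
    """Pick a server farm based on the Host: header, falling back
    to a default for HTTP/1.0 requests that carry no Host: header."""
    for line in raw_request.split(b"\r\n"):
        if line.lower().startswith(b"host:"):
            host = line.split(b":", 1)[1].strip().decode("ascii")
            return FARMS.get(host, default)
    return default

req = (b"GET /directory/somewhere/page.html HTTP/1.1\r\n"
       b"Host: www.foocorp.com\r\n\r\n")
print(route_by_host(req))  # farm-a
```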

The "User-Agent:" Header

The "User-Agent:" header indicates to the server the type of browser being used by the client. The "User-Agent:" header is useful in the context of content switching as it can be used to determine the browser type used by the client and direct the request to a resource offering content optimized for such a browser. The following is an example of the "User-Agent:".

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)

Cookies—The HTTP State Management Mechanism

As we'll see in later chapters, one of the biggest challenges in HTTP environments, whether content switched or not, is maintaining some form of client-side state that enables Web servers and intermediary devices to recognize the client session and understand the current status of the user session. This issue was tackled in RFC 2109, which defined the Set-Cookie and Cookie HTTP headers, used to set and return cookies, respectively. In HTTP, a cookie takes the form of a small piece of text information that is implanted into the user's browser either permanently or temporarily. The term cookie is commonly used in computing to describe an opaque piece of information held during a session and, unfortunately, seems to have no more interesting origin than that. Once the backend server has implanted the cookie into the user's browser, the information can be used for a number of different applications, ranging from content personalization to user session persistence for online shopping and the collection of demographic and statistical information on Web site usage.

The server issuing a Set-Cookie header in any HTTP response can post a cookie to the client at any time during an HTTP session. This Set-Cookie header has the following syntax:

Set-Cookie: <name>=<value>; expires=<date>; path=<path>; domain=<domain>; secure

The name and value fields are the only ones that are mandatory when issuing a cookie. As the name suggests, these define the name of the cookie and its value, such as UserID=Phil, for example. The expires field identifies, down to the second, the date and time on which a cookie will expire and be deleted from the client computer. The domain and path fields indicate the domain, such as www.foocorp.com, and the path, such as /home/brochures/, for which the cookie should be used. The domain option can effectively be wild-carded by specifying foocorp.com to match both www.foocorp.com and intranet.foocorp.com, for example. Finally, the secure field indicates to the client that the cookie should only be used when a secure connection (SSL secured HTTP or HTTPS) is used between the client and server. Figure 3-4 shows the interaction between a client and server as two different cookies are inserted and used.

Figure 3-4. The interaction between a client and a server when two different cookies are implanted and used.


The following code shows the HTTP responses from the server in more detail. Note that the second cookie includes the Path field, which will limit the use of the cookie to URLs requested by the user that include the string /docs.

Hypertext Transfer Protocol
    HTTP/1.1 200 OK\r\n
    Set-Cookie: UserID=Phil\r\n
    Connection: Keep-Alive\r\n
    Content-Type: text/html\r\n
    \r\n

Hypertext Transfer Protocol
    HTTP/1.1 200 OK\r\n
    Set-Cookie: UserType=Gold; Path=/docs\r\n
    Connection: Keep-Alive\r\n
    Content-Type: text/html\r\n
    \r\n

The mechanism that governs whether a cookie is permanent (i.e., stored on the hard disk of the user's machine) or temporary (i.e., removed once the user closes the browser application) is the Expires field in the Set-Cookie header. If the server does not issue an Expires directive when implanting the cookie, it is considered temporary, whereas if the Expires directive is used, then the cookie will be stored on the client machine until the expiry date has passed.
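Python's standard http.cookies module can generate Set-Cookie headers of exactly this form; the cookie names below mirror the UserID and UserType examples above.

```python
from http.cookies import SimpleCookie

cookies = SimpleCookie()

# A temporary cookie: no expires attribute, so the browser
# discards it when the application is closed.
cookies["UserID"] = "Phil"

# A cookie limited to URLs under /docs, as in the second example.
cookies["UserType"] = "Gold"
cookies["UserType"]["path"] = "/docs"

# Render the Set-Cookie header lines a server would emit.
header = cookies.output()
print(header)
```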

Cookies are by far one of the most useful additions made to the HTTP specifications, and as we'll see in later chapters can be used in conjunction with content switching to enable a whole host of new experience-enhancing services.

HTTP—Further Reading

It is outside the scope of this book to cover the HTTP protocol in its entirety; the RFC for HTTP/1.1 alone is over 160 pages. For more in-depth detail on the protocol, it's worth looking at the following RFCs:

RFC 1945, which defines HTTP/1.0

RFC 2616, which defines HTTP/1.1

RFC 2109, which defines the HTTP state management mechanism (cookies)

File Transfer Protocol (FTP)

In Internet terms, The File Transfer Protocol, or FTP, has been around for a long time. First defined in RFC 172 written in June 1971, the protocol has been through several changes through to the current specification, which is defined in RFC 959. Again, while it's not the purpose of this book to describe every detail about FTP, it's worth looking at its basic operation to get a better understanding of how content switching can improve performance and reliability in FTP environments.

FTP Basics

FTP exists primarily for the transfer of data between two end points. The RFC itself actually states that two of the objectives of the protocol are to "promote the sharing of files" and "transfer data reliably and efficiently." FTP differs from HTTP fundamentally as it is an application made up of two distinct TCP connections:

The Control connection, established to well-known TCP port 21 on the server, over which commands such as USER, PASS, and PORT and their responses are passed

The Data connection, over which the actual file and directory listing data is transferred

Using these two communication connections, two distinct modes of operation determine in which direction the connections are established: Active mode and Passive mode.

Active Mode FTP

Within an Active FTP session, the Control connection is established from the client to the server, with the Data connection established back from the server to the client. In order to do this, the client issues a PORT command to the server that contains the IP address and source and destination TCP ports that should be used during the Data connection. Figure 3-5 shows the lifecycle of an Active FTP session.

Figure 3-5. An active FTP session example.


As we can see from Figure 3-5, once the user has logged on with a valid username and password, the very first "data" that is passed—in this case, a directory listing—is carried using a separate data channel. The format for communicating the IP and TCP information of the data channel is as follows:

PORT [Octet 1],[Octet 2],[Octet 3],[Octet 4],[TCP port high byte],[TCP port low byte]

Therefore, in the preceding example, the PORT command of PORT 10,10,10,10,15,199 equates to IP address 10.10.10.10 and TCP port 4039 [15×256 + 199×1].
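The PORT arithmetic is easy to verify in code; the helper below decodes the six-number form into an (address, port) pair.

```python
def parse_port_command(args: str):
    """Decode the 'h1,h2,h3,h4,p1,p2' argument of an FTP PORT
    command into an (ip, port) pair; the port is p1*256 + p2."""
    nums = [int(n) for n in args.split(",")]
    ip = ".".join(str(n) for n in nums[:4])
    port = nums[4] * 256 + nums[5]
    return ip, port

print(parse_port_command("10,10,10,10,15,199"))
# ('10.10.10.10', 4039)
```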

In some instances, Active FTP can be considered a security risk mainly because there is often little control over the contents of the PORT command. Under normal usage, this information should be the IP address and listening TCP port of the client waiting for the Data connection. When used maliciously, however, the client could issue PORT commands with IP addresses and TCP ports of other machines either within the same network as the server or remotely. Many Application layer firewalls and proxies, or firewalls with support for FTP command parsing can be used to reduce the effectiveness of such attacks. One alternative is to implement the second method of FTP—Passive mode FTP.

Passive Mode FTP

Passive mode FTP works similarly to Active mode FTP with one major exception: both the Control and Data connections within a Passive mode FTP session are established from the client to the server. To implement this, rather than use the PORT command, Passive mode FTP implements the PASV command, which instructs the server that it should listen for the incoming Data connection. Figure 3-6 shows the Passive mode FTP in more detail.

Figure 3-6. A Passive FTP session example.


In Figure 3-6, we can see that rather than the client dictating the parameters of the Data connection, it simply requests this information from the server. Similarly to the PORT command in Active mode, the server's RESPONSE to the PASV request from the client can be interpreted as follows:

RESPONSE 227 (10,10,10,10,41,38)

which means open from client to server on IP address 10.10.10.10 and TCP port 10534 [41×256 + 38×1].
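The same decoding applies to the server's 227 response; a sketch of a parser for it:

```python
import re

def parse_pasv_response(response: str):
    """Pull the (ip, port) the server is listening on out of a
    '227 Entering Passive Mode'-style response line."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", response)
    if match is None:
        raise ValueError("not a valid PASV response")
    n = [int(x) for x in match.groups()]
    return ".".join(map(str, n[:4])), n[4] * 256 + n[5]

print(parse_pasv_response("RESPONSE 227 (10,10,10,10,41,38)"))
# ('10.10.10.10', 10534)
```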

FTP—Further Reading

For further information on the detailed workings of FTP, it's worth looking at RFC 959.

Real Time Streaming Protocol (RTSP)

In the modern Internet, applications are required to deliver value. One of the biggest conundrums in recent years has been the battle to actually make the Internet a viable platform for making money. As we'll see throughout the course of this book, one of the biggest drivers for delivering on the "Gold Rush" promise of Internet technologies is content. Making content attractive to end consumers to the point where they are willing to pay is a big challenge and one that has been aided by the delivery of Application layer protocols such as RTSP, which enables the delivery of real-time video and audio in variable qualities.

The other Application layer protocols we've looked at so far in this chapter work in a request/response manner, whereby the client asks for some piece of content, the content is delivered using TCP or UDP, and then the client application can display the content to the user. While these mechanisms are suitable for a large number of applications in the Internet, there also exists a requirement to deliver content, be it images, audio, video, or a combination of all three, in real time. Imagine if a user were to try to watch a full-screen video file of a one-hour movie using HTTP or FTP as the Application layer protocol. The movie file might be several hundred megabytes, if not several gigabytes, in size. Even with modern broadband services deliverable to the home, this type of large file size does not fit well in the "download then play" model we saw previously.

RTSP uses a combination of reliable transmission over TCP (used for control) and best-efforts delivery over UDP (used for content) to stream content to users. By this, we mean that the file delivery can start and the client-side application can begin displaying the audio and video content before the complete file has arrived. In terms of our one-hour movie example, this means that the client can request a movie file and watch a "live" feed similar to how one would watch a TV. Along with this "on demand" type service, RTSP also enables the delivery of live broadcast content that would not be possible with traditional download and play type mechanisms.

The Components of RTSP Delivery

During our look at RTSP, we'll use the term to describe a number of protocols that work together in delivering content to the user.

RTSP

RTSP is the control protocol for the delivery of multimedia content across IP networks. It typically runs over TCP for reliable delivery and has a very similar operation and syntax to HTTP. RTSP is used by the client application to communicate to the server information such as the media file being requested, the type of application the client is using, the mechanism of delivery of the file (unicast or multicast, UDP or TCP), and other important control commands such as DESCRIBE, SETUP, and PLAY. The actual multimedia content is not typically delivered over the RTSP connection(s), although it can be interleaved if required. RTSP is analogous to the remote control of the streaming protocols.

Real Time Transport Protocol (RTP)

RTP is the protocol used for the actual transport and delivery of the real-time audio and video data. As the delivery of the actual data for audio and video is typically delay sensitive, the lighter weight UDP protocol is used as the Layer 4 delivery mechanism, although TCP might also be used in environments that suffer higher packet loss. The RTP flow when delivering the content is unidirectional from the server to the client. One interesting part of the RTP operation is that the source port used by the server when sending the UDP data is always even—although it is dynamically assigned. The destination port (i.e., the UDP port on which the client is listening) is chosen by the client and communicated over the RTSP control connection.

Real Time Control Protocol (RTCP)

RTCP is a complementary protocol to RTP and is a bidirectional UDP-based mechanism that allows the client to communicate stream-quality information back to the object server. The RTCP communication always uses the next UDP port up from that used by the RTP stream, and consequently is always odd. Figure 3-7 shows how the three protocols work together.

Figure 3-7. The three main application protocols used in real-time streaming.
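The even/odd port-pairing convention for RTP and RTCP can be expressed as a small helper; the base port number used here is arbitrary.

```python
def rtp_rtcp_ports(base_port: int):
    """Return an (rtp, rtcp) port pair following the convention
    described above: RTP on an even port, RTCP on the next
    (odd) port up."""
    rtp = base_port if base_port % 2 == 0 else base_port + 1
    return rtp, rtp + 1

rtp, rtcp = rtp_rtcp_ports(5004)
print(rtp, rtcp)  # 5004 5005
```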


RTSP Operation

The RTSP protocol is very similar in structure, and specifically in syntax, to HTTP. Both use the same URL structure to describe an object, with RTSP using the rtsp:// scheme rather than http://. RTSP, however, introduces a number of additional methods (such as DESCRIBE, SETUP, and PLAY) and also allows data transport out-of-band over a different protocol, such as RTP described earlier. The best way to understand how the components described previously work together to deliver an audio/video stream is to look at an example. The basic steps involved in the process are as follows:

  1. The client establishes a TCP connection to the server, typically on TCP port 554, the well-known port for RTSP.

  2. The client will then commence issuing a series of RTSP commands that have a similar format to HTTP, each of which is acknowledged by the server. Within these RTSP commands, the client will describe to the server details of the session requirements, such as the version of RTSP it supports, the transport to be used for the data flow, and any associated UDP or TCP port information. This information is passed using the DESCRIBE and SETUP methods and is augmented on the server response with a Session ID that the client, and any intermediate proxy devices, can use to identify the stream in further exchanges.

  3. Once the negotiation of transport parameters has been completed, the client will issue a PLAY command to instruct the server to commence delivery of the RTP data stream.

  4. Once the client decides to close the stream, a TEARDOWN command is issued along with the Session ID instructing the server to cease the RTP delivery associated with that ID.

Example—RTSP with UDP-Based RTP Delivery

Let's consider an example interaction where the client and server will use a combination of TCP-based RTSP and UDP-based RTP and RTCP to deliver and view a video stream. In the first step, the client will establish a TCP connection to port 554 on the server and issue an OPTIONS command showing the protocol version used for the session. The server acknowledges this with a 200 OK message, similar to HTTP.

C->S  OPTIONS rtsp://video.foocorp.com:554 RTSP/1.0
      Cseq: 1

S->C  RTSP/1.0 200 OK
      Cseq: 1

Next, the client issues a DESCRIBE command that indicates to the server the URL of the media file being requested. The server responds with another 200 OK acknowledgment and includes a full media description of the content, which is presented in either Session Description Protocol (SDP) or Multimedia and Hypermedia Experts Group (MHEG) format.

C->S  DESCRIBE rtsp://video.foocorp.com:554/streams/example.rm RTSP/1.0
      Cseq: 2

S->C  RTSP/1.0 200 OK
      Cseq: 2
      Content-Type: application/sdp
      Content-Length: 210
      <SDP Data...>

In the third stage of the RTSP negotiation, the client issues a SETUP command that identifies to the server the transport mechanisms, in order of preference, the client wants to use. We won't list all of the available transport options here (the RFC obviously contains an exhaustive list), but we'll see the client request RTP over UDP on ports 5067 and 5068 for the data transport. The server responds with confirmation of the RTP over UDP transport mechanism and the client-side ports and includes the unique Session ID and server port information.

C->S  SETUP rtsp://video.foocorp.com:554/streams/example.rm RTSP/1.0
      Cseq: 3
      Transport: rtp/udp;unicast;client_port=5067-5068

S->C  RTSP/1.0 200 OK
      Cseq: 3
      Session: 12345678
      Transport: rtp/udp;client_port=5067-5068;server_port=6023-6024

Finally, the client is now ready to commence the receipt of the data stream and issues a PLAY command. This simply contains the URL and Session ID previously provided by the server. The server acknowledges this PLAY command, and the RTP stream from the server to client will begin.

C->S  PLAY rtsp://video.foocorp.com:554/streams/example.rm RTSP/1.0
      Cseq: 4
      Session: 12345678

S->C  RTSP/1.0 200 OK
      Cseq: 4

Once the client decides that the stream can be stopped, a TEARDOWN command is issued over the RTSP connection referenced only by the Session ID. The server again acknowledges this and the RTP delivery will cease.

C->S  TEARDOWN rtsp://video.foocorp.com:554/streams/example.rm RTSP/1.0
      Cseq: 5
      Session: 12345678

S->C  RTSP/1.0 200 OK
      Cseq: 5

Figure 3-8 shows this example in a simplified graphic form.

Figure 3-8. An example of RTSP in action with the video and audio data being delivered over a separate UDP-based RTP stream.
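The client side of the exchange above can be mimicked with a small request builder that tracks the CSeq counter and adds the Session ID once the server has supplied it; the URL and session values are the ones from the example.

```python
class RtspRequestBuilder:
    """Build the client requests from the example exchange, with
    an incrementing Cseq and the Session ID once it is known."""

    def __init__(self, url: str):
        self.url = url
        self.cseq = 0
        self.session = None

    def request(self, method: str, extra_headers=()) -> str:
        self.cseq += 1
        lines = [f"{method} {self.url} RTSP/1.0", f"Cseq: {self.cseq}"]
        if self.session:
            lines.append(f"Session: {self.session}")
        lines.extend(extra_headers)
        return "\r\n".join(lines) + "\r\n\r\n"

b = RtspRequestBuilder("rtsp://video.foocorp.com:554/streams/example.rm")
print(b.request("OPTIONS"))
print(b.request("DESCRIBE"))
print(b.request("SETUP",
    ["Transport: rtp/udp;unicast;client_port=5067-5068"]))
b.session = "12345678"  # returned in the server's SETUP response
print(b.request("PLAY"))
print(b.request("TEARDOWN"))
```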


Other Options for Data Delivery

In certain scenarios, the best-effort, dynamic port methods of UDP-based RTP, as described previously, are not suitable. Some environments might consider the allocation of dynamic source and destination UDP ports through firewalls to be something they can live happily without. Moreover, just the nature of the Layer 1 and Layer 2 transport mechanisms underlying the data delivery might not be suited to nonguaranteed UDP traffic. In either instance, RTSP allows for the negotiation of the RTP delivery of the media data to be interleaved into the existing TCP connection.

When interleaving, the client-to-server SETUP command has the following format:

C->S  SETUP rtsp://video.foocorp.com:554/streams/example.rm RTSP/1.0
      Cseq: 3
      Transport: rtp/avp/tcp; interleaved=0-1

The changeover in the preceding example is in the transport description. First, the transport mechanisms have changed to show that the RTP delivery must be over TCP rather than UDP. Second, the addition of the interleaved option shows that the RTP data should be interleaved and use channel identifiers 0 and 1—0 will be used for the RTP data and 1 will be used for the RTCP messages. To confirm the transport setup, the server will respond with confirmation and a Session ID as before:

S->C  RTSP/1.0 200 OK
      Cseq: 3
      Session: 12345678
      Transport: rtp/avp/tcp; interleaved=0-1

The RTP and RTCP data can now be transmitted over the existing RTSP TCP connection with the server using the 0 and 1 identifiers to represent the relevant channel.
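Interleaved data is framed on the TCP stream as a "$" byte, a one-byte channel identifier, and a two-byte length, followed by the payload (per RFC 2326). A sketch of a parser for one such frame:

```python
import struct

def parse_interleaved_frame(buf: bytes):
    """Split one '$'-framed interleaved chunk into
    (channel, payload, remaining_bytes)."""
    if len(buf) < 4 or buf[0:1] != b"$":
        raise ValueError("not an interleaved frame")
    channel, length = struct.unpack("!BH", buf[1:4])
    payload = buf[4:4 + length]
    return channel, payload, buf[4 + length:]

# Channel 0 carries RTP data and channel 1 RTCP, as negotiated above.
frame = b"$" + bytes([0]) + struct.pack("!H", 5) + b"hello" + b"$..."
channel, payload, rest = parse_interleaved_frame(frame)
print(channel, payload)  # 0 b'hello'
```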

One further delivery option for RTP and RTCP under RTSP is to wrap the delivery of all media streaming components inside traditional HTTP frame formats. This removes most barriers presented when using streaming media through firewalled environments, as even the most stringent administrator will typically allow HTTP traffic to traverse perimeter security. While HTTP and RTSP interleaved delivery of the streamed media data will make the content available to the widest possible audience, the overhead of wrapping all RTP data inside either an existing TCP stream or, worse still, inside HTTP makes these the least efficient methods of delivery.

To cope with the different options described previously, most streaming media clients allow users to configure their preferred delivery mechanism or mechanisms, and the timeout that should be imposed when failing over between them. From a client perspective, the application will first request that the stream be delivered using RTP in UDP, and if the stream does not arrive within x seconds (as it is potentially being blocked by an intermediate firewall), it will fail back to using RTP interleaved in the existing RTSP connection.

RTSP and RTP—Further Reading

For further information on the RTSP and RTP protocols, RFCs 2326 and 1889, respectively, are a good source.

Secure Sockets Layer (SSL)

The final protocol we'll look at in this chapter is neither a Layer 4 transport protocol nor an Application layer protocol, but one that sits between these layers to provide security services to many modern Internet applications. Secure Sockets Layer, or SSL, has been one of the major forces in Internet security technology since its inception by Netscape Communications, and continues to be included in all major browsers. This has enabled Web application developers to deliver secure content and services using traditional HTTP servers, with few changes required to the setup of the basic server or the structure of the HTML content. The other major advantage of the integration of SSL into all major browsers is its transparency to the user. SSL is typically used without the knowledge of the client, other than the appearance of a small padlock in the corner of the browser window, meaning that no additional expertise is required to use Internet applications with this security. Figure 3-9 shows a browser that is currently using SSL.

Figure 3-9. A Web browser will typically use SSL when instructed by the Web site with little or no input required by the user. The use of SSL can be seen by the inclusion of a small padlock in the browser.


While the most common implementation of SSL is within Web browsers, creating the application protocol hybrid known as HTTPS, it should be remembered that it is a transparent protocol available to any TCP/IP-based application. Along with HTTPS, other common SSL secured protocols include SMTPS and Telnet-S.

The Need for Application Security

The need for security within Internet applications is clear—the Internet is still a public network with little or no security infrastructure designed to protect all users. Imagine using the online services of your favorite bank. Passing important data such as your bank account number, password, and balance across the Internet using only HTTP represents a huge personal security risk, as the data is potentially visible to any device sitting between your browser and the bank's Web site. SSL can be used very effectively to hide all of the application data as it traverses the Internet, preventing anybody snooping the connection from reading personal data—a process referred to as encryption.

The second important feature provided by SSL for Internet applications is authentication; in other words, the ability for the client to verify that the Web site it has reached is genuine. Imagine in our previous bank example if a rogue site were to masquerade as the bank's Web site. This might allow the rogue site to intercept the personal and banking details of thousands of customers, not a welcome situation. SSL provides mechanisms to implement authentication as a way for each side to identify itself to the other.

The final security element that is provided by SSL is tamper detection. Imagine finally that someone were to sit between the client and the bank's Web site and change certain pieces of data as they pass back and forth. This would give the opportunity to alter key personal and banking data and potentially set up fraudulent transactions. SSL provides mechanisms for each side to ensure that the Application layer data being sent and received has not changed in any way as it traverses the Internet.

For the Internet to continue to grow, not only in size, but also as a credible medium for business and commerce, it must be able to provide mechanisms such as SSL as a way to guarantee security.

Fitting SSL into the Seven Layer Model

In the concepts of the OSI Seven Layer Model as we saw in Chapter 2, Understanding Layer 2, 3, and 4 Protocols, SSL sits between the Application layer and the Transport layer, and is traditionally seen as part of the Presentation layer. This means that SSL is applied selectively by each application, rather than to all traffic as with encryption based on IPSec. This gives the client machine the ability to run secure services for certain applications only, while remaining impartial to the underlying Layer 3 and 4 services below. In comparison, IPSec can operate in a tunneling mode, which means that all traffic flowing to or from a particular address or range of addresses is encrypted right down to the IP layer. Within SSL, only the Application layer data is encrypted. Figure 3-10 shows the presence of SSL in the OSI model.

Figure 3-10. Where SSL sits in the OSI model in comparison to IPSec.


Encryption and Cryptography

The process of encryption and decryption fundamentally means to take some source data, transform it to a state where it cannot be read by anyone else, and then transform it back to its original state, thus rendering it readable once more. This approach requires the use of two important elements: the Cryptographic Algorithm, or cipher, and a key. A cipher is a mathematical formula or function that is applied either to the original data (to encrypt) or to the transformed data (to decrypt). One thing always remains true, however—the cipher used to encrypt the data must also be used to decrypt at the other end. To enable this commonality in a network such as the Internet where there are enormous numbers of potential client-server connection combinations, a series of standard ciphers have been developed over time such as Data Encryption Standard (DES) and RC4.

As these ciphers are well known, they rely on the second element to introduce some form of random factor to the process, known as a key. The use of a key, or series of keys, gives the cipher the ability to encrypt the data in such a way so as not to be decrypted easily. If you were to encrypt a simple sentence using an algorithm that is widely known, it would be a relatively simple task to run the data through the same algorithm and arrive at the answer. The use of a key means that in order to decrypt the data, the recipient must know both the appropriate cipher to use and the key used to encrypt the data originally.

This combination of cipher and key forms the basic premise of modern cryptography: Decryption with the known key is simple, but decryption without the key is extremely difficult and in most cases computationally impossible. SSL uses a combination of two basic encryption techniques, symmetric-key encryption and public-key encryption.

Symmetric-Key Encryption

With symmetric-key encryption, both sides use the same key value to perform both the encryption and decryption. Figure 3-11 shows a simple graphical representation of symmetric-key encryption.

Figure 3-11. With symmetric-key encryption, both the encryption and decryption use the same key.


Symmetric-key encryption has a number of advantages and disadvantages. First, performing this type of encryption and decryption is computationally inexpensive, which means that the performance of applications using symmetric keys is generally better. On the downside, if the shared key is compromised on either side, the security of the encryption between the parties is broken. Moreover, the process of distributing a single shared key between two sides wanting to use symmetric-key encryption can be cumbersome. Imagine two Internet-based users wishing to communicate: they must first share a key before they can encrypt and transmit data. This in itself is a major headache, as the key cannot simply be sent in clear text over the Internet for fear of being captured. SSL uses symmetric-key encryption for bulk encryption—that is, the encryption of all Application layer data—but it employs a very clever technique to arrive at a common shared key: public-key or asymmetric-key encryption.
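To make the defining property concrete, here is a minimal Python sketch of a symmetric cipher. It is purely illustrative (a keystream derived with SHA-256 and XORed with the data, which is not a secure cipher and not one SSL actually uses); the point is simply that one shared key performs both encryption and decryption.

```python
import hashlib

def xor_stream(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256-derived keystream.

    Calling it a second time with the same key reverses the operation,
    so this one function both encrypts and decrypts.
    """
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        keystream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, keystream))

shared_key = b"agreed-by-both-sides"
ciphertext = xor_stream(shared_key, b"account=12345678")   # encrypt
plaintext = xor_stream(shared_key, ciphertext)             # decrypt with the same key
```

Note that decrypting with any other key simply yields garbage, which is exactly why the key-distribution problem described above matters so much.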

Public-Key or Asymmetric-Key Encryption

As its name suggests, public-key or asymmetric-key encryption uses two different keys to perform encryption and decryption, respectively. These keys, known as the public and private keys, are mathematically linked to each other. The mathematics of public-key encryption differ from those of symmetric-key encryption in that data encrypted using one key of the pair cannot be easily decrypted using that same key; only the other key of the pair can decrypt it. For public-key encryption to work correctly, the client must encrypt using the public key and the server must decrypt using the private key. As a result, the secrecy of the public key is largely irrelevant, and it is commonly available. In SSL terms, the public key is carried in a certificate—more on that later. The security of the private key, however, is of the utmost importance, and typically the private key will never leave the server for which it was generated, for fear of compromising the security of the key pair. In summary, then, if you encrypt with the widely available public key, the resulting data can only be decrypted using the corresponding private key. Figure 3-12 shows a simple representation of public-key encryption.

Figure 3-12. In asymmetric-key or public-key encryption, any data encrypted using the easily available public key can only be decrypted using the corresponding private key.


This approach affords public-key encryption a couple of key advantages. First, the combination of corresponding, mathematically linked keys means that once the data has been encrypted, it can only be decrypted by the holder of the private key. Second, as the public key can be transmitted in clear text to the intended receiver, it is well suited to large-scale, public networks such as the Internet. The main downside of public-key encryption is that it is computationally expensive, rendering it unsuitable for situations in which large volumes of data must be encrypted. Above all, the security of the private key is paramount; if it is lost or compromised, the entire premise on which the process is built is broken.
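As a worked illustration of the mathematics, here is textbook RSA in Python with deliberately tiny primes. This is a sketch only: real key pairs are thousands of bits long and real implementations add padding. The asymmetry, however, is the same: anyone can encrypt with the public key (n, e), but only the private exponent d recovers the data.

```python
# Textbook RSA with deliberately tiny primes -- illustration only.
p, q = 61, 53
n = p * q                       # public modulus
phi = (p - 1) * (q - 1)
e = 17                          # public exponent: the public key is (n, e)
d = pow(e, -1, phi)             # private exponent: the private key is (n, d)

message = 65                    # must be smaller than n in this toy scheme
ciphertext = pow(message, e, n)     # anyone may encrypt with the public key...
recovered = pow(ciphertext, d, n)   # ...but only the private key decrypts it
```

With these numbers, d works out to 2753 and the round trip recovers the original message, while the ciphertext on its own reveals nothing about it without d.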

SSL—Combining Symmetric and Asymmetric Encryption

Therefore, on the one hand we have a symmetric encryption mechanism that is computationally cheap but does not scale well to large numbers of users, and on the other, we have a computationally expensive algorithm which does scale well due to its concept of public keys. The answer in terms of SSL is to use a combination of both of these mechanisms to achieve the result we're looking for. The aim of combining the two methods is to allow for encrypted access from anywhere by anyone. The process uses asymmetric encryption to initialize the connection, and then uses symmetric encryption to provide a secure communication channel for the duration of the conversation.

When communications begin, the client creates a random number whose length is determined by the encryption strength required. This large random number will effectively form the shared key for the symmetric encryption that will be used to exchange application data. The client encrypts this random number with the server's public key and sends the encrypted version to the server. The asymmetric encryption at this stage ensures that only the holder of the private key can decrypt the data. Once decrypted, this random number is used as the symmetric key for the duration of the conversation, as each party now holds the same shared key. The beauty of this process is that the symmetric key itself never traverses the connection in clear form, thus minimizing the chance of its being intercepted. Figure 3-13 shows this combination of symmetric and public-key encryption as used by SSL.

Figure 3-13. SSL uses a combination of public-key encryption to exchange the symmetric key and symmetric encryption to encrypt the bulk application data.
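The exchange can be sketched end to end under the same toy assumptions as the earlier examples (tiny textbook RSA for the key exchange and an illustrative XOR keystream for the bulk encryption; neither is the real SSL machinery):

```python
import hashlib
import secrets

def xor_stream(key: bytes, data: bytes) -> bytes:
    # Illustrative keystream cipher (not secure): one call encrypts,
    # a second call with the same key decrypts.
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        keystream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, keystream))

# The server's toy RSA key pair (tiny primes, illustration only).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))

# 1. The client picks a random number: the future shared symmetric key.
client_secret = secrets.randbelow(n - 2) + 2

# 2. The client encrypts it with the server's PUBLIC key and sends it.
on_the_wire = pow(client_secret, e, n)

# 3. Only the server's PRIVATE key can recover the secret.
server_secret = pow(on_the_wire, d, n)

# 4. Both sides now hold the same key and switch to cheap symmetric
#    encryption for the bulk application data.
request = xor_stream(str(client_secret).encode(), b"GET /account HTTP/1.1")
decrypted_by_server = xor_stream(str(server_secret).encode(), request)
```

The random number itself never crosses the wire in clear form; only its public-key-encrypted version does, which is the whole point of the hybrid scheme.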


Encryption Algorithms

There are many encryption algorithms in use, each providing a different level of encryption depending on the degree of security required. Earlier algorithms used 40-bit keys, which with today's computing power can typically be cracked within a few hours; the longer the key, the harder the encryption is to crack. All algorithms work in conjunction with a secret key to create the encryption. In the case of SSL, this secret key is the randomly generated number. Common encryption algorithms used today are DES, 3DES, and AES.
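A quick back-of-the-envelope calculation in Python shows why key length matters so much: each additional bit doubles the attacker's search space. (The attacker speed below is an arbitrary round figure for illustration.)

```python
# Each extra key bit doubles the search space an attacker must exhaust.
keyspace_40 = 2 ** 40
keyspace_128 = 2 ** 128

GUESSES_PER_SECOND = 1e9        # a generously fast brute-force attacker
SECONDS_PER_YEAR = 3600 * 24 * 365

# A 40-bit keyspace falls in minutes at this rate...
seconds_to_crack_40 = keyspace_40 / GUESSES_PER_SECOND

# ...while a 128-bit keyspace takes vastly longer than the age of the universe.
years_to_crack_128 = keyspace_128 / GUESSES_PER_SECOND / SECONDS_PER_YEAR
```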

Certificates

Now that we've seen the importance of passing the public key within SSL, let's look at the mechanism used to do so. Certificates perform two key functions in SSL: first, they provide a level of authentication, potentially for both sides, and second, they provide a standard format in which to pass the public key to the requester. Certificates are like digital passports that authenticate an organization to a user connecting to its site.

Two types of certificate can be used: a server certificate and a client certificate. In a typical SSL environment, only the server certificate is used. It allows the server (or site) to authenticate itself on the initial client request and to pass the public key. Remember, it is the client that initiates the connection and asks for the certificate to be sent. This allows an organization to publish its services, and even though users cannot see where they are going (there is no storefront or actual physical structure), they know they have connected to the right site based on the server certificate issued.

While this could be spoofed in theory, one has to question the practicality of doing so. It would require the private key to be retrieved and the DNS entries for the site to be hijacked, or updated to a new address, all without the existing site becoming aware of it. This is highly unlikely in an age where security is a number-one agenda item, as a highly active site would be immediately aware of a failure or attack. In addition, a user would actually have to make a credit card payment to the fraudulent site. For the attack to work, it would have to target a site that is not well monitored or actively trading, and therefore rarely visited, which in turn makes the exercise superfluous, as the hacker would gain no significant revenue and cause no real loss of reputation. Typically, then, the receipt of a server certificate is all that is required to begin a secure connection with a site.

In some cases, the site also wants to ensure that the users are who they say they are. This is certainly a requirement in business-to-business transactions where companies want to be able to control access to their site, especially when access to sensitive information or large sums of money is involved.

Client certificates are used to provide client-side authentication. These certificates, normally derived from the server certificate, are loaded on to the user's machine, and on connection, the server will request the certificate to be sent to it to authenticate the user. As each certificate will have a unique identifier, this can be used to track access. Should connectivity no longer be permitted or required, then this unique identifier can also be used to revoke access to the specific site.

Having the ability to provide client and server authentication builds a very compelling case for SSL deployment. Figure 3-14 is an example of what a certificate looks like, followed by the associated private key:

Certificates such as these can be easily copied and pasted into a security appliance.

Certificate Authorities

Certificate authorities (CAs) are like the passport control of the SSL world: they confirm that a site is what it says it is, as they have signed its certificate. Many organizations act as CAs and sign certificates on behalf of sites. These organizations are often respected businesses or, in some cases, quasi-governmental bodies such as a post office or telecommunications provider. The largest ones around today, such as Verisign, Entrust, and Thawte, are dedicated to providing a certificate-signing function. By default, Web browsers ship with a list of accepted CAs, which is checked when a site is accessed. If the CA is not present, the browser will display a message asking whether this certificate should be accepted. Users can add or delete CAs within their favorite browser. Certificates can also be chained: a certificate can be trusted if it has a link, or chain, back to an original issuer whom you trust. This method is transparent to the user and is handled by the SSL protocol.

Figure 3-14 Sample of a public certificate and private key. As you can see, it is merely clear text and can be easily copied.

-----BEGIN CERTIFICATE-----
IFtTCCBR6gAwIBAgIEN0sJFTANBgkqhkiG9w0BAQQFADCBwzELMAkGA1UEB
VVMxFDASBgNVBAoTC0VudHJ1c3QubmV0MTswOQYDVQQLEzJ3d3cuZW50cnV
ZXQvQ1BTIGluY29ycC4gYnkgcmVmLiAobGltaXRzIGxpYWIuKTElMCMGA1U
LmVudHJ1c3QubmV0L0NQUyBpbmNvcnAuIGJ5IHJlZi4gKGxpbWl0cyBsaWFiLikxJTAjBgNVBAs
THChjKSAxOTk5IEVudHJ1c3QubmV0IExpbWl0ZWQxOjA4BgNVBAMTVudHJ1c3QubmV0IFNlY3Vy
ZSBTZXJ2ZXIgQ2VydGlmaWNhdGlvbiBBdXRob3JpdHkDjAMBgNVBAMTBUNSTDEyMCygKqAohiZ
odHR3QubmV0
L0NSTC9zZXJ2ZXIxLmNybDAfBgNVHSMEGDAWgBTwF2ITVT2z/woAa/tQhJfz7WLQGjAdBgNVHQ4
EFgQU3Rc4WmXyFuApzKBZCUyzwqoO6jkwCQYDVR0TBAgkqhkiG9n0HQQAEDDAKGwRWNC4wAwIDq
DANBgkqhkiG9w0BAQQFAAOBgQBbSMGk6BtJ7g6UzC4hL1nJZYQldua3ot6K7EstAu6pBiE0DhAG
JKm0tCrS16h
KGMpIDE5OTkgRW50cnVzdC5uZXQffffltaXRlZDE6MDgGA1UEAxMxRW50cn
ZXQgU2VjdXJlIFNlcnZlciBDZXJ0aWZpY2F0aW9uIEF1dGhvcml0eTAeFw0
MDgxNjA4MjdaFw0wMjAxMDgxNjM4MjdaMH4xCzAJBgNVBAYTAlNFMRIwEAY
EwlTdG9ja2hvbG0xEjAQBgNVBAcTCVN0b2NraG9sbTEUMBIGA1UEChMLQmx
aWwgQUIxFDASBgNVBAsTC0RldmVsb3BtZW50MRswGQYDVQQDExJ2aXAyYS5
dGFpbC5jb20wgZ8wDQYJKoZIhvcNAQEBBQADgY0AMIGJAoGBALctVjRkmPJ
FsI/oo1Xh0yJqyC/Vl2tWS3ujM8lSqCA9afq8cqfcRN5cWcelix5oEbaz5e
GdtLVWqBHw09As3w1AyZsdiSUpdOFNdjPhv9IC9S13y7zCzr0SyS/u7l1c4
c3QubmV0L2NwczCBwAYIKwYBBQUHAgIwgbMwEhYLRW50cnVzdC5uZXQwAwI
9TsMAFHBudxPK58IPkKUSpdxZvg7AgMBAAGjggL4MIIC9DCCAQcGA1UdIAS
/DCB+QYJKoZIhvZ9B0sCMIHrMCYGCCsGAQUFBwIBFhpodHRwOi8vd3d3LnEVudHJ1c3QubmV0IE
NQUyBpbmNvcnBvcmF0ZWQgYnkgcmVmZXJlbmNlLiBUayBjb250YWlucyBsaW1pdGF0aW9ucyBvb
iB3YXJyYW50aWVzIGFuZCBsaWFi
aWxpdGllcy4gIENvcHlyaWdodCAoYykgMTk5OSBFbnRydXN0Lm5ldCAgd3d
dHJ1c3QubmV0L2NwczALBgNVHQ8EBAMCBaAwKwYDVR0QBCQwIoAPMjAwMTA
NjM4MjdagQ8yMDAxMDkyMTA0MzgyN1owEQYJYIZIAYb4QgEBBAQDAgZAMBM
JQQMMAoGCCsGAQUFBwMBMIIBHQYDVR0fBIIBFDCCARAwgd+ggdyggdmkgdY
CzAJBgNVBAYTAlVTMRQwEgYDVQQKEwtFbnRy5ldDE7MDkGA1UECxMyd3d3
/wWqspaKSNsWfqc0AWFfgKznJJmnxsyThudodg5iTM1Nfr93aD2P/3qPMxSSEm/T/
uOKBaLPLVd3dmjPc/0v1AU48dc0hgx6VhqX98poLiHJAHg==
-----END CERTIFICATE-----

-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,9BCDFA41DAC78C8D

+AsRro1zm2vlV0deB0kw9geWpMJoLOz67sdb8+8E2Pal5hZC1asZapwHGXOAgqeQfUb6VZKy+2H
zjz8Nw6I3xcAyi7xnF1YYRJxlz7sA+5ACBSAYvZGZRXF7jyTXomIITrwPt40V9uGldjFmwAd6e1
k1qxKi2T6qtzdVeYZhz27+njtMkDa1PVdJWbcLFyLMRZAUp5Ubu8mIUgkReyMSPMdn6bjmf7hKE
3jbT/REnICiDcLe3SZzXes8mckUOOV++dBD+orBxeU8dkB59ivWE/WlAP4cf1wOPS/
B1yzFsHqlbyqlvtfxjF472vU4V0JLOe0RQ5NyVqw09N/NHrgBHce6JgwEHfmgfRr/
P2RFYvwhs1wUvKVgOOK8KxHdRgNMGshFWMOGmrWV82dO0pywC25Xlq1GiC6vglwHxvzfSr4pnYv
5VcgDzfkvsYJCVpTiWYiS522Svb0Ln3Gyx55JgIdlaMVhZUCmdbRqH6KFoWyr0Ud+++6PbI+HWb
VPBpifrqyj3LDnuPTRTDkwy7WlzggXXY1TbdO8XY7KrhgpcBpN4amILANhcZG/
-----END RSA PRIVATE KEY-----
   

SSL in Action

Let's see the combination of cipher suites, keys, algorithms, and certificates in action as we run through an example SSL session.

When enabling SSL services on your server, you will first need to create a private and public key pair and a corresponding certificate. This process can be initiated automatically on most Web servers and results in the creation of what is known as a certificate signing request, or CSR. A point to note here is that the private key must stay private: the public key is just that, public, and anyone who also held the private key could decrypt all of the encrypted traffic and easily masquerade as your site. The CSR, containing the public key, is forwarded to the chosen CA, who will validate the request, sign it, and return the signed certificate for you to import back into the Web server. We must point out here that a certificate is tied to a domain name and not an IP address, so that domain name needs to resolve to the address of the server in order for it to work. Once this is complete, the Web site is ready for use; all that is required is that the servers have the SSL service running.

Now we are ready to begin the actual SSL setup as illustrated in Figure 3-15. Let's look at the steps in more detail, remembering that certain message types within the SSL protocol are used to determine specific requests:

  1. Once the client has established a TCP session on port 443 with the server, the client sends a client hello message. This client hello includes information such as the cipher suites that it supports.

  2. The server selects a cipher suite from the list presented and responds with a server hello indicating to the client the cipher it has chosen. The client and the server have now agreed on a cipher suite to use.

  3. The server then issues the client a copy of its certificate (remember that this certificate also contains the public key). Optionally, the server may request a copy of the client's certificate if client-side authentication is required.

  4. Next, the server sends a server hello done message to tell the client it has completed the first phase of the session setup. As there is no key yet, this process is carried out in clear text.

  5. The client now generates a random number, encrypts it with the server's public key, and sends it to the server. This process is known as the client key exchange. This random number is the symmetric key that will be used for the duration of the symmetric encryption session. Communication from here on is encrypted.

  6. The client now sends a change cipher spec message to the server to say it will now begin using the negotiated cipher suite (determined in step 2) for the duration of the session.

  7. Once this is done, the client sends a finished message to the server to say that it is ready.

  8. The server, in turn, sends a change cipher spec message to the client using the agreed information. The server also sends out a finished message on completion.

  9. A secure encrypted tunnel is now set up, and communication can begin using the symmetric encryption details negotiated.
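Python's standard ssl module is one place to see these negotiation ingredients directly. The sketch below makes no network connection; it simply builds a client-side context and lists some of the cipher suites such a client would offer in its client hello (step 1) for the server to choose from (step 2):

```python
import ssl

# Build a client-side SSL/TLS context with secure defaults:
# certificate verification enabled, legacy protocol versions disabled.
ctx = ssl.create_default_context()

# The cipher suites this context would advertise to a server.
for suite in ctx.get_ciphers()[:5]:
    print(suite["name"], suite["protocol"])
```

Wrapping a TCP socket with ctx.wrap_socket(sock, server_hostname=...) would then drive the full handshake described in the steps above, including the certificate check against the context's trusted CAs.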

Figure 3-15. SSL session setup is a computationally intensive process that we need to offload to increase network performance.


One key piece of information in this exchange, which we will see has relevance in content switching in later chapters, is the SSL Session ID. This is a random identifier agreed on by both sides when first initiating the SSL session and is used to uniquely identify the tunnel they have established. One option available to the client during the negotiation process described previously is to reuse a previously agreed set of ciphers and keys by including the Session ID in the client hello it sends to the server. Provided that the server is configured to allow this type of session reuse, it will skip the symmetric key exchange and thus bypass the big number arithmetic, in turn speeding up the process. The SSL Session ID can be read in clear text, as it is not passed encrypted between client and server.

SSL Summary

SSL is a standards-based encryption and authentication mechanism widely used within the Internet today. While by far the most common implementations use HTTP as the Application layer protocol, SSL can be used to secure other applications. As we'll see in later chapters, the inclusion of SSL as a security mechanism for modern Web sites creates yet another part of the puzzle of content switching.

Summary

As with our coverage of Layer 2, 3, and 4 protocols, there are many other more detailed books covering the Application layer protocols we saw in this chapter. Hopefully, however, this chapter has served to give a better understanding of the ways in which TCP, UDP, and IP can be combined to provide application services, all optionally wrapped in SSL for greater security. Equipped with this understanding, we can begin to understand the concepts of content switching and put the techniques to use to solve many of the scalability problems of modern IP networks.
