The Concept of State
One confusing concept to understand when discussing firewalls and TCP/IP communications is the meaning of state. The main reason this term is so elusive is that it can mean different things in different situations. Basically, state is the condition of being of a given communication session. The definition of this condition of being for a given host or session can differ greatly, depending on the application with which the parties are communicating and the protocols the parties are using for the exchange.
Devices that track state most often store the information as a table. This state table holds entries that represent all the communication sessions of which the device is aware. Every entry holds a laundry list of information that uniquely identifies the communication session it represents. Such information might include source and destination IP address information, flags, sequence and acknowledgment numbers, and more. A state table entry is created when a connection is started out through the stateful device. Then, when traffic returns, the device compares the packet's information to the state table information to determine whether it is part of a currently logged communication session. If the packet is related to a current table entry, it is allowed to pass. This is why the information held in the state table must be as specific and detailed as possible to guarantee that attackers will not be able to construct traffic that will be able to pass the state table test.
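The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the names FlowKey and StateTable are hypothetical, and the key holds only protocol, addresses, and ports, whereas a real device would also record flags, sequence numbers, and timers.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical key identifying one direction of a session."""
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

    def reply_key(self) -> "FlowKey":
        # Return traffic has source and destination swapped.
        return FlowKey(self.proto, self.dst_ip, self.dst_port,
                       self.src_ip, self.src_port)


class StateTable:
    def __init__(self):
        self._entries = set()

    def record_outbound(self, key: FlowKey) -> None:
        """Create an entry when a connection starts out through the device."""
        self._entries.add(key)

    def allows_inbound(self, key: FlowKey) -> bool:
        """Pass an inbound packet only if it mirrors a recorded session."""
        return key.reply_key() in self._entries
```

An inbound packet whose addresses and ports do not exactly mirror a recorded outbound session fails the lookup, which is why more detailed keys make forged traffic harder to construct.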
Firewall Clustering and Tracking State
It is possible to cluster firewalls together for redundancy, or to allow more bandwidth than a single firewall can handle on its own. In this clustered state, any of the firewall partners could possibly receive any part of a traffic flow. Therefore, although the initial SYN packet for a connection might be received on firewall 1, the final ACK response might come back to firewall 2. To be able to handle traffic statefully when firewalls are clustered, a single shared state table must be available to all the cluster members. This facilitates the complete knowledge of all traffic that other cluster members have seen. It is often accomplished using a dedicated communications cable between the members as a sort of direct link, solely for the sharing of vital state information. Such a mechanism affords an efficient means for the propagation of said state table information, allowing even the fastest communication links to operate without the problem of an incompletely updated state table.
The only other means to implement clustered firewalls without having to share state is by placing the firewalls in a "sandwich" between load-balancers. This way, a given traffic stream will always hit the same firewall it was initiated through. For more information on design choices for firewall clustering, take a look at Chapter 17, "Tuning the Design for Performance."
Transport and Network Protocols and State
Transport protocols can have their connection's state tracked in various ways. Many of the attributes that make up a communication session, including IP address and port pairings, sequence numbers, and flags, can all be used to fingerprint an individual connection. The combination of these pieces of information is often held as a hash in a state table for easy comparison. The particulars depend on the vendor's individual implementation. However, because these protocols are different, so are the ways the state of their communications can be effectively tracked.
TCP and State
Because TCP is a connection-oriented protocol, the state of its communication sessions can be solidly defined. Because the beginning and end of a communication session are well defined in TCP and because it tracks the state of its connections with flags, TCP is considered a stateful protocol. TCP's connection is tracked as being in one of 11 states, as defined in RFC 793. To truly understand the stateful tracking of TCP, it is important to realize the many stages a TCP connection goes through, as detailed in the following list:
CLOSED: A "non-state" that exists before a connection actually begins.
LISTEN: The state a host is in when waiting for a request to start a connection. This is the true starting state of a TCP connection.
SYN-SENT: The time after a host has sent out a SYN packet and is waiting for the proper SYN-ACK reply.
SYN-RCVD: The state a host is in after receiving a SYN packet and replying with its SYN-ACK reply.
ESTABLISHED: The state a connection is in after its necessary ACK packet has been received. The initiating host goes into this state after receiving a SYN-ACK, as the responding host does after receiving the lone ACK.
During the process of establishing a TCP connection, a host goes through these states. This is all part of the three-way handshake, as shown in Figure 3.1.
Figure 3.1 The TCP three-way handshake connection establishment consists of five well-defined states.
The remaining 6 of the 11 TCP connection states describe the tearing down of a TCP connection. The first five are used during a normal close, in which the initiator closes actively and the receiver closes passively, as shown in Figure 3.2.
Figure 3.2 The active/passive closing of a normal TCP connection consists of six states.
FIN-WAIT-1: The state a connection is in after it has sent an initial FIN packet asking for a graceful close of the TCP connection.
CLOSE-WAIT: The state a host's connection is in after it receives an initial FIN and sends back an ACK to acknowledge the FIN.
FIN-WAIT-2: The connection state of the host that has received the ACK response to its initial FIN, as it waits for a final FIN from its connection partner.
LAST-ACK: The state of the host that just sent the second FIN needed to gracefully close the TCP connection back to the initiating host while it waits for an acknowledgment.
TIME-WAIT: The state of the initiating host that received the final FIN and has sent an ACK to close the connection. Because it will not receive an acknowledgment of its sent ACK from the connection partner, it has to wait a given time period before closing (hence the name TIME-WAIT), so that the other party has sufficient time to receive the ACK packet before the host leaves this state.
CLOSING: A state that is employed when a connection uses the less common simultaneous close. The connection is in this state after receiving an initial FIN and sending an ACK. After receiving an ACK back for its FIN, the connection will go into the TIME-WAIT state (see Figure 3.3).
The amount of time the TIME-WAIT state is defined to pause is equal to the Maximum Segment Lifetime (MSL), as defined for the TCP implementation, multiplied by two. This is why this state is also called 2MSL.
Figure 3.3 The simultaneous close of a TCP connection, where both parties close actively, consists of six states.
You can determine the state of the TCP connection by checking the flags being carried by the packets, as alluded to by the various descriptions of the TCP states. The tracking of this flag information, in combination with the IP address/port address information for each of the communicating parties, can paint a pretty good picture of what is going on with the given connection. The only other pieces of the puzzle that might be needed for clarity are the sequence and acknowledgment numbers of the packets. This way, if packets arrive out of order, the dialog flow of the communication can be more easily discerned, and the use of replay attacks against a device tracking state will be less likely to succeed.
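The states and flag exchanges listed earlier can be collected into a small transition table. The sketch below follows one endpoint of a connection. The event labels (such as "send SYN") are simplified placeholders chosen for this example rather than parsed packets, and the table omits resets and retransmissions.

```python
# (current state, event) -> next state, for one endpoint of a connection.
# Event labels are simplified for illustration; no real packets are parsed.
TRANSITIONS = {
    ("CLOSED", "send SYN"): "SYN-SENT",
    ("LISTEN", "recv SYN"): "SYN-RCVD",          # replies with SYN-ACK
    ("SYN-SENT", "recv SYN-ACK"): "ESTABLISHED",  # replies with ACK
    ("SYN-RCVD", "recv ACK"): "ESTABLISHED",
    ("ESTABLISHED", "send FIN"): "FIN-WAIT-1",    # active close
    ("ESTABLISHED", "recv FIN"): "CLOSE-WAIT",    # passive close
    ("FIN-WAIT-1", "recv ACK"): "FIN-WAIT-2",
    ("FIN-WAIT-1", "recv FIN"): "CLOSING",        # simultaneous close
    ("FIN-WAIT-2", "recv FIN"): "TIME-WAIT",
    ("CLOSE-WAIT", "send FIN"): "LAST-ACK",
    ("LAST-ACK", "recv ACK"): "CLOSED",
    ("CLOSING", "recv ACK"): "TIME-WAIT",
    ("TIME-WAIT", "2MSL timeout"): "CLOSED",
}


def walk(start, events):
    """Follow a sequence of events from a starting state."""
    state = start
    for event in events:
        # A KeyError here means the event is not valid in the current state.
        state = TRANSITIONS[(state, event)]
    return state
```

A stateful device could use a table like this to reject packets whose flags make no sense for the connection's current state, since any unexpected event has no entry to follow.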
Entries for TCP communication sessions in a state table are removed when the connection is closed. To prevent connections that are improperly closed from remaining in the state table indefinitely, timers are also used. While the three-way handshake is transpiring, the initial timeout value used is typically short (under a minute), so network scans and the like are more quickly cleared from the state table. The value is lengthened considerably (to as long as an hour or more) after the connection is established, because a properly initiated session is more likely to be gracefully closed.
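The two-tier timeout can be sketched as follows. The specific values (30 seconds and 3,600 seconds) are assumptions picked to match "under a minute" and "an hour", and the entry layout (a state name paired with a last-seen timestamp) is hypothetical.

```python
HANDSHAKE_TIMEOUT = 30.0      # assumed value; the text says "under a minute"
ESTABLISHED_TIMEOUT = 3600.0  # assumed value; the text says "an hour or more"


def is_expired(state: str, last_seen: float, now: float) -> bool:
    """An entry expires once it has been idle longer than its state allows."""
    limit = ESTABLISHED_TIMEOUT if state == "ESTABLISHED" else HANDSHAKE_TIMEOUT
    return now - last_seen > limit


def purge(entries: dict, now: float) -> dict:
    """Return only the entries that are still within their timeout."""
    return {key: (state, seen) for key, (state, seen) in entries.items()
            if not is_expired(state, seen, now)}
```

With these values, a half-open entry left behind by a port scan disappears after 30 idle seconds, while an established session survives for up to an hour of inactivity.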
It would seem from what we have just covered that the state of any TCP connection is easily definable, concrete, and objective. However, when you're tracking the overall communication session, these rules might not always apply. What if an application that employs nonstandard communication techniques was being used? For example, as discussed in Chapter 2, standard FTP uses an atypical communication exchange when initializing its data channel. The states of the two individual TCP connections that make up an FTP session can be tracked in the normal fashion. However, the state of the FTP connection obeys different rules. For a stateful device to be able to correctly pass the traffic of an FTP session, it must be able to take into account the way that standard FTP uses one outbound connection for the control channel and one inbound connection for the data channel. We will cover this issue in greater detail in the "Application-Level Traffic and State" section, later in this chapter.
UDP and State
Unlike TCP, UDP is a connectionless transport protocol. This makes the tracking of its state a much more complicated process. In actuality, a connectionless protocol has no state; therefore, a stateful device must track a UDP connection in a pseudo-stateful manner, keeping track of items specific to its connection only. Because UDP has no sequence numbers or flags, the only items on which we can base a session's state are the IP addressing and port numbers used by the source and destination hosts. Because the ephemeral ports are at least somewhat random, and they differ for any connection coming from a given IP address, this adds a little bit of credence to this pseudo-stateful method of session tracking. However, because the UDP session is connectionless, it has no set method of connection teardown that announces the session's end. Because of this lack of a defined ending, a state-tracking device will typically be set up to clear a UDP session's state table entries after a preconfigured timeout value (usually a minute or less) is reached. This prevents entries from filling the table.
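A pseudo-stateful UDP tracker of this kind might look like the sketch below. The class name UdpTracker, the 60-second timeout, and the choice to refresh the timer on each reply are all assumptions made for illustration.

```python
UDP_TIMEOUT = 60.0  # assumed value; the text says "a minute or less"


class UdpTracker:
    def __init__(self):
        # (src_ip, src_port, dst_ip, dst_port) -> time the flow was last seen
        self._last_seen = {}

    def outbound(self, src_ip, src_port, dst_ip, dst_port, now):
        """Record (or refresh) the pseudo-session for an outbound datagram."""
        self._last_seen[(src_ip, src_port, dst_ip, dst_port)] = now

    def inbound_allowed(self, src_ip, src_port, dst_ip, dst_port, now):
        """Allow a reply only if it mirrors a recent outbound datagram."""
        key = (dst_ip, dst_port, src_ip, src_port)  # mirror of the outbound tuple
        seen = self._last_seen.get(key)
        if seen is None or now - seen > UDP_TIMEOUT:
            self._last_seen.pop(key, None)  # clear a stale entry, if any
            return False
        self._last_seen[key] = now  # assumption: a reply refreshes the timer
        return True
```

Because there are no flags or sequence numbers to check, the address and port pairing plus the timer is all the tracker has to go on.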
Another point of concern with UDP traffic is that because it cannot correct communication issues on its own, it relies entirely on ICMP as its error handler, making ICMP an important part of a UDP session to be considered when tracking its overall state.
For example, what if during a UDP communication session a host can no longer keep up with the speed at which it is receiving packets? UDP offers no method of letting the other party know to slow down transmission. The receiving host can, however, send an ICMP source quench message to ask the sending host to slow down. If the firewall blocks this message because it is not part of the normal UDP session, the sending host never learns that an issue has come up and continues to send at the same speed, resulting in lost packets at the receiving host. Stateful firewalls must consider such "related" traffic when deciding what traffic should be returned to protected hosts.
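One way a device can decide that an ICMP error is "related" relies on the fact that ICMP error messages quote the IP header and at least the first 8 bytes of the datagram that triggered them, which for UDP includes the port numbers. The sketch below assumes those quoted fields have already been parsed into a dictionary. The function name and the dictionary layout are hypothetical.

```python
def icmp_error_is_related(quoted_header: dict, udp_sessions: set) -> bool:
    """Check the datagram quoted inside an ICMP error against known sessions.

    quoted_header holds the addresses and ports of the original outbound
    datagram as quoted in the ICMP error (parsing is assumed to be done).
    udp_sessions holds (src_ip, src_port, dst_ip, dst_port) tuples for
    outbound sessions the device is currently tracking.
    """
    key = (quoted_header["src_ip"], quoted_header["src_port"],
           quoted_header["dst_ip"], quoted_header["dst_port"])
    return key in udp_sessions
```

An error that quotes a datagram the firewall never saw leave the network matches nothing and can be dropped.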
ICMP and State
ICMP, like UDP, really isn't a stateful protocol. However, like UDP, it also has attributes that allow its connections to be pseudo-statefully tracked. The more complicated part of tracking ICMP involves its one-way communications. The ICMP protocol is often used to return error messages when a host or protocol can't do so on its own, in what can be described as a "response" message. ICMP response-type messages are precipitated by requests by other protocols (TCP, UDP). Because of this multiprotocol issue, figuring ICMP messages into the state of an existing UDP or TCP session can be confusing to say the least. The other, easier-to-track way in which ICMP is used is in a request/reply-type usage. The most popular example of an application that uses this request/reply form is ping. It sends echo requests and receives echo reply messages. Obviously, because the given stimulus in these cases produces an expected response, the session's state becomes less complicated to track. However, instead of being tracked based on source and destination addresses, the ICMP message can be tracked on request message type and reply message type. This tracking method is about the only way ICMP can enter into a state table.
Another issue with ICMP is that, like UDP, it is connectionless; therefore, it must base the retention of a state table entry on a predetermined timeout because ICMP also does not have a specific mechanism to end its communication sessions.
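Request/reply tracking by message type can be sketched as below. The type numbers are the standard ICMP values (echo request 8 and echo reply 0, timestamp request 13 and timestamp reply 14). The IcmpTracker class is hypothetical, and the timeout handling described above is left out to keep the example short.

```python
# ICMP request type -> the reply type it is expected to produce.
REPLY_FOR_REQUEST = {8: 0, 13: 14}  # echo, timestamp


class IcmpTracker:
    def __init__(self):
        self._pending = set()  # (reply_src_ip, reply_dst_ip, reply_type)

    def outbound_request(self, src_ip, dst_ip, icmp_type):
        """Record the reply we now expect; ignore non-request types."""
        if icmp_type in REPLY_FOR_REQUEST:
            self._pending.add((dst_ip, src_ip, REPLY_FOR_REQUEST[icmp_type]))

    def inbound_allowed(self, src_ip, dst_ip, icmp_type):
        """Allow only the reply type that matches an outstanding request."""
        return (src_ip, dst_ip, icmp_type) in self._pending
```

After an inside host pings an outside address, the only ICMP message admitted from that address is an echo reply; an unsolicited echo request from the same address is refused.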
Application-Level Traffic and State
We have covered in some detail the ways that state can be tracked at the transport and network protocol levels; however, things change when you are concerned about the state of the entire session. When a stateful device is deciding which traffic to allow into the network, application behaviors must be taken into account to verify that all session-related traffic is properly handled. Because the application might follow different rules for communication exchanges, it might change the way that state has to be considered for that particular communication session. Let's look at an application that uses a standard communication style (HTTP) and one that handles things in a nonstandard way (FTP).
HTTP and State
HTTP is one of the main protocols used for web access, and it's the most commonly used protocol on the Internet today. It uses TCP as its transport protocol, and its session initialization follows the standard way that TCP connections are formed. Look at the following tcpdump trace:
21:55:46.1 Host.1096 > maverick.giac.org.80: S 489703169:489703169(0) win 16384 <mss 1460,nop,nop,sackOK> (DF)
21:55:46.2 maverick.giac.org.80 > Host.1096: S 3148360676:3148360676(0) ack 489703170 win 5840 <mss 1460,nop,nop,sackOK> (DF)
21:55:46.5 Host.1096 > maverick.giac.org.80: . ack 1 win 17520 (DF)
This tcpdump trace shows the three-way handshake between a contacting client named Host and the SANS GIAC web server, Maverick. It is a standard TCP connection establishment in all aspects.
The following packet lists the first transaction after the TCP connection was established. Notice that in the payload of the packet, the GET / HTTP/1.1 statement can be clearly made out (we truncated the output for display purposes):
21:55:46.6 Host.1096 > maverick.giac.org.80: P 1:326(325) ack 1 win 17520 (DF)
E..m."@....6.....!...H.P.0G...+.P.Dpe$..GET./.HTTP/1.1..Accept:.image/gif,.image
This is the first HTTP command that a station issues to receive a web page from a remote source.
Let's look at the next packet, which is truncated for display purposes:
21:55:46.8 maverick.giac.org.80 > Host.1096: P 1:587(586) ack 326 win 6432 (DF)
E..r..@.2..6.!.......P.H..+..0HGP.......HTTP/1.1.301.Moved.Permanently..
Date:.Wed,.06.Feb.2002.02:56:03.GMT..Server:.Apache..Location:
.http://www.sans.org/newlook/home.php..Kee
Notice that this reply packet carries the server's response, here a 301 Moved Permanently redirect that points the client to the home page at http://www.sans.org/newlook/home.php. As shown in the preceding example, protocols such as HTTP that follow a standard TCP flow allow an easier definition of the overall session's state. Because it uses a single established connection from the client to the server and because all requests are outbound and responses inbound, the state of the connection doesn't differ much from what would be commonly tracked with TCP. If tracking only the state of the TCP connection in this example, a firewall would allow the HTTP traffic to transpire as expected. However, there is merit in also tracking the application-level commands being communicated. We cover this topic more in the section "Problems with Application-Level Inspection," later in this chapter. Next, we look at how this scenario changes when dealing with applications that use a nonstandard communication flow, such as standard FTP traffic.
File Transfer Protocol and State
File Transfer Protocol (FTP) is a popular means to move files between systems, especially across the Internet. FTP in its standard form, however, behaves quite differently from most other TCP protocols. This strange two-way connection establishment also brings up some issues with the tracking of state of the entire connection. Would a firewall that only tracks the state of the TCP connections on a system be able to pass standard FTP traffic? As seen in Chapter 2, the answer is no. A firewall cannot know to allow in the SYN packet that establishes an FTP data channel if it doesn't take into account the behavior of FTP. For a stateful firewall to be able to truly facilitate all types of TCP connections, it must have some knowledge of the application protocols being run, especially those that behave in nonstandard ways.
When the application-level examination capabilities of a stateful inspection system are being used, a complicated transaction like that used by a standard FTP connection can be dissected and handled in an effective and secure manner.
The stateful firewall begins by examining all outbound traffic and paying special attention to certain types of sessions. As we know from Chapter 2, an FTP control session can be established without difficulty; it is the inbound data-channel initialization that is problematic. Therefore, when a stateful firewall sees that a client is initializing an outbound FTP control session (using TCP port 21), it knows to expect the server being contacted to initiate an inbound data channel on TCP port 20 back to the client. The firewall can dynamically allow an inbound connection from the IP address of the server on port 20 to the address of the client. However, for utmost security, the firewall should also specify the port on which the client will be contacted for this exchange.
The firewall discovers on which port the client is contacted through the use of application inspection. Despite the fact that every other piece of information we have needed thus far in this exchange has been Layer 4 or lower, the port number used by the server initializing the data channel is actually sent to it in an FTP port command from the client. Therefore, by inspecting the traffic flow between client and server, the firewall also picks up the port information needed for inbound data channel connection. This process is illustrated in Figure 3.4.
Figure 3.4 The stateful firewall examines the FTP port command to determine the destination port for the establishment of the FTP data channel.
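The port command carries the client's address and port as six comma-separated decimal numbers: four for the IP address and two for the port, where the port equals the fifth number times 256 plus the sixth. The following sketch shows how an inspecting device might extract that information. The function names and the tuple returned by expected_data_channel are hypothetical, and only the classic PORT form is handled.

```python
import re

# "PORT h1,h2,h3,h4,p1,p2" as sent on the FTP control channel.
PORT_RE = re.compile(
    rb"^PORT (\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3})\r?\n?$"
)


def parse_port_command(line: bytes):
    """Return (client_ip, client_port), or None if this is not a PORT command."""
    match = PORT_RE.match(line)
    if match is None:
        return None
    numbers = [int(group) for group in match.groups()]
    if any(n > 255 for n in numbers):
        return None  # each field must fit in one byte
    client_ip = ".".join(str(n) for n in numbers[:4])
    client_port = numbers[4] * 256 + numbers[5]
    return client_ip, client_port


def expected_data_channel(server_ip: str, line: bytes):
    """Describe the inbound connection to allow: server port 20 to the client."""
    parsed = parse_port_command(line)
    if parsed is None:
        return None
    client_ip, client_port = parsed
    return (server_ip, 20, client_ip, client_port)
```

For instance, the command "PORT 10,0,0,5,4,73" names address 10.0.0.5 and port 4 × 256 + 73 = 1097, so the firewall would expect the server to connect from its port 20 to 10.0.0.5 port 1097 and to no other port.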
Multimedia Protocols and the Stateful Firewall
Multimedia protocols work similarly to FTP through a stateful firewall, just with more connections and complexity. The widespread use of multimedia communication types, such as H.323, Real Time Streaming Protocol (RTSP), CUSeeME, Microsoft's NetShow, and more, has demanded a secure means to allow such traffic to pass into the networks of the world.
All these protocols rely on at least one TCP control channel to communicate commands and one or more channels for multimedia data streams running on TCP or UDP. The control channels are monitored by the stateful firewall to receive the IP addresses and port numbers used for the multimedia streams. This address information is then used to open secure conduits to facilitate the media streams' entrance into the network, as shown in Figure 3.5.
Figure 3.5 The stateful firewall tracks the multimedia protocol's communication channel to facilitate the passing of incoming media streams.
Stateful firewalls now allow the use of multistream multimedia applications, such as H.323, in conjunction with Port Address Translation (PAT). In the not-so-distant past, this was a long-time problem with multistream protocols because the multiple ports used per connection could easily conflict with the PAT translation's port dispersal.
Problems with Application-Level Inspection
Despite the fact that many stateful firewalls by definition can examine application layer traffic, holes in their implementation prevent stateful firewalls from being a replacement for proxy firewalls in environments that need the utmost in application-level control. The main problems with the stateful examination of application-level traffic involve the abbreviated examination of application-level traffic and the lack of thoroughness of this examination, including the firewall's inability to track the content of said application flow.
To provide better performance, many stateful firewalls abbreviate examinations by performing only an application-level examination of the packet that initiates a communication session, which means that all subsequent packets are tracked through the state table using Layer 4 information and lower. This is an efficient way to track communications, but it lacks the ability to consider the full application dialog of a session. In turn, any deviant application-level behavior after the initial packet might be missed, and there are no checks to verify that proper application commands are being used throughout the communication session.
However, because the state table entry will record at least the source and destination IP address and port information, whatever exploit was applied would have to involve those two communicating parties and transpire over the same port numbers. Also, the connection that established the state table entry would not be properly terminated, or the entry would be instantly cleared. Finally, whatever activity transpired would have to take place in the time left on the timeout of the state table entry in question. Making such an exploit work would take a determined attacker or involve an accomplice on the inside.
Another issue with the way stateful inspection firewalls handle application-level traffic is that they typically watch traffic more so for triggers than for a full understanding of the communication dialog; therefore, they lack full application support. As an example, a stateful device might be monitoring an FTP session for the port command, but it might let other non-FTP traffic pass through the FTP port as normal. Such is the nature of a stateful firewall; it is most often reactive and not proactive. A stateful firewall simply filters on one particular command type on which it must act rather than considering each command that might pass in a communication flow. Such behavior, although efficient, can leave openings for unwanted communications types, such as those used by covert channels or those used by outbound devious application traffic.
In the previous example, we considered that the stateful firewall watches diligently for the FTP port command, while letting non-FTP traffic traverse without issue. For this reason, it would be possible in most standard stateful firewall implementations to pass traffic of one protocol through a port that was being monitored at the application level for a different protocol. For example, if you are only allowing HTTP traffic on TCP port 80 out of your stateful firewall, an inside user could run a communication channel of some sort (that uses a protocol other than the HTTP protocol) to an outside server listening for such communications on port 80.
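To make the gap concrete, here is a rough sketch of the kind of check a device would need in order to notice non-HTTP traffic on port 80: verifying that the first bytes of the client's payload form an HTTP request line. The function name is hypothetical and the test is deliberately shallow; a true proxy would validate far more of the dialog.

```python
import re

# An HTTP/1.x request line: method, target, version, then CRLF.
REQUEST_LINE = re.compile(
    rb"^(GET|HEAD|POST|PUT|DELETE|OPTIONS|TRACE|CONNECT) \S+ HTTP/1\.[01]\r\n"
)


def looks_like_http_request(payload: bytes) -> bool:
    """True if the payload begins with a plausible HTTP/1.x request line."""
    return REQUEST_LINE.match(payload) is not None
```

A tunnel that opens with some other protocol's banner fails this check immediately, whereas a firewall that only consults its Layer 4 state table would pass the same packets without complaint. A tunnel that imitates HTTP request lines would still get through, which is why content-level controls are discussed separately.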
Another potential issue with a stateful firewall is its inability to monitor the content of allowed traffic. For example, because you allow HTTP and HTTPS out through your firewall, it would be possible for an inside user to contact an outside website service such as http://www.gotomypc.com. This website offers users the ability to access their PC from anywhere via the web. The firewall will not prevent this access, because their desktop will initiate a connection to the outside Gotomypc.com server via TCP port 443 using HTTPS, which is allowed by your firewall policy. Then the user can contact the Gotomypc.com server from the outside and it will "proxy" the user's access back to his desktop via the same TCP port 443 data flow. The whole communication will transpire over HTTPS. The firewall won't be able to prevent this obvious security breach because the application inspection portion of most stateful firewalls really isn't meant to consider content. It is looking for certain trigger-application behaviors, but most often (with some exceptions) not the lack thereof. In the case of http://www.gotomypc.com, application-level inspection has no means to decipher that this content is inappropriate.
Despite the fact that standard stateful examination capabilities of most such firewalls could not catch deviant traffic flows such as the covert channels based on commonly open ports, many vendors also offer content filtering or Deep Packet Inspection features on their stateful firewall products to prevent such issues. FireWall-1 and the PIX both offer varying levels of content filtering, for example. However, such features are often not enabled by default or need to be purchased separately and must be configured properly to be effective.
Another popular example of traffic that sneaks out of many otherwise secure networks involves programs such as AOL Instant Messenger, Kazaa, and other messaging and peer-to-peer file-sharing programs. These programs have the potential to transmit through any port, and because most stateful firewalls have at least one outbound port open, they will find their way out. Like the aforementioned "covert channel" example, the standard stateful firewall does not differentiate this type of traffic; it allows the traffic to pass as long as it is using one of the available ports. Content-level filtering or a true proxy firewall that considers all application-level commands could be used to prevent such traffic.
Deep Packet Inspection
Some of the biggest problems security professionals face today are allowed through their firewalls by design. As mentioned in the previous section, covert channels, nefarious content traversing known ports, and even malicious code carried on known protocols are some of the most damaging security threats your business will be exposed to. It is true that even a defense mechanism as simple as a packet filter could block most of these threats if you blocked the port they were carried on, but the real issue is that they travel over protocols you want to allow into your network and are required for your business! For example, many of the most widespread worms travel over NetBIOS, HTTP, or SQL-related protocols, all of which can be an important part of your Internet or network business. Obviously, it is not good form to allow NetBIOS or SQL into your network from the Internet, but if an attack is launched from an email attachment received at a user's PC, it is very likely that you might allow these protocols to traverse security zones on your network. How can we prevent issues carried by protocols that our businesses require to function? The answer is Deep Packet Inspection.
Deep Packet Inspection devices are concerned with the content of the packets. The term Deep Packet Inspection is actually a marketing buzzword that was recently coined for technology that has been around for some time; content examination is not something new. Antivirus software has been doing it at the host and mail server level, and network IDSs have been doing it on the wire for years. However, these products have limited visibility and capability to deal with the malicious payloads they find. A major disadvantage of content filtering at these levels is that the worm, Trojan horse, or malicious packet has already entered your network perimeter. Firewalls offering Deep Packet Inspection technology have the ability to detect and drop packets at the ingress point of the network. What more appropriate place to stop malicious traffic than at the firewall?
In the past, you have been able to use router or firewall content-filtering technologies to enter the signature of a worm or other malicious event and block it at the exterior of your network. However, what newer Deep Packet Inspection devices bring to the table are preloaded signatures, similar to those used by an antivirus solution. This way, your firewall is aware of and able to detect and remove malicious content as it arrives at your network. Also, because the packet's content is being considered at the application layer, traffic anomalies representative of an attack or worm can also be considered and filtered even if a specific signature isn't available for it. For example, if some attack uses a command that is considered nonstandard for a particular protocol, the device doing Deep Packet Inspection would be able to recognize it and drop the malicious content.
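Both ideas, signature matching and protocol anomaly detection, can be illustrated with a toy inspector. Everything specific in this sketch is made up for the example: the signature names and byte patterns do not correspond to any real worm, and the allowed-command list is a small subset of FTP commands chosen only to show the anomaly check.

```python
# Hypothetical signatures: names and byte patterns are invented for illustration.
SIGNATURES = {
    "example-worm-a": b"EXAMPLE-WORM-PAYLOAD",
    "example-exploit-b": b"\x90" * 16,
}

# Commands this sketch treats as standard (an assumed, partial FTP list).
ALLOWED_COMMANDS = {b"USER", b"PASS", b"PORT", b"RETR", b"STOR", b"LIST", b"QUIT"}


def inspect(payload: bytes) -> str:
    """Return a drop verdict with its reason, or "pass"."""
    # Signature check: look for any known-bad byte pattern in the payload.
    for name, pattern in SIGNATURES.items():
        if pattern in payload:
            return f"drop: signature {name}"
    # Anomaly check: the first word must be a command we consider standard.
    command = payload.split(b" ", 1)[0].strip().upper()
    if command not in ALLOWED_COMMANDS:
        return "drop: nonstandard command"
    return "pass"
```

The anomaly branch is what lets a device stop an attack for which no signature exists yet, provided the attack depends on a command outside the expected set.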
The Deep Packet Inspection technology used in many popular firewall solutions is very similar to the content examination capabilities inherent in Intrusion Prevention Systems (IPSs). However, despite the fact that the technology is similar, the firewall-based solutions lack the volume of signatures and the thoroughness of analysis that a true IPS offers. Firewall-based Deep Packet Inspection could be considered "IPS-Lite." For more information on IPS, take a look at Chapter 11, "Intrusion Prevention Systems."
A Deep Packet Inspection firewall is responsible for performing many simultaneous functions. The entire content of a packet's application layer information needs to be reviewed against a list of attack signatures as well as for anomalous traffic behaviors. These firewalls also have to perform all the standard functions a stateful firewall typically handles. Therefore, advanced hardware is required to perform all these processes in a timely manner. This advanced hardware integration (typically dedicated "processors" just for this task) is what has set Deep Packet Inspection firewalls apart from their predecessors. It enables the swift processing and removal of anomalous traffic, with the added advantage of the stateful firewall's perspective on the overall communication flow of the network. This offers a major edge when determining which traffic is malicious and which is not.
It is important to remember that for Deep Packet Inspection to work on SSL encrypted traffic flows, some means to decrypt the traffic must be employed. SSL certificates must also be loaded on the Deep Packet Inspection device and SSL flows must be decrypted, reviewed, and reencrypted before they are sent on to their destination. This process will cause some network latency and requires additional processing power to achieve efficient communications.
Most vendors are either already offering or are considering offering solutions that incorporate this type of Deep Packet Inspection technology. Major vendors, including Check Point, Cisco, and Juniper, are using some form of Deep Packet Inspection in their products and are constantly advancing it to help handle the new attacks that arrive at our networks on a daily basis.
As heavy-hitting worms such as SQL-Slammer, Blaster, Code-Red, and Welchia pound on our networks, transported via protocols that we use on a regular basis, the need for devices that consider the content of packets as well as their application becomes more and more urgent. Deep Packet Inspection is an excellent method to shut down some of the most used attack vectors exploited by malicious content today.