Linux Firewalls: Packet Filtering
A small site may have Internet access through a T1 line, a cable modem, DSL, ISDN, or, often, a PPP connection to a dial-up account over a phone line. The computer connected directly to the Internet is the focus for security issues. Whether you have one computer or a local area network (LAN) of linked computers, the initial focus for a small site will be on the machine with the direct Internet connection. This machine will be the firewall machine.
The term firewall has a number of meanings depending on its implementation and purpose. At this opening point in the book, firewall means the Internet-connected machine. This is where your primary security policies for Internet access will be implemented. The firewall machine's external network interface card is the connection point, or gateway, to the Internet. The purpose of a firewall is to protect what's on your side of this gateway from what's on the other side.
A simple firewall setup is sometimes called a bastion firewall because it's the main line of defense against attack from the outside. All your security measures are mounted from this one defender of your realm. Consequently, everything possible is done to protect this system. It's your one and only bastion of defense.
Behind this line of defense is your single computer or your group of computers. The purpose of the firewall machine might simply be to serve as the connection point to the Internet for other machines on your LAN. You might be running local, private services behind this firewall, such as a shared printer or shared file systems. Or, you might want all your computers to have access to the World Wide Web. One of your machines might host your private financial records. You might want to have Internet access from this machine, but you won't want anyone getting in. At some point, you might want to offer your own services to the Internet. One of the machines might be hosting your own web site for the Internet. Another might function as your mail server or gateway. Your setup and goals will determine your security policies.
The firewall's purpose is to enforce the security policies you define. These policies reflect the decisions you've made about which Internet services you want to be accessible to your computers, which services you want to offer the world from your computers, which services you want to offer to specific remote users or sites, and which services and programs you want to run locally for your own private use. Security policies are all about access control and authenticated use of private or protected services, programs, and files on your computers.
Home and small business systems don't face all the security issues of a larger corporate site, but the basic ideas and steps are the same. There just aren't as many factors to consider, and security policies often are less stringent than those of a corporate site. The emphasis is on protecting your site from unwelcome access from the Internet. (A corporate site would emphasize internal security, which isn't much of an issue for a home-based site.) A packet-filtering firewall is one common approach to network security, and one piece of it, for controlling access to and from the outside.
Something to keep in mind is that the Internet paradigm is based on the premise of end-to-end transparency. The networks between the two communicating machines are intended to be invisible. In fact, if a network device somewhere along the path fails, the idea is that traffic between the two endpoint machines will be silently re-routed.
Ideally, firewalls should be transparent. Nevertheless, they break the Internet paradigm by introducing a single point of failure within the networks between the two endpoint machines. Additionally, not all network applications use communication protocols that are easily passed through a simple packet-filtering firewall. It isn't possible to pass certain traffic through a firewall without additional application support or more sophisticated firewall technology.
Further complicating the issue has been the introduction of dynamically allocated IP addresses (DHCP and IPCP) and network address translation (NAT, or masquerading in Linux parlance). Both technologies, in their current form, were primarily developed as stopgaps to alleviate the global IP address shortage problem. As such, both technologies are widely used, even though they make some types of network traffic difficult, impossible, complex, or expensive.
A final complication has been the proliferation of multimedia and peer-to-peer protocols used in real-time communication software and in popular networked games. These protocols are antithetical to today's firewall technology. Today, specific software solutions must be built and deployed for each application protocol. Firewall architectures for handling these protocols easily and economically are still being worked out in the standards committees' working groups.
It's important to keep in mind that the combination of firewalling, DHCP, and NAT introduces complexities for which practical solutions don't yet exist or aren't generally economically feasible for small sites. Home sites often have to compromise system security to some extent to use the network services the home users want. Small businesses often have to deploy multiple LANs and more complex network configurations to meet the varying security needs of the individual local hosts.
Before going into the details of developing a firewall, this chapter introduces the basic underlying concepts and mechanisms on which a packet-filtering firewall is based. These concepts include a general frame of reference for what network communication is, how network-based services are identified, what a packet is, and the types of messages and information sent between computers on a network.
The TCP/IP Reference Networking Model
In order to provide a framework for this chapter and for the rest of the book, I'm going to use a few terms before they're defined in the following sections of this chapter. The definitions are old hat to computer science and networking people, but they might be new to less technically inclined people. If this is all new to you, don't worry. Right now, I'm just trying to give you a conceptual place on which to hang the upcoming definitions so that they make more sense.
If you've had formal academic training in networking, you're familiar with the Open Systems Interconnection (OSI) reference model. The OSI reference model was developed in the late 1970s and early 1980s to provide a framework for network interconnection standards. The OSI model is a formal, careful, academic model. Textbooks and academicians use this model as their conceptual framework when talking about networking.
Networking was just taking off in the late 1970s and early 1980s, so the world functioned without a standardized model during the seven years the OSI reference model was being hammered out. As TCP/IP became the de facto standard for Internet communication between UNIX machines during this time, a second, informal model called the TCP/IP reference model developed. Rather than being an academic ideal, the TCP/IP reference model is based on what manufacturers and developers finally came to agree upon for communication across the Internet. Because the model focuses on TCP/IP from a practical, real-world, developer's point of view, the model is simpler than the OSI model. So where OSI explicitly delineates seven layers, the TCP/IP model clumps them into four layers.
This book uses the TCP/IP reference model. As with most people with a computer science background, I tend to mix and match vocabulary, but I map the OSI model onto the TCP/IP conceptual model.
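As a rough illustration of that mapping, the sketch below folds the seven OSI layers into the four TCP/IP layers used in this book. The dictionary and function names are my own, chosen only for this example, and the layer labels follow the vocabulary of this chapter.

```python
# Illustrative only: the conventional mapping of the seven OSI layers onto
# the four layers of the TCP/IP reference model. Names here are assumptions
# made for this sketch, not part of any standard library.
OSI_TO_TCPIP = {
    "application": "application",
    "presentation": "application",
    "session": "application",
    "transport": "transport",
    "network": "network (Internet)",
    "data link": "subnet (link)",
    "physical": "subnet (link)",
}


def tcpip_layers():
    """Return the distinct TCP/IP layers, ordered from top to bottom."""
    layers = []
    for layer in OSI_TO_TCPIP.values():
        if layer not in layers:
            layers.append(layer)
    return layers


print(tcpip_layers())
```

The top three OSI layers collapse into the single application layer, and the bottom two collapse into the subnet layer.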
Network communication is conceptualized as a layered model, with communication taking place between adjacent layers on an individual computer, and between parallel layers on communicating computers. The program you're running (for example, your web browser) is at the top, at the application layer, talking to another program on another computer (for example, a web server).
In order for your web browser client application to send a request for a web page to the web server application, it has to make library and system calls that take the information from the web browser and encapsulate it in a message suitable for transport between the two programs across the network. These messages are either transport-layer TCP segments or UDP datagrams. To construct these messages, the application layer calls the transport layer to provide this service. The transport-layer messages are sent between the web browser client and the web server. The transport layer knows how to deliver messages between a program on one computer and a program on the other end of the network. Both the OSI model and the TCP/IP model call this layer the transport layer. Where the two models differ is above it: the OSI model divides what the TCP/IP model calls the application layer into separate session, presentation, and application layers.
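To make the idea of transport-layer encapsulation concrete, here is a minimal sketch that wraps application data in a UDP header. UDP is used only because its header is short (source port, destination port, length, checksum). The function name and the port numbers are assumptions for the example, and the checksum is left at zero, which in IPv4 means "not computed". In practice the operating system builds this header for you.

```python
import struct


def udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix payload with an 8-byte UDP header (checksum left as 0)."""
    length = 8 + len(payload)  # header plus data, in bytes
    # "!" selects network byte order; each "H" is an unsigned 16-bit field.
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    return header + payload


# Hypothetical client port 40000 sending to a service on port 53.
datagram = udp_datagram(40000, 53, b"hello")
print(len(datagram), struct.unpack("!HHHH", datagram[:8]))
```

The two port numbers are the part of this header that a packet-filtering firewall cares about most.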
In order for these transport-layer messages to be delivered between the two programs, the messages have to be sent between the two computers. To do this, the transport layer calls functions in the operating system that take the TCP or UDP transport message and encapsulate it in an Internet datagram suitable for sending to the other computer. These datagrams are IP packets. The IP packets are sent between the two computers across the Internet. The Internet layer knows how to talk to the computer on the other end of the network. The TCP/IP reference model calls this layer the Internet layer. OSI vocabulary is commonly used for this layer, however, so it's more often called the network layer. The two names refer to the same layer.
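The same kind of sketch works one layer down. The example below places a transport message inside a minimal 20-byte IPv4 header with no options. The function names, the time-to-live of 64, and the addresses (taken from the ranges reserved for documentation) are assumptions for illustration. A real IP stack also handles identification, fragmentation, and options, which are simply zeroed or omitted here.

```python
import socket
import struct

IP_HEADER_FORMAT = "!BBHHHBBH4s4s"  # 20 bytes, no options


def checksum(data: bytes) -> int:
    """Ones-complement sum of 16-bit words, as used in the IPv4 header."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ip_packet(src_ip: str, dst_ip: str, protocol: int, message: bytes) -> bytes:
    """Encapsulate a transport-layer message in a minimal IPv4 header."""
    fields = [
        0x45,                      # version 4, header length 5 words
        0,                         # type of service
        20 + len(message),         # total length
        0, 0,                      # identification, flags/fragment offset
        64,                        # time to live (assumed value)
        protocol,                  # 6 = TCP, 17 = UDP
        0,                         # checksum placeholder
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
    ]
    header = struct.pack(IP_HEADER_FORMAT, *fields)
    fields[7] = checksum(header)   # fill in the real header checksum
    return struct.pack(IP_HEADER_FORMAT, *fields) + message


packet = ip_packet("192.0.2.1", "198.51.100.7", 17, b"\x00" * 8)
print(len(packet))
```

The source and destination addresses sit in the last eight bytes of the header.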
Beneath the network layer is the subnet, or link, layer. Here the IP packet is encapsulated once more, this time with an Ethernet header. At the subnet level, the message is called an Ethernet frame. From the TCP/IP point of view, the subnet layer is a clump of everything that happens to get the IP packet delivered to the next computer. This clump includes all the addressing and delivery details associated with routing the frame between the computers, from one router to the next, until the destination computer is finally found. This layer includes translating the network frame from one kind of network to another along the way. Most public networks today are Ethernet networks, but a packet can cross other network technologies along the way, ATM being the most common.
This clump includes the hardware, the physical wires connecting two computers, and the signals, the voltage changes representing the individual bits of a frame, and the control information required to frame an individual byte. It includes the routing, signaling and security protocols sent between routers. It includes the mechanism to translate the IP address to the network card's hardware MAC address. And so on. None of this is visible to the firewall, usually.
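For the Ethernet case, the framing step described above can be sketched as follows. Only the 14-byte Ethernet header is shown: destination address, source address, and a type field saying that an IPv4 packet follows. The preamble, minimum-size padding, and trailing frame check sequence that the network hardware adds are left out. The helper names and the locally administered MAC addresses are assumptions for the example.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # type value for an IPv4 payload


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' notation to 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def ethernet_frame(dst_mac: str, src_mac: str, ip_packet: bytes) -> bytes:
    """Prefix an IP packet with a 14-byte Ethernet header."""
    header = (
        mac_to_bytes(dst_mac)
        + mac_to_bytes(src_mac)
        + struct.pack("!H", ETHERTYPE_IPV4)
    )
    return header + ip_packet


# A placeholder 20-byte IP header stands in for a real packet.
frame = ethernet_frame("02:00:00:00:00:02", "02:00:00:00:00:01",
                       b"\x45" + b"\x00" * 19)
print(len(frame))
```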
The summary idea, as shown in Figure 1.1, is that the application layer represents communication between two programs. The transport layer represents how this communication is delivered between the two programs. Programs are identified by numbers called service ports. The network layer represents how this communication is carried between the two end computers. At the network layer, computers, or their individual network interface cards, are identified by unique numbers called IP addresses. The subnet layer represents how this communication is carried between each individual computer along the way. On an Ethernet network, these computer network interfaces are identified by numbers called Ethernet addresses, which you are probably familiar with as your network card's burned-in hardware MAC address.
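Going the other direction, the following sketch pulls the identifier for each layer back out of a raw frame: Ethernet addresses for the subnet layer, IP addresses for the network layer, and service ports for the transport layer. The sample frame is hand-built for the example with made-up addresses and ports, and its checksums are left at zero because this parser does not verify them. The function name `describe` is likewise an assumption.

```python
import socket
import struct


def mac_text(raw: bytes) -> str:
    """Format 6 raw bytes as 'aa:bb:cc:dd:ee:ff'."""
    return ":".join("%02x" % b for b in raw)


def describe(frame: bytes) -> dict:
    """Extract (source, destination) identifiers for each layer."""
    dst_mac, src_mac, _ethertype = struct.unpack("!6s6sH", frame[:14])
    ip_start = 14
    header_len = (frame[ip_start] & 0x0F) * 4  # IP header length in bytes
    src_ip, dst_ip = struct.unpack("!4s4s", frame[ip_start + 12:ip_start + 20])
    ports_at = ip_start + header_len           # TCP and UDP both start with ports
    src_port, dst_port = struct.unpack("!HH", frame[ports_at:ports_at + 4])
    return {
        "subnet": (mac_text(src_mac), mac_text(dst_mac)),
        "network": (socket.inet_ntoa(src_ip), socket.inet_ntoa(dst_ip)),
        "transport": (src_port, dst_port),
    }


# Hand-built sample: Ethernet header + IPv4 header + UDP header, no payload.
sample = (
    bytes.fromhex("020000000002") + bytes.fromhex("020000000001") + b"\x08\x00"
    + struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 0, 0, 64, 17, 0,
                  socket.inet_aton("192.0.2.1"),
                  socket.inet_aton("198.51.100.7"))
    + struct.pack("!HHHH", 40000, 53, 8, 0)
)
print(describe(sample))
```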
The unique IP addresses are carried end-to-end, from the source computer to the destination computer. The unique Ethernet or MAC addresses are passed between adjacent computers in the network. Each intermediate router along the way replaces the source Ethernet address with its own, and the destination Ethernet address with that of the next hop along the route.
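A small simulation may help fix the distinction. In the sketch below, a frame is reduced to its four addresses, and each router hop rewrites only the Ethernet pair. The class, the function names, and all of the addresses are invented for the example. A real router has a separate MAC address on each interface, so `router_mac` here stands for the address of the outgoing interface.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


def forward(frame: Frame, router_mac: str, next_hop_mac: str) -> Frame:
    """Rewrite the Ethernet addresses for the next link; IP is untouched."""
    return replace(frame, src_mac=router_mac, dst_mac=next_hop_mac)


def deliver(frame: Frame, hops: list) -> Frame:
    """Pass the frame through each (router_mac, next_hop_mac) pair in turn."""
    for router_mac, next_hop_mac in hops:
        frame = forward(frame, router_mac, next_hop_mac)
    return frame


start = Frame("02:00:00:00:00:01", "02:00:00:00:00:10",
              "192.0.2.1", "198.51.100.7")
hops = [
    ("02:00:00:00:00:11", "02:00:00:00:00:20"),  # first router to second
    ("02:00:00:00:00:21", "02:00:00:00:00:99"),  # second router to destination
]
arrived = deliver(start, hops)
print(arrived)
```

When the frame arrives, its IP addresses are the ones it started with, while both Ethernet addresses belong to the final link.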
Figure 1.1. TCP/IP reference model.
The next sections describe the information these layers pass among themselves that ends up being used by a packet-filtering firewall.