What Is a Baseline?
Sometimes when you talk to a seasoned system or network administrator, he'll tell you that he knows that something is wrong when things don't feel right. This isn't an admission of paranormal powers; it's just a shorthand method for explaining that these experts know how their system or network is supposed to behave and that it isn't acting like that now. These administrators have created a baseline for their environment. Not all of them have done it formally, but the ones who have will have gained significant added benefits.
A Baseline Defined
Several things make up a baseline, but at its heart, a baseline is merely a snapshot of your network the way it normally acts. The least effective form of a baseline is the "sixth sense" that you develop when you've been around something for a while. It seems to work because you notice aberrations subconsciously; you're used to the way things ought to be. Better baselines will be less informal and may include the following components:
Summarized network utilization data
Logs of work done on the network
Maps of the network
Records of equipment on the network and related configuration data
In Chapter 10, "Network Monitoring Tools," we discussed the ethereal network analyzer. This tool's capability to save capture files (or traces) enables you to maintain a history of your network. If the only traces you have saved represent your troubleshooting efforts, you won't have a very good picture of how your network behaves when nothing is wrong.
You also need to be aware that a lot of things will influence the contents of the traces you collect. Weekend vs. weekday; Monday or Friday vs. the rest of the week; and time of day are all examples of the kinds of factors that will affect your data. Running ethereal (or some other analyzer) at least three times a day, every day, and saving the capture file will give you a much clearer idea of how things normally work.
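One way to make regular captures routine is to let cron take them for you. The following is only a sketch: it assumes the text-mode version of ethereal (tethereal) is installed, that the interface of interest is eth0, and that traces live under /var/baseline/traces. The capture_file function, the directory, and the script path in the crontab line are made-up names for illustration; check your analyzer's manual page for the options your version supports.

```shell
#!/bin/sh
# Sketch: build a timestamped capture file name so each scheduled run
# is kept rather than overwritten. TRACE_DIR is a hypothetical location.
TRACE_DIR="${TRACE_DIR:-/var/baseline/traces}"

# capture_file HOST STAMP -> path such as
#   /var/baseline/traces/cherry-20240101-0900.cap
capture_file() {
    printf '%s/%s-%s.cap\n' "$TRACE_DIR" "$1" "$2"
}

# Example use (not run here): capture five minutes of traffic on eth0.
#   tethereal -i eth0 -a duration:300 \
#       -w "$(capture_file "$(hostname)" "$(date +%Y%m%d-%H%M)")"
#
# Hypothetical crontab entry for captures at 09:00, 13:00, and 17:00 daily:
#   0 9,13,17 * * * /usr/local/sbin/baseline-capture.sh
```

Putting the hostname and time in the file name makes it easy to pull up "Monday morning" or "Friday afternoon" traces later for comparison.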
Several tools can give you a quick look at your network's behavior: netstat, traceroute, ping, and even the contents of your system logs are all good sources of information.
The netstat tool can show you several important bits of information. Running it with the -M, -i, and -a switches is especially helpful. I typically add the -n switch to netstat as well; this switch turns off name resolution, which is a real boon if DNS is broken or IP addresses don't resolve back to names properly. The -i switch gives you interface-specific information:
[pate@cherry sgml]$ netstat -i
Kernel Interface table
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0 1500 0 0 0 0 0 39 0 0 0 BRU
lo 3924 0 36 0 0 0 36 0 0 0 LRU
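If you want to keep the error counters from this output as part of your baseline, a small filter will pull them out. This is a minimal sketch that assumes the twelve-column layout shown above; other netstat versions arrange the columns differently, so adjust the field numbers to match. The iface_errors name is invented for the example.

```shell
#!/bin/sh
# Sketch: read `netstat -i` output on stdin and print each interface with
# its receive and transmit error counts. Assumes the column layout
#   Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
# so RX-ERR is field 5 and TX-ERR is field 9.
iface_errors() {
    # Skip the two header lines; ignore anything that is not a full row.
    awk 'NR > 2 && NF >= 12 { print $1, "rx_err=" $5, "tx_err=" $9 }'
}
```

You might run it as netstat -i | iface_errors and append the result to a dated file. Error counts that suddenly start climbing are exactly the kind of aberration a baseline helps you spot.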
The -M switch gives information pertaining to masqueraded connections:
[pate@router pate]$ netstat -Mn
IP masquerading entries
prot expire source destination ports
tcp 59:59.96 192.168.1.10 188.8.131.52 1028 -> 80 (61002)
tcp 58:43.75 192.168.1.10 184.108.40.206 622 -> 22 (61001)
udp 16:37.72 192.168.1.10 220.127.116.11 1025 -> 53 (61000)
The -a switch gives connection-oriented output (this output has been abbreviated):
[pate@cherry pate]$ netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:6000 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN
udp 0 0 0.0.0.0:111 0.0.0.0:*
raw 0 0 0.0.0.0:1 0.0.0.0:* 7
raw 0 0 0.0.0.0:6 0.0.0.0:* 7
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags Type State I-Node Path
unix 1 [ ] STREAM CONNECTED 1332 /tmp/.X11-unix/X0
unix 1 [ ] STREAM CONNECTED 1330 /tmp/.X11-unix/X0
unix 0 [ ] DGRAM 440
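The list of listening ports is a handy thing to baseline, because a new listener usually means that someone installed or started something. Here is a sketch that reduces netstat -an output to a list of listening TCP ports and compares it to a saved copy. The function names and the baseline file (one port number per line) are assumptions for the example, and the filter only looks at lines whose protocol column is exactly "tcp".

```shell
#!/bin/sh
# Sketch: read `netstat -an` output on stdin and print the local port of
# every TCP socket in the LISTEN state, one per line, sorted numerically.
listening_ports() {
    awk '$1 == "tcp" && $6 == "LISTEN" { n = split($4, a, ":"); print a[n] }' |
        sort -n | uniq
}

# new_listeners BASELINE_FILE: print ports seen on stdin that do not
# appear in the saved baseline file (hypothetical, one port per line).
new_listeners() {
    # -x matches whole lines only, so a baseline entry of 80 will not
    # hide a new listener on 8080.
    listening_ports | grep -v -x -F -f "$1"
}
```

For example, netstat -an | listening_ports > ports.baseline saves the current state, and netstat -an | new_listeners ports.baseline later shows only what has been added since.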
The traceroute tool is especially important for servers that handle connections from disparate parts of the Internet. Setting up several traceroutes to different remote hosts can give you an indication of remote users' connection speeds to your server.
The ping tool can help you watch the performance of a local or remote network in much the same way that traceroute does. It does not give as much detail, but it requires less overhead.
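To turn ping into baseline data, you can record the average round-trip time on a schedule. The sketch below assumes your ping prints a summary line containing "min/avg/max" followed by an equals sign and slash-separated times, as many versions do; if yours prints something else, the pattern will need adjusting. The ping_avg name is made up for the example.

```shell
#!/bin/sh
# Sketch: read ping output on stdin and print the average round-trip time
# in milliseconds. Assumes a summary line of the form
#   rtt min/avg/max/mdev = 0.045/0.052/0.061/0.007 ms
# or
#   round-trip min/avg/max = 0.045/0.052/0.061 ms
ping_avg() {
    awk -F'=' '/min\/avg\/max/ {
        split($2, t, "/")      # t[1]=min, t[2]=avg, t[3]=max
        gsub(/ /, "", t[2])
        print t[2]
    }'
}
```

Running something like ping -c 5 -n somehost | ping_avg several times a day (with somehost replaced by a host you care about) builds up a picture of what "normal" latency looks like.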
When users connect to services on your hosts, they leave a trail through your log files. If you use a central logging host and a log reader to grab important entries, you can build a history of how often services are used and when they are most heavily utilized.
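A full log reader is more than we need to illustrate the idea. The sketch below simply counts how many entries a given service logged in each hour of the day. It assumes the traditional syslog line layout (month, day, time, host, then the service name with an optional process ID); the service_by_hour name is an assumption for the example.

```shell
#!/bin/sh
# Sketch: read syslog-style lines on stdin and count entries per hour for
# one service. Assumes the traditional layout
#   Mon DD HH:MM:SS host service[pid]: message
# service_by_hour SERVICE -> lines of "HH count", sorted by hour.
service_by_hour() {
    awk -v svc="$1" '
        {
            tag = $5
            sub(/\[.*/, "", tag)   # drop the "[pid]:" suffix
            sub(/:$/, "", tag)     # drop trailing colon when there is no pid
            if (tag == svc) count[substr($3, 1, 2)]++
        }
        END { for (h in count) print h, count[h] }
    ' | sort
}
```

Feeding it a day's worth of log lines, as in service_by_hour sshd < messages, shows at a glance when a service is busiest.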
You will likely find yourself touching a lot of the equipment on your network, so it is important that you keep good records of what you do. Even seemingly blind trails in troubleshooting may lead you to discover information about your network. In addition, you will find that your documentation will be an invaluable aid the next time you need to troubleshoot a similar problem.
Some people like to carry around a paper notebook to keep their records in; others prefer to keep things online. Both camps have good points, many related to information access. If you keep everything in a notebook but don't have it handy, it does you no good. Similarly, if everything is online and the network is down, you're in bad shape.
My preference is to keep things online, but in a cvs repository. Then you can keep it on a central server or two while also keeping a copy on your laptop/PC/palmtop. If you like, you can even grab printouts. A nice benefit to this is that several people can make updates to documentation and then commit their changes back to the cvs repository when they've finished.
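As a rough illustration of that workflow, here is a sketch of a helper that appends a dated entry to a plain-text work log kept in a cvs working copy. The file name worklog.txt and the log_work function are hypothetical; the cvs commands appear only as comments to show where they would fit.

```shell
#!/bin/sh
# Sketch: append a dated entry to a plain-text work log kept in a cvs
# working copy. WORKLOG is a hypothetical file name.
WORKLOG="${WORKLOG:-worklog.txt}"

# log_work HOST "what was done"
log_work() {
    printf '%s %s: %s\n' "$(date +%Y-%m-%d)" "$1" "$2" >> "$WORKLOG"
}

# Typical session (cvs commands shown for illustration, not run here):
#   cvs update                       # pick up other people's entries
#   log_work router "replaced eth1 cable"
#   cvs commit -m "router: replaced eth1 cable" worklog.txt
```

Keeping one line per piece of work makes the log easy to search with grep later, and the cvs history records who wrote each entry.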
I won't get into the Web vs. flatfile vs. database vs. XML vs. whatever conflict. They all have benefits. Choose the right option for your organization, and stick to it. The important bit is that you have the data, right?
A roundly ignored set of baseline information is the network map. If you have more than two systems in your network and don't have a map, set down this book for 20 minutes and sketch something out. It doesn't have to be pretty, just reasonably accurate. Are you back? Good. Now that you have a map showing what is where, we can get back to work.
Most people want to deal with two kinds of maps. The first is a topological/physical map, which shows what equipment is where and how it is connected. The second is a logical map. This shows what services are provided and what user communities are supported by which servers. If you can combine these two maps, so much the better; color coding, numeric coding, and outlined boxes are all mechanisms that can help with this. A sample map is shown in Figure 1.
Figure 1 A sample network map
As with the information discussed in the previous "Work/Problem Logs" section, I recommend that you keep your maps online and in a couple of places. (cvs can be a good solution here as well.) Nicely done maps also look good on your wall, which is a convenient place to find them when a problem breaks out and you need to start troubleshooting.
You should also have accurate records of the hardware and software in your network. At a minimum, you should have a hardware listing of each box on the network, a list of system and application levels (showing currently installed versions and patches), and configurations of the same. If you keep this in cvs, you'll also have a nice mechanism for looking at your history.
If you decide to keep these records, it is vital that they be kept up-to-date. Every time you make a change, you should edit the appropriate file and commit it to cvs. If you fall behind, you'll miss something, and then you'll really be stuck.