- Sep 9, 2005
What a Hierarchical File System Is All About
In a nutshell, a hierarchy is a system organized by graded categorization. A familiar example is the organizational structure of a company, where workers report to supervisors and supervisors report to middle managers. Middle managers, in turn, report to senior managers, and senior managers report to vice-presidents, who report to the president of the company. Graphically, this hierarchy looks as shown in Figure 3.1.
Figure 3.1 A typical organizational hierarchy.
You’ve doubtless seen this type of illustration before, and you know that a higher position indicates more control. Each position is controlled by the one directly above it. The president is top dog of the organization, but each subsequent manager is also in control of his or her own small fiefdom.
To understand how a file system has a similar organization, imagine each of the managers in the illustration as a file folder and each of the employees as a piece of paper, filed in a particular folder. Open any file cabinet, and you probably see things organized this way: Filed papers are placed in labeled folders, and often these folders are filed in groups under specific topics. The drawer might then have a specific label to distinguish it from other drawers in the cabinet, and so on.
That’s exactly what a hierarchical file system is all about. You want to have your files located in the most appropriate place in the file system, whether at the very top, in a folder, or in a nested series of folders. With careful usage, a hierarchical file system can contain thousands of files and still allow users to find any individual file quickly.
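To make the idea concrete, here is a minimal sketch that builds a tiny nested hierarchy in a scratch area and then locates one file inside it. The names book, part1, part2, and the chapter files are made up for illustration, and the sketch assumes the standard mktemp, mkdir, touch, and find utilities are available. Don't worry about the individual commands yet.

```shell
# Create a throwaway top-level folder so nothing real is touched.
top=$(mktemp -d)

# Build a nested series of folders (hypothetical names).
mkdir -p "$top/book/part1" "$top/book/part2"

# File a few empty "papers" at different levels of the hierarchy.
touch "$top/book/outline.txt"
touch "$top/book/part1/chapter03.txt"
touch "$top/book/part2/chapter14.txt"

# Ask the system where one particular file lives.
found=$(cd "$top" && find book -name 'chapter03.txt')
echo "$found"    # prints book/part1/chapter03.txt

# Clean up the scratch area.
rm -rf "$top"
```

The path that comes back reads like directions through the hierarchy: start in book, open part1, and the file is there.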
On my computer, the chapters of this book are organized in a hierarchical fashion, as shown in Figure 3.2.
Figure 3.2 File organization for the chapters of Sams Teach Yourself Unix in 24 Hours, Fourth Edition.
Task 3.1: The Unix File System Organization
A key concept enabling the Unix hierarchical file system to be so effective is that anything that is not a folder is a file. Programs are files in Unix, device drivers are files, documents and spreadsheets are files, your keyboard is represented as a file, your display is a file, and even your tty line and mouse are files.
What this means is that as Unix has developed, it has avoided becoming an ungainly mess. Unix does not have hundreds of cryptic files stuck at the top (this is still a problem in DOS) or tucked away in confusing folders within the System Folder (as with the Macintosh).
The top level of the Unix file structure (/) is known as the root directory or slash directory, and it always has a certain set of subdirectories, including bin, dev, etc, lib, mnt, tmp, and usr. There can be a lot more, however.
You can obtain a listing of the files and directories in your own top-level directory by using the ls -F / command. (You’ll learn all about the ls command in the next hour. For now, just be sure that you enter exactly what’s shown in the example.)
On a Sun Microsystems workstation, here’s what I see when I enter that command:
% ls -F /
Mail/        export/      public/
News/        home/        reviews/
add_swap/    kadb*        sbin/
apps/        layout       sys@
archives/    lib@         tftpboot/
bin@         lost+found/  tmp/
boot         mnt/         usr/
cdrom/       net/         utilities/
chess/       news/        var/
dev/         nntpserver   vmunix*
etc/         pcfs/
In this example, any filename that ends with a slash (/) is a folder (Unix calls these directories). Any filename that ends with an asterisk (*) is a program. Anything ending with the at sign (@) is a symbolic link (a pointer to another file or directory elsewhere in the file system), and everything else is a normal, plain file.
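If you'd like to see each of these suffixes appear without poking around the real top-level directory, the following sketch sets up one entry of each kind in a scratch directory and lists it. The names reports, notes.txt, hello, and latest are invented for the example, and it assumes mktemp, chmod, and ln behave as they do on typical Unix systems.

```shell
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)

mkdir "$demo/reports"                 # a directory: listed with a trailing /
touch "$demo/notes.txt"               # a plain file: listed with no suffix

# A tiny program: a script marked as executable, listed with a trailing *
printf '#!/bin/sh\necho hi\n' > "$demo/hello"
chmod +x "$demo/hello"

ln -s notes.txt "$demo/latest"        # a symbolic link: listed with a trailing @

# Capture and show the listing, one name per line.
listing=$(ls -F "$demo")
echo "$listing"

# Clean up.
rm -rf "$demo"
```

The listing should show hello*, latest@, notes.txt, and reports/, which covers all four cases described above.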
As you can see from this example, and as you’ll immediately find when you try the command yourself, there is much variation in how different Unix systems organize the top-level directory. There are some directories and files in common, and once you start examining the contents of specific directories, you’ll find that hundreds of programs and files always show up in the same place from Unix to Unix.
It’s as if you were working as a file clerk at a new law firm. Although this firm might have a specific approach to filing information, the approach can be similar to the filing system of other firms where you have worked in the past. If you know the underlying organization, you can quickly pick up the specifics of a particular organization.
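One quick way to pick up the specifics of an unfamiliar system is to check which of the common top-level directories it actually has. This is only a rough sketch: the list of names comes from the directories mentioned earlier, and the variable names present and missing are my own.

```shell
present=""
missing=""

# Test each commonly found top-level directory in turn.
for name in bin dev etc lib mnt tmp usr; do
    if [ -d "/$name" ]; then      # -d is true for a directory (or a link to one)
        present="$present $name"
    else
        missing="$missing $name"
    fi
done

echo "found:$present"
echo "not found:${missing:- (none)}"
```

Because the test follows symbolic links, an entry such as bin@ in the earlier listing still counts as found.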
- Try the command ls -F / on your computer system, and identify, as previously explained, each of the directories in your resultant listing.
The output of the previous ls command shows the files and directories in the top level of your system. Next, you learn what the commonly found directories are.
The bin Directory
In Unix parlance, programs are considered executables because users can execute them. (In this case, execute is a synonym for run, not an indication that you get to wander about murdering innocent applications!) When a program is compiled, it is translated from source code into what’s called a binary format. Add the two together, and you have a common Unix description for an application: an executable binary.
It’s no surprise that the original Unix developers decided to have a directory labeled binaries to store all the executable programs on the system. Remember the primitive teletypewriter discussed earlier? Having a slow system to talk with the computer had many ramifications you might not expect. The single most obvious one was that everything became quite concise. There were no lengthy words like binaries or listfiles, but rather succinct abbreviations: bin and ls are, respectively, the Unix equivalents.
The bin directory (pronounce it to rhyme with "tin") is where all the executable binaries were kept in early Unix. Over time, as more and more executables were added to Unix, having all the executables in one place proved unmanageable, and the bin directory split into multiple parts (/bin, /sbin, /usr/bin).
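To see how the executables are spread out on your own system, you can ask the shell where a familiar program lives and count what each binary directory holds. This sketch assumes a POSIX-style shell, where command -v reports the location of a program. The counts will differ from one Unix to the next, and some systems won't have all three directories.

```shell
# command -v prints the full path of a program found on the search path.
ls_path=$(command -v ls)
echo "ls lives at: $ls_path"

# Count the entries in each of the usual binary directories, if present.
for dir in /bin /sbin /usr/bin; do
    if [ -d "$dir" ]; then
        count=$(ls "$dir" | wc -l | tr -d ' ')
        echo "$dir holds $count entries"
    fi
done
```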
The dev Directory
Among the most important portions of any computer are its device drivers. Without them, you wouldn’t have any information on your screen (the information arrives courtesy of the display device driver). You wouldn’t be able to enter information (the information is read and given to the system by the keyboard device driver), and you wouldn’t be able to use your floppy disk drive (managed by the floppy device driver).
Remember, everything in Unix is a file. Every component of the system, from the keyboard driver to the hard disk, is a file.
Earlier, you learned how almost anything in Unix is considered a file in the file system, and the dev directory is an example. All device drivers—often numbering into the hundreds—are stored as separate files in the standard Unix dev (devices) directory. Pronounce this directory name "dev," not "dee-ee-vee."
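A harmless way to convince yourself that a device really is a file is to try ordinary file operations on one. The sketch below uses /dev/null, a device found on Unix systems that discards anything written to it and returns nothing when read. The variable names kind and bytes are just for illustration.

```shell
# -c is true when the name refers to a character device file.
if [ -c /dev/null ]; then
    kind="character device"
else
    kind="something else"
fi
echo "/dev/null is a $kind"

# Because it is a file, ordinary output redirection works on it.
echo "this text is thrown away" > /dev/null

# Reading from it yields no data at all.
bytes=$(wc -c < /dev/null | tr -d ' ')
echo "bytes read from /dev/null: $bytes"
```

Nothing here is specific to /dev/null: the same redirection syntax is how programs talk to other devices in dev, although most of those require special permission.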
The etc Directory
Unix administration can be quite complex, involving management of user accounts, the file system, security, device drivers, hardware configurations, and more. To help, Unix designates the etc directory as the storage place for all administrative files and information.
Pronounce the directory name "ee-tea-sea," "et-sea," or "etcetera." All three pronunciations are common.
The lib Directory
Like your own community, Unix has a central storage place for function and procedural libraries. These libraries are collections of precompiled routines that programs draw on, allowing programs to offer features and capabilities otherwise unavailable. The idea is that if programs want to include certain features, they can all reference the single shared copy in the Unix library rather than each carrying a new, unique copy.
Pronounce the directory name "libe" or "lib" (to rhyme with the word bib).
The lost+found Directory
With multiple users running many different programs simultaneously, it’s been a challenge over the years to develop a file system that can remain synchronized with the activity of the computer. Various parts of the Unix kernel (the brains of the system) help with this problem. When files are recovered after any sort of problem or failure, typically by the file system repair utility fsck, they are placed here, in the lost+found directory, if their proper location in the file system cannot be ascertained. This directory should be empty almost all the time.
This directory is commonly pronounced "lost and found" rather than "lost plus found."
The mnt and sys Directories
The mnt (pronounced "em-en-tea") and sys (pronounced "sis") directories can safely be ignored by most Unix users. The mnt directory is intended to be a common place to mount external media (hard disks, removable cartridge drives, and so on) in Unix. On many systems, though not all, sys contains files indicating the system configuration.
The tmp Directory
A directory that you can’t ignore, the tmp directory—say "temp"—is used by many of the programs in Unix as a temporary file-storage space. If you’re editing a file, for example, the editor makes a copy of the file, saves it in tmp, and you work directly with that, saving the new file back to your original only when you’ve completed your work.
On most systems, tmp ends up littered with various files and executables left by programs that don’t remove their own temporary files. On one system I use, it’s not uncommon to find 10–30 megabytes of files wasting space.
Even so, if you’re manipulating files or working with copies of files, tmp is the best place to keep the temporary copies. Indeed, on some Unix workstations, tmp actually can be the fastest device on the computer, allowing for dramatic performance improvements over working with files directly in your home directory.
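If you do keep working copies in temporary storage, it's good manners to clean up after yourself. Here is a minimal sketch of that habit using mktemp, which creates a uniquely named empty file (usually under /tmp, or wherever the TMPDIR variable points). The sample data and the variable names scratch and sorted are made up for the example.

```shell
# mktemp creates a uniquely named empty file and prints its path.
scratch=$(mktemp)

# Put some working data in the temporary file.
printf 'pear\napple\nfig\n' > "$scratch"

# Work with the temporary copy.
sorted=$(sort "$scratch")
echo "$sorted"

# Remove the temporary file so it doesn't litter the system.
rm -f "$scratch"
```

Letting mktemp choose the name avoids collisions when several users or programs are using the same temporary directory at once.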
The usr Directory
The last of the standard directories at the top level of the Unix file system hierarchy is the usr—pronounced "user"—directory. Originally, this directory was intended to be the central storage place for all user-related commands. Today, however, many companies have their own interpretation, and there’s no telling what you’ll find in this directory.
Other Miscellaneous Stuff at the Top Level
In addition to all the directories previously listed, various other directories and files commonly occur in Unix systems. Some files might have slight variations in name on your computer, so when you compare your listing to the following files and directories, be alert for possible alternative spellings.
A file you must have in order to bring up Unix at all is one usually called unix or vmunix, or named after the specific version of Unix on the computer. The file contains the actual Unix operating system. The file must have a specific name and must be found at the top level of the file system. Hand-in-hand with the operating system is another file called boot, which helps during initial startup of the hardware.
On a Sequent computer, for example, the two files are boot and dynix. (DYNIX is the name of the particular variant of Unix used on Sequent computers.) By comparison, the listing from the Sun Microsystems workstation shows boot and vmunix as the two files.
Another directory you might find in your own top-level listing is diag—pronounced "dye-ag"—which acts as a storehouse for diagnostic and maintenance programs. If you have any programs within this directory, it’s best not to try them out without proper training!
The home directory, /home, also sometimes called users, is a central place for organizing all files owned by specific users. Listing this directory is usually an easy way to find out what accounts are on the system, too, because by convention each individual account directory is named after the user’s account name. On one system I use, my account is taylor, and my individual account directory is also called taylor. Home directories are always created by the system administrator.
The net directory, if set up correctly, is a handy shortcut for accessing other computers on your network.
The tftpboot directory is a relatively new feature of Unix. The letters stand for "trivial file transfer protocol boot." Don’t let the name confuse you, though; this directory contains versions of the kernel suitable for X Window System–based terminals and diskless workstations to run Unix.
Some Unix systems have directories named for specific types of peripherals that can be attached. On the Sun workstation, you can see examples with the directories cdrom and pcfs. The former is for a CD-ROM drive and the latter for DOS-format floppy disks.
Unix has many more directories, but this gives you an idea of how things are organized.