Installing Ubuntu Server
So you've downloaded your Ubuntu Server CD from http://releases.ubuntu.com/ubuntu-server, burned it, eagerly placed it in your CD drive, and rebooted the machine to be greeted by the friendly Ubuntu menu. The first option, Install in Text Mode, installs an Ubuntu server. The second option, Install a LAMP Server, runs the same installer as the first, but will also automatically install and set up Apache, MySQL, and PHP for you. If you're installing from the DVD instead, select Install a Server to get started.
For the most part, server installation is identical to installing a regular Ubuntu machine. This is because Ubuntu takes extra care to ask only the most fundamental questions in the installer, and it turns out those don't differ much between a desktop and a server system. For a quick review of the installation procedure, turn back to Chapter 2. Here, we'll be looking at some of the advanced installer gadgetry that that chapter leaves out, and which is particularly geared toward server users.
The neat stuff begins when you arrive at the partitioning section of the installer. With a desktop machine, you'd probably let the installer configure a basic set of partitions by itself and go on its merry way. But with servers, things get a bit more complicated.
A Couple of Installer Tricks
As we'll explore below, in terms of partitioning and storage, server installations can be quite a bit more complex than desktop ones. There's a small bag of useful tricks with the installer that can help when things get hairy.
The installer itself runs on virtual console 1. If you switch to console 2 by pressing Alt+F2, you'll be able to activate the console by hitting Enter and land in a minimalistic (busybox) shell. This will let you explore the complete installer environment, and take some matters into your own hands if need be. You can switch back to the installer console by pressing Alt+F1. Console 4 contains a running, noninteractive log of the installation, which you can inspect by pressing Alt+F4. Finally, it's sometimes useful to be able to connect to another server during installation, perhaps to upload a log file or to gain access to your mailbox or other communication. By default, the shell on console 2 will not provide you with an ssh client, but you can install one by running anna-install openssh-client-udeb after the installer configures the network. You can then use the ssh and scp binaries to log in or copy data to the server of your choice.
Partitioning Your Ubuntu Server
Deciding how to partition the storage in your server is a finicky affair, and certainly no exact science. Generally, it's a good idea to have at least three partitions separate from the rest of the system:
- /home—where all the user files will live
- /tmp—temporary scratch space for running applications
- /var—mail spools and log files
Keeping data on separate partitions gives you, the administrator, a fine-grained choice of which filesystem to use for a particular purpose. For instance, you might choose to put /tmp on ReiserFS for its superior handling of many files in a directory and excellent performance on small files, but you might keep /home and /var on ext3 for its rock-solid robustness.
In addition, a dedicated /home partition lets you use special options when mounting it to your system, such as imposing disk space quotas or enabling extended security on user data. The reason to keep /tmp and /var separate from the rest of your system is much more prosaic: These directories are prone to filling up. This is the case with /tmp because it's a scratchpad, and administrators often give users very liberal quotas there (but have a policy, for example, of all user data in /tmp older than two days getting purged), which means /tmp can easily get clogged up. /var, on the other hand, stores log files and mail spools, both of which can come to take up massive amounts of disk space either as a result of malicious activity or a significant spike in normal system usage.
Becoming a system administrator means you have to learn how to think like one. If /tmp and /var are easy to fill up, you compartmentalize them so that they can't eventually consume all the disk space available on your server.
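To keep an eye on the partitions that tend to fill up, a short script can report how full each one is. The following is a minimal sketch in Python, assuming the three mount points discussed above; the 90 percent warning threshold is a hypothetical value chosen for illustration, not a recommendation from this chapter.

```python
import os
import shutil

# Mount points the text suggests isolating; adjust to your own layout.
WATCHED = ["/home", "/tmp", "/var"]
# Hypothetical warning threshold in percent used (an assumption, not from the text).
THRESHOLD = 90.0


def percent_used(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # some pseudo-filesystems report a size of zero
    return 100.0 * usage.used / usage.total


def report(paths=WATCHED, threshold=THRESHOLD):
    """Return (path, percent_used, over_threshold) for each path that exists."""
    rows = []
    for path in paths:
        if not os.path.isdir(path):
            continue  # skip mount points that are absent on this machine
        pct = percent_used(path)
        rows.append((path, pct, pct >= threshold))
    return rows


for path, pct, over in report():
    flag = "WARNING" if over else "ok"
    print(f"{path:<6} {pct:5.1f}% used  {flag}")
```

If /home, /tmp, and /var are not separate partitions, the script simply reports the usage of whichever filesystem contains each directory.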
The Story of RAID
If you've only got one hard drive in your server, feel free to skip ahead. Otherwise, let's talk about putting those extra drives to use. The acronym RAID stands for redundant array of inexpensive disks, although if you're a businessman, you can substitute the word "independent" for "inexpensive." We forgive you. And if you're in France, RAID is short for recherche assistance intervention dissuasion, which is an elite commando unit of the National Police—but if that's the RAID you need help with, you're reading the wrong book. We think RAID is just a Really Awesome Idea for Data: When dealing with your information, it provides extra speed, fault tolerance, or both.
At its core, RAID is just a way of spreading or replicating information across multiple physical drives. The process can be set up in a number of ways, and specific kinds of drive configurations are referred to as RAID levels. These days, even low- to mid-range servers ship with integrated hardware RAID controllers, which operate without any support from the OS. If your new server doesn't come with a RAID controller, you can use the software RAID functionality in the Ubuntu kernel to accomplish the same goal.
Setting up software RAID while installing your Linux system was difficult and unwieldy only a short while ago, but it is a breeze these days: the Ubuntu installer provides a nice, convenient interface for it, and then handles all the requisite backstage magic. You can choose from three RAID levels: 0, 1, and 5.
A so-called striped set, RAID 0 allows you to pool the storage space of a number of separate drives into one large, virtual drive. The important thing to keep in mind is that RAID 0 does not simply concatenate the physical drives; it spreads the data across them evenly, which means that no more space will be used on each physical drive than can fit on the smallest one. In practical terms, if you had two 250GB drives and a 200GB drive, the total amount of space on your virtual drive would equal 600GB; 50GB on each of the two larger drives would go unused. Spreading data in this fashion provides amazing performance, but also significantly decreases reliability. If any of the drives in your RAID 0 array fails, the entire array will come crashing down, taking your data with it.
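The RAID 0 arithmetic can be sketched in a few lines of Python. The function name is made up for illustration; the rule it encodes (every drive contributes only as much as the smallest drive holds) is the one described above.

```python
def raid0_capacity(drive_sizes_gb):
    """Return (usable_gb, unused_gb_per_drive) for a striped set."""
    if len(drive_sizes_gb) < 2:
        raise ValueError("a striped set needs at least two drives")
    smallest = min(drive_sizes_gb)
    # Each drive contributes only as much space as the smallest drive has.
    usable = smallest * len(drive_sizes_gb)
    unused = [size - smallest for size in drive_sizes_gb]
    return usable, unused


usable, unused = raid0_capacity([250, 250, 200])
print(usable)  # 600
print(unused)  # [50, 50, 0]
```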
RAID 1 provides very straightforward data replication. It will take the contents of one physical drive and multiplex it to as many other drives as you'd like. A RAID 1 array does not grow in size with the addition of extra drives; instead, it grows in reliability and read performance. The size of the entire array is limited by the size of its smallest constituent drive.
When the chief goal of your storage is fault tolerance, and you want to use more space than the single drive's worth that RAID 1 provides, RAID 5 is the level you want to use. It lets you use n identically sized physical drives (if different-sized drives are present, no more space than the size of the smallest one will be used on each drive) to construct an array whose total available space is that of n-1 drives, and the array tolerates the failure of any one drive, but no more than one, without data loss.
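For comparison, here is a small Python sketch of the capacity rules for the two redundant levels. The function names are hypothetical, and the minimum drive counts (two for RAID 1, three for RAID 5) are the usual ones rather than something the installer is claimed to enforce in exactly this way.

```python
def raid1_capacity(drive_sizes_gb):
    """A mirror is only as large as its smallest member."""
    if len(drive_sizes_gb) < 2:
        raise ValueError("a mirror needs at least two drives")
    return min(drive_sizes_gb)


def raid5_capacity(drive_sizes_gb):
    """n drives yield the space of n-1 drives, each capped at the smallest size."""
    if len(drive_sizes_gb) < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (len(drive_sizes_gb) - 1) * min(drive_sizes_gb)


print(raid1_capacity([80, 80]))         # 80
print(raid5_capacity([250, 250, 250]))  # 500
print(raid5_capacity([250, 250, 200]))  # 400
```

Note how the mixed-size RAID 5 example gives up both the space of one whole drive and the extra 50GB on each larger drive.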
Which RAID to Choose?
If you're indecisive by nature, the past few paragraphs may have left you awkwardly hunched in your chair, mercilessly chewing a No. 2 pencil, feet tapping the floor nervously. Luckily, the initial choice of RAID level is often a no-brainer, so you'll have to direct your indecision elsewhere. If you have one hard drive, no RAID for you. Do not pass Go, do not collect $200. Two drives? Toss them into RAID 1, and sleep better at night. Three or more? RAID 5. Unless you really know what you're doing, avoid RAID 0 like the plague. If you're not serving mostly read-only data without a care about redundancy, RAID 0 isn't what you want.
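The rule of thumb above is simple enough to write down as a function. This is only a sketch of the advice in the preceding paragraph, with a made-up function name; it deliberately never suggests RAID 0.

```python
def suggest_raid_level(drive_count):
    """Return the suggested RAID level, or None when RAID is not an option."""
    if drive_count < 1:
        raise ValueError("a server needs at least one drive")
    if drive_count == 1:
        return None  # one drive: no RAID for you
    if drive_count == 2:
        return 1     # two drives: mirror them
    return 5         # three or more: RAID 5


for count in (1, 2, 3, 6):
    print(count, suggest_raid_level(count))
```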
Setting Up RAID
After carefully studying the last section, maybe reading a few books on abstract algebra and another few on finite field theory, you finally decided on a RAID level that suits you. Since books can't yet read your mind, we'll assume you chose RAID 1. So how do you set it up?
Back to the installer. When prompted about partitioning disks, you'll want to bravely select the last option, Manually Edit Partition Table. The very first option in the new dialog box is Configure Software RAID, but don't go there just yet. You need to prepare the physical partitions first.
Below the top four options on the screen (RAID, Logical Volume Manager, Guided Partitioning, and Help), you'll find a list of the physical drives in your server that the Ubuntu installer detected.
Indented below each drive, you'll find the list of any pre-existing partitions, along with their on-disk ordinal number, size, bootable status, filesystem type, and possibly, their mount point. Using the arrow keys, highlight the line summarizing a physical drive (not any of its partitions), and hit Enter. You'll be asked to confirm replacing any existing partition table with a new one. Select Yes, and the only entry listed below that drive will be FREE SPACE. In our fictional server, we have two 80GB drives, hda and hdb, so we'd follow this process for both drives, giving each a fresh partition table. Say we've decided on a 20GB /home partition. Arrow over to the free space, hit Enter, and create the partition; look back to Chapter 2 if you need a refresher on all the options. Once you've entered the size for the new partition, you'll be brought to a dialog where you can choose the filesystem and mount options. Instead of plopping a filesystem on the raw partition, however, you'll want to enter the Use As dialog, and set the new partition to be a physical volume for RAID.
Still with us? Now rinse and repeat for the other drive—create the exact same partition, same size, and set it as a RAID volume. When you're done, you should be back at the initial partitioning screen, and you should have an identically sized partition under each drive. At this point, choose Configure Software RAID at the top of the screen, agree to write out changes to the storage devices if need be, and then choose to create an MD (multidisk) device. After selecting RAID 1, you'll be asked to enter the number of active devices for the array. In our fictional two-drive server it's two. The next question concerns the number of spare devices in the array, which you can leave at zero. Now simply use the spacebar to put a check next to both partitions that you've created (hda1 and hdb1), and hit Finish in the Multidisk dialog to return to the basic partitioner.
If you look below the two physical drives that you used to have there, you'll notice a brand new drive, the Software RAID device that has one partition below it. That's your future /home partition, sitting happily on a RAID array. If you arrow over to it and hit Enter, you can now configure it just as you would a real partition.
The process is the same for any other partitions you want to toss into RAID. Create identical-sized partitions on all participating physical drives, select to use them as RAID space, enter the multidisk configurator (software RAID), and finally, create an array that utilizes the real partitions. Then create a filesystem on the newly created array.
That's it! The Ubuntu installer will take care of all the pesky details of configuring the system to boot the RAID arrays at the right time and use them, even if you've chosen to keep your root partition on an array. Now let's look at another great feature of the Ubuntu installer: the Logical Volume Manager (LVM).
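Once the installed system is up, the kernel reports the state of software RAID arrays in /proc/mdstat. The Python sketch below parses that file to check whether every member of each array is working. It is a simplified reader, assuming the common layout of an array line followed by a status line containing something like [2/2] [UU]; less common states are simply skipped.

```python
import re

# Matches array lines such as "md0 : active raid1 hdb1[1] hda1[0]".
ARRAY_LINE = re.compile(r"^(md\d+) : active (\w+) (.+)$")
# Matches the member summary such as "[2/2] [UU]" (expected/working devices).
STATUS = re.compile(r"\[(\d+)/(\d+)\] \[([U_]+)\]")


def parse_mdstat(text):
    """Return a dict mapping each array name to its level, members, and health."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = ARRAY_LINE.match(line)
        if header:
            current = header.group(1)
            # Strip the "[0]" role suffix and any flags from each member name.
            members = [re.sub(r"\[.*$", "", dev) for dev in header.group(3).split()]
            arrays[current] = {"level": header.group(2), "members": members}
            continue
        status = STATUS.search(line)
        if status and current in arrays:
            expected, working = int(status.group(1)), int(status.group(2))
            arrays[current].update(
                expected=expected, working=working, healthy=(expected == working)
            )
    return arrays


try:
    with open("/proc/mdstat") as handle:
        for name, info in parse_mdstat(handle.read()).items():
            print(name, info)
except OSError:
    print("/proc/mdstat is not available on this machine")
```

An array whose working count is lower than its expected count is running degraded and deserves your attention.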
The Story of the Logical Volume Manager
Let's take a step back from our RAID adventure and look at the bigger picture in data storage. The entire situation is unpleasant. Hard drives are slow and fail often, and though abolished for working memory ages ago, fixed-size partitions are still the predominant mode of storage space allocation. As if worrying about speed and data loss weren't enough, you also have to worry about whether your partition size calculations were just right when you were installing a server or if you'll wind up in the unenviable position of having a partition run out of space, even though another partition is maybe mostly unused. And if you might have to move a partition across physical volume boundaries on a running system, well, woe is you.
RAID helps to some degree. It'll do wonders for your worries about performance and fault tolerance, but it operates at too low a level to help with the partition size or fluidity concerns. What we'd really want is a way to push the partition concept up one level of abstraction, so it doesn't operate directly on the underlying physical media. Then we could have partitions that are trivially resizable or that can span multiple drives, we could easily take some space from one partition and tack it on another, and we could juggle partitions around on physical drives on a live server. Sounds cool, right?
Very cool, and very doable via logical volume management (LVM), a system that shifts the fundamental unit of storage from physical drives to virtual, or "logical" ones (although we harbor our suspicions that the term "logical" is a jab at the storage status quo, which is anything but). LVM has traditionally been a feature of expensive, enterprise Unix operating systems, or was available for purchase from third-party vendors. Through the magic of free software, a guy by the name of Heinz Mauelshagen wrote an implementation of a logical volume manager for Linux in 1998, which we'll refer to as LVM. LVM has undergone tremendous improvements since then, is widely used in production today, and just as you'd expect, the Ubuntu installer makes it easy for you to configure it on your server during installation.
LVM Theory and Jargon
Wrapping your head around LVM is a bit more difficult than with RAID because LVM rethinks the whole way of dealing with storage, which, as you might expect, introduces a bit of jargon that you need to learn. Under LVM, physical volumes, or PVs, are seen just as providers of disk space without any inherent organization (such as partitions mapping to a mount point in the OS). We group PVs into volume groups, or VGs, which are virtual storage pools that look like good old cookie-cutter hard drives. We carve those up into logical volumes, or LVs, that act like the normal partitions we're used to dealing with. We create filesystems on these LVs, and mount them into our directory tree. And behind the scenes, LVM splits up physical volumes into small slabs of bytes (4MB by default), each of which is called a physical extent, or a PE.
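To get a feel for the scale of physical extents, here is a quick Python calculation of how many default-sized extents fit in a physical volume. It is a back-of-the-envelope sketch that treats 1GB as 1024MB and ignores the small amount of space LVM reserves for its own metadata.

```python
PE_SIZE_MB = 4  # default physical extent size


def extents_in(size_gb, pe_size_mb=PE_SIZE_MB):
    """Number of whole physical extents that fit in a PV of `size_gb` gigabytes."""
    return (size_gb * 1024) // pe_size_mb


print(extents_in(80))  # 20480
print(extents_in(20))  # 5120
```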
Okay, so that was a mouthful of acronyms, but as long as you understand the progression, you're in good shape. You take a physical hard drive and set up one or more partitions on it that will be used for LVM. These partitions are now physical volumes (PV), which are split into physical extents (PE), and then grouped in volume groups (VG), on top of which you finally create logical volumes. It's the LVs, these virtual partitions, and not the ones on the physical hard drive, that carry a filesystem and are mapped and mounted into the OS. And if you're really confused about what possible benefit we get from adding all this complexity only to wind up with the same fixed-size partitions in the end, hang in there. It'll make sense in a second.
The reason LVM splits physical volumes into small, equally sized physical extents is that the definition of a volume group (the space that'll be carved into logical volumes) then becomes "a collection of physical extents" rather than "a physical area on a physical drive," as with old-school partitions. Notice that "a collection of extents" says nothing about where the extents are coming from, and certainly doesn't impose a fixed limit on the size of a volume group. We can take PEs from a bunch of different drives and toss them into one volume group, which addresses our desire to abstract partitions away from physical drives. We can take a VG and make it bigger simply by adding a few extents to it, maybe by taking them from another VG, or maybe by tossing in a new physical volume and using extents from there. And we can take a VG and move it to different physical storage simply by telling it to relocate to a different collection of extents. Best of all, we can do all this on the fly, without any server downtime.
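The idea of a volume group as nothing more than a collection of extents can be modeled in a few lines of Python. This is a toy model with hypothetical class and volume names, not how LVM stores its metadata; it only shows that the pool doesn't care which drive an extent came from, and that adding a PV simply makes the pool bigger.

```python
from dataclasses import dataclass, field

PE_SIZE_MB = 4  # default physical extent size


@dataclass
class VolumeGroup:
    name: str
    # Each extent is recorded as a (pv_name, index_on_that_pv) pair.
    extents: list = field(default_factory=list)

    def add_pv(self, pv_name, size_mb):
        """Split a physical volume into extents and add them to the pool."""
        count = size_mb // PE_SIZE_MB
        self.extents.extend((pv_name, i) for i in range(count))

    @property
    def size_mb(self):
        return len(self.extents) * PE_SIZE_MB

    def sources(self):
        """Names of the physical volumes that contribute extents."""
        return sorted({pv for pv, _ in self.extents})


vg = VolumeGroup("vg_demo")
vg.add_pv("pv_on_drive_1", 40)
vg.add_pv("pv_on_drive_2", 80)
print(vg.size_mb)    # 120
vg.add_pv("pv_on_drive_3", 200)  # growing the VG is just adding extents
print(vg.size_mb)    # 320
print(vg.sources())  # ['pv_on_drive_1', 'pv_on_drive_2', 'pv_on_drive_3']
```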
Do you smell that? That's the fresh smell of the storage revolution.
Setting Up LVM
By now, you must be convinced that LVM is the best thing since sliced bread. Which it is—and, surprisingly enough, setting it up during installation is no harder than setting up RAID. Create partitions on each physical drive you want to use for LVM just as you did with RAID, but tell the installer to use them as "physical space for LVM." Note that in this context, PVs are not actual physical hard drives; they are the partitions that you're creating.
You don't have to devote your entire drive to partitions for LVM. If you'd like, you're free to create actual filesystem-containing partitions alongside the storage partitions used for LVM, but make sure you're satisfied with your partitioning choice before you proceed. Once you enter the LVM configurator in the installer, the partition layout on all drives that contain LVM partitions will be frozen.
Let's look back to our fictional server, but let's give it four drives, which are 10GB, 20GB, 80GB, and 120GB in size. Say we want to create an LVM partition, or PV, utilizing all available space on each drive, and then combine the first two PVs into a 30GB volume group, and the latter two into a 200GB one. Each VG will act as a large virtual hard drive on top of which we can create logical volumes just as we would normal partitions.
As with RAID, arrowing over to the name of each drive and hitting Enter will let us erase the partition table. Then hitting Enter on the FREE SPACE entry lets us create a physical volume, that is, a partition that we set to be used as physical space for LVM. Once all four LVM partitions are in place, we select Configure the Logical Volume Manager on the partitioning menu.
After a warning about the partition layout, we get to a rather Spartan LVM dialog that lets us modify VGs and LVs. According to our plan, we choose the former option, and create the two VGs we want, choosing the appropriate PVs. We then select Modify Logical Volumes and create the LVs corresponding to the normal partitions we want to put on the system—say one for each of /, /var, /home, and /tmp.
You can already see some of the partition fluidity that LVM brings you. If you decide you want a 25GB logical volume for /var, you can carve it out of the first VG you created, and /var will magically span the two smaller hard drives. If you later decide you've given /var too much space, you can shrink the filesystem, and then simply move some of the storage space from the first VG over to the second. The possibilities are endless.
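To see why a 25GB /var has to span both of the smaller drives, consider this Python sketch. It fills the physical volumes in order, which is a simplifying assumption made for illustration; LVM's real allocation policies are more flexible, and the labels used for the drives are made up.

```python
def place_lv(lv_gb, pv_sizes_gb):
    """Fill PVs in order; return how many GB of the LV land on each PV."""
    remaining = lv_gb
    placement = {}
    for name, size in pv_sizes_gb.items():
        if remaining == 0:
            break
        take = min(size, remaining)
        placement[name] = take
        remaining -= take
    if remaining:
        raise ValueError("volume group is too small for this logical volume")
    return placement


# The first VG from the example: the 10GB and 20GB drives pooled together.
first_vg = {"10GB drive": 10, "20GB drive": 20}
print(place_lv(25, first_vg))  # {'10GB drive': 10, '20GB drive': 15}
```

Neither drive alone could hold 25GB, so the logical volume necessarily draws space from both, leaving 5GB free in the volume group.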
You're Done—Now Watch Out for Root!
Whew. With the storage stuff out of the way, the rest of your server installation should go no differently than installing a regular Ubuntu workstation. And now that your server is installed, we can move on to the fun stuff. From this point on, everything we do will happen in a shell.
When your Ubuntu server first boots, you'll have to log in with the user you created during installation. Here's an important point that bites a number of newcomers to Ubuntu: Unlike most distributions, Ubuntu does not enable the root account during installation! Instead, the installer adds the user you've created during installation to the admin group, which lets you use a mechanism called sudo for performing administrative tasks. We'll show you how to use sudo in a bit. In the meantime, if you're interested in the rationale for the decision to disable direct use of the root account, simply run man sudo_root after logging in.
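If you're curious whether a given account ended up in the admin group, the membership can be checked from Python's standard library. This is a small sketch with a hypothetical helper name; it reads the system's user and group databases and simply reports False if the user or group doesn't exist on the machine where it runs.

```python
import grp
import pwd


def user_in_group(user, group="admin"):
    """Return True if `user` belongs to `group`, as a primary or supplementary group."""
    try:
        entry = grp.getgrnam(group)
    except KeyError:
        return False  # no such group on this system
    if user in entry.gr_mem:
        return True   # listed as a supplementary member
    try:
        # The primary group is stored with the user record, not in gr_mem.
        return pwd.getpwnam(user).pw_gid == entry.gr_gid
    except KeyError:
        return False  # no such user on this system


print(user_in_group("root", "admin"))
```

Replace "root" with the name of the user you created during installation to check your own account.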