Planning the Storage Layout
There are just about as many "best practices" for storage configuration on a server as there are people who configure storage on a server. Although there really is no "one size fits all" solution that works without issues for every SBS installation, this section addresses the main factors that should be considered when planning the storage layout for the new system.
The two main types of media that make up the storage configuration on any server are disk media and backup media. Disk media storage has grown from single MFM/RLL drives to IDE to SCSI to SATA. Historically, backup media has almost exclusively been tape media of some type, such as DAT, DLT, or LTO. These days, some shops are using external hard drives with either USB or FireWire connections as backup devices as well. But the function of each type of storage remains the same. Disk media is used for real-time access to data; backup media is generally accessed offline for archival or disaster recovery purposes.
The next section covers terminology as it relates to real-time and backup protection.
Fault tolerance defines a system’s capability to recover from a failure of hardware or software in such a way as to minimize the impact on the system. In most computer systems, hard disk drives are the first components to fail because they have the most moving parts and are accessed constantly while the system is powered on. Knowing this, most server systems are built with some form of fault tolerance for the disk system to minimize the impact when a disk drive fails.
Hardware Versus Software Fault Tolerance
SBS servers can achieve fault tolerance for the disk subsystems using either hardware or software solutions. Hardware solutions rely on specialized disk controllers to manage the selected fault tolerance implementation, and these controllers are more expensive than standard disk controllers. Hardware fault-tolerant solutions provide either a mirrored solution, in which two disks of the same size act as one, or a RAID (redundant array of inexpensive disks) solution, in which three or more disks function as a single drive. See the next section, "RAID Types," for a more detailed explanation of RAID arrays and their functions.
Microsoft servers can also implement mirrored and RAID solutions via software, avoiding the expense of a specialized disk controller card. Through the Disk Management console, volumes of the same size on dynamic disks can be mirrored by the operating system or combined into a RAID 5 volume.
Although more expensive, hardware-based fault tolerance solutions are preferred over the software solutions for one reason: performance. The software implementations Microsoft provides for mirroring and RAID cost less in hardware, but the work of managing the mirror or RAID falls on the server itself, and that overhead has a significant impact on server performance.
Traditionally, SCSI RAID controllers have been the devices of choice for fault tolerance solutions for the disk subsystem. But disk and controller manufacturers have been looking at less expensive options for the last few years because IDE/ATA drives are much less expensive than their SCSI counterparts and now have similar performance specifications, which was not the case just a few years ago. Recently, a number of IDE-RAID and Serial ATA (SATA) RAID controllers have come on the market, and several major hardware manufacturers are beginning to incorporate these devices into their desktop and server lines. Over the next few years, new disk storage technologies will likely be introduced that will help drive down the cost of fault tolerant disk solutions for servers.
RAID, which stands for redundant array of inexpensive disks or redundant array of independent disks depending on whom you ask, is a specification for combining multiple disk units of the same size into a single logical unit for the purpose of improving read/write performance, providing fault tolerance, or both. Although there are a number of RAID specifications, only a few are actually used in practice. Table 3.2 lists the most commonly used types of RAID and describes their functions, advantages, and disadvantages. The number of disks needed for each RAID type is listed, as is the total available disk space for each type (the values are based on 40GB drives used as individual elements in the array).
One other advantage of a RAID configuration is that most RAID controllers can accommodate a hot spare, an extra disk drive on the controller that automatically becomes active if one of the other members of the array fails. When a hot spare is combined with a hot-swappable drive technology, the failed drive can also be removed and replaced without bringing down the server. The upside is obvious: if a drive fails, the system automatically rebuilds the necessary information on the newly activated disk, which reduces the time the server spends without fault tolerance. The downside is the overhead associated with rebuilding data onto the newly activated drive, which end users may notice as slower performance during the rebuilding process. Use of a hot spare is more commonly found with RAID 5 implementations but can be used with a mirrored configuration as well.
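To make the capacity trade-off among RAID types concrete, the following is a minimal Python sketch that computes usable space from the standard capacity formulas for RAID 0, 1, 5, and 10. The function name `raid_usable_gb` and its parameters are illustrative assumptions; the 40GB default mirrors the drive size used for the table values, but the results are derived from the general formulas rather than copied from Table 3.2. A hot spare adds a drive to the purchase without adding usable space, so it is not counted here.

```python
def raid_usable_gb(level, disks, disk_gb=40):
    """Return usable capacity in GB for an array of identical disks.

    Hypothetical helper for illustration; uses the standard formulas:
      RAID 0  -> all disks (striping, no redundancy)
      RAID 1  -> one disk (two-disk mirror)
      RAID 5  -> disks - 1 (one disk's worth of parity)
      RAID 10 -> half the disks (striped mirrors)
    """
    if level == "0":
        if disks < 2:
            raise ValueError("RAID 0 needs at least 2 disks")
        return disks * disk_gb
    if level == "1":
        if disks != 2:
            raise ValueError("a simple mirror uses exactly 2 disks")
        return disk_gb
    if level == "5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb
    if level == "10":
        if disks < 4 or disks % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, 4 or more")
        return (disks // 2) * disk_gb
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    for level, disks in [("0", 2), ("1", 2), ("5", 3), ("5", 5), ("10", 4)]:
        print(f"RAID {level} with {disks} x 40GB: {raid_usable_gb(level, disks)}GB usable")
```

With three 40GB drives in RAID 5, for example, the sketch reports 80GB usable, because one drive's worth of space is consumed by parity.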
Multiple Partitions Versus Multiple Spindles
Finding the ideal storage layout is a giant puzzle with a number of key pieces. In the end, the layout implemented is the result of a number of compromises with these pieces.
Ideally, some would suggest that an optimum SBS installation would have three spindles, or separate drive mechanisms. One spindle would contain the OS and key applications, one would contain the Exchange log files, and one would contain the Exchange mail databases. This layout would be optimized for performance because the type of disk access needed to read and process the Exchange log files (sequential) is different from the disk access needed to process the Exchange databases (random). User data could be stored on the spindle with the Exchange logs because most user data would be read and written sequentially, and any systemwide databases would be stored on the spindle with the Exchange databases because they would use a similar type of drive access.
But the cost of such a layout would keep a small business from implementing it. To achieve any level of fault tolerance, you would need to at least mirror each of the spindles, for a total of six drives. If performance were truly the primary consideration, the two non-OS spindles would likely be a RAID 5 or RAID 50 array, jumping the number of disk drives up to at least eight. In the heady days of the dot-com spending of the late 1990s, when startup capital seemed to come out of the woodwork, allocating financial resources for this type of setup might have been possible. Not so today.
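The drive counts above follow from the minimum number of disks each fault tolerance type requires. The short Python sketch below reproduces that arithmetic; the names `MIN_DISKS` and `drives_needed` are hypothetical, and the second layout assumes the OS spindle stays mirrored while each non-OS spindle becomes a minimal three-disk RAID 5 array.

```python
# Minimum physical disks per logical spindle, by protection type.
MIN_DISKS = {"single": 1, "mirror": 2, "raid5": 3}


def drives_needed(layout):
    """Sum the minimum disk count over every spindle in a layout."""
    return sum(MIN_DISKS[kind] for kind in layout.values())


# Three spindles, each mirrored.
all_mirrored = {"os": "mirror", "exchange_logs": "mirror", "exchange_db": "mirror"}

# Assumed performance layout: mirrored OS, RAID 5 for the two non-OS spindles.
performance = {"os": "mirror", "exchange_logs": "raid5", "exchange_db": "raid5"}

if __name__ == "__main__":
    print("All mirrored:", drives_needed(all_mirrored), "drives")
    print("Performance: ", drives_needed(performance), "drives")
```

Under these assumptions the two layouts come to six and eight drives, which matches the figures in the text; adding hot spares or larger RAID 5 sets would raise the count further.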
In the cost-aware economy of the mid-2000s, almost all organizations, especially small businesses, are looking for ways to reduce costs, and a requisition for a server as described previously might not even make it past the first approval signature needed. So the most cost-effective way to implement drive storage would be a single-spindle solution with some measure of fault tolerance. At a minimum, this would be a server with two mirrored hard drives, which is not an ideal scenario for performance but is cost effective and provides some degree of fault tolerance.
Many would argue that a server should really have two spindles—one for the operating system and one for data. The reasoning behind this is that if something were to happen to the drives containing the operating system, the data is still intact, and the server can be brought back to life fairly quickly by reinstalling and restoring the operating system configuration from backup. If the budget permits, each of the two spindles could be configured as separate RAID 5 arrays, or even a mirror for the OS spindle and RAID 5 for the data spindle.
But for some, the bottom line is everything, so the system must be built as inexpensively as possible. This usually means leaving the server operating on a single spindle, either a mirror or a RAID 5 array. Is this a bad configuration? No, but it does not present many opportunities for optimizing the storage space for speed. And the drive can still be logically divided into partitions to either segregate the data for organizational benefit or to help speed recovery times in case of a data disaster. Partitioning a single drive does not offer any performance benefits because the same drive mechanism is being used to read and write data to each of the partitions, so separating the Exchange log files onto a different partition from the Exchange databases will not have any positive performance impact on the server.
That being said, the SBS support community does have some general recommendations for basic data storage layout. The general rule of thumb is to have a C: partition of 12GB–16GB and a data partition as large as needed to handle the client’s storage needs. Ideally, these would be on separate spindles for performance benefits, but at least partitioning a single spindle is recommended. The best solution for storage allocation cannot be boiled down to a single formula that works for every installation. The correct answer always depends on the needs of each individual installation, recognizing that cost will often be the deciding factor.
Choosing the best backup system for the server also depends on a number of factors. Cost is certainly one of those, but so are reliability, speed, capacity, and ease of use, to name just a few. Although tape backup has been a mainstay for years and is probably still the default assumption of system builders, recent technology improvements have changed the backup landscape slightly, and that landscape is worth another look.
The biggest challenge facing any backup technology is capacity. Disk storage continues to increase in speed and capacity and drop in price. Tape backup systems have not enjoyed the same success. It is not uncommon to find small businesses needing servers with hundreds of gigabytes of disk storage. It is uncommon to find a tape technology that can back up that much data on a single cartridge without costing more than the server itself.
The built-in backup solution provided through the SBS wizards has two limitations. First, the wizard configures a backup process that attempts to back up all the data on the server in one job. Although certain areas of the server can be excluded from the backup job, that is the limit of the customization that can be achieved with the backup wizards. The SBS backup process is covered in greater detail in Chapter 13, "Exchange Disaster Recovery," and Chapter 18, "Backing Up SBS," and those chapters cover additional methods and built-in tools that can be used to streamline the backup process.
The second limitation of the SBS Backup Wizard is that it can only back up to a single tape in a single tape device. If the data to be backed up is larger than the capacity of the tape, the backup job fails. The wizard has no mechanism to prompt for a tape change or to interface with a tape loader. This is where the cost versus capacity challenge in tape devices really hits home for the SBS customer. A client who wants the simplicity of the one-click backup offered by the SBS wizard but has 300GB–400GB of data stored on the server will need some sort of high-end LTO or other device that can handle the capacity. Given the cost of these devices, it probably makes more sense to look at a third-party backup solution and a midrange capacity tape auto-loader.
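Because the wizard writes to one tape only, the planning question reduces to whether the full backup set fits on a single cartridge. The Python sketch below is one hedged way to express that check; the function name `fits_on_single_tape`, the `compression_ratio` parameter, and the 200GB and 400GB native capacities used in the example are illustrative assumptions, not values taken from this chapter or from the wizard itself.

```python
def fits_on_single_tape(data_gb, tape_native_gb, compression_ratio=1.0):
    """Return True if a backup of data_gb fits on one cartridge.

    compression_ratio is the assumed effective compression (1.0 means none).
    Vendor "compressed" capacities typically assume 2:1, which real data
    such as already-compressed files may not achieve, so 1.0 is the
    conservative planning default.
    """
    if data_gb < 0 or tape_native_gb <= 0:
        raise ValueError("sizes must be positive")
    if compression_ratio < 1.0:
        raise ValueError("compression_ratio must be at least 1.0")
    return data_gb <= tape_native_gb * compression_ratio


if __name__ == "__main__":
    data_gb = 350  # within the 300GB-400GB range discussed above
    for native_gb in (200, 400):  # illustrative native cartridge capacities
        ok = fits_on_single_tape(data_gb, native_gb)
        print(f"{data_gb}GB on a {native_gb}GB tape: {'fits' if ok else 'does not fit'}")
```

A check like this is worth repeating with projected data growth, since a backup that barely fits today will start failing once the data outgrows the cartridge.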
However, all hope is not lost. One option of the SBS Backup Wizard is to store the backup to a data file on disk. This can be a local disk on the server, a disk accessible across the network, or a removable USB or FireWire disk attached to the server. Many consultants have started looking seriously at external USB and FireWire disk drives as an alternative to tape for a number of reasons. First is cost: several external hard drive units can be purchased for less than the cost of a midrange tape drive unit. Second is portability: with a removable drive, the backups can still be stored offsite, a requirement of some legislation affecting certain industries. Third is speed: accessing data on a disk is inherently faster than accessing the same data on a tape. In Chapter 18, the mechanics of implementing a backup rotation using a removable disk are covered in greater detail. For planning purposes in this chapter, if a removable disk backup solution is being considered, make sure that the server has a FireWire interface or, failing that, a USB 2.0 interface. FireWire generally gives faster data transfer than USB 2.0, but if USB is the only option, the USB 2.0 interface is essential; USB 1.1 is just too slow to be a practical option for backup.
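A rough calculation shows why USB 1.1 is impractical for backup. The Python sketch below estimates a best-case transfer time from the nominal signaling rates of USB 1.1 (12 Mbit/s) and USB 2.0 (480 Mbit/s). The function name `transfer_hours` and the `efficiency` parameter are assumptions for illustration; real sustained throughput is noticeably lower than the nominal rate on any interface, so treat the results as lower bounds on backup time.

```python
# Nominal signaling rates in megabits per second.
NOMINAL_MBIT_PER_S = {"USB 1.1": 12, "USB 2.0": 480}


def transfer_hours(data_gb, interface, efficiency=1.0):
    """Estimate hours needed to move data_gb over the given interface.

    efficiency is the assumed fraction of the nominal rate actually
    achieved (1.0 is the theoretical best case).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    megabits = data_gb * 8 * 1000  # decimal gigabytes to megabits
    seconds = megabits / (NOMINAL_MBIT_PER_S[interface] * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for name in NOMINAL_MBIT_PER_S:
        print(f"100GB over {name}: {transfer_hours(100, name):.1f} hours (best case)")
```

Even in the theoretical best case, 100GB takes more than 18 hours over USB 1.1 and under half an hour over USB 2.0, so a nightly backup window rules out USB 1.1 for all but very small data sets.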