Home > Articles > Data > SQL Server

This chapter is from the book

Storage Hardware Overview

A basic understanding of server storage hardware is essential for any DBA who wants to effectively configure and administer the Database Engine of an instance of SQL Server 2012. The most elementary metric of hardware storage performance is IOPS, meaning input/output operations per second. Any elementary metric of performance is “throughput,” which is the amount of megabytes per second that an I/O subsystem can read or write in a sustained fashion.

This section discusses the fundamental server hardware components used for storage and I/O and how those components affect SQL Server performance, scalability, and cost. Storage performance and scalability is usually balanced against what the budget allows. There are always trade-offs between what is best and what is the best that the checkbook can afford.

The hardware storage subsystem is based on these fundamental components:

  • Hard Disks
  • RAID (Redundant Arrays of Inexpensive Disks)
  • Disk Controllers and Host Bust Adapters (HBA)
  • Network Attached Storage (NAS), Storage Area Networks (SAN), and Logical Units (LUN)
  • Solid State Drives (SSD)

The following section addresses the highlights concerning each of the preceding concepts:

Understanding Hard Disks

Hard disks have been with us for many decades and are familiar to most readers. Disks, also known as spindles, have natural limitations on their performance. Disks are essentially a mechanical spinning platter with a moving armature and read/write head that moves over the surface of the platter as needed to read and write data. Naturally, physics limits the speed that a mechanical device like this can perform reads and writes. Plus, the further apart the needed data might be on the surface of the disk, the further the armature has to travel, the longer it takes to perform the I/O operation. That’s why defragmentation, or the process of putting related data back together in contiguous disk sectors, is so important for hard disk-based I/O subsystems.

Depending on the speed of rotation on the platter and the “seek” speed of the armature, a hard disk is rated for the number of I/Os it can perform per second, called IOPS. For example, a modern, current generation hard disk might sustain ~200 IOPS with a seek speed of ~ 3.5ms. A bit of math quickly reveals that a SQL Server with a single hard disk would be overwhelmed by an application that needed to do any more than a couple hundred transactions per second. (And don’t forget that Windows needs to do I/O of its own!) So, the typical solution for high-performance database applications is to add more hard disks.

Time spent waiting for an I/O operation to complete is known as latency. Latency is usually very low for sequential operations—that is, an I/O that starts at one sector on the platter and moves directly to the next sector on the platter. Latency is usually higher for random operations—that is, an operation where the I/O starts at one sector but then must proceed to a random location elsewhere on the platter. Microsoft provides a lot of guidance about acceptable latency for a SQL Server I/O subsystem, with transaction log latency recommended to be 10ms or less and database file latency recommended to be 20ms or less. These are very fast I/O response times, so doubling those recommendations is likely to be acceptable for most business applications.

Although latency and IOPS are the most important characteristics of a hard disk for OLTP applications, business intelligence (BI) applications are typically heavy on read operations. Consequently, BI applications usually seek to maximize “disk throughput” because they make frequent use of large, serial reads culling through many gigabytes of data. A typical, current-generation hard disk today has a throughput ~125MB/second at 15,000 RPM. In BI applications, DBAs frequently monitor disk throughput along with disk latency. Refer to Chapter 15, “Monitoring SQL Server 2012,” for more details on monitoring SQL Server 2012’s performance.

The next section tells the best way to combine multiple disks for greater I/O performance, the redundant array of inexpensive disks (RAID).

Understanding RAID Technologies

If a single hard disk is insufficient for the I/O needs of the SQL Server instance, the usual approach is to add more hard disks. It is possible to add more spindles and then place specific SQL Server objects, such as tempdb, onto a single, additional hard disk and thereby see an improvement in I/O performance by segregating the I/O. Bad idea! Hard disks are prone to failure and, when they fail, a single hard disk failure can crash SQL Server. Instead, DBAs use redundant arrays of inexpensive disks (RAID) to add more hard disks while providing greater fault tolerance.

RAID is described in terms of many “levels.” But with database technology, the most commonly used types of RAID are RAID1, RAID5, and RAID10. RAID0 is also described but not recommended for databases. These are described a bit more in the following sections.

RAID0

RAID0, called striping, spreads IOPS evenly across two disks. RAID0 is very fast for reads and writes, but if any one disk in the array fails, the whole array crashes.

Figure 3.1 Shows two disks in RAID0 configuration.

Figure 3.1.

Figure 3.1. RAID Level 0, striped array of disks.

In this example, assume we have three transactions of two blocks each. So, Transaction1 needs to write two blocks to disk: Block A and Block B. Transaction2 needs to write Blocks C and D. Finally, Transaction3 needs to write Blocks E and F.

Each transaction takes only half the time to write to the RAID0 set as it would with a single disk because each of the two blocks are written simultaneously on Disk 1 and Disk 2, instead of writing the two blocks serially on a single disk. Of course, the downside is that if either drive fails, you lose the whole set, given that half of the data is on Disk 1 and the other half is on Disk 2.

RAID1

RAID1, called mirroring, is essentially a set of two disks in which every read and every write operation is performed on both disks simultaneously. Fault tolerance is improved because either disk can fail without crashing the array, allowing SQL Server to continue operating. The DBA or server administrator can then replace the failed drive without an emergency drill. RAID1 is essentially as fast as a single hard disk, but it is fault tolerant whenever a single disk in the array fails.

Figure 3.2 represents two disks in RAID1 configuration.

Figure 3.2.

Figure 3.2. RAID Level 1, mirrored array of disks.

In this example, assume as before that we have three transactions of two blocks each, this time written to a RAID1 set. So, Transaction1 needs to write two blocks to disk: Block A and Block B. Transaction2 needs to write Blocks C and D. Finally, Transaction3 needs to write Blocks E and F.

Each transaction takes about the same to write to the RAID1 set as it would with a single disk because each of the two blocks are written sereally on Disk 1 and Disk 2, just like on a single disk. The big benefit here is that if either drive fails, you still have a full and complete copy of the data on both Disk 1 and Disk 2. And since there is no parity bit calculation, write speed is superior to that of RAID5.

RAID5

RAID5 is a group of at least three disks in which every read is striped across all the disks in the array. This means that read-centric IOPS are faster than with a single disk because the required data can be pulled from more than one disk simultaneously. Write performance, however, is slower than read performance on RAID5 because every write IOP also includes one additional parity write. This parity bit enables the array to reassemble any lost data should one of the drives in the array fail. That means a RAID5 can survive a single drive failure, but not more than one drive failure at a time.

It also means that RAID5 is good for read-heavy applications, but is not as good for write-heavy applications. RAID5 can be expanded beyond three disks, but is typically never bigger than seven disks because of the considerable time needed to reconstruct a failed drive from the parity bits written across all those other drives.

Figure 3.3 represents three disks in RAID5 configuration.

Figure 3.3.

Figure 3.3. RAID Level 5, striped array of disks with parity.

In this example, assume as before that we have three transactions of two blocks each, this time written to a RAID5 set. So, Transaction1 needs to write two blocks to disk: Block A and Block B. Transaction2 needs to write Blocks C and D. Finally, Transaction3 needs to write Blocks E and F.

The first difference you’ll notice is that RAID5 requires a minimum of three disks. Each transaction takes longer to write to the RAID5 set as it would with a single disk because each of the two blocks are written as a stripe across two of the disks while the third disk has a calculated parity bit written to it. The calculation and extra block write takes more time. A transaction that reads two blocks off of the RAID5 set would be faster than a single disk read, because the blocks are striped and could be read simultaneously. The benefit here is that if any single drive fails, you have enough information to recalculate the missing data using the parity blocks. The other benefit is that it is cheaper than other kinds of RAID sets.

But RAID5 has drawbacks too. First, write operations are a lot slower than on DASD or the other RAID configurations described here. Second, even though RAID5 provides inexpensive fault tolerance, should a drive fail, the process of calculating the values of the missing data from the parity blocks can be time consuming. So a full recovery, while easy to do, can be slower than on RAID1 or RAID10.

RAID10

RAID10 is also called RAID1+0. RAID10 is a group of at least four disks in which a RAID1 pair of disks are also striped, RAID0 style, to another pairs of RAID1 disks. This approach uses twice as many disks as RAID1. But it means that the array can sustain two failed drives simultaneously without crashing, as long as the failed disks are not in a single RAID1 pair. RAID10 has both fast read and write speeds, but is more expensive and consumes more space than RAID1 and RAID5.

Figure 3.4 represents four disks in RAID10 configuration.

Figure 3.4.

Figure 3.4. RAID Level 10, mirrored plus striping.

In this example, assume as before that we have three transactions of two blocks each, this time written to a RAID10 set. So, Transaction1 needs to write two blocks to disk: Block A and Block B. Transaction2 needs to write Blocks C and D. Finally, Transaction3 needs to write Blocks E and F.

The first difference you’ll notice is that RAID10 requires a minimum of four disks. Each transaction is faster to write to the RAID10 set as it would with a single disk because each of the two blocks are written as a stripe across each pair of mirrored disks. A transaction that reads two blocks off of the RAID 10 set would be much faster than a single disk read, because the blocks are striped and could be read simultaneously. So, both writes and reads are typically faster than any other option presented here.

In addition to speed of both read and write operations, RAID10 sets are more fault tolerant than any other configuration. That is because each mirrored pair within the RAID10 set can sustain a single disk failure, meaning that more than one disk can fail without causing the array to crash.

But RAID10 most obvious and dramatic drawback is the cost of requiring at least four disks. The disks themselves are costly, but they also consume plenty of space and energy as well. If you can afford them, RAID10 arrays are certainly the best option. But they constitute a much greater expense than DASD.

Disk Controllers and Host Bus Adapters (HBA)

Disks are typically configured and controlled using one of two popular standards: SCSI (pronounced “scuzzy”) and IDE/ATA (usually “A-T-A” for short). SCSI is by far the most popular standard for servers, whereas ATA (usually labeled as SATA) is the most popular standard for home and personal use. Servers using hard disks that adhere to these standards can be directly cabled to the disks, resulting in the acronym DASD, for Direct Attached Storage Device. DASD can also be cabled using standard network protocols like IP over Ethernet cable or Fibre Channel. Note that Fibre Channel and iSCSI are seen primarily on SANs.

Host bus adapters (HBAs) control the movement of data between the server motherboard and the hard disks. HBAs typically include performance options, such as a built-in cache, which can improve performance by buffering writes to the hard disk and then sending them in a burst.

HBAs are important in storage and I/O discussions because it is possible, with enough active processes, to saturate a single HBA. For example, imagine a scheduled job attempts to back up all databases simultaneously on a SQL Server with multiple TBs of data while other I/O heavy operations were also processing. A situation like that could attempt to push through more I/O than the HBA can sustain. HBAs are also important because their firmware may need to be independently updated and maintained when faced with updates to Windows or other hardware components. Remember to stay apprised of vendor recommendations and updates when using high-end HBAs.

Network Attached Storage (NAS), Storage Area Networks (SAN), and Logical Units (LUN)

The Internet is full of information about digital storage. However, some of it is old, outdated, or simply not useful. With SQL Server and storage, consider Network Attached Storage (NAS) as one such area where DBAs should steer clear. Think of NAS servers as more of a file-and-print server technology, not suitable for SQL Server database files, transaction log files, or backup files. NAS may be useful in some Extract-Transform-Load (ETL) applications, but only when their use and potential failure won’t crash the production SQL Server database.

Storage Area Networks (SAN) are usually slightly slower than DASD, if only because they are more heavily used by a variety of applications than DASD. SAN is typically expensive and also has a degree of management overhead in that it should be set up, configured, and administrated only by a dedicated IT professional. But what it loses in speed, it makes up in flexibility, redundant components, and manageability.

For example, it is very easy to extend, reduce, or reconfigure storage on a SAN in Logical Units (LUNs) without stopping important Windows services, like SQL Server. These LUNs can be made available to Windows servers as if they were regular disk drives, when in fact they are usually RAID volumes or even just portions of RAID volumes. In fact, it is easy for SAN administrators to virtualize storage so that it can be quickly moved around on-the-fly.

This is great for SAN administrators, but it can be bad for DBAs if the storage the application has been depending upon is shared with another application in the enterprise. For example, it is possible to configure a large RAID10 volume as two LUNs. These two LUNs share the same underlying disks and, if both LUNS are busy, contend with one another for IOPS on the underlying RAID volume. It’s not good, but certainly a possibility.

Depending on the SAN in use, enormous amounts of cache may also be available to help speed I/O processing. These caches can also be quickly and easily configured (and reconfigured) by SAN administrators to better balance and tune applications that use the SAN.

Unfortunately, many if not most SAN administrators think about storage only as measured by volume, not IOPS or disk throughput. Consequently, it is up to DBAs to know how much I/O performance their applications need and to monitor I/O performance within SQL Server to ensure that they are achieving adequate I/O performance. Many a SAN administrator has been startled to find out that the SQL Server DBA is better informed about the I/O speed (or lack thereof) on the LUNs assigned to them than they are, usually due to a misconfigured setting somewhere on the SAN.

The bottom line for DBAs when administrating storage on a SAN is to follow the SAN vendor’s recommendations wherever possible, and then to monitor and performance tune the SQL Server instance as if the LUN(s) are normal disks.

Solid State Disks (SSD)

The new kids on the block for storage and I/O subsystems are several kinds of solid state disks (SSDs). SSDs are treated just like hard disks when configured, as individual devices or in RAID volumes. Because they are entirely electronic in nature, they offer significant savings in power consumption, speed, and resistance to damage from impact compared to hard disks. SSDs are becoming increasingly popular in the database administration community because they are remarkably faster than hard disks, especially for random I/O operations.

From a hardware perspective, a variety of different types of memory chips might be used within the SSD. But the two most common types of memory in SSDs are DRAM, which usually has volatile memory, and NAND flash memory, which is slower but nonvolatile. Volatile memory loses data when it loses power, whereas nonvolatile memory does not lose data when there is no power. When assessing flash memory-based SSDs, multilevel cell (MLC) flash memory is slower and less reliable than single-level cell (SLC) flash memory.

SSDs, however, have a few special considerations. First, the memory blocks within an SSD can be erased and rewritten a limited number of times. (DRAM-based SSD does not have this limitation.) Enterprise-quality flash drives work around this limitation by overprovisioning the amount of storage on the SSD through algorithms called wear leveling, thus ensuring that the SSD will last for the same duration as a similarly priced hard disk. Second, SSDs require a lot of free memory blocks to perform write operations. Whereas hard disks simply overwrite an unused sector, SSDs must clear out a previously written block using an algorithm called TRIM.

Finally, whereas SQL Server indexes residing on hard disks need frequent defragmentation, indexes residing on SSDs have no such requirement. Because all memory blocks on the SSD are only a few electrons away, all read access is pretty much the same speed whether the index pages are contiguous or not.

Now that you understand the most important aspects of storage hardware, let’s discuss the principles and management tasks that correlate the storage and I/O elements of SQL Server back to the system hardware.

The following “Notes from the Field” section introduces the one of the most commonly used methods of improving I/O performance: segregation of workload.

InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

Overview


Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

Collection and Use of Information


To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

Newsletters

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

Other Collection and Use of Information


Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

Cookies and Related Technologies

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

Do Not Track

This site currently does not respond to Do Not Track signals.

Security


Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.

Children


This site is not directed to children under the age of 13.

Marketing


Pearson may send or direct marketing communications to users, provided that

  • Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
  • Such marketing is consistent with applicable law and Pearson's legal obligations.
  • Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
  • Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

Correcting/Updating Personal Information


If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.

Choice/Opt-out


Users can always make an informed choice as to whether they should proceed with certain services offered by InformIT. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.informit.com/u.aspx.

Sale of Personal Information


Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

Supplemental Privacy Statement for California Residents


California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

Sharing and Disclosure


Pearson may disclose personal information, as follows:

  • As required by law.
  • With the consent of the individual (or their parent, if the individual is a minor)
  • In response to a subpoena, court order or legal process, to the extent permitted or required by law
  • To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
  • In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
  • To investigate or address actual or suspected fraud or other illegal activities
  • To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
  • To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
  • To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.

Links


This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.

Requests and Contact


Please contact us about this Privacy Notice or if you have any requests or questions relating to the privacy of your personal information.

Changes to this Privacy Notice


We may revise this Privacy Notice through an updated posting. We will identify the effective date of the revision in the posting. Often, updates are made to provide greater clarity or to comply with changes in regulatory requirements. If the updates involve material changes to the collection, protection, use or disclosure of Personal Information, Pearson will provide notice of the change through a conspicuous notice on this site or other appropriate way. Continued use of the site after the effective date of a posted revision evidences acceptance. Please contact us if you have questions or concerns about the Privacy Notice or any objection to any revisions.

Last Update: November 17, 2020