
Designing and Administering Storage on SQL Server 2012

The following section is topical in approach. Rather than describe all the administrative functions and capabilities of a certain screen, such as the Database Settings page in the SSMS Object Explorer, this section provides a top-down view of the most important considerations when designing the storage for an instance of SQL Server 2012 and how to achieve maximum performance, scalability, and reliability.

This section begins with an overview of database files and their importance to overall I/O performance, in “Designing and Administering Database Files in SQL Server 2012,” followed by information on how to perform important step-by-step tasks and management operations. SQL Server storage is centered on databases, although a few settings are adjustable at the instance-level. So, great importance is placed on proper design and management of database files.

The next section, titled “Designing and Administering Filegroups in SQL Server 2012,” provides an overview of filegroups as well as details on important tasks. Prescriptive guidance also tells important ways to optimize the use of filegroups in SQL Server 2012.

Next, FILESTREAM functionality and administration are discussed, along with step-by-step tasks and management operations, in the section “Designing for BLOB Storage.” This section also provides a brief introduction and overview of another supported storage method called Remote Blob Store (RBS).

Finally, an overview of partitioning details how and when to use partitions in SQL Server 2012, their most effective application, common step-by-step tasks, and common use-cases, such as a “sliding window” partition. Partitioning may be used for both tables and indexes, as detailed in the upcoming section “Designing and Administering Partitions in SQL Server 2012.”

Designing and Administering Database Files in SQL Server 2012

Whenever a database is created on an instance of SQL Server 2012, a minimum of two database files is required: one for data and one for the transaction log. By default, SQL Server creates a single data file and a single transaction log file on the same default destination disk. Under this configuration, the data file is called the primary data file and has the .mdf file extension by default. The log file has a file extension of .ldf by default. When databases need more I/O performance, it’s typical to add more data files to the user database that needs the added performance. These added data files are called secondary files and typically use the .ndf file extension.
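The file layout just described can also be expressed directly in Transact-SQL. The following sketch is illustrative only — the database name, file names, paths, and sizes are assumptions to adapt to your environment — and creates a database with a primary data file, one secondary data file, and a transaction log file:

```sql
-- Illustrative only: adjust names, paths, and sizes to your environment.
CREATE DATABASE SampleDB
ON PRIMARY
    ( NAME = N'SampleDB_Data',   -- primary data file (.mdf)
      FILENAME = N'E:\SQLData\SampleDB_Data.mdf',
      SIZE = 100MB, FILEGROWTH = 50MB ),
    ( NAME = N'SampleDB_Data2',  -- secondary data file (.ndf)
      FILENAME = N'E:\SQLData\SampleDB_Data2.ndf',
      SIZE = 100MB, FILEGROWTH = 50MB )
LOG ON
    ( NAME = N'SampleDB_Log',    -- transaction log file (.ldf)
      FILENAME = N'F:\SQLLogs\SampleDB_Log.ldf',
      SIZE = 50MB, FILEGROWTH = 25MB );
GO
```

Note that the data and log files are deliberately placed on different drives, in keeping with the placement guidance that follows.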

As mentioned in the earlier “Notes from the Field” section, adding multiple files to a database is an effective way to increase I/O performance, especially when those additional files are used to segregate and offload a portion of I/O. We will provide additional guidance on using multiple database files in the later section titled “Utilizing Multiple Data Files.”

When you have an instance of SQL Server 2012 that does not have a high performance requirement, a single disk probably provides adequate performance. But in most cases, especially an important production database, optimal I/O performance is crucial to meeting the goals of the organization.

The following sections address important prescriptive guidance concerning data files. First, design tips and recommendations are provided for where on disk to place database files, as well as the optimal number of database files to use for a particular production database. Other guidance describes the I/O impact of certain database-level options.

Placing Data Files onto Disks

At this stage of the design process, imagine that you have a user database that has only one data file and one log file. Where those individual files are placed on the I/O subsystem can have an enormous impact on their overall performance, typically because they must share I/O with other files and executables stored on the same disks. So, if we can place the user data file(s) and log files onto separate disks, where is the best place to put them?

When designing and segregating I/O by workload on SQL Server database files, there are certain predictable payoffs in terms of improved performance. When separating workload on to separate disks, it is implied that by “disks” we mean a single disk, a RAID1, -5, or -10 array, or a volume mount point on a SAN. The following list ranks the best payoff, in terms of providing improved I/O performance, for a transaction processing workload with a single major database:

  1. Separate the user log file from all other user and system data files and log files. The server now has two disks:
    • Disk A:\ is for randomized reads and writes. It houses the Windows OS files, the SQL Server executables, the SQL Server system databases, and the production database file(s).
    • Disk B:\ is solely for serial writes (and, very occasionally, reads) of the user database log file. This single change can often provide a 30% or greater improvement in I/O performance compared to a system where all data files and log files are on the same disk.

    Figure 3.5 shows what this configuration might look like.

    Figure 3.5.

    Figure 3.5. Example of basic file placement for OLTP workloads.

  2. Separate tempdb, both data file and log file, onto a separate disk. Even better is to put the data file(s) and the log file onto their own individual disks. The server now has three or four disks:
    • Disk A:\ is for randomized reads and writes. It houses the Windows OS files, the SQL Server executables, the SQL Server system databases, and the user database file(s).
    • Disk B:\ is solely for serial reads and writes of the user database log file.
    • Disk C:\ is for tempdb data file(s) and log file. Separating tempdb onto its own disk provides varying amounts of improvement to I/O performance, but it is often in the mid-teens, with 14–17% improvement common for OLTP workloads.
    • Optionally, Disk D:\ to separate the tempdb transaction log file from the tempdb database file.

    Figure 3.6 shows an example of intermediate file placement for OLTP workloads.

    Figure 3.6.

    Figure 3.6. Example of intermediate file placement for OLTP workloads.

  3. Separate user data file(s) onto their own disk(s). Usually, one disk is sufficient for many user data files, because they all have a randomized read-write workload. If there are multiple user databases of high importance, make sure to separate the log files of those databases, in order of business importance, onto their own disks. The server now has many disks, with an additional disk for the important user data file(s) and, where needed, many disks for the log files of the user databases on the server:
    • Disk A:\ is for randomized reads and writes. It houses the Windows OS files, the SQL Server executables, and the SQL Server system databases.
    • Disk B:\ is solely for serial reads and writes of the user database log file.
    • Disk C:\ is for tempdb data file(s) and log file.
    • Disk E:\ is for randomized reads and writes for all the user database files.
    • Drive F:\ and greater are for the log files of other important user databases, one drive per log file.

    Figure 3.7 shows an example of advanced file placement for OLTP workloads.

    Figure 3.7.

    Figure 3.7. Example of advanced file placement for OLTP workloads.

  4. Repeat step 3 as needed to further segregate database files and transaction log files whose activity creates contention on the I/O subsystem. And remember—the figures only illustrate the concept of a logical disk. So, Disk E in Figure 3.7 might easily be a RAID10 array containing twelve actual physical hard disks.
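Moving tempdb to its own disk, as in step 2 above, is done by changing its file locations with ALTER DATABASE and then restarting the instance. The following sketch uses the default SQL Server 2012 logical file names (tempdev and templog); the target paths are assumptions:

```sql
-- Relocate tempdb files to dedicated disks (takes effect on restart).
ALTER DATABASE tempdb
MODIFY FILE ( NAME = N'tempdev', FILENAME = N'C:\SQLTempdb\tempdb.mdf' );
GO
ALTER DATABASE tempdb
MODIFY FILE ( NAME = N'templog', FILENAME = N'D:\SQLTempdbLog\templog.ldf' );
GO
```

After restarting the SQL Server service, the old tempdb files can be deleted from the original location.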

Utilizing Multiple Data Files

As mentioned earlier, SQL Server defaults to the creation of a single primary data file and a single primary log file when creating a new database. The log file contains the information needed to make transactions and databases fully recoverable. Because its I/O workload is serial, writing one transaction after the next, the disk read-write head rarely moves. In fact, we don’t want it to move. Also, for this reason, adding additional files to a transaction log almost never improves performance. Conversely, data files contain the tables (along with the data they contain), indexes, views, constraints, stored procedures, and so on. Naturally, if the data files reside on segregated disks, I/O performance improves because the data files no longer contend with one another for the I/O of that specific disk.

Less well known, though, is that SQL Server is able to provide better I/O performance when you add secondary data files to a database, even when the secondary data files are on the same disk, because the Database Engine can use multiple I/O threads on a database that has multiple data files. The general rule for this technique is to create one data file for every two to four logical processors available on the server. So, a server with a single one-core CPU can’t really take advantage of this technique. If a server had two four-core CPUs, for a total of eight logical CPUs, an important user database might do well to have four data files.

The newer and faster the CPU, the higher the ratio to use. A brand-new server with two four-core CPUs might do best with just two data files. Also note that this technique offers improved performance as more data files are added, but it plateaus at 4, 8, or in rare cases 16 data files. Thus, a commodity server might show improving performance on user databases with two and four data files, but stop showing any improvement with more than four data files. Your mileage may vary, so be sure to test any changes in a nonproduction environment before implementing them.
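To apply the one-file-per-two-to-four-logical-processors rule of thumb, you can check the logical CPU count the instance sees and add secondary files accordingly. A minimal sketch — the secondary file name and path are illustrative assumptions:

```sql
-- Check the number of logical processors visible to SQL Server.
SELECT cpu_count FROM sys.dm_os_sys_info;
GO
-- For example, on an 8-logical-CPU server, add secondary files so the
-- database has roughly cpu_count / 2 data files in total.
ALTER DATABASE [AdventureWorks2012]
ADD FILE ( NAME = N'AdventureWorks2012_Data3',
           FILENAME = N'F:\SQLData\AdventureWorks2012_Data3.ndf',
           SIZE = 10240KB, FILEGROWTH = 1024KB );
GO
```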

Sizing Multiple Data Files

Suppose a new and very important production database application, called BossData, is coming online. It is the only production database on the server, and according to the guidance provided earlier, we have configured the disks and database files like this:

  1. Drive C:\ is a RAID1 pair of disks acting as the boot drive housing the Windows Server OS, the SQL Server executables, and the system databases of Master, MSDB, and Model.
  2. Drive D:\ is the DVD drive.
  3. Drive E:\ is a RAID1 pair of high-speed SSDs housing tempdb data files and the log file.
  4. Drive F:\, in a RAID10 configuration with many disks, houses the random I/O workload of the eight BossData data files: one primary file and seven secondary files.
  5. Drive G:\ is a RAID1 pair of disks housing the BossData log file.

Most of the time, BossData has fantastic I/O performance. However, it occasionally slows down for no immediately evident reason. Why would that be?

As it turns out, the size of multiple data files is also important. Whenever a database has one file larger than another, SQL Server will send more I/O to the large file because of an algorithm called round-robin, proportional fill. “Round-robin” means that SQL Server will send I/O to one data file at a time, one right after the other. So for the BossData database, the SQL Server Database Engine would send one I/O first to the primary data file, the next I/O would go to the first secondary data file in line, the next I/O to the next secondary data file, and so on. So far, so good.

However, the “proportional fill” part of the algorithm means that SQL Server will focus its I/Os on each data file in turn until it is as full, proportionally, as all the other data files. So, if all but two of the data files in the BossData database are 50GB, but two are 200GB, SQL Server would send four times as many I/Os to the two bigger data files in an effort to keep them as proportionately full as all the others.

In a situation where BossData needs a total of 800GB of storage, it would be much better to have eight 100GB data files than to have six 50GB data files and two 250GB data files.
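You can spot unevenly sized data files — and therefore skewed proportional fill — with a quick catalog query. A minimal sketch, run in the context of the database you want to check:

```sql
-- List each data file's size to spot imbalances.
-- The size column is stored in 8KB pages, hence the conversion to MB.
SELECT name,
       size * 8 / 1024 AS size_mb
FROM   sys.database_files
WHERE  type_desc = 'ROWS'   -- data files only; log files report as LOG
ORDER BY size_mb DESC;
GO
```

If one file is dramatically larger than its peers, it will attract a correspondingly larger share of the I/O.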

Autogrowth and I/O Performance

When you’re allocating space for the first time to both data files and log files, it is a best practice to plan for future I/O and storage needs, which is also known as capacity planning.

In this situation, estimate the amount of space required not only for operating the database in the near future, but estimate its total storage needs well into the future. After you’ve arrived at the amount of I/O and storage needed at a reasonable point in the future, say one year hence, you should preallocate the specific amount of disk space and I/O capacity from the beginning.

Over-relying on the default autogrowth features causes two significant problems. First, growing a data file causes database operations to slow down while the new space is allocated and can lead to data files with widely varying sizes for a single database. (Refer to the earlier section “Sizing Multiple Data Files.”) Growing a log file causes write activity to stop until the new space is allocated. Second, constantly growing the data and log files typically leads to more logical fragmentation within the database and, in turn, performance degradation.

Most experienced DBAs will also set the autogrow settings sufficiently high to avoid frequent autogrowths. For example, data file autogrow defaults to a meager 25MB, which is certainly a very small amount of space for a busy OLTP database. It is recommended to set these autogrow values to a considerable percentage of the file size expected at the one-year mark. So, for a database with a 100GB data file and a 25GB log file expected at the one-year mark, you might set the autogrowth values to 10GB and 2.5GB, respectively.

Additionally, log files that have been subjected to many tiny, incremental autogrowths have been shown to underperform compared to log files with fewer, larger file growths. This phenomenon occurs because each time the log file is grown, SQL Server creates new VLFs, or virtual log files. The VLFs connect to one another using pointers to show SQL Server where one VLF ends and the next begins. This chaining works seamlessly behind the scenes. But it’s simple common sense that the more often SQL Server has to read the VLF chaining metadata, the more overhead is incurred. So a 20GB log file containing four VLFs of 5GB each will outperform the same 20GB log file containing 2,000 VLFs.
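You can count a database’s VLFs with the DBCC LOGINFO command (undocumented but widely used in SQL Server 2012); each row in its output represents one VLF:

```sql
USE [AdventureWorks2012];
GO
-- Each row returned describes one virtual log file (VLF) in the log.
-- A row count in the hundreds or thousands suggests the log grew in
-- many small increments and may benefit from being rebuilt larger.
DBCC LOGINFO;
GO
```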

Configuring Autogrowth on a Database File

To configure autogrowth on a database file (as shown in Figure 3.8), follow these steps:

  1. From within the File page on the Database Properties dialog box, click the ellipsis button located in the Autogrowth column on a desired database file to configure it.
  2. In the Change Autogrowth dialog box, configure the File Growth and Maximum File Size settings and click OK.
  3. Click OK in the Database Properties dialog box to complete the task.
Figure 3.8.

Figure 3.8. Configure Autogrowth on database files.

You can alternatively use the following Transact-SQL syntax to modify the autogrowth settings for a database file, based on a growth rate of 10GB and an unlimited maximum file size:

USE [master]
GO
ALTER DATABASE [AdventureWorks2012]
MODIFY FILE ( NAME = N'AdventureWorks2012_Data',
MAXSIZE = UNLIMITED , FILEGROWTH = 10GB
)
GO

Data File Initialization

Anytime SQL Server has to initialize a data or log file, it overwrites any residual data on the disk sectors that might be hanging around because of previously deleted files. This process fills the files with zeros and occurs whenever SQL Server creates a database, adds files to a database, expands the size of an existing log or data file through autogrow or a manual growth process, or performs a database or filegroup restore. This isn’t a particularly time-consuming operation unless the files involved are large, such as over 100GB. But when the files are large, file initialization can take quite a long time.

It is possible to avoid full file initialization on data files through a technique called instant file initialization. Instead of writing the entire file with zeros, SQL Server overwrites any existing disk data only as new data is written to the file when instant file initialization is enabled. Instant file initialization does not work on log files, nor on databases where transparent data encryption is enabled.

SQL Server will use instant file initialization whenever it can, provided the SQL Server service account has the SE_MANAGE_VOLUME_NAME privilege. This is a Windows-level permission granted to members of the Windows Administrators group and to users assigned the Perform Volume Maintenance Tasks security policy.
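One common way to verify whether instant file initialization is actually in effect is to temporarily enable trace flags 3004 and 3605, create a scratch database, and check the SQL Server error log for zeroing messages. A diagnostic sketch, intended for a test instance (the scratch database name is an assumption):

```sql
-- Diagnostic sketch: check whether instant file initialization is active.
DBCC TRACEON (3004, 3605, -1);  -- log file-zeroing activity to the error log
GO
CREATE DATABASE IFICheck;       -- scratch database; dropped below
GO
EXEC sp_readerrorlog 0, 1, N'zeroing';
-- If a 'Zeroing ... IFICheck.mdf' entry appears, instant file
-- initialization is NOT enabled for data files. A 'zeroing' entry for
-- the .ldf is expected either way, because log files are always zeroed.
GO
DBCC TRACEOFF (3004, 3605, -1);
DROP DATABASE IFICheck;
GO
```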

For more information, refer to the SQL Server Books Online documentation.

Shrinking Databases, Files, and I/O Performance

The Shrink Database task reduces the physical database and log files to a specific size. This operation removes excess space in the database based on a percentage value. In addition, you can enter thresholds in megabytes, indicating the amount of shrinkage that needs to take place when the database reaches a certain size and the amount of free space that must remain after the excess space is removed. Free space can be retained in the database or released back to the operating system.

It is a best practice not to shrink the database. First, when shrinking the database, SQL Server moves full pages at the end of data file(s) to the first open space it can find at the beginning of the file, allowing the end of the files to be truncated and the file to be shrunk. This process can increase the log file size because all moves are logged. Second, if the database is heavily used and there are many inserts, the data files may have to grow again.

SQL Server 2005 and later address slow autogrowth with instant file initialization, so the growth process is not as slow as it was in the past. However, sometimes autogrow cannot keep up with the space requirements, causing performance degradation. Finally, shrinking the database leads to excessive fragmentation. If you absolutely must shrink the database, do it manually when the server is not being heavily utilized.

You can shrink a database by right-clicking a database and selecting Tasks, Shrink, and then Database or File.

Alternatively, you can use Transact-SQL to shrink a database or file. The following Transact-SQL syntax shrinks the AdventureWorks2012 database, returns freed space to the operating system, and allows for 15% of free space to remain after the shrink:

USE [AdventureWorks2012]
GO
DBCC SHRINKDATABASE(N'AdventureWorks2012', 15)
GO

Administering Database Files

The Database Properties dialog box is where you manage the configuration options and values of a user or system database. You can execute additional tasks from within these pages, such as database mirroring and transaction log shipping. The configuration pages in the Database Properties dialog box that affect I/O performance include the following:

  • Files
  • Filegroups
  • Options
  • Change Tracking

The upcoming sections describe each page and setting in its entirety. To invoke the Database Properties dialog box, perform the following steps:

  1. Choose Start, All Programs, Microsoft SQL Server 2012, SQL Server Management Studio.
  2. In Object Explorer, first connect to the Database Engine, expand the desired instance, and then expand the Databases folder.
  3. Select a desired database, such as AdventureWorks2012, right-click, and select Properties. The Database Properties dialog box is displayed.

Administering the Database Properties Files Page

The second Database Properties page is called Files. Here you can change the owner of the database, enable full-text indexing, and manage the database files, as shown in Figure 3.9.

Figure 3.9.

Figure 3.9. Configuring the database files settings from within the Files page.

Administering Database Files

Use the Files page to configure settings pertaining to database files and transaction logs. You will spend time working in the Files page when initially rolling out a database and conducting capacity planning. Following are the settings you’ll see:

  • Data and Log File Types—A SQL Server 2012 database is composed of two types of files: data and log. Each database has at least one data file and one log file. When you’re scaling a database, it is possible to create more than one data and one log file. If multiple data files exist, the first data file in the database has the extension *.mdf and subsequent data files maintain the extension *.ndf. In addition, all log files use the extension *.ldf.
  • Filegroups—When you’re working with multiple data files, it is possible to create filegroups. A filegroup allows you to logically group database objects and files together. The default filegroup, known as the Primary Filegroup, maintains all the system tables and data files not assigned to other filegroups. Subsequent filegroups need to be created and named explicitly.
  • Initial Size in MB—This setting indicates the preliminary size of a database or transaction log file. You can increase the size of a file by modifying this value to a higher number in megabytes.

Increasing Initial Size of a Database File

Perform the following steps to increase the data file for the AdventureWorks2012 database using SSMS:

  1. In Object Explorer, right-click the AdventureWorks2012 database and select Properties.
  2. Select the Files page in the Database Properties dialog box.
  3. Enter the new numerical value for the desired file size in the Initial Size (MB) column for a data or log file and click OK.
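The same change can be made in Transact-SQL by modifying the file’s SIZE, which may only be increased, never reduced, through this command. A sketch, assuming the default logical file name:

```sql
USE [master]
GO
-- Grow the primary data file to 200MB (204800KB).
ALTER DATABASE [AdventureWorks2012]
MODIFY FILE ( NAME = N'AdventureWorks2012_Data', SIZE = 204800KB );
GO
```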

Other Database Options That Affect I/O Performance

Keep in mind that many other database options can have anywhere from a nominal to a profound impact on I/O performance. To examine these options, right-click the database name in the SSMS Object Explorer, and then select Properties. The Database Properties page appears, allowing you to select Options or Change Tracking. A few settings on the Options and Change Tracking tabs to keep in mind include the following:

  • Options: Recovery Model—SQL Server offers three recovery models: Simple, Bulk Logged, and Full. These settings can have a huge impact on how much logging, and thus I/O, is incurred on the log file. Refer to Chapter 6, “Backing Up and Restoring SQL Server 2012 Databases,” for more information on backup settings.
  • Options: Auto—SQL Server can be set to automatically create and automatically update index statistics. Keep in mind that, although typically a nominal hit on I/O, these processes incur overhead and are unpredictable as to when they may be invoked. Consequently, many DBAs use automated SQL Agent jobs to routinely create and update statistics on very high-performance systems to avoid contention for I/O resources.
  • Options: State: Read-Only—Although not frequent for OLTP systems, placing a database into the read-only state enormously reduces the locking and I/O on that database. For high reporting systems, some DBAs place the database into the read-only state during regular working hours, and then place the database into read-write state to update and load data.
  • Options: State: Encryption—Transparent data encryption adds a nominal amount of added I/O overhead.
  • Change Tracking—Options within SQL Server that increase the amount of system auditing, such as change tracking and change data capture, significantly increase the overall system I/O because SQL Server must record all the auditing information showing the system activity.
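Several of the options above can also be set directly with ALTER DATABASE. A sketch showing the recovery model, the automatic statistics settings, and the read-only state (disable auto-statistics only if a scheduled job takes over that maintenance):

```sql
-- Set the recovery model (SIMPLE, BULK_LOGGED, or FULL).
ALTER DATABASE [AdventureWorks2012] SET RECOVERY FULL;
GO
-- Disable automatic statistics maintenance when a SQL Agent job handles it.
ALTER DATABASE [AdventureWorks2012] SET AUTO_CREATE_STATISTICS OFF;
ALTER DATABASE [AdventureWorks2012] SET AUTO_UPDATE_STATISTICS OFF;
GO
-- Toggle the read-only state around a reporting window.
ALTER DATABASE [AdventureWorks2012] SET READ_ONLY WITH ROLLBACK IMMEDIATE;
-- ...run reports...
ALTER DATABASE [AdventureWorks2012] SET READ_WRITE WITH ROLLBACK IMMEDIATE;
GO
```

The WITH ROLLBACK IMMEDIATE clause rolls back any open transactions so the state change does not wait indefinitely on active sessions.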

Designing and Administering Filegroups in SQL Server 2012

Filegroups are used to house data files. Log files are never housed in filegroups. Every database has a primary filegroup, and additional secondary filegroups may be created at any time. The primary filegroup is also the default filegroup, although the default file group can be changed after the fact. Whenever a table or index is created, it will be allocated to the default filegroup unless another filegroup is specified.

Filegroups are typically used to place tables and indexes into groups and, frequently, onto specific disks. Filegroups can be used to stripe data files across multiple disks in situations where the server does not have RAID available to it. (However, placing data and log files directly on RAID is a superior solution to using filegroups to stripe data and log files.) Filegroups are also used as the logical container for special-purpose data management features like partitions and FILESTREAM, both discussed later in this chapter. But they provide other benefits as well. For example, it is possible to back up and recover individual filegroups. (Refer to Chapter 6 for more information on recovering a specific filegroup.)
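For example, after creating a secondary filegroup you can make it the default, so new tables and indexes land there rather than in PRIMARY. A sketch (the filegroup name is an illustrative assumption; the filegroup must already contain at least one file):

```sql
-- Make an existing secondary filegroup the default for new objects.
ALTER DATABASE [AdventureWorks2012]
MODIFY FILEGROUP [SecondFileGroup] DEFAULT;
GO
```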

To perform standard administrative tasks on a filegroup, read the following sections.

Creating Additional Filegroups for a Database

Perform the following steps to create a new filegroup and files using the AdventureWorks2012 database with both SSMS and Transact-SQL:

  1. In Object Explorer, right-click the AdventureWorks2012 database and select Properties.
  2. Select the Filegroups page in the Database Properties dialog box.
  3. Click the Add button to create a new filegroup.
  4. When a new row appears, enter the name of the new filegroup and enable the option Default.

Alternatively, you may create a new filegroup as a step of adding a new file to a database, as shown in Figure 3.10. In this case, perform the following steps:

Figure 3.10.

Figure 3.10. Creating a New Filegroup from the Files page.

  1. In Object Explorer, right-click the AdventureWorks2012 database and select Properties.
  2. Select the Files page in the Database Properties dialog box.
  3. Click the Add button to create a new file. Enter the name of the new file in the Logical Name field.
  4. Click in the Filegroup field and select <new filegroup>.
  5. When the New Filegroup page appears, enter the name of the new filegroup, specify any important options, and then click OK.

Alternatively, you can use the following Transact-SQL script to create the new filegroup for the AdventureWorks2012 database:

USE [master]
GO
ALTER DATABASE [AdventureWorks2012] ADD FILEGROUP
[SecondFileGroup]
GO

Creating New Data Files for a Database and Placing Them in Different Filegroups

Now that you’ve created a new filegroup, you can create two additional data files for the AdventureWorks2012 database and place them in the newly created filegroup:

  1. In Object Explorer, right-click the AdventureWorks2012 database and select Properties.
  2. Select the Files page in the Database Properties dialog box.
  3. Click the Add button to create new data files.
  4. In the Database Files section, enter the following information in the appropriate columns:

     Column          Value
     Logical Name    AdventureWorks2012_Data2
     File Type       Data
     Filegroup       SecondFileGroup
     Size            10MB
     Path            C:\
     File Name       AdventureWorks2012_Data2.ndf

  5. Click OK.

The earlier image, in Figure 3.10, showed the basic elements of the Database Files page. Alternatively, use the following Transact-SQL syntax to create a new data file:

USE [master]
GO
ALTER DATABASE [AdventureWorks2012]
ADD FILE (NAME = N'AdventureWorks2012_Data2',
FILENAME = N'C:\AdventureWorks2012_Data2.ndf',
SIZE = 10240KB , FILEGROWTH = 1024KB )
TO FILEGROUP [SecondFileGroup]
GO

Administering the Database Properties Filegroups Page

As stated previously, filegroups are a great way to organize data objects, address performance issues, and minimize backup times. The Filegroups page is best used for viewing existing filegroups, creating new ones, marking filegroups as read-only, and configuring which filegroup will be the default.

To improve performance, you can create subsequent filegroups and place database files, FILESTREAM data, and indexes onto them. In addition, if there isn’t enough physical storage available on a volume, you can create a new filegroup and physically place all files on a different volume or LUN if a SAN is used.

Finally, if a database has static data such as that found in an archive, it is possible to move this data to a specific filegroup and mark that filegroup as read-only. Read-only filegroups are extremely fast for queries. Read-only filegroups are also easy to back up because the data rarely if ever changes.
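Marking an archive filegroup read-only is a single ALTER DATABASE statement. A sketch — the filegroup name is an illustrative assumption, and the database must have no active users when the change is made:

```sql
-- Mark an archive filegroup read-only; its data can then be excluded
-- from routine backups and is cheaper to query (no lock management).
ALTER DATABASE [AdventureWorks2012]
MODIFY FILEGROUP [ArchiveFG] READ_ONLY;
GO
-- To allow writes again:
-- ALTER DATABASE [AdventureWorks2012] MODIFY FILEGROUP [ArchiveFG] READ_WRITE;
```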
