SQL Server Reference Guide

Clustering Services

Last updated Mar 28, 2003.

Some computing environments don't have intense uptime requirements. Of course, most of us rarely turn our servers off, and we wouldn't want to take them out of service for an extended period of time. But we usually have some latitude for maintenance time, or to perform an upgrade to a new application or SQL Server version. The DBA and technical staff normally have to work nights, weekends or holidays to upgrade systems, when most of the users are off-line. They coordinate the time, and then send out a notice to the users that the system will be unavailable.

Sometimes downtime for a system is unplanned. Software bugs, hardware issues and the like can cause a system to become unstable and power down. When this happens the technical staff has to re-route network traffic to a backup system or work quickly to resolve the problem.

But in some environments, downtime is not only inconvenient, it could be dangerous. Medical facilities might be able to do without a financial application for a short time, but not the system that controls medical machinery. SQL Server systems are also used in nuclear facilities and by the military. Downtime in these environments can be life-threatening. Even financial systems that use SQL Server can be considered critical, if not to health then to the wellbeing of a large company or a stock exchange. In all these situations, failure even for a moment isn't really an option.

The Standard and Enterprise editions of Microsoft SQL Server include a mechanism, called clustering, that lets you build applications that recover automatically when a system has a critical outage. Clustering is one of the high-availability methods you can use to ensure the safety and continuous operation of your systems.

Clustering actually starts with the Microsoft Windows operating system. You cluster the operating system first, and then install any applications that support being clustered. SQL Server is one of those applications.

All versions of SQL Server since 7 support clustering. As of this writing, however, SQL Server 7 and 2000 are out of support, so if your uptime requirements dictate redundant systems, you need to upgrade those older servers immediately.

There are two types of clustering: Application Load Balancing and Failover. I'll give you an overview of each type, and explain which one I mean when I discuss clustering for SQL Server. In a separate tutorial I'll show you how to create your own cluster, using only the hardware you have in place today.

Application Load Balancing Cluster

In an Application Load Balancing cluster, all servers (called "nodes") act as a single unit. A specific node or software service creates the illusion of a single server to the outside world. This server or process passes processing requests off to one or more server(s) (normally using a messaging or caching system) so that it can determine which physical computer is available to process a request. This sharing of work produces a very powerful "virtual" computer. If one of the nodes leaves the cluster, the system hands the work to another server. These types of clusters most often don’t share any of their subsystems such as the processors or hard drives, and are used for memory, I/O or processor-intensive applications.
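To make the idea concrete, here is a minimal Python sketch of a front end that presents a single server and hands each request to the next available node. It is an illustration only, not how any Microsoft product is implemented: the class name, the node names, and the simple round-robin rotation are all assumptions made for the example.

```python
class LoadBalancer:
    """Presents one logical server and hands requests to available nodes.

    Hypothetical illustration: node names and the round-robin policy
    are assumptions, not taken from any real clustering product.
    """

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.available = set(self.nodes)
        self._next = 0  # index of the next node to try

    def remove(self, node):
        """Mark a node as having left the cluster."""
        self.available.discard(node)

    def dispatch(self, request):
        """Return (node, request) for the next available node in rotation."""
        # Try each node at most once, skipping any that have left.
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next % len(self.nodes)]
            self._next += 1
            if node in self.available:
                return node, request
        raise RuntimeError("no nodes available")
```

With three nodes, the first two requests go to the first two nodes in turn. If the third node then leaves the cluster, the next request skips it and goes back to the first node, which is the "hands the work to another server" behavior described above.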

Microsoft is now beginning to offer this type of environment, which it calls a Compute Cluster. As of this writing, SQL Server isn't implemented on this system, because it is difficult to keep a particular transaction atomic across nodes, specifically when it performs write operations. You can read more about the Microsoft Compute Cluster at the link in the Resources section at the end of this tutorial.

Failover Clustering

In failover clustering, two servers share a single storage system. The servers exchange a signal that acts as a heartbeat, and if the secondary node stops detecting the primary, it takes over the identity of the primary. The shared disk retains the data, and only one system writes to it at a time. Once again, the applications and users work with a separate name, which routes calls to whichever server is currently active.
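The heartbeat check can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Windows clustering implementation: the class name, the timeout value, and the rule that a standby that has never heard a heartbeat does not take over are all choices made for the example. Times are passed in explicitly so the behavior is easy to follow.

```python
class StandbyNode:
    """Secondary node that takes over when the primary's heartbeat stops.

    Hypothetical model: names and the timeout rule are illustrative only.
    """

    def __init__(self, name, timeout_seconds):
        self.name = name
        self.timeout_seconds = timeout_seconds
        self.last_heartbeat = None  # time the primary was last heard from
        self.is_active = False      # True once this node has taken over

    def receive_heartbeat(self, now):
        """Record a heartbeat from the primary at time `now` (seconds)."""
        self.last_heartbeat = now

    def check(self, now):
        """Take over if the primary has been silent longer than the timeout."""
        if not self.is_active and self.last_heartbeat is not None:
            if now - self.last_heartbeat > self.timeout_seconds:
                self.is_active = True
        return self.is_active
```

For example, with a five-second timeout and a last heartbeat at second 4, a check at second 8 leaves the standby passive, while a check at second 9.5 makes it the active node.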

Failover clustering is fairly easy to set up, and provides high safety for your environment. Windows 2000 and higher support clustering, and SQL Server versions 7 and higher handle this type of cluster. SQL Server has two modes of operating in this type of cluster: Active/Active and Active/Passive.

Active/Passive

This is the most common clustering arrangement for SQL Server. Two or more servers (depending on the versions and editions of the operating system and SQL Server software) are used, but only one of them is set to be the "primary" system. Let's assume you have two servers, one called ServerA and the other called ServerB. You would set up a cluster so that both of these servers are known to the applications and users as Cluster1. In fact, only ServerA is answering requests from the network. If ServerA should go down for whatever reason, ServerB becomes the primary node, although the users still access Cluster1. It looks something like this:

Normal operation:
ServerA = Cluster1
ServerB (Standing By for Cluster1)

ServerA Node Fails:
ServerA (Offline)
ServerB = Cluster1

Active/Active Cluster

In this type of failover cluster, each server acts on its own, and can also handle the other server's load if that server fails. As an example, you might have two servers (nodes) which are clustered together, one named ServerA and the other ServerB. An Active/Active cluster just means that two clusters are set up across the same pair of nodes. ServerA normally answers as Cluster1 and takes over Cluster2 if ServerB fails. ServerB normally answers as Cluster2 and takes over Cluster1 if ServerA fails. Like this:

Normal operation:
ServerA = Cluster1 (Standing By for Cluster2)
ServerB = Cluster2 (Standing By for Cluster1)

ServerA Node Fails:
ServerA (Offline)
ServerB = Cluster1 + Cluster2

ServerB Node Fails:
ServerA = Cluster1 + Cluster2
ServerB (Offline)
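The ownership rules in both diagrams can be modeled with a small Python sketch. It is a toy model only: the class and method names are invented for the example, and it assumes that a cluster name moves back to its usual node as soon as that node returns, which a real cluster may or may not be configured to do.

```python
class FailoverCluster:
    """Tracks which node currently answers for each cluster (virtual) name.

    Hypothetical model: automatic failback and the choice of the first
    surviving node are simplifying assumptions.
    """

    def __init__(self, nodes, preferred_owner):
        self.nodes = list(nodes)
        # cluster name -> node that normally hosts it
        self.preferred_owner = dict(preferred_owner)
        self.online = set(self.nodes)

    def fail(self, node):
        """Take a node offline."""
        self.online.discard(node)

    def recover(self, node):
        """Bring a node back online."""
        self.online.add(node)

    def owner(self, cluster_name):
        """Return the node currently answering for cluster_name."""
        preferred = self.preferred_owner[cluster_name]
        if preferred in self.online:
            return preferred
        survivors = sorted(self.online)
        if not survivors:
            raise RuntimeError("no node online")
        return survivors[0]

    def hosted_by(self, node):
        """Return the sorted cluster names a node is answering for."""
        if node not in self.online:
            return []
        return sorted(
            name for name in self.preferred_owner if self.owner(name) == node
        )


# Active/Active: each node normally hosts one name.
active_active = FailoverCluster(
    ["ServerA", "ServerB"], {"Cluster1": "ServerA", "Cluster2": "ServerB"}
)

# Active/Passive: ServerB hosts nothing until ServerA fails.
active_passive = FailoverCluster(["ServerA", "ServerB"], {"Cluster1": "ServerA"})
```

Calling `active_active.fail("ServerA")` leaves ServerB answering for both Cluster1 and Cluster2, which matches the "ServerA Node Fails" diagram. In the Active/Passive model, the same call simply moves Cluster1 to ServerB.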

Why Cluster?

Although the primary reason to cluster is safety, clustering also gives you a maintenance window on a system that needs to be up constantly. To perform maintenance or apply service packs, you can manually fail over to the second node, apply the service pack to the first node, fail back, and then upgrade the second node.
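The order of operations matters here: a node should only be patched while the other node is serving users. The Python sketch below simply writes that sequence down and enforces the rule. The function name and the step wording are made up for the illustration, and nothing in it talks to a real cluster.

```python
def rolling_upgrade(first, second):
    """Return (steps, final_active) for patching a two-node cluster.

    Hypothetical planner: `first` is assumed to be the active node
    at the start. It only records steps and performs no real work.
    """
    active = first
    steps = []

    def patch(node):
        # A node is only patched while the other node serves users.
        if node == active:
            raise RuntimeError(f"{node} is active; fail over before patching")
        steps.append(f"patch {node}")

    def fail_over(target):
        nonlocal active
        steps.append(f"fail over {active} -> {target}")
        active = target

    fail_over(second)  # move users to the second node
    patch(first)       # upgrade the idle first node
    fail_over(first)   # fail back
    patch(second)      # upgrade the now-idle second node
    return steps, active
```

Running `rolling_upgrade("ServerA", "ServerB")` produces four steps, and ServerA is the active node again at the end.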

Clustering Requirements

Whichever failover clustering configuration you choose, for Windows Server 2003 and earlier there are some fairly stringent hardware requirements, in addition to the software requirements I mentioned earlier. You must use hardware from the Microsoft Hardware Compatibility List to ensure that the cluster will work when you need it most. You might be able to install the software on hardware not listed there, but you won't get support from Microsoft if you do. You can find the list here: http://www.microsoft.com/whdc/hcl/search.mspx.

To begin, you'll need two similar systems. They don't have to be duplicate sets of hardware, but it does simplify support if they are. You'll want to include enough RAM on both systems to accommodate a failover. If you're using Active/Active clustering, each node needs enough RAM to run every instance in the cluster at once, because after a failover a single node carries the whole load.
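As a quick worked example of that sizing rule, the Python sketch below adds up what one node would need if it had to carry everything after a failover. The figures, the function name, and the separate allowance for the operating system are all hypothetical. Substitute the numbers you actually measure on your own systems.

```python
def ram_needed_per_node(instance_ram_gb, os_overhead_gb=0.0):
    """Return the RAM (in GB) a node needs to host every instance at once.

    Hypothetical sizing helper: instance_ram_gb maps each cluster name
    to the memory its SQL Server instance uses in normal operation.
    """
    return sum(instance_ram_gb.values()) + os_overhead_gb


# Made-up figures for an Active/Active pair.
normal_load = {"Cluster1": 8, "Cluster2": 6}
per_node = ram_needed_per_node(normal_load, os_overhead_gb=2)
```

With these assumed figures, each node would need 16 GB (8 + 6 + 2), even though in normal operation it runs only one of the two instances.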

You'll need two network cards in each server. The first will act as the "public" network that all users access, and the second as the "private" network between servers to carry the heartbeat signal. The private card should be hooked to a fast switch or other direct connection between the nodes only. You'll need four kinds of IP address on these cards: one for the heartbeat connection, one on the public card that identifies the individual system, another on the public network for the cluster name, and another for the SQL Server instance.
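Before you build anything, it can help to write the address plan down and check it. The Python sketch below uses the standard `ipaddress` module to confirm that each address sits on the network it is meant for and that no address is used twice. The networks, addresses, and role names are invented for the example.

```python
import ipaddress

# Hypothetical networks: replace with your own ranges.
PUBLIC_NET = ipaddress.ip_network("192.168.10.0/24")
PRIVATE_NET = ipaddress.ip_network("10.0.0.0/24")

# role -> (address, which network it belongs on); all values are made up.
PLAN = {
    "ServerA public": ("192.168.10.11", "public"),
    "ServerA heartbeat": ("10.0.0.1", "private"),
    "ServerB public": ("192.168.10.12", "public"),
    "ServerB heartbeat": ("10.0.0.2", "private"),
    "Cluster1 name": ("192.168.10.20", "public"),
    "SQL Server instance": ("192.168.10.21", "public"),
}


def check_plan(plan, public_net, private_net):
    """Return a list of problems found in an address plan (empty if none)."""
    problems = []
    seen = {}  # address -> first role that used it
    for role, (address, kind) in plan.items():
        ip = ipaddress.ip_address(address)
        expected = private_net if kind == "private" else public_net
        if ip not in expected:
            problems.append(f"{role}: {ip} is not in {expected}")
        if ip in seen:
            problems.append(f"{role}: {ip} already used by {seen[ip]}")
        else:
            seen[ip] = role
    return problems
```

`check_plan(PLAN, PUBLIC_NET, PRIVATE_NET)` returns an empty list for the plan above. If you accidentally gave the SQL Server instance the same address as the cluster name, the list would contain one entry naming the clash.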

Next, you need a disk system to share between the servers. This is accomplished by adding a special set of adapter cards in each node that provide a connection to the I/O subsystem but are aware of each other. Microsoft calls this a "shared SCSI bus". You can find the list of adapters and I/O subsystems on the Hardware Compatibility List. You will create at least two separate drives on this subsystem: one for the Quorum disk, which holds the files that synchronize the cluster, and another that holds the data that both servers can see, such as databases and log files.

On Windows 2000 and 2003 the Microsoft Clustering Service (MSCS) provides the Cluster Manager. This tool is located in the Administrative Tools area on your Start menu once it is installed. You use the Cluster Manager to control the nodes and the services they provide, from starting SQL Server in clustered mode to file shares.

For SQL Server, other than starting and stopping the clustering portion of the service, you treat it as a normal installation. The following tools are supported in SQL Server clustering:

  • Full-Text Search/Queries
  • SQL Server Enterprise Manager (2000)
  • All Management Tools (2005)
  • SQL Server Service Control Manager
  • SQL Server Profiler
  • SQL Server Query Analyzer

Client applications access the cluster as a regular SQL Server installation.
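One practical wrinkle is that a client connected at the moment of a failover will generally lose its connection and has to reconnect to the same cluster name once the other node has taken over. A minimal Python sketch of that retry pattern follows. The `connect` argument is a placeholder for whatever connection function your data-access library provides, and the attempt count and delay are arbitrary assumptions.

```python
import time


def connect_with_retry(connect, server_name, attempts=5, delay_seconds=2.0):
    """Call connect(server_name), retrying while a failover completes.

    Hypothetical helper: `connect` stands in for a real driver call
    and is expected to raise ConnectionError while the name is unavailable.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            # Always use the cluster name, never an individual node name.
            return connect(server_name)
        except ConnectionError as error:
            last_error = error
            if attempt < attempts - 1:
                time.sleep(delay_seconds)  # give the other node time to take over
    raise last_error
```

The important design point is that the client only ever knows the cluster name, such as Cluster1. It never needs to know whether ServerA or ServerB is answering.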

Configuring the Cluster

I'll give you a brief overview of the process to create a cluster here, but you should carefully review the installation documents for any kind of production setup.

To begin, you need to assemble all of your hardware with no operating system, with all components connected and ready. Install Windows 2000 or 2003 on the first node and join a NETBIOS or Active Directory domain. Configure all the IP addresses to support the public and private networks, and have at least two more IP addresses on the public network ready for the cluster name and the SQL Server name.

Configure the second node with the operating system in a similar way. Depending on the I/O subsystem, you may need to shut the first node down first so that the second can configure itself to the shared SCSI bus.

Windows Server 2008 and higher have a different set of requirements, but you no longer have to buy just what is on the HCL. You can now run a validation tool on your systems to see if they "pass the test" for clustering. Another change is that Windows Server 2008 no longer supports the parallel shared SCSI bus for shared drives. They are attached over iSCSI, Fibre Channel, or Serial Attached SCSI instead.

Once the operating system is installed, you need to install or enable the clustering software. In Windows 2000, this is another selection from the Windows Components section of the Add/Remove Programs applet. In Windows 2003 it's a matter of selecting the Cluster Manager software from the Administrative Tools item in the Windows Start menu. In both cases, a wizard starts and asks you to complete the process, requesting the location of the Quorum drive (a device that all nodes share to know which one "owns" the hard drive between them at any one time), the Shared Drives (where the databases will live), and the network card addresses.

In Windows Server 2008, you'll also find the Cluster Manager tool, but there are new ways to define the storage ownership, which used to be handled by the Quorum drive alone. Although that still exists, you have other more flexible options. You'll also find that it is much simpler to set up clustering in Windows Server 2008.

In SQL Server 7 through 2005, you can install all the nodes from one location. The SQL Server installation program detects that you are installing on a cluster, and the only differences from a normal installation are the location of the database files and the names of the nodes you are installing on. The rest is handled automatically. Install from the primary node and select the other nodes you wish to include in the cluster during the installation process.

For the Windows cluster itself, once the clustering software is set up on the first node, you repeat the process on every other node. In that case you'll "join" the existing cluster during the installation process rather than creating a new one. With all nodes up and running, test a failover scenario to make sure you're ready to go. Ensure that you're back on the primary node before you start the installation of SQL Server.

In SQL Server 2008 and higher, you need to install the SQL Server software on each node, one at a time, from the Installation Center on the install media. In all versions, you can script the installation.

InformIT Articles and Sample Chapters

Read Creating a Fault-Tolerant Environment in Windows Server 2003.

Books and eBooks

In their book Microsoft Windows Server 2003 Insider Solutions, Ed Roberts, Andrew Abbate, Eric Kovach, and Rand Morimoto cover more information you can use.

Online Resources

There’s a whitepaper on SQL Server Clustering that you can read here.