Home > Articles > Data > SQL Server

SQL Server Reference Guide

Hosted by

Toggle Open Guide Table of ContentsGuide Contents

Close Table of ContentsGuide Contents

Close Table of Contents

High Availability - SQL Server Clustering

Last updated Mar 28, 2003.

High Availability, as it refers to SQL Server, is simply making sure that the data is available to the users and applications as much as possible. Based on the needs you have, you can set up anything from being able to get the data back to a certain date all the way through keeping the data available constantly.

But there is a cost for each level within this spectrum. It boils down to the fact that the more uptime you want, the more expensive and complicated your solution becomes. The proper thing to do is to work with the organization to find out what kind of downtime they can tolerate. You have to be careful how you phrase this question, however. Asking the business “How much uptime do you want?” will most certainly result in the response “We want the system to be available all the time!” That may or may not be a realistic goal, since they might not be willing to tolerate the cost and manpower required to maintain that level.

So what are the levels, and what kind of availability do they provide? I’ll cover that in another tutorial, and I’ll cover the particular solutions in other articles here at InformIT. Here’s a list of some of those solutions:

  • Hardware redundancy
  • Backups
  • Replication
  • Transaction Log Shipping
  • Database Mirroring

Windows and SQL Server Clustering

Clustering is one of the high-availability methods you can use to ensure the safety and continuous operation of your systems. It involves the Microsoft Windows operating system and the SQL Server platform. Server 2000 through the very latest versions supports failover clustering. Before I continue with the explanation of clustering with SQL Server, I need to clarify what that term means here — since there are two types of clustering: Application Load Balancing and Failover.

Application Load Balancing Cluster

In an Application Load Balancing cluster, all servers (called "nodes") that are part of the cluster act as a single unit. A specific node or software service creates the illusion of a single server to the outside world. This server or process passes processing requests off to one or more server(s) using a messaging system so that it can determine which physical computer is available to process a request. That node or nodes return the data request back to the message controller which in turn returns it to the requester. This sharing of work produces a very powerful "virtual" computer. If one of the nodes leaves the cluster, the system hands the work to another server. These types of clusters most often don't share any of their subsystems such as the processors or hard drives, and are used for memory, I/O or processor-intensive applications.

Microsoft offers this type of environment in their “Compute Cluster” product, as well as in their Cloud database server. You can read more about the Microsoft Computer Cluster and the data cloud services at the link in the Resources section at the end of this tutorial.

Failover Clustering

In failover clustering, there are multiple servers sharing a single storage system between them. One system acts as the active or primary node, and the others act as passive or standby nodes. The servers establish a network signal between them that acts as a heartbeat, and should the standby node(s) not detect the active nodes, another node takes over the identity of the primary or active node. The disk retains the data so that only one system writes to it at a time.

Failover clustering is fairly easy to set up, and provides high safety for your environment. On Microsoft systems, Windows 2000 Server through 2008 supports clustering. The higher the edition, (such as Standard through Enterprise or Datacenter) the more nodes are supported in the cluster.

So what you will need to set up the Windows Cluster are at least two computers that are fairly similar in power and configuration. You do not have to buy exactly the same hardware for all the nodes — they just need to be able to handle the load if called on.

The next thing you’ll need is at least two network cards in each system. One card is used to talk with the clients on the network, and the other is used for that “heartbeat” signal that the systems maintain with each other. You will set up a network address between the nodes that only they have access to.

You’ll need an internal disk that each system uses for its own operating system and local files. It’s important that in a solution like this that you make sure to protect this drive, usually with a RAID setup.

Another internal drive is the “Quorum” storage. In Windows 2000 Server, this is a central location where the nodes replicate certain data. In Windows Server 2003, the Quorum can be located on a share as well. In Windows Server 2008, the entire idea of a Quorum has changed — and it’s all share-based, although the share is very resilient.

Next, you’ll need the share data storage between all the nodes. In Windows 2000 Server through 2003, you can provide that with a “shared SCSI” bus, which is a type of card you install in each system that you connect to external storage, usually a Storage Area Network or SAN. In Windows 2008 server, only “iSCSI Targets” are supported, so you’ll need to move to that technology for the latest versions.

To make all of this work, there are various service accounts and domain groups that you will create or that will be created for you. These are all different based on the version of Windows Server you are using.

SQL Server and Clustering

SQL Server works on one of these Windows clusters, and has since version 7.0. SQL Server has two modes of operating in a failover cluster: Active/Active and Active/Passive.

To get SQL Server clustering installed, the basic process is to set up the Windows Cluster first, and attempt a failover to ensure that all resource groups are still available. Once you’re sure the cluster is functioning properly, you can begin the installation of SQL Server on the nodes. You should always read the full documentation on the installation for your setup so that you know which options to pick. This overview provides a general guide, but not the individual steps for the installation of SQL Server on a cluster.

In SQL Server versions 7 through 2005, you simply install SQL Server on the active node, and the installation program will automatically detect that it is being installed on a cluster. It will then prompt you to install SQL Server to the other nodes, if that is what you want.

In SQL Server 2008, the installation process has changed. You will need to run setup on the active node, and then on each node in turn that you want SQL Server to run on. When you start the new installation process, the main screen provides a link to all of the installation documentation.

Active/Active

In this type of failover cluster, each server acts on its own, and can also handle the other server’s failure. Let’s say you have two servers (nodes) in the cluster named ServerA and ServerB. The cluster itself is known by another name such as Cluster1. Users can address either Cluster1, in which case they are using the cluster as a single computer, or they can address ServerB directly. Should ServerA go offline, ServerB becomes Cluster1 and still retains the identity of ServerB.

In SQL Server versions 7 through 2000, an Active/Active cluster can have four nodes participating, and up to 16 instances. An instance is another installation of SQL Server running currently on a single. In ODBC connections and other connection types, it is addressed by the name of the server and then the name of the instance, like this: SERVERNAME\INSTANCENAME instead of just the SERVERNAME that you’ may be used to using.

This type of cluster use should be avoided. The reason for clustering in the first place is to provide availability in case of an issue, and using the recovery node for another purpose can compromise that goal. Not only that, the second active node will not be protected if it fails.

Active/Passive

In Active/Passive, the Cluster is the only name known to the outside world. The others "stand by" and are activated only manually or if the first node fails.

Using the same example from the previous discussion, you might have two nodes — one called ServerA and the other called ServerB. Once they join the cluster, the users only communicate with their “shared” name of Cluster1. If either node goes down, the shared disk arrangement protects the data and the combined hardware and software protects the SQL Server services and Instance name.

The number of nodes that you add to a SQL Server cluster depends on the operating system (Standard versus Enterprise and so on) and the version and edition of SQL Server you have installed. In any case, the more nodes you have, the more that can fail before you have a major issue.

Combined with the latest version of Windows clustering, from SQL Server 2008 onwards you can now use “stretch” clusters, which allow clustering over larger geographical areas.

Why Cluster?

Although the primary reason to cluster is for safety, using this feature allows you to provide maintenance time for a system that needs to be up constantly. To perform maintenance or apply service packs, you can manually fail over a system to the second node, upgrade a service pack on the first node, fail it back over, and then upgrade the second.

Requirements

No matter what configurations of failover clustering you choose, there are some fairly stringent hardware requirements, in addition to the software requirements I mentioned earlier. You must use the hardware on the Microsoft Hardware Compatibility List to ensure that the cluster will work when you need it most. You might be able to install the software on hardware not listed there, but you won't get support from Microsoft if you do. You can find that here: http://www.microsoft.com/whdc/hcl/search.mspx

To begin, you'll need two similar systems. They don't have to be duplicate sets of hardware, but it does simplify support if they are. You'll want to include enough RAM on both systems to accommodate a failover. If you're using Active/Active clustering, include the amount of RAM equal to all configurations running on a single system on all nodes.

You’ll need two network cards in each server. The first will act as the "public" network that all users access, and the second as the "private" network between servers to check the heartbeat signal. The private card should be hooked to a fast switch or other direct connection between the nodes only. You'll need four sets of IP address segments (networks) on these cards: One for the heartbeat connection, one for the public card that identifies the individual system, another on the public network for the cluster name, and another for the SQL Server instance.

Next, you’re going to need a disk to share between the servers. This is accomplished by adding a special set of adapter cards in each node that provide a connection to the I/O subsystem but are aware of each other. Microsoft calls this a "shared SCSI bus." You can find the list of adapters and I/O subsystems on the Hardware Compatibility List. You will create at least two separate drives on this subsystem: One for the Quorum disk which holds the files that synchronize the cluster and another that holds the data that both servers can see, such as databases and log files.

If you want to learn more about clustering but don't have the budget to purchase all the hardware, you should investigate using a Virtual Machine. A virtual machine is a software emulation of the parts of a computer. Once you install the virtual machine software, it places a window on your screen that looks like a machine rebooting — and that’s exactly what it is. A virtual machine is basically just a hardware layer in software — so you can install another operating system and it believes it’s running on this software based hardware. If you are an administrator, you absolutely have to get this kind of software. I use a product called VMWare and another from Microsoft called Virtual Server and now HyperV. All of these allow you to install all kinds of operating systems, from Microsoft Windows to Linux and Novell. Once these images have been created, they can be used on any other system that has VMWare installed. This will not work for Windows 2008 and higher — they have removed the ability to use a single SCSI “target” for the shared storage.

I use this process to create all those testing servers I need in Windows 2003. It keeps me from having to buy a new box every time I need to test an upgrade, fake a set of network boxes and so forth. VM software allows you to set "backup points" so that you can roll back to a certain place in your build and more.

On Windows the Microsoft Clustering Service (MSCS) provides the Cluster Manager. This tool is located in the Administrative Tools area on your Start menu once it is installed. You use the Cluster Manager to control the nodes and the services they provide, from starting SQL Server in clustered mode to file shares.

For SQL Server, other than starting and stopping the clustering portion of the service you treat it as a normal installation. The following tools are supported in SQL Server clustering:

  • Full-Text Search/Queries
  • SQL Server Enterprise Manager (2000)
  • All Management Tools (2005)
  • SQL Server Service Control Manager
  • SQL Server Profiler
  • SQL Server Query Analyzer

Client applications access the cluster as a regular SQL Server installation.

Configuring the Cluster

To begin, you need to assemble all of your hardware with no operating system, with all components connected and ready. Install Windows Server on the first node and join or create an Active Directory domain. Configure all the IP addresses to support the public and private networks, and have at least two more IP addresses on the public network ready for the cluster name and the SQL Server name.

Configure the second node with the operating system in a similar way. Depending on the I/O subsystem, you may need to shut the first node down first so that the second can configure itself to the shared SCSI bus.

Once the operating system is installed, you need to install or enable the clustering software. In Windows 2000, this is another selection from the Windows Components section of the Add/Remove Software applet. In Windows 2003 and higher it's a matter of selecting the Cluster Manager software from the Administrative Tools item in the Windows Start menu. In both cases, a wizard starts and asks you to complete the process, requesting the location of the Quorum drive, the Shared Drives, and the network card addresses.

Once that installation is complete, repeat the process on the second node. In that case you'll join a current cluster rather than creating a new one. With both nodes up and running, test it using a failover scenario to make sure you're ready to go. Ensure that you're back on the primary node before you start the installation of SQL Server.

In all of the versions that support clustering, the SQL Server installation program detects that you are installing on a cluster and the only differences are the location of the database files and the names of the nodes you are installing on. The rest is handled automatically. Install from the Primary node and select the other nodes you wish to present in the cluster during the installation process. As I mentioned, with SQL Server 2008, you’ll install on each node.

In future overviews and tutorials I’ll explain how to manage a cluster, and what is different for basic operations.

InformIT Articles and Sample Chapters

Building the Foundations for a Highly Available Windows Server Architecture has a section that talks about the basics of clustering.

Creating a Fault-Tolerant Environment in Windows Server 2003 is a related sample chapter from the book Microsoft Windows Server 2003 Insider Solutions.

Books and eBooks

In their book Microsoft Windows Server 2003 Insider Solutions, Ed Roberts, Andrew Abbate, Eric Kovach, and Rand Morimoto cover more information you can use. (Read in Safari Books Online)

Online Resources

You can find more about Microsoft's new Application Load Balancing Cluster initiative, called the Compute Cluster.