
Windows Clustering 101

There was a time in the not-too-distant past when the thought of clustering Windows servers sent a chill down the spines of network engineers and caused them to go take out long-term care insurance. Those days are gone with the clustering services that are now built into the Windows Server 2003 operating system. Only Windows Server 2003, Enterprise Edition and Windows Server 2003, Datacenter Edition can create clusters. Windows Storage Server 2003 is a version of Enterprise Edition for clustering file share resources.

There are two parts to clustering a high-availability service or application, such as Exchange 2003, SQL Server 2000, or SQL Server 2005. The first part entails setting up the base cluster service and getting a virtual server running. The second part entails creating the resources that fail over on that virtual server. Most of the second part of clustering is dealt with in Part II of this book.

By the time you are ready to cluster Exchange or SQL Server, you will be able to fail over the virtual server resources from one node to the other and keep resources, such as drives and network interface cards, under the control of the cluster.

The Cluster Model

With Windows Server 2003, you have three models from which to choose; they are built into the operating system and are, thus, supported by Microsoft. The third option may require third-party software. Table 6.1 discusses the models in order of increasing complexity.

The most common cluster model (and inherited from Windows 2000 and Windows NT) is the single quorum cluster model in which multiple nodes of a cluster share a single quorum resource. In this model, all nodes communicate with each other across a local interconnect, and all nodes share a common disk array (in a SAN or a SCSI enclosure).

Windows Server 2003 also introduces the concept of a single node cluster, which is a cluster that comprises a single node or server. A single node cluster can host cluster resources but, for obvious reasons, those resources cannot fail over to anything.

Then there is the geographic cluster or so-called "geo-cluster" in which the nodes that comprise the cluster are separated over a geographic divide. A wide area network usually separates the nodes and the geo-cluster nodes can be in different buildings or even across the country. They don't share storage or a quorum.

The central repository of data in a cluster is the so-called quorum resource. You can think of the quorum as the brain center of the cluster. The idea of a cluster is to provide system or server redundancy. In other words, when a server in the cluster fails, the cluster service is able to transfer operations to a healthy node. This is called failover. The quorum resource data is persistent, and the quorum must survive node failure in the cluster or the resources cannot fail over to the healthy node and start up.

This is why, in a traditional single quorum resource cluster, the quorum cannot be placed on a device that belongs to a single node of the cluster unless the cluster can gain exclusive access to the device (and unless the device can be moved or transferred upon node failure, which is technically possible even on a local disk resource, as we will soon see). There are two exceptions to this rule: the single node cluster and the so-called geo-cluster, a concept in clustering now possible with Windows Server 2003.

Each of the cluster models discussed employs a different quorum resource type. Table 6.1 discusses the models.

Table 6.1 Cluster Model Options

Cluster Model: Single Node
Application: Ideal for labs, testing, development, and hosting applications on a virtual server
Location of Cluster Configuration Data: The quorum resource maintains the cluster configuration data either on a cluster storage device (an external drive array) or on a local drive on the node. Setup requires selection of the Local Quorum resource type.

Cluster Model: Single Quorum
Application: Typical local Active-Passive and Active-Active clusters
Location of Cluster Configuration Data: The quorum resource maintains the cluster configuration data on the single cluster storage device to which all nodes in the cluster are connected. Setup of this model uses the Physical Disk resource type (or another storage class resource type). The cluster installation will fail if this resource type does not test true as a viable quorum (we demonstrate this later in the chapter).

Cluster Model: Majority Node Set
Application: Geographically dispersed server clusters
Location of Cluster Configuration Data: Geographic clusters are separated over wide area networks; therefore, each node maintains its own copy of the cluster configuration data. The quorum resource ensures the cluster configuration data is kept consistent across the nodes.


Single Node

Of particular interest is the Single Node cluster model, in which the quorum resource can be maintained on a storage device on the local node. The idea behind the single node model is novel. With previous versions of the operating system, it was impossible to establish a virtual server (the server name that users attach to) on a cluster comprising only one node. The single node cluster enables this. The Single Node cluster model is illustrated in Figure 6.1.

NOTE

This chapter covers the creation of a single quorum cluster. However, we do touch on the subject of geo-clusters in Chapter 10, "High Availability, High-Performance Exchange."

You can use the single node cluster for lab testing of applications that have been engineered for clustering. You can also use it to test access to storage devices, quorum resources, and so on. The lab or development work is, thus, used to migrate the cluster-aware application into production as a standard single quorum cluster. It is also possible to simply cluster the single node with other nodes at a later time. The resource groups are in place and all you need to do is configure fail-over policies for the groups.

Figure 6.1 Single Node cluster model.

A single node cluster can also be used to simply provide a virtual server that users connect to. The virtual server service and name thus survive hardware failure. Both administrators and clients can see the virtual servers on the network, and they do not have to browse a list of actual servers to find file shares.

What happens when the server hosting the single node cluster and the virtual server fails? The Cluster service automatically restarts the various application and dependent resources when the node is repaired. You can also use this service to automatically restart applications that would not otherwise be able to restart themselves.

For example, you can use this model to locate all the file and print resources in your organization on a single computer, establishing separate groups for each department. When clients from one department need to connect to the appropriate file or print share, they can find the share as easily as they would find an actual computer.

You can move the virtual server to a new node and end users never know the physical server behind the virtual server name has been changed. The real NetBIOS name of the server is never used. The downside of this idea is downtime. Moving the virtual server name to a new server requires downtime. Therefore, this is not suitable for a high-availability solution.

Single Quorum Cluster

This cluster model prescribes that the quorum resource maintain all cluster configuration data on a single cluster storage device that all nodes have the potential to control. As mentioned earlier, this is the cluster model available in previous versions of Windows. The Single Quorum cluster model is illustrated in Figure 6.2.

Microsoft discounts the perception that the cluster storage device can be a single point of failure and promotes the idea that a Storage Area Network (SAN), which often has multiple, redundant paths from the cluster nodes to the storage device, militates in favor of this solution. Without dismissing this model, we note that if you study how a SAN is built, you discover there is some truth to the claim that a SAN is a single point of failure.

You can indeed have multiple paths to the storage device (the "heart" of the SAN) as discussed in Chapters 3, "Storage for Highly Available Systems," and 4, "Highly Available Networks." However, the SAN controller is really nothing more than a server with an operating system that is dedicated to hosting the drive arrays in its enclosures. Unless you have redundant controllers, your SAN will fail if a component in the SAN controller fails. SAN memory can fail, its operating system can hang, the processors can be fried, and so on. Thus, to eliminate every single point of failure in this model, you need to have two SANs on the back end. This idea opens a can of worms. After all, most IT shops do not budget for two SANs for every cluster. The SANs of today have many redundant components within their single footprint (usually a very large footprint) in the data center. To deploy two mirrored SANs on a cluster is not only a very expensive proposition, but it is also technically very difficult to install and manage.

Figure 6.2 Single Quorum cluster model.

Majority Node Set

As mentioned, geo-cluster nodes can reside on opposite sides of the planet because each node maintains its own copy of the cluster configuration data. The quorum resource in the geo-cluster is called the Majority Node Set resource. Its job is to ensure the cluster configuration data is kept consistent across the different nodes; it is essentially a mirroring mechanism. The Majority Node Set cluster model is illustrated in Figure 6.3.

The quorum data is transmitted unencrypted over Server Message Block (SMB) file shares from one node to the other. Naturally, the cluster nodes cannot be connected to a common cluster disk array, which is the main idea behind this model.

Figure 6.3 Majority Node Set cluster model.

You can use a majority node set cluster in special situations, and it will likely require special third-party software and hardware offered by your Original Equipment Manufacturer (OEM), Independent Software Vendor (ISV), or Independent Hardware Vendor (IHV).

Let's look at an example. Let's say we create an 8-node geo-cluster. We could, for example, locate four nodes in one data center, say in Atlanta, and the other four nodes in another data center in Phoenix. This can be achieved, and you can still present a single point of access to your clients. At any time, a node in the geo-cluster can be taken offline, either intentionally or as a result of failure, and the cluster still remains available.

You can create these clusters without cluster disks. In other words, you can host applications that can fail over, but the data the application needs is replicated or mirrored to the quorum data repositories on the other nodes in the cluster. For example, we can use this model with SQL Server to keep a database state up-to-date with log shipping. In Chapter 10, we investigate the particular solutions offered by NSI Software: Double-Take and GeoCluster.

The majority node set is enticing, but there are disadvantages. For starters, the cluster keeps running only while a majority of its nodes (more than half) is available. If enough nodes fail at any one time that a majority is no longer running, then the entire cluster itself fails. When this happens, we say the cluster has lost quorum. This failover limitation is in contrast to the Single Quorum cluster model discussed earlier, which will not fail until the last node in the cluster fails.
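The quorum arithmetic behind this limitation can be sketched in a few lines of Python. This is an illustrative sketch only, not Cluster service code: the function names are hypothetical, and it assumes that "majority" means strictly more than half of the configured nodes must be running.

```python
def votes_needed(total_nodes: int) -> int:
    """Smallest number of running nodes that is a strict majority."""
    return total_nodes // 2 + 1


def has_quorum(total_nodes: int, running_nodes: int) -> bool:
    """True while the running nodes still form a majority of the node set."""
    return running_nodes >= votes_needed(total_nodes)


def tolerated_failures(total_nodes: int) -> int:
    """Most nodes that can fail while the cluster still has quorum."""
    return total_nodes - votes_needed(total_nodes)


if __name__ == "__main__":
    # Print: node count, nodes needed for quorum, failures tolerated.
    for n in (2, 3, 8):
        print(n, votes_needed(n), tolerated_failures(n))
```

Under that assumption, the 8-node Atlanta and Phoenix example needs five running nodes, so it can absorb three node failures; losing all four nodes at one site would leave no majority.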

The Quorum Resource

Every cluster requires a resource which is designated as the quorum resource. The idea of the quorum is to provide a place to store configuration data for the cluster. Thus, when a cluster node fails, the quorum lives to service the new active node (or nodes) in the cluster. The quorum essentially maintains the configuration data the cluster needs to recover.

This data in the quorum is saved in the form of recovery logs. These logs store the changes that have been saved in the cluster database. Each node in the cluster depends on the data in the cluster database for configuration and state.

A cluster cannot exist without the cluster database. For example, a cluster is created when each node that joins the cluster updates its private copy of the cluster database. When you add a node to the existing cluster, the Cluster service retrieves data from the other active nodes and uses it to expand the cluster. When you create the first node in a cluster, the creation process updates the cluster database with details about the new node. This is discussed in more detail in the section "Clustering" later in this chapter.

The quorum resource is also used by the cluster service to ensure the cluster is composed of an active collection of communicating nodes. If the nodes in the cluster can communicate normally with each other (across the cluster interconnect), then you have a cluster. Like all service databases on the Windows platform, the cluster database and the quorum resource logs can become corrupt. There are procedures to fix these resources and we cover this a little later in this chapter.

When you attempt to create a cluster, the first node in the cluster needs to gain control of the quorum resource. If it cannot see the resource (this quorum), then the cluster installation fails. We show you this later. In addition, a new node is allowed to join a cluster or remain in the cluster only if it can communicate with the node that controls the quorum resource.

Let's now look at how the quorum resource is used in a two-node cluster, which is the type of cluster we will build in the coming chapters.

When the first node in the cluster fails, the second node takes control of the cluster database and continues to write changes to it. When the first node recovers and a fail-back is initiated, ownership of the cluster database and quorum resource is returned to the first node as part of the fail-back.

But what if the second node fails before the first is recovered? In such a case, the first node must first update its copy of the cluster database with the changes made by the second node before it failed. It does this using the quorum resource recovery logs.
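To make the catch-up step concrete, here is a minimal Python sketch of replaying logged changes into a node's stale copy of the cluster database. The log layout (sequence number, key, value) and all names are assumptions made for illustration; the chapter does not describe the actual format of the quorum recovery logs.

```python
def replay_recovery_log(local_db: dict, local_seq: int, log: list) -> int:
    """Apply every logged change newer than local_seq to local_db.

    Each log entry is a hypothetical (sequence_number, key, value) tuple;
    a value of None means the key was deleted. The dict is updated in
    place and the sequence number of the last applied change is returned.
    """
    for seq, key, value in sorted(log, key=lambda entry: entry[0]):
        if seq <= local_seq:
            continue  # this node already has the change
        if value is None:
            local_db.pop(key, None)
        else:
            local_db[key] = value
        local_seq = seq
    return local_seq


if __name__ == "__main__":
    # NODE1 went down after change 1; NODE2 logged changes 2 and 3.
    node1_db = {"group.sql.owner": "NODE1"}
    quorum_log = [
        (1, "group.sql.owner", "NODE1"),
        (2, "group.sql.owner", "NODE2"),
        (3, "resource.temp", None),
    ]
    print(replay_recovery_log(node1_db, 1, quorum_log), node1_db)
```

The point of the sketch is only that the recovering node needs nothing from the failed node itself: everything it missed is reachable through the quorum resource.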

In the event the interconnect between the nodes fails, each node automatically assumes the other node has failed. Typically, both nodes then attempt to continue operating as the cluster, and what you now have is a state called split brain syndrome. If both servers succeeded in operating the cluster, you would then have two separate clusters claiming the same virtual server name and competing for the same disk resources. This is not a good condition for a system to find itself in.

The operating system prevents this scenario with quorum resource ownership. The node that succeeds in gaining control of the quorum resource wins and continues to present the cluster. In other words, whoever controls the "brain" wins. The other node submits, the fail-over completes, and the resources on the failed node are deactivated.
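The "whoever controls the brain wins" rule can be illustrated with a toy Python model. This is a sketch under stated assumptions, not how the Cluster service is implemented: the QuorumResource class, the try_reserve method, and the node names are all hypothetical, and the nodes make their claims one after another rather than truly racing.

```python
import threading


class QuorumResource:
    """Toy stand-in for a quorum device that only one node can own."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.owner = None

    def try_reserve(self, node: str) -> bool:
        """Atomically claim ownership; only the first caller succeeds."""
        with self._lock:
            if self.owner is None:
                self.owner = node
                return True
            return False


def arbitrate(nodes, quorum):
    """Let each node try to claim the quorum after the interconnect fails.

    Returns a dict mapping each node to "active" (it owns the quorum and
    keeps presenting the cluster) or "standby" (it lost and stands down).
    """
    return {
        node: "active" if quorum.try_reserve(node) else "standby"
        for node in nodes
    }


if __name__ == "__main__":
    print(arbitrate(["NODE1", "NODE2"], QuorumResource()))
```

Because the claim is atomic, exactly one node can ever come out as "active", which is what rules out two clusters presenting the same virtual server name.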

What constitutes a valid quorum resource? The quorum can be any resource that meets the following attributes:

  • It can be accessed by a single node, which must be able to gain physical control of it and defend that control.

  • It must reside on physical storage that can be accessed by any node in the cluster.

  • It must be established on the NTFS file system.

It is possible to create custom resource types as long as developers meet the arbitration and storage requirements specified in the API exposed by the Microsoft Software Development Kit. Let's now look at some deployment scenarios.

Deployment Scenarios

Let's discuss some example deployment schemes, namely the n-node fail-over scheme, the fail-over ring scheme, and the hot-standby server scheme.

In the n-node fail-over scheme, you deploy applications that are set up to be moved to a passive node when the primary node on a 2-node cluster fails. In this configuration, you limit the possible owners for each resource group. You will see how we do this in Part II of this book.

Let's consider the so-called N+1 hot-standby server scheme. Here you reduce the overhead of the 2-node failover by adding a "spare" node (one for each cluster pair) to the cluster. This provides a so-called "hot-standby" server that is part and parcel of the cluster and equally capable of running the applications from each node pair in the event of a failure of both of the other nodes. Both of these solutions are called active/passive clusters: n-node and n-node+1 (or N+1).

As you create the N+1 node cluster, you will discover it is a simple matter to configure the spare node. How you use a combination of the preferred owners list and the possible owners list depends on your application. You typically set the preferred node to the node that the application runs on by default, and you set the possible owners for a given resource group to the preferred node and the spare node.
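How the two lists interact can be sketched as a small selection function in Python. This is a hypothetical model of the policy described above, not the Cluster service's actual algorithm; the function name and the node names (NODE1, NODE2, SPARE) are assumptions.

```python
def pick_failover_node(possible_owners, preferred_owners, failed, online):
    """Choose where a resource group should move after `failed` goes down.

    Only online nodes in possible_owners (other than the failed node) are
    candidates. Candidates are tried in preferred_owners order first, then
    in possible_owners order. Returns None when no candidate is left.
    """
    candidates = [n for n in possible_owners if n in online and n != failed]
    for node in preferred_owners:
        if node in candidates:
            return node
    return candidates[0] if candidates else None


if __name__ == "__main__":
    # The group normally runs on NODE1 and may only move to the spare.
    target = pick_failover_node(
        possible_owners=["NODE1", "SPARE"],
        preferred_owners=["NODE1"],
        failed="NODE1",
        online={"NODE2", "SPARE"},
    )
    print(target)
```

In this model, NODE2 is never chosen even though it is online, because it is not in the group's possible owners list; that restriction is what steers the group to the spare.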

Then there is the concept of a Failover Ring. Here you set up each node in the cluster to run an application instance. Let's assume we have an instance of SQL Server on each node of the cluster. In the event of a failure, the SQL Server on the failed node is moved to the next node in sequence. Actually, an instance of SQL Server is installed on every server. Fail-over simply activates the SQL Server instance, and it takes control of the databases stored on the SAN or SCSI array. We call this the Active-Active cluster.
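The "next node in sequence" rule of a failover ring can be sketched as follows. This is an illustrative Python model with hypothetical names; it assumes the ring order is simply the order of a list and that offline nodes are skipped.

```python
def next_in_ring(ring, failed, online):
    """Return the first online node after `failed` in ring order.

    The search wraps around the end of the list and returns None if no
    other node in the ring is online.
    """
    start = ring.index(failed)
    for step in range(1, len(ring)):
        candidate = ring[(start + step) % len(ring)]
        if candidate in online:
            return candidate
    return None


if __name__ == "__main__":
    ring = ["NODE1", "NODE2", "NODE3", "NODE4"]
    # NODE4 fails, so its SQL Server instance wraps around to NODE1.
    print(next_in_ring(ring, "NODE4", {"NODE1", "NODE2", "NODE3"}))
```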

You can also allow the server cluster to choose the failover node at random. You can do this with large clusters by simply not defining a preferred owners list for the resource groups. In other words, each resource group that has an empty preferred owners list is failed over to a randomly chosen node in the event that the node currently hosting that group fails.
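A random choice among the surviving nodes can be sketched in Python as shown here. The function and node names are hypothetical, and the uniform random pick is an assumption; the chapter does not say how the Cluster service makes its selection.

```python
import random


def choose_random_owner(online, failed, rng=random):
    """Pick any surviving node for a group with no preferred owners list.

    `rng` can be the random module or a seeded random.Random instance.
    Returns None when the failed node was the last one online.
    """
    candidates = sorted(n for n in online if n != failed)
    return rng.choice(candidates) if candidates else None


if __name__ == "__main__":
    nodes = {"NODE1", "NODE2", "NODE3", "NODE4"}
    print(choose_random_owner(nodes, failed="NODE2"))
```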

We will leave the clustering subject now and return to the creation of the infrastructure to support our clusters.
