
Clustering

In today's business environment, time really is money. In the financial markets, applications can be responsible for moving literally trillions of dollars a day. Any significant unavailability of such an application, or even an outage on a smaller system that is responsible for only billions or possibly millions of dollars of daily revenue, can have a significant impact on a company's bottom line. Consequently, highly available (HA) environments are required for certain applications. One strategy for creating an HA environment might include a cluster of computing devices and related storage.

In the following sections, we provide an overview of clustering technology and focus on TruCluster and Sun Cluster 3.0 software. Although our example does not involve the migration from one clustering technology to another, we discuss the technology here to provide background information that will be helpful if you encounter an opportunity to migrate from using TruCluster to using Sun Cluster software.

Overview

A cluster is a group of two or more computers (nodes) that share a common storage device and are connected in a way that allows them to operate as a single, continuously available system. Should an application on one of the computers, or the computer itself, fail, a companion machine in the cluster takes over and provides the same functionality. Whereas fault-tolerant systems achieve near-continuous uptime through specialized proprietary hardware that shares the same memory, clustering technology provides highly available applications through redundancy: redundant servers, interconnects, networking, and storage, and even redundant adapters and controllers. If a hardware or software failure occurs, this redundancy allows work to continue by transparently switching to a working component.
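To make the takeover idea concrete, the following minimal Python sketch models a node failure and the relocation of its services to the surviving nodes. It is an illustration only, not how TruCluster or Sun Cluster software is implemented: the `Node` class, the `fail_over` function, the node and service names, and the "least-loaded survivor" placement rule are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model of a cluster node and the services it hosts."""
    name: str
    healthy: bool = True
    services: list[str] = field(default_factory=list)


def fail_over(nodes: list[Node], failed_name: str) -> dict[str, str]:
    """Mark one node as failed and move its services to surviving nodes.

    Returns a mapping of service name -> name of the node that now hosts it.
    """
    failed = next(n for n in nodes if n.name == failed_name)
    failed.healthy = False
    survivors = [n for n in nodes if n.healthy]
    if not survivors:
        raise RuntimeError("no surviving node; the services are unavailable")

    moved = {}
    for service in list(failed.services):
        # Assumed placement rule: pick the survivor hosting the fewest services.
        target = min(survivors, key=lambda n: len(n.services))
        target.services.append(service)
        moved[service] = target.name
    failed.services.clear()
    return moved


# Example: node-a hosts two services and then fails.
nodes = [Node("node-a", services=["db", "web"]), Node("node-b"), Node("node-c")]
moved = fail_over(nodes, "node-a")
print(moved)
```

A real cluster must also agree on which nodes are still alive and move the shared storage and network addresses along with each service; the sketch leaves those parts out.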

Clusters give an enterprise a cost-effective and flexible method for deploying technology. Machines can be added to or removed from a cluster as business demands vary. As newer technology becomes available, it can be added incrementally to the cluster, thereby reducing the need to perform a "forklift" upgrade. Clustering provides the following benefits:

  • High availability

  • Scalability in several directions

  • Ease of use and administration

  • Cost-effective, incremental growth path

TruCluster software was a pioneering version of cluster technology. Simple to configure and highly reliable, this framework allowed the deployment of campus clusters as well as clusters whose machines were located great distances apart.

Sun Cluster 3.0 software is a scalable and flexible solution that is equally suited for a small local cluster or larger extended clusters.

Cluster Agents—TruCluster and Sun Cluster 3.0 Software

The ability to detect when an application or resource is no longer operating as it was designed to is an integral component of any cluster. When the cluster detects these types of changes, the application can be restarted or moved to a different node.
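The detect-then-act cycle described above can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the API of either product: `handle_probe`, its `probe`, `restart`, and `relocate` callbacks, the `max_restarts` limit, and the `FakeApp` test double are hypothetical names invented for this example.

```python
from typing import Callable


def handle_probe(
    probe: Callable[[], bool],
    restart: Callable[[], None],
    relocate: Callable[[], None],
    max_restarts: int = 2,
) -> str:
    """Run one monitoring pass and return the action that was taken.

    "ok"        -> the probe succeeded on the first try
    "restarted" -> a local restart brought the application back
    "relocated" -> local restarts were exhausted; moved to another node
    """
    if probe():
        return "ok"
    for _ in range(max_restarts):
        restart()
        if probe():
            return "restarted"
    relocate()
    return "relocated"


class FakeApp:
    """Stand-in application that becomes healthy after N restarts."""

    def __init__(self, restarts_needed: int) -> None:
        self.restarts_needed = restarts_needed
        self.restarts = 0
        self.relocated = False

    def probe(self) -> bool:
        return self.restarts >= self.restarts_needed

    def restart(self) -> None:
        self.restarts += 1

    def relocate(self) -> None:
        self.relocated = True


healthy, flaky, broken = FakeApp(0), FakeApp(1), FakeApp(5)
results = [handle_probe(a.probe, a.restart, a.relocate) for a in (healthy, flaky, broken)]
print(results)
```

Trying a local restart before relocating is one common policy; whether restarts are attempted, and how many, is exactly the kind of decision the cluster framework lets an administrator specify.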

In the TruCluster environment, this functionality is provided by the Cluster Application Availability (CAA) subsystem. This facility provides a way for applications in the environment to determine whether they are operating properly and allows administrators to specify what actions should be taken if problems are detected.

Sun Cluster 3.0 software supports a similar framework that enables IT staff to develop a customized agent that can be used to monitor the health of the clustered application.

Although these subsystems provide similar functionality, they have significantly different implementations. Porting an application from one clustered environment to another requires you not only to transform the application source code to adhere to the new OS APIs but also to integrate the application into the high availability framework of the new cluster so that application failure can be detected. The framework also has to be programmed to specify the actions that should take place if an application fails.
