Building a Compute Cluster

A production compute cluster combines a number of machines into a single computing resource. Instead of starting a job on a specific machine (or host), the user submits the job to a queue, and the queuing system runs it on the best available machine. It is also possible, although not very common, to run interactive jobs through the queuing system. For example, if a user submits the task of running an xterm, the queuing system starts it on a lightly loaded host.
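The dispatch idea described above can be illustrated with a minimal sketch. It is not how Sun Grid Engine, PBS, or LSF are implemented; the `Host` class, the `load` metric, and the fixed `load_per_job` increment are all hypothetical names and assumptions chosen for illustration. The sketch only shows the principle of taking jobs from a queue and placing each one on the currently least-loaded host.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    load: float  # hypothetical load metric, e.g. a run-queue average


def pick_host(hosts):
    """Return the host with the lowest reported load."""
    return min(hosts, key=lambda h: h.load)


def dispatch(queue, hosts, load_per_job=1.0):
    """Drain the queue, placing each job on the least-loaded host.

    Returns a list of (job, host_name) pairs in dispatch order.
    """
    placements = []
    while queue:
        job = queue.popleft()
        host = pick_host(hosts)
        # Assumption: every job adds the same fixed amount of load.
        host.load += load_per_job
        placements.append((job, host.name))
    return placements


if __name__ == "__main__":
    hosts = [Host("node1", 0.9), Host("node2", 0.1), Host("node3", 0.5)]
    jobs = deque(["render", "xterm", "simulate"])
    for job, host in dispatch(jobs, hosts):
        print(f"{job} -> {host}")
```

A real queuing system would measure load continuously and weigh memory, licenses, and other resources as well, but the user-facing effect is the same: the user names a job, not a machine.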

A compute cluster consists of the following parts:

  • Network—Ethernet, Myrinet, and so forth

  • File sharing—Network File System (NFS), Andrew File System (AFS), or Distributed File System (DFS)

  • Queuing system—Sun™ Grid Engine software, Portable Batch System (PBS), or Platform Computing's Load Sharing Facility (LSF)

  • Message passing (if used)—an MPI library (Sun HPC ClusterTools software or MPICH)

  • Compiler (if needed)—Forte Developer software or the GNU compilers

  • Maintenance tools—JumpStart™ software, automatic patch installation tools, and so forth

  • Administration tools—hardware health checking tools (SunVTS™ software, Sun™ Management Center software, hereafter called Sun MC), resource allocation tools (Solaris™ Bandwidth Manager software, disk quotas, Network Information System (NIS), and so forth)

  • Terminal servers for the consoles

Job execution depends on how the queuing system is configured. You can optimize for the use of expensive software licenses, maximize total resource utilization, prioritize a certain group of users, and so forth.
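As a rough illustration of how such a policy shapes job execution, the sketch below orders jobs by user group and holds back jobs whose software license is not free. It is not the configuration syntax of any real queuing system; `GROUP_PRIORITY`, `DEFAULT_PRIORITY`, the `Job` fields, and the group and license names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy table: a lower number is dispatched first.
GROUP_PRIORITY = {"design": 0, "research": 1}
DEFAULT_PRIORITY = 9  # groups not listed above go last


@dataclass
class Job:
    name: str
    group: str
    license: Optional[str] = None  # name of a required license, if any


def schedule(jobs, free_licenses):
    """Split jobs into (runnable, waiting) name lists under the policy.

    Jobs are considered in group-priority order. A job that needs a
    license runs only if a copy of that license is still free.
    """
    free = dict(free_licenses)  # copy so the caller's counts are untouched
    runnable, waiting = [], []
    ordered = sorted(
        jobs, key=lambda j: GROUP_PRIORITY.get(j.group, DEFAULT_PRIORITY)
    )
    for job in ordered:
        if job.license is None:
            runnable.append(job.name)
        elif free.get(job.license, 0) > 0:
            free[job.license] -= 1
            runnable.append(job.name)
        else:
            waiting.append(job.name)
    return runnable, waiting
```

Changing the priority table or the license counts changes which jobs run first without touching the jobs themselves, which is the sense in which execution "depends on how the queuing system is configured."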

Note that the optimal application development environment may differ from the optimal production environment. In a development environment, response time is more important than throughput.

To best serve both users and system administrators, the cluster must have these characteristics:

  • Powerful
  • Simple to use
  • Easy to program
  • Simple to administer
  • Easy to add more resources
  • Good price and performance