Cluster Building Process, Phase 1: Design
The first phase in our cluster process involves completing the physical and logical design of the cluster. There are three basic steps to this phase (see Figure 2):
- Technical analysis
- Preliminary solution design
- Final solution design
Figure 2 Cluster building process, phase 1: design.
Notice the dotted lines between the steps in Figure 2. They show the potential for iteration between the process steps. While designing a cluster for your customer, you may encounter issues that send you back to the drawing board. It's all but impossible to design a cluster and get all of the details correct on the first try; you will need to make multiple passes at some portions of the process. Learn to love dotted lines, and add some of your own.
Technical Analysis
The technical analysis step exists to discover and document the requirements and objectives for the cluster project. At the same time, we must identify constraints on the project (time, budget, performance, and so on). The information gathered here directly affects the components used in the cluster and the overall complexity of the project, and it provides a goal line for judging when we have met the customer's objectives.
Whether you're building the cluster for an internal or an external customer, this step may be considered pre-sales effort. You're trying to converge on a solution that's built from available technology and meets specific customer requirements. This type of analysis effort usually precedes the project go-ahead and any funding for the job. If the customer's needs are not met, the project may not be approved. Complete agreement on the initial set of design objectives is not always necessary, however, particularly for simple cluster projects or clusters that are built purely for research.
If you're building a cluster "for hire," the pre-sales effort involved in the initial specification steps may not be funded by the customer, so the more quickly you can converge on a final viable design, the better for your costs. (We all have at least a little sales activity in our technical lives, whether we realize it or not.)
Once the context for the solution is framed, the next step is to craft a preliminary design.
Preliminary Solution Design
With the specifications and context given for the cluster, it's time to begin producing a first pass at the cluster's hardware and software structure. This step involves a lot of research into prices, product features, cabling and power requirements, heat generation, weight, rack space requirements, and the like. Creating component lists for the hardware and software in the cluster is also part of this step. Many tradeoffs must be made during the design; just remember that cheapest is not always best for the job at hand.
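One way to keep that research organized is to hold the component list as data and let a small script total the figures you care about. The following Python sketch is only an illustration: the component names, prices, wattages, and weights are made-up placeholders, and the field names are my own assumptions, not anything prescribed by this process. The one fixed fact it relies on is that one watt of load produces roughly 3.412 BTU per hour of heat.

```python
from dataclasses import dataclass

# One watt of electrical load is roughly 3.412 BTU/hr of heat to remove.
WATTS_TO_BTU_PER_HOUR = 3.412


@dataclass(frozen=True)
class Component:
    """One line of the preliminary component list (all fields hypothetical)."""
    name: str
    quantity: int
    unit_price: float   # in whatever currency the budget uses
    watts: float        # power draw per unit, from the vendor's data sheet
    rack_units: int     # height per unit, in rack units (U)
    weight_kg: float    # weight per unit


def summarize(components):
    """Total the cost, power, heat, rack space, and weight for a design."""
    totals = {"price": 0.0, "watts": 0.0, "rack_units": 0, "weight_kg": 0.0}
    for c in components:
        totals["price"] += c.quantity * c.unit_price
        totals["watts"] += c.quantity * c.watts
        totals["rack_units"] += c.quantity * c.rack_units
        totals["weight_kg"] += c.quantity * c.weight_kg
    totals["btu_per_hour"] = totals["watts"] * WATTS_TO_BTU_PER_HOUR
    return totals


# Placeholder numbers for illustration only; substitute vendor figures.
components = [
    Component("compute node", 16, 2500.0, 350.0, 1, 15.0),
    Component("head node", 1, 4000.0, 500.0, 2, 25.0),
    Component("Ethernet switch", 2, 1200.0, 100.0, 1, 5.0),
]

totals = summarize(components)
for key, value in totals.items():
    print(f"{key}: {value:.1f}")
```

Keeping the list in one place like this makes the tradeoffs easier to see: swap a cheaper node into the list and the effect on power, heat, and rack space shows up immediately alongside the price.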
If your organization doesn't have the necessary systems-engineering expertise or isn't up to the task of doing the research, it's best to use predesigned cluster components for the hardware and use cluster-software toolkits for the operating system and infrastructure. (Part 2 of this series discussed some of these possibilities.) Whether the toolkit approach fits your needs will depend on your time constraints, budget, and previous experience level. You may have to trade a little expediency for some of the projected cost savings.
At the same time that you're working on the physical design for the racks and hardware, you need to be choosing the proper software components and architecture. Some handy tips for the do-it-yourself audience:
- Find a good graphical layout tool to use in designing the rack layouts. This suggestion may be controversial in an open source crowd, but I've found nothing better than Microsoft's Visio product, and thousands of free shapes and templates are available through sites like the Visio Café.
- Follow the manufacturer's specifications for power, but make sure that they apply to your configuration. Many major hardware vendors offer configuration tools on their web sites to help with this step.
- Standardize everything. Wherever possible, use identical racks and other components.
- Verify physical constraints. Consider floor space, power connections, door height, and so on.
- Identify all of the required software infrastructure. Sketch out the hardware required to support the software, along with expected subsystem configurations.
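The power and physical-constraint tips above lend themselves to a simple automated check. The sketch below is a hypothetical example, not a standard tool: the class names, the limits, and the 80 percent power headroom figure are all assumptions chosen for illustration, and your site's real limits should come from the facility and the vendor specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackPlan:
    """A planned rack (all values are illustrative placeholders)."""
    name: str
    used_units: int     # rack units occupied by equipment
    load_watts: float   # total power draw of the equipment in the rack
    height_cm: float    # external height of the rack itself


@dataclass(frozen=True)
class SiteLimits:
    """Constraints gathered from the site survey (hypothetical fields)."""
    rack_capacity_units: int
    circuit_watts: float
    power_headroom: float   # usable fraction of the circuit (assumed policy)
    door_height_cm: float


def check_rack(plan, limits):
    """Return a list of human-readable problems; an empty list means it fits."""
    problems = []
    if plan.used_units > limits.rack_capacity_units:
        problems.append(f"{plan.name}: needs {plan.used_units}U, "
                        f"rack holds {limits.rack_capacity_units}U")
    usable_watts = limits.circuit_watts * limits.power_headroom
    if plan.load_watts > usable_watts:
        problems.append(f"{plan.name}: draws {plan.load_watts:.0f} W, "
                        f"only {usable_watts:.0f} W usable")
    if plan.height_cm >= limits.door_height_cm:
        problems.append(f"{plan.name}: {plan.height_cm:.0f} cm tall, "
                        f"door is {limits.door_height_cm:.0f} cm")
    return problems


limits = SiteLimits(rack_capacity_units=42, circuit_watts=5000.0,
                    power_headroom=0.8, door_height_cm=203.0)
rack_a = RackPlan("rack-a", used_units=38, load_watts=3600.0, height_cm=200.0)
rack_b = RackPlan("rack-b", used_units=44, load_watts=4500.0, height_cm=210.0)

for rack in (rack_a, rack_b):
    for problem in check_rack(rack, limits):
        print(problem)
```

Standardizing on identical racks pays off here as well, because a single set of limits and a single check then covers every rack in the design.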
Once you have a preliminary design in place, it's time to check it with the potential customer. The likelihood that you'll need to go back to the technical analysis step is somewhat greater than the probability that you'll proceed to the final solution design step. Be ready for this possibility.
Final Solution Design
Upon completing this stage of the design phase, you should be confident that you can identify all of the physical and virtual components of your cluster—confident enough to generate the complete bill of materials and begin ordering the individual parts. You should also be able to put together a project plan for the cluster and assign tasks and resources.
The following list shows some of the documentation you should have at this point:
- Cluster rack diagrams
- Hardware cabling diagrams
- Cluster hardware and software bill of materials
- Hardware and software vendor list and ordering information
- Hardware power and cooling requirements
- Software infrastructure design
- Network design
- Test plan for the cluster
- Project plan for the cluster
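The bill of materials and the vendor ordering information are closely related, and both can be derived from the same list of line items. As a minimal sketch, assuming each line item is a (vendor, part number, quantity) tuple, the following Python merges duplicate parts and groups the result by vendor. The vendor names and part numbers are invented for the example.

```python
from collections import defaultdict


def build_bom(line_items):
    """Merge duplicate part numbers and group the quantities by vendor."""
    merged = defaultdict(int)
    for vendor, part_number, quantity in line_items:
        merged[(vendor, part_number)] += quantity

    bom = defaultdict(dict)
    for (vendor, part_number), quantity in sorted(merged.items()):
        bom[vendor][part_number] = quantity
    return dict(bom)


# Invented vendors and part numbers, for illustration only.
line_items = [
    ("VendorA", "NODE-1U", 16),
    ("VendorB", "SWITCH-48", 2),
    ("VendorA", "NODE-1U", 4),
    ("VendorA", "RACK-42U", 1),
]

bom = build_bom(line_items)
for vendor, parts in bom.items():
    print(vendor)
    for part_number, quantity in parts.items():
        print(f"  {part_number} x {quantity}")
```

Each vendor's section of the result can then serve as the starting point for that vendor's order.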
Beyond this phase, theory and diagrams start becoming physical reality. Before exiting this phase of the process, it's essential to have the complete hardware and software design for the cluster in hand. With that roadmap ready, it's time to start traversing the path to completion.