Chapter 5: A Step-By-Step Approach to Capacity Planning in Client/Server Systems
5.7 Performance Modeling and Prediction
An important aspect of capacity management is predicting whether a system will deliver performance metrics (e.g., response time and throughput) that meet desired or acceptable service levels.
5.7.1 Performance Models
Performance prediction is the process of estimating performance measures of a computer system for a given set of parameters. Typical performance measures include response time, throughput, resource utilization, and resource queue length. Examples of performance measures for the C/S system of Fig. 5.3 include the response time for retrieving mail from the mail server, the response time experienced during Web-based corporate training sessions, the throughput of the file server in LAN 2, the utilization of the backbone FDDI ring, and the throughput and average number of requests queued at the Web proxy server. Parameters are divided into the following categories:
- system parameters: characteristics of a C/S system that affect performance. Examples include load-balancing disciplines for Web server mirroring, network protocols, maximum number of connections supported by a Web server, and maximum number of threads supported by the database management system.
- resource parameters: intrinsic features of a resource that affect performance. Examples include disk seek times, latency and transfer rates, network bandwidth, router latency, and CPU speed ratings.
- workload parameters: derived from workload characterization and divided into:
  - workload intensity parameters: provide a measure of the load placed on the system, indicated by the number of units of work that contend for system resources. Examples include the number of hits/day to the Web proxy server, number of requests/sec submitted to the file server, number of sales transactions submitted per second to the database server, and the number of clients running scientific applications. Another important characteristic of the workload is the burstiness of the arrival process as discussed in Chaps. 4 and 10.
  - workload service demand parameters: specify the total amount of service time required by each basic component at each resource. Examples include the CPU time of transactions at the database server, the total transmission time of replies from the database server in LAN 4, and the total I/O time at the Web proxy server for requests of images and video clips used in the Web-based training classes; a short sketch following this list shows how such parameters can be organized for a model.
Performance prediction requires the use of models. Two types of models may be used: simulation models and analytical models. Both types of models have to consider contention for resources and the queues that arise at each system resource: CPUs, disks, routers, and communication lines. Queues also arise for software resources: threads, database locks, and protocol ports.
The various queues that represent a distributed C/S system are interconnected, giving rise to a network of queues, called a queuing network (QN). The level of detail at which resources are depicted in the QN depends on the reasons for building the model and on the availability of detailed information about the operation and parameters of specific resources.
Example 5.3:
To illustrate the above concepts, we will use the notation introduced in Chap. 3 to show two versions (a high-level one and a more detailed one) of the QN model that corresponds to LAN 3 in Fig. 5.3, its Web server, its 100 clients, and the connections of LAN 3 to the Internet and to the FDDI ring. Figure 5.10 depicts a high-level QN model for LAN 3. The 10 Mbps LAN is depicted as a queuing resource and so is the Web server. The set of 100 Windows clients is depicted as a single delay resource because requests do not contend for the use of the client; they just spend some time at the client. The FDDI ring and the Internet are not explicitly modeled, as this model focuses on LAN 3 only. However, traffic coming from the FDDI ring and from the Internet into LAN 3 has to be taken into account.
Note that the model of Fig. 5.10 hides many of the details of the Web server. As mentioned in Sec. 5.4, the Web server runs on a dual processor machine. Thus, a more detailed representation of the QN model would have to include the server processors and disks as shown in Fig. 5.11.
Figure 5.10. High-level QN of LAN 3.
Chapters 8 and 9 discuss, in detail, techniques used to build performance models of C/S systems.
5.7.2 Performance Prediction Technique
To predict the performance of a C/S system, we need to be able to solve the performance model that represents the system. Analytic models [8] are based on a set of formulas and/or computational algorithms used to generate performance metrics from model parameters. Simulation models [5], [6], [9] are computer programs that mimic the behavior of a system as transactions flow through the various simulated resources. The program accumulates statistics on the time spent at each queue and each resource for all transactions, so that averages, standard deviations, and even distributions of the performance metrics can be reported.
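As a small, hedged illustration of how a simulation model accumulates such statistics, the sketch below simulates a single FIFO queuing resource with exponentially distributed interarrival and service times (an M/M/1 queue) and reports the mean and standard deviation of the response time. The arrival and service rates are hypothetical; a simulation model of the system in Fig. 5.3 would represent many interconnected resources rather than a single queue.

```python
import random
import statistics

# Minimal discrete-event sketch of a single FIFO queuing resource (an M/M/1
# queue). The arrival and service rates are hypothetical illustration values.
random.seed(42)
arrival_rate = 8.0      # requests/sec (hypothetical)
service_rate = 10.0     # requests/sec (hypothetical)
num_requests = 100_000

arrival_time = 0.0      # arrival instant of the current request
server_free_at = 0.0    # instant at which the server finishes its current work
response_times = []

for _ in range(num_requests):
    arrival_time += random.expovariate(arrival_rate)    # next arrival
    service_time = random.expovariate(service_rate)
    start_service = max(arrival_time, server_free_at)   # wait if server is busy
    server_free_at = start_service + service_time
    response_times.append(server_free_at - arrival_time)  # waiting + service

print(f"average response time: {statistics.mean(response_times):.4f} sec")
print(f"standard deviation:    {statistics.stdev(response_times):.4f} sec")
# For these rates, queuing theory gives an average response time of
# 1 / (service_rate - arrival_rate) = 0.5 sec, which the simulated average
# should approach as num_requests grows.
```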
There is a wide range of modeling alternatives. As more system elements are represented in greater detail, model accuracy increases, but so do the data-gathering requirements, as shown in Fig. 5.12. It is important to strike a reasonable balance between model accuracy and ease of use, so that many alternatives can be analyzed with little effort and in very little time. Analytic models are quite appropriate for the performance prediction component of any capacity management/planning study. In this book, we explore how analytic QN models can be used to model C/S systems, Web servers, and intranets.
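As a concrete, if simplified, example of an analytic QN solution, the sketch below applies exact single-class Mean Value Analysis (MVA) to a closed model loosely patterned after the high-level QN of Fig. 5.10: the 100 clients appear as a delay (think time) resource and the LAN and Web server as queuing resources. The service demands and think time are hypothetical, and the open traffic arriving from the FDDI ring and the Internet is ignored to keep the sketch short.

```python
# Hedged sketch: exact single-class Mean Value Analysis (MVA) for a closed QN
# loosely based on the high-level model of Fig. 5.10. The service demands and
# think time are hypothetical; open traffic from the FDDI ring and the
# Internet is ignored here.

def mva(service_demands, think_time, num_clients):
    """Exact MVA for a closed, single-class QN with load-independent queuing
    resources and one delay (think time) resource representing the clients."""
    queue_lengths = {res: 0.0 for res in service_demands}
    throughput = 0.0
    for n in range(1, num_clients + 1):
        # Arrival theorem: residence time at each queuing resource.
        residence = {res: demand * (1.0 + queue_lengths[res])
                     for res, demand in service_demands.items()}
        throughput = n / (think_time + sum(residence.values()))
        queue_lengths = {res: throughput * r for res, r in residence.items()}
    response_time = num_clients / throughput - think_time   # Little's Law
    utilizations = {res: throughput * demand
                    for res, demand in service_demands.items()}
    return throughput, response_time, utilizations

# Hypothetical inputs: 100 clients, 5 sec of think time, and per-request
# service demands (in seconds) at the LAN and at the Web server.
X, R, U = mva({"lan_3": 0.010, "web_server": 0.030}, think_time=5.0,
              num_clients=100)
print(f"throughput    = {X:.2f} requests/sec")
print(f"response time = {R:.3f} sec")
for resource, u in U.items():
    print(f"utilization of {resource}: {u:.0%}")
```

Rerunning the same function with a different number of clients or different service demands gives an almost instantaneous answer, which is precisely the ease-of-use argument made above for analytic models.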
It is often infeasible to obtain detailed performance data. It is important to note that a detailed performance model fed with unreliable data yields non-representative results.
5.7.3 Performance Model Validation
A performance model is said to be valid if the performance metrics (e.g., response time, resource utilizations, and throughputs) calculated by the model match the measurements of the actual system within a certain acceptable margin of error. Margins of error from 10 to 30% are acceptable in capacity planning [8].
Figure 5.11. Detailed QN of LAN 3.
Figure 5.12. Performance model accuracy.
Fig. 5.13 illustrates the various steps involved in performance model validation. During workload characterization, measurements are taken of the service demands, the workload intensity, and performance metrics such as response time, throughput, and device utilization. The same metrics are computed by means of the performance model. If the computed values do not match the measured values within an acceptable margin of error, the model must be calibrated. Otherwise, the model is deemed valid and can be used for performance prediction. A detailed discussion of performance model calibration techniques is given in [8].
Figure 5.13. Performance model validation.
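As a minimal sketch of the comparison step, assuming made-up measured and model-computed values, the code below checks each metric against a chosen relative-error tolerance (30% here, the loose end of the range quoted above) and decides whether the model can be used for prediction or must be sent back for calibration.

```python
# Hypothetical sketch of the comparison step in model validation. The measured
# and model-computed values below are invented for illustration only.

measured = {"response_time_sec": 2.1, "throughput_tps": 14.8, "cpu_utilization": 0.62}
computed = {"response_time_sec": 2.4, "throughput_tps": 13.9, "cpu_utilization": 0.55}

ACCEPTABLE_ERROR = 0.30  # 30% relative error, the loose end of the 10-30% range

def model_is_valid(measured, computed, tolerance):
    """Return True if every computed metric is within `tolerance` (relative
    error) of the corresponding measured value."""
    return all(abs(computed[metric] - measured[metric]) / measured[metric] <= tolerance
               for metric in measured)

if model_is_valid(measured, computed, ACCEPTABLE_ERROR):
    print("Model is valid: use it for performance prediction.")
else:
    print("Model is not valid: calibrate it and compare again.")
```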