Capacity Planning, Architecture, and Design Issues Essentials
Proper capacity planning can minimize performance problems in the system after it is implemented.
Capacity planning for an e-Business is an art rather than a science because its requirements change all the time.
Oracle8i (release 8.1.6) and higher provide a mechanism called the STATSPACK utility for historical data collection and the capability to translate the data into reports that can show database usage trends.
One key characteristic of a successful e-Business is its capability and willingness to change quickly according to market needs. An even better characteristic is its capability to predict the direction in which the market might shift in the future so that it can prepare in advance to meet the new demands.
The e-Business architecture is better than a straight client/server architecture because the client doesn't stay connected to the database, the application server stays connected only for the time required to make a request and receive the results back from the database server, network traffic is reduced, idle processes do not consume resources, and response time is improved.
When you start to move your systems to an e-Business, it can help to start small and identify a small project that can quickly make the transition and at the same time provide high impact and return on investment (ROI). When this transition is successful, you can build on the experience and feedback from this venture.
An e-Business needs to have a forward-looking application architecture that deals with the following critical requirements:
Integration of intra-business processes
Innovative methods for adding value
Customer- and solution-oriented procedure
Plan for growth and change
In this chapter, we will do the following:
Learn how to perform capacity planning
Learn how to use StatsPack for capacity planning purposes
Examine architectural issues for an e-Business
Examine the architectural components
Look at design issues for e-Businesses
Capacity planning is part of the planning phase. This very important process is completed with the intention of minimizing performance problems with the system after it is implemented. In this sense, capacity planning involves analyzing the same kinds of issues that you would analyze when tuning the system:
How should you choose the system components to use?
What type of architecture is suitable?
What values should be set for the various parameters?
What are the characteristics of the applications?
What are the characteristics of the system?
What business rules are to be implemented?
How much can be forecasted or predicted about the system usage?
What are the system requirements in terms of availability? Scalability? Security? Performance?
Good capacity planning provides a cushion for sudden bursts in load and can tolerate some application inefficiencies.
Capacity planning for an e-Business is an art rather than a science because its requirements change all the time. Its characteristics also change all the time and are difficult to predict. To understand the capacity planning issues with an e-Business such as DOeBIZ.com, you have to understand that it is very difficult in such an environment to have a satisfactory architecture that meets all the requirements well. You should plan for possible extra growth and build a system that has more capacity than current and predicted future needs.
Capacity Planning Questions
Several important questions need to be addressed as part of the capacity planning process:
What are the scalability requirements? Scalability is the ability to gracefully enhance the capacity of the system to support a greater system load. The architecture you choose should enable you to easily do the following:
Add more hardware to increase capacity (instead of completely reworking the system)
Upgrade the operating system to take advantage of the new hardware
Redistribute the load to take advantage of the new system configuration
Generally, scalability is determined in terms of the performance increase relative to additional system components; for example, if one CPU gives a performance of x and adding an extra CPU gives a performance of 1.4x, the scalability is 1.4. This increase is not usually linear because of the overhead associated with coordinating the activity of the additional components. Good capacity planning requires a good understanding of future needs. If you improperly predict future needs, you might be able to satisfy your immediate system needs, but fixing problems due to unforeseen increases or changes of needs would require a lot of effort.
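The scalability figure in the example above can be worked through as a small calculation. This is a minimal sketch, not a benchmark method: the function names and the "scaling efficiency" measure (how close the result comes to ideal linear scaling) are illustrative assumptions, and the performance numbers are the hypothetical x and 1.4x from the text with x set to 100.

```python
def scalability_factor(baseline_perf: float, scaled_perf: float) -> float:
    """Performance after adding components divided by performance before."""
    if baseline_perf <= 0:
        raise ValueError("baseline performance must be positive")
    return scaled_perf / baseline_perf


def scaling_efficiency(factor: float, baseline_units: int, scaled_units: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 would be perfectly linear)."""
    ideal = scaled_units / baseline_units
    return factor / ideal


# Hypothetical numbers: one CPU delivers x = 100 units of work, two CPUs deliver 140.
factor = scalability_factor(100.0, 140.0)
efficiency = scaling_efficiency(factor, baseline_units=1, scaled_units=2)

print(f"scalability factor: {factor:.2f}")
print(f"scaling efficiency: {efficiency:.0%}")
```

A factor well below the number of components added is the coordination overhead described above showing up in the measurements.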
What are the availability requirements? Availability requirements are generally specified as the percentage of time that your system has to be up. Transaction processing e-Commerce sites usually have very high availability requirements. The Internet itself has a reliable built-in capability to route around problems; therefore, when you're looking at high availability solutions, you should look at these elements of your own system:
Disk drives: Consider using a Redundant Array of Inexpensive Disks (RAID) configuration.
Web server hardware: PCs are cheap but not very reliable, whereas mainframes are extremely reliable but very expensive. Therefore, you could use workstations as a compromise for the Web server hardware.
Operating system: Compared to Windows and Macintosh, UNIX systems are much more stable and are better suited for high availability requirements. You might also consider a high-availability OS such as Solaris HA.
Uninterruptible power supply (UPS): Using a UPS is a necessity.
Clustering solution: You should seriously consider using a UNIX- or NT-based clustering solution.
High availability requirements dictate quick reaction by an iDBA. Therefore, it is very valuable to have an architecture in place that supports automatic notification to an iDBA when performance falls below a certain threshold or when other critical events occur.
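Because availability is specified as a percentage of uptime, it can help to translate a target into the downtime it actually permits. The sketch below does that arithmetic. The function name and the sample targets are illustrative assumptions, and a year is taken as 365 days.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: ignores leap years for simplicity


def allowed_downtime_hours(availability_pct: float) -> float:
    """Hours per year the system may be down at the given availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


# Illustrative targets only; the real figure comes from the business requirements.
for pct in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(pct)
    print(f"{pct}% availability allows about {hours:.2f} hours of downtime per year")
```

Seeing the target as hours per year makes it easier to judge how quickly an iDBA must be notified and able to react.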
What are the budgetary constraints? Money is always an important issue, and you need to ensure that the chosen architecture can provide a high benefit-per-cost ratio. You should also consider the budget for ongoing maintenance and upgrades of the architecture as well as the cost for administration and performance tuning.
Beware of Bottlenecks
When you add extra components to increase capacity, take care to ensure that you are not introducing a bottleneck somewhere by virtue of this system change.
Oracle Database Trend Analysis Using STATSPACK
Design and capacity planning (and later, database tuning) benefit tremendously when trend analysis of database usage is available. Oracle8i (release 8.1.6) and higher provide a mechanism called the STATSPACK utility for historical data collection, along with the capability to translate the data into reports that show database usage trends. Although Oracle introduced STATSPACK with Oracle 8.1.6, a patch available from Oracle Corporation allows you to use STATSPACK with any Oracle8 database.
The STATSPACK utility is a set of scripts that run a special version of the Oracle BSTAT/ESTAT utilities. Unlike BSTAT/ESTAT, STATSPACK captures performance data and stores it in special Oracle tables whose names have the prefix STATS$. Each STATS$ table is owned by the PERFSTAT user. Note that you can use the regular SQL*Plus DESCRIBE command to determine the columns in any STATS$ table.
DBAs commonly use trend analysis to understand the behavior of the database at various times. This type of analysis also helps in identifying patterns of behavior in terms of database usage. Oracle Enterprise Manager (OEM) provides a capacity planning tool as well, but STATSPACK provides a large number of statistics that can be used to generate different types of trend reports. Here are some useful metrics:
Data buffer hit ratio: This metric gives an idea of the effectiveness of the database buffer cache and the setting of the db_block_buffers parameter.
Physical disk reads and physical disk writes: These metrics give a good idea about the I/O activity in the system.
I/O waits: This metric can be used to determine I/O contention problems and can indicate whether the physical design of the database is good.
Sorts: This metric is an indication of the effectiveness of sort activity.
Buffer busy waits: This metric is an indication of freelist contention and can also indicate concurrent UPDATE or INSERT activity on the same object.
After STATSPACK captures the various performance statistics, you can generate trend reports from the data gathered. These trend reports provide signatures for various database metrics. For example, you might observe that on Monday mornings between 8 a.m. and 9 a.m., the I/O activity is extraordinarily high. This knowledge can help in taking the proper actions. To forecast the database growth trend properly, you must archive and analyze the data collected by a tool such as STATSPACK. The script in Listing 3.1 shows how to compute the average I/O by hour of the day, which you can then plot.
Listing 3.1 Determining the Average I/O by Hour of the Day
set pages 9999
column phyreads format 999,999,999
column phywrites format 999,999,999

-- Average physical reads and writes across all snapshots, grouped by hour of the day
select
    to_char(sn.snap_time, 'HH24'),
    avg(bp.physical_reads) phyreads,
    avg(bp.physical_writes) phywrites
from
    perfstat.stats$buffer_pool_statistics bp,
    perfstat.stats$snapshot sn
where
    bp.snap_id = sn.snap_id
group by
    to_char(sn.snap_time, 'HH24')
order by
    to_char(sn.snap_time, 'HH24');
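The grouping that Listing 3.1 performs can also be sketched outside the database, which is sometimes convenient when the snapshot data has been archived to files. The sample rows, the tuple layout, and the function name below are hypothetical stand-ins for archived snapshot data, not STATSPACK output.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical (snap_time, physical_reads, physical_writes) samples.
samples = [
    (datetime(2001, 3, 5, 8, 15), 1200, 300),
    (datetime(2001, 3, 5, 8, 45), 1800, 500),
    (datetime(2001, 3, 5, 14, 15), 400, 100),
    (datetime(2001, 3, 6, 8, 15), 1500, 400),
]


def average_io_by_hour(rows):
    """Group samples by hour of day (00-23) and average reads and writes."""
    by_hour = defaultdict(list)
    for snap_time, reads, writes in rows:
        by_hour[snap_time.strftime("%H")].append((reads, writes))
    return {
        hour: (mean(r for r, _ in vals), mean(w for _, w in vals))
        for hour, vals in sorted(by_hour.items())
    }


hourly = average_io_by_hour(samples)
for hour, (avg_reads, avg_writes) in hourly.items():
    print(f"hour {hour}: avg reads {avg_reads}, avg writes {avg_writes}")
```

Samples from different days that fall in the same hour are averaged together, which is what gives the report its "signature" for that hour.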
In addition to generating trend reports, you can use queries to generate customized alerts that tell the DBA when preset thresholds are exceeded. The script in Listing 3.2 can be scheduled to run at set intervals to report when either of the following is true:
The buffer cache hit ratio falls below 85 percent
The wait count for a datafile exceeds 2,000
Listing 3.2 Setting Alerts
set pages 9999
set feedback off
set verify off

prompt ****************************************
prompt When the buffer cache hit rate is < 85%,
prompt you should consider increasing the
prompt db_block_buffers parameter
prompt ****************************************

column "BUFFER HIT RATIO" format 999

-- &1 is the number of days to look back, passed as the first script argument.
-- The statistic# values are release-specific; verify them against your database.
select
    to_char(s.snap_time, 'yyyy-mm-dd HH24'),
    round(100 * ((a.value + b.value) - c.value) / (a.value + b.value)) "BUFFER HIT RATIO"
from
    perfstat.stats$sysstat a,
    perfstat.stats$sysstat b,
    perfstat.stats$sysstat c,
    perfstat.stats$snapshot s
where
    round(100 * ((a.value + b.value) - c.value) / (a.value + b.value)) < 85
and s.snap_time > sysdate - &1
and a.snap_id = s.snap_id
and b.snap_id = s.snap_id
and c.snap_id = s.snap_id
and a.statistic# = 39
and b.statistic# = 38
and c.statistic# = 40;

prompt ******************************************************
prompt When the I/O wait is very high,
prompt it indicates disk contention and you should consider
prompt reshuffling of datafiles to remove the contention
prompt ******************************************************

column snapdate format a16
column filename format a40

-- Datafiles with more than 2,000 waits recorded in a snapshot
select
    to_char(s.snap_time, 'yyyy-mm-dd HH24') snapdate,
    f.filename,
    f.wait_count
from
    perfstat.stats$filestatxs f,
    perfstat.stats$snapshot s
where
    s.snap_time > sysdate - &1
and f.snap_id = s.snap_id
and f.wait_count > 2000;
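The hit ratio expression in Listing 3.2 can be read as (logical reads - physical reads) / logical reads, expressed as a percentage. The sketch below works one example through that formula. It assumes that the a and b values in the listing are the two logical read counters (consistent gets and db block gets) and that c is physical reads, which is what the formula implies; the function name and the sample counts are hypothetical.

```python
from typing import Optional

HIT_RATIO_THRESHOLD = 85  # percent, the alert level used in Listing 3.2


def buffer_hit_ratio(consistent_gets: int, db_block_gets: int,
                     physical_reads: int) -> Optional[int]:
    """Percentage of logical reads satisfied without a physical disk read."""
    logical_reads = consistent_gets + db_block_gets
    if logical_reads == 0:
        return None  # no activity recorded, so the ratio is undefined
    return round(100 * (logical_reads - physical_reads) / logical_reads)


# Hypothetical counts: 1,000,000 logical reads, 200,000 of which went to disk.
ratio = buffer_hit_ratio(consistent_gets=900_000, db_block_gets=100_000,
                         physical_reads=200_000)

if ratio is not None and ratio < HIT_RATIO_THRESHOLD:
    print(f"ALERT: buffer hit ratio {ratio}% is below {HIT_RATIO_THRESHOLD}%")
else:
    print(f"buffer hit ratio: {ratio}%")
```

The guard against zero logical reads matters in the SQL as well: a snapshot with no recorded activity would make the division fail.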