4.4 Examples of Metrics Programs
4.4.1 Motorola
Motorola's software metrics program is well articulated by Daskalantonakis (1992). Following the Goal/Question/Metric paradigm of Basili and Weiss (1984), goals were identified, questions were formulated in quantifiable terms, and metrics were established. The goals and measurement areas identified by the Motorola Quality Policy for Software Development (QPSD) are listed below.
Goal 1: Improve project planning.
Goal 2: Increase defect containment.
Goal 3: Increase software reliability.
Goal 4: Decrease software defect density.
Goal 5: Improve customer service.
Goal 6: Reduce the cost of nonconformance.
Goal 7: Increase software productivity.
Measurement Areas:
Delivered defects and delivered defects per size
Total effectiveness throughout the process
Adherence to schedule
Accuracy of estimates
Number of open customer problems
Time that problems remain open
Cost of nonconformance
For each goal, the questions to be asked and the corresponding metrics were also formulated. In the following, we list the questions and metrics for each goal:
Goal 1: Improve Project Planning
Question 1.1: What was the accuracy of estimating the actual value of project schedule?
Metric 1.1: Schedule Estimation Accuracy (SEA)
Question 1.2: What was the accuracy of estimating the actual value of project effort?
Metric 1.2: Effort Estimation Accuracy (EEA)
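The text names these two metrics without giving their formulas. As a minimal sketch, assuming each accuracy metric is the ratio of the actual value to the estimated value (so that 1.0 means the estimate was exact), they could be computed as follows. The function name and the sample figures are hypothetical.

```python
def estimation_accuracy(actual: float, estimated: float) -> float:
    """Ratio of actual to estimated value; 1.0 means the estimate was exact.

    Assumption: the ratio form is not stated in the surrounding text.
    """
    if estimated <= 0:
        raise ValueError("estimated value must be positive")
    return actual / estimated


# Hypothetical project: planned for 10 months and 40 person-months,
# finished in 12 months using 50 person-months.
sea = estimation_accuracy(actual=12, estimated=10)  # schedule
eea = estimation_accuracy(actual=50, estimated=40)  # effort
print(f"SEA = {sea:.2f}, EEA = {eea:.2f}")
```

Under this assumed form, values above 1.0 indicate an overrun and values below 1.0 indicate that the project came in under its estimate.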
Goal 2: Increase Defect Containment
Question 2.1: What is the currently known effectiveness of the defect detection process prior to release?
Metric 2.1: Total Defect Containment Effectiveness (TDCE)
Question 2.2: What is the currently known containment effectiveness of faults introduced during each constructive phase of software development for a particular software product?
Metric 2.2: Phase Containment Effectiveness for phase i (PCEi)
From Daskalantonakis's definition of error and defect, it appears that Motorola's use of the two terms differs from what was discussed earlier in this chapter. To understand the preceding metric, consider Daskalantonakis's definitions:
Error: A problem found during the review of the phase where it was introduced.
Defect: A problem found later than the review of the phase where it was introduced.
Fault: Both errors and defects are considered faults.
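These definitions suggest how the two containment metrics might be computed. The sketch below assumes that PCEi is the share of phase-i faults caught as errors (errors divided by errors plus defects) and that TDCE is the share of all known defects found before release. Both formulas are assumptions made for illustration, and all counts are hypothetical.

```python
def containment_effectiveness(caught: int, escaped: int) -> float:
    """Fraction of faults that were caught rather than escaping to a later point."""
    total = caught + escaped
    if total == 0:
        raise ValueError("no faults recorded")
    return caught / total


# Hypothetical faults introduced in each phase:
# (errors found in that phase's own review, defects found later)
faults_by_phase = {
    "requirements": (30, 10),
    "design": (45, 15),
    "code": (80, 20),
}

# Assumed PCE_i: errors / (errors + defects) for faults introduced in phase i.
pce = {
    phase: containment_effectiveness(errors, defects)
    for phase, (errors, defects) in faults_by_phase.items()
}

# Assumed TDCE: prerelease defects / (prerelease + postrelease defects).
tdce = containment_effectiveness(caught=190, escaped=10)

for phase, value in pce.items():
    print(f"PCE({phase}) = {value:.2f}")
print(f"TDCE = {tdce:.2f}")
```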
Goal 3: Increase Software Reliability
Question 3.1: What is the rate of software failures, and how does it change over time?
Metric 3.1: Failure Rate (FR)
Goal 4: Decrease Software Defect Density
Question 4.1: What is the normalized number of in-process faults, and how does it compare with the number of in-process defects?
Metric 4.1a: In-process Faults (IPF)
Metric 4.1b: In-process Defects (IPD)
Question 4.2: What is the currently known defect content of software delivered to customers, normalized by Assembly-equivalent size?
Metric 4.2a: Total Released Defects (TRD) total
Metric 4.2b: Total Released Defects (TRD) delta
Question 4.3: What is the currently known customer-found defect content of software delivered to customers, normalized by Assembly-equivalent source size?
Metric 4.3a: Customer-Found Defects (CFD) total
Metric 4.3b: Customer-Found Defects (CFD) delta
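The defect density metrics under Goal 4 all normalize a defect count by source size. A minimal sketch of that normalization follows, assuming size is expressed in thousands of Assembly-equivalent lines. The helper name and the figures are hypothetical, and the sketch does not attempt the "delta" variants, whose exact numerators and denominators are not given here.

```python
def defects_per_ksize(defect_count: int, size_lines: int) -> float:
    """Defects per thousand lines of (Assembly-equivalent) source."""
    if size_lines <= 0:
        raise ValueError("size must be positive")
    return defect_count / (size_lines / 1000)


# Hypothetical release of 400,000 Assembly-equivalent lines.
total_size = 400_000
trd_total = defects_per_ksize(120, total_size)  # all known released defects
cfd_total = defects_per_ksize(30, total_size)   # subset reported by customers

print(f"TRD total = {trd_total:.3f} defects per thousand lines")
print(f"CFD total = {cfd_total:.3f} defects per thousand lines")
```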
Goal 5: Improve Customer Service
Question 5.1: What is the number of new problems opened during the month?
Metric 5.1: New Open Problems (NOP)
NOP = Total new postrelease problems opened during the month
Question 5.2: What is the total number of open problems at the end of the month?
Metric 5.2: Total Open Problems (TOP)
TOP = Total postrelease problems that remain open at the end of the month
Question 5.3: What is the mean age of open problems at the end of the month?
Metric 5.3: Mean Age of Open Problems (AOP)
AOP = (Total time postrelease problems remaining open at the end of the month have been open)/(Number of postrelease problems remaining open at the end of the month)
Question 5.4: What is the mean age of the problems that were closed during the month?
Metric 5.4: Mean Age of Closed Problems (ACP)
ACP = (Total time postrelease problems closed within the month were open)/(Number of postrelease problems closed within the month)
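The four customer service metrics can be derived from a list of problem records. The following is a minimal sketch under stated assumptions: the `Problem` record and the function name are hypothetical, ages are measured in days (the text does not specify a unit), and the sample dates are made up.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Problem:
    opened: date
    closed: Optional[date] = None  # None while the problem is still open


def monthly_service_metrics(problems, month_start: date, month_end: date) -> dict:
    """Compute NOP, TOP, AOP, and ACP for one month; ages are in days."""
    new_open = [p for p in problems if month_start <= p.opened <= month_end]
    still_open = [
        p for p in problems
        if p.opened <= month_end and (p.closed is None or p.closed > month_end)
    ]
    closed_in_month = [
        p for p in problems
        if p.closed is not None and month_start <= p.closed <= month_end
    ]

    def mean(values):
        values = list(values)
        return sum(values) / len(values) if values else 0.0

    return {
        "NOP": len(new_open),
        "TOP": len(still_open),
        "AOP": mean((month_end - p.opened).days for p in still_open),
        "ACP": mean((p.closed - p.opened).days for p in closed_in_month),
    }


problems = [
    Problem(date(2024, 2, 20), date(2024, 3, 11)),  # closed in March
    Problem(date(2024, 3, 1), date(2024, 3, 11)),   # opened and closed in March
    Problem(date(2024, 3, 11)),                     # opened in March, still open
    Problem(date(2024, 1, 31)),                     # older problem, still open
]
metrics = monthly_service_metrics(problems, date(2024, 3, 1), date(2024, 3, 31))
print(metrics)
```

Returning 0.0 for an empty month is a design choice of this sketch; a real report might show the mean ages as undefined instead.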
Goal 6: Reduce the Cost of Nonconformance
Question 6.1: What was the cost to fix postrelease problems during the month?
Metric 6.1: Cost of Fixing Problems (CFP)
CFP = Dollar cost associated with fixing postrelease problems within the month
Goal 7: Increase Software Productivity
Question 7.1: What was the productivity of software development projects (based on source size)?
Metric 7.1a: Software Productivity total (SP total)
Metric 7.1b: Software Productivity delta (SP delta)
From the preceding goals one can see that metrics 3.1, 4.2a, 4.2b, 4.3a, and 4.3b are metrics for end-product quality, metrics 5.1 through 5.4 are metrics for software maintenance, and metrics 2.1, 2.2, 4.1a, and 4.1b are in-process quality metrics. The others are for scheduling, estimation, and productivity.
In addition to the preceding metrics, which are defined by the Motorola Software Engineering Process Group (SEPG), Daskalantonakis describes in-process metrics that can be used for schedule, project, and quality control. Without getting into too many details, we list these additional in-process metrics in the following. [For details and other information about Motorola's software metrics program, see Daskalantonakis's original article (1992).] Items 1 through 4 are for project status/control and items 5 through 7 are really in-process quality metrics that can provide information about the status of the project and lead to possible actions for further quality improvement.
1. Life-cycle phase and schedule tracking metric: Track schedule based on life-cycle phase and compare actual to plan.
2. Cost/earned value tracking metric: Track actual cumulative cost of the project versus budgeted cost, and actual cost of the project so far, with continuous update throughout the project.
3. Requirements tracking metric: Track the number of requirements changes at the project level.
4. Design tracking metric: Track the number of requirements implemented in design versus the number of requirements written.
5. Fault-type tracking metric: Track causes of faults.
6. Remaining defect metrics: Track faults per month for the project and use a Rayleigh curve to project the number of faults in the months ahead during development.
7. Review effectiveness metric: Track error density by stages of review and use control chart methods to flag the exceptionally high or low data points.
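As an illustration of the review effectiveness metric, the sketch below flags reviews whose error density falls outside three-sigma limits. It assumes a u-chart, a common control chart for defect densities with varying sample sizes; the text does not say which chart type Motorola used. The function name and the data are hypothetical.

```python
import math


def u_chart_flags(errors, sizes):
    """Return (index, density, lower, upper) for points outside 3-sigma limits.

    errors[i] is the number of errors found in review i; sizes[i] is the
    amount of material reviewed (for example, KLOC or pages).
    """
    u_bar = sum(errors) / sum(sizes)  # overall error density (center line)
    flagged = []
    for i, (e, n) in enumerate(zip(errors, sizes)):
        sigma = math.sqrt(u_bar / n)  # Poisson-based standard error
        lower = max(0.0, u_bar - 3 * sigma)
        upper = u_bar + 3 * sigma
        density = e / n
        if density < lower or density > upper:
            flagged.append((i, density, lower, upper))
    return flagged


# Hypothetical data for six reviews.
errors = [12, 9, 11, 30, 10, 8]
sizes = [2.0, 1.5, 2.0, 2.0, 2.0, 1.5]

flagged = u_chart_flags(errors, sizes)
for index, density, lower, upper in flagged:
    print(f"review {index}: density {density:.1f} outside [{lower:.1f}, {upper:.1f}]")
```

A flagged point is a prompt for investigation rather than a verdict: an unusually high density may indicate poor input quality, and an unusually low one may indicate a superficial review.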
4.4.2 Hewlett-Packard
Grady and Caswell (1986) offer a good description of Hewlett-Packard's software metrics program, including both the primitive metrics and computed metrics that are widely used at HP. Primitive metrics are those that are directly measurable and accountable such as control token, data token, defect, total operands, LOC, and so forth. Computed metrics are metrics that are mathematical combinations of two or more primitive metrics. The following is an excerpt of HP's computed metrics:
Average fixed defects/working day: Self-explanatory.
Average engineering hours/fixed defect: Self-explanatory.
Average reported defects/working day: Self-explanatory.
Bang: "A quantitative indicator of net usable function from the user's point of view" (DeMarco, 1982). There are two methods for computing Bang. Computation of Bang for function-strong systems involves counting the tokens entering and leaving the function multiplied by the weight of the function. For data-strong systems it involves counting the objects in the database weighted by the number of relationships of which the object is a member.
Branches covered/total branches: When running a program, this metric indicates what percentage of the decision points were actually executed.
Defects/KNCSS: Self-explanatory (KNCSS = thousand noncomment source statements).
Defects/LOD: Self-explanatory (LOD = lines of documentation not included in program source code).
Defects/testing time: Self-explanatory.
Design weight: "Design weight is a simple sum of the module weights over the set of all modules in the design" (DeMarco, 1982). Each module weight is a function of the token count associated with the module and the expected number of decision counts which are based on the structure of data.
NCSS/engineering month: Self-explanatory.
Percent overtime: Average overtime/40 hours per week.
Phase: engineering months/total engineering months: Self-explanatory.
Of these metrics, defects/KNCSS and defects/LOD are end-product quality metrics. Defects/testing time is a statement of testing effectiveness, and branches covered/total branches is testing coverage in terms of decision points. Therefore, both are meaningful in-process quality metrics. Bang is a measurement of functions and NCSS/engineering month is a productivity measure. Design weight is an interesting measurement but its use is not clear. The other metrics are for workload, schedule, project control, and cost of defects.
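To make the distinction between primitive and computed metrics concrete, the following sketch derives four of the computed metrics listed above from a handful of primitive measurements. The `Primitives` record, its field names, and the sample values are hypothetical; only the ratios themselves come from the list.

```python
from dataclasses import dataclass


@dataclass
class Primitives:
    """Directly measured values for one project (hypothetical record)."""
    defects: int
    ncss: int                           # noncomment source statements
    branches_covered: int
    total_branches: int
    engineering_months: float
    avg_overtime_hours_per_week: float


def computed_metrics(p: Primitives) -> dict:
    """Combine primitive metrics into a few of the computed metrics."""
    return {
        "defects_per_kncss": p.defects / (p.ncss / 1000),
        "branch_coverage": p.branches_covered / p.total_branches,
        "ncss_per_engineering_month": p.ncss / p.engineering_months,
        # Percent overtime: average overtime relative to a 40-hour week.
        "percent_overtime": 100 * p.avg_overtime_hours_per_week / 40,
    }


project = Primitives(
    defects=42,
    ncss=21_000,
    branches_covered=450,
    total_branches=600,
    engineering_months=30,
    avg_overtime_hours_per_week=4,
)
results = computed_metrics(project)
for name, value in results.items():
    print(f"{name}: {value:.2f}")
```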
As Grady and Caswell point out, this list represents the most widely used computed metrics at HP, but it may not be comprehensive; many others are discussed in other sections of their book. For example, customer satisfaction measurements in relation to software quality attributes are a key area in HP's software metrics. As mentioned earlier in this chapter, the software quality attributes defined by HP are called FURPS (functionality, usability, reliability, performance, and supportability). Goals and objectives for FURPS are set for software projects. Furthermore, to achieve the FURPS goals of the end product, measurable objectives using FURPS for each life-cycle phase are also set (Grady and Caswell, 1986, pp. 159-162).
MacLeod (1993) describes the implementation and sustenance of a software inspection program in an HP division. The metrics used include average hours per inspection, average defects per inspection, average hours per defect, and defect causes. These inspection metrics, used appropriately in the proper context (e.g., comparing the current project with previous projects), can be used to monitor the inspection phase (front end) of the software development process.
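The inspection metrics named above are simple aggregates over inspection records. A minimal sketch follows, assuming each record holds the hours spent and a list of defect causes; the record layout, the cause labels, and the numbers are all hypothetical.

```python
from collections import Counter


def inspection_summary(inspections) -> dict:
    """Aggregate inspection records into the four metrics named in the text."""
    count = len(inspections)
    total_hours = sum(i["hours"] for i in inspections)
    causes = Counter(c for i in inspections for c in i["defects"])
    total_defects = sum(causes.values())
    return {
        "avg_hours_per_inspection": total_hours / count,
        "avg_defects_per_inspection": total_defects / count,
        "avg_hours_per_defect": total_hours / total_defects if total_defects else None,
        "defect_causes": causes,
    }


# Hypothetical records for three inspections.
inspections = [
    {"hours": 6.0, "defects": ["logic", "interface", "logic"]},
    {"hours": 4.0, "defects": ["documentation"]},
    {"hours": 5.0, "defects": ["logic", "data"]},
]
summary = inspection_summary(inspections)
print(summary)
```

Comparing such a summary with the same figures from previous projects gives the context the text calls for.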
4.4.3 IBM Rochester
Because many examples of the metrics used at IBM Rochester have already been discussed or will be elaborated on later, here we give just an overview. Furthermore, we list only selected quality metrics; metrics related to project management, productivity, scheduling, costs, and resources are not included.
Overall customer satisfaction as well as satisfaction with various quality attributes such as CUPRIMDS (capability, usability, performance, reliability, install, maintenance, documentation/information, and service).
Postrelease defect rates such as those discussed in section 4.1.1.
Customer problem calls per month
Fix response time
Number of defective fixes
Backlog management index
Postrelease arrival patterns for defects and problems (both defects and non-defect-oriented problems)
Defect removal model for the software development process
Phase effectiveness (for each phase of inspection and testing)
Inspection coverage and effort
Compile failures and build/integration defects
Weekly defect arrivals and backlog during testing
Defect cause and problem component analysis
Reliability: mean time to initial program loading (IPL) during testing
Stress level of the system during testing as measured in level of CPU use in terms of number of CPU hours per system per day during stress testing
Number of system crashes and hangs during stress testing and system testing
Models for postrelease defect estimation
Various customer feedback metrics at the end of the development cycle before the product is shipped
S curves for project progress comparing actual to plan for each phase of development such as number of inspections conducted by week, LOC integrated by week, number of test cases attempted and succeeded by week, and so forth.
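Several items in this list reduce to simple monthly ratios. As one illustration, the sketch below computes a backlog management index, assuming it is defined as the number of problems closed during the month divided by the number of problem arrivals during the month, expressed as a percentage. That definition is not given in this section and is stated here as an assumption; the monthly figures are hypothetical.

```python
def backlog_management_index(closed: int, arrived: int) -> float:
    """Assumed BMI: problems closed / problem arrivals in the month, in percent."""
    if arrived <= 0:
        raise ValueError("arrivals must be positive")
    return 100.0 * closed / arrived


# Hypothetical monthly data: (month, problem arrivals, problems closed)
monthly = [
    ("Jan", 40, 36),
    ("Feb", 40, 50),
    ("Mar", 25, 25),
]
bmi = {month: backlog_management_index(closed, arrived)
       for month, arrived, closed in monthly}

for month, value in bmi.items():
    trend = "backlog shrinking" if value > 100 else "backlog not shrinking"
    print(f"{month}: BMI = {value:.1f}% ({trend})")
```

Under this assumed definition, a value above 100% means that more problems were closed than arrived in the month, so the backlog is being reduced.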