In his book The Personal MBA: Master the Art of Business, Josh Kaufman recommends isolating two to five critical indicators, or what he calls Key Performance Indicators (KPIs). These numbers should measure the throughput of a given system. The goal in identifying and tracking these critical numbers is to improve the system under investigation. At a high level, our system under investigation is the software testing process, and its throughput could be seen as the quality of the software measured against the requirements of the application.
I recommend taking a reverse approach to measuring software quality. Since software bugs represent features that do not work as specified, the lower this "throughput" of bugs, the better the system under investigation is working.
Critical numbers are just what the name implies: critical. Identifying, quantifying, and reporting against too many issues buries the stakeholders in reports they will not read. If reports are not read, the whole exercise is of little use. Experiment to discover which numbers support your enterprise's core values.
The following example assumes the business places a premium on offering a trusted end-user experience. Following the scenario detailed in Part Four, I have identified three Key Performance Indicators reporting issues discovered during testing and after deployment.
These critical numbers contain three elements: use cases, bugs found before deployment, and bugs found after deployment. Here is how I connect these elements to create my reports:
- Relate issues to use cases. (This addresses the question of whether more use cases lead to more production bugs.)
- Relate the number of production bugs to the week in which they were found. (This reports how long it took to stabilize the production system. The longer it takes to stabilize a system, the lower the customer satisfaction.)
- Relate the total number of production bugs from the current release to earlier releases. (The raw number of bugs that "slipped through the cracks" compares the effectiveness of one test plan to another, enabling better planning.)
The first report details the ratio of issues to use cases. I use a ratio instead of a percentage because comparing these numbers helps me discover how many use cases the team can efficiently manage in the time allotted. Asserting that n use cases are too many to test in the six weeks has no meaning until I have the numbers to back up the statement.
The following example offers these interesting insights:
- Test Round A – two new people joined the development team, and the error ratio was higher than in other rounds with a similar number of use cases. (Comparing Test Rounds B and D to A, the impact of these new team members becomes obvious.)
- Test Round C demonstrates the impact of adding more use cases to the release. Fifty-six use cases resulted in the discovery of 301 errors. (Compare the number of use cases in Test Rounds A, B, and D to Test Round C.)
| Line Item | Test Round A | Test Round B | Test Round C | Test Round D |
|---|---|---|---|---|
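The ratio itself is simple to compute. Below is a minimal Python sketch: Test Round C's figures (56 use cases, 301 errors) come from the example above, and the error counts for the other rounds come from the pre-deployment column of the final table, but the use-case counts for Rounds A, B, and D are placeholder values, since the table above does not reproduce them.

```python
# Compute the issues-to-use-cases ratio for each test round.
# Round C's figures (56 use cases, 301 errors) come from the text;
# the use-case counts for Rounds A, B, and D are hypothetical placeholders.
rounds = {
    "Test Round A": {"use_cases": 40, "errors": 138},  # 40 is a placeholder
    "Test Round B": {"use_cases": 42, "errors": 147},  # 42 is a placeholder
    "Test Round C": {"use_cases": 56, "errors": 301},
    "Test Round D": {"use_cases": 41, "errors": 155},  # 41 is a placeholder
}

for name, data in rounds.items():
    ratio = data["errors"] / data["use_cases"]
    print(f"{name}: {ratio:.2f} errors per use case")
```

Tracking this ratio round over round is what turns "n use cases are too many" from a guess into a defensible claim.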
The second report shows the number of bugs discovered during each of the first six weeks after deployment. In this example, each Friday the production bugs for that week are added to the report. The following are some initial impressions:
- The quality assurance team did an excellent job of identifying bugs caused by adding two new developers. (Compare Test Round A to the other rounds.)
- For Test Rounds A and B, it took four weeks to stabilize the system. However, it took five weeks for Test Rounds C and D. A closer look should be taken at the differences between Test Rounds A and B and Test Rounds C and D.
| End of Week | Test Round A | Test Round B | Test Round C | Test Round D |
|---|---|---|---|---|
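Reading the stabilization point off the weekly counts can be automated. The sketch below uses hypothetical weekly bug counts for Test Round A (the six placeholder values sum to the round's 17 post-deployment bugs from the final table, with no new bugs after week four); a round is considered stable after the last week in which a new production bug is found.

```python
def weeks_to_stabilize(weekly_bugs):
    """Return the last week (1-based) in which a new production bug appeared."""
    stable_after = 0
    for week, count in enumerate(weekly_bugs, start=1):
        if count > 0:
            stable_after = week
    return stable_after

# Hypothetical weekly counts for Test Round A; the values sum to the
# round's 17 post-deployment bugs, with no new bugs after week 4.
round_a = [8, 5, 3, 1, 0, 0]
print(weeks_to_stabilize(round_a))  # 4
print(sum(round_a))                 # 17
```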
The final report in our example represents the total number of bugs found in production during the first six weeks after deployment. The following are some observations:
- The jump in production bugs for Test Round C could be due to the increase in use cases requiring testing. Since each cycle is fixed at six weeks and the number of use cases that may be included in a round is not, it is entirely possible that 56 use cases are too many to test given the fixed timetable. Further data is required to form a conclusion; however, the quality assurance manager should keep a close eye on this number in the future.
- Since Test Round D had roughly the same number of use cases as Test Rounds A and B, there is apparently an issue somewhere in the test process or in the specific test plan for Test Round D.
- Discovering the differences between Test Rounds A and B and Test Round D is mission critical and requires a corrective action plan.
- The enterprise must decide whether an average of 35 production bugs after each release reflects the company's values. (The four rounds produced 17, 19, 54, and 50 post-deployment bugs, or 140 in total.)
| Test Rounds | Pre Deploy | Post Deploy | Ratio | Percentage |
|---|---|---|---|---|
| Test Round A | 138 | 17 | 8.12 | 12.32% |
| Test Round B | 147 | 19 | 7.74 | 12.93% |
| Test Round C | 301 | 54 | 5.57 | 17.94% |
| Test Round D | 155 | 50 | 3.10 | 32.26% |
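The Ratio and Percentage columns above are derived entirely from the two bug counts, so they can be recomputed (and sanity-checked) with a few lines of Python:

```python
# Recompute the Ratio (pre / post) and Percentage (post / pre * 100)
# columns from the pre- and post-deployment bug counts in the table.
rounds = {
    "Test Round A": (138, 17),
    "Test Round B": (147, 19),
    "Test Round C": (301, 54),
    "Test Round D": (155, 50),
}

for name, (pre, post) in rounds.items():
    ratio = pre / post
    percentage = post / pre * 100
    print(f"{name}: ratio {ratio:.2f}, percentage {percentage:.2f}%")
```

Keeping the calculation in a script rather than a spreadsheet makes it trivial to regenerate this report after every release.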
Click here for Part Six – Discovering What Went Wrong