Misleading Metrics

Statistics may be defined as “a body of methods for making wise decisions in the face of uncertainty.” ~W.A. Wallis

Here’s the situation. Jim needs to release software on a specific date for a big trade show. It’s currently Monday and the ship date is Friday. Everyone is pumped! Marketing has their message ready for Friday morning. Brochures are printed. The sales team is pumped. Support and services have been trained. Everything is set!

Jim is the project manager for this release. From the information he has, there doesn’t appear to be any issue with hitting the deadline. During the mid-morning bug triage, the test manager, Neil, reported that 45 of the 74 bugs for this project need to be fixed for release.

Jim doesn’t see a problem. “Seems like a reasonable goal: 45 bugs in 5 days. With all our coding resources dedicated to fixing them, it should be no problem. We’ll work on them in priority order: P1s first, P2s second, P3s third. With 6 developers working full time we will fix them in no time.”

“This is how humans are: we question all our beliefs, except for the ones we really believe, and those we never think to question.” -Orson Scott Card

In the last week of a software project things can get pretty heated. Lots of buzz. Lots of teamwork and camaraderie. Lots of great questions from management too! How long will it take to fix these bugs? Will we ship on time? Can we do it? What will it take? Can we put more people on the project and get more of these bugs fixed? There can also be an insane amount of pressure.

Neil isn’t so sure about what Jim is saying. The 45 must-fix bugs broke down by priority as 4 P1, 21 P2, 8 P3 and 12 P4. Jim kept getting hounded by upper management: “Are we going to make the date?” To answer, Jim did something quick and easy that not even Neil was expecting. He applied a linear trend analysis to the bug counts and predicted the release would be OK. “Well, it’s now Wednesday morning and we are fixing 10 bugs a day. We have 25 bugs left, so we will be done in 2.5 days. That should leave us with a half-day buffer,” Jim proudly noted.

What do you think happened to the project and to Jim?

In reality the project was late by 3 business days, or 5 elapsed days if you include the weekend work. Everyone on the team was frustrated. Neil wasn’t sure what to do. What could he do? The project manager was misleading management and they were buying it. Neil knew that Jim’s linear trend analysis assumed all bugs are equal. It did not take into account the showstopper bug that kept the test team from installing the product, which cost them an entire day of testing. When the testers did get the build installed on Thursday morning, they found that three of the P1 bugs weren’t fixed sufficiently and had to be revisited.

“Do not put your faith in what statistics say until you have carefully considered what they do not say.”  ~William W. Watt

What went wrong here for Jim? Isn’t this how you predict when you will be done fixing bugs? Statistically it made sense. It was “obvious”. You just count the bugs you start with, figure out the rate of fixing (this might also be called bug fix velocity) and voila, you get your answer. Unfortunately real software isn’t that simple. Applying simple statistical models to complicated real-life scenarios can lead you down a path of frustration, misrepresentation, late projects, overtime and overworked employees. This is exactly what happened to Jim, who got himself into hot water because he tried to give management an answer that was easy and, on the surface, seemed to make sense.
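Jim’s Wednesday-morning arithmetic can be written out in a few lines of Python. This is only a sketch of the calculation described in the story; the function name and the three-working-days figure (Wednesday, Thursday, Friday) are assumptions made for illustration.

```python
def days_to_finish(open_bugs: int, fixes_per_day: float) -> float:
    """Naive linear projection: treats every bug as the same amount of work."""
    if fixes_per_day <= 0:
        raise ValueError("fix rate must be positive")
    return open_bugs / fixes_per_day


# Figures from the story: 25 bugs left, 10 fixed per day.
remaining_days = days_to_finish(open_bugs=25, fixes_per_day=10)

# Assumption: three working days remain (Wednesday through Friday).
working_days_left = 3
buffer_days = working_days_left - remaining_days

print(remaining_days)  # 2.5
print(buffer_days)     # 0.5
```

Nothing in this calculation knows about priority, verification time or rework, which is why the tidy half-day buffer is so fragile.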

What does the fix rate trend analysis NOT take into account?

- The time for testers to verify and close bugs.

- The time for developers to revisit issues that weren’t fixed properly the first time.

- The build being broken.

- Changing business priorities as the deadline approaches.

- The team finding more bugs along the way, especially important ones.
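To get a feel for how quickly these omissions erode a linear forecast, here is a rough sketch that adds three of them to the projection. The function, its parameters and the example inputs are hypothetical and chosen only for illustration; they are not measurements from Jim’s project, and the model is still a simplification (each failed fix is assumed to be redone exactly once).

```python
def adjusted_days(open_bugs, fixes_per_day, reopen_rate=0.0,
                  new_bugs=0, blocked_days=0.0):
    """Linear projection padded for rework, new bugs and blocked days.

    reopen_rate:  fraction of fixes that fail verification and are redone once
    new_bugs:     must-fix bugs expected to be found before release
    blocked_days: days lost to things like a broken build
    """
    workload = (open_bugs + new_bugs) * (1 + reopen_rate)
    return workload / fixes_per_day + blocked_days


# With no adjustments this reduces to the naive estimate of 2.5 days.
naive = adjusted_days(25, 10)

# Illustrative inputs only: 20% of fixes reopened, 3 new must-fix bugs,
# and one day lost to a broken build.
padded = adjusted_days(25, 10, reopen_rate=0.2, new_bugs=3, blocked_days=1)

print(naive)
print(round(padded, 2))
```

Even modest values for these inputs push the estimate past the three working days that were left on Wednesday morning.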

The lesson learned here for Jim is that not all bugs are created equal. Beware of metrics that aren’t valid. Know what they can’t do for you. They can have a huge impact on you, your team, your software and the project’s bottom line. The fallout from working to such a schedule was that the team also had to ship several (read: eight) hot fixes to address issues that came up once the product was released.

As project manager, is there something Jim could have done differently? What should Neil, the test manager, have done?

“He uses statistics as a drunken man uses lampposts – for support rather than for illumination.”  ~Andrew Lang

6 Responses to “Misleading Metrics”

  1. Tweets that mention Adam White » Blog Archive » Misleading Metrics -- Topsy.com Says:

    [...] This post was mentioned on Twitter by Bob Marshall, Rob Lambert. Rob Lambert said: Nice post by Adam White on the dangers of metrics. http://bit.ly/9jX2pP [...]

  2. Joseph Kubik Says:

    Is the problem with a linear trend line (or any type of trend line) that we fail to use statistics like statisticians do?
    For other math / stats we would use a trend line and a P value (mathematically, how consistent is our rate, or is the trend an average of very random data) and then say +/- N units. We would use the P value to decide how accurate our trend is.

    example:
    We have fixed an average of 5 bugs per day over 10 weeks with a 95th percentile of 0 and 25 leaving us a P value that means a very bad match. so we will ship on March 15th, +/- a LARGE variance.
    OR,
    We have fixed an average of 5 bugs per day over 10 weeks with a 95th percentile of 4 and 6 leaving us a P value that means a very good match. so we will ship on March 15th, +/- a very small variance.

    -Joseph-

  3. Glory Leung Says:

    Jim could perhaps have assigned risk numbers to each P value, where the number is the “observed” time it takes development, on average, to fix that type of issue (say, in hours, but you can use anything you want).

    For example, if it takes roughly 10 hours to fix a P1, then assign a factor of 10. If it takes roughly 7 hours to fix a P2, then assign a factor of 7. We’ll use 5 for P3s and 3 for P4s.

    Therefore we have 263 man hours divided by 6 hours a day divided by N number of Developers. (In this case, if there is 1 developer, it will take 44 days!!)

    By no means is this the real solution, but I think it can give you a more accurate account of the situation at hand. There are too many external variables to get an absolute estimate (people getting sick, delays, acts of God).

    I’ll get to Nick’s side later.
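The weighted estimate in the comment above can be checked with a short sketch. The hours per priority are the commenter’s illustrative factors, the bug counts come from the post, and the function names are hypothetical. The sketch also assumes the work divides perfectly across developers, which real projects rarely manage.

```python
# Commenter's illustrative factors: hours to fix one bug of each priority.
HOURS_PER_FIX = {"P1": 10, "P2": 7, "P3": 5, "P4": 3}

# Must-fix bug counts from the post (45 in total).
BUG_COUNTS = {"P1": 4, "P2": 21, "P3": 8, "P4": 12}


def total_fix_hours(counts, hours_per_fix):
    """Sum of (bug count x hours per fix) over all priorities."""
    return sum(counts[p] * hours_per_fix[p] for p in counts)


def working_days(total_hours, developers, productive_hours_per_day=6):
    """Days needed if the work splits evenly across developers."""
    return total_hours / (productive_hours_per_day * developers)


hours = total_fix_hours(BUG_COUNTS, HOURS_PER_FIX)
print(hours)                              # 263
print(round(working_days(hours, 1), 1))   # 43.8
print(round(working_days(hours, 6), 1))   # 7.3
```

Under these assumed factors, even six developers would need a little over seven working days for the 45 bugs, which is more than the five-day week Jim started with.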

  4. Glory Leung Says:

    Nick probably should have just told Jim that his metrics are wrong. I wouldn’t totally say it’s Nick’s fault, but he should take some blame in not convincing Jim enough to delay the release or push more bugs.

    Nick should lay all the cards out for Jim and say “Here is the situation. blah blah blah”. Whether or not Jim takes it to heart is another question, but at least Nick covered for the test team.

  5. John Allen Says:

    Who is Nick and what is his side?

  6. Adam White » Blog Archive » InsideSpin – Contributor of the month Says:

    [...] Strictly by the number of posts in the discussion forum. Maybe I should look at my own post about misleading metrics What is [...]
