The Original Sin of Software Metrics (2014)

m463 · on Feb 24, 2020

I'm reminded of this story from ancient Apple lore:

https://www.folklore.org/StoryView.py?story=Negative_2000_Li...

teddyh · on Feb 24, 2020

Also this:

https://dilbert.com/strip/1995-11-13

artsyca · on Feb 24, 2020

Yeah, nowadays he would've been pulled into a 1:1 and dressed down before being let go three months later.

teddyh · on Feb 24, 2020

“When a measure becomes a target, it ceases to be a good measure”

— Goodhart's law as phrased by Marilyn Strathern

artsyca · on Feb 24, 2020

Add to this Conway's law and Greenspun's tenth rule, and you have an environment where the culture ends up somewhere between Animal Farm and Brazil, and the only ones benefitting are spies from former and current communist dictatorships.

closeparen · on Feb 24, 2020

My argument against defect rate metrics is much simpler: the really scary situations are the ones where no one is looking closely enough to file bugs in the first place. That’s where you find out it hasn’t worked at all in six months.

Whereas the functionality subject to many discerning eyes filing bugs about every little thing is probably doing pretty well.

kabdib · on Feb 24, 2020

Been there.

That time at Apple when we were sitting in a meeting, fat and happy about the bug count in a new C compiler and how ready it looked to ship, when it was revealed that a Q/A person had turned the floating point tests off six months earlier "because none of them were passing" and hadn't turned them back on.

We all learned a lot about how floating point worked in the next couple of months.

MarkSweep · on Feb 24, 2020

This reminds me of a good example from work: the emergency stop button on an industrial robot. By definition, the customer should not be pushing it. Pushing it means something has gone terribly wrong. But when they push it, they expect certain things to happen and they are unhappy when they don't.

(Your typical industrial robot does not rely on software for safety. However, software can make the robot respond more gracefully.)

honestoHeminway · on Feb 24, 2020

We once tested industrial robot fences. Those fences do not stop the robot: if you deactivate the safety break zone, the robot goes clean through. All variants.

They are there to prevent humans from entering.

PS: War story seems familiar? Sebastian? Olli?

shanemhansen · on Feb 24, 2020

I worked on a project where this came up. We had some product features that weren't really used. We continued to talk about all the benefits of those features and when the business had renewed interest in that area we jumped to turning those features on.

Only to find out they barely worked. FWIW the unit tests were pretty solid but the integrations weren't and the graceful degradation story was a little too good.

So we went from no complaints to having an entire department breathing down our neck because all of a sudden people started giving a shit about our product (web performance improvements).

alkonaut · on Feb 24, 2020

Instead of measuring "enhancements" vs "defects", just treat them all as one. Things that go into the product via demand usually have some backlog anyway ("Backlog item", "User story" etc. on a higher level). What might be interesting is marking things as regressions or not regressions, which might give some hint about which areas are too complex or too undertested.
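To make the regression idea concrete, here is a minimal sketch of what that report could look like. The ticket records, area names, and the `regression_share` helper are all invented for illustration; a real tracker would supply its own fields.

```python
from collections import Counter

# Hypothetical ticket records: (area, is_regression).
# Both the areas and the data are made up for this example.
tickets = [
    ("export", True),
    ("export", True),
    ("export", False),
    ("login", False),
    ("login", False),
    ("search", True),
    ("search", False),
]


def regression_share(tickets):
    """Return {area: fraction of that area's tickets marked as regressions}."""
    total = Counter(area for area, _ in tickets)
    regressions = Counter(area for area, is_reg in tickets if is_reg)
    # Counter returns 0 for areas that never had a regression.
    return {area: regressions[area] / total[area] for area in total}


shares = regression_share(tickets)

# Areas with the highest share of regressions come first.
for area, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{area}: {share:.0%} of tickets are regressions")
```

A high share is only a hint that an area may be too complex or undertested, not a verdict. And per the rest of this thread, it probably stops being useful the moment someone makes it a target.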

artsyca · on Feb 24, 2020

Next you're going to start assigning story points to bugs and have a velocity of dozens of points per sprint while delivering zero usable features

alkonaut · on Feb 24, 2020

Yes. I don’t subscribe to the idea that bugs don’t belong on the backlog with everything else. You can separate into two different backlogs (bugs and features) but you still need to pick from both.

If you don’t keep them on a backlog then people will start arguing whether something is a bug or in fact a feature request/missing functionality. That is a waste of time.

If you get a bug report for a bug that will take a week to fix, you need to “assign story points” to it (in the sense that if you commit to fixing it, you at least know you have a week less available for something else!). Fixing the bug becomes a prioritization between the bug and adding features. So the bug is measured in the same units as features and prioritized against features.

And fixing the bug adds value too: if feature X is not working and you make it work, that is value. If you fix X instead of implementing new feature Y, you deliver value with feature X instead of feature Y. Does that value not count because it was “implemented before”?

If you fix bugs for a month and ship a new version with only bug fixes, then yes, that is zero new features. Your velocity can be counted in terms of “new features” and be zero, or in terms of “rate of fixing bugs” and be high. It’s still a good measurement for knowing whether you have enough resources to hit a goal such as a yearly release, which might be planned with 50% bug fixes and 50% new features!
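A rough sketch of what "same units, same backlog" might look like in practice. Everything here is hypothetical: the `Item` fields, the value numbers, and the greedy value-per-day rule are just one simple way to prioritize, not a recommendation.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    kind: str       # "bug" or "feature"
    cost_days: int  # estimated effort, same unit for both kinds
    value: int      # made-up relative value


def plan(backlog, capacity_days):
    """Greedily pick items by value per day of effort, ignoring their kind."""
    chosen = []
    remaining = capacity_days
    ranked = sorted(backlog, key=lambda i: i.value / i.cost_days, reverse=True)
    for item in ranked:
        if item.cost_days <= remaining:
            chosen.append(item)
            remaining -= item.cost_days
    return chosen


backlog = [
    Item("Fix broken feature X", "bug", 5, 8),
    Item("New feature Y", "feature", 5, 6),
    Item("New feature Z", "feature", 3, 3),
    Item("Fix typo in dialog", "bug", 1, 1),
]

chosen = plan(backlog, capacity_days=10)

# How the committed days split between bug fixes and new features.
days_by_kind = {"bug": 0, "feature": 0}
for item in chosen:
    days_by_kind[item.kind] += item.cost_days

print([item.title for item in chosen], days_by_kind)
```

With these invented numbers the week-long bug fix wins a slot on its merits and the plan ends up split evenly between fixes and features. Nobody had to argue about whether X was "really" a bug or a missing feature.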

fhars · on Feb 24, 2020

Don't forget to ascribe half a story point to each meeting.

supercanuck · on Feb 24, 2020

Enhancements could be capitalized, a defect could be an operating expense.

I think this is where it truly came from.

mathattack · on Feb 24, 2020

Is this true for both internal IT departments and software companies?

This is actually a second example of businesses optimizing too much on the wrong metric (100% focus on earnings rather than cash, or vice versa).

a_c · on Feb 24, 2020

So instead of measuring the number of features delivered, measure time to delivery? Do we not have an implicit time constraint in "number of features delivered"?

raxxorrax · on Feb 24, 2020

Taylorism, perverted since 19-something. Without the bonuses, of course.

Bad management? Yes.