Data Abuse

Consider a study of the effectiveness of pain management methods for people in agony from nerve pain, based on a survey of whichever friends happened to be online at the moment. Please check for yourself. No data was harmed in the making of this chart:



Please note some of the key features. There are no units of measurement for the numbers. 600 whats? What in the world am I talking about here? There is no context or relation between the methods. What do acupuncture and a boot to the head have in common? This chart! Also, what period of time is this over? Who participated? Was there a focus group? How is effectiveness measured? Who even put this together? How do you get more info?

Scared yet? Seen any charts like this impacting the LIVES of real people at your workplace? Innocent facts are being abused! Sometimes the abuse comes from well-intentioned individuals who are so unaware of the proper care and feeding of data that they neglect it, and at times it comes intentionally from malicious numeralphiles who can't resist the allure of an attractive chart in the face of economic pressures.

It seems cliché to talk about interesting times and the economic downturn, but the last few years have seen an increase in character condensation among the people I encounter in the software industry. Whatever they were before, they become even more of it. Betrayal, gossip, defensiveness, and thinly veiled self-serving protectionism are happening more often. At the same time, altruistic and generous people are reaching out to support each other. Hope, service, loyalty, and volunteerism are still present but harder to seek out, and those who are money-driven think we are "suckers", "unprofessional", "immature", or "self-limiting". I would say that if it is unprofessional, immature, self-limiting, and unrealistic to have ethics, sign me up. Justify playing a dirty game how you will, but don't insult me for my decision not to. I'll pay for it. You'll be the one rolling around on a pile of money, so you win the money. I've never been super motivated by money alone, so that's ok. I prefer the good people. I'll keep them.

In every business decisions get made, and it seems to me that, more than ever, data plays an important role in those decisions. There are many good uses of data, and I do not mean to imply that there are no responsible and useful metrics. I am just saying that there are some egregious abuses going on right now in software. Most people want to base their choices on more than gut feeling and whims, or at least have some explanation to back up those decisions. Data comes into play as soon as someone has a question they want a logical answer to. Data itself is absolutely innocent of any intent, malice, or will. However, the fact that it exists at all means that there is the potential for abuse, and there are certain types of data that are simply begging for it. These masochistic data types lure in well-meaning people and cause all sorts of havoc, and ill-intentioned and desperate individuals are also found near these troubled numbers.

1. Maybe it's Maybelline: Unverifiable data. These are the numbers that are marked "confidential" or locked down so that no one else can check the source or how the information was tabulated. This is similar to tabloid data, where items presented as fact are mostly rumor or based on "our sources".

2. Eyes Without a Face: Data with absolutely no context presented as fact. An example of this is the ever-popular bug trend. Showing that there are fewer bugs does not say much about quality unless you understand the defect density, have the same number of testable builds, have reasonably similar test coverage, and include all known bugs in the count. When you present just one number, it is very easy to make that number look however you like.

3. Some Airbrushing: Same data, different presentation. "We have improved our customer service ratings by 100% in the last month!" Based on that, should you give my quality team a bonus? Customers are more satisfied. Of course, the month before we dropped to an all-time low, and now we are STILL near that all-time low and customers hate us more than ever. Furthermore, sales have dropped, and because our low quality is well known now, fewer people were surveyed this month, and we only surveyed people when we had time, so we hand-picked the respondents. We also changed our collection methods and started asking a totally different question about satisfaction, but compared the data anyhow, because the data won't mind. After all, it's the best information we have, and when asked we'll just say, "Well, it's just a guideline. It's the best data we have and that is far better than no data!" The trouble is, the gap between reality and the metrics will come out. When you see the unairbrushed version in sweats, how are you going to react?

4. Dude Looks Like a Lady: This is when you take subjective data and make it appear objective with a nice dress. So, for example, you have people enter the time they actually spent doing things compared to an estimate, and then you use it for metrics. The trouble is, people are not good at entering how much time they actually spent, especially if you measure them based on it. People enter round numbers, and they don't remember real time. Things they hate doing seem to last forever! Things they love just fly by. Perceptual time in the human mind is not actual time. Anyhow, a common data abuse is to collect estimates from people and have them self-report either time or arbitrary percentages of total time without any basis in real time. Then you take the time spent and present it as objective data. If you get information like this, I urge you to check for the telltale signs and ask questions.

5. Minotaur Madness: This is when there is obviously no fit or match, yet you find a way to pair the wrong data with the point you are trying to make. If you are comparing apples and oranges, the conclusion you draw is abusive to the data.
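
To make the airbrushing in item 3 concrete, here is a minimal Python sketch of one way a "percent improved" claim could be forced to travel with its raw counts, and refused outright when the survey question changed. The `SurveyResult` class, the `describe_change` function, and every number below are hypothetical and made up for illustration; they are not from any real survey or tool.

```python
from dataclasses import dataclass


@dataclass
class SurveyResult:
    """One month of satisfaction survey results (hypothetical structure)."""
    month: str
    satisfied: int    # respondents who reported being satisfied
    respondents: int  # everyone who was surveyed that month
    question: str     # exact wording of the question asked

    @property
    def rate(self) -> float:
        return self.satisfied / self.respondents


def describe_change(before: SurveyResult, after: SurveyResult) -> str:
    """Report a relative change together with the raw counts behind it."""
    if before.question != after.question:
        return (f"Not comparable: the question changed between "
                f"{before.month} and {after.month}.")
    if before.respondents == 0 or after.respondents == 0 or before.satisfied == 0:
        return "Not comparable: no baseline to compute a percentage from."
    change = (after.rate - before.rate) / before.rate * 100
    return (f"{change:+.0f}% relative change: "
            f"{before.satisfied}/{before.respondents} satisfied in {before.month}, "
            f"{after.satisfied}/{after.respondents} in {after.month}.")


# Made-up numbers, for illustration only.
october = SurveyResult("October", satisfied=2, respondents=200,
                       question="Are you satisfied with our service?")
november = SurveyResult("November", satisfied=1, respondents=50,
                        question="Are you satisfied with our service?")
print(describe_change(october, november))
```

With these invented numbers the headline would be a 100% improvement, yet the same sentence now admits that only one customer out of fifty was satisfied and that far fewer people were asked. It still cannot tell you whether the respondents were hand-picked; that question has to be asked of a person.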

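One of the telltale signs mentioned in item 4 is round numbers. As a rough sketch, assuming time entries are logged in whole minutes, the share of entries that land exactly on a round increment can be checked like this. The function name `round_number_share`, the 30-minute step, and both sample lists are assumptions made up for illustration.

```python
def round_number_share(minutes_logged: list[int], step: int = 30) -> float:
    """Return the fraction of entries that are exact multiples of `step` minutes."""
    if not minutes_logged:
        raise ValueError("no entries to check")
    round_entries = sum(1 for minutes in minutes_logged if minutes % step == 0)
    return round_entries / len(minutes_logged)


# Made-up entries, for illustration only.
self_reported = [60, 120, 30, 240, 90, 60, 480, 45]
from_a_timer = [52, 131, 27, 238, 95, 61, 47, 12]

print(round_number_share(self_reported))
print(round_number_share(from_a_timer))
```

A high share does not prove anything on its own, but it is a good reason to ask how the time was recorded before treating it as objective.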

There are many more abuses happening to data which are unfair and must be stopped. False accusations and verbal abuse are rampant. Data isn't useless or stupid, nor is it a lying tramp. Data itself exists to serve us in any way that it can, not judging, not doing anything at all. We should set up a data protection program so that it can survive without being abused, and those who abuse it should be punished and rehabilitated so that the abuse can stop and we can all make more productive and reasonable business decisions.
 

Comments
  • 20 Nov 2009 Marlena wrote:
    I was just in a meeting yesterday and was presented with a slide that had all kinds of crap percentages listed.

    "We're 50% better!"
    "There's 98% more uptime!"

    No percentage should EVER be presented without a total count.

    I mean, look at this picture:
    http://www.techcrunch.com/wp-content/uploads/2009/03/figure1full.png

    You're up 202% of what? There's no data in this picture. Just some squiggly lines.

    Your "maybe it's maybelline" cracks me up.
  • 20 Nov 2009 Lanette Creamer wrote:
    I added my own example chart as well! That is a riot. I think we must raise the bar on use of data so that abuse is no longer acceptable in software, especially on the quality side. We are too smart to be fooled by such transparent abuse any longer. We have a right to expect better, to insist on better, and certainly we can be better ourselves.
  • 21 Nov 2009 Laura wrote:
    Thanks for taking some serious issues and explaining them in a clear and fun way. I also love your wording (e.g., Dude Looks Like a Lady). Nicely done. Thank you.
    1. 22 Nov 2009 Lanette Creamer wrote:
      Thanks for the encouragement! I'm researching metrics to try to help suggest ways to improve them for those of us who need to generate them for business.
