Statistics Tolerance Training
While I almost always prefer to focus on my areas of strength, there are a few areas of weakness that I'd like to address to some extent. I feel it would help my credibility, make me a better tester, and enable me to better collaborate with and understand other people.
For this reason I'm taking on some exercises to practice my thinking and raise my tolerance for tasks that I think of as "uninteresting", mixing them with things I do find interesting, like research.
So, I present to you the results of last weekend's training exercise. This is an Excel spreadsheet that breaks down household grocery expenses over the last 3 years. It also includes the national average adjusted for the cost of living in King County, Washington (which is higher than much of the nation). Were I able to find the data more easily, I'd compare against the costs of eating out, because what this doesn't show is that I cook on average at least 10 times per week and only buy lunch 1-2 times per week at work. If the only goal were to lower the grocery bill, just eating out more often would do that, but it would also mean that food and beverage would consume more of our budget. I should also note that while most people get coffee out or go to the bar, we only do so about once per month, so the wine consumed at home is part of the grocery budget.
What this chart also doesn't show is that when we first moved into the house we had never owned a home before, and we had to buy many basics at the grocery store that were non-food household items. Those tend to be expensive things like cleaning supplies, laundry detergent, cleaning tools, cooking utensils, and one-time staples that are only replaced every so often. What looks like "waste" may not be. Yet most business decisions are made from this kind of "valid data". Creating data that is viewed as valid is very powerful. It needs to be "not wrong" and show how things are trending.
I also found that 10-40% of the average grocery budget in the United States is wasted (ends up being recycled or thrown out). For that reason I set my goal at 15% waste, because getting it down to 10% might mean eating fewer healthy fresh foods, and I'm not willing to sacrifice food quality to that extent to lower the grocery budget. For reference, the USDA Cost of Food at Home data for March 2009 puts the US figure at $557.60 per month for a family of 2 between the ages of 19 and 50.
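For anyone who wants to redo this kind of comparison outside a spreadsheet, here is a rough sketch of the arithmetic in Python. It is not my actual spreadsheet: the function names are made up for illustration, and the cost-of-living factor, monthly bill, and wasted-dollar amounts in the example are hypothetical placeholders, not my real numbers. Only the $557.60 USDA figure and the 15% waste goal come from this post.

```python
# Figure quoted in the post: USDA Cost of Food at Home, March 2009,
# family of 2, ages 19-50, in US dollars per month.
USDA_MONTHLY_FAMILY_OF_2 = 557.60

# The waste goal set in the post (15% of the grocery budget).
WASTE_GOAL = 0.15


def adjusted_benchmark(national_monthly, cost_of_living_factor):
    """Scale a national monthly figure by a local cost-of-living factor.

    A factor of 1.0 means "same as the national figure"; the factor itself
    is an assumption you have to supply from your own source.
    """
    return national_monthly * cost_of_living_factor


def difference_from_benchmark(monthly_bill, benchmark):
    """Return bill minus benchmark; a negative result means below benchmark."""
    return monthly_bill - benchmark


def waste_share(wasted_dollars, total_dollars):
    """Return the fraction of the grocery spend that was wasted."""
    if total_dollars <= 0:
        raise ValueError("total_dollars must be positive")
    return wasted_dollars / total_dollars


if __name__ == "__main__":
    # Hypothetical placeholder inputs, NOT real data from the spreadsheet.
    example_factor = 1.10   # assumed local cost-of-living factor
    example_bill = 550.00   # assumed monthly grocery bill
    example_wasted = 82.50  # assumed dollars of groceries thrown out

    benchmark = adjusted_benchmark(USDA_MONTHLY_FAMILY_OF_2, example_factor)
    gap = difference_from_benchmark(example_bill, benchmark)
    share = waste_share(example_wasted, example_bill)

    print(f"Adjusted benchmark: ${benchmark:.2f} per month")
    print(f"Bill minus benchmark: ${gap:.2f}")
    print(f"Waste share: {share:.0%} (goal: {WASTE_GOAL:.0%})")
```

Note that this has the same blind spots as the spreadsheet: non-food items in the bill and meals eaten out are not accounted for unless you separate them yourself.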
To summarize, although I worry about how data and statistics are used in incomplete sets and are often misleading (usually unintentionally, on occasion intentionally), I still want to learn more about statistics and keep building up my tolerance through practice. If you have an exercise for me, a tutorial, or anything besides a book (I've read some books and now I need to practice), please drop me a line. If you want to point out other unintended effects that tracking only the budget could have, to prompt my thinking about the impacts of such measurements, feel free!
I've also learned something of interest about my personality. I started this assignment because I tired of the casual complaint from Craig that the "grocery bill was too high". I have now presented data showing not only that the bill has been trending downwards while costs have been rising, but also that his impression of the situation is stuck in the past. The grocery bill WAS too high in 2007, for good reason, but now in 2009 I expect him to apologize and tell me that we are indeed stellar: despite eating out far less than the average American, we have been so frugal as to beat the King County average for 2 years running. As a result I deserve a pony, as the sole person who does over 90% of the meal planning, cooking, and shopping. This, my friends, is one of the costs of dating a fellow nerd. You may get an email like this.


Of course, proper analysis of raw data is important. Not only are there going to be problems with making sure your data is what you imagine/assert it to be, there's also the question of understanding what the data *says*.
Your observations are right: non-grocery food cannot be left out of a meaningful study of food bills, and grocery purchases that aren't food MUST be left out if one intends to study food bills.
What one has, with a record of past grocery store bills, is just that, and it can never (with complete validity) be used in a study of food budgeting. It may give you general trending, but it's not solid and is prone to problems, as you've mentioned.
It's one reason I'm glad not to have to make decisions based on flawed data. It'd drive me batty!
While not a valid food study, the one thing that it can do is disprove Craig's casual statement that the grocery bill is too high.
Considering that I cook more often than average, and that grocery costs are rising, we are STILL spending less than average, even without deducting essential non-food items from the bill. Based on this information about average spending, hopefully Craig will now understand that while he may not think the grocery expenditures are reasonable, we have evidence that they are in fact below average compared to other Americans, and likely well below average, since our actual expenses include the non-food items and we still come in below average.
I have yet to see a good set of data where enough is tracked so that what the person intends to show is what the data actually shows. Still, lots of decisions are made off of data with less verification, validation, and context than this. In many cases people judge just on the chart without even knowing where the data came from.
I really appreciate your work on this site, so thanks for it. I hope you can continue this type of hard work on this site in the future. Your work is really remarkable.
Lol, this is excellent. An Excel spreadsheet that breaks down household grocery expenses over the last 3 years seems like a great way to prove your point. I may give that a try with some of MY bills!
-Hanna
It took me time to read all the comments, but I really enjoyed the article. It proved to be very helpful to me, and I am sure to all the commenters here! It's always nice when you can not only be informed, but also entertained! I'm sure you had fun writing this article.