Tester Challenge-Baby Cry Interpreter.
Now on sale for $39.99 from thinkgeek.com is the Why Baby Cry Analyzer which tells you why an infant is crying.
How would you ethically test this device? Would you change the strategy to test the iPhone app? If so, how? Should there be a disclaimer? If so, what should it be?
What impresses me about these products is that they were developed and they exist. Most testers I know would be so concerned about the liability and ethics of diagnosing why a human baby is crying that they may not be comfortable shipping such an application, yet here they are, available as low as $4.99.


This is not necessarily hard to test.
First off, it doesn't even have to be accurate. I bet there's a disclaimer included in the package that says it's for "entertainment purposes only."
I would want to read whatever research this is supposedly based on.
It could be tested in a rudimentary way by using recordings of babies gathered under natural circumstances.
One test would be to play the same recording 100 times and see if the system consistently responded in the same way.
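A minimal sketch of that repeatability check in Python, assuming a hypothetical `classify` function that feeds one recording to the device (or app) and returns the label it shows. Nothing here is the product's real interface; `fake_device` is only a stand-in so the example runs.

```python
from collections import Counter
from typing import Callable


def consistency_report(classify: Callable[[bytes], str], recording: bytes, runs: int = 100) -> dict:
    """Feed the same recording `runs` times and summarize the labels returned."""
    counts = Counter(classify(recording) for _ in range(runs))
    top_label, top_count = counts.most_common(1)[0]
    return {
        "counts": dict(counts),
        "most_common": top_label,
        # Fraction of runs that agreed with the most frequent answer.
        "consistency": top_count / runs,
    }


# Hypothetical stand-in for the real device: it always answers "hungry".
def fake_device(recording: bytes) -> str:
    return "hungry"


report = consistency_report(fake_device, b"placeholder audio bytes")
print(report["most_common"], report["consistency"])
```

A consistency well below 1.0 on an identical input would be worth a bug report by itself, whatever the "right" answer is.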
BTW, this is an irresponsible product. No parent should rely on a device like this. There's no shortcut for getting to know your child. And no device can figure in the matter of prior probability (e.g. if your baby has been awake a long time, it's probably sleepy, regardless of what the cry sounds like).
Reply to this
What is your oracle for testing this? That is the main question of interest. I looked for the disclaimer on the iPhone version and instead it claims to be "96% accurate" with no information on how that was determined.
I know that humans can tell pretty accurately why their baby is crying. Seeing my friend the second day after she gave birth be very accurate with her new daughter amazed me. She didn't need such a device.
What about cases like a totally deaf person? They can still tell a baby is crying, of course, but would this give them more information than their other senses? Is it still irresponsible to rely on it for additional information (on top of knowing your child)? Also, what does it do if a child is hungry, poopy, tired, and uncomfortable all at once? Who determines it is 96% accurate, since the babies can't vote here?
Reply to this
I think there are two separate issues here. Effectiveness of the product and whether it is working as designed.
You would need a randomized, placebo-controlled (maybe a device that randomly gives output?) study perhaps using a parental satisfaction survey as the dependent variable.
Independent of those results, James Bach covered a lot of that ground in his comment. Other than that: How will it respond to an hour of continuous crying? When there are two babies crying in the same room? Does it blow up after a year of not being turned off? What happens if the device is dropped in the toilet or covered in juice?
But I agree with James that this is probably a joke product. I HOPE parents aren't taking this seriously; a lot of the parental bond is figuring things out for yourself!
Reply to this
One of the things that bothers me about this device is the flawed assumptions that I believe went into its creation.
1. Babies cry for a reason.
2. When they cry it is for one reason, not many reasons.
3. The reasons fit neatly into 5-6 buckets.
4. When the baby stops crying it is always because you solved the problem they were facing and not for some other reason.
The product does say lots of ambient noise will interfere with the product working, so it is designed for one baby crying (not multiple).
If a babysitter uses it, is that unethical, considering they don't know the child as well?
Reply to this
About #4, "When the baby stops crying it is always because you solved the problem they were facing and not for some other reason" --> it is not necessarily about solving the problem. Most often it is distraction: diverting their attention from what they are crying about. But when a baby remembers what he/she was crying about, the crying resumes all over again.
I have used this approach with my little one several times. Diverting her attention and turning her focus to something enjoyable that made her chuckle always helped. Babies do not understand the resolution, just as it is hard for grown-ups to understand the baby's problem. (Exception: a parent (mostly the mother) most often knows the exact reason her infant is crying.)
For the exception above, I should say this comes from behavior, from observation over a period of time as a grown-up and a baby get to know each other and bond. And the baby knows he/she can trust the person because they have developed that trust over a period of time.
That's what reproducing a bug in testing leads to, correct? We observe, we study, we track the inputs and outputs for whatever way the application responds (behaves), and that helps with debugging and fixing the bug as appropriate. Agreed, we don't use the distraction option here; we get to solve the issue. Well, thinking about it again, we do use the distraction approach: we find a workaround, then weigh business priority and other constraints to decide whether to fix the bug or defer it to a later time with the workaround provided. See, the issue is not resolved, but we handle it by providing the workaround. When similar inputs/constraints/environment occur again, the defect crops up again (the baby remembers and resumes crying).
Thank you for this post Lanette, a great analogy of testing such devices.
Reply to this
I agree with the previous comments.
As a first time parent, I have found there are lots of products on the market that are just not needed but do exist and there is no substitute for growing and learning with your baby.
What I did find weird was receiving next door's satellite TV when trying out the baby monitor camera (which we didn't use anyway). What I found with the baby monitor was that there was so much interference. I think they have a caveat/disclaimer on interference for this product, and that would probably be the get-out for this product not working. So I agree with James that it doesn't have to work.
What constitutes a baby? I have a 14 month old boy; is he still a baby? What's classed as interference? If it analyzes frequency, a lot of skewing can occur, with other things producing the same frequency. The 20 second maximum response time is interesting.
How is crying power analyzed? I'm also thinking sound muffling... what about echoes?
The claim of reliability and clinically tested has no detailed information. Also nursery schools vs newborn would need explanation.
No..I'm just not comfortable with this....
Reply to this
Are you still uncomfortable if this is just for entertainment? What if it provides stress relief to overtired and stressed parents when their baby cries and cries and they feel helpless? What if they just use it to help give them an idea where to start in addition to their intuition?
I think your 14 month old is a baby! If older babies can express why they are crying that would be another sort of a test.
What makes you most uncomfortable? I was pretty surprised that there isn't just a product like this, but an app as cheap as $4.99 using the iPhone. It bothered me, but at the same time I thought it was interesting. Personally, I wouldn't use one of these for my own child.
Reply to this
Ok, I think the thing I was uncomfortable with is using this device solely as a means of identifying why a baby is crying. It actually doesn't say that is what it should be used for. However, first time parents could be vulnerable to thinking this.
Perhaps I need to do a bit more research.
I take your point about the stressed out parent thing, but what if it gave them the wrong indication? Would there be a potential for them to be even more stressed than they already were? What if it contradicted their intuition? Would they be confused?
What would be the consequences of a misinterpretation, i.e. the device says the baby is only tired when the baby could be ill?
Also, my 14 month old son's cries have changed, i.e. a 3 month cry is different from a 14 month cry.
He can also make no noise when crying.
And is there a difference between a cry and a whimper? Or a cry and any other noise?
What if a baby cries out in their sleep where nothing is wrong?
I agree that there are more than 5 categories of crying...where does wet nappy fit in? Pain..I'm thinking about teething here. Where does illness fit in?
Entertainment is a broad area, so it depends...but I can see how someone might want to prove it wrong! i.e. I 'beat' the computer again with a wry smile...
Why wouldn't you use the device? For the 4 points mentioned earlier?
Reply to this
Wow, you make a great point about interference. A few months ago, we were downstairs, and the baby was upstairs, asleep in her crib. The receiver is on downstairs and I hear a baby crying. So after a minute I gather myself and drag myself upstairs, but when I get up there, not only is she not crying, she is sleeping quietly.
I think, hmm, that's odd, maybe she just went back to sleep. So back downstairs a few moments later, and I hear it again. After the third time I grab my wife and have her stand by the stairs, saying "Do you hear her crying?" She didn't hear her either, but going back to the monitor we did.
So I go back upstairs, to find, yes, you guessed it, she was sleeping. Now I'm really wondering what is going on, so I grab the monitor and find it's not even plugged in! (Whoops!) So that meant my receiver was picking up some other baby crying for like ten minutes with no answer. I felt truly sorry for that child. I should add that I switched channels and made sure it was plugged in from then on out.
Reply to this
My sister kept picking up another baby on her baby monitor, which happened to be down the road, and my sister's baby kept getting picked up on her friend's monitor down the road. They picked each other's babies up!
Reply to this
Even analog recordings chop off the upper and lower ranges of sound, so recordings are not a true record of the sound generated in life. Digital recordings turn smooth sine waves into a sawtooth pattern unknown in nature.
I bring this up for two points. First, I agree with Jon that a simple disclaimer will remove 90% of the testing requirements from this product. Second, the only testing we can ethically conduct without said disclaimer is field trials using journals similar to the old Nielsen rating system. "Tester" parents would need to enter information about the times and dates when their babies cried, what the machine registered as the reason, and what eventually led their baby to stop crying. A large enough study group conducted over at least a month should safely account for human error.
Reply to this
I think it would be more important to test a range of baby cries rather than one baby crying looped.
What about the reasons a parent can't fix? Like fever or illness? What about teething? It is possible to soothe, but not solve the exact problem. As a tester, I'd want to know what the default category is if the machine can't figure out which reason the cry fits into. I was expecting to see an "other" or "error" option, but it doesn't seem to have that. Does each cry result in just ONE option, or multiple? What if a cry is a combination of reasons or switches from one reason to another?
Reply to this
That's why I think the only "real" way to test this device is in field trials. We could do white box testing using recorded baby cries in order to verify expected responses, but it would only be testing to the design. In order to verify the design, we'd need to record reactions and responses in real world scenarios with random factors such as background noise or unanticipated emotions like illness, insomnia, loneliness, or others.
Reply to this
Just thinking about it, I disagree with James: playing the same recording 100 times will just mean it can work for the particular baby recorded. Does it really mean it will work for all the other babies and factors mentioned?
What about the customers? I see from the website that it may be targeted at midwives... So does that mean it has to work for them?
Reply to this
I agree with many of the other comments. Is this a gag gift, or is it literally meant to diagnose? I think back to Episode 3 of Star Wars, where they had medical bots diagnosing that Senator Amidala is pregnant, etc. Which got me thinking: we think of robots helping to care for family in the future (think Jetsons, for example), so how exactly do you get to a point where you could have a robot nurse, or automated nanny?
So Lanette's question still stands: is it ethical as a device? Could it be one day? I'd say yes, but it's a device that IMO needs rigorous testing. I like the suggestion of the placebo test, just like drug manufacturers have to go through. I also don't see how a device like this can work without some ability to learn and adapt on its own. Babies change and grow so fast, and while we adjust pretty quickly, how does a machine?
What's more, if you look at the particular device linked to, it just lists mood. It doesn't necessarily go to a specific diagnosis from what I can see. The baby is bored, or annoyed, sleepy or hungry, or stressed. How do you develop test cases to test each of those perceived states? How would you even set up an infant and be able to say, oh yeah, she's bored or annoyed? But I don't see this saying, oh she's wet, or oh, she's dirtied her diaper, or oh she needs to burp, or maybe she just needs to be held.
Those five diagnoses almost seem too general to be of any real good to me. Plus there is the possible placebo effect: treating one possible symptom may at least temporarily take the baby's mind off what the real problem is. Is she hot, running a fever, hungry, or wet? The terms being listed as types are really quite vague if you ask me.
Reply to this
I'd like to see a feature request for new reasons:
State of existential crisis
Drama-Baby needs attention
Whining-Baby wants candy
Fear-Nightmare, Spiders, Clowns, Mommy's Gone.
I see none of these reasons listed. All are valid reasons a baby might cry.
Reply to this
Oh, that's a good point. I don't know if anyone else has experienced this, but I have. There are times when I'll hear our baby daughter crying (she's 6 months old right now) and I'll pick her up to try to soothe her by holding her up to my shoulder level.
Usually I can figure out what she needs from that position. Does the diaper feel enlarged or wet, is her tummy growling, etc. Sometimes though, no amount of diaper changing, or feeding will take care of it. Apparently sometimes she just wants to be held by her mom. Not her dad, not her Grams, nope mom. It always amazes me at how fast children, especially young ones can pick out the differences between people or learn things. One of those times my wife was stretched out on the couch, and laid the baby almost flat with her head on one of her breasts. Sure enough the baby calmed down right away almost like she wanted it for a pillow or something. I remember joking that hey maybe I should get a pair of those. But yeah there are all kinds of unique situations for kids.
Here's another one, my son Stephen until he was like 4 would not go to sleep unless you stayed in his bed room, or at least within eye/ear shot of the door until he was actually out. You'd get up to try and sneak out when you thought he was asleep, only to find him crying moments later cause you weren't there. Talk about tough times. That took some work to realize what the problem was. I tell ya.
Every child is different too. I never liked to sleep with lights on. I hated taking naps in the daytime, but would snooze at night pretty easily. My son is the opposite; we basically have to leave a light on in his room (typically a smaller 40 watt bulb is enough to do the trick). Rose may not be the same way; of course, her room is also positioned differently in relation to the rest of the house. Then of course there was the issue that when that was all going on we had just moved to a new community several hours away. It was a new, strange house, and my son at the time was actually potty trained, but the move set him back and we had to go back to pull-ups for a while.
I wonder if this device is smart enough for situations like I just described. My gut says probably not, and what's more, it probably doesn't really have that capacity to learn. As a Computer Engineer by degree, I gotta wonder how on earth they figure out that x decibel level, or x pitch or wavelength, means this or that.
Can you imagine having a Scrum about how to build test cases for this thing during its design phase?
Reply to this
Hey Lanette. As Spock might say "Fascinating."
First, as James implies, parents can get to know the cries of their children. That said, I'm not sure how well a parent, once so trained, can pick out the cries of other children. So to have any hope of this device working, I would expect the parent to have to record and "label" sample cries.
That does not seem to be in the cards. So I would look very carefully at the user manual for caveats, exceptions, rules, etc.
I'd seek information from any available source. Googling around found me one (1) five-star review on amazon. The target.com description said this was a tool to help/train new parents.
To ethically test it, I could hire a dozen parents who knew their babies' cries, let the child cry for 20 seconds before being picked up, ask the parents why the child was crying, and compare that to what the software said. Do this a dozen times and you've got 144 test runs. I don't see an ethical dilemma. (Come to think of it, I'd want to hire a child development expert as an Oracle, and ask about the differences between children -- say race. It's possible, but unlikely, that the software works for one race but fails for children of another race.)
After the black-box testing, I'd want to get white-boxy - why does the creator think this would work? What variables would we play with to 'throw off' the scent?
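The comparison step could be tallied with something like the Python sketch below. It assumes each test run is written down as a hypothetical `(parent_label, device_label)` pair; the sample pairs are made up purely for illustration and are not real observations.

```python
from collections import Counter


def agreement(pairs):
    """pairs: list of (parent_label, device_label) tuples, one per test run.

    Returns the fraction of runs where the device matched the parent,
    plus a count of every (parent, device) combination seen.
    """
    matches = sum(1 for parent, device in pairs if parent == device)
    confusion = Counter(pairs)
    return matches / len(pairs), confusion


# Made-up example runs, using the mood labels mentioned in this thread.
runs = [
    ("hungry", "hungry"),
    ("sleepy", "hungry"),
    ("sleepy", "sleepy"),
    ("bored", "stressed"),
]

rate, confusion = agreement(runs)
print(f"agreement with parents: {rate:.0%}")
print("parent said sleepy, device said hungry:", confusion[("sleepy", "hungry")])
```

The per-combination counts matter as much as the overall rate, since they show which moods the device mixes up. Note that this measures agreement with the parents, who are themselves a fallible oracle.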
Then I would start talking to our company lawyers. What claims are we making for fitness of purpose? Do we have a guarantee of any type?
All in all, I'd say this product might actually be helpful to babysitters and in-laws, and maybe, conceptually, a little helpful to new parents.
We could also do a field study, as Curtis suggested, to find out how often those cries are some sort of other or dangerous cry. If we warrant 95% accuracy, and those cries are at 92.5% of the time, and we are careful with the wording on the package, we might be ok.
Another way to test would be to record 15 to 100 'ideal' cries of each category and run them against the device, anywhere from 1 to 6 feet away at various pitches, and see if the software is correct.
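To get a feel for how big that test set becomes, here is a rough Python sketch that enumerates the combinations. The category names are the moods mentioned elsewhere in this thread; the pitch shifts and the one-foot distance steps are assumptions picked only for illustration.

```python
from itertools import product

CATEGORIES = ["bored", "annoyed", "sleepy", "hungry", "stressed"]
DISTANCES_FT = [1, 2, 3, 4, 5, 6]
PITCH_SHIFTS = [-2, 0, 2]  # semitones; an assumed set of pitch variations


def build_matrix(recordings_per_category=15):
    """List every (category, recording, distance, pitch) combination to run."""
    return [
        {"category": c, "recording": i, "distance_ft": d, "pitch_shift": p}
        for c, i, d, p in product(
            CATEGORIES, range(recordings_per_category), DISTANCES_FT, PITCH_SHIFTS
        )
    ]


matrix = build_matrix()
print(len(matrix))  # 5 categories * 15 recordings * 6 distances * 3 pitches = 1350
```

Even at the low end of 15 recordings per category this is over a thousand runs, so sampling the matrix rather than running all of it may be the practical choice.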
Another way to test would be to find some of the experts from the 'secrets of the baby whisperer' forums (google it) and actively engage them in product development, getting their approval and support - especially on positioning and wording. (That is, engage subject matter experts as testers.)
Finally, we can find other products like this on the market (I found one via google, but it's the same company) and use that as a comparison, with senior moms as the oracle. If our software is 'correct' more often than our competitors already in the marketplace, I wouldn't worry too much.
Reply to this
Correction: If those exceptions are < 2.5% of the time and our promise is 95% correctness, we might be fine.
Reply to this
Yesterday was my first time to comment, and reading the other responses to this post has made me realize that I'm not alone in my extreme OCD geekiness re: testing. You're all my professional soul mates
Reply to this
Hi Lanette,
Just found your blog for the first time today. And what a good mental exercise for a Monday, to get me into test mode for the week...
Your post gave me a lightbulb moment to share: sometimes just the name for the product gives testable parameters.
* "Baby". Will it respond to crying pre-adolescents, adults, high-pitched females, coyotes, nails on blackboard, sirens, helium ingestion? The further outside the box I go usually leads to some fertile issues.
* "Cry". Laughter? Squeals? [Ooo, test pig noises. Are there animals that cry?] Crying with hiccups? Coughing and crying?
There's easily a half-dozen tests from the name that don't even come close to ethical concerns.
Thanks for the challenge. Now to get caught up on some of your earlier posts...
Reply to this
What I find interesting is how come there is no translation in Chinese/Japanese/Korean.
Was there a business case to not have instructions in these said countries? (seeing that it'll miss out on the billions of people there) Perhaps Chinese/Japanese/Korean babies have a different pitch that is not translatable.
Maybe the product is just not ready for Unicode!
Reply to this
As mentioned, there are questions surrounding just the existence of this product in the first place. Before entry points are even considered, wouldn't it be part of the testing process to determine the use cases and their feasibility for testing? Then, certainly the risks and liability ...
But I agree. If there's any disclaimer such as "this product is for entertainment purposes only", that puts it into a whole new category with no risks or liability associated. The same category used for horoscopes, ouija boards, and fart machines (there's probably an app for that!). No guarantee of output probably means little or no testing required, so it's not a test object in that sense. Just in a very limited, functional sense, ranging from it "sort of works" well enough to be entertaining, to "who cares" ...
Reply to this
This device isn't for entertainment purposes after all. The website claims this is a serious tool. They go so far as to say that it has been clinically tested and "passed" (whatever that means).
It seems the software makes its evaluation based on crying "power", frequencies, correlation with established patterns, and interval.
As a tester I would focus on volume, frequencies, patterns, and intervals, but maybe not with real crying recordings. I would try a function generator or an audio file editor first, which would allow me to manipulate and control each of the variables.
While as a tester I may not be able to prove that a baby is crying because they are annoyed (I will leave that to a clinician or psychologist), I can determine how it functions based on altering the variables. Based on these findings I could go in all kinds of fun directions.
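A software stand-in for the function generator could look like the Python sketch below, which uses only the standard library. It writes a WAV file of sine bursts where frequency, volume, burst length, and gap between bursts are each controlled separately. The specific values (450 Hz, 0.8 amplitude, and so on) are arbitrary placeholders, not anything known about how the device works.

```python
import math
import os
import struct
import tempfile
import wave


def tone_bursts(freq_hz, amplitude, burst_s, gap_s, repeats, rate=44100):
    """Return 16-bit samples: `repeats` sine bursts, each followed by silence.

    amplitude is a fraction of full scale, between 0.0 and 1.0.
    """
    burst_n = int(burst_s * rate)
    gap_n = int(gap_s * rate)
    samples = []
    for _ in range(repeats):
        for n in range(burst_n):
            value = amplitude * math.sin(2 * math.pi * freq_hz * n / rate)
            samples.append(int(value * 32767))
        samples.extend([0] * gap_n)
    return samples


def write_wav(path, samples, rate=44100):
    """Write mono 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


# Arbitrary placeholder settings: three 1-second bursts with 0.5-second gaps.
signal = tone_bursts(freq_hz=450, amplitude=0.8, burst_s=1.0, gap_s=0.5, repeats=3)
path = os.path.join(tempfile.gettempdir(), "synthetic_cry.wav")
write_wav(path, signal)
print(len(signal), "samples written to", path)
```

Changing one argument at a time and playing the file near the device would show which variable actually flips its answer.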
I'm half tempted to buy one of these and start testing it.
Check it out: http://www.whycry.com/
Reply to this
I wouldn't be concerned with ethics at all. A responsible parent will not rely solely on a product to interpret a baby's cry. However, this may be a small step in a technology that can help interpret communication for those who can't talk. Technologies continue to evolve and improve. How about trying to interpret a dog's bark? How about interpreting meaning from tone of voice? Communication is much more than just words.
I have a friend who is suffering from Lou Gehrig's disease (ALS). This is a terrible disease where you lose all muscle control. He is now at the point where he can barely communicate... I think I heard it takes 42 muscles to speak. Maybe some day there will be a device that will help people to communicate when they can no longer speak, but they can still make sounds.
Though I don't think it's appropriate for parents to depend on the baby cry interpreter, it seems like their feedback could be helpful in perfecting the technology. I'm tempted to get one for my daughter, who has a new baby, just to see how accurate it is...
Reply to this
If the app is just for fun, then we should not worry much about the analysis and instead concentrate on the stability of the app. If the app is indeed serious and claims to be so, then if we have enough budget, get a voice converter to convert adult sound into a child's one. This way, we can easily simulate the different needs of a child, viz, Hunger, pain etc and see if the app analyses it properly. We can repeat it with different voices as well.
Personally I feel that human emotions can't be judged by any device, and it really becomes tough on the part of a tester to test such an app, especially one which deals with a child. The best thing would be to list out the scenarios which have been tested and certify the app against those rather than including all the possibilities.
Reply to this