Let me tell you a story about a well-meaning researcher—let's call him Dr. Art—who was analyzing data from a clinical trial. The study looked at the number of IVF treatments needed to reach a live birth in women using two kinds of ovarian-stimulating drugs, and Dr. Art, ever the optimist, decided to use his trusty t-test to see if there was a significant difference between the two treatment groups. He ran the numbers and proudly announced his findings: "The t-test shows no significant difference, so the treatments are the same!" Case closed, right?
Dr. Art forgot to check one thing: whether his data was normally distributed. You see, a t-test is what we call a parametric test, meaning it assumes the data is bell-shaped, like a nice, smooth hill. But the distribution of the number of IVF cycles needed to reach a live birth looks more like a beautiful whale, with a big hump and a long, thin tail. That's where things start to get tricky.
![](https://static.wixstatic.com/media/9ae0b1_67ad9db4bd3e4261ad7c15b40d9d099b~mv2.png/v1/fill/w_980,h_420,al_c,q_90,usm_0.66_1.00_0.01,enc_auto/9ae0b1_67ad9db4bd3e4261ad7c15b40d9d099b~mv2.png)
For a t-test, (b) is what you must have!
Many patients might reach a live birth after 1 or 2 treatments, while a progressively smaller minority takes much longer than average, creating a right-skewed distribution. But Dr. Art plowed ahead with his t-test anyway, blissfully unaware that his data violated one of its core assumptions.
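If you want to see the whale for yourself, here's a minimal sketch of what checking that assumption could look like. Everything here is simulated for illustration (none of it is the trial's actual data): we generate geometric-style "cycles to live birth" counts, which have exactly that big-hump-long-tail shape, and run a Shapiro-Wilk normality check with SciPy.

```python
# A toy simulation of skewed "IVF cycles to live birth" counts (hypothetical,
# not from any real trial), plus a quick normality check.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Most patients succeed within 1-2 cycles; a shrinking minority takes longer.
# A geometric distribution produces that "big hump, long thin tail" shape.
cycles = rng.geometric(p=0.4, size=200)

print(f"skewness: {stats.skew(cycles):.2f}")  # positive => long right tail

# Shapiro-Wilk: a tiny p-value says "this does not look normal".
stat, p = stats.shapiro(cycles)
print(f"Shapiro-Wilk W = {stat:.3f}, p = {p:.2e}")
```

A glance at a histogram of `cycles` (or that tiny Shapiro-Wilk p-value) would have been enough to warn Dr. Art off the t-test.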
The result? Misleading conclusions. If your data doesn't meet the normality assumption, using a t-test is like trying to fit a square peg into a round hole. The test will still give you an answer, but you can't trust it.
So, what should Dr. Art have done? When your data isn't normally distributed, it's time to bring in the non-parametric tests, like the Mann-Whitney U test. Instead of comparing means, these tests work on the ranks of the data, so they don't care if your data looks like a lopsided hill—they'll still give you reliable results.
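Here's what that could look like in code, using SciPy's implementations, again with made-up simulated groups rather than the trial's data:

```python
# A minimal sketch comparing two hypothetical treatment groups with the
# Mann-Whitney U test (rank-based, no normality assumption).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical skewed "cycles to live birth" counts for each drug.
drug_a = rng.geometric(p=0.45, size=100)  # tends to succeed a bit sooner
drug_b = rng.geometric(p=0.30, size=100)  # tends to take a few more cycles

# The rank-based test Dr. Art should have reached for:
u_stat, p = stats.mannwhitneyu(drug_a, drug_b, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.0f}, p = {p:.4f}")

# For contrast, the t-test he actually ran, which assumes normality:
t_stat, p_t = stats.ttest_ind(drug_a, drug_b)
print(f"t-test t = {t_stat:.2f}, p = {p_t:.4f}")
```

Because the Mann-Whitney U test only uses the order of the observations, the whale's long tail can't drag the result around the way it drags a mean.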