I Have An Above-Average Number Of Feet
In The New York Times, Barry Gewen reviews a book on stats abuse, The Numbers Game: The Commonsense Guide to Understanding Numbers in the News, in Politics, and in Life, by Michael Blastland and Andrew Dilnot:
It's hard to resist a book that tells you that most people have more than the average number of feet. Or that researchers have found that Republicans enjoy sex more than Democrats do. Michael Blastland and Andrew Dilnot delight in bringing such facts to our attention -- and then in explaining them away. Because of amputations, birth defects and the like, the average number of feet per person across the human population is slightly fewer than two.
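The feet claim is just the gap between a mean and the most common value. Here is a minimal Python sketch of it. The population of 1,000 and the counts in it are made up purely for illustration; they are not figures from the book or the review.

```python
from statistics import mean, mode

# Hypothetical population of 1,000 people (illustrative numbers only):
# 998 have two feet, 1 has one foot, 1 has none.
feet = [2] * 998 + [1] + [0]

average_feet = mean(feet)   # pulled just below 2 by the two exceptions
typical_feet = mode(feet)   # the most common value is still 2

# Share of people whose foot count is above the average.
share_above_average = sum(1 for f in feet if f > average_feet) / len(feet)

print(average_feet, typical_feet, share_above_average)
```

Anyone with two feet sits above the mean, which is nearly everybody.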
...Most of us, Mr. Blastland and Mr. Dilnot observe, expect numbers to do too much. We like their precision and want to believe that statistics can tell us all we need to know about the world. But precision comes at a price: before you can count something, you have to define what it is you're counting, and often that's not as simple as it sounds.
Unemployment statistics, for example, conceal a host of decisions. How much can someone work and still be considered unemployed? How hard does a person have to be looking for a job? The Thatcher government changed the definition of "unemployed" either 23 or 27 times. ("There is some disagreement" about the precise number, the authors blandly write.)
Sampling is another headache. Most of the numbers we need involve populations too big to count one by one. As a result, we sample. But no matter how sophisticated the statistical techniques, they are still prone to error. Any American has only to think back to the polls during last year's primary season.
And that's with communicative human beings. Mr. Blastland and Mr. Dilnot describe Britain's efforts to count its hedgehog population (the National Hedgehog Survey) because of worries that this shy animal was in decline. They also discuss the Herculean -- but gravely important -- task of counting the fish in the sea.
"Uncertainty is a fact of life," they say, even if it's a given of human nature to look for meaning where there isn't any (see under: religion). They devote an entire chapter to chance to explain why the public sometimes sees a pattern where there is no such thing.
In 2003 the villagers of Wishaw, England, convinced that a recent rise in the incidence of cancer in their area was caused by a nearby cellphone tower, proceeded to lynch the tower, or, more accurately, pull it down in the dead of night.
What the villagers didn't know, the authors say, is that "cancer clusters" occur naturally, just as a coin tossed 30 times will probably produce at least one sequence of four heads or four tails. Tattoo this on your arm: a pattern doesn't always mean a plan. Throw some rice in the air and you will most likely see patterns in the way it lands.
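The coin claim can be checked by counting rather than by tossing. The sketch below assumes a fair coin and independent tosses; the function name `prob_run_at_least` is mine, not the authors'. It counts the sequences that contain no long run and subtracts their share from 1.

```python
def prob_run_at_least(n_tosses, run_len):
    """Chance that n_tosses fair coin flips contain a run of at least
    run_len identical results (all heads or all tails)."""
    # ways[i] = number of ways to cut i tosses into consecutive runs,
    # each run being 1 .. run_len - 1 tosses long.
    ways = [0] * (n_tosses + 1)
    ways[0] = 1
    for i in range(1, n_tosses + 1):
        ways[i] = sum(ways[i - j] for j in range(1, min(i, run_len - 1) + 1))
    # Runs alternate sides, so picking the first side (heads or tails)
    # fixes the whole sequence: two sequences per way of cutting.
    sequences_without_long_run = 2 * ways[n_tosses]
    return 1 - sequences_without_long_run / 2 ** n_tosses

p_run_of_four = prob_run_at_least(30, 4)
print(p_run_of_four)
```

For 30 tosses the result comes out at roughly nine chances in ten, so "probably" is if anything an understatement.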
Statisticians even have a name for the phenomenon: it's called the Texas Sharpshooter Fallacy. "The alleged sharpshooter," the authors write, "takes numerous shots at a barn (actually, he's a terrible shot -- that's why it's a fallacy), then draws his bull's-eye afterward, around the holes that cluster."
Actually, I read that first in a story about WWII, which, if I recall correctly, didn't take place in Texas. But, their book sounds pretty good and pretty necessary.

You can rapidly get lost, even briefly reading about statistics, but here are some principles you can use to rid yourself of the idea that they are useless, or only used by liars.
Every statistic is a measure of probability, which is (probably!) not what you think it is. Each assessment of probability concerns an "event universe". For instance, the possibilities in a coin flip are heads, tails and edge and no more. With Lotto numbers, the range of numbers is clearly defined. In population studies, you must really look hard at the demographic to determine if the presentation is well-defined. It wouldn't do to talk about sickle-cell anemia in natives of Namibia without defining what "native" meant.
Every proper presentation of a statistic will also show the method by which data is collected.
Every proper analyst will point out the limitations of the collected statistic.
"Unpredictable" does NOT mean "random". Back to Lotto, you can see that the number of solutions the Lotto machine can offer is limited to six different numbers within a range, one of which may be a duplicate. Understanding permutation and combination is basic.
Statistics is a science, and a useful one. The hard disk on your computer might be built using Partial Response Maximum Likelihood (PRML), and your processor is almost certain to have Branch Prediction built into it. But it's not new. Pharaoh's surveyors used it to line out squares with great precision for their day, as they recognized that thousands of blocks would look pretty sorry if they weren't all straight. In that case, statistics is applied to understand the limits of measurement - which is literally never "exact".
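To put a number on the Lotto point about permutation and combination, here is a small Python sketch. It assumes a draw of 6 distinct numbers from 1 to 49, which is one common format and only an assumption here; no particular game is specified above, and any bonus ball is ignored.

```python
import math

# Assumed game format (not stated above): 6 distinct numbers out of 49.
numbers_in_range = 49
numbers_drawn = 6

# Permutations: the order the balls come out matters.
ordered_draws = math.perm(numbers_in_range, numbers_drawn)

# Combinations: only the set of numbers on the ticket matters.
distinct_tickets = math.comb(numbers_in_range, numbers_drawn)

# Each ticket corresponds to 6! = 720 different draw orders.
orders_per_ticket = math.factorial(numbers_drawn)

print(ordered_draws, distinct_tickets, orders_per_ticket)
```

The draw is unpredictable, but the event universe is fully known: under this assumed format there are 13,983,816 possible tickets and no others.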
Radwaste at February 7, 2009 9:50 AM
A readable, interesting, and cheap book about how not to be fooled
How to Lie With Statistics
There is more info at the link.
---
Example. A study might report that smoking increases the risk of a particular cancer by 20%. Scary. But, you learn that the incidence goes from 2 up to 2.4 per 1000 population. This small difference is not so scary. If 5000 people were studied, 2/1000 is 10 cases, and 2.4/1000 is 12 cases. So, 2 extra cases of the studied cancer produced the headline. Is this significant or a random variation? A good question.
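The arithmetic above, plus a rough answer to the "random variation?" question, as a Python sketch. The Poisson model with a known baseline of 10 expected cases is my simplifying assumption, not part of the example; a real study would compare two groups and use a proper test.

```python
import math

def poisson_tail(k, lam):
    """P(X >= k) when X is Poisson with mean lam."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - below

people = 5000
baseline_per_1000 = 2.0
relative_increase = 0.20  # the "20%" in the headline

exposed_per_1000 = baseline_per_1000 * (1 + relative_increase)
baseline_cases = people * baseline_per_1000 / 1000
exposed_cases = people * exposed_per_1000 / 1000
extra_cases = exposed_cases - baseline_cases

# Assumption: if nothing is going on, case counts follow a Poisson
# distribution with mean equal to the baseline count.
chance_by_luck = poisson_tail(round(exposed_cases), baseline_cases)

print(exposed_per_1000, baseline_cases, exposed_cases, extra_cases, chance_by_luck)
```

Under that assumption, a group of 5000 with no extra risk would show 12 or more cases about three times in ten, so 2 extra cases by themselves say very little.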
---
Andrew_M_Garland at February 7, 2009 12:01 PM
The Black Swan by Nassim Nicholas Taleb describes this problem better than any book I've encountered. He explains why statistics is essentially useless for the things it's commonly used for, and why CEOs could be replaced by a coin toss.
smith at February 8, 2009 11:33 AM