There’s very little checking of the accuracy of pundits’ predictions, which is a major reason I avoid even hearing what they have to say.
One exception to this rule is Nate Silver and his crew at FiveThirtyEight.com. They’ve just published an analysis of as many of their predictions as possible. It’s worth reading, just for the demonstration of doing an analysis in the first place.
I suspect most readers/viewers of the news find probability hard to understand, and pundits generally don’t deal in it. People generally don’t want to hear probabilities; they’d prefer certainty. I have not done one, but I suspect a survey would show that people prefer a certain answer to an accurate-but-uncertain one.
More importantly, this kind of analysis tends to demonstrate the nature of prediction and the value of having evaluations. This makes me tend away from TV and radio as news sources and toward written sources in which I can check the sources. And, of course, toward written articles that actually cite sources.
Whether it’s about elections or hurricane predictions, neither the media nor the public understands probability. I suspect this is because we are evolutionarily programmed to look for certainty. Certainty leads to decisive action. It is sometimes said in military circles that a bad decision is often better than no decision. But it’s easy to be decisively wrong.
For example, if you looked at the actual data about Hurricane Irma, and looked at the predicted range of possibilities (you know, either the cone or those circles around the predicted center), the prediction process went quite well. As reported in the media and as “understood” by many in the public, not so much.
Thus I read with great pleasure Nate Silver’s article today at FiveThirtyEight.com (one of my favorite sites), “The Media Has A Probability Problem.” There were those who criticized Silver for his data analysis in the 2016 election, when he was giving a greater probability of a Trump victory than anyone else. Not predicting a Trump victory, but giving it a higher probability. There were those who were rating Clinton’s chances in the high 90s. Following the election there are those who see Silver as wrong, along with the rest. But what he gave was a probability. A 30% chance is hardly a prediction that something won’t happen. If you understand probability, that is.
Most don’t. Or they understand it in their heads, but don’t feel it. Here’s a summary from Nate Silver:
Probably the most important problem with 2016 coverage was confirmation bias — coupled with what you might call good old-fashioned liberal media bias. Journalists just didn’t believe that someone like Trump could become president, running a populist and at times also nationalist, racist and misogynistic campaign in a country that had twice elected Obama and whose demographics supposedly favored Democrats. So they cherry-picked their way through the data to support their belief, ignoring evidence — such as Clinton’s poor standing in the Midwest — that didn’t fit the narrative.
Now don’t take this as supporting President Trump’s cherry-picking of polls and numbers. That’s just another, less nuanced form of confirmation bias, or more likely simple carelessness with and disregard for facts.
Further, if we are going to blame the media for problems, we need to watch where we go instead. Many blame the media for very real problems of bias, stupidity, and deception, only to turn to even less reliable sources which they believe implicitly. One advantage I’ve found with reasonably good media reports is this: If you read beyond the headline, and check the references, you can almost always find what you need to double check and correct the news story. For example, most news organizations provide links to the actual poll data and analysis.
So if you want good information, follow the chain back to the source. Don’t just find something more agreeable and believe that. There are perfectly good ways to analyze data and avoid errors. None of us is perfect, but we can and should be better. Much better.
I know headline writers need their splashy headlines, but as the media is filled with word of a stunning upset, we should remember the number of times that one candidate or the other was “destroyed” or “finished,” or the election was declared “over.”
A poll takes a snapshot of part of the electorate which is extrapolated to the whole. It will have a margin of error, generally something like 2.5% to 3.5%. (Why that is can be a project for research.) That means that a candidate who is at 50% in that poll might actually be as low as 46.5% and as high as 53.5% if the error margin is 3.5%. If we’re thinking of two candidates, the other, let’s say showing at 45%, could be anywhere from 41.5% to 48.5%. The two ranges overlap by 2 percentage points, from 46.5% to 48.5%. Now that’s using a lead for candidate A of 5 points. The average for Clinton was around +3.5% the day before the election. Now not all polls were at that value, and also not all polls have a 3.5% error margin.
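The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration using the made-up numbers above, not how any pollster reports results; the helper names `interval` and `overlap` are my own and hypothetical.

```python
def interval(share, margin):
    """Return the (low, high) range for a poll share, both in percentage points."""
    return (share - margin, share + margin)


def overlap(a, b):
    """Return the width of the overlap between two (low, high) ranges, or 0.0 if none."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return max(0.0, high - low)


# Hypothetical poll from the example: 50% vs. 45%, margin of error 3.5 points.
candidate_a = interval(50.0, 3.5)
candidate_b = interval(45.0, 3.5)
shared = overlap(candidate_a, candidate_b)

print(candidate_a, candidate_b, shared)
```

With a 5-point lead and a 3.5-point margin the two ranges still share 2 points. With the roughly 3.5-point lead mentioned for Clinton and the same margin, the shared range would be wider still.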
Now there’s an additional percentage involved, which is a probability, variously 90-95%, that the poll itself is within that margin. So in the poll above, if the figure is 95%, for example, 95 out of 100 times the election being polled would reflect a result within that margin of error. Otherwise it might be anywhere. I’m not making any effort here to keep these details realistic. If you read up on the topic you can learn how these numbers relate, but more importantly, if you read the data on a poll, you can find what these numbers are for that particular poll.
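For readers who want a starting point on how these numbers relate, here is a rough sketch. It assumes the textbook case of a simple random sample, which real polls only approximate, and it uses the standard formula in which the margin shrinks with the square root of the sample size. The function names and the sample size of 1,000 are my own choices for illustration, not figures from any particular poll.

```python
import math
import random


def margin_of_error(sample_size, share=0.5, z=1.96):
    """Textbook margin of error (as a fraction) for a simple random sample.

    z=1.96 corresponds to a 95% confidence level; share=0.5 is the worst case.
    """
    return z * math.sqrt(share * (1 - share) / sample_size)


def coverage(true_share, sample_size, trials, seed=0):
    """Simulate many polls; return the fraction landing within the margin of the true share."""
    rng = random.Random(seed)
    margin = margin_of_error(sample_size)
    hits = 0
    for _ in range(trials):
        # Each simulated respondent backs the candidate with probability true_share.
        votes = sum(1 for _ in range(sample_size) if rng.random() < true_share)
        if abs(votes / sample_size - true_share) <= margin:
            hits += 1
    return hits / trials


moe = margin_of_error(1000)        # roughly 0.031, i.e. about 3.1 points
rate = coverage(0.5, 1000, 2000)   # should come out close to 0.95

print(round(moe * 100, 1), rate)
```

The simulation is the "95 out of 100" idea made concrete: run the same poll many times against a known truth, and about 95% of the results fall inside the margin while the rest fall outside it.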
There is no set way to combine polls into an aggregate, and there is no established error margin for polls that are combined. That’s because not all polls are created equal. On the eve of the election, Nate Silver and crew were giving something near a one in four chance that Trump would win the election, i.e., according to their analysis, if you had a good enough sample, one in four elections run where the data looked like this would go to Trump. Three in four would go to Clinton. The election doesn’t make them wrong. That’s a probability.
Let’s look at that. If I flip a coin, the probability is one in two that it will come up heads and one in two that it will come up tails. So I flip the coin, and it comes up heads. Was my projection wrong? Not at all. Similarly, the FiveThirtyEight people aren’t wrong either. If they had said, “Clinton will be the next president of the United States,” then they would have been wrong. What they said was that there was around a 78% chance it would be Clinton and a 22% chance it would be Trump.
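To get a feel for what a 22% chance means, one can treat the forecast as a weighted coin and flip it many times. The sketch below does only that. It is not FiveThirtyEight’s model, and the function name `simulate_outcomes` and the choice of 10,000 trials are my own assumptions.

```python
import random


def simulate_outcomes(probability, trials, seed=0):
    """Count how often an event with the given probability occurs in independent trials."""
    rng = random.Random(seed)
    return sum(1 for _ in range(trials) if rng.random() < probability)


trials = 10_000
upsets = simulate_outcomes(0.22, trials)  # the 22% side of the forecast

print(f"{upsets} of {trials} simulated elections went to the underdog")
```

Roughly two in nine of the simulated elections go to the underdog. A single real election is one draw from that pile, so one upset by itself cannot show that the 22% figure was wrong.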
They were critiqued by Sam Wang of the Princeton Election Consortium, and several people wrote on FiveThirtyEight to defend their methodology. Dr. Wang gave a 99% chance that it would be Clinton. Both Nate Silver (and an unknown number of members of his crew) and Dr. Wang are much more skilled at this than I am (in the same sense that an MLB player is better at baseball than I am, considering I have never picked up a baseball bat for a game). I spent some time with their data and couldn’t really find a way to understand fully how they got their probabilities and why they differed so much. (On your mental flowchart, create a box and label it “Lots of Statistical Figuring.”) But my intuitive feeling was that Nate Silver was getting the better of the argument. It seemed to me that there was insufficient data on which to base that high a level of confidence in the aggregation of these polls. We haven’t been polling presidential elections for that long.
Why don’t we have neat numbers for the aggregate values? Let me note, first, that any time one uses a phrase like “the polls show,” one is doing some sort of aggregation, however loose. Folks like Sam Wang and Nate Silver do it in a very scientific way. (I’ll let them argue over which is more scientific, and they do so with some vigor.) We all do it when we look at polls and make a generalization. The reason there isn’t a neat x% margin of error and y% probability that the poll will, in fact, fall within that margin is that polls use different methodologies. If you average the number of apples and oranges you have, you don’t get a better value each for apples and oranges. It might be better to say that if I average my Gala apples and my Golden Delicious apples, I don’t get an accurate picture of what type of apples I have available. One set of those apples Jody wants to bake into a crisp, and the other I’m going to slice up and eat. I’m afraid I’ll have to look at the actual apples.
Again, my inexpert intuition is that aggregation needs more experience and testing to get more accurate, even as I think that Nate Silver’s work is the more promising. In the meantime, what is quite certain is that nobody in the media has a clue about any of this. Alternatively, they don’t care, and just want to write headlines to sell papers, whether they reflect actual data or not. I suppose that’s possible.
We like meaning and connections, and we’ll sometimes find them even when they’re not there. People who understand this can deceive you. The Improbability Principle from Neuroblogica is a very good summary of this.
I may be hopelessly naive in the matter of probability, though it is the one area of math that I have actually studied, but I am simply not terribly impressed with probability arguments. That’s probably (!) a major reason why I’m not impressed with intelligent design (ID). I’m particularly not impressed with probabilities calculated for processes that are not yet understood. If you don’t know all the factors, how can you calculate a probability?
On the other hand, it appears that many creationists are much more impressed with probabilities that are largely guessed, while they are not terribly impressed with extrapolation in historical studies. For them, it often doesn’t matter how much detail you get for the development of various structures in the past; it’s not enough, because it would only be testable if we could see every stage and explain everything.
Thus when an ID writer claims something is highly improbable, even though he hasn’t a clue how it actually happened, it impresses his fellow creationists, while when a scientist extrapolates development between existing specimens, the same creationists are totally unimpressed. Yet which of these is operating on the greater level of evidence?
If anyone is wondering why I see strong evidence for evolution, here’s the answer. I’m used to and respect historical methods. If you find a pottery type developing, and then you find several examples of stages, sequentially arranged by date, you can extrapolate a path from one style to the next. You don’t need an example of every pot. If you see writing develop from one style to the next, you don’t need every stage. You can extrapolate.
For me, the simple fact of large numbers of sequences in the development of complex structures suggests that such things have developed naturally. Extrapolating the intermediate steps is not terribly difficult for those who study these things, and it is a quite proper procedure. Challenging the observed sequence by indicating that it is improbable strikes me as absurd. The only proper challenge would be to say, “Here! This is where the intelligent designer intervened.” But of course, ID advocates do no such thing.
NCSE has produced a video, which I will embed below, that shows such a sequence on the development of the eye. It’s very clear, but it lacks some steps. I don’t know whether the video presented all the steps we know of, or just a sampling, but even if these were the sum total of examples we have in a sequence of eye development, we would have good cause to believe that the eye evolved.