
Vocabulary Word of the Day: Probability

I know headline writers need their splashy headlines, but as the media is filled with word of a stunning upset, we should remember the number of times that one candidate or the other was “destroyed” or “finished,” or the election was declared “over.”

A poll takes a snapshot of part of the electorate which is extrapolated to the whole. It will have a margin of error, generally something like 2.5% to 3.5%. (Why that is can be a project for research.) That means that a candidate who is at 50% in that poll might actually be as low as 46.5% and as high as 53.5% if the error margin is 3.5%. If we’re thinking of two candidates, the other, let’s say showing at 45%, could be anywhere from 41.5% to 48.5%, an overlap of 2%. Now that’s using a lead for candidate A of +5%. The average for Clinton was around +3.5% the day before the election. Not all polls were at that value, and not all polls have a 3.5% error margin.
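To make that arithmetic concrete, here is a minimal sketch using the figures from the paragraph above (candidate A at 50%, candidate B at 45%, a 3.5% margin of error):

```python
# Sketch: how a +/-3.5% margin of error turns point estimates
# into overlapping ranges (numbers from the text above).

def poll_range(share, margin):
    """Return the (low, high) range implied by a margin of error."""
    return share - margin, share + margin

margin = 3.5
a_low, a_high = poll_range(50.0, margin)   # candidate A at 50%
b_low, b_high = poll_range(45.0, margin)   # candidate B at 45%

# How much the two ranges overlap, in percentage points
overlap = max(0.0, min(a_high, b_high) - max(a_low, b_low))

print(f"A: {a_low}% to {a_high}%")      # 46.5% to 53.5%
print(f"B: {b_low}% to {b_high}%")      # 41.5% to 48.5%
print(f"Overlap: {overlap} points")     # 2.0 points
```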

Now there’s an additional percentage involved, which is a probability, variously 90-95%, that the poll itself is within that margin. So in the poll above, if the figure is 95%, for example, 95 out of 100 times the election being polled would reflect a result within that margin for error. Otherwise it might be anywhere. I’m not making any effort here to keep these details realistic — if you read up on the topic you can learn how these numbers relate, but more importantly, if you read the data on a poll, you can find what these numbers are for that particular poll.

There is no set way to combine polls into an aggregate, and there is no established error margin for polls that are combined. That’s because not all polls are created equal. On the eve of the election, Nate Silver and crew were giving something near a one in four chance that Trump would win the election, i.e., according to their analysis, if you had a good enough sample, one in four elections run where the data looked like this would go to Trump. Three in four would go to Clinton. The election doesn’t make them wrong. That’s a probability.

Let’s look at that. If I flip a coin, the probability is one in two that it will come up heads and one in two that it will come up tails. So I flip the coin, and it comes up heads. Was my projection wrong? Not at all. Similarly, the FiveThirtyEight people aren’t wrong either. If they had said, “Clinton will be the next president of the United States,” then they would have been wrong. What they said was that there was around a 78% chance it would be Clinton and a 22% chance it would be Trump.
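One way to see what a 22% chance means is to simulate it. This sketch (the 0.22 probability is the figure from the paragraph above) counts how often the underdog outcome occurs across many hypothetical elections; a single underdog win no more falsifies the forecast than one flip of heads falsifies a fair-coin claim.

```python
# Sketch of what "a 22% chance" means: simulate many elections where
# the underlying probability of the underdog winning is 0.22 and
# count how often that outcome actually occurs.
import random

random.seed(0)  # fixed seed so the run is repeatable
p_underdog = 0.22
trials = 100_000

wins = sum(random.random() < p_underdog for _ in range(trials))
print(f"Underdog won {wins / trials:.1%} of simulated elections")
```

Over enough trials the observed frequency settles near 22%, yet the underdog still wins more than a fifth of the time.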

They were critiqued by Sam Wang of the Princeton Election Consortium, and several people wrote on FiveThirtyEight to defend their methodology. Dr. Wang gave a 99% chance that it would be Clinton. Both Nate Silver (and an unknown number of members of his crew) and Dr. Wang are much more skilled at this than I am (in the same sense that an MLB player is more skilled at baseball than I am, considering I have never picked up a baseball bat for a game). I spent some time with their data and couldn’t really find a way to understand fully how they got their probabilities and why they differed so much. (On your mental flowchart, create a box and label it “Lots of Statistical Figuring.”) But my intuitive feeling was that Nate Silver was getting the better of the argument. It seemed to me that there was insufficient data on which to base that high a level of confidence in the aggregation of these polls. We haven’t been polling presidential elections for that long.

Why don’t we have neat numbers for the aggregate values? Let me note, first, that any time one uses a phrase like “the polls show,” one is doing some sort of aggregation, however loose. Folks like Sam Wang and Nate Silver do it in a very scientific way. (I’ll let them argue over which is more scientific, and they do so with some vigor.) We all do it when we look at polls and make a generalization. The reason there isn’t a neat x% margin of error and y% probability that the poll will, in fact, fall within that margin is that polls use different methodologies. If you average the number of apples and oranges you have, you don’t get a better value each for apples and oranges. It might be better to say that if I average my Gala apples and my Golden Delicious apples, I don’t get an accurate picture of what type of apples I have available. One set of those apples Jody wants to bake into a crisp, and the other I’m going to slice up and eat. I’m afraid I’ll have to look at the actual apples.

Again, my inexpert intuition is that aggregation needs more experience and testing to get more accurate, even as I think that Nate Silver’s work is the more promising. In the meantime, what is quite certain is that nobody in the media has a clue about any of this. Alternatively, they don’t care, and just want to write headlines to sell papers, whether they reflect actual data or not. I suppose that’s possible.



The Flawed Way People Read Polls

Here’s your illustration. Liberals loved Nate Silver because he calculated that Barack Obama would win the presidency, among other things. Conservatives didn’t like him so much. Now conservatives are pointing to the poor odds, though 60-40 is a ratio many politicians would covet.

I love Nate Silver not because of who he supports but because he shows his work, admits his mistakes, and has a pretty good track record. If I want to disagree with him, I can find the data in his own material. No, I don’t think he’s always right. The good thing is that he doesn’t think he’s always right.

People on both sides of the political spectrum try to make polls say what they want, or they cherry pick the poll that suits them. Newspapers tend to represent polls in whatever way will sell the most papers. It causes me to remember the book Lies, Damn Lies, and Statistics. There’s no better way to lie to people than to combine two factors: 1) Tell them what they want to hear and 2) Put some numbers in it.

In Reporting Polls, Please …

… always consider the sampling error when you report the difference between successive polls.

News organizations have been getting somewhat better, in my subjective view, at noting when a result is within the sampling error in a particular poll, but they still report increases or decreases in a lead without that note. If a candidate moves from 46% to 48% in successive polls where the margin is +/-4%, that is not a statistically significant change. And if multiple polls show results that are all within their various sampling errors, those polls are not scattered all over the map or telling different stories.
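As a sketch of why that 2-point move isn’t significant: for two independent polls, a common rule of thumb puts the margin of error on the *difference* at roughly the root-sum-square of the individual margins, which here is far larger than the observed shift. (The 46%, 48%, and +/-4% figures are from the paragraph above.)

```python
# Sketch: is a move from 46% to 48% meaningful when each poll has a
# +/-4% margin of error? A rule of thumb for the margin on the
# difference between two independent polls is the root-sum-square
# of the two individual margins.
import math

def change_margin(margin1, margin2):
    """Approximate margin of error on the difference of two polls."""
    return math.sqrt(margin1**2 + margin2**2)

change = 48.0 - 46.0               # observed shift: 2 points
margin = change_margin(4.0, 4.0)   # ~5.7 points

print(f"Shift of {change} points; margin on the shift is ~{margin:.1f} points")
print("Statistically significant" if abs(change) > margin else "Within the noise")
```

The observed 2-point shift is well inside the roughly 5.7-point margin on the change, so the honest report is "no detectable movement."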

I also wish news stories would define the various terms they use to modify “lead” or “trail.” One has no idea from the headline just what has happened.

OK, that’s my whining for the moment. 🙂

Voter Ignorance



A Kaiser poll (pdf) finds that 22% of respondents believe the health care reform bill has been repealed, and another 26% don’t know. (HT: Dispatches and The Daily Dish)

Now I realize that the majority in the poll do realize, correctly, that the health care bill is still law. But consider that even the 22% is more than the gap between those who support and those who oppose the law. The question this raises for me is just how meaningful the rest of the poll responses can be, when answered by people who aren’t at all acquainted with it.

Elgin Hushbeck, Jr. in his book Preserving Democracy (published by my company Energion Publications) notes another poll:

A key reason is that while most people know who the President is*, a significant number of voters have no clue about Congress. In a Zogby poll of those who voted for Obama conducted shortly after the election, less than half, 42.6 percent, even knew that Congress was controlled by the Democrats [224] and 36.5 percent actually believed that the Republicans were in control. In a USA Today/Gallup Poll conducted just after the election, 28 percent of those asked said they had never heard of Harry Reid, the Senate Majority Leader, while Nancy Pelosi, the Speaker of the House, was better known, as only eight percent [226] had never heard of her (pp. 208-209).

These are not matters of opinion. I often hear people called ignorant because they don’t agree with certain conclusions. Such accusations only reflect badly on the accuser. But whether a law has been passed, or which party is in control of Congress, is not a matter of opinion.

No wonder polls tend to vary wildly and people’s votes often seem to contradict their values.


Changing Polls

No, not President Obama’s approval rating, but the poll that I have in the right-hand sidebar. It has been there for more than 18 months, and surprisingly enough is still generating interest. The last comment is dated December 19, 2009, and there have been quite a number. You can see the results here.

Who does God hate?

Everyone, because we are all sinners: 3% (13 votes)
Only unrepentant sinners: 9% (40 votes)
All non-Christians: 2%
Those who have never said the sinner’s prayer: 0%
Nobody, God loves everybody: 77% (330 votes)
Other (please comment): 9% (39 votes)

431 votes total

In case anyone wants to keep this poll alive even further past its sell-by date, I’m including the form as well:

Who does God hate?
Everyone, because we are all sinners
Only unrepentant sinners
All non-Christians
Those who have never said the sinner’s prayer
Nobody, God loves everybody
Other (please comment)