Statistics for journalists: Tips to assess the quality of surveys and polls
One of my colleagues is determined to conduct a survey. As she knows that I am delving into statistics this summer (if you don’t, I explain it here), she asked me how to get a random and representative sample using only her resources.
I included her request in my interviews with statisticians, but the answers were not as encouraging as I expected. My favourite one came from Kevin McConway, professor of Applied Statistics at The Open University, who has been involved with the BBC Radio 4 programme More or Less.
“It would be great if data journalists could deal with some techniques from statistical inference.” But we dangerously tend to assume that the calculation tells us everything. And he added:
And it does not tell you errors, such as asking the wrong question, picking the wrong people, not selecting enough young people…
Media outlets like the BBC or the Huffington Post run their surveys in collaboration with YouGov. But I took part in the analysis of a survey by ITV and the National Association of Head Teachers about head teachers’ opinions of the school funding crisis, and that gave me an idea:
What if she had access to the whole population?
1. What is your survey about?
The starting point is the questionnaire, Julián Cárdenas, professor of Sociology at Freie Universität Berlin, explains in “Manual de investigación cuantitativa.”
That means spending time thinking about, for instance, what we mean by “data”; avoiding abstract, general or negative questions; providing exhaustive answers that cover all the options; and dividing the questionnaire into blocks, among many other tasks.
That process is key to assessing the quality of the survey. In fact, “have they asked the right questions?” is one of the five points that the BBC advises checking before reporting on one.
The questions are also important when writing the story. The Sun wrote “six out of ten voters think she (Theresa May) has handled the Brexit talks badly and should go as soon as we leave next March.”
But that is inaccurate given their infographic. 27% say she should leave immediately; the remaining 33% may be found among those who think she should call a leadership election before leaving or stay until a new leader is chosen. Moreover, written that way, it seems that the voters who think she should go and the voters who think she is doing a bad job are the same people.
Companies like YouGov or Ipsos MORI give access to the full questionnaire and their methodology. But that does not always happen. I have been paying attention to the press releases I receive: very few provide even a couple of lines about their methodology, and none of them comes with the questions (yes, I know, that is not a representative sample).
2. Who is your population?
If the population of my colleague’s survey were the whole of Indonesia, with its more than 17,000 islands, I would just run away.
But she is interested in the 375 news organisations whose contacts have already been scraped.
To extrapolate the results, she will need a representative sample. For instance, if 10% are big institutions and 90% smaller, she would need this proportion in the sample. But whether this characteristic is relevant or not to shape the sample will depend on the questions.
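To make the 10%/90% idea concrete, here is a minimal sketch of a proportional stratified sample. The frame of 38 "big" and 337 "small" organisations, the sample size of 100 and the function name are all hypothetical; only the total of 375 and the rough 10/90 split come from the text.

```python
import random

def proportional_stratified_sample(strata, sample_size, seed=0):
    """Draw a sample whose strata keep the same shares as the population."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Allocate slots in proportion to the stratum's share of the population.
        quota = round(sample_size * len(members) / population_size)
        # Within each stratum, every member has the same chance of selection.
        sample[name] = rng.sample(members, quota)
    return sample

# Hypothetical sampling frame: 375 organisations, roughly 10% big and 90% small.
frame = {
    "big": [f"big_{i}" for i in range(38)],
    "small": [f"small_{i}" for i in range(337)],
}
sample = proportional_stratified_sample(frame, sample_size=100)
print({name: len(chosen) for name, chosen in sample.items()})  # {'big': 10, 'small': 90}
```

Rounding each quota separately can leave the total one or two units off the target for other splits, so a real design would need a rule for the leftovers.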
The next step is choosing the model for a probability-based sample. The most popular is simple random sampling, but there are alternatives.
Regarding size, bigger is better, but there is no “ideal size”. This will depend on the margin of error admitted, the heterogeneity and size of the population, and the confidence level (Cárdenas, 2018).
And think carefully about these points because…
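Those four ingredients fit into a standard textbook formula for estimating a proportion (often attributed to Cochran) with a finite population correction. The sketch below is my own illustration, not taken from Cárdenas: it assumes simple random sampling, and the 5% margin and 95% confidence level are example values. Heterogeneity enters through p, where 0.5 is the most cautious choice.

```python
import math
from statistics import NormalDist

def required_sample_size(population, margin_of_error, confidence=0.95, p=0.5):
    """Sample size needed to estimate a proportion under simple random sampling."""
    # z-score for a two-sided interval at the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Size for an infinite population; p * (1 - p) measures heterogeneity.
    n0 = z**2 * p * (1 - p) / margin_of_error**2
    # Finite population correction: small populations need fewer interviews.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

# The 375 news organisations, with an assumed 5% margin at 95% confidence.
print(required_sample_size(375, 0.05))  # 191
# A more homogeneous population (p = 0.1) needs a smaller sample.
print(required_sample_size(375, 0.05, p=0.1))
```

Even under these assumptions, about half of the 375 organisations would have to answer, which shows why a small population does not mean a small sample.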
Many of the most egregious statistical assertions are caused by good statistical methods applied to bad samples, not the opposite.
That is Charles Wheelan’s warning (Naked Statistics, 2012), and he adds: there is no “supercomputer or fancy formula” that can fix a bad sample.
“It seems too much given my time constraints… Plus, the geographical barriers don’t make it easy,” my colleague told me when I was still in the middle of the process.
Despite her doubts, I had already given myself up to this challenge and I was on a bike, downhill, and without brakes (no gif needed for this).
So, if the sample is the queen, how can I assess it?
I took my list of press releases based on surveys. I discarded those that do not include information about the sample size (like this one). And I started to do some calculations.
For this story, 2,000 Spanish people aged 18–75 were surveyed. The sample was weighted according to the population of each region. According to the INE (Instituto Nacional de Estadística), there are 34.22 million people aged 18–75, and 1.6 million of them live in the Canary Islands. That is 4.7%, and 4.7% of 2,000 is 93.5 people. The margin of error for these 93 ‘canarios’ is more than 10% (at a 95% confidence level), but the press release says it is between 2.19% and 3.1%.
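The arithmetic can be reproduced with a short sketch. It assumes simple random sampling, a 95% confidence level and the most cautious proportion (p = 0.5); none of these is stated in the press release, and the function name is mine.

```python
import math
from statistics import NormalDist

def margin_of_error(sample_size, confidence=0.95, p=0.5):
    """Margin of sampling error for a proportion under simple random sampling."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / sample_size)

total_sample = 2000
canary_share = 1.6 / 34.22            # Canary Islands share of people aged 18-75
canary_sample = total_sample * canary_share

print(round(canary_share * 100, 1))   # 4.7
print(round(canary_sample, 1))        # 93.5
print(margin_of_error(canary_sample)) # a little over 0.10
print(margin_of_error(total_sample))  # about 0.022
```

Under these assumptions the figure for the full sample of 2,000 lands close to the lower bound quoted in the press release, while the regional subsample is far less precise.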
Although I was not given the confidence level, I assumed 90% and took the higher margin of error, 3.1%. Given the population of 1.6 million, I found that the sample should be 704 people. But the Canary Islands is only the eighth most populated of the 17 regions, and they surveyed 2,000 people in total.
The numbers were not shouting a clear “yes” or “no”. I then compared them with other cases. The BBC and YouGov interviewed 20,081 people for this story, which refers to England’s adult population (around 45m), while the one in The Sun asked 1,019 voters.
Finally, I asked Professor Julián Cárdenas:
They should give you details about the size, but also about the heterogeneity of the population according to the variables measured.
“They should publish their process (…) and the variables that intervene in their results,” concludes the Instituto de Estudios Sociales Avanzados (IESA-CSIC) in a study about the lack of transparency and risk of manipulation in the Spanish polls.
“The best measure of the quality of research is how transparent your vendor is when they describe their research methodology, and the strengths and weaknesses associated with it,” wrote the Chief Research Officer at Peanut Labs, Annie Pettit, who has asked journalists to stop demanding the margin of error.
The margin of error may help to evaluate the size of the sample, but not how representative and random it is. This figure tells you how much the responses could vary if you had interviewed the whole population instead of a probability-based sample.
But it assumes that you are using a probability-based sample. And, it “does not apply to opt-in online surveys and other non-probability-based polls,” states the American Association for Public Opinion Research (AAPOR).
That is why Pettit claims that reporting it can be “inappropriate.” She also calls it “insufficient,” because of the other “systematic errors that cannot be analysed statistically,” as the report The De-emphasis of Sampling Error in Coverage of Presidential Approval Ratings explains. Those are measurement error, such as the wording or the order of the questions, and specification error.
Nate Cohn demonstrates in The Upshot how pollsters’ “sensible and defensible” decisions can add up to differences in the results, even though they work with the same data.
Cohn concludes that the margin of error “doesn’t even come close to capturing the total survey error. That’s why we didn’t report a margin of error in our original article.”
Nevertheless, there is no consensus on this.
According to the previous study about sampling errors, the margin of error can prevent journalists from wrongly reporting the outcomes of polls:
Results suggest that the news media frequently err by treating as a news event that a poll crosses an arbitrary line, such as the 50 percent (…), even when that line was crossed by an amount well within the poll’s margin of sampling error.
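A quick way to test a "crossed the line" headline is to check whether the gap to the line is larger than the margin of sampling error. This is a minimal sketch assuming simple random sampling and a 95% confidence level; the 52% and 60% results are hypothetical, and the sample size of 1,019 is borrowed from The Sun's poll only as an example.

```python
import math
from statistics import NormalDist

def clearly_crosses(share, sample_size, line=0.5, confidence=0.95):
    """True only if the result is further from the line than its margin of error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(share * (1 - share) / sample_size)
    return abs(share - line) > margin

# Hypothetical results from a poll of 1,019 people.
print(clearly_crosses(0.52, 1019))  # False: 2 points is inside the ~3-point margin
print(clearly_crosses(0.60, 1019))  # True: 10 points is well outside it
```

This covers sampling error only, so it says nothing about question wording or who was left out.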
Organisations such as the Office for National Statistics (ONS) have also decided to show the uncertainty in their reports.
And even non-probability-based surveys, like those from SurveyMonkey, have reported a modelled error estimate, a different measure that is considered best practice in these cases, according to AAPOR.
3. Which methods will you use?
There is a risk of recall bias in every survey, but there are other biases associated with when, where and how it is done. And this should be stated in the methodology, too.
Answers about the consumption of diet products would vary from May (operation bikini) to December (Christmas weight gain). And when asked in person, people would say they loved your exhibition, even if their visit lasted less than five minutes.
Telephone response rates have dropped to the point where only 10% of calls end in an interview, says the Harvard Business Review; so pollsters call each number back around seven times, and then correct this sort of bias by weighting the results.
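Weighting can take many forms. One simple version is post-stratification, where each group is scaled up or down until its share of the sample matches its share of the population. The sketch below uses entirely hypothetical numbers (a poll of 1,000 with too few young people) and is not a description of how any particular pollster works.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per respondent in each group: population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }

def weighted_share(yes_counts, sample_counts, weights):
    """Share answering 'yes' after applying the group weights."""
    weighted_yes = sum(weights[g] * yes_counts[g] for g in sample_counts)
    weighted_total = sum(weights[g] * sample_counts[g] for g in sample_counts)
    return weighted_yes / weighted_total

# Hypothetical poll: young people are 30% of the population but 10% of the sample.
sample_counts = {"young": 100, "older": 900}
population_shares = {"young": 0.3, "older": 0.7}
yes_counts = {"young": 60, "older": 360}  # 60% of the young and 40% of the older say yes

weights = poststratification_weights(sample_counts, population_shares)
raw = sum(yes_counts.values()) / sum(sample_counts.values())
print(round(raw, 2))                                                  # 0.42
print(round(weighted_share(yes_counts, sample_counts, weights), 2))   # 0.46
```

Each young respondent here counts three times, so the correction rests on the assumption that the 100 young people who answered resemble the ones who did not.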
Online methods face other challenges. As my colleague was interested in them, I registered in YouGov to test one.
They draw sub-samples from their database of “800,000 British adults” and send an email to your inbox. You then earn points that can be converted into money. It took me around 15 minutes to complete one survey, and I did only two: either I had no time or the topics were irrelevant to me.
“Recent studies show that even after weighting, online polls tend to overrepresent men and the unemployed,” reports the Harvard Business Review.
However, “it would be unfair to say that online is ‘biased’ in a way that offline is not,” YouGov says on its website. “The fact is, there are different biases for which all approaches have to account.”
“But bias is a risk, not a certainty,” Andrew Dilnot and Michael Blastland wrote in The Tiger That Isn’t. And it is “less likely in carefully designed surveys which aim for a random sample of the population and have asked more than the first half-dozen of people.”
And there is one more point to consider…
The missing data, which can “mean (that) these studies are biased, less accurate and less powerful,” Dr Jamilla Hussain explains here.
“(The authors of the survey) would have to provide you with the number of missing data and how they have dealt with it,” Cárdenas told me.
“The classic way is substituting the missing observations with a mean of that variable from the rest of the observations. But there are other methods,” he added.
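The classic method Cárdenas describes, mean imputation, fits in a few lines. This is a minimal sketch with made-up answers, where None marks a missing observation; the function name is mine.

```python
from statistics import mean, pstdev

def impute_with_mean(values):
    """Replace missing observations (None) with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical answers on a 1-10 scale; two respondents skipped the question.
answers = [4, None, 6, 8, None]
completed = impute_with_mean(answers)
print(completed)  # [4, 6, 6, 8, 6]

# The mean is preserved, but the spread shrinks because the filled-in
# values all sit exactly at the centre.
print(pstdev([4, 6, 8]), pstdev(completed))
```

That shrinking spread is one reason to ask which method was used: mean imputation makes the data look more consistent than the real answers were.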
4. Tell your methodology
By the end of this post, my colleague had accepted non-probability-based methods for her survey.
“I won’t extrapolate my conclusions to the whole population,” she told me, “but the insights I can get from this method are still valuable for my project.”
And I agree, under one condition: explain your methodology and its caveats to your readers, as Paul Bradshaw suggested to The Breaker.
As for The Breaker’s investigation, surveys and polls are a powerful tool for journalists and a valuable source of information. But there are errors and uncertainty inherent to them. Being aware of that would prevent inaccuracy in our stories, and it may demand a stronger editorial judgement.
“Get down to the grit of the way data is gathered, and you often find something slightly disturbing: human mess and muddle, luck and judgement, and always a margin of error in gathering only a small slice of the true total,” reads The Tiger That Isn’t.
Any mistake? Please, let me know. Comments are welcome! And here is the first post about statistics: