As many of you may be aware, the UK general election took place yesterday, resulting in a surprising victory for the Conservative Party. The pre-election opinion polls predicted that the Conservatives and Labour would be roughly equal in terms of votes cast, with perhaps a small Conservative advantage leading to a hung parliament; instead the Conservatives got 36.9% of the vote to Labour's 30.4%, and won the election outright.
There has already been a lot of discussion about why the polls were wrong, from methodological problems to incorrect adjustments. But perhaps more interesting is the possibility that the polls were right! For example, Survation did a poll on the evening before the election, which predicted the correct result (Conservatives 37%, Labour 31%). However, that poll was never published because the results seemed "out of line." Survation didn't want to look silly by breaking with the herd, so they just kept quiet about their results. Naturally this makes me wonder about the existence of other unpublished polls with similar readings.
This seems to be a case of two well know problems colliding with devastating effect. Conformity bias caused Survation to ignore the data and go with what they "knew" to be the case (for which they have now paid dearly). And then the file drawer effect meant that the generally available data was skewed, misleading third parties. The scientific thing to do is to publish all data, including "outliers," both so that information can change over time rather than be anchored, and to avoid artificially compressing the variance. Interestingly, the exit poll, which had a methodology agreed beforehand and was previously committed to be published, was basically right.
This is now the third time in living memory that opinion polls have been embarrassingly wrong about the UK general election. Each time this has lead to big changes in the polling industry. I would suggest that one important scientific improvement is for polling companies to announce the methodology of a poll and any adjustments to be made before the poll takes place, and commit to publishing all polls they carry out. Once this became the norm, data from any polling company that didn't follow this practice would be rightly seen as unreliable by comparison.