Likely Voters, Low Response Rates, and Little Coverage Error: Advances in Internet Preelection Polling Methodology

Ashley Burton and Dr. Quin Monson, Political Science

The purpose of my research was to determine if internet preelection surveys administered to a probability proportionate to size (PPS) sample are able to reduce coverage error by including likely voters, even if the response rates are low. This research was conducted during the Utah 3rd District primary election in June 2008 and again in the Utah November 2008 general election. The experimentation was repeated in November to improve the sampling method and to generalize the findings to primary and general elections alike.

A classic problem with preelection polling is that it is difficult for pollsters to identify likely voters. This leads to coverage error, which occurs in preelection surveys when those surveyed before an election systematically select a different candidate than actual voters do on election day. My research presented a new method for preelection surveys that combined some of the best features of previous preelection survey methodologies. I, along with my mentor and colleagues, used an online preelection survey administered to a probability sample drawn from a voter registration list. We hypothesized that even though the response rate was quite low, the low response rate actually served to reduce coverage error because the sample included only the most likely voters.

Two methods of probability sampling were tested. First, a Simple Random Sample (SRS) was drawn from the Utah voter file. Second, a regression model employing past turnout and other factors was used to estimate a probability of voting in both elections for each person in the Utah voter file. A Probability Proportionate to Size (PPS) sample was then drawn using this probability of voting as the size measure, so that people with a higher estimated probability of voting were more likely to be selected.
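
The difference between the two sampling methods can be illustrated with a minimal sketch. It is not the code used in this research. It assumes the estimated turnout probabilities already exist (the regression model itself is not shown), and it uses systematic PPS selection, which is only one of several ways to draw a PPS sample. The function names and the voter identifiers are hypothetical.

```python
import random


def srs_sample(voter_ids, n, rng):
    """Simple random sample: every registered voter has the same chance."""
    return rng.sample(voter_ids, n)


def pps_systematic_sample(voter_ids, turnout_probs, n, rng):
    """Systematic PPS sample: the chance of selection is proportional
    to each voter's estimated probability of voting (the size measure).

    Assumes no single probability exceeds sum(turnout_probs) / n;
    otherwise a voter could be selected more than once.
    """
    total = sum(turnout_probs)
    step = total / n                      # sampling interval on the cumulative scale
    start = rng.uniform(0, step)          # random start within the first interval
    targets = [start + k * step for k in range(n)]

    sample = []
    cumulative = 0.0
    t = 0                                 # index of the next target to hit
    for voter_id, prob in zip(voter_ids, turnout_probs):
        cumulative += prob
        # Select this voter for every target that falls in their interval.
        while t < n and targets[t] < cumulative:
            sample.append(voter_id)
            t += 1
    return sample


if __name__ == "__main__":
    rng = random.Random(2008)
    # Hypothetical voter file: made-up IDs and made-up turnout probabilities.
    ids = list(range(10_000))
    probs = [rng.random() for _ in ids]

    srs = srs_sample(ids, 500, rng)
    pps = pps_systematic_sample(ids, probs, 500, rng)

    def mean_prob(sample):
        return sum(probs[i] for i in sample) / len(sample)

    print("Mean turnout probability, SRS:", round(mean_prob(srs), 3))
    print("Mean turnout probability, PPS:", round(mean_prob(pps), 3))
```

In a sketch like this, the PPS sample should show a noticeably higher average turnout probability than the SRS sample, which is the property the research relied on to reach likely voters.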

Analysis of the June 2008 preelection survey results showed that the PPS sample predicted the election results more accurately than the SRS sample. Furthermore, when we restricted the respondents to those who identified themselves as likely voters on the survey, the predicted vote margin exactly matched the final vote margin on election day. See Table 1 below.

Because general elections differ in nature from primary elections (for example, they have higher voter turnout and multiple competing parties), the research was not complete with only the June 2008 primary results. Therefore, we repeated the experiment in November 2008 and analyzed the data to see if internet preelection surveys are indeed able to reduce coverage error when sampling likely voters. Our vote margin predictions were close to the actual election results, within 5 percentage points, but the November 2008 preelection results were not as accurate as the June 2008 results. Again, the PPS sample predicted the election results more accurately than the SRS sample.
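
The comparison between predicted and actual vote margins described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code from this research: the record layout (`choice`, `likely_voter`), the function names, and all numbers in the example are hypothetical and are not the study's results.

```python
from collections import Counter


def candidate_shares(responses, likely_only=False):
    """Percentage share per candidate among survey respondents.

    With likely_only=True, only respondents who identified themselves
    as likely voters are counted.
    """
    choices = [
        r["choice"]
        for r in responses
        if r["likely_voter"] or not likely_only
    ]
    if not choices:
        raise ValueError("no respondents left after filtering")
    counts = Counter(choices)
    total = len(choices)
    return {cand: 100.0 * n / total for cand, n in counts.items()}


def margin_error(poll_shares, actual_shares, first, second):
    """Absolute difference, in percentage points, between the polled
    margin (first minus second) and the actual election-day margin."""
    poll_margin = poll_shares[first] - poll_shares[second]
    actual_margin = actual_shares[first] - actual_shares[second]
    return abs(poll_margin - actual_margin)


if __name__ == "__main__":
    # Made-up respondents and a made-up election result.
    responses = (
        [{"choice": "A", "likely_voter": True}] * 6
        + [{"choice": "B", "likely_voter": True}] * 4
        + [{"choice": "B", "likely_voter": False}] * 5
    )
    actual = {"A": 60.0, "B": 40.0}

    everyone = candidate_shares(responses)
    likely = candidate_shares(responses, likely_only=True)
    print("Error, all respondents:", margin_error(everyone, actual, "A", "B"))
    print("Error, likely voters:  ", margin_error(likely, actual, "A", "B"))
```

A smaller margin error for one sample than another is the sense in which a sample "predicted the election results more accurately" in this kind of comparison.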

Possible reasons for the less accurate results include error in the voter likelihood regression model or sampling error. Also, the increased number of voters in a general election as compared to a primary election may have made the second survey results less accurate. Another difficulty with the November survey was that while we invited the same number of people to participate in the November 2008 preelection survey as in the June survey, we had a lower response rate. Of course, our hypothesis was that a low response rate contributes to accuracy, but if the response rate is too low, the respondents may not truly represent election-day voters.

Even though the results were not as accurate in the second phase of the research, they were still fairly close. Internet surveys are growing in popularity, and the results of this research have significant implications. Online surveys are certainly useful for predicting election results as well as vote margins and for examining why voters vote the way they do. These conclusions are not perfect, and more research needs to be done. However, this research has strong internal validity regarding the Utah Third District, and the likely voter modeling and sampling methodologies will be perfected for future elections in the district. There is also external validity because pollsters and researchers can replicate this type of sampling methodology if they obtain the voter file for their location and find ways to encourage likely voters to participate in their online preelection surveys. Modeling likely voters does work, but the model depends on the nature of the population of interest and the specific location of the election.

Fortunately, the Brigham Young University Center for the Study of Elections and Democracy continues to test online surveys a few times a year and each time there is a primary or general election. The sampling methods and online survey quality are improving with each election.

My paper on the June 2008 analysis was submitted to a student competition hosted by the Pacific Chapter of the American Association for Public Opinion Research (PAPOR). The paper was awarded second place, and I traveled to the PAPOR annual conference to present a poster of the findings. Now that I have graduated, students and professors will continue this research, and more presentations and eventual publication are likely.