Sample vote counts

Posted at 03:13 PM on Thursday 17 September 2015

For this year's elections, sample vote counts were introduced, in which a sample of ballots is drawn from each polling station. Interestingly, it was also stated that the results of the sample vote counts would be within 4% of the actual results. I was wondering how that figure was computed, and it appears to be right. The R code below runs a simulation and then estimates the same probability from a hypergeometric distribution, treating "within 4%" as within 4% of the expected count in the sample.

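fraction <- 0.5        # true vote share of one candidate
total <- 1e6           # total number of votes cast
sample_frac <- 0.005   # fraction of votes drawn (renamed to avoid clashing with sample())
iterations <- 1000     # number of simulated sample counts
n <- round(sample_frac * total)   # ballots drawn in each sample count

# simulation results: histogram of the sampled vote share, in percent
votes <- c(rep(1, round(fraction * total)), rep(0, round((1 - fraction) * total)))
hist(replicate(iterations, 100 * mean(sample(votes, n))), breaks = 21)

# hypergeometric distribution: probability that the sampled count lies
# within 4% of its expected value (rounded so dhyper() gets integer counts)
expected <- fraction * n
sum(dhyper(round(expected * 0.96):round(expected * 1.04),
           round(fraction * total), round((1 - fraction) * total), n))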
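As a cross-check on the hypergeometric sum, here is a minimal sketch that computes the same probability in two ways: exactly from the hypergeometric CDF, and approximately from a normal distribution with a finite population correction. The helper name coverage() and its arguments are my own and purely illustrative; it assumes the same setup as above (one simple random sample drawn without replacement, and "within 4%" read as relative to the expected count).

```r
# Hypothetical helper (illustrative only): probability that the sampled count
# falls within a relative tolerance `tol` of its expected value.
coverage <- function(fraction = 0.5, total = 1e6, sample_frac = 0.005, tol = 0.04) {
  n <- round(sample_frac * total)   # ballots drawn
  m <- round(fraction * total)      # ballots for the candidate in the population
  expected <- n * m / total         # expected count in the sample

  # Integer bounds of the band; the small epsilon guards against
  # floating-point error when the bound is a whole number.
  lo <- ceiling(expected * (1 - tol) - 1e-9)
  hi <- floor(expected * (1 + tol) + 1e-9)

  # Exact probability from the hypergeometric CDF
  exact <- phyper(hi, m, total - m, n) - phyper(lo - 1, m, total - m, n)

  # Normal approximation: hypergeometric standard deviation (includes the
  # finite population correction), plus a continuity correction of 0.5
  p <- m / total
  sd_count <- sqrt(n * p * (1 - p) * (total - n) / (total - 1))
  approx <- pnorm(hi + 0.5, expected, sd_count) - pnorm(lo - 0.5, expected, sd_count)

  c(exact = exact, normal = approx)
}

print(coverage())
```

Calling coverage(tol = 0.02) or coverage(sample_frac = 0.001) shows how quickly the probability drops when the tolerance is tightened or fewer ballots are sampled.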

bernett.net