For this year’s elections, they introduced sample vote counts, in which a sample of ballots is counted at each polling station. Interestingly, they also stated that the results of the sample vote counts would be within 4% of the actual results. I was wondering how they computed that, and it appears that they are right. The following R code runs simulations and also estimates the probability from a hypergeometric distribution.
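fraction <- 0.5        # vote share of one candidate
total <- 1e6           # total votes cast
sample_frac <- 0.005   # fraction of ballots sampled (renamed so it does not mask sample())
iterations <- 1000     # number of simulated sample counts
n_sample <- round(sample_frac * total)

# simulation results: percentage for the candidate in each simulated sample count
votes <- c(rep(1, fraction * total), rep(0, (1 - fraction) * total))
hist(sapply(1:iterations, function(x) 100 * mean(sample(votes, n_sample))), breaks = 21)

# hypergeometric distribution: probability that the sampled count is within 4% (relative) of its expected value
sum(dhyper(round(fraction * n_sample * 0.96):round(fraction * n_sample * 1.04), fraction * total, (1 - fraction) * total, n_sample))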
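As a rough cross-check on the hypergeometric sum, here is a back-of-the-envelope normal approximation using the same settings as the code (N = 10^6 votes, K = 500,000 for the candidate, n = 5000 sampled ballots). The symbols N, K, n, p and the estimated share p-hat are my own notation, not anything from the official description, and the figures are approximate. Note that the 0.96 to 1.04 range in the code is relative to the expected count of 2500, so it covers 2400 to 2600 ballots, or a sample share of 48% to 52% (plus or minus 2 percentage points).

```latex
\begin{align*}
X &\sim \mathrm{Hypergeometric}(N, K, n), \qquad \hat{p} = X/n, \qquad p = K/N \\
\operatorname{Var}(\hat{p}) &= \frac{p(1-p)}{n}\cdot\frac{N-n}{N-1} \\
\operatorname{SD}(\hat{p}) &= \sqrt{\frac{0.5 \times 0.5}{5000}\cdot\frac{10^{6} - 5000}{10^{6} - 1}} \approx 0.0071 \\
\frac{0.02}{0.0071} &\approx 2.8, \qquad P\left(|\hat{p} - p| \le 0.02\right) \approx 2\Phi(2.8) - 1 \approx 0.995
\end{align*}
```

So under these assumptions a sample of 5000 ballots lands within 2 percentage points of the true share about 99.5% of the time. If the 4% is instead read as 4 percentage points (46% to 54%), that is more than 5 standard deviations, which is even more comfortable.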