Press "Enter" to skip to content

Sampling without Replacement and Unequal Probabilities

Peter Ellis finds interesting results with sampling in R:

A week ago I was surprised to read on Thomas Lumley’s Biased and Inefficient blog that when using R’s sample() function without replacement and with unequal probabilities of individual units being sampled:

“What R currently has is sequential sampling: if you give it a set of priorities w it will sample an element with probability proportional to w from the population, remove it from the population, then sample with probability proportional to w from the remaining elements, and so on. This is useful, but a lot of people don’t realise that the probability of element i being sampled is not proportional to w_i”

Read on for a demonstration. H/T R-Bloggers.