Projections And Confidence Intervals

Mirio De Rosa explains confidence intervals and sampling issues, using polls as an example:

Weighting is used to make sure samples reproduce the underlying characteristics of the population they are drawn from. For instance, in the UK 51.6% of voters are women, of these 22.5% are above 65 years of age, 23.1% have a higher education and so on. The people selected to make part of a sample may be recruited to ensure they match these proportions.

Within the context of weighting there are two major sampling procedures: Quota and stratified sampling. The relevant difference between them is the latter uses some sort of randomization device while with the former the interviewer decides whether or not to interview a person with certain characteristics. YouGov presumably applied Quota Sampling for the Brexit survey[4], and the gender, age and education weights they applied are shown in the following image.

Read the whole thing.

Related Posts

Calculating AUC in R

Andrew Treadway shows how you can calculate Area Under the Curve in R: AUC is an important metric in machine learning for classification. It is often used as a measure of a model’s performance. In effect, AUC is a measure between 0 and 1 of a model’s performance that rank-orders predictions from a model. For […]

Read More

Python versus R (Again)

Alex Woodie looks at whether Python is dominating R in the data science space: There is some evidence that Python’s popularity is hurting R usage. According to the TIOBE Index, Python is currently the third most popular language in the world, behind perennial heavyweights Java and C. From August 2018 to August 2019, Python usage surged […]

Read More


July 2016
« Jun Aug »