# The Central Limit Theorem

2017-04-05

The central limit theorem states that the sampling distribution of the mean of any independent, random variable will be normal or nearly normal, if the sample size is large enough. How large is “large enough”? The answer depends on two factors.

• Requirements for accuracy. The more closely the sampling distribution needs to resemble a normal distribution, the more sample points will be required.
• The shape of the underlying population. The more closely the original population resembles a normal distribution, the fewer sample points will be required. (From stattrek.com.)
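The two factors above can be seen in a small simulation. The following is a minimal sketch in Python rather than the T-SQL used in the linked post. The choice of an exponential population, the sample sizes, the seed, and the helper names `sample_means`, `skewness`, and `draw_exponential` are all assumptions made for illustration. It draws repeated samples from a strongly skewed population and measures how far the distribution of sample means is from symmetric as the sample size grows.

```python
import random
import statistics


def sample_means(draw, sample_size, num_samples, rng):
    """Return num_samples means, each computed from sample_size draws."""
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(num_samples)
    ]


def skewness(values):
    """Population skewness; 0 for a symmetric distribution such as the normal."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


def draw_exponential(rng):
    # Exponential(1) population: right-skewed, mean 1, standard deviation 1.
    return rng.expovariate(1.0)


rng = random.Random(42)  # fixed seed (arbitrary) so the run is repeatable
results = {}
for n in (2, 10, 50):
    means = sample_means(draw_exponential, n, 5000, rng)
    # Store (mean of sample means, spread of sample means, skewness).
    results[n] = (
        statistics.fmean(means),
        statistics.pstdev(means),
        skewness(means),
    )
    print(n, results[n])
```

For this population the mean of the sample means should stay close to 1, their spread should shrink roughly as 1 divided by the square root of the sample size, and the skewness should move toward 0 as the sample size increases. A population that already looks normal would reach low skewness with far fewer points, which is the second factor in the list.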

The main use of the sampling distribution is to verify the accuracy of many statistics and of the populations they were based upon.

Read on for an example and to see how to calculate this in T-SQL.

## Logistic Regression In R

2017-04-21

Steph Locke has a presentation on performing logistic regression using R: Logistic regressions are a great tool for predicting outcomes that are categorical. They use a transformation function based on probability to perform a linear regression. This makes them easy to interpret and implement in other systems. Logistic regressions can be used to perform a classification […]
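The idea of a probability-based transformation wrapped around a linear model can be sketched in a few lines. This is an illustrative Python example, not code from the presentation (which uses R). The coefficients `intercept` and `slope` are hypothetical values chosen by hand rather than fitted to data, and the function names are assumptions.

```python
import math


def logistic(z):
    """Map any real number to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Inverse of logistic: the log-odds of probability p."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients for a single predictor x (not fitted to real data).
intercept, slope = -3.0, 0.5


def predict_probability(x):
    # The linear part lives on the log-odds scale; logistic() converts it
    # into a probability.
    return logistic(intercept + slope * x)


for x in (0, 6, 12):
    p = predict_probability(x)
    print(x, round(p, 3), round(logit(p), 3))
```

On the log-odds scale the model is an ordinary straight line, so under these assumed coefficients each one-unit increase in `x` adds 0.5 to the log-odds. That linearity is what makes the coefficients straightforward to read and to reimplement in another system.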

## Understanding The Problem: Churn Edition

2017-04-18

Emre Yazici points out the importance and difficulty of nailing down good definitions, using bookings churn as an example: WHEN: Let’s say we have made our design, constructed a model, and obtained good accuracy. However, our model predicts (even with 95% accuracy) the customers who are going to churn in the next day! That means […]