The Central Limit Theorem (CLT) is a statistical concept stating that the distribution of the sample mean of random variables approaches a normal distribution if the sample size is large enough. The sampling distribution gets closer to normal as the size of the sample increases, regardless of the shape of the original population distribution.
A sample is a group of observations drawn from a broader population of all possible observations that could be made in a given set of trials. The key terms are:
- Observation: Result from one trial of an experiment.
- Sample: Group of results gathered from separate independent trials.
- Population: Space of all possible observations that could be seen from a trial.
As the sample size and the number of samples increase, the graph of the sample means moves towards a normal distribution. An important part of the theorem is that the mean of the sample means approximates the mean of the entire population. If you calculate the means of multiple samples from a population, add them up, and find their average, the result will be a close estimate of the population mean.
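The behaviour described above can be sketched with a small simulation. This is only an illustration under stated assumptions: the exponential population, its mean of 2.0, the sample size of 50, and the helper names are all hypothetical choices, not part of the theorem.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed right-skewed population: exponential with mean 2.0
POPULATION_MEAN = 2.0


def sample_mean(n):
    """Draw n independent observations and return their mean."""
    observations = [random.expovariate(1 / POPULATION_MEAN) for _ in range(n)]
    return statistics.fmean(observations)


def simulate_sample_means(sample_size, num_samples):
    """Collect the means of num_samples separate samples."""
    return [sample_mean(sample_size) for _ in range(num_samples)]


means = simulate_sample_means(sample_size=50, num_samples=5000)
grand_mean = statistics.fmean(means)
spread = statistics.stdev(means)

# Share of sample means within one standard deviation of their average.
# For a normal distribution this share is about 0.68.
within_one_sd = sum(abs(m - grand_mean) <= spread for m in means) / len(means)

print(f"average of sample means: {grand_mean:.3f}")
print(f"share within one SD:     {within_one_sd:.3f}")
```

The average of the sample means should land close to the population mean, and the share within one standard deviation should sit near the value expected for a normal curve, even though the population itself is strongly skewed.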
History of the Central Limit Theorem
The very first version of the CLT was formulated by Abraham de Moivre, a French mathematician. He published an article in 1733 in which he used the normal distribution to approximate the distribution of the number of heads resulting from many tosses of a coin.
In 1812, the concept was reintroduced by Pierre-Simon Laplace, another French mathematician. In his work titled “Théorie Analytique des Probabilités,” he extended the idea by approximating the binomial distribution with the normal distribution.
In 1901, the Central Limit Theorem was developed further by Aleksandr Lyapunov, a Russian mathematician. He went a step further, defining the concept in general terms and proving how it worked mathematically. The characteristic functions that he used to prove the theorem were adopted in modern probability theory.
Let’s have a look at the mathematical equation he came up with:
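The result usually associated with Lyapunov can be written as below. This is the standard textbook statement, and the symbols X_i, μ_i, σ_i², s_n, and δ are notation assumed here rather than taken from the original article.

```latex
% Independent random variables X_1, ..., X_n with
% E[X_i] = \mu_i and Var(X_i) = \sigma_i^2 < \infty.
s_n^2 = \sum_{i=1}^{n} \sigma_i^2

% Lyapunov's condition: for some \delta > 0,
\lim_{n \to \infty} \frac{1}{s_n^{2+\delta}}
  \sum_{i=1}^{n} \mathbb{E}\left[ \lvert X_i - \mu_i \rvert^{2+\delta} \right] = 0

% Conclusion: the standardized sum converges in distribution
% to a standard normal.
\frac{1}{s_n} \sum_{i=1}^{n} (X_i - \mu_i) \xrightarrow{d} \mathcal{N}(0, 1)
```

When every X_i shares the same mean μ and standard deviation σ, s_n equals σ√n, so the conclusion reduces to √n(X̄ − μ)/σ converging to a standard normal.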
Why is the Central Limit Theorem important?
The CLT tells us that, whatever the shape of the population distribution, the sampling distribution of the mean approaches normality as the sample size increases.
This result is useful because the researcher never knows which mean in the sampling distribution is the same as the population mean. But selecting many random samples from the population causes the sample means to cluster together, allowing the researcher to make a good estimate of the population mean.
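One common way to turn this into an estimate is a normal-approximation interval around the sample mean. The sketch below is an assumption-laden illustration, not a method stated in the text: the exponential population, its mean of 5.0, the sample size of 100, and the 95% level are all hypothetical choices. It checks how often such intervals contain the true mean.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN = 5.0    # assumed population mean, known only because we simulate
SAMPLE_SIZE = 100  # assumed sample size
Z_95 = statistics.NormalDist().inv_cdf(0.975)  # roughly 1.96


def draw_sample(n):
    """Draw n observations from a right-skewed (exponential) population."""
    return [random.expovariate(1 / TRUE_MEAN) for _ in range(n)]


def interval_from_sample(sample):
    """Normal-approximation 95% interval for the population mean."""
    centre = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / len(sample) ** 0.5
    return centre - Z_95 * std_error, centre + Z_95 * std_error


trials = 2000
hits = 0
for _ in range(trials):
    low, high = interval_from_sample(draw_sample(SAMPLE_SIZE))
    if low <= TRUE_MEAN <= high:
        hits += 1

coverage = hits / trials
print(f"intervals containing the true mean: {coverage:.3f}")
```

If the normal approximation is working, the printed share should be in the neighbourhood of 0.95, typically a little lower for a skewed population at this sample size.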
Distribution of the Variable in the Population
The CLT’s definition includes the phrase “regardless of the variable’s distribution in the population.” In a population, the values of a variable can follow different probability distributions. These distributions can be normal, left-skewed, right-skewed, or uniform, among others.
The CLT applies to almost all types of probability distributions, but there are some exceptions. For example, the population must have finite variance.
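The finite-variance requirement can be illustrated by comparing a population that meets it with one that does not. In this sketch the exponential and standard Cauchy distributions, the sample sizes, and the helper names are all assumptions picked for the demonstration. The Cauchy distribution has no finite variance, so the spread of its sample means is not expected to shrink as the sample grows.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable


def exponential():
    """One draw from an exponential distribution with mean 1 (finite variance)."""
    return random.expovariate(1.0)


def cauchy():
    """One draw from a standard Cauchy distribution (no finite variance)."""
    return math.tan(math.pi * (random.random() - 0.5))


def iqr_of_sample_means(draw, sample_size, num_samples=2000):
    """Interquartile range of the means of num_samples samples."""
    means = [
        statistics.fmean([draw() for _ in range(sample_size)])
        for _ in range(num_samples)
    ]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1


results = {}
for n in (10, 400):
    results[n] = (
        iqr_of_sample_means(exponential, n),
        iqr_of_sample_means(cauchy, n),
    )
    print(f"n={n:4d}  exponential IQR={results[n][0]:.3f}  Cauchy IQR={results[n][1]:.3f}")
```

The interquartile range is used here instead of the standard deviation because the standard deviation of Cauchy sample means is itself unstable. The exponential column should shrink noticeably from n=10 to n=400, while the Cauchy column should stay at roughly the same width.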
Properties of the Central Limit Theorem
Normal distributions have two parameters: the mean and the standard deviation. As the sample size increases, the sampling distribution of the mean converges on a normal distribution whose mean equals the population mean and whose standard deviation equals σ/√n, where:
σ = the population standard deviation
n = the sample size
As the sample size (n) increases, the standard deviation of the sampling distribution becomes smaller.
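The σ/√n relationship can be checked numerically. The following is a minimal sketch that assumes a uniform population on [0, 1), for which σ = √(1/12); the sample sizes and the function name are hypothetical choices made for the example.

```python
import random
import statistics

random.seed(3)  # fixed seed so the run is repeatable

# Assumed population: uniform on [0, 1), whose standard deviation is sqrt(1/12)
SIGMA = (1 / 12) ** 0.5


def observed_sd_of_means(n, num_samples=4000):
    """Standard deviation of the means of num_samples samples of size n."""
    means = [
        statistics.fmean([random.random() for _ in range(n)])
        for _ in range(num_samples)
    ]
    return statistics.stdev(means)


comparison = {}
for n in (4, 16, 64):
    observed = observed_sd_of_means(n)
    predicted = SIGMA / n ** 0.5
    comparison[n] = (observed, predicted)
    print(f"n={n:3d}  observed={observed:.4f}  sigma/sqrt(n)={predicted:.4f}")
```

Each fourfold increase in n should roughly halve the observed standard deviation, matching the predicted column.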
Understanding the CLT is crucial when it comes to judging the validity of your results and assessing the precision of your estimates. Using large sample sizes satisfies the normal approximation even when your data are not normally distributed, and gives you more accurate estimates.