Parameter: A parameter is a value that describes the whole population. When we generalize about the entire population, we are talking about parameters.
Sample: A sample is a small portion of the population. We study this small portion and generalize from it, because measuring the whole population is usually not practical.
Why do we need to know this? To understand why some symbols differ and why n is replaced by (n-1) in the sample variance / standard deviation formula.
Look at the picture given below to see how the symbols differ.
The calculation is the same, but the symbols change: a population uses N for its size, µ for its mean and σ for its standard deviation, while a sample uses n, x̄ and s.
Now you may be asking: how do we decide whether a particular set of data is a population or a sample?
In simple terms, a population is the large group to which we apply the results of experiments that we have carried out on samples.
For example, in a city of 10,000 people we want to know how many people like Apple. Here the 10,000 people are our population. Because it is not practical to ask all 10,000 people, we take a sample of 100 people. These 100 people are our sample.
Suppose that after surveying these 100 people we find that 80 of them like Apple. We then conclude that about 80% of the people in the city like Apple.
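The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration using the numbers from the text; the variable names are made up for the sketch, and it assumes the 100 people were chosen so that they represent the city fairly.

```python
# Numbers taken from the example in the text (variable names are illustrative).
population_size = 10_000   # everyone in the city
sample_size = 100          # people we actually asked
sample_likes = 80          # people in the sample who like Apple

# The sample proportion is our estimate of the population proportion.
sample_proportion = sample_likes / sample_size

# Apply the sample result to the whole population.
estimated_likes = round(sample_proportion * population_size)

print(f"Sample proportion: {sample_proportion:.0%}")
print(f"Estimated people in the city who like Apple: {estimated_likes}")
```

The estimate for the city is only as good as the sample: a different group of 100 people could give a somewhat different percentage.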
So whether a data set is a population or a sample basically depends on how you want to use the data.
What is a degree of freedom, and why do we replace N with n-1 in the sample variance / standard deviation?
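First, look at the formulas.
Standard deviation and variance for a sample: s² = Σ(xi - x̄)² / (n - 1), and s is the square root of s².
Standard deviation and variance for a population: σ² = Σ(xi - µ)² / N, and σ is the square root of σ².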
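To see the two divisors side by side, here is a minimal Python sketch. The data values are illustrative only, and the same three numbers are treated once as a whole population and once as a sample purely for comparison. The `statistics` functions used (`pvariance`, `variance`, `pstdev`, `stdev`) are from the Python standard library.

```python
import math
import statistics

data = [7, 3, 5]  # illustrative values only

n = len(data)
mean = sum(data) / n
squared_deviations = sum((x - mean) ** 2 for x in data)

# Population formulas: divide by N (here N == n, since we treat data as everything).
population_variance = squared_deviations / n
population_std = math.sqrt(population_variance)

# Sample formulas: divide by n - 1.
sample_variance = squared_deviations / (n - 1)
sample_std = math.sqrt(sample_variance)

print("population:", population_variance, population_std)
print("sample:    ", sample_variance, sample_std)

# The standard library offers the same calculations directly.
print("statistics.pvariance:", statistics.pvariance(data))
print("statistics.variance: ", statistics.variance(data))
print("statistics.pstdev:   ", statistics.pstdev(data))
print("statistics.stdev:    ", statistics.stdev(data))
```

For these values the squared deviations add up to 8, so the population variance is 8/3 and the sample variance is 8/2 = 4. The sample version is always the larger of the two, because the divisor is smaller.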
Let's start with an example.
The population mean mu (µ) is usually not a known value, because it is not easy to calculate the average of a whole population. We still treat it as the average of the population. In the case given below, µ is 5.
xi | xi - µ
7 | 7 - 5
3 | 3 - 5
6 | 6 - 5
In the above example we have three observations, and all three values (xi) are independent. Note that the mean of the xi values does not have to match the population mean µ (5), because µ is not computed from these three values and cannot be known exactly.
(7 + 3 + 6) / 3 = 5.33...
Now let's take the same case with the sample mean (x̄). A sample is a small set, so we can calculate its mean exactly. Assume that x̄ = 5 for this example.
In this case, after two independent observations, the third value must be 5 so that the values average to the sample mean (x̄), because the sample mean is a defined, exact value. The conclusion is that we have lost one degree of freedom, which leaves n-1.
xi | xi - x̄
7 | 7 - 5
3 | 3 - 5
5 | 5 - 5
Sample mean (x̄) = (7 + 3 + 5) / 3 = 5
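The lost degree of freedom can be checked with a short Python sketch. The helper `last_value` is a hypothetical name introduced only for this illustration: given the sample mean and the first n-1 observations, it works out the one value the last observation is forced to take.

```python
def last_value(free_values, sample_mean, n):
    """Return the only value that makes n observations average to sample_mean."""
    if len(free_values) != n - 1:
        raise ValueError("exactly n - 1 values can be chosen freely")
    # The n values must add up to n * sample_mean, so the last one is fixed.
    return n * sample_mean - sum(free_values)


# Values from the example: two free observations and a sample mean of 5.
third = last_value([7, 3], sample_mean=5, n=3)

# Deviations from the sample mean always add up to zero.
deviations = [x - 5 for x in [7, 3, third]]

print("forced third value:", third)
print("deviations:", deviations, "sum:", sum(deviations))
```

Whatever two values are chosen first, the third is determined by the mean. That constraint, the deviations from x̄ summing to zero, is what removes one degree of freedom.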
Degree of freedom: The number of values that are free to vary is called the degrees of freedom. The concept is explained in the examples above.
So, in the sample case, once the sample mean is fixed the last value is not free to vary: only n-1 values are free, and that is why we divide by n-1. In the population case, on the other hand, every value including the last one is free to vary, because the observations do not have to match the mean exactly, so we divide by N.
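One way to see the effect of the two divisors is to list every possible sample from a tiny population and average the results. The sketch below is only an illustration under stated assumptions: the population [7, 3, 6] is hypothetical, samples of size 2 are drawn with replacement, and the helper name `sum_sq_dev` is made up for this example.

```python
from itertools import product
import math

population = [7, 3, 6]  # hypothetical tiny population
N = len(population)
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance

n = 2  # sample size
# Every ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample's own mean."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)


# Average the two candidate formulas over all possible samples.
avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print("population variance:        ", sigma2)
print("average when dividing by n: ", avg_div_n)
print("average when dividing by n-1:", avg_div_n_minus_1)
print("n-1 matches population:", math.isclose(avg_div_n_minus_1, sigma2))
```

In this setup, dividing by n gives an average that is too small, while dividing by n-1 gives an average equal to the population variance. That is the practical reason for using n-1 with samples.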