Query on Variance/Standard Deviation formula for sample

AC

Oct 29, 2020, 11:58:10 AM
to Discussion forum for Statistics for Data Science I

Respected Faculty members and Support Team,

 I have a query:

I found that the population SD/variance formula :

1.png

can be reduced to the form :

2-2.png

(Please pardon the sudden change in notation for the population mean here, i.e. from µ to x.)

 My question is:

 Can we use such a reduced formula for calculating the sample variance or the sample SD (where we divide by N-1)?

 

Thanks in advance.

Best Regards,

Anirban

Anand Iyer

Oct 30, 2020, 1:35:00 AM
to Discussion forum for Statistics for Data Science I, AC
How did you derive this formula? It doesn't seem right, even for a population.

Sherlock Holmes

Oct 30, 2020, 2:08:53 AM
to Discussion forum for Statistics for Data Science I, anandd...@gmail.com, AC
It is a correct derivation: you can expand the square and obtain it. As for Anirban's question, yes, we can use it in this format. I was wondering about this same point yesterday and found this:
Screenshot (15).png
If you use the same derivation technique for this as well, you will see that it comes out in the same format, with n replaced by n-1.
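The attached images are not reproduced in this text, so here is a sketch of the expansion, assuming the formulas in question are the standard definition of population variance and its usual shortcut form. It uses only the fact that the population mean is µ = (1/N) Σ x_i:

```latex
\begin{align*}
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2 \\
         &= \frac{1}{N}\sum_{i=1}^{N}x_i^2 \;-\; \frac{2\mu}{N}\sum_{i=1}^{N}x_i \;+\; \frac{1}{N}\cdot N\mu^2 \\
         &= \frac{1}{N}\sum_{i=1}^{N}x_i^2 \;-\; 2\mu^2 \;+\; \mu^2 \\
         &= \frac{1}{N}\sum_{i=1}^{N}x_i^2 \;-\; \mu^2
\end{align*}
```

The middle term collapses because Σ x_i = Nµ, which is why the divisor used for the mean matters in this derivation.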

AC

Oct 30, 2020, 2:15:56 AM
to Discussion forum for Statistics for Data Science I, Sherlock Holmes, anandd...@gmail.com, AC
Dear Sherlock,

Thanks a ton for the input!!

See - that was elementary, my dear Anand ! :)

Best Regards,

(an ardent fan of Sir Arthur)

Anand Iyer

Oct 30, 2020, 2:22:49 AM
to AC, Discussion forum for Statistics for Data Science I, Sherlock Holmes
Brilliant!

Can you please post the proof, if you don't mind...It's very useful.
--
Cheers,

Sherlock Holmes

Oct 30, 2020, 2:24:50 AM
to Discussion forum for Statistics for Data Science I, anandd...@gmail.com, Discussion forum for Statistics for Data Science I, Sherlock Holmes, AC
Sure, just give me a moment.

Sherlock Holmes

Oct 30, 2020, 2:39:10 AM
to Discussion forum for Statistics for Data Science I, Sherlock Holmes, anandd...@gmail.com, Discussion forum for Statistics for Data Science I, AC
Excuse my awful handwriting
Adobe Scan 30 Oct 2020.pdf

Anand Iyer

Oct 30, 2020, 2:46:45 AM
to Sherlock Holmes, Discussion forum for Statistics for Data Science I, AC
Wow.  Sir Holmes!
--
Cheers,

Sherlock Holmes

Oct 30, 2020, 2:48:55 AM
to Discussion forum for Statistics for Data Science I, anandd...@gmail.com, Discussion forum for Statistics for Data Science I, AC, Sherlock Holmes
It's Madam Holmes in my case :) :)

AC

Oct 30, 2020, 2:53:35 AM
to Discussion forum for Statistics for Data Science I, Sherlock Holmes, anandd...@gmail.com, Discussion forum for Statistics for Data Science I, AC
Well, it must be Enola using Sherlock's alias, then?

Excellent!!

Just one thing: the formula uses "x" and "n", so it applies to the population variance, right?

[as opposed to µ, which is normally used for the population mean in the formula]

So do we get to substitute n with (n-1) in the derived formula to get the desired result for sample variance problems?

 

Best.

Sherlock Holmes

Oct 30, 2020, 3:04:57 AM
to Discussion forum for Statistics for Data Science I, AC, Sherlock Holmes, anandd...@gmail.com, Discussion forum for Statistics for Data Science I
No, there is a difference in the n notation. If you look at the picture I sent at first, it has N for the population variance, where N is the population size, but the sample variance uses n-1, NOT N-1, where n is the sample size. There is a subtle difference. I will get back to you on this; I am researching it a bit.

Sherlock Holmes

Oct 30, 2020, 3:42:15 AM
to Discussion forum for Statistics for Data Science I, Sherlock Holmes, AC, anandd...@gmail.com, Discussion forum for Statistics for Data Science I
Okay, so after quite a lot of research I can conclude the following.
The picture below shows the actual difference, as opposed to the confusion created by n and n-1. It's always best to use the shortened form, but like me, I guess you were also curious about what the long form would look like. I was a little wrong to generalize the proof to both sample and population (I had learnt it in maths and felt it would hold true). But exceptions are what make a subject beautiful.
Screenshot (19).png
According to the proof I sent earlier, the sample mean should also have had n-1 in its denominator, but IT IS CERTAINLY NOT SO.
I HAVE VERIFIED THE ABOVE FORMULA IN GOOGLE SHEETS. YOU CAN PLUG IN THE VALUES AND CHECK; it fits.
Screenshot (21).png
CONCLUSION
SAMPLE MEAN IS SUM OF ALL VALUES IN THE SAMPLE / NO. OF ELEMENTS IN SAMPLE, AND
POPULATION MEAN IS SUM OF ALL VALUES IN THE POPULATION / NO. OF ELEMENTS IN POPULATION.
Hence, Anirban, the formula in the picture you sent initially does not hold for a sample if you just replace n with n-1. How we can use it for a sample is what I have explained with pic 1 in this message.
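The screenshots are not reproduced in this text. As a rough check of the conclusion above, here is a minimal Python sketch, assuming the formula in pic 1 is the usual shortcut s² = (Σx² − n·x̄²)/(n−1) with x̄ = Σx/n. The data values and function names are made up for illustration, and Python's statistics module stands in for the Google Sheets check.

```python
import math
import statistics


def sample_variance_shortcut(values):
    """Shortcut form: (sum of squares - n * mean^2) / (n - 1)."""
    n = len(values)
    mean = sum(values) / n  # the sample mean still divides by n
    sum_sq = sum(v * v for v in values)
    return (sum_sq - n * mean * mean) / (n - 1)


def naive_substitution(values):
    """Population shortcut with n blindly swapped for n - 1 (not valid)."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum(v * v for v in values)
    return sum_sq / (n - 1) - mean * mean


# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("shortcut:          ", sample_variance_shortcut(data))
print("statistics module: ", statistics.variance(data))
print("naive substitution:", naive_substitution(data))
```

On this data the shortcut agrees with statistics.variance, while the naive substitution gives a different value.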

SUHANA PARVEEN S

Oct 30, 2020, 4:13:57 AM
to Discussion forum for Statistics for Data Science I, Sherlock Holmes, anandd...@gmail.com, Discussion forum for Statistics for Data Science I, AC
 Madam Sherlock!!

This is the equation I have been using since my school days!!

Found one more person using this!! :-) :-)

AC

Oct 30, 2020, 11:36:00 AM
to Sherlock Holmes, Discussion forum for Statistics for Data Science I, anandd...@gmail.com
Madam Sherlock Holmes,

Case solved and all sewn up!
Thank you so much for your contribution.
Your reply has helped me clarify my doubt.

Best.

Sherlock Holmes

Oct 30, 2020, 12:48:12 PM
to Discussion forum for Statistics for Data Science I, AC, Discussion forum for Statistics for Data Science I, anandd...@gmail.com, Sherlock Holmes
Great!! Happy to help!

Abhishek Kumar

Oct 30, 2020, 3:08:04 PM
to Discussion forum for Statistics for Data Science I, Sherlock Holmes, AC, Discussion forum for Statistics for Data Science I, anandd...@gmail.com
The discussion forum is really a good place to learn a lot of good stuff. Thanks, Anirban, for sharing a short method for population variance. I researched a bit and found everything about this on Khan Academy.

Malabika Guha Mustafi

Oct 30, 2020, 10:01:33 PM
to Discussion forum for Statistics for Data Science I, abhin...@gmail.com, Sherlock Holmes, AC, Discussion forum for Statistics for Data Science I, Anand Iyer
@AC thank you for pointing out the formula.
@Sherlock thanks for the proof.
So, sample variance = (n/(n-1)) × population variance
(where n is the same in both, i.e. both are computed on the same finite data set).
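This relation is easy to check numerically. A minimal sketch, assuming Python's standard statistics module (variance divides by n-1, pvariance divides by n); the data values and the helper name are made up for illustration:

```python
import math
import statistics


def sample_from_population_variance(values):
    """Rescale the population variance of a data set by n / (n - 1)."""
    n = len(values)
    return n / (n - 1) * statistics.pvariance(values)


# Hypothetical data set, treated once as a population and once as a sample.
scores = [12, 15, 11, 18, 14, 20]

print("n/(n-1) * pvariance:", sample_from_population_variance(scores))
print("variance:           ", statistics.variance(scores))
```

Both printed values should match up to floating-point rounding.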

Abhishek Kumar

Oct 31, 2020, 5:00:34 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, Abhishek Kumar, Sherlock Holmes, AC, Discussion forum for Statistics for Data Science I, anandd...@gmail.com
That one short method for finding population variance led me to dig up various short methods for finding covariance as well as the correlation coefficient. I tried all of these on the non-graded assignments and am getting the correct results.
IMG_4871.JPG
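The attached image is not reproduced in this text. As an illustration only, here is a Python sketch assuming the short methods are the standard computational forms: sample covariance = (Σxy − n·x̄·ȳ)/(n−1), and r = (Σxy − n·x̄·ȳ) / sqrt((Σx² − n·x̄²)(Σy² − n·ȳ²)). The function names and data are hypothetical, and the definition-based covariance is included only as a cross-check.

```python
import math


def covariance_shortcut(xs, ys):
    """Sample covariance via (sum of xy - n * mean_x * mean_y) / (n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - n * mean_x * mean_y) / (n - 1)


def covariance_definition(xs, ys):
    """Sample covariance straight from the definition, for comparison."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation_shortcut(xs, ys):
    """Pearson correlation from the same shortcut sums."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * mean_x * mean_y
    s_xx = sum(x * x for x in xs) - n * mean_x * mean_x
    s_yy = sum(y * y for y in ys) - n * mean_y * mean_y
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical paired observations.
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

print("covariance (shortcut):  ", covariance_shortcut(hours, marks))
print("covariance (definition):", covariance_definition(hours, marks))
print("correlation:            ", correlation_shortcut(hours, marks))
```

Note that the n-1 divisors cancel in the correlation coefficient, so it comes out the same whether the data are treated as a sample or as a population.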

Malabika Guha Mustafi

Oct 31, 2020, 5:50:26 AM
to Discussion forum for Statistics for Data Science I, abhin...@gmail.com, Malabika Guha Mustafi, Sherlock Holmes, AC, Discussion forum for Statistics for Data Science I, Anand Iyer
:) Thanks. Now the forum really serves its purpose.