Shortcut Formulae For Variance, SD, Covariance and Correlation Coefficient

CG

Nov 15, 2020, 11:20:02 PM
to Discussion forum for Statistics for Data Science I
I have compiled a list of shortcut formulae (also called computation formulae) for Variance, SD, Covariance and Correlation Coefficient in the attached PDF. The reduction steps are also included. On the last page there is a table that will help in remembering and doing the calculations quickly.

Thanks to Akshay and another person (cannot find the name now) for posting this on the group earlier.


Statistics 1 Shortcut Formulae.pdf

Antony

Nov 15, 2020, 11:26:36 PM
to Discussion forum for Statistics for Data Science I, CG
Thanks! This helps lazy people like me.

Malabika Guha Mustafi

Nov 16, 2020, 12:21:34 AM
to Discussion forum for Statistics for Data Science I, Antony, CG
Good one. Thanks

Tejasvi Hegde

Nov 16, 2020, 12:27:06 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, Antony, CG
@CG Thanks!
That's really helpful.

Ajay Kumar

Nov 16, 2020, 12:32:08 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, Antony, CG
Thanks. We can save time using these formulae

MANOJ KUMAR Narayan

Nov 16, 2020, 12:44:06 AM
to Discussion forum for Statistics for Data Science I, CG
good job dear !!

Satya Mohapatra

Nov 16, 2020, 1:26:42 AM
to Discussion forum for Statistics for Data Science I, CG
good one. 

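I see that the covariance and correlation coefficient fields are left as (?).

Covariance -> Add a constant -> Remains the same (because the differences x - xmean are unchanged).
Covariance -> Multiply by a constant c -> Becomes c^2 times the original when both x and y are multiplied by c (c times the original if only one of them is).

Correlation coefficient -> Add a constant -> Remains the same (because the differences x - xmean and the SDs are unchanged).
Correlation coefficient -> Multiply by a constant -> Remains the same for a positive constant (because it is a ratio); the sign flips if only one variable is multiplied by a negative constant.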
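
The effect of adding or multiplying by a constant can be checked numerically. The sketch below is only an illustration: the data, the constant c = 3, and the helper names (mean, covariance, correlation) are made up for this example and do not come from the PDF. It uses population covariance in the shortcut form mean(xy) - mean(x)*mean(y).

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance, shortcut form: mean(xy) - mean(x) * mean(y)."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / n - mean(xs) * mean(ys)

def correlation(xs, ys):
    """Correlation coefficient: cov(x, y) / (sd(x) * sd(y))."""
    return covariance(xs, ys) / (covariance(xs, xs) * covariance(ys, ys)) ** 0.5

# Made-up data and constant, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0]
c = 3.0

shifted_x = [x + c for x in xs]
shifted_y = [y + c for y in ys]
scaled_x = [c * x for x in xs]
scaled_y = [c * y for y in ys]

base_cov = covariance(xs, ys)
base_corr = correlation(xs, ys)

cov_shifted = covariance(shifted_x, shifted_y)      # same as base_cov
cov_both_scaled = covariance(scaled_x, scaled_y)    # c**2 * base_cov
cov_one_scaled = covariance(scaled_x, ys)           # c * base_cov
corr_shifted = correlation(shifted_x, shifted_y)    # same as base_corr
corr_scaled = correlation(scaled_x, scaled_y)       # same as base_corr
corr_negated = correlation([-x for x in xs], ys)    # sign flips

print(base_cov, cov_shifted, cov_both_scaled, cov_one_scaled)
print(base_corr, corr_shifted, corr_scaled, corr_negated)
```

Note that the factor is c^2 only when both variables are scaled by c, and that the correlation coefficient keeps its magnitude but changes sign when exactly one variable is multiplied by a negative constant.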



Satya Mohapatra

Nov 16, 2020, 1:35:46 AM
to Discussion forum for Statistics for Data Science I, Satya Mohapatra, CG
Covariance is affected by outliers.
For the correlation coefficient, I am not sure we can really say it is affected.
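
One way to explore the outlier question is to try it on a small data set. The sketch below uses made-up points that lie exactly on a line, then appends a single made-up outlier (6, -10); the helper names are assumptions for this example. It is one illustration, not a general proof.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance, shortcut form: mean(xy) - mean(x) * mean(y)."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / n - mean(xs) * mean(ys)

def correlation(xs, ys):
    """Correlation coefficient: cov(x, y) / (sd(x) * sd(y))."""
    return covariance(xs, ys) / (covariance(xs, xs) * covariance(ys, ys)) ** 0.5

# Made-up data: five points exactly on the line y = x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.0, 3.0, 4.0, 5.0]

# The same data with one made-up outlier appended.
xs_outlier = xs + [6.0]
ys_outlier = ys + [-10.0]

cov_clean = covariance(xs, ys)
corr_clean = correlation(xs, ys)
cov_outlier = covariance(xs_outlier, ys_outlier)
corr_outlier = correlation(xs_outlier, ys_outlier)

print(cov_clean, corr_clean)
print(cov_outlier, corr_outlier)
```

In this example both the covariance and the correlation coefficient change sign once the outlier is added, so the correlation coefficient can be affected by outliers too.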

Cherian George

Nov 17, 2020, 2:10:36 AM
to Manitha Tp, Discussion forum for Statistics for Data Science I
Solution using the shortcut formula is given below.

x   = 1*7,2*8,3*6,4*4,5*7,6*8
x^2 = 1*7,4*8,9*6,16*4,25*7,36*8
Sum of x = 7+16+18+16+35+48 = 140
Mean = 140/(7+8+6+4+7+8) = 140/40 = 7/2
Mean^2 = 49/4
Sum of x^2 = 7+32+54+64+175+288 = 620
N = 40

Population variance formula
= Sum(x^2)/N - Mean^2
= 620/40 - 49/4 = 13/4 = 3.25
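
The arithmetic above can be double-checked with a few lines of Python. This is a minimal sketch: the values 1 to 6 and the frequencies 7, 8, 6, 4, 7, 8 are read off the products listed in the solution, and the variable names are chosen for this example. It compares the shortcut formula with the definition of population variance.

```python
values = [1, 2, 3, 4, 5, 6]
freqs = [7, 8, 6, 4, 7, 8]

n = sum(freqs)                                             # 40
sum_fx = sum(f * x for x, f in zip(values, freqs))         # 140
sum_fx2 = sum(f * x ** 2 for x, f in zip(values, freqs))   # 620
mean = sum_fx / n                                          # 3.5

# Shortcut formula: Sum(x^2)/N - Mean^2
variance_shortcut = sum_fx2 / n - mean ** 2

# Definition: Sum((x - Mean)^2)/N, weighted by frequency
variance_definition = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n

print(variance_shortcut)    # 3.25
print(variance_definition)  # 3.25
```

Both routes give 3.25, but the shortcut needs only the two running sums, which is why it is quicker by hand.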



On Tue, Nov 17, 2020 at 11:40 AM Manitha Tp <mani...@gmail.com> wrote:
Hello, 

Could you please tell me how to do this question using the shortcut formula you mentioned?

Thanks

--
Manitha Sabarinathan
M.Tech(Bioinformatics)


Tejasvi Hegde

Nov 17, 2020, 3:52:16 AM
to Discussion forum for Statistics for Data Science I, CG
I think the same formula (without the denominator) can be applied to SSE as well?

Cherian George

Nov 17, 2020, 4:58:55 AM
to Manitha Tp, Discussion forum for Statistics for Data Science I
@Manitha
Yes. The shortcut formula for variance will also work when class intervals are given. 
I have done a sample problem in the image attached below.
Screen Shot 2020-11-17 at 3.25.44 PM.png
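
For the class interval case, a rough sketch of the same idea is shown here. The intervals and frequencies are hypothetical (they are not the ones in the attached image): each class is represented by its midpoint, and the shortcut formula is then applied to the midpoints.

```python
# Hypothetical class intervals (lower, upper) and their frequencies.
intervals = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [2, 3, 4, 1]

# Represent every class by its midpoint.
midpoints = [(low + high) / 2 for low, high in intervals]   # 5, 15, 25, 35

n = sum(freqs)
sum_fx = sum(f * m for m, f in zip(midpoints, freqs))
sum_fx2 = sum(f * m ** 2 for m, f in zip(midpoints, freqs))
mean = sum_fx / n

# Shortcut formula applied to the midpoints: Sum(x^2)/N - Mean^2
variance = sum_fx2 / n - mean ** 2
sd = variance ** 0.5

print(mean, variance, sd)
```

Because the midpoints stand in for the raw observations, the result is an approximation of the variance of the underlying data.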

On Tue, Nov 17, 2020 at 2:44 PM Manitha Tp <mani...@gmail.com> wrote:
Thanks, but what about the class interval case? We need to calculate the midpoints and then use the same formula, right? Your shortcut methods saved a huge amount of time. Thanks for sharing.
--
Manitha Sabarinathan
M.Tech(Bioinformatics)


Cherian George

Nov 17, 2020, 5:24:53 AM
to Tejasvi Hegde, Discussion forum for Statistics for Data Science I
@Tejasvi

I just re-read my own notes on SSE from Maths Lecture 24 (3.5) Straight Line Fit. 
  • We start with some observations of x and y. (Like measured values of voltage and current)
  • Propose a line equation y=mx+c that is closest to the points
  • The equation can also be written as y-mx-c=0 . Only if all the points lie exactly on the line will it be equal to zero for every ordered pair substituted in the equation. Otherwise it gives a positive or negative value. 
  • In order to find SSE, we substitute all observed points (ordered pairs) in y-mx-c and square the result. The sum of the squares is SSE
  • SSE = Summation [(y-mx-c)^2]
The best fit is when SSE is at its minimum; this is the motivation for least squares.
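
The steps above can be sketched in Python as follows. The observations and the two candidate lines are made up for illustration, and the function name sse is an assumption for this example; it simply substitutes each observed pair into y - mx - c, squares the result, and sums.

```python
def sse(xs, ys, m, c):
    """Sum of squared errors of the line y = m*x + c over the observed points."""
    return sum((y - m * x - c) ** 2 for x, y in zip(xs, ys))

# Made-up observations that lie close to the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.0]

sse_close_line = sse(xs, ys, m=2.0, c=0.0)   # candidate line y = 2x
sse_far_line = sse(xs, ys, m=1.0, c=1.0)     # candidate line y = x + 1

# If every point lies exactly on the line, each term is zero and so is SSE.
sse_exact = sse([0.0, 1.0], [1.0, 3.0], m=2.0, c=1.0)

print(sse_close_line, sse_far_line, sse_exact)
```

The line with the smaller SSE is the better fit of the two candidates; the least squares line is the choice of m and c that makes SSE as small as possible.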

So it would be incorrect to use the numerator of variance to find SSE. Mean values are not even considered in the original SSE equation.
Maybe you could make this correction in your notes as well. 

Thanks

Tejasvi Hegde

Nov 17, 2020, 12:52:19 PM
to Cherian George, Discussion forum for Statistics for Data Science I
Thanks @CG

--

Regards
Tejasvi