COVARIANCE SHORTCUT: Easy calculations


Akshay Malik

Nov 14, 2020, 11:43:09 PM
to Discussion forum for Statistics for Data Science I
Covariance Shortcut.PNG

Satish

Nov 15, 2020, 12:22:34 AM
to Discussion forum for Statistics for Data Science I, akshaym...@gmail.com


On Sunday, November 15, 2020 at 10:13:09 AM UTC+5:30 akshaym...@gmail.com wrote:
Covariance Shortcut.PNG

Malabika Guha Mustafi

Nov 15, 2020, 1:41:51 AM
to Discussion forum for Statistics for Data Science I, Satish, akshaym...@gmail.com
cool. Thanks

Priyanshu Singh

Nov 15, 2020, 1:58:44 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, Satish, akshaym...@gmail.com
This is a formula for population and sample covariance that differs from the one taught to us in the lecture. Are these completely correct?

Akshay Malik

Nov 15, 2020, 2:12:58 AM
to Discussion forum for Statistics for Data Science I, priyanshu...@gmail.com, Malabika Guha Mustafi, Satish, Akshay Malik
I checked them by finding the covariance on Google Sheets for the same dataset. But I would recommend you re-check them just to be safe!
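The attached image is not reproduced in the text of this thread, so the exact formula it shows is unknown. Assuming it is the usual computational shortcut, (Σxy − n·x̄·ȳ) divided by n for the population or n − 1 for the sample, a check like the Google Sheets one could be sketched in Python as below. The dataset and the function names are made up for illustration.

```python
def cov_definition(xs, ys, sample=True):
    """Covariance from the textbook definition: average product of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - 1 if sample else n)


def cov_shortcut(xs, ys, sample=True):
    """Covariance from the assumed shortcut: (sum(xy) - n * mean_x * mean_y) / divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - n * mean_x * mean_y) / (n - 1 if sample else n)


# Hypothetical dataset, not the one used in the thread.
xs = [2, 4, 6, 8, 10]
ys = [1, 3, 7, 9, 15]

for sample in (False, True):
    label = "sample" if sample else "population"
    print(label, cov_definition(xs, ys, sample), cov_shortcut(xs, ys, sample))
```

On small, well-scaled data like this the two versions should agree up to floating-point rounding.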

CG

Nov 15, 2020, 2:41:24 AM
to Discussion forum for Statistics for Data Science I, akshaym...@gmail.com, priyanshu...@gmail.com, Malabika Guha Mustafi, Satish
Hi Akshay,

I tried to derive these reductions 

Photo on 15-11-20 at 1.06 PM.jpg

Akshay Malik

Nov 15, 2020, 2:50:13 AM
to Discussion forum for Statistics for Data Science I, CG, Akshay Malik, priyanshu...@gmail.com, Malabika Guha Mustafi, Satish
The derivation is available on the internet. I picked up the derived result from there.
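Neither the attached image nor the handwritten derivation is reproduced in this thread. Assuming the shortcut in question is the standard computational form of the covariance, the reduction follows from expanding the product and using Σx_i = n·x̄ and Σy_i = n·ȳ:

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
  &= \sum_{i=1}^{n} x_i y_i - \bar{y}\sum_{i=1}^{n} x_i - \bar{x}\sum_{i=1}^{n} y_i + n\,\bar{x}\,\bar{y} \\
  &= \sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y} - n\,\bar{x}\,\bar{y} + n\,\bar{x}\,\bar{y} \\
  &= \sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}
   = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}
\end{align*}
```

Dividing the last expression by n gives the population covariance, and dividing by n − 1 gives the sample covariance.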

Malabika Guha Mustafi

Nov 15, 2020, 4:41:35 AM
to Discussion forum for Statistics for Data Science I, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Malabika Guha Mustafi, Satish
@ CG 
If you write the means of x and y as (Σ xi)/n and (Σ yi)/n, you will get the @akshayam formula.
But remember that it is susceptible to catastrophic cancellation.
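A rough illustration of the cancellation concern, assuming the shortcut has the form Σxy − (Σx)(Σy)/n: when the values are large compared with their spread, the two terms being subtracted are nearly equal and most of the significant digits cancel. The data and function names below are hypothetical.

```python
def cov_two_pass(xs, ys):
    """Sample covariance from deviations about the means (definition form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def cov_shortcut(xs, ys):
    """Sample covariance from the assumed shortcut: sum(xy) - sum(x) * sum(y) / n."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)


# Same spread in both datasets; the second is shifted by a large offset.
small_x = [1.0, 2.0, 3.0, 4.0]
small_y = [2.0, 4.0, 6.0, 8.0]
offset = 1e9
big_x = [offset + x for x in small_x]
big_y = [offset + y for y in small_y]

# Shifting the data does not change the true covariance, which is 10/3 here.
print("small data:", cov_two_pass(small_x, small_y), cov_shortcut(small_x, small_y))
print("large data:", cov_two_pass(big_x, big_y), cov_shortcut(big_x, big_y))
```

With the large offset, the deviation-based version stays at 10/3, while the shortcut subtracts two numbers near 4e18 and may drift away from it. For coursework-sized numbers this is unlikely to matter.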

Malabika Guha Mustafi

Nov 15, 2020, 6:21:35 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
@akshayam, have you checked this formula for practice assignment week 4 problem?

Mrinal Chandra

Nov 15, 2020, 6:26:16 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
@malabika can you please help me with question 10 practice assignment 3?

Swagat

Nov 15, 2020, 6:33:29 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
This formula seems to be working fine. Just calculated for the problem you mentioned.

Malabika Guha Mustafi

Nov 15, 2020, 6:37:57 AM
to Discussion forum for Statistics for Data Science I, motagi...@gmail.com, Malabika Guha Mustafi, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
@ motagi,
Please post it here. It may help me rectify my silly mistakes.

Malabika Guha Mustafi

Nov 15, 2020, 7:17:27 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, motagi...@gmail.com, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
@ Mrinal,
 The five-number summary of 99 observations of a numerical variable is 25, 35, 47, 56, 78. Based on this information, which of the following statements could be true?
Lowest value = 25
Highest value = 78

Range = 78 - 25 = 53

The median is 47, so it is the 50th value of the data set.


Q1 = 35 (99 × 0.25 = 24.75, so the 25th observation)
Q3 = 56 (99 × 0.75 = 74.25, so the 75th observation)

IQR = 56 - 35 = 21
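The positions used above can be reproduced with a small sketch. It assumes the convention implied by the numbers in this post, namely that the position of the p-th percentile among n sorted observations is n × p/100 rounded up; other textbooks and software use different conventions. The function name is made up.

```python
import math


def percentile_position(n, p):
    """1-based position of the p-th percentile, rounding n * p / 100 up (assumed convention)."""
    return math.ceil(n * p / 100)


n = 99
five_number_summary = {0: 25, 25: 35, 50: 47, 75: 56, 100: 78}

for p in (25, 50, 75, 80, 100):
    print(f"{p}th percentile -> observation {percentile_position(n, p)}")

print("Range:", five_number_summary[100] - five_number_summary[0])
print("IQR:", five_number_summary[75] - five_number_summary[25])
```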

Now apply the logic as stated in the solution PDF.
The 25th percentile is at the 25th observation.
Possible values for observations 2-24 are [25, 35]. (This means the 2nd observation and the others up to the 24th can take a minimum value of 25 and a maximum value of 35. Each may take any value, but not below 25, which is the lowest value of the data set, and not above 35, since Q1 = 35 is at the 25th observation.)

The same logic applies to observations 26-49 and to the other groups.

The 50th percentile is at the 50th observation.
Possible values for observations 26-49 are [35, 47].
The minimum possible value for observations 26-49 is 35.
The maximum possible value for observations 26-49 is 47.

The 75th percentile is at the 75th observation.
Possible values for observations 51-74 are [47, 56].

The minimum possible value for observations 51-74 is 47.
The maximum possible value for observations 51-74 is 56.

The 100th percentile is at the 99th observation.
Possible values for observations 76-98 are [56, 78].
The minimum possible value for observations 76-98 is 56.
The maximum possible value for observations 76-98 is 78.

Therefore, the minimum possible mean of the observations is
{(24 × 25) + (25 × 35) + (25 × 47) + (24 × 56) + 78}/99 = 4072/99 ≈ 41.13

Therefore, the maximum possible mean of the observations is
{(24 × 78) + (25 × 56) + (25 × 47) + (24 × 35) + 25}/99 = 5312/99 ≈ 53.66
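The two bounds can be checked with a short sketch. It builds the two extreme data sets described above: the five summary values are pinned at sorted positions 1, 25, 50, 75 and 99, and every other observation is pushed to the lowest or highest value its group allows. The variable names are made up.

```python
minimum, q1, median, q3, maximum = 25, 35, 47, 56, 78
n = 99

# Lowest case: each free observation sits at the bottom of its interval.
# Positions 1-24 -> 25, 25-49 -> 35, 50-74 -> 47, 75-98 -> 56, 99 -> 78.
lowest = [minimum] * 24 + [q1] * 25 + [median] * 25 + [q3] * 24 + [maximum]

# Highest case: each free observation sits at the top of its interval.
# Position 1 -> 25, 2-25 -> 35, 26-50 -> 47, 51-75 -> 56, 76-99 -> 78.
highest = [minimum] + [q1] * 24 + [median] * 25 + [q3] * 25 + [maximum] * 24

min_mean = sum(lowest) / n
max_mean = sum(highest) / n

print(f"minimum possible mean = {sum(lowest)}/{n} = {min_mean:.2f}")
print(f"maximum possible mean = {sum(highest)}/{n} = {max_mean:.2f}")
```

Any option that states a mean outside this interval cannot be true, while a mean inside it could be.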

80th percentile: 99 × 0.8 = 79.2, so it is the 80th observation.
Its value may be 78 (since the maximum possible value for observations 76 to 98 is 78).
We cannot say definitely that the value is 78, but it is possible.

Since the question asks which of the following statements could be true,
this option is true.
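To see why "could be true" is enough, here is a sketch of two hypothetical data sets that share the five-number summary 25, 35, 47, 56, 78 but differ at the 80th observation. It assumes the same position convention as above (the 80th percentile of 99 values is the 80th sorted observation).

```python
def observation(data, k):
    """k-th smallest value (1-based) of the data set."""
    return sorted(data)[k - 1]


# 80th observation pushed up to the maximum, 78.
data_high = [25] * 24 + [35] * 25 + [47] * 25 + [56] * 5 + [78] * 20
# 80th observation kept at Q3, 56.
data_low = [25] * 24 + [35] * 25 + [47] * 25 + [56] * 24 + [78]

for name, data in (("data_high", data_high), ("data_low", data_low)):
    summary = [observation(data, k) for k in (1, 25, 50, 75, 99)]
    print(name, "summary:", summary, "80th observation:", observation(data, 80))
```

Both data sets are consistent with the question, so the 80th percentile may be 78 but does not have to be.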

Hope it helps you. 

Malabika Guha Mustafi

Nov 15, 2020, 7:18:38 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, motagi...@gmail.com, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
@ Mrinal and @ Chandra are the same person, I assume :)

Mrinal Chandra

Nov 15, 2020, 7:29:13 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, motagi...@gmail.com, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
@malabika Yes, same person. My confusion is only with the second part, that is, the mean part. Why does it say 2-24 and not 1-24? Also, the addition part is not clear.

Mrinal Chandra

Nov 15, 2020, 7:31:19 AM
to Discussion forum for Statistics for Data Science I, Mrinal Chandra, Malabika Guha Mustafi, motagi...@gmail.com, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
@malabika also thank you for responding...i am stuck with this question

Mrinal Chandra

Nov 15, 2020, 7:34:34 AM
to Discussion forum for Statistics for Data Science I, Mrinal Chandra, Malabika Guha Mustafi, motagi...@gmail.com, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
why is it leaving both ends? that is 1st and 25th...and the same is being repeated in the rest of the calculation...

Mrinal Chandra

Nov 15, 2020, 7:37:29 AM
to Discussion forum for Statistics for Data Science I, Mrinal Chandra, Malabika Guha Mustafi, motagi...@gmail.com, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
78 is the maximum value. Why are we assuming it to be the 80th percentile?

Mrinal Chandra

Nov 15, 2020, 7:48:21 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, Mrinal Chandra, motagi...@gmail.com, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
ok..makes sense...thank you...
but i still cannot make out the 80 percentile thing...there are values between 56 and 78...why are we assuming it to be 78

On Sunday, 15 November 2020 at 18:12:12 UTC+5:30 Malabika Guha Mustafi wrote:
why is it leaving both ends? that is 1st and 25th...and the same is being repeated in the rest of the calculation...

The first one is already 25 and last one is 78. so no need for explanation.
No. it is not repeated in mean calculation.
Therefore, minimum possible mean of the observations is
{(24 × 25) + (25 × 35) + (25 × 47) + (24 × 56) + 78}/99
( first 25 including lowest end , then 25 ( from 26 th to 50 th observations), then 25 ( 51 st to 75 th observation) , then 24 ( 76 to 98 th observation ) and 78 (final 99 th Observation).
No end or point is excluded.

Malabika Guha Mustafi

Nov 15, 2020, 7:51:32 AM
to Discussion forum for Statistics for Data Science I, chandra...@gmail.com, Malabika Guha Mustafi, motagi...@gmail.com, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
why is it leaving both ends? that is 1st and 25th...and the same is being repeated in the rest of the calculation...

The first one is already 25 and the last one is 78, so no explanation is needed.
No, it is not repeated in the mean calculation.
Therefore, the minimum possible mean of the observations is
{(24 × 25) + (25 × 35) + (25 × 47) + (24 × 56) + 78}/99
(the first 24 including the lowest end, then 25 (25th to 49th observations), then 25 (50th to 74th observations), then 24 (75th to 98th observations), and 78 (the final, 99th observation)).
No end or point is excluded.
This is the right one. Please ignore the previous one.

Malabika Guha Mustafi

Nov 15, 2020, 7:54:44 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, chandra...@gmail.com, motagi...@gmail.com, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
For the 80th percentile, the value is between 56 and 78. You are absolutely correct. But read the question carefully.
The five-number summary of 99 observations of a numerical variable is 25, 35, 47, 56, 78. Based on this information, which of the following statements could be true?
It is not an absolute or definite value, but it could be true, so it is the right option.
Consider the case where they give the option that the mean is 42 or 52.
Then we would need to accept that option as well, since the value is within the possible range of the mean.

Mrinal Chandra

Nov 15, 2020, 7:59:26 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, Mrinal Chandra, motagi...@gmail.com, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
yeah right...maybe exam stress and anxiety doing things to me..it was easy...thank you so much

Malabika Guha Mustafi

Nov 15, 2020, 8:40:15 AM
to Discussion forum for Statistics for Data Science I, chandra...@gmail.com, Malabika Guha Mustafi, motagi...@gmail.com, akshaym...@gmail.com, CG, priyanshu...@gmail.com, Satish
@ moatg  AND @ akshaym, 
The formula is working for that particular problem. Thanks.
