Week 4 Practice Assignment


CG

Nov 1, 2020, 1:29:08 AM
to Discussion forum for Statistics for Data Science I

Question MCQ 6 / Question 18. Consider the set of points (2, 6), (3, 8), (4, 10), (5, 14), (10, n) in the XY-plane. What should the value of n be so that the correlation between the X-values and Y-values is 1?
a) 23
b) 26
c) 29
d) A value different from any of the above.
e) No value for n can make r = 1.
No value of n places all five points on a single straight line, so the correlation coefficient cannot equal 1. Therefore, option (e) is correct.

My doubt: What is the procedure for concluding that no linear relation is possible for the given data points?

Question MCQ 1 / Question 6. What can be said about the correlation coefficient r of x and y, where y = x^2 + 8x + 16 and x takes the values of the first ten positive integers?
a) r = 1
b) 0 < r < 1
c) −1 < r < 0
d) r = −1

My doubt: According to the Week 4 lectures, covariance and the correlation coefficient are measures of linear association, but the association in this question is quadratic.
Are we calculating r anyway because the domain is limited to the first 10 positive integers?

Another way to solve this is to examine the given options instead of calculating the correlation coefficient, as follows.
a) r = 1: If r were equal to 1 for this dataset, a single line (y = mx + c) would pass through all 10 points. Verify that this is not possible by finding the equation of the line through any two of the points (two-point form) and checking that the remaining points do not lie on it. So this option is not correct.
b) 0 < r < 1: This is possible, since we know that r should be positive.
c) −1 < r < 0: Since Y increases as X increases, the correlation should be positive, so this option is not correct.
d) r = −1: This cannot be true either, for the same reasons that rule out c (r must be positive) and a (the points are not collinear).
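The check described under option a) can be sketched in Python. This is only an illustration: the choice of the first two points and the variable names are assumptions, not part of the assignment.

```python
# Data from the question: y = x^2 + 8x + 16 for x = 1..10.
xs = list(range(1, 11))
ys = [x**2 + 8*x + 16 for x in xs]

# Line through the first two points, (1, 25) and (2, 36), in two-point form.
m = (ys[1] - ys[0]) / (xs[1] - xs[0])   # slope
c = ys[0] - m * xs[0]                   # intercept

# Points that the line misses; any entry here rules out r = 1.
off_line = [(x, y) for x, y in zip(xs, ys) if y != m * x + c]
print(m, c, off_line[0])
```

The line through the first two points is y = 11x + 14, which predicts 47 at x = 3, while the actual value is 49. One missed point is enough to exclude option a).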

However, the calculated correlation coefficient is dangerously close to 1 (about 0.99), so the method above may not be watertight. I used it because the question itself specifies that the association is quadratic.

[Attached image: Screen Shot 2020-11-01 at 11.53.36 AM.png]


I manually calculated the correlation coefficient for all the questions in the Practice Assignment. Using a calculator, this takes me around 20-30 minutes.
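For anyone who wants to cross-check a calculator result, here is a minimal Python sketch of the same calculation for Question 6. The function name `pearson_r` is a hypothetical helper, and it assumes the usual definition r = Sxy / sqrt(Sxx * Syy) with sums of deviations from the means.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation from sums of deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Question 6 data: y = x^2 + 8x + 16 for the first ten positive integers.
xs = list(range(1, 11))
ys = [x**2 + 8*x + 16 for x in xs]

r = pearson_r(xs, ys)
print(round(r, 4))
```

The result is about 0.99: positive and strictly below 1, which agrees with option b).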


Anand Iyer

Nov 1, 2020, 10:53:11 PM
to Discussion forum for Statistics for Data Science I, CG
Here's a much easier way.  

If the correlation is equal to 1 (the highest possible value), the points have a perfect linear relationship.

In that case, there must be a single line through all these points. Thus, every pair of points should give the same slope.

(2, 6) and (3, 8) give a slope of 2.

But (4, 10) and (5, 14) give a slope of 4.

Thus, the points do not lie on a single line, whatever the value of n.
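The slope comparison can also be scripted. The sketch below assumes only the four fully known points from Question 18; the variable names are illustrative, and `Fraction` is used so the slopes are compared exactly.

```python
from fractions import Fraction
from itertools import combinations

# The four points of Question 18 whose coordinates are fully known.
known = [(2, 6), (3, 8), (4, 10), (5, 14)]

# Exact slope of the line through every pair of points.
slopes = {
    Fraction(y2 - y1, x2 - x1)
    for (x1, y1), (x2, y2) in combinations(known, 2)
}

# All points lie on one line only if every pair gives the same slope.
collinear = len(slopes) == 1
print(sorted(slopes), collinear)
```

Because more than one distinct slope appears, the known points are already not collinear, so the fifth point (10, n) cannot fix this.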

Statistics 1 Support 1

Nov 2, 2020, 12:19:05 AM
to Discussion forum for Statistics for Data Science I, anandd...@gmail.com, CG
Hi,
Since the correlation is required to be 1, all the points must lie on a straight line. But the four given points already do not lie on one line, so no value of n can put all five points on a straight line.
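As a numerical illustration of this point (not a proof), the sketch below computes r for a range of candidate values of n. The range 0 to 100 and the helper name `pearson_r` are assumptions chosen for the example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation from sums of deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [2, 3, 4, 5, 10]
fixed_ys = [6, 8, 10, 14]

# Assumed search range for n; it covers options a), b) and c).
r_by_n = {n: pearson_r(xs, fixed_ys + [n]) for n in range(0, 101)}

best_n = max(r_by_n, key=r_by_n.get)
print(best_n, r_by_n[best_n])
print(r_by_n[23], r_by_n[26], r_by_n[29])
```

Every value in the scan stays below 1, including n = 23, 26 and 29, which is consistent with option (e).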

Que 6) Here the points are generated by a quadratic equation, so they lie on a parabola rather than a straight line, and the value of r cannot be 1. Your reasoning about this question is right.

Thanks,
Nitin Jha
Course support team

CG

Nov 2, 2020, 6:02:09 AM
to Discussion forum for Statistics for Data Science I, anandd...@gmail.com, cheria...@gmail.com
Thank you.