When to use the correlation coefficient and when to use the point-biserial correlation coefficient

K Venkat Krishna Rao

Nov 21, 2020, 5:35:16 AM
to Discussion forum for Statistics for Data Science I
In the question shown in the image below, there appear to be two variables: class type (X, Y) and students' marks. I thought we had to apply the correlation between a numerical and a categorical variable, but the solution uses the correlation between two numerical variables. It's very confusing when to use one over the other. Will the question specify which one to use?

rakesh ky

Nov 21, 2020, 5:59:52 AM
to Discussion forum for Statistics for Data Science I, K Venkat Krishna Rao
Same doubt.
I also need to know how to find the point-biserial correlation coefficient.

Anand Iyer

Nov 21, 2020, 5:59:56 AM
to Discussion forum for Statistics for Data Science I, K Venkat Krishna Rao
The image is missing. Can you reattach it?

K Venkat Krishna Rao

Nov 21, 2020, 6:27:53 AM
to Discussion forum for Statistics for Data Science I, anandd...@gmail.com, K Venkat Krishna Rao
Hi, yes, here it is.
Que Image.png

Anand Iyer

Nov 21, 2020, 6:33:37 AM
to Discussion forum for Statistics for Data Science I, K Venkat Krishna Rao, Anand Iyer
As far as I can see, there are two numerical variables. Where did you see the categorical variable?

K Venkat Krishna Rao

Nov 21, 2020, 6:38:43 AM
to Discussion forum for Statistics for Data Science I, anandd...@gmail.com, K Venkat Krishna Rao
The difficulty is in the interpretation. You can interpret both as numerical variables, or as one numerical and one categorical variable. For two numerical variables, the lecture examples were age and price of a car, or age and IQ of students, which are two different kinds of numerical quantity. Here the variable is marks, split into two categories: class X and class Y. It looks like a relationship between a categorical and a numerical variable, and that creates a lot of confusion.

Anand Iyer

Nov 21, 2020, 6:43:03 AM
to K Venkat Krishna Rao, Discussion forum for Statistics for Data Science I
I don't relate to what you're saying at all...

To me, both are numerical variables.
--
Cheers,

Boss Annapillai

Nov 21, 2020, 6:44:22 AM
to Discussion forum for Statistics for Data Science I, K Venkat Krishna Rao, anandd...@gmail.com
The point-biserial correlation coefficient applies only when one of the variables is dichotomous (e.g. M, F).
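
For reference, a commonly used form of the point-biserial coefficient is sketched below. The symbols are not from this thread and are assumptions for illustration: $M_1$ and $M_0$ are the means of the numerical variable in the two groups, $n_1$ and $n_0$ are the group sizes, $n = n_1 + n_0$, and $s_n$ is the standard deviation of all $n$ values computed with divisor $n$.

```latex
r_{pb} = \frac{M_1 - M_0}{s_n} \sqrt{\frac{n_1 \, n_0}{n^2}}
```

This gives the same value as the ordinary correlation coefficient computed after coding the two groups as 1 and 0.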

K Venkat Krishna Rao

Nov 21, 2020, 6:57:06 AM
to Discussion forum for Statistics for Data Science I, boss...@gmail.com, K Venkat Krishna Rao, anandd...@gmail.com
Aren't Class X and Class Y dichotomous?

Anand Iyer

Nov 21, 2020, 7:31:46 AM
to K Venkat Krishna Rao, Discussion forum for Statistics for Data Science I, boss...@gmail.com
I think you're complicating things a lot more than required...

We're not trying to measure an association between class type and students' marks. Instead, we're checking whether there's an association between the students' marks in the two classes.

This problem is different from the Gender-Marks problem.
--
Cheers,

Statistics 1 Support 1

Nov 21, 2020, 11:32:31 AM
to Discussion forum for Statistics for Data Science I, anandd...@gmail.com, Discussion forum for Statistics for Data Science I, boss...@gmail.com, kvkri...@gmail.com
If one variable is categorical with exactly two categories (dichotomous) and the other is numerical, use the point-biserial correlation. If both variables are numerical, use the ordinary correlation coefficient.
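
The two cases can be sketched in Python as follows. This is only an illustration: the marks and group labels are made-up data (the question image is not reproduced in this thread), and the function names `pearson_r` and `point_biserial_r` are hypothetical helpers written for this example, not part of any course material. The sketch assumes the dichotomous variable is coded as 0 and 1.

```python
import math

def pearson_r(x, y):
    """Correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def point_biserial_r(groups, values):
    """Point-biserial correlation; `groups` holds 0/1 labels."""
    n = len(values)
    ones = [v for g, v in zip(groups, values) if g == 1]
    zeros = [v for g, v in zip(groups, values) if g == 0]
    n1, n0 = len(ones), len(zeros)
    mean_all = sum(values) / n
    # Standard deviation of all values, using divisor n
    s_n = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    m1 = sum(ones) / n1
    m0 = sum(zeros) / n0
    return (m1 - m0) / s_n * math.sqrt(n1 * n0 / n ** 2)

# Case 1 (hypothetical data): two numerical variables, paired marks
marks_x = [50, 60, 70, 80, 90]
marks_y = [55, 65, 70, 85, 95]
r_numeric = pearson_r(marks_x, marks_y)

# Case 2 (hypothetical data): one dichotomous variable (coded 0/1)
# and one numerical variable
group = [0, 0, 0, 1, 1, 1]
marks = [40, 50, 60, 70, 80, 90]
r_pb = point_biserial_r(group, marks)

print(round(r_numeric, 4), round(r_pb, 4))
```

In Case 2, `point_biserial_r(group, marks)` and `pearson_r(group, marks)` agree up to floating-point error, which is why the point-biserial coefficient is often described as a special case of the ordinary correlation coefficient.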

Thanks & Regards,
Ram,
Course Support Team
