Query about the point-biserial correlation coefficient

Deepshikha Sharma

Nov 17, 2020, 11:51:54 PM
to Discussion forum for Statistics for Data Science I
Hello,

As per the solution to Question 3 in the mock test, for a sample of 10 employees, the point-biserial correlation coefficient is computed using the population standard deviation.
But in Tutorial 4 of the week 4 lectures, the point-biserial correlation coefficient is calculated using the sample standard deviation.

Please confirm whether we have to use the sample standard deviation or the population standard deviation when calculating the point-biserial correlation coefficient.

Thanks,
Deepshikha

Jai Hanani

Nov 18, 2020, 12:10:32 AM
to Discussion forum for Statistics for Data Science I, Deepshikha Sharma
It depends on the question. In the mock test question, the whole population is given (10 employees and their salaries), so we had to use the population SD.
In the Chennai COVID-19 question in one of the graded assignments, we have to use the sample SD, because the whole population cannot possibly be given.

Deepshikha Sharma

Nov 18, 2020, 12:19:17 AM
to Discussion forum for Statistics for Data Science I, jaiha...@gmail.com, Deepshikha Sharma
In the mock test question, a sample is given, not a population, yet the population standard deviation is used to calculate the point-biserial coefficient.
Mock test Question 3 reads: "Use the following information and data given in Table S.1 to answer the questions 3 and 4
In an organization, data from a sample of 10 employees is collected. This data includes gender, age, and salary of employees and is given in Table S.1."

Please check and confirm.

Thanks,
Deepshikha Sharma

CG

Nov 18, 2020, 2:17:55 AM
to Discussion forum for Statistics for Data Science I, Deepshikha Sharma, jaiha...@gmail.com
I am also interested in knowing the answer to this. Hope the course team will respond. 

Statistics 1 Support 1

Nov 18, 2020, 2:39:03 AM
to Discussion forum for Statistics for Data Science I, CG, Deepshikha Sharma, jaiha...@gmail.com
Hello,

You can use the sample standard deviation to calculate the point-biserial coefficient, but in that case you need to use the sample correction in the formula: instead of sqrt(p0*p1), use sqrt[(n0/(n-1))*(n1/n)]. If you are using the population standard deviation, use sqrt(p0*p1), where p0 = n0/n and p1 = n1/n.
There is a mistake in Tutorial 4; we will send an announcement.
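
To see how the two options relate, here is a minimal Python sketch. It assumes the usual form of the coefficient, (mean of group 1 - mean of group 0) / SD * factor, with the factor taken as sqrt(p0*p1) for the population SD and sqrt[(n0/(n-1))*(n1/n)] for the sample SD. The function name `point_biserial`, its parameters, and the salary and gender values are made up for illustration; they are not the data from the mock test or the tutorial.

```python
import math
import statistics


def point_biserial(values, groups, use_sample_sd=False):
    """Point-biserial correlation between a numeric variable and a 0/1 variable.

    values: numeric observations (e.g. salaries)
    groups: 0/1 labels of the same length (e.g. a coded gender column)
    use_sample_sd: if True, use the sample SD together with the corrected factor
    """
    group1 = [v for v, g in zip(values, groups) if g == 1]
    group0 = [v for v, g in zip(values, groups) if g == 0]
    n1, n0 = len(group1), len(group0)
    n = n1 + n0

    mean_diff = statistics.fmean(group1) - statistics.fmean(group0)

    if use_sample_sd:
        # Sample SD (divides by n - 1) paired with sqrt[(n0/(n-1)) * (n1/n)]
        sd = statistics.stdev(values)
        factor = math.sqrt((n0 / (n - 1)) * (n1 / n))
    else:
        # Population SD (divides by n) paired with sqrt(p0 * p1)
        sd = statistics.pstdev(values)
        factor = math.sqrt((n0 / n) * (n1 / n))

    return mean_diff / sd * factor


# Hypothetical data: 10 salaries and a 0/1 group label for each employee
salaries = [30, 42, 35, 50, 28, 45, 39, 55, 33, 48]
gender = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

r_pop = point_biserial(salaries, gender)
r_samp = point_biserial(salaries, gender, use_sample_sd=True)
print(r_pop, r_samp)
```

Under these assumptions both calls should print the same value, because the sample SD is larger than the population SD by a factor of sqrt(n/(n-1)), and the corrected factor is larger than sqrt(p0*p1) by exactly the same amount. What goes wrong is mixing the two, for example the sample SD with sqrt(p0*p1).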

Thanks
Nikita
Statistics course support team

Deepshikha Sharma

Nov 18, 2020, 4:20:55 AM
to Discussion forum for Statistics for Data Science I, stats1-...@onlinedegree.iitm.ac.in, CG, Deepshikha Sharma, jaiha...@gmail.com
Hi Nikita,

Thank you so much for clarifying the above query.

Thanks & Regards,
Deepshikha Sharma

Jai Hanani

Nov 18, 2020, 4:38:28 AM
to Discussion forum for Statistics for Data Science I, Deepshikha Sharma, Jai Hanani
Indeed. I was mistaken. You are right. 

Shinas MN

Nov 26, 2020, 4:17:19 AM
to Discussion forum for Statistics for Data Science I, jaiha...@gmail.com, Deepshikha Sharma
So, in short, in shift 2 stat bi-serial... should we take N or N-1 when we multiply it with root(p0*p1)?