Doubt regarding categorical data


Harshit Dhiman

Oct 13, 2020, 11:16:07 AM
to Discussion forum for Statistics for Data Science I

Suppose we collect data on customer satisfaction for many different hotels, on a five-step scale from very bad to very good. We can easily convert these ratings into integers from 1 to 5. We record the customer ratings in a separate dataset for each hotel and name the rating field variable A.
Now we average the ratings for each hotel and create a new dataset in which we record the average rating of each hotel. This variable is named B.
Would variable B be a numerical variable, since it somewhat represents the distribution of the population behind it, or is it a categorical variable with a bigger set of options to pick from?
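
To make the setup concrete, here is a minimal sketch of how variable B could be built from variable A. The three middle labels, the hotel names, and the responses are all made up for illustration; only the five-step scale and the 1 to 5 encoding come from the question.

```python
from statistics import mean

# Assumed labels for the five steps; the question only names the two endpoints.
ENCODING = {"very bad": 1, "bad": 2, "average": 3, "good": 4, "very good": 5}

# Made-up responses (variable A, before encoding) for two hypothetical hotels.
responses = {
    "Hotel X": ["good", "very good", "average", "good"],
    "Hotel Y": ["bad", "average", "average", "very bad"],
}

def average_rating(labels):
    """Encode each label as an integer and return the arithmetic mean."""
    return mean(ENCODING[label] for label in labels)

# Variable B: one average rating per hotel.
B = {hotel: average_rating(labels) for hotel, labels in responses.items()}

# Hotel X: (4 + 5 + 3 + 4) / 4 = 4
# Hotel Y: (2 + 3 + 3 + 1) / 4 = 2.25, a value that is not one of the five codes
print(B)
```

Note that B can take values such as 2.25 that do not correspond to any of the original five labels, which is what makes its type unclear.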

Harshit Dhiman

Oct 14, 2020, 9:42:33 AM
to Discussion forum for Statistics for Data Science I, Harshit Dhiman
Can someone please answer this? It has been bugging me for quite some time.

Statistics 1 Support 1

Oct 14, 2020, 10:11:01 AM
to Discussion forum for Statistics for Data Science I, harshitd...@gmail.com
Hello, Harshit,

If you collect the data as labels from very bad to very good and later encode each label as a number from 1 to 5, that is only an encoding of the dataset. It is not correct to average the ratings, because these are encoded values and not the values of a numerical variable. So the idea of building a new dataset for every hotel from these average ratings is not correct. Instead of encoding the labels as integers from 1 to 5, you could use 2 to 10 or anything else, and the average ratings would change. So it is not a good option to find the average of encoded values.
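
A minimal sketch of this point, using made-up responses for two hypothetical hotels. The second encoding (1, 2, 3, 4, 10) is an assumption chosen to illustrate the "or anything" case: it keeps the order of the labels but changes their spacing. The middle label names are also assumed.

```python
from statistics import mean, median_low

# Assumed label names, listed in order from worst to best.
LABELS = ["very bad", "bad", "average", "good", "very good"]

# Two order-preserving encodings of the same labels.
enc_a = dict(zip(LABELS, [1, 2, 3, 4, 5]))
enc_b = dict(zip(LABELS, [1, 2, 3, 4, 10]))  # same order, different spacing

# Made-up responses for two hypothetical hotels.
hotel_p = ["average", "average", "average", "average"]
hotel_q = ["very bad", "very bad", "very bad", "very good"]

def mean_score(labels, enc):
    """Average of the encoded values; depends on the chosen encoding."""
    return mean(enc[label] for label in labels)

def median_label(labels):
    """Median category, computed from label order only (no encoding needed)."""
    ranks = sorted(LABELS.index(label) for label in labels)
    return LABELS[median_low(ranks)]

# Under enc_a hotel P has the higher mean (3 vs 2);
# under enc_b hotel Q has the higher mean (3 vs 3.25).
print(mean_score(hotel_p, enc_a), mean_score(hotel_q, enc_a))
print(mean_score(hotel_p, enc_b), mean_score(hotel_q, enc_b))

# The median label is the same whichever encoding is used.
print(median_label(hotel_p), median_label(hotel_q))
```

In this sketch the ranking of the two hotels by mean flips when the spacing of the codes changes, while an order-based summary such as the median label does not depend on the encoding at all.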

Harshit Dhiman

Oct 14, 2020, 10:33:02 AM
to Discussion forum for Statistics for Data Science I, stats1-...@onlinedegree.iitm.ac.in, Harshit Dhiman
Hello, sir,
I used this example to establish that a numerical rating system is equivalent to a rating in any other form, and that it is in fact categorical. We could just as well start with a numerical rating system and it would make no difference.
What bugs me is that datasets like these are used in app stores and for many other kinds of services, and I want to know whether those average ratings are categorical or numerical. On the one hand, they seem like just one of many levels of satisfaction; on the other hand, they somewhat represent the population behind the poll. For example, a rating of 4.5 means most of the people are satisfied with the product. (If the ratings ran from 2 to 10, it would be another number but would represent the same information.) I personally think it is categorical, but the points mentioned above make it somewhat ambiguous. Please clarify.
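
The remark about a 2 to 10 scale can be checked with a small sketch. It assumes the 2 to 10 scale is obtained by doubling each 1 to 5 rating (a linear re-encoding), and the poll data is made up.

```python
from statistics import mean

# Made-up poll on the 1 to 5 scale.
ratings_1_to_5 = [5, 4, 5, 4, 5, 4]

# Assumed re-encoding: double every rating to get the 2 to 10 scale.
ratings_2_to_10 = [2 * r for r in ratings_1_to_5]

m1 = mean(ratings_1_to_5)    # 27 / 6 = 4.5
m2 = mean(ratings_2_to_10)   # 54 / 6 = 9

def relative_position(value, low, high):
    """Where a value sits between the ends of its scale, from 0 to 1."""
    return (value - low) / (high - low)

# The number changes, but it sits at the same point of its scale.
print(m1, relative_position(m1, 1, 5))     # 0.875 of the way up
print(m2, relative_position(m2, 2, 10))    # also 0.875 of the way up
```

This only holds because doubling is a linear change of scale. With an encoding that changes the spacing between levels, the average would no longer carry the same information.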

Statistics 1 Support 1

Oct 14, 2020, 10:49:25 AM
to Discussion forum for Statistics for Data Science I, harshitd...@gmail.com, Statistics 1 Support 1
Hello, Harshit,

Even though your question is genuine, it is directly related to the graded assignment. So I will answer it after the deadline for the week 1 graded assignment. I hope you understand.

Best,
Ram, 
Statistics-1 Course Instructor 



Harshit Dhiman

Oct 14, 2020, 5:17:19 PM
to Discussion forum for Statistics for Data Science I, stats1-...@onlinedegree.iitm.ac.in, Harshit Dhiman
Sure, sir. I will be waiting for your reply.

rakesh ky

Oct 14, 2020, 11:01:57 PM
to Discussion forum for Statistics for Data Science I, stats1-...@onlinedegree.iitm.ac.in
Sir, the audio in a few of the statistics videos is very low, for example the mode and median video and several other videos in week 2.

Nikita Kumari

Oct 15, 2020, 12:45:14 AM
to Discussion forum for Statistics for Data Science I, 498.r...@gmail.com, stats1-...@onlinedegree.iitm.ac.in
Hello Rakesh,

We will try to fix this.

Thank you
Nikita
Statistics course support team

Harshit Dhiman

Oct 20, 2020, 2:36:21 AM
to Discussion forum for Statistics for Data Science I, nikita...@onlinedegree.iitm.ac.in, 498.r...@gmail.com, stats1-...@onlinedegree.iitm.ac.in
Sir, can you please answer the question?

Arun Stephen

Oct 21, 2020, 6:46:10 AM
to Discussion forum for Statistics for Data Science I, harshitd...@gmail.com, nikita...@onlinedegree.iitm.ac.in, 498.r...@gmail.com, stats1-...@onlinedegree.iitm.ac.in
Sir/Madam,

Can you please provide an explanation on this topic? In the assessment, the correct answer seems to be that app rating is numerical data on an interval scale. I am having a hard time wrapping my head around it.
1) It is still qualitative data on customer satisfaction, and it is grouped.
2) It does not represent an actual measurement such as the number of downloads.
3) In the video solution, it was said that addition and subtraction are possible.
Subtraction is possible because the data is represented numerically, but what is the real-world application of subtraction on such data?

Please help us understand why it is not categorical data on an ordinal scale.

Regards,
Arun Stephen

Harshit Dhiman

Oct 21, 2020, 7:20:12 AM
to Discussion forum for Statistics for Data Science I, Harshit Dhiman, stats1-...@onlinedegree.iitm.ac.in

Sir, the deadline for week one has passed, so can you please answer it now?

Malabika Guha Mustafi

Oct 21, 2020, 7:23:47 AM
to Discussion forum for Statistics for Data Science I, harshitd...@gmail.com, stats1-...@onlinedegree.iitm.ac.in
Team, please clarify our doubt.

Arun Stephen

Oct 26, 2020, 12:18:13 AM
to Discussion forum for Statistics for Data Science I, Malabika Guha Mustafi, harshitd...@gmail.com, stats1-...@onlinedegree.iitm.ac.in
Any help will be greatly appreciated.