Any project on Name Commonality


Pradeep Bhatt

Mar 5, 2018, 12:29:23 PM
to data...@googlegroups.com
Hi All,

Has any work been done on name commonality in India, something like this site?


For example, finding how many people named "Yuvraj Singh" or "Priyanka Chopra" there are in India.

To those who have scraped Voter ID data: do you think it's possible?

Regards,
Pradeep

Raphael Susewind

Mar 6, 2018, 12:06:16 PM
to data...@googlegroups.com
Dear Pradeep,

It is possible in principle, though with complications (including
ethical ones). Have a look at my GitHub for a start on how to
extract names from the electoral rolls:

https://github.com/raphael-susewind

What is definitely possible is something like this:

https://www.raphael-susewind.de/blog/2012/noor-mohd-ali

Best,
Raphael
> --
> Datameet is a community of Data Science enthusiasts in India. Know more
> about us by visiting http://datameet.org
> ---
> You received this message because you are subscribed to the Google
> Groups "datameet" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to datameet+u...@googlegroups.com
> <mailto:datameet+u...@googlegroups.com>.
> For more options, visit https://groups.google.com/d/optout.

Pradeep Bhatt

Mar 8, 2018, 2:12:52 AM
to data...@googlegroups.com
Thanks, Raphael!

I will build it from the Voter ID database.

Regards,
Pradeep



Anand Chitipothu

Mar 8, 2018, 2:58:02 AM
to data...@googlegroups.com
I've done that for the states of AP and Telangana.

https://archive.org/details/india-names-dataset

For privacy reasons, it counts each word in the name instead of the full name. So you'll be able to find how many "Yuvraj" and how many "Singh" were born in each year, but not how many "Yuvraj Singh".
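
A minimal sketch of this kind of per-word counting, for illustration only. It is not the code behind the dataset: the `(full_name, birth_year)` record layout, the function name `count_name_words`, and the sample records are all assumptions made up for the example.

```python
from collections import Counter


def count_name_words(records):
    """Count each word of each name per birth year.

    `records` is an iterable of (full_name, birth_year) pairs. This layout
    is an assumption for illustration, not the dataset's actual schema.
    """
    counts = Counter()
    for full_name, birth_year in records:
        # Split the name into words so that full names are never
        # stored or counted as a unit.
        for word in full_name.lower().split():
            counts[(word, birth_year)] += 1
    return counts


# Made-up sample records, not real voter data.
sample = [
    ("Yuvraj Singh", 1981),
    ("Yuvraj Kumar", 1981),
    ("Priyanka Singh", 1982),
]

counts = count_name_words(sample)
print(counts[("yuvraj", 1981)])
print(counts[("singh", 1982)])
```

Because only `(word, year)` pairs are kept, the result can answer "how many Yuvraj in 1981" but cannot be used to recover how many people were named "Yuvraj Singh".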

Anand

Vaishnavi Jayakumar

Mar 8, 2018, 3:47:35 AM
to datameet
There's a voter ID database? You mean scraped from online voter lists?

What are the privacy implications? These were the same lists used to target communities during riots. In fact that's part of the recommendations DRA made on ECI's website accessibility - to prevent data misuse.

Pradeep Bhatt

Mar 19, 2018, 10:52:53 PM
to data...@googlegroups.com
Thanks, Anand!

I will check it out.
