Trouble reading Indian Districts Database's files


Eduardo Campillo Betancourt

Jan 31, 2021, 10:10:19 PM
to datameet
Hi all,

Have any of you successfully read the .data files in the Indian Districts Database (http://vanneman.umd.edu/districts/files/index.html)? I tried reading them in R as ASCII, and elsewhere as plain-text or tab-separated files, but all I got was a single column of about 150K rows. It looked like the data, but with no indication of which column corresponds to which variable. Also, the SAS command files are not available (the links are broken).

Thank you in advance!

Best,
Eduardo

Dilawar Singh

Jan 31, 2021, 11:16:04 PM
to datameet
These files are hopelessly mangled. I've extracted the data and put it here: https://github.com/dilawar/data/tree/master/IndianDistrictDatabase/original . Most entries have five columns, but some have only one or two.
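If you want to check the column pattern yourself, here is a minimal sketch that tallies how many fields each line has. It assumes the values are separated by whitespace, which I have not verified for every file, and `example.data` is only a placeholder path, not an actual file name from the database.

```python
from collections import Counter
from pathlib import Path


def field_count_histogram(lines):
    """Count how many whitespace-separated fields each non-blank line has."""
    counts = Counter()
    for line in lines:
        fields = line.split()  # assumption: fields are whitespace-separated
        if fields:  # skip blank lines
            counts[len(fields)] += 1
    return counts


if __name__ == "__main__":
    # Placeholder path: point this at one of the extracted files.
    path = Path("example.data")
    if path.exists():
        # latin-1 never fails to decode, which is convenient for old ASCII dumps.
        with path.open(encoding="latin-1") as handle:
            histogram = field_count_histogram(handle)
        for width, n_lines in sorted(histogram.items()):
            print(f"{width} field(s): {n_lines} line(s)")
```

If the tally shows mostly five-field lines with a regular sprinkling of one- and two-field lines, that would suggest each record is wrapped across several physical lines rather than being corrupted.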

You can search for snapshots of this site on the Internet Archive, and if you are lucky, some old snapshots might have the currently missing SAS files. The last time I tried, their search engine was having issues.


--
Dilawar Singh, Ph.D.



Eduardo Campillo Betancourt

Feb 2, 2021, 4:10:41 PM
to datameet

Hi Dilawar,

Thank you so much! The SAS files were very useful in understanding the structure. They're surprisingly not mangled, just very strangely arranged! Let me know if you need them in .dta and I can share the files once I'm done cleaning them.
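For anyone following along, here is a rough sketch of one way such an arrangement could be undone. It rests on an assumption that is not confirmed for these files: that each record's values are simply wrapped across several physical lines, in the order the variables are listed in the SAS command file. The function name `regroup_records`, the variable count, and the sample values are all made up for illustration.

```python
def regroup_records(lines, n_vars):
    """Flatten whitespace-separated values and cut them into records of n_vars.

    Assumption: every record has exactly n_vars values, written in order and
    wrapped across consecutive lines. Values are kept as strings.
    """
    values = [token for line in lines for token in line.split()]
    if len(values) % n_vars != 0:
        raise ValueError(
            f"{len(values)} values do not divide evenly into records of {n_vars}"
        )
    return [values[i:i + n_vars] for i in range(0, len(values), n_vars)]


if __name__ == "__main__":
    # Made-up sample: two records of seven values, wrapped as five plus two.
    sample = ["1 2 3 4 5", "6 7", "8 9 10 11 12", "13 14"]
    for record in regroup_records(sample, n_vars=7):
        print(record)
```

The length check is there on purpose: if the total number of values is not a multiple of the variable count, the wrapping assumption is wrong for that file and the layout in the SAS file needs a closer look.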

Best,
Eduardo

Dilawar Singh

Feb 3, 2021, 8:26:47 AM
to datameet
Hi Eduardo,

Structured data will be useful for anyone visiting this thread. You can send the files to me (I'll add them to the GitHub link), or you can put them on a public server and post a link here.

best,
    Dilawar

Eduardo Campillo Betancourt

Feb 3, 2021, 11:13:28 AM
to datameet
Definitely! I'll share when I finish cleaning the panel.