Comprehensive list of GOI online services


konark modi

May 16, 2017, 4:10:49 PM
to data...@googlegroups.com
Hi All,

I am always looking for a comprehensive list of GOI websites in a consumable format for various projects, so I decided to scrape http://goidirectory.nic.in/index.php. (Yes, there is no HTTPS for this link.)


Number of Websites: 10741
Suffix Count
.gov.in 4805
.nic.in 2766
.org 855
.com 566
.ac.in 499
.in 485
.co.in 209
.org.in 176
.res.in 158
.edu.in 110
.net 37
.edu 26
.net_in 9
.info 7
.aero 2
.gen_in 1
.coop 1
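
A minimal sketch of how a suffix count like the one above could be produced from a list of scraped URLs. This is not the script used for the repo; the `KNOWN_SUFFIXES` tuple and the function names `suffix_of` and `count_suffixes` are assumptions made for illustration, and the suffix list only covers the two-label suffixes that appear in the table.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical list of two-label suffixes to recognise (an assumption,
# limited to the suffixes that show up in the table above).
KNOWN_SUFFIXES = (
    "gov.in", "nic.in", "ac.in", "co.in", "org.in",
    "res.in", "edu.in", "net.in", "gen.in",
)


def suffix_of(url):
    """Return the domain suffix of a URL or bare hostname, e.g. '.gov.in'."""
    # urlsplit only fills in the hostname when the input has a '//' part,
    # so bare domains such as 'sci.gov.in' get one prepended.
    if "//" not in url:
        url = "//" + url
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    for suffix in KNOWN_SUFFIXES:
        if host == suffix or host.endswith("." + suffix):
            return "." + suffix
    if "." in host:
        # Fall back to the last label, e.g. '.com' or '.in'.
        return "." + host.rsplit(".", 1)[-1]
    return ""


def count_suffixes(urls):
    """Count how many URLs fall under each suffix."""
    return Counter(suffix_of(u) for u in urls)


if __name__ == "__main__":
    sample = [
        "http://goidirectory.nic.in/index.php",
        "sci.gov.in",
        "araiindia.com",
    ]
    for suffix, count in count_suffixes(sample).most_common():
        print(suffix, count)
```

A fixed suffix list keeps the sketch dependency-free; a fuller version would probably rely on the Public Suffix List instead.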


I hope this list is useful for a range of projects and studies.

Please feel free to add missing domains or any other relevant information. The working repo is: https://github.com/konarkmodi/DigitalIndia


-Konark
@konarkmodi

srinivas kodali

May 16, 2017, 10:05:25 PM
to datameet
Unfortunately, this does not cover all the websites. There are more that are not part of the directory. We should probably start crowdsourcing the others.

Regards,
Srinivas Kodali

--
Datameet is a community of Data Science enthusiasts in India. Know more about us by visiting http://datameet.org
---
You received this message because you are subscribed to the Google Groups "datameet" group.
To unsubscribe from this group and stop receiving emails from it, send an email to datameet+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Vaishnavi Jayakumar (Inclusive India)

May 16, 2017, 11:27:28 PM
to datameet
Yes please to the crowdsourcing!

It's a mammoth task: this list alone is 10741. (And more are popping up all the time.)

An old one that's missing, e.g.: araiindia.com
A new one that hasn't been added yet: sci.gov.in

When, oh when, are they going to be updated to reflect the gov.in default?
When, oh when, will we stop seeing Gmail IDs used for government work by govt officials?

---------------------------------------
VAISHNAVI JAYAKUMAR
http://about.me/vjayakumar

konark modi

May 17, 2017, 3:53:14 AM
to data...@googlegroups.com
Thank you @srinivas and @vaishnavi for your feedback.


I really like the idea of crowdsourcing. How do you want to proceed?

1. Issue a PR with changes to the TSV, or open an issue?
2. If there is a source that needs to be scraped, open it as an issue?
3. Use this repo as the main source, or move it somewhere more open, maybe the datameet repo?
 
If you know of any other resources let me know, will pull them in.
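
For anyone contributing through a PR against the TSV, here is a rough sketch of how crowdsourced domains could be checked against the existing list before adding them. It assumes the domain sits in the first tab-separated column, which may not match the repo's actual layout; `normalize` and `merge_domains` are hypothetical helper names, not something that exists in the repo.

```python
import csv
import io


def normalize(domain):
    """Reduce a URL or domain to a bare lowercase hostname without 'www.'."""
    d = domain.strip().lower()
    for prefix in ("http://", "https://"):
        if d.startswith(prefix):
            d = d[len(prefix):]
    d = d.split("/", 1)[0]  # drop any path
    if d.startswith("www."):
        d = d[4:]
    return d


def merge_domains(existing_tsv, contributed):
    """Return contributed domains that are not already in the TSV text.

    Assumes (hypothetically) that the first column of each row is the domain.
    """
    reader = csv.reader(io.StringIO(existing_tsv), delimiter="\t")
    seen = {normalize(row[0]) for row in reader if row}
    added = []
    for domain in contributed:
        key = normalize(domain)
        if key and key not in seen:
            seen.add(key)  # also drops duplicates within the contribution
            added.append(key)
    return added


if __name__ == "__main__":
    existing = "goidirectory.nic.in\n"
    new = merge_domains(existing, ["http://www.araiindia.com/", "sci.gov.in"])
    print(new)
```

Only the genuinely new, normalized domains come back, so the same check could run in a PR review to avoid duplicate rows.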

-Konark
@konarkmodi