Since I work in the NGO sector, I am already digging up such data
for our programs, research and analysis.
So all this data is of great value to development organisations
and projects like ours.
The problem with Indian government sites is that, when they have data
available at all, it is not in machine-readable, extractable or
programmer-friendly formats.
I am currently working on planning for our Flood/Disaster Risk
Reduction project and evaluating how we can use information to make
quick decisions etc.
So, that made me search Google for "Open Data" initiatives started by
"anyone" in India. Some tried, but were never able to finish or to make
the data available to the public.
This could be a very nice project to start up and invest our
energies in, and one that could draw in volunteers.
The idea is to make data available in as many standard formats as
possible, and maybe also open up APIs and provide it as a web service.
Free, of course :)
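To give a rough idea of what "many standard formats" could look like in
practice, here is a minimal Python sketch that reads one small CSV table
and re-publishes it as JSON and as tab-separated text. The sample rows,
column names and function names are made up purely for illustration;
they are not taken from any government site.

```python
import csv
import io
import json

# Hypothetical sample data: station names and numbers are invented.
SAMPLE_CSV = """station,river,level_m
Station A,River X,12.5
Station B,River Y,8.25
"""


def csv_to_records(text):
    """Parse CSV text into a list of dicts, one dict per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def records_to_json(records):
    """Serialise the records as a JSON array (values stay as strings)."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def records_to_tsv(records):
    """Serialise the records as tab-separated text with a header row."""
    if not records:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=list(records[0].keys()),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


if __name__ == "__main__":
    records = csv_to_records(SAMPLE_CSV)
    print(records_to_json(records))
    print(records_to_tsv(records))
```

The point is only that once the data sits in one clean tabular form,
producing the other formats is cheap; the hard part is the cleaning.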
Can someone experienced, like Indranil, tell us if it's legally allowed
for us to host government-hosted data in open formats on our sites?
We did similar work for http://voteindia.in, where we downloaded
scanned copies of MLAs' legal affidavits from government sites, typed
them by hand into Excel sheets, and then made them available.
I can see
http://india.ictd.asia/opendata already in place ;) and then volunteers
from other countries following suit!
Oh, to start with,
1) http://www.india-water.com/ffs/index.htm - River Basin data. - Can
be used for Flood analysis.
2) http://indiabudget.nic.in/es2009-10/esmain.htm - Economic Survey of
India 2009-2010
3) Old Census Data?
Thoughts/Comments please! :)
Regards,
Ajay Kumar
I had a discussion with Ajay about this on IRC. This is a very good
idea. I had been thinking recently about how much of the Indian
government's educational eContent doesn't use Unicode (like the NCERT
textbooks). This is along similar lines.
> Can someone experienced, like Indranil, tell us if it's legally allowed
> for us to host government-hosted data in open formats on our sites?
IMO, that's the deal maker or breaker. If this works out, I guess there
is nothing stopping this.
Even if this is available only as plain data initially, it would be good.
APIs and stuff can come in later?
---
Nandeep
On Sat, Jun 19, 2010 at 10:54 PM, Ajay Kumar <ajuo...@gmail.com> wrote:
> [snipping good stuff]
> I tried to gather some more feedback from people who have either been working on similar stuff or know about it, and we more or less agree that we can play around with the data and host it, with attribution of course, since it's public data.
I'll have to trust your judgement
as I do not know Indian law
> Now the next step would be to devise a methodology for the same. Anyone with experience playing around with data and standards can help us by suggesting a better way to approach this.
a few things I'd suggest might be quite useful
in the intermediate- and long-term
(despite being a bit of work in the short-term)
as we have found in our recent work with
the US Federal Election Commission data/files:
1) develop relationship diagrams (linking)
related data is often stored in multiple files
even files in multiple locations w multiple access methods
and having documentation of what keys link different data
is very valuable in figuring out how to
form coherent & integrated information for the citizen-user
in applications and servers developed by the group
2) develop schemas & convert data to XML-form (defining)
clear definitions of data structure & constraints are critical
to understanding and manipulating it
and is often either present alongside the files in "metadata"
or can be constructed through examination of the data
with the data in XML-form conforming to your schemata
transformations and queries of the data
as well as reformation of logically-linked data
into more useful files/documents
becomes much much easier
as does producing useful outputs for citizen-users
3) develop open RESTful API(s) for access
this is a very powerful way to ensure
client and server code and logic is disentangled
thus enabling others not in the group
to also develop their own clients
as well as enabling the development of
unexpected (at the front end of development)
uses of the data
take Twitter as an example of this...
part of their success is because
they enabled a client-code-focused ecosystem to develop
through separating client & server code
and through keeping an open RESTful API
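as a rough sketch of suggestions 1) and 2) above
(an illustration only, not how the FEC work was actually done):
two small tables are linked on a shared key
and the combined records are written out as XML.
the file contents, the "station_id" key,
the element names and the function names
are all hypothetical, invented for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical inputs: columns and values are invented for illustration.
STATIONS_CSV = """station_id,name,river
S1,Station A,River X
S2,Station B,River Y
"""

READINGS_CSV = """station_id,date,level_m
S1,2010-06-19,12.5
S1,2010-06-20,13.0
S2,2010-06-19,8.25
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def link_by_key(parents, children, key):
    """Pair each parent row with the child rows that share its key value."""
    grouped = {}
    for child in children:
        grouped.setdefault(child[key], []).append(child)
    return [(parent, grouped.get(parent[key], [])) for parent in parents]


def to_xml(linked):
    """Build one <station> element per parent, with nested <reading> elements."""
    root = ET.Element("stations")
    for station, readings in linked:
        node = ET.SubElement(root, "station", id=station["station_id"])
        ET.SubElement(node, "name").text = station["name"]
        ET.SubElement(node, "river").text = station["river"]
        for reading in readings:
            item = ET.SubElement(node, "reading", date=reading["date"])
            item.text = reading["level_m"]
    return root


if __name__ == "__main__":
    linked = link_by_key(
        read_rows(STATIONS_CSV), read_rows(READINGS_CSV), "station_id"
    )
    print(ET.tostring(to_xml(linked), encoding="unicode"))
```

writing down which key links which files (here, station_id)
is the documentation step from 1);
the nested XML is one possible shape for 2),
and a real schema would still need to be defined separately.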
just a couple of suggestions...
jeffs
--
“Water? Never drink it...
Do you know what fish do in that stuff?”
-- attributed to W. C. Fields --
==========
Prof. Jeff Sonstein
A similar requirement made me think about this a while back
(http://thejeshgn.com/2010/02/24/open-data-in-india/).
I couldn't take time out to work on the idea then. Now that we have a
team, we can. To begin with, I have tagged all the openly available
data on delicious:
http://delicious.com/gnthej/open-data-india