Happy Independence Day and Open Indian Village Boundaries


Thejesh GN

Aug 15, 2016, 1:02:09 AM
to data...@googlegroups.com
Hi all,

One of the most discussed subjects on this list is village boundaries. We have seen many people working on them, sharing data, etc. Now we have an effort to consolidate, clean up and publish these village maps.

Just like our MP/MLA maps, the project is run by volunteers and all of you can participate. Data will be available on the GitHub project page under the ODbL.


We need your help.

Go to the project page to learn how to contribute and about the formats, attributes, etc. This is an ongoing process which will take some time to cover all the states. I will blog and publish as the states go online.

Initial Blog: http://datameet.org/2016/08/15/happy-independence-day-open-indian-village-boundaries/


Happy Independence Day. Take care. 


Thej
--
Thejesh GN  ತೇಜೇಶ್ ಜಿ.ಎನ್
http://thejeshgn.com
GPG ID :  0xBFFC8DD3C06DD6B0

"Sharad Lele [शरच्चंद्र लेले]"

Aug 15, 2016, 8:42:14 AM
to data...@googlegroups.com
For some reason, this email by Thejesh did not show up on datameet (or maybe I just missed it?). So I am forwarding it to the group.

Thejesh, thanks for this. I have two queries:

1. Is this effort different from the effort by Nisha? My impression was that we had (with the group) access to boundaries of many more states, but the issue was copyright etc, and that Nisha, you and some others were taking legal opinion on the same. Is what you have put out a result of that? What is the legal opinion?

2. I think I have asked this silly question on the group before, but have to ask it again: what is the 'json' format, and why is it better to put the data out in that format than in .shp? And how does one convert from json to .shp?

Thanks.
Sharad


---------- Forwarded message ----------
From: "Data{Meet}" <donot...@wordpress.com>
Date: 15 Aug 2016 10:27
Subject: [New post] Happy Independence Day and Open Indian Village Boundaries
To: <sbad...@atree.org>
Cc:

Happy Independence Day and Open Indian Village Boundaries

by Thejesh GN

One of the longest and most passionately discussed subjects on the Data{Meet} list is the availability of Indian village boundaries in digital format. Search for Indian village shapefiles and you can spend hours reading interesting conversations.

Over the last two years, different members of the community have tried to digitize the maps available through various government platforms or have shared maps through their organizations.

A look at the list discussion tells you that boundaries of at least 75% of the states are available in various formats and quality. What we need at this point is a consolidated effort to bring them all on par in format, attributes and, to some level, quality. So some volunteers at Data{Meet} agreed to come together, clean up the available maps, add attributes, convert them to GeoJSON and publish them on our GitHub repository called Indian Village Boundaries.

Of course this will be an ongoing effort, but we would love to reach a baseline (all states) by year end. As of now I have cleaned up and uploaded Gujarat. I have at least four more states to go live by month end: Karnataka, Kerala, Tamil Nadu and Goa. I will announce them on the list as they go live.

The boundaries are organized by state using the state ISO code. All the village boundaries are available in GeoJSON (WGS84, EPSG:4326) format. The project page gives you the status of the data as we clean and upload. The data is not perfect yet; there could be many errors both in the data and in the boundaries. You can contribute by sending pull requests. Please use the census names when correcting attributes and GeoJSON when correcting shapes, and please source them to an official source.
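Since the files are stated to be in WGS84 (EPSG:4326), one cheap sanity check before sending a pull request is to confirm that every coordinate is a plausible longitude/latitude pair. The following is a minimal sketch, not part of the project's tooling; the function names and the sample features (including the village names) are made up for illustration.

```python
def iter_positions(coords):
    """Yield (lon, lat) pairs from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def out_of_range_features(feature_collection):
    """Return indexes of features with any position outside lon/lat bounds."""
    bad = []
    for index, feature in enumerate(feature_collection["features"]):
        geometry = feature.get("geometry") or {}
        coords = geometry.get("coordinates", [])
        if any(not (-180 <= lon <= 180 and -90 <= lat <= 90)
               for lon, lat in iter_positions(coords)):
            bad.append(index)
    return bad


# Hypothetical sample: the second feature looks like projected (metre) coordinates.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"NAME": "Example A"},
         "geometry": {"type": "Polygon", "coordinates": [
             [[72.5, 23.0], [72.6, 23.0], [72.6, 23.1], [72.5, 23.0]]]}},
        {"type": "Feature", "properties": {"NAME": "Example B"},
         "geometry": {"type": "Point", "coordinates": [500000.0, 2500000.0]}},
    ],
}

suspect = out_of_range_features(sample)
print(suspect)
```

A check like this only catches files that were never reprojected; it will not detect the small shifts between datasets that come up later in this thread.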

Like everything else the community creates, all map data will be available under the Open Data Commons Open Database License (ODbL). This data is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. If you find issues we are more than happy to accept corrections, but please source them to an official source.

On this 70th Independence Day, as we celebrate the historic event of India becoming free and independent, the Data{Meet} community celebrates by cleaning, formatting and digitizing our village boundaries. Have a great time using the maps and contributing back to society.

https://github.com/datameet/indian_village_boundaries

Picture: Kedarnath range behind the Kedarnath temple early morning. By Kaustabh, Available under CCBYSA.





--
Sharachchandra Lele
Senior Fellow & Convenor
Centre for Environment & Development
Ashoka Trust for Research in Ecology and the Environment (ATREE)
Phone (office) (080)-2363-5555 ext. 317; (mob): +91-94800-15850
Email: sl...@atree.org   Skype ID: sharad_lele
Personal webpage: http://atree.org/sharad_lele
Postal address: Royal Enclave, Srirampura, Jakkur P.O., Bangalore 560 064, INDIA
President, Indian Society for Ecological Economics (2014-16) (www.ecoinsee.org)

Democratizing Forest Governance in India (published Sept 2014)
(edited by Sharachchandra Lele and Ajit Menon)
Oxford University Press, India



Arun Ganesh

Aug 15, 2016, 10:56:18 AM
to datameet
> 2. I think I have asked this silly question on the group before, but have to ask it again: what is the 'json' format, and why is it better to put the data out in that format than in .shp? And how does one convert from json to .shp?


For starters, the property names in shapefiles are limited to 8 characters :o

You can convert from GeoJSON to shapefiles using QGIS, ogr2ogr, or one of the online converters such as https://ogre.adc4gis.com . All modern web maps work natively with the GeoJSON format.

Thejesh GN

Aug 15, 2016, 12:03:04 PM
to data...@googlegroups.com
On 15 August 2016 at 18:12, "Sharad Lele [शरच्चंद्र लेले]" <shara...@gmail.com> wrote:
> For some reason, this email by Thejesh did not show up datameet (or maybe I just missed it?). So forwarding it to the group.
>
> Thejesh, thanks for this. I have two queries:
>
> 1. Is this effort different from the effort by Nisha? My impression was that we had (with the group) access to boundaries of many more states, but the issue was copyright etc, and that Nisha, you and some others were taking legal opinion on the same. Is what you have put out a result of that? What is the legal opinion?


It's the same effort; this is going to be our single effort. In this phase 1 we are going to publish village boundaries for the states where they are derived from government websites. We are applying our license, ODbL, to the additional skilled and creative work our volunteers do (designing, organizing, extracting, georeferencing, drawing missing areas, adding more attributes, cleaning, etc.), and we attribute/credit the source (government website) for each state.

I know there are more boundaries with complicated licenses. The talks are still on; as of now we are not publishing them. We will look at them in phase 2. We will keep the list updated.

So as of now it's government public data + DataMeet volunteers' creative work.
 
> 2. I think I have asked this silly question on the group before, but have to ask it again: what is the 'json' format, and why is it better to put the data out in that format than in .shp? And how does one convert from json to .shp?


 
Our boundaries are incomplete in some cases or might have errors. Hence we are maintaining them in git, similar to source code, where any change to a file can be tracked and seen.

GeoJSON (or JSON) is a simple text format that can be opened in any text editor. That makes it easy to compare versions and do a diff, just like source code.

Shapefiles, a format by ESRI, are binary. It's difficult to compare them on GitHub.
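To make the "diff just like source code" point concrete, here is a minimal sketch using only the Python standard library. It serializes two versions of a feature to stable, indented text and prints a unified diff, which is roughly what git or GitHub would show for a one-attribute correction. The feature, the village name and the file labels are made up for illustration.

```python
import difflib
import json


def geojson_diff(old, new):
    """Return unified-diff lines between two GeoJSON objects rendered as stable text."""
    def to_lines(obj):
        # sort_keys and a fixed indent keep the text stable between runs
        return json.dumps(obj, indent=2, sort_keys=True, ensure_ascii=False).splitlines()

    return list(difflib.unified_diff(
        to_lines(old), to_lines(new),
        fromfile="before.geojson", tofile="after.geojson", lineterm=""))


# Hypothetical feature: only the spelling of the name is corrected.
before = {"type": "Feature",
          "properties": {"NAME": "Rampur"},
          "geometry": {"type": "Point", "coordinates": [72.5, 23.0]}}
after = {"type": "Feature",
         "properties": {"NAME": "Rampura"},
         "geometry": {"type": "Point", "coordinates": [72.5, 23.0]}}

changes = geojson_diff(before, after)
print("\n".join(changes))
```

The same comparison on a shapefile would need a GIS tool, because the attribute table and geometry are stored in binary files.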

I would suggest http://mapshaper.org/ (it's open source, so you can run it internally if required). You can convert between shapefiles/GeoJSON/TopoJSON, work on them and export. I have attached screenshots.

I have been using it a lot and it has really great support for all formats. I suggest you use http://mapshaper.org/ and let the group know if you face any issues.


Let me know if you have any more questions.

 

--
Datameet is a community of Data Science enthusiasts in India. Know more about us by visiting http://datameet.org
---
You received this message because you are subscribed to the Google Groups "datameet" group.
To unsubscribe from this group and stop receiving emails from it, send an email to datameet+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Screenshot from 2016-08-15 21:25:13.png
Screenshot from 2016-08-15 21:25:04.png

Thejesh GN

Aug 16, 2016, 2:52:15 AM
to data...@googlegroups.com

@arun: I will rename the attributes so that each attribute name is at most 8 characters long. Can you file an issue? That makes it easy to track.

@question: In the census, they use subdistrict. Is it the same as Taluk?

Also, does anyone have 2001 to 2011 village code mappings?

Devdatta Tengshe

Aug 16, 2016, 4:19:13 AM
to data...@googlegroups.com
>i will rename the attributes so the length of attribute name is 8. Can you put an issue. Easy to track.
There is no need to do that. This is one of the limitations of the shapefile format, and we have no such limitations when we have a GeoJSON file.

> in census, they use subdistrict. Is it same as Taluk?
Pretty much. In different parts of the country, subdistricts are called by different names: Taluk, Tehsil, Mandal, Block, etc., depending on where you are. Hence the Census came up with an abstract name for them.



Regards,
Devdatta

Thejesh GN

Aug 16, 2016, 4:41:30 AM
to data...@googlegroups.com

@dev
I was planning to rename the attributes so that if someone converts these GeoJSON files into shapefiles, the names remain the same, which reduces confusion.

What do you think?
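For anyone who wants to try the renaming on their own copy, here is a minimal sketch of shortening property names while keeping them unique. It is only an illustration, not the project's script. The limit is a parameter: this thread mentions 8 characters, while the dBASE attribute table used by shapefiles is usually documented as allowing field names of up to 10. The CENSUS_CODE_* names come from this thread; the function names and the sample feature are made up.

```python
def shorten_names(names, limit=8):
    """Map each property name to a unique name of at most `limit` characters."""
    mapping = {}
    used = set()
    for name in names:
        short = name[:limit]
        counter = 1
        # On a clash, replace the tail with a running number until the name is free.
        while short in used:
            suffix = str(counter)
            short = name[:limit - len(suffix)] + suffix
            counter += 1
        used.add(short)
        mapping[name] = short
    return mapping


def rename_properties(feature_collection, mapping):
    """Return a copy of the collection with property keys renamed via `mapping`."""
    features = []
    for feature in feature_collection["features"]:
        props = feature.get("properties") or {}
        renamed = {mapping.get(key, key): value for key, value in props.items()}
        features.append({**feature, "properties": renamed})
    return {**feature_collection, "features": features}


mapping = shorten_names(["CENSUS_CODE_2001", "CENSUS_CODE_2011", "NAME"])
print(mapping)

# Hypothetical feature with placeholder values.
collection = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "geometry": None,
     "properties": {"CENSUS_CODE_2001": "0001", "NAME": "Example"}}]}
renamed = rename_properties(collection, mapping)
print(renamed["features"][0]["properties"])
```

Automatic truncation produces names like CENSUS_1 that are unique but not very readable, which is an argument for choosing the short names by hand and recording them in the state's notes file.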

Arun Ganesh

Aug 16, 2016, 7:01:19 AM
to datameet
Thej, if it helps, the OSM wiki probably has the most comprehensive documentation of the varying boundary levels in India http://wiki.openstreetmap.org/wiki/WikiProject_India/Boundaries

It would be great to get some experts well versed in the governance structure to take a look and provide feedback.

Arun Ganesh

Nisha Thompson

Aug 17, 2016, 8:28:51 AM
to datameet
@Sharad - To expand what Thej said:

Thej and I met with lawyers at Alternative Law Forum in Bangalore.

We discussed DataMeet: we are a community of people who believe in open data and share data with one another over the group. The data is usually government-owned, publicly available data that was procured through many methods, but sometimes data owned by individuals (companies, researchers, students, etc.) is shared as well.

We met with them regarding the mapping data we hold. Currently DataMeet has the only publicly available Parliamentary and Assembly Constituency shapefiles for download under ODbL, as well as District maps, some ward maps and other isolated mapping information provided to us by the community to maintain.

When the community decided to start tackling the village files Thej and I were concerned about having that much mapping data available from government sources given the Survey of India's regulations regarding mapping information.

There are a lot of gray areas in the law and in the regulation. Basically we should be concerned with copyright violation, but it is unlikely that they will go after you for that. It is important to create a sense of demand, then document the process of getting and processing the information, and also show that sharing this information is in the public interest.

Copyright Law - Data is not technically copyrighted, but the way it is stored and created is. Maps are creations and interpretations of data, therefore a map is technically copyrighted. Making a copy of it and changing its format from one to another is a gray area in copyright. (Some say yes, some say no.)

Mapping Policy - Only the Survey of India is allowed to map and distribute maps. However, they make a lot of partnerships with vendors and distributors for their maps. Again, taking an official GOI and SOI map (Census, NIC-provided village maps) and changing the format is a gray area for copyright, but whether the SOI considers it a violation of their distribution is another story.

What it comes down to is our intent and how risk averse we are.

We are not very risk averse and we believe in open data for public good, so we decided to move forward with the effort but proceed cautiously and deliberately.

If the government asks us to take them down we will but we should be prepared to make the case for why they should provide the maps. 

We spent some time collecting demand information from people on why mapping information needs to be available and ideally in the open.
We articulated the need to have this information in the open, and with the Save the Map campaign in full swing at the same time, there has been a reasonable public conversation about the importance of mapping information being available, not restricted, and in the open.
We then looked at the states that already had PDF maps of their villages available and began to convert them, people contributed their individual efforts in converting maps and over the last several months managed to get the whole country in some form or the other.
Now we can finish processing them and get them out to the public.

This is an ongoing conversation and we will continue having the conversation with ALF and others so please send any questions.

Nisha
Nisha Thompson
DataMeet.org
skype: nishaqt
mobile: 962-061-2245


Arun Ganesh

Aug 17, 2016, 11:48:00 AM
to datameet
It's amazing to see this move forward with DataMeet, thank you Nisha and Thej for pushing this! Are there any immediate tasks that others could support?

Thejesh GN

Aug 17, 2016, 11:58:10 AM
to data...@googlegroups.com

@arun
Thank you

Couple of tasks:
It will help us as we work on the maps.

2.  Adding CENSUS_CODE_2011 codes for villages. It can be derived from CENSUS_CODE_2001 codes. But needs work and validation.




Thej
--
Thejesh GN  ತೇಜೇಶ್ ಜಿ.ಎನ್
http://thejeshgn.com
GPG ID :  0xBFFC8DD3C06DD6B0

Sumit

Aug 17, 2016, 1:32:41 PM
to datameet
> Also anyone has 2001 to 2011 village code mappings?
I have this dataset, Thejesh.

Craig Dsouza

Aug 18, 2016, 3:28:18 PM
to datameet
Thej,
the census codes lookup tables 2001-2011 are attached.
Raphael on the group had shared them earlier.



Thejesh GN

Aug 19, 2016, 7:27:12 AM
to data...@googlegroups.com
1. Thank you. I have added the same to reference section.
2. I will write a Python script to update the GeoJSON based on this. I will share it once it's done.
3. In the meantime, we have added Karnataka maps. I am yet to clean things there (tasks: https://github.com/datameet/indian_village_boundaries/blob/master/ka/ka.md)
4. I am also working on the Bihar maps. We should have it by Monday. 
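The update described in point 2 could look roughly like the sketch below. It is not the script mentioned above, only an illustration. The property names CENSUS_CODE_2001 and CENSUS_CODE_2011 come from this thread; the CSV column names (code_2001, code_2011), the function names and all code values are placeholders, since the layout of the shared lookup tables is not described here.

```python
import csv
import io


def load_lookup(csv_text, old_col="code_2001", new_col="code_2011"):
    """Build a {2001 code: 2011 code} dict from CSV text (column names are assumed)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[old_col]: row[new_col] for row in reader}


def add_2011_codes(feature_collection, lookup):
    """Set CENSUS_CODE_2011 in place where possible; return unmatched 2001 codes."""
    unmatched = []
    for feature in feature_collection["features"]:
        props = feature.get("properties") or {}
        feature["properties"] = props
        old_code = str(props.get("CENSUS_CODE_2001", ""))
        if old_code in lookup:
            props["CENSUS_CODE_2011"] = lookup[old_code]
        else:
            # Keep these for manual validation instead of guessing a code.
            unmatched.append(old_code)
    return unmatched


# Placeholder lookup rows and features, not real census codes.
lookup = load_lookup("code_2001,code_2011\n00000100,500001\n00000200,500002\n")
villages = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "geometry": None, "properties": {"CENSUS_CODE_2001": "00000100"}},
    {"type": "Feature", "geometry": None, "properties": {"CENSUS_CODE_2001": "00000999"}},
]}

unmatched = add_2011_codes(villages, lookup)
print(unmatched)
```

Codes are compared as strings so that leading zeros survive; villages that were split or merged between the two censuses would still need the manual validation mentioned earlier in the thread.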






Thej
--
Thejesh GN  ತೇಜೇಶ್ ಜಿ.ಎನ್
http://thejeshgn.com
GPG ID :  0xBFFC8DD3C06DD6B0


Justin Meyers

Aug 25, 2016, 9:52:11 AM
to datameet
Thej,
Could you share the download links for the data? The three village datasets you created align closely with Bhuvan data, but there is a shift. I believe there are a few different versions of villages that Bhuvan and WRIS use. Yesterday I found a whole new India village dataset; attached are three samples. Your Kerala data seems to be the best quality of the data you have uploaded thus far. At the end of the day, all this data must come from the SOI. I just wish it was easier to speak to them and have open and transparent conversations about the benefit of sharing this data with the people. Right now we have to scrape data and trust names and codes. The data is old, outdated, shifted, and no-one knows. No-one tells us anything and we have to assume a nationwide dataset of villages, tehsils, etc. is correct.

Cheers 
DATAMEET_SAMPLES.zip

Sharad Lele

Aug 25, 2016, 11:29:03 PM
to datameet
Hi Justin:

Your sample dataset looks interesting. I compared the Karnataka sample with the one that Thejesh had uploaded (which was shared from our archive). In some ways the match/fit with physical features is better, but some village shapes seem totally different and perhaps oversimplified. I could do a detailed check later. But I don't see any village codes/names in the attribute table. Are they present? Or linkable somehow? To census 2011?

It would be good to know the source of these India-wide village boundaries... or perhaps you could simply share the full dataset.... At this stage, the more the merrier.... :-)

Sharad

Thejesh GN

Aug 26, 2016, 6:37:55 AM
to data...@googlegroups.com
@justin
For every state it's different. We have a file inside every state folder which gives the details about the source, the process used and which tasks are pending.

What's the source of your data? Is it a GOI website? If yes, we can try to use that as source material for the DM dataset as well.

We could start the conversation with SoI. Start a letter on hackpad; let's write to them asking them to release the data.



Thej
--
Thejesh GN  ತೇಜೇಶ್ ಜಿ.ಎನ್
http://thejeshgn.com
GPG ID :  0xBFFC8DD3C06DD6B0


Nishadh K A

Mar 31, 2019, 2:30:42 PM
to datameet
Hi,
has village level index published as pdf files, such as for a single state as 

The pdf file contains figures/maps of the village boundaries in each Taluk/Tehsil with an index list. This would be the reference source to correct village polygons and their name details. There are 35 such pdf files, totalling nearly 6500 pages.

taluk/tehsil.

This may be useful for correcting the Census of India 2011 village polygons and their name details.

Regards,

Nishadh.K.A.
Research Associate

Nikhil VJ

Apr 1, 2019, 1:58:50 AM
to datameet
Hi Nishadh,

Thanks for sharing this! Great job on compiling all the states' data together in your repo.

Query for all: See this screenshot:



1. Are those boundaries for villages, or census blocks? (Or are they one and the same?)

2. Are there thicker and lighter boundaries in there, or is that just my eyes playing with the rasterized image?

3. Are the point locations with numbers : Villages, or polling stations? Or something else?

4. If villages, then do those numbers mean the last digits of 2011 census code?

5. Do those points coincide with sizeable human settlement locations?


We should geo-reference this stuff! Tall order, but hey, doesn't hurt to put it out there - maybe there's a GIS student out there looking for a worthy side project to show off on their CV? Just sayin' !


Regards
Nikhil VJ, Pune, India