Parliamentary Constituency to Assembly Constituency to Ward linkages


Siddarth Raman

Mar 11, 2014, 10:49:50 PM
to data...@googlegroups.com
Hi All,

In line with the discussions on elections, this is something I'd started working on a while back (and dropped). I was essentially hoping for a PC to AC to Ward mapping. As far as I understand, census 2011 has population data either at the level of the ward or the district, so if we had to run even rudimentary data analysis on a parliamentary or assembly constituency (like total population) accurately, I'm guessing we need to go bottom up.

I had started this by attempting to convert http://eci.nic.in/eci_main/CurrentElections/CONSOLIDATED_ORDER%20_ECI%20.pdf into Excel (using a mixture of pattern matching in Notepad++ and a bit of Excel VB). It's time consuming, largely because each state follows its own convention - nothing is standardized.

Any suggestions on how one might go about this? If I wanted to estimate the population in a parliamentary constituency, or the total households, or the urban/rural split, how would I go about it? Is there a better method than looking at the above demarcation notification? Are there datasets on this already?

New to the group, didn't find any prior discussions on Parliamentary to Assembly to Ward/Village demarcations. 

Regards,
Siddarth

Associate,
Public Records of Operations and Finance,
Janaagraha Centre for Citizenship and Democracy

 

Anand Chitipothu

Mar 12, 2014, 12:15:08 AM
to data...@googlegroups.com
Hi Siddarth,

The voter list PDFs have the ward info for each polling booth. The PDFs have the number of voters, but not the population. So it is possible to sum up those numbers to get a count of the number of voters in a PC or AC.
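
The summing-up described above can be sketched roughly as follows. This is only an illustration: the booth rows, the constituency labels and the column order are made up, and real rolls would first have to be parsed into such rows.

```python
from collections import defaultdict

# Hypothetical booth-level rows: (pc, ac, booth_no, voters).
# The labels and counts are invented for illustration only.
booths = [
    ("PC-1", "AC-1", 1, 950),
    ("PC-1", "AC-1", 2, 1020),
    ("PC-1", "AC-2", 1, 870),
    ("PC-2", "AC-3", 1, 1100),
]

def voters_by(rows, key_index):
    """Sum the voter count (column 3) per value of the chosen key column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[3]
    return dict(totals)

voters_per_ac = voters_by(booths, 1)  # keyed by assembly constituency
voters_per_pc = voters_by(booths, 0)  # keyed by parliamentary constituency
```

Note that this gives electors, not population, so it cannot replace the census figures asked about in the original question.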

If you want a polling booth to ward mapping, I'll be able to provide it.

Anand

Anand Chitipothu

Mar 12, 2014, 12:17:14 AM
to data...@googlegroups.com
btw, Anand Doshi has already parsed that PDF. The results are available at:


Anand
P.S: uff, so many Anands on this list

Raphael Susewind

Mar 12, 2014, 2:49:24 AM
to data...@googlegroups.com
Hey Siddarth and Anand,

I, too, am really interested in this, but have not made much progress
yet. I think there are two ways to do this, neither of which is
straightforward.

The "extract ward/village mentioned in roll PDF" strategy is one option.
Depending on the raw data, this can however be cumbersome (one source in
the vernacular, one in Latin script, etc.); I know a couple of scholars
who are attempting this and they get stuck all the time, having to match
manually rather frequently (which is a pain given that there are 800,000
or so polling stations).

Currently, we have the additional problem that many of the current roll
PDFs - for instance in UP - are broken: one cannot copy-paste (or
pdftotext, or extract through whatever means) from them, chiefly because
the ToUnicode CMap is corrupted by the version of Crystal Reports the ECI
is using. There is no real workaround other than reverse-OCR, which is a
pain-in-the-a**. Let me know if you figure out another way...

The second option would be a very different strategy, namely GIS
matching through nearest-neighbour analysis: "which Census village/ward
is closest to that particular polling booth" (or the other way round -
the computational challenge is to match ALL booths to at least one ward
AND vice versa). Unfortunately, Census village/ward lat/long is not in
the public domain, as far as I can see - and using proprietary data to do
the matching is legally complicated (even if one redistributes only the
matching result and not the proprietary data).
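
A minimal sketch of the two-way nearest-neighbour idea, assuming one already has a lat/long point per booth and per ward (which, as noted above, is exactly the data that is hard to get). All coordinates and names below are invented, and a brute-force search like this would need a spatial index to scale to hundreds of thousands of booths.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest(point, candidates):
    """Name of the candidate whose location is closest to point."""
    return min(candidates, key=lambda name: haversine_km(point, candidates[name]))

def match_both_ways(booths, wards):
    """Pair every booth with its nearest ward AND every ward with its nearest booth."""
    pairs = {(b, nearest(p, wards)) for b, p in booths.items()}
    pairs |= {(nearest(p, booths), w) for w, p in wards.items()}
    return sorted(pairs)

# Invented sample points, for illustration only.
booths = {"B1": (26.85, 80.95), "B2": (26.86, 80.96), "B3": (27.20, 78.00)}
wards = {"W1": (26.851, 80.951), "W2": (27.18, 78.02), "W3": (26.90, 81.00)}
pairs = match_both_ways(booths, wards)
```

Taking the union of both directions guarantees that no booth and no ward is left out, at the price of a many-to-many table (here B2 ends up linked to both W1 and W3).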

My 5 cents,
Let us know of any progress,

Raphael
> --
> For more details about this list
> http://datameet.org/discussions/
> ---
> You received this message because you are subscribed to the Google
> Groups "datameet" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to datameet+u...@googlegroups.com
> <mailto:datameet+u...@googlegroups.com>.
> For more options, visit https://groups.google.com/d/optout.

--
Raphael Susewind | BGHS Bielefeld University, CSASP University of Oxford
Snail Mail | Melanchthonstr. 4a, 33615 Bielefeld, Germany
Papers & Blog | http://www.raphael-susewind.de

Please do consider http://www.gnupg.org for encryption (key id A5ED49AE)

Avinash Celestine

Mar 13, 2014, 12:13:10 AM
to data...@googlegroups.com
Hi Raphael

In fact, the problem with the UP rolls is exactly what I am grappling with now. It seems to me that one way forward is to look at the exact mapping of Unicode characters embedded within the files. One way of generating such maps is to use a plugin like PDFlib's FontReporter, which works with Adobe Acrobat (http://www.pdflib.com/products/fontreporter/). Have you tried this method, and did it work for you? Do tell me if you (or anyone else) have given it a shot. I am planning to give it a go at least...

I have attached a sample roll (of an AC in Agra), along with the generated font report, if anyone wants to take a look.

A closer look at the roll shows that the main problem seems to be with the Devanagari 'matras', which do not come out correctly when you cut and paste.

regards

Avinash


fontreportUP.zip

Avinash Celestine

Mar 13, 2014, 12:35:16 AM
to data...@googlegroups.com
Well, I checked out the Unicode table and it only confirms what we knew anyway... that the same Unicode hex values are used for different characters...

So I guess it's back to the drawing board.

Raphael Susewind

Mar 13, 2014, 3:03:41 AM
to data...@googlegroups.com, Avinash Celestine
Hey Avinash,

yep - that's what I figured, too. Not only misplaced matras (those could
be rearranged), but real garbling, which cannot be resolved as far as
I can see. Worse, there isn't even a clear pattern - for a few
constituencies, I fed the Voter ID (which is in Latin script) to the
"search roll details by voter ID" function on the CEO website, which
returns the properly written Unicode name. I then compared the garbled
name and the Unicode name to see if there are any statistical
regularities - yet unfortunately, there are a thousand ways of garbling
"Avinash" - it's not always "Abniszhaa".

The only solution I can think of is the following (but I have not
implemented it): train Tesseract (an OCR engine that can handle Indic
scripts) with the exact font used in the PDF reports, so that it almost
perfectly recognizes anything written in this font (this was the
stumbling block for me, rather complicated work), then extract images of
the text areas of interest, and run them through OCR. If you want to
give it a shot...

Otherwise, we could only try to convince the EC to fix the bug in
Crystal Reports, and re-generate all PDFs - which is highly unlikely,
they have more important things to do right now (the PDFs display and
print alright, after all, just text extraction does not work - they
would perhaps even consider it a feature rather than a bug).

It might be useful to compile a list of states where this problem occurs
- I have seen it in Gujarat and UP for sure, but don't know whether it
happens everywhere,

Best,
Raphael

Avinash Celestine

Mar 13, 2014, 3:52:26 AM
to Raphael Susewind, data...@googlegroups.com
Oh, I see, so it's worse than I thought :-(

You are right. I doubt the EC will fix it (for entirely good reasons on their part - they have more important things to worry about).

I am trying a couple of alternative methods. Let me see if anything works - I will report back. For now, OCR seems to be the best option.

Avinash


Siddarth Raman

Mar 13, 2014, 10:09:31 PM
to data...@googlegroups.com
Hi Anand,

Thanks, but the CSV you linked to has only the PC to AC mapping (still awesomely useful!).

Also hoping for ward level details. My intent isn't necessarily focused on Voter Rolls. It's on the larger census data itself. What % of the population is enrolled at the PC (or AC or Ward) level? Was looking to calculate that data, and then overlay voter enrollment data on top as and when required. 

Regards,
Siddarth

Raphael Susewind

Mar 14, 2014, 2:44:21 AM
to data...@googlegroups.com
Hi all,

apropos Anand Doshi's https://gist.github.com/anandpdoshi/9448203 -
does somebody have the same table including AC constituency ID codes
(rather than just names)?

Best,
Raphael

Anand Chitipothu

Mar 14, 2014, 2:50:33 AM
to data...@googlegroups.com
I have that for some states.


If you want any other state, I can try to generate it.

Anand

Raphael Susewind

Mar 14, 2014, 2:56:14 AM
to data...@googlegroups.com
Looks great - all states would be even better... perhaps at the ODC
hackathon next weekend? R

Anand Chitipothu

Mar 14, 2014, 5:11:06 AM
to data...@googlegroups.com

indro ray

Mar 14, 2014, 12:57:08 PM
to data...@googlegroups.com
Hi Anand (Chitipothu),
Can I know the source from where you get the polling booth and ward data? Is it individual for each state and does it provide the lat-long for the polling booths?

Thanks,
Indro



Avinash Celestine

Mar 15, 2014, 1:57:05 AM
to data...@googlegroups.com
hi

Attached is an Excel file matching ACs, PCs, districts and states, along with codes for the ACs and PCs. I can add census district codes if you like... give me a day or two.

Some states are not present - like J&K... if someone could add those, that would be great.

Avinash
Dist-AC-PC.xls

Avinash Celestine

Mar 15, 2014, 1:58:29 AM
to data...@googlegroups.com
Oh, J&K is there after all. But I would also be grateful if someone could do a random check to see whether the matches between PC and AC are correct.

I took these from the final delimitation papers, if someone wants to know the source.

A

Avinash Celestine

Mar 15, 2014, 3:43:49 AM
to data...@googlegroups.com
Re the issue of mapping wards to AC/PC boundaries etc., raised by Siddarth and the original subject of this thread, here's my two bits. Warning: longish and extremely boring email follows!

Firstly, you might be better off using the delimitation XLS files rather than the PDFs. They are here:

The delimitation papers map the ACs and PCs to Census 2001 codes and boundaries. So it's actually a two-stage process:
one, get the AC, PC and census area/village/ward etc. to line up neatly in a usable table or database, and
two, map the 2001 census codes to the 2011 census codes.

Both are non-trivial steps.
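
As a rough illustration of why this is a two-stage lookup, here is a toy sketch that chains an AC-to-2001-code table with a 2001-to-2011 crosswalk and keeps track of what falls through. All codes and table contents are invented placeholders, not real census codes.

```python
# Stage one (hypothetical result): AC -> list of Census 2001 area codes.
ac_to_2001 = {"AC-1": ["2001-A", "2001-B"], "AC-2": ["2001-C"]}

# Stage two (hypothetical crosswalk): Census 2001 code -> Census 2011 code.
code_2001_to_2011 = {"2001-A": "2011-X", "2001-B": "2011-Y"}

def chain(ac_to_2001, crosswalk):
    """Resolve each AC to 2011 codes; collect 2001 codes with no crosswalk entry."""
    resolved, missing = {}, {}
    for ac, codes in ac_to_2001.items():
        resolved[ac] = [crosswalk[c] for c in codes if c in crosswalk]
        missing[ac] = [c for c in codes if c not in crosswalk]
    return resolved, missing

resolved, missing = chain(ac_to_2001, code_2001_to_2011)
```

Keeping the `missing` list explicit makes it harder to report a constituency total silently built from only part of its areas.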

One: the census categories specified in the AC/PC delimitation documents do not map neatly to the areas in the census data files. In the first place, there are no census codes for individual areas in the delim docs, which makes it that much harder. Secondly, the level of aggregation differs between the delim papers and the census docs. To take UP as an example, many constituencies in the delim papers are mapped to what in UP are called 'KCs' (kanungo circles) or 'PCs' (patwari circles, not to be confused with parliamentary constituencies). The papers do not go further down and say which villages make up those KCs or PCs. Ditto with Bihar and other states. The census tables, on the other hand, have data about villages, tehsils and blocks; they do not go into KCs and PCs. So there's often a mismatch between aggregation levels, which will take work to resolve.

Two: mapping Census 2001 codes to 2011 codes. If you do that, the starting point is here:
https://egovstandards.gov.in/mapping_land_region_codification (the site will throw a security warning in many browsers, but I think that's because NIC has not updated its security certificates or whatever. It's not been a problem for me, but you proceed at your own risk.)

This maps 2001 codes to 2011 codes. Here's the problem:
For urban areas, the coding goes down only to the town level. So there is one town code for the whole of Mumbai, for example, which maps 2001 to 2011. What you cannot do with this table, and this is a big problem for urban areas, is map wards which existed as of the 2001 census to wards which existed as of 2011. Many city municipalities have rejigged ward boundaries over the last decade or so (I know Delhi has). So even if you can match town codes, you still need to match wards from 2001 to 2011. All this is less of a problem for rural areas, though it's still present to some extent. This problem also makes 2001 and 2011 census data non-comparable at the ward level because, if I recall correctly, Census 2011 uses newly delimited wards whereas 2001 will (obviously) use the old ward boundaries.

If you are interested in only a specific city or area, here's a suggestion: bypass the delim papers altogether. Start with the PDF electoral rolls, which are now online for most states. The first page of the roll for each polling station has a standard format which has the area and ward boundaries, the AC and PC data, as well as the aggregate number of electors. Write a scraper to parse just the first page of each roll. Of course, if you are doing this for UP, you are totally screwed because, as Raphael pointed out earlier in this thread, there is a problem with the PDF Unicode mapping, so you'll basically get gibberish. But I think that PDFs for other cities may be more scrape-friendly. I tried it out with a couple of PDF rolls for Delhi as a test case, and it worked reasonably well. The ward data from the scrape should line up easily with the census data. Hopefully.

Having said that, I did take a shot at doing the mapping, at least to the Census 2001 codes. Attached is the result. This is an Excel file (within a zip file) with about 54,000 rows, of which 20,000 rows are 'not matched' for the reasons described above, so it's very much a work in progress. Different states have been matched to different extents. UP and Bihar have big gaps; states like Delhi and Gujarat less so. A few more points:
* The left side of the table is from the delim papers, the right side is Census 2001.
* Where the delim papers specify that only a 'part' of a ward or area is contained within that AC, I have worked out the proportion of the entire ward population that is covered, in the rightmost column. This column is not complete either.
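
To show how such a proportion column could be used, here is a small sketch that estimates an AC's population by weighting each ward's population by the share of the ward that lies in the AC. The row layout and the numbers are assumptions for illustration and do not reflect the actual columns of the attached file.

```python
def estimate_ac_population(rows):
    """Sum ward_population * share_in_ac for each AC.

    rows: iterable of (ac, ward_population, share_in_ac), share between 0 and 1.
    """
    totals = {}
    for ac, ward_pop, share in rows:
        totals[ac] = totals.get(ac, 0.0) + ward_pop * share
    return totals

# Invented example: one ward of 8,000 people is split 25% / 75% between two ACs.
rows = [
    ("AC-1", 10000, 1.0),
    ("AC-1", 8000, 0.25),
    ("AC-2", 8000, 0.75),
]
ac_population = estimate_ac_population(rows)
```

This assumes people are spread evenly within a split ward, which is at best an approximation.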

A word about the methodology and my 'big breakthrough' :-)) in matching the two datasets even to this extent. The delim papers carry population/SC/ST data from the census. It struck me that, given the district and state, these population numbers are actually a kind of unique identifier of their own. That is, the population figure which the delim papers give for village/ward (x) should, given the state and district, match the population figure in the census tables down to the last individual - in other words, exactly. So the matching field is some form of: statecode-districtcode-population total. This actually worked far better than I had hoped, though obviously not completely. As a cross-check on the above, I re-ran the match using statecode-districtcode-SC population/ST population. The possibility that two areas in a district have exactly the same total population and the same SC and ST population is, I hope, quite small.
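
The population-as-identifier idea could be sketched along these lines. This is not the procedure actually used for the attached file, just a minimal illustration: the field names (`state`, `district`, `pop`, `sc`, `st`, `area`, `census_code`) and all values are hypothetical, and a key that hits more than one census row is treated as unmatched rather than guessed.

```python
from collections import defaultdict

def build_index(rows, fields):
    """Index census rows by the tuple of the chosen key fields."""
    index = defaultdict(list)
    for row in rows:
        index[tuple(row[f] for f in fields)].append(row)
    return index

def match(delim_rows, census_rows, fields=("state", "district", "pop", "sc", "st")):
    """Match on exact key equality; ambiguous or missing keys stay unmatched."""
    index = build_index(census_rows, fields)
    matched, unmatched = [], []
    for row in delim_rows:
        hits = index.get(tuple(row[f] for f in fields), [])
        if len(hits) == 1:
            matched.append((row["area"], hits[0]["census_code"]))
        else:
            unmatched.append(row["area"])
    return matched, unmatched

# Invented rows, for illustration only.
census = [
    {"state": "09", "district": "15", "pop": 1200, "sc": 300, "st": 0, "census_code": "C001"},
    {"state": "09", "district": "15", "pop": 1200, "sc": 250, "st": 10, "census_code": "C002"},
    {"state": "09", "district": "15", "pop": 845, "sc": 100, "st": 0, "census_code": "C003"},
]
delim = [
    {"area": "Village A", "state": "09", "district": "15", "pop": 1200, "sc": 300, "st": 0},
    {"area": "Village B", "state": "09", "district": "15", "pop": 845, "sc": 100, "st": 0},
    {"area": "Village C", "state": "09", "district": "15", "pop": 999, "sc": 1, "st": 1},
]

matched, unmatched = match(delim, census)
# Matching on total population alone leaves Village A ambiguous (two rows with 1200).
matched_pop_only, unmatched_pop_only = match(delim, census, ("state", "district", "pop"))
```

The second call shows why adding the SC/ST figures helps: two areas sharing a total population no longer collide once the extra fields are part of the key.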

Anyway, I am hoping people can add to this... The main caveat applies, which is that the possibility of error is definitely there. So if you do use this for analysis, please, please, please do random cross-checks. It'll take time, but it will save potential embarrassment :-) and wrong data. And if you do find errors, please fix and re-upload.


regards

Avinash



PC-AC-wards-villages.zip

Raphael Susewind

Mar 15, 2014, 3:49:26 AM
to data...@googlegroups.com, Avinash Celestine
Hi Avinash and all,

I noticed that each constituency falls within only one district in your
file, but there are constituencies that span several districts and vice
versa (rare, but it happens). I attached a list of those, extracted from
polling-station data on eci-polldaymonitoring.nic.in. These are ACs only;
naturally the problem would proliferate if you aggregate to PC.

Hope it helps,
Raphael

multiple.csv

Avinash Celestine

Mar 15, 2014, 3:58:50 AM
to data...@googlegroups.com
Thanks. The rule, as far as I remember, is that ACs are entirely contained within a district boundary. PCs, on the other hand, can span district boundaries.

A

Raphael Susewind

Mar 15, 2014, 4:02:10 AM
to data...@googlegroups.com, Avinash Celestine
Hey Avinash et al,

using population figures for matching is a neat idea, great!

Meanwhile, I made both legal (licensing issues) and mathematical
progress on matching 2001 Census villages to 2014 polling booths. I have
a large conference next week which might delay things, but I expect to
bring out an open license dataset with the resulting matching table soon
after that. Of course, the matching quality with my strategy entirely
depends on accuracy of GIS data, which varies from district to district
(in some districts, the officers concerned clearly decided to photoshop
rather than visit each station, resulting in a neat artificial grid -
quite funny to see, but quite useless otherwise). Theoretically, one
could combine my algorithm with a fuzzy "name proximity" measure, but I
am not sure yet whether this will improve accuracy or just add confusion.

Anyways, it will be interesting to combine my approach with yours, and
with that of others going down similar roads.

Which still does not solve the 2001 to 2011 Census mapping of course,

Best,
Raphael

Raphael Susewind

Mar 15, 2014, 4:03:30 AM
to data...@googlegroups.com
Might well be the rule (I remember having read something like this,
too), but the reality apparently differs (at least in the EC's own
data)... Never depend on rules, check them! ;-)
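
Checking the rule against data can be as simple as the sketch below, which lists every AC whose polling stations are recorded under more than one district. The tuple layout and the sample rows are assumptions for illustration, not the actual format of the EC's polling-station data.

```python
from collections import defaultdict

def acs_spanning_districts(stations):
    """Return {(state, ac): sorted districts} for ACs seen in more than one district.

    stations: iterable of (state, ac, district) tuples, one per polling station.
    """
    districts = defaultdict(set)
    for state, ac, district in stations:
        districts[(state, ac)].add(district)
    return {key: sorted(found) for key, found in districts.items() if len(found) > 1}

# Invented sample rows, for illustration only.
stations = [
    ("S1", "AC-10", "District A"),
    ("S1", "AC-10", "District A"),
    ("S1", "AC-11", "District A"),
    ("S1", "AC-11", "District B"),
]
spanning = acs_spanning_districts(stations)
```

Keying on (state, AC) rather than the AC number alone matters because AC numbers are only unique within a state.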


Avinash Celestine

Mar 15, 2014, 4:19:17 AM
to data...@googlegroups.com
Unfortunately you may be right... so that's another layer of complexity...

On a slightly related note, I have often wondered, though I don't know if it's actually possible in practice, whether governments could do some delimitation on their own (for political purposes). For instance, if a village/area is near the border of a constituency, it could be brought, through an order, under the administrative jurisdiction of a neighbouring district. If that district is then served by a different AC, you have effectively done some delimitation of your own, without actually calling it that...

Given that delimitation papers don't specify individual villages in many cases, it seems entirely possible to do...

looking forward to your dataset, Raphael!

avinash



Srinivasan Ramani

Mar 15, 2014, 4:30:51 AM
to data...@googlegroups.com
Interjecting in a fantastic conversation... (Kudos to Avinash & Raphael and others for the efforts to mix/match AC-PC and administrative jurisdictions)..

There is no direct containment of ACs within a district. A case in point is Delhi, where ACs don't fit single districts at all.

Avinash, 

The trouble with the kind of political delimitation that you talk about is that... it doesn't really serve any purpose. With the cross-determination of powers at various levels - blocks, wards and districts under the bureaucracy vis-a-vis MLAs - changing administrative jurisdictions doesn't make as much sense as direct gerrymandering for political vote-gaining. In other words, the administrative powers of an MLA are much too nebulous compared to those of district officials across the bureaucracy and the third tier of democracy.
--
Best Regards,
Srinivasan V. Ramani ,
Senior Assistant Editor,
Economic and Political Weekly ,
New Delhi: 110 067
09650855669

Avinash Celestine

Mar 15, 2014, 5:01:34 AM
to data...@googlegroups.com
Hmm, yes, that's true. It's basically an inefficient way to engineer seat gains - there are many other more efficient ways!

A


Siddarth Raman

Mar 16, 2014, 1:03:57 AM
to data...@googlegroups.com
Hi Avinash,

Thanks a ton for pointing out the Excel files with the delimitation. I read what you wrote. Will take a look at the zip file and cross-check. I too had hoped that the district mapping would be contiguous with some political boundaries, but it isn't. Bangalore, funnily, has a ward (44, I think) which is split across three different patches of land that don't share a boundary!

For those interested in more background regarding the why of it all...

I was curious to understand what, according to anyone, an urban parliamentary constituency is. Mint had done a study a while back - http://www.livemint.com/Specials/XovcjYRkWCBLJSwQwxY6wN/India-has-only-53-predominantly-urban-constituencies.html - their main source was the million-plus cities of India as per the census. That sparked off the thought. I wanted to dig deeper. I thought that while one might disagree with the census definition of urban, it's a basis to begin with. I was hoping to look at every PC and AC with its % urban population: > 50% would imply an urban constituency (perhaps not the best method, but it seemed like a good start).
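
The > 50% rule could be sketched like this, assuming urban and rural population totals per constituency were already available (which is the hard part discussed in this thread). The constituency names and figures are invented.

```python
def urban_share(urban, rural):
    """Fraction of the total population that the census classes as urban."""
    total = urban + rural
    return urban / total if total else 0.0

def classify(constituencies, threshold=0.5):
    """Label a constituency 'urban' when its urban share exceeds the threshold."""
    return {
        name: "urban" if urban_share(urban, rural) > threshold else "rural"
        for name, (urban, rural) in constituencies.items()
    }

# Invented figures: (urban population, rural population).
sample = {"PC-A": (900000, 300000), "PC-B": (200000, 1000000)}
labels = classify(sample)
```

Keeping the threshold as a parameter makes it easy to see how sensitive the count of "urban" constituencies is to the cut-off.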

I guess it isn't as easy as I imagined, but still would be good to figure out. Do let me know if anyone has other ideas.

Regards,
Siddarth

Raphael Susewind

Mar 16, 2014, 5:24:14 AM3/16/14
to data...@googlegroups.com
Hi Siddarth,

for my UP dataset, I used spatial matching of polling booth locations
against the MODIS urban extent satellite layer of 2002 - this tends to
capture only larger urban centres, though. Another option is to look at "how many
polling stations have multiple booths" [polling stations being defined
as booths with almost same name in almost same location] - this turned
out to be a rather accurate (and up-to-date) representation of the
"urban" as well as "small town" - only real rural stations have only one
booth, in my experience (UP)...
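
A minimal sketch of the multi-booth idea, in Python. This is not the actual code used for the UP dataset: the booth records, the name-cleaning rules and the coordinate rounding are all assumptions made for illustration. Rounding coordinates is a crude stand-in for "almost same location" and will miss pairs that straddle a rounding boundary.

```python
import re
from collections import defaultdict

# Hypothetical booth records: (booth name, latitude, longitude).
booths = [
    ("Primary School Rampur Room No 1", 26.8467, 80.9462),
    ("Primary School Rampur Room No 2", 26.8468, 80.9461),
    ("Panchayat Bhawan Sultanpur", 26.2601, 82.0702),
]


def station_key(name, lat, lon, precision=3):
    """Collapse a booth to a station key: cleaned name plus rounded coordinates."""
    base = re.sub(r"[^a-z ]", " ", name.lower())           # drop digits and punctuation
    base = re.sub(r"\b(room|no|part|wing)\b", " ", base)   # drop room-numbering words
    base = " ".join(base.split())                          # normalise whitespace
    return base, round(lat, precision), round(lon, precision)


def multi_booth_flags(booths):
    """Map each station key to True if more than one booth shares it."""
    counts = defaultdict(int)
    for name, lat, lon in booths:
        counts[station_key(name, lat, lon)] += 1
    return {key: count > 1 for key, count in counts.items()}


flags = multi_booth_flags(booths)
```

Stations flagged True would be the candidates for "urban" or "small town"; real booth names would need far more careful cleaning than the few words stripped here.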

Best,
Raphael

Rukmini S

Mar 25, 2014, 6:14:06 AM3/25/14
to data...@googlegroups.com
Just chipping in to say I have been grappling with this PC/ district issue for so long, discovering this thread is cathartic. Unfortunately my tech skills are a bit sad so I will quietly cheer from the sidelines.

Best,
Rukmini



Avinash Celestine

Mar 25, 2014, 7:30:08 AM3/25/14
to data...@googlegroups.com
Actually, a quick and dirty way to do it, and not that hard, is to restrict the mapping to the district level only, and not bother to go further down (to village/town/ward etc.). If you are doing analysis for parliamentary constituencies, my guess is this should work reasonably well. It works less well if you want to look at assembly constituencies. What you end up with is effectively a weighted average, for each PC, of the districts that it matches to. In most cases, we would use only percentages anyway (share of literate population and so forth) rather than absolute numbers, so it works.
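
A minimal sketch of that weighted average, in Python. The district names, literacy percentages and weights are made up, and the choice of weight (here, the number of the PC's electors located in each district) is an assumption; the approach only says "weighted average" without fixing what the weights are.

```python
def pc_estimate(district_values, overlap_weights):
    """Weighted average of district-level percentages for one PC.

    district_values: district -> percentage (e.g. literacy rate)
    overlap_weights: district -> weight of that district within the PC
                     (assumed here: electors of the PC located in the district)
    """
    total = sum(overlap_weights.values())
    if total == 0:
        raise ValueError("no overlap weights for this PC")
    weighted = sum(district_values[d] * w for d, w in overlap_weights.items())
    return weighted / total


# Hypothetical PC that spans two districts.
literacy = {"District X": 70.0, "District Y": 50.0}
weights = {"District X": 300_000, "District Y": 100_000}

estimate = pc_estimate(literacy, weights)
```

This only makes sense for percentages or rates. Absolute numbers (total population, households) would need the share of each district that falls inside the PC, which the district-level mapping alone does not give.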

I updated the AC-PC to district mapping file that I uploaded a few days ago to correct errors in two constituencies (PC no. 33 in Bihar, and one constituency in Uttarakhand; I had duplicated them).

In an additional sheet, the EC's elector information for the latest rolls is given, along with the mapping of district to AC. I suggest you use that rather than mine. It's more comprehensive, and is likely free (or at least freer!) from errors that I made. (The source of the data is http://www.eci-polldaymonitoring.nic.in/erollpublic/.) I matched in the PC names to that table. File is attached.

Avinash




District-AC-PC.xlsx

Raphael Susewind

Apr 17, 2014, 6:46:24 AM4/17/14
to data...@googlegroups.com
Dear all,

just a follow-up to this oldish thread: I recently switched to the
newest version of Tesseract OCR to transform buggy PDF rolls to text -
and it works surprisingly well. Small typos here and there, but those can
be rectified. In case anyone else is looking for a solution to this...
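
One possible way to rectify such typos, sketched in Python with the standard library's difflib. This is an illustration only, not the workflow described above: the list of known names and the 0.8 cutoff are assumptions, and the OCR step itself is not shown.

```python
import difflib

# Hypothetical list of correct names to match OCR output against.
KNOWN_NAMES = ["Lucknow West", "Lucknow East", "Varanasi North"]


def correct_name(ocr_text, known=KNOWN_NAMES, cutoff=0.8):
    """Return the closest known name, or the OCR text unchanged if none is close enough."""
    matches = difflib.get_close_matches(ocr_text, known, n=1, cutoff=cutoff)
    return matches[0] if matches else ocr_text


fixed = correct_name("Lucknovv West")    # typical OCR slip: "w" read as "vv"
untouched = correct_name("Agra South")   # no close match, left as-is
```

A high cutoff keeps near-identical names (such as the two Lucknow entries) from being confused with each other, at the cost of leaving heavier garbling uncorrected.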

Best,
Raphael