Hey Avinash et al,
Using population figures for matching is a neat idea, great!
Meanwhile, I made both legal (licensing issues) and mathematical
progress on matching 2001 Census villages to 2014 polling booths. I have
a large conference next week which might delay things, but I expect to
bring out an openly licensed dataset with the resulting matching table soon
after that. Of course, the matching quality with my strategy depends
entirely on the accuracy of the GIS data, which varies from district to district
(in some districts, the officers concerned clearly decided to photoshop
rather than visit each station, resulting in a neat artificial grid -
quite funny to see, but quite useless otherwise). Theoretically, one
could combine my algorithm with a fuzzy "name proximity" measure, but I
am not sure yet whether this will improve accuracy or just add confusion.
Anyway, it will be interesting to combine my approach with yours, and
with that of others going down similar roads.
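
To make the "name proximity" idea a bit more concrete, here is a minimal sketch of how a distance score and a fuzzy name score could be blended. This is not my actual matching algorithm, just an illustration. The record fields (name, lat, lon), the 10 km cutoff and the 50/50 weighting are all assumptions made up for the example.

```python
import math
from difflib import SequenceMatcher


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Fuzzy name proximity in [0, 1], ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def combined_score(booth, village, max_km=10.0, name_weight=0.5):
    """Blend a geographic score and a name score (weights are assumptions)."""
    d = haversine_km(booth["lat"], booth["lon"], village["lat"], village["lon"])
    geo = max(0.0, 1.0 - d / max_km)  # 1 at zero distance, 0 beyond max_km
    name = name_similarity(booth["name"], village["name"])
    return (1.0 - name_weight) * geo + name_weight * name


def best_village(booth, villages, **kwargs):
    """Pick the candidate village with the highest combined score."""
    return max(villages, key=lambda v: combined_score(booth, v, **kwargs))
```

Setting name_weight=0.0 gives a purely geographic match, so running both variants on the same district is one way to see whether names help or just add confusion.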
That still does not solve the 2001 to 2011 Census mapping, of course.
Best,
Raphael
> village/ward (x) given the state and district, should match _down to the
> last individual_, the population figures from the census - in other
> words, exactly. So the matching field is some form of:
> statecode-districtcode-population total. This actually worked far better
> than I had hoped, though obviously not completely. As a cross check on
> the above, I re-ran the match, using state-districtcode-SC population/ST
> population. The possibility that two areas in a district have exactly
> the same total population _and_ the same SC and ST population is, I
> hope, quite small.
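
(Inline note: as I read it, the key-based match with the SC/ST cross check could be sketched roughly as below. This is only my reading of the description above, not Avinash's actual procedure, and the field names id, state, district, total_pop, sc_pop and st_pop are hypothetical.)

```python
from collections import defaultdict


def population_key(row):
    """Key of the form statecode-districtcode-population total."""
    return (row["state"], row["district"], row["total_pop"])


def sc_st_key(row):
    """Cross-check key: state, district, SC population, ST population."""
    return (row["state"], row["district"], row["sc_pop"], row["st_pop"])


def match_by_key(left, right, key_fn):
    """Return {left_id: right_id} for keys that occur exactly once on each side."""
    left_index = defaultdict(list)
    right_index = defaultdict(list)
    for row in left:
        left_index[key_fn(row)].append(row["id"])
    for row in right:
        right_index[key_fn(row)].append(row["id"])
    matches = {}
    for key, left_ids in left_index.items():
        right_ids = right_index.get(key, [])
        # Skip ambiguous keys: two areas sharing the same figures cannot be told apart.
        if len(left_ids) == 1 and len(right_ids) == 1:
            matches[left_ids[0]] = right_ids[0]
    return matches


def cross_checked_matches(left, right):
    """Split total-population matches into confirmed and disputed by the SC/ST key."""
    by_total = match_by_key(left, right, population_key)
    by_sc_st = match_by_key(left, right, sc_st_key)
    confirmed = {l: r for l, r in by_total.items() if by_sc_st.get(l) == r}
    disputed = {l: r for l, r in by_total.items() if l not in confirmed}
    return confirmed, disputed
```

The disputed pairs are the natural place to start the random manual checks.
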
>
> Anyway, I am hoping people can add to this... The main caveat applies,
> which is that errors are definitely possible. So if you do
> use this for analysis, please, please, please do random cross checks.
> It'll take time, but it will save potential embarrassment :-) and wrong
> data. And if you do find errors, please fix them and reupload.
>
>
> regards
>
> Avinash
>
> On Sat, Mar 15, 2014 at 11:28 AM, Avinash Celestine
> <avinash....@gmail.com> wrote:
>
> Oh, J&K is there after all. But I would also be grateful if someone
> could do a random check to see if the matches between PC/AC are correct.
>
> I took these from the final delimitation papers, if someone wants to
> know the source.
>
> A
>
>
> On Sat, Mar 15, 2014 at 11:27 AM, Avinash Celestine
> <avinash....@gmail.com> wrote:
>
> Hi,
>
> Attached is an Excel file with AC-PC-district-state matching, along
> with codes for AC-PC. I can add census district codes if you
> like... give me a day or two.
>
> Some states are not present, like J&K... if someone could add
> those, that would be great.
>
> Avinash
>
>
> On Fri, Mar 14, 2014 at 10:27 PM, indro ray
> <rayind...@gmail.com> wrote:
>
> Hi Anand (Chitipothu),
> May I know the source from which you get the polling booth
> and ward data? Is it separate for each state, and does it
> provide the lat-long for the polling booths?
>
> Thanks,
> Indro
>
>
> On Wed, Mar 12, 2014 at 9:45 AM, Anand Chitipothu
> <anand...@gmail.com> wrote:
>
>
>
> On Wed, Mar 12, 2014 at 8:19 AM, Siddarth Raman
> <thridda...@gmail.com> wrote: