Hi Shubham,
They have been delaying this for several years, since a census requires the cooperation of so many people, all of whom would have to follow government diktats. Surveys and other data collection methods are far easier to capture, since the number of nodes that need to act in a sketchy manner is manageable. However, now that they have experience with Systematic Voter Deletion, in addition to the general change in norms about expectations of truth from this government, they have gone ahead with building a national demographic dataset in the image of their idea of India.
There is a portal
WorldPop has built a seemingly reliable dataset, estimated from several factors and geospatial imagery. It is also hosted on Google Earth Engine (GEE).
This can also be used to check their numbers.
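One crude way to run that check, as a minimal sketch only: compare the reported count for each district with the independent estimate, and flag districts whose relative gap is an outlier compared to the rest. The district names, all the numbers, the `flag_discrepancies` helper and the 3.5 cutoff below are hypothetical placeholders I made up for illustration, not real Census or WorldPop figures.

```python
from statistics import median


def flag_discrepancies(census, estimate, threshold=3.5):
    """Return {district: relative_gap} for districts whose gap is an outlier.

    relative_gap = (census - estimate) / estimate
    Outliers are picked with a robust z-score (median and MAD), so that one
    extreme district does not distort the baseline for the others.
    """
    shared = sorted(set(census) & set(estimate))
    gaps = {d: (census[d] - estimate[d]) / estimate[d] for d in shared}

    med = median(gaps.values())
    mad = median(abs(g - med) for g in gaps.values())
    if mad == 0:
        # No spread at all in the gaps, so nothing stands out.
        return {}

    flagged = {}
    for district, gap in gaps.items():
        robust_z = 0.6745 * (gap - med) / mad
        if abs(robust_z) > threshold:
            flagged[district] = round(gap, 3)
    return flagged


# Hypothetical placeholder counts, NOT real data.
census = {"A": 1000, "B": 2050, "C": 1490, "D": 3000, "E": 600}
estimate = {"A": 1010, "B": 2000, "C": 1500, "D": 2950, "E": 900}

print(flag_discrepancies(census, estimate))  # {'E': -0.333}
```

A flagged district only means the two sources disagree more than usual there; it does not say which of the two is wrong, so it is a pointer for where to look closer.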
However, since these datasets only include numbers of people (and one of them also breaks them down by gender), it would be hard to counter-check a Census that reports the correct number of people while fudging the attributes of those people or localities. This time, `caste` is also part of the census, which too is susceptible to fudging, in addition to what are by now basic norms, such as fudging against certain religions.
For analysing these kinds of fudging, we could have analysed electoral roll data, which is supposed to be a dataset of almost everyone over 18 years of age. Electoral roll data also contains names, from which, for crude analysis, some researchers have built ML-based methodologies (It’s All in the Name: A Character-Based Approach to Infer Religion, Cambridge University Press, 23 March 2023) to estimate religion from names. This methodology could also be extended to caste inference if a reliable, large enough dataset of names with verified caste labels were available; however, it will always be prone to local variations, limitations with generic second names, etc. So, based on my current understanding, fudging at this level would be hard to counter-check without going to the localities.
It would be an essential democratic tool if we could build a reliable way to identify patterns of data fudging and their spatial variation (where they are most likely to apply it, such as West Bengal, Jammu and Kashmir, Punjab, etc.), and then methodologies for each of these fudging patterns.
- Amit