2024 Lok Sabha election results


Vivek Matthew

Jun 4, 2024, 11:55:19 PM
to datameet
Hi all,

I have scraped the 2024 Lok Sabha election results from the results.eci.gov.in website. In case anyone is interested, you can find the CSV with the results attached.

Once constituency-wise turnout numbers are released for phase 7, I will include additional columns for turnout and vote share numbers.

Note that semicolon (;) is used as the column separator.
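For anyone loading the file in Python, a minimal sketch of reading semicolon-separated data with the standard `csv` module follows. The column names and sample rows are hypothetical placeholders, not the actual header of LS2024.csv.

```python
import csv
import io

# Hypothetical sample rows; the real LS2024.csv may use different column names.
SAMPLE = """constituency;candidate;party;votes
Example PC 1;Candidate A;Party X;512340
Example PC 1;Candidate B;Party Y;498211
"""


def read_results(handle):
    """Parse semicolon-separated rows into a list of dicts keyed by header."""
    return list(csv.DictReader(handle, delimiter=";"))


rows = read_results(io.StringIO(SAMPLE))

# For the attached file, something like this should work:
#   with open("LS2024.csv", newline="", encoding="utf-8") as f:
#       rows = read_results(f)
```

All values come back as strings, so vote counts need an explicit `int(...)` conversion before any arithmetic.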

Regards,
Vivek
LS2024.csv

Nikhil VJ

Jun 5, 2024, 3:21:49 AM
to data...@googlegroups.com
Hi all,

Nice work Vivek!

I was scraping to capture, by timestamp, how lead margins and vote counts change over time, from the state-wise results pages and the PC-wise results pages.

I've collated the data and posted it, along with the scraping and collating (Python) scripts, on this GitHub repo:

Flaws in this data:
1. Didn't catch it all from the beginning: scraping of the lead-margin tallies started around 1:50 pm, and scraping of per-candidate vote numbers started around 4:30 pm.
2. Some time intervals are missing for some constituencies: sometimes pages didn't load, or the script errored out on edge cases.
3. I bungled applying the "U" prefixes for union territories, so those rows were scraped quite late.

But all in all I think it's a pretty good dataset for making time-series viz's, for "auditing" tallies over time and detecting out-of-norm additions, etc., for folks who are interested in settling some ongoing debates using data.
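As a rough illustration of what such an audit could look like, here is a minimal sketch that flags steps where a cumulative count drops or jumps far more than usual. The snapshot format, the sample numbers, and the `factor` threshold are all assumptions made for illustration; they are not taken from the repo's actual data layout.

```python
from statistics import median


def flag_odd_steps(snapshots, factor=5.0):
    """Return (timestamp, delta, reason) for suspicious steps.

    snapshots: list of (timestamp, cumulative_votes), sorted by time.
    factor: hypothetical threshold; a jump is flagged when it exceeds
            factor times the median positive step.
    """
    deltas = [(t2, v2 - v1) for (_, v1), (t2, v2) in zip(snapshots, snapshots[1:])]
    positive = [d for _, d in deltas if d > 0]
    typical = median(positive) if positive else 0
    flagged = []
    for ts, d in deltas:
        if d < 0:
            # A cumulative tally should never go down.
            flagged.append((ts, d, "count decreased"))
        elif typical and d > factor * typical:
            flagged.append((ts, d, "unusually large jump"))
    return flagged


# Made-up snapshots for one candidate, purely for demonstration.
snapshots = [
    ("14:00", 10000),
    ("14:10", 12000),
    ("14:20", 14100),
    ("14:30", 13900),
    ("14:40", 16000),
    ("14:50", 40000),
]
flagged = flag_odd_steps(snapshots)
```

A large jump is not necessarily an anomaly in the counting: it can simply reflect a missed scrape interval, so dividing each step by the elapsed time between snapshots would be a sensible refinement.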

Disclaimer: I'm only sharing the data as it was at those timestamps. This is secondary, scraped data that is prone to flukes, like an HTML tag mis-rendering and causing a bad number to come in. If you find something odd, kindly look up the official sources, file RTIs etc., but leave me out of it pls.


------------------------------------

My compliments to Election Commission of India, in case anyone from there is reading:
1. It was good that ECI published absolute vote counts as whole integers. Hope to see this maintained. This was a lot better than the rounded-off vote-share fractions we were getting during counting in the US 2020 elections, which made it impossible to calculate the actual number of votes.

2. Good website work, consistent naming of each constituency / state's pages and consistent page structures.

3. Page-not-opening cases occurred but were rare, and the glitches disappeared from around evening onwards, when the declarations were happening and I'd expect more site visitors. On my part, I ensured my scripts hit the site one page at a time and kept adequate time intervals so that I didn't bombard the server (to coders: this was intentional. Don't suggest "fixing" it with parallel threading etc.; that gets you 429'd).

4. Candidates' photos were properly organized and rendered instantly on all the PC-wise pages I was checking out. This means each and every candidate was properly tracked in the DB, their files were properly linked, and small thumbnails were kept, as opposed to past elections, when there would only be scanned pages listing all the candidates' totals. One suggestion: converting these to .webp format could shrink the file sizes, and your egress load, by up to around 10x.

5. Even prior to the election, the voter lists were quite well managed: the voter-roll PDFs were easy to download, and it was quite easy to find our part + serial number provided we'd done our homework. FYI, that was the only info we needed in hand apart from photo ID on voting day: if you just shared these with the officer when you entered the booth, they'd locate your entry in 5 seconds and you would be done voting in under a minute.

6. All in all, we've come a long way in digitization and in making this data accessible to all. Thank you for all the work done.

7. It would be great if you published some inside stories of the technical infrastructure (server specs etc) used on 4th June for serving the website.
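For coders wondering what the one-at-a-time approach from point 3 might look like, here is a minimal sketch of a sequential fetch loop with a pause between requests. It is not the script from the repo: the function names, the 5-second delay, and the error handling are assumptions chosen for illustration.

```python
import time
import urllib.request


def fetch(url, timeout=30):
    """Download one page and return its body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl_politely(urls, delay_seconds=5.0, fetch_page=fetch, sleep=time.sleep):
    """Fetch URLs strictly one at a time, pausing between requests.

    delay_seconds is a hypothetical value; pick an interval the server
    can comfortably handle.
    """
    pages = {}
    for i, url in enumerate(urls):
        if i:
            sleep(delay_seconds)  # wait before every request except the first
        try:
            pages[url] = fetch_page(url)
        except OSError as exc:  # urllib's URLError and timeouts subclass OSError
            pages[url] = None  # record the miss and move on to the next page
            print(f"skipped {url}: {exc}")
    return pages
```

Keeping the loop sequential means the server never sees more than one in-flight request from the scraper, and a failed page is recorded as missing rather than retried in a tight loop.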


--
Cheers,
Nikhil VJ
https://nikhilvj.co.in


--
Datameet is a community of Data Science enthusiasts in India. Know more about us by visiting http://datameet.org
---
You received this message because you are subscribed to the Google Groups "datameet" group.
To unsubscribe from this group and stop receiving emails from it, send an email to datameet+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/datameet/0c95b2a3-27d3-4146-8ce3-44a49ae72f6fn%40googlegroups.com.

Vivek Matthew

Jun 5, 2024, 4:16:09 AM
to data...@googlegroups.com
This data with timestamps is great, Nikhil! It will be interesting to compare it with the roundwise results when ECI puts them up for parliamentary constituencies.

ECI has already put out the roundwise vote counts for yesterday's assembly election results: https://results.eci.gov.in/AcResultGenJune2024/RoundwiseS011.htm?ac=1



pmay...@gmail.com

Jun 5, 2024, 6:44:30 AM
to datameet
Hi Vivek,
Thanks so much for this!
best wishes,
Peter