
CBFC cuts data


Thejesh GN

Apr 9, 2025, 10:49:07 AM
to datameet

You can search for a specific film and its certification, but the search doesn't list the censor changes.

Is there a place I can get that? Maybe every day, for the latest films. This is for trend analysis.


--
Thejesh GN ⏚ ತೇಜೇಶ್ ಜಿ.ಎನ್
http://thejeshgn.com
GPG ID :  0xBFFC8DD3C06DD6B0

Aman Bhargava

Apr 9, 2025, 11:56:14 AM
to datameet
Hello!
Yes, there is. The cuts are posted on the ecinepramaan website (for example, here is Aavesham). The IDs in the URL are sequential and not hashed, so we've figured out a hacky way to scrape them by brute force. This also means that we haven't yet found a way to get the ID of a specific movie: we scrape whatever we can and check whether the movie we're interested in falls within that range (you can narrow it a little by year and such). We've been working on scraping the data and cleaning it up for the last few months.
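
To make the brute-force idea concrete, here is a minimal sketch in Python. It is not the code from our repository. The URL template, the function names and the `title` field are all hypothetical, and the fetcher is a stand-in with made-up data so the example runs without touching the real site.

```python
from typing import Callable, Iterator, Optional

# Hypothetical URL pattern: the real ecinepramaan URL format is not shown here.
URL_TEMPLATE = "https://example.invalid/certificate?id={cert_id}"


def scan_id_range(
    start: int,
    stop: int,
    fetch: Callable[[str], Optional[dict]],
) -> Iterator[tuple[int, dict]]:
    """Try every ID in [start, stop) and yield the ones that return a record."""
    for cert_id in range(start, stop):
        record = fetch(URL_TEMPLATE.format(cert_id=cert_id))
        if record is not None:  # IDs with no certificate behind them are skipped
            yield cert_id, record


def find_title(records, wanted: str) -> list[int]:
    """Return the IDs whose record title equals `wanted`, ignoring case."""
    wanted = wanted.casefold()
    return [cid for cid, rec in records if rec.get("title", "").casefold() == wanted]


def fake_fetch(url: str) -> Optional[dict]:
    """Stand-in for a real HTTP request plus HTML parsing (made-up data)."""
    sample = {"101": {"title": "Film A"}, "103": {"title": "Film B"}}
    return sample.get(url.rsplit("=", 1)[-1])


if __name__ == "__main__":
    hits = list(scan_id_range(100, 105, fake_fetch))
    print(find_title(hits, "film b"))
```

Because there is no lookup by movie, the only option is to scan a range and filter afterwards, which is what `find_title` does. A real run would swap `fake_fetch` for an HTTP client and should pause between requests.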

Our work-in-progress repository is here: https://github.com/diagram-chasing/censor-board-cuts

The data scraping logic is explained here: https://github.com/diagram-chasing/censor-board-cuts/tree/master/data-scripts/scrape

This is not complete by any means. Issues are listed in the repository, as are our TODO items. But we've made some progress on scraping and structuring small samples, which you can see here. The data sample is only a small subset of what is available and was scraped a few months ago. The end goal is to automate this and create a regularly updated explorer with some basic trend analysis (modifications, types of modifications, etc.).