CBFC cuts data


Thejesh GN

Apr 9, 2025, 10:49:07 AM
to datameet

You can search for specific films and their certification, but the results don't list the censor changes.

Is there a place where I can get that? Maybe every day for the latest films. This is for trend analysis.


--
Thejesh GN ⏚ ತೇಜೇಶ್ ಜಿ.ಎನ್
http://thejeshgn.com
GPG ID :  0xBFFC8DD3C06DD6B0

Aman Bhargava

Apr 9, 2025, 11:56:14 AM
to datameet
Hello!
Yes, there is. The cuts are posted on the ecinepramaan website (for example, here is Aavesham). The IDs in the URLs are sequential and not hashed, so we've figured out a hacky way to scrape them by brute force. The downside is that we haven't yet found a way to look up a specific movie's ID: we scrape whatever range we can and then check whether the movie we're interested in falls within it (you can narrow the range a little by year and such). We've been scraping the data and cleaning it up for the last few months.
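
A minimal sketch of the brute-force idea, not the actual scraper (that lives in the repository). The URL template, the `fetch` callable, and the function names are all hypothetical placeholders; the real ecinepramaan URL format is not reproduced here. The idea is simply to walk a range of sequential IDs, keep whatever pages come back, and then search those pages for the film of interest.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

# Hypothetical placeholder: NOT the real ecinepramaan URL format.
URL_TEMPLATE = "https://example.invalid/certificate?id={id}"


def scan_ids(
    start: int,
    stop: int,
    fetch: Callable[[str], Optional[str]],
) -> Iterator[Tuple[int, str]]:
    """Yield (id, page_text) for each ID in [start, stop) that returns a page.

    `fetch` is supplied by the caller (assumed to return the page text,
    or None when no record exists for that ID).
    """
    for record_id in range(start, stop):
        page = fetch(URL_TEMPLATE.format(id=record_id))
        if page is not None:  # skip IDs with no record behind them
            yield record_id, page


def find_title(pages: Iterable[Tuple[int, str]], title: str) -> List[int]:
    """Return the IDs whose page text mentions `title` (case-insensitive)."""
    needle = title.casefold()
    return [record_id for record_id, page in pages if needle in page.casefold()]
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library, and makes it easy to add rate limiting or caching around the real request.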

Our work-in-progress repository is here: https://github.com/diagram-chasing/censor-board-cuts

The data scraping logic is explained here: https://github.com/diagram-chasing/censor-board-cuts/tree/master/data-scripts/scrape

This is not complete by any means. Issues are listed in the repository, as are our TODO items. But we've made some progress on scraping and structuring small samples, which you can see here. The data sample is only a small subset of what is available, and it was scraped a few months ago. The end goal is to automate this and create a regularly updated explorer with some basic trend analysis (modifications, types of modifications, etc.).
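
As a rough illustration of the kind of trend analysis meant here, the sketch below counts modifications per year and type. The field names `year` and `modification_type` are assumptions made for this example and may not match the columns in the repository's data.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_by_year_and_type(
    records: Iterable[Dict[str, object]],
) -> "Counter[Tuple[object, object]]":
    """Count records per (year, modification_type) pair.

    Each record is assumed (hypothetically) to be a dict with
    `year` and `modification_type` keys.
    """
    counts: "Counter[Tuple[object, object]]" = Counter()
    for record in records:
        counts[(record["year"], record["modification_type"])] += 1
    return counts
```

Calling `count_by_year_and_type(rows).most_common(10)` on the cleaned rows would then list the ten most frequent year and type combinations.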

Aman Bhargava

Jun 26, 2025, 12:15:01 AM
to datameet
Update on this: the CBFC has blocked this kind of access to the data. We have the data from 2017 to June 2025, but further scraping is no longer possible because the URLs have been changed from sequential IDs to encrypted strings.

Aroon Deep of The Hindu wrote about this in today's paper: https://www.thehindu.com/entertainment/movies/censor-board-discontinues-full-access-to-cuts-on-website/article69736377.ece

Vivek Matthew

Sep 14, 2025, 1:26:37 PM
to datameet
Aman and I have now launched CBFC Watch, an archive and explorer for over 1 lakh censorship records across close to 18k movies released in India since 2017: https://cbfc.watch

The code for the site itself and related analysis is up on GitHub: https://github.com/diagram-chasing/cbfc-watch