Temporary rate-limit increase request — academic research use (developer key)

Benjamin Clyde

4:44 AM
to Guardian Open Platform API Forum
Dear Open Platform team,

I am using the Content API under a developer key for a non-commercial academic study of long-run language change in British news discourse, 1999–2019 (possibly extending to the present). The Guardian archive is the study's primary source, and the Open Platform will be cited as the data source in any resulting publication.

Building the research corpus requires a one-off archive pull of article text (type=article, show-fields=bodyText, page-size=200), plus monthly totals for normalisation and a set of targeted search queries. My volume estimate:

~2.0M articles, 1999–2019 ≈ 10,200 requests
optional extension to the present ≈ 3,500 requests
monthly denominators, term searches, validation re-runs ≈ 5,000–8,000 requests
Total ≈ 20,000–25,000 requests, one-off

At the current developer allowance of 500 requests/day, the core archive pull alone would saturate the key for roughly three weeks, and the full project for six to seven weeks, which is impractical to run unattended. I would therefore like to request a temporary increase to ~10,000 requests/day for two to four weeks; at that rate the harvest itself needs only two to three days of requests. A small increase to the per-second limit would reduce wall-clock time but is secondary; the daily cap is the binding constraint. After the one-off harvest, usage will fall back to occasional small queries (monthly refresh, spot validation), well within the standard limit.
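
The durations above can be checked with a short calculation. This is only a rough sketch of the arithmetic: the function and constant names are illustrative, the request counts are the estimates quoted above, and the 20,000–25,000 total is taken as stated (it sits slightly above the 18,700–21,700 that the itemised lines sum to, which reads as headroom).

```python
import math

PAGE_SIZE = 200          # page-size used for the archive pull
DAILY_CAP = 500          # current developer allowance, requests/day
REQUESTED_CAP = 10_000   # temporary allowance being requested

# Estimates quoted in the request (not measured values).
CORE = 10_200
EXTENSION = 3_500
AUX_LOW, AUX_HIGH = 5_000, 8_000
TOTAL_LOW, TOTAL_HIGH = 20_000, 25_000


def pages_needed(n_articles, page_size=PAGE_SIZE):
    """Number of paged requests needed to list n_articles."""
    return math.ceil(n_articles / page_size)


def days_at_cap(n_requests, cap):
    """Whole days needed if the daily cap is fully used every day."""
    return math.ceil(n_requests / cap)


itemised = (CORE + EXTENSION + AUX_LOW, CORE + EXTENSION + AUX_HIGH)

print(pages_needed(2_000_000))                 # 10000 pages before slicing overhead
print(itemised)                                # (18700, 21700)
print(days_at_cap(CORE, DAILY_CAP))            # 21 days, about three weeks
print(days_at_cap(TOTAL_LOW, DAILY_CAP),
      days_at_cap(TOTAL_HIGH, DAILY_CAP))      # 40 50 days, about six to seven weeks
print(days_at_cap(TOTAL_LOW, REQUESTED_CAP),
      days_at_cap(TOTAL_HIGH, REQUESTED_CAP))  # 2 3 days
```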

If you would prefer that I instead run at the standard cap over a longer window, I am happy to do that — in that case please treat this message as a courtesy notification that sustained near-cap usage on this key is deliberate and research-related, not abuse.

The harvester is paced (≤1 request/second per the current limit, exponential backoff on 429/5xx, month-sliced with checkpointing). I am committed to the terms of service: article text is stored locally for analysis only; it will not be redistributed, publicly mirrored, or used to train generative AI models; published materials will share only article IDs/URLs and aggregate statistics, so that others can re-derive the corpus directly through the API.
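
For concreteness, the pacing logic is roughly along the following lines. This is a simplified sketch rather than the actual harvester: the helper names are illustrative, the `api-key`, `from-date`, `to-date` and `page` parameter names are my reading of the Content API documentation, the backoff base and ceiling are arbitrary choices, and checkpointing is left out for brevity.

```python
import calendar
import time
from datetime import date


def month_slices(start_year, end_year):
    """Yield (from_date, to_date) ISO strings for every month, inclusive."""
    for year in range(start_year, end_year + 1):
        for month in range(1, 13):
            last_day = calendar.monthrange(year, month)[1]
            yield (date(year, month, 1).isoformat(),
                   date(year, month, last_day).isoformat())


def build_params(api_key, from_date, to_date, page):
    """Query parameters for one page of one month slice (names assumed)."""
    return {
        "api-key": api_key,
        "type": "article",
        "show-fields": "bodyText",
        "page-size": 200,
        "from-date": from_date,
        "to-date": to_date,
        "page": page,
    }


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential delay in seconds for a 0-based attempt number, capped."""
    return min(cap, base * 2 ** attempt)


def fetch_with_backoff(do_request, max_attempts=6, min_interval=1.0,
                       sleep=time.sleep):
    """Call do_request() -> (status, body), retrying on 429 and 5xx.

    do_request is a hypothetical callable that performs one HTTP request.
    After a success the function sleeps min_interval seconds, which keeps
    the overall rate at or below one request per second.
    """
    for attempt in range(max_attempts):
        status, body = do_request()
        if status == 200:
            sleep(min_interval)
            return body
        if status != 429 and not 500 <= status < 600:
            raise RuntimeError(f"non-retryable status {status}")
        sleep(backoff_delay(attempt))
    raise RuntimeError("gave up after repeated retryable errors")
```

Passing `sleep` in as a parameter lets the retry behaviour be exercised without real waiting or network traffic, for example by feeding `fetch_with_backoff` a stub that returns 429, then 503, then 200.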

Thank you for maintaining this service — it is rare for a major newspaper to keep its archive open to researchers in this form, and it is genuinely appreciated.

Best regards,
Benjamin