Full database meta-data bulk download by s/ftp or API (everything except the full text)


D. Kreil

Nov 25, 2025, 8:43:55 AM
to Europe PMC Developer Forum
Dear Sir/Madam,

We are a small research group at a federal university in Vienna, Austria, working on academic benchmarks for re-ranking articles by current and predicted future impact.

While we can obtain the full citation graph in PubMed format from PubMed, on your bulk downloads page I only found a range of subsets to download.

Can you perhaps support a bulk download of paper metadata for all articles?

We in particular need:
- title
- abstract
- authors (ideally, with a unique ID)
- journal
- keywords and/or MeSH terms
- citations (manuscripts that this article cites; we can reconstruct the other direction, who cited this article).

I have briefly discussed this with the helpdesk, and they suggested I turn to the developer forum here. I also see that other questions posted here about bulk downloads were pointed to the standard API (which, as I understand it, can return up to 100 records per query).

We would not republish any data; we only want to use it for academic research on network graphs.

For that, however, we need the whole data set.

Are the rate limits I have been given (max 10 requests/sec and 500/min) applied per field or per complete record?
If it's per record, the download would take about 3 months. If it's per field, then that would make working with your data infeasible. If it's per query, then it would still take a day. I was actually recently contacted by the helpdesk because there had just been a flood of API requests (no, it wasn't us). We'd be more than happy with an ftp/sftp download.
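
As a back-of-the-envelope sketch of how these estimates scale: the function name `download_days` and the one-million-record example are purely illustrative, and the true total record count is not stated here, so this only gives the time per million records under each interpretation of the limit.

```python
def download_days(total_records, records_per_request, requests_per_minute=500):
    """Days needed to fetch `total_records` when running at the per-minute limit."""
    # Ceiling division: a partially filled last request still counts as a request.
    requests_needed = -(-total_records // records_per_request)
    minutes = requests_needed / requests_per_minute
    return minutes / (60 * 24)


# Illustrative batch size only; scale by the actual number of records.
PER_MILLION = 1_000_000

if __name__ == "__main__":
    for label, per_request in [("limit per record", 1), ("limit per query of 100", 100)]:
        days = download_days(PER_MILLION, per_request)
        print(f"{label}: {days:.3f} days per million records")
```

Multiplying the per-million figure by the number of millions of records gives the total for either case.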

What can you suggest? If there currently is no support for this kind of download/research, can you perhaps consider this in the near future?

With many thanks
and best regards,
David Kreil.

Madhumiethaa Jayaprabha Palanisamy

Nov 25, 2025, 12:16:40 PM
to Europe PMC Developer Forum, dpk...@gmail.com
Hi David,

Thanks for getting in touch.

Regarding bulk download of metadata for all articles in Europe PMC, we do provide a metadata export here:
https://europepmc.org/ftp/pmclitemetadata/
This is based on the "lite" version of our API response, which returns only the key metadata, such as title, journal, authors, etc., for a given search term. It may not be suitable for your use case, as it does not include the core metadata, such as the abstract, full text links, reference list, MeSH terms, etc.

For programmatic access, the rate limit is 10 requests per second or 500 per minute, and it is applied per request/query, not per field or per returned record. The maximum supported pageSize is 1000. So, querying with resultType "core" and an effective page size would be the most efficient way to retrieve large portions of the dataset.
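
A minimal sketch of this approach in Python, using only the standard library. The endpoint URL, the `cursorMark`/`nextCursorMark` pagination parameters and the JSON layout (`resultList` → `result`) are assumptions not stated above, so please check them against the current API documentation; the function names `fetch_page` and `iter_records` are purely illustrative.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed endpoint of the RESTful search service; verify against the API docs.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"
# 500 requests/minute is the tighter of the two limits: 60 s / 500 = 0.12 s each.
MIN_INTERVAL = 60.0 / 500


def fetch_page(query, cursor_mark="*", page_size=1000):
    """Fetch one page of "core" results and return the parsed JSON as a dict."""
    params = urllib.parse.urlencode({
        "query": query,
        "resultType": "core",
        "pageSize": page_size,
        "cursorMark": cursor_mark,  # assumed pagination parameter
        "format": "json",
    })
    with urllib.request.urlopen(f"{SEARCH_URL}?{params}", timeout=60) as resp:
        return json.load(resp)


def iter_records(query, fetch=fetch_page, min_interval=MIN_INTERVAL):
    """Yield every record matching `query`, one page per request."""
    cursor = "*"
    while True:
        started = time.monotonic()
        page = fetch(query, cursor)
        results = page.get("resultList", {}).get("result", [])
        yield from results
        next_cursor = page.get("nextCursorMark")
        # Stop when the page is empty or the cursor no longer advances.
        if not results or not next_cursor or next_cursor == cursor:
            break
        cursor = next_cursor
        # Sleep off whatever remains of the per-request time budget.
        remaining = min_interval - (time.monotonic() - started)
        if remaining > 0:
            time.sleep(remaining)
```

Looping over `iter_records(...)` with a query string of your choice then streams the records page by page; the `fetch` argument is injectable so that the paging logic can be tried out without touching the live service.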

I hope this clarifies your question.

Kind regards,
Madhu
