Missing full-text XML for PMC9810470 in FTP and S3 open-access dumps


James I.

Apr 9, 2026, 8:00:55 AM
to Europe PMC Developer Forum

Dear Europe PMC Developers,


I found an article whose full text is available on the Europe PMC website (https://europepmc.org/article/PMC/PMC9810470), but I cannot find the corresponding full-text file in either the FTP dump or the S3 bucket.


I checked the FTP source:

FTP_ADDRESS = "ftp.ebi.ac.uk"

FTP_ROOTDIR = "pub/databases/pmc/oa/"
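For anyone who wants to repeat the FTP check, here is a minimal sketch. It assumes the archives in that directory are named by PMCID range, in the form `PMC<first>_PMC<last>.xml.gz`; that naming pattern and the helper names `find_archive` and `list_archives` are assumptions for illustration and are not confirmed anywhere in this thread.

```python
import re
from ftplib import FTP

FTP_ADDRESS = "ftp.ebi.ac.uk"
FTP_ROOTDIR = "pub/databases/pmc/oa/"

# Assumed archive naming: PMC<first>_PMC<last>.xml.gz (not confirmed in this thread).
ARCHIVE_RE = re.compile(r"^PMC(\d+)_PMC(\d+)\.xml\.gz$")


def find_archive(pmcid, filenames):
    """Return the archive whose ID range covers pmcid, or None if there is none."""
    number = int(pmcid.removeprefix("PMC"))
    for name in filenames:
        base = name.rsplit("/", 1)[-1]  # tolerate listings that include a path
        match = ARCHIVE_RE.match(base)
        if match and int(match.group(1)) <= number <= int(match.group(2)):
            return base
    return None


def list_archives():
    """Fetch the file names in the OA directory over anonymous FTP."""
    with FTP(FTP_ADDRESS) as ftp:
        ftp.login()  # anonymous login
        ftp.cwd(FTP_ROOTDIR)
        return ftp.nlst()
```

Calling `find_archive("PMC9810470", list_archives())` only identifies the archive whose range would cover the ID. Confirming that the article is really absent still means opening that archive and searching it for the PMCID.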


I also checked the S3 source:

s3://pmc-oa-opendata/PMC9810470.1/PMC9810470.1.xml
(aws s3api head-object --bucket pmc-oa-opendata --key PMC9810470.1/PMC9810470.1.xml)
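The same S3 check can be sketched in Python. The key layout `<PMCID>.<version>/<PMCID>.<version>.xml` is taken from the path above. The helper names, the use of the third-party `boto3` package, and the assumption that the bucket accepts unsigned (anonymous) requests are mine and may not hold.

```python
BUCKET = "pmc-oa-opendata"


def s3_key(pmcid, version=1):
    """Build the object key in the layout used above: <PMCID>.<v>/<PMCID>.<v>.xml"""
    stem = f"{pmcid}.{version}"
    return f"{stem}/{stem}.xml"


def object_exists(pmcid, version=1):
    """HEAD the object; return True if present, False if the service reports 404."""
    # boto3 is third-party; imported lazily so s3_key() works without it installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config
    from botocore.exceptions import ClientError

    # Assumption: the bucket allows anonymous reads, so requests are left unsigned.
    client = boto3.client("s3", config=Config(signature_version=UNSIGNED))
    try:
        client.head_object(Bucket=BUCKET, Key=s3_key(pmcid, version))
        return True
    except ClientError as err:
        if err.response["Error"]["Code"] in ("404", "NoSuchKey", "NotFound"):
            return False
        raise  # e.g. 403: access denied is not the same as "missing"
```

`object_exists("PMC9810470")` mirrors the `head-object` call. Any error other than "not found" is re-raised so that a permissions problem is not mistaken for a missing file.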


Could you confirm whether this article should be included in the FTP/S3 open-access dumps, or whether website availability does not always imply the full text is available there?

Thank you.


Islam Hassan

Apr 9, 2026, 11:57:28 AM
to Europe PMC Developer Forum, James I.
Dear James,

Some full-text articles on Europe PMC are available only for viewing on the website, while others are fully open access, which makes them available for download and text mining. Which category an article falls into depends on the applicable legal agreements.

This article falls into the first category. Hence, it is not available for download and is not part of the FTP dump.
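One way to check the category before looking in the dumps is to query the Europe PMC REST search service and read the open-access flag on the record. The sketch below assumes that the JSON response nests records under `resultList.result` and that each record carries an `isOpenAccess` field set to "Y" or "N"; please verify these names against the current API documentation. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def search_url(pmcid):
    """Build a search request for a single PMCID, asking for JSON core records."""
    params = {"query": f"PMCID:{pmcid}", "format": "json", "resultType": "core"}
    return f"{SEARCH_URL}?{urlencode(params)}"


def is_open_access(payload):
    """Return True/False from the assumed isOpenAccess flag, or None if absent."""
    results = payload.get("resultList", {}).get("result", [])
    if not results:
        return None  # no matching record
    flag = results[0].get("isOpenAccess")
    return None if flag is None else flag == "Y"


def fetch(pmcid):
    """Run the search over HTTP and return the decoded JSON payload."""
    with urlopen(search_url(pmcid)) as response:
        return json.load(response)
```

With this, `is_open_access(fetch("PMC9810470"))` returning `False` would suggest that the article is viewable on the website only and should not be expected in the FTP dump.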

Please note that the S3 service is owned and operated by the PubMed Central team at NCBI, not by Europe PMC.

Yours sincerely,
Islam Hassan