Hi Cong,
Thanks for your question.
The OA files on the Europe PMC FTP site are regenerated as a full refresh every week rather than as incremental updates. This means that even if a file keeps the same name, its contents may still change. Europe PMC does not provide per-file update information, but you can identify which records have changed by querying the API with the UPDATE_DATE field, for example:
Please note that UPDATE_DATE reflects any type of update, whether to the file or to the metadata.
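As a rough illustration of the UPDATE_DATE query described above, here is a minimal Python sketch that builds the search URL and pulls PMCIDs out of a response page. Please treat the details as assumptions to check against the current Europe PMC REST documentation: the endpoint URL, the OPEN_ACCESS:y filter, the resultType, pageSize and cursorMark parameters, and the resultList/result/pmcid layout of the JSON response. The dates are placeholders, and the function names are only illustrative.

```python
from urllib.parse import urlencode

# Assumed REST search endpoint; verify against the Europe PMC documentation.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_update_query(start_date, end_date):
    """Build a search query for OA records updated between two dates (YYYY-MM-DD)."""
    # OPEN_ACCESS:y is an assumed filter; UPDATE_DATE is the field named above.
    return f"OPEN_ACCESS:y AND UPDATE_DATE:[{start_date} TO {end_date}]"


def build_search_url(start_date, end_date, cursor_mark="*", page_size=1000):
    """Return the full request URL for one page of results."""
    params = {
        "query": build_update_query(start_date, end_date),
        "resultType": "idlist",     # assumed: IDs only, no full metadata
        "format": "json",
        "pageSize": page_size,
        "cursorMark": cursor_mark,  # assumed: "*" requests the first page
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


def extract_pmcids(response_json):
    """Collect PMCIDs from one parsed JSON page (assumed response layout)."""
    results = response_json.get("resultList", {}).get("result", [])
    return [record["pmcid"] for record in results if "pmcid" in record]


# Placeholder dates for illustration only.
example_url = build_search_url("2024-01-01", "2024-01-07")
```

To walk through every page, you would fetch the URL, read the next cursor value from the response, and repeat until no further results come back. The hitCount value in the response tells you how many records matched in total.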
The filenames (e.g. PMC13900_PMC17829.xml.gz) represent processing chunks and are not continuous ranges, though all PMCIDs within a file will fall somewhere within the numeric range shown. Given that, one possible approach is to take the updated PMCIDs returned by the API, maintain your own local mapping of which PMCIDs belong to which FTP file, and download the files that contain them. The specific workflow is up to you, as we don't provide any file-level update indicators.
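A small sketch of the filename side of that approach, assuming every file follows the PMC<start>_PMC<end>.xml.gz pattern seen in the example above. The helper names are hypothetical, and the second filename in the usage lines is made up for illustration. Because the ranges are not continuous, a range match only tells you which file could hold a PMCID; your own local mapping is still what confirms it.

```python
import re

# Assumed filename pattern, based on the single example PMC13900_PMC17829.xml.gz.
FILENAME_PATTERN = re.compile(r"^PMC(\d+)_PMC(\d+)\.xml\.gz$")


def parse_range(filename):
    """Return the (start, end) PMCID numbers encoded in an FTP filename."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unexpected filename format: {filename}")
    return int(match.group(1)), int(match.group(2))


def candidate_files(pmcid, filenames):
    """List the files whose numeric range includes the given PMCID.

    A match means the PMCID *may* be in that file, not that it is.
    """
    number = int(pmcid[3:]) if pmcid.startswith("PMC") else int(pmcid)
    matches = []
    for name in filenames:
        start, end = parse_range(name)
        if start <= number <= end:
            matches.append(name)
    return matches


# Usage: the second filename is a made-up example.
files = ["PMC13900_PMC17829.xml.gz", "PMC17830_PMC20000.xml.gz"]
hits = candidate_files("PMC15000", files)
```

You could run this over the updated PMCIDs from the API to narrow down which files are worth re-downloading, then refresh your local mapping from the files you fetch.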
Best regards,
Madhu
Hi,
Yes, that’s correct; the query will capture that. Please also note that the weekly update volume will be much higher than 20: the earlier sample query, for instance, returned 27,239 (hitCount) updated OA records in a single week.
Best regards,
Madhu