List of harvesters


Michelle Urberg

Jul 3, 2021, 2:46:19 AM
to OAI-PMH
Hi OAI-PMH group,

I have what I hope is an easy question, but maybe not. If a journal exposes its content via OAI-PMH and that content is being harvested by indexes and full-text databases, is there a way to find out from the journal side who is doing the harvesting?

Thanks for your time.

Michelle Urberg

John Salter

Jul 5, 2021, 4:17:50 AM
to oai...@googlegroups.com
Hi Michelle,
If the journal platform keeps a log of web requests, you should be able to filter these for just the OAI-PMH requests, and then analyse that data.

Hopefully the logs will contain the User-Agent and the originating IP address of the requests.
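
As a rough sketch of that filtering step, something like the following could work. It assumes the platform writes Apache/nginx "combined" format access logs and that the OAI-PMH endpoint lives under "/oai"; both are assumptions, so adjust the pattern and path to match your platform. The names count_harvesters and is_oai_request are just illustrative.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumption: Apache/nginx "combined" log format. Other formats need a different pattern.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}

def is_oai_request(target, oai_path="/oai"):
    """True if the request target looks like an OAI-PMH call.

    oai_path is an assumed endpoint location. POST requests carry the
    verb in the body, which access logs do not record, so this only
    recognises requests with the verb in the query string.
    """
    parts = urlsplit(target)
    if not parts.path.startswith(oai_path):
        return False
    verbs = parse_qs(parts.query).get("verb", [])
    return any(v in OAI_VERBS for v in verbs)

def count_harvesters(lines, oai_path="/oai"):
    """Count OAI-PMH requests per (IP address, User-Agent) pair."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and is_oai_request(m.group("target"), oai_path):
            counts[(m.group("ip"), m.group("agent"))] += 1
    return counts
```

You could feed it an open log file, e.g. count_harvesters(open("access.log")), and then look at counts.most_common() to see the busiest IP/User-Agent pairs.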

Some harvesters send a useful User-Agent, e.g.
    "IRUS_metadata_harvesting_bot"
    "OAIHarvester/2.0 - core.ac.uk"
    "Summon/1.0"
Others send only the name of the software being used to harvest, e.g.
    "Java/11.0.4"
    "OAIHarvester/2.0"

NB The user-agent information may not be reliable in all cases. The harvester can claim to be something it's not. How much trust you place in this data is up to you.

The IP address may provide some useful information if it can be resolved to an organisation. In a lot of cases it will resolve to a cloud-computing platform, which probably won't tell you anything useful.
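
One way to try resolving the addresses is a reverse DNS lookup. The sketch below uses Python's standard socket.gethostbyaddr; the resolve_hosts helper and its lookup parameter are illustrative names, not part of any harvesting tool. A whois lookup on the address may tell you more than reverse DNS does, but that is not shown here.

```python
import socket

def resolve_hosts(ips, lookup=socket.gethostbyaddr):
    """Map each IP address to its reverse-DNS hostname, or None if it has none.

    lookup is injectable so the function can be tried without network access.
    """
    names = {}
    for ip in ips:
        try:
            # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
            names[ip] = lookup(ip)[0]
        except OSError:
            # socket.herror and socket.gaierror are both OSError subclasses
            names[ip] = None
    return names
```

A hostname on a cloud provider's domain usually only tells you where the harvester is hosted, not who runs it.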

Cheers,
John

