Harvesting from AtoM to Primo


Helen Cooper

Feb 27, 2017, 9:53:58 AM
to ica-ato...@googlegroups.com

Hello,

I have harvested our top-level archive data from AtoM (v2.3.0 - 138) to Primo (August 2016 release) using a Primo OAI pipe. However, I have a couple of small issues that I would like some help with.

(1)    The resumption token does not appear to work when using the Primo OAI pipe to harvest. If I set the AtoM resumption token limit at 100, the pipe continually pulls the same 100 records until the job is terminated.

(2)    When records are deleted from AtoM, the deletions are not reflected in Primo.

As our top-level collection has fewer than 1000 records, I have worked around both of these issues by setting the resumption token limit to 1000 and performing a full “delete and reload” of the collection each time the harvest is run.

http://<our AtoMserver>/;oai?verb=ListRecords&metadataPrefix=oai_dc&set=oai:virtual:top-level-records&from=2010-01-01T10:39:37Z&until=2017-02-25T11:10:39Z

I would be interested to know whether anyone else has tried to harvest from AtoM to Primo.

Many Thanks,

Helen

----------------------------------------
Helen Cooper

Library Systems Analyst/Developer

Information Services Directorate

University of Strathclyde

Tel: 44 (0) 141 548 2898
Email: helen....@strath.ac.uk

The University of Strathclyde is a charitable body, registered in Scotland, with registration number SC015263.

 

Dan Gillean

Feb 27, 2017, 12:51:33 PM
to ICA-AtoM Users
Hi Helen,

Thanks for sharing your experiences so far. Regarding the resumption token: when the harvester reaches the end of the first 100 records (or wherever the resumption token limit is set), it must pass the resumption token included at the bottom of the response back in its next request in order to page to the next set of results. Our documentation for the ListIdentifiers response includes an example of this.
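
The paging behaviour described above can be sketched in Python roughly as follows. This is a minimal sketch, not Primo's or AtoM's implementation: the function names, the injectable `fetch` parameter and the `max_pages` cap are assumptions made for illustration. It relies only on the OAI-PMH rule that a follow-up request carries just the verb and the resumptionToken.

```python
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def fetch_url(url):
    """Fetch a URL and return the raw response body (performs network I/O)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def harvest_identifiers(base_url, metadata_prefix="oai_dc", fetch=fetch_url,
                        max_pages=1000):
    """Yield record identifiers, following resumption tokens page by page."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    # max_pages is a safety cap (an assumption of this sketch) so that a
    # repository that never ends the list cannot keep the loop running forever.
    for _ in range(max_pages):
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        for header in root.iter(OAI_NS + "header"):
            yield header.findtext(OAI_NS + "identifier")
        token = root.findtext(f".//{OAI_NS}resumptionToken")
        if not token or not token.strip():
            return  # a missing or empty token marks the last page
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token.strip()}
```

Calling `list(harvest_identifiers(base_url))` with the repository's `;oai` endpoint as `base_url` would walk every page. Passing a custom `fetch` makes it possible to replay saved responses without touching the server.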

There is no automated, ongoing communication between AtoM and a harvester, so a record will not be deleted from Primo automatically just because it is deleted in AtoM. Instead, the protocol relies on the next harvest to convey information about records that have been deleted. In its Identify response, a repository declares one of three levels of support for deletions:

  • no - the repository does not maintain information about deletions. A repository that indicates this level of support must not reveal a deleted status in any response.
  • persistent - the repository maintains information about deletions with no time limit. A repository that indicates this level of support must persistently keep track of the full history of deletions and consistently reveal the status of a deleted record over time.
  • transient - the repository does not guarantee that a list of deletions is maintained persistently or consistently. A repository that indicates this level of support may reveal a deleted status for records.

See: https://www.openarchives.org/OAI/openarchivesprotocol.html#deletion
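
A harvester can check which of these levels a repository declares by reading the `deletedRecord` element of its Identify response. The sketch below assumes a trimmed, hypothetical response. The sample XML and the function name are illustrative, and the value shown was not captured from an AtoM server.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Hypothetical, trimmed Identify response; a real one has more elements.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example repository</repositoryName>
    <deletedRecord>no</deletedRecord>
  </Identify>
</OAI-PMH>"""


def deleted_record_policy(identify_xml):
    """Return the declared deletion support: 'no', 'persistent' or 'transient'."""
    root = ET.fromstring(identify_xml)
    policy = root.findtext(f"{OAI_NS}Identify/{OAI_NS}deletedRecord")
    if policy is None:
        raise ValueError("response has no deletedRecord element")
    return policy.strip()


policy = deleted_record_policy(SAMPLE_IDENTIFY)
print(policy)  # no
```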

At present, AtoM does not provide any information on deletions, unfortunately - further development would be required to improve upon this functionality.
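
When a repository reports no deletions, one alternative to a full delete and reload is for the harvesting side to compare the identifiers from the previous complete harvest with those from the current one. The sketch below illustrates that idea only. It is not something AtoM or Primo provides, the identifiers are made up, and it is only valid when both harvests were complete (no `from`/`until` window).

```python
def find_deleted(previous_ids, current_ids):
    """Return identifiers seen in the last harvest but absent from this one."""
    return sorted(set(previous_ids) - set(current_ids))


# Made-up identifiers for illustration.
previous_ids = ["oai:example:1", "oai:example:2", "oai:example:3"]
current_ids = ["oai:example:1", "oai:example:3", "oai:example:4"]

deleted = find_deleted(previous_ids, current_ids)
print(deleted)  # ['oai:example:2']
```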

Finally, I will mention for other users that Artefactual is currently doing development to add EAD XML support to the OAI repository functionality in AtoM - you can read more about it in this thread:

Are there other users who are making use of AtoM's OAI module who might have experiences and/or workarounds to share with Helen?

Regards,


Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

--
You received this message because you are subscribed to the Google Groups "AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-users+unsubscribe@googlegroups.com.
To post to this group, send email to ica-ato...@googlegroups.com.
Visit this group at https://groups.google.com/group/ica-atom-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/ica-atom-users/1CFA2FC3908BE241BCB30328F268FAE4943A34BD%40EX2010-MBX3.ds.strath.ac.uk.
For more options, visit https://groups.google.com/d/optout.

helen....@strath.ac.uk

Feb 28, 2017, 10:25:10 AM
to AtoM Users
Hi Dan,
 
Thank you for replying to my email, and for clarifying the position re deletions.
 
Regarding the resumption token:
Having checked the Primo logs, I can see that the resumption token is being passed to AtoM on each successive call after the first 100 records. However, the same batch of 100 records is returned each time and the set never reaches the end.
For example, from the Primo log:
The OAI request is: [http://atomtest.lib.strath.ac.uk/;oai?verb=ListRecords&resumptionToken=eyJmcm9tIjoiMjAwMC0wMS0wMVQxNDozMDozOVoiLCJ1bnRpbCI6IjIwMTctMDItMTdUMTY6NDQ6MTNaIiwiY3Vyc29yIjoxMjk0NDAwMCwibWV0YWRhdGFQcmVmaXgiOiJvYWlfZGMiLCJzZXQiOiJvYWk6dmlydHVhbDp0b3AtbGV2ZWwtcmVjb3JkcyJ9]
The OAI request is: [http://atomtest.lib.strath.ac.uk/;oai?verb=ListRecords&resumptionToken=eyJmcm9tIjoiMjAwMC0wMS0wMVQxNDozMDozOVoiLCJ1bnRpbCI6IjIwMTctMDItMTdUMTY6NDQ6MTNaIiwiY3Vyc29yIjoxMjk0NDEwMCwibWV0YWRhdGFQcmVmaXgiOiJvYWlfZGMiLCJzZXQiOiJvYWk6dmlydHVhbDp0b3AtbGV2ZWwtcmVjb3JkcyJ9]
The OAI request is: [http://atomtest.lib.strath.ac.uk/;oai?verb=ListRecords&resumptionToken=eyJmcm9tIjoiMjAwMC0wMS0wMVQxNDozMDozOVoiLCJ1bnRpbCI6IjIwMTctMDItMTdUMTY6NDQ6MTNaIiwiY3Vyc29yIjoxMjk0NDIwMCwibWV0YWRhdGFQcmVmaXgiOiJvYWlfZGMiLCJzZXQiOiJvYWk6dmlydHVhbDp0b3AtbGV2ZWwtcmVjb3JkcyJ9]
and so on....
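
The tokens in these log lines decode as base64-encoded JSON, which makes it possible to see what the harvester is sending. The sketch below assumes that encoding holds for these tokens. It is inferred from the strings themselves rather than from AtoM documentation, and the helper name is illustrative.

```python
import base64
import json


def decode_token(token):
    """Decode a resumption token, assuming it is base64-encoded JSON."""
    token = token.rstrip("=")
    padded = token + "=" * (-len(token) % 4)  # restore any stripped padding
    return json.loads(base64.b64decode(padded))


# The three tokens from the Primo log lines above, in order.
TOKENS = [
    "eyJmcm9tIjoiMjAwMC0wMS0wMVQxNDozMDozOVoiLCJ1bnRpbCI6IjIwMTctMDItMTdUMTY6NDQ6MTNaIiwiY3Vyc29yIjoxMjk0NDAwMCwibWV0YWRhdGFQcmVmaXgiOiJvYWlfZGMiLCJzZXQiOiJvYWk6dmlydHVhbDp0b3AtbGV2ZWwtcmVjb3JkcyJ9",
    "eyJmcm9tIjoiMjAwMC0wMS0wMVQxNDozMDozOVoiLCJ1bnRpbCI6IjIwMTctMDItMTdUMTY6NDQ6MTNaIiwiY3Vyc29yIjoxMjk0NDEwMCwibWV0YWRhdGFQcmVmaXgiOiJvYWlfZGMiLCJzZXQiOiJvYWk6dmlydHVhbDp0b3AtbGV2ZWwtcmVjb3JkcyJ9",
    "eyJmcm9tIjoiMjAwMC0wMS0wMVQxNDozMDozOVoiLCJ1bnRpbCI6IjIwMTctMDItMTdUMTY6NDQ6MTNaIiwiY3Vyc29yIjoxMjk0NDIwMCwibWV0YWRhdGFQcmVmaXgiOiJvYWlfZGMiLCJzZXQiOiJvYWk6dmlydHVhbDp0b3AtbGV2ZWwtcmVjb3JkcyJ9",
]

cursors = [decode_token(token)["cursor"] for token in TOKENS]
print(cursors)  # [12944000, 12944100, 12944200]
```

The cursor rises by 100 between requests, so the harvester is sending a fresh token each time.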
 
I do not believe that the issue lies with Primo, as the same issue happens via browser. Please see attached document.
 
Am I missing a setting in AtoM?
 
Thanks
Helen
 

helen....@strath.ac.uk

Feb 28, 2017, 10:26:08 AM
to AtoM Users
Sorry, I forgot the attachment.

AtoM to Primo - resumption token issue.docx

Jim Adamson

Feb 28, 2017, 10:58:04 AM
to AtoM Users, helen....@strath.ac.uk
Hi Helen & Dan,

We are making use of AtoM's OAI module, and our Resumption token limit is set to 100, but it looks like everything is working correctly for us (506 AtoM records in Primo). We are running AtoM 2.2 and Primo February 2017 release (latest) in a direct-dedicated environment. I have just tested it through the browser and the resumption definitely works fine for us.

Helen: Have you had this working with a previous version of AtoM?

On the deletions issue: my colleague Jen informs me that we don't tend to delete top-level records, so this isn't currently a problem for us.

Thanks, Jim

HelenC

Feb 28, 2017, 11:21:21 AM
to AtoM Users, helen....@strath.ac.uk
Thanks Jim for your reply.
I think this issue must be connected to the version of AtoM we are running. This is the first time we have tried the harvest. I get the same problem via the browser, so it's not a Primo issue.
 
Regards,
Helen

José Anjos

Jul 10, 2017, 11:59:03 AM
to AtoM Users, helen....@strath.ac.uk
Hi.
I'm trying to do the same, but I have the same problem.
I have two VMs (production and test), both running version 2.3.1 - 138.
I thought it was a version problem, so I tested it against https://demo.accesstomemory.org
I activated the plugin and set it to return 10 results.
Then:
https://demo.accesstomemory.org/;oai?verb=ListIdentifiers&metadataPrefix=oai_dc

And then:
https://demo.accesstomemory.org/;oai?verb=ListIdentifiers&resumptionToken=eyJmcm9tIjoiIiwidW50aWwiOiIiLCJjdXJzb3IiOjEwLCJtZXRhZGF0YVByZWZpeCI6Im9haV9kYyIsInNldCI6IiJ9==

The result is exactly the same, except for this part, which adds the resumptionToken:
<request verb="ListIdentifiers" resumptionToken="eyJmcm9tIjoiIiwidW50aWwiOiIiLCJjdXJzb3IiOjEwLCJtZXRhZGF0YVByZWZpeCI6Im9haV9kYyIsInNldCI6IiJ9" cursor="10" metadataPrefix="oai_dc">https://demo.accesstomemory.org/;oai</request><ListIdentifiers>
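
A check like this one can be scripted by comparing the record identifiers in two saved responses. The sketch below assumes trimmed, hypothetical ListIdentifiers responses. The helper names and the sample identifiers are illustrative, not taken from the demo site.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def page_identifiers(response_xml):
    """Return the record identifiers listed in one ListIdentifiers response."""
    root = ET.fromstring(response_xml)
    return [h.findtext(OAI_NS + "identifier") for h in root.iter(OAI_NS + "header")]


def pages_repeat(first_xml, second_xml):
    """True when the second page lists exactly the same records as the first."""
    return page_identifiers(first_xml) == page_identifiers(second_xml)


def make_page(identifiers):
    """Build a hypothetical, heavily trimmed ListIdentifiers response."""
    headers = "".join(
        f"<header><identifier>{i}</identifier></header>" for i in identifiers
    )
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
        f"<ListIdentifiers>{headers}</ListIdentifiers></OAI-PMH>"
    )


first = make_page(["oai:example:1", "oai:example:2"])
repeated = make_page(["oai:example:1", "oai:example:2"])
advanced = make_page(["oai:example:3", "oai:example:4"])

print(pages_repeat(first, repeated))  # True
print(pages_repeat(first, advanced))  # False
```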

Are these commands right, or is something wrong?

Thank you,
José Anjos

Dan Gillean

Jul 10, 2017, 12:48:26 PM
to ICA-AtoM Users
Hi José,

This is a bug we identified and fixed after the 2.3.1 release. This means that the patch is available in our code repository (in the qa/2.3.x branch), but unfortunately it is not included in the tarball available on our Downloads page. It has also been fixed in the upcoming 2.4 release.

Here is the related issue ticket: https://projects.artefactual.com/issues/11042

If you want this fix immediately, you can try following Option 2 in our installation instructions - installing from our GitHub code repository:

Alternatively, there is a link on the issue ticket to the related pull request. As you can see there, it is actually a single line change in the code to fix the issue:

You could always try making this change locally until you upgrade. If you do, I suggest you clear your cache and restart services such as PHP-FPM after making the change.

Regards,


Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory


José Anjos

Jul 11, 2017, 4:25:09 AM
to AtoM Users
Hi Dan.
It works.
Thank you very much,
José Anjos