Strathclyde University - OAI-PMH to harvest archive data to Primo.


Helen Cooper

Mar 26, 2015, 7:38:22 AM
to ica-ato...@googlegroups.com

Hello,

 

I am a developer at Strathclyde University, and am currently looking into the possibility of harvesting our archive data from AtoM to Primo using OAI-PMH.

 

I have tried the following URLs (which we had used successfully some time ago), but these now return a 404 "Page not found" error.

http://ica-atom.org/strathclyde/index.php/oai/oai?verb=Identify

http://ica-atom.org/strathclyde/index.php/oai/oai?verb=ListRecords&metadataPrefix=oai_dc

 Can you possibly tell me the current base URL for OAI-PMH?

Also, I believe there were some concerns about using OAI within AtoM 2.0; see the attached email from Dan Gillean dated October 2013.

Is there still a problem exposing data sets?

Is there some up-to-date documentation I can read?

 

Apologies for bombarding you with so many questions!

 

Many Thanks

Helen

----------------------------------------
Helen Cooper

Library Systems Analyst/Developer

Information Services Directorate

University of Strathclyde

Tel: 44 (0) 141 548 4611
Email: helen....@strath.ac.uk

 


The University of Strathclyde is a charitable body, registered in Scotland, with registration number SC015263.

 

José Anjos

Mar 26, 2015, 11:42:19 AM
to ica-ato...@googlegroups.com, helen....@strath.ac.uk
Maybe you have disabled arOaiPlugin for some reason?
Hope it helps


http://10.10.1.3/index.php/;oai?verb=ListRecords&metadataPrefix=oai_dc

Hutchinson, Tim

Mar 26, 2015, 12:00:53 PM
to ica-ato...@googlegroups.com

Hi Helen,

 

The routing for OAI has indeed changed, although I believe there is also a typo in your earlier URL.

 

This should work:

http://strathclyde.ica-atom.org/index.php/;oai?verb=Identify

 

Have you enabled OAI in the settings? I still get a 404 error when I try that. If you’ve enabled the setting, it could relate to the name of the module changing – something to check if you’re still having problems is that the setting_i18n table should have arOaiPlugin rather than qtOaiPlugin.
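
As a rough illustration of the URL shape described above, the following Python sketch builds OAI-PMH request URLs against a base URL ending in `index.php/;oai`. The helper name `oai_url` is hypothetical and the host is simply the one quoted in this message; none of this is part of AtoM itself.

```python
from urllib.parse import urlencode

# Base URL taken from the message above; adjust for your own AtoM install.
BASE_URL = "http://strathclyde.ica-atom.org/index.php/;oai"


def oai_url(verb, base_url=BASE_URL, **params):
    """Build an OAI-PMH request URL (hypothetical helper, not an AtoM API)."""
    query = {"verb": verb}
    query.update(params)
    # urlencode percent-encodes values, which also matters for resumption tokens.
    return base_url + "?" + urlencode(query)


if __name__ == "__main__":
    print(oai_url("Identify"))
    print(oai_url("ListRecords", metadataPrefix="oai_dc"))
```

If the Identify URL still returns a 404, that points back to the plugin and settings checks mentioned above rather than to the URL format.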

 

Tim

 

 

Tim Hutchinson
Head, University Archives & Special Collections
University Library, University of Saskatchewan

Tel: (306) 966-6028  Fax: (306) 966-6040

Email: tim.hut...@usask.ca

Web: http://library.usask.ca/archives/

--
You received this message because you are subscribed to the Google Groups "ICA-AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-user...@googlegroups.com.
To post to this group, send email to ica-ato...@googlegroups.com.
Visit this group at http://groups.google.com/group/ica-atom-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/ica-atom-users/1CFA2FC3908BE241BCB30328F268FAE447AFFB45%40EX2010-MBX3.ds.strath.ac.uk.
For more options, visit https://groups.google.com/d/optout.

Dan Gillean

Mar 26, 2015, 3:38:05 PM
to ica-ato...@googlegroups.com
Hi Helen,

I will wait to see if the above suggestions help resolve your issues before suggesting anything further.

Re: status and documentation of OAI-PMH in AtoM...

Basically, OAI-PMH support was quickly added to a very early version of ICA-AtoM but was not maintained, and never properly documented. Since that time, we have managed to complete a number of fixes and are slowly rendering the module useful again for exposing metadata. AtoM does not currently have the ability to act as a harvester.

Here is a post in the user forum from about a year and a half ago, to give you some context for where we were at prior to the 2.0.0 release: https://groups.google.com/d/msg/ica-atom-users/neungj7cgGI/NfhPkZlllxoJ

We have since managed to fix some of the outstanding bugs related to OAI-PMH - there are a couple of others still floating around, but this should give you a sense of what tickets we have, which have been verified and which are outstanding in our issue tracking system. The comments on #3888, #3889 and #3890 should give you a few samples of some of the verbs that can be used at the moment.

There are also a couple of new features that will be added in 2.2 for OAI-PMH - see tickets #8143 and #8144.

This means that, as the module has finally become more functional and is even seeing recent development, I need to prioritize getting some official documentation completed for this. I will likely be adding it to the 2.2 documentation. My apologies that we have not had the resources and opportunity to undertake this sooner. I hope this helps a bit in the meantime.

Regards,

Dan Gillean, MAS, MLIS
AtoM Product Manager / Systems Analyst,
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

Jenny Mitcham

Mar 27, 2015, 5:47:07 AM
to ica-ato...@googlegroups.com
Hi Helen,

Do let me know how you get on and whether you are successful in doing this. At the University of York we are going to be looking at doing something very similar in the future.

Best of luck,
Jen


--
Jenny Mitcham
Digital Archivist
Borthwick Institute for Archives
University of York
Heslington
York
YO10 5DD

Telephone: 01904 321170

Borthwick Institute website: http://www.york.ac.uk/borthwick/
Digital archiving blog: http://digital-archiving.blogspot.co.uk/
Twitter: @Jenny_Mitcham





João Pereira

Mar 27, 2015, 6:48:39 AM
to ica-ato...@googlegroups.com
Hi all!

I would like to report that the OAI-PMH module in AtoM has a bug: the resumption token is not working, so harvesters fail to get beyond the first page of results. To see this, go to the OAI-PMH Validator at http://validator.oaipmh.com/ and test your AtoM repository (do not forget to activate the arOaiPlugin plugin in AtoM and to enable the OAI repository in the settings). I also have some doubts about the Dublin Core that is generated; compare it with examples of DC generated by DSpace (http://repositorio-tematico.up.pt/oaiextended/request?verb=ListRecords&metadataPrefix=oai_dc&set=rap) and other similar platforms (http://arquivo.cm-vidigueira.pt/OAI-PMH/oai/?verb=ListRecords&metadataPrefix=oai_dc).
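
For context, this is roughly what a harvester does with resumption tokens, and why a broken token stops it after the first page. It is a minimal Python sketch, not any particular harvester's code: `harvest` and the `fetch` callable are hypothetical names, and `fetch` is assumed to take a dict of query parameters and return the response XML as a string.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest(fetch, metadata_prefix="oai_dc"):
    """Yield every <record> element, following resumption tokens.

    `fetch` is any callable that takes a dict of OAI-PMH query
    parameters and returns the response body as a string (for example
    a thin wrapper around urllib.request.urlopen).
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        for record in root.iter(OAI_NS + "record"):
            yield record
        token = root.find(".//" + OAI_NS + "resumptionToken")
        # An absent or empty token marks the last page of the list.
        if token is None or not (token.text or "").strip():
            break
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}
```

If the repository rejects or mangles the token it issued, the second request fails and the loop never reaches the later pages.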

Thank you,

João Pereira


Hutchinson, Tim

Mar 27, 2015, 11:04:04 AM
to ica-ato...@googlegroups.com

Hi João,

 

Are you referring to the “request time too much” error? I haven’t been able to test with a real harvester, so wasn’t sure if that was a blocker. Does changing the resumption token limit help at all?

 

Regarding the resumption token, see also this note about needing to properly encode it:

https://groups.google.com/d/topic/ica-atom-users/paj51fk51wE/discussion

 

Tim

Dan Gillean

Mar 27, 2015, 5:40:49 PM
to ica-ato...@googlegroups.com
Hi all,

Thanks for this interesting and engaging thread - it's nice to see an under-developed feature in AtoM getting some attention.

In response to feedback received from João and Tim, I have filed two new feature requests in AtoM outlining some of the issues raised in this thread:
  • #8158, to improve the way we implement resumptionTokens in AtoM
  • #8159, to improve the way that oai_dc is implemented in AtoM

There remain other open tickets as well, and there might be more issues to be discovered and reported - as I said, this module has so far been underdeveloped in AtoM, and we have not yet received much feedback from anyone who is actively exposing AtoM metadata to an OAI harvester. If your institution is doing that work, please feel free to tell us more!

At this point, the two new tickets will likely require community support for us to be able to address them in AtoM - either via a community-created pull request submitted for inclusion in a future public AtoM release, or via development sponsorship. Feel free to ask questions on-list if you intend to work on a pull request; feel free to contact me off-list if your institution is willing to sponsor enhancements to the OAI repository module.

Regards,


Dan Gillean, MAS, MLIS
AtoM Product Manager / Systems Analyst,
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

João Pereira

Mar 30, 2015, 9:57:02 AM
to ica-ato...@googlegroups.com
Hi Dan,

Thank you for creating tickets #8158 and #8159.

For ticket #8159, I think the Dublin Core problem is easy to resolve. I made some modifications to the file _dc.xml.php in "[AtoM directory]\plugins\sfDcPlugin\modules\sfDcPlugin\templates" on my localhost AtoM install. Where the file currently has:

<oai_dc:dc xmlns="http://purl.org/dc/elements/1.1/"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">

  <title><?php echo esc_specialchars($resource->title) ?></title>

  <?php foreach ($resource->getCreators() as $item): ?>
    <creator><?php echo esc_specialchars($item) ?></creator>
  <?php endforeach; ?>

  <?php foreach ($dc->subject as $item): ?>
    <subject><?php echo esc_specialchars($item) ?></subject>
  <?php endforeach; ?>

  <description><?php echo esc_specialchars($resource->scopeAndContent) ?></description>

  <?php foreach ($resource->getPublishers() as $item): ?>
    <publisher><?php echo esc_specialchars($item) ?></publisher>
  <?php endforeach; ?>

  <?php foreach ($resource->getContributors() as $item): ?>
    <contributor><?php echo esc_specialchars($item) ?></contributor>
  <?php endforeach; ?>

  <?php foreach ($dc->date as $item): ?>
    <date><?php echo esc_specialchars($item) ?></date>
  <?php endforeach; ?>

  <?php foreach ($dc->type as $item): ?>
    <type><?php echo esc_specialchars($item) ?></type>
  <?php endforeach; ?>

  <?php foreach ($dc->format as $item): ?>
    <format><?php echo esc_specialchars($item) ?></format>
  <?php endforeach; ?>

  <identifier><?php echo url_for(array($resource, 'module' => 'informationobject'), true) ?></identifier>

  <identifier><?php echo esc_specialchars($resource->identifier) ?></identifier>

  <source><?php echo esc_specialchars($resource->locationOfOriginals) ?></source>

  <?php foreach ($resource->language as $code): ?>
    <language xsi:type="dcterms:ISO639-3"><?php echo strtolower($iso639convertor->getID3($code)) ?></language>
  <?php endforeach; ?>

  <?php if (isset($resource->repository)): ?>
    <relation><?php echo url_for(array($resource->repository, 'module' => 'repository'), true) ?></relation>
    <relation><?php echo esc_specialchars($resource->repository->authorizedFormOfName) ?></relation>
  <?php endif; ?>

  <?php foreach ($dc->coverage as $item): ?>
    <coverage><?php echo esc_specialchars($item) ?></coverage>
  <?php endforeach; ?>

  <rights><?php echo esc_specialchars($resource->accessConditions) ?></rights>

</oai_dc:dc>


I changed it to:

<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">

  <dc:title><?php echo esc_specialchars($resource->title) ?></dc:title>

  <?php foreach ($resource->getCreators() as $item): ?>
    <dc:creator><?php echo esc_specialchars($item) ?></dc:creator>
  <?php endforeach; ?>

  <?php foreach ($dc->subject as $item): ?>
    <dc:subject><?php echo esc_specialchars($item) ?></dc:subject>
  <?php endforeach; ?>

  <dc:description><?php echo esc_specialchars($resource->scopeAndContent) ?></dc:description>

  <?php foreach ($resource->getPublishers() as $item): ?>
    <dc:publisher><?php echo esc_specialchars($item) ?></dc:publisher>
  <?php endforeach; ?>

  <?php foreach ($resource->getContributors() as $item): ?>
    <dc:contributor><?php echo esc_specialchars($item) ?></dc:contributor>
  <?php endforeach; ?>

  <?php foreach ($dc->date as $item): ?>
    <dc:date><?php echo esc_specialchars($item) ?></dc:date>
  <?php endforeach; ?>

  <?php foreach ($dc->type as $item): ?>
    <dc:type><?php echo esc_specialchars($item) ?></dc:type>
  <?php endforeach; ?>

  <?php foreach ($dc->format as $item): ?>
    <dc:format><?php echo esc_specialchars($item) ?></dc:format>
  <?php endforeach; ?>

  <dc:identifier><?php echo url_for(array($resource, 'module' => 'informationobject'), true) ?></dc:identifier>

  <dc:identifier><?php echo esc_specialchars($resource->identifier) ?></dc:identifier>

  <dc:source><?php echo esc_specialchars($resource->locationOfOriginals) ?></dc:source>

  <?php foreach ($resource->language as $code): ?>
    <dc:language xsi:type="dcterms:ISO639-3"><?php echo strtolower($iso639convertor->getID3($code)) ?></dc:language>
  <?php endforeach; ?>

  <?php if (isset($resource->repository)): ?>
    <dc:relation><?php echo url_for(array($resource->repository, 'module' => 'repository'), true) ?></dc:relation>
    <dc:relation><?php echo esc_specialchars($resource->repository->authorizedFormOfName) ?></dc:relation>
  <?php endif; ?>

  <?php foreach ($dc->coverage as $item): ?>
    <dc:coverage><?php echo esc_specialchars($item) ?></dc:coverage>
  <?php endforeach; ?>

  <dc:rights><?php echo esc_specialchars($resource->accessConditions) ?></dc:rights>

</oai_dc:dc>
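
One way to reason about this change is sketched below in Python. A namespace-aware parser sees the same Dublin Core elements in both forms, because the original template declares Dublin Core as the default namespace; only a consumer that looks for the literal `dc:` prefix would see a difference. Whether any given harvester behaves that way is an assumption here, and the sample title "Example fonds" is made up for the illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Original style: Dublin Core is the default namespace, elements unprefixed.
DEFAULT_NS_FORM = (
    '<oai_dc:dc xmlns="%s" xmlns:oai_dc="%s">'
    "<title>Example fonds</title>"
    "</oai_dc:dc>" % (DC_NS, OAI_DC_NS)
)

# Modified style: Dublin Core elements carry an explicit dc: prefix.
PREFIXED_FORM = (
    '<oai_dc:dc xmlns:dc="%s" xmlns:oai_dc="%s">'
    "<dc:title>Example fonds</dc:title>"
    "</oai_dc:dc>" % (DC_NS, OAI_DC_NS)
)


def dc_titles(xml_text):
    """Return the text of every dc:title, matched by namespace URI."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("{%s}title" % DC_NS)]


def has_literal_dc_prefix(xml_text):
    """Mimic a naive consumer that looks for the literal string '<dc:title>'."""
    return "<dc:title>" in xml_text
```

Running `dc_titles` on both samples gives the same list, while `has_literal_dc_prefix` is true only for the prefixed form, which is the practical difference the edit above makes.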



João Pereira

Mar 30, 2015, 10:04:02 AM
to ica-ato...@googlegroups.com
Because I am not a developer, I would ask that someone test these changes and evaluate whether they can be applied.

Thank you all!

Best regards,

João Pereira

Dan Gillean

Mar 30, 2015, 5:30:11 PM
to ica-ato...@googlegroups.com
Hi all,

First, João, thanks for your useful comments! I have added them to ticket #8159.

I have some other good news on the OAI front as well:

I have just tested and verified a fix for #6434 (earliestDateStamp not formatted correctly). We've also resolved #7964 (from and until parameters overwriting each other in OAI request handling). These fixes will be included in our upcoming 2.2 release.
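
To show what the `from` and `until` arguments look like from the harvester side, here is a small Python sketch that builds parameters for a date-bounded ListRecords request. It relies on the OAI-PMH rule that both bounds use the same granularity (day or seconds, in UTC); the function names are hypothetical and it says nothing about how AtoM handles the request internally.

```python
import re

DAY = re.compile(r"^\d{4}-\d{2}-\d{2}$")
SECONDS = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$")


def granularity(stamp):
    """Classify an OAI-PMH datestamp as 'day' or 'seconds'."""
    if DAY.match(stamp):
        return "day"
    if SECONDS.match(stamp):
        return "seconds"
    raise ValueError("not an OAI-PMH UTC datestamp: %r" % stamp)


def selective_params(from_=None, until=None, metadata_prefix="oai_dc"):
    """Build ListRecords parameters for a date-bounded (selective) harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_ is not None and until is not None:
        if granularity(from_) != granularity(until):
            raise ValueError("from and until must share one granularity")
        if from_ > until:
            raise ValueError("from must not be later than until")
    # Each bound goes into its own key, so neither can overwrite the other.
    if from_ is not None:
        granularity(from_)  # raises ValueError on a malformed stamp
        params["from"] = from_
    if until is not None:
        granularity(until)
        params["until"] = until
    return params
```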

We have a community developer who is looking into developing new functionality for AtoM's OAI repository module - he has mentioned in this recent thread that he hopes to be able to address #6436 as well. We're sure crossing our fingers! 

I also have the initial OAI repository documentation completed in our 2.2 documentation branch - you can't see this yet on our website (we'll put up the 2.2 docs when 2.2 is released), but for those curious now, you can see it in our GitHub repository. Note that GitHub markdown is different from the Sphinx reST we are using to write the docs, so the presentation version will not render correctly in GitHub's page - but it is still readable (minus images), and you can get a sense of what has been added. You can also view the raw reST version directly if desired:

This is where we currently are with OAI-PMH related issues in our AtoM project issue tracker. 5 issues; only one outstanding bug! #8158 and #8159 are currently listed as features because there are workarounds, though for fully automated harvesting they may be necessary. The other two feature requests are development ideas that might extend the usefulness of the OAI repository module. Likely, all tickets currently marked feature requests would require some form of community support or sponsorship for Artefactual to be able to address.

Nevertheless, we have slowly made progress. Helen, I hope this gives you some of the answers you were looking for in your original email!

Regards,

Dan Gillean, MAS, MLIS
AtoM Product Manager / Systems Analyst,
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

João Pereira

Mar 31, 2015, 5:35:45 AM
to ica-ato...@googlegroups.com
Great news Dan!

Best regards,

João Pereira

Mark Triggs

Apr 10, 2015, 3:06:35 AM
to Dan Gillean, ica-ato...@googlegroups.com

Hi everyone,

Just a quick message to note that I'm currently looking at a couple of these issues. For this one:

https://projects.artefactual.com/issues/8158

I have enhanced the resumption token implementation to keep track of the OAI set used, and have modified the encoding scheme to use base64-encoded JSON (which fits the bill for not requiring any special escaping when harvested).
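
The general idea of a base64-encoded JSON token can be sketched in a few lines of Python. This is only an illustration of the technique: the field names (`set`, `cursor`, `metadataPrefix`) are hypothetical, and the URL-safe base64 alphabet is an assumption on my part, since the message above does not say which variant the AtoM change uses.

```python
import base64
import json


def encode_token(state):
    """Serialise harvest state to a URL-safe base64 string."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token):
    """Recover the harvest state from a token made by encode_token."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    return json.loads(raw.decode("utf-8"))
```

Because the token carries the set and position itself, the repository can resume a set-limited harvest without keeping per-harvester state. Note that the output may still end in `=` padding characters.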

I also hit the same issue as João Pereira documented in this issue:

https://projects.artefactual.com/issues/8159

so I've incorporated similar handling of special characters in the OAI XML responses.

With the changes I've made so far, I've got jOAI automatically harvesting its way through one of the new OAI sets I added recently, so things are looking better. Once I've had a chance to do some more testing I'll clean up my work and send a pull request for review. Next week, hopefully!

Cheers,

Mark


Dan Gillean

Apr 10, 2015, 1:45:51 PM
to ica-ato...@googlegroups.com
Mark, this is great news!

Thank you so much for working with us so that the entire community can benefit from the enhancements you are undertaking. Seeing this kind of collaboration in action makes me proud to be a part of the AtoM community.

We look forward to checking out your pull requests!

Cheers,

Dan Gillean, MAS, MLIS
AtoM Product Manager / Systems Analyst,
Artefactual Systems, Inc.
604-527-2056
@accesstomemory


Mark Triggs

Apr 10, 2015, 6:47:57 PM
to Dan Gillean, ica-ato...@googlegroups.com

Thanks Dan! I must in turn thank the Australian National University Archives for sponsoring this work and giving me a good reason to dive in :)

Cheers,

Mark


Creighton Barrett

Apr 11, 2015, 3:39:03 PM
to ica-ato...@googlegroups.com, Dan Gillean
Amazing. This is so cool. Looking forward to hearing your progress. Great work all around!


Victoria Peters

Apr 13, 2015, 4:23:03 AM
to ica-ato...@googlegroups.com, Dan Gillean

Yes, excellent!  Thanks very much.

 

Victoria

 

 


Dan Gillean

May 11, 2015, 7:17:14 PM
to ica-ato...@googlegroups.com
Hi all,

One more positive update:

Mark Triggs has implemented João's suggestions on #8159, I've tested them locally, and Mark has tested them with 2 different OAI harvesters. We've added this as the last feature that will make it into the 2.2 release!

Note that this fix has an inadvertent positive side effect - it makes our DC XML more compliant throughout the application :D

At this point, looking in our issue tracker, there are currently no open bug tickets for OAI-PMH - only a couple enhancements/feature requests to add future functionality.

A big thanks to everyone involved in this thread for working together to help outline the issues, come up with strategies, and implement them! Thanks especially to João for the suggestions on #8159 (which was filed in response to his testing and comments on this thread), and to Mark for implementing them and sharing the code with the AtoM community!


Regards,

Dan Gillean, MAS, MLIS
AtoM Product Manager / Systems Analyst,
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

Creighton Barrett

May 11, 2015, 8:29:58 PM
to ica-ato...@googlegroups.com
Fantastic work, all around! We've been super interested in harvesting AtoM with Primo and I'm thrilled to see that this now looks possible. I'd love to hear from Mark and anyone else who has successfully harvested some AtoM records into a different system using these new OAI tools. We plan on testing in 2.2 when we get it installed.

Well done!

João Pereira

May 12, 2015, 9:38:36 AM
to ica-ato...@googlegroups.com
Fantastic job by everyone involved!

Thank you all!

João Pereira


Victoria Peters

May 12, 2015, 9:41:37 AM
to ica-ato...@googlegroups.com

Great news indeed!

 

We will be testing too, once we have 2.2.

 

Best wishes

 

Victoria

Mark Triggs

May 14, 2015, 10:33:13 PM
to Creighton Barrett, ica-ato...@googlegroups.com

Hi Creighton,

My testing so far has been with the jOAI harvester and Trove.

With the jOAI harvester (http://www.dlese.org/oai/docs/harvester.jsp), I've confirmed I can successfully run full and partial harvests. This doesn't do anything much with the harvested records, but at least checks that the mechanics of the OAI repository are working.

In conjunction with the Australian National University Archives and the Trove team (http://trove.nla.gov.au/), I've also been working on getting an AtoM repository harvested by Trove. That's looking good too: Trove has been able to harvest records from AtoM and add them to its collection, so that's a good indicator that it will work with harvesters "in the wild" too. It will be interesting to hear whether the Primo harvester works too.

Happy to answer any questions if you want to shoot me an email (either on or off list is fine).

Cheers!

Mark


Creighton Barrett

May 15, 2015, 8:30:36 AM
to Mark Triggs, ica-ato...@googlegroups.com
Awesome, thanks Mark. Great to hear about your early successes. I will definitely get in touch once 2.2 is released and we get to the upgrade and make time for testing in Primo. Could be a little while, but it's definitely high on our to-do list.

Gomes Silva

Aug 11, 2017, 5:17:16 PM
to AtoM Users, ma...@teaspoon-consulting.com
Hello everyone,

There are two things I need to change in the _dc.xml.php file.


1- The first thing is related to dates. The harvester I want to subscribe to uses this DC scheme for dates (a start date and an end date):
<dc:date>1901</dc:date>
<dc:date>1909</dc:date>


So, with the help of João we did this:

<?php foreach ($resource->getDates() as $item): ?>
  <dc:date><?php echo $item->startDate ?></dc:date>
<?php endforeach; ?>

<?php foreach ($resource->getDates() as $item): ?>
  <dc:date><?php echo $item->endDate ?></dc:date>
<?php endforeach; ?>

Result:
<dc:date>1901-00-00</dc:date>
<dc:date>1909-00-00</dc:date>

It works fine, except when the month and day are not filled in, in which case they are shown as zeros. I need the startDate and endDate, but how can I hide the month and day values when they are empty?

In the default code, years, months and days behave as expected. Is there a way to make these startDate and endDate behave like the default code (hide month and day values when empty)? 



2- The second thing is related to thumbnails.
The harvester I want to subscribe to can generate thumbnails if you place the digital object URL in the <dc:relation> element,
so, following João's advice, we did this:

<?php if (isset($resource->digitalObjects[0])): ?>
   <dc:relation>http://<?php echo $sf_request->getHost().$sf_request->getRelativeUrlRoot().$resource->digitalObjects[0]->getFullPath() ?></dc:relation>
<?php endif; ?>

However, instead of providing the master's URL (a heavy file that is not always available to everyone and not always a JPEG), we wanted to provide the thumbnail URL, which is basically the same URL but ends in _142.jpg. What should be changed in this code?


I don’t know if there’s an easy solution for these problems, but if they are quick to solve I appreciate your help.

Thank you very much for your support!

sbr...@artefactual.com

Aug 16, 2017, 3:16:08 PM
to ica-ato...@googlegroups.com, ma...@teaspoon-consulting.com
Hello Gomes

Regarding your first question on the dates:

You should be able to use our Qubit::renderDate() function like so:

foreach ($io->getDates() as $date)
{
  print("Date: " . Qubit::renderDate($date->startDate) . "\n");
}

Output:  Date: 2017
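
For anyone post-processing harvested records outside AtoM, the trimming behaviour asked about (dropping month and day when they are zero) can be sketched in Python as below. This only illustrates the idea on date strings like the ones shown earlier in the thread; it is not the implementation of Qubit::renderDate, and `trim_date` is a hypothetical name.

```python
def trim_date(value):
    """Drop trailing '-00' month/day parts from a 'YYYY-MM-DD' style string."""
    parts = value.split("-")
    # Remove empty (zero) components from the right, but always keep the year.
    while len(parts) > 1 and parts[-1] == "00":
        parts.pop()
    return "-".join(parts)


if __name__ == "__main__":
    for sample in ("1901-00-00", "1909-05-00", "1909-05-12"):
        print(sample, "->", trim_date(sample))
```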


Your second point is a little more difficult for me to help you with, as I do not know how the path to the digital object needs to be manipulated - you say it needs to end in '_142.jpg' but I suspect that number would need to change per object.  You could certainly append anything you want to the path string, but I suspect it's not as simple as this.


Steve

Gomes Silva

Aug 17, 2017, 4:19:46 AM
to AtoM Users, ma...@teaspoon-consulting.com
Excellent!
The Qubit::renderDate function worked perfectly.

The first question is solved. Thank you very much for your support!

Gomes Silva

Aug 17, 2017, 4:38:06 AM
to AtoM Users, ma...@teaspoon-consulting.com
Just to clarify the second point:

I don't know if Archivematica's DIPs behave the same way, but AtoM 2.x creates files like this:

Filename                                        Size      Description
derivada_PT-CMALB-NI-DNPDN-DNPDN-4940.pdf       13636 KB  Master file
derivada_PT-CMALB-NI-DNPDN-DNPDN-4940_141.jpg   49 KB     Reference representation ("_141.jpg")
derivada_PT-CMALB-NI-DNPDN-DNPDN-4940_142.jpg   36 KB     Thumbnail representation ("_142.jpg")

Note: They are always stored in the same folder. I think this is AtoM's default behavior.
If there's an easy way to show the thumbnail's path, that would be great.
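
The naming pattern in the listing above can be expressed as a small string transformation, sketched here in Python. The `_141.jpg` and `_142.jpg` suffixes are only what was observed on this installation, the `uploads/` folder in the example is made up, and `derivative_path` is a hypothetical helper, so treat this as an illustration of the pattern rather than a guaranteed rule.

```python
import posixpath

# Suffixes as observed in the listing above; treat them as assumptions.
REFERENCE_SUFFIX = "_141.jpg"
THUMBNAIL_SUFFIX = "_142.jpg"


def derivative_path(master_path, suffix=THUMBNAIL_SUFFIX):
    """Swap the master file's extension for a derivative suffix."""
    stem, _extension = posixpath.splitext(master_path)
    return stem + suffix


if __name__ == "__main__":
    master = "uploads/derivada_PT-CMALB-NI-DNPDN-DNPDN-4940.pdf"
    print(derivative_path(master))
    print(derivative_path(master, REFERENCE_SUFFIX))
```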

In any case, the first question was what concerned me most and you already solved it.

Thank you again for all your help.

Dan Gillean

Aug 17, 2017, 3:32:30 PM
to ICA-AtoM Users
Hi Gomes, 

You might find the following post (and it links to another thread as well) useful in figuring out how the digital objects are stored, and how you might figure out the file path you need. See: 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory


Gomes Silva

Aug 21, 2017, 7:54:42 AM
to AtoM Users
Hello Dan,

Thanks to your and Sbreker's tips I managed to solve the thumbnail issue with rtrim() and suchlike. So, the issues are pretty much solved and we thank you both for that.


If I'm not mistaken, AtoM is designed around the standards-based principle of non-duplication of information at lower levels of description, so we have to enter the name of the "Creator" at the highest level of description only. I believe the same is true for the "Repository" and "LangMaterial" elements.

So, is there a way to make the Dublin Core XML pick up the top-level creator, repository and langmaterial information?
These are mandatory elements for that national harvester, but they don't appear in lower-level descriptions.

Thank you again

Dan Gillean

Aug 29, 2017, 1:24:29 PM
to ICA-AtoM Users
Hi Gomes, 

There's no easy way to do this without overhauling how the DC XML and OAI Repository plugin are currently implemented, unfortunately. However, I did want to let you know about a new feature coming in 2.4 that might be of use to you. In 2.4, you will be able to expose EAD 2002 XML via the OAI Repository module. Since EAD 2002 XML contains a complete archival hierarchy, you won't need to perform multiple queries to get all the different levels of description and then try to figure out how they relate to each other after. Docs here: 

Often, EAD 2002 XML exports can be too big to generate on the fly and serve to a harvester - so AtoM 2.4 will also include the ability to pre-generate the  EAD XML every time you create or edit a description via the user interface, and then cache the XML so it can be served up when users try to download the file or a harvester requests oai-ead as the metadata format. See: 
There is also a command-line task available that will allow you to generate and cache the XML for all existing descriptions: 

Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory


arquivo.a...@gmail.com

Dec 7, 2017, 12:20:00 PM
to AtoM Users
Hi,

For anyone interested in providing the thumbnail URL, you can use $digitalObject->getChildByUsageId(QubitTerm::THUMBNAIL_ID);

Regards,
Helder

Gomes Silva

Jan 19, 2018, 2:21:02 PM
to AtoM Users
Hello again,

Thank you for the reply. 

The EAD will surely solve those problems. I just hoped it would be as easy as it was for the identifier element. By default, _dc.xml.php uses $resource->identifier, which only shows the identifier of the current record, but if we replace it with $dc->identifier it brings in all the identifiers from the upper levels as well, giving us a complete hierarchical reference code. Maybe a better example is the CSV export, where those empty fields (creator, repository and LangMaterial) are automatically filled in from the upper levels even though CSV is not a hierarchical format.
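
To illustrate what a "complete hierarchical reference code" means here, the following is a minimal standalone sketch, not AtoM's implementation. The function name, the "-" separator, the sample identifiers, and the choice to skip levels without an identifier are all assumptions made for the example.

```php
<?php
// Standalone sketch: the function name, separator and sample identifiers
// are hypothetical and are not taken from AtoM's code.

/**
 * Build a hierarchical reference code from identifiers ordered top level first.
 * Levels without an identifier are skipped (an assumption of this sketch).
 */
function inheritedReferenceCode(array $identifiers, string $separator = '-'): string
{
    $parts = array_filter($identifiers, function ($id) {
        return null !== $id && '' !== trim($id);
    });

    return implode($separator, $parts);
}

// Top-level code, fonds, a sub-level with no identifier, then the item
echo inheritedReferenceCode(['PT-ABC', 'F01', null, '0042']), "\n"; // PT-ABC-F01-0042
```

With only the item's own identifier the result would be "0042", which is the behaviour described for $resource->identifier.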


Just one more question, regarding <dc:relation>. We managed to make it retrieve the master, reference and thumbnail URLs, but is there a way to make it check whether the associated digital object (e.g. the master) is accessible to the public?
The idea is that if $resource->digitalObjects[0] cannot be accessed/seen by anonymous users it would echo something else, like a $digitalObject->getChildByUsageId(QubitTerm::REFERENCE_ID) instead, for instance.

Thank you again for your patience.

Gomes Silva

Jan 11, 2019, 5:15:52 PM
to AtoM Users
Hello, 

Just a quick update to say that the Repository issue has been solved.

So now the only thing that remains is the Creator. After this, the Dublin Core export will be fully compatible with AtoM's (and ISAD's) standards-based principle of non-duplication of information at lower levels of description.


1. How to enable the inherited reference code

This is AtoM's original code:
<?php echo esc_specialchars(strval($resource->identifier)) ?>

Change to this:  
<?php echo esc_specialchars(strval($dc->identifier)) ?>


2. How to enable the inherited Repository

This is AtoM's original code:
<?php echo esc_specialchars(strval($resource->repository->authorizedFormOfName)) ?>

Change to this:
<?php echo esc_specialchars(strval($resource->getRepository(array('inherit' => true)))) ?>


3. How to enable the inherited Creator


This is AtoM's original code:

  <?php foreach ($resource->getCreators() as $item): ?>
    <dc:creator><?php echo esc_specialchars(strval($item)) ?></dc:creator>
  <?php endforeach; ?>

Change to this:
  ...

Well, I still don't know how to do this one, but I believe it's possible, since that information is already shown at lower levels in the archival description view, just as the Repository is. If I manage to find out, I'll post an update here. Meanwhile, if a developer could provide the code, it would be awesome.


Best regards

Dan Gillean

Jan 14, 2019, 12:15:07 PM
to ICA-AtoM Users
Hi Gomes, 

The developer who is best suited to answer this question is currently on vacation, but I've sent him a note and asked him if he can provide any guidance when he returns. 

Hopefully we'll be able to point you in the right direction! 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

José Raddaoui

Jan 17, 2019, 11:38:57 AM
to AtoM Users
Hi Gomes,

You'll need to traverse the ancestors (and self) going backwards and get the first resource with existing creators. This is the code used in the archival descriptions view page:


I hope that helps.

Regards.

Ricardo Pinho

Jan 21, 2019, 4:54:31 AM
to ica-ato...@googlegroups.com
Hi José,
Based on your tip I've managed to find this solution:
Could you please have a look at it and correct it if needed?
Thank you.
Cheers,
Ricardo Pinho




--
Ricardo Pinho

José Raddaoui

Jan 21, 2019, 1:06:59 PM
to AtoM Users

Hi Ricardo,

That should print only the first creator found in the ancestors. If that's the intended behavior, it looks good to me. Otherwise, if you want to print all the creators of the first ancestor that has any, you could remove the first "break;". I'd also rename "item" to "ancestor" and "ancestor" to "creator", as it makes more sense when reading the code, but that shouldn't change the final result.

Best regards.

Gomes Silva

Jan 21, 2019, 1:51:38 PM
to AtoM Users
You guys are amazing. Thank you all for your support!

As promised, let me share what was found out:


1. How to enable the inherited reference code

This is AtoM's original code:
<?php echo esc_specialchars(strval($resource->identifier)) ?>

Change to this:  
<?php echo esc_specialchars(strval($dc->identifier)) ?>


2. How to enable the inherited Repository

This is AtoM's original code:
<?php echo esc_specialchars(strval($resource->repository->authorizedFormOfName)) ?>

Change to this:
<?php echo esc_specialchars(strval($resource->getRepository(array('inherit' => true)))) ?>


3. How to enable the inherited Creator

This is AtoM's original code:
  <?php foreach ($resource->getCreators() as $item): ?>
    <dc:creator><?php echo esc_specialchars(strval($item)) ?></dc:creator>
  <?php endforeach; ?>

Change to this:
 <?php
  // Walk from the current description up towards the top level
  foreach ($resource->ancestors->andSelf()->orderBy('rgt') as $ancestor)
  {
    $creators = $ancestor->getCreators();
    if (0 < count($creators))
    {
      foreach ($creators as $creator)
      {
        echo "<dc:creator>" . esc_specialchars(strval($creator)) . "</dc:creator>";
        break; // Comment this line out to show all creators of that level
      }
      break; // Stop at the nearest level that has creators
    }
  }
 ?>


4. How to enable the inherited Language

 This is AtoM's original code:
 <?php foreach ($resource->language as $code): ?>
    <dc:language xsi:type="dcterms:ISO639-3"><?php echo esc_specialchars(strval(strtolower($iso639convertor->getID3($code)))) ?></dc:language>
  <?php endforeach; ?>

Change to this:
<?php
  foreach ($resource->ancestors->andSelf()->orderBy('rgt') as $ancestor)
  {
    if (0 < count($ancestor->language))
    {
      foreach ($ancestor->language as $code)
      {
        if (isset($code))
        {
          echo "<dc:language>";
          echo esc_specialchars(strval(strtolower($iso639convertor->getID3($code))));
          echo "</dc:language>";
        }
        // break; // Uncomment for a single language (shows only the first in alphabetical order)
      }
      break;
    }
  }
?>
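
For anyone who wants to try the "nearest level with a value wins" logic from steps 3 and 4 outside AtoM, here is a minimal standalone sketch. Plain arrays stand in for the descriptions, and the function name, array keys and sample data are hypothetical. The list is assumed to be ordered from the current description up to the top level, which is the order the templates get from ->ancestors->andSelf()->orderBy('rgt').

```php
<?php
// Standalone sketch: plain arrays stand in for AtoM descriptions.
// inheritedValues(), the array keys and the sample data are hypothetical.

/**
 * Return the values of $field from the nearest level that has any.
 * $chain must be ordered from the current description up to the top level.
 */
function inheritedValues(array $chain, string $field, bool $firstOnly = false): array
{
    foreach ($chain as $level) {
        $values = $level[$field] ?? [];
        if (0 < count($values)) {
            // Stop at the first level with data; higher levels are ignored
            return $firstOnly ? [reset($values)] : array_values($values);
        }
    }

    return []; // nothing found at any level
}

$chain = [
    ['title' => 'Item',   'creators' => [],                    'languages' => []],
    ['title' => 'Series', 'creators' => ['Smith', 'Jones'],    'languages' => []],
    ['title' => 'Fonds',  'creators' => ['Top-level creator'], 'languages' => ['por', 'eng']],
];

// First creator of the nearest level that has creators
foreach (inheritedValues($chain, 'creators', true) as $creator) {
    echo '<dc:creator>' . htmlspecialchars($creator, ENT_XML1) . "</dc:creator>\n";
}

// All languages of the nearest level that has languages
foreach (inheritedValues($chain, 'languages') as $code) {
    echo '<dc:language>' . htmlspecialchars($code, ENT_XML1) . "</dc:language>\n";
}
```

Passing true as the third argument corresponds to keeping the inner "break;" in the templates above, and leaving it out corresponds to commenting that break out.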


Best regards

Ricardo Pinho

Jan 22, 2019, 4:25:30 AM
to ica-ato...@googlegroups.com
Hi José,

> That should print only the first creator if founds in the ancestors.
> If that's the intended behavior, it looks good to me.
> Otherwise, if you want to print all the creators of the first ancestor with some, you could remove the first "break;".  

That's correct! We are now testing both behaviors to see which one is best.

>  I'd also change "item" for "ancestor" and "ancestor" for "creator", as it makes more sense when reading the code, but that shouldn't change the final result.




--
Ricardo Pinho