OAI and Dublin Core export depends on browser language


TeD TeD

Mar 16, 2022, 9:19:14 AM
to AtoM Users
Hello everyone,

I'm experimenting with AtoM lately and I realized that in a multilingual installation, the Dublin Core export (and OAI harvest) depends on the browser language.

For example,
In an English/French environment, let's say you enter an English title for a digital object and then switch to the French interface. If you choose to export the record, you will NOT get the English title in the XML. If you switch back to the English interface, the title is exported in the XML.

The same happens with the OAI harvest! If, for the same example, you try to access a record using a URL such as http://xxx.yyy.zzz/;oai?verb=GetRecord&identifier=oai:xxx.yyy.zzz_385&metadataPrefix=oai_dc you will get the title ONLY IF, in the same browser, you have set the AtoM interface to the same language as the title.

As you can imagine, if you try to harvest from another server (using only the command line), there is no way to pass the &sf_culture parameter in the URL, so you will end up with partial metadata.

The same applies to other multilingual areas (such as rights, etc).

I think that the proper way to handle this would be to export ALL data in ALL languages, specifying the appropriate language in the DC/OAI xml output. After all it's an export for machine use, so the interface should not play any role :)
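
To make the suggestion concrete, here is a minimal Python sketch of what a single DC record carrying every translation might look like. It is not AtoM code: the `build_dc_record` helper and the sample titles are made up for illustration. Only the `oai_dc` and `dc` namespace URIs and the `xml:lang` attribute come from the published standards.

```python
import xml.etree.ElementTree as ET

# Namespace URIs from the OAI-PMH and Dublin Core specifications.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
XML_NS = "http://www.w3.org/XML/1998/namespace"

ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build an oai_dc element from {dc_element: {culture: value}}.

    Hypothetical helper: every translation is emitted as its own
    repeated element, tagged with xml:lang.
    """
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, translations in fields.items():
        for culture, value in sorted(translations.items()):
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.set(f"{{{XML_NS}}}lang", culture)
            child.text = value
    return root


# Made-up sample data for an English/French record.
record = build_dc_record({
    "title": {"en": "Annual report", "fr": "Rapport annuel"},
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The point of the sketch is that the output no longer depends on any interface language: each value carries its own language tag, and the consumer decides which one to display.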

As I'm fairly new to AtoM, I might have misunderstood something or missed a configuration option... In that case, please advise accordingly and accept my apologies for not carefully reading the (extensive) documentation!

Keep up the great work,

Kind regards,
Theodoros Theodoropoulos

Dan Gillean

Mar 18, 2022, 11:16:17 AM
to ICA-AtoM Users
Hi Theodoros, 

Unfortunately, I think you are encountering a known issue in AtoM that we need to address. There is a wishlist ticket that captures some of the problem and a proposed solution here: 

> I think that the proper way to handle this would be to export ALL data in ALL languages, specifying the appropriate language in the DC/OAI xml output. After all it's an export for machine use, so the interface should not play any role :)

Would you imagine this to be multiple different DC XML records returned as a set, or a single record that repeats each field in all available cultures/languages? If the latter, I have been trying to find some online examples of multilingual DC XML records, but so far haven't found any. If you know of some good examples that will validate, please let me know where to find them for reference! 

Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him


--
You received this message because you are subscribed to the Google Groups "AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ica-atom-users/6241c1c3-d2cd-443e-a7a7-8017600a39f7n%40googlegroups.com.

TeD TeD

Mar 23, 2022, 7:01:15 AM
to AtoM Users
Hello Dan,

Having different export files per language code is not good practice, I think. One would expect all data to be in a single XML file, with the multilingual fields defined appropriately. I believe the xml:lang attribute is used in the standard for this.

Please see: https://www.dublincore.org/specifications/dublin-core/dc-xml/ and jump to "XML Example 22: Rich Representation - Binary Data". See also https://www.dublincore.org/specifications/dublin-core/dc-xml-guidelines/ (Recommendation 9)

Best regards,
Theodoros

Dan Gillean

Mar 25, 2022, 8:59:17 AM
to ICA-AtoM Users
Hi Theodoros, 

Thanks for the links, they are helpful. I've updated the Wishlist ticket with further information based on the links you've provided. See: 
Unfortunately, I can't find any indication as to how one would create a valid multilingual EAD 2002 XML file, nor could I find any examples. The @langcode attribute is restricted to a few elements, and most EAD elements do not support any way to repeat them in different languages and differentiate between the two. One could theoretically use @ID attributes and some sort of internal logic to determine which ID maps to which language, but this would not import well into any other system, defeating the purpose of EAD 2002 as a metadata exchange format. 

Once again, if I'm missing a resource or example you're aware of, I would love to see it. 

Either way, a fair bit of analysis and development will be required to implement something like the proposed changes, meaning we will likely need community support, either via code contributions or development sponsorship, to be able to address this in the near future. Long term as we revise AtoM we will certainly keep better multilingual import and export support in mind as a top design principle. 

In the meantime, though there are some similar problems with the CSV export, a full descriptions CSV export run from the command-line should include all available languages. 

Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him

TeD TeD

Mar 31, 2022, 7:28:17 AM
to AtoM Users
I would like to stress here that the main showstopper for us is that the OAI export is incomplete when there are information objects in more than one 'culture' in the system. This export is our only way to 'sync' AtoM metadata with another (external) system that combines data from our other online sources. OAI is a standard for inter-system communication and is ideal for our scenario. As such, CSV export, although very useful, is not a viable option for us.

Right now the only workaround I see is to remove all extra languages and change the culture in all our current information objects, even if the content is in another language. Can you think of another 'hack' that would allow us to get ALL data via the OAI export? (That is, without having to wait for the code fix!)

Best regards,
Theodoros Theodoropoulos

TeD TeD

Mar 31, 2022, 7:54:20 AM
to AtoM Users
Hmm... As our data is mostly (if not only) in Greek, if you could give us a hint regarding how we could 'force' the OAI export to use the el culture (something like &sf_culture=el, but hardcoded in the internal function that produces the DC export used in the OAI request), it would help a lot!
This way, we could leave the multilingual interface as is and we wouldn't have to tamper with the culture column in the DB.

TeD TeD

Apr 1, 2022, 8:30:55 AM
to AtoM Users
Note to myself: It seems that the latter hack (force a culture for DC export) IS possible.
One should change ./plugins/sfDcPlugin/modules/sfDcPlugin/templates/_dc.xml.php
and (to force 'el' culture) do something like:

8,9c8,9
<   <?php if (!empty($resource->title)): ?>
<     <dc:title><?php echo esc_specialchars(strval($resource->title)) ?></dc:title>
---
>   <?php if (!empty($resource->getTitle(array('culture' => 'el')))): ?>
>     <dc:title><?php echo esc_specialchars(strval($resource->getTitle(array('culture' => 'el')))) ?></dc:title>

The same will be required for dc:rights, dc:relation and probably other dc fields as well.

It seems that supporting proper multilingual output in DC isn't super complicated after all (and has been implemented for SkosPlugin). Unfortunately my PHP knowledge is only basic and I'm a newbie to AtoM. I might give it a try at some point, but (provided I manage to get it working in the first place) the code will be messy.

I'm leaving this 'note to myself' for future reference in case someone finds it useful.

Dan Gillean

Apr 1, 2022, 9:53:48 AM
to ICA-AtoM Users
Hi Theodoros, 

I checked in with one of our developers, who suggested trying the following: 
Add the following and save the file: 
  • $this->context->getUser()->setCulture("el");
This should hopefully implement the change for all DC exports, not only OAI. Alternatively, try adding the same here instead: 
And it should apply to OAI DC XML responses, but not to regular DC XML exports. 

I haven't personally tested this, so please let us know if it works! 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him

TeD TeD

Apr 5, 2022, 4:10:12 AM
to AtoM Users
Thanks Dan (and AtoM developers), your suggestion works as expected!

Remember it's only a dirty workaround that seems to fit our current needs. The requirement for supporting multilingual DC export still exists :)

Keep up the good work,
Theodoros

TeD TeD

Nov 11, 2024, 8:45:41 AM
to AtoM Users
Hello Dan and apologies for raising again an old issue, but 2 1/2 years have passed and I fear that our request for a DC export mechanism that would include metadata from ALL languages might have been lost and forgotten.
After all, the wishlist issue https://projects.artefactual.com/issues/12107 is no longer available/accessible.

To remind you all of our current situation: We have a production multilingual AtoM site that has mostly Greek/English metadata and we would like to have a way to export them all in one file (ideally in DC or OAI) in order to feed them to our University Library Search Engine, or even for backup purposes, for comparing to previous versions, etc!

Currently we're able to 'hack' the OAI plugin and force-set the culture for exporting either Greek or English metadata (but not both!). I think it would be hugely useful for ALL AtoM users who run multilingual sites to either have a 'special' culture name that would result in the export of ALL metadata in ALL cultures, or even to make this the DEFAULT behavior if the user has not specified an sf_culture parameter in the URL. In my view, anyone who wants to export a metadata (system) file would probably want to get all available metadata, and THEN display it according to user preferences.
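
The behavior I have in mind can be sketched in a few lines of Python. This is only an illustration of the idea, not AtoM code: the `select_translations` function, the `"all"` culture name and the sample title are all hypothetical.

```python
def select_translations(translations, culture=None):
    """Pick which translations of a single field to export.

    translations: {culture_code: value} for one field.
    culture: the requested culture, or None / "all" (a hypothetical
    special value) to export every available translation.
    """
    if culture is None or culture == "all":
        # No culture requested: export everything that exists.
        return dict(translations)
    if culture in translations:
        return {culture: translations[culture]}
    # Requested culture has no value for this field.
    return {}


# Made-up sample title in Greek and English.
title = {"el": "Ετήσια έκθεση", "en": "Annual report"}
print(select_translations(title))        # every culture
print(select_translations(title, "el"))  # Greek only
```

With this kind of default, a harvester that sends no culture at all would receive the complete record, while a browser session could still ask for one language.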

I understand that you have limited resources and probably have other, more urgent priorities, so I'm not asking for an immediate fix, but please (if you understand the importance of this and agree, that is!) put it back on your TODO list!

I'm at your disposal for further information,
Kind regards,

Theodoros Theodoropoulos

Dan Gillean

Nov 13, 2024, 9:05:04 AM
to ica-ato...@googlegroups.com
Hi Theodoros, 

First, I should mention that I am no longer the AtoM Program Manager, and am only on this forum sporadically - please see the announcement posted here: 
For now, the AtoM project is under the stewardship of a team of Maintainers: 
I have made the AtoM Maintainers aware of this thread. They will track this request along with the many others that come in from our clients and community as they assess how best to support AtoM. That said, I can offer no promises that your request will be implemented any time soon. 

Your feature request is a great one and I would love to see AtoM's multilingual support improved. However, as hinted at in our previous exchange, it is also a much more complex request than it first appears, and additional analysis would be needed to see if the OAI-PMH specification, and/or EAD or DC XML, would even support your request in a way that would generate valid XML and could effectively be ingested by any OAI harvester. I did an initial bit of searching and failed to locate any concrete examples of multilingual OAI implementations, or any documentation discussing this. Not saying it's not possible - just that even determining if the standards will support it in a generalized, repeatable way that third-party harvesters would even understand will require work, let alone all of the development changes that would be required in AtoM to implement such a change.  

If this is a serious need, I would encourage you to consider what your organization can do to actively support its development. Artefactual is moving away from a sales-led, "if you have money we will automatically develop whatever you want and put it in an AtoM release" type of approach to how we maintain and develop AtoM - but we remain interested in building long-term relationships and partnering with cultural memory organizations to help solve their problems, and in many cases the best way for us to do that is to improve the products we maintain to better serve their users. If solving the problem of multilingual description exchange is a priority for you, then you might consider contacting Artefactual directly to discuss options for such a partnership. 

Regards, 

Dan Gillean, MAS, MLIS
Business & User Experience Analyst
Artefactual Systems, Inc.
604-527-2056
he / him


TeD TeD

Nov 14, 2024, 5:55:29 AM
to ica-ato...@googlegroups.com
Thank you Dan for clarifying the current status of the project.

For the sake of completeness, as I mentioned before, the OAI, DC XML and EAD standards all DO support multilingual elements:

For OAI see here:
<dc:title xml:lang="en">The Cornell Law Quarterly</dc:title>
For DC XML, see the guidelines here:
Recommendation 9. Where the language of the value is indicated, it should be encoded using the 'xml:lang' attribute. For example:
<dc:subject xml:lang="en">seafood</dc:subject>
<dc:subject xml:lang="fr">fruits de mer</dc:subject>

For EAD, see here the @lang attribute (May be used consistently in a multi-lingual finding aid to specify which elements are written in which language. Available on all non-empty elements.):
<corpname encodinganalog="610" identifier="http://viaf.org/viaf/139169065" lang="eng">
<part>Hudson's Bay Company</part>
</corpname>

As for your concern about whether it will be parsed properly by a harvester, I can assure you it will! We constantly use OAI-PMH exports in most of our services (e.g. our OJS, Open Journal Systems, the best-known Open Source platform for hosting/publishing eJournals, ALSO of CANADIAN origin!). They do have multilingual entries for titles, etc., and they get harvested by third parties without issues.

Europeana, a European harvester for cultural heritage, also supports harvesting of multilingual content using the xml:lang attribute, see here
<dc:description xml:lang="fr">végétation des montagnes de France</dc:description>
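
As a rough illustration of the harvester side, the following Python sketch groups DC values by their xml:lang attribute using only the standard library. The sample record simply combines the seafood and Europeana snippets quoted in this message; the `group_by_language` function is a hypothetical helper, not part of any real harvester, and filing untagged values under "und" is my own assumption.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DC_NS = "http://purl.org/dc/elements/1.1/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Illustrative record assembled from the snippets quoted in this thread.
SAMPLE = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:subject xml:lang="en">seafood</dc:subject>
  <dc:subject xml:lang="fr">fruits de mer</dc:subject>
  <dc:description xml:lang="fr">végétation des montagnes de France</dc:description>
</oai_dc:dc>"""


def group_by_language(xml_text):
    """Return {dc_element: {language: [values]}} for one oai_dc record."""
    grouped = defaultdict(lambda: defaultdict(list))
    for element in ET.fromstring(xml_text):
        if not element.tag.startswith(f"{{{DC_NS}}}"):
            continue  # ignore anything outside the DC namespace
        name = element.tag.split("}", 1)[1]
        # Elements without xml:lang are filed under "und" (assumption).
        language = element.get(XML_LANG, "und")
        grouped[name][language].append(element.text or "")
    return {name: dict(langs) for name, langs in grouped.items()}


values = group_by_language(SAMPLE)
print(values["subject"]["fr"])
```

Nothing special is needed on the consuming side: repeated elements with different xml:lang values are ordinary XML, so a generic parser can read them.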

I hope I gave the new developer team a good starting point :)

I'm always at your disposal for further information,
Kind regards,
Theodoros Theodoropoulos


Dan Gillean

Nov 14, 2024, 8:54:24 AM
to ica-ato...@googlegroups.com
Hi Theodoros, 

I appreciate the added context - I am sure it will help whoever picks up this request next. 

If you know of any OAI response examples that include multilingual content (ideally ones that I can look at in my web browser), it would also help to see some real implementations - the one example link included in the Europeana help article you shared is dead, unfortunately, and I wasn't successful in finding a good example on my own. 

Cheers, 

Dan Gillean, MAS, MLIS

Business & User Experience Analyst
Artefactual Systems, Inc.
604-527-2056
he / him

