OAI and Dublin Core export depends on browser language

TeD TeD

Mar 16, 2022, 9:19:14 AM
to AtoM Users
Hello everyone,

I'm experimenting with AtoM lately and I realized that in a multilingual installation, the Dublin Core export (and OAI harvest) depends on the browser language.

For example,
In an English/French environment, let's say you enter an English title for a digital object and then switch to the French interface. If you export the record, you will NOT get the English title in the XML. If you switch back to the English interface, the title is exported in the XML.

The same happens with the OAI harvest! If, for the same example, you try to access a record using e.g. http://xxx.yyy.zzz/;oai?verb=GetRecord&identifier=oai:xxx.yyy.zzz_385&metadataPrefix=oai_dc you will get the title ONLY IF, in the same browser, you have selected to view AtoM in the same language as the title.

As you can imagine, if you try to harvest from another server (using only the command line), there is no way to pass the &sf_culture parameter in the URL, so you will end up with partial metadata.
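
To make the symptom concrete, here is a minimal Python sketch of what a command-line harvester does with a GetRecord response. The two sample responses are hand-written assumptions (not captured from a real AtoM instance), and get_record_url and dc_titles are hypothetical helper names. The sketch only shows that a harvester cannot tell a record without a title from one whose title was dropped for the current culture.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord request URL (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": identifier,
            "metadataPrefix": metadata_prefix,
        },
        safe=":",  # keep the colon in "oai:..." readable
    )
    return f"{base_url}?{query}"


def dc_titles(response_xml):
    """Return the text of every dc:title element in an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter(f"{{{DC_NS}}}title")]


# Hand-written sample envelope; TITLE is a placeholder filled in below.
ENVELOPE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record><metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:identifier>oai:xxx.yyy.zzz_385</dc:identifier>TITLE
    </oai_dc:dc>
  </metadata></record></GetRecord>
</OAI-PMH>"""

# Assumed response when the session culture matches the title's language.
matching = ENVELOPE.replace("TITLE", "<dc:title>An English title</dc:title>")
# Assumed response when it does not: the title element is simply absent.
mismatched = ENVELOPE.replace("TITLE", "")

print(get_record_url("http://xxx.yyy.zzz/;oai", "oai:xxx.yyy.zzz_385"))
print(dc_titles(matching))
print(dc_titles(mismatched))
```

A real harvester would fetch the URL instead of using the inline samples; the parsing step would be the same.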

The same applies to other multilingual areas (such as rights, etc).

I think that the proper way to handle this would be to export ALL data in ALL languages, specifying the appropriate language in the DC/OAI xml output. After all it's an export for machine use, so the interface should not play any role :)

As I'm fairly new to AtoM, I might have misunderstood something or missed a configuration option... In that case, please advise accordingly and accept my apologies for not reading the (extensive) documentation carefully!

Keep up the great work,

Kind regards,
Theodoros Theodoropoulos

Dan Gillean

Mar 18, 2022, 11:16:17 AM
to ICA-AtoM Users
Hi Theodoros, 

Unfortunately, I think you are encountering a known issue in AtoM that we need to address. There is a wishlist ticket that captures some of the problem and a proposed solution here: 

> I think that the proper way to handle this would be to export ALL data in ALL languages, specifying the appropriate language in the DC/OAI xml output. After all it's an export for machine use, so the interface should not play any role :)

Would you imagine this to be multiple different DC XML records returned as a set, or a single record that repeats each field in all available cultures/languages? If the latter, I have been trying to find some online examples of multilingual DC XML records, but so far haven't found any. If you know of some good examples that will validate, please let me know where to find them for reference! 

Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him


TeD TeD

Mar 23, 2022, 7:01:15 AM
to AtoM Users
Hello Dan,

Having different export files per language code is not good practice, I think. One would expect all data to be in a single XML file with the multilingual fields marked up appropriately. I believe the xml:lang attribute is used for this in the standard.

Please see: https://www.dublincore.org/specifications/dublin-core/dc-xml/ and jump to "XML Example 22: Rich Representation - Binary Data". See also https://www.dublincore.org/specifications/dublin-core/dc-xml-guidelines/ (Recommendation 9)
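
As a rough illustration of the single-record approach, the Python sketch below builds an oai_dc record that repeats dc:title once per language and tags each copy with xml:lang. The titles are made-up sample data and multilingual_dc is a hypothetical helper; this is not AtoM code, only a sketch of the output shape being proposed.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
# ElementTree serializes this reserved namespace with the "xml" prefix.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

ET.register_namespace("dc", DC)
ET.register_namespace("oai_dc", OAI_DC)


def multilingual_dc(titles_by_culture):
    """Return one oai_dc record with a dc:title per culture (hypothetical helper)."""
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for culture, title in titles_by_culture.items():
        el = ET.SubElement(root, f"{{{DC}}}title", {XML_LANG: culture})
        el.text = title
    return ET.tostring(root, encoding="unicode")


# Made-up sample titles for an English/French record.
record = multilingual_dc({
    "en": "Harbour photographs",
    "fr": "Photographies du port",
})
print(record)
```

A harvester can then pick the language it wants, or keep all of them, without depending on any interface setting.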

Best regards,
Theodoros

Dan Gillean

Mar 25, 2022, 8:59:17 AM
to ICA-AtoM Users
Hi Theodoros, 

Thanks for the links, they are helpful. I've updated the Wishlist ticket with further information based on the links you've provided. See: 
Unfortunately, I can't find any indication as to how one would create a valid multilingual EAD 2002 XML file, nor could I find any examples. The @langcode attribute is restricted to a few elements, and most EAD elements do not support any way to repeat them in different languages and differentiate between the two. One could theoretically use @ID attributes and some sort of internal logic to determine which ID maps to which language, but this would not import well into any other system, defeating the purpose of EAD 2002 as a metadata exchange format. 

Once again, if I'm missing a resource or example you're aware of, I would love to see it. 

Either way, a fair bit of analysis and development will be required to implement something like the proposed changes, meaning we will likely need community support, either via code contributions or development sponsorship, to be able to address this in the near future. In the long term, as we revise AtoM, we will certainly keep better multilingual import and export support in mind as a top design principle. 

In the meantime, though there are some similar problems with the CSV export, a full descriptions CSV export run from the command-line should include all available languages. 

Regards, 

Dan Gillean, MAS, MLIS

TeD TeD

Mar 31, 2022, 7:28:17 AM
to AtoM Users
I would like to stress here that the main showstopper for us is that the OAI export is incomplete when there are information objects in more than one 'culture' in the system. This export is our only way to 'sync' AtoM metadata with another (external) system that combines data from our other online sources. OAI is a standard for inter-system communication and is ideal for our scenario. As such, CSV export, although very useful, is not a viable option for us.

Right now the only workaround I see is to remove all extra languages and change the culture of all our current information objects to a single one, even if their content is in another language. Can you think of another 'hack' that would allow us to get ALL data via OAI export? (That is, without having to wait for the code fix!)

Best regards,
Theodoros Theodoropoulos

TeD TeD

Mar 31, 2022, 7:54:20 AM
to AtoM Users
Hmm... As our data is mostly (if not only) in Greek, if you could give us a hint on how to 'force' the OAI export to use the el culture (something like &sf_culture=el, but hardcoded in the internal function that produces the DC export used in the OAI response), it would help a lot!
This way, we could leave the multilingual interface as is and we wouldn't have to tamper with the culture column in the DB.

TeD TeD

Apr 1, 2022, 8:30:55 AM
to AtoM Users
Note to myself: It seems that the latter hack (force a culture for DC export) IS possible.
One should change ./plugins/sfDcPlugin/modules/sfDcPlugin/templates/_dc.xml.php
and (to force 'el' culture) do something like:

8,9c8,9
<   <?php if (!empty($resource->title)): ?>
<     <dc:title><?php echo esc_specialchars(strval($resource->title)) ?></dc:title>
---
>   <?php if (!empty($resource->getTitle(array('culture' => 'el')))): ?>
>     <dc:title><?php echo esc_specialchars(strval($resource->getTitle(array('culture' => 'el')))) ?></dc:title>

The same will be required for dc:rights, dc:relation and probably other dc fields as well.

It seems that supporting proper multilingual output in DC isn't super complicated after all (and has been implemented for SkosPlugin). Unfortunately my PHP knowledge is only basic and I'm a newbie to AtoM. I might give it a try at some point, but (provided I manage to get it working in the first place) the code will be messy.

I'm leaving this 'note to myself' for future reference in case someone finds it useful.
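
For anyone wondering why the change above helps, here is a toy Python model of the behaviour described in this thread. It is not AtoM code: the dictionary of translations, the sample Greek title, and both function names are assumptions made purely for illustration. The idea is that the stock template looks the title up in the current interface culture, while the patched one always asks for 'el'.

```python
def title_in_culture(translations, culture):
    """Return the title stored for one culture, or None if there is none."""
    return translations.get(culture) or None


def export_title(translations, interface_culture, forced_culture=None):
    """Pick the title to export.

    Without forced_culture this mimics the behaviour reported above:
    the lookup follows the interface culture. With forced_culture it
    mimics the patched template, which ignores the interface culture.
    """
    culture = forced_culture or interface_culture
    return title_in_culture(translations, culture)


# Made-up record that only has a Greek title.
record = {"el": "Αρχείο λιμένος"}

print(export_title(record, interface_culture="en"))
print(export_title(record, interface_culture="en", forced_culture="el"))
```

The first call finds nothing because the interface culture is 'en'; the second finds the Greek title regardless of the interface culture.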

Dan Gillean

Apr 1, 2022, 9:53:48 AM
to ICA-AtoM Users
Hi Theodoros, 

I checked in with one of our developers, who suggested trying the following: 
Add the following and save the file: 
  • $this->context->getUser()->setCulture("el");
This should hopefully implement the change for all DC exports, not only OAI. Alternatively, try adding the same here instead: 
And it should apply to OAI DC XML responses, but not to regular DC XML exports. 

I haven't personally tested this, so please let us know if it works! 

Cheers, 

Dan Gillean, MAS, MLIS

TeD TeD

Apr 5, 2022, 4:10:12 AM
to AtoM Users
Thanks Dan (and AtoM developers), your suggestion works as expected!

Remember it's only a dirty workaround that seems to fit our current needs. The requirement for supporting multilingual DC export still exists :)

Keep up the good work,
Theodoros