Questions about generated XML for archival descriptions

Ad Axem

Dec 12, 2020, 2:40:28 AM
to AtoM Users
Hi everyone,

We have been using the CLI tools (specifically the php symfony cache:xml-representations command) to generate and cache EAD XML and DC XML files for all archival descriptions in a bilingual (English - Greek) installation of the software.

Some questions:

a. Is there a CLI parameter we can use to generate XML files for the secondary language of the archival descriptions (Greek)? If not, any workarounds you can suggest?

b. The filenames of the DC XML files seem to be strings unrelated to the corresponding archival description (slug, reference code or any other differentiating element). Is there any way of knowing which DC XML file corresponds to which archival description when browsing through the Downloads > Exports > DC directory, without having to open each XML file to look at the contents?

c. A single generated and cached EAD XML file is used by all archival descriptions which are part of the same archival hierarchy, i.e. an EAD XML file documenting the entire archival hierarchy, from the highest level (e.g. Fonds) to the lowest level (e.g. Item). Is this behaviour according to the EAD 2002 standard specification? Is there a way to generate EAD XML files that only document individual archival descriptions rather than the entire archival hierarchy?

Thank you in advance for your time and help!

Have a great weekend.

Efthimios Mavrikas



Dan Gillean

Dec 15, 2020, 2:30:23 PM
to ICA-AtoM Users
Hi Efthimios, 

a. Is there a CLI parameter we can use to generate XML files for the secondary language of the archival descriptions (Greek)? If not, any workarounds you can suggest?

Unfortunately not at the moment, but I agree that this would be a great feature to have! 

I haven't tested it, but as a workaround I would suggest changing the default installation culture of the application (in apps/qubit/config/settings.yml), clearing the cache, restarting PHP-FPM and memcached, repopulating the search index, and then running the XML cache task again.
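
For reference, the sequence above could be scripted roughly as follows. This is an untested sketch, not an official procedure: the AtoM root path and the PHP-FPM service name are assumptions that depend on the installation, and the culture change in apps/qubit/config/settings.yml (the default_culture value, if it is named that way in your version) still has to be made by hand first. By default the script only prints the commands.

```python
import shlex
import subprocess

# Assumptions: adjust both values to the actual installation.
ATOM_ROOT = "/usr/share/nginx/atom"  # hypothetical AtoM root directory
PHP_FPM_SERVICE = "php7.2-fpm"       # service name depends on the PHP version


def workaround_commands(php_fpm_service=PHP_FPM_SERVICE):
    """Return the commands to run, in order, after editing settings.yml by hand."""
    return [
        ["php", "symfony", "cc"],                           # clear the Symfony cache
        ["sudo", "systemctl", "restart", php_fpm_service],  # restart PHP-FPM
        ["sudo", "systemctl", "restart", "memcached"],      # restart memcached
        ["php", "symfony", "search:populate"],              # repopulate the search index
        ["php", "symfony", "cache:xml-representations"],    # regenerate the cached XML
    ]


def run_workaround(dry_run=True, atom_root=ATOM_ROOT):
    """Print each command; execute it from the AtoM root only if dry_run is False."""
    for cmd in workaround_commands():
        print("$ " + shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=atom_root, check=True)


if __name__ == "__main__":
    run_workaround(dry_run=True)
```

Backing up the existing English XML files before running this for real would be prudent, since it is not known whether they get overwritten.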

I'm not sure what the outcome will be if there's already an English XML file in place, however - it may be overwritten, or if not, AtoM may not know when to serve each file. I suspect that the import/export functionality requires some deeper analysis to ensure it has full i18n support throughout... however, if you experiment with this, please let us know what you find. 


b. The filenames of the DC XML files seem to be strings unrelated to the corresponding archival description (slug, reference code or any other differentiating element). Is there any way of knowing which DC XML file corresponds to which archival description when browsing through the Downloads > Exports > DC directory, without having to open each XML file to look at the contents?

I think that the original reason for this was sort of security via obfuscation - the developers didn't want users to be able to guess the paths for draft descriptions that might have cached XML and thereby access the metadata. However, with development this could be changed - we could improve AtoM's overall security for paths like this, add an option to ensure that draft descriptions don't get cached XML, etc. and then use the slugs (possibly with a format appended, e.g. slug-ead and slug-dc for example). 

However, in the meantime, I did find an older developer's script that can be used to return the path to a cached XML when given the related information object (i.e. description) slug. See: 

First, this was created in 2017, and I have NOT tested it yet to make sure it still works! 

Second, this is currently set up to return the path to the EAD XML. However, I think if you changed the ead parameters on lines 32 and 33, it should work for DC XML.

You can run this script using the generic tools:run task. The process would look something like this: 
  • Go to the following URL, and save the file as dc_cache_filepaths.php
  • Remember to change the ead parameters in lines 32 and 33 to dc
  • Place this file somewhere accessible to AtoM - the root AtoM directory would be fine. 
  • Use the tools:run command to execute the script: 
    • php symfony tools:run dc_cache_filepaths.php
    • Remember, if you didn't put the script in the root directory, then you'll need to include the file path to the PHP file you want to run. 
  • The script will prompt you for the slug - enter it, and the script should return the file path of any associated cached xml files. 
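
As an alternative to the script described above (and a different technique: it does not query AtoM at all), the exported files themselves can be scanned to build an index of file name, identifier, and title. The sketch below is a minimal example that assumes the DC XML files use the standard Dublin Core elements namespace and sit together in one directory; the directory path is passed in by the caller and is not something the thread specifies.

```python
import sys
import xml.etree.ElementTree as ET
from pathlib import Path

# Standard Dublin Core elements namespace (assumed to be used by the exports).
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def first_text(root, tag):
    """Return the first non-empty text of a dc:<tag> element, or an empty string."""
    for element in root.iter(DC_NS + tag):
        if element.text and element.text.strip():
            return element.text.strip()
    return ""


def index_dc_files(directory):
    """Return (file name, dc:identifier, dc:title) for each parseable XML file."""
    rows = []
    for path in sorted(Path(directory).glob("*.xml")):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        rows.append((path.name,
                     first_text(root, "identifier"),
                     first_text(root, "title")))
    return rows


if __name__ == "__main__":
    # Usage: python dc_index.py /path/to/the/dc/export/directory
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, identifier, title in index_dc_files(directory):
        print(f"{name}\t{identifier}\t{title}")
```

The output is tab-separated, so it can be redirected to a file and opened in a spreadsheet to look up which cached file belongs to which description.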
 
c. A single generated and cached EAD XML file is used by all archival descriptions which are part of the same archival hierarchy, i.e. an EAD XML file documenting the entire archival hierarchy, from the highest level (e.g. Fonds) to the lowest level (e.g. Item). Is this behaviour according to the EAD 2002 standard specification? Is there a way to generate EAD XML files that only document individual archival descriptions rather than the entire archival hierarchy?

This follows the way that the EAD 2002 XML standard expects to structure multi-level descriptive hierarchies, and is the reason we introduced the XML cache option in the first place - with large hierarchies, these XML files can be quite large!
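
To illustrate the nesting, here is a small sketch that walks a made-up EAD 2002 fragment and prints one line per level. The sample XML is invented for illustration and is far simpler than a real export; the sketch also assumes unnumbered <c> components without an XML namespace, whereas EAD 2002 equally allows numbered components (<c01> to <c12>).

```python
import xml.etree.ElementTree as ET

# Invented sample: one file holds the fonds and everything below it.
SAMPLE_EAD = """<ead>
  <archdesc level="fonds">
    <did><unittitle>Example fonds</unittitle></did>
    <dsc>
      <c level="series">
        <did><unittitle>Example series</unittitle></did>
        <c level="item">
          <did><unittitle>Example item</unittitle></did>
        </c>
      </c>
    </dsc>
  </archdesc>
</ead>"""


def outline(node, depth=0):
    """Return indented 'level: title' lines for a node and its nested components."""
    title = node.findtext("did/unittitle", default="").strip()
    lines = ["  " * depth + f"{node.get('level')}: {title}"]
    # Components sit directly under another <c>, or under <dsc> for <archdesc>.
    for child in node.findall("c") + node.findall("dsc/c"):
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    archdesc = ET.fromstring(SAMPLE_EAD).find("archdesc")
    print("\n".join(outline(archdesc)))
```

Every lower-level description lives inside its parent's element, which is why one file per hierarchy is the natural unit.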

There's likely a way in EAD 2002 to structure a file so it only references the current level, and points to parent and child records. However, AtoM does not currently offer an option to do this on export or cache generation, and it would take development to implement. I suspect there would also be roundtripping issues, as AtoM would likely try to create the parent and child descriptions based on the references in the file - meaning that importing a whole hierarchy from a series of individual EAD XML files would likely be difficult. I'm sure it could be done but, as I said, it will require analysis, testing, and development to find the right solution. 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him


Ad Axem

Dec 15, 2020, 7:17:39 PM
to AtoM Users
Thank you for the thorough reply and suggestions Dan.

We will change the default installation culture to Greek for that installation and let you know what happens. Last time we did just that for an entirely different issue, it worked like a charm, so fingers crossed!

The script for the XML files path looks super useful, thank you for sharing, I love these undocumented gems :)

Also, great quote about the EAD 2002 standard for multi-level descriptive hierarchies, I will use it to get that request (for EAD XML files that only document individual archival descriptions rather than the entire archival hierarchy) cancelled.