New User Questions about Languages, Authority Records, Taxonomies, and Digital Records


Raphael Michel Mikhail

Sep 19, 2023, 8:25:06 PM
to AtoM Users
Greetings!

After months of learning about archival science and trying different programs, I am finally diving in with AtoM and hoping for the best. I am using it to make an archive of a missionary's life-long ministry as part of my PhD in theology.

I am on cloud-hosted AtoM version 2.7.1.

I have the following questions:

1. I would like to have archival descriptions with titles in Arabic and English in a way that is visible without needing to toggle the change language drop down. Perhaps one hack to accomplish this would be to use the Identifier field for the English title and the Title field for the Arabic title. Are there drawbacks to this approach and is there a better way? For example, is there a way to configure AtoM to have an English Title field and an Arabic Title field?

2. Many lectures have a speaker and a translator. I would like to record the names of both in the Archival Description. "Name access points" gives the impression that the lecture is about rather than by a certain person because AtoM appends "(Subject)" to the person's name, so that doesn't seem like the right place. I think "Name of creator" could be used for the speaker. Is there some way to add a "Name of Translator" field? Of course I could just write the names in plain text in one of the fields, but I would like to have the speaker and translator connected to Authority records to have more effective search and discovery.

2b. Related question: I changed the User interface label "creator" to be "speaker" and now when browsing the "Narrow your results by" bar has "Speaker"; however, when you click on an archival description, it still says "Name of creator" rather than "Name of speaker" – is there some way to fix that?

3. What does AtoM do when generating a reference copy for an mp3 file? Most of the time the filesize of the reference copy is about 75% smaller than the original but sounds the same. (In one odd case the filesize of the reference copy was actually larger).

4. Files uploaded with Arabic file names, such as كيف نصوم - الفجالة 12-2-1987.MP3, show up as ____-____12-2-1987.MP3. If I go to More > Rename and re-input the Arabic file name, when I save, it becomes: 2-12-1987.mp3. I read the section here about AtoM sanitizing file names, but is there some way to have it not sanitize Arabic?

5. The docs say only authenticated users can download digital objects. Is there any way to change that? I would like non-authenticated users to be able to download.

6. The docs say that AtoM ships with a fixed number of taxonomies. I would like to have some way of marking recording quality with the Terms: Good, Okay, and Poor. Perhaps I could hijack some existing taxonomy and repurpose the terms, but then how do I make it show up to the user with the label "Recording Quality"?

Thanks for your help!

Dan Gillean

Sep 20, 2023, 9:41:58 AM
to ica-ato...@googlegroups.com
Hi Raphael, and welcome to the AtoM community!

You've asked a lot of really astute questions, and I greatly appreciate that you reviewed the documentation and tried to answer them yourself before posting here - thanks! I will do my best to offer an initial response, and hopefully other members of our community will share their own strategies and workarounds for similar issues.



1. I would like to have archival descriptions with titles in Arabic and English in a way that is visible without needing to toggle the change language drop down. Perhaps one hack to accomplish this would be to use the Identifier field for the English title and the Title field for the Arabic title. Are there drawbacks to this approach and is there a better way? For example, is there a way to configure AtoM to have an English Title field and an Arabic Title field?

I don't recommend using the Identifier field - I think some result stubs would look very strange, and search would be a challenge that way. 

I am not sure what jurisdiction you are based in, but if you are willing, I would encourage you to investigate the Canadian Rules for Archival Description (RAD) template in AtoM, as I think it will help address a number of your questions (and you will see I will bring it up again below). 

It's perhaps a bit unconventional to use a different national standard, but RAD has a bit of unique history that gives it a number of fields and options that don't appear in other archival standards that map back to ISAD(G). RAD actually predates the ICA's standards. Lacking any international guidance for implementation, its creators basically took a copy of the AACR2 library cataloguing standard, threw in some more "archival" fields into a Notes section, and called it a day. While this makes RAD seem weirdly outdated today (especially compared to almost all other national archival standards), it does have some advantages for edge cases - particularly in AtoM, where transforming the standards into templates makes everything rather literal and inflexible. 

[Sidenote: if you'd like a great read about the history of RAD, see Richard Dancy's "RAD Past, Present, Future."]

So, for example: 

Because RAD was based on a standard for describing books and other media, the "Title and Statement of Responsibility Area" includes a couple of additional fields that are more suitable for repurposing, such as: 
  • Parallel titles (RAD 1.1D): this is actually intended for the title in another language, so it matches your use case very well. Many Canadian archivists also assume that it means "other title" and use it as an alternative title field anyway. 
  • Other title information (RAD 1.1E): Intended for contextual info, subtitles, and the like, this too could be repurposed as needed, or
  • Title notes (RAD 1.8B1-1.8B6): RAD includes a number of repeatable note fields relating to titles that you could use, including "Continuation of title," "Parallel title notes and other title information," "Variations on title," etc. 
If this option doesn't appeal then you might just add both languages to the English title field, separated by a dash, and perhaps with more explanatory information added in the body of the record (such as the ISAD General notes field). At least both will be visible and searchable this way! 




2. Many lectures have a speaker and a translator. I would like to record the names of both in the Archival Description. "Name access points" gives the impression that the lecture is about rather than by a certain person because AtoM appends "(Subject)" to the person's name, so that doesn't seem like the right place. I think "Name of creator" could be used for the speaker. Is there some way to add a "Name of Translator" field? Of course I could just write the names in plain text in one of the fields, but I would like to have the speaker and translator connected to Authority records to have more effective search and discovery.

Once again, the RAD standard will be your friend here. AtoM's ISAD template (and, unfortunately, the other templates in their own way) is limited to showing only the event types that the standard itself explicitly lists as supported. RAD, on the other hand, explicitly lists a number of additional options and leaves open the possibility of other actors being involved. Therefore: 
  1. By default, RAD includes a whole bunch of additional event types in the drop-down. Where ISAD has only "Creation" and "Accumulation", RAD includes these plus Custody, Publication, Contribution, Collection, Reproduction, Distribution, Broadcasting and Manufacturing.
  2. Additionally, these terms are in a user-editable taxonomy, so you can add more as needed by creating new terms in the Event Types taxonomy. 
Quick tip about point 2: if you do add a new term, you will see that the edit template has a field called Display note(s). This field is badly named and, given how AtoM uses it, shouldn't be a repeating field (old design choices that have never been corrected). Its actual purpose is to tell AtoM what label to show for an actor related through that event type. For example, if your Event type term is "Creation", then add "Creator" in the Display note field and AtoM will use that as the label next to the related actor. 

Now, this is not a perfect solution, unfortunately. If you look in our older issue tracker (we have recently moved our bug tracking to GitHub), you can find a long string of issues related to how non-creator actors are displayed in AtoM. We've made some good strides but there are still missing features and open requests, like this one: 
Nevertheless, this is likely the best way for you to list multiple actors involved in the production of the record in question, while being able to better define the labels used.



2b. Related question: I changed the User interface label "creator" to be "speaker" and now when browsing the "Narrow your results by" bar has "Speaker"; however, when you click on an archival description, it still says "Name of creator" rather than "Name of speaker" – is there some way to fix that?

I suspect this is an oversight in the user interface label feature, and likely cannot be changed without making underlying code changes. However, if you use the solution recommended above for 2a, it resolves the issue by simply not using that label. 

For RAD, the label is just "Date(s)" and then it qualifies the event with the event type you've used: 

Screenshot from 2023-09-20 08-58-56.png
However, the same issue exists with the name of the section itself, which unfortunately cannot be changed via settings. 

In the ISAD template, this is where the issues with my 2a suggestion become a bit more apparent. Nothing is shown where the creator normally appears - instead all other event type actors are just listed in the Name access points, but with a qualifier: 

Screenshot from 2023-09-20 08-59-33.png



3. What does AtoM do when generating a reference copy for an mp3 file? Most of the time the filesize of the reference copy is about 75% smaller than the original but sounds the same. (In one odd case the filesize of the reference copy was actually larger).

You've hit on another long standing known issue in AtoM that we need to improve at some point, but haven't yet. 

AtoM uses ImageMagick, ffmpeg, and ghostscript as utilities to handle incoming digital objects and generate derivatives from them. The way it is set up now is... not smart, unfortunately. Meaning, even if you upload a format that is already suitable for use as an access derivative (like an MP3), AtoM will still generate its own derivative using a simple convert command with one of the underlying utilities. And... yes, in some cases, these derivatives are WORSE in quality and/or LARGER in file size than what was originally provided. 

This is most apparent when using AtoM with Archivematica, our open source digital preservation system. Upload a DIP of JPEGs from Archivematica, and AtoM will create its own JPEG reference copy that is quite often larger than the original provided in the DIP. 

At this time, I have no real suggestions for you. You can always click on the "More" button, select "Edit Digital Object," delete the reference display copy, and upload your own instead (even if it's the same mp3), but at scale this is not practical. 

Long-term, AtoM needs something akin to Archivematica's Format Policy Registry (FPR) - a place to configure the rules for the normalization of various file formats, so the system can behave more intelligently, and in some cases know not to generate an additional derivative, but simply reuse the original. We have collaborated with others on the initial proposal for a Preservation Action Registry - a vendor-neutral protocol for handling format policies and preservation rules across systems (see: parcore.org), and hope that someday, the system that we build to replace AtoM can use PAR by default. In the meantime, short of development work, there's not much to be done other than manually editing the derivatives. 


4. Files uploaded with Arabic file names, such as كيف نصوم - الفجالة 12-2-1987.MP3, show up as ____-____12-2-1987.MP3. If I go to More > Rename and re-input the Arabic file name, when I save, it becomes: 2-12-1987.mp3. I read the section here about AtoM sanitizing file names, but is there some way to have it not sanitize Arabic?

Hmm, that's unfortunate, and not intended. I will test to reproduce, but I suspect this is a bug we will need to file, to address in a future release. If you have ensured that the Arabic characters you are using are UTF-8 encoded, then the problem is with AtoM, unfortunately. 
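
To illustrate the kind of behaviour described in question 4, here is a minimal Python sketch contrasting an ASCII-only file name sanitizer with a Unicode-aware one, plus a quick UTF-8 validity check. This is not AtoM's actual code (AtoM is a PHP application and its real sanitizing rules may well differ); the function names `sanitize_ascii`, `sanitize_unicode` and `is_valid_utf8` are hypothetical and only show why letters from non-Latin scripts can vanish when a sanitizer whitelists ASCII characters.

```python
import re
import unicodedata


def sanitize_ascii(name: str) -> str:
    """Replace every character outside A-Z, a-z, 0-9, '.', '_' and '-' with '_'."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def sanitize_unicode(name: str) -> str:
    """Keep letters and digits from any script; replace everything else with '_'."""
    # Normalize first so visually identical names compare equal.
    name = unicodedata.normalize("NFC", name)
    # For str patterns in Python 3, \w matches Unicode letters, digits and '_'.
    return re.sub(r"[^\w.-]", "_", name)


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

With the file name from the question, the ASCII-only variant drops every Arabic letter, while the Unicode-aware variant keeps them and only replaces the spaces.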



5. The docs say only authenticated users can download digital objects. Is there any way to change that? I would like non-authenticated users to be able to download.

Yes! As an admin, navigate to Admin > Groups, select the anonymous group, and enter edit mode on the Archival descriptions permissions tab. Make sure you set "View Master" to Grant, and save. Now unauthenticated (i.e. public) users should be able to click through to view the original uploads, and like any other user, can right-click and save that Master digital object. 

Note that AtoM doesn't really provide a "Download" button on the view page for a description with a digital object, unless there is no reference display copy available. 

Note as well that in 2.7 and later, AtoM also has an option you can enable that will allow public users to download multiple digital objects via the Clipboard. See: 
However, to keep this from being a scalability nightmare, there are limitations. A user cannot automatically export the descendants of a record on the clipboard AND include all the digital objects, because in some cases that could mean thousands of records AND thousands of digital objects at once, depending on the size of the hierarchy. Instead, these options are mutually exclusive: an export can focus on EITHER metadata OR digital objects. If metadata, then you can simply add a top-level record (like a fonds or collection) to the clipboard, select the "Include descendants" option in the export configuration, and get an export of all records in the archival hierarchy. 

If digital objects, then you must manually add every description with a digital object you want to export to the clipboard. So for example, a collection with 10 item-level photographs: you would need to manually add all 10 item-level descriptions to the clipboard to export them with their digital objects. 



6. The docs say that AtoM ships with a fixed number of taxonomies. I would like to have some way of marking recording quality with the Terms: Good, Okay, and Poor. Perhaps I could hijack some existing taxonomy and repurpose the terms, but then how do I make it show up to the user with the label "Recording Quality"?

Adding new taxonomies requires development, unfortunately - as does changing the labels used directly in the standards-based templates. So, at this time, I can't personally think of an easy way to do this. 

Do you need your users to be able to filter / facet search results by this qualifier? If not, I'd suggest simply adding it to an appropriate free text field - perhaps as part of your Extent and medium / Physical description statement, or in one of the Notes fields (again, RAD will have more options to consider that might feel appropriate). 

If you do want to be able to facet against these terms like access points, you might consider either using the Genre taxonomy for this purpose, or putting terms directly in the Subjects taxonomy. I would suggest doing something a bit unorthodox, like including a label in the term itself - something like Recording quality: good, Recording quality: okay, etc. That way your taxonomy can serve multiple purposes, and the purpose of the term is clear to end users. Not ideal, but perhaps it will spark your own creative workaround ideas if nothing else. Unfortunately, there's no easy way to rename a Taxonomy in AtoM right now. 

In any case, I hope these suggestions help! Best of luck!

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him



Raphael Michel Mikhail

Sep 20, 2023, 8:11:02 PM
to AtoM Users
Hi Dan,

I am really so grateful for your response and have been working through your ideas and answers. Here are some follow-up comments/questions if I may:

1. I switched to RAD and indeed it is very convenient for having multiple titles. It works especially well because aside from needing English and Arabic, on rare occasions the same lecture has two or more different titles from different sources, so now I have a convenient way to record them.

2a. Likewise, switching to RAD enabled me to configure the Terms of the Event Types taxonomy and add a Translator option. However, I noticed one UI drawback with this. When using ISAD(G), the name of the Creator was hyperlinked so that users could easily click and browse all archival descriptions related to that Authority Record. But on RAD, the names are not hyperlinked unfortunately. Of course, users could copy/paste the name into the search or navigate to Browse > Authority records and find the name, but that is certainly a regression in user experience.
ISAD(G):
 Screenshot 2023-09-20 at 2.26.28 PM.png
RAD: 
Screenshot 2023-09-20 at 2.30.50 PM.png

2b. It took me quite a while to figure out how you accomplished what is shown in the screenshot below that you sent, where "(Broadcaster)" is shown after the name in the name access point. 
Screenshot from 2023-09-20 08-59-33.png
I think you had RAD as the default template in the settings, created a RAD archival description with the name added via the Dates of creation area, then changed the default template in the settings to ISAD(G), then viewed that archival description. I was interested to know how you did that because it fits fairly well with what I was after in my original question, namely having Authority records differentiated in the archival descriptions by various terms like Broadcaster and Translator (not just "Subject" which is what appears when adding names via the Name access points section). Further, here the Authority records are conveniently hyperlinked. However, aside from this being a very roundabout way to accomplish this, it has a major drawback: when browsing, the "Narrow your results by" bar no longer lists Authority records as a search facet. It seems like AtoM somehow got confused as I was switching back and forth between ISAD(G) and RAD. Is there some way to restore that search facet? All I get now is:
Screenshot 2023-09-20 at 4.11.10 PM.png

As a side note, I could see that AtoM really expects you to have either ISAD(G) or RAD, not both, because when editing an existing archival description, AtoM does not necessarily provide the editing template that the archival description was created with. It brings up whatever template is currently selected in the Default template settings. 

Also, I'm a former software engineer, so I was looking at fixing the "Name of creator" vs "Name of speaker" issue mentioned previously. I found the contribute code page and GitHub repo. However, I didn't find any kind of guide to the codebase that provides some direction for which parts of the repo are for what things, or some kind of architecture diagram from a developer's perspective. If those are available somewhere, can you please direct me to them? The alternative is just to poke around all over the place, which is time-consuming, and if I don't catch some dependencies then bugs could be accidentally introduced. I can already imagine that happening with how the translate page system might work. For example, if I change a hard-coded string "Name of creator" to something like "Name of %creator-label%" and the translation system just looks for "Name of creator", then it won't translate that line anymore.
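
The translation concern in the last sentence can be sketched in a few lines. The following Python is only an illustration of the general pattern in which the source string is the lookup key and placeholders are substituted after the lookup. It is not AtoM's or Symfony's code, and the `CATALOG` dictionary, the `translate` function, the `%1%` placeholder and the French strings are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical message catalog: source string -> translated string.
CATALOG: Dict[str, str] = {
    "Name of creator": "Nom du producteur",
    "Name of %1%": "Nom du %1%",
}


def translate(source: str, args: Optional[Dict[str, str]] = None) -> str:
    """Look up the source string, then substitute any placeholders.

    Falls back to the untranslated source when the key is not in the catalog.
    """
    text = CATALOG.get(source, source)
    for placeholder, value in (args or {}).items():
        text = text.replace(placeholder, value)
    return text
```

In this sketch, `translate("Name of %creator-label%", {"%creator-label%": "speaker"})` misses the catalog and falls back to the English text, which is the failure mode described above. The string would only be translated again once the catalog contains an entry whose key matches the new source string exactly.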

At some point I could also look into hyperlinking Authority records mentioned in the RAD "Dates of creation" area if it's relatively easy to do.

3. [Typo in my original question: "25% smaller" not 75%]
I actually don't mind so much the workaround you mentioned of re-uploading the original as the reference copy. That's something I could probably write a script to do later on.

4. I'm sure I'm not the only one with Arabic file names. If anyone out there has faced this issue, please let me know!
Otherwise, Dan, this is another thing I could consider fixing if it's relatively simple and easy, because I can foresee this being a fairly significant issue for us. If it's beyond my ability and requires an Artefactual engineer to fix, then we could consider paying for that if it's reasonable.

5. This is great, thanks!

6. Yes, I was hoping to have faceted search by recording quality so users can easily filter out poor quality recordings. I understand the possibility of overloading the Subjects taxonomy with this, but I was already planning on abusing Subjects to act more like tags, meaning an archival description could have a dozen Subjects. I'll think through this more and see how best to arrange things. I came across your post here which mentioned using a custom template to change things. I'll do my best to avoid that, but I guess it's a last resort.

Thanks again for your help! It's really helping me tremendously to make progress on setting up the software.

Raphael

Dan Gillean

Sep 21, 2023, 9:11:19 AM
to ica-ato...@googlegroups.com
Hi Raphael, 

I'm glad my response was useful! A few follow-ups: 


2a. Likewise, switching to RAD enabled me to configure the Terms of the Event Types taxonomy and add a Translator option. However, I noticed one UI drawback with this. When using ISAD(G), the name of the Creator was hyperlinked so that users could easily click and browse all archival descriptions related to that Authority Record. But on RAD, the names are not hyperlinked unfortunately. Of course, users could copy/paste the name into the search or navigate to Browse > Authority records and find the name, but that is certainly a regression in user experience.

Yes, it's unfortunate they are not hyperlinked. This goes back to an early design decision, and to the fact that for many years we were a small company that was never intended to be the permanent organizational home of AtoM (the ICA was supposed to take over governance but never did - long story). Our development model was basically limited to doing whatever institutions would pay for, meaning features came in fits and starts, and rarely got the polish and revisions for consistency they deserved. We are slowly undergoing an internal transformation to become more product driven, and to better take on ongoing maintenance work in our applications. In the meantime you can read a bit about the project history and our old development model on our wiki, as we haven't yet replaced it with anything newer: 
All this to say: it would be nice if all event actor names were hyperlinked, and not just creators.

I think you had RAD as the default template in the settings, created a RAD archival description with the name added via the Dates of creation area, then changed the default template in the settings to ISAD(G), then viewed that archival description.

That's correct. A little tip: if you ever want to preview a description in a different template without having to enter edit mode, you can add a semi-colon and the short form of the standard after the URL. This allows you to see how fields in one standard might crosswalk to another. So, if your default template is set to RAD for example, you can add: 
  • ;isad - preview in ISAD(G)
  • ;mods - preview in MODS v3.3
  • ;dc - preview in Dublin Core simple elements
  • ;dacs - preview in the U.S. DACS standard
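
As a small illustration of the tip above, this Python sketch appends a template suffix to a description URL. The example URL is hypothetical, the helper names `preview_url` and `all_previews` are made up for this sketch, and the suffix table contains only the four short forms listed above. It also assumes the description URL carries no query string.

```python
# Short forms listed above, mapped to the standard they preview.
TEMPLATE_SUFFIXES = {
    "isad": "ISAD(G)",
    "mods": "MODS v3.3",
    "dc": "Dublin Core simple elements",
    "dacs": "U.S. DACS",
}


def preview_url(description_url: str, suffix: str) -> str:
    """Append ';<suffix>' to the URL of a description's view page."""
    return f"{description_url.rstrip('/')};{suffix}"


def all_previews(description_url: str) -> dict:
    """Return one preview URL per listed standard, keyed by its label."""
    return {
        label: preview_url(description_url, suffix)
        for suffix, label in TEMPLATE_SUFFIXES.items()
    }


if __name__ == "__main__":
    # Hypothetical description URL, for illustration only.
    for label, url in all_previews("https://example.org/index.php/my-lecture").items():
        print(f"{label}: {url}")
```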
This and other tricks can be found in the following slide deck, by the way: 

 
I was interested to know how you did that because it fits fairly well with what I was after in my original question, namely having Authority records differentiated in the archival descriptions by various terms like Broadcaster and Translator (not just "Subject" which is what appears when adding names via the Name access points section). Further, here the Authority records are conveniently hyperlinked. However, aside from this being a very roundabout way to accomplish this, it has a major drawback: when browsing, the "Narrow your results by" bar no longer lists Authority records as a search facet. It seems like AtoM somehow got confused as I was switching back and forth between ISAD(G) and RAD. Is there some way to restore that search facet? 

The facets will automatically hide themselves when there are 0 or only 1 results in any particular facet (since that's not useful). Also, the Name access point facet is currently limited to unqualified names - i.e. those treated as name (subject) access points. So qualified event actors of a different type (e.g. Broadcaster, Translator, whatever else you add) will not appear there, unfortunately. 

I'm not sure if it WOULD make sense to add them to the Name access point facet - or possibly create a different facet element for other actors that aren't creators (since Creators has its own facet). Lumping them all together could get confusing - and right now it would really only benefit RAD users and (to a very limited degree) some DACS entries. The other standards don't support qualification at all, beyond ISAD(G)'s Accumulation which some may inherit. So: there's some design analysis to be done there before making changes. In the meantime however, I don't have a great workaround for you, unfortunately. 


As a side note, I could see that AtoM really expects you to have either ISAD(G) or RAD, not both, because when editing an existing archival description, AtoM does not necessarily provide the editing template that the archival description was created with. It brings up whatever template is currently selected in the Default template settings. 

It is extremely uncommon for archivists to use a mix of standards - but that said, AtoM *should* respect the template you are using. 

In addition to the Default template option, you can change the template of a description in the Administration area of the description's edit page, and you are given the option to apply this to descendant records as well. See: 
If changed this way, AtoM *should* re-open the selected standard template for that description when you enter edit mode again, though it will still use the default template for new descriptions created elsewhere. If that's not happening you may have found a regression - I will try to find some time to do some testing, and file a bug ticket if I can reproduce it (I am actually in the process of transitioning to a different role in Artefactual and am not working directly on the AtoM project most days right now). You are also welcome to file a bug ticket yourself if you have (or create) a GitHub account: 

Also, I'm a former software engineer, so I was looking at fixing the "Name of creator" vs "Name of speaker" issue mentioned previously. I found the contribute code page and github repo. However, I didn't find any kind of guide to the codebase that provides some direction for which parts of the repo are for what things, or some kind of architecture diagram from a developer's perspective. If those are available somewhere, can you please direct me to them? The alternative is just to poke around all over the place which is time consuming and if I don't catch some dependencies then bugs could be accidentally introduced.

I recently rounded up a bunch of links and resources for another forum thread. See: 
See also: 

I can already imagine that happening with how the translate page system might work. For example, if I change a hard-coded string "Name of creator" to something like "Name of %creator-label%" and the translation system just looks for "Name of creator" then it won't translate that line anymore.

I'm not actually sure we have much documentation that focuses on how AtoM's internationalization works, but a lot of that comes out of the (VERY OLD) Symfony 1.x framework that AtoM is still stuck on for now, so I would check the Symfony resources for an overview - for example: 
Also, the best way to get to know AtoM's code base and how to develop modules is to look at previous commits and pull requests with related features. For example, here's a sizeable and recent PR that will be adding a new setting in AtoM to better support search mapping for characters with diacritics: 
Here's a fix to the i18n behavior on notes fields when the UI language is changed: 
(PS, the "ref #13631" points to our old issue tracking system, Redmine. We recently moved to GitHub, but for older refs like this, you can still go find the issues in Redmine if you are curious. Here's that issue: 

At some point I could also look into hyperlinking Authority records mentioned in the RAD "Dates of creation" area if it's relatively easy to do.

We certainly welcome pull requests! We have a new team of AtoM Maintainers who will be overseeing decisions about what gets merged into the public project - they haven't yet defined preferences and criteria, so for now I will point you to older guidance. We also intend soon to have a Contributor Success role at Artefactual to help community developers contribute more easily to the project, so expect things to improve. In the meantime: 
Again, please keep in mind: these resources are from my time as AtoM Program Manager, a role I am slowly in the process of transitioning out of, so I can help Artefactual develop new projects. Our new Maintainer team may have slightly different expectations, so please be patient if we offer feedback or make requests not covered in the above! 


 Otherwise, Dan this is another thing I could consider fixing if it's relatively simple and easy because I can foresee this being a fairly significant issue for us. If it's beyond my ability, and requires an Artefactual engineer to fix, then we could consider paying for that if it's reasonable.

For any inquiries about paid work, please contact in...@artefactual.com :)



Thanks again for your help! It's really helping me tremendously to make progress on setting up the software.

Always happy to help onboard new AtoM users to the community! 

Cheers,  
 
Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him

