Titles displayed as “Untitled” in lists while titles actually exist.

Thomas Debesse

Nov 6, 2021, 3:56:50 AM
to AtoM Users
I imported a bunch of archival descriptions by converting a database from obsolete software to a set of EAD XML files. The import process did not report any issue, and the related authority records, subjects and places appear to have been imported properly along with the descriptions.

The issue is that all archival descriptions and all authority records are named “Untitled” when listed (see screenshot). But all of them are actually titled properly, as can be seen when displaying the archival description or the authority record itself (see other screenshot).

I imported all the data by running this command as the “atom” user, with the xml/ folder containing the .xml files:

php symfony import:bulk xml/

Then, as the “atom” user, I ran this command to index the new entries:

php symfony cache:clear && php symfony search:populate --update

Also, just to make sure, I restarted various related services as root:

systemctl restart mysql nginx php7.4-fpm atom-worker memcached

But archival description titles and authority record titles are still displayed as “Untitled” in lists while properly displayed when displaying the entry itself.

Before doing the bulk import, I manually imported a few EAD XML files I had generated, to test the import from the web interface, and the archival description and authority record titles were properly listed. But after importing files with the bulk import tool, those titles are not properly displayed.

The AtoM instance is running branch stable/2.6.x (commit e1b3b3d) on Ubuntu 20.04.3 LTS.

Would you know how to fix the list of titles?

Note: subjects and places are listed correctly though.
20211106-083436-000.atom-untitled-in-list.png
20211106-083352-000.atom-untitled-in-list.png

Thomas Debesse

Nov 8, 2021, 1:21:38 AM
to AtoM Users
I previously wrote that subjects and places are listed correctly, but that was wrong. Subject and place titles are correctly listed in the side column, but when actually requesting the lists, they display “Untitled” titles like the other lists (see attached screenshot).

Ubuntu 20.04.3 provides php7.4 by default, but I also tried php7.2 from Ubuntu 18.04.6 (both php7.2-fpm for hosting the website and php7.2-cli for running the search:populate command) and the result is the same.
20211108-053718-000.atom-untitled-in-list.png

Dan Gillean

Nov 9, 2021, 11:33:34 AM
to ICA-AtoM Users
Hi Thomas, 

This appears at first glance to be a language issue. It looks like the default installation culture of your site is English, but the metadata you are importing has a French source culture. AtoM has culture fallback in some places (so that the source string in the original language will be shown if there is not a translation available in the current user interface culture/language), but it looks like that is not kicking in in search/browse pages. If you use the language menu and change the user interface to French, do all the titles display properly?

Do you have French added as an indexed language in Admin > Settings > i18n languages? Have you tried re-indexing?

I tried quickly to recreate the issue by manually creating a French collection with several children in our demo site, then browsing by latest modifications in the English UI. In our public demo site, I was unable to recreate the issue this way - even though no English titles were present, culture fallback still showed the French description titles in search and browse pages. I will try to experiment with a modified EAD XML file to see if I can recreate the issue, but if you have a sample EAD file you would be willing to share with me, please feel free to send it to me off-list. 

Anything else you can tell me about your installation and/or the workflow that led to this outcome may help with identifying the issue as well. I will say that though I do not believe it to be the cause of this particular issue, I suspect you will encounter some issues trying to use PHP 7.4 and Ubuntu 20.04 with AtoM 2.6 or earlier. We are targeting these dependencies for our upcoming 2.7 release, but I recall having to fix a few things to properly support changes in PHP 7.4.

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him



Thomas Debesse

Nov 9, 2021, 6:24:26 PM
to AtoM Users
Hi, thank you for your answer. The titles are displayed as “Untitled” in all languages; the “Untitled” string just gets translated, for example to “Sans titre” in French or “Sem título” in Portuguese.

French has been added (from the start) as an indexed language in Admin > Settings > i18n languages. I tried reindexing multiple times, either by clearing first with “php symfony cache:clear && php symfony search:populate” or by updating in place with “php symfony search:populate --update”.

One additional thing to note is that using the global search returns nothing.

Creating collections by hand works whatever the language (they are listed properly and can be searched), see attached screenshot.

Also, if I'm right, importing EAD XML files one by one from the web interface worked (they were listed properly) before I started the bulk import.

What does not work (the records are not listed properly) is importing EAD XML files with the command-line bulk importer and then reindexing from the command line. Now EAD XML files imported from the web interface also fail to get indexed properly. I tried multiple times with the attached EAD file, with the same result (see screenshot). The import log reports no error.

I have temporarily configured the system to run the web application on php-fpm 7.2 and the command-line import tools with php-cli 7.2, then cleared the cache again and restarted reindexing, but I get the same issue. So the chances that the issue comes from PHP 7.4 look low. The application was installed with PHP 7.4, though.

One interesting thing to note is that sometimes, when browsing the archival description list without being logged in (and therefore with the default English language), I get an Elasticsearch error, “Elastica\Exception\ResponseException”. As soon as I change the language (even if I revert back to English) or log in, the error disappears and the list is displayed, but with “Untitled” titles as initially reported. I don't know if it's related, but I would appreciate advice on tracking down this error.

One more question: is there an easy way to prune all existing archival descriptions, authority records, subjects and places if I want to redo the bulk import process entirely? Before importing I made a backup of the MySQL database that I can restore, but if there is an AtoM tool for that it would be convenient.

Attached is the smallest EAD file I have. I noticed that a small mistake in the converter stripped out the UTF-8 declaration at the beginning of the file, but when I import from the web it makes no difference whether this line is present or not: in both cases the new archival description is listed neither in the archival description list nor in search results.
20211108-080344-001.atom-elastisearch-error.png
20211110-000753-000.atom-list-imported-from-web-after-bulk.png
20211110-001136-000.atom-search-created-by-hand.png
20211108-080324-001.atom-elastisearch-error.png
20211110-001107-000.atom-list-created-by-hand.png
16280.utf8-header.xml
16280.xml

Thomas Debesse

Nov 9, 2021, 8:35:38 PM
to AtoM Users
This may be relevant: I found this in the Elasticsearch log:

org.elasticsearch.transport.RemoteTransportException: [PBDUL7E][127.0.0.1:9300][indices:data/read/search[phase/query]]
Caused by: org.elasticsearch.index.query.QueryShardException: No mapping found for [i18n.fr_FR.title.alphasort] in order to sort on
    at org.elasticsearch.search.sort.FieldSortBuilder.build(FieldSortBuilder.java:262) ~[elasticsearch-5.6.16.jar:5.6.16]
    at org.elasticsearch.search.sort.SortBuilder.buildSort(SortBuilder.java:156) ~[elasticsearch-5.6.16.jar:5.6.16]
    at org.elasticsearch.search.SearchService.parseSource(SearchService.java:634) ~[elasticsearch-5.6.16.jar:5.6.16]
    at org.elasticsearch.search.SearchService.createContext(SearchService.java:485) ~[elasticsearch-5.6.16.jar:5.6.16]
    at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:461) ~[elasticsearch-5.6.16.jar:5.6.16]
    at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:257) ~[elasticsearch-5.6.16.jar:5.6.16]
    at org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:343) ~[elasticsearch-5.6.16.jar:5.6.16]
    at org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:340) ~[elasticsearch-5.6.16.jar:5.6.16]
    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.6.16.jar:5.6.16]
    at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:662) [elasticsearch-5.6.16.jar:5.6.16]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:675) [elasticsearch-5.6.16.jar:5.6.16]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.6.16.jar:5.6.16]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_292]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_292]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
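
The exception above says that Elasticsearch was asked to sort on a field, i18n.fr_FR.title.alphasort, that is absent from the index mapping. Below is a minimal sketch of how such a lookup could be scripted, assuming a type-level mapping dictionary shaped like the JSON that Elasticsearch returns from GET /<index>/_mapping, where object sub-fields live under "properties" and multi-fields under "fields". The sample mapping is hypothetical and only illustrates the lookup; it is not copied from an AtoM index, and the function name is made up for this example.

```python
def mapping_has_field(type_mapping: dict, dotted_path: str) -> bool:
    """Return True if dotted_path resolves inside an Elasticsearch mapping.

    Each path segment is looked up among the current node's object
    sub-fields ("properties") and multi-fields ("fields").
    """
    node = type_mapping
    for part in dotted_path.split("."):
        children = {**node.get("properties", {}), **node.get("fields", {})}
        if part not in children:
            return False
        node = children[part]
    return True


# Hypothetical mapping fragment: titles indexed under a short "fr" key only.
sample_mapping = {
    "properties": {
        "i18n": {
            "properties": {
                "fr": {
                    "properties": {
                        "title": {
                            "type": "text",
                            "fields": {"alphasort": {"type": "keyword"}},
                        }
                    }
                }
            }
        }
    }
}

found_fr = mapping_has_field(sample_mapping, "i18n.fr.title.alphasort")
found_fr_fr = mapping_has_field(sample_mapping, "i18n.fr_FR.title.alphasort")
print(found_fr, found_fr_fr)
```

If a real mapping behaved like this sample, a sort on the fr_FR path would fail in the same way as in the log, which would point at the culture code attached to the records rather than at the index itself.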

Thomas Debesse

Nov 10, 2021, 8:29:50 AM
to AtoM Users
I may have found the cause of the problem, though the mistake I made may have uncovered bugs in AtoM anyway.

I noticed that archival descriptions created by hand are still properly listed and searchable after I clear the cache and repopulate the index. That may indicate that the problem is in the way the data is stored, and so possibly in the way I generate the EAD XML files. I wrote the EAD generator by looking at what AtoM produces, but I now see that what AtoM produces can vary depending on the situation.

For example if the current display language is French, an EAD file produced by AtoM will contain this XML data:

<langusage><language langcode="fre">fr_FR</language></langusage>

But if the current display language is English, an EAD file produced by AtoM for the exact same archival description will contain this XML data instead (even if the archival description is written in French):

<langusage><language langcode="eng">English</language></langusage>

First, the declared language is not the language of the archival description. Second, when the language is said to be French, an ISO-style code is used (fr_FR), but when it is said to be English (which may be wrong), a vernacular name is used (English). It may also be that in some cases AtoM writes “français” instead of “fr_FR”, since that is what I ended up producing while trying to reproduce AtoM's output.

In the EAD xml files I generated (and imported), the line was written this way:

<langusage><language>français</language></langusage>

The language name is a vernacular name and not an ISO code, and it lacks the “langcode” attribute. The missing “langcode” attribute may be a mistake of mine, but the use of a vernacular name may simply be a copy-paste.

So I changed the langusage data in the previously attached EAD file this way:

<langusage><language langcode="fre">fr_FR</language></langusage>

I then imported this modified EAD file using the bulk importer, cleared the cache and repopulated the index, and now the archival description is properly listed (see screenshot) and searchable. You can find the EAD file that seems to work attached.

So I'll try to regenerate all the EAD files, revert the database to its state before the bulk import, and reimport all of them. I'll keep you updated.

If there are AtoM actions to prune records without going the SQL way, I would appreciate hearing about them.
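
As an alternative to regenerating the files, the same change could in principle be patched into existing ones. Here is a minimal sketch, assuming the EAD files use no XML namespace and that the only case to handle is the one seen in this thread (a langusage/language element with the text “français” and no langcode attribute). The KNOWN_LANGUAGES table and the function name are hypothetical, made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table: vernacular name -> (langcode, element text).
# Only the French pair comes from this thread.
KNOWN_LANGUAGES = {"français": ("fre", "fr_FR")}


def fix_langusage(xml_text: str) -> str:
    """Add a langcode to <langusage><language> elements that lack one."""
    root = ET.fromstring(xml_text)
    # Only touch <language> inside <langusage>; <langmaterial> is left alone.
    for langusage in root.iter("langusage"):
        for language in langusage.iter("language"):
            name = (language.text or "").strip()
            if "langcode" not in language.attrib and name in KNOWN_LANGUAGES:
                langcode, culture = KNOWN_LANGUAGES[name]
                language.set("langcode", langcode)
                language.text = culture
    return ET.tostring(root, encoding="unicode")


sample = (
    "<ead><eadheader><profiledesc>"
    "<langusage><language>français</language></langusage>"
    "</profiledesc></eadheader></ead>"
)
fixed = fix_langusage(sample)
print(fixed)
```

Note that ElementTree does not preserve the XML declaration or a DOCTYPE line when serializing, so on real EAD files the output should be compared with the input before importing it.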

All in all, it looks like the EAD files produced by AtoM may reveal some AtoM bug.
16280.language.xml
20211110-142210-000.atom-correct-list.png

Dan Gillean

Nov 16, 2021, 11:43:43 AM
to ICA-AtoM Users
Hi Thomas, 

Sorry for the delay in replying - I wanted a chance to conduct a bit of testing myself. I tested on our latest development branch (qa/2.x, which will be used as the basis for the 2.7 release) and I'm not seeing what you're seeing exactly, despite the fact that I don't believe our EAD import mappings have changed in the 2.7 release. 

Essentially, I took your sample file and created a number of variations - for example: 
  • using <language langcode="fre">English</language>
  • using <language langcode="fre">français</language>
  • using <language langcode="eng">français</language>
  • removing the <langusage> and <language> elements entirely
  • etc
I tried importing a few manually via the user interface, and could not create a situation where there was no title in the search results. I thought perhaps a command-line bulk import might behave differently, so I put them in a directory and imported them with: 
  • php symfony import:bulk --index /vagrant/lang-test/
Despite this, they all imported with titles. My test instance uses English as the default installation culture, and both English and French were present in the Language menu. Titles displayed regardless of whether English or French was chosen as the user interface display language. 

As a separate issue however, of my 8 files, 5 imported as French, and 3 as English records, and it's not clear to me exactly what criteria were used. Here are the French ones (I tried to give them descriptive titles to be able to determine which was which post-import): 

lang-test-results.png

When I have a chance, I will have to ask our developers to take a look at the import code and see if they can tell me the criteria AtoM is using. I was surprised to see the following ones imported as French, for example: 
  • no "fre" language code, but fr_FR as the <language> element value AND no "fre" language code, but français as the <language> element value both imported as French records
  • The record where I fully removed the <langusage> and <language> elements also imported as French, despite the fact that the default AtoM installation culture in my test site is English
There may be other factors at play here - additional elements that are checked; some cascading fallback behavior; different behavior when a series of records are bulk imported together, etc. I'm not sure yet but will let you know if I learn anything. 

In the meantime, all titles displayed for me regardless of the import language or the user interface language selected. 

I have attached the test files I used in case you wish to experiment further with them. 

Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him

lang-test.zip

Thomas Debesse

Jan 17, 2022, 5:54:59 PM
to AtoM Users
I forgot to reply, but on my end formatting the langusage markup this way fixed my issue:

<langusage><language langcode="fre">fr_FR</language></langusage>

I pruned everything and reimported everything, and the database has been in production since November without the reported issue reappearing.

So the issue was caused by the missing langcode="fre" attribute.
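
For anyone generating EAD files with their own converter, a pre-import check for this pattern could look like the sketch below. It assumes EAD files without an XML namespace, and the function name is hypothetical. It only flags the pattern that caused trouble on my instance; it does not establish that AtoM requires the attribute in every setup, since the issue could not be reproduced on the qa/2.x branch.

```python
import xml.etree.ElementTree as ET


def languages_missing_langcode(xml_text: str) -> list:
    """List the text of each <langusage><language> lacking a langcode."""
    root = ET.fromstring(xml_text)
    return [
        (language.text or "").strip()
        for langusage in root.iter("langusage")
        for language in langusage.iter("language")
        if not language.get("langcode")
    ]


before = "<ead><langusage><language>français</language></langusage></ead>"
after = '<ead><langusage><language langcode="fre">fr_FR</language></langusage></ead>'
print(languages_missing_langcode(before))
print(languages_missing_langcode(after))
```

An empty list means that every language element under langusage carries a langcode attribute.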