Search function returning too many results


sally-an...@york.ac.uk

May 17, 2023, 5:17:31 AM
to AtoM Users
Hi,

I've looked for this error on the forum but haven't found anything that sounds similar for 2.7.1.  We're having an issue where using the search function is a bit hit and miss and sometimes AtoM just returns every single archival description in an archive, regardless of whether or not the specific description includes the word.

For example when searching for the word 'disability' it shows the following, which includes the entirety of the Joseph Rowntree Foundation catalogue,  almost all of the Family Fund catalogue, but only specific entries from other catalogues.  In the case of the JRF and the FF the vast majority of the results do not contain the word disability anywhere in the description.

[Attachment: disability example.JPG]
However in both cases the authority record of the archive does contain the word 'disability' and the creator is inherited at every level, so could AtoM be factoring this into the search?  Although that would not explain why it's only including 606 of 800 descriptions for the Family Fund.

Anyway I was wondering if anyone else had encountered this problem with 2.7.1 and found a solution for it.

Best wishes,

Sally


Dan Gillean

May 18, 2023, 10:11:10 AM
to ica-ato...@googlegroups.com
Hi Sally, 

I have part of an answer for you. 

I've examined our current archival description index mapping, and noticed in the following sections that we are currently indexing a number of fields - including the history field - not just from linked creators, but from INHERITED linked creators as well, per: 
So in general, I think your theory is right - the majority of this problem comes from this indexing pattern, as you have rightly deduced. I managed to reproduce this behavior locally, and filed the following issue ticket for our Maintainers to consider: 
A couple important notes: 

First, I tried to see if I could just comment out those lines in the mapping.yml file, restart Elasticsearch, and then reindex (to see if you could potentially do this as a workaround until this is addressed)... and unfortunately I got a reindexing error. So - there's a bit more to be changed than just removing those lines, but it's beyond my non-developer skills to figure out. 

Second, while I was able to reproduce the issue in general, for me it returned ALL inherited records, not 606/800 as in your case. Without closely examining your records, I can't speak to why most but not all were returned. 

I have recommended in the issue ticket that the Maintainers review some of the other indexing choices as well when addressing this issue, as there may be more cases where we are indexing too much and introducing noise into the results. Hopefully this is something that can be addressed in an upcoming release, although I think at this point that the scope is already set for the upcoming 2.7.2 release, so any fix would likely be 2.8 or beyond. 

Regards, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him



sally-an...@york.ac.uk

May 19, 2023, 4:05:34 AM
to AtoM Users
Hi Dan,

Thanks, as ever, for such a speedy and thorough reply! It's a relief to know it isn't just us doing something weird with AtoM.  We'll carry on as we are then and hopefully the issue will be ironed out in a future release.

Best,

Sally

John Hewson

May 24, 2023, 4:22:10 PM
to AtoM Users
Hi Sally and Dan,

Thanks both of you for identifying this issue. I hadn't noticed it, but our archivists had. As a temporary fix, if you edit /plugins/arElasticSearchPlugin/config/mapping.yml by adding a property to the inherited_creators after line 547:
https://github.com/artefactual/atom/blob/qa/2.x/plugins/arElasticSearchPlugin/config/mapping.yml#L547
so that it now reads
        properties:
          i18n: { type: object, dynamic: true, include_in_all: false }
          id: { type: integer }
then reindex and clear the caches, that should keep those inherited fields out of the search results.
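
To illustrate why this helps, here is a rough toy model in Python of a catch-all search field. It is only a sketch and not AtoM or Elasticsearch code: the Field class, the function names, and the sample records are all hypothetical, and it assumes that include_in_all: false simply keeps a field's text out of the combined text that a general search is matched against.

```python
# Toy model of a catch-all search field; NOT AtoM or Elasticsearch code.
# All names and records here are hypothetical, for illustration only.
from dataclasses import dataclass


@dataclass
class Field:
    text: str
    include_in_all: bool = True  # mirrors the mapping flag in spirit only


def build_all_field(fields):
    """Concatenate the text of every field that opts in to the catch-all."""
    return " ".join(f.text.lower() for f in fields.values() if f.include_in_all)


def search(index, term):
    """Return ids of descriptions whose catch-all text contains the term."""
    term = term.lower()
    return [doc_id for doc_id, fields in index.items()
            if term in build_all_field(fields)]


def make_index(exclude_inherited):
    # Only "item-2" mentions the term in its own title; both items inherit
    # the same creator history, which also mentions it.
    creator_history = "Charity funding research into poverty and disability"
    flag = not exclude_inherited
    return {
        "item-1": {
            "title": Field("Annual report 1998"),
            "inherited_creator_history": Field(creator_history, flag),
        },
        "item-2": {
            "title": Field("Disability grants correspondence"),
            "inherited_creator_history": Field(creator_history, flag),
        },
    }


before = search(make_index(exclude_inherited=False), "disability")
after = search(make_index(exclude_inherited=True), "disability")
print(before)  # both items match via the inherited creator history
print(after)   # only the item that mentions the term itself
```

In this model every description under a creator matches as long as the inherited history is folded into the catch-all text, and only the description that actually contains the word matches once it is excluded.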

Regards,

John

Dan Gillean

May 25, 2023, 8:03:28 AM
to ica-ato...@googlegroups.com
Thanks for the tip, that's very helpful, John!

Cheers, 

Dan

sally-an...@york.ac.uk

May 26, 2023, 8:01:55 AM
to AtoM Users
I just wanted to add our thanks too! Jim has tried this on our test server and it has eliminated the issue; it's now being run on the production version. Thank you so much for letting us know, it will make our searches a lot more efficient.

Best,

Sally
