Search problem regarding Authority Records (when not Creator)


virtu...@yahoo.com.br

Mar 18, 2015, 10:25:29 PM
to ica-ato...@googlegroups.com
Hello everyone,

I'm Carlos, writing from Brazil. I have a question and a proposal regarding the search engine that I couldn't find in the community list. We don't know how, or how much, everyone is using the "history" field of authority records. Here, we plan to invest heavily in it, and in doing so we found a situation that may be a problem, or at least seems to be a problem for us.

We consider it logical that any content of the "history" field in an authority record should be searchable when that authority stands as the creator of an archival record. After all, in this case the history of the creator appears automatically in the archival record as one of its most important fields.

However, we noticed that even when the authority is included as an access point in the "names" field as a subject rather than a creator (supposing that is the right option when indexing an archival record by a given name), the history of that name remains searchable. When the "history" fields of many authority records are filled in and many of these authorities are used as subjects, the simple search results turn into a big mess.

For example: we are working on the archives of former President Lula da Silva of Brazil. He was an important union leader, and after including this information in his biography we found that every archival record (for example, a photograph) that includes Lula as a subject is being retrieved by the term "unions", even those not created by him, as if the record had been indexed by "unions" in subjects. It has not been, because the photo in question has nothing to do with unions.

Advanced search could fix this problem. But there is no option there that covers the text content of digital objects, and that is another aspect we are investing in heavily.

This seems excessive and confusing to us. Therefore, we bring this question to the list: is this working as planned? Our proposal is that the "history" field of an authority record remain searchable only when that authority stands as the creator of a given archival record, and not when the authority is associated with the archival record in any other way.

Carlos 

Creighton Barrett

Mar 19, 2015, 9:18:31 AM
to ica-ato...@googlegroups.com
Carlos - have you considered limiting the basic search results to archival descriptions? Here in Nova Scotia, we have customized the basic search box of the MemoryNS provincial catalogue so it does not search authority records or subject headings. Check it out: https://memoryns.ca/

Dalhousie University Archives has also made this customization: http://findingaids.library.dal.ca/

This was an easy customization. I have no idea how ElasticSearch works or if it is possible to limit basic search results so the history of an authority record is only used to populate search results when that authority is the creator of an archival record that matches your search term. But if you are linking authority records to individual files, then you should be able to build advanced searches that search the authority headings AND any field in the descriptions. So you wouldn't lose that functionality by limiting your basic search box to the archival descriptions.

Just a thought!

Cheers,
Creighton


Dan Gillean

Mar 19, 2015, 6:08:32 PM
to ica-ato...@googlegroups.com
Hi Carlos,

I'd love more information about this behaviour - so far I have been unable to recreate it during my tests using our AtoM 2.2 development branch.

I tested by doing the following:
  • Create a description "Test" (fonds) with 2 children, "A", and "B"
  • Create a new authority record, "Foo." In the history field, add a unique term - e.g. something like "cookiemonster"
  • Make "Foo" the creator of A, the subject of B.
  • Search for "cookiemonster"

I can only get results for A returned - B is not returned.

Now, I try making Foo the subject of the parent, "Test", as well. When I search for "cookiemonster", I still only get A returned. I tried a few other variations as well, but I never found cookiemonster to return a description where I had added Foo as a subject only.

As far as I can tell, in our current version the history of a related authority record will only return results for an archival description if a) the related authority record is added as a creator, and b) it is added directly, not inherited (such as at lower levels, for example).
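
The rule described above can be written down as a small Python sketch. It is only a toy model of the expected behaviour, not AtoM code: the `Description` class, its fields, and the `expected_hits` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Description:
    title: str
    creators: list = field(default_factory=list)  # creators added directly
    subjects: list = field(default_factory=list)  # name (subject) access points
    children: list = field(default_factory=list)


def walk(description):
    """Yield a description and all of its descendants."""
    yield description
    for child in description.children:
        yield from walk(child)


def expected_hits(root, histories, term):
    """Titles whose directly added creators mention the term in their history.

    Subjects, and creators inherited from a parent level, are deliberately ignored.
    """
    term = term.lower()
    return [
        d.title
        for d in walk(root)
        if any(term in histories.get(name, "").lower() for name in d.creators)
    ]


# Hypothetical data mirroring the test steps in this message.
histories = {"Foo": "Known to friends as cookiemonster."}
a = Description("A", creators=["Foo"])
b = Description("B", subjects=["Foo"])
test = Description("Test", subjects=["Foo"], children=[a, b])

print(expected_hits(test, histories, "cookiemonster"))  # ['A']
```

Under this model, only A is expected for "cookiemonster"; any description that comes back solely because Foo is its subject would contradict the rule.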

It sounds like you are seeing something different. What version of AtoM are you using? Have you tried re-indexing? Is it possible the term unions is found somewhere else? For example, PDF text will be indexed and returned as a search result if you have pdftotext installed - could "unions" be in the attached digital object PDFs, perhaps?

I suggest trying something similar to the above test to confirm that, when you add an authority record to the name (subject) access point field, the History of that related name access point is in fact what is interfering with the search results.

Let us know what you find!

Cheers,


Dan Gillean, MAS, MLIS
AtoM Product Manager / Systems Analyst,
Artefactual Systems, Inc.
604-527-2056
@accesstomemory

virtu...@yahoo.com.br

Mar 20, 2015, 1:59:39 PM
to ica-ato...@googlegroups.com
Hi Creighton,
We will consider this option.
Thanks very much for sharing your ideas.
cheers!
Carlos

virtu...@yahoo.com.br

Mar 20, 2015, 2:22:37 PM
to ica-ato...@googlegroups.com
Hello Dan,
Thanks again for your help! I repeated your test in the demo version on the web page and made an intriguing discovery. Apparently we are both right. Indeed, the history of the authority record did not return B for cookiemonster when B is part of the Test fonds and has Foo as subject. But when you add Foo as a subject of Test as a fonds (not creator), or as a subject of C (included as an item under part B), you get both Test and C as results in a search for cookiemonster. It seems that the problem of the history of authority records confusing the simple search when names are subjects is somehow associated with the level of description of the archival records.
Cheers,
Carlos

virtu...@yahoo.com.br

Mar 21, 2015, 4:24:06 PM
to ica-ato...@googlegroups.com, virtu...@yahoo.com.br
Dan,
Very strange. Today I repeated the test on the demo version and B, even as a part of the Test fonds, was returned as a result for cookiemonster when Foo was added as subject. Maybe I did something wrong the first time. So, this time, not only fonds and items but parts as well were returned as results for cookiemonster when Foo, with "cookiemonster" in its history field, was included as their subject. This is the same behavior I reported in the first message. Could it be a bug in the demo version? In the next few days I will run more tests on my installed version of AtoM and share the results here on the list.
Cheers

Dan Gillean

Mar 23, 2015, 5:41:47 PM
to ica-ato...@googlegroups.com, virtu...@yahoo.com.br
Hi Carlos,

Very strange....

I also did some more testing, in my local dev environment, on both 2.1 and our development 2.2 branch. I simplified the test just to see what I could find, and I consistently found that your original analysis was correct. Here is what I did:
  • Create a new authority record, "FOO"
  • Add a unique search term to the actor history - "peanut"
  • Create a new archival description, "BAR"
  • Add "FOO" as a name access point to BAR, and save (i.e. add NO creator)
  • Search for "peanut"

When I did this, I consistently got the description (BAR) returned when I searched for peanut.

I have filed a bug for this issue here:

From speaking briefly to our developers, it seems the problem is in our Elasticsearch mapping, which was not made granular enough, here:

It looks like we are currently pulling in the entirety of the indexed fields for an actor every time a name access point is added - in fact, all we really need is the authorized form of name. You could argue that adding some other fields might be useful - such as the other forms of name fields - but we certainly do not need EVERY field, and especially not the history!
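
To illustrate the difference, here is a minimal Python sketch. It is not AtoM's actual code or its real Elasticsearch mapping: the field names (`authorized_form_of_name`, `history`, `names`, `creators`) and the helper functions are hypothetical, and the naive substring search merely stands in for a full-text query.

```python
# Toy model of the indexing problem; all field and function names are hypothetical.

def flatten(value):
    """Yield every string stored anywhere inside a nested dict/list structure."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for v in value.values():
            yield from flatten(v)
    elif isinstance(value, list):
        for v in value:
            yield from flatten(v)


def build_doc(description, actors, full_actor_for_names):
    """Build the search document for one archival description."""
    doc = {"title": description["title"]}
    # Creators keep their history: it is displayed on the description itself.
    doc["creators"] = [actors[a] for a in description.get("creators", [])]
    if full_actor_for_names:
        # Behaviour reported in this thread: the whole actor record is embedded.
        doc["names"] = [actors[a] for a in description.get("names", [])]
    else:
        # Narrower alternative: embed only the authorized form of name.
        doc["names"] = [
            {"authorized_form_of_name": actors[a]["authorized_form_of_name"]}
            for a in description.get("names", [])
        ]
    return doc


def search(docs, term):
    """Return titles of documents containing the term in any indexed string."""
    term = term.lower()
    return [d["title"] for d in docs if any(term in s.lower() for s in flatten(d))]


actors = {"foo": {"authorized_form_of_name": "FOO", "history": "A peanut farmer."}}
bar = {"title": "BAR", "names": ["foo"]}     # FOO is a name access point only
baz = {"title": "BAZ", "creators": ["foo"]}  # FOO is the creator

current = [build_doc(d, actors, True) for d in (bar, baz)]
narrowed = [build_doc(d, actors, False) for d in (bar, baz)]

print(search(current, "peanut"))   # BAR and BAZ both match
print(search(narrowed, "peanut"))  # only BAZ matches
print(search(narrowed, "foo"))     # both still match on the name itself
```

In this toy version, narrowing what gets embedded for name access points removes the unwanted hit on BAR while keeping the description findable by the name itself.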

Changing the mapping in the file linked above apparently also requires a number of supporting changes throughout the application. Because of this, I don't yet know if we will be able to include a fix for this in the 2.2 release - but we will try!

Thank you again for your patience, and for reporting this issue. If you have developers and intend to try to fix this yourself, please consider submitting a pull request! Or if your institution is interested in sponsoring a fix to guarantee its inclusion in 2.2, please feel free to contact me off-list to discuss further.

Regards,


Dan Gillean, MAS, MLIS
AtoM Product Manager / Systems Analyst,
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
