Solr Indexing Customisation in DSpace 7 and 8


Ben Parkes

Oct 8, 2024, 7:45:43 AM
to dspace-c...@googlegroups.com

Hello,

 

I am trying to better understand the DS7/8 search result indexing and stop full text searching in an instance of DS7.6.

 

I believe stopping full text indexing and searching is done by either setting the full-text <field> indexed and stored parameters to "false" or by commenting the whole <field> before rebuilding the index.

 

My issue is that in my instance I have a use case where the search results are not displayed in the desired order. We have dc.identifier storing a code (ENLI10199, for example). When searching for this code, the results list items where fragments of the code string are present ("10199", for example; see attached images) before items where the full code is present in the metadata or full text.

 

This behaviour is different to the DS6 instance I am upgrading where searching for "ENLI10199" returns a list of items where only this full string is present in the metadata and does not return items where there are only fragments of this string (again see attached images).

 

What is desired is that the search results list will always (and, if possible, only) display items where the string "ENLI10199" is present, like the results produced by the DS6 instance. I am assuming something has changed in the way DS7 indexes and returns search results that is producing the differences identified.

 

Does anyone know if this is possible, and if there is a way to do it in the config? I assume that is how it is done in DS6, as we haven't done any development work on that instance to produce these results.

 

Thanks,

Ben

 

____________________________

Ben Parkes

Lead SCURL Developer | Digital Library Software Developer

Digital Library, Library & University Collections, University of Edinburgh 

Argyle House, Lady Lawson Street, Edinburgh EH3 9DR

Pronouns: He, him, his

 

DS6.png
DS7.png

DSpace Community

Oct 8, 2024, 10:10:55 AM
to DSpace Community
Hi Ben,

To stop DSpace from full-text indexing any content, you should be able to set "textextractor.max-chars = 0" (or some other very small number).

This setting defines how much of the text to index when using full-text indexing. I believe you can set it to zero to disable it.
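
As a rough sketch, the override might look like the fragment below. It assumes the setting is placed in [dspace]/config/local.cfg, which this thread does not confirm, and that the search index is rebuilt afterwards (for example with "[dspace]/bin/dspace index-discovery -b") so that existing items are reindexed.

```properties
# Assumed location: [dspace]/config/local.cfg (not stated in this thread).
# Controls how much extracted text is indexed for full-text search.
# 0 is the value suggested above; whether it fully disables full-text
# indexing has not been verified here.
textextractor.max-chars = 0
```

If zero turns out not to be accepted, a very small positive number, as mentioned above, would be the fallback to try.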


Tim