Search with double quotes in DSpace 7

Ibai Stats

Dec 16, 2024, 3:43:33 PM
to DSpace Technical Support
Hello:

I would like to ask why searching with double quotes does not work as expected.

For example, if I search for "consumer products" in the repository, I get items that contain those two words consecutively, but I also get items that contain the word "consumer" in one metadata field and the word "products" in another, which gives the impression that the double quotes are not doing their job.

Is this standard behavior? How can I configure DSpace so that a double-quoted search returns only results that contain the exact phrase?

Thank you very much, and best regards.

DSpace Technical Support

Dec 19, 2024, 9:00:43 PM
to DSpace Technical Support
Hello,

I'm not seeing that same behavior locally, or on the demo site (https://demo.dspace.org/). As far as I can see, the double quotes *are* doing a phrase search, as described in the docs at https://wiki.lyrasis.org/display/DSDOC7x/Search+-+Advanced

Here's an example on the demo site when searching for "test item" (with quotes): https://demo.dspace.org/search?spc.page=1&query=%22test%20item%22

If you instead search for test item (no quotes), you get a much larger result set: https://demo.dspace.org/search?spc.page=1&query=test%20item
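
The two demo links above differ only in whether the query carries percent-encoded double quotes (%22). A minimal Python sketch of how such URLs are formed, using only the standard library; the helper name demo_search_url is made up for illustration and is not part of DSpace:

```python
from urllib.parse import quote

BASE = "https://demo.dspace.org/search"

def demo_search_url(query: str, page: int = 1) -> str:
    """Build a search URL in the same shape as the demo links (hypothetical helper)."""
    # quote() percent-encodes the double quote as %22 and the space as %20.
    return f"{BASE}?spc.page={page}&query={quote(query)}"

phrase_url = demo_search_url('"test item"')  # phrase search, quotes included
terms_url = demo_search_url('test item')     # plain terms, no quotes

print(phrase_url)
print(terms_url)
```

Comparing the result counts behind the two URLs is a quick way to check whether the quotes change anything on a given site.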

I think we'd need more information about which version of DSpace 7 you are using (it could be an old bug that is fixed in a later version). You may also want to see if you can reproduce the odd behavior on our demo site or another site, as that would help narrow down what might be going on.

Tim

Ibai Stats

Jan 20, 2025, 7:31:15 AM
to DSpace Technical Support
Hello: 

We have encountered an example of this: 

https://escena.cdmae.cat/home

• Jaume Melendres or Jaume AND Melendres - 289 results. We believe this is OK.
• "Jaume Melendres" - 160 results. These include records where the name is also written as "Melendres, Jaume".
• "Melendres, Jaume" - 220 results. Although there are more results, we believe they do not include the records where the name is written "Jaume Melendres".

The DSpace version is 7.5. 

Andrew Thompson

Jan 22, 2025, 12:51:19 AM
to DSpace Technical Support
DSpace uses Apache Solr under the hood and queries it directly. DSpace copies all of its fields into a search_text field, which is very likely to include both "Jaume Melendres" and "Melendres, Jaume". See here, here, then the field type. The field type is responsible for the indexing behaviour (type="index") and the search behaviour (type="query"), both of which use analyzers to define their behaviour. DSpace uses the Lucene query parser, which is the part that gives meaning to the double quotes.

So, what data do you have in the search_text field?  
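
To make the catch-all idea concrete, here is a toy Python model of a phrase search over a single field that holds copies of every metadata value. It is only a sketch: the tokenizer, the sample records, and the gap parameter are assumptions for illustration, not Solr's or DSpace's actual analyzers. The gap is similar in spirit to Solr's positionIncrementGap attribute; whether it plays any role on a given install is something to check in that install's schema.

```python
import re

def tokenize(value: str) -> list[str]:
    """Toy analyzer: lowercase, keep runs of word characters, drop punctuation."""
    return re.findall(r"\w+", value.lower())

def index_catch_all(values: list[str], gap: int = 100) -> list[tuple[str, int]]:
    """Copy every metadata value into one field and record token positions.

    `gap` is the position jump added between values (illustrative default).
    """
    postings, pos = [], 0
    for value in values:
        for token in tokenize(value):
            postings.append((token, pos))
            pos += 1
        pos += gap
    return postings

def phrase_match(postings: list[tuple[str, int]], phrase: str) -> bool:
    """True if the phrase tokens occur at consecutive positions."""
    words = tokenize(phrase)
    if not words:
        return False
    positions: dict[str, set[int]] = {}
    for token, pos in postings:
        positions.setdefault(token, set()).add(pos)
    return any(
        all(start + i in positions.get(word, ()) for i, word in enumerate(words))
        for start in positions.get(words[0], ())
    )

# Made-up sample records: each list is the set of values copied into the catch-all field.
record_a = ["Melendres, Jaume", "Interview with Jaume Melendres"]  # both forms present
record_b = ["Melendres, Jaume", "Teatre"]                          # inverted form only
record_c = ["Doe, Jaume", "Melendres, Jaume"]                      # two adjacent values

a_direct = phrase_match(index_catch_all(record_a), '"Jaume Melendres"')
a_inverted = phrase_match(index_catch_all(record_a), '"Melendres, Jaume"')
b_direct = phrase_match(index_catch_all(record_b), '"Jaume Melendres"')
c_with_gap = phrase_match(index_catch_all(record_c), '"Jaume Melendres"')
c_no_gap = phrase_match(index_catch_all(record_c, gap=0), '"Jaume Melendres"')
```

In this model record_a matches both phrases because the two forms sit in different values of the same field, record_b matches only the inverted form, and record_c matches "Jaume Melendres" only when no gap separates adjacent values. That is why the actual contents of search_text matter when interpreting the result counts.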

Try modifying the schema.xml file, restarting DSpace, and re-indexing, to see how it affects the search queries in both the frontend and the Apache Solr backend. That will help give you an idea of how the search works and doesn't work.
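
To compare the frontend with the Solr backend, the same phrase can be sent straight to Solr. The sketch below only builds the request URLs. It assumes Solr is reachable at localhost:8983 and that the discovery core is named "search"; adjust both to your install. The helper name solr_query_url is made up for illustration.

```python
from urllib.parse import urlencode

# Assumptions: local Solr on port 8983, discovery core named "search".
SOLR_SELECT = "http://localhost:8983/solr/search/select"

def solr_query_url(query: str, field: str = "search_text") -> str:
    """Build a Solr select URL for a query against one field (hypothetical helper)."""
    params = {
        "q": f"{field}:{query}",
        "rows": 0,             # only the hit count (numFound) is of interest
        "debugQuery": "true",  # ask Solr to report how it parsed the query
        "wt": "json",
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"

phrase = solr_query_url('"Jaume Melendres"')
inverted = solr_query_url('"Melendres, Jaume"')

print(phrase)
print(inverted)
```

Opening each URL in a browser or with curl and comparing numFound with the counts shown in the DSpace frontend should show whether the difference comes from Solr itself or from how the frontend builds the query.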

-Andrew
