[DuraSpace JIRA] (DS-4271) Discovery "contains" filter does not perform a phrase query

Chris Wilper (DuraSpace JIRA)

Jun 11, 2019, 11:01:01 AM
to dspace-...@googlegroups.com
Chris Wilper created an issue
 
DSpace / Bug DS-4271
Discovery "contains" filter does not perform a phrase query
Issue Type: Bug
Affects Versions: 6.3, 4.2, 7
Assignee: Unassigned
Components: Discovery
Created: 11/Jun/19 10:00 AM
Priority: Minor
Reporter: Chris Wilper

When doing a "contains" filter query, if the value entered for the field contains spaces, the behavior is to return documents where the individual words are present in any order in the field.

A typical user would expect "contains" to do a phrase query, where the entire phrase occurs somewhere in the field's value.

This is a long-standing issue that stems from the following line in SolrServiceImpl, which has remained untouched since the original Discovery implementation, and is present in versions 4.x through 7.0 Preview 1:

https://github.com/DSpace/DSpace/blob/dspace-6_x/dspace-api/src/main/java/org/dspace/discovery/SolrServiceImpl.java#L2110

Notice how it surrounds the text sent to solr with "(" and ")". This has specific meaning in solr, which differs from surrounding the text with double quotes. Surrounding with double quotes would have the expected effect of doing a phrase query. Surrounding with parentheses is equivalent to entering multiple fielded search terms, one for each word. For example, the solr query:

title:(introduction to dspace)

...is the same as the solr query:

title:introduction title:to title:dspace

...which is not the expected effect.

The solution, which we have tested, is to replace the surrounding parentheses in the code referenced above with double quotes.
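
To make the difference concrete, here is a minimal sketch of the two ways of building the filter string. It is not the actual SolrServiceImpl code: the class name and both helper methods are hypothetical, and the escaping of backslashes and double quotes is an assumption about what a phrase-based version would need so that user input cannot end the phrase early.

```java
public class ContainsFilterSketch {

    // Hypothetical helper mirroring the behavior described above: the value is
    // wrapped in parentheses, so each word becomes a separate term on the field.
    public static String groupedQuery(String field, String value) {
        return field + ":(" + value + ")";
    }

    // Hypothetical helper for the proposed behavior: the value is wrapped in
    // double quotes so that it is treated as a single phrase.
    // Assumption: backslashes and embedded double quotes are escaped first.
    public static String phraseQuery(String field, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return field + ":\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // Prints: title:(introduction to dspace)
        System.out.println(groupedQuery("title", "introduction to dspace"));
        // Prints: title:"introduction to dspace"
        System.out.println(phraseQuery("title", "introduction to dspace"));
    }
}
```

The first form matches on the individual words; the second only matches documents where the words occur together, in order, in the field.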

This message was sent by Atlassian JIRA (v7.10.0#710001-sha1:0399717)

Tim Donohue (DuraSpace JIRA)

Jun 24, 2019, 12:15:01 PM
to dspace-...@googlegroups.com
Tim Donohue updated an issue
Change By: Tim Donohue
Status: Received → Volunteer Needed

Jonas Van Goolen (Atmire) (DuraSpace JIRA)

Oct 10, 2019, 5:56:01 AM
to dspace-...@googlegroups.com
Jonas Van Goolen (Atmire) commented on Bug DS-4271
 
Re: Discovery "contains" filter does not perform a phrase query

As this has been running in production for http://repository.uneca.org/ for a while now, and we have recently also applied and tested this fix for a whole batch of clients, I quickly created this PR, since we want to avoid having to port this for newer projects that would start from the latest DSpace-6_x.

I am refraining from porting this to DSpace 7 for now, as we might want to bundle any improvements to searching and indexing together instead of porting such small fixes individually.


Jonas Van Goolen (Atmire) (DuraSpace JIRA)

Oct 10, 2019, 5:57:01 AM
to dspace-...@googlegroups.com
Jonas Van Goolen (Atmire) edited a comment on Bug DS-4271
As I checked with [~cwilper], and as this has been running in production for http://repository.uneca.org/ for a while now, and we have recently also applied and tested this fix for a whole batch of clients, I quickly created this PR, since we want to avoid having to port this for newer projects that would start from the latest DSpace-6_x.

I am refraining from porting this to DSpace 7 for now, as we might want to bundle any improvements to searching and indexing together instead of porting such small fixes individually.


The related PR can be found here:
https://github.com/DSpace/DSpace/pull/2543

Anonymous (DuraSpace JIRA)

Oct 10, 2019, 6:49:01 AM
to dspace-...@googlegroups.com
Issue was automatically transitioned when Jonas Van Goolen created pull request #2543 in GitHub
 
Change By: Jonas Van Goolen
Status: Volunteer Needed → Code Review Needed

Alan Orth (LYRASIS JIRA)

Jun 21, 2021, 6:27:02 AM
to dspace-...@googlegroups.com
Alan Orth updated an issue
Change By: Alan Orth
Attachment: ds-4271-before-fs8.png

Alan Orth (LYRASIS JIRA)

Jun 21, 2021, 6:28:01 AM
to dspace-...@googlegroups.com
Alan Orth commented on Bug DS-4271
 
Re: Discovery "contains" filter does not perform a phrase query

Agree this is a good fix for a problem I didn't even realize we had. Tested with a contains filter with spaces on a subject term "farmer managed irrigation systems" in our staging repository.

  • Before, 293 results, of which many (all?!) are not what the user would expect
  • After, 162 results, of which all are exactly matching the search term

Looks good to me for DSpace 6.4. Merge!
