Hi,
I'd like to bring to the list’s attention that the proposed changes for JIRA issue KERN-2107 could affect all Solr queries. Those more experienced with Solr may like to cast an eye over the changes, as this may not be the correct approach.
The aim of the change is to move the “reader restrictions” from the Solr “q” parameter to the Solr “fq” parameter, so that they no longer influence a Solr document’s score. I’m not sure what unintended consequences this proposal might have.
The proposed code changes can be found here [1], and the pull request can be found here [2].
Regards,
Mark Walsh.
[1]
https://github.com/mawalsh/nakamura/tree/kern-2107
[2] pull request
https://github.com/sakaiproject/nakamura/pull/284
| ALBURY-WODONGA | BATHURST | CANBERRA | DUBBO | GOULBURN | ONTARIO | ORANGE | SYDNEY | WAGGA WAGGA |
Give Generously - Support Young Australians
Charles Sturt University in Australia The Chancellery, Panorama Avenue, Bathurst NSW Australia 2795 (ABN: 83 878 708 551; CRICOS Provider Numbers: 00005F (NSW), 01947G (VIC), 02960B (ACT)).
Charles Sturt University in Ontario 860 Harrington Court, Burlington Ontario Canada L7N 3N4 Registration: www.peqab.ca
--
You received this message because you are subscribed to the Google Groups "Sakai Nakamura" group.
To post to this group, send email to sakai-...@googlegroups.com.
To unsubscribe from this group, send email to sakai-kernel...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sakai-kernel?hl=en.
I assume the issue you're seeing with scoring is that the current
practice of sticking an extra implicit clause onto each query like:
AND readers:(mwalsh OR mst OR ...)
is causing documents with more readers in common with the current user
to end up being scored more highly? If so, I think filters sound like
what you want, since you really just want to know if any term in that
field matched or not.
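
To make the difference concrete, here is a rough sketch of the two request shapes. It is not Nakamura's actual code: the helper names and the "title" field are made up for illustration, and real principal names would need escaping before going into a query.

```python
from urllib.parse import urlencode


def reader_clause(principals):
    """Build a clause such as readers:(mwalsh OR mst). Hypothetical helper."""
    return "readers:(" + " OR ".join(principals) + ")"


def params_scored(user_query, principals):
    # Current practice: the restriction is ANDed onto q, so every matching
    # reader term takes part in relevance scoring.
    return [("q", "(" + user_query + ") AND " + reader_clause(principals))]


def params_filtered(user_query, principals):
    # Proposed: q holds only the user's query; fq restricts the result set
    # without contributing to the score.
    return [("q", user_query), ("fq", reader_clause(principals))]


if __name__ == "__main__":
    principals = ["mwalsh", "mst"]
    print(urlencode(params_scored("title:biology", principals)))
    print(urlencode(params_filtered("title:biology", principals)))
```

In the second form the number of readers a document shares with the user no longer matters, only whether at least one of them matched.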
Your pull request looks good to me, but if it's not too hairy I'd
consider passing through the "readers:" filter and any pre-existing
filters as separate strings if possible. Passing multiple filters to
Solr is semantically the same as ANDing them together (as you're doing),
but should allow Solr to construct and cache the filters separately.
Since a user's list of "readers" isn't likely to change much between
queries, it's a good candidate for being cached and reused, and Solr
should do that if you keep it as a separate filter.
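
As a sketch of what I mean by separate strings (again hypothetical names, and the "type:file" filter is only an example of a pre-existing filter), each filter becomes its own fq parameter rather than one ANDed string:

```python
from urllib.parse import urlencode, parse_qs


def build_params(user_query, principals, existing_filters=()):
    """Return request parameters with one fq entry per filter. Illustrative only."""
    params = [("q", user_query)]
    # Pass any pre-existing filters through untouched, one fq each.
    for existing in existing_filters:
        params.append(("fq", existing))
    # The readers restriction is its own fq, so Solr can cache it separately.
    params.append(("fq", "readers:(" + " OR ".join(principals) + ")"))
    return params


query_string = urlencode(build_params("biology", ["mwalsh", "mst"], ["type:file"]))

if __name__ == "__main__":
    print(query_string)
    print(parse_qs(query_string)["fq"])
```

The result set is the same as with a single "type:file AND readers:(...)" filter, but the readers filter can be reused across queries that carry different other filters.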
Give me a yell if I can help with anything,
Mark
"Walsh, Mark" <maw...@csu.edu.au> writes:
> Hi,
>
> Just bringing to the list's attention that the proposed changes
> associated with jira KERN-2107 could affect all solr queries, and
> perhaps those more experienced in solr may like to run their eye over
> the proposed changes, as perhaps this is not the correct approach.
>
> The aim of the change was to move "reader restrictions" from the solr
> "q" parameter to the solr "fq" parameter to prevent "reader
> restrictions" from influencing the solr documents score. I'm not sure
> what the unintended consequences of this proposal are ?
>
> The proposed code changes can be found here [1], and the pull request
> can be found here [2].
>
> Regards,
> Mark Walsh.
>
> [1]
> https://github.com/mawalsh/nakamura/tree/kern-2107
>
> [2] pull request
> https://github.com/sakaiproject/nakamura/pull/284
--
Mark Triggs
<ma...@dishevelled.net>
As far as the change goes, functionally I agree with it. I think this will also take some of the unexpected randomness out of some of our searches.
Thanks for digging into this.

My only reservation is that I would like to hear Mark Triggs weigh in on how filter queries that aren't very similar will affect memory usage and caching.

To John's points:
# This will affect all Solr queries in the system as this is how we limit what searched content is shown to the authenticated user. All Solr queries that go through our search framework pass through this processor to have the appropriate "readers" added to the query. Whenever content is indexed, the indexing processor adds "readers" to the index document.
# We don't currently have a good testing arrangement for changes in search tuning and design. Most of our tuning so far has been a result of finding slow queries either through the logs or by user feedback. I think having some description of the search behavior in different areas could be good for deploying institutions and those extending the system.
??? I thought we were putting randomness in? Are you suggesting that taking readers into account can produce individual results that are surprising to the external observer? This could be a good thing.

On 30 Aug 2011, at 22:05, Carl Hall wrote:
> As far as the change goes, functionally I agree with it. I think this will also take some of the unexpected randomness out of some of our searches.
>
> To John's points:
> # This will affect all Solr queries in the system as this is how we limit what searched content is shown to the authenticated user. All Solr queries that go through our search framework pass through this processor to have the appropriate "readers" added to the query. Whenever content is indexed, the indexing processor adds "readers" to the index document.

So if this is correct, are we not throwing away some potentially useful information with the other readers that might help when, for example, I am looking for stuff related to my courses?

> # We don't currently have a good testing arrangement for changes in search tuning and design. Most of our tuning so far has been a result of finding slow queries either through the logs or by user feedback. I think having some description of the search behavior in different areas could be good for deploying institutions and those extending the system.

I wasn't talking about the technical performance of search code so much as the effectiveness of the search in finding what I am looking for. That's why I tentatively thought it might be a UX issue.