Solr 6 faceting performance


magi...@gmail.com

Dec 20, 2016, 4:30:42 PM
to Blacklight Development
Has anyone running Solr 6 noticed issues with faceting performance, particularly in the common case of faceting over all docs?
https://issues.apache.org/jira/browse/SOLR-8096

My current understanding is that the new default docValues faceting is much more near-realtime-friendly than the previous default UninvertedField faceting, but it also seems to do a lot more heavy lifting at query time. This means faster (or nonexistent) searcher warming with respect to faceting, but significantly slower user query performance compared to UninvertedField faceting. The latter builds a heavier-weight data structure the first time a user requests faceting on a field, and then reuses that data structure for all subsequent requests against the same field and IndexSearcher (i.e., index version).

I haven't compared response times between Solr versions, but have observed behavior consistent with the description in SOLR-8096: notably, that response latency is closely proportional to the total number of docs over which facets are calculated. This makes sense for an iterator-based per-request approach.
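
To make the proportionality concrete, here is a toy Python model of iterator-based per-request counting. It is not Solr's code; the function name, the `FORMATS` data, and the single-valued field are all assumptions for illustration. The point is only that the loop visits every matching doc, so the work grows with the size of the result set.

```python
from collections import Counter

def count_facets(field_values, matching_docs):
    """Count facet values by visiting every matching doc (toy model, not Solr)."""
    counts = Counter()
    visited = 0
    for doc_id in matching_docs:
        visited += 1                      # one step of work per matching doc
        value = field_values[doc_id]
        if value is not None:             # docs without a value contribute nothing
            counts[value] += 1
    return counts, visited

# Hypothetical per-document values for a single-valued "format" field.
FORMATS = ["Book", "Book", "Journal", None, "Map", "Book"]

# Faceting over all docs touches all six entries ...
all_counts, all_visited = count_facets(FORMATS, range(len(FORMATS)))
# ... while a narrow query touches only the docs it matched.
few_counts, few_visited = count_facets(FORMATS, [2, 4])
```

Nothing is reused between the two calls, which is why a blank search (all docs) is the worst case in this model.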

Thanks in advance for any insights you may have!

Michael

Nikitas Tampakis

Dec 21, 2016, 11:26:57 AM
to blacklight-...@googlegroups.com
I've encountered similar faceting performance problems too. The more documents that get returned in any given query and the more facets that we have, the slower the response time. Since we've only ever been on Solr 5 and Solr 6, the facets have always been docValues. Rather than looking at the underlying Lucene structures, our strategy has been to minimize the number of facets we expose in the interface. On our homepage in particular (https://pulsearch.princeton.edu/) we only expose 4 facets. During a search we have 12. In most searches the result sets are pretty small and the performance isn't much of an issue. However, clicking a facet like "In the Library" or format "Book," or simply running a blank search, gets very slow.

One relatively new Solr feature which may help is the JSON Facet API: http://yonik.com/json-facet-api/. I haven't had much time to play around with it, but the performance improvements seem promising: http://yonik.com/facet-performance/
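
For anyone who hasn't seen it, a JSON Facet API request is a JSON body with a `facet` section. Below is a minimal Python sketch that only builds such a body; the field name `format`, the facet label `format_counts`, and the helper name are assumptions, and nothing here is sent to a server.

```python
import json

def json_facet_body(field, limit=10):
    """Build a JSON request body asking for a terms facet on one field."""
    return json.dumps({
        "query": "*:*",   # match all docs, the slow case discussed in this thread
        "limit": 0,       # return no documents, only facet counts
        "facet": {
            # The key is an arbitrary label for this facet in the response.
            field + "_counts": {"type": "terms", "field": field, "limit": limit}
        },
    })

# "format" is a hypothetical field name.
body = json_facet_body("format")
```

The resulting string would be POSTed to a collection's search handler; whether it actually beats `facet.field` on a given index is something to measure rather than assume.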

Nikitas
--
You received this message because you are subscribed to the Google Groups "Blacklight Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blacklight-develo...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

magi...@gmail.com

Dec 21, 2016, 1:06:56 PM
to Blacklight Development
Thanks, Nikitas! In my experience another effective way to work around this is by deploying SolrCloud (as opposed to a monolithic instance), since the calculation of the base facet counts happens per-shard. Doubling the number of nodes in SolrCloud, in our experience, roughly halves response time. I'd consider this a workaround rather than a solution though.

I've also worked on a proof-of-concept for some minimal per-node caching (in org.apache.solr.request.DocValuesFacets) that handles at a minimum the very common case of faceting over all docs, and my initial impression is that it makes a huge difference. Trying to figure out whether it makes sense to contribute this as a comment/suggestion to the JIRA issue linked above (SOLR-8096).
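
The general idea can be illustrated with a toy Python sketch. This is not the proof-of-concept itself and not Solr code: the class, the `(index_version, field)` key, and the sample data are all assumptions. Counts for the all-docs case are computed once per index version and field, then reused until the index changes.

```python
from collections import Counter

class AllDocsFacetCache:
    """Toy cache of all-docs facet counts, keyed by (index_version, field)."""

    def __init__(self, compute):
        self._compute = compute   # function(field) -> Counter over all docs
        self._cache = {}
        self.misses = 0

    def counts(self, index_version, field):
        key = (index_version, field)
        if key not in self._cache:
            self.misses += 1
            # Keep only entries for the current index version; older ones are assumed stale.
            self._cache = {k: v for k, v in self._cache.items() if k[0] == index_version}
            self._cache[key] = self._compute(field)
        return self._cache[key]

# Hypothetical documents standing in for an index.
DOCS = [{"format": "Book"}, {"format": "Map"}, {"format": "Book"}]

def count_all(field):
    return Counter(d[field] for d in DOCS if field in d)

cache = AllDocsFacetCache(count_all)
first = cache.counts(1, "format")    # computed
second = cache.counts(1, "format")   # served from the cache
third = cache.counts(2, "format")    # new index version, recomputed
```

The trade-off is the same one described earlier in the thread: the first request per index version pays the full cost, and later all-docs requests do not.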

Another thing to try out: the "facet.threads" parameter worked as expected (evaluating facets in parallel) on docValues faceting in our Solr 6 deployment. The default is to evaluate facets serially, so if you have a lot of cores on your server, you might try messing around with this parameter ...
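
As a rough sketch of what such a request looks like, the Python below builds a query string with `facet.threads` set. The field names and the helper name are assumptions, and the thread count is something to tune against your own hardware.

```python
from urllib.parse import urlencode

def facet_threads_params(fields, threads):
    """Build a Solr query string that facets on several fields in parallel."""
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    params += [("facet.field", field) for field in fields]
    # Number of threads Solr may use to evaluate the facet fields concurrently.
    params.append(("facet.threads", str(threads)))
    return urlencode(params)

# "format" and "language" are hypothetical field names.
query_string = facet_threads_params(["format", "language"], threads=2)
```

Since the parallelism is across facet fields, a request with a single facet field would not be expected to benefit.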

Thanks for pointing out the JSON Facet API ... I need to familiarize myself more with that, but it's not immediately clear to me whether it's orthogonal to the performance issues we're looking at (since it can be used in conjunction with docValues faceting?)

Looking forward to hearing from others regarding this ...

Michael

Bill Dueber

Dec 21, 2016, 6:22:26 PM
to blacklight-...@googlegroups.com
Have you tried turning off docValues on the fields you use to facet? I haven't taken time to run any benchmarks, and I'm kinda hoping someone else has :-)

--
Bill Dueber
Library Systems Programmer
University of Michigan Library

magi...@gmail.com

Dec 22, 2016, 8:46:22 AM
to Blacklight Development
Wouldn't that be equivalent to forcing legacy UninvertedField faceting (which I think (?) you can already do by specifying "facet.method=uif")? Either way of forcing uif faceting would probably help as a workaround, but I'd be hesitant to consider it a long-term solution.
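
For reference, the method can also be set per field with Solr's `f.<field>.facet.method` override, so you can try uif on one slow facet without changing the rest. A minimal Python sketch that only builds the query string (the field names and helper name are assumptions):

```python
from urllib.parse import urlencode

def uif_facet_params(fields, uif_fields=()):
    """Build a Solr query string that requests facet.method=uif for chosen fields."""
    params = [("q", "*:*"), ("rows", "0"), ("facet", "true")]
    for field in fields:
        params.append(("facet.field", field))
        if field in uif_fields:
            # Per-field override; other fields keep the default facet method.
            params.append((f"f.{field}.facet.method", "uif"))
    return urlencode(params)

# "format" and "language" are hypothetical field names.
query_string = uif_facet_params(["format", "language"], uif_fields={"format"})
```

Comparing response times for the same query with and without the override would be one way to get the benchmark numbers nobody in this thread has yet.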