solr facet.missing


Jonathan Rochkind

Dec 30, 2009, 4:54:48 PM
to blacklight-...@googlegroups.com
So in typical Blacklight installations, it seems like if you want
records with no value for a particular facet to show up under an
"Unknown" value in that facet, people deal with this at the indexing
stage by making sure "Unknown" (or what have you) is actually included
in the solr field.

However, Solr supports a "&facet.missing" feature, where you can tell
Solr to give you a count of all records that actually have no value in
the facet field. I would like to be able to use this in certain
circumstances where it ends up being more convenient than dealing with
it at the indexing stage (It's also perhaps somewhat more efficient on
the solr end, at least in terms of disk space!).

I started looking into a patch to Blacklight to support this, but it
gets a bit trickier than at first it seemed. I may need to patch rsolr
or rsolr-ext in order to support retrieval of this "missing facet".
(You need to ask solr for: "&fq=-facetField:[* TO *]", which right now I
think there's no way to tell rsolr to do).
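
For illustration, here is a minimal sketch of the two pieces involved: the request parameters that ask Solr for the missing count, and the filter query that selects the records with no value. The field name `format_facet` and the helper names `facet_query` and `missing_fq` are made up for the example; nothing here is rsolr or Blacklight API, just plain query-string building.

```ruby
require "uri"

# Build the query string for a faceted request that also asks Solr for a
# count of documents with no value in the facet field.
def facet_query(field)
  params = {
    "q"             => "*:*",
    "facet"         => "true",
    "facet.field"   => field,
    "facet.missing" => "true"
  }
  URI.encode_www_form(params)
end

# Build the filter query that matches only documents with no value in the
# field. The leading "-" negates the open-ended range [* TO *].
def missing_fq(field)
  "-#{field}:[* TO *]"
end

# "format_facet" is a hypothetical field name.
puts facet_query("format_facet")
puts URI.encode_www_form({ "q" => "*:*", "fq" => missing_fq("format_facet") })
```

The first request returns the count to display; the second is what a click on that count would need to send.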

So I'm curious if anyone else is interested in this feature, which might
encourage me to actually finish it up. And I'm wondering, if I want to
submit a patch to rsolr or rsolr-ext... what's the right way to do that?
I figure they have their own separate repos? (svn or git?).

Jonathan

Erik Hatcher

Dec 31, 2009, 11:14:48 AM
to blacklight-...@googlegroups.com

On Dec 30, 2009, at 4:54 PM, Jonathan Rochkind wrote:
> I started looking into a patch to Blacklight to support this, but it
> gets a bit trickier than at first it seemed. I may need to patch rsolr
> or rsolr-ext in order to support retrieval of this "missing facet".
> (You need to ask solr for: "&fq=-facetField:[* TO *]", which right now I
> think there's no way to tell rsolr to do).

Why wouldn't RSolr support this? Doesn't it simply have a pass-
through mechanism so all params get sent to Solr?

Erik

Erik Hatcher

Dec 31, 2009, 11:17:02 AM
to blacklight-...@googlegroups.com

On Dec 30, 2009, at 4:54 PM, Jonathan Rochkind wrote:
> However, Solr supports a "&facet.missing" feature, where you can tell
> Solr to give you a count of all records that actually have no value in
> the facet field. I would like to be able to use this in certain
> circumstances where it ends up being more convenient than dealing with
> it at the indexing stage (It's also perhaps somewhat more efficient on
> the solr end, at least in terms of disk space!).

It'd be minimal, insignificant impact on disk space to simply index a
"missing" value for a field on many documents. Remember, it's an
index. Adding the same word to every page in a book doesn't really
impact the index in the back of the book much.
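
The book-index analogy can be sketched with a toy inverted index. This is only a simplified model for illustration, not how Lucene actually stores its postings, and the sample documents are made up: each distinct term is stored once, followed by the list of documents containing it.

```ruby
# Toy inverted index: each distinct term is stored once, with a list of the
# ids of the documents that contain it.
def build_index(docs)
  index = Hash.new { |hash, term| hash[term] = [] }
  docs.each do |doc_id, terms|
    terms.each { |term| index[term] << doc_id }
  end
  index
end

# Made-up sample data: 1,000 documents that all carry the same placeholder.
docs = (1..1000).map { |id| [id, ["Unknown"]] }.to_h
index = build_index(docs)

puts index.keys.length        # distinct terms stored in the index
puts index["Unknown"].length  # document ids listed under that one term
```

In this model the placeholder adds a single term to the dictionary, plus one compact document reference per record.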

Erik

Jonathan Rochkind

Jan 4, 2010, 10:44:45 AM
to blacklight-...@googlegroups.com
Yep, the disk space issue was maybe a red herring, I'd like to use solr
facet.missing anyway because it makes configuration much more convenient
in some cases for me.

Still curious if anyone else is interested in this, or alternately if
anyone would find a patch to support this welcome or not.

It may not require much change to rsolr, but I need to figure it out.
rsolr right now takes some params in its own custom way that it maps to
solr params; it doesn't always just take raw solr params. I'm not sure
if it just passes on arbitrary raw solr params to solr. And I'm not sure
of the best way to tell it to send "&fq=-facetField:[* TO *]" to Solr in a
way that won't be confusingly inconsistent with its current API. I need
to look at it. Hopefully it won't be too much trouble.

Jonathan

> --
>
> You received this message because you are subscribed to the Google Groups "Blacklight Development" group.
> To post to this group, send email to blacklight-...@googlegroups.com.
> To unsubscribe from this group, send email to blacklight-develo...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/blacklight-development?hl=en.

Matt Mitchell

Jan 4, 2010, 10:52:44 AM
to blacklight-...@googlegroups.com
Jonathan,

You should be able to just set it using :fq. Remember, BL is using rsolr-ext too, so that's where the :filters stuff comes from. We have plans to remove this though, and make the queries explicitly in BL without the :filters/:phrases mappings.

So, if :fq is set, then rsolr-ext will simply append another :fq to the query. I hope anyway ;)

http://github.com/mwmitchell/rsolr-ext/blob/master/lib/rsolr-ext/request.rb
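
As a rough sketch of the "append another :fq" idea: the helper `append_fq` below is hypothetical and is not taken from rsolr or rsolr-ext, whose actual behavior should be checked in the file linked above. It only shows how an extra filter can sit alongside an existing one and end up as a repeated `fq` parameter. The field names are made up.

```ruby
require "uri"

# Hypothetical helper (not part of rsolr or rsolr-ext): add one more filter
# query to a params hash, keeping any that are already there.
def append_fq(params, filter)
  existing = Array(params[:fq])
  params.merge(fq: existing + [filter])
end

# Made-up field names for illustration.
params = { q: "*:*", fq: "language_facet:English" }
params = append_fq(params, "-format_facet:[* TO *]")

# An array value is serialized as a repeated parameter, one fq per filter.
puts URI.encode_www_form(params)
```

Solr applies every `fq` it receives, so the existing filters and the "missing" filter combine as an intersection.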

Matt

Jonathan Rochkind

Jan 4, 2010, 11:08:09 AM
to blacklight-...@googlegroups.com
Thanks, this helps. I agree with the sentiment to remove the special
mapping and just send straight solr parameters, to avoid confusion. So I
won't feel bad doing so now, even if it's inconsistent with the
rsolr-ext stuff (thanks, I didn't realize what came from rsolr-ext and
what came from rsolr), knowing that that's the direction people want to
go anyhow.

So, would anyone oppose a patch to BL to make BL do something more
reasonable if your Solr is set with facet.missing=true? Right now if
your solr is set with facet.missing=true, BL will list a facet value
count without any label at all, and thus without any clickable filter.
I'd like to change it to list "Unknown" (or another word configurable in
Blacklight.config I guess), and allow that to be clickable to filter on
the "missing" facet, in cases where Solr is configured with
facet.missing=true. Obviously if you don't want to use
facet.missing=true on your solr, then there would be no change.
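
A minimal sketch of the idea, not the actual patch: it assumes Solr returns each facet field as a flat list of alternating values and counts, with the facet.missing count carried by an entry whose value is null (nil in Ruby). The helper name `facet_items`, the field name, and the sample counts are all made up for illustration.

```ruby
require "json"

# Turn a flat [value, count, value, count, ...] facet list into display
# items. A nil value marks the facet.missing count; give it a label and a
# filter query so it can be rendered as a clickable link.
def facet_items(field, flat_list, missing_label = "Unknown")
  flat_list.each_slice(2).map do |value, count|
    if value.nil?
      { label: missing_label, count: count, fq: "-#{field}:[* TO *]" }
    else
      { label: value, count: count, fq: "#{field}:\"#{value}\"" }
    end
  end
end

# Made-up response fragment for illustration only.
body = '{"facet_counts":{"facet_fields":{"format_facet":["Book",12,"Map",3,null,7]}}}'
list = JSON.parse(body)["facet_counts"]["facet_fields"]["format_facet"]

facet_items("format_facet", list).each do |item|
  puts "#{item[:label]} (#{item[:count]}) -> fq=#{item[:fq]}"
end
```

The `missing_label` argument stands in for the configurable word; if Solr is not set with facet.missing=true there is no nil entry and the output is unchanged.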

So I'll go ahead and prepare such a patch unless anyone minds?

Jonathan



Naomi Dushay

Jan 4, 2010, 2:29:42 PM
to blacklight-...@googlegroups.com
Sounds good to me, IFF the default behavior is the way it is now.

This could be especially useful if we implement facet checkboxes -
"I'd like anything that isn't an LC or Dewey call number" (which would
be the same as missing in the call number facet) ... when the index
isn't currently set up with an "unknown/other" facet value. (Think
"staff view")

- Naomi
