creating browseable, drill downable facets

Nicholas Faiz

Mar 22, 2011, 7:12:21 AM
to thinkin...@googlegroups.com
Hi,

I'm trying to set up a lot of facets for a library catalogue. It's the typical faceted search scenario: search 'abc', see n related facets (perhaps one called discipline, with a value of 'histories' and 150 matches); the user then clicks on 'histories' and expects to see those 150 records.

I've been configuring my facets as attributes, too, with the hope that something like the following will work:

Book.facets 'abc', :facet => [:discipline], :with => {:discipline => "histories"}

With the attribute/facet discipline set as a string, though, inaccurate matching was the result (I'm using Postgres).

After some reading I discovered the CRC type, and by converting the facets/attributes to CRCs I can obtain accurate matches through the Rails console:

Book.facets 'abc', :facet => [:discipline], :with => {:discipline => "histories".to_crc32}

But the browsing experience is lost, as all of the facet values come back as integers: under the discipline facet I get a list of numbers instead of names.

I coded this far thinking there'd be a way to convert the CRC representation of 'histories' back to the string representation (it's a Ruby encoding, after all).

What are people doing to solve this problem?

Cheers,
Nicholas

James Healy

Mar 22, 2011, 7:22:40 AM
to thinkin...@googlegroups.com
Here's a faceted search I wrote:

http://www.mosaicresources.com.au/titles?q=george+r+martin

String facets were hard, so I avoided them.

The price and misc facets are all Boolean attributes. I have helpers
that parse the facets object and display the refine link if the true
count for each facet is greater than 0 and less than the current
result count.

The format and subject facets are all on foreign key ints, but
otherwise work in a similar way.
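
The rule described above (show a refine link only when the true count is greater than 0 and less than the current result count) could be sketched roughly as follows. This is only an illustration, not James's actual helpers: the method names, the facet names, and the shape of the facet hash (`{ true => n, false => m }` per Boolean attribute) are all assumptions.

```ruby
# Hypothetical helper: a Boolean facet is worth a "refine" link only if
# selecting it would narrow the results without emptying them.
# `counts` is assumed to look like { true => 12, false => 30 }.
def refinable?(counts, total_results)
  true_count = counts.fetch(true, 0)
  true_count > 0 && true_count < total_results
end

# Return the names of the facets that deserve a refine link.
def refinable_facets(facets, total_results)
  facets.select { |_name, counts| refinable?(counts, total_results) }.keys
end

# Made-up facet data for a search with 42 results.
facets = {
  in_stock:  { true => 12, false => 30 }, # narrows the results: show it
  signed:    { true => 0,  false => 42 }, # would match nothing: hide it
  paperback: { true => 42 }               # would match everything: hide it
}

links = refinable_facets(facets, 42)
```

With this sample data only `:in_stock` passes the check, since the other two facets would either empty the result set or leave it unchanged.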

A library to replace my embarrassingly ugly helpers would be awesome,
will_facet anyone?

James


Clemens Kofler

Mar 22, 2011, 7:24:47 AM
to thinkin...@googlegroups.com
Hi Nicholas,

unfortunately, the CRC is a one-way function – there is no way to directly convert it back to its original string.

You can see a solution I'm using in https://gist.github.com/881075 (using MySQL). A product has a brand name (e.g. Adidas) which is CRC'd so it can be faceted. Note that I've not validated the code (I did cut some unnecessary stuff), so you might need to adjust a thing or two. Also note that, depending on the size of the queried table (products in this case), this might be slow as hell. You can, of course, let Postgres perform its wizardry with cached views, indexed expressions (you could index the resulting CRC values), etc.
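
Because the CRC cannot be reversed, one general workaround is to hash the known values yourself and keep a lookup table from CRC back to string. The sketch below is not the code from the gist; it assumes the CRCs stored in Sphinx match Ruby's `Zlib.crc32`, that the distinct facet values are available up front (for example from a distinct query on the table), and the method names are made up for illustration.

```ruby
require 'zlib'

# Build a CRC32 => original string lookup from the known facet values.
def crc_lookup(values)
  values.each_with_object({}) { |value, map| map[Zlib.crc32(value)] = value }
end

# Relabel a facet hash keyed by CRC integers with the original strings.
# A CRC with no known string is left as the integer.
def relabel(crc_counts, lookup)
  crc_counts.each_with_object({}) do |(crc, count), labelled|
    labelled[lookup.fetch(crc, crc)] = count
  end
end

# Assumed: the distinct discipline values, however they are fetched.
disciplines = ["histories", "sciences"]
lookup = crc_lookup(disciplines)

# Assumed shape of the facet counts coming back from the search.
crc_counts = {
  Zlib.crc32("histories") => 150,
  Zlib.crc32("sciences")  => 7
}

labelled = relabel(crc_counts, lookup)
```

Here `labelled` is keyed by "histories" and "sciences" rather than by integers, and the same table gives the CRC to pass back in `:with` when the user clicks a value. Building the table means reading every distinct value, which is where the speed concern on large tables comes from.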

Hope that helps,
- C.

Clemens Kofler

Mar 22, 2011, 7:27:59 AM
to thinkin...@googlegroups.com
Addendum: String attributes are a feature of Sphinx 1.10-beta (see http://sphinxsearch.com/docs/manual-1.10.html#conf-sql-attr-string). Sadly, they don't have any function aside from storage and retrieval yet. I haven't checked whether this is planned for 1.10 final (or another future version) but you might want to check back on that.

- C.

Clemens Kofler

Mar 22, 2011, 7:30:58 AM
to thinkin...@googlegroups.com
Addendum 2 (sorry, I've got a headache today): Considering that string attributes can be stored, I guess you could store the same field as both an integer attribute (CRC) and a string attribute, and hack TS's facets so that they use the CRC attribute to extract the faceted values from Sphinx but then use the string attribute as the key in the facet hash. You'd have to dig into the facet code a little bit, but the code is not that difficult (IMO).
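
As a rough illustration of that idea, independent of Thinking Sphinx's real internals: if each grouped match carried both the CRC attribute and the stored string attribute, the facet hash could be keyed by the string while the CRC is kept for filtering. The `GroupedMatch` struct, its field names, and both methods below are hypothetical, and the CRC is assumed to match Ruby's `Zlib.crc32`.

```ruby
require 'zlib'

# Hypothetical shape of one grouped match: the CRC used for grouping and
# filtering, the stored string attribute, and the group's match count.
GroupedMatch = Struct.new(:discipline_crc, :discipline_name, :count)

# Key the facet hash by the readable string instead of the CRC.
def facet_hash(matches)
  matches.each_with_object({}) do |match, facets|
    facets[match.discipline_name] = match.count
  end
end

# Translate a clicked facet name back into a CRC filter for drill-down.
def filter_for(matches, name)
  match = matches.find { |m| m.discipline_name == name }
  match && { discipline_crc: match.discipline_crc }
end

# Made-up sample data.
matches = [
  GroupedMatch.new(Zlib.crc32("histories"), "histories", 150),
  GroupedMatch.new(Zlib.crc32("sciences"),  "sciences",  7)
]

facets = facet_hash(matches)
filter = filter_for(matches, "histories")
```

The user sees "histories" in the facet list, and the drill-down query still filters on the integer attribute, which is what Sphinx can match accurately.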

- C.

Nicholas Faiz

Mar 22, 2011, 7:43:38 AM
to thinkin...@googlegroups.com, Clemens Kofler
Thanks Clemens - I'm just having a play with that code now. I appreciate your response.

James - cheers, but I'm dealing with huge XML documents full of strings, so that's what I have to make facetable!

Nicholas

Nicholas Faiz

Mar 22, 2011, 8:01:57 AM
to thinkin...@googlegroups.com, Clemens Kofler
Just as an update - I've experimented with Clemens' code and I think I could make something work, but I have reservations about speed (though I can throw a lot of resources at the machine and the tables aren't overly large).

I've invested some time in a Thinking Sphinx solution, so I don't want to opt out immediately. I'll look into the latest releases of Sphinx, following Clemens' suggestions, and see if I can create the coding workaround he suggests and test it.

Will report back.

Cheers,
Nick
