Dynamic Facets and Indexing Lots of Content

Dylan Vaughn

Apr 13, 2010, 1:27:15 PM
to ruby-s...@googlegroups.com
Hey all,

I have two issues I would like some feedback on.

1. Dynamic Facets

In one of the projects I am working on, users of the application have
the ability to create and manage search filters (or categories). For
example, an administrator could set up a category named 'Fruit' with
values of 'Banana', 'Apple', 'Orange', and 'Grape'. On the search
results page, 'Fruit' should be displayed as a facet category with
counts for matches for 'Banana', 'Apple', 'Orange', and 'Grape' on any
text field of the document.

Currently I am managing this at index time, and again whenever an admin
changes one of the categories (by searching each document for the
custom facet terms and then tagging it with whatever matched). Is
there a way to manage this at query time instead?
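
For readers following along, the index-time tagging described above might look roughly like the plain-Ruby sketch below. The `CATEGORIES` constant and the `tag_document` method are hypothetical names used only for illustration, and the whole-word, case-insensitive matching is an assumption, not necessarily how the application does it.

```ruby
# Hypothetical admin-defined categories: category name => list of values.
CATEGORIES = {
  "Fruit" => ["Banana", "Apple", "Orange", "Grape"]
}

# Return a hash of category name => values that occur in the text.
# Assumption: matching is case-insensitive and on single whole words only.
def tag_document(text, categories)
  words = text.downcase.scan(/\w+/)
  categories.each_with_object({}) do |(name, values), tags|
    matched = values.select { |value| words.include?(value.downcase) }
    tags[name] = matched unless matched.empty?
  end
end

tags = tag_document("Apple pie with a grape glaze", CATEGORIES)
```

The resulting tags would then be stored with the document at index time, which is why every category change forces the documents to be re-tagged.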

2. Models with a large, rarely changing text field

In this application we have a model that represents a document. We
extract the text from the document and index it using Solr. The text
of the document (which can be fairly large) rarely changes, but
other metadata associated with the document changes frequently (and is
also searchable). I would like to avoid reindexing the large text
field when only the metadata changes. I was considering splitting the
text of the document into a separate but related model, but wanted to
see if anyone else has solved a similar problem.

Thanks!

Dylan

Mat Brown

Apr 13, 2010, 1:43:20 PM
to ruby-s...@googlegroups.com
Hey Dylan,

1. Have you looked at dynamic fields in Sunspot? Your problem sounds
like the use case for which that feature was designed. It'd look
something like this (without knowing exactly how your model is
structured):

    class MyClass
      searchable do
        # One dynamic field per custom category, keyed by category id.
        dynamic_integer :custom_categories, :multiple => true do
          custom_categories.inject({}) do |map, custom_category|
            map[custom_category.id] = custom_category_values_for(custom_category)
            map # inject needs the accumulator returned from the block
          end
        end
      end
    end

    search = MyClass.search do
      dynamic(:custom_categories) do
        facet(some_custom_category.id)
      end
    end

    facet = search.facet(:custom_categories, some_custom_category.id)
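
The hash-building step inside the searchable block relies on `inject`, which feeds the block's return value back in as the next accumulator, so the block has to end by returning the map. A standalone sketch of just that step, using a hypothetical `Category` struct in place of the real model:

```ruby
# Stand-in for a custom category record (hypothetical, for illustration only).
Category = Struct.new(:id, :value_ids)

categories = [
  Category.new(1, [10, 11]),
  Category.new(2, [20])
]

# Build a hash of category id => value ids, the shape a dynamic field block returns.
values_by_category = categories.inject({}) do |map, category|
  map[category.id] = category.value_ids
  map # inject passes the block's return value on as the next accumulator
end
```

Each key in the hash becomes its own dynamic field, which is what lets the search facet on a single category id.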

2. Unfortunately not -- Lucene doesn't have a concept of partial
updates. In fact, it doesn't have a concept of updates at all -- under
the hood, Solr just deletes the document and re-adds it. I'm not aware
of any workarounds for this, but I'd be interested to know if others
do. One thing you could potentially look into is external data sources
in Lucene (using that to pull in the often-changing metadata at query
time, so you don't have to update Solr each time the metadata
changes), but I'm not sure how practical that is or how much support
Solr has for it.
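
To make the delete-and-re-add behaviour concrete, here is a toy in-memory model in plain Ruby. `ToyIndex` is a made-up class and not Solr's or Sunspot's API; it only illustrates why changing one field means resending the whole document, large text included.

```ruby
# Toy model of an index whose only "update" is delete + re-add.
class ToyIndex
  def initialize
    @docs = {}
  end

  # Adding a document with an existing id replaces the stored one wholesale.
  def add(id, fields)
    @docs.delete(id)       # drop the old document, if any
    @docs[id] = fields.dup # store the new one exactly as supplied
  end

  def get(id)
    @docs[id]
  end
end

index = ToyIndex.new
index.add(1, :title => "Report", :body => "very long extracted text")

# Re-adding with only the changed metadata loses the body field,
# so the caller has to resend the full document every time.
index.add(1, :title => "Report v2")
stored = index.get(1)
```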

Mat
