thinking sphinx result includes attribute filter data


Eric

Oct 2, 2012, 8:27:15 PM
to thinkin...@googlegroups.com
I'm returning a fairly large data set from Sphinx for a client with a lot of filter attributes. I'm trying to get just the ActiveRecord IDs.

I'm seeing significant delays and a lot of memory usage. While debugging this I noticed that in search.rb's compose_ids_results, the @results object contains not only the ActiveRecord IDs but pretty much my entire index, with all the filter attributes. This is a lot of data, and it is very slow for Sphinx to return it to my Rails app.

Is there a way to tell Sphinx not to return everything it knows about each document and to return just the sphinx_internal_id?

Thanks much,
Eric

Chris

Oct 3, 2012, 12:00:21 AM
to thinkin...@googlegroups.com
There is a search_for_ids method.  See this page.

Eric

Oct 3, 2012, 8:05:32 AM
to thinkin...@googlegroups.com
Right, I'm using search_for_ids and it works; Thinking Sphinx only fetches the list of IDs, not ActiveRecord objects. But when it loads the list of matching records from Sphinx, the underlying protocol returns an enormous hash (in my case) for each document, like this:

{:doc=>31848312, :weight=>1, :index=>22813, :attributes=>{"sphinx_internal_id"=>41015, "sphinx_deleted"=>0, "class_crc"=>1924706381, "sphinx_internal_class"=>"GrantRequest", "fip_title_sort"=>"..........", "request_id_sort"=>".............", "project_summary_sort"=>"..............."
.....
}

I have a lot of attributes associated with each document in the index, so the result is enormous. I debugged the Thinking Sphinx code, looking at search.rb's compose_ids_results, and the result actually includes the text of each full-text field as well as all the attributes; I have no idea why. I really need to figure out how to avoid this, as it adds significant overhead and I don't see a need for returning all this data.

Thanks,
Eric

Eric

Oct 3, 2012, 8:06:36 AM
to thinkin...@googlegroups.com
BTW, I'm using Thinking Sphinx 2.0.11.

Pat Allan

Oct 3, 2012, 8:41:39 AM
to thinkin...@googlegroups.com
The :sphinx_select option should do the trick - it's much like a SELECT option for a SQL query... so, try adding this option:

:sphinx_select => 'sphinx_internal_id'

Not sure off the top of my head whether other attributes will be required, but probably not given you're using search_for_ids.
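As a rough sketch of how that option might be passed along with search_for_ids: the helper name ids_only_options and the :per_page value are made up for illustration, and the GrantRequest model name is taken from the hash Eric posted. The commented call assumes a Rails app with Thinking Sphinx 2.x loaded.

```ruby
# Hypothetical helper: default options for an ID-only search.
# :sphinx_select limits which attributes Sphinx sends back per document.
def ids_only_options(extra = {})
  { :sphinx_select => 'sphinx_internal_id' }.merge(extra)
end

options = ids_only_options(:per_page => 1000)

# Assumed usage inside the Rails app (not runnable outside it):
#   ids = GrantRequest.search_for_ids('some query', options)
```

Keeping the select clause in one place makes it easier to add attributes later if some turn out to be required.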

Cheers

--
Pat

Eric

Oct 4, 2012, 1:13:31 PM
to thinkin...@googlegroups.com
Thanks, Pat. As always, you have the answer: sphinx_select does the trick.

One caveat worth mentioning: although full-text searching and attribute filtering worked great, I needed to include the fields I wanted to order by. So when I build my query, I essentially parse the order clause and add its fields to the sphinx_select clause, and everything works perfectly.
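A minimal sketch of that idea, not the actual code from my app: the helper name sphinx_select_for is hypothetical, and it assumes the order clause is a comma-separated string such as 'fip_title_sort ASC, request_id_sort DESC' (or a single attribute symbol). Names starting with @ are skipped on the assumption that they are Sphinx's built-in sort keys rather than index attributes.

```ruby
# Hypothetical helper: build a :sphinx_select string that always includes
# sphinx_internal_id plus every attribute named in the order clause.
def sphinx_select_for(order)
  # Take the first word of each comma-separated part (drops ASC/DESC).
  attrs = order.to_s.split(',').map { |part| part.strip.split(/\s+/).first }
  # Drop empty parts and names starting with "@" (assumed Sphinx built-ins).
  attrs = attrs.compact.reject { |name| name.start_with?('@') }
  (['sphinx_internal_id'] + attrs).uniq.join(', ')
end

order   = 'fip_title_sort ASC, request_id_sort DESC'
options = { :order => order, :sphinx_select => sphinx_select_for(order) }
```

The resulting options hash would then be passed to search_for_ids alongside the query and any filters.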

For a large page, this makes an enormous difference.

Thanks a million!
Eric