Re: [ts] Initial search is slow.


Pat Allan

Nov 20, 2012, 8:19:46 AM
to thinkin...@googlegroups.com
Hi Dan

Is Product an STI model, and is the inheritance column collection_type? If so, I'd recommend calling the `set_sphinx_types` method in your Product class and listing all of the subclasses there. Otherwise Thinking Sphinx queries the database to work them out, which can take some time:

class Product < ActiveRecord::Base
  # ...

  set_sphinx_types %w( Product SpecialProduct CustomProduct )

  # ...
end

If I've guessed wrong and you're not using STI (I'm just guessing, based on the SQL query shown in your benchmark output), let me know.
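
To illustrate the idea, here is a toy sketch in plain Ruby of why declaring the types up front would skip the slow query. It is not Thinking Sphinx's code: the `TypeList` class, its methods, and the `slow_query` lambda are hypothetical names made up for this example, and the assumption that the looked-up list is remembered after the first call is only a guess based on the benchmark output.

```ruby
# Toy model only: none of these names come from Thinking Sphinx.
class TypeList
  attr_reader :lookups

  # declared: optional array of class names supplied up front
  # lookup:   block standing in for the slow SELECT DISTINCT query
  def initialize(declared = nil, &lookup)
    @declared = declared
    @lookup   = lookup
    @lookups  = 0
  end

  # Returns the declared list if there is one; otherwise runs the
  # lookup once and remembers the result for later calls.
  def types
    @types ||= @declared || run_lookup
  end

  private

  # Counts how often the stand-in query is actually executed.
  def run_lookup
    @lookups += 1
    @lookup.call
  end
end

# Stand-in for the database query (assumed, not the real one).
slow_query = lambda { %w( Product SpecialProduct CustomProduct ) }

undeclared = TypeList.new(&slow_query)
undeclared.types
undeclared.types

declared = TypeList.new(%w( Product SpecialProduct CustomProduct ), &slow_query)
declared.types
```

In this sketch `undeclared.lookups` ends up at 1 (the lookup runs on the first call only) and `declared.lookups` stays at 0, because the declared list is used and the lookup is never needed.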

Cheers

--
Pat

On 20/11/2012, at 3:43 PM, Dan Milne wrote:

> Hello there,
> I'm looking to get sphinx and thinking-sphinx integrated into my site. However, I've run into a weird problem.
>
> The first search takes just under 400 seconds. Subsequent searches are fast. For example:
>
> irb> Benchmark.realtime { Product.search("Monday") }
> (398699.9ms) SELECT DISTINCT collection_type FROM products WHERE collection_type IS NOT NULL
> => 398.756526954
> irb> Benchmark.realtime { Product.search("Monday") }
> => 0.000110899
>
> Connections using Riddle appear to be instant (after starting a new Rails console):
>
> irb> client = Riddle::Client.new 'sphinxhost.blah.com', 9312
> => #<Riddle::Client:0xb1c9d7c @servers=["sphinxhost.blah.com"], @port=9312, @socket=nil, @key=nil, @offset=0, @limit=20, @max_matches=1000, @match_mode=:all, @sort_mode=:relevance, @sort_by="", @weights=[], @id_range=0..0, @filters=[], @group_by="", @group_function=:day, @group_clause="@weight DESC", @group_distinct="", @cut_off=0, @retry_count=0, @retry_delay=0, @anchor={}, @index_weights={}, @rank_mode=:proximity_bm25, @rank_expr="", @max_query_time=0, @field_weights={}, @timeout=0, @overrides={}, @select="*", @queue=[]>
> irb> Benchmark.realtime { client.query('Monday') }
> => 0.067847711
>
> Telnetting to the Sphinx host connects immediately, although I haven't tried a query through that interface.
>
> Any thoughts on why the first connection is so slow? It looks like some kind of timeout to me, but I can't see what. DNS queries, both forward and reverse, are fast.
>
> I'm using a main + delta setup with deltas being updated every 20 minutes and merged every 20 minutes (offset from the updates by 10 minutes). I'm running searchd from init because I don't like to run long-running processes as a user (init will restart failed processes). I symlink /etc/sphinxsearch/sphinx.conf to my thinking-sphinx generated configuration. I don't think this is related though, as searches do eventually work.
>
> Cheers,
> Dan



Dan Milne

Nov 20, 2012, 9:07:56 PM
to thinkin...@googlegroups.com
Thanks Pat, that's really helpful. Product isn't an STI model, but Collection is a polymorphic association. Taking another look at the output, I had thought it came from Sphinx, but I can see now that it's SQL run against the app's database. That query really will take that long, as the column isn't indexed.

Is there some way to provide a list of models so that it doesn't need to figure them out with this query? I see this issue on GitHub, but it doesn't look like it was ever implemented.

Thanks very much for your work on Thinking Sphinx!

Cheers,
Dan

Pat Allan

Nov 28, 2012, 6:52:09 AM
to thinkin...@googlegroups.com
Hi Dan

Sorry for the slow reply, and I'm afraid I don't have anything great to report: there's no option for skipping that query in Thinking Sphinx at this point in time (and indeed, I've not even implemented anything in the edge branch - 3.0.0.pre - for polymorphic associations). I'd love to say I'll get something there, but I don't have much free time right now, so: patches are very much welcome, if you want to take a stab at it.

--
Pat


Pat Allan

Nov 28, 2012, 7:27:19 AM
to thinkin...@googlegroups.com
One thing that may help, even though I wish it wasn't necessary: add a database index to that collection_type column in the products table.
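
For reference, a minimal sketch of a Rails migration that would add such an index. The migration class name and file name are hypothetical; only the table (products) and column (collection_type) come from the query in the benchmark output. It assumes Rails 3.x-style migrations and is a fragment to drop into an existing app, not a standalone script.

```ruby
# db/migrate/<timestamp>_add_index_to_products_collection_type.rb
# Hypothetical migration name; only the table and column come from the thread.
# On Rails 5 and later the superclass needs a version suffix,
# e.g. ActiveRecord::Migration[5.0].
class AddIndexToProductsCollectionType < ActiveRecord::Migration
  def change
    # Gives the database an index it can use for the
    # SELECT DISTINCT collection_type query on the products table.
    add_index :products, :collection_type
  end
end
```

It would be applied with `rake db:migrate`. Whether the database actually uses the index for that query depends on the database and its query planner, so it's worth re-running the benchmark in a fresh console afterwards.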