Couple of questions

3 views
Skip to first unread message

esp

unread,
Apr 26, 2009, 8:44:11 PM4/26/09
to Thinking Sphinx
Hi all -
I've recently incorporated Sphinx 0.9.8 and the latest thinking
sphinx code to my site
and have a couple of questions.


1) Is there any way to add static data into a model. For example if
a
table with a row that has a name field that is "Loch Ness" is there a
way
that I can add static text to this so it will return results for the
keyword "lake".

2) I attempted to use the delta mechanism but after added the delta
column
and the set propertly :delta => true into my define_index section of
my
model, indexing for the first time never completed. I Hvae one table
with
close to 7 million rows that takes maybe a couple of minutes to index
but
after adding the delta and trying to create a fresh index, i let it
run for
3 hours and it never even got to the point in the console output that
said
it was processing. It just showed the initial "creating index...".

Thanks again for any advice and for an excellent plugin.

Mike


http://www.everysingleplace.com/

James Healy

unread,
Apr 26, 2009, 9:24:22 PM4/26/09
to thinkin...@googlegroups.com
Hi Mike,

esp wrote:
> 1) Is there any way to add static data into a model. For example if
> a
> table with a row that has a name field that is "Loch Ness" is there a
> way
> that I can add static text to this so it will return results for the
> keyword "lake".

You can pass raw SQL fragments to TS in your define_index block.
Assuming you're using mysql:

indexes "IF(`name` = 'Loch Ness', 'lake','')", :as => :landmark

> 2) I attempted to use the delta mechanism but after added the delta
> column
> and the set propertly :delta => true into my define_index section of
> my
> model, indexing for the first time never completed. I Hvae one table
> with
> close to 7 million rows that takes maybe a couple of minutes to index
> but
> after adding the delta and trying to create a fresh index, i let it
> run for
> 3 hours and it never even got to the point in the console output that
> said
> it was processing. It just showed the initial "creating index...".

Is there an index on your delta column?

-- James Healy <jimmy-at-deefa-dot-com> Mon, 27 Apr 2009 11:20:24 +1000

signature.asc

esp

unread,
Apr 27, 2009, 7:50:27 AM4/27/09
to Thinking Sphinx
Thanks James - comments inline

On Apr 26, 9:24 pm, James Healy <ji...@deefa.com> wrote:
> Hi Mike,
>
> esp wrote:
> > 1) Is there any way to add static data into a  model.   For example if
> > a
> > table with a row that has a name field that is "Loch Ness" is there a
> > way
> > that I can add static text to this so it will return results for the
> > keyword "lake".
>
> You can pass raw SQL fragments to TS in your define_index block.
> Assuming you're using mysql:
>
>   indexes "IF(`name` = 'Loch Ness', 'lake','')", :as => :landmark
>
Not sure I follow the synax - I'm using PostgreSQL.

> > 2) I attempted to use the delta mechanism but after added the delta
> > column
> > and the set propertly :delta => true into my define_index section of
> > my
> > model, indexing for the first time never completed.   I Hvae one table
> > with
> > close to 7 million rows that takes maybe a couple of minutes to index
> > but
> > after adding the delta and trying to create a fresh index, i let it
> > run for
> > 3 hours and it never even got to the point in the console output that
> > said
> > it was processing.  It just showed the initial "creating index...".
>
> Is there an index on your delta column?
>
Doh! Forgot to add an index. That's gotta be it. Thanks!
> -- James Healy <jimmy-at-deefa-dot-com>  Mon, 27 Apr 2009 11:20:24 +1000
>
>  signature.asc
> < 1KViewDownload

James Healy

unread,
Apr 27, 2009, 8:00:08 AM4/27/09
to thinkin...@googlegroups.com
esp wrote:
> > esp wrote:
> > > 1) Is there any way to add static data into a  model.   For example if
> > > a
> > > table with a row that has a name field that is "Loch Ness" is there a
> > > way
> > > that I can add static text to this so it will return results for the
> > > keyword "lake".
> >
> > You can pass raw SQL fragments to TS in your define_index block.
> > Assuming you're using mysql:
> >
> >   indexes "IF(`name` = 'Loch Ness', 'lake','')", :as => :landmark
> >
> Not sure I follow the synax - I'm using PostgreSQL.

An indexes call can just be a string that contains a raw SQL statement.
Pretend you're adding a manual field to an SQL SELECT.

The comments in lib/thinking_sphinx/index/builder.rb have (some) further
information on this feature.

The IF() function is for MySQL, but I'm sure PostgreSQL will have
something equivalent.

-- James Healy <jimmy-at-deefa-dot-com> Mon, 27 Apr 2009 21:55:08 +1000

Pat Allan

unread,
Apr 27, 2009, 8:02:21 AM4/27/09
to thinkin...@googlegroups.com
Could try it with a CASE instead:

indexes "CASE name WHEN 'Loch Ness' THEN 'lake' ELSE '' END", :as
=> :landmark

Although I'm sure there's an equivalent IF condition for PostgreSQL,
but CASE statement is what I've come across first :)

--
Pat

David Wennergren

unread,
Apr 27, 2009, 5:29:50 PM4/27/09
to thinkin...@googlegroups.com
I'm experiencing the same increase in time when building my index after activating delta index (with delayed_job). I have a index on my delta column.

Before it took a few minutes. Now it stops at:

indexing index 'article_core'...

After an hour or so it continues to build the index with the same speed as before using delta index.

I'm using postgres and the 0.9.9 branch of TS with Sphinx 0.9.9-rc2

/David

David Wennergren

unread,
Apr 29, 2009, 4:59:25 PM4/29/09
to thinkin...@googlegroups.com
I found the reson for my speed problem when rebuilding the index and using deltas. The problem was the reset_query method in the DefaultDelta class. It currently look like this:

def reset_query(model)
   "UPDATE #{model.quoted_table_name} SET " +
   "#{@index.quote_column(@column.to_s)} = #{adapter.boolean(false)}"
end

This means that it will update every row in the table regardless of whether delta is true or false. If I add an where statement only updating rows where delta is true is much much faster:

def reset_query(model)
  "UPDATE #{model.quoted_table_name} SET " +
  "#{@index.quote_column(@column.to_s)} = #{adapter.boolean(false)}" +
  " WHERE #{@index.quote_column(@column.to_s)} = #{adapter.boolean(true)}"
end

/David
--
David Wennergren

Newsdesk AB
Bondegatan 21
S-116 33 Stockholm
Sweden

+46 (0)8 50 900 211
da...@newsdesk.se

Pat Allan

unread,
May 8, 2009, 7:54:00 PM5/8/09
to thinkin...@googlegroups.com
Thanks for pointing this out David (and others - I know it's been raised before). I've just pushed the change to GitHub.

Cheers

-- 
Pat

Reply all
Reply to author
Forward
0 new messages