reindexing after ActiveRecord update_all is called

Matt Murphy

Aug 12, 2008, 3:46:37 AM
to Thinking Sphinx
I have delta indexes enabled and I'm not sure if the proper behavior
is occurring or not.

I have updated a large number of records by calling ActiveRecord's
update_all on a model that is indexed with ts.

It seems that the boolean delta column is not being set to true by
update_all.

Should I manually set delta=true as part of the update_all? If so, is
there something I should then do to trigger the delta update
(after the update_all call)?

Or is there a better approach?

Any suggestions would be much appreciated.

Xavier Noria

Aug 12, 2008, 7:17:11 AM
to thinkin...@googlegroups.com
On Tue, Aug 12, 2008 at 9:46 AM, Matt Murphy <mmm...@gmail.com> wrote:

> I have updated a large number of records by calling ActiveRecord's
> update_all on a model that is indexed with ts.
>
> It seems that the boolean delta column is not being set to true by
> update_all.
>
> Should I manually set delta=true as part of the update_all? If so, is
> there something I should then do to trigger the delta update
> (after the update_all call)?

Toggling the delta flag is done in a before_save callback, and delta
indexing is triggered in an after_commit hook. The problem is that
update_all bypasses both.

I think the best you can do as of this version of TS is to include
delta = 1 in update_all and then call send(:index_delta) on any one of
the updated records (one call is enough to have all of them indexed).
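
To make the callback bypass concrete, here is a minimal plain-Ruby sketch. It is a toy model only: ToyRecord, its rows list, and the delta_runs counter are hypothetical stand-ins invented for illustration, not ActiveRecord or Thinking Sphinx code. Only the names update_all, delta, and index_delta come from the discussion above, and the real index_delta does far more than bump a counter.

```ruby
# Toy stand-in for an indexed model. NOT ActiveRecord or Thinking Sphinx;
# it only mimics the behaviour described in this thread.
class ToyRecord
  attr_accessor :title, :delta

  @rows = []        # hypothetical in-memory "table"
  @delta_runs = 0   # how many times the toy delta reindex has run

  class << self
    attr_reader :rows, :delta_runs

    def create(title)
      row = new(title)
      @rows << row
      row
    end

    # Like update_all: writes straight to every row and runs no callbacks.
    def update_all(attrs)
      @rows.each do |row|
        attrs.each { |name, value| row.public_send("#{name}=", value) }
      end
      @rows.size
    end

    def record_delta_run
      @delta_runs += 1
    end
  end

  def initialize(title)
    @title = title
    @delta = false
  end

  # Like save: flag the row first (the before_save step), then trigger
  # the delta reindex (the after_commit step).
  def save
    self.delta = true
    index_delta
    true
  end

  private

  # Stand-in for the delta reindex: a single run covers every flagged row.
  def index_delta
    self.class.record_delta_run
  end
end

first_row = ToyRecord.create("first")
ToyRecord.create("second")

# A plain bulk update skips the callbacks, so no row gets flagged.
ToyRecord.update_all(title: "renamed")
puts ToyRecord.rows.map(&:delta).inspect   # [false, false]

# The workaround: set the flag in the bulk update itself, then trigger
# the (private) reindex once through any single record.
ToyRecord.update_all(title: "renamed again", delta: true)
first_row.send(:index_delta)
puts ToyRecord.rows.map(&:delta).inspect   # [true, true]
puts ToyRecord.delta_runs                  # 1
```

In a real application the equivalent would presumably be an update_all call whose SET clause also includes delta = 1, followed by send(:index_delta) on one of the affected records, as described above.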

Pat Allan

Aug 12, 2008, 12:06:40 PM
to thinkin...@googlegroups.com
Xavier's got it exactly right - include delta = 1 in your update_all
call, and then you'll only need to fire index_delta on one of them -
that'll add them all to the delta index.

Cheers

--
Pat

Morten

Aug 12, 2008, 1:08:28 PM
to Thinking Sphinx

Piggyback question: What's the overhead of delta indexing seen from
the perspective of a Rails application (ballpark)?

Br,

Morten

Matt Murphy

Aug 12, 2008, 2:10:54 PM
to thinkin...@googlegroups.com
Excellent. By the way, is it possible to include :delta => 1 in my without clause to make sure I don't end up with any such records in my search results?

Coilcore

Aug 12, 2008, 4:34:22 PM
to Thinking Sphinx
Morten -

Delta indexing is cheap on searches, but fairly heavy on changes.

Sphinx is set up to be able to search across multiple indexes and the
delta index is just another index.

For changes it's unfortunately fairly heavy: the current mechanism is
to spawn a child process via a system call to the indexer binary that
comes with Sphinx.

Pat Allan

Aug 12, 2008, 5:28:49 PM
to thinkin...@googlegroups.com
You'll need to add delta as an attribute first:

has delta

-- 
Pat

James Healy

Aug 12, 2008, 7:59:07 PM
to thinkin...@googlegroups.com
Coilcore wrote:
> Delta indexing is cheap on searches, but fairly heavy on changes.
>
> Sphinx is set up to be able to search across multiple indexes and the
> delta index is just another index.
>
> For changes it's unfortunately fairly heavy: the current mechanism is
> to spawn a child process via a system call to the indexer binary that
> comes with Sphinx.

There's a nice fork on github that allows "offline delta indexing".

http://github.com/bassnode/thinking-sphinx/commit/e54af20c434a5929d0ef77822b26a721ad1ac8be

With the new setting enabled, delta will still be set to 1 in the
database when a model changes, but rebuilding the delta index is left up
to the developer.

Depending on the application, if changed records don't need to be
searchable instantly, you could have a cron job set up to rebuild the
delta indexes every 5-10 minutes. This would avoid the expensive shell
call from Rails.

-- James Healy <jimmy-at-deefa-dot-com> Wed, 13 Aug 2008 09:54:08 +1000
