Thinking Sphinx running out of memory - throwing errors


mhodgson

May 13, 2008, 9:36:54 AM
to Thinking Sphinx
Getting this error when trying to view a page. Any quick help would be
really appreciated. What exactly is running out of memory?

Processing AccountController#article (for 68.166.121.236 at 2008-05-13 09:33:26) [GET]
  Session ID: c2cda7bc0d2481909eb6e195f15b8787
  Parameters: {"userName"=>"Mach ZRo", "permalink"=>"home-sweet-home-or-are-we-away", "action"=>"article", "controller"=>"account"}

Errno::ENOMEM (Not enough space):
    /vendor/plugins/thinking-sphinx/lib/thinking_sphinx/active_record/delta.rb:75:in `system'
    /vendor/plugins/thinking-sphinx/lib/thinking_sphinx/active_record/delta.rb:75:in `index_delta'
    /opt/csw/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/callbacks.rb:307:in `send'
    /opt/csw/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/callbacks.rb:307:in `callback'
    /opt/csw/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/callbacks.rb:304:in `each'
    /opt/csw/lib/ruby/gems/1.8/gems/activerecord-2.0.2/lib/active_record/callbacks.rb:304:in `callback'
    /vendor/plugins/thinking-sphinx/lib/thinking_sphinx/active_record/delta.rb:32:in `save'
    /app/controllers/account_controller.rb:229:in `article'
    /opt/csw/lib/ruby/gems/1.8/gems/actionpack-2.0.2/lib/action_controller/base.rb:1158:in `send'
    /opt/csw/lib/ruby/gems/1.8/gems/actionpack-2.0.2/lib/action_controller/base.rb:1158:in `perform_action_without_filters'

mhodgson

May 13, 2008, 9:55:28 AM
to Thinking Sphinx
I guess the server was actually running out of memory. I feel like the
indexer system call should be rescued in production to avoid these
kinds of things. If I'm not mistaken, not having the deltas updated
isn't the end of the world and our users certainly would rather see
the page than an error message. I really don't know a ton about this
yet though so I'd love to hear other people's thoughts
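
A minimal sketch of the rescue idea above, not Thinking Sphinx's actual
code. The helper name `index_delta_safely` is hypothetical, and the
`runner` and `log` parameters are assumptions added so the behaviour can
be exercised without a real indexer. It relies on `system` raising
`Errno::ENOMEM` when the process cannot be spawned, as the trace above
shows for this Ruby 1.8 setup; other Ruby versions may return `nil`
instead, which the sketch also treats as a failure.

```ruby
# Hypothetical helper, not part of Thinking Sphinx.
# runner: anything that responds to #call and runs the command
#         (defaults to Kernel#system); log: any IO-like object.
def index_delta_safely(command, runner = method(:system), log = $stderr)
  # Normalise the result: system returns true, false or nil.
  runner.call(command) ? true : false
rescue Errno::ENOMEM => e
  # A stale delta index is preferable to a failed request: log and move on.
  log.puts("delta indexing skipped: #{e.class} (#{e.message})")
  false
end
```

The caller gets `false` when indexing did not happen, so the save itself
can still succeed and the page can still render.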

Pat Allan

May 14, 2008, 11:14:08 PM
to thinkin...@googlegroups.com
I guess my one concern with adding in a catch for an ENOMEM error is
that it's not really a Sphinx issue - it's a system issue. I don't
know if a plugin should be expected to handle that...

--
Pat

mhodgson

May 17, 2008, 12:00:18 AM
to Thinking Sphinx
I can understand that. I'm still a little concerned that I was
consistently getting this error. I *did* have enough memory available
given the 64MB limit imposed by the configuration. I think what may
happen is that during high traffic periods the plugin tries to launch
a number of indexer instances. Just a hunch, though. I actually disabled
deltas so I could get rid of this error.

Any thoughts? I guess a different solution could be to run the delta
indexer with cron as well (just much more frequently).

-Matt

Pat Allan

May 17, 2008, 3:22:01 AM
to thinkin...@googlegroups.com
On 17/05/2008, at 2:00 PM, mhodgson wrote:
> I can understand that. I'm still a little concerned that I was
> consistently getting this error. I *did* have enough memory available
> given the 64MB limit imposed by the configuration. I think what may
> happen is that during high traffic periods the plugin tries to launch
> a number of indexer instances. Just a hunch, though. I actually disabled
> deltas so I could get rid of this error.

Yeah, that makes sense - for every save, a new indexer instance will
fire up. I've been wanting to shift this off into a thread, or ideally
something that hooks into either background-job or backgroundrb if
either is available. If anyone's familiar with either, would love a
patch :)

Otherwise I'll get around to it at some point - unfortunately things
are pretty hectic for me at the moment (preparing to head over to the
US for RailsConf as the start of a round-the-world trip), so can't
provide a timeline as to when I'll have that fix ready.

> Any thoughts? I guess a different solution could be to run the delta
> indexer with cron as well (just much more frequently).

If you were running a delta index regularly, you could just run the
full index too? Of course, that depends on how much load it'll put the
server under.

Cheers

--
Pat

jae...@gmail.com

May 20, 2008, 12:52:31 PM
to Thinking Sphinx
Well, I am also interested in this matter.

Presumably, we can implement deltas as follows.
(I am not sure which one is better.)

1. Just use the Thinking Sphinx way of handling deltas.
2. Improved background delta processing
   - A quick and dirty fix would be to set up a cron job so that only
     one delta indexer is running at a time.
   - Obviously, there will be a slight lag behind "real" time.
3. Soft delta
   - Just do main indexing (along with a Sphinx update call to mark the
     updated records in the index as dirty).
   - When querying Sphinx, do a Sphinx query on the main index, PLUS a
     regular "%a%" SQL query on the incremental data (the delta), then
     merge the search results.
   - This might be complex and UGLY, since we need to generate SQL for
     the RDB model that is equivalent to the Sphinx query.
   - But this will free you from "delta" indexing.
   - There would hardly be any lag behind "real" time, but soft delta
     may be slow.

For now, I would recommend 2 as a quick and dirty fix.
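
One way to get the "only one delta indexer is running" guarantee from
option 2 is a non-blocking file lock around whatever the cron job runs.
This is a sketch under assumptions: the helper name `with_exclusive_lock`
and the lock path are hypothetical, and it presumes all indexer runs go
through the same lock file on one machine.

```ruby
# Hypothetical helper: runs the block only if no one else holds the lock.
# Returns true if the block ran, false if another run was in progress.
def with_exclusive_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT, 0644) do |lock|
    # LOCK_NB makes flock return false at once instead of waiting.
    return false unless lock.flock(File::LOCK_EX | File::LOCK_NB)
    yield
    true
  end
  # File.open closes the file on the way out, which releases the lock.
end
```

A cron-driven script could then wrap its indexer call, for example
`with_exclusive_lock("/tmp/sphinx_delta.lock") { system(delta_command) }`,
where `delta_command` stands for whatever command the application already
uses. Overlapping runs are skipped rather than queued.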

Just out of curiosity, did you put a DB index (CREATE INDEX in SQL) on
the "delta" field?

-Jae

mhodgson

May 20, 2008, 9:25:08 PM
to Thinking Sphinx
Just to be clear, the ideal solution would be something as follows:

1. If a record is saved and indexer isn't running, start it up in the
background and move on
2. If a record is saved and the indexer IS running, schedule a job to
run the indexer once the current one finishes
3. If a record is saved and there is already an indexer job scheduled,
don't do anything, just move on.

Is this correct? I use Backgroundrb all the time, but I'm not sure it
could accomplish this (the scheduling on the fly part). I'll look into
Bj as an alternative. Either way, it is less than desirable to require
one of these for Thinking Sphinx. They are both non-trivial to set up
and manage.
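
For illustration only, the three rules above can be sketched inside a
single Ruby process with a thread and two flags, without Backgroundrb or
Bj. The class name `DeltaCoordinator` and its methods are hypothetical,
and the block passed to it stands in for the real indexer call. It does
not coordinate across several Rails processes, and a run that raises
would leave the coordinator stuck in the running state.

```ruby
require "thread"

# Hypothetical coordinator, not part of Thinking Sphinx.
class DeltaCoordinator
  def initialize(&indexer)
    @indexer = indexer  # block that performs one indexing run
    @mutex   = Mutex.new
    @running = false
    @pending = false
    @worker  = nil
  end

  # Called after every save; returns immediately.
  def request
    @mutex.synchronize do
      if @running
        @pending = true                 # rules 2 and 3: one follow-up at most
      else
        @running = true                 # rule 1: start in the background
        @worker  = Thread.new { work }
      end
    end
  end

  # Blocks until the current worker thread has finished.
  def wait
    worker = @mutex.synchronize { @worker }
    worker.join if worker
  end

  private

  # Runs the indexer, then once more if a request arrived meanwhile.
  def work
    loop do
      @indexer.call
      @mutex.synchronize do
        if @pending
          @pending = false
        else
          @running = false
          return
        end
      end
    end
  end
end
```

Any number of saves during a run collapse into a single follow-up run,
so a burst of traffic cannot start a pile of indexer processes.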

Can anyone else think of a better solution?

-Matt

Pat Allan

May 20, 2008, 9:26:50 PM
to thinkin...@googlegroups.com
That's pretty much my feel for how it *should* work.

I was assuming backgroundrb or bj could take care of this - but if
not, then we'll need to find another way...

--
Pat

James Healy

May 20, 2008, 9:03:28 PM
to thinkin...@googlegroups.com
mhodgson wrote:
> Just to be clear, the ideal solution would be something as follows:
>
> 1. If a record is saved and indexer isn't running, start it up in the
> background and move on
> 2. If a record is saved and the indexer IS running, schedule a job to
> run the indexer once the current one finishes
> 3. If a record is saved and there is already an indexer job scheduled,
> don't do anything, just move on.

The only danger I can see with this is that on a busy site, the delta
indexer might just end up running all the time.

For busy sites, it might be best to just disable the after_commit delta
index rebuild, and do it in cron every 5-10 minutes.

It really depends on whether scalability or immediate index updates are
the priority.

-- James Healy <jimmy-at-deefa-dot-com> Wed, 21 May 2008 10:59:22 +1000

mhodgson

May 20, 2008, 10:03:16 PM
to Thinking Sphinx
The biggest danger I see with running it on a timed interval is that
if a record is deleted in the meantime, any searches could
potentially return a reference to the deleted record, which leads to
Rails throwing a big record-not-found error. Having the delta index
triggered on create and destroy keeps this from happening.
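
If the index is allowed to lag, one defensive option is to tolerate
stale ids when loading results instead of raising. The sketch below is
an assumption, not how Thinking Sphinx loads records: the helper name
and the `lookup` parameter are hypothetical. In a Rails 2.0 app, `lookup`
could wrap a conditions-based find (for instance `find_all_by_id(ids)`
on a hypothetical `Article` model), which should return only the rows
that still exist rather than raising.

```ruby
# Hypothetical helper: keeps the search order, drops ids whose rows are gone.
# lookup: any callable that returns the records still present for a list
#         of ids, omitting missing ones instead of raising.
def existing_in_search_order(ids, lookup)
  by_id = {}
  lookup.call(ids).each { |record| by_id[record.id] = record }
  # Map back through the original id list so ranking is preserved.
  ids.map { |id| by_id[id] }.compact
end
```

The trade-off is that a page of results may come back shorter than
requested until the next index run removes the deleted records.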

Hmmmm, this is a tough one. I think the ideal solution would really be
a separate little server that ran all the time in the background and
could take commands from rails to run various indexes. This "indexer"
server could manage all the indexing requests and intelligently call
the indexer when appropriate. I would imagine it would be pretty easy
to gut backgroundrb for something like this.

What would be even slicker would be to extend the indexer itself to
allow for this sort of functionality. No idea if this is possible.

I know this is probably more work than we were thinking, but it seems
like it would be the ideal solution.

What do you guys think?

-Matt