Throttling rebuild_index?

Subsume

Apr 25, 2012, 11:27:58 PM
to django-haystack
Hey there,

I've got Haystack munching through an index, but I can't seem to find
a box and settings that fit.

Had it on a small box with lots of memory: it was slow, and memory
filled up.

Had it on a big box with lots of memory: still slow, though memory
didn't fill. I think CPU is the problem there, since I run other
things on that box.

I'm looking at --batch-size, but the docs are vague about what it
means or what playing with it does. Could this maybe help me?

Danny Adair

Apr 26, 2012, 12:20:43 AM
to django-...@googlegroups.com
Hi Subsume,

On Thu, Apr 26, 2012 at 15:27, Subsume <sub...@gmail.com> wrote:
>[...]
> Looking at --batch_size but the docs about what this means or what
> playing with it does is vague. Could this maybe help me?

I had the same issue, and --batch-size is indeed what you'll need to
tweak (leaving aside possible memory issues in the indexed attributes
themselves).

The docs say:
"""
BATCH_SIZE - How many records should be updated at once via the
management commands. Default is 1000.
"""
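
For illustration, here is a minimal sketch of where that setting would
sit, assuming the Haystack 2.x-style HAYSTACK_CONNECTIONS dictionary in
settings.py. The Solr engine and URL are placeholder assumptions, not
something from this thread; 200 is only the value reported below as
working on one machine.

```python
# settings.py (sketch; ENGINE and URL are assumed placeholders)
HAYSTACK_CONNECTIONS = {
    "default": {
        "ENGINE": "haystack.backends.solr_backend.SolrEngine",
        "URL": "http://127.0.0.1:8983/solr",
        # Records sent to the backend per batch by the management
        # commands. The documented default is 1000.
        "BATCH_SIZE": 200,
    },
}
```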

1000 killed my machine. In my case 200 was a good value, but you'll
have to monitor your resources to find your own best value.
Also, regarding performance: if your index needs to access the
database a lot, keep the value in healthy proportion to the number of
concurrent database queries. You may need to go as low as 20, time
the indexing run, and work your way towards the optimal value (which
shouldn't change as long as you don't change your index).
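
The "start low, time it, work upwards" approach could be sketched
roughly like this. It is not Haystack code: batch_ranges,
time_batch_sizes and fake_index_batch are hypothetical names, and the
sleep stands in for a real indexing call. It only shows how a batch
size splits the records into chunks and how one might time a full pass
per candidate value.

```python
import time


def batch_ranges(total, batch_size):
    """Yield (start, end) slices covering `total` records in batches."""
    for start in range(0, total, batch_size):
        yield start, min(start + batch_size, total)


def time_batch_sizes(index_batch, total, candidates):
    """Time one full indexing pass for each candidate batch size.

    `index_batch(start, end)` is a hypothetical callable that indexes
    records[start:end]; swap in whatever your setup actually does.
    """
    timings = {}
    for size in candidates:
        started = time.perf_counter()
        for start, end in batch_ranges(total, size):
            index_batch(start, end)
        timings[size] = time.perf_counter() - started
    return timings


if __name__ == "__main__":
    def fake_index_batch(start, end):
        # Placeholder workload, roughly proportional to batch length.
        time.sleep(0.00001 * (end - start))

    results = time_batch_sizes(fake_index_batch, total=1000,
                               candidates=[20, 200, 1000])
    for size, seconds in sorted(results.items()):
        print(f"batch size {size}: {seconds:.3f}s")
```

On a real box you would watch memory and CPU alongside the timings,
since the fastest batch size is not much use if it starves everything
else running there.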

Cheers,
Danny