On Fri, Feb 4, 2011 at 8:13 AM, boult...@googlemail.com
<boult...@gmail.com> wrote:
> On Feb 3, 12:40 pm, Bruno Rezende <brunovianareze...@gmail.com> wrote:
...
>>
>> http://trac.xapian.org/ticket/250 (replace_document should make
>> minimal changes to database file).
>>
>> It seems this ticket is exactly about what I'm doing. The changes were
>> backported to version 1.0.18. Are these changes available in the
>> Xapian version that xappy uses (the one that get_xapian.py,
>> http://code.google.com/p/xappy/source/browse/trunk/libs/get_xapian.py,
>> retrieves)?
>
> Yes, they're definitely included in that version. I assume you're
> using chert databases, too (the improvements didn't work so well with
> flint, due to the way document lengths were stored).
Yes, I'm using chert.
>
> What sort of speed do you get if you change your code to delete the
> old document and then add it back, rather than replacing it? I'd
> expect that to be much slower, since that's what the old code path did
> (ie, before xapian ticket 250 was fixed).
>
I don't have this info now. I'll do a test and report back.
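
A rough sketch of how such a comparison might be timed. It assumes the
connection object exposes replace(doc), delete(doc_id), add(doc) and
flush(), as I understand xappy's IndexerConnection to do, and that each
document carries its id in an `id` attribute. The helper name
time_updates is made up for illustration.

```python
import time


def time_updates(conn, docs, use_replace=True):
    """Return the seconds taken to update `docs` through `conn`.

    Assumption: `conn` has replace(doc), delete(doc_id), add(doc) and
    flush() methods, and each doc has an `id` attribute.
    """
    start = time.perf_counter()
    for doc in docs:
        if use_replace:
            # New code path: replace the document in place.
            conn.replace(doc)
        else:
            # Old code path: delete the document, then add it back.
            conn.delete(doc.id)
            conn.add(doc)
    # Flush once at the end so both variants pay the same commit cost.
    conn.flush()
    return time.perf_counter() - start
```

Running it once with use_replace=True and once with use_replace=False
over the same documents, each against a fresh copy of the database,
should give comparable numbers.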
> Are you flushing frequently when doing this update, or not at all
> during the update?
>
I flush every 10K items. I chose this value to try to keep memory
usage low; we had a case where memory usage went up to 15GB. But I
don't think it worked very well: a few days ago we still had a case of
4GB memory usage.
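
The flushing pattern is roughly the following. This is a simplified
sketch, not the actual indexing code: `conn` stands in for the indexer
connection and is only assumed to have replace(doc) and flush()
methods, and the name index_in_batches is made up for illustration.

```python
def index_in_batches(conn, docs, batch_size=10000):
    """Replace each doc through `conn`, flushing every `batch_size` docs.

    Assumption: `conn` has replace(doc) and flush() methods.
    Returns the number of documents processed.
    """
    count = 0
    for doc in docs:
        conn.replace(doc)
        count += 1
        if count % batch_size == 0:
            # Commit pending changes so they stop accumulating in memory.
            conn.flush()
    if count % batch_size != 0:
        # Commit the final, partial batch.
        conn.flush()
    return count
```

A smaller batch_size should lower peak memory at the cost of more
frequent commits.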
> What sort of speed do you get when doing the initial update?
By initial update, do you mean when I add the documents for the first
time? I'll need to check on this machine.
>
> One thought occurs; do you have a query cache enabled on this index?
> I think that may be being updated when you call replace(), and could
> account for some of the time.
>
Yes, I have. I'll try disabling the cache and test this too.
> It's possible that there's a lot of unnecessary parsing going on in
> python here; I think some profiling output will be needed to dig into
> this (at the least, finding out whether the time is being spent in
> Python, or in the Xapian C++ code).
>
OK. I'll do some more testing and see if I can get some profiling info.
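
One way to collect that profiling output is Python's standard cProfile
module. The sketch below wraps a single call and returns a report
sorted by time spent inside each function itself; profile_call is a
made-up helper name, and the function being profiled would be whatever
drives the update loop.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func(*args, **kwargs) under cProfile.

    Returns (result, report), where report is the text of the 20 entries
    with the highest own time (excluding time in sub-calls).
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("tottime").print_stats(20)
    return result, out.getvalue()
```

In the report, I would expect time spent in the Xapian C++ code to be
attributed to the binding-level calls, while time spent parsing in
Python shows up under the Python function names.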
> --
> Richard
>
--
Bruno