Letting scorch async tasks run long enough


Pavel Bazika

Mar 3, 2021, 2:35:14 PM
to bleve
Hi,

I'm using bleve to search in email mailboxes. In general each user has several indices, and there are many users. I can't keep the indices of all users open, but I also don't close them immediately, for performance reasons - a user often generates several search/index requests in a row. So there's a kind of cache, currently driven by last access time: an index not accessed for some time is closed.

I noticed that scorch has some background async tasks, especially the mergerLoop. I don't know them in much detail, but clearly the merger's job is interrupted once the index expires from the cache.

Is it worth caring about the merger job and waiting for its current task to finish? (I know it's a loop that never finishes; I mean some part of its work.) And is it even possible?

As I wrote, I don't know exactly what those async tasks do, so maybe this is just an unnecessary concern.

Thanks

Pavel

Marty Schoch

Mar 3, 2021, 5:05:13 PM
to bl...@googlegroups.com
This is a great question.

First, I would start by saying that by design you shouldn't really have to worry about this. Applications should be able to call Close(), expect these routines to wrap up their work in reasonable time and exit cleanly, and with a subsequent call to open the index again, everything continues normally.

But, for the use case you described, it certainly helps to know more about what is going on internally, and possibly tweak things with that in mind.

The asynchronous tasks are:

- persister - takes incoming segments, which are in memory, and records them to disk
- merger - takes multiple segments and combines them into a single segment
- introducer - coordinates the activity of the application (indexing/deleting data), the persister, and the merger, to ensure that the index evolves as a sequence of valid snapshots

When an application calls Close(), each of these goroutines is signaled via a channel that it should exit.  The persister and introducer are relatively tight loops, so they simply select on this channel and exit when necessary.  The merger loop sometimes runs for long periods of time, so it has additional logic that lets it stop in the middle of an ongoing merge.  Waiting for the merger to complete would result in Close() taking longer than applications expect.
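
To make that concrete, here is a minimal sketch of the shutdown pattern being described (illustrative Go only, not scorch's actual source): a tight loop can select on the close channel between units of work, while a long-running task has to poll the channel mid-operation.

    // Illustrative sketch of the shutdown pattern described above;
    // not scorch's actual source.

    // A tight loop (like the persister or introducer) selects on the
    // close channel between units of work.
    func workerLoop(closeCh chan struct{}, workCh chan func()) {
        for {
            select {
            case <-closeCh:
                return // Close() was called; exit cleanly
            case work := <-workCh:
                work() // one bounded unit of work, then check again
            }
        }
    }

    // A long-running task (like a file merge) must also poll the
    // channel between steps so it can abandon work in progress.
    func runSteps(closeCh chan struct{}, steps []func()) bool {
        for _, step := range steps {
            select {
            case <-closeCh:
                return false // interrupted mid-task
            default:
            }
            step()
        }
        return true
    }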

Not directly related to your question, but something people may be wondering at this point: will all data be persisted first if I call Close()?  The answer is that there is nothing that specifically ensures that.  To guarantee data is persisted, you always have to follow the same rules.  If you use the default scorch setting (unsafe_batch=false), then as soon as a call to Index/Delete/Batch returns, the data has been persisted.  If you use unsafe_batch=true, then you must rely on the batch persisted callback.  For most applications, this means that if all of your calls to Index have returned before you call Close, your data is safe.
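
For illustration, here is a sketch of the two modes (paths are placeholders, and this assumes bleve v2 import paths and that scorch accepts the unsafe_batch key via the config map of bleve.NewUsing):

    package main

    import (
        "log"

        "github.com/blevesearch/bleve/v2"
        "github.com/blevesearch/bleve/v2/index/scorch"
    )

    func main() {
        // Default mode (unsafe_batch=false): once a call to
        // Index/Delete/Batch returns, the data has been persisted.
        safeIdx, err := bleve.NewUsing("safe.bleve", bleve.NewIndexMapping(),
            scorch.Name, scorch.Name, nil)
        if err != nil {
            log.Fatal(err)
        }
        defer safeIdx.Close()

        // unsafe_batch=true: calls may return before persistence, so
        // you must rely on the batch persisted callback before
        // trusting that the data is durable.
        unsafeIdx, err := bleve.NewUsing("unsafe.bleve", bleve.NewIndexMapping(),
            scorch.Name, scorch.Name,
            map[string]interface{}{"unsafe_batch": true})
        if err != nil {
            log.Fatal(err)
        }
        defer unsafeIdx.Close()
    }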

Back to the original question: since a merge can take longer to run (it's writing larger files), and since its work can be interrupted by a call to Close, there are some degenerate cases to consider.  Imagine a use case where you open an index, quickly index some data, ensure it's persisted, and then close.  If you keep repeating this, the introducer/persister always succeed at recording new segments, but the merger never has the opportunity to merge them into more efficient larger segments.  Over time this can become a problem.
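
Sketched as code, the degenerate cycle looks something like this (the path and document are placeholders):

    // Anti-pattern: each iteration persists another small segment, but
    // Close() interrupts the merger before it can consolidate them, so
    // the segment count grows over time.
    doc := map[string]interface{}{"body": "hello"} // placeholder document
    for i := 0; i < 1000; i++ {
        idx, err := bleve.Open("mailbox.bleve")
        if err != nil {
            log.Fatal(err)
        }
        // With the default unsafe_batch=false, the data is persisted
        // once Index returns...
        if err := idx.Index(fmt.Sprintf("msg-%d", i), doc); err != nil {
            log.Fatal(err)
        }
        // ...but the merger likely never had a chance to run.
        idx.Close()
    }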

As of today, there isn't any specific way to detect or avoid this.  However, we expose a lot of stats, and clever use of them could be applied to this problem.  Here is an example of a different, but somewhat related, task: force-merging the index to a single segment:


I bring up this example because it is functional, but crude.  We hope to evolve it into something cleaner, and eventually expose a simpler API.  I would expect a similar path for this use case: first we see what is possible with stats today, and maybe it evolves into something better.

So to start, since you mentioned that you're tracking each index's last-accessed time, you could additionally track the TotFileMergeLoopEnd stat.  I think if the merger is completely idle, that count will remain the same.  But you might also need to check TotFileMergeLoopBeg, to distinguish the case where a single long merge is running the entire time.  Anyway, I'd suggest playing around with these stats, as sketched below.  Please don't hesitate to open an issue on GitHub if you need clarity on the stats or want to run the implementation by us.
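
As a sketch of what that tracking could look like, here is a hypothetical helper; it assumes the scorch counters surface under the "index" key of bleve's StatsMap() as uint64 values:

    // Hypothetical helper: report whether the merger appears idle.
    // Assumes scorch's counters are nested under the "index" key of
    // StatsMap() and are stored as uint64 values.
    func mergerLooksIdle(idx bleve.Index, prevLoopEnd uint64) (idle bool, loopEnd uint64) {
        im, ok := idx.StatsMap()["index"].(map[string]interface{})
        if !ok {
            return false, prevLoopEnd
        }
        beg, _ := im["TotFileMergeLoopBeg"].(uint64)
        end, _ := im["TotFileMergeLoopEnd"].(uint64)
        // Idle when every loop that began has also ended (no long
        // merge is in flight), and the end counter hasn't moved since
        // we last looked (nothing new has completed either).
        return beg == end && end == prevLoopEnd, end
    }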

One further thing to consider for your use case: if there are times when users are only searching, you could open the index read-only, avoiding any costs related to these maintenance goroutines.  Ideally you'd pair this with the other approach, because you really only want to skip the maintenance when you know there isn't any more maintenance to be done.  But the two ideas support each other: if you have fewer indexes open for writing, you might be able to wait longer to let them finish merging before closing.
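
In code, that might look like the following (a sketch, assuming the "read_only" runtime config key is honored by scorch when passed through bleve.OpenUsing):

    // Open an existing index read-only, so the maintenance goroutines
    // have no write work to do. Assumes scorch honors the "read_only"
    // runtime config key.
    idx, err := bleve.OpenUsing("mailbox.bleve", map[string]interface{}{
        "read_only": true,
    })
    if err != nil {
        log.Fatal(err)
    }
    defer idx.Close()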

I realize this is long, and doesn't give you a simple solution, but it's full of interesting engineering trade-offs.  Hopefully this helps,
marty


Pavel Bazika

Mar 4, 2021, 4:54:58 AM
to bl...@googlegroups.com
Hi Marty,

it's a nice piece of documentation, many thanks.

Pavel


On Wed, Mar 3, 2021 at 11:05 PM Marty Schoch <marty....@gmail.com> wrote: