mu4e indexing is slow - possible to avoid cleanup?


Toby Gee

Apr 10, 2016, 9:49:15 AM
to mu-discuss
I'm a new mu/mu4e user, and while it is mostly amazing, I'm finding that every few minutes mu4e doesn't respond to any search (even opening the inbox) for a minute or so.

After some playing around, I think I have figured out that the problem is the cleanup phase of indexing - I'm guessing that it's periodically doing this, and it's what's causing the freeze. I have a gmail account with just over 260k mails in the "All Mail" folder, and 20 or so in the inbox, so I'd like to just be refreshing the inbox fairly often and the "All mail" folder only once or twice a day. 

I've put the times from the command line below - I don't know if they are typical (and I don't know why the cleanup involves twice as many emails as are actually there), but if they are, my question is: how can I have mu4e run with --nocleanup?



I put a .noupdate file in the "All Mail" folder, and if I run mu index with that there, I get:

indexing messages under /Users/tsg20/mbsync [/Users/tsg20/.mu/xapian]
\ processing mail; processed: 1500; updated/new: 0, cleaned-up: 0
cleaning up messages [/Users/tsg20/.mu/xapian]
\ processing mail; processed: 530662; updated/new: 0, cleaned-up: 3
elapsed: 59 second(s), ~ 8994 msg/s

If I run mu index with --nocleanup, it's instant:

mu index --maildir=mbsync --nocleanup
indexing messages under /Users/tsg20/mbsync [/Users/tsg20/.mu/xapian]
| processing mail; processed: 1558; updated/new: 0, cleaned-up: 0
elapsed: 0 second(s)

[and if I run without the .noupdate, it takes a while longer:

- processing mail; processed: 266625; updated/new: 0, cleaned-up: 0
cleaning up messages [/Users/tsg20/.mu/xapian]
- processing mail; processed: 530659; updated/new: 0, cleaned-up: 0
elapsed: 44 second(s), ~ 12060 msg/s
\ processing mail; processed: 530659; updated/new: 0, cleaned-up: 0
elapsed: 70 second(s), ~ 7580 msg/s]


Dirk-Jan C. Binnema

May 1, 2016, 1:44:38 PM
to mu-di...@googlegroups.com
Hi Toby,

On Sunday Apr 10 2016, Toby Gee wrote:

> I'm a new mu/mu4e user, and while it is mostly amazing, I'm finding that
> every few minutes mu4e doesn't respond to any search (even opening the
> inbox) for a minute or so.
>
> After some playing around, I think I have figured out that the problem is
> the cleanup phase of indexing - I'm guessing that it's periodically doing
> this, and it's what's causing the freeze. I have a gmail account with just
> over 260k mails in the "All Mail" folder, and 20 or so in the inbox, so I'd
> like to just be refreshing the inbox fairly often and the "All mail" folder
> only once or twice a day.
>
> I've put the times from the command line below - I don't know if they are
> typical (and I don't know why the cleanup involves twice as many emails as
> are actually there), but if they are, my question is: how can I have mu4e
> run with --nocleanup?

You cannot (easily) do that right now... so perhaps we should add an
option for that. Can you file an issue at the GitHub issue tracker so we
won't forget?

As for the extra mails, I suspect the same mails might be in the
database multiple times -- perhaps through a symlink? You could try to
do a --rebuild and see if that helps.

Kind regards,
Dirk.

--
Dirk-Jan C. Binnema Helsinki, Finland
e:dj...@djcbsoftware.nl w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C

Toby Gee

Jul 14, 2016, 11:25:05 AM
to mu-di...@googlegroups.com
Dear Dirk,

> You cannot (easily) do that right now... so perhaps we should add an
> option for that. Can you file an issue at the GitHub issue tracker so we
> won't forget?

Thank you very much for this reply, and apologies for being so slow to
do this - this email has languished in my org todos for too long. In the
meantime I've realised that it is a much bigger issue on slower
machines; now that I'm running mu4e on all my machines, I see that on a
new iMac it's taking only a few seconds to do the cleanup, but it's
about 45s on my Macbook 12".

It would also be good to know how often I need to run the cleanup on
this machine (would once a day be OK?).

I've now opened an issue about this, anyway (#883).

> As for the extra mails, I suspect the same mails might be in the
> database multiple times -- perhaps through a symlink? You could try to
> do a --rebuild and see if that helps.

Thanks - there were indeed some mails there more than once; I ended up
deleting a few ancient emails that seemed to be troublesome and it's
been fine since then.

Best,

Toby

Dirk-Jan C. Binnema

Jul 17, 2016, 4:02:46 AM
to mu-di...@googlegroups.com
Hi Toby,

On Thursday Jul 14 2016, Toby Gee wrote:

> Dear Dirk,
>
>> You cannot (easily) do that right now... so perhaps we should add an
>> option for that. Can you file an issue at the GitHub issue tracker so we
>> won't forget?
>
> Thank you very much for this reply, and apologies for being so slow to
> do this - this email has languished in my org todos for too long. In the
> meantime I've realised that it is a much bigger issue on slower
> machines; now that I'm running mu4e on all my machines, I see that on a
> new iMac it's taking only a few seconds to do the cleanup, but it's
> about 45s on my Macbook 12".
>
> It would also be good to know how often I need to run the cleanup on
> this machine (would once a day be OK?).

In the cleanup phase, we check which messages in the mu/xapian database
are no longer present in the filesystem, for example because you deleted
a message with 'rm' or with some other e-mail program. Otherwise, it's
only rarely needed.
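
Conceptually, the check boils down to "for every path the database knows about, does the file still exist?". A minimal sketch of that idea, assuming the known paths are available one per line in a plain text file (a hypothetical stand-in for the database; this is not how mu itself is implemented):

```shell
# find_stale LIST: print each path listed in LIST that no longer exists on disk.
# LIST is a hypothetical text file with one message path per line, standing in
# for the paths stored in the mu/xapian database.
find_stale() {
    while IFS= read -r path; do
        [ -e "$path" ] || printf '%s\n' "$path"
    done < "$1"
}

# Example (paths.txt is an assumed file name):
#   find_stale paths.txt
```

Because every known path has to be visited, the cost grows with the size of the database rather than with the number of new messages, which would fit the timings reported in this thread.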

When you try from the command-line (when mu4e is not running)

$ time mu index

and

$ time mu index --nocleanup

you can see how long it actually takes.
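
If the cleanup turns out to dominate, one possible workaround outside mu4e is to run the cheap variant often and the full one only about once a day. A minimal sketch, assuming a hypothetical stamp file that stores the time of the last full run (the `index_cmd` helper and the stamp path are illustrative, and the same caveat applies: run it only when mu4e is not running):

```shell
# index_cmd STAMP MAX_AGE_SECONDS: print the mu command that should run next.
# Prints "mu index" (with cleanup) when STAMP is missing or its recorded time
# is at least MAX_AGE_SECONDS old, otherwise "mu index --nocleanup".
# STAMP is a hypothetical file holding the epoch time of the last full run.
index_cmd() {
    stamp=$1 max_age=$2
    now=$(date +%s)
    if [ -e "$stamp" ]; then
        last=$(cat "$stamp")
    else
        last=0
    fi
    if [ $(( now - last )) -ge "$max_age" ]; then
        echo "mu index"
    else
        echo "mu index --nocleanup"
    fi
}

# Example use, e.g. from a periodic job (stamp location is an assumption):
#   stamp="$HOME/.mu-last-cleanup"
#   cmd=$(index_cmd "$stamp" 86400)
#   $cmd && [ "$cmd" = "mu index" ] && date +%s > "$stamp"
```

Storing the time in the file's contents rather than reading its modification time avoids depending on `stat`, whose options differ between macOS and Linux.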


> I've now opened an issue about this, anyway (#883).
>
>> As for the extra mails, I suspect the same mails might be in the
>> database multiple times -- perhaps through a symlink? You could try to
>> do a --rebuild and see if that helps.
>
> Thanks - there were indeed some mails there more than once; I ended up
> deleting a few ancient emails that seemed to be troublesome and it's
> been fine since then.

When using Gmail with offlineimap (or presumably mbsync too), you'll get
a copy of each message for every virtual folder it lives in on Gmail's
side.

Cheers,

Toby Gee

Jul 17, 2016, 9:50:08 AM
to mu-di...@googlegroups.com
Hi Dirk,

> In the cleanup phase, we check which messages in the mu/xapian database
> are no longer present in the filesystem, for example because you deleted
> a message with 'rm' or with some other e-mail program. Otherwise, it's
> only rarely needed.

That's very helpful - thank you.

> When you try from the command-line (when mu4e is not running)
>
> $ time mu index
>
> and
>
> $ time mu index --nocleanup

I'm using a .noupdate file in my main gmail folder, and the difference
(on the 12" Macbook) is dramatic - indexing is 19.261s real time with
cleanup, but only 0.613s without.

Best,

Toby