Loosen the giant Xapian lock in "mu server"?

derek...@gmail.com

Jan 17, 2013, 1:30:34 AM
to mu-di...@googlegroups.com
The Xapian docs say that the locks on Xapian databases allow for one writer but multiple readers.  Looking in mu-cmd.c, one can see how the mu implementation sets up a read-only store for "find" calls, but a read-write store for the other calls that hit the index:

    case MU_CONFIG_CMD_FIND:
        merr = with_store (mu_cmd_find, opts, TRUE, err);      break;
    case MU_CONFIG_CMD_INDEX:
        merr = with_store (mu_cmd_index, opts, FALSE, err);    break;
    case MU_CONFIG_CMD_ADD:
        merr = with_store (mu_cmd_add, opts, FALSE, err);      break;
    case MU_CONFIG_CMD_REMOVE:
        merr = with_store (mu_cmd_remove, opts, FALSE, err);   break;
    case MU_CONFIG_CMD_SERVER:
        merr = with_store (mu_cmd_server, opts, FALSE, err);   break;

This is why, if you are running mu4e, no other mu process can refresh the index.  The server is grabbing the write lock "just in case" it needs to write.

Would it be possible to...
  1. run mu_cmd_server in a read-only context, and
  2. if mu_cmd_server needs to do a write action ("index", "add", "remove", possibly "guile"), have it allocate a writeable store alongside the read-only store and pass that to the callback, deallocating it at the end?
This would let people run a background process to sync the maildir and refresh the index, and that process would continue to work properly whether they were running zero, one, or N instances of mu4e.
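
Concretely, point 2 might look something like this in the server's command loop.  This is only a rough sketch; the helper and type names (open_writable_store, close_store, ServerCtx, WriteCmdFunc) are illustrative, not the actual mu store API:

    /* Sketch only: open a writable store just for the duration of one
     * write command, then release it so other processes can take the
     * Xapian write lock again. */
    static MuError
    run_write_cmd (ServerCtx *ctx, WriteCmdFunc cmd, GError **err)
    {
        MuStore *wstore;
        MuError  merr;

        wstore = open_writable_store (ctx->db_path, err);
        if (!wstore)
            return MU_ERROR;            /* e.g. another writer holds the lock */

        merr = cmd (ctx, wstore, err);  /* the "add"/"remove"/"index" handler */

        close_store (wstore);           /* drops the write lock immediately */
        return merr;
    }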

Dirk-Jan C. Binnema

Jan 17, 2013, 1:37:45 PM
to mu-di...@googlegroups.com
Hi Derek,

On Thu, Jan 17 2013, derek...@gmail.com wrote:

> The Xapian docs say that the locks on Xapian databases allow for one writer
> but multiple readers. Looking in mu-cmd.c, one can see how the mu
> implementation sets up a read-only store for "find" calls, but a read-write
> store for the other calls that hit the index:
>
>     case MU_CONFIG_CMD_FIND:
>         merr = with_store (mu_cmd_find, opts, TRUE, err);      break;
>     case MU_CONFIG_CMD_INDEX:
>         merr = with_store (mu_cmd_index, opts, FALSE, err);    break;
>     case MU_CONFIG_CMD_ADD:
>         merr = with_store (mu_cmd_add, opts, FALSE, err);      break;
>     case MU_CONFIG_CMD_REMOVE:
>         merr = with_store (mu_cmd_remove, opts, FALSE, err);   break;
>     case MU_CONFIG_CMD_SERVER:
>         merr = with_store (mu_cmd_server, opts, FALSE, err);   break;
>
> This is why if you are running mu4e, no other mu process can refresh the
> index. The server is grabbing the write lock "just in case" it needs to
> write.

Indeed.

> Would it be possible to...
>
> 1. run mu_cmd_server in a read-only context, and
> 2. if mu_cmd_server needs to do a write action ("index", "add",
> "remove", possibly "guile"), have it allocate a writeable store alongside
> the read-only store and pass that to the callback, deallocating it at the
> end?
>
> This would let people run a background process to sync the maildir and
> refresh the index, and that process would continue to work properly whether
> they were running zero, one, or N instances of mu4e.

Possibly; however, the initial reason for keeping the database open all
the time is speed: opening and closing the database could become rather
costly; removing 10 messages would require 10 open-close operations.

It'd be better if Xapian would support multiple writers... Or you could
use different databases.

Best wishes,
Dirk.

--
Dirk-Jan C. Binnema Helsinki, Finland
e:dj...@djcbsoftware.nl w:www.djcbsoftware.nl
pgp: D09C E664 897D 7D39 5047 A178 E96A C7A1 017D DA3C

derek...@gmail.com

Jan 18, 2013, 11:38:44 PM
to mu-di...@googlegroups.com
I made some code changes to get the server running RW commands in temporary sessions, and then I made some code changes to get the server running all commands in temporary sessions.  They're both flaky.  It looks like there is some sort of internal mu state that's getting out of sync with what's happening on disk, because mu4e is regularly reporting errors about not being able to find files in the "new" directory to perform deletion operations.  When I look for the files, they are in the "cur" directory, so mu moved them but then forgot about the move.

Looking at this from the opposite direction, the problem isn't that only one server can run, it's that only one process can communicate with the server.  Emacs has dbus support, so I'll experiment with adding a "dbus_server" command to mu.

Derek

Dirk-Jan C. Binnema

Jan 19, 2013, 5:35:01 AM
to mu-di...@googlegroups.com

On Sat, Jan 19 2013, derek...@gmail.com wrote:

> I made some code changes to get the server running RW commands in temporary
> sessions, and then I made some code changes to get the server running all
> commands in temporary sessions. They're both flaky. It looks like there
> is some sort of internal mu state that's getting out of sync with what's
> happening on disk, because mu4e is regularly reporting errors about not
> being able to find files in the "new" directory to perform deletion
> operations. When I look for the files, they are in the "cur" directory, so
> mu moved them but then forgot about the move.

Ah, yes, when you open a message, it gets moved from new/ to cur/, as
per the maildir spec.
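
For example (the filename here is made up), opening a message for the
first time boils down to

    Maildir/inbox/new/1358563123.abc123.host
      ->  Maildir/inbox/cur/1358563123.abc123.host:2,S

where the ":2,S" suffix marks it as seen.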

> Looking at this from the opposite direction, the problem isn't that only
> one server can run, it's that only one process can communicate with the
> server. Emacs has dbus support, so I'll experiment with adding a
> "dbus_server" command to mu.

Ah, that sounds interesting. Thinking about the multi-mu4e case, I
suppose mu4e-->mu messages would need some originator field, so mu knows
where to send the response; for a few responses (such as updates) it
might make sense to broadcast them to all mu4e instances.

But, taking one step back, is there some specific problem that can only
be solved by having multiple mu4e/emacs instances alive at the same
time? An alternative is to use multiple databases.

Best wishes,
Dirk.

--
Dirk-Jan C. Binnema Helsinki, Finland
e:dj...@djcbsoftware.nl w:www.djcbsoftware.nl

derek...@gmail.com

Jan 19, 2013, 1:09:36 PM
to mu-di...@googlegroups.com
With current mu capabilities, there is only one privileged Emacs session that can read mail.  That's not a problem for people running just one Emacs instance, and I know there are a lot of people who work that way.  My style has evolved in the opposite direction: I pop up brand new Emacs instances all the time for different tasks and shove them into different workspaces to divide them up.  So my real interest is...

- If I bring up mu4e in my current Emacs instance to read mail, I should not be blocked by some other instance running mu4e.

As it stands right now, if I bring up another mu4e, I need to either kill the running mu process (which runs the risk of corrupting my maildir if mu is doing work at the time), or I need to look through all of my Emacs instances in all of my workspaces until I find the one with mu4e buffers and do something to address the situation.  And by then my mental context will be long gone.  I'm looking for some way to declare to the world "this is the new interactive mail reader", with any previous ones gracefully handling the situation: either continuing to function normally (the "multiple readers" scenario), or cleaning themselves up (stop any update timer, kill any buffers with docids that could become invalid, etc.).

Derek

Dirk-Jan C. Binnema

Jan 19, 2013, 1:54:07 PM
to mu-di...@googlegroups.com
Hi Derek,

On Sat, Jan 19 2013, derek...@gmail.com wrote:

> With current mu capabilities, there is only one privileged Emacs session
> that can read mail. That's not a problem for people running just one Emacs
> instance, and I know there are a lot of people who work that way. My style
> has evolved in the opposite direction: I pop up brand new Emacs instances
> all the time for different tasks and shove them into different workspaces
> to divide them up. So my real interest is...
>
> - If I bring up mu4e in my current Emacs instance to read mail, I should
> not be blocked by some other instance running mu4e.
>
> As it stands right now, if I bring up another mu4e, I need to either kill
> the running mu process (which runs the risk of corrupting my maildir if mu
> is doing work at the time), or I need to look through all of my Emacs
> instances in all of my workspaces until I find the one with mu4e buffers
> and do something to address the situation. And by then my mental context
> will be long gone. I'm looking for some way to declare to the world "this
> is the new interactive mail reader", with any previous ones gracefully
> handling the situation: either continuing to function normally (the
> "multiple readers" scenario), or cleaning themselves up (stop any update
> timer, kill any buffers with docids that could become invalid, etc.).

If you kill the mu server process with SIGTERM, it should shut down
gracefully; so if you do that before starting mu4e on another emacs, it
should just work, I think.

Of course, the update timers would still be there, but that would only
cause the mail-fetching to happen a bit more often, not a big problem I
think.

But, let me not discourage you from experimenting; that can only be a
good thing!

Best wishes,
Dirk.

--
Dirk-Jan C. Binnema Helsinki, Finland
e:dj...@djcbsoftware.nl w:www.djcbsoftware.nl

derek...@gmail.com

Feb 12, 2013, 11:13:55 AM
to mu-di...@googlegroups.com
On Saturday, January 19, 2013 10:54:07 AM UTC-8, djcb wrote:
> If you kill the mu server process with SIGTERM, it should shutdown
> gracefully; so if you do that before starting mu4e on another emacs, it
> should just work, I think.
>
> Of course, the update timers would still be there, but that would only
> cause the mail-fetching to happen a bit more often, not a big problem I
> think.
>
> But, let me not discourage you from experimenting; that can only be a
> good thing!

I have something that works.  D-Bus acts as transport and framing layer; the underlying S-expressions remain the same.  The code abstracts the I/O layer on both sides, so you can pick between server mode or dbus mode at startup time (there's a rough sketch of that abstraction below the list of problems).  There are three problems right now:

1. This exposed a bug in Emacs' event handling code (down in the C layer).  There's a possible elisp workaround that I'm going to try.  The trunk has a real fix, but it won't go out in the upcoming 24.3 release.  Version 24.4 will have it.

2. I haven't yet figured out how to get automake to run the D-Bus marshalling layer generator.  Right now I'm doing that by hand.  (I did get the configure script to make dbus support an option.)

3. The "index" command generates status updates that trickle back over the stdio connection, which doesn't fit with the D-Bus concept of request/response.  Right now all status updates come back in a single response message, and you only see the last update.  However, D-Bus does support one-way "signals" to clients that have registered to receive them.  Done correctly, the indexing status updates can show up in all mu4e clients connected to the server.

Derek

Dirk-Jan C. Binnema

Feb 12, 2013, 3:10:09 PM
to mu-di...@googlegroups.com
Hi Derek,

On Tue, Feb 12 2013, derek...@gmail.com wrote:

> On Saturday, January 19, 2013 10:54:07 AM UTC-8, djcb wrote:
>>
>> If you kill the mu server process with SIGTERM, it should shutdown
>> gracefully; so if you do that before starting mu4e on another emacs, it
>> should just work, I think.
>>
>> Of course, the update timers would still be there, but that would only
>> cause the mail-fetching to happen a bit more often, not a big problem I
>> think.
>>
>> But, let me not discourage you from experimenting; that can only be a
>> good thing!
>>
>
> I have something that works. D-Bus acts as transport and framing layer;
> the underlying S-expressions remain the same. The code abstracts the I/O
> layer on both sides, so you can pick between server mode or dbus mode at
> startup time.

Nice!

Do you have any rough estimates on the performance compared to the
current mu-server?

> There are three problems right now:
>
> 1. This exposed a bug in Emacs' event handling code (down in the C layer).
> There's a possible elisp workaround that I'm going to try. The trunk has a
> real fix, but it won't go out in the upcoming 24.3 release. Version 24.4
> will have it.
>
> 2. I haven't yet figured out how to get automake to run the D-Bus
> marshalling layer generator. Right now I'm doing that by hand. (I did get
> the configure script to make dbus support an option.)

Are you using GDBus? If so, I should be able to help.

> 3. The "index" command generates status updates that trickle back over the
> stdio connection, which doesn't fit with the D-Bus concept of
> request/response. Right now all status updates come back in a single
> response message, and you only see the last update. However, D-Bus does
> support one-way "signals" to clients that have registered to receive them.
> Done correctly, the indexing status updates can show up in all mu4e clients
> connected to the server.

That's really just a minor issue.
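
With GDBus the broadcast part should be straightforward, by the way: a
signal emitted with a NULL destination reaches every client that has
subscribed to it, roughly like this (the object path, interface and
signal names are made up for illustration):

    #include <gio/gio.h>

    /* Illustration only: broadcast an indexing progress update to all
     * subscribed mu4e instances; NULL destination means "everyone". */
    static void
    emit_index_progress (GDBusConnection *conn, guint processed, guint updated)
    {
        g_dbus_connection_emit_signal (conn,
                                       NULL,                        /* broadcast */
                                       "/nl/djcbsoftware/Mu",       /* made-up path  */
                                       "nl.djcbsoftware.Mu.Server", /* made-up iface */
                                       "IndexProgress",
                                       g_variant_new ("(uu)", processed, updated),
                                       NULL);                       /* ignore errors */
    }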

Anyway, cool stuff! Is the code available somewhere?

derek...@gmail.com

Feb 12, 2013, 8:46:59 PM
to mu-di...@googlegroups.com
On Tuesday, February 12, 2013 12:10:09 PM UTC-8, djcb wrote:
> Do you have any rough estimates on the performance compared to the
> current mu-server?

No.  I'll see what I can come up with.

> Are you using GDBus? If so, I should be able to help.

Yes, gdbus-codegen.

> Anyway, cool stuff! Is the code available somewhere?

I need to jump through some legal hoops before I can release my changes, but I got that process started today.  I'll work on the asynchronous notifications in the meantime.

Derek
