mojo IPC seems broken after mojo_service_manager restart

Tao Jin

Aug 20, 2024, 2:17:25 PM
to chromium-mojo
Hello, 

I ported libchrome, libmojo, and the mojo service manager from CrOS to my own project.

I am trying to understand the mojo disconnection handling logic and ran into the following issue.

I have two processes, A and B. A is a mojo service provider and B is the mojo client. I also have the mojo_service_manager process up and running.

When I run all 3 processes, A and B can connect to mojo_service_manager, A can register its service, and B can request A's service. A and B can talk to each other.

When I kill mojo_service_manager, both A and B detect the disconnection. In the OnServiceManagerDisconnect() callback, I reset the service_manager remote and the mojo service remote, then try to reconnect to the mojo service manager and re-initialize the mojo service provider and remote.
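
For readers following along, the reset-then-reconnect flow described above can be sketched roughly as follows. This is only a minimal model of the application-side bookkeeping: `FakeRemote`, `Client`, `Connect()` and `SimulatePeerClosed()` are hypothetical stand-ins invented for illustration, not the real `mojo::Remote<T>` or mojo_service_manager API, and the sketch does not model Mojo's internal routing, so it says nothing about whether IPC between A and B actually recovers.

```cpp
#include <functional>

// Hypothetical stand-in for a bound IPC endpoint. NOT the real mojo::Remote<T>.
class FakeRemote {
 public:
  void Bind(std::function<void()> on_disconnect) {
    on_disconnect_ = std::move(on_disconnect);
    bound_ = true;
  }
  void reset() {
    bound_ = false;
    on_disconnect_ = nullptr;
  }
  bool is_bound() const { return bound_; }

  // Test hook: pretends the peer (e.g. the service manager) went away.
  void SimulatePeerClosed() {
    if (!bound_) return;
    // Copy the handler first, because it may reset and rebind this object.
    std::function<void()> handler = on_disconnect_;
    if (handler) handler();
  }

 private:
  bool bound_ = false;
  std::function<void()> on_disconnect_;
};

// Hypothetical client process (process B in the description above).
class Client {
 public:
  void Connect() {
    // Connect to the service manager and watch for its disconnection.
    service_manager_.Bind([this] { OnServiceManagerDisconnect(); });
    // Request the service from the provider (process A).
    service_.Bind([] {});
    ++connect_count_;
  }

  void OnServiceManagerDisconnect() {
    // Drop every endpoint that was obtained through the old service manager...
    service_.reset();
    service_manager_.reset();
    // ...then start over from scratch.
    Connect();
  }

  FakeRemote& service_manager() { return service_manager_; }
  FakeRemote& service() { return service_; }
  int connect_count() const { return connect_count_; }

 private:
  FakeRemote service_manager_;
  FakeRemote service_;
  int connect_count_ = 0;
};
```

Calling `Connect()` once and then `service_manager().SimulatePeerClosed()` leaves both fake remotes bound again with `connect_count()` at 2, which mirrors the "both A and B can re-connect" observation.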

With the above implementation, after I restart mojo_service_manager, both A and B can reconnect to mojo_service_manager without any problem.

However, A and B cannot do IPC anymore.

My question: if mojo_service_manager crashes and restarts, is it possible for the mojo service provider and mojo client processes to fully recover IPC?

thanks,
Tao 

Ken Rockot

Aug 20, 2024, 2:20:27 PM
to chromium-mojo, jinta...@gmail.com
Though I'm not familiar with the details of mojo_service_manager (which is a Chrome OS-specific service) I do believe this would be broken with the old Mojo implementation but not with the new one.

Can you try running every process with --enable-features=MojoIpcz, or change the feature to be enabled-by-default in mojo/core/embedder/features.cc? I would expect the processes to be able to re-establish communication just fine in that case.
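
Since the suggestion applies to every process, one way to sanity-check a launch configuration is to inspect each process's argument list for the flag. The helper below is a hypothetical sketch for that purpose only: `IsFeatureRequested` is an invented name, it is not how Chromium parses features (that is handled by `base::FeatureList`), and it assumes a plain comma-separated list while ignoring any extra syntax the real flag may accept.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns true if `args` contains
// --enable-features=<comma-separated list> and the list names `feature`.
bool IsFeatureRequested(const std::vector<std::string>& args,
                        const std::string& feature) {
  const std::string prefix = "--enable-features=";
  for (const std::string& arg : args) {
    // Skip arguments that do not start with the flag prefix.
    if (arg.compare(0, prefix.size(), prefix) != 0) continue;
    std::stringstream list(arg.substr(prefix.size()));
    std::string name;
    while (std::getline(list, name, ',')) {
      if (name == feature) return true;  // Exact match only.
    }
  }
  return false;
}
```

For example, `IsFeatureRequested({"daemon_a", "--enable-features=MojoIpcz"}, "MojoIpcz")` returns true, while an argument list without the flag returns false.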

Ken Rockot

Aug 20, 2024, 3:43:28 PM
to Tao Jin, chromium-mojo


On Tue, Aug 20, 2024 at 11:31 AM Tao Jin <jinta...@gmail.com> wrote:
Thanks Ken for the quick response! 

I ported libchrome from ~2022, and I don't see ipcz support in my version.

Ah yeah, it was only recently rolled into libchrome now that cros is working to adopt it.


Could you please share more about why the old mojo didn't support such reconnection?

I know mojo_service_manager is a cros thing, but IIUC, the Chromium browser internally has a broker-ish module as well to help bootstrap mojo communication. If such a broker crashes, what's the best practice for handling this in the old mojo implementation? I was surprised to find that in CrOS, most daemons simply CHECK()-crash and respawn when they detect a mojo service manager (broker) disconnection. Why is that?

The details are a bit gnarly, but this behavior is rooted in a fundamental assumption that we wouldn't ever need to worry about recovering from broker process failure. This was decided in a time when the browser process was the only broker process in practice, and a browser crash meant (IIRC) bringing down all of cros anyway. Several design and implementation decisions followed from this assumption, some of which became ossified through the routing protocol's backwards-compatibility requirements. Issues like this were part of the motivation to develop ipcz.
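
The crash-and-respawn behavior asked about above can be sketched roughly like this. It is a minimal illustration, not CrOS code: `Daemon` and `FatalHandler` are hypothetical names, the real daemons use `CHECK()` rather than an injected handler, and the existence of a supervisor that restarts the process is an assumption here.

```cpp
#include <cstdlib>
#include <functional>
#include <iostream>
#include <string>
#include <utility>

// Hypothetical daemon that treats loss of the broker as unrecoverable.
class Daemon {
 public:
  using FatalHandler = std::function<void(const std::string&)>;

  // By default a fatal error terminates the process. A supervisor (assumed,
  // not shown) would then restart it with fresh IPC state.
  explicit Daemon(FatalHandler on_fatal = [](const std::string& why) {
    std::cerr << "FATAL: " << why << std::endl;
    std::abort();
  })
      : on_fatal_(std::move(on_fatal)) {}

  // Called when the connection to the service manager (broker) is lost.
  void OnServiceManagerDisconnect() {
    // No attempt to reconnect: start over in a new process instead.
    on_fatal_("lost connection to the service manager");
  }

 private:
  FatalHandler on_fatal_;
};
```

The handler is injectable only so the policy can be exercised without killing the process. The design trade-off is that restarting discards all IPC state tied to the old broker, which sidesteps the recovery problem rather than solving it.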

Tao Jin

Aug 20, 2024, 3:47:13 PM
to Ken Rockot, chromium-mojo
Thanks Ken for the quick response! 

I ported libchrome from ~2022, and I don't see ipcz support in my version.

Could you please share more about why the old mojo didn't support such reconnection?

I know mojo_service_manager is a cros thing, but IIUC, the Chromium browser internally has a broker-ish module as well to help bootstrap mojo communication. If such a broker crashes, what's the best practice for handling this in the old mojo implementation? I was surprised to find that in CrOS, most daemons simply CHECK()-crash and respawn when they detect a mojo service manager (broker) disconnection. Why is that?
