Broker, BrokerHost and "broker process"


Honglin Yu

Jun 16, 2021, 9:40:32 AM
to chromium-mojo
Hi Chromium-Mojo,

I have recently been debugging the multiprocess IPC of CrOS's ml-service and trying to understand some Mojo internals, but I am confused about the relationship among the Broker class, the BrokerHost class, and the "broker process". It seems that:

1. Although its name contains the word "broker", a "BrokerHost" is actually created when sending an invitation. So, more accurately, it always resides in the "inviter process" (usually the parent process) rather than in the unique "broker process" in the graph, although the two are usually the same in Chrome. Is this correct?

2. If the observation in 1 is correct, it seems the "broker responsibilities" (like creating a shared buffer) are actually handled in the inviter processes (where BrokerHost resides) rather than the unique "broker process" in the graph. Is this correct?

If the observation and guesses in 1 and 2 above are correct, it seems the brokering work is actually being shared by many nodes in the graph. Then what is the point of still having a unique "broker process"?

Thanks!

Best,
Honglin

Ken Rockot

Jun 16, 2021, 1:32:02 PM
to Honglin Yu, chromium-mojo
On Wed, Jun 16, 2021 at 6:40 AM 'Honglin Yu' via chromium-mojo <chromi...@chromium.org> wrote:
Hi Chromium-Mojo,

I have recently been debugging the multiprocess IPC of CrOS's ml-service and trying to understand some Mojo internals, but I am confused about the relationship among the Broker class, the BrokerHost class, and the "broker process". It seems that:

1. Although its name contains the word "broker", a "BrokerHost" is actually created when sending an invitation. So, more accurately, it always resides in the "inviter process" (usually the parent process) rather than in the unique "broker process" in the graph, although the two are usually the same in Chrome. Is this correct?


Indeed this is a confusing double-meaning of "broker."

2. If the observation in 1 is correct, it seems the "broker responsibilities" (like creating a shared buffer) are actually handled in the inviter processes (where BrokerHost resides) rather than the unique "broker process" in the graph. Is this correct?

In the context of Broker/BrokerHost endpoints, yes, these responsibilities fall on the inviter, which may not be "The Broker." The specific responsibilities, however, are limited to exactly two things:
  • negotiating the initial async NodeChannel connection between inviter and invitee
  • delegating shared memory allocation
The first one belongs where it is, and we should probably rename Broker/BrokerHost to InvitationClient/InvitationHost or something similar.

The second one should really be done by the broker process, and the fact that it isn't is just an artifact from when this was first implemented and we weren't thinking much about more complex process graphs than Chrome's.


If the observation and guesses in 1 and 2 above are correct, it seems the brokering work is actually being shared by many nodes in the graph. Then what is the point of still having a unique "broker process"?

The "broker process" is something different altogether, and it must still exist for several important reasons:
  • It's responsible for introducing processes to each other: suppose you have processes A (the broker), B, and C. A invites B and C. Over various application-level messages, B sends a pipe endpoint to A, which then forwards it along to some service in C. Now you have a message pipe routing between B and C, but perhaps this is the first such pipe and there's not yet any direct (OS-level) link between the two processes. When B hears C's name and realizes it doesn't know how to reach C, it will ask A (the broker) for an introduction. A will create a new socketpair (on POSIX) and send one end to B in a message that says "this is C", and the other end to C in a message that says "this is B." Now they can talk to each other.
  • On Windows, handle transfer between processes happens via a privileged system call that directly manipulates another process's handle table. Because of this, all messages carrying system handles must be relayed through some privileged process that can do the necessary handle duplications. The broker process does this.
  • In some edge cases it's necessary to broadcast an event to every connected process, and this is done via a request to the broker process.
In essence, the broker process is the only process in the graph that is guaranteed to know about (and have a direct OS-level link to) every other process in the graph, and it's the only one guaranteed to be privileged enough to globally manage handle duplication on Windows.
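
The introduction flow in the first bullet can be modeled with a small sketch. This is a toy, single-process simulation under stated assumptions, not Mojo's implementation: the `Node` and `Broker` classes and their methods are hypothetical names invented for illustration, each "process" is just a Python object, and handing a socket to a node stands in for sending a file descriptor in an "introduction" message.

```python
import socket


class Node:
    """Toy stand-in for a process: maps a peer's name to a connected socket."""

    def __init__(self, name):
        self.name = name
        self.links = {}

    def send(self, peer_name, payload, broker=None):
        # No direct link yet: ask the broker for an introduction first.
        if peer_name not in self.links:
            if broker is None:
                raise LookupError(f"{self.name} has no route to {peer_name}")
            broker.introduce(self.name, peer_name)
        self.links[peer_name].sendall(payload)

    def receive(self, peer_name, size=1024):
        return self.links[peer_name].recv(size)

    def close(self):
        for sock in self.links.values():
            sock.close()
        self.links.clear()


class Broker(Node):
    """Hypothetical broker: the only node linked to every other node."""

    def __init__(self, name):
        super().__init__(name)
        self.invited = {}

    def invite(self, node):
        # Inviting a node creates the direct broker <-> invitee link.
        mine, theirs = socket.socketpair()
        self.links[node.name] = mine
        node.links[self.name] = theirs
        self.invited[node.name] = node

    def introduce(self, first, second):
        # Create a fresh socketpair and give one end to each node,
        # labeled with the other node's name ("this is C" / "this is B").
        left, right = socket.socketpair()
        self.invited[first].links[second] = left
        self.invited[second].links[first] = right


a = Broker("A")
b, c = Node("B"), Node("C")
a.invite(b)
a.invite(c)

# B has never talked to C, so this first send triggers an introduction via A.
b.send("C", b"hello from B", broker=a)
print(c.receive("B"))

for node in (a, b, c):
    node.close()
```

After the introduction, B and C each hold one end of the new socketpair, so later traffic between them no longer needs to pass through A.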




Honglin Yu

Jun 16, 2021, 6:55:30 PM
to Ken Rockot, chromium-mojo
Thanks for the quick and informative response, Ken! I understand it now. May I confirm the following (it is useful to us when debugging possible file descriptor leaks):
1. Each non-broker leaf node in the graph has only one "sync IPC" platform channel (and this sync IPC channel goes to the "inviter process", not the "broker process").
2. There is at most one "async IPC" platform channel (i.e. the NodeChannel) between any pair of nodes. That is to say, except for invitations and shared buffer allocation, everything else goes through this channel.

Please see my other responses inline below.
 
The second one should really be done by the broker process, and the fact that it isn't is just an artifact from when this was first implemented and we weren't thinking much about more complex process graphs than Chrome's.
Thanks for letting me know this! I totally understand it now.
 
If the observation and guesses in 1 and 2 above are correct, it seems the brokering work is actually being shared by many nodes in the graph. Then what is the point of still having a unique "broker process"?

The "broker process" is something different altogether, and it must still exist for several important reasons:
  • It's responsible for introducing processes to each other: suppose you have processes A (the broker), B, and C. A invites B and C. Over various application-level messages, B sends a pipe endpoint to A, which then forwards it along to some service in C. Now you have a message pipe routing between B and C, but perhaps this is the first such pipe and there's not yet any direct (OS-level) link between the two processes. When B hears C's name and realizes it doesn't know how to reach C, it will ask A (the broker) for an introduction. A will create a new socketpair (on POSIX) and send one end to B in a message that says "this is C", and the other end to C in a message that says "this is B." Now they can talk to each other.
Cool, I saw a similar description in the doc. I started investigating the internals from the "invitation APIs" and have not yet looked into every detail of the async channel (i.e. the NodeChannel, which is much more complex).

  • On Windows, handle transfer between processes happens via a privileged system call that directly manipulates another process's handle table. Because of this, all messages carrying system handles must be relayed through some privileged process that can do the necessary handle duplications. The broker process does this.
  • In some edge cases it's necessary to broadcast an event to every connected process, and this is done via a request to the broker process.
In essence, the broker process is the only process in the graph that is guaranteed to know about (and have a direct OS-level link to) every other process in the graph, and it's the only one guaranteed to be privileged enough to globally manage handle duplication on Windows.
Thank you very much for the summary! The big picture here can really speed up my investigation.

Best,
Honglin

 

Ken Rockot

Jun 16, 2021, 6:58:13 PM
to Honglin Yu, chromium-mojo
On Wed, Jun 16, 2021 at 3:55 PM Honglin Yu <hong...@google.com> wrote:
Thanks for the quick and informative response, Ken! I understand it now. May I confirm the following (it is useful to us when debugging possible file descriptor leaks):
1. Each non-broker leaf node in the graph has only one "sync IPC" platform channel (and this sync IPC channel goes to the "inviter process", not the "broker process").

Yes
 
2. There is at most one "async IPC" platform channel (i.e. the NodeChannel) between any pair of nodes. That is to say, except for invitations and shared buffer allocation, everything else goes through this channel.

Yes
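
Taken together, the two confirmed points give a simple upper bound that may be handy when checking for leaks. The sketch below is only arithmetic on those two statements: the function names are hypothetical, and it assumes each platform channel costs the leaf node exactly one descriptor and ignores every other descriptor the process may hold (files, shared memory regions, and so on).

```python
def max_leaf_channels(total_nodes):
    """Upper bound on platform channels held by one non-broker leaf node.

    Assumes: one sync channel to the inviter, plus at most one async
    NodeChannel to each of the other nodes in the graph.
    """
    if total_nodes < 2:
        raise ValueError("a leaf node needs at least an inviter in the graph")
    sync_channels = 1
    async_channels = total_nodes - 1
    return sync_channels + async_channels


def exceeds_bound(observed_channels, total_nodes):
    """True if a leaf holds more channels than the two rules above allow."""
    return observed_channels > max_leaf_channels(total_nodes)


# Example graph: broker A plus leaves B and C (three nodes in total).
print(max_leaf_channels(3))
print(exceeds_bound(5, 3))
```

A count above the bound does not prove a leak on its own, but under these assumptions it suggests that something other than the sync channel and the per-peer NodeChannels is holding descriptors.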

Honglin Yu

Jun 16, 2021, 6:59:51 PM
to Ken Rockot, chromium-mojo
Got it, thanks for the super quick response, Ken!