Protocol "disembargo" message

Jens Alfke

Dec 17, 2022, 4:18:35 PM
to Cap'n Proto
As I peruse the RPC spec (rpc.capnp), I’m following along pretty well until I hit the ‘disembargo’ message (which my spellchecker keeps changing into ‘disembark’, but that’s another matter.)

At this point I’m only interested in Level 1 functionality, two peers, which AFAIK is as far as the implementation has gotten. Disembargo is a Level 1 message, and I’m having some trouble understanding it, partly because it seems to be relevant only at Levels 2+. The description starts on line 660:

  # Message sent to indicate that an embargo on a recently-resolved promise may now be lifted.
  #
  # Embargos are used to enforce E-order in the presence of promise resolution.  That is, if an
  # application makes two calls foo() and bar() on the same capability reference, in that order,
  # the calls should be delivered in the order in which they were made.  But if foo() is called
  # on a promise, and that promise happens to resolve before bar() is called, then the two calls
  # may travel different paths over the network, and thus could arrive in the wrong order.  In
  # this case, the call to `bar()` must be embargoed, and a `Disembargo` message must be sent along
  # the same path as `foo()` to ensure that the `Disembargo` arrives after `foo()`.  Once the
  # `Disembargo` arrives, `bar()` can then be delivered.

I was following along OK until “the two calls may travel different paths over the network.” We’re at Level 1, so there is only one path, the socket between peer A and peer B.

Also, in “that promise happens to resolve before bar() is called”, should that be “after bar() is called”? Because if the promise resolves before, why would that make the foo call arrive after the bar call?

If you have time, Kenton, I’d love to see the “Carol lives in Vat A, i.e. next to Alice.  In this case, Vat A needs to send a `Disembargo` message that echos through Vat B and back…” scenario broken down step-by-step. (Or if this situation is discussed on the old erights.org site, let me know the URL; I didn’t see it when I was reading through their protocol docs.)

Thanks!

—Jens

Kenton Varda

Dec 17, 2022, 4:46:52 PM
to Jens Alfke, Cap'n Proto
Hi Jens,

To clarify, when I say "Carol lives in Vat A, i.e. next to Alice", I am saying that Alice and Carol are two objects living in the same process. So a capability pointing to Carol was passed across the network and then back again, over the same connection. In this case, we want to shorten the path so that calls from Alice to Carol are direct local calls, not bouncing across the network connection. This scenario does in fact come up in Level 1, as it only involves two "vats" (a "vat" is essentially one process running Cap'n Proto; the RPC protocol is used to communicate between vats).

Embargoes are needed here for the reasons described in the comments: When Alice discovers her promise P resolves to Carol, she may have already sent some messages towards the promise using pipelining. Before Alice can start making direct local calls to Carol, those messages have to cross the network connection and reflect back, so that they are delivered first.
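
To make the ordering problem concrete, here is a minimal sketch of the scenario in Python. It is a toy model, not Cap'n Proto's API: `simulate`, `wire`, and `held` are hypothetical names, and the round trip through Vat B is modeled as a simple FIFO queue.

```python
from collections import deque


def simulate(use_embargo):
    """Return the order in which Carol receives foo() and bar().

    Toy model, not the Cap'n Proto API: `wire` stands in for the
    A -> B -> A round trip, and `held` for Alice's embargo queue.
    """
    delivered = []   # calls in the order Carol actually sees them
    wire = deque()   # messages in flight to Vat B, reflected back in FIFO order
    held = deque()   # local calls embargoed until the Disembargo returns

    # 1. Alice pipelines foo() on promise P before it resolves.
    wire.append(("call", "foo"))

    # 2. P resolves to Carol, who is local. With embargoes, Alice sends a
    #    Disembargo along the same path that foo() took.
    if use_embargo:
        wire.append(("disembargo", None))

    # 3. Alice calls bar() on the same reference, now known to be local.
    if use_embargo:
        held.append("bar")        # wait for the Disembargo to echo back
    else:
        delivered.append("bar")   # direct local call overtakes foo()

    # 4. Vat B reflects everything back to Vat A, in the order it was sent.
    while wire:
        kind, name = wire.popleft()
        if kind == "call":
            delivered.append(name)
        else:  # the Disembargo has returned: release the embargoed calls
            while held:
                delivered.append(held.popleft())

    return delivered


print(simulate(use_embargo=False))  # ['bar', 'foo']
print(simulate(use_embargo=True))   # ['foo', 'bar']
```

Without the embargo, bar() reaches Carol before the reflected foo(); with it, bar() is held until the Disembargo has followed foo() around the loop.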

-Kenton

Jens Alfke

Dec 26, 2022, 5:42:02 PM
to Kenton Varda, Cap'n Proto


On Dec 17, 2022, at 1:46 PM, Kenton Varda <ken...@cloudflare.com> wrote:

> To clarify, when I say "Carol lives in Vat A, i.e. next to Alice", I am saying that Alice and Carol are two objects living in the same process. So a capability pointing to Carol was passed across the network and then back again, over the same connection.

Thanks. I then spent some time trying to figure out why this scenario occurs — when the local peer received the capability Carol from the remote one, shouldn’t it have been marked in the protocol as being a peer exported by the recipient?

I think this can happen as follows; is this correct?

0. I’ve already sent the capability Carol to the remote peer earlier in the connection, so the peer has a reference to Carol.
1. I send an RPC call to acquire a remote capability, and allocate a (negative) remote capability # to it. Call it X. (I think this is what the protocol calls a promise?)
2. Before the response arrives, I pipeline some more RPC calls addressed to X.
3. I get a response to the first RPC call, identifying X as my capability Carol.

At this point I can remap X to point to Carol, but I’ve already got some messages in flight addressed to remote capability X. I assume what happens to these is the peer just sends them back to me, substituting Carol for X, and then forwards my reply back to me? Thus the problem that I might send local messages to Carol that would arrive before the echoed messages to X even though I sent them later.

A different architectural solution to this problem might be for a peer to reject an incoming request addressed to a capability that isn’t local. Instead it would return an error indicating “X isn’t mine, it’s your Carol, so forward it there”.

But either way, isn’t there still a race condition if, in a new step 2.5, I send a message to local Carol? This message was sent after the message to X but arrives before I discover X is Carol, so it’s received out of order.

—Jens

Kenton Varda

Dec 26, 2022, 6:28:52 PM
to Jens Alfke, Cap'n Proto
On Mon, Dec 26, 2022 at 4:42 PM Jens Alfke <je...@mooseyard.com> wrote:
> Thanks. I then spent some time trying to figure out why this scenario occurs — when the local peer received the capability Carol from the remote one, shouldn’t it have been marked in the protocol as being a peer exported by the recipient?
>
> I think this can happen as follows; is this correct?
>
> 0. I’ve already sent the capability Carol to the remote peer earlier in the connection, so the peer has a reference to Carol.
> 1. I send an RPC call to acquire a remote capability, and allocate a (negative) remote capability # to it. Call it X. (I think this is what the protocol calls a promise?)
> 2. Before the response arrives, I pipeline some more RPC calls addressed to X.
> 3. I get a response to the first RPC call, identifying X as my capability Carol.
>
> At this point I can remap X to point to Carol, but I’ve already got some messages in flight addressed to remote capability X. I assume what happens to these is the peer just sends them back to me, substituting Carol for X, and then forwards my reply back to me? Thus the problem that I might send local messages to Carol that would arrive before the echoed messages to X even though I sent them later.

That's correct, except one minor detail: I'm not sure what you mean about "(negative) remote capability #". There are no negative numbers. The caller assigns a question ID to the outgoing RPC, and can identify expected results for promise pipelining by the question ID plus a field path.
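
As a rough illustration of that addressing scheme, here is a sketch in Python. The class and method names are hypothetical; only the idea (a caller-assigned question ID plus a path of pointer-field indexes, which rpc.capnp calls `PromisedAnswer` with `questionId` and `transform`) comes from the protocol. Real implementations also reuse question IDs once a question is finished, which this sketch ignores.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class PromisedAnswer:
    """Names the eventual result of an outstanding call (hypothetical model)."""
    question_id: int        # chosen by the caller when it sends the call
    transform: tuple = ()   # pointer-field indexes leading into the result


class QuestionTable:
    """Caller-side bookkeeping for calls that have not been answered yet."""

    def __init__(self):
        self._ids = count()     # 0, 1, 2, ... (never negative)
        self.outstanding = {}

    def send_call(self, target, method):
        qid = next(self._ids)
        self.outstanding[qid] = (target, method)
        return qid


questions = QuestionTable()
q1 = questions.send_call("bootstrap", "getX")   # step 1: ask for X
x = PromisedAnswer(q1, (0,))                    # "pointer field 0 of q1's result"
q2 = questions.send_call(x, "foo")              # step 2: pipeline a call on X
print(q1, q2, x)  # 0 1 PromisedAnswer(question_id=0, transform=(0,))
```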
 
> A different architectural solution to this problem might be for a peer to reject an incoming request addressed to a capability that isn’t local. Instead it would return an error indicating “X isn’t mine, it’s your Carol, so forward it there”.

Sure, but this would have several problems:
* If the RPC system is expected to deal with it transparently, then that implies the RPC system has to keep a copy of every message it has sent on a promise, until either it gets a response or the promise resolves, wasting memory in order to handle an unusual situation.

* If the RPC system is NOT expected to deal with it transparently, then the application has to deal with it. This would compromise the abstraction, as application code would now have to be written with awareness of which objects reside in which vats in order to be ready for this scenario.

* In some three-party scenarios (when that is implemented), it would result in extra network round trips. Imagine a scenario where a client talks to a server in a remote datacenter, and the server then redirects the client to an adjacent server. In a naive redirect scenario, the first server must send a message to the client redirecting it, and the client must re-send the message to the second server, all before the second server can receive the message at all. This is an extra round trip on the long-distance connection between the client and the two servers. With Cap'n Proto's approach, the first server directly forwards the request to the second server, while in parallel letting the client know to expect the response from that server. The second server can then send the response directly to the client. So no extra network round trip is needed.
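
A back-of-the-envelope comparison of the two approaches in the last point, as a Python sketch. The function names and the example latencies (50 ms one way between client and datacenter, 1 ms between the adjacent servers) are assumptions for illustration only.

```python
def redirect_latency(long_hop_ms):
    """Naive redirect: every hop crosses the long-distance link.

    client -> server 1, server 1 -> client (redirect),
    client -> server 2, server 2 -> client (response)
    """
    return 4 * long_hop_ms


def forward_latency(long_hop_ms, short_hop_ms):
    """Forwarding: server 1 hands the request to its neighbor directly.

    client -> server 1, server 1 -> server 2, server 2 -> client (response)
    """
    return 2 * long_hop_ms + short_hop_ms


print(redirect_latency(50))    # 200
print(forward_latency(50, 1))  # 101
```

Under these assumed numbers the redirect costs two long-distance round trips, while forwarding costs one round trip plus the short hop between the servers.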
 
> But either way, isn’t there still a race condition if, in a new step 2.5, I send a message to local Carol? This message was sent after the message to X but arrives before I discover X is Carol, so it’s received out of order.

E-order only guarantees that calls made on the same capability will be received in order. If you make calls on two different capabilities that, unknown to you, end up pointing to the same destination object, no guarantees are made regarding the order in which those calls are delivered. This has been shown to match what developers intuitively expect in practice.

 -Kenton

Jens Alfke

Dec 26, 2022, 11:34:27 PM
to Kenton Varda, Cap'n Proto


On Dec 26, 2022, at 3:28 PM, Kenton Varda <ken...@cloudflare.com> wrote:

> I'm not sure what you mean about "(negative) remote capability #". There are no negative numbers.

Oops, I got that from the old E documentation (erights.org). They used negative numbers to identify capability IDs that were created on the “other side” of the connection, i.e. promises.

(Speaking of which, is there any other good reading material on capabilities-with-RPC, besides erights.org and your own rpc.capnp?)

—Jens

Kenton Varda

Dec 27, 2022, 12:38:39 PM
to Jens Alfke, Cap'n Proto
The other one that comes to my mind off the top of my head (and is also quite old) is Waterken:

https://waterken.sourceforge.net/

Capability people will often distinguish between CapTP-style and "Ken-style" (from Waterken) capabilities. I don't quite remember what the main differences are, though, except that Cap'n Proto is much closer to CapTP.

-Kenton