CRDTs and ocap systems

53 views
Skip to first unread message

Christopher Lemmer Webber

unread,
Nov 28, 2020, 3:37:38 PM11/28/20
to cap-...@googlegroups.com
A thing I don't see much attention to in ocap land as much but I do see
in everywhere else it seems is CRDTs. They seem useful, but there's an
extent to which it sounds like when people are talking about them it
often sounds like they're this magic wand that can be waved and whammo!
We no longer have synchronization issues.

They seem like they could be useful sometimes, but it strikes me that in
general the following aspects are true:

- They seem good at, given absorbing all the same inputs out of order,
resulting in the same value.

- They tend to require cooperatively interested parties from all
interested writers in the scenarios I've described. Ie, most
examples I've seen have been multiple writers in for example, a
corporate setting where it's important that *someone* wins, but there
aren't competing interests as to who "gets what they want" (eg some
corporate database, where the sum "interests" ends up being "one
institution"... that's ignoring the fact that different teams *do*
often have mutually combative interests, but the pressure to get
along can result in not trying to game systems). I'm sure there are
probably some cases where this isn't the case, but every example of
where they "fix things" seem to describe this.

- They seem to require a lot of careful thinking/wiring usually to
compose and fit whatever scenario you're envisioning solving.

This is all fine and doesn't mean they aren't useful, but I've gotten
some cases where people talk at me like "if you had just chosen CRDTs,
you wouldn't have any issues"... the kind of magic hand-waving I get
suspicious of. Maybe that's because people aren't working on *general
purpose* distributed *mutually suspicious* systems? I don't know.

I can see CRDTs useful for particular ocap scenarios (there are plenty
of scoped multi-writer problems in the ocap with mutual interest in
converging on a value), and building a CRDT library on top of Goblins
seems like a good idea. But if they magically solved all multi-writer
problems in scenarios with competing interests, then presumably we
wouldn't need consensus algorithms. That alone seems to make my
gut-sense feel like it's probably on the right track.

What do you think? Is my gut sense correct? And have there been nice
applications of CRDTs to ocaps I should know/think about?

- Chris

Raoul Duke

unread,
Nov 28, 2020, 4:13:09 PM11/28/20
to cap-...@googlegroups.com
> What do you think? Is my gut sense correct?

whatever anybody else says, ... yes, i believe it is correct.

Alan Karp

unread,
Nov 28, 2020, 5:05:55 PM11/28/20
to cap-...@googlegroups.com
I don't see the connection.  CRDTs are about data; ocaps are about permissions.  Would you be asking this question if the topic was databases instead of CRDTs?

--------------
Alan Karp


--
You received this message because you are subscribed to the Google Groups "cap-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cap-talk+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/cap-talk/87r1odgqvk.fsf%40dustycloud.org.

Raoul Duke

unread,
Nov 28, 2020, 5:09:41 PM11/28/20
to cap-...@googlegroups.com
ps to clarify i only meant with respect to crdts being overhyped :-)

Christopher Lemmer Webber

unread,
Nov 29, 2020, 4:28:10 PM11/29/20
to cap-...@googlegroups.com, Raoul Duke
That's the one thing I suspected the most strongly. :)

Christopher Lemmer Webber

unread,
Nov 29, 2020, 4:47:59 PM11/29/20
to cap-...@googlegroups.com, Alan Karp
Ocaps can be about data too, but aren't necessarily (the line is less
squishy around E, squishier around certificate ocaps and ocap logic
databases, I'd think). Here is another way to put it: ocaps aren't just
about authority, but about behavior. Data can flow out of an object as
one type of behavior of invoking that reference.

But you're asking what the connection is, and I'm admitting straight
out: I don't know. I suspect there's something here to investigate, but
I haven't quite sussed out what it is.

The origin of this is more or less people 'splain'ing to me that I made
major errors in my systems by not putting CRDTs at the heart. As an
example, multiple people have told me "ActivityPub is hopeless because
you didn't start out with CRDTs". When trying to get clarity as to why
I feel like the responses are vague. But they might be somewhat on to,
and off of, something...

I think this is one of those cases that I'll call, at the moment,
"approach enthusiasm". You can say I'm an "approach enthusiast" of
ocaps, lisps, reproducible software. Sometimes when I see a problem, I
might grumble and indicate that I think one of the above would have
solved the problem much better. But sometimes when I make that claim, I
don't fully know... it's a gut feeling from having seen similar problems
which have been solved by these approaches, and having seen the pitfalls
from having chosen alternatives. But it may be while making a claim
that I haven't actually thought through all of the ways they would have.
None of the above are "magic pixie dust", and I know that, but it's
tempting enough to treat them way in shorthands, because I believe they
are "the right approach"... even if the ways in which they solve
problems might take longer for me to suss out (or maybe they don't in
this case!).

Similarly, I think maybe that it's that I'm encountering "CRDT approach
enthusiasts", who have seen a lot of problems successfully solved with
CRDTs, and thus they assume that the ones I'm working on would benefit
as well. I think part of this is that CRDTs have captured the
enterprise-worker-distributed-systems-mindset... the kinds of problems
facing "enterprise software" (intentionally vague there) might actually
well be suited by CRDTs generally.

I'm not quite sure where I'm going with this, and sorry if it's just too
vague. Just trying to figure out what the source of this zeitgeist is
and if I'm wrong for not giving it enough time and attention.

To give the benefit of doubt, here are maybe a few places where I think
CRDTs might compose nicely with ocap systems I'm working on:

- A cooperative quorum of vats which all want to choose the same input
and are willing to rewind as necessary to get there. Kind of like an
ocap version of (my understanding of) Distributed TeaTime.

- Similarly, setting up something like an unum system but where the
source of update-truth may actually come from multiple sources at
multiple times. Eventually, everyone's "unum presence" should
converge on the same value, having seen all inputs.

What it seems like CRDTs don't solve:

- Blockchain-type needs, such as the double spending problem, on their
own... consensus still seems necessary in the case that two mutually
opposing choices of "spending" might occur with effects (I have
enough money to buy a new computer or to buy groceries, but not
both...)

- CRDTs don't, on their own, really have anything to do with authority,
as Alan indicates. I think that is a useful observation and is
partly why I'm confused by advocates maybe indicating it'll "solve
all my problems"... a lot of my problems are authority-oriented...

- Sometimes I want opaque objects which spit out data. I don't know
how to do this other than "distributed actors with opaque machines
on the network" type approach. CRDTs don't do anything here.

Dunno. Guess that's not so helpful. Thinking out loud though. Sorry
if it sounded like I was asking a specific question when I was really
trying to figure out what's under all this murk.


Alan Karp writes:

> I don't see the connection. CRDTs are about data; ocaps are about
> permissions. Would you be asking this question if the topic was databases
> instead of CRDTs?
>

Tony Arcieri

unread,
Dec 1, 2020, 9:51:00 AM12/1/20
to cap-...@googlegroups.com
On Sat, Nov 28, 2020 at 2:05 PM Alan Karp <alan...@gmail.com> wrote:
I don't see the connection.  CRDTs are about data; ocaps are about permissions.
 
A tricky problem is how to reconcile the two. For example, let's say we want the following crypto-OCap access control (using Tahoe-ish terminology, in order of decreasing authority) for a collaborative editing system based on CRDTs (e.g. Automerge):

- writecap: can encrypt and sign updates to a CRDT, which are then verified to authorize changes
- readcap: can decrypt an encrypted CRDT and verify its authenticity
- verifycap: can verify signed updates to a CRDT, but cannot decrypt the contents. Useful for an honest-but-curious "server" which aggregates/stores CRDTs in case participants are not online at the same time.

Encryption isn't terribly tricky: a particular CRDT can have a symmetric (derivation) key, and holders of that key have "readcap" authority. Use derived keys per message and/or an MRAE mode and you don't need to worry about nonce collisions in a multi-user setting.

The more interesting aspect is signing, particularly in the context of δ-CRDTs, which gossip state updates between each other as opposed to sending the whole state.

In this capacity it'd be nice to have some sort of "commutative" signature operation that allows CRDT delta states to be processed and aggregated/coalesced in any order, but without the need to retain signatures for every update, as signatures can be quite large relative to the state being changed. In such a scheme, any client processing all of the updates would deterministically compute the same signature which is capable of authenticating the entire CRDT in its current state, regardless of the order in which the updates are received.

There are a few different algorithms which might be useful for these sorts of signatures which seem worth exploring:

- RSA accumulators
- BLS signatures (note this is a slightly different application than BLS multisignatures, which aggregate multiple signatures over the same document)
- Recursive zero-knowledge proofs

This is an area where I haven't seen a whole lot of research personally, and in general access control is underexplored in these systems, IMO.

If anyone is aware of any examples of people trying to apply an access control scheme to CRDTs, I'd certainly be interested!

--
Tony Arcieri

Alan Karp

unread,
Dec 1, 2020, 1:02:20 PM12/1/20
to cap-...@googlegroups.com
Tony Arcieri <bas...@gmail.com> wrote:
On Sat, Nov 28, 2020 at 2:05 PM Alan Karp <alan...@gmail.com> wrote:
I don't see the connection.  CRDTs are about data; ocaps are about permissions.
 
A tricky problem is how to reconcile the two. 
 
I think your note makes my point.  The CRDTs are about the data; the ocaps are about controlling updates to the data.

While the encryption you describe is interesting, I think it's orthogonal to the CRDT/ocap discussion.  That is unless I'm missing something. 

--------------
Alan Karp


--
You received this message because you are subscribed to the Google Groups "cap-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cap-talk+u...@googlegroups.com.

David Nicol

unread,
Dec 1, 2020, 7:20:19 PM12/1/20
to cap-...@googlegroups.com
https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type

I see an overlap with capability security in that any updatable value
could become an attack vector.

the approaches described on the wikipedia page presume "resolution
rule" disciplines, safe write operations, and generally non-hostile
community members within the perimeter.

It seems to me (never having even seen this new ETLA before today)
that keeping CRDT engineering practice separate from permissions would
only be valid if the operations from which the CRDT instance is
composed are somehow already safe.


>>> I don't see the connection. CRDTs are about data; ocaps are about permissions.
>> A tricky problem is how to reconcile the two.
> I think your note makes my point. The CRDTs are about the data; the ocaps are about controlling updates to the data.
> While the encryption you describe is interesting, I think it's orthogonal to the CRDT/ocap discussion. That is unless I'm missing something.

Tony Arcieri

unread,
Dec 1, 2020, 7:41:02 PM12/1/20
to cap-...@googlegroups.com
On Tue, Dec 1, 2020 at 10:02 AM Alan Karp <alan...@gmail.com> wrote:
While the encryption you describe is interesting, I think it's orthogonal to the CRDT/ocap discussion.  That is unless I'm missing something.

Given CRDTs are often intended to be used in P2P systems, e.g. collaborative editing, adding a crypto-OCap access control layer can be immensely helpful, particularly for things like providing read-only views of a CRDT's state, or augmenting a CRDT system with a server who can receive deltas and compute a new state, verifying access as part of that (enabling state-based replication of all the data for new participants who don't want to receive the entire history of deltas), but is still unable to see the contents.

The open question remains around how to actually authenticate them, however.

--
Tony Arcieri

David Nicol

unread,
Dec 1, 2020, 7:51:50 PM12/1/20
to cap-...@googlegroups.com
On Tue, Dec 1, 2020 at 6:41 PM Tony Arcieri <bas...@gmail.com> wrote:
>
> The open question remains around how to actually authenticate them, however.
>

Is there anything about a CRDT system that significantly
differentiates its data from the data held in any other mutable
resource?
That is, assuming the CRDT itself is sound and isn't leaking access
into it, is decomposing the operations on it (read, write, determine
existence, append, touch, etc) any different from decomposing the
operations on a transactional RDBMS or a file system? I think Karp was
saying it isn't.

That's user-facing though, and presuming a good perimeter. Internal to
the presumed-friendly cooperating nodes of the CRDT, that's a
different discussion.


--
"If you are neutral in situations of injustice, you have chosen the
side of the oppressor." -- Bshp. Desmond Tutu

Orie

unread,
Dec 2, 2020, 9:05:55 AM12/2/20
to cap-...@googlegroups.com
sorry I am late to the capabilities CRDT party.

"Identity Hubs" section of the now rebranded confidential storage spec https://github.com/decentralized-identity/confidential-storage

is working on this exact issue.

I have made a proposal for building CRDTs on top of EDV Documents, and accepting updates to them via HTTP Signatures and capabilities invocations, here is the issue:

https://github.com/decentralized-identity/confidential-storage/issues/97

DID Peer is a patch based CRDT, its not well described and does not use capabilities, but you may find it interesting, here is one implementation, there are others out there: 
https://github.com/transmute-industries/did-peer.js/tree/master/packages/did-peer.js

I have experimented with Signed AutoMerge Deltas before, I find automerge really awesome.

The Secure Data Store WG has essentially adopted ZCAPs for the authorization structure and HTTP signatures for their invocation for both "Hubs" and "Encrypted Data Vaults" sections of the "Confidential Storage" spec. Hubs are almost entirely about CRDTs whereas EDVS are just about structured documents.

Hubs are not super well defined, if this is of interest to any of you all, I invite you to challenge the folks at microsoft to step their game up in defining them :)


OS

--
You received this message because you are subscribed to the Google Groups "cap-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cap-talk+u...@googlegroups.com.

mostawe...@gmail.com

unread,
Dec 2, 2020, 10:10:38 AM12/2/20
to cap-talk
I haven't seen anybody mention yet that CRDTs are (idempotent) commutative monoids. That is, a CRDT has a binary operation which is commutative, has a unit, and optionally is idempotent.

First, by the Eckmann-Hilton argument, if a communication system's messages form a CRDT, then that CRDT will absorb any other operation which is even vaguely monoidal, when the values are the same in each operation. For example, if the communicators are playing a game of relational logic where they describe logical features of an imagined world to each other (as in a MUD, or a Prolog query session), then there is a commutative monoid on the imagined world state between them, and the Eckmann-Hilton argument suggests that there's a natural way to line it up with the communication system's CRDT: The message with unital logical content is both the CRDT's and the world state's unit.

Monads are monoids too. The environment monad (Haskell: Reader), which sets a fixed context for a computation, is commutative; it's idempotent if the context is immutable. The partial monad (Haskell: Maybe), which lets results be absent, is commutative and idempotent. But also, the classic E.join/2 behavior on the heap, and more generally the reference mechanics of E, is commutative and idempotent upon each client's heap.

The Eckmann-Hilton argument thus suggests that we can line up all of these effects. I want to mix in one more monad, but this one is non-commutative; it's our standard async-and-error-handling promise/deferred monad. And the resulting system is a minimal CapTP; its messages have a context where the receiving client will interpret them, messages can fail to be delivered, refs can become more resolved or broken, and there is a sequential semantics (from the promise/deferred monad) which steps the entire protocol forward.

I hope that this demystifies at least some of the reasoning behind why folks may have intuitively reasoned that CRDT terminology is just a complex way to talk about what already existed in object-capability land.

Tony Arcieri

unread,
Dec 2, 2020, 10:19:14 AM12/2/20
to cap-...@googlegroups.com
On Tue, Dec 1, 2020 at 4:51 PM David Nicol <david...@gmail.com> wrote:
Is there anything about a CRDT system that significantly
differentiates its data from the data held in any other mutable
resource?

With a typical mutable store with crypto-OCap access control, let's say something like a Tahoe mutable file, a holder of a writecap updates a single symlink-like pointer to a readcap which represents the current state of a file. Easy peasy.

δ-CRDTs don't work that way: they're a state lattice of "dots", where each "dot" effectively represents a change to the structure. In order to determine the current state of the object you need all of the "dots" (and several clients may be contributing dots simultaneously), which can be received in any order, and different clients may have different views which eventually converge on the same structure when they learn all of the same "dots".

Naively each client could sign every single dot they add to the state lattice, and that would work just fine, aside from every signature being e.g. 64-bytes. Now the problem is if you have a lot of small edits by a lot of clients, these signatures will quickly dominate the bulk of the state being stored. So really it's a scalability issue.

Solving that particular problem is more of a cryptographic design one than a question of access control: you need a scheme where the signature/authenticator over your view of a particular CRDT (i.e. the set of dots you have witnessed) "commutes" with the views others have. Two peers which are able to synchronize the set of dots they have witnessed and aggregated/combined their signature/accumulator should arrive at the same value for such an authenticator, and that value should be sufficient to prove that all of the dots in the CRDT were signed by holders of "writecaps" (of which there could potentially be many, and there could potentially be revocation although that gets tricky and requires some degree of coordination).

There are several schemes that have this property which are being explored for other reasons, most notably RSA accumulators and BLS signatures.

--
Tony Arcieri

David Nicol

unread,
Dec 2, 2020, 11:14:02 AM12/2/20
to cap-...@googlegroups.com
the Matrix.org chat room system promises negotiated resolution of
message presence and ordering, that makes it a CRDT right?

On Wed, Dec 2, 2020 at 8:05 AM Orie <or...@or13.io> wrote:
>
> CRDT party.

David Nicol

unread,
Dec 2, 2020, 11:23:10 AM12/2/20
to cap-...@googlegroups.com
On Wed, Dec 2, 2020 at 9:19 AM Tony Arcieri <bas...@gmail.com> wrote:
>
> On Tue, Dec 1, 2020 at 4:51 PM David Nicol <david...@gmail.com> wrote:
>>
>> Is there anything about a CRDT system that significantly
>> differentiates its data from the data held in any other mutable
>> resource?
>
>
> With a typical mutable store with crypto-OCap access control, let's say something like a Tahoe mutable file, a holder of a writecap updates a single symlink-like pointer to a readcap which represents the current state of a file. Easy peasy.
>
> δ-CRDTs don't work that way: they're a state lattice of "dots", where each "dot" effectively represents a change to the structure. In order to determine the current state of the object you need all of the "dots" (and several clients may be contributing dots simultaneously), which can be received in any order, and different clients may have different views which eventually converge on the same structure when they learn all of the same "dots".

Thanks! To argue Karp's position here, have you not left out the "and
waits for success" from the EZPZ " a holder of a writecap updates a
single symlink-like pointer to a readcap which represents the current
state of a file," and is not that elided step equivalent to waiting
for the convergence to complete, whatever that means in the CRDT
implementation?

Tony Arcieri

unread,
Dec 2, 2020, 12:20:43 PM12/2/20
to cap-...@googlegroups.com
On Wed, Dec 2, 2020 at 8:14 AM David Nicol <david...@gmail.com> wrote:
the Matrix.org chat room system promises negotiated resolution of
message presence and ordering, that makes it a CRDT right?

There are several systems which provide CRDT-like properties but aren't CRDTs. Operational Transformations (OT) are one of the most notable, and despite their success in collaborative editing systems, they notably aren't CRDTs because they have conflicts.

Not knowing how Matrix handles this I can't say offhand. 

On Wed, Dec 2, 2020 at 8:23 AM David Nicol <david...@gmail.com> wrote:
Thanks! To argue Karp's position here, have you not left out the "and
waits for success" from the EZPZ " a holder of a writecap updates a
single symlink-like pointer to a readcap which represents the current
state of a file," and is not that elided step equivalent to waiting
for the convergence to complete, whatever that means in the CRDT
implementation?

There's no waiting required with CRDTs: you can make updates locally, and gossip them as connectivity permits. Convergence is never guaranteed to complete either (hence "eventual consistency"), and most CRDT systems are designed with the notion that peers interested in syncing the state may drop off never to return.

--
Tony Arcieri

Ben Laurie

unread,
Dec 2, 2020, 1:41:14 PM12/2/20
to cap-talk
On Wed, 2 Dec 2020 at 17:20, Tony Arcieri <bas...@gmail.com> wrote:
There's no waiting required with CRDTs: you can make updates locally, and gossip them as connectivity permits. Convergence is never guaranteed to complete either (hence "eventual consistency"), and most CRDT systems are designed with the notion that peers interested in syncing the state may drop off never to return.

Most? You mean some exist that defy reality?
 

Neil Madden

unread,
Dec 2, 2020, 1:56:14 PM12/2/20
to cap-...@googlegroups.com
I don’t have much to add to the discussion of CRDTs specifically, but as Datalog has been mentioned a few times on this list there is a nice connection there in the form of CALM: (eventual) consistency as logical monotonicity (http://bloom-lang.net/calm/). Bloom is based on Dedalus, which is a temporal Datalog.

— Neil

On 28 Nov 2020, at 20:37, Christopher Lemmer Webber <cwe...@dustycloud.org> wrote:

A thing I don't see much attention to in ocap land as much but I do see
--
You received this message because you are subscribed to the Google Groups "cap-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cap-talk+u...@googlegroups.com.

ForgeRock values your Privacy

Tony Arcieri

unread,
Dec 2, 2020, 2:04:09 PM12/2/20
to cap-...@googlegroups.com
On Wed, Dec 2, 2020 at 10:41 AM 'Ben Laurie' via cap-talk <cap-...@googlegroups.com> wrote:
Most? You mean some exist that defy reality?

I might describe them as "naively designed" or "possessing bad tradeoffs".

There are several cases where it'd be nice to perform a coordinated/synchronous operation among the set of peers exchanging CRDTs. The main one is garbage collection: since δ-CRDTs change state by adding "dots", they grow unboundedly. If all peers could be online and agree to perform such an operation, it'd be nice to periodically garbage collect the "dots", which would more or less involve everyone agreeing to abandon an original CRDT and move on to a new one which begins with some agreed upon state, but this is fraught with difficulty for various reasons. Using a sort of "epoch" mechanism which combines a traditional BFT consensus mechanism (requiring 2/3rds honest participants) and placing some sort of timeout on each epoch is a heavy-handed approach to this.

Revocation is another case where such a synchronous mechanism would be desirable.

Op-based CRDTs are another example of brittle CRDTs: earlier "mostawesomedude" mentioned that CRDTs are idempotent, but that isn't always the case. Op-based CRDTs require "exactly once" delivery, which is a nearly impossible guarantee to provide.

--
Tony Arcieri

Tony Arcieri

unread,
Dec 2, 2020, 3:58:14 PM12/2/20
to cap-...@googlegroups.com
There are some tricky access control questions around applying OCap to δ-CRDTs which may be more interesting to this group beyond what I've brought up before.

State transitions in δ-CRDTs are based on "dots". A "dot" can itself be thought of as a capability: it confers the authority to make a single change to the CRDT. To make a handwavy allusion to cryptography, a dot can almost be thought of as a nonce.

To obtain a dot, a peer needs to confer with a "dot generator" node/role, whose duty is to ensure that dots are handed out uniquely. In existing CRDT systems, this requires both honest dot generators and honest peers, as dots are trivially forgable, and you trust the dot generator to maintain a sequential ordering. (If you're curious, there can be multiple dot generators, so long as they hand out dots from unique ranges)

Tricky OCap questions:

- How do you represent "dots" in a way that's unforgeable? (one option which is applicable to e.g. BLS signatures is to associate each dot with a public/private keypair)
- How do you confer "dot generator" authority as a capability? This would seem to require a capability which designates some sort of address for the dot generator as well as the range of dots they're authorized to generate.
- How are "writecaps" represented, i.e. what authorizes a peer in the system to ask a dot generator for dots and then use them to update a particular CRDT?

For background, here's one of the seminal δ-CRDT papers: https://arxiv.org/abs/1603.01529

--
Tony Arcieri

David Nicol

unread,
Dec 2, 2020, 4:00:51 PM12/2/20
to cap-...@googlegroups.com
On Wed, Dec 2, 2020 at 1:04 PM Tony Arcieri <bas...@gmail.com> wrote:

> Op-based CRDTs require "exactly once" delivery, which is a nearly impossible guarantee to provide.

Fiddlesticks! the final hop is a gatekeeper that rejects messages it's
seen before, and all messages have
(source,timestamp,counter,entropy,signature) data in them, or similar,
to guarantee uniqueness.

Tony Arcieri

unread,
Dec 2, 2020, 4:07:42 PM12/2/20
to cap-...@googlegroups.com
On Wed, Dec 2, 2020 at 1:00 PM David Nicol <david...@gmail.com> wrote:
Fiddlesticks! the final hop is a gatekeeper that rejects messages it's
seen before, and all messages have
(source,timestamp,counter,entropy,signature) data in them, or similar,
to guarantee uniqueness.

While that works, one of the main use cases of CRDTs is ad hoc P2P systems where there doesn't need to be any sort of intermediary node at all present

--
Tony Arcieri

Raoul Duke

unread,
Dec 2, 2020, 4:08:03 PM12/2/20
to cap-...@googlegroups.com
> Fiddlesticks! the final hop is a gatekeeper that rejects messages it's
> seen before, and all messages have
> (source,timestamp,counter,entropy,signature) data in them, or similar,
> to guarantee uniqueness.

and it is willing to wait around for the heat death of the universe
before purging anything in the tables? :-)

Mark S. Miller

unread,
Dec 2, 2020, 4:22:32 PM12/2/20
to cap-...@googlegroups.com
CALM and Daedalus had a huge influence on Yedalog.

Alan Karp

unread,
Dec 2, 2020, 4:24:08 PM12/2/20
to cap-...@googlegroups.com

That's exactly the way waterken works with the receiver acting as the gatekeeper.  The sender keeps sending a message (with exponential backoff) until it receives an ACK.  Messages that are never acknowledged are kept by the sender "forever."  Forever ends when you're willing to accept any inconsistencies that might arise due to forgetting the message.  In practice, you never have to do that.  As the Plan 9 folks noted, their disks got emptier over time as they replaced old disks with denser ones.

--------------
Alan Karp

Alan Karp

unread,
Dec 2, 2020, 4:28:29 PM12/2/20
to cap-...@googlegroups.com
Tony Arcieri <bas...@gmail.com> wrote:

- How do you represent "dots" in a way that's unforgeable? (one option which is applicable to e.g. BLS signatures is to associate each dot with a public/private keypair)
- How do you confer "dot generator" authority as a capability? This would seem to require a capability which designates some sort of address for the dot generator as well as the range of dots they're authorized to generate.
- How are "writecaps" represented, i.e. what authorizes a peer in the system to ask a dot generator for dots and then use them to update a particular CRDT?

One question you left off is what is the root of authority?  Presumably, that's the someone who starts the CRDT service.  I think the answers to your last two questions come from that observation.

--------------
Alan Karp

David Nicol

unread,
Dec 2, 2020, 4:29:39 PM12/2/20
to cap-...@googlegroups.com
Well, the local universe, yeah. Otherwise, the other discussion of
distributed GC would apply to deleting entries from %seen. Or a
reasonable time limit, for use with the timestamp.

Tony Arcieri

unread,
Dec 2, 2020, 4:36:04 PM12/2/20
to cap-...@googlegroups.com
On Wed, Dec 2, 2020 at 1:28 PM Alan Karp <alan...@gmail.com> wrote:
One question you left off is what is the root of authority?  Presumably, that's the someone who starts the CRDT service.  I think the answers to your last two questions come from that observation.

Indeed, someone (potentially via some service) needs to create each CRDT initially and then confer the authority to generate dots for it to one or more dot generators

Ben Laurie

unread,
Dec 3, 2020, 8:21:00 AM12/3/20
to cap-talk
On Wed, 2 Dec 2020 at 19:04, Tony Arcieri <bas...@gmail.com> wrote:
On Wed, Dec 2, 2020 at 10:41 AM 'Ben Laurie' via cap-talk <cap-...@googlegroups.com> wrote:
Most? You mean some exist that defy reality?

I might describe them as "naively designed" or "possessing bad tradeoffs".

There are several cases where it'd be nice to perform a coordinated/synchronous operation among the set of peers exchanging CRDTs. The main one is garbage collection: since δ-CRDTs change state by adding "dots", they grow unboundedly. If all peers could be online and agree to perform such an operation, it'd be nice to periodically garbage collect the "dots", which would more or less involve everyone agreeing to abandon an original CRDT and move on to a new one which begins with some agreed upon state, but this is fraught with difficulty for various reasons. Using a sort of "epoch" mechanism which combines a traditional BFT consensus mechanism (requiring 2/3rds honest participants) and placing some sort of timeout on each epoch is a heavy-handed approach to this.

If you know who the participants are then you can discard dots once everyone has seen them...
 

Revocation is another case where such a synchronous mechanism would be desirable.

Op-based CRDTs are another example of brittle CRDTs: earlier "mostawesomedude" mentioned that CRDTs are idempotent, but that isn't always the case. Op-based CRDTs require "exactly once" delivery, which is a nearly impossible guarantee to provide.

--
Tony Arcieri

--
You received this message because you are subscribed to the Google Groups "cap-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cap-talk+u...@googlegroups.com.

Raoul Duke

unread,
Dec 3, 2020, 10:58:49 AM12/3/20
to cap-...@googlegroups.com
(ie if the per client counter really works then you can store what is missing up to the highest counter rather than everything that was received.)

Neil Madden

unread,
Dec 3, 2020, 3:34:34 PM12/3/20
to cap-...@googlegroups.com
That’s good to know. I still haven’t read up on Yedalog. My impression was that CALM somewhat subsumed CRDTs, so that you could think about ocaps and CRDTs by instead thinking about ocaps and Datalog. Does that sound like a reasonable inference?

On 2 Dec 2020, at 21:22, Mark S. Miller <ma...@agoric.com> wrote:



ForgeRock values your Privacy

Christopher Lemmer Webber

unread,
Dec 5, 2020, 10:04:00 PM12/5/20
to cap-...@googlegroups.com, Alan Karp
Alan Karp writes:

> Tony Arcieri <bas...@gmail.com> wrote:
>
>> On Sat, Nov 28, 2020 at 2:05 PM Alan Karp <alan...@gmail.com> wrote:
>>
>>> I don't see the connection. CRDTs are about data; ocaps are about
>>> permissions.
>>>
>>
>> A tricky problem is how to reconcile the two.
>>
>
>>
> I think your note makes my point. The CRDTs are about the data; the ocaps
> are about controlling updates to the data.
>
> While the encryption you describe is interesting, I think it's orthogonal
> to the CRDT/ocap discussion. That is unless I'm missing something.

As Alan has pointed out, yes you can have ocaps as the basis and input
for inputs, but you could also do that on a key-value store, and we
probably wouldn't raise much of an eyebrow. However, I do think that
maybe ocaps have something *of value to provide* that are not in most
"enterprise CRDT libraries", namely the ability to have a wider
CRDT-backed database with more fine-grained access.

A conversation on a call not recorded here, but I think interesting:
there is probably a subset of una which are well constructed by CRDTs,
and maybe compositionally so: I think Baldur said "A CRDT library for
ocap environments as an una construction kit" (or something similar).

Anyway, some fun talking about this despite me throwing this topic out
here not knowing quite what I was looking for, and agreeing ultimately
that these are *mostly* just two different interesting things which
*can* be wired together for interesting results. ;)

Martin Kleppmann

unread,
Dec 9, 2020, 10:04:12 AM12/9/20
to cap-talk
Hi all, Tony pointed me at this thread. I do a lot of research on CRDTs, including the Automerge library that has been mentioned here. Forgive me for parachuting in out of nowhere; I just have a few thoughts that might be helpful.

I totally agree that CRDTs are not magic, and they are not appropriate for all applications. I think of them as data structures that can be updated concurrently by different users or nodes, and which have built-in conflict resolution policies. For example, if two users concurrently insert items into a list, the conflict resolution will ensure that both inserted items are preserved, and that each item remains in the correct list position relative to its neighbours. If the items are inserted in the same place, the CRDT ensures that everyone deterministically puts those items in the same order. CRDTs by themselves do not say anything about who is allowed to make those changes.

Applications that are a good fit for CRDTs: collaboration software where multiple users can concurrently update a shared dataset (in the style of Google Docs, Spreadsheets, Trello, Figma, calendar sync, …); and ensuring eventual consistency for gossip protocols (e.g. managing cluster state in distributed databases). CRDTs allow these types of systems to be built in a decentralised/peer-to-peer manner.

Not appropriate for CRDTs: applications where there is contention/mutual exclusion around a limited resource (not selling more tickets than you have seats on an aeroplane/in a theatre, not selling more items than you have in stock, not allowing an account balance to go negative, not allowing the same token to be used more than once). These applications need strong consistency and/or consensus.

We have recently been doing work on CRDTs in adversarial settings. Two things that might be of interest:

- This paper https://arxiv.org/abs/2012.00472 shows that certain types of CRDT-based application (defined precisely in the paper) can be immune to Sybil attacks, allowing them to use much lighter-weight techniques than blockchains. This is also closely related to CALM and logical monotonicity, which was mentioned on this thread.

- This paper https://eprint.iacr.org/2020/1281 defines an approach for encryption in CRDT-based systems, including key rotation as users are added or removed. Related to capabilities, a key part of this protocol is determining the set of current valid users in the absence of a trusted server. It's not entirely obvious what happens, for example, when two users both try to remove each other from the system!

One last thing: I respectfully disagree with Tony on his statement that "Op-based CRDTs require 'exactly once' delivery, which is a nearly impossible guarantee to provide". You can ensure the required guarantee by acknowledging received operations, retransmitting unacknowledged operations, and suppressing duplicate operations at the recipient. This is easy to do if you are willing to remember all the operations you have seen so far (Automerge is op-based and it does this).

Remembering all operations ever may seem wasteful, but we have been working on compression techniques that make this efficient: e.g. for text editing, we are able to store the entire keystroke-by-keystroke editing history using less than 1 byte per operation. More on this topic in this video: https://youtu.be/x7drE24geUw?t=3194 (starting at 53:14). With such compression, remembering the full CRDT operation history is viable for a large set of applications.

Hope that is a constructive contribution to the conversation.
Best wishes,
Martin
On Sunday, November 29, 2020 at 9:47:59 PM UTC cwe...@dustycloud.org wrote:
Ocaps can be about data too, but aren't necessarily (the line is less
squishy around E, squishier around certificate ocaps and ocap logic
databases, I'd think). Here is another way to put it: ocaps aren't just
about authority, but about behavior. Data can flow out of an object as
one type of behavior of invoking that reference.

But you're asking what the connection is, and I'm admitting straight
out: I don't know. I suspect there's something here to investigate, but
I haven't quite sussed out what it is.

The origin of this is more or less people 'splain'ing to me that I made
major errors in my systems by not putting CRDTs at the heart. As an
example, multiple people have told me "ActivityPub is hopeless because
you didn't start out with CRDTs". When trying to get clarity as to why
I feel like the responses are vague. But they might be somewhat on to,
and off of, something...

I think this is one of those cases that I'll call, at the moment,
"approach enthusiasm". You can say I'm an "approach enthusiast" of
ocaps, lisps, reproducible software. Sometimes when I see a problem, I
might grumble and indicate that I think one of the above would have
solved the problem much better. But sometimes when I make that claim, I
don't fully know... it's a gut feeling from having seen similar problems
which have been solved by these approaches, and having seen the pitfalls
from having chosen alternatives. But it may be while making a claim
that I haven't actually thought through all of the ways they would have.
None of the above are "magic pixie dust", and I know that, but it's
tempting enough to treat them way in shorthands, because I believe they
are "the right approach"... even if the ways in which they solve
problems might take longer for me to suss out (or maybe they don't in
this case!).

Similarly, I think maybe that it's that I'm encountering "CRDT approach
enthusiasts", who have seen a lot of problems successfully solved with
CRDTs, and thus they assume that the ones I'm working on would benefit
as well. I think part of this is that CRDTs have captured the
enterprise-worker-distributed-systems-mindset... the kinds of problems
facing "enterprise software" (intentionally vague there) might actually
well be suited by CRDTs generally.

I'm not quite sure where I'm going with this, and sorry if it's just too
vague. Just trying to figure out what the source of this zeitgeist is
and if I'm wrong for not giving it enough time and attention.

To give the benefit of doubt, here are maybe a few places where I think
CRDTs might compose nicely with ocap systems I'm working on:

- A cooperative quorum of vats which all want to choose the same input
and are willing to rewind as necessary to get there. Kind of like an
ocap version of (my understanding of) Distributed TeaTime.

- Similarly, setting up something like an unum system but where the
source of update-truth may actually come from multiple sources at
multiple times. Eventually, everyone's "unum presence" should
converge on the same value, having seen all inputs.

What it seems like CRDTs don't solve:

- Blockchain-type needs, such as the double spending problem, on their
own... consensus still seems necessary in the case that two mutually
opposing choices of "spending" might occur with effects (I have
enough money to buy a new computer or to buy groceries, but not
both...)

- CRDTs don't, on their own, really have anything to do with authority,
as Alan indicates. I think that is a useful observation and is
partly why I'm confused by advocates maybe indicating it'll "solve
all my problems"... a lot of my problems are authority-oriented...

- Sometimes I want opaque objects which spit out data. I don't know
how to do this other than "distributed actors with opaque machines
on the network" type approach. CRDTs don't do anything here.

Dunno. Guess that's not so helpful. Thinking out loud though. Sorry
if it sounded like I was asking a specific question when I was really
trying to figure out what's under all this murk.


Alan Karp writes:

> I don't see the connection. CRDTs are about data; ocaps are about
> permissions. Would you be asking this question if the topic was databases
> instead of CRDTs?
>
> On Sat, Nov 28, 2020 at 12:37 PM Christopher Lemmer Webber <

Christopher Lemmer Webber

unread,
Dec 9, 2020, 2:47:37 PM12/9/20
to cap-...@googlegroups.com, Martin Kleppmann
Martin Kleppmann writes:

> Hi all, Tony pointed me at this thread. I do a lot of research on CRDTs,
> including the Automerge library that has been mentioned here. Forgive me
> for parachuting in out of nowhere; I just have a few thoughts that might be
> helpful.

Hi Martin,

Thank you for posting in this thread! This is fascinating stuff.

> I totally agree that CRDTs are not magic, and they are not appropriate for
> all applications. I think of them as data structures that can be updated
> concurrently by different users or nodes, and which have built-in conflict
> resolution policies. For example, if two users concurrently insert items
> into a list, the conflict resolution will ensure that both inserted items
> are preserved, and that each item remains in the correct list position
> relative to its neighbours. If the items are inserted in the same place,
> the CRDT ensures that everyone deterministically puts those items in the
> same order. CRDTs by themselves do not say anything about who is allowed to
> make those changes.
>
> Applications that are a good fit for CRDTs: collaboration software where
> multiple users can concurrently update a shared dataset (in the style of
> Google Docs, Spreadsheets, Trello, Figma, calendar sync, …); and ensuring
> eventual consistency for gossip protocols (e.g. managing cluster state in
> distributed databases). CRDTs allow these types of systems to be built in a
> decentralised/peer-to-peer manner.
>
> Not appropriate for CRDTs: applications where there is contention/mutual
> exclusion around a limited resource (not selling more tickets than you have
> seats on an aeroplane/in a theatre, not selling more items than you have in
> stock, not allowing an account balance to go negative, not allowing the
> same token to be used more than once). These applications need strong
> consistency and/or consensus.

Whew! My gut instincts are correct!

> We have recently been doing work on CRDTs in adversarial settings. Two
> things that might be of interest:

Oooooh!

> - This paper https://arxiv.org/abs/2012.00472 shows that certain types of
> CRDT-based application (defined precisely in the paper) can be immune to
> Sybil attacks, allowing them to use much lighter-weight techniques than
> blockchains. This is also closely related to CALM and logical monotonicity,
> which was mentioned on this thread.
>
> - This paper https://eprint.iacr.org/2020/1281 defines an approach for
> encryption in CRDT-based systems, including key rotation as users are added
> or removed. Related to capabilities, a key part of this protocol is
> determining the set of current valid users in the absence of a trusted
> server. It's not entirely obvious what happens, for example, when two users
> both try to remove each other from the system!

I made it a good chunk through the former (excellently written and
approachable, even for a newbie like me), and have put the latter on the queue.
Promising stuff.

> One last thing: I respectfully disagree with Tony on his statement that
> "Op-based CRDTs require 'exactly once' delivery, which is a nearly
> impossible guarantee to provide". You can ensure the required guarantee by
> acknowledging received operations, retransmitting unacknowledged
> operations, and suppressing duplicate operations at the recipient. This is
> easy to do if you are willing to remember all the operations you have seen
> so far (Automerge is op-based and it does this).
>
> Remembering all operations ever may seem wasteful, but we have been working
> on compression techniques that make this efficient: e.g. for text editing,
> we are able to store the entire keystroke-by-keystroke editing history
> using less than 1 byte per operation. More on this topic in this video:
> https://youtu.be/x7drE24geUw?t=3194 (starting at 53:14). With such
> compression, remembering the full CRDT operation history is viable for a
> large set of applications.
>
> Hope that is a constructive contribution to the conversation.
> Best wishes,
> Martin

Very constructive! Not the priority of my work right now, but
eventually probably will become critical, so I appreciate these resources!
Reply all
Reply to author
Forward
0 new messages