As written right now there's only one format for expressing patches. Example from the spec[1]:
Content-Length: 53
Content-Range: json=.messages[1:1]
[{"text": "Yo!",
"author": {"link": "/user/yobot"}]
This expresses the change:
- Replace document.messages[1:1] (an empty set)
- With the content in the list [{text:..., author:...}]
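The replacement described above amounts to a list splice. Here is a minimal sketch in Python, assuming a hypothetical starting document with two messages. The function name apply_range_patch is made up for illustration and is not part of the Braid spec.

```python
import copy

def apply_range_patch(document, key, start, end, content):
    """Replace document[key][start:end] with content and return a new document.

    Hypothetical helper: it only handles a single top-level list field.
    """
    patched = copy.deepcopy(document)
    # Slice assignment with start == end inserts without removing anything.
    patched[key][start:end] = content
    return patched

# Hypothetical starting state (not taken from the spec).
doc = {"messages": [{"text": "Hi"}, {"text": "Bye"}]}
patch_content = [{"text": "Yo!", "author": {"link": "/user/yobot"}}]

# json=.messages[1:1] -> replace the empty range at index 1
new_doc = apply_range_patch(doc, "messages", 1, 1, patch_content)
print([m["text"] for m in new_doc["messages"]])
```

Because the range [1:1] is empty, the patch behaves as a pure insert at index 1; a non-empty range such as [1:2] would replace the existing element instead.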
The spec lets you specify custom algorithms for describing how concurrent patches are *merged*. But the format of the patch itself is hardcoded.
In general you can define any patch format you'd like, in arbitrary English, with a spec. See [RFC6902] for an example. You would probably like to make one for sharedb as well.

Request:

PUT /chat
Version: "up12vyc5ib"
Parents: "2bcbi84nsp"
Content-Type: application/json
Merge-Type: sync9
Patches: 1

Content-Length: 326
Content-Type: application/json-patch+json

[
  { "op": "test", "path": "/a/b/c", "value": "foo" },
  { "op": "remove", "path": "/a/b/c" },
  { "op": "add", "path": "/a/b/c", "value": [] },
  { "op": "replace", "path": "/a/b/c", "value": 42 },
  { "op": "move", "from": "/a/b", "path": "/a/d" },
  { "op": "copy", "from": "/a/d/d", "path": "/a/d/e" }
]

Response:

HTTP/1.1 200 OK
Patches: OK
automerge_change = {
  "actor": "abc2",
  "seq": 2,
  "startOp": 13,
  "time": 1612489353,
  "deps": ["100@face", "2@beef"],
  "ops": [
    { "obj": "7@abc1", "elemId": "12@abc2", "action": "set", "insert": true, "value": "h", "pred": [] },
    { "obj": "7@abc1", "elemId": "13@abc2", "action": "set", "insert": true, "value": "i", "pred": [] }
  ]
}
PUT /chat
["100@face", "2@beef"]
PUT /doc
Version: "abc2@2"
Parents:["100@face", "2@beef"]
Patches: 1

Content-Length: xxx
Content-Type: application/automerge

<<Automerge's compact binary encoding of operations>>

PATCH /doc
Version: "abc2@2"
Parents:["100@face", "2@beef"]
Content-Length: xxx
Content-Type: application/automerge

<<Automerge's compact binary encoding of operations>>
HTTP/1.1 209 Subscription
cache-control: no-cache
connection: keep-alive
content-type: application/automerge
Transfer-Encoding: chunked
version: ["100@face", "2@beef"]    <---- here

content-length: xxx
version: ["100@face", "2@beef"]    <---- and here

<<automerge encoded document>>

----(later)----

content-length: xxx
version: "abc2@2"    <---- and maybe here?

<<automerge encoded patch>>
--
You received this message because you are subscribed to the Google Groups "Braid" group.
To unsubscribe from this group and stop receiving emails from it, send an email to braid-http+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/braid-http/79b3a497-43c9-477e-a39e-d62bfd7f2106%40www.fastmail.com.
I think "100@face" is an (agent, seq) pair, and thus ["100@face",
"2@beef"] is a pair of pairs. A pair of pairs is a pair of
versions, not a version. Thus I think Duane is wondering if it was
intentional to write a pair of versions in a field named
"version". Usually pairs of versions appear in fields labeled
"Parents".
I think I can see the confusion. There are multiple ways to refer to a version when using (agentid, seq) pairs:
(1) With the (agentid, seq) that created the version
(2) With a version vector of all the latest [(agent, seq) ...] known when the version was created
It sounds like you were using version vectors (2) in this example, and Duane and I were expecting (1). Thus the confusion.
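The two styles can be contrasted in a short Python sketch. The history below is hypothetical and simplified, and it assumes strings such as "100@face" read as seq@agent; neither the names nor the data come from Automerge or the Braid spec.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, int]  # (agent_id, seq)

def version_vector(known_events: List[Event]) -> Dict[str, int]:
    """Collapse known events into {agent: highest seq seen from that agent}."""
    vector: Dict[str, int] = {}
    for agent, seq in known_events:
        vector[agent] = max(seq, vector.get(agent, 0))
    return vector

# Hypothetical events already known when the new change is created.
known = [("face", 99), ("face", 100), ("beef", 1), ("beef", 2)]

# Style (1): name the version by the single (agent, seq) that created it.
created_by: Event = ("abc2", 2)

# Style (2): name it by the vector of latest events known at creation time.
known_at_creation = version_vector(known)

print(created_by)
print(known_at_creation)
```

Under style (1) the version is one pair, so it fits a single-string Version header; under style (2) it is a set of pairs, which is why it looks like a Parents list.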
I think we can get back to your proposal now.
PUT /chat
Version: "abc2@2"
Parents:["100@face", "2@beef"]
Patches: 2
The versioning system I'm using here is independent of my proposal. I'd love to discuss versions - but let's do it in another thread or we'll miss the forest for the trees.
I'm all for discussing versions separately. In that case, it would help for the versions in this example to meet the current Braid spec, so that all the differences in the examples are things that you want us to focus on here.
Here are a couple of differences that have been tripping me up so far:
(1) The value of "Parents:" is not supposed to be surrounded by square brackets:
Parents: ["100@face", "2@beef"]
It's just a list of strings, separated by commas, using the
HTTPBis "Structured Headers" format:
   Parents are also specified with a header in a PUT request or GET response:

      Parents: "ajtva12kid", "cmdpvkpll2"

   The Parents header is a List of Strings, in the syntax of HTTP's [STRUCTURED-HEADERS].
(2) A version, likewise, is not supposed to be surrounded by square brackets:
version: ["100@face", "2@beef"]
It's supposed to just be a string in the "Structured Headers" format:
   Versions are specified as a string in the [STRUCTURED-HEADERS] format. Each version string must be unique, to differentiate any two points in time.

   ...<snip>...

      Version: "dkn7ov2vwg"

These differences were leading me to think that there was either an unintended bug in the example, or that you were proposing a new versioning system along with the change in patches.
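To make the header syntax concrete, here is a minimal Python sketch of parsing and writing a list of quoted strings. It covers only a simplified subset of Structured Headers (double-quoted strings separated by commas, no parameters or inner lists), and the function names are hypothetical.

```python
from typing import List

def _skip_spaces(value: str, i: int) -> int:
    while i < len(value) and value[i] in " \t":
        i += 1
    return i

def parse_string_list(value: str) -> List[str]:
    """Parse '"a", "b"' into ['a', 'b']; raise ValueError on anything else."""
    items: List[str] = []
    i, n = 0, len(value)
    while True:
        i = _skip_spaces(value, i)
        if i >= n or value[i] != '"':
            raise ValueError(f"expected opening quote at position {i}")
        i += 1
        chars: List[str] = []
        while i < n and value[i] != '"':
            if value[i] == "\\":
                i += 1  # take the character after the backslash literally
                if i >= n:
                    raise ValueError("dangling backslash")
            chars.append(value[i])
            i += 1
        if i >= n:
            raise ValueError("unterminated string")
        i += 1  # consume the closing quote
        items.append("".join(chars))
        i = _skip_spaces(value, i)
        if i >= n:
            return items
        if value[i] != ",":
            raise ValueError(f"expected comma at position {i}")
        i += 1

def format_string_list(items: List[str]) -> str:
    """Write ['a', 'b'] as '"a", "b"', escaping backslashes and quotes."""
    quoted = ['"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"' for s in items]
    return ", ".join(quoted)

print(parse_string_list('"100@face", "2@beef"'))
print(format_string_list(["ajtva12kid", "cmdpvkpll2"]))
```

With this reading, a bracketed value such as ["100@face", "2@beef"] is rejected because it does not begin with a quoted string, and a Version header is simply a list of length one.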
(1) The value of "Parents:" is not supposed to be surrounded by square brackets:
Parents: ["100@face", "2@beef"]
"100@face", "2@beef"
(2) A version, likewise, is not supposed to be surrounded by square brackets:
version: ["100@face", "2@beef"]
These differences were leading me to think that there was either an unintended bug in the example, or that you were proposing a new versioning system along with the change in patches.
content-length: xxx
version: "b:0"
content-length: xxx
version: "b:0"

Oops, that's a typo - should have read version: "v:1".
parents: "a:0", "b:0"    <-- How should the client interpret this? It's never seen "a:0" or "b:0"
content-range: json=.messages[1:1]    <-- What are these indexes relative to?
<<patch content>>    <-- Is this content not transformed by the server? Even when doing OT?

Thanks!
-Seph
The startOp field is split out explicitly into each patch object. We also depend on a new content-range header which expresses the insert location using automerge semantics:

PUT /chat
Version: "abc2@2"
Parents:["100@face", "2@beef"]
Patches: 2

Content-Length: xxx
Content-Type: application/json+automerge
Content-Range: automerge="7...@abc1.12@abc2"

{ "id": "13@abc2", "action": "set", "insert": true, "value": "h", "pred": [] }

Content-Length: xxx
Content-Type: application/json+automerge
Content-Range: automerge="7...@abc1.13@abc2"

{ "id": "14@abc2", "action": "set", "insert": true, "value": "i", "pred": [] }
PUT /chat
Version: "abc2@2"
Parents: "100@face", "2@beef"
Patches: 2
Content-Range: automerge="7...@abc1.12@abc2"
Content-Length: 1
h
Content-Range: automerge="7...@abc1.13@abc2"
Content-Length: 1
i
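A request of this shape could be assembled roughly as follows. This is only a sketch: the function name is hypothetical, the Content-Range values are placeholders rather than real Automerge ids, and the CRLF line endings and blank lines between each patch's headers and body are assumptions borrowed from ordinary HTTP framing.

```python
from typing import List, Tuple

def build_patch_put(path: str, version: str, parents: List[str],
                    patches: List[Tuple[str, bytes]]) -> bytes:
    """Serialize a PUT carrying several (content_range, body) patches."""
    head = [
        f"PUT {path}",
        f'Version: "{version}"',
        # Parents is a comma-separated list of quoted strings, no brackets.
        "Parents: " + ", ".join(f'"{p}"' for p in parents),
        f"Patches: {len(patches)}",
    ]
    out = "\r\n".join(head).encode("utf-8") + b"\r\n"
    for content_range, body in patches:
        out += b"\r\n"  # blank line separating this patch from what came before
        out += f"Content-Range: {content_range}\r\n".encode("utf-8")
        out += f"Content-Length: {len(body)}\r\n".encode("utf-8")
        out += b"\r\n" + body + b"\r\n"
    return out

request = build_patch_put(
    "/chat", "abc2@2", ["100@face", "2@beef"],
    # Placeholder ranges; real values would name an Automerge object and element.
    [('automerge="obj.elem1"', b"h"), ('automerge="obj.elem2"', b"i")],
)
print(request.decode("utf-8"))
```

Each patch carries its own Content-Length, so a receiver can read exactly that many bytes of body before looking for the next patch's headers.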
- Automerge and Yjs already have compact, optimized binary representations for patches. If we break patches out into a series of operations, we are forced to discard these optimized formats. And that would be a mistake because we need them. Automerge's optimized format is *50x* smaller. Gzip simply doesn't help enough - and even if it did, v8 struggles to parse the hundreds of megabytes of data objects created in an editing session lasting a few hours.
- Yes, we probably can convert json-ot moves to use your unspecified portal format, but why bother? That sounds like a lot of effort given it already works well, and works correctly. Let's push that into a separate RFC if it's important to you.
- The spec is simpler this way:
  - No more Patches: X in the read or write path.
  - Unless I'm missing something, I don't think we need PUT to send patches to the server
  - No more triply-nested HTTP in subscriptions. We only need 2 levels of HTTP encoding. (Outer and inner)
  - The content-range header can be removed from the braid spec. It can live in the sync9 spec or something, which is the only implementation I expect will actually use it.
Seph, this opinion about Merge Types and Patches is aggregating a number of topics into a single narrative. I'd suggest breaking it apart into the smallest possible units, so that we can discuss each one independently. Then we can make more progress together as a group.

It is much easier to solve small independent issues one-by-one. This way you can enumerate, and consider one-by-one, all options for each issue, without getting lost in the combinatorial explosion of considering all options x all issues. And this is especially important for a large group, because breaking up issues into smaller issues allows each member of the group to choose different issues to work on, without having to pay attention to everything that happens in the entire group.
In this case, I see two primary issues being raised:
- Patch Formats: General vs. Custom?
- A new syntax for encoding Versions & Patches
On the other hand, it's possible that you are trying to argue that "Braid should not have Range Patches, every patch should be a Custom Patch format." Is this what you are trying to say?
That's not cool. Range Patches are optional and sit in their own separate file. They aren't causing you any harm.
And furthermore, Range Patches are awesome. We can generalize every synchronizer's patch format into a single simple model! That's so fascinating!
And it's a model already built into HTTP (the Range Request) that just hasn't been defined yet for PUTs! This is a great discovery in Computer Science!