ShareJS and ShareDB

Joseph Gentle

Oct 8, 2015, 11:52:06 AM
to sha...@googlegroups.com, derbyjs
Hi! This is longer than I expected. Skip to the NEXT STEPS section for the tldr.


I first wrote ShareJS hoping for a simple library on top of which
people could build their own collaborative applications. I showed off
the first working version in April 2011, running on NodeJS 0.4. The
list of libraries you could manually download was in a github wiki
page. There were also no websockets - not even a draft.

When I wrote ShareJS I wasn't thinking about JSON at all. JSON OT was a
strange little idea a german researcher proposed over italian food at
the google wave summit. (Held right after wave was cancelled). I
somehow convinced Jeremy (@nornagon) to implement the damn thing (much
harder than it sounds!) and we put it in sharejs - because sharejs can
support editing arbitrary types. ShareJS 0.6 was a beautifully crafted
snowflake. I prided myself on its tests-to-code ratio of about 1.5:1.
I took it personally when bugs survived in that code base - my tests
also managed to find bugs in socket.io, bugs in uglifyjs and once a
bug in the coffeescript compiler. I tested sharejs down to IE6.

Then at the start of 2014 I moved to the bay area and started working
at Lever. The company was only a few months old at the time. Nate (our
CEO) had written this little framework called Derby which he'd
presented at the realtime.js conference. We geeked out over combining
forces - what would Derby look like on top of ShareJS? It was a total
conceit - probably a terrible idea for such a young company to invest
resources into but we did it anyway. We did it - but we never had
quite enough time to do everything *well*. It was as fast as it needed
to be, with just enough features to make our app work just well
enough. I'm super proud of what lever is growing into, but I want my
code to *never fall down*. That was my standard with sharejs a few
years ago, and I ruined it. I still get that guilty vertigo feeling
when I look at the issue tracker.

This all hit home early this year when I was invited onto some
javascript podcast. One of the people on the call asked about using
sharejs in his own project - he wanted a simple collaborative text
editor. I told him about all of sharejs's great features and he said
in that case it probably wasn't the right tool for him. I didn't know
how to respond - it was like I had been slapped in the face. I made
ShareJS exactly for his use case; and I failed. I took sharejs.org
down the other day and I felt good about it - I felt like the site was
lying. (And it was still running 0.5 if you can believe it.)

------------------------------------------------------------------

Next steps

In a sense I've made two versions of sharejs. The first version (0.6)
is a small tool to embed collaborative text editing on your site.

The second version (livedb + sharejs 0.7) is an OT-based database for
realtime websites.

My hope for sharejs 0.7 was to have a unified library to cover all of
the above use cases. I think I've failed in that goal.

So we're going to fork the project in two.

What is currently ShareJS master (v0.7) will be combined with livedb
and renamed sharedb. Nate & his team at Lever will run sharedb (as
they've been doing for the past year). It will be updated to only
support JSON documents at the root level - which simplifies a lot of
code. JSON documents support subdocuments with arbitrary OT types
anyway, so there shouldn't be any loss in functionality. Should this
get its own email list?

Meanwhile I'm going to revert sharejs back to the 0.6 codebase and
modernise the whole thing. The API should stay (more or less) the same
as it was at v0.6, although I'm going to merge the cursors branch in.
In practical terms this means:
- Native (default) websocket support
- Coffee -> JS
- Mocha
- Browserify
- Out of the box quilljs support + examples
- Resurrect the old DB bindings for sqlite, postgresql, couchdb, leveldb, etc.
- ShareJS will not support having multiple backend servers (unless you
shard). (Although it would be nice if it could also use livedb
(sharedb) as a backend somehow).
- It'll be called sharejs 1.0 when this is done. (Although I feel like
at this point we deserve a higher number)

I'm not sure about whether to just use a document name (like 0.6 does
now) or use collection name + document name (0.7 style). I really
don't have a strong opinion.

I don't have a timeframe for this - I'm going to finish the new json
type first. But I wanted to hear everyone's thoughts first.

Also, I've been using this email list as a sort of developer diary for
a lot of stuff I'm working on in this space. Does anyone have
particularly strong opinions about this? Good / bad / are there better
places for this sort of thing? I'm not *philosophically* against web
forums, but I don't know of any that I like.

As always, I'd love to hear comments / feedback about all of the
above. Sorry it's such a bumpy ride!

-J

James Keener

Oct 8, 2015, 12:20:21 PM
to sha...@googlegroups.com, derbyjs
Thanks for the update! I can't wait for the new JSON type and the
ShareDB release! This is all very exciting!

I enjoy your updates/dev diary via email; it helps me understand what's
going on with the project. I would also suggest (but it's not a big
deal) that they be placed somewhere besides just Google groups (personal
blog), as it's sometimes hard to find historical information.

Keep up the great work!
Jim

Joseph Gentle

Oct 8, 2015, 8:50:17 PM
to sha...@googlegroups.com, derbyjs
At a technical level I think this is a great idea. Some of the things
you've suggested we've already done. (The hard-to-write OT code will
absolutely be shared between projects). And as you say, the driver API
was very explicitly designed that way so drivers can live in their own
modules. I only held off on doing this while the API was in rapid flux
- but it's settled down now. Having a simpler API to set up the whole
thing would be great. The levelup, leveldown and level packages are a
nice example of this model.

But all that said there's two reasons why I don't want to go down that road.

The first reason is complexity - there's an awful lot of code in the
current stack which only makes sense for JSON documents. Take a look
at the client document class in both 0.6 and master:
https://github.com/share/ShareJS/blob/0.6/src/client/doc.coffee - 330
lines of code
https://github.com/share/ShareJS/blob/master/lib/client/doc.js - 1030
lines of code (!)

Most of that extra code is needed for things like queries, operation
shattering and rejecting ops - features which are only needed for JSON
documents. I care a lot more about implementation complexity than most
people (certainly more than Nate does). I'd rather make a minimal
library than a big library which covers every use case. This is just
my personal taste.

The second reason is that I'm moving to europe and starting a business
next year. Nobody pays me to work on this stuff, and it would burn too
much of my runway to get the current version of sharejs into a state
where I'm happy with it. Thankfully I don't have to - by handing over
the reins of the project to Nate + team, I know it'll be in good
hands. And I know they care about the features which are important to
this use case - their business depends on it.

As I say - I really like the changes you're suggesting. But there's no
reason we can't do both. Maybe talk to Nate about doing those changes
in sharedb anyway. If sharedb grows to do everything the little
sharejs library does but better, that wouldn't be a failure at all.

-J

On Fri, Oct 9, 2015 at 3:55 AM, Devon Govett <devon...@gmail.com> wrote:
> Hi Joseph,
>
> I’m not sure I see the purpose of splitting the project in two. It seems
> like a huge duplication of effort when the projects will have so much in
> common. 0.7 already does support your use case for the new ShareJS project
> (collaborative text editing) along with support for any other OT type you
> might come up with. So I’m not sure why you think 0.7 has failed in creating
> a unified library. If it’s because you’re not happy with the code, then
> let’s make it better rather than forking completely. If it’s because you
> think it’s too hard to set up a simple collaborative text editing system,
> then let’s create a small library on top of the current ShareJS/LiveDB which
> hooks everything up for you easily (db + text type etc.).
>
> If you really think we should fork, I’ll suggest what I said the last time
> this came up which is that rather than starting from scratch and having so
> much duplication of functionality, we should find a new common baseline for
> the projects to both sit on, so that they both can benefit from
> enhancements. Let’s start from 0.7 and abstract out and modularize various
> pieces that are currently built into sharejs and livedb and make the
> projects more extendible. Then you could easily build simpler wrappers on
> top to hide the complexity from users that wanted to get up and running
> quickly.
>
> Here’s what I said before:
>
> I think, if there are going to be two separate projects going forward, it
> would be best for everyone involved if as much code as possible could be
> shared between the projects. For example, the direction of moving the OT
> types out of ShareJS into their own modules is a good start. If the two
> projects could share that code, it would be awesome. Additionally, the
> networking stuff (client-server protocol) seems like it could be shared as
> well. The backend for transforming ops and saving them to a database could
> also be common.
>
> Come to think of it, I actually think it may be better to start from the
> current ShareJS and livedb and make them more modular rather than splitting
> the project entirely in two. Move the redis and in process driver stuff from
> livedb into separate modules that can be plugged in. Move the memory store
> into a separate module, like livedb-mongo is. Move the query stuff into a
> separate plugin as well, for both ShareJS and livedb. Move the rest server
> out of ShareJS and into a separate module. Same for projections. Cursor
> support could also be added as a plugin, configurable to support whatever OT
> type you’re using (e.g. text or JSON), or even separate modules for each.
>
> ShareJS would end up being just the client and server of the basic network
> protocol, only supporting fetching and subscribing to documents, and
> submitting ops via livedb. The protocol could be extendible so that other
> things could be added, such as queries, projections, and cursors. Users
> could use browserify to build the ShareJS client and each plugin they need
> in their application. Livedb would become just the basis of an OT backend,
> with support for multiple plugins for databases and drivers (already almost
> there). Out of the box it couldn’t do anything by itself - you’d need to add
> a backend and a driver at the minimum. It could be made just a bit more
> extendible to support things like queries and projections as well.
>
> This looks like a lot when I write it out, but I think we’re already pretty
> close to what I described, at least on the livedb side (drivers and backends
> are already basically plugins). If we went with this modular design, and you
> wanted to build your small text editing library, you’d just require livedb
> with an in process driver and whatever backend you wanted, the ShareJS
> client/server, a plain text or rich text OT type, and cursor support. It
> would end up lighter weight than currently since you wouldn’t require
> anything you don’t need (e.g. JSON OT, redis, queries, and projections). You
> could even publish this as a module for others to use with zero
> configuration (and also a good example for people who want to build their
> own config).
>
>
> Like I said before, I’m happy to help with this. I think sharing as much
> code as possible will be better for everyone in the long run.
>
> Devon

Ian Johnson

Oct 8, 2015, 10:22:23 PM
to der...@googlegroups.com, sha...@googlegroups.com
Just chiming in to say this all makes a lot of sense to me. I look forward to the development on both ends!
Also I enjoyed reading this update, I think this is a fine medium for it. I agree dumping it in a blog might be nice for posterity too.

--
Ian Johnson - 周彦

Joseph Gentle

Oct 8, 2015, 11:25:03 PM
to sha...@googlegroups.com, derbyjs
:D

Yeah posting to my blog is a good call. It's up now at
https://josephg.com/blog/forki/ . It might also be worth uploading all
the recent posts about JSON2.

-J

Osman Mazinov

Oct 9, 2015, 3:50:00 PM
to Derby, sha...@googlegroups.com, m...@josephg.com
Hi Joseph, thank you. The story behind development of ShareJS and Derby is very interesting.
I think OT-based Share.js is what makes Derby.js unique. Built-in server side rendering and smart conflict resolution (OT instead of LWW) are technologies that give Derby an advantage over other full stack real-time frameworks.
Looking forward to reading more interesting posts from you.

Joseph Gentle

Oct 10, 2015, 11:05:59 PM
to sha...@googlegroups.com, derbyjs
Yay more Australians!

Yeah 1.0 will include the REST API, as well as better ways to delete
documents and set them up with initial data. The general problem with
deleting documents is that if I have a document at version 10 then go
offline and make some changes, and you delete then recreate the entire
document, when I come online again what happens? It's important that
the system doesn't try to merge my changes into the new document.

In ShareJS 0.6 I just strongly discouraged people from deleting
documents and provided no client-side API to do it.
In 0.7 I made it so all documents exist by default at version 0
without a type. Creating and deleting documents are special kinds of
operation which also bump the version. Deleting a document doesn't
*really* delete anything - it just strips the type & snapshot data but
leaves the old version number intact. (Well, it increments it). Then you
can recreate the document later if you want. That way if you try to
apply an old operation, well, even if the document was deleted &
created, it'll try to transform by all the intervening changes. One of
those changes will be a delete operation, and your edit will be
deleted.
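
To make the 0.7 behaviour described above a bit more concrete, here is a
minimal sketch in TypeScript. It is a toy model under loud assumptions,
not ShareJS's actual code: the names Doc, applyOp and submitEdit are
hypothetical, edits are plain appends so no real transform is needed,
and a stale edit is simply dropped once a delete shows up among the
intervening ops.

```typescript
// Toy model of the 0.7-style scheme. All names here are hypothetical.
type Op =
  | { kind: "create"; type: string; data: string }
  | { kind: "del" }
  | { kind: "edit"; append: string };

interface Doc {
  version: number;      // every doc starts at version 0
  type: string | null;  // null: no type right now (never created, or deleted)
  data: string | null;  // snapshot; stripped on delete
  log: Op[];            // log[i] moved the doc from version i to i + 1
}

function newDoc(): Doc {
  return { version: 0, type: null, data: null, log: [] };
}

function applyOp(doc: Doc, op: Op): void {
  if (op.kind === "create") {
    if (doc.type !== null) throw new Error("document already exists");
    doc.type = op.type;
    doc.data = op.data;
  } else if (op.kind === "del") {
    // Strip type and snapshot; the version keeps counting up.
    doc.type = null;
    doc.data = null;
  } else {
    if (doc.type === null) throw new Error("document does not exist");
    doc.data = (doc.data ?? "") + op.append;
  }
  doc.log.push(op);
  doc.version += 1;
}

// Submit an op written against baseVersion. Returns false if a delete
// happened in between, in which case the op is dropped.
function submitEdit(doc: Doc, baseVersion: number, op: Op): boolean {
  if (baseVersion < 0 || baseVersion > doc.version) {
    throw new Error("bad base version");
  }
  const intervening = doc.log.slice(baseVersion);
  if (intervening.some((o) => o.kind === "del")) return false;
  applyOp(doc, op); // a real server would transform op by `intervening` first
  return true;
}
```

With this model, an edit written at version 2 and submitted after a
delete (version 3) and a recreate (version 4) is rejected, and the
recreated document is left untouched.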

That's conceptually clean, but also kind of really gross. I thought of
a nicer way to do this which is to simply have a random tag on the
document. When the document gets created, the tag gets (randomly) set,
and then it gets reset if the document is deleted & regenerated. When
you reconnect, you send the tag of the document generation you expect,
and if it doesn't match you know the document has been deleted. This
adds some extra data on every document (bad), but it's a stupidly
simple scheme and it lets us genuinely delete documents.
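
A rough sketch of the tag idea, again in TypeScript and again only an
illustration: the store is an in-memory Map, and createDoc, deleteDoc,
checkReconnect and randomTag are made-up names, not part of any ShareJS
API. The tag generator is a parameter so it can be swapped out;
Math.random is used as the default purely for brevity.

```typescript
// Sketch of the "random tag per document generation" idea.
// Everything here is hypothetical; it is not ShareJS code.
type Tag = number;

interface StoredDoc {
  tag: Tag;        // set when the document is created
  version: number;
  data: string;
}

type Store = Map<string, StoredDoc>;

function randomTag(): Tag {
  // Not cryptographically strong; good enough for a sketch.
  return Math.floor(Math.random() * 0x100000000);
}

function createDoc(
  store: Store,
  name: string,
  data: string,
  makeTag: () => Tag = randomTag
): StoredDoc {
  if (store.has(name)) throw new Error("document already exists");
  const doc: StoredDoc = { tag: makeTag(), version: 0, data };
  store.set(name, doc);
  return doc;
}

// Genuinely removes the document: no tombstone and no version is kept.
function deleteDoc(store: Store, name: string): void {
  store.delete(name);
}

// On reconnect the client sends the tag of the generation it expects.
function checkReconnect(
  store: Store,
  name: string,
  expectedTag: Tag
): "ok" | "deleted" {
  const doc = store.get(name);
  return doc !== undefined && doc.tag === expectedTag ? "ok" : "deleted";
}
```

A client that saw the first generation gets "deleted" back both while
the document is gone and after it has been recreated with a new tag, so
its offline changes are never merged into the new document.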

As for the JSON2 stuff, it'll be compatible. Which is to say, I'll
write a function to convert from the old json operations to the new
ones. Or you can just keep using the old OT type. ShareJS has had (since
the very first release) a standard API for all OT types. It works with
any type that obeys the API. The old JSON code will continue to work
just fine if you want to keep using it.
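
As an illustration of what "any type that obeys the API" can look like,
here is a small TypeScript sketch. The OTType interface below is an
assumed shape (create / apply / transform / compose), not a copy of the
real ShareJS definition, and the counter type, replay and migrateLog are
made up for the example. migrateLog shows the "function to convert from
the old operations to the new ones" idea as a plain per-op mapping.

```typescript
// Assumed shape of an OT type; the real ShareJS type API may differ.
interface OTType<Snapshot, Op> {
  name: string;
  create(initial?: Snapshot): Snapshot;
  apply(snapshot: Snapshot, op: Op): Snapshot;
  // Rewrite `op` so it can be applied after the concurrent op `other`.
  transform(op: Op, other: Op, side: "left" | "right"): Op;
  // Merge two consecutive ops into one.
  compose(first: Op, second: Op): Op;
}

// Made-up type: the snapshot is a number and an op is an amount to add.
const counter: OTType<number, number> = {
  name: "counter-sketch",
  create: (initial = 0) => initial,
  apply: (snapshot, op) => snapshot + op,
  transform: (op) => op, // additions commute, so nothing to adjust
  compose: (first, second) => first + second,
};

// Generic code that relies only on the interface, so it works with any type.
function replay<S, O>(type: OTType<S, O>, ops: O[], initial?: S): S {
  return ops.reduce<S>(
    (snapshot, op) => type.apply(snapshot, op),
    type.create(initial)
  );
}

// Converting a stored op log to a new format, one op at a time.
function migrateLog<OldOp, NewOp>(
  ops: OldOp[],
  convert: (op: OldOp) => NewOp
): NewOp[] {
  return ops.map(convert);
}
```

The point of the design is that replay never looks inside an op: text,
JSON or a new JSON type could all be passed in, as long as they provide
the same handful of functions.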

-J


On Sat, Oct 10, 2015 at 4:46 PM, <james...@theconversation.edu.au> wrote:
> Thanks for the update Joseph,
>
> We're actively using 0.6.3 in production, with ~40,000 JSON documents and
> ~90,000,000 operations stored in postgres. We're probably not pushing its
> limits in terms of concurrency and features, but it's certainly a core piece
> of infrastructure that we rely on. It's also rock solid - we've had very few
> issues in 3.5 years.
>
> We haven't explored upgrading to 0.7 as yet - the differences felt
> significant, and the upgrade path unclear.
>
> A 1.0 release that's based on 0.6 sounds like it would suit us. I haven't
> followed the development of the new JSON type, but a clear upgrade path from
> 0.6 to 1.0 would be super helpful. I guess that would either mean keeping
> the original JSON type available as an option, or a documented way we can
> migrate our millions of operations to the json2 format.
>
> Finally, we rely on the REST interface to fetch current snapshots from our
> CMS. Will 1.0 include the REST API?
>
> James