ShareJS and ShareDB


Joseph Gentle

Oct 8, 2015, 11:52:05 AM
to sha...@googlegroups.com, derbyjs
Hi! This is longer than I expected. Skip to the NEXT STEPS section for the tldr.


I first wrote ShareJS hoping for a simple library on top of which
people could build their own collaborative applications. I showed off
the first working version in April 2012, running on NodeJS 0.4. The
list of libraries you could manually download was in a github wiki
page. There was also no websockets - not even a draft.

When I wrote ShareJS I wasn't thinking about JSON at all. JSON OT was a
strange little idea a German researcher proposed over Italian food at
the Google Wave summit (held right after Wave was cancelled). I
somehow convinced Jeremy (@nornagon) to implement the damn thing (much
harder than it sounds!) and we put it in sharejs - because sharejs can
support editing arbitrary types. ShareJS 0.6 was a beautifully crafted
snowflake. I prided myself on its tests-to-code ratio of about 1.5:1.
I took it personally when bugs survived in that code base - my tests
also managed to find bugs in socket.io, bugs in uglifyjs and once a
bug in the coffeescript compiler. I tested sharejs down to IE6.

Then at the start of 2014 I moved to the bay area and started working
at Lever. The company was only a few months old at the time. Nate (our
CEO) had written this little framework called Derby which he'd
presented at the realtime.js conference. We geeked out over combining
forces - what would Derby look like on top of ShareJS? It was a total
conceit - probably a terrible idea for such a young company to invest
resources into but we did it anyway. We did it - but we never had
quite enough time to do everything *well*. It was as fast as it needed
to be, with just enough features to make our app work just well
enough. I'm super proud of what lever is growing into, but I want my
code to *never fall down*. That was my standard with sharejs a few
years ago, and I ruined it. I still get that guilty vertigo feeling
when I look at the issue tracker.

This all hit home early this year when I was invited onto some
javascript podcast. One of the people on the call asked about using
sharejs in his own project - he wanted a simple collaborative text
editor. I told him about all of sharejs's great features and he said
in that case it probably wasn't the right tool for him. I didn't know
how to respond - it was like I had been slapped in the face. I made
ShareJS exactly for his use case; and I failed. I took sharejs.org
down the other day and I felt good about it - I felt like the site was
lying. (And it was still running 0.5 if you can believe it.)

------------------------------------------------------------------

Next steps

In a sense I've made two versions of sharejs. The first version (0.6)
is a small tool to embed collaborative text editing on your site.

The second version (livedb + sharejs 0.7) is an OT-based database for
realtime websites.

My hope for sharejs 0.7 was to have a unified library to cover all of
the above use cases. I think I've failed in that goal.

So we're going to fork the project in two.

What is currently ShareJS master (v0.7) will be combined with livedb
and renamed sharedb. Nate & his team at Lever will run sharedb (as
they've been doing for the past year). It will be updated to only
support JSON documents at the root level - which simplifies a lot of
code. JSON documents support subdocuments with arbitrary OT types
anyway, so there shouldn't be any loss in functionality. Should this
get its own email list?

Meanwhile I'm going to revert sharejs back to the 0.6 codebase and
modernise the whole thing. The API should stay (more or less) the same
as it was at v0.6, although I'm going to merge the cursors branch in.
In practical terms this means:
- Native (default) websocket support
- Coffee -> JS
- Mocha
- Browserify
- Out of the box quilljs support + examples
- Resurrect the old DB bindings for sqlite, postgresql, couchdb, leveldb, etc.
- ShareJS will not support having multiple backend servers (unless you
shard). (Although it would be nice if it could also use livedb
(sharedb) as a backend somehow).
- It'll be called sharejs 1.0 when this is done. (Although I feel like
at this point we deserve a higher number)

I'm not sure about whether to just use a document name (like 0.6 does
now) or use collection name + document name (0.7 style). I really
don't have a strong opinion.

I don't have a timeframe for this - I'm going to finish the new json
type first. But I wanted to hear everyone's thoughts first.

Also, I've been using this email list as a sort of developer diary for
a lot of stuff I'm working on in this space. Does anyone have
particularly strong opinions about this? Good / bad / are there better
places for this sort of thing? I'm not *philosophically* against web
forums, but I don't know of any that I like.

As always, I'd love to hear comments / feedback about all of the
above. Sorry it's such a bumpy ride!

-J

Geoff Goodman

Oct 8, 2015, 12:10:35 PM
to sha...@googlegroups.com, derbyjs
Very exciting to see ShareJS go back to a much more focused scope!

You ask about document name vs collection + document names and I counter-ask: If we wanted to namespace documents in a collection, couldn't we do so via our own naming convention? Why would users of ShareJS want two levels of namespacing like that?
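
A minimal sketch of what such a naming convention could look like, assuming a single flat document name as in 0.6. `docKey` and `parseDocKey` are hypothetical helpers for illustration, not part of ShareJS:

```javascript
// Hypothetical helpers: emulate "collection + document" on top of one flat
// document name. Neither function exists in ShareJS.
function docKey(collection, name) {
  // encodeURIComponent escapes '/', so the separator stays unambiguous.
  return encodeURIComponent(collection) + '/' + encodeURIComponent(name);
}

function parseDocKey(key) {
  // Exactly one unescaped '/' is present in keys built by docKey.
  const [collection, name] = key.split('/').map(decodeURIComponent);
  return { collection, name };
}
```

The application would then open documents by `docKey('users', 'fred')` and the library itself would only ever see one string.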

Looking forward to seeing this move forward!

Geoff

--
You received this message because you are subscribed to the Google Groups "ShareJS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sharejs+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

James Keener

Oct 8, 2015, 12:20:20 PM
to sha...@googlegroups.com, derbyjs
Thanks for the update! I can't wait for the new JSON type and the
ShareDB release! This is all very exciting!

I enjoy your updates/dev diary via email; it helps me understand what's
going on with the project. I would also suggest (but it's not a big
deal) that they be placed somewhere besides just Google groups (personal
blog), as it's sometimes hard to find historical information.

Keep up the great work!
Jim

Devon Govett

Oct 8, 2015, 12:55:53 PM
to sha...@googlegroups.com
Hi Joseph,

I’m not sure I see the purpose of splitting the project in two. It seems like a huge duplication of effort when the projects will have so much in common. 0.7 already does support your use case for the new ShareJS project (collaborative text editing) along with support for any other OT type you might come up with. So I’m not sure why you think 0.7 has failed in creating a unified library. If it’s because you’re not happy with the code, then let’s make it better rather than forking completely. If it’s because you think it’s too hard to set up a simple collaborative text editing system, then let’s create a small library on top of the current ShareJS/LiveDB which hooks everything up for you easily (db + text type etc.).

If you really think we should fork, I’ll suggest what I said the last time this came up, which is that rather than starting from scratch and having so much duplication of functionality, we should find a new common baseline for both projects to sit on, so that they can both benefit from enhancements. Let’s start from 0.7 and abstract out and modularize various pieces that are currently built into sharejs and livedb and make the projects more extendible. Then you could easily build simpler wrappers on top to hide the complexity from users who want to get up and running quickly.

Here’s what I said before:

I think, if there are going to be two separate projects going forward, it would be best for everyone involved if as much code as possible could be shared between the projects. For example, the direction of moving the OT types out of ShareJS into their own modules is a good start. If the two projects could share that code, it would be awesome. Additionally, the networking stuff (client-server protocol) seems like it could be shared as well. The backend for transforming ops and saving them to a database could also be common. 

Come to think of it, I actually think it may be better to start from the current ShareJS and livedb and make them more modular rather than splitting the project entirely in two. Move the redis and in process driver stuff from livedb into separate modules that can be plugged in. Move the memory store into a separate module, like livedb-mongo is. Move the query stuff into a separate plugin as well, for both ShareJS and livedb. Move the rest server out of ShareJS and into a separate module. Same for projections. Cursor support could also be added as a plugin, configurable to support whatever OT type you’re using (e.g. text or JSON), or even separate modules for each. 

ShareJS would end up being just the client and server of the basic network protocol, only supporting fetching and subscribing to documents, and submitting ops via livedb. The protocol could be extendible so that other things could be added, such as queries, projections, and cursors. Users could use browserify to build the ShareJS client and each plugin they need in their application. Livedb would become just the basis of an OT backend, with support for multiple plugins for databases and drivers (already almost there). Out of the box it couldn’t do anything by itself - you’d need to add a backend and a driver at the minimum. It could be made just a bit more extendible to support things like queries and projections as well. 

This looks like a lot when I write it out, but I think we’re already pretty close to what I described, at least on the livedb side (drivers and backends are already basically plugins). If we went with this modular design, and you wanted to build your small text editing library, you’d just require livedb with an in process driver and whatever backend you wanted, the ShareJS client/server, a plain text or rich text OT type, and cursor support. It would end up lighter weight than currently since you wouldn’t require anything you don’t need (e.g. JSON OT, redis, queries, and projections). You could even publish this as a module for others to use with zero configuration (and also a good example for people who want to build their own config). 

Like I said before, I’m happy to help with this. I think sharing as much code as possible will be better for everyone in the long run.

Devon

James Keener

Oct 8, 2015, 1:06:45 PM
to sha...@googlegroups.com, Devon Govett
Many good points Devon! While I'm not in a place to discuss how this work should be done, I am willing (and forgot to say initially, hence this email) to help code, document, test, and exemplify things.

Jim
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Joseph Gentle

Oct 8, 2015, 8:50:16 PM
to sha...@googlegroups.com, derbyjs
At a technical level I think this is a great idea. Some of the things
you've suggested we've already done. (The hard-to-write OT code will
absolutely be shared between projects). And as you say, the driver API
was very explicitly designed that way so drivers can live in their own
modules. I only held off on doing this while the API was in rapid flux
- but it's settled down now. Having a simpler API to set up the whole
thing would be great. The levelup, leveldown and level packages are a
nice example of this model.

But all that said, there are two reasons why I don't want to go down that road.

The first reason is complexity - there's an awful lot of code in the
current stack which only makes sense for JSON documents. Take a look
at the client document class in both 0.6 and master:
https://github.com/share/ShareJS/blob/0.6/src/client/doc.coffee - 330
lines of code
https://github.com/share/ShareJS/blob/master/lib/client/doc.js - 1030
lines of code (!)

Most of that extra code is needed for things like queries, operation
shattering and rejecting ops - features which are only needed for JSON
documents. I care a lot more about implementation complexity than most
people (certainly more than Nate does). I'd rather make a minimal
library than a big library which covers every use case. This is just
my personal taste.

The second reason is that I'm moving to europe and starting a business
next year. Nobody pays me to work on this stuff, and it would burn too
much of my runway to get the current version of sharejs into a state
where I'm happy with it. Thankfully I don't have to - by handing over
the reins of the project to Nate + team, I know it'll be in good
hands. And I know they care about the features which are important to
this use case - their business depends on it.

As I say - I really like the changes you're suggesting. But there's no
reason we can't do both. Maybe talk to Nate about doing those changes
in sharedb anyway. If sharedb grows to do everything the little
sharejs library does but better, that wouldn't be a failure at all.

-J

Devon Govett

Oct 8, 2015, 10:19:36 PM
to sha...@googlegroups.com
My response to your first point about complexity is that if there is code that only makes sense for a certain type of document, then that code should live in a separate module. That would reduce the complexity of the base library tremendously.

To your second point, I completely understand. It would be a sizable amount of work. I still think it’s a good direction to go in, however, and maybe when we get there, it will make sense for sharejs to be a thin wrapper around sharedb (such confusing naming!).

Ian Johnson

Oct 8, 2015, 10:22:22 PM
to der...@googlegroups.com, sha...@googlegroups.com
Just chiming in to say this all makes a lot of sense to me. I look forward to the development on both ends!
Also I enjoyed reading this update, I think this is a fine medium for it. I agree dumping it in a blog might be nice for posterity too.

--
Ian Johnson - 周彦

Joseph Gentle

Oct 8, 2015, 11:25:03 PM
to sha...@googlegroups.com, derbyjs
:D

Yeah posting to my blog is a good call. It's up now at
https://josephg.com/blog/forki/ . It might also be worth uploading all
the recent posts about JSON2.

-J

Maksims Mihejevs

Oct 9, 2015, 7:09:59 AM
to ShareJS, der...@googlegroups.com, m...@josephg.com
Hi Joseph.
First - huge thanks for all the work you are doing. We use ShareJS 0.7 a lot in our real-time collaborative PlayCanvas Editor. It is a very fundamental part of our product indeed.

I joined PlayCanvas when we had v0.6, and we had only the 3D scene JSON representation stored in memory and constantly pushed to the database.
We needed it to be in the database from the beginning, which livedb solved. Then we needed more stuff, like asset data, to be real-time too. Then we added code editing as well (currently .js files are still not collaborative, but other code files are).
We do have multiple processes submitting operations on data, so we use the redis driver for that.
So we already have 4 collections purely managed by sharejs, plus a few in-memory things, with horizontal scaling, JSON and text. So we have pretty much pushed ShareJS to its boundaries.

Working with v0.6, and now with v0.7, I did feel there were some weird complications, and sometimes holes in things.
It feels more like a problem of ideology and practices, driven by a complicated design.
Moving toward Complexity while reducing Complications would be a very good way to go.
If the framework were designed in a flat manner, with a small number of layers, it would allow simplicity out of the box, and depth to do custom and complicated things. A very clear separation of things would help here a lot.

Currently we use most of its features, and it has proved to be very powerful, although we do have some exceptions and errors that are very hard to debug, and sometimes we just have to find workarounds.
This is mostly related to poorly normalised error reporting that does not point out where to look.
The documentation is a bit misleading sometimes, and has become a bit messy as it covers both 0.6 and 0.7.
There is not much in the docs about how to submit ops from the server side. There is still no way to delete a doc.

A few more missing things: when the creation of a doc is driven by the server, there is no way to notify clients about it in a "subscription" manner. We have cases where projects have 100+ assets, each of which is an individual doc. Some operations may lead to a dozen assets being created in a short time. We have to send manual messages to the client about doc availability so it can subscribe to them. Visibility scoping and permissions are taken into account in advance.
When we delete docs we likewise need to send our own messages for that.
We use ShareJS and had to hack the stream a bit so that we could send our own messages on the same channel.
Another thing is that _id is stringified by ShareJS, which is quite annoyingly strict behaviour. We have number-based IDs all around, and we end up sticking parseInt all around :(
The only thing I found we don't really use is history, as we do it client-side. And we found that we can delete ops collections by hand, which kinda works :)

Although it does serve us very well, we still feel it could be improved dramatically.
If there is a conversation about API design and implementation design, we would love to share our experience in more specific terms.
It does feel like we will have to go with ShareDB though, as we need horizontal scaling and a lot of custom middleware all around, including our own data validation process.

Mega Thanks again for this brilliant library.
Kind Regards from PlayCanvas

Cheers,
Max

james...@theconversation.edu.au

Oct 10, 2015, 1:46:55 AM
to ShareJS, der...@googlegroups.com
Thanks for the update Joseph,

We're actively using 0.6.3 in production, with ~40,000 JSON documents and ~90,000,000 operations stored in postgres. We're probably not pushing its limits in terms of concurrency and features, but it's certainly a core piece of infrastructure that we rely on. It's also rock solid - we've had very few issues in 3.5 years.

We haven't explored upgrading to 0.7 as yet - the differences felt significant, and the upgrade path unclear.

A 1.0 release that's based on 0.6 sounds like it would suit us. I haven't followed the development of the new JSON type, but a clear upgrade path from 0.6 to 1.0 would be super helpful. I guess that would either mean keeping the original JSON type available as an option, or a documented way we can migrate our millions of operations to the json2 format.

Finally, we rely on the REST interface to fetch current snapshots from our CMS. Will 1.0 include the REST API?

James

Joseph Gentle

Oct 10, 2015, 11:05:58 PM
to sha...@googlegroups.com, derbyjs
Yay more Australians!

Yeah 1.0 will include the REST API, as well as better ways to delete
documents and set them up with initial data. The general problem with
deleting documents is that if I have a document at version 10 then go
offline and make some changes, and you delete then recreate the entire
document, when I come online again what happens? It's important that
the system doesn't try to merge my changes into the new document.

In ShareJS 0.6 I just strongly discouraged people from deleting
documents and provided no client-side API to do it.
In 0.7 I made it so all documents exist by default at version 0
without a type. Creating and deleting documents are special kinds of
operation which also bump the version. Deleting a document doesn't
*really* delete anything - it just strips the type & snapshot data but
leaves the old version intact. (Well, it increments it). Then you
can recreate the document later if you want. That way if you try to
apply an old operation, well, even if the document was deleted &
created, it'll try to transform by all the intervening changes. One of
those changes will be a delete operation, and your edit will be
deleted.
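
A rough sketch of that 0.7 behaviour as a toy model, not ShareJS code. The op shapes (`create`, `del`, `edit`) and the function names are made up for illustration, and real transformation of concurrent edits is left out; only the "a delete in between kills your edit" rule is modelled:

```javascript
// Toy model only. Every document implicitly exists at version 0 with no type.
function createDoc() {
  return { v: 0, type: null, data: undefined, log: [] };
}

// op carries `v` (the version the sender last saw) plus one of:
//   {create: {type, data}}, {del: true} or {edit: fn}   (hypothetical shapes)
function submit(doc, op) {
  // Everything committed since the version the sender last saw.
  const intervening = doc.log.slice(op.v);
  if (op.edit && intervening.some((past) => past.del)) {
    // Transforming across a delete drops the edit.
    return { applied: false, reason: 'document was deleted' };
  }
  if (op.create) {
    if (doc.type !== null) return { applied: false, reason: 'already exists' };
    doc.type = op.create.type;
    doc.data = op.create.data;
  } else if (op.del) {
    // Deleting strips type and snapshot but keeps (and bumps) the version.
    doc.type = null;
    doc.data = undefined;
  } else if (op.edit) {
    if (doc.type === null) return { applied: false, reason: 'no such document' };
    doc.data = op.edit(doc.data);
  }
  doc.log.push(op);
  doc.v += 1;
  return { applied: true, v: doc.v };
}
```

An edit made offline at version 2 and submitted after a delete and recreate finds the delete among the intervening ops and is rejected, so it never lands in the new document.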

That's conceptually clean, but also kind of really gross. I thought of
a nicer way to do this which is to simply have a random tag on the
document. When the document gets created, the tag gets (randomly) set,
and then it gets reset if the document is deleted & regenerated. When
you reconnect, you send the tag of the document generation you expect,
and if it doesn't match you know the document has been deleted. This
adds some extra data on every document (bad), but it's a stupidly
simple scheme and it lets us genuinely delete documents.
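
The tag idea could be sketched roughly like this. `TaggedStore` and its methods are hypothetical names, and the 8-byte tag length is an arbitrary assumption:

```javascript
const nodeCrypto = require('crypto');

// Sketch only: each document generation gets a random tag when it is created.
class TaggedStore {
  constructor() {
    this.docs = new Map(); // name -> {tag, v, data}
  }

  create(name, data) {
    if (this.docs.has(name)) throw new Error('document already exists');
    const tag = nodeCrypto.randomBytes(8).toString('hex');
    this.docs.set(name, { tag, v: 0, data });
    return tag;
  }

  // Genuinely removes the document; no tombstone is kept.
  delete(name) {
    this.docs.delete(name);
  }

  // A reconnecting client states which generation it expects to find.
  reconnect(name, expectedTag) {
    const doc = this.docs.get(name);
    if (!doc || doc.tag !== expectedTag) {
      return { ok: false, reason: 'document was deleted' };
    }
    return { ok: true, v: doc.v };
  }
}
```

A client holding the tag of a deleted generation gets `ok: false` on reconnect, even if a new document with the same name exists, so its stale edits are never merged.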

As for the JSON2 stuff, it'll be compatible. Which is to say, I'll
write a function to convert from the old json operations to the new
ones. Or you can just keep using the old OT type. ShareJS has (since
the very first release) a standard API for all OT types. It works with
any type that obeys the API. The old JSON code will continue to work
just fine if you want to keep using it.
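
To illustrate what "obeys the API" means, here is a toy type, modelled on the interface described in the ottypes repository (`name`, `uri`, `create`, `apply`, `transform`, `compose`); the exact shape in 0.6 may differ. The counter type, its name and its URI are invented for the example:

```javascript
// A toy OT type: the snapshot is a number and an op is a number to add.
const counterType = {
  name: 'counter-example',                         // hypothetical name
  uri: 'http://example.com/types/counter-example', // hypothetical URI
  create(initial) {
    return initial === undefined ? 0 : initial;
  },
  apply(snapshot, op) {
    return snapshot + op;
  },
  // Additions commute, so a concurrent op needs no adjustment.
  transform(op1, op2, side) {
    return op1;
  },
  compose(op1, op2) {
    return op1 + op2;
  },
};

// Generic code that relies only on the shared interface, not on the type.
function replay(type, ops, initial) {
  return ops.reduce((snapshot, op) => type.apply(snapshot, op), type.create(initial));
}
```

Because `replay` only touches `create` and `apply`, it would work unchanged with the old JSON type, a new JSON type or a text type.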

-J

Jonathan Clem

Oct 12, 2015, 4:57:02 PM
to ShareJS, der...@googlegroups.com, m...@josephg.com
Sounds reasonable. We've been using 0.7 for almost a year (FWIW it's never "fallen down" for us) and we've been quite happy with it. While I'm excited that ShareDB will have people dedicated to working on it, I hope it still stays totally separate from Derby. We're using ShareJS in a text editor implemented in Ember and have been really, really happy with the combination of the two. The JSON-only limitation sounds great, as well, if it simplifies things.

Nate Smith

Oct 13, 2015, 4:29:24 AM
to sha...@googlegroups.com
As Joseph said, I and the Lever team will be continuing to develop the next generation of the 0.7 branch of ShareJS + LiveDB as ShareDB. I merged the relevant code from ShareJS 0.7 into the previous livedb repo and renamed it to sharedb. The boundary line between the two repos wasn't particularly useful, so from now on, there will just be a core repo for ShareDB and various DB and PubSub adapters. Once the API stabilizes, I might break sharedb-client into its own npm module, so it can have its own semantic version, but it is easier to iterate in a single repo for now.

I actually just published the first version of the new ShareDB today. Still updating Lever's apps, tests, and the README, but things are mostly working already:


Note that both Redis and Mongo were required for persistence in 0.7, but now the mongo driver is completely independent of Redis. I'll need to follow up with much more in depth explanation in proper documentation, but in short, this is a complete re-architecture. Previously, the Redis driver both coordinated operation submission and PubSub of operations. In the previous LiveDB implementation, if the mongo write were to fail, it was basically impossible to rollback, because the Redis driver had already accepted and published the op. There was also a race condition between publishing the op and querying mongo for the document, which meant needing to go to both Redis and Mongo to get the current state of the world. This was very complex and didn't have a great scaling story, since Redis was a single bottleneck / point of failure.

Now, the DB adapter fully handles committing an operation + the new snapshot in a single method, which must succeed or fail atomically. Different DBs with different features might have very different implementations. For Mongo, we use a backreference from the snapshot to the last op and from each op to the previous op to perform optimistic locking across documents, since Mongo lacks transaction support. The approach is somewhat similar to how Software Transactional Memory is implemented.
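
A rough in-memory illustration of the backreference idea, not the actual ShareDB or Mongo adapter code. `MemoryAdapter`, `commit` and the field names `lastOp` and `prevOp` are assumptions made for the sketch:

```javascript
// Sketch only: an in-memory stand-in for a DB adapter's atomic commit.
class MemoryAdapter {
  constructor() {
    this.snapshots = new Map(); // docId -> {v, data, lastOp}
    this.ops = new Map();       // opId  -> {docId, v, prevOp, op}
    this.nextOpId = 1;
  }

  // Commits `op` and the resulting snapshot together, or not at all.
  // `expected` is the snapshot the caller applied the op to (null on create).
  commit(docId, op, newData, expected) {
    const current = this.snapshots.get(docId) || null;
    const currentLast = current ? current.lastOp : null;
    const expectedLast = expected ? expected.lastOp : null;
    // Optimistic check: if the backreference moved, someone committed first.
    if (currentLast !== expectedLast) return false;

    const opId = this.nextOpId++;
    const v = current ? current.v + 1 : 1;
    // Each op points at the previous op; the snapshot points at the last op.
    this.ops.set(opId, { docId, v, prevOp: currentLast, op });
    this.snapshots.set(docId, { v, data: newData, lastOp: opId });
    return true;
  }
}
```

In a single JavaScript process the check and the two writes cannot interleave. Against a real database, the check would have to be part of a conditional update, and a caller that gets `false` would refetch, transform its op and retry.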

The most important change is that now the source of truth of what is committed is the Mongo document snapshot. Thus, queries and document reads can go straight to Mongo. Only writes need to go through ShareDB. Also, tailing the Mongo oplog is an appropriate way to stream updates to a search index or other computed data in realtime. A failure to write the op or snapshot to Mongo will bubble all the way back to the client and the server will remain in a consistent state. The system is just based on standard Mongo reads and updates, so it will support Mongo sharding and standard horizontal scaling approaches.

The Redis pub-sub adapter is now very simple, and it is only needed if you are running more than one ShareDB process. The PubSub adapter API is super simple, so it would be easy to write others.

I've refactored all of the middleware and options to be much simpler and more powerful. Since the middleware was previously only in ShareJS 0.7 and not LiveDB, the API was previously much harder to work with.

ShareDB will not include a REST API or bindings to a textarea or any specific use case features like that. Happy for others to build those as additional modules, but depending on your framework or application, you may or may not need these, so they are best done in separate modules.

ShareDB will indeed remain a separate project from Derby, which is good for both projects. Derby works well in total standalone mode as an HTML or string renderer, a client-side DOM renderer, and along with ShareDB as a full-stack realtime framework. In the future, I also intend for Derby to have a simple REST backend option in addition to the ShareDB realtime backend. Derby is especially well suited to work with ShareDB, but the projects are independently useful as well.

ShareDB and ShareJS will continue to use the same ottypes (the most meaningful code that should remain common), and we've obviously kept the names close together to emphasize the association.


--

E Francis

Nov 23, 2015, 3:03:10 PM
to ShareJS, na...@nateps.com
Hey guys,

This split between ShareDB and ShareJS sounds like the right move, but I've got a couple questions about it and I'm hoping you can answer them:

1. I'm looking to hook up the Quill js editor and Share to get a collaborative rich text editor. Would ShareJS + the rich-text ot type be the correct library to use for that? Or is ShareDB + rich-text type the right choice? I know Quill was developed alongside Share to work together but it looks like there's no documentation around hooking them up so I'm having a hard time understanding what the right approach is.

2. It sounds like the redis pub-sub mechanism will only be included in ShareDB, is that correct? So ShareJS won't be able to scale horizontally by default (at least initially)?

Thanks for open sourcing all this work btw, it really is an awesome effort and you guys deserve credit for sharing it!

Jason Chen

Dec 1, 2015, 6:53:59 PM
to ShareJS
Hey all,

Sorry for joining the conversation late. It’s great to see Quill support listed as a 1.0 item for ShareJS. Collaborative editing is on Quill's list as well and I plan on focusing a lot more on demos/examples once Quill 1.0 is out (which also does not have a public timeline). Happy to help out on my end to make sure Quill 1.0 + ShareJS 1.0 will work well together.

It does seem ShareJS + rich-text is the way to go right now provided your persistence needs are met. rich-text will be the data type I am planning to support for Quill going forward. Depending on how json1 turns out it may be appropriate for it to succeed rich-text's implementation as it is more powerful but I am somewhat fond of rich-text's current API so it would probably just be the internals.