Scaling ShareDB / experiences


Stian Håklev

Oct 9, 2017, 11:46:17 PM
to ShareJS
Hi everyone,
I wonder if anyone here has experience scaling ShareDB - in terms of number of users per document, per node, number of nodes connected through Redis, etc. We are relying very heavily on ShareDB for our collaborative learning platform, and are now beginning real-world trials. Our current experiments are in a lecture hall with up to 200 people connecting at the same time. Our app is Meteor-based, and I am currently using several ShareDB servers connected through Redis - still trying to work out how many users per machine is ideal, etc. I'm curious how many nodes it would be feasible to connect together through Redis. We will probably never have many thousands of users per document, but we could easily have many thousands of users altogether, so if there is a concern about having too many instances connected over Redis, we might want to shard different groups onto separate instances...

Also curious whether anyone has found certain design choices (OT types, object size, etc.) to impact scalability/performance.

Will share our findings as our work progresses.

best
Stian Håklev
Ecole Polytechnique Fédérale de Lausanne

Jeremy Apthorp

Oct 10, 2017, 2:18:44 PM
to ShareJS
Hi Stian!

I haven't myself had much experience scaling sharedb, but one thing to note on choice of OT type is that at present, the json0 OT type is O(n*m) in CPU required to compute transform. That is, if your application has many concurrent edits in the same json0 document, the amount of work required to reconcile them grows exponentially with the number of those concurrent edits.

Practically speaking, in applications that are live (i.e. the time from a user pressing a key to the server acknowledging the op is in the <10 sec range), truly concurrent edits are relatively rare. But if you have 200 users mashing their keyboards on the same json0 document, you might bump into an issue there.

The best way to work around that right now is to split up your data model into separate documents, if possible, so that each document sees a relatively small number of truly concurrent edits. As an example, if you had a graph structure represented in JSON where each node was a text document, and you expected most edits to be in those text nodes rather than in the graph structure, then you might choose to store that text in separate documents, one per node. The graph structure can then simply hold document IDs - pointers to the actual content. If most users are editing different documents, then very little transform computation needs to be done, and you'll avoid the O(n*m) dragon.

It's quite possible this won't be a problem for you at all, though, depending on your workload. As latency goes down, concurrent edits get rarer, and in practice it's rare for edits to be truly concurrent. But probably don't store your entire data model in a single document :)
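
A minimal sketch of that kind of split using the ShareDB client API (the collection names, IDs, and fields are invented for illustration, and connection is assumed to be an existing sharedb client Connection):

```js
// One small "graph" document that only stores structure plus pointers
// (document IDs) to the content that actually gets edited frequently.
const graph = connection.get('graphs', 'graph-1');
graph.create({
  nodes: {
    'node-a': { textDocId: 'node-a-text' },
    'node-b': { textDocId: 'node-b-text' }
  },
  edges: [['node-a', 'node-b']]
});

// One document per node for the hot content, so concurrent edits on
// different nodes never need to be transformed against each other.
connection.get('nodeTexts', 'node-a-text').create({ text: '' });
connection.get('nodeTexts', 'node-b-text').create({ text: '' });
```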

(Side note, the new json ot type that Joseph has been working on, called json2, has O(n+m) instead of O(n*m) complexity in transform, and so avoids this issue. It's not done yet, though.)

Excited that you're building on ShareDB, and looking forward to hearing more about your project!

Jeremy
--
You received this message because you are subscribed to the Google Groups "ShareJS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sharejs+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jeremy Apthorp

Oct 10, 2017, 2:19:41 PM
to ShareJS
And by "grows exponentially", I mean "grows as the square" 🤦‍♂️

John Hewson

Oct 10, 2017, 3:26:02 PM
to sha...@googlegroups.com

On Oct 10, 2017, at 11:18, Jeremy Apthorp <norn...@nornagon.net> wrote:

Hi Stian!

I haven't myself had much experience scaling sharedb, but one thing to note on choice of OT type is that at present, the json0 OT type is O(n*m) in CPU required to compute transform. That is, if your application has many concurrent edits in the same json0 document, the amount of work required to reconcile them grows exponentially with the number of those concurrent edits.

That’s actually quadratic growth, rather than exponential, so not as much of a problem. In ShareDB only the changes from the latest snapshot need to be transformed. So the number of concurrent edits/users should be the dominant factor, rather than the absolute length of the document.

— John

Jeremy Apthorp

Oct 10, 2017, 3:54:36 PM
to sha...@googlegroups.com
Yep, but if your document encompasses more of your data then it's more likely that edits collide

John Hewson

Oct 10, 2017, 4:15:25 PM
to sha...@googlegroups.com

On Oct 10, 2017, at 12:54, Jeremy Apthorp <norn...@nornagon.net> wrote:

Yep, but if your document encompasses more of your data then it's more likely that edits collide

I’m not sure I follow: OT edits don’t collide, they always succeed.

That said, ShareDB only accepts a single non-concurrent edit against the latest version of the document and forces clients to merge in the changes from the server before re-submitting their own edits. So a large number of concurrent edits creates more work for other clients with pending edits than it does for the server. But that’s not dependent on the size of the document.

— John

Jeremy Apthorp

Oct 10, 2017, 4:44:09 PM
to sha...@googlegroups.com
If you have 200 users editing at 1Hz and 1 document, the chance that two edits need to be transformed is high, since it's likely that multiple operations will be in flight simultaneously. If you have 200 users editing 100 documents, where the avg number of users per document is ~2, the number of concurrent ops per document is much smaller (probably 0 or 1 per document).

Put another way, when your op reaches the server, it needs to get transformed past all the ops on that document the client didn't have yet. If there's 0 ops since your op, then 0 transforms need to be done. If there's been 100 ops since your op, and your op contains 10 components, then you'll need to do 10*100 = 1000 transforms. If there are 199 other people editing your document, then it's likely the # of ops the server needs to transform your op past is high. If there's 1 other person, the number is likely much lower. (It also depends on edit frequency and latency—if edits happen once a day and get reported to all other users within seconds, then it's very unlikely that any concurrent edits will ever happen.)
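
To make that arithmetic concrete, here is roughly where the work happens, using ot-json0 directly (the op shapes and counts are purely illustrative):

```js
const json0 = require('ot-json0').type;

// An op with 10 components (here, 10 list inserts into 'items')...
const myOp = Array.from({ length: 10 }, (_, i) => ({ p: ['items', i], li: 'mine-' + i }));

// ...and 100 single-component concurrent ops the client hadn't seen yet.
const serverOps = Array.from({ length: 100 }, (_, i) => [{ p: ['items', 0], li: 'theirs-' + i }]);

// The incoming op has to be transformed past each concurrent op in turn, and
// each pairwise transform walks every component: roughly 10 * 100 = 1000 steps.
let transformed = myOp;
for (const serverOp of serverOps) {
  transformed = json0.transform(transformed, serverOp, 'left');
}
```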

That's moot if the server pushes all transform work to the client, of course, but in that situation you'll end up with a huge amount of contention on the CAS lock and it will take a very long time for a client's edits to actually successfully be accepted by the server. So instead of your servers falling over, the clients will stop working instead :) (also, that would unfairly favor clients with faster networks)

John Hewson

Oct 10, 2017, 9:01:53 PM
to sha...@googlegroups.com
On Oct 10, 2017, at 13:43, Jeremy Apthorp <norn...@nornagon.net> wrote:

If you have 200 users editing at 1Hz and 1 document, the chance that two edits need to be transformed is high, since it's likely that multiple operations will be in flight simultaneously. If you have 200 users editing 100 documents, where the avg number of users per document is ~2, the number of concurrent ops per document is much smaller (probably 0 or 1 per document).

Ok, I see, yes, but that’s assuming you can break a single large document down into 100 independent documents, i.e. you don’t need atomic operations across them, or a single document “version” which comprises several sub-documents. But assuming you can, then yes, the reduction in concurrency is a good thing.

Put another way, when your op reaches the server, it needs to get transformed past all the ops on that document the client didn't have yet. If there's 0 ops since your op, then 0 transforms need to be done. If there's been 100 ops since your op, and your op contains 10 components, then you'll need to do 10*100 = 1000 transforms. If there are 199 other people editing your document, then it's likely the # of ops the server needs to transform your op past is high.

Those assumptions are use-case dependent though. If 199 other people have the document open but 198 of them are just looking at it and only one is editing, then you’ll need far fewer transformations. I’m not sure what use cases involve 200 people actually typing edits into a document at the same time. In my experience most users of a document are passive viewers.

If there's 1 other person, the number is likely much lower. (It also depends on edit frequency and latency—if edits happen once a day and get reported to all other users within seconds, then it's very unlikely that any concurrent edits will ever happen.)

Right, the use case matters more than anything else. If you want 200 editors who sporadically take turns to edit over the course of an hour, it’s probably fine, but if they all press a key at the same time… trouble.

That's moot if the server pushes all transform work to the client, of course, but in that situation you'll end up with a huge amount of contention on the CAS lock and it will take a very long time for a client's edits to actually successfully be accepted by the server. So instead of your servers falling over, the clients will stop working instead :) (also, that would unfairly favor clients with faster networks)

ShareDB does do some transform work on the server (any edit made by a client against a known server snapshot version is transformed up to the latest snapshot version). But it’s the client’s job to transform its operations to be based against a version which the server already knows about (multiple client edits are queued until the server acknowledges the earlier edits and gives them a version number). But as you say, rapid concurrent edits mean that clients will fall behind and their edit operations will end up queued - on both the clients and the server. Yep, that means that clients slowing down will probably be the first sign of trouble.

Nate Smith

Oct 12, 2017, 12:09:41 PM
to sha...@googlegroups.com
Are you only subscribing to specific documents or are you using queries? Jeremy did a good job outlining the scaling characteristics of concurrent edits to documents.

If you are using the query subscription feature, there are some additional considerations to limit how often ShareDB has to issue database queries. To scale query subscriptions, you'll want to make sure that query polling does not fan out too much, which you can achieve by writing middleware that maps the query channel more precisely.
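
As a rough sketch of what that middleware can look like - the exact channel-related properties on the 'query' and 'commit' middleware requests (and whether they replace or extend the defaults) should be checked against the ShareDB version you're running, and the groupId field and channel naming are invented for illustration:

```js
// Publish each op to a narrower channel than the whole collection, and make
// query subscriptions listen on that same narrow channel, so a write to one
// group's documents only triggers polling for queries about that group.
backend.use('commit', (request, next) => {
  const groupId = request.snapshot.data && request.snapshot.data.groupId;
  if (groupId) request.channels = [request.collection + '.' + groupId];
  next();
});

backend.use('query', (request, next) => {
  const groupId = request.query.groupId;
  if (groupId) request.channel = request.collection + '.' + groupId;
  next();
});
```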



Stian Håklev

Oct 13, 2017, 2:06:44 AM
to sha...@googlegroups.com
Thanks a lot for all the replies. I thought I'd outline what we do in a bit more detail, since I'd love to hear about more efficient approaches, and I'd be very happy if anything we do is useful to others. (All code is at https://github.com/chili-epfl/FROG.)

So the basic idea is that we want a plugin system for rich learning activities - these could be generic, like watching a video, editing a text/form, or doing a quiz, but also very specific, like a physics simulation. The key is that they should be configurable, accept input data, stream learning analytics, and produce output data. We also have operators, which are basically functions (with configuration) that transform data. The end result is a kind of very specialized graphical programming language for collaborative learning scenarios.

The idea is to make these activity plugins as easy to write as possible, abstracting away a lot of the "hard work". We use Meteor and ShareDB, but activities are separate NPM packages which only need to import React. They provide a configuration object, and we use react-jsonschema-form to render the config in the editor and store the configuration. The activity package also defines the initial shape of its data collection, as well as a merge function which receives the input data from previous operators/activities and merges it into its own data collection.

The way the graph works is that the bottom line denotes individual activities, the middle line group activities, and the top line "whole class" activities. Depending on where an activity is located, we generate one ShareDB document per individual, per group, or one for the whole class. The document is initialized with the "shape" requested by the activity (for example {}), and then the merge function is run if there is any incoming data.

Then the React component ActivityRunner from the activity package is mounted with the props data (the doc.data relevant to that given user) and dataFn (a wrapper around ot-json0 ops, tied to a specific doc; see https://github.com/chili-epfl/FROG/blob/develop/frog-utils/src/generateReactiveFn.js).
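
As a rough illustration of what such a wrapper does (a simplified sketch, not the actual generateReactiveFn - it assumes a json0 document and just hides raw op construction):

```js
// Simplified sketch of a dataFn-style wrapper around a ShareDB doc.
const generateReactiveFn = doc => ({
  // Append an item to the list at the given path (defaults to the doc root).
  listAppend(item, path = []) {
    const list = path.reduce((d, key) => d[key], doc.data);
    doc.submitOp([{ p: [...path, list.length], li: item }]);
  },
  // Insert a value at an object key (json0 also needs od to replace an existing value).
  objInsert(value, path) {
    doc.submitOp([{ p: path, oi: value }]);
  }
});
```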

This gives activity authors an enormous amount of flexibility, while still making it easy to do individual and collaborative activities. Below is an almost complete example of a minimal group chat, with some configuration: 

```js
import React from 'react';
// The uuid and TextInput imports below are illustrative; the host app provides these helpers.
import uuid from 'uuid/v4';
import { TextInput } from 'frog-utils';

const meta = {
  name: 'Text Component'
};

// react-jsonschema-form schema, rendered by the graph editor as the config UI
const config = {
  type: 'object',
  properties: {
    title: {
      type: 'string',
      title: 'Title'
    }
  }
};

// Initial shape of the activity's ShareDB document
const dataStructure = [];

const ActivityRunner = ({ config, data, dataFn, logger, userInfo }) => (
  <div>
    <h1>{config.title}</h1>
    <ul>
      {data.map(x => (
        <li key={x.id}>
          {x.user}: {x.msg}
        </li>
      ))}
    </ul>
    <TextInput
      callbackFn={e => {
        // Append the chat message as a new item in the shared list
        dataFn.listAppend({ msg: e, user: userInfo.name, id: uuid() });
        logger({ chat: e });
      }}
    />
  </div>
);

export default { id: 'ac-chat', meta, config, dataStructure, ActivityRunner };
```

So far we have mainly been using json0; however, we just added support for text with a simple component, <ReactiveText type='textarea|textinput' path={['text']} dataFn={dataFn} />. This creates a text area/text input, storing the text at the given path in the document handed to the activity runner. We are also planning to add rich text and code editing (syntax highlighting/live preview etc.), and as the result of a previous project we have complex forms with collaborative editing (https://github.com/chili-epfl/collab-react-components).

Part of this research project is to look at how we can provide sophisticated feedback to the teacher, and analytics for researchers, if we control more of the stack (instead of using Google Docs etc.). We have two students currently working on analytics of collaborative writing behaviour, based on data from either ShareDB or Etherpad (the first step is to create a unified representation from those two data sources). Ideas we are exploring: highlighting areas of the text that are the "oldest" / "most vs. least edited", and inferring collaborative behaviour (I write one paragraph while you write another, vs. you write a paragraph and I add information, vs. you write a paragraph and I make a lot of changes, etc.). Can this kind of higher-level information predict collaborative success?

---

Anyway, the pattern right now is that we have a number of Meteor servers (all connected to the same Mongo server) which import the ShareDB server, connecting to a separate Mongo instance for ShareDB, as well as a Redis server. In development, the front-end client connects to two websockets on the Meteor server, one for ShareDB and one for the Meteor stuff.

In production, we only use the Meteor ShareDB instance for back-end processing, and front-end clients connect to four user-facing ShareDB servers (just a few lines of Node importing and starting the server: https://github.com/chili-epfl/FROG/blob/unil/sharedb/index.js), connected to the same Redis and Mongo DBs, with Nginx for ip-hash based load balancing and SSL reverse proxying.
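
For reference, a minimal user-facing ShareDB server of this kind looks roughly like the sketch below (not the linked FROG file; the ports, URLs, and websocket-json-stream wrapper are illustrative):

```js
const http = require('http');
const WebSocket = require('ws');
const WebSocketJSONStream = require('websocket-json-stream');
const ShareDB = require('sharedb');

// Shared Mongo and Redis backends let several of these processes serve the
// same documents; Nginx load-balances browsers across the processes.
const backend = new ShareDB({
  db: require('sharedb-mongo')('mongodb://localhost:27017/sharedb'),
  pubsub: require('sharedb-redis-pubsub')('redis://localhost:6379')
});

const server = http.createServer();
const wss = new WebSocket.Server({ server });
wss.on('connection', ws => {
  // Each incoming WebSocket becomes one ShareDB client stream.
  backend.listen(new WebSocketJSONStream(ws));
});
server.listen(4000);
```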

Students move in lock-step through the script, so when a teacher transitions to a new activity set, one or several activities need to be initiated - the data needed is calculated, and then the necessary set of ShareDB documents is initialized from the server side. This could be many hundreds of documents, and it is not quite as fast as we'd like, although it's tolerable for now. Initially we did it very naively: first subscribing to each document, then creating it on('load'), and not even doing it in parallel. Currently we directly get and then create documents without waiting for load, because we know they don't already exist, and use Futures to get parallelism, which makes it much better.
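
In Promise terms, the current approach is roughly the following (a sketch - we use Futures on the Meteor side, and the collection name and ID scheme here are illustrative):

```js
// Create many fresh documents in parallel, skipping subscribe/fetch because
// we know none of them exist yet.
function createInstances(connection, activityId, instanceIds, initialData) {
  return Promise.all(
    instanceIds.map(
      instanceId =>
        new Promise((resolve, reject) => {
          const doc = connection.get('instances', activityId + '/' + instanceId);
          doc.create(initialData, err => (err ? reject(err) : resolve(doc)));
        })
    )
  );
}
```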

In most workloads, only 4-5 students are connected to a single document. In the future there will be more collaborative editing, but currently it's mostly adding list items or objects. (We are experimenting with objects of the form { 'uuid': { id: 'uuid', chatmsg: 'hi', ... } }, which gives us easy editing/deleting without worrying about list indices, and also makes it easy to do Object.values(x).map(x => <li key={x.id}>...</li>) etc.) Thus conflicting edits should be minimal.
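
In json0 terms, that keyed-object pattern looks roughly like this (illustrative paths and values, assuming doc is the underlying ShareDB document):

```js
const id = uuid();

// Add a chat message under its own uuid key (object insert)...
doc.submitOp([{ p: ['messages', id], oi: { id, user: 'stian', chatmsg: 'hi' } }]);

// ...replace one of its fields later (json0 needs the old value for a replace)...
doc.submitOp([{ p: ['messages', id, 'chatmsg'], od: 'hi', oi: 'hello' }]);

// ...or delete the whole entry, never having to reason about list indices.
doc.submitOp([{ p: ['messages', id], od: { id, user: 'stian', chatmsg: 'hello' } }]);
```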

Once an activity concludes, we need to gather all the data from all the instances and write it to the database / use it for future activities. Here we also used to do it incredibly inefficiently (with a few test users on localhost it was never a problem), but after switching to a simple fetch query - where we actually do a regexp on the document ID, because we generate IDs like activityId/instanceId and we want all instances for a given activityId - this is super fast. (I'm not sure the regexp is a good idea; is there any way of attaching metadata to a document that doesn't become part of doc.data?)
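
With sharedb-mongo, that fetch query can look roughly like this (a sketch; whether an anchored regex on _id is efficient is exactly the open question above):

```js
// Fetch every instance document of one activity in a single query, relying on
// the activityId/instanceId naming convention for document IDs.
connection.createFetchQuery(
  'instances',
  { _id: { $regex: '^' + activityId + '/' } },
  {},
  (err, results) => {
    if (err) throw err;
    const allData = results.map(doc => doc.data);
    // ...merge allData into the next activity / write it to the database
  }
);
```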

My concern was more about the number of users/websocket connections to a given instance. We had many issues with 200 users on a single node, and because we do experiments in large classrooms with up to 300 students connecting at the same time, it's not so easy to experiment and narrow it down. We therefore went with 5 Meteor instances and 4 separate ShareDB instances - and this worked well. It might well be that we need far fewer, though - maybe Meteor is much more demanding than ShareDB, because the consensus in the Meteor community seems to be around 50 active clients for a 1-CPU small VM...

... which seems very low to me. During my PhD, I wrote collaborative MOOC software with Elixir/Phoenix, which handled thousands of websockets with never more than 3% server load. Back then I actually tried rewriting the server part of (what was then) ShareJS in Elixir, thinking I could make it compatible with the ShareJS front-end client, make it much easier to integrate into my Elixir program, and scale well (I really didn't like Node back then). I got a bunch of the JS tests passing against this server, but never completed it. I sometimes still dream about completing it and launching "ShareDB as a service" :) https://github.com/houshuang/ot_text/blob/master/lib/text.ex

Anyway, thanks for listening. If you have any feedback on the architecture, it would be welcome. Right now I'm trying 4 Digital Ocean droplets, all four running the ShareDB server, with one of them also running Redis and MongoDB - I don't know if these should be separate... I might experiment later to see if I can get by with fewer servers. In the future, we might also run experiments in MOOCs, so we could expect thousands of people connecting at the same time, although not editing the same documents. However, since we have many different group structures (for example, you might be assigned the role of mayor and be in group 55, first meeting with all the mayors to discuss how to respond to an earthquake, and then going to group 55 to discuss with a property developer, an environmental activist, and a business owner - these would be two ShareDB documents with different subscribers), it would be very difficult to shard based on collection ID... So I first need to know roughly how many users a single instance can support, then how many instances a single Redis/Mongo server can mediate between, and at what point things begin to break down.

Also very happy to discuss with anyone who is doing research around collaboration, collaborative editing etc. 

thanks to all the ShareDB people for an amazing tool!

Stian
Ecole Polytechnique Fédérale de Lausanne

--
http://reganmian.net/blog -- Random Stuff that Matters