Fwd: [faye] How does it scale relative amount of channels? (#62)

369 views
Skip to first unread message

James Coglan

unread,
May 31, 2011, 6:40:25 PM5/31/11
to faye-...@googlegroups.com
Forwarding this question asked on GitHub because I thought it might be useful to others:

---------- Forwarded message ----------
From: manast
Date: 27 May 2011 09:13
Subject: [faye] How does it scale relative amount of channels? (#62)

I just wonder how well does Faye scale relative amount of channels. I suspect the system is optimized for few channels and many users subscribed to them. But what if we have many users and every user is subscribed to its own private channel? Would Faye handle this scenario efficiently?


And my reply...

As always, it depends what you mean by 'efficiently'. I'll describe how Faye stores data so you can get a better idea of what's going on. This is based on the 0.6 release, which has a very different internal model to previously releases and makes certain operations like unsubscription faster.

The Faye engine needs to store several pieces of information:

1. A set of all active client IDs
2. For each channel, a set of client IDs subscribed to it
3. For each client, a set of channels it is subscribe to (i.e. reverse index for 2)

A set in Ruby and JavaScript is just implemented as a hashtable, so this data ends up looking like this (I will use JSON to represent it):

    engine = {
      "namespace": {
        "foo": true,
        "bar": true,
        "qux": true
      },
      "channels": {
        "/some/channel": {
          "foo": true,
          "qux": true
        },
        "/some/other/channel": {
          "foo": true
        }
      },
      "clients": {
        "foo": {
          "/some/channel":        true,
          "/some/other/channel":  true
        },
        "bar": {
        },
        "qux": {
          "/some/channel":        true
        }
      }
    }

The 'namespace' is the set of all active client IDs. A map with values on the left and 'true' on the right is just how we represent a set of the values on the left, since the table makes sure the keys are unique and can be efficiently deleted. 'channels' maps channel names to sets of client IDs, and 'clients' does the reverse mapping from client IDs to sets of channels.

So, subscribing a client to an existing channel, e.g. subscribing "bar" to "/some/channel", means adding "bar":true to the "/some/channel" set and "/some/channel":true to the "bar" set.

Creating a new channel is a little more expensive since we allocate a new object to store the client IDs for the channel. In the case where there is one channel per client, we simply have lots of sets with one member rather than one set with many members. How much this is a problem with depend on the overhead of allocating and object, since we still need to store the client IDs no matter what.

For example having one channel with many users looks like this:

      "channels": {
        "/some/channel": {
          "foo": true,
          "qux": true
        }
      }

Whereas having many channels with one user looks like this:

      "channels": {
        "/channels/foo": {
          "foo": true
        },
        "/channels/qux": {
          "qux": true
        }
      }

In the 'clients' section, each subscription means adding one member to a client's set, and this would be true no matter what the channels were called. The size of this collection depends only on the number of subscriptions, not on the number of channels.

The important thing about all these structures is that it is ideally O(1) to add and remove any element from them since all operations are just adding and removing from sets. This is true on the Redis engine as well, where the sadd and srem operations are O(1). Prior to 0.6 I used a more complex storage system that I think was O(N) for unsubscribing, but this is now O(1).

As to whether this is 'efficient', all you should really care about is whether it's efficient *enough*. You may find that object allocation in JavaScript and hash allocation in Ruby have different performance and GC characteristics, and you should load test with realistic application data to get a better idea of whether Faye suits your needs.

I hope this has shed some light for you to start to answer your question.

Manuel Astudillo

unread,
Jun 1, 2011, 2:05:29 AM6/1/11
to Faye users
Thanks for the answer. Actually it was exactly what I wanted to know.
For my application I think it will scale quite well.

James Coglan

unread,
Jun 1, 2011, 6:41:29 AM6/1/11
to faye-...@googlegroups.com
This is based on the 0.6 release, which has a very different internal model to previously releases and makes certain operations like unsubscription faster.

A small addendum to this: I've made some changes that will be in 0.6.1 that make the Memory engine better at cleaning up unused resources, like disconnected client IDs and channels with no subscribers. Redis already handled this since it deletes sets with no members, but now the Memory engine does this too.

mgutz

unread,
Jun 11, 2011, 2:03:48 PM6/11/11
to Faye users
That was a good read. I was going to add it to the Faye Wiki on
github, but Wiki is not enabled.
Reply all
Reply to author
Forward
0 new messages