Large amounts of Channels for Sync Gateway


Robert Payne

Jan 14, 2016, 9:01:50 PM
to Couchbase Mobile
Hi Couchbase,

So far I'm pretty enthused with Couchbase Lite and Sync Gateway, and I appreciate all the hard work put in.

I have a use case that I'm unsure is workable but I feel like others may have similar use cases:

– Users subscribe/unsubscribe from channels
– Users need to sync all documents related to those channels
– If a user subscribes to a new channel, that channel's data should also start to sync

Looking at Sync Gateway I think I can accomplish this by:

– Ensuring documents are tagged with appropriate channels
– Using the Sync Gateway Admin API to modify channel access (PUT /{db}/_user/{name})
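
A minimal sketch of that second step in Python. It assumes the Admin REST API is reachable at `http://localhost:4985`, a database named `db`, and that a user's channel grants are carried in the `admin_channels` property of the user document. The helper name `build_channel_update` and the channel names are made up for illustration, and the request is only built here, not sent.

```python
import json
import urllib.request
from urllib.parse import quote


def build_channel_update(admin_url, db, name, channels):
    """Build (but do not send) a PUT /{db}/_user/{name} request."""
    # Assumption: explicit grants live in "admin_channels", and the body
    # carries the complete desired channel list rather than a delta.
    body = json.dumps({"admin_channels": sorted(channels)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{admin_url}/{db}/_user/{quote(name, safe='')}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Subscribing "alice" to one more channel means re-sending her full set.
current = {"news", "team-7"}
req = build_channel_update(
    "http://localhost:4985", "db", "alice", current | {"team-9"}
)
# urllib.request.urlopen(req) would send it to a running Sync Gateway.
```

Sending the whole channel set on every change keeps the sketch simple; whether omitted fields are preserved on PUT is worth checking against the Admin API docs for your version.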


I was just interested from an architecture point of view of how scalable it is to have many very "shallow" channels. Some channels may only be 1-2 documents, others 400-500 and each user will be subscribed to likely 50-150 channels at any given time.


Cheers,
Robert

Jens Alfke

Jan 14, 2016, 11:29:17 PM
to mobile-c...@googlegroups.com

On Jan 14, 2016, at 5:55 PM, Robert Payne <rob...@zwopple.com> wrote:

I was just interested from an architecture point of view of how scalable it is to have many very "shallow" channels. Some channels may only be 1-2 documents, others 400-500 and each user will be subscribed to likely 50-150 channels at any given time.

Channels are pretty lightweight. They primarily exist as internal tags on stored documents, and there’s a view index internal to SG that’s keyed by channel name.

—Jens

Robert Payne

Feb 8, 2016, 9:39:51 PM
to Couchbase Mobile
Hey Jens,

Sounds good. We're actually planning upwards of 1000-2000 actively synced channels per user. One other question: do channels "catch up"? Take for instance:

1. User syncs channel "A" to sequence 1000
2. User adds subscription to channel "B" and begins replication

Will it sync from channel B starting at sequence 1000 or sequence 0? I'm worried we're going to have to create a replicator for every single channel to ensure it catches from sequence 0.

Jens Alfke

Feb 9, 2016, 12:03:22 AM
to mobile-c...@googlegroups.com

On Feb 8, 2016, at 6:39 PM, Robert Payne <rob...@zwopple.com> wrote:

Will it sync from channel B starting at sequence 1000 or sequence 0? I'm worried we're going to have to create a replicator for every single channel to ensure it catches from sequence 0.

From sequence 0.
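
As a toy model of that behavior (not Sync Gateway's actual implementation), the Python sketch below treats the changes feed as a list of `(sequence, doc_id, channel)` tuples. The function name `changes_for` and the sample data are invented for illustration.

```python
def changes_for(docs, channels, since, newly_added=()):
    """Toy changes feed: docs is a list of (sequence, doc_id, channel)."""
    out = []
    for seq, doc_id, channel in docs:
        if channel not in channels:
            continue
        # A channel the user just gained is replayed from sequence 0;
        # channels they already had resume after the checkpoint.
        floor = 0 if channel in newly_added else since
        if seq > floor:
            out.append((seq, doc_id))
    return out


docs = [(10, "a1", "A"), (20, "b1", "B"), (1000, "a2", "A"), (1001, "b2", "B")]
# Client is checkpointed at sequence 1000 on channel A, then gains channel B.
feed = changes_for(docs, {"A", "B"}, since=1000, newly_added={"B"})
print(feed)  # [(20, 'b1'), (1001, 'b2')]
```

The older document `b1` is delivered even though its sequence is below the checkpoint, which is the point of the backfill: a single replicator is enough.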

—Jens

Adam Fraser

Feb 10, 2016, 1:30:58 PM
to Couchbase Mobile
Jens is correct that channels are fairly lightweight, but there are a few additional details you may want to take into consideration while evaluating your design:

1. Adding/removing users to/from channels is a more computationally expensive operation than adding/removing documents to/from a channel. When a user is added to a channel, a Couchbase view call is required to recalculate that user's channel access. When a document is added to a channel, it's just an in-memory operation to add that to the cache for that channel. Frequent subscription/unsubscription from a channel is going to be a relatively expensive operation.
2. Sync Gateway maintains an in-memory cache of recent changes to a channel. That cache retains at least 50 entries (entries beyond 50 will expire out of the cache). If most of your channels are shallow (<50 docs), it means that Sync Gateway will be attempting to cache a large fraction of your total docs in memory. This will probably need to be considered when sizing your SG node(s).
3. Although channels are lightweight, I'd still expect you to see some increase in CPU requirements as you increase the number of channels per user, particularly as you get into thousands of channels per user.  Each replication is making a _changes request that's going to need to check each of those channels for changes.  
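
To put rough numbers on point 2, here is a back-of-the-envelope sketch in Python. It assumes every channel is loaded, each channel keeps up to 50 entries resident, and no document belongs to more than one channel. The channel mix is hypothetical and the function name `min_cached_entries` is made up for illustration.

```python
def min_cached_entries(channel_doc_counts, min_length=50):
    """Lower bound on cache entries if every channel is loaded."""
    # Each loaded channel keeps up to min_length entries resident, so a
    # shallow channel contributes all of its documents.
    return sum(min(count, min_length) for count in channel_doc_counts)


# Hypothetical mix: 1,900 channels of 20 docs and 100 channels of 450 docs.
counts = [20] * 1900 + [450] * 100
cached = min_cached_entries(counts)
total = sum(counts)
print(cached, total, round(cached / total, 3))  # 43000 83000 0.518
```

With this mix, a little over half of all entries would sit in the cache at the minimum retention level, which is the sizing concern described above.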

I don't think any of these are barriers to your design as described, but I wanted to share the details as additional context.

Thanks,
Adam

atom992

Feb 14, 2016, 2:51:12 AM
to Couchbase Mobile
"That cache retains at least 50 entries." Does this mean the cache retains 50 entries in total, or at least 50 entries per channel?

Jens Alfke

Feb 14, 2016, 2:58:19 AM
to mobile-c...@googlegroups.com

On Feb 13, 2016, at 11:51 PM, atom992 <yangzi...@gmail.com> wrote:

"That cache retains at least 50 entries." Does this mean the cache retains 50 entries in total, or at least 50 entries per channel?

50 per channel, unless that's been changed since I last worked on it. But they’re loaded on demand, when a client syncs with a channel, so if a channel isn’t used anymore it won’t be cached.

—Jens

atom992

Feb 14, 2016, 3:07:07 AM
to Couchbase Mobile
Cool, thanks. Can I customize the number of entries per channel via an environment variable or config setting?

Adam Fraser

Feb 15, 2016, 1:19:29 PM
to Couchbase Mobile
There isn't currently a config setting for the minimum retained - only the maximum can be configured.  That sounds like a reasonable enhancement, although it would have to be used only when appropriate - a lower cache size will normally result in an increased number of relatively expensive view calls to return any channel data not stored in the cache.

Jens Alfke

Feb 15, 2016, 3:06:04 PM
to mobile-c...@googlegroups.com

> On Feb 15, 2016, at 10:19 AM, Adam Fraser <adamc...@gmail.com> wrote:
>
> There isn't currently a config setting for the minimum retained - only the maximum can be configured. That sounds like a reasonable enhancement, although it would have to be used only when appropriate - a lower cache size will normally result in an increased number of relatively expensive view calls to return any channel data not stored in the cache.

Actually it looks like these are already settable — I traced those values from db/channel_cache.go (where they’re used) up to DbConfig.CacheConfig (in rest/config.go).

It looks like you can edit the JSON config file to add a `cache` property to a database config object, whose value is an object that can contain keys like `channel_cache_min_length` and `channel_cache_max_length`.

(Or I might be misreading the code … it’s been at least a year since I looked at this.)

—Jens

Adam Fraser

Feb 15, 2016, 8:16:26 PM
to Couchbase Mobile
You're right Jens - I had a complete memory failure on that one.  Thanks very much for sanity checking that.

As you say, you can define the cache property inside your database config - here's an example config excerpt:


   "databases": {
      "default": {
        "server": "http://localhost:8091",
        "bucket": "default",
        "cache": {
            "channel_cache_max_length": 1000,
            "channel_cache_min_length": 25
        }
      }
   }
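
Before restarting Sync Gateway with new cache values, it can help to sanity-check the config. The Python sketch below wraps an excerpt like the one above in a full JSON object and reads back the cache settings per database. It assumes the config is plain JSON (a real config that embeds a sync function in backticks would not pass a strict parser), and the helper name `cache_settings` and the min/max check are illustrative, not rules Sync Gateway is known to enforce.

```python
import json

CONFIG_TEXT = """
{
  "databases": {
    "default": {
      "server": "http://localhost:8091",
      "bucket": "default",
      "cache": {
        "channel_cache_max_length": 1000,
        "channel_cache_min_length": 25
      }
    }
  }
}
"""


def cache_settings(config_text):
    """Return {db_name: cache dict} from a config document."""
    config = json.loads(config_text)  # raises ValueError on malformed JSON
    return {
        name: db.get("cache", {})
        for name, db in config.get("databases", {}).items()
    }


settings = cache_settings(CONFIG_TEXT)
for name, cache in settings.items():
    lo = cache.get("channel_cache_min_length")
    hi = cache.get("channel_cache_max_length")
    # Our own sanity rule: a minimum above the maximum is probably a typo.
    assert lo is None or hi is None or lo <= hi, f"{name}: min exceeds max"
    print(name, lo, hi)
```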

Robert Payne

Feb 18, 2016, 2:20:56 AM
to Couchbase Mobile

Thanks, this is really useful.

Our channels are all "open" for reads, so we're actually controlling subscriptions client side via the `channels` property on the pull replicator. We first sync a known set of channels that lets us determine the full range of channels necessary.
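
For anyone following along at the REST level, a channel-filtered pull boils down to a `_changes` request using the `sync_gateway/bychannel` filter. The Python sketch below only builds the URLs for the two phases; the public port `4984`, the database name `db`, the channel names, and the helper name `bychannel_changes_url` are assumptions for illustration.

```python
from urllib.parse import urlencode


def bychannel_changes_url(base_url, db, channels, since=0):
    """Build a _changes URL restricted to the given channels."""
    query = urlencode({
        "filter": "sync_gateway/bychannel",
        "channels": ",".join(sorted(channels)),
        "since": since,
    })
    return f"{base_url}/{db}/_changes?{query}"


base = "http://localhost:4984"
# Phase 1: a known "directory" channel (hypothetical) lists what to follow.
phase1 = bychannel_changes_url(base, "db", ["directory"])
# Phase 2: pull the channels discovered in phase 1.
phase2 = bychannel_changes_url(base, "db", ["team-9", "team-7"])
print(phase1)
print(phase2)
```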

Most of our channels are going to be very shallow so it's good to know about the caching. A lot of users will share the same subset of channels though.

In summary, though, we probably need to evaluate the performance characteristics in a bit more detail for our use case!

Thanks,
Robert

Faheem

May 29, 2016, 12:04:06 PM
to Couchbase Mobile

 
            So we have a use case where we need to push certain JSON to a large number of users 100s of thousands  and potentially can be in millions. 
So we can have channels may be  in 100s and those users can subscribe to those channels, but  then we would be in a situation to subscribe/unsubscribe to those channels which you mentioned is an expensive operations.  So out of following two , which one is a better design for this kind of  use case. 

1. Create channels may be in 100s or even in thousands, and then subscribe those users to specific channels ,and then JSON docs can belong to those  channels,  but JSON would only  be associated with a ten of channels, which is fine I think , but then we would have  issue to subscribing and un subscribing the user from those channels, which you mentioned is an expensive operation, how to handle this situation here ? 
 Also is there any limit in how many user can subscribe to a channel

2. We create one channels for each user so we have Millions of channels , one for each user. Now   in this case  each JSON would need to be associated with Millions of channels, is there any limit on how  many  Channel a JSON doc can belongs to ?


Also a question , since I am not very familiar with  what kind of User are being talked about when Sync gateway docs refer to a users, are those App users, of a Custom App maintained users ?


Thanks 