sync_gateway filter pull replication by a timestamp


Yonah Forst

Sep 30, 2015, 1:09:32 PM
to Couchbase Mobile
Does anyone know of a way to filter a sync_gateway pull replication by a timestamp on a document? For example, only pull documents with a timestamp in the last 30 minutes.
I'm writing a location-sharing app, and I don't want users to download documents (locations) that haven't been updated in the last 30 minutes, since they are no longer relevant and I don't want to waste the users' data plans.

At first I tried with channels and a cron job. I had a script checking for expired documents and setting the isExpired property to true. My sync function would check that property and assign the document to either the 'expired' channel or the 'active' channel. The problem was that replications would still pull all revisions up until that document got moved to the 'expired' channel, which defeated the purpose.
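
A minimal sketch of the expiry check such a cron script might perform, assuming each document carries an ISO 8601 `updatedAt` string (the property name comes from later in this post; the helper names `is_stale` and `mark_expired` and the sample documents are hypothetical). It only decides which documents need `isExpired` set; writing them back through Sync Gateway is left out.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(minutes=30)  # relevance window described in the question


def is_stale(doc, now, max_age=MAX_AGE):
    """Return True when the doc's updatedAt is older than max_age."""
    updated_at = datetime.fromisoformat(doc["updatedAt"])
    return now - updated_at > max_age


def mark_expired(docs, now, max_age=MAX_AGE):
    """Return updated copies of the docs that need isExpired set to true."""
    changed = []
    for doc in docs:
        # Skip docs already flagged so the script does not create new revisions.
        if not doc.get("isExpired") and is_stale(doc, now, max_age):
            changed.append({**doc, "isExpired": True})
    return changed


# Hypothetical sample data: one fresh location and one stale location.
now = datetime(2015, 9, 30, 17, 0, tzinfo=timezone.utc)
docs = [
    {"_id": "loc-1", "updatedAt": "2015-09-30T16:45:00+00:00"},
    {"_id": "loc-2", "updatedAt": "2015-09-30T16:10:00+00:00"},
]
to_write = mark_expired(docs, now)
```

Skipping documents that are already flagged keeps the script from writing a new revision on every run, which matters here because each extra revision is something clients may end up pulling.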

Then I tried it with permissions, by granting users access to only the 'active' channel. That successfully prevented all revisions of expired documents from being pulled. But the problem was that I then lost the ability to filter pull replications by an arbitrary channel (even if the user had access to that document through a different channel). (I can explain why that is needed.)

I've read in the comments that replication channels are just a convenience that set/get the filter and query_params properties, which I'm guessing queries the sync_gateway/_view/channels view. Is there a way to create my own view (emitting the updatedAt property) and ask the replication to filter by that view, using min and max query params?

Jens Alfke

Sep 30, 2015, 3:36:40 PM
to mobile-c...@googlegroups.com

On Sep 30, 2015, at 10:09 AM, Yonah Forst <yonah...@gmail.com> wrote:

Does anyone know of a way to filter a sync_gateway pull replication by a timestamp on a document? e.g. only pull documents with a timestamp in the last 30 minutes

There isn’t really a clean way to do that.

At first I tried with channels and a cronjob. I had a script checking for expired documents and setting the isExpired property to true. My sync function would check that property and assign it either the 'expired' channel or the 'active' channel. The problem was that replications would still pull all revisions up until that document got moved to the 'expired' channel which defeated the purpose.

It sounds like you did this before you removed access to the ‘expired’ channel, in which case it wouldn’t have any effect on what clients download, since they’d still be able to access the docs whether or not they were expired.

But the problem was that I then lost the ability to filter pull replications by an arbitrary channel (even if the user had access to that document through a different channel) . (I can explain why that is needed)

You should still be able to do that. Users can have access to any channels other than ‘expired’. (In fact you don’t need to have an ‘expired’ channel at all. When a doc expires, the sync fn just doesn’t add it to any channels.)
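
A rough model of that routing rule, for illustration only. Real sync functions are written in JavaScript and run inside Sync Gateway; this Python sketch only mirrors the decision. It assumes the `isExpired` flag from the original post and a hypothetical `channels` property listing the channels a live doc belongs to.

```python
def channels_for(doc):
    """Model of the routing decision: an expired doc is assigned no channels."""
    if doc.get("isExpired"):
        return []
    # Hypothetical convention: a live doc lists its own channels.
    return list(doc.get("channels", []))


def visible_to(doc, user_channels):
    """A doc reaches a user only if it sits in a channel that user can access."""
    return any(ch in user_channels for ch in channels_for(doc))


# Hypothetical sample docs for the same group channel.
live = {"_id": "loc-1", "channels": ["group-a"]}
expired = {"_id": "loc-2", "channels": ["group-a"], "isExpired": True}
```

With this rule there is nothing for a client to pull once a doc is flagged, and users can still be granted any other channels they need.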

I've read in the comments that replication channels are just a convenience that set/get the filter and query_params properties, which I'm guessing queries the sync_gateway/_view/channels view.

No, channels are a real thing that are deeply built into the Sync Gateway (and make up most of its complexity…)

What I think you read is that the API property Replication.channels is a convenience that just sets the replication’s underlying `filter` and `query_params` properties. That’s how the channel info is communicated to Sync Gateway, for compatibility with CouchDB and the pre-existing REST API.
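
As an illustration, here is a sketch of the CouchDB-style replication request that a channel-filtered pull boils down to. The filter name `sync_gateway/bychannel` and the comma-separated `channels` parameter are what Couchbase Lite uses for channel filtering; the helper name, URL, database name, and channel names are hypothetical.

```python
import json


def channel_replication_body(source, target, channels):
    """Build a CouchDB-style replication request restricted to some channels."""
    return {
        "source": source,
        "target": target,
        "filter": "sync_gateway/bychannel",
        # Channel names travel as a single comma-separated string.
        "query_params": {"channels": ",".join(channels)},
    }


# Hypothetical gateway URL, local database name, and channel names.
body = channel_replication_body(
    "http://localhost:4984/locations", "locations", ["active", "group-a"]
)
print(json.dumps(body, indent=2))
```

Note that the parameters only name channels, so there is no slot here for a view name or a min/max range over a property such as updatedAt.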

—Jens

Jens Alfke

Sep 30, 2015, 3:41:23 PM
to mobile-c...@googlegroups.com

On Sep 30, 2015, at 12:36 PM, Jens Alfke <je...@couchbase.com> wrote:

There isn’t really a clean way to do that.

Actually I just thought of a possibly-better way: on the server side, instead of updating a doc to mark it as expired, just purge it. That basically nukes the doc from the server, as though it had never existed. It will never show up in any future pull replications to any client. As a side benefit, it also reclaims space in the server database.

Of course this only works if you never need to access those expired docs again...

—Jens

Yonah Forst

Oct 1, 2015, 12:31:38 PM
to Couchbase Mobile
Thanks Jens. I would need to purge the document via the sync_gateway REST API, correct? I couldn't find documentation on how to purge a specific item.

Jens Alfke

Oct 3, 2015, 3:30:03 PM
to mobile-c...@googlegroups.com

On Oct 1, 2015, at 9:31 AM, Yonah Forst <yonah...@gmail.com> wrote:

Thanks Jens. I would need to purge the document via the sync_gateway REST API, correct? I couldn't find documentation on how to purge a specific item.

Oops, it looks like Sync Gateway doesn’t implement _purge. :(

—Jens

Yonah Forst

Oct 6, 2015, 10:30:05 AM
to Couchbase Mobile
Awww man...

Do you have any suggestions? I've poked around online and found a few other cases where people were concerned about their users replicating large amounts of 'stale' data. Is there a correct way to model your DB or user permissions to avoid this problem? 