Using "since" param in _changes Native API


Rajagopal V

Oct 13, 2016, 12:25:47 AM
to Couchbase Mobile
I'm using Couchbase Lite on Android/iOS, replicating data from a CouchDB server. We have hit a problem: once the number of documents gets large (> 100000), replication becomes very slow. To avoid this, I'm trying a different approach: query a view on CouchDB that returns the required documents, do a one-off replication of them using the _doc_ids filter, and then start a continuous replication so that future changes are pulled in as well. This seems to work okay, and better than plain replication, where the initial replication easily takes about 8-10 mins to finish.

Just to explain better, the sequence is this:
1. User logs in on the mobile app
2. Make an HTTP call to a CouchDB view that returns the documents we need
3. Do a one-off replication using the Native API. After this (which takes about 3 mins for 100000 documents), the application is usable.
4. Trigger a continuous replication so that future changes are obtained. This takes a while to catch up to the sequence that the one-off replication already reached.
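
Steps 2 to 4 can be sketched at the HTTP level as follows. This is a minimal sketch, not the Couchbase Lite Native API: it only shows how the document IDs from a view response could feed a one-off replication, assuming the standard CouchDB view response shape (rows with an "id" field) and the "doc_ids" and "continuous" fields of a CouchDB-style _replicate request body. The server URL, database names, and function names are hypothetical.

```python
import json


def doc_ids_from_view(view_response):
    """Return the unique document IDs in a CouchDB view response, in row order."""
    rows = json.loads(view_response).get("rows", [])
    seen = set()
    ids = []
    for row in rows:
        doc_id = row.get("id")  # rows from a reduce carry no "id"; skip them
        if doc_id is not None and doc_id not in seen:
            seen.add(doc_id)
            ids.append(doc_id)
    return ids


def one_off_request(source, target, doc_ids):
    """Step 3: request body for a one-off replication limited to the given IDs."""
    return {"source": source, "target": target, "doc_ids": list(doc_ids)}


def continuous_request(source, target):
    """Step 4: request body for a continuous replication."""
    return {"source": source, "target": target, "continuous": True}


if __name__ == "__main__":
    # Hypothetical view response; the same document may appear under several keys.
    sample = json.dumps({
        "total_rows": 3,
        "offset": 0,
        "rows": [
            {"id": "doc-a", "key": "user1", "value": None},
            {"id": "doc-b", "key": "user1", "value": None},
            {"id": "doc-a", "key": "user2", "value": None},
        ],
    })
    ids = doc_ids_from_view(sample)
    remote = "http://couchdb.example.com:5984/appdb"  # hypothetical server URL
    print(json.dumps(one_off_request(remote, "appdb", ids)))
    print(json.dumps(continuous_request(remote, "appdb")))
```

De-duplicating the IDs matters because a view can emit the same document under more than one key.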

The problem is with Step 4 above: there is no way to pass a "since" parameter to the _changes feed. The Native API doesn't seem to have a way to feed in the since parameter. If I can get the sequence number after the one-off replication (which should be returned as per http://docs.couchdb.org/en/1.6.0/api/database/changes.html?highlight=_changes), how can I start the continuous replication from that point?
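
For reference, this is how "since" works on the CouchDB side: a _changes response carries a "last_seq" field, and a later request can pass that value back as the "since" query parameter. The sketch below only reads "last_seq" from a sample response and builds the follow-up URL. The host, database name, and function names are hypothetical, and it does not show any way to hand the value to the Couchbase Lite replicator.

```python
import json
from urllib.parse import urlencode


def last_seq(changes_response):
    """Read the sequence a _changes response ended at."""
    return json.loads(changes_response)["last_seq"]


def changes_url(db_url, since, feed="normal"):
    """Build a _changes URL that resumes from the given sequence."""
    query = urlencode({"since": since, "feed": feed})
    return f"{db_url.rstrip('/')}/_changes?{query}"


if __name__ == "__main__":
    # Hypothetical, trimmed-down _changes response.
    response = '{"results": [], "last_seq": 100000}'
    seq = last_seq(response)
    print(changes_url("http://couchdb.example.com:5984/appdb", seq, feed="continuous"))
```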

Thanks
Raja
 

Jens Alfke

Oct 13, 2016, 1:32:15 AM
to mobile-c...@googlegroups.com
On Oct 12, 2016, at 9:25 PM, Rajagopal V <raja...@gmail.com> wrote:

I'm using Couchbase Lite Android/iOS that replicates data from a CouchDB server. We have hit a problem where, when the number of documents increases to a high number (> 100000), the replication goes very slow.

Right; IIRC this is because you’re using a filter function on CouchDB, which scales poorly because every document has to be passed through a JavaScript function (for every client that replicates.)
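
The scaling point can be illustrated with a toy model. This is only a sketch, not CouchDB code: it assumes every client performs an initial filtered replication from the start of the feed, and it simply counts how often a hypothetical filter function runs. The document shape and names are made up for illustration.

```python
def count_filter_calls(num_docs, num_clients):
    """Count filter invocations when each client replicates every document."""
    calls = 0

    def doc_filter(doc):
        # Stand-in for a server-side filter function.
        nonlocal calls
        calls += 1
        return doc["owner"] == "user1"

    docs = [{"_id": str(i), "owner": "user1" if i % 10 == 0 else "user2"}
            for i in range(num_docs)]
    for _ in range(num_clients):
        # Each client's filtered feed re-runs the filter over every document.
        [d for d in docs if doc_filter(d)]
    return calls


if __name__ == "__main__":
    print(count_filter_calls(1000, 5))
```

In this model the work grows as documents times clients, even though each client only receives a fraction of the documents.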

The problem is that for Step 4 above, there is no way to pass in a "since" parameter to the _changes feed. The Native API doesn't seem to have a way to feed in the since parameter.

I think what you’re saying is that there’s no way to pass a remote “since” value to the replicator, i.e. a checkpoint to start from. This is true. The only case where you might need it is if you’re trying to halfway replace the replicator, as you’re doing, and I don’t think this is something we can provide support for. The replicator is pretty intricate, and we have enough support issues without exposing more parameters that could mess up its state… :/

It’s also worth pointing out [if I haven’t already!] that the ‘channels’ mechanism in Sync Gateway was designed explicitly to get around this scalability problem in CouchDB.

—Jens

Raja

Oct 13, 2016, 1:48:16 AM
to mobile-c...@googlegroups.com
On Thu, Oct 13, 2016 at 11:02 AM, Jens Alfke <je...@couchbase.com> wrote:

Right; IIRC this is because you’re using a filter function on CouchDB, which scales poorly because every document has to be passed through a JavaScript function (for every client that replicates.)


Yes, that seems to be a problem, even with filters written in Erlang. (Side note: I've posted an SO question comparing filters in JS and Erlang, as the Erlang ones seem slower than the JavaScript ones, which is very strange: http://stackoverflow.com/questions/39990772/couchdb-erlang-replication-filter-slower-than-javascript)
The problem is that for Step 4 above, there is no way to pass in a "since" parameter to the _changes feed. The Native API doesn't seem to have a way to feed in the since parameter.

I think what you’re saying is that there’s no way to pass a remote “since” value to the replicator, i.e. a checkpoint to start from. This is true. The only case where you might need it is if you’re trying to halfway replace the replicator, as you’re doing, and I don’t think this is something we can provide support for. The replicator is pretty intricate, and we have enough support issues without exposing more parameters that could mess up its state… :/

It’s also worth pointing out [if I haven’t already!] that the ‘channels’ mechanism in Sync Gateway was designed explicitly to get around this scalability problem in CouchDB.

@tleyden suggested the same thing in another thread. I'll try and see if the above solution is bearable; otherwise I will have to shift to using Channels and Sync Gateway.

Thanks
Raja
