Mongo *_ops collections


Christian Stewart

May 29, 2014, 2:13:10 AM
to der...@googlegroups.com
Hi all,

I was wondering (as I couldn't find anything on this) whether there is some way to avoid using the *_ops collections in the Mongo store. I thought Redis was supposed to be used for syncing multiple server processes, not Mongo?

The main issue is that at this point the ops collections are now larger than the actual stored data, and they never seem to be cleared. Should I just periodically clean them, or is there some better solution?

Thanks!

Vladimir Makhaev

May 30, 2014, 1:01:48 AM
to der...@googlegroups.com
As far as I know, storing ops is not necessary, and you can also store them in another db.
See the LiveDB docs (the oplog is the ops storage): https://github.com/share/livedb

Christian Stewart

May 30, 2014, 3:07:04 AM
to der...@googlegroups.com
How can I disable storing ops then?


--
You received this message because you are subscribed to a topic in the Google Groups "Derby" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/derbyjs/lfwziMXnaM4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to derbyjs+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Vladimir Makhaev

May 30, 2014, 3:09:19 AM
to der...@googlegroups.com
Try this:
var store = derby.createStore({snapshotDb: db});

Christian Stewart

May 30, 2014, 3:11:17 AM
to der...@googlegroups.com
Do I just set it to mongodb, or something else? It would be good if it could use oplog tailing.

Joseph Gentle

May 30, 2014, 3:14:04 AM
to derbyjs
It's a good question, and you're not the first to run into this
problem. The ops are needed to allow users to make changes to a
document while they're on holiday with no internet access. We need the
ops to be able to resolve what happens.

I think the right solution is to allow ops to be removed or expired
from the database after a timeout of a week or two. This might even
work now (and some people already do this), but until there are tests
around this use case in livedb, I won't make any guarantees about the
server's behaviour.

I'm hoping someone steps in and adds tests so we know that it'll do
the right thing if ops are missing and you submit an op or call getOps
/ fetch. It should be as simple as setting an expiry for those
collections in mongodb, and that should be configurable via
livedb-mongo.
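
To illustrate the expiry idea, here is a minimal sketch of the options for a MongoDB TTL index. It assumes each op document has a field holding a BSON Date, since TTL indexes only act on Date values. The field name `createdAt` and the collection name `users_ops` are hypothetical, and nothing in this thread confirms that livedb-mongo writes such a field, so treat it as a starting point only.

```javascript
// Sketch only: builds the arguments for a MongoDB TTL index.
// ASSUMPTION: op documents contain a BSON Date in `dateField`.
const SECONDS_PER_DAY = 24 * 60 * 60;

function ttlIndexSpec(dateField, days) {
  return {
    keys: { [dateField]: 1 },
    options: { expireAfterSeconds: days * SECONDS_PER_DAY },
  };
}

// Expire ops two weeks after the (hypothetical) `createdAt` date.
const spec = ttlIndexSpec('createdAt', 14);

// With the official MongoDB Node.js driver this would be passed as:
//   await db.collection('users_ops').createIndex(spec.keys, spec.options);
```

As noted above, whether the server behaves correctly once old ops are gone is not covered by tests, so this should be tried on a copy of the data first.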

-J



Joseph Gentle

May 30, 2014, 3:16:20 AM
to derbyjs
What Vladimir suggested won't work - you still need an oplog
(otherwise you can't handle simultaneous operations). You just don't
want to store ops for longer than a couple of weeks. I guess these ops
could be in redis - we already store a couple of weeks of ops in redis
and let them expire via redis's TTLs.

What do you mean by oplog tailing?

Christian Stewart

May 30, 2014, 3:22:05 AM
to der...@googlegroups.com
Meteor is using oplog tailing now, which means, it can read the operations log output (now supported on MongoHQ) to stream operations quickly. It's just the log of changes in MongoDB and far more efficient than storing operations in a collection.

Joseph Gentle

May 30, 2014, 4:23:39 AM
to der...@googlegroups.com
Meteor's editing operations are quite different from what we use in
livedb (and hence derby). Livedb supports operations like "Insert at
this position in this list", and concurrent edits to the same list
will be resolved naturally and correctly. We don't talk about it
enough, but derby supports realtime collaborative editing out of the
box as a natural result. (With almost no special code in derby).

If two users try to edit the same document at the same time with
meteor, as I understand it only one user's edits will be reflected in
the final document.

If you want to resolve things correctly, the mongo operation ("Replace
{x:[1,2,3]} with {x:[1,2,2,3]}") simply doesn't contain enough
information to resolve concurrent edits. We could use the mongo oplog
to do multiserver concurrency (ie, replace redis's pubsub system), but
that wouldn't solve your problem anyway. It would also be slower...

To resolve the change, we also need to know the list of concurrent
changes (semantic changes, not just the new data) which have been made
to the document. To do that we need to rewind the oplog a little and
look at some of the old data in it. As I understand it, mongo's oplog
is just a feed. You can't rewind it.
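
A toy sketch of why the semantic op matters: two users insert into the same list concurrently, and each op is adjusted against the other before it is applied. The `{index, value}` op shape and both function names are made up for illustration. This is not livedb's or ShareJS's real op format or transform code.

```javascript
// Simplified list-insert op: { index, value }. Illustrative only.
function applyInsert(list, op) {
  const copy = list.slice();
  copy.splice(op.index, 0, op.value);
  return copy;
}

// Rewrite `op` so it can be applied after `other`, where both were made
// concurrently against the same starting list. `opWinsTies` decides which
// value comes first when both insert at the same index.
function transformInsert(op, other, opWinsTies) {
  const shifts =
    other.index < op.index || (other.index === op.index && !opWinsTies);
  return shifts ? { index: op.index + 1, value: op.value } : op;
}

const start = [1, 2, 3];
const a = { index: 2, value: 'A' }; // user A inserts before the 3
const b = { index: 0, value: 'B' }; // user B inserts at the front

// One server applies A first, another applies B first. Both converge.
const viaA = applyInsert(applyInsert(start, a), transformInsert(b, a, false));
const viaB = applyInsert(applyInsert(start, b), transformInsert(a, b, true));
```

Both orders end with the same list containing both inserts. A plain "replace the whole array" write carries no index information, so this adjustment cannot be made from it.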

Unlike meteor, livedb (and hence derby) has no hard dependency on
mongo. At Lever I'd like to replace mongo & redis with postgresql &
foundationdb because I think they're better fits for what we're doing.
And that's something we can quite reasonably do due to livedb's
architecture.

-J

Christian Stewart

May 30, 2014, 12:58:41 PM
to der...@googlegroups.com
Thanks for explaining that! It seems much faster than Meteor's implementation at the moment (the main reason I'm migrating is because Meteor used a massive amount of CPU and Ram even when scaled to > 10 instances). 

So, is there any process I can use to clean up the _ops collections? It seems like right now they could grow to be far too massive.

Vladimir Makhaev

Jun 2, 2014, 1:39:05 PM
to der...@googlegroups.com
Christian, it looks like you have recent experience with both Meteor and Derby. Could you give a comparison between them?
We raised this topic before, a long time ago, here: https://groups.google.com/forum/#!searchin/derbyjs/meteor/derbyjs/6pyjiY33nFg/I2FG9MmyWCUJ
Some of my thoughts are there, but I last used Meteor about 2 years ago. What's new there? Why did you switch to Derby?

Christian Stewart

Jun 2, 2014, 2:37:43 PM
to der...@googlegroups.com
Hey!

Basically I had terrible performance scaling Meteor (subscriptions would take ages to go through) and tons of CPU and RAM usage. Derby seemed lower level and much lighter so I wanted to use it for the site.

I actually ended up switching back to using a very very very simple version of my old meteor app with just authentication and client side reactive templating (local collections) powered by a C++ websocket server for the realtime data. I did this because I need to release this app quickly and also because my existing database doesn't seem compatible (ShareJS's pointless _ properties seem to mess me up there). It just seemed like too much effort to integrate Derby at this point when there is so little documentation (for example I couldn't figure out if there was any equivalent to server-side methods or if those had to be done through HTTP requests).

I will probably try using Derby for some project in the future but at the moment Meteor is just too easy to work with and it seems they're getting closer to improving scaling performance with their recent efforts into oplog tailing and their Galaxy scaling service.

Vladimir Makhaev

Jun 3, 2014, 12:32:31 AM
to der...@googlegroups.com
Thanks, that's interesting. I agree, the docs are a problem.

Dan Dascalescu

Mar 20, 2015, 9:46:00 PM
to der...@googlegroups.com
Hey Christian,

I'm writing a section on Meteor's scalability in Why Meteor, and I wanted to follow up on your experience with RAM and CPU consumption in Meteor vs Derby.js:


On Friday, May 30, 2014 at 9:58:41 AM UTC-7, Christian Stewart wrote:
Thanks for explaining that! It seems much faster than Meteor's implementation at the moment (the main reason I'm migrating is because Meteor used a massive amount of CPU and Ram even when scaled to > 10 instances). 

Were you using MongoDB oplog tailing back then?

In your post a few days later, you said:
 
I will probably try using Derby for some project in the future but at the moment Meteor is just too easy to work with and it seems they're getting closer to improving scaling performance with their recent efforts into oplog tailing and their Galaxy scaling service.

Wondering if poll-and-diff was the CPU+RAM problem back then.

Thanks,
Dan

Ian Johnson

Mar 23, 2015, 3:05:17 PM
to der...@googlegroups.com
Hi Dan,

Thanks for the article, it's quite fair and a good overview of the options.

Just one question, how do you see Meteor as the most "open"?
It's the only one that uses its own package management system and isn't built from standard open source pieces. The numbers don't lie when it comes to current usage and popularity, but that statement sounds unqualified to me.





--
Ian Johnson - 周彦

Dan Dascalescu

Mar 23, 2015, 6:36:47 PM
to der...@googlegroups.com
On Monday, March 23, 2015 at 12:05:17 PM UTC-7, Ian Johnson wrote:
Just one question, how do you see Meteor as the most "open"?

Well, Meteor is quite a bit more open than Wakanda, which uses its own database. Meteor can use npm packages, and there are thousands of packages on Atmosphere that wrap 3rd party libraries. More about that in the Ecosystem section.

Meteor could be even more open if it used npm instead of its own packaging system, but there are good reasons they chose to roll out Isobuild. Thanks for bringing up the openness topic, actually - I've updated the Ecosystem section in that regard.

Matteo Brunati

Mar 25, 2015, 4:56:11 AM
to der...@googlegroups.com
Hi Christian,
When we don't need the oplog history, we cruelly delete the _ops collections periodically - let's say once a day.
We have observed no problems so far.

Maybe a less drastic solution would be to remove all records in the _ops collections except the last one,
so that ShareJS does not complain too much when trying to resolve concurrency issues, right?

We are going to try flushing all records in redis as well from time to time, but we haven't got there yet.

Cheers,
Matteo

Matteo Brunati

Mar 27, 2015, 6:28:53 AM
to der...@googlegroups.com
I forgot to say that we also remove deleted records from the db.
For now, when you delete a document using model.del(), the db record stays alive in mongo, but all the document data is replaced by {data: null} and the last data is pushed to the _ops collection.
In our cleaning script, we clean the _ops collections but we also clean the {data: null} objects, because in our case a deleted object is not recoverable.

You can find example code for this here: https://github.com/mattbrun/cows/blob/master/server/clean.js
In this case I don't remove the whole _ops collection. I just delete all the records in it except the last one.
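
The "keep only the last op" rule could be sketched roughly as below. This is not the code from the linked script. It assumes each op record has a `name` field (the document id) and a numeric `v` field (the version), so check both against your actual _ops documents before relying on it. The function name `staleOpIds` is made up.

```javascript
// Sketch only. ASSUMED op record shape: { _id, name, v }, where `name` is
// the document id and `v` is the op version.
// Returns the _id of every op except the newest one for each document.
function staleOpIds(ops) {
  const latest = new Map();
  for (const op of ops) {
    const seen = latest.get(op.name);
    if (seen === undefined || op.v > seen) latest.set(op.name, op.v);
  }
  return ops.filter((op) => op.v < latest.get(op.name)).map((op) => op._id);
}

const sampleOps = [
  { _id: 'a v0', name: 'a', v: 0 },
  { _id: 'a v1', name: 'a', v: 1 },
  { _id: 'a v2', name: 'a', v: 2 },
  { _id: 'b v0', name: 'b', v: 0 },
];
const ids = staleOpIds(sampleOps);

// The ids could then be removed with the MongoDB Node.js driver, e.g.
//   await opsCollection.deleteMany({ _id: { $in: ids } });
```

For large collections the records would have to be streamed rather than loaded into memory, which this sketch does not attempt.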

Cheers,
Matteo