Hey Meteor folks! I've had a busy month since Arunoda first posted Smart
Collections (finishing up linker/0.6.5! vacation!) but I finally found the time
to review his package. I've been planning to put some time into similar work for
a while now, but there's so much to do here at Meteor HQ, so it's really
exciting when community members outside the company do it for you :)
Arunoda's package is a pleasure to read, and looking at the commit history and
tests it's clear he's made it evolve quickly. And though I haven't run it
against our benchmarks the results he and others are reporting do speak for
themselves.
While performance is certainly important, the main strategy so far in this
"preview" phase of Meteor has been to design great APIs that are easy to use and
*can* have implementations with great performance, not necessarily to squeeze
the most optimal performance out of the initial implementations. We've
always known that the relatively naive "re-run queries in full when we suspect
they've changed, and diff the results" algorithm is not the final implementation
of live cursors, and that it doesn't always perform well. So it's super
validating to see Arunoda taking (some of) our Mongo observeChanges APIs and
getting notably better performance.
My summary of the ideas behind Arunoda's implementation is:
To do an unordered observeChanges call (which is what drives "return a cursor
from a publish function", which is the most important server-side use of
observeChanges and the biggest bottleneck for most Meteor apps), you don't
actually need to cache the entire contents of the cursor, and you don't need to
execute a full diff of every object. Instead, you just need to cache the set of
documents (by ID), and be connected enough to write operations that we have an
idea of which documents change.
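To make that concrete, here's a toy sketch in plain JavaScript (my own
illustration, not Smart Collections' actual code): an unordered observer only
needs a set of matching _ids, plus the three observeChanges callbacks.

```javascript
// Minimal sketch: an unordered observeChanges driver caches only the
// *set* of matching document IDs, never the full documents.
class IdSetObserver {
  constructor(callbacks) {
    this.ids = new Set();        // cached set of matching _ids
    this.callbacks = callbacks;  // { added, changed, removed }
  }
  // A document now matches the cursor's selector.
  noteMatch(id, fields) {
    if (!this.ids.has(id)) {
      this.ids.add(id);
      this.callbacks.added(id, fields);
    }
  }
  // A document stopped matching (or was removed).
  noteUnmatch(id) {
    if (this.ids.delete(id)) this.callbacks.removed(id);
  }
  // A matching document's fields changed.
  noteChange(id, changedFields) {
    if (this.ids.has(id)) this.callbacks.changed(id, changedFields);
  }
}
```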
What do I mean by "be connected enough"? Arunoda's package implements two
separate strategies:
(A) If you can configure it to connect to the mongo oplog (not possible in
every mongo hosting environment!) then you get a direct feed of every
insert, update, and remove operation in the entire database. Each operation
is super specific: it tells you exactly which document changed (by ID), and
for updates, it simply says "$set/$unset these fields" (no complex
modifiers like $inc or $addToSet).
(B) If you can't configure the oplog, it uses a similar strategy to current
Meteor where it notices write operations that originate inside the process
itself. These write operations can have arbitrary selectors and modifiers
and fully understanding them does require using minimongo or the like.
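For strategy (A), the oplog entries themselves are very regular. The field
names below (op/ns/o/o2) are the real fields of entries in Mongo's
local.oplog.rs; the dispatcher around them is just my sketch of how a tailing
consumer might fan them out:

```javascript
// Sketch of dispatching raw oplog entries. Shapes follow MongoDB's
// local.oplog.rs format: 'i' = insert, 'u' = update, 'd' = delete.
function dispatchOplogEntry(entry, handlers) {
  switch (entry.op) {
    case 'i':  // insert: entry.o is the full new document
      return handlers.insert(entry.ns, entry.o);
    case 'u':  // update: entry.o2 holds {_id}, entry.o is $set/$unset
      return handlers.update(entry.ns, entry.o2._id, entry.o);
    case 'd':  // delete: entry.o holds {_id}
      return handlers.remove(entry.ns, entry.o._id);
    default:   // 'n' (no-op), 'c' (command): not data changes
      return null;
  }
}
```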
When it notices an insert, it evaluates the selectors on all cursors, and for
those that match, it adds the document to their set. This doesn't require any
database reads, but it does require the selector logic to correctly match Mongo.
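In sketch form (the `matches` below handles only top-level equality selectors;
the real implementation needs full minimongo selector semantics, which is
exactly the hard part):

```javascript
// Hedged sketch: handling a noticed insert with zero database reads.
// Only top-level equality selectors are supported here.
function matches(selector, doc) {
  return Object.keys(selector).every((k) => doc[k] === selector[k]);
}

function handleInsert(cursors, doc) {
  for (const c of cursors) {       // c: { selector, ids, added }
    if (matches(c.selector, doc) && !c.ids.has(doc._id)) {
      c.ids.add(doc._id);
      c.added(doc._id, doc);       // no DB round-trip needed
    }
  }
}
```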
When it notices a single-document-by-ID remove, it just removes the document
from every cursor that contains it (since cursors do track their set of IDs)
--- easy!
When it notices a by-selector remove, it re-polls every cursor (but only asking
it to return IDs). This is similar to what current Meteor does, but at least
comparing a list of IDs is faster than doing a full recursive diff. (This does
NOT occur when using oplog, though!)
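The ID-only re-poll boils down to a cheap set diff, something like this sketch
(names are mine):

```javascript
// Sketch of the cheap re-poll after a by-selector remove: fetch only
// the IDs that still match, diff against the cached ID set, and emit
// `removed` for anything that disappeared. No recursive document diff.
function repollIds(cachedIds, freshIds, removedCb) {
  const fresh = new Set(freshIds);
  for (const id of [...cachedIds]) {
    if (!fresh.has(id)) {
      cachedIds.delete(id);
      removedCb(id);
    }
  }
}
```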
When it notices a single-document-by-ID update, it does a single-document read
of the changed document, looks at all the fields mentioned in the modifier, and
emits a changed callback listing those fields. (It's possible that some of those
fields won't have actually changed, though! eg, if you are {$set}ing something
to the value it already has. In the common case of "immediately hooked up to a
DDP publication", a different caching layer will suppress the extra message,
but this does technically break the observeChanges API.)
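Computing "the fields mentioned in the modifier" looks roughly like this
(my sketch; it reduces dotted paths to their top-level field, since that's the
granularity observeChanges reports):

```javascript
// Sketch: given an update modifier, compute which top-level fields it
// can touch. After the single-document re-read, `changed` is emitted
// listing just these fields (some may not have actually changed).
function fieldsTouchedByModifier(modifier) {
  const fields = new Set();
  for (const [op, args] of Object.entries(modifier)) {
    if (op.startsWith('$')) {
      for (const path of Object.keys(args)) {
        fields.add(path.split('.')[0]);  // '$set: {a.b: 1}' touches 'a'
      }
    } else {
      fields.add(op);  // replacement-style update: plain field name
    }
  }
  return fields;
}
```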
When it notices a by-selector update... well, unfortunately the current
behavior is broken (see the linked issue; the first comment is from me).
Arunoda plans to change this to poll all cursors,
which certainly is pretty similar to the current Meteor implementation (though
his clever "just look at the fields mentioned in the modifier instead of doing a
full diff" idea will still help). The good news is that this case doesn't occur
when you have an oplog anyway.
The package doesn't implement a bunch of things from the collection API; some of
these I'm sure he could implement quite easily, whereas others I suspect are
fundamentally incompatible with this approach:
- fields filtering (really important for security!) --- not too hard to
  implement. (In related news, Slava started implementing this for minimongo
  recently!)
- ordered observeChanges, skip/limit with sort, etc: I don't really see how
these could be implemented without caching more information about the
documents, or doing a full re-poll like we do now. (But we could be clever
and only cache the data that is relevant for the sort.)
- latency compensation (the write fence): Meteor methods (such as the
auto-generated insert/update/remove collection methods) have two different
"done" messages. One is the "result" message which contains the method's
return value or error, and is delivered as soon as the method body returns
or throws. The other is the "updated" message, which specifies that any
writes done by the method have been reflected in data messages
  (added/changed/removed/etc) sent from server to client. The latter message is
  what links together the two components of DDP (methods and data). If any
  collection write happens in a method body, then by use of an object called
  the Write Fence, we ensure that the "updated" message does not get sent until
  all possibly-affected cursors have been polled one time. The client uses
this message to prevent flicker (ie, latency compensation): essentially, any
documents that are modified by the client-side stub are "frozen" until the
method's "updated" message shows up, at which point we should have seen the
final value of the documents.
Arunoda hasn't implemented use of the write fence, which means that it's
very possible to see a flicker back to the original value after running a
method, before the new value comes down the wire. This can be added though
(with a slightly trickier implementation than for the polling algorithm).
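The fence idea itself is simple; here's a toy version (names and API are mine,
not Meteor's internals): each outstanding observer poll "arms" a pending write,
and the "updated" message fires only once the method body has finished and
every write has been reflected.

```javascript
// Toy write fence sketch: fire onAllCommitted only after arm() has
// been called (method body returned) AND every write registered via
// beginWrite() has been retired.
class WriteFence {
  constructor(onAllCommitted) {
    this.outstanding = 0;
    this.armed = false;
    this.onAllCommitted = onAllCommitted;
  }
  beginWrite() {
    this.outstanding++;
    return () => {               // call once the poll has run
      this.outstanding--;
      this._maybeFire();
    };
  }
  arm() {                        // the method body has returned
    this.armed = true;
    this._maybeFire();
  }
  _maybeFire() {
    if (this.armed && this.outstanding === 0) this.onAllCommitted();
  }
}
```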
So how do we get something like this into Meteor core? The great thing about
building a non-core module like smart collections is that you *don't* have to
implement every detail of the API, at least at first. But that's not an option
for core.
Because some parts of the observeChanges API (esp the ordered parts) probably
can't be easily supported with this approach, I think we do need to leave some
version of the current approach in Meteor core. Additionally, my concern about
"update via selector" makes me mostly interested in using this approach when the
oplog is available, not without oplog (because it does seem like if you have to
process arbitrary update commands, we have to fall back on polling
anyway). Plus, I'm not sure that "doesn't see database changes from outside the
single server process" is tenable for core. Not every Meteor deploy is going to
have oplog access, so we can't assume that.
Additionally, Mongo is a pretty complex beast. Minimongo is a good start at an
implementation of selector logic, but it's definitely imperfect. So far in
Meteor, it's mainly used on the client, which isn't a security-critical place:
it doesn't affect what data gets published over the network. Putting Minimongo's
evaluators into the critical path determining what data gets sent to the client
is scary! Now, that's the good kind of scary --- the sort of fear that
will make us try hard to have a great implementation, and this is something we
always knew we'd have to do. But I'd like to get there incrementally. And so I'd
like to start only doing this for "simple" selectors without some of the more
complex $operators where we are more likely to disagree with Mongo's
implementation.
So I definitely do want to keep around the current implementation as a fallback
strategy when oplog isn't available, or (for now) for complex selectors, or for
ordered observeChanges, skip/limit/sort, etc. Which means that directly merging
in Arunoda's package isn't an option. But taking inspiration from it certainly
is! This week, I'm going to start work on the oplog branch, adding oplog-driven
observeChanges for a subset of cursors where I'm confident we'll get the logic
right. Let's see how this goes!
--dave
ps:
some other observations about Arunoda's package:
- It defines a Meteor.deepEqual. These days in Meteor we want all data to be
EJSON and then you can just use EJSON.equals.
- Factoring out the implementation of allow/deny into its own class is a great
idea.
- The regexp in oplog tailing needs a literal '\\.' after the DB name (an
  unescaped '.' matches any character).
- When you pass a callback to non-Meteor code, you can't just wrap it in a
Fiber --- you need to use Meteor.bindEnvironment (or better yet,
  Meteor._wrapAsync). This guarantees that when the callback gets called, it
  still knows which method it's inside --- which includes what the userId is,
  so forgetting this leads to security bugs. (Yes, I realize we have not really
documented this. _wrapAsync is new and we are waiting to make sure it's
exactly the right API, at which point we will remove the _ and document it.)
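  To see why this matters, here's a plain-JS analogy (this is NOT Meteor's
  implementation, which uses fiber-local state; `currentUserId` and
  `bindContext` are stand-ins I made up): capture the dynamic context when the
  callback is created, and restore it whenever the callback later runs.

  ```javascript
  // Analogy for Meteor.bindEnvironment: capture context at wrap time,
  // restore it around each invocation of the callback.
  let currentUserId = null;  // stand-in for fiber-local method state

  function bindContext(fn) {
    const captured = currentUserId;
    return function (...args) {
      const saved = currentUserId;
      currentUserId = captured;    // restore the creating context
      try { return fn.apply(this, args); }
      finally { currentUserId = saved; }
    };
  }
  ```

  A naively-passed callback would instead see whatever context the driver
  happened to be in when it fired --- usually no userId at all.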
- The package combines the idea of the "cursor" and the "observe handle" in a
way that doesn't match the actual API. You should be able to call
observeChanges multiple times on the same cursor (with different callbacks,
say) and stop them independently.
- The package doesn't really do de-duping in the same way that the current
implementation does (though admittedly de-duping in the current
implementation is a little complex). There is some de-duping for the
remove-by-selector polling.