Introducing Smart Collections


Arunoda Susiripala

Jul 19, 2013, 1:22:06 PM
to meteo...@googlegroups.com
Hi Guys,

Want to make your Meteor app perform well? To make it scale easily? To take advantage of multiple CPUs?


Smart Collections is a complete rewrite of the MongoDB collection implementation for Meteor. Integration is pretty simple, and it is available via Atmosphere.

Let me know what you think of this!

Josh Cope

Jul 19, 2013, 3:48:52 PM
to meteo...@googlegroups.com
Nice, I will look at it later.

Arunoda Susiripala

Jul 19, 2013, 3:50:02 PM
to meteo...@googlegroups.com
Cool. 


On Sat, Jul 20, 2013 at 1:18 AM, Josh Cope <jcop...@gmail.com> wrote:
Nice, I will look at it later.

--
You received this message because you are subscribed to the Google Groups "meteor-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to meteor-talk...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Josh Cope

Jul 19, 2013, 4:01:04 PM
to meteo...@googlegroups.com
I notice you use 



1.3.12 just got released today if you want to update

1.3.12 2013-07-19
-----------------
- Fixed issue where timeouts sometimes would behave wrongly (Issue #1032)
- Fixed bug with callback third parameter on some commands (Issue #1033)
- Fixed possible issue where killcursor command might leave hanging functions
- Fixed issue where Mongos was not correctly removing dead servers from the pool of eligable servers
- Throw error if dbName or collection name contains null character (at command level and at collection level)
- Updated bson parser to 0.2.1 with security fix and non-promotion of Long values to javascript Numbers (once a long always a long)

Arunoda Susiripala

Jul 19, 2013, 4:03:12 PM
to meteo...@googlegroups.com
Thanks. Will do the update.



curiou...@gmail.com

Jul 22, 2013, 12:58:34 PM
to meteo...@googlegroups.com
Thanks for your contributions to the Meteor community. I am a novice with these technologies and unable to evaluate the different alternatives on a deep level. Are there any disadvantages to using this compared to the standard Collections? If not, will this be integrated into the core Meteor project any time soon?

Arunoda Susiripala

Jul 22, 2013, 1:08:20 PM
to meteo...@googlegroups.com
I've done a performance test, and you can get ~5x better performance for your app and ~20x better performance for Mongo.
More details will be published tomorrow on MeteorHacks.

Not sure about the integration, but it doesn't need to be integrated; just get it from Atmosphere and use it.

Josh Cope

Jul 22, 2013, 1:38:40 PM
to meteo...@googlegroups.com
I wonder if the Meteor team would be interested in integrating it with the core.

I know not everyone uses Atmosphere, and some might not realize that a certain package exists. As long as it's well tested, I don't see the harm. I know they are planning on making user-created packages available straight through the core in the feature?

Josh Cope

Jul 22, 2013, 1:39:50 PM
to meteo...@googlegroups.com
future not "feature"

Arunoda Susiripala

Jul 22, 2013, 1:44:26 PM
to meteo...@googlegroups.com
Yes, I saw it on a blog. Let's see how this works out :)


On Mon, Jul 22, 2013 at 11:09 PM, Josh Cope <jcop...@gmail.com> wrote:
future not "feature"


curiou...@gmail.com

Jul 22, 2013, 2:15:48 PM
to meteo...@googlegroups.com
That is great. Thanks again!

Gabriel Pugliese

Jul 22, 2013, 2:17:38 PM
to meteo...@googlegroups.com
At least it would be nice if someone from the Meteor core team commented on it, because, AFAIK, they are already working on Mongo improvements.


Gabriel Pugliese
CodersTV.com
@gabrielsapo

Arunoda Susiripala

Jul 22, 2013, 2:39:37 PM
to meteo...@googlegroups.com
Yep, I know that. But this is a total rewrite. They have to use this approach if they need to integrate the oplog. Otherwise it doesn't help much.

I'll be working on the oplog this weekend :)

Owen Rees-Hayward

Jul 23, 2013, 9:24:27 AM
to meteo...@googlegroups.com
Hey Arunoda,

Thanks for another great contribution to the Meteor eco-system. The scalability of the Live Results set is a real concern for me, and I'm sure many others who are considering production apps with Meteor. I'm sure Smart Collections and Meteor Cluster are helping drive forward the adoption of Meteor.

Really looking forward to the Smart Collections oplog solution for horizontal scaling.

Owen

Arunoda Susiripala

Jul 23, 2013, 9:37:50 AM
to meteo...@googlegroups.com
After the oplog integration, we will no longer need Cluster. I'll be releasing the oplog integration next week :)

Gadi Cohen

Jul 23, 2013, 9:52:12 AM
to meteo...@googlegroups.com
*drool*

Doubly awesome!  Can't wait for cluster support via the oplog.

Either way, use of smart collections will be the next update I make to Meteorpedia :)

Thanks again for yet another significant contribution to the Meteor community.

Gadi

Arunoda Susiripala

Jul 23, 2013, 10:27:18 AM
to meteo...@googlegroups.com
Awesome. Ping me back for any issues. 

Owen Rees-Hayward

Jul 30, 2013, 7:59:04 AM
to meteo...@googlegroups.com
Hey Arunoda,

I'm hoping to try out Smart Collections on our production site, https://datamooch.com, in the next few days; it's only an early alpha release so there aren't too many users to upset if it goes pear-shaped ; )

I'll let you know how we get on.

Cheers, Owen

Arunoda Susiripala

Jul 30, 2013, 8:57:58 AM
to meteo...@googlegroups.com
Awesome. That's really nice. Let me know how it goes.




David Glasser

Aug 3, 2013, 4:04:05 AM
to meteo...@googlegroups.com
Short answer is --- we've been pretty heads down working on the current release and some other exciting things. Our opinion (or mine, at least) has always been that we're proud of the APIs we've designed around data access, but not necessarily of the current implementation. It's really validating to see other people able to take our APIs and make them perform better. I haven't had time to look in detail at Arunoda's work yet but am planning to do so as soon as I get back from vacation. From a cursory glance it looks like a combination of things we've been planning to do but haven't had the chance to yet, along with some other interesting ideas.  It's great to hear that the package on Atmosphere is making developers' lives easier today, and hopefully core will benefit either from direct use of Arunoda's code or at least from the lessons he's learned!

--dave

Arunoda Susiripala

Aug 3, 2013, 4:39:27 AM
to meteo...@googlegroups.com
Hi Dave,

Let's have a chat about this later, after the 0.6.5 release :)

David Glasser

Aug 14, 2013, 2:31:25 PM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Hey Meteor folks! I've had a busy month since Arunoda first posted Smart
Collections (finishing up linker/0.6.5! vacation!) but I finally found the time
to review his package. I've been planning to put some time into similar work for
a while now but there's so much to do here at Meteor HQ, so it's really exciting
when members of the community do it for you :)

Arunoda's package is a pleasure to read, and looking at the commit history and
tests it's clear he's made it evolve quickly. And though I haven't run it
against our benchmarks the results he and others are reporting do speak for
themselves.

While performance is certainly important, the main focus so far in this
"preview mode" of Meteor has been on designing great APIs that are easy to use
and *can* have implementations with great performance, and not necessarily on
getting optimal performance in the initial implementations. We've
always known that the relatively naive "re-run queries in full when we suspect
they've changed, and diff the results" algorithm is not the final implementation
of live cursors, and that it doesn't always perform well.  So it's super
validating to see Arunoda taking (some of) our Mongo observeChanges APIs and
getting notably better performance.
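
For readers who have not seen it, the "re-run and diff" approach described above can be sketched roughly as follows. This is an illustration only, not Meteor's actual code: `pollAndDiff`, `runQuery`, and the callback names are made up for the example, and the JSON-based comparison is a crude stand-in for a real recursive diff.

```javascript
// Naive "poll and diff": re-run the whole query, then compare every
// document against the previous snapshot. `runQuery` is a hypothetical
// stand-in for a database call that returns an array of documents.
function pollAndDiff(previousById, runQuery, callbacks) {
  const currentById = new Map();
  for (const doc of runQuery()) {
    currentById.set(doc._id, doc);
  }

  for (const [id, doc] of currentById) {
    if (!previousById.has(id)) {
      callbacks.added(id, doc);
    } else if (JSON.stringify(previousById.get(id)) !== JSON.stringify(doc)) {
      // Crude deep comparison, good enough for a sketch.
      callbacks.changed(id, doc);
    }
  }
  for (const id of previousById.keys()) {
    if (!currentById.has(id)) {
      callbacks.removed(id);
    }
  }
  return currentById; // becomes `previousById` for the next poll
}
```

The cost is visible in the sketch: every poll fetches and compares every document, even when only one of them changed.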

My summary of the ideas behind Arunoda's implementation is:

To do an unordered observeChanges call (which is what drives "return a cursor
from a publish function", which is the most important server-side use of
observeChanges and the biggest bottleneck for most Meteor apps), you don't
actually need to cache the entire contents of the cursor, and you don't need to
execute a full diff of every object. Instead, you just need to cache the set of
documents (by ID), and be connected enough to write operations that we have an
idea of which documents change.
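
A minimal sketch of the "cache only the IDs" idea, assuming some other component tells us which documents a write touched. The class name `IdSetObserver` and its methods are hypothetical and are not taken from Smart Collections:

```javascript
// Hypothetical tracker for one unordered cursor: it remembers only the IDs
// of the documents currently in the result set, not their contents.
class IdSetObserver {
  constructor(callbacks) {
    this.ids = new Set();
    this.callbacks = callbacks;
  }

  // Called when a write is known to have put `doc` into the result set.
  noteAdded(doc) {
    if (this.ids.has(doc._id)) return;
    this.ids.add(doc._id);
    this.callbacks.added(doc._id, doc);
  }

  // Called when a write is known to have changed `fields` on document `id`.
  noteChanged(id, fields) {
    if (this.ids.has(id)) this.callbacks.changed(id, fields);
  }

  // Called when a write is known to have removed document `id`.
  noteRemoved(id) {
    if (this.ids.delete(id)) this.callbacks.removed(id);
  }
}
```

Memory use is proportional to the number of IDs rather than to the size of the documents, and no diffing happens at all.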

What do I mean by "be connected enough"? Arunoda's package implements two
separate strategies:

 (A) If you can configure it to connect to the mongo oplog (not possible in
     every mongo hosting environment!) then you get a direct feed of every
     insert, update, and remove operation in the entire database. Each operation
     is super specific: it tells you exactly which document changed (by ID), and
     for updates, it simply says "$set/$unset these fields" (no complex
     modifiers like $inc or $addToSet).

 (B) If you can't configure the oplog, it uses a similar strategy to current
     Meteor where it notices write operations that originate inside the process
     itself. These write operations can have arbitrary selectors and modifiers
     and fully understanding them does require using minimongo or the like.
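
To illustrate strategy (A), here is a rough sketch of turning one oplog entry into a change description. The entry shape used here (`op` of 'i'/'u'/'d', `ns` as "db.collection", `o` for the document or modifier, `o2` for the update target) is an assumption for the example, and `describeOplogEntry` is a made-up name; check the format of your own MongoDB version before relying on it.

```javascript
// Translate one oplog-style entry into a simple change description.
// The entry shape (op / ns / o / o2) is an assumption for this sketch.
function describeOplogEntry(entry) {
  const collection = entry.ns.slice(entry.ns.indexOf('.') + 1);
  switch (entry.op) {
    case 'i':
      return { type: 'insert', collection, id: entry.o._id, doc: entry.o };
    case 'u':
      return {
        type: 'update',
        collection,
        id: entry.o2._id,
        set: entry.o.$set || {},
        unset: Object.keys(entry.o.$unset || {}),
      };
    case 'd':
      return { type: 'remove', collection, id: entry.o._id };
    default:
      return null; // commands, no-ops, etc. are ignored here
  }
}
```

Because every entry names a single document by ID, no selector or modifier evaluation is needed to know what changed.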

When it notices an insert, it evaluates the selectors on all cursors, and for
those that match, it adds the document to their set. This doesn't require any
database reads, but it does require the selector logic to correctly match Mongo.
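
As a sketch of the insert case, assuming a deliberately tiny matcher that understands only top-level equality (real Mongo selectors are far richer, which is exactly the risk mentioned above). `matchesEquality`, `handleInsert`, and the cursor shape are hypothetical:

```javascript
// Toy matcher: supports only top-level equality selectors such as
// { room: 'lobby' }. Real Mongo selectors are far richer.
function matchesEquality(selector, doc) {
  return Object.keys(selector).every((key) => doc[key] === selector[key]);
}

// `cursors` is a hypothetical list of { selector, ids: Set, added: fn }.
function handleInsert(cursors, doc) {
  for (const cursor of cursors) {
    if (matchesEquality(cursor.selector, doc)) {
      cursor.ids.add(doc._id);
      cursor.added(doc._id, doc);
    }
  }
}
```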

When it notices a single-document-by-ID remove, it just removes the document from
every cursor that contains it (since cursors do track their set of IDs) --- easy!

When it notices a by-selector remove, it re-polls every cursor (but only asking
it to return IDs). This is similar to what current Meteor does, but at least
comparing a list of IDs is faster than doing a full recursive diff. (This does
NOT occur when using oplog, though!)
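
The two remove cases can be sketched like this, again with hypothetical names: `fetchIds` stands in for an ID-only re-poll of a cursor, and each cursor is assumed to carry an `ids` Set and a `removed` callback.

```javascript
// Remove by ID: no database read needed.
function handleRemoveById(cursors, id) {
  for (const cursor of cursors) {
    if (cursor.ids.delete(id)) cursor.removed(id);
  }
}

// Remove by selector: re-poll IDs only and diff the two ID sets.
// `fetchIds` is a hypothetical function returning the cursor's current IDs.
function handleRemoveBySelector(cursors, fetchIds) {
  for (const cursor of cursors) {
    const current = new Set(fetchIds(cursor));
    for (const id of [...cursor.ids]) {
      if (!current.has(id)) {
        cursor.ids.delete(id);
        cursor.removed(id);
      }
    }
  }
}
```

The by-selector path still costs one query per cursor, but the comparison is a set difference over IDs rather than a recursive diff of documents.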

When it notices a single-document-by-ID update, it does a single-document read
of the changed document, looks at all the fields mentioned in the modifier, and
emits a changed callback listing those fields. (It's possible that some of those
fields won't have actually changed, though! eg, if you are {$set}ing something
to the value it already has. In the common case of "immediately hooked up to DDP
publication", a different caching layer will suppress the extra message, but this
does technically break the observeChanges API.)
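
A rough sketch of the "only look at fields mentioned in the modifier" idea. It handles just `$set` and `$unset`, reports at the granularity of top-level fields, and uses made-up names (`fieldsTouchedBy`, `changedFields`); it is not how Smart Collections compiles modifiers.

```javascript
// Collect the top-level field names touched by a $set/$unset modifier.
function fieldsTouchedBy(modifier) {
  const touched = new Set();
  for (const op of ['$set', '$unset']) {
    for (const path of Object.keys(modifier[op] || {})) {
      touched.add(path.split('.')[0]); // 'profile.name' touches 'profile'
    }
  }
  return [...touched];
}

// Build the payload for a `changed` callback from a freshly read document.
// Fields that were unset come back as undefined in this sketch.
function changedFields(modifier, freshDoc) {
  const fields = {};
  for (const name of fieldsTouchedBy(modifier)) {
    fields[name] = freshDoc[name];
  }
  return fields;
}
```

Note that this reports a field even when the write set it to the value it already had, because without a cached copy of the document there is nothing to compare against.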

When it notices a by-selector update.... well, unfortunately the current
behavior is broken (see
first comment is from me). Arunoda plans to change this to poll all cursors,
which certainly is pretty similar to the current Meteor implementation (though
his clever "just look at the fields mentioned in the modifier instead of doing a
full diff" idea will still help). The good news is that this case doesn't occur
when you have an oplog anyway.

The package doesn't implement a bunch of things from the collection API; some of
these I'm sure he could implement quite easily, whereas others I suspect are
fundamentally incompatible with this approach:

  - fields filtering (really important for security)! --- not too hard to
    implement. (in related news, Slava started implementing this for minimongo
    recently!)

  - ordered observeChanges, skip/limit with sort, etc: I don't really see how
    these could be implemented without caching more information about the
    documents, or doing a full re-poll like we do now. (But we could be clever
    and only cache the data that is relevant for the sort.)

  - latency compensation (the write fence): Meteor methods (such as the
    auto-generated insert/update/remove collection methods) have two different
    "done" messages. One is the "result" message which contains the method's
    return value or error, and is delivered as soon as the method body returns
    or throws. The other is the "updated" message, which specifies that any
    writes done by the method have been reflected in data messages
    (added/changed/removed/etc) sent from server to client. The latter message is
    what links together the two components of DDP (methods and data). If any
    collection write happens in a method body, then by use of an object called
    the Write Fence, we ensure that the "updated" message does not get sent until
    all possibly-affected cursors have been polled one time. The client uses
    this message to prevent flicker (ie, latency compensation): essentially, any
    documents that are modified by the client-side stub are "frozen" until the
    method's "updated" message shows up, at which point we should have seen the
    final value of the documents.

    Arunoda hasn't implemented use of the write fence, which means that it's
    very possible to see a flicker back to the original value after running a
    method, before the new value comes down the wire. This can be added though
    (with a slightly trickier implementation than for the polling algorithm).
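
To make the write fence idea more concrete, here is a simplified sketch. It is not Meteor's implementation: the class below and its method names (`beginWrite`, `arm`) are assumptions chosen for the example. The fence counts writes that have not yet been reflected in data messages and fires once the method body is done and the count reaches zero.

```javascript
// Hypothetical write fence: counts outstanding writes and fires a callback
// once every one of them has been reflected in data messages.
class WriteFence {
  constructor(onAllCommitted) {
    this.outstanding = 0;
    this.armed = false;
    this.fired = false;
    this.onAllCommitted = onAllCommitted;
  }

  // Called when a write starts; returns a function to call once the
  // affected cursors have published the resulting changes.
  beginWrite() {
    this.outstanding += 1;
    let committed = false;
    return () => {
      if (committed) return;
      committed = true;
      this.outstanding -= 1;
      this.maybeFire();
    };
  }

  // Called after the method body has returned: no more writes will begin.
  arm() {
    this.armed = true;
    this.maybeFire();
  }

  maybeFire() {
    if (this.armed && !this.fired && this.outstanding === 0) {
      this.fired = true;
      this.onAllCommitted();
    }
  }
}
```

In use, the server would send "result" when the method body returns, call `arm()`, and send "updated" from the `onAllCommitted` callback, so the client never sees "updated" before the data messages it depends on.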

So how do we get something like this into Meteor core? The great thing about
building a non-core module like smart collections is that you *don't* have to
implement every detail of the API, at least at first. But that's not an option
for core.

Because some parts of the observeChanges API (esp the ordered parts) probably
can't be easily supported with this approach, I think we do need to leave some
version of the current approach in Meteor core. Additionally, my concern about
"update via selector" makes me mostly interested in using this approach when the
oplog is available, not without oplog (because it does seem like if you have to
process arbitrary update commands, we have to fall back on polling
anyway). Plus, I'm not sure that "doesn't see database changes from outside the
single server process" is tenable for core. Not every Meteor deploy is going to
have oplog access, so we can't assume that.

Additionally, Mongo is a pretty complex beast. Minimongo is a good start at an
implementation of selector logic, but it's definitely imperfect. So far in
Meteor, it's mainly used on the client, which isn't a security-critical place:
it doesn't affect what data gets published over the network. Putting Minimongo's
evaluators into the critical path determining what data gets sent to the client
is scary! Now, that's kind of a good kind of scary --- the sort of fear that
will make us try hard to have a great implementation, and this is something we
always knew we'd have to do. But I'd like to get there incrementally. And so I'd
like to start only doing this for "simple" selectors without some of the more
complex $operators where we are more likely to disagree with Mongo's
implementation.

So I definitely do want to keep around the current implementation as a fallback
strategy when oplog isn't available, or (for now) for complex selectors, or for
ordered observeChanges, skip/limit/sort, etc. Which means that directly merging
in Arunoda's package isn't an option. But taking inspiration from it certainly
is! This week, I'm going to start work on the oplog branch, adding oplog-driven
observeChanges for a subset of cursors where I'm confident we'll get the logic
right. Let's see how this goes!
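
The fallback rules described above could be expressed as a small decision function. This is only a sketch under assumptions: the `cursorDescription` shape and the function names are hypothetical, and it is stricter than the text, treating any `$operator` at all as "complex".

```javascript
// True if any key at any depth starts with '$' (an operator).
function usesOperators(selector) {
  if (selector === null || typeof selector !== 'object') return false;
  if (Array.isArray(selector)) return selector.some(usesOperators);
  return Object.keys(selector).some(
    (key) => key.startsWith('$') || usesOperators(selector[key])
  );
}

// `cursorDescription` fields (selector, options, ordered) are hypothetical.
function chooseObserveStrategy(cursorDescription, oplogAvailable) {
  const options = cursorDescription.options || {};
  if (!oplogAvailable) return 'poll';
  if (cursorDescription.ordered) return 'poll';
  if (options.sort || options.skip || options.limit) return 'poll';
  if (usesOperators(cursorDescription.selector)) return 'poll';
  return 'oplog';
}
```

Starting with a conservative rule like this means the oplog path only ever handles selectors where local matching is very unlikely to disagree with Mongo; operators can be whitelisted later one at a time.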

--dave



ps:

some other observations about Arunoda's package:

 - It defines a Meteor.deepEqual. These days in Meteor we want all data to be
   EJSON and then you can just use EJSON.equals.

 - Factoring out the implementation of allow/deny into its own class is a great
   idea.

 - Regexp in oplog tailing needs a literal '\\.' after the DB name

 - When you pass a callback to non-Meteor code, you can't just wrap it in a
   Fiber --- you need to use Meteor.bindEnvironment (or better yet,
   Meteor._wrapAsync). This guarantees that when the callback gets called, it
   still knows which method it's inside --- which includes "what the userId is",
   so forgetting this leads to security bugs. (Yes, I realize we have not really
   documented this. _wrapAsync is new and we are waiting to make sure it's
   exactly the right API, at which point we will remove the _ and document it.)

 - The package combines the idea of the "cursor" and the "observe handle" in a
   way that doesn't match the actual API. You should be able to call
   observeChanges multiple times on the same cursor (with different callbacks,
   say) and stop them independently.

 - The package doesn't really do de-duping in the same way that the current
   implementation does (though admittedly de-duping in the current
   implementation is a little complex). There is some de-duping for the
   remove-by-selector polling.
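
On the regexp point above: an unescaped `.` after the database name matches any character, so a pattern built for database "app" would also match namespaces in "app2". A small sketch of building the pattern safely (the helper names are made up for this example):

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Matches namespaces of the form "<dbName>.<collection>".
function namespaceRegExp(dbName) {
  return new RegExp('^' + escapeRegExp(dbName) + '\\.');
}
```

Escaping the database name itself is an extra precaution for names that contain regexp metacharacters.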

Arunoda Susiripala

Aug 14, 2013, 3:27:03 PM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Thanks Dave for the complete audit,

Actually, a single-document-by-ID update does not break observeChanges. SC compiles the update modifier and figures out which fields need to be changed.

I didn't get the field filtering problem. SC does support all of the Node.js mongodb find() options, so fields are there.

I should definitely use EJSON.equals.

Meteor._wrapAsync is new to me. Definitely I'll add it.

I think I can work out support for calling observeChanges() multiple times on a cursor.

I need to work more on latency compensation and the write fence (I hope someone can help me out too).

I just added polling support for update by selector, so SC #12 should be fixed.

Adding support for skip and limit is tricky. I have started on limit. After I complete that, I can work on skip. But we should really avoid skip :) It does a table scan.

I agree with Dave. SC cannot be directly merged into core, since it is missing some features. But I'll keep working on SC, since a few of our apps depend on it. We really enjoy the efficiency SC provides.

BTW: We could use emscripten to port the original Mongo selector implementation, or use a Node.js C++ addon. That way we could use the existing Mongo algorithm as it is. Just a thought.

I hope to see a much improved collection implementation in Meteor core. I'd be happy to help make it a success.

You received this message because you are subscribed to the Google Groups "meteor-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to meteor-core...@googlegroups.com.
To post to this group, send email to meteo...@googlegroups.com.
Visit this group at http://groups.google.com/group/meteor-core.

For more options, visit https://groups.google.com/groups/opt_out.
 
 

James Wilson

Aug 14, 2013, 3:37:52 PM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Thanks to everyone involved for your hard work!

I'd be most interested in seeing Smart Collections take on a Collection2-esque schema, also with support for unique records.



James Wilson

Rafiki Cai

Aug 14, 2013, 3:42:48 PM
to meteo...@googlegroups.com, owe...@googlemail.com
Blessings and Peace:

Owen, I've taken a look at Datamooch and find it a great concept.

I have a design suggestion, if I may. I urge you to re-arrange your 
layout, so that there are some questions "above the fold".  Presently,
one has to scroll down a bit before getting pulled into the core element.

A single column of questions could run down the left hand side of your 
page.  Or at least the first row of questions could be pulled up above
the fold.  Or boldly placed at the very top of the page.

Just my two bars of platinum on the matter.

In Service of THE ONENESS,
Rafiki Cai


Arunoda Susiripala

Aug 14, 2013, 3:42:42 PM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Eric has started on SC support in Collection2. I think he is on vacation these days. We might see it in the upcoming weeks :)

David Glasser

Aug 14, 2013, 3:57:19 PM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Re field filtering: maybe I'm missing something, but let's say you're observing some query that matches doc X with fields limited to just the "publishme" field.  And then something comes across the oplog which sets the "dontpublish" field on doc X. I don't see anything in your code (but maybe I'm missing something?) that makes it realize it isn't supposed to issue a changed message with that field...
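
A sketch of the missing check, assuming an inclusion-style fields specifier such as { publishme: 1 } and top-level fields only. The function names and the message shape are hypothetical and not taken from either code base:

```javascript
// Apply an inclusion-style fields specifier such as { publishme: 1 } to the
// fields of a pending `changed` message. Only top-level inclusion is handled.
function filterChangedFields(fieldsSpec, changed) {
  const filtered = {};
  for (const name of Object.keys(changed)) {
    if (fieldsSpec[name]) filtered[name] = changed[name];
  }
  return filtered;
}

// Returns the message to send, or null when nothing visible changed.
function buildChangedMessage(id, fieldsSpec, changed) {
  const fields = filterChangedFields(fieldsSpec, changed);
  return Object.keys(fields).length > 0 ? { msg: 'changed', id, fields } : null;
}
```

With this in place, an oplog update that only sets "dontpublish" on doc X yields no message at all for a cursor limited to "publishme".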

David Glasser

Aug 14, 2013, 3:58:13 PM
to meteo...@googlegroups.com, meteo...@googlegroups.com
And I agree about single-document-by-ID updates, I thought I said that in my message :)

Owen Rees-Hayward

Aug 15, 2013, 3:55:30 AM
to meteo...@googlegroups.com, owe...@googlemail.com
Hey Rafiki,

Thanks for the suggestion, it's a good one. It's very early days for Datamooch. It definitely has more than a few rough edges to smooth over! It will be interesting to see if it gets any traction.

Thanks for the platinum!

Tim Heckel

Aug 15, 2013, 11:46:32 AM
to meteo...@googlegroups.com, meteo...@googlegroups.com
David, thanks for taking the time to respond so fully. I really appreciate this answer.

Richard

Aug 17, 2013, 4:41:00 AM
to meteo...@googlegroups.com, meteo...@googlegroups.com
David, I also want to thank you for your thorough and very helpful response to this subject.

Arunoda has been rapidly adding custom, production (large-scale) support for Meteor and, not surprisingly, the community has been adopting much of his work in this context.

Much credit to Meteor's core team and to Arunoda for all the recent efforts that are taking Meteor from just a sexy prototyping tool to a competitive, world-class, cutting-edge platform for developing modern web applications. 

It will be insightful to get Arunoda's response to David's message.

Gabriel Pugliese

Aug 17, 2013, 10:43:00 AM
to meteo...@googlegroups.com, meteo...@googlegroups.com
It's not only Arunoda's package but also others, like Mesosphere and collection-hooks, that make me continue doing my projects with Meteor :)


Gabriel Pugliese
CodersTV.com
@gabrielsapo



Arunoda Susiripala

Aug 17, 2013, 10:48:19 AM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Ground DB and Collection2 are also great projects.



Mitar

Aug 31, 2013, 3:21:39 AM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Hi!

On Wed, Aug 14, 2013 at 11:31 AM, David Glasser <gla...@meteor.com> wrote:
   (Yes, I realize we have not really
   documented this. _wrapAsync is new and we are waiting to make sure it's
   exactly the right API, at which point we will remove the _ and document it.)

And "f._blocking = true;" is useful for testing. :-)


Mitar

--
http://mitar.tnode.com/
https://twitter.com/mitar_m