Re: [meteor-talk] Re: Introducing Smart Collections


David Glasser

Aug 14, 2013, 2:31:25 PM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Hey Meteor folks! I've had a busy month since Arunoda first posted Smart
Collections (finishing up linker/0.6.5! vacation!) but I finally found the time
to review his package. I've been planning to put some time into similar work for
a while now but there's so much to do here at Meteor HQ, so it's really exciting
when members of the community do it for you :)

Arunoda's package is a pleasure to read, and looking at the commit history and
tests it's clear he's made it evolve quickly. And though I haven't run it
against our benchmarks the results he and others are reporting do speak for
themselves.

While performance is certainly important, the main focus so far in this
"preview mode" of Meteor has been on designing great APIs that are easy to use
and *can* have implementations with great performance, and not necessarily on
getting optimal performance in the initial implementations. We've
always known that the relatively naive "re-run queries in full when we suspect
they've changed, and diff the results" algorithm is not the final implementation
of live cursors, and that it doesn't always perform well.  So it's super
validating to see Arunoda taking (some of) our Mongo observeChanges APIs and
getting notably better performance.

My summary of the ideas behind Arunoda's implementation is:

To do an unordered observeChanges call (which is what drives "return a cursor
from a publish function", which is the most important server-side use of
observeChanges and the biggest bottleneck for most Meteor apps), you don't
actually need to cache the entire contents of the cursor, and you don't need to
execute a full diff of every object. Instead, you just need to cache the set of
documents (by ID), and be connected enough to write operations that you have an
idea of which documents change.

What do I mean by "be connected enough"? Arunoda's package implements two
separate strategies:

 (A) If you can configure it to connect to the mongo oplog (not possible in
     every mongo hosting environment!) then you get a direct feed of every
     insert, update, and remove operation in the entire database. Each operation
     is super specific: it tells you exactly which document changed (by ID), and
     for updates, it simply says "$set/$unset these fields" (no complex
     modifiers like $inc or $addToSet).

 (B) If you can't configure the oplog, it uses a similar strategy to current
     Meteor where it notices write operations that originate inside the process
     itself. These write operations can have arbitrary selectors and modifiers
     and fully understanding them does require using minimongo or the like.
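
As a rough illustration of strategy (A), here is a minimal sketch of turning one oplog entry into a description of what changed. It assumes the usual oplog entry shape (an `op` of 'i', 'u', or 'd', a `ns` namespace string, the document or modifier in `o`, and for updates the target `_id` in `o2`). `describeOplogEntry` is a hypothetical helper, not code from Smart Collections or Meteor.

```javascript
// Hypothetical helper: summarize one oplog entry. The entry shape used here
// is an assumption based on MongoDB's oplog format.
function describeOplogEntry(entry) {
  const dot = entry.ns.indexOf('.');
  const collection = entry.ns.slice(dot + 1);
  switch (entry.op) {
    case 'i': // insert: `o` is the full new document
      return { type: 'insert', collection, id: entry.o._id, doc: entry.o };
    case 'u': // update: `o2` identifies the document, `o` is the modifier
      return { type: 'update', collection, id: entry.o2._id, modifier: entry.o };
    case 'd': // remove: `o` carries the _id of the removed document
      return { type: 'remove', collection, id: entry.o._id };
    default:
      return null; // no-ops, commands, etc. are ignored in this sketch
  }
}

const described = describeOplogEntry({
  op: 'u',
  ns: 'app.posts',
  o: { $set: { title: 'Hello' } },
  o2: { _id: 'p1' },
});
```

The point of the sketch is that every entry names exactly one document by ID, so no selector evaluation is needed to know which document changed.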

When it notices an insert, it evaluates the selectors on all cursors, and for
those that match, it adds the document to their set. This doesn't require any
database reads, but it does require the selector logic to correctly match Mongo.
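
A minimal sketch of the insert case, under heavy simplifying assumptions: the matcher below only understands top-level equality on primitive values, whereas real Mongo selectors are far richer. `matchesSimpleSelector`, `handleInsert`, and the cursor objects are hypothetical names made up for this example.

```javascript
// Deliberately tiny matcher: top-level equality on primitive values only.
function matchesSimpleSelector(selector, doc) {
  return Object.keys(selector).every((key) => doc[key] === selector[key]);
}

// Each "cursor" here is just a selector plus the set of matching IDs.
// No database read is needed: the inserted document is already in hand.
function handleInsert(cursors, doc) {
  const added = [];
  for (const cursor of cursors) {
    if (matchesSimpleSelector(cursor.selector, doc)) {
      cursor.ids.add(doc._id);
      added.push(cursor.name);
    }
  }
  return added; // names of cursors that should emit an "added" callback
}

const cursors = [
  { name: 'openTasks', selector: { done: false }, ids: new Set() },
  { name: 'aliceTasks', selector: { owner: 'alice' }, ids: new Set() },
];
const hit = handleInsert(cursors, { _id: 't1', owner: 'bob', done: false });
```

If the in-process matcher ever disagrees with Mongo about a selector, a cursor's ID set silently drifts from the database, which is why selector correctness matters so much here.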

When it notices a single-document-by-ID remove, it just removes the document
from every cursor that contains it (since cursors do track their set of IDs) ---
easy!

When it notices a by-selector remove, it re-polls every cursor (but only asking
it to return IDs). This is similar to what current Meteor does, but at least
comparing a list of IDs is faster than doing a full recursive diff. (This does
NOT occur when using oplog, though!)
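
To make the "only compare IDs" idea concrete, here is a small sketch of diffing a cursor's cached ID set against freshly polled IDs. `diffIdSets` is a hypothetical helper, and IDs are assumed to be plain strings.

```javascript
// Compare the cached IDs with freshly polled IDs and report what was added
// or removed. Both inputs may be arrays or Sets of string IDs.
function diffIdSets(cachedIds, polledIds) {
  const cached = new Set(cachedIds);
  const polled = new Set(polledIds);
  const added = [...polled].filter((id) => !cached.has(id));
  const removed = [...cached].filter((id) => !polled.has(id));
  return { added, removed };
}

const result = diffIdSets(['a', 'b', 'c'], ['b', 'c', 'd']);
// result.added is ['d'] and result.removed is ['a']
```

This is linear in the number of IDs and never looks inside the documents, unlike a recursive diff of full results.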

When it notices a single-document-by-ID update, it does a single-document read
of the changed document, looks at all the fields mentioned in the modifier, and
emits a changed callback listing those fields. (It's possible that some of those
fields won't have actually changed, though! eg, if you are {$set}ing something
to the value it already has. In the common case of "immediately hooked up to DDP
publication", a different caching layer will suppress the extra message, but this
does technically break the observeChanges API.)
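
A sketch of the "look only at the fields mentioned in the modifier" idea, assuming the modifier contains just $set and $unset (as oplog updates are described above). `fieldsTouchedBy` and `changedPayload` are hypothetical names, not functions from the package.

```javascript
// Collect the top-level field names touched by a $set/$unset modifier.
// Dotted paths such as "profile.name" are reduced to "profile".
function fieldsTouchedBy(modifier) {
  const fields = new Set();
  for (const op of ['$set', '$unset']) {
    for (const path of Object.keys(modifier[op] || {})) {
      fields.add(path.split('.')[0]);
    }
  }
  return [...fields];
}

// Build the payload for a "changed" callback from the freshly read document.
// Because no previous value is cached, this cannot tell whether a field
// really changed; that is the API wrinkle described above.
function changedPayload(modifier, currentDoc) {
  const payload = {};
  for (const field of fieldsTouchedBy(modifier)) {
    payload[field] = currentDoc[field]; // undefined when the field was unset
  }
  return payload;
}

const payload = changedPayload(
  { $set: { 'profile.name': 'Al', score: 3 }, $unset: { tmp: '' } },
  { _id: 'u1', profile: { name: 'Al' }, score: 3 }
);
```

Reporting an unset field as `undefined` follows the observeChanges convention for removed fields.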

When it notices a by-selector update.... well, unfortunately the current
behavior is broken (see
first comment is from me). Arunoda plans to change this to poll all cursors,
which certainly is pretty similar to the current Meteor implementation (though
his clever "just look at the fields mentioned in the modifier instead of doing a
full diff" idea will still help). The good news is that this case doesn't occur
when you have an oplog anyway.

The package doesn't implement a bunch of things from the collection API; some of
these I'm sure he could implement quite easily, whereas others I suspect are
fundamentally incompatible with this approach:

  - fields filtering (really important for security)! --- not too hard to
    implement. (in related news, Slava started implementing this for minimongo
    recently!)

  - ordered observeChanges, skip/limit with sort, etc: I don't really see how
    these could be implemented without caching more information about the
    documents, or doing a full re-poll like we do now. (But we could be clever
    and only cache the data that is relevant for the sort.)

  - latency compensation (the write fence): Meteor methods (such as the
    auto-generated insert/update/remove collection methods) have two different
    "done" messages. One is the "result" message which contains the method's
    return value or error, and is delivered as soon as the method body returns
    or throws. The other is the "updated" message, which specifies that any
    writes done by the method have been reflected in data messages
    (added/changed/removed/etc) sent from server to client. The latter message is
    what links together the two components of DDP (methods and data). If any
    collection write happens in a method body, then by use of an object called
    the Write Fence, we ensure that the "updated" message does not get sent until
    all possibly-affected cursors have been polled one time. The client uses
    this message to prevent flicker (ie, latency compensation): essentially, any
    documents that are modified by the client-side stub are "frozen" until the
    method's "updated" message shows up, at which point we should have seen the
    final value of the documents.

    Arunoda hasn't implemented use of the write fence, which means that it's
    very possible to see a flicker back to the original value after running a
    method, before the new value comes down the wire. This can be added though
    (with a slightly trickier implementation than for the polling algorithm).
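
The ordering guarantee described in the write fence item can be sketched as a counter of outstanding writes. This is a simplified, hypothetical stand-in (`SimpleWriteFence` is not Meteor's actual class): "result" can go out as soon as the method body returns, while "updated" waits until every write has been observed.

```javascript
// Simplified stand-in for a write fence: count outstanding writes and fire
// the "updated" callback only once all of them have been committed.
class SimpleWriteFence {
  constructor(onAllCommitted) {
    this.outstanding = 0;
    this.armed = false;
    this.fired = false;
    this.onAllCommitted = onAllCommitted;
  }

  // Called when a method body performs a write. Returns a function to call
  // once the affected cursors have caught up with that write.
  beginWrite() {
    this.outstanding += 1;
    let committed = false;
    return () => {
      if (committed) return; // ignore double commits
      committed = true;
      this.outstanding -= 1;
      this.maybeFire();
    };
  }

  // Called when the method body has returned; no more writes will begin.
  arm() {
    this.armed = true;
    this.maybeFire();
  }

  maybeFire() {
    if (this.armed && !this.fired && this.outstanding === 0) {
      this.fired = true;
      this.onAllCommitted();
    }
  }
}

const log = [];
const fence = new SimpleWriteFence(() => log.push('updated'));
const commit = fence.beginWrite(); // a write happens inside the method
log.push('result');                // method body returned
fence.arm();
commit();                          // cursors have observed the write
```

In an oplog-driven design the `commit` call would presumably have to wait until the relevant oplog entry has been processed, rather than until a poll finishes.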

So how do we get something like this into Meteor core? The great thing about
building a non-core module like smart collections is that you *don't* have to
implement every detail of the API, at least at first. But that's not an option
for core.

Because some parts of the observeChanges API (esp the ordered parts) probably
can't be easily supported with this approach, I think we do need to leave some
version of the current approach in Meteor core. Additionally, my concern about
"update via selector" makes me mostly interested in using this approach when the
oplog is available, not without oplog (because it does seem like if you have to
process arbitrary update commands, we have to fall back on polling
anyway). Plus, I'm not sure that "doesn't see database changes from outside the
single server process" is tenable for core. Not every Meteor deploy is going to
have oplog access, so we can't assume that.

Additionally, Mongo is a pretty complex beast. Minimongo is a good start at an
implementation of selector logic, but it's definitely imperfect. So far in
Meteor, it's mainly used on the client, which isn't a security-critical place:
it doesn't affect what data gets published over the network. Putting Minimongo's
evaluators into the critical path determining what data gets sent to the client
is scary! Now, that's kind of a good kind of scary --- the sort of fear that
will make us try hard to have a great implementation, and this is something we
always knew we'd have to do. But I'd like to get there incrementally. And so I'd
like to start only doing this for "simple" selectors without some of the more
complex $operators where we are more likely to disagree with Mongo's
implementation.

So I definitely do want to keep around the current implementation as a fallback
strategy when oplog isn't available, or (for now) for complex selectors, or for
ordered observeChanges, skip/limit/sort, etc. Which means that directly merging
in Arunoda's package isn't an option. But taking inspiration from it certainly
is! This week, I'm going to start work on the oplog branch, adding oplog-driven
observeChanges for a subset of cursors where I'm confident we'll get the logic
right. Let's see how this goes!

--dave



ps:

some other observations about Arunoda's package:

 - It defines a Meteor.deepEqual. These days in Meteor we want all data to be
   EJSON and then you can just use EJSON.equals.

 - Factoring out the implementation of allow/deny into its own class is a great
   idea.

 - Regexp in oplog tailing needs a literal '\\.' after the DB name

 - When you pass a callback to non-Meteor code, you can't just wrap it in a
   Fiber --- you need to use Meteor.bindEnvironment (or better yet,
   Meteor._wrapAsync). This guarantees that when the callback gets called, it
   still knows which method it's inside --- which includes "what the userId is",
   so forgetting this leads to security bugs. (Yes, I realize we have not really
   documented this. _wrapAsync is new and we are waiting to make sure it's
   exactly the right API, at which point we will remove the _ and document it.)

 - The package combines the idea of the "cursor" and the "observe handle" in a
   way that doesn't match the actual API. You should be able to call
   observeChanges multiple times on the same cursor (with different callbacks,
   say) and stop them independently.

 - The package doesn't really do de-duping in the same way that the current
   implementation does (though admittedly de-duping in the current
   implementation is a little complex). There is some de-duping for the
   remove-by-selector polling.
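
On the oplog regexp point above, a small sketch of why the literal dot matters when matching namespaces of the form "db.collection". `escapeRegExp` and `namespaceMatcher` are hypothetical helpers written for this example.

```javascript
// Escape regex metacharacters so a database name is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a matcher for oplog namespaces belonging to one database.
// The '\\.' in the string becomes a literal dot in the pattern; an
// unescaped '.' would match any character.
function namespaceMatcher(dbName) {
  return new RegExp('^' + escapeRegExp(dbName) + '\\.');
}

const matcher = namespaceMatcher('app');
// matcher.test('app.posts') is true; matcher.test('app2.posts') is false
```

Without the literal dot (or with no dot at all), a database named "app" would also pick up operations from a database named "app2".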



On Sat, Aug 3, 2013 at 4:04 AM, David Glasser <gla...@meteor.com> wrote:
Short answer is --- we've been pretty heads down working on the current release and some other exciting things. Our opinion (or mine, at least) has always been that we're proud of the APIs we've designed around data access, but not necessarily of the current implementation. It's really validating to see other people able to take our APIs and make them perform better. I haven't had time to look in detail at Arunoda's work yet but am planning to do so as soon as I get back from vacation. From a cursory glance it looks like a combination of things we've been planning to do but haven't had the chance to yet, along with some other interesting ideas. It's great to hear that the package on Atmosphere is making developers' lives easier today, and hopefully core will benefit either from direct use of Arunoda's code or at least from the lessons he's learned!

--dave


On Mon, Jul 22, 2013 at 11:17 AM, Gabriel Pugliese <gabrielh...@gmail.com> wrote:
At least it would be nice if someone from the Meteor core team commented on it, because, AFAIK, they are already working on mongo improvements.


Gabriel Pugliese
CodersTV.com
@gabrielsapo


On Mon, Jul 22, 2013 at 3:15 PM, <curiou...@gmail.com> wrote:
That is great. Thanks again!


On Monday, July 22, 2013 1:08:20 PM UTC-4, Arunoda Susiripala wrote:
I've done a performance test and you can get ~5x performance for your app and ~20x performance to mongo.
More details will be published tomorrow on MeteorHacks.

Not sure about the integration, but it doesn't need to be integrated; just get it from Atmosphere and use it.


On Mon, Jul 22, 2013 at 10:28 PM, <curiou...@gmail.com> wrote:
Thanks for your contributions to the Meteor community. I am a novice to these technologies and unable to evaluate the different alternatives on a deep level. Are there any disadvantages of using this compared to the standard Collections? If not, will this be integrated into the core Meteor project any time soon?




On Friday, July 19, 2013 4:03:12 PM UTC-4, Arunoda Susiripala wrote:
Thanks. Will do the update.


On Sat, Jul 20, 2013 at 1:31 AM, Josh Cope <jcop...@gmail.com> wrote:
I notice you use 



1.3.12 just got released today if you want to update



1.3.12 2013-07-19
-----------------
- Fixed issue where timeouts sometimes would behave wrongly (Issue #1032)
- Fixed bug with callback third parameter on some commands (Issue #1033)
- Fixed possible issue where killcursor command might leave hanging functions
- Fixed issue where Mongos was not correctly removing dead servers from the pool of eligible servers
- Throw error if dbName or collection name contains null character (at command level and at collection level)
- Updated bson parser to 0.2.1 with security fix and non-promotion of Long values to javascript Numbers (once a long always a long)

--
You received this message because you are subscribed to the Google Groups "meteor-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to meteor-talk...@googlegroups.com.

For more options, visit https://groups.google.com/groups/opt_out.
 
 




--
Arunoda Susiripala



James Wilson

Aug 14, 2013, 3:37:52 PM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Thanks to everyone involved for your hard work!

I'd be most interested in seeing Smart Collections take on a Collection2-esque schema, also with support for unique records.



James Wilson



Arunoda Susiripala

Aug 14, 2013, 3:42:42 PM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Eric has started SC support in Collection2. He is on vacation these days, I think. We might see it in the upcoming weeks :)

David Glasser

Aug 14, 2013, 3:57:19 PM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Re field filtering: maybe I'm missing something, but let's say you're observing some query that matches doc X with fields limited to just the "publishme" field.  And then something comes across the oplog which sets the "dontpublish" field on doc X. I don't see anything in your code (but maybe I'm missing something?) that makes it realize it isn't supposed to issue a changed message with that field...
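
The concern above can be illustrated with a short sketch: changed fields have to pass through the cursor's fields projection before a "changed" message is issued. `applyProjection` is a hypothetical helper; it handles simple top-level inclusion or exclusion projections only, and the `_id` key is ignored.

```javascript
// Filter a set of changed fields through a top-level projection such as
// { publishme: 1 } (inclusion) or { secret: 0 } (exclusion). Mixing the two
// styles is not supported in this sketch.
function applyProjection(projection, changed) {
  const keys = Object.keys(projection).filter((k) => k !== '_id');
  if (keys.length === 0) return { ...changed };
  const including = Boolean(projection[keys[0]]);
  const result = {};
  for (const [field, value] of Object.entries(changed)) {
    const listed = keys.includes(field);
    if (including ? listed : !listed) result[field] = value;
  }
  return result;
}

const visible = applyProjection({ publishme: 1 }, { dontpublish: 'secret' });
// visible has no keys, so no "changed" message should be sent at all
```

When the filtered result is empty, the update should produce no message for that cursor; otherwise unpublished fields leak to clients.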



David Glasser

Aug 14, 2013, 3:58:13 PM
to meteo...@googlegroups.com, meteo...@googlegroups.com
And I agree about single-document-by-ID updates, I thought I said that in my message :)

Arunoda Susiripala

Aug 14, 2013, 4:00:03 PM
to meteo...@googlegroups.com
Ah yes. I need to figure this out. It shouldn't be as hard as limit :)

Arunoda Susiripala

Aug 14, 2013, 4:02:33 PM
to meteo...@googlegroups.com
You did an awesome job auditing SC. Thank you. I learned a lot of new things :)

Arunoda Susiripala

Aug 14, 2013, 3:27:03 PM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Thanks, Dave, for the complete audit.

Actually, a single-document-by-ID update does not break observeChanges. SC compiles the update modifier and figures out which fields need to be changed.

I didn't get the field filtering problem. SC supports all of the Node.js MongoDB driver's find() options, so fields are there.

I should definitely use EJSON.equals.

Meteor._wrapAsync is new to me. I'll definitely add it.

I think I can work out support for calling observeChanges() multiple times on a cursor.

I need to work more on latency compensation and the write fence. (I hope someone can help me out too.)

I just added polling support for update by selector, so SC #12 should be fixed.

Adding support for skip and limit is tricky. I have started on limit; after I complete that, I can work on skip. But we should really avoid skip :) It does a table scan.

I agree with Dave. SC cannot be directly merged into core, since it is missing some features. But I will keep working on SC, since a few of our apps depend on it. We really enjoy the efficiency SC provides.

BTW: we could use emscripten to port the original Mongo selector code, or use a Node.js C++ addon. With this we could use the existing Mongo algorithm as it is. Just a thought.

I hope to see a much improved collection implementation in Meteor core. I'd be happy to help make it a success.

You received this message because you are subscribed to the Google Groups "meteor-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to meteor-core...@googlegroups.com.
To post to this group, send email to meteo...@googlegroups.com.
Visit this group at http://groups.google.com/group/meteor-core.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Tim Heckel

Aug 15, 2013, 11:46:32 AM
to meteo...@googlegroups.com, meteo...@googlegroups.com
David, thanks for taking the time to respond so fully. I really appreciate this answer.

Richard

Aug 17, 2013, 4:41:00 AM
to meteo...@googlegroups.com, meteo...@googlegroups.com
David, I also want to thank you for your thorough and very helpful response to this subject.

Arunoda has been rapidly adding custom, production (large-scale) support for Meteor and, not surprisingly, the community has been adopting much of his work in this context.

Much credit to Meteor's core team and to Arunoda for all the recent efforts that are taking Meteor from just a sexy prototyping tool to a competitive, world-class, cutting-edge platform for developing modern web applications. 

It will be insightful to get Arunoda's response to David's message.

Gabriel Pugliese

Aug 17, 2013, 10:43:00 AM
to meteo...@googlegroups.com, meteo...@googlegroups.com
It's not only Arunoda's package but also others like Mesosphere and collection-hooks that make me continue doing my projects with Meteor :)


Gabriel Pugliese
CodersTV.com
@gabrielsapo



Arunoda Susiripala

Aug 17, 2013, 10:48:19 AM
to meteo...@googlegroups.com, meteo...@googlegroups.com
Ground DB and Collection2 are also great projects.



Richard

Aug 17, 2013, 3:45:04 PM
to meteo...@googlegroups.com
Arunoda,
Will you post a follow-up to David's message? I think it will be insightful, if you do.

Arunoda Susiripala

Aug 17, 2013, 9:38:20 PM
to meteo...@googlegroups.com
Hi Richard, I think I did. Can't you see the post after his post :)

And I started working on his suggestions. See https://github.com/arunoda/meteor-smart-collections/commits/master

I'll write a blog post after all (or most) of the suggestions have been implemented, hopefully in 1-2 weeks.



Richard

Aug 17, 2013, 9:48:09 PM
to meteo...@googlegroups.com
Arunoda, I had to go back and search carefully to see your response to David and I found it. Thanks for that and thanks for continuing to improve Smart Collections. 

I will definitely use Smart Collections in my Meteor production project once it is solid and you have implemented many of Dave's suggestions, especially since you expect to finish most of the work within 2 weeks.

Arunoda Susiripala

Aug 17, 2013, 9:57:18 PM
to meteo...@googlegroups.com
Awesome :)