Hey Meteor folks! I've had a busy month since Arunoda first posted Smart
Collections (finishing up linker/0.6.5! vacation!) but I finally found the time
to review his package. I've been planning to put some time into similar work for
a while now, but there's so much to do here at Meteor HQ, so it's really
exciting when community members outside the company do it for you :)
Arunoda's package is a pleasure to read, and looking at the commit history and
tests it's clear he's made it evolve quickly. And though I haven't run it
against our benchmarks the results he and others are reporting do speak for
themselves.
While performance is certainly important, the main strategy so far in this
"preview" phase of Meteor has been to design great APIs that are easy to use and
*can* have implementations with great performance, not necessarily to squeeze
the most optimal performance out of the initial implementations. We've
always known that the relatively naive "re-run queries in full when we suspect
they've changed, and diff the results" algorithm is not the final implementation
of live cursors, and that it doesn't always perform well. So it's super
validating to see Arunoda taking (some of) our Mongo observeChanges APIs and
getting notably better performance.
My summary of the ideas behind Arunoda's implementation is:
To do an unordered observeChanges call (which is what drives "return a cursor
from a publish function", which is the most important server-side use of
observeChanges and the biggest bottleneck for most Meteor apps), you don't
actually need to cache the entire contents of the cursor, and you don't need to
execute a full diff of every object. Instead, you just need to cache the set of
documents (by ID), and be connected enough to write operations that we have an
idea of which documents change.
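To make that concrete, here's a toy sketch in plain JavaScript (my own
illustration, not Smart Collections' actual code): an unordered observer only
needs a set of matching _ids, plus the three observeChanges callbacks.

```javascript
// Minimal sketch: an unordered observeChanges driver caches only the
// *set* of matching document IDs, never the full documents.
class IdSetObserver {
  constructor(callbacks) {
    this.ids = new Set();        // cached set of matching _ids
    this.callbacks = callbacks;  // { added, changed, removed }
  }
  // A document now matches the cursor's selector.
  noteMatch(id, fields) {
    if (!this.ids.has(id)) {
      this.ids.add(id);
      this.callbacks.added(id, fields);
    }
  }
  // A document stopped matching (or was removed).
  noteUnmatch(id) {
    if (this.ids.delete(id)) this.callbacks.removed(id);
  }
  // A matching document's fields changed.
  noteChange(id, changedFields) {
    if (this.ids.has(id)) this.callbacks.changed(id, changedFields);
  }
}
```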
What do I mean by "be connected enough"? Arunoda's package implements two
separate strategies:
(A) If you can configure it to connect to the mongo oplog (not possible in
every mongo hosting environment!) then you get a direct feed of every
insert, update, and remove operation in the entire database. Each operation
is super specific: it tells you exactly which document changed (by ID), and
for updates, it simply says "$set/$unset these fields" (no complex
modifiers like $inc or $addToSet).
(B) If you can't configure the oplog, it uses a similar strategy to current
Meteor where it notices write operations that originate inside the process
itself. These write operations can have arbitrary selectors and modifiers
and fully understanding them does require using minimongo or the like.
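For strategy (A), the oplog entries themselves are very regular. The field
names below (op/ns/o/o2) are the real fields of entries in Mongo's
local.oplog.rs; the dispatcher around them is just my sketch of how a tailing
consumer might fan them out:

```javascript
// Sketch of dispatching raw oplog entries. Shapes follow MongoDB's
// local.oplog.rs format: 'i' = insert, 'u' = update, 'd' = delete.
function dispatchOplogEntry(entry, handlers) {
  switch (entry.op) {
    case 'i':  // insert: entry.o is the full new document
      return handlers.insert(entry.ns, entry.o);
    case 'u':  // update: entry.o2 holds {_id}, entry.o is $set/$unset
      return handlers.update(entry.ns, entry.o2._id, entry.o);
    case 'd':  // delete: entry.o holds {_id}
      return handlers.remove(entry.ns, entry.o._id);
    default:   // 'n' (no-op), 'c' (command): not data changes
      return null;
  }
}
```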
When it notices an insert, it evaluates the selectors on all cursors, and for
those that match, it adds the document to their set. This doesn't require any
database reads, but it does require the selector logic to correctly match Mongo.
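In sketch form (the `matches` below handles only top-level equality selectors;
the real implementation needs full minimongo selector semantics, which is
exactly the hard part):

```javascript
// Hedged sketch: handling a noticed insert with zero database reads.
// Only top-level equality selectors are supported here.
function matches(selector, doc) {
  return Object.keys(selector).every((k) => doc[k] === selector[k]);
}

function handleInsert(cursors, doc) {
  for (const c of cursors) {       // c: { selector, ids, added }
    if (matches(c.selector, doc) && !c.ids.has(doc._id)) {
      c.ids.add(doc._id);
      c.added(doc._id, doc);       // no DB round-trip needed
    }
  }
}
```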
When it notices a single-document-by-ID remove, it just removes the document
from every cursor that contains it (since cursors do track their set of IDs)
--- easy!
When it notices a by-selector remove, it re-polls every cursor (but only asking
it to return IDs). This is similar to what current Meteor does, but at least
comparing a list of IDs is faster than doing a full recursive diff. (This does
NOT occur when using oplog, though!)
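The ID-only re-poll boils down to a cheap set diff, something like this sketch
(names are mine):

```javascript
// Sketch of the cheap re-poll after a by-selector remove: fetch only
// the IDs that still match, diff against the cached ID set, and emit
// `removed` for anything that disappeared. No recursive document diff.
function repollIds(cachedIds, freshIds, removedCb) {
  const fresh = new Set(freshIds);
  for (const id of [...cachedIds]) {
    if (!fresh.has(id)) {
      cachedIds.delete(id);
      removedCb(id);
    }
  }
}
```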
When it notices a single-document-by-ID update, it does a single-document read
of the changed document, looks at all the fields mentioned in the modifier, and
emits a changed callback listing those fields. (It's possible that some of those
fields won't have actually changed, though! eg, if you are {$set}ing something
to the value it already has. In the common case of "immediately hooked up to a
DDP publication", a different caching layer will suppress the extra message,
but this does technically break the observeChanges API.)
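Computing "the fields mentioned in the modifier" looks roughly like this
(my sketch; it reduces dotted paths to their top-level field, since that's the
granularity observeChanges reports):

```javascript
// Sketch: given an update modifier, compute which top-level fields it
// can touch. After the single-document re-read, `changed` is emitted
// listing just these fields (some may not have actually changed).
function fieldsTouchedByModifier(modifier) {
  const fields = new Set();
  for (const [op, args] of Object.entries(modifier)) {
    if (op.startsWith('$')) {
      for (const path of Object.keys(args)) {
        fields.add(path.split('.')[0]);  // '$set: {a.b: 1}' touches 'a'
      }
    } else {
      fields.add(op);  // replacement-style update: plain field name
    }
  }
  return fields;
}
```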
When it notices a by-selector update... well, unfortunately the current
behavior is broken (see the linked issue; the first comment is from me).
Arunoda plans to change this to poll all cursors,
which certainly is pretty similar to the current Meteor implementation (though
his clever "just look at the fields mentioned in the modifier instead of doing a
full diff" idea will still help). The good news is that this case doesn't occur
when you have an oplog anyway.
The package doesn't implement a bunch of things from the collection API; some of
these I'm sure he could implement quite easily, whereas others I suspect are
fundamentally incompatible with this approach:
- fields filtering (really important for security!) --- not too hard to
  implement. (In related news, Slava started implementing this for minimongo
  recently!)
- ordered observeChanges, skip/limit with sort, etc: I don't really see how
these could be implemented without caching more information about the
documents, or doing a full re-poll like we do now. (But we could be clever
and only cache the data that is relevant for the sort.)
- latency compensation (the write fence): Meteor methods (such as the
auto-generated insert/update/remove collection methods) have two different
"done" messages. One is the "result" message which contains the method's
return value or error, and is delivered as soon as the method body returns
or throws. The other is the "updated" message, which specifies that any
writes done by the method have been reflected in data messages
  (added/changed/removed/etc) sent from server to client. The latter message is
  what links together the two components of DDP (methods and data). If any
  collection write happens in a method body, then by use of an object called
  the Write Fence, we ensure that the "updated" message does not get sent until
  all possibly-affected cursors have been polled one time. The client uses
this message to prevent flicker (ie, latency compensation): essentially, any
documents that are modified by the client-side stub are "frozen" until the
method's "updated" message shows up, at which point we should have seen the
final value of the documents.
Arunoda hasn't implemented use of the write fence, which means that it's
very possible to see a flicker back to the original value after running a
method, before the new value comes down the wire. This can be added though
(with a slightly trickier implementation than for the polling algorithm).
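The fence idea itself is simple; here's a toy version (names and API are mine,
not Meteor's internals): each outstanding observer poll "arms" a pending write,
and the "updated" message fires only once the method body has finished and
every write has been reflected.

```javascript
// Toy write fence sketch: fire onAllCommitted only after arm() has
// been called (method body returned) AND every write registered via
// beginWrite() has been retired.
class WriteFence {
  constructor(onAllCommitted) {
    this.outstanding = 0;
    this.armed = false;
    this.onAllCommitted = onAllCommitted;
  }
  beginWrite() {
    this.outstanding++;
    return () => {               // call once the poll has run
      this.outstanding--;
      this._maybeFire();
    };
  }
  arm() {                        // the method body has returned
    this.armed = true;
    this._maybeFire();
  }
  _maybeFire() {
    if (this.armed && this.outstanding === 0) this.onAllCommitted();
  }
}
```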
So how do we get something like this into Meteor core? The great thing about
building a non-core module like smart collections is that you *don't* have to
implement every detail of the API, at least at first. But that's not an option
for core.
Because some parts of the observeChanges API (esp the ordered parts) probably
can't be easily supported with this approach, I think we do need to leave some
version of the current approach in Meteor core. Additionally, my concern about
"update via selector" makes me mostly interested in using this approach when the
oplog is available, not without oplog (because it does seem like if you have to
process arbitrary update commands, we have to fall back on polling
anyway). Plus, I'm not sure that "doesn't see database changes from outside the
single server process" is tenable for core. Not every Meteor deploy is going to
have oplog access, so we can't assume that.
Additionally, Mongo is a pretty complex beast. Minimongo is a good start at an
implementation of selector logic, but it's definitely imperfect. So far in
Meteor, it's mainly used on the client, which isn't a security-critical place:
it doesn't affect what data gets published over the network. Putting Minimongo's
evaluators into the critical path determining what data gets sent to the client
is scary! Now, that's the good kind of scary --- the sort of fear that
will make us try hard to have a great implementation, and this is something we
always knew we'd have to do. But I'd like to get there incrementally. And so I'd
like to start only doing this for "simple" selectors without some of the more
complex $operators where we are more likely to disagree with Mongo's
implementation.
So I definitely do want to keep around the current implementation as a fallback
strategy when oplog isn't available, or (for now) for complex selectors, or for
ordered observeChanges, skip/limit/sort, etc. Which means that directly merging
in Arunoda's package isn't an option. But taking inspiration from it certainly
is! This week, I'm going to start work on the oplog branch, adding oplog-driven
observeChanges for a subset of cursors where I'm confident we'll get the logic
right. Let's see how this goes!
--dave
ps:
some other observations about Arunoda's package:
- It defines a Meteor.deepEqual. These days in Meteor we want all data to be
EJSON and then you can just use EJSON.equals.
- Factoring out the implementation of allow/deny into its own class is a great
idea.
- The regexp in oplog tailing needs a literal '\\.' after the DB name (an
  unescaped '.' matches any character).
- When you pass a callback to non-Meteor code, you can't just wrap it in a
Fiber --- you need to use Meteor.bindEnvironment (or better yet,
  Meteor._wrapAsync). This guarantees that when the callback gets called, it
  still knows which method it's inside --- which includes what the userId is,
  so forgetting this leads to security bugs. (Yes, I realize we have not really
documented this. _wrapAsync is new and we are waiting to make sure it's
exactly the right API, at which point we will remove the _ and document it.)
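  To see why this matters, here's a plain-JS analogy (this is NOT Meteor's
  implementation, which uses fiber-local state; `currentUserId` and
  `bindContext` are stand-ins I made up): capture the dynamic context when the
  callback is created, and restore it whenever the callback later runs.

  ```javascript
  // Analogy for Meteor.bindEnvironment: capture context at wrap time,
  // restore it around each invocation of the callback.
  let currentUserId = null;  // stand-in for fiber-local method state

  function bindContext(fn) {
    const captured = currentUserId;
    return function (...args) {
      const saved = currentUserId;
      currentUserId = captured;    // restore the creating context
      try { return fn.apply(this, args); }
      finally { currentUserId = saved; }
    };
  }
  ```

  A naively-passed callback would instead see whatever context the driver
  happened to be in when it fired --- usually no userId at all.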
- The package combines the idea of the "cursor" and the "observe handle" in a
way that doesn't match the actual API. You should be able to call
observeChanges multiple times on the same cursor (with different callbacks,
say) and stop them independently.
- The package doesn't really do de-duping in the same way that the current
implementation does (though admittedly de-duping in the current
implementation is a little complex). There is some de-duping for the
remove-by-selector polling.