Can a single query observe two different database states?

29 views
Skip to first unread message

JB

unread,
Apr 16, 2017, 12:01:09 PM4/16/17
to mongodb-dev
Hello everyone,

I have a question concerning MongoDB queries.
Say we have a query and two inserts. Initially the database is in state 0. After insert#1 it is in state 1 and after insert#2 it is in state 2. Insert#1 is executed on the server before insert#2.

Depending on the scheduling of the operations it is now of course possible that the query observes state 0,1 or 2. But is it also possible that the query observes insert#2 but not insert#1 even though both documents match the query-selector? (That would be a weird combination of state 0 and state 2)

Thank you in advance!

Cheers,
JB

Andy Schwerin

unread,
Apr 16, 2017, 12:36:29 PM4/16/17
to mongodb-dev
Yeah, it's possible in some circumstances in sharded clusters with the default (local) read concern, or even with the "majority" read concern. Most often, it would happen as the result of a shard primary failure, or an ill-timed chunk migration. It should not be possible with "linearizable" read concern. We're doing some development to support lighter weight causal consistency in 3.6, which would eliminate this behavior for most of the use cases where it can still happen at "local" and "majority" read concern.

--
You received this message because you are subscribed to the Google Groups "mongodb-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-dev...@googlegroups.com.
To post to this group, send email to mongo...@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-dev.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-dev/989fd692-7bc7-416f-a113-d3d6df62c0c0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

JB

unread,
Apr 16, 2017, 1:27:39 PM4/16/17
to mongodb-dev
Thanks for the super quick reply!
So it would not be possible on a replicated cluster but only on a sharded one? I am researching about the possible serializability violations on replica sets. To exclude certain violations I need that information. Since all updates go to the primary and have a total order it would not be possible for a single query to observe a newer update but not an older update at the same time if it can only observe one database state.

Andy Schwerin

unread,
Apr 26, 2017, 5:10:23 PM4/26/17
to mongodb-dev

The answer to this question is a little tricky. At read concern "majority", in a single replica set, a single query won't observe older state after having observed newer state. At read concern "local", I believe this is also true because we kill running queries before rolling back uncommited (local) state. I don't believe this is sufficient to meet the standard definition of "serializability", though.

Once you issue a second query, however, you're at risk, even if you make a good faith effort to only read from primaries. The problem is that in certain network partitions, you may be able to see a node that does not yet know it has been partitioned. That node may believe it is primary for some time after another primary (which it cannot see) is elected. You might route one read to the partitioned "old" primary and another to the "new" primary. You might route the second read to the "old" primary and the first to the "new" one, and then even at "majority" read concern see a causality violation. Reading with "linearizable" read concern prevents this, by ensuring that the node you're reading from is still primary (by converting reads to writes, essentially). However, IIRC, even this isn't the same as serializability, and it is less powerful than "strict serializability".

If you haven't already, I would recommend reading Kyle Kingsbury's posts about MongoDB. The most recent one covers the 3.4 branch and a number of improvements we made over the last few years. The older ones show a lot of MongoDB's prior warts, and also contain a nice drawing of the "map of consistency models" from Bailis's High Availability Transactions paper. The Bailis paper contains a convenient, not-too-controversial description of the relationship between database and distributed systems consistency concepts.

Reply all
Reply to author
Forward
0 new messages