MongoDB support for Monotonic Read Consistency (MRC)

dhsieh

Apr 25, 2012, 12:15:33 PM

to mongodb-user

In http://blog.mongodb.org/post/523516007/on-distributed-consistency-part-6-consistency-chart,
Dwight indicated that MongoDB's default consistency model supports
Strong Consistency, a superset of MRC. I would like to confirm,
through the following scenario, that MongoDB does indeed support MRC:

(1) There are 2 DCs: DC1 (1M, 1S) and DC2 (1S). There are 2 apps; App2
continues to run in the same session throughout
(2) App1 writes w1 to M in DC1
(3) App2 in DC1 sets slaveOk() true and reads w1 from S in DC1,
assuming w1 has been replicated from M in DC1
(4) Assume S in DC2 receives w1 replicated from M in DC1
(5) App1 writes w2 to M in DC1
(6) App2 reads w2 from S in DC1, assuming w2 has been replicated from M
in DC1
(7) Assume S in DC1 fails and App2 reads again to find out whether
there is a new w3. This time, App2 receives w1 from S in DC2, since w2
hasn't yet been replicated there from M in DC1

My question is: since App2 continues to run in the same session,
will the MongoDB language driver detect the violation of MRC and
return an error?

Dan Crosta

Apr 25, 2012, 12:41:37 PM

to mongodb-user

MongoDB allows you to tune the level of consistency you achieve by
controlling (a) whether you read from the primary only, or from
secondaries, and (b) when you write, how many nodes must acknowledge
the write before control returns to the writing process (this is
useful for read-your-own-writes, for instance). MongoDB also uses an
in-order replication protocol, so once a read from a given node
succeeds (i.e. some particular write can be read from some particular
secondary), that read will continue to succeed on that node.
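
To illustrate point (b), here is a minimal, pure-Python sketch of how the
number of acknowledging nodes affects what a later read from a secondary
can see. It does not use a MongoDB driver; the Node class, the write()
helper and its w parameter are hypothetical stand-ins, and replication
to the remaining nodes is simply left pending rather than modelled.

```python
class Node:
    """A stand-in for one replica set member (hypothetical, in-memory)."""

    def __init__(self, name):
        self.name = name
        self.docs = {}


def write(primary, secondaries, key, value, w=1):
    """Apply a write on the primary, then replicate until w nodes have it.

    Returns the names of the nodes that hold the write when control
    returns to the caller. Secondaries beyond w are left stale here.
    """
    primary.docs[key] = value
    acked = [primary.name]
    for node in secondaries:
        if len(acked) >= w:
            break  # remaining secondaries would catch up later
        node.docs[key] = value
        acked.append(node.name)
    return acked


m = Node("M")
s1, s2 = Node("S1"), Node("S2")

# w=1: only the primary has the write when control returns.
acked_one = write(m, [s1, s2], "a", 1, w=1)
stale = s2.docs.get("a")

# w=3: every node has the write, so a read from any secondary sees it.
acked_all = write(m, [s1, s2], "b", 2, w=3)
fresh = s2.docs.get("b")
```

With w=1 a read-your-own-write from S2 can miss the write; with w equal
to the number of nodes it cannot, at the cost of waiting for every node.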

As you point out, there are some situations around failures and
failovers in which the writes can seem to disappear. They will eventually
show up on all secondaries, but this can take some time depending on
the read and write load, network conditions, etc.

None of the 10gen drivers support detection of this sort of
"violation" of MRC. If it is critical to your application to always
have monotonic read consistency, you should read from the primary
only, as this offers the strongest consistency guarantees.
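
If reading from the primary only is not an option, an application could
detect the violation itself. The sketch below assumes each document
carries an application-maintained, increasing version field; that field,
the MonotonicReader class and the simulated nodes are all hypothetical,
and nothing here is a driver feature.

```python
class MonotonicReadViolation(Exception):
    """Raised when a read returns an older version than one already seen."""


class MonotonicReader:
    """Tracks the highest version seen per key within one client session."""

    def __init__(self):
        self._last_seen = {}  # key -> highest version observed so far

    def check(self, key, doc):
        """Return doc if it is at least as new as the last one seen for key."""
        version = doc["version"]
        last = self._last_seen.get(key, -1)
        if version < last:
            raise MonotonicReadViolation(
                f"{key}: read version {version} after already seeing {last}"
            )
        self._last_seen[key] = version
        return doc


# Simulated secondaries: plain dicts standing in for nodes (made-up data).
s_dc1 = {"item": {"version": 2, "value": "w2"}}  # has replicated w1 and w2
s_dc2 = {"item": {"version": 1, "value": "w1"}}  # has replicated only w1

reader = MonotonicReader()
first = reader.check("item", s_dc1["item"])  # accepted: first read in session

try:
    # S in DC1 has failed, so the session falls back to S in DC2.
    reader.check("item", s_dc2["item"])
    violation_detected = False
except MonotonicReadViolation:
    violation_detected = True
```

The application can then retry against another node or the primary
instead of silently accepting the older value.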

- Dan


dhsieh

Apr 25, 2012, 4:23:58 PM

to mongodb-user

Would you please clarify what you mean by the following sentences:

Dan Crosta

Apr 25, 2012, 4:34:33 PM

to mongodb-user

Sure. Suppose that two write operations happen on the primary -- call
them W1 and W2. These writes will be included in the primary's oplog
in the same order as they happened on the primary, so W1 will precede
W2.

Secondaries "replay" the oplog in the same order in which it was
written. So for any secondary, one of the following three statements
must be true (with regard to these two write operations):

1. The secondary has not replayed either W1 or W2
2. The secondary has replayed W1 but not yet W2
3. The secondary has replayed W1 and W2

My point initially was that there is no scenario in which the
secondary has replayed W2 only, but not W1 -- so it is impossible to
go "backwards in time" with this replication model. Put another way,
once you can read W1 from some given node, you will always be able to
read W1 from it. And once you can read W2 from some node, you can also
always read W1 from it.
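
The three possible states can be seen in a small simulation. This is
only a sketch of in-order replay, not MongoDB's replication code; the
Primary and Secondary classes and the replay() method are hypothetical.

```python
class Primary:
    """Holds an append-only oplog (a plain list in this sketch)."""

    def __init__(self):
        self.oplog = []

    def write(self, op):
        self.oplog.append(op)


class Secondary:
    """Replays the primary's oplog strictly in order."""

    def __init__(self):
        self.applied = 0  # number of oplog entries replayed so far
        self.data = []

    def replay(self, primary, max_ops=1):
        """Apply up to max_ops further oplog entries, never skipping any."""
        end = min(self.applied + max_ops, len(primary.oplog))
        self.data.extend(primary.oplog[self.applied:end])
        self.applied = end


primary = Primary()
primary.write("W1")
primary.write("W2")

secondary = Secondary()
states = [tuple(secondary.data)]  # record every state the secondary is in
while secondary.applied < len(primary.oplog):
    secondary.replay(primary)
    states.append(tuple(secondary.data))
```

The recorded states are empty, W1 only, and W1 followed by W2; a state
holding W2 without W1 never appears.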

- Dan

dhsieh

Apr 26, 2012, 12:01:18 PM

to mongodb-user

Realistically, the only meaningful way I can think of to implement MRC
behavior across a MongoDB cluster is as follows:

(1) Define a MongoDB replica set with Data Center Awareness tags, so
that writes are confirmed by the preferred slave nodes in a designated
DC
(2) For an MRC read, have the language driver go only to those
DC-tagged slave nodes

With this approach, writes become expensive: latency grows with the
number of DC-tagged slave nodes that must confirm each write

Dan Crosta

Apr 26, 2012, 12:18:53 PM

to mongod...@googlegroups.com

There's currently no way to read from tags, only write to them. Tagged reads are being considered for future releases of the MongoDB drivers (see https://jira.mongodb.org/browse/SERVER-3358). Drivers will enable support for tagged reading once the server and mongos support it, which is currently planned for the upcoming 2.2 release of MongoDB.

You can, however, use tags when writing, to ensure that writes are propagated to a particular data center, rack, etc, before acknowledgement. Since most drivers read from "nearby" (as measured by network latency) secondaries when reading from secondaries, this may be enough to satisfy your needs. When writing, your application server will need to know in which data center it is, and set an appropriate getLastErrorMode when writing. See http://www.mongodb.org/display/DOCS/Data+Center+Awareness for more detail on tagged writes with getLastError.
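
As a rough illustration of how a tagged write mode decides when a write counts as acknowledged, the sketch below models a mode as a mapping from a tag name to the number of distinct tag values that must hold the write. The member names, the "dc" tag, the mode names and the mode_satisfied() helper are hypothetical, and this is a simulation of the idea rather than the server's implementation.

```python
# Hypothetical replica set members, each tagged with its data center.
members = {
    "m_dc1": {"dc": "dc1"},
    "s_dc1": {"dc": "dc1"},
    "s_dc2": {"dc": "dc2"},
}

# Modes in the general shape of getLastErrorModes: tag name -> number of
# distinct values of that tag that must have the write.
modes = {
    "bothDCs": {"dc": 2},
    "anyDC": {"dc": 1},
}


def mode_satisfied(mode, acked):
    """True if the members in `acked` cover enough distinct values per tag."""
    for tag, needed in mode.items():
        values = {members[m][tag] for m in acked if tag in members[m]}
        if len(values) < needed:
            return False
    return True


# Acknowledged by both DC1 nodes only: the write has not left DC1.
only_dc1 = mode_satisfied(modes["bothDCs"], ["m_dc1", "s_dc1"])

# Acknowledged by the primary and the DC2 secondary: both DCs have it.
with_dc2 = mode_satisfied(modes["bothDCs"], ["m_dc1", "s_dc2"])
```

Under these assumptions a write using the "bothDCs" mode returns only after a node in each data center has it, so a later read from a nearby secondary in either data center is more likely, though not guaranteed, to see it.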

- Dan

dhsieh

Apr 26, 2012, 12:59:23 PM

to mongodb-user

Thanks for the clarification.
