Your 3 assumptions are correct.
The synchronization between the 3 sections (oplog, data, indexes) is
guaranteed because they all use the same storage engine and journal.
So once the journal commit happens, all 3 are guaranteed to be in sync.
Adding a 2nd storage engine that has the same transactional
properties isn't going to be easy.
I'm not even sure it's possible to do in an elegant way.
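
To make the difficulty concrete, here is a minimal Python sketch. It is a toy
model, not MongoDB code: the Engine class, the insert helper, and the section
names are all made up for illustration, and a "crash" is simulated simply by
recovering from the journal without committing the pending writes. With one
journal, a crash leaves oplog, data, and index at the same point. With two
independently committed journals, a crash between the two commits leaves the
index behind the data.

```python
from dataclasses import dataclass, field


@dataclass
class Engine:
    """Toy storage engine: one journal protects every section written to it."""
    journal: list = field(default_factory=list)  # survives a crash
    pending: list = field(default_factory=list)  # lost on a crash

    def write(self, section, key, value):
        self.pending.append((section, key, value))

    def commit(self):
        # A single journal commit makes the whole group durable at once.
        self.journal.append(tuple(self.pending))
        self.pending = []

    def recover(self, sections):
        # Replay committed groups only; uncommitted writes are discarded.
        state = {s: {} for s in sections}
        for group in self.journal:
            for section, key, value in group:
                state[section][key] = value
        return state


def insert(doc_id, doc, data_engine, index_engine):
    """Hypothetical insert path: oplog entry, document, and index entry."""
    data_engine.write("oplog", doc_id, ("i", doc))
    data_engine.write("data", doc_id, doc)
    index_engine.write("index", doc["name"], doc_id)


# Case 1: everything in one engine and one journal.
single = Engine()
insert(1, {"name": "a"}, single, single)
single.commit()
insert(2, {"name": "b"}, single, single)  # crash before this commit
one = single.recover(["oplog", "data", "index"])

# Case 2: the index lives in a second engine with its own journal.
main, other = Engine(), Engine()
insert(1, {"name": "a"}, main, other)
main.commit()
other.commit()
insert(2, {"name": "b"}, main, other)
main.commit()  # crash here, before other.commit() runs
two = {**main.recover(["oplog", "data"]), **other.recover(["index"])}

print("one journal: ", one)
print("two journals:", two)
```

In case 2 the recovered data and oplog contain document 2 but the index does
not, which is exactly the state the second engine would have to detect and
repair (or prevent with some form of coordinated commit).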
On Sun, Sep 23, 2012 at 7:50 PM, Zardosht Kasheff <zard...@gmail.com> wrote:
> Hello all,
>
> I am a Tokutek engineer investigating the possible integration of a
> different storage engine into MongoDB, be it at the index level or the
> storage engine level.
>
> For the purpose of this email, suppose that a collection either:
> - has a secondary index that is using our engine.
> - the entire collection is implemented using our engine.
>
> I am trying to learn how crash safety/recovery and replication would
> work with a possible third-party engine. The problem I see right now is
> that, should MongoDB crash, I do not understand how we can ensure that we recover
> to a state that MongoDB finds acceptable. That being said, I was wondering
> if somebody could please help with these questions:
>
> After a crash and recovery, what is the expected state of the system? Here
> are my guesses, based on things I have read, but they are only guesses:
> - secondary indexes are in sync with the main data heap
> - the main data heap is in sync with the replication log (which I think is
> called the opLog)
> - the exact data in the database depends on when the last fsync of the
> journal occurred.
>
> Are my guesses correct? If not, what are the invariants of the system after
> a crash regarding the journal, data heap, and opLog (and anything else I may
> not know about)?
>
> If so, here is the challenge I am thinking about. Upon a crash, if we are
> just a secondary index, how do we ensure that we are in sync with the main
> data heap, and if we have the entire collection, how do we ensure that we
> are in sync with the opLog?
>
> To answer this, I am trying to learn the locking in the system that ensures
> these invariants hold. I see the following in instance.cpp and query.cpp:
> - receivedInsert, receivedUpdate, and receivedDelete call Lock::DBWrite
> lk(ns), which I guess grabs some database level lock, and releases the lock
> should there be a "PageFaultException" (which I guess is I/O). Is this a
> database level lock that gets yielded during I/O?
> - receivedInsert has a reference to "read locked in big log". What does
> this mean?
> - runQuery, through "Client::ReadContext ctx( ns , dbpath );" grabs some
> read lock? Is this a read lock on the same lock grabbed in receivedInsert
> etc...?
>
> I guess some locking needs to be in place to ensure that the opLog and
> journal are in sync with the data heap, but with the locking above, I do not
> understand how this is done. Is there a global rw lock that does this? If
> so, where in code can I read about it?
>
> Thanks
> -Zardosht