I'm much more familiar with MySQL and Redis. This is my first post here. I did a search first; surprisingly, there are only two topics on this forum relating to MongoDB eventual consistency.
I understand that MongoDB *eventually writes my data*, and that is primarily what makes it fast. Okay, fine. Everything in life is a trade-off. My questions are related to the impact of the choice to use MongoDB on the application architecture. For example...
Do you always have to check to see if your data actually got written, before moving on in your write code? That seems wasteful and negates the speed gains you would get from using it.
Or instead, do I use MongoDB like a cache layer, always checking whether the data is there and, if not, re-fetching it from my persistent store (MySQL, an API, or other...)? That is, use another store as the main one, and MongoDB as secondary?
Or do I run N MongoDB servers and deal with replication at the MongoDB server level?
How does MongoDB deal with larger data size than I have available RAM? Are swaps to disk fast? Evictions? Or is it not accurate to think of it as a cache like Redis?
In short, how do application architects and developers deal with MongoDB's eventual persistence, if it is going to be the app's primary data store?
I think there is some confusion about what is meant by eventual consistency. In a single-server environment, for each client MongoDB will reliably write data in the order that the client writes it, and data will be available to everyone as soon as MongoDB (the server) has written it. This data will not be written to disk immediately, but instead is written to memory-mapped files that the OS flushes to disk (additionally, every 60 seconds MongoDB forces a flush to disk).
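To make the difference between "visible" and "on disk" concrete, here is a minimal toy sketch in Python. It is not how the server is implemented and it leaves out journaling entirely; `ToyServer` and its methods are hypothetical names used only for illustration.

```python
class ToyServer:
    """Toy model: writes hit memory first and reach disk only on flush."""

    def __init__(self):
        self.memory = {}  # what every client can read
        self.disk = {}    # what would survive a crash

    def write(self, key, value):
        self.memory[key] = value  # visible to all readers right away

    def read(self, key):
        return self.memory.get(key)

    def flush(self):
        self.disk = dict(self.memory)  # stands in for the periodic flush

    def crash_and_restart(self):
        self.memory = dict(self.disk)  # unflushed writes are lost


server = ToyServer()
server.write("user:1", "alice")
visible_before_flush = server.read("user:1")  # readable before any flush
server.flush()
server.write("user:2", "bob")                 # written, but never flushed
server.crash_and_restart()
survivors = sorted(server.memory)
```

In this model the first write can be read back immediately, while only the flushed write survives the simulated crash. That is the sense in which reads are immediate even though durability lags behind.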
However, MongoDB drivers do not, by default, check the server's return code when they write to the database. That means that if a write fails because it is illegal (for instance, if it violates a unique index), the application will not find out. Setting "safe mode = true", which is slightly different syntactically for each driver, will ensure that the application gets the return code for each operation it sends to the server. See this Stack Overflow question: http://stackoverflow.com/questions/11563627/what-is-exactly-meant-by-....
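As a rough illustration of what the application does and does not see, here is a toy Python model of a collection with a unique index. It does not use any real driver API; `ToyCollection`, `DuplicateKey`, and the `safe` parameter are hypothetical stand-ins, and the actual option name and syntax depend on the driver you use.

```python
class DuplicateKey(Exception):
    """Raised by the toy server when the unique index is violated."""


class ToyCollection:
    def __init__(self):
        self.docs = {}  # acts as a unique index on "_id"

    def _apply(self, doc):
        if doc["_id"] in self.docs:
            raise DuplicateKey(doc["_id"])
        self.docs[doc["_id"]] = doc

    def insert(self, doc, safe=False):
        if not safe:
            # Fire-and-forget: the server may reject the write,
            # but the client never hears about it.
            try:
                self._apply(doc)
            except DuplicateKey:
                pass
            return None
        # Acknowledged write: a failure is raised to the caller.
        self._apply(doc)
        return True


users = ToyCollection()
users.insert({"_id": 1, "name": "alice"}, safe=True)
silent = users.insert({"_id": 1, "name": "bob"})  # rejected, no error seen
try:
    users.insert({"_id": 1, "name": "carol"}, safe=True)
    reported = False
except DuplicateKey:
    reported = True
```

Both duplicate inserts are rejected and the stored document is unchanged, but only the acknowledged call tells the application that something went wrong.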
In a replication scenario (more than one server), writes are made to a single primary server, which is polled by secondaries that replicate its data. In a case such as this, if you allow your application to read from secondaries (by default, this is not allowed, but it can be good for scaling reads), then the application could miss data that has been written to the primary but not yet replicated to the secondary.
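The stale read described above can be sketched with a toy model in Python. This only illustrates the idea of a secondary pulling changes from a primary; `ToyPrimary`, `ToySecondary`, and `poll` are hypothetical names, not real MongoDB or driver APIs.

```python
class ToyPrimary:
    def __init__(self):
        self.data = {}
        self.oplog = []  # ordered log of writes for secondaries to copy

    def write(self, key, value):
        self.data[key] = value
        self.oplog.append((key, value))


class ToySecondary:
    def __init__(self, primary):
        self.primary = primary
        self.data = {}
        self.applied = 0  # number of log entries copied so far

    def poll(self):
        """Pull and apply any log entries not yet seen."""
        for key, value in self.primary.oplog[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.oplog)


primary = ToyPrimary()
secondary = ToySecondary(primary)
primary.write("order:42", "paid")
stale_read = secondary.data.get("order:42")  # secondary has not polled yet
secondary.poll()
fresh_read = secondary.data.get("order:42")
```

Reading from the primary always returns the new value in this model. Only the read from the secondary, made before it has caught up, misses the write.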
MongoDB is not an in-memory database, and it is not intended to be used as a cache layer; it is a persistent database that writes to disk, and is meant to be used as a primary data store. Its speed is due to good implementation rather than poor write persistence. In the normal course of using MongoDB, consistency does not show up as a problem.
If you would like to become more familiar with how different scenarios are handled, the official docs at http://docs.mongodb.org/manual/ are a good place to start.
On Thursday, October 11, 2012 1:24:48 PM UTC-4, Geoffrey Hoffman wrote:
> I'm much more familiar with MySQL and Redis. This is my first post here. I did a search first; surprisingly, there are only two topics on this forum relating to MongoDB eventual consistency.
> I understand that MongoDB *eventually writes my data*, and that is primarily what makes it fast. Okay, fine. Everything in life is a trade-off. My questions are related to the impact of the choice to use MongoDB on the application architecture. For example...
> Do you always have to check to see if your data actually got written, before moving on in your write code? That seems wasteful and negates the speed gains you would get from using it.
> Or instead, do I use MongoDB like a cache layer, always checking whether the data is there and, if not, re-fetching it from my persistent store (MySQL, an API, or other...)? That is, use another store as the main one, and MongoDB as secondary?
> Or do I run N MongoDB servers and deal with replication at the MongoDB server level?
> How does MongoDB deal with larger data size than I have available RAM? Are swaps to disk fast? Evictions? Or is it not accurate to think of it as a cache like Redis?
> In short, how do application architects and developers deal with MongoDB's eventual persistence, if it is going to be the app's primary data store?
> A wise MySQL DBA told me once, if you care about your data, use InnoDB.
> Would, therefore, a wise MongoDBA suggest to never use fewer than 2 or 3
> servers (if I care about my data)?
> When you're coming from an ACID world, making the leap of faith to
> fire-and-forget can be tough to swallow.
> When can I read the data I just wrote? Immediately (while it's still in
> RAM and not yet on Disk?) or only once it's on disk?
> On Thursday, October 11, 2012 2:49:03 PM UTC-7, Sam Helman wrote:
>> Hello,
>> I think there is some confusion about what is meant by eventual
>> consistency. In a single-server environment, for each client MongoDB will
>> reliably write data in the order that the client writes it, and data will
>> be available to everyone as soon as MongoDB (the server) has written it.
>> This data will not be written to disk immediately, but instead is written
>> to memory-mapped files that the OS flushes to disk (additionally, every 60
>> seconds MongoDB forces a flush to disk).
>> However, MongoDB drivers do not, by default, check the server's return
>> code when they write to the database. That means that if a write fails
>> because it is illegal (for instance, if it violates a unique index), the
>> application will not find out. Setting "safe mode = true", which is
>> slightly different syntactically for each driver, will ensure that the
>> application gets the return code for each operation it sends to the server.
>> See this Stack Overflow question:
>> http://stackoverflow.com/questions/11563627/what-is-exactly-meant-by-fire-and-forget-write-in-mongodb
>> In a replication scenario (more than one server), writes are made to a
>> single primary server, which is polled by secondaries that replicate its
>> data. In a case such as this, if you allow your application to read from
>> secondaries (by default, this is not allowed, but it can be good for
>> scaling reads), then the application could miss data that has been written
>> to the primary but not yet replicated to the secondary.
>> MongoDB is not an in-memory database, and it is not intended to be used
>> as a cache layer; it is a persistent database that writes to disk, and is
>> meant to be used as a primary data store. Its speed is due to good
>> implementation rather than poor write persistence. In the normal course of
>> using MongoDB, consistency does not show up as a problem.
> You received this message because you are subscribed to the Google
> Groups "mongodb-user" group.
> To post to this group, send email to mongodb-user@googlegroups.com
> To unsubscribe from this group, send email to
> mongodb-user+unsubscribe@googlegroups.com
> See also the IRC channel -- freenode.net#mongodb
It is also worth noting that all of the drivers are configurable to NOT use fire-and-forget, even though it is the default. Take a look at safe mode if you begin to develop with MongoDB.
On Friday, October 12, 2012 2:28:27 PM UTC-4, Sammaye wrote:
> "Would, therefore, a wise MongoDBA suggest to never use fewer than 2 or 3 servers (if I care about my data)?"
> It can also be solved by a solid backup and recovery strategy, so adding more servers is not always the answer.
> "When can I read the data I just wrote? Immediately (while it's still in RAM and not yet on Disk?) or only once it's on disk?"
> Immediately.
> On 12 October 2012 18:28, Geoffrey Hoffman <geoffrey...@gmail.com> wrote:
>> Thanks for the thoughtful reply, Sam.