How often and how much data does MongoDB write to disk?

MongoDB User

Feb 17, 2018, 5:59:36 PM
to mongodb-user
Let's say I have a MongoDB database which is about 30 GB in size.

Let's say I am importing masses of data into it right now, inserting 10,000 records per batch every few seconds, and the collection is already up to 500 million+ records after a few days' worth of importing.
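For reference, a batched insert of that shape looks roughly like the following pymongo sketch (the database and collection names and the document contents are placeholders):

    # Minimal sketch of one 10,000-record batch insert with pymongo.
    # Database/collection names and document contents are placeholders.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    coll = client["mydb"]["readings"]

    batch = [{"seq": i, "value": i * 0.5} for i in range(10000)]
    coll.insert_many(batch, ordered=False)  # one round trip per batch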

How often will MongoDB write to disk?

And perhaps more importantly: will it overwrite the entire 30 GB with every disk write, or will it only overwrite certain parts of the file?

I ask because all drives eventually fail if you write too much to them.

I don't have an enterprise-quality SSD. Mine is rated for 360 terabytes written (TBW) before it falls out of warranty.

The import will continue until I have probably 1 to 1.5 billion records in the collection.

Kevin Adistambha

Feb 26, 2018, 11:40:35 PM
to mongodb-user

Hi

How often will MongoDB write to disk?

In the MongoDB 3.4 series and earlier, WiredTiger creates a checkpoint every 60 seconds or after 2 GB of journal data has been written, whichever comes first (see How frequently does WiredTiger write to disk). In the MongoDB 3.6 series, this was changed to every 60 seconds only (see SERVER-29210). A checkpoint represents a consistent, on-disk state of the database.
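If you want to verify how often checkpoints are actually happening on your instance, the serverStatus command exposes WiredTiger's checkpoint counters. A rough pymongo sketch (the exact counter names can vary between MongoDB versions, so treat these keys as an assumption):

    # Read cumulative WiredTiger checkpoint counters from serverStatus.
    # The exact key names vary between versions, hence the .get() calls.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    txn = client.admin.command("serverStatus")["wiredTiger"]["transaction"]

    print("checkpoints since startup:", txn.get("transaction checkpoints"))
    print("most recent checkpoint took (ms):",
          txn.get("transaction checkpoint most recent time (msecs)"))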

WiredTiger also writes to the journal between checkpoints so that recent writes can be recovered after a crash. Whether a write waits for the journal is configurable with the j option in the write concern.
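For example, here is a minimal pymongo sketch of requesting journal acknowledgement on a collection (the database and collection names are placeholders):

    # Ask the server to acknowledge writes only after they reach the
    # on-disk journal (j=True). Names below are placeholders.
    from pymongo import MongoClient, WriteConcern

    client = MongoClient("mongodb://localhost:27017")
    coll = client["mydb"].get_collection(
        "readings", write_concern=WriteConcern(j=True)
    )
    coll.insert_one({"seq": 1, "value": 0.5})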

will it overwrite the entire 30 GB with every disk write, or will it only overwrite certain parts of the file?

WiredTiger will not overwrite your whole database on every disk write. That would not be performant, and there is no point rewriting parts of the data that don't change. Each checkpoint writes only what is necessary; for example, if the database does not change between checkpoints, nothing will be checkpointed to disk.

Having said that, it's best to create a replica set deployment in a production environment to provide high availability and ensure your data's redundancy.
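For instance, connecting to a hypothetical three-member replica set from pymongo (the host names and replica set name are made up):

    # Connect to a hypothetical three-member replica set; the driver
    # discovers the members and fails over to a new primary automatically.
    from pymongo import MongoClient

    client = MongoClient(
        "mongodb://node1:27017,node2:27017,node3:27017/?replicaSet=rs0"
    )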

Best regards
Kevin
