Is WiredTiger recommended for an update-heavy write load?


kaushik subramanian

Jul 27, 2017, 7:16:39 PM
to mongodb-user
Hello all,
We have an analytics platform that uses MongoDB as the underlying store for many different time series. Long story short, here is what we have now:
- MongoDB 3.2 with WiredTiger storage engine
- zlib compression
- Minute-based documents: data is exported from each device once every minute, and there are hundreds of such devices. Every datapoint translates to a new insert (see the sketch after this list).
- The load was purely insert-heavy, with no updates at all.
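For concreteness, here is a minimal mongo shell sketch of that insert pattern. The collection and field names (metrics, deviceId, ts, value) are illustrative, not from the original setup:

    // Hypothetical minute-based schema: one document per device per minute.
    // Collection and field names are illustrative only.
    db.metrics.insertOne({
        deviceId: "device-042",
        ts: ISODate("2017-07-27T19:16:00Z"),
        value: 98.6
    });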

Now that we are a little wiser than before, we are moving to a time-series-based schema where every record will hold data for one hour. Here is how it will look:
- MongoDB 3.2 with WiredTiger storage engine
- zlib compression
- Preallocated 1-hour documents: data is still exported from each device every minute, and there are hundreds of such devices. Every datapoint translates to an update of the respective hour-based document (see the sketch after this list).
- The load has just been converted to an update-heavy load.
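Again for concreteness, a minimal mongo shell sketch of the new update pattern, assuming a "minutes" sub-document keyed by minute of the hour; all names are illustrative:

    // Hypothetical hour-based schema: one document per device per hour.
    // Each datapoint becomes an update applied via dot notation.
    db.metrics.updateOne(
        { deviceId: "device-042", hour: ISODate("2017-07-27T19:00:00Z") },
        { $set: { "minutes.16": 98.6 } },
        { upsert: true }
    );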

The WiredTiger documentation says that in-place updates are no longer supported. Given that, is WiredTiger well suited for an update-heavy write load?

As far as the read load is concerned, we are nowhere near as heavy on reads as we are on writes. I say this because data is being generated every minute by hundreds of devices, while the users retrieving that data do not come close to generating a comparable read load.

Would appreciate some insights on this one!!

kaushik subramanian

Aug 8, 2017, 4:12:04 PM
to mongodb-user
Hi Asya,
It would be great if you could comment on this topic. Thank you in advance.

Regards,
Kaushik

Kevin Adistambha

Aug 13, 2017, 10:05:10 PM
to mongodb-user

Hi Kaushik,

WiredTiger may or may not be suited to your particular use case.

However, there are some observations regarding your setup:

MongoDB 3.2 with WiredTiger storage engine

The latest MongoDB version is 3.4.7. I would encourage you to test using the newest MongoDB since there have been many improvements since 3.2.

zlib compression

Using zlib compression could lower your throughput, since you’re forcing the CPU to do more work. Try the default snappy compression or no compression (see Compression) for higher throughput.
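If you want to compare compressors without rebuilding the whole deployment, one option is a per-collection override via a WiredTiger configString; the collection name below is illustrative:

    // Per-collection block compressor override (collection name is illustrative).
    // snappy is the WiredTiger default; zlib compresses better but costs more CPU.
    db.createCollection("metrics_snappy", {
        storageEngine: { wiredTiger: { configString: "block_compressor=snappy" } }
    });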

Preallocated 1-hour documents: data is still exported from each device every minute, and there are hundreds of such devices. Every datapoint translates to an update of the respective hour-based document.

In WiredTiger, preallocation does not provide the same benefit as it does in MMAPv1.

The load has just been converted to an update-heavy load.

Since WiredTiger is a no-overwrite storage engine, an update is the same as a rewrite. The primary reason for this design is data safety: an unfinished or corrupt write cannot corrupt already-persisted data.

Aside from better data safety, there are other advantages of WiredTiger over the MMAPv1 storage engine, such as better concurrency, document-level locking, and compression.

Since every use case is different, there is no simple answer to your question. I would encourage you to perform your own testing using your specific use case, and compare the performance of MMAPv1 vs. WiredTiger using a simulated production workload. You may also want to experiment with different schema designs, or with Sharding if you need more throughput in your deployment.
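As a starting point for such testing, here is a rough mongo shell timing sketch for the update-heavy path; the names and counts are illustrative, and a real test should use production-like data volumes and concurrency:

    // Rough timing sketch for the update-heavy path; names and counts are
    // illustrative. Repeat against each storage engine / compressor and compare.
    var start = new Date();
    for (var i = 0; i < 10000; i++) {
        var setDoc = {};
        setDoc["minutes." + (i % 60)] = Math.random();
        db.metrics.updateOne(
            { deviceId: "device-" + (i % 100), hour: ISODate("2017-07-27T19:00:00Z") },
            { $set: setDoc },
            { upsert: true }
        );
    }
    print("elapsed ms: " + (new Date() - start));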

Best regards,
Kevin

kaushik subramanian

Sep 14, 2017, 2:25:10 AM
to mongodb-user
Hi Kevin,
My apologies for the delayed response, and thank you very much for the reply. As you suggested, some additional prototype-based testing should give us better insight.

Regards,
Kaushik