Hi mongo users,
I am currently planning a database migration from MMAPv1 to WiredTiger. As far as I know this can only be done via mongodump + mongorestore.
I ran some tests with mongodump on Windows (with gzip compression enabled), but I keep running into problems because at some point the dump creation always fails.
My mongo installation (~2 TB) mainly consists of two large databases, each with three collections. The largest collection has about 40 million documents, plus a lot of data in GridFS.
If I understand the problem correctly, it fails because the disk fragmentation exceeds the limits of the NTFS file system (I did not even know such a limitation existed before I ran into it). Note that I dump onto a fresh, completely unfragmented disk.
I assume the fragmentation is caused by the fact that mongodump uses 4 parallel connections by default. Reducing it to one thread helps, but then the dump takes ages: dumping just the largest collection with its 40 million documents takes about 5 days, so I estimate a full dump without parallelization would take about two weeks...
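For reference, this is roughly the single-threaded invocation I have been testing with (database name and output path are placeholders; host and port are the defaults):

```shell
# -j / --numParallelCollections controls how many collections are
# dumped concurrently; setting it to 1 avoids interleaved writes
# from multiple threads, at the cost of much longer runtime.
mongodump --db myLargeDb --gzip --numParallelCollections=1 --out D:\dump
```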
From my point of view the major problem is that mongodump uses a write buffer that is far too small. A write buffer of, say, 100 MB per thread would drastically reduce the fragmentation, but I did not find a way to configure it; mongodump does not seem to expose such an option, or did I miss it?
What are your experiences with mongodump? Is there a way to improve the dump speed, or is it possible to create a dump offline (the database does not have to be available 24/7)?
Jan