disk storage grows by deleting?!


jona...@findmeon.com

Oct 16, 2015, 6:35:44 PM
to mongodb-user
I'm using MongoDB 3.0.6 with WiredTiger+zlib. The OS is Ubuntu 12.04, and I installed from the official package.

My collection was ~30GB of data.  After some analysis, I decided to remove 10% of records as they accounted for 1/3 of disk space.  Instead of just deleting them - to be safe - I copied them to a new collection. 

I am a little surprised by the results: I have an 11GB "oversized" collection, but my "main" collection's disk storage neither decreased in size nor maintained size by keeping the allocated filesystem blocks.  the disk storage actually grew about 10GB.  

as you can imagine, this is less than ideal.  I took 11GB out so I could get the DB down to around 20GB... and now I have a 40GB db.  

Looking at the collection stats, they reflect the removed records: the actual data size is ~20GB, but the disk storage size is ~40GB.

Is this expected?  Is there anything I can/should do to fix this and avoid in the future?

A. Jalil @AJ

Oct 16, 2015, 10:49:29 PM
to mongodb-user
I would check if you have powerOf2Sizes enabled. I've read somewhere that it does increase the storage size, but on the other hand it increases performance as well.

Here is another response by Stephen regarding powerOf2Sizes:
https://groups.google.com/forum/#!msg/mongodb-user/4MNqcNR4k8M/nnDkOILjq8gJ

Hope this helps !

@AJ

Stephen Steneker

Oct 17, 2015, 12:10:02 AM
to mongodb-user

On Saturday, 17 October 2015 09:35:44 UTC+11, jonathan wrote:

I’m using MongoDB 3.0.6 with WiredTiger+zlib. The OS is Ubuntu 12.04, and I installed from the official package.

My collection was ~30GB of data. After some analysis, I decided to remove 10% of records as they accounted for 1/3 of disk space. Instead of just deleting them - to be safe - I copied them to a new collection.

Hi Jonathan,

Disk usage behaviour varies by storage engine. In the case of WiredTiger the documents have been marked as deleted with the space available for reuse. You can use the compact command to rewrite the collection and release unused disk space.

Note to MMAP users who might find this post: the compact command does not release free space on MMAPv1. Make sure you check the MongoDB documentation for any notes relevant to your configured storage engine.
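To check whether compact actually released anything, one option is to compare the collection's `storageSize` before and after running it. The following is a minimal sketch for the mongo shell, not an official procedure: the helper name `compactAndReport` is made up for illustration, while `db.getCollection()`, `stats()` and `db.runCommand({compact: ...})` are the standard shell calls. The compact documentation notes that the command can block other operations on the database while it runs, so it is best tried during a maintenance window.

```javascript
// Hypothetical helper: run compact on one collection and report the
// change in storageSize. `db` is the usual mongo shell database handle.
function compactAndReport(db, name) {
  // On-disk size before compaction, as reported by collection stats.
  var before = db.getCollection(name).stats().storageSize;

  // The compact command takes the collection name as its value.
  var result = db.runCommand({ compact: name });

  // On-disk size after compaction.
  var after = db.getCollection(name).stats().storageSize;

  return {
    ok: result.ok,
    before: before,
    after: after,
    freedBytes: before - after
  };
}

// Usage in the shell (collection name is a placeholder):
//   compactAndReport(db, "collection_name")
```

If `freedBytes` comes back as 0 while the stats still show a large amount of reusable space, compact did not shrink the file.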

I took 11GB out so I could get the DB down to around 20GB… and now I have a 40GB db.

Does the 40GB refer to the total of all data, or does it appear that your original collection grew from 30GB to 40GB?

If you started with ~30GB of data files and copied ~11GB to another collection, it sounds like the expected total of all files would be around 40GB.


On Saturday, 17 October 2015 13:49:29 UTC+11, A. Jalil @AJ wrote:

I would check if you have powerOf2Sizes enabled. I’ve read somewhere that it does increase the storage size, but on the other hand it increases performance as well.

The powerOf2Sizes allocation strategy is specific to the MMAP storage engine, and is the default allocation strategy for MMAP in MongoDB 2.6+. MMAP supports in-place document updates so the additional storage allocation minimizes unnecessary index updates and moves due to document growth. A document that outgrows its storage allocation in MMAP will get moved to a larger storage allocation, which also requires updating all the index entries to point to the new document location.

Regards,
Stephen

jona...@findmeon.com

Oct 17, 2015, 1:35:17 AM
to mongodb-user
On Saturday, October 17, 2015 at 12:10:02 AM UTC-4, Stephen Steneker wrote:

Does the 40GB refer to the total of all data, or does it appear that your original collection grew from 30GB to 40GB?

If you started with ~30GB of data files and copied ~11GB to another collection, it sounds like the expected total of all files would be around 40GB.

Yeah, I can see how you would think that I got confused from the similarity of numbers, however the 30GB collection grew to 40GB of disk space.  And there is another 11GB of the actual documents on top of that. 

When these documents were migrated into their own collection...
"size" : 11152109299,
"storageSize" : 11210907648,

The main collection grew from 30GB to 40GB:
"size" : 20748163974,
"storageSize" : 39900454912,

So I'm now using 50GB total disk space to house 30GB of data, when I was trying to see if I could drop the db down to 20GB.  You can imagine my excitement at this.
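To put a number on the mismatch, the two stats outputs above can be compared as a ratio of `storageSize` to `size`. This is only a rough sketch: the function name `storageRatio` is made up for illustration, and it assumes `size` is the uncompressed document size while `storageSize` is the on-disk allocation, so with zlib a ratio at or below 1 would be the normal expectation.

```javascript
// Hypothetical helper: ratio of on-disk allocation to logical data size.
// `stats` is assumed to have the shape of db.collection.stats() output.
function storageRatio(stats) {
  return stats.storageSize / stats.size;
}

// Figures copied from the two stats outputs quoted above.
var migrated = storageRatio({ size: 11152109299, storageSize: 11210907648 });
var main = storageRatio({ size: 20748163974, storageSize: 39900454912 });
```

With these figures the migrated collection comes out at about 1.0 and the main collection at about 1.9, meaning almost half of the main collection's file holds no live data.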


I had already tried `db.runCommand({compact: 'collection_name'})`, which doesn't do anything. It returns immediately, no space is freed, and the global log shows nothing.

Some online docs suggested I use `db.repairDatabase()`. I've got that running for now.

 

jona...@findmeon.com

Oct 17, 2015, 2:28:22 PM
to mongodb-user
Mongo doesn't want to give up the space.  I've run compact and repair database.  This comes back on the collection's stats:

"block-manager" : {
"file allocation unit size" : 4096,
"blocks allocated" : 3,
"checkpoint size" : 20813705216,
"allocations requiring file extension" : 0,
"blocks freed" : 0,
"file magic number" : 120897,
"file major version number" : 1,
"minor version number" : 0,
"file bytes available for reuse" : 19089190912,
"file size in bytes" : 39900454912
},

The space is being marked unused, but mongo won't release it - even with repairDatabase(), which everything says is the exact tool to do so.
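For reference, the share of the file that WiredTiger reports as reusable can be worked out directly from the block-manager section above. This is a small sketch with a made-up helper name (`reusableSpace`); the two field names are taken verbatim from the stats output, and in the shell the input would come from `db.collection.stats().wiredTiger["block-manager"]`.

```javascript
// Hypothetical helper: how much of the collection file is marked reusable.
function reusableSpace(blockManager) {
  var reusable = blockManager["file bytes available for reuse"];
  var fileSize = blockManager["file size in bytes"];
  return {
    reusableBytes: reusable,
    fileBytes: fileSize,
    reusableFraction: reusable / fileSize
  };
}

// Values copied from the block-manager output above.
var result = reusableSpace({
  "file bytes available for reuse": 19089190912,
  "file size in bytes": 39900454912
});
```

Here `reusableFraction` works out to just under 0.48, so close to half of the ~40GB file is free space that has not been handed back to the filesystem.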

I would dump the database to reinsert, but thanks to losing this 20GB of size I don't have enough to handle an uncompressed dump.

anyone have an idea?

A. Jalil @AJ

Oct 17, 2015, 2:42:20 PM
to mongodb-user
Hi Stephen,

In one of your responses at https://groups.google.com/forum/#!topic/mongodb-user/82ORyR5hbYc   you said I can use <resync> command to reclaim space, now I see the command <compact> mentioned on this thread - I was wondering if these commands do different things and whether I should also try <compact> to reclaim free space in my case..

Thanks !
@AJ

Stephen Steneker

Oct 17, 2015, 4:24:27 PM
to mongodb-user

On Sunday, 18 October 2015 05:42:20 UTC+11, A. Jalil @AJ wrote:

In one of your responses at https://groups.google.com/forum/#!topic/mongodb-user/82ORyR5hbYc   you said I can use <resync> command to reclaim space, now I see the command <compact> mentioned on this thread - I was wondering if these commands do different things and whether I should also try <compact> to reclaim free space in my case..

Hi AJ,

As I noted earlier in this thread, the behaviour of the compact command is storage engine specific. With MMAP (the default storage engine in MongoDB 3.0 and earlier), a compact command will rewrite the collection but does not release any preallocated space for MMAP: http://docs.mongodb.org/v3.0/reference/command/compact/#disk-space. The original question in this thread was asking about WiredTiger on a standalone node.

Completely rebuilding data files via repair or resync is a more general way to reclaim unused space. I would only recommend the repair command for a standalone node.

If a node is part of a replica set deployment, the recommended approach to release unused space without downtime is a rolling resync (re-syncing one secondary at a time until finally you step down and re-sync the former primary).

Regards,
Stephen

jona...@findmeon.com

Oct 17, 2015, 5:07:33 PM
to mongodb-user
I've got one standalone node. `.repairDatabase()` didn't work as advertised. I tried restarting the server too. Should the command-line tool work?

jona...@findmeon.com

Oct 19, 2015, 4:33:33 PM
to mongodb-user
The command-line version, `mongod --repair`, didn't work either. I tried a half-dozen times with different settings, including using the repair-path option. While specifying a repair path required a writeable path to exist, it did not copy over any files because the server noticed nothing "wrong" with the source files after extensive audits.

The only way to get my disk space under control was deleting enough non-mongo files to make enough space to pipe a mongodump into a gzipped file, then read that back in via mongorestore.

This is unsettling.  Not only did the disk space grow for a random reason -- not a single utility or command designed to address this works as documented or intended.  From my perspective, mongo seems thoroughly broken.

* Desired: Turn 30GB collection into 20GB + 10GB collections  (30GB total space)
* Result: 20GB collection now 40GB + 10GB new collection. (50GB total space)
* Dump to gzip data: 50GB mongo + 20GB zipfile (70GB total space)
* Read back into mongo as a new collection: 70GB mongo + 20GB zipfile (90GB total space)
* After auditing to ensure all data made it: 30GB total space
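
The totals in the list above also show how much headroom the dump-and-restore route needs. As a rough sketch (the step labels and the helper name `peakUsage` are made up for illustration; the GB figures are the ones listed):

```javascript
// Total disk usage in GB after each step, taken from the list above.
var steps = [
  { label: "after the split", totalGB: 50 },
  { label: "dump to gzip", totalGB: 70 },
  { label: "restore as new collection", totalGB: 90 },
  { label: "after audit and cleanup", totalGB: 30 }
];

// Hypothetical helper: find the step with the highest total usage.
function peakUsage(list) {
  var peak = list[0];
  for (var i = 1; i < list.length; i++) {
    if (list[i].totalGB > peak.totalGB) {
      peak = list[i];
    }
  }
  return peak;
}

var peak = peakUsage(steps);

// Extra space needed on top of the starting point of the workaround.
var extraNeededGB = peak.totalGB - steps[0].totalGB;
```

By these figures the workaround peaks at 90GB, or 40GB more than the 50GB already in use, in order to end up at 30GB.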

Cesare Ghirelli

Jun 6, 2016, 7:25:43 PM
to mongodb-user
Same problem here with a replica set too (destroying and resyncing doesn't free any disk space).
None of the above commands worked. The product actually seems broken, and not ready for production use at all.
The only way I found to get my disk space back was dropping and recreating the db. That certainly isn't a solution applicable in a production environment.

I hope someone has good news on this problem.

Thank you for reading.