Compaction doesn't remove extent files

Troy Davisson

Oct 6, 2011, 12:26:53 AM
to mongod...@googlegroups.com
I have a MongoDB collection that was using about 556GB as of yesterday.  After I removed about 400 million documents (a little over half of the total), total disk usage didn't move.  I then learned that I needed to run the "compact" command on the collection in order to free up the excess disk space pre-allocated to it.

After a compaction process which took about 12 hours, the system now reports collection stats such as:

> db.stats()
{
"db" : "retslogs",
"collections" : 4,
"objects" : 288709043,
"avgObjSize" : 720.6959858475926,
"dataSize" : 208071448368,
"storageSize" : 214467101664,
"numExtents" : 103,
"indexes" : 3,
"indexSize" : 39519321216,
"fileSize" : 594496454656,
"nsSizeMB" : 16,
"ok" : 1
}
> db.hits.stats()
{
"ns" : "retslogs.hits",
"count" : 288713988,
"size" : 208074966568,
"avgObjSize" : 720.6958277615562,
"storageSize" : 214466995936,
"numExtents" : 100,
"nindexes" : 3,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1,
"flags" : 1,
"totalIndexSize" : 39520588496,
"indexSizes" : {
"_id_" : 8443993952,
"maTechId_1_startDtTm_-1_userName_1" : 21070477680,
"startDtTm_1_transaction_1" : 10006116864
},
"ok" : 1
}


... yet my /data/db directory is still at 556GB and contains files numbered from retslogs.0 all the way to retslogs.280 (which I believe tells me I have 281x 2GB extents).
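As a rough cross-check of the file count against the reported "fileSize", here is a minimal sketch. It assumes the preallocation scheme of this era of MongoDB: data files start at 64MB and double in size up to a cap of 2047MB, so not every file is a full 2GB. The function name and the cap constant are illustrative assumptions, not taken from the server's documentation.

```javascript
// Assumption: data files start at 64MB and double up to a 2047MB cap.
const MB = 1024 * 1024;

// Sum the expected on-disk size of the first `fileCount` data files.
function expectedFileSizeBytes(fileCount) {
  let total = 0;
  let sizeMB = 64;
  for (let i = 0; i < fileCount; i++) {
    total += sizeMB * MB;
    sizeMB = Math.min(sizeMB * 2, 2047); // cap just under 2GB
  }
  return total;
}

const reportedFileSize = 594496454656; // "fileSize" from db.stats() above
const fileCount = 281;                 // retslogs.0 through retslogs.280

const expected = expectedFileSizeBytes(fileCount);
console.log("expected bytes:", expected);
console.log("matches reported fileSize:", expected === reportedFileSize);
```

Under that assumption the sum agrees with the "fileSize" value above, which suggests that every one of the 281 files is still fully allocated on disk.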

The mongodb.log says "Wed Oct  5 22:01:35 [conn20] compact retslogs.hits end" and the console returned:

> db.hits.runCommand("compact")
{ "ok" : 1 }

... so I don't have any other reason to believe that it quit early before the final cleanup.

Nearly all of the 556GB of data was written using version 1.8.0.  I upgraded to 2.0.0 yesterday, prior to the mass delete and compaction.  Because of time and space constraints, a full repairDatabase() or dump/restore is pretty difficult, but not out of the question if it's my only option.  I just can't figure out whether I missed a step somewhere.

Server is started with:

/opt/mongodb/bin/mongod --fork --logpath /var/log/mongodb.log --logappend

Any help or guidance would be greatly appreciated.

Thanks,

Troy Davisson

Karl Seguin

Oct 6, 2011, 12:41:54 AM
to mongod...@googlegroups.com
You didn't do anything wrong. Compact does not shrink your data files. repairDatabase is the only way to do this.

For large data sets, you should run repairDatabase on the slave; then, once it has completed and the slave has caught up, switch the slave and master.
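To put a number on the space that stays allocated after compact, here is a small sketch using the db.stats() figures posted above. It treats "fileSize" minus ("storageSize" plus "indexSize") as the unused allocation. That is an approximation: it ignores the namespace file and any per-file overhead, and the function name is made up for illustration.

```javascript
// Values copied from the db.stats() output earlier in the thread.
const dbStats = {
  storageSize: 214467101664,
  indexSize: 39519321216,
  fileSize: 594496454656,
};

// Approximate bytes held by the data files but not used by data or indexes.
function unusedAllocation(stats) {
  return stats.fileSize - (stats.storageSize + stats.indexSize);
}

const GiB = 1024 ** 3;
const unused = unusedAllocation(dbStats);
console.log("unused allocation:", (unused / GiB).toFixed(1), "GiB");
```

Compact reduces "storageSize" inside the existing files, while "fileSize" stays where it was, which is why the directory size did not change.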

Karl

Troy Davisson

Oct 6, 2011, 9:56:51 AM
to mongod...@googlegroups.com
Thanks Karl.  I believe I can see now where the ambiguity came in when reading the compaction documentation.

The website says that repairDatabase requires double the disk space in order to work.  Is that double the current data file size, or double the storageSize reported by Mongo?



Scott Hernandez

Oct 6, 2011, 9:59:26 AM
to mongod...@googlegroups.com
At most, double the currently allocated files, but in practice just
double the data size (plus a file or more per db). It only needs room
to copy all the data to new files, so however much space that takes.
Remember, it compacts as it goes, so it could need much less than the
size of the currently allocated database files.
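
As a back-of-the-envelope version of that estimate, here is a sketch using the db.stats() numbers from the first message. The helper name, the decision to count index size alongside data size, and the single 2047MB file of slack are all assumptions made for illustration; treat the result as a rough planning figure, not a guarantee.

```javascript
// Values copied from the db.stats() output earlier in the thread.
const stats = {
  dataSize: 208071448368,
  indexSize: 39519321216,
  fileSize: 594496454656,
};

const ONE_DATA_FILE = 2047 * 1024 * 1024; // assumed slack: one max-size data file

// Estimate the extra free disk space repairDatabase might need.
function estimateRepairSpace(s, slackBytes = ONE_DATA_FILE) {
  return {
    // Upper bound: a full second copy of the currently allocated files.
    worstCase: s.fileSize,
    // Practical guess: room for the live data and rebuilt indexes, plus slack.
    typical: s.dataSize + s.indexSize + slackBytes,
  };
}

const GiB = 1024 ** 3;
const est = estimateRepairSpace(stats);
console.log("worst case:", (est.worstCase / GiB).toFixed(1), "GiB");
console.log("typical:   ", (est.typical / GiB).toFixed(1), "GiB");
```

The gap between the two figures is the space the deleted documents used to occupy, which the repair does not have to copy.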