I have a MongoDB collection that was using about 556GB of disk space as of yesterday. After removing about 400 million documents (a little over half of the total), I learned that I needed to run the "compact" command on the collection to free up the excess disk space pre-allocated to it, since total disk usage didn't change after the purge.
After a compaction process which took about 12 hours, the system now reports collection stats such as:
> db.stats()
{
    "db" : "retslogs",
    "collections" : 4,
    "objects" : 288709043,
    "avgObjSize" : 720.6959858475926,
    "dataSize" : 208071448368,
    "storageSize" : 214467101664,
    "numExtents" : 103,
    "indexes" : 3,
    "indexSize" : 39519321216,
    "fileSize" : 594496454656,
    "nsSizeMB" : 16,
    "ok" : 1
}
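As a sanity check on those numbers, the gap between the on-disk file size and what the documents and indexes actually occupy can be worked out directly from the stats output (a quick sketch in Node.js; the byte values are copied from above):

```javascript
// Byte values copied from the db.stats() output above.
const fileSize    = 594496454656; // total size of the preallocated data files
const storageSize = 214467101664; // space used for document storage
const indexSize   =  39519321216; // space used by the indexes

// Space sitting inside the files but not occupied by data or indexes.
const unused = fileSize - storageSize - indexSize;
console.log((unused / 1024 ** 3).toFixed(1) + " GiB unused inside the data files"); // 317.1 GiB
```

In other words, roughly 317GiB of the ~556GB on disk is preallocated-but-unused space that compact apparently did not return to the filesystem.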
> db.hits.stats()
{
    "ns" : "retslogs.hits",
    "count" : 288713988,
    "size" : 208074966568,
    "avgObjSize" : 720.6958277615562,
    "storageSize" : 214466995936,
    "numExtents" : 100,
    "nindexes" : 3,
    "lastExtentSize" : 2146426864,
    "paddingFactor" : 1,
    "flags" : 1,
    "totalIndexSize" : 39520588496,
    "indexSizes" : {
        "_id_" : 8443993952,
        "maTechId_1_startDtTm_-1_userName_1" : 21070477680,
        "startDtTm_1_transaction_1" : 10006116864
    },
    "ok" : 1
}
... yet my /data/db directory is still at 556GB and contains files numbered from retslogs.0 all the way to retslogs.280 (which I believe means I have 281 preallocated data files of up to 2GB each).
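The file numbering lines up with that figure. MMAPv1 preallocates data files starting at 64MB and doubles the size up to a 2GB cap (assuming the default allocation, i.e. no --smallfiles). Summing files retslogs.0 through retslogs.280 under that assumption gives roughly the observed total (a quick Node.js sketch):

```javascript
// MMAPv1 preallocates data files at 64MB, doubling each time up to a 2GB cap
// (default behaviour; --smallfiles would change these sizes).
const MB = 1024 * 1024;
let total = 0;
let size = 64 * MB;
for (let n = 0; n <= 280; n++) { // retslogs.0 .. retslogs.280
  total += size;
  if (size < 2048 * MB) size *= 2; // grow until the 2GB cap is reached
}
console.log((total / (1024 * MB)).toFixed(1) + " GiB across 281 files"); // 553.9 GiB
```

So the du figure is consistent with all 281 original data files still being allocated on disk.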
The mongodb.log says "Wed Oct 5 22:01:35 [conn20] compact retslogs.hits end" and the console returned:
> db.hits.runCommand("compact")
{ "ok" : 1 }
... so I have no reason to believe that it quit early before the final cleanup.
Nearly all of the 556GB of data was written under version 1.8.0. I upgraded to 2.0.0 yesterday, prior to the mass delete and compaction. Due to time and space constraints, a full repairDatabase() or dump/restore would be pretty difficult, but not out of the question if it's my only option. I just can't figure out if I missed a step somewhere.
Server is started with:
/opt/mongodb/bin/mongod --fork --logpath /var/log/mongodb.log --logappend
Any help or guidance would be greatly appreciated.
Thanks,
Troy Davisson