Dropping a capped collection doesn't clean up space?


aharbick

Apr 13, 2011, 1:16:41 PM
to mongodb-user
I've got a test suite that, over the course of a run, repeatedly
inserts dozens of documents (between 4k and 20k) into a capped
collection, performs various asserts, then drops the collection. Over
time, as I run the suite, the data files grow and grow and the tests
slow down as disk I/O increases.

If I call db.repairDatabase() the data files get cleaned up, but that
operation slows the test suite down even more. Rather than dropping
the capped collection I could drop the database, as that seems to avoid
this problem as well. However, I don't really want to drop the whole
database, as it has stuff in it that doesn't need to be "reset" in a
test.

Last, I don't seem to see this problem for uncapped collections (I can
repeatedly create collection, insert, drop collection, and repeat
without having the space grow unbounded).

Here's what my data directory looks like:

aharbick:mongo aharbick$ ls -l
total 12480520
drwxrwxr-x 2 aharbick staff 68 Apr 13 12:54 _tmp
drwxrwxr-x 2 aharbick staff 68 Apr 13 12:29 mongo_test
-rw------- 1 aharbick staff 67108864 Apr 13 12:54 mongo_test.0
-rw------- 1 aharbick staff 134217728 Apr 13 12:32 mongo_test.1
-rw------- 1 aharbick staff 268435456 Apr 13 12:54 mongo_test.2
-rw------- 1 aharbick staff 536870912 Apr 13 12:49 mongo_test.3
-rw------- 1 aharbick staff 1073741824 Apr 13 12:54 mongo_test.4
-rw------- 1 aharbick staff 2146435072 Apr 13 12:54 mongo_test.5
-rw------- 1 aharbick staff 2146435072 Apr 13 12:54 mongo_test.6
-rw------- 1 aharbick staff 16777216 Apr 13 12:54 mongo_test.ns
-rwxrwxr-x 1 aharbick staff 6 Apr 13 12:28 mongod.lock

Here's what db.stats() reports (note that the overall data size, ~3MB,
is in a collection that I never drop):

> db.stats()
{
"collections" : 11,
"objects" : 277,
"avgObjSize" : 11505.906137184116,
"dataSize" : 3187136,
"storageSize" : 56830464,
"numExtents" : 20,
"indexes" : 10,
"indexSize" : 81920,
"fileSize" : 6373244928,
"ok" : 1
}
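
As a quick cross-check (a sketch, not part of the original post): the sizes of the numbered data files in the directory listing add up to the fileSize that db.stats() reports, while dataSize is a tiny fraction of it. The numbers are copied from the two outputs above; the variable names are only for illustration.

```ruby
# Sizes (bytes) of mongo_test.0 .. mongo_test.6, copied from the ls -l output.
# The mongo_test.ns file is not included.
data_file_sizes = [
  67_108_864, 134_217_728, 268_435_456, 536_870_912,
  1_073_741_824, 2_146_435_072, 2_146_435_072
]

file_size = 6_373_244_928 # "fileSize" from db.stats()
data_size = 3_187_136     # "dataSize" from db.stats()

total = data_file_sizes.inject(0) { |sum, n| sum + n }
ratio = data_size.to_f / file_size

puts "data files sum to fileSize: #{total == file_size}"
puts "fraction of file space holding data: #{ratio.round(5)}"
```

So the growth being described is in allocated data files, not in live documents.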

Here's what db.capped_collection.stats() reports after dropping the
collection, recreating it, and adding a few documents:
{
"ns" : "mongo_test.capped_collection",
"count" : 9,
"size" : 31500,
"avgObjSize" : 3500,
"storageSize" : 10742784,
"numExtents" : 1,
"nindexes" : 0,
"lastExtentSize" : 10742784,
"paddingFactor" : 1,
"flags" : 0,
"totalIndexSize" : 0,
"indexSizes" : {

},
"capped" : 1,
"max" : 10000,
"ok" : 1
}

Finally, here's in essence the code that I'm running in ruby...

require 'mongo'

conn = Mongo::Connection.new
all_tests_to_run.each do
  conn.db('mongo_test').create_collection('capped_collection',
    {:capped => true, :size => 10*1024*1024, :max => 10000})

  # Setup the collection with up to hundreds of inserts...
  conn.db('mongo_test').collection('capped_collection').save({....})

  # Run code and asserts on that data
  ....

  # Reset the capped collection
  conn.db('mongo_test').collection('capped_collection').drop()
end

Hopefully I'm not missing something obvious.

Thanks for your help!

Andy

Gaetan Voyer-Perrault

Apr 13, 2011, 7:44:13 PM
to mongod...@googlegroups.com
I'm not finding any JIRA issues related to this, though it looks to be trivially reproducible.

What version of MongoDB are you running?



aharbick

Apr 13, 2011, 9:38:54 PM
to mongodb-user
The previous version... I think the client reports 1.6.6.

Andy

Gaetan Voyer-Perrault

Apr 14, 2011, 1:04:24 AM
to mongod...@googlegroups.com
> I've got a test suite that over the course of the run repeatedly inserts dozens of documents (between 4k and 20k) into a capped collection performs various asserts then drops the collection.  Over time as I run the suite the data files grow and grow...

So to be clear here.
 - Dropping the collection will not remove any data files.
 - Writing to a capped collection should not cause the data files to grow. (unless there's an index)

You also mention trying repairDatabase and dropDatabase. Obviously, you don't want to do these things. But, you shouldn't have to.

If you create a capped collection, the space for that collection is pre-allocated.

So the question here is "what's going on"? There are a few possibilities:
 1. This could be a bug, maybe MongoDB is not re-using space that is already allocated.
 2. The tests could be writing to a non-capped collection.

To test #1, I wrote a quick script. I could not find any problems. Tested with 1.6.5 and 1.8.1:
Creating and destroying the capped collection does allow Mongo to re-use the space.

To test #2, you'll have to check this on your end.

Try outputting a db.stats() at the end of your test runs just to ensure that the DB is in the same state at the end of each run. If it's not, then at least you'll have some indication that something is up.
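
One way to act on that suggestion is to diff two stats snapshots rather than eyeball them. This is a minimal sketch, not something from the thread: stats_drift and WATCHED_KEYS are hypothetical names, and the "after" numbers are made up for illustration. In a real run both hashes would come from the driver (for example conn.db('mongo_test').stats, assuming your driver version exposes it).

```ruby
# Hypothetical helper: report which db.stats() fields changed between two runs.
WATCHED_KEYS = %w[dataSize storageSize numExtents fileSize].freeze

def stats_drift(before, after, keys = WATCHED_KEYS)
  keys.each_with_object({}) do |key, drift|
    delta = after[key].to_i - before[key].to_i
    drift[key] = delta unless delta.zero?
  end
end

# "before" uses the db.stats() values posted earlier in the thread;
# "after" is invented purely to exercise the helper.
before = { 'dataSize' => 3_187_136, 'storageSize' => 56_830_464,
           'numExtents' => 20, 'fileSize' => 6_373_244_928 }
after  = before.merge('storageSize' => 67_573_248, 'numExtents' => 21)

drift = stats_drift(before, after)
puts drift.empty? ? 'no drift between runs' : "drift: #{drift.inspect}"
```

An empty result after every run would suggest the DB is returning to the same state; a non-empty one points at the fields to investigate.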

- Gates

aharbick

Apr 14, 2011, 1:53:56 PM
to mongodb-user
Hey Gates,

Thanks for looking deeper!

1. I can confirm that your code does NOT cause a problem on my setup.
2. I found the issue which I think you should be able to repro (at
least on 1.6.5) here: https://gist.github.com/920015

In a nutshell: if you drop a capped collection and recreate it with a
different size (even slightly different) then the original space hangs
around and isn't used.

More details: the way I'd factored my code, I didn't have the original
size that was passed as createCollection's "size" parameter. So when I
needed to recreate (and recap) the collection, I used the storageSize
reported by db.capped.stats() in the new call to
createCollection({size: oldStorageSize}). This meant that my
storageSize grew slightly over time, since mongo usually seemed to
allocate a bit more than I asked for. However, it also meant that the
amount of space consumed on disk grew dramatically. Note this is
independent of inserting any documents.
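
To put a rough number on "a bit more than I asked for" (a sketch under an assumption): if the collection in the stats output posted earlier in the thread had been created with the 10*1024*1024 size from the posted Ruby code, the gap between requested and reported size would be as computed below. The thread does not confirm which size that particular collection was created with, so treat this as illustrative only.

```ruby
requested_size = 10 * 1024 * 1024 # :size from the posted create_collection call
reported_size  = 10_742_784       # "storageSize" from the posted collection stats

# ASSUMPTION: the reported collection was created with requested_size.
overhead = reported_size - requested_size
puts overhead # → 257024
```

Feeding storageSize back in as the next "size" would then request a different, larger size on every recreate, which matches the "different size" condition described above.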

So... I can fix my code, but it's not clear to me if this is a bug or
not. One thing that is a little strange is that it's impossible to
empty a capped collection and recreate it without knowing the original
size passed in.
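
For anyone hitting the same thing, here is one possible shape for the fix on the test-suite side. It is a sketch, not the code from the gist: reset_capped, CAPPED_NAME, CAPPED_OPTS and FakeDB are hypothetical names, and FakeDB only stands in for a Mongo::DB so the example runs without a server. The idea is to keep the original options in one place and never derive the size from stats().

```ruby
# Hypothetical helper: every recreate uses exactly the same, fixed options.
CAPPED_NAME = 'capped_collection'
CAPPED_OPTS = { :capped => true, :size => 10 * 1024 * 1024, :max => 10_000 }.freeze

def reset_capped(db, name = CAPPED_NAME, opts = CAPPED_OPTS)
  db.collection(name).drop             # same drop call as in the posted code
  db.create_collection(name, opts.dup) # original size, not storageSize
end

# Stand-in for a Mongo::DB; it only records the sizes it was asked to create.
class FakeDB
  attr_reader :created_sizes

  def initialize
    @created_sizes = []
  end

  def collection(_name)
    self
  end

  def drop
    true
  end

  def create_collection(_name, opts)
    @created_sizes << opts[:size]
    self
  end
end

db = FakeDB.new
3.times { reset_capped(db) }
puts "distinct sizes requested: #{db.created_sizes.uniq.length}"
```

With a real connection the call would be reset_capped(conn.db('mongo_test')). Whether the server then reuses the freed space is what the earlier test script in this thread observed; the sketch itself only guarantees that the requested size never changes.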

Does that make sense?

Andy