ConvertToCapped 29TB - "can't map file memory" when replication is off


Thorn Roby

Feb 8, 2012, 2:05:32 PM
to mongodb-user
I'm trying to convert an existing 4TB collection to a 29TB capped
collection. I started on the replica set secondary, bringing it up
without the replSet argument because I'm not sure whether the table
lock would cause replication problems while it's actively replicating
(I did enlarge the oplog to a size that should cover the roughly 20
hours I expect the conversion to take). However, when I bring up the
(normally) SECONDARY without the replSet argument and attempt the
convertToCapped, I get

db.runCommand({"convertToCapped": "log", size: 29000000000000});
{
"assertion" : "can't map file memory",
"assertionCode" : 10085,
"errmsg" : "db assertion failure",
"ok" : 0
}

and in the logfile I see

couldn't open /data/db/lfs/lfs.1013 errno:24 Too many open files

I see the same error even when just looking at collection stats
(db.log.stats()). Another strange thing is that the VM size in top is
only 181M instead of the expected 8TB. This is MongoDB 2.0.2 running
on 64-bit Linux. I'm not sure why the absence of replication would
cause this. I verified that all datafiles and oplog files are owned by
mongod. I am starting the process with "numactl --interleave=all". The
system has 48GB of RAM.
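
The errno:24 in the log line above is the operating system refusing to hand out another file descriptor; it is not a MongoDB-specific code. A minimal Python sketch, assuming a Unix-like host, that decodes the number and prints the descriptor limits of the process running the script (not of mongod itself, whose limits may differ):

```python
import errno
import os
import resource

# errno value taken from the mongod log line in the post.
code = 24
name = errno.errorcode[code]   # symbolic name for the number
message = os.strerror(code)    # human-readable text from the C library

# Soft and hard limits on open file descriptors for *this* process.
# A daemon started from an init script can have different values.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

print(f"errno {code} = {name}: {message}")
print(f"RLIMIT_NOFILE soft={soft} hard={hard}")
```

If the soft limit printed here is far below the number of datafiles the database needs to open, the mapping failure is what one would expect.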

I tried the conversion on the SECONDARY with replication running, but
I get "not master". I was hoping to run it on the SECONDARY so I could
get the replica set re-synced before running it on the PRIMARY.

Scott Hernandez

Feb 8, 2012, 2:38:01 PM
to mongod...@googlegroups.com
See this for help increasing file handles:
http://www.mongodb.org/display/DOCS/Too+Many+Open+Files


Dwight Merriman

Feb 8, 2012, 3:05:54 PM
to mongodb-user
Right. More than 14.5K file handles will be needed (29000/2, i.e.
29,000 GB divided into 2 GB datafiles).
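
The 29000/2 figure can be reproduced with a short calculation. This is a minimal sketch, assuming every datafile is at the 2 GB size used as the divisor above and ignoring the smaller initial files, journal files, and sockets that also consume descriptors; `estimate_data_files` is a hypothetical helper, not a MongoDB function:

```python
def estimate_data_files(total_bytes, file_bytes=2 * 10**9):
    """Rough count of datafiles needed to hold total_bytes.

    Assumes every file is file_bytes large (2 GB here, matching the
    29000/2 arithmetic in the thread). Uses integer ceiling division.
    """
    return -(-total_bytes // file_bytes)

capped_size = 29000000000000  # size passed to convertToCapped, in bytes
existing_size = 4 * 10**12    # the existing collection, roughly 4TB

new_files = estimate_data_files(capped_size)
old_files = estimate_data_files(existing_size)

print(new_files)              # 14500
print(new_files + old_files)  # 16500, if old and new files are open together
```

Either number is well above a 1024-descriptor default, so the limit would need to be raised with some headroom beyond the estimate.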

Thorn Roby

Feb 8, 2012, 7:17:46 PM
to mongodb-user
I did increase the ulimit for mongod, but it didn't resolve any of the
issues. I'm not sure it was necessary, because I noticed mongod's file
limit was set by default to 1024, yet the process already had over
4000 files open. This is on CentOS 5.4, so maybe ulimit was being
ignored. Anyway, I went ahead and shut down the clients and started
the convertToCapped on the PRIMARY. Am I right in understanding that
the operation will create new datafiles sufficient to provide an
aggregate size equal to the 29TB I specified, or will it create 29TB
in addition to the existing 4TB of datafiles I already have?
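
A limit of 1024 alongside more than 4000 open files suggests the limit being inspected was not the one the running mongod actually had. One way to check is to read the kernel's per-process view. The sketch below assumes a Linux kernel that exposes /proc/<pid>/limits (not verified for CentOS 5.4); the helper names are hypothetical and the sample text is made up to show the file layout:

```python
import os

# Made-up excerpt in the layout of /proc/<pid>/limits; the numbers are
# illustrative only and do not come from the thread.
SAMPLE_LIMITS = """\
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max open files            1024                 4096                 files
"""

def parse_max_open_files(limits_text):
    """Return (soft, hard) as strings; values may be 'unlimited'."""
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            soft, hard = line[len(prefix):].split()[:2]
            return soft, hard
    raise ValueError("no 'Max open files' row found")

def effective_limits(pid):
    """Read the limits the kernel is enforcing for a running process."""
    with open(f"/proc/{pid}/limits") as fh:
        return parse_max_open_files(fh.read())

def open_fd_count(pid):
    """Count descriptors currently open (needs same user or root)."""
    return len(os.listdir(f"/proc/{pid}/fd"))

print(parse_max_open_files(SAMPLE_LIMITS))
```

Calling `effective_limits` and `open_fd_count` with mongod's pid would show whether a raised ulimit actually reached the daemon.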


Scott Hernandez

Feb 8, 2012, 8:09:23 PM
to mongod...@googlegroups.com

In addition. It can't reuse the old space until it is freed, which happens after the copy.
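
Because the old datafiles stay on disk until the copy completes, the peak disk requirement is the sum of both. A minimal sketch of that budget, with hypothetical helper names; the `available` figure is an assumption for illustration, since the thread does not state the volume size:

```python
TB = 10**12

def peak_disk_bytes(existing_bytes, capped_bytes):
    """Disk needed while old and new datafiles coexist during the copy."""
    return existing_bytes + capped_bytes

def max_capped_bytes(available_bytes, existing_bytes):
    """Largest capped size that still fits next to the existing data."""
    return max(available_bytes - existing_bytes, 0)

existing = 4 * TB    # current collection, per the thread
capped = 29 * TB     # requested capped size, per the thread
available = 30 * TB  # hypothetical volume size, not from the thread

peak = peak_disk_bytes(existing, capped)
print(peak // TB)                                   # 33
print(peak <= available)                            # False
print(max_capped_bytes(available, existing) // TB)  # 26
```

Under these assumed numbers the conversion would run out of disk before finishing, so the capped size would have to be chosen with the existing data subtracted from the free space.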

Thorn Roby

Feb 9, 2012, 1:45:49 PM
to mongodb-user
I had to interrupt the convertToCapped operation because it was going
to exceed the available disk. It ran for about 16 hours and would have
needed about 2-3 days to finish the 30TB collection. I ran killOp on
the convert operation, which reported that it was killed, but it kept
creating datafiles. I was unable to force a server shutdown short of a
kill -9. When I brought the server back up (with the secondary down so
as not to replicate the convertToCapped operation), it came up quickly
and did a brief recovery referencing the last datafile from before the
ones added by convertToCapped. However, even bringing it up as part of
a replica set (with the secondary down but the arbiter up) produces
the same symptoms I saw before outside the replica set: only 800GB of
VM is mapped (should be 8000GB) and any db.collection.stats operation
fails with "can't map file memory". Before I ran the convertToCapped,
this happened only when I omitted the replSet argument on startup. Now
it happens always. Since I can't see any collection stats, I can't be
certain whether it knows about the newly created empty datafiles, so I
don't know if it's safe to delete them and retry the convertToCapped
with a smaller size. Also, I can't bring up the secondary within the
replica set, because the only outstanding operation to replicate is
the convertToCapped, which I don't want to replicate. It seems my only
option is to remove the system from the replica set, recreate the
filesystem (rather than delete the existing 20TB of data), and try to
rebuild the replica set from the 4TB datafiles on the remaining
server, which was formerly the secondary.

1. Why am I unable to map all files when outside of the replica set or
when the secondary is not available? This is not a ulimit issue.

2. Is there a way to delete an operation from the oplog (in this case
the convertToCapped) to prevent it from replicating?


Eliot Horowitz

Feb 12, 2012, 1:23:50 AM
to mongod...@googlegroups.com
Can you send the full server log?