Mongod.exe crashes with "can't open:" on win64 when attempting to rotate logs

jdmerth

Aug 10, 2012, 1:41:59 PM
to mongod...@googlegroups.com
I have a MongoDB replica set running on Windows Azure instances. Every hour, a job rotates the log files using the logRotate command. Occasionally the server exits with the following in the event log:

Faulting application name: mongod.exe, version: 0.0.0.0, time stamp: 0x50065a9a
Faulting module name: mongod.exe, version: 0.0.0.0, time stamp: 0x50065a9a
Exception code: 0x40000015
Fault offset: 0x00000000004a6d71
Faulting process id: 0x820
Faulting application start time: 0x01cd75776f11bf4b
Faulting application path: E:\approot\MongoDBBinaries\bin\mongod.exe
Faulting module path: E:\approot\MongoDBBinaries\bin\mongod.exe
Report Id: 8b069c8e-e30f-11e1-8a93-00155d32377b

and this in the log:

Fri Aug 10 17:01:00 [conn63702] getmore local.oplog.rs query: { ts: { $gte: new Date(5774725551155576833) } } cursorid:112962143165828 ntoreturn:0 keyUpdates:0 locks(micros) r:140 nreturned:1 reslen:2351 62ms
Fri Aug 10 17:01:00 [conn153516] run command admin.$cmd { getlasterror: 1 }
Fri Aug 10 17:01:00 [conn153516] command admin.$cmd command: { getlasterror: 1 } ntoreturn:1 keyUpdates:0  reslen:101 0ms
Fri Aug 10 17:01:00 [conn153516] query tracking.packages query: { ps: 3 } cursorid:86337725880765 ntoreturn:0 keyUpdates:0 locks(micros) r:3483 nreturned:101 reslen:232884 0ms
Fri Aug 10 17:01:00 [slaveTracking] User Assertion: 11000:E11000 duplicate key error index: local.slaves.$_id_  dup key: { : ObjectId('4fdf4701cb24d5830133d832') }
Fri Aug 10 17:01:00 [slaveTracking] update local.slaves query: { _id: ObjectId('502125a623be1afdbbb9238b'), host: "10.28.212.27", ns: "local.oplog.rs" } update: { $set: { syncedTo: Timestamp 1344618060000|4 } } nscanned:1 nupdated:1 keyUpdates:0 locks(micros) w:896 0ms
can't open: 

Here is my configuration:

Fri Aug 10 17:28:30 [initandlisten] MongoDB starting : pid=3284 port=27017 dbpath=F:\data 64-bit host=RD00155D32377B
Fri Aug 10 17:28:30 [initandlisten] db version v2.2.0-rc0, pdfile version 4.5
Fri Aug 10 17:28:30 [initandlisten] git version: 33dc8445316479bbaa062db00f179fa5c39bbddb
Fri Aug 10 17:28:30 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_49
Fri Aug 10 17:28:30 [initandlisten] options: { dbpath: "F:\data", journal: true, logappend: true, logpath: "C:\Resources\directory\3905dd8eed7b42248672809af14e1f65.ReplicaSetRole.MongodLogDir\\mongod.log", port: 27017, quiet: true, replSet: "rs", rest: true, verbose: true }

Has anyone seen this? Is there something I need to be doing before rotating the logs to prevent it?

Thanks!

sridhar

Aug 14, 2012, 4:01:02 AM
to mongod...@googlegroups.com
This does not seem to be directly caused by the log rotation. Can you give us the full logs? If you could open a ticket at jira.mongodb.org under the "Community Private" project, attach the logs, and reference this thread's URL, that would help greatly. Note: tickets in Community Private are visible only to you and MongoDB support.

sridhar

Aug 16, 2012, 3:06:58 AM
to mongod...@googlegroups.com
I notice a "can't open:" at the end of the log. Did you run out of space on the local instance storage? Getting us the full logs would still be extremely useful in diagnosing the problem.
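One quick way to rule disk space in or out is to check the free space on the volume that holds the log directory just before each rotation. The following is only a rough sketch, not part of MongoDB: the function name `free_space_report` and the 1 GiB threshold are assumptions made for illustration, and the path passed in would need to be your actual mongod log directory.

```python
import shutil


def free_space_report(path, min_free_bytes=1 * 1024**3):
    """Return (ok, free_bytes) for the volume that holds `path`.

    `min_free_bytes` is an arbitrary illustrative threshold (1 GiB by
    default); pick a value that suits your log volume.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return usage.free >= min_free_bytes, usage.free


if __name__ == "__main__":
    # Hypothetical path: replace "." with the mongod log directory.
    ok, free = free_space_report(".")
    print(f"free: {free / 1024**2:.0f} MiB, enough: {ok}")
```

Running this from the hourly job and recording the result next to the rotation time would show whether the crashes line up with low free space.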

jdmerth

Aug 16, 2012, 10:32:01 AM
to mongod...@googlegroups.com
Hi sridhar,
Thank you for the response. We have not had issues with disk space. When this happened, the role would come back up after running a recovery and work without issue for a while. Recently I disabled the hourly log rotation logic, and the roles have not had any issues since. Today I manually ran the logRotate command on my primary, which crashed the server with this error from the command line:

rs:PRIMARY> db.runCommand("logRotate")
Thu Aug 16 14:02:10 Socket recv() errno:10054 An existing connection was forcibly closed by the remote host. 127.0.0.1:27017
Thu Aug 16 14:02:10 SocketException: remote: 127.0.0.1:27017 error: 9001 socket exception [1] server [127.0.0.1:27017]
Thu Aug 16 14:02:10 DBClientCursor::init call() failed
Thu Aug 16 14:02:10 query failed : admin.$cmd { logRotate: 1.0 } to: 127.0.0.1:27017
Thu Aug 16 14:02:10 Error: error doing query: failed shell/collection.js:155
Thu Aug 16 14:02:10 trying reconnect to 127.0.0.1:27017
Thu Aug 16 14:02:11 reconnect 127.0.0.1:27017 failed couldn't connect to server

>

As requested, I created a ticket in JIRA with the log info.

Thank you!