Mo Cassidy
unread,May 22, 2012, 5:18:19 PM5/22/12Sign in to reply to author
Sign in to forward
You do not have permission to delete messages in this group
Either email addresses are anonymous for this group or you need the view member email addresses permission to view the original message
to mongod...@googlegroups.com
Today, we had the following seemingly spontaneously occur:
Tue May 22 14:51:45 [conn27990] update default.instances query: { [...SNIP...] } idhack:1 upsert:1 124ms
Tue May 22 14:51:47 access violation
Tue May 22 14:52:15 [rsHealthPoll] replSet member iis-man-002:27021 is now in state PRIMARY
Tue May 22 14:52:17 [rsSync] replSet syncing to: iis-man-002:27021
***** SERVER RESTARTED *****
Tue May 22 14:52:19 [initandlisten] MongoDB starting : pid=1960 port=27021 dbpath=C:\SitesData\testing-website\mongodata 64-bit host=iis-man-003
Tue May 22 14:52:19 [initandlisten] db version v2.0.4, pdfile version 4.5
Tue May 22 14:52:19 [initandlisten] git version: 329f3c47fe8136c03392c8f0e548506cb21f8ebf
Tue May 22 14:52:19 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_42
Tue May 22 14:52:19 [initandlisten] options: { dbpath: "C:\SitesData\testing-website\mongodata", logappend: true, logpath: "C:\SitesData\testing-website\mongo.log", port: 27021, quiet: true, replSet: "rsTest", rest: true, service: true }
Tue May 22 14:52:19 [initandlisten] journal dir=C:/SitesData/testing-website/mongodata/journal
Tue May 22 14:52:19 [initandlisten] recover begin
Tue May 22 14:52:19 [initandlisten] recover lsn: 258699461
Tue May 22 14:52:19 [initandlisten] recover C:/SitesData/testing-website/mongodata/journal/j._4
Tue May 22 14:52:20 [initandlisten] recover skipping application of section seq:209344204 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section seq:209382318 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section seq:209420683 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section seq:209458887 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section seq:209497082 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section seq:209535336 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section seq:209573520 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section seq:209611794 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section seq:209649958 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section more...
Tue May 22 14:52:58 [initandlisten] recover C:/SitesData/testing-website/mongodata/journal/j._5
Tue May 22 14:53:17 [initandlisten] recover cleaning up
Tue May 22 14:53:17 [initandlisten] removeJournalFiles
Tue May 22 14:53:17 [initandlisten] recover done
Tue May 22 14:53:18 [initandlisten] waiting for connections on port 27021
Tue May 22 14:53:18 [websvr] admin web console waiting for connections on port 28021
Tue May 22 14:53:19 [rsStart] trying to contact iis-man-002:27021
Tue May 22 14:53:19 [rsHealthPoll] replSet member iis-man-002:27021 is up
Tue May 22 14:53:19 [rsHealthPoll] replSet member iis-man-002:27021 is now in state PRIMARY
Tue May 22 14:53:19 [rsStart] replSet STARTUP2
Tue May 22 14:53:19 [rsHealthPoll] replSet member iis-man-001:27021 is up
Tue May 22 14:53:19 [rsHealthPoll] replSet member iis-man-001:27021 is now in state ARBITER
Tue May 22 14:53:19 [rsSync] replSet SECONDARY
Tue May 22 14:53:23 [rsSync] replSet syncing to: iis-man-002:27021
Tue May 22 14:53:24 access violation
***** SERVER RESTARTED *****
Tue May 22 14:53:25 [initandlisten] MongoDB starting : pid=2356 port=27021 dbpath=C:\SitesData\testing-website\mongodata 64-bit host=iis-man-003
Tue May 22 14:53:25 [initandlisten] db version v2.0.4, pdfile version 4.5
Tue May 22 14:53:25 [initandlisten] git version: 329f3c47fe8136c03392c8f0e548506cb21f8ebf
Tue May 22 14:53:25 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_42
Tue May 22 14:53:25 [initandlisten] options: { dbpath: "C:\SitesData\testing-website\mongodata", logappend: true, logpath: "C:\SitesData\testing-website\mongo.log", port: 27021, quiet: true, replSet: "rsTest", rest: true, service: true }
Tue May 22 14:53:25 [initandlisten] journal dir=C:/SitesData/testing-website/mongodata/journal
Tue May 22 14:53:25 [initandlisten] recover begin
Tue May 22 14:53:25 [initandlisten] recover lsn: 844
Tue May 22 14:53:25 [initandlisten] recover C:/SitesData/testing-website/mongodata/journal/j._0
Tue May 22 14:53:25 [initandlisten] recover cleaning up
Tue May 22 14:53:25 [initandlisten] removeJournalFiles
Tue May 22 14:53:25 [initandlisten] recover done
Tue May 22 14:53:25 [initandlisten] waiting for connections on port 27021
Tue May 22 14:53:25 [websvr] admin web console waiting for connections on port 28021
Tue May 22 14:53:25 [rsStart] replSet STARTUP2
Tue May 22 14:53:25 [rsSync] replSet SECONDARY
Tue May 22 14:53:25 [rsHealthPoll] replSet member iis-man-002:27021 is up
Tue May 22 14:53:25 [rsHealthPoll] replSet member iis-man-002:27021 is now in state PRIMARY
Tue May 22 14:53:25 [rsHealthPoll] replSet member iis-man-001:27021 is up
Tue May 22 14:53:25 [rsHealthPoll] replSet member iis-man-001:27021 is now in state ARBITER
Tue May 22 14:53:30 [rsSync] replSet syncing to: iis-man-002:27021
Tue May 22 14:53:30 access violation
***** SERVER RESTARTED *****
Tue May 22 14:53:30 [initandlisten] MongoDB starting : pid=2268 port=27021 dbpath=C:\SitesData\testing-website\mongodata 64-bit host=iis-man-003
Tue May 22 14:53:30 [initandlisten] db version v2.0.4, pdfile version 4.5
Tue May 22 14:53:30 [initandlisten] git version: 329f3c47fe8136c03392c8f0e548506cb21f8ebf
Tue May 22 14:53:30 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_42
Tue May 22 14:53:30 [initandlisten] options: { dbpath: "C:\SitesData\testing-website\mongodata", logappend: true, logpath: "C:\SitesData\testing-website\mongo.log", port: 27021, quiet: true, replSet: "rsTest", rest: true, service: true }
Tue May 22 14:53:30 [initandlisten] journal dir=C:/SitesData/testing-website/mongodata/journal
Tue May 22 14:53:30 [initandlisten] recover begin
Tue May 22 14:53:30 [initandlisten] info no lsn file in journal/ directory
Tue May 22 14:53:30 [initandlisten] recover lsn: 0
Tue May 22 14:53:30 [initandlisten] recover C:/SitesData/testing-website/mongodata/journal/j._0
Tue May 22 14:53:30 [initandlisten] recover cleaning up
Tue May 22 14:53:30 [initandlisten] removeJournalFiles
Tue May 22 14:53:30 [initandlisten] recover done
Tue May 22 14:53:30 [initandlisten] waiting for connections on port 27021
Tue May 22 14:53:30 [websvr] admin web console waiting for connections on port 28021
Tue May 22 14:53:30 [rsStart] replSet STARTUP2
Tue May 22 14:53:30 [rsSync] replSet SECONDARY
Tue May 22 14:53:30 [rsHealthPoll] replSet member iis-man-002:27021 is up
Tue May 22 14:53:30 [rsHealthPoll] replSet member iis-man-002:27021 is now in state PRIMARY
Tue May 22 14:53:30 [rsHealthPoll] replSet member iis-man-001:27021 is up
Tue May 22 14:53:30 [rsHealthPoll] replSet member iis-man-001:27021 is now in state ARBITER
Tue May 22 14:53:35 [rsSync] replSet syncing to: iis-man-002:27021
Tue May 22 14:53:35 access violation
***** SERVER RESTARTED *****
The last pattern above repeated many times until someone noticed the error in our website logging system.
After seeing that, I manually stopped the service, which gave the following log messages:
***** SERVER RESTARTED *****
Tue May 22 15:20:03 [initandlisten] MongoDB starting : pid=3132 port=27021 dbpath=C:\SitesData\testing-website\mongodata 64-bit host=iis-man-003
Tue May 22 15:20:03 [initandlisten] db version v2.0.4, pdfile version 4.5
Tue May 22 15:20:03 [initandlisten] git version: 329f3c47fe8136c03392c8f0e548506cb21f8ebf
Tue May 22 15:20:03 [initandlisten] build info: windows sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1') BOOST_LIB_VERSION=1_42
Tue May 22 15:20:03 [initandlisten] options: { dbpath: "C:\SitesData\testing-website\mongodata", logappend: true, logpath: "C:\SitesData\testing-website\mongo.log", port: 27021, quiet: true, replSet: "rsTest", rest: true, service: true }
Tue May 22 15:20:03 [initandlisten] journal dir=C:/SitesData/testing-website/mongodata/journal
Tue May 22 15:20:03 [initandlisten] recover begin
Tue May 22 15:20:03 [initandlisten] info no lsn file in journal/ directory
Tue May 22 15:20:03 [initandlisten] recover lsn: 0
Tue May 22 15:20:03 [initandlisten] recover C:/SitesData/testing-website/mongodata/journal/j._0
Tue May 22 15:20:03 [initandlisten] recover cleaning up
Tue May 22 15:20:03 [initandlisten] removeJournalFiles
Tue May 22 15:20:03 [initandlisten] recover done
Tue May 22 15:20:03 [initandlisten] waiting for connections on port 27021
Tue May 22 15:20:03 [websvr] admin web console waiting for connections on port 28021
Tue May 22 15:20:03 [rsStart] replSet STARTUP2
Tue May 22 15:20:03 [rsSync] replSet SECONDARY
Tue May 22 15:20:03 [rsHealthPoll] replSet member iis-man-002:27021 is up
Tue May 22 15:20:03 [rsHealthPoll] replSet member iis-man-002:27021 is now in state PRIMARY
Tue May 22 15:20:03 [rsHealthPoll] replSet member iis-man-001:27021 is up
Tue May 22 15:20:03 [rsHealthPoll] replSet member iis-man-001:27021 is now in state ARBITER
Tue May 22 15:20:04 shutdown: going to close listening sockets...
Tue May 22 15:20:04 closing listening socket: 320
Tue May 22 15:20:04 closing listening socket: 368
Tue May 22 15:20:04 shutdown: going to flush diaglog...
Tue May 22 15:20:04 shutdown: going to close sockets...
Tue May 22 15:20:04 shutdown: waiting for fs preallocator...
Tue May 22 15:20:04 shutdown: lock for final commit...
Tue May 22 15:20:04 shutdown: final commit...
Tue May 22 15:20:04 shutdown: closing all files...
Tue May 22 15:20:04 Assertion failure inShutdown() db\dur.cpp 681
Tue May 22 15:20:04 [websvr] Listener: accept() returns -1 errno:10038 An operation was attempted on something that is not a socket.
Tue May 22 15:20:04 [websvr] select() failure: ret=-1 errno:10038 An operation was attempted on something that is not a socket.
Tue May 22 15:20:04 [initandlisten] Listener: accept() returns -1 errno:10038 An operation was attempted on something that is not a socket.
Tue May 22 15:20:04 [initandlisten] select() failure: ret=-1 errno:10038 An operation was attempted on something that is not a socket.
Tue May 22 15:20:04 [initandlisten] now exiting
Tue May 22 15:20:04 dbexit:
Tue May 22 15:20:04 [initandlisten] shutdown: going to close listening sockets...
Tue May 22 15:20:04 [initandlisten] shutdown: going to flush diaglog...
Tue May 22 15:20:04 [initandlisten] shutdown: going to close sockets...
Tue May 22 15:20:04 [initandlisten] shutdown: waiting for fs preallocator...
Tue May 22 15:20:04 [initandlisten] shutdown: lock for final commit...
Tue May 22 15:20:04 [initandlisten] shutdown: final commit...
Tue May 22 15:20:04 unhandled windows exception
Tue May 22 15:20:04 ec=0xe06d7363
***** SERVER RESTARTED *****
I saw mongod.lock in the data folder, so I renamed that, and tried again, but got the same messages, and saw mongo was automatically restarting over and over very quickly.
I manually tried to start mongo, using the command from HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\MongoDB\ImagePath:
"c:\SitesData\testing-website\mongo\mongod.exe" --quiet --rest --replSet rsTest --dbpath "C:\SitesData\testing-website\mongodata" --port 27021 --logpath "C:\SitesData\testing-website\mongo.log" --logappend --service
But that only gave the following logs messages:
***** SERVER RESTARTED *****
Tue May 22 15:29:04 BackgroundJob starting: DataFileSync
Tue May 22 15:29:04 dbexit:
Tue May 22 15:29:04 shutdown: going to close listening sockets...
Tue May 22 15:29:04 shutdown: going to flush diaglog...
Tue May 22 15:29:04 shutdown: going to close sockets...
Tue May 22 15:29:04 shutdown: waiting for fs preallocator...
Tue May 22 15:29:04 shutdown: lock for final commit...
Tue May 22 15:29:04 shutdown: final commit...
Tue May 22 15:29:04 shutdown: closing all files...
Tue May 22 15:29:04 closeAllFiles() finished
Tue May 22 15:29:04 dbexit: really exiting now
***** SERVER RESTARTED *****
The way this failure occurred prevented failover from happening, because the local mongo instance could be seen, and connected to (slaveok is true), but would immediately shutdown and close the connection, causing our ASP.Net website to keep throwing System.Net.Sockets.SocketException "An existing connection was forcibly closed by the remote host"
I'm not sure what to do from here. I'd like to avoid a full resync, since we only have two non-arbiter members, and this would mean a performance impact while this instance fully resynched with the primary. Does anyone have any advice?