Hi there!
I'm using MongoDB 2.0.5 with Java driver 2.7.3.
The current configuration is two shards with three replicas each.
Mongos is deployed locally.
When I run a long query in Java (a full scan over 3000000 documents, 2-3 kilobytes each), the process fails after about an hour with this exception:
[2012-07-18 17:49:06,787] ERROR [main] { "$err" : "getMore: cursor didn't exist on server, possible restart or timeout?" , "code" : 13127}
No heavyweight operations are performed inside the iteration loop.
Here are the mongos logs:
Wed Jul 18 17:49:05 [conn4] want cursor : 4809402721582786738
Wed Jul 18 17:49:05 [conn4] CursorCache::get id: 4809402721582786738
Wed Jul 18 17:49:06 [conn4] hasMore: 1 sendMore: 1 cursorMore: 1 ntoreturn: 0 num: 526 wouldSendMoreIfHad: 1 id:4809402721582786738 totalSent: 1684483
Wed Jul 18 17:49:06 [conn4] Request::process ns: testdb.load msg id:2986 attempt: 0
Wed Jul 18 17:49:06 [conn4] want cursor : 4809402721582786738
Wed Jul 18 17:49:06 [conn4] CursorCache::get id: 4809402721582786738
Wed Jul 18 17:49:06 [conn4] creating new connection to:mongodb02:27017
Wed Jul 18 17:49:06 BackgroundJob starting: ConnectBG
Wed Jul 18 17:49:06 [conn4] connected connection!
Wed Jul 18 17:49:06 [conn4] scoped connection to mongodb02:27017 not being returned to the pool
Wed Jul 18 17:49:06 [conn4] AssertionException while processing op type : 2005 to : testdb.load :: caused by :: 13127 getMore: cursor didn't exist on server, possible restart or timeout?
Wed Jul 18 17:49:07 [conn4] Socket recv() conn closed? 127.0.0.1:57229
Wed Jul 18 17:49:07 [conn4] SocketException: remote: 127.0.0.1:57229 error: 9001 socket exception [0] server [127.0.0.1:57229]
Wed Jul 18 17:49:07 [conn4] end connection 127.0.0.1:57229
Wed Jul 18 17:49:09 [Balancer] about to acquire distributed lock 'balancer/anton:27000:1342613385:1804289383:
"when" : { "$date" : "Wed Jul 18 17:49:09 2012" },
Wed Jul 18 17:49:10 [Balancer] distributed lock 'balancer/anton:27000:1342613385:1804289383' acquired, ts : 5006bed5d7f351f313cd9658
Wed Jul 18 17:49:10 [Balancer] *** start balancing round
No background jobs other than this balancing round were started while the query was running.
Typical log output from just before the failure, while the query was still working:
Wed Jul 18 17:49:04 [conn4] CursorCache::get id: 4809402721582786738
Wed Jul 18 17:49:04 [conn4] hasMore: 1 sendMore: 1 cursorMore: 1 ntoreturn: 0 num: 539 wouldSendMoreIfHad: 1 id:4809402721582786738 totalSent: 1683451
Wed Jul 18 17:49:04 [conn4] Request::process ns: testdb.load msg id:2984 attempt: 0
Wed Jul 18 17:49:04 [conn4] want cursor : 4809402721582786738
Wed Jul 18 17:49:04 [conn4] CursorCache::get id: 4809402721582786738
Wed Jul 18 17:49:04 [ReplicaSetMonitorWatcher] checking replica set: rset01
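To make the getMore lines above easier to read, here is a minimal sketch in plain Java (no driver needed) that pulls the batch size, cursor id and running total out of one mongos debug line. The class and method names are made up for illustration, and the field meanings (num as documents in the batch, totalSent as documents sent so far for this cursor) are my reading of the log, not something I have confirmed in the mongos documentation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the MongoDB driver.
public class GetMoreLogLine {
    // Captures the num, id and totalSent fields of a mongos "hasMore: ..." line.
    private static final Pattern FIELDS =
            Pattern.compile("num: (\\d+) .*id:(\\d+) totalSent: (\\d+)");

    final int num;        // assumed: documents returned in this batch
    final long cursorId;  // cursor id as logged by mongos
    final long totalSent; // assumed: documents sent so far for this cursor

    GetMoreLogLine(int num, long cursorId, long totalSent) {
        this.num = num;
        this.cursorId = cursorId;
        this.totalSent = totalSent;
    }

    // Returns null when the line is not a "hasMore: ..." line.
    static GetMoreLogLine parse(String line) {
        Matcher m = FIELDS.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new GetMoreLogLine(
                Integer.parseInt(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)));
    }

    public static void main(String[] args) {
        String line = "Wed Jul 18 17:49:06 [conn4] hasMore: 1 sendMore: 1 cursorMore: 1 "
                + "ntoreturn: 0 num: 526 wouldSendMoreIfHad: 1 "
                + "id:4809402721582786738 totalSent: 1684483";
        GetMoreLogLine parsed = parse(line);
        // 3000000 is the approximate collection size quoted in this post.
        long remaining = 3000000L - parsed.totalSent;
        System.out.println("cursor " + parsed.cursorId
                + " batch=" + parsed.num
                + " sent=" + parsed.totalSent
                + " remaining~" + remaining);
    }
}
```

Run against the last successful line before the exception, this reports 1684483 documents sent, so roughly 1.3 million of the approximately 3000000 documents were still unread when the cursor disappeared.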