Weird behavior in MongoDB with journaling


Birdy

Dec 30, 2011, 4:20:38 PM
to mongodb-user
Hi Guys,

We are using MongoDB v2.0.2 with journaling enabled and 2 shards
configured with 1 mongos and 1 config server. (We are using 10 shard
key values in total, 5 mapped to one shard (1-5) and 5 to the other (6-10).)
We are observing weird behavior: during ongoing insertion, if one of
the shards goes down, mongos starts failing insertions on both
shards (in getLastError we get "socket exception" with code 11002, and ok is 1).
We are using the recommended C driver from the mongo community.

This is the output of the printShardingStatus command:
00301211.Database chunks:
        shard0000  6
        shard0001  5
{ "_sk" : { $minKey : 1 } } -->> { "_sk" : 1 } on : shard0000 { "t" : 6000, "i" : 1 }
{ "_sk" : 1 } -->> { "_sk" : 2 } on : shard0000 { "t" : 1000, "i" : 3 }
{ "_sk" : 2 } -->> { "_sk" : 3 } on : shard0000 { "t" : 1000, "i" : 5 }
{ "_sk" : 3 } -->> { "_sk" : 4 } on : shard0000 { "t" : 1000, "i" : 7 }
{ "_sk" : 4 } -->> { "_sk" : 5 } on : shard0000 { "t" : 1000, "i" : 9 }
{ "_sk" : 5 } -->> { "_sk" : 6 } on : shard0000 { "t" : 1000, "i" : 11 }
{ "_sk" : 6 } -->> { "_sk" : 7 } on : shard0001 { "t" : 2000, "i" : 0 }
{ "_sk" : 7 } -->> { "_sk" : 8 } on : shard0001 { "t" : 3000, "i" : 0 }
{ "_sk" : 8 } -->> { "_sk" : 9 } on : shard0001 { "t" : 4000, "i" : 0 }
{ "_sk" : 9 } -->> { "_sk" : 10 } on : shard0001 { "t" : 5000, "i" : 0 }
{ "_sk" : 10 } -->> { "_sk" : { $maxKey : 1 } } on : shard0001 { "t" : 6000, "i" : 0 }
{ "_id" : "00010112", "partitioned" : true, "primary" : "shard0000" }
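The chunk table above is effectively a range lookup from shard key value to owning shard. A minimal sketch of that routing, assuming exactly the ranges printed above (the helper is hypothetical illustration, not mongos code):

```python
import bisect

# (exclusive upper bound, owning shard) pairs, taken from the chunk
# ranges printed above; the last entry covers everything up to $maxKey.
CHUNKS = [
    (1, "shard0000"), (2, "shard0000"), (3, "shard0000"),
    (4, "shard0000"), (5, "shard0000"), (6, "shard0000"),
    (7, "shard0001"), (8, "shard0001"), (9, "shard0001"),
    (10, "shard0001"), (float("inf"), "shard0001"),
]
BOUNDS = [upper for upper, _ in CHUNKS]

def shard_for(sk):
    """Return the shard owning the chunk whose range contains sk."""
    return CHUNKS[bisect.bisect_right(BOUNDS, sk)][1]
```

With these ranges, keys 1-5 route to shard0000 and keys 6-10 to shard0001, matching the 6/5 chunk counts printed above.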

Please guide us on what could be missing.
PS: We are observing this consistently across different MongoDB versions.
We kill the mongod shard with the "kill -2" command.

Thanks in Advance,
Anurag Berdia

Birdy

Dec 30, 2011, 5:00:27 PM
to mongodb-user
One more thing: "When we stop the application and start it again, the
problem usually still persists."

Thanks.

Eliot Horowitz

Dec 30, 2011, 6:44:12 PM
to mongod...@googlegroups.com
Can you send the mongos log?


Birdy

Dec 31, 2011, 3:27:31 AM
to mongodb-user
Hi Eliot,

These are the mongos logs:

Sat Dec 31 01:19:45 mongos db version v2.0.2, pdfile version 4.5
starting (--help for usage)
Sat Dec 31 01:19:45 git version:
514b122d308928517f5841888ceaa4246a7f18e3
Sat Dec 31 01:19:45 build info: Linux bs-linux64.10gen.cc
2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64
BOOST_LIB_VERSION=1_41
Sat Dec 31 01:19:45 [Balancer] about to contact config servers and
shards
Sat Dec 31 01:19:45 [mongosMain] waiting for connections on port 35000
Sat Dec 31 01:19:45 [websvr] admin web console waiting for connections
on port 36000
Sat Dec 31 01:19:45 [Balancer] config servers and shards contacted
successfully
Sat Dec 31 01:19:45 [Balancer] balancer id: ct-0038:35000 started at
Dec 31 01:19:45
Sat Dec 31 01:19:45 [Balancer] created new distributed lock for
balancer on 192.168.50.174:30000 ( lock timeout : 900000, ping
interval : 30000, process : 0 )
Sat Dec 31 01:19:45 [Balancer] creating WriteBackListener for:
192.168.50.174:30000 serverID: 4efe15d905ec770fdec2e0db
Sat Dec 31 01:19:56 [conn1] created new distributed lock for
00311211.Database on 192.168.50.174:30000 ( lock timeout : 900000,
ping interval : 30000, process : 0 )
Sat Dec 31 01:19:56 [conn1] ChunkManager: time to load chunks for
00311211.Database: 0ms sequenceNumber: 2 version: 6|1
Sat Dec 31 01:19:56 [conn1] creating WriteBackListener for:
192.168.50.174:20000 serverID: 4efe15d905ec770fdec2e0db
Sat Dec 31 01:19:56 [conn1] creating WriteBackListener for:
192.168.50.174:25000 serverID: 4efe15d905ec770fdec2e0db
Sat Dec 31 01:20:17 [WriteBackListener-192.168.50.174:25000]
DBClientCursor::init call() failed
Sat Dec 31 01:20:17 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : DBClientBase::findN: transport error:
192.168.50.174:25000 query: { writebacklisten:
ObjectId('4efe15d905ec770fdec2e0db') }
Sat Dec 31 01:20:18 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:20:20 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:20:23 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:20:24 [conn1] DBClientCursor::init call() failed
Sat Dec 31 01:20:24 [conn1] warning: could have autosplit on
collection: 00311211.Database but: DBClientBase::findN: transport
error: 192.168.50.174:25000 query: { getlasterror: 1 }
Sat Dec 31 01:20:27 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:20:28 [conn1] trying reconnect to 192.168.50.174:25000
Sat Dec 31 01:20:28 [conn1] reconnect 192.168.50.174:25000 failed
couldn't connect to server 192.168.50.174:25000
Sat Dec 31 01:20:28 [conn1] DBException in process: socket exception
Sat Dec 31 01:20:31 [conn1] DBException in process: socket exception
Sat Dec 31 01:20:32 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:20:33 [conn1] DBException in process: socket exception
Sat Dec 31 01:20:36 [conn1] DBException in process: socket exception
Sat Dec 31 01:20:38 [conn1] DBException in process: socket exception
Sat Dec 31 01:20:38 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:20:45 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:20:51 [conn1] DBException in process: socket exception
Sat Dec 31 01:20:53 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:21:02 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:21:12 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:21:23 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:21:24 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:21:26 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Sat Dec 31 01:21:29 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception

Thanks,
Anurag Berdia

Eliot Horowitz

Dec 31, 2011, 5:59:13 PM
to mongod...@googlegroups.com
Nothing too much there.

Can you restart mongos with -vvvv and re-run the test?

Is the test scripted in some way you could send us?

Birdy

Jan 2, 2012, 10:22:42 AM
to mongodb-user
We are just inserting the records one by one, from key 1 to 10. We are
inserting binary data into mongo (if that is of any help).
Here are more logs:

Mon Jan 2 20:45:06 [conn19] Request::process ns: 00020112.$cmd msg id:
26500 attempt: 0
Mon Jan 2 20:45:06 [conn19] single query: 00020112.$cmd
{ getlasterror: 1 } ntoreturn: 1 options : 0
Mon Jan 2 20:45:09 [conn19] Request::process ns: 00020112.Database
msg id:19169 attempt: 0
Mon Jan 2 20:45:09 [conn19] write: 00020112.Database
Mon Jan 2 20:45:09 [conn19] server:shard0000:192.168.50.174:20000
bulk insert 1 documents
Mon Jan 2 20:45:09 [conn19] Request::process ns: 00020112.$cmd msg id:
15724 attempt: 0
Mon Jan 2 20:45:09 [conn19] single query: 00020112.$cmd
{ getlasterror: 1 } ntoreturn: 1 options : 0
Mon Jan 2 20:45:15 [conn14] Socket recv() conn closed?
192.168.50.174:55880
Mon Jan 2 20:45:15 [conn14] SocketException: remote:
192.168.50.174:55880 error: 9001 socket exception [0] server
[192.168.50.174:55880]
Mon Jan 2 20:45:27 [WriteBackListener-192.168.50.174:25000] Socket
recv() conn closed? 192.168.50.174:25000
Mon Jan 2 20:45:27 [WriteBackListener-192.168.50.174:25000]
SocketException: remote: 192.168.50.174:25000 error: 9001 socket
exception [0] server [192.168.50.174:25000]
Mon Jan 2 20:45:27 [WriteBackListener-192.168.50.174:25000]
DBClientCursor::init call() failed
Mon Jan 2 20:45:27 [WriteBackListener-192.168.50.174:25000] User
Assertion: 10276:DBClientBase::findN: transport error:
192.168.50.174:25000 ns: admin.$cmd query: { writebacklisten:
ObjectId('4f01a187cdc03e0a43004529') }
Mon Jan 2 20:45:27 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : DBClientBase::findN: transport error:
192.168.50.174:25000 ns: admin.$cmd query: { writebacklisten:
ObjectId('4f01a187cdc03e0a43004529') }
Mon Jan 2 20:45:28 [WriteBackListener-192.168.50.174:25000] creating
new connection to:192.168.50.174:25000
Mon Jan 2 20:45:28 BackgroundJob starting: ConnectBG
Mon Jan 2 20:45:28 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Mon Jan 2 20:45:30 [WriteBackListener-192.168.50.174:25000] creating
new connection to:192.168.50.174:25000
Mon Jan 2 20:45:30 BackgroundJob starting: ConnectBG
Mon Jan 2 20:45:30 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Mon Jan 2 20:45:31 [WriteBackListener-192.168.50.174:30000]
192.168.50.174:30000 is not a shard node
Mon Jan 2 20:45:31 [PeriodicTask::Runner] task: DBConnectionPool-
cleaner took: 0ms
Mon Jan 2 20:45:31 [PeriodicTask::Runner] task: DBConnectionPool-
cleaner took: 0ms
Mon Jan 2 20:45:31 [Balancer] skipping balancing round because
balancing is disabled
Mon Jan 2 20:45:33 [WriteBackListener-192.168.50.174:25000] creating
new connection to:192.168.50.174:25000
Mon Jan 2 20:45:33 BackgroundJob starting: ConnectBG
Mon Jan 2 20:45:33 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Mon Jan 2 20:45:37 [WriteBackListener-192.168.50.174:25000] creating
new connection to:192.168.50.174:25000
Mon Jan 2 20:45:37 BackgroundJob starting: ConnectBG
Mon Jan 2 20:45:37 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Mon Jan 2 20:45:38 [conn19] Request::process ns: 00020112.Database
msg id:11478 attempt: 0
Mon Jan 2 20:45:38 [conn19] write: 00020112.Database
Mon Jan 2 20:45:38 [conn19] server:shard0000:192.168.50.174:20000
bulk insert 1 documents
Mon Jan 2 20:45:39 [conn19] Request::process ns: 00020112.$cmd msg id:
29358 attempt: 0
Mon Jan 2 20:45:39 [conn19] single query: 00020112.$cmd
{ getlasterror: 1 } ntoreturn: 1 options : 0
Mon Jan 2 20:45:42 [conn19] Request::process ns: 00020112.Database
msg id:26962 attempt: 0
Mon Jan 2 20:45:42 [conn19] write: 00020112.Database
Mon Jan 2 20:45:42 [conn19] server:shard0000:192.168.50.174:20000
bulk insert 1 documents
Mon Jan 2 20:45:42 [conn19] Socket recv() conn closed?
192.168.50.174:25000
Mon Jan 2 20:45:42 [conn19] SocketException: remote:
192.168.50.174:25000 error: 9001 socket exception [0] server
[192.168.50.174:25000]
Mon Jan 2 20:45:42 [conn19] DBClientCursor::init call() failed
Mon Jan 2 20:45:42 [conn19] User Assertion:
10276:DBClientBase::findN: transport error: 192.168.50.174:25000 ns:
admin.$cmd query: { getlasterror: 1 }
Mon Jan 2 20:45:42 [conn19] warning: could have autosplit on
collection: 00020112.Database but: DBClientBase::findN: transport
error: 192.168.50.174:25000 ns: admin.$cmd query: { getlasterror: 1 }
Mon Jan 2 20:45:42 [conn19] Request::process ns: 00020112.$cmd msg id:
24464 attempt: 0
Mon Jan 2 20:45:42 [conn19] single query: 00020112.$cmd
{ getlasterror: 1 } ntoreturn: 1 options : 0
Mon Jan 2 20:45:42 [WriteBackListener-192.168.50.174:25000] creating
new connection to:192.168.50.174:25000
Mon Jan 2 20:45:42 BackgroundJob starting: ConnectBG
Mon Jan 2 20:45:42 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Mon Jan 2 20:45:44 [conn19] Request::process ns: 00020112.Database
msg id:5705 attempt: 0
Mon Jan 2 20:45:44 [conn19] write: 00020112.Database
Mon Jan 2 20:45:44 [conn19] server:shard0001:192.168.50.174:25000
bulk insert 1 documents
Mon Jan 2 20:45:44 [conn19] trying reconnect to 192.168.50.174:25000
Mon Jan 2 20:45:44 BackgroundJob starting: ConnectBG
Mon Jan 2 20:45:44 [conn19] reconnect 192.168.50.174:25000 failed
couldn't connect to server 192.168.50.174:25000
Mon Jan 2 20:45:44 [conn19] DBException in process: socket exception
Mon Jan 2 20:45:45 [conn19] Request::process ns: 00020112.$cmd msg id:
28145 attempt: 0
Mon Jan 2 20:45:45 [conn19] single query: 00020112.$cmd
{ getlasterror: 1 } ntoreturn: 1 options : 0
Mon Jan 2 20:45:48 [WriteBackListener-192.168.50.174:25000] creating
new connection to:192.168.50.174:25000
Mon Jan 2 20:45:48 BackgroundJob starting: ConnectBG
Mon Jan 2 20:45:48 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception
Mon Jan 2 20:45:55 [WriteBackListener-192.168.50.174:25000] creating
new connection to:192.168.50.174:25000
Mon Jan 2 20:45:55 BackgroundJob starting: ConnectBG
Mon Jan 2 20:45:55 [WriteBackListener-192.168.50.174:25000]
WriteBackListener exception : socket exception

Thanks,
Anurag Berdia

Birdy

Jan 3, 2012, 10:22:07 AM
to mongodb-user
Hi,

Any updates on this? This is a very critical issue for us.
We are planning to use mongo for our use case in production.
Can you please update us on this?

Thanks,
Anurag Berdia

Mathias Stearn

Jan 3, 2012, 2:40:15 PM
to mongod...@googlegroups.com
Are you sure that the writes are not going through? In my testing, the writes go through, but getLastError complains about a socket exception. After doing inserts, if I do a find with the QUERYOPTION_PARTIAL (1<<7) bit set in the flags, I am able to see the newly inserted objects on the shard that is still up.
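For context on the flag mentioned above: QUERYOPTION_PARTIAL is bit 7 of the flags word in an OP_QUERY wire-protocol message, and it tells mongos to return whatever the reachable shards have instead of erroring out when a shard is down. A minimal sketch of composing such a flags word (the helper is our illustration, not a driver API):

```python
# OP_QUERY wire-protocol flag bits (bit 2 is SlaveOk, bit 7 is Partial).
QUERYOPTION_SLAVEOK = 1 << 2
QUERYOPTION_PARTIAL = 1 << 7  # == 128

def query_flags(partial=False, slave_ok=False):
    """Compose the flags word for an OP_QUERY message."""
    flags = 0
    if slave_ok:
        flags |= QUERYOPTION_SLAVEOK
    if partial:
        flags |= QUERYOPTION_PARTIAL
    return flags
```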

That said, we highly recommend using a replica set for each shard in production. The members of each replica set should be distributed so that there is a low likelihood of all members of a shard going down unless everything goes down. E.g., if you have two racks (or data centers), each shard should have one member in each rack, rather than putting all members of one shard on a single rack.
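The rack placement described above can be sketched as a simple round-robin assignment (shard and rack names here are hypothetical; this is an illustration of the placement policy, not a MongoDB API):

```python
def place_members(shards, racks, members_per_shard):
    """Assign each shard's replica-set members to racks round-robin,
    so no shard ends up with all its members in one rack (as long as
    there are at least as many racks as members per shard)."""
    placement = {}
    for s, shard in enumerate(shards):
        placement[shard] = [
            (m, racks[(s + m) % len(racks)]) for m in range(members_per_shard)
        ]
    return placement

# Two shards, two racks, two members per shard: each shard spans both racks.
layout = place_members(["shard0000", "shard0001"], ["rackA", "rackB"], 2)
```

Losing one rack then takes at most one member of each shard, so every shard stays available.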

Birdy

Jan 4, 2012, 7:02:00 AM
to mongodb-user
Hi Mathias,
Thanks for the reply,

Let me explain my complete setup:
I have two disk arrays for storage, on which the mongo shards will
write data.
We are planning to use 10 shards, with 1 mongos and 3 config DBs
running on server1.
Our plan is to use shard key values 1-20 to distribute data across
these shards; for example, keys 1-2 on shard1, keys 3-4 on shard2, and so on.
We also plan to run the same 10 shards, with 1 mongos and 3 config
DBs, on server2 (for server1 failover). Once server1 goes down, the
mongos and mongods on server2 will start. (Note: both servers use the
same data store, i.e. the same volume on the disk array.)

Is this a good MongoDB usage?

PS: We wanted to avoid keeping two copies of the data; that's why we
dropped replica sets from our deployment. Is it possible to use a
replica set and have only one copy of the data on the disk array?

Thanks,
Anurag Berdia

Birdy

Jan 4, 2012, 8:00:28 AM
to mongodb-user
Also, in our case mongos is actually writing the data to the shard
that is up (when one shard is down), but getLastError() returns a
socket exception. When we get this exception from getLastError, our
driver declares the shard down and the write failed, so the
application retries inserting the same object into MongoDB.
Isn't this how we should handle it? Do we need to do a find to verify
whether the shard is actually down?
I believe getLastError() from mongos should return success if the
object was inserted into a shard that is up.
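The application-side handling described above can be sketched as follows (a simulation with a hypothetical flaky insert function, not driver code). One caveat: if the write actually went through despite the socket exception, as Mathias's test above suggests, blindly re-inserting the same document can create duplicates; inserting with a fixed _id makes the retry safer, since a duplicate lands on a duplicate-key error instead of a second copy.

```python
class SocketError(Exception):
    """Stand-in for the driver's "socket exception" error."""

def insert_with_retry(insert_fn, doc, max_attempts=3):
    """Call insert_fn(doc); on a socket error, retry up to max_attempts
    times. Returns the attempt number that succeeded."""
    for attempt in range(1, max_attempts + 1):
        try:
            insert_fn(doc)
            return attempt
        except SocketError:
            if attempt == max_attempts:
                raise

# Simulated flaky shard: the first insert dies with a socket error,
# the second succeeds (mirroring the getLastError behavior above).
calls = {"n": 0}
def flaky_insert(doc):
    calls["n"] += 1
    if calls["n"] == 1:
        raise SocketError("socket exception with code 11002")

attempts = insert_with_retry(flaky_insert, {"_id": 1, "_sk": 7})
```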

Thanks,
Anurag Berdia


Birdy

Jan 6, 2012, 1:53:55 AM
to mongodb-user
Any news on this?

Thanks,
Anurag

Mathias Stearn

Jan 17, 2012, 5:26:04 PM
to mongod...@googlegroups.com
It looks like you ran into https://jira.mongodb.org/browse/SERVER-3763. Please watch that issue for updates.

As for your configuration: while it might work, it isn't exactly recommended. The ideal MongoDB deployment uses local storage on each node. *If* you need to use non-local storage for some reason, I still wouldn't recommend having replicas share storage. The idea behind replica sets is that member failures should be completely independent, so it is unlikely that you will lose all members at the same time. Using the same shared storage introduces a single point of failure (SPOF) that can significantly reduce your availability and durability.