Adding replica nodes to a sharded cluster


raylu

Jul 8, 2011, 5:42:28 PM
to mongodb-user
We've been adding a lot of replica nodes to our 1.8.2 cluster as phase
one of a cluster move. The replica sets themselves are part of a
sharded cluster with three configdbs.
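For reference, we add each node from the shell roughly like this (the hostnames and port are placeholders, not our real topology):

```shell
# Connect to the set's primary and add the new member as a secondary.
# Hostnames/port below are placeholders.
mongo --host primary.example.com --eval '
  rs.add("newnode.example.com:27017");
  rs.status().members.forEach(function(m) { print(m.name + " " + m.stateStr); });
'
```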

Adding nodes causes some (but not all) mongos's to fail:

[mpliveappapi01] run: echo show dbs | bin/mongo
[mpliveappapi01] out: MongoDB shell version: 1.8.2
[mpliveappapi01] out: connecting to: test
[mpliveappapi01] out: > show dbs
[mpliveappapi01] out: Fri Jul 8 21:23:36 uncaught exception:
listDatabases failed:{
[mpliveappapi01] out: "assertion" : "DBClientBase::findOne: transport
error: [ip of newly added node]:27017 query: { listDatabases: 1 }",
[mpliveappapi01] out: "assertionCode" : 10276,
[mpliveappapi01] out: "errmsg" : "db assertion failure",
[mpliveappapi01] out: "ok" : 0
[mpliveappapi01] out: }
[mpliveappapi01] out: > bye

All queries on these failed mongos's return similar errors, and they
never reconnect or recover. Bouncing the failed mongos's fixes the
problem (until the next time a replica set is changed). The mongos
logs contain no unusual text, but there are a ton (62K) of null bytes
near the beginning of the file. The end of the log file usually reads:

Fri Jul 8 04:02:15 [mongosMain] connection accepted from
127.0.0.1:8816 #31
Fri Jul 8 04:02:15 [WriteBackListener] WriteBackListener exception :
socket exception
Fri Jul 8 04:02:16 [conn31] MessagingPort recv() errno:104 Connection
reset by peer [ip of the master of the replica set that just got
changed]:27017
Fri Jul 8 04:02:16 [conn31] SocketException: remote: error: 9001
socket exception [1]
Fri Jul 8 04:02:16 [conn31] DBClientCursor::init call() failed
Fri Jul 8 04:02:17 [conn31] end connection 127.0.0.1:8816
Fri Jul 8 04:02:23 [WriteBackListener] WriteBackListener exception :
socket exception
Fri Jul 8 04:02:26 [mongosMain] dbexit: received signal 2 rc:0
received signal 2

The connections from localhost here are me running the "show dbs"
command.

On my latest replica set addition, however, some of the mongos's
failed as usual but one of them hung and did not respond to "kill -2
[pid]". Inspecting the logfile revealed the usual null bytes and:

Backtrace: 0x52f8f5 0x7fb6d2e6eaf0 0x5523de 0x557ec5 0x50454b 0x505e04 0x6a50a0 0x7fb6d39729ca 0x7fb6d2f2170d
prod/bin/mongos(_ZN5mongo17printStackAndExitEi+0x75)[0x52f8f5]
/lib/libc.so.6(+0x33af0)[0x7fb6d2e6eaf0]
prod/bin/mongos(_ZN5mongo17ReplicaSetMonitor8checkAllEv+0x23e)[0x5523de]
prod/bin/mongos(_ZN5mongo24ReplicaSetMonitorWatcher3runEv+0x55)[0x557ec5]
prod/bin/mongos(_ZN5mongo13BackgroundJob7jobBodyEN5boost10shared_ptrINS0_9JobStatusEEE+0x12b)[0x50454b]
prod/bin/mongos(_ZN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf1IvN5mongo13BackgroundJobENS_10shared_ptrINS7_9JobStatusEEEEENS2_5list2INS2_5valueIPS7_EENSD_ISA_EEEEEEE3runEv+0x74)[0x505e04]
prod/bin/mongos(thread_proxy+0x80)[0x6a50a0]
/lib/libpthread.so.0(+0x69ca)[0x7fb6d39729ca]
/lib/libc.so.6(clone+0x6d)[0x7fb6d2f2170d]

Eliot Horowitz

Jul 10, 2011, 8:01:45 PM
to mongod...@googlegroups.com
Have you checked connectivity between the mongos and the replica sets?


raylu

Jul 11, 2011, 2:35:52 PM
to mongodb-user
Yes. After all, they work upon bouncing the mongos without any other
changes in configuration.

-raylu

Greg Studer

Jul 11, 2011, 3:22:43 PM
to mongod...@googlegroups.com
Hmm... wondering if the mongos is somehow seeing the rs in a weird
state...

Are there any messages in the logs of the newly added (secondary?) node
of the replica set, potentially corresponding to the mongos
listdatabases query? And was the backtrace you mention logged before
you hard-killed the mongos, or afterwards?

raylu

Jul 11, 2011, 8:00:57 PM
to mongodb-user
There are entries in the failed mongos' logs like

Fri Jul 8 02:48:49 [ReplicaSetMonitorWatcher] updated set ...

but those are probably unrelated because they happened well before the
failure. I don't see anything with the new IPs of the replica set that
caused the mongos to fail, only a connection reset from the primary of
that set.

How could the backtrace possibly have been logged after a kill -9?

-raylu

Greg Studer

Jul 11, 2011, 10:23:12 PM
to mongod...@googlegroups.com
> but those are probably unrelated because they happened well before the
> failure.
Probably so. Is it possible to increase the verbosity of the log
messages when restarting a mongos?
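Something like this should work at the next restart (each extra "v" adds detail; the configdb list and log path are placeholders for your deployment):

```shell
# Restart the mongos with extra verbosity.
# configdb hosts and log path below are placeholders.
mongos --configdb cfg1:27019,cfg2:27019,cfg3:27019 \
       --logpath /var/log/mongos.log --fork -vvv
```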

The writeback listener exceptions seem indicative of some connectivity
issue though - do they only start after adding the new nodes? I assume
the new replica set nodes are added as secondaries, the primary isn't
changing, and there aren't any failovers while adding. The hang you saw
could also be related, if there was an issue checking the
connections.

> How could the backtrace possibly have been logged after a kill -9?

You make an excellent point.

raylu

Jul 12, 2011, 4:28:10 PM
to mongodb-user
Well, the cluster move is actually done so I would prefer not to
reproduce the situation by adding new nodes to the replica sets. I
think it would be pretty easy to reproduce, though. Just set up a
sharded cluster with replica sets as nodes and add mongod's to the
replica sets. Run a lot of mongos's and some should fail.

Writeback listener exceptions are there for a while after a replica
set gets changed (and before the change that breaks the mongos),
regardless of whether the mongos fails or not. It's a bit hard to
determine if they only start after adding the new nodes, though,
because the mongos has been restarted so many times. While looking
into whether the specific mongos had that issue, I found something
interesting:
http://localhostr.com/download/bCc7tSN/mongos.log
This is a log of a mongos that has been running smoothly since the
cluster move completely finished. Those null bytes are still there.

Yes, the new replica set nodes are added as secondaries, and the
primary does not change (except that it seems to step down briefly
during a reconfig). By "the hang you saw", do you mean the crash?

Greg Studer

Jul 12, 2011, 4:59:13 PM
to mongod...@googlegroups.com
> reconfig). By "the hang you saw", do you mean crash?
I just meant the stack trace you sent in the first email; you
mentioned that most mongos's crashed but one hung.

I'll see if we can reproduce in our test framework...
