Kyle, thanks for helping me out :)
See below; I have copied what I could find.
On Oct 12, 8:18 pm, Kyle Banker <k...@10gen.com> wrote:
> Bo,
>
> We haven't seen the sort of instability you're describing. It seems like the
> individual nodes are going down due to networking issues. Is that your
> suspicion? We'd like to help. Can you provide
> - MongoDB version
db version v1.6.3, pdfile version 4.5
> - ReplSet config object
query local.system.replset
{ "_id" : "setname",
  "version" : 1,
  "members" : [
    { "_id" : 0, "host" : "hostname1.somedomain.com" },
    { "_id" : 1, "host" : "hostname2.somedomain.com", "arbiterOnly" : true },
    { "_id" : 2, "host" : "hostname3.somedomain.com" } ] }
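In case it helps when reading the config, here is a minimal Python sketch that summarizes a config document shaped like the one above. The helper name summarize_replset is hypothetical (it is not part of any MongoDB driver), and the majority calculation assumes every member keeps the default of one vote.

```python
# Sketch only: summarize a replica set config document.
# summarize_replset is a hypothetical helper, not a MongoDB API.

def summarize_replset(config):
    members = config["members"]
    arbiters = [m["host"] for m in members if m.get("arbiterOnly")]
    data_nodes = [m["host"] for m in members if not m.get("arbiterOnly")]
    # Assumption: each member has one vote, so electing a primary
    # needs a strict majority of the member count.
    majority = len(members) // 2 + 1
    return {
        "set": config["_id"],
        "data_nodes": data_nodes,
        "arbiters": arbiters,
        "majority": majority,
        # Members that can be unreachable while a primary can still be elected.
        "tolerated_failures": len(members) - majority,
    }


config = {
    "_id": "setname",
    "version": 1,
    "members": [
        {"_id": 0, "host": "hostname1.somedomain.com"},
        {"_id": 1, "host": "hostname2.somedomain.com", "arbiterOnly": True},
        {"_id": 2, "host": "hostname3.somedomain.com"},
    ],
}

summary = summarize_replset(config)
print(summary)
```

Under that assumption, a three-member set needs two reachable members to elect a primary, so it can only ride out one member being unreachable at a time.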
> - Relevant log files
Log from Primary (showing only the last line, which suggests a sudden
interruption):
Tue Oct 12 01:49:18 [conn8] getmore local.oplog.rs cid:8832731092773422124
getMore: { ts: { $gte: new Date(5526531326833852417) } } bytes:20 nreturned:0 10387ms
Log from Secondary:
Mon Oct 11 12:21:33 MongoDB starting : pid=8843 port=27017
dbpath=/data/db/ 64-bit
Mon Oct 11 12:21:33 db version v1.6.3, pdfile version 4.5
Mon Oct 11 12:21:33 git version:
278bd2ac2f2efbee556f32c13c1b6803224d1c01
Mon Oct 11 12:21:33 sys info: Linux domU-12-31-39-06-79-A1
2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64
BOOST_LIB_VERSION=1_41
Mon Oct 11 12:21:36 [initandlisten] waiting for connections on port
27017
Mon Oct 11 12:21:36 [startReplSets] replSet can't get
local.system.replset config from self or any seed (yet)
Mon Oct 11 12:21:39 [websvr] web admin interface listening on port
28017
Mon Oct 11 12:21:39 [initandlisten] connection accepted from
so.me.ip.number1:54564 #1
Mon Oct 11 12:21:40 [initandlisten] connection accepted from
so.me.ip.number2:51843 #2
Mon Oct 11 12:21:46 [initandlisten] connection accepted from
so.me.ip.number3:41076 #3
Mon Oct 11 12:21:46 [startReplSets] replSet STARTUP2
Mon Oct 11 12:21:46 [rs Manager] replSet can't see a majority, will
not try to elect self
Mon Oct 11 12:21:48 [ReplSetHealthPollTask] replSet info
hostname1.somedomain.com is now up
Mon Oct 11 12:21:48 [ReplSetHealthPollTask] replSet
hostname1.somedomain.com ARBITER
Mon Oct 11 12:21:48 [ReplSetHealthPollTask] replSet info
hostname3.somedomain.com is now up
Mon Oct 11 12:21:48 [ReplSetHealthPollTask] replSet
hostname3.somedomain.com PRIMARY
Mon Oct 11 12:21:48 [rs Manager] replSet info electSelf 2
Mon Oct 11 12:21:48 [rs Manager] replSet PRIMARY
Mon Oct 11 12:21:48 [rs_sync] replSet SECONDARY
Mon Oct 11 12:36:47 [conn2] end connection so.me.ip.number2:51843
Mon Oct 11 12:36:47 [rs_sync] replSet syncThread: 10278 dbclient error
communicating with server
Mon Oct 11 12:36:48 [ReplSetHealthPollTask] replSet info
hostname3.somedomain.com is now down (or slow to respond)
Mon Oct 11 12:36:48 [rs Manager] replSet info electSelf 2
Mon Oct 11 12:36:48 [rs Manager] replSet PRIMARY
Mon Oct 11 12:39:09 [ReplSetHealthPollTask] replSet info
hostname3.somedomain.com is now up
Mon Oct 11 12:39:09 [ReplSetHealthPollTask] replSet
hostname3.somedomain.com STARTUP2
Mon Oct 11 12:39:09 [initandlisten] connection accepted from
so.me.ip.number2:48926 #4
Mon Oct 11 12:39:11 [ReplSetHealthPollTask] replSet
hostname3.somedomain.com RECOVERING
Mon Oct 11 12:39:11 [initandlisten] connection accepted from
so.me.ip.number2:48943 #5
Mon Oct 11 12:39:11 [conn5] query local.oplog.rs ntoreturn:1 reslen:115
nscanned:1 {} nreturned:1 150ms
Mon Oct 11 12:39:13 [conn5] query local.oplog.rs reslen:115 nscanned:1
{ ts: { $gte: new Date(5526531326833852417) } } nreturned:1 2285ms
Mon Oct 11 12:39:15 [ReplSetHealthPollTask] replSet
hostname3.somedomain.com SECONDARY
Mon Oct 11 12:39:15 [slaveTracking] building new index on { _id: 1 }
for local.slaves
Mon Oct 11 12:39:15 [slaveTracking] done for 0 records 0.132secs
Mon Oct 11 12:39:15 [slaveTracking] update local.slaves query: { _id:
ObjectId('4cb367df19f53a593c506823'), host: "so.me.ip.number2", ns:
"local.oplog.rs" } 506ms
Mon Oct 11 12:39:22 [conn5] getmore local.oplog.rs cid:5696201091655256647
getMore: { ts: { $gte: new Date(5526531326833852417) } } bytes:20 nreturned:0 9030ms
Mon Oct 11 12:39:29 [conn5] getmore local.oplog.rs cid:5696201091655256647
getMore: { ts: { $gte: new Date(5526531326833852417) } } bytes:20 nreturned:0 6315ms
Mon Oct 11 12:39:35 [conn5] getmore local.oplog.rs cid:5696201091655256647
getMore: { ts: { $gte: new Date(5526531326833852417) } } bytes:20 nreturned:0 6209ms
Mon Oct 11 12:39:41 [conn5] getmore local.oplog.rs cid:5696201091655256647
getMore: { ts: { $gte: new Date(5526531326833852417) } } bytes:20 nreturned:0 6307ms
Mon Oct 11 12:39:47 [conn5] getmore local.oplog.rs cid:5696201091655256647
getMore: { ts: { $gte: new Date(5526531326833852417) } } bytes:20 nreturned:0 5854ms
Mon Oct 11 12:39:49 [initandlisten] connection accepted from
so.me.ip.number3:40229 #6
Mon Oct 11 12:39:54 [conn5] getmore local.oplog.rs cid:5696201091655256647
getMore: { ts: { $gte: new Date(5526531326833852417) } } bytes:20 nreturned:0 6526ms
Mon Oct 11 12:40:00 [conn5] getmore local.oplog.rs cid:5696201091655256647
getMore: { ts: { $gte: new Date(5526531326833852417) } } bytes:20 nreturned:0 6655ms
Mon Oct 11 12:40:07 [conn5] getmore local.oplog.rs cid:5696201091655256647
getMore: { ts: { $gte: new Date(5526531326833852417) } } bytes:20 nreturned:0 6463ms
Mon Oct 11 12:40:13 [conn5] getmore local.oplog.rs cid:5696201091655256647
getMore: { ts: { $gte: new Date(5526531326833852417) } } bytes:20 nreturned:0 6587ms
Mon Oct 11 12:40:15 [conn6] replSet RECOVERING
Mon Oct 11 12:40:15 [conn6] replSet info stepped down as primary
Mon Oct 11 12:40:16 [conn4] replSet info voting yea for 0
Mon Oct 11 12:40:17 [ReplSetHealthPollTask] replSet
hostname3.somedomain.com PRIMARY
Mon Oct 11 12:40:17 [rs_sync] replSet SECONDARY
Mon Oct 11 12:40:19 [conn5] getmore local.oplog.rs cid:5696201091655256647
getMore: { ts: { $gte: new Date(5526531326833852417) } } bytes:20 nreturned:0 5834ms
Mon Oct 11 12:40:19 [conn5] end connection so.me.ip.number2:48943
Mon Oct 11 15:04:58 [conn6] end connection so.me.ip.number3:40229
Mon Oct 11 15:54:26 [initandlisten] connection accepted from
so.me.ip.number1:54089 #7
Mon Oct 11 15:54:31 [conn1] end connection so.me.ip.number1:54564
Mon Oct 11 15:54:34 [ReplSetHealthPollTask] replSet info
hostname1.somedomain.com is now down (or slow to respond)
Mon Oct 11 15:54:36 [ReplSetHealthPollTask] replSet info
hostname1.somedomain.com is now up
Tue Oct 12 02:13:32 [ReplSetHealthPollTask] MessagingPort recv()
remote dead so.me.ip.number2:27017
Tue Oct 12 02:13:32 [ReplSetHealthPollTask] SocketException: 9001
socket exception
Tue Oct 12 02:13:32 [ReplSetHealthPollTask] replSet info
hostname3.somedomain.com is now down (or slow to respond)
Tue Oct 12 02:13:32 [rs Manager] replSet info electSelf 2
Tue Oct 12 02:13:32 [rs Manager] replSet PRIMARY
Tue Oct 12 03:49:18 [rs_sync] MessagingPort recv() errno:104
Connection reset by peer so.me.ip.number2:27017
Tue Oct 12 03:49:18 [rs_sync] SocketException: 9001 socket exception
Tue Oct 12 03:49:18 [rs_sync] MessagingPort flush send() errno:32
Broken pipe so.me.ip.number2:27017
Tue Oct 12 03:49:18 [rs_sync] caught exception (socket exception) in
destructor (~PiggyBackData)
Tue Oct 12 03:49:18 [rs_sync] replSet syncThread: 10278 dbclient error
communicating with server
Tue Oct 12 03:49:20 [conn4] end connection so.me.ip.number2:48926
Tue Oct 12 11:14:42 [initandlisten] connection accepted from
so.me.ip.number1:36864 #8
Tue Oct 12 11:14:43 [conn8] end connection so.me.ip.number1:36864
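
To make the up/down transitions in a log like this easier to spot, here is a rough Python sketch that re-joins the email-wrapped lines and pulls out the ReplSetHealthPollTask "is now up" / "is now down" events. The function names (unwrap, health_events) are my own, and the regular expressions are assumptions based only on the line format visible in the log above.

```python
import re

# A log entry starts with a timestamp such as "Mon Oct 11 12:36:48 ".
ENTRY_START = re.compile(r"^\w{3} \w{3} +\d+ \d\d:\d\d:\d\d ")

# Health-poll lines, as they appear in the log above (format assumed from it).
HEALTH_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d) \[ReplSetHealthPollTask\] "
    r"replSet info (?P<host>\S+) is now (?P<status>up|down)"
)


def unwrap(lines):
    """Join wrapped continuation lines back onto the entry they belong to."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def health_events(lines):
    """Return (timestamp, host, status) tuples for each up/down event."""
    events = []
    for entry in unwrap(lines):
        m = HEALTH_RE.match(entry)
        if m:
            events.append((m.group("ts"), m.group("host"), m.group("status")))
    return events


sample = """\
Mon Oct 11 12:36:48 [ReplSetHealthPollTask] replSet info
hostname3.somedomain.com is now down (or slow to respond)
Mon Oct 11 12:36:48 [rs Manager] replSet info electSelf 2
Mon Oct 11 12:39:09 [ReplSetHealthPollTask] replSet info
hostname3.somedomain.com is now up
""".splitlines()

events = health_events(sample)
for ts, host, status in events:
    print(ts, host, status)
```

Running it over the full secondary log should list each time a member was reported down or up, which can then be lined up against the elections.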
>
> Kyle
>
> On Tue, Oct 12, 2010 at 1:42 PM, Sergei Tulentsev <