STARTUP2


stever

May 22, 2012, 9:57:48 AM
to mongodb-user
Hi,

I (slightly) botched the upgrade from 1.6 to 2.0.5 on my cluster.
I have a minimal sharded replica set on 4 machines.
One shard came up fine (though the primary isn't really the machine I want), but the other has both replicas stuck in "STARTUP2" with:

"errmsg" : "initial sync need a member to be primary or secondary to do our initial sync"

How can I get one of the machines to take primary?
I want to save the DB, but it would be OK if either server became master (I'm not concerned about losing a few updates...).
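
For reference, here's how I'm inspecting the set; the forced reconfig at the end is just an idea I'm considering, not something I've verified on 2.0:

mongo 192.168.0.160:30000
> rs.status()                          // both data members show stateStr "STARTUP2"
> cfg = rs.conf()                      // current replica-set config
> rs.reconfig(cfg, { force : true })   // 2.0 can force a reconfig; untested here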

Thanks,

Steve

stever

May 22, 2012, 1:12:17 PM
to mongodb-user
PS: I've tried every trick I can think of to get one of the servers to take primary.
I've taken the servers out of replica-set mode and validated that the data is there.
Any time I bring the replica set up, both non-arbiters stay stuck in STARTUP2 with a state of 5.

Any help would be greatly appreciated.

Eliot Horowitz

May 22, 2012, 1:26:25 PM
to mongod...@googlegroups.com
Can you send the full log?
How many nodes are in the set?

stever

May 22, 2012, 1:46:34 PM
to mongodb-user
Hi,

There are 4 servers: two shards, each a replica set of two servers plus an arbiter.

{
    "shards" : [
        {
            "_id" : "sgm1",
            "host" : "sgm1/192.168.0.160:30000,192.168.0.165:30000"
        },
        {
            "_id" : "sgm2",
            "host" : "sgm2/192.168.0.170:30000,192.168.0.175:30000"
        }
    ],
    "ok" : 1
}

{
    "_id" : "sgm1",
    "version" : 188167,
    "members" : [
        {
            "_id" : 0,
            "host" : "192.168.0.160:30000",
            "priority" : 2
        },
        {
            "_id" : 1,
            "host" : "192.168.0.165:30000"
        },
        {
            "_id" : 2,
            "host" : "192.168.0.170:30001",
            "arbiterOnly" : true
        }
    ]
}

{
    "_id" : "sgm2",
    "version" : 1,
    "members" : [
        {
            "_id" : 0,
            "host" : "192.168.0.170:30000"
        },
        {
            "_id" : 1,
            "host" : "192.168.0.175:30000"
        },
        {
            "_id" : 2,
            "host" : "192.168.0.160:30001",
            "arbiterOnly" : true
        }
    ]
}

status on sgm2 is fine

status on sgm1 is:

{
    "set" : "sgm1",
    "date" : ISODate("2012-05-22T17:46:05Z"),
    "myState" : 5,
    "members" : [
        {
            "_id" : 0,
            "name" : "192.168.0.160:30000",
            "health" : 1,
            "state" : 5,
            "stateStr" : "STARTUP2",
            "optime" : { "t" : 0, "i" : 0 },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "192.168.0.165:30000",
            "health" : 1,
            "state" : 5,
            "stateStr" : "STARTUP2",
            "uptime" : 2257,
            "optime" : { "t" : 0, "i" : 0 },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2012-05-22T17:46:05Z"),
            "pingMs" : 0,
            "errmsg" : "initial sync need a member to be primary or secondary to do our initial sync"
        },
        {
            "_id" : 2,
            "name" : "192.168.0.170:30001",
            "health" : 1,
            "state" : 5,
            "stateStr" : "STARTUP2",
            "uptime" : 2639,
            "optime" : { "t" : 0, "i" : 0 },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2012-05-22T17:46:03Z"),
            "pingMs" : 0
        }
    ],
    "ok" : 1
}

Eliot Horowitz

May 22, 2012, 1:49:38 PM
to mongod...@googlegroups.com
Were all 3 nodes in sgm1 upgraded?
Can you send the log from 192.168.0.160:30000?

stever

May 22, 2012, 2:06:04 PM
to mongodb-user
All nodes have been upgraded.

I will send the log...

Eliot Horowitz

May 22, 2012, 3:38:22 PM
to mongod...@googlegroups.com
The log you sent seems to be from a node with an empty dbpath.
Was there a config option lost along the way?

Can you send the other node?

stever

May 22, 2012, 5:24:21 PM
to mongodb-user
Hmm...

It's there in the config: dbpath=/data/mongo/sgm1
I did lose my settings the first time I tried to start up after the upgrade, but I quickly caught that and fixed it. You can see the message in the log about the port conflict too... I have mongod running on port 30000 and mongos running on 27017.
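
For completeness, the mongod config on .160 boils down to this (a sketch; the dbpath and port are verbatim from above, the rest are the usual 2.0-style options and the logpath is just illustrative):

# mongod config on 192.168.0.160
dbpath = /data/mongo/sgm1
port = 30000
replSet = sgm1
journal = true
fork = true
logpath = /var/log/mongo/mongod.log   # illustrative path, not from my actual file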

Steve

stever

May 22, 2012, 6:05:31 PM
to mongodb-user
My servers seem to be creating lots of local.* files. Should I attempt to clean some of those up and restart? I tried something like this at one point, but at start-up the server complained about some of the missing local files. Are there certain ones that need to exist? Again, I'm not really worried about losing some of my data, but I wouldn't want to start empty either.

Steve

Eliot Horowitz

May 22, 2012, 11:51:29 PM
to mongod...@googlegroups.com
Can you send the listing of the dbpath directories on both servers?

stever

May 23, 2012, 12:07:38 AM
to mongodb-user
I just confirmed they are both the same:

dbpath=/data/mongo/sgm1

And this is correct because both servers have separate mounted /data
volumes.

The other shard has both set to:

dbpath=/data/mongo/sgm2

stever

May 23, 2012, 12:10:17 AM
to mongodb-user
On .160:

journal, local.0 through local.25, local.ns, mongod.lock, moveChunk, soulgoal.0, soulgoal.1, soulgoal.ns

On .165:

journal, local.0 through local.25, local.ns, mongod.lock, soulgoal.0, soulgoal.1, soulgoal.ns

Eliot Horowitz

May 23, 2012, 12:10:32 AM
to mongod...@googlegroups.com
I mean what are the contents of the directories?
ls -lah

stever

May 23, 2012, 12:22:34 AM
to mongodb-user
Sorry, you might think English was my second language ;-)
.160:

drwxr-xr-x 4 mongod mongod 4096 May 22 13:01 .
drwxr-xr-x 5 mongod mongod 41 Sep 6 2010 ..
drwxr-xr-x 2 mongod mongod 61 May 22 13:06 journal
-rw------- 1 mongod mongod 67108864 May 22 00:37 local.0
-rw------- 1 mongod mongod 134217728 May 22 00:37 local.1
-rw------- 1 mongod mongod 2146435072 May 22 00:39 local.10
-rw------- 1 mongod mongod 2146435072 May 22 00:39 local.11
-rw------- 1 mongod mongod 2146435072 May 22 00:39 local.12
-rw------- 1 mongod mongod 2146435072 May 22 00:39 local.13
-rw------- 1 mongod mongod 2146435072 May 22 00:40 local.14
-rw------- 1 mongod mongod 2146435072 May 22 00:40 local.15
-rw------- 1 mongod mongod 2146435072 May 22 00:40 local.16
-rw------- 1 mongod mongod 2146435072 May 22 00:40 local.17
-rw------- 1 mongod mongod 2146435072 May 22 00:41 local.18
-rw------- 1 mongod mongod 2146435072 May 22 00:41 local.19
-rw------- 1 mongod mongod 2146435072 May 22 00:37 local.2
-rw------- 1 mongod mongod 2146435072 May 22 00:41 local.20
-rw------- 1 mongod mongod 2146435072 May 22 00:41 local.21
-rw------- 1 mongod mongod 2146435072 May 22 00:42 local.22
-rw------- 1 mongod mongod 2146435072 May 22 00:42 local.23
-rw------- 1 mongod mongod 2146435072 May 22 00:42 local.24
-rw------- 1 mongod mongod 2146435072 May 22 00:42 local.25
-rw------- 1 mongod mongod 2146435072 May 22 00:37 local.3
-rw------- 1 mongod mongod 2146435072 May 22 00:37 local.4
-rw------- 1 mongod mongod 2146435072 May 22 00:37 local.5
-rw------- 1 mongod mongod 2146435072 May 22 00:38 local.6
-rw------- 1 mongod mongod 2146435072 May 22 00:38 local.7
-rw------- 1 mongod mongod 2146435072 May 22 00:38 local.8
-rw------- 1 mongod mongod 2146435072 May 22 00:38 local.9
-rw------- 1 mongod mongod 16777216 May 22 00:37 local.ns
-rwxr-xr-x 1 mongod mongod 6 May 22 13:01 mongod.lock
drwxr-xr-x 5 mongod mongod 85 Sep 6 2010 moveChunk
-rw------- 1 mongod mongod 67108864 Apr 9 2010 soulgoal.0
-rw------- 1 mongod mongod 134217728 Apr 15 2010 soulgoal.1
-rw------- 1 mongod mongod 16777216 Apr 9 2010 soulgoal.ns


.165:

total 49G
drwxr-xr-x 3 mongod mongod 4.0K May 22 13:07 .
drwxr-xr-x 5 mongod mongod 41 Oct 30 2010 ..
drwxr-xr-x 2 mongod mongod 61 May 22 13:08 journal
-rw------- 1 mongod mongod 64M May 22 00:35 local.0
-rw------- 1 mongod mongod 128M May 22 00:35 local.1
-rw------- 1 mongod mongod 2.0G May 22 00:37 local.10
-rw------- 1 mongod mongod 2.0G May 22 00:37 local.11
-rw------- 1 mongod mongod 2.0G May 22 00:38 local.12
-rw------- 1 mongod mongod 2.0G May 22 00:38 local.13
-rw------- 1 mongod mongod 2.0G May 22 00:38 local.14
-rw------- 1 mongod mongod 2.0G May 22 00:38 local.15
-rw------- 1 mongod mongod 2.0G May 22 00:39 local.16
-rw------- 1 mongod mongod 2.0G May 22 00:39 local.17
-rw------- 1 mongod mongod 2.0G May 22 00:39 local.18
-rw------- 1 mongod mongod 2.0G May 22 00:39 local.19
-rw------- 1 mongod mongod 2.0G May 22 00:36 local.2
-rw------- 1 mongod mongod 2.0G May 22 00:40 local.20
-rw------- 1 mongod mongod 2.0G May 22 00:40 local.21
-rw------- 1 mongod mongod 2.0G May 22 00:40 local.22
-rw------- 1 mongod mongod 2.0G May 22 00:40 local.23
-rw------- 1 mongod mongod 2.0G May 22 00:41 local.24
-rw------- 1 mongod mongod 2.0G May 22 00:41 local.25
-rw------- 1 mongod mongod 2.0G May 22 00:36 local.3
-rw------- 1 mongod mongod 2.0G May 22 00:36 local.4
-rw------- 1 mongod mongod 2.0G May 22 00:36 local.5
-rw------- 1 mongod mongod 2.0G May 22 00:36 local.6
-rw------- 1 mongod mongod 2.0G May 22 00:36 local.7
-rw------- 1 mongod mongod 2.0G May 22 00:37 local.8
-rw------- 1 mongod mongod 2.0G May 22 00:37 local.9
-rw------- 1 mongod mongod 16M May 22 00:35 local.ns
-rwxr-xr-x 1 mongod mongod 5 May 22 13:07 mongod.lock
-rw------- 1 mongod mongod 64M Sep 23 2011 soulgoal.0
-rw------- 1 mongod mongod 128M Nov 1 2010 soulgoal.1
-rw------- 1 mongod mongod 16M Sep 23 2011 soulgoal.ns

Eliot Horowitz

May 23, 2012, 1:06:28 AM
to mongod...@googlegroups.com
Can you shut down mongod, copy soulgoal.*, start up a single mongod with no replication, and see what's in there?
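
Something along these lines (a sketch; the scratch dbpath and port 30002 are arbitrary):

# with the replica-set mongod stopped, work on a copy of the data files
mkdir -p /data/check
cp /data/mongo/sgm1/soulgoal.* /data/check/
mongod --dbpath /data/check --port 30002   # note: no --replSet
mongo --port 30002
> use soulgoal
> db.getCollectionNames()   // are all the collections there?
> db.stats()                // do object counts and sizes look sane?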

stever

May 23, 2012, 1:22:41 PM
to mongodb-user
I've done that. The data is there and fine...

Eliot Horowitz

May 24, 2012, 12:58:42 AM
to mongod...@googlegroups.com
Looks like the config is not set up correctly.

To just reset everything, I would:
- back up the data
- pick a node as the new root
- remove all local.* files on all 3 nodes
- on the new root, start mongod and run rs.initiate()
- check that the data is there
- rs.add() the other nodes
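
On the sgm1 shard that might look like this (a sketch; the hostnames and paths come from your earlier posts, so double-check them first):

# on 192.168.0.160, after backing up and removing local.* on all 3 nodes
mongod --dbpath /data/mongo/sgm1 --port 30000 --replSet sgm1
mongo --port 30000
> rs.initiate()                      // comes up as a one-member set and becomes PRIMARY
> use soulgoal
> db.stats()                         // confirm the data survived
> rs.add("192.168.0.165:30000")      // second data member; it will initial-sync
> rs.addArb("192.168.0.170:30001")   // re-add the arbiter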

stever

May 24, 2012, 1:29:53 PM
to mongodb-user
Well, I finally have everything running again as it was before the upgrade. Thanks for your help. It really was too much of a trial-and-error process to get the replica sets working again. After I got the first set working I had trouble bringing up the second set. I had to do the same routine of deleting the local files and shutting down the set and the arbiter. I would bring up the one server stand-alone, but I still had a lot of trouble getting the set to lose the old config.

Now that both sets and shards are up with good status, I still have a bunch of local.* files. Is that normal? It seemed like as soon as I initiated the set it created them. Every set member has 26 of them in its data directory...

total 49G
drwxr-xr-x 4 mongod mongod 4.0K May 24 10:03 .
drwxr-xr-x 5 mongod mongod 41 Sep 6 2010 ..
drwxr-xr-x 2 mongod mongod 27 May 24 10:04 journal
-rw------- 1 mongod mongod 64M May 24 09:58 local.0
-rw------- 1 mongod mongod 128M May 24 09:58 local.1
-rw------- 1 mongod mongod 2.0G May 24 09:59 local.10
-rw------- 1 mongod mongod 2.0G May 24 09:59 local.11
-rw------- 1 mongod mongod 2.0G May 24 10:00 local.12
-rw------- 1 mongod mongod 2.0G May 24 10:00 local.13
-rw------- 1 mongod mongod 2.0G May 24 10:00 local.14
-rw------- 1 mongod mongod 2.0G May 24 10:00 local.15
-rw------- 1 mongod mongod 2.0G May 24 10:01 local.16
-rw------- 1 mongod mongod 2.0G May 24 10:01 local.17
-rw------- 1 mongod mongod 2.0G May 24 10:01 local.18
-rw------- 1 mongod mongod 2.0G May 24 10:01 local.19
-rw------- 1 mongod mongod 2.0G May 24 09:58 local.2
-rw------- 1 mongod mongod 2.0G May 24 10:02 local.20
-rw------- 1 mongod mongod 2.0G May 24 10:02 local.21
-rw------- 1 mongod mongod 2.0G May 24 10:02 local.22
-rw------- 1 mongod mongod 2.0G May 24 10:03 local.23
-rw------- 1 mongod mongod 2.0G May 24 10:03 local.24
-rw------- 1 mongod mongod 2.0G May 24 10:03 local.25
-rw------- 1 mongod mongod 2.0G May 24 09:58 local.3
-rw------- 1 mongod mongod 2.0G May 24 09:58 local.4
-rw------- 1 mongod mongod 2.0G May 24 09:58 local.5
-rw------- 1 mongod mongod 2.0G May 24 09:58 local.6
-rw------- 1 mongod mongod 2.0G May 24 09:58 local.7
-rw------- 1 mongod mongod 2.0G May 24 09:59 local.8
-rw------- 1 mongod mongod 2.0G May 24 09:59 local.9
-rw------- 1 mongod mongod 16M May 24 09:58 local.ns
-rwxr-xr-x 1 mongod mongod 6 May 24 07:38 mongod.lock
-rw------- 1 mongod mongod 64M Apr 9 2010 soulgoal.0
-rw------- 1 mongod mongod 128M Apr 15 2010 soulgoal.1
-rw------- 1 mongod mongod 16M Apr 9 2010 soulgoal.ns
drwxr-xr-x 2 mongod mongod 6 May 24 10:11 _tmp

Eliot Horowitz

May 24, 2012, 1:37:00 PM
to mongod...@googlegroups.com
The local.* files are normal; they hold the replication oplog.
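
You can see what they're used for with db.printReplicationInfo(), which reports the configured oplog size and the window of operations it currently covers. By default mongod sizes the oplog as a fraction of free disk space, which is why you see roughly 50G of local.* files; if that's too much, set oplogSize (in MB) in the config before the set is first initiated. A sketch:

mongo --port 30000
> db.printReplicationInfo()   // configured oplog size, log length start to end

# in the mongod config, before the first rs.initiate() (example value)
oplogSize = 2048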

stever

May 24, 2012, 9:13:50 PM
to mongodb-user
Good to know, thanks.

Steve