Trying to set up replica set, secondary nodes stuck on "still initializing" with error "EMPTYUNREACHABLE" (ec2)

Dan

Aug 30, 2012, 6:13:48 PM
to mongod...@googlegroups.com
I am following this tutorial for setting up a replica set with EC2 instances.

I have a client site currently using MongoDB without a replica set, which I would like to implement.  I threw up 3 micro instances to test out the process of creating a replica set; I'm glad I did, because I'm having trouble.

I've followed through all steps of the tutorial listed above, and the instance I ran rs.initiate() on shows the following for rs.status():

prod_repl:SECONDARY> rs.status()
{
        "set" : "prod_repl",
        "date" : ISODate("2012-08-30T22:07:07Z"),
        "myState" : 2,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "ip-10-xxx-xxx-xxx:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 329,
                        "optime" : Timestamp(1346361547000, 1),
                        "optimeDate" : ISODate("2012-08-30T21:19:07Z"),
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com:27017",
                        "health" : 1,
                        "state" : 6,
                        "stateStr" : "UNKNOWN",
                        "uptime" : 329,
                        "optime" : Timestamp(0, 0),
                        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                        "lastHeartbeat" : ISODate("2012-08-30T22:07:06Z"),
                        "pingMs" : 0,
                        "errmsg" : "still initializing"
                },
                {
                        "_id" : 2,
                        "name" : "ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com:27017",
                        "health" : 1,
                        "state" : 6,
                        "stateStr" : "UNKNOWN",
                        "uptime" : 329,
                        "optime" : Timestamp(0, 0),
                        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                        "lastHeartbeat" : ISODate("2012-08-30T22:07:06Z"),
                        "pingMs" : 1,
                        "errmsg" : "still initializing"
                }
        ],
        "ok" : 1
}

Note that this server was initially saying it was the PRIMARY, but after restarting and trying a few things, it then changed to secondary.

Doing rs.status() on the other two nodes gives the following:

> rs.status()
{
        "startupStatus" : 4,
        "errmsg" : "can't currently get local.system.replset config from self or any seed (EMPTYUNREACHABLE)",
        "ok" : 0
}

I've searched around and tried a number of things, to no avail.  My /etc/mongod.conf file contains the line 'replSet = prod_repl', and this is all running on Ubuntu.

Thanks!

Brian McNamara

Aug 30, 2012, 9:05:51 PM
to mongod...@googlegroups.com
Hi Dan,

Just a few quick thoughts and questions...

1. Are there any exceptions in the secondary servers' mongo.log?

2. Is communication open among the replica set servers' EC2 security groups? Can you telnet from one of the secondary servers to the listen port of the primary? Are all three in the same AWS region?

3. Can you confirm that the hosts' names are still valid and that each server's name can be resolved from every replica set member?

Regards,
Brian

Dan

Aug 30, 2012, 9:29:06 PM
to mongod...@googlegroups.com
Hi Brian

The log is littered with various messages.  Here's a few that seem the most pertinent:

[rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
[rsStart] replSet info you may need to run replSetInitiate -- rs.initiate() in the shell -- if that is not already done
[rsStart] getaddrinfo("ip-xx-xxx-xxx-xxx") failed: Name or service not known
[rsStart] couldn't connect to ip-xx-xxx-xxx-xxx:27017: couldn't connect to server ip-xx-xxx-xxx-xxx:27017

It seems as though the primary is reporting its own internal hostname, which the secondaries then try to resolve.  That hostname (a bare name like ip-xx-xxx-xxx-xxx) is not resolvable from the other nodes, whereas the public DNS name would be.

To answer your other two questions, the hosts are able to connect to each other's databases and are in the same AWS region.  I am using the public DNS names to connect to them.

Trying to fix this, I seem to have backed myself into a corner: I am no longer able to run rs.initiate() on the main server, and the main server now thinks it's a secondary.  I had tried changing the replica set name and creating a new set, and now can't seem to do much of anything.

Thanks,
Dan
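
The log lines in the post above suggest that one member is registered under a short internal name that the other nodes cannot resolve. As a minimal sketch, assuming a config object shaped like the output of rs.conf(), the hypothetical helper below (findShortHostMembers is not a MongoDB API) flags members whose host contains no dot, a rough heuristic for names that only resolve locally. The host names are placeholders.

```javascript
// Hypothetical helper, not a MongoDB API: list members whose host name
// looks like a short, locally resolvable name (no dot before the port).
function findShortHostMembers(config) {
  return config.members.filter(function (member) {
    var host = member.host.split(":")[0]; // drop the ":port" suffix
    return host.indexOf(".") === -1;      // heuristic: no domain part
  });
}

// Example config with placeholder host names (assumed for illustration).
var exampleConfig = {
  _id: "prod_repl",
  members: [
    { _id: 0, host: "ip-10-0-0-1:27017" },
    { _id: 1, host: "ec2-203-0-113-10.compute-1.amazonaws.com:27017" }
  ]
};

// Members that the other nodes are unlikely to be able to resolve.
var suspects = findShortHostMembers(exampleConfig);
```

In the mongo shell the same function could be called as findShortHostMembers(rs.conf()) on the node that holds the config.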

Brian McNamara

Aug 31, 2012, 8:34:11 AM
to mongod...@googlegroups.com
Hi Dan,

Did the 3 servers have Elastic IP addresses? If not, the public DNS name would have changed unless you rebooted an EBS-backed instance.

Can you recreate the replica set with instances that have EIPs and see whether the behavior changes?

Regards,
Brian

Dan

Aug 31, 2012, 9:46:46 AM
to mongod...@googlegroups.com
Hey Brian,

They didn't have elastic IPs, but the DNS addresses hadn't changed.  I think the issue was that the "ip-181-11-12-13" address was something that the primary could resolve, but made no sense for the secondaries.  I'll try your suggestion though and see if it changes anything.

Thanks,
Dan

Scott Hernandez

Aug 31, 2012, 9:49:13 AM
to mongod...@googlegroups.com
Please verify that from every member you can connect to each other
node and that the names in the rs.conf() resolve correctly everywhere.

Using "mongo hostname.domain:port" is a good way to test, for example.
Running db.serverStatus() there requires a working connection and will
show that everything is working.
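
One way to make that check systematic is to generate one shell command per member and run the full list from every node. This is only a sketch: buildConnectivityChecks is a hypothetical helper, the host names are placeholders, and the "/test" database suffix is an assumption.

```javascript
// Hypothetical helper, not a MongoDB API: build one mongo shell command per
// replica set member. Each command connects and prints serverStatus().ok.
function buildConnectivityChecks(config) {
  return config.members.map(function (member) {
    return 'mongo ' + member.host +
      '/test --eval "printjson(db.serverStatus().ok)"';
  });
}

// Example config with placeholder host names (assumed for illustration).
var exampleConfig = {
  _id: "prod_repl",
  members: [
    { _id: 0, host: "ec2-203-0-113-10.compute-1.amazonaws.com:27017" },
    { _id: 1, host: "ec2-203-0-113-11.compute-1.amazonaws.com:27017" }
  ]
};

// One command string per member, to be run from each node in turn.
var commands = buildConnectivityChecks(exampleConfig);
```

A command that fails from any node points at the host name or security group rule to look at first.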

Brian McNamara

Aug 31, 2012, 10:33:40 AM
to mongod...@googlegroups.com
You put it much more succinctly, Scott. Dan, based on the exceptions there is a problem in some members of the replica set communicating with others.

Regards,
Brian

Dan

Aug 31, 2012, 11:13:02 AM
to mongod...@googlegroups.com
I am able to connect to the other db instances just fine from one another.  Running rs.conf() on the three servers gives the following:

prod_repl:SECONDARY> rs.conf()
{
        "_id" : "prod_repl",
        "version" : 3,
        "members" : [
                {
                        "_id" : 0,
                        "host" : "ip-10-xxx-xx-xxx:27017"
                },
                {
                        "_id" : 1,
                        "host" : "ec2-184-xx-xxx-xx.compute-1.amazonaws.com:27017"
                },
                {
                        "_id" : 2,
                        "host" : "ec2-23-xx-xxx-xx.compute-1.amazonaws.com:27017"
                }
        ]
}

> rs.conf()
null

> rs.conf()
null

I'm assuming the fact that the first server reports SECONDARY is an issue, though I wasn't able to get this working even when it reported PRIMARY.  I tried running rs.reconfig() to change the first member's host to the public DNS name, but it says that the reconfiguration has to happen on a primary.

Dan
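
The host change described in the post above can be sketched as a small function that copies the config and swaps one member's host. withMemberHost is a hypothetical helper, not a MongoDB API, and the host names are placeholders. The result would normally be passed to rs.reconfig() on the primary. The shell's rs.reconfig() also accepts { force: true } for use on a non-primary member, but the thread does not establish whether that would have recovered this particular set.

```javascript
// Hypothetical helper, not a MongoDB API: return a copy of a replica set
// config in which one member's host is replaced. The input is not modified.
function withMemberHost(config, memberId, newHost) {
  var result = {};
  for (var key in config) {
    result[key] = config[key]; // shallow copy of top-level fields
  }
  result.members = config.members.map(function (member) {
    var copy = {};
    for (var field in member) {
      copy[field] = member[field];
    }
    if (member._id === memberId) {
      copy.host = newHost; // swap in the externally resolvable name
    }
    return copy;
  });
  return result;
}

// Example config with placeholder host names (assumed for illustration).
var original = {
  _id: "prod_repl",
  version: 3,
  members: [
    { _id: 0, host: "ip-10-0-0-1:27017" },
    { _id: 1, host: "ec2-203-0-113-11.compute-1.amazonaws.com:27017" }
  ]
};

var updated = withMemberHost(
  original, 0, "ec2-203-0-113-10.compute-1.amazonaws.com:27017");
```

In the mongo shell on a primary, this would correspond to rs.reconfig(withMemberHost(rs.conf(), 0, "<public DNS>:27017")).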

Brian McNamara

Aug 31, 2012, 11:56:54 AM
to mongod...@googlegroups.com
Hi Dan,

Please confirm how you're verifying connectivity. Please show the output from each of the servers when running the mongo command (connecting to the other servers using the connection syntax Scott proposed). I'm willing to bet the problem lies in connecting from hosts 2 and 3 back to host 1.

Did you have a chance to stand up a second replica set with different hosts?

Regards,
Brian

Dan

Aug 31, 2012, 12:05:48 PM
to mongod...@googlegroups.com
Hi Brian,

I ultimately ended up performing the following steps, which seemed to work:
  • Deleted all local.* files on what was the primary server
  • Re-initialized the replica set
  • Before adding the secondary nodes, I reconfigured the rs.conf to change the primary's host from ip-xx-xxx-xxx-xxx to the full EC2 public DNS
  • Added the secondary nodes
That seemed to work fine.  Right now I'm trying to verify that adding items to the primary node actually updates all the secondary nodes, which doesn't seem to happen.  Does it just take time?

Anyway, for what it's worth, connecting from one server instance to another looks like this (though the prompt didn't indicate the replica set name):

ubuntu@ip-xx-xxx-xx-xxx:/var/lib/mongodb$ mongo ec2-xxx-xx-xxx-xx.compute-1.amazonaws.com:27017
MongoDB shell version: 2.2.0
prod_repl:SECONDARY>

Thank you all for your help.  Let me know if you think I've done something wrong.  I'm curious as to how people manage host names across multiple instances, since I think that would be a better long-term solution for us.

Dan
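
The recovery steps in the post above initiate the set first and then correct the primary's host name. A possible alternative, assuming the public DNS names are known up front, is to pass an explicit configuration to rs.initiate() so that no member is registered under the default short hostname. buildInitialConfig is a hypothetical helper and the host names are placeholders.

```javascript
// Hypothetical helper, not a MongoDB API: build a replica set config
// document from a set name and a list of "host:port" strings.
function buildInitialConfig(setName, hosts) {
  return {
    _id: setName, // must match the replSet value in mongod.conf
    members: hosts.map(function (host, index) {
      return { _id: index, host: host };
    })
  };
}

// Placeholder public DNS names (assumed for illustration).
var initialConfig = buildInitialConfig("prod_repl", [
  "ec2-203-0-113-10.compute-1.amazonaws.com:27017",
  "ec2-203-0-113-11.compute-1.amazonaws.com:27017",
  "ec2-203-0-113-12.compute-1.amazonaws.com:27017"
]);
```

In the mongo shell on the first node, the resulting document would be passed as rs.initiate(initialConfig), after confirming that every listed name resolves from every member.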

Daniel Ellis

Aug 31, 2012, 12:13:47 PM
to mongod...@googlegroups.com
Finishing up on this, for anyone who may search to get here.

I couldn't see the new data because I was actually in the wrong database.  Typing "use test", the default db I used for this, followed by querying the collection I inserted data into, gave this error:

error: { "$err" : "not master and slaveOk=false", "code" : 13435 }

I fixed this by running "rs.slaveOk()" in the shell session, which allowed me to read from the collection and verify that the replication was indeed working properly.

Brian McNamara

Aug 31, 2012, 1:36:40 PM
to mongod...@googlegroups.com
Hi Dan,

Thanks for posting your results and, more importantly, glad to hear you worked through your issue. Name resolution in EC2 is always interesting. I would say, where possible, use the public DNS when addressing hosts.

Regards,
Brian
