Hi all,
I put a new MongoDB-backed app into production today. In "Mongo terms"
it's a very small 3-node replica set. The configuration is:
(I've replaced the host names because I'm paranoid)
> rs.conf()
{
    "_id" : "ihset",
    "version" : 7,
    "members" : [
        {
            "_id" : 0,
            "host" : "h1",
            "votes" : 5
        },
        {
            "_id" : 1,
            "host" : "h2",
            "votes" : 5
        },
        {
            "_id" : 2,
            "host" : "h3",
            "priority" : 0
        }
    ]
}
Hosts h1 and h2 are in my main data center with the client app, h3 is
in another data center and is there for disaster recovery/backup
purposes. So far so good.
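
For reference, the vote arithmetic implied by this configuration can be sketched as below. This is only an illustration: the class and method names are made up, and it assumes h3 carries the default of one vote, since rs.conf() shows no "votes" field for it. A primary can be elected only while the reachable members hold a strict majority of all votes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any MongoDB API.
final class VoteMath {
    // Votes needed to elect a primary: strictly more than half of all votes.
    static int majority(int totalVotes) {
        return totalVotes / 2 + 1;
    }

    // True if the reachable members together hold a majority of all votes.
    static boolean canElectPrimary(Map<String, Integer> votes, String... reachable) {
        int total = 0;
        for (int v : votes.values()) {
            total += v;
        }
        int have = 0;
        for (String host : reachable) {
            have += votes.getOrDefault(host, 0);
        }
        return have >= majority(total);
    }

    public static void main(String[] args) {
        Map<String, Integer> votes = new LinkedHashMap<>();
        votes.put("h1", 5);
        votes.put("h2", 5);
        votes.put("h3", 1); // assumption: default vote count, not shown in rs.conf()

        System.out.println(majority(11));                        // 6
        System.out.println(canElectPrimary(votes, "h2", "h3"));  // true: 6 of 11 votes
        System.out.println(canElectPrimary(votes, "h2"));        // false: 5 of 11 votes
    }
}
```

Under that assumption, h2 can take over from h1 only while it can still reach h3, and h3 itself never becomes primary because its priority is 0.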
For the entire work day, h1 was the master and the system worked just
fine. When I got home, I decided to force the system to fail over
with rs.stepDown(120), which I executed on the master. After a few
seconds h2 became the master. Perfect.
> rs.status()
{
    "set" : "ihset",
    "date" : "Thu Feb 10 2011 20:32:50 GMT-0500 (EST)",
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "h1",
            "health" : 1,
            "state" : 2,
            "uptime" : 520569,
            "lastHeartbeat" : "Thu Feb 10 2011 20:32:49 GMT-0500 (EST)"
        },
        {
            "_id" : 1,
            "name" : "h2",
            "health" : 1,
            "state" : 1,
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "h3",
            "health" : 1,
            "state" : 2,
            "uptime" : 76654,
            "lastHeartbeat" : "Thu Feb 10 2011 20:32:49 GMT-0500 (EST)"
        }
    ],
    "ok" : 1
}
>
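
For anyone reading the output above, the numeric "state" values are what show the failover: 1 means PRIMARY and 2 means SECONDARY. A tiny sketch of that mapping follows; the class name is made up for illustration, and only the two codes that appear in this output are decoded.

```java
// Hypothetical helper for reading rs.status() output; not a driver API.
final class ReplState {
    // Decodes the two state codes seen above; anything else is reported as-is.
    static String name(int state) {
        switch (state) {
            case 1:
                return "PRIMARY";
            case 2:
                return "SECONDARY";
            default:
                return "OTHER(" + state + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println("h1 " + name(2)); // h1 SECONDARY
        System.out.println("h2 " + name(1)); // h2 PRIMARY
        System.out.println("h3 " + name(2)); // h3 SECONDARY
    }
}
```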
Shortly after that my client application started throwing errors
galore:
[snip]
Caused by: com.mongodb.MongoException: not master
    at com.mongodb.DBTCPConnector._checkWriteError(DBTCPConnector.java:136)
    at com.mongodb.DBTCPConnector.say(DBTCPConnector.java:157)
    at com.mongodb.DBTCPConnector.say(DBTCPConnector.java:141)
    at com.mongodb.DBApiLayer$MyCollection.insert(DBApiLayer.java:225)
    at com.mongodb.DBApiLayer$MyCollection.insert(DBApiLayer.java:180)
    at com.mongodb.DBCollection.insert(DBCollection.java:72)
    at com.mongodb.DBCollection.save(DBCollection.java:537)
    at com.mongodb.DBCollection.save(DBCollection.java:517)
    at com.starpoint.instihire.dao.hibernate.querybuilder.QueryBuilder.createQuery(QueryBuilder.java:561)
[snip]
So obviously my system was trying to write to the old master.
My question is: is it my job to catch this, or should the Java library
catch it and reconnect to the new master?
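
Whether the driver is supposed to recover on its own is the open question here. If the application does end up handling it, one possible shape is a small retry wrapper like the sketch below. Everything in it is hypothetical: the class name, the method names, and the approach of matching the "not master" text in the exception chain are assumptions for illustration, not a documented driver feature.

```java
import java.util.function.Supplier;

// Hypothetical application-side helper; not part of the MongoDB Java driver.
final class NotMasterRetry {
    // Runs op, retrying with a growing pause while the failure looks like "not master".
    static <T> T withRetry(Supplier<T> op, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                if (!isNotMaster(e)) {
                    throw e; // unrelated failure: do not retry
                }
                last = e;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMillis * attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying if the thread is interrupted
                }
            }
        }
        throw last;
    }

    // Walks the cause chain looking for the "not master" message from the trace.
    static boolean isNotMaster(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            String msg = c.getMessage();
            if (msg != null && msg.contains("not master")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fake operation standing in for a save(): fails twice, then succeeds.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("not master");
            }
            return "saved";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In the code that follows, the save(dbo) call could in principle be wrapped in a lambda passed to withRetry, with the attempt count and pause sized to how long an election takes. Retrying an insert blindly can write a document twice if the first attempt actually reached the server, so this is a stopgap rather than a fix.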
The code I wrote that caused this:
[snip]
DBObject dbo = new BasicDBObject();
dbo.put("hql", hql);
dbo.put("params", mparams);
dbo.put("timestamp", System.currentTimeMillis());
mongoProvider.getDB().getCollection("querylog").save(dbo);
[snip]
MongoProvider is a Spring bean:
[snip]
public class MongoProviderImpl extends Logic implements MongoProvider {
    private InstiHireProps instiHireProps;
    private Mongo mongo = null;
    private final Logger logger = Logger.getLogger(MongoProviderImpl.class);

    public void setInstiHireProps(InstiHireProps instiHireProps) {
        this.instiHireProps = instiHireProps;
    }

    public void afterPropertiesSet() throws Exception {
        assertSet(instiHireProps, "instiHireProps");
    }

    private Mongo getMongo() {
        if (mongo == null) {
            try {
                List<ServerAddress> servers = new ArrayList<ServerAddress>();
                for (String serverName : instiHireProps.getMongoServer().split(",")) {
                    ServerAddress serverAddress = new ServerAddress(serverName.trim());
                    logger.debug(serverAddress.toString());
                    servers.add(serverAddress);
                }
                if (servers.size() == 1) {
                    mongo = new Mongo(servers.get(0));
                } else {
                    mongo = new Mongo(servers);
                    mongo.slaveOk();
                    mongo.setWriteConcern(new WriteConcern(servers.size() - 1, 30000, false));
                }
                logger.info("WriteConcern: " + mongo.getWriteConcern());
            } catch (Exception e) {
                logger.error("Unable to get connection to Mongo", e);
                throw new BusinessException(e);
            }
        }
        return mongo;
    }

    public DB getDB() {
        return getMongo().getDB(instiHireProps.getMongoDatabase());
    }
}
[snip]
Any help would be greatly appreciated.
Tony Nelson
Starpoint Solutions