SlaveOk queries doesn't seems to go through mongoS

890 views
Skip to first unread message

Simon T.

unread,
Mar 19, 2012, 4:04:11 PM3/19/12
to mongod...@googlegroups.com
Hi guys,

My Mongo cluster is setupped this way:

- 2 mongos
- 2 shards
- 1 master/2 secondaries per shard

I'm using the Java drivers and the C# drivers.

With both drivers whenever I make a query when connecting through a mongoS with the slaveOk parameters, it's always the Master of each shard that anwser, never the secondaries.

However, I tested with the C# drivers, if I connect directly through one replica set with the SlaveOk option, it works fine.

To enable the SlaveOk property, I use the connection string ie."mongodb://server/?slaveOk=true" to tell the drivers I want the slave Ok option. When debugging I can see in my MongoServerSettings object that its SlaveOk property is set to true.

Currently, through mongos, it does not working either with the java drivers nor the C# drivers, and I've also tried the python drivers and its not working.


Is there something specific I need to check or config to make it happened ? Like I said earlier: it does work when connecting directly to the replica set. When I debug using the replica set, I see the 3 servers instance and the drivers know which server is the master and which one are the secondaries. But what happens when those servers sit behind a mongos ? Since the drivers don't know anything about the servers behind the mongos, the mongos need to do the dispatch for the SlaveOk to work correctly. So is there any special configuration required for the mongos to make the SlaveOk work properly ?

Thanks

Simon

Simon T.

unread,
Mar 19, 2012, 4:42:53 PM3/19/12
to mongod...@googlegroups.com
I forgot to mention that I'm using C# drivers 1.3.1 (latest as of now, I believe) and our servers are 2.0.3.

Kristina Chodorow

unread,
Mar 19, 2012, 5:34:04 PM3/19/12
to mongod...@googlegroups.com
What makes you think that mongos isn't dispatching these queries to the secondaries?

Simon T.

unread,
Mar 20, 2012, 9:54:18 AM3/20/12
to mongod...@googlegroups.com
If I fire 30k requests to the mongos. When watching the mongostat of each server, I see that its always the master of each shard that is anwsering and none of the secondaries.

Since all the drivers can see is the mongos to whom it is connected, and its SlaveOk setting is set to "true". I assume that it's the mongos that did not forward the queries to the replica set with the mention "SlaveOk = true".

Kristina Chodorow

unread,
Mar 20, 2012, 11:23:31 AM3/20/12
to mongod...@googlegroups.com
Hmm, can you run mongostat --discover for a few minutes while you do some queries and then send its output + mongos log from that time?

Simon T.

unread,
Mar 20, 2012, 12:31:24 PM3/20/12
to mongod...@googlegroups.com
Thanks for your help Kristina.

Attached to this message is a .tar file with the mongostat -discover output and the mongos log.  The test started around 16:07:05 so you can scroll down to that part.

Thank you very much again.

Simon
mongodebug.tar

Simon T.

unread,
Mar 20, 2012, 12:35:35 PM3/20/12
to mongod...@googlegroups.com
By the way, I'm aware of the issue with the server 1 and 3 on shardB.

Kristina Chodorow

unread,
Mar 20, 2012, 3:11:33 PM3/20/12
to mongod...@googlegroups.com
Thanks for the info, this is weird and I haven't been able to reproduce.  As a sanity check, can you try this in the shell:  

> // connect to mongos
> db.getMongo().setSlaveOk()
> db.collectionYouAreQueryingFromC#.find(/* somequery */).explain()

...and send the output?  (Replacing collectionYouAreQueryingFromC# and /* somequery */ :) )

Simon T.

unread,
Mar 21, 2012, 10:51:02 AM3/21/12
to mongod...@googlegroups.com
Hi Kristina,

here is the output of the shell after doing the query with the explain() function:

mongos> db.getMongo().setSlaveOk()
mongos> db.players.find({"s_nintendoID":"250000"}).explain()
{
        "cursor" : "BtreeCursor idx_nintendoID",
        "nscanned" : 1,
        "nscannedObjects" : 1,
        "n" : 1,
        "millis" : 0,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {
                "s_nintendoID" : [
                        [
                                "250000",
                                "250000"
                        ]
                ]
        }
}
mongos>


Here is the output of mongostat, stating that it is a master (osmngb01) that did answer again:

osmnga01:27018       0      0      0      0       2       1       1   276g   555g  58.4g      0        0          0       0|0     2|1   156b     1k    27 sharda    M   14:48:54
osmnga02:27018      *0     *0     *0     *0       0     2|0       0   276g   553g  56.3g      0        0          0       0|0     0|0   188b     1k     9 sharda  SEC   14:48:54
osmnga03:27018      *0     *0     *0     *0       0     3|0       0   276g   553g  3.11g      0        0          0       0|0     0|0   314b     1k     9 sharda  SEC   14:48:54
osmngb01:27018       0      1      0      0       0       6       0   311g   623g   141g      0        0          0       0|0     2|0   532b     2k    33 shardb    M   14:48:54
osmngb02:27018      *0     *0     *0     *0       0     2|0       0   362g   727g   103g      0        0          0       0|0     0|0   188b     1k     8 shardb  SEC   14:48:54
osmngb03:27018      *0     *0     *0     *0       0     2|0       0   311g   623g   109g      0        0          0       0|0     0|0   188b     1k     8 shardb  SEC   14:48:54
      osmngs01       0      1      0      0       0       2                  397m    10m      0                                         238b     1k    17         RTR   14:48:54
osmngs02:27017       0      0      0 

Thanks again




Scott Hernandez

unread,
Mar 21, 2012, 11:15:24 AM3/21/12
to mongod...@googlegroups.com
Can you please post the output from these command connected to a mongos:

use config
db.shards.find()
db.printShardingStatus()

> --
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/mongodb-user/-/idzt3ru2LuMJ.
>
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/mongodb-user?hl=en.

Simon T.

unread,
Mar 21, 2012, 11:29:12 AM3/21/12
to mongod...@googlegroups.com
MongoDB shell version: 2.0.3
connecting to: localhost/admin
mongos> use config
switched to db config
mongos> db.shards.find()
{ "_id" : "sharda", "host" : "sharda/osmnga01:27018,osmnga02:27018,osmnga03:27018" }
{ "_id" : "shardb", "host" : "shardb/osmngb01:27018,osmngb02:27018,osmngb03:27018" }
mongos> db.printShardingStatus()
--- Sharding Status ---
  sharding version: { "_id" : 1, "version" : 3 }
  shards:
        {  "_id" : "sharda",  "host" : "sharda/osmnga01:27018,osmnga02:27018,osmnga03:27018" }
        {  "_id" : "shardb",  "host" : "shardb/osmngb01:27018,osmngb02:27018,osmngb03:27018" }
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
        {  "_id" : "test_db",  "partitioned" : true,  "primary" : "sharda" }
        {  "_id" : "hor",  "partitioned" : true,  "primary" : "sharda" }
        {  "_id" : "load_test",  "partitioned" : true,  "primary" : "sharda" }
                load_test.loadtest chunks:
                                sharda  717
                                shardb  717
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "lt2",  "partitioned" : true,  "primary" : "shardb" }
                lt2.loadtest chunks:
                                sharda  11
                                shardb  12
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "hor-backup1",  "partitioned" : true,  "primary" : "sharda" }
                hor-backup1.metrics chunks:
                                shardb  1
                                sharda  2
                        { "s_UID" : { $minKey : 1 } } -->> { "s_UID" : "5380ef37bf19b550" } on : shardb { "t" : 2000, "i" : 0 }
                        { "s_UID" : "5380ef37bf19b550" } -->> { "s_UID" : "e0e9eba1c3d2fc8d" } on : sharda { "t" : 2000, "i" : 1 }
                        { "s_UID" : "e0e9eba1c3d2fc8d" } -->> { "s_UID" : { $maxKey : 1 } } on : sharda { "t" : 1000, "i" : 3 }
        {  "_id" : "localhost",  "partitioned" : false,  "primary" : "shardb" }
        {  "_id" : "players",  "partitioned" : false,  "primary" : "shardb" }

On Wednesday, March 21, 2012 11:15:24 AM UTC-4, Scott Hernandez wrote:
Can you please post the output from these command connected to a mongos:

use config
db.shards.find()
db.printShardingStatus()

On Mon, Mar 19, 2012 at 5:34 PM, Kristina Chodorow <kris...@10gen.com> wrote:
> What makes you think that mongos isn't dispatching these queries to the
> secondaries?
>
>
>
> On Monday, March 19, 2012 4:42:53 PM UTC-4, Simon T. wrote:
>>
>> I forgot to mention that I'm using C# drivers 1.3.1 (latest as of now, I
>> believe) and our servers are 2.0.3.
>
> --
> You received this message because you are subscribed to the Google Groups
> "mongodb-user" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/mongodb-user/-/idzt3ru2LuMJ.
>
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to

> mongodb-user+unsubscribe@googlegroups.com.

Greg Studer

unread,
Mar 22, 2012, 12:14:42 PM3/22/12
to mongod...@googlegroups.com
Hmm... still weird.  Can you up the logLevel of the mongos to 5 temporarily - this will give us the exact options being used by the query as it is dispatched.

mongo <mongos>
> use admin
> db.runCommand({ setParameter : 1, logLevel : 5 })
// Up
> db.runCommand({ setParameter : 1, logLevel : 0 })

Is/was the replica set setup with any hidden nodes?  rs.status()

If simple, can you replace the 2.0.3 mongos with a 2.0.2 mongos and see if there's a change?

Simon T.

unread,
Mar 23, 2012, 10:43:43 AM3/23/12
to mongod...@googlegroups.com
Hi guys, here is the logs. I've enable it and launch a couple a query.

A line that caught my attention was:
Fri Mar 23 13:58:19 [conn380] dbclient_rs getSlave not selecting { addr: "osmngb02:27018", isMaster: false, secondary: true, hidden: false, ok: false }, not currently okForSecondaryQueries

Attached is a sample of the log file.

Tell me what do you think.

Simon
mongodblog2.log

Kristina Chodorow

unread,
Mar 26, 2012, 11:29:36 AM3/26/12
to mongod...@googlegroups.com
Thanks for the log!  It looks like it's getting back healthy responses from the secondaries but not updating that they are healthy.  I'm not sure why it's not refreshing the secondaries' health, I've created a bug for it you can track: https://jira.mongodb.org/browse/SERVER-5405.

In the meantime, have you tried restarting the mongos?

Simon T.

unread,
Mar 26, 2012, 1:04:58 PM3/26/12
to mongod...@googlegroups.com
Hi Kristina,

Our MongoDB administrator is out-of-office for the next 2 days (he's currently at your MongoDB training for administrators in London ;) ).

We'll restart the mongos once he's back.

In the meantime, I'll watch the Jira issue.

Simon

Tomáš Jelínek

unread,
Mar 27, 2012, 8:33:54 AM3/27/12
to mongod...@googlegroups.com

Hi Kristina,

Yes, I can confirm MongoS servers were restarted but it doesn’t make any difference.

I've quickly created a test php code to simulate this problem and I see exactly same behaviour as Simon (he's testing java and c# drivers I believe).

Also if I run queries directly from MongoS shell I still see only primary nodes of each shard responding.

As Simon mentioned I am currently attuning MongoDB training in London so I had a chance to show our mongo environment to Chris Harris - he had quick look and according to him servers are healthy and overall setup is OK...

Regards

Tom

Message has been deleted

Simon T.

unread,
Mar 27, 2012, 9:42:10 AM3/27/12
to mongod...@googlegroups.com
I confirm that I've tested it with Java, C# and python drivers.

Kristina Chodorow

unread,
Mar 27, 2012, 10:07:40 AM3/27/12
to mongod...@googlegroups.com
Yes, it looks like a bug in mongos, so it won't matter what language you use.  If you connect directly to a replica set primary and run rs.status(), what is the output?

Tomáš Jelínek

unread,
Mar 27, 2012, 10:37:54 AM3/27/12
to mongod...@googlegroups.com
Shard A
 
connecting to: localhost:27018/admin
PRIMARY> rs.status()
{
        "set" : "sharda",
        "date" : ISODate("2012-03-27T14:35:29Z"),
        "myState" : 1,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "osmnga01:27018",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "optime" : {
                                "t" : 1332858904000,
                                "i" : 5
                        },
                        "optimeDate" : ISODate("2012-03-27T14:35:04Z"),
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "osmnga02:27018",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 403892,
                        "optime" : {
                                "t" : 1332858904000,
                                "i" : 5
                        },
                        "optimeDate" : ISODate("2012-03-27T14:35:04Z"),
                        "lastHeartbeat" : ISODate("2012-03-27T14:35:28Z"),
                        "pingMs" : 0
                },
                {
                        "_id" : 2,
                        "name" : "osmnga03:27018",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 210,
                        "optime" : {
                                "t" : 1332857863000,
                                "i" : 1
                        },
                        "optimeDate" : ISODate("2012-03-27T14:17:43Z"),
                        "lastHeartbeat" : ISODate("2012-03-27T14:35:29Z"),
                        "pingMs" : 23
                }
        ],
        "ok" : 1
}
PRIMARY>
ShardB
 
PRIMARY> rs.status()
{
        "set" : "shardb",
        "date" : ISODate("2012-03-27T14:36:48Z"),
        "myState" : 1,
        "syncingTo" : "osmngb02:27018",
        "members" : [
                {
                        "_id" : 0,
                        "name" : "osmngb01:27018",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "optime" : {
                                "t" : 1332853220000,
                                "i" : 1
                        },
                        "optimeDate" : ISODate("2012-03-27T13:00:20Z"),
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "osmngb02:27018",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 86910,
                        "optime" : {
                                "t" : 1332853220000,
                                "i" : 1
                        },
                        "optimeDate" : ISODate("2012-03-27T13:00:20Z"),
                        "lastHeartbeat" : ISODate("2012-03-27T14:36:46Z"),
                        "pingMs" : 0
                },
                {
                        "_id" : 2,
                        "name" : "osmngb03:27018",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 86910,
                        "optime" : {
                                "t" : 1332853220000,
                                "i" : 1
                        },
                        "optimeDate" : ISODate("2012-03-27T13:00:20Z"),
                        "lastHeartbeat" : ISODate("2012-03-27T14:36:46Z"),
                        "pingMs" : 0
                }
        ],
        "ok" : 1
}
PRIMARY>

Kristina Chodorow

unread,
Mar 27, 2012, 10:51:40 AM3/27/12
to mongod...@googlegroups.com
Okay, looks fine on that end.  I'd recommend following the ticket.

Randolph Tan

unread,
Mar 29, 2012, 1:52:47 PM3/29/12
to mongod...@googlegroups.com
Hi Tomáš & Simon,

Would it be possible to provide the test scripts you were using? Is this problem persistent, meaning does it not resolve after some time? I am asking this because there is a background thread that tries to refresh the state of all replica set connections every 10 seconds.

Simon T.

unread,
Mar 29, 2012, 3:27:40 PM3/29/12
to mongod...@googlegroups.com
The problem is indeed persistent, I never could get the queries to be redirected to the secondary while passing by the mongos.

The scripts are very simple: it is just looping queries. Here is two examples that does the same thing, one in java and the other one in C#:

JAVA:
try {
Mongo _mongo = new Mongo("osmngs01", 27017);
_mongo.setReadPreference(ReadPreference.SECONDARY);
DB db = _mongo.getDB("AAAAA");
if (db.authenticate("XXXXXXX", "YYYYYYYY".toCharArray()) == false) {
String message = String.format("Initialization Error: Failed to authenticate to %1$s.", "osmngs01");
System.out.println(message);
}


DBCollection coll = db.getCollection("players");
for(int i=0;i<10000;i++)
{
DBObject player = coll.findOne(new BasicDBObject("s_nintendoID", Integer.toString(i)));

System.out.println("index: " + i + " players: " + player);
}

} catch (UnknownHostException ex) {
String message = String.format("Initialization Error: Host %1$s does not exists. Message: %2$s",
"osmngs01", ex.getMessage());
System.out.println(message);
} catch (MongoException ex) {
String message = String.format("Initialization Error: Cannot connect to host %1$s:%2$s. Message: %3$s",
"osmngs01", 27017, ex.getMessage());
System.out.println(message);
}


AND C#:

string connectionString = "mongodb://XXXXX:YYYYY@osmngs01/?slaveOk=true";
MongoServer mongoServer = MongoServer.Create(connectionString);
MongoDatabase mongoDB = mongoServer.GetDatabase(MONGO_DB);
MongoCollection<BsonDocument> documents = mongoDB.GetCollection("players");
Random rndNb = new Random();
for (int i = 0; i < 15000; i++)
{
int iPlayer = rndNb.Next(1, 400000);
try
{
var query = Query.EQ("s_nintendoID", iPlayer.ToString());

BsonDocument bsonPlayer = documents.FindOne(query);

}
catch (Exception exc)
{
System.Diagnostics.Debug.WriteLine(exc.Message);
}
}




Simon

Randolph Tan

unread,
Apr 4, 2012, 4:40:55 PM4/4/12
to mongod...@googlegroups.com
Are you still having this issue? I am struggling to reproduce this bug. Do you have a verbose logs of the replicas that corresponds to the same time frame with the verbose mongos log that is experiencing this issue?

Tomáš Jelínek

unread,
Apr 5, 2012, 10:23:19 AM4/5/12
to mongod...@googlegroups.com
Hi Randolph,

first - let me confirm - yes, the issue is permanent - its been like this since we have noticed it long time ago.

Saying that, the logs Simon posted on March 23 are exactly what you are asking for - i.e, verbose logs from the time frame while we are experiencing the issue. 

BTW I would suggest to take this conversation offline until we find the solution, possibly to an email thread as I am slightly uncomfortable talking about our set-up on public forum ( this is production mongo we having issue with). I will try to contact you through G+ with private message.

Regards
Tom

Simon T.

unread,
Apr 5, 2012, 12:05:11 PM4/5/12
to mongod...@googlegroups.com
Hi Randolph,

here is the full log (level:5) of a RS, mongos and mongostat

The procedure was:

  1. Start fresh log on a server
  1. db.runCommand({ setParameter : 1, logLevel : 5 })
  1. Start a load test with the scripts that you can find on this thread
Simon

slaveOKtest20120405.rar

Randolph Tan

unread,
Apr 5, 2012, 10:42:59 PM4/5/12
to mongod...@googlegroups.com
Hi,

Are you also passing the same keyFile you pass to the replica shards to mongos?

Simon T.

unread,
Apr 6, 2012, 10:14:02 AM4/6/12
to mongod...@googlegroups.com
Hi Randolph,

Could you explain a bit more what you mean ?

Thanks

Randolph Tan

unread,
Apr 6, 2012, 1:23:24 PM4/6/12
to mongod...@googlegroups.com
Something like this. If you pass the keyfile to one of the replica set as this:

mongod --replSet sharda --keyFile /data/mongodb/sharda.keyfile ...

You should also do the same with mongos:

mongos --keyFile /data/mongodb/sharda.keyfile ...

The sharda.keyfile should exactly have the same contents.

Tomáš Jelínek

unread,
Apr 10, 2012, 4:58:54 AM4/10/12
to mongod...@googlegroups.com
Hi Randolph,
yes we are passing the key - would replicas work if we wouldnt be passing it around???

This is how we do it:

MongoS01

# mongodb -- numactl --interleave=all /usr/bin/mongos --config /etc/mongodb-mongos.conf

xxx@osmngs01:/etc# cat /etc/mongodb-mongos.conf
bind_ip = osmngs01,127.0.0.1 
logpath = /var/log/mongodb/mongodb.log
logappend = true
configdb = osmnga01:27019,osmnga02:27019,osmnga03:27019
keyFile = /data/mongodb/mongos.keyfile


A01 (and other replicas):

# mongodb -- numactl --interleave=all /usr/bin/mongod --config /etc/mongodb-sharda.conf

xxx@osmnga01:~# cat /etc/mongodb-sharda.conf
replSet = sharda
cpu = true
rest = true
shardsvr = true
bind_ip = osmnga01,127.0.0.1
dbpath = /data/mongodb/sharda/
logpath = /var/log/mongodb/mongodb.log
logappend = true
keyFile = /data/mongodb/sharda.keyfile

And keyfiles are the same on all machines...

Regards
Tom

Randolph Tan

unread,
Apr 10, 2012, 10:39:08 AM4/10/12
to mongod...@googlegroups.com
Hi,

We have found out that there is actually a bug and you can track it here.

Tomáš Jelínek

unread,
Apr 10, 2012, 12:29:02 PM4/10/12
to mongod...@googlegroups.com
Thanks for confirmation Randolph, will be keeping an eye on the buck track link.

Cheers
Tom

Simon T.

unread,
May 1, 2012, 1:32:17 PM5/1/12
to mongod...@googlegroups.com
Hi guys,

I watched https://jira.mongodb.org/browse/SERVER-5405 for the past few weeks. It seems like the bugs has been fixed, then re-opened and it still under the "unresolved" status. It seems that some patches have been commited on github.

Are you guys able to give us the current status of the bug. And if possible, update the ETA (2.1.1 is still scheduled for yesterday on Jira ;) )

Simon

Adam C

unread,
May 1, 2012, 1:47:06 PM5/1/12
to mongod...@googlegroups.com
The fix is checked into the 2.1 branch and I have personally seen it fixed by testing the nightly builds.  The usual reason for re-opening is to evaluate the option to back port the bug to the previous branch.  Naturally, it won't make it for 2.0.5 (that's already in release candidate status), but if suitable should make it for the next release.

Adam

Simon T.

unread,
May 1, 2012, 2:37:34 PM5/1/12
to mongod...@googlegroups.com
Thank you very much Adam !
Reply all
Reply to author
Forward
0 new messages