Help: PHP extension does not failover in replica sets


Egor Egorov

Oct 12, 2010, 6:21:52 AM
to mongodb-user
Hi!

We are running a popular iPhone messenger, http://pushme.to, on a
replica set database on Ubuntu 10.04 and 10.10, MongoDB 1.6.3. We use
the pecl Mongo extension 1.0.10 with persistent connections enabled; APC
is also used. PHP is in FastCGI mode.

The problem: the PHP extension almost never seems able to fail over
when the master server changes.

In the simplest form, it's enough to run "rs.stepDown()" on the
master server to make all php-cgi processes go dead. They throw the
exception "Not master" and never fail over to the new master.

The more complicated error happens when the master suddenly dies. In
this case the php-cgi processes just stay blocked, and only
kill -9 helps. They stay that way even a few minutes after a new
master has been elected.

If I shut down the master mongod and a new master is not elected
immediately (let's say about 30 seconds pass before another node
becomes master), then the PHP extension is not able to detect the new
master and the php-cgi processes hang.

And the best part is when a master change suddenly happens in
production. :)

Well, we only have four servers in the cluster for pushme.to and it's
not really hard to killall php-cgi on all of them, but doesn't that just
defeat the purpose of a replica set?

These situations are 100% repeatable.

We are willing to help test possible solutions (even on production).
Are there any?

Kristina Chodorow

Oct 12, 2010, 8:51:52 AM
to mongod...@googlegroups.com
Let me look into it, sounds like FastCGI is doing something different than Apache. 

rs.stepDown won't make the driver fail over correctly (unless you're using a 1.7 nightly of the database, which I wouldn't recommend for production), but I might be able to fix that at the driver level. 

Thanks for the details, will keep you posted.



--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.


Kristina Chodorow

Oct 13, 2010, 5:29:23 PM
to mongod...@googlegroups.com
Okay, I've pushed some changes to the driver.  Could you try the code at http://github.com/mongodb/mongo-php-driver and let us know how it works for you?  It should detect master step-downs and handle normal failover better.

Egor Egorov

Oct 13, 2010, 7:33:30 PM
to mongodb-user

Kristina, thanks for quick reply!

We will start testing it very soon, so I hope to get back to you with
the results within 24-48 hours.


Egor Egorov

Nov 8, 2010, 6:58:41 PM
to mongodb-user

... well it took just a liiiitle bit longer to test this patch and
research our options.

Well, we had to close() the connections and reopen them once the
exception comes after rs.stepDown(). The driver doesn't do this on its
own :(

Here's the test case and the algorithm that works for us. I feel like
it's not the right way?

$connection->connect();

while (true) {
    try {
        $result = $connection->test->a->find();
        $result->getNext();
        printf(".");
    } catch (Exception $e) {
        printf("!");
        while (true) {
            try {
                $connection->close();
                sleep(1);
                printf("?");
                $connection->connect();
                break;
            } catch (Exception $e) {
            }
        }
    }
    flush();
    usleep(300*1000);
}



Kristina Chodorow

Nov 8, 2010, 7:11:05 PM
to mongod...@googlegroups.com
Have you upgraded to 1.0.11?

Here's what it should look like:


$connection->connect();

while (true) {
 try {
   $result = $connection->test->a->find();
   $result->getNext();
   printf(".");
 } catch(Exception $e) {
   printf("!");
 }
 flush();
 usleep(300*1000);
}

Here's what I get when I run this and step down masters:
............................................!!!.!!.................!!.!!!!!!!!!!!.........................!!.!!!!!!!!!!!.....

Does it not do this for you?  You should not have to close the connection in 1.0.11.
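
For code that should not loop forever the way the test scripts in this thread do, the same "catch and try again" idea can be bounded. The following is only a minimal sketch: retryDuringFailover() is a hypothetical helper that is not part of the Mongo PHP driver, its default attempt count and delay are arbitrary assumptions, and the closure merely stands in for a query that throws while no master is available.

```php
<?php
// Hypothetical helper, NOT part of the Mongo PHP driver.
// Calls $operation and retries when it throws, up to $maxAttempts times,
// sleeping $delayMicros microseconds between attempts.
function retryDuringFailover(callable $operation, $maxAttempts = 10, $delayMicros = 300000)
{
    if ($maxAttempts < 1) {
        throw new InvalidArgumentException("maxAttempts must be at least 1");
    }
    $lastError = null;
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            return $operation();
        } catch (Exception $e) {
            $lastError = $e;
            if ($attempt < $maxAttempts) {
                usleep($delayMicros);
            }
        }
    }
    // Every attempt failed: surface the last exception to the caller.
    throw $lastError;
}

// Stand-in for a query: fails three times, then succeeds.
// The exception message is illustrative only.
$calls = 0;
$flaky = function () use (&$calls) {
    $calls++;
    if ($calls < 4) {
        throw new Exception("not master");
    }
    return "ok";
};

$result = retryDuringFailover($flaky, 10, 1000);
printf("%s after %d calls\n", $result, $calls); // prints "ok after 4 calls"
```

In a real script the closure would wrap the find() and getNext() calls from the loop above. Giving up after a fixed number of attempts lets a request fail with an error instead of blocking while no master is elected.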




Egor Egorov

Nov 11, 2010, 2:42:02 PM
to mongodb-user


On Nov 9, 2:11 am, Kristina Chodorow <krist...@10gen.com> wrote:
> Have you upgraded to 1.0.11?




> Here's what I get when I run this and step down masters:
> ............................................!!!.!!.................!!.!!!!!!!!!!!.........................!!.!!!!!!!!!!!.....
>
> Does it not do this for you?  You should not have to close the connection in
> 1.0.11.

Thanks!

1.0.11 and your version of the script works perfectly.

We have deployed these changes to production, and I will watch how
it goes for a few days.