Robbat, can you verify this chunk?
> 2. Patch uses the correct df binary and fixes du arguments if SunOS is
> detected.
I already have a (slightly more complete) patch for this from someone
else... I need a few days to circle back and apply a bunch of patches. So
lets just apply the second one after it's been verified to still work from
both sides (pg/mysql)
> I finally got mogilefs to work ;-)
>
> http://victori.uploadbooth.com/patches/mogilefs-sunos.patch
>
Thanks! :)
-Dormando
Alternatively maybe it's something the Pg bindings do differently on
Solaris than Linux?
--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : rob...@gentoo.org
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
Also, just wondering, what version of Postgres are you using?
My production is still on 8.2, and getting ready to jump to 8.4
> new configuration option: max_requests
> defaults to 0 (keep dbh cached forever, just how mogilefs works now.)
> Setting max_requests anything than 0 will be the max amount of dbh
> handles mogilefs store gives out before disconnecting and giving a
> fresh handle.
Sounds good in practice, but couple of requests here:
- Split up the patches: InactiveDestroy, max-requests, df-SunOS
- is handles_given correctly reset when the DBI connection is dead?
- Can you just always use handles_given even if max_requests is zero? I
think it would be interesting to instrument.
- With the above done, what handle counts are you seeing on Solaris?
Any runaway growth?
P.S.
Using git-svn will make your life a lot easier ;-)
If you want an initial tree to start with, clone from
git://git.isohunt.com/pub/mogilefs-robbat2.git.
--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : rob...@gentoo.org
> http://victori.uploadbooth.com/patches/store-max-requests.patch (max-
> requests - with disconnect reset and mogdbsetup fix)
Merged, I refactored it a bunch, and changed the actual name from
max-requests to max-handles, as it's DB handles, not requests to the DB
on a given handle.
> http://victori.uploadbooth.com/patches/mogilefs-sunos-pg.patch
> (InactiveDestroy)
Merged.
--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : rob...@gentoo.org
> - InactiveDestroy gets the whole thing to work on Solaris but causes a
> bug with the DBI driver, causing a memory leak.
> - I have changed the way the static Mgd class caches the store
> instances. It has been test on Solaris and OSX and does not require
> the leaky InactiveDestory bandaid for MogileFS to function on
> Solaris.
I don't follow why that code is a problem. In a fork, the pid ($$)
should be different, so it should return a new store anyway. What is
Solaris doing differently that the existing code doesn't work?
--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : rob...@gentoo.org
> Concerning max_handles improvement, now that I have more experience
> with the mogilefs codebase, I see that it might be an issue with some
> parts of MogileFS the way $dbh is handled.
For cases where there is no sleep, how long should we trust the $dbh
handle, or should we convert every instance of:
'my $dbh = $self->dbh;'
To not keep a $dbh variable?
Can you submit a patch for this so we can get more review on it?
Sure with a sleep and two seperate sets of work I can see it being valid
to make sure we ask for a new handle, but for a transaction, using a var
is probably better.
Or alternatively we just check that the $dbh is good more at the start
of each $dbh var.
--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : rob...@gentoo.org
It seems like we need to understand the problem a little more. Look at
where 'validate_dbh' is being called in the code. That's going to be the
safest place to decide whether or not to nuke a connection... Right now it
just calls '->ping' to make sure the connection is valid, and keeps it
forever if it is.
There're a couple places where dbh errors are checked lazily and an awful
lot of dbh calling code. I'm looking at the latest patch for solaris and
am still confused as to why that patch was necessary.
It looks like it's because the dbh isn't forcibly reopened after a fork?
or there's some place where we're forking and not doing that. Which means
the $store{$$} = $etc should have two entries for cases where it's
erroneous.
But... I don't see how that's possible given the old line:
return $store if $store && $store_pid == $$; ? From review, that lest
patch in trunk replaces a pid check with an *equivalent* pid check by
nature of $stores{$$} needing to be unique to a child.
It'd be nice to see a standalone test of a short piece of perl that fork's
and runs code similar to that... This behavior's pretty darn weird.
-Dormando
But... I don't see how that's possible given the old line:
return $store if $store && $store_pid == $$; ? From review, that lest
patch in trunk replaces a pid check with an *equivalent* pid check by
nature of $stores{$$} needing to be unique to a child.
I just kicked trunk around a bunch. Can you try testing it with your
Solaris/Postgres setup? If you're relying on max_handles, it's the same as
before but you'll want to set the limit a little lower to reflect that
it's the number of times the handler is validated instead of requested.
I think the root of the problem is finally addressed (safely) in both
places. One is that under solaris/postgres, touching a database handle
post-fork is deadly somehow. So now we kill it pre-fork. Finally, the
handler reuse stuff is done more safely.
Tests pass, code seems sane, try it out please and thank you? :)
-Dormando