Patch to get mogilefs to work on solaris

victori

unread,

Oct 31, 2009, 3:23:52 PM10/31/09

to mogile

Here is my patch to get mogilefs to work on solaris/opensolaris.

1. InactiveDestroy DBI argument is mandatory for DBD::Pg to work with
mogilefs's forking nature.
2. Patch uses the correct df binary and fixes du arguments if SunOS is
detected.

I finally got mogilefs to work ;-)

http://victori.uploadbooth.com/patches/mogilefs-sunos.patch

dormando

unread,

Oct 31, 2009, 3:31:53 PM10/31/09

to mogile

> Here is my patch to get mogilefs to work on solaris/opensolaris.
>
> 1. InactiveDestroy DBI argument is mandatory for DBD::Pg to work with
> mogilefs's forking nature.

Robbat, can you verify this chunk?

> 2. Patch uses the correct df binary and fixes du arguments if SunOS is
> detected.

I already have a (slightly more complete) patch for this from someone
else... I need a few days to circle back and apply a bunch of patches. So
lets just apply the second one after it's been verified to still work from
both sides (pg/mysql)

> I finally got mogilefs to work ;-)
>
> http://victori.uploadbooth.com/patches/mogilefs-sunos.patch
>

Thanks! :)

-Dormando

Robin H. Johnson

unread,

Nov 1, 2009, 3:25:19 PM11/1/09

to mog...@googlegroups.com

On Sat, Oct 31, 2009 at 12:31:53PM -0700, dormando wrote:
> > Here is my patch to get mogilefs to work on solaris/opensolaris.
> > 1. InactiveDestroy DBI argument is mandatory for DBD::Pg to work with
> > mogilefs's forking nature.
> Robbat, can you verify this chunk?

Can you state where you saw a DB connection leak?
The validate_dbh/recheck_dbh logic should ensure that each fork gets a
separate connection already and it works here at least, my Postgres
reports one connection for each mog worker I have.

Alternatively maybe it's something the Pg bindings do differently on
Solaris than Linux?

--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : rob...@gentoo.org
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85

victori

unread,

Nov 1, 2009, 4:13:52 PM11/1/09

to mogile

Yes, on Solaris the bindings are different where the argument is
required for mogilefs to function.

Right now I am currently doing a mass import of files into mogilefs
and I am seeing postgresql memory use sky rocket almost killing our
main production box. Seems like the mogilefs tracker is reusing/
pooling the same dbh handle that creates this issue when you have a
lot of inserts happening. I have written a patch that forces the
mogilefs tracker to re-connect with a new handle past a certain of
"max_requests." This puts a limit on how many commands can be issued
per connection before forcing the mogilefs tracker to pickup a new
database handle.

new configuration option: max_requests
defaults to 0 (keep dbh cached forever, just how mogilefs works now.)
Setting max_requests anything than 0 will be the max amount of dbh
handles mogilefs store gives out before disconnecting and giving a
fresh handle.

http://victori.uploadbooth.com/patches/mogilefs-sunos-full-v2.patch

> E-Mail : robb...@gentoo.org

> GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
>

> application_pgp-signature_part
> < 1KViewDownload

Robin H. Johnson

unread,

Nov 1, 2009, 5:42:14 PM11/1/09

to mog...@googlegroups.com

On Sun, Nov 01, 2009 at 01:13:52PM -0800, victori wrote:
> Yes, on Solaris the bindings are different where the argument is
> required for mogilefs to function.

Ok, I'll merge that fragment later this afternoon, I just want to see if
it's going to have any negative effect in Linux-Postgres. If somebody
else can test it more with MySQL, that would be great too.

Also, just wondering, what version of Postgres are you using?
My production is still on 8.2, and getting ready to jump to 8.4

> new configuration option: max_requests
> defaults to 0 (keep dbh cached forever, just how mogilefs works now.)
> Setting max_requests anything than 0 will be the max amount of dbh
> handles mogilefs store gives out before disconnecting and giving a
> fresh handle.

Sounds good in practice, but couple of requests here:
- Split up the patches: InactiveDestroy, max-requests, df-SunOS
- is handles_given correctly reset when the DBI connection is dead?
- Can you just always use handles_given even if max_requests is zero? I
think it would be interesting to instrument.
- With the above done, what handle counts are you seeing on Solaris?
Any runaway growth?

P.S.
Using git-svn will make your life a lot easier ;-)
If you want an initial tree to start with, clone from
git://git.isohunt.com/pub/mogilefs-robbat2.git.

--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead

E-Mail : rob...@gentoo.org

victori

unread,

Nov 2, 2009, 3:34:35 AM11/2/09

to mogile

Robin,

> - Split up the patches: InactiveDestroy, max-requests, df-SunOS

- I will take care of the patches tomorrow, will post them here.

> - is handles_given correctly reset when the DBI connection is dead?

- Not in the current patch. I will have this setup correctly tomorrow.

> - Can you just always use handles_given even if max_requests is zero? I
> think it would be interesting to instrument.

- I explicitly did this on purpose, not to disrupt the way mogilefs
works. Make it optional for databases that need connections cycled.

> - With the above done, what handle counts are you seeing on Solaris?
> Any runaway growth?

- With my patches applied, I have been able to upload 200,000 images
on to mogilefs without any issue. Before my patches I could barely do
50,000 images before the system will slow down to a crawl due to
postgresql eating all the ram.

I have another million or so images to import, so far so good. ;-)

> E-Mail : robb...@gentoo.org

> GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85
>

> application_pgp-signature_part
> < 1KViewDownload

victori

unread,

Nov 2, 2009, 5:56:59 PM11/2/09

to mogile

http://victori.uploadbooth.com/patches/solaris-diskusage-du.patch ( df
utility fix for solaris)

http://victori.uploadbooth.com/patches/store-max-requests.patch (max-
requests - with disconnect reset and mogdbsetup fix)

http://victori.uploadbooth.com/patches/mogilefs-sunos-pg.patch
(InactiveDestroy)

http://victori.uploadbooth.com/patches/solaris-mogilefs-full.patch
(full patch if needed)

With all the patches applied it works wonderfully on solaris.

Robin H. Johnson

unread,

Nov 3, 2009, 12:06:05 AM11/3/09

to mog...@googlegroups.com

On Mon, Nov 02, 2009 at 02:56:59PM -0800, victori wrote:
> http://victori.uploadbooth.com/patches/solaris-diskusage-du.patch ( df
> utility fix for solaris)

Pending on dormando.

> http://victori.uploadbooth.com/patches/store-max-requests.patch (max-
> requests - with disconnect reset and mogdbsetup fix)

Merged, I refactored it a bunch, and changed the actual name from
max-requests to max-handles, as it's DB handles, not requests to the DB
on a given handle.

> http://victori.uploadbooth.com/patches/mogilefs-sunos-pg.patch
> (InactiveDestroy)
Merged.

--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead

E-Mail : rob...@gentoo.org

victori

unread,

Nov 4, 2009, 4:17:22 PM11/4/09

to mogile

Robin,

One last patch. The dbh handle InactiveDestroy should be toggled back
to 0 before disconnecting. This way it is made sure that the dbh
handle gets garbage collected in the child processes. This solved all
my leaks ;-)

http://victori.uploadbooth.com/patches/pg-memfix.patch

On Nov 2, 9:06 pm, "Robin H. Johnson" <robb...@gentoo.org> wrote:
> On Mon, Nov 02, 2009 at 02:56:59PM -0800, victori wrote:

> >http://victori.uploadbooth.com/patches/solaris-diskusage-du.patch( df

> > utility fix for solaris)
>
> Pending on dormando.
>
> >http://victori.uploadbooth.com/patches/store-max-requests.patch (max-
> > requests - with disconnect reset and mogdbsetup fix)
>
> Merged, I refactored it a bunch, and changed the actual name from
> max-requests to max-handles, as it's DB handles, not requests to the DB
> on a given handle.
>
> >http://victori.uploadbooth.com/patches/mogilefs-sunos-pg.patch
> > (InactiveDestroy)
>
> Merged.
>
> --
> Robin Hugh Johnson
> Gentoo Linux: Developer, Trustee & Infrastructure Lead

> E-Mail : robb...@gentoo.org

victori

unread,

Nov 4, 2009, 11:55:17 PM11/4/09

to mogile

Robin,

Forget about my last post, I have finally fixed the root issue. The
whole thing works great without the InactiveDestroy bandaid.

http://victori.uploadbooth.com/patches/solaris-dbi-cache-fix.patch

- InactiveDestroy gets the whole thing to work on Solaris but causes a
bug with the DBI driver, causing a memory leak.
- I have changed the way the static Mgd class caches the store
instances. It has been test on Solaris and OSX and does not require
the leaky InactiveDestory bandaid for MogileFS to function on
Solaris.

Instead of saving the Store in $store and fork pid in $store_pid,
consolidate the whole thing into a single static hash. $store{pid}

This way of caching the Store instance has no issues with Solaris, and
does not require the InactiveDestroy bandaid.

Robin H. Johnson

unread,

Nov 5, 2009, 12:16:22 AM11/5/09

to mog...@googlegroups.com

On Wed, Nov 04, 2009 at 08:55:17PM -0800, victori wrote:
>
> Robin,
>
> Forget about my last post, I have finally fixed the root issue. The
> whole thing works great without the InactiveDestroy bandaid.
>
> http://victori.uploadbooth.com/patches/solaris-dbi-cache-fix.patch

This still has:
+$self->{dbh}->disconnect();
+$self->{dbh}{InactiveDestroy} = 0;
in the max_handles code.

> - InactiveDestroy gets the whole thing to work on Solaris but causes a
> bug with the DBI driver, causing a memory leak.
> - I have changed the way the static Mgd class caches the store
> instances. It has been test on Solaris and OSX and does not require
> the leaky InactiveDestory bandaid for MogileFS to function on
> Solaris.

I don't follow why that code is a problem. In a fork, the pid ($$)
should be different, so it should return a new store anyway. What is
Solaris doing differently that the existing code doesn't work?

--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead

E-Mail : rob...@gentoo.org

victori

unread,

Nov 5, 2009, 1:17:22 AM11/5/09

to mogile

> This still has:
> +$self->{dbh}->disconnect();
> +$self->{dbh}{InactiveDestroy} = 0;
> in the max_handles code.

I set inactivedestroy to 0 to make sure the dbh handle is garbage
collected to avoid a memory leak. Got the idea from the SQLGrey
project. See http://article.gmane.org/gmane.mail.postfix.sqlgrey/1033

> I don't follow why that code is a problem. In a fork, the pid ($$)
> should be different, so it should return a new store anyway. What is
> Solaris doing differently that the existing code doesn't work?

With the static hash each worker gets its own dedicated Store
instance, avoiding all the issues with fork and dbd::pg. This works on
solaris and does not leak memory. I have the mogilefs tracker running
for a few hours now and it is still at 6mb RSS.

Concerning max_handles improvement, now that I have more experience
with the mogilefs codebase, I see that it might be an issue with some
parts of MogileFS the way $dbh is handled.

For example, to illustrate my point.

my $dbh = $self->get_store()->dbh;
$dbh->do_stuff;
sleep 5;
$dbh->do_stuff;

During those 5 seconds, $dbh instance that your referencing might of
been disconnected and undefined.

Here is an example that needs to be refactored to work without issues
with the max_handles improvement.

sub was_duplicate_error {
my $self = shift;
my $dbh = $self->dbh;
return 0 unless $dbh->err;
return 1 if $dbh->err == 1062 || $dbh->errstr =~ /duplicate/i;
}

to

sub was_duplicate_error {
my $self = shift;
return 0 unless $self->dbh->err;
return 1 if $self->dbh->err == 1062 || $self->dbh->errstr =~ /
duplicate/i;
}

Basically you want to avoid state, that handle instance that you have
might be disconnected/undefined. Run mogilefs with max_handles set to
1, you will see what I mean.

syris:server victori$ perl mogilefsd --config=mogilefsd.conf
beginning run
Starting event loop for frontend job on pid 61523.
Job replicate has only 0, wants 1, making 1.
Job delete has only 0, wants 1, making 1.
Job queryworker has only 0, wants 5, making 5.
Job monitor has only 0, wants 1, making 1.
Job reaper has only 0, wants 1, making 1.
Job job_master has only 0, wants 1, making 1.
Job fsck has only 0, wants 1, making 1.
[monitor(61531)] Monitor running; scanning usage files
[reaper(61532)] Reaper running; looking for dead devices
[monitor(61531)] Monitor running; scanning usage files
crash log: execute on disconnected handle at lib/MogileFS/Store.pm
line 1159.

This is with max_handles set to 1.

> E-Mail : robb...@gentoo.org

Robin H. Johnson

unread,

Nov 5, 2009, 2:03:50 AM11/5/09

to mog...@googlegroups.com

On Wed, Nov 04, 2009 at 10:17:22PM -0800, victori wrote:
> > This still has:
> > +$self->{dbh}->disconnect();
> > +$self->{dbh}{InactiveDestroy} = 0;
> > in the max_handles code.
> I set inactivedestroy to 0 to make sure the dbh handle is garbage
> collected to avoid a memory leak. Got the idea from the SQLGrey
> project. See http://article.gmane.org/gmane.mail.postfix.sqlgrey/1033

Ok, that patch is merged in now, thanks for it.

> Concerning max_handles improvement, now that I have more experience
> with the mogilefs codebase, I see that it might be an issue with some
> parts of MogileFS the way $dbh is handled.

For cases where there is no sleep, how long should we trust the $dbh
handle, or should we convert every instance of:
'my $dbh = $self->dbh;'
To not keep a $dbh variable?

Can you submit a patch for this so we can get more review on it?

Sure with a sleep and two seperate sets of work I can see it being valid
to make sure we ask for a new handle, but for a transaction, using a var
is probably better.

Or alternatively we just check that the $dbh is good more at the start
of each $dbh var.

--
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead

E-Mail : rob...@gentoo.org

dormando

unread,

Nov 5, 2009, 3:15:53 AM11/5/09

to mogile

Any chance you folks could move these changes into a branch? Or at least
lets resolve this quickly.. Some folks upgrade off of trunk and these last
few merges could break quite a bit.

It seems like we need to understand the problem a little more. Look at
where 'validate_dbh' is being called in the code. That's going to be the
safest place to decide whether or not to nuke a connection... Right now it
just calls '->ping' to make sure the connection is valid, and keeps it
forever if it is.

There're a couple places where dbh errors are checked lazily and an awful
lot of dbh calling code. I'm looking at the latest patch for solaris and
am still confused as to why that patch was necessary.

It looks like it's because the dbh isn't forcibly reopened after a fork?
or there's some place where we're forking and not doing that. Which means
the $store{$$} = $etc should have two entries for cases where it's
erroneous.

But... I don't see how that's possible given the old line:
return $store if $store && $store_pid == $$; ? From review, that lest
patch in trunk replaces a pid check with an *equivalent* pid check by
nature of $stores{$$} needing to be unique to a child.

It'd be nice to see a standalone test of a short piece of perl that fork's
and runs code similar to that... This behavior's pretty darn weird.

-Dormando

Gavin Mogan

unread,

Nov 5, 2009, 6:19:31 AM11/5/09

to mog...@googlegroups.com

But... I don't see how that's possible given the old line:
return $store if $store && $store_pid == $$; ? From review, that lest
patch in trunk replaces a pid check with an *equivalent* pid check by
nature of $stores{$$} needing to be unique to a child.

Maybe its not so much the check and the returning of the new one, but with the hash, the old handle is also kept, so its not DESTROY'd when it goes out of scope.

victori

unread,

Nov 5, 2009, 11:49:11 AM11/5/09

to mogile

Just some general thoughts.

- The patches introduced thus far have been safe and stable.
Max_handles is defaulted to OFF, so the old/safe/stable connection
behavior is on by default. Max_handles is for postgres users like I
who need it to avoid postgres process bloat during mass imports of
data into mogilefs.

- InactiveDestroy is actually the correct way of handling DBD::Pg
connections across forks. However, the problem is it leaks memory, at
least here in Solaris. http://www.perlmonks.org/?node_id=594175

- $stores{$$} is what I am proposing to avoid the whole hairy
connection sharing across workers. Give each worker a dedicated Store
instance to avoid the sharing resources. So far with my latest patch,
mogilefs works without any memory leaks.

http://victori.uploadbooth.com/patches/idle-tracker-memcheck.png

svcs: -a ignored when used with arguments.
STATE STIME FMRI
online 20:59:55 svc:/network/mogilefs:default
# date
Thu Nov 5 08:48:41 PST 2009

Its been up for a while.. no memory leaks.

-Victor

victori

unread,

Nov 5, 2009, 12:20:35 PM11/5/09

to mogile

Gavin, your right. A background worker(possibly in the reaper
process?) will need to iterate over the static hash making sure there
are no stale Stores in that hash.

> > > project. Seehttp://article.gmane.org/gmane.mail.postfix.sqlgrey/1033

victori

unread,

Nov 5, 2009, 3:22:14 PM11/5/09

to mogile

Here is another patch, to cover what Gavin pointed out. This ties up
any loose ends with lingering DBI instances.

http://victori.uploadbooth.com/patches/solaris-dbi-reaper-update.patch

This patch updates the reaper process to undefine unused DBI stores to
avoid any memory leaks.

victori

unread,

Nov 5, 2009, 3:59:12 PM11/5/09

to mogile

After a brain storming session with dormando on IRC, the reaper patch
is not needed. I was thinking in multithreaded terms not multi-
process. Mgd is not shared across all processes, so the last patch is
not needed. It works great as-is.

dormando

unread,

Nov 14, 2009, 5:34:38 AM11/14/09

to mogile

Hey,

I just kicked trunk around a bunch. Can you try testing it with your
Solaris/Postgres setup? If you're relying on max_handles, it's the same as
before but you'll want to set the limit a little lower to reflect that
it's the number of times the handler is validated instead of requested.

I think the root of the problem is finally addressed (safely) in both
places. One is that under solaris/postgres, touching a database handle
post-fork is deadly somehow. So now we kill it pre-fork. Finally, the
handler reuse stuff is done more safely.