Before I get further lost in this, I wonder if anyone else believes, or
has comments on, the following extract of a Devel::NYTProf run. The
code uses LWP::UserAgent in a POE server to retrieve a ~30MB file over
the local 1Gb network (and it takes over 400s). We are NOT using
LWP::UserAgent::POE - in fact it is not even installed on this
machine. I fail to see how line 158 below can take 320s. This is a
POST request to an Apache server with a small amount of Perl to read
the file and check it belongs to the person who is retrieving it. Run
outside of the POE server, the same POST takes around 5s with this
code:
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $ua  = LWP::UserAgent->new();
# POST to the CGI script that checks ownership and returns the file
my $get = HTTP::Request->new('POST', 'http://xxx.yyy.zzz/cgi/x.cgi?job_id=697171&session_id=892416290D9E04EFE040007F01006F5B');
my $r   = $ua->request($get);
and this is the same code being run inside the POE Server.
Devel::NYTProf reports the top time in HTTP::Message::add_content like
this:
149                 sub add_content
# spent 320s (320+149ms) within HTTP::Message::add_content which was called 7585 times, avg 42.3ms/call:
# 7585 times (320s+149ms) by LWP::Protocol::__ANON__[/usr/local/share/perl/5.10.0/LWP/Protocol.pm:139] at line 137 of LWP/Protocol.pm, avg 42.3ms/call
150                 {
151  7585  12.0ms       my $self = shift;
152  7585  22.4ms       $self->_content unless exists $self->{_content};
153  7585  14.5ms       my $chunkref = \$_[0];
154  7585  14.7ms       $chunkref = $$chunkref if ref($$chunkref); # legacy
155
156  7585  66.2ms  7585  149ms  _utf8_downgrade($$chunkref);
# spent 149ms making 7585 calls to HTTP::Message::__ANON__[HTTP/Message.pm:18], avg 20µs/call
157
158  7585  320s         my $ref = ref($self->{_content});
159  7585  53.9ms       if (!$ref) {
160  7585  365ms            $self->{_content} .= $$chunkref;
161                     }
162                     elsif ($ref eq "SCALAR") {
163                         ${$self->{_content}} .= $$chunkref;
164                     }
165                     else {
166                         Carp::croak("Can't append to $ref content");
167                     }
168  7585  242ms        delete $self->{_parts};
169                 }
POE::Kernel::CORE:sselect is next with an exclusive time of 103s, which
I imagine is time spent waiting for data to be sent from Apache.
Unfortunately this is a huge bit of code and so far I've been unsuccessful
in reducing it to a small example which exhibits the same problem.
I'd appreciate any comments or suggestions before I go any further in
case someone sees something obvious I've missed.
On Jun 17, 11:20 pm, Tim Bunce <Tim.Bu...@pobox.com> wrote:
> On Thu, Jun 17, 2010 at 11:04:11AM -0700, Martin wrote:
> > Before I get further lost in this I wonder if anyone else believes or
> > has comments on the following extract of a Devel::NYTProf run. The
> > code uses LWP::UserAgent in a POE server to retrieve ~ 30Mb file on
> > the local 1Gb network (and it takes over 400s). We are NOT using
> > LWP::UserAgent::POE - in fact it is not even installed on this
> > machine. I fail to see how line 158 below can take 320s.
> Can you send me the profile data file?
> Tim.
By that I presume you mean the nytprof.out file. I can do that
tomorrow when back at work as my colleague has it.
It may be very large in which case I may provide FTP access to it.
On Thu, Jun 17, 2010 at 11:04:11AM -0700, Martin wrote:
> Before I get further lost in this I wonder if anyone else believes or
> has comments on the following extract of a Devel::NYTProf run. The
> code uses LWP::UserAgent in a POE server to retrieve ~ 30Mb file on
> the local 1Gb network (and it takes over 400s).
The overall numbers make sense though. The application run really does take the amount of time nytprof says, and the subroutine times and statement times are consistent. So I suspect perl not nytprof.
> I fail to see how line 158 below can take 320s.
> Run outside of the POE Server the same POST takes around 5s with this
> code: [...]
> and this is the same code being run inside the POE Server.
> Devel::NYTProf reports the top time in HTTP::Message::add_content like
> this:
On Jun 18, 10:17 am, Tim Bunce <Tim.Bu...@pobox.com> wrote:
> Thanks for sending me the profile Martin.
> On Thu, Jun 17, 2010 at 11:04:11AM -0700, Martin wrote:
> > Before I get further lost in this I wonder if anyone else believes or
> > has comments on the following extract of a Devel::NYTProf run. The
> > code uses LWP::UserAgent in a POE server to retrieve ~ 30Mb file on
> > the local 1Gb network (and it takes over 400s).
> The overall numbers make sense though. The application run really does
> take the amount of time nytprof says, and the subroutine times and
> statement times are consistent. So I suspect perl not nytprof.
The application run really does take that time, I agree and I should
have stated that to avoid confusion - sorry.
> > I fail to see how line 158 below can take 320s.
> > Run outside of the POE Server the same POST takes around 5s with this
> > code: [...]
> > and this is the same code being run inside the POE Server.
> > Devel::NYTProf reports the top time in HTTP::Message::add_content like
> > this:
The code in question is:
my $ref = ref($self->{_content});
if (!$ref) {
    $self->{_content} .= $$chunkref;
}
and I could see the concatenation perhaps taking longer and longer as
it is concatenating 4K chunks each time until it reaches 30Mb. I
didn't understand why the 320s was reported on the ref() though and
that is what confused me.
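As a rough sanity check on the concatenation theory, the append pattern can be timed on its own. This is only a sketch: the sub name time_appends and the %msg hash are made up for illustration, and the chunk count and size are taken from the profile above (7585 calls, roughly 4K each). It does not involve LWP or POE at all.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Append $chunks copies of a $chunk_size-byte string to a hash element,
# mimicking how add_content grows $self->{_content}.
# (time_appends and %msg are hypothetical names for this sketch.)
sub time_appends {
    my ($chunks, $chunk_size) = @_;
    my %msg   = (_content => '');
    my $chunk = 'x' x $chunk_size;
    my $start = time;
    $msg{_content} .= $chunk for 1 .. $chunks;
    return (length $msg{_content}, time - $start);
}

# 7585 chunks of 4096 bytes is roughly the 30MB transfer in the profile.
my ($len, $elapsed) = time_appends(7585, 4096);
printf "appended %d bytes in %.3fs\n", $len, $elapsed;
```

If this finishes quickly on the affected machine, plain string growth alone is unlikely to account for hundreds of seconds.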
> Nick, does this ring any bells with you?
> Martin, please try with perl 5.10.1 as your 5.10.0 may be buggy.
> Tim.
It may take a bit longer to try 5.10.1 as we have a huge number of
modules in use but I'll have a go.
I'm now also getting negative times in the time-in-subs column for some statements:
49 1 19µs 1 -90.1s $out = $q->sendRequest(\%r);
# spent -90.1s making 1 call to XXX::Queue::sendRequest
so very strange things are happening.
Thanks for looking at it.
I'm still looking at it so I'll come back with what I find.
I upgraded to Perl 5.10.1 and the problem has gone away.
It is going to be a PITA to do this permanently but I seem to have
little choice.
I've no idea what was fixed between 5.10.0 and 5.10.1 that would
explain this; I had a quick look in perldelta but nothing struck me.
On Fri, Jun 18, 2010 at 05:47:50AM -0700, Martin wrote:
> I upgraded to Perl 5.10.1 and problem has gone away.
> It is going to be a PITA to do this permanently but I seem to have
> little choice. I've no idea what problem was fixed between 5.10.0 and
> 5.10.1 which caused this, I had a quick look in perldelta but nothing
> struck me.
There were weird things with bless, overload and pathological hashes around the time of 5.10.0 but I didn't need to pay attention (since we're using 5.8.8 now and 5.12.1 soon) so I didn't.
Maybe Nicholas can shed some pumpkin light on it.
Please also give us your perl -V output, for the record.
> Thanks Tim for suggesting that.
You're welcome.
I'm glad NYTProf did "exactly what it said on the tin" :)
On Fri, Jun 18, 2010 at 03:38:07AM -0700, Martin wrote:
> On Jun 18, 10:17 am, Tim Bunce <Tim.Bu...@pobox.com> wrote:
> > Thanks for sending me the profile Martin.
> > On Thu, Jun 17, 2010 at 11:04:11AM -0700, Martin wrote:
> > > Before I get further lost in this I wonder if anyone else believes or
> > > has comments on the following extract of a Devel::NYTProf run. The
> > > code uses LWP::UserAgent in a POE server to retrieve ~ 30Mb file on
> > > the local 1Gb network (and it takes over 400s).
> > The overall numbers make sense though. The application run really does
> > take the amount of time nytprof says, and the subroutine times and
> > statement times are consistent. So I suspect perl not nytprof.
> The application run really does take that time, I agree and I should
> have stated that to avoid confusion - sorry.
No problem. I was just stating it for the record. I often do "stream of thought" emails when I'm working on diagnosing a problem.
> I'm now also getting negative times for some statements time in subs:
(actually that's a subroutine profiler time for time spent in the sub)
> 49 1 19µs 1 -90.1s $out = $q->sendRequest(\%r);
> # spent - 90.1s making 1 call to XXX::Queue::sendRequest
I don't see sendRequest in the profile you sent me, but I do see some negative times for subroutines. Looking at the table of all subs and sorting by inclusive time gives me:
Recursion tends to cause odd timings but I'm not sure if that's the case here. I've committed a change to add recursion details to the subroutine table. The way timings are handled for subs that recurse is quite possibly broken.
On Fri, Jun 18, 2010 at 03:38:07AM -0700, Martin wrote:
> The code in question is:
> my $ref = ref($self->{_content});
> if (!$ref) {
> $self->{_content} .= $$chunkref;
> }
> and I could see the concatenation perhaps taking longer and longer as
> it is concatenating 4K chunks each time until it reaches 30Mb. I
> didn't understand why the 320s was reported on the ref() though and
> that is what confused me.
That's what I would have thought. Assuming that the line number attribution is accurate, and it's something in the hash lookup and reference taking, that's most strange. I don't think that there's any copying involved.
On Fri, Jun 18, 2010 at 02:26:53PM +0100, Tim Bunce wrote:
> On Fri, Jun 18, 2010 at 05:47:50AM -0700, Martin wrote:
> > I upgraded to Perl 5.10.1 and problem has gone away.
> > It is going to be a PITA to do this permanently but I seem to have
> > little choice. I've no idea what problem was fixed between 5.10.0 and
> > 5.10.1 which caused this, I had a quick look in perldelta but nothing
> > struck me.
> There were weird things with bless, overload and pathological hashes
> around the time of 5.10.0 but I didn't need to pay attention (since
> we're using 5.8.8 now and 5.12.1 soon) so I didn't.
> Maybe Nicholas can shed some pumpkin light on it.
> Please also give us your perl -V output, for the record.
-V for both would be interesting.
5.10.0 and 5.10.1 generate an identical optree for
my $ref = ref($self->{_content});
if (!$ref) {
    $self->{_content} .= $$chunkref;
}
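For anyone wanting to repeat the optree comparison, one way (not necessarily how it was done here) is to dump the fragment with B::Concise under each perl and diff the results. In this sketch the anonymous sub $fragment and the helper optree_text are made up for illustration; $self and $chunkref exist only so the fragment compiles under strict.

```perl
use strict;
use warnings;
use B::Concise ();

# Stand-in for the add_content fragment (hypothetical wrapper sub).
my $fragment = sub {
    my ($self, $chunkref) = @_;
    my $ref = ref($self->{_content});
    if (!$ref) {
        $self->{_content} .= $$chunkref;
    }
};

# Render the optree of a code ref in execution order into a string.
sub optree_text {
    my ($code) = @_;
    B::Concise::walk_output(\my $buf);        # capture output in $buf
    B::Concise::compile('-exec', $code)->();  # build the walker and run it
    return $buf;
}

print optree_text($fragment);
```

Run the same script under both perl binaries and compare the output; sequence numbers and addresses may differ even when the ops themselves match.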
On Fri, Jun 18, 2010 at 03:38:07AM -0700, Martin wrote:
> It may take a bit longer to try 5.10.1 as we have a huge number of
> modules in use but I'll have a go.
5.10.1 is binary compatible with 5.10.0 (if you build it with the same options), so you shouldn't need to rebuild any modules.
On Fri, Jun 18, 2010 at 08:39:47AM -0700, Martin wrote:
> perl -V for broken case:
> -DDEBUGGING=-g -Doptimize=-O2
That's a curious configuration. On the plus side it does mean the binary will have symbols. So you could use a sampling profiler [1] to probe where perl is spending its time.
perl -V for working 5.10.1.
Summary of my perl5 (revision 5 version 10 subversion 1) configuration:
Characteristics of this binary (from libperl):
  Compile-time options: PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP
                        USE_LARGE_FILES USE_PERLIO
  Built under linux
  Compiled at Dec 3 2009 16:41:19
  %ENV:
    PERL5LIB="/home/martin/bet/tools/modules/BET/lib:/home/martin/bet/cgi"
  @INC:
    /home/martin/bet/tools/modules/BET/lib
    /home/martin/bet/cgi
    /home/martin/perl5101/lib/5.10.1/i686-linux
    /home/martin/perl5101/lib/5.10.1
    /home/martin/perl5101/lib/site_perl/5.10.1/i686-linux
    /home/martin/perl5101/lib/site_perl/5.10.1
    .
> On Jun 19, 7:37 pm, Tim Bunce <Tim.Bu...@pobox.com> wrote:
> > On Fri, Jun 18, 2010 at 08:39:47AM -0700, Martin wrote:
> > > perl -V for broken case:
> > > -DDEBUGGING=-g -Doptimize=-O2
> > That's a curious configuration.
> Not sure where it came from on this machine - it may be as installed
> by Ubuntu. The newer 5.10.1 was built by me using all defaults.
> > On the plus side it does mean the binary
> > will have symbols. So you could use a sampling profiler [1] to probe where
> > perl is spending its time.
> So far I have been unable to reproduce this in a smaller test case but
> I am continuing to try.
> Martin
OK, I've found the problem and it has nothing to do with Perl versions.
It is taint checking. The one that was taking ages was running with -t
from init; I forgot that and was running it manually without -t.
Now I am even more suspicious that the concatenation line, rather than
the ref() line, is taking the time. Since Tim found a line was taking
more and more ticks, and the concatenation adds 4K each time until the
string reaches 32MB, it almost sounds like taint checking is expensive
and doing something every time a chunk is added to the string.
I'm very sorry I forgot about taint checking. This is the second issue
we've found like this. I'm really going off taint mode quickly.
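Since the difference came down to how the process was started, it may help to have the program report its own taint setting at startup. A minimal sketch using the ${^TAINT} variable from perlvar (1 under -T, -1 under -t or -TU, 0 otherwise); the helper name taint_mode_label is made up for illustration.

```perl
use strict;
use warnings;

# Turn the value of ${^TAINT} into a readable label.
# (taint_mode_label is a hypothetical helper for this sketch.)
sub taint_mode_label {
    my ($flag) = @_;
    return $flag > 0 ? 'taint checks on (-T)'
         : $flag < 0 ? 'taint warnings only (-t)'
         :             'taint mode off';
}

print "This process: ", taint_mode_label(${^TAINT}), "\n";
```

Logging this once from the init-started server and once from the manual run would have shown the mismatch straight away.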
I can now reproduce with this:
create a file.dat file in your Apache htdocs, make it around 32MB
in size, then run this:
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
my $ua  = LWP::UserAgent->new();
my $get = HTTP::Request->new('GET', 'http://localhost/file.dat');
my $r   = $ua->request($get);
print length($r->decoded_content), "\n";
with and without taint checking. I should also add that our Apache
returns .dat files as UTF-8 encoded.
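Just for your reference, I originally thought Devel::NYTProf was
reporting a long time spent on the wrong line. I know this is not a
Devel::NYTProf issue and it is simply that I used Devel::NYTProf to
highlight the troublesome line. Here is a smaller, self-contained
example which shows the line originally highlighted by Devel::NYTProf
is the problem:
# run with taint checking enabled (perl -t or perl -T)
use strict;
use warnings;
use Scalar::Util qw(tainted);
my $fd;
open($fd, ">", "file.dat") or die "cannot write file.dat: $!";
print $fd 'x' x 4096;
close $fd;
my $data;
{
    local $/;
    open($fd, "<", "file.dat") or die "cannot read file.dat: $!";
    $data = <$fd>;
    close $fd;
}
print "data is tainted: ", tainted($data) ? 'yes' : 'no', "\n";
my %hash;
$hash{content} = '';
foreach (1..10000) {
    # comment following line out to make this fly
    my $ref = ref($hash{content});
    #if (!$ref) {
    $hash{content} .= $data;
    #}
}
print length($hash{content}), "\n";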
On Tue, Jun 22, 2010 at 4:54 AM, Martin <martin.j.ev...@gmail.com> wrote:
> Just for your reference, I originally thought Devel::NYTProf was
> reporting a long time spent on the wrong line. I know this is not a
> Devel::NYTProf issue and it is simply that I used Devel::NYTProf to
> highlight the troublesome line. Here is a smaller, self contained
> example which proves the line originally highlighted by Devel::NYTProf
> is the problem:
> use strict;
> use warnings;
> use Scalar::Util qw(tainted);
> my $fd;
> open($fd, ">", "file.dat");
> print $fd 'x' x 4096;
> close $fd;
> my $data;
> {
> local $/;
> open ($fd, "<", "file.dat");
> $data = <$fd>;
> close $fd;
> }
> print "data is tainted: ", tainted($data) ? 'yes' : 'no', "\n";
> my %hash;
> $hash{content} = '';
> foreach (1..10000) {
> # comment following line out to make this fly
> my $ref = ref($hash{content});
> #if (!$ref) {
> $hash{content} .= $data;
> #}
> }
> print length($hash{content}), "\n";
> Problem exists at least until up to 5.12.1.
> I will pursue this elsewhere now (although I'm not sure where right
> now) and thanks for your help.
If you run `perlbug` then you can file this as a bug against core perl and all the core perl 5 developers will hear about it in an email. Most of what you've already posted would be great to copy/paste into the bug report. It'll generate a formatted email to an address something like perl...@perl.org. You can then either let it send the message directly, if your system routes external email, or do what I do, which is to save the bug report as a text file and use my gmail to send the exact "same" message. Please use the text formatting as generated by perlbug.