Profiling live traffic under Apache MPM worker

m...@nationstates.net

unread,

Nov 9, 2014, 7:55:59 PM11/9/14

to develnyt...@googlegroups.com

Hello Tim!

Firstly, thank you for this wonderful profiler. I thought it was sexy BEFORE it got flame graphs.

The way I usually use NYTProf is I try to think which part of my web app might be slow, or get run a lot, and profile it on my test machine. This has been terrifically useful and revealed some great potential optimizations.

However, I have a vision for taking this to the next level and implementing it in the production environment, like this:

1. When Apache starts, it loads Devel::NYTProf::Apache as per the docs, but with "start=no"

2. Every few thousand requests, the web app calls DB::enable_profile($unique_filename) at the start of the request and then DB::finish_profile() at the end, to grab a sample of live traffic.

3. Once a day, the server runs nytprofmerge and nytprofhtml to generate beautiful HTML.

This actually works and is incredibly useful because it answers two questions at once: "Which areas of the code are our users hitting most?" and "Which areas of the code are slower than they should be?" That is infinitely better than me guessing at the first question and profiling what I think looks suspicious, since I am never going to get the answer right. (In particular, it's hard to predict what combinations of user account settings and pageviews might cause unnecessarily high load.)

Unfortunately, it only works under Apache MPM prefork. And prefork isn't a practical option in production, because our web app receives hundreds of concurrent connections, which prefork would require enormous amounts of RAM to satisfy.

I understand from the docs that NYTProf is not thread-safe. So when I try the above on a machine running Apache MPM worker, NYTProf simply does a lot of this:

~ init_profiler for pid 27194, clock 1, tps 10000000, start 0, perldb 0xf10, exitf 0x2
~ init_profiler done
~ enable_profile call from different interpreter ignored
~ finish_profile call from different interpreter ignored
~ enable_profile call from different interpreter ignored
~ finish_profile call from different interpreter ignored

I'm writing in anyway because I'm holding onto the hope that this need not apply in my limited case, where I only want to start profiling after Apache has created a thread and finish profiling in that same thread. I don't need NYTProf to follow an Apache process through a fork or whatever it does. In fact, if it matters, I'd be happy to limit the profiling to a single grab from each MPM worker process--they get recycled reasonably often, so that would still provide enough profiling data to be useful. I just want to be able to sample live data.

What do you think?

Thank you again for your marvellous software,

Max.

Tim Bunce

unread,

Nov 10, 2014, 11:25:25 AM11/10/14

to develnyt...@googlegroups.com

On Sun, Nov 09, 2014 at 04:55:59PM -0800, m...@nationstates.net wrote:
> Hello Tim!

Hi Max.

> I understand from the docs that NYTProf is not thread-safe. So when I try the above on a machine running
> Apache MPM worker, NYTProf simply does a lot of this:
>
> ~ init_profiler for pid 27194, clock 1, tps 10000000, start 0, perldb 0xf10, exitf 0x2
> ~ init_profiler done
> ~ enable_profile call from different interpreter ignored
> ~ finish_profile call from different interpreter ignored
> ~ enable_profile call from different interpreter ignored
> ~ finish_profile call from different interpreter ignored
>
> I'm writing in anyway because I'm holding onto the hope that this need not apply in my limited case,
> where I only want to start profiling after Apache has created a thread and finish profiling in that same
> thread. I don't need NYTProf to follow an Apache process through a fork or whatever it does. In fact, if
> it matters, I'd be happy to limit the profiling to a single grab from each MPM worker process--they get
> recycled reasonably often, so that would still provide enough profiling data to be useful. I just want
> to be able to sample live data.
>
> What do you think?

The "... call from different interpreter ignored" message means
"call from different interpreter than the first one that enabled profiling".
So the 1-in-N sampling could be some on the basis of 1-in-N requests
that are handled by that thread.

Another approach would be to use a FastCGI backend connected via a
filesystem socket. Then, whenever it suits you, you can simply start a
separate single-threaded FastCGI backend process connected to that socket.

> Thank you again for your marvellous software,

You're welcome.

Tim.

m...@nationstates.net

unread,

Nov 10, 2014, 9:16:59 PM11/10/14

to develnyt...@googlegroups.com

Thank you for the reply!

On Tuesday, November 11, 2014 3:25:25 AM UTC+11, Tim.Bunce wrote:

The "... call from different interpreter ignored" message means
"call from different interpreter than the first one that enabled profiling".

So the 1-in-N sampling could be some on the basis of 1-in-N requests
that are handled by that thread.

We run under mod-perl. I should have mentioned that before, sorry. From what I understand, mod-perl creates a pool of perl interpreters like this:

mod_perl 2.0 has [...] two classes of interpreters, parent and clone. A parent is like in mod_perl 1.0, where the main interpreter created at startup time compiles any preloaded Perl code. A clone is created from the parent using the Perl API perl_clone() function. At request time, parent interpreters are used only for making more clones, as the clones are the interpreters that actually handle requests.

Now when you say "the first [interpreter] that enabled profiling," you're talking about the initial load of Devel::NYTProf::Apache, right? As opposed to later calls of DB::enable_profiling(). Because I can see that loading the module immediately triggers the following log, even when NYTPROF is set to "start=no":

~ init_profiler for pid 22440, clock 1, tps 10000000, start 0, perldb 0xf10, exitf 0x2

And I gather this happens because of the BEGIN { } block in Devel::NYTProf::Apache.

After that, under Apache MPM worker, all subsequent calls to DB::enable_profiling() fail like this:

~ enable_profile call from different interpreter ignored

If I have this right, the problem is it's no good loading NYTProf and letting it initialize in a mod-perl parent interpreter, because parent interpreters don't handle requests: only child interpreters do.

So today I've been trying to delay the loading of Devel::NYTProf::Apache until it's in a child interpreter. I kind of got this working, but the final output is not great... still useful, but NYTProf seems very confused and inaccurate, I guess due to being loaded so late. So this whole approach may be doomed. But anyway, what I found worked was not loading Devel::NYTProf::Apache until the web app decides it wants to profile a session, then calling "eval { require Devel::NYTProf::Apache; }".

This works intermittently. Under load, when mod-perl starts spitting out new interpreters, it fails a lot with "Devel/NYTProf.pm did not return a true value at /usr/lib/perl5/Devel/NYTProf/Apache.pm line 30." It also puts out lines like this:

NYTProf: perl interpreter address changed after init (threads/multiplicity not supported)

For fun, I tried unloading and reloading NYTProf but got segfaults.

Working intermittently is actually not so bad, since I only want to sample intermittently. But given the confused final output, it doesn't seem worthwhile.

So I'm afraid I'm stuck on how to follow your advice to sample "1-in-N requests that are handled by that thread." It seems if I load NYTProf early, like it wants, it will then ignore all DB::enable_profiling() calls, because they come from child interpreters. But if I load it late, it can't do its job.

Another approach would be to use a FastCGI backend connected via a
filesystem socket. Then, whenever it suits you, you can simply start a
separate single-threaded FastCGI backend process connected to that socket.

I have tried, honestly, but completely failed to understand what this means. I know those words but not what you're describing.

Thank you again!

Max.

Moritz Lenz

unread,

Nov 11, 2014, 1:31:26 AM11/11/14

to develnyt...@googlegroups.com

Hi,

You could run two instances of Apache, one MPM worker for the high loads
and many concurrent requests, and a much smaller instance on prefork
just for profiling, and then use some kind of load balancing to route a
small amount of requests to the prefork server. Based on my experience
with NYTProf, Apache and threading, that will be much more reliable than
trying to run NYTProf in the threaded worker.

Cheers,
Moritz

Tim Bunce

unread,

Nov 11, 2014, 10:25:47 AM11/11/14

to develnyt...@googlegroups.com

On Mon, Nov 10, 2014 at 06:16:59PM -0800, m...@nationstates.net wrote:
>
> So today I've been trying to delay the loading of Devel::NYTProf::Apache until it's in a child
> interpreter.

Load it in the parent but ensure the NYTPROF 'start' option is set to 'no'.
https://metacpan.org/pod/Devel::NYTProf#start
https://metacpan.org/pod/Devel::NYTProf#RUN-TIME-CONTROL-OF-PROFILING

> Another approach would be to use a FastCGI backend connected via a
> filesystem socket. Then, whenever it suits you, you can simply start a
> separate single-threaded FastCGI backend process connected to that socket.
>
> I have tried, honestly, but completely failed to understand what
> this means. I know those words but not what you're describing.

It would be a significant change of architecture, so ignore it for now.

Moritz's suggestion sounds more directly useful.

Tim.

m...@nationstates.net

unread,

Nov 11, 2014, 7:04:00 PM11/11/14

to develnyt...@googlegroups.com

On Wednesday, November 12, 2014 2:25:47 AM UTC+11, Tim.Bunce wrote:

On Mon, Nov 10, 2014 at 06:16:59PM -0800, m...@nationstates.net wrote:
>
> So today I've been trying to delay the loading of Devel::NYTProf::Apache until it's in a child
> interpreter.

Load it in the parent but ensure the NYTPROF 'start' option is set to 'no'.
https://metacpan.org/pod/Devel::NYTProf#start
https://metacpan.org/pod/Devel::NYTProf#RUN-TIME-CONTROL-OF-PROFILING

Right! That's exactly what I've been doing. But NYTProf still calls init_profiler immediately, then ignores calls to enable_profile.

Here is an example. In http.conf, I have "PerlSetEnv NYTPROF start=no:trace=1":

$ sudo service apache2 start
$ cat /var/log/apache2/error.log
# trace=1
# endatexit=1
# addpid=1
~ init_profiler for pid 10870, clock 1, tps 10000000, start 0, perldb 0xf10, exitf 0x2
~ init_profiler done
~ END done
~ finish_profile (overhead 0t, is_profiling 0)
~ disable_profile (previously disabled, pid 10870, trace 1)
~ END done
~ finish_profile (overhead 0t, is_profiling 0)
~ disable_profile (previously disabled, pid 10870, trace 1)
~ finish_profile (overhead 0t, is_profiling 0)
~ disable_profile (previously disabled, pid 10870, trace 1)
# trace=1
# endatexit=1
# addpid=1
~ init_profiler for pid 10871, clock 1, tps 10000000, start 0, perldb 0xf10, exitf 0x2
~ init_profiler done
[Wed Nov 12 10:05:00.631765 2014] [mpm_prefork:notice] [pid 10871] AH00163: Apache/2.4.7 (Ubuntu) mod_perl/2.0.8 Perl/v5.18.2 configured -- resuming normal operations
[Wed Nov 12 10:05:00.631809 2014] [core:notice] [pid 10871] AH00094: Command line: '/usr/sbin/apache2'

So far this is the same whether I'm running Apache MPM prefork or worker. But the behavior diverges once Apache accepts its first request and the web app calls DB::enable_profile():

In MPM prefork, everything is awesome:

~ enable_profile (previously disabled) to /tmp/nytprof.ns.1415747125.488050.out
~ new pid 10875 (was 10871) forkdepth -1
~ opened /tmp/nytprof.ns.1415747125.488050.out.10875 at 1415747125.533632
~ first statement profiled at line 110 of /var/www/cgi-bin/index.cgi, pid 10875
~ finish_profile (overhead 26825t, is_profiling 1)
~ disable_profile (previously enabled, pid 10875, trace 1)
~ writing file source code
~ writing sub line ranges - prescan
~ writing sub line ranges of 3202 subs
~ writing sub callers for 229 subs
~ closed file at 1415747126.277812

In MPM worker, not so good:

~ enable_profile call from different interpreter ignored

~ finish_profile call from different interpreter ignored

There are definitely no calls to enable_profile before this. In fact, under this setup, enable_profile never gets called, not once, by anything.

By the way, I don't know if this is related, but I discovered some strange behavior with setting NYTPROF, after I got suspicious about how there's no "# start=no" line in the log despite that option definitely being set. What I found is that if NYTPROF is set to "file=<whatever>" only, then Apache MPM worker segfaults. But if I throw in an additional valid option, it doesn't. E.g.:

NYTPROF=file=/tmp -->segfault
NYTPROF=file=/tmp:start=no --> works fine
NYTPROF=trace=1:file=/tmp --> works fine
NYTPROF=file=/tmp:foo=bar --> segfault

It also segfaults if NYTPROF isn't defined, due to the default:

NYTPROF env var not set, so defaulting to NYTPROF='file=/tmp/nytprof.12033.out' at /usr/lib/perl5/Devel/NYTProf/Apache.pm line 22.

You may not care, since this only happens under MPM worker and NYTProf isn't supposed to be thread-safe, but I thought it was odd enough to mention.

I didn't discover why the log doesn't say "# start=no". But NYTProf's behavior is a little different with that option set--it makes no enable_profile calls--so it does seem to be registered.

Moritz's suggestion sounds more directly useful.

Mmm, yes, thank you for that, Moritz! But I am loathe to go down that path since we've tried running multiple instances of Apache in the past and had issues. It's definitely doable, it's just a more difficult setup to maintain going forward.

Max.

Reply all

Reply to author

Forward