Parrot, Perl 5 and performance


ozg...@gmail.com

unread,
Mar 12, 2007, 5:29:49 PM3/12/07
to perl6-c...@perl.org, perl-perl...@moderators.isc.org
Hi all,

I'd like to get opinions from developers on this list. I'm looking
into this system that executes massive amounts of Perl 5 code on a
Perl 5.8 interpreter. The system burns tons of CPU while running Perl
code, and I'm speculating on ways to improve our throughput (say, 50
billion inst per module -> 10 billion inst/modl) and latency (say, 2
sec/modl -> 0.5 secs/modl).

For us, switching to another language would require considerable
effort. I'm wondering, if we could compile Perl 5 source code into
Parrot byte code and run it in Parrot's VM, would we see any
performance benefit? Intuitively, running compiled code should go
faster than interpreting it. We can also pre-compile all of our Perl
code at almost no cost (the CPU cost is relatively small).

In summary, my questions are:

(1) How much effort would it take to convert Perl 5 source to Parrot
bytecode? Is this even possible?
(2) How much performance benefit should I expect if I [could] compile
and run Perl 5 source?

(I read "Perl 6 and the Parrot VM", but couldn't find any numbers or
even performance speculations. I also looked into "The case for
virtual register machines", but the paper examines the Java VM. Java's
developer library cites an increase of 10x going from interpreted to
compiled, but I don't know if that applies to dynamically typed
languages.)

Thanks,

Ozgun.

Chromatic

unread,
Mar 12, 2007, 8:43:02 PM3/12/07
to perl6-c...@perl.org, ozg...@gmail.com
On Monday 12 March 2007 14:29, ozg...@gmail.com wrote:

I wish I had better news for you, but I'm not sure anyone can answer your
questions easily without a lot more information.

> I'd like to get opinions from developers on this list. I'm looking
> into this system that executes massive amounts of Perl 5 code on a
> Perl 5.8 interpreter. The system burns tons of CPU while running Perl
> code, and I'm speculating on ways to improve our throughput (say, 50
> billion inst per module -> 10 billion inst/modl) and latency (say, 2
> sec/modl -> 0.5 secs/modl).

> For us, switching to another language would require considerable
> effort. I'm wondering, if we could compile Perl 5 source code into
> Parrot byte code and run it in Parrot's VM, would we see any
> performance benefit? Intuitively, running compiled code should go
> faster than interpreting it.

That depends on what you mean by "compiled" and "interpreting". Parrot has to
interpret its bytecode too, depending on what "interpret" means, and Perl 5
compiles source code into an op tree and walks that.

Parrot does have a JIT that can turn certain types of operations into
platform-specific instructions, but there's a time and memory cost for doing
that as well, and it can benefit certain applications but not others.

> In summary, my questions are:
>
> (1) How much effort would it take to convert Perl 5 source to Parrot
> bytecode?

A fair amount. A complete translation of Perl 5 syntax and semantics is a
complex project. A subset of Perl 5 could be much easier.

> Is this even possible?

Yes it is.

> (2) How much performance benefit should I expect if I [could] compile
> and run Perl 5 source?

That really depends on your program and what the overhead is.

I *expect* Parrot has less overhead for most operations than Perl 5 does
because:

*) Parrot bytecode has better potential cache characteristics than Perl 5
optrees do

*) Parrot doesn't pay the overhead of magic in Perl 5 (which complicates just
about every pp_code)

*) Parrot uses registers, not stacks, so it avoids the overhead of stack
manipulation

However, Parrot does lack some of the maturity of Perl 5, so it hasn't had as
many optimizations done.

> (I read "Perl 6 and the Parrot VM", but couldn't find any numbers or
> even performance speculations. I also looked into "The case for
> virtual register machines", but the paper examines the Java VM. Java's
> developer library cites an increase of 10x going from interpreted to
> compiled, but I don't know if that applies to dynamically typed
> languages.)

The terms "interpreted" and "compiled" really need disambiguation in these
circumstances anyway; they're awfully fuzzy terms.

-- c

Gabor Szabo

unread,
Mar 13, 2007, 8:42:39 AM3/13/07
to ozg...@gmail.com, perl6-c...@perl.org, perl-perl...@moderators.isc.org
On 12 Mar 2007 14:29:49 -0700, ozg...@gmail.com <ozg...@gmail.com> wrote:
> Hi all,
>
> I'd like to get opinions from developers on this list. I'm looking
> into this system that executes massive amounts of Perl 5 code on a
> Perl 5.8 interpreter. The system burns tons of CPU while running Perl
> code, and I'm speculating on ways to improve our throughput (say, 50
> billion inst per module -> 10 billion inst/modl) and latency (say, 2
> sec/modl -> 0.5 secs/modl).

Have you tried profiling your code to see if there are places to
optimize?

Gabor

ozg...@gmail.com

unread,
Mar 13, 2007, 1:28:59 PM3/13/07
to perl6-c...@perl.org, perl-perl...@moderators.isc.org

Thanks for the reply. To be a little more specific, we have tens of
thousands of Perl modules internally. Before running them in production,
we profile these modules, pass them through a circular-reference
detector (code that intercepts all allocations and, at certain
intervals, walks over Perl's SVs, AVs, and HVs), and try to do static
analysis on them for security purposes (this doesn't work well with
Perl, unfortunately, due to dynamic typing).

In our setting, there isn't a single module that takes up 3 or 5% of
CPU. Since we profile code on a regular basis, these offenders are
easy to catch and fix. The problem is, we have more and more modules
written every day, each taking, say, 0.01% of CPU time, but they add
up to quite a lot. We also have a lot of code running in the same
environment in C++ (they talk over XS), and recently a friend
forwarded the following benchmarks:

http://shootout.alioth.debian.org/debian/benchmark.php?test=nbody&lang=all
(one sample test)
http://shootout.alioth.debian.org/debian/benchmark.php?test=all&lang=java&lang2=perl

Every language has its merits, and I know you can never do an apples
to apples comparison. However, if we can get more performance out of
Perl's VM, well, that would be great. In short, I just wondered: if
Perl/Parrot were on the benchmarks, around where would they place?

Thanks,

Ozgun.

Isaac Gouy

unread,
Mar 13, 2007, 7:55:26 PM3/13/07
to perl6-c...@perl.org, perl-perl...@moderators.isc.org
On Mar 13, 10:28 am, ozg...@gmail.com wrote:
> Thanks for the reply. To be a little more specific, we have internally
> tens of thousands of Perl modules. Before running them in production:
> we profile these modules, pass them through a circular reference
> detector (code that intercepts all allocations, and at certain
> intervals walks over Perl's SVs, AVs, HVs), and try to do static
> analysis on them for security purposes (this doesn't work well with
> Perl unfortunately, due to dynamic typing).
>
> In our setting, there isn't a single module that takes up 3 or 5% of
> CPU. Since we profile code on a regular basis, these offenders are
> easy to catch and fix. The problem is, we have more and more modules
> written every day, each taking, say, %0.01 of CPU time, but they add
> up to quite a lot. We also have a lot of code running in the same
> environment in C++ (they talk over XS), and recently a friend
> forwarded the following benchmarks:
>
> http://shootout.alioth.debian.org/debian/benchmark.php?test=nbody&lang=all
> (one sample test)
> http://shootout.alioth.debian.org/debian/benchmark.php?test=all&lang=java&lang2=perl
>
> Every language has its merits, and I know you can never do an apples

> to apples comparison. However, if we can get more performance out of
> Perl's VM, well, that would be great. In short, I just wondered if
> Perl/Parrot were on the benchmarks, around where it would be.
>
> Thanks,
>
> Ozgun.

PIR is on the debian computer language shootout, see
http://shootout.alioth.debian.org/sandbox/benchmark.php?test=all&lang=parrot

Also see the gentoo computer language shootout, http://shootout.alioth.debian.org/gp4/

a...@ippimail.com

unread,
Mar 14, 2007, 2:47:45 PM3/14/07
to perl6-c...@perl.org
From ozgun:
>
> Inlining replies.
>
>> 1. What's the environment; Solaris, GNU/Linux, *nix, Windows?
>
> Linux.
>
>> 2. What hard information do you have on the resources being used? Have
>> you been able to profile it? Pareto's Law applies surprisingly often.
>
> Tons. In fact, we have more information than we'd like. Since we
> profile and optimize on a regular basis, our workset doesn't exhibit
> Pareto's Law.
>
>> 3. What sort of application is it? Is it CPU or I/O intensive?
>
> CPU intensive. In fact, there is no disk I/O. We're planning to take
> the disks out at some point because they cost too much.
>
Are you doing DNA sequencing?

It looks as though you've done all the obvious things. If you haven't
already done them, my only questions remain: are the boxes maxed-out with
memory, and are the algorithms optimal? (Checking that might be difficult
if you've got a "death by a thousand cuts" type of workload.)


--

Email and shopping with the feelgood factor!
55% of income to good causes. http://www.ippimail.com

Nicholas Clark

unread,
Mar 15, 2007, 12:27:39 PM3/15/07
to ozg...@gmail.com, perl6-c...@perl.org, perl-perl...@moderators.isc.org
I've re-ordered things because gmail's love of top posting confuses the flow
of the narrative.

> On Mar 12, 5:43 pm, chroma...@wgz.org (Chromatic) wrote:
> > On Monday 12 March 2007 14:29, ozg...@gmail.com wrote:
> >
> > I wish I had better news for you, but I'm not sure anyone can answer your
> > questions easily without a lot more information.
> >
> > > I'd like to get opinions from developers on this list. I'm looking
> > > into this system that executes massive amounts of Perl 5 code on a
> > > Perl 5.8 interpreter. The system burns tons of CPU while running Perl
> > > code, and I'm speculating on ways to improve our throughput (say, 50
> > > billion inst per module -> 10 billion inst/modl) and latency (say, 2
> > > sec/modl -> 0.5 secs/modl).

I don't think that you're going to get a 4 or 5 fold speedup on existing
code by either re-writing the interpreter, or recompiling to different
bytecode for a different VM.

> > > In summary, my questions are:
> >
> > > (1) How much effort would it take to convert Perl 5 source to Parrot
> > > bytecode?
> >
> > A fair amount. A complete translation of Perl 5 syntax and semantics is a
> > complex project. A subset of Perl 5 could be much easier.

But the problem with trying to figure out any subset is that it's likely that
an existing codebase uses rather more Perl than you might like. Particularly
if it has dependencies off into CPAN.


> > *) Parrot doesn't pay the overhead of magic in Perl 5 (which complicates just
> > about every pp_code)

But the problem is that a faithful translation of Perl 5 into anything
else requires that the behaviour of magic and overloading work properly.

Perl 5 is tricky to convert to a form that is suitable for a JIT compiler to
make headway on, because not only can any value be polymorphic (string or
floating point or integer), it can also be tied or overloaded. So basically
a JIT for regular (and therefore already-written) Perl 5 code would be
stringing together calls to the existing Perl 5 ops (or code as flexible as
them), rather than really converting to low-level CPU instructions.

Perl 6 has resolved part of this by defaulting variables to "not tied" -
only variables that are explicitly declared as such may hold tied values.

It may well be possible to provide ways to annotate Perl 5 with things like
"this is never tied" or even "this is an integer", but it won't help existing
code without a massive review/annotation(/bug introduction) phase.

On Tue, Mar 13, 2007 at 10:28:59AM -0700, ozg...@gmail.com wrote:

> In our setting, there isn't a single module that takes up 3 or 5% of
> CPU. Since we profile code on a regular basis, these offenders are
> easy to catch and fix. The problem is, we have more and more modules
> written every day, each taking, say, %0.01 of CPU time, but they add
> up to quite a lot. We also have a lot of code running in the same
> environment in C++ (they talk over XS), and recently a friend
> forwarded the following benchmarks:

It would seem that profiling the Perl interpreter (while it is running your
code) might reveal more than profiling the code itself. I would have thought
that an organisation large enough to be maintaining tens of thousands of
modules would be able to justify the resources to do that, and hopefully
then feed back any improvements to the core.

> http://shootout.alioth.debian.org/debian/benchmark.php?test=nbody&lang=all
> (one sample test)
> http://shootout.alioth.debian.org/debian/benchmark.php?test=all&lang=java&lang2=perl
>
> Every language has its merits, and I know you can never do an apples
> to apples comparison. However, if we can get more performance out of
> Perl's VM, well, that would be great. In short, I just wondered if
> Perl/Parrot were on the benchmarks, around where it would be.

JVMs such as Sun's have a whole team of full time engineers working on them.
Perl (and Parrot) don't have anyone paid - it's all volunteers. In some ways
it's surprising how well Perl is doing despite that.

We've managed to find some ways to speed things up in 5.10, and those that
I've been able to backport will be in 5.8.9, but it's really hard to

a: Actually find a representative benchmark for "Perl code"
b: Find any way to make a measurable difference

Mostly I've found ways to reduce the memory footprint of core structures,
but as these are necessarily binary-incompatible they can't go back to 5.8.x.

Nicholas Clark
