
[perl #16144] [PATCH] quotematch speedup


Nicholas Clark

Aug 12, 2002, 10:02:43 AM
to perl6-i...@perl.org
On Mon, Aug 12, 2002 at 01:56:49PM +0000, jryan wrote:
> # New Ticket Created by jryan
> # Please include the string: [perl #16144]
> # in the subject line of all future correspondence about this issue.
> # <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=16144 >
>
>
> Small speed patch for assemble.pl here...
> Benchmarks that I've done show that the new expression will
> match about 40% faster for strings that should match and fail
> about 400% faster for strings that should fail.
>
> Not really a big deal, but speed is good right? :)

Well, I find assemble.pl too slow, so I like speed.

> --- assemble_old.pl 2002-08-04 21:00:02.000000000 -0400
> +++ assemble.pl 2002-08-12 00:03:56.000000000 -0400
> @@ -263,8 +263,8 @@ sub preprocess {
> }
> elsif(/^\.constant \s+
> ($label_re) \s+
> - (\"(?:[^\\\"]*(?:\\.[^\\\"]*)*)\" |
> - \'(?:[^\\\']*(?:\\.[^\\\']*)*)\'
> + ( " (?: \\. | [^\\"]* ) " |
> + ' (?: \\. | [^\\"]* ) '
> )/x) { # .constant {name} {string}
> $self->{constants}{$1} = $2;
> }

The regexp you've changed is identical to the $str_re regexp, isn't it?
In which case, does changing that section to $str_re and then changing the
(global) definition for $str_re to your better version give an even bigger
speedup?
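
A minimal sketch of that idea, assuming a shared quoted-string pattern is defined once and interpolated wherever a string literal is expected. The thread names $label_re and $str_re but does not show how assemble.pl defines them, so both definitions below are hypothetical stand-ins, and parse_constant is an illustrative helper rather than a sub from the assembler. The string pattern used is the existing one from the "-" lines of the patch.

```perl
use strict;
use warnings;

# Hypothetical stand-ins: the real definitions in assemble.pl are not shown
# in this thread.
my $label_re = qr/[A-Za-z_]\w*/;
my $str_re   = qr/ "(?:[^\\"]*(?:\\.[^\\"]*)*)" | '(?:[^\\']*(?:\\.[^\\']*)*)' /x;

# Parse one ".constant NAME STRING" line.
# Returns (name, string) on success and an empty list otherwise.
sub parse_constant {
    my ($line) = @_;
    return $line =~ /^\.constant \s+ ($label_re) \s+ ($str_re)/x ? ($1, $2) : ();
}

my ($name, $value) = parse_constant('.constant GREETING "hello \"world\""');
print "$name = $value\n";
```

With this layout, a faster pattern only has to be swapped in at the single $str_re definition, and every rule that interpolates it picks up the change.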

Nicholas Clark

Simon Cozens

Aug 12, 2002, 11:01:13 AM
to perl6-i...@perl.org
ni...@ccl4.org (Nicholas Clark) writes:
> Well, I find assemble.pl too slow, so I like speed.

Good grief. Maybe someone should implement it in XS; then as well as being
fast, we'd avoid duplicating code from the core, and we'd have the basis of
a bytecode emission library that things compiling to Parrot can use.

Oh, wait.

The assembler appears to have an amazingly cyclic lifestyle. Has anyone
suggested splitting it up to a bunch of different modules (again) yet?

--
When Simon Cozens writes code, I always think twice about whether
something is a bug or an esoteric implementation.
- Daniel Packer

Simon Cozens

Aug 12, 2002, 12:33:48 PM
to Nicholas Clark, perl6-i...@perl.org
Nicholas Clark:
> Specifically:
> Why are we back to a single file assembler in pure perl?
> Why is it being proposed to be split up again?
> Are we going round in circles, or do the changes represent a spiral?

My design decisions for what *I* did, in rewriting the original assembler:

1) De-bloat the core bytecode reading/writing code.
2) De-bloat the assembler code, and make it much smaller and easier to
understand.
3) Reduce code duplication.
4) Make it *deliberately hard* to do clever stuff with assembler.pl, because
this is only a prototype, and Real Compilers will probably be written in C.
5) Remove assembly syntax which is only there for human consumption, to force
people to make higher-level tools such as imcc.

"Make it faster" was a happy consequence, not a deliberate decision.

It's my opinion that it went from there back to pure-Perl because people here
are happier handling pure Perl than XS. Jeff may have to correct me on that.

--
The Two Phases Of University Employment:

1. Doesn't know enough to get a Real Job.
2. Knows too much to want a Real Job.
- sharkey <sha...@zoic.org>

Nicholas Clark

Aug 12, 2002, 12:13:44 PM
to perl6-i...@perl.org, Simon Cozens
On Mon, Aug 12, 2002 at 04:01:13PM +0100, Simon Cozens wrote:
> ni...@ccl4.org (Nicholas Clark) writes:
> > Well, I find assemble.pl too slow, so I like speed.
>
> Good grief. Maybe someone should implement it in XS; then as well as being
> fast, we'd avoid duplicating code from the core, and we'd have the basis of
> a bytecode emission library that things compiling to Parrot can use.

"Those who do not learn from history are doomed to repeat it"
-- George Santayana

Why are we going round in circles?
(Simon's message jogs my memory; Simon wrote an XS version of the assembler.)

Specifically:
Why are we back to a single file assembler in pure perl?
Why is it being proposed to be split up again?
Are we going round in circles, or do the changes represent a spiral?

Design reasons behind the decisions for each change are more interesting
than the changes themselves as answers to these questions.

Nicholas Clark

Simon Cozens

Aug 12, 2002, 1:56:12 PM
to perl6-i...@perl.org
dan...@grunblatt.com.ar (Daniel Grunblatt) writes:
> I moved it back to pure-Perl because there were something like half of the
> tinderboxes failing to assemble anything.

Ah, right. Yeah, the tinderboxes are good slaves but really bad masters.

Here's a more interesting question: which parts of Parrot are enshrined,
and which are prototypes, ready to be thrown away? For instance, I'd
say much of languages/* is all proof-of-concept prototype stuff; imcc
may not be. The assembler I'd call a prototype. The regex engine? The
GC? ...

--
10. The Earth quakes and the heavens rattle; the beasts of nature flock
together and the nations of men flock apart; volcanoes usher up heat
while elsewhere water becomes ice and melts; and then on other days it
just rains. - Prin. Dis.

Daniel Grunblatt

Aug 12, 2002, 2:44:40 PM
to Simon Cozens, perl6-i...@perl.org
On 12 Aug 2002, Simon Cozens wrote:

> Here's a more interesting question: which parts of Parrot are enshrined,
> and which are prototypes, ready to be thrown away? For instance, I'd
> say much of languages/* is all proof-of-concept prototype stuff; imcc
> may not be. The assembler I'd call a prototype. The regex engine? The
> GC? ...
>

The assembler is a bit outdated; it shouldn't be too difficult to bring it
up to date, I just don't have enough time lately. But it did work fine
and is easy to extend. Why do you think it should be thrown away?

Daniel Grunblatt.

Daniel Grunblatt

Aug 12, 2002, 2:23:25 PM
to Simon Cozens, Nicholas Clark, perl6-i...@perl.org
On Mon, 12 Aug 2002, Simon Cozens wrote:

> It's my opinion that it went from there back to pure-Perl because people here
> are happier handling pure Perl than XS. Jeff may have to correct me on that.

I moved it back to pure-Perl because there were something like half of the
tinderboxes failing to assemble anything.

Probably I should have gone the other way and fixed it.

Daniel Grunblatt.

Daniel Grunblatt

Aug 12, 2002, 2:53:28 PM
to Simon Cozens, perl6-i...@perl.org
On 12 Aug 2002, Simon Cozens wrote:

> dan...@grunblatt.com.ar (Daniel Grunblatt) writes:
> > I moved it back to pure-Perl because there were something like half of the
> > tinderboxes failing to assemble anything.
>
> Ah, right. Yeah, the tinderboxes are good slaves but really bad masters.

True, but I was writing the register allocator for the jit and needed to
assemble bytecode on the alphas, so I just went the way I found easiest.

Daniel Grunblatt.

Simon Cozens

Aug 12, 2002, 2:38:22 PM
to perl6-i...@perl.org
dan...@grunblatt.com.ar (Daniel Grunblatt) writes:
> The assembler is a bit outdated; it shouldn't be too difficult to bring it
> up to date, I just don't have enough time lately. But it did work fine
> and is easy to extend. Why do you think it should be thrown away?

It's in Perl?

--
MISTAKES:
It Could Be That The Purpose Of Your Life Is Only To Serve As
A Warning To Others

http://www.despair.com

Juergen Boemmels

Aug 12, 2002, 3:10:04 PM
to perl6-i...@perl.org, bugs-bi...@netlabs.develooper.com
jryan (via RT) <bugs-...@netlabs.develooper.com> writes:

> --- assemble_old.pl 2002-08-04 21:00:02.000000000 -0400
> +++ assemble.pl 2002-08-12 00:03:56.000000000 -0400
> @@ -263,8 +263,8 @@ sub preprocess {
> }
> elsif(/^\.constant \s+
> ($label_re) \s+
> - (\"(?:[^\\\"]*(?:\\.[^\\\"]*)*)\" |
> - \'(?:[^\\\']*(?:\\.[^\\\']*)*)\'
> + ( " (?: \\. | [^\\"]* ) " |
> + ' (?: \\. | [^\\"]* ) '
> )/x) { # .constant {name} {string}
> $self->{constants}{$1} = $2;
> }

Sorry, but I don't think the patch does the right thing.
I just did a simple test (leaving out the uninteresting part):

$_           $2 old regexp   $2 new regexp
"test"       "test"          "test"
"\""         "\""            "\""
"\"test"     "\"test"        failed
"te"st"      "te"            "te"
'te'st'      'te'            'te'st'
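
The comparison above can be reproduced with a sketch along these lines. It assumes both patterns are anchored at the start of the string (in assemble.pl they follow the label and whitespace, so the match position is fixed); the anchoring, the variable names and the capture helper are illustrative and not taken from the assembler.

```perl
use strict;
use warnings;

# Old pattern from the "-" lines of the patch, anchored for the test.
my $old_re = qr/^( "(?:[^\\"]*(?:\\.[^\\"]*)*)" | '(?:[^\\']*(?:\\.[^\\']*)*)' )/x;

# New pattern exactly as first posted: no '*' after the group, and the
# single-quoted branch still excludes '"' rather than "'".
my $new_re = qr/^( " (?: \\. | [^\\"]* ) " | ' (?: \\. | [^\\"]* ) ' )/x;

# Return what the pattern captures, or the word 'failed' if it does not match.
sub capture {
    my ($re, $str) = @_;
    return $str =~ $re ? $1 : 'failed';
}

# The five inputs from the table; '\\' in a single-quoted Perl string is one backslash.
for my $input ('"test"', '"\\""', '"\\"test"', '"te"st"', "'te'st'") {
    printf "%-12s %-12s %s\n", $input, capture($old_re, $input), capture($new_re, $input);
}
```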

A corrected version of the patch is:


--- assemble_old.pl 2002-08-04 21:00:02.000000000 -0400
+++ assemble.pl 2002-08-12 00:03:56.000000000 -0400
@@ -263,8 +263,8 @@ sub preprocess {
}
elsif(/^\.constant \s+
($label_re) \s+
- (\"(?:[^\\\"]*(?:\\.[^\\\"]*)*)\" |
- \'(?:[^\\\']*(?:\\.[^\\\']*)*)\'
+ ( " (?: \\. | [^\\"]* )* " |
+ ' (?: \\. | [^\\']* )* '
)/x) { # .constant {name} {string}
$self->{constants}{$1} = $2;
}

I did not check the speed gain of this patch, but I checked the
corner cases I can think of.
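
For anyone who wants to measure it, here is a rough sketch using the core Benchmark module. The sample strings, the anchoring with ^ and the count_matches helper are assumptions made for illustration; no timings are claimed here, and results will depend on the perl version and the inputs chosen.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Old pattern (the "-" lines) and the corrected pattern (the "+" lines above),
# both anchored so that they are tried at one position only.
my $old_re   = qr/^( "(?:[^\\"]*(?:\\.[^\\"]*)*)" | '(?:[^\\']*(?:\\.[^\\']*)*)' )/x;
my $fixed_re = qr/^( " (?: \\. | [^\\"]* )* " | ' (?: \\. | [^\\']* )* ' )/x;

# A small mix of strings that should match and strings that should fail.
my @samples = ('"test"', '"\\"test"', "'te'st'", '"abc', 'plain');

# Count how many samples the given pattern matches.
sub count_matches {
    my ($re) = @_;
    return scalar grep { $_ =~ $re } @samples;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    old   => sub { count_matches($old_re) },
    fixed => sub { count_matches($fixed_re) },
});
```

The failing samples are kept short on purpose. The corrected pattern puts a starred character class inside a starred group, so it may be worth adding a long unterminated string to the samples to see how each version behaves when it has to give up.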

by
juergen

Daniel Grunblatt

Aug 12, 2002, 4:10:36 PM
to Simon Cozens, perl6-i...@perl.org
On 12 Aug 2002, Simon Cozens wrote:

> dan...@grunblatt.com.ar (Daniel Grunblatt) writes:
> > The assembler is a bit outdated; it shouldn't be too difficult to bring it
> > up to date, I just don't have enough time lately. But it did work fine
> > and is easy to extend. Why do you think it should be thrown away?
>
> It's in Perl?
>

Oh, no, I was talking about languages/parrot_compiler/. Sorry.

Daniel Grunblatt.

Simon Cozens

Aug 12, 2002, 3:41:42 PM
to perl6-i...@perl.org
dan...@grunblatt.com.ar (Daniel Grunblatt) writes:
> Oh, no, I was talking about languages/parrot_compiler/. Sorry.

Oh, I hadn't seen that. I can't work out what it is; it seems to be a
device for generating "Couldn't find operator" errors. Is there any,
dare I say it, documentation for it?

--
Going to church does not make a person religious, nor does going to school
make a person educated, any more than going to a garage makes a person a car.

Daniel Grunblatt

Aug 12, 2002, 4:27:14 PM
to Simon Cozens, perl6-i...@perl.org
On 12 Aug 2002, Simon Cozens wrote:

> dan...@grunblatt.com.ar (Daniel Grunblatt) writes:
> > Oh, no, I was talking about languages/parrot_compiler/. Sorry.
>
> Oh, I hadn't seen that. I can't work out what it is; it seems to be a
> device for generating "Couldn't find operator" errors. Is there any,
> dare I say it, documentation for it?
>

I did say it was outdated.
No there is no documentation, and there won't be any from me in the near
future.

Daniel Grunblatt.

Melvin Smith

Aug 12, 2002, 11:10:39 PM
to Simon Cozens, perl6-i...@perl.org
At 06:56 PM 8/12/2002 +0100, Simon Cozens wrote:
>Here's a more interesting question: which parts of Parrot are enshrined,
>and which are prototypes, ready to be thrown away? For instance, I'd
>say much of languages/* is all proof-of-concept prototype stuff; imcc
>may not be. The assembler I'd call a prototype. The regex engine? The
>GC? ...

On that topic, given that the reference assembler is too slow for on-the-fly
assembly, I already decided that imcc should get its own C based
assembler. Now that the C (XS) interface is gone, it means we will be
duplicating code. I'm not saying the Perl based assembler is a BAD thing,
but I think time spent "tuning" the reference assembler is wasted when it could
be spent writing a really fast one in C.

-Melvin


Dan Sugalski

Aug 13, 2002, 2:12:33 AM
to Simon Cozens, perl6-i...@perl.org
At 6:56 PM +0100 8/12/02, Simon Cozens wrote:
>dan...@grunblatt.com.ar (Daniel Grunblatt) writes:
>> I moved it back to pure-Perl because there were something like half of the
>> tinderboxes failing to assemble anything.
>
>Ah, right. Yeah, the tinderboxes are good slaves but really bad masters.
>
>Here's a more interesting question: which parts of Parrot are enshrined,
>and which are prototypes, ready to be thrown away? For instance, I'd
>say much of languages/* is all proof-of-concept prototype stuff; imcc
>may not be. The assembler I'd call a prototype. The regex engine? The
>GC? ...

It's all potentially prototype. When 1.0 is released the bytecode
format, opcode behaviour, and the various external interfaces will
be final, but all the rest of the code is subject to change.
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Dan Sugalski

Aug 13, 2002, 2:29:03 AM
to Melvin Smith, Simon Cozens, perl6-i...@perl.org

A fast C assembler's fine, or a fast Parrot one based on what
Daniel's got in already. Don't care either way--it'll all be linked
together in one big mass of parrot.so at some point anyway...

Piers Cawley

Aug 13, 2002, 3:25:14 AM
to Melvin Smith, Simon Cozens, perl6-i...@perl.org
Melvin Smith <mrjol...@mindspring.com> writes:

There's a small, mad part of me that wonders if parrot would now
support an assembler that was implemented in Parrot. Then we'd know
that the assembler was at least as portable as parrot itself...

--
Piers

"It is a truth universally acknowledged that a language in
possession of a rich syntax must be in need of a rewrite."
-- Jane Austen?

Daniel Grunblatt

Aug 13, 2002, 5:48:53 PM
to Piers Cawley, Melvin Smith, Simon Cozens, perl6-i...@perl.org
On 13 Aug 2002, Piers Cawley wrote:

> Melvin Smith <mrjol...@mindspring.com> writes:
>
> > At 06:56 PM 8/12/2002 +0100, Simon Cozens wrote:
> > >Here's a more interesting question: which parts of Parrot are
> > >enshrined, and which are prototypes, ready to be thrown away? For
> > >instance, I'd say much of languages/* is all proof-of-concept
> > >prototype stuff; imcc may not be. The assembler I'd call a
> > >prototype. The regex engine? The GC? ...
> >
> > On that topic, given that the reference assembler is too slow for
> > on-the-fly assembly, I already decided that imcc should get its own
> > C based assembler. Now that the C (XS) interface is gone, it means
> > we will be duplicating code. I'm not saying the Perl based assembler
> > is a BAD thing, but I think time spent "tuning" the reference
> > assembler is wasted when it could be spent writing a really fast one
> > in C.
>
> There's a small, mad part of me that wonders if parrot would now
> support an assembler that was implemented in Parrot. Then we'd know
> that the assembler was at least as portable as parrot itself...

Something like languages/parrot_compiler/ but working, right?

Daniel Grunblatt.

Piers Cawley

Aug 13, 2002, 5:29:48 PM
to Daniel Grunblatt, Melvin Smith, Simon Cozens, perl6-i...@perl.org
Daniel Grunblatt <dan...@grunblatt.com.ar> writes:

That'd be it.

I'd also like to be able to generate parrot code from within parrot
and immediately execute it...

John Porter

Aug 13, 2002, 7:41:35 PM
to perl6-i...@perl.org
Piers Cawley wrote:
> I'd also like to be able to generate parrot code from within parrot
> and immediately execute it...

Something like that will be needed for eval() anyway, right?

--
John Douglas Porter

Daniel Grunblatt

Aug 13, 2002, 8:24:01 PM
to perl6-i...@perl.org
On Tue, 13 Aug 2002, John Porter wrote:

> Piers Cawley wrote:
> > I'd also like to be able to generate parrot code from within parrot
> > and immediately execute it...
>
> Something like that will be needed for eval() anyway, right?

Yes, like PDB_eval() maybe...

Daniel Grunblatt.

Dan Sugalski

Aug 16, 2002, 5:17:32 PM
to Piers Cawley, Daniel Grunblatt, Melvin Smith, Simon Cozens, perl6-i...@perl.org
At 10:29 PM +0100 8/13/02, Piers Cawley wrote:
>I'd also like to be able to generate parrot code from within parrot
>and immediately execute it...

Working on the specs for that. Should be out soon...
