
[perl #39313] [TODO] or [BUG] improve PMC compiler


Leopold Toetsch

Jun 6, 2006, 10:40:56 AM
to bugs-bi...@rt.perl.org
# New Ticket Created by Leopold Toetsch
# Please include the string: [perl #39313]
# in the subject line of all future correspondence about this issue.
# <URL: https://rt.perl.org/rt3/Ticket/Display.html?id=39313 >


It's easy to add 'invalid' code to .pmc files. E.g. I had defined:

METHOD parent() {
    return PMC_pmc_val(SELF) ? PMC_pmc_val(SELF) : PMCNULL;
}

Due to the absence of a return type, the PMC compiler just ignores this
'method' without further notice.

The same happens if there's whitespace before the '*':

METHOD PMC *parent() {
    return PMC_pmc_val(SELF) ? PMC_pmc_val(SELF) : PMCNULL;
}

This totally valid C type declaration is just ignored.

Fixes welcome,
leo

Klaas-Jan Stol

Jun 7, 2006, 11:08:58 AM
to perl6-i...@perl.org
Hi,

I had a look at this, but I'm not that good at Perl and regular
expressions. However, I found where things go wrong, so someone who
really groks REs may fix it.

The problem is (well, at least I think it is) at about line 440 in pmc2c.pl

sub parse_pmc {
    my $code = shift;

    my $signature_re = qr{
        ^
        (?:                 # blank spaces and comments and spurious semicolons
            [;\n\s]*
            (?:/\*.*?\*/)?  # C-like comments
        )*

        (METHOD\s+)?        # method flag

        (\w+\**)            # type <<<========== I'd say this should be
                            # (\w+\s*\**) so it matches a word (the return
                            # type), optional spaces, and then optional *'s
                            # to indicate a pointer
        \s+
        (\w+)               # method name
        \s*
        \( ([^\(]*) \)      # parameters
    }sx;

With the fix noted above applied, things no longer compile.
I'm sorry I can't provide a real fix, but at least it's easier to fix
now, hopefully.

A more kludgy fix may be to check whether $type equals "METHOD"; if so,
then there is something wrong. It may be that this check does not
catch every case, though.
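
To make the failure modes concrete, here is a minimal sketch that runs the regex quoted above, unchanged, against three declarations from this thread. The helper name `describe` and its output format are hypothetical and only for illustration; the sketch exercises the regex alone, not the rest of pmc2c.pl, so what the compiler does with each result afterwards is not shown.

```perl
use strict;
use warnings;

# The signature regex as quoted above from pmc2c.pl, unchanged.
my $signature_re = qr{
    ^
    (?:                     # blank spaces, comments and spurious semicolons
        [;\n\s]*
        (?:/\*.*?\*/)?      # C-like comments
    )*
    (METHOD\s+)?            # method flag
    (\w+\**)                # type
    \s+
    (\w+)                   # method name
    \s*
    \( ([^\(]*) \)          # parameters
}sx;

# Hypothetical helper: describe how the regex alone parses one declaration.
sub describe {
    my ($decl) = @_;
    my @m = $decl =~ $signature_re or return 'no match';
    my ($flag, $type, $name) = @m;
    my $out = sprintf 'flag=%s type=%s name=%s',
        (defined $flag ? 'yes' : 'no'), $type, $name;
    # The kludge suggested above: a type of "METHOD" signals a misparse.
    $out .= ' (suspicious: type is METHOD)' if $type eq 'METHOD';
    return $out;
}

for my $decl (
    'METHOD PMC* parent() {',   # well-formed
    'METHOD parent() {',        # return type missing
    'METHOD PMC *parent() {',   # whitespace before the '*'
) {
    printf "%-26s => %s\n", "'$decl'", describe($decl);
}
```

With the missing return type, the optional METHOD group is skipped on backtracking and "METHOD" itself is captured as the type, which is why the `$type eq 'METHOD'` check catches that case. With whitespace before the '*', the regex does not match at all, so that check never gets a chance to fire.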

kind regards,

klaas-jan

Joshua Juran

Jun 8, 2006, 3:15:23 PM
to Klaas-Jan Stol, Perl 6 Internals
On Jun 7, 2006, at 8:08 AM, Klaas-Jan Stol wrote:

> I had a look at this, but I'm not that good at Perl and regular
> expressions. However, I found where things go wrong, so someone who
> really groks REs may fix it.

I'm no Abigail, :-) but I'll try to help.

> The problem is (well, at least I think it is) at about line 440 in
> pmc2c.pl
>
> sub parse_pmc {
> my $code = shift;
>
> my $signature_re = qr{
> ^
> (?: #blank spaces and comments and spurious semicolons
> [;\n\s]*
> (?:/\*.*?\*/)? # C-like comments
> )*

You're asking for multiple instances of something that could be
empty. I don't know if this is problematic, but I suspect it might
cause unnecessary backtracking. I would write:

(?: [;\n\s] | (?:/\*.*?\*/) )*
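
As a rough check of this suggestion, the sketch below compares the two prefix patterns on the same input. The helper `strip_prefix` and the sample text are hypothetical. Perl's regex engine stops a `*` loop once an iteration matches the empty string, so the original form does not loop forever; the sketch only shows that both forms skip the same leading text, and says nothing about their relative speed.

```perl
use strict;
use warnings;

# The original prefix: each pass through the outer group may match nothing.
my $old_prefix = qr{ (?: [;\n\s]* (?:/\*.*?\*/)? )* }sx;

# The alternation suggested above: each pass consumes at least one character.
my $new_prefix = qr{ (?: [;\n\s] | /\*.*?\*/ )* }sx;

# Hypothetical helper: return what is left of $text once the prefix is skipped.
sub strip_prefix {
    my ($prefix, $text) = @_;
    $text =~ s/^$prefix//;
    return $text;
}

# Sample input: whitespace, a stray semicolon, and two C comments
# (one spanning two lines) in front of a declaration.
my $text = "\n ; /* first */\n/* second\n   line */  void init()";

print "old: '", strip_prefix($old_prefix, $text), "'\n";
print "new: '", strip_prefix($new_prefix, $text), "'\n";
```

Both lines print `void init()`, so for this input the rewrite is a drop-in replacement.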

> (METHOD\s+)? #method flag
>
> (\w+\**) #type <<<========== I'd say this should be
> # (\w+\s*\**) so it matches a word (the return type),
> # optional spaces, and then optional *'s to indicate a pointer
> \s+
> (\w+) #method name

In the case where there are no '*'s in the text, the pattern '\s*'
eats up all the whitespace, so the following '\s+' has nothing left
to match. That said, I don't understand why backtracking wouldn't
kick in and make things match up, albeit inefficiently.

Try writing: ( \w+ (?: \s* \*+ )? )

(Some word characters optionally followed by any whitespace and some
'*'.)
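
To see how the three type sub-patterns discussed so far behave, here is a minimal sketch that tries each one, followed by the original mandatory `\s+` and the method name, on a few declarations. The hash `%type_re`, the helper `parse_decl`, and the sample declarations are hypothetical names chosen for illustration; the rest of the signature regex is left out to keep the comparison small.

```perl
use strict;
use warnings;

# Three candidate sub-patterns for the return type.
my %type_re = (
    original  => qr{ (\w+\**) }x,                  # as found in pmc2c.pl
    klaas_jan => qr{ (\w+\s*\**) }x,               # first proposed change
    joshua    => qr{ ( \w+ (?: \s* \*+ )? ) }x,    # pattern suggested above
);

# Hypothetical helper: returns (type, name), or an empty list on no match.
sub parse_decl {
    my ($which, $decl) = @_;
    my $re = $type_re{$which};
    return $decl =~ m{ ^ $re \s+ (\w+) \s* \( }x;
}

for my $decl ('PMC* parent()', 'PMC * parent()', 'PMC *parent()', 'void  init()') {
    for my $which (qw(original klaas_jan joshua)) {
        my @r = parse_decl($which, $decl);
        printf "%-10s %-17s => %s\n", $which, "'$decl'",
            @r ? "type='$r[0]' name='$r[1]'" : 'no match';
    }
}
```

Two things stand out in the output. Backtracking does kick in for `(\w+\s*\**)`, but when the type and name are separated by more than one space the captured type keeps a trailing space (`'void '`), which is one possible, unconfirmed reason the first change broke the build. Also, none of the three variants accepts `PMC *parent()` as long as the `\s+` after the type stays mandatory, so the separator would need to change as well to cover the case from the original ticket.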

> \s*
> \( ([^\(]*) \) #parameters
> }sx;
>
> With the fix noted above applied, things no longer compile.
> I'm sorry I can't provide a real fix, but at least it's easier to
> fix now, hopefully.

The real solution would use regular expressions but not rely on them.

I've been reading Higher-Order Perl, by Mark Jason Dominus. It has a
chapter on writing parsers which is applicable to this discussion,
and I cannot recommend it highly enough.

Higher-Order Perl
http://hop.perl.plover.com/

Josh
