perly.y

jdi...@cis.clarion.edu

Jul 24, 2000, 3:00:00 AM

So... How do I get this little beast to actually perform lexical
analysis on something and spit out parsed code?


Mark-Jason Dominus

Jul 24, 2000, 3:00:00 AM

In article <8lhukr$vvc$1...@nnrp1.deja.com>, <jdi...@cis.clarion.edu> wrote:
>So... How do I get this little beast to actually perform lexical
>analysis on something and spit out parsed code?

I'd suggest

perl -Dx program.pl

But without knowing what you are trying to accomplish, it's really
hard to understand your question.

jdi...@my-deja.com

Jul 25, 2000, 3:00:00 AM

Sorry, I guess that was pretty vague. The whole parsing Perl topic
bothers me greatly. I mean... I don't mind the fact that it's done in
a completely unorthodox manner, and that the parser:smoke:mirrors
distribution is more like 1:9:99... I'm all for that, but why would
NOBODY ever dare to explain or even comment on the whole process?? The
source of toke.c and perly.y is remarkably far from clear.

I guess what I'm trying to do is extract the parsing functionality into
my own Perl5 parser, and I'd like to know how it's done. The reason
for my doing this is that I want to. The other reason is to piss all
them people that always tell me 'well just use Perl to parse Perl'.
And yet another reason is that I want to be able to parse code for
which I don't have all the parts (modules) and which was written for
a different platform (realizing that complete parsing under those
circumstances is impossible, but that's fine with me - I still wanna be
able to partially parse what I do have).

And can anyone give me the real reason for Perl people to say that
specifying a proper grammar for Perl is impossible?

In article <8lhukr$vvc$1...@nnrp1.deja.com>,
jdi...@cis.clarion.edu wrote:
> So... How do I get this little beast to actually perform lexical
> analysis on something and spit out parsed code?
>

Mark-Jason Dominus

Jul 25, 2000, 3:00:00 AM

In article <8lkb36$nm7$1...@nnrp1.deja.com>, <jdi...@my-deja.com> wrote:
>I guess what I'm trying to do is extract the parsing functionality into
>my own Perl5 parser, and I'd like to know how it's done.

You might want to talk to Simon Cozens. I understand that he has a
nearly-complete implementation of a Perl parser written in Perl.

>And can anyone give me the real reason for Perl people to say that
>specifying a proper grammar for Perl is impossible?

Well, just to take a couple of examples:

$x = foo;

parses the same as

$x = 'foo';

but if it was preceded by

sub foo { ... }

then it parses instead as if you had written

$x = foo();
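A minimal sketch of how one might check this, assuming strict 'subs' is off (as it is by default) and using hypothetical package names Demo::NoSub and Demo::WithSub that are not part of the original example. It compiles the same statement text twice with string eval, once where no 'foo' is known and once where it is:

```perl
use strict;

# Compile the statement '$x = foo' inside the given package and
# return the value that ends up in $x. (Helper name is hypothetical.)
sub value_of_foo {
    my ($pkg) = @_;
    my $value = eval qq{
        package $pkg;
        no strict 'subs';   # let an unknown bareword act as a string
        my \$x = foo;
        \$x;
    };
    die $@ if $@;
    return $value;
}

# Declared at compile time, so it is already known when the
# string eval below compiles 'foo' in this package.
sub Demo::WithSub::foo { 42 }

print value_of_foo('Demo::NoSub'),   "\n";   # bareword, treated like 'foo'
print value_of_foo('Demo::WithSub'), "\n";   # function call, like foo()
```

The text being compiled is identical in both cases; only the symbol table at the moment of compilation differs.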

Now, suppose you want to parse the following program:

use Module;
$x = foo;

and suppose that Module.pm contains the following:

package Module;

sub import {
    if (twin_primes_conjecture_is_true()) {
        *main::foo = sub { 1 }; # Export function definition into main
    }
}

sub twin_primes_conjecture_is_true {
    ...
}

1;

Now you can't parse "$x = foo" unless you know whether the famous Twin
Primes conjecture is true, because the parse is

$x = foo();

if it is true and

$x = "foo";

if it isn't. But the Twin Primes conjecture has been a famous open
problem of mathematics for well over a century.

This is a rather silly example, but it may give you some idea of the
difficulties involved. One of the largest difficulties is that a
compile-time declaration like 'use' might load in *and execute*
arbitrary Perl code, with subsequent parsing depending on the result
of the execution. This means that in general you can't parse Perl
code correctly without being able to execute arbitrary Perl code;
hence the saying that 'only perl can parse Perl'.
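Here is a rough single-file sketch of that effect. Everything named Demo::* and the $EXPORT_FOO flag are stand-ins invented for illustration; the flag takes the place of twin_primes_conjecture_is_true(), and the %INC entry only exists so that 'use' does not go looking for a file on disk:

```perl
use strict;

our $EXPORT_FOO;    # stand-in for twin_primes_conjecture_is_true()

{
    package Demo::Module;

    # Runs at compile time of whatever code says 'use Demo::Module'.
    sub import {
        my $caller = caller;
        no strict 'refs';
        *{"${caller}::foo"} = sub { 1 } if $main::EXPORT_FOO;
    }
}

# Mark the module as already loaded so 'use' skips the file search.
$INC{'Demo/Module.pm'} = __FILE__;

# Compile 'use Demo::Module; $x = foo' in a fresh package and
# return the resulting value of $x. (Helper name is hypothetical.)
sub parse_with {
    my ($flag, $pkg) = @_;
    $EXPORT_FOO = $flag;
    my $value = eval qq{
        package $pkg;
        no strict 'subs';
        use Demo::Module;
        my \$x = foo;
        \$x;
    };
    die $@ if $@;
    return $value;
}

print parse_with(0, 'Demo::A'), "\n";   # import exports nothing: bareword
print parse_with(1, 'Demo::B'), "\n";   # import defines foo: function call
```

The point of the sketch is that import() has to actually run before the parser can decide what 'foo' means on the next line.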

There are a lot of similar oddities. Consider

$x = new Carrot (color => 'orange');

Is this parsed the same as this?

$x = "Carrot"->new(color => 'orange');

Or is it parsed the same as this?

$x = new(Carrot(color => 'orange'));

The answer, again, is 'it depends'. I forget just what it depends on.

Abigail

Jul 25, 2000, 3:00:00 AM

Mark-Jason Dominus (m...@plover.com) wrote on MMDXX September MCMXCIII in
<URL:news:397dbd35.5b6$1...@news.op.net>:
{}
{} There are a lot of similar oddities. Consider
{}
{} $x = new Carrot (color => 'orange');
{}
{} Is this parsed the same as this?
{}
{} $x = "Carrot"->new(color => 'orange');
{}
{} Or is it parsed the same as this?
{}
{} $x = new(Carrot(color => 'orange'));
{}
{} The answer, again, is 'it depends'. I forget just what it depends on.

Whether there's a subroutine called 'new' in the current package.

Abigail
--
perl -wle '(1 x $_) !~ /^(11+)\1+$/ && print while ++ $_'

Mark-Jason Dominus

Jul 25, 2000, 3:00:00 AM

[Mailed and posted.]

In article <slrn8nrh95....@alexandra.foad.org>,
Abigail <abi...@foad.org> wrote:
>Mark-Jason Dominus (m...@plover.com) wrote on MMDXX September MCMXCIII in
><URL:news:397dbd35.5b6$1...@news.op.net>:

>{} $x = new Carrot (color => 'orange');

>{} The answer, again, is 'it depends'. I forget just what it depends on.
>
>
>Whether there's a subroutine called 'new' in the current package.

Ha! You *wish* it were that simple.

If there is a subroutine called 'new' in the current package, then you
get the

new(Carrot(color => 'orange'))

parse, UNLESS Carrot::new *also* exists, in which case you get the
other parse:

'Carrot'->new(color => 'orange');

But EVEN IF Carrot::new exists, you get the first parse anyway, if
BOTH 'new' and 'Carrot' exist in the current package.

OK, so perhaps the rule is:

1. If main::new and main::Carrot both exist, take the first parse.
2. Otherwise, if Carrot::new exists, take the second parse.
3. Otherwise do something else.

No, that's not it either, because if main::Carrot and Carrot::new both
exist but main::new doesn't, then you get a compile-time error because
it tried to take the first parse and then got upset because there is
no 'new' function.

I sure hope Paul Prescod is listening. It would be a shame if he were
to miss this delight. Hi, Paul!

                     main::new and   main::new      main::Carrot   neither
                     main::Carrot    only           only           one

Carrot::new exists   first parse     second parse   ERROR          second parse

Carrot::new doesn't  first parse     first parse    ERROR          second parse

The 'ERROR' here indicates a compile-time error:

Bareword found where operator expected at /tmp/x.pl line 15, near "new Carrot"
(Do you need to predeclare new?)

I *think* that's the whole story, but I'm not sure.