P6ML?

Michael Lazzaro

Mar 25, 2003, 1:44:38 PM

to perl6-l...@perl.org

So, is anyone working on a P6ML, and/or is there any
discussion/agreement of what it would entail?

MikeL

Robin Berjon

Mar 25, 2003, 2:02:32 PM

to Michael Lazzaro, perl6-l...@perl.org

Michael Lazzaro wrote:
> So, is anyone working on a P6ML, and/or is there any
> discussion/agreement of what it would entail?

Imho P6ML is a bad idea, if it means what I think it means (creating a parser
for quasi-MLs). People will laugh at our folly, and rightly so, for trying to be
able to parse all the horrors of the world in a sensical manner will lead to the
same madness that happened with HTML. People will also hate us, and rightly so,
for increasing tolerance for that kind of behaviour goes against the work
accomplished over the past five years.

There are a number of pockets of bugosity that still produce broken XML, but
they are being quenched one by one. Being too kind to them will only encourage
them. As someone that works with XML every single second of my work time (and
much of my fun time), I can only too well understand the frustration of
developers faced with other people's buggy output and do want to help. But as
someone that also had to parse other people's random formats before we had XML,
I would like to stress strongly the fact that the current situation is *much*
better than it was. Encouraging people to produce broken data by making efforts
in that area at more or less language-level visibility is a step backwards
("Oh, it's broken but they use Perl so it doesn't matter").

If it is creating a /toolset/ to make recuperating data from a quasi-XML (aka
tag soup) then it is an interesting area of research. I can think of two approaches:

- have a parametrisable XML grammar. By default it would really parse XML,
and barf with extreme prejudice on errors. However individual rules will be
relaxable and modifiable to accept different, possibly slightly broken, input.
This is imho the least desirable approach.

- base a quasi-parser on something that does quasi-parsing well, namely an
HTML parser, which would be wrapped to look like an XML parser but would be able
to correct most typical problems (poorly defined entities, missing end tags,
encoding errors, etc). Advantages are: a) it addresses 98% of existing problems,
b) trying to solve the remaining issues in any non ad hoc manner is suicidal, c)
can be pointed out to developers in trouble, and d) has very low general public
visibility. Oh, and e) the perl-xml community is already on it, expect something
in the month to come.
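
A minimal sketch of the second approach, in Python only because its standard library ships a lenient HTML tokenizer; it is not the perl-xml tool mentioned above. The names SoupToXML and recover, and the synthetic <root> element, are assumptions made for this illustration. It repairs missing or mis-nested end tags and escapes undefined entities, but does not special-case void HTML elements such as <br>.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class SoupToXML(HTMLParser):
    """Hypothetical wrapper: lenient HTML tokenizer in, well-formed XML tree out."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("root")  # synthetic root, an assumption of this sketch
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None; store them as empty strings.
        attrib = {name: value or "" for name, value in attrs}
        self.stack.append(ET.SubElement(self.stack[-1], tag, attrib))

    def handle_endtag(self, tag):
        # Close back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                return

    def handle_data(self, data):
        current = self.stack[-1]
        if len(current):
            # Text that follows a closed child belongs in that child's tail.
            last = current[-1]
            last.tail = (last.tail or "") + data
        else:
            current.text = (current.text or "") + data


def recover(soup: str) -> str:
    """Return a well-formed XML serialisation of possibly broken markup."""
    parser = SoupToXML()
    parser.feed(soup)
    parser.close()
    return ET.tostring(parser.root, encoding="unicode")
```

For example, recover("<doc><item>one<item>two</doc>") yields markup that a strict XML parser accepts, with the unclosed item elements closed at the point where doc ends.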

Either way, I really think it shouldn't be called P6ML.

--
Robin Berjon <robin....@expway.fr>
Research Engineer, Expway http://expway.fr/
7FC0 6F5F D864 EFB8 08CE 8E74 58E6 D5DB 4889 2488

Dan Sugalski

Mar 25, 2003, 2:35:40 PM

to perl6-l...@perl.org

At 10:44 AM -0800 3/25/03, Michael Lazzaro wrote:
>So, is anyone working on a P6ML, and/or is there any
>discussion/agreement of what it would entail?

I, for one, think it's a great idea, and the thought of altering perl
6's grammar to make it a functional language is sheer genius, making
the concepts behind ML more accessible to folks used to procedural
languages. Darned good idea--I say start right away!
--
Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Austin Hastings

Mar 25, 2003, 2:41:50 PM

to Dan Sugalski, perl6-l...@perl.org

--- Dan Sugalski <d...@sidhe.org> wrote:
> At 10:44 AM -0800 3/25/03, Michael Lazzaro wrote:
> >So, is anyone working on a P6ML, and/or is there any
> >discussion/agreement of what it would entail?
>
> I, for one, think it's a great idea, and the thought of altering perl
> 6's grammar to make it a functional language is sheer genius, making
> the concepts behind ML more accessible to folks used to procedural
> languages. Darned good idea--I say start right away!

|==============================================[*]|
Sarcasmeter?

=Austin

Austin Hastings

Mar 25, 2003, 2:51:01 PM

to robin....@expway.fr, Michael Lazzaro, perl6-l...@perl.org

--- Robin Berjon <robin....@expway.fr> wrote:
> If it is creating a /toolset/ to make recuperating data from a
> quasi-XML (aka tag soup) then it is an interesting area of research.
> I can think of two approaches:
>
> - have a parametrisable XML grammar. By default it would really
> parse XML, and barf with extreme prejudice on errors. However
> individual rules will be relaxable and modifiable to accept
> different, possibly slightly broken, input. This is imho the
> least desirable approach.

Why is this the least desirable approach?

=Austin

Paul

Mar 25, 2003, 2:52:23 PM

to perl6-l...@perl.org

lol -- I think my BS-o-meter just redlined, too....

But just to make sure I'm not completely clueless on this one, would
someone give me a clue as to exactly what P6ML is supposed to mean, and
whether or not the original post was intended as humor? No insult
intended (at least not from me, lol), but ML as in "Markup Language"?
Or maybe as in the ML programming language (you know, the one used in
recursion examples), and it was a question of whether it was being
ported to parrot????


Austin Hastings

Mar 25, 2003, 3:03:41 PM

to Hod...@writeme.com, perl6-l...@perl.org

--- Paul <ydb...@yahoo.com> wrote:
>
> --- Austin Hastings <austin_...@yahoo.com> wrote:
> > --- Dan Sugalski <d...@sidhe.org> wrote:
> > > At 10:44 AM -0800 3/25/03, Michael Lazzaro wrote:
> > > >So, is anyone working on a P6ML, and/or is there any
> > > >discussion/agreement of what it would entail?
> > >
> > > I, for one, think it's a great idea, and the thought of altering
> > > perl 6's grammar to make it a functional language is sheer genius,
> > > making the concepts behind ML more accessible to folks used to
> > > procedural languages. Darned good idea--I say start right away!
> >
> > |==============================================[*]|
> > Sarcasmeter?
>
> lol -- I think my BS-o-meter just redlined, too....
>
> But just to make sure I'm not completely clueless on this one, would
> someone give me a clue as to exactly what P6ML is supposed to mean,
> and whether or not the original post was intended as humor? No insult
> intended (at least not from me, lol), but ML as in "Markup Language"?
> Or maybe as in the ML programming language (you know, the one used in
> recursion examples), and it was a question of whether it was being
> ported to parrot????
>

My assumption is that Dan has chosen to deliberately misinterpret the
P6ML as being "P6/ML" -- and that he's all in favor of making P6 more
functional since it will make the parser easier.

Sadly, however, P6ML comes from P6/XML, and so Dan is condemned to
slave away at a grammar which is going to make PL/I look easy, when all
is said and done. (Although I still maintain that Fortran is currently
underrepresented in core, since FORMATs went away. :-)

=Austin

Dan Sugalski

Mar 25, 2003, 3:21:42 PM

to perl6-l...@perl.org

At 11:52 AM -0800 3/25/03, Paul wrote:
>--- Austin Hastings <austin_...@yahoo.com> wrote:
>> --- Dan Sugalski <d...@sidhe.org> wrote:
>> > At 10:44 AM -0800 3/25/03, Michael Lazzaro wrote:
>> > >So, is anyone working on a P6ML, and/or is there any
>> > >discussion/agreement of what it would entail?
>> >
>> > I, for one, think it's a great idea, and the thought of altering
>> > perl 6's grammar to make it a functional language is sheer genius,
>> > making the concepts behind ML more accessible to folks used to
>> > procedural languages. Darned good idea--I say start right away!
>>
>> |==============================================[*]|
>> Sarcasmeter?
>
>lol -- I think my BS-o-meter just redlined, too....

Heh. Sorry 'bout that. Bring it to OSCON and I'll get it fixed. :)

I think the original was an XML in perl 6 proposal of some sort. XML
makes me twitch, though. Ick.

A (or is that an?) ML compiler for parrot'd be really cool, though.

Paul

Mar 25, 2003, 3:47:23 PM

to Dan Sugalski, perl6-l...@perl.org

> >> |==============================================[*]|
> >> Sarcasmeter?
> >
> >lol -- I think my BS-o-meter just redlined, too....
>
> Heh. Sorry 'bout that. Bring it to OSCON and I'll get it fixed. :)

lol -- when/where is that? (Seems all I do here is ask dumb questions).
*sigh*

> I think the original was an XML in perl 6 proposal of some sort. XML
> makes me twitch, though. Ick.

That was kinda what I got. Leave it for a module.

> A (or is that an?) ML compiler for parrot'd be really cool, though.

Anything in Parrot is likely to be pretty cool. :)

Michael Lazzaro

Mar 25, 2003, 4:05:46 PM

to perl6-l...@perl.org

On Tuesday, March 25, 2003, at 11:02 AM, Robin Berjon wrote:
> Michael Lazzaro wrote:
>> So, is anyone working on a P6ML, and/or is there any
>> discussion/agreement of what it would entail?
>
> Imho P6ML is a bad idea, if it means what I think it means (creating a
> parser for quasi-MLs). People will laugh at our folly, and rightly so
> for trying to be able

My own musing was not something that would accept bad XML, but
something more geared as a P6-based replacement for the steaming hunk
of crap known as XSL. An XML-based derivative that performs XML
transformations, allowing/using embedded P6 regexs, closures, etc., and
able to more easily translate XML <==> P6 data.

Something like that might significantly help P6 adoption rates.[*]
While we're stuck with XML, I'm not willing to say we in Perl-land
should be stuck with the currently craptacular XML transformation
methods being adopted by other languages. :-P

Anyway, it's a future library issue more than a language development
one, but I'd be interested in hearing if any such plans were already
underway.

MikeL

[*] For example, one of the Very First Things I'll be doing with Perl6
is, of course, creating a P6-specific companion to ASP/JSP/PHP, but one
that's substantially more OO in nature... all of those *Ps have pretty
poor capabilities, and do not allow sufficiently flexible OO-based
templatizations, in my experience. And while P5's Mason is impressive,
one can imagine a more firmly P6, OO-based solution that would have a
*lot* of additional speed/capability. (I have a longtime P5 prototype
that we use here, but limitations of the P5 implementation make it
annoyingly slow during template compilation & init.)

Dan Sugalski

Mar 25, 2003, 4:01:45 PM

to Hod...@writeme.com, perl6-l...@perl.org

At 12:47 PM -0800 3/25/03, Paul wrote:
>> >> |==============================================[*]|
>> >> Sarcasmeter?
>> >
>> >lol -- I think my BS-o-meter just redlined, too....
>>
>> Heh. Sorry 'bout that. Bring it to OSCON and I'll get it fixed. :)
>
>lol -- when/where is that? (Seems all I do here is ask dumb questions).
>*sigh*

Portland, Oregon, July 7-11. The 7th and 8th are tutorials, the
conference proper is Wednesday the 9th through Friday the 11th. The
conference is just down the street from Powell's (www.powells.com)
which is possibly the single best, and certainly biggest, used
bookstore in the US, if not the planet. Bring lots of money and a
spare pair of suitcases.

>
>> A (or is that an?) ML compiler for parrot'd be really cool, though.
>
>Anything in Parrot is likely to be pretty cool. :)

I dunno. Can *anything* make INTERCAL cool? I think not! :-P

Christian Renz

Mar 25, 2003, 6:10:58 PM

to perl6-l...@perl.org

>of crap known as XSL. An XML-based derivative that performs XML
>transformations, allowing/using embedded P6 regexs, closures, etc., and
>able to more easily translate XML <==> P6 data.

I'm still quite XML-phobic, but I see the need for strong XML support
in Perl 6. However, I'd like to work with XML in Perl 6 in a way that
I don't even notice it's XML. Would it be possible to come up with an
interface to XML that is at least as intuitive as tie is for
hash<->DBM file? And that can cope with megabyte-sized XML files?

In fact, if we're talking about data storage only, it would be
interesting to have such a tie that allows me to store my data in an
XML file, YAML file, SQL database etc.
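
To make the "tie" idea concrete, here is a rough sketch of a dictionary-like object persisted to an XML file, written in Python for illustration rather than as a Perl 6 design. The class name XMLFileDict and the <store>/<entry key="..."> layout are invented for this example, keys and values are assumed to be strings, and the whole document is held in memory, so it does not address the megabyte-sized case.

```python
import os
import xml.etree.ElementTree as ET
from collections.abc import MutableMapping


class XMLFileDict(MutableMapping):
    """Hypothetical flat string-to-string mapping stored as <entry key="..."> elements."""

    def __init__(self, path):
        self.path = path
        if os.path.exists(path):
            self._root = ET.parse(path).getroot()
        else:
            self._root = ET.Element("store")

    def _find(self, key):
        for entry in self._root.findall("entry"):
            if entry.get("key") == key:
                return entry
        return None

    def __getitem__(self, key):
        entry = self._find(key)
        if entry is None:
            raise KeyError(key)
        return entry.text or ""

    def __setitem__(self, key, value):
        entry = self._find(key)
        if entry is None:
            entry = ET.SubElement(self._root, "entry", key=key)
        entry.text = str(value)
        self._save()

    def __delitem__(self, key):
        entry = self._find(key)
        if entry is None:
            raise KeyError(key)
        self._root.remove(entry)
        self._save()

    def __iter__(self):
        return (entry.get("key") for entry in self._root.findall("entry"))

    def __len__(self):
        return len(self._root.findall("entry"))

    def _save(self):
        # Rewrite the whole file on every change; simple, but not suited to large data.
        ET.ElementTree(self._root).write(
            self.path, encoding="utf-8", xml_declaration=True
        )
```

Usage then looks like ordinary hash access: d = XMLFileDict("store.xml"); d["lang"] = "Perl 6". Swapping the _save and loading code for another serialiser is what would give the YAML or SQL variants.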

XML transformations sounds to me like it would be useful to be able to
transform data that is structured according to one grammar into
another grammatical structure. (Please excuse my long sentences.) Is
that already possible with Perl 6 Grammars? (Please excuse my
ignorance.) If yes, we might even think about a C-to-INTERCAL
translator. (Please excuse me, Dan.)

>creating a P6-specific companion to ASP/JSP/PHP, but one that's
>substantially more OO in nature...

Although it doesn't end in P, I'd add Zope to that list. Definitely
sounds like a killer-app for Perl 6.

Greetings,
Christian

--
cr...@web42.com - http://www.web42.com/crenz/ - http://www.web42.com/

"No Christian and, indeed, no historian could accept the epigram which
defines religion as 'what a man does with his solitude.'"
-- C.S. Lewis, The Weight of Glory

Andy Wardley

Mar 26, 2003, 4:13:12 AM

to Robin Berjon, Michael Lazzaro, perl6-l...@perl.org

Robin Berjon wrote:
> But as someone that also had to parse other people's random
> formats before we had XML, I would like to stress strongly the fact that
> the current situation is *much* better than it was.

True, but you're also missing the point that XML is a festering pile
of steaming camel turds that has been over-designed and over-engineered
by committee for 4 decades and still isn't any closer to being pleasant
or easy to use.

Convergence is good, unless you're converging in a bad place. Now all our
markup is verbose, difficult to parse, memory hungry, tiresome to manipulate,
and so on, (unless you're using YAML of course). XML is Yet Another Silver
Bullet Bandwagon that we all jumped on because the XML software vendors told
us to.

Anyway, this isn't the time or place for an anti-XML rant. Suffice it to
say that at least one of us is hoping that we can do markup better in Perl
6.

A

Simon Cozens

Mar 26, 2003, 4:19:36 AM

to perl6-l...@perl.org

To what extent should the (presumably library-side) ability to parse a
given markup language influence Perl 6's core language design? (which
is what this list is nominally about.) I think this ought to
approximate to "none at all".

--
I'd rather have ham in my sandwich than cheese, but complaining won't do
any good.

Robin Berjon

Mar 26, 2003, 4:39:34 AM

to Austin_...@yahoo.com, perl6-l...@perl.org

Austin Hastings wrote:
> --- Robin Berjon <robin....@expway.fr> wrote:
>>If it is creating a /toolset/ to make recuperating data from a
>>quasi-XML (aka tag soup) then it is an interesting area of research.
>>I can think of two approaches:
>>
>> - have a parametrisable XML grammar. By default it would really
>>parse XML, and barf with extreme prejudice on errors. However
>>individual rules will be relaxable and modifiable to accept
>>different, possibly slightly broken, input. This is imho the
>>least desirable approach.
>
> Why is this the least desirable approach?

To be clear, I don't think it would be a bad thing to have, as a tool. I think
however that it is less optimal than the other solution as it would require
people to parameterise it each time they want to address a new kind of bug,
whereas the HTMLish approach should work out of the box.

If it means having a grammar that can be finely controlled, so that it isn't too
hard to implement XML parser behaviours (and I mean proper XML) mixing push,
pull, trees, and whatnot on top of it, then that's a grand idea.

Robin Berjon

Mar 26, 2003, 4:59:53 AM

to Christian Renz, perl6-l...@perl.org

Christian Renz wrote:
>> of crap known as XSL. An XML-based derivative that performs XML
>> transformations, allowing/using embedded P6 regexs, closures, etc.,
>> and able to more easily translate XML <==> P6 data.
>
> I'm still quite XML-phobic, but I see the need for strong XML support
> in Perl 6. However, I'd like to work with XML in Perl 6 in a way that
> I don't even notice it's XML. Would it be possible to come up with an
> interface to XML that is at least as intuitive as tie is for
> hash<->DBM file? And that can cope with megabyte-sized XML files?

It depends what you mean by all that. In some sense it already exists, there's
nothing keeping someone from providing you a tied interface to XML::Simple, or
some such other module.

If you want more there are all sorts of existing solutions. XML::Twig is one,
and probably the only one that will fulfill all your requirements (ease of use
of an in-memory structure resilient to large documents -- XML took deliberate
design decisions that make that difficult if not impossible).

Barrie Slaymaker has also created tools such as XML::Essex that make XML
processing much more perlish (or in fact, simply more adapted to programming in
general).

There are also budding cross-language projects such as XBind that use a simple
definition of how to map a given vocabulary to programming structures. I'm
betting those will go a long way making things easier.

> In fact, if we're talking about data storage only, it would be
> interesting to have such a tie that allows me to store my data in an
> XML file, YAML file, SQL database etc.

That already exists or is very easily doable, so long as you don't care too much
what the XML looks like.

My point is: Perl 5 already makes XML processing significantly easier than in
other languages, the only competitor I'm aware of being Python. I see *much* in
Perl 6 already that will make it easier still, but a large part of the
frustrations programmers experience have nothing or little to do with what
programming language they use (so long as it's dynamic). They're problems that
need to be solved through new interesting ways of processing XML. I agree with
Simon that there is very likely nothing that Perl 6, at the language level, can
do to provide a solution to these issues. If this list were to find a solution,
it would have nothing that would be Perl 6 specific.

There's a friendly perl-xml list to expose issues and solutions you may have.

Robin Berjon

Mar 26, 2003, 5:06:35 AM

to Andy Wardley, perl6-l...@perl.org

Andy Wardley wrote:
> Robin Berjon wrote:
>>But as someone that also had to parse other people's random
>>formats before we had XML, I would like to stress strongly the fact that
>>the current situation is *much* better than it was.
>
> True, but you're also missing the point that XML is a festering pile
> of steaming camel turds that has been over-designed and over-engineered
> by committee for 4 decades and still isn't any closer to being pleasant
> or easy to use.

This is a large overstatement, not to mention the fact that piling SGML onto it
makes little sense (if more XMLers knew about SGML, they'd produce less cruft).

There are people that use XML to produce camel turd, and there are people that
use it to create very nice things. What this has to do with Perl 6 evades me.

> (unless you're using YAML of course).

YAML is cool, I'd just shoot myself before using it for document authoring.

> Anyway, this isn't the time or place for an anti-XML rant. Suffice it to
> say that at least one of us is hoping that we can do markup better in Perl
> 6.

Everyone is hoping that, I just have yet to see someone point at one place where
Perl 5 hinders XML processing in such a way that Perl 6 could help. I'm all ears
though.

Robin Berjon

Mar 26, 2003, 5:17:14 AM

to Michael Lazzaro, perl6-l...@perl.org

Michael Lazzaro wrote:
> My own musing was not something that would accept bad XML, but something
> more geared as a P6-based replacement for the steaming hunk of crap
> known as XSL. An XML-based derivative that performs XML
> transformations, allowing/using embedded P6 regexs, closures, etc., and
> able to more easily translate XML <==> P6 data.

I personally like XSLT, it does its job well, the syntax is verbose but the
design is clean. It's a real language though, so it's frustrating until one
knows enough to be comfortable (but it does also make much room for baby talk).
Mine even has regexen and such through Perl extensions.

Have you looked at the replacements such as XML::XPathScript or XML::STX? Or
others implemented in other languages that could be ported? For XML <=> P6
translations, are you aware of projects like XBind?

There are a lot of wheels out there, I think p6l can't reinvent all of them ;)

> While we're stuck with XML, I'm not willing to say we in Perl-land
> should be stuck with the currently craptacular XML transformation
> methods being adopted by other languages. :-P

Those of us that like XSLT have very happily adopted it, and in fact we're
better equipped to use it than say the Java folks. Those that didn't like it have
created alternatives. *shrug* Perl as usual.

Leon Brocard

Mar 26, 2003, 5:05:16 AM

to perl6-l...@perl.org

Michael Lazzaro sent the following bits through the ether:

> My own musing was not something that would accept bad XML, but
> something more geared as a P6-based replacement for the steaming
> hunk of crap known as XSL ... For example, one of the Very First
> Things I'll be doing with Perl6 is, of course, creating a
> P6-specific companion to ASP/JSP/PHP

While risking the chance of going very much off topic, might I suggest
that you don't wait until Perl 6 to do all these. Sure, Perl 6 will be
all-dancing with beautiful syntax but Perl 5 has the advantage of
being here now and not really that different. You can play with
prototypes and desired syntax now, and get something finished by the
release of Perl 6.0.0 ;-)

Leon
--
Leon Brocard.............................http://www.astray.com/
scribot.................................http://www.scribot.com/

... Are you asking me out? That's so cute. What's your name again?

Andy Wardley

Mar 26, 2003, 9:55:19 AM

to Robin Berjon, perl6-l...@perl.org

Robin Berjon wrote:
> I just have yet to see someone point at one place
> where Perl 5 hinders XML processing in such a way that Perl 6 could help.

If my understanding of the design of Perl 6 is correct, the lexer, parser
and any other related components will be highly configurable and/or
replaceable. The goal is to provide support for "little languages" by
separating Perl the language from perl the interpreter. It will be
possible to modify or replace Perl the grammar so that perl the program
can parse other languages, including Python, Ruby and presumably, XML.

So instead of writing Perl programs to parse and manipulate XML, it
should be possible to modify Perl itself so that it parses the XML directly
into some internal form suitable for programmatical manipulation.

I presume that it should also be possible to extend the rules of a default
non-validating XML parser grammar with additional rules to encode an XML
schema. On top of that it should be possible to define further production
rules that are invoked as the source document is parsed, i.e. an XML schedule
(schema/schedule ~= pattern/action).
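
The pattern/action idea can be approximated today at the library level. The following is only a sketch in Python, not a Perl 6 grammar: run_schedule and the shape of the actions table (element name mapped to a callback) are assumptions made for this example, and it matches on element names only, with no schema validation.

```python
import io
import xml.etree.ElementTree as ET


def run_schedule(source, actions):
    """Stream `source`, calling actions[tag](element) as each element closes."""
    for _event, element in ET.iterparse(source, events=("end",)):
        action = actions.get(element.tag)
        if action is not None:
            action(element)
            element.clear()  # release the handled element's contents


titles = []
run_schedule(
    io.BytesIO(
        b"<lib><book><title>Camel</title></book>"
        b"<book><title>Llama</title></book></lib>"
    ),
    {"title": lambda element: titles.append(element.text)},
)
print(titles)
```

Each action fires while the document is still being read, which is the "invoked as the source document is parsed" part; richer patterns (paths, attributes, schema rules) would need a real matching layer on top.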

How exactly this will manifest itself, I cannot tell. Nor can I say if this
is actually a sensible thing to do or not. But unless my understanding is
warped, support for parsing XML and other markup languages could be moved
down into the core of the parser internals for Perl 6.

For example, it might be possible to do something like this:

use Perl6::XML;

<thingy>
<blah>blah blah</blah>
</thingy>

use Perl6;

print $thingy.blah;
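
The end result of that snippet (reaching into the markup as if it were an object) can be imitated at run time without touching any parser internals. This is a rough Python sketch of that effect only, not of the grammar-level integration being speculated about; the Node class is invented for the illustration.

```python
import xml.etree.ElementTree as ET


class Node:
    """Hypothetical wrapper exposing child elements as attributes."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        child = self._element.find(name)
        if child is None:
            raise AttributeError(name)
        return Node(child)

    def __str__(self):
        return self._element.text or ""


SOURCE = """
<thingy>
  <blah>blah blah</blah>
</thingy>
"""

thingy = Node(ET.fromstring(SOURCE))
print(thingy.blah)
```

The difference from the proposal is that the XML here is an ordinary string handed to a library, whereas the Perl 6 idea is for the compiler's own grammar to switch languages mid-file.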

This is all speculation and hand-waving, of course. But the point is that
Perl 6's extended parsing capabilities could well provide a much greater
level of integration between Perl, XML and various other programming and
markup languages.

My rant against the XML machine was really an aside. Take everything I say
with a pinch of salt. :-)

A

Rafael Garcia-Suarez

Mar 26, 2003, 10:23:31 AM

to Andy Wardley, robin....@expway.fr, perl6-l...@perl.org

Andy Wardley wrote:
>
> If my understanding of the design of Perl 6 is correct, the lexer, parser
> and any other related components will be highly configurable and/or
> replaceable. The goal is to provide support for "little languages" by
> separating Perl the language from perl the interpreter. It will be
> possible to modify or replace Perl the grammar so that perl the program
> can parse other languages, including Python, Ruby and presumably, XML.

I think that you're a bit mistaken: the goal is to have (a) parrot
execute other languages (once compiled to parrot bytecode) and (b) perl's
parser able to modify itself at runtime. The fact that Perl's grammar
can evolve doesn't mean that the basic entities it operates on will also
evolve, and, as a Python string can't be seamlessly mapped to a Perl string,
one can't have perl behave as a Python interpreter only by modifying its
parser.

> So instead of writing Perl programs to parse and manipulate XML, it
> should be possible to modify Perl itself so that it parses the XML directly
> into some internal form suitable for programmatical manipulation.

And moreover XML by itself is not a programming language, so I don't
see how it's possible to build a generic interpreter for it.

> How exactly this will manifest itself, I cannot tell.

If you wave hands very fast, bushes might start burning ;-)

> Nor can I say if this
> is actually a sensible thing to do or not. But unless my understanding is
> warped, support for parsing XML and other markup languages could be moved
> down into the core of the parser internals for Perl 6.

I think you're overloading the meaning of 'parser' too much here.
Basically I think that perl6's internal parser, even after heavy
reconfiguration, will remain an engine to parse context-free languages,
with a few improvements. That's very different from a parser for markup
languages. (Of course, with a mechanism comparable to perl 5's source
filters, one can plug everything. So, as Leon was saying, you can begin to
implement a Perl5::XML source filter module right now.)

Austin Hastings

Mar 26, 2003, 10:32:58 AM

to robin....@expway.fr, Michael Lazzaro, perl6-l...@perl.org

--- Robin Berjon <robin....@expway.fr> wrote:

> Have you looked at the replacements such as XML::XPathScript or
> XML::STX? Or others implemented in other languages that could be
> ported? For XML <=> P6
> translations, are you aware of projects like XBind?
>
> There are a lot of wheels out there, I think p6l can't reinvent all
> of them ;)

To answer a question you asked on an earlier thread, this is one of the
ways that Perl makes doing XML difficult.

Q: "What's the right CPAN lib to pull for parsing/rewriting XML?"

A: Look, we've got a plethora of XML libs, all indistinguishable at
first glance. You'll need to do a week-long research project to figure
out what's what! OK?

While P6ML may be off-topic for the language, maybe this issue isn't:
Is there a plan for the "core libs"? In other words, since we're moving
scads of things out of core, we're implying a set of standard libs. In
P5 this was a very minimalist set, since most of what was essential was
in core. Now I'm proposing that some technologies, like DBI, mod_perl,
etc. have proven themselves so popular that they (1) will instantly get
moved over; and (2) should probably be moved over exactly ONCE. Rather
than having a maze of twisty database interfaces, all alike, we want
DBI. Well, rather than having a slew of subtly incompatible XML
interfaces, ...

So I guess, at the language level I'm asking if there's a process in
place to identify these essential libs and to move forward on them?

=Austin

Austin Hastings

Mar 26, 2003, 10:43:41 AM

to Andy Wardley, Robin Berjon, perl6-l...@perl.org

--- Andy Wardley <a...@andywardley.com> wrote:
> For example, it might be possible to do something like this:
>
> use Perl6::XML;
>
> <thingy>
> <blah>blah blah</blah>
> </thingy>
>
> use Perl6;
>
> print $thingy.blah;
>

Every once in a while, I look at what I'm sending to the list, and I
think "Am I too far out, here?"

And then I see messages like this, and I think "I'm not visionary
enough for this list."

Thanks, Andy.

=Austin

Robin Berjon

Mar 26, 2003, 11:02:00 AM

to Austin_...@yahoo.com, perl6-l...@perl.org

Austin Hastings wrote:
> --- Robin Berjon <robin....@expway.fr> wrote:
> To answer a question you asked on an earlier thread, this is one of the
> ways that Perl makes doing XML difficult.
>
> Q: "What's the right CPAN lib to pull for parsing/rewriting XML?"
>
> A: Look, we've got a plethora of XML libs, all indistinguishable at
> first glance. You'll need to do a week-long research project to figure
> out what's what! OK?

I understand this issue very clearly, but I don't think it's Perl's fault. There
have been talks on and off about a Perl-XML SDK for at least two or three years,
and it's not easy.

We have standardised interop between modules on PerlSAX. If there is sufficient
community pressure, we have all the drafts ready to do the same thing with
PerlDOM in a shortish timeframe (it just seems that people are happy with
XML::LibXML). We're standardising the selection of transforming modules right now.

However that will only help so much. If you want to jump into XML processing and
you don't know about SAX or DOM (Perl-related or not) you have some homework to
do. You don't need to master them, but you need to have an idea of the ways in
which they work.

Also, it is by and large recognised outside the bounds of our community that
Perl's wealth in XML processing is that while most other languages have just DOM
and SAX, we have all sorts of alternatives like XML::Twig, XML::Essex,
XML::Simple and so forth that make things much easier when you have specific
requirements. We're possibly the only language with three or four different
transformation packages...

Which of those would go in a core lib or SDK? The ones that correspond to XML
standards are used a lot and standard so they'd probably be in (with C libs
dependencies issues), others are non-standard but also used much, others aren't
used a lot but are so good they really should be... Then you get to the modules
that interface to specific languages such as RDF or SVG, and it's a mess to deal
with. All things considered, less energy might be spent by people spending a
week researching Perl XML modules all together than on the creation of an SDK ;)

It's a complex set of issues, and if we're to work on it that work can be done
with Perl 5 modules, on the perl-xml list, right now, independently from p6l
issues. Otherwise p6l will probably get to know *much* more about XML than it
wants to, and the perl-xml community will be excluded from choices on stuff that
concerns it very directly.

> Well, rather than having a slew of subtly incompatible XML
> interfaces, ...

All the major ones should support SAX 2 for interop by now. If there are
incompatibilities you can certainly file a bug report.

> So I guess, at the language level I'm asking if there's a process in
> place to identify these essential libs and to move forward on them?

Ask the people that use them?

Jonathan Scott Duff

Mar 26, 2003, 11:14:15 AM

to Robin Berjon, Austin_...@yahoo.com, perl6-l...@perl.org

On Wed, Mar 26, 2003 at 05:02:00PM +0100, Robin Berjon wrote:
> > So I guess, at the language level I'm asking if there's a process in
> > place to identify these essential libs and to move forward on them?
>
> Ask the people that use them?

Didn't there used to be a stdlib mailing list for discussing this
stuff?

-Scott
--
Jonathan Scott Duff
du...@cbi.tamucc.edu

Robin Berjon

Mar 26, 2003, 11:26:03 AM

to du...@pobox.com, perl6-l...@perl.org

Jonathan Scott Duff wrote:
> On Wed, Mar 26, 2003 at 05:02:00PM +0100, Robin Berjon wrote:
>>Ask the people that use them?
>
> Didn't there used to be a stdlib mailing list for discussing this
> stuff?

Yes, and it had even started well by trimming a long list of suggestions one by
one (I think Nat was in charge, but I could be misremembering) but it went the
way of the dodo. IIRC it was perl-sdk or something like that.

A subsequent idea was to have Perl sub-communities define their own SDKs, but
that apparently didn't work out either.

Robin Berjon

Mar 26, 2003, 12:07:52 PM

to Andy Wardley, perl6-l...@perl.org

Andy Wardley wrote:
> Robin Berjon wrote:
>>I just have yet to see someone point at one place
>>where Perl 5 hinders XML processing in such a way that Perl 6 could help.

> (...)

> So instead of writing Perl programs to parse and manipulate XML, it
> should be possible to modify Perl itself so that it parses the XML directly
> into some internal form suitable for programmatical manipulation.

> (...)

> How exactly this will manifest itself, I cannot tell. Nor can I say if this
> is actually a sensible thing to do or not. But unless my understanding is
> warped, support for parsing XML and other markup languages could be moved
> down into the core of the parser internals for Perl 6.
>
> For example, it might be possible to do something like this:
>
> use Perl6::XML;
>
> <thingy>
> <blah>blah blah</blah>
> </thingy>
>
> use Perl6;
>
> print $thingy.blah;

What you point to in terms both of difficulties with the existing approaches and
in terms of solutions makes a *lot* of sense. I'm afraid however that some form
of cold is preventing you from smelling the sulfurous fumes emanating from
dragons hiding right around the corner :)

I'll leave aside the excellent idea of allowing one to embed XML data into Perl
source as you describe it (a nice replacement for __DATA__ for sure) to focus on
the rest because if we can do that with external XML documents, the part about
inlining XML becomes trivial.

The basic problem is that to produce a data structure you can either know
something of the kind of XML you're to be using or you can do it in a generic
manner.

The generic manner is simple, in fact it's called XML::Simple. It's great at
what it does, but you get a data structure which you need to discover and in
many cases you probably want something where you have to pay less attention to
whether something is a string or a hashref. Ask Nat[0] ;)

The vocabulary specific manner is more complex, because you need something
external to the XML to describe how the mapping operates. In your example if I
were to add a <blah> element, all of a sudden $thingy.blah might be an array
with the two contents. Things get hairy fast without even using anything crufty,
especially when you add attribute parsing, namespaces, in-document links...
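
A minimal sketch of that scalar-versus-array problem, written in Python only because its standard library ships an XML parser. The helper name `to_data` and the mapping rules are assumptions made for illustration; this is not how XML::Simple is implemented.

```python
import xml.etree.ElementTree as ET

def to_data(elem):
    """Naive generic mapping: leaf elements become strings, others dicts."""
    children = list(elem)
    if not children:
        return elem.text or ""
    out = {}
    for child in children:
        value = to_data(child)
        if child.tag in out:
            # A repeated tag silently turns the earlier scalar into a list.
            if not isinstance(out[child.tag], list):
                out[child.tag] = [out[child.tag]]
            out[child.tag].append(value)
        else:
            out[child.tag] = value
    return out

one = to_data(ET.fromstring("<thingy><blah>blah blah</blah></thingy>"))
two = to_data(ET.fromstring(
    "<thingy><blah>blah blah</blah><blah>more blah</blah></thingy>"))
```

Here `one["blah"]` is a string while `two["blah"]` is a list, so calling code has to check the type before it can use the value.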

The data binding folks have tried to address the problem using XML Schema, and
the result is, hmmm, "unpleasant", to put it politely. The SOAP and WSDL
people have been at it, and I won't even describe the result because I couldn't
possibly be polite about it.

Imho a grammar-based approach would likely be too low-level. I'm currently
betting on something that would mix XBind[1] and Regular Fragmentations[2]. The
first one defines simple mappings as described above, the second tells you how
to parse data in XML documents that has structure not expressed in XML (eg
<date>2003-03-26</date>) so that it is seen in a structured way, without the
need for typing.
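
To make the second idea concrete, here is a rough Python sketch of splitting text content into structured children. It does not use the actual Regular Fragmentations syntax; the `RULES` table, the `fragment` function and the child element names are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule table: element name -> regex with named groups.
RULES = {
    "date": re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})$"),
}

def fragment(elem):
    """Replace matching text content with one child element per named group."""
    # Snapshot the nodes first, since children are added while we walk.
    for node in list(elem.iter()):
        rule = RULES.get(node.tag)
        match = rule.match(node.text or "") if rule else None
        if match:
            node.text = None
            for name, value in match.groupdict().items():
                ET.SubElement(node, name).text = value
    return elem

doc = fragment(ET.fromstring("<entry><date>2003-03-26</date></entry>"))
year = doc.find("date/year").text
```

After the call, the date is reachable as `date/year`, `date/month` and `date/day`, with no schema types involved.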

These approaches are elegant, and have the advantage of being truly
cross-language so that we can let the Python people write the descriptions and
use them directly :)

One very cool thing that could be done in Perl 6 would be to take an
XBind+RegFrag document and generate a grammar derived from the P6 XML grammar
that would 1) be specific to the vocabulary (and thus hopefully faster than a
generic XML grammar, though I don't have /too/ much hope) and 2) directly
produce the object representation you want and return it in the parse object.

> This is all speculation and hand-waving, of course. But the point is that
> Perl 6's extending parsing capabilities could well provide a much greater
> level of integration between Perl, XML and various other programming and
> markup languages.

Yes certainly, but again we could already go much farther than we are today
using Perl 5 (and a lot of tuits).

> My rant against the XML machine was really an aside. Take everything I say
> with a pinch of salt. :-)

I might have overreacted slightly because I'm tired of the xmlHorribleKludges
obscuring the coolness that nice and helpful people work hard on. I can't blame
anyone for not seeing through the blazing storm of hypish PR...

[0]http://use.perl.org/~gnat/journal/11081
[1]http://www.prescod.net/xml/xbind/
[2]http://www.simonstl.com/projects/fragment/

Michael Lazzaro

Mar 26, 2003, 2:45:14 PM

to perl6-l...@perl.org

Robin Berjon wrote:
<one metric ton of useful stuff, in various messages, all of which I
agree with>

Including...

> The data binding folks have tried to address the problem using XML
> Schema, and the result is, hmmm, "unpleasant" to use something polite.
> The SOAP and WSDL people have been at it, and I won't even describe
> the result because I couldn't possibly be polite about it.
>
> Imho a grammar-based approach would likely be too low-level. I'm
> currently betting on something that would mix XBind[1] and Regular
> Fragmentations[2]. The first one defines simple mappings as described
> above, the second tells you how to parse data in XML documents that
> has structure not expressed in XML (eg <date>2003-03-26</date>) so
> that it is seen in a structured way, without the need for typing.

<snip>

> One very cool thing that could be done in Perl 6 would be to take an
> XBind+RegFrag document and generate a grammar derived from the P6 XML
> grammar that would 1) be specific to the vocabulary (and thus
> hopefully faster than a generic XML grammar, though I don't have /too/
> much hope) and 2) directly produce the object representation you want
> and return it in the parse object.

Indeed. This is the primary problem space. Nobody has done this well.
If we could provide a toolset for doing this, we would Really Have
Something.

My initial query about the ambiguously-named "P6ML" was not based so
much on a notion that such an effort couldn't be done in Perl5, and
more on the notion that it may be far _more_ possible to do this,
quickly & credibly, using P6 typing/OO and the new regex engine. As I
said, I've done quite a bit of prototyping, and the P5 solutions can be
very, very tedious. (P5 and P6 may be mostly alike, but it's the parts
that aren't "mostly" that have driven the very need for P6 -- and just
so happen to be the very parts that make this problem so awkward in P5.)

And in case I haven't mentioned it this week, you Parrot folks are my
heroes.

> [0]http://use.perl.org/~gnat/journal/11081
> [1]http://www.prescod.net/xml/xbind/
> [2]http://www.simonstl.com/projects/fragment/

Thanks for those... I was aware of the first two links, but I had
completely missed the Frag one -- I plead ignorance on that. You are
correct, it looks quite promising.

MikeL

Miko O'Sullivan

Mar 26, 2003, 2:24:57 PM

to perl6-l...@perl.org

Andy Wardley wrote:
>
> For example, it might be possible to do something like this:
>
> use Perl6::XML;
>
> <thingy>
> <blah>blah blah</blah>
> </thingy>
>
> use Perl6;
>
> print $thingy.blah;

We already have the ability to embed foreign languages (XML, HTML,
whatever) using here docs:

$myml = MyXmlParser->new(<< '(MARKUP)');
<thingy>
<blah>blah blah</blah>
</thingy>
(MARKUP)

So I guess I don't see the point in adding another way to say "the foreign
syntax starts HERE and ends HERE". (Is that why they're called "here"
documents? I've always wondered about that name.)

And now to make a bit of a tangent... I've always thought it would be nice
to have an official way to indicate the foreign language in the here doc.
That way my editor could do syntax highlighting for HTML, JavaScript,
whatever. I suppose it could even do grammar and spell checking on
English content.

It wouldn't have to be much of an extension to the here doc syntax to
allow for a language indicator:

$myml = MyXmlParser->new(<< '(MARKUP)', type=>'text/xml');

or

print << '(MARKUP)', type=>'human/en';

-Miko

Miko O'Sullivan
Programmer Analyst
Rescue Mission of Roanoke

Adam Turoff

Mar 26, 2003, 4:23:37 PM

to Simon Cozens, perl6-l...@perl.org

On Wed, Mar 26, 2003 at 09:19:36AM +0000, Simon Cozens wrote:
> To what extent should the (presumably library-side) ability to parse a
> given markup language influence Perl 6's core language design? (which
> is what this list is nominally about.) I think this ought to
> approximate to "none at all".

Approximately none, except that Perl's self-selected problem domain is
text hacking, and XML is redefining the meaning of "text hacking".

AFAICT, all of this is rather moot. The ability to create a presumably
fast parser using rule{}'s and such solves 80% of the problem. From
there, it's a SMOP to convert text-with-angle-brackets to sensible
native data structures or native processing techniques.

I believe Robin's interest in the area is in ensuring that there will be
a simple way to take a specific XML grammar and [auto]generate an
angle-bracket-friendly parser that produces appropriate domain-specific
data structures without grovelling through horribly generic data
structures, events or whatnot.

Z.

Joseph F. Ryan

Mar 26, 2003, 8:15:53 PM

to Miko O'Sullivan, perl6-l...@perl.org

Miko O'Sullivan wrote:

>Andy Wardley wrote:
>
>>For example, it might be possible to do something like this:
>>
>> use Perl6::XML;
>>
>> <thingy>
>> <blah>blah blah</blah>
>> </thingy>
>>
>> use Perl6;
>>
>> print $thingy.blah;
>
>
>
>We already have the ability to embed foreign languages (XML, HTML,
>whatever) using here docs:
>
> $myml = MyXmlParser->new(<< '(MARKUP)');
> <thingy>
> <blah>blah blah</blah>
> </thingy>
> (MARKUP)
>

Well, P6C has the new ability of inlining code from another parrot-
based language. All someone needs to do is write an XML processor
that spits out pasm/imcc, and then:

use inline 'XML', q[

<processing XSLT stuff or whatever here />
];

or even:

use inline 'XML', <<"XML_IS_FUN";

<processing XSLT stuff or whatever here />
XML_IS_FUN

See how easy that is? Who needs a stinking P6ML now? (-:

Joseph F. Ryan
ryan...@osu.edu


Joseph F. Ryan

Mar 26, 2003, 9:47:45 PM

to perl6-l...@perl.org

Miko O'Sullivan wrote:

>Andy Wardley wrote:
>
>>For example, it might be possible to do something like this:
>>
>> use Perl6::XML;
>>
>> <thingy>
>> <blah>blah blah</blah>
>> </thingy>
>>
>> use Perl6;
>>
>> print $thingy.blah;
>
>
>
>We already have the ability to embed foreign languages (XML, HTML,
>whatever) using here docs:
>
> $myml = MyXmlParser->new(<< '(MARKUP)');
> <thingy>
> <blah>blah blah</blah>
> </thingy>
> (MARKUP)

As a side note, P6C now has the ability to inline code of a different
language, so something like this will work:

use inline 'XML', q[
<thingy>
<blah>blah blah</blah>
</thingy>

...

<XSLT PROCESSING STUFF>
];

Provided, of course, that there is a parrot/imcc targeted XML processor. Who needs a P6ML now? (-:

Andy Wardley

Mar 27, 2003, 6:37:59 AM

to Miko O'Sullivan, perl6-l...@perl.org

Miko O'Sullivan wrote:
> We already have the ability to embed foreign languages (XML, HTML,
> whatever) using here docs:
>
> $myml = MyXmlParser->new(<< '(MARKUP)');
> <thingy>
> <blah>blah blah</blah>
> </thingy>
> (MARKUP)

True, but what kind of magic is hiding inside MyXmlParser?

One problem is that writing MyXmlParser to parse and validate XML and
then generate some corresponding Perl data structure is difficult and
error prone.

In the simple case, XML::Simple is your friend. But as Robin points
out, the simple approach falls down when you need finer control
over what you're doing.

You can use the XML::Schema modules (if you're feeling brave) and that
will generate a validating parser with control over the generated data
structure. But it's big and bulky and the complexities of XML Schema
itself make it a daunting task.

There are various other modules and techniques which can achieve the
desired result, but I've yet to find one that was both easy to use and
powerful (although I need to check out those links that Robin posted).

So I'm thinking that if the Perl 6 parser is as flexible and powerful
as promised, then can we adapt it to simplify the task of parsing XML
into internal data structures?

One benefit of inlined XML over the example above is that it would be
parsed at compile time, not runtime. When our modified parser
sees this:

use Perl6::XML;

<thingy>
<blah>blah blah</blah>
</thingy>

It would effectively re-write it as if written:

my $thingy = {
blah => 'blah blah',
}

and then generate the appropriate opcodes to implement it at runtime.

A further benefit would be that your parsed and validated XML markup
could then be stored as Parrot bytecode. You would effectively be
"compiling" XML into bytecode that you could load into other programs
with a simple "use". That would be neat.

As and when we need more control over the XML validation or code
generation, we would write our own modified XML grammar modules.
Apocalypse 5 suggests this would be a simple matter of defining a
few new 'rule' constructs. For example, we might want to add a rule
for matching thingy/blah that constructs a list rather than a scalar.
Thus, the XML would be parsed as if written:

my $thingy = {
blah => [ 'blah blah' ],
}
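
The per-path override can be approximated today without grammar support. The following is a Python sketch under stated assumptions: `to_data` and its `force_list` parameter are invented names, and a set of slash-separated paths stands in for the extra 'rule' constructs described above.

```python
import xml.etree.ElementTree as ET

def to_data(elem, force_list=frozenset(), path=""):
    """Map an element to nested dicts; listed paths always become lists."""
    path = f"{path}/{elem.tag}" if path else elem.tag
    children = list(elem)
    if not children:
        return elem.text or ""
    out = {}
    for child in children:
        value = to_data(child, force_list, path)
        if f"{path}/{child.tag}" in force_list:
            # Overridden path: collect values in a list, even a single one.
            out.setdefault(child.tag, []).append(value)
        else:
            # Default: plain scalar (a repeated tag would overwrite it).
            out[child.tag] = value
    return out

xml = "<thingy><blah>blah blah</blah></thingy>"
as_scalar = to_data(ET.fromstring(xml))
as_list = to_data(ET.fromstring(xml), force_list={"thingy/blah"})
```

With the override in place, `as_list["blah"]` is a one-element list, matching the second data structure above, while `as_scalar["blah"]` stays a plain string.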

This is all largely hypothetical, of course. Hence the continued hand
waving and general lack of detail. Consider it an open thought in process.

:-)

A

Joseph F. Ryan

Mar 27, 2003, 2:53:28 PM

to perl6-l...@perl.org

Joseph F. Ryan wrote:

>Miko O'Sullivan wrote:
>
>>Andy Wardley wrote:
>>
>>>For example, it might be possible to do something like this:
>>>
>>> use Perl6::XML;
>>>
>>> <thingy>
>>> <blah>blah blah</blah>
>>> </thingy>
>>>
>>> use Perl6;
>>>
>>> print $thingy.blah;
>>
>>We already have the ability to embed foreign languages (XML, HTML,
>>whatever) using here docs:
>>
>>$myml = MyXmlParser->new(<< '(MARKUP)');
>> <thingy>
>> <blah>blah blah</blah>
>> </thingy>
>>(MARKUP)
>
>As a side note, P6C now has the ability to inline code of a different
>language, so something like this will work:
>
>use inline 'XML', q[
> <thingy>
> <blah>blah blah</blah>
> </thingy>
>
> ...
>
> <XSLT PROCESSING STUFF>
>];
>
>Provided, of course, that there is a parrot/imcc targeted XML processor. Who needs a P6ML now? (-:
>
>Joseph F. Ryan
>ryan...@osu.edu

Woops; my mail client crashed when I sent this the first time; I
had thought it hadn't sent, so I re-wrote it and sent it again. Sorry
for the double post!