[PUGS] patch - few hyperops

Markus Laire

Mar 11, 2005, 5:53:45 AM

to perl6-c...@perl.org

This patch adds these binary-hyperops to pugs: »+« »*« »/« »x« »xx«

This is my first patch ever, so could someone check whether it is OK?
It adds only a few hyperops, as I'm not 100% sure this is the
right way to do it.

Also, this code currently emits the following warning while compiling:

src/Prim.hs:398:
Warning: Pattern match(es) are non-exhaustive
In the definition of `op2Hyper': Patterns not matched: _ _ _

because I'm not sure what op2Hyper should do if it gets invalid types.

The first todo-test from t/op/hyper.t now passes.
These examples also work now:

(1,2,3) »x« 3 --> ('111', '222', '333')
(1,2,3) »x« (3,2,1) --> ('111', '22', '3')
1 »x« (3,2,1) --> ('111', '11', '1')

(1,2,3) »xx« 3 --> ((1,1,1), (2,2,2), (3,3,3))
(1,2,3) »xx« (3,2,1) --> ((1,1,1), (2,2), (3))
1 »xx« (3,2,1) --> ((1,1,1), (1,1), (1))

(20,40,60) »/« (2,5,10) --> (10,8,6)

(1,2,3) »+« (10,20,30) »*« (2,3,4) --> (21,62,123)
((1,2,3) »+« (10,20,30)) »*« (2,3,4) --> (22,66,132)
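
The behaviour shown in these examples can be sketched in Python. This is only an illustration, not the Haskell in the patch: the names `hyper`, `str_repeat` and `list_repeat` are made up here, and raising an error for unequal lengths or for two scalars is an assumption, since the post leaves both cases open. The final `TypeError` plays the role of a catch-all clause, which is one way the non-exhaustive pattern warning could be resolved.

```python
from operator import add, mul, truediv


def hyper(op, lhs, rhs):
    """Apply the binary function `op` element-wise.

    A scalar on either side is paired with every element of the other side.
    """
    lhs_is_list = isinstance(lhs, (list, tuple))
    rhs_is_list = isinstance(rhs, (list, tuple))
    if lhs_is_list and rhs_is_list:
        if len(lhs) != len(rhs):
            # Assumption: the post does not say what unequal lengths should do.
            raise ValueError("hyper: operand lengths differ")
        return [op(a, b) for a, b in zip(lhs, rhs)]
    if lhs_is_list:
        return [op(a, rhs) for a in lhs]
    if rhs_is_list:
        return [op(lhs, b) for b in rhs]
    # Explicit fallback for "invalid types" (assumed behaviour).
    raise TypeError("hyper: at least one operand must be a list")


def str_repeat(value, count):
    """Stand-in for the `x` operator: repeat the string form of `value`."""
    return str(value) * count


def list_repeat(value, count):
    """Stand-in for the `xx` operator: a list holding `value` `count` times."""
    return [value] * count


print(hyper(str_repeat, [1, 2, 3], [3, 2, 1]))
print(hyper(list_repeat, 1, [3, 2, 1]))
# »*« binds tighter than »+«, so the unparenthesized form nests like this:
print(hyper(add, [1, 2, 3], hyper(mul, [10, 20, 30], [2, 3, 4])))
```

Because `hyper` takes the operator as an argument, any binary function can be wrapped without writing a separate hyper version of it.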

--
Markus Laire
<Jam. 1:5-6>

hyper_ops.patch

Markus Laire

Mar 11, 2005, 2:30:40 PM

to perl6-c...@perl.org

Thanks, applied. :)

Larry Wall

Mar 11, 2005, 3:25:28 PM

to perl6-c...@perl.org

On Fri, Mar 11, 2005 at 12:53:45PM +0200, Markus Laire wrote:
: This is my first patch ever, so could someone check if this is OK or
: not. This only adds a few hyperops, as I'm not 100% sure if this is the
: right way to do it.

It is good to have a general mechanism for wrapping any binary op
up as a hyper op. We'll eventually have to extend that idea into
the parser as well so that we don't have to dup all the operators,
but first things first.

Larry

Autrijus Tang

Mar 12, 2005, 2:14:33 AM

to perl6-c...@perl.org

On Fri, Mar 11, 2005 at 12:25:28PM -0800, Larry Wall wrote:
> It is good to have a general mechanism for wrapping any binary op
> up as a hyper op. We'll eventually have to extend that idea into
> the parser as well so that we don't have to dup all the operators,
> but first things first.

Yup, my thoughts exactly. I'll hyper-opify the parser this weekend,
if Luke did not beat me to it.

Oh, btw, are there any more documents on the &statement:<> level
parsing and handling somewhere, or at least a general overview of
how those things are defined? :)

Thanks,
/Autrijus/

Larry Wall

Mar 12, 2005, 3:06:09 AM

to perl6-c...@perl.org

On Sat, Mar 12, 2005 at 03:14:33PM +0800, Autrijus Tang wrote:
: Oh, btw, are there any more documents on the &statement:<> level
: parsing and handling somewhere, or at least a general overview of
: how those things are defined? :)

Below is an excerpt of something I sent Patrick last month that might
provide a bit of help. A bit of background--for some time we've been
proposing to use a hybrid parser with three layers: there's a top-down
parser that gets down to the expression level, a bottom-up operator
precedence parser that does expressions, and finally the "lexer"
for each term or operator is again a top-down parser called by the
operator precedence grammar whenever it needs the next lookahead.
This keeps most of the benefits of top-down parsing while letting us
avoid 24 or so levels of recursion on every term. It also lets us
add new operators and precedence levels without having to recalculate
the entire grammar after every definition. Anyway, the discussion
below assumes that architecture.
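
The middle and bottom layers of that architecture can be illustrated with a small Python sketch. It is not Pugs or Perl 6 code: the `INFIX` table, `lex` and `parse` are hypothetical names, terms are limited to integers, and precedence climbing is used as one common way to write an operator precedence parser. The parser tells the lexer what it expects next, and operators live in a plain table that can grow at run time.

```python
import re

# Hypothetical operator table: name -> (precedence, associativity).
INFIX = {"+": (10, "left"), "*": (20, "left"), "**": (30, "right")}

NUMBER = re.compile(r"\d+")


def lex(src, pos, expect):
    """Return (kind, text, new_pos) for the next token.

    `expect` is "term" or "operator"; the parser decides which one it needs.
    """
    while pos < len(src) and src[pos].isspace():
        pos += 1
    if expect == "term":
        match = NUMBER.match(src, pos)
        if not match:
            raise SyntaxError(f"expected a term at offset {pos}")
        return ("term", match.group(), match.end())
    if pos == len(src):
        return ("end", "", pos)
    # Try longer operator names first so "**" wins over "*".
    for name in sorted(INFIX, key=len, reverse=True):
        if src.startswith(name, pos):
            return ("infix", name, pos + len(name))
    raise SyntaxError(f"expected an operator at offset {pos}")


def parse(src, pos=0, min_prec=0):
    """Precedence climbing; returns (tree, new_pos)."""
    _, text, pos = lex(src, pos, "term")
    left = int(text)
    while True:
        kind, name, after = lex(src, pos, "operator")
        if kind == "end":
            return left, pos
        prec, assoc = INFIX[name]
        if prec < min_prec:
            return left, pos
        next_min = prec + 1 if assoc == "left" else prec
        right, pos = parse(src, after, next_min)
        left = (name, left, right)


print(parse("1 + 2 * 3")[0])
# A new operator only needs a table entry; nothing is regenerated.
INFIX["-"] = (10, "left")
print(parse("5 - 2 - 1")[0])
```

The recursion depth depends on how operators nest in the input, not on the number of precedence levels in the table.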

Larry
-------------------------------------
[snip]
But in the long run I think anything that can show up right after
a term has to be recognized by the lexer in parallel, including
infix ops. I think we have several "spots" that are combinations of
various syntactic categories. To oversimplify, at the start of a
statement the lexer can recognize in parallel any of:

statement_control|term|prefix|label

Otherwise if we're expecting a term, it's:

term|prefix

and if we're expecting an operator, it's:

postfix|infix

The intent of the S5 redefinition of how %foo is matched is to allow
those three main categories to each be represented by a single hash
that is really a data structure functioning as a switch. But each of
those hashes might switch out to any of several of the real syntactic
categories, as shown by the | above.
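
A hash acting as a switch might look like the following Python sketch. The table contents and the `switch` function are invented for illustration, only literal tokens are covered, and longest-key matching is an assumption about how ties between keys are broken.

```python
# Hypothetical tables: one per lexer state, each mapping a literal token
# to the real syntactic category it belongs to.
STATE_TABLES = {
    "statement": {"if": "statement_control", "-": "prefix", "!": "prefix"},
    "term": {"-": "prefix", "!": "prefix"},
    "operator": {"++": "postfix", "+": "infix", "-": "infix"},
}


def switch(state, src, pos):
    """Return (token, category) for the longest key of the state's table
    that matches src at pos, or None if no key matches."""
    best = None
    for key, category in STATE_TABLES[state].items():
        if src.startswith(key, pos):
            if best is None or len(key) > len(best[0]):
                best = (key, category)
    return best


# The same character lands in a different category depending on the state.
print(switch("term", "-x", 0))
print(switch("operator", "-x", 0))
print(switch("operator", "x++", 1))
```

Each state needs only one lookup, even though that lookup can land in any of several categories.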

As I say, that's oversimplified. The real 3 states are closer to this:

Statement:
    statement_control
    term
    prefix
    label
    scope_declarator

Term:
    term
    prefix
    circumfix
    statement_modifier
    scope_declarator
    infix_postfix_meta_operator
    prefix_postfix_meta_operator

Operator:
    postfix
    postfix_prefix_meta_operator
    postcircumfix
    infix
    infix_circumfix_meta_operator
    coerce
    statement_modifier
    statement_block

Though I'm neglecting the fact that to handle our whitespace
dependencies, some of these categories are split into two substates
depending on whether we just traversed any whitespace. So there
are really five main states (statements don't care about leading
whitespace):

Expect statement:
    statement_control
    label
    term
    prefix
    circumfix
    scope_declarator

Expect term without <ws>:
    term
    prefix
    circumfix
    statement_modifier
    scope_declarator
    infix_postfix_meta_operator
    prefix_postfix_meta_operator

Expect term after <ws>:
    term
    prefix
    circumfix
    statement_modifier
    scope_declarator

Expect operator without <ws>:
    postfix (either dotted or undotted form)
    postfix_prefix_meta_operator (either dotted or undotted form)
    postcircumfix (either dotted or undotted form)
    infix (except those hidden by undotted postfix)
    infix_circumfix_meta_operator (except those hidden by undotted postfix)
    coerce
    statement_modifier

Expect operator after <ws>:
    postfix (undotted, only if not hidden by infix)
    postfix (dotted)
    postfix_prefix_meta_operator (only if next postfix not hidden by infix)
    postcircumfix (dotted form only)
    infix
    infix_circumfix_meta_operator
    coerce
    statement_modifier
    statement_block

Or something like that. There are other minor states, such as within
declarations where we're looking for categories like trait_verb and
trait_auxiliary, or within rules where we might pick up various rule
modifiers and such. Or maybe those aren't really lexer states, if
they're just used by token parser rules directly and aren't visible
to the operator precedence grammar.

But those five states above are the big lexer states. I say "lexer
states", but these states are probably kept track of by the operator
precedence parser, and it just calls into one of five rules that each
start with one of our magical hashes that parallelize these various
multiple user-visible syntactic categories.
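
The bookkeeping for the five states could be as small as the Python sketch below. The function name and its arguments are hypothetical; the numbers follow the order in which the states are listed above.

```python
def lexer_state(expecting, saw_ws):
    """Map what the parser expects, plus whether whitespace was just
    traversed, to one of the five lexer states (numbered 1 to 5)."""
    if expecting == "statement":
        return 1  # statements don't care about leading whitespace
    if expecting == "term":
        return 3 if saw_ws else 2
    if expecting == "operator":
        return 5 if saw_ws else 4
    raise ValueError(f"unknown expectation: {expecting!r}")


print(lexer_state("term", saw_ws=False))
print(lexer_state("operator", saw_ws=True))
```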

Does this give you a little better idea of where I'm pushing this?

Actually, now that I think a little more, the bottom-up engine maybe
doesn't have to know about <ws> if the 2nd and 4th states' hashes
include significant whitespace entries that fall into the 3rd and
5th states automatically. Similarly, the statement-level hash could
just defer to the 3rd hash if it doesn't recognize anything statement-like.
Which means the operator precedence parser is back to knowing only
two states, which is proper. Actually, the statement level parser
just calls into the bottom-up parser, which in turn will start at
the 3rd hash, assuming it starts up in expect-term-after-whitespace
state.

So the statement rule is basically:

rule statement { %statementthing | { $\ := op_parse(3) } }

or some such, where the op_parse function is what takes the place of
your <expression> rule above.
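
The fallback in that rule can be mimicked in Python as follows. This is a sketch only: `STATEMENT_THINGS` stands in for %statementthing with made-up entries, and `op_parse` is a stub that merely records the state it was started in rather than parsing anything.

```python
# Hypothetical stand-in for %statementthing.
STATEMENT_THINGS = {"if": "statement_control", "while": "statement_control"}


def op_parse(state, src, pos):
    """Stub for the bottom-up expression parser (assumed interface)."""
    return ("expression", state, src[pos:].strip())


def statement(src, pos=0):
    """Try the statement-level table first, otherwise defer to op_parse(3)."""
    for key, category in STATEMENT_THINGS.items():
        end = pos + len(key)
        # Require a word boundary so an identifier like "iffy" is not taken as "if".
        if src.startswith(key, pos) and not src[end:end + 1].isalnum():
            return (category, key)
    return op_parse(3, src, pos)


print(statement("if $x"))
print(statement("1 + 2"))
```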

Larry