Odd expression?

Janis Papanagnou

Dec 27, 2014, 10:33:29 AM

With reference to a recent thread and a comment of what was considered
an "odd expression", namely a==b==c.

The POSIX Awk specifies *no* associativity for the relational operators
(<, <=, >, >=, ==, !=, ~, !~). But the 'in' operator is defined to have
left-associativity. - Umm.. - So, beyond the question of whether this is
an odd expression or not, x in y in z is a possible expression.
And it's supported by gawk. Try, for example, the following program with
input data, say 22 and 42, and also with the lines b[0]=.. and b[1]=..
commented out in turn.

BEGIN { a[42]
        b[0] = "zero"
        b[1] = "one"
}
$1 in a in b { print "yes", b[$1 in a] }

A numeric/boolean result can be used in subsequent index tests. (And
also compare that to the POSIX-undefined test of the numeric/boolean
value in sequences of '==' tests.)
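
For anyone who wants to see the evaluation order spelled out, here is a
minimal sketch (not part of the original program) that writes the cascade
with explicit parentheses, ($1 in a) in b, which is how a left-associative
'in' groups. It assumes a POSIX shell with some awk on the PATH; the wrapper
name in_cascade is made up for illustration, and the extra $1 in the output
is there only to label each line.

```sh
# in_cascade is a hypothetical helper name, not part of any awk distribution.
# It feeds each argument to awk as one input line and evaluates
# ($1 in a) in b, the explicitly parenthesized form of "$1 in a in b".
in_cascade() {
    printf '%s\n' "$@" | awk '
        BEGIN {
            a[42]            # merely referencing a[42] creates the element
            b[0] = "zero"
            b[1] = "one"
        }
        # ($1 in a) yields 0 or 1; that number is then used as an index into b.
        ($1 in a) in b { print $1, "yes", b[$1 in a] }
    '
}

in_cascade 22 42
# 22 yes zero
# 42 yes one
```

With the b[0] or b[1] assignment removed, the corresponding output line
should disappear, because 0 or 1 is then no longer an index of b.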

Why are such 'in'-cascades possible (given that '==' cascades are not);
is there any rationale you can think of?

Janis

Kaz Kylheku

Dec 27, 2014, 11:30:09 AM

On 2014-12-27, Janis Papanagnou <janis_pa...@hotmail.com> wrote:
> With reference to a recent thread and a comment of what was considered
> an "odd expression", namely a==b==c.
>
> The POSIX Awk specifies *no* associativity for the relational operators
> (<, <=, >, >=, ==, !=, ~, !~).

"No associativity" means that a diagnostic ("syntax error" or whatever) is
required for "e1 == e2 == e3".

That is what %nonassoc means in Yacc.
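
Independent of how a particular awk treats the bare chain, the parenthesized
form (a == b) == c is unambiguous. The small sketch below shows why the chain
is "odd" in the first place: the second comparison sees the 0/1 result of the
first, not the value of b. It assumes a POSIX shell and an awk on the PATH;
eq_chain is a hypothetical wrapper name.

```sh
# eq_chain is a made-up helper for illustration only.
# The -v assignments make a, b and c numeric strings, so numeric-looking
# arguments are compared as numbers.
eq_chain() {
    awk -v a="$1" -v b="$2" -v c="$3" 'BEGIN { print ((a == b) == c) }'
}

eq_chain 2 2 2   # prints 0: (2 == 2) is 1, and 1 == 2 is false
eq_chain 2 2 1   # prints 1: (2 == 2) is 1, and 1 == 1 is true
```

So even where the chain is accepted, it is not a three-way equality test.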

Kaz Kylheku

Dec 27, 2014, 12:11:52 PM

On 2014-12-27, Janis Papanagnou <janis_pa...@hotmail.com> wrote:

> The POSIX Awk specifies *no* associativity for the relational operators
> (<, <=, >, >=, ==, !=, ~, !~). But the 'in' operator is defined to have
> left-associativity.

Historic notes.

In 1979, Awk already had "for (x in y) ...", but not the "x in y"
expression.

The "x in y" expression appeared surprisingly late, between 1992 and 1996.

The IN operator token was classified as %nonassoc at that time.

It remained %nonassoc after that.

However, the grammar is actually

ppattern IN varname

Where ppattern can derive a varname via ppattern -> term -> var -> varname.
The productions for ppattern are all left-recursive.

This left recursion determines the parse for x in y in z, making it
left-associative.

The %nonassoc declaration that was introduced for the IN token
is effectively useless.

It seems that the *intent* was to make it non-associative,
that wish being expressed clearly with the %nonassoc. But because of the way
the operator was worked into a left-recursive grammar production, it became
effectively left-associative.

Given an input like

x in y in z

the parser will shift the x token (the position indicated by the dot).

x . in y in z

There is no ambiguity; the parser must reduce the previous material
through several reductions:

varname . in y in z
var . in y in z
term . in y in z
ppattern . in y in z

Now the parse can continue by shifting the next two tokens, 'in' and 'y':

ppattern in y . in z

There is no ambiguity here, either; the left part "ppattern in y" in the parse
stack must be reduced to a ppattern, because only ppattern as a whole
takes an 'in' operator:

ppattern . in z

Now the stack again holds a ppattern with 'in' as the next token, which fits
the production rule, and so the parse continues:

ppattern in . z # shift 'in' onto stack
ppattern in z . # now reduce z -> varname
ppattern in varname . # then reduce ppattern in varname -> ppattern
ppattern # left-associative parse done.
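
The net effect of the trace above can be sketched without a real parser.
The following only illustrates the grouping that the left-recursive rule
produces; it is not the actual Yacc machinery. It folds the token list from
left to right, wrapping everything seen so far each time an "in varname"
pair arrives. It assumes a POSIX shell and an awk on the PATH; left_fold is
a made-up name.

```sh
# left_fold is a hypothetical helper, not taken from any awk source tree.
# It mimics the rule  ppattern: ppattern IN varname  by folding leftwards.
left_fold() {
    awk -v expr="$1" '
        BEGIN {
            n = split(expr, tok, " ")
            # A well-formed cascade has an odd token count: v (in v)*
            if (n % 2 == 0) { print "syntax error"; exit }
            tree = tok[1]                 # first varname becomes a ppattern
            for (i = 2; i < n; i += 2) {
                if (tok[i] != "in") { print "syntax error"; exit }
                # reduce "ppattern in varname" to a new, larger ppattern
                tree = "(" tree " in " tok[i + 1] ")"
            }
            print tree
        }
    '
}

left_fold "x in y in z"
# ((x in y) in z)
```

The parentheses nest on the left, which is the left-associative grouping.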

Janis Papanagnou

Dec 27, 2014, 3:16:01 PM

On 27.12.2014 18:11, Kaz Kylheku wrote:
>> [...]

Thanks for your reply.

>
> Historic notes.
>
> In 1979, Awk already had "for (x in y) ...", but not the "x in y"
> expression.
>
> The "x in y" expression appeared surprisingly late, between 1992 and 1996.

Where do you have that date from?

To the best of my knowledge the 'in' operator was part of the first major
revision that was developed in the mid to late 1980s. It is described in A., W.,
and K.'s original book released 1987 or 1988 (see page 46, expression operators,
and various examples in the text that contain the 'in'). The 'in' operator
was defined and described not only as a special construct in the for-construct
but also (as said) as an expression, usable e.g. in the if-construct.

Janis

> [ snip parsing explanations ]

Kaz Kylheku

Dec 27, 2014, 9:23:34 PM

On 2014-12-27, Janis Papanagnou <janis_pa...@hotmail.com> wrote:

> On 27.12.2014 18:11, Kaz Kylheku wrote:
>>> [...]
>
> Thanks for your reply.
>
>>
>> Historic notes.
>>
>> In 1979, Awk already had "for (x in y) ...", but not the "x in y"
>> expression.
>>
>> The "x in y" expression appeared surprisingly late, between 1992 and 1996.
>
> Where do you have that date from?

Dan Fuzz maintains a git repository of historic Awk sources:

Git URL:

https://github.com/danfuzz/one-true-awk.git

Browseable, "versions" subdirectory:

https://github.com/danfuzz/one-true-awk/tree/master/versions

The sources are not from a single branch, but from various places: versions of
AT&T and BSD Unix, Kernighan's web page, ...

> To the best of my knowledge the 'in' operator was part of the first major
> revision that was developed in the mid to late 1980s. It is described in A., W.,
> and K.'s original book released 1987 or 1988 (see page 46, expression operators,
> and various examples in the text that contain the 'in'). The 'in' operator
> was defined and described not only as a special construct in the for-construct
> but also (as said) as an expression, usable e.g. in the if-construct.

You got it!

Now I see where I went wrong. Since the sources aren't from one branch,
it is incorrect to do a binary search through the history to see
where something was introduced.

Though we don't see the IN operator in 4.4 BSD sources dated 1992, we
do in fact see it in the earlier SVR4 sources from 1989!