
Chained comparisons A < B < C


James Harris

Sep 5, 2021, 6:50:21 AM
I've got loads of other posts in this ng to respond to but I came across
something last night that I thought you might find interesting.

The issue is whether a language should support chained comparisons such
as where

A < B < C

means that B is between A and C? Or should a programmer have to write

A < B && B < C

?

I like the visual simplicity of the first form but it doesn't look so
intuitive with a mix of relational operators such as in

A < B > C

What does that mean?!

For programming, what do you guys prefer to see in a language? Would you
rather have each relation yield false/true (so that A < B in the above
yields a boolean) or to allow the above kind of chaining where a meaning
is attributed to a succession of such operators?

If the latter, what rules should govern how successive relational
operators are applied?


--
James Harris

Bart

Sep 5, 2021, 7:27:20 AM
I support such operators (although it might be limited to 4 operators).
Something like:

A op1 B op2 C op3 D

is approximately the same as:

(A op1 B) and (B op2 C) and (C op3 D)

except that the middle terms should be evaluated once, not twice.

(I thought I was doing them once, but it seems to have lapsed back to
double evaluation; I'll have to check it out.

Double evaluation is hard to avoid if doing this by transforming the AST
into that second version. But I now use a special AST form where middle
terms occur only once, so there's no excuse).
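
The single-versus-double evaluation point can be seen in a small sketch. It is written in Python, not in the language described in this post, and only because Python happens to have chained comparisons; `middle` and `calls` are made-up names for illustration.

```python
calls = 0

def middle():
    """Hypothetical middle term with a visible side effect."""
    global calls
    calls += 1
    return 20

# Chained form: Python evaluates the middle operand once.
calls = 0
chained = 5 < middle() < 30
chained_calls = calls

# Expanded form: the middle term is written, and evaluated, twice.
calls = 0
expanded = 5 < middle() and middle() < 30
expanded_calls = calls

print(chained, chained_calls)    # True 1
print(expanded, expanded_calls)  # True 2
```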

As for use-case, I mostly use it with equals:

if A = B = C then

I did use to do:

if A <= B <= C then

but I now tend to have a dedicated op for that which is:

if A in B..C then

So chained comparisons /could/ be restricted to equals only. Except some
will want to do A < B <= C and so on.

In any case, any combinations will be well-defined. Your last example
means A < B and B > C, and will be true for A, B, C values of 5, 20, 10
for example.
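
For comparison, Python reads a mixed chain the same way, so the 5, 20, 10 case can be checked directly. This is only a sketch in another language, not a statement about how the compiler discussed here works.

```python
a, b, c = 5, 20, 10

mixed = a < b > c               # read as (a < b) and (b > c)
spelled_out = a < b and b > c

print(mixed, spelled_out)       # True True
```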

Charles Lindsey

Sep 5, 2021, 10:36:07 AM
On 05/09/2021 11:50, James Harris wrote:
> I've got loads of other posts in this ng to respond to but I came across
> something last night that I thought you might find interesting.
>
> The issue is whether a language should support chained comparisons such as where
>
>   A < B < C

I think in most languages operators of the same precedence would associate to
the left. So that would mean whatever is meant by

(A < B) < C

which in most languages would be ill-formed (unless you had overloaded '<' with
some suitable meaning).
>
> means that B is between A and C? Or should a programmer have to write
>
>   A < B && B < C

Yes

>
> ?
>
> I like the visual simplicity of the first form but it doesn't look so intuitive
> with a mix of relational operators such as in
>
>   A < B > C
>
> What does that mean?!

Nothing unless you had defined some meaning. For sure you could define a
language which allowed such constructs, but it would stick out as a special case
like a sore thumb.

Note that

A + B + C
has a defined meaning, but
A + C + B
might be different (one or the other might overflow).
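
A hedged illustration of the reordering point, in Python. Python integers do not overflow, so this sketch uses floating-point values as a stand-in for fixed-width arithmetic; the variable names are invented for the example.

```python
import sys

big = sys.float_info.max          # largest finite double

left_first = big + big + -big     # (big + big) overflows to inf first
reordered = big + -big + big      # (big - big) is 0.0, so nothing overflows

print(left_first)                 # inf
print(reordered == big)           # True
```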

> applied?
>
>


--
Charles H. Lindsey ---------At my New Home, still doing my own thing------
Tel: +44 161 488 1845 Web: https://www.clerew.man.ac.uk
Email: c...@clerew.man.ac.uk Snail-mail: Apt 40, SK8 5BF, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5

Bart

Sep 5, 2021, 12:57:26 PM
On 05/09/2021 15:36, Charles Lindsey wrote:
> On 05/09/2021 11:50, James Harris wrote:
>> I've got loads of other posts in this ng to respond to but I came
>> across something last night that I thought you might find interesting.
>>
>> The issue is whether a language should support chained comparisons
>> such as where
>>
>>    A < B < C
>
> I think in most languages operators of the same precedence would
> associate to the left. So that would mean whatever is meant by
>
> (A < B) < C
>
> which in most languages would be ill-formed (unless you had overloaded
> '<' with some suitable meaning).
>>
>> means that B is between A and C? Or should a programmer have to write
>>
>>    A < B && B < C
>
> Yes
>
>>
>> ?
>>
>> I like the visual simplicity of the first form but it doesn't look so
>> intuitive with a mix of relational operators such as in
>>
>>    A < B > C
>>
>> What does that mean?!
>
> Nothing unless you had defined some meaning. For sure you could define a
> language which allowed such constructs, but it would stick out as a
> special case like a sore thumb.

Yes, it is a little different. In my case it's more like an N-ary operator,
so that this fragment:

real x, y, z
x < y < z

Generates this AST (one composite operator and 3 operands):

i64------- 1 cmpchain: <op_lt_r64 op_lt_r64>
r64------- - 1 name: x
r64------- - 1 name: y
r64------- - 1 name: z

To do the same thing without the feature, you'd need to write:

x < y and y < z

with this more elaborate AST:

i64------- - 1 andl: <andl_i64>
i64------- - - 1 cmp: <lt_r64>
r64------- - - - 1 name: x
r64------- - - - 2 name: y
i64------- - - 2 cmp: <lt_r64>
r64------- - - - 1 name: y
r64------- - - - 2 name: z

With the problem that y (if an expression) could have side-effects that
may be executed twice, but you'd want to avoid evaluating it twice
anyway, so extra effort.
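
The difference between the two tree shapes can be sketched with a few Python dataclasses. The class names below (`Name`, `Cmp`, `And`, `CmpChain`) are hypothetical stand-ins, not the node types of the compiler described above; the point is only that the middle operand appears once in the N-ary form and twice in the expanded form.

```python
from dataclasses import dataclass

@dataclass
class Name:
    ident: str

@dataclass
class Cmp:          # ordinary binary comparison
    op: str
    left: object
    right: object

@dataclass
class And:
    left: object
    right: object

@dataclass
class CmpChain:     # N-ary node: len(ops) == len(operands) - 1
    ops: list
    operands: list

def count_names(node, ident):
    """Count how many times a name appears in a tree."""
    if isinstance(node, Name):
        return int(node.ident == ident)
    if isinstance(node, CmpChain):
        return sum(count_names(o, ident) for o in node.operands)
    return count_names(node.left, ident) + count_names(node.right, ident)

chain = CmpChain(["<", "<"], [Name("x"), Name("y"), Name("z")])
expanded = And(Cmp("<", Name("x"), Name("y")), Cmp("<", Name("y"), Name("z")))

print(count_names(chain, "y"))     # 1
print(count_names(expanded, "y"))  # 2
```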


James Harris

Sep 6, 2021, 5:40:20 AM
On 05/09/2021 15:36, Charles Lindsey wrote:
> On 05/09/2021 11:50, James Harris wrote:
>> I've got loads of other posts in this ng to respond to but I came
>> across something last night that I thought you might find interesting.
>>
>> The issue is whether a language should support chained comparisons
>> such as where
>>
>>    A < B < C
>
> I think in most languages operators of the same precedence would
> associate to the left. So that would mean whatever is meant by
>
> (A < B) < C
>
> which in most languages would be ill-formed (unless you had overloaded
> '<' with some suitable meaning).

Either overloaded 'number op bool' or decided that comparison operators
result in an integer (as C does) so that it can be the input to another
comparison against a number.
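
A rough emulation of that second option, in Python. The helper `c_less` is a made-up function that mimics a comparison yielding the integer 1 or 0, so that the left-associative reading can be set beside the chained reading.

```python
def c_less(a, b):
    """Emulate a C-style '<' that yields the int 1 or 0."""
    return 1 if a < b else 0

a, b, c = 1, 5, 3

c_style = c_less(c_less(a, b), c)   # (1 < 5) gives 1, then 1 < 3 gives 1
chained = a < b < c                 # 1 < 5 and 5 < 3 gives False

print(c_style, chained)             # 1 False
```

The two readings disagree for these values, which is the risk of letting a comparison result feed another comparison.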

>>
>> means that B is between A and C? Or should a programmer have to write
>>
>>    A < B && B < C
>
> Yes

Noted. Given that viewpoint what do you make of

https://youtu.be/M3GAJ1AIIlA

?

...


> Note that
>
> A + B + C
> has a defined meaning, but
> A + C + B
> might be different (one or the other might overflow).

Good point. Language expressions are not quite the same as mathematical
ones.


--
James Harris

David Brown

Sep 6, 2021, 5:55:19 AM
On 05/09/2021 12:50, James Harris wrote:
> I've got loads of other posts in this ng to respond to but I came across
> something last night that I thought you might find interesting.
>
> The issue is whether a language should support chained comparisons such
> as where
>
>   A < B < C
>
> means that B is between A and C? Or should a programmer have to write
>
>   A < B && B < C
>
> ?
>

Ask yourself if the need to check that a value is within a range is
common enough that you need a special syntax to handle it. I'd say no,
but there is always a balance to be found and it varies from language to
language. I am not a fan of having lots of special syntaxes, or
complicated operators - whether they are done using symbols or vast
numbers of keywords.

My vote would be to make relational operators return a "boolean", and to
make operations between booleans and other types a syntax error or
constraint error, and to disallow relational operators for booleans.
Then "A < B < C" is a compile-time error. It is not particularly hard
to write "A < B && B < C" when you need it. Put more focus on making it
hard to write incorrect or unclear code.

James Harris

Sep 6, 2021, 6:28:12 AM
On 05/09/2021 12:27, Bart wrote:
> On 05/09/2021 11:50, James Harris wrote:

...

>>    A < B < C

...

>> For programming, what do you guys prefer to see in a language? Would
>> you rather have each relation yield false/true (so that A < B in the
>> above yields a boolean) or to allow the above kind of chaining where a
>> meaning is attributed to a succession of such operators?
>>
>> If the latter, what rules should govern how successive relational
>> operators are applied?
>
>
> I support such operators (although it might be limited to 4 operators).
> Something like:
>
>   A op1 B op2 C op3 D
>
> is approximately the same as:
>
>   (A op1 B) and (B op2 C) and (C op3 D)
>
> except that the middle terms should be evaluated once, not twice.

Until seeing https://youtu.be/M3GAJ1AIIlA (which is the same link I
posted in reply to Charles) I never realised that there was a simple and
logical way to interpret operator sequences but it appears I am well
behind the curve. You do that, and so did BCPL in 1967!

But there's a question of relative precedences.

I take it that aside from single evaluation you treat

A op1 B op2 C

essentially as

(A op1 B) and (B op2 C)

but how did you choose to have that interact with higher and lower
precedences of surrounding operators? Do you treat chained comparisons
as syntactic sugar or as compound expressions in their own right?

My boolean operators (such as 'and' and 'not') have lower precedence
than comparisons (but with 'not' having higher precedence than 'and')
and my arithmetic operators have higher precedence.

In case that's confusing, for these few operators I have the following
order (higher to lower).

Arithmetic operators including +
Comparisons including <
Booleans including 'not' and 'and'

Therefore,

not a < b + 1 and b + 1 < c

would evaluate the (b + 1)s first, as in

B = (b + 1)

It would then apply 'not' before 'and' as in

(not (a < B)) and (B < c)

but if I were to implement operator chaining I think it would be better
for the 'not' in

not a < B < c

to apply to the entire comparison

not (a < B < c)

In a sense, the 'and' which is part of comparison chaining would have
the same precedence as the comparison operators rather than the
precedence of the real 'and' operator.

If you are still with me(!), what did you choose?
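
As one existing data point, Python gives 'not' a lower precedence than comparisons, so the grouping suggested above can be inspected with its standard `ast` module. This is a sketch of what Python does, not a claim about any other language in this thread.

```python
import ast

tree = ast.parse("not a < B < c", mode="eval").body

# The outermost node is the 'not'; its operand is one comparison chain.
print(type(tree).__name__)           # UnaryOp
print(type(tree.operand).__name__)   # Compare
print(len(tree.operand.ops))         # 2
```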

...

> In any case, any combinations will be well-defined. Your last example
> means A < B and B > C, and will be true for A, B, C values of 5, 20, 10
> for example.

Now I know what the expression means I find it surprisingly easy to read
- just adding "and" between parts and knowing that all parts must be
true for the composite comparison to be true makes any uses of it easy
to understand.


--
James Harris

Bart

Sep 6, 2021, 6:41:26 AM
On 06/09/2021 10:55, David Brown wrote:
> On 05/09/2021 12:50, James Harris wrote:
>> I've got loads of other posts in this ng to respond to but I came across
>> something last night that I thought you might find interesting.
>>
>> The issue is whether a language should support chained comparisons such
>> as where
>>
>>   A < B < C
>>
>> means that B is between A and C? Or should a programmer have to write
>>
>>   A < B && B < C
>>
>> ?
>>

[OK, this is a /fourth/ reply to you within half an hour. You should
stop saying things I disagree with!]


> Ask yourself if the need to check that a value is within a range is
> common enough that you need a special syntax to handle it.

Yes, it is, in my code anyway.

I used to use A <= B <= C for that, until that was replaced by A in B..C:

elsif c='1' and lxsptr^ in '0'..'6' and ...

Some older code:

return int32.min <= x <= int32.max

(However a more common use of 'in' is to compare against several values:
'A in [B, C, D]')
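
For readers who want to try the three idioms, here is a loose Python rendering. The function names are invented, and Python has no `'0'..'6'` character-range literal, so that test is spelled as a chain.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def fits_int32(x):
    # Chained form of the "older code" range check.
    return INT32_MIN <= x <= INT32_MAX

def is_digit_0_to_6(ch):
    # Stand-in for "ch in '0'..'6'" (single characters assumed).
    return "0" <= ch <= "6"

def is_separator(ch):
    # Membership in a small explicit set, like 'A in [B, C, D]'.
    return ch in (",", ";", ":")

print(fits_int32(2**31))      # False
print(is_digit_0_to_6("4"))   # True
print(is_separator(";"))      # True
```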

> I'd say no,
> but there is always a balance to be found and it varies from language to
> language. I am not a fan of having lots of special syntaxes, or
> complicated operators - whether they are done using symbols or vast
> numbers of keywords.

In C-style, the example above would be:

else if (c=='1' && *lxsptr>='0' && *lxsptr<='6' && ...)


> My vote would be to make relational operators return a "boolean",

They do. Except that A=B=C is not two operators, but treated as a single
operator: (A=B=C) returns a boolean.

> It is not particularly hard
> to write "A < B && B < C" when you need it.

* B has to be written twice

* It's not always a simple expression, so you need to ensure both are
actually identical

* The reader needs to double check that it /is/ the same expression

* The compiler needs to do extra work to avoid evaluating it twice

* But if it has side-effects, the compiler is obliged to evaluate twice

David Brown

Sep 6, 2021, 7:24:50 AM
On 06/09/2021 12:41, Bart wrote:
> On 06/09/2021 10:55, David Brown wrote:
>> On 05/09/2021 12:50, James Harris wrote:
>>> I've got loads of other posts in this ng to respond to but I came across
>>> something last night that I thought you might find interesting.
>>>
>>> The issue is whether a language should support chained comparisons such
>>> as where
>>>
>>>    A < B < C
>>>
>>> means that B is between A and C? Or should a programmer have to write
>>>
>>>    A < B && B < C
>>>
>>> ?
>>>
>
> [OK, this is a /fourth/ reply to you within half an hour. You should
> stop saying things I disagree with!]
>

I've been interspersing with a few agreements...

>
>> Ask yourself if the need to check that a value is within a range is
>> common enough that you need a special syntax to handle it.
>
> Yes, it is, in my code anyway.
>
> I used to use A <= B <= C for that, until that was replaced by A in B..C:

If a language supports a concept of ranges, represented by "b .. c" (or
similar syntax), then "a in b .. c" is a good way to handle such tests.
I would not invent such a syntax purely for such tests, but if it is
used for array slices, for-loops, etc., then you have a good feature.


(Since I may have accidentally complimented you for a language feature,
I need to add balance - don't you have a space key on your keyboard?
Why don't you use it when writing code?)

>
>     elsif c='1' and lxsptr^ in '0'..'6' and ...
>
> Some older code:
>
>     return int32.min <= x <= int32.max
>
> (However a more common use of 'in' is to compare against several values:
> 'A in [B, C, D]')
>
>> I'd say no,
>> but there is always a balance to be found and it varies from language to
>> language.  I am not a fan of having lots of special syntaxes, or
>> complicated operators - whether they are done using symbols or vast
>> numbers of keywords.
>
> In C-style, the example above would be:
>
>     else if (c=='1' && *lxsptr>='0' && *lxsptr<='6' && ...)
>
>
>> My vote would be to make relational operators return a "boolean",
>
> They do. Except that A=B=C is not two operators, but treated as a single
> operator: (A=B=C) returns a boolean.
>
>>  It is not particularly hard
>> to write "A < B && B < C" when you need it.
>
> * B has to be written twice
>
> * It's not always a simple expression, so you need to ensure both are
> actually identical
>
> * The reader needs to double check that it /is/ the same expression
>
> * The compiler needs to do extra work to avoid evaluating it twice
>
> * But if it has side-effects, the compiler is obliged to evaluate twice
>

If B is complicated, you can always use a local variable - split things
up to make them clear. (Of course, you want a language that has proper
scoped local variables for that - but you'd want that anyway.)

Bart

Sep 6, 2021, 7:25:25 AM
On 06/09/2021 11:28, James Harris wrote:
> On 05/09/2021 12:27, Bart wrote:

> I take it that aside from single evaluation you treat
>
>   A op1 B op2 C
>
> essentially as
>
>   (A op1 B) and (B op2 C)

> but how did you choose to have that interact with higher and lower
> precedences of surrounding operators?

Exactly the same as a single comparison operator. All comparisons are
the same precedence (see below), so:

A + B = C = D + E        is    (A+B) = C = (D+E)

A and B = C = D and E    is    A and (B=C=D) and E
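
Python uses the same relative ordering (arithmetic above comparisons, comparisons above 'and'), so the two groupings can be checked with its `ast` module. Note the assumption: `==` is used here where the examples above write `=`.

```python
import ast

arith = ast.parse("A + B == C == D + E", mode="eval").body
# One Compare node whose outer operands are the two additions.
print(type(arith).__name__, type(arith.left).__name__,
      type(arith.comparators[-1]).__name__)   # Compare BinOp BinOp

logic = ast.parse("A and B == C == D and E", mode="eval").body
# One 'and' with three operands; the middle one is the whole chain.
print(type(logic).__name__, len(logic.values),
      type(logic.values[1]).__name__)         # BoolOp 3 Compare
```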


> Do you treat chained comparisons
> as syntactic sugar or as compound expressions in their own right?

I used to transform them into explicit AND expressions, but that wasn't
satisfactory. Now chained comparisons form a single AST node (see my
reply to Charles).




> My boolean operators (such as 'and' and 'not') have lower precedence
> than comparisons (but with 'not' having higher precedence than 'and')
> and my arithmetic operators have higher precedence.
>
> In case that's confusing, for these few operators I have the following
> order (higher to lower).
>
>   Arithmetic operators including +
>   Comparisons including <
>   Booleans including 'not' and 'and'

You have the wrong precedence for 'not'. As a unary operator, it should
work like C's - (negate), !, ~.

> Therefore,
>
>   not a < b + 1 and b + 1 < c
>
> would evaluate the (b + 1)s first, as in
>
>   B = (b + 1)
>
> It would then apply 'not' before 'and' as in
>
>   (not (a < B)) and (B < c)
>
> but if I were to implement operator chaining I think it would be better
> for the 'not' in
>
>   not a < B < c
>
> to apply to the entire comparison
>
>   not (a < B < c)

As written here that's fine. But I'd have a problem with making a
special case for unary 'not':

not A < B means not (A < B)? (I assume it applies to single >)
-A < B means (-A) < B

For the first, just use parentheses. (I also use:

unless A < B ...

for the reverse logic)


> In a sense, the 'and' which is part of comparison chaining would have
> the same precedence as the comparison operators rather than the
> precedence of the real 'and' operator.
>
> If you are still with me(!), what did you choose?

These are my precedence levels:

8   **                             # highest

7   * / rem << >>
6   + - iand ior ixor
5   ..
4   = <> < <= >= > in notin
3   and
2   or

1   :=                             # lowest

And, Or are lower precedence than comparisons.

Unary op are always applied first (but multiple unary ops on either side
of a term - prefix and postfix - have their own rules).


Dmitry A. Kazakov

Sep 6, 2021, 8:02:51 AM
On 2021-09-06 13:24, David Brown wrote:

> If a language supports a concept of ranges, represented by "b .. c" (or
> similar syntax), then "a in b .. c" is a good way to handle such tests.
> I would not invent such a syntax purely for such tests, but if it is
> used for array slices, for-loops, etc., then you have a good feature.

Yes, and this is another challenge for the type system, unless ranges
are built-in (e.g. in Ada).

In a more powerful type system "in", ".." could be operations as any
other. The range would be a first-class type.

Further generalization. The mathematical term for range is an indicator
set. A general case indicator set can be non-contiguous, e.g. a number
of ranges. Other examples are a set of indices extracting a column of a
matrix, a submatrix, a diagonal of a matrix etc.
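
A non-contiguous index set of this kind might be sketched as follows in Python. `RangeSet` is a hypothetical class written for this illustration; it supports "in" and iteration over a union of inclusive ranges.

```python
class RangeSet:
    """Hypothetical non-contiguous index set built from inclusive ranges."""

    def __init__(self, *bounds):
        self.bounds = list(bounds)   # e.g. (1, 3), (10, 12)

    def __contains__(self, value):
        return any(lo <= value <= hi for lo, hi in self.bounds)

    def __iter__(self):
        for lo, hi in self.bounds:
            yield from range(lo, hi + 1)

s = RangeSet((1, 3), (10, 12))
print(2 in s, 5 in s)   # True False
print(list(s))          # [1, 2, 3, 10, 11, 12]
```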

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

James Harris

Sep 6, 2021, 8:20:26 AM
On 06/09/2021 12:25, Bart wrote:
> On 06/09/2021 11:28, James Harris wrote:
>> On 05/09/2021 12:27, Bart wrote:

...

>> Do you treat chained comparisons as syntactic sugar or as compound
>> expressions in their own right?
>
> I used to transform them into explicit AND expressions, but that wasn't
> satisfactory. Now chained comparisons form a single AST node (see my
> reply to Charles).

IIRC in your reply to Charles your node relied on the programmer using
the same operator as in the < of

A < B < C

whereas proper chaining requires support for different operators such as

A < B <= C

If one is going to support chaining then AISI that expression should
also be parsed to be a node.

>
>
>
>
>> My boolean operators (such as 'and' and 'not') have lower precedence
>> than comparisons (but with 'not' having higher precedence than 'and')
>> and my arithmetic operators have higher precedence.
>>
>> In case that's confusing, for these few operators I have the following
>> order (higher to lower).
>>
>>    Arithmetic operators including +
>>    Comparisons including <
>>    Booleans including 'not' and 'and'
>
> You have the wrong precedence for 'not'. As a unary operator, it should
> work like C's - (negate), !, ~.

I am not copying C!

I could have chosen to do so, of course. But some of its operator
precedence levels are in the wrong order. Even DMR acknowledged that.

If one is not going to copy all of C's precedences why decide to copy any
of them? If some precedences are going to be different then why not just
do whatever's best?

...

>> but if I were to implement operator chaining I think it would be
>> better for the 'not' in
>>
>>    not a < B < c
>>
>> to apply to the entire comparison
>>
>>    not (a < B < c)
>
> As written here that's fine. But I'd have a problem with making a
> special case for unary 'not':
>
>    not A < B       means not (A < B)? (I assume it applies to single >)
>    -A < B          means (-A) < B

Why are you equating a boolean with an arithmetic operator? What
connection do you see between them?

;-)



>
> For the first, just use parentheses. (I also use:
>
>    unless A < B ...
>
> for the reverse logic)
>
>
>> In a sense, the 'and' which is part of comparison chaining would have
>> the same precedence as the comparison operators rather than the
>> precedence of the real 'and' operator.
>>
>> If you are still with me(!), what did you choose?
>
> These are my precedence levels:
>
>   8   **                             # highest
>
>   7   * / rem << >>
>   6   + - iand ior ixor
>   5   ..
>   4   = <> < <= >= > in notin
>   3   and
>   2   or
>
>   1   :=                             # lowest

I have a different and arguably simpler order. Of those you mention,
highest to lowest I have

* bitwise ops
* arithmetic ops
* comparison ops
* boolean ops
* assignment ops

So

not A gt B

means

not (A gt B)

If you think about it some more I believe you'll agree that it makes
sense and will immediately change your compiler and all your source code
to suit. ;-)

I am not saying I would not change what I have so far but I've put a lot
of thought into the precedence table. Keeping the operators in logical
groups or 'families' makes the overall order a great deal easier to
remember. For example, **all** the boolean ops are applied after the
comparison ops.

Within each family the operators are in familiar orders, e.g.
multiplication comes before addition, 'and' comes before 'or', etc.



>
> And, Or are lower precedence than comparisons.
>
> Unary op are always applied first (but multiple unary ops on either side
> of a term - prefix and postfix - have their own rules).

Having "their own rules" sounds as though it could be confusing.

:-(


--
James Harris

David Brown

Sep 6, 2021, 8:25:27 AM
I think you'd want to avoid that kind of generalisation, at least at the
language level. Having an "in" operator that could be used with
different types would be all you need. The language could support
common cases that are simple to implement, such as Pascal-like sets, or
a range made as a pair of two types that support relational operators.
But support for more general sets, collections of ranges, etc., should
be left to classes.

James Harris

Sep 6, 2021, 9:22:51 AM
On 06/09/2021 10:55, David Brown wrote:
> On 05/09/2021 12:50, James Harris wrote:
>> I've got loads of other posts in this ng to respond to but I came across
>> something last night that I thought you might find interesting.
>>
>> The issue is whether a language should support chained comparisons such
>> as where
>>
>>   A < B < C
>>
>> means that B is between A and C? Or should a programmer have to write
>>
>>   A < B && B < C
>>
>> ?
>>
>
> Ask yourself if the need to check that a value is within a range is
> common enough that you need a special syntax to handle it.

I don't see this as that special a syntax but (at least, potentially) an
expression form which is widely useful and easy to understand. That is,
of course, once one knows the rules(!) but the rules look as though they
could be easy to grasp: For a series of comparison ops in a phrase such as

A op B op C op D

the ops would apply left to right, with inner expressions being
evaluated as they were encountered, and result in True if all
subcomponents were true; if any were found to be false then the rest of
the phrase would not be evaluated. Maybe I'm missing something but that
seems remarkably simple.
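
Those rules are short enough to sketch as a small Python function. Everything here is an assumption made for illustration: operands are passed as zero-argument callables so that evaluation order is visible, and `eval_chain`, `term` and `OPS` are invented names.

```python
import operator

OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "=": operator.eq, "<>": operator.ne}

def eval_chain(operands, ops):
    """operands: zero-argument callables; ops: symbols between them."""
    left = operands[0]()
    for sym, thunk in zip(ops, operands[1:]):
        right = thunk()              # each operand is evaluated at most once
        if not OPS[sym](left, right):
            return False             # stop: later operands are not evaluated
        left = right
    return True

seen = []

def term(label, value):
    """Build an operand that records when it is evaluated."""
    def thunk():
        seen.append(label)
        return value
    return thunk

result = eval_chain([term("A", 1), term("B", 5), term("C", 3), term("D", 9)],
                    ["<", "<", "<"])
print(result, seen)   # False ['A', 'B', 'C']
```

D is never evaluated because the chain already failed at B < C.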


> I'd say no,
> but there is always a balance to be found and it varies from language to
> language. I am not a fan of having lots of special syntaxes, or
> complicated operators - whether they are done using symbols or vast
> numbers of keywords.
>
> My vote would be to make relational operators return a "boolean", and to
> make operations between booleans and other types a syntax error or
> constraint error, and to disallow relational operators for booleans.
> Then "A < B < C" is a compile-time error. It is not particularly hard
> to write "A < B && B < C" when you need it. Put more focus on making it
> hard to write incorrect or unclear code.
>

Noted.

One thing in favour of the chained operators (other than their
simplicity once one knows the rules) is single evaluation. The expression

A < sub(B) <= C

would evaluate sub(B) once.

I am not decided yet on whether to support this but I do note that it
was present in BCPL in some form in 1967 so it's not a new concept and I
find it surprisingly easy to read - just say "and" between parts! :-)


--
James Harris

Dmitry A. Kazakov

Sep 6, 2021, 9:39:09 AM
But given a user type you want to be able to have arrays, vectors and
matrices indexed by the type. You want standard loops working with it
etc. The usual OOPL method is to manually create a dozen helper
classes, producing a total mess and still no support for slices,
submatrices etc. I would like some integrated mechanics for this stuff.

Bart

Sep 6, 2021, 9:46:07 AM
On 06/09/2021 13:20, James Harris wrote:
> On 06/09/2021 12:25, Bart wrote:
>> On 06/09/2021 11:28, James Harris wrote:
>>> On 05/09/2021 12:27, Bart wrote:
>
> ...
>
>>> Do you treat chained comparisons as syntactic sugar or as compound
>>> expressions in their own right?
>>
>> I used to transform them into explicit AND expressions, but that
>> wasn't satisfactory. Now chained comparisons form a single AST node
>> (see my reply to Charles).
>
> IIRC in your reply to Charles your node relied on the programmer using
> the same operator as in the < of
>
>   A < B < C
>
> whereas proper chaining requires support for different operators such as
>
>   A < B <= C
>
> If one is going to support chaining then AISI that expression should
> also be parsed to be a node.

That's just because that same example was used in previous posts.

For parsing purposes, = <> < <= >= > are all treated the same.
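
A toy parser along those lines, in Python, might look like the sketch below. It is an illustration under stated assumptions, not the actual parser: it handles only names, numbers, '+' and the six comparison symbols, and the tuple shape `("cmpchain", ops, operands)` is invented here.

```python
import re

COMPARE = {"=", "<>", "<", "<=", ">=", ">"}
TOKEN = re.compile(r"\s*(<=|>=|<>|[=<>+]|\w+)")

def tokenize(text):
    return TOKEN.findall(text)

def parse_add(tokens, pos):
    """Parse simple operands joined by '+' (binds tighter than comparisons)."""
    left, pos = tokens[pos], pos + 1
    while pos < len(tokens) and tokens[pos] == "+":
        left, pos = ("+", left, tokens[pos + 1]), pos + 2
    return left, pos

def parse_compare(tokens, pos=0):
    """Collect any mix of comparison operators into one chain node."""
    first, pos = parse_add(tokens, pos)
    ops, operands = [], [first]
    while pos < len(tokens) and tokens[pos] in COMPARE:
        ops.append(tokens[pos])
        operand, pos = parse_add(tokens, pos + 1)
        operands.append(operand)
    if not ops:
        return first, pos
    return ("cmpchain", ops, operands), pos

tree, _ = parse_compare(tokenize("A < B <= C + 1"))
print(tree)
# ('cmpchain', ['<', '<='], ['A', 'B', ('+', 'C', '1')])
```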

>>> In case that's confusing, for these few operators I have the
>>> following order (higher to lower).
>>>
>>>    Arithmetic operators including +
>>>    Comparisons including <
>>>    Booleans including 'not' and 'and'
>>
>> You have the wrong precedence for 'not'. As a unary operator, it
>> should work like C's - (negate), !, ~.
>
> I am not copying C!

I used C because that's widely known. But unary ops, in pretty much
every language I've tried, always bind more tightly than binary ops. So
they don't have meaningful precedence.

(There might be the odd exception such as -A**B which in maths means
-(A**B) not (-A)**B. Actually maths would be a good model for this:

sin x + y

means (sin(x)) + y not sin(x+y).)
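
Python is one language with exactly that exception, which makes for a quick check. This is a sketch of Python's behaviour only.

```python
neg_pow = -2 ** 2        # parsed as -(2 ** 2)
paren_pow = (-2) ** 2
neg_cmp = -2 < 1         # parsed as (-2) < 1

print(neg_pow)     # -4
print(paren_pow)   # 4
print(neg_cmp)     # True
```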

>

>>>    not (a < B < c)
>>
>> As written here that's fine. But I'd have a problem with making a
>> special case for unary 'not':
>>
>>     not A < B       means not (A < B)? (I assume it applies to single >)
>>     -A < B          means (-A) < B
>
> Why are you equating a boolean with an arithmetic operator? What
> connection do you see between them?

Both not and - (negate) are unary ops, and should be parsed the same
way, regardless of what types they typically take.

>
>   ;-)
>
>
>
>>
>> For the first, just use parentheses. (I also use:
>>
>>     unless A < B ...
>>
>> for the reverse logic)
>>
>>
>>> In a sense, the 'and' which is part of comparison chaining would have
>>> the same precedence as the comparison operators rather than the
>>> precedence of the real 'and' operator.
>>>
>>> If you are still with me(!), what did you choose?
>>
>> These are my precedence levels:
>>
>>    8   **                             # highest
>>
>>    7   * / rem << >>
>>    6   + - iand ior ixor
>>    5   ..
>>    4   = <> < <= >= > in notin
>>    3   and
>>    2   or
>>
>>    1   :=                             # lowest
>
> I have a different and arguably simpler order. Of those you mention,
> highest to lowest I have
>
>   * bitwise ops
>   * arithmetic ops
>   * comparison ops
>   * boolean ops
>   * assignment ops

That's the same as mine except that you treat bitwise ops (and or xor
shifts) as higher than anything else, and in the same group?

But I don't believe this is the complete set of precedence levels,
unless "+" has the same precedence as "*" so that:

a + b * c

means (a+b) * c, contrary to school arithmetic rules.



> So
>
>   not A gt B
>
> means
>
>   not (A gt B)
>
> If you think about it some more I believe you'll agree that it makes
> sense and will immediately change your compiler and all your source code
> to suit. ;-)

This is interesting, because I used to write expressions like this:

if not (A in B)

as without the parentheses, it would be (not A ) in B. But rather than
introduce bizarre rules for 'not', contrary to every other language so
it would cause confusion, I instead allowed:

if A not in B

However this doesn't work well for comparison ops, or with a chain of
such ops.

The only way your proposal can make sense, is for either a single
comparison or chain of comparisons to bind so tightly that it forms a
single term.

So syntactically, A<B<C is treated like A.B.C.

That means dispensing with normal precedence rules for comparison, which
would have consequences:

if N + 1 <= Limit then ++N ...

would end up being parsed as :

if N + (1 <= Limit) then ++N ...


> I am not saying I would not change what I have so far but I've put a lot
> of thought into the precedence table. Keeping the operators in logical
> groups or 'families' makes the overall order a great deal easier to
> remember. For example, **all** the boolean ops are applied after the
> comparison ops.
>
> Within each family the operators are in familiar orders, e.g.
> multiplication comes before addition, 'and' comes before 'or', etc.


See above.

>
>
>>
>> And, Or are lower precedence than comparisons.
>>
>> Unary op are always applied first (but multiple unary ops on either
>> side of a term - prefix and postfix - have their own rules).
>
> Having "their own rules" sounds as though it could be confusing.
>
> :-(

This is to be in line with other languages. It means that this term:

op1 op2 X op3 op4

is parsed as:

op1(op2(op4((op3 X))))

In practice it works intuitively; you mustn't over-think it! So:

-P^ # ^ is a deref op

means -(P^). (^ is really syntax, not an operator, but it follows the
same rules.)

Since P here needs to be a pointer, negating it first wouldn't be useful.
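
The ordering rule above (postfix ops applied first, innermost, then prefix ops from right to left) can be written as a few lines of Python. `nest_unary` is a made-up helper that only builds the nesting as a string.

```python
def nest_unary(prefix_ops, operand, postfix_ops):
    """Apply postfix ops first (innermost), then prefix ops right to left."""
    expr = operand
    for op in postfix_ops:
        expr = f"{op}({expr})"
    for op in reversed(prefix_ops):
        expr = f"{op}({expr})"
    return expr

print(nest_unary(["op1", "op2"], "X", ["op3", "op4"]))
# op1(op2(op4(op3(X))))
print(nest_unary(["-"], "P", ["^"]))
# -(^(P))
```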


Bart

Sep 6, 2021, 10:06:16 AM
On 06/09/2021 13:02, Dmitry A. Kazakov wrote:
> On 2021-09-06 13:24, David Brown wrote:
>
>> If a language supports a concept of ranges, represented by "b .. c" (or
>> similar syntax), then "a in b .. c" is a good way to handle such tests.
>>   I would not invent such a syntax purely for such tests, but if it is
>> used for array slices, for-loops, etc., then you have a good feature.
>
> Yes, and this is another challenge for the type system, unless ranges
> are built-in (e.g. in Ada).
>
> In a more powerful type system "in", ".." could be operations as any
> other. The range would be a first-class type.

I guess my dynamic language has a powerful type system then!

".." is not quite an operator, but a constructor which takes two
integers and yields a range type.

And a range is a proper first class type:

a := 10..20
println a.type # 'range'
println a.lwb # 10
println a.upb # 20
println a.len # 11
println a.bounds # 10..20
println a[10] # 10 (index must be 10..20)
println a.isrange # 1 (true)
println 13 in a # 1 (true)

b := new(list, a)
println b.bounds # 10..20

for i in a do
println sqr i # 100 to 400
od

In my static language, however, ranges are mostly syntax.
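
For comparison only, here is a rough Python analogue of the first-class range shown above. It is a sketch, not a description of Bart's language: Python's built-in `range` excludes its stop value and indexes from 0, so the bounds and indexing differ from the `10..20` example.

```python
a = range(10, 21)            # stop is exclusive, so this covers 10..20
print(type(a).__name__)      # range
print(a.start, a.stop - 1)   # lower and upper bound: 10 20
print(len(a))                # 11
print(13 in a)               # membership test on the range object
print(a[0])                  # first element; Python indexes from 0, not from 10

squares = [i * i for i in a] # iterate over the range: 100 up to 400
print(squares[0], squares[-1])
```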


James Harris

unread,
Sep 6, 2021, 1:03:48 PM9/6/21
to
On 06/09/2021 14:46, Bart wrote:
> On 06/09/2021 13:20, James Harris wrote:
>> On 06/09/2021 12:25, Bart wrote:
>>> On 06/09/2021 11:28, James Harris wrote:

...

>>> You have the wrong precedence for 'not'. As a unary operator, it
>>> should work like C's - (negate), !, ~.
>>
>> I am not copying C!
>
> I used C because that's widely known. But unary ops, in pretty much
> every language I've tried, always bind more tightly than binary ops. So
> they don't have meaningful precedence.
>
> (There might be the odd exception such as -A**B which in maths means
> -(A**B) not (-A)**B. Actually maths would be a good model for this:

Indeed, exceptions are not good. I wrestled with the
minus-exponentiation combo for a long time before settling on the
arithmetic group having its operators in the following order (high to low).

**
+ - (prefix)
* /
+ - (infix)
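
As a point of reference (Python is used here only for illustration, it is not one of the languages under discussion), Python orders prefix minus below exponentiation in the same way, but it applies bitwise not before shifts, unlike the bitwise group described in this post:

```python
neg_pow = -2 ** 2            # parsed as -(2 ** 2)
paren_pow = (-2) ** 2        # explicit grouping gives the other reading
print(neg_pow, paren_pow)    # -4 4

# Python's ~ (bitwise not) binds tighter than the shift operators.
not_then_shift = ~1 << 1     # (~1) << 1
shift_then_not = ~(1 << 1)   # what a "shifts before bitnot" rule would give
print(not_then_shift, shift_then_not)   # -4 -3
```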

Note that I have attempted to be consistent between groups, where there
are corresponding operations. The bitwise group (currently) similarly
has shifts before bitnot. So

! a >> b

means

! (a >> b)

The order corresponds with the minus-exponentiation combo in the
arithmetic group.

Incidentally, in another case of not following C I use ! for bitnot.
That's because it is used to make other composite symbols with the same
meaning of unary negation and I preferred to try to be consistent. For
example,

! bitwise not
& bitwise and
!& bitwise nand

C's ~ didn't look good:

~&

...

>>>>    not (a < B < c)
>>>
>>> As written here that's fine. But I'd have a problem with making a
>>> special case for unary 'not':
>>>
>>>     not A < B       means not (A < B)? (I assume it applies to single >)
>>>     -A < B          means (-A) < B
>>
>> Why are you equating a boolean with an arithmetic operator? What
>> connection do you see between them?
>
> Both not and - (negate) are unary ops, and should be parsed the same
> way, regardless of what types they typically take.

Certainly! It would be far too confusing for a human to parse if the
order of evaluation changed for different types.

But one can still have bitwise not below bitwise shift, unary minus
below exponentiation, and boolean 'not' below comparisons which yield
False or True.

...


>>> These are my precedence levels:
>>>
>>>    8   **                             # highest
>>>
>>>    7   * / rem << >>
>>>    6   + - iand ior ixor
>>>    5   ..
>>>    4   = <> < <= >= > in notin
>>>    3   and
>>>    2   or
>>>
>>>    1   :=                             # lowest
>>
>> I have a different and arguably simpler order. Of those you mention,
>> highest to lowest I have
>>
>>    * bitwise ops
>>    * arithmetic ops
>>    * comparison ops
>>    * boolean ops
>>    * assignment ops
>
> That's the same as mine except that you treat bitwise ops (and or xor
> shifts) as higher than anything else, and in the same group?

Yes, the bitwise operations are all in the same group. It has: shifts,
not, and, xor, or in that order.

>
> But I don't believe this is the complete set of precedence levels,
> unless "+" has the same precedence as "*" so that:

What I listed was the families. Plus and times (+ and *) are both in the
'arithmetic' group. And times comes before plus.

>
>
>
>> So
>>
>>    not A gt B
>>
>> means
>>
>>    not (A gt B)
>>
>> If you think about it some more I believe you'll agree that it makes
>> sense and will immediately change your compiler and all your source
>> code to suit. ;-)
>
> This is interesting, because I used to write expressions like this:
>
>   if not (A in B)
>
> as without the parentheses, it would be (not A ) in B. But rather than
> introduce bizarre rules for 'not', contrary to every other language so
> it would cause confusion, I instead allowed:
>
>   if A not in B

I have that as

if not A in B

because 'in' comes before 'not'.
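
For what it is worth, Python supports both spellings: boolean `not` sits below `in`, and there is also a dedicated `not in` operator. A small sketch (the values are arbitrary, chosen so that the two groupings give different answers):

```python
a, b = 0, [0, 1]

low_not = not a in b        # parsed as not (a in b)
infix_not = a not in b      # dedicated operator, same meaning
tight_not = (not a) in b    # the "unary binds first" reading: is True in b?

print(low_not, infix_not, tight_not)   # False False True
```

The last result is True only because `True == 1` in Python, which shows how easily the tight-binding reading asks a different question.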

>
> However this doesn't work well for comparison ops, or with a chain of
> such ops.
>
> The only way your proposal can make sense, is for either a single
> comparison or chain of comparisons to bind so tightly that it forms a
> single term.

Well, the basic precedences have been in place for some time but if I
added chained comparisons then they (the chained comparisons) would be
treated as a composite subexpression of the same precedence as any other
comparison.

>
> So syntactically, A<B<C is treated like A.B.C.

I don't get that. I could have

if X and not (A < (B + 1) == C)

where none of the parens would be necessary.


>
> That means dispensing with normal precedence rules for comparison, which
> would have consequences:
>
>    if N + 1 <= Limit then ++N ...
>
> would end up being parsed as :
>
>    if N + (1 <= Limit) then ++N ...

Because arithmetic comes before comparison I would parse that as

if (N + 1) <= Limit

...

>>> Unary op are always applied first (but multiple unary ops on either
>>> side of a term - prefix and postfix - have their own rules).
>>
>> Having "their own rules" sounds as though it could be confusing.
>>
>> :-(
>
> This is to be in line with other languages. It means that this term:
>
>   op1 op2 X op3 op4
>
> is parsed as:
>
>   op1(op2(op4((op3 X))))

I agree with much of that. Postfix operators come first. But I don't put
all prefix operators so high.

Consider

- a & b ;i.e. negate (a bitand b)
- a ** b ;i.e. negate (a to the power of b)
not a & b ;i.e. boolean not of (a bitand b)



>
> In practice it works intuitively; you mustn't over-think it! So:
>
>   -P^     # ^ is a deref op

Mine does that. Of higher precedence than bitwise ops I have the
'operand' group which includes dereference. So I would have dereference
before unary negation.

The operand group essentially has

( invoke subprogram
[ array or slice
. member selection
& inhibit auto dereference
* dereference

All of the operators in that group are postfix, all have highest
precedence and all, therefore, associate left-to-right.



--
James Harris

James Harris

unread,
Sep 6, 2021, 3:11:00 PM9/6/21
to
On 06/09/2021 12:24, David Brown wrote:
> On 06/09/2021 12:41, Bart wrote:

...

>> [OK, this is a /fourth/ reply to you within half an hour. You should
>> stop saying things I disagree with!]
>>
>
> I've been interspersing with a few agreements...

Very lax. :-)

...

> If a language supports a concept of ranges, represented by "b .. c" (or
> similar syntax), then "a in b .. c" is a good way to handle such tests.
> I would not invent such a syntax purely for such tests,

I don't see the chaining of comparisons as only for range tests. AISI
where each 'op' is a comparison operator

A op B op C op D

simply means that all component parts have to be true for the whole
phrase to be true. In words,

(A op B) and (B op C) and (C op D)
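
Python already gives chains this "and" meaning, with each middle operand evaluated only once. The sketch below uses a hypothetical helper, `mid()`, whose only purpose is to make the number of evaluations visible:

```python
calls = []

def mid():
    """Stand-in for a middle operand with a visible side effect."""
    calls.append("mid")
    return 5

# Built-in chain: mid() is evaluated once.
chained = 1 < mid() <= 10
chained_calls = len(calls)

calls.clear()
# Hand-expanded form: mid() is written, and evaluated, twice.
expanded = (1 < mid()) and (mid() <= 10)
expanded_calls = len(calls)

print(chained, chained_calls)     # True 1
print(expanded, expanded_calls)   # True 2
```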


> but if it is
> used for array slices, for-loops, etc., then you have a good feature.

I don't understand that. How can chaining be used for array slices, for
loops, etc?

>
>
> (Since I may have accidentally complimented you for a language feature,
> I need to add balance - don't you have a space key on your keyboard?
> Why don't you use it when writing code?)

Indeed!

In fairness, the absence of spaces doesn't look too bad when variable
names are short - such as might be used in example code fragments. But
IMO spaceless code becomes hard to read with longer identifiers as used
in proper programming.


--
James Harris

Bart

unread,
Sep 6, 2021, 4:52:32 PM9/6/21
to
On 06/09/2021 18:03, James Harris wrote:
> On 06/09/2021 14:46, Bart wrote:

>> So syntactically, A<B<C is treated like A.B.C.
>
> I don't get that. I could have
>
>   if X and not (A < (B + 1) == C)
>
> where none of the parens would be necessary.


I mean that I have these broad levels

Syntax   A() A.B A^ a[]
Unary    -A +A  abs A  inot A  istrue A not A ... and maths ops
Binary   A+B etc

For comparison ops to bind more tightly than any unary, they'd have to
be in that top level, which would then lead to these issues:

* A + B < C would bind in funny ways like my example

* Only A.B in the syntactic group has an infix 'shape', and there is now
ambiguity in A.B < C.D, which would be parsed as ((A.B) < C).D, unless I
now introduced different precedences at that level.

You get around that by allowing groups of unary ops in-between groups of
binary ops, but that's a little too outre for me.

>   - a & b    ;i.e. negate (a bitand b)

So - A & B means -(A & B), but - A * B still means (-A) * B ?


I really don't like precedences, or having to remember them, and tried
to keep them minimal:

** is unusual, so that's easy to remember

:= is at the other end, so that's easy too

* and /, and + and -, are known to everyone, and must go between those two.

That leaves comparisons, AND or OR. Comparisons can naturally go just
below normal expressions.

AND and OR I just remember from Pascal.

The full set is 6 levels:

Exponentiation

Scaling

Adding

Comparing (includes 'in/not in')

AND

OR

I can leave out assignment as you don't need to think about it; in many
languages you don't even have assignment in an expression. But when you
don't, it'll be lowest of all.

I haven't mentioned shifts and bitwise ops. Since shifts do scaling,
they can be lumped in that group. The other bitwise ones are lumped with
add, since I can't think of a good reason they should be (a) higher
precedence than Add; (b) lower precedence than Add.

(There is one more "..", which is an odd one out. There are problems at
every level, but I'm trying it out between Add and Compare. That one I
can never remember where it goes.)

David Brown

unread,
Sep 6, 2021, 5:24:48 PM9/6/21
to
It is a matter of style and opinion (and Bart knows that, of course).

But it is a serious point. Layout of code is vital to readability, and
spaces are a big part of that (as is consistency - you don't want
different spacing depending on the length of the variables). It is
better to write "c == 1" than "c==1", because the "==" is not part of
either the "c" or the "1". Some characters, such as parentheses, are
thin enough that they provide their own visual space, and sometimes you
/do/ want to tie elements together closely and thus skip the space. The
details of a good spacing system are complicated, but I have no doubts
that the readability of Bart's code would be improved with more spaces.

I blame Knuth for my pedantry here. After having read "The TeX Book" in
my student days, I have been unable to ignore poor layout of code,
mathematics, or text. Lamport's "LaTeX" book didn't help either.

James Harris

unread,
Sep 7, 2021, 3:19:02 AM9/7/21
to
On 06/09/2021 22:24, David Brown wrote:

...

> I blame Knuth for my pedantry here. After having read "The TeX Book" in
> my student days, I have been unable to ignore poor layout of code,
> mathematics, or text. Lamport's "LaTeX" book didn't help either.

What's the exact title? I can find "Tex, The Program" but copies are too
expensive.


--
James Harris

David Brown

unread,
Sep 7, 2021, 4:09:38 AM9/7/21
to
"The TeXbook", by Donald Knuth.

"LaTeX : A Document Preparation System", by Leslie Lamport

These are not the smallest, cheapest or easiest introductions to TeX or
LaTeX if you want to use these systems - there is endless information
available for free online, as well as tutorial books, guides for more
specialist use in typesetting, etc.

But they are written by the authors of TeX and LaTeX respectively, and
give an understanding into the philosophy and technical details of
typesetting and good structured document layout. It is not just about
saying /how/ you make spaces of different sizes and stretchiness, but
saying /why/ you should do so, and when to use them.

James Harris

unread,
Sep 9, 2021, 12:52:49 PM9/9/21
to
On 06/09/2021 21:52, Bart wrote:
> On 06/09/2021 18:03, James Harris wrote:
>> On 06/09/2021 14:46, Bart wrote:
>
>>> So syntactically, A<B<C is treated like A.B.C.
>>
>> I don't get that. I could have
>>
>>    if X and not (A < (B + 1) == C)
>>
>> where none of the parens would be necessary.
>
>
> I mean that I have these broad levels
>
>   Syntax   A() A.B A^ a[]
>   Unary    -A +A  abs A  inot A  istrue A not A ... and maths ops
>   Binary   A+B etc
>
> For comparison ops to bind more tightly than any unary, they'd have to
> be in that top level,

Would they? Couldn't the same be achieved by allowing some of the
unaries to have lower levels - so that the issues you go on to mention
wouldn't arise?

>
> which would then lead to these issues:
>
> * A + B < C    would bind in funny ways like my example
>
> * Only A.B in the syntactic group has an infix 'shape', and there is now
> ambiguity in A.B < C.D, which would be parsed as ((A.B) < C).D, unless I
> now introduced different precedences at that level.
>
> You get around that by allowing groups of unary ops in-between groups of
> binary ops, but that's a little too outre for me.
>
>>    - a & b    ;i.e. negate (a bitand b)
>
> So - A & B means -(A & B), but - A * B still means (-A) * B ?

Yes, I have bitwise operations before arithmetic operations.

One could make a good case for prohibiting the combination of arithmetic
and bitwise operators because the first works on numbers and the second
works on bit patterns. The two do not match unless you require a mapping
between numbers and putative representations. But /if/ one decides that
they can be combined then one has to decide how they relate.

The worst place to put manipulation of bit patterns, ISTM, is between
arithmetic and comparisons. That's because arithmetic /produces/
numbers, while common comparisons (e.g. less-than) consume those
selfsame numbers. Bit patterns between those two would convert from
numbers to bit patterns then back to numbers. :-(

The only place left, /if/ one is to allow them to be combined, is to
carry out bitwise operations before doing arithmetic, and that's where
I've put them.

>
>
> I really don't like precedences, or having to remember them,

Nor me. But I found that keeping operators in families made their
ordering much easier to remember.

> and tried
> to keep them minimal:
>
> ** is unusual, so that's easy to remember
>
> := is at the other end, so that's easy too
>
> * and /, and + and -, are known to everyone, and must go between those two.
>
> That leaves comparisons, AND or OR. Comparisons can naturally go just
> below normal expressions.
>
> AND and OR I just remember from Pascal.
>
> The full set is 6 levels:
>
>    Exponentiation
>
>    Scaling
>
>    Adding
>
>    Comparing (includes 'in/not in')
>
>    AND
>
>    OR

That's interesting. You've taken a structural approach. I took a
semantic approach. IMO neither is provably best; there are options
rather than perfect solutions. The best possible is probably a
combination of convenience and memorability.

>
> I can leave out assignment as you don't need to think about it; in many
> languages you don't even have assignment in an expression. But when you
> don't, it'll be lowest of all.

Yes. I think that when explaining precedence levels it might be best to
start with the arithmetic operators and work up and down (in terms of
precedence level) from there.

>
> I haven't mentioned shifts and bitwise ops. Since shifts do scaling,
> they can be lumped in that group. The other bitwise ones are lumped with
> add, since I can't think of a good reason they should be (a) higher
> precedence than Add; (b) lower precedence than Add.

I recognise the dilemma. See above for where I ended up.

>
> (There is one more "..", which is an odd one out. There are problems at
> every level, but I'm trying it out between Add and Compare. That one I
> can never remember where it goes.)
>

I, too, have operators to fit in such as

Bitwise concatenation
Range specifiers


--
James Harris

James Harris

unread,
Oct 23, 2021, 10:40:03 AM10/23/21
to
On 06/09/2021 10:55, David Brown wrote:
> On 05/09/2021 12:50, James Harris wrote:
>> I've got loads of other posts in this ng to respond to but I came across
>> something last night that I thought you might find interesting.
>>
>> The issue is whether a language should support chained comparisons such
>> as where
>>
>>   A < B < C
>>
>> means that B is between A and C? Or should a programmer have to write
>>
>>   A < B && B < C
>>
>> ?
>>
>
> Ask yourself if the need to check that a value is within a range is
> common enough that you need a special syntax to handle it. I'd say no,

If this were about just checking whether a value was in range I would
say the same.


> but there is always a balance to be found and it varies from language to
> language. I am not a fan of having lots of special syntaxes, or
> complicated operators - whether they are done using symbols or vast
> numbers of keywords.

Nor me.

>
> My vote would be to make relational operators return a "boolean", and to
> make operations between booleans and other types a syntax error or
> constraint error, and to disallow relational operators for booleans.
> Then "A < B < C" is a compile-time error. It is not particularly hard
> to write "A < B && B < C" when you need it. Put more focus on making it
> hard to write incorrect or unclear code.
>

Interesting that you should say that. For me, the main appeal of chained
or composite comparisons is that they can make code /easier/ to read.
Yes, there are ranges - with or without inclusivity as in

low <= v <= high
0 <= index < len

Though it's not just about ranges. For example,

low == mid == high

Something such as

x < y > z

is not, by itself, so intuitive. But as you say, they can all be read as
written with "and" between the parts. In this case,

x < y && y > z

I haven't yet decided whether or not to include composite comparisons
but I can see readability benefits.


--
James Harris

James Harris

unread,
Oct 23, 2021, 10:42:54 AM10/23/21
to
On 06/09/2021 11:41, Bart wrote:
> On 06/09/2021 10:55, David Brown wrote:
>> On 05/09/2021 12:50, James Harris wrote:
>>> I've got loads of other posts in this ng to respond to but I came across
>>> something last night that I thought you might find interesting.
>>>
>>> The issue is whether a language should support chained comparisons such
>>> as where
>>>
>>>    A < B < C
>>>
>>> means that B is between A and C? Or should a programmer have to write
>>>
>>>    A < B && B < C

...

>>  It is not particularly hard
>> to write "A < B && B < C" when you need it.
>
> * B has to be written twice

True.

>
> * It's not always a simple expression, so you need to ensure both are
> actually identical
>
> * The reader needs to double check that it /is/ the same expression

Two very good points!


--
James Harris

Message has been deleted

Bart

unread,
Oct 26, 2021, 8:15:12 PM10/26/21
to
On 23/10/2021 15:39, James Harris wrote:


> Though it's not just about ranges. For example,
>
>   low == mid == high
>
> Something such as
>
>   x < y > z
>
> is not, by itself, so intuitive. But as you say, they can all be read as
> written with "and" between the parts. In this case,
>
>   x < y && y > z
>
> I haven't yet decided whether or not to include composite comparisons
> but I can see readability benefits.

If you don't have chained comparisons, then you have to decide what a <
b < c or a = b = c mean. Where there isn't an intuitive alternative
meaning, then you might as well use that syntax to allow chains.

But you might want to restrict it so that the sequence of comparisons
are either all from (< <= =) or (> >= =).

So that if you were to plot the numeric values of A op B op C for
example when the result is True, the gradient would always be either >=
0 or <= 0; never mixed; ie no ups then downs.

(Which also allows you to infer the relationship of A and C).

Then, x < y > z would not be allowed (it's too hard to grok); neither
would x != y != z, even if it is equivalent to:

x != y and y != z

(That is better written as y != x and y != z; I would write it as y not
in [x, z])
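
The restriction described above can be sketched as a check on the operators in a chain. This is a minimal illustration in Python, not Bart's implementation: the function name and the use of operator strings are assumptions, and `==` stands in for the `=` used in the post.

```python
ASCENDING = {"<", "<=", "=="}
DESCENDING = {">", ">=", "=="}

def chain_is_monotonic(ops):
    """Return True if every operator in the chain points the same way."""
    used = set(ops)
    return used <= ASCENDING or used <= DESCENDING

print(chain_is_monotonic(["<", "<="]))    # True:  a < b <= c
print(chain_is_monotonic(["==", "=="]))   # True:  a == b == c
print(chain_is_monotonic(["<", ">"]))     # False: a < b > c
print(chain_is_monotonic(["!=", "!="]))   # False: a != b != c

# Why x != y != z is easy to misread: it says nothing about x and z.
x, y, z = 1, 2, 1
unrelated_ends = x != y != z
print(unrelated_ends)                     # True, although x == z
```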

Andy Walker

unread,
Oct 28, 2021, 3:37:09 PM10/28/21
to
On 27/10/2021 01:15, Bart wrote:
> If you don't have chained comparisons, then you have to decide what
> a < b < c or a = b = c mean. Where there isn't an intuitive alternative
> meaning, then you might as well use that syntax to allow chains.

/Syntactically/, "a < b < c" is correct [FSVO!] in most
languages, and if not it's because the semantics are obtruding
into the syntax in ways that they ought not to. IOW, "a + b + c"
is correct syntax almost no matter how you slice and dice, and
there is no interesting syntactic difference between that and the
"<" case. Both of these fail [if they do] because of the types
of "a" [etc], not because "a OP b OP c" is a forbidden construct.

So really, this is a question of the semantics of "a < b".
We really, really don't want to mess around too much with "a < b"
itself, or else we all get confused. The trouble then is that
conventionally "a < b" returns Boolean and throws away the "b".
Effectively, you either have [eg] "TRUE < c" [which is either
meaningless or very likely not what the perpetrator intended] or
else you have to "special case" "a < b < c" so that "b" is kept
around, and so that "a < b" means one thing if that is the whole
expression and means something quite different if it is the left
operand of another operator. Special-casing is clearly possible,
esp in a private language, but it's not desirable in general.
So we "need" a new operator.

All of which set me thinking about how to do it, or at
least something close to "it", in languages with user-defined
operators. The following is what I came up with for Algol 68G
[it's a complete program -- explanations available if anyone is
actually interested]:

====== 8< ====== 8< ====== 8< ====== 8< ====== 8< ====== 8< ======

PRIO +< = 5, +> = 5, +<= = 5, +>= = 5, += = 4, +/= = 4;
MODE IB = STRUCT (INT i, BOOL b);
OP +< = (INT i,j) IB: ( i < j | (j, TRUE) | (~, FALSE) ),
   <  = (IB p, INT k) BOOL: ( b OF p | i OF p < k | FALSE ),
   +< = (IB p, q) IB:
           ( b OF p | (i OF q, i OF p < i OF q) | (~, FALSE) ),
   +< = (IB p, INT k) IB:
           ( b OF p | (k, i OF p < k) | (~, FALSE) ),
   +< = ([] INT a) BOOL:
           IF UPB a <= LWB a THEN TRUE # vacuously #
           ELSE INT p := a [LWB a];
                FOR i FROM LWB a + 1 TO UPB a
                   DO ( p < a[i] | p := a[i] | e ) OD;
                TRUE EXIT e: FALSE
           FI;
# Repeat the above for the other five operators #

# Use as, for example: #
print ( 1 +< 2 +< 3 < 4);
print ( +< [] INT (1, 2, 3, 4, 5, 6) );
   # cast needed as the MODE (type) of (1,2,3,...) is
     not fully determined by the syntax in this position #
[] INT c = (6, 5, 4, 3, 2, 1); print ( +< c ) # no cast! #
   # prints "TTF" #

====== 8< ====== 8< ====== 8< ====== 8< ====== 8< ====== 8< ======

I initially used "<<" as the new operator, but [for good if obscure
reasons] that can't be used as a monadic operator in Algol. More
generally, it's not hard to write new operators such that [eg]

IF 0 +< [] INT (a,b,c) +<= 100 THEN ...

works. Whether that's worth doing is another matter. It's neater
and [perhaps] more readable than

IF a > 0 AND a <= 100 AND b > 0 AND ...

but I've managed without over several decades of programming and
can't claim ever to have missed it.
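
For readers who do not know Algol 68, the same idea (carry the latest right-hand operand along with the truth so far) can be sketched in Python. This is an illustrative translation under assumptions, not Andy's code: the class name `Link` and the function `strictly_ascending` are made up, and the explicit parentheses are needed only to stop Python applying its own built-in chaining.

```python
class Link:
    """Plays the role of the IB struct: latest operand plus truth so far."""

    def __init__(self, value, ok=True):
        self.value = value
        self.ok = ok

    def __lt__(self, other):
        # Once one link has failed, the rest of the chain stays false.
        return Link(other, self.ok and self.value < other)

    def __bool__(self):
        return self.ok


def strictly_ascending(items):
    """Counterpart of the monadic +< on an array; vacuously true if short."""
    return all(p < q for p, q in zip(items, items[1:]))


good = bool(((Link(1) < 2) < 3) < 4)
bad = bool((Link(1) < 3) < 2)
print(good, bad)                                  # True False
print(strictly_ascending([1, 2, 3, 4, 5, 6]))     # True
print(strictly_ascending([6, 5, 4, 3, 2, 1]))     # False
```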

> But you might want to restrict it so that the sequence of comparisons
> are either all from (< <= =) or (> >= =).
[...]
> Then, x < y > z would not be allowed (it's too hard to grok); [...]
ISTM harder to restrict it than to allow it. But again
whether it's worth doing [whether at all, or with restrictions]
is another matter. I doubt it; but it makes a nice little
programming exercise.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Boccherini

Bart

unread,
Oct 28, 2021, 8:03:51 PM10/28/21
to
Well, this works, for your examples.

But I think that with working on arrays, that's straying from what this
is about; that is more like list operators, where you apply a designated
operator between all elements of the array to get a single scalar result.

I may have mentioned the way I approach this, which is to consider a
chain of related operators as a single n-ary operator.

This can be implemented by transforming A op1 B op2 C into (A op1 B) and
(B op2 C), but I now treat it as a single operation, with a composite
operator and multiple operands.

That has worked very well, and implemented much more simply when
built-in to a compiler, than trying to do it with language-building
features, as your example shows. For example, my way works for any types
for which < <= >= > are defined, for any types at all when only = <> are
involved, and it will work with mixed types.

(But I don't have list ops, which is a different subject.)
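
As an outside illustration of the "single n-ary operator" representation, Python's own parser stores a chain as one node with a list of operators and a list of operands. The sketch below only inspects that node; it says nothing about how Bart's compiler is built.

```python
import ast

# Parse a chained comparison and look at the resulting expression node.
node = ast.parse("a < b <= c", mode="eval").body

node_kind = type(node).__name__
op_names = [type(op).__name__ for op in node.ops]
operands = [node.left.id] + [n.id for n in node.comparators]

print(node_kind)   # Compare
print(op_names)    # ['Lt', 'LtE']
print(operands)    # ['a', 'b', 'c']  (each operand appears once)
```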


> ====== 8< ====== 8< ====== 8< ====== 8< ====== 8< ====== 8< ======
>
> I initially used "<<" as the new operator, but [for good if obscure
> reasons] that can't be used as a monadic operator in Algol.  More
> generally, it's not hard to write new operators such that [eg]
>
>   IF 0 +< [] INT (a,b,c) +<= 100 THEN ...
>
> works.  Whether that's worth doing is another matter.

And what it actually means is another! I can't relate it to what you write
below ...

> It's neater
> and [perhaps] more readable than
>
>   IF a > 0 AND a <= 100 AND b > 0 AND ...

... which I'd anyway write as:

if a in 1..100 and b>0 then

I wouldn't use chained compares at all, as this is even better. Syntax
is so easy to add to a language!


> but I've managed without over several decades of programming and
> can't claim ever to have missed it.

I wouldn't like to go back to whatever I was using in the mid-80s...

>> But you might want to restrict it so that the sequence of comparisons
>> are either all from (< <= =) or (> >= =).
> [...]
>> Then, x < y > z would not be allowed (it's too hard to grok); [...]
>     ISTM harder to restrict it than to allow it.

I've just done it; it took 3 minutes, and some 8 lines of code. So
you're right, it was harder by requiring some extra effort and more
code, but it probably saves some head-scratching later.


Bart

unread,
Oct 28, 2021, 8:10:00 PM10/28/21
to
On 29/10/2021 01:03, Bart wrote:

> That has worked very well, and implemented much more simply when
> built-in to a compiler, than trying to do it with language-building
> features, as your example shows. For example, my way works for any types
> for which < <= >= > are defined, for any types at all when only = <> are
> involved,


> and it will work with mixed types.

Well, sort of. I thought I'd better try. It appears to work, but it
converts all the operands to a common, dominant type. So A=B=C could
have different semantics compared with A=B and B=C, as the former could
use all floats; the latter might use int for A=B and floats for B=C.
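
The difference can be made concrete with values near 2**53, where a 64-bit float can no longer represent every integer. This is a Python sketch of the effect, with values chosen for the purpose; it does not model the conversions Bart's compiler actually performs.

```python
a = 2**53 + 1      # int that has no exact float representation
b = 2**53          # int
c = 2.0**53        # float

# Pairwise: A=B is an exact integer comparison, B=C compares int with float.
pairwise = (a == b) and (b == c)

# Chain with everything widened to the dominant type (float) first.
all_float = float(a) == float(b) == float(c)

print(pairwise, all_float)   # False True
```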

Dmitry A. Kazakov

unread,
Oct 29, 2021, 2:27:53 AM10/29/21
to
On 2021-10-28 21:37, Andy Walker wrote:

>     /Syntactically/, "a < b < c" is correct [FSVO!] in most
> languages, and if not it's because the semantics are obtruding
> into the syntax in ways that they ought not to.

It is worth mentioning that originally

a < b < c

is not an expression. It is a proposition and it is also meant to convey
transitivity. a is less than b and also less than c. For that reason

a < b > c

is not used, rather

a,c < b

while

a < b < c < d

is pretty common to define an ordered set.

An imperative programming language does not need any of that. Inclusion
tests are better with intervals

b in a..c

sets are better with aggregates

(a,b,c,d)

A declarative language would have to keep expressions apart from
declarations anyway.

Andy Walker

unread,
Oct 29, 2021, 3:29:58 PM10/29/21
to
On 29/10/2021 07:27, Dmitry A. Kazakov wrote:
> It is worth mentioning that originally
>    a < b < c
> is not an expression. It is a proposition and it is also meant to convey transitivity. a is less than b and also less than c. For that reason
>    a < b > c
> is not used,

Sure. But once you've added operators to do chained
comparisons, it's harder to prevent "a < b > c" than just to
accept it; it's not as though the meaning is unclear. [OTOH,
I suppose it could be a typo, but OTTH I can't imagine using
any of this in my own programming.]

> rather
>    a,c < b

This is OK as maths, but in many computing languages
it's syntactically ambiguous. In Algol, you'd need to
decorate it something like "[]INT (a,b) < c" plus writing
the [newly extended] operator. Again, it all seems like a
lot of trouble to avoid "a < b AND c < b", or, if "b" is a
complicated expression "(INT b = some mess; a < b AND c < b)".
Note that "a < b >= c" is slightly harder to re-cast, tho' it
comes free with the chaining operators.

> while
>    a < b < c < d
> is pretty common to define an ordered set.

Again, this drops out with no extra work once the
chaining operators are written; equally again, it's not
sufficiently common in programming to justify much work
being done for it, and esp not to justify extra syntax.

> An imperative programming language does not need any of that.
> Inclusion tests are better with intervals
> b in a..c
> sets are better with aggregates
> (a,b,c,d)

No doubt, but I don't want to have to mess with the
compiler to add and use them.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Chwatal

Dmitry A. Kazakov

unread,
Oct 29, 2021, 3:44:47 PM10/29/21
to
They can be well used for many other purposes starting with cycles

for I in 1..200
for I in (a,b,c,d)

while chained operations have no use at all, except for renaming an
object like b in

a < b < c

Yet it is too specific to serve numerous other cases one could need a
renaming:

b = b + c

I would expect at least that, e.g. in the evil spirit of C:

b += c
a < b <&& c

(:-))

Andy Walker

unread,
Oct 29, 2021, 7:18:30 PM10/29/21
to
On 29/10/2021 01:03, Bart wrote:
>> [...] More
>> generally, it's not hard to write new operators such that [eg]
>>    IF 0 +< [] INT (a,b,c) +<= 100 THEN ...
>> works.  Whether that's worth doing is another matter.
> And what it actually means is another! I can't relate it to what you write below ...
>>   It's neater
>> and [perhaps] more readable than
>>    IF a > 0 AND a <= 100 AND b > 0 AND ...
> ... which I'd anyway write as:
>    if a in 1..100 and b>0 then

That's only the first half of it! Or did you miss the "..."?
Put differently, it meant "a" and "b" and "c" all > 0 and <= 100.
[NB, "a > 0" is equivalent to "a >= 1" for integers, but not for
reals. But that's another can of worms.]
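
The intended meaning ("every one of a, b, c is > 0 and <= 100") can be spelled out in a language with built-in chains and a quantifier. A minimal Python sketch with arbitrary sample values; the variable names are illustrative only:

```python
a, b, c = 5, 100, 42

all_in_range = all(0 < x <= 100 for x in (a, b, c))
one_out_of_range = all(0 < x <= 100 for x in (a, 0, c))
print(all_in_range, one_out_of_range)   # True False

# The mathematical shorthand "a,c < b" reads the same way.
low1, low2, top = 1, 2, 3
both_below = all(x < top for x in (low1, low2))
print(both_below)                       # True
```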

> I wouldn't use chained compares at all, as this is even better.

I've never used them in programming [until yesterday!].

> Syntax is so easy to add to a language!

Syntax is easy to change in your own private language,
but I don't expect any ideas you or I or even Dmitry may have
to propagate any time soon to C or Fortran or Pascal or ....
It is essentially impossible to make any major changes to those
languages that have standards; it's also quite hard to make
sure that syntax changes don't impact (a) on legacy programs,
and (b) other parts of the syntax. If /you/ bungle in this
sort of way in /your/ language, you can quietly revert to the
old version of the compiler; if the C committee does something
daft, chaos ensues, which is why they are /extremely/ cautious.

[...]
> I wouldn't like to go back to whatever I was using in the mid-80s...

In my case, that would be primarily C [tho' thenabouts
I was jointly giving a module on comparative languages, and we
got through something like 20 languages in as many lectures,
with tops-and-tails describing various types of language and
projects in about six of them]. But C was merely the best of
a bad job [esp given the then-available resources and compilers
-- we had 2.4MB discs and 64KB limits on code and data on our
PDP-11 (with over 100 users and ~40 terminals)]. Personally,
I gradually became more and more bored with writing C programs,
and largely switched to shell scripts and similar. When A68G
came along, I suddenly regained the joy of programming, and of
using a language instead of fighting it. But that's back to
the '70s!

>>> But you might want to restrict it so that the sequence of comparisons
>>> are either all from (< <= =) or (> >= =).
>> [...]
>>> Then, x < y > z would not be allowed (it's too hard to grok); [...]
>>      ISTM harder to restrict it than to allow it.
> I've just done it; it took 3 minutes, and some 8 lines of code. So
> you're right, it was harder by requiring some extra effort and more
> code, but it probably saves some head-scratching later.

When you say you've done it, you mean you've tweaked your
compiler. More important from my PoV is to explain it to users,
inc re-writing the "User Manual" to include the new syntax. For
that purpose, it's easier to say that all comparison operators can
be "chained", giving examples, than to say "these can be chained,
separately those can be chained, but you can't mix them, except
for the [in]equality operators", with a sub-text of "because you
wouldn't understand what they meant, so don't trouble your pretty
head about it". I made the last bit up, but it's the impression
I often get from patronising user manuals and textbooks.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Chwatal

Bart

Oct 30, 2021, 7:32:56 AM
On 30/10/2021 00:18, Andy Walker wrote:
> On 29/10/2021 01:03, Bart wrote:
>>> [...] More
>>> generally, it's not hard to write new operators such that [eg]
>>>    IF 0 +< [] INT (a,b,c) +<= 100 THEN ...
>>> works.  Whether that's worth doing is another matter.
>> And what it actually means is another! I can't relate it to what you
>> write below ...
>>>                               It's neater
>>> and [perhaps] more readable than
>>>    IF a > 0 AND a <= 100 AND b > 0 AND ...
>> ... which I'd anyway write as:
>>     if a in 1..100 and b>0 then
>
>     That's only the first half of it!  Or did you miss the "..."?
> Put differently, it meant "a" and "b" and "c" all > 0 and <= 100.
> [NB, "a > 0" is equivalent to "a >= 1" for integers, but not for
> reals.  But that's another can of worms.]

If your example was short of the extra clutter, then it would be:

if 0 < (a,b,c) <= 100 then

It's a little clearer what's happening, but this list processing is
still causing some confusion. You've taken a chain of compare operators,
and combined that with list operations. So what would:

0 < (a,b,c) < (d,e,f)

mean? Your original also has ambiguities: should the result be a single
value (True when 0 < x <= 100 for every x in (a,b,c))? Or should it be:

(True, False, True)

which is the set of booleans from each 0 < x <= 100?

So this is a sort of red herring when talking about the benefits of
chained compares.
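
To make the two readings concrete, here is a minimal sketch in Python, picked only because it has chained comparisons built in. The values of a, b and c are made up, and neither reading is claimed to be what the A68 operators above actually do:

```python
# Hypothetical values, purely for illustration.
a, b, c = 5, 250, 100

# Reading 1: one bool per element of the list.
per_element = [0 < x <= 100 for x in (a, b, c)]

# Reading 2: a single bool, true only if every element is in range.
collapsed = all(0 < x <= 100 for x in (a, b, c))

print(per_element)  # [True, False, True]
print(collapsed)    # False
```

Both are one-liners, so the question is which one the bare syntax should stand for, not which is harder to implement.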

>> I wouldn't use chained compares at all, as this is even better.
>
>     I've never used them in programming [until yesterday!].
>
>> Syntax is so easy to add to a language!
>
>     Syntax is easy to change in your own private language,

It's easy to add syntax in a new language. And it really isn't hard in
an established language.

Where do you think all those new features in successive C standards come
from? It is specific implementations adding extensions.

So &&L in gnu C is special syntax to take the address of a label.
(Needed because labels live in their own namespace, and so &L would be
ambiguous.)

> It is essentially impossible to make any major changes to those
> languages that have standards;  it's also quite hard to make
> sure that syntax changes don't impact (a) on legacy programs,
> and (b) other parts of the syntax.

See above. On the subject of C, see also C++.

>  If /you/ bungle in this
> sort of way in /your/ language, you can quietly revert to the
> old version of the compiler;  if the C committee does something
> daft, chaos ensues, which is why they are /extremely/ cautious.

C syntax was already chaotic, it can't get much worse!

Eg. 'break' being overloaded; a*b meaning multiply a by b, OR declare
variable b of type 'pointer to a'.

I've switched machines so DAK's posts are visible again; but he suggests
syntax like this:

for I in 1..200
for I in (a,b,c,d)

which is exactly what I use (with the addition of 'do'). The first works
in both of my languages; the second only in the higher level one. The
meanings are obvious.

But, probably half of current languages have copied C's brain-dead
for-loop syntax for that first example:

for (i = 1; i <= 200; ++i)

(More typically, 0..199 for such languages as they tend to be 0-based
too, /and/ case-sensitive.)

What is the matter with people?! This is the compiler's job to know how
to implement iteration, not yours!



> [...]
>> I wouldn't like to go back to whatever I was using in the mid-80s...
>
>     In my case, that would be primarily C [tho' thenabouts
> I was jointly giving a module on comparative languages, and we
> got through something like 20 languages in as many lectures,
> with tops-and-tails describing various types of language and
> projects in about six of them].  But C was merely the best of
> a bad job [esp given the then-available resources and compilers
> -- we had 2.4MB discs and 64KB limits on code and data on our
> PDP-11 (with over 100 users and ~40 terminals)].

(Really? I feel fortunate now in sharing my pdp11/34 with only 1-2 other
people!)

> Personally,
> I gradually became more and more bored with writing C programs,
> and largely switched to shell scripts and similar.  When A68G
> came along, I suddenly regained the joy of programming, and of
> using a language instead of fighting it.  But that's back to
> the '70s!

You've illustrated perfectly why I still prefer using my own languages!

But I couldn't use A68; I'd be constantly fighting the syntax and type
system. And switching between capitals and lower case all the time...

>>>> But you might want to restrict it so that the sequence of comparisons
>>>> are either all from (< <= =) or (> >= =).
>>> [...]
>>>> Then, x < y > z would not be allowed (it's too hard to grok); [...]
>>>      ISTM harder to restrict it than to allow it.
>> I've just done it; it took 3 minutes, and some 8 lines of code. So
>> you're right, it was harder by requiring some extra effort and more
>> code, but it probably saves some head-scratching later.
>
>     When you say you've done it, you mean you've tweaked your
> compiler.  More important from my PoV is to explain it to users,
> inc re-writing the "User Manual" to include the new syntax.

That would just be a Note: you can only mix '= < <=' or '= > >='.

(But don't try to explain further or you'd get tied up in knots.)
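
As a rough idea of how small such a check is, here is a sketch in Python (not the actual compiler code; the function name and the string spelling of the operators are assumptions):

```python
# Hypothetical checker: a chain may draw its operators from one family only.
ASCENDING = {"=", "<", "<="}
DESCENDING = {"=", ">", ">="}

def chain_is_allowed(ops):
    """Return True if every operator in the chain comes from a single family."""
    used = set(ops)
    return used <= ASCENDING or used <= DESCENDING

print(chain_is_allowed(["<", "<=", "="]))  # True
print(chain_is_allowed(["=", "=", "="]))   # True
print(chain_is_allowed(["<", ">"]))        # False
```

Under this rule any operator outside the two families (a not-equals, say) is rejected in a chain as well.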

> For
> that purpose, it's easier to say that all comparison operators can
> be "chained", giving examples, than to say "these can be chained,
> separately those can be chained, but you can't mix them, except
> for the [in]equality operators", with a sub-text of "because you
> wouldn't understand what they meant, so don't trouble your pretty
> head about it".

It can be to avoid confusion and to keep things readable.

Dmitry A. Kazakov

Oct 30, 2021, 8:22:59 AM
On 2021-10-30 13:32, Bart wrote:

> If your example was short of the extra clutter, then it would be:
>
>  if 0 < (a,b,c) <= 100 then
>
> It's a little clearer what's happening, but this list processing is
> still causing some confusion. You've taken a chain of compare operators,
> and combined that with list operations. So what would:
>
>    0 < (a,b,c) < (d,e,f)
>
> mean?

Depends on the lists. Actually the above are abbreviations of OR-lists vs
AND-lists. Depending on that you expand macros (because these are
actually macros rather than operations):

0 < AND(a,b,c) < AND(d,e,f) --- expand --->

--- expand ---> (0 < a < d) and (0 < a < e) and (0 < a < f) ...

Here "and" is a proper logic operation.

> Your original also has ambiguities: should the result be a single
> value (True when 0 < x <= 100 for every x in (a,b,c))? Or should it be:
>
>    (True, False, True)
>
> which is the set of booleans from each 0 < x <= 100?

It is unambiguous because "for every x in (a,b,c)" makes the list a
macro list AND(a,b,c). When you move "for every x" to the leftmost
position you reduce it to a proper operation "and". The macro rule is:

for every x in (y, ...) do z --- expand --->

--- expand ---> (x do z) and (for every x in (...) do z)

for every x in () do z --- expand ---> True

You could also have "exists" quantification:

for any x in (y, ...) do z --- expand --->

--- expand ---> (x do z) or (for any x in (...) do z)

for any x in () do z --- expand ---> False
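
The two expansion rules can be sketched as recursive functions in Python. These are ordinary functions rather than macros, and the names for_every and for_any are assumptions; Python's built-in all() and any() do the same job:

```python
def for_every(items, pred):
    # for every x in () do z  --- expand --->  True
    if not items:
        return True
    head, *rest = items
    # (x do z) and (for every x in (...) do z)
    return pred(head) and for_every(rest, pred)

def for_any(items, pred):
    # for any x in () do z  --- expand --->  False
    if not items:
        return False
    head, *rest = items
    # (x do z) or (for any x in (...) do z)
    return pred(head) or for_any(rest, pred)

def in_range(x):
    return 0 < x <= 100

print(for_every((5, 50, 100), in_range))  # True
print(for_every((5, 250, 100), in_range)) # False
print(for_any((0, 200, 300), in_range))   # False
```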

> So this is a sort of red herring when talking about the benefits of
> chained compares.

It is actually a macro as well that expands

LESS(a,b,c)

into proper operations a < b and b < c. It is possible to "overload"
LESS with < to keep the syntax same for both. Then you resolve the
conflict since for

1 < 2 < 3

there is no Boolean < int and you decide in favor of LESS. But in

1 < 2 and x

there is no LESS and x and you decide for comparison. But then you could
not have ordered Booleans.
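
A small Python sketch of the same conflict. The function name less is an assumption; Python itself resolves a < b < c as a chain, and its Booleans happen to be ordered because they are integers:

```python
def less(*terms):
    """LESS(a, b, c): a < b and b < c, over already evaluated terms."""
    return all(x < y for x, y in zip(terms, terms[1:]))

print(less(1, 2, 3))  # True
print(less(1, 3, 2))  # False

# Built-in chaining picks the LESS reading:
print(3 > 2 > 1)      # True

# Parentheses force the Boolean-versus-int comparison instead:
print((3 > 2) > 1)    # False, because True > 1 is 1 > 1
```

The last line only type-checks because Python's bool is a subtype of int; in a language without ordered Booleans it would be the error that lets the overload be resolved.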

> I've switched machines so DAK's posts are visible again; but he suggests
> syntax like this:

Yes, my own machines hate me too! (:-))

>> For
>> that purpose, it's easier to say that all comparison operators can
>> be "chained", giving examples, than to say "these can be chained,
>> separately those can be chained, but you can't mix them, except
>> for the [in]equality operators", with a sub-text of "because you
>> wouldn't understand what they meant, so don't trouble your pretty
>> head about it".

You do not want

a or b < c

to mean

(a < c) or (b < c)

It is funny stuff, e.g.


(a and b) * (c or d) >= 100

means

((a * c) or (a * d) >= 100) and ((b * c) or (b * d) >= 100)


Cool, no? Even better, what about this one

(a xor b) > c

Bart

Oct 30, 2021, 12:36:50 PM
On 30/10/2021 13:22, Dmitry A. Kazakov wrote:
> On 2021-10-30 13:32, Bart wrote:
>
>> If your example was short of the extra clutter, then it would be:
>>
>>   if 0 < (a,b,c) <= 100 then
>>
>> It's a little clearer what's happening, but this list processing is
>> still causing some confusion. You've taken a chain of compare
>> operators, and combined that with list operations. So what would:
>>
>>     0 < (a,b,c) < (d,e,f)
>>
>> mean?
>
> Depends on the lists. Actually the above are abbreviations of OR-lists vs
> AND-lists. Depending on that you expand macros (because these are
> actually macros rather than operations):
>
>    0 < AND(a,b,c) < AND(d,e,f) --- expand --->
>
>    --- expand ---> (0 < a < d) and (0 < a < e) and (0 < a < f) ...
>
> Here "and" is a proper logic operation.

That's one interpretation of it. And one implementation.

For example, it's not clear what happened to b and c, unless they will
each also be compared against d, e and f. (Making it more like a matrix
operation.)

>
>> Your original also has ambiguities: should the result be a single
>> value (True when 0 < x <= 100 for every x in (a,b,c))? Or should it be:
>>
>>     (True, False, True)
>>
>> which is the set of booleans from each 0 < x <= 100?
>
> It is unambiguous because "for every x in (a,b,c)" makes the list a
> macro list AND(a,b,c).

It's ambiguous because there is more than one intuitive result. For the
scalar-vector example:

x op (a, b, c)

one result might be:

(x op a, x op b, x op c)

which is a list of bools when op is "<". However you might want to take
that further and make it:

x op a and x op b and x op c

to yield a single bool result. But the original has a chain of related
ops. Maybe that one should be equivalent to either:

(0 < a <= 100, 0 < b <= 100, 0 < c <= 100)

Or the same collapsed down to one bool. This suggests there would be a
choice between those possibilities. It also requires that where this is
a mix of scalar and vector terms, the scalar terms should be treated as
a repeated vector of that term.

But then, you will have other kinds of operators where that doesn't make
sense.

A vector- or list-processing language needs careful design.
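
A minimal sketch in Python of the "repeat the scalar" rule, under the assumptions that all vector terms have the same length and that only the three operators shown are needed. The names broadcast and chain_elementwise are made up:

```python
import operator

OPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq}

def broadcast(term, n):
    """Treat a scalar as n copies of itself; leave sequences alone."""
    return list(term) if isinstance(term, (list, tuple)) else [term] * n

def chain_elementwise(terms, ops):
    """Evaluate t0 op0 t1 op1 t2 ... once per element position."""
    n = max((len(t) for t in terms if isinstance(t, (list, tuple))), default=1)
    cols = [broadcast(t, n) for t in terms]
    return [
        all(OPS[op](cols[i][k], cols[i + 1][k]) for i, op in enumerate(ops))
        for k in range(n)
    ]

flags = chain_elementwise([0, (5, 250, 100), 100], ["<", "<="])
print(flags)       # [True, False, True]
print(all(flags))  # False
```

The list is the first possibility above and all(flags) is the collapsed one, so the choice between them still has to be made somewhere.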


> You do not want
>
>    a or b < c
>
> to mean
>
>    (a < c) or (b < c)
>
> It is funny stuff, e.g.
>
>
>    (a and b) * (c or d) >= 100
>
> means
>
>    ((a * c) or (a * d) >= 100) and ((b * c) or (b * d) >= 100)
>
>
> Cool, no? Even better, what about this one
>
>    (a xor b) > c
>
> (:-))

Those are all weird constructions at odds with the normal precedences of
< and 'or'. Here:

(a and b) * (c or d) >= 100

it looks like you're multiplying two bools, then comparing that result
with 100. It would need special operators. But I still don't understand
what's happening even in your expanded version.

Dmitry A. Kazakov

Oct 30, 2021, 3:50:03 PM
On 2021-10-30 18:36, Bart wrote:
> On 30/10/2021 13:22, Dmitry A. Kazakov wrote:
>> On 2021-10-30 13:32, Bart wrote:
>>
>>> If your example was short of the extra clutter, then it would be:
>>>
>>>   if 0 < (a,b,c) <= 100 then
>>>
>>> It's a little clearer what's happening, but this list processing is
>>> still causing some confusion. You've taken a chain of compare
>>> operators, and combined that with list operations. So what would:
>>>
>>>     0 < (a,b,c) < (d,e,f)
>>>
>>> mean?
>>
>> Depends on the lists. Actually the above are abbreviations of OR-lists
>> vs AND-lists. Depending on that you expand macros (because these are
>> actually macros rather than operations):
>>
>>     0 < AND(a,b,c) < AND(d,e,f) --- expand --->
>>
>>     --- expand ---> (0 < a < d) and (0 < a < e) and (0 < a < f) ...
>>
>> Here "and" is a proper logic operation.
>
> That's one interpretation of it. And one implementation.
>
> For example, it's not clear what happened to b and c, unless they will
> each also be compared against d, e and f. (Making it more like a matrix
> operation.)

If you mean vectors, they are not comparable. But yes, this is a reason
why I am against such things. To me (a,b,c) is an aggregate that can be of
any user-defined type, which then determines the meaning of <.
Everything is overloaded and everything is typed.

>>> Your original also has ambiguities: should the result be a single
>>> value (True when 0 < x <= 100 for every x in (a,b,c))? Or should it be:
>>>
>>>     (True, False, True)
>>>
>>> which is the set of booleans from each 0 < x <= 100?
>>
>> It is unambiguous because "for every x in (a,b,c)" makes the list a
>> macro list AND(a,b,c).
>
> It's ambiguous because there is more than one intuitive result. For the
> scalar-vector example:
>
>   x op (a, b, c)
>
> one result might be:
>
>   (x op a, x op b, x op c)

Yes, if op is *, () is vector, x is scalar. Then

2 * (3,4,5) = (6,8,10)

> A vector- or list-processing language needs careful design.

Just provide user-defined types, user-defined aggregates etc. The
programmer will sort things out. The less you wire into the syntax, the
easier it would be for everybody.

>> You do not want
>>
>>     a or b < c
>>
>> to mean
>>
>>     (a < c) or (b < c)
>>
>> It is funny stuff, e.g.
>>
>>
>>     (a and b) * (c or d) >= 100
>>
>> means
>>
>>     ((a * c) or (a * d) >= 100) and ((b * c) or (b * d) >= 100)
>>
>>
>> Cool, no? Even better, what about this one
>>
>>     (a xor b) > c
>>
>> (:-))
>
> Those are all weird constructions at odds with the normal precedences of
> < and 'or'. Here:
>
>   (a and b) * (c or d) >= 100
>
> it looks like you're multiplying two bools, then comparing that result
> with 100. It would need special operators. But I still don't understand
> what's happening even in your expanded version.

Yes, for that reason you either use an alternative notation, e.g. | is
frequently used to separate alternatives:

   case X is
      when 1 | 4 | 40 =>
         Do something
      when ...
   end case;

Or you use quantifiers as Andy did:

forall x in (a, b) exists y in (c, d) such that x*y >= 100

removing free variables x, y:

all(a, b) * any(c, d) >= 100

The last step:

(a and b) * (c or d) >= 100

Again, not that I advocate for this stuff.
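
For what it is worth, the quantified form is directly expressible in Python, which makes the intended meaning easier to check than the and/or shorthand. The values below are made up:

```python
# Hypothetical values, purely for illustration.
a, b, c, d = 10, 20, 3, 12

# forall x in (a, b) exists y in (c, d) such that x*y >= 100
quantified = all(any(x * y >= 100 for y in (c, d)) for x in (a, b))

# The same condition with the quantifiers expanded by hand.
expanded = ((a * c >= 100) or (a * d >= 100)) and \
           ((b * c >= 100) or (b * d >= 100))

print(quantified, expanded)  # True True
```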

Andy Walker

Oct 30, 2021, 5:10:56 PM
On 30/10/2021 12:32, Bart wrote:
> [...] So what would:
>    0 < (a,b,c) < (d,e,f)
> mean?

Whatever you wanted it to mean. As Dmitri proposed and
[long ago ...] Algol specified, all operators [inc standard ones
such as "+" and "AND"] are in "user space". You can define or
re-define them as you choose. Your program is embedded in an
environment in which lots of things are defined for you, but you
don't have to stick with the normal definitions. It will all be
visible in your code, so it's up to you whether you choose to
confuse yourself or not.

> Your original also has ambiguities: should the result be a single
> value (True when 0 < x <= 100 for every x in (a,b,c))? Or should it
> be:
>    (True, False, True)
> which is the set of booleans from each 0 < x <= 100?

That may be an ambiguity in your own mind, but when you
write your program you will have to decide what it means, and
there is no reason why what you decide has to be the same as
what Dmitri and I decide.

> So this is a sort of red herring when talking about the benefits of
> chained compares.

As previously noted, I've never felt the need to use
"chained" compares, so I'm not going to start talking now
about the benefits of them [for programs, as opposed to in
mathematics, where some usages are well understood].

[...]
> It's easy to add syntax in a new language. And it really isn't hard
> in an established language.
> Where do you think all those new features in successive C standards
> come from? It is specific implementations adding extensions.

Most of them came from Algol! Each new standard thus
far has added something from Algol, and AFAIR nothing that is
Algol-like has ever been taken away. [In view of discussions
with Brian Kernighan, I would like to take some credit for this,
but I really can't, and expect Steve Bourne was much more of an
influence!]

> So &&L in gnu C is special syntax to take the address of a label.
> (Needed because labels live in their own namespace, and so &L would
> be ambiguous.)

It's not /needed/. It's a way of creating botches.
Lots of early languages had label variables, and associated
paraphernalia, and gradually mainstream languages dropped
them. I'm not a subscriber to "all labels are evil" [there
was one, and an elided "GOTO", in the code I showed], but
there is very much less need for them today than there was
[or seemed to be] in the '50s and '60s.

[...]
> C syntax was already chaotic, it can't get much worse!

K&R C syntax totals about 5 1/2 sides in a manual of
31 sides. Compare the latest standard and weep. Then look
at the C++ standard, and weep some more. Then look at the
debates in "comp.std.c" and weep more again. For comparison,
the Algol syntax chart in "Algol Bulletin" is one side, tho'
admittedly it doesn't include formats [an Algol botch, that
would surely have been removed had Algol ever been revised
after looking at Unix], another half side.

> Eg. 'break' being overloaded; a*b meaning multiply a by b, OR declare
> variable b of type 'pointer to a'.

That's the sort of thing that happens when a private
language goes public before it is properly defined, and
without proper critical scrutiny. Sadly, that has been the
norm ever since it became standard practice for programmers
to write their own languages, loosely based on C or some
other currently popular language, write a paper or a web
page about the result, and sit back and wait for some of
them to thrive [usually for not-very-good reasons].

[...]
> But I couldn't use A68; I'd be constantly fighting the syntax and
> type system.

That's because you haven't read the RR!

> And switching between capitals and lower case all the
> time...

That's up to you. Any full implementation should
allow you the choice between upper-case stropping, reserved-
word stropping, prefix-apostrophe stropping, prefix-period
stropping and [perhaps] lower-case stropping, also a choice
of character sets [all this long before it became normal to
send programs around the world electronically, so that the
"internationalisation" effort became important]. I often
write code snippets for here in reserved-word, as it makes
the code look a bit more friendly. It's straightforward
[-ish] to write preprocessors that convert between regimes.

[...]
>>      When you say you've done it, you mean you've tweaked your
>> compiler.  More important from my PoV is to explain it to users,
>> inc re-writing the "User Manual" to include the new syntax.
> That would just be a Note: you can only mix '= < <=' or '= > >='.

Note again that in Algol "chained" comparisons, if
you choose to use them, are entirely private grief, and no
change to any syntax is required. But "Notes" aren't what
a User Manual is about, though it can be useful to put
supplementary information into a footnote, comment or example.
The Manual is for the Truth, the Whole Truth and Nothing But
the Truth. If your language allows you to mix some operators
but not others, then the syntax should reflect that [which,
in the present case would make the syntax more complicated].

> (But don't try to explain further or you'd get tied up in knots.)

How patronising!

> It can be to avoid confusion and to keep things readable.

Readability and confusions of user programs are not
to do with the User Manual but with style and good usage.
By all means put appropriate comments in footnotes, but the
Manual is the place where the actual syntax and semantics
of the language are "officially" defined, not where the
proles are told what the High Priests deign to allow them
to know. Some of the proles may well be better and more
experienced programmers than the High Priests.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Byrd

Bart

Oct 31, 2021, 6:52:43 AM
On 30/10/2021 22:10, Andy Walker wrote:
> On 30/10/2021 12:32, Bart wrote:
>> [...] So what would:
>>     0 < (a,b,c) < (d,e,f)
>> mean?
>
>     Whatever  you wanted it to mean.  As Dmitri proposed and
> [long ago ...] Algol specified, all operators [inc standard ones
> such as "+" and "AND"] are in "user space".  You can define or
> re-define them as you choose.  Your program is embedded in an
> environment in which lots of things are defined for you, but you
> don't have to stick with the normal definitions.  It will all be
> visible in your code, so it's up to you whether you choose to
> confuse yourself or not.

I don't buy that. You will confuse yourself, and anyone who tries to
read, understand, modify or port your program. And there will be
problems mixing code that has applied different semantics to the same
syntax.

Some things I think need to be defined by the language.

I do very little in the way of list processing, but it's handled
mostly with user-functions, and I do not support chained comparisons.
However, they are in the standard library.

Something like your '0 < []int (a,b,c)' would be written as:

mapsv(('<'), 0, (a,b,c))

with a vector result. A further step, eg. like DAK's and() macro, would
be needed to combine those results into one:

andv(mapsv(('<'), 0, (a,b,c)))
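
For readers without that library, a guess at what these two calls do, sketched in Python. Only the names mapsv and andv come from the calls above; the bodies and the test values are assumptions:

```python
import operator

def mapsv(op, scalar, vector):
    """Apply op(scalar, x) to each x in vector, giving a list of results."""
    return [op(scalar, x) for x in vector]

def andv(flags):
    """Combine a list of bools into one."""
    return all(flags)

# Hypothetical values, purely for illustration.
a, b, c = 5, -1, 7

print(mapsv(operator.lt, 0, (a, b, c)))        # [True, False, True]
print(andv(mapsv(operator.lt, 0, (a, b, c))))  # False
```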

But these work on two operands at a time. I don't know why people are
trying to compound a mildly interesting extension called 'chained
comparisons', with advanced list processing on the lines of APL.

One represents one tiny detail of my language design, the other dwarfs the
whole language.


>     As previously noted, I've never felt the need to use
> "chained" compares, so I'm not going to start talking now
> about the benefits of them [for programs, as opposed to in
> mathematics, where some usages are well understood].

Here's a line from an old program I found today (to do with a Rubik cube):

if face[4] = face[5] = face[6] = face[2] = face[8] = face[1] =
face[3] then

There are several ways of doing this without using chain comparisons,
especially as all terms are indices into the same list, but when
suddenly you have to test that 7 different things have the same value,
this was the simplest way to do so at the time.

So why not?
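
For comparison, here is how the same test reads in Python, both as a chain and via a helper. The cube data and the name all_same are made up for illustration:

```python
def all_same(values):
    """True if every value equals the first (vacuously true when empty)."""
    values = list(values)
    return all(v == values[0] for v in values[1:])

# Made-up cube data; index 0 is unused so indices match the 1-based original.
face = [None, "R", "R", "R", "R", "R", "R", "G", "R"]

chained = (face[4] == face[5] == face[6] == face[2]
           == face[8] == face[1] == face[3])
helper = all_same(face[i] for i in (4, 5, 6, 2, 8, 1, 3))

print(chained, helper)  # True True
```

Index 7 is deliberately different and deliberately not tested, as in the original line.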

>
> [...]
>> It's easy to add syntax in a new language. And it really isn't hard
>> in an established language.
>> Where do you think all those new features in sucessive C standards
>> come from? It is specific implementations adding extensions.
>
>     Most of them came from Algol!  Each new standard thus
> far has added something from Algol, and AFAIR nothing that is
> Algol-like has ever been taken away.

Examples? I can't think of any Algol features in C, unless you're
talking about very early days.


> [In view of discussions
> with Brian Kernighan,

I guess you didn't bring up type declaration syntax or printf formats!

>> So &&L in gnu C is special syntax to take the address of a label.
>> (Needed because labels live in their own namespace, and so &L would
>> be ambiguous.)
>
>     It's not /needed/.  It's a way of creating botches.
> Lots of early languages had label variables, and associated
> paraphernalia, and gradually mainstream languages dropped
> them.  I'm not a subscriber to "all labels are evil" [there
> was one, and an elided "GOTO", in the code I showed], but
> there is very much less need for them today than there was
> [or seemed to be] in the '50s and '60s.

A decade ago I was playing with 4 kinds of dispatch loops inside
interpreters. They were based on:

Function-tables, switches, label-tables and assembly+threaded code.

These are in increasing order of performance. For some reason which I
don't quite get, using a table of label addresses was faster than
switch. In fact, the fastest dispatch method using pure HLL code.

This required the host language to have label address and label variables.
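
Only the first of those four methods can be shown in a language without label addresses. A minimal function-table sketch in Python, with a made-up three-instruction bytecode (none of the names come from the interpreters mentioned here):

```python
# Hypothetical stack-machine handlers, for illustration only.
def op_push(stack, arg):
    stack.append(arg)

def op_add(stack, arg):
    y, x = stack.pop(), stack.pop()
    stack.append(x + y)

def op_mul(stack, arg):
    y, x = stack.pop(), stack.pop()
    stack.append(x * y)

# The function table: opcode -> handler.
HANDLERS = {"push": op_push, "add": op_add, "mul": op_mul}

def run(program):
    stack = []
    for opcode, arg in program:
        HANDLERS[opcode](stack, arg)  # one indirect call per instruction
    return stack

program = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]
print(run(program))  # [20]
```

A label table replaces the indirect call with an indirect jump inside a single function, which is the part that needs the label-address extension.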

At that time, CPython for Linux, which used gcc, relied on label-tables
to get an extra bit of performance. But CPython for Windows, which was
built with MSVC, didn't have that, because MSVC didn't have that extension.

So CPython on Windows was a bit slower than on Linux, because someone
decided that label pointers weren't worth having.


>> Eg. 'break' being overloaded; a*b meaning multiply a by b, OR declare
>> variable b of type 'pointer to a'.
>
>     That's the sort of thing that happens when a private
> language goes public before it is properly defined, and
> without proper critical scrutiny.

Lots of opportunity to get C fixed. But people who like C aren't
interested; every terrible misfeature is really a blessing!

In the 70s, Fortran also had formats of sorts for printing values. The
latest Fortran however lets you write:

print *, a, b, c

Meanwhile C still requires you to do:

printf("%lld %f %u\n", a, b, c);

or something along those lines, because it depends on the types of a, b
and c. Some languages are better at moving on!


>>>      When you say you've done it, you mean you've tweaked your
>>> compiler.  More important from my PoV is to explain it to users,
>>> inc re-writing the "User Manual" to include the new syntax.
>> That would just be a Note: you can only mix '= < <=' or '= > >='.
>
>     Note again that in Algol "chained" comparisons, if
> you choose to use them, are entirely private grief, and no
> change to any syntax is required.

The grief would be in having to write:

IF a=b AND b=c AND c=d THEN

or, in devising those operator overloads, which still doesn't give you a
nice syntax, and will restrict the types it will work on, instead of
just writing:

if a=b=c=d then

Dmitry A. Kazakov

Oct 31, 2021, 7:45:39 AM
On 2021-10-31 11:52, Bart wrote:

> Here's a line from an old program I found today (to do with a Rubik cube):
>
>   if face[4] = face[5] = face[6] = face[2] = face[8] = face[1] =
> face[3] then
>
> There are several ways of doing this without using chain comparisons,
> especially as all terms are indices into the same list, but when
> suddenly you have to test that 7 different things have the same value,
> this was the simplest way to do so at the time.
>
> So why not?

Because it is very marginal and for a sane person suggests a typo error.
A cleaner way would be to define a function like

All_Same (List, Ignore) -- Compare list elements ignoring one

Otherwise, you can do that in Ada, no problem:
----------------------------------------------
   type Chained_Result is record
      Partial : Boolean;
      Tail    : Integer;
   end record;
   function "=" (Left, Right : Integer) return Chained_Result is
   begin
      return (Left = Right, Right);
   end "=";
   function "=" (Left : Chained_Result; Right : Integer)
      return Chained_Result is
   begin
      return (Left.Partial and then Left.Tail = Right, Right);
   end "=";
   function "=" (Left : Chained_Result; Right : Integer)
      return Boolean is
   begin
      return (Left.Partial and then Left.Tail = Right);
   end "=";
----------------------------------------------

That is all you need to do this:

   a : Integer := 1;
   b : Integer := 2;
   c : Integer := 3;
begin
   Put_Line (Boolean'Image ((a = b) = c));

will print

FALSE

Note the parentheses. These are purely a syntax requirement, not
because the language could not handle:

a = b = c

It would be all OK, but relational operators sharing operands are not
allowed in Ada. You must separate them. Other examples are

a or b and c
-a**b
a**b**c

etc. Ada does not require you to remember 20+ levels of precedence.

And yes, you could make the above a generic package to work with any
integer type.
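
The same typed trick can be sketched in Python for readers who do not know Ada. Python cannot overload on the result type, so this version uses a conversion to bool instead of the third "=" function; the class name Chain and the helper start are assumptions:

```python
class Chain:
    """Carries the result so far plus the last operand, like Chained_Result."""

    def __init__(self, partial, tail):
        self.partial = partial
        self.tail = tail

    def __eq__(self, right):
        # Same short-circuit as "and then": compare only while still true.
        return Chain(self.partial and self.tail == right, right)

    def __bool__(self):
        return self.partial

def start(value):
    """Wrap the first operand so that the overloaded == takes over."""
    return Chain(True, value)

a, b, c = 1, 2, 3
print(bool((start(a) == b) == c))  # False
print(bool((start(2) == 2) == 2))  # True
```

The explicit start() call is the price of not having the Integer-to-Chained_Result overload of the Ada version.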

The problem of handling chained comparisons syntactically (as a macro)
rather than typed within the normal/sane syntax is that you will have
numerous barriers the parser could not handle. Starting with parentheses:

a = (b) = c

These you could work around. But what about chaining mixed types:

b := (x, y, z); -- List of elements

a = b = c

It is easy to handle in Ada since you have proper objects for which you
can define the equality operator you wanted. Macros cannot do that.

Bart

Oct 31, 2021, 10:32:45 AM
On 31/10/2021 11:45, Dmitry A. Kazakov wrote:
> On 2021-10-31 11:52, Bart wrote:
>
>> Here's a line from an old program I found today (to do with a Rubik
>> cube):
>>
>>    if face[4] = face[5] = face[6] = face[2] = face[8] = face[1] =
>> face[3] then
>>
>> There are several ways of doing this without using chain comparisons,
>> especially as all terms are indices into the same list, but when
>> suddenly you have to test that 7 different things have the same value,
>> this was the simplest way to do so at the time.
>>
>> So why not?
>
> Because it is very marginal and for a sane person suggests a typo error.

How about an example like this:

if hsample2 = vsample2 = hsample3 = vsample3 and
hsample1 <= 2 and vsample1 <=2 then

(The last part of that looks like a candidate for one of your (hsample1,
vsample1) <= 2 ideas. But most likely syntaxes end up looking worse than
the original.)


> A cleaner way would be to define a function like
>    All_Same (List, Ignore)  -- Compare list elements ignoring one

This is overkill for just 3 (or 4) elements instead of two.

100 elements, then yes you'd use list processing and other techniques,
since the elements wouldn't be individually enumerated anyway.


> Otherwise, you can do that in Ada, no problem:

The problem is that this is all quite complicated, and requires some
extra skills, especially if you want to work with any types.

I favour building it into a language. That also requires special skills,
but that is only required of the implementer.

My version of it probably added 100 lines of straightforward code to
the static code compiler (and 60 to the dynamic one). It also works for
arbitrary types and other comparison operators.

You don't need overload features, nor do you have to implement the feature
yourself.
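
For comparison, a minimal sketch in Python, which has chained comparisons
built in. The helper name `all_same` echoes the function suggested above, but
its single-argument form (no `Ignore` parameter) and the sample values are
assumptions made for illustration:

```python
def all_same(items):
    """Return True when every element equals the first (True for an empty input)."""
    it = iter(items)
    try:
        first = next(it)
    except StopIteration:
        return True
    return all(x == first for x in it)


# A few individually named terms: the built-in chain reads naturally.
h2, v2, h3, v3 = 1, 1, 1, 1
chain_ok = h2 == v2 == h3 == v3

# Many terms already held in a list: a helper scales better.
face = [7] * 9
list_ok = all_same(face)
```

Both forms give the same answer; the choice is mostly about whether the terms
are enumerated by hand or already live in a collection.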

Dmitry A. Kazakov

Oct 31, 2021, 11:43:39 AM
On 2021-10-31 15:32, Bart wrote:
> On 31/10/2021 11:45, Dmitry A. Kazakov wrote:
>> On 2021-10-31 11:52, Bart wrote:
>>
>>> Here's a line from an old program I found today (to do with a Rubik
>>> cube):
>>>
>>>    if face[4] = face[5] = face[6] = face[2] = face[8] = face[1] =
>>> face[3] then
>>>
>>> There are several ways of doing this without using chain comparisons,
>>> especially as all terms are indices into the same list, but when
>>> suddenly you have to test that 7 different things have the same
>>> value, this was the simplest way to do so at the time.
>>>
>>> So why not?
>>
>> Because it is very marginal and for a sane person suggests a typo error.
>
> How about an example like this:
>
>     if hsample2 = vsample2 = hsample3 = vsample3 and
>        hsample1 <= 2 and vsample1 <=2 then
>
> (The last part of that looks like a candidate for one of your (hsample1,
> vsample1) <= 2 ideas. But most likely syntaxes end up looking worse than
> the original.)

I would split that into several tests commenting on what is going on.
Long formulae are subject to careful revision. Mathematically, the above
looks like an expression of some sort of symmetry. Surely the problem
space has a term for that and I expect to see it in the program, e.g. as
a function call. That is the difference between contrived examples and
real-life programs.

> The problem is that this is all quite complicated, and requires some
> extra skills, especially if you want to work with any types.

If you do that frequently you have the skill, if you do not, then you do
not need it.

> I favour building it into a language. That also requires special skills,
> but that is only required of the implementer.

No, it makes the language unsafe. Because the chances are 99% that a=b=c
is just an error. So I prefer the language rejecting it straight away
without consideration of any types or operations involved.

Bart

Oct 31, 2021, 12:11:13 PM
Below is the context for that example.

This checks a jpeg file configuration before choosing a suitable handler
for the rest of the file.

h/vsample1/2/3 are the horizontal/vertical sampling rates for the Y, U and V channels;
the latter two must match each other.

The comptype codes 1/2/3 mean Y/U/V, and this tests they are in that
order and for those channels.

Here I only handle 3 channels, Y/U/V, with certain sampling combinations.

---------------------------
function loadscan(fs,hdr)=
    initbitstream(fs)

    (vsample1,vsample2,vsample3) := hdr.vsample
    (hsample1,hsample2,hsample3) := hdr.hsample
    (comptype1,comptype2,comptype3) := hdr.comptype

    pimage := nil
    case hdr.ncomponents
    when 1 then
        abort("loadmono")
    when 3 then
        if comptype1<>1 or comptype2<>2 or comptype3<>3 then
            abort("comptype?")
        fi
        if hsample2=vsample2=hsample3=vsample3 and
                hsample1<=2 and vsample1<=2 then
            pimage := loadcolour(fs,hdr,hsample1,vsample1)
        else
            println hsample1,vsample1,hsample2,vsample2,hsample3,vsample3
            abort("Unknown sampling")
        fi
    else
        abort("ncomp")
    esac

    return pimage
end

Dmitry A. Kazakov

Oct 31, 2021, 12:59:41 PM
On 2021-10-31 17:11, Bart wrote:
> On 31/10/2021 15:43, Dmitry A. Kazakov wrote:

> Below is the context for that example.
>
> This checks a jpeg file configuration before choosing a suitable handler
> for the rest of the file.
>
> h/vsample1/2/3 are the horizontal/vertical sampling rates for the Y, U and V channels;
> the latter two must match each other.

Then you do need separate checks in order to give meaningful error
message on what exactly is wrong. Unless you missed something, because
it is unusual to have that much redundancy without any use.
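
A minimal Python sketch of what such separate checks might look like. The
function `check_sampling` and its message texts are hypothetical; only the
three conditions are taken from the code quoted above:

```python
def check_sampling(hsample, vsample):
    """Return a list of human-readable problems; an empty list means acceptable.

    hsample and vsample are 3-element sequences for the Y, U and V channels.
    """
    h1, h2, h3 = hsample
    v1, v2, v3 = vsample
    problems = []
    # U and V must all share one sampling factor.
    if not (h2 == v2 == h3 == v3):
        problems.append(f"U/V sampling mismatch: h=({h2},{h3}) v=({v2},{v3})")
    # Y may be sampled at most 2x in either direction.
    if h1 > 2:
        problems.append(f"Y horizontal sampling too large: {h1}")
    if v1 > 2:
        problems.append(f"Y vertical sampling too large: {v1}")
    return problems


ok = check_sampling((2, 1, 1), (2, 1, 1))
bad = check_sampling((4, 1, 1), (2, 1, 2))
```

Note that the chained equality still does the work inside the first check;
the split only affects how failures are reported.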

Andy Walker

Nov 1, 2021, 7:00:33 PM
On 31/10/2021 10:52, Bart wrote:
[I wrote, re Algol:]
>> [...] Your program is embedded in an
>> environment in which lots of things are defined for you, but you
>> don't have to stick with the normal definitions.  It will all be
>> visible in your code, so it's up to you whether you choose to
>> confuse yourself or not.
> I don't buy that. You will confuse yourself, and anyone who tries to
> read, understand, modify or port your program. And there will be
> problems mixing code that has applied different semantics to the same
> syntax.

With respect, that's just silly. You aren't forced to define
or use your own operators; the standard ones are broadly the same as
those in every other normal language. So are the priorities, and the
standard functions. If a skilled programmer /chooses/ to provide
some new operators [and aside from deliberate attempts to obfuscate
for competition purposes, and such like], then it is to make code
/clearer/. For example, if you happen to be doing matrix algebra,
it is likely to be much clearer if you write new operators rather
than new functions, so that you can reproduce quite closely normal
mathematical notations. Code will only be "mixed" if several
people write different bits of a project /without/ agreeing the
specifications of their own bits; that is a problem for /any/
large project, and nothing to do with Algol. You can write code
that is difficult to "read, understand, modify or port" in any
non-trivial language; again, nothing special about Algol, more to
do with unskilled programmers following dubious practices.

> Some things I think needs to be defined by the language.

Again, the things that it is normal to define are perfectly
well defined in Algol. It's just easier to use existing syntax on
new operators and functions than it is in most other languages.

> I do very little in the way of list processing, but it's handled
> mostly with user-functions, and I do not support chained comparisons.
> However, they are in the standard library.

If you don't "support" them then what do you mean by "in the
standard library"? Is it not your own language, entirely under your
own control and providing what [and only what] you yourself support?

> Something like your '0 < []int (a,b,c)' would be written as:
> mapsv(('<'), 0, (a,b,c))
> with a vector result.

Is that supposed to be clearer? What is the specification
for your "mapsv" function, and what am I supposed to do if I want
something slightly different? [My Algol snippet was entirely user-
written, and could easily be tweaked any way I chose, all there and
visible in the code. If you want something different, take it and
tweak it your own way -- entirely up to you.]

>> [...] Each new [C] standard thus
>> far has added something from Algol, and AFAIR nothing that is
>> Algol-like has ever been taken away.
> Examples? I can't think of any Algol features in C, unless you're
> talking about very early days.

Early days were important for C, as when there was only K&R
C, and C was "what DMR's compiler does", it was easy to change the
language, and there are a fair number of changes between the first
versions and 7th Edition Unix. More recently, they have added
type "Bool", dynamic arrays, stronger typing, parallel processing,
complex numbers, anonymous structures and unions, possible bounds
checking, "long long" types and related ideas, mixed code and
declarations, better string handling, and doubtless other things
I've forgotten.

[...]
>>> Eg. 'break' being overloaded; a*b meaning multiply a by b, OR declare
>>> variable b of type 'pointer to a'.
>>      That's the sort of thing that happens when a private
>> language goes public before it is properly defined, and
>> without proper critical scrutiny.
> Lots of opportunity to get C fixed. But people who like C aren't
> interested; every terrible misfeature is really a blessing!

You can't fix "break" or "a*b"; by the time of 7th Edition,
it was already too late, there were thousands of users and hundreds of
thousands of programs, and too much would have broken if you changed
such fundamental syntax. So almost all changes since have been either
new facilities that could not have been written in correct earlier C,
or a tightening up on things that were previously ill-defined. Again,
this is a difference between private languages and major ones. If you
want a reasonable change in C, the way to achieve that is to implement
it in "gcc" or similar, show that it works without breaking anything,
and put it to the committee. Most grumblies aren't prepared to put in
the hard work and/or don't understand the problems with their ideas.

[...]
>>      Note again that in Algol "chained" comparisons, if
>> you choose to use them, are entirely private grief, and no
>> change to any syntax is required.
> The grief would be in having to write:
>   IF a=b AND b=c AND c=d THEN
> or, in devising those operator overloads, which still doesn't give
> you a nice syntax, and will restrict the types it will work on,
> instead of just writing:
>   if a=b=c=d then

Allowing "a=b=c=d" shows that "a=b" means one thing if it
is "stand alone" and something quite different if it is a left
operand. You're very good at saying "my language allows XXX" for
all manner of interesting and perhaps even desirable "XXX", but
it's at the expense of specifying exactly what the related syntax
and semantics are. In Algol, expressions are parsed by the usual
rules of precedence and [L->R] associativity, after which "a o b"
for any operands "a" and "b" and any operator "o" means exactly
the same as "f(a,b)" where "f" is a function with two parameters
of the same types as those of "o" and the same code body as that
of "o", all completely visible in the RR plus your own code.
What's the corresponding rule in your language?

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Hummel

Bart

Nov 1, 2021, 9:38:59 PM
On 01/11/2021 23:00, Andy Walker wrote:
> On 31/10/2021 10:52, Bart wrote:
> [I wrote, re Algol:]
>>> [...] Your program is embedded in an
>>> environment in which lots of things are defined for you, but you
>>> don't have to stick with the normal definitions.  It will all be
>>> visible in your code, so it's up to you whether you choose to
>>> confuse yourself or not.
>> I don't buy that. You will confuse yourself, and anyone who tries to
>> read, understand, modify or port your program. And there will be
>> problems mixing code that has applied different semantics to the same
>> syntax.
>
>     With respect, that's just silly.  You aren't forced to define
> or use your own operators;  the standard ones are broadly the same as
> those in every other normal language.  So are the priorities, and the
> standard functions.  If a skilled programmer /chooses/ to provide
> some new operators [and aside from deliberate attempts to obfuscate
> for competition purposes, and such like], then it is to make code
> /clearer/.

Sorry, but I had a lot of trouble understanding your A68 example. For
example, part of it involved arrays, which I thought was some extra
ability you'd thrown in, but I think now may actually be necessary to
implement that feature. [I still don't know...]

It reminds me of posts in comp.lang.c where people claim some new
feature is not necessary, because it is so 'straightforward' to
half-implement some terrible version of it via impenetrable
meta-programming.


> For example, if you happen to be doing matrix algebra,
> it is likely to be much clearer if you write new operators rather
> than new functions, so that you can reproduce quite closely normal
> mathematical notations.

It could well be clearer, AFTER you've had to implement it via code that
is a lot less clear than ordinary user-code.

(My first scripting language had application-specific types including 3D
transformation matrices and 3D points. There, if A and B are matrices,
and P is a point, then:

C := A*B # multiplies matrices
Q := C*P # use C to transform P

/That/ was clear, with the bonus that the user didn't need to implement
a big chunk of the language themself!)

> Code will only be "mixed" if several
> people write different bits of a project /without/ agreeing the
> specifications of their own bits;  that is a problem for /any/
> large project, and nothing to do with Algol.  You can write code
> that is difficult to "read, understand, modify or port" in any
> non-trivial language;  again, nothing special about Algol, more to
> do with unskilled programmers following dubious practices.

If the effect is to create lots of mini, overlapping dialects or
extensions, then that is a problem. There will be assorted custom
preludes to drag in too.

>> I do very little in the way of list processing, but it's handled
>> mostly with user-functions, and I do not support chained comparisons.
>> However, they are in the standard library.
>
>     If you don't "support" them then what do you mean by "in the
> standard library"?

They are implemented as user-functions which are placed in a library
that comes with the language.


>> Something like your '0 < []int (a,b,c)' would be written as:
>>    mapsv(('<'), 0, (a,b,c))
>> with a vector result.
>
>     Is that supposed to be clearer?

I'm not pretending my language properly supports list operations. That
would be a very different kind of language.

My mapsv is clearer for me since the 's' and 'v' indicate that it takes
a 'scalar' and a 'vector' operand. ('scalar' here just means it is
treated as a single value, and 'vector' that it is a multiple value.)

Then the rules for that removes any confusion about the result:

mapsv(op, A, B): scalar op vector -> vector



> What is the specification
> for your "mapsv" function, and what am I supposed to do if I want
> something slightly different?  [My Algol snippet was entirely user-
> written, and could easily be tweaked any way I chose, all there and
> visible in the code.  If you want something different, take it and
> tweak it your own way -- entirely up to you.]

-------------------------
global function mapsv(op,as,b)=
#Apply op or function between elements of single value as and vector b
    c::=makeempty(b)
    forall i,x in b do
        c[i]:=mapss(op,as,x)
    od
    return c
end
-------------------------

(Hmm, maybe 's' stands for 'single' not 'scalar'!)

mapss is an actual built-in function which is awkward in user-code,
since 'op' can be any built-in binary op, or any user-function taking
two parameters.

The language has poorly developed operator overloading which is little used.

I don't think you can write my mapss() function in Algol68. Example:

a:=12
b:=20

println mapss( (random(1..4)|(+), (-), (*)| (/)), a, b)

Output is 240.
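
A rough Python equivalent, as a sketch only: `mapss` here is a hypothetical
stand-in built on the standard `operator` module, and all four operators are
applied in turn rather than picking one at random, so the result is
reproducible:

```python
import operator


def mapss(op, a, b):
    """Apply a binary operator, or any two-argument function, to two single values."""
    return op(a, b)


# The same four candidates as in the example above: + - * /
ops = (operator.add, operator.sub, operator.mul, operator.truediv)

a, b = 12, 20
results = [mapss(op, a, b) for op in ops]
```

The third entry of `results` is 240, which corresponds to the multiplication
case shown in the output above.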

>> Examples? I can't think of any Algol features in C, unless you're
>> talking about very early days.
>
>     Early days were important for C, as when there was only K&R
> C, and C was "what DMR's compiler does", it was easy to change the
> language, and there are a fair number of changes between the first
> versions and 7th Edition Unix.  More recently, they have added
> type "Bool", dynamic arrays, stronger typing, parallel processing,
> complex numbers, anonymous structures and unions, possible bounds
> checking, "long long" types and related ideas, mixed code and
> declarations, better string handling, and doubtless other things
> I've forgotten.

OK, but its implementations of those are crude and remain low level.

People mainly use them to write poorer code that is harder to follow.


> You can't fix "break" or "a*b";  by the time of 7th Edition,
> it was already too late,

It's never too late. An alternate to 'break' could have been introduced
for switch, eg. have both 'break' and 'breaksw'; eventually only
'breaksw' is allowed, and 'break' is an error. Then, further along,
'break' inside switch is allowed to be used for loop-break.

However, try taking a program from 1980 and try compiling it now.
Actually, take a program from 2021 and try building it with two
different compilers.

>> The grief would be in having to write:
>>    IF a=b AND b=c AND c=d THEN
>> or, in devising those operator overloads, which still doesn't give
>> you a nice syntax, and will restrict the types it will work on,
>> instead of just writing:
>>    if a=b=c=d then
>
>     Allowing "a=b=c=d" shows that "a=b" means one thing if it
> is "stand alone" and something quite different if it is a left
> operand.  You're very good at saying "my language allows XXX" for
> all manner of interesting and perhaps even desirable "XXX", but
> it's at the expense of specifying exactly what the related syntax
> and semantics are.

I copied the feature from Python. You'd need to ask Guido what it means!

> In Algol, expressions are parsed by the usual
> rules of precedence and [L->R] associativity, after which "a o b"
> for any operands "a" and "b" and any operator "o" means exactly
> the same as "f(a,b)" where "f" is a function with two parameters
> of the same types as those of "o" and the same code body as that
> of "o", all completely visible in the RR plus your own code.
> What's the corresponding rule in your language?

If I write:

A = B = C

in static code, then it works something like this:

* The dominant type of A, B, C is determined

* A, B, C are converted to that type as needed, as values A', B', C'
(This is for numeric types; "=" also works for exactly compatible
arbitrary types with no conversions applied)

* The expression returns True when A', B', C' have identical values
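
For reference, the Python rule the feature was borrowed from is that
`a op1 b op2 c` behaves like `a op1 b and b op2 c`, except that `b` is
evaluated at most once and evaluation stops at the first false comparison.
A small sketch that makes the evaluation order visible (the `val` helper is
hypothetical, and this shows Python's behaviour, not necessarily the
static-language rule listed above):

```python
calls = []


def val(name, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(name)
    return value


# Every comparison holds, so all three operands are evaluated once, left to right.
r1 = val("a", 1) < val("b", 5) < val("c", 9)
order1 = list(calls)

calls.clear()

# The first comparison fails, so the third operand is never evaluated.
r2 = val("a", 5) < val("b", 1) < val("c", 9)
order2 = list(calls)
```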


Bart

Nov 2, 2021, 3:01:24 PM
On 02/11/2021 01:38, Bart wrote:
> On 01/11/2021 23:00, Andy Walker wrote:

>>> Something like your '0 < []int (a,b,c)' would be written as:
>>>    mapsv(('<'), 0, (a,b,c))
>>> with a vector result.
>>
>>      Is that supposed to be clearer?
>
> I'm not pretending my language properly supports list operations. That
> would be a very different kind of language.
>
> My mapsv is clearer for me since the 's' and 'v' indicate that it takes
> a 'scalar' and a 'vector' operand. ('scalar' here just means it is
> treated as a single value, and 'vector' that it is a multiple value.)

I think this is how it works in the 'K' programming language (a
derivative of APL):

0 < a b c

This produces a vector. Combining the bools into a single value is
possibly done like this (I don't have a version to try):

& 0 < a b c

I don't know about precedences (except it works right to left iirc). On
mine, it would not be hard to define 'and' (&) to work as a unary
operator whose operand is expected to be a list:

and (a, b, c)
and L # where L is a list

Doing this for "<" is trickier:

0 < (a, b, c)

because while this example seems obvious, it could also look like this:

x < L

Here, x could be "C" and L could be "ABCDE", so that "C" < "ABCDE" is
false (comparing two strings). But you might want L to be treated as a
list of one-character strings, so that the result is (0, 0, 0, 1, 1).

I don't know how K sorts this out. Using my mapsv however:

x := "C"
L := "ABCDE"
print mapsv((<), x, L)

This correctly shows (0,0,0,1,1). So it might be ugly, but you have
better confidence that it does what you want.

The ugliness can be mitigated a little via a macro:

macro lessv(a,b) = mapsv((<),a,b)

print lessv(x, L)
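
The same distinction can be sketched in Python, assuming hypothetical `mapsv`
and `lessv` helpers modelled on the ones above. A plain `<` compares the two
strings as wholes, while the mapped form compares against each character:

```python
import operator


def mapsv(op, single, vector):
    """Apply op between one single value and each element of a sequence."""
    return [op(single, x) for x in vector]


def lessv(a, b):
    """Shorthand for the '<' case, like the macro above."""
    return mapsv(operator.lt, a, b)


x = "C"
L = "ABCDE"

whole = x < L                              # one string-to-string comparison
per_char = [int(f) for f in lessv(x, L)]   # one comparison per character
```

Because the caller chooses between `x < L` and `lessv(x, L)` explicitly, there
is no need for the language to guess which reading was intended.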

James Harris

Nov 3, 2021, 11:58:23 AM
On 27/10/2021 01:15, Bart wrote:
> On 23/10/2021 15:39, James Harris wrote:
>
>
>> Though it's not just about ranges. For example,
>>
>>    low == mid == high
>>
>> Some such as
>>
>>    x < y > z
>>
>> is not, by itself, so intuitive. But as you say, they can all be read
>> as written with "and" between the parts. In this case,
>>
>>    x < y && y > z
>>
>> I haven't yet decided whether or not to include composite comparisons
>> but I can see readability benefits.
>
> If you don't have chained comparisons, then you have to decide what a <
> b < c or a = b = c mean. Where there isn't an intuitive alternative
> meaning, then you might as well use that syntax to allow chains.

Good point. Should it even be possible to apply magnitude comparisons
(i.e. those including < and > symbols) to boolean values?

Assuming it should be, if one wanted to evaluate

a < b

and then ask if the boolean result was 'less than' c, without chained
comparisons it would be as simple as

a < b < c

but how would one express that in a language which supported chained
comparisons? I think my preference would be

bool(a < b) < c

where bool() would break up the comparison chain by having the form of a
function call but would have no effect on the value. That would work on
your equals example, too: either of

bool(a == b) == c
a == bool(b == c)
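
Python, as one existing data point, takes the simpler route: either
parentheses or an explicit `bool()` call ends the chain, and the intermediate
result is then compared as a value. A short sketch with arbitrary sample
values:

```python
a, b, c = 1, 2, 2

chained = a < b < c          # means (a < b) and (b < c)
grouped = (a < b) < c        # compares the boolean True with 2
explicit = bool(a < b) < c   # same as the grouped form, spelled out
```

Here `chained` is false because `b < c` fails, while the other two are true
because `True` compares as 1, which is less than 2.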

There's another potential problem with chained comparisons, though. Consider

a < b < c < d

What if a programmer wanted to evaluate the b < c part first?

>
> But you might want to restrict it so that the sequence of comparisons
> are either all from (< <= =) or (> >= =).
>
> So that if you were to plot the numeric values of A op B op C for
> example when the result is True, the gradient would always be either >=
> 0 or <= 0; never mixed; ie no ups then downs.
>
> (Which also allows you to infer the relationship of A and C).

When you see a chained comparison do you try to read it as a single
expression? Yes, one can if the operators are all in the same direction,
as you indicate, but maybe in general it's better to read the
comparisons as individual operations.

>
> Then, x < y > z would not be allowed (it's too hard to grok);

Is it really any harder to grok than

x < y && y > z

and if so, why not just read the former as the latter?

> neither
> would x != y != z, even if it is equivalent to:
>
>   x != y and y != z
>
> (That is better written as y != x and y != z; I would write it as y not
> in [x, z])

It would seem inconsistent to allow

a == b == c

and yet prohibit

a != b != c



--
James Harris

James Harris

Nov 3, 2021, 1:06:13 PM
On 01/11/2021 23:00, Andy Walker wrote:

...

> If a skilled programmer /chooses/ to provide
> some new operators [and aside from deliberate attempts to obfuscate
> for competition purposes, and such like], then it is to make code
> /clearer/.

Not sure about that. Defining new operators is probably a very poor
idea, making code less readable rather than more so.

For example, if a programmer defines an 'operator' called XX to be used
in function syntax such as

XX(a, b)

then while the meaning of XX may be clear to the original implementor
someone else reading the code needs to learn what XX means in order to
understand what's involved.

As if that's not bad enough things can get worse. If a programmer
defines a brand new operator called >=< as in

a and b >=< c + 1

then there is not even a clue in the source code as to the precedence of
the new operator relative to other operators.


--
James Harris

Bart

Nov 3, 2021, 1:54:14 PM
Yes, I use parentheses to break up a chain. For these two tests:

if a = b = c then fi
if (a = b) = c then fi

there are these two different ASTs:

- 1 if:
- - 1 cmpchain: keq keq
- - - 1 name: a
- - - 1 name: b
- - - 1 name: c
- - 2 block:

- 1 if:
- - 1 cmp: keq
- - - 1 cmp: keq
- - - - 1 name: a
- - - - 2 name: b
- - - 2 name: c
- - 2 block:

In my case I don't have a Bool type (not exposed via the type system
anyway), so the second evaluates a=b to 0 or 1, then compares that to 'c'.
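
Python's parser makes the same distinction, which can be checked with the
standard `ast` module. This is only an analogy to the dump above: a chain
becomes a single `Compare` node holding several operators, while the
parenthesised form nests one `Compare` inside another:

```python
import ast

chain = ast.parse("a == b == c", mode="eval").body
nested = ast.parse("(a == b) == c", mode="eval").body

# Chain: one node, two operators, two right-hand operands (b and c).
chain_shape = (type(chain).__name__, len(chain.ops), len(chain.comparators))

# Nested: outer node has one operator, and its left operand is itself a comparison.
nested_shape = (type(nested).__name__, len(nested.ops), type(nested.left).__name__)
```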


>   bool(a == b) == c
>   a == bool(b == c)
>
> There's another potential problem with chained comparisons, though.
> Consider
>
>   a < b < c < d
>
> What if a programmer wanted to evaluate the b < c part first?

Parentheses again? That's what they're for!



>> Then, x < y > z would not be allowed (it's too hard to grok);
>
> Is it really any harder to grok than
>
>   x < y && y > z

I can read the latter like I can 'a < b' and 'c < d'; as independent
tests. But combining them suggests some interrelationships that aren't
really there.

What's the relationship between x and z? There isn't any, other than
they're both less than y (that is, if the expression is True). So that
does form a sort of weak pattern, one that DAK might exploit as:

(x, z) < y

> It would seem inconsistent to allow
>
>   a == b == c
>
> and yet prohibit
>
>   a != b != c

Yes it looks inconsistent. But what does the latter mean?

It is possible to mechanically translate this as 'a != b and b != c',
but I believe the abbreviated version could be confusing; it looks like
it is testing whether a, b, c are all different from each other, but a True
result doesn't mean that: a could have the same value as c.

(For them all different, you'd need 'a != b != c != a'. Easier I think
to do 'not (a = b = c)'.)

Dmitry A. Kazakov

Nov 3, 2021, 3:56:14 PM
On 2021-11-03 18:54, Bart wrote:
> On 03/11/2021 15:58, James Harris wrote:

>> and yet prohibit
>>
>>    a != b != c
>
> Yes it looks inconsistent. But what does the latter mean?

In fact this is very frequently used in mathematics, as proposition, of
course, e.g.

∀x≠y≠z

And, also, carefully observe that while equality (=) is transitive
inequality (/=) is not. So

a != b != c

means this

(a != b) and (b != c) and (a != c)

Follow the science! (:-))

> It is possible to mechanically translate this as 'a != b and b != c',
> but I believe the abbreviated version could be confusing; it looks like
> it is testing whether a, b, c are all different from each other, but a True
> result doesn't mean that: a could have the same value as c.
>
> (For them all different, you'd need 'a != b != c != a'. Easier I think
> to do 'not (a = b = c)'.)

That is not the same.

not (a = b = c) <=>
<=> not ((a = b) and (b = c)) <=>
<=> not (a = b) or not (b = c)
<=> (a != b) or (b != c)

Note, "or" instead of "and" (De Morgan's rule flips or/and) and missing
the third term (a != c).
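
The identity can be confirmed by brute force. A minimal Python sketch,
assuming a small integer domain is representative enough for equality tests:

```python
from itertools import product

values = range(3)

# not (a = b = c)  <=>  (a != b) or (b != c), for every triple in the domain
demorgan_holds = all(
    (not (a == b == c)) == ((a != b) or (b != c))
    for a, b, c in product(values, repeat=3)
)
```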

Bart

Nov 3, 2021, 4:34:18 PM
On 03/11/2021 19:56, Dmitry A. Kazakov wrote:
> On 2021-11-03 18:54, Bart wrote:
>> On 03/11/2021 15:58, James Harris wrote:
>
>>> and yet prohibit
>>>
>>>    a != b != c
>>
>> Yes it looks inconsistent. But what does the latter mean?
>
> In fact this is very frequently used in mathematics, as proposition, of
> course, e.g.
>
>    ∀x≠y≠z
>
> And, also, carefully observe that while equality (=) is transitive
> inequality (/=) is not. So
>
>    a != b != c
>
> means this
>
>    (a != b) and (b != c) and (a != c)
>
> Follow the science! (:-))

This came up in my post as needing:

a != b != c != a

You're suggesting that a pattern like this:

a != b != c

which would otherwise not be allowed (according to my preference),
should be treated as a circular set of compares like the above?

That would be an idea, although introducing more inconsistency.


>> It is possible to mechanically translate this as 'a != b and b != c',
>> but I believe the abbreviated version could be confusing; it looks
>> like it is testing whether a, b, c are all different from each other, but
>> a True result doesn't mean that: a could have the same value as c.
>>
>> (For them all different, you'd need 'a != b != c != a'. Easier I think
>> to do 'not (a = b = c)'.)
>
> That is not the same.

Yes, you're right; my mistake. 'not (a=b=c)' would mean 'not all the
same', rather than 'all different'.

Ike Naar

Nov 3, 2021, 4:41:50 PM
On 2021-11-03, Bart <b...@freeuk.com> wrote:
> On 03/11/2021 15:58, James Harris wrote:
>> It would seem inconsistent to allow
>>
>>   a == b == c
>>
>> and yet prohibit
>>
>>   a != b != c
>
> Yes it looks inconsistent. But what does the latter mean?
>
> It is possible to mechanically translate this as 'a != b and b != c',
> but I believe the abbreviated version could be confusing; it looks like
> it is testing whether a, b, c are all different from each other, but a True
> result doesn't mean that: a could have the same value as c.
>
> (For them all different, you'd need 'a != b != c != a'. Easier I think
> to do 'not (a = b = c)'.)

'not (a = b = c)' might be easier, but it is not the same as (a != b != c)
for any of the given interpretations of (a != b != c).

Take, for example, (a,b,c) = (2,2,3).

'not (2 = 2 = 3)' = 'not ((2 = 2) and (2 = 3))'
= 'not (true and false)' = 'not false' = true.

If 'a != b != c' is interpreted as '(a != b) and (b != c)' then
'2 != 2 != 3' = '(2 != 2) and (2 != 3)' = 'false and true' = false.

If 'a != b != c' is interpreted as 'a, b, c all different' then
'2 != 2 != 3' = '2, 2, 3 all different' = false.
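
The three readings can be tabulated mechanically. A minimal Python sketch with
hypothetical helper names; the middle column is what Python's own
`a != b != c` computes:

```python
def not_all_same(a, b, c):
    return not (a == b == c)


def adjacent_differ(a, b, c):
    return a != b and b != c


def all_different(a, b, c):
    # Assumes hashable values; three distinct values give a set of size 3.
    return len({a, b, c}) == 3


rows = {
    t: (not_all_same(*t), adjacent_differ(*t), all_different(*t))
    for t in [(2, 2, 3), (1, 2, 1), (1, 2, 3)]
}
```

The triple (2, 2, 3) separates the first reading from the other two, and
(1, 2, 1) separates the second from the third.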

James Harris

Nov 3, 2021, 4:55:55 PM
On 03/11/2021 17:54, Bart wrote:
> On 03/11/2021 15:58, James Harris wrote:
>> On 27/10/2021 01:15, Bart wrote:

...

> Yes, I use parentheses to break up a chain. For these two tests:
>
>     if a = b = c then fi
>     if (a = b) = c then fi
>
> there are these two different ASTs:
>
>     - 1 if:
>     - - 1 cmpchain: keq keq
>     - - - 1 name: a
>     - - - 1 name: b
>     - - - 1 name: c
>     - - 2 block:
>
>     - 1 if:
>     - - 1 cmp:  keq
>     - - - 1 cmp:  keq
>     - - - - 1 name: a
>     - - - - 2 name: b
>     - - - 2 name: c
>     - - 2 block:
>
> In my case I don't have a Bool type (not exposed via the type system
> anyway), so the second evaluates a=b to 0 or 1, then compares that to 'c'.

AISI in the presence of chained comparisons your second example would
have to be implemented as

bool(a == b) == c

For the reasoning, see below about parens.

>
>
>>    bool(a == b) == c
>>    a == bool(b == c)
>>
>> There's another potential problem with chained comparisons, though.
>> Consider
>>
>>    a < b < c < d
>>
>> What if a programmer wanted to evaluate the b < c part first?
>
> Parentheses again? That's what they're for!

In your earlier example parens change the semantics, don't they? In this
instance I was asking about keeping the meaning unchanged, just carrying
out the inner evaluation first. Imagine that evaluation of the first
term, a, has a side effect that is not wanted unless b < c.

ISTM that if a language is going to include chained comparisons then
there's a need for both approaches: explicit conversion of part of an
expression to a boolean, and ordering; and that parens should indicate
ordering without changing the semantics. Thus

a < b < c < d

would be the same as

(((a < b) < c) < d)

IOW the chain should extend through parens. This makes chained
comparisons not as simple for a programmer as we have been considering.

What do you think?

>
>
>
>>> Then, x < y > z would not be allowed (it's too hard to grok);
>>
>> Is it really any harder to grok than
>>
>>    x < y && y > z
>
> I can read the latter like I can 'a < b' and 'c < d'; as independent
> tests. But combining them suggests some interrelationships that aren't
> really there.

I don't see why it has to indicate a relationship between unadjacent
operands, unless the comparisons happen to imply that.

>
> What's the relationship between x and z? There isn't any, other than
> they're both less than y (that is, if the expression is True).

Yes. I don't see a need for any implication about x and z.

...

>> It would seem inconsistent to allow
>>
>>    a == b == c
>>
>> and yet prohibit
>>
>>    a != b != c
>
> Yes it looks inconsistent. But what does the latter mean?
>
> It is possible to mechanically translate this as 'a != b and b != c',

That's all and exactly what it would mean. Nothing more, AISI.


> but I believe the abbreviated version could be confusing; it looks like
> it is testing whether a, b, c are all different from each other, but a True
> result doesn't mean that: a could have the same value as c.
>
> (For them all different, you'd need 'a != b != c != a'.

Good point. I'd have written it as

a != b != c and a != c

but I think you are right.

> Easier I think
> to do 'not (a = b = c)'.)

That may be a different test.


--
James Harris

James Harris

Nov 3, 2021, 5:03:54 PM
On 03/11/2021 20:34, Bart wrote:

...

> You're suggesting that a pattern like this:
>
>   a != b != c
>
> which would otherwise not be allowed (according to my preference),
> should be treated as a circular set of compares like the above?
>
> That would be an idea, although introducing more inconsistency.

Yes, it would be an idea - a bad one! There's no basis for implying a
relationship between a and c in the above. As I keep saying to certain
others here, computing is not mathematics.


--
James Harris

Dmitry A. Kazakov

Nov 3, 2021, 6:44:24 PM
On 2021-11-03 21:34, Bart wrote:
> On 03/11/2021 19:56, Dmitry A. Kazakov wrote:
>> On 2021-11-03 18:54, Bart wrote:
>>> On 03/11/2021 15:58, James Harris wrote:
>>
>>>> and yet prohibit
>>>>
>>>>    a != b != c
>>>
>>> Yes it looks inconsistent. But what does the latter mean?
>>
>> In fact this is very frequently used in mathematics, as proposition,
>> of course, e.g.
>>
>>     ∀x≠y≠z
>>
>> And, also, carefully observe that while equality (=) is transitive
>> inequality (/=) is not. So
>>
>>     a != b != c
>>
>> means this
>>
>>     (a != b) and (b != c) and (a != c)
>>
>> Follow the science! (:-))
>
> This came up in my post as needing:
>
>   a != b != c != a
>
> You're suggesting that a pattern like this:
>
>   a != b != c
>
> which would otherwise not be allowed (according to my preference),
> should be treated as a circular set of compares like the above?

I do not suggest it, I state what it means where you borrowed it from.

a op b op c op d

means that x op y holds for all ordered pairs (x,y) from the ordered set
(a, b, c, d).

IFF op is transitive like = and < are, you can reduce the number of
pairs to only adjacent elements from the ordered set. Otherwise you must
try all n(n-1)/2 pairs.
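
The two readings can be put side by side in a short Python sketch. The helper names holds_adjacent and holds_all_pairs are invented for illustration; they are not from any language discussed in this thread.

```python
from itertools import combinations
import operator

def holds_adjacent(op, items):
    """True if op holds for each adjacent pair (sufficient when op is transitive)."""
    return all(op(x, y) for x, y in zip(items, items[1:]))

def holds_all_pairs(op, items):
    """True if op holds for every pair (x, y) with x before y: n(n-1)/2 tests."""
    return all(op(x, y) for x, y in combinations(items, 2))

values = [1, 2, 1]
adjacent_ne = holds_adjacent(operator.ne, values)     # neighbours differ
all_pairs_ne = holds_all_pairs(operator.ne, values)   # but first and last are equal

ordered = [1, 2, 3, 4]
adjacent_lt = holds_adjacent(operator.lt, ordered)
all_pairs_lt = holds_all_pairs(operator.lt, ordered)
pair_count = len(list(combinations(ordered, 2)))      # 4 * 3 / 2 pairs
```

For a transitive operator such as < the two functions always agree; for != they can differ, as the [1, 2, 1] case shows.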

> That would be an idea, although introducing more inconsistency.

What inconsistency?

Bart

Nov 3, 2021, 7:13:06 PM
I was wrong about using parentheses (see below).

> In this
> instance I was asking about keeping the meaning unchanged, just carrying
> out the inner evaluation first. Imagine that evaluation of the first
> term, a, has a side effect that is not wanted unless b < c.
>
> ISTM that if a language is going to include chained comparison that
> there's a need for both approaches: explicit conversion of part of an
> expression to a boolean and ordering, and that parens should indicate
> ordering without changing the semantics. Thus
>
>   a < b < c < d
>
> would be the same as
>
>   (((a < b) < c) < d)

> IOW the chain should extend through parens.

I don't agree. A chain must be linear, or it's not a chain (see my first
AST).

If you want each of a, b, c, d to be evaluated in a certain order, then
parentheses are not the way. Say the desired order is d, a, c, b. The
only way to guarantee that is to do:

td:=d; ta:=a; tc:=c; tb:=b
if ta < tb < tc < td

If you want that without that boilerplate code, then how will you tell
the language the order? (How will the compiler manage it?)

If, further, you want the /comparisons/ to be done in a certain order,
then that's harder. If the desired order is b<c then c<d then a<b, you'd
have to write it the normal way:

if b<c and c<d and a<b

You can't use a chain which implies a particular order, short-circuiting
in a similar manner to 'and' and 'or':

If a < b is false in 'a < b < c < d', then there's no point in doing the
rest. And especially no point in evaluating d first!

> This makes chained
> comparisons not as simple for a programmer as we have been considering.
>
> What do you think?

Evaluation order is a different subject. How would you enforce either a
first or b first even on something as simple as:

a + b
f(a, b)

> (((a < b) < c) < d)

This doesn't guarantee that a<b is evaluated before c. It just means it
does (a<b) and c, not a and (b<c).

It also means that this is not a comparison chain.
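
Python behaves the same way on this point: parentheses turn the chain into an ordinary nested comparison of a boolean against the next operand. A small sketch with illustrative values:

```python
a, b, c = 5, 7, 6

chained = a < b < c     # a real chain: (5 < 7) and (7 < 6)
nested = (a < b) < c    # not a chain: True < 6, and True compares as 1

print(chained, nested)
```

Here the chain is false because 7 < 6 fails, while the parenthesised form is true because it only asks whether 1 < 6.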

Dmitry A. Kazakov

Nov 4, 2021, 5:22:12 AM
On 2021-11-04 00:13, Bart wrote:

> Evaluation order is a different subject. How would you enforce either a
> first or b first even on something as simple as:
>
>     a + b
>     f(a, b)

One method is to use closures for lazy evaluation. Consider declaration
of a short-cut multiplication

function "*" (Left : Integer; Right : Lazy Integer) return Integer;

The implementation will go as follows:

   function "*" (Left : Integer; Right : Lazy Integer) return Integer is
   begin
      if Left = 0 then
         return 0;
      else
         return Multiply (Left, Right.all);
      end if;
   end "*";

Lazy Integer is a closure, in effect a pointer to a function returning
the value.

This way you can enforce any desired order:

   function f (Left : Lazy Integer; Right : Lazy Integer)
      return Integer is
      First, Second : Integer;
   begin
      First := Right.all; -- Evaluate Right first
      Second := Left.all; -- Evaluate Left second
      ...
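
The same closure idea can be sketched in Python, where a zero-argument function stands in for "Lazy Integer". The names lazy, lazy_mul and the calls log are invented for illustration, and the body of f (a subtraction) is an assumption, since the original elides it.

```python
calls = []  # records the order in which lazy operands are actually evaluated

def lazy(name, value):
    """Wrap a value in a zero-argument closure that logs when it is forced."""
    def thunk():
        calls.append(name)
        return value
    return thunk

def lazy_mul(left, right):
    """Short-cut multiplication: right is only evaluated when left is non-zero."""
    if left == 0:
        return 0
    return left * right()

def f(left, right):
    """Force the right operand first, then the left, whatever the call site looks like."""
    second = right()
    first = left()
    return first - second

product = lazy_mul(0, lazy("r", 5))           # the closure is never called
calls_after_mul = list(calls)
difference = f(lazy("a", 10), lazy("b", 3))   # "b" is forced before "a"
calls_after_f = list(calls)
```

The callee, not the caller, decides if and when each operand is evaluated, which is the point of the technique.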

> >    (((a < b) < c) < d)
>
> This doesn't guarantee that a<b is evaluated before c.

Why should it be? You impose some imaginary requirements motivated only
by weaknesses of your compiler/parser. The mathematical proposition

a < b < c < d

has no evaluation aspect or order at all.

If you want to borrow it make it as close to the source as possible.
Otherwise, as it was suggested many times, just do not.

James Harris

Nov 4, 2021, 6:32:23 AM
On 03/11/2021 23:13, Bart wrote:
> On 03/11/2021 20:55, James Harris wrote:
>> On 03/11/2021 17:54, Bart wrote:
>>> On 03/11/2021 15:58, James Harris wrote:
>>>> On 27/10/2021 01:15, Bart wrote:

...

>>    a < b < c < d
>>
>> would be the same as
>>
>>    (((a < b) < c) < d)
>
>> IOW the chain should extend through parens.
>
> I don't agree. A chain must be linear, or it's not a chain (see my first
> AST).

That's good! Disagreement is key to advancement. :-)

Are you not, in this case, however, making the use of parens
inconsistent with their use elsewhere in the language? For example, in

a - (b - c)

the parens define a putative order for the subtraction operations and
the order matters.

>
> If you want each of a, b, c, d to be evaluated in a certain order, then
> parentheses are not the way. Say the desired order is d, a, c, b. The
> only way to guarantee that is to do:
>
>   td:=d; ta:=a; tc:=c; tb:=b
>   if ta < tb < tc < td

Interesting. Is that how you see the evaluation of chaining? I see it
differently. For me the trouble with your version is that it evaluates
some of the terms before it knows that it needs to - which is not what I
understand the chaining idea ought to do. AISI

a < b < c < d

should evaluate a then b, then if a < b is true evaluate c, then if b <
c is true evaluate d, then if c < d then the whole expression would be
true. Do you see it differently? I thought you agreed with shortcut
evaluation of the expression as a key benefit thereof but maybe not.

Either way, especially once arguments are arranged in a suitable order
for chaining they may not appear in source code in the order in which a
programmer needs them to be evaluated.
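
Python's chained comparisons follow this left-to-right, stop-at-first-failure order, and each middle term is evaluated only once. A minimal sketch that logs evaluation; the helper term and its values are invented for illustration:

```python
evaluated = []  # names of the terms, in the order they were evaluated

def term(name, value):
    """Return value, recording that this term was evaluated."""
    evaluated.append(name)
    return value

# a < b fails, so c and d are never evaluated.
failed = term("a", 1) < term("b", 0) < term("c", 5) < term("d", 9)
order_when_failed = list(evaluated)

evaluated.clear()

# Every comparison holds, and b and c each appear once in the log.
passed = term("a", 1) < term("b", 2) < term("c", 3) < term("d", 4)
order_when_passed = list(evaluated)
```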


>
> If you want that without that boilerplate code, then how will you tell
> the language the order? (How will the compiler manage it?)

Sorry, I can't work out what that paragraph is asking.


>
> If, further, you want the /comparisons/ to be done in a certain order,
> then that's harder. If the desired order is b<c then c<d then a<b, you'd
> have to write it the normal way:
>
>   if b<c and c<d and a<b

Yes, though that doesn't work if one of those terms needs to be
evaluated only once so it's not a valid translation.

>
> You can't use a chain which implies a particular order, short-circuiting
> in a similar manner to 'and' and 'or':

Why not?

>
> If a < b is false in 'a < b < c < d', then there's no point in doing the
> rest. And especially no point in evaluating d first!

Indeed!

>
>> This makes chained comparisons not as simple for a programmer as we
>> have been considering.
>>
>> What do you think?
>
> Evaluation order is a different subject. How would you enforce either a
> first or b first even on something as simple as:
>
>     a + b
>     f(a, b)

I don't understand. Why would that be a problem? I do, in fact,
define the semantic evaluation order. AISI that gives programmers the
guarantees they need that they will get the same results on different
machines and different compilers.

In practice a compiler /could/ evaluate operands in a different order
where it could be sure that that would not affect the semantics - which
would be most cases.

>
> >    (((a < b) < c) < d)
>
> This doesn't guarantee that a<b is evaluated before c. It just means it
> does (a<b) and c, not a and (b<c).
>
> It also means that this is not a comparison chain.
>

IYO...!


--
James Harris

Bart

Nov 4, 2021, 10:35:34 AM
On 04/11/2021 10:32, James Harris wrote:
> On 03/11/2021 23:13, Bart wrote:
>> On 03/11/2021 20:55, James Harris wrote:
>>> On 03/11/2021 17:54, Bart wrote:
>>>> On 03/11/2021 15:58, James Harris wrote:
>>>>> On 27/10/2021 01:15, Bart wrote:
>
> ...
>
>>>    a < b < c < d
>>>
>>> would be the same as
>>>
>>>    (((a < b) < c) < d)
>>
>>> IOW the chain should extend through parens.
>>
>> I don't agree. A chain must be linear, or it's not a chain (see my
>> first AST).
>
> That's good! Disagreement is key to advancement. :-)
>
> Are you not, in this case, however, making the use of parens
> inconsistent with their use elsewhere in the language? For example, in
>
>   a - (b - c)
>
> the parens define a putative order for the subtraction operations and
> the order matters.

No, the brackets just change the shape of the AST. That happens
consistently.

In the case of MY chained comparisons, changing the shape may destroy
the chain, or break it up into shorter, independent pieces.

MY chained comparisons are incompatible with having arbitrary evaluation
order of either the terms in the chain, or the compare operations.

>> If you want each of a, b, c, d to be evaluated in a certain order,
>> then parentheses are not the way. Say the desired order is d, a, c, b.
>> The only way to guarantee that is to do:
>>
>>    td:=d; ta:=a; tc:=c; tb:=b
>>    if ta < tb < tc < td
>
> Interesting. Is that how you see the evaluation of chaining? I see it
> differently. For me the trouble with your version

That's not my version; that's what you'd have to do if you wanted
evaluation of terms in that order.

> is that it evaluates
> some of the terms before it knows that it needs to - which is not what I
> understand the chaining idea ought to do. AISI
>
>   a < b < c < d
>
> should evaluate a then b, then if a < b is true evaluate c, then if b <
> c is true evaluate d, then if c < d then the whole expression would be
> true. Do you see it differently?

Yes, that's what I do. 'if a < b < c < d' generates this intermediate code:

    push           t.start.a  i64
    push           t.start.b  i64
    jumpge         #4         i64
    push           t.start.b  i64
    push           t.start.c  i64
    jumpge         #4         i64
    push           t.start.c  i64
    push           t.start.d  i64
    jumpge         #4         i64
#  <body of if>
#4:

(Currently this evaluates inner terms twice. I'll have to fix this by
inserting extra stack manipulation instructions: dupl, swap etc.
Alternately there could be versions of jumpge etc that only pop one value.

There are various ways. It's just not a priority as I don't use terms
with side-effects.)


>> You can't use a chain which implies a particular order,
>> short-circuiting in a similar manner to 'and' and 'or':
>
> Why not?

Because in the case of 'a<b<c<d', which may exit early when a>=b, c and
d would not get evaluated. But you're saying you want the programmer to
tell the language they should be evaluated first anyway?

That's incompatible with how a chained comparison works.

I suppose you can have a version that doesn't short-circuit, all terms
are evaluated, and all comparisons are done. Then the language could
evaluate right to left or inside out. But you wouldn't want that.

>>
>> If a < b is false in 'a < b < c < d', then there's no point in doing
>> the rest. And especially no point in evaluating d first!
>
> Indeed!
>
>>
>>> This makes chained comparisons not as simple for a programmer as we
>>> have been considering.
>>>
>>> What do you think?
>>
>> Evaluation order is a different subject. How would you enforce either
>> a first or b first even on something as simple as:
>>
>>      a + b
>>      f(a, b)
>
> I don't understand. Why would that be a problem? I do, in fact,
> define the semantic evaluation order.

So do I, in the case of f(a,b); 'b' is done first. But how do I get a to
be evaluated first? I can't do it by adding brackets!

> In practice a compiler /could/ evaluate operands in a different order
> where it could be sure that that would not affect the semantics - which
> would be most cases.

If it doesn't affect the semantics, then why would a programmer want a
different order? It can only be for different results.

For faster code? That's the compiler's job!

>>
>>  >    (((a < b) < c) < d)
>>
>> This doesn't guarantee that a<b is evaluated before c. It just means
>> it does (a<b) and c, not a and (b<c).
>>
>> It also means that this is not a comparison chain.
>>
>
> IYO...!

No, it's just not a chain. Unless you want to argue about the difference
between a linked list, a binary tree, and a degenerate(?) binary tree
which is more of a vertical linked list.

My chain of comparison ops is equivalent to a linked list. Any other
shape, is not the same thing. Not in my language, because I say so!

a < b < c < d is equivalent to:

(chaincmp (< < <) (a b c d))

Always. And (chaincmp (= > >=) (w x y z)) corresponds to:

w = x > y >= z

With the implementation performed left-to-right and short-circuiting as
soon as a comparison yields false.

If I wanted any different behaviour, then it would be coded differently.
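
The (chaincmp (ops) (terms)) form can be modelled in a few lines of Python. This is only a sketch of the semantics described above, not the compiler's implementation: operands are passed as zero-argument callables (an assumption made here so that unevaluated terms stay unevaluated), and each term is forced at most once.

```python
import operator

def chaincmp(ops, terms):
    """Evaluate terms left to right, stopping at the first comparison that fails.

    ops   -- list of binary predicates, e.g. operator.lt
    terms -- list of zero-argument callables, one more than ops
    """
    if len(terms) != len(ops) + 1:
        raise ValueError("need exactly one more term than operators")
    left = terms[0]()
    for op, term in zip(ops, terms[1:]):
        right = term()              # each middle term is evaluated once
        if not op(left, right):
            return False            # short-circuit: later terms are never forced
        left = right
    return True

a, b, c, d = 1, 2, 3, 4
increasing = chaincmp([operator.lt] * 3,
                      [lambda: a, lambda: b, lambda: c, lambda: d])

w, x, y, z = 5, 5, 3, 3
mixed = chaincmp([operator.eq, operator.gt, operator.ge],
                 [lambda: w, lambda: x, lambda: y, lambda: z])
```

The second call corresponds to w = x > y >= z, with a different operator at each link of the chain.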



Bart

Nov 4, 2021, 1:53:37 PM
On 04/11/2021 14:35, Bart wrote:
> On 04/11/2021 10:32, James Harris wrote:

>> should evaluate a then b, then if a < b is true evaluate c, then if b
>> < c is true evaluate d, then if c < d then the whole expression would
>> be true. Do you see it differently?
>
> Yes, that's what I do. 'if a < b < c < d' generates this intermediate code:
>
>     push           t.start.a  i64
>     push           t.start.b  i64
>     jumpge         #4         i64
>     push           t.start.b  i64
>     push           t.start.c  i64
>     jumpge         #4         i64
>     push           t.start.c  i64
>     push           t.start.d  i64
>     jumpge         #4         i64
> #  <body of if>
> #4:
>
> (Currently this evaluates inner terms twice. I'll have to fix this by
> inserting extra stack manipulation instructions: dupl, swap etc.
> Alternately there could be versions of jumpge etc that only pop one value.

I've now fixed this for the most common instance (when testing if the
expression is true rather than false). The intermediate code now expands to:

    push           t.start.a  i64
    push           t.start.b  i64
    duplstack
    swapstack      2 3
    jumpge         #5         i64
    push           t.start.c  i64
    duplstack
    swapstack      2 3
    jumpge         #5         i64
    push           t.start.d  i64
    jumpge         #5         i64
#5:

Usually those stack operations don't appear in the final code; they
affect the operand stack used by the compiler. And actually, for simple
variables, the final ASM code is exactly the same (R.x are register
variables):

    cmp     R.a, R.b
    jge     L5
    cmp     R.b, R.c
    jge     L5
    cmp     R.c, R.d
    jge     L5
L5:

In more elaborate expressions, eg. with array elements, then it does
reduce the amount of code a little, even though there are no side-effects.

Andy Walker

Nov 4, 2021, 2:55:55 PM
On 04/11/2021 14:35, Bart wrote:
>>> Evaluation order is a different subject. How would you enforce
>>> either a first or b first even on something as simple as:
>>>      a + b
>>>      f(a, b)

In sensible languages, you can't [in general]. If you, for
some reason, really, really need to, then there is always the recourse
of "tempa := a; f(tempa, b)" or similar.

[James:]
>> I don't understand. Why would that be a problem? I do, in fact,
>> define the semantic evaluation order.
> So do I, in the case of f(a,b); 'b' is done first. But how do I get a
> to be evaluated first? I can't do it by adding brackets!

Sensible languages [I don't claim that all languages, or even
all major languages, are sensible!] don't overspecify such things, but
say that parameters or operands are evaluated "collaterally". As above,
you can enforce ordering where it matters by using temporary variables.
If you overspecify, then you prevent some very important optimisations,
esp [but not only] when computers have "threads" or multiple CPUs. A
typical case would be "a := f(b) + g(c)" where "f" and "g" are functions
that do [or could] take a long time. In general, you can't tell in
advance whether "f" and "g" are independent, esp [eg] when there are
pointers that may conceivably be aliases. It could be, for example,
that "f" takes minutes to do some calculation, while "g" has to wait
for the user to type in some information. So a specified order takes
the sum of the times, where something equivalent to

parallel begin tf := f(b), tg := g(c) end; a := tf + tg

takes the larger of the times. Repeated a lot, this can make a
huge difference to the run time, and esp to the responsiveness.
An important use case is interactive games, eg computer chess,
where the computer can continue to analyse while waiting for an
opponent to move.
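
A rough Python equivalent of the "parallel begin ... end" form, assuming f and g really are independent. The bodies of f and g and the sleep times are invented for illustration; the sleeps merely stand in for a long calculation and a wait for input.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def f(b):
    time.sleep(0.05)    # stands in for a slow calculation
    return b * 2

def g(c):
    time.sleep(0.05)    # stands in for waiting on the user
    return c + 1

def sequential(b, c):
    """Fixed order: total time is roughly the sum of the two delays."""
    return f(b) + g(c)

def collateral(b, c):
    """Both operands in flight at once: total time is roughly the larger delay."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        tf = pool.submit(f, b)
        tg = pool.submit(g, c)
        return tf.result() + tg.result()
```

Both versions return the same sum only because f and g do not interfere with each other, which is exactly what a compiler usually cannot prove.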

>> In practice a compiler /could/ evaluate operands in a different
>> order where it could be sure that that would not affect the
>> semantics - which would be most cases.

I don't know whether it's "most cases", but there's very
little to "be sure" of if any operand or parameter includes a
function call.

> If it doesn't affect the semantics, then why would a programmer want
> a different order? It can only be for different results.

There's a wide gulf between not affecting semantics and
being sure of not affecting them.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Bizet

Andy Walker

Nov 4, 2021, 3:46:14 PM
On 03/11/2021 17:06, James Harris wrote:
[I wrote:]
>> If a skilled programmer /chooses/ to provide
>> some new operators [and aside from deliberate attempts to obfuscate
>> for competition purposes, and such like], then it is to make code
>> /clearer/.
> Not sure about that. Defining new operators is probably a very poor
> idea, making code less readable rather than more so.

If it makes the code less readable [though there is no general
reason why it should], then of course a good programmer won't do it.

> For example, if a programmer defines an 'operator' called XX to be
> used in function syntax such as
> XX(a, b)
> then while the meaning of XX may be clear to the original implementor
> someone else reading the code needs to learn what XX means in order
> to understand what's involved.

If "XX" is an operator, then the syntax would be "a XX b",
and there is no interesting difference between the operator and a
function with two parameters. In both cases you have to look at
the definition [which would, in Algol, be essentially the same in
both cases] of the operator/function to find out what it means.
In practical cases, you would commonly call the operator something
fairly obvious such as "+" where you might call the function [eg]
"matrixadd" or "gameadd" or "graphadd" or "mytypeadd" depending on
the operand/parameter types. It's up to the programmer to decide
which version is clearer. In the common case where the operand
types are "similar to" numbers, [eg rationals, complex, members of
a group, quaternions, (combinatorial) games, matrices, ...], it
makes sense to overload the usual arithmetic operators, then you
can write formulas in very much the same way as textbooks do.
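
To illustrate with something other than Algol, here is a minimal Python sketch in which an overloaded "+" and a named function share one body. The type Vec2 and the function vec2_add are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vec2:
    x: float
    y: float

    def __add__(self, other):
        # The operator simply delegates to the named function.
        if not isinstance(other, Vec2):
            return NotImplemented
        return vec2_add(self, other)

def vec2_add(p, q):
    """Component-wise addition; the single definition a reader has to look up."""
    return Vec2(p.x + q.x, p.y + q.y)

p, q = Vec2(1.0, 2.0), Vec2(3.0, 4.0)
via_operator = p + q
via_function = vec2_add(p, q)
```

Either spelling sends the reader to the same definition; the choice is only about which reads better at the call site.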

> As if that's not bad enough things can get worse. If a programmer
> defines a brand new operator called >=< as in
>   a and b >=< c + 1
> then there is not even a clue in the source code as to the precedence
> of the new operator relative to other operators.

Of course there is, otherwise it would be impossible to
parse formulas. In Algol, dyadic operators must have an associated
priority; I suppose other languages could have some other way.
[May or may not be worth noting that ">=<" is not actually a legal
operator symbol in Algol.]

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Bizet

James Harris

Nov 4, 2021, 4:37:57 PM
On 04/11/2021 19:46, Andy Walker wrote:
> On 03/11/2021 17:06, James Harris wrote:
> [I wrote:]
>>> If a skilled programmer /chooses/ to provide
>>> some new operators [and aside from deliberate attempts to obfuscate
>>> for competition purposes, and such like], then it is to make code
>>> /clearer/.
>> Not sure about that. Defining new operators is probably a very poor
>> idea, making code less readable rather than more so.
>
>     If it makes the code less readable [though there is no general
> reason why it should], then of course a good programmer won't do it.

Unfortunately, not all programmers are good. Bad code gets produced. And
good programmers can end up having to work on the code of bad ones.

It is, of course, not possible to force bad programmers to produce good
code but it's at least better if a language does not encourage bad
practices - and I put it to you that creating brand new operators with
new symbols (which I believe is what you are thinking about) can be such
a feature.

For example, an operator that a programmer creates may help him at the
time because he is thinking about the problem and knows what the
operator is for. But it's easy to forget that the same symbol will be a
complete black box to someone else who looks at the code later. All the
later programmer will have to give a clue as to what is going on is an
unfamiliar operator symbol which tells him three quarters of nothing -
which is what I was trying to illustrate with the example, below.

...

Note that as you appeared to suggest it's better to give a meaning to an
existing operator and make it something appropriate (such as + for
vectoradd, for example) and to keep their familiar precedences. At least
then another programmer will be able to read the code.

>> As if that's not bad enough things can get worse. If a programmer
>> defines a brand new operator called >=< as in
>>    a and b >=< c + 1
>> then there is not even a clue in the source code as to the precedence
>> of the new operator relative to other operators.
>
>     Of course there is, otherwise it would be impossible to
> parse formulas.  In Algol, dyadic operators must have an associated
> priority;  I suppose other languages could have some other way.
> [May or may not be worth noting that ">=<" is not actually a legal
> operator symbol in Algol.]
>
So in the example expression,

a and b >=< c + 1

what would be the priority of >=< (or whatever legal form is used)
relative to those of the adjacent operators 'and' and '+'?


--
James Harris

Andy Walker

Nov 4, 2021, 7:07:54 PM
On 02/11/2021 01:38, Bart wrote:
> Sorry, but I had a lot of trouble understanding your A68 example. For
> example, part of it involved arrays, which I thought was some extra
> ability you'd thrown in, but I think now may actually be necessary to
> implement that feature. [I still don't know...]

??? Arrays in Algol are sufficiently similar to those in
other languages [even C!] that I don't see why you would think them
some "extra ability". If you meant the cast [again, similar to C],
then that was [in the specific example] necessary because the
construct "(a, b, c)" is not unambiguously an array of integers
[it could have been (eg) a structure of three integer variables],
and so needs a stronger context to disambiguate. No cast is needed
in cases where the operand is already an array of integers.

[...]
>> For example, if you happen to be doing matrix algebra,
>> it is likely to be much clearer if you write new operators rather
>> than new functions, so that you can reproduce quite closely normal
>> mathematical notations.
> It could well be clearer, AFTER you've had to implement it via code
> that is a lot less clear than ordinary user-code.

The implementation in Algol is exactly the same, apart
from writing

OP + = # ... whatever ... #; ...

instead of

PROC matrixadd = # ... whatever ... #; ...

Everything you need to look at and understand in the one case is
the same in the other case.

> (My first scripting language had application-specific types including
> 3D transformation matrices and 3D points. There, if A and B are
> matrices, and P is a point, then:
>    C := A*B          # multiplies matrices
>    Q := C*P          # use C to transform P
> /That/ was clear, with the bonus that the user didn't need to
> implement a big chunk of the language themself!)

Every language has to decide which facilities should be
provided as a standard library [or equivalent], and which left
to either purveyors of application-specific libraries or ordinary
users. Algol took the view, very reasonably, that it was not the
job of the language itself to specify matrix packages, statistics
packages, windowing systems, tensor calculus, cryptography, or a
host of other things that informed users can write themselves.
Same applies to Pascal, C and other languages both early and late.
It is of course open to you to advertise your version of [whatever
language] to include packages for this/that/the_other; but you
can't plausibly expect language designers to be expert not only
in specifying syntax and semantics but also in advanced algebra,
statistics, numerical analysis, combinatorial game theory, theory
of relativity, astrophysics, genetics, .... Indeed, you can even
less expect that today than you could 50-odd years ago when it
was normal for computer people to have backgrounds in maths,
physics and/or engineering.

>> Code will only be "mixed" if several
>> people write different bits of a project /without/ agreeing the
>> specifications of their own bits;  that is a problem for /any/
>> large project, and nothing to do with Algol.  You can write code
>> that is difficult to "read, understand, modify or port" in any
>> non-trivial language;  again, nothing special about Algol, more to
>> do with unskilled programmers following dubious practices.
> If the effect is to create lots of mini, overlapping dialects or
> extensions, then that is a problem. There will be assorted custom
> preludes or drag in too.

That too is nothing at all to do specifically with Algol
[or Fortran or C or Pascal or ...]. It's a "problem" for any
general purpose language. It's not an unmitigated good for a
language to supply lots of facilities "as of right"; it makes
manuals and specifications much more complicated, and it means
that every implementer has to be able to write stats functions,
symbolic algebra packages, sound editors, ..., whatever it is
you decide is important enough to be an integral part of the
language. What you do in your own private language is, of
course, entirely up to you, and you will presumably provide
those and only those things that you both want and know about.
The rest of the world has different wants and knowledge.

>>> [...] I do not support chained comparisons.
>>> However, they are in the standard library.
>>      If you don't "support" them then what do you mean by "in the
>> standard library"?
> They are implemented as user-functions which are placed in a library
> that comes with the language.

So they are supported! Or, if I find a bug are you
going to say "Oh, that's only part of the standard library,
nothing to do with me, guv', complain to [who?]"?

[...]
>> You can't fix "break" or "a*b";  by the time of 7th Edition,
>> it was already too late,
> It's never too late. An alternate to 'break' could have been
> introduced for switch, eg. have both 'break' and 'breaksw';
> eventually only 'breaksw' is allowed, and 'break' is an error. Then,
> further along, 'break' inside switch is allowed to be used for
> loop-break.

That takes three revision cycles, for something that is
only a minor irritant, not a pressing need. C moves v slowly.

> However, try taking a program from 1980 and try compiling it now.

I can do slightly better than that. I have C programs
from the late '70s, [therefore] all in K&R C, one of them still
in regular use. I've actually made almost no use ever of C
facilities other than K&R, and have written very little C since
the early '90s [Algol is so-o-o much more productive!]. But my
old programs still work [or still have the same bugs, sadly!].

> Actually, take a program from 2021 and try building it with two
> different compilers.

Give me a second Algol compiler [or a different "sh"!]
and I'll try the experiment.

[...]
>>      Allowing "a=b=c=d" shows that "a=b" means one thing if it
>> is "stand alone" and something quite different if it is a left
>> operand.  You're very good at saying "my language allows XXX" for
>> all manner of interesting and perhaps even desirable "XXX", but
>> it's at the expense of specifying exactly what the related syntax
>> and semantics are.
> I copied the feature from Python. You'd need to ask Guido what it
> means!

I thought it was your language we were talking about. How
can I, or anyone else, judge your language against C or Algol or
any other language if /you/ can't tell us what its syntax and
semantics are?

>> In Algol, expressions are parsed by the usual
>> rules of precedence and [L->R] associativity, after which "a o b"
>> for any operands "a" and "b" and any operator "o" means exactly
>> the same as "f(a,b)" where "f" is a function with two parameters
>> of the same types as those of "o" and the same code body as that
>> of "o", all completely visible in the RR plus your own code.
>> What's the corresponding rule in your language?
> If I write:
>     A = B = C
> in static code, then it works something like this:
> * The dominant type of A, B, C is determined
> * A, B, C are converted to that type as needed, as values A', B', C'
> (This is for numeric types; "=" also works for exactly compatible
> arbitrary types with no conversions applied)
> * The expression returns True when A', B', C' have identical values

OK, so we now know what "A = B = C" means. Are there
different rules for every operator, for every number of operands,
for every type, ..., amounting to perhaps hundreds of pages, or
is there [as in Algol] a general rule that can explain the syntax
in a couple of paragraphs and the semantics in another few? It
affects how hard your language is to explain to potential users,
or to people here trying to understand your arguments.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Bizet

Bart

Nov 4, 2021, 9:56:14 PM
On 04/11/2021 23:07, Andy Walker wrote:
> On 02/11/2021 01:38, Bart wrote:
>> Sorry, but I had a lot of trouble understanding your A68 example. For
>> example, part of it involved arrays, which I thought was some extra
>> ability you'd thrown in, but I think now may actually be necessary to
>> implement that feature. [I still don't know...]
>
>     ???  Arrays in Algol are sufficiently similar to those in
> other languages [even C!] that I don't see why you would think them
> some "extra ability".

I'm talking about introducing arrays to what appeared to be an
implementation of chained operators. Are they central to being able to
implement an arbitrary list of such comparisons, or are they there for a
different reason?

Your code layout is poor IMO; for example you define multiple new
operators, but in a comma-separated list! That's a poor choice for
separating such major bits of code.

I rewrote your operator-defining code in a style more like my current
syntax. Now the code is much clearer (I can see where one op definition
ends, and the next begins!) though I still can't make out how it works.

(I'm not sure what ~ does either.)

-------------------------------------
record intbool =
int i
bool b
end

operator "+<"(int i, j)intbool =
if i < j then
return (j, True)
else
return (~, False)
fi
end

operator "<"(intbool p, int k)bool =
if p.b then
return p.i < k
else
return False
fi
end

operator "+<"(intbool p, q)intbool =
if p.b then
return (q.i, p.i < q.i)
else
return (~, False)
fi
end

operator "+<"(intbool p, int k)intbool =
if p.b then
return (k, p.i < k)
else
return (~, False)
fi
end

operator "+<"([]int a)bool =
if a.upb <= a.lwb then
return True
else
int p := a[a.lwb]
for i := a.lwb+1 to a.upb do
if p < a[i] then
p := a[i]
else
return False
fi
od
return True
fi
end
-------------------------------------
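
For readers who follow neither notation, the same technique can be sketched in Python. This is only an illustration: the names IntBool, start, lt and ascending are made up here and appear in neither Andy's original nor the rewrite above. Each link of the chain carries forward the right-hand operand together with the verdict so far.

```python
from dataclasses import dataclass


@dataclass
class IntBool:
    """Right-hand operand of the last comparison, plus the verdict so far."""
    i: int
    b: bool

    def lt(self, k: int) -> "IntBool":
        # Mirrors operator "+<"(intbool p, int k): once the chain has
        # failed it stays failed; otherwise compare and remember k.
        return IntBool(k, self.i < k) if self.b else IntBool(k, False)


def start(i: int, j: int) -> IntBool:
    # Mirrors operator "+<"(int i, j): compare the first pair and
    # remember j for the next link.
    return IntBool(j, i < j)


def ascending(a: list) -> bool:
    # Mirrors operator "+<"([]int a): true when every element is less
    # than its successor (vacuously true for 0 or 1 elements).
    return all(x < y for x, y in zip(a, a[1:]))
```

With these definitions, start(1, 2).lt(3).b plays the role of 1 < 2 < 3, and ascending([1, 2, 3]) the role of the array form.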

>> (My first scripting language had application-specific types including
>> 3D transformation matrices and 3D points. There, if A and B are
>> matrices, and P is a point, then:
>>     C := A*B          # multiplies matrices
>>     Q := C*P          # use C to transform P
>> /That/ was clear, with the bonus that the user didn't need to
>> implement a big chunk of the language themself!)
>
>     Every language has to decide which facilities should be
> provided as a standard library [or equivalent], and which left
> to either purveyors of application-specific libraries or ordinary
> users.  Algol took the view, very reasonably, that it was not the
> job of the language itself to specify matrix packages, statistics
> packages, windowing systems, tensor calculus, cryptography, or a
> host of other things that informed users can write themselves.

This is how a dedicated language (or a DSL) can have an advantage over
more general or more mainstream ones.


>> If the effect is to create lots of mini, overlapping dialects or
>> extensions, then that is a problem. There will be assorted custom
>> preludes or drag in too.
>
>     That too is nothing at all to do specifically with Algol
> [or Fortran or C or Pascal or ...].  It's a "problem" for any
> general purpose language.

It's a bigger problem when mainstream languages don't include features
that are considered fundamental enough, so that every other application
has to include its own implementations.

(Eg. I've lost count of how many min/max macros or functions I've seen
in C.)


> It's not an unmitigated good for a
> language to supply lots of facilities "as of right";  it makes
> manuals and specifications much more complicated,

Yet Algol68 - and C - include facilities for complex numbers. I've never
used them and never will. And A68 has all those advanced ways of
creating slices in several dimensions.

> and it means
> that every implementer has to be able to write stats functions,
> symbolic algebra packages, sound editors, ..., whatever it is
> you decide is important enough to be an integral part of the
> language.  What you do in your own private language is, of
> course, entirely up to you, and you will presumably provide
> those and only those things that you both want and know about.
> The rest of the world has different wants and knowledge.


>> They are implemented as user-functions which are placed in a library
>> that comes with the language.
>
>     So they are supported!

But it's not supported directly by the language; it's implemented in
user-code.

That's kind of a middle ground; the implementation will be poor, the
syntax unwieldy, but it will work on every installation.


>> I copied the feature from Python. You'd need to ask Guido what it
>> means!
>
>     I thought it was your language we were talking about.  How
> can I, or anyone else, judge your language against C or Algol or
> any other language if /you/ can't tell us what its syntax and
> semantics are?

I thought you wanted to know how chained compares worked in my language.
I'm saying they were copied from Python:

https://docs.python.org/3/reference/expressions.html, section 6.10.
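
A small Python sketch of what that section describes: a < b < c behaves like (a < b) and (b < c), except that b is evaluated only once, and later operands are skipped once a comparison fails. The helper operand and the list calls are illustrative names, used only to record which operands actually get evaluated.

```python
calls = []


def operand(name, value):
    # Record that this operand was evaluated, then yield its value.
    calls.append(name)
    return value


# 1 < 5 holds, so c is evaluated; 5 < 3 fails, so the chain is False.
result = operand("a", 1) < operand("b", 5) < operand("c", 3)
first_calls = list(calls)    # b appears once, not twice

calls.clear()

# 9 < 5 fails at the first link, so c is never evaluated.
short = operand("a", 9) < operand("b", 5) < operand("c", 7)
second_calls = list(calls)
```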



>     OK, so we now know what "A = B = C" means.  Are there
> different rules for every operator, for every number of operands,
> for  every type, ..., amounting to perhaps hundreds of pages, or
> is there [as in Algol] a general rule that can explain the syntax
> in a couple of paragraphs and the semantics in another few?  It
> affects how hard your language is to explain to potential users,
> or to people here trying to understand your arguments.

Well, the Algol68 books I've seen come to many hundreds of pages too.

But in the specific case of chained operators, that is really a very
minor feature. I saw it in Python, and decided to copy it.

I didn't need to delve into reference manuals to figure how to use it in
Python either; it's pretty straightforward! Something intuitive for a
change, at least until you try and do a != b != c.
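
To spell out the != case in Python: the chain only compares neighbours, so it says nothing about a versus c. The set-based test below is just one possible way to ask "are all three different?"; it assumes hashable values.

```python
a, b, c = 1, 2, 1

# Chained form: (a != b) and (b != c). Only adjacent pairs are compared.
chained = a != b != c

# "All three distinct" is a different question, because a == c here.
all_distinct = len({a, b, c}) == 3
```

Here chained is True while all_distinct is False.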

(My version only works for the six comparison ops; Python takes it a bit
further. The semantics are a little different between static and
dynamic languages, but there are differences anyway.)

Andy Walker

unread,
Nov 4, 2021, 10:11:34 PM11/4/21
to
On 04/11/2021 20:37, James Harris wrote:
> It is, of course, not possible to force bad programmers to produce
> good code but it's at least better if a language does not encourage
> bad practices - and I put it to you that creating brand new operators
> with new symbols (which I believe is what you are thinking about) can
> be such a feature.

Sorry, but why do you think that creating a new operator
is [other things being equal] a worse practice than creating a
new function with the same parameters, code, specification, etc?
What demons are you fighting?

> For example, an operator that a programmer creates may help him at
> the time because he is thinking about the problem and knows what the
> operator is for. But it's easy to forget that the same symbol will be
> a complete black box to someone else who looks at the code later.

So [and to the same extent] will be a function.

> All the later programmer will have to give a clue as to what is going
> on is an unfamiliar operator symbol which tells him three quarters
> of nothing - which is what I was trying to illustrate with the
> example, below.

It tells the programmer exactly as much as an unfamiliar
function name. Whether you see "a >=< b" or "myweirdfunc (a, b)"
you're going to have to go to the defining occurrence of the
operator/function to find out what is going on [same in both
cases!].

[...]
> So in the example expression,
>   a and b >=< c + 1
> what would be the priority of >=< (or whatever legal form is used)
> relative to those of the adjacent operators 'and' and '+'?

Whatever the programmer set it to be. In the case of
Algol, there is a table of standard priorities in the Revised
Report [RR10.2.3.0], which any programmer really has to learn
[same as in C or any other major language apart from the oddities
where all operators have the same priority or where expressions
are written in reverse-Polish notation]. The programmer gets to
choose where in that table the new operator comes.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Daquin

Bart

unread,
Nov 5, 2021, 9:48:21 AM11/5/21
to
I see several levels of user-defined/overloaded operators:

TYPE 1 New alphanumeric operators can be created. While this doesn't
appear much different from new function names, it means syntax can look
like this:

a b c d e f

Problems:

(1) Which of those are operators, and which variables? You can't tell
from the 'shape' of the code.

(2) The compiler can't tell either, because those names may not be
resolvable until the next stage after parsing. So it cannot properly
form an AST of the right structure

(3) Because names follow normal scope rules, it is possible that the
same name can be a variable in one scope, and an operator in another.

(4) The same name could be used for several operators in different
scopes, with different precedences

(5) Modern languages with namespaces can have qualifiers, which applies
also to named operators, so 'a x.b c', adding further to the confusion

(6) Such a scheme could allow mixed alphanumeric and symbolic operators;
see next section


TYPE 2 New operators can only be made out of symbols. At least here, you
will know what is what! But there are still problems:

(7) Unless there is a rule, such as any new operator like +++ has
program-wide scope, and cannot be shadowed, then there may be the same
name-resolving problems as named operators

(8) This means again that the AST shape cannot be determined until later
(depending on how the rest of the language works; it might insist on
ahead-of-use declarations)

(9) It can still mean that two versions of +++ can have different
precedences

(10) With namespaces, you may need a qualified operator, like x.+++,
which is not pretty

(11) With few restrictions on operator names, source code can end up
looking like a zoo.

(12) With symbolic names, it is not practical to give self-explanatory
names to operators; they will necessarily be cryptic.


TYPE 3 Here, no new operator tokens can be defined. You can only
overload existing operators. This is the kind I'd favour, and I
believe some languages do this.

This solves the problems of scope, of precedence, of name-resolution,
and allows the correct AST shape to be created by the parser. But yet:

(13) This still has the problem of which overload is in scope, which
ones are visible. For example there can be two versions of +(T,T)T:
which one will be chosen; how can you control that?

(14) Common to all of these is that if you see:

A + B

where A and B are of user-defined types (or maybe even of standard
types, if overriding normal behaviour is allowed), what exactly will the
code do?

It could decide to subtract the value of B from A!
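
Python is one language that allows this kind of overloading of existing operators, so point (14) is easy to demonstrate there. The class name Odd is invented for the illustration.

```python
class Odd:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Deliberately misleading: "+" is implemented as subtraction.
        return Odd(self.value - other.value)


total = (Odd(10) + Odd(3)).value   # 7, not 13
```

Nothing at the call site reveals that "+" subtracts; the reader has to find the definition of __add__.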

I think TYPE 3 is OK as a semi-internal mechanism used to define some
built-in functionality. But Algol68 (which appears to be TYPE 2) goes
way beyond that, and encourages using such features in user-code.

As I said I do very little with this; I support TYPE 3 in my dynamic
language in experimental form, and mainly use it to define custom
TOSTR() handlers. TOSTR() is an operator used implicitly in PRINT
statements, so that if X is of type T, then:

print X

will call the custom handler for T instead of the default one. So if T
is a date type for example, it might show "Fri 5-Nov-21" instead of
"(5,11,2021)".

There are none of the problems I outlined above. I don't even have a
syntax for the overload, I used a special built-in function:

$setoverload((op), T, F)

(op) you will have seen before, it can be (+) etc, or for print, it is
(tostr). T is the type that is overloaded (here it's for a unary op),
and F is the handler function, which is written conventionally. (So you
could just do F(X) instead of op X.)
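
The mechanism can be approximated in Python with a table of handlers keyed by operator name and type. This is a hedged sketch, not the actual implementation: set_overload, tostr and date_tostr are hypothetical names, and Python's datetime.date stands in for the date type T.

```python
from datetime import date

DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

_overloads = {}   # maps (operator name, type) to a handler function


def set_overload(op, typ, handler):
    # Rough analogue of $setoverload((op), T, F).
    _overloads[(op, typ)] = handler


def tostr(x):
    # Use the custom handler for type(x) if one is registered,
    # otherwise fall back to the default conversion.
    handler = _overloads.get(("tostr", type(x)))
    return handler(x) if handler else str(x)


def date_tostr(d):
    # An ordinary function, written conventionally.
    return f"{DAYS[d.weekday()]} {d.day}-{MONTHS[d.month - 1]}-{d.year % 100:02d}"


set_overload("tostr", date, date_tostr)
```

After the registration, tostr(date(2021, 11, 5)) gives "Fri 5-Nov-21", while types without a handler, such as int, still use the default.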

Andy Walker

unread,
Nov 6, 2021, 6:48:13 PM11/6/21
to
On 05/11/2021 01:56, Bart wrote:
>>> Sorry, but I had a lot of trouble understanding your A68 example. For
>>> example, part of it involved arrays, which I thought was some extra
>>> ability you'd thrown in, but I think now may actually be necessary to
>>> implement that feature. [I still don't know...]
>>      ???  Arrays in Algol are sufficiently similar to those in
>> other languages [even C!] that I don't see why you would think them
>> some "extra ability".
> I'm talking about introducing arrays to what appeared to be an
> implementation of chained operators. Are they central to being able
> to implement an arbitrary list of such comparisons, or are they there
> for a different reason?

Algol doesn't have "lists" as opposed to arrays, so if we
want to implement [eg] "+< somearray" to mean a chained comparison
between the elements of the array [similar, apparently, to your
"mapsv (('<'), ...)"], then of course "somearray" has to be an
array. I still don't see why it's so strange to involve/introduce
arrays to implement an indefinitely-long chain. [It was a throw-
away example; I hope no-one would really spend much time or effort
on writing new operators to make "A < B < C" work.]

> Your code layout is poor IMO;

"De gustibus ..." and all that jazz.

> for example you define multiple new
> operators, but in a comma-separated list! That's poor choice for
> separating such major bits of code.

But the "major bits" were appropriately indented; adding
unnecessary operator tokens would obscure the indentation for no
gain in clarity. It's not as though they were each two or three
pages long; I expect you to be able to follow indentation over a
couple of lines.

> I rewrote your operator-defining code in a style more like my current
> syntax. Now the code is much clearer (I can see where one op
> definition ends, and the next begins!)

In my version, that happens when the indentation returns to
the previous level. Adding "OP" to the start of each line doesn't
help in any way. [It could have if that return had been far away.]

> though I still can't make out
> how it works.

Well, it's [as re-written] your code! I can understand it.

> (I'm not sure what ~ does either.)

"~" is the "doesn't matter" token; it saves inventing some
expression of the right type to fill in, and to confuse readers who
wonder why you've written 17 [or whatever]. Eg, "WHILE ... DO~OD"
is a loop where the controlled statement would be empty if Algol
allowed empty statements.

>     record intbool = [...]
>     operator "+<"(int i, j)intbool = [...]
>     operator "<"(intbool p, int k)bool = [...]
>     [...]

Note that the effect of your "clarity" is that the tokens
"+<" etc are now buried in the middle of the line instead of at the
front, where they would be more prominent. [Just sayin'! I'm not
a fan of layout wars.]

More importantly, your version is 51 lines, where mine was
15. So mine fits comfortably into one window [24 lines] even with
five appended lines of comments and three of examples, yours takes
three [esp with examples], meaning that anyone trying to understand
yours will be scrolling up and down like a yo-yo. [I /am/ a fan of
trying to get interesting chunks of code together onto a single
page, so that users can see it all in one place.]

>> It's not an unmitigated good for a
>> language to supply lots of facilities "as of right";  it makes
>> manuals and specifications much more complicated,
> Yet Algol68 - and C - include facilities for complex numbers. I've
> never used them and never will. And A68 has all those advanced ways
> of creating slices in several dimensions.

Complex numbers: You may have forgotten, but back in the
'60s, computers were for doing calculations. No networking/comms,
no games [not while anyone was watching, anyway], no word-processing,
no file systems, no editors, no [lots of other things]. For "no",
read "well, very few and very primitive, tho' a few people were doing
some early work on them". If you're a mathematician or engineer, the
chances are rather good that at least some of the calculations you do
will involve complex numbers. I accept that they may not be for you,
but they were bread-and-butter then for people doing NA or solving
differential equations. They were even part of the machine code for
some computers.

Creating slices: There's really only one facility! It's
just quite general: in any array, you can pick out a cuboid of your
choice, and deal with that as a smaller/simpler array. Arrays are
really one of the things that Algol got absolutely spot-on right,
inc bounds and slices, and esp with the Torrix extensions, it's easy
to implement and to use, and I'm often amazed that there are still
languages out there where you can't look up array bounds but have
to carry them around into procedure calls, and where you can't
create sub-arrays [esp sub-strings].

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Ravel

Andy Walker

unread,
Nov 6, 2021, 7:34:01 PM11/6/21
to
On 05/11/2021 13:48, Bart wrote:
> I see several levels of user-defined/overloaded operators:
> TYPE 1 New alphanumeric operators can be created. While this doesn't
> appear much different from new function names, it means syntax can
> look like this:
>   a b c d e f
> Problems:
> (1) Which of those are operators, and which variables? You can't tell
> from the 'shape' of the code.

This is easily solved, as in Algol, by using a different
"font" for operators. This idea was recognised as far back as IAL
["Algol 58"]. Every usual language already recognises something
of the sort; typically, quote symbols switch between code and
strings, comment symbols between code and comments, brackets into
or out of a "subscript" mode, and so on. Textbooks commonly use
one font for code examples, another for the ordinary text. The
snag back in the '50s was that card equipment had only upper case
letters so everything in the program had to be wedged into that.
Today we could do better, but the styles of the '50s and '60s are
too entrenched. The result of all that is that your example might
be written in A68G as [eg]

a B c D E f

where upper case denotes operators and lower case variables, or
in earlier versions of Algol-like languages as

a .b c .d .e f

or

A 'B' C 'D E' F

or various other representations depending on available equipment.
[Separately and only tangentially relevant, I would hope that if
you invent operators you use more meaningful names.]

> (2) The compiler can't tell either, because those names may not be
> resolvable until the next stage after parsing. So it cannot properly
> form an AST of the right structure

An unparsable language is obviously useless. Algol, at
least, goes to great lengths to ensure that code is unambiguous.

> (3-6) [... various confusions ...]

Yes, if you set out to confuse people who read your code,
you can succeed. I'm more interested in languages in which good
programmers can write good, clear, concise and efficient code than
in languages in which all programmers are prevented from doing
things that are useful but might be abused.

[...]
> (14) Common to all of these is that if you see:
>    A + B
> where A and B are of user-defined types (or maybe even of standard
> types, if overriding normal behaviour is allowed), what exactly will
> the code do?
> It could decide to subtract the value of B from A!

Yes, and "sqrt(x)" could return the square of "x" after
sending a rude e-mail to your boss. Lots of things are possible.
Why do you keep worrying about the daft things people could do?
Take responsibility for your own programming, and leave the idiot
programmers to be sorted out in other ways.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Ravel

Bart

unread,
Nov 7, 2021, 6:56:00 AM11/7/21
to
On 06/11/2021 22:48, Andy Walker wrote:
> On 05/11/2021 01:56, Bart wrote:
>>>> Sorry, but I had a lot of trouble understanding your A68 example. For
>>>> example, part of it involved arrays, which I thought was some extra
>>>> ability you'd thrown in, but I think now may actually be necessary to
>>>> implement that feature. [I still don't know...]
>>>      ???  Arrays in Algol are sufficiently similar to those in
>>> other languages [even C!] that I don't see why you would think them
>>> some "extra ability".
>> I'm talking about introducing arrays to what appeared to be an
>> implementation of chained operators. Are they central to being able
>> to implement an arbitrary list of such comparisons, or are they there
>> for a different reason?
>
>     Algol doesn't have "lists" as opposed to arrays, so if we
> want to implement [eg] "+< somearray" to mean a chained comparison
> between the elements of the array [similar, apparently to your
> "mapsv (('<'), ...)", then of course "somearray" has to be an
> array.  I still don't see why it's so strange to involve/introduce
> arrays to implement an indefinitely-long chain.

In the user code the elements do not form an array. The comparisons are
done with dedicated code for each, not using a loop. Inside the
compiler, it could make use of arrays, but the input source and output
code probably won't.

The consequence of your A68 solution is that it takes 4 times as long
to do 'a <+ b <+ c < d', as it does to do 'a < b AND b < c AND c < d'
[10 secs vs 2.5 secs for 10 million iterations].

When I make the same comparison in my languages, I get the same timing,
although that's because these simple expressions end up as the same
code. [0.75 secs for 1 billion iterations static, 0.85 secs for 100M,
dynamic]

So in this case, emulating such a construct in user code results in an
inferior version: funnier syntax, and not as efficient. But a side
effect is that you get +< or <+ (how do you remember which it is) to act
on arrays of ints.

(A reminder that this solution only works for "<", only for INTs, and
may not allow mixed comparisons like a <= b < c.)
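
For comparison, a user-code emulation that does allow mixed operators and any comparable type might look like this in Python. It is a sketch under assumptions: chain is a made-up helper, and unlike a built-in chain it receives all operands already evaluated, so only the comparisons are short-circuited, not the evaluation of the operands.

```python
import operator


def chain(first, *links):
    """links are (op, value) pairs; true when every adjacent comparison holds."""
    left = first
    for op, right in links:
        if not op(left, right):
            return False       # stop comparing at the first failure
        left = right           # the right operand becomes the next left one
    return True


# a <= b < c with a = 1, b = 1, c = 2
ok = chain(1, (operator.le, 1), (operator.lt, 2))

# a < b < c with a = 1.5, b = 2, c = 2 fails at the second link
bad = chain(1.5, (operator.lt, 2), (operator.lt, 2))
```

Like the A68 version, this pays for a loop and function calls that a built-in chain would avoid.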


>> Your code layout is poor IMO;
>
>     "De gustibus ..." and all that jazz.
>
>>                  for example you define multiple new
>> operators, but in a comma-separated list! That's poor choice for
>> separating such major bits of code.
>
>     But the "major bits" were appropriately indented;  adding
> unnecessary operator tokens would obscure the indentation for no
> gain in clarity.  It's not as though they were each two or three
> pages long;  I expect you to be able to follow indentation over a
> couple of lines.

No, it's just poor. C allows comma-separated lists of function
declarations (when they share the same return type), but I don't think
I've ever seen that.

TBF, that may be mostly due to people not being aware they can do that.
Which is fortunate for everyone else.

The trend is now to declare only one thing per line, even variables.
Which doesn't mean this:

int a,
b,
c;

It means this:

int a;
int b;
int c;

(Personally I stick with int a, b, c - for variables.)

But you're defining entire functions (those OPs are no different from
PROCs) as a comma-separated list in the first style.

You should be able to take a function definition and paste it, copy it,
move it, delete it... without needing to refactor the surrounding code.

>     "~" is the "doesn't matter" token;  it saves inventing some
> expression of the right type to fill in, and to confuse readers who
> wonder why you've written 17 [or whatever].  Eg, "WHILE ... DO~OD"
> is a loop where the controlled statement would be empty if Algol
> allowed empty statements.

(A very strange feature. print(~) just shows SKIP. Assigning it to an
int stores 1. Assigning to a REAL stores some random value. Assigning to
a string stores "".

At the least I would have expected 0 and 0.0 to be stored for numbers,
ie. 'empty' or 'zero'.)

>>      record intbool = [...]
>>      operator "+<"(int i, j)intbool = [...]
>>      operator "<"(intbool p, int k)bool = [...]
>>      [...]
>
>     Note that the effect of your "clarity" is that the tokens
> "+<" etc are now buried in the middle of the line instead of at the
> front, where they would be more prominent.

You had only +< at the start of the line, there it means nothing.
Perhaps it's the middle of an expression; you have to scan the previous
line to find out, and perhaps the previous dozen lines. There are quite
a few commas!

'operator' does the job of 'fn', 'func' and 'function' which are now
sensibly being used to mark a function definition, instead of <nothing>
as popularised by C. You know this is an operator definition, and you
are defining "+<" (I quoted it in my made-up syntax to stop it bleeding
into actual symbols like '=', or I could write it as (+<).)


> [Just sayin'!  I'm not
> a fan of layout wars.]
>
>     More importantly, your version is 51 lines, where mine was
> 15.  So mine fits comfortably into one window [24 lines] even with
> five appended lines of comments and three of examples, yours takes
> three [esp with examples], meaning that anyone trying to understand
> yours will be scrolling up and down like a yo-yo.

I can do one line versions for most of those ops (not the array one),
then it ends up as 25 lines. The OP definitions are still independent.

However, I would rather look at:

operator "+<"(int i, j)intbool =
if i < j then
return (j, True)
else
return (~, False)
fi
end

than:

operator "+<"(int i, j)intbool = { (i<j | (j, True) | return (~,
False)) }

Keeping the definitions independent makes it more viable for a text
editor to detect them and be able to collapse blocks of code. (My editor
also displays 60 lines, not 25.)

Bart

unread,
Nov 7, 2021, 7:00:56 AM11/7/21
to
Yes I tried this in Algol68. It appears to support my TYPE 1 operators,
the most comprehensive.

And yes it appears to have most of the problems I mentioned. Except
those associated with definitions imported from other modules, but it
either doesn't have modules, or I don't know how to use them.

It also allows the use of operators where their OP and PRIO definitions
are at the end of the file. So making it a trifle harder to parse the
correct shapes of expression, at least the first time through.

James Harris

unread,
Nov 7, 2021, 10:49:25 AM11/7/21
to
On 05/11/2021 02:11, Andy Walker wrote:
> On 04/11/2021 20:37, James Harris wrote:
>> It is, of course, not possible to force bad programmers to produce
>> good code but it's at least better if a language does not encourage
>> bad practices - and I put it to you that creating brand new operators
>> with new symbols (which I believe is what you are thinking about) can
>> be such a feature.
>
>     Sorry, but why do you think that creating a new operator
> is [other things being equal] a worse practice than creating a
> new function with the same parameters, code, specification, etc?
> What demons are you fighting?

Fighting demons may describe your approach to problem solving if you are
doing so by defining brand new operator symbols but I suggest that there
are specific problems with doing so:

1. While well-known operators are enunciable (e.g. := can be read as
"becomes" and >= can be read as "is greater than or equal to") in a way
that reflects their meaning that's not true of new ones. For example,
how would you read any of these:

@:
#~
%!?

2. While the meaning of the new composite symbol may be logical to the
person who makes it up its appearance can be meaningless to someone
else. For example,

>=<

What does it mean? The person making it up may feel that it's a fairly
clear way to indicate bit shuffling. But someone reading the code would
have no clue to that from the operator itself.

3. You may not like to hear criticism of Algol and I wouldn't criticise
it unnecessarily but didn't you recently show code in which Algol allows
programmers to define the precedence of new operators? If so, that also
helps to make expressions unreadable for the reasons I set out before,
i.e. that in an expression such as

a or b #? c + d

there is no clue as to how #? will be parsed relative to the
operators on either side of it.


>
>> For example, an operator that a programmer creates may help him at
>> the time because he is thinking about the problem and knows what the
>> operator is for. But it's easy to forget that the same symbol will be
>> a complete black box to someone else who looks at the code later.
>
>     So [and to the same extent] will be a function.
>
>> All the later programmer will have to give a clue as to what is going
>> on is an unfamiliar operator symbol which tells him three quarters
>> of nothing - which is what I was trying to illustrate with the
>> example, below.
>
>     It tells the programmer exactly as much as an unfamiliar
> function name.  Whether you see "a >=< b" or "myweirdfunc (a, b)"
> you're going to have to go to the defining occurrence of the
> operator/function to find out what is going on [same in both
> cases!].

I would say two things to that. First, the best approach is for a
language to come with a comprehensive standard library which provides
common (and not so common) data structures, operations and algorithms.
That saves many different programmers inventing the same solutions to
problems and calling them by different names thus making code easier to
read for everyone.

Second, I'd say that where a programmer has to produce a new utility
function (it's not in the language, not in the standard library, and not
even in a library which someone else has published) it should be
identified by a name rather than by a string of punctuation characters,
and that name should be used in a syntax in which the precedence is
apparent and unquestionable; the name will at least give a reader some
idea as to what the function does.

In your example, a >=< b gives no clue but shuffle_bits(a, b) tells the
reader something about the purpose, would happen with obvious precedence
and is meaningfully enunciable and is therefore a better approach, IMO.

YMMV


--
James Harris

James Harris

unread,
Nov 7, 2021, 10:52:45 AM11/7/21
to
On 05/11/2021 13:48, Bart wrote:

...

>
> I see several levels of user-defined/overloaded operators:

...

That (now snipped) looks like an extensive write-up on operator
overloading options and a good springboard into the topic. The thing is,
this discussion is about chained comparisons. Why didn't you start a new
thread?



--
James Harris

James Harris

unread,
Nov 7, 2021, 12:54:32 PM11/7/21
to
On 04/11/2021 14:35, Bart wrote:
> On 04/11/2021 10:32, James Harris wrote:
>> On 03/11/2021 23:13, Bart wrote:

...

>>    a < b < c < d
>>
>> should evaluate a then b, then if a < b is true evaluate c, then if b
>> < c is true evaluate d, then if c < d then the whole expression would
>> be true. Do you see it differently?
>
> Yes, that's what I do. 'if a < b < c < d' generates this intermediate code:
>
>     push           t.start.a  i64
>     push           t.start.b  i64
>     jumpge         #4         i64
>     push           t.start.b  i64
>     push           t.start.c  i64
>     jumpge         #4         i64
>     push           t.start.c  i64
>     push           t.start.d  i64
>     jumpge         #4         i64
> #  <body of if>
> #4:

Looks good to me.
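
The evaluation rule described above can be written out as a Python sketch in which each operand is a zero-argument callable, so that it is evaluated only when the chain reaches it. The names chained_lt, v and evaluated are illustrative only. One difference from the intermediate code shown: b and c are fetched once here, whereas the listing pushes them twice, which is harmless for plain variables.

```python
def chained_lt(first, *rest):
    """Evaluate operands left to right, each at most once, stopping at the first failure."""
    left = first()
    for thunk in rest:
        right = thunk()            # evaluated only if all earlier links held
        if not (left < right):
            return False           # corresponds to the jumpge to #4
        left = right
    return True


evaluated = []


def v(name, value):
    # Wrap a value in a callable that logs when it is evaluated.
    def thunk():
        evaluated.append(name)
        return value
    return thunk


outcome = chained_lt(v("a", 1), v("b", 0), v("c", 5), v("d", 6))
```

Because a < b fails, outcome is False and evaluated holds only "a" and "b"; c and d are never touched.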

...

>>> You can't use a chain which implies a particular order,
>>> short-circuiting in a similar manner to 'and' and 'or':
>>
>> Why not?
>
> Because in the case of 'a<b<c<d', which may exit early when a>=b, c and
> d would not get evaluated. But you're saying you want the programmer to
> tell the language they should be evaluated first anyway?

No. I am querying whether parens in

a < (b < c) < d

should be allowed to specify the evaluation of b < c first while
maintaining the chain. You could think of the language doing that for
utility and consistency. On the utility of it the programmer might want
to encode

if b < c and a < b and c < d

thereby not evaluating the rest if "b < c" is false, i.e. evaluating the
b < c part first. On the consistency of it, it would be similar to

a - (b - c) - d

where b - c would effectively be evaluated first, e.g. to avoid
overflow, albeit that there's no shortcutting going on.

...

>> In practice a compiler /could/ evaluate operands in a different order
>> where it could be sure that that would not affect the semantics -
>> which would be most cases.
>
> If it doesn't affect the semantics, then why would a programmer want a
> different order? It can only be for different results.
>
> For faster code? That's the compiler's job!

As an example,

F(A(), B())

I define that to evaluate A() before B(). If the internals of A and B
are unknown then that's exactly what the compiler would have to emit
code to do. As you will be aware, that can be slightly awkward when the
call stack needs to be constructed right-to-left.

Therefore what I was saying was that when the compiler can tell that
there's no conflict (e.g. A() has no side effects) it can evaluate B()
then A(). In particular, in something like

F(X, Y)

or

F(3, 2)

or even more (i.e. in most cases) evaluation can be done R-L while
maintaining the semantics of L-R evaluation.
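A rough sketch of the same point in Python, which also evaluates call arguments left to right. A, B and F mirror the names above; F_right_to_left is a hypothetical stand-in for what a compiler might emit once it has decided that reordering is safe.

```python
log = []

def A():
    log.append("A")   # observable side effect
    return 3

def B():
    log.append("B")
    return 2

def F(x, y):
    return x - y

def F_right_to_left(make_x, make_y):
    # Evaluate the second argument first, as when building a stack right to left.
    y = make_y()
    x = make_x()
    return F(x, y)

log.clear()
left_to_right = F(A(), B())
order_lr = list(log)          # ['A', 'B']

log.clear()
right_to_left = F_right_to_left(A, B)
order_rl = list(log)          # ['B', 'A']
```

The result of F is the same either way; only the log differs. With operands such as F(3, 2) there is no log to differ, so the reordering cannot be observed.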

>
>>>
>>>  >    (((a < b) < c) < d)
>>>
>>> This doesn't guarantee that a<b is evaluated before c. It just means
>>> it does (a<b) and c, not a and (b<c).
>>>
>>> It also means that this is not a comparison chain.
>>>
>>
>> IYO...!
>
> No, it's just not a chain. Unless you want to argue about the difference
> between a linked list, a binary tree, and a degenerate(?) binary tree
> which is more of a vertical linked list.

Interesting point of disagreement. I would say that while you personally
may not parse it as a chain that doesn't necessarily stop it being one!

>
> My chain of comparison ops is equivalent to a linked list. Any other
> shape, is not the same thing. Not in my language, because I say so!

:-)


--
James Harris

Bart

Nov 7, 2021, 3:59:06 PM

Could do, but I've not much left to say.

I think I made it clear I'm not keen on user-defined operators, of
either kind, as there are too many issues with few benefits.

(Besides, I find it easy enough to create new, built-in operators in my
own implementations, when I need to.)

I'm OK with overloading existing operators, although I've done very
little with that.

Rod Pemberton

Nov 7, 2021, 8:23:37 PM

On Sun, 5 Sep 2021 11:50:18 +0100
James Harris <james.h...@gmail.com> wrote:

> I've got loads of other posts in this ng to respond to but I came
> across something last night that I thought you might find interesting.
>
> The issue is whether a language should support chained comparisons
> such as where
>
> A < B < C
>

Ok.

> means that B is between A and C?

Why does this have something to do with B? ...

I.e., you seem to think it's the result of:

(A < B) && (B < C)

Charles thinks it's generally parsed as:

(A < B) < C

Whereas, I would've assumed it had something to do with A:

A < (B < C)

In other words, I generally assume that the assigned to variable, or
the intended final comparison of a chained sequence, is on the left, as
in C or BASIC etc, albeit there is no explicit assignment operator in
your chained comparison.

> Or should a programmer have to write
>
> A < B && B < C
>
> ?

I think the issue is how you specify which comparison or comparisons or
variable you want as the result, as this will determine the order of
the operations. Do you let the parsing rules decide? e.g.,
left-to-right or right-to-left? Do you define which variable must be
returned? e.g., leftmost, rightmost, middle?

> I like the visual simplicity of the first form but it doesn't look so
> intuitive with a mix of relational operators such as in
>
> A < B > C
>
> What does that mean?!
>
> For programming, what do you guys prefer to see in a language? Would
> you rather have each relation yield false/true (so that A < B in the
> above yields a boolean) or to allow the above kind of chaining where
> a meaning is attributed to a succession of such operators?

I generally don't like chaining, even in C, because a) it can be
confusing as to what is intended by the programmer, and b) may require
the programmer to memorize esoteric rules they will forget. When they
do, you get parens slapped around everything to control the parsing
order, so the code looks like LISP.

> If the latter, what rules should govern how successive relational
> operators are applied?

...

--
Is this the year that Oregon ceases to exist?

Rod Pemberton

Nov 7, 2021, 8:23:44 PM

On Mon, 6 Sep 2021 23:24:47 +0200
David Brown <david...@hesbynett.no> wrote:

> On 06/09/2021 21:10, James Harris wrote:
> > On 06/09/2021 12:24, David Brown wrote:

> >> (Since I may have accidentally complimented you for a language
> >> feature, I need to add balance - don't you have a space key on
> >> your keyboard? Why don't you use it when writing code?)
> >
> > Indeed!
> >
> > In fairness, the absence of spaces doesn't look too bad when
> > variable names are short - such as might be used in example code
> > fragments. But IMO spaceless code becomes hard to read with longer
> > identifiers as used in proper programming.
> >
> >
>
> It is a matter of style and opinion (and Bart knows that, of course).
>
> But it is a serious point. Layout of code is vital to readability,

Yes, and that's why C programmers usually use the totally f'd up style,
where the braces are completely out of alignment with each other, which
reliably and consistently leads to readability issues in C from
erroneous indentation, especially after trivial editing.

When instead, they should use the logical and rational and obviously
significantly more READABLE style of a C novice, where the braces are
column aligned, whereby indentation errors are substantially easier to
identify.

> and spaces are a big part of that (as is consistency - you don't want
> different spacing depending on the length of the variables). It is
> better to write "c == 1" than "c==1", because the "==" is not part of
> either the "c" or the "1".

Disagree.

The spaces make it significantly harder to read for me.

James Harris

Nov 8, 2021, 4:06:47 AM

On 08/11/2021 01:25, Rod Pemberton wrote:
> On Sun, 5 Sep 2021 11:50:18 +0100
> James Harris <james.h...@gmail.com> wrote:
>
>> I've got loads of other posts in this ng to respond to but I came
>> across something last night that I thought you might find interesting.
>>
>> The issue is whether a language should support chained comparisons
>> such as where
>>
>> A < B < C
>>
>
> Ok.
>
>> means that B is between A and C?
>
> Why does this have something to do with B? ...
>
> I.e., you seem to think it's the result of:
>
> (A < B) && (B < C)

That's the suggestion. And it is a suggestion. It's not meant to be
about parsing an existing language!

>
> Charles thinks it's generally parsed as:
>
> (A < B) < C
>
> Whereas, I would've assumed it had something to do with A:
>
> A < (B < C)
>
> In other words, I generally assume that the assigned to variable, or
> the intended final comparison of a chained sequence, is on the left, as
> in C or BASIC etc, albeit there is no explicit assignment operator in
> your chained comparison.

That's a bit odd. There's no assignment at all. It might make it clearer
to consider it in context of IF statements:

if A < B < C then print "B is between A and C non-inclusive"
if P <= Q <= R then print "Q is between P and R inclusive"
if U < V > W then print "V is highest of the three variables"
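For comparison, Python already gives chains this meaning, so the three tests can be written directly. A sketch only; the function names and the middle() helper are made up for the example.

```python
def between_exclusive(a, b, c):
    return a < b < c          # same as (a < b) and (b < c)

def between_inclusive(p, q, r):
    return p <= q <= r

def is_highest(u, v, w):
    return u < v > w          # same as (u < v) and (v > w)

evaluations = 0

def middle():
    """Count how often the middle operand of a chain is evaluated."""
    global evaluations
    evaluations += 1
    return 5

in_range = 1 < middle() < 10  # middle() runs once, not twice
```

After the last line evaluations is 1, which is the "evaluate the middle term only once" rule discussed earlier in the thread.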


--
James Harris

Andy Walker

Nov 8, 2021, 7:03:03 PM

On 07/11/2021 15:49, James Harris wrote:
[I wrote:]
>>      Sorry, but why do you think that creating a new operator
>> is [other things being equal] a worse practice than creating a
>> new function with the same parameters, code, specification, etc?
>> What demons are you fighting?
> Fighting demons may describe your approach to problem solving if you
> are doing so by defining brand new operator symbols but I suggest
> that there are specific problems with doing so:

I wasn't talking about problem solving but about a weird
frame of mind in which you are happy for people to write

proc wertyuiop = (int a, b) bool: a > b;

[adjust to whatever your preferred syntax for procedure declarations
is], but

op wertyuiop = (int a, b) bool: a > b;

is a no-no? In any given case, you can decide for yourself whether
the names are meaningful; it's obviously [IMO] better if they are,
but I don't know of any language in which identifiers are allowed
or banned on that basis.

> 1. While well-known operators are enunciable (e.g. := can be read as
> "becomes" and >= can be read as "is greater than or equal to") in a
> way that reflects their meaning that's not true of new ones. For
> example, how would you read any of these:
>   @:
>   #~
>   %!?

In Algol? None of those are potential operators: "@", ":",
"#" and "~" are already in the syntax ["@" is for slices, ":" for
labels and similar, "#" for comments and "~" is a skip]; operators
are one or two symbols from a relatively uninteresting list of the
usual suspects, optionally followed by ":=" or "=:", so "%!?" is
too long. As for enunciation, this is, as usual, a matter of
trusting the programmer to be sensible. The fact that it is
possible to write obscure code is not a challenge for people to
do so! OTOH, APL is that-away -->.

> 2. While the meaning of the new composite symbol may be logical to
> the person who makes it up its appearance can be meaningless to
> someone else. For example,
>   >=<

Again, this is not a legal A68 operator. If it had been,
then the same applies -- programmers should write readable code.
But it is always possible that they are following the symbolisms
used in, say, a physics paper, in which case it becomes sensible
for both programmers and [informed] readers.

[...]
> 3. You may not like to hear criticism of Algol and I wouldn't
> criticise it unnecessarily

The [often ignorant] criticism here is as naught compared
with the [often informed] criticism of Algol 68 when it first
came out. Some of that resulted in the improvements of the
Revised Report; some of it resulted in Pascal [be careful what
you wish for!].

> but didn't you recently show code in which
> Algol allows programmers to define the precedence of new operators?

It not merely allows, it requires that for /new/ dyadic
operators. How else could a compiler know what precedence to
use?

> If so, that also helps to make expressions unreadable for the reasons
> I set out before, i.e. that in an expression such as
>   a or b #? c + d
> there is no clue as to how #? will be parsed relative to the
> operators on either side of it.

Again, you're seeing gremlins where none exist. IRL,
bad programmers write unreadable and incompetent code in any
language [even Basic], and good programmers write readable
and competent code in any language [even (deleted)]. But in
return, the reader has to make the effort to become familiar
with the language.

[...]
>>      It tells the programmer exactly as much as an unfamiliar
>> function name.  Whether you see "a >=< b" or "myweirdfunc (a, b)"
>> you're going to have to go to the defining occurrence of the
>> operator/function to find out what is going on [same in both
>> cases!].
> I would say two things to that. First, the best approach is for a
> language to come with a comprehensive standard library which provides
> common (and not so common) data structures, operations and
> algorithms. That saves many different programmers inventing the same
> solutions to problems and calling them by different names thus making
> code easier to read for everyone.

Within reason, these things don't belong in the language
/definition/. It's open to commercial companies or academe to
provide useful libraries [eg, the NAG library of high-quality NA
routines]. You can't expect what is usually a small group of
programmers to implement high-quality NA /and/ stats /and/
celestial mechanics /and/ number theory /and/ cryptography /and/
... as well as being competent enough to design and implement a
good language. After that, you have the problem of information
overload. Language definitions are already too big! It takes
over 600 pages to describe standard C [not a massive language];
if you add a couple of hundred more pages to describe dozens of
pre-defined types, scores of relevant operators, hundreds of
interesting functions, people will just switch off and not
bother to read them. [Unix/Linux has suffered from this;
7th Edition was small enough that it was sensibly possible to
read the entire manual and the entire source code for the
whole caboodle, including all the commands, compilers, papers
and so on. Today, it is not possible for you even to keep up
with the changes, beyond the small subset that you personally
are interested in.]

> Second, I'd say that where a programmer has to produce a new utility
> function (it's not in the language, not in the standard library, and
> not even in a library which someone else has published) it should be
> identified by a name rather than by a string of punctuation
> characters, and that name should be used in a syntax in which the
> precedence is apparent and unquestionable; the name will at least
> give a reader some idea as to what the function does.

Name: yes, of course. Precedence: that has, see above,
to be specified for dyadic operators. But it is still the case
that mathematicians will prefer, and find more readable, symbols
such as "+" where addition [and other standard operations] are
applied to new types.

> In your example, a >=< b gives no clue but shuffle_bits(a, b) tells
> the reader something about the purpose, would happen with obvious
> precedence and is meaningfully enunciable and is therefore a better
> approach, IMO.

You've perhaps forgotten, but ">=<" was /your/ example,
as was the idea that it was to shuffle bits. I used "+<", as
"<" couldn't be used, to give at least some impression of a
chained "<". It would have been much less clear as a function.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Mendelssohn

Bart

Nov 8, 2021, 8:56:17 PM

On 09/11/2021 00:03, Andy Walker wrote:
> On 07/11/2021 15:49, James Harris wrote:
> [I wrote:]
>>>      Sorry, but why do you think that creating a new operator
>>> is [other things being equal] a worse practice than creating a
>>> new function with the same parameters, code, specification, etc?
>>> What demons are you fighting?
>> Fighting demons may describe your approach to problem solving if you
>> are doing so by defining brand new operator symbols but I suggest
>> that there are specific problems with doing so:
>
>     I wasn't talking about problem solving but about a weird
> frame of mind in which you are happy for people to write
>
>    proc wertyuiop = (int a, b) bool: a > b;
>
> [adjust to whatever your preferred syntax for procedure declarations
> is], but
>
>    op wertyuiop = (int a, b) bool: a > b;
>
> is a no-no?

It is for me. My current compiler design can't handle that.

In the syntax, consecutive identifiers don't normally occur. When they
do, as in:

A B ...

it is assumed that A is a user type, and B is a new variable. With
user-defined named operators, then:

A B ...

could still mean A is a type, or A is a variable and B is a postfix/infix
operator, or A is a unary operator and B is a variable, or another unary
operator.

It's not parsable. It would need an extra pass, perhaps two (across all
modules for whole-program compilation). It's not worth it.

User-defined named operators /could/ be added, but would need a special
prelude and special rules:

* Names that are to be operators need to be listed in the main module
(or the main header I now use to describe all the modules). That info
needs to include their precedence, and whether unary, binary, prefix
etc; everything needed to construct a proper AST.

* Those names become reserved words so have program-wide scope and
cannot be redefined

* There can only be one instance of each name across the program.

* The operators would still need defining like functions are, with
operand types, bodies and return types

* Possibly, the op-defining functions are regular functions, with
directives to link the new operator names to the function.
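As a rough illustration of the prelude idea (in Python, not my own language, and with made-up operator names), once the names are declared up front with their kind and precedence, a single-pass precedence-climbing parser can tell "a dot b" from two stray identifiers. DECLARED_OPS plays the part of the hypothetical prelude.

```python
# Hypothetical prelude: operator name -> (kind, precedence)
DECLARED_OPS = {"dot": ("infix", 7), "neg": ("prefix", 9)}
BUILTIN_OPS = {"+": ("infix", 6), "*": ("infix", 7)}

def parse(tokens):
    """Parse a token list into nested tuples using the declared operator table."""
    ops = {**BUILTIN_OPS, **DECLARED_OPS}
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def primary():
        tok = advance()
        if tok in ops and ops[tok][0] == "prefix":
            return (tok, primary())
        if tok in ops:
            raise SyntaxError(f"unexpected operator {tok!r}")
        return tok                      # plain identifier

    def expression(min_prec=0):
        left = primary()
        while True:
            tok = peek()
            if tok is None or tok not in ops or ops[tok][0] != "infix":
                break
            prec = ops[tok][1]
            if prec < min_prec:
                break
            advance()
            right = expression(prec + 1)   # left-associative
            left = (tok, left, right)
        return left

    tree = expression()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

tree1 = parse(["a", "dot", "b", "+", "c"])   # ('+', ('dot', 'a', 'b'), 'c')
tree2 = parse(["neg", "a", "*", "b"])        # ('*', ('neg', 'a'), 'b')
```

Without an entry in the table, ["a", "b"] is rejected by this sketch, whereas in a real compiler it would fall through to the "type then variable" reading.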



James Harris

Nov 10, 2021, 3:48:40 AM

On 09/11/2021 00:03, Andy Walker wrote:
> On 07/11/2021 15:49, James Harris wrote:

...

>> 1. While well-known operators are enunciable (e.g. := can be read as
>> "becomes" and >= can be read as "is greater than or equal to") in a
>> way that reflects their meaning that's not true of new ones. For
>> example, how would you read any of these:
>>    @:
>>    #~
>>    %!?
>
>     In Algol?

No, the examples were of the principle rather than for a specific language.

...

>>                  but didn't you recently show code in which
>> Algol allows programmers to define the precedence of new operators?
>
>     It not merely allows, it requires that for /new/ dyadic
> operators.  How else could a compiler know what precedence to
> use?

User-assignable precedence for user-defined operators is no problem for
the compiler! I was saying that it is a problem for humans reading the
code because there's nothing in the context in which they appear to
indicate how they are supposed to be combined with adjacent operators.

Yes, a human could find where the unfamiliar operator had been defined
and given a priority and then work out how that relates to adjacent
operators but I suggest that it would be better if the way the operator
relates to its surroundings were to be present in the code where the
operator was used.

>
>> If so, that also helps to make expressions unreadable for the reasons
>> I set out before, i.e. that in an expression such as
>>    a or b #? c + d
>> there is no clue as to how #? will be parsed relative to the
>> operators on either side of it.
>
>     Again, you're seeing gremlins where none exist.  IRL,
> bad programmers write unreadable and incompetent code in any
> language

What's bad about the expression, above? It involves only three
operators! Yet someone reading it cannot tell what order they are
applied in. I'm sorry but you cannot blame the programmer. The facility
itself is at fault.

> [even Basic], and good programmers write readable
> and competent code in any language [even (deleted)].  But in
> return, the reader has to make the effort to become familiar
> with the language.

...

>> In your example, a >=< b gives no clue but shuffle_bits(a, b) tells
>> the reader something about the purpose, would happen with obvious
>> precedence and is meaningfully enunciable and is therefore a better
>> approach, IMO.
>
>     You've perhaps forgotten, but ">=<" was /your/ example,

You gave the example. You contrasted it with "myweirdfunc".


> as was the idea that it was to shuffle bits.  I used "+<", as
> "<" couldn't be used, to give at least some impression of a
> chained "<".

OK.


> It would have been much less clear as a function.
>

That is a legitimate argument and deserves a fuller discussion but, for
this thread, while I see the point of it I don't believe it justifies
the problems which defining new operators with their own precedences
causes. There are better potential solutions such as building behaviour
into a language or providing a way to use user-defined named operators
in a way that their precedence and their nature are fixed or apparent in
the syntax - but that's another topic.


--
James Harris

Andy Walker

Nov 10, 2021, 8:07:26 AM

On 10/11/2021 08:48, James Harris wrote:
> User-assignable precedence for user-defined operators is no problem
> for the compiler! I was saying that it is a problem for humans
> reading the code because there's nothing in the context in which they
> appear to indicate how they are supposed to be combined with adjacent
> operators.

IRL, it quite simply is not and never has been a problem
for humans. Yes, of course you can write puzzles; but people with
serious intent to write useful code don't do that. There are four
uses for new operators:

-- monadic operators ["if isprime i then ..."]. Not a problem.
-- extensions of standard operators to new types. Not a problem.
-- operators that relate to the problem at hand [eg, dot and cross
multiplication of vectors], where you can reasonably expect the
precedence to follow the maths/physics/genetics/whatever.
-- The rest. I've never seen any of these.
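For the second and third cases, a sketch of what this looks like in a language that allows overloading but keeps precedences fixed, here Python. Vec is a made-up type; "@" is Python's existing matrix-multiply operator, borrowed for the dot product, and the cross product has to remain a named method because there is no spare symbol for it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vec:
    x: float
    y: float
    z: float

    def __add__(self, other):
        # Extension of a standard operator to a new type.
        return Vec(self.x + other.x, self.y + other.y, self.z + other.z)

    def __matmul__(self, other):
        # Dot product; "@" has the precedence of "*", fixed by the language.
        return self.x * other.x + self.y * other.y + self.z * other.z

    def cross(self, other):
        return Vec(self.y * other.z - self.z * other.y,
                   self.z * other.x - self.x * other.z,
                   self.x * other.y - self.y * other.x)

u, v, w = Vec(1, 2, 3), Vec(4, 5, 6), Vec(1, 1, 1)
lhs = u @ v + u @ w      # parsed as (u @ v) + (u @ w)
rhs = u @ (v + w)
```

Here lhs and rhs are both 38, so the fixed precedence happens to follow the maths; whether that holds for every problem domain is the question being argued.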

> Yes, a human could find where the unfamiliar operator had been
> defined and given a priority and then work out how that relates to
> adjacent operators but I suggest that it would be better if the way
> the operator relates to its surroundings were to be present in the
> code where the operator was used.

If there's a difficulty [I've never seen one], you could
always add a comment. Otherwise, what do you have in mind? Some
new syntax for every time you use an operator? ???

>>> If so, that also helps to make expressions unreadable for the reasons
>>> I set out before, i.e. that in an expression such as
>>>    a or b #? c + d
>>> there is no clue as to how #? will be parsed relative to the
>>> operators on either side of it.
>>      Again, you're seeing gremlins where none exist.  IRL,
>> bad programmers write unreadable and incompetent code in any
>> language
> What's bad about the expression, above? It involves only three
> operators! Yet someone reading it cannot tell what order they are
> applied in. I'm sorry but you cannot blame the programmer. The
> facility itself is at fault.

The reason you "cannot tell" is that you have no idea what
"#?" is supposed to mean. If you were familiar with the problem
in which "#?" seemed to the programmer to be a Good Idea, then you
would presumably know whether it was "like" [eg] an addition, or a
logical operator, or some sort of assignment, or whatever. If not,
then you're going to have to work a little harder, eg by referring
back to where "#?" was defined; whether the fault is then with the
programmer writing obscure code or with the reader being unfamiliar
with the problem is another matter, but it's unreasonable to blame
the language. The "facility itself" is extremely useful -- once
you've used it, you'll wonder how you managed without.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Valentine

Bart

Nov 10, 2021, 9:15:44 AM

Creating a new operator with a new precedence /is/ changing the syntax.

Actually, in Algol68 you can change the syntax even with no new operators:

PRIO + = 9;
PRIO * = 5;

print((2+3*4))

The results depend on the relative priorities of + and *; I can get 14,
or 20.

(I had to define both here because I don't know what the default
priorities of + and * are. I'd have to go and dig up that info
somewhere. All I know is that * is normally higher than +.)

Or, I can get both results from the same program:

print((2+3*4));

BEGIN
PRIO + = 5;
PRIO * = 9;

print((2+3*4))
END;

PRIO + = 9;
PRIO * = 5;

0

In this case, I've put one set of PRIOs at the end, so that the
behaviour of 2+3*4 is even more surprising, since you don't know about
that override until the end of a normally much longer program.
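The effect is easy to reproduce outside Algol with a table-driven evaluator. A sketch in Python, where evaluate and the priority tables are illustrative only; the 6 and 7 are what I assume the default priorities of + and * to be, and any pair with * higher than + gives the same answer.

```python
import operator

FUNCS = {"+": operator.add, "*": operator.mul}

def evaluate(tokens, prio):
    """Precedence-climbing evaluation of a flat list such as [2, '+', 3, '*', 4]."""
    pos = 0

    def expression(min_prio):
        nonlocal pos
        left = tokens[pos]
        pos += 1
        while pos < len(tokens) and prio[tokens[pos]] >= min_prio:
            op = tokens[pos]
            pos += 1
            right = expression(prio[op] + 1)   # left-associative
            left = FUNCS[op](left, right)
        return left

    return expression(0)

formula = [2, "+", 3, "*", 4]
usual = evaluate(formula, {"+": 6, "*": 7})     # 2 + (3 * 4) = 14
swapped = evaluate(formula, {"+": 9, "*": 5})   # (2 + 3) * 4 = 20
```

The token list is the same in both calls; only the table changes, which is exactly what the PRIO declarations do.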

To me this is all undesirable. I would favour:

* Having only a fixed set of built-in operators with fixed precedences

* Possibly, having some additional built-in symbolic operators which are
unassigned, available for user-programs to overload. Again with fixed
precedences (but likely to be just one)

Andy Walker

Nov 10, 2021, 2:58:36 PM

On 10/11/2021 14:15, Bart wrote:
[I wrote (to James):]
>> [...] Otherwise, what do you have in mind?  Some
>> new syntax for every time you use an operator?  ???
> Creating a new operator with a new precedence /is/ changing the syntax.

It's a point of view; but James was talking about /applied/
occurrences of the operator, not /defining/ occurrences.

> Actually, in Algol68 you can change the syntax even with no new operators:

What do you mean by "change the syntax"? Any non-trivial
change to a program is changing the syntactic structure of that
program; but neither in Algol 68 nor in most other languages is
there any way for a program to change its underlying syntax. In
Algol, the syntax of formulas is defined in RR5.4.2.1 and stays
the same no matter what games you play with priorities.

>     PRIO + = 9;
>     PRIO * = 5;
>     print((2+3*4))

Yeah, you can write puzzles. And ...? As an entry in an
"Obfuscated Algol" competition, you can sensibly do things like
that. In normal code, you can't, and most people would regard it
as Bad programming. But [in Algol] there has to be provision for
it, otherwise there would be no way to define "+" and "*" in the
first place -- they're not hard-wired in the syntax, only in the
code of the standard prelude.

[...]
> To me this is all undesirable. I would favour:
> * Having only a fixed set of built-in operators with fixed precedences

With or without allowing code to re-define operators for
other types? It would seriously impact most of my interesting
programs if I couldn't use [eg] "+", "-" and "*" for things like
graphs and games, or different sorts of number -- might as well
use C or other restricted language. Not /quite/ so bad if that's
allowed but defining new operators is disallowed, though that
would still affect perhaps half of my non-trivial programs. It
would be a major change however to allow code in the standard
prelude that was not syntactically/semantically available inside
the program [it would affect the entire structure of programs].

> * Possibly, having some additional built-in symbolic operators which
> are unassigned, available for user-programs to overload. Again with
> fixed precedences (but likely to be just one)

IOW, you want to exercise control over the symbolisms
used in my programs, as though you were an expert in whatever
field happens to be relevant to my programming task? I want
to use something for, say circle-plus, and I need to find out
whether you have graciously provided a symbol for just that
purpose, otherwise I'm SOL, and will have to resort to Algol
instead to write my program. Fine, but I [therefore] won't
be interested in your language.

The only reason you and James seem to have for not
liking new operators is that they can be abused by bad
programmers. Yes; and ...? Do we invent interesting
languages for the benefit of bad programmers or so that
we can write interesting [and useful] programs?

Bart

Nov 10, 2021, 4:45:04 PM

On 10/11/2021 19:58, Andy Walker wrote:
> On 10/11/2021 14:15, Bart wrote:
> [I wrote (to James):]
>>> [...] Otherwise, what do you have in mind?  Some
>>> new syntax for every time you use an operator?  ???
>> Creating a new operator with a new precedence /is/ changing the syntax.
>
>     It's a point of view;  but James was talking about /applied/
> occurrences of the operator, not /defining/ occurrences.
>> Actually, in Algol68 you can change the syntax even with no new
>> operators:
>
>     What do you mean by "change the syntax"?  Any non-trivial
> change to a program is changing the syntactic structure of that
> program;  but neither in Algol 68 nor in most other languages is
> there any way for a program to change its underlying syntax.

Syntax for me includes whether a + b * c is parsed as a + (b * c) or as
(a + b) * c. In other words, the structure of an expression.

But a language can be defined so that any expression is the linear:

term | binop

(a series of terms separated with binops) with the actual semantics
depending on the attributes of those operators. (Eg. table-driven
expression parsers, which is what I've used myself.)

However I don't see that as being so useful. I think hard-coded
precedences, fixed in the grammar, are better. But even my
table-specified precedences were fixed; you couldn't change the meaning
of a+b*c from a+(b*c) to (a+b)*c from one line to the next, and then
back again.

Sometimes a language can give too much freedom.

>>      PRIO + = 9;
>>      PRIO * = 5;
>>      print((2+3*4))
>
>     Yeah, you can write puzzles.  And ...?  As an entry in an
> "Obfuscated Algol" competition, you can sensibly do things like
> that.  In normal code, you can't,

Then disallow it.

> and most people would regard it
> as Bad programming.  But [in Algol] there has to be provision for
> it, otherwise there would be no way to define "+" and "*" in the
> first place -- they're not hard-wired in the syntax, only in the
> code of the standard prelude.

(1) If that's how Algol is implemented, then OK, but it's not drawing a
proper line between different kinds of code: implementation, standard
libraries, third party libraries, and user libraries and code.

(2) Even if PRIO is needed to set a precedence, it could allow it just
once per distinct operator

(3) How is "+" defined in Algol 68?


> [...]
>> To me this is all undesirable. I would favour:
>> * Having only a fixed set of built-in operators with fixed precedences
>
>     With or without allowing code to re-define operators for
> other types?

I've specified the operators available for overload. The standard ones,
and a few predefined spare ones. I think a few languages do the same thing.

The sensible ones may have learned that having a free-for-all on operator
names is not a good idea. Being able to use Unicode identifiers is worse.


>> * Possibly, having some additional built-in symbolic operators which
>> are unassigned, available for user-programs to overload. Again with
>> fixed precedences (but likely to be just one)
>
>     IOW, you want to exercise control over the symbolisms
> used in my programs, as though you were an expert in whatever
> field happens to be relevant to my programming task?  I want
> to use something for, say circle-plus, and I need to find out
> whether you have graciously provided a symbol for just that
> purpose, otherwise I'm SOL, and will have to resort to Algol
> instead to write my program.  Fine, but I [therefore] won't
> be interested in your language.

Doesn't Algol68 limit the available symbols anyway? Eg. you couldn't use
"<". I couldn't use "+++" when I tried that.

I assume there are rules so that you can resolve "+++" and "+ + +",
especially if the language likes to ignore white space.

Andy Walker

Nov 11, 2021, 2:46:26 PM

On 10/11/2021 21:45, Bart wrote:
>>> Actually, in Algol68 you can change the syntax even with no new operators:
>>      What do you mean by "change the syntax"?  Any non-trivial
>> change to a program is changing the syntactic structure of that
>> program;  but neither in Algol 68 nor in most other languages is
>> there any way for a program to change its underlying syntax.
> Syntax for me includes whether a + b * c is parsed as a + (b * c) or
> as (a + b) * c. In other words, the structure of an expression.

If your language has a syntax that specifies that, then fine;
but Algol doesn't. It has a two-level grammar, so that the syntax
includes lots of other things, such as whether an identifier has
been declared, whether types match, and so on. I don't propose to
write an essay on 2LGs, it's all in the RR, and I've previously
referred you to the actual syntax of formulas in RR5.4.2.1. It
includes what other languages would call semantics.

> [...] I think hard-coded
> precedences, fixed in the grammar, are better.

Yes, I understand that to be your opinion. But the only
rationale you have produced for that is that rogue programmers
can abuse declared priorities. Oh, and that your own compiler
can't handle it. Are we inventing languages for rogue programmers
or for productivity?

>> [...] As an entry in an
>> "Obfuscated Algol" competition, you can sensibly do things like
>> that.  In normal code, you can't,
> Then disallow it.

It's 46 years too late to change the RR. Esp when there
is no reason to. IRL it simply is not a problem, and it would
complicate the syntax.

>> [...] But [in Algol] there has to be provision for
>> it, otherwise there would be no way to define "+" and "*" in the
>> first place -- they're not hard-wired in the syntax, only in the
>> code of the standard prelude.
> (1) If that's how Algol is implemented, then OK, but it's not drawing
> a proper line between different kinds of code: implementation,
> standard libraries, third party libraries, and user libraries and
> code.

That's not how Algol is /implemented/, it's how it is
/defined/ [and it's again 46 years too late to change that].
But it does draw those lines, indeed rather more carefully than
most if not all other language definitions. See RR10.1.

> (2) Even if PRIO is needed to set a precedence, it could allow it
> just once per distinct operator

It could. But it would cut across the whole concept of
block structure; eg, you couldn't embed a program inside another
one and expect the result to be legal. [It would actually be a
fairly simple change to RR4.2.1b. But there really, really, is
absolutely no need, and it would prevent a handful of interesting
applications.]

> (3) How is "+" defined in Algol 68?

RR10.2.3.0a for the priority and an infinity of rules
typified by RR10.2.3.3i which defines it for integers, plus
similar versions for monadic "+", other lengths and other
parameter types. If you're really interested, you need to see
also RR10.1.3, which gives the same freedom as C's "as-if" rule,
and RR2.1.3.1e which sets out the assumed properties of numbers.

> Doesn't Algol68 limit the available symbols anyway? Eg. you couldn't
> use "<". I couldn't use "+++" when I tried that.

I could have used "<", but that would have over-written
the standard meaning of "<". If you mean that you couldn't
declare an operator "+++", then [as I have already explained in
this thread several times] A68 operators are not free-for-all;
see RR9.4.2.1F for dyadic operators, RR9.4.2.1K for monadic.
[Trying things is Good, but you will ultimately save your own,
and more importantly my, time if you read the Revised Report.]

> I assume there are rules so that you can resolve "+++" and "+ + +",
> especially if the language likes to ignore white space.

Yes; "a +++ b" means [unambiguously, and however you
try to redefine operators] "a + ( + ( + b ) )", that is, a
dyadic operator and two monadic; "a + + + b" means the same,
but in any case you aren't allowed white space [or comments!]
internally to an operator symbol.
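
For comparison only: Python happens to read the same text in much the same way. It has no "+++" token, so it sees one dyadic "+" followed by two monadic ones. The sketch below uses the standard ast module to print the grouping; the helper name shape and the identifiers a and b are placeholders, and none of this says anything about how an A68 parser is actually built.

```python
import ast

def shape(src):
    """Return a fully parenthesised rendering of a small '+' expression."""
    def walk(node):
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            return f"({walk(node.left)} + {walk(node.right)})"
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.UAdd):
            return f"(+{walk(node.operand)})"
        if isinstance(node, ast.Name):
            return node.id
        raise ValueError("unsupported node")
    return walk(ast.parse(src, mode="eval").body)

# Both spellings give the same tree: one dyadic plus, two monadic.
print(shape("a +++ b"))
print(shape("a + + + b"))
```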

Bart

Nov 11, 2021, 3:59:56 PM
On 11/11/2021 19:46, Andy Walker wrote:
> On 10/11/2021 21:45, Bart wrote:
>>>> Actually, in Algol68 you can change the syntax even with no new
>>>> operators:
>>>      What do you mean by "change the syntax"?  Any non-trivial
>>> change to a program is changing the syntactic structure of that
>>> program;  but neither in Algol 68 nor in most other languages is
>>> there any way for a program to change its underlying syntax.
>> Syntax for me includes whether a + b * c is parsed as a + (b * c) or
>> as (a + b) * c. In other words, the structure of an expression.
>
>     If your language has a syntax that specifies that, then fine;
> but Algol doesn't.  It has a two-level grammar, so that the syntax
> includes lots of other things, such as whether an identifier has
> been declared, whether types match, and so on.  I don't propose to
> write an essay on 2LGs, it's all in the RR, and I've previously
> referred you to the actual syntax of formulas in RR5.4.2.1.  It
> includes what other languages would call semantics.
>
>> [...] I think hard-coded
>> precedences, fixed in the grammar, are better.
>
>     Yes, I understand that to be your opinion.  But the only
> rationale you have produced for that is that rogue programmers
> can abuse declared priorities.  Oh, and that your own compiler
> can't handle it.

You misunderstood. My compilers for decades have had table-driven
expression parsers, but with non-alterable priorities. I'm using
grammar-driven parsers now as they are clearer, and make it easier to
add specialist handling to some operators (eg. chained comparisons).
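
As a rough illustration of that combination (fixed priorities held in a table, plus special handling so that a run of comparisons becomes one node), here is a minimal Python sketch. It is not taken from any compiler discussed here: the names PRIORITY, parse and evaluate, the priority values, and the use of "=" for equality are all assumptions made for the example, and parentheses are left out to keep it short.

```python
import operator
import re

# Fixed, non-alterable priorities (higher binds tighter). Illustrative values.
PRIORITY = {"<": 1, "<=": 1, "=": 1, "+": 2, "-": 2, "*": 3}
COMPARE = {"<": operator.lt, "<=": operator.le, "=": operator.eq}
ARITH = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def tokenize(src):
    return re.findall(r"\d+|<=|[<=+\-*]|[A-Za-z_]\w*", src)

def parse_primary(tokens):
    tok = tokens.pop(0)
    return ("num", int(tok)) if tok.isdigit() else ("name", tok)

def parse(tokens, min_prio=1):
    """Precedence climbing; a run of comparisons becomes one 'chain' node."""
    left = parse_primary(tokens)
    while tokens and tokens[0] in PRIORITY and PRIORITY[tokens[0]] >= min_prio:
        op = tokens.pop(0)
        right = parse(tokens, PRIORITY[op] + 1)
        if op not in COMPARE:
            left = ("bin", op, left, right)
        elif left[0] == "chain":
            # Extend the chain: each middle term is stored exactly once.
            left = ("chain", left[1] + [right], left[2] + [op])
        else:
            left = ("chain", [left, right], [op])
    return left

def evaluate(node, env):
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "name":
        return env[node[1]]
    if kind == "bin":
        return ARITH[node[1]](evaluate(node[2], env), evaluate(node[3], env))
    # Chain: evaluate each term at most once, stop at the first false link.
    terms, ops = node[1], node[2]
    prev = evaluate(terms[0], env)
    for op, term in zip(ops, terms[1:]):
        cur = evaluate(term, env)
        if not COMPARE[op](prev, cur):
            return False
        prev = cur
    return True

print(evaluate(parse(tokenize("a < b + 1 <= c")), {"a": 1, "b": 2, "c": 3}))
```

Because the middle terms live once in the chain node, the double-evaluation problem of rewriting to "(a < b) and (b < c)" does not arise in this form.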


> Are we inventing languages for rogue programmers
> or for productivity?

Both mine are for my own productivity. That means 90% working on my code
instead of 90% battling the compiler.

I don't suffer from the lack of user-defined ops in the stuff I do.

But I /would/ suffer from the lack of many of my features, such as
'tabledata', using a foreign language like yours.


>>> [...]  As an entry in an
>>> "Obfuscated Algol" competition, you can sensibly do things like
>>> that.  In normal code, you can't,
>> Then disallow it.
>
>     It's 46 years too late to change the RR.

Why does it have to be fixed? Other languages evolve. Among the changes
I would make to Algol68:

* Fix the semicolon problem, or make it quietly tolerant

* Get rid of that spaces-in-identifiers stuff, so that you don't need
keyword stropping, and people can write code in a civilised manner.

I don't care about the operator stuff, as I never use it.

>> (3) How is "+" defined in Algol 68?
>
>     RR10.2.3.0a for the priority and an infinity of rules
> typified by RR10.2.3.3i which defines it for integers, plus
> similar versions for monadic "+", other lengths and other
> parameter types.  If you're really interested, you need to see
> also RR10.1.3, which gives the same freedom as C's "as-if" rule,
> and RR2.1.3.1e which sets out the assumed properties of numbers.

OK, my question should have been, how do you /implement/ "+" when "+"
meaning 'add' doesn't yet exist. There must either be some core BUILT-IN
features to make it practical, or "+" is secretly built-in already.


>> Doesn't Algol68 limit the available symbols anyway? Eg. you couldn't
>> use "<". I couldn't use "+++" when I tried that.
>
>     I could have used "<", but that would have over-written
> the standard meaning of "<".  If you mean that you couldn't
> declare an operator "+++", then [as I have already explained in
> this thread several times] A68 operators are not free-for-all;
> see RR9.4.2.1F for dyadic operators, RR9.4.2.1K for monadic.
> [Trying things is Good, but you will ultimately save your own,
> and more importantly my, time if you read the Revised Report.]

So, it looks like the most apt operator you want to use probably
wouldn't be allowed. So the language is telling you which you can use, or
which are legal.

Andy Walker

Nov 11, 2021, 4:21:05 PM
On 07/11/2021 11:55, Bart wrote:
>>> I'm talking about introducing arrays to what appeared to be an
>>> implementation of chained operators. Are they central to being able
>>> to implement an arbitrary list of such comparisons, or are they there
>>> for a different reason?
>>      Algol doesn't have "lists" as opposed to arrays, so if we
>> want to implement [eg] "+< somearray" to mean a chained comparison
>> between the elements of the array [similar, apparently to your
>> "mapsv (('<'), ...)", then of course "somearray" has to be an
>> array.  I still don't see why it's so strange to involve/introduce
>> arrays to implement an indefinitely-long chain.
> In the user code the elements do not form an array. The comparisons
> are done with dedicated code for each, not using a loop. Inside the
> compiler, it could make use of arrays, but the input source and
> output code probably won't.

I still don't have the slightest idea what you are complaining
about. FTAOD /yet again/, I have personally never felt the need to
use "chained comparisons", even less to write programs that define
them as operators, but it seemed like a modestly interesting task,
so I spent a few minutes demonstrating that it was possible in
Algol. The actual "<" operator was already [unsurprisingly] in
use, so I had to use a new symbol, and chose, rather arbitrarily,
"+<". It was then easy to implement "a +< b". For good measure,
I chucked in a monadic version [optional extra] such that
"+< c" returned true iff the elements of "c" were sorted; that
used [unsurprisingly] an array [which you seem to be complaining
about] and a loop [which you seem to be complaining about /not/
being used, even though it manifestly is]. If you can create a
comprehensible question, I can try to answer it, but ATM I have
no idea what you're trying to say about arrays.

> The consequences of your A68 solution is that it takes 4 times as
> long to do 'a <+ b <+ c < d', as it does to do 'a < b AND b < c AND c
> < d' [10 secs vs 2.5 secs for 10 million iterations].

It's an order of magnitude faster on my PC, but the ratio is
[unsurprisingly] the same. That's for the obvious reason that the
"a < b AND b < c AND c < d" version does six loads and five built-in
operations whereas "a +< b +< c < d" saves two loads but has to do
the same operations /plus/ two function calls, and a lot of parameter
establishment [as the intermediate type is a structure] and passing.

> When I make the same comparison in my languages, I get the same
> timing, although that's because these simple expressions end up as
> the same code. [0.75 secs for 1 billion iterations static, 0.85 secs
> for 100M, dynamic]

Or because your version is compiled and optimised, whereas
mine is interpreted and unoptimised. ICBA to run it through the
unit optimiser, after which the main time [in A68G] would presumably
be that spent running a trivial loop 10^7 [or 10^9] times.

> So in this case, emulating such a construct in user code results in
> an inferior version: funnier syntax, and not as efficient.

It's not surprising that an unoptimised interpretation is
"less efficient" than compiled and optimised. Whether it's
"inferior" is a matter of taste. Again, it's not code that I
would ever write in normal circumstances.

> But a side
> effect is that you get +< or <+ (how do you remember which it is) to
> act on arrays of ints.

Not a "side effect"; I wrote an extra operator. As "<+"
[which you managed somehow to perpetrate above] is not a legal A68
operator, it's easy to remember which one I used.

> (A reminder that this solution only works for "<",

Yes, as I commented in the original code; it would be a
waste of everyone's time [esp mine] to write out five copies of
the same code but replacing "<" by "<=", "=", "/=", ">" and ">="
just for completeness ...

> only for INTs,

... and a further waste of time writing out further copies
for every possible combination of parameter types. If this had
been a real problem and not a mere proof of concept, I would have
written a shell script to generate the ~1440 [~5000 with your
preferred layout] lines of very repetitive code needed. Yet
again, it really would have been silly.

> and
> may not allow mixed comparisons like a <= b < c.)

But yes, those would all then be legal.

[...]
> The trend is now to declare only one thing per line, even variables.

Yeah, I've had spells of doing that. But that was before
we had "screens", and having acres of white space on the lineprinter
listings was useful for writing edits/corrections in the days before
we had usable editors or even files. Or simply guillotining off the
printing and stapling together pads of blank paper. Not quite as
bad as the author who wrote as an exercise "Write a program to print
the squares of the integers from 1 to 1000, each on a new page" --
I bet he was popular with his computing centre.

> (A very strange feature. print(~) just shows SKIP. Assigning it to an
> int stores 1. Assigning to a REAL stores some random value. Assigning
> to a string stores "".
> At the least I would have expected 0 and 0.0 to be stored for
> numbers, ie. 'empty' or 'zero'.)

It's a "skip" token. If you want 0 or 0.0 then write that,
don't tell the compiler that you don't care and then be surprised by
the result. "What drink do you want?" "Don't care." ... "Why have
you brought me iced tea, I was expecting white wine."

[...]
> (My editor also displays 60 lines, not 25.)

Lucky you. Most of my screens are not tall enough for that
[and the exception is busy displaying music, which needs lots of
space, esp for orchestral parts]. T'bird occupies more than half
the available height with headers, other windows and margins, so
even 25 lines is pushing it; my own editor will run to about 40
lines before the font size gets too small [for me] to read, but
if I'm using a split screen to compare two chunks of code it's
difficult for either to be much more than 25.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Dussek

Bart

Nov 11, 2021, 7:59:59 PM
On 11/11/2021 21:21, Andy Walker wrote:
> On 07/11/2021 11:55, Bart wrote:
>>>> I'm talking about introducing arrays to what appeared to be an
>>>> implementation of chained operators. Are they central to being able
>>>> to implement an arbitrary list of such comparisons, or are they there
>>>> for a different reason?
>>>      Algol doesn't have "lists" as opposed to arrays, so if we
>>> want to implement [eg] "+< somearray" to mean a chained comparison
>>> between the elements of the array [similar, apparently to your
>>> "mapsv (('<'), ...)", then of course "somearray" has to be an
>>> array.  I still don't see why it's so strange to involve/introduce
>>> arrays to implement an indefinitely-long chain.
>> In the user code the elements do not form an array. The comparisons
>> are done with dedicated code for each, not using a loop. Inside the
>> compiler, it could make use of arrays, but the input source and
>> output code probably won't.

I was asking how arrays came in it. You said of course there has to be
an array. But not really; you don't need arrays for:

a + b + c

so why do you need them for:

a < b < c

The answer was that your A68 solution needs to use arrays to make it
work even for a handful of scalar values. And a side-effect of that was
to be able to do < (10,20,30) (even if needing a cast).

>> When I make the same comparison in my languages, I get the same
>> timing, although that's because these simple expressions end up as
>> the same code. [0.75 secs for 1 billion iterations static, 0.85 secs
>> for 100M, dynamic]
>
>     Or because your version is compiled and optimised, whereas
> mine is interpreted and unoptimised.

That 0.85 secs for 100 million is for my interpreted bytecode.

My point really is that implementing stuff in /user-code/ is going to
have performance problems when that user-code is interpreted.

>     It's not surprising that an unoptimised interpretation is
> "less efficient" than compiled and optimised.

That isn't always an option, not when using dynamic typing, as getting it
optimised is very difficult (they've been trying for years with
Python).

>     ... and a further waste of time writing out further copies
> for every possible combination of parameter types.  If this had
> been a real problem and not a mere proof of concept, I would have
> written a shell script to generate the ~1440 [~5000 with your
> preferred layout] lines of very repetitive code needed.  Yet
> again, it really would have been silly.

This is yet another problem. Your script to generate that lot may be
short, but it would still be 1400 lines of dense code to end up doing a
poorer, slower emulation of something my language can do in 100
non-dense lines because it is natively supported.

You may also need to process all that extra code when a program doesn't
use the feature.

These are all disadvantages of using language-building features in
user-code to implement functionality, as opposed to having direct
language support.

Everyone here seems to like that former approach.

>> (A very strange feature. print(~) just shows SKIP. Assigning it to an
>> int stores 1. Assigning to a REAL stores some random value. Assigning
>> to a string stores "".
>> At the least I would have expected 0 and 0.0 to be stored for
>> numbers, ie. 'empty' or 'zero'.)
>
>     It's a "skip" token.  If you want 0 or 0.0 then write that,
> don't tell the compiler that you don't care and then be surprised by
> the result.

So, a very strange feature. Effectively it returns some random value?

Sometimes a value is unimportant, but you still want a program to give
consistent, repeatable results.

> "What drink do you want?" "Don't care." ... "Why have
> you brought me iced tea, I was expecting white wine."

An empty glass every time?


> [...]
>> (My editor also displays 60 lines, not 25.)
>
>     Lucky you.  Most of my screens are not tall enough for that

Either you are still using a display from the 1970s, or you need to use
a giant font (or maybe you're posting from a smartphone).

At times I've also turned my display 90 degrees (portrait mode) to
better view scanned documents and such, a whole page at a time.

Andy Walker

Nov 12, 2021, 3:12:52 PM
On 11/11/2021 20:59, Bart wrote:
>>> [...] I think hard-coded
>>> precedences, fixed in the grammar, are better.
>>      Yes, I understand that to be your opinion.  But the only
>> rationale you have produced for that is that rogue programmers
>> can abuse declared priorities.  Oh, and that your own compiler
>> can't handle it.
> You misunderstood. My compilers for decades have had table-driven
> expression parsers, but with non-alterable priorities. I'm using
> grammar-driven parsers now as they are clearer, and make it easier to
> add specialist handling to some operators (eg. chained comparisons).

/You/ described Algol 68 operators as being "unparsable".
In general, that's manifestly untrue as A68 compilers have been
around for over half a century; what else am I to understand from
your comment than that you were referring to /your/ compiler rather
than Joe Bloggs's compiler? If you're saying that you now /can/
compile the devious examples that you drew attention to earlier,
then good; well done.

>> Are we inventing languages for rogue programmers
>> or for productivity?
> Both mine are for my own productivity. That means 90% working on my
> code instead of 90% battling the compiler.

Personally, I find it easier and more productive to work
/with/ the compiler rather than against it, and then spend 100%
on my own code. But then, I have, of course, managed to read the
RR and even to read MvdV's description of A68G.

> I don't suffer from the lack of user-defined ops in the stuff I do.

Nor do I.

> But I /would/ suffer from the lack of many of my features, such as
> 'tabledata', using a foreign language like yours.

Well, yes, if you like piggling with compilers and your
own language, then of course it will have your features rather
than mine. That's no use to me or anyone else here, as your
compiler is both changing under our feet and undocumented, so
it's [as far as we can tell] full of things we don't know about
and so can't use, even if we snaffle your binary.

>>> Then disallow it.
>>      It's 46 years too late to change the RR.
> Why does it have to be fixed? Other languages evolve.

Yes, and either it's a total incompatible mess or else
[more sensibly] the versions are called [eg] "K&R C", "C99", ...
"C2x". The RR is in hard copy. We don't have magic typesetters
to re-write history. If you want an Algol 23, contact IFIP and
offer your services. Good luck in finding enough people with
enough interest. It's more important to me to be able to run
the A68 programs I've accumulated over ~50 years than to write
a somewhat similar language with cosmetic changes and no real
advantages. A68G will see me out; Marcel has added some
interesting things [partial parametrisation and Torrix, amongst
others, but sadly not yet modals] without losing backwards
compatibility.

> OK, my question should have been, how do you /implement/ "+" when "+"
> meaning 'add' doesn't yet exist. There must either be some core
> BUILT-IN features to make it practical, or "+" is secretly built-in
> already.

The same way as every other language, I expect. The usual
way is to build an AST and walk it generating code [or some other
representation]; if you hit a "+" node at the top level, you emit
suitable machine [or whatever] code, using the "as if" rule. Not
interestingly different from how you implement "if ... then". Or
you could look at the A68G sources to see how Marcel does it.
[Other A68 compilers are available.]

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Praetorius

Andy Walker

Nov 12, 2021, 3:37:13 PM
On 12/11/2021 00:59, Bart wrote:
> I was asking how arrays came in it. You said of course there has to
> be an array. But not really; you don't need arrays for:
>    a + b + c
> so why do you need them for:
>    a < b < c

??? I don't and didn't. There is no array in [from my OP]

OP +< = (INT i,j) IB: ( i < j | (j, TRUE) | (~, FALSE) )

[which was the operator used in "a +< b < c"].
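
For anyone who does not read A68, a rough Python analogue of that idea: the first comparison returns a pair (right operand, success flag) so that the next comparison has something to work with, and no array is involved. The names IB, link and finish are invented here, and finish (the final "<" taking a pair on its left) is an assumption about the rest of the original code, which is not quoted above.

```python
from typing import Optional, Tuple

# Stand-in for the (INT, BOOL) structure returned by the dyadic operator.
IB = Tuple[Optional[int], bool]

def link(left, j: int) -> IB:
    """Analogue of 'a +< b': compare, then carry the right operand forward."""
    if isinstance(left, tuple):      # left is a pair from an earlier link
        value, ok = left
        if not ok:
            return (None, False)     # an earlier comparison already failed
    else:
        value = left
    return (j, True) if value < j else (None, False)

def finish(left: IB, k: int) -> bool:
    """Analogue of a final '<' with a pair on its left (assumed, not quoted)."""
    value, ok = left
    return ok and value < k

# Corresponds to 'a +< b +< c < d' with a, b, c, d = 1, 2, 3, 4.
print(finish(link(link(1, 2), 3), 4))
```

Each operand is passed along as a value, so the middle terms are evaluated once, at the cost of the extra calls and pair construction mentioned earlier in the thread.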

> The answer was that your A68 solution needs to use arrays to make it
> work even for a handful of scalar values.

No, there was an array [unsurprisingly] in the definition
of a /monadic/ "+<" designed to check whether the elements of the
array passed as its parameter are in order.

> And a side-effect of that
> was to be able to do < (10,20,30) (even if needing a cast).

Yes, that was the whole purpose [not a side effect] of
choosing to define the monadic operator. It would have more point
if we had

[1000] INT a;
FOR i TO 1000 DO a[i] := ... whatever OD;
IF +< a THEN ...

rather than

IF a[1] < a[2] AND a[2] < a[3] AND ... AND # many lines later #
a[999] < a[1000] THEN ...
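
The same check over a whole array can be sketched in Python as a one-line loop over neighbouring pairs. The function name strictly_increasing is made up for the example; it only illustrates what such a monadic operator is meant to compute, not how the A68 version was written.

```python
def strictly_increasing(a):
    """True iff every element is less than its successor."""
    return all(x < y for x, y in zip(a, a[1:]))

print(strictly_increasing([10, 20, 30]))
print(strictly_increasing([10, 30, 20]))
```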

> This is yet another problem. Your script to generate that lot may be
> short, but it would still be 1400 lines of dense code to end up doing
> a poorer, slower emulation of something my language can do in 100
> non-dense lines because it is natively supported.
> You may also need to process all that extra code when a program
> doesn't use the feature.

So I wrote 15 lines to demonstrate a point of principle in
a completely unimportant application [I have much more important
uses of user-defined operators in my real code]; and you wrote
100 lines to implement that unimportant application. Whatever.
None of my [real] programs will ever use that application, so
the 1400 lines will never get written and even less processed.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Praetorius

Bart

Nov 12, 2021, 4:22:47 PM
On 12/11/2021 20:37, Andy Walker wrote:
> On 12/11/2021 00:59, Bart wrote:
>> I was asking how arrays came in it. You said of course there has to
>> be an array. But not really; you don't need arrays for:
>>     a + b + c
>> so why do you need them for:
>>     a < b < c
>
>     ???  I don't and didn't.  There is no array in [from my OP]
>
>   OP +< = (INT i,j) IB: ( i < j | (j, TRUE) | (~, FALSE) )
>
> [which was the operator used in "a +< b < c"].
>
>> The answer was that your A68 solution needs to use arrays to make it
>> work even for a handful of scalar values.
>
>     No, there was an array [unsurprisingly] in the definition
> of a /monadic/ "+<" designed to check whether the elements of the
> array passed as its parameter are in order.

Yes, that's the one I mean. I managed to delete that now and the rest
still worked. I tried that before but it gave problems (but I've since
added support in my editor for A68G comments, making it easier to
comment out a block).

So ... well I won't say anything more about that array [still not sure
why it's there!]


>> This is yet another problem. Your script to generate that lot may be
>> short, but it would still be 1400 lines of dense code to end up doing
>> a poorer, slower emulation of something my language can do in 100
>> non-dense lines because it is natively supported.
>> You may also need to process all that extra code when a program
>> doesn't use the feature.
>
>     So I wrote 15 lines to demonstrate a point of principle in
> a completely unimportant application [I have much more important
> uses of user-defined operators in my real code];  and you wrote
> 100 lines to implement that unimportant application.  Whatever.
> None of my [real] programs will ever use that application, so
> the 1400 lines will never get written and even less processed.

I'm not suggesting you should have done all that for a demo.

I /am/ suggesting that would be a poor way of adding such a feature to a
language, by retro-fitting it via user-code. My 100 lines handle all
type combinations for which the relevant operators are already defined
in non-chained form.


[A note to DB and DAK if they are reading this: I'm not going to be
reading their posts. Sorry about the two unread ones. If they want the
last word, they now have it.]

Bart

Nov 12, 2021, 4:53:08 PM
On 12/11/2021 20:12, Andy Walker wrote:
> On 11/11/2021 20:59, Bart wrote:


>> But I /would/ suffer from the lack of many of my features, such as
>> 'tabledata', using a foreign language like yours.
>
>     Well, yes, if you like piggling with compilers and your
> own language,

I'm stuck with it now; I'm too spoilt to use anyone else's. But then,
it's actually pretty good, which I didn't appreciate until I /tried/ to
use others, because I did get fed up with maintaining it.


> then of course it will have your features rather
> than mine.  That's no use to me or anyone else here, as your
> compiler is both changing under our feet and undocumented, so
> it's [as far as we can tell] full of things we don't know about
> and so can't use, even if we snaffle your binary.

There are examples of 'tabledata' used here:

https://github.com/sal55/langs/blob/master/Examples/ax_tables.m

Mainly, it defines parallel sets of enums, and corresponding data
arrays. Or sometimes just parallel arrays.
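
For readers who have not seen the feature, a rough Python analogue of the idea as described: one table from which an enumeration and its parallel arrays are both derived. The rows and the names TABLE, Opcode, mnemonic and noperands are invented for illustration and are not taken from ax_tables.m.

```python
from enum import IntEnum

# One row per entry: (enum name, mnemonic text, operand count). Invented data.
TABLE = [
    ("m_nop", "nop", 0),
    ("m_push", "push", 1),
    ("m_mov", "mov", 2),
]

# The enum and the parallel arrays are all generated from the same rows,
# so they cannot drift out of step with each other.
Opcode = IntEnum("Opcode", [row[0] for row in TABLE], start=0)
mnemonic = [row[1] for row in TABLE]
noperands = [row[2] for row in TABLE]

print(mnemonic[Opcode.m_mov], noperands[Opcode.m_mov])
```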

Trying to do the same in C requires using ugly 'x-macros'.

Before I had this feature, I had to use tables in text files and scripts
to generate the arrays as code. Having it built-in is much better. And
it's easy to implement!

(The "$" you see returns the last enum name as a string literal. This
file defines stuff for my x64 assembler.)


>>>> Then disallow it.
>>>      It's 46 years too late to change the RR.
>> Why does it have to be fixed? Other languages evolve.
>
>     Yes, and either it's a total incompatible mess or else
> [more sensibly] the versions are called [eg] "K&R C", "C99", ...
> "C2x".  The RR is in hard copy.  We don't have magic typesetters
> to re-write history.  If you want an Algol 23, contact IFIP and
> offer your services.  Good luck in finding enough people with
> enough interest.  It's more important to me to be able to run
> the A68 programs I've accumulated over ~50 years than to write
> a somewhat similar language with cosmetic changes and no real
> advantages.  A68G will see me out;  Marcel has added some
> interesting things [partial parametrisation and Torrix, amongst
> others, but sadly not yet modals] without losing backwards
> compatibility.


I just find it astonishing that after 50 years of practical use, even if
meagre, no one has ideas for enhancements. Was it really that perfect?

I'd have loads. Another is to replace that 'm OF p' business with the
commonly used 'p.m', or allow both. 'm OF p' just scans badly and tends
to bleed into surrounding terms.


>> OK, my question should have been, how do you /implement/ "+" when "+"
>> meaning 'add' doesn't yet exist. There must either be some core
>> BUILT-IN features to make it practical, or "+" is secretly built-in
>> already.

No, how do you implement it in user code, not inside the compiler? Unless
I misunderstood, you said that ordinary operators like "+" and "-"
don't exist in the core language; they all have to be defined via a
prelude written in Algol68.

Then how would you write a function to add two integers, say, when "+"
does not exist.

But I guess some magic is used here.

Andy Walker

Nov 14, 2021, 6:06:50 PM
On 12/11/2021 21:53, Bart wrote:
[Algol 68:]
> I just find it astonishing that after 50 years of practical use, even
> if meagre, no one has ideas for enhancements. What is really that
> perfect?

There were then, and have been since, lots of ideas for
improvement. But you can't change Algol 68 any more than we can
change C99. There's a hint in the name. Further, Marcel's A68G
would not have been anywhere near as interesting had he not worked
very hard for compatibility with A68 as it /is/, not as /you/ might
want it to look -- esp if you propose to throw away useful syntax.
Some of the obvious enhancements have made it into A68G, such as
partial parametrisation, [parts of] Torrix, a lot of linear algebra
and other packages, a plotting library, enhancements to loops and
conditionals, interface with Linux, Curses, sound, ....

> I'd have loads. Another is to replace that 'm OF p' business with the
> commonly used 'p.m', or allow both. 'm OF p' just scans badly and
> tends to bleed into surrounding terms.

See Marcel's book, p60. But you still have to work quite
hard [or be v unlucky] to write genuine A68 that doesn't equally
work with A68G. I've managed to run a /lot/ of old programs
unchanged with A68G.

[...]
> Then how would you write a function to add two integers, say, when
> "+" does not exist.

RR10.2.3.3i. There are actually several possible ways,
some better [IMO] as pure ideas than that section of the RR;
eg you could work from the Peano axioms, or from game theory
[see Conway's book "On Numbers and Games"], or you could
implement the standard "school" method using a 3-d array
indexed by two digits and the carry, or use a version of a
fairly easy Sed script. But there's no point, as ...

> But I guess some magic is used here.
... no matter what the official definition, practical
implementations will use the "as-if" rule to replace "a+b" by
something rather like the machine code "load a, load b, add".
If you think of that as "magic", then so be it.
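
To make the Peano option concrete, here is a minimal Python sketch of addition built only from "zero" and "successor". It is one possible reading of that suggestion, not the RR's definition; the representation (nested tuples) and the names ZERO, succ, add, from_int and to_int are all choices made for this example, and the two conversion helpers exist only so the result can be displayed.

```python
ZERO = ()

def succ(n):
    """Successor: wrap the number in one more tuple layer."""
    return (n,)

def add(a, b):
    # Peano: a + 0 = a ;  a + succ(b') = succ(a + b')
    return a if b == ZERO else succ(add(a, b[0]))

def from_int(k):
    n = ZERO
    for _ in range(k):
        n = succ(n)
    return n

def to_int(n):
    count = 0
    while n != ZERO:
        n = n[0]
        count += 1
    return count

print(to_int(add(from_int(2), from_int(3))))
```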

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Paderewski

Bart

Nov 15, 2021, 9:45:55 AM
On 14/11/2021 23:06, Andy Walker wrote:
> On 12/11/2021 21:53, Bart wrote:
> [Algol 68:]
>> I just find it astonishing that after 50 years of practical use, even
>> if meagre, no one has ideas for enhancements. What is really that
>> perfect?
>
>     There were then, and have been since, lots of ideas for
> improvement.  But you can't change Algol 68 any more than we can
> change C99.  There's a hint in the name.

It didn't stop there being myriad versions of C, many with their own
extensions, which later became official.

>> I'd have loads. Another is to replace that 'm OF p' business with the
>> commonly used 'p.m', or allow both. 'm OF p' just scans badly and
>> tends to bleed into surrounding terms.
>
>     See Marcel's book, p60.

I don't see anything like that on page 60, which seemed to be about
complex numbers. This is that 687-page PDF?

However, I did stumble across the first program below, which is to do
with Hamming Numbers.

It made quite an impression, because it was so dreadful. Only some of
that is to do with the language; mostly just poor style.

You're trying to follow the algorithm, and it turns out most of it is
defining ad hoc operators as it goes along, nothing to do with the
algorithm itself.

The definition of +:= also seems very dodgy: not just appending, but
doing something to the first part of the array too. (It doesn't need it;
it works fine with a normal append.)

My plan was to convert it into my language, but I couldn't do it. I
first refactored the A68 code, into hamm2.a68 below.

This doesn't run: mismatched something or other (I had extreme
difficulty in determining the boundaries of anything). But it was enough
to be able to finally grasp the algorithm. My version of it is in
hamm.q, below.

Hamm1.a68 only works for the first test with small numbers. Above that
it needs more memory, but I can't remember how that is done (--help said
nothing about it).

On mine, it works for the small numbers, and for 1691 (the test calls in
the book), but for 1000000, it needs bignums (use the commented line in
my version). Nothing else needs changing.
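
For reference, the underlying algorithm with the operator definitions stripped away can be sketched in a few lines of Python. This is neither the book's code nor hamm.q; it uses a plain append that keeps the whole series, and Python integers are arbitrary precision, so no separate bignum switch is needed. The function name hamming is chosen for the example.

```python
def hamming(n):
    """Return the n-th Hamming number (1-based), using three cursors."""
    h = [1]                    # the series so far
    i = j = k = 0              # cursors for the multipliers 2, 3 and 5
    m2, m3, m5 = 2, 3, 5       # next candidate from each cursor
    for _ in range(n - 1):
        nxt = min(m2, m3, m5)
        h.append(nxt)
        # Advance every cursor that produced nxt, which avoids duplicates.
        if nxt == m2:
            i += 1
            m2 = 2 * h[i]
        if nxt == m3:
            j += 1
            m3 = 3 * h[j]
        if nxt == m5:
            k += 1
            m5 = 5 * h[k]
    return h[-1]

print([hamming(n) for n in range(1, 11)])
print(hamming(1691))
```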


--------------------------------------------------------------------
hamm1.a68
--------------------------------------------------------------------
PR precision=100 PR
MODE SERIES = FLEX [1 : 0] UNT; # Initially, no elements #
MODE UNT = LONG LONG INT; # A 100-digit unsigned integer #
OP LAST = (SERIES h) UNT: h[UPB h]; # Last element of a series #

PROC hamming number = (INT n) UNT: # The n-th Hamming number #
  CASE n
  IN 1, 2, 3, 4, 5, 6, 8, 9, 10, 12 # First 10 in a table #
  OUT SERIES h := 1, # Series, initially one element #
      UNT m2 := 2, m3 := 3, m5 := 5, # Multipliers #
      INT i := 1, j := 1, k := 1; # Counters #
      TO n - 1
      DO OP MIN = (INT i, j) INT: (i < j | i | j),
            MIN = (UNT i, j) UNT: (i < j | i | j);
         PRIO MIN = 9;
         OP +:= = (REF SERIES s, UNT elem) VOID:
            # Extend a series by one element, only keep the elements you need #
            (INT lwb = i MIN j MIN k, upb = UPB s;
             REF SERIES new s = NEW FLEX [lwb : upb + 1] UNT;
             (new s[lwb : upb] := s[lwb : upb], new s[upb + 1] := elem);
             s := new s
            );
         # Determine the n-th hamming number iteratively #
         h +:= m2 MIN m3 MIN m5;
         (LAST h = m2 | m2 := 2 * h[i +:= 1]);
         (LAST h = m3 | m3 := 3 * h[j +:= 1]);
         (LAST h = m5 | m5 := 5 * h[k +:= 1])
      OD;
      LAST h
  ESAC;

--------------------------------------------------------------------

--------------------------------------------------------------------
hamm2.a68
--------------------------------------------------------------------
MODE UNT = LONG LONG INT; # A 100-digit unsigned integer #

OP MIN = (INT i, j) INT: (i < j | i | j);
OP MIN = (UNT i, j) UNT: (i < j | i | j);
PRIO MIN = 9;

PR precision=100 PR
MODE SERIES = FLEX [1 : 0] UNT; # Initially, no elements #

OP +:= = # Extend a series by one element, #
BEGIN # only keep the elements you need #
REF SERIES s, UNT elem) VOID:
(INT lwb = i MIN j MIN k, upb = UPB s;
REF SERIES news = NEW FLEX [lwb : upb + 1] UNT;
(news[lwb : upb] := s[lwb : upb], news[upb + 1] := elem);
s := news
END;

OP LAST = (SERIES h) UNT: h[UPB h]; # Last element of a series #

PROC hamming number = (INT n) UNT: # The n-th Hamming number #
BEGIN
CASE n
IN 1, 2, 3, 4, 5, 6, 8, 9, 10, 12 # First 10 in a table #
OUT
SERIES h := 1, # Series, initially one element #
UNT m2 := 2, m3 := 3, m5 := 5, # Multipliers #
INT i := 1, j := 1, k := 1; # Counters #
TO n - 1 DO # Determine the n-th hamming number iteratively #
h +:= m2 MIN m3 MIN m5;
IF LAST h = m2 THEN m2 := 2 * h[i +:= 1] FI;
IF LAST h = m3 THEN m3 := 3 * h[j +:= 1] FI;
IF LAST h = m5 THEN m5 := 5 * h[k +:= 1] FI
OD;
LAST h
ESAC
END

--------------------------------------------------------------------

--------------------------------------------------------------------
hamm.q
--------------------------------------------------------------------
function hamm(n) =
case n
when 1..10 then
return (1,2,3,4,5,6,8,9,10,12)[n]
else
h ::= (1,) # ::= makes a mutable copy
m2:=2; m3:=3; m5:=5
! m2:=2L; m3:=3L; m5:=5L # needed for big numbers

i := j := k := 1
to n-1 do
h append:=min(min(m2,m3),m5)
if last(h) = m2 then m2 := 2*h[++i] fi
if last(h) = m3 then m3 := 3*h[++j] fi
if last(h) = m5 then m5 := 5*h[++k] fi
od
return last(h)
esac
end
--------------------------------------------------------------------
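
For comparison, here is a minimal Python sketch of the same three-pointer algorithm as hamm.q above. It is only an illustration, not the book's code and not the Python version it was reportedly derived from; the 0-based indices are an assumption of the sketch. Python integers are arbitrary precision, so no separate bignum line is needed.

```python
def hamm(n):
    """Return the n-th Hamming number (1-based), keeping the whole series."""
    h = [1]                      # series so far
    m2, m3, m5 = 2, 3, 5         # next candidate multiples
    i = j = k = 0                # 0-based positions feeding m2, m3, m5
    for _ in range(n - 1):
        h.append(min(m2, m3, m5))
        # More than one candidate can match, so all three are checked.
        if h[-1] == m2:
            i += 1
            m2 = 2 * h[i]
        if h[-1] == m3:
            j += 1
            m3 = 3 * h[j]
        if h[-1] == m5:
            k += 1
            m5 = 5 * h[k]
    return h[-1]


if __name__ == "__main__":
    print([hamm(n) for n in range(1, 11)])
    print(hamm(1691))
```

This keeps every element, like the plain-append variant, so memory grows linearly with n.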

Andy Walker

unread,
Nov 17, 2021, 3:56:17 PM11/17/21
to
On 15/11/2021 14:45, Bart wrote:
>>> I just find it astonishing that after 50 years of practical use, even
>>> if meagre, no one has ideas for enhancements. What is really that
>>> perfect?
>>      There were then, and have been since, lots of ideas for
>> improvement.  But you can't change Algol 68 any more than we can
>> change C99.  There's a hint in the name.
> It didn't stop there being myriad versions of C, many with their
> extensions, which later became official.

Yes, but the later versions of C99 weren't called C99.
Similarly, other versions of Algol were called things like A68R,
A68RS, A68S, A68C and A68G, and doubtless others I've forgotten.
If you want to invent another language called A68Bart, feel free,
but it won't gain much traction unless (a) disruption to existing
A68 programs is minimal [preferably zero], and (b) it comes with
a detailed specification saying what has changed. Meanwhile,
there will be no Algol 22 unless IFIP comes back from the dead.

>>> I'd have loads.

Note that different stropping regimes are of minimal value;
it is simple [though not trivial] to write a pre-processor [eg] to
convert between your preferred reserved-word strop and the more
usual upper-case strop. The other idea you have touted here of
zapping [most] operators and/or priorities is a no-no; too many
current programs rely on it. OTOH, if you merely want to extend
A68 go to it -- as A68G has been extended in many ways from the
language of the RR [without making existing programs useless].

>>> Another is to replace that 'm OF p' business with the
>>> commonly used 'p.m', or allow both. 'm OF p' just scans badly and
>>> tends to bleed into surrounding terms.

On the other hand, "age of person" reads better than
"person.age". It perhaps depends on what you're used to, and
A68 came very early into the field.

>>      See Marcel's book, p60.
> I don't see anything like that on page 60, which seemed to be about
> complex numbers. This is that 687-page PDF?

My copy is 706 [ie xviii+688] pages. Section 3.8. Note
that the change is non-trivial, see the example at the bottom of
the section.

> However, I did stumble cross the first program below, which is to do
> with Hamming Numbers.
> It made quite an impression, because it was so dreadful. Only some of
> that is to do with the language; mostly just poor style.

Style wars are always unedifying, and are usually caused by
unfamiliarity with the language. If you don't like the style, you
can always run the source through a pretty-printer.

Note that the algorithm is well-known, and I would expect
any CS professional to know the problem and [at least] Dijkstra's
algorithm for solving it. There are lots of versions on the web.
The one here is [allegedly] derived from a version in Python.

[...]
> The definition of +:= also seems very dodgy: not just appending, but
> doing something to the first part of the array too. (It doesn't need
> it; it works fine with a normal append.)

There is no "normal append" in Algol. That's why the program
has to (a) create a new larger array, (b) copy the old part into it,
and (c) add the new element. Feel free to write a new operator, to
go along with "min" and "max", to add to A68Bart. However, this is
a case where copying other languages too slavishly causes problems;
my own version simply created a large-enough array to start with
[e * n^(2/3) is certainly big enough, I haven't explored to see how
little I could get away with] and used it as a circular buffer.
This saved a factor of over 1500 [!] in the time taken to evaluate
the millionth Hamming number; IOW, ~99.93% of the time taken by
the transcribed Python version is taken up by steps (a,b) above.
[This is not the end of the story, as there are much more efficient
ways to calculate Hamming numbers, and there is at least another
factor of 1000 to be gained, but my (crude) program is fast enough
up to about the 10^7-th (or, at a push, 10^8-th) number.]
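
A minimal sketch of the circular-buffer idea, in Python rather than A68. It is not the program described above: the name hamming_ring, the 16 elements of slack on top of e * n^(2/3), and the overflow check are all assumptions added for the illustration.

```python
import math


def hamming_ring(n):
    """Return the n-th Hamming number using one preallocated ring buffer."""
    cap = int(math.e * n ** (2 / 3)) + 16   # assumed bound plus slack
    buf = [0] * cap
    buf[0] = 1
    m2, m3, m5 = 2, 3, 5
    i = j = k = 0            # absolute positions; k is always the oldest
    last = 1
    for t in range(1, n):
        if t - k >= cap:     # would overwrite an element still needed
            raise OverflowError("ring buffer too small")
        last = min(m2, m3, m5)
        buf[t % cap] = last  # no allocation, no copying
        if last == m2:
            i += 1
            m2 = 2 * buf[i % cap]
        if last == m3:
            j += 1
            m3 = 3 * buf[j % cap]
        if last == m5:
            k += 1
            m5 = 5 * buf[k % cap]
    return last


if __name__ == "__main__":
    print(hamming_ring(1691))
```

Only the elements between the slowest pointer (the one feeding the multiples of 5) and the newest element are ever read again, which is why a fixed window is enough.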

[...]
> Hamm1.a68 only works for the first test with small numbers. Above
> that it needs more memory, but I can't remember how that is done
> (--help said nothing about it).
$ a68g --help | grep size
--frame "number": set frame stack size to "number".
--handles "number": set handle space size to "number".
** --heap "number": set heap size to "number". **
--stack "number": set expression stack size to "number".

But that's not actually the problem, though setting the
heap larger gets further. Some versions of A68G have a bug [Gasp!
Quelle horreur! Shock!] by which garbage collection is not
initiated when it should be. My copy works, but you probably
need to call "sweep heap;" at intervals in the execution [p215
in my copy, section 10.10.7b]; "preemptive sweep" may also
work [I haven't tried it]. As above, it's all moot as you can
save many gigabytes of unnecessary allocation and copying and
thereby over 99% of the time by the simple device above.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Ganz

Bart

unread,
Nov 18, 2021, 8:25:30 AM11/18/21
to
On 17/11/2021 20:56, Andy Walker wrote:
> On 15/11/2021 14:45, Bart wrote:
>>>> I just find it astonishing that after 50 years of practical use, even
>>>> if meagre, no one has ideas for enhancements. What is really that
>>>> perfect?
>>>      There were then, and have been since, lots of ideas for
>>> improvement.  But you can't change Algol 68 any more than we can
>>> change C99.  There's a hint in the name.
>> It didn't stop there being myriad versions of C, many with their
>> extensions, which later became official.
>
>     Yes, but the later versions of C99 weren't called C99.
> Similarly, other versions of Algol were called things like A68R,
> A68RS, A68S, A68C and A68G, and doubtless others I've forgotten.
> If you want to invent another language called A68Bart, feel free,

I did that in 1981/82! I called it M.

Now I have two languages both derived from that crude early one: a
systems language, very static, and a more dynamic scripting language.
(Although compared with Python, it might as well be static.)

Both have the same syntax inspired originally by Algol68, but as you've
seen it's more practical with lots of tweaks.

> but it won't gain much traction

That was never the intention, which was just to help do my job
effectively.

(And originally, it was for fun. I got a kick out of writing code in a
homemade 'HLL', on my homemade computer, that a year or two earlier
would have needed a mainframe computer.

I still get that kick.)

>     On the other hand, "age of person" reads better than
> "person.age".  It perhaps depends on what you're used to, and
> A68 came very early into the field.

I think Pascal came out not long after (and before A68 was finalised).
That would have used 'person.age' or 'person^.age', depending on whether
'person' was a reference.

(I copied this until a couple of years ago, when I allowed the "^" to be
dropped [to match the dynamic language].

Finally moving one step nearer to A68! I decided that cleaner code was
worth the loss of transparency. But people could still add "^" if they
wanted.)

>> It made quite an impression, because it was so dreadful. Only some of
>> that is to do with the language; mostly just poor style.
>
>     Style wars are always unedifying, and are usually caused by
> unfamiliarity with the language.  If you don't like the style, you
> can always run the source through a pretty-printer.

In this case, it's simply poor. A pretty printer might change the layout
and indent things more consistently, but it would still leave those OP
definitions in the middle of the algorithm, or leave alone this puzzling
style of declaration (abbreviated):

SERIES h:=1, UNT m2:=2, INT i:=3;

The puzzle was the use of commas instead of semicolons to separate
distinct declarations. Even if the language allows it, you need to
consider clarity especially in a textbook aimed at people learning the
language.

>     Note that the algorithm is well-known, and I would expect
> any CS professional to know the problem and [at least] Dijkstra's
> algorithm for solving it.  There are lots of versions on the web.
> The one here is [allegedly] derived from a version in Python.

The same program is at Rosetta Code (where it benefits a little from
colour highlighting). But looking at other languages that implement the
same algorithm, one of the clearest was AWK.

> [...]
>> The definition of +:= also seems very dodgy: not just appending, but
>> doing something to the first part of the array too. (It doesn't need
>> it; it works fine with a normal append.)
>
>     There is no "normal append" in Algol.

What was odd about +:= here was that it also altered the lower-bound:
discarding earlier parts of the list that were no longer needed. That is
not something you expect from A +:= B which is usually understood to
have the same end result as A := A + B, where "+" is here meant as 'append'.

But here it does something more unusual, in not being a 'pure' function,
as you might expect for an operator, but takes account of the values of
the external i, j, k. I guess that's why it has to be inside the
algorithm, if it's a binary op.

In all, rather messy. A regular function would have been a better bet here.
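
As a rough illustration of that last point, here is a Python sketch of the same "copy, trim and append" step written as a regular function that is passed the lower bound explicitly, instead of reading i, j and k from the enclosing scope. The names extend_series, base and keep_from are hypothetical. Python lists have no adjustable lower bound, so the absolute index of the first element is carried alongside the list.

```python
def extend_series(series, base, keep_from, elem):
    """Return (new_series, new_base).

    series[0] is the element with absolute index base. Elements whose
    absolute index is below keep_from are dropped, then elem is appended.
    Nothing outside the arguments is read or modified.
    """
    drop = max(0, keep_from - base)
    return series[drop:] + [elem], base + drop


def hamming_trimmed(n):
    """n-th Hamming number, trimming the series as the A68 operator does."""
    h, base = [1], 1             # absolute indices start at 1, as in A68
    m2, m3, m5 = 2, 3, 5
    i = j = k = 1
    for _ in range(n - 1):
        h, base = extend_series(h, base, min(i, j, k), min(m2, m3, m5))
        if h[-1] == m2:
            i += 1
            m2 = 2 * h[i - base]
        if h[-1] == m3:
            j += 1
            m3 = 3 * h[j - base]
        if h[-1] == m5:
            k += 1
            m5 = 5 * h[k - base]
    return h[-1]


if __name__ == "__main__":
    print(extend_series([1, 2, 3], 1, 2, 4))
    print(hamming_trimmed(1691))
```

The dependence on the counters is now visible at the call site, at the cost of one extra argument.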
