Add nested-function support in a language the based on a stack-machine

dror....@gmail.com

unread,

Feb 12, 2018, 3:51:19 PM2/12/18

to

Suppose I have a simple C-like programming language:

int foo() {
int x = 10;
int bar(y int) {
return y * 2
}
return bar() + x
}

Like you can see, it supports nested functions.

I already implemented the lexing-parsing phases, and now I'm working
on the code generation and the stack-machine. most of the operations,
and the simple flow-control already implemented but I need some
help/ideas how to solve the nested-function task.

The stack-machine has two registers, accumulator, and a temporary register.

Each frame contains: `[arguments, local variables, return address,
caller frame and stack pointer, temporaries and data operations]`. I
read about two ways to implement the nested-function. one, using an
activation-link (also called static-link), and display. Is there any
better idea to handle this? If I'll choose one of these idea, is it
compile time calculation or do I need to store something in runtime?
if it's a runtime thing, what's the time complexity for it? is it O(1)
operation, or something like O(k) where k is the depth of the
function.
[This at least used to be a standard topic in compiler texts. You should
be able to do either in O(1). -John]

dror....@gmail.com

unread,

Feb 12, 2018, 5:29:35 PM2/12/18

to

> [This at least used to be a standard topic in compiler texts. You should
> be able to do either in O(1). -John]

I'm not sure how to calculate it in compile time, because the call frame
doesn't have a fixed size, because I'm using the same stack for storing the
call-frames and for the operations evaluation (iadd, isub, etc..). Maybe I
should split it into two stacks, one for the call frames, and the second for
the operation evaluations. I just sharing my thoughts.

[I would think that your compiler should always know how many partial
results are on the stack above the call frame so it should be able to
use a fixed offset to get to the chain pointers at the base of call
frame. -John]

Louis Krupp

unread,

Feb 13, 2018, 11:14:33 AM2/13/18

to

On Mon, 12 Feb 2018 11:25:36 -0800 (PST), dror....@gmail.com wrote:

>Suppose I have a simple C-like programming language: ...

>Like you can see, it supports nested functions.

gcc supports nested functions as an extension to C. Compiling this
program with -O0 -fdump-tree-all and looking at the generated files
might give you an idea of one way to do it:

#include <stdio.h>

int main(void)
{
int k1 = 1;
int p1(void)
{
int k2 = 2;
int p2(void)
{
int k3 = 3;
int p3(void)
{
return k1 + k2 + k3;
}
return p3();
}
return p2();
}
int n;

n = p1();
printf("%d\n", n);

return 0;
}

Apparently, gcc chains stack frames, and accessing uplevel variables,
as when p3 uses k1 and k2, seems like it would take O(n) time, where n
is the nesting level.

The alternative, a display vector, seems like it would be easier to
implement unless you're doing this for a real machine with a real (and
therefore limited) set of registers.

Louis

Hans-Peter Diettrich

unread,

Feb 13, 2018, 11:15:26 AM2/13/18

to

ACK

Simple compilers use a dedicated base pointer register (x86: eBP) for
their stack frame, but clever compilers will track the current stack
pointer offset to that base address.

The calling frames can have a variable size, when the local function is
called from various places in the code. I.e. the address of a local
variable in an outer scope can vary, fixed is only the variable's offset
to its own frame start. That's why the chain of all intermediate frame
pointers must be followed, until the frame of the variable of interest
is found.

Things can become worse if one of the calling functions is called
recursively, so that a variable number of frames has to be traversed
until the outmost frame is found.

DoDi

Kaz Kylheku

unread,

Feb 13, 2018, 8:08:15 PM2/13/18

to

The problem is that these functions can recurse, including mutually.
So that is to say, you can end up with a situation in which you have
a single activation of main with a single k1 variable, but multiple
activations of p2, with multiple k2's all being live at the
same time.

To complicate things, the multiple active p2 functions can be lifted
into function pointers, all of which are simultaneously valid.

Each of those p2 pointers could be called back by some external function.
Those calls have to establish an environment which resolves to the
correct k2, and at the same time to the one and only k1.

That doesn't rule out display vectors, but it seems like each function
activation has to have its own complete display vector.

For instance the p2 level functions need a two-level vector. Under
the same main invocation, they share the vec[0] entry pointing to
the main frame; but each one has a different vec[1] referencing its
own frame.

When a p3 executes it has its own 3 element vec[]. The vec[0] and
vec[1] are copied from the parent p2. If a pointer to a p3 is captured,
then its vec[1] correctly references the particular dynamic p2 instance
in which it is scoped; vec[2] references its own environment.

Under this scheme, we basically we need o(n^2) storage for the vectors,
for n nesting levels in the absence of recursion.

The compiler can determine which functions aren't subject to indirection
and avoid allocating the trampoline space for that.

If functions are downward funargs only (turn to garbage when they go out
of scope), the trampoline space needed to capture a given function
invocation can be part of the activation record of the block in which
that function is declared. When that block terminates, the function is
garbage, and so is its trampoline.

> implement unless you're doing this for a real machine with a real (and
> therefore limited) set of registers.

Hmm. The vectors can be treated like any array, or set of local
variables (v0, v1, ..). We generate abstract AST code to load these and
use them; then that code gets subject to register allocation like
anything else.
[As I recall, you do need a display with a pointer for each lexical scope,
but so long as you don't expect closures to work, you can do it in
constant time with a fixed location where each scope stores its
current frame pointer. If you want closures, then you're into tree
shaped stacks and garbage collected stack frames. -John]

George Neuner

unread,

Feb 13, 2018, 10:20:22 PM2/13/18

to

On Wed, 14 Feb 2018 00:59:34 +0000 (UTC), Kaz Kylheku <217-67...@kylheku.com> wrote:
>On 2018-02-13, Louis Krupp <lkr...@nospam.pssw.com.invalid> wrote:
>> On Mon, 12 Feb 2018 11:25:36 -0800 (PST), dror....@gmail.com wrote:
>>
>>>Suppose I have a simple C-like programming language: ...
>>>Like you can see, it supports nested functions.
>>
>> gcc supports nested functions as an extension to C. Compiling this
>> program with -O0 -fdump-tree-all and looking at the generated files

>> might give you an idea of one way to do it: ...

>> Apparently, gcc chains stack frames, and accessing uplevel variables,
>> as when p3 uses k1 and k2, seems like it would take O(n) time, where n
>> is the nesting level.
>>
>> The alternative, a display vector, seems like it would be easier to
>
>The problem is that these functions can recurse, including mutually.
>So that is to say, you can end up with a situation in which you have
>a single activation of main with a single k1 variable, but multiple
>activations of p2, with multiple k2's all being live at the
>same time.
>
>To complicate things, the multiple active p2 functions can be lifted
>into function pointers, all of which are simultaneously valid.
>
>Each of those p2 pointers could be called back by some external function.
>Those calls have to establish an environment which resolves to the
>correct k2, and at the same time to the one and only k1.
>
>That doesn't rule out display vectors, but it seems like each function
>activation has to have its own complete display vector.

As long as each function creates a new stack frame when called, then
recursion is not relevant and a single display is sufficient to handle
the nesting.

The display tracks lexical parentage [who defined who], not call
chain parentage. Each display entry points to the stack frame of the
last function invoked that was defined at the corresponding lexical
level.

The preamble of a function at, e.g., level 3, preserves display[3] in
its own stack frame, and sets display[3] to point to its own frame.
The postscript of the function restores the original display[3] value
when the function ends.
[Obviously, the compiler must keep track of lexical levels in order
that each function update the proper display entry.]

When a function at level 4 needs something at level 2, it just looks
to display[2] to find the proper stack frame. Regardless of
intervening recursions, the lexical environment is maintained:
display[2] always will point to the latest invocation of the level 2
function which contains the current (level 4) function.

>For instance the p2 level functions need a two-level vector. Under
>the same main invocation, they share the vec[0] entry pointing to
>the main frame; but each one has a different vec[1] referencing its
>own frame.
>
>When a p3 executes it has its own 3 element vec[]. The vec[0] and
>vec[1] are copied from the parent p2. If a pointer to a p3 is captured,
>then its vec[1] correctly references the particular dynamic p2 instance
>in which it is scoped; vec[2] references its own environment.
>
>Under this scheme, we basically we need o(n^2) storage for the vectors,
>for n nesting levels in the absence of recursion.

No, you're missing something. I just haven't figured out what.

>The compiler can determine which functions aren't subject to indirection
>and avoid allocating the trampoline space for that.

With the classic implementation, every non-leaf function has to do its
part to maintain the lexical environment. But the allocation overhead
is just one saved pointer per stack frame.

>[As I recall, you do need a display with a pointer for each lexical scope,
>but so long as you don't expect closures to work, you can do it in
>constant time with a fixed location where each scope stores its
>current frame pointer. If you want closures, then you're into tree
>shaped stacks and garbage collected stack frames. -John]

You still do have closures of sorts ... they just are distributed
through the stack frames of the current function call chain.

A display cannot handle persistent closures, such as in Lisp, where a
nested function may be invoked after its parent terminates, outside of
the runtime environment that defined it.

George

George Neuner

unread,

Feb 13, 2018, 11:12:07 PM2/13/18

to

On Tue, 13 Feb 2018 00:42:00 -0700, Louis Krupp
<lkr...@nospam.pssw.com.invalid> wrote:

>On Mon, 12 Feb 2018 11:25:36 -0800 (PST), dror....@gmail.com wrote:
>
>>Suppose I have a simple C-like programming language: ...
>>Like you can see, it supports nested functions.
>
>gcc supports nested functions as an extension to C. Compiling this
>program with -O0 -fdump-tree-all and looking at the generated files

>might give you an idea of one way to do it: ...

>The alternative, a display vector, seems like it would be easier to
>implement unless you're doing this for a real machine with a real (and
>therefore limited) set of registers.

The display doesn't need to be in registers [though the pointer to the
display itself should be]. Since every non-leaf function will touch
some part of the display, it will tend to stay in cache close to the
CPU.

And a display doesn't need to be large to be effective. Lexical
definition hierarchies tend to be quite shallow. Studies of nested
functions done back in the days of Pascal showed that 16 entries
sufficed for all but a tiny percentage of code.

Also remember that OO is an orthogonal dimension: the depth of an
object hierarchy will not be limited by using a display.
However if you mix objects and display nested functions, I *think* it
would be necessary to maintain a separate display instance per object
type.

George

George Neuner

unread,

Feb 14, 2018, 1:08:44 PM2/14/18

to

On Mon, 12 Feb 2018 11:25:36 -0800 (PST), dror....@gmail.com wrote:

Using activation pointers in the stack frames, you do end up with an
O(n) list. To get something M lexical levels away, you need to follow
M pointers.

A display is O(1) to use: to get from, e.g., level 4 to level 2, the
level 4 function just looks at the pointer in display[2].

Both are O(1) to maintain at runtime - just a pointer save/restore.

Whether they work for you depends on the language. These methods work
for languages like Pascal, but not for languages like Lisp where
nested functions can be called after their parent has termninated and
the stack environment that created them has disappeared.

And as I mentioned in another post, there is an issue mixing displays
with object oriented languages.

If you prefer to go with more advanced compiler techniques, there is
either closure conversion or lambda lifting. They can get pretty
involved [again depending on the language], so rather than us trying
to explain them here, it is better that you read about them and ask
questions if you don't understand something.

>[This at least used to be a standard topic in compiler texts. You should
>be able to do either in O(1). -John]

I have never actually seen lambda lifting (as such) covered in any
compiler text ... I learned of it perusing research papers.

But certainly the other methods should be in any book since ~1990.

Not that lambda lifting is that hard to grasp: once you understand
closure conversion, it's obvious that lambda lifting is simply a (more
complicated) optimization for situations that don't require closure
persistence. Of course, in such situations, you could just choose to
stack allocate the closure rather than heap allocating it. Once
you're committed to the code rewriting necessary to do conversion, the
extra needed to do lambda lifting may be less of a concern.

YMMV,
George

Kaz Kylheku

unread,

Feb 14, 2018, 1:10:20 PM2/14/18

to

Yes; it is clear that there can be a single display vector which is
associated with the entire lexical scope as a whole, and then the
local scopes just mutate the entries in the vector, taking care
to save and restore them (which requires some local storage, by the
way, but less than in the scheme I describe).

The problem with this saving and restoring is that it's not thread safe.
If there are multiple invocations of the p2 scope, and they can be
entered by independent threads, we're in trouble if there is a single
pointer to that scope in a single display. (Excluding that from being a
requirement is fairly reasonable.)

The model I described simply replicates the display, allowing immutable
local copies of it in each frame. When a scope is entered via
indirection, it just installs a context pointer referencing the display.
(Of course, that context pointer has to be saved and restored.
It could just be the regular frame pointer, with the display being
at some well known offset in the frame.)

> When a function at level 4 needs something at level 2, it just looks
> to display[2] to find the proper stack frame. Regardless of
> intervening recursions, the lexical environment is maintained:
> display[2] always will point to the latest invocation of the level 2
> function which contains the current (level 4) function.

(Under what I described, that display[2] is in that function's own
display[] array rather than a shared one. That display[2] is
initialized and never mutated.)

> No, you're missing something. I just haven't figured out what.

I'm missing the space optimization that a single display can be
(serially) shared, in spite of multiple activations of scopes, as long
as the entries are saved and restored.

Shoefoot

unread,

Feb 14, 2018, 3:32:06 PM2/14/18

to

On Monday, February 12, 2018 at 3:51:19 PM UTC-5, dror....@gmail.com wrote:
> one, using an activation-link (also called static-link), and display.

I'm a big fan of Display stacks. They are easy to implement and fast at runtime.
However they do break down with setjmp/longjmp. The Display stack becomes
part of the state and must be pushed with the setjmp. There is also some
considerations for the debugger. The debugger has to walk the stack rather than
use the Display.

George Neuner

unread,

Feb 17, 2018, 10:42:48 AM2/17/18

to

I understand, and you're correct that the display needs to be counted
as part of the thread switch context. But each thread also would be
using from a separate stack, so having the entire display saved in
each frame is overkill when all that's needed is a single pointer.

If you're suggesting that the stacks be shared among threads, then
you're no longer talking about "threads" per se, but rather about
"scheduler activations", which are subtly different from threads.
I'm not aware of any non-research system that is based on activations.

>> When a function at level 4 needs something at level 2, it just looks
>> to display[2] to find the proper stack frame. Regardless of
>> intervening recursions, the lexical environment is maintained:
>> display[2] always will point to the latest invocation of the level 2
>> function which contains the current (level 4) function.
>
>(Under what I described, that display[2] is in that function's own
>display[] array rather than a shared one. That display[2] is
>initialized and never mutated.)
>
>> No, you're missing something. I just haven't figured out what.
>
>I'm missing the space optimization that a single display can be
>(serially) shared, in spite of multiple activations of scopes, as long
>as the entries are saved and restored.

No worries. IME, displays don't get much respect from modern
textbooks - they are mentioned mostly in passing. The basic problem
is that, while they are quite useful for an Algol-like language, they
can be problematic for an OO or functional language.

George

Kaz Kylheku

unread,

Feb 17, 2018, 11:24:41 AM2/17/18

to

The address of a local function can be taken. When that function is
called from any context, possibly another thread, it has access to the
locals that were lexically apparent there.

> If you're suggesting that the stacks be shared among threads, then
> you're no longer talking about "threads" per se, but rather about

Stack *access* is shared among threads. E.g. a thread can register
a stack object in a shared queue (such as a synchronization wait queue).
As long as it is dequeued before that frame terminates, everything is
cool. Commonly done.

If a downward-only lexical closure is implemented as a pointer into some
threads's stack memory (plus a function) that object can be passed to
another thread and used (as long as the context which created that
closure doesn't terminate).

When that sort of thing is done with threads, we cannot have a single
display array per lexical scope being mutated.

If the stack closures maintain local displays which are immutable,
then it can work.

The function which is called from the other thread is using that
thread's stack, of course, for the parameter passing and return and
all. Also for its freshly instantiated locals. However, the material
in the captured lexical scope is in the original stack. So is the
display vector which accesses the multiple levels of it.

The display is only used for acessing the lexical things outside of the
local function's body; its own locals and parameters are on the local
stack.

Ahd here is the kicker: *that* local environment can be captured by that
thread, together with that other environment. No problem; that thread
has a copy of the display in its stack frame and extends it with a pointer
to its own stuff. So now there is a display which references two
different stacks.

> No worries. IME, displays don't get much respect from modern
> textbooks - they are mentioned mostly in passing.

This is probably because of

1) a few decades of dominance of C, C++ and Java whose standard dialects
don't have scope capturing local functions. So no interest in the
mainstream software dev in local functions as such.
OOP techniques somewhat make up for the lack of closures. You use
an object with a method; the captured stuff is in the object.
Languages with "class scope" make it easy for the code in a class
method to access the object with simple, unqualified name references
like "foo", so that at least the body of a method is comparably
convenient to that of a closure.

2) real languages have proper lexical closures with indefinite
lifetimes. The display concept is still relevant but probably turn into
something else under a different name. (We are seeing much more use
of and interest in languages that have closures; but that interest is
largely in new ones that are poorly optimized.)

> The basic problem
> is that, while they are quite useful for an Algol-like language, they
> can be problematic for an OO or functional language.

Displays, or something equivalent, is necessary to work out the nesting
of captured lexical scopes. Different contours of the scope have dynamic
activations that differ in lifetimes and have to be separate objects.
Those objects somehow have to be centrally referenced from places that
*somehow* have simultaneous access to them.

In a purely functional language, a simplifying aspect is that bindings
are immutable. We can capture a lexical scope by copying it; the user
cannot tell the difference between that and the original scope.
If variables are mutable, you know that you have a copy because a
variable assignment in the original scope isn't reflected in the
captured one.

Anton Ertl

unread,

Feb 17, 2018, 12:34:50 PM2/17/18

to

Kaz Kylheku <217-67...@kylheku.com> writes:
>On 2018-02-15, George Neuner <gneu...@comcast.net> wrote:
>> No worries. IME, displays don't get much respect from modern
>> textbooks - they are mentioned mostly in passing.
>
>This is probably because of

This is because displays were found to be more costly for Algol-like
languages (there was an influential paper that convinced everyone of
that). IIRC the additional cost is in updating the display on calls
and returns.

Unfortunately, I don't have a reference to the paper above. You can
probably find a reference in old compiler books.

Of course, given that you are using a different language that probably
uses nested functions quite differently from what was used in that
paper, you may want to reevaluate the balance of display vs. static
chains.

>> The basic problem
>> is that, while they are quite useful for an Algol-like language, they
>> can be problematic for an OO or functional language.

The funny thing is that, while displays were found to be suboptimal
for Algol-like languages, they can be used for type inclusion tests,
used in many OO languages [Cohen:acm:toplas:1991]. There are a bunch
of other solutions to this problem, so I don't know if displays are
used for this purpose in current systems, though.

@Article{Cohen:acm:toplas:1991,
author = "Norman H. Cohen",
title = "Type-Extension Type Tests Can Be Performed In Constant
Time",
journal = "ACM Transactions on Programming Languages and
Systems",
volume = "13",
number = "4",
pages = "626--629",
month = oct,
year = "1991",
refs = "2",
checked = "19940624",
source = "Dept. Library",
keywords = "class, descriptor, display, extensible data type,
inheritance, membership test, object-oriented
programming, type extension, type test",
note = "Technical Correspondence",
abstract = "Wirth's proposal for type extensions includes an
algorithm for determining whether a give value belongs
to an extension of a given type. In the worst case,
this algorithm takes time proportional to the depth of
the type-extension hierarchy. Wirth describes the loop
in this algorithm as ``unavoidable,'' but in fact, the
test can be performed in constant time by associating a
``display'' of base types with each type descriptor.",
xref = "Wirth:acm:toplas:1988",
reffrom = "Corney:Gough:plasa:1994",
}

I dimly remember a letter to the editor by Wirth where he appreciated
that finally a good use for the display had been found.

>Displays, or something equivalent, is necessary to work out the nesting
>of captured lexical scopes. Different contours of the scope have dynamic
>activations that differ in lifetimes and have to be separate objects.
>Those objects somehow have to be centrally referenced from places that
>*somehow* have simultaneous access to them.

Static link chains work fine.

- anton
--
M. Anton Ertl
an...@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/

George Neuner

unread,

Mar 1, 2018, 1:48:28 PM3/1/18

to

On Sat, 17 Feb 2018 16:39:04 GMT, an...@mips.complang.tuwien.ac.at
(Anton Ertl) wrote:

>Kaz Kylheku <217-67...@kylheku.com> writes:
>>On 2018-02-15, George Neuner <gneu...@comcast.net> wrote:
>>> No worries. IME, displays don't get much respect from modern
>>> textbooks - they are mentioned mostly in passing.
>>
>>This is probably because of
>
>This is because displays were found to be more costly for Algol-like
>languages

More costly than what?

>(there was an influential paper that convinced everyone of
>that). IIRC the additional cost is in updating the display on calls
>and returns.

Which is no different from the overhead to maintain static links, and
faster to use when you need non-locals more than one lexical level
distant.

>Unfortunately, I don't have a reference to the paper above. You can
>probably find a reference in old compiler books.

I'd really like to see that paper to find out what they are comparing
to display [and using what language].

It certainly is true that closeure conversion is a better solution
overall because it flattens all non-local accesses to use a single
indirection ... but it also significantly complicates the compiler,
and creates a need for handling wide pointers (or equivalent). Many
useful languages simply don't need that extra complexity.

>Of course, given that you are using a different language that probably
>uses nested functions quite differently from what was used in that
>paper, you may want to reevaluate the balance of display vs. static
>chains.

I would never use static chains ... if display wasn't viable I would
jump straight to closure conversion.

Thanks for that pointer ... going to have to read that.

>>Displays, or something equivalent, is necessary to work out the nesting
>>of captured lexical scopes. Different contours of the scope have dynamic
>>activations that differ in lifetimes and have to be separate objects.
>>Those objects somehow have to be centrally referenced from places that
>>*somehow* have simultaneous access to them.
>
>Static link chains work fine.

Yes, they do work - but they are the lowest performance option for
non-local accesses.

>- anton
George

George Neuner

unread,

Mar 1, 2018, 1:49:07 PM3/1/18

to

On Sat, 17 Feb 2018 16:13:10 +0000 (UTC), Kaz Kylheku

<217-67...@kylheku.com> wrote:

>On 2018-02-15, George Neuner <gneu...@comcast.net> wrote:
>> On Wed, 14 Feb 2018 18:06:40 +0000 (UTC), Kaz Kylheku
>><217-67...@kylheku.com> wrote:
>>
>>>On 2018-02-14, George Neuner <gneu...@comcast.net> wrote:

>> ... you're correct that the display needs to be counted

>> as part of the thread switch context. But each thread also would be
>> using from a separate stack, so having the entire display saved in
>> each frame is overkill when all that's needed is a single pointer.
>
>The address of a local function can be taken. When that function is
>called from any context, possibly another thread, it has access to the
>locals that were lexically apparent there.

You Lisp is showing. IOW, that's language dependent.

There are perfectly useable languages that do not allow taking the
address of a function ... never mind passing such pointers between
threads.

Upcalls / callbacks / completion functions, etc. do require the
ability to specify the target function, but there isn't any reason for
the language to expose that to the programmer. And in such cases the
stack environment required by the called function exists at the point
where the function is called.

>Stack *access* is shared among threads. E.g. a thread can register
>a stack object in a shared queue (such as a synchronization wait queue).
>As long as it is dequeued before that frame terminates, everything is
>cool. Commonly done.

Again, language dependent.

Your hypothetical stack resident object need not be any kind of
"active" object. And if the [shared entity] is provided a callback
function [by whatever means the language permits], there is no problem
as long as the lexical environment of that function still exists.

Have you never used multi-threaded Pascal?

>If a downward-only lexical closure is implemented as a pointer into some
>threads's stack memory (plus a function) that object can be passed to
>another thread and used (as long as the context which created that
>closure doesn't terminate).

Yes it can. And the result is something akin to co-routine. But
there's no reason any particular language should allow it.

Many of your arguments are based on the notion of a general purpose
language needing to provide full closures and objects ... but in a
more restricted domain specific setting there often are perfectly good
arguments for NOT providing general purpose features.

We really don't know what the OP is trying to do. It is good to point
out limitations of an idea and to bring up alternatives where viable,
but let's not go too crazy here until we know more.

YMMV,
George

Kaz Kylheku

unread,

Mar 1, 2018, 6:47:17 PM3/1/18

to

How do flat representations handle the fact that different levels
of the lexical environment have different lifetimes?

E.g.

(let ((a 1) (b 2))
(loop ...
(let ((c 3) (d 4))
(lambda ()))))

The (c d) environment is freshly instantiated each time the loop
re-enters the inner let, but the (a b) environment is the same.

Each (lambda ()) invocation must capture a different environment,
with a different (c d) but shared (a b).

If there is just a flat vector, it must still contain an extra
level of indirection, no?

If the language is purely functional, then the bindings are immutable,
and so the (lambda ()) can copy the whole damn thing.

If bindings are mutable, then when one lambda modifies a or b,
the other lambdas (with their different environment) must see the
modification. Yet they are shielded from each other's modification
of c or d.

So the upshot is that there has to be an extra level of indirection.

It cannot be the case that "all non-local accesses use a single
indirection".

The flat environment has to contain entries which are not the storage
bindings themselves, but pointers to bindings.

For instance (let's keep it simple and concrete) it could be a vector of
conses:

#((a . 1) (b . 2) (c . 3) (d . 4))

This is not direct access: we have to dereference a pointer to index
into the vector. Then we retrieve a cons cell which is a pointer,
we have to dereference this to get to the cdr field to get to the value.

So then whenever the loop enters the (c d) environment, the the (c ...)
and (e ...) conses are destructively replaced with fresh conses.

What the lambda operator then has to do to fetch the environment
for the closure is to take a shallow-copy snapshot of the vector:
allocate a new vector of the same length which takes the same conses.

So if instead of the above, we have a display, we can do this:

#(#(1 2) #(3 4)) ;; env vec: effectively, a two-level display

Instead of dereferencing a vector and then a cons cell, we
dereference a vector and then an inner vector.

In this display case (pun intended) when the inner let is entered,
the #(3 4) is replaced with #(nil nil), destructively,
analogous to the conses being replaced in the "flat" case.
Note that this is likely more efficient than allocating two conses.

The lambda operator shallow-copies the whole vector, just like
in the flat environment case.

> I would never use static chains ... if display wasn't viable I would
> jump straight to closure conversion.

What is the meaning of "display", anyway?

It seems like it just means "array" instead of "linked list" (just
specifically in the context of activation frame environments).

The C main function's argv[] is a "display" of strings, whereas
the Lisp list '("bash" "-c" "ls") is a "chain".

Except that the context isn't environment access and so we don't use
the word "display".
[A display is an array of pointers to the frames of the enclosing
scopes. You can access any variable in an enclosing frame by
indirecting through the frame's entry in the current display and then
indexing. This is for languges like Algol and Pascal, not Lisp. I
don't ever recall anyone talking about displays and threads, although
I think they should work OK so long as you keep your displays on the
local stack. -John]

Kaz Kylheku

unread,

Mar 1, 2018, 9:37:32 PM3/1/18

to

On 2018-03-01, Kaz Kylheku <217-67...@kylheku.com> wrote:
> John:

> [A display is an array of pointers to the frames of the enclosing
> scopes. You can access any variable in an enclosing frame by
> indirecting through the frame's entry in the current display and then
> indexing. This is for languges like Algol and Pascal, not Lisp. I

This is absolutely relevant to Lisp dialects with lexical scoping,
like Common Lisp and Scheme (or anything worth its salt, really).
There are no first-class functions without lexical scoping.

Structure and Interpretation of Computer Programs (SICP) discusses
something similar to displays in "5.5.6 Lexical Addressing":

https://mitpress.mit.edu/sicp/full-text/book/book-Z-H-35.html#%_sec_5.5.6

"We can exploit this fact by inventing a new kind of variable-lookup
operation, lexical-address-lookup, that takes as arguments an
environment and a lexical address that consists of two numbers: a frame
number, which specifies how many frames to pass over, and a displacement
number, which specifies how many variables to pass over in that frame.
Lexical-address-lookup will produce the value of the variable stored at
that lexical address relative to the current environment."

The focus of that section is eliminating the run-time search which
actually finds a name by symbol matching, with numeric indexing.
However, if we can "pass over" the frames by indexing through a vector
of them, then we have a display. We can also directly index into the
frame by also representing it as a vector, so we don't have to
"pass over" n-1 variables to get to the n-th one.

[I was thinking of closures when I said not Lisp. I suppose you could
use displays and garbage collect them but I haven't really thought
about the implications. -John]

George Neuner

unread,

Mar 3, 2018, 12:59:17 PM3/3/18

to

On Thu, 1 Mar 2018 19:26:06 +0000 (UTC), Kaz Kylheku
<217-67...@kylheku.com> wrote:

>On 2018-03-01, George Neuner <gneu...@comcast.net> wrote:
>
>> It certainly is true that closeure conversion is a better solution
>> overall because it flattens all non-local accesses to use a single
>> indirection ... but it also significantly complicates the compiler,
>> and creates a need for handling wide pointers (or equivalent). Many
>> useful languages simply don't need that extra complexity.
>
>How do flat representations handle the fact that different levels
>of the lexical environment have different lifetimes?

In fact they don't ... all the "levels" of the lexical environment
must persist IN SOME FASHION for as long as the function lives. That
is the whole point of the "closure" - to persist the environment.

>E.g.
>
> (let ((a 1) (b 2))
> (loop ...
> (let ((c 3) (d 4))
> (lambda ()))))

This particular example actually is better suited to lambda lifting
than to closure conversion, but ...

>The (c d) environment is freshly instantiated each time the loop
>re-enters the inner let, but the (a b) environment is the same.
>
>Each (lambda ()) invocation must capture a different environment,
>with a different (c d) but shared (a b).
>
>If there is just a flat vector, it must still contain an extra
>level of indirection, no?

As you discuss more below, the answer depends on whether or not any
binding are mutated and shared. Note, it does not depend on whether
they are "mutable" per the language, but whether they actually are
mutated.

The real point of closure conversion is to flatten the function's
environment as much as is possible to minimize the runtime effort of
accessing the variable. How far you can take it depends on the
complexity of your compiler.

If no bindings are mutated, the values simply can be copied into a
single structure which is passed to the function.

[ a b c d ]

Note, that this also is the case for shared bindings which are NOT
mutated. (more below)

If any of the bindings are mutated, then you have to assume that they
also are shared (unless you can prove otherwise), and so you need to
provide a more complex environment.

What this looks like depends on how much you want to optimize the
non-shared bindings:

The simplest and most direct translation: you allocate a separate
structure for each scope, and a vector of pointers to them. You then
pass the pointer vector to the function.

[ + + ]
| |
| +--> [ c d ]
|
+-----> [ a b ]

More complex: you allocate separately those bindings which are
mutated, and create a flat structure that contains values for the
static bindings and pointers to the mutable ones.

[ a + c d ]
|
+--> [ b ]

More complex still: figure out whether mutated bindings actually are
shared. A mutated variable, e.g., used as a private counter, need not
be allocated separately.

This can get quite complicated to figure out if there are multiple
functions mutating different sets of shared bindings.

Also note that if the closures are not passed upward out of the
definition environment, the environment "structures" may be stack
allocated [if the runtime uses mark:release style stack(s)].

>If the language is purely functional, then the bindings are immutable,
>and so the (lambda ()) can copy the whole damn thing.
>
>If bindings are mutable, then when one lambda modifies a or b,
>the other lambdas (with their different environment) must see the
>modification. Yet they are shielded from each other's modification
>of c or d.
>
>So the upshot is that there has to be an extra level of indirection.
>
>It cannot be the case that "all non-local accesses use a single
>indirection".

An extra indirection is required only for those bindings which are
BOTH mutated AND shared. Sharing alone or mutation alone does not
require a separate allocation.

The challenge for the compiler is to figure whether a binding actually
is mutated only, shared only, or both mutated and shared.

>> I would never use static chains ... if display wasn't viable I would
>> jump straight to closure conversion.
>
>What is the meaning of "display", anyway?
>
>It seems like it just means "array" instead of "linked list" (just
>specifically in the context of activation frame environments).

More or less. I don't know who named the construct originally, or why
the particular name "display" was chosen.

The object, however, is simple. With nested functions and recursion,
every call frame logically is a member of 2 lists: "who called who",
and "who defined who".

"who called who" pretty much has to be maintained as a list chained
through the call frames themselves because the call stack can become
arbitrarily deep, and because (modulo GC, debugger, etc) there never
is any runtime need to follow more than 1 pointer.

However, "who defined who" hierarchies typically are fairly shallow:
some large scale studies done on Pascal showed that 16 lexical scopes
sufficed for 99+% of the code they examined.

A display optimizes non-local access (more than 1 lexical scope away)
by maintaining "who defined who" as a vector of frame pointers rather
than as yet another list chained through the frames themselves. It
has the same O(1) maintenance overhead as the chained list, but its
use is O(1) rather than O(n).

>The C main function's argv[] is a "display" of strings, whereas
>the Lisp list '("bash" "-c" "ls") is a "chain".

Yes.

>Except that the context isn't environment access and so we don't use
>the word "display".

Yes, the term "display" - referring to a vector of pointers - is
rather unique to compiler construction. I haven't seen it used as
such elsewhere.

>[A display is an array of pointers to the frames of the enclosing
>scopes. You can access any variable in an enclosing frame by
>indirecting through the frame's entry in the current display and then
>indexing. This is for languges like Algol and Pascal, not Lisp. I
>don't ever recall anyone talking about displays and threads, although
>I think they should work OK so long as you keep your displays on the
>local stack. -John]

Yes - the display has to be considered aa part of the thread switch
context - like CPU registers.

The main point with a display is that the function's execution
environment is (Lisp eq) identical to its definition environment.
There is no separate allocation(s) created such that the function can
be called after its definition environment is gone.

George

Anton Ertl

unread,

Mar 3, 2018, 1:16:11 PM3/3/18

to

George Neuner <gneu...@comcast.net> writes:
>On Sat, 17 Feb 2018 16:39:04 GMT, an...@mips.complang.tuwien.ac.at
>(Anton Ertl) wrote:
>
>>Kaz Kylheku <217-67...@kylheku.com> writes:
>>>On 2018-02-15, George Neuner <gneu...@comcast.net> wrote:
>>>> No worries. IME, displays don't get much respect from modern
>>>> textbooks - they are mentioned mostly in passing.
>>>
>>>This is probably because of
>>
>>This is because displays were found to be more costly for Algol-like
>>languages
>
>More costly than what?

Static link chains.

>> IIRC the additional cost is in updating the display on calls
>>and returns.
>
>Which is no different from the overhead to maintain static links

Wrong. Consider the following nesting of functions:

A
B
C
D

With a display, in C you have built up a display pointing to the
frames of C, B, and A. When performing a call from C to D, you throw
that away and replace it with a display pointing to just the frame of
D. When returning from D, you have to restore the old display.

If you don't maintain a static link chain, you have to save the
complete display of C on the call and restore it on return. Note that
D may itself call functions that need more display, so you don't get
away with just saving and restoring the first slot of the display.
IIRC this was the variant looked at in the paper that concluded that
displays are more costly.

If you maintain a static link chain, you can restore the display of C
from the static link chain, but you now have the cost of maintaining
the static link chain and in addition the cost of maintaining the
display, which is obviously more costly than just the static chain
alone. If you go that way, you can actually produce an intermediate
scheme, where you cache static link accesses that you need, and the
cached accesses are similar to (the needed part of) a display.

>I'd really like to see that paper to find out what they are comparing
>to display [and using what language].

The language was probably Algol or one of it's direct offspring.

Anyway, if I get around to it, I will search for a reference to the
paper in our library, but if someone else can supply the reference,
that would be cool.

>>Static link chains work fine.
>
>Yes, they do work - but they are the lowest performance option for
>non-local accesses.

The first non-local level has the same performance as a display. Are
deeper accesses frequent enough to pay for the increased cost of
maintaining the display? The paper's answer was no, but that may vary
from language to language (and with usage patterns). Relative
hardware costs have also shifted somewhat. Bottom line: Better
reevaluate the options yourself.

Rob Warnock

unread,

Mar 3, 2018, 1:17:38 PM3/3/18

to

Kaz Kylheku <217-67...@kylheku.com> wrote:
+---------------

| George Neuner <gneu...@comcast.net> wrote:
| > It certainly is true that closeure conversion is a better solution
| > overall because it flattens all non-local accesses to use a single
| > indirection ... but it also significantly complicates the compiler,
| > and creates a need for handling wide pointers (or equivalent). Many
| > useful languages simply don't need that extra complexity.
|
| How do flat representations handle the fact that different levels
| of the lexical environment have different lifetimes?

+---------------

I'm not sure if this counts as "closure conversion" or not, but some
Common Lisp implementations [such as CMUCL] use flat environments just
fine (even in the interpreter). The trick is to do some minimal
pre-analysis of the code, to see which closed-over variables are *not*
ever mutated or are *not* used in closures which "escape" the original
lexical context. All such variables may be copied into a single "flat"
vector at closure creation time. All other closed-over variables are
converted into "indirect" variables -- such variables are individually
heap-allocated, pointed to by "ref" pointers which are also copied
into a the same flat environment vector at closure creation time.
Then when the closure is called, all of the up-level variables are
either *in* the (flat) environment or available via *one* indirection.
Because all of the "ref" pointers for a given environment variable in
all the (potential) copies point to the same heap location, closures
which mutate a shared variable work just fine. And all the rest are
either read-only [and thus all the copies are the same] or else
mutated and *not* accessed in any interior closure which escapes.

This is normally performs better than linked environments or even
displays, unless there are a *large* number of nested closures. In the
latter case, when each level of closure is created, it will have to
copy all of its free variables from its parent's flat environment into
its own.

[I apologize if I haven't explained this very well.]

This approach is not unique to Lisp. Appel discusses it for ML in his
"Compiling With Continuations".

-Rob

-----
Rob Warnock <rp...@rpw3.org>
627 26th Avenue <http://rpw3.org/>
San Mateo, CA 94403

Anton Ertl

unread,

Mar 5, 2018, 11:12:40 AM3/5/18

to

an...@mips.complang.tuwien.ac.at (Anton Ertl) writes:

>George Neuner <gneu...@comcast.net> writes:
>>I'd really like to see that paper to find out what they are comparing
>>to display [and using what language].
>
>The language was probably Algol or one of it's direct offspring.
>
>Anyway, if I get around to it, I will search for a reference to the
>paper in our library, but if someone else can supply the reference,
>that would be cool.

I looked in some older compiler books, but did not find a reference to
such a paper. What I found:

* Aho and Ullman (1977) only mention displays as ways to implement
static scoping.

* Aho, Sethi, Ullman (1986) also mention access links (static links).

* Fischer and Leblanc (1988) discuss both, and say the following:

|For example, a study of reference patterns in a wide variety of
|Simula 67 programs (Magnusson 1982) found that fully 80% if all
|references were to variables at the outermost level or to local
|variables. Another 17% of the references were to the block
|immediately surrounding the local one. [...] Thus few references
|will actually require following the static chain, and most of those
|will require only one level of such indirection.

|Magnussen, K. 1982. "Identifier references in Simula 67 programs."
|Simula Newsletter 10(2):

They do not write about performance disadvantages of displays,
though.

I also looked if I could find Wirth's response to Cohen's paper, and I
found it:

<https://dl.acm.org/citation.cfm?id=214521> (probably behind a paywall
for many).

And there he gave the reason for not using a display for accessing
non-local variables:

|Upon closer analysis, we decided to discard the use of a display
|that was already in our first Pascal compiler (1969). The reason
|was that (1) the display requires updating for every procedure call
|and return (!) that causes a change of context, and (2) variables at
|intermediate levels (neither local nor global) are referenced quite
|rarely. As a result, the maintenance of a display turned out to be
|more costly than the inefficient access of nonlocal variables via a
|static link.

So apparently there was no influential paper in the 60s or 70s that
damned the display. Wirth and his collaborators discovered the
performance disadvantages of displays by original research, but
apparently did not publish this piece of knowledge before 1991.

George Neuner

unread,

Mar 5, 2018, 4:50:15 PM3/5/18

to

On Fri, 02 Mar 2018 09:30:42 GMT, an...@mips.complang.tuwien.ac.at

No. D and A are at the same lexical level. On entry, D will replace
A's entry in the display, and will restore A's entry on exit.

The scope rules say D can't see or call B or C. If it calls A, then A
replaces it at the same lexical level. When A returns, D's link is
restored.

No problem. And no need to preserve the whole display.

>If you don't maintain a static link chain, you have to save the
>complete display of C on the call and restore it on return.

No you don't.

>Note that
>D may itself call functions that need more display, so you don't get
>away with just saving and restoring the first slot of the display.
>IIRC this was the variant looked at in the paper that concluded that
>displays are more costly.

That is simply wrong unless you change the whole semantics.

As long as call sites are restricted to being within the definition
scope of the called function [as is true in Pascal], and every
function does its little part to maintain the display, then there is
NO POSSIBLE function call sequence that requires saving/restoring the
whole display.

Each function needs only save/restore the single display entry for its
own lexical level. Scope visibility rules dictate that no function
can see lower into the display than its own level - so what those
entries point to is irrelevant.

>If you maintain a static link chain, you can restore the display of C
>from the static link chain, but you now have the cost of maintaining
>the static link chain and in addition the cost of maintaining the
>display, which is obviously more costly than just the static chain
>alone. If you go that way, you can actually produce an intermediate
>scheme, where you cache static link accesses that you need, and the
>cached accesses are similar to (the needed part of) a display.

First, your example does not show any need for a static chain.

Second: the cost to maintain a display is just a single extra pointer
field in the call frame, one pointer copy and 2 pointer stores.

FYI, the amortized cost of a display MAY be less than to maintain the
static chain.

For a static chain:

- a call sideways in the same scope needs the same link as the
caller. That can be fetched from the caller's frame.
but
- a call upwards to a higher scope (such as C->D in your example)
needs a link pointer which it doesn't have available locally.
The preamble code must go up the caller's chain to find the correct
scope link to put in the new frame.

OTOH, the steps to maintain the display are the same regardless of
what function is called:
- save the display entry for my lexical level.
- replace the display entry with a pointer to my frame.
- restore the display entry on exit.

>>I'd really like to see that paper to find out what they are comparing
>>to display [and using what language].
>
>The language was probably Algol or one of it's direct offspring.
>
>Anyway, if I get around to it, I will search for a reference to the
>paper in our library, but if someone else can supply the reference,
>that would be cool.

I really would like to see it. Algol doesn't do anything that would
make a display more work than static chaining. But it would matter
what code base they studied and how those programs were written. As
you mention below, single level accesses are very cheap with static
links [deeper accesses are not].

I can't agree or disagree with a paper I haven't read. And certainly
I can't refute it's conclusions.

>>>Static link chains work fine.
>>
>>Yes, they do work - but they are the lowest performance option for
>>non-local accesses.
>
>The first non-local level has the same performance as a display. Are
>deeper accesses frequent enough to pay for the increased cost of
>maintaining the display? The paper's answer was no, but that may vary
>from language to language (and with usage patterns).

That's true. If the majority of non-local accesses are a single level
away, then the display has no advantage for access. But it still may
have an advantage for maintenance.

It only takes a few upward function calls or deep non-local accesses
to F_ performance of a static chain.

>hardware costs have also shifted somewhat. Bottom line: Better
>reevaluate the options yourself.

Good advice !!!

There still are plenty of uses for simple Pascal-like languages. Many
DSL applications simply don't need multi-threading, objects, or other
stuff that a general purpose language might need.

Displays are just a tool in your toolkit ... you need to read the
directions and use them appropriately.

>- anton
George

George Neuner

unread,

Mar 5, 2018, 4:57:29 PM3/5/18

to

On Mon, 05 Mar 2018 15:01:33 GMT, an...@mips.complang.tuwien.ac.at
(Anton Ertl) wrote:

>I also looked if I could find Wirth's response to Cohen's paper, and I
>found it:
>
><https://dl.acm.org/citation.cfm?id=214521> (probably behind a paywall
>for many).
>
>And there he gave the reason for not using a display for accessing
>non-local variables:
>
> |Upon closer analysis, we decided to discard the use of a display
> |that was already in our first Pascal compiler (1969). The reason
> |was that (1) the display requires updating for every procedure call
> |and return (!) that causes a change of context, and (2) variables at
> |intermediate levels (neither local nor global) are referenced quite
> |rarely. As a result, the maintenance of a display turned out to be
> |more costly than the inefficient access of nonlocal variables via a
> |static link.
>
>So apparently there was no influential paper in the 60s or 70s that
>damned the display. Wirth and his collaborators discovered the
>performance disadvantages of displays by original research, but
>apparently did not publish this piece of knowledge before 1991.
>
>- anton

I may be mistaken on the timeline, but it seems to me that in 1990
Wirth already was working on Oberon, which, like Modula-2, had
visibility rules that were problematic for a display implementation.

I think he also had seen how people had used Modula-2, and before that
the "unit" [modular] extended Pascals. Programmers in these languages
tended to prefer the modules to control visibility and to write
flatter code that made less use of nested functions.

OTOOH, there were the Lisp programmers.

Even in Lisp, deep scope accesses are not that common. But they
aren't exactly rare either ... sometimes you just have to step-wise
build up the context for a function.

I think the real issue was that, by 1990, everybody in the language
research community knew that closure conversion was a better choice
than either static chains or displays: not just for performance, but
because it enables more freedom and functionality in the language.

YMMV,
George

John Levine

unread,

Mar 6, 2018, 12:24:48 AM3/6/18

to

>>>>This is because displays were found to be more costly for Algol-like

>>>>languages ...

>>
>>>> IIRC the additional cost is in updating the display on calls
>>>>and returns.

I get the impression that displays describe two different things.

One is a static array that has one entry for each level of lexical
scoping. In a routine declared N levels (zero based) down, its prolog
saves the Nth entry in the display and replaces it with a pointer to
the current stack frame, and the epilog restores it. It can assume
that higher level routines have correctly set entries 0:N-1. That
seems pretty efficient. Assuming the saved pointer is at a known
location in each stack frame you can do a longjmp and unwind stacks
without too much pain.

The other is the same array, but with a copy of the current display in
each routine's stack frame. This is slower at call time but probably
faster to use on machines like S/360 without direct addressing
since the display on the stack is addressable from the frame pointer,
but static data needs to load a pointer from a constant pool. A
longjmp doesn't need anything special since stacked displays go away
when the stack frames go away.

R's,
John

Tomasz Kowaltowski

unread,

Mar 6, 2018, 10:27:08 AM3/6/18

to

I am not sure whether it will contribute anything new
in this discussion, but you might want to check an old
paper of mine:

Parameter Passing Mechanisms and Run Time Data Structures
Software - Practice and Experience
Vol. 11, Number 7, July 1981
pg. 757-765

tk
[Any way to get a copy of it? Wiley wants $38 for a PDF. -John]

Kaz Kylheku

unread,

Mar 6, 2018, 1:23:53 PM3/6/18

to

This is precisely what I happen to be working on at the moment: a
virtual machine for a Lisp dialect in which I plan to have displays
copied into the local stack frame. I expect it to be a performance
advantage that there is nothing special to do when the stack frame goes
away.

If we had to take some unconditional action, we would have to set up
a catch frame. That means doing a setjmp-like operation that saves
a whole vector of stuff similar to a jmp_buf. Not just machine
registers, but the signal mask (or a virtualized representation of it,
for efficiency), and other stuff. This vector is larger than a display
display containing a significant number of environment levels!

If the scope binds dynamically scoped variables, those have to be
undone. However, that requires no special action. The dynamic
environment is a chain rooted at a global pointer called dyn_env. Long
ago, I made the simplifying design decision to include this dyn_env
pointer in the jmp_buf-like context that is saved at every catch frame,
and so there is no need for a scope to set up a handler for the sake of
restoring dynamically scoped bindings. The normal return case has to
restore dyn_env, which is simple.

bartc

unread,

Mar 7, 2018, 12:44:36 PM3/7/18

to

On 05/03/2018 19:39, George Neuner wrote:
> On Fri, 02 Mar 2018 09:30:42 GMT, an...@mips.complang.tuwien.ac.at
> (Anton Ertl) wrote:

>> Wrong. Consider the following nesting of functions:
>>
>> A
>> B
>> C
>> D
>>
>> With a display, in C you have built up a display pointing to the
>> frames of C, B, and A. When performing a call from C to D, you throw
>> that away and replace it with a display pointing to just the frame of
>> D. When returning from D, you have to restore the old display.
>
> No. D and A are at the same lexical level. On entry, D will replace
> A's entry in the display, and will restore A's entry on exit.
>
> The scope rules say D can't see or call B or C.

What scope rules? As I don't think this is for a specific language.

Even in C, some versions of which have nested functions, it is possible
to set up a function pointer to C(), which can later be called from D()
without first calling A().

And in my own language, it is possible for code in D() to directly call
A.B.C() without entering A.

Although in this one, C() can't access any stack-allocated data of B or
A; partly for this reason (data may not be undefined if called from
outside), and also because it would be fiddly to do anyway, even if
called from B() and A(), as the thread demonstrates.

(C() can access static data of A() and B(), also types, named constants,
enums, and other local functions, so it can still be worthwhile having
local functions with such a restriction.)

--
bartc
[The discussion so far has mostly been about Algol/Pascal style
languages where you can't peek into blocks and contexts go away when a
routine returns so there are no closures. I agree that if you relax
those rules, life gets more complicated.

Hypothetical question: Algol60 call by name was a mistake. They
intended an elegant definition of call by reference and didn't realize
until Jensen's device what they'd done. If they'd done what they
intended and we didn't have to invent thunks, how would programming
languages be different? -John]

Kaz Kylheku

unread,

Mar 11, 2018, 5:43:20 AM3/11/18

to

On 2018-03-06, Kaz Kylheku <217-67...@kylheku.com> wrote:
> On 2018-03-06, John Levine <jo...@taugh.com> wrote:
>> The other is the same array, but with a copy of the current display in
>> each routine's stack frame. This is slower at call time but probably
>> faster to use on machines like S/360 without direct addressing
>> since the display on the stack is addressable from the frame pointer,
>> but static data needs to load a pointer from a constant pool. A
>> longjmp doesn't need anything special since stacked displays go away
>> when the stack frames go away.
>
> This is precisely what I happen to be working on at the moment: a
> virtual machine for a Lisp dialect in which I plan to have displays
> copied into the local stack frame. I expect it to be a performance
> advantage that there is nothing special to do when the stack frame goes
> away.

And ... bam, here is a working demo in less than a week of hacking!

Funny we should have had this whole debate around this time. I've been
massaging the display-based VM design in my head for some months. I
write some design docs regarding the instruction set and fired up Dia to
make some bubble and arrow diagrams.

In the end, it is turning out to be far simpler than I made it out to
be. I have a working implementation of a VM with 32 instructions so far,
and an assembler/disassembler for all of them, with backpatch label
support. There are some pseudo-ops: a single mov instruction for
instance generates three different real instructions based on how the operands fit.

Without any more lengthy introduction, here is vm-clos.tl.

It implements the simple counting closure in the VM assembly
language (there is no compiler yet):

(load "asm")

(in-package :sys)

(let ((a (new assembler))
(d #(succ format t "inner closure, counter: ~s\n")))
a.(asm '((close t1 16 :l1 1 1 nil v0000)
(close t1 0 :l1 0 0 nil)
(call t1 d1 d2 d3 v0000) ;; (format t ...)
(mov t1 v0000) ;; t1 <- v-0-0
(call v0000 d0 v0000) ;; v-0-0 <- (succ v-0-0)
:l1
(end t1))) ;; (return t1)
a.(dis-listing)
(let* ((vd (vm-make-desc 4 32 a.buf d)) ;; a.buf is code
(mk-counter (vm-interpret-toplevel vd))
(counter [mk-counter 10]))
(prinl [counter])
(prinl [counter])
(prinl [counter])))

Run:

$ ./txr vm-clos.tl
0: 7C00000D close t01 16 13 1 1 nil v0000
1: 00100001
2: 00010001
3: 00000200
4: 7C00000D close t01 0 13 0 0 nil
5: 00000001
6: 00000000
7: 14030001 call t01 d01 d02 d03 v0000
8: 01020101
9: 02000103
10: 20010200 movsr t01 v0000
11: 14010200 call v0000 d00 v0000
12: 02000100
13: 0C000001 end t01
inner closure, counter: 10
10
inner closure, counter: 11
11
inner closure, counter: 12
12

The high level overview is that we have an assembly job which produces
some code in a.buf. We use this to make a VM description: we tell the VM
description that it has 4 levels of display nesting (we only use 3), and
that it should allocate 32 registers. We give it the code from a.buf and
the data vector d.

The vm-interpret-toplevel instantiates the VM state and runs this
description. No arguments can be passed in the top-level entry. But it
produces a return value. That return value is the outer closure: a
one-arg function which instantiates a counter that counts from that
argument.

When we call the closure, it instantiates a vm state, but with an
entry into the indicated closure code, and with argument passing.

The close instruction walks the display (on the stack) and clones it
into a heaped object. Any levels that are moved from stack to heap
are also installed in the original display, since they are shared.
There is a way for the machine code to create levels that are not
subject to capture.

Everything is in the display. Each level has up to 256 entries.
Level 0 is the registers t0 through tFF. t0 is always NIL, inspired
by MIPS.

Level 1 is the data vector, d0 through dFF.

The levels after that are vXXYY. For instance v0312 is the fifth level,
register 18 (12 hex). I might re-adjust this: say, have only up to 64
levels that are up to 1024 cells wide.

I not only have closures working, but unwind-protect and exception
handling, and the handling of the dynamic environment for binding
special variables.

Basically, it appears to be reasonably complete; I can't think of
anything missing from being able to support the semantics of the Lisp
dialect. Maybe some unexpected strange interactions with the delimited
continuation support will crop up. Ah, debugging support. Ha!

:)

Martin Ward

unread,

Mar 12, 2018, 9:30:00 PM3/12/18

to

On 06/03/18 19:02, john wrote:
> Hypothetical question: Algol60 call by name was a mistake. They
> intended an elegant definition of call by reference and didn't realize
> until Jensen's device what they'd done. If they'd done what they
> intended and we didn't have to invent thunks, how would programming
> languages be different? -John

Lets go along with the theory that they intended call by reference
and did not realise that what they had was different until
Jenson's device was invented. What happened next (hypothetically!)
was that the language designers realised two things:

(1) Call by name is mathematically, semantically, simpler
than call by reference

(2) Call by reference is much easier to implement than call by name.
Call by name requires significant effort to implement.

The designers decided to go with call by name because it produces
a mathematically simpler language, even at the expense of greater
effort to implement the compiler.

The result was an explosion of productive research and development
in compilers and language implementation.

Since that time, language designers have become very cautious and timid
in specifying powerful new language features and compiler research
has stagnated. An extreme example is the C language reference
which merely codifies the intersection of the behaviour of the major
C compilers, and leaves everything else "undefined".

So to answer your hypothetical question: how would programming language
be different if the designers of Algol 60 had decided to put
implementation convenience above mathematical simplicity
and expressive power in the language? Well, perhaps compiler
research would have stagnated from the beginning and we would not even
have some of the powerful language features we have now:
such as first order functions, closures, abstract data types and so on.

(In case you haven't noticed, I am trying to be provocative,
and hoping to be proved wrong: especially about the stagnation
in language design!)

--
Martin

Dr Martin Ward | Email: mar...@gkc.org.uk | http://www.gkc.org.uk
G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4
[It's not just a theory. Alan Perlis told me so and
he was in a position to know. But provoke away. -John]

Kaz Kylheku

unread,

Mar 13, 2018, 3:31:23 PM3/13/18

to

On 2018-03-12, Martin Ward <mar...@gkc.org.uk> wrote:
> On 06/03/18 19:02, john wrote:
>> Hypothetical question: Algol60 call by name was a mistake. They
>> intended an elegant definition of call by reference and didn't realize
>> until Jensen's device what they'd done. If they'd done what they
>> intended and we didn't have to invent thunks, how would programming
>> languages be different? -John
>
> Lets go along with the theory that they intended call by reference
> and did not realise that what they had was different until
> Jenson's device was invented. What happened next (hypothetically!)
> was that the language designers realised two things:
>
> (1) Call by name is mathematically, semantically, simpler
> than call by reference
>
> (2) Call by reference is much easier to implement than call by name.
> Call by name requires significant effort to implement.
>
> The designers decided to go with call by name because it produces
> a mathematically simpler language, even at the expense of greater
> effort to implement the compiler.

On the Roseta Code wiki site there is this programming task:

https://rosettacode.org/wiki/Man_or_boy_test

The aim of the task is "Imitate Knuth's example in Algol 60 in another
language, as far as possible".

I did this in my TXR Lisp dialect to a greater extent than the other
solutions, IMHO.

https://rosettacode.org/wiki/Man_or_boy_test#TXR

I implemented a "defun-cbn" macro which is like defun, but
creates a call-by-name function. I also implemented "labels-cbn"
macro for binding lexical functions, also with call-by-name
arguments.

The solution then looks like a nearly expression-for-expression
translation of the Algol:

(defun-cbn A (k x1 x2 x3 x4 x5)
(let ((k k))
(labels-cbn (B ()
(dec k)
(set B (set A (A k (B) x1 x2 x3 x4))))
(if (<= k 0)
(set A (+ x4 x5))
(B))))) ;; value of (B) correctly discarded here!

(prinl (A 10 1 -1 -1 1 0))

The code for these macros is given there, and explained.
Thunks are just objects with lambdas for getting and setting
values. To pass an expression like (foo bar) by name,
which might denote a modifiablke place, it becomes.

Basically the thunk mechanism is simulated with objects that contain
getter/setter lambdas. The function calls are transformed such that their
argument expressions are wrapped in constructors of thunk objects, and then in
the bodies the references are transformed into accessors to these thunks.

There is a bit of a cheat. (defun-cbn foo (...) ...) in fact defines a macro
called foo, and not a function. It also defines a hidden function with a
machine-generated name. The macro transforms the arguments and then calls the
hidden function. This is a cheat because of course macros aren't functions;
they can't be indirected upon with apply, mapcar and so on.

> The result was an explosion of productive research and development
> in compilers and language implementation.
>
> Since that time, language designers have become very cautious and timid
> in specifying powerful new language features and compiler research
> has stagnated.

That's because machines have gotten faster and memories bigger!

Perversely, when memories get bigger, all the interesting stuff
gets a lot slower, because it has exponential complexity.

Approaches that could cut it for production use due to resource limtations that
constrained the input sizes suddenly turn into toys.

I'm only half joking; could that be part of it?

> So to answer your hypothetical question: how would programming language
> be different if the designers of Algol 60 had decided to put
> implementation convenience above mathematical simplicity
> and expressive power in the language?

I suspect that wouldn't because all that call by name stuff business is
basically just an alteration of lazy evaluation semantics into supporting
mutation.

Lazy evaluation means that when we call (f x), x isn't evaluated now; its value
isn't needed yet. Implementation-wise, we pass down something which allows
access to x.

If x is immutable, than that something is a simple lambda that
provides read-only access.

If we don't need the laziness to be implicit in the language, we can
use (delay x) to create a promise of the value of x, and (force p)
to later force the value out of the promise object at the point
where we need it.

If we have discovered lambda as a way of obtaining lazy evaluation
(i.e. implementing delay and force) adding another lambda to also enable the
assignment of a new value to x is just a footnote. We add a second
method (stuff p) to stuff a new value into the promise.
The caller's x put into (delay x) is overwritten and so it goes.

--
TXR Programming Lanuage: http://nongnu.org/txr
Music DIY Mailing List: http://www.kylheku.com/diy
ADA MP-1 Mailing List: http://www.kylheku.com/mp1

William Clodius

unread,

Mar 13, 2018, 3:33:24 PM3/13/18

to

Martin Ward <mar...@gkc.org.uk> wrote:

> <snip>

> So to answer your hypothetical question: how would programming language
> be different if the designers of Algol 60 had decided to put
> implementation convenience above mathematical simplicity
> and expressive power in the language? Well, perhaps compiler
> research would have stagnated from the beginning and we would not even
> have some of the powerful language features we have now:
> such as first order functions, closures, abstract data types and so on.

Didn't Lisp have first order functions and closures in 58? If I remembe
the discussion of APT in the HOPL I conference proceedings correctly it
surprisingly had the equivalent of structs.

>
> (In case you haven't noticed, I am trying to be provocative,
> and hoping to be proved wrong: especially about the stagnation
> in language design!)

However the other members of the committee were in a better position to
know their own minds than Perlis was, and the the first HOPL conference,
in the discussion of Naur's presentation, some of them claimed to have
understood the implications of call by name from the beginning.

Kartik Agaram

unread,

Mar 13, 2018, 4:55:31 PM3/13/18

to

On Mon, Mar 12, 2018 at 8:09 PM, William Clodius <w.cl...@icloud.com> wrote:
> Didn't Lisp have first order functions and closures in 58? If I remember

> the discussion of APT in the HOPL I conference proceedings correctly it
> surprisingly had the equivalent of structs.

Lisp had first-class functions but closures require lexical scope,
which didn't land until Scheme in the '70s.

Kaz Kylheku

unread,

Mar 13, 2018, 4:57:44 PM3/13/18

to

On 2018-03-13, William Clodius <w.cl...@icloud.com> wrote:
> Martin Ward <mar...@gkc.org.uk> wrote:
>
>> <snip>
>> So to answer your hypothetical question: how would programming language
>> be different if the designers of Algol 60 had decided to put
>> implementation convenience above mathematical simplicity

>> and expressive power in the language? ...

> Didn't Lisp have first order functions and closures in 58?

No. It was dynamically scoped: locals were basically saved and restored
globals, so lambda didn't capture any enviornment.

Lambda Calculus (even typed and all) was already known, and the
researchers must have been aware of it, or where else would they
have cribbed "LAMBDA" from, right?

Legend has it that MacCarthy has at some point admitted that he didn't
properly understand Lambda Calculus during the early Lisp work; I don't
have a citation for that.

Kaz Kylheku

unread,

Mar 14, 2018, 12:42:09 AM3/14/18

to

Anonyous functions without environments are basically just
function pointers.

If those are first class functions, then C has first class functions.

Andy Walker

unread,

Mar 14, 2018, 12:44:29 AM3/14/18

to

on 13/03/18 03:09, William Clodius wrote:
> Martin Ward <mar...@gkc.org.uk> wrote:

>> So to answer your hypothetical question: how would programming language
>> be different if the designers of Algol 60 had decided to put
>> implementation convenience above mathematical simplicity
>> and expressive power in the language?

[Ie, had specified call-by-reference rather than call-by-name.]

>> Well, perhaps compiler
>> research would have stagnated from the beginning [...].

Three points:

(a) IAL ["Algol 58"] specified call-by-textual-replacement. "The
execution ... is effected as though all formal parameters were replaced,
throughout the procedure, by the actual parameters .... This replacement
may be considered to be a replacement of every occurrence ... of the
formal parameters by the symbols (or sets of symbols) listed as actual
parameters ...." [Report, section D9] It would be surprising if a
decent-sized collection of the best brains of the period didn't realise
that *if* IAL were ever to be implemented, some "interesting" games
could be played.

(b) However, that's a rather big "if". In those days when it was
almost impossible to transfer code from machine A to machine B, the
primary purpose of the "algorithmic" languages was not to be the
source of a program, but to express algorithms for translation into
machine code or into some other implemented language by typing it
up onto cards or tape. So there was little or no point in writing
obscure code in Algol [any version thereof]. OK, Algol compilers
did eventually arrive, though few were both reasonably complete and
reasonably efficient.

(c) I agree that the difficulties of compiling IAL, A60 and A68
stimulated research into compilation techniques. But that may well
have been the cause of the virtual demise of Algol, upstaged by
languages that were easier to write compilers for [but harder to
write programs *in*].

> However the other members of the committee were in a better position to
> know their own minds than Perlis was, and the the first HOPL conference,
> in the discussion of Naur's presentation, some of them claimed to have
> understood the implications of call by name from the beginning.

It seems to have been a left-pondian vs right-pondian thing.
As a very crude over-generalisation, one side simply wanted to get
work done, and the other side was much more interested in the limits
of what could be done. Plenty of exceptions both ways, esp by the
late 1970s.

--
Andy Walker,
Nottingham.
[How close did anyone get to implementing Algol 58? I know about JOVIAL
but I believe it was pretty far from full IAL. -John]

Kartik Agaram

unread,

Mar 14, 2018, 9:30:11 AM3/14/18

to

On Tue, Mar 13, 2018 at 5:07 PM, Kaz Kylheku <157-07...@kylheku.com> wrote:
> Anonymous functions without environments are basically just

> function pointers.
>
> If those are first class functions, then C has first class functions.

C doesn't have anonymous functions (ignoring a combination of GNU
extensions added sometime after 1988). But you mentioned anonymous
functions, so you must be aware of this. So I'm not sure what you're
saying.

Kaz Kylheku

unread,

Mar 14, 2018, 5:36:51 PM3/14/18

to

That's just a syntactic restriction on where functions are
placed in the program text, due to the lack of a function literal
syntax.

If functions don't require an environment, you can easily add anonymous
function literals with a textual preprocessor like GNU m4, by
implementing a completely trivial "no-environment lambda lifting".

Syntax like:

some_fun(LAMBDA{int (int a, int b){ return a + b; }})

is simply replaced by

some_fun(f_0013);

accompanied by the insertion of

static int f_0013(int arg){ return arg + 1; })

into the global scope. This is easy enough that it's usually not a major
inconvenience to simply do it by hand. If the C preprocessor were
slightly more powerful, it could handle the above.

Once we obtain a function as a function pointer by evaluating its
name, then it has been anonymized. We can pass it around the program,
store it in variables and structures and so on. It has the same power
as a lambda-with-no-env.

Andy Walker

unread,

Mar 14, 2018, 5:37:42 PM3/14/18

to

On 14/03/18 00:27, our moderator wrote:
> [How close did anyone get to implementing Algol 58? I know about JOVIAL
> but I believe it was pretty far from full IAL. -John]

AFAIK, no closer than Jovial. In a sense, there never was a full
IAL; the report was "preliminary", everyone was waiting for the proper
language, and once Algol 60 came out, they lost interest.

--
Andy Walker,
Nottingham.

Anton Ertl

unread,

Mar 14, 2018, 5:53:09 PM3/14/18

to

Martin Ward <mar...@gkc.org.uk> writes:
>On 06/03/18 19:02, john wrote:
>> Hypothetical question: Algol60 call by name was a mistake. They
>> intended an elegant definition of call by reference and didn't realize
>> until Jensen's device what they'd done. If they'd done what they
>> intended and we didn't have to invent thunks, how would programming
>> languages be different? -John
>
>Lets go along with the theory that they intended call by reference
>and did not realise that what they had was different until
>Jenson's device was invented. What happened next (hypothetically!)
>was that the language designers realised two things:
>
>(1) Call by name is mathematically, semantically, simpler
>than call by reference
>
>(2) Call by reference is much easier to implement than call by name.
>Call by name requires significant effort to implement.
>
>The designers decided to go with call by name because it produces
>a mathematically simpler language, even at the expense of greater
>effort to implement the compiler.

Looking at the story of recursion in Algol 60
<https://vanemden.wordpress.com/2014/06/18/how-recursion-got-into-programming-a-comedy-of-errors-3/>,
there was a more mathematical camp in the Algol 60 committee, and a
more implementation-oriented camp. The mathematicians knew how to
specify what they wanted, the implementors know how to implement what
they wanted. But since the product of the committee was a
specification, they would also have to know how to specify what they
wanted if they wanted to get it into the report. I guess that they
did not know how to specify call by reference, so the mathematical way
won in this case, as it did for recursion.

History shows that, for recursion, they took the right decision, while
for parameter passing, they didn't.

>The result was an explosion of productive research and development
>in compilers and language implementation.

Is it really productive to research and develop a feature that turns
out to be a dead end? Even the closest thing to call-by-name these
days, call-by-need in lazy functional languages, does not use the
techniques developed by call-by-name.

>Since that time, language designers have become very cautious and timid
>in specifying powerful new language features and compiler research
>has stagnated.

The pace in language design certainly has slowed down a lot since the
early days, but I take this as a sign that the field is maturing: Now
that we have a lot of usable and not-too-horrible programming
languages, it becomes harder to produce something that is better. And
in particular, it is harder to produce something that is better to a
sufficient degree to overcome the various forces that resist such a
change.

Concerning the stagnation in compiler research, that certainly is not
reflected in the numbers of papers submitted and published at compiler
conferences.

>An extreme example is the C language reference
>which merely codifies the intersection of the behaviour of the major
>C compilers, and leaves everything else "undefined".

That is the way to go for a standard that codifies existing practice.
However, it is not ok for compiler maintainers to then take the view
that only programs in that intersection (an almost empty set) need to
be compiled as intended.

>So to answer your hypothetical question: how would programming language
>be different if the designers of Algol 60 had decided to put
>implementation convenience above mathematical simplicity
>and expressive power in the language?

If they had specified call-by-reference and call-by-value, similar to
Pascal? Programming languages would not be any different.

If they had specified non-recursion? Recursion may have taken longer
to take hold in mainstream languages. However, recursion was already
there in Lisp, so it's not as if we needed Algol 60 for that.

In general, while language designers have stood on the shoulders of
their predecessors, I don't have the impression that they let
themselves be limited by limitations of previous languages; such
limitations served more as a motivation for improving things. Take
Landin's influtential paper "The next 700 programming languages" as an
example.

>Well, perhaps compiler
>research would have stagnated from the beginning and we would not even
>have some of the powerful language features we have now:
>such as first order functions, closures, abstract data types and so on.

Algol 60 did not have first-order functions and closures. It did have
lexical scoping, though, which the Lisp family only acquired with
Scheme in 1975.

Algol 60 certainly did not have abstract data types. It did not even
have records.

So these are examples that actually contradict your theory.

Martin Ward

unread,

Mar 19, 2018, 7:16:36 AM3/19/18

to

On 13/03/18 03:02, Kaz Kylheku wrote:
> That's because machines have gotten faster and memories bigger!
>
> Perversely, when memories get bigger, all the interesting stuff
> gets a lot slower, because it has exponential complexity.

Faster machines and larger memories mean that it becomes more
important to use the best algorithms. The best algorithms
are often more complex than less efficient algorithms:
which means that it is *more* important to have powerful
languages in which it is easier to define complex algorithms.

So the need for more powerful languages has only increased over time.

On 14/03/18 15:16, Anton Ertl wrote:
>> The result was an explosion of productive research and development
>> in compilers and language implementation.
>

> Is it really productive to research and develop a feature that turns
> out to be a dead end? Even the closest thing to call-by-name these
> days, call-by-need in lazy functional languages, does not use the
> techniques developed by call-by-name.

Actually, the closest thing to call-by-name is first class functions
(technically: downward funargs). The desire for mathematical
simplicity over implementation simplicity drove the choice
of call by name over call by reference. Call by name allowed
Jenson's Device which is a method of passing a function as a parameter.
Call by name turned out to be implementable by, in effect,
passing a function as a parameter.

Of course, with hindsight we can see that defining function parameters
directly in the language would be an even better solution:
but language design was still in its early days at the time,
and Algol 60 was "not only an improvement on its predecessors,
but also on nearly all its successors" (C.A.R. Hoare).

How many modern languages can claim the same?

> Algol 60 did not have first-order functions and closures. It did have
> lexical scoping, though, which the Lisp family only acquired with
> Scheme in 1975.
>
> Algol 60 certainly did not have abstract data types. It did not even
> have records.
>
> So these are examples that actually contradict your theory.

My theory is that the attitude of the Algol 60 designers towards
language design is what led to these innovations appearing
over then next 20 years: this is the "explosion of productive

research and development in compilers and language implementation"

that I was referring to.

What similar innovations in compilers and language design
have appeared over the *last* 20 years? (Haskill is still considered
by some to be a scary new innovative language: it was initially
developed in the early 1990's, over 20 years ago).

Gene Wirchenko

unread,

Mar 20, 2018, 7:22:52 AM3/20/18

to

On Mon, 19 Mar 2018 11:04:20 +0000, Martin Ward <mar...@gkc.org.uk>
wrote:

[snip]

>Faster machines and larger memories mean that it becomes more
>important to use the best algorithms.

How does that follow?

One could argue exactly the opposite as there are resources to
waste.

[snip]

Sincerely,

Gene Wirchenko
[Depends on the algorithms. If they're linear, things are fine,
not so fine if they're O(N^2) or O(2^N) -John]

Anton Ertl

unread,

Mar 20, 2018, 7:33:18 AM3/20/18

to

Martin Ward <mar...@gkc.org.uk> writes:
>Of course, with hindsight we can see that defining function parameters
>directly in the language would be an even better solution:
>but language design was still in its early days at the time,
>and Algol 60 was "not only an improvement on its predecessors,
>but also on nearly all its successors" (C.A.R. Hoare).
>
>How many modern languages can claim the same?

Was Hoare right in making this claim? He made this claim in 1973, and
indeed most langauges developed between 1960 and 1973 have been
forgotten by now, and probably Algol 60 was an improvement over many
of them.

Interestingly, Hoare was in the more implementation-oriented Algol W
faction of the post-60 Algol committee rather than the more
mathematical faction that won out and eventually produced the Algol 68
report. He probably meant especially Algol 68 as successor over which
Algol 60 was an improvement. So this claim actually condemns the
attitude that you praise.

[Anton Ertl:]

>> Algol 60 did not have first-order functions and closures. It did have
>> lexical scoping, though, which the Lisp family only acquired with
>> Scheme in 1975.
>>
>> Algol 60 certainly did not have abstract data types. It did not even
>> have records.
>>
>> So these are examples that actually contradict your theory.
>
>My theory is that the attitude of the Algol 60 designers towards
>language design is what led to these innovations appearing
>over then next 20 years: this is the "explosion of productive
>research and development in compilers and language implementation"
>that I was referring to.

Higher-order functions (I somehow mixed up "higher-order" with
"first-class" in the above) were already available in IPL, before
Algol 60, which falsifies this theory for this feature.

Records were available in Cobol, contemporary with Algol 60, so again,
this theory is falsified in this case.

Concerning features that appeared after 1960, there is no way to
verify or falsify your theory.

>What similar innovations in compilers and language design

>have appeared over the *last* 20 years? (Haskell is still considered

>by some to be a scary new innovative language: it was initially
>developed in the early 1990's, over 20 years ago).

Many people that used to research programming in the 1960s thought in
the late 1960s that programming languages were a solved problem, and
refocused on software engineering. The field certainly has matured,
and a sign of that maturity is that 1) you don't get great
breakthroughs all the time and 2) that even when there is a useful
innovation, it takes quite a while until it appears in mainstream
languages.

Actually, it even took quite a while in the early years; e.g., IPL
(1956) had higher-order functions, Algol 60 didn't, Pascal (1970) did,
Ada-83 didn't (IIRC).

And a lot of the work that has been done in the last decades are about
making various features fit together that have lived in separation in
various programming languages for a long time. E.g., Java 5 (2004)
added generics to Java and Java 8 (2014) added lambdas. Both feature
are much older, but needed to be integrated with the Java type system.
And of course, people also needed to be convinced that these features
are worth the complexity that they add to the language.

There are new features being developed all the time. We will see
which of them make it into a mainstream language, but in any case that
will take it's time.

One of the more prominent recent ones is the way that Rust combines
static type checking with explicit memory management to ensure memory
safety in low-level languages.

There is also quite a bit of innovation in tools, e.g., in fuzz
testing, and in using theorem provers for software verification.

Bill Findlay

unread,

Mar 20, 2018, 12:33:53 PM3/20/18

to

On 20 Mar 2018, Anton Ertl wrote (in article<18-0...@comp.compilers>):

> Actually, it even took quite a while in the early years; e.g., IPL
> (1956) had higher-order functions, Algol 60 didn't, Pascal (1970) did,
> Ada-83 didn't (IIRC).

Algol 60 and Wirth Pascal had exactly the same functional-parameter
capability;
i.e. both procedural and functional parameters were included,
but neither language provided a way to specify their signatures.

This was rectified for Pascal, with Wirth's approval, by the BSI Pascal
standard and later by the bureaucratic variant of the latter known as ISO
Pascal.

(Ada 83 provided a clumsy way to do it by means of generics;
later versions of Ada do it properly.)

--
Bill Findlay

mac

unread,

Mar 20, 2018, 12:35:36 PM3/20/18

to

> and Algol 60 was "not only an improvement on its predecessors, but also
> on nearly all its successors" (C.A.R. Hoare).

> How many modern languages can claim the same?

C. For a while I would have allowed C++ as an exception, but that went away
with cfront. Still not sure about go.
[Your moderator reminds you that he has very little patience for yet more
arguments about why C is a good or a bad language. -John]

Kaz Kylheku

unread,

Mar 20, 2018, 12:36:59 PM3/20/18

to

On 2018-03-19, Gene Wirchenko <ge...@telus.net> wrote:
> On Mon, 19 Mar 2018 11:04:20 +0000, Martin Ward <mar...@gkc.org.uk>
> wrote:
>
> [snip]
>
>>Faster machines and larger memories mean that it becomes more
>>important to use the best algorithms.
>
> How does that follow?
>
> One could argue exactly the opposite as there are resources to
> waste.

Clever algorithms had to be used when machines were slow even on small
data and the data and code had to fit into small memories.

Clever algorithms have to be used when data sets are large.

In between there is a large territory where the machines are powerful,
but the problems (or sub-problems) being solved are reasonably small.
Here we use interpreted languages instead of compiled, linked lists
instead of more specific data structures, hashes instead of vectors and
N-dimensional data even when it isn't sparse and so on.

Martin Ward

unread,

Mar 23, 2018, 8:40:41 AM3/23/18

to

On 19/03/18 19:50, Gene Wirchenko wrote:
>> Faster machines and larger memories mean that it becomes more
>> important to use the best algorithms.
>
> How does that follow?

It follows because data expands to fill the memory available.

> One could argue exactly the opposite as there are resources to
> waste.
>

> [Depends on the algorithms. If they're linear, things are fine, not
> so fine if they're O(N^2) or O(2^N) -John]

My first computer had 8K of RAM and a 1MHz clock speed.
If I wanted to sort 1,000 values I could use bubble sort, O(N^2),
or quicksort, O(N log N).

So, bubble sort would need of the order of 1,000,000 instructions
and take about 1 second: this might well be acceptable for an operation
you only need to do a few times an hour. Quicksort would need
of the order of 10,000 instructions and take 0.01 seconds.

My current laptop (not the latest model) has 8Gb of RAM:
literally, over a million times as much RAM, and a 1.7 GHz clock.
Even accounting for more powerful instructions, the clock is
only of the order 10,000 times faster, while the memory is 1,000,000
times larger. Now 1,000,000,000 values will fit in RAM.
Sorting these with bubble sort will need of the order 10^18 instructions
which takes 100,000,000 seconds (over 1,000 years) while quicksort
needs 300,000,000,000 instructions and takes only a few seconds.

For a linear algorithm, say a linear search, my first computer
takes 0.001 seconds to process the data, while my new laptop
takes 0.1 seconds. So if I need to execute this linear algorithm
more than 10 times a second, my old computer will be fine
but my new laptop would struggle. I would need to look hard
for a sub-linear algorithm.

Conclusion: even for linear algorithms things may not be fine
and more powerful sub-linear algorithms may be needed.

--
Martin

Dr Martin Ward | Email: mar...@gkc.org.uk | http://www.gkc.org.uk
G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4

[Give or take the detail that bubble sort is never the right algorithm
since insertion sort is shorter and faster, I agree. -John]

Andy Walker

unread,

Mar 24, 2018, 9:34:05 AM3/24/18

to

On 23/03/18 10:44, Martin Ward wrote:
[...]

> which takes 100,000,000 seconds (over 1,000 years)

Point of order: 100,000,000 seconds is about pi years.

[...]

> For a linear algorithm, say a linear search, my first computer
> takes 0.001 seconds to process the data, while my new laptop
> takes 0.1 seconds. So if I need to execute this linear algorithm
> more than 10 times a second, my old computer will be fine
> but my new laptop would struggle.

On the other hand, if you needed to search, say, a million
items, your new laptop would process them in around a millisecond,
while your first computer would spend however long it took you to
feed in the data bit-by-bit from a sequence of floppies. Several
minutes?

Somewhat more to the point, as this is "comp.compilers",
it seems unlikely that either the compiler or the source being
compiled have grown by anything like the growth in RAM, despite the
unfortunate modern [ie last 40-odd years!] tendency towards bloat.
It seems somewhat OTT to claim that we need better algorithms
*because of* faster and logically-bigger computers. We want them
primarily because they are better!

--
Andy Walker,
Nottingham.
[Well, both. Some compiler optimization algorithms can be quite slow,
ike optimal register assignment and constructing perfect hashes. -John]

Anton Ertl

unread,

Mar 24, 2018, 6:50:27 PM3/24/18

to

Andy Walker <a...@cuboid.co.uk> writes:
> Somewhat more to the point, as this is "comp.compilers",
>it seems unlikely that either the compiler or the source being
>compiled have grown by anything like the growth in RAM, despite the
>unfortunate modern [ie last 40-odd years!] tendency towards bloat.

One of the papers I saw during this recent discussion mentions that
one of Wirth's early compilers (PL/360 IIRC) self-compiled in IIRC 25s
on a machine of the time, i.e., late 1960s (which the author of the
paper found quite impressive). In 1992, I heard a keynote address by
Wirth (at CC'92), where he contrasted Oberon with C++, and used
self-compilation time as one metric: The Oberon compiler compiled
itself in IIRC 6s, which he contrasted with IIRC 20 minutes for the
C++ compiler.

The last few times that I built a recent gcc, it took several hours to
compile itself (three stages, and it included several front-ends, but
still); interestingly, I rebuilt gcc-2.7 in 2015, and that took only a
few minutes.

Software bloats to fill the memory and run-time that is deemed
acceptable. Whether that means more or less sophistication in the
algorithms varies from case to case.

- anton
--
M. Anton Ertl
an...@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/

[Back in the 1970s, the Dartmouth BASIC compiler was so fast that
nobody bothered to save object code. It rarely took more than a few
clock ticks from source code to starting the executable. -John]

William Clodius

unread,

Mar 25, 2018, 6:46:14 AM3/25/18

to

Martin Ward <mar...@gkc.org.uk> wrote:

> <snip>> [Give or take the detail that bubble sort is never the right algorithm

> since insertion sort is shorter and faster, I agree. -John]

I believe that bubble sort is faster if the data is usually presorted.
The Unicode consortium describes the sorting of codepoints to forn the
normalization forms, because the initial decomposition process usually
results in an ordering that is usually in or close to the desired order.

[It's the reverse. Insertion sort is O(N) if the list is already sorted.
See http://www.differencebetween.com/difference-between-bubble-sort-and-vs-insertion-sort/
-John]

Rob Warnock

unread,

Mar 25, 2018, 6:47:37 AM3/25/18

to

At the end of a message by Anton Ertl <an...@mips.complang.tuwien.ac.at>,
our august moderator noted:
+---------------

| [Back in the 1970s, the Dartmouth BASIC compiler was so fast that
| nobody bothered to save object code. It rarely took more than a few
| clock ticks from source code to starting the executable. -John]

+---------------

And lest we forget:

https://en.wikipedia.org/wiki/WATFIV

WATFOR/WATFIV [a.k.a. Univ. Waterloo FORTRAN] was a compile-and-go
system widely used to teach FORTRAN from 1965 until the late 1980s
[and in some places until >1995].

-Rob

-----
Rob Warnock <rp...@rpw3.org>
627 26th Avenue <http://rpw3.org/>
San Mateo, CA 94403
[Apropos of another comment, Dartmouth BASIC was a real compiler that
generated machine code. So was WATFOR/WATFIV. -John]

Robin Vowels

unread,

Mar 26, 2018, 1:16:50 PM3/26/18

to

> From: "Rob Warnock" <rp...@rpw3.org>

> And lest we forget:
>
> https://en.wikipedia.org/wiki/WATFIV
>
> WATFOR/WATFIV [a.k.a. Univ. Waterloo FORTRAN] was a compile-and-go
> system widely used to teach FORTRAN from 1965 until the late 1980s

And let's not forget Charles Hamblin's GEORGE,
which did the same thing from 1957,
namely, compile-and-go.

Martin Ward

unread,

Mar 29, 2018, 5:28:01 PM3/29/18

to

On 20/03/18 09:06, Anton Ertl wrote:
> Higher-order functions (I somehow mixed up "higher-order" with
> "first-class" in the above) were already available in IPL, before
> Algol 60, which falsifies this theory for this feature.

The theory is that "the attitude of the Algol 60 designers towards
language design is what led to these innovations appearing".
Clearly, this "attitude" was present before, during and after
the development of Algol 60. In the context of the theory,
Algol 60 is just one example of the influence of attitude.

The second part of the theory is that since that time,
the attitude has changed and "language designers have become

very cautious and timid in specifying powerful new language features
and compiler research has stagnated."

> Concerning features that appeared after 1960, there is no way to
> verify or falsify your theory.

It should be easy to falsify the theory: where are the new language
features that have been invented in the last, say, twenty years?
Where are the powerful new languages which make Haskell
look like Dartmouth BASIC?

A very small example of language designers prioritising
implementation effort over power and simplicity: even today there
are very few mainstream languages which implement arbitrary precision
integers ("bignums") and rational numbers as primitive data types.
Proving that an algorithm is correct is simpler when you don't have
to worry about integer overflow, or floating point approximations.
For example, the binary search algorithm that Jon Bentley proved correct
and published in "Programming Pearls" in 1986 had a bug that
was not detected for two decades:

https://research.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html

It doesn't help that C and C++ have very lax rules for dealing
with overflow (in the sense that many cases are declared to
be undefined behaviour), leading to complex solutions for
simple problems:

http://www.di.unipi.it/~ruggieri/Papers/semisum.pdf

(I am still hoping for my theory to be proved wrong!)

A related, and more worrying, trend is the new and growing area
of research under the heading "empirical software engineering"
which aims to do away with program semantics altogether.
A program is deemed "correct" if and only if it passes its test suite.
Various automated and semi-automated ways of modifying the program
are being investigated: any modification which passes the test suite
is deemed to be "correct". For example, "empirical slicing"
may be defined as "delete random sections of code and call the result
a valid slice if it passes the regression test". Program semantics
and program analysis are considered to be "too difficult"
by these researchers, and therefore are not attempted.

Readers of comp.risks will no doubt already be wondering how
such methods avoid introducing security holes: given that
a security hole will not necessarily prevent the program
from passing its test suite (unless the tests happen
to include the carefully crafted data which triggers
the security hole!) As far as I can tell, the answer is:
they don't!

Anton Ertl

unread,

Mar 30, 2018, 10:31:44 PM3/30/18

to

Martin Ward <mar...@gkc.org.uk> writes:
>On 20/03/18 09:06, Anton Ertl wrote:
>> Higher-order functions (I somehow mixed up "higher-order" with
>> "first-class" in the above) were already available in IPL, before
>> Algol 60, which falsifies this theory for this feature.
>
>The theory is that "the attitude of the Algol 60 designers towards
>language design is what led to these innovations appearing".
>Clearly, this "attitude" was present before, during and after
>the development of Algol 60.

Ok, so this theory cannot even be verified of falsified for the
features that came before Algol 60.

In any case, I think this attitude is still present.

>The second part of the theory is that since that time,
>the attitude has changed and "language designers have become
>very cautious and timid in specifying powerful new language features
>and compiler research has stagnated."

And I think that this is not the case. It's just that hardly any new
language features and languages achieve popularity (and those that do
take a long time). Admittedly that feeds back to discourage some from
going there, but language design is far too fascinating to let that
discourage everyone. E.g., Christian Heinlein is doing interesting
things with flexible syntax <http://flexipl.info/tutorial/>.

>It should be easy to falsify the theory: where are the new language
>features that have been invented in the last, say, twenty years?

I mentioned some things in the grandfather posting. Do you want me to
repeat them?

>Where are the powerful new languages which make Haskell
>look like Dartmouth BASIC?

You mean a language that is even harder to learn and even less
practical than Haskell?-) Such languages are developed in significant
numbers: Take a look at
<https://en.wikipedia.org/wiki/Esoteric_programming_language>.

>A related, and more worrying, trend is the new and growing area
>of research under the heading "empirical software engineering"
>which aims to do away with program semantics altogether.
>A program is deemed "correct" if and only if it passes its test suite.

Sounds to me like test-driven development. Not really new, however.
I heard about it in a talk by Kent Beck 20 years ago.

>Various automated and semi-automated ways of modifying the program
>are being investigated: any modification which passes the test suite
>is deemed to be "correct". For example, "empirical slicing"
>may be defined as "delete random sections of code and call the result
>a valid slice if it passes the regression test". Program semantics
>and program analysis are considered to be "too difficult"
>by these researchers, and therefore are not attempted.

That's an interesting idea, if it exists (the things Google gave me
when I asked for "empirical slicing" or "empirical program slicing"
did not appear to fit your description). It may not be the direction
you favour, but it would be something new, wouldn't it?

Googling for "empirical software engineering" also does not give me
any hits that fit your description, either.

>Readers of comp.risks will no doubt already be wondering how
>such methods avoid introducing security holes: given that
>a security hole will not necessarily prevent the program
>from passing its test suite (unless the tests happen
>to include the carefully crafted data which triggers
>the security hole!) As far as I can tell, the answer is:
>they don't!

And neither did the waterfall model. I am not a security expert, but
AFAIK the old-fashioned way to get there are to be aware of security
issues during design and coding, and then doing a code audit.
However, there have been some successes in finding vulnerabilities
with fuzz testing, and with techniques based on automated theorem
provers.

Martin Ward

unread,

Apr 6, 2018, 11:54:30 AM4/6/18

to

On 30/03/18 15:20, Anton Ertl wrote:
>> The theory is that "the attitude of the Algol 60 designers towards
>> language design is what led to these innovations appearing".
>> Clearly, this "attitude" was present before, during and after
>> the development of Algol 60.

> Ok, so this theory cannot even be verified of falsified for the
> features that came before Algol 60.

John asked us to speculate what might have happened differently,
not produce empirically verifiable theories :-)

I came across this quote the other day, which suggests that the change
in attitude had already started by 1979:

"The call-by-name mechanism is so expensive that modern languages
have mostly dropped it in favour of the semantically different,
but operationally more efficient, call-by-reference"
--"Understanding and Writing Compilers", Richard Bornat, Macmillan 1979.

Note that the

>> It should be easy to falsify the theory: where are the new language
>> features that have been invented in the last, say, twenty years?
>

> I mentioned some things in the grandfather posting. Do you want me to
> repeat them?

You mentioned generics and lambdas, both of which you acknowledge
have been around for a long time: lambdas have been around
since the original Lisp!

You also mentioned combining static type checking with explicit
memory management: again two features which have been around
for decades. One might ask: why earth we still need explicit memory
management in this day and age? Because it is more efficient
to implement? Doesn't that prove what I have been saying all along?

Re "empirical software engineering", I apologise that I mis-remembered
the term used and sent you on a wild goose chase. :-(
"Search based software engineering" might be a more accurate term.
Two examples are:

"Observation Based" or "Language Independent" program slicing
involves deleting random chunks of code and then compiling
and re-testing the result (if possible) to see if it is
a "valid" slice (where a "valid" slice is defined as
"one which passes the test suite"):

http://coinse.kaist.ac.kr/publications/pdfs/Binkley2014ud.pdf

"Automated Software Transplantation" involves copying random blocks
of code from one program to another and testing if the resulting
program still "works" (ie passes its test suite) and also
contains some functionality from the other program
(eg it can now handle a new file format).

http://crest.cs.ucl.ac.uk/autotransplantation/downloads/autotransplantation.pdf

https://motherboard.vice.com/en_us/article/bmj9g4/muscalpel-is-an-algorithmic-code-transplantation

> It may not be the direction
> you favour, but it would be something new, wouldn't it?

It is like introducing alchemy into the chemistry department:
let's forget all the theory and all the maths and just smush
random chemicals together and see what happens!
Yes, it's something new, and no, it's not a direction I favour.

>> Where are the powerful new languages which make Haskell
>> look like Dartmouth BASIC?
>

> You mean a language that is even harder to learn and even less
> practical than Haskell?-)

Modern popular languages are neither powerful nor easy to learn.
The latest C standard is over 700 pages (even the Haskell Language
Report is only 329 pages!). In the pursuit of efficiency over
everything, C leaves so many operations "undefined" (there are
around 199 examples of undefined behaviour) that simply
writing a piece of code to carry out twos-complement arithmetic
(in a fully standards-complient manner) is a challenge
to tax the most experienced programmers: and probably compiles
into very inefficient object code even on a processor which has
native twos complement arithmetic operations!

https://wiki.sei.cmu.edu/confluence/display/c/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow

For C++ there is the SafeInt class which tries to do this:
but examples of undefined behaviour were detected in that library,
which caused it to give incorrect results on some compilers:

http://www.cs.utah.edu/~regehr/papers/overflow12.pdf

One effect of the increases in memory size and CPU speed over
the decades is that it allows programmers to write much larger
programs and implement solutions to much more complex problems.
This makes mastering complexity far more important.
A language which is an order of magnitude more powerful
would mean that programmers need to write and comprehend
an order of magnitude less code: and each individual programmer
or small team could work on problems which are an order of
magnitude more complex. If this language takes an order of magnitude
longer to learn and fully understand, then it is still worth it!

Unfortunately, this flies in the face of modern business practice
where "programmers" are a cheap resource to be hired and put to work
generating lines of code.

Derek M. Jones

unread,

Apr 8, 2018, 12:37:54 PM4/8/18

to

All,

> John asked us to speculate what might have happened differently,
> not produce empirically verifiable theories :-)

http://shape-of-code.coding-guidelines.com/2016/05/23/the-fall-of-rome-and-the-ascendancy-of-ego-and-bluster/

> You also mentioned combining static type checking with explicit

A language can contain strong typing, but these are useless unless
people actually use them:
http://shape-of-code.coding-guidelines.com/2014/04/17/c-vs-ada-which-language-is-more-strongly-typed/

Some weak evidence that stronger typing saves some time.
My view is that the benefit happens over longer timescales (too long
for it to be cost effective to run an experiment):
http://shape-of-code.coding-guidelines.com/2014/08/27/evidence-for-the-benefits-of-strong-typing-where-is-it/

> Re "empirical software engineering",

If anybody has any public data that I don't have, please let me know:
http://www.knosof.co.uk/ESEUR

> It is like introducing alchemy into the chemistry department:

I would say it's more akin to saying that chemistry papers need to
maximize the number of mathematical orgasm they contain and lets
ignore reality.

> let's forget all the theory and all the maths and just smush
> random chemicals together and see what happens!
> Yes, it's something new, and no, it's not a direction I favour.

http://shape-of-code.coding-guidelines.com/2017/11/29/vanity-project-or-real-research/

> Modern popular languages are neither powerful nor easy to learn.

What evidence do you have for this?

George Neuner

unread,

Apr 9, 2018, 4:59:29 PM4/9/18

to

On Sun, 8 Apr 2018 14:21:48 +0100, "Derek M. Jones"
<derek@_NOSPAM_knosof.co.uk> wrote:

> Martin Ward wrote:
>> Modern popular languages are neither powerful nor easy to learn.
>
>What evidence do you have for this?

I disagree about "easy to learn" - there are plenty of languages that
are easy to learn. But as to the question of "power" ...

Note that "powerful" and "useful" (for some definition) are not the
same thing. There are plenty of semantically restricted languages
that can be considered useful for their intended purposes.

That said:

IMO, the evidence that many popular languages are not "powerful" is
that they are either exclusively or primarily OO, but they implement
only single inheritance objects.

Wherever you stand on OO as a programming paradigm, you can't deny
that single inheritance is the weakest variant of it. The addition of
"interfaces" and "mix-ins" does not make up for the lack of true
multiple inheritence in those situations where it is needed.

The necessity to write "Design Patterns" was, IMO, acknowledgement
that the average programmer could not figure out how to express their
ideas under Java's limited object model.

I prefer to use languages that naturally support multiple programming
paradigms, and don't put many (or any) limits on what can be done
using them. Some solutions are best expressed procedurally, others
are more naturally functional, and yet others are best modeled using
objects.

I relegate to the proverbial junk heap the many languages that force
solutions to be shoehorned into a model that they don't naturally fit.
There are too many "me too" languages that think a simple object model
combined with procedural code is the solution to every problem.

YMMV,
George

Anton Ertl

unread,

Apr 10, 2018, 11:09:19 AM4/10/18

to

George Neuner <gneu...@comcast.net> writes:
>The necessity to write "Design Patterns" was, IMO, acknowledgement
>that the average programmer could not figure out how to express their
>ideas under Java's limited object model.

Design Patterns was published in 1994 (based on work that started in
1990), before Java was published in 1995. Moreover, it says at the
beginning that the design patterns are somewhat language-specific, and
that the language they have in target is C++. It seems to me to be
more a guideline of how to make use of the vast possibilities of a
programming language to achieve certain things effectively (and with
maintainable results), rather than a guideline on how to work around
limitations.

Derek M. Jones

unread,

Apr 10, 2018, 11:09:57 AM4/10/18

to

George,

>>> Modern popular languages are neither powerful nor easy to learn.
>>
>> What evidence do you have for this?
>
> I disagree about "easy to learn" - there are plenty of languages that
> are easy to learn. But as to the question of "power" ...

Powerful and easy to learn is a claim that proponents of every language
make. It is a marketing statement.

If you ask them how their language can be more powerful than other
Turing complete languages, hey invariably switch to saying that it's
easy to write powerful programs (whatever they might be).

Something like 30 languages per year get non-trivial implementations.
So the question to ask is, how does your language compare to the
30 languages created last year? They invariably have not checked
out last year's languages. Then ask about comparing against the 30
from the year before, and so on.

Inventing languages is invariably vanity research. Fine, but let's
not take anything claimed seriously.

Martin Ward

unread,

Apr 10, 2018, 11:38:51 AM4/10/18

to

On 08/04/18 14:21, Derek M. Jones wrote:
>> Modern popular languages are neither powerful nor easy to learn.
>
> What evidence do you have for this?

The C standard is over 700 pages: not exactly an easy read. C has 199
different cases of undefined behaviour that the programmer has to
memorise and avoid using if they want to write conformant and
compatible code.

C++ is even more popular than C but adds layers more complexity
on top of the complexity of C: "If you think C++ is not overly
complicated, just what is a protected abstract virtual base pure
virtual private destructor, and when was the last time you needed one?"
(Tom Cargill, C++ Journal, Fall 1990).

An iostream-based "hello, world" program requires the GNU C++ compiler
to parse 718K bytes.

See also: http://yosefk.com/c++fqa/defective.html

Yet, for all that complexity, "C combines the power of assembly language
with the flexibility of assembley language"! To do anything useful
in C or C++ one needs to use large numbers of functions from
various libraries. The GNU C library, which contains basic low-level
functions such as string handling, I/O, memory allocation etc,
has a manual which is 1,174 pages long.

On the other hand, the Revised^4 Report on the Algorithmic Language
Scheme ("Dedicated to the Memory of ALGOL 60") is only a 55 page manual
but it includes the full syntax and semantics of the language.

--
Martin

Dr Martin Ward | Email: mar...@gkc.org.uk | http://www.gkc.org.uk
G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4

[In fairness, a lot of the 700 pages of the C standard are about
the library. In my copy of C99, pages 9-163 are about the
language, pages 164-401 are about the library, and then there's
about 150 pages of appendices. But it's certainly a lot bigger
than the language that K&R wrote about in the 1970s. -John]

Derek M. Jones

unread,

Apr 10, 2018, 2:07:09 PM4/10/18

to

Martin,

> On 08/04/18 14:21, Derek M. Jones wrote:
>>> Modern popular languages are neither powerful nor easy to learn.
>>
>> What evidence do you have for this?
>
> The C standard is over 700 pages: not exactly an easy read.

By evidence I mean an evaluation of multiple languages.

Here are some languages from 1957. Were they powerful and easy
to learn?

http://shape-of-code.coding-guidelines.com/2017/05/21/evidence-for-28-possible-compilers-in-1957/

....> memorise and avoid using if they want to write conformant and
> compatible code.

I thought we were talking about powerful and easy to learn?

> On the other hand, the Revised^4 Report on the Algorithmic Language
> Scheme ("Dedicated to the Memory of ALGOL 60") is only a 55 page manual
> but it includes the full syntax and semantics of the language.

My question was about powerful and easy to learn. Not about number
of pages in the language specification.
[In my experience, any language that is semantically similar to
languages you already know is easy to learn. For example, I find
python comprehensions obvious and easy to use because they're just a
syntax for a function mapping. I know other python programmers who
find them baffling and always write "for" loops instead, presumably
because the languages they'd used didn't do much function mapping.
-John]

Anton Ertl

unread,

Apr 10, 2018, 2:07:20 PM4/10/18

to

Martin Ward <mar...@gkc.org.uk> writes:
>Yet, for all that complexity, "C combines the power of assembly language
>with the flexibility of assembley language"!

I wish! The C standard allows that, but does not guarantee it, so
providing that power and flexibility is a quality-of-implementation
issue. And unfortunately, at least the gcc and LLVM maintainers do
not want to provide this quality. A manifesto of this position is

http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

my counter-position papers are:

http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf
http://www.kps2017.uni-jena.de/proceedings/kps2017_submission_5.pdf

Gene Wirchenko

unread,

Apr 11, 2018, 1:21:17 PM4/11/18

to

On Tue, 10 Apr 2018 16:11:29 +0100, Martin Ward <mar...@gkc.org.uk>
wrote:

[some neat stuff which Our Esteemed Moderator comments on]

>[In fairness, a lot of the 700 pages of the C standard are about
>the library. In my copy of C99, pages 9-163 are about the
>language, pages 164-401 are about the library, and then there's
>about 150 pages of appendices. But it's certainly a lot bigger
>than the language that K&R wrote about in the 1970s. -John]

I propose Gene's Language Heuristic:

1) Take the specification for the language and print a hardcover
book of it. (Apply reasonable rules for font size, etc.)

2) Pick up the book.

3) If you are unable to do so, call dispose.

4) Whack the language creator hard on the head with the hardcover
book.

5) Did you kill him?

6) If yes, call dispose.

7) Did you incapacitate him?

8) If yes, hope really hard that he gives up his antisocial
practices of language creation. End of heuristic.

9) Seek treatment for shock.

10) Warily, learn the language. End of heuristic.

procedure dispose

Dispose of the book, the language, and the language creator.
(Exactly how is not defined by this heuristic. If you have had to do
this before, consider reusing that method.)

There are too many languages that are too similar.

Sincerely,

Gene Wirchenko

George Neuner

unread,

Apr 11, 2018, 1:22:32 PM4/11/18

to

On Tue, 10 Apr 2018 05:48:43 GMT, an...@mips.complang.tuwien.ac.at
(Anton Ertl) wrote:

>George Neuner <gneu...@comcast.net> writes:
>>The necessity to write "Design Patterns" was, IMO, acknowledgement
>>that the average programmer could not figure out how to express their
>>ideas under Java's limited object model.
>
>Design Patterns was published in 1994 (based on work that started in
>1990), before Java was published in 1995.

Java began in 1991 ... as a language called "Oak". Many people used
and/or experimented with Oak for mobile programming prior to it
becoming Java. It was very well known.

>Moreover, [Design Patterns] says at the

>beginning that the design patterns are somewhat language-specific, and
>that the language they have in target is C++.

That is *partly* true, but read on.

On page 14, it says:
<quote>
:
We chose Smalltalk and C++ for pragmatic reasons: Our day-to-day
experience has been in these languages, and they are increasingly
popular.

The choice of programming language is important because it influences
one's point of view. Our patterns assume Smalltalk/C++-level language
features, and that choice determines what can and cannot be
implemented easily.
:
</quote>

Smalltalk had/has single inheritence only, and it's dynamic dispatch
mechanism is very different from that of C++. Examples that were
meant to work in both languages had to be targeted to the lowest
common denominator ... which in most cases meant Smalltalk even if the
syntax of the example was in C++.

The largely Smalltalk centric text transferred very naturally to
Oak/Java because their object models shared many of the same
limitations. Oak already was more popular than Smalltalk by the time
the book was finished, and the more expansive desktop Java was
starting to take off as well.

[Yes, I have ... in honesty, read parts of, skimmed others ... the
book. And yes, I worked with Smalltalk (ParcPlace on the Macintosh)
plenty enough to be familiar with its object model. Smalltalk was my
6th or 7th programming language - I was learning Objective-C at more
or less the same time, so I'm not quite certain of the chronology
there. C++ (at least what it added wrt C) came next as my 8th
language.]

There is some good stuff in the book, but many of the patterns either
are trivial to implement or simply unnecessary in C++ where all code
does NOT have to be ensconced in some object. Many others add little
value in C++ beyond some algorithmic decoupling.

Now decoupling is a three edged sword: it can make code more
maintainable, but it can also make code more complicated. In either
case, it can (and usually does) make code slower.

C++ programmers accept that using objects exacts a small runtime cost,
but as a breed they tend to eschew things that make their programs
*unnecessarily* slower.

So C++ programmers largely ignored "Design Patterns" because many felt
it really wasn't all that useful to them.

OTOH, Java programmers embraced it and held it up as the gospel of
program design. For quite a long while, if you weren't intimately
familiar with the gory details of <insert pattern here>, you couldn't
get a Java programming job to save your life.

Not so for C++. Few C++ managers gave a damn whether you could
describe <insert pattern here> ... most were more interested in how
you would go about solving the problem.

>It seems to me to be
>more a guideline of how to make use of the vast possibilities of a
>programming language to achieve certain things effectively (and with
>maintainable results), rather than a guideline on how to work around
>limitations.

That's undoubtedly what it was intended to be. Unfortunately I think
it ended up being more of a crutch for Java.

>- anton
YMMV, and certainly it will.
George

Kaz Kylheku

unread,

Apr 11, 2018, 1:22:46 PM4/11/18

to

On 2018-04-09, George Neuner <gneu...@comcast.net> wrote:
> IMO, the evidence that many popular languages are not "powerful" is
> that they are either exclusively or primarily OO, but they implement
> only single inheritance objects.

I'm surprised that anyone finds multiple inheritance so singularly
important.

Single inheritance is really only crippling if two kinds of objects have
to inherit from a common base in order to be substitutable.

Remove that restriction and inheritance is properly reduced to the mere
code/data reuse hack that it is.

If anything, lack of multiple dispatch probably hurts more than lack
of MI.

> Wherever you stand on OO as a programming paradigm, you can't deny
> that single inheritance is the weakest variant of it.

I can place my standpoint almost anywhere in the OO programming
paradigm, yet not see this. Sorry, George!

Kaz Kylheku

unread,

Apr 11, 2018, 1:22:58 PM4/11/18

to

On 2018-04-10, Martin Ward <mar...@gkc.org.uk> wrote:
> Yet, for all that complexity, "C combines the power of assembly language
> with the flexibility of assembley language"!

Not so; C provides no portable way to inspect the stack or machine
registers. Writing a precisely-tracing garbage collector which can
look for root pointers in the stack is possible in assembly language;
only a conservative approach is feasible in anything remotely resembling
portable C.

Assembly languages are predictable; for instance, they have defined
behaviors on integer overflow.

Decent quality instruction sets architectures provide ways to catch
an exception in a handler which can precisely re-start the program from
the faulting point after doing some fixup. Almost anything can be
treated in a way that assures safety: illegal instruction, division by zero.

In assembly languages, a pointer value held in a register doesn't become
"indeterminate" just because it was passed to some free()-like function.
(And other such nonsense fictions.)

Derek M. Jones

unread,

Apr 11, 2018, 1:23:36 PM4/11/18

to

Anton,

> my counter-position papers are:
>
> http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf
> http://www.kps2017.uni-jena.de/proceedings/kps2017_submission_5.pdf

and another position (which offers solutions):
http://shape-of-code.coding-guidelines.com/2014/08/31/undefined-behavior-pay-up-for-shut-up/

Hans-Peter Diettrich

unread,

Apr 11, 2018, 1:24:04 PM4/11/18

to

Am 10.04.2018 um 14:15 schrieb Derek M. Jones:

> Something like 30 languages per year get non-trivial implementations.

IMO it's not so much the implementation that distinguishes languages,
instead it's their domain and, with big projects in mind, their design
and debug features (strict typing...).

DoDi

Derek M. Jones

unread,

Apr 11, 2018, 5:08:40 PM4/11/18

to

Hans-Peter,

>> Something like 30 languages per year get non-trivial implementations.
>
> IMO it's not so much the implementation that distinguishes languages,
> instead it's their domain and, with big projects in mind, their design
> and debug features (strict typing...).

I got my data from: http://hopl.info/ (sadly, no longer maintained).
They get their data from published papers.

There are probably hundreds of non-trivial domain specific languages
created every year. We don't get to hear about them because no paper
is published describing them.

Hans-Peter Diettrich

unread,

Apr 11, 2018, 8:47:52 PM4/11/18

to

Am 10.04.2018 um 20:32 schrieb George Neuner:
> On Tue, 10 Apr 2018 05:48:43 GMT, an...@mips.complang.tuwien.ac.at
> (Anton Ertl) wrote:

> Smalltalk had/has single inheritence only, and it's dynamic dispatch
> mechanism is very different from that of C++.

Isn't *multiple inheritance* one of the features that C++ proved
impractical? Which other languages support multiple inheritance?

DoDi
[Lots of languages have multiple inheritance. Python is one of the
more popular these days. As you've seen, opinions vary about how
useful it is. -John]

bartc

unread,

Apr 12, 2018, 11:22:02 AM4/12/18

to

It's so useful that I have to go and look up what exactly it means (and
then I'm not much the wiser).

But I guess some people develop dependences on such features. If I
question some exotic feature in the Python group, there will always be
someone for whom it is indispensable.

(Never mind that Python lacks what to me are fundamental features such
as named constants, 'switch', records, enumerations, repeat-n-times
loops, static local variables, pass-by-reference, or goto.

Some of those can apparently be emulated - usually badly and
cumbersomely - by making use of the advanced features.)

The trouble is, if you dare to put forward such a point of view, someone
is going to mention the 'blub paradox', just to put you in your place.

--
bartc
[Python certainly has a lot of theology. I believe that python users
would say that your misssing features are implemented through simple
idioms and aren't worth gunking up the languages, e.g. repeat N times
is "for i in range(N):". But like I said, it's theology. -John]

Rob Warnock

unread,

Apr 12, 2018, 11:23:53 AM4/12/18

to

Hans-Peter Diettrich <DrDiet...@netscape.net> wrote:
+---------------

| Isn't *multiple inheritance* one of the features that C++ proved
| impractical? Which other languages support multiple inheritance?
|
| DoDi
| [Lots of languages have multiple inheritance. Python is one of the
| more popular these days. As you've seen, opinions vary about how
| useful it is. -John]

+---------------

Common Lisp has multiple inheritance, as well as multiple dispatch.

bartc

unread,

Apr 12, 2018, 8:56:42 PM4/12/18

to

On 12/04/2018 11:51, bartc wrote:

> [Python certainly has a lot of theology. I believe that python users
> would say that your misssing features are implemented through simple
> idioms and aren't worth gunking up the languages, e.g. repeat N times
> is "for i in range(N):". But like I said, it's theology. -John]

(Just on that point, that is one of the simpler features and it is just
neater syntax for something that can be expressed in other ways.

But it is not adding extra syntax; if anything it is getting rid of it!
If a for-loop starts like this:

for i:=1 to n do ...

Then by leaving out the bits not needed you end up with this:

to n do ...

A repeat-n-times loop (one that doesn't have to maintain an explicit
loop counter accessible as a reference-counted variable from the source
code). And an endless loop by leaving 'to n'. (This comes from Algol-68
actually; not my idea.)

But even if extra syntax is needed, so what? Syntax is free, provided
you don't go mad with it. Compare with the type system and libraries for
which no such curbs appear to exist)

--
bartc
[Syntax is free in the compiler but not necessarily in the brains of
the programmers. Back when I was writing PL/I programs, people said
my code was unreadable because I'd learned PL/I from the reference
manual, while everyone else learned it from books like "PL/I for
Fortran progammers" or "PL/I for commercial programmers." My code
used what seemed reasonable to me but others found it a mishmosh of
stuff they knew and stuff they didn't. -John]

George Neuner

unread,

Apr 12, 2018, 8:58:00 PM4/12/18

to

On Thu, 12 Apr 2018 01:09:42 +0200, Hans-Peter Diettrich
<DrDiet...@netscape.net> wrote:

>Am 10.04.2018 um 20:32 schrieb George Neuner:
>> On Tue, 10 Apr 2018 05:48:43 GMT, an...@mips.complang.tuwien.ac.at
>> (Anton Ertl) wrote:
>
>> Smalltalk had/has single inheritence only, and it's dynamic dispatch
>> mechanism is very different from that of C++.
>
>Isn't *multiple inheritance* one of the features that C++ proved
>impractical? Which other languages support multiple inheritance?

No. It proved only that C++ wasn't able to accomodate conflicting
classes effectively. Lisp handles this same problem much more
effectively, so it isn't a generic fault of MI.

Moreover, most (all?) SI languages implement interfaces (abstract
classes). Interfaces mainly are a workaround for lacking real MI, so
why would they do that if it were useless?

C++ implemented interfaces also, but it did so because it's object
layout semantics did not allow for the merging of non-conflicting
members, and its dispatch semantics did not adequately address
predictable function dispatch in the case of "diamond" inheritance.

Lisp addresses both of these issues very well, and also handles the
issue of conflicting (by type) members.

>[Lots of languages have multiple inheritance. Python is one of the
>more popular these days. As you've seen, opinions vary about how
>useful it is. -John]

Obviously it depends on the specifics of the hierarchy. There is no
guarantee that MI will save you anything. But where interfaces are
commonly employed in SI languages, an implementation under MI might
well be less work.

It often is the case that a subclass has to reimplement something from
one of its ancestors, and it *might* be the case with MI that there is
a clash wrt handling something that has to be worked around. But
using interfaces, it is *guaranteed* that you will have to
(re)implement not just things that conflict, but everything in the
interface.

YMMV,
George

George Neuner

unread,

Apr 12, 2018, 8:58:59 PM4/12/18

to

On Tue, 10 Apr 2018 18:32:23 +0000 (UTC), Kaz Kylheku
<157-07...@kylheku.com> wrote:

>On 2018-04-09, George Neuner <gneu...@comcast.net> wrote:
>> IMO, the evidence that many popular languages are not "powerful" is
>> that they are either exclusively or primarily OO, but they implement
>> only single inheritance objects.
>
>I'm surprised that anyone finds multiple inheritance so singularly
>important.
>
>Single inheritance is really only crippling if two kinds of objects have
>to inherit from a common base in order to be substitutable.

That's why SI languages implement interfaces - the poor person's MI.

>If anything, lack of multiple dispatch probably hurts more than lack
>of MI.

I agree that multiple dispatch is at least as important.

>> Wherever you stand on OO as a programming paradigm, you can't deny
>> that single inheritance is the weakest variant of it.
>
>I can place my standpoint almost anywhere in the OO programming
>paradigm, yet not see this. Sorry, George!

You can uncover his eyes, but you can't make him see. <grin>
Sorry Kaz!

George

Kaz Kylheku

unread,

Apr 13, 2018, 1:16:31 PM4/13/18

to

On 2018-04-13, George Neuner <gneu...@comcast.net> wrote:
> On Thu, 12 Apr 2018 01:09:42 +0200, Hans-Peter Diettrich
><DrDiet...@netscape.net> wrote:
>
>>Am 10.04.2018 um 20:32 schrieb George Neuner:
>>> On Tue, 10 Apr 2018 05:48:43 GMT, an...@mips.complang.tuwien.ac.at
>>> (Anton Ertl) wrote:
>>
>>> Smalltalk had/has single inheritence only, and it's dynamic dispatch
>>> mechanism is very different from that of C++.
>>
>>Isn't *multiple inheritance* one of the features that C++ proved
>>impractical? Which other languages support multiple inheritance?
>
> No. It proved only that C++ wasn't able to accomodate conflicting
> classes effectively. Lisp handles this same problem much more
> effectively, so it isn't a generic fault of MI.

More problematic than the lexical conflicts is the whole business
of "virtual inheritance". If you don't use it, you end up with
with multiple copies of the same bases via different paths in the
multiple inheritance DAG.

It all has to do with the specification of C++ inheritance (and
everything else) being motivated by a specific locus of implementations:
how it works at the memory layout level with specific pointers, offsets,
tables and whatnot.

Before the C++ people specify anything, they first have to ensure that
some relatively cheap and fast mechanism can handle it at run-time.
That then guides how the specification is written, even if it is not
explicitly in the wording.

> Moreover, most (all?) SI languages implement interfaces (abstract
> classes). Interfaces mainly are a workaround for lacking real MI, so
> why would they do that if it were useless?

Are they? Interfaces are only a workaround for lack of MI if you
take it for granted that a class must *declare* inheritance from
B before its instances are permitted to be substituted wherever
a B may be used.

If you do not take that for granted, then you don't need inheritance
at all.

In the Common Lisp object system that you mentioned, we can easily
specialize the same method to string or integer, even though those
are not related by inheritance. We just:

(defmethod foo ((arg integer))
...)

(defmethod foo ((arg string))
...)

The integer and string classes doesn't have to have a declaration
injected into them which them allows their instances to appear
as obj in (foo obj) calls.

So *that* motivation for MI is absent: we can take any class
or classes, and just use them in multiple method-based protocols;
as many as we want.

--
TXR Programming Lanuage: http://nongnu.org/txr
Music DIY Mailing List: http://www.kylheku.com/diy
ADA MP-1 Mailing List: http://www.kylheku.com/mp1

Kaz Kylheku

unread,

Apr 13, 2018, 1:17:36 PM4/13/18

to

On 2018-04-13, George Neuner <gneu...@comcast.net> wrote:

> On Tue, 10 Apr 2018 18:32:23 +0000 (UTC), Kaz Kylheku
><157-07...@kylheku.com> wrote:
>
>>On 2018-04-09, George Neuner <gneu...@comcast.net> wrote:
>>> IMO, the evidence that many popular languages are not "powerful" is
>>> that they are either exclusively or primarily OO, but they implement
>>> only single inheritance objects.
>>
>>I'm surprised that anyone finds multiple inheritance so singularly
>>important.
>>
>>Single inheritance is really only crippling if two kinds of objects have
>>to inherit from a common base in order to be substitutable.
>
> That's why SI languages implement interfaces - the poor person's MI.

Well, SI in *static* languages!

SI in *dynamic* languages doesn't need interfaces.

I should know; I made one:

1> (defstruct apple nil (:method rot (me) `@me rots`))
#<struct-type apple>
2> (defstruct orange nil (:method rot (me) `@me rots`))
#<struct-type orange>
3> (each ((f (list (new apple) (new orange))))
(prinl f.(rot)))
"#S(apple) rots"
"#S(orange) rots"

apple and orange are not related; no "rottable" interface is
needed to legitimize the .(rot) call.

I mean, phooey; over my dead &body.

Hans-Peter Diettrich

unread,

Apr 13, 2018, 1:18:19 PM4/13/18

to

Am 13.04.2018 um 02:51 schrieb George Neuner:
> On Thu, 12 Apr 2018 01:09:42 +0200, Hans-Peter Diettrich
> <DrDiet...@netscape.net> wrote:
>
>> Am 10.04.2018 um 20:32 schrieb George Neuner:
>>> On Tue, 10 Apr 2018 05:48:43 GMT, an...@mips.complang.tuwien.ac.at
>>> (Anton Ertl) wrote:
>>
>>> Smalltalk had/has single inheritence only, and it's dynamic dispatch
>>> mechanism is very different from that of C++.
>>
>> Isn't *multiple inheritance* one of the features that C++ proved
>> impractical? Which other languages support multiple inheritance?
>
> No. It proved only that C++ wasn't able to accomodate conflicting
> classes effectively. Lisp handles this same problem much more
> effectively, so it isn't a generic fault of MI.
>
> Moreover, most (all?) SI languages implement interfaces (abstract
> classes). Interfaces mainly are a workaround for lacking real MI, so
> why would they do that if it were useless?

Interfaces don't have all the problems arising from multiple
inheritance. I can see only the disadvantage, that duplicate code may be
required to implement the same interface in multiple classes. But see
below...

I've learned to distinguish class aggregation and composition by asking
whether it *is* something, or *has* something. A house *has* a roof, but
it *isn't* a roof, so it does not inherit from a roof class. This way I
never felt a need for multiple inheritance in my programs. But perhaps
outside my restricted experience there exist other needs...

> It often is the case that a subclass has to reimplement something from
> one of its ancestors, and it *might* be the case with MI that there is
> a clash wrt handling something that has to be worked around. But
> using interfaces, it is *guaranteed* that you will have to
> (re)implement not just things that conflict, but everything in the
> interface.

An interface can be implemented by a (class type) component of a class.
Then only the specific dependencies between both classes have to be
implemented explicitly. In Delphi the "implements" keyword implements
such object delegation, in addition to method delegation.

DoDi

Martin Ward

unread,

Apr 13, 2018, 1:21:09 PM4/13/18

to

Anton wrote:
> my counter-position papers are:
>
> http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf
>
> http://www.kps2017.uni-jena.de/proceedings/kps2017_submission_5.pdf

Quote: "if the totally-defined C specified that shifting by the data
width produces 0, the compiler would have to implement shifts more
expensively on some machines; and if it specified that it produces the
unshifted value, it would have to implement shifts more expensively on
other machines."

This is a perfect example of support for my thesis. I am not happy
about this, since I was (and still am) hoping that my thesis would be
disproved :-(

To recap the thesis:

The Algol 60 designers placed mathematical simplicity (simplicity in
the language, ease of analysis etc) above the effort to implement the
compiler and execution efficiency of compiled code.

This attitude led to an explosion of productive research and
development in compilers and language implementation.

Since that time, language designers have become very cautious and

timid in specifying powerful new language features and compiler

research has stagnated (with C as an extreme example of this
stagnation).

Not specifying the result of a shift because the implementation is
inefficient on some machines is an extreme example of timidity.
(Mathematically, shifting by 2N bits should be semantically equivalent
to shifting by N bits twice, regardless of the value of N!)

Back in the good old days, language designers developed powerful
languages regardless of implementation efficiency, and hardware
designers responded by developing hardware to implement these powerful
languages more efficiently (hardware stacks, lisp machines,
content-addressable memory and so on).

The post by Hans-Peter Diettrich on Unum numbers is a small
encouraging sign.

--
Martin

Dr Martin Ward | Email: mar...@gkc.org.uk | http://www.gkc.org.uk
G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4

[I think you'll find that don't specify in ways that might be hard to
compile goes all the way back to Fortran. Its standards allowed
optimizations like strength reduction if they were mathematically
equivalent to the original code, even if they weren't computationally
equivalent. -John]

Martin Ward

unread,

Apr 13, 2018, 1:22:47 PM4/13/18

to

> [Syntax is free in the compiler but not necessarily in the brains of
> the programmers. Back when I was writing PL/I programs, people said
> my code was unreadable because I'd learned PL/I from the reference
> manual, while everyone else learned it from books like "PL/I for
> Fortran progammers" or "PL/I for commercial programmers." My code
> used what seemed reasonable to me but others found it a mishmosh of
> stuff they knew and stuff they didn't. -John]

The IBM Language Reference for Enterprise PL/I for z/OS is 862 pages.

E.W.Dijkstra wrote in his ACM Turing Lecture 1972:

"Finally, although the subject is not a pleasant one, I must
mention PL/1, a programming language for which the defining
documentation is of a frightening size and complexity.
Using PL/1 must be like flying a plane with 7000 buttons,
switches and handles to manipulate in the cockpit.
I absolutely fail to see how we can keep our growing programs
firmly within our intellectual grip when by its sheer baroqueness
the programming language -- our basic tool, mind you! -- already
escapes our intellectual control."

He concluded:

"We shall do a much better programming job, provided that we approach
the task with a full appreciation of its tremendous difficulty,
provided that we stick to modest and elegant programming languages,
provided that we respect the intrinsic limitations of the human mind
and approach the task as Very Humble Programmers."

--
Martin

Dr Martin Ward | Email: mar...@gkc.org.uk | http://www.gkc.org.uk
G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4

[It was PL/I F, for which the manual was much shorter. The ANSI
standard, on the other hand, was and is unreadable, page after page
of VDL. -John]

Robin Vowels

unread,

Apr 14, 2018, 3:04:10 PM4/14/18

to

From: "Martin Ward" <mar...@gkc.org.uk>
Sent: Friday, April 13, 2018 11:10 PM

> The IBM Language Reference for Enterprise PL/I for z/OS is 862 pages.

The IBM PL/I for OS/2 Language Reference is 491 pages plus 121 pages
for the built-in functions, published 1994.
This reference includes a number of new language features.

The language references also include a number of example programs.

> E.W.Dijkstra wrote in his ACM Turing Lecture 1972:
>
> "Finally, although the subject is not a pleasant one, I must
> mention PL/1, a programming language for which the defining
> documentation is of a frightening size and complexity.
> Using PL/1 must be like flying a plane with 7000 buttons,
> switches and handles to manipulate in the cockpit.

Dijkstra's comment is nonsense.
It seems that he couldn't even spell the name of the language.
[assuming that the quotation is literally correct]

> I absolutely fail to see how we can keep our growing programs
> firmly within our intellectual grip when by its sheer baroqueness
> the programming language -- our basic tool, mind you! -- already
> escapes our intellectual control."

Others seem to have mastered it, but not Dijkstra, apparently.
[Considering how quickly it was designed, PL/I is not a bad language,
but it definitely has parts that fit together poorly. I once tried
to write a program that used arrays of 12-bit strings and the code
PL/I F generated was very special, not in a good way. Per one of my
previous comments, few programmers learn all of PL/I, most learn a
subset adequate for the kind of programming they do. I doubt many
write programs that use both recursive routines with stacks of
controlled storage and decimal picture I/O. -John]

Robin Vowels

unread,

Apr 14, 2018, 3:04:45 PM4/14/18

to

From: "bartc" <b...@freeuk.com>
Sent: Friday, April 13, 2018 4:40 AM

> But it is not adding extra syntax; if anything it is getting rid of it!
> If a for-loop starts like this:
>
> for i:=1 to n do ...
>
> Then by leaving out the bits not needed you end up with this:
>
> to n do ...

The control variable, i, must not be omitted.
It may be required for computations within the loop
(including subscript references).

Even if not explicitly referenced within the loop,
its value will be required for fault finding (with error control
and/or with debugger).

> A repeat-n-times loop (one that doesn't have to maintain an explicit
> loop counter accessible as a reference-counted variable from the source
> code).

It's still required, as described above.

George Neuner

unread,

Apr 14, 2018, 3:05:32 PM4/14/18

to

On Fri, 13 Apr 2018 10:22:03 +0200, Hans-Peter Diettrich

<DrDiet...@netscape.net> wrote:

>Am 13.04.2018 um 02:51 schrieb George Neuner:
>> On Thu, 12 Apr 2018 01:09:42 +0200, Hans-Peter Diettrich
>> <DrDiet...@netscape.net> wrote:
>>
>>> Am 10.04.2018 um 20:32 schrieb George Neuner:
>>>> On Tue, 10 Apr 2018 05:48:43 GMT, an...@mips.complang.tuwien.ac.at
>>>> (Anton Ertl) wrote:
>
>> Moreover, most (all?) SI languages implement interfaces (abstract
>> classes). Interfaces mainly are a workaround for lacking real MI, so
>> why would they do that if it were useless?
>
>Interfaces don't have all the problems arising from multiple
>inheritance. I can see only the disadvantage, that duplicate code may be
>required to implement the same interface in multiple classes. But see
>below...

Addressing your next point, interfaces don't easily solve the problem
that class A really IS-A class B and also IS-A class C.

>I've learned to distinguish class aggregation and composition by asking
>whether it *is* something, or *has* something. A house *has* a roof, but
>it *isn't* a roof, so it does not inherit from a roof class. This way I
>never felt a need for multiple inheritance in my programs. But perhaps
>outside my restricted experience there exist other needs...

HAS-A relationships also can be handled by container delegation if the
object system supports it. E.g., the house object can contain a roof
object and "roof requests" sent to the house can be forwarded to the
contained roof.

IMO, this is a really useful feature that too few object systems
support.

The distinction between IS-A and HAS-A is something that isn't really
addressed by MI inheritence. But that isn't to say that it couldn't
be. Something like "implements" (more below) in an MI language could
be used to inherit preserving the HAS-A distinction for some suitable
predicate.

But IMO it is more natural to ask if something DOES <foo> or IS
<fooable> rather than to ask if it HAS some property <foo>. Obviously,
it's the same question - but the way in which it is asked implies a
slightly different semantic - one which MI answers directly [modulo
class naming] without needing any distinction of interfaces. YMMV.

>> It often is the case that a subclass has to reimplement something from
>> one of its ancestors, and it *might* be the case with MI that there is
>> a clash wrt handling something that has to be worked around. But
>> using interfaces, it is *guaranteed* that you will have to
>> (re)implement not just things that conflict, but everything in the
>> interface.
>
>An interface can be implemented by a (class type) component of a class.
>Then only the specific dependencies between both classes have to be
>implemented explicitly. In Delphi the "implements" keyword implements
>such object delegation, in addition to method delegation.

But not all languages are so kind.

The problem is in unrelated classes that need to implement the entire
interface. Consider, e.g., serializable, or displayable - in many
languages you would end up reimplementing the entire interface in
every branch of your heirarchy, with overrides in virtually every
subclass.

The result is that some languages now have many such things built-in -
either implicitly, or via wizard mechanisms like "implements". Lots
of built-in functionality takes the sting out of using uncooperative
languages that make it hard (or impossible) to implement such
functionality yourself. Unfortunately, it also tends to blind the
users to the limitations of the language ... until they try to do
something the language just won't permit, and then b*tch about needing
"foreign" functions and having to do things in C or assembler.

The "implements" mechanism saves work in SI languages, but it is less
useful in an MI language where any class can simply inherit whatever
functionality it needs.

>DoDi

YMMV,
George

bartc

unread,

Apr 14, 2018, 4:57:21 PM4/14/18

to

On 14/04/2018 05:19, Robin Vowels wrote:
> From: "bartc" <b...@freeuk.com>
> Sent: Friday, April 13, 2018 4:40 AM
>
>
>> But it is not adding extra syntax; if anything it is getting rid of it!
>> If a for-loop starts like this:
>>
>> for i:=1 to n do ...
>>
>> Then by leaving out the bits not needed you end up with this:
>>
>> to n do ...
>
> The control variable, i, must not be omitted.
> It may be required for computations within the loop
> (including subscript references).
>
> Even if not explicitly referenced within the loop,
> its value will be required for fault finding (with error control
> and/or with debugger).

Huh? Who says that?

This example is from my language (inspired by Algol68), where the
control variable /can/ be omitted. And it's not needed any more than you
need one here for this line of code repeated three times:

print "*"
print "*"
print "*"

(For that matter, it can be omitted from a C for-loop too, if only
because that language is so crude it doesn't even have the concept of a
control variable for a loop.)

>> A repeat-n-times loop (one that doesn't have to maintain an explicit
>> loop counter accessible as a reference-counted variable from the source
>> code).
>
> It's still required, as described above.

If implemented as an actual loop (rather than unrolling or whatever),
then there might be a count kept somewhere. But it doesn't need to be in
a format compatible with the regular variables in the language or be
part of its type system.

The count might also increment upwards or downwards; more likely the latter.

If a loop index is needed for fault-finding, then you just switch over
to for-loop that does have the index.

--
bartc

Hans-Peter Diettrich

unread,

Apr 14, 2018, 9:02:17 PM4/14/18

to

Am 14.04.2018 um 19:40 schrieb George Neuner:

> The problem is in unrelated classes that need to implement the entire
> interface. Consider, e.g., serializable, or displayable - in many
> languages you would end up reimplementing the entire interface in
> every branch of your heirarchy, with overrides in virtually every
> subclass.

Please give an example where inheritance from a "serializable" base
class can do something meaningful, without explicitly coded integration
into the derived class.

> The "implements" mechanism saves work in SI languages, but it is less
> useful in an MI language where any class can simply inherit whatever
> functionality it needs.

See above. Fully unrelated functionality does not even deserve a
relationship between both classes, no inheritance at all. Any
dependencies require explicit code, even in MI.

DoDi

Hans-Peter Diettrich

unread,

Apr 14, 2018, 9:02:28 PM4/14/18

to

Am 14.04.2018 um 19:40 schrieb George Neuner:

> On Fri, 13 Apr 2018 10:22:03 +0200, Hans-Peter Diettrich

>> An interface can be implemented by a (class type) component of a class.
>> Then only the specific dependencies between both classes have to be
>> implemented explicitly. In Delphi the "implements" keyword implements
>> such object delegation, in addition to method delegation.
>
> But not all languages are so kind.

A strange argument :-(

If a language can do it without MI, what's the remaining benefit of MI?

DoDi

Andy Walker

unread,

Apr 14, 2018, 9:03:32 PM4/14/18

to

On 14/04/18 05:19, Robin Vowels wrote:
> From: "bartc" <b...@freeuk.com>

[...]

>> Then by leaving out the bits not needed you end up with this:
>> to n do ...
> The control variable, i, must not be omitted.

Can't answer for Python [the previous topic] or for BartC's
own languages, but in Algol 68, as Bart referenced, it most certainly
can, and sometimes ought to be.

Specific examples:

(a) Sometimes the control variable does not proceed by constant
increments:

int i := 0;
to n do ...; i +:= f(i) od

(b) Sometimes the control variable, if present, might overflow, eg in
the case of a loop that goes round more than "maxint" times:

while ... do ... od

[May be worth noting that RR3.5.2 Step 4 contains a bug in the
printed version in this case, pointed out and corrected somewhere
in "Algol Bulletin". You could write "for i by 0 to 2 ..." but
that is just silly.]

> It may be required for computations within the loop
> (including subscript references).

And it may not. Programmer's decision.

> Even if not explicitly referenced within the loop,
> its value will be required for fault finding (with error control
> and/or with debugger).

*May* be. Depends on the "error control" and "debugger", and
how they, if they exist, interact with the programmer, on what faults
may have occurred -- perhaps the code was already tested -- and on
the phase of the moon.

Personally, I'm not a big fan of the style police, nor of
languages that encourage them.

--
Andy Walker,
Nottingham.

Costello, Roger L.

unread,

Apr 16, 2018, 10:47:30 AM4/16/18

to

Robin Vowels wrote:

> Dijkstra's comment is nonsense.

I am curious, why do you say Dijkstra's comment is nonsense?

> [assuming that the quotation is literally correct]

The quote seems to be correct:

https://books.google.com/books?id=JdziBwAAQBAJ&pg=PA15&lpg=PA15&dq=Finally,+although+the+subject+is+not+a+pleasant+one,+I+must+mention+PL/1,+a+programming+language+for+which+the+defining+documentation+is+of+a+frightening+size+and+complexity&source=bl&ots=5709gQOo9K&sig=AfrW2xb-WrcWKpmRAsfxufLS0F4&hl=en&sa=X&ved=0ahUKEwj4od7S677aAhUHvFMKHSF3AcUQ6AEIKTAA#v=onepage&q=Finally%2C%20although%20the%20subject%20is%20not%20a%20pleasant%20one%2C%20I%20must%20mention%20PL%2F1%2C%20a%20programming%20language%20for%20which%20the%20defining%20documentation%20is%20of%20a%20frightening%20size%20and%20complexity&f=false

[Read the page or so around the quote and you'll see that Dijkstra
thought Algol60, of which he was one of the authors, was too
complicated, because BNF made it too easy to specify syntax. As far
as I can tell, he wanted extremely simple languages to make it easier
to prove programs correct. But that, of course, is another swamp.
-John]

Robin Vowels

unread,

Apr 17, 2018, 4:03:11 PM4/17/18

to

From: "Costello, Roger L." <cost...@mitre.org>
Sent: Monday, April 16, 2018 10:56 PM

> Robin Vowels wrote:
>
>> Dijkstra's comment is nonsense.
>
> I am curious, why do you say Dijkstra's comment is nonsense?

Dijkstra's comment is nonsense because it is possible to master the
language. His analogy with flying a plane is entirely erroneous.

Yes, when it was introduced PL/I was a larger and richer language than
FORTRAN, but was easier to learn and to use, and because there were
not numerous restrictions on such things as expressions in DO
statements, extended ranges, awkward rules in constructing and using
FORMAT statements, etc.

It was possible to do much more with the language, especially with
character strings (who recalls Hollerith constants?).

One I recall, storing large arrays in a machne with limited storage --
declare an array of one-digit decimal integers.

[BTW, the current specification of Fortran is longer than that of PL/I.]

[Perhaps he meant that it was impossible for *him* to master the language.
It does have some odd rough edges, e.g., give or take my recollection of
the syntax:

DCL (A,B,C) CHAR(3);
A = '123';
B = '456';
C = A+B;

What does C contain? Answer: three spaces. -John]

Robin Vowels

unread,

Apr 19, 2018, 1:19:22 AM4/19/18

to

From John Levine

Sent: Monday, April 16, 2018 10:56 PM

> [Perhaps he meant that it was impossible for *him* to master the language.
> It does have some odd rough edges, e.g., give or take my recollection of
> the syntax:

> DCL (A,B,C) CHAR(3);
> A = '123';
> B = '456';
> C = A+B;

> What does C contain? Answer: three spaces. -John]

No. IBM's PL/I for Windows compiler produces two diagnostics:

1: at compilation time, at line 6 [your line 4]:

(6:1) : IBM1211I W Source in string assignment is longer than the target C.

2. At execution time, a fatal error at the assignment C = A+B :

IBM0441I ONCODE=0150 The STRINGSIZE condition was raised.
At offset +0000016B in procedure with entry L

Character variables were not really intended for doing decimal arithmetic,
though conversions from character to decimal are more-or-less common.

I'm not sure whether a compiler warning message would have been
produced in the 1960s with the PL/I F-compiler,
but am fairly certain that there would have been no run-time error.
[I got the three spaces in PL/I F. I dimly recall that if I turned on optimization,
it'd do constant substitution and just move the three spaces into C. Nice to hear
that they warn you off this design error now. -John]

Gene Wirchenko

unread,

Apr 19, 2018, 1:20:25 AM4/19/18

to

On Tue, 17 Apr 2018 19:08:57 +1000, Our Esteemed Moderator wrote:

[snip]

>[BTW, the current specification of Fortran is longer than that of PL/I.]
>
>[Perhaps he meant that it was impossible for *him* to master the language.
>It does have some odd rough edges, e.g., give or take my recollection of
>the syntax:
>
> DCL (A,B,C) CHAR(3);
> A = '123';
> B = '456';
> C = A+B;
>
>What does C contain? Answer: three spaces. -John]

There was an even weirder PL/I one that I tried to find but can
not. As I recall, it was five implicit conversions that ended up with
a value opposite to the starting value.

Sincerely,

Gene Wirchenko

anti...@math.uni.wroc.pl

unread,

Apr 29, 2018, 1:44:54 PM4/29/18

to

George Neuner <gneu...@comcast.net> wrote:
> On Fri, 02 Mar 2018 09:30:42 GMT, an...@mips.complang.tuwien.ac.at
> (Anton Ertl) wrote:
>
> >George Neuner <gneu...@comcast.net> writes:
> >>On Sat, 17 Feb 2018 16:39:04 GMT, an...@mips.complang.tuwien.ac.at
> >>(Anton Ertl) wrote:
> >>
> >>>Kaz Kylheku <217-67...@kylheku.com> writes:
> >>>>On 2018-02-15, George Neuner <gneu...@comcast.net> wrote:
> >>>>> No worries. IME, displays don't get much respect from modern
> >>>>> textbooks - they are mentioned mostly in passing.
> >>>>
> >>>>This is probably because of
> >>>
> >>>This is because displays were found to be more costly for Algol-like
> >>>languages
> >>
> >>More costly than what?
> >
> >Static link chains.
> >
> >>> IIRC the additional cost is in updating the display on calls
> >>>and returns.
> >>
> >>Which is no different from the overhead to maintain static links
> >
> >Wrong. Consider the following nesting of functions:
> >
> >A
> > B
> > C
> >D
> >
> >With a display, in C you have built up a display pointing to the
> >frames of C, B, and A. When performing a call from C to D, you throw
> >that away and replace it with a display pointing to just the frame of
> >D. When returning from D, you have to restore the old display.
>
> No. D and A are at the same lexical level. On entry, D will replace
> A's entry in the display, and will restore A's entry on exit.
>
> The scope rules say D can't see or call B or C. If it calls A, then A
> replaces it at the same lexical level. When A returns, D's link is
> restored.

You assume no procedure parameters (boring case). With procedure
parameters C can pass itself (or B or other procedure at the
same lexical level) to D and D can call it. So dynamic chain
can be:

A B C D C

and innermost C needs to see A variables in its display.
IIUC resolving this required copies that make display
management O(n) (n being depth of static chain).

> No problem. And no need to preserve the whole display.
>
> >If you don't maintain a static link chain, you have to save the
> >complete display of C on the call and restore it on return.
>
> No you don't.
>
>
> >Note that D may itself call functions that need more display, so you don't get
> >away with just saving and restoring the first slot of the display.
> >IIRC this was the variant looked at in the paper that concluded that
> >displays are more costly.
>
> That is simply wrong unless you change the whole semantics.
>
> As long as call sites are restricted to being within the definition
> scope of the called function [as is true in Pascal],

Pascal without procedural parameters.

--
Waldek Hebisch

Martin Ward

unread,

May 1, 2018, 10:07:50 AM5/1/18

to

On 14/04/18 05:11, Robin Vowels wrote:
>> The IBM Language Reference for Enterprise PL/I for z/OS is 862 pages.
>
> The IBM PL/I for OS/2 Language Reference is 491 pages plus 121 pages
> for the built-in functions, published 1994.
> This reference includes a number of new language features.

So in order to master PL/I one would need to read *both* manuals, and
probably a few other similar sized manuals for any other versions or
variants of PL/I that may be in existence.

As we will see below, different versions of PL/I give different
meanings even for a very simple statement such as "C = A+B". Someone
who has mastered PL/I would need to have *all* of these different
meanings at their fingertips.

> Dijkstra's comment is nonsense because it is possible to master the
> language. His analogy with flying a plane is entirely erroneous.

...

>> What does C contain? Answer: three spaces. -John]
>

> No. IBM's PL/I for Windows compiler produces two diagnostics:
>
> 1: at compilation time, at line 6 [your line 4]:
>
> (6:1) : IBM1211I W Source in string assignment is longer than the target C.
>
> 2. At execution time, a fatal error at the assignment C = A+B :
>
> IBM0441I ONCODE=0150 The STRINGSIZE condition was raised.
> At offset +0000016B in procedure with entry L

Clearly *you* have not mastered PL/I: since you had to construct and
execute a sample program in order to find out the meaning of the very
simple statement "C = A+B" on a particular version of PL/I, and you do
not know the meaning of the statement for other versions of PL/I that
you do not have access to:

> I'm not sure whether a compiler warning message would have been
> produced in the 1960s with the PL/I F-compiler,
> but am fairly certain that there would have been no run-time error.

You are "fairly certain" that PL/I F gives a compiler warning, but
John knows that it does not. It would appear from this discussion that
Dijkstra's evaluation of PL/I has been fully justified: "I absolutely

fail to see how we can keep our growing programs firmly within our
intellectual grip when by its sheer baroqueness the programming
language -- our basic tool, mind you! -- already escapes our
intellectual control."

When you have to type in and execute a program to find out what "C =
A+B" does, then you can safely say that the programming language has
indeed escaped your intellectual control.

Later in the same paragraph he described PL/I as having the "growth
characteristics of a dangerous tumor": the growth of the manual from
612 pages in 1994 to 892 pages in 2013 also bears out this statement
as prophetic.

https://www.cs.utexas.edu/~EWD/transcriptions/EWD03xx/EWD340.html

--
Martin

Dr Martin Ward | Email: mar...@gkc.org.uk | http://www.gkc.org.uk
G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4

[Every language's manual grows over the years so I don't think that is a fair
comparison. PL/I was a large language for its time but if you compare it
to, say, C++ or even python with its standard libraries it's not all that
large. It does have the problem of having been invented in a hurry, so
that there are persistent rough edges where the parts from Fortran and
the parts from Cobol meet. This will end the skirmishing unless someone
has something about, you know, building PL/I compilers. -John]

Albert van der Horst

unread,

May 5, 2018, 9:24:21 AM5/5/18

to

In article <18-0...@comp.compilers>, Martin Ward <mar...@gkc.org.uk> wrote:
>On 30/03/18 15:20, Anton Ertl wrote:
>>> The theory is that "the attitude of the Algol 60 designers towards
>>> language design is what led to these innovations appearing".
>>> Clearly, this "attitude" was present before, during and after
>>> the development of Algol 60.
>> Ok, so this theory cannot even be verified of falsified for the
>> features that came before Algol 60.
>
>John asked us to speculate what might have happened differently,
>not produce empirically verifiable theories :-)
>
>I came across this quote the other day, which suggests that the change
>in attitude had already started by 1979:
>
>"The call-by-name mechanism is so expensive that modern languages
>have mostly dropped it in favour of the semantically different,
>but operationally more efficient, call-by-reference"
>--"Understanding and Writing Compilers", Richard Bornat, Macmillan 1979.

Earlier.
Algol 68 already moved to call by reference.

I paraphrase from page 182 183 Informal introduction to Algol68
(Lindsey / van der Meulen)
"
Algol 60 had something called Jensen's devise, using (or misusing)
call-by-name. In algol 68 we use references only.
"
Then they give an example of how the effect of call-by-name of e.g. a
real can be had by using a procedure that returns a a real.

Basically a call-by-name is a procedure where the input
parameters are obsured, as viewed from the algol 68 perspective.

Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst