Printing beyond printf

James Harris

Dec 27, 2021, 9:01:19 AM

I've loads of other messages to get back to but while I think of it I'd
like to post a suggestion for you guys to shoot down in flames. ;-)

The printf approach to printing is flexible and fast at rendering
inbuilt types - probably better than anything which came before it - but
it's not perfect. In particular, it means that the code inside printf
which does the rendering needs to know about all types it may be asked
to render in character form. But there are many other types. E.g. a
programmer could want a print routine to render a boolean, an integer, a
float, a record, a table, a list, a widget, etc.

So here's another potential approach. What do you think of it?

The idea is, as with the printf family, to have a controlling string
where normal characters are copied verbatim and special fields are
marked with a % sign or similar. The difference is what would come after
the % sign and how it would be handled.

What I am thinking of is a format specification something like

%EB;

where "E" is a code which identifies the rendering engine, "B" is the
body of the format and ";" marks the end of the format and a return to
normal printing.

The /mechanical/ difference is that rather than the print function doing
all the formatting itself it would outsource any it didn't know. For
outsourcing, the rendering engine would be sent both the value to be
printed AND a pointer to the format string.

As for where rendering engines could come from:

* Some rendering engines could be inbuilt.
* Some could be specified earlier in the code.
* Some could be supplied in the parameter list (see below).

What would the formats look like? As some examples:

%i; a plain, normal, signed integer
%iu; a plain, normal, unsigned integer
%iu02x; a 2-digit zero-padded unsigned hex integer
%Kabc; a type K unknown to the print function

The latter would need the print function to have been previously told
about a rendering engine for "K". The print function would pass to the
rendering engine the format specification and a pointer to the value.
Finally,

%*abc; a dynamic render

The * would indicate that the address of the rendering engine had been
supplied as a parameter as in

printit("This is %*abc;, OK?", R, v)

where R is the rendering engine and v is the value to be rendered
according to the specification abc.
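
A minimal sketch of the dispatch described above, written in Python purely for brevity. The names ENGINES, int_engine, angle_engine and the emit callback are hypothetical, and the body grammar for "i" (an optional "u" followed by a Python-style spec) is an assumption of the sketch, not part of the proposal.

```python
ENGINES = {}  # engine code -> callable(body, value, emit); names are hypothetical


def int_engine(body, value, emit):
    # Assumed grammar for 'i': optional 'u' (unsigned), then a Python-style
    # spec such as '02x'. A low-level version would emit digits directly
    # instead of going through format().
    if body.startswith("u"):
        if value < 0:
            raise ValueError("unsigned field given a negative value")
        body = body[1:]
    for ch in format(value, body):
        emit(ch)


ENGINES["i"] = int_engine  # an "inbuilt" engine


def printit(control, *args, emit):
    """Copy normal characters; hand each %EB; field to engine E with body B."""
    values = iter(args)
    i = 0
    while i < len(control):
        if control[i] != "%":
            emit(control[i])
            i += 1
            continue
        end = control.index(";", i)          # ';' marks the end of the field
        code, body = control[i + 1], control[i + 2:end]
        # '*' means the engine itself is the next parameter
        engine = next(values) if code == "*" else ENGINES[code]
        engine(body, next(values), emit)
        i = end + 1


def angle_engine(body, value, emit):         # hypothetical user-supplied engine
    for ch in f"<{body}:{value}>":
        emit(ch)


out = []
printit("This is %*abc;, OK? %iu02x;", angle_engine, "v", 10, emit=out.append)
print("".join(out))   # This is <abc:v>, OK? 0a
```

Registering an engine "earlier in the code" would amount to adding another entry to ENGINES before the call.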

That's it. It's intended to be convenient, efficient, flexible and about
as simple as possible to use. Whether it achieves that is up for debate.

Thoughts/opinions? Is there a better approach to the formatted printing
of arbitrary types?

--
James Harris

Bart

Dec 27, 2021, 10:49:40 AM

Look at that last example. You have to give it three things other than
the surrounding context: v, R and *abc.

At the simplest, you want to do it with just v. Then:

* It will apply the renderer previously associated with that user-type

* If there isn't one, it will use a default rendering for that
generic type (array, struct etc)

* Or it can say it doesn't know how to do it, or just prints a reference
to v

Next simplest is when you specify some parameters to control the
formatting. How these work depends on what it does above.

Supplying a renderer each time you print I would say is not the simplest
way of doing it! If you're going to do that, you might as well do:

printit("This is %s; OK?", R(v,"abc"))

Then no special features are needed. Except that if R returns a string,
then you need some means of disposing of that string after printing, but
there are several ways of dealing with that.

> %i; a plain, normal, signed integer
> %iu; a plain, normal, unsigned integer
> %iu02x; a 2-digit zero-padded unsigned hex integer
> %Kabc; a type K unknown to the print function

This is too C-like. C-style formats have all sorts of problems
associated with hard-coding a type-code into a format string:

* What is the code for arbitrary expression X?
* What will it be when X changes, or the type of the terms change?
* What is it for clock_t, or some other semi-opaque type?
* What is it for uint64_t? (Apparently, it is PRIu64 - a macro that
expands to a string)

If your Print function is implemented as a regular user-code function,
which your language knows nothing about, then you will need some scheme
which imparts type information to the function, as well as a way of
dealing with such variadic parameter lists.

But if it does, which would be a more modern way of doing so, then the
compiler already knows the types involved. Then the formatting is about
the display, so for an integer:

* Perhaps override signed to unsigned
* Plus sign
* Width
* Justification
* Zero-padding (and padding character)
* Base
* Separators (and grouping)
* Prefix and/or suffix
* Upper/lower case (digits A-F)

Or maybe, this number represents certain quantities (eg. a day of the
week), which will need displaying in a special way. If you're not using
a special type for that, then here it will need a way to override that,
perhaps using an R() function.

You should also look at how the current crop of languages do it. They
also still tend to use format strings, and some like to put the
expressions to be printed inside the format string.

-----------

(My approach in dynamic code is that there is an internal function
tostr(), fully overloaded for different types, with optional format
data, that is applied to Print items. So that:

print a, b, c:"h" # last bit means hex

is the same as:

print tostr(a), tostr(b), tostr(c, "h")

There is a crude override mechanism, which links 'tostr' and a type T,
to a regular user-code function F.

Then, when printing T, it will call F().

In static code, this part is poorly developed. But Print (which is again
known to the language as it is a statement), can deal with regular
types, including most of those options for integers:

print a:"z 8 h s_" # leading zeros, 8-char field, hex, "_" separator

)
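
A rough sketch of the tostr-with-override idea, in Python for brevity. The registry, link_tostr and print_items are hypothetical names; a (value, fmt) pair stands in for the value:"fmt" syntax, and upper-case hex digits and space-separated items are assumptions of the sketch.

```python
_overrides = {}  # type -> user function F(value, fmt); hypothetical registry


def link_tostr(typ, func):
    """Crude override: when a value of type typ is printed, call func."""
    _overrides[typ] = func


def tostr(value, fmt=""):
    handler = _overrides.get(type(value))
    if handler is not None:
        return handler(value, fmt)
    if isinstance(value, int) and "h" in fmt:
        return format(value, "X")      # "h" means hex, as in the example above
    return str(value)


def print_items(*items):
    # A (value, fmt) pair stands in for the value:"fmt" syntax.
    # Joining with a space is an assumption of this sketch.
    parts = [tostr(*it) if isinstance(it, tuple) else tostr(it) for it in items]
    return " ".join(parts)
```

With this, print_items(1, 2, (255, "h")) returns "1 2 FF", and a type linked through link_tostr is rendered by its own function F.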

James Harris

Dec 27, 2021, 12:26:36 PM

On 27/12/2021 15:49, Bart wrote:
> On 27/12/2021 14:01, James Harris wrote:

...

>> What would the formats look like? As some examples:
>>
>>    %i;      a plain, normal, signed integer
>>    %iu;     a plain, normal, unsigned integer
>>    %iu02x; a 2-digit zero-padded unsigned hex integer
>>    %Kabc;   a type K unknown to the print function

...

>> %*abc; a dynamic render
>>
>> The * would indicate that the address of the rendering engine had been
>> supplied as a parameter as in
>>
>> printit("This is %*abc;, OK?", R, v)
>>
>> where R is the rendering engine and v is the value to be rendered
>> according to the specification abc.

...

> Look at that last example. You have to give it three things other than
> the surrounding context: v, R and *abc.

Yes, though this is for /formatted/ output so that's not surprising. And
the last example was in many ways the worst-case scenario.

>
> At the simplest, you want to do it with just v. Then:
>
> * It will apply the renderer previously associated with that user-type
>
> * If there isn't one, it will use a default rendering for that
> generic type (array, struct etc)
>
> * Or it can say it doesn't know how to do it, or just prints a reference
> to v
>
> Next simplest is when you specify some parameters to control the
> formatting. How these work depends on what it does above.
>
> Supplying a renderer each time you print I would say is not the simplest
> way of doing it! If you're going to do that, you might as well do:
>
> printit("This is %s; OK?", R(v,"abc"))
>
> Then no special features are needed. Except that if R returns a string,
> then you need some means of disposing of that string after printing, but
> there are several ways of dealing with that.

Dealing with memory is indeed a problem with that approach. The %s could
be passed a string which needs to be freed or one which must not be
freed. One option is

%s - a string which must not be freed
%M - a string which must be freed

>
> >    %i;      a plain, normal, signed integer
> >    %iu;     a plain, normal, unsigned integer
> >    %iu02x; a 2-digit zero-padded unsigned hex integer
> >    %Kabc;   a type K unknown to the print function
>
> This is too C-like. C-style formats have all sorts of problems
> associated with hard-coding a type-code into a format string:
>
> * What is the code for arbitrary expression X?

It would have to be something to match the type of X.

> * What will it be when X changes, or the type of the terms change?

The format string would need to be changed to reflect the type change.

> * What is it for clock_t, or some other semi-opaque type?

Perhaps %sdhh:mm:ss.fff; where d indicates datetime.

> * What is it for uint64_t? (Apparently, it is PRIu64 - a macro that
> expands to a string)

How about %u; with the renderer being told the width of the type? Or
%u64; with the renderer not needing to know the width of the type?

Or with comma digit separators every 3 places

%u64:s,:3;

Meaning unsigned 64, separator "," every 3 places.
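
As a sketch of how a renderer might interpret such a field, in Python for brevity. The layout assumed here (width in bits, then ":"-separated options, with "s" plus a character for the separator and a bare number for the group size) is only one reading of the example; render_unsigned and group_digits are hypothetical names.

```python
def group_digits(digits, sep, every):
    """Insert sep between groups of `every` digits, counting from the right."""
    head = len(digits) % every or every
    groups = [digits[:head]]
    groups += [digits[i:i + every] for i in range(head, len(digits), every)]
    return sep.join(groups)


def render_unsigned(body, value):
    # body is the text between '%u' and ';', e.g. '64:s,:3'.
    # Assumed layout: width in bits, then ':'-separated options, where
    # 's<char>' sets the separator and a bare number sets the group size.
    width, *options = body.split(":")
    if not 0 <= value < 2 ** int(width):
        raise ValueError("value does not fit the declared width")
    sep, every = "", 3
    for opt in options:
        if opt.startswith("s"):
            sep = opt[1:]
        elif opt.isdigit():
            every = int(opt)
    digits = str(value)
    return group_digits(digits, sep, every) if sep else digits
```

For example, render_unsigned("64:s,:3", 1234567) returns "1,234,567". A separator of ":" would clash with this option syntax, which is the sort of detail a real set of format codes would have to settle.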

I'm not too worried about the specific format codes. Someone must have
already come up with a set of them which could be used. The main thing
is that any format code would be understood by the programmer and by the
formatter and that it would be clear to everyone where the format code
ended.

That said, it would make sense for the elements of the format string to
appear in some sort of logical order - possibly the order in which they
would be needed by the renderer.

>
> If your Print function is implemented as a regular user-code function,
> which your language knows nothing about, then you will need some scheme
> which imparts type information to the function, as well as a way of
> dealing with such variadic parameter lists.

Agreed.

>
> But if it does, which would be a more modern way of doing so, then the
> compiler already knows the types involved. Then the formatting is about
> the display, so for an integer:
>
> * Perhaps override signed to unsigned
> * Plus sign
> * Width
> * Justification
> * Zero-padding (and padding character)
> * Base
> * Separators (and grouping)
> * Prefix and/or suffix
> * Upper/lower case (digits A-F)

It's a good list. It's amazing how many of those choices printf hits in
a short space.

>
> Or maybe, this number represents certain quantities (eg. a day of the
> week), which will need displaying in a special way. If you're not using
> a special type for that, then here it will need a way to override that,
> perhaps using an R() function.

OK.

>
> You should also look at how the current crop of languages do it. They
> also still tend to use format strings, and some like to put the
> expressions to be printed inside the format string.

Maybe unfairly I have an antipathy to copying other languages but maybe
in this case it would be useful. Are there any you would recommend?

Incidentally, my real goal is to have the ability to output in a
self-describing 'binary stream' format rather than necessarily
converting to text but that's a subject in itself and would require
external support. Text will have to do for now!

>
> -----------
>
> (My approach in dynamic code is that there is an internal function
> tostr(), fully overloaded for different types, with optional format
> data, that is applied to Print items. So that:
>
> print a, b, c:"h" # last bit means hex
>
> is the same as:
>
> print tostr(a), tostr(b), tostr(c, "h")

Maybe that's better: the ability to specify custom formatting on any
argument. I presume that's not just available for printing, e.g. you
could write

string s := c:"h"

and that where you have "h" you could have an arbitrarily complex format
specification.

There looks to be a potential issue, though. In C one can build up the
control string at run time. Could you do that with such as

string fmt := format_string(....)
s := c:(fmt)

?

>
> There is a crude override mechanism, which links 'tostr' and a type T,
> to a regular user-code function F.
>
> Then, when printing T, it will call F().
>
> In static code, this part is poorly developed. But Print (which is again
> known to the language as it is a statement), can deal with regular
> types, including most of those options for integers:
>
>
> print a:"z 8 h s_" # leading zeros, 8-char field, hex, "_" separator
>
> )

That's the spirit! ;-)

--
James Harris

Bart

Dec 27, 2021, 2:05:39 PM

On 27/12/2021 17:26, James Harris wrote:
> On 27/12/2021 15:49, Bart wrote:

>> printit("This is %s; OK?", R(v,"abc"))
>>
>> Then no special features are needed. Except that if R returns a
>> string, then you need some means of disposing of that string after
>> printing, but there are several ways of dealing with that.
>
> Dealing with memory is indeed a problem with that approach. The %s could
> be passed a string which needs to be freed or one which must not be
> freed. One option is
>
> %s - a string which must not be freed
> %M - a string which must be freed

This has similar problems to hardcoding a type. Take this:

printit("%s", F())

F returns a string, but is it one that needs freeing or not? That
depends on what happens inside F. Whatever you choose, if the
implementation of F later changes, do 100 formats have to change too?

This is a more general problem of memory-managing strings. It's more
useful to be able to solve it for the language, then it will work for
Print too.

(Personally, I don't get involved with this at all, not in low level code.

Functions that return strings generally return a pointer to a local
static string that needs to be consumed ASAP. Or sometimes there is a
circular list of them to allow several such calls per Print. It's not
very sophisticated, but that's why the language is low level.)

If you want a more rigorous approach, perhaps try this:

printit("%H", R(v, "abc"))

H means handler. R() is not the handler itself, but returns a descriptor
(eg. X) that contains a reference to a handler function, and references
to those two captured values (a little like a lambda or closure I think).

The Print handler then calls X() with a parameter to request the value.
And can call it again with another parameter to free it. Or perhaps it
can be used to iterate over the characters, or sets of strings.

The latter might be more suitable when you have a 1-billion element
array to print (eg. to a file), and you don't want or need to generate
one giant string in one go.

But this starts to get into generators and iterators. Perhaps it is
too advanced an approach if your language is anything like mine.
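
A sketch of the descriptor idea, in Python for brevity, where a generator stands in for "call X again for the next piece". Descriptor, list_handler and print_handled are hypothetical names, and the format "x" (hex) replaces the opaque "abc" so that the sketch can actually render something.

```python
class Descriptor:
    """Captures a handler plus the value and format it should be applied to."""

    def __init__(self, handler, value, fmt):
        self.handler, self.value, self.fmt = handler, value, fmt
        self.released = False

    def chunks(self):
        return self.handler(self.value, self.fmt)

    def release(self):
        # A low-level version would free any buffers here.
        self.released = True


def list_handler(values, fmt):
    # Yields one small piece at a time; the whole string is never built.
    yield "["
    for i, v in enumerate(values):
        if i:
            yield ", "
        yield format(v, fmt)
    yield "]"


def R(value, fmt):
    # Only builds the descriptor; no rendering happens at this point.
    return Descriptor(list_handler, value, fmt)


def print_handled(control, desc, write):
    before, _, after = control.partition("%H")
    write(before)
    try:
        for piece in desc.chunks():
            write(piece)
    finally:
        desc.release()       # the print routine, not the caller, cleans up
    write(after)
```

Because print_handled both requests the pieces and releases the descriptor, the caller never has to know whether anything needed freeing.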

>>
>> >    %i;      a plain, normal, signed integer
>> >    %iu;     a plain, normal, unsigned integer
>> >    %iu02x; a 2-digit zero-padded unsigned hex integer
>> >    %Kabc;   a type K unknown to the print function
>>
>> This is too C-like. C-style formats have all sorts of problems
>> associated with hard-coding a type-code into a format string:
>>
>>    * What is the code for arbitrary expression X?
>
> It would have to be something to match the type of X.
>
>>    * What will it be when X changes, or the type of the terms change?
>
> The format string would need to be changed to reflect the type change.
>
>>    * What is it for clock_t, or some other semi-opaque type?
>
> Perhaps %sdhh:mm:ss.fff; where d indicates datetime.

The problem with doing it in C is that clock_t could be u32, u64, i32,
i64 or even a float type; what number format to use? It's not a string
anyway; you can turn it into one, but then that's my DOW example.

>> * What is it for uint64_t? (Apparently, it is PRIu64 - a macro that
>> expands to a string)

Again this is for C; the problem being that a format string should not
need to include type information:

* The compiler knows the type
* You may not know the type (eg. clock_t)
* You may not know the format needed (eg. uint64_t)
* You don't want to have to maintain 1000 format strings as
expressions and types of variables change

> That said, it would make sense for the elements of the format string to
> appear in some sort of logical order - possibly the order in which they
> would be needed by the renderer.

But then somebody has to remember them! I ensure the order doesn't matter.

> Maybe unfairly I have an antipathy to copying other languages but maybe
> in this case it would be useful. Are there any you would recommend?

I have the same approach to other languages. Generally I find their
print schemes over-elaborate, so tend to do my own thing. Yet they also
have to solve the same problems.

>> (My approach in dynamic code is that there is an internal function
>> tostr(), fully overloaded for different types, with optional format
>> data, that is applied to Print items. So that:
>>
>> print a, b, c:"h" # last bit means hex
>>
>> is the same as:
>>
>> print tostr(a), tostr(b), tostr(c, "h")
>
> Maybe that's better: the ability to specify custom formatting on any
> argument. I presume that's not just available for printing, e.g. you
> could write
>
> string s := c:"h"
>
> and that where you have "h" you could have an arbitrarily complex format
> specification.

Well, ":" is specifically used in print-item lists (elsewhere it creates
key:value pairs). I would write your example in dynamic code as one of:

s := sprint(c:"h") # sprint is special, like print
s := tostr(c,"h") # tostr is a function-like operator

(sprint can turn a list of items into one string; tostr does one at a
time, although that one item can be arbitrarily complex.)

In my cruder static code, it might be:

[100]char str
print @str, c:"h"

> There looks to be a potential issue, though. In C one can build up the
> control string at run time. Could you do that with such as
>
> string fmt := format_string(....)
> s := c:(fmt)
>
> ?

Sure, what comes after ":" is just any string expression:

ichar fmt = (option=1 | "Hs_" | "s,")

print 123456:fmt # displays 1_E240 or 123,456

(Separator grouping is 3 digits decimal; 4 digits hex/binary.)
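
A sketch of an order-independent option string along these lines, in Python for brevity. fmt_int is a hypothetical name, and the exact letter meanings ("h"/"H" for lower/upper-case hex, "z" for zero fill, "s" followed by the separator character, a number for the field width) are assumptions modelled on the examples in this thread.

```python
def fmt_int(value, opts=""):
    base, upper, zero, width, sep = 10, False, False, 0, ""
    i = 0
    while i < len(opts):
        c = opts[i]
        if c in ("h", "H"):
            base, upper = 16, c == "H"
        elif c == "z":
            zero = True
        elif c == "s" and i + 1 < len(opts):
            i += 1
            sep = opts[i]                  # the separator is the next character
        elif c.isdigit():
            j = i
            while j < len(opts) and opts[j].isdigit():
                j += 1
            width = int(opts[i:j])
            i = j - 1
        i += 1                             # anything else (eg. a space) is skipped

    if base == 16:
        digits = format(abs(value), "X" if upper else "x")
    else:
        digits = str(abs(value))
    if sep:
        n = 4 if base == 16 else 3         # grouping: 4 for hex, 3 for decimal
        head = len(digits) % n or n
        groups = [digits[:head]]
        groups += [digits[k:k + n] for k in range(head, len(digits), n)]
        digits = sep.join(groups)
    sign = "-" if value < 0 else ""
    if zero:
        return sign + digits.rjust(width - len(sign), "0")
    return (sign + digits).rjust(width)


print(fmt_int(123456, "Hs_"))   # 1_E240
print(fmt_int(123456, "s,"))    # 123,456
```

Since each option is recognised by its own letter, "z 8 h" and "h 8 z" give the same result.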

I've exercised my print formatting recently and found some weak areas,
to do with tabulation. Getting things lined up in columns is tricky,
especially with a header.

I do have formatted print which looks like this:

fprint "#(#, #) = #", fnname, a, b, result

If I want field widths, they are written as:

fprint "#: # #", a:w1, b:w2, c:w3

where w1/w2/w3 are "12" etc. Here, the first problem is a disconnect
between each #, and the corresponding print item. This is why some
languages bring them inside.

But the main thing here is that I don't get a sense of what it looks
like until I run the program. Something I've seen in the past would look
a bit like:

fprint "###: ####### #############", a, b, c

The widths are the number of # characters. That same string could be
used for headings:

const format = "###: ####### #############"

fprint format, "No.", "Barcode", "Description"
....
fprint format, i, item[i].code, item[i].descr

I haven't implemented this, it's just an idea. This is an actual example of
the kind of output I'm talking about, but done the hard way by trial and
error:

    Type      Seg   Offset    Symbol/Target+Offset
-------------------------------------------------------
 1: imprel32  code  00000024  MessageBoxA
 2: locabs64  code  00000015  idata 02E570B0
 3: locabs64  code  0000000B  idata 02E570B6
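
The "#"-width template idea described above could be sketched as follows, in Python for brevity. This fprint is a hypothetical stand-in, not the existing fprint; right-justifying numbers, left-justifying everything else and truncating over-long text are all assumptions of the sketch.

```python
import re


def fprint(template, *items):
    """Fill each run of '#' with the next item, padded to the run's length."""
    remaining = iter(items)

    def fill(match):
        width = len(match.group())
        item = next(remaining)
        text = str(item)
        # Assumptions: numbers are right-justified, everything else
        # left-justified, and over-long text is truncated to keep columns.
        if isinstance(item, (int, float)):
            text = text.rjust(width)
        else:
            text = text.ljust(width)
        return text[:width]

    return re.sub(r"#+", fill, template)


FORMAT = "###: ####### #############"
print(fprint(FORMAT, "No.", "Barcode", "Description"))
print(fprint(FORMAT, 1, "0012345", "Widget"))
```

Every line produced from the same template has the same length as the template, so headings and rows line up without trial and error.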

Andy Walker

Dec 27, 2021, 5:02:32 PM

On 27/12/2021 14:01, James Harris wrote:

> The printf approach to printing is flexible and fast at rendering
> inbuilt types - probably better than anything which came before it -
> but it's not perfect.

No, it's rubbish. If you need formatted transput [not
entirely convinced, but chacun a son gout], then instead of all
the special casing, the easiest and most flexible way is to
convert everything to strings. Thus, for each type, you need
an operator that converts values of that type into strings [and
vv for reading]. You can incorporate formatting details [such
as whether decimal points are "," or ".", whether leading zeros
are suppressed, etc., etc] into the operator, or as parameters
to a suitable procedure call. Such operators are separately
useful, eg for sorting. The default operator could, eg, be one
that converts [eg] an integer into the shortest possible string.
That way, the actual transput routines need to know almost
nothing about the types and formatting details of their parameters,
only how to write/read an array of characters.

[...]

> So here's another potential approach. What do you think of it?
> The idea is, as with the printf family, to have a controlling string
> where normal characters are copied verbatim and special fields are
> marked with a % sign or similar. The difference is what would come
> after the % sign and how it would be handled.

Then what you've done is to use "%" where you should
instead simply be including a string. So the specification of
"printf" becomes either absurdly complicated [as indeed it is
in most languages] or too limited [because some plausible
conversions are not catered for]. The "everything is a string"
approach has the advantage that for specialised use, eg if you
want to read/write your numbers as Roman numerals, you just have
to write the conversion routines that you would need anyway, no
need to change anything in "printf".

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Goodban

Dmitry A. Kazakov

Dec 28, 2021, 4:22:00 AM

On 2021-12-27 23:02, Andy Walker wrote:
> On 27/12/2021 14:01, James Harris wrote:
>> The printf approach to printing is flexible and fast at rendering
>> inbuilt types - probably better than anything which came before it -
>> but it's not perfect.
>
> No, it's rubbish.

Patently obvious rubbish, rather.

> then instead of all
> the special casing, the easiest and most flexible way is to
> convert everything to strings.

Sure, though, usually not everything is converted to string. For
example, formatting symbols or extensions of the idea: meta/tagged
formats like HTML, XML etc are inherently bad.

The most flexible is a combination of a string that carries most of the
information specific to the datatype (an OO method) and some commands to
the rendering environment.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

James Harris

Jan 2, 2022, 11:06:52 AM

On 27/12/2021 22:02, Andy Walker wrote:
> On 27/12/2021 14:01, James Harris wrote:
>> The printf approach to printing is flexible and fast at rendering
>> inbuilt types - probably better than anything which came before it -
>> but it's not perfect.
>
> No, it's rubbish. If you need formatted transput [not
> entirely convinced, but chacun a son gout],

If you are not convinced by formatted io then what kind of io do you
prefer?

> then instead of all
> the special casing, the easiest and most flexible way is to
> convert everything to strings.

If you convert to strings then what reclaims the memory used by those
strings? Not all languages have dynamic memory management, and dynamic
memory management is not ideal for all compilation targets.

The form I proposed had no need for dynamic allocations. That's part of
the point of it.

> Thus, for each type, you need
> an operator that converts values of that type into strings [and
> vv for reading].

Yes, there's a converse (and in some ways even more involved) issue with
reading.

...

>> So here's another potential approach. What do you think of it?
>> The idea is, as with the printf family, to have a controlling string
>> where normal characters are copied verbatim and special fields are
>> marked with a % sign or similar. The difference is what would come
>> after the % sign and how it would be handled.
>
> Then what you've done is to use "%" where you should
> instead simply be including a string. So the specification of
> "printf" becomes either absurdly complicated [as indeed it is
> in most languages] or too limited [because some plausible
> conversions are not catered for]. The "everything is a string"
> approach has the advantage that for specialised use, eg if you
> want to read/write your numbers as Roman numerals, you just have
> to write the conversion routines that you would need anyway, no
> need to change anything in "printf".
>

I'm not sure you understand the proposal. To be clear, the print routine
would be akin to

print("String with %kffff; included", val)

where the code k would be used to select a formatter. The formatter
would be passed two things:

1. The format from k to ; inclusive.
2. The value val.

As a result, the format ffff could be as simple as someone could design it.

Note that there would be no requirement for dynamic memory. The
formatter would just send for printing each character as it was generated.
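
A sketch of such a formatter, in Python for brevity and borrowing the Roman-numeral case mentioned earlier in the thread. The engine code "r", the option letter "l" (lower case) and the putc callback are hypothetical. Python itself allocates freely, so the point of the sketch is only the shape: each character goes straight to putc and no result string is ever assembled.

```python
ROMAN = (
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
)


def roman_formatter(fmt, value, putc):
    """Render value as Roman numerals, sending each character to putc.

    fmt is the format from the engine code to ';' inclusive, eg. 'r;' or 'rl;'.
    """
    lower = "l" in fmt[1:]          # hypothetical option: 'l' selects lower case
    if value <= 0:
        raise ValueError("Roman numerals need a positive value")
    for amount, symbol in ROMAN:
        while value >= amount:
            for ch in symbol:
                putc(ch.lower() if lower else ch)
            value -= amount


out = []
roman_formatter("r;", 1994, out.append)
print("".join(out))   # MCMXCIV
```

Here out.append merely plays the part of the output device; in a low-level setting putc would write directly to the stream.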

What's wrong with that? (Genuine question!)

--
James Harris

Dmitry A. Kazakov

Jan 2, 2022, 11:37:29 AM

On 2022-01-02 17:06, James Harris wrote:

> If you convert to strings then what reclaims the memory used by those
> strings?

What reclaims memory used by those integers?

> Not all languages have dynamic memory management, and dynamic
> memory management is not ideal for all compilation targets.

No dynamic memory management is required for handling temporary objects.

----------
Not that this would be relevant in the case of formatted output, whose
overhead is so massive that even using the heap (which is in no way
necessary) would leave little or no dent. I remember a SysV C compiler
which modified the format string of printf in a misguided attempt to
save a little bit of memory, while the linker put string constants in
read-only memory...

Bart

Jan 2, 2022, 12:08:47 PM

On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
> On 2022-01-02 17:06, James Harris wrote:
>
>> If you convert to strings then what reclaims the memory used by those
>> strings?
>
> What reclaims memory used by those integers?

Integers are passed by value at this level of language.

No heap storage is involved.

>> Not all languages have dynamic memory management, and dynamic memory
>> management is not ideal for all compilation targets.
>
> No dynamic memory management is required for handling temporary objects.

If memory is allocated for the temporary object, then at some point it
needs to be reclaimed. Preferably just after the print operation is
completed.

If your language takes care of those details, then lucky you. It means
someone else has had the job of making it work.

Dmitry A. Kazakov

Jan 2, 2022, 12:21:41 PM

On 2022-01-02 18:08, Bart wrote:
> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>> On 2022-01-02 17:06, James Harris wrote:
>>
>>> If you convert to strings then what reclaims the memory used by those
>>> strings?
>>
>> What reclaims memory used by those integers?
>
> Integers are passed by value at this level of language.

This has nothing to do with the question: what reclaims integers?
FORTRAN-IV passed everything by reference, yet calls

FOO (I + 1)

were OK almost a human life span ago.

>>> Not all languages have dynamic memory management, and dynamic memory
>>> management is not ideal for all compilation targets.
>>
>> No dynamic memory management is required for handling temporary objects.
>
> If memory is allocated for the temporary object, then at some point it
> needs to be reclaimed. Preferably just after the print operation is
> completed.

Yep.

> If your language takes care of those details, then lucky you. It means
> someone else has had the job of making it work.

Sure.

Every *normal* language takes care of the objects it creates. And every
*normal* language lets identityless objects (like integers, strings,
records etc) be created ad-hoc and passed around in a unified manner.

If this

Put_Line ("X=" & X'Image & ", Y=" & Y'Image);

is a problem in your language, then the job is not done.

James Harris

Jan 2, 2022, 12:37:08 PM

On 27/12/2021 19:05, Bart wrote:
> On 27/12/2021 17:26, James Harris wrote:
>> On 27/12/2021 15:49, Bart wrote:
>
>>>     printit("This is %s; OK?", R(v,"abc"))
>>>
>>> Then no special features are needed. Except that if R returns a
>>> string, then you need some means of disposing of that string after
>>> printing, but there are several ways of dealing with that.
>>
>> Dealing with memory is indeed a problem with that approach. The %s
>> could be passed a string which needs to be freed or one which must not
>> be freed. One option is
>>
>>    %s - a string which must not be freed
>>    %M - a string which must be freed
>
> This has similar problems to hardcoding a type. Take this:
>
>     printit("%s", F())
>
> F returns a string, but is it one that needs freeing or not? That
> depends on what happens inside F. Whatever you choose, if the
> implementation of F later changes, do 100 formats have to change too?

I guess that every piece of code which called F() would have to change
whether printit was involved or not. But I take your point.

Note that my suggestion (of passing to a formatter the format and the
value) would not require dynamic memory management.

>
> This is a more general problem of memory-managing strings. It's more
> useful to be able to solve it for the language, then it will work for
> Print too.

Agreed.

>
> (Personally, I don't get involved with this at all, not in low level code.
>
> Functions that return strings generally return a pointer to a local
> static string that needs to be consumed ASAP. Or sometimes there is a
> circular list of them to allow several such calls per Print. It's not
> very sophisticated, but that's why the language is low level.)
>
> If you want a more rigorous approach, perhaps try this:
>
> printit("%H", R(v, "abc"))
>
> H means handler. R() is not the handler itself, but returns a descriptor
> (eg. X) that contains a reference to a handler function, and references
> to those two captured values (a little like a lambda or closure I think).

AFAICS

R(v, "abc")

would be called before invoking printit. IOW wouldn't the delayed call
of a lambda require a distinct syntax?

...

>>>    * What is it for uint64_t? (Apparently, it is PRIu64 - a macro that
>>>      expands to a string)
>
> Again this is for C; the problem being that a format string should not
> need to include type information:
>
> * The compiler knows the type
> * You may not know the type (eg. clock_t)
> * You may not know the format needed (eg. uint64_t)
> * You don't want to have to maintain 1000 format strings as
>     expressions and types of variables change

OK. If the compiler knows the type T why not have the print function invoke

T.format

with the value and the format string as parameters?
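
As one existing illustration of this shape, Python's built-in format(value, spec) calls type(value).__format__(value, spec), so the type itself interprets the format string. In the sketch below the Barcode class and its "-" option are invented purely for the example.

```python
class Barcode:
    def __init__(self, digits):
        self.digits = digits

    def __format__(self, spec):
        # spec is the format text for this field; '-' is a made-up option
        # meaning "group the digits in fours, separated by dashes".
        if spec == "":
            return self.digits
        if spec == "-":
            groups = [self.digits[i:i + 4] for i in range(0, len(self.digits), 4)]
            return "-".join(groups)
        raise ValueError("unknown Barcode format: " + repr(spec))


code = Barcode("123456789012")
print(format(code, "-"))              # 1234-5678-9012
print("Item {:-} found".format(code))
```

The print machinery needs no knowledge of Barcode; it only has to locate the field, pass the format text along, and output whatever comes back.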

...

Perhaps one option is record-based output something akin to

tout.putrec(R, i, item[i].code, item[i].descr)

where tout is terminal out, R is a record format, and the values are
output in binary. The downside is that that would not be plain text and
would require support so that it could be viewed but the upsides would
include allowing the viewer to resize and reorder tables. (The headings
would be metadata; the user could choose whether to see them or not.)

--
James Harris

Bart

Jan 2, 2022, 12:50:50 PM

On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
> On 2022-01-02 18:08, Bart wrote:
>> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>>> On 2022-01-02 17:06, James Harris wrote:
>>>
>>>> If you convert to strings then what reclaims the memory used by
>>>> those strings?
>>>
>>> What reclaims memory used by those integers?
>>
>> Integers are passed by value at this level of language.
>
> This has nothing to do with the question: what reclaims integers?
> FORTRAN-IV passed everything by reference, yet calls
>
> FOO (I + 1)
>
> were OK almost a human life span ago.

Fortran didn't allow recursion either. So such a call involved writing
the expression to a static location, and passing a reference to that
location.

The problem here is that you call a function F which returns a string to
be passed to Print. That string may be a literal, or in static memory, or
share a reference with other objects, none of which requires the memory
to be reclaimed.

Or it may have been created specially for this return value, so then
after use (it's been printed), any resources need to be reclaimed.

Your approach to 'solve' this is to 'just' create a language high enough
in level (and harder to write and slower to run), to get around it.

Which actually doesn't solve it; you've just turned a small job into a
huge one.

More interesting is this: /given/ a language design low enough in level
that it doesn't have first class strings with automatic memory
management, how would you implement the printing of complex objects
requiring elaborate 'to-string' conversions.

> Every *normal* language takes care of the objects it creates. And every
> *normal* language lets identityless objects (like integers, strings,
> records etc) be created ad-hoc and passed around in a unified manner.
>
> If this
>
> Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>
> is a problem in your language, then the job is not done.

My static language is of the lower-level kind described above, yet this
example is merely:

println =X, =Y

You really want a more challenging example.

James Harris

Jan 2, 2022, 12:54:35 PM

On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
> On 2022-01-02 18:08, Bart wrote:
>> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>>> On 2022-01-02 17:06, James Harris wrote:
>>>
>>>> If you convert to strings then what reclaims the memory used by
>>>> those strings?
>>>
>>> What reclaims memory used by those integers?

Not the print function. See below.

>>
>> Integers are passed by value at this level of language.
>
> This has nothing to do with the question: what reclaims integers?
> FORTRAN-IV passed everything by reference, yet calls
>
> FOO (I + 1)
>
> were OK almost human life span ago.

I disagree slightly with both of you. AISI it doesn't matter whether the
objects to be printed are integers or structures or arrays or widgets.
If there's any reclaiming to be done then it would be carried out by
other language mechanisms which would happen anyway; it would not be
required by the print function. The print function would simply use
them. In reclamation terms it would neither increase nor decrease any
reference count.

For example,

  complex c
  function F
    widget w
    ....
    print(w, c)
  endfunction

Widget w would be created at function entry and reclaimed at function
exit. The global c would be created at program load time and destroyed
when the program terminates. The print function would not get involved
in any of that stuff.

...

> If this
>
> Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>
> is a problem in your language, then the job is not done.
>

That would be poor for small-machine targets. Shame on Ada! ;-)

--
James Harris

James Harris

Jan 2, 2022, 1:05:50 PM

On 28/12/2021 09:21, Dmitry A. Kazakov wrote:
> On 2021-12-27 23:02, Andy Walker wrote:

...

>> then instead of all
>> the special casing, the easiest and most flexible way is to
>> convert everything to strings.
>
> Sure, though, usually not everything is converted to string. For
> example, formatting symbols or extensions of the idea: meta/tagged
> formats like HTML, XML etc are inherently bad.
>
> The most flexible is a combination of a string that carries most of the
> information specific to the datatype (an OO method) and some commands to
> the rendering environment.

That sounds interesting. How would it work?

--
James Harris

Dmitry A. Kazakov

Jan 2, 2022, 1:25:18 PM

With single dispatch you have an interface, say, 'printable'. The
interface has an abstract method 'image' with the profile:

function Image (X : Printable) return String;

Integer, float, string, whatever has to be printable inherits from
Printable and thus overrides Image. That is it.

The same goes with serialization/streaming etc.
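A rough Python rendering of the single-dispatch scheme described here, as a sketch only. Printable and image mirror the names in the text; Point, Celsius and put_line are hypothetical examples.

```python
from abc import ABC, abstractmethod


class Printable(ABC):
    """Interface: anything printable must say how to produce its image."""

    @abstractmethod
    def image(self) -> str:
        """Return the textual image of the value."""


class Point(Printable):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def image(self) -> str:
        return f"({self.x}, {self.y})"


class Celsius(Printable):
    def __init__(self, degrees):
        self.degrees = degrees

    def image(self) -> str:
        return f"{self.degrees:.1f} C"


def put_line(*items: Printable) -> str:
    # Dispatch happens on the run-time type of each item (single dispatch).
    line = "".join(item.image() for item in items)
    print(line)
    return line


put_line(Point(1, 2), Celsius(21.5))
```

Note that each call to image() returns a temporary string, which is exactly the point the rest of the thread argues about.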

Dmitry A. Kazakov

Jan 2, 2022, 1:30:12 PM

On 2022-01-02 18:50, Bart wrote:
> On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
>> On 2022-01-02 18:08, Bart wrote:
>>> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>>>> On 2022-01-02 17:06, James Harris wrote:
>>>>
>>>>> If you convert to strings then what reclaims the memory used by
>>>>> those strings?
>>>>
>>>> What reclaims memory used by those integers?
>>>
>>> Integers are passed by value at this level of language.
>>
>> This has nothing to do with the question: what reclaims integers?
>> FORTRAN-IV passed everything by reference, yet calls
>>
>> FOO (I + 1)
>>
>> were OK almost human life span ago.
>
> Fortran didn't allow recursion either.

Irrelevant. What reclaims integer I+1?

>> If this
>>
>> Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>>
>> is a problem in your language, then the job is not done.
>
> My static language is of the lower-level kind described above, yet this
> example is merely:
>
> println =X, =Y

No it is not. The example creates temporary strings, a possibility which
stunned James and you as something absolutely unthinkable or requiring
the heap. That is rubbish: temporary strings are pretty normal in any
decent language.

Dmitry A. Kazakov

Jan 2, 2022, 1:40:25 PM

That works perfectly well on small targets. You seem unaware of what
actually happens on I/O. For a "small" target it would be some sort of
networking stack with a terminal emulation on top of it. Believe me,
creating a temporary string on the stack is nothing in comparison to
that. Furthermore, the implementation would likely be no less efficient
than printf which has no idea how large the result is and would have to
reallocate the output buffer or keep it between calls and lock it from
concurrent access. Locking on an embedded system is a catastrophic event
because switching threads is expensive as hell. Note also that you
cannot stream output, because networking protocols and terminal
emulators are much more efficient if you do bulk transfers. All that is
the infamous premature optimization.

Bart

Jan 2, 2022, 1:51:28 PM

Well, your example wasn't easy to scan.

But since you say it requires temporary strings, where does it put them?
They have to go somewhere!

On the stack? The stack is typically 1-4MB; what if the strings are
bigger? What if some of the terms are strings returned from a function;
those will not be on the stack. Example:

println "(" + tostr(123456)*1'000'000 + ")"

this creates an intermediate string of 6MB; too big for a stack.

Bart

Jan 2, 2022, 2:04:33 PM

This is not the issue that had been discussed. Printing an existing
string is no problem; even C can deal with that: you just use
printf("%s", S).

If you are printing a string returned from F(), then any requirement to
reclaim resources it might use is common to any use of it within the
language, and not specific to Print.

The problem is when the string in question has been created directly or
indirectly (explicitly by a function call or implicitly by calling some
handler) within the argument list of the Print function or statement,
and you want any memory allocated specifically for this purpose to be
taken care of automatically.

Dmitry A. Kazakov

Jan 2, 2022, 2:23:46 PM

On 2022-01-02 19:51, Bart wrote:

> But since you say it requires temporary strings, where does it put them?
> They have to go somewhere!
>
> On the stack?

The stack is as big as you specify, it is not the program stack, normally.

> The stack is typically 1-4MB; what if the strings are
> bigger? What if some of the terms are strings returned from a function;
> those will not be on the stack. Example:
>
> println "(" + tostr(123456)*1'000'000 + ")"
>
> this creates an intermediate string of 6MB; too big for a stack.

Are you going to spill 6MB character long single line on a terminal
emulator? Be realistic.

Bart

Jan 2, 2022, 2:42:24 PM

On 02/01/2022 19:23, Dmitry A. Kazakov wrote:
> On 2022-01-02 19:51, Bart wrote:
>
>> But since you say it requires temporary strings, where does it put
>> them? They have to go somewhere!
>>
>> On the stack?
>
> The stack is as big as you specify, it is not the program stack, normally.

So, what are you going to do, have a stack which is 1000 times bigger than
normal just in case?

Anyway, if the string is returned from a function, it is almost
certainly on the heap.

>> The stack is typically 1-4MB; what if the strings are
>> bigger? What if some of the terms are strings returned from a
>> function; those will not be on the stack. Example:
>>
>> println "(" + tostr(123456)*1'000'000 + ")"
>>
>> this creates an intermediate string of 6MB; too big for a stack.
>
> Are you going to spill 6MB character long single line on a terminal
> emulator? Be realistic.

Who knows what a user-program will do? And who are you to question it?

Besides, the print output could be redirected to a file, or be piped, or it
could be writing a file anyway.

It could also be on multiple lines if you want:

println "(" + (tostr(123456)+"\n")*1'000'000 + ")"

It depends on contents of the string data the user wants to output,
which as I said could be anything and of any size:

println reverse(readstrfile(filename))

Dmitry A. Kazakov

Jan 2, 2022, 3:25:33 PM

On 2022-01-02 20:42, Bart wrote:
> On 02/01/2022 19:23, Dmitry A. Kazakov wrote:
>> On 2022-01-02 19:51, Bart wrote:
>>
>>> But since you say it requires temporary strings, where does it put
>>> them? They have to go somewhere!
>>>
>>> On the stack?
>>
>> The stack is as big as you specify, it is not the program stack,
>> normally.
>
> So, what are you going to do, have a stack which is 1000 times bigger than
> normal just in case?

No, I will never ever print anything that does not fit into a page of
72-80 characters wide.

BTW formatting output is meant to format, which includes text wrapping,
you know.

> Anyway, if the string is returned from a function, it is almost
> certainly on the heap.

No, it is certainly on the secondary stack.

>> Are you going to spill 6MB character long single line on a terminal
>> emulator? Be realistic.
>
> Who knows what a user-program will do?

The developer:

https://spec.cs.miami.edu/cpu2017/flags/gcc.2021-07-21.html#user_Wl-stack

> It could also be on multiple lines if you want:

Now it is time to learn about cycles.

> It depends on contents of the string data the user wants to output,
> which as I said could be anything and of any size:
>
> println reverse(readstrfile(filename))

It could even be fetching whole git repository tree of the Linux kernel...

1. Why would anybody ever do that?

2. Why would anybody program it in such a stupid way?

Which boils down to the requirements of the application program and its
behavior upon broken input constraints from these requirements.

Bart

Jan 2, 2022, 5:31:26 PM

On 02/01/2022 20:25, Dmitry A. Kazakov wrote:
> On 2022-01-02 20:42, Bart wrote:
>> On 02/01/2022 19:23, Dmitry A. Kazakov wrote:
>>> On 2022-01-02 19:51, Bart wrote:
>>>
>>>> But since you say it requires temporary strings, where does it put
>>>> them? They have to go somewhere!
>>>>
>>>> On the stack?
>>>
>>> The stack is as big as you specify, it is not the program stack,
>>> normally.
>>
>> So, what are you going to do, have a stack which is 1000 times bigger
>> than normal just in case?
>
> No, I will never ever print anything that does not fit into a page of
> 72-80 characters wide.

You're not even going to allow anything that might spill over multiple
lines? Say, displaying factorial(1000); apparently in your language,
there is no way to see what it looks like.

> BTW formatting output is meant to format, which includes text wrapping,
> you know.
>
>> Anyway, if the string is returned from a function, it is almost
>> certainly on the heap.
>
> No, it is certainly on the secondary stack.

You'll have to explain what a secondary stack is.

>>> Are you going to spill 6MB character long single line on a terminal
>>> emulator? Be realistic.
>>
>> Who knows what a user-program will do?
>
> The developer:
>
> https://spec.cs.miami.edu/cpu2017/flags/gcc.2021-07-21.html#user_Wl-stack

If your program needs to ROUTINELY increase the stack size, then it is
probably broken. You should only need to for some highly-recursive
programs such as the Ackermann benchmark. Not for ordinary Print!

>> It could also be on multiple lines if you want:
>
> Now it is time to learn about cycles.
>
>> It depends on contents of the string data the user wants to output,
>> which as I said could be anything and of any size:
>>
>> println reverse(readstrfile(filename))
>
> It could even be fetching whole git repository tree of the Linux kernel...

Yes, why not? I think that is exactly what I did once:

dir/s >files # create a large text file
type files # display a large text file

So large files are OK in some circumstances, but not as an argument to
Print?

> 1. Why would anybody ever do that?
>
> 2. Why would anybody program it in such a stupid way?

Somebody does this:

println dirlist("*.c")

The output could be anything from 10 characters to 100,000 or more. (And
yes, the default routine to print a list of strings could write one per
line.)

Language implementers shouldn't place unreasonable restrictions on what
user code can do, because of some irrational aversion to using heap memory.

There might be 4MB of stack memory and 4000MB of heap memory, so why not
use it!

The issues however are with lower-level languages and what can be done
there. Using stack memory is not really practical there either.

Some approaches which don't involve creating discrete string objects I
think have been discussed. But there are also myriad workarounds to get
things done, even if not elegant.

>
> Which boils down to the requirements of the application program and its
> behavior upon broken input constraints from these requirements.

Nothing is broken. Any lack of first-class string handling and automatic
memory management is by design in some languages. But you are never
going to be stuck getting anything printed, it might just take extra effort.

Dmitry A. Kazakov

Jan 3, 2022, 4:19:44 AM

On 2022-01-02 23:31, Bart wrote:
> On 02/01/2022 20:25, Dmitry A. Kazakov wrote:
>> On 2022-01-02 20:42, Bart wrote:
>>> On 02/01/2022 19:23, Dmitry A. Kazakov wrote:
>>>> On 2022-01-02 19:51, Bart wrote:
>>>>
>>>>> But since you say it requires temporary strings, where does it put
>>>>> them? They have to go somewhere!
>>>>>
>>>>> On the stack?
>>>>
>>>> The stack is as big as you specify, it is not the program stack,
>>>> normally.
>>>
>>> So, what are you going to do, have a stack which is 1000 times bigger
>>> than normal just in case?
>>
>> No, I will never ever print anything that does not fit into a page of
>> 72-80 characters wide.
>
> You're not even going to allow anything that might spill over multiple
> lines?

No.

> Say, displaying factorial(1000);

Displaying it to who?

>> BTW formatting output is meant to format, which includes text
>> wrapping, you know.
>>
>>> Anyway, if the string is returned from a function, it is almost
>>> certainly on the heap.
>>
>> No, it is certainly on the secondary stack.
>
> You'll have to explain what a secondary stack is.

Secondary stacks are used to pass large and dynamically sized objects.
They are also used to allocate local objects of these properties.

Secondary stacks are not machine stacks and have none of their limitations.

>>>> Are you going to spill 6MB character long single line on a terminal
>>>> emulator? Be realistic.
>>>
>>> Who knows what a user-program will do?
>>
>> The developer:
>>
>> https://spec.cs.miami.edu/cpu2017/flags/gcc.2021-07-21.html#user_Wl-stack
>
> If your program needs to ROUTINELY increase the stack size, then it is
> probably broken.

Right, except that it is your program that tries to create 6MB strings
for no reason.

> So large files are OK in some circumstances, but not as an argument to
> Print?

Exactly.

>> 1. Why would anybody ever do that?
>>
>> 2. Why would anybody program it in such a stupid way?
>
> Somebody does this:
>
> println dirlist("*.c")

Nobody prints file lists this way. Hint: formatted output presumes
formatting:

https://www.merriam-webster.com/dictionary/format

> The output could be anything from 10 characters to 100,000 or more. (And
> yes, the default routine to print a list of strings could write one per
> line.)

Wrong. For printing lists of files, if anybody cares, there will be a
subprogram with parameters:

- Files path / array of paths
- Wildcard/pattern use flag
- Number of columns
- Output direction: column first vs row first
- First line decorator text
- Consequent lines decorator text
- Filter object
- Sorting order object

etc.

Bart

Jan 3, 2022, 6:19:14 AM

On 03/01/2022 09:19, Dmitry A. Kazakov wrote:

> On 2022-01-02 23:31, Bart wrote:

> Right, except that it is your program that tries to create 6MB strings
> for no reason.

No; my user does so. I can't control what they write.

Should I impose an arbitrary limit like max 80 characters per any print
item, and max 80 characters in total for all items on one print
statement? (But then multiple print statements can write to the same line.)

I'm not implementing Fortran or writing to a line-printer, and my
language copes fine with no such limits. If the user does something
silly, they will find out when it gets slow or they run out of memory.

However, my (higher level) implementation uses managed strings that work
with the heap.

I think (since I've sort of lost track of what we were arguing about)
you were advocating stack storage. And talking about a secondary stack,
which is presumably allocated on the heap.

>> So large files are OK in some circumstances, but not as an argument to
>> Print?
>
> Exactly.
>
>>> 1. Why would anybody ever do that?
>>>
>>> 2. Why would anybody program it in such a stupid way?
>>
>> Somebody does this:
>>
>> println dirlist("*.c")
>
> Nobody prints file lists this way.

Actually, it's just a list. Should a language allow you to print a whole
list? If not, why not? If you need more control, do it element by
element. Or the format control codes that are optional for every print
item can specify some basic parameters, eg. print one element per line.

My higher-level language still builds a single string for each print
item before it does anything else with it.

Sometimes, that is necessary (eg. the result may be right-justified
within a given field width, so it needs to know the final size);
sometimes it isn't, and it is better to send each successive character
into some destination, but I don't have that yet.

>> The output could be anything from 10 characters to 100,000 or more.
>> (And yes, the default routine to print a list of strings could write
>> one per line.)
>
> Wrong. For printing lists of files, if anybody cares, there will be a
> subprogram with parameters:
>
> - Files path / array of paths
> - Wildcard/pattern use flag
> - Number of columns
> - Output direction: column first vs row first
> - First line decorator text
> - Consequent lines decorator text
> - Filter object
> - Sorting order object

As I said I've lost track of what we were discussing. But I know it wasn't
about how to implement DIR or ls!

Dmitry A. Kazakov

Jan 3, 2022, 7:00:56 AM

On 2022-01-03 12:19, Bart wrote:

> As I said I've lost track of what we were discussing.

We were discussing inability to return a string from a function in your
language in a reasonable way.

You argued that there is no need to have it due to danger that some user
might miss his or her medication or confuse that with certain mushrooms
and so came to an idea of creating terabyte large temporary strings...

Bart

Jan 3, 2022, 7:22:51 AM

On 03/01/2022 12:00, Dmitry A. Kazakov wrote:
> On 2022-01-03 12:19, Bart wrote:
>
>> As I said I've lost track of what we were discussing.
>
> We were discussing inability to return a string from a function in your
> language in a reasonable way.
>
> You argued that there is no need to have it due to danger that some user
> might miss his or her medication or confuse that with certain mushrooms
> and so came to an idea of creating terabyte large temporary strings...

Unlike Python? Here:

a = [10,]*1000000
s = str(a)
print (len(s))

The length of s is 4 million (1 million times "10, ").

I believe that print(a) would simply apply str() to 'a' then write that
string.

Andy Walker

Jan 3, 2022, 7:35:09 PM

On 02/01/2022 16:06, James Harris wrote:
>>> The printf approach to printing is flexible and fast at rendering
>>> inbuilt types - probably better than anything which came before it -
>>> but it's not perfect.
>> No, it's rubbish. If you need formatted transput [not
>> entirely convinced, but chacun a son gout],
> If you are not convinced by formatted io then what kind of io do you prefer?

Unformatted transput, of course. Eg,

print this, that, these, those and the many other things

[with whatever syntax, quoting, separators, etc you prefer]. Much
the same for "read". Most of the time the default is entirely
adequate. If not, then the choice for the language designer is
either an absurdly complicated syntax that still probably doesn't
meet some plausible needs, or to provide simple mechanisms that
allow programmers to roll their own. Guess which I prefer.

>> then instead of all
>> the special casing, the easiest and most flexible way is to
>> convert everything to strings.
> If you convert to strings then what reclaims the memory used by those
> strings? Not all languages have dynamic memory management, and
> dynamic memory management is not ideal for all compilation targets.

AIUI, you are designing your own language. If it doesn't
have strings, eg as results of procedures, then you have much worse
problems than designing some transput procedures. There are lots
of ways of implementing strings, but they are for the compiler to
worry about, not the language designer [at least, once you know it
can be done].

> The form I proposed had no need for dynamic allocations. That's part
> of the point of it.

There's no need for "dynamic allocations" merely for transput.
Most of the early languages that I used didn't have them, but still
managed to print and read things. You're making mountains out of
molehills.

> I'm not sure you understand the proposal. To be clear, the print
> routine would be akin to
> print("String with %kffff; included", val)
> where the code k would be used to select a formatter. The formatter
> would be passed two things:
> 1. The format from k to ; inclusive.
> 2. The value val.
> As a result, the format ffff could be as simple as someone could
> design it.

Yes, that's what I thought you meant. C is thataway -->.
You call it "simple"; C is one of the simpler languages of this
type, yet the full spec of "printf" and its friends is horrendous.
Build in some version [exact syntax up to you] of

print "String with ", val, " included"

and you're mostly done. For the exceptional cases, use a library
procedure or your own procedure to convert "val" to a suitable
array of characters, with whatever parameters are appropriate.

> Note that there would be no requirement for dynamic memory. The
> formatter would just send for printing each character as it was
> generated.
> What's wrong with that? (Genuine question!)

Nothing. How on earth do you think we managed in olden
times, before we had "dynamic memory"? [Ans: by printing each
character in sequence.]
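The character-at-a-time approach can be sketched in Python as follows. This is an illustration under stated assumptions, not code from the thread: emit_int, emit_text and emit_print are hypothetical names, and put stands for whatever routine sends one character to the output device. No intermediate string holding the whole number or line is built.

```python
def emit_int(value, put):
    """Send the decimal digits of value to put(), one character at a time."""
    if value < 0:
        put("-")
        value = -value
    # Find the largest power of ten not exceeding value (1 for value == 0).
    divisor = 1
    while value // divisor >= 10:
        divisor *= 10
    # Peel off digits from the most significant end.
    while divisor > 0:
        put("0123456789"[value // divisor])
        value %= divisor
        divisor //= 10


def emit_text(text, put):
    for ch in text:
        put(ch)


def emit_print(put, *items):
    """Unformatted print: each item goes straight to the sink."""
    for item in items:
        if isinstance(item, int):
            emit_int(item, put)
        else:
            emit_text(item, put)


collected = []                      # stands in for the output device
emit_print(collected.append, "String with ", -1203, " included")
print("".join(collected))           # String with -1203 included
```

The limitation raised elsewhere in the thread still applies: anything that needs the final width in advance, such as right-justification, has to measure the value before emitting it.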

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music

Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Bizet

Bart

Jan 4, 2022, 6:31:39 AM

On 04/01/2022 00:35, Andy Walker wrote:
> On 02/01/2022 16:06, James Harris wrote:
>>>> The printf approach to printing is flexible and fast at rendering
>>>> inbuilt types - probably better than anything which came before it -
>>>> but it's not perfect.
>>> No, it's rubbish. If you need formatted transput [not
>>> entirely convinced, but chacun a son gout],
>> If you are not convinced by formatted io then what kind of io do you
>> prefer?
>
> Unformatted transput, of course. Eg,
>
> print this, that, these, those and the many other things

If I do this in A68G:

print(("<",1,">"))

the output is:

< +1>

The "<>" are just to help show the problem: how to get rid of those
leading spaces and that plus sign? Or write the number with a field
width of your choice?

(It gets worse with wider, higher precision numbers, as it uses the
maximum value as the basis for the field width, so that one number could
take up most of the line. Now you will need to start calling functions
that return strings to get things done properly.)

> [with whatever syntax, quoting, separators, etc you prefer]. Much

> Yes, that's what I thought you meant. C is thataway -->.
> You call it "simple"; C is one of the simpler languages of this
> type, yet the full spec of "printf" and its friends is horrendous.

The main problem with C's printf is having to tell it the exact type of
each expression.

But those formatting facilities are genuinely useful and harder to
emulate in user code if they didn't exist.

> Build in some version [exact syntax up to you] of
>
> print "String with ", val, " included"
>
> and you're mostly done.

Yeah, and it will need this:

print a, " ", b, " ", c, "\n"

instead of:

println a, b, c

Designing Print properly can make a big difference!

> For the exceptional cases, use a library
> procedure or your own procedure to convert "val" to a suitable
> array of characters, with whatever parameters are appropriate.

Well, this is the problem. Where will the array of characters be located
especially if the size is unpredictable? What happens here:

print val1, val2

Or here:

print val3(f())

where f itself includes a 'print val'.

A language might not be advanced enough to have memory-managed
persistent data structures, but it might want custom printing of
user-defined types. Example:

println "Can't convert", strmode(s), "to", strmode(t)

strmode() turns an internal type code into a human readable type
specification. This language doesn't have flex strings; strmode just
returns a pointer to a fixed-size static string big enough for the
largest expected type.

In this example, because I know that evaluation is left-to-right, it
doesn't matter that the second strmode call will overwrite the earlier
result. But here it does:

f(strmode(s), strmode(t))
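The aliasing problem can be reproduced in Python with one shared bytearray standing in for the static buffer. This is only a sketch: the type codes, the 32-byte size and the helper names text and show are assumptions made for illustration.

```python
_buffer = bytearray(32)   # stands in for strmode's single static buffer

TYPE_NAMES = {1: "int", 2: "real", 3: "ref char"}   # hypothetical type codes


def strmode(code):
    """Overwrite the shared buffer with the type name and return that buffer."""
    name = TYPE_NAMES[code].encode("ascii")
    _buffer[:] = name.ljust(len(_buffer), b"\0")
    return _buffer            # every caller gets the same object


def text(buf):
    """Read the NUL-terminated contents of the buffer as a string."""
    return bytes(buf).split(b"\0", 1)[0].decode("ascii")


def show(a, b):
    return text(a) + " -> " + text(b)


# Consumed immediately, left to right: each result is read before the next call.
ok = text(strmode(1)) + " -> " + text(strmode(3))

# Both strmode calls finish before show() runs, so both arguments alias
# the same buffer and the first result has been overwritten.
clobbered = show(strmode(1), strmode(3))

print(ok)         # int -> ref char
print(clobbered)  # ref char -> ref char
```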

>> Note that there would be no requirement for dynamic memory. The
>> formatter would just send for printing each character as it was
>> generated.
>> What's wrong with that? (Genuine question!)
>
> Nothing. How on earth do you think we managed in olden
> times, before we had "dynamic memory"? [Ans: by printing each
> character in sequence.]

Probably the printing tasks weren't that challenging. As my A68G example
showed, output tended to be tabulated.

Andy Walker

Jan 4, 2022, 7:54:34 PM

On 04/01/2022 11:31, Bart wrote:
> If I do this in A68G:
> print(("<",1,">"))
> the output is:
> < +1>
> The "<>" are just to help show the problem: how to get rid of
> those leading spaces and that plus sign? Or write the number with a
> field width of your choice?

"The problem"? It's the A68 /default/. You might prefer
a different default [esp for your own language], but someone had
to choose, without being able to read the minds of contributors
to this newsgroup more than half a century later. If you want
different output in A68, check out the specification of "whole"
[and "fixed" and "float"], RR10.3.2.1, which directly solves the
two "problems" you mention.

> (It gets worse with wider, higher precision numbers, as it uses the
> maximum value as the basis for the field width, so that one number
> could take up most of the line. Now you will need to start calling
> functions that return strings to get things done properly.)

"Worse", "properly"? All you're saying is that you don't
like the default. Note, again for A68, that "whole" [still] works
for all flavours of number. FTAOD, I carefully didn't mention A68
in my previous postings to this thread; I don't greatly care for
A68 transput -- but at least the unformatted [or "formatless" in
RR-speak] version is easy to learn and use.

> The main problem with C's printf is having to tell it the exact type
> of each expression.

The main problem is its complexity! It's as bad as A68,
while not being /anywhere near/ as comprehensive. I suspect you
haven't read N2731 [or near equivalent]. If it takes scores of
pages to describe a relatively simple facility, there's surely
something wrong.

> But those formatting facilities are genuinely useful and harder to
> emulate in user code if they didn't exist.

The RR includes "user code" for both the formatted and
unformatted versions of transput; so if any part of it didn't
exist in some other language, you could easily roll your own.
Not that you should need to.

[...]

>> For the exceptional cases, use a library
>> procedure or your own procedure to convert "val" to a suitable
>> array of characters, with whatever parameters are appropriate.
> Well, this is the problem. Where will the array of characters be
> located especially if the size is unpredictable?

Why [as a user] do you care? Your language either has
strings as a useful type or it doesn't. If it does, you're in
business. If it doesn't, and you want to write software that
handles things like words or names or anything of the sort, then
you have problems way beyond getting your output to look nice
[but at least Unix/Linux comes with loads of commands to do
that for you]. If, OTOH, you're a compiler writer trying to
implement strings, then there is plenty of source code for
those commands available to give you a start.

[In response to James:]

>> How on earth do you think we managed in olden
>> times, before we had "dynamic memory"? [Ans: by printing each
>> character in sequence.]
> Probably the printing tasks weren't that challenging. As my A68G
> example showed, output tended to be tabulated.

Perhaps you could write that in a more patronising form?
[For some of us, "olden times" started more than a decade before
the A68 RR, and we nevertheless managed to write word-processing
and similar "apps" -- despite the absence of mod cons such as
editors, mice, disc storage, files, "dynamic memory", ....]

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music

Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Handel

Bart

Jan 5, 2022, 8:12:57 AM

On 05/01/2022 00:54, Andy Walker wrote:
> On 04/01/2022 11:31, Bart wrote:

>> The main problem with C's printf is having to tell it the exact type
>> of each expression.
>
> The main problem is its complexity! It's as bad as A68,
> while not being /anywhere near/ as comprehensive. I suspect you
> haven't read N2731 [or near equivalent]. If it takes scores of
> pages to describe a relatively simple facility, there's surely
> something wrong.

If you mean 7.21.6.1 about fprintf, that's 7 pages, of which the last
two are examples. Yes, it goes on a bit, but that's just its style,
which is a specification hopefully useful to someone needing to
implement it.

Format codes form a little language of their own. Bear in mind all the
different parameters that could be used to control the appearance of an
integer or float value.

This one for example roughly emulates A68's display for INT:

"%+11d"
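For anyone without a C compiler to hand, Python's printf-style % operator accepts the same conversion, so the effect of the flags can be checked there (a small sketch; the angle brackets are only there to make the padding visible):

```python
value = 1

# '+' forces a sign, 11 is the minimum field width, d is signed decimal.
c_style = "<" + "%+11d" % value + ">"

# The same parameters via Python's format(): sign flag, then width, then type.
new_style = "<" + format(value, "+11d") + ">"

print(c_style)    # <         +1>
print(new_style)  # <         +1>
```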

In my scheme, the equivalent params are: "+11" (or "11 +"; it's not
fussy about syntax), and the docs (in the form of comments to the
function that turns that style string into a descriptor) occupy 2 lines,
but are rather sparse.

>> Well, this is the problem. Where will the array of characters be
>> located especially if the size is unpredictable?
>
> Why [as a user] do you care? Your language either has
> strings as a useful type or it doesn't.

The static language doesn't. It doesn't mean it's not possible to do
things, it's just more work.

But there is a greater need for helpful Print features that involve
text, which are one corner of the language, than to do 100 times as much
work in transforming that language into one with first class strings.

I use this language to implement my higher-level one, and to take care
of things that can't be done in scripting code.

> [In response to James:]
>>> How on earth do you think we managed in olden
>>> times, before we had "dynamic memory"? [Ans: by printing each
>>> character in sequence.]
>> Probably the printing tasks weren't that challenging. As my A68G
>> example showed, output tended to be tabulated.
>
> Perhaps you could write that in a more patronising form?

I've lost track here of your argument.

You say that a language ought to have strings as a proper type. But you
also say it doesn't need them. So which is it?

I think the thread is partly about how to add custom printing to a
language that doesn't have automatically managed string types. Example:

record date =
int day, month, year
end

date d := (5,1,2022)

print d

Ideally the output should be something like "5-Jan-2022". But my static
language doesn't support this at all, not even printing the 3 numbers
involved. My dynamic one just shows "(5,1,2022)".

One solution - this is about static code from now on - is to write a
function like this:

function strdate(date &d)ichar =
static [100]char str
static []ichar months = ("Jan","Feb","Mar","Apr","May","Jun",
"Jul","Aug","Sep","Oct","Nov","Dec")

fprint @str, "#-#-#", d.day, months[d.month], d.year
return str
end

then write:

print strdate(d)

but this has obvious limitations: the result of strdate() must be
consumed immediately for example. For more complex types, the string
could be arbitrarily long; what should that buffer size be?

Returning an allocated string instead means something needs to deallocate
it sooner or later, preferably sooner, but via which mechanism? It could
also mean arbitrarily large strings that can cause issues.
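
The trade-off can be sketched in C terms. This is only an illustration under assumed names (`strdate_static`, `strdate_into`); it shows why a static buffer must be consumed immediately, and one heap-free alternative in which the caller supplies the buffer and can detect truncation.

```c
#include <assert.h>
#include <stdio.h>

struct date { int day, month, year; };

static const char *const months[12] = {
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};

/* Static-buffer version: every call overwrites the previous result,
   so the returned pointer is only valid until the next call. */
const char *strdate_static(const struct date *d)
{
    static char str[100];
    assert(d->month >= 1 && d->month <= 12);
    snprintf(str, sizeof str, "%d-%s-%d",
             d->day, months[d->month - 1], d->year);
    return str;
}

/* Caller-supplied buffer: returns the length the full text needs
   (excluding the NUL), so a result >= size signals truncation. */
int strdate_into(char *buf, size_t size, const struct date *d)
{
    assert(d->month >= 1 && d->month <= 12);
    return snprintf(buf, size, "%d-%s-%d",
                    d->day, months[d->month - 1], d->year);
}
```

Two calls to `strdate_static` in one expression would both yield the same pointer, which is the "consume immediately" problem. `strdate_into` avoids that, but it pushes the buffer-size question onto every caller.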

Another approach is to use:

print d

and for it to know, through an association made elsewhere, that turning
d into a string (or otherwise serialising it) involves calling the function.

I don't think it's helpful to suggest that either the language needs to
be transformed into a higher level one, just for Print, or that it
doesn't need any such features, because decades ago we all seemed to
manage to print dates with basic Print. Yes I can do that now too:

print d.day,,"-",,months[d.month],,"-",,d.year # ,, means no space
fprint "#-#-#", d.day, months[d.month], d.year
printdate(d); println

but as you can see it's a bit of a pig.

Andy Walker

Jan 7, 2022, 8:20:30 AM

On 05/01/2022 13:12, Bart wrote:
> Format codes form a little language of their own. Bear in mind all
> the different parameters that could be used to control the appearance
> of an integer or float value.

Yes, but that's the point. As you point out with your "date"
example, you're typically on your own if you want to print something
that doesn't fit neatly into the "little language" [such as dates or
Roman numerals], and there are so many "different parameters" that
the little language grows into something bigger, while still being
insufficient for many needs. Learning it is a distraction from the
main programming language involved.

>>> Probably the printing tasks weren't that challenging. As my A68G
>>> example showed, output tended to be tabulated.
>> Perhaps you could write that in a more patronising form?
> I've lost track here of your argument.

You were implying that before modern times, only simple
output used to be needed. That's patronising rubbish.

> You say that a language ought to have strings as a proper type. But
> you also say it doesn't need them. So which is it?

Both, of course. Proper strings make life easier, but
it's possible to program around that as long as your language
has /some/ way of constructing and printing characters. Any
modern general-purpose language /ought/ to be able to have
arrays [whether of characters, integers, procedures returning
structures, ...] as parameters and as function return values.
You can program around the lack, but these days you shouldn't
have to.

> I think the thread is partly about how to add custom printing to a
> language that doesn't have automatically managed string types.

Well, it was about whether languages should have formats
[presumably somewhat similar to those in C]. I don't see the
point. In C terms, the simple cases [such as "%s", "%d"] can
be replaced by simply printing the string or whatever, the
slightly more complicated cases by some equivalent to the A68
"whole", "fixed" and "float" procedures, and you have to roll
your own with anything complex anyway, as your date example
[snipped] shows.

[...]
> While returning an allocated string means something needs to
> deallocate it sooner or later, preferably sooner, but via which
> mechanism? It could also mean arbitrary large strings that can cause
> issues.

You're back to implementation issues. Not the concern
of the user who wants to print dates that look nice. Meanwhile,
the implementation issues were solved more than half a century
ago. I don't know why you and James are so opposed to the use
of heap storage [and temporary files, if you really want strings
that are many gigabytes]?

> I don't think it's helpful to suggest that either the language needs
> to be transformed into a higher level one, just for Print, or that it
> doesn't need any such features, because decades ago we all seemed to
> manage to print dates with basic Print.

"Need" is an exaggeration. But in any case no-one here
has suggested either part of that [esp not if you replace "need"
by "desirable" (with appropriate changes to the grammar)]. It is
indeed desirable for a modern language to have the ability to
allocate [and deallocate] off-stack storage and the ability to
print characters. Is there any major general-purpose computing
language of the past sixty years that has not had such abilities?

Meanwhile, you're proposing adding a "little language"
to the language spec "just for Print". Why is that any better
than adding "fixed", "float" and "whole" to the library?

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music

Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Dussek

Bart

Jan 7, 2022, 1:14:51 PM

On 07/01/2022 13:20, Andy Walker wrote:
> On 05/01/2022 13:12, Bart wrote:

> You're back to implementation issues. Not the concern
> of the user who wants to print dates that look nice. Meanwhile,
> the implementation issues were solved more than half a century
> ago. I don't know why you and James are so opposed to the use
> of heap storage [and temporary files, if you really want strings
> that are many gigabytes]?

Because heap storage requires a more advanced language to manage
properly, ie. automatically. (I don't care for speculative GC methods.)

>> I don't think it's helpful to suggest that either the language needs
>> to be transformed into a higher level one, just for Print, or that it
>> doesn't need any such features, because decades ago we all seemed to
>> manage to print dates with basic Print.
>
> "Need" is an exaggeration. But in any case no-one here
> has suggested either part of that [esp not if you replace "need"
> by "desirable" (with appropriate changes to the grammar)]. It is
> indeed desirable for a modern language to have the ability to
> allocate [and deallocate] off-stack storage and the ability to
> print characters. Is there any major general-purpose computing
> language of the past sixty years that has not had such abilities?

Plenty don't do such allocations /automatically/.

> Meanwhile, you're proposing adding a "little language"
> to the language spec "just for Print". Why is that any better
> than adding "fixed", "float" and "whole" to the library?

Until about 2010 I only had simple Print. If I wanted to display something
in hex for example, I had a library function called Hex:

print hex(a)

And that function hex was defined like this (in an older version of my
language):

FUNCTION HEX(n)=
static [1..20]char str
sprintf(^str," %XH",n)
^str
END

Notice it's using C's 'sprintf' to do the work; I can't be having that!
(Yes I can do this easily enough with my own code, but why bother when
it's there ready to use.)

So if I'm having to utilise something from another language (and C at
that), it sounds like something I ought to have built-in to mine:

print a:"h"

(The :fmt syntax comes from Pascal.) Inside my library, I do it with
low-level code; C's sprintf is not suitable as there are extra controls.

But also, inside my library, the intermediate string data is much more
easily managed.

> Meanwhile, you're proposing adding a "little language"
> to the language spec "just for Print".

It can be called a 'language' for C-style formats as there is a syntax
involved. (The 'just for Print' comment was about adding advanced
features to the language.)

As I do it, it's more of a style string, as I frequently use within
applications, for controlling GUI elements for example.

It is the x:fmt syntax that is specific to Print, and is really
syntactic sugar:

print x, y:fmt

is equivalent (when x and y have i64 types) to:

m$print_startcon()
m$print_i64_nf(x) # this calls m$print_i64(x, nil)
m$print_i64(y,fmt)
m$print_end()

(The start/end functions provide a print context, eg for output to
console, file, string, and allow for logic to insert spaces /between/
items.)
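
The lowering described above can be approximated in C. This is a sketch only, with hypothetical names (`print_ctx`, `print_start`, `print_i64`, `print_end`) rather than the real m$ functions; it assumes a string destination and treats a format of "h" as plain hexadecimal. The point it illustrates is that the context, not the item printers' callers, decides when a separating space is needed.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical print context: a destination string plus an item count. */
struct print_ctx {
    char  *buf;    /* destination */
    size_t size;   /* capacity, including the terminating NUL */
    size_t len;    /* characters written so far */
    int    items;  /* items printed since print_start */
};

void print_start(struct print_ctx *ctx, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    ctx->buf = buf;
    ctx->size = size;
    ctx->len = 0;
    ctx->items = 0;
    buf[0] = '\0';
}

/* Append text, silently truncating when the destination is full. */
static void ctx_puts(struct print_ctx *ctx, const char *s)
{
    while (*s != '\0' && ctx->len + 1 < ctx->size)
        ctx->buf[ctx->len++] = *s++;
    ctx->buf[ctx->len] = '\0';
}

/* fmt == NULL selects default formatting; "h" selects hexadecimal. */
void print_i64(struct print_ctx *ctx, int64_t v, const char *fmt)
{
    char tmp[24];                       /* fits any 64-bit value */
    if (ctx->items++ > 0)
        ctx_puts(ctx, " ");             /* separator goes between items only */
    if (fmt != NULL && fmt[0] == 'h')
        snprintf(tmp, sizeof tmp, "%" PRIX64, (uint64_t)v);
    else
        snprintf(tmp, sizeof tmp, "%" PRId64, v);
    ctx_puts(ctx, tmp);
}

/* Ends the print operation and returns the number of characters produced. */
size_t print_end(struct print_ctx *ctx)
{
    return ctx->len;
}
```

A console or file context would replace `ctx_puts` with a call that writes to that destination; the item printers would not need to change.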

James Harris

Jan 8, 2022, 11:18:13 AM

On 02/01/2022 18:25, Dmitry A. Kazakov wrote:
> On 2022-01-02 19:05, James Harris wrote:
>> On 28/12/2021 09:21, Dmitry A. Kazakov wrote:

...

>>> The most flexible is a combination of a string that carries most of
>>> the information specific to the datatype (an OO method) and some
>>> commands to the rendering environment.
>>
>> That sounds interesting. How would it work?
>
> With single dispatch you have an interface, say, 'printable'. The
> interface has an abstract method 'image' with the profile:
>
> function Image (X : Printable) return String;
>
> Integer, float, string, whatever that has to be printable inherits to
> Printable and thus overrides Image. That is.
>
> The same goes with serialization/streaming etc.
>

OK, that was my preferred option, too. The trouble with it is that it
needs somewhere to put the string form. And you know the problems
therewith (lack of recursion OR lack of thread safety OR dynamic memory
management).

My more recent suggestion would not need any special place for the
string form. Anything to be output could be written directly to wherever
the print function was writing to.
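
As a rough illustration of writing directly to the destination, here is a C sketch in which a renderer for a user-defined type emits characters one at a time through a sink and never builds a string form. All names (`sink`, `render_date`, `array_sink`) are hypothetical. The array-backed sink exists only so the result can be inspected; a console sink would call `putchar` instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sink: one function that accepts one character at a time. */
struct sink {
    void (*put)(void *state, char c);
    void *state;
};

static void emit(const struct sink *s, char c) { s->put(s->state, c); }

/* Emit an unsigned value, most significant digit first, using
   recursion in place of a digit buffer. */
static void emit_unsigned(const struct sink *s, unsigned n)
{
    if (n >= 10)
        emit_unsigned(s, n / 10);
    emit(s, (char)('0' + n % 10));
}

static void emit_str(const struct sink *s, const char *p)
{
    while (*p != '\0')
        emit(s, *p++);
}

struct date { unsigned day, month, year; };

/* Renderer for a user-defined type: writes straight to the sink. */
void render_date(const struct sink *s, const struct date *d)
{
    static const char *const months[12] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    assert(d->month >= 1 && d->month <= 12);
    emit_unsigned(s, d->day);
    emit(s, '-');
    emit_str(s, months[d->month - 1]);
    emit(s, '-');
    emit_unsigned(s, d->year);
}

/* Example sink that appends to a fixed array (for inspection only). */
struct array_sink { char *buf; size_t size, len; };

void array_put(void *state, char c)
{
    struct array_sink *a = state;
    if (a->len + 1 < a->size) {
        a->buf[a->len++] = c;
        a->buf[a->len] = '\0';
    }
}
```

Where recursion is unwelcome, `emit_unsigned` could instead find the largest power of ten not exceeding the value and divide down from there; it would still need no storage for the text.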

--
James Harris

James Harris

Jan 8, 2022, 11:29:36 AM

On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
> On 2022-01-02 17:06, James Harris wrote:
>

>> If you convert to strings then what reclaims the memory used by those
>> strings?
>

> What reclaims memory used by those integers?

Depends on where they are:

1. Globals - reclaimed when the program exits.

2. Locals on the stack or in activation records - reclaimed at least by
the time the function exits.

3. Dynamic on the heap - management required.

Your solution of creating string forms of values is reasonable in a
language and an environment which already have dynamic memory management
but not otherwise.

>
>> Not all languages have dynamic memory management, and dynamic memory
>> management is not ideal for all compilation targets.
>

> No dynamic memory management is required for handling temporary objects.

Where would you put the string forms?

>
> ----------
> If that were relevant in the case of formatted output, which has a
> massive overhead, so that even when using the heap (which is no way
> necessary) it would leave a little or no dent. I remember a SysV C
> compiler which modified the format string of printf in a misguided
> attempt to save a little bit memory, while the linker put string
> constants in the read-only memory...
>

Whether formatted or not, all IO tends to have higher costs than
computation and for most applications the cost of printing doesn't
matter. But when designing a language or a standard library it's a bad
idea to effectively impose a scheme which has a higher cost than
necessary because the language designer doesn't know what uses his
language will be put to.

--
James Harris

James Harris

Jan 8, 2022, 11:36:06 AM

On 02/01/2022 17:21, Dmitry A. Kazakov wrote:

...

> If this
>
> Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>
> is a problem in your language, then the job is not done.

What's wrong with

put_line("X=%i;, Y=%i;", X, Y)

?

I see that you've gone for default formatting so I did the same. I could
(under the suggestion) customise them by putting a format specification
between % and ;. How would you customise the format of X'Image and Y'Image?

--
James Harris

James Harris

Jan 8, 2022, 11:50:19 AM

I don't know why you bring locking into it. It's neither necessary nor
relevant.

Furthermore, the only use for an output buffer is to make output more
efficient; it's not fundamental.

On a minimal system a print function only has to emit one character at a
time. No buffering. No memory management. For sure, there can be
buffering added on top but that would be for performance reasons.
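
The layering can be sketched in C: the print logic only ever calls a one-character "put", and a small buffer placed between it and the device is an optional extra that reduces the number of device calls. The names `buffered`, `buffered_put`, `buffered_flush` and `counting_write` are hypothetical, and the eight-byte buffer is deliberately tiny so the effect is visible.

```c
#include <assert.h>
#include <stddef.h>

/* Device-level write: takes a block of characters. */
typedef void (*write_fn)(void *dev, const char *data, size_t n);

/* Optional buffering layer sitting between "put one char" and the device. */
struct buffered {
    char     buf[8];
    size_t   len;
    write_fn write;
    void    *dev;
};

/* Push any pending characters to the device in one call. */
void buffered_flush(struct buffered *b)
{
    assert(b->write != NULL);
    if (b->len > 0) {
        b->write(b->dev, b->buf, b->len);
        b->len = 0;
    }
}

/* The only call the print logic needs: emit one character. */
void buffered_put(struct buffered *b, char c)
{
    if (b->len == sizeof b->buf)
        buffered_flush(b);
    b->buf[b->len++] = c;
}

/* Stand-in device that just counts calls and bytes. */
struct device_stats { size_t calls, bytes; };

void counting_write(void *dev, const char *data, size_t n)
{
    struct device_stats *s = dev;
    (void)data;
    s->calls++;
    s->bytes += n;
}
```

Removing the layer means calling the device once per character. The output is the same, and only the cost differs.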

Whether a buffer is used or not ISTM your schemes imply significant
memory management. There's nothing wrong with that. It's quite normal in
most languages and on most hardware that anyone is likely to use these
days. But I return to my point that it's best not to design output
facilities which need such complexity because one cannot enumerate every
environment in which a language will ever be used.

--
James Harris

James Harris

Jan 8, 2022, 12:15:59 PM

On 04/01/2022 00:35, Andy Walker wrote:

> On 02/01/2022 16:06, James Harris wrote:
>>>> The printf approach to printing is flexible and fast at rendering
>>>> inbuilt types - probably better than anything which came before it -
>>>> but it's not perfect.
>>> No, it's rubbish. If you need formatted transput [not
>>> entirely convinced, but chacun a son gout],
>> If you are not convinced by formatted io then what kind of io do you
>> prefer?
>
> Unformatted transput, of course. Eg,
>
> print this, that, these, those and the many other things

OK. I would say that that has /default/ formatting but is still formatted.

>
> [with whatever syntax, quoting, separators, etc you prefer]. Much
> the same for "read". Most of the time the default is entirely
> adequate. If not, then the choice for the language designer is
> either an absurdly complicated syntax that still probably doesn't
> meet some plausible needs, or to provide simple mechanisms that
> allow programmers to roll their own. Guess which I prefer.

You have a preferred syntax in which users can express how they want
values to be formatted? Suggestions welcome!

>>> then instead of all
>>> the special casing, the easiest and most flexible way is to
>>> convert everything to strings.
>> If you convert to strings then what reclaims the memory used by those
>> strings? Not all languages have dynamic memory management, and
>> dynamic memory management is not ideal for all compilation targets.
>
> AIUI, you are designing your own language. If it doesn't
> have strings, eg as results of procedures, then you have much worse
> problems than designing some transput procedures. There are lots
> of ways of implementing strings, but they are for the compiler to
> worry about, not the language designer [at least, once you know it
> can be done].

The design of the language supports dynamically sized strings. That's
fine for applications which need them. But that's different from
imposing such strings and the management thereof on every print operation.

>
>> The form I proposed had no need for dynamic allocations. That's part
>> of the point of it.
>
> There's no need for "dynamic allocations" merely for transput.
> Most of the early languages that I used didn't have them, but still
> managed to print and read things. You're making mountains out of
> molehills.

Oh? How would you handle

widget w
print(w)

?

ISTM you are suggesting that w be converted to a string (or array of
characters) and then printed.

>
>> I'm not sure you understand the proposal. To be clear, the print
>> routine would be akin to
>>    print("String with %kffff; included", val)
>> where the code k would be used to select a formatter. The formatter
>> would be passed two things:
>>    1. The format from k to ; inclusive.
>>    2. The value val.
>> As a result, the format ffff could be as simple as someone could
>> design it.
>
>     Yes, that's what I thought you meant. C is thataway -->.
> You call it "simple"; C is one of the simpler languages of this
> type, yet the full spec of "printf" and its friends is horrendous.

I agree. C's printf formats are fine for simple cases. But they don't
scale. That's why I suggested a scheme which was more flexible.
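
For concreteness, the dispatch step of such a scheme might look like the following C sketch. It rests on several assumptions that are not part of the proposal: values are passed as an array of pointers rather than as varargs, engines are registered in a table indexed by the code character, and there is no escape for a literal '%'. Every name here (`register_engine`, `print_to`, `engine_int`, `array_put`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef void (*put_fn)(void *state, char c);

/* An engine receives the format text between '%' and ';' (starting at
   its own code letter) plus a pointer to the value to render. */
typedef void (*engine_fn)(const char *fmt, size_t fmt_len,
                          const void *value, put_fn put, void *state);

static engine_fn engines[256];   /* indexed by the engine code character */

void register_engine(char code, engine_fn fn)
{
    assert(fn != NULL);
    engines[(unsigned char)code] = fn;
}

/* Copy plain characters; hand each "%E...;" field to engine E together
   with the next value. Returns 0 on success, -1 on a malformed field
   or an unknown engine. */
int print_to(put_fn put, void *state, const char *ctl,
             const void *const *values)
{
    while (*ctl != '\0') {
        if (*ctl != '%') { put(state, *ctl++); continue; }
        const char *fmt = ctl + 1;          /* points at the engine code */
        const char *end = fmt;
        while (*end != '\0' && *end != ';')
            end++;
        if (*end != ';' || end == fmt)
            return -1;
        engine_fn fn = engines[(unsigned char)*fmt];
        if (fn == NULL)
            return -1;
        fn(fmt, (size_t)(end - fmt), *values++, put, state);
        ctl = end + 1;
    }
    return 0;
}

/* Engine for "%i;": a plain signed integer. The format body is ignored
   in this sketch, and a small stack buffer holds the digits. */
void engine_int(const char *fmt, size_t fmt_len,
                const void *value, put_fn put, void *state)
{
    char tmp[24];
    (void)fmt; (void)fmt_len;
    snprintf(tmp, sizeof tmp, "%d", *(const int *)value);
    for (const char *p = tmp; *p != '\0'; p++)
        put(state, *p);
}

/* Array-backed destination, used here only to inspect the result. */
struct array_sink { char *buf; size_t size, len; };

void array_put(void *state, char c)
{
    struct array_sink *a = state;
    if (a->len + 1 < a->size) {
        a->buf[a->len++] = c;
        a->buf[a->len] = '\0';
    }
}
```

A type K unknown to the print function would be handled by calling `register_engine('K', ...)` beforehand. The print function itself never needs to know how K is rendered.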

> Build in some version [exact syntax up to you] of
>
> print "String with ", val, " included"
>
> and you're mostly done. For the exceptional cases, use a library
> procedure or your own procedure to convert "val" to a suitable
> array of characters, with whatever parameters are appropriate.

a. Where would you store the array of characters?

b. What's wrong with

print "String with %v; included" % val

where v is a suitable format for the type of val.

>
>> Note that there would be no requirement for dynamic memory. The
>> formatter would just send for printing each character as it was
>> generated.
>> What's wrong with that? (Genuine question!)
>
> Nothing. How on earth do you think we managed in olden
> times, before we had "dynamic memory"? [Ans: by printing each
> character in sequence.]
>

Cool. But if you agree with my suggestion of a means which can be used
to render each character in sequence I don't know why you suggested
conversion to an array of characters.

--
James Harris

James Harris

Jan 8, 2022, 12:18:35 PM

On 04/01/2022 11:31, Bart wrote:

> On 04/01/2022 00:35, Andy Walker wrote:
>> On 02/01/2022 16:06, James Harris wrote:

...

>>> If you are not convinced by formatted io then what kind of io do you
>>> prefer?
>>
>>      Unformatted transput, of course. Eg,
>>
>>    print this, that, these, those and the many other things
>
> If I do this in A68G:
>
>     print(("<",1,">"))
>
> the output is:
>
>     <         +1>

...

> print val3(f())
>
> where f itself includes a 'print val'.

Two great examples!

--
James Harris

Dmitry A. Kazakov

Jan 8, 2022, 2:35:23 PM

On 2022-01-08 17:18, James Harris wrote:
> On 02/01/2022 18:25, Dmitry A. Kazakov wrote:
>> On 2022-01-02 19:05, James Harris wrote:
>>> On 28/12/2021 09:21, Dmitry A. Kazakov wrote:
>
> ...
>
>>>> The most flexible is a combination of a string that carries most of
>>>> the information specific to the datatype (an OO method) and some
>>>> commands to the rendering environment.
>>>
>>> That sounds interesting. How would it work?
>>
>> With single dispatch you have an interface, say, 'printable'. The
>> interface has an abstract method 'image' with the profile:
>>
>> function Image (X : Printable) return String;
>>
>> Integer, float, string, whatever that has to be printable inherits to
>> Printable and thus overrides Image. That is.
>>
>> The same goes with serialization/streaming etc.
>
> OK, that was my preferred option, too. The trouble with it is that it
> needs somewhere to put the string form. And you know the problems
> therewith (lack of recursion OR lack of thread safety OR dynamic memory
> management).

No idea why you think there is something special about string format or
that any of the mentioned issues would ever apply. Conversion to string
needs no recursion, is as thread safe as any other call, needs no
dynamic memory management.

There should be no format specifications at all. You just need a few
parameters for Image regarding type-specific formatting and a few
parameters regarding rendering context in the actual output call.

The former are like put + if positive, base, precision etc; the latter
are like output field width, alignment, fill character etc.

Of course, the formatting parameters bring things in the realm of
multiple dispatch:

function Image (X : Printable; Options : Format := Default)
return String;

Combinations of Printable x Format are multiple dispatch.

Multiple dispatch is unlikely to be supported, but in this case it could
be replaced by a variant record for Format. That would lack extensibility,
but where is any in printf? As an alternative to a variant record, you can
give Format methods like:

function Base (X : Format) return Number_Base;

The methods would return defaults if not overridden. In both cases the
language must provide good support of aggregates to make format
specifications comfortable.
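
The split between type-specific options and rendering-context options can be approximated in C, which has neither dispatch nor default parameters. The sketch below is an assumption-laden illustration, not a translation of any Ada library: `int_format`, `int_image`, `field` and `put_field` are hypothetical names, and the results go into caller-supplied buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Type-specific options: how the value itself is written. */
struct int_format {
    int  base;   /* 2..16 */
    bool plus;   /* put '+' before non-negative values */
};
static const struct int_format int_format_default = { 10, false };

/* Write the image of v into buf. Returns its length, or 0 if buf is
   too small. */
size_t int_image(char *buf, size_t size, long v, struct int_format f)
{
    static const char digits[] = "0123456789ABCDEF";
    char tmp[sizeof(long) * 8 + 1];        /* base-2 digits plus a sign */
    size_t n = 0;
    unsigned long m = v < 0 ? 0UL - (unsigned long)v : (unsigned long)v;

    assert(f.base >= 2 && f.base <= 16);
    do {                                   /* digits come out reversed */
        tmp[n++] = digits[m % (unsigned long)f.base];
        m /= (unsigned long)f.base;
    } while (m != 0);
    if (v < 0)
        tmp[n++] = '-';
    else if (f.plus)
        tmp[n++] = '+';

    if (n + 1 > size)
        return 0;
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}

/* Rendering-context options: how an image is placed in the output. */
struct field { size_t width; char fill; bool left; };

/* Pad an image of length len to the field width. Returns the length
   written, or 0 if out is too small. */
size_t put_field(char *out, size_t size, const char *image, size_t len,
                 struct field fld)
{
    size_t total = len < fld.width ? fld.width : len;
    size_t pad = total - len;
    size_t k = 0;

    if (total + 1 > size)
        return 0;
    if (!fld.left)
        while (pad > 0) { out[k++] = fld.fill; pad--; }
    for (size_t i = 0; i < len; i++)
        out[k++] = image[i];
    while (pad > 0) { out[k++] = fld.fill; pad--; }
    out[k] = '\0';
    return k;
}
```

Passing `int_format_default` plays the role of the defaulted Options parameter. Designated initializers such as `(struct int_format){ .base = 16 }` are the closest C comes to the aggregate support mentioned above.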

Dmitry A. Kazakov

Jan 8, 2022, 2:48:43 PM

On 2022-01-08 17:50, James Harris wrote:
> On 02/01/2022 18:40, Dmitry A. Kazakov wrote:
>> On 2022-01-02 18:54, James Harris wrote:
>>> On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
>>
>>>> If this
>>>>
>>>> Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>>>>
>>>> is a problem in your language, then the job is not done.
>>>
>>> That would be poor for small-machine targets. Shame on Ada! ;-)
>>
>> That works perfectly well on small targets. You seem unaware of what
>> actually happens on I/O. For a "small" target it would be some sort of
>> networking stack with a terminal emulation on top of it. Believe me,
>> creating a temporary string on the stack is nothing in comparison to
>> that. Furthermore, the implementation would likely be no less efficient
>> than printf which has no idea how large the result is and would have
>> to reallocate the output buffer or keep it between calls and lock it
>> from concurrent access. Locking on an embedded system is a
>> catastrophic event because switching threads is expensive as hell.
>> Note also that you cannot stream output, because networking protocols
>> and terminal emulators are much more efficient if you do bulk
>> transfers. All that is the infamous premature optimization.
>
> I don't know why you bring locking into it. It's neither necessary nor
> relevant.

Because this is how I/O works.

> Furthermore, the only use for an output buffer is to make output more
> efficient; it's not fundamental.

It is fundamental, there is no hardware anymore where you could just
send a single character to. A small target will write to the network
stack, e.g. use socket send over TCP, that will coalesce output into
network packets, these would be buffered into transport layer frames,
these will go to physical layer packets etc.

There is no such thing as character stream without a massive overhead
beneath it. So creating a string on the secondary stack is nothing in
comparison to that, especially when you skip the stream abstraction. Most
embedded software does. It tends to do I/O directly in packets, e.g.
sending application level packets over TCP or using UDP.

Tracing, the only place where text output is actually used, does not do
printf directly. It usually does some kind of locking when the output
consists of multiple printfs. Consider it an output transaction when
each instance of output is atomic. Of course, character streams have no
place there.

Dmitry A. Kazakov

Jan 8, 2022, 2:54:21 PM

On 2022-01-08 17:36, James Harris wrote:
> On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
>
> ...
>
>> If this
>>
>> Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>>
>> is a problem in your language, then the job is not done.
>
> What's wrong with
>
> put_line("X=%i;, Y=%i;", X, Y)
>
> ?

Untyped, unsafe, messy, non-portable garbage that does not work with
user-defined types.

And again, that is not the point. The point is that in any decent
language there is no need for either the printf mess or print statements.

Because the language abstractions are powerful enough to express
formatting I/O in language terms.

Dmitry A. Kazakov

Jan 8, 2022, 3:01:52 PM

On 2022-01-08 17:29, James Harris wrote:
> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>> On 2022-01-02 17:06, James Harris wrote:
>>
>>> If you convert to strings then what reclaims the memory used by those
>>> strings?
>>
>> What reclaims memory used by those integers?
>
> Depends on where they are:
>
> 1. Globals - reclaimed when the program exits.
>
> 2. Locals on the stack or in activation records - reclaimed at least by
> the time the function exits.
>
> 3. Dynamic on the heap - management required.

Now replace integer with string. The work is done.

> Your solution of creating string forms of values is reasonable in a
> language and an environment which already have dynamic memory management
> but not otherwise.

You do not need that.

>>> Not all languages have dynamic memory management, and dynamic memory
>>> management is not ideal for all compilation targets.
>>
>> No dynamic memory management is required for handling temporary objects.
>
> Where would you put the string forms?

The same place you put integer, float, etc. That place is called stack
or LIFO.

> Whether formatted or not, all IO tends to have higher costs than
> computation and for most applications the cost of printing doesn't
> matter. But when designing a language or a standard library it's a bad
> idea to effectively impose a scheme which has a higher cost than
> necessary because the language designer doesn't know what uses his
> language will be put to.

Did you do actual measurements?

printf obviously imposes higher costs than direct conversions. And also
costs that cannot be easily optimized since the format can be an
expression and even if a constant it is difficult to break down into
direct conversions.

Bart

Jan 8, 2022, 3:22:44 PM

But the result is ugly, ungainly code to do a simple task.

Look at the difference when it's done properly:

fprintln "X=#, Y=#", X, Y

Or as I also do it (as "=" displays the expression itself before
its value):

println =X, =Y

Back to that Ada, I wanted to do this:

Put_Line (X+Y'Image);

This didn't work (I guess ' has higher precedence than +?). But neither did:

Put_Line ((X+Y)'Image);

'Image' can only be applied to a name, not an expression; why? So I
have to use an intermediate variable: Z := X+Y then do Z'Image.

I think you're not in a position to tell people how to implement Print!
You need to be able to just do this:

println X+Y

Bart

Jan 8, 2022, 3:46:22 PM

On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
> On 2022-01-08 17:29, James Harris wrote:

>>> No dynamic memory management is required for handling temporary objects.
>>
>> Where would you put the string forms?
>
> The same place you put integer, float, etc. That place is called stack
> or LIFO.

(1) integer, float etc are a fixed size known at compile-time

(2) integer, float etc are usually manipulated by value

Apparently Ada strings have a fixed length. If so, that would make
implementing your ideas simpler; and I can do the same thing in my
lower-level language, but I don't consider that a viable solution.

It would be very constraining if you had to know, in advance, the size
of a string returned from a function for a complex user-type conversion.
Which I guess also means the function has the same limit for all
possible calls.

But Ada also has unbounded strings:

"Unbounded strings are allocated using heap memory, and are deallocated
automatically."

Yeah, so exactly what we've been talking about. Either the language has
that kind of advanced feature, based on the heap, or it uses crude
methods with fixed length strings (which was the problem with Pascal)
which is not really a solution; just the kind of workaround I use already.

Dmitry A. Kazakov

Jan 8, 2022, 3:49:06 PM

On 2022-01-08 21:22, Bart wrote:

> This didn't work (I guess ' has higher precedence than +?). But neither
> did:
>
> Put_Line ((X+Y)'Image);
>
> 'Image' can only be applied to a name, not an expression; why?

Because 'Image is a type attribute:

<subtype>'Image (<value>)

So

Integer_32'image (X + Y)

> I think you're not in a position to tell people how to implement Print!

I am, just don't.

> You need to be able to just do this:
>
> println X+Y

Nope, I don't need that at all. In Ada it is just this:

Put (X + Y);
New_Line;

See the package Integer_IO (ARM A.10.8). The point is that it is almost
never used because, again, it is not needed for real-life software.

Dmitry A. Kazakov

Jan 8, 2022, 4:04:29 PM

On 2022-01-08 21:46, Bart wrote:
> On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
>> On 2022-01-08 17:29, James Harris wrote:
>
>>>> No dynamic memory management is required for handling temporary
>>>> objects.
>>>
>>> Where would you put the string forms?
>>
>> The same place you put integer, float, etc. That place is called stack
>> or LIFO.
>
> (1) integer, float etc are a fixed size known at compile-time

So what?

> (2) integer, float etc are usually manipulated by value

Irrelevant.

> Apparently Ada strings have a fixed length.

Apparently not:

function Get_Line (File : File_Type) return String;

> It would be very constraining if you had to know, in advance, the size
> of a string returned from a function for a complex user-type conversion.

String is an indefinite type, the size of an object is unknown until
run-time.

> Which I guess also means the function has the same limit for all
> possible calls.

Wrong. Indefinite types are returned just the same as definite types are,
on the stack, which means memory management policy LIFO.

To widen your horizon a little bit, a stack LIFO can be implemented by
many various means: using machine stack, using machine registers, using
thread local storage as well as various combinations thereof.

> But Ada also has unbounded strings:
>
> "Unbounded strings are allocated using heap memory, and are deallocated
> automatically."

Unbounded_String is practically never needed and its use is discouraged,
because heap is a bad idea and because text processing algorithms almost
never require changing the length/content of a string. If you do that, then
you do something wrong or the language is poor, e.g. does not support
string slices.
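
For readers unfamiliar with slices, the idea can be sketched in C as a pointer-plus-length view onto existing characters. Nothing is copied or allocated, and "changing" a string becomes narrowing the view. C has no built-in slice type, so `slice`, `slice_of`, `slice_sub`, `slice_trim` and `slice_print` below are hypothetical helpers written only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A view into existing characters: no copy and no allocation. */
struct slice { const char *ptr; size_t len; };

struct slice slice_of(const char *s)
{
    assert(s != NULL);
    struct slice r = { s, strlen(s) };
    return r;
}

/* Sub-range [from, to), clipped to the bounds of s. */
struct slice slice_sub(struct slice s, size_t from, size_t to)
{
    if (to > s.len)
        to = s.len;
    if (from > to)
        from = to;
    struct slice r = { s.ptr + from, to - from };
    return r;
}

/* Strip leading and trailing spaces by narrowing the view. */
struct slice slice_trim(struct slice s)
{
    while (s.len > 0 && s.ptr[0] == ' ') { s.ptr++; s.len--; }
    while (s.len > 0 && s.ptr[s.len - 1] == ' ') s.len--;
    return s;
}

/* Print a slice; "%.*s" takes the length as an int. */
int slice_print(FILE *out, struct slice s)
{
    return fprintf(out, "%.*s", (int)s.len, s.ptr);
}
```

A slice is a small fixed-size value, so it can be passed and returned like an integer; the characters it refers to must of course outlive it.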

Bart

Jan 8, 2022, 4:05:57 PM

On 08/01/2022 20:49, Dmitry A. Kazakov wrote:
> On 2022-01-08 21:22, Bart wrote:
>
>> This didn't work (I guess ' has higher precedence than +?). But
>> neither did:
>>
>> Put_Line ((X+Y)'Image);
>>
>> 'Image' can only be applied to a name, not an expression; why?
>
> Because 'Image is a type attribute:
>
> <subtype>'Image (<value>)

And yet it works with X'Image when X is a variable, not a type.

> So
>
> Integer_32'image (X + Y)

Yeah, that's much better!

>> I think you're not in a position to tell people how to implement Print!
>
> I am, just don't.

Tell me one thing that Ada can do with its Print scheme that I can't do
more simply and with less typing with mine.

>> You need to be able to just do this:
>>
>> println X+Y
>
> Nope, I don't need that at all. In Ada it is just this:
>
> Put (X + Y);

So it's overloading Put() with different types. But the language doesn't
similarly overload Put_Line()?

> See the package Integer_IO (ARM A.10.8). The point is that it is almost
> never used, because, again, not needed for real-life software.

Huh? Have you never written data to a file?

Bart

Jan 8, 2022, 4:18:28 PM

On 08/01/2022 21:04, Dmitry A. Kazakov wrote:
> On 2022-01-08 21:46, Bart wrote:
>> On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
>>> On 2022-01-08 17:29, James Harris wrote:
>>
>>>>> No dynamic memory management is required for handling temporary
>>>>> objects.
>>>>
>>>> Where would you put the string forms?
>>>
>>> The same place you put integer, float, etc. That place is called
>>> stack or LIFO.
>>
>> (1) integer, float etc are a fixed size known at compile-time
>
> So what?
>
>> (2) integer, float etc are usually manipulated by value
>
> Irrelevant.

Relevant because you are suggesting that strings can be manipulated just
like a 4- or 8-byte primitive type.

>> Apparently Ada strings have a fixed length.
>
> Apparently not:
>
> function Get_Line (File : File_Type) return String;

Yet I can't do this:

S: String;

"unconstrained subtype not allowed". It needs a size or to be
initialised from a literal of known length.

>
>> It would be very constraining if you had to know, in advance, the size
>> of a string returned from a function for a complex user-type conversion.
>
> String is an indefinite type, the size of an object is unknown until
> run-time.
>
>> Which I guess also means the function has the same limit for all
>> possible calls.
>
> Wrong. Indefinite types are returned just same as definite types are, on
> the stack, which means memory management policy LIFO.
>
> To widen your horizon a little bit, a stack LIFO can be implemented by
> many various means: using machine stack, using machine registers, using
> thread local storage as well as various combinations of.

Suppose you have this:

Put_Line(Get_Line(...));

Can you go into some detail as to what, exactly, is passed back from
Get_Line(), what, exactly, is passed to Put_Line(), bearing in mind that
64-bit ABIs frown on passing by value any args more than 64-bits, and
where, exactly, the actual string data, which can be of any length,
resides during this process, and how that string data is destroyed when
it is no longer needed?

Then perhaps you might explain in what way that is identical to passing
an Integer to Put().

>> But Ada also has unbounded strings:
>>
>> "Unbounded strings are allocated using heap memory, and are
>> deallocated automatically."
>
> Unbounded_String is practically never needed and discouraged to use.
> Because heap is a bad idea and because text processing algorithm almost
> never require changing length/content of a string.

It seems you've never written a text editor either!

James Harris

Jan 8, 2022, 4:41:50 PM

On 08/01/2022 19:54, Dmitry A. Kazakov wrote:

> On 2022-01-08 17:36, James Harris wrote:
>> On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
>>
>> ...
>>
>>> If this
>>>
>>> Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>>>
>>> is a problem in your language, then the job is not done.
>>
>> What's wrong with
>>
>> put_line("X=%i;, Y=%i;", X, Y)
>>
>> ?
>
> Untyped, unsafe, messy, non-portable garbage that does not work with
> user-defined types.

That's just wrong. It is typesafe, clean and portable. What's more, per
the suggestion I made to start this thread it will work with
user-defined types.

You should at least try to understand an idea before you dismiss it! ;-)

(The only way it would become type unsafe would be via such as

format_string = F()
put_line(format_string, X, Y)

and that does not have to be allowed.)
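
A minimal sketch of how such a check could work when the format is a literal, written in Python purely for illustration. The engine table, the function names and the rule "one engine letter maps to one type" are assumptions made for this example, not part of the proposal:

```python
# Hypothetical compile-time-style check: every %E...; field in a literal
# format string must agree with the static type of the matching argument.
# The engine-letter-to-type table below is an assumption for illustration.
ENGINE_TYPES = {"i": int, "f": float, "s": str}


def parse_fields(fmt):
    """Return the engine letter of each %E...; field, in order."""
    engines = []
    pos = 0
    while True:
        start = fmt.find("%", pos)
        if start == -1:
            return engines
        end = fmt.find(";", start)
        if end == -1 or end == start + 1:
            raise ValueError("unterminated or empty field at offset %d" % start)
        engines.append(fmt[start + 1])
        pos = end + 1


def check_format(fmt, arg_types):
    """Raise TypeError unless fields and argument types agree one-to-one."""
    engines = parse_fields(fmt)
    if len(engines) != len(arg_types):
        raise TypeError("format has %d fields but %d arguments were given"
                        % (len(engines), len(arg_types)))
    for index, (engine, arg_type) in enumerate(zip(engines, arg_types)):
        expected = ENGINE_TYPES.get(engine)
        if expected is None:
            raise TypeError("no rendering engine registered for %r" % engine)
        if not issubclass(arg_type, expected):
            raise TypeError("field %d (%%%s) expects %s, got %s"
                            % (index, engine, expected.__name__,
                               arg_type.__name__))
```

Under these assumptions check_format("X=%i;, Y=%i;", [int, int]) passes silently, while supplying [int, str] raises TypeError. A format obtained from F() at run time could not be checked this way, which is the case the paragraph above says need not be allowed.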

--
James Harris

James Harris

Jan 8, 2022, 4:45:55 PM

On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
> On 2022-01-08 17:29, James Harris wrote:
>> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>>> On 2022-01-02 17:06, James Harris wrote:
>>>
>>>> If you convert to strings then what reclaims the memory used by
>>>> those strings?
>>>
>>> What reclaims memory used by those integers?
>>
>> Depends on where they are:
>>
>> 1. Globals - reclaimed when the program exits.
>>
>> 2. Locals on the stack or in activation records - reclaimed at least
>> by the time the function exits.
>>
>> 3. Dynamic on the heap - management required.
>
> Now replace integer with string. The work done.

Strings are, in general, not of fixed length.

>
>> Your solution of creating string forms of values is reasonable in a
>> language and an environment which already have dynamic memory
>> management but not otherwise.
>
> You do not need that.
>
>>>> Not all languages have dynamic memory management, and dynamic memory
>>>> management is not ideal for all compilation targets.
>>>
>>> No dynamic memory management is required for handling temporary objects.
>>
>> Where would you put the string forms?
>
> The same place you put integer, float, etc. That place is called stack
> or LIFO.

Integer and Float are of fixed length.

>
>> Whether formatted or not, all IO tends to have higher costs than
>> computation and for most applications the cost of printing doesn't
>> matter. But when designing a language or a standard library it's a bad
>> idea to effectively impose a scheme which has a higher cost than
>> necessary because the language designer doesn't know what uses his
>> language will be put to.
>
> Did you do actual measurements?

Did you?

:-)

>
> printf obviously imposes higher costs than direct conversions. And also
> costs that cannot be easily optimized since the format can be an
> expression and even if a constant it is difficult to break down into
> direct conversions.
>

I am not defending printf.

--
James Harris

James Harris

Jan 8, 2022, 4:58:05 PM

On 08/01/2022 19:48, Dmitry A. Kazakov wrote:
> On 2022-01-08 17:50, James Harris wrote:
>> On 02/01/2022 18:40, Dmitry A. Kazakov wrote:
>>> On 2022-01-02 18:54, James Harris wrote:
>>>> On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
>>>
>>>>> If this
>>>>>
>>>>> Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>>>>>
>>>>> is a problem in your language, then the job is not done.
>>>>
>>>> That would be poor for small-machine targets. Shame on Ada! ;-)
>>>
>>> That works perfectly well on small targets. You seem unaware of what
>>> actually happens on I/O. For a "small" target it would be some sort
>>> of networking stack with a terminal emulation on top of it. Believe
>>> me, creating a temporary string on the stack is nothing in comparison
>>> to that. Furthermore, the implementation would likely no less
>>> efficient than printf which has no idea how large the result is and
>>> would have to reallocate the output buffer or keep it between calls
>>> and lock it from concurrent access. Locking on an embedded system is
>>> a catastrophic event because switching threads is expensive as hell.
>>> Note also that you cannot stream output, because networking protocols
>>> and terminal emulators are much more efficient if you do bulk
>>> transfers. All that is the infamous premature optimization.
>>
>> I don't know why you bring locking into it. It's neither necessary nor
>> relevant.
>
> Because this is how I/O works.

Why? Where there's no contention and just one task producing a certain
stream's output what is there to lock from?

I have even designed a lock-free way of allowing two or more tasks to
write to the same stream so I don't buy in to conventional wisdom of the
necessity for locks.

>
>> Furthermore, the only use for an output buffer is to make output more
>> efficient; it's not fundamental.
>
> It is fundamental, there is no hardware anymore where you could just
> send a single character to.

Of course there is. For example, a 7-segment display. Another: an async
serial port.

> A small target will write to the network
> stack, e.g. use socket send over TCP, that will coalesce output into
> network packets, these would be buffered into transport layer frames,
> these will go to physical layer packets etc.

In a couple of replies recently you've mentioned a communication stack.
I don't know why you are thinking of such a thing but not all
communication uses the OSI 7-layer model! ;-)

>
> There is no such thing as character stream without a massive overhead
> beneath it. So creating a string on the secondary stack is nothing in
> compare to that especially when you skip stream abstraction. Most
> embedded software do. They tend to do I/O directly in packets. E.g.
> sending application level packets over TCP or using UDP.
>
> Tracing, the only place where text output is actually used,

Tracing is 'the only place where text output is used'? :-o

--
James Harris

James Harris

Jan 8, 2022, 5:11:30 PM

On 08/01/2022 19:35, Dmitry A. Kazakov wrote:
> On 2022-01-08 17:18, James Harris wrote:
>> On 02/01/2022 18:25, Dmitry A. Kazakov wrote:
>>> On 2022-01-02 19:05, James Harris wrote:
>>>> On 28/12/2021 09:21, Dmitry A. Kazakov wrote:
>>
>> ...
>>
>>>>> The most flexible is a combination of a string that carries most of
>>>>> the information specific to the datatype (an OO method) and some
>>>>> commands to the rendering environment.
>>>>
>>>> That sounds interesting. How would it work?
>>>
>>> With single dispatch you have an interface, say, 'printable'. The
>>> interface has an abstract method 'image' with the profile:
>>>
>>> function Image (X : Printable) return String;
>>>
>>> Integer, float, string, whatever that has to be printable inherits to
>>> Printable and thus overrides Image. That is.
>>>
>>> The same goes with serialization/streaming etc.
>>
>> OK, that was my preferred option, too. The trouble with it is that it
>> needs somewhere to put the string form. And you know the problems
>> therewith (lack of recursion OR lack of thread safety OR dynamic
>> memory management).
>
> No idea why you think there is something special about string format or
> that any of the mentioned issues would ever apply. Conversion to string
> needs no recursion,

It does if a to-string function invokes another to-string function.

> is as thread safe as any other call, needs no
> dynamic memory management.

Unless you know the maximum size of the string (and you prohibit
recursion and you keep it thread local) then you cannot reserve space
for it in the activation record of the caller (or in global space). As
you know, if you create it in the activation record of the to-string
function then its memory will go out of scope when the function returns.

However, if the formatter is passed the value and the format (my
suggestion) then it (the formatter) can print the characters one by one
or could write them to a buffer - with the buffer being legitimately and
safely deallocated when the formatter returns.
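
To make that concrete, here is a rough sketch in Python (chosen only for illustration). The registry ENGINES, the engine signature (value, body, emit) and the "P" engine for a pair of integers are all assumptions invented for this example. The point it tries to show is that no intermediate string is ever returned: each engine pushes characters straight to a sink.

```python
import sys


def render_int(value, body, emit):
    """Engine for 'i': emit decimal digits one character at a time.

    `body` is the text between the engine letter and ';' (ignored here).
    """
    if value < 0:
        emit("-")
        value = -value
    # Find the largest power of ten not exceeding value, then peel digits.
    divisor = 1
    while value // divisor >= 10:
        divisor *= 10
    while divisor > 0:
        emit(chr(ord("0") + value // divisor))
        value %= divisor
        divisor //= 10


def render_point(value, body, emit):
    """Hypothetical user-defined engine that reuses another engine."""
    emit("(")
    render_int(value[0], "", emit)
    emit(",")
    render_int(value[1], "", emit)
    emit(")")


# Assumed registry mapping engine letters to rendering functions.
ENGINES = {"i": render_int, "P": render_point}


def put(fmt, args, emit=sys.stdout.write):
    """Copy ordinary characters verbatim; outsource each %E...; field."""
    remaining = iter(args)
    pos = 0
    while pos < len(fmt):
        ch = fmt[pos]
        if ch != "%":
            emit(ch)
            pos += 1
            continue
        end = fmt.index(";", pos)          # ValueError if ';' is missing
        engine, body = fmt[pos + 1], fmt[pos + 2:end]
        ENGINES[engine](next(remaining), body, emit)
        pos = end + 1
```

Passing a list's append method as the sink collects the characters instead of printing them, which is the "write them to a buffer" variant. Field widths and alignment are deliberately left out of this sketch.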

>
> There should be no format specifications at all. You just need a few
> parameters for Image regarding type-specific formatting and a few
> parameters regarding rendering context in the actual output call.
>
> The former are like put + if positive, base, precision etc; the latter
> are like output field width, alignment, fill character etc.

That sounds good. Perhaps other things should be added: fixed or
floating sign, leading or trailing sign, different potential
representations of bases, digit grouping, fixed-point scaling, response
to exceeding a field width, etc.
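
A small sketch of that split, in Python for illustration only. The function names image and field, and their parameter names, are assumptions for this example; image takes the type-specific parameters (sign policy, base) and field takes the rendering-context ones (width, alignment, fill character):

```python
DIGITS = "0123456789ABCDEF"


def image(value, base=10, plus=False):
    """Type-specific conversion of an integer: sign policy and base."""
    if not 2 <= base <= 16:
        raise ValueError("base must be in 2..16")
    magnitude = abs(value)
    digits = ""
    while True:
        digits = DIGITS[magnitude % base] + digits
        magnitude //= base
        if magnitude == 0:
            break
    sign = "-" if value < 0 else ("+" if plus else "")
    return sign + digits


def field(text, width=0, align="right", fill=" "):
    """Rendering-context step: place text in a field; never truncates."""
    if align == "left":
        return text.ljust(width, fill)
    if align == "center":
        return text.center(width, fill)
    return text.rjust(width, fill)
```

For example, field(image(255, base=16), width=6, fill="0") gives "0000FF". The extra items listed above (digit grouping, trailing sign, overflow policy and so on) would become further parameters on one side or the other.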

--
James Harris

Dmitry A. Kazakov

Jan 8, 2022, 5:57:25 PM

On 2022-01-08 23:11, James Harris wrote:
> On 08/01/2022 19:35, Dmitry A. Kazakov wrote:

>> No idea why you think there is something special about string format
>> or that any of the mentioned issues would ever apply. Conversion to
>> string needs no recursion,
>
> It does if a to-string function invokes another to-string function.

Recursion is when a function calls itself not another function.

>> is as thread safe as any other call, needs no dynamic memory management.
>
> Unless you know the maximum size of the string

Again, that is not required. You should really take a look at how stacks
work. You can return an indefinite object without prior knowledge of its
size until actual return. It is no rocket science.

> However, if the formatter is passed the value and the format (my
> suggestion) then it (the formatter) can print the characters one by one

It cannot.

1. Check how I/O works

2. Observe output fields, alignments, padding, columns of output

Dmitry A. Kazakov

Jan 8, 2022, 6:07:16 PM

How does the file system know that there is only one task accessing the stream?

> I have even designed a lock-free way

Lock-free does not mean no locking. It only means that synchronization
is achieved by busy waiting rather than by signaling an event.

>>> Furthermore, the only use for an output buffer is to make output more
>>> efficient; it's not fundamental.
>>
>> It is fundamental, there is no hardware anymore where you could just
>> send a single character to.
>
> Of course there is. For example, a 7-segment display. Another: an async
> serial port.

The hardware used in actual applications. Even for a UART it is not
true, consider software handshaking as a counter-example. In practice
serial protocols need a considerable amount of locking, sometimes even
hard timeouts. As an example consider MODBUS RTU. It is a serial
protocol. Even a simple modem requires locking, or for that matter a
serial dot-matrix printer if you find one.

>> A small target will write to the network stack, e.g. use socket send
>> over TCP, that will coalesce output into network packets, these would
>> be buffered into transport layer frames, these will go to physical
>> layer packets etc.
>
> In a couple of replies recently you've mentioned a communication stack.
> I don't know why you are thinking of such a thing but not all
> communication uses the OSI 7-layer model! ;-)

And the overhead coming from each of the layers.

Dmitry A. Kazakov

Jan 8, 2022, 6:12:53 PM

On 2022-01-08 22:05, Bart wrote:
> On 08/01/2022 20:49, Dmitry A. Kazakov wrote:
>> On 2022-01-08 21:22, Bart wrote:
>>
>>> This didn't work (I guess ' has higher precedence than +?). But
>>> neither did:
>>>
>>> Put_Line ((X+Y)'Image);
>>>
>>> 'Image' can only be applied to a name, not an expression; why?
>>
>> Because 'Image is a type attribute:
>>
>> <subtype>'Image (<value>)
>
> And yet it works with X'Image when X is a variable, not a type.

That is another attribute. The type of a variable is known. The type of
an expression is not.

>>> You need to be able to just do this:
>>>
>>> println X+Y
>>
>> Nope, I don't need that at all. In Ada it is just this:
>>
>> Put (X + Y);
>
> So it's overloading Put() with different types. But the language doesn't
> similarly overload Put_Line()?

Why should it? It never happens that one data point is printed per line,
except when the whole line is printed and then that is a string.

>> See the package Integer_IO (ARM A.10.8). The point is that is is
>> almost never used, because, again, not needed for real-life software.
>
> Huh? Have you never written data to a file?

Not this way.

Besides it is about formatted output, not about data. Data are written
in binary formats from simple to very complex like in the databases.

Formatted output is performed into a string buffer which is then output.
Always, no exceptions.

Dmitry A. Kazakov

Jan 8, 2022, 6:20:30 PM

On 2022-01-08 22:41, James Harris wrote:
> On 08/01/2022 19:54, Dmitry A. Kazakov wrote:
>> On 2022-01-08 17:36, James Harris wrote:
>>> On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
>>>
>>> ...
>>>
>>>> If this
>>>>
>>>> Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>>>>
>>>> is a problem in your language, then the job is not done.
>>>
>>> What's wrong with
>>>
>>> put_line("X=%i;, Y=%i;", X, Y)
>>>
>>> ?
>>
>> Untyped, unsafe, messy, non-portable garbage that does not work with
>> user-defined types.
>
> That's just wrong. It is typesafe,

So the text "%i" is checked against the type of X during compile-time?

> clean

Obviously not. Counting formats and argument positions, come on!

> and portable.

Varlists are inherently non-portable.

> What's more, per
> the suggestion I made to start this thread it will work with
> user-defined types.

The suggestion made no sense being a low-level mess lacking minimal
safety checks.

Dmitry A. Kazakov

Jan 8, 2022, 6:36:07 PM

On 2022-01-08 22:18, Bart wrote:
> On 08/01/2022 21:04, Dmitry A. Kazakov wrote:
>> On 2022-01-08 21:46, Bart wrote:
>>> On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
>>>> On 2022-01-08 17:29, James Harris wrote:
>>>
>>>>>> No dynamic memory management is required for handling temporary
>>>>>> objects.
>>>>>
>>>>> Where would you put the string forms?
>>>>
>>>> The same place you put integer, float, etc. That place is called
>>>> stack or LIFO.
>>>
>>> (1) integer, float etc are a fixed size known at compile-time
>>
>> So what?
>>
>>> (2) integer, float etc are usually manipulated by value
>>
>> Irrelevant.
>
> Relevant because you are suggesting that strings can be manipulated just
> like a 4- or 8-byte primitive type.

Regarding memory management they can.

>>> Apparently Ada strings have a fixed length.
>>
>> Apparently not:
>>
>> function Get_Line (File : File_Type) return String;
>
> Yet I can't do this:
>
> S: String;

Yet you cannot. So?

> "unconstrained subtype not allowed". It needs a size or to be
> initialised from a literal of known length.

Wrong:

S : String := Get_Line (Standard_Input); -- Perfectly legal

>> To widen your horizon a little bit, a stack LIFO can be implemented by
>> many various means: using machine stack, using machine registers,
>> using thread local storage as well as various combinations of.
>
> Suppose you have this:
>
> Put_Line(Get_Line(...));
>
> Can you go into some detail as to what, exactly, is passed back from
> Get_Line(),

That depends on the compiler. Get_Line could allocate a doped string
vector on the secondary stack and return a reference to it on the
primary stack. Or the dope and reference on the primary stack and the
body on the secondary stack.

> what, exactly, is passed to Put_Line(),

Reference to the vector on the secondary stack or else the dope and the
reference.

> bearing in mind that
> 64-bit ABIs frown on passing by value any args more than 64-bits, and
> where, exactly, the actual string data, which can be of any length,
> resides during this process, and how that string data is destroyed when
> it is no longer needed?

The object's length is computable from the vector's dope. E.g.

8 + high-bound - low-bound + 1 + rounding

assuming 32-bit bounds. The secondary stack containing arguments of
Put_Line is popped upon return from it.
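
As a worked instance of that formula (a sketch in Python for illustration; the layout of two 32-bit bounds followed by the characters, and the 4-byte rounding, are assumptions made here, since real compilers may differ):

```python
def doped_string_size(low, high, alignment=4):
    """Bytes occupied by a string laid out as
    [low: int32][high: int32][characters...], rounded up to `alignment`.

    The alignment default is an assumption for this example.
    """
    dope = 8                            # two 32-bit bounds
    length = max(high - low + 1, 0)     # empty string when high < low
    raw = dope + length
    return (raw + alignment - 1) // alignment * alignment
```

With bounds 1..5 this gives 8 + 5 = 13 bytes, rounded up to 16. That is the amount by which the stack top would move back when the object is popped.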

> Then perhaps you might explain in what way that is identical to passing
> a Integer to Put().

Just same. Push arguments on the stack, pop it upon return. In the case
of Integer it could be the primary stack instead of secondary stacks.
Which one GNAT uses for the Ada calling convention I don't know.

>>> But Ada also has unbounded strings:
>>>
>>> "Unbounded strings are allocated using heap memory, and are
>>> deallocated automatically."
>>
>> Unbounded_String is practically never needed and discouraged to use.
>> Because heap is a bad idea and because text processing algorithm
>> almost never require changing length/content of a string.
>
> It seems you've never written a text editor either!

On the contrary. I wrote text editors in FORTRAN-IV which had no
character type at all. The text buffer was LOGICAL*1 allocated
programmatically in a single huge static array.

Hint, it is an incredibly bad idea to use Unbounded_String for a text
buffer.

Bart

Jan 8, 2022, 6:36:26 PM

On 08/01/2022 23:12, Dmitry A. Kazakov wrote:
> On 2022-01-08 22:05, Bart wrote:
>> On 08/01/2022 20:49, Dmitry A. Kazakov wrote:
>>> On 2022-01-08 21:22, Bart wrote:
>>>
>>>> This didn't work (I guess ' has higher precedence than +?). But
>>>> neither did:
>>>>
>>>> Put_Line ((X+Y)'Image);
>>>>
>>>> 'Image' can only be applied to a name, not an expression; why?
>>>
>>> Because 'Image is a type attribute:
>>>
>>> <subtype>'Image (<value>)
>>
>> And yet it works with X'Image when X is a variable, not a type.
>
> That is another attribute. The type of a variable is known. The type of
> an expression is not.

But it is known with Put(X+Y) ?

>
>>>> You need to be able to just do this:
>>>>
>>>> println X+Y
>>>
>>> Nope, I don't need that at all. In Ada it is just this:
>>>
>>> Put (X + Y);
>>
>> So it's overloading Put() with different types. But the language
>> doesn't similarly overload Put_Line()?
>
> Why should it? It is never happens to print one data point per line
> except when the whole line is printed and then that is a string.

The common-sense way of doing this is to define Put/Putln to work
identically, but the latter writes a newline at the end.

I'm not sure where you get the idea that one item per line never happens
unless it's a string. If I wanted to output N numbers, I can do N calls
to Put, but each has to be followed by Put_Line("") or Put_Line(" ") to
separate them? Instead of just doing N calls of Put_Line(x).

It's an unnecessary restriction. (Have a look at a FizzBuzz program.)

The most convenient for the user is to be able to print any number of
items per line, of mixed types. Which is exactly how my Println works.

>>> See the package Integer_IO (ARM A.10.8). The point is that is is
>>> almost never used, because, again, not needed for real-life software.
>>
>> Huh? Have you never written data to a file?
>
> Not this way.
>
> Besides it is about formatted output, not about data. Data are written
> in binary formats from simple to very complex like in the databases.
>
> Formatted output is performed into a string buffer which is then output.
> Always, no exceptions.

Nonsense. There are no rules for what someone may want to write to a
text file. Here's a program to write N random numbers to a file, of
which the first line is N:

    n:=random(1 million)
    println @f, n
    to n do
        println @f, random(0)
    end

Um, just one number per line...

There are also files with mixed text and binary content, eg. PGM files.

Dmitry A. Kazakov

Jan 8, 2022, 6:39:52 PM

On 2022-01-08 22:45, James Harris wrote:
> On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
>> On 2022-01-08 17:29, James Harris wrote:
>>> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>>>> On 2022-01-02 17:06, James Harris wrote:
>>>>
>>>>> If you convert to strings then what reclaims the memory used by
>>>>> those strings?
>>>>
>>>> What reclaims memory used by those integers?
>>>
>>> Depends on where they are:
>>>
>>> 1. Globals - reclaimed when the program exits.
>>>
>>> 2. Locals on the stack or in activation records - reclaimed at least
>>> by the time the function exits.
>>>
>>> 3. Dynamic on the heap - management required.
>>
>> Now replace integer with string. The work done.
>
> Strings are, in general, not of fixed length.

So what? Length is irrelevant when dealing with a stack. You push a
chunk, you pop a chunk.

>>> Whether formatted or not, all IO tends to have higher costs than
>>> computation and for most applications the cost of printing doesn't
>>> matter. But when designing a language or a standard library it's a
>>> bad idea to effectively impose a scheme which has a higher cost than
>>> necessary because the language designer doesn't know what uses his
>>> language will be put to.
>>
>> Did you do actual measurements?
>
> Did you?

It is not my claim. The burden of proof is on your side.

>> printf obviously imposes higher costs than direct conversions. And
>> also costs that cannot be easily optimized since the format can be an
>> expression and even if a constant it is difficult to break down into
>> direct conversions.
>
> I am not defending printf.

Promoting undefendable, even better? (:-))

Dmitry A. Kazakov

Jan 8, 2022, 7:00:12 PM

On 2022-01-09 00:36, Bart wrote:
> On 08/01/2022 23:12, Dmitry A. Kazakov wrote:
>> On 2022-01-08 22:05, Bart wrote:
>>> On 08/01/2022 20:49, Dmitry A. Kazakov wrote:
>>>> On 2022-01-08 21:22, Bart wrote:
>>>>
>>>>> This didn't work (I guess ' has higher precedence than +?). But
>>>>> neither did:
>>>>>
>>>>> Put_Line ((X+Y)'Image);
>>>>>
>>>>> 'Image' can only be applied to a name, not an expression; why?
>>>>
>>>> Because 'Image is a type attribute:
>>>>
>>>> <subtype>'Image (<value>)
>>>
>>> And yet it works with X'Image when X is a variable, not a type.
>>
>> That is another attribute. The type of a variable is known. The type
>> of an expression is not.
>
> But it is known with Put(X+Y) ?

Yes, because Put tells the type it expects.

>>>>> You need to be able to just do this:
>>>>>
>>>>> println X+Y
>>>>
>>>> Nope, I don't need that at all. In Ada it is just this:
>>>>
>>>> Put (X + Y);
>>>
>>> So it's overloading Put() with different types. But the language
>>> doesn't similarly overload Put_Line()?
>>
>> Why should it? It is never happens to print one data point per line
>> except when the whole line is printed and then that is a string.
>
> The common-sense way of doing this is to define Put/Putln to work
> identically, but the latter writes a newline as the end.

There is no common sense doing things that are not needed!

> I'm not sure where you get the idea that one item per line never happens
> unless it's a string. If I wanted to output N numbers, I can do N calls
> to Put, but each has to be followed by Put_Line("") or Put_Line(" ") to
> separate them? Instead of just doing N calls of Put_Line(x).

When reasonable people print arrays they do that in columned output.

> It's an unnecessary restriction. (Have a look at a FizzBuzz program.)

It is no restriction, it is a feature nobody needed. You can easily
implement it yourself:

   procedure Put_Line (X : Integer) is
   begin
      New_Line;
      Put (X);
   end Put_Line;

Ada features are voted by a committee. This one never made it, if it was
ever suggested, which I doubt.

>>>> See the package Integer_IO (ARM A.10.8). The point is that is is
>>>> almost never used, because, again, not needed for real-life software.
>>>
>>> Huh? Have you never written data to a file?
>>
>> Not this way.
>>
>> Besides it is about formatted output, not about data. Data are written
>> in binary formats from simple to very complex like in the databases.
>>
>> Formatted output is performed into a string buffer which is then
>> output. Always, no exceptions.
>
> Nonsense. There are no rules for what someone may want to write to a
> text file.

Of course there are rules, stated above. You may follow them or not.

> There are also files with mixed text and binary content, eg. PGM files.

Once binary always binary.

(Talking to you I begin sounding like a damned psychoanalyst! (:-))

Bart

Jan 8, 2022, 7:15:25 PM

On 08/01/2022 23:36, Dmitry A. Kazakov wrote:

> On 2022-01-08 22:18, Bart wrote:

>> Relevant because you are suggesting that strings can be manipulated
>> just like a 4- or 8-byte primitive type.
>
> Regarding memory management they can.
>

>> Suppose you have this:
>>
>> Put_Line(Get_Line(...));
>>
>> Can you go into some detail as to what, exactly, is passed back from
>> Get_Line(),
>
> That depends on the compiler. Get_Line could allocate a doped string
> vector on the secondary stack and return a reference to it on the
> primary stack. Or the dope and reference on the primary stack and the
> body on the secondary stack.
>
>> what, exactly, is passed to Put_Line(),
>
> Reference to the vector on the secondary stack or else the dope and the
> reference.
>
>> bearing in mind that 64-bit ABIs frown on passing by value any args
>> more than 64-bits, and where, exactly, the actual string data, which
>> can be of any length, resides during this process, and how that string
>> data is destroyed when it is no longer needed?
>
> The object's length computable from the vector's dope. E.g.
>
> 8 + high-bound - low-bound + 1 + rounding
>
> assuming 32-bit bounds. The secondary stack containing arguments of
> Put_Line is popped upon return from it.
>
>> Then perhaps you might explain in what way that is identical to
>> passing a Integer to Put().
>
> Just same. Push arguments on the stack, pop it upon return. In the case
> of Integer it could be the primary stack instead of secondary stacks.
> Which one uses GNAT for Ada calling convention I don't know.

I think you demonstrated above that it is entirely different, it is not
just pushing one 64-bit value:

* Doped string vectors
* References
* Secondary stacks (which appear to be a kind of heap but with
stack-like allocations)

It's also not clear, as James pointed out, how a string created on a
secondary stack in a called function manages to move to the secondary
stack of the caller, since in between there may be secondary stack space
containing local data for the called function (plus possible secondary
stack space pertaining to the caller's argument list).

This is also highly specific to machine, language and ABI.

In other words, nothing like as simple as passing an integer.

> [...] in a single huge static array.
>
> Hint, it is an incredible bad idea to use Unbounded_String for a text
> buffer.

The basis on which most programs work is that memory is a huge array of
mutable bytes. Also, most of the available data memory to a program
(99%) will be in the form of heap memory.

Dmitry A. Kazakov

Jan 9, 2022, 6:57:43 AM

On 2022-01-09 01:15, Bart wrote:

> It's also not clear, as James pointed out, how a string created on a
> secondary stack in a called function manages to move to the secondary
> stack of the caller,

One technique is to use two stacks and swap them over. The arguments and
locals stack vs the results stack so that the callee's arguments and
locals stack is the caller's results stack. The call sequence would be this:

push arguments onto S1
call and swap stacks
push locals on S2 (former S1)
push result onto S1 (former S2)
return and swap stacks
pop S1

Now the result of the call is on the arguments and locals stack as expected.
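
The sequence above can be simulated in a few lines. This is only a sketch in Python with lists standing in for the two stacks; the class TwoStacks, the helper call and the callee repeat_char are names invented for this example, and a real implementation would work on raw memory rather than on list elements:

```python
class TwoStacks:
    """S1 receives outgoing arguments; S2 holds our own arguments and locals."""

    def __init__(self):
        self.s1, self.s2 = [], []

    def swap(self):
        self.s1, self.s2 = self.s2, self.s1


def call(stacks, callee, args):
    """Run the six steps of the call sequence for one call."""
    mark = len(stacks.s1)        # where the callee's frame will start
    stacks.s1.extend(args)       # push arguments onto S1
    stacks.swap()                # call and swap stacks
    callee(stacks, len(args))    # callee: locals on its S2, result on its S1
    stacks.swap()                # return and swap stacks
    del stacks.s1[mark:]         # pop S1: arguments and callee locals vanish
    # Whatever the callee pushed as its result is now on top of our S2.


def repeat_char(stacks, argc):
    """Hypothetical callee returning a result of run-time-determined size.

    Its arguments (char, count) sit on its S2; the result is pushed onto
    its S1, which is the caller's arguments-and-locals stack.
    """
    char, count = stacks.s2[-argc:]
    stacks.s2.append("scratch local")    # push locals on S2 (former S1)
    result = [char] * count
    stacks.s1.extend(result)             # push result onto S1 (former S2)
    stacks.s1.append(len(result))        # length on top, like a tiny dope
```

After call(stacks, repeat_char, ["x", 3]) the caller's S1 is back to its previous height and its S2 has grown by the three characters plus their length, without the caller having known that size beforehand.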

> This is also highly specific to machine, language and ABI.

This is absolutely non-specific. A machine may provide some support, but
there is no problem if it does not.

> In other words, nothing like as simple as passing an integer.

It is exactly same stack handling for everything. Surely some
optimization like using registers for the stack tops is possible and
welcome.

>> Hint, it is an incredible bad idea to use Unbounded_String for a text
>> buffer.
>
> The basis on which most programs work is that memory is a huge array of
> mutable bytes.

This has nothing to do with the data structures used to implement text
buffers.

The text buffers have certain requirements like effective insertion and
removing portions of text as well as effective text tagging. A character
array is incredibly miserable on these.

So, the short answer: just don't. The long answer: read the literature on
the subject and study existing implementations of text buffers.
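
One structure that literature describes is the gap buffer (the one Emacs uses); it is named here as an example and is not something specified in this post. A minimal sketch in Python, with all names being assumptions for illustration: the text is kept in one array with a movable hole at the edit point, so repeated insertions and deletions near the same place touch only a few elements.

```python
class GapBuffer:
    """Text stored as [before-gap][gap][after-gap] in a single list."""

    def __init__(self, text="", gap=16):
        self.buf = list(text) + [None] * gap
        self.gap_start = len(text)
        self.gap_end = len(self.buf)

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def _move_gap(self, pos):
        """Slide the gap so that it starts at text position pos."""
        if not 0 <= pos <= len(self):
            raise IndexError("position out of range")
        while self.gap_start > pos:          # move gap left
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:          # move gap right
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def _grow(self, extra):
        """Enlarge the gap by inserting `extra` free slots at its end."""
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def insert(self, pos, text):
        self._move_gap(pos)
        if self.gap_end - self.gap_start < len(text):
            self._grow(max(len(text), len(self.buf)))
        for ch in text:
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete(self, pos, count):
        """Remove up to count characters from pos by widening the gap."""
        self._move_gap(pos)
        self.gap_end = min(self.gap_end + count, len(self.buf))

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])
```

Moving the gap still costs time proportional to the distance moved, so this is a sketch of one trade-off rather than a recommendation; ropes and piece tables are other options found in existing editors.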

Bart

Jan 9, 2022, 11:19:44 AM

On 09/01/2022 11:57, Dmitry A. Kazakov wrote:
> On 2022-01-09 01:15, Bart wrote:
>
>> It's also not clear, as James pointed out, how a string created on a
>> secondary stack in a called function manages to move to the secondary
>> stack of the caller,
>
> One technique is to use two stacks and swap them over. The arguments and
> locals stack vs the results stack so that the callee's arguments and
> locals stack is the caller's results stack. The call sequence would be
> this:
>
> push arguments onto S1
> call and swap stacks
> push locals on S2 (former S1)
> push result onto S1 (former S2)
> return and swap stacks
> pop S1
>
> Now the result of the call is on the arguments and locals stack as
> expected.
>
>> This is also highly specific to machine, language and ABI.
>
> This is absolutely non-specific. A machine may provide some support, but
> there is no problem if it does not.

It depends on how the language does things. It depends on the ABI. Which
in turn also depends on the machine.

>> In other words, nothing like as simple as passing an integer.
>
> It is exactly same stack handling for everything.

Keep on saying that, but you're wrong. A simple 64-bit number and a
complex data structure which even you admit can be implemented in
different ways, are not the same thing!

> Text buffers have certain requirements, like efficient insertion and
> removal of portions of text as well as efficient text tagging. A character
> array is incredibly miserable at these.

My current editor uses a list of strings (and is implemented as interpreted
code). There's no special way to delete or insert elements, so removing
or adding a line is not that efficient. But you won't really notice
until you get to 1 or 2 million lines of text. Most files I edit are a
few thousand lines.
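
As a rough illustration of that list-of-strings layout (a Python sketch, not the editor's actual code; `LineBuffer` and its method names are invented here), inserting or deleting a line shifts every later element, which is linear in the number of lines but cheap at a few thousand:

```python
class LineBuffer:
    """Text buffer stored as a plain list of lines (hypothetical illustration)."""

    def __init__(self, text=""):
        self.lines = text.split("\n")

    def insert_line(self, index, line):
        # list.insert shifts all later elements: O(n) in the number of lines.
        self.lines.insert(index, line)

    def delete_line(self, index):
        # Removing a line also shifts the tail down by one slot.
        return self.lines.pop(index)

    def text(self):
        return "\n".join(self.lines)


buf = LineBuffer("first\nthird")
buf.insert_line(1, "second")
removed = buf.delete_line(0)
print(buf.text())
```

Structures such as gap buffers, ropes or piece tables avoid the linear shift, at the cost of more code.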

Andy Walker

Jan 9, 2022, 5:05:46 PM

On 08/01/2022 17:15, James Harris wrote:
>>> If you are not convinced by formatted io then what kind of io do
>>> you prefer?
>> Unformatted transput, of course. Eg,
>> print this, that, these, those and the many other things
> OK. I would say that that has /default/ formatting but is still
> formatted.

You can say that if you insist, but ISTM to be an abuse of
language. The point about /formatted/ transput is that the user
prescribes some "mould" into which objects such as integers [for
output] or strings [for input] are "poured" for conversion. So
we get calls like "printf (some_mould, some_value)". It's a
useful distinction from unformatted transput, where the user does
not specify any mould.

[...]
> You have a preferred syntax in which users can express how they want
> values to be formatted? Suggestions welcome!

The whole point would be that unformatted transput doesn't
/have/ a syntax! It just uses the otherwise-existing syntax of the
language. So, no, I don't have a preferred syntax. [Admittedly,
in the case of C, the dividing line is blurred, as C's formatting
strings are part of the library; you could in principle write your
own "printf" procedure from scratch with its own semantics. But
no-one does, as it's a lot of work for little reward. Some other
languages do have special syntax for formats.]

[...]
> The design of the language supports dynamically sized strings. That's
> fine for applications which need them. But that's different from
> imposing such strings and the management thereof on every print
> operation.

How does that differ in principle from "my language has
numbers, but that's different from imposing them on every
arithmetic operation"? The string [array of characters] is the
natural unit of both input and output. The whole point of
transput is to convert values, such as integers [or dates], to
strings for output, or conversely to parse strings into integers
[or dates] for input. If your language has strings, then what
is the objection to using them? If not, then you have to
re-write your conversion/parsing procedures to work character
by character; it's not difficult, but it's extra work and it
means that you can't [easily] use intermediate stages [such as
constructing an array of strings and sorting them before output].

[...]
>> You're making mountains out of
>> molehills.
> Oh? How would you handle
> widget w
> print(w)
> ?
> ISTM you are suggesting that w be converted to a string (or array of
> characters) and then printed.

What on earth else would you expect to happen? That's
quite separate from whether a "format" has to be invoked. That's
merely the choice between:

widget w; format f = [whatever]; printf (f, w)

and

widget w; proc g = [whatever]; print (g(w))

where "f" is a "mould" that converts widgets to strings and "g" is
a procedure that converts widgets to strings. The difference is
that "f" needs a special "little language" and "g" doesn't.
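
A minimal sketch of the second choice, in Python rather than Algol (the `Widget` fields, `widget_to_string` and `render` are all assumptions made up for illustration): "g" is an ordinary procedure built from the language's existing operations, and the print routine only ever sees strings.

```python
from dataclasses import dataclass


@dataclass
class Widget:
    name: str
    width: int
    height: int


def widget_to_string(w):
    # The "g" of the text: plain code, no format mini-language.
    return w.name + "[" + str(w.width) + "x" + str(w.height) + "]"


def render(*items):
    # Unformatted output: convert each item with the default str() and join.
    return "".join(str(item) for item in items)


w = Widget("button", 80, 24)
line = render("widget: ", widget_to_string(w), " ok")
print(line)
```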

[...]
> a. Where would you store the array of characters?

You get an array of characters either way. If your
language has strings that's it. If not, then both "printf"
and "print" in the above notation can be defined to produce
[and print or read] the strings character by character if
that's what your language needs. There's no /important/
difference.

> b. What's wrong with
> print "String with %v; included" % val

Nothing, except that "%" has become a privileged
character [with two different syntaxes in your example].
But why should anyone prefer it to

print "String with ", val, " included"

[if the default format is OK for you, or replace "val" by
some "g(val)" otherwise]?

[...]
> Cool. But if you agree with my suggestion of a means which can be
> used to render each character in sequence I don't know why you
> suggested conversion to an array of characters.

It's more flexible, and you need the conversion process
anyway. As above, arrays of characters are the natural units;
as in "Hello, world!", which you don't usually, when writing C,
think of as "H" followed by "e" followed by "l", ... [though you
obviously can if you want to, and at a lower level the code for
"printf" is indeed very likely to step through the string].

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Bizet

Andy Walker

Jan 9, 2022, 8:08:12 PM

On 07/01/2022 18:14, Bart wrote:
[I wrote:]
>> [...] I don't know why you and James are so opposed to the use
>> of heap storage [and temporary files, if you really want strings
>> that are many gigabytes]?
> Because heap storage requires a more advanced language to manage
> properly, ie. automatically. (I don't care for speculative GC
> methods.)

Oh. Well, a language either provides heap storage or it
does not. Even C does, even if in an unsafe and rather primitive
way. The techniques involved aren't exactly cutting-edge recent.
For the limited requirements of strings-for-transput, procedures
such as [in C terms] "malloc" and "free" are entirely sufficient
and safe. If you have control [as you do for your own language]
of the compiler, then you can implement strings-as-results even
without any off-stack storage [simply copy them, or indeed any
data structures that don't involve pointers, down into the stack
space of the calling procedure as part of exiting the returning
function]. But of course it's easier, from the PoV of the
programmer, if heaps and strings are already built in. [Of
course, as discussed elsewhere in the thread, strings aren't
necessary either, merely convenient.]

So what you actually seem to be saying is that your [or
James's] language "should" be more primitive than C, and should
be incapable of implementing reasonably general data structures,
such as trees, that require heaps, or near equivalents. That
would be a pretty definite deal-breaker for me. I can understand
that you might want a better implementation than C's; but that's
another [and perhaps inspiring] matter.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music

Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Couperin

Bart

Jan 9, 2022, 8:25:29 PM

On 09/01/2022 22:05, Andy Walker wrote:
> On 08/01/2022 17:15, James Harris wrote:

>> b. What's wrong with
>> print "String with %v; included" % val
>
> Nothing, except that "%" has become a privileged
> character [with two different syntaxes in your example].
> But why should anyone prefer it to
>
> print "String with ", val, " included"

There are reasons why people would. Most of my Prints are ordinary print
statements like your example, except that they inject spaces between
items (wherever "," is written, with ",," used to suppress that space as
seen below).

But sometimes I decide to use formatted print, or 'fprint', rather than
'print'. 'fprint' starts with a format string. Just see for yourself
whether my choice was justified:

fprint "operator (#)(#,#)#", jtagnames[i]+2, strmode(p.amode),
strmode(p.bmode), strmode(p.rmode)

Using ordinary print, it needs to be written like this:

print "operator (",, jtagnames[i]+2,, ")(",, strmode(p.amode),, ",",,
strmode(p.bmode),, ")",, strmode(p.rmode)

(Typical output is 'operator (+)(int, int)int')

A simpler example:

fprint "[#..#]", ttlower[m], ttlength[m]+ttlower[m]-1

is written unformatted as:

print "[",, ttlower[m],, "..",, ttlength[m]+ttlower[m]-1,, "]"

And a final one:

fprint "Refbit #:# (#,#)", ttname[p.uref.elemtag], p.uref.ptr,
p.uref.bitoffset, p.uref.bitlength

which I won't bother translating...

The 'fprint' versions give a nice overall picture of the layout of the
line, with variable parts represented by '#'. It is very easy to get
that right and to maintain.

Doing it with normal 'print' is obviously possible, but it's much more
fiddly and much harder to see the result without running the program,
which then usually needs multiple tweaks to get just right.
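
The core of such a '#'-placeholder print can be sketched in a few lines. This is a Python illustration only, not how fprint is actually implemented; `fformat` is a made-up name, and items are converted with Python's default `str()`:

```python
def fformat(fmt, *items):
    """Replace each '#' in fmt with the next item; copy everything else verbatim."""
    out = []
    pending = list(items)
    for ch in fmt:
        if ch == "#":
            if not pending:
                raise ValueError("more '#' placeholders than items")
            out.append(str(pending.pop(0)))
        else:
            out.append(ch)
    return "".join(out)


print(fformat("operator (#)(#,#)#", "+", "int", "int", "int"))
print(fformat("[#..#]", 1, 10))
```

The layout of the line is visible in the format string itself, which is the readability argument made above.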

Sometimes I even use it in simple cases like this (but with somewhat
more elaborate expressions):

fprint "# #", a, b

even though this is exactly equivalent to 'print a, b'. Why? Well, it
immediately tells me that that is the case; compare with:

print longtable[i+2, f(j-2, 1)], longtable[a, b]

where it's not immediately clear that there are only two print items!

I can very easily control the spacing between items; unspaced is "##",
spaced is "# #".

But also, I can instantly change the format to anything else, eg. an
extra space is "# #"; with a comma as well it's "#, #". Try that with
my example:

print longtable[i+2, f(j-2, 1)], ", ", longtable[a, b]

Notice the comma blends in with all the others!

Another advantage is being able to use a generic format like "(#, #,
#)", and apply it to a sequence of 3-element prints, if I wanted them to
all be displayed in the same style. And to change the style, I change it
in one place.

Still not convinced? This is an easy and low-cost language feature for
the extra convenience provided.

Bart

Jan 9, 2022, 8:31:47 PM

Actually, both my languages use similar print features: both have
print/println and fprint/fprintln that work in the same way.

Yet one of them does have first class strings and automatic memory
management. It just means it can print more complex types automatically,
and it is more practical to implement your explicit g() to-string functions.

Some of the discussion was about how to make possible some of that extra
capability in a lower-level language.

Clearly any solutions in my case are not going to be about upgrading the
GC features of the other language; it has to work within that level of
language.

Andy Walker

Jan 11, 2022, 6:36:54 PM

On 10/01/2022 01:25, Bart wrote:
[I wrote:]
>> But why should anyone prefer [James's proposal] to

>> print "String with ", val, " included"

[...]

> But sometimes I decide to use formatted print, or 'fprint', rather
> than 'print'. 'fprint' starts with a format string. Just see for
> yourself whether my choice was justified:

[... examples snipped ...]

Obviously, if your language has /both/ "print" and "printf"
/and/ you're well versed in the details of both, then you can and
should use whichever suits you in particular cases. Most users
are not so familiar with the details of whatever languages they're
using, and have no control over what the language provides. So
the more important question is whether a /new/ language /should/
have both, and if not which is the one that should go. Bearing
in mind that "printf" necessitates the invention of a "little
language" [or not so little!] and perhaps additional syntax, and
will still not be comprehensive [cf your "date" example], ISTM
that the decision is straightforward.

> The 'fprint' versions give a nice overall picture of the layout of
> the line, with variable parts represented by '#'. It is very easy
> to get that right and to maintain.

Yes, but there are other ways to do that without adding
to the size of the language and its description. Left as an
exercise.

> I can very easily control the spacing between items; unspaced is
> "##", spaced is "# #".
> But also, I can instantly change the format to anything else, eg. an
> extra space is "# #"; with a comma as well it's "#, #".

Not "anything else" [or, at least, not "instantly"]. As
you came close to pointing out, there are dozens of ways in which
dates are commonly [or less commonly!] printed, not all of which
are trivial re-arrangements of "print day, month, year". Even
simple numbers come in a variety of styles, not all of which are
catered for by formats such as "+11d.ddd", or whatever. See
also below.

[...]

> Another advantage is being able to use a generic format like "(#,
> #, #)", and apply it to a sequence of 3-element prints, if I wanted
> them to all be displayed in the same style. And to change the style,
> I change it in one place.

Yes, but you don't need "printf" to do that. A procedure
that takes three parameters [or an array, and if you like strings
to act as start, finish and separator] and prints the appropriate
output is essentially trivial to write in any sensible language,
and is just as generic and easy to change. It's also your
choice [as a language designer] whether to provide that procedure
and others like it, in whatever numbers, in the language library
or leave it to relevant users to write their own. It's a matter
of balance. Providing conversions from numbers to strings of
digits is probably highly desirable; providing formatting for
triples of integers probably isn't, and should be left to the
users.
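
For concreteness, a procedure of the kind described might look like this in Python (a sketch only; `format_row` and its parameter names are invented for illustration). The start, separator and finish strings default to the "(#, #, #)" style, and changing the style means changing one call or one default:

```python
def format_row(items, start="(", sep=", ", finish=")"):
    """Convert each item with str() and wrap the row in start/sep/finish."""
    return start + sep.join(str(item) for item in items) + finish


print(format_row([1, 2, 3]))
print(format_row(["a", 2.5, 3], start="[", sep="; ", finish="]"))
```

Because conversion goes through `str()`, the items in one row need not share a type.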

> Still not convinced? This is an easy and low-cost language feature
> for the extra convenience provided.

Providing simple procedures such as the one just suggested
is indeed easy, but not to do with writing little languages, or
adding syntax, and esp not with doubling the documentation needed.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music

Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Lange

Bart

Jan 12, 2022, 8:22:08 AM

I did an experiment: I removed support for 'fprint/fprintln' from my
compiler.

I left in the library support for it, since this will still be needed
whether it's built-in, or implemented via functions.

It made the compiler executable about 0.2% smaller. It made the source
code 50 lines smaller.

However, it leaves a problem: exactly how do I achieve the same thing
using only user-functions?

That is, implement a function like this:

formattedprint(dest, formatstring, x, y, fmtxx(z,"z11"))

where x, y, z are of arbitrary types. And it introduces the problems
that have been discussed of what to do about the string generated from
fmtxx(). It has the -xx designation because that is also type-specific.

My language doesn't have function overloads, but that wouldn't help:
overloads are used to select one of N functions, but we don't want
separate functions; we need one function that loops through a parameter
list of mixed types.

I don't have variadic parameters either. And I don't want to go
backwards and end up with C's crude solution.

So, saving 50 lines and having a 0.2% smaller core compiler, which is
your suggestion, means having a bunch of heavyweight language features
to design and add.

Or, the easy solution is to require the user to write:

startfmtprint(formatstring, dest)
print_xx(x, "")
print_yy(y, "")
print_zz(z, "z11")
endprint()

Here the 50 lines saved in the compiler are replaced by 500 lines in
user code. (In one application, by one programmer.)

The payoff is poor.
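
To show what that call sequence implies, here is a Python sketch of the start/print-each-item/end pattern. `FormatPrinter`, `print_int` and `print_str` are hypothetical stand-ins for startfmtprint, the type-specific print_xx routines and endprint; the zero-padding `width` parameter is an assumption, not the real meaning of a spec such as "z11".

```python
import io


class FormatPrinter:
    """Walks a '#' format string, filling one placeholder per print_* call."""

    def __init__(self, fmt, dest):
        self.fmt, self.dest, self.pos = fmt, dest, 0
        self._copy_literal()

    def _copy_literal(self):
        # Copy format text up to, but not including, the next '#'.
        nxt = self.fmt.find("#", self.pos)
        end = len(self.fmt) if nxt == -1 else nxt
        self.dest.write(self.fmt[self.pos:end])
        self.pos = end

    def _put(self, text):
        if self.pos >= len(self.fmt):
            raise ValueError("more items than '#' placeholders")
        self.dest.write(text)
        self.pos += 1              # step over the '#'
        self._copy_literal()

    def print_int(self, value, width=0):
        self._put(str(value).zfill(width))   # zero-padded, sign-aware

    def print_str(self, value):
        self._put(value)

    def end(self):
        if self.pos != len(self.fmt):
            raise ValueError("unfilled '#' placeholders remain")


dest = io.StringIO()
p = FormatPrinter("x=# y=# z=#", dest)
p.print_int(1)
p.print_str("two")
p.print_int(3, width=4)
p.end()
print(dest.getvalue())
```

Every call site has to spell out this sequence by hand, which is the user-code cost being weighed against the 50 compiler lines.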

>> But also, I can instantly change the format to anything else, eg. an
>> extra space is "# #"; with a comma as well it's "#, #".
>
> Not "anything else" [or, at least, not "instantly"]. As
> you came close to pointing out, there are dozens of ways in which
> dates are commonly [or less commonly!] printed, not all of which
> are trivial re-arrangements of "print day, month, year".

This kind of format is used when you get to the point where you /have/ 3
items to print and you know the layout you want.

It has little to do with the language knowing how to deal with:

print d

when 'd' is some user-defined type.

> [...]
>> Another advantage is being able to use a generic format like "(#,
>> #, #)", and apply it to a sequence of 3-element prints, if I wanted
>> them to all be displayed in the same style. And to change the style,
>> I change it in one place.
>
> Yes, but you don't need "printf" to do that. A procedure
> that takes three parameters [or an array, and if you like strings
> to act as start, finish and separator] and prints the appropriate
> output is essentially trivial to write in any sensible language,

See my example above. Not all formats are just "#, #, #". And in my
example of applying one format to multiple prints, not all those prints
will have the same types.

> providing formatting for
> triples of integers probably isn't, and should be left to the
> users.

Where did I do that? I gave an example where if you have several Prints
that have to output 3 things, then you may want them to share the same
layout.

>> Still not convinced? This is an easy and low-cost language feature
>> for the extra convenience provided.
>
> Providing simple procedures such as the one just suggested
> is indeed easy, but not to do with writing little languages, or
> adding syntax, and esp not with doubling the documentation needed.

You will still need to document those functions. Or perhaps you are
suggesting every user has to reinvent the same functions for turning
numbers into strings, padding to a given width, justifying left or right
etc etc.

If you supply user-code functions in a library for that, they will still
need documenting, and that kind of solution will be poorer.

Andy Walker

Jan 15, 2022, 2:53:58 PM

On 12/01/2022 13:22, Bart wrote:
> I did an experiment: I removed support for 'fprint/fprintln' from my
> compiler.
> I left in the library support for it, since this will still be needed
> whether it's built-in, or implemented via functions.

That depends on what the support entails. If it's anything
even remotely like Algol support for formats, it's simply not needed
if you don't have formats [Algol "$ ... $", often one of the first
things to go in subset languages (inc early versions of A68R)]. If
OTOH you really mean the support for "print", then of course that
will be needed, but that suggests that your idea of formatted
transput is, to say the least, minimal.

> It made the compiler executable about 0.2% smaller. It made the
> source code 50 lines smaller.
> However, it leaves a problem: exactly how do I achieve the same thing
> using only user-functions?

Well, I jotted down A68G code for what seems to be roughly
your "fprint[ln]" example. It comes to 19 lines, inc six lines to
implement a couple of bells-and-whistles [optional trailing
newline, and repeat last format]:

STRING ditto = "";
STRING lastfstring := ditto;
BOOL donewline := TRUE;  # set FALSE to suppress trailing newline #

PROC fprint = (STRING s, [] UNION (INT, LONG INT, REAL, STRING) a) VOID:
   ( STRING fmt = ( s = ditto | lastfstring | lastfstring := s );
     INT i := 0;
     FOR j TO UPB fmt
     DO IF fmt[j] = "#"
        THEN CASE a[i +:= 1]
             IN (UNION (INT, LONG INT) k): print (whole (k, 0)),
                (REAL r): print (fixed (r, -8, 5)),
                (STRING w): print (w)
             ESAC
        ELSE print (fmt[j])
        FI
     OD;
     donewline | print (newline)
   );

fprint ( "Hello #! Sqrt # is #, to 5dp", ("World", 2, sqrt(2)) );
fprint ( ditto, ("Bart", 1.69, "1.3") );
fprint ( "Longmaxint is #.", longmaxint )

[prints:

Hello World! Sqrt 2 is 1.41421, to 5dp
Hello Bart! Sqrt 1.69000 is 1.3, to 5dp
Longmaxint is 999999999999999999999999999999999999999999.

]. If this was a serious exercise, there would be more checks
and warnings. But also, there is a strong temptation to add
ever more facilities, which is how you finish up with dozens
of extra parameters or settable variables, and something that
takes serious effort to learn all the details of. Language
designers need to resist. Provide the basics, and let users
add the optional extras. That was how Unix/C started; and
letting people add the kitchen sink is how we got to Linux,
Gcc and documentation that is too much to print or learn.

> That is, implement a function like this:
> formattedprint(dest, formatstring, x, y, fmtxx(z,"z11"))
> where x, y, z are of arbitrary types. And it introduces the
> problems that have been discussed of what to do about the string
> generated from fmtxx(). It has the -xx designation because that is
> also type-specific.

To add a "dest", cf the relation between "put" and "print"
[RR10.5.1d; cost minimal]. For arbitrary types, see RR10.3.2.2a,b
and RR10.3.2.3a; but note that the full generality is available
only to the library, not to ordinary user code. For "what to do
about the string" -- just print it! The point is that specific
formats are not necessary, not that no printing is necessary.

[...]

> I don't have variadic parameters either.

Nor does Algol.

[...]

> It has little to do with the language knowing how to deal with:
> print d
> when 'd' is some user-defined type.

That's not to do with formatted printing either.

[...]
> See my example above. Not all formats are just "#, #, #". And in my
> example of applying one format to multiple prints, not all those
> prints will have the same types.

See my code above.

[...]

>> Providing simple procedures such as the one just suggested
>> is indeed easy, but not to do with writing little languages, or
>> adding syntax, and esp not with doubling the documentation needed.
> You will still need to document those functions. Or perhaps you are
> suggesting every user has to reinvent the same functions for turning
> numbers into strings, padding to a given width, justifying left or
> right etc etc.

How much documentation do you suggest the code above needs?
Surely nothing like the scores of pages needed to describe formats
in C and Algol? [Formatted transput is ~15% of the RR, four times
as much as unformatted (which easily does all the things you're
"suggesting" users might have to reinvent), a similar fraction of
the A68 /syntax/, and is a nightmare to lex, as formats can
contain ordinary code, potentially with embedded formats nested
to arbitrary depth.]

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music

Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Peerson

Bart

Jan 15, 2022, 8:22:52 PM

On 15/01/2022 19:53, Andy Walker wrote:
> On 12/01/2022 13:22, Bart wrote:
>> I did an experiment: I removed support for 'fprint/fprintln' from my
>> compiler.
>> I left in the library support for it, since this will still be needed
>> whether it's built-in, or implemented via functions.
>
> That depends on what the support entails. If it's anything
> even remotely like Algol support for formats, it's simply not needed
> if you don't have formats [Algol "$ ... $", often one of the first
> things to go in subset languages (inc early versions of A68R)]. If
> OTOH you really mean the support for "print", then of course that
> will be needed, but that suggests that your idea of formatted
> transput is, to say the least, minimal.

I already demonstrated it: it's simply using a format string where "#"
characters show where each print item goes.

What is still present in both fprint and normal print are per-item
display options (like width and justify).

You might consider it minimal, but I considered it something that would
benefit from direct language support.

That's a reasonable attempt at emulating 'fprint/ln'. It's an approach I
can't use in my systems language, because it doesn't have automatic
tagged unions as used here; arbitrary array constructors; nor that
automatic 'rowing' feature to turn one item into a list.

My dynamic language does have some of these, and there it would look like this:

const ditto=""

proc qfprint(s, items, newline=0) =
    static var lastfstring=""

    unless items.islist then items:=(items,) end
    fmt:=(s=ditto | lastfstring | lastfstring:=s)

    i:=0
    for j to fmt.len do
        if fmt[j]="#" then
            print items[++i]
        else
            print fmt[j]
        fi
    od
    if newline then println fi
end

proc qfprintln(s, items) =
    qfprint(s, items, 1)
end

qfprintln( "Hello #! Sqrt # is #", ("World", 2, sqrt(2)) )
qfprintln( ditto, ("Bart", 1.69, "1.3") )
qfprintln( "maxint is #.", i64.max)

* The 'unless' line is to turn a single item into a list

* This actually works for any types, of arbitrary complexity

* Newline has been done using separate qfprint and qfprintln routines

It's not far off what it looks like using native fprint, where it has
less punctuation and allows for per-item format overrides, as well as
the ability to fprint to a file, string or other destination:

fprintln fm:="Hello #! Sqrt # is #", "World", 2, sqrt(2)
fprintln fm, "Bart", 1.69, "1.3"
fprintln "maxint is #.", i64.max

(Note I don't have decimal.max, my long number type; partly because
there is no maximum.)

> [...]
>> I don't have variadic parameters either.
>
> Nor does Algol.

It has the union trick, and the ability to construct arrays of those
unions of effectively mixed types.

Yeah, A68 probably went overboard with this stuff. Beyond low-level
printing, the requirements are too diverse and application-specific.

Bart

Jan 16, 2022, 6:01:25 AM

Actually your example demonstrates your argument in reverse: it's full
of complex language features which I decided are not necessary to build
in to my systems language. (Not just that: they are hard to implement,
and inefficient.)

You are saying a language shouldn't have this type of formatting control
built-in, because it's so easy to emulate a half-working version with
different behaviour using other features.

Provided the language has those features, as A68 coincidentally happens
to have!

Meanwhile it also demonstrates one or two features that are missing from
A68, which I do have in my systems language and consider more useful:
local static variables (that retain their value between calls), and
optional function parameters with default values.

anti...@math.uni.wroc.pl

Jan 18, 2022, 2:19:17 PM

Dmitry A. Kazakov <mai...@dmitry-kazakov.de> wrote:
> On 2022-01-02 18:50, Bart wrote:
> > On 02/01/2022 17:21, Dmitry A. Kazakov wrote:

> >> On 2022-01-02 18:08, Bart wrote:
> >>> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
> >>>> On 2022-01-02 17:06, James Harris wrote:
> >>>>
> >>>>> If you convert to strings then what reclaims the memory used by
> >>>>> those strings?
> >>>>
> >>>> What reclaims memory used by those integers?
> >>>

> >>> Integers are passed by value at this level of language.
> >>
> >> This has nothing to do with the question: what reclaims integers?
> >> FORTRAN-IV passed everything by reference, yet calls
> >>
> >>     FOO (I + 1)
> >>
> >> were OK almost a human life span ago.
> >
> > Fortran didn't allow recursion either.
>
> Irrelevant. What reclaims integer I+1?

Relevant. Early Fortrans statically allocated storage for I + 1.
The storage was "reclaimed" by the OS at program termination, but
reserved for the whole run. Such static allocation is
impossible without a static bound on the maximal size of an object,
and it does not work in the case of recursion.
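
The recursion problem can be seen in a small simulation (a Python sketch, obviously not Fortran; `STATIC_SLOT` and both function names are invented for illustration). One fixed cell is shared by every activation, so an inner call overwrites the value the outer call still needs:

```python
STATIC_SLOT = [0]  # one fixed storage cell shared by every activation


def factorial_static(n):
    STATIC_SLOT[0] = n                 # the "local" lives in static storage
    sub = factorial_static(n - 1) if n > 1 else 1
    return STATIC_SLOT[0] * sub        # slot was overwritten by the inner calls


def factorial_stack(n):
    local = n                          # a fresh cell per activation
    sub = factorial_stack(n - 1) if n > 1 else 1
    return local * sub


print(factorial_static(4), factorial_stack(4))  # prints: 1 24
```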

--
Waldek Hebisch

Dmitry A. Kazakov

Jan 18, 2022, 3:06:10 PM

And still irrelevant: whatever reclaims integers can reclaim strings and
conversely. Furthermore, all that has nothing to do with the parameter
passing mode or where and how call frames get allocated.

P.S. DEC FORTRAN-IV used the machine stack for subroutine calls and
expressions.

Andy Walker

Jan 18, 2022, 8:02:28 PM

On 16/01/2022 11:01, Bart wrote:
[I wrote:]

>>> Well, I jotted down A68G code for what seems to be roughly
>>> your "fprint[ln]" example. It comes to 19 lines, inc six lines to

>>> implement a couple of bells-and-whistles [...].
[Bart:]

>> That's a reasonable attempt at emulating 'fprint/ln'. It's an
>> approach I can't use in my systems language, because it doesn't
>> have automatic tagged unions as used here; arbitrary array
>> constructors; nor that automatic 'rowing' feature to turn one item
>> into a list.

Tagged unions: They aren't exactly an unusual feature of
languages [Wiki gives examples from Algol, Pascal, Ada, ML, Haskell,
Modula, Rust, Nim, ..., tho' sometimes, as in C, a certain amount
of pushing and squeezing is needed]. If your unions aren't tagged,
it's a recipe for unsafe use of types [as in C].
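
As a sketch of what the tag buys you (Python, with `Tag`, `Item` and `render` invented for illustration): each value carries a discriminant, and the consumer dispatches on it instead of guessing how to interpret the payload.

```python
from dataclasses import dataclass
from enum import Enum


class Tag(Enum):
    INT = 1
    REAL = 2
    STRING = 3


@dataclass(frozen=True)
class Item:
    tag: Tag          # the discriminant an untagged union lacks
    value: object


def render(item):
    # Dispatch on the tag, much like a conformity CASE clause.
    if item.tag is Tag.INT:
        return str(item.value)
    if item.tag is Tag.REAL:
        return f"{item.value:.5f}"
    if item.tag is Tag.STRING:
        return item.value
    raise ValueError("unknown tag")


row = [Item(Tag.STRING, "Bart"), Item(Tag.REAL, 1.69), Item(Tag.INT, 2)]
print(" ".join(render(item) for item in row))
```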

Arbitrary array constructors: ??? You mean the ability to
write down an actual row of things and have the language treat it
as a row of things? How awful.

Rowing: Well, you're on better ground here, as the rowing
coercion is one of the features of Algol that has been touted as
something that perhaps ought not to have been included. But it's
only syntactic sugar, so it's easily worked around if your language
doesn't have it.

> Actually your example demonstrates your argument in reverse: it's
> full of complex language features which I decided are not necessary
> to build in to my systems language. (Not just that: they are hard to
> implement, and inefficient.)

Complex? Did you find my code difficult to follow? I can
easily explain anything if necessary, but I found it easy to write.
Indeed, part of my point was that it was /too/ easy. It was hard
to resist the temptation to add more and more and more bells and
whistles, which any competent programmer could supply and which
[in a library or as part of the language] take longer to describe
and learn than to write "ab initio".

Hard to implement? Possibly. But it's been done. You
[and more importantly, in context, James] don't need to re-invent
wheels, any more than you need to invent new parsing techniques
or new ways of sorting lists. Just copy the code from any of the
many PD compilers out there.

Inefficient? ???

> You are saying a language shouldn't have this type of formatting
> control built-in, because it's so easy to emulate a half-working
> version with different behaviour using other features.

Half-working? What's the other half, and what do you
expect for a few minutes working from your examples rather than
from a formal spec? Yes, I could very easily have added things
like L/R justification, better control of places before/after
decimal points, alternative decimal "point"s [such as ",", for
our continental friends], alternatives to "#", ..., but they
would have cluttered the code for no interesting benefit. If
you have need for these things, go ahead and implement them;
but there's no need to have them defined in the syntax or even
in a required/specified library routine.

> Provided the language has those features, as A68 coincidentally
> happens to have!

Coincidence? Or a decision based on decades of actual
experience? No language was more debated before and during its
initial implementation than Algol; see the pages, over nearly
30 years, of Algol Bulletin. Of course, that was around half a
century ago, and much has happened since to both hardware and
software, so parts of that debate [and some of the results] now
look quite silly or outdated. The depressing thing is not the
mistakes of Algol and other languages of the period, but that
recent languages persist in repeating those mistakes.

> Meanwhile it also demonstrates one or two features that are missing
> from A68, which I do have in my systems language and consider more
> useful: local static variables (that retain their value between
> calls), and optional function parameters with default values.

"Local static variables" were in Algol 60, were problematic
in detailed specification and were therefore dropped in Algol 68 [RR
0.2.6f]. Some of the reasons are expanded in the well-known book on
"Algol 60 Implementation" by Randell and Russell, esp in relation to
"own" arrays. The effects are easy to obtain in other ways. [Most
other prominent languages don't have them either.] For optional
function parameters, see AB37, and note that partial parametrisation
meets many of the needs and is in A68G.
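
Two of those "other ways" can be sketched in Python (an analogy only; `make_counter` and `show` are made-up names, and `functools.partial` merely resembles A68G's partial parametrisation rather than reproducing it). A closure keeps state between calls without a static variable, and fixing leading arguments stands in for default parameter values:

```python
import functools


def make_counter():
    count = 0                      # plays the role of an "own" variable

    def next_value():
        nonlocal count
        count += 1
        return count

    return next_value


def show(prefix, width, value):
    return prefix + str(value).rjust(width)


counter = make_counter()
first, second = counter(), counter()

# Fix the first two arguments once; callers then supply only the value.
show_id = functools.partial(show, "id:", 4)
print(first, second, show_id(7))
```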

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music

Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Mayer

anti...@math.uni.wroc.pl

Jan 18, 2022, 8:25:07 PM

Low-level issues are very relevant to the discussion here. You can
postulate that a language should be able to return strings, fine.
There is a well-known method to implement that: heap allocation +
garbage collection (for the purpose of this discussion reference
counting is just a specific way to implement garbage collection).
However, you claimed that returning strings can be implemented
without heap allocation, which is exactly a question of how the
language manages memory and is closely related to how call frames
get allocated. Now, if religious fanatics banned heap allocation,
then you are right that one can do the return using stacks. But
it goes well beyond what integers or, for that matter, fixed-size
objects need. Namely, only the called routine knows how big the
string will be, so it needs to allocate storage for it (on the stack
if you insist). Then you need to copy it to the caller's storage. If
the caller allocates it on its own stack, the copy can be done only
after the called routine has returned: only at that point can one
enlarge the caller's stack space. So you need to copy the string from
the called routine to a secondary stack before the return and then
copy it back to the caller's stack. It is tempting to optimize
and create the string directly on the secondary stack, or just keep
it on the secondary stack all the time, but that may easily lead
to bugs. So you either have a rather inefficient implementation
which performs a lot of copies (and needs sufficient space
on the secondary stack for all objects "in transit") or some
(probably complicated) code in the compiler that tries to
recognize optimizable cases (which probably does not help
much in the general case). Compared to that, heap allocation
is simpler to implement and likely more efficient.
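
[Editorial illustration, not part of the original post.] The two-copy
scheme described above can be sketched in C roughly as follows. This is
only a hand-written model of what a compiler might generate: the names
sec_stack, sec_push, repeat and caller are hypothetical, the arena size
is arbitrary, and C99 variable-length arrays stand in for frame space
whose size is known only at run time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical secondary stack: a fixed arena with a LIFO top index. */
char   sec_stack[256];
size_t sec_top = 0;

/* Reserve n bytes on the secondary stack; NULL when it is full. */
char *sec_push(size_t n)
{
    if (n > sizeof sec_stack - sec_top)
        return NULL;
    char *p = &sec_stack[sec_top];
    sec_top += n;
    return p;
}

/* Callee: only it knows the result size. It builds the string in its
   own frame, then copies it to the secondary stack before returning,
   because its own frame disappears on return. */
const char *repeat(const char *s, size_t times, size_t *len)
{
    size_t unit = strlen(s);
    size_t n = unit * times;
    char work[n + 1];                 /* C99 VLA in the callee's frame */
    for (size_t i = 0; i < times; i++)
        memcpy(work + i * unit, s, unit);
    work[n] = '\0';

    char *transit = sec_push(n + 1);  /* first copy: callee -> secondary */
    if (transit == NULL)
        return NULL;
    memcpy(transit, work, n + 1);
    *len = n;
    return transit;
}

/* Caller: can size its own storage only after the call has returned. */
size_t caller(char *dst, size_t dst_size)
{
    size_t mark = sec_top;            /* where the transit data starts */
    size_t len = 0;
    const char *transit = repeat("abc", 3, &len);
    if (transit == NULL)
        return 0;

    char local[len + 1];              /* caller's frame grows only now */
    memcpy(local, transit, len + 1);  /* second copy: secondary -> caller */
    sec_top = mark;                   /* LIFO release of the transit copy */

    if (len + 1 > dst_size)           /* export the result for inspection */
        return 0;
    memcpy(dst, local, len + 1);
    return len;
}
```

The sketch makes the cost visible: every returned string is copied
twice, and the secondary stack must be large enough for everything
that is "in transit" at the same time.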

To put it differently, returning an integer on a modern machine
requires very little code in the compiler and at runtime:
one just puts the value in the designated return register. The
whole call-return machinery and stack frame allocation
can be done in, say, 200 lines of compiler code. The
machinations needed to handle variable-sized objects are
one or two orders of magnitude more complicated. So
saying that "whatever reclaims integers can reclaim strings"
is at best misleading.

P.S. I used to like very much the idea of allocating
variable-size objects on stack(s). But after looking at the various
tradeoffs I am no longer convinced that it makes sense.

--
Waldek Hebisch

anti...@math.uni.wroc.pl

Jan 18, 2022, 8:42:07 PM

Andy Walker <a...@cuboid.co.uk> wrote:
> On 07/01/2022 18:14, Bart wrote:
> [I wrote:]
> >> [...] I don't know why you and James are so opposed to the use
> >> of heap storage [and temporary files, if you really want strings
> >> that are many gigabytes]?
> > Because heap storage requires a more advanced language to manage
> > properly, ie. automatically. (I don't care for speculative GC
> > methods.)
>
> Oh. Well, a language either provides heap storage or it
> does not. Even C does, even if in an unsafe and rather primitive
> way. The techniques involved aren't exactly cutting-edge recent.

Well, there are shades of gray here. To explain, there is a concern
about small machines. Small means different things to different
people; some think 256M is small. But I mean really small:
think about 4k of storage in total (program + data). Microcontrollers
of that size are widely used. One can program them using C++;
however, any attempt to use runtime features will pull in most of
the standard library, which may be more than 100k of code (that is
real data from my target). So, to support such small machines
you want to be able to create programs which need only minimal
runtime support. I must say that C 'printf' is one of the first
things to ban: it is well known that even the most space-efficient
versions of 'printf' are much larger than specialized code
for a single type. Also, the buffering needed on larger systems can
be simplified when the exact hardware/software configuration is
known.
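
[Editorial illustration, not part of the original post.] As a rough
idea of what "specialized code for a single type" can look like, here
is a minimal C sketch of a decimal printer for 16-bit unsigned values.
The names put_char, out_buf and print_u16 are hypothetical; on a real
microcontroller put_char would presumably write to a UART register
rather than to a buffer.

```c
#include <stdint.h>

/* Hypothetical output hook: on a real target this would write one
   character to a device; here it appends to a small buffer. */
char    out_buf[16];
uint8_t out_len = 0;

void put_char(char c)
{
    if (out_len < sizeof out_buf - 1) {
        out_buf[out_len++] = c;
        out_buf[out_len] = '\0';
    }
}

/* Print a 16-bit unsigned value in decimal: no format parsing, no
   floating point, no heap; five bytes of scratch space on the stack. */
void print_u16(uint16_t v)
{
    char digits[5];                   /* 65535 has five digits */
    uint8_t n = 0;
    do {                              /* digits come out in reverse */
        digits[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);
    while (n > 0)
        put_char(digits[--n]);
}
```

Nothing here interprets a format string at run time, which is the main
reason such a routine can stay far smaller than a general 'printf'.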

So I do not understand why James wants a fancy Print on small
systems. But the desire to run without heap storage is IMHO
quite reasonable.

--
Waldek Hebisch

Dmitry A. Kazakov

Jan 19, 2022, 2:42:46 AM

On 2022-01-19 02:25, anti...@math.uni.wroc.pl wrote:

> To put it differently, returning integer on modern machine
> requires very little code in compiler and during runtime,
> one just puts value in designated return register. The
> whole call-return machinery and stack frame allocation
> can be done in say 200 lines of compiler code. The
> machinations needed to handle variable sized object are
> one or two orders of magnitude more complicated.

Or reverse, advanced algorithms of register optimizations are incredibly
complicated while dealing with arrays is very straightforward. You make
assumptions about certain implementations which confirm nothing but
these assumptions.

> So
> saying that "whatever reclaims integers can reclaim strings"
> is at best misleading.

No, it is exactly the point. Integers and strings and all other objects
managed by the language are created and reclaimed by the language memory
management system. Which in turn can operate in a LIFO policy even if
the object sizes are indeterminable.

That came as a little surprise to some, which led to silly claims
regarding need of heap/GC or parameter passing by reference etc.

> P.S. I used to like very much idea of allocating variable
> size objects on stack(s). But after looking at various
> tradeofs I am no longer convinced that it makes sense.

Your loss. Using the heap is a crime on a multi-core architecture.

Bart

Jan 19, 2022, 5:48:23 AM

On 19/01/2022 07:42, Dmitry A. Kazakov wrote:
> On 2022-01-19 02:25, anti...@math.uni.wroc.pl wrote:
>
>> To put it differently, returning integer on modern machine
>> requires very little code in compiler and during runtime,
>> one just puts value in designated return register. The
>> whole call-return machinery and stack frame allocation
>> can be done in say 200 lines of compiler code. The
>> machinations needed to handle variable sized object are
>> one or two orders of magnitude more complicated.
>
> Or reverse, advanced algorithms of register optimizations are incredibly
> complicated while dealing with arrays is very straightforward. You make
> assumptions about certain implementations which confirm nothing but
> these assumptions.

No need; the ABI will specify how primitive types are passed, such as
integers, and in which registers.

Strings are not considered a primitive type; their handling is largely
up to the language.

So that itself tells you they cannot be dealt with the same way.

>> So
>> saying that "whatever reclaims integers can reclaim strings"
>> is at best misleading.
>
> No, it is exactly the point. Integers and strings and all other objects
> managed by the language are created and reclaimed by the language memory
> management system. Which in turn can operate in a LIFO policy even if
> the object sizes are indeterminable.

At the point where a Print routine has finished with a string associated
with an item in print-list, how will the language know how to recover
any resource used by that string, or if it needs to do so? Remember that
at different times through the same code:

* The string might be a literal (so it can be left)
* It can be constructed just for this purpose (so must be recovered)
* It could belong to a global entity (so can be left)
* It could be shared (so a reference count may need adjusting)

This is NEVER going to be as simple as just passing a number of 32 or 64
bits. I don't know why you keep saying it is other than to be contrary.
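
[Editorial illustration, not part of the original post.] The four
cases listed above can be made concrete with a small C sketch. It shows
one possible representation only, not how any particular language does
it: the Str descriptor, the StrKind tag and str_release are
hypothetical names, and malloc/free stand in for whatever allocator
the language actually uses.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical ownership categories, one per case in the list above. */
typedef enum { STR_LITERAL, STR_TEMP, STR_GLOBAL, STR_SHARED } StrKind;

typedef struct {
    StrKind kind;
    char   *data;
    int    *refcount;   /* only meaningful for STR_SHARED */
} Str;

/* Called by a print routine once it has finished with an item.
   Returns 1 if storage was freed, 0 if it was left alone. */
int str_release(Str *s)
{
    switch (s->kind) {
    case STR_LITERAL:
    case STR_GLOBAL:
        return 0;                     /* storage outlives the call */
    case STR_TEMP:
        free(s->data);                /* built just for this call */
        s->data = NULL;
        return 1;
    case STR_SHARED:
        if (--*s->refcount == 0) {    /* last reference is gone */
            free(s->data);
            free(s->refcount);
            s->data = NULL;
            s->refcount = NULL;
            return 1;
        }
        return 0;
    }
    return 0;
}
```

Passing an integer needs none of this bookkeeping, since the callee
simply receives a copy of the value.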

> Your loss. Using the heap is a crime on a multi-core architecture.

Huh? Pretty much everything is multi-core now other than on small devices.

How else are you going to use all those GB of memory other than using a
heap? Do you really want to insist on LIFO allocation everywhere? That
would be a highly restricted language, and makes ordinary data
structures either impossible, or highly inefficient.

You may as well insist that file storage on a disk is allocated in a
LIFO manner too; that would get rid of all that pesky fragmentation!
People who invent file systems seem to have missed a trick.

Bart

Jan 19, 2022, 6:40:36 AM

On 19/01/2022 01:02, Andy Walker wrote:
> On 16/01/2022 11:01, Bart wrote:
> [I wrote:]
>>>> Well, I jotted down A68G code for what seems to be roughly
>>>> your "fprint[ln]" example. It comes to 19 lines, inc six lines to
>>>> implement a couple of bells-and-whistles [...].
> [Bart:]
>>> That's a reasonable attempt at emulating 'fprint/ln'. It's an
>>> approach I can't use in my systems language, because it doesn't
>>> have automatic tagged unions as used here; arbitrary array
>>> constructors; nor that automatic 'rowing' feature to turn one item
>>> into a list.
>
> Tagged unions: They aren't exactly an unusual feature of
> languages [Wiki gives examples from Algol, Pascal, Ada, ML, Haskell,
> Modula, Rust, Nim, ..., tho' sometimes, as in C, a certain amount
> of pushing and squeezing is needed]. If your unions aren't tagged,
> it's a recipe for unsafe use of types [as in C].

There are different kinds of tagged unions, sum types and complex
enumerations. And various ways of implementing them, and lots of
possible ways of assigning things to them.

For example, your use of Algol68 unions doesn't name the individual
'fields'; it uses a form of pattern matching and assignment to an
arbitrary named entity to get access to a particular type and value.

So there isn't really one clear way to do this stuff.

The sort of tagged unions /I/ would want, would need a tag value that is
a global enum. Different cases could also have the same type.
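
[Editorial illustration, not part of the original post.] A tagged
union whose tag is a program-wide enum, with two cases sharing the same
payload type, might look like this in C. This is only a sketch of the
idea; Tag, Item and format_item are hypothetical names, and snprintf
is used purely to keep the example short.

```c
#include <stdio.h>

/* Hypothetical program-wide tag enum; two tags share a payload type. */
typedef enum { TAG_COUNT, TAG_OFFSET, TAG_RATIO, TAG_NAME } Tag;

typedef struct {
    Tag tag;
    union {
        long        i;   /* TAG_COUNT and TAG_OFFSET */
        double      r;   /* TAG_RATIO */
        const char *s;   /* TAG_NAME */
    } u;
} Item;

/* Render one item into buf; returns what snprintf reports, or -1 for
   an unknown tag. The tag, not the payload type, selects the layout. */
int format_item(const Item *it, char *buf, size_t size)
{
    switch (it->tag) {
    case TAG_COUNT:  return snprintf(buf, size, "%ld", it->u.i);
    case TAG_OFFSET: return snprintf(buf, size, "%+ld", it->u.i);
    case TAG_RATIO:  return snprintf(buf, size, "%g", it->u.r);
    case TAG_NAME:   return snprintf(buf, size, "%s", it->u.s);
    }
    return -1;
}
```

Because TAG_COUNT and TAG_OFFSET carry the same type, dispatching on
the type alone (as the Algol 68 conformity clause does) could not tell
them apart; an explicit tag can.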

> Arbitrary array constructors: ??? You mean the ability to
> write down an actual row of things and have the language treat it
> as a row of things? How awful.

My language has fixed-size arrays. But even there, I haven't implemented
all possibilities:

    proc D([3]int x)={}

    static [3]int A = (10,20,30)     # OK
    [3]int B := (10,20,30)           # OK
    [3]int C

    C := (10,20,30)                  # OK
    D((10,20,30))                    # Not supported in codegen pass

The last is slightly tricky as space for the data needs to be arranged.

With variable-length arrays, I can use slices for that purpose, although
they were intended for views into existing arrays.

So a slice can pass that information to a function, but I haven't
developed a constructor type for a slice for an array of values:

    proc D(slice[]int S)={}

    [3]int A
    [300]int B

    D(A)                 # OK
    D(B)                 # OK
    D(B[50..60])         # OK
    D((10,20,30))        # Not implemented
    D((P, N))            # Will create a slice descriptor

These are the language choices /I/ made. Doubtless you would have made
very different ones.

Remember that with my other car, sorry, language, anything goes:

    proc D(S) = {}

    D((10, 20.0, "30", 40..50, (60,70,80)))

so the lack of expressiveness in the lower level language is not of
concern; only enough needs to work to implement the higher level one.

> Rowing: Well, you're on better ground here, as the rowing
> coercion is one of the features of Algol that has been touted as
> something that perhaps ought not to have been included. But it's
> only syntactic sugar, so it's easily worked around if your language
> doesn't have it.

It's a type issue; such a feature reduces type safety. It stops a
language detecting the use of a scalar rather than a list, which could be
an error on the user's part.

>> Actually your example demonstrates your argument in reverse: it's
>> full of complex language features which I decided are not necessary
>> to build in to my systems language. (Not just that: they are hard to
>> implement, and inefficient.)
>
> Complex? Did you find my code difficult to follow?

That nested union I found hard to understand: so 'r' and 'w' are reals
and strings, but 'k' is yet another union of int and long int. And it
knows how to print such a type. Which means it knows how to print any
element of a[] anyway; the case statement is just to control the default
display.

It's also a use of CASE in A68 that I hadn't seen for a long time:
tagging each possibility to make it clear what each branch is
operating on.

Maybe that ought to be possible here too:

    FOR i FROM 1 TO 4 DO
        print((
            CASE i
            IN
                "one",
                "two",
                "three"
            OUT
                "other"
            ESAC, newline))
    OD

While my example is clear, elsewhere it would be useful to have 1:,
2: etc in front of each branch.

> You
> [and more importantly, in context, James] don't need to re-invent
> wheels, any more than you need to invent new parsing techniques
> or new ways of sorting lists. Just copy the code from any of the
> many PD compilers out there.
>
> Inefficient? ???

Yeah. Earlier discussion touched on the inefficiency of turning
something into an explicit string before printing (say, an entire array,
instead of one element at time).

Here, you're turning a set of N print-items into an array, so that it
can traverse them, deal with them, then discard the array.

>> You are saying a language shouldn't have this type of formatting
>> control built-in, because it's so easy to emulate a half-working
>> version with different behaviour using other features.
>
> Half-working?

Well, how big a UNION would be needed for all possible printable types?
My static language limits them, but could easily decide to print arrays
and user-defined records if it wants. The diversity is handled within
the compiler.

It's also missing per-item formatting codes. A solution using only
user-code, even in Algol68, is unwieldy and ugly.

Look at what C++ ended up with using a solution based on
language-building features:

std::cout << A << " " << B << std::endl;

> our continental friends], alternatives to "#", ..., but they

(One point about "#": when I wrote my CASE example, I commented out your
FPRINT code, using #...# comments. Of course the embedded #s interfered
with that, creating mysterious errors later on.

It could really do with line comments, as block comments for commenting
out code are troublesome.)

> "Local static variables" were in Algol 60, were problematic
> in detailed specification and were therefore dropped in Algol 68 [RR
> 0.2.6f]. Some of the reasons are expanded in the well-known book on
> "Algol 60 Implementation" by Randell and Russell, esp in relation to
> "own" arrays. The effects are easy to obtain in other ways. [Most
> other prominent languages don't have them either.]

It's not clear what is problematic about them, other than making
functions impure.

I just went ahead and implemented them! Sure you can get round the
omission by other means, such as using globals, but that is
unsatisfactory (names can clash with statics belonging to other
functions, and it is less safe, as they can be modified by any code).
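
[Editorial illustration, not part of the original post.] The
difference between a local static and the global workaround can be
shown with a short C sketch. C is used here only because it has both
forms; next_id, next_id_global and global_counter are hypothetical
names.

```c
/* The counter keeps its value between calls but is visible only
   inside next_id(). */
unsigned next_id(void)
{
    static unsigned counter = 0;
    return ++counter;
}

/* Workaround without local statics: file-scope state. Any other code
   that can see global_counter may modify it, and its name must not
   clash with similar variables serving other functions. */
unsigned global_counter = 0;

unsigned next_id_global(void)
{
    return ++global_counter;
}
```

Both versions make the function impure; the local static merely
narrows who can touch the state.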

Dmitry A. Kazakov

Jan 19, 2022, 7:24:16 AM

On 2022-01-19 11:48, Bart wrote:
> On 19/01/2022 07:42, Dmitry A. Kazakov wrote:
>> On 2022-01-19 02:25, anti...@math.uni.wroc.pl wrote:

>> No, it is exactly the point. Integers and strings and all other
>> objects managed by the language are created and reclaimed by the
>> language memory management system. Which in turn can operate in a LIFO
>> policy even if the object sizes are indeterminable.
>
> At the point where a Print routine has finished with a string associated
> with an item in print-list, how will the language know how to recover
> any resource used by that string, or if it needs to do so? Remember that
> at different times through the same code:
>
> * The string might be a literal (so it can be left)
> * It can be constructed just for this purpose (so must be recovered)
> * It could belong to a global entity (so can be left)
> * It could be shared (so a reference count may need adjusting)

Which applies exactly to integers. Moreover, things are far more
complicated for integers. An integer can be

- packed and misaligned in a container
- in a register
- optimized away value
- atomic access value
- mapped to an I/O port non-relocatable value

>> Your loss. Using the heap is a crime on a multi-core architecture.
>
> Huh? Pretty much everything is multi-core now other than on small devices.

See, could not even claim the green grapes.

> How else are you going to use all those GB of memory other than using a
> heap?

For doing something useful, maybe?

> You may as well insist that file storage on a disk is allocated in a
> LIFO manner too; that would get rid of all that pesky fragmentation!

Well, if you looked how journaling file systems function or how flash
does you might experience another revelation...

> People who invent file systems seem to have missed a trick.

Sure. None uses heap to transfer memory blocks, I hope. Oh, don't tell
me you just wrote one. You'll get me a PTSD...

Bart

Jan 19, 2022, 8:34:12 AM

On 19/01/2022 12:24, Dmitry A. Kazakov wrote:
> On 2022-01-19 11:48, Bart wrote:
>> On 19/01/2022 07:42, Dmitry A. Kazakov wrote:
>>> On 2022-01-19 02:25, anti...@math.uni.wroc.pl wrote:
>
>>> No, it is exactly the point. Integers and strings and all other
>>> objects managed by the language are created and reclaimed by the
>>> language memory management system. Which in turn can operate in a
>>> LIFO policy even if the object sizes are indeterminable.
>>
>> At the point where a Print routine has finished with a string
>> associated with an item in print-list, how will the language know how
>> to recover any resource used by that string, or if it needs to do so?
>> Remember that at different times through the same code:
>>
>> * The string might be a literal (so it can be left)
>> * It can be constructed just for this purpose (so must be recovered)
>> * It could belong to a global entity (so can be left)
>> * It could be shared (so a reference count may need adjusting)
>
> Which exactly applies to integer. Moreover things are far more
> complicated to integers. An integer can be
>
> - packed and misaligned in a container
> - in a register
> - optimized away value
> - atomic access value
> - mapped to an I/O port non-relocatable value

This is an integer being passed to a function. The Win64 ABI specifies
that a /copy/ of its value is passed in register RCX if it is the first
argument.

What does it say about the values of strings?

>>> Your loss. Using the heap is a crime on a multi-core architecture.
>>
>> Huh? Pretty much everything is multi-core now other than on small
>> devices.
>
> See, could not even claim the green grapes.

Huh?

>> How else are you going to use all those GB of memory other than using
>> a heap?
>
> For doing something useful, maybe?

Next you're going to tell me that all those languages that need garbage
collection are doing it all wrong.

>
>> You may as well insist that file storage on a disk is allocated in a
>> LIFO manner too; that would get rid of all that pesky fragmentation!
>
> Well, if you looked how journaling file systems function or how flash
> does you might experience another revelation...

I'm talking about normal, random-access, read-write devices. And surely
any other with special technological requirements, will allow you to
delete any arbitrary file, without the storage used being lost forever.

(Exceptions include WORM devices, but they don't have stack-like
capabilities either.)

>> People who invent file systems seem to have missed a trick.
>
> Sure. None uses heap to transfer memory blocks, I hope. Oh, don't tell
> me you just wrote one. You'll get me a PTSD...
>

Huh? again. Are you on something?

Dmitry A. Kazakov

Jan 19, 2022, 8:59:11 AM

If your language does not support out and in/out parameters it is your
problem and this has nothing to do with ABI, at all. The following is
legal Ada:

   function Foo (X : String) return String;
   pragma Convention (Stdcall, Foo);

It takes a string, returns a string and has the Win32 calling convention.
The compiler will give you a friendly warning that it would be tricky to
use it from C, but otherwise there is no problem.

> What does it say about the values of strings?

Win32 says pretty much the same: LPSTR, LPLONG.

>>>> Your loss. Using the heap is a crime on a multi-core architecture.
>>>
>>> Huh? Pretty much everything is multi-core now other than on small
>>> devices.
>>
>> See, could not even claim the green grapes.
>
> Huh?

Usually when cornered you claim that you don't even need a feature X.

>>> How else are you going to use all those GB of memory other than using
>>> a heap?
>>
>> For doing something useful, maybe?
>
> Next you're going to tell me that all those languages that need garbage
> collection are doing it all wrong.

Certainly. The performance gain is wasted on garbage software written by
incompetent programmers. The Moore's Law allowed to produce not steaming
piles of but mountains of volcanic guano piercing the stratosphere...

>>> You may as well insist that file storage on a disk is allocated in a
>>> LIFO manner too; that would get rid of all that pesky fragmentation!
>>
>> Well, if you looked how journaling file systems function or how flash
>> does you might experience another revelation...
>
> I'm talking about normal, random-access, read-write devices.

What every Linux machine runs...

>>> People who invent file systems seem to have missed a trick.
>>
>> Sure. None uses heap to transfer memory blocks, I hope. Oh, don't tell
>> me you just wrote one. You'll get me a PTSD...
>
> Huh? again. Are you on something?

Yes, nobody would willingly have used MS-DOS unless under threats of
bodily harm, yet... Now, if you wrote a file system, how do I know that
it is not on a device I might own? A frightening thought... (:-))

Bart

Jan 19, 2022, 9:38:41 AM

On 19/01/2022 13:59, Dmitry A. Kazakov wrote:
> On 2022-01-19 14:34, Bart wrote:
>> On 19/01/2022 12:24, Dmitry A. Kazakov wrote:

>>> - packed and misaligned in a container
>>> - in a register
>>> - optimized away value
>>> - atomic access value
>>> - mapped to an I/O port non-relocatable value
>>
>> This is an integer being passed to a function. The Win64 ABI specifies
>> that a /copy/ of its value is passed in register RCX if it is the
>> first argument.
>
> If your language does not support out and in/out parameters it is your
> problem and this has nothing to do with ABI, at all. The following is
> legal Ada:
>
> function Foo (X : String) return String;
> pragma Convention (Stdcall, Foo);
>
> Takes string returns string and has Win32 calling convention. The
> compiler will give you a friendly warning that it would be tricky to use
> it from C, but otherwise there is no problem.
>
>> What does it say about the values of strings?
>
> Win32 says pretty much same: LPSTR, LPLONG.

Those types are 64-bit pointers on Win64.

LPSTR points to a sequence of bytes, terminated with zero to represent a
crude C-style string.

That says nothing about the management of those bytes, which is the bit
that you are saying is so utterly trivial that it's not worth discussing.

An LPSTR value is /not/ the value of the string, which is what I asked
about; it's the value of the pointer.

Dmitry A. Kazakov

Jan 19, 2022, 10:20:57 AM

On 2022-01-19 15:38, Bart wrote:
> On 19/01/2022 13:59, Dmitry A. Kazakov wrote:
>> On 2022-01-19 14:34, Bart wrote:
>>> What does it say about the values of strings?
>>
>> Win32 says pretty much same: LPSTR, LPLONG.
>
> Those types are 64-bit pointers on Win64.

For example this:

   procedure Foo (X : in out LONG);
   pragma Convention (Stdcall, Foo);

will deploy LPLONG for X. The value will be passed by reference (LPLONG)
as Win32 Stdcall prescribes. See any pointers? Right, there is none.

> LPSTR points to a sequence of bytes, terminated with zero to represent a
> crude C-style string.

Yes, and the following works perfectly well:

   declare
      procedure Foo (X : char_array);
      pragma Convention (Stdcall, Foo);
      X : char_array := "abc";
   begin
      Foo (X & "d" & NUL);

Foo will get null-terminated "abcd". If implemented in C, it would use
LPSTR. Again, no pointers in sight. And, no, heap will not be used.

You are thoroughly confused regarding calling conventions and memory
management. These things are only vaguely related. It is indeed possible
to invent conventions that would make no reasonable management possible,
but it is rarely the goal of the people designing them...

Bart

Jan 19, 2022, 11:00:03 AM

On 19/01/2022 15:20, Dmitry A. Kazakov wrote:
> On 2022-01-19 15:38, Bart wrote:
>> On 19/01/2022 13:59, Dmitry A. Kazakov wrote:
>>> On 2022-01-19 14:34, Bart wrote:
>>>> What does it say about the values of strings?
>>>
>>> Win32 says pretty much same: LPSTR, LPLONG.
>>
>> Those types are 64-bit pointers on Win64.
>
> For example this:
>
> procedure Foo (X : in out LONG);
> pragma Convention (Stdcall, Foo);
>
> will deploy LPLONG for X. The value will be passed by reference (LPLONG)
> as Win32 Stdcall prescribes. See any pointers? Right, there is none.

Actually, yes. A reference is a pointer, just an implicit one. The 'P'
in LPLONG stands for 'Pointer'; what did you think it was?

>> LPSTR points to a sequence of bytes, terminated with zero to represent
>> a crude C-style string.
>
> Yes, and the following works perfectly well:
>
>    declare
>       procedure Foo (X : char_array);
>       pragma Convention (Stdcall, Foo);
>       X : char_array := "abc";
>    begin
>       Foo (X & "d" & NUL);
>
> Foo will get null-terminated "abcd". If implemented in C, it would use
> LPSTR. Again, no pointers in sight. And, no, heap will not be used.

Not even for Foo(X*1000000)?

Sure, if you say so:

* No pointers are involved

* No heap storage is necessary, no matter what language

* No memory resources are involved

* Passing an arbitrary string expression is just like passing an integer
expression

If that's what you really believe, then that's OK; it doesn't look like
anyone will be able to persuade you otherwise.

Meanwhile some of us have to actually implement this stuff for real.

Dmitry A. Kazakov

Jan 19, 2022, 12:17:05 PM

On 2022-01-19 17:00, Bart wrote:
> On 19/01/2022 15:20, Dmitry A. Kazakov wrote:
>> On 2022-01-19 15:38, Bart wrote:
>>> On 19/01/2022 13:59, Dmitry A. Kazakov wrote:
>>>> On 2022-01-19 14:34, Bart wrote:
>>>>> What does it say about the values of strings?
>>>>
>>>> Win32 says pretty much same: LPSTR, LPLONG.
>>>
>>> Those types are 64-bit pointers on Win64.
>>
>> For example this:
>>
>> procedure Foo (X : in out LONG);
>> pragma Convention (Stdcall, Foo);
>>
>> will deploy LPLONG for X. The value will be passed by reference
>> (LPLONG) as Win32 Stdcall prescribes. See any pointers? Right, there
>> is none.
>
> Actually, yes. A reference is a pointer, just an implicit one. The 'P'
> in LPLONG stands for 'Pointer'; what did you think it was?

LONG, see the declaration of Foo.

>>> LPSTR points to a sequence of bytes, terminated with zero to
>>> represent a crude C-style string.
>>
>> Yes, and the following works perfectly well:
>>
>>     declare
>>        procedure Foo (X : char_array);
>>        pragma Convention (Stdcall, Foo);
>>        X : char_array := "abc";
>>     begin
>>        Foo (X & "d" & NUL);
>>
>> Foo will get null-terminated "abcd". If implemented in C, it would use
>> LPSTR. Again, no pointers in sight. And, no, heap will not be used.
>
> Not even for Foo(X*1000000)?

Yes, for that invoking a contract termination clause would be the best
choice.

However this:
------------------------------ test.adb -------
with Ada.Text_IO; use Ada.Text_IO;
with Interfaces.C.Strings; use Interfaces.C, Interfaces.C.Strings;

with Ada.Unchecked_Conversion, System;

procedure Test is

   procedure Foo (X : char_array);
   pragma Convention (Stdcall, Foo);

   procedure Foo (X : char_array) is
      function "+" is
         new Ada.Unchecked_Conversion (System.Address, chars_ptr);
   begin
      Put_Line ("Length:" & size_t'Image (strlen (+X'Address)));
   end Foo;

   function "*" (X : char_array; Y : size_t) return char_array is
   begin
      return Result : char_array (0..X'Length * Y) do
         for I in 1..Y loop
            Result ((I - 1) * X'Length..I * X'Length - 1) := X;
         end loop;
         Result (Result'Last) := NUL;
      end return;
   end "*";

   X : char_array := "abc";
begin

   Foo (X * 1000000);
end Test;
----------------------------------------
compiles and works just fine:
----------------------------------------
> gcc -c test.adb
test.adb:7:19: warning: type of argument "Foo.X" is unconstrained array
[-gnatwx]
test.adb:7:19: warning: foreign caller must pass bounds explicitly [-gnatwx]
gnatbind -x test.ali
gnatlink test.ali

> D:\Temp\y\test>test
Length: 3000000
----------------------------------------
Questions?

> Sure, if you say so:
>
> * No pointers are involved

Right. None. In Ada a pointer is a distinct type whose declaration
contains the keyword "access". Saw any?

> * No heap storage is necessary, no matter what language

Right. Since avoiding heap for dealing with indefinite objects is a
computable problem there is no necessity.

> * No memory resources are involved

Wrong. Memory is always required for expression evaluation.

> * Passing an arbitrary string expression is just like passing an integer
> expression

Neither happens. We are not talking about closures.

Bart

Jan 19, 2022, 5:09:24 PM

On 19/01/2022 17:17, Dmitry A. Kazakov wrote:
> On 2022-01-19 17:00, Bart wrote:
>> On 19/01/2022 15:20, Dmitry A. Kazakov wrote:
>>> On 2022-01-19 15:38, Bart wrote:
>>>> On 19/01/2022 13:59, Dmitry A. Kazakov wrote:
>>>>> On 2022-01-19 14:34, Bart wrote:
>>>>>> What does it say about the values of strings?
>>>>>
>>>>> Win32 says pretty much same: LPSTR, LPLONG.
>>>>
>>>> Those types are 64-bit pointers on Win64.
>>>
>>> For example this:
>>>
>>> procedure Foo (X : in out LONG);
>>> pragma Convention (Stdcall, Foo);
>>>
>>> will deploy LPLONG for X. The value will be passed by reference
>>> (LPLONG) as Win32 Stdcall prescribes. See any pointers? Right, there
>>> is none.
>>
>> Actually, yes. A reference is a pointer, just an implicit one. The
>> 'P' in LPLONG stands for 'Pointer'; what did you think it was?
>
> LONG, see the declaration of Foo.

The 'P' in 'LPLONG' stands for 'LONG'? Okay.....

I asked if no heap was used. How can you tell what your Ada code is
doing and how?

I also asked, if memory needs to be recovered, which bit of code did
that, and where.

>> Sure, if you say so:
>>
>> * No pointers are involved
>
> Right. None. In Ada pointer is a distinct type in the declaration of
> which contains the keyword "access". Saw any?

So because no explicit pointer denotations are in the source code, that
means that none are used in the implementation?

Gotcha.

>> * No heap storage is necessary, no matter what language
>
> Right. Since avoiding heap for dealing with indefinite object is a
> computable problem there is no necessity.

You really, really don't like heaps don't you? Let me pose a little
challenge: you're writing, in Ada, an interpreter for a language which
creates and destroys ad hoc objects (like, any scripting language).

You can't control how people write programs in that language. How do you
implement the memory management in such a language, without using heaps
in Ada? For example a program has a tree structure where arbitrary nodes
may grow, others may be deleted.

To be clear, I'm not asking how you approach such a task in Ada; the
task has already been coded in another language, and your job is to
implement that language, without using anything that looks like a heap,
or using language features that might implicitly use a heap (but Ada
doesn't, according to you).

>> * No memory resources are involved
>
> Wrong. Memory is always required for expression evaluation.
>
>> * Passing an arbitrary string expression is just like passing an
>> integer expression
>
> Neither happens. We are not talking about closures.
>

OK, since you are being pedantic now, I mean passing the result of
evaluating those respective expressions.

The result of an integer expression will be an integer. The result of a
string expression may be one of half a dozen categories of strings
(literal, owned, shared, slice etc), and exactly what is passed depends
on a dozen different ways that strings may be implemented.

Look, just forget it. I can see you're just being mischievous.

Dmitry A. Kazakov

Jan 20, 2022, 3:31:08 AM

On 2022-01-19 23:09, Bart wrote:
> On 19/01/2022 17:17, Dmitry A. Kazakov wrote:
>> On 2022-01-19 17:00, Bart wrote:
>>> On 19/01/2022 15:20, Dmitry A. Kazakov wrote:
>>>> On 2022-01-19 15:38, Bart wrote:
>>>>> On 19/01/2022 13:59, Dmitry A. Kazakov wrote:
>>>>>> On 2022-01-19 14:34, Bart wrote:
>>>>>>> What does it say about the values of strings?
>>>>>>
>>>>>> Win32 says pretty much same: LPSTR, LPLONG.
>>>>>
>>>>> Those types are 64-bit pointers on Win64.
>>>>
>>>> For example this:
>>>>
>>>> procedure Foo (X : in out LONG);
>>>> pragma Convention (Stdcall, Foo);
>>>>
>>>> will deploy LPLONG for X. The value will be passed by reference
>>>> (LPLONG) as Win32 Stdcall prescribes. See any pointers? Right, there
>>>> is none.
>>>
>>> Actually, yes. A reference is a pointer, just an implicit one. The
>>> 'P' in LPLONG stands for 'Pointer'; what did you think it was?
>>
>> LONG, see the declaration of Foo.
>
> The 'P' in 'LPLONG' stands for 'LONG'? Okay.....

?

> I asked if no heap was used. How can you tell what your Ada code is
> doing and how?

By executing it as I did above.

> I also asked, if memory need to be recovered, which bit of code did
> that, and where.

Use gcc -S and/or gdb.

>>> Sure, if you say so:
>>>
>>> * No pointers are involved
>>
>> Right. None. In Ada pointer is a distinct type in the declaration of
>> which contains the keyword "access". Saw any?
>
> So because no explicit pointer denotations are in the source code, that
> means that none are used in the implementation?

Exactly so. No pointers means no pointers.

>>> * No heap storage is necessary, no matter what language
>>
>> Right. Since avoiding heap for dealing with indefinite object is a
>> computable problem there is no necessity.
>
> You really, really don't like heaps don't you?

This has nothing to do with love. It is a statement of fact. You asked
whether a heap is necessary for a problem X. The answer is no. It is a
formal question:

Is the ordering of allocation/deallocation used in X "random", that is,
actually indeterminable? No, it is not; it is in fact well defined.

>>> * Passing an arbitrary string expression is just like passing an
>>> integer expression
>>
>> Neither happens. We are not talking about closures.
>
> OK, since you are being pedantic now, I mean passing the result of
> evaluating those respective expressions.

Define "same". First, you must understand that this is a language
question. If by "same" you mean by-reference vs. by-value then I must
disappoint you. They can be the very same. The GNAT Ada compiler supports
passing small arrays by value.

Formally the language Ada mandates this:

- scalar objects must be passed by value
- tagged and limited objects must be passed by reference
- other objects (and arrays fall in this category) are passed at will

Of course all this goes out of the window when the Windows calling
convention is requested, sorry for the pun. Or another example: if the
FORTRAN calling convention is requested, then, of course, an integer will
be passed by reference, not by value.

> The result of an integer expression will be an integer. The result of an
> string expression may be one of half a dozen cateogories of strings
> (literal, owned, shared, slice etc), and exactly what is passed depends
> on a dozen different ways that strings may be implemented.

Rubbish. The result of a string expression is a string. Values are not
passed; objects representing values are. It does not matter what sort
of value an object holds when passing the object.

Do you really intend to code separately passing each possible type and
aspect of objects? Oh my, now I finally understand your dismay at seeing
the type system of a modern programming language...

Bart

Jan 20, 2022, 5:43:01 AM

On 20/01/2022 08:31, Dmitry A. Kazakov wrote:

> On 2022-01-19 23:09, Bart wrote:

>>>> Actually, yes. A reference is a pointer, just an implicit one. The
>>>> 'P' in LPLONG stands for 'Pointer'; what did you think it was?
>>>
>>> LONG, see the declaration of Foo.
>>
>> The 'P' in 'LPLONG' stands for 'LONG'? Okay.....
>
> ?

FFS, it's very simple: 'LP' in MS types stands for 'Long Pointer', so
the 'P' stands for 'Pointer', you know, the thing you said doesn't exist.

>> So because no explicit pointer denotations are in the source code,
>> that means that none are used in the implementation?
>
> Exactly so. No pointers means no pointers,

Not even in the underlying implementation? (That's the bit /I/ have to
code.)

> This has nothing to with love. It is a statement of fact. You asked
> whether heap is necessary for a problem X. The answer is no. It is a
> formal question:
>
> Whether the ordering of allocation/deallocation used in X is "random",
> actually indeterminable? No it is not, it is well defined in fact.

Not even when X calls random() to decide which parts of a data structure
to allocate/deallocate?

Remember, I am not writing X, I have to implement the language in which
X is written.

> Formally the language Ada mandates this:
>
> - scalar objects must be passed by value
> - tagged and limited objects must be passed by reference
> - other objects (and arrays fall in this category) are passed at will

So, assuming a string is not classed as a scalar, a non-small string is
passed differently from an integer, possibly 'at will'.

Thank you for finally admitting that integers and strings might need
different passing mechanisms.

> Of course all this goes out of the window when Windows calling
> convention is requested, sorry for pun.

So what does SYS V ABI do that is so different?

> Or another example, if FORTRAN
> calling convention is requested, then, of course, integer will be passed
> by reference not by value.

'By-reference' is an extra level of indirection that can be specified
within the code; it can be applied to both integers normally passed by
value, and objects normally passed by reference anyway.

The language specifies which objects can be modified by a callee without
a formal 'by-reference', if any.

>> The result of an integer expression will be an integer. The result of
>> an string expression may be one of half a dozen cateogories of strings
>> (literal, owned, shared, slice etc), and exactly what is passed
>> depends on a dozen different ways that strings may be implemented.
>
> Rubbish. The result of string expression is string.

Yes, which can be one of several categories, not several types, for
example, the string data is owned by the object, or it can be a view
into a separate substring.

And that string can be implemented in a dozen different ways depending
on language.

Do you disagree with that, or is Ada the only possible language under
discussion?

You said, long ago, that dealing with strings is no harder than dealing
with fixed-width integers.

Take this example:

    string S:="onetwothree"

    case random(1..10)
    when 1 then S := "ABC"
    when 2 then S := T
    when 3 then S := T.[10..20]
    when 4 then S := S.[4..6]
    when 5 then S := F()*10
    esac

    T:=""

Now S has something else assigned to it, or goes out of scope, and the
string data MIGHT need to be recovered. But how does it know whether S
owns the string it refers to or not?

This depends on how the language defines strings to work. Maybe all
those assignments and slicing created a fresh copy of the string or
slice, but then T might be a billion characters. Or maybe these are
immutable, shared strings.
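
One way an implementation can answer the ownership question is to carry an explicit flag in the string descriptor. The following is only a minimal C sketch under that assumption; the type `Str` and the helpers `str_literal`, `str_slice`, `str_copy` and `str_release` are hypothetical names invented for illustration, not taken from any language discussed in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string descriptor: a pointer, a length, and an ownership flag. */
typedef struct {
    const char *data;
    size_t      len;
    int         owned;  /* 1: data was malloc'ed for this descriptor */
} Str;

/* A literal lives in static storage, so the descriptor does not own it. */
Str str_literal(const char *lit)
{
    Str s = { lit, strlen(lit), 0 };
    return s;
}

/* A slice is a view into another string's data (1-based, inclusive bounds). */
Str str_slice(Str src, size_t first, size_t last)
{
    assert(first >= 1 && first <= last && last <= src.len);
    Str s = { src.data + (first - 1), last - first + 1, 0 };
    return s;
}

/* A copy gets its own heap block and therefore owns it. */
Str str_copy(Str src)
{
    char *buf = malloc(src.len ? src.len : 1);
    assert(buf != NULL);
    memcpy(buf, src.data, src.len);
    Str s = { buf, src.len, 1 };
    return s;
}

/* Called on reassignment or at end of scope: free only what is owned. */
void str_release(Str *s)
{
    if (s->owned)
        free((void *)s->data);
    s->data  = NULL;
    s->len   = 0;
    s->owned = 0;
}
```

With this layout, `S := S.[4..6]` would become a non-owning view, and releasing it would free nothing. The sketch does not address the harder problem of a view outliving the string it points into; that needs reference counting, copying, or a collector.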

> Values are not
> passed, objects representing values do. It is quite no matter what sort
> of value an object holds when passing the object.

I don't understand what you are saying. Of course it is very easy to
utterly dismiss low-level implementation details, and talk only in
HLL terms (presumably Ada terms).

So you can claim that an implementation does not involve registers or
addresses or pointers or stack or heap or even memory, because those
terms do not appear in the source code.

Which all seems to be a desperate attempt to win an argument.

Dmitry A. Kazakov

Jan 20, 2022, 7:35:00 AM

On 2022-01-20 11:42, Bart wrote:
> On 20/01/2022 08:31, Dmitry A. Kazakov wrote:
>> On 2022-01-19 23:09, Bart wrote:
>
>>>>> Actually, yes. A reference is a pointer, just an implicit one. The
>>>>> 'P' in LPLONG stands for 'Pointer'; what did you think it was?
>>>>
>>>> LONG, see the declaration of Foo.
>>>
>>> The 'P' in 'LPLONG' stands for 'LONG'? Okay.....
>>
>> ?
>
> FFS, it's very simple: 'LP' in MS types stands for 'Long Pointer', so
> the 'P' stands for 'Pointer', you know, the thing you said doesn't exist.

And I bring the code you conveniently removed:

procedure Foo (X : in out LONG);
pragma Convention (Stdcall, Foo);

*where* is 'LP'?

>>> So because no explicit pointer denotations are in the source code,
>>> that means that none are used in the implementation?
>>
>> Exactly so. No pointers means no pointers,
>
> Not even in the underlying implementation? (That's that bit /I/ have to
> code.)

No idea. What is the underlying implementation? Gates, capacitors,
silicon wafers?

>> This has nothing to with love. It is a statement of fact. You asked
>> whether heap is necessary for a problem X. The answer is no. It is a
>> formal question:
>>
>> Whether the ordering of allocation/deallocation used in X is "random",
>> actually indeterminable? No it is not, it is well defined in fact.
>
> Not even when X calls random() to decide which parts of a data structure
> to allocate/deallocate?

Not even then. The ordering remains same.

> Remember, I am not writing X, I have to implement the language in which
> X is written.
>
>> Formally the language Ada mandates this:
>>
>> - scalar objects must be passed by value
>> - tagged and limited objects must be passed by reference
>> - other objects (and arrays fall in this category) are passed at will
>
> So, assuming a string is not classed as a scalar, a non-small string is
> passed differently from an integer, possibly 'at will'.
>
> Thank you for finally admitting that integers and strings might need
> different passing mechanisms.

Two integers might. So what?

>> Of course all this goes out of the window when Windows calling
>> convention is requested, sorry for pun.
>
> So what does SYS V ABI do that is so different?

I don't remember SysV calling conventions. Why do you care? What is
different? Why would it shatter the earth under your feet?

>> Or another example, if FORTRAN calling convention is requested, then,
>> of course, integer will be passed by reference not by value.
>
> 'By-reference' is an extra level of indirection that can be specified
> within the code; it can be applied to both integers normally passed by
> value, and objects normally passed by reference anyway.

Thank you Captain. So what?

> The language specifies which objects can be modified by a callee without
> a formal 'by-reference', if any.

Ada does without both formal (it has none) and informal (it does not
need any). And?

>>> The result of an integer expression will be an integer. The result of
>>> an string expression may be one of half a dozen cateogories of
>>> strings (literal, owned, shared, slice etc), and exactly what is
>>> passed depends on a dozen different ways that strings may be
>>> implemented.
>>
>> Rubbish. The result of string expression is string.
>
> Yes, which can be one of several categories, not several types, for
> example, the string data is owned by the object, or it can be a view
> into a separate substring.

Nonsense. Ada and other sane languages are specific that the
representation is a property of the type, meaning: same type, same
representation.

> And that string can be implemented in a dozen different ways depending
> on language.

And an integer can be implemented in two dozen ways, maybe in three...

> You said, long ago, that dealing with strings is no harder than dealing
> with fixed-width integers.
>
> Take this example:
>
>     string S:="onetwothree"
>
>     case random(1..10)
>     when 1 then S := "ABC"
>     when 2 then S := T
>     when 3 then S := T.[10..20]
>     when 4 then S := S.[4..6]
>     when 5 then S := F()*10
>     esac

This is illegal in Ada and irrelevant anyway. Try to stay focused on
parameter passing methods and managing temporary objects. Both are well
known issues. I have no idea why you are trying to debunk algorithms that
have existed, been implemented and been deployed for multiple decades. It
is just beyond silly.

>> Values are not passed, objects representing values do. It is quite no
>> matter what sort of value an object holds when passing the object.
>
> I don't understand what you are saying.

That for a memory manager it makes absolutely no difference what value
the object has. Just as a heap would not care, neither does the memory
manager.

> Of course it is very easy to
> utterly dismiss low-level implementation details, and talk about only in
> HLL terms (presumably Ada terms).

As I said, gcc -S is your friend.

The discussion was about the language features and your amazement that
it can have strings and, what a horror, not use a heap when dealing
with string expressions. I don't even understand what you are trying to
say. That compiler vendors lie? That they are in a secret cabal with
programmers who actually do not use their compilers in production code?

> So you can claim that an implementation does not involve registers or
> addresses or pointers or stack or heap or even memory, because those
> terms do not appear in the source code.

Exactly. Unless we are talking about language to language translators. I
remember that there were several Ada to C and Ada to Java translators. But
GNAT for x86 is not one of them. And, again, what is the point? You can
rewrite a program that does not use pointers into a program that uses
them? Yes you can. So what?

> Which all seems to be a desperate attempt to win an argument.

There is no argument, merely statements of facts.

Bart

Jan 20, 2022, 8:21:45 AM

On 20/01/2022 12:34, Dmitry A. Kazakov wrote:
> On 2022-01-20 11:42, Bart wrote:

> *where* is 'LP'?

Here, where you said:

"will deploy LPLONG for X. The value will be passed by reference
(LPLONG) as Win32 Stdcall prescribes. See any pointers? Right, there is
none."

And I asked, So what is the P in LPLONG?

>> Not even in the underlying implementation? (That's that bit /I/ have
>> to code.)
>
> No idea. What is the underlying implementation? Gates, capacitors,
> silicon wafers?

I implement languages, which usually means going from HLL source code
down to the native code executed by a processor. That doesn't mean going
down to microprogramming or bitslicing or logic gates or transistors or
right down to the movement of electrons.

Stop being silly.

>>> This has nothing to with love. It is a statement of fact. You asked
>>> whether heap is necessary for a problem X. The answer is no. It is a
>>> formal question:
>>>
>>> Whether the ordering of allocation/deallocation used in X is
>>> "random", actually indeterminable? No it is not, it is well defined
>>> in fact.
>>
>> Not even when X calls random() to decide which parts of a data
>> structure to allocate/deallocate?
>
> Not even then. The ordering remains same.

That's very glib, but I notice you don't explain how it's done.

It's like you have a pile of books to be put in order on a shelf. But at
some point, you need to remove a book from the middle of those already
on the shelf.

Now, you want to place the next book, but if it's no wider than the one
just removed, can you reuse that empty slot?

If so, then you have a heap. If not, then you have a very inefficient
system of storing books, and will need more bookcases than necessary.

Or are you arguing that the task of putting books on the shelf, and
removing arbitrary books, can ALWAYs be done in a left to right order,
with never any gaps?

Or do you have to use a technique which pushes all the books to the
right of the gap, to the left to close the gap? (Which will screw up any
external references to the exact location of any book on a shelf, as
well as being hopelessly inefficient.)
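
The bookshelf analogy can be made concrete with a small sketch. The C code below is a deliberately simplified first-fit allocator over a fixed-width shelf; the names `Shelf`, `shelf_place` and `shelf_remove` and the sizes are assumptions made up for illustration. It reuses a gap whenever the new book fits, but it never splits or merges gaps, which a real heap allocator would do.

```c
#include <assert.h>

#define SHELF_WIDTH 100   /* total shelf width, arbitrary units */
#define MAX_SLOTS   16    /* how many stretches of shelf we can track */

/* One stretch of shelf: either holding a book or left as a gap. */
typedef struct {
    int offset;  /* position of the left edge */
    int width;   /* width of the stretch */
    int used;    /* 1 = holds a book, 0 = gap */
} Slot;

typedef struct {
    Slot slots[MAX_SLOTS];  /* kept in left-to-right order */
    int  count;             /* number of tracked stretches */
    int  end;               /* first never-used position on the right */
} Shelf;

void shelf_init(Shelf *s)
{
    s->count = 0;
    s->end   = 0;
}

/* Place a book: reuse the first gap that is wide enough (first fit),
   otherwise append at the right-hand end. Returns the offset or -1. */
int shelf_place(Shelf *s, int width)
{
    assert(width > 0);
    for (int i = 0; i < s->count; i++) {
        if (!s->slots[i].used && s->slots[i].width >= width) {
            s->slots[i].used = 1;   /* any leftover width is simply wasted */
            return s->slots[i].offset;
        }
    }
    if (s->count == MAX_SLOTS || s->end + width > SHELF_WIDTH)
        return -1;
    s->slots[s->count] = (Slot){ s->end, width, 1 };
    s->count++;
    s->end += width;
    return s->end - width;
}

/* Remove the book at the given offset, leaving a reusable gap.
   Returns 1 if a book was found there, 0 otherwise. */
int shelf_remove(Shelf *s, int offset)
{
    for (int i = 0; i < s->count; i++) {
        if (s->slots[i].used && s->slots[i].offset == offset) {
            s->slots[i].used = 0;
            return 1;
        }
    }
    return 0;
}
```

Placing books of width 10, 20 and 10, removing the middle one, and then placing a book of width 15 puts the new book back at offset 10. A strictly left-to-right (stack-like) scheme would instead have to place it at offset 40 and leave the gap unused.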

>> Thank you for finally admitting that integers and strings might need
>> different passing mechanisms.
>
> Two integers might.

No, why should they if they are the same size? As I said half a dozen
posts ago, a u64 value as the first argument on Win64 ABI goes in RCX,
always.

>>> Of course all this goes out of the window when Windows calling
>>> convention is requested, sorry for pun.
>>
>> So what does SYS V ABI do that is so different?
>
> I don't remember SysV calling conventions.

Funny that you can remember Win64 ones!

> Why do you care? What is
> different? Why would it shatter the earth under you feet?

You had a sly dig at Windows calling conventions. I asked you to clarify
what is so bad about it compared with others, but now you don't care!

>> And that string can be implemented in a dozen different ways depending
>> on language.
>
> And integer can be implemented in two dozens of ways, maybe in three...

Really, there is greater diversity in implementing a fixed width small
integer than a string?

>> You said, long ago, that dealing with strings is no harder than
>> dealing with fixed-width integers.
>>
>> Take this example:
>>
>>      string S:="onetwothree"
>>
>>      case random(1..10)
>>      when 1 then S := "ABC"
>>      when 2 then S := T
>>      when 3 then S := T.[10..20]
>>      when 4 then S := S.[4..6]
>>      when 5 then S := F()*10
>>      esac
>
> This is illegal in Ada

OK, so the only language that counts is Ada? I'm glad that point has
been settled.

Actually plenty of languages have mutable variables and allow
conditional assignments of strings like the above.

> and irrelevant anyway. Try to stay focused on
> parameter passing methods and managing temporary objects.

'S' was a temporary object; I asked how, at the end of its scope, it
knew what to do with the string data it refers to.

Transient objects created as intermediate results in expressions are
part of it too, but I thought you would simply deny their existence, as
they don't form any part of the source, so they can't possibly be an
issue anyone has to bother their head about, not even the implementers
of the language!

> The discussion was about the language features and your amazement that
> it can have strings

I've given plenty of examples where string data is not created or
destroyed in a LIFO manner, thus requiring ad hoc allocations and
deallocations (so, a heap).

You have chosen to ignore the examples, or dismiss languages where that
is routinely done.

Since you are also dismissing the entire concept of GC, which has also been
around for decades, this conversation is rather pointless.

All I can surmise is that your pet language doesn't have a heap allocator.

> and, what a horror, not to use heap when dealing
> with string expressions. I don't even understand what are you trying to
> say. That compiler vendors lie? That they are in a secret cabal with
> programmers who actually do not use their compilers in production code?
>
>> So you can claim that an implementation does not involve registers or
>> addresses or pointers or stack or heap or even memory, because those
>> terms do not appear in the source code.
>
> Exactly.

I thought so. We're just going around in circles because you are
choosing to ignore any actual realities.

Dmitry A. Kazakov

Jan 20, 2022, 9:35:42 AM

On 2022-01-20 14:21, Bart wrote:
> On 20/01/2022 12:34, Dmitry A. Kazakov wrote:
>> On 2022-01-20 11:42, Bart wrote:
>
>> *where* is 'LP'?
>
> Here, where you said:
>
> "will deploy LPLONG for X. The value will be passed by reference
> (LPLONG) as Win32 Stdcall prescribes. See any pointers? Right, there is
> none."

Right. Ada passes LONG by reference. No pointers.

A C equivalent would be a pointer LONG* since C does not support
by-reference parameter passing.

A FORTRAN-IV equivalent would be INTEGER*4.
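
For comparison, the C side of such an interface might look like the sketch below. The `LONG` and `LPLONG` typedefs are portable stand-ins assumed here for the ones in the Windows headers, and `foo` is a hypothetical counterpart of the Ada procedure `Foo`; the calling-convention attribute is left out so the fragment compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the <windows.h> typedefs, assumed here for portability. */
typedef int32_t LONG;
typedef LONG   *LPLONG;

/* Hypothetical C view of Foo: the "in out" parameter arrives as an
   address, and the callee updates the caller's variable through it. */
void foo(LPLONG x)
{
    assert(x != NULL);
    *x += 1;
}
```

A caller writes `LONG v = 41; foo(&v);` and afterwards finds 42 in `v`. In C the address-taking is spelled out at the call site, whereas in the Ada declaration the same transfer is expressed only by the parameter mode.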

> And I asked, So what is the P in LPLONG?

In what language?

>>> Not even in the underlying implementation? (That's that bit /I/ have
>>> to code.)
>>
>> No idea. What is the underlying implementation? Gates, capacitors,
>> silicon wafers?
>
> I implement languages, which usually means going from HLL source code
> down to the native code executed by a processor. That doesn't mean going
> down to microprogramming or bitslicing or logic gates or transistors or
> right down to the movement of electrons.
>
> Stop being silly.

Stop talking about your language. Your language, your problems.

I have no idea what intermediate code GCC uses and I do not care.
Machine language has no pointers, it has addressing modes.

>>>> This has nothing to with love. It is a statement of fact. You asked
>>>> whether heap is necessary for a problem X. The answer is no. It is a
>>>> formal question:
>>>>
>>>> Whether the ordering of allocation/deallocation used in X is
>>>> "random", actually indeterminable? No it is not, it is well defined
>>>> in fact.
>>>
>>> Not even when X calls random() to decide which parts of a data
>>> structure to allocate/deallocate?
>>
>> Not even then. The ordering remains same.
>
> That very glib, but I notice you don't explain how it's done.

Done what? Ordering? Ordering is done by counting. You count arguments
using your fingers. The first argument is the index finger. The second
is the middle finger and so on...

> It's like you have a pile of books to be put in order on a shelf. But at
> some point, you need to remove a book from the middle of those already
> on the shelf.

No, it is not that case. Read previous posts regarding using stacks.

>>> Thank you for finally admitting that integers and strings might need
>>> different passing mechanisms.
>>
>> Two integers might.
>
> No, why should they if they are the same size?

No they don't. Integer_32 and Integer_16 have different sizes.

>>>> Of course all this goes out of the window when Windows calling
>>>> convention is requested, sorry for pun.
>>>
>>> So what does SYS V ABI do that is so different?
>>
>> I don't remember SysV calling conventions.
>
> Funny that you can remember Win64 ones!

It is my job. I am writing a lot of code communicating with Windows and
Linux libraries.

>> Why do you care? What is different? Why would it shatter the earth
>> under you feet?
>
> You had a sly dig at Windows calling conventions. I asked you to clarify
> what is so bad about it compared with others, but now you don't care!

I never said that Windows conventions are especially bad. I said that
conventions have very little to do with what language does with
temporary objects. You are confusing them. Ada can use a lot of
different conventions and yet have string expressions with all of them.
Is that clear?

>>> And that string can be implemented in a dozen different ways
>>> depending on language.
>>
>> And integer can be implemented in two dozens of ways, maybe in three...
>
> Really, there is greater diversity in implementing a fixed width small
> integer than a string?

Sure. There exist hundreds of different integer encodings purposed for
different goals.

>>> You said, long ago, that dealing with strings is no harder than
>>> dealing with fixed-width integers.
>>>
>>> Take this example:
>>>
>>>      string S:="onetwothree"
>>>
>>>      case random(1..10)
>>>      when 1 then S := "ABC"
>>>      when 2 then S := T
>>>      when 3 then S := T.[10..20]
>>>      when 4 then S := S.[4..6]
>>>      when 5 then S := F()*10
>>>      esac
>>
>> This is illegal in Ada
>
> OK, so the only language that counts is Ada? I'm glad that point has
> been settled.

No, the point is that the example is irrelevant since you wondered how
it happens that Ada does not need heap for string expressions. The above
is not Ada and cannot be made Ada with Ada strings. Furthermore it is
not expression and not a call.

> Actually plenty of languages have mutable variables and allow
> conditional assignments of strings like the above.

Actually it is no problem to have conditional *expressions* and,
amazingly, it does not require a heap either. The following is a legal Ada
program:
-----------------------------------------
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Numerics.Discrete_Random;

procedure Test is
   type Choice is (Red, Black, Blue);
   package Random_Choices is new Ada.Numerics.Discrete_Random (Choice);
   use Random_Choices;

   procedure Foo (X : String) is
   begin
      Put_Line (X);
   end Foo;

   Seed : Generator;
   X    : constant String := "abcdefgh";
begin
   Reset (Seed);
   Foo
   ( (case Random (Seed) is
         when Red   => "red",
         when Black => X (2..6),
         when Blue  => "Hello " & X & "!")
   );
end Test;
-----------------------------------------

>> and irrelevant anyway. Try to stay focused on parameter passing
>> methods and managing temporary objects.
>
> 'S' was a temporary object;

No, S is a variable.

A temporary object is one created and destroyed by the language for the
purpose of evaluating an expression. The compiler's choice to create
any temporary objects shall not change the semantics of the expression.

>> The discussion was about the language features and your amazement that
>> it can have strings
>
> I've given plenty of examples where string data is not created or
> destroyed in a LIFO manner, thus requiring ad hoc allocations and
> allocations (so, a heap).

No, you claimed that

- it is impossible to return string from a subprogram without using heap
- it is impossible to pass a string to a subprogram without using heap
- it is impossible to have a string expression as an argument of a
subprogram without using heap
- it is impossible to use strings with Win32 API interfaces

All that is plain wrong, demonstrated by working examples.

> You have chosen to ignore the examples, or dismiss languages where that
> is routinely done.

Right, because they all are irrelevant to the claims you made.

Andy Walker

Jan 20, 2022, 1:01:07 PM

On 19/01/2022 01:42, anti...@math.uni.wroc.pl wrote:
[I wrote:]

>> Oh. Well, a language either provides heap storage or it
>> does not. Even C does, even if in an unsafe and rather primitive
>> way. The techniques involved aren't exactly cutting-edge recent.
> Well, there are shades of gray here. To explain, there is concern
> about small machines. Small means different things for various
> people, some think 256M is small. But I mean really small,

> think about 4k storage in total (program + data). [...]

> So I do not understand why James wants fancy Print on small
> systems. But desire to run without heap storage is IMHO
> quite resonable.

Yes, but that's a matter, for a given language/implementation,
of knowing which features use the heap. Presumably, a cross-compiler
targeting a tiny machine knows to avoid space-intensive features, or
at least to issue warnings and suggest work-arounds. Certainly in the
days when I was programming small machines, I knew what constructs to
avoid. Today, I no longer care. But then, my current PC [more than a
decade old] has more RAM than all previous machines that I used [inc
several university mainframes] put together; and my car [now 8yo] has
storage and compute power that an earlier generation could only have
boggled at. Of course, that brings about problems of bloat -- my
current mailer and browser are each a factor of more than 1000x as
large as their predecessors in the '90s, for no proportionate added
value [as far as I am concerned].

Yes, if heap techniques are too big and perhaps too slow for
James's purposes, then of course his new language and compiler should
avoid them. But I don't see that as being closely related to the
preferred style of I/O. Eg, A68R managed to implement very general
I/O facilities [formatted and unformatted] without using the heap.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music

Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Valentine

Bart

Jan 20, 2022, 3:13:32 PM

On 20/01/2022 14:35, Dmitry A. Kazakov wrote:

>>> Two integers might.
>>
>> No, why should they if they are the same size?
>
> No they don't. Integer_32 and Integer_16 have different sizes.

That's not an interesting distinction. For passing to functions, they
are likely to be either widened to 64 bits, or occupy the lower part of
a 64-bit value.

> Sure. There exist hundreds of different integer encodings purposed for
> different goals.

At the low level, there are very few encodings (mainly to do with
endianness, so on a specific target, there might be just one).

Strings however don't really exist at that level, so there could be
myriad ways of representing them.

> -----------------------------------------
> with Ada.Text_IO; use Ada.Text_IO;
> with Ada.Numerics.Discrete_Random;
>
> procedure Test is
>    type Choice is (Red, Black, Blue);
>    package Random_Choices is new Ada.Numerics.Discrete_Random (Choice);
>    use Random_Choices;
>
>    procedure Foo (X : String) is
>    begin
>       Put_Line (X);
>    end Foo;
>
>    Seed : Generator;
>    X    : constant String := "abcdefgh";
> begin
>    Reset (Seed);
>    Foo
>    ( (case Random (Seed) is
>          when Red   => "red",
>          when Black => X (2..6),
>          when Blue => "Hello " & X & "!")
>    );
> end Test;

(So why was my example illegal in Ada?)

This is not quite as challenging as my example, where T is a global, of
a length not known at compile-time, F()*10 also has a runtime length,
and S is set to refer to a substring within itself.

But also, I don't know the semantics for Ada strings. What happens in a
situation like this (I don't know Ada syntax so I will use a made-up one):

    function Bar => String is
        String X;
        if random()<0.5 then
            X := <compute some string at runtime>;
        else
            X := "ABC";
        end if
        return X;
    end Bar

In your code:

    Foo(Bar());

Bar returns the string stored in X, but X is a local that is destroyed
when it exits, so how does that string persist?

And I will ask again, how does it know when the string is done with, and
how does it know whether it is necessary to destroy the string and
recover its memory?
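
One technique that can answer this for the `Foo(Bar())` case without a general heap is a second, separately managed stack with mark and release operations; GNAT is understood to use a "secondary stack" along these lines for functions returning unconstrained results, though the sketch below is not its implementation. All names (`ss_mark`, `ss_alloc`, `ss_release`, `Result`, `bar`, `foo`, `call_foo_bar`) and the fixed size are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SS_SIZE 256
static char   ss_mem[SS_SIZE];  /* region for results of run-time size */
static size_t ss_top = 0;       /* first free byte in the region */

/* Remember the current top so everything above it can be dropped later. */
size_t ss_mark(void) { return ss_top; }

/* Drop every allocation made since the matching ss_mark (LIFO only). */
void ss_release(size_t mark)
{
    assert(mark <= ss_top);
    ss_top = mark;
}

/* Bump allocation: returns NULL when the region is exhausted. */
char *ss_alloc(size_t n)
{
    if (n > SS_SIZE - ss_top)
        return NULL;
    char *p = ss_mem + ss_top;
    ss_top += n;
    return p;
}

/* A string result: its data lives in the region, not in bar's frame. */
typedef struct { char *data; size_t len; } Result;

/* Hypothetical Bar: the result length is only known at run time. */
Result bar(int use_long)
{
    const char *src = use_long ? "computed at run time" : "ABC";
    Result r;
    r.len  = strlen(src);
    r.data = ss_alloc(r.len);
    assert(r.data != NULL);
    memcpy(r.data, src, r.len);
    return r;
}

/* Hypothetical Foo: consumes the string; here it just reports the length. */
size_t foo(Result s) { return s.len; }

/* What the caller does for Foo(Bar()): mark, evaluate, call, release. */
size_t call_foo_bar(int use_long)
{
    size_t mark = ss_mark();
    size_t n = foo(bar(use_long));
    ss_release(mark);   /* the temporary is gone; no free() was needed */
    return n;
}
```

The caller knows when the string is done with because the temporary's lifetime is tied to the enclosing call expression. If `foo` wanted to keep the string in a global, it would have to copy it somewhere longer-lived first, which is where the cost of by-value string semantics shows up.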

Suppose also that Foo() is changed to copy the string to a global.

Does Ada handle strings by value, which means it copies strings rather
than share them? This would make many things easier (what I used to do),
but is also less efficient.

>>> The discussion was about the language features and your amazement
>>> that it can have strings

First class strings are a heavy duty feature. You don't appreciate that
fact because your language does it all for you.

You claim it does it without using heap memory, so either you're not
telling me what restrictions that imposes, or you've carefully avoided
tricky cases in your examples.

>> I've given plenty of examples where string data is not created or
>> destroyed in a LIFO manner, thus requiring ad hoc allocations and
>> allocations (so, a heap).
>
> No you claimed that
>
> - it is impossible to return string from a subprogram without using heap
> - it is impossible to pass a string to a subprogram without using heap
> - it is impossible to have a string expression as an argument of a
> subprogram without using heap

These are all more difficult to do, and impose more language
restrictions, without using heap-like allocation.

I would find it almost impossible to implement interpreters for example
as I would not be able to model the chaotic allocation patterns of the
programs they run.

(Not actually impossible, as you would just implement a heap-like
allocator on top of a linear one.)

> - it is impossible to use strings with Win32 API interfaces

Win64 ABI doesn't concern itself with strings, as they depend on how the
language decides to implement them.

> All that is plain wrong, demonstrated by working examples.
>
>> You have chosen to ignore the examples, or dismiss languages where
>> that is routinely done.
>
> Right, because they all are irrelevant to the claims you made.

I notice you didn't respond to my bookshelf analogy. Everyone knows how
bookshelves work; few know the ins and outs of Ada. Therefore much
easier to hide any limitations.

Dmitry A. Kazakov

Jan 20, 2022, 4:25:22 PM

On 2022-01-20 21:13, Bart wrote:
> On 20/01/2022 14:35, Dmitry A. Kazakov wrote:
>
>> Sure. There exist hundreds of different integer encodings purposed for
>> different goals.
>
> At the low level, there are very few encodings (mainly to do with
> endianness, so on a specific target, there might be just one).

No, they all exist, since the language may use any of them. But you do not
understand the difference between programming languages and machine languages.

> Strings however don't really exist at that level, so there could be
> myriad ways of representing them.

Irrelevant but wrong. There exist machines with string operation
instructions, e.g. the DEC VAX.

>> -----------------------------------------
>> with Ada.Text_IO; use Ada.Text_IO;
>> with Ada.Numerics.Discrete_Random;
>>
>> procedure Test is
>>     type Choice is (Red, Black, Blue);
>>     package Random_Choices is new Ada.Numerics.Discrete_Random (Choice);
>>     use Random_Choices;
>>
>>     procedure Foo (X : String) is
>>     begin
>>        Put_Line (X);
>>     end Foo;
>>
>>     Seed : Generator;
>>     X    : constant String := "abcdefgh";
>> begin
>>     Reset (Seed);
>>     Foo
>>     ( (case Random (Seed) is
>>           when Red   => "red",
>>           when Black => X (2..6),
>>           when Blue => "Hello " & X & "!")
>>     );
>> end Test;
>
> (So why was my example illegal in Ada?)

Because it used a variable instead of an expression. I fixed that.

> But also, I don't know the semantics for Ada strings. What happens in a
> situation like this (I don't know Ada syntax so I will use a made-up one):
>
>     function Bar => String is
>         String X;
>         if random()<0.5 then
>           X := <compute some string at runtime>;
>         else
>           X := "ABC";
>         end if
>         return X;
>     end Bar

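This is illegal for the same reason. Correct code would be

-- Random is assumed here to be a parameterless function returning a
-- Float in 0.0 .. 1.0; it is not declared in this fragment.
function Bar return String is
begin
   if Random < 0.5 then
      return "(" & Bar & ")"; -- Recursively calls itself
   else
      return ""; -- End of recursion
   end if;
end Bar;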

Bar returns a pseudo-random sequence of nested balanced parentheses. No
heap used.

> Bar returns the string stored in X, but X is a local that is destroyed
> when it exits, so how does that string persist?

What persists? A variable, if there were one, does not survive. The value
does, in a platonic universe: it existed before the first computer was
invented and will continue to exist after all of them.

The effect of a string-valued function is, as the name suggests,
evaluation of a string value which becomes the value of some object at
the caller's discretion.

> And I will ask again, how does it know when the string is done with, and
> how does it know whether it is necessary to destroy the string and
> recover its memory?

All objects are destroyed at the end of their scopes.

> Suppose also that Foo() is changed to copy the string to a global.

It could not, unless the lengths were the same, if assigning strings is
what you meant.

> Does Ada handle strings by value, which means it copies strings rather
> than share them?

Strings are exactly like integers with the only difference that the
compiler is free to choose between by-reference and by-value passing.

Assigning a string implies copying its body or parts of it, but not the
bounds, as they are the object's constraint.

>>>> The discussion was about the language features and your amazement
>>>> that it can have strings
>

> You claim it does it without using heap memory, so either you're not
> telling me what restrictions that imposes, or you've carefully avoided
> tricky cases in your examples.

You cannot change any of the object constraints.

- Fixed strings have bounds as the constraint.
- Bounded strings have the maximum length as the constraint.
- Unbounded strings have no constraint, and thus, must use the heap.

Fixed and bounded strings do not use the heap.
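
A rough C analogue of the bounded case, as a sketch only: the maximum length is fixed at compile time, so a string value can be passed and returned by value without any heap allocation. This says nothing about how an Ada compiler actually implements its strings; the names `BStr`, `bstr_from`, `bstr_cat`, `nest` and the limit `BSTR_MAX` are hypothetical. The recursive `nest` mirrors the nested-parentheses example above, with a fixed depth instead of a random one.

```c
#include <assert.h>
#include <string.h>

#define BSTR_MAX 63   /* hypothetical maximum length: this is the constraint */

/* Bounded string: capacity is fixed, the current length may vary. */
typedef struct {
    size_t len;
    char   data[BSTR_MAX + 1];   /* +1 for a terminating NUL */
} BStr;

/* Build a bounded string from a C string, truncating past BSTR_MAX. */
BStr bstr_from(const char *s)
{
    BStr r;
    assert(s != NULL);
    r.len = strlen(s);
    if (r.len > BSTR_MAX)
        r.len = BSTR_MAX;
    memcpy(r.data, s, r.len);
    r.data[r.len] = '\0';
    return r;                    /* returned by value: no malloc involved */
}

/* Concatenate two bounded strings, truncating at capacity. */
BStr bstr_cat(BStr a, BStr b)
{
    size_t room = BSTR_MAX - a.len;
    size_t n = (b.len < room) ? b.len : room;
    memcpy(a.data + a.len, b.data, n);
    a.len += n;
    a.data[a.len] = '\0';
    return a;
}

/* Nested balanced parentheses to the given depth, built recursively. */
BStr nest(unsigned depth)
{
    if (depth == 0)
        return bstr_from("");
    return bstr_cat(bstr_cat(bstr_from("("), nest(depth - 1)),
                    bstr_from(")"));
}
```

The price is the one named above: the constraint cannot change, so a result longer than `BSTR_MAX` is silently truncated here, and every value occupies the full capacity whatever its length.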

>> No you claimed that
>>
>> - it is impossible to return string from a subprogram without using heap
>> - it is impossible to pass a string to a subprogram without using heap
>> - it is impossible to have a string expression as an argument of a
>> subprogram without using heap
>
> These are all more difficult to do, and impose more language
> restrictions, without using heap-like allocation.

None of the above is a restriction. The opposite is, like your language
restricting you to no strings or to an obligatory heap. In Ada you have a
free choice of whatever best suits the problem at hand.

> I would find it almost impossible to implement interpreters for example
> as I would not be able to model the chaotic allocation patterns of the
> programs they run.

That is because you do not understand how a more or less advanced type
system works and why there is no chaos at all.

>> All that is plain wrong, demonstrated by working examples.
>>
>>> You have chosen to ignore the examples, or dismiss languages where
>>> that is routinely done.
>>
>> Right, because they all are irrelevant to the claims you made.
>
> I notice you didn't respond to my bookshelf analogy.

Because it is a wrong analogy. The right one is the poles in the Tower
of Hanoi game with the difference that instead of moving rings one moves
the poles. Though one could move the rings instead, in some rather
inefficient implementation.

Andy Walker

Jan 20, 2022, 6:17:38 PM

On 19/01/2022 11:40, Bart wrote:
> The sort of tagged unions /I/ would want, would need a tag value that
> is a global enum.

That makes it [unnecessarily] hard to compile independent
modules. But perhaps I've misunderstood your notion of "global"?

> Different cases could also have the same type.

As they can in Algol [RR3.4.2cC, see the comment]. But of
course the consequence is that the choice of which case is chosen
is then undefined.

[...]

>> Rowing: Well, you're on better ground here, as the rowing
>> coercion is one of the features of Algol that has been touted as
>> something that perhaps ought not to have been included. But it's
>> only syntactic sugar, so it's easily worked around if your language
>> doesn't have it.
> It's a type issue; such a feature reduces type safety. It stops a
> language detecting the use of a scalar rather than a list, which could
> be an error on the user's part.

It could, though it's an unlikely error, and the construct
saves a lot of extra work for the user. [Eg, "print (x)" would be
an error without rowing, as the parameter to "print" is an array of
a union of printable types and layout routines, so you would have
to write something like

([1] UNION (OUTTYPE, PROC (REF FILE) VOID) y; y[1] := x; print (y))

except that "OUTTYPE" is a secret type known only to the compiler.]
IRL, I expect some different syntactic sugar would be provided for
such a common case.
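
For readers who don't know Algol 68, the shape of this can be sketched in C, under the assumption that a print routine takes an array of tagged items. Here `Item`, `print_items` and `print_one` are hypothetical names, not part of any real library, and output goes to a caller-supplied buffer so the result can be inspected. `print_one` does by hand what the rowing coercion does implicitly: it wraps a single value as a one-element array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef enum { ITEM_INT, ITEM_REAL, ITEM_TEXT } ItemTag;

/* One printable item: a tag plus a union of the printable types. */
typedef struct {
    ItemTag tag;
    union {
        long        i;
        double      r;
        const char *s;
    } u;
} Item;

/* Render n items into buf, separated by single spaces.
   Returns the number of characters written, or -1 if buf is too small. */
int print_items(char *buf, size_t cap, const Item *items, size_t n)
{
    size_t used = 0;

    assert(buf != NULL && cap > 0);
    buf[0] = '\0';
    for (size_t k = 0; k < n; k++) {
        const char *sep = (k > 0) ? " " : "";
        int w = -1;

        switch (items[k].tag) {
        case ITEM_INT:
            w = snprintf(buf + used, cap - used, "%s%ld", sep, items[k].u.i);
            break;
        case ITEM_REAL:
            w = snprintf(buf + used, cap - used, "%s%g", sep, items[k].u.r);
            break;
        case ITEM_TEXT:
            w = snprintf(buf + used, cap - used, "%s%s", sep, items[k].u.s);
            break;
        }
        if (w < 0 || (size_t)w >= cap - used)
            return -1;           /* formatting error or truncated output */
        used += (size_t)w;
    }
    return (int)used;
}

/* Manual "rowing": wrap one item as a one-element array. */
int print_one(char *buf, size_t cap, Item x)
{
    Item row[1];
    row[0] = x;
    return print_items(buf, cap, row, 1);
}
```

Without a wrapper like `print_one`, every caller printing a single value would have to build the one-element array itself, which is the extra work the coercion saves.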

>> Inefficient? ???
> Yeah. Earlier discussion touched on the inefficiency of turning
> something into an explicit string before printing (say, an entire
> array, instead of one element at time).

The point wasn't that you /had/ to turn things into explicit
strings but that /if/ all you had was single-character output, then
it was possible to do so. But in fact A68 [RR10.3.3.1a] defines
unformatted output in exactly that way, calling "putchar" [RR10.3.3.1b]
to perform the actual transput. Note that "putchar" is a complicated
piece of code, as it is one of the primary interfaces to the OS.
As ever, compilers are free to do things in a more efficient way if
appropriate [and I expect A68G in particular to do so, as it can assume
a Linux-ish OS].

> Here, you're turning a set of N print-items into an array, so that it
> can traverse them, deal with them, then discard the array.

And the time spent doing this, compared with (a) the sum of
the times taken to deal with the N items one at a time, and (b) the
time, in a real application, taken to construct the N items in the
first place is? 0.1% of the total running time of the program? It's
a load of fuss about nothing.

>> Half-working?
> Well, how big a UNION would be needed for all possible printable
> types? My static language limits them, but could easily decide to
> print arrays and user-defined records if it wants. The diversity is
> handled within the compiler.

Algol already handles that diversity, including arrays and
structure types. The question was whether it is worthwhile to
add the whole panoply of formats [15% of the RR, a similar fraction
of Algol's syntax, probably a similar fraction of the required parts
of the system library] to add a relatively small utility. It's too
late to "rescue" Algol or C from this; but James is developing a
new language.

> It's also missing per-item formatting codes. A solution using only
> user-code, even in Algol68, is unwieldy and ugly.

Yes. So is formatted transput in general. The degree of
ugliness is comparable, however it's done. Simple things are easy,
as demonstrated; complicated things are complicated; and formats
require users to learn syntax and semantics that add seriously to
the difficulty of learning and using the language.

>> "Local static variables" were in Algol 60, were problematic
>> in detailed specification and were therefore dropped in Algol 68 [RR
>> 0.2.6f]. Some of the reasons are expanded in the well-known book on
>> "Algol 60 Implementation" by Randell and Russell, esp in relation to
>> "own" arrays. The effects are easy to obtain in other ways. [Most
>> other prominent languages don't have them either.]
> It's not clear what is problematic about them, other than making
> functions impure.

They're hard to implement if they are dynamic arrays, they
need special initialisation, and they're difficult to specify in
a rigorous way. The added utility is very little. Library routines
can use the "letter aleph" dodge [cf C's "__"] to prevent abuse;
user code can circumvent their limitations anyway, so it becomes a
matter of coding standards.

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music

Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Valentine

Bart

Jan 20, 2022, 7:28:33 PM

On 20/01/2022 23:17, Andy Walker wrote:
> On 19/01/2022 11:40, Bart wrote:
>> The sort of tagged unions /I/ would want, would need a tag value that
>> is a global enum.
>
> That makes it [unnecessarily] hard to compile independent
> modules. But perhaps I've misunderstood your notion of "global"?

Typically it might be used like this:

record unitrec_etc =
byte tag # or opcode, opndtype etc
...
end

'tag' is an instance of some global enumeration: token type, AST mode
type, IL or target language opcode.

Or, as the tagged data type of an interpreter, it could represent a
typecode of /the language being interpreted/, not of the language being
used to write the interpreter; so the latter's types are not relevant at
all.

It is anyway something that is used across the program. It is used here
to control how parts of the record might be interpreted, and may be used
in other records too. This is up to user-code to implement.

>> Here, you're turning a set of N print-items into an array, so that it
>> can traverse them, deal with them, then discard the array.

> And the time spent doing this, compared with (a) the sum of
> the times taken to deal with the N items one at a time, and (b) the
> time, in a real application, taken to construct the N items in the
> first place is? 0.1% of the total running time of the program? It's
> a load of fuss about nothing.

You don't know that. If the output is to a string or into a file buffer,
then the process of stringifying can be fast if it isn't dominated by the
display response time. Then this 'rowing' can be significant.

However I've done a test, and the results are interesting. This is with
dynamic code with 1 million iterations of this, directed to a file:

println a, b, c, d # took 1.8 seconds
println (a, b, c, d) # took 0.9 seconds

So it's faster! But this might be to do with the per-item processing
being done inside the interpreter, not in interpreted bytecode.

If the loop was instead this:

F(a, b, c, d) # F contains println a,b,c,d

it was 1.8 seconds again, but the following that corresponds more with
your approach:

F((a, b, c, d)) # F iterates over a list and prints
# one at a time + discrete space

was 2.8 seconds.

> late to "rescue" Algol or C from this; but James is developing a
> new language.

Actually I've just started a new one, which may or may not go anywhere.

It doesn't touch any of the things recently discussed, as I don't see
anything wrong with them!

Mainly it's an attempt to combine my two languages because it's annoying
that a feature you need on one only works in the other. They can work
together, but...

> Yes. So is formatted transput in general. The degree of
> ugliness is comparable, however it's done. Simple things are easy,
> as demonstrated; complicated things are complicated; and formats
> require users to learn syntax and semantics that add seriously to
> the difficulty of learning and using the language.

Python replaced the 'print' statement with the 'print' function between
Py2 and Py3.

Control over spacing and newlines is done with optional keyword
parameters, which I can never remember.

You always have to know and remember something! With mine, once you know
that it has print and println, that part at least is taken care of.

And if you can't remember how to suppress the automatic space between A
and B here:

print A, B

you will probably remember that the space is only between items of the
same print, or for each comma, so that you have the choice of writing:

print A
print B

to get your work done without having to find and search the docs. This
trick won't work with Python until you can figure out how to hold the
newline.

Bart

Jan 21, 2022, 6:46:06 AM

On 20/01/2022 21:25, Dmitry A. Kazakov wrote:

OK. But if I've got a pile of books on one pole (using a hole near one
corner to minimise the inconvenience), how do I remove a book in the
middle of the pile?

An analogy has to correspond to what people actually want to do.

Here's a very similar one which is about actual code:

* You have a list of strings (could be an editor where each string = one
of an ordered set of lines; or could be anything)

* Let's say the implementation is a linear list of fixed-length
descriptors (eg. pointers), with the string data itself stored elsewhere
(say, in a LIFO stack)

* Now you have the same requirement as the bookshelf: you want to delete
one of those strings, and the accompanying string data

* Here it diverges from the bookshelf, as you probably want to close the
gap, which means moving those descriptors. But it means moving the
string data too - inefficient

* Or we can say that you want to replace one of those strings with a
different string, either shorter or longer.

To retain the LIFO structure of the string data, you will still need to
move all those descriptors and string data.

Now suppose you want to reverse the order of all those strings. Or
randomise them. Using fixed-length descriptors, you can very easily
reverse or randomise that list of descriptors. (Or maybe you just want to
swap the order of two lines.)

You don't need to touch the string data in its stack-like structure.
However it is now no longer a stack; it's more of a heap, as the
ordering is chaotic.
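
A minimal sketch of that layout, assuming a small fixed-capacity store (all names, `Desc`, `add_line`, `reverse_lines`, `line_text`, are hypothetical): string bytes are pushed onto a linear store in arrival order, while a separate array of fixed-length descriptors records where each string lives. Reversing the list swaps descriptors only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STORE_BYTES 256
#define MAX_LINES   16

static char   store[STORE_BYTES];   /* string data, appended in LIFO order */
static size_t store_top = 0;

typedef struct { size_t off, len; } Desc;   /* fixed-length descriptor */

static Desc   lines[MAX_LINES];
static size_t nlines = 0;

/* Push a string's bytes on top of the store and append its descriptor.
   Returns 0 on success, -1 if either the store or the list is full. */
int add_line(const char *s)
{
    size_t n = strlen(s);
    if (nlines == MAX_LINES || n + 1 > STORE_BYTES - store_top)
        return -1;
    memcpy(store + store_top, s, n + 1);    /* copy including the NUL */
    lines[nlines].off = store_top;
    lines[nlines].len = n;
    nlines++;
    store_top += n + 1;
    return 0;
}

/* Reverse the list by swapping descriptors; the string bytes stay put. */
void reverse_lines(void)
{
    if (nlines < 2)
        return;
    for (size_t i = 0, j = nlines - 1; i < j; i++, j--) {
        Desc t = lines[i];
        lines[i] = lines[j];
        lines[j] = t;
    }
}

/* Text of the k-th line in list order. */
const char *line_text(size_t k)
{
    assert(k < nlines);
    return store + lines[k].off;
}
```

After `reverse_lines`, the first line in list order sits at the highest address in the store, so list order and storage order no longer agree. Deleting or resizing a line in the middle would then leave a hole that a pure stack discipline cannot reclaim.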