Printing beyond printf

James Harris

Dec 27, 2021, 9:01:19 AM
I've loads of other messages to get back to but while I think of it I'd
like to post a suggestion for you guys to shoot down in flames. ;-)

The printf approach to printing is flexible and fast at rendering
inbuilt types - probably better than anything which came before it - but
it's not perfect. In particular, it means that the code inside printf
which does the rendering needs to know about all types it may be asked
to render in character form. But there are many other types. E.g. a
programmer could want a print routine to render a boolean, an integer, a
float, a record, a table, a list, a widget, etc.

So here's another potential approach. What do you think of it?

The idea is, as with the printf family, to have a controlling string
where normal characters are copied verbatim and special fields are
marked with a % sign or similar. The difference is what would come after
the % sign and how it would be handled.

What I am thinking of is a format specification something like

%EB;

where "E" is a code which identifies the rendering engine, "B" is the
body of the format and ";" marks the end of the format and a return to
normal printing.

The /mechanical/ difference is that rather than the print function doing
all the formatting itself it would outsource any it didn't know. For
outsourcing, the rendering engine would be sent both the value to be
printed AND a pointer to the format string.

As for where rendering engines could come from:

* Some rendering engines could be inbuilt.
* Some could be specified earlier in the code.
* Some could be supplied in the parameter list (see below).

What would the formats look like? As some examples:

%i; a plain, normal, signed integer
%iu; a plain, normal, unsigned integer
%iu02x; a 2-digit zero-padded unsigned hex integer
%Kabc; a type K unknown to the print function

The latter would need the print function to have been previously told
about a rendering engine for "K". The print function would pass to the
rendering engine the format specification and a pointer to the value.
Finally,

%*abc; a dynamic render

The * would indicate that the address of the rendering engine had been
supplied as a parameter as in

printit("This is %*abc;, OK?", R, v)

where R is the rendering engine and v is the value to be rendered
according to the specification abc.
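
A minimal sketch of the scheme described above, written in Python purely for illustration. The names `ENGINES`, `register`, `render_int` and the explicit `out` parameter are assumptions made for the sketch, not part of the proposal, and the toy integer engine understands only a small subset of the example formats (Python integers have no signedness, so "u" is accepted and ignored).

```python
import io

ENGINES = {}  # hypothetical registry: one-character code -> rendering engine


def register(code, engine):
    """Associate a one-character code with an engine taking (body, value, out)."""
    ENGINES[code] = engine


def render_int(body, value, out):
    # Toy inbuilt engine: optional "u", optional "0" pad flag and width, optional "x".
    spec = body.lstrip("u")
    base = "x" if spec.endswith("x") else "d"
    out.write(format(value, spec.rstrip("x") + base))


register("i", render_int)


def printit(out, control, *args):
    """Copy normal characters verbatim; hand each %EB; field to engine E."""
    params = iter(args)
    pos = 0
    while pos < len(control):
        ch = control[pos]
        if ch != "%":
            out.write(ch)
            pos += 1
            continue
        end = control.index(";", pos)            # ";" marks the end of the format
        code, body = control[pos + 1], control[pos + 2:end]
        # "*" means the engine itself precedes the value in the parameter list.
        engine = next(params) if code == "*" else ENGINES[code]
        engine(body, next(params), out)
        pos = end + 1


def render_point(body, value, out):
    # Hypothetical user-supplied engine; it ignores the body "abc".
    out.write("<%d,%d>" % value)


buf = io.StringIO()
printit(buf, "This is %*abc;, OK? %iu02x;", render_point, (1, 2), 10)
```

In this sketch the print function never inspects the body of a field it does not own; it only finds the closing ";" and passes the body along.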

That's it. It's intended to be convenient, efficient, flexible and about
as simple as possible to use. Whether it achieves that is up for debate.

Thoughts/opinions? Is there a better approach to the formatted printing
of arbitrary types?


--
James Harris

Bart

Dec 27, 2021, 10:49:40 AM
Look at that last example. You have to give it three things other than
the surrounding context: v, R and *abc.

At the simplest, you want to do it with just v. Then:

* It will apply the renderer previously associated with that user-type

* If there isn't one, it will use a default rendering for that
generic type (array, struct etc)

* Or it can say it doesn't know how to do it, or just prints a reference
to v

Next simplest is when you specify some parameters to control the
formatting. How these work depends on what it does above.

Supplying a renderer each time you print I would say is not the simplest
way of doing it! If you're going to do that, you might as well do:

printit("This is %s; OK?", R(v,"abc"))

Then no special features are needed. Except that if R returns a string,
then you need some means of disposing of that string after printing, but
there are several ways of dealing with that.

> %i; a plain, normal, signed integer
> %iu; a plain, normal, unsigned integer
> %iu02x; a 2-digit zero-padded unsigned hex integer
> %Kabc; a type K unknown to the print function

This is too C-like. C-style formats have all sorts of problems
associated with hard-coding a type-code into a format string:

* What is the code for arbitrary expression X?
* What will it be when X changes, or the type of the terms change?
* What is it for clock_t, or some other semi-opaque type?
* What is it for uint64_t? (Apparently, it is PRIu64 - a macro that
expands to a string)

If your Print function is implemented as a regular user-code function,
which your language knows nothing about, then you will need some scheme
which imparts type information to the function, as well as a way of
dealing with such variadic parameter lists.

But if the language does know about Print, which would be the more modern
way of doing it, then the compiler already knows the types involved. Then
the formatting is only about the display, so for an integer:

* Perhaps override signed to unsigned
* Plus sign
* Width
* Justification
* Zero-padding (and padding character)
* Base
* Separators (and grouping)
* Prefix and/or suffix
* Upper/lower case (digits A-F)

Or maybe, this number represents certain quantities (eg. a day of the
week), which will need displaying in a special way. If you're not using
a special type for that, then here it will need a way to override that,
perhaps using an R() function.

You should also look at how the current crop of languages do it. They
also still tend to use format strings, and some like to put the
expressions to be printed inside the format string.

-----------

(My approach in dynamic code is that there is an internal function
tostr(), fully overloaded for different types, with optional format
data, that is applied to Print items. So that:

print a, b, c:"h" # last bit means hex

is the same as:

print tostr(a), tostr(b), tostr(c, "h")

There is a crude override mechanism, which links 'tostr' and a type T,
to a regular user-code function F.

Then, when printing T, it will call F().

In static code, this part is poorly developed. But Print (which is again
known to the language as it is a statement), can deal with regular
types, including most of those options for integers:


print a:"z 8 h s_" # leading zeros, 8-char field, hex, "_" separator

)
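
A rough Python sketch of a `tostr`-style converter with order-independent option codes and a per-type override table. Everything here is an assumption made for illustration: the `OVERRIDES` table, the helper `group`, the grouping sizes (3 digits for decimal, 4 for hex) and the choice to pad after grouping. It is not the actual implementation referred to above.

```python
OVERRIDES = {}  # hypothetical table: type -> user function F(value, fmt) -> str


def group(digits, sep, size):
    """Join groups of `size` digits with sep, counting from the right."""
    head = len(digits) % size or size
    parts = [digits[:head]]
    parts += [digits[i:i + size] for i in range(head, len(digits), size)]
    return sep.join(parts)


def tostr(value, fmt=""):
    """Convert value to text; the option codes in fmt may come in any order."""
    if type(value) in OVERRIDES:              # user override for this type
        return OVERRIDES[type(value)](value, fmt)
    if not isinstance(value, int):
        return str(value)
    base, upper, zero, width, sep = 10, False, False, 0, ""
    i = 0
    while i < len(fmt):
        c = fmt[i]
        if c in ("h", "H"):
            base, upper = 16, c == "H"
        elif c == "z":
            zero = True
        elif c == "s" and i + 1 < len(fmt):
            i += 1
            sep = fmt[i]                      # the character after "s"
        elif c.isdigit():
            width = width * 10 + int(c)
        i += 1
    digits = format(abs(value), "x" if base == 16 else "d")
    if upper:
        digits = digits.upper()
    if sep:
        digits = group(digits, sep, 4 if base == 16 else 3)
    sign = "-" if value < 0 else ""
    pad = max(0, width - len(sign) - len(digits))
    if zero:
        return sign + "0" * pad + digits
    return " " * pad + sign + digits
```

Because each code is recognised on its own, `tostr(n, "z 8 h")` and `tostr(n, "h 8 z")` give the same result in this sketch.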

James Harris

Dec 27, 2021, 12:26:36 PM
On 27/12/2021 15:49, Bart wrote:
> On 27/12/2021 14:01, James Harris wrote:

...

>> What would the formats look like? As some examples:
>>
>>    %i;      a plain, normal, signed integer
>>    %iu;     a plain, normal, unsigned integer
>>    %iu02x;  a 2-digit zero-padded unsigned hex integer
>>    %Kabc;   a type K unknown to the print function

...

>>    %*abc;   a dynamic render
>>
>> The * would indicate that the address of the rendering engine had been
>> supplied as a parameter as in
>>
>>    printit("This is %*abc;, OK?", R, v)
>>
>> where R is the rendering engine and v is the value to be rendered
>> according to the specification abc.

...

> Look at that last example. You have to give it three things other than
> the surrounding context: v, R and *abc.

Yes, though this is for /formatted/ output so that's not surprising. And
the last example was in many ways the worst-case scenario.

>
> At the simplest, you want to do it with just v. Then:
>
> * It will apply the renderer previously associated with that user-type
>
> * If it there isn't one, it will use a default rendering for that
>   generic type (array, struct etc)
>
> * Or it can say it doesn't know how to do it, or just prints a reference
>   to v
>
> Next simplest is when you specify some parameters to control the
> formatting. How these work depends on what it does above.
>
> Supplying a rendere each time you print I would say is not the simplest
> way of doing it! If you're going to do that, you might as well do:
>
>    printit("This is %s; OK?", R(v,"abc"))
>
> Then no special features are needed. Except that if R returns a string,
> then you need some means of disposing of that string after printing, but
> there are several ways of dealing with that.

Dealing with memory is indeed a problem with that approach. The %s could
be passed a string which needs to be freed or one which must not be
freed. One option is

%s - a string which must not be freed
%M - a string which must be freed

>
> >    %i;      a plain, normal, signed integer
> >    %iu;     a plain, normal, unsigned integer
> >    %iu02x;  a 2-digit zero-padded unsigned hex integer
> >    %Kabc;   a type K unknown to the print function
>
> This is too C-like. C-style formats have all sorts of problems
> associated with hard-coding a type-code into a format string:
>
>   * What is the code for arbitrary expression X?

It would have to be something to match the type of X.

>   * What will it be when X changes, or the type of the terms change?

The format string would need to be changed to reflect the type change.

>   * What is it for clock_t, or some other semi-opaque type?

Perhaps %sdhh:mm:ss.fff; where d indicates datetime.

>   * What is it for uint64_t? (Apparently, it is PRId64 - a macro that
>     expands to a string)

How about %u; with the renderer being told the width of the type? Or
%u64; with the renderer not needing to know the width of the type?

Or with comma digit separators every 3 places

%u64:s,:3;

Meaning unsigned 64, separator "," every 3 places.

I'm not too worried about the specific format codes. Someone must have
already come up with a set of them which could be used. The main thing
is that any format code would be understood by the programmer and by the
formatter and that it would be clear to everyone where the format code
ended.

That said, it would make sense for the elements of the format string to
appear in some sort of logical order - possibly the order in which they
would be needed by the renderer.

>
> If your Print function is implemented as a regular user-code function,
> which your language knows nothing about, then you will need some scheme
> which imparts type information to the function, as well as a way of
> dealing with such variadic parameter lists.

Agreed.

>
> But if it does, which would be a more modern way of doing so, then the
> compiler already knows the types involved. Then the formatting is about
> the display, so for an integer:
>
>   * Perhaps override signed to unsigned
>   * Plus sign
>   * Width
>   * Justification
>   * Zero-padding (and padding character)
>   * Base
>   * Separators (and grouping)
>   * Prefix and/or suffix
>   * Upper/lower case (digits A-F)

It's a good list. It's amazing how many of those choices printf hits in
a short space.

>
> Or maybe, this number represents certain quanties (eg. a day of the
> week), which will need displaying in a special way. If you're not using
> a special type for that, then here it will need a way to override that,
> perhaps using an R() function.

OK.

>
> You should also look at how the current crop of languages do it. They
> are also still tend to use format strings, and some like to put the
> expressions to be printed inside the format string.

Maybe unfairly I have an antipathy to copying other languages but maybe
in this case it would be useful. Are there any you would recommend?

Incidentally, my real goal is to have the ability to output in a
self-describing 'binary stream' format rather than necessarily
converting to text but that's a subject in itself and would require
external support. Text will have to do for now!

>
> -----------
>
> (My approach in dynamic code is that there is an internal function
> tostr(), fully overloaded for different types, with optional format
> data, that is applied to Print items. So that:
>
>    print a, b, c:"h"              # last bit means hex
>
> is the same as:
>
>    print tostr(a), tostr(b), tostr(c, "h")

Maybe that's better: the ability to specify custom formatting on any
argument. I presume that's not just available for printing, e.g. you
could write

string s := c:"h"

and that where you have "h" you could have an arbitrarily complex format
specification.

There looks to be a potential issue, though. In C one can build up the
control string at run time. Could you do that with such as

string fmt := format_string(....)
s := c:(fmt)

?

>
> There is a crude override mechanism, which links 'tostr' and a type T,
> to a regular user-code function F.
>
> Then, when printing T, it will call F().
>
> In static code, this part is poorly developed. But Print (which is again
> known to the language as it is a statement), can deal with regular
> types, including most of those options for integers:
>
>
>     print a:"z 8 h s_"     # leading zeros, 8-char field, hex, "_"
> separator
>
> )

That's the spirit! ;-)


--
James Harris

Bart

Dec 27, 2021, 2:05:39 PM
On 27/12/2021 17:26, James Harris wrote:
> On 27/12/2021 15:49, Bart wrote:

>>     printit("This is %s; OK?", R(v,"abc"))
>>
>> Then no special features are needed. Except that if R returns a
>> string, then you need some means of disposing of that string after
>> printing, but there are several ways of dealing with that.
>
> Dealing with memory is indeed a problem with that approach. The %s could
> be passed a string which needs to be freed or one which must not be
> freed. One option is
>
>   %s - a string which must not be freed
>   %M - a string which must be freed

This has similar problems to hardcoding a type. Take this:

printit("%s", F())

F returns a string, but is it one that needs freeing or not? That
depends on what happens inside F. Whatever you choose, if the
implementation of F later changes, then 100 formats have to change too?

This is a more general problem of memory-managing strings. It's more
useful to be able to solve it for the language, then it will work for
Print too.

(Personally, I don't get involved with this at all, not in low level code.

Functions that return strings generally return a pointer to a local
static string that needs to be consumed ASAP. Or sometimes there is a
circular list of them to allow several such calls per Print. It's not
very sophisticated, but that's why the language is low level.)

If you want a more rigorous approach, perhaps try this:

printit("%H", R(v, "abc"))

H means handler. R() is not the handler itself, but returns a descriptor
(eg. X) that contains a reference to a handler function, and references
to those two captured values (a little like a lambda or closure I think).

The Print handler then calls X() with a parameter to request the value.
And can call it again with another parameter to free it. Or perhaps it
can be used to iterate over the characters, or sets of strings.

The latter might be more suitable when you have a 1-billion element
array to print (eg. to a file), and you don't want or need to generate
one giant string in one go.

But this starts to get into generators and iterators. Perhaps it is
too advanced an approach if your language is anything like mine.
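
For what it's worth, the descriptor idea can be sketched in Python, where a closure plays the part of X and a generator supplies the output in pieces. The chunking scheme and the use of Python's own format specs for "fmt" are assumptions for illustration only; here `printit` takes an explicit output stream and recognises nothing but "%H".

```python
import io


def R(values, fmt):
    """Return a descriptor: a zero-argument callable capturing values and fmt.

    Calling it starts a generator that yields the output piece by piece.
    """
    def chunks():
        yield "["
        for index, item in enumerate(values):
            if index:
                yield ", "
            yield format(item, fmt)
        yield "]"
    return chunks


def printit(out, control, *args):
    """Minimal control string: only "%H" (handler) is recognised."""
    params = iter(args)
    pieces = control.split("%H")
    out.write(pieces[0])
    for literal in pieces[1:]:
        for chunk in next(params)():   # pull chunks; no giant string is built
            out.write(chunk)
        out.write(literal)


buf = io.StringIO()
printit(buf, "v = %H!", R(range(3), "02d"))
```

Nothing is rendered when `R(...)` is evaluated; the work happens only when the print routine pulls chunks from the descriptor, which is what would make a very large array practical.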


>>
>>  >    %i;      a plain, normal, signed integer
>>  >    %iu;     a plain, normal, unsigned integer
>>  >    %iu02x;  a 2-digit zero-padded unsigned hex integer
>>  >    %Kabc;   a type K unknown to the print function
>>
>> This is too C-like. C-style formats have all sorts of problems
>> associated with hard-coding a type-code into a format string:
>>
>>    * What is the code for arbitrary expression X?
>
> It would have to be something to match the type of X.
>
>>    * What will it be when X changes, or the type of the terms change?
>
> The format string would need to be changed to reflect the type change.
>
>>    * What is it for clock_t, or some other semi-opaque type?
>
> Perhaps %sdhh:mm:ss.fff; where d indicates datetime.

The problem with doing it in C is that clock_t could be u32, u64, i32,
i64 or even a float type; what number format to use? It's not a string
anyway; you can turn it into one, but then that's my DOW example.

>>    * What is it for uint64_t? (Apparently, it is PRId64 - a macro that
>>      expands to a string)

Again this is for C; the problem being that a format string should not
need to include type information:

* The compiler knows the type
* You may not know the type (eg. clock_t)
* You may not know the format needed (eg. uint64_t)
* You don't want to have to maintain 1000 format strings as
expressions and types of variables change

> That said, it would make sense for the elements of the format string to
> appear in some sort of logical order - possibly the order in which they
> would be needed by the renderer.

But then somebody has to remember them! I ensure the order doesn't matter.



> Maybe unfairly I have an antipathy to copying other languages but maybe
> in this case it would be useful. Are there any you would recommend?

I have the same approach to other languages. Generally I find their
print schemes over-elaborate, so tend to do my own thing. Yet they also
have to solve the same problems.

>> (My approach in dynamic code is that there is an internal function
>> tostr(), fully overloaded for different types, with optional format
>> data, that is applied to Print items. So that:
>>
>>     print a, b, c:"h"              # last bit means hex
>>
>> is the same as:
>>
>>     print tostr(a), tostr(b), tostr(c, "h")
>
> Maybe that's better: the ability to specify custom formatting on any
> argument. I presume that's not just available for printing, e.g. you
> could write
>
>   string s := c:"h"
>
> and that where you have "h" you could have an arbitrarily complex format
> specification.

Well, ":" is specifically used in print-item lists (elsewhere it creates
key:value pairs). I would write your example in dynamic code as one of:

s := sprint(c:"h") # sprint is special, like print
s := tostr(c,"h") # tostr is a function-like operator

(sprint can turn a list of items into one string; tostr does one at a
time, although that one item can be arbitrarily complex.)

In my cruder static code, it might be:

[100]char str
print @str, c:"h"

> There looks to be a potential issue, though. In C one can build up the
> control string at run time. Could you do that with such as
>
>   string fmt := format_string(....)
>   s := c:(fmt)
>
> ?

Sure, what comes after ":" is just any string expression:

ichar fmt = (option=1 | "Hs_" | "s,")

print 123456:fmt # displays 1_E240 or 123,456

(Separator grouping is 3 digits decimal; 4 digits hex/binary.)


I've exercised my print formatting recently and found some weak areas,
to do with tabulation. Getting things lined up in columns is tricky,
especially with a header.

I do have formatted print which looks like this:

fprint "#(#, #) = #", fnname, a, b, result

If I want field widths, they are written as:

fprint "#: # #", a:w1, b:w2, c:w3

where w1/w2/w3 are "12" etc. Here, the first problem is a disconnect
between each #, and the corresponding print item. This is why some
languages bring them inside.

But the main thing here is that I don't get a sense of what it looks
like until I run the program. Something I've seen in the past would look
a bit like:

fprint "###: ####### #############", a, b, c

The widths are the number of # characters. That same string could be
used for headings:

const format = "###: ####### #############"

fprint format, "No.", "Barcode", "Description"
....
fprint format, i, item[i].code, item[i].descr

I haven't implemented this, it's just an idea. This is an actual example of
the kind of output I'm talking about, but done the hard way by trial and
error:

    Type      Seg   Offset    Symbol/Target+Offset
-------------------------------------------------------
 1: imprel32  code  00000024  MessageBoxA
 2: locabs64  code  00000015  idata 02E570B0
 3: locabs64  code  0000000B  idata 02E570B6
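
The picture-style format string could be sketched like this in Python. The function name `fprint_line`, left-justification of every field, no truncation of over-long items, and returning the line rather than printing it are all assumptions; the idea itself is described above only in outline.

```python
import re


def fprint_line(template, *items):
    """Replace each run of '#' with the next item, padded to the run's length."""
    values = iter(items)

    def fill(match):
        # Left-justify within the width given by the number of '#' characters.
        return str(next(values)).ljust(len(match.group()))

    return re.sub(r"#+", fill, template).rstrip()


FORMAT = "###: ####### #############"
header = fprint_line(FORMAT, "No.", "Barcode", "Description")
row = fprint_line(FORMAT, 1, "A17", "Widget")
```

The same template serves for the heading and for each data row, so the columns line up by construction.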

Andy Walker

Dec 27, 2021, 5:02:32 PM
On 27/12/2021 14:01, James Harris wrote:
> The printf approach to printing is flexible and fast at rendering
> inbuilt types - probably better than anything which came before it -
> but it's not perfect.

No, it's rubbish. If you need formatted transput [not
entirely convinced, but chacun a son gout], then instead of all
the special casing, the easiest and most flexible way is to
convert everything to strings. Thus, for each type, you need
an operator that converts values of that type into strings [and
vv for reading]. You can incorporate formatting details [such
as whether decimal points are "," or ".", whether leading zeros
are suppressed, etc., etc] into the operator, or as parameters
to a suitable procedure call. Such operators are separately
useful, eg for sorting. The default operator could, eg, be one
that converts [eg] an integer into the shortest possible string.
That way, the actual transput routines need to know almost
nothing about the types and formatting details of their parameters,
only how to write/read an array of characters.

[...]
> So here's another potential approach. What do you think of it?
> The idea is, as with the printf family, to have a controlling string
> where normal characters are copied verbatim and special fields are
> marked with a % sign or similar. The difference is what would come
> after the % sign and how it would be handled.

Then what you've done is to use "%" where you should
instead simply be including a string. So the specification of
"printf" becomes either absurdly complicated [as indeed it is
in most languages] or too limited [because some plausible
conversions are not catered for]. The "everything is a string"
approach has the advantage that for specialised use, eg if you
want to read/write your numbers as Roman numerals, you just have
to write the conversion routines that you would need anyway, no
need to change anything in "printf".
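
A small Python sketch of the "everything is a string" approach, using the Roman numeral case. The names `to_roman` and `put` are hypothetical, and the 1..3999 range limit is a simplification of the sketch.

```python
import io


def to_roman(n):
    """Convert an integer in 1..3999 to a Roman numeral string."""
    if not 0 < n < 4000:
        raise ValueError("out of range for this sketch")
    table = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in table:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)


def put(out, *strings):
    """The transput routine knows only how to write strings."""
    for s in strings:
        out.write(s)


buf = io.StringIO()
put(buf, "Year ", to_roman(2021), "\n")
```

Adding a new conversion means writing one more function like `to_roman`; `put` itself stays unchanged.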

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Goodban

Dmitry A. Kazakov

Dec 28, 2021, 4:22:00 AM
On 2021-12-27 23:02, Andy Walker wrote:
> On 27/12/2021 14:01, James Harris wrote:
>> The printf approach to printing is flexible and fast at rendering
>> inbuilt types - probably better than anything which came before it -
>> but it's not perfect.
>
>     No, it's rubbish.

Patently obvious rubbish, rather.

> then instead of all
> the special casing, the easiest and most flexible way is to
> convert everything to strings.

Sure, though, usually not everything is converted to string. For
example, formatting symbols or extensions of the idea: meta/tagged
formats like HTML, XML etc are inherently bad.

The most flexible is a combination of a string that carries most of the
information specific to the datatype (an OO method) and some commands to
the rendering environment.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

James Harris

Jan 2, 2022, 11:06:52 AM
On 27/12/2021 22:02, Andy Walker wrote:
> On 27/12/2021 14:01, James Harris wrote:
>> The printf approach to printing is flexible and fast at rendering
>> inbuilt types - probably better than anything which came before it -
>> but it's not perfect.
>
>     No, it's rubbish.  If you need formatted transput [not
> entirely convinced, but chacun a son gout],

If you are not convinced by formatted io then what kind of io do you
prefer?


> then instead of all
> the special casing, the easiest and most flexible way is to
> convert everything to strings.

If you convert to strings then what reclaims the memory used by those
strings? Not all languages have dynamic memory management, and dynamic
memory management is not ideal for all compilation targets.

The form I proposed had no need for dynamic allocations. That's part of
the point of it.


> Thus, for each type, you need
> an operator that converts values of that type into strings [and
> vv for reading].

Yes, there's a converse (and in some ways even more involved) issue with
reading.


...

>> So here's another potential approach. What do you think of it?
>> The idea is, as with the printf family, to have a controlling string
>> where normal characters are copied verbatim and special fields are
>> marked with a % sign or similar. The difference is what would come
>> after the % sign and how it would be handled.
>
>     Then what you've done is to use "%" where you should
> instead simply be including a string.  So the specification of
> "printf" becomes either absurdly complicated [as indeed it is
> in most languages] or too limited [because some plausible
> conversions are not catered for].  The "everything is a string"
> approach has the advantage that for specialised use, eg if you
> want to read/write your numbers as Roman numerals, you just have
> to write the conversion routines that you would need anyway, no
> need to change anything in "printf".
>

I'm not sure you understand the proposal. To be clear, the print routine
would be akin to

print("String with %kffff; included", val)

where the code k would be used to select a formatter. The formatter
would be passed two things:

1. The format from k to ; inclusive.
2. The value val.

As a result, the format ffff could be as simple as someone could design it.

Note that there would be no requirement for dynamic memory. The
formatter would just send for printing each character as it was generated.
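
To illustrate the shape of such a formatter, here is a sketch in Python. Python itself allocates freely, so this shows only the structure: digits are handed to an `emit` callback one at a time and no intermediate string is assembled. The names `emit_uint` and `format_uint` and the toy "x means hex" spec syntax are assumptions.

```python
DIGITS = "0123456789abcdef"


def emit_uint(value, emit, base=10):
    """Send the digits of a non-negative integer to emit(), most significant first."""
    if value >= base:
        emit_uint(value // base, emit, base)   # higher-order digits first
    emit(DIGITS[value % base])


def format_uint(spec, value, emit):
    """Formatter for code "u"; spec runs from the code to ";" inclusive, e.g. "ux;"."""
    emit_uint(value, emit, 16 if "x" in spec else 10)
```

In a lower-level language the recursion could be replaced by a small fixed-size digit buffer on the stack, which still avoids the heap.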

What's wrong with that? (Genuine question!)


--
James Harris

Dmitry A. Kazakov

Jan 2, 2022, 11:37:29 AM
On 2022-01-02 17:06, James Harris wrote:

> If you convert to strings then what reclaims the memory used by those
> strings?

What reclaims memory used by those integers?

> Not all languages have dynamic memory management, and dynamic
> memory management is not ideal for all compilation targets.

No dynamic memory management is required for handling temporary objects.

----------
As if that were relevant in the case of formatted output, which has such
a massive overhead that even using the heap (which is in no way
necessary) would leave little or no dent. I remember a SysV C
compiler which modified the format string of printf in a misguided
attempt to save a little bit of memory, while the linker put string
constants in read-only memory...

Bart

Jan 2, 2022, 12:08:47 PM
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
> On 2022-01-02 17:06, James Harris wrote:
>
>> If you convert to strings then what reclaims the memory used by those
>> strings?
>
> What reclaims memory used by those integers?

Integers are passed by value at this level of language.

No heap storage is involved.


>> Not all languages have dynamic memory management, and dynamic memory
>> management is not ideal for all compilation targets.
>
> No dynamic memory management is required for handling temporary objects.

If memory is allocated for the temporary object, then at some point it
needs to be reclaimed. Preferably just after the print operation is
completed.

If your language takes care of those details, then lucky you. It means
someone else has had the job of making it work.

Dmitry A. Kazakov

Jan 2, 2022, 12:21:41 PM
On 2022-01-02 18:08, Bart wrote:
> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>> On 2022-01-02 17:06, James Harris wrote:
>>
>>> If you convert to strings then what reclaims the memory used by those
>>> strings?
>>
>> What reclaims memory used by those integers?
>
> Integers are passed by value at this level of language.

This has nothing to do with the question: what reclaims integers?
FORTRAN-IV passed everything by reference, yet calls

FOO (I + 1)

were OK almost a human life span ago.

>>> Not all languages have dynamic memory management, and dynamic memory
>>> management is not ideal for all compilation targets.
>>
>> No dynamic memory management is required for handling temporary objects.
>
> If memory is allocated for the temporary object, then at some point it
> needs to be reclaimed. Preferably just after the print operation is
> completed.

Yep.

> If your language takes care of those details, then lucky you. It means
> someone else has had the job of making it work.

Sure.

Every *normal* language takes care of the objects it creates. And every
*normal* language lets identityless objects (like integers, strings,
records etc) be created ad-hoc and passed around in a unified manner.

If this

Put_Line ("X=" & X'Image & ", Y=" & Y'Image);

is a problem in your language, then the job is not done.

James Harris

Jan 2, 2022, 12:37:08 PM
On 27/12/2021 19:05, Bart wrote:
> On 27/12/2021 17:26, James Harris wrote:
>> On 27/12/2021 15:49, Bart wrote:
>
>>>     printit("This is %s; OK?", R(v,"abc"))
>>>
>>> Then no special features are needed. Except that if R returns a
>>> string, then you need some means of disposing of that string after
>>> printing, but there are several ways of dealing with that.
>>
>> Dealing with memory is indeed a problem with that approach. The %s
>> could be passed a string which needs to be freed or one which must not
>> be freed. One option is
>>
>>    %s - a string which must not be freed
>>    %M - a string which must be freed
>
> This has similar problems to hardcoding a type. Take this:
>
>     printit("%s", F())
>
> F returns a string, but is it one that needs freeing or not? That's
> depends on what happens inside F. Whatever you choose, later the
> implementation of F changes, then 100 formats have to change too?

I guess that every piece of code which called F() would have to change
whether printit was involved or not. But I take your point.

Note that my suggestion (of passing to a formatter the format and the
value) would not require dynamic memory management.

>
> This is a more general problem of memory-managing strings. It's more
> useful to be able to solve it for the language, then it will work for
> Print too.

Agreed.

>
> (Personally, I don't get involved with this at all, not in low level code.
>
> Functions that return strings generally return a pointer to a local
> static string that needs to be consumed ASAP. Or sometimes there is a
> circular list of them to allow several such calls per Print. It's not
> very sophisticated, but that's why the language is low level.)
>
> If you want a more rigorous approach, perhaps try this:
>
>    printit("%H", R(v, "abc"))
>
> H means handler. R() is not the handler itself, but returns a descriptor
> (eg. X) the contains a reference to a handler function, and references
> to those two captured values (a little like a lambda or closure I think).

AFAICS

R(v, "abc")

would be called before invoking printit. IOW wouldn't the delayed call
of a lambda require a distinct syntax?

...

>>>    * What is it for uint64_t? (Apparently, it is PRId64 - a macro that
>>>      expands to a string)
>
> Again this is for C; the problem being that a format string should not
> need to include type information:
>
>   * The compiler knows the type
>   * You may not know the type (eg. clock_t)
>   * You may not know the format needed (eg. uint64_t)
>   * You don't want to have to maintain 1000 format strings as
>     expressions and types of variables change

OK. If the compiler knows the type T why not have the print function invoke

T.format

with the value and the format string as parameters?
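
As one existing instance of this idea, Python's built-in format protocol dispatches on the type: `format(value, spec)` calls `type(value).__format__(value, spec)`. The `Money` class below is a hypothetical user type for illustration, with a made-up spec code "c"; it assumes a non-negative amount.

```python
class Money:
    """Hypothetical user type that supplies its own formatter."""

    def __init__(self, pence):
        self.pence = pence

    def __format__(self, spec):
        # spec is whatever follows ":" in the replacement field ("" if absent).
        pounds, pence = divmod(self.pence, 100)
        text = "%d.%02d" % (pounds, pence)
        return "GBP " + text if spec == "c" else text


line = "{:c} / {}".format(Money(1234), Money(5))
```

The format string carries only display options; which formatter runs is decided by the type of the value.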

...
Perhaps one option is record-based output something akin to

tout.putrec(R, i, item[i].code, item[i].descr)

where tout is terminal out, R is a record format, and the values are
output in binary. The downside is that that would not be plain text and
would require support so that it could be viewed but the upsides would
include allowing the viewer to resize and reorder tables. (The headings
would be metadata; the user could choose whether to see them or not.)


--
James Harris

Bart

Jan 2, 2022, 12:50:50 PM
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
> On 2022-01-02 18:08, Bart wrote:
>> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>>> On 2022-01-02 17:06, James Harris wrote:
>>>
>>>> If you convert to strings then what reclaims the memory used by
>>>> those strings?
>>>
>>> What reclaims memory used by those integers?
>>
>> Integers are passed by value at this level of language.
>
> This has nothing to do with the question: what reclaims integers?
> FORTRAN-IV passed everything by reference, yet calls
>
>    FOO (I + 1)
>
> were OK almost human life span ago.

Fortran didn't allow recursion either. So such a call involved writing
the expression to a static location, and passing a reference to that
location.

The problem here is that you call a function F which returns a string to
be passed to Print, which may be a literal, or in static memory, or has
a shared reference with other objects, none of which require the memory
to be reclaimed.

Or it may have been created specially for this return value, so then
after use (it's been printed), any resources need to be reclaimed.

Your approach to 'solve' this is to 'just' create a language high enough
in level (and harder to write and slower to run), to get around it.

Which actually doesn't solve it; you've just turned a small job into a
huge one.

More interesting is this: /given/ a language design low enough in level
that it doesn't have first class strings with automatic memory
management, how would you implement the printing of complex objects
requiring elaborate 'to-string' conversions.

> Every *normal* language takes care of the objects it creates. And every
> *normal* language lets identityless objects (like integers, strings,
> records etc) be created ad-hoc and passed around in a unified manner.
>
> If this
>
>    Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>
> is a problem in your language, then the job is not done.

My static language is of the lower-level kind described above, yet this
example is merely:

println =X, =Y

You really want a more challenging example.

James Harris

Jan 2, 2022, 12:54:35 PM
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
> On 2022-01-02 18:08, Bart wrote:
>> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>>> On 2022-01-02 17:06, James Harris wrote:
>>>
>>>> If you convert to strings then what reclaims the memory used by
>>>> those strings?
>>>
>>> What reclaims memory used by those integers?

Not the print function. See below.

>>
>> Integers are passed by value at this level of language.
>
> This has nothing to do with the question: what reclaims integers?
> FORTRAN-IV passed everything by reference, yet calls
>
>    FOO (I + 1)
>
> were OK almost human life span ago.

I disagree slightly with both of you. AISI it doesn't matter whether the
objects to be printed are integers or structures or arrays or widgets.
If there's any reclaiming to be done then it would be carried out by
other language mechanisms which would happen anyway; it would not be
required by the print function. The print function would simply use
them. In reclamation terms it would neither increase nor decrease any
reference count.

For example,

   complex c
   function F
     widget w
     ....
     print(w, c)
   endfunction

Widget w would be created at function entry and reclaimed at function
exit. The global c would be created at program load time and destroyed
when the program terminates. The print function would not get involved
in any of that stuff.

...

> If this
>
>    Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>
> is a problem in your language, then the job is not done.
>

That would be poor for small-machine targets. Shame on Ada! ;-)


--
James Harris

James Harris

Jan 2, 2022, 1:05:50 PM
On 28/12/2021 09:21, Dmitry A. Kazakov wrote:
> On 2021-12-27 23:02, Andy Walker wrote:

...

>> then instead of all
>> the special casing, the easiest and most flexible way is to
>> convert everything to strings.
>
> Sure, though, usually not everything is converted to string. For
> example, formatting symbols or extensions of the idea: meta/tagged
> formats like HTML, XML etc are inherently bad.
>
> The most flexible is a combination of a string that carries most of the
> information specific to the datatype (an OO method) and some commands to
> the rendering environment.

That sounds interesting. How would it work?


--
James Harris

Dmitry A. Kazakov

Jan 2, 2022, 1:25:18 PM
With single dispatch you have an interface, say, 'printable'. The
interface has an abstract method 'image' with the profile:

function Image (X : Printable) return String;

Integer, float, string, whatever has to be printable inherits from
Printable and thus overrides Image. That is it.

The same goes with serialization/streaming etc.
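
A rough C approximation of that interface, offered only as a sketch: C has no inheritance or returned strings, so the object carries a function pointer in place of the overridden method, and the caller supplies the buffer that Image would otherwise return. The names printable, printable_int, printable_complex and put_image are hypothetical:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical C stand-in for an abstract 'Printable' interface. */
typedef struct printable printable;
struct printable {
    /* Plays the role of Image: writes the text form of self into out. */
    int (*image)(const printable *self, char *out, size_t cap);
};

/* Two concrete types; 'base' is first so a pointer to it can be cast back. */
typedef struct { printable base; int value; } printable_int;
typedef struct { printable base; double re, im; } printable_complex;

static int int_image(const printable *self, char *out, size_t cap)
{
    const printable_int *p = (const printable_int *)self;
    return snprintf(out, cap, "%d", p->value);
}

static int complex_image(const printable *self, char *out, size_t cap)
{
    const printable_complex *p = (const printable_complex *)self;
    return snprintf(out, cap, "(%g, %g)", p->re, p->im);
}

/* Generic code depends only on the interface, never on concrete types. */
static int put_image(const printable *obj, char *out, size_t cap)
{
    assert(obj != NULL && obj->image != NULL);
    return obj->image(obj, out, cap);
}
```

A value would be set up as, say, `printable_int i = { { int_image }, 42 };` and passed as `&i.base`. Where the resulting text lives is exactly the question debated elsewhere in this thread; the sketch sidesteps it by making the caller own the buffer.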

Dmitry A. Kazakov

Jan 2, 2022, 1:30:12 PM
On 2022-01-02 18:50, Bart wrote:
> On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
>> On 2022-01-02 18:08, Bart wrote:
>>> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>>>> On 2022-01-02 17:06, James Harris wrote:
>>>>
>>>>> If you convert to strings then what reclaims the memory used by
>>>>> those strings?
>>>>
>>>> What reclaims memory used by those integers?
>>>
>>> Integers are passed by value at this level of language.
>>
>> This has nothing to do with the question: what reclaims integers?
>> FORTRAN-IV passed everything by reference, yet calls
>>
>>     FOO (I + 1)
>>
>> were OK almost human life span ago.
>
> Fortran didn't allow recursion either.

Irrelevant. What reclaims integer I+1?

>> If this
>>
>>     Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>>
>> is a problem in your language, then the job is not done.
>
> My static language is of the lower-level kind described above, yet this
> example is merely:
>
>   println =X, =Y

No it is not. The example creates temporary strings, a possibility which
stunned James and you as something absolutely unthinkable, or as requiring
the heap. That is rubbish; temporary strings are pretty normal in any
decent language.

Dmitry A. Kazakov

Jan 2, 2022, 1:40:25 PM
That works perfectly well on small targets. You seem unaware of what
actually happens on I/O. For a "small" target it would be some sort of
networking stack with a terminal emulation on top of it. Believe me,
creating a temporary string on the stack is nothing in comparison to
that. Furthermore, the implementation would likely be no less efficient
than printf which has no idea how large the result is and would have to
reallocate the output buffer or keep it between calls and lock it from
concurrent access. Locking on an embedded system is a catastrophic event
because switching threads is expensive as hell. Note also that you
cannot stream output, because networking protocols and terminal
emulators are much more efficient if you do bulk transfers. All that is
the infamous premature optimization.

Bart

Jan 2, 2022, 1:51:28 PM
Well, your example wasn't easy to scan.

But since you say it requires temporary strings, where does it put them?
They have to go somewhere!

On the stack? The stack is typically 1-4MB; what if the strings are
bigger? What if some of the terms are strings returned from a function;
those will not be on the stack. Example:

println "(" + tostr(123456)*1'000'000 + ")"

this creates an intermediate string of 6MB; too big for a stack.

Bart

Jan 2, 2022, 2:04:33 PM
This is not the issue that had been discussed. Printing an existing
string is no problem; even C can deal with that: you just use
printf("%s", S).

If you are printing a string returned from F(), then any requirement to
reclaim resources it might use is common to any use of it within the
language, and not specific to Print.

The problem is when the string in question has been created directly or
indirectly (explicitly by a function call or implicitly by calling some
handler) within the argument list of the Print function or statement,
and you want any memory allocated specifically for this purpose to be
taken care of automatically.

Dmitry A. Kazakov

Jan 2, 2022, 2:23:46 PM
On 2022-01-02 19:51, Bart wrote:

> But since you say it requires temporary strings, where does it put them?
> They have to go somewhere!
>
> On the stack?

The stack is as big as you specify, it is not the program stack, normally.

> The stack is typically 1-4MB; what if the strings are
> bigger? What if some of the terms are strings returned from a function;
> those will not be on the stack. Example:
>
>    println  "(" + tostr(123456)*1'000'000 + ")"
>
> this creates an intermediate string of 6MB; too big for a stack.

Are you going to spill 6MB character long single line on a terminal
emulator? Be realistic.

Bart

Jan 2, 2022, 2:42:24 PM
On 02/01/2022 19:23, Dmitry A. Kazakov wrote:
> On 2022-01-02 19:51, Bart wrote:
>
>> But since you say it requires temporary strings, where does it put
>> them? They have to go somewhere!
>>
>> On the stack?
>
> The stack is as big as you specify, it is not the program stack, normally.

So, what are you going to do, have a stack which is 1000 times bigger than
normal just in case?

Anyway, if the string is returned from a function, it is almost
certainly on the heap.

>> The stack is typically 1-4MB; what if the strings are
>> bigger? What if some of the terms are strings returned from a
>> function; those will not be on the stack. Example:
>>
>>     println  "(" + tostr(123456)*1'000'000 + ")"
>>
>> this creates an intermediate string of 6MB; too big for a stack.
>
> Are you going to spill 6MB character long single line on a terminal
> emulator? Be realistic.

Who knows what a user-program will do? And who are you to question it?

Besides, the print output could be redirected to a file, or be piped, or it
could be writing a file anyway.

It could also be on multiple lines if you want:

println "(" + (tostr(123456)+"\n")*1'000'000 + ")"

It depends on contents of the string data the user wants to output,
which as I said could be anything and of any size:

println reverse(readstrfile(filename))

Dmitry A. Kazakov

Jan 2, 2022, 3:25:33 PM
On 2022-01-02 20:42, Bart wrote:
> On 02/01/2022 19:23, Dmitry A. Kazakov wrote:
>> On 2022-01-02 19:51, Bart wrote:
>>
>>> But since you say it requires temporary strings, where does it put
>>> them? They have to go somewhere!
>>>
>>> On the stack?
>>
>> The stack is as big as you specify, it is not the program stack,
>> normally.
>
> So, what are you going to do, have a stack which is 1000 times bigger than
> normal just in case?

No, I will never ever print anything that does not fit into a page of
72-80 characters wide.

BTW formatting output is meant to format, which includes text wrapping,
you know.

> Anyway, if the string is returned from a function, it is almost
> certainly on the heap.

No, it is certainly on the secondary stack.

>> Are you going to spill 6MB character long single line on a terminal
>> emulator? Be realistic.
>
> Who knows what a user-program will do?

The developer:

https://spec.cs.miami.edu/cpu2017/flags/gcc.2021-07-21.html#user_Wl-stack

> It could also be on multiple lines if you want:

Now it is time to learn about cycles.

> It depends on contents of the string data the user wants to output,
> which as I said could be anything and of any size:
>
>     println reverse(readstrfile(filename))

It could even be fetching whole git repository tree of the Linux kernel...

1. Why would anybody ever do that?

2. Why would anybody program it in such a stupid way?

Which boils down to the requirements of the application program and its
behavior upon broken input constraints from these requirements.

Bart

Jan 2, 2022, 5:31:26 PM
On 02/01/2022 20:25, Dmitry A. Kazakov wrote:
> On 2022-01-02 20:42, Bart wrote:
>> On 02/01/2022 19:23, Dmitry A. Kazakov wrote:
>>> On 2022-01-02 19:51, Bart wrote:
>>>
>>>> But since you say it requires temporary strings, where does it put
>>>> them? They have to go somewhere!
>>>>
>>>> On the stack?
>>>
>>> The stack is as big as you specify, it is not the program stack,
>>> normally.
>>
>> So, what are you going to do, have a stack which is 1000 times bigger
>> than normal just in case?
>
> No, I will never ever print anything that does not fit into a page of
> 72-80 characters wide.

You're not even going to allow anything that might spill over multiple
lines? Say, displaying factorial(1000); apparently in your language,
there is no way to see what it looks like.


> BTW formatting output is meant to format, which includes text wrapping,
> you know.
>
>> Anyway, if the string is returned from a function, it is almost
>> certainly on the heap.
>
> No, it is certainly on the secondary stack.

You'll have to explain what a secondary stack is.



>>> Are you going to spill 6MB character long single line on a terminal
>>> emulator? Be realistic.
>>
>> Who knows what a user-program will do?
>
> The developer:
>
> https://spec.cs.miami.edu/cpu2017/flags/gcc.2021-07-21.html#user_Wl-stack

If your program needs to ROUTINELY increase the stack size, then it is
probably broken. You should only need to for some highly-recursive
programs such as the Ackermann benchmark. Not for ordinary Print!


>> It could also be on multiple lines if you want:
>
> Now it is time to learn about cycles.
>
>> It depends on contents of the string data the user wants to output,
>> which as I said could be anything and of any size:
>>
>>      println reverse(readstrfile(filename))
>
> It could even be fetching whole git repository tree of the Linux kernel...

Yes, why not? I think that is exactly what I did once:

dir/s >files # create a large text file
type files # display a large text file

So large files are OK in some circumstances, but not as an argument to
Print?


> 1. Why would anybody ever do that?
>
> 2. Why would anybody program it in such a stupid way?


Somebody does this:

println dirlist("*.c")

The output could be anything from 10 characters to 100,000 or more. (And
yes, the default routine to print a list of strings could write one per
line.)

Language implementers shouldn't place unreasonable restrictions on what
user code can do, because of some irrational aversion to using heap memory.

There might be 4MB of stack memory and 4000MB of heap memory, so why not
use it!

The issues however are with lower-level languages and what can be done
there. Using stack memory is not really practical there either.

Some approaches which don't involve creating discrete string objects I
think have been discussed. But there are also myriad workarounds to get
things done, even if not elegant.


>
> Which boils down to the requirements of the application program and its
> behavior upon broken input constraints from these requirements.

Nothing is broken. Any lack of first-class string handling and automatic
memory management is by design in some languages. But you are never
going to be stuck getting anything printed; it might just take extra effort.




Dmitry A. Kazakov

Jan 3, 2022, 4:19:44 AM
On 2022-01-02 23:31, Bart wrote:
> On 02/01/2022 20:25, Dmitry A. Kazakov wrote:
>> On 2022-01-02 20:42, Bart wrote:
>>> On 02/01/2022 19:23, Dmitry A. Kazakov wrote:
>>>> On 2022-01-02 19:51, Bart wrote:
>>>>
>>>>> But since you say it requires temporary strings, where does it put
>>>>> them? They have to go somewhere!
>>>>>
>>>>> On the stack?
>>>>
>>>> The stack is as big as you specify, it is not the program stack,
>>>> normally.
>>>
>>> So, what are you going to do, have a stack which is 1000 times bigger
>>> than normal just in case?
>>
>> No, I will never ever print anything that does not fit into a page of
>> 72-80 characters wide.
>
> You're not even going to allow anything that might spill over multiple
> lines?

No.

> Say, displaying factorial(1000);

Displaying it to who?

>> BTW formatting output is meant to format, which includes text
>> wrapping, you know.
>>
>>> Anyway, if the string is returned from a function, it is almost
>>> certainly on the heap.
>>
>> No, it is certainly on the secondary stack.
>
> You'll have to explain what a secondary stack is.

Secondary stacks are used to pass large and dynamically sized objects.
They are also used to allocate local objects of these properties.

Secondary stacks are not machine stacks and have none of their limitations.

>>>> Are you going to spill 6MB character long single line on a terminal
>>>> emulator? Be realistic.
>>>
>>> Who knows what a user-program will do?
>>
>> The developer:
>>
>> https://spec.cs.miami.edu/cpu2017/flags/gcc.2021-07-21.html#user_Wl-stack
>
> If your program needs to ROUTINELY increase the stack size, then it is
> probably broken.

Right, except that it is your program that tries to create 6MB strings
for no reason.

> So large files are OK in some circumstances, but not as an argument to
> Print?

Exactly.

>> 1. Why would anybody ever do that?
>>
>> 2. Why would anybody program it in such a stupid way?
>
> Somebody does this:
>
>    println dirlist("*.c")

Nobody prints file lists this way. Hint: formatted output presumes
formatting:

https://www.merriam-webster.com/dictionary/format

> The output could be anything from 10 characters to 100,000 or more. (And
> yes, the default routine to print a list of strings could write one per
> line.)

Wrong. For printing lists of files, if anybody cares, there will be a
subprogram with parameters:

- Files path / array of paths
- Wildcard/pattern use flag
- Number of columns
- Output direction: column first vs row first
- First line decorator text
- Consequent lines decorator text
- Filter object
- Sorting order object

etc.

Bart

Jan 3, 2022, 6:19:14 AM
On 03/01/2022 09:19, Dmitry A. Kazakov wrote:
> On 2022-01-02 23:31, Bart wrote:

> Right, except that it is your program that tries to create 6MB strings
> for no reason.

No; my user does so. I can't control what they write.

Should I impose an arbitrary limit like max 80 characters per any print
item, and max 80 characters in total for all items on one print
statement? (But then multiple print statements can write to the same line.)

I'm not implementing Fortran or writing to a line-printer, and my
language copes fine with no such limits. If the user does something
silly, they will find out when it gets slow or they run out of memory.

However, my (higher level) implementation uses managed strings that work
with the heap.

I think (since I've sort of lost track of what we were arguing about)
you were advocating stack storage. And talking about a secondary stack,
which is presumably allocated on the heap.

>> So large files are OK in some circumstances, but not as an argument to
>> Print?
>
> Exactly.
>
>>> 1. Why would anybody ever do that?
>>>
>>> 2. Why would anybody program it in such a stupid way?
>>
>> Somebody does this:
>>
>>     println dirlist("*.c")
>
> Nobody prints file lists this way.

Actually, it's just a list. Should a language allow you to print a whole
list? If not, why not? If you need more control, do it element by
element. Or the format control codes that are optional for every print
item can specify some basic parameters, eg. print one element per line.

My higher-level language still builds a single string for each print
item before it does anything else with it.

Sometimes, that is necessary (eg. the result may be right-justified
within a given field width, so it needs to know the final size);
sometimes it isn't, and it is better to send each successive character
into some destination, but I don't have that yet.
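
The right-justification case can be sketched in C as follows. This is only an illustration under assumed names (right_justify_int is hypothetical), using a small fixed buffer in place of a managed string; snprintf's own "%*d" performs the same two steps internally:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Right-justifies the decimal form of v in a field of 'width' characters.
   The text is rendered into a small fixed buffer first, because the amount
   of padding depends on the final length. Returns the length written,
   or -1 if the result would not fit in out. */
static int right_justify_int(int v, int width, char *out, size_t cap)
{
    char tmp[16];                          /* ample for a 32-bit int */
    assert(out != NULL && cap > 0 && width >= 0);
    int len = snprintf(tmp, sizeof tmp, "%d", v);
    assert(len > 0 && len < (int)sizeof tmp);
    int pad = width > len ? width - len : 0;
    if ((size_t)(pad + len) >= cap)
        return -1;
    memset(out, ' ', (size_t)pad);         /* padding has to come out first */
    memcpy(out + pad, tmp, (size_t)len + 1);
    return pad + len;
}
```

Left-justified or unpadded output has no such dependency, which is why it could be streamed one character at a time instead.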


>> The output could be anything from 10 characters to 100,000 or more.
>> (And yes, the default routine to print a list of strings could write
>> one per line.)
>
> Wrong. For printing lists of files, if anybody cares, there will be a
> subprogram with parameters:
>
> - Files path / array of paths
> - Wildcard/pattern use flag
> - Number of columns
> - Output direction: column first vs row first
> - First line decorator text
> - Consequent lines decorator text
> - Filter object
> - Sorting order object

As I said I've lost track of what we were discussing. But I know it wasn't
about how to implement DIR or ls!

Dmitry A. Kazakov

Jan 3, 2022, 7:00:56 AM
On 2022-01-03 12:19, Bart wrote:

> As I said I've lost track of what we were discussing.

We were discussing inability to return a string from a function in your
language in a reasonable way.

You argued that there is no need to have it due to danger that some user
might miss his or her medication or confuse that with certain mushrooms
and so came to an idea of creating terabyte large temporary strings...

Bart

Jan 3, 2022, 7:22:51 AM
On 03/01/2022 12:00, Dmitry A. Kazakov wrote:
> On 2022-01-03 12:19, Bart wrote:
>
>> As I said I've lost track of what we were discussing.
>
> We were discussing inability to return a string from a function in your
> language in a reasonable way.
>
> You argued that there is no need to have it due to danger that some user
> might miss his or her medication or confuse that with certain mushrooms
> and so came to an idea of creating terabyte large temporary strings...

Unlike Python? Here:

a = [10,]*1000000
s = str(a)
print (len(s))

The length of s is 4 million (1 million times "10, ").

I believe that print(a) would simply apply str() to 'a' then write that
string.

Andy Walker

Jan 3, 2022, 7:35:09 PM
On 02/01/2022 16:06, James Harris wrote:
>>> The printf approach to printing is flexible and fast at rendering
>>> inbuilt types - probably better than anything which came before it -
>>> but it's not perfect.
>>      No, it's rubbish.  If you need formatted transput [not
>> entirely convinced, but chacun a son gout],
> If you are not convinced by formatted io then what kind of io do you prefer?

Unformatted transput, of course. Eg,

print this, that, these, those and the many other things

[with whatever syntax, quoting, separators, etc you prefer]. Much
the same for "read". Most of the time the default is entirely
adequate. If not, then the choice for the language designer is
either an absurdly complicated syntax that still probably doesn't
meet some plausible needs, or to provide simple mechanisms that
allow programmers to roll their own. Guess which I prefer.
>> then instead of all
>> the special casing, the easiest and most flexible way is to
>> convert everything to strings.
> If you convert to strings then what reclaims the memory used by those
> strings? Not all languages have dynamic memory management, and
> dynamic memory management is not ideal for all compilation targets.

AIUI, you are designing your own language. If it doesn't
have strings, eg as results of procedures, then you have much worse
problems than designing some transput procedures. There are lots
of ways of implementing strings, but they are for the compiler to
worry about, not the language designer [at least, once you know it
can be done].

> The form I proposed had no need for dynamic allocations. That's part
> of the point of it.

There's no need for "dynamic allocations" merely for transput.
Most of the early languages that I used didn't have them, but still
managed to print and read things. You're making mountains out of
molehills.

> I'm not sure you understand the proposal. To be clear, the print
> routine would be akin to
>   print("String with %kffff; included", val)
> where the code k would be used to select a formatter. The formatter
> would be passed two things:
>   1. The format from k to ; inclusive.
>   2. The value val.
> As a result, the format ffff could be as simple as someone could
> design it.

Yes, that's what I thought you meant. C is thataway -->.
You call it "simple"; C is one of the simpler languages of this
type, yet the full spec of "printf" and its friends is horrendous.
Build in some version [exact syntax up to you] of

print "String with ", val, " included"

and you're mostly done. For the exceptional cases, use a library
procedure or your own procedure to convert "val" to a suitable
array of characters, with whatever parameters are appropriate.

> Note that there would be no requirement for dynamic memory. The
> formatter would just send for printing each character as it was
> generated.
> What's wrong with that? (Genuine question!)

Nothing. How on earth do you think we managed in olden
times, before we had "dynamic memory"? [Ans: by printing each
character in sequence.]
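
Printing each character in sequence can be sketched in C along these lines. Everything here is an assumption made for illustration: the sink callback type, the '#' placeholder, and the names emit_unsigned, emit_template, fixed_buf and buf_put are hypothetical. No heap is used and no intermediate string is built for the number:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical character sink: wherever output goes (device, file, buffer). */
typedef void (*sink_fn)(char c, void *ctx);

/* Emits v in decimal, most significant digit first, one character at a time. */
static void emit_unsigned(unsigned v, sink_fn put, void *ctx)
{
    assert(put != NULL);
    unsigned div = 1;
    while (v / div >= 10)
        div *= 10;      /* largest power of ten not exceeding v (1 if v is 0) */
    for (; div > 0; div /= 10)
        put((char)('0' + (v / div) % 10), ctx);
}

/* Copies the literal parts of tmpl and streams val wherever '#' appears. */
static void emit_template(const char *tmpl, unsigned val, sink_fn put, void *ctx)
{
    for (; *tmpl != '\0'; tmpl++) {
        if (*tmpl == '#')
            emit_unsigned(val, put, ctx);
        else
            put(*tmpl, ctx);
    }
}

/* Example sink that appends to a fixed, caller-owned buffer. */
typedef struct { char *buf; size_t cap, len; } fixed_buf;

static void buf_put(char c, void *ctx)
{
    fixed_buf *b = ctx;
    if (b->len + 1 < b->cap) {      /* silently drops output once full */
        b->buf[b->len++] = c;
        b->buf[b->len] = '\0';
    }
}
```

Swapping buf_put for a sink that writes to a device or file changes nothing in the formatter, which is the property the question above was asking about.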

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Bizet

Bart

Jan 4, 2022, 6:31:39 AM
On 04/01/2022 00:35, Andy Walker wrote:
> On 02/01/2022 16:06, James Harris wrote:
>>>> The printf approach to printing is flexible and fast at rendering
>>>> inbuilt types - probably better than anything which came before it -
>>>> but it's not perfect.
>>>      No, it's rubbish.  If you need formatted transput [not
>>> entirely convinced, but chacun a son gout],
>> If you are not convinced by formatted io then what kind of io do you
>> prefer?
>
>     Unformatted transput, of course.  Eg,
>
>   print this, that, these, those and the many other things

If I do this in A68G:

print(("<",1,">"))

the output is:

<         +1>

The "<>" are just to help show the problem: how to get rid of those
leading spaces and that plus sign? Or write the number with a field
width of your choice?

(It gets worse with wider, higher precision numbers, as it uses the
maximum value as the basis for the field width, so that one number could
take up most of the line. Now you will need to start calling functions
that return strings to get things done properly.)

> [with whatever syntax, quoting, separators, etc you prefer].  Much

>     Yes, that's what I thought you meant.  C is thataway -->.
> You call it "simple";  C is one of the simpler languages of this
> type, yet the full spec of "printf" and its friends is horrendous.

The main problem with C's printf is having to tell it the exact type of
each expression.

But those formatting facilities are genuinely useful and harder to
emulate in user code if they didn't exist.

> Build in some version [exact syntax up to you] of
>
>   print "String with ", val, " included"
>
> and you're mostly done.

Yeah, and it will need this:

print a, " ", b, " ", c, "\n"

instead of:

println a, b, c

Designing Print properly can make a big difference!

>  For the exceptional cases, use a library
> procedure or your own procedure to convert "val" to a suitable
> array of characters, with whatever parameters are appropriate.

Well, this is the problem. Where will the array of characters be located
especially if the size is unpredictable? What happens here:

print val1, val2

Or here:

print val3(f())

where f itself includes a 'print val'.

A language might not be advanced enough to have memory-managed
persistent data structures, but it might want custom printing of
user-defined types. Example:

println "Can't convert", strmode(s), "to", strmode(t)

strmode() turns an internal type code into a human readable type
specification. This language doesn't have flex strings; strmode just
returns a pointer to a fixed-size static string big enough for the
largest expected type.

In this example, because I know that evaluation is left-to-right, it
doesn't matter that the second strmode call will overwrite the earlier
result. But here it does:

f(strmode(s), strmode(t))
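
The hazard can be reproduced in C with a hypothetical stand-in for strmode. The names strmode_static, strmode_into and same_text, and the "type#N" text, are invented for this sketch. The second function shows one common workaround, caller-owned storage, which is not necessarily what the language described here would use:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for strmode(): returns a pointer to one shared
   static buffer, so every call overwrites the previous result. */
static const char *strmode_static(int typecode)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "type#%d", typecode);
    return buf;
}

/* Alternative: the caller owns the storage, so two results can coexist. */
static const char *strmode_into(int typecode, char *out, size_t cap)
{
    assert(out != NULL && cap > 0);
    snprintf(out, cap, "type#%d", typecode);
    return out;
}

/* Stand-in for f(strmode(s), strmode(t)): reports whether the texts match. */
static int same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Calling same_text(strmode_static(1), strmode_static(2)) reports a match, because both arguments point at the same buffer by the time the function runs, whichever argument was evaluated first. With strmode_into and two separate buffers the texts differ as expected.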

>> Note that there would be no requirement for dynamic memory. The
>> formatter would just send for printing each character as it was
>> generated.
>> What's wrong with that? (Genuine question!)
>
>     Nothing.  How on earth do you think we managed in olden
> times, before we had "dynamic memory"?  [Ans:  by printing each
> character in sequence.]

Probably the printing tasks weren't that challenging. As my A68G example
showed, output tended to be tabulated.

Andy Walker

Jan 4, 2022, 7:54:34 PM
On 04/01/2022 11:31, Bart wrote:
> If I do this in A68G:
>     print(("<",1,">"))
> the output is:
>     <         +1>
> The "<>" are just to help show the problem: how to get rid of
> those leading spaces and that plus sign? Or write the number with a
> field width of your choice?

"The problem"? It's the A68 /default/. You might prefer
a different default [esp for your own language], but someone had
to choose, without being able to read the minds of contributors
to this newsgroup more than half a century later. If you want
different output in A68, check out the specification of "whole"
[and "fixed" and "float"], RR10.3.2.1, which directly solves the
two "problems" you mention.

> (It gets worse with wider, higher precision numbers, as it uses the
> maximum value as the basis for the field width, so that one number
> could take up most of the line. Now you will need to start calling
> functions that return strings to get things done properly.)

"Worse", "properly"? All you're saying is that you don't
like the default. Note, again for A68, that "whole" [still] works
for all flavours of number. FTAOD, I carefully didn't mention A68
in my previous postings to this thread; I don't greatly care for
A68 transput -- but at least the unformatted [or "formatless" in
RR-speak] version is easy to learn and use.

> The main problem with C's printf is having to tell it the exact type
> of each expression.

The main problem is its complexity! It's as bad as A68,
while not being /anywhere near/ as comprehensive. I suspect you
haven't read N2731 [or near equivalent]. If it takes scores of
pages to describe a relatively simple facility, there's surely
something wrong.

> But those formatting facilities are genuinely useful and harder to
> emulate in user code if they didn't exist.

The RR includes "user code" for both the formatted and
unformatted versions of transput; so if any part of it didn't
exist in some other language, you could easily roll your own.
Not that you should need to.

[...]
>>  For the exceptional cases, use a library
>> procedure or your own procedure to convert "val" to a suitable
>> array of characters, with whatever parameters are appropriate.
> Well, this is the problem. Where will the array of characters be
> located especially if the size is unpredictable?

Why [as a user] do you care? Your language either has
strings as a useful type or it doesn't. If it does, you're in
business. If it doesn't, and you want to write software that
handles things like words or names or anything of the sort, then
you have problems way beyond getting your output to look nice
[but at least Unix/Linux comes with loads of commands to do
that for you]. If, OTOH, you're a compiler writer trying to
implement strings, then there is plenty of source code for
those commands available to give you a start.

[In response to James:]
>> How on earth do you think we managed in olden
>> times, before we had "dynamic memory"?  [Ans:  by printing each
>> character in sequence.]
> Probably the printing tasks weren't that challenging. As my A68G
> example showed, output tended to be tabulated.

Perhaps you could write that in a more patronising form?
[For some of us, "olden times" started more than a decade before
the A68 RR, and we nevertheless managed to write word-processing
and similar "apps" -- despite the absence of mod cons such as
editors, mice, disc storage, files, "dynamic memory", ....]

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Handel

Bart

Jan 5, 2022, 8:12:57 AM
On 05/01/2022 00:54, Andy Walker wrote:
> On 04/01/2022 11:31, Bart wrote:

>> The main problem with C's printf is having to tell it the exact type
>> of each expression.
>
>     The main problem is its complexity!  It's as bad as A68,
> while not being /anywhere near/ as comprehensive.  I suspect you
> haven't read N2731 [or near equivalent].  If it takes scores of
> pages to describe a relatively simple facility, there's surely
> something wrong.

If you mean 7.21.6.1 about fprintf, that's 7 pages, of which the last
two are examples. Yes, it goes on a bit, but that's just its style,
which is a specification hopefully useful to someone needing to
implement it.

Format codes form a little language of their own. Bear in mind all the
different parameters that could be used to control the appearance of an
integer or float value.

This one for example roughly emulates A68's display for INT:

"%+11d"

In my scheme, the equivalent params are: "+11" (or "11 +"; it's not
fussy about syntax), and the docs (in the form of comments to the
function that turns that style string into a descriptor) occupy 2 lines,
but are rather sparse.
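
To make the "style string into a descriptor" step concrete, here is a minimal C sketch of a parser that accepts both "+11" and "11 +". It is only an illustration under assumptions: the struct int_style type and the parse_int_style name are invented for this example and are not Bart's actual library.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical descriptor; the field names are illustrative only. */
struct int_style {
    int width;   /* minimum field width, 0 if none was given */
    int plus;    /* nonzero: show '+' on non-negative values */
};

/* Parse a loose style string such as "+11" or "11 +" into a descriptor.
   Order does not matter and unknown characters are skipped, which is one
   way of being "not fussy about syntax". */
struct int_style parse_int_style(const char *s)
{
    struct int_style st = {0, 0};
    for (; *s != '\0'; ++s) {
        if (*s == '+') {
            st.plus = 1;
        } else if (isdigit((unsigned char)*s)) {
            st.width = st.width * 10 + (*s - '0');
        }
    }
    return st;
}
```

With this sketch both spellings yield width 11 with the plus flag set; a stricter parser would reject anything it does not recognise.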

>> Well, this is the problem. Where will the array of characters be
>> located especially if the size is unpredictable?
>
>     Why [as a user] do you care?  Your language either has
> strings as a useful type or it doesn't.

The static language doesn't. It doesn't mean it's not possible to do
things, it's just more work.

But there is a greater need for helpful Print features that involve
text, which are one corner of the language, than to do 100 times as much
work in transforming that language into one with first class strings.

I use this language to implement my higher-level one, and to take care
of things that can't be done in scripting code.

> [In response to James:]
>>> How on earth do you think we managed in olden
>>> times, before we had "dynamic memory"?  [Ans:  by printing each
>>> character in sequence.]
>> Probably the printing tasks weren't that challenging. As my A68G
>> example showed, output tended to be tabulated.
>
>     Perhaps you could write that in a more patronising form?

I've lost track here of your argument.

You say that a language ought to have strings as a proper type. But you
also say it doesn't need them. So which is it?

I think the thread is partly about how to add custom printing to a
language that doesn't have automatically managed string types. Example:

    record date =
        int day, month, year
    end

    date d := (5,1,2022)

    print d

Ideally the output should be something like "5-Jan-2022". But my static
language doesn't support this at all, not even printing the 3 numbers
involved. My dynamic one just shows "(5,1,2022)".

One solution - this is about static code from now on - is to write a
function like this:

    function strdate(date &d)ichar =
        static [100]char str
        static []ichar months = ("Jan","Feb","Mar","Apr","May","Jun",
                                 "Jul","Aug","Sep","Oct","Nov","Dec")

        fprint @str, "#-#-#", d.day, months[d.month], d.year
        return str
    end

then write:

print strdate(d)

but this has obvious limitations: the result of strdate() must be
consumed immediately for example. For more complex types, the string
could be arbitrarily long; what should that buffer size be?

While returning an allocated string means something needs to deallocate
it sooner or later, preferably sooner, but via which mechanism? It could
also mean arbitrary large strings that can cause issues.
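
A third option, common in C, is to have the caller supply the buffer, so there is neither a shared static buffer nor a heap allocation to free. The sketch below is a rough C analogue of strdate() under that assumption; the format_date name and its return convention are choices made for this example, not part of Bart's language.

```c
#include <stddef.h>
#include <stdio.h>

struct date { int day, month, year; };

/* Write "D-Mon-YYYY" into buf (capacity cap, terminator included).
   Returns the length the full text needs, as snprintf does, so the caller
   can detect truncation by checking result >= cap.
   Returns -1 if the month is out of range. */
int format_date(char *buf, size_t cap, const struct date *d)
{
    static const char *const months[12] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    if (d->month < 1 || d->month > 12)
        return -1;
    return snprintf(buf, cap, "%d-%s-%d",
                    d->day, months[d->month - 1], d->year);
}
```

The buffer-size question does not go away, but it moves to the caller, who can at least see from the return value when the chosen size was too small.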

Another approach is to use:

print d

and for it to know, through an association made elsewhere, that turning
d into a string (or otherwise serialising it) involves calling the function.

I don't think it's helpful to suggest that either the language needs to
be transformed into a higher level one, just for Print, or that it
doesn't need any such features, because decades ago we all seemed to
manage to print dates with basic Print. Yes I can do that now too:

    print d.day,,"-",,months[d.month],,"-",,d.year   # ,, means no space
    fprint "#-#-#", d.day, months[d.month], d.year
    printdate(d); println

but as you can see it's a bit of a pig.

Andy Walker

Jan 7, 2022, 8:20:30 AM
On 05/01/2022 13:12, Bart wrote:
> Format codes form a little language of their own. Bear in mind all
> the different parameters that could be used to control the appearance
> of an integer or float value.

Yes, but that's the point. As you point out with your "date"
example, you're typically on your own if you want to print something
that doesn't fit neatly into the "little language" [such as dates or
Roman numerals], and there are so many "different parameters" that
the little language grows into something bigger, while still being
insufficient for many needs. Learning it is a distraction from the
main programming language involved.

>>> Probably the printing tasks weren't that challenging. As my A68G
>>> example showed, output tended to be tabulated.
>>      Perhaps you could write that in a more patronising form?
> I've lost track here of your argument.

You were implying that before modern times, only simple
output used to be needed. That's patronising rubbish.

> You say that a language ought to have strings as a proper type. But
> you also say it doesn't need them. So which is it?

Both, of course. Proper strings make life easier, but
it's possible to program around that as long as your language
has /some/ way of constructing and printing characters. Any
modern general-purpose language /ought/ to be able to have
arrays [whether of characters, integers, procedures returning
structures, ...] as parameters and as function return values.
You can program around the lack, but these days you shouldn't
have to.

> I think the thread is partly about how to add custom printing to a
> language that doesn't have automatically managed string types.

Well, it was about whether languages should have formats
[presumably somewhat similar to those in C]. I don't see the
point. In C terms, the simple cases [such as "%s", "%d"] can
be replaced by simply printing the string or whatever, the
slightly more complicated cases by some equivalent to the A68
"whole", "fixed" and "float" procedures, and you have to roll
your own with anything complex anyway, as your date example
[snipped] shows.

[...]
> While returning an allocated string means something needs to
> deallocate it sooner or later, preferably sooner, but via which
> mechanism? It could also mean arbitrary large strings that can cause
> issues.

You're back to implementation issues. Not the concern
of the user who wants to print dates that look nice. Meanwhile,
the implementation issues were solved more than half a century
ago. I don't know why you and James are so opposed to the use
of heap storage [and temporary files, if you really want strings
that are many gigabytes]?

> I don't think it's helpful to suggest that either the language needs
> to be transformed into a higher level one, just for Print, or that it
> doesn't need any such features, because decades ago we all seemed to
> manage to print dates with basic Print.

"Need" is an exaggeration. But in any case no-one here
has suggested either part of that [esp not if you replace "need"
by "desirable" (with appropriate changes to the grammar)]. It is
indeed desirable for a modern language to have the ability to
allocate [and deallocate] off-stack storage and the ability to
print characters. Is there any major general-purpose computing
language of the past sixty years that has not had such abilities?

Meanwhile, you're proposing adding a "little language"
to the language spec "just for Print". Why is that any better
than adding "fixed", "float" and "whole" to the library?

--
Andy Walker, Nottingham.
Andy's music pages: www.cuboid.me.uk/andy/Music
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Dussek

Bart

Jan 7, 2022, 1:14:51 PM
On 07/01/2022 13:20, Andy Walker wrote:
> On 05/01/2022 13:12, Bart wrote:

>     You're back to implementation issues.  Not the concern
> of the user who wants to print dates that look nice.  Meanwhile,
> the implementation issues were solved more than half a century
> ago.  I don't know why you and James are so opposed to the use
> of heap storage [and temporary files, if you really want strings
> that are many gigabytes]?

Because heap storage requires a more advanced language to manage
properly, ie. automatically. (I don't care for speculative GC methods.)

>> I don't think it's helpful to suggest that either the language needs
>> to be transformed into a higher level one, just for Print, or that it
>> doesn't need any such features, because decades ago we all seemed to
>> manage to print dates with basic Print.
>
>     "Need" is an exaggeration.  But in any case no-one here
> has suggested either part of that [esp not if you replace "need"
> by "desirable" (with appropriate changes to the grammar)].  It is
> indeed desirable for a modern language to have the ability to
> allocate [and deallocate] off-stack storage and the ability to
> print characters.  Is there any major general-purpose computing
> language of the past sixty years that has not had such abilities?

Plenty don't do such allocations /automatically/.

>     Meanwhile, you're proposing adding a "little language"
> to the language spec "just for Print".  Why is that any better
> than adding "fixed", "float" and "whole" to the library?

Until about 2010 I only had simple Print. If I wanted to display something
in hex for example, I had a library function called Hex:

print hex(a)

And that function hex was defined like this (in an older version of my
language):

    FUNCTION HEX(n)=
        static [1..20]char str
        sprintf(^str," %XH",n)
        ^str
    END

Notice it's using C's 'sprintf' to do the work; I can't be having that!
(Yes I can do this easily enough with my own code, but why bother when
it's there ready to use.)

So if I'm having to utilise something from another language (and C at
that), it sounds like something I ought to have built-in to mine:

print a:"h"

(The :fmt syntax comes from Pascal.) Inside my library, I do it with
low-level code; C's sprintf is not suitable as there are extra controls.

But also, inside my library, the intermediate string data is much more
easily managed.

> Meanwhile, you're proposing adding a "little language"
> to the language spec "just for Print".

It can be called a 'language' for C-style formats as there is a syntax
involved. (The 'just for Print' comment was about adding advanced
features to the language.)

As I do it, it's more of a style string, as I frequently use within
applications, for controlling GUI elements for example.

It is the x:fmt syntax that is specific to Print, and is really
syntactic sugar:

print x, y:fmt

is equivalent (when x and y have i64 types) to:

    m$print_startcon()
    m$print_i64_nf(x)       # this calls m$print_i64(x, nil)
    m$print_i64(y,fmt)
    m$print_end()

(The start/end functions provide a print context, eg for output to
console, file, string, and allow for logic to insert spaces /between/
items.)
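
The lowering above can be imitated in C to show where the "space between items" logic lives. This is a sketch under assumptions: the struct print_ctx type and the function names are invented here, the context writes to a caller-supplied string rather than the console, and the "h" style is assumed to mean upper-case hex.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical print context targeting a caller-supplied string. */
struct print_ctx {
    char  *buf;
    size_t cap;
    size_t len;
    int    items;   /* number of items printed so far */
};

void print_start(struct print_ctx *pc, char *buf, size_t cap)
{
    pc->buf = buf; pc->cap = cap; pc->len = 0; pc->items = 0;
    if (cap > 0) buf[0] = '\0';
}

/* Append text, silently truncating when the target is full. */
static void ctx_append(struct print_ctx *pc, const char *text)
{
    while (*text != '\0' && pc->len + 1 < pc->cap)
        pc->buf[pc->len++] = *text++;
    if (pc->cap > 0) pc->buf[pc->len] = '\0';
}

/* fmt == NULL means default decimal; a leading 'h' selects hex. */
void print_i64(struct print_ctx *pc, int64_t v, const char *fmt)
{
    char tmp[24];   /* enough for any 64-bit value in decimal or hex */
    if (pc->items++ > 0) ctx_append(pc, " ");   /* separator between items */
    if (fmt != NULL && fmt[0] == 'h')
        snprintf(tmp, sizeof tmp, "%" PRIX64, (uint64_t)v);
    else
        snprintf(tmp, sizeof tmp, "%" PRId64, v);
    ctx_append(pc, tmp);
}

/* Nothing to do for a string target; a file target might flush here. */
void print_end(struct print_ctx *pc) { (void)pc; }
```

Printing 10 with no style and then 255 with "h" leaves "10 FF" in the buffer. The only intermediate storage is the small fixed-size tmp array on the stack.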

James Harris

Jan 8, 2022, 11:18:13 AM
On 02/01/2022 18:25, Dmitry A. Kazakov wrote:
> On 2022-01-02 19:05, James Harris wrote:
>> On 28/12/2021 09:21, Dmitry A. Kazakov wrote:

...

>>> The most flexible is a combination of a string that carries most of
>>> the information specific to the datatype (an OO method) and some
>>> commands to the rendering environment.
>>
>> That sounds interesting. How would it work?
>
> With single dispatch you have an interface, say, 'printable'. The
> interface has an abstract method 'image' with the profile:
>
>    function Image (X : Printable) return String;
>
> Integer, float, string, whatever that has to be printable inherits to
> Printable and thus overrides Image. That is.
>
> The same goes with serialization/streaming etc.
>

OK, that was my preferred option, too. The trouble with it is that it
needs somewhere to put the string form. And you know the problems
therewith (lack of recursion OR lack of thread safety OR dynamic memory
management).

My more recent suggestion would not need any special place for the
string form. Anything to be output could be written directly to wherever
the print function was writing to.
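
A minimal C sketch of that idea: the formatter is handed a "sink" callback and pushes each character to it as it is generated, so no string form of the value ever exists. All names here (emit_fn, emit_date, buf_sink) are hypothetical, and the fixed-buffer sink is only there so the result can be inspected; a sink that calls putchar would work the same way.

```c
#include <stddef.h>

/* A sink receives one character at a time; ctx carries the sink's state. */
typedef void (*emit_fn)(void *ctx, char c);

struct date { int day, month, year; };

/* Example sink (an assumption for this sketch): append to a fixed buffer,
   silently dropping characters that do not fit. */
struct fixed_buf { char data[32]; size_t len; };

void buf_sink(void *ctx, char c)
{
    struct fixed_buf *b = ctx;
    if (b->len + 1 < sizeof b->data) {
        b->data[b->len++] = c;
        b->data[b->len] = '\0';
    }
}

static void emit_str(emit_fn emit, void *ctx, const char *s)
{
    while (*s != '\0') emit(ctx, *s++);
}

/* Emit an int in decimal without an intermediate string: find the highest
   power of ten, then emit digits from the most significant down. */
static void emit_int(emit_fn emit, void *ctx, int n)
{
    unsigned int u = (unsigned int)n;
    unsigned int div = 1;
    if (n < 0) { emit(ctx, '-'); u = 0u - u; }
    while (u / div >= 10) div *= 10;
    for (; div > 0; div /= 10)
        emit(ctx, (char)('0' + (u / div) % 10));
}

/* User-defined formatter: renders a date as D-Mon-YYYY straight to the sink. */
void emit_date(emit_fn emit, void *ctx, const struct date *d)
{
    static const char *const months[12] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    emit_int(emit, ctx, d->day);
    emit(ctx, '-');
    emit_str(emit, ctx,
             (d->month >= 1 && d->month <= 12) ? months[d->month - 1] : "???");
    emit(ctx, '-');
    emit_int(emit, ctx, d->year);
}
```

The formatter is re-entrant and needs no heap; any buffering is the sink's business.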


--
James Harris

James Harris

Jan 8, 2022, 11:29:36 AM
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
> On 2022-01-02 17:06, James Harris wrote:
>
>> If you convert to strings then what reclaims the memory used by those
>> strings?
>
> What reclaims memory used by those integers?

Depends on where they are:

1. Globals - reclaimed when the program exits.

2. Locals on the stack or in activation records - reclaimed at least by
the time the function exits.

3. Dynamic on the heap - management required.

Your solution of creating string forms of values is reasonable in a
language and an environment which already have dynamic memory management
but not otherwise.

>
>> Not all languages have dynamic memory management, and dynamic memory
>> management is not ideal for all compilation targets.
>
> No dynamic memory management is required for handling temporary objects.

Where would you put the string forms?

>
> ----------
> If that were relevant in the case of formatted output, which has a
> massive overhead, so that even when using the heap (which is no way
> necessary) it would leave a little or no dent. I remember a SysV C
> compiler which modified the format string of printf in a misguided
> attempt to save a little bit memory, while the linker put string
> constants in the read-only memory...
>

Whether formatted or not, all IO tends to have higher costs than
computation and for most applications the cost of printing doesn't
matter. But when designing a language or a standard library it's a bad
idea to effectively impose a scheme which has a higher cost than
necessary because the language designer doesn't know what uses his
language will be put to.


--
James Harris

James Harris

Jan 8, 2022, 11:36:06 AM
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:

...

> If this
>
>    Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>
> is a problem in your language, then the job is not done.

What's wrong with

put_line("X=%i;, Y=%i;", X, Y)

?

I see that you've gone for default formatting so I did the same. I could
(under the suggestion) customise them by putting a format specification
between % and ;. How would you customise the format of X'Image and Y'Image?


--
James Harris

James Harris

Jan 8, 2022, 11:50:19 AM
I don't know why you bring locking into it. It's neither necessary nor
relevant.

Furthermore, the only use for an output buffer is to make output more
efficient; it's not fundamental.

On a minimal system a print function only has to emit one character at a
time. No buffering. No memory management. For sure, there can be
buffering added on top but that would be for performance reasons.

Whether a buffer is used or not ISTM your schemes imply significant
memory management. There's nothing wrong with that. It's quite normal in
most languages and on most hardware that anyone is likely to use these
days. But I return to my point that it's best not to design output
facilities which need such complexity because one cannot enumerate every
environment in which a language will ever be used.


--
James Harris

James Harris

Jan 8, 2022, 12:15:59 PM
On 04/01/2022 00:35, Andy Walker wrote:
> On 02/01/2022 16:06, James Harris wrote:
>>>> The printf approach to printing is flexible and fast at rendering
>>>> inbuilt types - probably better than anything which came before it -
>>>> but it's not perfect.
>>>      No, it's rubbish.  If you need formatted transput [not
>>> entirely convinced, but chacun a son gout],
>> If you are not convinced by formatted io then what kind of io do you
>> prefer?
>
>     Unformatted transput, of course.  Eg,
>
>   print this, that, these, those and the many other things

OK. I would say that that has /default/ formatting but is still formatted.

>
> [with whatever syntax, quoting, separators, etc you prefer].  Much
> the same for "read".  Most of the time the default is entirely
> adequate.  If not, then the choice for the language designer is
> either an absurdly complicated syntax that still probably doesn't
> meet some plausible needs, or to provide simple mechanisms that
> allow programmers to roll their own.  Guess which I prefer.

You have a preferred syntax in which users can express how they want
values to be formatted? Suggestions welcome!


>>> then instead of all
>>> the special casing, the easiest and most flexible way is to
>>> convert everything to strings.
>> If you convert to strings then what reclaims the memory used by those
>> strings? Not all languages have dynamic memory management, and
>> dynamic memory management is not ideal for all compilation targets.
>
>     AIUI, you are designing your own language.  If it doesn't
> have strings, eg as results of procedures, then you have much worse
> problems than designing some transput procedures.  There are lots
> of ways of implementing strings, but they are for the compiler to
> worry about, not the language designer [at least, once you know it
> can be done].

The design of the language supports dynamically sized strings. That's
fine for applications which need them. But that's different from
imposing such strings and the management thereof on every print operation.

>
>> The form I proposed had no need for dynamic allocations. That's part
>> of the point of it.
>
>     There's no need for "dynamic allocations" merely for transput.
> Most of the early languages that I used didn't have them, but still
> managed to print and read things.  You're making mountains out of
> molehills.

Oh? How would you handle

widget w
print(w)

?

ISTM you are suggesting that w be converted to a string (or array of
characters) and then printed.

>
>> I'm not sure you understand the proposal. To be clear, the print
>> routine would be akin to
>>    print("String with %kffff; included", val)
>> where the code k would be used to select a formatter. The formatter
>> would be passed two things:
>>    1. The format from k to ; inclusive.
>>    2. The value val.
>> As a result, the format ffff could be as simple as someone could
>> design it.
>
>     Yes, that's what I thought you meant.  C is thataway -->.
> You call it "simple";  C is one of the simpler languages of this
> type, yet the full spec of "printf" and its friends is horrendous.

I agree. C's printf formats are fine for simple cases. But they don't
scale. That's why I suggested a scheme which was more flexible.
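
For concreteness, here is one way the %EB; dispatch could look in C. It is a sketch under several assumptions that the proposal does not fix: engines are registered in a table indexed by the code character, values arrive as an array of pointers rather than as varargs, and there is no escape for a literal % sign. The names register_engine, print_fmt and the two engines are invented for this example.

```c
#include <stddef.h>

typedef void (*emit_fn)(void *ctx, char c);

/* An engine gets the sink, the format text starting at its code letter
   (it runs up to the ';'), and a pointer to the value. */
typedef void (*engine_fn)(emit_fn emit, void *ctx,
                          const char *fmt, const void *val);

static engine_fn engines[256];   /* indexed by engine code character */

void register_engine(char code, engine_fn fn)
{
    engines[(unsigned char)code] = fn;
}

void print_fmt(emit_fn emit, void *ctx, const char *s,
               const void *const *vals)
{
    while (*s != '\0') {
        if (*s != '%') { emit(ctx, *s++); continue; }
        const char *spec = s + 1;            /* points at the engine code */
        const char *end = spec;
        while (*end != '\0' && *end != ';') ++end;
        if (*end == '\0') return;            /* unterminated field: stop */
        if (end != spec) {
            engine_fn fn = engines[(unsigned char)*spec];
            if (fn != NULL) fn(emit, ctx, spec, *vals);
            ++vals;                          /* each field consumes one value */
        }
        s = end + 1;
    }
}

/* Engine 'i': signed decimal int. This sketch ignores the body after 'i'. */
void engine_int(emit_fn emit, void *ctx, const char *fmt, const void *val)
{
    int n = *(const int *)val;
    unsigned int u = (unsigned int)n;
    unsigned int div = 1;
    (void)fmt;
    if (n < 0) { emit(ctx, '-'); u = 0u - u; }
    while (u / div >= 10) div *= 10;
    for (; div > 0; div /= 10)
        emit(ctx, (char)('0' + (u / div) % 10));
}

/* A "user-supplied" engine 'B': renders an int as true/false. */
void engine_bool(emit_fn emit, void *ctx, const char *fmt, const void *val)
{
    const char *t = *(const int *)val ? "true" : "false";
    (void)fmt;
    while (*t != '\0') emit(ctx, *t++);
}

/* Example sink: append to a fixed buffer, dropping what does not fit. */
struct fixed_buf { char data[64]; size_t len; };

void buf_sink(void *ctx, char c)
{
    struct fixed_buf *b = ctx;
    if (b->len + 1 < sizeof b->data) {
        b->data[b->len++] = c;
        b->data[b->len] = '\0';
    }
}
```

After registering 'i' and 'B', the control string "X=%i;, Y=%i; ok=%B;" with the values 3, -4 and 1 produces "X=3, Y=-4 ok=true". Note that this sketch does nothing to check that a value's type matches its engine.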


> Build in some version [exact syntax up to you] of
>
>   print "String with ", val, " included"
>
> and you're mostly done.  For the exceptional cases, use a library
> procedure or your own procedure to convert "val" to a suitable
> array of characters, with whatever parameters are appropriate.

a. Where would you store the array of characters?

b. What's wrong with

print "String with %v; included" % val

where v is a suitable format for the type of val.

>
>> Note that there would be no requirement for dynamic memory. The
>> formatter would just send for printing each character as it was
>> generated.
>> What's wrong with that? (Genuine question!)
>
>     Nothing.  How on earth do you think we managed in olden
> times, before we had "dynamic memory"?  [Ans:  by printing each
> character in sequence.]
>

Cool. But if you agree with my suggestion of a means which can be used
to render each character in sequence I don't know why you suggested
conversion to an array of characters.


--
James Harris

James Harris

Jan 8, 2022, 12:18:35 PM
On 04/01/2022 11:31, Bart wrote:
> On 04/01/2022 00:35, Andy Walker wrote:
>> On 02/01/2022 16:06, James Harris wrote:

...

>>> If you are not convinced by formatted io then what kind of io do you
>>> prefer?
>>
>>      Unformatted transput, of course.  Eg,
>>
>>    print this, that, these, those and the many other things
>
> If I do this in A68G:
>
>     print(("<",1,">"))
>
> the output is:
>
>     <         +1>

...

>    print val3(f())
>
> where f itself includes a 'print val'.


Two great examples!



--
James Harris

Dmitry A. Kazakov

Jan 8, 2022, 2:35:23 PM
On 2022-01-08 17:18, James Harris wrote:
> On 02/01/2022 18:25, Dmitry A. Kazakov wrote:
>> On 2022-01-02 19:05, James Harris wrote:
>>> On 28/12/2021 09:21, Dmitry A. Kazakov wrote:
>
> ...
>
>>>> The most flexible is a combination of a string that carries most of
>>>> the information specific to the datatype (an OO method) and some
>>>> commands to the rendering environment.
>>>
>>> That sounds interesting. How would it work?
>>
>> With single dispatch you have an interface, say, 'printable'. The
>> interface has an abstract method 'image' with the profile:
>>
>>     function Image (X : Printable) return String;
>>
>> Integer, float, string, whatever that has to be printable inherits to
>> Printable and thus overrides Image. That is.
>>
>> The same goes with serialization/streaming etc.
>
> OK, that was my preferred option, too. The trouble with it is that it
> needs somewhere to put the string form. And you know the problems
> therewith (lack of recursion OR lack of thread safety OR dynamic memory
> management).

No idea why you think there is something special about string format or
that any of the mentioned issues would ever apply. Conversion to string
needs no recursion, is as thread safe as any other call, needs no
dynamic memory management.

There should be no format specifications at all. You just need a few
parameters for Image regarding type-specific formatting and a few
parameters regarding rendering context in the actual output call.

The former are things like "put + if positive", base, precision etc; the
latter are things like output field width, alignment, fill character etc.

Of course, the formatting parameters bring things in the realm of
multiple dispatch:

    function Image (X : Printable; Options : Format := Default)
       return String;

Combinations of Printable x Format are multiple dispatch.

Multiple dispatch is unlikely to be supported, but in this case it could
be replaced by a variant record for Format. It would lack extensibility,
but where is any in printf? As an alternative to a variant record you can
give Format methods like:

function Base (X : Format) return Number_Base;

The methods would return defaults if not overridden. In both cases the
language must provide good support of aggregates to make format
specifications comfortable.
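
For readers more at home in C, the "options record with defaults" idea might be approximated as below. This is a loose analogue and not Ada: C has no dispatch here, the struct int_format type and image_int name are invented for the example, a NULL options pointer stands in for the default aggregate, and the rendering-context parameters (width, alignment, fill) are left out.

```c
#include <stddef.h>

/* Hypothetical type-specific options for integers. */
struct int_format {
    int base;        /* 2..16 */
    int show_plus;   /* nonzero: put '+' on non-negative values */
};

/* Convert value to text in buf (capacity cap, terminator included).
   opt == NULL means defaults: base 10, no plus sign.
   Returns the text length, or 0 if the base is invalid or buf is too small. */
size_t image_int(char *buf, size_t cap, long value,
                 const struct int_format *opt)
{
    static const char digits[] = "0123456789ABCDEF";
    int base = opt ? opt->base : 10;
    int plus = opt ? opt->show_plus : 0;
    char tmp[72];    /* room for 64 binary digits plus a sign */
    size_t n = 0;
    unsigned long u = (unsigned long)value;

    if (base < 2 || base > 16) return 0;
    if (value < 0) u = 0ul - u;
    do {                              /* digits come out least significant first */
        tmp[n++] = digits[u % (unsigned long)base];
        u /= (unsigned long)base;
    } while (u != 0);
    if (value < 0) tmp[n++] = '-';
    else if (plus) tmp[n++] = '+';

    if (n + 1 > cap) return 0;
    for (size_t i = 0; i < n; ++i)    /* reverse into the caller's buffer */
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```

Calling it with NULL gives plain decimal, while an options value with base 16 turns 255 into "FF". C99 compound literals go some way towards the "good support of aggregates" mentioned above.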

Dmitry A. Kazakov

Jan 8, 2022, 2:48:43 PM
On 2022-01-08 17:50, James Harris wrote:
> On 02/01/2022 18:40, Dmitry A. Kazakov wrote:
>> On 2022-01-02 18:54, James Harris wrote:
>>> On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
>>
>>>> If this
>>>>
>>>>     Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>>>>
>>>> is a problem in your language, then the job is not done.
>>>
>>> That would be poor for small-machine targets. Shame on Ada! ;-)
>>
>> That works perfectly well on small targets. You seem unaware of what
>> actually happens on I/O. For a "small" target it would be some sort of
>> networking stack with a terminal emulation on top of it. Believe me,
>> creating a temporary string on the stack is nothing in comparison to
>> that. Furthermore, the implementation would likely no less efficient
>> than printf which has no idea how large the result is and would have
>> to reallocate the output buffer or keep it between calls and lock it
>> from concurrent access. Locking on an embedded system is a
>> catastrophic event because switching threads is expensive as hell.
>> Note also that you cannot stream output, because networking protocols
>> and terminal emulators are much more efficient if you do bulk
>> transfers. All that is the infamous premature optimization.
>
> I don't know why you bring locking into it. It's neither necessary nor
> relevant.

Because this is how I/O works.

> Furthermore, the only use for an output buffer is to make output more
> efficient; it's not fundamental.

It is fundamental, there is no hardware anymore where you could just
send a single character to. A small target will write to the network
stack, e.g. use socket send over TCP, that will coalesce output into
network packets, these would be buffered into transport layer frames,
these will go to physical layer packets etc.

There is no such thing as character stream without a massive overhead
beneath it. So creating a string on the secondary stack is nothing in
comparison to that, especially when you skip the stream abstraction. Most
embedded software does. It tends to do I/O directly in packets. E.g.
sending application level packets over TCP or using UDP.

Tracing, the only place where text output is actually used, does not do
printf directly. It usually does some kind of locking when the output
consists of multiple printfs. Consider it an output transaction when
each instance of output is atomic. Of course, character streams have no
place there.

Dmitry A. Kazakov

Jan 8, 2022, 2:54:21 PM
On 2022-01-08 17:36, James Harris wrote:
> On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
>
> ...
>
>> If this
>>
>>     Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>>
>> is a problem in your language, then the job is not done.
>
> What's wrong with
>
>   put_line("X=%i;, Y=%i;", X, Y)
>
> ?

Untyped, unsafe, messy, non-portable garbage that does not work with
user-defined types.

And again, that is not the point. The point is that in any decent
language there is no need for either the printf mess or print statements.

Because the language abstractions are powerful enough to express
formatting I/O in language terms.

Dmitry A. Kazakov

Jan 8, 2022, 3:01:52 PM
On 2022-01-08 17:29, James Harris wrote:
> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>> On 2022-01-02 17:06, James Harris wrote:
>>
>>> If you convert to strings then what reclaims the memory used by those
>>> strings?
>>
>> What reclaims memory used by those integers?
>
> Depends on where they are:
>
> 1. Globals - reclaimed when the program exits.
>
> 2. Locals on the stack or in activation records - reclaimed at least by
> the time the function exits.
>
> 3. Dynamic on the heap - management required.

Now replace integer with string. The work is done.

> Your solution of creating string forms of values is reasonable in a
> language and an environment which already have dynamic memory management
> but not otherwise.

You do not need that.

>>> Not all languages have dynamic memory management, and dynamic memory
>>> management is not ideal for all compilation targets.
>>
>> No dynamic memory management is required for handling temporary objects.
>
> Where would you put the string forms?

The same place you put integer, float, etc. That place is called stack
or LIFO.

> Whether formatted or not, all IO tends to have higher costs than
> computation and for most applications the cost of printing doesn't
> matter. But when designing a language or a standard library it's a bad
> idea to effectively impose a scheme which has a higher cost than
> necessary because the language designer doesn't know what uses his
> language will be put to.

Did you do actual measurements?

printf obviously imposes higher costs than direct conversions. And also
costs that cannot be easily optimized since the format can be an
expression and even if a constant it is difficult to break down into
direct conversions.

Bart

Jan 8, 2022, 3:22:44 PM
But the result is ugly, ungainly code to do a simple task.

Look at the difference when it's done properly:

fprintln "X=#, Y=#", X, Y

Or as I also do it (as "=" prints the expression itself before
its value):

println =X, =Y

Back to that Ada, I wanted to do this:

Put_Line (X+Y'Image);

This didn't work (I guess ' has higher precedence than +?). But neither did:

Put_Line ((X+Y)'Image);

'Image' can only be applied to a name, not an expression; why? So I
have to use an intermediate variable: Z := X+Y then do Z'Image.

I think you're not in a position to tell people how to implement Print!
You need to be able to just do this:

println X+Y



Bart

Jan 8, 2022, 3:46:22 PM
On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
> On 2022-01-08 17:29, James Harris wrote:

>>> No dynamic memory management is required for handling temporary objects.
>>
>> Where would you put the string forms?
>
> The same place you put integer, float, etc. That place is called stack
> or LIFO.

(1) integer, float etc are a fixed size known at compile-time

(2) integer, float etc are usually manipulated by value

Apparently Ada strings have a fixed length. If so, that would make
implementing your ideas simpler; and I can do the same thing in my
lower-level language, but I don't consider that a viable solution.

It would be very constraining if you had to know, in advance, the size
of a string returned from a function for a complex user-type conversion.
Which I guess also means the function has the same limit for all
possible calls.

But Ada also has unbounded strings:

"Unbounded strings are allocated using heap memory, and are deallocated
automatically."

Yeah, so exactly what we've been talking about. Either the language has
that kind of advanced feature, based on the heap, or it uses crude
methods with fixed length strings (which was the problem with Pascal)
which is not really a solution; just the kind of workaround I use already.


Dmitry A. Kazakov

Jan 8, 2022, 3:49:06 PM
On 2022-01-08 21:22, Bart wrote:

> This didn't work (I guess ' has higher precedence than +?). But neither
> did:
>
>      Put_Line ((X+Y)'Image);
>
> 'Image' can only be applied to a name, not an expression; why?

Because 'Image is a type attribute:

<subtype>'Image (<value>)

So

Integer_32'image (X + Y)

> I think you're not in a position to tell people how to implement Print!

I am, just don't.

> You need to be able to just do this:
>
>     println X+Y

Nope, I don't need that at all. In Ada it is just this:

Put (X + Y);
New_Line;

See the package Integer_IO (ARM A.10.8). The point is that it is almost
never used because, again, it is not needed for real-life software.

Dmitry A. Kazakov

Jan 8, 2022, 4:04:29 PM
On 2022-01-08 21:46, Bart wrote:
> On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
>> On 2022-01-08 17:29, James Harris wrote:
>
>>>> No dynamic memory management is required for handling temporary
>>>> objects.
>>>
>>> Where would you put the string forms?
>>
>> The same place you put integer, float, etc. That place is called stack
>> or LIFO.
>
> (1) integer, float etc are a fixed size known at compile-time

So what?

> (2) integer, float etc are usually manipulated by value

Irrelevant.

> Apparently Ada strings have a fixed length.

Apparently not:

function Get_Line (File : File_Type) return String;

> It would be very constraining if you had to know, in advance, the size
> of a string returned from a function for a complex user-type conversion.

String is an indefinite type, the size of an object is unknown until
run-time.

> Which I guess also means the function has the same limit for all
> possible calls.

Wrong. Indefinite types are returned just the same as definite types are, on
the stack, which means memory management policy LIFO.

To widen your horizon a little bit, a stack LIFO can be implemented by
many various means: using machine stack, using machine registers, using
thread local storage as well as various combinations thereof.
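
One possible reading of that, sketched in C: a separate "secondary stack" from which functions carve results whose length is only known at run time, released in LIFO order by the caller. This is an illustration only and makes no claim about how any particular Ada compiler does it; the ss_ names, the pool size and the repeat_char example are all invented here, and a single global pool like this one would need to be per-thread to be thread safe.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical secondary stack: a fixed pool with a bump pointer. */
static char   ss_pool[4096];
static size_t ss_top;

/* Remember the current top so it can be restored later. */
size_t ss_mark(void) { return ss_top; }

/* Release everything allocated since the matching ss_mark (LIFO). */
void ss_release(size_t mark) { ss_top = mark; }

/* Carve n bytes off the pool; returns NULL when the pool is exhausted. */
char *ss_alloc(size_t n)
{
    if (n > sizeof ss_pool - ss_top) return NULL;
    char *p = ss_pool + ss_top;
    ss_top += n;
    return p;
}

/* Example of a function "returning a string" of run-time length:
   the result lives on the secondary stack, not on the heap. */
char *repeat_char(char c, size_t n)
{
    if (n >= sizeof ss_pool) return NULL;
    char *s = ss_alloc(n + 1);
    if (s == NULL) return NULL;
    memset(s, c, n);
    s[n] = '\0';
    return s;
}
```

A caller takes a mark, uses the returned string, then releases back to the mark; a compiler could insert the mark and release around each statement automatically. There is no free list and no garbage collector, but the pool size is a hard limit.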

> But Ada also has unbounded strings:
>
> "Unbounded strings are allocated using heap memory, and are deallocated
> automatically."

Unbounded_String is practically never needed and its use is discouraged,
because heap is a bad idea and because text processing algorithms almost
never require changing length/content of a string. If you do that, then
you do something wrong or the language is poor, e.g. does not support
string slices.

Bart

Jan 8, 2022, 4:05:57 PM
On 08/01/2022 20:49, Dmitry A. Kazakov wrote:
> On 2022-01-08 21:22, Bart wrote:
>
>> This didn't work (I guess ' has higher precedence than +?). But
>> neither did:
>>
>>       Put_Line ((X+Y)'Image);
>>
>> 'Image' can only be applied to a name, not an expression; why?
>
> Because 'Image is a type attribute:
>
>    <subtype>'Image (<value>)

And yet it works with X'Image when X is a variable, not a type.

> So
>
>    Integer_32'image (X + Y)

Yeah, that's much better!

>> I think you're not in a position to tell people how to implement Print!
>
> I am, just don't.

Tell me one thing that Ada can do with its Print scheme that I can't do
more simply and with less typing with mine.

>> You need to be able to just do this:
>>
>>      println X+Y
>
> Nope, I don't need that at all. In Ada it is just this:
>
>    Put (X + Y);

So it's overloading Put() with different types. But the language doesn't
similarly overload Put_Line()?

> See the package Integer_IO (ARM A.10.8). The point is that it is almost
> never used, because, again, it is not needed for real-life software.

Huh? Have you never written data to a file?



Bart

Jan 8, 2022, 4:18:28 PM
On 08/01/2022 21:04, Dmitry A. Kazakov wrote:
> On 2022-01-08 21:46, Bart wrote:
>> On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
>>> On 2022-01-08 17:29, James Harris wrote:
>>
>>>>> No dynamic memory management is required for handling temporary
>>>>> objects.
>>>>
>>>> Where would you put the string forms?
>>>
>>> The same place you put integer, float, etc. That place is called
>>> stack or LIFO.
>>
>> (1) integer, float etc are a fixed size known at compile-time
>
> So what?
>
>> (2) integer, float etc are usually manipulated by value
>
> Irrelevant.

Relevant because you are suggesting that strings can be manipulated just
like a 4- or 8-byte primitive type.

>> Apparently Ada strings have a fixed length.
>
> Apparently not:
>
>    function Get_Line (File : File_Type) return String;

Yet I can't do this:

S: String;

"unconstrained subtype not allowed". It needs a size or to be
initialised from a literal of known length.


>
>> It would be very constraining if you had to know, in advance, the size
>> of a string returned from a function for a complex user-type conversion.
>
> String is an indefinite type, the size of an object is unknown until
> run-time.
>
>> Which I guess also means the function has the same limit for all
>> possible calls.
>
> Wrong. Indefinite types are returned just the same as definite types are, on
> the stack, which means a LIFO memory management policy.
>
> To widen your horizon a little bit, a LIFO stack can be implemented by
> various means: using the machine stack, machine registers, thread-local
> storage, or various combinations of these.

Suppose you have this:

Put_Line(Get_Line(...));

Can you go into some detail as to what, exactly, is passed back from
Get_Line(), what, exactly, is passed to Put_Line(), bearing in mind that
64-bit ABIs frown on passing by value any args more than 64-bits, and
where, exactly, the actual string data, which can be of any length,
resides during this process, and how that string data is destroyed when
it is no longer needed?

Then perhaps you might explain in what way that is identical to passing
an Integer to Put().

>> But Ada also has unbounded strings:
>>
>> "Unbounded strings are allocated using heap memory, and are
>> deallocated automatically."
>
> Unbounded_String is practically never needed and its use is discouraged,
> because the heap is a bad idea and because text-processing algorithms
> almost never require changing the length or content of a string.

It seems you've never written a text editor either!

James Harris

Jan 8, 2022, 4:41:50 PM
On 08/01/2022 19:54, Dmitry A. Kazakov wrote:
> On 2022-01-08 17:36, James Harris wrote:
>> On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
>>
>> ...
>>
>>> If this
>>>
>>>     Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>>>
>>> is a problem in your language, then the job is not done.
>>
>> What's wrong with
>>
>>    put_line("X=%i;, Y=%i;", X, Y)
>>
>> ?
>
> Untyped, unsafe, messy, non-portable garbage that does not work with
> user-defined types.

That's just wrong. It is typesafe, clean and portable. What's more, per
the suggestion I made to start this thread it will work with
user-defined types.

You should at least try to understand an idea before you dismiss it! ;-)

(The only way it would become type unsafe would be via such as

format_string = F()
put_line(format_string, X, Y)

and that does not have to be allowed.)
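
To make the type-safety point concrete, here is a small C sketch of the
check a compiler could make when the format is a literal. It is only an
illustration under assumptions: the function name format_matches is
invented, argument types are encoded as one character per argument
(e.g. 'i' for integer), only the first character after '%' is treated as
the engine code, and there is no escape for a literal '%'. A real
compiler would do this at compile time; here a run-time function stands
in for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Walk a "%E...;" style format and check that the engine code E of each
   field matches the kind of the corresponding argument.
   arg_kinds holds one character per argument, e.g. "ii" for two integers. */
static bool format_matches(const char *fmt, const char *arg_kinds)
{
    assert(fmt != NULL && arg_kinds != NULL);
    size_t arg = 0;
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%') continue;                 /* ordinary character */
        p++;                                     /* move to engine code */
        if (*p == '\0') return false;            /* '%' at end of format */
        if (*p != arg_kinds[arg]) return false;  /* wrong kind or too few args */
        arg++;
        while (*p != '\0' && *p != ';') p++;     /* skip the format body */
        if (*p == '\0') return false;            /* missing ';' terminator */
    }
    return arg_kinds[arg] == '\0';               /* no arguments left over */
}
```

With this, "X=%i;, Y=%i;" is accepted for two integer arguments and
rejected for one integer, for an integer plus some other kind, or when a
field lacks its closing ';'.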


--
James Harris

James Harris

Jan 8, 2022, 4:45:55 PM
On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
> On 2022-01-08 17:29, James Harris wrote:
>> On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
>>> On 2022-01-02 17:06, James Harris wrote:
>>>
>>>> If you convert to strings then what reclaims the memory used by
>>>> those strings?
>>>
>>> What reclaims memory used by those integers?
>>
>> Depends on where they are:
>>
>> 1. Globals - reclaimed when the program exits.
>>
>> 2. Locals on the stack or in activation records - reclaimed at least
>> by the time the function exits.
>>
>> 3. Dynamic on the heap - management required.
>
> Now replace integer with string. The work done.

Strings are, in general, not of fixed length.

>
>> Your solution of creating string forms of values is reasonable in a
>> language and an environment which already have dynamic memory
>> management but not otherwise.
>
> You do not need that.
>
>>>> Not all languages have dynamic memory management, and dynamic memory
>>>> management is not ideal for all compilation targets.
>>>
>>> No dynamic memory management is required for handling temporary objects.
>>
>> Where would you put the string forms?
>
> The same place you put integer, float, etc. That place is called stack
> or LIFO.

Integer and Float are of fixed length.

>
>> Whether formatted or not, all IO tends to have higher costs than
>> computation and for most applications the cost of printing doesn't
>> matter. But when designing a language or a standard library it's a bad
>> idea to effectively impose a scheme which has a higher cost than
>> necessary because the language designer doesn't know what uses his
>> language will be put to.
>
> Did you do actual measurements?

Did you?

:-)

>
> printf obviously imposes higher costs than direct conversions. And also
> costs that cannot be easily optimized since the format can be an
> expression and even if a constant it is difficult to break down into
> direct conversions.
>

I am not defending printf.


--
James Harris

James Harris

Jan 8, 2022, 4:58:05 PM
On 08/01/2022 19:48, Dmitry A. Kazakov wrote:
> On 2022-01-08 17:50, James Harris wrote:
>> On 02/01/2022 18:40, Dmitry A. Kazakov wrote:
>>> On 2022-01-02 18:54, James Harris wrote:
>>>> On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
>>>
>>>>> If this
>>>>>
>>>>>     Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
>>>>>
>>>>> is a problem in your language, then the job is not done.
>>>>
>>>> That would be poor for small-machine targets. Shame on Ada! ;-)
>>>
>>> That works perfectly well on small targets. You seem unaware of what
>>> actually happens on I/O. For a "small" target it would be some sort
>>> of networking stack with a terminal emulation on top of it. Believe
>>> me, creating a temporary string on the stack is nothing in comparison
>>> to that. Furthermore, the implementation would likely be no less
>>> efficient than printf which has no idea how large the result is and
>>> would have to reallocate the output buffer or keep it between calls
>>> and lock it from concurrent access. Locking on an embedded system is
>>> a catastrophic event because switching threads is expensive as hell.
>>> Note also that you cannot stream output, because networking protocols
>>> and terminal emulators are much more efficient if you do bulk
>>> transfers. All that is the infamous premature optimization.
>>
>> I don't know why you bring locking into it. It's neither necessary nor
>> relevant.
>
> Because this is how I/O works.

Why? Where there's no contention and just one task producing a certain
stream's output what is there to lock from?

I have even designed a lock-free way of allowing two or more tasks to
write to the same stream so I don't buy into the conventional wisdom
that locks are necessary.

>
>> Furthermore, the only use for an output buffer is to make output more
>> efficient; it's not fundamental.
>
> It is fundamental, there is no hardware anymore where you could just
> send a single character to.

Of course there is. For example, a 7-segment display. Another: an async
serial port.


> A small target will write to the network
> stack, e.g. use socket send over TCP, that will coalesce output into
> network packets, these would be buffered into transport layer frames,
> these will go to physical layer packets etc.

In a couple of replies recently you've mentioned a communication stack.
I don't know why you are thinking of such a thing but not all
communication uses the OSI 7-layer model! ;-)

>
> There is no such thing as a character stream without a massive overhead
> beneath it. So creating a string on the secondary stack is nothing in
> comparison to that, especially when you skip the stream abstraction. Most
> embedded software does. It tends to do I/O directly in packets, e.g.
> sending application-level packets over TCP or using UDP.
>
> Tracing, the only place where text output is actually used,

Tracing is 'the only place where text output is used'? :-o


--
James Harris

James Harris

Jan 8, 2022, 5:11:30 PM
On 08/01/2022 19:35, Dmitry A. Kazakov wrote:
> On 2022-01-08 17:18, James Harris wrote:
>> On 02/01/2022 18:25, Dmitry A. Kazakov wrote:
>>> On 2022-01-02 19:05, James Harris wrote:
>>>> On 28/12/2021 09:21, Dmitry A. Kazakov wrote:
>>
>> ...
>>
>>>>> The most flexible is a combination of a string that carries most of
>>>>> the information specific to the datatype (an OO method) and some
>>>>> commands to the rendering environment.
>>>>
>>>> That sounds interesting. How would it work?
>>>
>>> With single dispatch you have an interface, say, 'printable'. The
>>> interface has an abstract method 'image' with the profile:
>>>
>>>     function Image (X : Printable) return String;
>>>
>>> Integer, float, string, whatever has to be printable inherits from
>>> Printable and thus overrides Image. That is all.
>>>
>>> The same goes with serialization/streaming etc.
>>
>> OK, that was my preferred option, too. The trouble with it is that it
>> needs somewhere to put the string form. And you know the problems
>> therewith (lack of recursion OR lack of thread safety OR dynamic
>> memory management).
>
> No idea why you think there is something special about string format or
> that any of the mentioned issues would ever apply. Conversion to string
> needs no recursion,

It does if a to-string function invokes another to-string function.

> is as thread safe as any other call, needs no
> dynamic memory management.

Unless you know the maximum size of the string (and you prohibit
recursion and you keep it thread local) then you cannot reserve space
for it in the activation record of the caller (or in global space). As
you know, if you create it in the activation record of the to-string
function then its memory will go out of scope when the function returns.

However, if the formatter is passed the value and the format (my
suggestion) then it (the formatter) can print the characters one by one
or could write them to a buffer - with the buffer being legitimately and
safely deallocated when the formatter returns.
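
A minimal C sketch of that arrangement, under assumptions: the formatter
is handed the value plus a "sink" callback and emits characters one by
one, so the only scratch space is a small array in the formatter's own
frame, which goes away when it returns. The names sink_fn,
render_unsigned, buf_sink and buf_put are made up for the illustration,
and the format is reduced to a single base parameter.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* A sink receives rendered characters one at a time. */
typedef void (*sink_fn)(void *ctx, char c);

/* Hypothetical rendering engine for unsigned integers. The digits are
   produced in reverse into a local array, then sent to the sink in order. */
static void render_unsigned(unsigned long value, unsigned base,
                            sink_fn put, void *ctx)
{
    assert(base >= 2 && base <= 16);
    char digits[sizeof value * CHAR_BIT];   /* enough even for base 2 */
    size_t n = 0;
    do {
        digits[n++] = "0123456789abcdef"[value % base];
        value /= base;
    } while (value != 0);
    while (n > 0) put(ctx, digits[--n]);
}

/* One possible sink: append to a caller-owned buffer, dropping overflow
   and keeping the contents NUL-terminated. */
struct buf_sink { char *data; size_t cap; size_t len; };

static void buf_put(void *ctx, char c)
{
    struct buf_sink *b = ctx;
    if (b->len + 1 < b->cap) {
        b->data[b->len++] = c;
        b->data[b->len] = '\0';
    }
}
```

A sink that writes each character straight to a device would have the
same signature as buf_put, so the formatter itself does not need to know
whether output is buffered.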

>
> There should be no format specifications at all. You just need a few
> parameters for Image regarding type-specific formatting and a few
> parameters regarding rendering context in the actual output call.
>
> The former are like put + if positive, base, precision etc; the latter
> are like output field width, alignment, fill character etc.

That sounds good. Perhaps other things should be added: fixed or
floating sign, leading or trailing sign, different potential
representations of bases, digit grouping, fixed-point scaling, response
to exceeding a field width, etc.
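
For what it's worth, the split between the two parameter groups could be
sketched in C roughly as below. Everything here is an assumption made
for illustration: the struct and function names are invented, only base
and plus-if-positive are modelled on the type-specific side, only width,
alignment and fill on the rendering side, and text wider than the field
is simply copied unpadded (one possible policy among several).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Type-specific options: how the value itself is rendered. */
struct image_opts { int base; int plus_if_positive; };

/* Rendering-context options: how the image is placed in the output. */
enum align { ALIGN_LEFT, ALIGN_RIGHT };
struct field_opts { size_t width; enum align align; char fill; };

/* Render value into out (cap bytes). Returns the length written,
   or 0 if out is too small. */
static size_t integer_image(long value, struct image_opts o,
                            char *out, size_t cap)
{
    assert(o.base >= 2 && o.base <= 16);
    char tmp[70];                        /* digits in reverse, plus sign */
    size_t n = 0;
    unsigned long mag = value < 0 ? 0UL - (unsigned long)value
                                  : (unsigned long)value;
    do {
        tmp[n++] = "0123456789abcdef"[mag % (unsigned)o.base];
        mag /= (unsigned)o.base;
    } while (mag != 0);
    if (value < 0) tmp[n++] = '-';
    else if (value > 0 && o.plus_if_positive) tmp[n++] = '+';
    if (n + 1 > cap) return 0;
    for (size_t i = 0; i < n; i++) out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return n;
}

/* Place an already-rendered image into a field. Returns the length
   written, or 0 if out is too small. */
static size_t place_in_field(const char *image, struct field_opts f,
                             char *out, size_t cap)
{
    size_t len = strlen(image);
    size_t total = len > f.width ? len : f.width;
    if (total + 1 > cap) return 0;
    size_t at = (f.align == ALIGN_RIGHT) ? total - len : 0;
    memset(out, f.fill, total);          /* fill the whole field first */
    memcpy(out + at, image, len);        /* then drop the image in place */
    out[total] = '\0';
    return total;
}
```

The extra items listed above (digit grouping, trailing sign, overflow
response and so on) would become further members of one struct or the
other.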


--
James Harris

Dmitry A. Kazakov

Jan 8, 2022, 5:57:25 PM