On 27/12/2021 17:26, James Harris wrote:
> On 27/12/2021 15:49, Bart wrote:
>> printit("This is %s; OK?", R(v,"abc"))
>>
>> Then no special features are needed. Except that if R returns a
>> string, then you need some means of disposing of that string after
>> printing, but there are several ways of dealing with that.
>
> Dealing with memory is indeed a problem with that approach. The %s could
> be passed a string which needs to be freed or one which must not be
> freed. One option is
>
> %s - a string which must not be freed
> %M - a string which must be freed
This has similar problems to hardcoding a type. Take this:
    printit("%s", F())
F returns a string, but is it one that needs freeing or not? That
depends on what happens inside F. Whatever you choose, if the
implementation of F later changes, do 100 formats have to change too?
This is a more general problem of memory-managing strings. It's more
useful to be able to solve it for the language, then it will work for
Print too.
(Personally, I don't get involved with this at all, not in low level code.
Functions that return strings generally return a pointer to a local
static string that needs to be consumed ASAP. Or sometimes there is a
circular list of them to allow several such calls per Print. It's not
very sophisticated, but that's why the language is low level.)
If you want a more rigorous approach, perhaps try this:
    printit("%H", R(v, "abc"))
H means handler. R() is not the handler itself, but returns a descriptor
(eg. X) that contains a reference to a handler function, and references
to those two captured values (a little like a lambda or closure, I think).
The Print handler then calls X() with a parameter to request the value.
And can call it again with another parameter to free it. Or perhaps it
can be used to iterate over the characters, or sets of strings.
The latter might be more suitable when you have a 1-billion element
array to print (eg. to a file), and you don't want or need to generate
one giant string in one go.
But this starts to get into generators and iterators. Perhaps it is
too advanced an approach if your language is anything like mine.
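
To make the %H idea concrete, here is a rough sketch in Python rather than in either of our languages. Everything in it is made up for illustration: the Descriptor class, the GET/FREE request codes, the "<v:spec>" output of the handler, and the fact that printit returns the string instead of writing it. It only shows the shape: R() captures its arguments and returns a descriptor, and the print routine calls the descriptor once for the text and once more to release it.

```python
# Hypothetical sketch of the "%H" idea; no name here comes from a real library.
GET, FREE = "get", "free"   # request codes the print routine passes in

class Descriptor:
    """Bundles a handler function with the values captured for it."""
    def __init__(self, handler, *captured):
        self.handler = handler
        self.captured = captured

    def __call__(self, request):
        return self.handler(request, *self.captured)

freed = []   # records FREE requests so the behaviour can be observed

def r_handler(request, v, spec):
    # Handler behind R: produce the text on GET, release it on FREE.
    if request == GET:
        return f"<{v}:{spec}>"      # made-up rendering of v under spec
    if request == FREE:
        freed.append((v, spec))
        return None
    raise ValueError(request)

def R(v, spec):
    # R formats nothing itself; it only returns a descriptor.
    return Descriptor(r_handler, v, spec)

def printit(fmt, *args):
    """Expand %H (descriptor) and %s (plain string); return the result."""
    out, it, i = [], iter(args), 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt):
            code = fmt[i + 1]
            if code == "H":
                x = next(it)
                out.append(x(GET))   # ask the descriptor for its text
                x(FREE)              # then say the text is no longer needed
            elif code == "s":
                out.append(str(next(it)))
            else:
                out.append(fmt[i:i + 2])   # unknown code: copy it through
            i += 2
        else:
            out.append(fmt[i])
            i += 1
    return "".join(out)
```

With this, printit("This is %H; OK?", R(42, "abc")) builds the text and leaves one entry in freed. A third request code could hand back the text in pieces for the billion-element case, but that is not shown here.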
>>
>> > %i; a plain, normal, signed integer
>> > %iu; a plain, normal, unsigned integer
>> > %iu02x; a 2-digit zero-padded unsigned hex integer
>> > %Kabc; a type K unknown to the print function
>>
>> This is too C-like. C-style formats have all sorts of problems
>> associated with hard-coding a type-code into a format string:
>>
>> * What is the code for arbitrary expression X?
>
> It would have to be something to match the type of X.
>
>> * What will it be when X changes, or the type of the terms change?
>
> The format string would need to be changed to reflect the type change.
>
>> * What is it for clock_t, or some other semi-opaque type?
>
> Perhaps %sdhh:mm:ss.fff; where d indicates datetime.
The problem with doing it in C is that clock_t could be u32, u64, i32,
i64 or even a float type; what number format to use? It's not a string
anyway; you can turn it into one, but then that's my DOW example.
>> * What is it for uint64_t? (Apparently, it is PRId64 - a macro that
>> expands to a string)
Again this is for C; the problem being that a format string should not
need to include type information:
* The compiler knows the type
* You may not know the type (eg. clock_t)
* You may not know the format needed (eg. uint64_t)
* You don't want to have to maintain 1000 format strings as
expressions and types of variables change
> That said, it would make sense for the elements of the format string to
> appear in some sort of logical order - possibly the order in which they
> would be needed by the renderer.
But then somebody has to remember them! I ensure the order doesn't matter.
> Maybe unfairly I have an antipathy to copying other languages but maybe
> in this case it would be useful. Are there any you would recommend?
I have the same approach to other languages. Generally I find their
print schemes over-elaborate, so tend to do my own thing. Yet they also
have to solve the same problems.
>> (My approach in dynamic code is that there is an internal function
>> tostr(), fully overloaded for different types, with optional format
>> data, that is applied to Print items. So that:
>>
>> print a, b, c:"h" # last bit means hex
>>
>> is the same as:
>>
>> print tostr(a), tostr(b), tostr(c, "h")
>
> Maybe that's better: the ability to specify custom formatting on any
> argument. I presume that's not just available for printing, e.g. you
> could write
>
> string s := c:"h"
>
> and that where you have "h" you could have an arbitrarily complex format
> specification.
Well, ":" is specifically used in print-item lists (elsewhere it creates
key:value pairs). I would write your example in dynamic code as one of:
    s := sprint(c:"h")     # sprint is special, like print
    s := tostr(c,"h")      # tostr is a function-like operator
(sprint can turn a list of items into one string; tostr does one at a
time, although that one item can be arbitrarily complex.)
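
For anyone who wants to play with the tostr/sprint split outside my language, a minimal Python sketch follows. It is an imitation, not how my implementation works: a (value, fmt) tuple stands in for the value:fmt syntax, "h" is assumed to give lower-case hex, and items are assumed to be joined with a single space.

```python
from functools import singledispatch

@singledispatch
def tostr(value, fmt=""):
    # Fallback for any type without its own overload.
    return str(value)

@tostr.register
def _(value: int, fmt=""):
    # Assumption: "h" means lower-case hex, "H" upper-case, otherwise decimal.
    if "h" in fmt:
        return format(value, "x")
    if "H" in fmt:
        return format(value, "X")
    return str(value)

def sprint(*items):
    """Turn a list of print items into one string.

    A (value, fmt) tuple stands in for the value:fmt print-item syntax.
    """
    parts = []
    for item in items:
        if isinstance(item, tuple):
            value, fmt = item
            parts.append(tostr(value, fmt))
        else:
            parts.append(tostr(item))
    return " ".join(parts)   # assumption: items are separated by one space
```

So sprint(a, b, (c, "h")) plays the role of sprint(a, b, c:"h"), and it is just tostr applied to each item in turn.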
In my cruder static code, it might be:
    [100]char str
    print @str, c:"h"
> There looks to be a potential issue, though. In C one can build up the
> control string at run time. Could you do that with such as
>
> string fmt := format_string(....)
> s := c:(fmt)
>
> ?
Sure, what comes after ":" is just any string expression:
    ichar fmt = (option=1 "Hs_" | "s,")
    print 123456:fmt         # displays 1_E240 or 123,456
(Separator grouping is 3 digits decimal; 4 digits hex/binary.)
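
The two format strings above can be mimicked with a few lines of Python. This is only a sketch of the two codes used in that example (H for hex, s followed by the separator character); the function names are invented, and the real format language has more codes than this handles.

```python
def group_digits(digits, sep, size):
    """Insert sep between groups of `size` digits, counting from the right."""
    groups = []
    while digits:
        groups.append(digits[-size:])
        digits = digits[:-size]
    return sep.join(reversed(groups))

def format_int(n, fmt):
    """Handle a tiny subset of format codes: H = hex, s<c> = separator c."""
    hexmode = "H" in fmt
    sep = ""
    pos = fmt.find("s")
    if pos != -1 and pos + 1 < len(fmt):
        sep = fmt[pos + 1]          # the character after 's' is the separator
    digits = format(abs(n), "X" if hexmode else "d")
    if sep:
        # Groups of 4 digits for hex, 3 for decimal, as described above.
        digits = group_digits(digits, sep, 4 if hexmode else 3)
    return ("-" if n < 0 else "") + digits
```

format_int(123456, "Hs_") gives 1_E240 and format_int(123456, "s,") gives 123,456. Writing the codes the other way round, "s_H", gives the same hex result in this sketch.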
I've exercised my print formatting recently and found some weak areas,
to do with tabulation. Getting things lined up in columns is tricky,
especially with a header.
I do have formatted print which looks like this:
    fprint "#(#, #) = #", fnname, a, b, result
If I want field widths, they are written as:
    fprint "#: # #", a:w1, b:w2, c:w3
where w1/w2/w3 are "12" etc. Here, the first problem is a disconnect
between each #, and the corresponding print item. This is why some
languages bring them inside.
But the main thing here is that I don't get a sense of what it looks
like until I run the program. Something I've seen in the past would look
a bit like:
    fprint "###: ####### #############", a, b, c
The widths are the number of # characters. That same string could be
used for headings:
    const format = "###: ####### #############"
    fprint format, "No.", "Barcode", "Description"
    ....
    fprint format, i, item[i].code, item[i].descr
I haven't implemented this, it's just an idea. This is an actual example of
the kind of output I'm talking about, but done the hard way by trial and
error:
    Type       Seg   Offset    Symbol/Target+Offset
-------------------------------------------------------
 1: imprel32   code  00000024  MessageBoxA
 2: locabs64   code  00000015  idata 02E570B0
 3: locabs64   code  0000000B  idata 02E570B6
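
The "widths are the number of # characters" idea is easy to try out. Below is a minimal Python sketch; it is not an implementation from my compiler. The name fprint_cols is invented, every item is assumed to be left-justified, items longer than their field are assumed to overflow rather than be truncated, and the sample row is made-up data.

```python
import re

def fprint_cols(template, *items):
    """Fill each run of '#' with the next item, padded to the run's length."""
    it = iter(items)

    def fill(match):
        width = len(match.group(0))        # field width = number of '#'
        return str(next(it)).ljust(width)  # assumption: always left-justify

    # Trailing padding from the last field is dropped.
    return re.sub(r"#+", fill, template).rstrip()

fmt = "###: ####### #############"
header = fprint_cols(fmt, "No.", "Barcode", "Description")
row = fprint_cols(fmt, 1, "A100", "Widget")   # made-up sample data
```

The same fmt string drives both the heading and the rows, so the columns line up without trial and error. Supplying fewer items than there are fields raises StopIteration in this sketch; a real version would want a proper error.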