Type qualifiers, declaration aliases and namespaces

James Harris

Aug 20, 2021, 11:04:58 AM

In another thread Bart posed some great questions to which I only have a
partial answer. As the answers end up getting into a separate topic or
two I'll start this new thread.

On 20/08/2021 14:19, Bart wrote:
> On 20/08/2021 12:55, James Harris wrote:
>> On 20/08/2021 11:47, Bart wrote:
>>> On 20/08/2021 08:29, James Harris wrote:


...


>> Naming integers iN was tempting but I felt that it either took away
>> too much of the namespace or, as illustrated, would be irregular and
>> fiddly.
>
>
>
> I don't use i1 i2 i4, only i8/i16/i32/i64/i128.

You do have similar, though, don't you? In an earlier reply you said "In
my case however I also have bittypes which I call u1, u2 and u4 (which
then continue as u8, u16 etc).".

If you have u2 etc then

u1 is a reserved word
u2 is a reserved word
u3 is not reserved
u4 is a reserved word
u5 is not reserved
etc

...

> Many languages now which allow size-specific types will have them as
> one of:
>
> i32
> int32
> Int32
> int32_t etc
>
> You could say that all these are irregular since int31/int33 are legal
> user identifiers, but int32 isn't (well apart from Rust).

True. AIUI in C int31_t would not be a type name but it would be using a
form of name which is reserved - and C has only conventions for names,
not enforcement.


>
> This applies to 'int' too:
>
> hnt int jnt ...
> ins int inu ...

I am not sure I understand what that is pointing out.


>
> And actually to most keywords unless the language has a
> peculiar enough syntax to allow keywords as identifiers (I
> think PL/I allowed if if=if ...)

I heard something like that before.

if if = if then then = then else else = else
if then = else then if = then else then = if

:-o


>>>> IOW the iN and uN forms are tempting but they seem to be rather
>>>> limiting.
>>>
>>> Why, what are you planning?
>>
>> If possible (and I haven't implemented it yet) I'd rather have the
>> number of bits as a qualifier which goes after the type name as
>> follows
>>
>> int 8 a
>> int 16 b
>> int 32 c
>
> This is more flexible

Yes.


> (I'd prefer some punctuation or other way of
> connecting the number with the type)

I'm surprised to hear that you would want additional punctuation. I tend
to put a lot of effort into trying to make it unnecessary! For example,
instead of C's

if (e)

I have just

if e

Isn't code easier to read without unnecessary punctuation?


> but as I said, you then have to deal with extra possibilities:
>
> * Could the number be an expression?

Yes, as long as it was resolvable at compile time. If the width were to
be specified by an expression, E, then the syntax would be

int (E) d

>
> * Could it be the name of a macro or constant that expands to a
> number?

It could be the name of a constant, yes. I don't have any plans for
macros. The constant name would have to be enclosed in parens as in the
expression form, above. As for the full syntax a constant could be
defined as

ro uint Bits = 32

where ro means "read only". Then later that constant could be used as in

int (Bits) counter


>
> * If the number is a name, then int a ... becomes ambiguous; are you
> defining an int called 'a', or is 'a' a name that expands to '32', and
> the actual variable name follows?

ATM,

int a would declare an integer a of default width
int (w) b would declare an integer b of width w
int 16 c would declare an integer c of width 16

Depending on how other decisions pan out I might end up changing the
parens to square brackets for consistency. Then the middle one of those
declarations would become

int [w] b


>
> * What to do about invalid sizes?

In the expressions above, the size would be determined at compile time
so any invalid size could be rejected.
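
As a rough illustration of rejecting bad widths at compile time, here is a minimal C++ sketch. It is not the proposed language: the names Int, valid_width and BitsConst are made up for the example, and the upper limit of 64 bits is an assumption added only so that every width can be mapped onto a standard type.

```cpp
#include <cstdint>
#include <type_traits>

// Assumed rule: a width must be at least 1. The cap of 64 is an extra
// assumption of this sketch, not something stated in the thread.
constexpr bool valid_width(int bits) { return bits >= 1 && bits <= 64; }

template <int Bits>
struct Int {
    static_assert(valid_width(Bits), "invalid integer width");

    // Smallest standard signed type that can hold Bits bits.
    using storage =
        std::conditional_t<(Bits <= 8), std::int8_t,
        std::conditional_t<(Bits <= 16), std::int16_t,
        std::conditional_t<(Bits <= 32), std::int32_t, std::int64_t>>>;

    static constexpr int width = Bits;
    storage value{};
};

constexpr int BitsConst = 32;         // plays the role of "ro uint Bits = 32"
using Counter = Int<BitsConst>;       // like "int (Bits) counter"
using Half = Int<BitsConst / 2>;      // width given by a constant expression

// Int<0> bad;   // would be rejected at compile time by the static_assert
```

Because the width is a template argument it has to be a constant expression, so anything not resolvable at compile time is rejected along with the out-of-range values.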


>
> * Could such a number appear also after a user-defined type; for
> example if an alias 'T' for 'int' was created, would 'T 8 a'
> be allowed?

Good question. I hadn't thought of doing that but it might be possible.
To explain, I have been thinking to declare type names with a syntax like

typedef T1 = int 8

then

T1 g

would declare g as of type int 8.

However, as a separate matter I am also toying with the idea of allowing
short names for other namespaces such as

namespace S = ns.dns.invalid.scl.personnel

Then

S.X

would really refer to

ns.dns.invalid.scl.personnel.X

Why is that relevant? Because a typename such as int is also a name.
Therefore I would be able to define

namespace T2 = int

and subsequently do as you originally suggested by writing

T2 8 h

If T2 had been declared to be int then that would do what you asked
about, above, and declare h to be of type "int 8".

Whether a programmer would want to do that or not is another matter!



Either way, what I've not bottomed out, yet, is whether there's a need
for both typedefs and namespace definitions. They are very similar:

typedef T1 = int 8
namespace T2 = int

They may be better replaced with

alias T1 = int 8
alias T2 = int

Would that be a good idea? I don't know. It would be flexible, for sure,
but possibly confusing. A programmer would not be forced to use it but
if he did it could make subsequent code harder to parse - both for the
compiler and, more importantly, for a programmer because it would be
harder to recognise type names.

So at the moment this stuff is still on the drawing board.
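
For comparison, C++ keeps the two ideas apart: a namespace alias may only name a namespace, and a type alias may only name a complete type. A minimal sketch (C++17), in which the member X, its value and the variable g are invented for the example:

```cpp
#include <cstdint>

// Hypothetical deeply nested namespace; X and its value are placeholders.
namespace ns::dns::invalid::scl::personnel {
    constexpr int X = 42;
}

// Namespace alias: S::X now names ns::dns::invalid::scl::personnel::X.
namespace S = ns::dns::invalid::scl::personnel;

// Type alias: roughly the equivalent of "typedef T1 = int 8".
using T1 = std::int8_t;

// Not allowed in C++: "namespace T2 = int;" (int is not a namespace),
// and a type alias cannot be given a width afterwards as in "T2 8 h".
constexpr T1 g = 8;
```

So in C++ terms the "T2 8 h" form has no counterpart, which is one data point in favour of keeping the two kinds of definition separate.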

...

>> uint 64
>>
>> Having said that, what do you make of uns when compared with uint?
>
> Here I agree, uint is better than uns, nat, and nneg! Uint or
> variations is also commonly used so that wouldn't be a bad choice.

OK. Given what you said before that's unexpected! But welcome. :-)


--
James Harris

David Brown

Aug 20, 2021, 2:50:07 PM

On 20/08/2021 17:04, James Harris wrote:

>
>>>>> IOW the iN and uN forms are tempting but they seem to be rather
>>>>> limiting.
>>>>
>>>> Why, what are you planning?
>>>
>>> If possible (and I haven't implemented it yet) I'd rather have the
>>> number of bits as a qualifier which goes after the type name as
>>> follows
>>>
>>>    int 8 a
>>>    int 16 b
>>>    int 32 c
>>
>> This is more flexible
>
> Yes.
>
>
>> (I'd prefer some punctuation or other way of
>> connecting the number with the type)
>
> I'm surprised to hear that you would want additional punctuation. I tend
> to put a lot of effort into trying to make it unnecessary! For example,
> instead of C's
>
>   if (e)
>
> I have just
>
>   if e
>
> Isn't code easier to read without unnecessary punctuation?
>

No. /Excessive/ punctuation, or complicated symbols make code hard to
read, especially when rarely used. (So having something like ">>?" for
a "maximum" operator is a terrible idea.)

If you require your "if" statements to use brackets, then the
parentheses are not needed: "if e { ... }". If you don't require
brackets, then put the parentheses or other syntax (like a "then") to
make it clear. (My preference is to require the {} brackets.)


>
>> but as I said, you then have to deal with extra possibilities:
>>
>> * Could the number be an expression?
>
> Yes, as long as it was resolvable at compile time. If the width were to
> be specified by an expression, E, then the syntax would be
>
>   int (E) d
>

I'd recommend looking at C++ templates. You might not want to follow
all the details of the syntax, and you want to look at the newer and
better techniques rather than the old ones. But pick a way to give
compile-time parameters to types, and then use that - don't faff around
with special cases and limited options. Pick one good method, then you
could have something like this :

builtin::int<32> x;
using int32 = builtin::int<32>;
int32 y;

That is (IMHO) much better than your version because it will be
unambiguous, flexible, and follows a syntax that you can use for all
sorts of features.


If you want more fun, you could make types first-class objects of your
language. Then you could have a function "int" that takes a single
number as a parameter and returns a type. Then you'd have :

int(32) x;
type int32 = int(32);
int32 y;

Bart

Aug 20, 2021, 3:33:38 PM

On 20/08/2021 16:04, James Harris wrote:
> In another thread Bart posed some great questions to which I only have a
> partial answer. As the answers end up getting into a separate topic or
> two I'll start this new thread.
>
> On 20/08/2021 14:19, Bart wrote:

> >
> > This applies to 'int' too:
> >
> >    hnt int jnt ...
> >    ins int inu ...
>
> I am not sure I understand what that is pointing out.
>

Sometimes an innocuous-looking reserved word may be part of a pattern
you want to use for variables. For example, I couldn't understand what
was wrong here:

ref int pi, pj, pk

I'd forgotten that 'pi' was a reserved word (you know, the constant
3.1415926...)

> > (I'd prefer some punctuation or other way of
> > connecting the number with the type)
>
> I'm surprised to hear that you would want additional punctuation.

You need /some/ punctuation, ie. symbols, otherwise source code will
just be a monotonous sequence of names and literals.

I quite like writing f(x,y,z) for example, but some languages will drop
the comma so that you have f(x y z), where you start having to think
about where an argument ends and the next begins, or even:

f x y z

(eg. Haskell). Without boundaries, this can get ambiguous:

f x g y z

Is that f(x,g(y),z) or f(x,g(y,z)) or f(x,g,y,z)?

In this example, I felt it needed something to tie the '16' to the 'int'.

My dynamic language defines some struct members like this:

string*13 barcode
string*36 description

That is, fixed-width string fields (0 to max 13/36 characters). Here, I
don't actually need that *, since 13 or 36 can't be the member name. But
it would look weirdly naked without:

string 13 barcode
string 36 description

That's more suited to a data description format with entries lined up in
3 columns.

> I tend
> to put a lot of effort into trying to make it unnecessary! For example,
> instead of C's
>
>   if (e)
>
> I have just
>
>   if e
>
> Isn't code easier to read without unnecessary punctuation?

Yes, but this reads like it does in English (or would do if 'e' had a
more meaningful name). But these don't:

int 32 ...
string 32 ...

You'd probably read them out loud with something added between keyword
and number so that it flows better. That's what the punctuation provides.

>
> > but as I said, you then have to deal with extra possibilities:
> >
> > * Could the number be an expression?
>
> Yes, as long as it was resolvable at compile time. If the width were to
> be specified by an expression, E, then the syntax would be
>
>   int (E) d

OK. This can serve as the punctuation I mentioned.


> ATM,
>
>   int a           would declare an integer a of default width
>   int (w) b       would declare an integer b of width w
>   int 16 c        would declare an integer c of width 16
>
> Depending on how other decisions pan out I might end up changing the
> parens to square brackets for consistency. Then the middle one of those
> declarations would become
>
>   int [w] b

For consistency you'd have int [16] too. Unless you're going to have a
lot of them in any program, then you might end up with int16! (That is,
just drop the space.)

>
> >
> > * What to do about invalid sizes?
>
> In the expressions above, the size would be determined at compile time
> so any invalid size could be rejected.
>

But which /are/ the invalid sizes; would int 24 be OK?

>
> >
> > * Could such a number appear also after a user-defined type; for
> > example if an alias 'T' for 'int' was created, would 'T 8 a'
> > be allowed?
>
> Good question. I hadn't thought of doing that but it might be possible.
> To explain, I have been thinking to declare type names with a syntax like
>
>   typedef T1 = int 8
>
> then
>
>   T1 g
>
> would declare g as of type int 8.
>
> However, as a separate matter I am also toying with the idea of allowing
> short names for other namespaces such as
>
>   namespace S = ns.dns.invalid.scl.personnel

> Then
>
>   S.X
>
> would really refer to
>
>   ns.dns.invalid.scl.personnel.X

(I think I can do that at the minute with macros. My macros only work
when the bodies are well-formed sub-expressions, but your example could
be written like this:

macro S = ns.dns.invalid.scl.personnel

But...

)

>
> Why is that relevant? Because a typename such as int is also a name.
> Therefore I would be able to define
>
>   namespace T2 = int
>
> and subsequently do as you originally suggested by writing
>
>   T2 8 h

( ... my macro wouldn't work here because this is not an expression. It
needs a more general macro system.)

>
> If T2 had been declared to be int then that would do what you asked
> about, above, and declare h to be of type "int 8".
>
> Whether a programmer would want to do that or not is another matter!
>
>
>
> Either way, what I've not bottomed out, yet, is whether there's a need
> for both typedefs and namespace definitions. They are very similar:
>
>   typedef T1 = int 8
>   namespace T2 = int

The right-hand-side of a namespace definition is presumably a series of
dotted names. The new name doesn't mean anything by itself until it is
expanded at each instance site.

The right-hand-side of a type definition would be a type specifier. The
new name is a Type, and can be used anywhere a type is expected.

The only point of similarity is when both type and namespace define an
alias to a simple type denoted, at the right end, by a single name
token. But typedef can also construct an arbitrary new type.

anti...@math.uni.wroc.pl

Aug 20, 2021, 4:13:37 PM

James Harris <james.h...@gmail.com> wrote:
> In another thread Bart posed some great questions to which I only have a
> partial answer. As the answers end up getting into a separate topic or
> two I'll start this new thread.
>
> On 20/08/2021 14:19, Bart wrote:
> > On 20/08/2021 12:55, James Harris wrote:
> >> On 20/08/2021 11:47, Bart wrote:
> >>> On 20/08/2021 08:29, James Harris wrote:
>
>
> ...
>
>
> >> Naming integers iN was tempting but I felt that it either took away
> >> too much of the namespace or, as illustrated, would be irregular and
> >> fiddly.
> >
> >
> >
> > I don't use i1 i2 i4, only i8/i16/i32/i64/i128.
>
> You do have similar, though, don't you? In an earlier reply you said "In
> my case however I also have bittypes which I call u1, u2 and u4 (which
> then continue as u8, u16 etc).".
>
> If you have u2 etc then
>
> u1 is a reserved word
> u2 is a reserved word
> u3 is not reserved
> u4 is a reserved word
> u5 is not reserved
> etc

Most languages distinguish between reserved words and predefined
identifiers. For example, in Pascal 'begin' is a reserved word,
while 'integer' is merely a predefined identifier. If you have
no use for the predefined 'integer' you are allowed to redefine it
and use the new meaning.

> >
> > And actually to most keywords unless the language has a
> > peculiar enough syntax to allow keywords as identifiers (I
> > think PL/I allowed if if=if ...)
>
> I heard something like that before.
>
> if if = if then then = then else else = else
> if then = else then if = then else then = if
>
> :-o

PL/I took things to an extreme: formally no identifier was reserved
and you could put declarations after use. Most languages
take an intermediate position: there is a small number of reserved
words and you need to declare variables before use. So
re-using predefined identifiers is easy to implement and safe.

In fact, in the case of PL/I one view is that _all_ non-alphanumeric
"words" are reserved, that is, things like commas, parentheses,
semicolons, etc. By also reserving some alphanumeric words
one gets a nicer and simpler syntax. But there is no need to
reserve type names.

--
Waldek Hebisch

David Brown

Aug 21, 2021, 5:32:21 AM

On 20/08/2021 21:33, Bart wrote:
> On 20/08/2021 16:04, James Harris wrote:
>> In another thread Bart posed some great questions to which I only have
>> a partial answer. As the answers end up getting into a separate topic
>> or two I'll start this new thread.
>>
>> On 20/08/2021 14:19, Bart wrote:
>
>>  >
>>  > This applies to 'int' too:
>>  >
>>  >    hnt int jnt ...
>>  >    ins int inu ...
>>
>> I am not sure I understand what that is pointing out.
>>
>
> Sometimes an innocuous-looking reserved word may be part of a pattern
> you want to use for variables. For example, I couldn't understand what
> was wrong here:
>
>   ref int pi, pj, pk
>
> I'd forgotten that 'pi' was a reserved word (you know, the constant
> 3.1415926...)
>

And that is one of the reasons why a well-designed programming language
keeps the reserved words to a minimum, and one of the reasons why you
want namespaces (or modules, or packages, or whatever you want to call
them). 99.99% of programs don't need pi, so it should not be forced
upon them unless they choose to use it.

(If it makes you feel any better, the lack of namespaces and modules in
C is one of its major drawbacks for large-scale programming.)

>>  > (I'd prefer some punctuation or other way of
>>  > connecting the number with the type)
>>
>> I'm surprised to hear that you would want additional punctuation.
>
> You need /some/ punctuation, ie. symbols, otherwise source code will
> just be a monotonous sequence of names and literals.

Agreed.

>
> I quite like writing f(x,y,z)

You really should learn to use the space key. "f(x, y, z)" is vastly
easier to read.

> for example, but some languages will drop
> the comma so that you have f(x y z), where you start having to think
> about where an argument ends and the next begins, or even:
>
>   f x y z
>

In Haskell, which is a functional programming language, "f" is not a
function that takes three parameters. It is a function that takes one
parameter, and returns a function that takes one parameter and returns a
function that takes one parameter and returns a number (if that's the
final type, which is not visible in this case).

So it means (((f x) y) z).
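
For readers more at home in imperative languages, the same idea can be imitated with nested lambdas. This is only a minimal C++17 sketch: the body (adding three ints) is made up, since the thread never says what "f" computes.

```cpp
// Curried three-argument function: f takes x and returns a function that
// takes y, which in turn returns a function that takes z.
constexpr auto f = [](int x) {
    return [x](int y) {
        return [x, y](int z) { return x + y + z; };
    };
};

// Haskell's "f x y z" then corresponds to f(x)(y)(z).
constexpr int total = f(1)(2)(3);

// Stopping early yields a function still waiting for its last argument.
constexpr auto add3 = f(1)(2);
```

Each pair of parentheses applies exactly one argument, which is what the juxtaposition does in Haskell without any punctuation.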

> (eg. Haskell). Without boundaries, this can get ambiguous:
>
>   f x g y z
>
> Is that f(x,g(y),z) or f(x,g(y,x)) or f(x,y,g,z)?

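((((f x) g) y) z)

Function application associates to the left, so "g" is simply passed to
"f" as its second argument. To have "g" applied to "y" and "z" first you
would write f x (g y z).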

Functional programming works in a rather different way from imperative
programming, and trying to interpret it as imperative programming will
only cause you confusion. You need to learn the paradigm before you can
understand the code. The same applies to other kinds of programming,
like Forth's RPN notation (which also has much less need of punctuation).

I agree that punctuation is useful in imperative programming, but that
doesn't mean it is needed for every kind of programming.

James Harris

Aug 21, 2021, 5:32:35 AM

On 20/08/2021 19:50, David Brown wrote:
> On 20/08/2021 17:04, James Harris wrote:

...

>>>>     int 8 a
>>>>     int 16 b
>>>>     int 32 c

...

>>> (I'd prefer some punctuation or other way of
>>> connecting the number with the type)
>>
>> I'm surprised to hear that you would want additional punctuation. I tend
>> to put a lot of effort into trying to make it unnecessary! For example,
>> instead of C's
>>
>>   if (e)
>>
>> I have just
>>
>>   if e
>>
>> Isn't code easier to read without unnecessary punctuation?
>>
>
> No. /Excessive/ punctuation, or complicated symbols make code hard to
> read, especially when rarely used. (So having something like ">>?" for
> a "maximum" operator is a terrible idea.)

Hm, I speak about "unnecessary" punctuation. You disagree and say the
problem is "excessive" punctuation. What's the difference between
unnecessary and excessive??

...

>>> * Could the number be an expression?
>>
>> Yes, as long as it was resolvable at compile time. If the width were to
>> be specified by an expression, E, then the syntax would be
>>
>>   int (E) d
>>
>
> I'd recommend looking at C++ templates. You might not want to follow
> all the details of the syntax, and you want to look at the newer and
> better techniques rather than the old ones. But pick a way to give
> compile-time parameters to types, and then use that - don't faff around
> with special cases and limited options. Pick one good method, then you
> could have something like this :
>
> builtin::int<32> x;
> using int32 = builtin::int<32>;
> int32 y;
>
> That is (IMHO) much better than your version because it will be
> unambiguous, flexible, and follows a syntax that you can use for all
> sorts of features.

My version of that would be

typedef i32 = int 32

int 32 x
i32 y

Your C++ version doesn't seem to be any more precise or flexible. And my
version is shorter, clearer and (at least once you are used to the
syntax) easier to read. In fact, mine is so much more readable that it
shows how weird it looks to use both int 32 and i32 - something that IMO
the C++ version obscures by lots of unnecessary (your term) waffle text!
So I am not sure what your criticism is.

I do agree, however, that I need to look at templates. Are C++
templates, as set out in

https://www.cplusplus.com/doc/oldtutorial/templates/

essentially just about parametrising functions and classes where the
parameters are types and other classes?

Or are they more flexible?

I ask that because I wonder if something based on macros (where the
parameters could be of any form, not just types and classes) could be as
useful but more adaptable to different situations. After all, the
creation of real functions from templated functions is rather like the
instantiation of macros, isn't it?
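
On the narrow question: C++ template parameters are not limited to types. They can also be compile-time values (and other templates). A minimal sketch, with FixedArray, scale and SCALE invented purely for illustration:

```cpp
#include <cstddef>

// A type parameter T and a non-type (value) parameter N in one template.
template <typename T, std::size_t N>
struct FixedArray {
    T items[N];
    constexpr std::size_t size() const { return N; }
};

// A function template with only a value parameter. The compiler generates
// one real function per distinct value used, somewhat like a macro being
// expanded per use, but with type checking and normal scoping.
template <int Factor>
constexpr int scale(int x) { return x * Factor; }

// For comparison, a macro version: plain text substitution, no types.
#define SCALE(factor, x) ((x) * (factor))
```

The macro accepts anything that happens to expand to valid code, whereas the template checks its arguments before instantiation, which is roughly the trade-off between adaptability and safety being asked about.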

>
>
> If you want more fun, you could make types first-class objects of your
> language. Then you could have a function "int" that takes a single
> number as a parameter and returns a type. Then you'd have :
>
> int(32) x;
> type int32 = int(32);
> int32 y;
>

Are you talking there about a dynamic language where int is called at
run time?


--
James Harris

David Brown

Aug 21, 2021, 5:41:01 AM

On 20/08/2021 22:13, anti...@math.uni.wroc.pl wrote:

>
> PL/I took things to an extreme: formally no identifier was reserved
> and you could put declarations after use. Most languages
> take an intermediate position: there is a small number of reserved
> words and you need to declare variables before use. So
> re-using predefined identifiers is easy to implement and safe.
>

Forth is the most flexible language I know of in this sense:


$ gforth
Gforth 0.7.3, Copyright (C) 1995-2008 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
2 2 + . 4 ok
: 2 3 ; ok
2 2 + . 6 ok


The result of "2 2 +" is 4, then I redefine "2" to mean "3", and now the
result of "2 2 +" is 6.

And in Metafont (or Metapost), an identifier like "u8" would mean the
eighth entry in the array "u". In TeX, "u8" could be the macro "u" with
the parameter 8, as digits cannot be part of identifiers. (Of course,
TeX lets you redefine the character class of the digits to make them
letters...)

James Harris

Aug 21, 2021, 10:58:09 AM

On 20/08/2021 20:33, Bart wrote:
> On 20/08/2021 16:04, James Harris wrote:
>> Bart wrote:


...


>>  > (I'd prefer some punctuation or other way of
>>  > connecting the number with the type)
>>
>> I'm surprised to hear that you would want additional punctuation.
>
> You need /some/ punctuation, ie. symbols, otherwise source code will
> just be a monotonous sequence of names and literals.

At least you wouldn't need your hated shift key.

;-)

>
> I quite like writing f(x,y,z) for example, but some languages will drop
> the comma so that you have f(x y z), where you start having to think
> about where an argument ends and the next begins, or even:
>
>   f x y z
>
> (eg. Haskell). Without boundaries, this can get ambiguous:
>
>   f x g y z
>
> Is that f(x,g(y),z) or f(x,g(y,x)) or f(x,y,g,z)?

I remember a Basic where one could type

r = sin x + cos y

It was certainly easily readable.


>
> In this example, I felt it needed something to tie the '16' to the 'int'.

OK.

...

>> I have just
>>
>>    if e
>>
>> Isn't code easier to read without unnecessary punctuation?
>
> Yes, but this reads like it does in English (or would do if 'e' had a
> more meaningful name). But these don't:
>
>    int 32 ...
>    string 32 ...
>
> You'd probably read them out loud with something added between keyword
> and number so that it flows better. That's what the punctuation provides.

YM varies, clearly. When you read your own i32 I expect you read it as

"i thirty-two"

Isn't

"int thirty-two"

sufficiently similar to read aloud?

...

>>    int (E) d
>
> OK. This can serve as the punctuation I mentioned.

...

>>    int [w] b
>
> For consistency you'd have int [16] too.

Well, int [16] would be allowed as what's in the brackets would be a
compile-time expression. But the brackets would be unnecessary.

Could it be a familiarity thing? New C programmers sometimes write

return (x);

because it looks right to them. But after a while they get used to

return x;

Could it be that these things just take time to get used to?

Besides, you likely remember that my declarations are meant to include
/ranges/ of widths (where the width has to be within a certain range).
For example,

int 16..32 b

would mean that b had to be between 16 and 32 bits (inclusive) wide. If
either of those bounds were an expression then it (that bound's
calculation) would need to be bracketed.

>
> Unless you're going to have a
> lot of them in any program, then you might end up with int16! (That is,
> just drop the space.)

:-)

I could do

typedef i16 = int 16

but I am not sure that would be much of a saving. IMO a typedef would be
better reserved for logical type names rather than be used to make
shortcuts for types which are already short.

>
>>
>>  >
>>  > * What to do about invalid sizes?
>>
>> In the expressions above, the size would be determined at compile time
>> so any invalid size could be rejected.
>>
>
> But which /are/ the invalid sizes; would int 24 be OK?

Yes, int 24 would be OK. Negative numbers would be invalid. Probably
zero, too. int 1 would be valid, though unusual.

...

>> Either way, what I've not bottomed out, yet, is whether there's a need
>> for both typedefs and namespace definitions. They are very similar:
>>
>>    typedef T1 = int 8
>>    namespace T2 = int
>
> The right-hand-side of a namespace definition is presumably a series of
> dotted names.

The RHS of a namespace definition would be required only to be an extant
name. It could be dotted or not. For example, if ns means "name system"
and ns.dns refers to the DNS name system then you could have

namespace com = ns.dns.com
namespace gweb = com.google.www

where the definition of gweb uses the "com" defined on the preceding
line. Yes, the above both have dots but you could go on to write

namespace webroot = gweb

making webroot an alias for the gweb name previously defined. So the RHS
would not have to have dots.
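
C++ namespace aliases behave much like this, which may serve as a reference point. A minimal C++17 sketch, where the hierarchy mirrors the names above and the member "port" is made up:

```cpp
// Hypothetical hierarchy; "port" and its value are placeholders.
namespace ns::dns::com::google::www {
    constexpr int port = 443;
}

namespace com = ns::dns::com;        // "com" may appear on both sides
namespace gweb = com::google::www;   // uses the alias defined just above
namespace webroot = gweb;            // alias of an alias, no dots needed
```

All three aliases end up naming the same namespaces as the full paths; nothing is textually expanded at the point of use.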


> The new name doesn't mean anything by itself until it is
> expanded at each instance site.

I am not sure about it being 'expanded' if you mean as one might expand
a macro. I see it more as an alias.

>
> The right-hand-side of a type definition would be a type specifier. The
> new name is a Type, and can be used anywhere a type is expected.

Yes.

>
> The only point of similarity is when both type and namespace define an
> alias to a simple type denoted, at the right end, by a single name
> token. But typedef can also construct an arbitrary new type.
>

Well, as with C, typedef really just creates a new name for an existing
type but you make a good point that a typedef has to create a type and
could not name a partial type ... so typedef and namedef (sic) should
probably be kept separate even though their forms are almost identical.


--
James Harris

Bart

Aug 21, 2021, 12:31:10 PM

On 21/08/2021 15:58, James Harris wrote:
> On 20/08/2021 20:33, Bart wrote:
>> On 20/08/2021 16:04, James Harris wrote:
> >> Bart wrote:
>
>
> ...
>
>
>>>  > (I'd prefer some punctuation or other way of
>>>  > connecting the number with the type)
>>>
>>> I'm surprised to hear that you would want additional punctuation.
>>
>> You need /some/ punctuation, ie. symbols, otherwise source code will
>> just be a monotonous sequence of names and literals.
>
> At least you wouldn't need your hated shift key.
>
> ;-)

It's not so bad in between tokens, maybe I just don't like interrupting
the typing of a single alphanumeric token.

However I do have considerable problems with typing accurately, so I
still hate the unneeded punctuation you have in C, especially with
simple prints:

printf("A=%d B=%f\n",a,b);

7 shifted symbols, versus none in my equivalent code: println =a, =b




>> I quite like writing f(x,y,z) for example, but some languages will
>> drop the comma so that you have f(x y z), where you start having to
>> think about where an argument ends and the next begins, or even:
>>
>>    f x y z
>>
>> (eg. Haskell). Without boundaries, this can get ambiguous:
>>
>>    f x g y z
>>
>> Is that f(x,g(y),z) or f(x,g(y,x)) or f(x,y,g,z)?
>
> I remember a Basic where one could type
>
>   r = sin x + cos y
>
> It was certainly easily readable.

My syntax also allows 'sin x + cos y', but only because sin and cos are
operators. However I tend to add the parentheses because I think it
looks better, with less reliance on white space. Actually I also write
max(A,B) instead of A max B for that reason.

Operators can otherwise be used with no parentheses as @A, A@B or A@
depending on unary/binary and whether prefix, infix or postfix.

So I will accept the annoyance of some punctuation when there is a
benefit: clearer code, or code that is going to persist for longer than
the 2-minute half-life of a debug print.

>> You'd probably read them out loud with something added between keyword
>> and number so that it flows better. That's what the punctuation provides.
>
> YM varies, clearly. When you read your own i32 I expect you read it as
>
>   "i thirty-two"
>
> Isn't
>
>   "int thirty-two"
>
> sufficiently similar to read aloud?

Well, if I had to transcribe how I'd imagine I'd say those out loud, it
might be as "I-32" or "int-thirty-two"; that is, with the hyphen. (But
nothing extra added as I'd thought.)

After all, we write (or at least I do), "64 bits" or "64-bit", even
though in speech the gap between each part is near identical.

The latter would be more of an adjective, but whether a type-specifier
is classed as an adjective is another question.

In the US, they have "Interstate 15", without punctuation, which is also
written compactly as "I-15", suggesting some connection is necessary
otherwise an orphaned 'I' by itself is ambiguous.

Anyway, there's no overwhelming evidence either way. To me it just feels
better if 'int' and '32' had a stronger connection than between '32' and
what follows.

> Besides, you likely remember that my declarations are meant to include
> /ranges/ of widths (where the width has to be within a certain range).
> For example,
>
>   int 16..32 b
>
> would mean that b had to be between 16 and 32 bits (inclusive) wide.

OK, to me this seems more like a range of values (as used in Pascal and
Ada) than a range of bits. Didn't you previously have a range like this
to denote values?

>> But which /are/ the invalid sizes; would int 24 be OK?
>
> Yes, int 24 would be OK.

Just reading that makes me think of all the extra work that's going to
be involved! Doing a simple assignment:

A := B

normally means two instructions on x64: load to register, store from
register, when A and B are 1, 2, 4 or 8 bytes.

When they are 3 bytes, then it would likely need 4 instructions or
possibly six if concerned about alignment, or you could get away with 3
if you can over-read the value of B (read 1 byte beyond B).

Now think about packed arrays of 24 bits.

Of course, I'm assuming the target hardware doesn't have 24-bit integers
as native types.
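
To make the extra work concrete, here is a minimal C++ sketch of a 24-bit signed value kept in exactly three bytes. Int24, to_int24 and from_int24 are hypothetical names, little-endian byte order is assumed, and how many machine instructions a compiler would emit for this is not shown and will depend on the target.

```cpp
#include <cstdint>

// Hypothetical 24-bit signed integer stored in three bytes, little-endian.
struct Int24 {
    std::uint8_t bytes[3];
};

// Store: keep the low 24 bits of v, one byte at a time.
constexpr Int24 to_int24(std::int32_t v) {
    const auto u = static_cast<std::uint32_t>(v);
    return Int24{{static_cast<std::uint8_t>(u & 0xFF),
                  static_cast<std::uint8_t>((u >> 8) & 0xFF),
                  static_cast<std::uint8_t>((u >> 16) & 0xFF)}};
}

// Load: reassemble the three bytes, then sign-extend from bit 23.
constexpr std::int32_t from_int24(Int24 x) {
    const std::uint32_t u = static_cast<std::uint32_t>(x.bytes[0])
                          | (static_cast<std::uint32_t>(x.bytes[1]) << 8)
                          | (static_cast<std::uint32_t>(x.bytes[2]) << 16);
    const std::int32_t v = static_cast<std::int32_t>(u);  // 0 .. 2^24 - 1
    return (u & 0x800000u) ? v - 0x1000000 : v;
}
```

Since the struct is three bytes with byte alignment, an array of Int24 is packed with no padding, but every element access goes through the byte-wise load or store above.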


> Negative numbers would be invalid. Probably
> zero, too. int 1 would be valid, though unusual.

Unsigned 1-bit is fine (also called Bool). Signed 1-bit would be
unusual! It would have values of -1 and 0 I think.
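
A quick check of that, assuming two's complement representation (the function names here are just for illustration):

```python
def signed_range(bits):
    """Smallest and largest values of a two's-complement integer of `bits` bits."""
    if bits < 1:
        raise ValueError("width must be at least 1 bit")
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def unsigned_range(bits):
    """Smallest and largest values of an unsigned integer of `bits` bits."""
    if bits < 1:
        raise ValueError("width must be at least 1 bit")
    return 0, (1 << bits) - 1


one_bit_signed = signed_range(1)      # (-1, 0)
one_bit_unsigned = unsigned_range(1)  # (0, 1)
```

So the single bit of a signed 1-bit integer is the sign bit, with weight -1.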

>> The right-hand-side of a namespace definition is presumably a series
>> of dotted names.
>
> The RHS of a namespace definition would be required only to be an extant
> name. It could be dotted or not. For example, if ns means "name system"
> and ns.dns refers to the DNS name system then you could have
>
>   namespace com = ns.dns.com

That looks a bit dodgy. So 'com' can appear on both sides?

>   namespace gweb = com.google.www
>
> where the definition of gweb uses the "com" defined on the preceding
> line. Yes, the above both have dots but you could go on to write
>
>   namespace webroot = gweb
>
> making webroot an alias for the gweb name previously defined. So the RHS
> would not have to have dots.
>
>
>> The new name doesn't mean anything by itself until it is expanded at
>> each instance site.
>
> I am not sure about it being 'expanded' if you mean as one might expand
> a macro. I see it more as an alias.

Well, somewhere there needs to be a way for the compiler to trace the
path represented by 'webroot'. But sure, you probably don't need to
expand it at each instance into some sequence of AST nodes that implement
".". There it would differ from an implementation based on macros.

David Brown

Aug 21, 2021, 2:11:04 PM
On 21/08/2021 11:32, James Harris wrote:
> On 20/08/2021 19:50, David Brown wrote:
>> On 20/08/2021 17:04, James Harris wrote:
>
> ...
>
>>>>>      int 8 a
>>>>>      int 16 b
>>>>>      int 32 c
>
> ...
>
>>>> (I'd prefer some punctuation or other way of
>>>> connecting the number with the type)
>>>
>>> I'm surprised to hear that you would want additional punctuation. I tend
>>> to put a lot of effort into trying to make it unnecessary! For example,
>>> instead of C's
>>>
>>>    if (e)
>>>
>>> I have just
>>>
>>>    if e
>>>
>>> Isn't code easier to read without unnecessary punctuation?
>>>
>>
>> No.  /Excessive/ punctuation, or complicated symbols make code hard to
>> read, especially when rarely used.  (So having something like ">>?" for
>> a "maximum" operator is a terrible idea.)
>
> Hm, I speak about "unnecessary" punctuation. You disagree and say the
> problem is "excessive" punctuation. What's the difference between
> unnecessary and excessive??
>

None of the punctuation in that paragraph was necessary - the meaning
would have been clear and unambiguous without any periods, apostrophes,
or quotation marks. Yet only the final double question mark was
excessive. It's a matter of degree. Too little punctuation makes the
language harder to read and write, and offers more scope for ambiguity.
Too much makes it hard to read and write, and makes it difficult to
learn. Somewhere in the middle there is a happy medium - going too far
one way (limiting punctuation to the minimum necessary) is as bad as
going too far the other way (excessive punctuation that detracts from
the flow of the code).

> ...
>
>>>> * Could the number be an expression?
>>>
>>> Yes, as long as it was resolvable at compile time. If the width were to
>>> be specified by an expression, E, then the syntax would be
>>>
>>>    int (E) d
>>>
>>
>> I'd recommend looking at C++ templates.  You might not want to follow
>> all the details of the syntax, and you want to look at the newer and
>> better techniques rather than the old ones.  But pick a way to give
>> compile-time parameters to types, and then use that - don't faff around
>> with special cases and limited options.  Pick one good method, then you
>> could have something like this :
>>
>>     builtin::int<32> x;
>>     using int32 = builtin::int<32>;
>>     int32 y;
>>
>> That is (IMHO) much better than your version because it will be
>> unambiguous, flexible, and follows a syntax that you can use for all
>> sorts of features.
>
> My version of that would be
>
>   typedef i32 = int 32
>
>   int 32 x
>   i32 y
>

Punctuation here is not /necessary/, but it would make the code far
easier to read, and far safer (in that mistakes are more likely to be
seen by the compiler rather than being valid code with unintended meaning).

> Your C++ version doesn't seem to be any more precise or flexible.

What happens when you have a type that should have two parameters - size
and alignment, for example? Or additional non-integer parameters such
as signedness or overflow behaviour? Or for container types with other
types as parameters? C++ has that all covered in a clear and accurate
manner - your system does not.

My intention here is to encourage you to think bigger. Stop thinking
"how do I make integer types?" - think wider and with greater generality
and ambition. Make a good general, flexible system of types, and then
let your integer types fall naturally out of that.

> And my
> version is shorter, clearer and (at least once you are used to the
> syntax) easier to read.

"Shorter" is /not/ an advantage, any more than "longer" is an advantage.

> In fact, mine is so much more readable that it
> shows how weird it looks to use both int 32 and i32 - something that IMO
> the C++ version obscures by lots of unnecessary (your term) waffle text!
> So I am not sure what your criticism is.

It doesn't really matter if you decide that "int32", "int32_t", "i32",
or anything else is going to be the normal way to declare a 32-bit
integer. You have to figure out what you think makes sense and reads
well in your language. But the syntax I suggested for defining the type
is not "waffle" - it is intention. You don't want this sort of thing to
be short - you want it to be consistent and logical, unambiguous in
syntax, and not conflict with identifiers the programmer might want.

>
> I do agree, however, that I need to look at templates. Are C++
> templates, as set out in
>
>   https://www.cplusplus.com/doc/oldtutorial/templates/
>
> essentially just about parametrising functions and classes where the
> parameters are types and other classes?
>
> Or are they more flexible?

A limited tutorial on a 20+ year old version of the language is not
going to be the best reference. This is a /much/ better site for C++
(and C) information, and works closely with the language standardisation
groups.

<https://en.cppreference.com/w/cpp/language/templates>

It's not a tutorial site, however. It aims to be accurate to the
standards but gives a more reader-friendly format than the standards,
and is excellent at noting the differences between different standards
versions.

Originally, templates were just about functions and classes parametrised
by types. They let you make a "max" function that would work for any
type with a " > " operator, or a list container class that could work
for any type. But they moved on from that. They also include template
aliases, variables, and concepts (which are a way of naming
characteristics of types - a sort of "type of type", except they use
duck-typing instead of structural typing). As well as types, template
parameters can be integers, enumerators, and now pretty much any
"literal" class. For a while, C++ templates were used for compile-time
calculations in C++, but that was an awkward process - the syntax was
seriously ugly and they were limited and inefficient. (Now you use
proper compile-time functions.)

Make sure you look at C++20 for inspiration, not ancient C++98. Look at
concepts - they greatly simplify templates and generic programming.
(They are not the only way to do it - remember that a lot of the way
things are done in an old, evolved language like C++ comes from adding
features while retaining backwards compatibility - for a new language,
you don't need to do that, and can jump straight to better designs. You
are looking for inspiration and ideas to copy, not copying all the
weaker parts of older languages.)

Perhaps even look at the metaclasses proposal
<https://www.fluentcpp.com/2018/03/09/c-metaclasses-proposal-less-5-minutes/>.
This will not be in C++ before C++26, maybe even later, but it gives a
whole new way of building code. If metaclasses had been part of C++
from the beginning, there would be no struct, class, enum, or union in
the language - these would have been standard library metaclasses. They
are /that/ flexible.

>
> I ask that because I wonder if something based on macros (where the
> parameters could be of any form, not just types and classes) could be as
> useful but more adaptable to different situations. After all, the
> creation of real functions from templated functions is rather like the
> instantiation of macros, isn't it?
>

To some extent, yes - but it is done in a clearer, cleaner and more
systematic manner.

Pure textual macros, like C's, have lots of limitations (no recursion is
a critical limitation) - as well as being too chaotic because there are
few rules.

But there are other languages with other kinds of macros, with different
possibilities. There are some languages where features like loop
structures are not keywords or fundamental language statements, but just
macros from the standard library.

Ultimately, things like macros, templates, generics, metafunctions,
etc., are just names for high-level compile-time coding constructs.

>>
>>
>> If you want more fun, you could make types first-class objects of your
>> language.  Then you could have a function "int" that takes a single
>> number as a parameter and returns a type.  Then you'd have :
>>
>>     int(32) x;
>>     type int32 = int(32);
>>     int32 y;
>>
>
> Are you talking there about a dynamic language where int is called at
> run time?
>

No - "int" would be a compile-time function here.
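
As a rough illustration of types as first-class values, here is a Python sketch in which a function takes a width and returns a type. Python does this at run time, whereas the suggestion above is for a compile-time function, so treat it only as a model of the idea. The name int_type and the overflow check are assumptions made for the example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def int_type(bits):
    """Return a class modelling a signed two's-complement integer of `bits` bits.

    Caching means int_type(32) always yields the very same type object.
    """
    if bits < 1:
        raise ValueError("width must be at least 1 bit")
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1

    class SizedInt:
        WIDTH, MIN, MAX = bits, lo, hi

        def __init__(self, value=0):
            if not lo <= value <= hi:
                raise OverflowError(f"{value} does not fit in int({bits})")
            self.value = value

    SizedInt.__name__ = f"int{bits}"
    return SizedInt


int32 = int_type(32)   # plays the role of: type int32 = int(32);
x = int_type(32)(5)    # plays the role of: int(32) x;
y = int32(7)           # plays the role of: int32 y;
```

Because the width is an ordinary argument, extra parameters such as signedness or alignment could be added to the same function without new syntax.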

James Harris

Aug 21, 2021, 2:11:32 PM
On 21/08/2021 17:31, Bart wrote:
> On 21/08/2021 15:58, James Harris wrote:
>> On 20/08/2021 20:33, Bart wrote:
>>> On 20/08/2021 16:04, James Harris wrote:
>>  >> Bart wrote:

...

>> At least you wouldn't need your hated shift key.
>>
>> ;-)
>
> It's not so bad in between tokens, maybe I just don't like interrupting
> the typing of a single alphanumeric token.

Understood.

From memory web search engines used to (and maybe still do) regard
underscore as being part of a word and hyphen as separating words. So

this-rather_odd-name

would be three 'words'.
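
The same split falls out of the usual regular-expression notion of a "word" character, which includes underscore but not hyphen. A small Python sketch (this shows regex behaviour, not how any particular search engine tokenises):

```python
import re


def words(text):
    """Split text into runs of word characters (letters, digits, underscore)."""
    return re.findall(r"\w+", text)


parts = words("this-rather_odd-name")  # ['this', 'rather_odd', 'name']
```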

>
> However I do have considerable problems with typing accurately, so I
> still hate the unneeded punctuation you have in C, especially with
> simple prints:
>
>   printf("A=%d B=%f\n",a,b);
>
> 7 shifted symbols, versus none in my equivalent code: println =a, =b
>

That's curious given what we have been discussing. You appear to have a
function with two parameters without parens!

I have not yet decided on output mechanisms but since there's some code
to compare I'll have a go. One option is

cout.putrec(a, b)

Another is

cout.putf("a=%M, b=%M\n/", a.string(), b.string())

Another is

debug.vardump(a, b)

which I guess is nearer the intention of your =a form.

...


>> I remember a Basic where one could type
>>
>>    r = sin x + cos y
>>
>> It was certainly easily readable.
>
> My syntax also allows 'sin x + cos y', but only because sin and cos are
> operators. However I tend to add the parentheses because I think it
> looks better, with less reliance on white space. Actually I also write
> max(A,B) instead of A max B for that reason.

OK.

What about the viewpoint that a function call /always/ has the form

f X

where X is /always/ a single argument, and that the single argument
needs to be wrapped in parens if it is a composite of something other
than one element?

I quite like the theory of that given that

f(x)
f (x)

should both mean the same AND that parens are traditionally used for
grouping without changing the meaning.

IOW (x) should mean the same as x.

...


> After all, we write (or at least I do), "64 bits" or "64-bit", even
> though in speech the gap between each part is near identical.

FWIW, I think Verilog allows N' as meaning N-bit. So one could have

7' 15

meaning 7-bit 15. For hardware programming specific-width values are
fairly common.

...

>> Besides, you likely remember that my declarations are meant to include
>> /ranges/ of widths (where the width has to be within a certain range).
>> For example,
>>
>>    int 16..32 b
>>
>> would mean that b had to be between 16 and 32 bits (inclusive) wide.
>
> Ok, to me this seems more like a range of values (as used in Pascal and
> Ada) than a range of bits. Didn't you previously have a range like this
> to denote values?

Not me. I have considered using

int range 0..9

Maybe that's what you are thinking of. But in that, "range" would be an
essential keyword (to indicate that the integer should be restricted to
values in the specified range).

>
>>> But which /are/ the invalid sizes; would int 24 be OK?
>>
>> Yes, int 24 would be OK.
>
> Just reading that makes me think of all the extra work that's going to
> be involved!

... (examples snipped)

I know. I don't pretend it would be easy but IMO it's important.

>
> Of course, I'm assuming the target hardware doesn't have 24-bit integers
> as native types.

The idea is that the computations would be the same irrespective of the
word size of the machine. So normal 16-bit ops would be a challenge on
such a machine. :-(

>
>
>> Negative numbers would be invalid. Probably zero, too. int 1 would be
>> valid, though unusual.
>
> Unsigned 1-bit is fine (also called Bool). Signed 1-bit would be
> unusual! It would have values of -1 and 0 I think.

Indeed. (That'd be worse to work with than int 24...!)

...

>>    namespace com = ns.dns.com
>
> That looks a bit dodgy. So 'com' can appear on both sides?

It isn't intended to work in the way you have in mind. Rather, imagine
that there's a 'current namespace' so that when you type

int b
int c

then b and c will be placed in that namespace. All normal stuff. Now add

namespace d = <something>

The current namespace will then have b, c and d.

So in the example, after

namedef com = ns.dns.com

there will be a name, com, in the current namespace, and the program
should be able to refer to com just as easily as it refers to b or c in
the example above.

Does that make more sense?

That said, there may be some problems with the idea that I haven't yet
seen. I certainly have a lot of details to work out in respect of where
name resolution will look if a name is not in the current namespace, and
how to allow the programmer to control that mechanism.


--
James Harris

anti...@math.uni.wroc.pl

Aug 21, 2021, 4:27:47 PM
David Brown <david...@hesbynett.no> wrote:
> On 20/08/2021 22:13, anti...@math.uni.wroc.pl wrote:
>
> >
> > PL/I put things to extreme: formally no identifier was reserved
> > and you could put declarations after use. Most languages
> > take intermediate position: there is small number of reserved
> > words and you need to declare variables before use. So
> > re-using predefined identifiers is easy to implement and safe.
> >
>
> Forth is the most flexible language I know of in this sense:
>
>
> $ gforth
> Gforth 0.7.3, Copyright (C) 1995-2008 Free Software Foundation, Inc.
> Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
> Type `bye' to exit
> 2 2 + . 4 ok
> : 2 3 ; ok
> 2 2 + . 6 ok
>
>
> The result of "2 2 +" is 4, then I redefine "2" to mean "3", and now the
> result of "2 2 +" is 6.

Forth is weird because it treats integers as identifiers. However,
there are several languages in the "most flexible" camp: each supports
a user-provided scanner, so after an appropriate "prelude" you can
put in a completely different programming language. AFAIK Forth allows
this, as do Lisp and a few other languages.
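
The "integers as identifiers" behaviour can be modelled in a few lines. The Python sketch below is a toy, not real Forth: it supports only +, . and colon definitions, and it looks words up when they run rather than compiling them at definition time. What it does share with Forth is the order of the checks: the dictionary is consulted first, and a token is parsed as a number only if no word of that name exists.

```python
def make_forth():
    """Build a tiny Forth-like interpreter and return its `run` function."""
    stack, output = [], []
    words = {
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        ".": lambda: output.append(str(stack.pop())),
    }

    def execute(token):
        if token in words:      # dictionary lookup comes first ...
            words[token]()
        else:                   # ... number parsing is only the fallback
            stack.append(int(token))

    def define(name, body):
        def word():
            for token in body:
                execute(token)
        words[name] = word

    def run(line):
        output.clear()
        tokens = line.split()
        i = 0
        while i < len(tokens):
            if tokens[i] == ":":            # : name body... ;
                end = tokens.index(";", i)
                define(tokens[i + 1], tokens[i + 2:end])
                i = end + 1
            else:
                execute(tokens[i])
                i += 1
        return " ".join(output)

    return run


run = make_forth()
before = run("2 2 + .")   # "4"
run(": 2 3 ;")            # redefine the word "2" to push 3
after = run("2 2 + .")    # "6"
```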

IIUC this is much more general than what James wants...

--
Waldek Hebisch

Bart

Aug 21, 2021, 8:14:12 PM
On 21/08/2021 19:11, James Harris wrote:
> On 21/08/2021 17:31, Bart wrote:
>> On 21/08/2021 15:58, James Harris wrote:

>> 7 shifted symbols, versus none in my equivalent code: println =a, =b
>>
>
> That's curious given what we have been discussing. You appear to have a
> function with two parameters without parens!

Well, that's because it's classed as a statement, and statements have
dedicated syntax.

print is also different from a function as it is given an arbitrarily
long list of operands, none of more significance than the other, and it
will consume all of them.

Nested prints like this:

print a, b, print c, d, e

would be parsed in a certain way (print a, b, (print c, d, e)), but will
not compile since 'print' does not return a value that can be printed.

Nesting is possible as:

print a, b, (print c, d; e), f

But in typical use, a print statement will consume all its operands, and
will never have nested print statements in a form that will cause issues.

>
> I have not yet decided on output mechanisms but since there's some code
> to compare I'll have a go. One option is
>
>   cout.putrec(a, b)
>
> Another is
>
>   cout.putf("a=%M, b=%M\n/", a.string(), b.string())

Your example here uses formatted print, which I'd write as:

fprintln "a=#, b=#", a, b

Having this stuff as statements means you don't need to deal with
challenging features like:

* Variadic /numbers/ of arguments to a function

* Variadic /types/ of arguments ...

* ... which you've circumvented with an explicit to-string routine,
but now you need overloaded versions for any types, plus you
need to manage the string memory used

> which I guess is nearer the intention of your =a form.

Using '=' requires being able to turn any expression back into a string.
(Which I don't do perfectly, and the form may not match what was in the
source code, so that 'max(a,b)' may come out as 'a max b'.)
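
For comparison, Python (3.8 and later) has a similar facility in f-strings: a trailing "=" inside the braces prints the expression text followed by its value. There the text is copied verbatim from the source rather than regenerated from the parsed expression, which sidesteps the mismatch described above. A minimal sketch:

```python
a, b = 1, 2.5

# The "=" specifier echoes the source text of the expression, then its repr.
line = f"{a=}, {b=}"          # "a=1, b=2.5"
expr = f"{max(a, b)=}"        # "max(a, b)=2.5"
```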


> What about the viewpoint that a function call /always/ has the form
>
>   f X
>
> where X is /always/ a single argument, and that the single argument
> needs to be wrapped in parens if it is a composite of something other
> than one element?

That's for other languages (everyone is mad now about functional
programming with its currying and lambdas).

This anyway causes problems with my current syntax where consecutive
identifiers do not normally occur. If they do, then the first is assumed
to be the name of a user-type.

My view is that if you want a language to look like a command language
that you'd write a line at a time like a shell, or a language mainly
used interactively via a REPL, then you can make them more friendly and
more informal by doing away with parentheses around command arguments.

After all you don't need to write DEL (file.c).

But if you are constructing a whole source file before submitting it to
a compiler or interpreter, then it can do with being a bit more formal.

However you've seen examples of my syntax; it's not particularly
cluttery is it, or bristling with punctuation.



>
>>>    namespace com = ns.dns.com
>>
>> That looks a bit dodgy. So 'com' can appear on both sides?
>
> It isn't intended to work in the way you have in mind. Rather, imagine
> that there's a 'current namespace' so that when you type
>
>   int b
>   int c
>
> then b and c will be placed in that namespace. All normal stuff. Now add
>
>   namespace d = <something>
>
> The current namespace will then have b, c and d.
>
> So in the example, after
>
>   namedef com = ns.dns.com
>
> there will be a name, com, in the current namespace, and the program

Yes, I hadn't spotted that they're not the same because you're defining
a top level 'com' name which will not clash with the other 'com', as it
only appears after a "." so is not visible.

James Harris

Aug 22, 2021, 3:03:09 PM
On 22/08/2021 01:14, Bart wrote:
> On 21/08/2021 19:11, James Harris wrote:
>> On 21/08/2021 17:31, Bart wrote:
>>> On 21/08/2021 15:58, James Harris wrote:
>
>>> 7 shifted symbols, versus none in my equivalent code: println =a, =b

...

The whole discussion of print /statements/ and command forms is too big
an issue to go into here. Something to come back to. :-)



>> What about the viewpoint that a function call /always/ has the form
>>
>>    f X
>>
>> where X is /always/ a single argument, and that the single argument
>> needs to be wrapped in parens if it is a composite of something other
>> than one element?
>
> That's for other languages (everyone is mad now about functional
> programming with its currying and lambdas).

It's nothing to do with functional programming. And probably little to
do with command syntax. It's meant to be a plain call. It could be to a
function (returning a result) or a procedure (returning nothing so not a
function).

Consider these two expressions

a
(a)

In the kind of syntax we are familiar with the parentheses do not alter
the meaning so both of those expressions would mean the same thing. So
if we put an identifier before them why should the meaning change? As in

f a
f (a)

Would it be more syntactically consistent (with the examples, above) if
they meant the same as each other?

You mentioned before about how with command syntax one can have the need
to walk through the following arguments, which could include
interpreting their different types and sizes. Consider

a + f(b)

The call would have one argument, (b). With

a + f(b, c, d)

could one also say that the call has one argument (b, c, d), and parse
it as such even if b, c and d were not positional but were a sequence
and had different types and sizes?

...

> However you've seen examples of my syntax; it's not particularly
> cluttery is it, or bristling with punctuation.

No, I like your syntax. It seems to me to be clean and user friendly.

That said, I think you have a number of special cases which help it
remain so but can be confusing to a newbie. Your =a is a case in point.

...

>> So in the example, after
>>
>>    namedef com = ns.dns.com
>>
>> there will be a name, com, in the current namespace, and the program
>
> Yes, I hadn't spotted that they're not the same because you're defining
> a top level 'com' name which will not clash with the other 'com', as it
> only appears after a "." so is not visible.
>

Yes. To be clear, to apply

namedef com = ns.dns.com

the only name which would need to be visible is ns. Assuming it could
resolve the RHS of the assignment the namedef would make the plain name
com visible as a short form of the longer name.
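
A rough model of that in Python, with namespaces as nested tables. The class and function names are invented for the sketch, and the lookup rule (search only the current namespace for the first component) is a simplifying assumption, since the real resolution rules are still to be worked out. The alias is bound to the existing node once, when it is defined, instead of being expanded like a macro at each use.

```python
class Namespace:
    """A named table mapping simple names to other namespaces."""

    def __init__(self, label):
        self.label = label
        self.names = {}


def make_path(scope, dotted):
    """Create (or reuse) the chain of namespaces named by a dotted path."""
    node = scope
    for part in dotted.split("."):
        node = node.names.setdefault(part, Namespace(part))
    return node


def resolve(scope, dotted):
    """Follow a dotted path; only its first component must be visible in scope."""
    node = scope
    for part in dotted.split("."):
        node = node.names[part]
    return node


def namedef(scope, new_name, dotted):
    """Model of 'namedef new_name = dotted': resolve the RHS, then bind."""
    scope.names[new_name] = resolve(scope, dotted)


current = Namespace("current")
www = make_path(current, "ns.dns.com.google.www")  # only "ns" is top-level

namedef(current, "com", "ns.dns.com")        # RHS resolved before "com" is bound
namedef(current, "gweb", "com.google.www")   # uses the "com" just defined
namedef(current, "webroot", "gweb")          # RHS without dots
```

Because the right-hand side is resolved before the new name is bound, having com on both sides causes no clash.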


--
James Harris

Bart

Aug 22, 2021, 3:54:25 PM
(X) doesn't always mean the same thing (I'll elaborate on X below).

In most cases it just means the expression X. But following a term, it
is the arguments for a function call (or cast or operator when they use
function-like syntax):

(expr)
A + (expr)
A(args)
A(args)(args)
(expr)(args)

X can be any of:

Expr:

() Empty list
(x) Simple term
(x,) 1-element list
(x,y,...) List of N terms

Args:

() No args (mandatory)
(x) 1 arg
(x,y,...) N args

As mentioned, some other constructs use the function-like syntax:

max(x,y) Same as (x max y)
clamp(x,y,z) Only uses this syntax
int(x) Cast
date(25,12,20) Record constructor, when 'date' is a type.

So, in your a + f(b,c,d) example, that is 3 arguments not one. For a
single argument, I'd need to write:

a + f((b, c, d))
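
Python happens to draw the same distinctions, so it can serve as a runnable illustration (the function count_args is made up for the example):

```python
def count_args(*args):
    """Report how many arguments the call site actually passed."""
    return len(args)


b, c, d = 1, 2, 3

grouped = (b)      # simple term: parentheses only group, still the integer 1
singleton = (b,)   # 1-element list (a tuple in Python)
empty = ()         # empty list

three = count_args(b, c, d)     # 3 arguments
one = count_args((b, c, d))     # 1 argument, which is a 3-element tuple
none = count_args()             # no arguments, parentheses still required
```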


Bart

Aug 22, 2021, 8:11:04 PM
On 21/08/2021 10:40, David Brown wrote:
> On 20/08/2021 22:13, anti...@math.uni.wroc.pl wrote:
>
>>
>> PL/I put things to extreme: formally no identifier was reserved
>> and you could put declarations after use. Most languages
>> take intermediate position: there is small number of reserved
>> words and you need to declare variables before use. So
>> re-using predefined identifiers is easy to implement and safe.
>>
>
> Forth is the most flexible language I know of in this sense:
>
>
> $ gforth
> Gforth 0.7.3, Copyright (C) 1995-2008 Free Software Foundation, Inc.
> Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
> Type `bye' to exit
> 2 2 + . 4 ok
> : 2 3 ; ok
> 2 2 + . 6 ok
>
>
> The result of "2 2 +" is 4, then I redefine "2" to mean "3", and now the
> result of "2 2 +" is 6.

-------------------------------
C:\qapps>qq forth
Bart-Forth
Type bye, quit or exit to stop

> 2 2 + .
4
> : 2 3 ;
> 2 2 + .
6
>
-------------------------------

It didn't have a REPL 20 minutes ago; it does now.

I wrote this last year, then when I went to download some test programs
(since I find it impossible to code in myself), I found that each used a
different, incompatible dialect.


(This Forth is written in my 'Q' scripting language. 'Q' is implemented in
my 'M' systems language. 'M' is implemented in my 'M' systems language.)

James Harris

Aug 23, 2021, 3:40:19 PM
On 22/08/2021 20:54, Bart wrote:
> On 22/08/2021 20:03, James Harris wrote:

...

>> Consider
>>
>>    a + f(b)
>>
>> The call would have one argument, (b). With
>>
>>    a + f(b, c, d)
>>
>> could one also say that the call has one argument (b, c, d), and parse
>> it as such even if b, c and d were not positional but were a sequence
>> and had different types and sizes?

...

> Expr:
>
>   ()         Empty list
>   (x)        Simple term
>   (x,)       1-element list
>   (x,y,...)  List of N terms

...

> So, in your a + f(b,c,d) example, that is 3 arguments not one. For a
> single argument, I'd need to write:
>
>   a + f((b, c, d))

Speaking of lists, what if a certain language treated every function
call as

F L

where L was a list?

The list could be in one of the forms you show, above, with zero or more
arguments. Any mandatory arguments would come first, followed by any
optional arguments. Is this what you were calling 'command syntax'
except with delimiters rather than whitespace between arguments?

I think of command syntax as more Lisp-like as in

(f b c d)

I was thinking here, however, that such a call would be more
conventionally embeddable in expressions as in

a + f(b,c,d) + e

Further, if b were to be mandatory and c and d were optional then f
would begin with b already assigned to whatever local name was the
formal parameter and would then iterate over - or even, effectively,
/parse/ - the subsequent sequence of arguments.

Parsing a series of arguments in that way would be akin to other
conventional processing patterns where a program is responding to a
series of varying inputs, including processing arguments on a command
line and the classic model of successively taking the next piece of work
from an event queue.

My language is currently much more conventional so perhaps this is just
a thought experiment. I suppose what I am thinking of is being able to
invoke a function with a form of arguments which is more general and
flexible than those we normally use.

One could, for example, create the list (in an arbitrary number of
steps) before making the call such as

Args = (b, c, d)
a + f *Args + e

Or the tail of the list could be generated dynamically by another
process so it would only end when that other process signalled that the
list was complete.

Etc.

There seem to be lots of ways of producing a list as a 'work stream' and
that's what would be fed in as the 'parameter list' to the callee.
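
Python's argument unpacking gives a feel for this, so here is a small sketch. The function f and its summing behaviour are invented for the example. One caveat: Python collects an unpacked generator completely before the call starts, so it models building the list elsewhere but not a list that is still growing while the callee runs.

```python
def f(b, *rest):
    """b is mandatory; the callee then walks whatever arguments follow."""
    total = b
    for item in rest:
        total += item
    return total


a, e = 10, 1

# Build the argument list first, then make the call: a + f *Args + e
args = (1, 2, 3)
result = a + f(*args) + e          # 10 + 6 + 1


def produced_elsewhere():
    """The tail of the argument list, generated by another piece of code."""
    yield 2
    yield 3


streamed = f(1, *produced_elsewhere())   # same as f(1, 2, 3)
```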

Maybe you've already tried something like this as it may be better
suited to a dynamic language than a static one.

An early parser of mine built a tree where each node was a list of the form

(nodetype, param ...)

Each list began with a node type and then had any 'parameters' of that
type. In a sense, that had the same structure: a type followed by a
series of properties is akin to a command followed by a series of
arguments.
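
A minimal Python sketch of that shape, with each node a tuple of the form (nodetype, param ...). The node types "num", "add" and "mul" are assumptions made for the example, not the ones the original parser used.

```python
def evaluate(node):
    """Evaluate a tree whose nodes are (nodetype, param, ...) tuples."""
    nodetype, *params = node          # the head selects the "command"
    if nodetype == "num":
        return params[0]
    if nodetype == "add":
        return sum(evaluate(p) for p in params)
    if nodetype == "mul":
        result = 1
        for p in params:
            result *= evaluate(p)
        return result
    raise ValueError(f"unknown node type: {nodetype}")


# 2 + 3 * 4
tree = ("add", ("num", 2), ("mul", ("num", 3), ("num", 4)))
value = evaluate(tree)
```

The dispatch on the first element followed by a walk over the rest is the same "command plus arguments" pattern.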

As I say, it seems to be a common pattern. I just wonder what it would
enable if it were applied to a parameter list.


--
James Harris

Bart

Aug 23, 2021, 5:55:48 PM
With static code, dealing with the Win64 ABI is already enough of a
nightmare that it's best to keep things as simple and conventional as
possible.

With interpreted code where there is a software stack, then it's
possible to be more flexible.

But I've looked at some of those ways of passing parameters, eg. like
Python's *A, which expands a list (or other iterable) A into N
arguments. Unfortunately they don't really fit my current
implementations.

It's not impossible, but I would have to keep conventional function
calls to keep things efficient, with expanding argument lists
implemented via new bytecode instructions.

In that case that would just have to join the waiting list of such features!

It's possible in Python because everything is done at runtime (I'm
actually surprised it's not a lot slower), but it makes it harder to
optimise.

It also aspires to be a functional language so it needs to support all
these tricks with argument lists, or returning function objects, or
creating lambda functions; it's all interconnected. My own languages
need to be more prosaic.

