Why does C allow structs to have a tag?

James Harris

Jun 6, 2021, 8:16:17 AM

Does C, as a language, need to allow its structs to have tags?

AIUI a fragment such as "struct A {....};" reserves no storage but
declares a template which can be used later, e.g. to declare a variable
as in

struct A var;

to declare a parameter as in

void f(struct A parm) {....}

to effect a cast as in

(struct A *) p

etc but in all such cases struct A is being used as a type. And there is
a more general feature for that in typedef.

So could C's struct tags be omitted from the language? If not, what does
a struct tag add? Is it something to do with forward declarations or
syntactic consistency with union tags, etc?

--
James Harris

David Brown

Jun 6, 2021, 9:28:10 AM

Struct tags are not required if you have a typedef:

typedef struct { int a; int b; } A;

They are also not needed if you only refer to the struct once :

struct { int a; int b; } a;

But in general if you have two "struct" declarations, these define
different types - even if they contain the same members. So if you want
to use the type more than once, you need to give it a name - either
using a tag or by making it part of a typedef.

Forward declarations require a tag - there is no way to avoid that.
Beyond that, I think you can skip tags and stick to typedef if you prefer.

(Some people feel it is better to refer to the type as "struct A" rather
than a typedef "A", others prefer to use typedefs and avoid the "struct"
part.)

Öö Tiib

Jun 6, 2021, 9:34:44 AM

Not from the C language. C removes only features that no one
uses, and even those very slowly. "struct A" will be there forever, as
it is quite popular. But from your own code, go ahead; David Brown
explained how.

Bart

Jun 6, 2021, 9:59:35 AM

They add nothing at all to the language. Other languages don't need
them. Even C++ I think is moving away from them.

It's just another quirk among many (you probably know that tags have
their own namespace too).

However, as C works at the minute, tags are needed for self-referential
structs, or with cyclic references. This won't work, even if you try and
have a forward declaration for Node:

typedef struct {
int data;
Node* next;
} Node;

You have to use:

typedef struct Nodetag{
int data;
struct Nodetag* next;
} Node;

Of course, as a long-standing feature of C, many will claim it is
invaluable. For example that using:

struct Nodetag x;

is clearer than:

Node x;

because you know that you're dealing with a struct. My view is that a
feature that allows:

struct Nodetag Nodetag;

is not one you'd want to emulate.

Kaz Kylheku

Jun 6, 2021, 10:52:10 AM

On 2021-06-06, James Harris <james.h...@gmail.com> wrote:
> etc but in all such cases struct A is being used as a type. And there is
> a more general feature for that in typedef.

typedef is general, but a general what?

It's purely a type aliasing mechanism for giving names to existing
types. It doesn't define new types.

> So could C's struct tags be omitted from the language? If not, what does
> a struct tag add? Is it something to do with forward declarations or
> syntactic consistency with union tags, etc?

The tag is essential in the C design for self-referential
structures:

struct node {
struct node *next;
void *datum;
};

typedef cannot solve this because

typedef struct {
node *next; // error: typedef does not exist yet!
void *datum;
} node;

The struct specifier is what introduces a type; typedef just introduces
the name "node" into the scope which refers to that type.
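
As a minimal sketch of why the tag makes self-reference possible, the list below is walked through its own "struct node *" member. The function name sum_list and the int payload are assumptions made for the example (the post above uses a void * datum).

```c
#include <stddef.h>

/* The tag "node" is already visible inside the braces, so a member can
   point to the type that is still being defined. */
struct node {
    struct node *next;
    int datum;            /* int instead of void *, to keep the sketch short */
};

/* Walk the list and add up the data; an empty list (NULL) sums to 0. */
int sum_list(const struct node *head)
{
    int total = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        total += p->datum;
    return total;
}
```

The nodes can live anywhere (stack, static storage, heap); the declaration itself needs nothing beyond the tag.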

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Kaz Kylheku

Jun 6, 2021, 11:05:16 AM

On 2021-06-06, Bart <b...@freeuk.com> wrote:
> You have to use:
>
> typedef struct Nodetag{

There is no requirement that the tag and type name must be distinct
since they are in separate namespaces.

> int data;
> struct Nodetag* next;
> } Node;

It will work with a forward declaration too, but that of course
still requires the tag appearing in two places:

typedef struct Nodetag Node;

struct Nodetag {
Nodetag *next;

};

> because you know that you're dealing with a struct. My view is that a
> feature that allows:
>
> struct Nodetag Nodetag;
>
> is not one you'd want to emulate.

On the contrary, type names should be in a separate namespace, so that
an object of type foo can naturally be named foo when you need it:

node = new node;
other_node = new node;

Only crap languages reduce everything to one namespace.

C does not escape this disparagement. Though it provides that
struct/enum tag namespace, it's for those kinds of types only, and you
have to reference it explicitly.
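
For reference, a small sketch of the separate name spaces C does have. One function uses the identifier A as a struct tag, as an object and as a label without any clash. The function name namespaces_demo is invented for the example.

```c
/* One identifier, three name spaces: tag, ordinary identifier, label. */
struct A { int value; };

int namespaces_demo(void)
{
    struct A A = { 0 };   /* tag A names the type, object A is the variable */
    int passes = 0;
A:                        /* label A lives in the label name space */
    A.value += 10;
    if (++passes < 3)
        goto A;
    return A.value;       /* three passes of +10 */
}
```

The tag always needs the struct keyword in front of it, which is what "reference it explicitly" refers to.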

Bart

Jun 6, 2021, 11:05:44 AM

On 06/06/2021 15:51, Kaz Kylheku wrote:
> On 2021-06-06, James Harris <james.h...@gmail.com> wrote:
>> etc but in all such cases struct A is being used as a type. And there is
>> a more general feature for that in typedef.
>
> typedef is general, but a general what?
>
> It's purely a type aliasing mechanism for giving names to existing
> types. It doesn't define new types.
>
>> So could C's struct tags be omitted from the language? If not, what does
>> a struct tag add? Is it something to do with forward declarations or
>> syntactic consistency with union tags, etc?
>
> The tag is essential in the C design for self-referential
> structures:
>
> struct node {
> struct node *next;
> void *datum;
> };
>
> typedef cannot solve this because
>
> typedef struct {
> node *next; // error: typedef does not exist yet!
> void *datum;
> } node;
>
> The struct specifier is what introduces a type; typedef just introduces
> the name "node" into the scope which refers to that type.

It's only essential because C has decided that's how it has to work.

Your example works with C++; how does that manage it?

C could have done it with a forward declaration of 'node', but:

typedef node;

is not valid; you need to give a type, and that type then counts as
either a separate or duplicate struct to the one defined later. Or you
use a tag for that forward, empty struct declaration, so you're back to
square one.

It's just fortuitous that the tag is placed just before {...} instead of
just after, otherwise you'd have the same problem with tags.

A minor tweak of syntax, such as:

typedef newtype = oldtype; // with oldtype name-less

instead of:

typedef oldtype newtype; // with oldtype/newtype wrapped around
// each other for complex types

would have done it.

Bart

Jun 6, 2021, 11:31:02 AM

On 06/06/2021 16:05, Kaz Kylheku wrote:
> On 2021-06-06, Bart <b...@freeuk.com> wrote:
>> You have to use:
>>
>> typedef struct Nodetag{
>
>
> There is no requirement that the tag and type name must be distinct
> since they are in separate namespaces.
>
>> int data;
>> struct Nodetag* next;
>> } Node;
>
> It will work with a forward declaration too, but that of course
> still requires the tag appearing in two places:
>
> typedef struct Nodetag Node;
>
> struct Nodetag {
> Nodetag *next;
> };
>
>
>> because you know that you're dealing with a struct. My view is that a
>> feature that allows:
>>
>> struct Nodetag Nodetag;
>>
>> is not one you'd want to emulate.
>
> On the contrary, type names should be in a separate namespace, so that
> an object of type foo can naturally be named foo when you need it:

Why would you want that? Case-sensitive languages already allow that
anyway if you're really stuck for inventing meaningful new identifiers.

Foo foo;

> node = new node;
> other_node = new node;

It took me a while to figure out what this is supposed to mean.

So 'node' following 'new' is a type; but the other node is a variable
name? What about:

node node;

> Only crap languages reduce everything to one namespace.

Namespaces are used where they are genuinely useful, for example to
disambiguate A.F from B.F. It's not so you can have F meaning two or
more different things within the same scope. This is an example from C
(one of 8 case variations of the three As):

A:; struct A A; goto A;

Really handy when half the keys don't work on your keyboard!

James Harris

Jun 6, 2021, 11:33:42 AM

On 06/06/2021 16:05, Bart wrote:
> On 06/06/2021 15:51, Kaz Kylheku wrote:

...

>> The tag is essential in the C design for self-referential
>> structures:

Agreed with those who have mentioned that. I am not sure whether C still
needs to prohibit forward references. The prohibition was once needed for
single-pass compilation, I would think, but that era is long past.

> It's just fortuitous that the tag is placed just before {...} instead of
> just after, otherwise you'd have the same problem with tags.

Far from fortuitous, it's more likely to be good design. With

struct A {....} B;

isn't it true that either A or B can be omitted, and that A will be the
struct tag and B will be an identifier which is being declared as of the
specified struct type? And it fits with C's ordering of

type identifier

That's a really quite subtle and consistent design feature, ISTM.

>
> A minor tweak of syntax, such as:
>
> typedef newtype = oldtype;   // with oldtype name-less
>
> instead of:
>
> typedef oldtype newtype;     // with oldtype/newtype wrapped around
>                                // each other for complex types
>
> would have done it.

Yes, I prefer your 'tweaked' syntax for a programming language (though
am not suggesting it for C), just without the semicolon, as in

typedef newtypename = oldtypespec

--
James Harris

Manfred

Jun 6, 2021, 12:58:34 PM

On 6/6/2021 5:05 PM, Kaz Kylheku wrote:
> On 2021-06-06, Bart <b...@freeuk.com> wrote:
>> You have to use:
>>
>> typedef struct Nodetag{
>
>
> There is no requirement that the tag and type name must be distinct
> since they are in separate namespaces.
>
>> int data;
>> struct Nodetag* next;
>> } Node;
>
> It will work with a forward declaration too, but that of course
> still requires the tag appearing in two places:
>
> typedef struct Nodetag Node;
>
> struct Nodetag {

> Nodetag *next; // error: unknown type name ‘Nodetag’
> };

>
>

typedef struct Nodetag Node;

struct Nodetag {

int data;
Node *next;
};

Richard Damon

Jun 6, 2021, 3:03:58 PM

The biggest need is for referencing a struct whose definition hasn't
been given, especially as a pointer type.

You can say 'struct A' to reference that type without ever having seen
the actual definition of what that structure looks like.

You can NOT do this with a typedef. You CAN say
typedef struct A Astruct; to make Astruct a typedef for that unknown
struct type, but even that requires the struct tag to be specified, so
it can possibly be linked later when the struct is actually defined.

Except by using this sort of tag, there is no way in C to say that a
name refers to some type that may be defined later.

I suppose C could be extended to allow a statement like typedef A; to
define that A was a type, that later might get connected to some defined
type. This would allow you to define pointers to that type.
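
A minimal sketch of the incomplete-type usage described above, assuming a made-up counter type: the first part could sit in a header and the second in a separate source file. The names counter, counter_bump and counter_read are hypothetical.

```c
/* "Header" part: only the tag is announced. struct counter is incomplete
   here, yet pointers to it can already be declared and passed around. */
struct counter;
typedef struct counter counter;   /* the typedef still has to name the tag */

void counter_bump(counter *c);
int  counter_read(const counter *c);

/* "Implementation" part: the same tag links this definition to the
   declarations above. */
struct counter { int hits; };

void counter_bump(counter *c) { c->hits++; }
int  counter_read(const counter *c) { return c->hits; }
```

Code that sees only the first part can hold and pass counter pointers but cannot look inside the struct or create one by value.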

David Brown

Jun 6, 2021, 3:12:41 PM

Or even:

typedef struct Node Node;

struct Node {
int data;
Node *next;
};

There's no need for an extra name here.
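
As a small sketch of this single-name form in use, the function below mixes both spellings of the type to show that they denote the same thing. The function name list_length is an assumption made for the example.

```c
#include <stddef.h>

typedef struct Node Node;   /* tag Node and typedef name Node coexist */

struct Node {
    int data;
    Node *next;             /* the typedef name is already usable here */
};

/* "Node *" and "struct Node *" are the same type, so the two spellings
   can be mixed (a style guide may still prefer one of them). */
size_t list_length(const struct Node *head)
{
    size_t n = 0;
    for (const Node *p = head; p != NULL; p = p->next)
        n++;
    return n;
}
```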

Bart

Jun 6, 2021, 4:02:17 PM

Unless you want to avoid any confusion. Since you can come across this:

Node* A;
struct Node* B;

Does one of those lines have a 'struct' missing, or does the other have
a 'struct' that shouldn't be there?

The eye picks up such inconsistencies; it creates a distraction.

Allowing two ways of denoting the same user-type is also sloppy.

Richard Harnden

Jun 6, 2021, 4:25:45 PM

That's why the "_t" suffix in "typedef struct foo foo_t" is popular.

Personally, I'm happy to lose the typedef and say struct foo when
that's what I mean. It's not that much extra typing.

Keith Thompson

Jun 6, 2021, 6:10:50 PM

Bart <b...@freeuk.com> writes:
> On 06/06/2021 13:16, James Harris wrote:
>> Does C, as a language, need to allow its structs to have tags?
>>
>> AIUI a fragment such as "struct A {....};" reserves no storage but
>> declares a template which can be used later, e.g. to declare a
>> variable as in
>>
>> struct A var;
>>
>> to declare a parameter as in
>>
>> void f(struct A parm) {....}
>>
>> to effect a cast as in
>>
>> (struct A *) p
>>
>> etc but in all such cases struct A is being used as a type. And
>> there is a more general feature for that in typedef.
>>
>> So could C's struct tags be omitted from the language? If not, what
>> does a struct tag add? Is it something to do with forward
>> declarations or syntactic consistency with union tags, etc?
>
> They add nothing at all to the language. Other languages don't need
> them. Even C++ I think is moving away from them.

No, it isn't.

In C, you can define a structure type as
struct foo { /* members */ };

where "foo" is the tag. The name of the resulting type is "struct foo".
You can optionally add a typedef if you prefer to refer to it as "foo".

In C++, the same type, defined with the same syntax, can be referred to
either as "struct foo" or as "foo"; the latter is more common. The
identifier "foo" is still the struct tag. C++ has worked this way for
decades; it's not "moving away" from anything.

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

Keith Thompson

Jun 6, 2021, 6:13:37 PM

Which is a great argument for not bothering with the typedef:

struct Node {
int data;
struct Node *next;
};

struct Node obj;
struct Node *ptr;
// ...

(Bart will argue that having to type "struct" again is a heavy burden.
I will not respond.)

Bart

Jun 6, 2021, 6:55:43 PM

There is NO argument. Every other language including C++ sees separate
tag names as pointless. C++ can't completely get rid of them, so still
allows 'struct tag{}' or 'struct tag x' for compatibility with C code,
although 'struct tag{}' here creates a typedef 'tag' too.

C++ actually allows this:

typedef struct T {int x,y;} U;

so that you can use T, struct T or U (but not struct U). So C
compatibility has given it /three/ ways of denoting the same user-type!

I declare such a struct like this (in one language):

type T = struct (int x,y)

It declares exactly one user type, denoted always as just 'T'.

Keith Thompson

Jun 6, 2021, 8:02:37 PM

Bart <b...@freeuk.com> writes:
[...]

> There is NO argument. Every other language including C++ sees separate
> tag names as pointless. C++ can't completely get rid of them, so still
> allows 'struct tag{}' or 'struct tag x' for compatibility with C code,
> although 'struct tag{}' here creates a typedef 'tag' too.

[...]

No, that's not a typedef. It's just the name of the type. C++ does
have typedefs, but as in C they require the "typedef" keyword.

In C, you can write "typedef struct foo { ... } foo;". Then "foo" is a
typedef name, and an alias for the type that is also named "struct foo".

In C++, you can write "struct foo { ... };", and "struct foo" and
"foo" are both names for the type. Neither of them is a typedef.
The name "foo" acts much like the typedef name in C. (There might
be subtle differences; I'm not sure.)

And something I just noticed: C++ doesn't use the term "tag". "foo" is
a "class name".

The distinction is not deeply significant. I just like to get the
terminology right.

David Brown

Jun 7, 2021, 2:52:36 AM

What you call "sloppy", others might call "flexible". You are, IIRC, a
fan of case-insensitive languages. Are they sloppy for letting you
write "foo" and "Foo" to mean the same thing? Or are you just engaging
in your favourite hobby - taking something that other people see as a
convenient way to write clearer code and using it for yet another
meaningless rant against C?

I agree that it is good to be consistent when you write your code, and I
would expect a programmer to choose either "struct Node" /or/ "Node",
and use it consistently. Generally, you'd only find the typedef at all
if the programmer intends to use "Node" consistently.

So there is no confusion that I can see.

(On the other hand, I think code like "struct Node Node;" is a lot more
likely to cause confusion.)

Bart

Jun 7, 2021, 6:06:23 AM

On 07/06/2021 07:52, David Brown wrote:

> On 06/06/2021 22:02, Bart wrote:

>> Unless you want to avoid any confusion. Since you can come across this:
>>
>> Node* A;
>> struct Node* B;
>>
>> Does one of those lines have a 'struct' missing, or does the other have
>> a 'struct' that shouldn't be there?
>>
>> The eye picks up such inconsistencies; it creates a distraction.
>>
>> Allowing two ways of denoting the same user-type is also sloppy.
>>
>
> What you call "sloppy", others might call "flexible". You are, IIRC, a
> fan of case-insensitive languages. Are they sloppy for letting you
> write "foo" and "Foo" to mean the same thing?

You mean in the same way that "bart", "Bart" or "BART" all refer to me?

Or the fact that if I say them out loud, they all sound identical too?

> Or are you just engaging
> in your favourite hobby - taking something that other people see as a
> convenient way to write clearer code and using it for yet another
> meaningless rant against C?

No. For me case sensitivity means code that ISN'T clearer because people
exploit that subtle difference in case between otherwise identical
names: to use them for different purposes not just within the same
scope, but across a program, where you have to keep doing a double-take.

Here are some wonderful examples from sqlite3.c:

Action action
affinity Affinity
B b
bIn bin
BusyHandler busyHandler
Clsid clsid CLSID
DbPage dbpage
debuginfo DebugInfo
dwContextHelpId dwContextHelpID
dwFlags dwflags
Edx eDx
errmsg errMsg
hwndActive hWndActive
message Message MESSAGE
nExt next Next
szPMA szPma
WalIndexHdr walIndexHdr

> I agree that it is good to be consistent when you write your code, and I
> would expect a programmer to choose either "struct Node" /or/ "Node",
> and use it consistently. Generally, you'd only find the typedef at all
> if the programmer intends to use "Node" consistently.

My manually written C code only ever defines structs as part of a
typedef. And only uses tags when necessary to define the struct. It is
then subsequently referred to by the typedef name.

My automatically generated C code does the opposite; never uses typedefs
for structs (because there, who cares?).

But in any other language, we just wouldn't be discussing this at all.
You create a new type, be it a struct or anything else, and it's called
T. End of story.

David Brown

Jun 7, 2021, 7:25:50 AM

On 07/06/2021 12:06, Bart wrote:
> On 07/06/2021 07:52, David Brown wrote:
>> On 06/06/2021 22:02, Bart wrote:
>
>>> Unless you want to avoid any confusion. Since you can come across this:
>>>
>>> Node* A;
>>> struct Node* B;
>>>
>>> Does one of those lines have a 'struct' missing, or does the other have
>>> a 'struct' that shouldn't be there?
>>>
>>> The eye picks up such inconsistencies; it creates a distraction.
>>>
>>> Allowing two ways of denoting the same user-type is also sloppy.
>>>
>>
>> What you call "sloppy", others might call "flexible". You are, IIRC, a
>> fan of case-insensitive languages. Are they sloppy for letting you
>> write "foo" and "Foo" to mean the same thing?
>
> You mean in the same way that "bart", "Bart" or "BART" all refer to me?
>
> Or the fact that if I say them out loud, they all sound identical too?

So you are happy with multiple ways of writing "bart" or "BART" that
look very different, and would immediately be viewed as different
identifiers and therefore different meanings to people familiar with
case-sensitive programming languages. Yet you would be upset with two
versions of writing the same type in C that are really quite obvious to
C programmers - and where any mistakes (in real code) would be caught by
the compiler.

>
>> Or are you just engaging
>> in your favourite hobby - taking something that other people see as a
>> convenient way to write clearer code and using it for yet another
>> meaningless rant against C?
>
> No. For me case sensitivity means code that ISN'T clearer because people
> exploit that subtle difference in case between otherwise identical
> names: to use them for different purposes not just within the same
> scope, but across a program, where you have to keep doing a double-take.
>

Ah, it's the old "some people write code I didn't like, therefore the
language is bad" argument. You do like to take your own personal
opinion and think of it as global rules that should apply to everyone.
And you do like to take the worst (in your not so humble opinion)
examples and assume all code is written that way.

Bart

Jun 7, 2021, 8:00:30 AM

On 07/06/2021 12:25, David Brown wrote:
> On 07/06/2021 12:06, Bart wrote:

>> You mean in the same way that "bart", "Bart" or "BART" all refer to me?
>>
>> Or the fact that if I say them out loud, they all sound identical too?
>
> So you are happy with multiple ways of writing "bart" or "BART" that
> look very different, and would immediately be viewed as different
> identifiers and therefore different meanings to people familiar with
> case-sensitive programming languages. Yet you would be upset with two
> versions of writing the same type in C that are really quite obvious to
> C programmers - and where any mistakes (in real code) would be caught by
> the compiler.

The complaint is really about case-sensitivity in general, across
languages, file systems, OS command lines and, worst because YOU CAN'T
SEE WHAT YOU'RE TYPING, in passwords.

It is just an almighty pain. Fortunately for the most important
situations, common sense prevails and case-insensitivity is used.
(User-names, the first parts of web-addresses, emails and so on. Google
searches too!)

> Ah, it's the old "some people write code I didn't like, therefore the
> language is bad" argument. You do like to take your own personal
> opinion and think of it as global rules that should apply to everyone.
> And you do like to take the worst (in your not so humble opinion)
> examples and assume all code is written that way.

So, you don't take the same advantage of being able to use Foo and foo?

If that's the case, why do you need a case-sensitive language?

Thiago Adams

Jun 7, 2021, 9:01:00 AM

On Sunday, June 6, 2021 at 10:28:10 AM UTC-3, David Brown wrote:
[...]

> But in general if you have two "struct" declarations, these define
> different types - even if they contain the same members. So if you want
> to use the type more than once, you need to give it a name - either
> using a tag or by making it part of a typedef.

In some cases creating a name just makes the code more complex.

I wish I could create template structs using macros.

#define vector(T) { T * data; int size, capacity; }

void F(struct vector(Card) * v);

This convention carries more information than a typedef with a
plural name, for instance:

void F(struct Cards * v);

Similar to "struct Card cards[]", it conveys information about
the characteristics of the type.

Imagine in C if each time we need array of something
we have to assign a name to it.

int a[10]; //error you need to create a name for it first.

or each compound literal had to have a name...

C also doesn't have unnamed functions (like C++ lambdas) and
sometimes the name is artificially created and used in just one place like
callbacks.
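
For comparison, a sketch of one workaround that already works in C today: token pasting builds a tag from the element type's name, so vector(Card) names the same struct type everywhere. This is not the untagged form wished for above. The macros vector and DEFINE_VECTOR, the Card members and cards_push are all hypothetical, and T has to be a single identifier.

```c
#include <stdlib.h>

/* Hypothetical helper macros, not from any library. vector(T) pastes the
   element type's name into a tag, so every vector(T) for the same T
   refers to one struct type. */
#define vector(T) struct vector_##T
#define DEFINE_VECTOR(T) vector(T) { T *data; int size, capacity; }

typedef struct { int rank; int suit; } Card;

DEFINE_VECTOR(Card);   /* expands to: struct vector_Card { Card *data; ... }; */

/* Append one card, doubling the buffer when it is full.
   Returns 0 on success and -1 if the allocation fails. */
int cards_push(vector(Card) *v, Card c)
{
    if (v->size == v->capacity) {
        int new_cap = v->capacity ? v->capacity * 2 : 4;
        Card *p = realloc(v->data, (size_t)new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        v->data = p;
        v->capacity = new_cap;
    }
    v->data[v->size++] = c;
    return 0;
}
```

A zero-initialized vector(Card) is a valid empty vector here, and the caller releases the buffer with free(v.data).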

Malcolm McLean

Jun 7, 2021, 10:25:41 AM

On Monday, 7 June 2021 at 13:00:30 UTC+1, Bart wrote:
>
> So, you don't take the same advantage of being able to use Foo and foo?
>
> If that's the case, why do you need a case-sensitive language?
>

If the language is case-insensitive, you don't want the same identifier to
appear in the source in different cases. That confuses anyone who doesn't
know that the language is case insensitive, or who doesn't know that the
programmer who wrote the code knows that the language is case
insensitive, or who hasn't been told why such an odd practice is followed
(there might be reasons, such as attaching metadata to an identifier).

So in fact what you want is not a case insensitive language, but a language
which disallows identifiers which are identical except in case. Alternatively
a language which enforces rules such as "local variables must be lower case,
functions must start with an uppercase letter".

There are good arguments against allowing C's free for all. Virtually every coding
standard is more restrictive than C in the identifiers that it allows. However
backwards compatibility is all.

Manfred

Jun 7, 2021, 11:15:17 AM

Since I am no Bart, I am going to argue about that.
Not that it's a heavy burden at all. My view is about consistent syntax
instead:
In C the canonical object declaration is:

type_name object_name;

One of the great strengths of the language is the possibility of
defining user defined types, by means of structs and unions (leaving out
typedefs that are aliases as you say).
From this perspective, after I have defined my type Foo as
struct Foo { int a; int b; };

I think it should be enough for the compiler to understand the correct
meaning when I write
Foo foo;

Now, I know you say (strictly according to the standard) that the type
name is "struct Foo" instead of "Foo", however from the pure syntactical
perspective this is a combination of a keyword and a name, so in C user
defined type names are required to include a keyword in their name; this
is not consistent with other type names.

I know that in practice it's no big deal, it's just something you have
to learn early on and you get used to it.
It's just that I am somewhat of a fan of language theory, syntax and
semantics.
That's why I like the C++ syntax for this considerably better.

I think this is one matter that falls under the category of "hangover"
from obsolete compiler weaknesses, to use David's terms.
Obviously all of this is now carved in stone, so no one is expecting C
to change this "feature" - I'm just expressing my taste about it.

Bart

Jun 7, 2021, 11:31:34 AM

On 07/06/2021 15:25, Malcolm McLean wrote:
> On Monday, 7 June 2021 at 13:00:30 UTC+1, Bart wrote:
>>
>> So, you don't take the same advantage of being able to use Foo and foo?
>>
>> If that's the case, why do you need a case-sensitive language?
>>
> If the language is case-insensitive, you don't want the same identifier to
> appear in the source in different cases.

Why is that a problem? You think that, because a syntax is
case-insensitive, that coders will delight in applying random
combinations of case in their identifiers? In any case, you will /know/
they are all the same.

In my own coding style, I mainly use all-lower-case except when
highlighting debug or temporary code, then I might use all upper case.

It also allows me to import functions such as 'MessageBox' and call them
as 'messagebox' (ie. not need to remember the exact capitalisation). Or
C's printf as PRINTF if it is temporary code.

In C however, once someone decides the pattern of capitalisation, then
you have to use exactly that pattern. With a danger that, if you get it
wrong, it may inadvertently match the same identifier from an outer
scope with a different pattern.

> That confuses anyone who doesn't
> know that the language is case insensitive,

Fortran, Ada, Pascal and the Algols are case-insensitive. Nim for
something more recent. But one prerequisite when using a language is
knowing something about it! Including knowing if it is case insensitive
or not.

> or who doesn't know that the
> programmer who wrote the code knows that the language is case
> insensitive, or who hasn't been told why such an odd practice is followed
> (there might be reasons, such as attaching metadata to an identifier).

Why do people think it is odd all of a sudden?

One of the most ridiculous things is case-sensitivity in a CLI. So you
can't type COPY, it has to be copy, because COPY means something else.
(What does it mean? What does Copy or CoPy mean?)

The next most ridiculous thing is case-sensitivity in file names. Just
remember the exact capitalisation and don't accidentally use the wrong
one. (Imagine 64 versions of hello.c in your directory.)

> So in fact what you want is not a case insensitive language, but a language
> which disallows identifiers which are identical except in case. Alternatively
> a language which enforces rules such as "local variables must be lower case,
> functions must start with an uppercase letter".

Guidelines can be enforced in a case-insensitive language too.

What you can't do however is use Foo for a function name, and foo for a
local variable in the same scope (as well as foo for a struct tag /and/
label name, plus fOO, fOo and foO variations for those, and FoO, FOo and
FOO for functions) as C does.

> There are good arguments against allowing C's free for all. Virtually every coding
> standard is more restrictive than C in the identifiers that it allows. However
> backwards compatibility is all.
>

All my languages can interface with case-sensitive FFI names. Even if
they clash when case is ignored.

David Brown

Jun 7, 2021, 12:06:59 PM

Sure. There is a reason C++ has lambdas, templates, etc. But C is kept
as a simpler language, and does not have these features - if you want
them, C++ is your obvious choice.

Kaz Kylheku

Jun 7, 2021, 12:15:20 PM

On 2021-06-07, Bart <b...@freeuk.com> wrote:
> On 07/06/2021 15:25, Malcolm McLean wrote:
>> On Monday, 7 June 2021 at 13:00:30 UTC+1, Bart wrote:
>>>
>>> So, you don't take the same advantage of being able to use Foo and foo?
>>>
>>> If that's the case, why do you need a case-sensitive language?
>>>
>> If the language is case-insensitive, you don't want the same identifier to
>> appear in the source in different cases.
>
> Why is that a problem? You think that, because a syntax is
> case-insensitive, that coders will delight in applying random
> combinations of case in their identifiers? In any case, you will /know/
> they are all the same.

When you say case insensitivity, what does that mean? Are these
considered the same identifier?

δέλτα

Δέλτα

How about Japanese katakana vs. hiragana. Are these the same
identifier?

スウジすうじ

すうジスウじ

They both say "suuji"; can you declare using one and reference elsewhere
using the others?

Bart

Jun 7, 2021, 12:53:14 PM

On 07/06/2021 17:15, Kaz Kylheku wrote:
> On 2021-06-07, Bart <b...@freeuk.com> wrote:
>> On 07/06/2021 15:25, Malcolm McLean wrote:
>>> On Monday, 7 June 2021 at 13:00:30 UTC+1, Bart wrote:
>>>>
>>>> So, you don't take the same advantage of being able to use Foo and foo?
>>>>
>>>> If that's the case, why do you need a case-sensitive language?
>>>>
>>> If the language is case-insensitive, you don't want the same identifier to
>>> appear in the source in different cases.
>>
>> Why is that a problem? You think that, because a syntax is
>> case-insensitive, that coders will delight in applying random
>> combinations of case in their identifiers? In any case, you will /know/
>> they are all the same.
>
> When you say case insensitivity, what does that mean? Are these
> considered the same identifier?
>
> δέλτα
>
> Δέλτα
>
> How about Japanese katakana vs. hiragana. Are these the same
> identifier?
>
> スウジすうじ
>
> すうジスウじ
>
> They both say "suuji"; can you declare using one and reference elsewhere
> using the others?

I'm mainly talking about language source code which uses the ASCII
alphabet for keywords and user-identifiers.

I'm not concerned with other elements of source code such as comments,
or string literals, or any data used by the program. Or elements
representing file names (eg. names of include files).

That means considering A-Z and a-z in keywords and identifiers as
interchangeable.

As for the answers to your questions; in my languages they are not
identifiers. Some more unusual identifiers are allowed using a "`"
prefix, for example, keywords, or numbers, which can be tweaked for utf8
sequences, but ` makes it case-sensitive in that case.

Then, they are considered equivalent when the byte-sequences match, not
glyphs nor meanings.
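
As a minimal sketch of the rule described above (an illustration in C, not code from any of the languages mentioned; the names fold_ascii and ident_equal are made up here), ASCII-only case folding with exact byte comparison for everything else could look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Fold only ASCII 'A'..'Z' to lower case; every other byte, including
   the bytes of UTF-8 sequences, is left untouched. */
static unsigned char fold_ascii(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* Hypothetical helper: true if two identifiers match when A-Z and a-z
   are treated as interchangeable and all other bytes compare exactly. */
bool ident_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0') {
        if (fold_ascii((unsigned char)*a) != fold_ascii((unsigned char)*b))
            return false;
        ++a;
        ++b;
    }
    return *a == *b;   /* both strings must end at the same point */
}
```

Under this rule Bartc and bartc are the same identifier, while the two UTF-8 spellings of "delta" quoted earlier stay distinct because their byte sequences differ.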

In languages such as Fortran and Ada, you'd have to ask experts on those.

James Harris

Jun 7, 2021, 1:38:43 PM

If I understand you then yes, but that's a design choice.

I wonder if a language could, instead, allow statements to refer to
types which come anywhere in the same scope, i.e. allow /implicit/
forward references?

For example, the self referential:

  typedef A = struct {
    A *next;
    int data;
  }

or the indirectly referential:

  typedef P = struct {
    Q *firstchild;
    int data;
  }

  typedef Q = struct {
    P *parent;
    float data;
  }

The hardest part of that for a compiler to parse would be

Q *firstchild;

because no mention of Q would have even been seen by the time the
compiler got to that line.

Could an implicit forward reference such as the above be parsed?
Perhaps. As long as the compiler skipping that line and coming back to
it later did not prevent recognition of later types then it looks as
though the above would be parseable without requiring a programmer to
code an explicit forward reference. If the line could be a declaration
then proper recognition of it could be carried out in a second pass.

If that would remove the need for a struct tag and also remove the
slightly ugly and annoying requirement of having to include explicit
forward references then it would be a small but welcome step forward, IMO.
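
For comparison, here is a minimal sketch of how standard C expresses the same mutually referential pair today, with the explicit forward declarations that the proposal would make unnecessary. P and Q come from the example above; the helper link_pair is an illustrative assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Explicit forward declarations: both names are known, as incomplete
   types, before either struct body is seen. */
typedef struct P P;
typedef struct Q Q;

struct P {
    Q *firstchild;   /* a pointer to an incomplete type is allowed */
    int data;
};

struct Q {
    P *parent;
    float data;
};

/* Hypothetical helper: wire a parent and a child together. */
void link_pair(P *parent, Q *child)
{
    assert(parent != NULL && child != NULL);
    parent->firstchild = child;
    child->parent = parent;
}
```

Both typedef lines depend on a struct tag, which is the requirement under discussion.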

--
James Harris

James Harris

Jun 7, 2021, 1:42:47 PM

On 06/06/2021 23:55, Bart wrote:

...

> C++ actually allows this:
>
> typedef struct T {int x,y;} U;
>
> so that you can use T, struct T or U (but not struct U). So C
> compatibility has given it /three/ ways of denoting the same user-type!
>
> I declare such a struct like this (in one language):
>
> type T = struct (int x,y)
>
> It declares exactly one user type, denoted always as just 'T'.

Does that language cope with /later/ type definitions such as

type T = struct (int x, y; Q q)
type Q = int

If so, how do you handle compilation of something (Q) which has not yet
been seen?

--
James Harris

James Harris

Jun 7, 2021, 1:55:02 PM

On 07/06/2021 11:06, Bart wrote:
> On 07/06/2021 07:52, David Brown wrote:
>> On 06/06/2021 22:02, Bart wrote:
>
>>> Unless you want to avoid any confusion. Since you can come across this:
>>>
>>> Node* A;
>>> struct Node* B;
>>>
>>> Does one of those lines have a 'struct' missing, or does the other have
>>> a 'struct' that shouldn't be there?
>>>
>>> The eye picks up such inconsistencies; it creates a distraction.
>>>
>>> Allowing two ways of denoting the same user-type is also sloppy.
>>>
>>
>> What you call "sloppy", others might call "flexible". You are, IIRC, a
>> fan of case-insensitive languages. Are they sloppy for letting you
>> write "foo" and "Foo" to mean the same thing?
>
> You mean in the same way that "bart", "Bart" or "BART" all refer to me?

So does bartc. So does whatever your real name is. But they would be
different if used as identifiers. IOW I don't think you can use "refers
to me" as a justification for case insensitivity.

>
> Or the fact that if I say them out loud, they all sound identical too?

Similar could be said for

lastchild

and

last_child

also for

index2

and

index_2

and even

index_two

>
>> Or are you just engaging
>> in your favourite hobby - taking something that other people see as a
>> convenient way to write clearer code and using it for yet another
>> meaningless rant against C?
>
> No. For me case sensitivity means code that ISN'T clearer because people
> exploit that subtle difference in case between otherwise identical
> names: to use them for different purposes not just within the same
> scope, but across a program, where you have to keep doing a double-take.

I have some sympathy for your position but IMO a language which uses
case-insensitive identifiers really needs an IDE to convert all
instances of an identifier to the casing used where the identifier is
defined. E.g. if you'd defined

int Bartc

then anywhere else where you typed

bartc

should be autoconverted to the casing used in the definition.

The problem with that is that not everyone wants to use a
language-specific or even language-aware IDE. Some of us prefer source
files to be plain text.

--
James Harris

James Harris

Jun 7, 2021, 2:02:23 PM

On 07/06/2021 17:53, Bart wrote:
> On 07/06/2021 17:15, Kaz Kylheku wrote:

...

>> When you say case insensitivity, what does that mean? Are these
>> considered the same identifier?
>>
>>    δέλτα
>>
>>    Δέλτα
>>
>> How about Japanese katakana vs. hiragana. Are these the same
>> identifier?
>>
>>    スウジ   すうじ
>>
>>    すうジ   スウじ
>>
>> They both say "suuji"; can you declare using one and reference elsewhere
>> using the others?

...

> As for the answers to your questions; in my languages they are not
> identifiers. Some more unusual identifiers are allowed using a "`"
> prefix, for example, keywords, or numbers, which can be tweaked for utf8
> sequences, but ` makes it case-sensitive in that case.

If you want to include unusual characters in identifiers (perhaps in
order to interface with external routines which allow them) is there any
reason not to use \ rather than the ` character? That would at least be
consistent with how backslash is often used in strings.

--
James Harris

Bart

Jun 7, 2021, 2:40:04 PM

Yes, it handles that as it can deal with out-of-order definitions in
general.

But I also make it a bit harder than it need be because I allow
non-hierarchical module imports (i.e. cyclic and mutual imports are
allowed), so that in your example Q might be defined in an imported
module, which might not be processed until after this one (perhaps
because it itself imported this one, so module-processing order is
indeterminate).

However you don't need to do it that way.

The key here is for the syntax to recognise two consecutive identifiers:

A B ...

as a probable declaration of variable B with type A. Alternatively you
can use a syntax like this:

var A B
var B:A

Here you know that A needs to be a type. In all cases, it will require
an extra pass, for example:

T A[N];
char B[sizeof(A)];

If T (and even N) are defined later, it will not be able to determine
the dimensions of B (sizeof(T)*N) on the first pass.

So you have to work with everything being tentative.
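
For contrast, a sketch of the same two declarations in C as it stands. Here nothing is tentative, but only because T and N are declared before use. The definitions chosen for T and N are assumptions made for illustration, as is the accessor size_of_B.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 4 };                    /* must be visible before A is declared */
typedef struct { int x, y; } T;    /* must be complete before A is declared */

T A[N];
char B[sizeof(A)];                 /* sizeof(A) is known as soon as it is seen */

static_assert(sizeof(B) == sizeof(T) * N, "B spans exactly N elements of T");

/* Hypothetical accessor so the size can also be inspected at run time. */
size_t size_of_B(void)
{
    return sizeof B;
}
```

Moving the definitions of T or N below the two array declarations makes this fail to compile in C, which is the ordering constraint an out-of-order language has to resolve with a later pass.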

Bart

Jun 7, 2021, 2:57:54 PM

On 07/06/2021 18:54, James Harris wrote:
> On 07/06/2021 11:06, Bart wrote:

>> You mean in the same way that "bart", "Bart" or "BART" all refer to me?
>
> So does bartc. So does whatever your real name is.

I think you know what I'm getting at here. If I signed a card bart, Bart
or BART, they would know who it's from (apart from there being not many
people called Bart).

If I used BART instead of Bart, nobody would wonder who on earth that
might be. So in real life, case is ignored, unless it is for some
special emphasis or special meaning (like Bay Area Rapid Transit).

I can translate a poem to all caps, or all lower case, or every word
capitalised, and people can still understand it. Random capitalisation
might be a bit more work though!

> But they would be
> different if used as identifiers. IOW I don't think you can use "refers
> to me" as a justification for case insensitivity.
>
>>
>> Or the fact that if I say them out loud, they all sound identical too?
>
> Similar could be said for
>
> lastchild
>
> and
>
> last_child
>
> also for
>
> index2
>
> and
>
> index_2
>
> and even
>
> index_two

Most of those use underscores that I'm not that keen on either. In some
languages (I think Nim) underscores are not significant: A_B and AB are
the same. It's just used for readability.

I don't use underscores much.

>> No. For me case sensitivity means code that ISN'T clearer because
>> people exploit that subtle difference in case between otherwise
>> identical names: to use them for different purposes not just within
>> the same scope, but across a program, where you have to keep doing a
>> double-take.
>
> I have some sympathy for your position but IMO a language which uses
> case-insensitive identifiers really needs an IDE to convert all
> instances of an identifier to the casing used where the identifier is
> defined. E.g. if you'd defined
>
> int Bartc
>
> then anywhere else where you typed
>
> bartc
>
> should be autoconverted to the casing used in the definition.

The way it usually works is that "bartc" is used everywhere, or some
other consistent spelling. But even when mixed, it doesn't matter because
they are the same name. You're still thinking in case-sensitive terms.

Have a go with Windows which is a case-insensitive OS. Create a file
called Hello. Then type:

> dir HELLO

it will still find it. Now edit a file called hello; it will edit the
same file.

Now, the file system works a little like you say because file names
retain the case they were created with, so here dir will list
the file as Hello.

I don't do that except for FFI names or ones starting with `; names are
usually converted to lower case.

Keith Thompson

Jun 7, 2021, 3:03:42 PM

Ideally, I agree. The reasons for requiring the struct keyword are
historical.

> Now, I know you say (strictly according to the standard) that the type
> name is "struct Foo" instead of "Foo", however from the pure
> syntactical perspective this is a combination of a keyword and a name,
> so in C user defined type names are required to include a keyword in
> their name; this is not consistent with other type names.

A type name in C isn't just an identifier optionally preceded by a keyword.
It can be a sequence of keywords (unsigned long long int), or a
combination of keywords and punctuation (void(*)(void)), and can even
include expressions (unsigned char[2*x+y][time()%60]). The only way a C
type name can be a single identifier is via a typedef -- a feature that
was added to the language relatively late, and that has required parsers
to be a bit more complicated.

In the absence of typedefs, every type name in C has a syntactic marker
to indicate whether it's an integer or floating-point type (one of
several keywords), a pointer (*), a function (()), an array ([]), or a
struct, union, or enum (struct, union, and enum keywords).

Recall that C evolved from a language that didn't have types. The idea
of *naming* types was added, and the idea of naming types with a single
identifier was added even later.

> I know that in practice it's no big deal, it's just something you have
> to learn early on and you get used to it.
> It's just that I am somewhat of a fan of language theory, syntax and
> semantics.
> That's why I like the C++ syntax for this considerably better.

C++ still has all of C's syntactic quirks. I agree that being able to
use the tag (what C++ calls a "class name") as a type name is an
improvement -- and that it couldn't be adopted in C without breaking
existing code.

Given C's rules, my personal preference is to write out "struct foo"
rather than creating a typedef -- unless the type is intended to be
opaque, meaning that client code shouldn't know that it's a struct.
(But if I were working on a project that had a convention of using
typedefs, I'd follow that.) In C++, I just use the class name.
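
The opaque-type case mentioned above might look like the following sketch. The counter type and its three functions are hypothetical. In real code the first part would live in a header and the struct definition in a single .c file, so client code never sees that the type is a struct.

```c
#include <assert.h>
#include <stdlib.h>

/* --- what a header would expose (hypothetical names) --- */
typedef struct counter counter;      /* incomplete: clients see no members */
counter *counter_new(void);
int counter_next(counter *c);
void counter_free(counter *c);

/* --- what the implementation file would contain --- */
struct counter {
    int value;
};

counter *counter_new(void)
{
    counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = 0;
    return c;                        /* NULL if allocation failed */
}

int counter_next(counter *c)
{
    assert(c != NULL);
    return ++c->value;
}

void counter_free(counter *c)
{
    free(c);
}
```

Even here the typedef is built on a tag: the incomplete type in the header can only be named as struct counter.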

> I think this is one matter that falls under the category of "hangover"
> from obsolete compiler weaknesses, to use David's terms.
> Obviously all of this is now carved in stone, so no one is expecting C
> to change this "feature" - I'm just expressing my taste about it.

Bart

Jun 7, 2021, 3:55:35 PM

On 07/06/2021 20:03, Keith Thompson wrote:

> Manfred <non...@add.invalid> writes:

>> Now, I know you say (strictly according to the standard) that the type
>> name is "struct Foo" instead of "Foo", however from the pure
>> syntactical perspective this is a combination of a keyword and a name,
>> so in C user defined type names are required to include a keyword in
>> their name; this is not consistent with other type names.
>
> A type name in C isn't just an identifer optionally preceded by keyword.
> It can be a sequence of keywords (unsigned long long int), or a
> combination of keywords and punctuation (void(*)(void)), and can even
> include expressions (unsigned char[2*x+y][time()%60]). The only way a C
> type name can be a single identifier is via a typedef -- a feature that
> was added to the language relatively late, and that has required parsers
> to be a bit more complicated.
>
> In the absence of typedefs, every type name in C has a syntactic marker
> to indicate whether it's an integer or floating-point type (one of
> several keywords), a pointer (*), a function (()), an array ([]), or a
> struct, union, or enum (struct, union, and enum keywords).

You're missing the point a little.

One purpose of a user-type is to mop up all that mess into a single
identifier, which can be used without any other keywords or other syntax.

That doesn't happen with the user-identifier assigned to a struct tag.
You have to prefix it with 'struct', to indicate it has to look into the
right namespace.

It's a different kind of named entity than a typedef, and unique in that
it /requires/ the combination of keyword and user-identifier.

>
> Recall that C evolved from a language that didn't have types. The idea
> of *naming* types was added, and the idea of naming types with a single
> identifier was added even later.
>
>> I know that in practice it's no big deal, it's just something you have
>> to learn early on and you get used to it.
>> It's just that I am somewhat of a fan of language theory, syntax and
>> semantics.
>> That's why I like the C++ syntax for this considerably better.
>
> C++ still has all of C's syntactic quirks. I agree that being able to
> use the tag (what C calls a "class name") as a type name is an
> improvement -- and that it couldn't be adopted in C without breaking
> existing code.
>
> Given C's rules, my personal preference is to write out "struct foo"
> rather than creating a typedef

So? Just name the type 'struct_T' instead of 'T'.

David Brown

Jun 7, 2021, 4:17:38 PM

On 07/06/2021 21:55, Bart wrote:
> On 07/06/2021 20:03, Keith Thompson wrote:
>> Manfred <non...@add.invalid> writes:
>
>>> Now, I know you say (strictly according to the standard) that the type
>>> name is "struct Foo" instead of "Foo", however from the pure
>>> syntactical perspective this is a combination of a keyword and a name,
>>> so in C user defined type names are required to include a keyword in
>>> their name; this is not consistent with other type names.
>>
>> A type name in C isn't just an identifer optionally preceded by keyword.
>> It can be a sequence of keywords (unsigned long long int), or a
>> combination of keywords and punctuation (void(*)(void)), and can even
>> include expressions (unsigned char[2*x+y][time()%60]). The only way a C
>> type name can be a single identifier is via a typedef -- a feature that
>> was added to the language relatively late, and that has required parsers
>> to be a bit more complicated.
>>
>> In the absence of typedefs, every type name in C has a syntactic marker
>> to indicate whether it's an integer or floating-point type (one of
>> several keywords), a pointer (*), a function (()), an array ([]), or a
>> struct, union, or enum (struct, union, and enum keywords).
>
> You're missing the point a little.
>
> One purpose of a user-type is mop up all that mess into single
> identifer, which can be used without any other keywords or other syntax.
>

No, that would be the purpose of having type aliases - "typedef" in C.
The purpose of user-defined types is to be able to make types that are
different from existing types in some aspect.

> That doesn't happen with the user-identifier assigned to a struct tag.
> You have to prefer it with 'struct', to indicate it has to look into the
> right namespace.
>

Or you combine the two features - the creation of a new type with
"struct", and the convenient aliasing with "typedef".

> It's a different kind of named entity than a typedef, and unique in that
> it /requires/ the combination of keyword and user-identifier.

It's different in that it serves a different purpose.

>
>>
>> Recall that C evolved from a language that didn't have types. The idea
>> of *naming* types was added, and the idea of naming types with a single
>> identifier was added even later.
>>
>>> I know that in practice it's no big deal, it's just something you have
>>> to learn early on and you get used to it.
>>> It's just that I am somewhat of a fan of language theory, syntax and
>>> semantics.
>>> That's why I like the C++ syntax for this considerably better.
>>
>> C++ still has all of C's syntactic quirks. I agree that being able to
>> use the tag (what C calls a "class name") as a type name is an
>> improvement -- and that it couldn't be adopted in C without breaking
>> existing code.
>>
>> Given C's rules, my personal preference is to write out "struct foo"
>> rather than creating a typedef
>
> So? Just name the type 'struct_T' instead of 'T'.

People have different choices here. Some people prefer to view their
types with a little more abstraction, and would tend to use "T" (and
there's a fair correlation with preferring "char* p"). Others prefer to
think more in terms of how values of the type are used, and like "struct
T" because structs are used in different ways from scalars (programmers
with those preferences are perhaps more likely to write "char *p"). C
caters for both preferences - and both are in common use.

Keith Thompson

Jun 7, 2021, 4:22:55 PM

Bart <b...@freeuk.com> writes:
> On 07/06/2021 20:03, Keith Thompson wrote:
>> Manfred <non...@add.invalid> writes:
>>> Now, I know you say (strictly according to the standard) that the type
>>> name is "struct Foo" instead of "Foo", however from the pure
>>> syntactical perspective this is a combination of a keyword and a name,
>>> so in C user defined type names are required to include a keyword in
>>> their name; this is not consistent with other type names.
>> A type name in C isn't just an identifer optionally preceded by
>> keyword.
>> It can be a sequence of keywords (unsigned long long int), or a
>> combination of keywords and punctuation (void(*)(void)), and can even
>> include expressions (unsigned char[2*x+y][time()%60]). The only way a C
>> type name can be a single identifier is via a typedef -- a feature that
>> was added to the language relatively late, and that has required parsers
>> to be a bit more complicated.
>> In the absence of typedefs, every type name in C has a syntactic
>> marker
>> to indicate whether it's an integer or floating-point type (one of
>> several keywords), a pointer (*), a function (()), an array ([]), or a
>> struct, union, or enum (struct, union, and enum keywords).
>
> You're missing the point a little.

I'm not missing the point. I just disagree with you.

C does not have a convention of using a single identifier as a type
name. See N1570 6.7.7 for more information on C type names. I do not
choose to impose such a convention using typedefs (unless the existing
code I'm working on does so).

[...]

>> Given C's rules, my personal preference is to write out "struct foo"
>> rather than creating a typedef
>
> So? Just name the type 'struct_T' instead of 'T'.

Or I can just use a space instead of an underscore.

I understand that the need to use the struct keyword bothers you.
There's no need to repeat yourself.

Malcolm McLean

Jun 7, 2021, 4:37:04 PM

On Monday, 7 June 2021 at 21:22:55 UTC+1, Keith Thompson wrote:
>
> C does not have a convention of using a single identifier as a type
> name. See N1570 6.7.7 for more information on C type names. I do not
> choose to impose such a convention using typedefs (unless the existing
> code I'm working on does so).
>

I suspect that a lot of house style guides say that structs must be typedefed.
A lot will also mandate the stdint defines for fixed width integers, which has
the effect of reducing the use cases for types like "unsigned long" almost
to nothing.

Bart

Jun 7, 2021, 4:54:17 PM

On 07/06/2021 21:17, David Brown wrote:
> On 07/06/2021 21:55, Bart wrote:

>> One purpose of a user-type is mop up all that mess into single
>> identifer, which can be used without any other keywords or other syntax.

They all do the same thing. C allows anonymous structs (with struct{})
in a similar way to anonymous arrays ([]) and pointers (*). Or you can
name them, using typedef.

But structs additionally have the concept of a type-tag to do the same
thing, with their own namespace and their own special syntax to
distinguish such a name from a regular typedef name.

The only reason anyone has come up with for such a useless feature is
that it was there first. But when did typedef come along? It must be 40
years ago.

But since the language didn't also provide a way around the
self-referential struct problem, and compilers wouldn't deprecate it
anyway, people continued using these two parallel schemes.

>> It's a different kind of named entity than a typedef, and unique in that
>> it /requires/ the combination of keyword and user-identifier.
>
> It's different in that it serves a different purpose.

Which is? I still don't get it. Any struct can have:

  typedef?  Tag?   Declare vars using:

  No        No     struct {...} a,b,c;
  Yes       No     T a,b,c;
  No        Yes    struct Tag a,b,c;
  Yes       Yes    T a,b,c; OR struct Tag a,b,c;

Someone please tell me why the language needs such a table with two
columns, or why it is desirable, instead of:

  typedef?
  No        struct {...} a,b,c;
  Yes       T a,b,c;

or (if I made it like mine) just:

  typedef?
  Yes       T a,b,c;

Kaz Kylheku

Jun 7, 2021, 6:26:47 PM

But you're convinced that your perspective is broader than that of
anyone else here, right?

Bart

Jun 7, 2021, 7:19:53 PM

On 07/06/2021 23:26, Kaz Kylheku wrote:
> On 2021-06-07, Bart <b...@freeuk.com> wrote:

>> I'm mainly talking about language source code which uses the ASCII
>> alphabet for keywords and user-identifiers.
>
> But you're convinced that your perspective is broader than that of
> anyone else here, right?
>

No, but you seem to be convinced that yours is.

Mine is a pragmatic approach: many languages do only support this
restricted subset of characters for identifiers, like C.

Programmers don't care as much as you seem to think. Unicode is a
complete minefield, and I think it's best to keep it out of the core
parts of a programming language, and allow it mainly in source comments,
source string literals, file names relevant to the compiler, program
data, and libraries.

But every language makes its own choice.

Long ago I did provide facilities in one of my script languages for
keywords and such to use an extended alphabet (then, for western
european languages), but users weren't bothered; they got on fine with
the English!

Besides, there is an awful lot more to it than just allowing Unicode in
identifiers, or even those aspects I mentioned above. Suppose I did
support the Japanese alphabet for identifiers in my language implementation:

* Keywords would still be in English

* Named operators are still in English

* Standard type names are in English

* As would be the names of standard library functions, types, macros,
structs, enums and variables

* As would be the case with the vast majority of third party libraries

* Compiler error messages would be in English

* Compiler options would be based on English

* Entry point names would still be 'main' or 'start'. Etc.

So any programmer will still need to encounter English in many places.
If they really want to use their native language for identifiers (and
can deal with the problem of many distinct characters having identical
glyphs among many others), there are languages that will do that.

Including mine with a tweak, in a bit that you snipped, and where I said
it would be case-sensitive as case-insensitivity only applies for A-Z/a-z.

So, what is your point? That the existence of alphabets where letter
case is poorly defined, or meaningless, means we shouldn't be able to
define it for our A-Z? What would you do about languages such as Fortran
and Ada that are still used, but are case-insensitive?

Note that even C is case-insensitive in a few places:

0xABC or 0xabc

1.2e3 or 1.2E3

100ull or 100ULL

u'A' or U'A'

Keith Thompson

Jun 7, 2021, 8:06:28 PM

Bart <b...@freeuk.com> writes:
[...]

> Note that even C is case-insensitive is a few places:
>
> 0xABC or 0xabc
>
> 1.2e3 or 1.2E3
>
> 100ull or 100ULL
>
> u'A' or U'A'

u'A' is of type char16_t. U'A' is of type char32_t.
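
The difference can be checked with a C11 generic selection. This is a sketch assuming a C11 compiler that provides <uchar.h>; the CHAR_KIND macro and the two accessor functions are made-up names.

```c
#include <assert.h>
#include <uchar.h>   /* char16_t, char32_t (C11) */

/* Map an expression to 16 or 32 according to whether its type is
   char16_t or char32_t; any other type yields 0. */
#define CHAR_KIND(c) _Generic((c), char16_t: 16, char32_t: 32, default: 0)

static_assert(CHAR_KIND(u'A') == 16, "u'A' has type char16_t");
static_assert(CHAR_KIND(U'A') == 32, "U'A' has type char32_t");

/* Hypothetical accessors so the result can be checked at run time too. */
int kind_of_small_u(void) { return CHAR_KIND(u'A'); }
int kind_of_big_u(void)   { return CHAR_KIND(U'A'); }
```

So the case of this particular prefix selects a different type, unlike the other spellings in the list.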

Bart

Jun 8, 2021, 6:13:53 AM

On 08/06/2021 01:06, Keith Thompson wrote:
> Bart <b...@freeuk.com> writes:
> [...]

>> Note that even C is case-insensitive [in] a few places:

>>
>> 0xABC or 0xabc
>>
>> 1.2e3 or 1.2E3
>>
>> 100ull or 100ULL
>>
>> u'A' or U'A'
>
> u'A' is of type char16_t. U'A' is of type char32_t.
>

OK. So:

0x123 or 0X123

0x123p3 or 0x123P3

#include <stdio.h> or #include <STDIO.H> (on Windows)

The point is, in such contexts, 'A' and 'a' for example are deemed to be
interchangeable. This is the case even though there exist alphabets
where such equivalences don't exist for some or all of the letters.

People are trying to use the latter examples as reasons to banish the
concept of case equivalence in computer systems even for alphabets where
it is perfectly well-defined.
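
A short sketch that checks the numeric spellings listed in this exchange, assuming a C11 compiler (for static_assert). The function name float_spellings_agree is made up.

```c
#include <assert.h>

/* Integer constants: hex digits, the 0x prefix and the suffix all
   ignore letter case. */
static_assert(0xABC == 0xabc, "hex digits");
static_assert(0x123 == 0X123, "hex prefix");
static_assert(100ull == 100ULL, "integer suffix");

/* The floating constants are compared at run time so that the static
   assertions above stay strictly integer constant expressions. */
int float_spellings_agree(void)
{
    return 1.2e3 == 1.2E3 && 0x123p3 == 0x123P3;
}
```

The #include <STDIO.H> case is left out because, as noted, it depends on the host file system rather than on the language.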

Joe Pfeiffer

Jun 8, 2021, 10:06:45 AM

James Harris <james.h...@gmail.com> writes:

> Does C, as a language, need to allow its structs to have tags?

> etc but in all such cases struct A is being used as a type. And there
> is a more general feature for that in typedef.
>
> So could C's struct tags be omitted from the language? If not, what
> does a struct tag add? Is it something to do with forward declarations
> or syntactic consistency with union tags, etc?

structs (with tags) predate typedefs. I expect you're right that they
are no longer necessary, but they cause no harm that I know of and
removing them now would break lots of code written by people who don't
like typedefs.

Malcolm McLean

Jun 8, 2021, 10:45:53 AM

You need a tag when a struct contains a pointer to its own kind. Which is quite
common for graph nodes.
In other situations, you have the option - either typedef the struct or use the
"struct mytag" syntax. This isn't really desirable. Programming languages
shouldn't provide two ways to do essentially the same thing. It just
leads to gratuitous inconsistencies, sometimes even incompatibilities.

Guillaume

Jun 8, 2021, 11:09:21 AM

Le 08/06/2021 à 16:45, Malcolm McLean a écrit :
> On Tuesday, 8 June 2021 at 15:06:45 UTC+1, Joe Pfeiffer wrote:
> You need a tag when a struct contains a pointer to its own kind. Which is quite
> common for graph nodes.

Yep. This is akin to a forward type definition.

> In other situations, you have the option - either typedef the struct or use the
> "struct mytag" syntax. This isn't really desireable. Programming languages
> shouldn't provide two ways to do essentially the same thing. It just
> leads to gratuitious inconsistencies, sometimes even incompatibilities.

I think this just has historical reasons for C, but it's kind of weird
indeed. Especially since, not only does it give two ways of defining and
using struct types, but it also implies separate namespaces. (structs
have their own namespace, as well as enums, unions, then typedefs, ...)

Now as you mentioned above, to get rid of that, C would need to have
another way of declaring a forward type definition. You can define a
forward typedef for structs actually, but it still requires a struct
tag, so that doesn't solve much.

If I'm not mistaken, you can do the following:

typedef struct foo foo_t;

struct foo { ... foo_t *next; };

which would be similar to:

typedef struct foo { ... struct foo *next; } foo_t;

In both cases, you still need a struct tag.
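
A compilable sketch of the first form, with the elided members filled in by an illustrative int payload. The data member and the helpers foo_push and foo_length are assumptions, not part of the fragment above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct foo foo_t;   /* forward typedef: struct foo is still incomplete */

struct foo {
    int data;               /* illustrative payload */
    foo_t *next;            /* the typedef name is already usable here */
};

/* Hypothetical helper: push a node onto the front of a list and
   return the new head. */
foo_t *foo_push(foo_t *head, foo_t *node)
{
    assert(node != NULL);
    node->next = head;
    return node;
}

/* Hypothetical helper: count the nodes in a list. */
size_t foo_length(const foo_t *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        ++n;
    return n;
}
```

After the first line, the rest of the code can use foo_t alone, but that first line still has to name the type through its tag.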

Joe Pfeiffer

Jun 8, 2021, 11:25:48 AM

Ah, you are correct. Though if we were eliminating struct tags, we
could modify typedefs so they could have what amounts to a forward
reference (as we can with tagged structs).

Kaz Kylheku

Jun 8, 2021, 12:00:40 PM

On 2021-06-08, Guillaume <mes...@bottle.org> wrote:
> Le 08/06/2021 à 16:45, Malcolm McLean a écrit :
>> On Tuesday, 8 June 2021 at 15:06:45 UTC+1, Joe Pfeiffer wrote:
>> You need a tag when a struct contains a pointer to its own kind. Which is quite
>> common for graph nodes.
>
> Yep. This is akin to a forward type definition.
>
>> In other situations, you have the option - either typedef the struct or use the
>> "struct mytag" syntax. This isn't really desireable. Programming languages
>> shouldn't provide two ways to do essentially the same thing. It just
>> leads to gratuitious inconsistencies, sometimes even incompatibilities.
>
> I think this just has historical reasons for C, but it's kind of weird
> indeed. Especially since, not only does it give two ways of defining and
> using struct types, but it also implies separate namespaces. (structs
> have their own namespace, as well as enums, unions, then typedefs, ...)

It may feel "weird", but the underlying compile-time object model is
quite clear.

There is a single namespace for declared identifiers in which typedef
names live along with functions and variables.

There is a separate tag namespace which holds only struct/union/enum
types. This has different, useful properties from the regular namespace.

Only typedef links aliases to types into the regular namespace, nothing else.
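
A small sketch of the two namespaces just described, assuming nothing beyond standard C. The same identifier, node (a made-up name), is used once as a typedef name and once as a tag, and the two do not clash. The helper total_weight is also hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Ordinary namespace: node is a typedef name for int. */
typedef int node;

/* Tag namespace: struct node is an unrelated struct type. */
struct node {
    node weight;              /* the typedef name, i.e. int */
    struct node *next;        /* the tag, i.e. this struct  */
};

static_assert(sizeof(node) == sizeof(int), "node alone names the typedef");

/* Hypothetical helper: sum the weights along a chain. */
node total_weight(const struct node *n)
{
    node sum = 0;
    for (; n != NULL; n = n->next)
        sum += n->weight;
    return sum;
}
```

This is valid C only. C++ puts the class name into the ordinary scope as well, so it rejects the reuse of node.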

Even if you have it so that structs are entered into the regular
namespace automatically without typedef, you still need structs to have
a name property.

When the compiler is looking at some type object, it needs to be able
to ask what its name is.

If we have this situation:

struct foo { int x; };
// foo is now a type, without typedef
typedef foo bar;
// bar is now an alias for foo

When bar is used for defining and declaring, the compiler needs
to know that the underlying type that bar refers to is actually
a structure with name foo: a struct foo.

E.g. for diagnostics. If you reference "a.z" where a has been declared
as a bar, you want something like:

parser.c:13: bar has no member z.
foo.h:5: bar is a typedef for foo, defined here.

If a were declared as foo, that might be:

parser.c:13: foo has no member z.
foo.h:5: foo is defined here.

The idea of a tag in the model itself will not go away so easily;
C just lacks the syntactic sugar to conceal it. The model is explicit
to the programmer. Structures have names, and you have to wire those
names into the regular namespace yourself with typedef, if you
want them there.

Kaz Kylheku

Jun 8, 2021, 12:13:40 PM

You don't want to eliminate struct tags, because a struct type should
know what its name is. It's just a matter of forwarding that
information to the ordinary namespace.

struct foo; // effectively a forward declaration

// foo is now in the regular identifier namespace as if introduced
// by typedef. It refers to an incomplete type.

struct bar {
  foo *pfoo;
};

struct foo {
  bar *pbar;
  struct xyzzy { // xyzzy introduced into surrounding name space
    int plop;
  } x;
};

// foo now refers to a complete type

xyzzy *px; // xyzzy known here

When "struct xyzzy" is introduced in the middle of a struct, the
declared identifier x is entered into that struct as a member, and the
typedef name "xyzzy" is introduced into the closest enclosing ordinary
lexical scope or file scope.

Note that this is different from C++, in which xyzzy would be
foo::xyzzy. Anyone who has ever prepared a C API header with nested
structures for C++ use would have run into this.

In other regards, it resembles the C++ solution.

("struct foo" doesn't introduce a typedef name in C++. It introduces a
class name, which has baggage around it. Class names have linkage and
are subject to the one definition rule and such.)
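
For comparison, this is roughly how the same shape is written in C as it stands today, with the struct keyword at every use. It is only a sketch (link_and_read is a made-up helper), but it shows the two properties discussed above: the forward declaration yields an incomplete type that can already be pointed to, and the nested tag xyzzy ends up visible at file scope rather than being scoped to struct foo.

```c
#include <assert.h>
#include <stddef.h>

struct foo;                 /* forward declaration: incomplete type */

struct bar {
    struct foo *pfoo;       /* pointers to incomplete types are fine */
};

struct foo {
    struct bar *pbar;
    struct xyzzy {          /* in C this tag is not scoped to struct foo */
        int plop;
    } x;
};

struct xyzzy *px;           /* valid C: the tag xyzzy is visible here */

int link_and_read(void)
{
    struct bar b;
    struct foo f;
    int result;

    b.pfoo = &f;
    f.pbar = &b;
    f.x.plop = 42;
    px = &f.x;

    assert(b.pfoo->pbar == &b);   /* the two structs refer to each other */

    result = px->plop;
    px = NULL;                    /* do not keep a pointer to a local */
    return result;
}
```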

Bart

Jun 8, 2021, 12:23:24 PM

It sounds as though you are trying to retrospectively justify C's tag
names, simply because the language has them, and you don't want to admit
that they don't belong.

The fact is that if C had had typedef right from the start, tags would
never have existed, and nobody would have thought it was a good idea to
add them.

You /don't/ need a separate tag namespace where even anonymous structs
live. (If you did, pointers and arrays and function pointers would live
there too.)

If an implementation requires that anonymous structs 'belong' to a
symbol table entry, to match named ones, then an anonymous /typedef/,
not user-accessible, can be created. It doesn't need a separate name-space.

The struct will additionally belong to the collection of user-specified
types, which includes pointers, arrays and function pointers.

Bart

Jun 17, 2021, 1:14:55 PM

On 17/06/2021 17:35, David Brown wrote:
> On 17/06/2021 15:34, Bart wrote:

>> Which hardly anybody uses! Nearly all for-loops I see that aren't
>> clearly an iteration are really just while-loops.
>
> Your opinions as to what you think people use in C are not something I
> consider realistic or valuable.

It isn't what I think, it's what I see.

Not everyone writes code like David Brown. Or me.

> If it is a while loop, write "while". That's what C programmers do.

If only! Every time I look at a for-loop in other people's code, I have
to analyse it to figure out exactly what it is. Here's a case in point:

for (n = num, len = 0; (! len && ! n) || n > 0; len++) {

Guillaume

Jun 17, 2021, 1:30:59 PM

Le 17/06/2021 à 17:22, Andrey Tarasevich a écrit :
> On 6/17/2021 3:29 AM, Bart wrote:
>> On 17/06/2021 03:39, Andrey Tarasevich wrote:

>>> On 6/6/2021 5:16 AM, James Harris wrote:
>>>> Does C, as a language, need to allow its structs to have tags?
>>>>

>>>> So could C's struct tags be omitted from the language? If not, what
>>>> does a struct tag add? Is it something to do with forward
>>>> declarations or syntactic consistency with union tags, etc?
>>>

>>> The tag is absolutely necessary when you want to declare a
>>> self-referential struct type, like a node in a linked list that must
>>> contain a pointer to the same type.
>>>
>>> Just try declaring a linked list node without using a tag. Report to
>>> the forum.
>>>
>>
>> It's not actually that hard, you just use void*, with the odd cast as
>> needed:
>
> As I explicitly stated in the post you were responding to:
>
> > P.S. Of course, you can get around this by using `void *` pointers
> and explicit type casts everywhere, but that would still represent a
> major loss of language functionality.
>
> And that is basically the answer to the original question: C language
> _needs_ struct tags to avoid having to "just use void*, with the odd
> cast as needed". Having to do the latter would, of course, be an
> abomination.

Yup. Using void * can be powerful but bypasses type checking, and not
only that, but you also lose the self-documentation a proper type provides.

Regarding casts, they are usually not needed though if you always assign
the void * pointer (for instance here, the 'next' member) to a pointer
to the struct, and conversely.

So if your pointer to node is 'Node', for instance, something like 'Node
= Node->next' is perfectly valid without any cast. Or 'Node2->next = Node1'.

Casts would be needed if you directly want to access a member from the
'next' pointer, but there are lots of ways to avoid that, so that point
is relatively moot.
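
A small sketch of the trade-off being described, assuming a tagless node with a void * link next to a tagged one. VNode, struct node and the three functions are illustrative names, not code from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Tagless node: the link has to be void * because the type has no
   name yet at the point where the member is declared. */
typedef struct {
    int value;
    void *next;
} VNode;

/* Tagged node: the link can be properly typed. */
struct node {
    int value;
    struct node *next;
};

int sum_untyped(VNode *n)
{
    int total = 0;
    while (n != NULL) {
        total += n->value;
        n = n->next;        /* void * converts implicitly: no cast */
    }
    return total;
}

int second_value_untyped(VNode *n)
{
    assert(n != NULL && n->next != NULL);
    /* Reaching a member directly through 'next' does need a cast. */
    return ((VNode *)n->next)->value;
}

int sum_typed(const struct node *n)
{
    int total = 0;
    for (; n != NULL; n = n->next)
        total += n->value;
    return total;
}
```

With the void * link the compiler would also accept next pointing at something that is not a node at all, which is the loss of type checking mentioned above.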

Kaz Kylheku

Jun 17, 2021, 1:53:27 PM

On 2021-06-17, Bart <b...@freeuk.com> wrote:
> On 17/06/2021 17:22, Kaz Kylheku wrote:
>> On 2021-06-17, Andrey Tarasevich <andreyta...@hotmail.com> wrote:
>
>
>> BartC likes to turn every discussion into some unproductive banter with
>> no useful outcome, about how C could be (or could have been) something else,
>> but without any concrete action plan to do anything about it.
>
> (That is not true. I've created several alternatives over approx 40
> years, for in-house or personal use. Although they weren't really
> alternatives for at least the first 15 years.

That is not doing anything about the problem of what C is, from the
perspective of people who are stuck using it, who come to the newsgroup
to discuss how to use it.

> The problem is no one is interested in alternatives, and I don't think

No, that isn't the problem. People are absolutely keenly interested in
alternatives and using them. Anyone who is using Rust, or C++, or Go
is probably interested in C alternatives.

Some people using C are also interested in C alternatives, but
for whatever pragmatic reasons are using C anyway.

People don't go to a comp.lang.c newsgroup to discuss alternatives
though; we know where to find alternatives and get help.

> anyone here has the power to actually change the language anyway.

If you don't seize the power, you will never have the power.

I believe I have the power to change C; if I really wanted to badly
enough, I could avail myself of that influence.

I would form a plan, identify the roadblocks in it and work on removing
them one by one.

Changing C is a procedure, a task. Every task has a finite number of
steps. First you have to figure out what they are. You might start with
the wrong ones. Once you are on the right track though, it's just a
matter of taking the first step, then the second and so on.

(If the change idea is unreasonable, though, there will likely be an
impassable roadblock. Changes are generally expected to be thoroughly
backward compatible, so that the vast majority of existing programs
compile and have unaltered behavior, and the remainder are only in some
dubious bucket we are willing to throw under the bus.)

> (Besides, with C being so firmly ensconced as part of Unix and Linux, it
> isn't going anywhere.)

That is false; the bleeding edge of C is changing, and the GNU C
Compiler Collection project is actively maintained, cranking out new
releases.

> But what some people might have been able to do far better than me - 20+
> years ago would have been better - is to have created an up-to-date

20 years ago isn't coming back though.

> Instead we end up with ones like Rust and Zig that make things harder
> rather than easier.

These projects have their reasons though. They are real things in the
real world.

You can download them from a repo, build them, and run a test suite.

You can join a mailing list, web forum or IRC channel or whatever and
meet other users.

If you don't think Rust and Zig are the answer, the only way is to
compete with your own, join those projects to steer their direction, or
else forever be complaining that people are making all these various C
replacements, but not doing it how you would like.

Bart

Jun 17, 2021, 4:27:54 PM

On 17/06/2021 18:53, Kaz Kylheku wrote:
> On 2021-06-17, Bart <b...@freeuk.com> wrote:

[C alternatives]

> If you don't think Rust and Zig are the answer, the only way is to
> compete with your own, join those projects to steer their direction, or
> else forever be complaining that people are making all these various C
> replacements, but not doing it how you would like.

It isn't really for me. I'm stuck with using my languages for the same
reason I'd be stuck working as self-employed, if I was still working.

But I constantly see the problems people have with C, not here much,
usenet is pretty much dead, but in places like reddit and others, a lot
of things which would otherwise have been a piece of cake if the
language had done things properly.

I just really feel sorry for them, as I can't help.

Many of these are small things that do not warrant switching to a
totally new and very different language (which would introduce 100 new
problems).

But let's take one example of something posted here, someone wanted to
print this:

printf("UINT64_MAX : %lld\n", UINT64_MAX);

C has made it fashionable to make this stuff much harder than it need be
(Zig is worse than C for printing). The full solution for this was (see
stdint.h thread):

#include <stdio.h> # needed for printf
#include <stdint.h> # needed for UINT64_MAX and for inttypes.h

printf("UINT64_MAX : " PRId64 "\n", UINT64_MAX);

(It will be something along those lines; I'm not going to bother
checking it.)

Basically, they needed to print a number to the console, something that
in the 1970s would have been utterly trivial:

print x

Öö Tiib

Jun 17, 2021, 8:58:14 PM

You did not use %:

printf("%" PRId64 "\n", UINT64_MAX);

Yes, it is not an overly trivial way, but it is trivial enough.
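
For reference, a self-contained version might look like the sketch below. It assumes a hosted C99 implementation. Since UINT64_MAX has an unsigned type, the sketch uses PRIu64 rather than PRId64 (which is meant for int64_t); format_u64 and print_u64_max are made-up helper names.

```c
#include <assert.h>
#include <inttypes.h>   /* PRIu64; also includes <stdint.h> for UINT64_MAX */
#include <stdio.h>

/* Formats value as decimal into buf and returns the number of
   characters written, as snprintf does. */
int format_u64(char *buf, size_t size, uint64_t value)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%" PRIu64, value);
}

/* The format string is assembled from adjacent string literals:
   "UINT64_MAX : %" followed by the PRIu64 expansion followed by "\n". */
void print_u64_max(void)
{
    printf("UINT64_MAX : %" PRIu64 "\n", UINT64_MAX);
}
```

Calling print_u64_max() prints "UINT64_MAX : 18446744073709551615".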

> Basically, they needed to print a number to the console, something that
> in the 1970s would have been utterly trivial:
>
> print x

In the 1970s what is currently called 'UINT64_MAX' was utterly trivially
called 'x'? I think you misremember.

The trouble only starts once you output UINT64_MAX: you will see that
the Excels and LibreOffices into which someone might want to paste it
fail to process it. They fail gracefully and silently by showing wrong
results. So you will need more C and more C.

Bart

Jun 18, 2021, 6:25:45 AM

On 18/06/2021 01:58, Öö Tiib wrote:
> On Thursday, 17 June 2021 at 23:27:54 UTC+3, Bart wrote:

>> But let's take one example of something posted here, someone wanted to
>> print this:
>>
>> printf("UINT64_MAX : %lld\n", UINT64_MAX);
>>
>> C has made it fashionable to make this stuff much harder than it need be
>> (Zig is worse than C for printing). The full solution for this was (see
>> stdint.h thread):
>>
>> #include <stdio.h> # needed for printf
>> #include <stdint.h> # needed for UINT64_MAX and for inttypes.h
>>
>> printf("UINT64_MAX : " PRId64 "\n", UINT64_MAX);
>>
>> (It will be something along those lines; I'm not going to bother
>> checking it.)
>
> You did not use %:
>
> printf("%" PRId64 "\n", UINT64_MAX);
>
> Yes, it is not an overly trivial way, but it is trivial enough.

You helped make my point for me. What a palaver to just print a number!
And so easy to get wrong.

>> Basically, they needed to print a number to the console, something that
>> in the 1970s would have been utterly trivial:
>>
>> print x
>
> In the 1970s what is currently called 'UINT64_MAX' was utterly trivially
> called 'x'? I think you misremember.

UINT64_MAX is just a number. That is not the main issue here. Given the
existence of such a value, you should be able to do:

print UINT64_MAX

plus whatever extra syntax the language requires. #includes, inttypes.h,
"%" and "PRId64" are not mere syntax.

To get on to the MAX part, if I do this in my language, it is just:

println =u64.max

.max can be applied to any integer type or expression. The "=" adds the
label that is present in the C. The output is:

MAX(U64)= 18446744073709551615

> The trouble only starts once you output UINT64_MAX: you will see that
> the Excels and LibreOffices into which someone might want to paste it
> fail to process it. They fail gracefully and silently by showing wrong
> results. So you will need more C and more C.

How is this relevant? A language shouldn't allow you to just write:

print(a)

because some extreme values of a, IF you were to paste them elsewhere,
might cause problems with buggy software? How would adding more C fix that?

When I paste this value, it is just a string anyway. BTW here's the next
one up:

println =u128.max

It shows:

MAX(U128)= 340282366920938463463374607431768211455

A u128 type is merely double the word size of the current crop of 64-bit
machines. A systems language should deal with this stuff effortlessly.

If I paste /that/ value into Google, I get a list of 128-bit resources.

Tim Rentsch

Jun 24, 2021, 1:33:12 PM

John Bode <jfbod...@gmail.com> writes:

[...]

> To reiterate a point I make a lot - typedef on its own creates leaky
> abstractions. If I need to know that I have to use the `.` or `->`
> operator on something, I'd rather have it declared as `struct A foo;` or
> `struct B *ptr;` instead of a type name that doesn't convey struct-ness
> at all. Same reason I don't like it when people hide pointers behind
> typedefs - I once spent half a day chasing my tail because somebody
> created a typedef name for a pointer type and used that as a template
> parameter for a vector in C++, such that when I was using an iterator
> I needed to write
>
> (*it)->do_something();
>
> However, since the typedef name didn't indicate pointer-ness *at all*, I
> wound up writing
>
> it->do_something();
>
> and g++ vomited up hundreds of incomprehensible error messages that
> basically boiled down to "you need to use a * here, dummy".

Sounds to me like the culprit is C++, not typedefs.

Öö Tiib

Jun 24, 2021, 4:46:03 PM

It is the good old issue of three-star programmers (which both C and
C++ share), and in the given case it was additionally obfuscated by a
typedef of a pointer. Most C coding guidelines I've seen advise against
typedefing pointers that are meant to be dereferenced.

John Bode

Jul 1, 2021, 6:58:04 PM

g++ doesn't handle errors in template parameters very well and generates
a *lot* of hard-to-follow error messages for relatively simple mistakes.
And at the time I was still relatively inexperienced with C++, which
didn't help.

But the typedef name (or, more properly, the incomplete and leaky
abstraction introduced by that typedef name) was the actual culprit.
Again, there was nothing to tell me that I was iterating over a vector
of *pointers*, not a vector of instances, which required me to use
different syntax.

With better error messages I would have figured it out in a few minutes
rather than half a day, but it was still time lost to a leaky
abstraction.

That was an egregious case, but far from the only one. Hiding pointers
and struct types behind typedefs without an API to handle pointer and
member selection operations for you is unambiguously bad practice, and
people need to stop doing it.

Tim Rentsch

Jul 11, 2021, 3:26:33 AM

> [...]

I stand by my earlier claim that the culprit here is C++
rather than typedefs.

John Bode

Jul 26, 2021, 10:49:14 AM

I honestly don't know how to make it any clearer. I lost time due to
using the wrong syntax. I was using the wrong syntax because the
information I needed in order to use the right syntax was hidden from
me. That's not a function of it being C++, that's a function of the
abstraction not being complete.

I'm not saying don't use typedef. I'm saying that you shouldn't hide
pointer-ness or struct-ness behind a typedef name *unless* you are
willing to provide an API that hides corresponding operations on those
types as well.

The C standard library doesn't hide pointer-ness behind typedef names
(e.g. the FILE type). You shouldn't either.

If you're going to hide a floating point type behind a typedef name,
don't make me have to hunt for the typedef to know which conversion
specifier I have to use to print it out - provide an API to format it
for me.

If you're going to hide a struct or union type behind a typedef name,
don't make me hunt down the typedef to know how to access the members -
provide an API to do that for me.

If you're going to hide an array type behind a typedef name, don't
make me hunt down the typedef to know I have to use the [] operator
or the size - provide an API to handle that for me.

If you're going to create an abstraction for a type, CREATE A FULL
ABSTRACTION FOR THAT TYPE or don't bother doing it at all.
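
As a rough illustration of what a "full abstraction" could look like in C, here is a sketch of an opaque type in the style of FILE: the typedef hides struct-ness, pointer-ness stays visible at every use, and all member access goes through functions. Every name here (temperature and its functions) is hypothetical; in real code the first part would live in a header and the rest in a single .c file.

```c
#include <assert.h>
#include <stdlib.h>

/* What a public header would expose: an incomplete type and functions. */
typedef struct temperature temperature;

temperature *temperature_create(double celsius);
double       temperature_celsius(const temperature *t);
double       temperature_fahrenheit(const temperature *t);
void         temperature_destroy(temperature *t);

/* What the implementation file would contain: only here is the
   layout of the struct known. */
struct temperature {
    double celsius;
};

temperature *temperature_create(double celsius)
{
    temperature *t = malloc(sizeof *t);
    if (t != NULL)
        t->celsius = celsius;
    return t;               /* NULL if allocation failed */
}

double temperature_celsius(const temperature *t)
{
    assert(t != NULL);
    return t->celsius;
}

double temperature_fahrenheit(const temperature *t)
{
    assert(t != NULL);
    return t->celsius * 9.0 / 5.0 + 32.0;
}

void temperature_destroy(temperature *t)
{
    free(t);
}
```

Callers never need to know whether to write . or ->, because they never touch a member.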

Kaz Kylheku

Jul 27, 2021, 11:47:18 AM

On 2021-07-26, John Bode <jfbod...@gmail.com> wrote:
> If you're going to hide a struct or union type behind a typedef name,
> don't make me hunt down the typedef to know how to access the members -
> provide an API to do that for me.

I don't find that reasonable; it's pretty common to make typedef names
for structures just to have a shorthand for declaring them, without
intending to create a fully opaque type with abstracted operations.

typedef struct version {
  int major;
  int minor;
} version_t;

The typedef is co-located with the struct declaration; your editor
should be able to jump to the definition of version_t which is
the above line.

Your remark makes sense for a typedef name for a pointer to such
a structure. That's often intended to be an abstract handle.

> If you're going to hide an array type behind a typedef name, don't

Array typedefs should only ever be used for breaking up and simplifying
declarations, not as an abstract type in an API.

Hiding arrays behind typedef names is a poor idea because the resulting
"abstraction" still cannot be passed to functions, returned or assigned.

The type still decays into a pointer when used as a parameter type.

typedef int foo_t[42];

void fun(foo_t x) {
  // programmer trap:
  // sizeof x isn't sizeof (foo_t) here!
  foo_t y;
  // sizeof y *is* sizeof (foo_t).
}
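
The decay can also be observed without sizeof. The sketch below, with made-up names (foo_box, clear_first, cleared_copy, demo_decay), contrasts the array typedef with an array wrapped in a struct, which is one common way to get a type that really can be passed, returned and assigned by value.

```c
#include <assert.h>

typedef int foo_t[42];

/* An array wrapped in a struct behaves like an ordinary value. */
typedef struct {
    int items[42];
} foo_box;

/* x is really an int * here, so the caller's array is modified. */
void clear_first(foo_t x)
{
    x[0] = 0;
}

/* box is a copy, so the caller's object is left alone. */
foo_box cleared_copy(foo_box box)
{
    box.items[0] = 0;
    return box;
}

/* Returns 1 if the behaviour described in the comments holds. */
int demo_decay(void)
{
    foo_t a = { 7 };
    foo_box b = { { 7 } };
    foo_box c;

    clear_first(a);          /* a[0] is now 0 */
    c = cleared_copy(b);     /* b.items[0] is still 7, c.items[0] is 0 */

    assert(sizeof (foo_t) == 42 * sizeof (int));
    return a[0] == 0 && b.items[0] == 7 && c.items[0] == 0;
}
```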

Keith Thompson

Jul 27, 2021, 1:56:02 PM

Kaz Kylheku <563-36...@kylheku.com> writes:
> On 2021-07-26, John Bode <jfbod...@gmail.com> wrote:
>> If you're going to hide a struct or union type behind a typedef name,
>> don't make me hunt down the typedef to know how to access the members -
>> provide an API to do that for me.
>
> I don't find that reasonable; it's pretty common to make typedef names
> for structures just to have a shorthand for declaring them, without
> intending to create a fully opaque type with abstracted operations.
>
> typedef struct version {
> int major;
> int minor;
> } version_t;
>
> The typedef is co-located with the struct declaration; your editor
> should be able to jump to the definition of version_t which is
> the above line.

I disagree. I prefer to refer to the type as "struct version" and not
bother with the typedef.

I'm not arguing I'm right and you're wrong. It's just my preference
(for which I've given reasons before).

The point is that there are two major styles for defining struct types,
and every C programmer who works with code written by other people will
need to deal with both of them.

If I did use typedefs, I'd probably use the same identifier for the tag
and the typedef name.

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips

Chris M. Thomasson

Jul 27, 2021, 3:32:21 PM

On 7/27/2021 8:47 AM, Kaz Kylheku wrote:
> On 2021-07-26, John Bode <jfbod...@gmail.com> wrote:
>> If you're going to hide a struct or union type behind a typedef name,
>> don't make me hunt down the typedef to know how to access the members -
>> provide an API to do that for me.
>
> I don't find that reasonable; it's pretty common to make typedef names
> for structures just to have a shorthand for declaring them, without
> intending to create a fully opaque type with abstracted operations.
>
> typedef struct version {
> int major;
> int minor;
> } version_t;
>
> The typedef is co-located with the struct declaration; your editor
> should be able to jump to the definition of version_t which is
> the above line.

[...]

100% pure nitpick... I got a bit scolded one time for using the *_t
suffix in a POSIX environment. It's reserved!

Chris M. Thomasson

Jul 27, 2021, 3:41:23 PM

On 7/27/2021 8:47 AM, Kaz Kylheku wrote:

> On 2021-07-26, John Bode <jfbod...@gmail.com> wrote:
>> If you're going to hide a struct or union type behind a typedef name,
>> don't make me hunt down the typedef to know how to access the members -
>> provide an API to do that for me.
>
> I don't find that reasonable; it's pretty common to make typedef names
> for structures just to have a shorthand for declaring them, without
> intending to create a fully opaque type with abstracted operations.
>
> typedef struct version {
> int major;
> int minor;
> } version_t;
>
> The typedef is co-located with the struct declaration; your editor
> should be able to jump to the definition of version_t which is
> the above line.

[...]

I still like to avoid typedefs from time to time:
__________________________________
#include <stdio.h>

struct version
{
    int major;
    int minor;
};

void
version_output(
    struct version const* const self
) {
    printf("FooProg Version: (%d.%d)\n", self->major, self->minor);
}

static struct version g_version = { 0, 1 };

int main()
{
    version_output(&g_version);

    struct version version = { 0, 2 };
    version_output(&version);

    return 0;
}
__________________________________

Guillaume

Jul 27, 2021, 5:48:15 PM

Le 27/07/2021 à 19:55, Keith Thompson a écrit :
> Kaz Kylheku <563-36...@kylheku.com> writes:
>> On 2021-07-26, John Bode <jfbod...@gmail.com> wrote:
>>> If you're going to hide a struct or union type behind a typedef name,
>>> don't make me hunt down the typedef to know how to access the members -
>>> provide an API to do that for me.
>>
>> I don't find that reasonable; it's pretty common to make typedef names
>> for structures just to have a shorthand for declaring them, without
>> intending to create a fully opaque type with abstracted operations.
>>
>> typedef struct version {
>> int major;
>> int minor;
>> } version_t;
>>
>> The typedef is co-located with the struct declaration; your editor
>> should be able to jump to the definition of version_t which is
>> the above line.
>
> I disagree. I prefer to refer to the type as "struct version" and not
> bother with the typedef.

Okay, then certainly it's a matter of taste, and that can't be discussed.

All I'm seeing though is that your argument could be used for just any
type really. Any typedef "hides" the type. It's made for that. It's an
abstraction. It's the whole point. Are you for typeless languages?

Or, what specifically is your issue with structs that you wouldn't have
with other types? Like, why does it bother you not to directly see
that a type is a struct, while the fact that it's an int, a double, or
an array wouldn't matter to you? In any case, you must know what a given
type is in order to use it properly.

I don't get it. And others don't seem to either. I guess we will never
get it, and that's fine.

Scott Lurndal

Jul 27, 2021, 6:12:16 PM

Consider abstract types such as pid_t or off_t. These allow
the underlying field to change in size (e.g. from unsigned
short to unsigned int) without changing the semantics of
the use of the type. A simple recompile/relink is all that is
necessary to switch to the new definition on a new machine with
different fundamental types.

When you hide a struct behind a typedef, it's not opaque,
as the field names are not generic (there was a time when any
field name could be used with any pointer, but that's long in
the past now).

I generally fall into Keith's camp on this in C, and in C++
I never use typedef to hide a struct (the struct tag is,
in a sense, a typedef in C++).

Bart

Jul 27, 2021, 7:39:10 PM

I would rather see specific types, such as i16 i32 i64.

If they need to vary by platform, then provide specific headers for that
platform, instead of ending up with an unmaintainable, unreadable,
fragile mess by trying to have one header for everything.

Some integer types will depend on the target so, fine, use special types
for those (size_t and intptr_t for example). But DON'T have a dedicated
integer typedef for every field of every struct! Declarations like this
are a joke:

struct stat {
    dev_t     st_dev;     /* ID of device containing file */
    ino_t     st_ino;     /* inode number */
    mode_t    st_mode;    /* protection */
    nlink_t   st_nlink;   /* number of hard links */
    uid_t     st_uid;     /* user ID of owner */
    gid_t     st_gid;     /* group ID of owner */
    dev_t     st_rdev;    /* device ID (if special file) */
    off_t     st_size;    /* total size, in bytes */
    blksize_t st_blksize; /* blocksize for file system I/O */
    blkcnt_t  st_blocks;  /* number of 512B blocks allocated */
    time_t    st_atime;   /* time of last access */
    time_t    st_mtime;   /* time of last modification */
    time_t    st_ctime;   /* time of last status change */
};

I think there are 13 fields and 10 different types! There will be a
section of header where those 10 typedefs are defined differently per
platform; just use that same mechanism to define specific stat structs
instead.

> When you hide a struct behind a typedef, it's not opaque,
> as the field names are not generic (there was a time when any
> field name could be used with any pointer, but that's long in
> the past now).
>
> I generally fall into Keith's camp on this in C, and in C++
> I never use typedef to hide a struct (the struct tag is,
> in a sense, a typedef in C++).

You don't need to hide it:

typedef struct {...} struct_T;

instead of:

struct T {...};

Use typedef and you can be consistent with all other named user types.

Keith Thompson

Jul 27, 2021, 8:14:27 PM

As I said, I've discussed my reasoning here before, but I don't mind
doing so again.

For me, the point is that "struct version" already has a perfectly good
name: "struct version". I see no great value in giving it another name,
such as "version_t", or even "version". It introduces potential
confusion with no particular benefit.

FILE is a great example of an appropriate use of a typedef. It's
required to be an object type, but code that uses it doesn't and
shouldn't care that it's a struct, and shouldn't attempt to refer to its
members, if any. But code that uses struct version needs to know that
it's a struct, and needs to be able to refer to its members.

Typedefs are also a good way to define a name for an integer type where
code that uses it doesn't need to know which underlying type it is.
They can provide information about what the type is used for (size_t,
for example). A struct tag can provide that same information.

Historically, typedefs were a relatively late addition to C.
In pre-typedef C, a type name could never be just an identifier. Type
names (unlike in some languages) were always constructed using syntax
involving some combination of keywords, symbols, and identifiers.
It's possible but not practical to give *all* types names consisting of
a single identifier. I see no great advantage in doing so just for
struct types.

Having said that, if I were working on a C project that had a convention
of using typedefs for struct types, I would follow that convention
without hesitation. If there were inconsistent rules about the
relationship between tag names and typedef names, I might raise that as
a concern and suggest using the same identifier for both. And in
$DAYJOB I work in C++, which doesn't have this issue (defining "struct
foo" creates a type named "foo", and I don't call it "struct foo",
because C++ is a different language with different rules).

Routinely using typedefs for struct types strikes me as a way of
papering over an aspect of the C language that some people dislike.
Since I personally don't dislike it, I don't bother unless there's a
good reason to do so.

Keith Thompson

Jul 27, 2021, 9:31:09 PM

Bart <b...@freeuk.com> writes:
> On 27/07/2021 23:12, Scott Lurndal wrote:

[...]

>> Consider abstract types such as pid_t or off_t. These allow
>> the underlying field to change in size (e.g. from unsigned
>> short to unsigned int) without changing the semantics of
>> the use of the type. A simple recompile/relink is all that is
>> necessary to switch to the new definition on a new machine with
>> different fundamental types.
>
> I would rather see specific types, such as i16 i32 i64.

C has those in <stdint.h> (though not by those names).

> If they need to vary by platform, then provide specific headers for
> that platform, instead of ending up with an unmaintainable,
> unreadable, fragile mess by trying to have one header for everything.

The audience for most system headers is the compiler, not the
programmer. I don't need to look at /usr/include/stdint.h to understand
what int32_t means.

> Some integer types will depend on the target so, fine, use special
> types for those (size_t and intptr_t for example). But DON'T have a
> dedicated integer typedef for every field of every struct!
> Declarations like this are a joke:
>
> struct stat {
> dev_t st_dev; /* ID of device containing file */
> ino_t st_ino; /* inode number */
> mode_t st_mode; /* protection */
> nlink_t st_nlink; /* number of hard links */
> uid_t st_uid; /* user ID of owner */
> gid_t st_gid; /* group ID of owner */
> dev_t st_rdev; /* device ID (if special file) */
> off_t st_size; /* total size, in bytes */
> blksize_t st_blksize; /* blocksize for file system I/O */
> blkcnt_t st_blocks; /* number of 512B blocks allocated */
> time_t st_atime; /* time of last access */
> time_t st_mtime; /* time of last modification */
> time_t st_ctime; /* time of last status change */
> };

Would you insist on using the [u]intN_t types for all those members?
And if the requirements for a field vary from one system to another,
what then? Would st_dev and st_ino be the same type on one platform and
different types on another?

> I think there are 13 fields and 10 different types! There will be a
> section of header where those 10 typedefs are defined differently per
> platform; just use that same mechanism to define specific stat structs
> instead.

Are you saying I'd have to use
#include <sys/foostat.h>
on one system and
#include <sys/barstat.h>
on another? I don't think that's really what you meant.
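
To illustrate how portable code copes with those per-field typedefs without knowing what they expand to, here is a sketch that assumes a POSIX system. print_stat_summary is a made-up name; the casts to intmax_t and uintmax_t are one common way to print such values without guessing the underlying type.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Prints a few fields for 'path'. Returns 0 on success, -1 if stat() fails. */
int print_stat_summary(const char *path)
{
    struct stat sb;         /* "struct stat" the type, "stat" the function */

    assert(path != NULL);
    if (stat(path, &sb) != 0)
        return -1;

    /* The same source works whether off_t, ino_t and nlink_t are
       32 or 64 bits wide on the platform at hand. */
    printf("size:  %jd bytes\n", (intmax_t)sb.st_size);
    printf("inode: %ju\n", (uintmax_t)sb.st_ino);
    printf("links: %ju\n", (uintmax_t)sb.st_nlink);
    return 0;
}
```

The include line and the member names stay the same on every conforming system; only the typedefs behind them vary.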

>> When you hide a struct behind a typedef, it's not opaque,
>> as the field names are not generic (there was a time when any
>> field name could be used with any pointer, but that's long in
>> the past now).
>> I generally fall into Keith's camp on this in C, and in C++
>> I never use typedef to hide a struct (the struct tag is,
>> in a sense, a typedef in C++).
>
> You don't need to hide it:
>
> typedef struct {...} struct_T;
>
> instead of:
>
> struct T {...};
>
> Use typedef and you can be consistent with all other named user types.

As you know, that doesn't allow for the very common case of a struct
that contains a pointer to itself. And "struct_T" is (very slightly)
more difficult to type than "struct T".
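
A short sketch of that limitation and of the usual workaround, which reuses one identifier for both the tag and the typedef name. The names node, push_front and list_length are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Not possible without a tag: inside the braces the typedef name
   has not been declared yet.

   typedef struct {
       int value;
       struct_T *next;     // error: unknown type name 'struct_T'
   } struct_T;
*/

/* With a tag, and the same identifier reused as the typedef name. */
typedef struct node node;

struct node {
    int   value;
    node *next;             /* the typedef name is already declared */
};

/* Links item in front of head and returns the new head. */
node *push_front(node *head, node *item)
{
    assert(item != NULL);
    item->next = head;
    return item;
}

size_t list_length(const node *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```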

Most "named user types" are structs. I use a consistent naming scheme
for them. I don't give separate names to pointer or array types, and
most named integer types that I use are defined in standard or system
headers.

Kaz Kylheku

Jul 27, 2021, 10:01:50 PM

On 2021-07-27, Keith Thompson <Keith.S.T...@gmail.com> wrote:
> Kaz Kylheku <563-36...@kylheku.com> writes:
>> On 2021-07-26, John Bode <jfbod...@gmail.com> wrote:
>>> If you're going to hide a struct or union type behind a typedef name,
>>> don't make me hunt down the typedef to know how to access the members -
>>> provide an API to do that for me.
>>
>> I don't find that reasonable; it's pretty common to make typedef names
>> for structures just to have a shorthand for declaring them, without
>> intending to create a fully opaque type with abstracted operations.
>>
>> typedef struct version {
>> int major;
>> int minor;
>> } version_t;
>>
>> The typedef is co-located with the struct declaration; your editor
>> should be able to jump to the definition of version_t which is
>> the above line.
>
> I disagree. I prefer to refer to the type as "struct version" and not
> bother with the typedef.
>
> I'm not arguing I'm right and you're wrong. It's just my preference
> (for which I've given reasons before).

Right, of course; but all I'm saying is that there exists a certain
coding style with those typedefs, in which the typedefs do *not* signify
"I am an opaque type accessed only by an API".

Keith Thompson

Jul 27, 2021, 10:07:21 PM

Kaz Kylheku <563-36...@kylheku.com> writes:
> On 2021-07-27, Keith Thompson <Keith.S.T...@gmail.com> wrote:
>> Kaz Kylheku <563-36...@kylheku.com> writes:
>>> On 2021-07-26, John Bode <jfbod...@gmail.com> wrote:
>>>> If you're going to hide a struct or union type behind a typedef name,
>>>> don't make me hunt down the typedef to know how to access the members -
>>>> provide an API to do that for me.
>>>
>>> I don't find that reasonable; it's pretty common to make typedef names
>>> for structures just to have a shorthand for declaring them, without
>>> intending to create a fully opaque type with abstracted operations.
>>>
>>> typedef struct version {
>>> int major;
>>> int minor;
>>> } version_t;
>>>
>>> The typedef is co-located with the struct declaration; your editor
>>> should be able to jump to the definition of version_t which is
>>> the above line.
>>
>> I disagree. I prefer to refer to the type as "struct version" and not
>> bother with the typedef.
>>
>> I'm not arguing I'm right and you're wrong. It's just my preference
>> (for which I've given reasons before).
>
> Right, of course; but all I'm saying is that there exists a certain
> coding style with those typedefs, in which the typedefs do *not* signify
> "I am an opaque type accessed only by an API".

Agreed.
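
The non-opaque typedef style agreed on here can be sketched with the `version` struct from the quoted post. The function `version_cmp` is a hypothetical addition for illustration; it reads the members directly and takes one argument by typedef name and one by tag, since both spell the same type.

```c
/* The struct and its typedef declared together, as in the quoted post. */
typedef struct version {
    int major;
    int minor;
} version_t;

/* Compare two versions: negative, zero, or positive, in the style of strcmp.
   No accessor API is involved; the members are used directly. */
int version_cmp(const version_t *a, const struct version *b)
{
    if (a->major != b->major)
        return a->major < b->major ? -1 : 1;
    if (a->minor != b->minor)
        return a->minor < b->minor ? -1 : 1;
    return 0;
}
```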

Bart

Jul 28, 2021, 8:03:04 AM

On 28/07/2021 02:30, Keith Thompson wrote:

> Bart <b...@freeuk.com> writes:

>> I think there are 13 fields and 10 different types! There will be a
>> section of header where those 10 typedefs are defined differently per
>> platform; just use that same mechanism to define specific stat structs
>> instead.
>
> Are you saying I'd have to use
> #include <sys/foostat.h>
> on one system and
> #include <sys/barstat.h>
> on another? I don't think that's really what you meant.

No, you'd use:

#include <sys/stat.h>

If you compile on platform foo, the contents of stat.h will be relevant
to foo; on bar, relevant to bar. You don't have one stat.h containing
declarations for a dozen irrelevant platforms to the one you're on.

If you need to cross-compile for foo or bar on X, then you give the
compiler suitable options as to where to look for system headers. (I
guess the same as happens when you give -m32 or -m64 options.) Or you
just have separate compiler installations.

>> You don't need to hide it:
>>
>> typedef struct {...} struct_T;
>>
>> instead of:
>>
>> struct T {...};
>>
>> Use typedef and you can be consistent with all other named user types.
>
> As you know, that doesn't allow for the very common case of a struct
> that contains a pointer to itself.

I think that's covered with:

typedef struct T {...; struct T*...} struct_T;

> And "struct_T" is (very slightly)
> more difficult to type than "struct T".

That was a suggestion. Other hints can be used. For example I commonly
use Trec, or in the past, rT (eg. rsystemtime).

>
> Most "named user types" are structs.

Apart from off_t, clock_t and friends!

Scott Lurndal

Jul 28, 2021, 10:06:44 AM

Keith Thompson <Keith.S.T...@gmail.com> writes:
>Bart <b...@freeuk.com> writes:
>> On 27/07/2021 23:12, Scott Lurndal wrote:
>[...]

>

>> Some integer types will depend on the target so, fine, use special
>> types for those (size_t and intptr_t for example). But DON'T have a
>> dedicated integer typedef for every field of every struct!
>> Declarations like this are a joke:
>>
>> struct stat {
>> dev_t st_dev; /* ID of device containing file */
>> ino_t st_ino; /* inode number */
>> mode_t st_mode; /* protection */
>> nlink_t st_nlink; /* number of hard links */
>> uid_t st_uid; /* user ID of owner */
>> gid_t st_gid; /* group ID of owner */
>> dev_t st_rdev; /* device ID (if special file) */
>> off_t st_size; /* total size, in bytes */
>> blksize_t st_blksize; /* blocksize for file system I/O */
>> blkcnt_t st_blocks; /* number of 512B blocks allocated */
>> time_t st_atime; /* time of last access */
>> time_t st_mtime; /* time of last modification */
>> time_t st_ctime; /* time of last status change */
>> };
>
>Would you insist on using the [u]intN_t types for all those members?

We learned thirty years ago that using sized integer types instead
of abstract types in struct stat (and other application<->OS APIs)
was a very, very, very, very bad idea.

Bart has this bugaboo about having to track down the final definitions
of these abstract types, which is, frankly, ridiculous and short-sighted.

In any case, a simple gcc -E will provide the exact line and exact
source file name for any abstract type defined by the implementation
easily and simply.

$ (cd ~; cc -E a.c | grep ino_t)
typedef unsigned long int __ino_t;
typedef __ino_t ino_t;
__ino_t st_ino;

Keith Thompson

Jul 28, 2021, 10:20:09 AM

Bart <b...@freeuk.com> writes:
> On 28/07/2021 02:30, Keith Thompson wrote:
>> Bart <b...@freeuk.com> writes:
>>> I think there are 13 fields and 10 different types! There will be a
>>> section of header where those 10 typedefs are defined differently per
>>> platform; just use that same mechanism to define specific stat structs
>>> instead.
>> Are you saying I'd have to use
>> #include <sys/foostat.h>
>> on one system and
>> #include <sys/barstat.h>
>> on another? I don't think that's really what you meant.
>
> No, you'd use:
>
> #include <sys/stat.h>

Right, that's what I do now.

> If you compile on platform foo, the contents of stat.h will be
> relevant to foo; on bar, relevant to bar. You don't have one stat.h
> containing declarations for a dozen irrelevant platforms to the one
> you're on.
>
> If you need to cross-compile for foo or bar on X, then you give the
> compiler suitable options as to where to look for system headers. (I
> guess the same as happens when you give -m32 or -m64 options.) Or you
> just have separate compiler installations.

The approach you suggest is perfectly possible and permitted by the
language. Your complaint isn't about C; it's about how implementers
choose to write their headers.

Personally, it wouldn't affect me, since when I use <sys/stat.h> I
usually consult the relevant documentation, not the contents of the
header file.

I suggest that providing a single header makes maintenance easier. If
an implementer chooses to maintain separate "foo" and "bar" versions of
stat.h, they're likely to generate them from a common source anyway.
It's easy enough to use the preprocessor to generate the different
versions.

>>> You don't need to hide it:
>>>
>>> typedef struct {...} struct_T;
>>>
>>> instead of:
>>>
>>> struct T {...};
>>>
>>> Use typedef and you can be consistent with all other named user types.
>> As you know, that doesn't allow for the very common case of a struct
>> that contains a pointer to itself.
>
> I think that's covered with:
>
> typedef struct T {...; struct T*...} struct_T;

Yes, it is, and that's what's often done in practice. And if you want to
do it that way, you certainly can. I suggest that just using "struct T"
and dropping the typedef is simpler.

>> And "struct_T" is (very slightly)
>> more difficult to type than "struct T".
>
> That was a suggestion. Other hints can be used. For example I commonly
> use Trec, or in the past, rT (eg. rsystemtime).

Why are "hints" needed? Why use different identifiers for the tag and
the typedef? There are probably some reasons to do that, but *if*
you're going to use typedefs, it's simpler to write:

typedef struct T { ... } T;
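
Tags and typedef names live in separate name spaces, which is why the same identifier can serve as both. A minimal sketch, assuming a hypothetical tree node called `T` with a helper `T_depth`; declaring the typedef before the struct body also makes the short name usable inside the struct:

```c
#include <stddef.h>

typedef struct T T;          /* T now names the (still incomplete) struct T */

struct T {
    int id;
    T *parent;               /* the typedef name works here ... */
    struct T *first_child;   /* ... and so does the tag */
};

/* Count the links from a node up to the root (a root has depth 0). */
int T_depth(const T *node)
{
    int depth = 0;
    while (node != NULL && node->parent != NULL) {
        node = node->parent;
        depth++;
    }
    return depth;
}
```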

>> Most "named user types" are structs.
>
> Apart from off_t, clock_t and friends!

As I wrote, and you snipped (emphasis added):

Most "named user types" are structs. I use a consistent naming
scheme for them. I don't give separate names to pointer or array
types, **and most named integer types that I use are defined in
standard or system headers**.

Bart

Jul 28, 2021, 10:45:24 AM

Here's my definition of struct stat as used in stat.h for my Win64 C
compiler:

struct _stat {
unsigned int st_dev;
unsigned short st_ino;
unsigned short st_mode;
short st_nlink;
short st_uid;
short st_gid;
unsigned long st_rdev;
unsigned int st_size;
unsigned long long int st_atime;
unsigned long long int st_mtime;
unsigned long long int st_ctime;
};

This doesn't tell you exactly what short, long etc are, but neither does
your grep example.

But since this is for Windows, that specifies short, int, long, long
long as 16, 32, 32 and 64 bits respectively.

(BTW there are 2 bytes of padding between st_gid and st_rdev fields. You
can determine this by carefully counting, but only because the field
sizes are much easier to determine.)
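
The padding claim can also be checked mechanically with `offsetof` rather than by counting. The sketch below is an assumption-laden stand-in, not the real header: the struct `stat_sketch` and its `sk_` member names are hypothetical, and the Windows sizes stated above (short 16, int 32, long 32, long long 64 bits) are spelled out with fixed-width types so the layout is the same on other hosts with natural alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the struct above, using fixed-width types
   in place of short/int/long/long long as sized on Windows. */
struct stat_sketch {
    uint32_t sk_dev;
    uint16_t sk_ino;
    uint16_t sk_mode;
    int16_t  sk_nlink;
    int16_t  sk_uid;
    int16_t  sk_gid;
    uint32_t sk_rdev;    /* needs 4-byte alignment, so padding precedes it */
    uint32_t sk_size;
    uint64_t sk_atime;
    uint64_t sk_mtime;
    uint64_t sk_ctime;
};

/* Bytes of padding the compiler inserts between sk_gid and sk_rdev. */
size_t padding_before_rdev(void)
{
    return offsetof(struct stat_sketch, sk_rdev)
         - (offsetof(struct stat_sketch, sk_gid) + sizeof(int16_t));
}
```

On an implementation where `uint32_t` is 4-byte aligned, `padding_before_rdev()` returns 2, matching the count given above.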

Below are versions from stat.h belonging to Windows SDK. Even though
they are already specific, they still make extensive use of typedefed names!

You can imagine it's rather puzzling figuring out which one I ought to
be using, if I wanted to use some of those functions via an FFI, even
before I need to determine what the types mean.

(I wonder what the purpose of __time32_t is, if this is supposed to be a
32-bit field anyway?)

struct _stat32
{
_dev_t st_dev;
_ino_t st_ino;
unsigned short st_mode;
short st_nlink;
short st_uid;
short st_gid;
_dev_t st_rdev;
_off_t st_size;
__time32_t st_atime;
__time32_t st_mtime;
__time32_t st_ctime;
};

struct _stat32i64
{
_dev_t st_dev;
_ino_t st_ino;
unsigned short st_mode;
short st_nlink;
short st_uid;
short st_gid;
_dev_t st_rdev;
__int64 st_size;
__time32_t st_atime;
__time32_t st_mtime;
__time32_t st_ctime;
};

struct _stat64i32
{
_dev_t st_dev;
_ino_t st_ino;
unsigned short st_mode;
short st_nlink;
short st_uid;
short st_gid;
_dev_t st_rdev;
_off_t st_size;
__time64_t st_atime;
__time64_t st_mtime;
__time64_t st_ctime;
};

struct _stat64
{
_dev_t st_dev;
_ino_t st_ino;
unsigned short st_mode;
short st_nlink;
short st_uid;
short st_gid;
_dev_t st_rdev;
__int64 st_size;
__time64_t st_atime;
__time64_t st_mtime;
__time64_t st_ctime;
};

#define __stat64 _stat64 // For legacy compatibility

#if defined(_CRT_INTERNAL_NONSTDC_NAMES) && _CRT_INTERNAL_NONSTDC_NAMES && !defined _CRT_NO_TIME_T
struct stat
{
_dev_t st_dev;
_ino_t st_ino;
unsigned short st_mode;
short st_nlink;
short st_uid;
short st_gid;
_dev_t st_rdev;
_off_t st_size;
time_t st_atime;
time_t st_mtime;
time_t st_ctime;
};
#endif

Scott Lurndal

Jul 28, 2021, 11:22:08 AM

You are willfully ignoring what I wrote.

The point is that _nobody_ cares what the underlying type
of ino_t is.

The point is that we learned the hard way that your solution
isn't future-safe.

The point is that 'struct stat' is a data structure defined
by standards that mandate the member data types.

How many customers use your Win64 C compiler for anything,
particularly for production code? How many different computer
architectures does your C compiler support?

Bart

Jul 28, 2021, 12:50:08 PM

On 28/07/2021 16:21, Scott Lurndal wrote:

> Bart <b...@freeuk.com> writes:

>> Here's my definition of struct stat as used in stat.h for my Win64 C
>> compiler:
>
> You are willfully ignoring what I wrote.
>
> The point is that _nobody_ cares what the underlying type
> of ino_t is.

I do. Or more likely types such as off_t , or off64_t which are also
used in applications. (Wouldn't off_t just be i64?)

> The point is that we learned the hard way that your solution
> isn't future-safe.

It's not my language that is stuck at int = 32 bits, while everything
these days is 64 bits.

> The point is that 'struct stat' is a data structure defined
> by standards that mandate the member data types.

What standards? I showed half-a-dozen different versions! Two more below...

> How many customers use your Win64 C compiler for anything,
> particularly for production code? How many different computer
> architectures does your C compiler support?

This is exactly my point. A compiler installation can exist for just one
platform. This goes for Windows compilers such as Pelles C (2 targets),
DMC, lccwin (2 targets), Tcc (2 targets).

The 2 targets in each case are Win32 and Win64. There only really need
to be two targets, in the case of struct stat, because of NOT
future-proofing sufficiently: the needs for file and volume sizes beyond
32-bit capacity were clear 30 years ago.

A product like gcc is huge (like 100,000 development files); having
dedicated definitions for a handful of system header files is
insignificant compared with the dedicated files needed for all its
various targets.

BTW here are two more struct stat definitions to add to the mix, this time
from DMC (the one that ONLY needs to support Win32); ghastly isn't it?

Suppose your job is to call one of the myriad *stat() functions from a
non-C language, to find out some info about a file; you'll have your
work cut out. (Actually I don't think I ever bothered trying to call it.)

#if !defined(_STYPES)

#define _ST_FSTYPSZ 16

/* SVR4 stat */

struct stat {
dev_t st_dev;
long st_pad1[3];
ino_t st_ino;
mode_t st_mode;
nlink_t st_nlink;
uid_t st_uid;
gid_t st_gid;
dev_t st_rdev;
long st_pad2[2];
off_t st_size;
long st_pad3;
union
{
time_t st__sec;
timestruc_t st__tim;
} st_atim,
st_mtim,
st_ctim;
long st_blksize;
long st_blocks;
char st_fstype[_ST_FSTYPSZ];
int st_aclcnt;
level_t st_level;
ulong_t st_flags;
lid_t st_cmwlevel;
long st_pad4[4];
};

#define st_atime st_atim.st__sec
#define st_mtime st_mtim.st__sec
#define st_ctime st_ctim.st__sec

#else /* !defined(_STYPES) */

/* SVID 2 stat */

struct stat {
o_dev_t st_dev;
o_ino_t st_ino;
o_mode_t st_mode;
o_nlink_t st_nlink;
o_uid_t st_uid;
o_gid_t st_gid;
o_dev_t st_rdev;

off_t st_size;
time_t st_atime;
time_t st_mtime;
time_t st_ctime;
};

int __cdecl stat(const char *,struct stat *);
int __cdecl fstat(int,struct stat *);

#if !defined(_POSIX_SOURCE)
int __cdecl lstat(const char *, struct stat *);
int __cdecl mknod(const char *, mode_t, dev_t);
#endif

#endif /* !defined(_STYPES) */

Scott Lurndal

Jul 28, 2021, 1:21:13 PM

Bart <b...@freeuk.com> writes:
>On 28/07/2021 16:21, Scott Lurndal wrote:
>> Bart <b...@freeuk.com> writes:
>
>>> Here's my definition of struct stat as used in stat.h for my Win64 C
>>> compiler:
>>
>> You are willfully ignoring what I wrote.
>>
>> The point is that _nobody_ cares what the underlying type
>> of ino_t is.
>
>I do. Or more likely types such as off_t , or off64_t which are also
>used in applications. (Wouldn't off_t just be i64?)

But you are a minority of one. Why should the world cater to
your whims?

(And no, it wouldn't be i64; some implementations may not support
64-bit integers, others may prefer an unsigned definition (although
in this case, posix has specific requirements that off_t be a signed
integer type). All the programmer needs to know is that it is a
signed integer type.)

>
>> The point is that we learned the hard way that your solution
>> isn't future-safe.
>
>It's not my language that is stuck at int = 32 bits, while everything
>these days is 64 bits.
>
>> The point is that 'struct stat' is a data structure defined
>> by standards that mandate the member data types.
>
>What standards? I showed half-a-dozen different versions! Two more below...
>

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_types.h.html

>
>This is exactly my point. A compiler installation can exist for just one
>platform. This goes for Windows compilers such as Pelles C (2 targets),
>DMC, lccwin (2 targets), Tcc (2 targets).

So what? Programmers generally[*] don't write code for a particular compiler or platform,
they write code to the language specification to solve some problem, which is independent
of any particular compiler implementation. Portability of C code is what
has built the internet as we know it.

And for other languages, the language provides the binding, for
example:

http://www.nongnu.org/posix90/#SEC7

[*] exceptions exist for OS, firmware and to a much lesser extent,
compiler programmers.

Bart

Jul 28, 2021, 6:41:16 PM

On 28/07/2021 18:21, Scott Lurndal wrote:
> Bart <b...@freeuk.com> writes:
>> On 28/07/2021 16:21, Scott Lurndal wrote:
>>> Bart <b...@freeuk.com> writes:
>>
>>>> Here's my definition of struct stat as used in stat.h for my Win64 C
>>>> compiler:
>>>
>>> You are willfully ignoring what I wrote.
>>>
>>> The point is that _nobody_ cares what the underlying type
>>> of ino_t is.
>>
>> I do. Or more likely types such as off_t , or off64_t which are also
>> used in applications. (Wouldn't off_t just be i64?)
>
> But you are a minority of one. Why should the world cater to
> your whims?

Because it makes sense? These types are long overdue for a clearout.

> (And no, it wouldn't be i64;

I meant why wouldn't off64_t be just i64. Having off64_t be uint16_t or
whatever would be perverse.

> some implementations may not support
> 64-bit integers, others may prefer an unsigned definition (although
> in this case, posix has specific requirements that off_t be a signed
> integer type). All the programmer needs to know is that it is a
> signed integer type.)

I want to call a C function via an FFI from a language that I've
devised. The public API says one parameter type is off_t.

Which type matches that in my language? I have to choose from i8-i64 and
u8-u64.

The kind of answer I want is (a) 'It uses i64' etc; or (b) 'It uses i32
or u64 depending on platform'.

Not keeping the answer buried for decades under a dozen typedefs,
dependent on a dozen conditional macros, dependent on the C compilers
used [what C compiler? I might not have one] and hidden in include files
6 levels deep, scattered across multiple directories, to make it as hard
as possible to find out.

I think the WINAPI also defines far too many different types, but at
least they are better documented:

https://docs.microsoft.com/en-us/windows/win32/winprog/windows-data-types

You can find all the information needed here, to be able to use these
types from a language that is not C.

>>
>>> The point is that we learned the hard way that your solution
>>> isn't future-safe.
>>
>> It's not my language that is stuck at int = 32 bits, while everything
>> these days is 64 bits.
>>
>>> The point is that 'struct stat' is a data structure defined
>>> by standards that mandate the member data types.
>>
>> What standards? I showed half-a-dozen different versions! Two more below...
>>
>
> https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_types.h.html

The one that comes with gcc/tdm only defines half a dozen of those (but
takes 115 lines to do so!). Others I've seen define 3 or 4.

>>
>> This is exactly my point. A compiler installation can exist for just one
>> platform. This goes for Windows compilers such as Pelles C (2 targets),
>> DMC, lccwin (2 targets), Tcc (2 targets).
>
> So what? Programmers generally[*] don't write code for a particular compiler or platform,
> they write code to the language specification to solve some problem, which is independent
> of any particular compiler implementation. Portability of C code is what
> has built the internet as we know it.
>
> And for other languages, the language provides the binding, for
> example:
>
> http://www.nongnu.org/posix90/#SEC7

This doesn't do struct stat. But supposing it did, how did /they/ pick
up the necessary information? Although they seem to have clocked that
'clock_t' is an integer type, rather than float, something else that C
prefers to keep under wraps.

Keith Thompson

Jul 28, 2021, 7:07:48 PM

Bart <b...@freeuk.com> writes:
> On 28/07/2021 18:21, Scott Lurndal wrote:
>> Bart <b...@freeuk.com> writes:

[...]

> I want to call a C function via an FFI from a language that I've
> devised. The public API says one parameter type is off_t.
>
> Which type matches that in my language? I have to choose from i8-i64
> and u8-u64.
>
> The kind of answer I want is (a) 'It uses i64' etc; or (b) 'It uses
> i32 or u64 depending on platform'.

POSIX says that off_t is a signed integer type.

Is there some reason that compiling and running this program on the
target system doesn't solve your problem?

#include <stdio.h>
#include <limits.h>
int main(void) {
    printf("off_t is %c%zu\n",
           (off_t)-1 < (off_t)0 ? 'i' : 'u',
           CHAR_BIT * sizeof (off_t));
}

> Not keeping the answer buried for decades under a dozen typedefs,
> dependent on a dozen conditional macros, dependent on the C compilers
> used [what C compiler? I might not have one] and hidden in include
> files 6 levels deep, scattered across multiple directories, to make it
> as hard as possible to find out.

Yeah, you don't need to do any of that.

[...]

>>> This is exactly my point. A compiler installation can exist for just one
>>> platform. This goes for Windows compilers such as Pelles C (2 targets),
>>> DMC, lccwin (2 targets), Tcc (2 targets).
>> So what? Programmers generally[*] don't write code for a
>> particular compiler or platform,
>> they write code to the language specification to solve some problem, which is independent
>> of any particular compiler implementation. Portability of C code is what
>> has built the internet as we know it.
>> And for other languages, the language provides the binding, for
>> example:
>> http://www.nongnu.org/posix90/#SEC7
>
> This doesn't do struct stat. But supposing it did, how did /they/ pick
> up the necessary information? Although they seem to have clocked that
> 'clock_t' is an integer type, rather than float, something else that C
> prefers to keep under wraps.

POSIX says that clock_t can be either integer or floating-point. You
can use a method similar to what I wrote above to find out what it is in
a particular implementation (without reading the header files).

Bart

Jul 28, 2021, 9:23:16 PM

On 29/07/2021 00:07, Keith Thompson wrote:
> Bart <b...@freeuk.com> writes:
>> On 28/07/2021 18:21, Scott Lurndal wrote:
>>> Bart <b...@freeuk.com> writes:
>
> [...]
>
>> I want to call a C function via an FFI from a language that I've
>> devised. The public API says one parameter type is off_t.
>>
>> Which type matches that in my language? I have to choose from i8-i64
>> and u8-u64.
>>
>> The kind of answer I want is (a) 'It uses i64' etc; or (b) 'It uses
>> i32 or u64 depending on platform'.
>
> POSIX says that off_t is a signed integer type.
>
> Is there some reason that compiling and running this program on the
> target system doesn't solve your problem?
>
> #include <stdio.h>
> #include <limits.h>
> int main(void) {
> printf("off_t is %c%zu\n",
> (off_t)-1 < (off_t)0 ? 'i' : 'u',
> CHAR_BIT * sizeof (off_t));
> }

Sure, I can do that (after fixing the code with the right includes and
avoiding %zu). But that's the trial and error approach.

In the end I might end up with a list of such types for my platform, and
what concrete types they actually are.

But, why doesn't such a list already exist anyway? It's not as though my
platform is a rare, obscure one out of hundreds; it's Windows on x64.
The other two I might be interested in are Linux on x64 and on arm64.

>>> http://www.nongnu.org/posix90/#SEC7
>>
>> This doesn't do struct stat. But supposing it did, how did /they/ pick
>> up the necessary information? Although they seem to have clocked that
>> 'clock_t' is an integer type, rather than float, something else that C
>> prefers to keep under wraps.
>
> POSIX says that clock_t can be either integer or floating-point. You
> can use a method similar to what I wrote above to find out what it is in
> a particular implementation (without reading the header files).

Actually I don't know what type Fortran is trying to represent; it might
not be clock_t; the info at that link is incomplete, and rather poor.

Keith Thompson

Jul 28, 2021, 10:20:43 PM

Bart <b...@freeuk.com> writes:
> On 29/07/2021 00:07, Keith Thompson wrote:
>> Bart <b...@freeuk.com> writes:
>>> On 28/07/2021 18:21, Scott Lurndal wrote:
>>>> Bart <b...@freeuk.com> writes:
>> [...]
>>
>>> I want to call a C function via an FFI from a language that I've
>>> devised. The public API says one parameter type is off_t.
>>>
>>> Which type matches that in my language? I have to choose from i8-i64
>>> and u8-u64.
>>>
>>> The kind of answer I want is (a) 'It uses i64' etc; or (b) 'It uses
>>> i32 or u64 depending on platform'.
>> POSIX says that off_t is a signed integer type.
>> Is there some reason that compiling and running this program on the
>> target system doesn't solve your problem?
>> #include <stdio.h>
>> #include <limits.h>
>> int main(void) {
>> printf("off_t is %c%zu\n",
>> (off_t)-1 < (off_t)0 ? 'i' : 'u',
>> CHAR_BIT * sizeof (off_t));
>> }
>
> Sure, I can do that (after fixing the code with the right includes and
> avoiding %zu). But that's the trial and error approach.

%zu is the correct format specifier. I'm not even going to ask why
you'd want to avoid it.

Yes, I probably should have used `#include <sys/types.h>`.

No, it's not trial and error. Write the program once, compile and run
it on each platform of interest. Expand the program as needed to show
the characteristics of all the types you're interested in. Tweak the
output any way you like to suit your purposes. Make it generate source
code in your personal language if you like.
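
An expanded probe along those lines might look like the sketch below. It assumes a POSIX system (`<sys/types.h>` for `off_t`); the macro and function names are hypothetical, and the `'f'` label for floating-point types is an addition to the `i`/`u` naming used in this thread. The width is cast to `unsigned` and printed with `%u`, so the sketch does not depend on `%zu`.

```c
#include <limits.h>
#include <stdio.h>
#include <sys/types.h>   /* POSIX: declares off_t (assumed available) */
#include <time.h>        /* clock_t, time_t */

/* 'f' for a floating type, 'i' for a signed integer, 'u' for unsigned.
   (T)0.5 is 0 for every integer type and nonzero for floating types. */
#define TYPE_KIND(T) ((T)0.5 != (T)0 ? 'f' : ((T)-1 < (T)0 ? 'i' : 'u'))

/* Width of the type's storage in bits. */
#define TYPE_BITS(T) ((unsigned)(CHAR_BIT * sizeof(T)))

/* Print one line such as "off_t    i64". */
#define PRINT_TYPE(out, T) \
    fprintf((out), "%-8s %c%u\n", #T, TYPE_KIND(T), TYPE_BITS(T))

/* Print the list for a few types; extend it with whatever is of interest. */
void print_type_list(FILE *out)
{
    PRINT_TYPE(out, off_t);
    PRINT_TYPE(out, time_t);
    PRINT_TYPE(out, clock_t);
    PRINT_TYPE(out, size_t);
}
```

Calling `print_type_list(stdout)` on each platform of interest produces one line per type; the result is specific to the compiler, target and feature macros in effect when the probe is built.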

> In the end I might end up with a list of such types for my platform,
> and what concrete types they actually are.
>
> But, why doesn't such a list already exist anyway? It's not as though
> my platform is a rare, obscure one out of hundreds; it's Windows on
> x64. The other two I might be interested in are Linux on x64 and on
> arm64.

Such a list might exist somewhere. If it doesn't, or if it's hard to
find, it's probably because not many people are interested in it.

And I've just shown you how you can generate such a list yourself, by
writing a C program that generates it.

[snip]

(I didn't look at the Fortran stuff.)

Bart

Jul 29, 2021, 6:05:30 AM

I'll tell you anyway: it doesn't work on Windows.

> Yes, I probably should have used `#include <sys/types.h>`.
>
> No, it's not trial and error. Write the program once, compile and run
> it on each platform of interest. Expand the program as needed to show
> the characteristics of all the types you're interested in. Tweak the
> output any way you like to suit your purposes. Make it generate source
> code in your personal language if you like.

This is stuff I've had to do years ago. More recently, I could
semi-translate C APIs (although normally I'd exclude system headers), by
using a special option on my C compiler, which will generate a
particular rendering, as often APIs bristle with compile-specific code.

It doesn't attempt most of the macros that are normally used, which in
general contain expressions in C syntax which is not trivially convertible.

Generating FFI bindings of C APIs is a big job. Most people who have to
do similar things are not going to be writing their own C parsers or
whatever; they might have to use more heavyweight solutions, or simply
do a huge amount of work.

The off_t and other types are examples of how the way C is typically
written makes this harder than necessary. Instead of using a plain type,
or even using one layer of typedefs, it uses several.

(For gcc/tdm, off_t is defined inside _mingw_off_t.h (it has its own
header!). This defines also _off_t and off32_t as 'long'; _off64_t and
off64_t; and off_t itself as either off64_t or off32_t.

One clock_t type used 6 layers of typedefs and macros; what the hell
happened there? This is why I'm saying this stuff should be overhauled.
Obviously no one wants to mess with it, so they add their own
abstractions. Then someone else does the same... Eventually someone
wants to printf such a value!)

-----------------------------------------------
_mingw_off_t.h
-----------------------------------------------
#ifndef _OFF_T_DEFINED
#define _OFF_T_DEFINED
#ifndef _OFF_T_
#define _OFF_T_
typedef long _off_t;
#if !defined(NO_OLDNAMES) || defined(_POSIX)
typedef long off32_t;
#endif
#endif

#ifndef _OFF64_T_DEFINED
#define _OFF64_T_DEFINED
__MINGW_EXTENSION typedef long long _off64_t;
#if !defined(NO_OLDNAMES) || defined(_POSIX)
__MINGW_EXTENSION typedef long long off64_t;
#endif
#endif /*_OFF64_T_DEFINED */

#ifndef _FILE_OFFSET_BITS_SET_OFFT
#define _FILE_OFFSET_BITS_SET_OFFT
#if !defined(NO_OLDNAMES) || defined(_POSIX)
#if (defined(_FILE_OFFSET_BITS) && (_FILE_OFFSET_BITS == 64))
typedef off64_t off_t;
#else
typedef off32_t off_t;
#endif /* #if !defined(NO_OLDNAMES) || defined(_POSIX) */
#endif /* (defined(_FILE_OFFSET_BITS) && (_FILE_OFFSET_BITS == 64)) */
#endif /* _FILE_OFFSET_BITS_SET_OFFT */

#endif /* _OFF_T_DEFINED */
-----------------------------------------------

All to define a type which is i32 or i64; wonderful isn't it?

Let's face it, if you had to write some functions or data types that
expressed a file offset, you'd just use int64_t and be done with it.
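
One way to present a fixed 64-bit offset to a foreign language without touching the system headers is a small C shim that converts at the boundary. The following is a sketch under stated assumptions: a POSIX system providing `lseek`, and the hypothetical names `off_from_i64` and `shim_lseek`, neither of which comes from any real library.

```c
#include <stdint.h>
#include <sys/types.h>   /* off_t (POSIX) */
#include <unistd.h>      /* lseek (POSIX) */

/* Convert a 64-bit offset to off_t. Returns 0 on success, or -1 if the
   value does not survive the round trip (off_t narrower than 64 bits). */
int off_from_i64(int64_t in, off_t *out)
{
    off_t converted = (off_t)in;
    if ((int64_t)converted != in)
        return -1;
    *out = converted;
    return 0;
}

/* Hypothetical shim for an FFI: only fixed-width types appear in its
   signature, so the caller never needs to know what off_t is. */
int64_t shim_lseek(int fd, int64_t offset, int whence)
{
    off_t off;
    if (off_from_i64(offset, &off) != 0)
        return -1;
    return (int64_t)lseek(fd, off, whence);
}
```

The cost of this design is an extra compiled C file per platform; the sketch also assumes `off_t` is no wider than 64 bits when converting the return value.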

John Dill

Jul 29, 2021, 9:09:53 AM

This is kind of what build systems are for, even though it doesn't suit
Bart's tastes. To find the size of an opaque type, you'd leverage something
similar to ac_check_sizeof from autoconf that basically generates a C
program, runs the program, inspects the stdout, stderr or return code, then
sets a define in a config.h style file based on the result.

Bart could even do his own mini home-grown version of that if he had
the motivation. You just need some kind of scripting language that can
interact with the shell.
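
The last step of such a check, turning a measured size into a config.h-style line, can be sketched in C. The helper `emit_sizeof_define` is hypothetical and much simpler than what autoconf does; it only follows the `SIZEOF_<TYPE>` naming that autoconf's size check uses for a plain type name.

```c
#include <ctype.h>
#include <stdio.h>

/* Write "#define SIZEOF_<NAME> <size>\n" into buf, upper-casing the type
   name and replacing any non-alphanumeric character with '_'.
   Returns the value returned by snprintf. */
int emit_sizeof_define(char *buf, size_t buflen,
                       const char *type_name, size_t size)
{
    char macro[64];
    size_t i = 0;

    for (; type_name[i] != '\0' && i < sizeof macro - 1; i++) {
        unsigned char c = (unsigned char)type_name[i];
        macro[i] = isalnum(c) ? (char)toupper(c) : '_';
    }
    macro[i] = '\0';

    return snprintf(buf, buflen, "#define SIZEOF_%s %lu\n",
                    macro, (unsigned long)size);
}
```

A probe program built for the target would call it as `emit_sizeof_define(buf, sizeof buf, "off_t", sizeof(off_t))` and write the result to the generated header.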

Bart

Jul 29, 2021, 9:27:17 AM

People seem to be missing the point. You shouldn't NEED to write
scripts, install a C compiler, write test programs just to find out what
a type is! They shouldn't work as hard as they do to hide these types.

For goodness sake, an offset type is either going to be i32 or i64!

Look at my link to Windows data types, where all the info needed is on
that one web-page.

Scott Lurndal

Jul 29, 2021, 9:51:50 AM

No, you're missing the point. 99.9999% of people writing code
in the C language don't need to know what the underlying type is. Noting
that it (off_t for example) is a signed integer is all they need
to know.

>
>For goodness sake, an offset type is either going to be i32 or i64!

No, it's not. C runs on many platforms other than Windows on x86.

Keith Thompson

Jul 29, 2021, 11:12:36 AM

Bart <b...@freeuk.com> writes:
> On 29/07/2021 03:20, Keith Thompson wrote:
>> Bart <b...@freeuk.com> writes:

[...]

>>> Sure, I can do that (after fixing the code with the right includes and
>>> avoiding %zu). But that's the trial and error approach.
>> %zu is the correct format specifier. I'm not even going to ask why
>> you'd want to avoid it.
>
> I'll tell you anyway: it doesn't work on Windows.

Yes, it does. I don't know why it wouldn't work for you. Are you using
some obsolete version of Windows or of its C implementation?

[...]

Bart

Jul 29, 2021, 11:19:14 AM

You have a small point, but mainly you're just making excuses for some
terrible header code that should long have been cleaned up.

The _mingw_off_t.h file I posted that makes such a meal of defining
off_t, is specifically for Windows on x86.

I would also suggest that those languages that would like to use C
libraries, the ones that predominantly use integer types based around
8/16/32/64 bits, will not be running on the same 13-bit processors that
C likes to support.

They are also unlikely to run under a system where a file offset of 16
bits is sufficient (that would be too small even for a 1980s floppy
disk). The most useful type these days would be i64.

I actually haven't used fstat(), if that's what it's called, for
perhaps 20 years. There are too many combinations of it with the
numerous variations of struct stat, off_t and the rest.

There are 8 combinations listed here (6 functions plus file length
variations, halfway down the page):

https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fstat-fstat32-fstat64-fstati64-fstat32i64-fstat64i32?view=msvc-160

Another page lists TWELVE versions of struct stat (6 using char* plus 6
with wchar_t*).

This is for basically two platforms (Win32 and Win64), but they will run
under the same file system that will have files bigger than 2GB, so
you'd expect the same file functions and associated types for both.

I would call that messy, and that's without even delving into the headers.
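
For comparison, a minimal sketch of the POSIX side, where there is a single fstat() and the size comes back as an off_t. This illustrates POSIX systems only and says nothing about the Microsoft functions listed above; file_size is a made-up name.

```c
#include <stdint.h>     /* intmax_t */
#include <sys/stat.h>   /* struct stat, fstat (POSIX) */
#include <sys/types.h>  /* off_t (POSIX) */

/* Hypothetical helper: size in bytes of the file open on descriptor fd,
   or -1 if fstat fails. st_size has type off_t; widening it to intmax_t
   means the caller never needs to know how wide off_t is. */
intmax_t file_size(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return (intmax_t)st.st_size;
}
```

On some 32-bit POSIX systems the width of off_t is itself selected at build time (glibc uses the _FILE_OFFSET_BITS macro for this), so the choice does not disappear entirely; it moves into the build settings.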

Bart

Jul 29, 2021, 11:34:41 AM

On 29/07/2021 16:12, Keith Thompson wrote:
> Bart <b...@freeuk.com> writes:
>> On 29/07/2021 03:20, Keith Thompson wrote:
>>> Bart <b...@freeuk.com> writes:
> [...]
>>>> Sure, I can do that (after fixing the code with the right includes and
>>>> avoiding %zu). But that's the trial and error approach.
>>> %zu is the correct format specifier. I'm not even going to ask why
>>> you'd want to avoid it.
>>
>> I'll tell you anyway: it doesn't work on Windows.
>
> Yes, it does. I don't know why it wouldn't work for you. Are you using
> some obsolete version of Windows or of its C implementation?

It's Windows 7. gcc is version 9.2.0.

This program:

#include <stdio.h>
int main(void) {printf("%zu\n",sizeof(void*));}

displays 'zu' with gcc, tcc and bcc (my product that uses msvcrt.dll).

It shows the correct value with lccwin, DMC, clang and CL (MS' compiler).

Since I predominantly use the first 3 compilers, I don't find zu useful.

Actually until I did this test, I couldn't remember which supported it
and which didn't. I like code that works on anything, so prefer to use a
format supported on any compiler.

Maybe there is a way of persuading gcc to use better libraries, but I've
no idea how, and anyway like to run it like this:

gcc prog.c

If this is a program that will only be used with bcc, then there I can use
"%?", which is replaced with the correct format, I believe "%llu" here,
and get the right answer as well. That would have been a more useful
extension than another bizarre and very specific format.
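
On the question of persuading gcc to use a better library: MinGW-based gcc toolchains provide a macro, __USE_MINGW_ANSI_STDIO, that routes the printf family to MinGW's own C99-conforming implementation instead of the one in msvcrt.dll. A minimal sketch, assuming a MinGW or mingw-w64 toolchain (other toolchains simply ignore the macro). Whether it applies to the gcc 9.2.0 build described above is an assumption, and format_size is a made-up helper name.

```c
/* Should be defined before any standard header is included. */
#ifndef __USE_MINGW_ANSI_STDIO
#define __USE_MINGW_ANSI_STDIO 1
#endif

#include <stddef.h>  /* size_t */
#include <stdio.h>   /* snprintf */

/* Hypothetical helper: format a size_t with the C99 "%zu" conversion.
   Returns whatever snprintf returns. */
int format_size(char *buf, size_t size, size_t value)
{
    return snprintf(buf, size, "%zu", value);
}
```

The same macro can be supplied on the command line, which keeps the source unchanged: gcc -D__USE_MINGW_ANSI_STDIO=1 prog.c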

Scott Lurndal

Jul 29, 2021, 1:10:19 PM

Bart <b...@freeuk.com> writes:
>On 29/07/2021 16:12, Keith Thompson wrote:
>> Bart <b...@freeuk.com> writes:
>>> On 29/07/2021 03:20, Keith Thompson wrote:
>>>> Bart <b...@freeuk.com> writes:
>> [...]
>>>>> Sure, I can do that (after fixing the code with the right includes and
>>>>> avoiding %zu). But that's the trial and error approach.
>>>> %zu is the correct format specifier. I'm not even going to ask why
>>>> you'd want to avoid it.
>>>
>>> I'll tell you anyway: it doesn't work on Windows.
>>
>> Yes, it does. I don't know why it wouldn't work for you. Are you using
>> some obsolete version of Windows or of its C implementation?
>
>It's Windows 7. gcc is version 9.2.0.
>
>This program:
>
> #include <stdio.h>
> int main(void) {printf("%zu\n",sizeof(void*));}

It has _always_ worked properly with GCC. You're clearly
doing something incorrect if your assertion is accurate.

$ cc -o /tmp/b /tmp/b.c
$ /tmp/b
8
$ cat /tmp/b.c

#include <stdio.h>
int main(void) {printf("%zu\n",sizeof(void*));}

$ cc --version
gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Keith Thompson

Jul 29, 2021, 1:28:40 PM

The problem is with the old C library. The version of gcc is
irrelevant.

Windows 7 is obsolete. If you're stuck using it for some reason, that's
a valid reason to avoid "%zu". "it doesn't work on Windows" is not.

Keith Thompson

Jul 29, 2021, 1:30:01 PM

sc...@slp53.sl.home (Scott Lurndal) writes:
> Bart <b...@freeuk.com> writes:
>>On 29/07/2021 16:12, Keith Thompson wrote:
>>> Bart <b...@freeuk.com> writes:
>>>> On 29/07/2021 03:20, Keith Thompson wrote:
>>>>> Bart <b...@freeuk.com> writes:
>>> [...]
>>>>>> Sure, I can do that (after fixing the code with the right includes and
>>>>>> avoiding %zu). But that's the trial and error approach.
>>>>> %zu is the correct format specifier. I'm not even going to ask why
>>>>> you'd want to avoid it.
>>>>
>>>> I'll tell you anyway: it doesn't work on Windows.
>>>
>>> Yes, it does. I don't know why it wouldn't work for you. Are you using
>>> some obsolete version of Windows or of its C implementation?
>>
>>It's Windows 7. gcc is version 9.2.0.
>>
>>This program:
>>
>> #include <stdio.h>
>> int main(void) {printf("%zu\n",sizeof(void*));}
>
> It has _always_ worked properly with GCC. You're clearly
> doing something incorrect if your assertion is accurate.

[...]

gcc doesn't implement printf. The C library does. Bart is using an
obsolete version of Microsoft's C library.
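
Because the conversion is interpreted by the library at run time, a program can probe for it. A minimal sketch: printf_supports_zu is a made-up name, and the check relies on the behaviour reported in this thread, where a library that does not know "%zu" prints the literal text "zu".

```c
#include <stddef.h>  /* size_t */
#include <stdio.h>   /* sprintf */
#include <string.h>  /* strcmp */

/* Hypothetical probe: returns 1 if the C library linked into this program
   formats "%zu" as a number, 0 otherwise. The compiler only passes the
   format string through; the library does the interpreting. */
int printf_supports_zu(void)
{
    char buf[32];

    sprintf(buf, "%zu", (size_t)42);
    return strcmp(buf, "42") == 0;
}
```

Strictly speaking, an unrecognised conversion is undefined behaviour in ISO C, so this is a diagnostic aid for finding out which library a build is using rather than something to ship.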

Bart

Jul 29, 2021, 2:07:24 PM

On 29/07/2021 18:28, Keith Thompson wrote:
> Bart <b...@freeuk.com> writes:

>> This program:
>>
>> #include <stdio.h>
>> int main(void) {printf("%zu\n",sizeof(void*));}
>>
>> displays 'zu' with gcc, tcc and bcc (my product that uses msvcrt.dll).

> The problem is with the old C library. The version of gcc is
> irrelevant.
>
> Windows 7 is obsolete. If you're stuck using it for some reason, that's
> a valid reason to avoid "%zu". "it doesn't work on Windows" is not.

Looks like Windows 10 is obsolete too:

c:\c>ver
Microsoft Windows [Version 10.0.14393]

c:\c>type c.c

#include <stdio.h>

int main(void) {
    printf("%zu\n", sizeof(void*));
}

c:\c>gcc c.c

c:\c>a
zu

Keith Thompson

Jul 29, 2021, 2:16:55 PM

No, it looks like you're using an obsolete C library implementation on
Windows 10. I'm able to compile and run your program on Windows 10,
and it prints 8 or 4 depending on which implementation I use.

You could probably get some help with that if you asked (and provided
some information).

Bart

Jul 29, 2021, 2:41:27 PM

You said 'Windows 7 is obsolete'.

Anyway, how can the C library be obsolete if it comes /with/ Windows 10?
Even if I could, installing a different library wouldn't be much help,
since then I'd have programs that worked fine on my Windows, but on no
one else's unless they jumped through the same hoops.

Just for the sake of %zu which can be trivially replaced with %d and a
(int) cast (or %lld and (long long) if expecting some big objects), it
is not worth the headache of extra dependencies.
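
The workaround described above can be sketched as a small helper. Casting to unsigned long long and using "%llu" keeps the full range of size_t on common platforms, whereas "%d" with an (int) cast would truncate large values. It assumes the library understands "%llu", which this thread suggests is the case for msvcrt.dll. format_size_portable is a made-up name.

```c
#include <stddef.h>  /* size_t */
#include <stdio.h>   /* sprintf */

/* Hypothetical helper: format a size_t without using "%zu".
   buf must have room for the digits plus the terminator; 32 bytes is
   ample for a 64-bit value. Returns the number of characters written. */
int format_size_portable(char *buf, size_t value)
{
    return sprintf(buf, "%llu", (unsigned long long)value);
}
```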

> I'm able to compile and run your program on Windows 10,
> and it prints 8 or 4 depending on which implementation I use.
>
> You could probably get some help with that if you asked (and provided
> some information).

Does Tiny C work too?

Keith Thompson

Jul 29, 2021, 3:44:07 PM

I did, and it is.

> Anyway, how can the C library be obsolete if it comes /with/ Windows
> 10? Even if I could, installing a different library wouldn't be much
> help, since then I'd have programs that worked fine on my Windows, but
> on no one else's unless they jumped through the same hoops.

I don't know enough about it to answer that. I use Visual Studio for
work and Cygwin for personal use. I don't have a gcc-based
implementation outside Cygwin. Microsoft's online documentation says
"%zu" works.

I know that Microsoft was very slow to support C99 (which is where %zu
was introduced), but they've made considerable progress more recently.

> Just for the sake of %zu which can be trivially replaced with %d and a
> (int) cast (or %lld and (long long) if expecting some big objects), it
> is not worth the headache of extra dependencies.

On the systems and implementations I currently use, there are no extra
dependencies; "%zu" just works, and I don't seem to have an
implementation where it doesn't. For my own purposes, I have no reason
to avoid "%zu". Obviously your situation is different, and there are
several workarounds you can use.

I don't know what C library implementation you're using or how it was
installed, other than your statement that it came with Windows 10.
If you're not bothered by its lack of support for "%zu", I don't see
anything more to discuss.

>> I'm able to compile and run your program on Windows 10,
>> and it prints 8 or 4 depending on which implementation I use.
>> You could probably get some help with that if you asked (and provided
>> some information).
>
> Does Tiny C work too?

I don't have Tiny C on either Windows 10 system I have access to, but I
presume it would use the same library that gcc uses, so I'd expect it to
work. (tcc, like gcc, is a compiler, not a complete implementation.
You have a habit of glossing over that.)