
I think references should have been const by default


Juha Nieminen

Oct 21, 2021, 1:23:23 AM
Time and again I see beginner C++ programmers make the same mistake:
Make functions take objects by non-const reference, even when (in the
vast, vast majority of cases) the function doesn't modify those objects.

In one particularly egregious case some beginner programmer had written
a comparator lambda that took two std::string objects by reference,
wondered why he was getting a compiler error, and then removed the
references, so it was taking the objects by value. That way it compiled.
Then he wondered why it was so slow.
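
To make the anecdote concrete, here is a minimal sketch of a comparator lambda that takes its arguments by const reference. The function name sort_by_length and the ordering rule are invented for illustration; they are not taken from the case described above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sorts names by length, then alphabetically (hypothetical example).
// The comparator takes both strings by const reference, so no copies are
// made and it can also be called with const elements.
inline void sort_by_length(std::vector<std::string>& names)
{
    std::sort(names.begin(), names.end(),
              [](const std::string& a, const std::string& b) {
                  if (a.size() != b.size()) return a.size() < b.size();
                  return a < b;
              });
}
```

Taking the parameters by value instead would also compile, but each comparison would then copy two strings, which is the slowdown described above.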

I think it was a mistake to make references non-const by default.
It's logical (and consistent with pointer syntax) of course, but I think
it was a mistake. In the vast, vast majority of cases if you create a
reference to something else, you want it to be const (and there are
many good reasons why it should be const). It's extremely rare to
explicitly want a non-const reference.

And the thing is, C++98 has a *perfect* keyword to explicitly denote
that you want a non-const reference, which could have been used for
this purpose, so no new keyword would be needed for this: 'mutable'.

In other words, I think C++ would have been better if it worked like
this:

void foo1(std::string& str)
{
    str = "hello"; // error: 'str' is const
}

void foo2(mutable std::string& str)
{
    str = "hello"; // ok
}

(You could still write "const std::string&", but it would just have
the exact same meaning as "std::string&", ie. in this case the
'const' is superfluous, a bit like how 'signed' is superfluous
in "signed int".)

David Brown

Oct 21, 2021, 2:54:45 AM
I agree with you entirely. But if we are going for wishful thinking
about how C++ could have been made better, I'd have preferred "const"
for all variables and required "mutable" to declare a variable that
could be modified. (Of course that would mean you'd need something else
for class members that are today "mutable". But I think it's a lot more
common for people to make variables that could have been "const" but
aren't, than to use mutable members.)

Racing...@watershipdown.co.uk

Oct 21, 2021, 5:12:44 AM
Looking for logic in C++ keywords is a hiding to nothing. E.g. const_cast
actually means remove constness, not add it, which is frankly bizarre.
You might as well have true mean false and false mean true. Similarly,
putting throw() at the end of a function def means it can't throw! Though at
least that insanity has been superseded by noexcept now.

Paavo Helde

Oct 21, 2021, 5:18:55 AM
21.10.2021 08:23 Juha Nieminen wrote:
>
> I think it was a mistake to make references non-const by default.
> It's logical (and consistent with pointer syntax) of course, but I think
> it was a mistake.
Agreed.

The default in C is to pass objects by value, so changes to the
object will not affect the caller side. On the caller side, pass by
value and pass by reference look the same, so this important feature (no
surprises on the caller side) was lost in C++. Using a const reference
restores it, so one can even argue that making references const, at
least by default, would also be logical and consistent, in some sense.

If it were up to me, I would have required extra syntax on the caller
side to pass via non-const reference, like @var.
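
Something in this direction can be approximated in today's C++ with a small wrapper type. The sketch below is only an illustration of the idea: InOut, inout and add_one are hypothetical names, not standard library facilities.

```cpp
// Hypothetical wrapper: a function that intends to modify its argument
// takes InOut<T>. The constructor is explicit, so a plain variable does not
// convert implicitly and the caller has to mark the call site.
template <class T>
class InOut {
public:
    explicit InOut(T& target) : target_(&target) {}
    T& get() const { return *target_; }
private:
    T* target_;
};

// Helper so that callers can write inout(x) without naming the type.
template <class T>
InOut<T> inout(T& target) { return InOut<T>(target); }

// Example of a modifying function (hypothetical).
inline void add_one(InOut<int> n) { ++n.get(); }
```

With this, `add_one(inout(x))` compiles while `add_one(x)` does not, so the modification is visible where the call is made.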

Pushing everything more const would mean the language would behave and
feel more "functional" and less "OOP", but that's a good thing.

Juha Nieminen

Oct 21, 2021, 5:48:33 AM
David Brown <david...@hesbynett.no> wrote:
> I agree with you entirely. But if we are going for wishful thinking
> about how C++ could have been made better, I'd have preferred "const"
> for all variables and required "mutable" to declare a variable that
> could be modified. (Of course that would mean you'd need something else
> for class members that are today "mutable". But I think it's a lot more
> common for people to make variables that could have been "const" but
> aren't, than to use mutable members.)

That would quite quickly turn quite annoying, eg. with things like
for-loops:

for(int i = 0; i < 10; ++i) // error, i is const

Juha Nieminen

Oct 21, 2021, 5:53:09 AM
Racing...@watershipdown.co.uk wrote:
> Looking for logic in C++ keywords is a hiding to nothing. Eg const_cast
> actually means remove const'ness , not add it which is frankly bizarre.

What would have been a better name for that keyword in your opinion?

> Similarly
> putting throw() at the end of a function def means it can't throw! Though at
> least that insanity has been superceeded by noexcept now.

I think it's pretty logical. It just lists all the exceptions that the
function can throw. If the list is empty, it means it doesn't throw
any exception.

If they wanted to be clearer, it should have been "throws(whatever)", but
that would have required adding yet another single-use keyword.

Paavo Helde

Oct 21, 2021, 5:56:15 AM
Solved by C++20:

for (int i : std::views::iota(0, 10))

Having const default would mean additional bonus in that 'i' would be
const inside the loop body.

Racing...@watershipdown.co.uk

Oct 21, 2021, 6:36:31 AM
On Thu, 21 Oct 2021 09:52:53 -0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
>Racing...@watershipdown.co.uk wrote:
>> Looking for logic in C++ keywords is a hiding to nothing. Eg const_cast
>> actually means remove const'ness , not add it which is frankly bizarre.
>
>What would have been a better name for that keyword in your opinion?

unconst_cast , noconst_cast, take your pick.

>
>> Similarly
>> putting throw() at the end of a function def means it can't throw! Though at
>> least that insanity has been superceeded by noexcept now.
>
>I think it's pretty logical. It just lists all the exceptions that the
>function can throw. If the list is empty, it means it doesn't throw
>any exception.

Except throw inside a function means throw any exception currently on the stack.
You can't have it both ways and for once the C++ committee saw sense and
replaced it with noexcept.

>If they wanted to be clearer, it should have been "throws(whatever)", but
>that would have required adding yet another single-use keyword.

There's nothing wrong with single use keywords if they make things clearer.
This should have been done with the = 0 in pure virtuals and the spurious int
in postfix operator definitions.

eg:
virtual void myfunc() = 0; -> pure virtual void myfunc();
myclass &operator++(int) -> myclass &operator++() postfix


David Brown

Oct 21, 2021, 6:57:22 AM
for (mutable int i = 0; i < 10; ++i)

Yes, there are many places where you want variable variables - but I
think overall you typically have more that could be declared const than
have to be variable. (The idea is not without precedent - there are
other modern languages with constant objects by default.)

For loops, I think the best solution would be a mixture - the loop
variable should be mutable within the controlling expressions, but
should be constant within the statement or block - as though you had
written:

for (int i = 0; i < 10; i++) {
    const auto i_ = i;
    const auto i = i_;

    // Here, "i" is constant
}
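
Taken literally, the snippet above would not compile in C++ today, because the loop variable cannot be redeclared in the outermost block of the loop body. A rough approximation of the same effect, as a sketch with a hypothetical function name (squares), is to give the counter a different name and bind a const copy inside the body:

```cpp
#include <vector>

// Collects the squares of 0..count-1 (hypothetical example).
// The counter i_ is only modified in the loop header; the body works with
// a const copy named i.
inline std::vector<int> squares(int count)
{
    std::vector<int> result;
    for (int i_ = 0; i_ < count; ++i_) {
        const int i = i_;
        // ++i;  // would not compile: i is const here
        result.push_back(i * i);
    }
    return result;
}
```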


Bonita Montero

Oct 21, 2021, 7:14:16 AM
You're compulsive.

Alf P. Steinbach

Oct 21, 2021, 7:23:35 AM
On 21 Oct 2021 07:23, Juha Nieminen wrote:
> Time and again I see beginner C++ programmers make the same mistake:
> Make functions take objects by non-const reference, even when (in the
> vast, vast majority of cases) the function doesn't modify those objects.
>
> In one particularly egregious case some beginner programmer had written
> a comparator lambda that took two std::string objects by reference,
> wondered why he was getting a compiler error, and then removed the
> references, so it was taking the objects by value. That way it compiled.
> Then he wondered why it was so slow.
>
> I think it was a mistake to make references non-const by default.
> It's logical (and consistent with pointer syntax) of course, but I think
> it was a mistake. In the vast, vast majority of cases if you create a
> reference to something else, you want it to be const (and there are
> many good reasons why it should be const). It's extremely rare to
> explicitly want a non-const reference.

Agreed. So what can you do about it?

Just define

template< class T > using Ref_ = const T&;

And then write

void foo( Ref_<Baluba> x )

instead of

void foo( const Baluba& x )

A more clever definition in terms of a nested type of a class template,
like the standard library usually does, does not support type deduction
for function templates, but the above supports e.g.

template< class T >
void foo( Ref_<T> x )

with calls like `foo( something )`.
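
As a small self-contained check of the deduction claim, here is a sketch using the alias from this post. The function length_of is a hypothetical stand-in for foo.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// The alias from the post: Ref_<T> is just another spelling of const T&.
template <class T> using Ref_ = const T&;

static_assert(std::is_same<Ref_<int>, const int&>::value,
              "Ref_<T> and const T& are the same type");

// T is deduced through the alias, as it would be for a const T& parameter.
template <class T>
std::size_t length_of(Ref_<T> x) { return x.size(); }
```

A call such as `length_of(std::string("hello"))` deduces T as std::string without any explicit template argument.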

When the same approach is adopted for pointers it supports general
left/west `const`.

The one big drawback of this approach is that code using such
definitions can't be posted to this group without Mr. Fibble complaining
loudly and posting offensive remarks about how it's impossible to grok.


> And the thing is, C++98 has a *perfect* keyword to explicitly denote
> that you want a non-const reference, which could have been used for
> this purpose, so no new keyword would be needed for this: 'mutable'.
>
> In other words, I think C++ would have been better if it worked like
> this:
>
> void foo1(std::string& str)
> {
> str = "hello"; // error: 'str' is const
> }
>
> void foo2(mutable std::string& str)
> {
> str = "hello"; // ok
> }
>
> (You could still write "const std::string&", but it would just have
> the exact same meaning as "std::string&", ie. in this case the
> 'const' is superfluous, a bit like how 'signed' is superfluous
> in "signed int".)

- Alf

Manfred

Oct 21, 2021, 7:58:02 AM
Good point. However, I value syntax consistency a lot, so given the
current state of things (non-const default pointers) I think non-const
default references are preferable to me - and I write /many/ more 'const
T&' declarations than 'T&' myself.
In an ideal world, if we were able to have const default pointers as
well, then I would endorse your view entirely.

BTW Alf's alternative has its appeal.

Bo Persson

Oct 21, 2021, 9:09:20 AM
On 2021-10-21 at 13:13, Bonita Montero wrote:
> You're compulsive.

Who, me?

Racing...@watershipdown.co.uk

Oct 21, 2021, 9:40:55 AM
I think const is a lot of fuss about nothing, frankly. I barely use it
anyway and I can't remember the last time that I had a bug due to updating
a variable that shouldn't have been updated. For those who think it's simply
an indication to other devs that a variable won't be changed, then fair enough,
but personally I find a lot of const/non-const compilation errors related
to functions a PITA.

Bart

Oct 21, 2021, 10:04:49 AM
I don't know about C++, but in C, you can take a program, remove all the
'const' qualifiers, and it will still compile and work.

Makes you think...

Manfred

Oct 21, 2021, 10:41:51 AM
...mumble, obviously 'const' is almost exclusively for the programmer's
convenience (some implementations may use it to map some const data to
read only memory, but that's probably a minor use compared to the
massive benefit in accurate modeling of a process).

The categories of immutable and mutable data are pretty relevant in
process modeling, and since SW design is a lot about modeling, long live
'const'.

Racing...@watershipdown.co.uk

Oct 21, 2021, 10:51:22 AM
Well quite. But then I started out in C and const wasn't a thing, so you had
to know what was going on and not always rely on the compiler to tell you.
But then C doesn't have references, I suppose, and it's a bit difficult to
accidentally dereference a pointer and update it, unlike a C++ reference. Even
so, I still don't like consts. But it's just personal taste.

Racing...@watershipdown.co.uk

Oct 21, 2021, 10:57:20 AM
On Thu, 21 Oct 2021 16:41:34 +0200
Manfred <non...@add.invalid> wrote:
>On 10/21/2021 4:04 PM, Bart wrote:
>> I don't about C++, but in C, you can take a program, remove all the
>> 'const' qualifiers, and it will still compile and work.
>>
>> Makes you think...
>>
>
>....mumble, obviously 'const' is almost exclusively for the programmer's
>convenience (some implementations may use it to map some const data to
>read only memory, but that's probably a minor use compared to the
>massive benefit in accurate modeling of a process).
>
>The categories of immutable and mutable data are pretty relevant in
>process modeling, and since SW design is a lot about modeling, long live
>'const'.

Tbh if you don't know which data should be left alone and which you can update
you probably shouldn't be working on the code until you learn it a bit better.
As for private/protected variables in library classes, that just pisses me off.
If I want to alter something then let me, Mr Library Coder. It's not up to you
to decide what I can and can't alter in my own program and you're not going to
be able to think up every possible use case in which your "private" variable
might need to be accessed or changed.

I actually had this issue with an in house lib at a company but the muppet
who wrote it refused to make the variable public or even available via a
setter even though we needed to access it. In the end I got the address
of a public variable in the class, counted back in memory and dereferenced the
pointer where his private var was stored. Worked a treat though very fragile.
Eventually management made him add a getter function so that hack was removed.

James Kuyper

Oct 21, 2021, 12:07:51 PM
On 10/21/21 5:12 AM, Racing...@watershipdown.co.uk wrote:
...
> Looking for logic in C++ keywords is a hiding to nothing. Eg const_cast
> actually means remove const'ness , not add it which is frankly bizarre.

You can use const_cast<> to add const as easily as removing it. Doing so
is normally unnecessary, because such conversions can be done
implicitly, which might be what gave you that impression.

You can also add and remove the other qualifier, volatile. Therefore, I
think qual_cast<> might have been a better name for it.
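
A minimal sketch of const_cast working in both directions, and on volatile as well. The helper names (as_read_only, as_writable, as_volatile) are hypothetical and exist only for this illustration.

```cpp
// Adding const with const_cast. An implicit conversion would do the same,
// which is why this direction is rarely written out.
inline const int& as_read_only(int& value)
{
    return const_cast<const int&>(value);
}

// Removing const. Writing through the result is only well defined if the
// object referred to was not itself declared const.
inline int& as_writable(const int& value)
{
    return const_cast<int&>(value);
}

// The same cast also handles the other qualifier, volatile.
inline volatile int* as_volatile(int* p)
{
    return const_cast<volatile int*>(p);
}
```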

James Kuyper

Oct 21, 2021, 12:09:35 PM
On Thu, 21 Oct 2021 15:04:19 +0100
Bart <b...@freeuk.com> wrote:
...
>I don't about C++, but in C, you can take a program, remove all the
>'const' qualifiers, and it will still compile and work.
>
>Makes you think...

Keep in mind that this is strictly true in C only if you remove all the
"const" qualifiers from all of the #included header files, and you can't
do that with standard library headers. It would also be difficult to do
with the headers associated with third-party libraries.

More importantly, that's true only if the program would compile without
diagnostics before the removal of the 'const' qualifiers, and there's a
simple, blindingly obvious reason for that, which does NOT count as an
argument against the proper use of "const":

The declaration of an identifier containing the 'const' keyword
indicates that this identifier should not be used in any way that puts
the object qualified by that keyword in danger of being modified. The
purpose of that keyword is to enable diagnostic messages when the
identifier is used in an expression that could put that object in danger
of being modified. If your program generates no diagnostics, that means
that it contains no such expressions. Removing "const" from the entire
translation unit, including #included headers, will not change the fact
that there is no danger, and the program should therefore work just the
same as before. However, if you modify the program again after removing
all "const" keywords, that modification might create such a danger, but
the implementation no longer has any obligation of warning you about the
danger.

You said that you don't know about C++. Well there's an important
difference between C++ and C in this regard: function overloading. A
function can be overloaded based upon the difference in the way one or
more of its arguments is qualified. The overloads may do significantly
different things - if so, the non-const overload usually attempts to
modify the relevant object, whereas the const overload uses some
work-around to avoid the need to modify it. That work-around is usually
inconvenient in some way, otherwise there would have been no need to
declare the const overload. Therefore, at best, removing all use of the
"const" keyword would require that the non-const overload be dropped,
and that the less convenient const overload be used at all times.

That's "usually". Overloads are entirely under the programmer's control,
and the difference between them need not be "usual" - it very often
isn't. It would be trivial to create overloads that do significantly
different things with const and non-const arguments (most trivially,
they could print out "const" and "non-const" respectively). No single
function can replace both overloads, but it would have to do so if you
removed all occurrences of the "const" keyword.
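
The trivial case mentioned above can be sketched as follows. The function name which is hypothetical, and the overloads return the text instead of printing it so that the result is easy to check.

```cpp
#include <string>

// Two overloads that differ only in the const qualification of the
// referenced type.
inline const char* which(std::string&)       { return "non-const"; }
inline const char* which(const std::string&) { return "const"; }
```

Calling which with a modifiable std::string selects the first overload; a const std::string or a temporary selects the second. With every const removed, the two declarations would collide.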

Branimir Maksimovic

Oct 21, 2021, 5:28:06 PM
Good thinking, but now too late, perhaps with -std=c++22
and later on?

--

7-77-777
Evil Sinner!
with software, you repeat same experiment, expecting different results...

Bart

Oct 21, 2021, 6:49:29 PM
Actually here's an example from C where const changes the behaviour:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    #define issigned(x) _Generic((x),\
        int8_t: "S",\
        int16_t: "S",\
        int32_t: "S",\
        const int32_t: "const S",\
        int64_t: "S",\
        uint8_t: "u",\
        uint16_t: "u",\
        uint32_t: "u",\
        uint64_t: "u",\
        default: "other")

    int32_t x;
    // const int32_t x;
    puts(issigned(x));
}

The output is different when x has a const attribute.

james...@alumni.caltech.edu

Oct 21, 2021, 7:21:55 PM
On Thursday, October 21, 2021 at 6:49:29 PM UTC-4, Bart wrote:
> On 21/10/2021 17:09, James Kuyper wrote:
> > On Thu, 21 Oct 2021 15:04:19 +0100
> > Bart <b...@freeuk.com> wrote:
> > ...
> >> I don't about C++, but in C, you can take a program, remove all the
> >> 'const' qualifiers, and it will still compile and work.
> >>
> >> Makes you think...
> >
> > Keep in mind that this is strictly true in C only if you remove all the
> > "const" qualifiers from all of the #included header files, and you can't
> > do that with standard library headers. It would also be difficult to do
> > with the headers associated with third-party libraries.
...
You're correct, I forgot about that. _Generic provides a capability similar to but much more restricted than function overloading, and as such allows the behavior to depend upon the qualifiers.

Juha Nieminen

Oct 22, 2021, 12:42:03 AM
Bart <b...@freeuk.com> wrote:
> I don't about C++, but in C, you can take a program, remove all the
> 'const' qualifiers, and it will still compile and work.

At least if you turn off warnings.

Also, you'll likely make some programs less efficient because the compiler
will do compile-time calculations on things like const arrays containing
compile-time literals, which it won't if the array is not const.

So yes, 'const' can actually make the program more efficient (especially
in C++, where it guarantees to the compiler that it can assume the
contents won't change).
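
One place where this is directly visible in C++ is constant expressions. The sketch below illustrates that related language rule rather than any particular optimizer behaviour; the names kSize and buffer are hypothetical.

```cpp
#include <array>

// A const integer initialised with a constant expression can itself be used
// where a compile-time constant is required, e.g. as an array size.
const int kSize = 4;
std::array<int, kSize> buffer{};

// Without const the same code is rejected:
// int size = 4;
// std::array<int, size> bad{};  // error: size is not a constant expression
```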

Juha Nieminen

Oct 22, 2021, 12:45:20 AM
Racing...@watershipdown.co.uk wrote:
>>I think it's pretty logical. It just lists all the exceptions that the
>>function can throw. If the list is empty, it means it doesn't throw
>>any exception.
>
> Except throw inside a function means throw any exception currently on the stack.
> You can't have it both ways and for once the C++ committee saw sense and
> replaced it with noexcept.

If 'throw(...)' after a function declaration specifies a list of exceptions
that the function may throw, it's only logical that if this list is empty
then it doesn't throw anything.

You could just as well complain about the multiple different uses of {}.

Juha Nieminen

Oct 22, 2021, 12:46:10 AM
Bonita Montero <Bonita....@gmail.com> wrote:
> You're compulsive.

And you are an asshole.

Juha Nieminen

Oct 22, 2021, 12:48:26 AM
Alf P. Steinbach <alf.p.s...@gmail.com> wrote:
> Agreed. So what can you do about it?
>
> Just define
>
> template< class T > using Ref_ = const T&;
>
> And then write
>
> void foo( Ref_<Baluba> x )
>
> instead of
>
> void foo( const Baluba& x )

Doesn't really help with the problem of beginners using non-const
references everywhere...

Bonita Montero

Oct 22, 2021, 1:55:56 AM
>> You're compulsive.

> And you are an asshole.

That depends on the situation, but sometimes that's true.
I just don't like compulsive personalities that raise problems
that don't really exist.

David Brown

Oct 22, 2021, 3:46:46 AM
It makes you think that it is true that good programming language design
is more about what you /can't/ do, rather than about what you /can/ do.
"const" in C does not let you do things you could not otherwise do - it
restricts you, thus making code clearer, safer, more maintainable, and
perhaps sometimes more efficient.

Const in C++ is more integral to the language, and can't be removed in
the same way (not that anyone would want to).




David Brown

Oct 22, 2021, 5:13:55 AM
On 21/10/2021 18:09, James Kuyper wrote:
> On Thu, 21 Oct 2021 15:04:19 +0100
> Bart <b...@freeuk.com> wrote:
> ...
>> I don't about C++, but in C, you can take a program, remove all the
>> 'const' qualifiers, and it will still compile and work.
>>
>> Makes you think...
>
> Keep in mind that this is strictly true in C only if you remove all the
> "const" qualifiers from all of the #included header files, and you can't
> do that with standard library headers. It would also be difficult to do
> with the headers associated with third-party libraries.
>
I'm not suggesting that this would be at all a good idea, and it would
certainly be undefined and undocumented behaviour, but you could
probably remove "const" by adding "-Dconst=" to your compiler flags. It
works for gcc (maybe I should file a bug here - it is even accepted with
-std=c99 -Wpedantic).

Branimir Maksimovic

Oct 22, 2021, 7:21:19 AM
PEACE&LOVE, brothers and sisters :P

Bart

Oct 22, 2021, 10:49:13 AM
On 22/10/2021 08:46, David Brown wrote:
> On 21/10/2021 16:04, Bart wrote:

>> I don't [know] about C++, but in C, you can take a program, remove all the
>> 'const' qualifiers, and it will still compile and work.
>>
>> Makes you think...
>>
>
> I makes you think that it is true that good programming language design
> is more about what you /can't/ do, rather than about what you /can/ do.
> "const" in C does not let you do things you could not otherwise do - it
> restricts you, thus making code clearer, safer, more maintainable, and
> perhaps sometimes more efficient.

There are better ways of doing it. In C, it just adds a lot of
clutter that affects readability and can hide real problems.

Neither does the syntax make it that obvious which bit of the type is
referred to, as in:

const int * const * x;

The first const applies to the following int; the second const refers to
the /previous/ * (AIUI).

This can give a false sense of security, especially when you have, say,
a const pointer to a struct which contains non-const pointers. The
'const' only protects that top level; it does not stop you writing
nested non-const data.

In my example, x can still be written to! (x=0 is allowed; but *x=0 and
**x=0 are not.)
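
The same rules hold in C++, where they can be checked at compile time. The following is a sketch in C++ rather than C; the alias P, the struct Holder and the function poke are hypothetical names introduced for illustration.

```cpp
#include <type_traits>
#include <utility>

using P = const int* const*;   // the type of x in the example above

// x itself is not const, so it can be reassigned...
static_assert(std::is_assignable<P&, P>::value, "x = ... is allowed");
// ...but *x is a const pointer and **x is a const int.
static_assert(!std::is_assignable<decltype(*std::declval<P&>()), const int*>::value,
              "*x = ... is rejected");
static_assert(!std::is_assignable<decltype(**std::declval<P&>()), int>::value,
              "**x = ... is rejected");

// const on a struct does not reach through its pointer members.
struct Holder { int* data; };

// Compiles: h.data is a const pointer, but it points to a non-const int.
inline void poke(const Holder& h) { *h.data = 42; }
```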

Use of 'const' can also proliferate through interactions with non-const
versions of the type, adding to the clutter.

(I don't do much with readonly stuff [in my languages]. My experiments
focus on mutability of objects, or specific variables, not of types.)

Bart

Oct 22, 2021, 11:14:32 AM
That program only behaves as I said when I use my own compiler.

With gcc and others that support _Generic, any 'const' attributes of the
types of controlling expressions appear to be stripped away.

I'm not sure why that is. It looked to be a bug in gcc, but other
compilers do the same. Maybe they are all copying gcc's behaviour, but I
don't know what C itself says about it.

With the version below, gcc et al display:

int
int

With mine, it displays:

int
const int

--------------------------------

#include <stdio.h>

int main(void) {
    #define strtype(x) _Generic((x),\
        int: "int",\
        const int: "const int")

    int x;
    const int y;
    puts(strtype(x));
    puts(strtype(y));
}

Keith Thompson

Oct 22, 2021, 3:05:12 PM
Bart <b...@freeuk.com> writes:
[...]
> That program only behaves as I said when I use my own compiler.
>
> With gcc and others that support _Generic, any 'const' attributes of
> the types of controlling expressions appear to be stripped away.
>
> I'm not sure why that is. It looked to be a bug in gcc, but other
> compilers do the same. Maybe they are all copying gcc's behaviour, but
> I don't what C itself says about it.
[...]

_Generic is a C feature that does not appear in C++.

The first operand of _Generic is an expression, not an object, and it
determines the type of that expression. If the expression happens to be
an lvalue, it undergoes *lvalue conversion* (N1570 6.3.2.1p2), which
strips any qualifiers. So if the operand of _Generic is the name of an
object of type const int, the type of that *expression* is int, not
const int. (Something like _Generic that operates on lvalues rather
than expressions might have been useful, but I've never had a need for
it.)

Consider an expression like (n + 1). Would you expect it to have a
different type depending on whether n was defined as const or not?

C++ overloading, unlike C's _Generic, can use references to distinguish
between const and non-const objects. You can't have two overloaded
functions that differ only in their parameter type, int vs. const int,
but you can have overloaded functions with parameters of type int& and
const int&.
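
A short C++ sketch of these points, using hypothetical names (n, is_const_ref):

```cpp
#include <type_traits>

// Top-level const on a by-value parameter is not part of the function type,
// so f(int) and f(const int) cannot be two separate overloads.
static_assert(std::is_same<void(int), void(const int)>::value,
              "same function type");

// The value of an expression such as n + 1 is a plain int even if n is const,
// while the lvalue n itself still carries the qualifier.
const int n = 1;
static_assert(std::is_same<decltype(n + 1), int>::value, "n + 1 is int");
static_assert(std::is_same<decltype((n)), const int&>::value,
              "the lvalue n is const");

// With references, the qualifier is part of the referenced type, so these
// are distinct overloads.
inline bool is_const_ref(int&)       { return false; }
inline bool is_const_ref(const int&) { return true; }
```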

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

daniel...@gmail.com

Oct 22, 2021, 3:28:42 PM
On Thursday, October 21, 2021 at 2:54:45 AM UTC-4, David Brown wrote:
> On 21/10/2021 07:23, Juha Nieminen wrote:
> >
> > In other words, I think C++ would have been better if it worked like
> > this:
> >
> > void foo1(std::string& str)
> > {
> > str = "hello"; // error: 'str' is const
> > }
> >
> > void foo2(mutable std::string& str)
> > {
> > str = "hello"; // ok
> > }

It would have been better if std::string was immutable.

> I agree with you entirely. But if we are going for wishful thinking
> about how C++ could have been made better, I'd have preferred "const"
> for all variables and required "mutable" to declare a variable that
> could be modified.

Note though that const does not mean immutable.
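
A minimal sketch of that distinction, with a hypothetical function name: a const reference is a read-only view, and the object behind it can still change through another name.

```cpp
// Returns what a const reference observes after the underlying object has
// been changed through a different, non-const name.
inline int observe_through_const_view()
{
    int value = 1;
    const int& view = value;  // read-only access path, not an immutable object
    value = 2;                // legal: value itself is not const
    return view;              // reads the updated object
}
```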

Daniel

Bart

Oct 22, 2021, 3:48:18 PM
On 22/10/2021 20:04, Keith Thompson wrote:
> Bart <b...@freeuk.com> writes:
> [...]
>> That program only behaves as I said when I use my own compiler.
>>
>> With gcc and others that support _Generic, any 'const' attributes of
>> the types of controlling expressions appear to be stripped away.
>>
>> I'm not sure why that is. It looked to be a bug in gcc, but other
>> compilers do the same. Maybe they are all copying gcc's behaviour, but
>> I don't what C itself says about it.
> [...]
>
> _Generic is a C feature that does not appear in C++.
>
> The first operand of _Generic is an expression, not an object, and it
> determines the type of that expression. If the expression happens to be
> an lvalue, it undergoes *lvalue conversion* (N1570 6.3.2.1p2), which
> strips any qualifiers. So if the operand of _Generic is the name of an
> object of type const int, the type of that *expression* is int, not
> const int. (Something like _Generic that operates on lvalues rather
> than expressions might have been useful, but I've never had a need for
> it.)
>
> Consider an expression like (n + 1). Would you expect it to have a
> different type depending on whether n was defined as const or not?

If I get my C compiler to display the type, then for 'const int n', n
has type 'const int', and n+1 has type 'int'.

But let me ask you, if p has type 'const int* p', do you expect 'p+1' to
have a different type from 'p'? (Namely, 'int*' instead of 'const int*'.)

I understand now that the const qualifiers are only removed at the top
level, so only the leftmost 'const' if types were written left to right
(my const int* p example would have it after 'pointer to').

This still makes the way _Generic works unintuitive.

Keith Thompson

Oct 22, 2021, 4:29:40 PM
If your C compiler displays the type using some non-standard extension,
then of course that extension can do anything you like.

> But let me ask you, if p has type 'const int* p', do you expect 'p+1'
> to have a different type from 'p'? (Namely, 'int*' instead of 'const
> int*'.)

No, p+1 would have type const int* (pointer to const int).

> I understand now that the const qualifiers are only removed at the top
> level, so only the leftmost 'const' if types were written left to
> right (my const int* p example would have it after 'pointer to').
>
> This still makes the way _Generic works unintuitive.

I find it intuitive, but perhaps not immediately so. I had to remind
myself about the lvalue conversion, and that its operand is an
expression, not an object, for it to make sense.

If you want to continue discussing this, I suggest starting a new thread
in comp.lang.c.

James Kuyper

Oct 22, 2021, 7:18:33 PM
On 10/22/21 5:13 AM, David Brown wrote:
...
> I'm not suggesting that this would be at all a good idea, and it would
> certainly be undefined and undocumented behaviour, but you could
> probably remove "const" by adding "-Dconst=" to your compiler flags. It
> works for gcc (maybe I should file a bug here - it is even accepted with
> -std=c99 -Wpedantic).
>

You are right - the behavior would be undefined:

"The program shall not have any macros with names lexically identical to
keywords currently defined prior to the inclusion of the header or when
any macro defined in the header is expanded." (C standard, 7.1.2p5)

As a "shall" occurring outside of a "Constraints" section, the behavior
of a program that violates that rule is undefined (4p2).

But it would probably work as intended on many implementations.

David Brown

Oct 23, 2021, 5:55:01 AM
It turns out - who would have guessed? - that gcc is correct. The gcc
developers these days tend to be very careful and strict about this kind
of thing.

As Keith says, the controlling expression undergoes "lvalue conversion"
(this is in 6.5.1.1p2, if you want to look it up). C18 helpfully adds a
footnote that did not exist in C11, saying "An lvalue conversion drops
type qualifiers". (I think the standard could benefit from more of such
explanatory footnotes.)

I think it is odd, however, that you can have qualified types in the
generic association list, since they can't ever match anything (AFAICS).


David Brown

Oct 23, 2021, 6:21:35 AM
On 22/10/2021 16:48, Bart wrote:
> On 22/10/2021 08:46, David Brown wrote:
>> On 21/10/2021 16:04, Bart wrote:
>
>>> I don't [know] about C++, but in C, you can take a program, remove
>>> all the
>>> 'const' qualifiers, and it will still compile and work.
>>>
>>> Makes you think...
>>>
>>
>> It makes you think that it is true that good programming language design
>> is more about what you /can't/ do, rather than about what you /can/ do.
>>   "const" in C does not let you do things you could not otherwise do - it
>> restricts you, thus making code clearer, safer, more maintainable, and
>> perhaps sometimes more efficient.
>
> There are better ways of doing it.

There are certainly /different/ ways of doing things. There are lots of
different programming languages, with their different strengths and
weaknesses.

> In C, it just adds a lot of
> clutter that affects readability and can hide real problems.

What an odd idea.

If you don't like "const", don't use it in your programming. Others
find it useful to aid readability and avoid problems.

>
> Neither does the syntax make it that obvious which bit of the type is
> referred to, as in:
>
>   const int * const * x;

If you find this kind of thing confusing, use "typedef". It exists to
improve readability (amongst other benefits).

>
> The first const applies to the following int; the second const refers to
> the /previous/ * (AIUI).
>
> This can give a false sense of security, especially when you have, say,
> a const pointer to a struct which contains non-const pointers. The
> 'const' only protects that top level; it does not stop you writing
> nested non-const data.
>

You mean, people who don't really understand what they are doing and
write code that confuses themselves, get mixed up? And how is C
different from any other language in that respect?

I appreciate that you personally prefer a different ordering when
writing types, and that you are not alone in that. Fine. C has a
different ordering, and people usually manage perfectly well. The
difference between "const int * x", "int * const x" and "const int *
const x" is one of these things newbies to C often find hard, and it
turns up in every FAQ and tutorial on the language. If /you/ still find
it hard, read a FAQ.

> In my example, x can still be written to! (x=0 is allowed; but *x=0 and
> **x=0 are not.)

Yes - x is not const.

Bart

Oct 23, 2021, 6:22:32 AM
They can be used in examples like this:

#include <stdio.h>

#define strtypeof(t) _Generic(t,\
    const int*: "pointer to const int",\
    int*: "pointer to int",\
    default: "other")

int main(void) {
    int * p;
    const int * q;

    puts(strtypeof(p));
    puts(strtypeof(q));
}

All compilers that support _Generic show:

pointer to int
pointer to const int

This suggests a way to maintain those top-level qualifiers, by wrapping
a pointer around a type. But it would be an ungainly workaround (and the
fact that typeof() also drops those qualifiers might make it
impractical).

David Brown

Oct 23, 2021, 6:36:14 AM
Yes - but those are not qualified types. Using your preferred ordering,
a "pointer to int" and "pointer to const int" are different types.

>
> All compilers that support _Generic show:
>
>     pointer to int
>     pointer to const int
>
> This suggests a way to maintain those top level qualifiers, by wrapping
> a pointer around a type. But it would be an ungainly workaround (and the
> fact that typeof() also drops those qualifiers might make it
> impractical).

Why are you inventing an ugly workaround for a non-existent problem?
_Generic in C looks at the unqualified type of an expression - there
isn't a problem.

It turns out your compiler has a bug due to a slight misunderstanding of
_Generic. I'm glad you've found it, and can correct it (assuming you
want to be closer to following the standards). But no one is looking
for a "workaround" here. (Especially not /here/, in c.l.c++ !)

Bart

Oct 23, 2021, 7:07:16 AM
On 23/10/2021 11:21, David Brown wrote:
> On 22/10/2021 16:48, Bart wrote:
>> On 22/10/2021 08:46, David Brown wrote:
>>> On 21/10/2021 16:04, Bart wrote:
>>
>>>> I don't [know] about C++, but in C, you can take a program, remove
>>>> all the
>>>> 'const' qualifiers, and it will still compile and work.
>>>>
>>>> Makes you think...
>>>>
>>>
>>> It makes you think that it is true that good programming language design
>>> is more about what you /can't/ do, rather than about what you /can/ do.
>>>   "const" in C does not let you do things you could not otherwise do - it
>>> restricts you, thus making code clearer, safer, more maintainable, and
>>> perhaps sometimes more efficient.
>>
>> There are better ways of doing it.
>
> There are certainly /different/ ways of doing things. There are lots of
> different programming languages, with their different strengths and
> weaknesses.
>
>> In C, it just adds a lot of
>> clutter that affects readability and can hide real problems.
>
> What an odd idea.
>
> If you don't like "const", don't use it in your programming. Others
> find it useful to aid readability and avoid problems.

I was thinking more about other people's code. Mine doesn't use const at
all.

>>
>> Neither does the syntax make it that obvious which bit of the type is
>> referred to, as in:
>>
>>   const int * const * x;
>
> If you find this kind of thing confusing, use "typedef". It exists to
> improve readability (amongst other benefits).

So, even /more/ clutter?! I'd also like to see a typedefed version of my
example that is not harder to understand.

>>
>> The first const applies to the following int; the second const refers to
>> the /previous/ * (AIUI).
>>
>> This can give a false sense of security, especially when you have, say,
>> a const pointer to a struct which contains non-const pointers. The
>> 'const' only protects that top level; it does not stop you writing
>> nested non-const data.
>>
>
> You mean, people who don't really understand what they are doing and
> write code that confuses themselves, get mixed up? And how is C
> different from any other language in that respect?

Yes, everybody. You have a dynamic tree data structure for example,
using non-const references within its nodes to allow it to be updated.

How do you write a function that takes a reference to that tree, but is
not allowed to update it?

This is what someone might expect of an immutable parameter.
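
A minimal sketch of the situation described above, assuming a hypothetical
Node type (the names are made up for illustration) whose child link is a
plain non-const pointer. The const reference protects only the node it
refers to, not whatever that node points at:

```cpp
// Hypothetical tree node; the names are illustrative only.
struct Node {
    int value;
    Node* child;   // non-const link so the tree can be updated elsewhere
};

// Takes the tree by const reference, yet can still modify a child node.
void touch_child(const Node& root) {
    // root.value = 1;        // error: root is const
    // root.child = nullptr;  // error: the pointer member is const here
    if (root.child != nullptr) {
        root.child->value = 99;  // compiles: the pointee is not const
    }
}

// Builds a two-node tree and returns the child's value after the call.
int child_value_after_call() {
    Node leaf{1, nullptr};
    Node root{0, &leaf};
    touch_child(root);
    return leaf.value;
}
```

Calling child_value_after_call() shows the child was modified even though
touch_child only ever saw a const reference to the root.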

This is not to say that I know how to achieve this; I don't (but I
haven't researched it much either). The nearest I can do is pass a deep
copy of such a tree, to protect the original, but that is hardly efficient.

I just see C's const as a waste of time. I'm starting to use readonly
data in a few places within my languages, but where it's handled sensibly.

What I don't do is introduce such a polarising type attribute at every
level of a data structure, one that poisons every other type it comes
into contact with, such that it becomes challenging to do perfectly
innocuous things.

Ben Bacarisse

Oct 23, 2021, 8:23:16 AM
David Brown <david...@hesbynett.no> writes:

> I think it is odd, however, that you can have qualified types in the
> generic association list, since they can't ever match anything
> (AFAICS).

I was curious so I tried this:

#include <stdio.h>

struct S { const int i; } s;
struct S f(void) { return s; }

int main(void)
{
    const char *t =
        _Generic(f().i,
                 int: "int",
                 const int: "const int");
    puts(t);
}

f().i is not an lvalue and has a const-qualified type. gcc prints "int",
but clang prints "const int". I think clang is right here. (So much
for "they all copy gcc"!)

--
Ben.

Bart

Oct 23, 2021, 8:57:50 AM
You need to file a bug report to Clang's developers so that they can fix
that oversight!

But why do you think Clang is wrong? The type has a top-level const qualifier.

I don't get why this is only removed for an lvalue (where you'd think
that a const attribute is more critical).


David Brown

Oct 23, 2021, 12:45:41 PM
Perhaps if you used more of C's common features yourself, you'd be less
confused about them and less inclined to think they are "clutter" or
hinder readability. (If you only want to use C as an output language
from your transpilers, and thus only use a subset of the language, then
that's absolutely fine - but it makes you a poor judge of what features
are useful to people working with human-written C code rather than
machine-generated C code.)

>
>>>
>>> Neither does the syntax make it that obvious which bit of the type is
>>> referred to, as in:
>>>
>>>    const int * const * x;
>>
>> If you find this kind of thing confusing, use "typedef".  It exists to
>> improve readability (amongst other benefits).
>
> So, even /more/ clutter?! I'd also like to see a typedefed version of my
> example that is not harder to understand.
>

typedef const int constant_integer;
typedef constant_integer * pointer_to_constant_integer;
typedef const pointer_to_constant_integer
    constant_pointer_to_constant_integer;
typedef constant_pointer_to_constant_integer *
    pointer_to_constant_pointer_to_constant_integer;

pointer_to_constant_pointer_to_constant_integer x;


That's the order you prefer, is it not? (I'm not suggesting it's a good
way to write it, I'm merely showing you how it could be done with a
choice of names that might suit your liking.)


Maybe you want it more compact:

typedef const int * p_cint;
typedef const p_cint * p_cp_cint;
p_cp_cint x;


In real code, of course, it would usually make more sense to think about
what your types actually are and how they will be used, and then use
type names that fit.


>>>
>>> The first const applies to the following int; the second const refers to
>>> the /previous/ * (AIUI).
>>>
>>> This can give a false sense of security, especially when you have, say,
>>> a const pointer to a struct which contains non-const pointers. The
>>> 'const' only protects that top level; it does not stop you writing
>>> nested non-const data.
>>>
>>
>> You mean, people who don't really understand what they are doing and
>> write code that confuses themselves, get mixed up?  And how is C
>> different from any other language in that respect?
>
> Yes, everybody. You have a dynamic tree data structure for example,
> using non-const references within its nodes to allow it to be updated.
>
> How do you write a function that takes a reference to that tree, but is
> not allowed to update it?
>
> This is what someone might expect of an immutable parameter.

There are occasions when it is more convenient to cast away const, or
where it is hard to maintain full const correctness. "const" does not
absolve the programmer of having to think. But it does make a lot of
code clearer and easier to understand.

>
> This is not to say that I know how to achieve this; I don't (but I
> haven't researched it much either). The nearest I can do is pass a deep
> copy of such a tree, to protect the original, but that is hardly efficient.
>
> I just see C's const as a waste of time. I'm starting to use readonly
> data in a few places within my languages, but where it's handled sensibly.
>
> What I don't do is introduce such a polarising type attribute at every
> level of a data structure, one that poisons every other type it comes
> into contact with, such that it becomes challenging to do perfectly
> innocuous things.
>

Nobody does that with "const". I guess it is just yet another of C's
features that you don't quite understand, and prefer to hate
irrationally than learn.

Bart

Oct 23, 2021, 1:58:51 PM
On 23/10/2021 17:45, David Brown wrote:
> On 23/10/2021 13:06, Bart wrote:

>>>>    const int * const * x;
>>>
>>> If you find this kind of thing confusing, use "typedef".  It exists to
>>> improve readability (amongst other benefits).
>>
>> So, even /more/ clutter?! I'd also like to see a typedefed version of my
>> example that is not harder to understand.
>>
>
> typedef const int constant_integer;
> typedef constant_integer * pointer_to_constant_integer;
> typedef const pointer_to_constant_integer
> constant_pointer_to_constant_integer;
> typedef constant_pointer_to_constant_integer *
> pointer_to_constant_pointer_to_constant_integer;
>
> pointer_to_constant_pointer_to_constant_integer x;
>
>
> That's the order you prefer, is it not? (I'm not suggesting it's a good
> way to write it, I'm merely showing you how it could be done with a
> choice of names that might suit your liking.)
>
>
> Maybe you want it more compact:
>
> typedef const int * p_cint;
> typedef const p_cint * p_cp_cint;
> p_cp_cint x;


Well, I was right, the alternatives are worse.

If you are interested in the actual type, or the 'shape' of that type,
devoid of qualifiers, then you don't want all that. You want to know the
type is 'int**'.


>> This is what someone might expect of an immutable parameter.
>
> There are occasions when it is more convenient to cast away const, or
> where it is hard to maintain full const correctness. "const" does not
> absolve the programmer of having to think. But it does make a lot of
> code clearer and easier to understand.

Somebody writes an informal library but doesn't bother to mark with
'const' those functions that take char* that don't happen to modify the
string:

void f1(char*);
void f2(char*);
void f3(char*);

Now someone who has a mania for 'const' wants to use it:

const char* s="ABC";
f1(s);

However, it doesn't work. They will know from the specs that f1 doesn't
write into the string, but the compiler doesn't know that.

Now, it starts to get messy. Either casts have to be inserted, or the
library needs to be heavily revised. Then that library may import
another which is also missing consts. And so const-poisoning infects the
whole code-base.

At some point, it will also stop you doing things legally, and you have
to start using casts. Now, you are starting to fight the language.
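
The scenario above can be sketched roughly as follows. This is written in
C++, where the uncast call is not accepted at all; legacy_length and length
are hypothetical names standing in for a library function that lacks
'const' and a const-correct equivalent:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical legacy function: only reads the string, but its
// parameter is not declared const.
std::size_t legacy_length(char* s) { return std::strlen(s); }

// Const-correct equivalent of the same function.
std::size_t length(const char* s) { return std::strlen(s); }

std::size_t call_both() {
    const char* s = "ABC";
    // legacy_length(s);  // not accepted: const char* does not convert to char*
    // The cast is tolerable only because legacy_length is known not to write.
    std::size_t a = legacy_length(const_cast<char*>(s));
    std::size_t b = length(s);  // no cast needed
    return a + b;
}
```

The cast at the call site is the cost of the missing 'const' in the
library; adding 'const' to the declaration, as in length, removes it.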

Was it Pascal or Ada that first had those in/out parameter attributes?

I can write this [in my syntax]:

proc f1(ichar s) = {} # anything goes
proc f2(ichar in s) = {} # s is input to the function
proc f3(ichar out s) = {} # s is output from the function

I don't do anything with these at the minute (I think 'out' and 'inout',
not shown, are just aliases for '&') but they can do a lot just as
annotations.

At some point an implementation can enforce them and ensure that an 'in'
data structure is not modified in the function, even one that has
mutable components. The programmer doesn't need to micro-manage every
level of the type structure, or have to think about exactly how
foolproof those 'const' attributes are.

It should be like a write-protect switch on the whole caboodle.

(At least, within the bounds of what the language can help with. A data
structure may contain references to external data, such as files, disks,
images, which can all be modifiable, or they can be altered via another
path to the original data.

But C's const doesn't prevent that either.)

Ben Bacarisse

Oct 23, 2021, 4:02:23 PM
Bart <b...@freeuk.com> writes:

> On 23/10/2021 13:23, Ben Bacarisse wrote:
>> David Brown <david...@hesbynett.no> writes:
>>
>>> I think it is odd, however, that you can have qualified types in the
>>> generic association list, since they can't ever match anything
>>> (AFAICS).
>> I was curious so I tried this:
>> #include <stdio.h>
>> struct S { const int i; } s;
>> struct S f(void) { return s; }
>> int main(void)
>> {
>> const char *t =
>> _Generic(f().i,
>> int: "int",
>> const int: "const int");
>> puts(t);
>> }
>> f().i is not an lvalue and has a const-qualified type. gcc prints "int",
>> but clang prints "const int". I think clang is right here. (So much
>> for "they all copy gcc"!)
>
> You need to file a bug report to Clang's developers so that they can
> fix that oversight!
>
> But why do you think Clang is wrong? The type has a top-level const
> qualifier.

I said I think clang is right (because the expression f().i is not an
lvalue).

> I don't get why this is only removed for an lvalue (where you'd think
> that a const attribute is more critical).

lvalue conversion converts an lvalue to the value stored. It makes no
sense for the result to have any qualifiers -- they are anything but
critical for pure values.

--
Ben.

Chris Vine

Oct 23, 2021, 4:29:40 PM
On Fri, 22 Oct 2021 04:41:47 -0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
> So yes, 'const' can actually make the program more efficient (especially
> in C++, where it guarantees to the compiler that it can assume the
> contents won't change).

Since this thread is entitled "I think references should have been
const by default", it may be worth mentioning that holding a const
reference to an object does not mean that the compiler "can assume the
contents won't change". It guarantees that, in the absence of a const
cast, non-mutable non-static data won't be modified through the
reference. If the object concerned is an lvalue it says nothing about
what might be done to the object's non-mutable data through its
variable name (assuming that is non-const) or by some other non-const
reference. It also says nothing about the mutability of the object's
static data (if any).
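
A small sketch of that point, with made-up function names: the value seen
through a const reference changes when the same object is written through
a second, non-const reference.

```cpp
// Reads through a const reference before and after a write made through
// a separate non-const reference.
int observed_change(const int& view, int& alias) {
    int before = view;
    alias += 1;            // legal: alias is a non-const path to the object
    return view - before;  // nonzero when view and alias name the same object
}

int demo_aliasing() {
    int x = 10;                    // x itself is not const
    return observed_change(x, x);  // both references bind to x
}
```

Because the compiler cannot rule out such aliasing in general, it has to
reread view after the write rather than reuse the earlier value.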

I say this in case it is used to put forward the incorrect notion that
"const" means "thread safe", which I have occasionally seen propagated
by the ill-informed.

Keith Thompson

Oct 23, 2021, 6:07:32 PM
The removal of type qualifiers is part of lvalue conversion. No lvalue,
no lvalue conversion.

I can see that it would make sense for the expression `f().i` to have
type int rather than const int, but the standard doesn't say so.

Tim Rentsch

Oct 24, 2021, 12:12:51 AM
Ben Bacarisse <ben.u...@bsb.me.uk> writes:

[...]

> I was curious so I tried this:
>
> #include <stdio.h>
>
> struct S { const int i; } s;
> struct S f(void) { return s; }
>
> int main(void)
> {
> const char *t =
> _Generic(f().i,
> int: "int",
> const int: "const int");
> puts(t);
> }
>
> f().i is not an lvalue and has a const-qualified type. gcc prints "int",
> but clang prints "const int". I think clang is right here. (So much
> for "they all copy gcc"!)

I have looked into this question and just now posted in comp.std.c
giving the results of my investigation.

Tim Rentsch

Oct 24, 2021, 12:15:43 AM
There is some confusion about that. More info in comp.std.c.

David Brown

Oct 24, 2021, 6:11:37 AM
People can write code in a lazy way (or perhaps an old way), and this
can cause some inconveniences in using it along with code written in
newer and better ways. That's true in general - it is not special for
C, nor special for "const" in C.

Your solution to C's const problem (as you see it), is to have a
language where you can distinguish between pointers which cannot be used
to change the data, and pointers which /can/ be used to change the data.
As long as people use these correctly in your language, or Pascal, or
Ada, everything works correctly.

That's fine - and a good idea.

It is /exactly/ the same as is done in C. "void f1(char *)" is like an
"inout" parameter, and "void f2(const char *)" is like an "in" parameter.

Using "char *" as a parameter in C when it is read-only data is exactly
like using "inout" parameters in Ada or Pascal, or "ichar s" in your
language - it is lazy, unhelpful, and it stops people using it for
constant data (without extra effort).

Therefore, I simply don't understand what you have against "const" in C
- your own language is almost identical except for minor syntax differences.

(You do have a syntax for saying the pointer is used only for writing,
not for reading, which standard C is missing.)

> At some point an implementation can enforce them and ensure that an 'in'
> data structure is not modified in the function, even one that has
> mutable components. The programmer doesn't need to micro-manage every
> level of the type structure, or have to think about exactly how
> foolproof those 'const' attributes are.
>
> It should be like a write-protect switch on the whole caboodle.
>
> (At least, within the bounds of what the language can help with. A data
> structure may contain references to external data, such as files, disks,
> images, which can all be modifiable, or they can be altered via another
> path to the original data.
>
> But C's const doesn't prevent that either.)
>

Yes, "const" has its limitations - the same ones as you have in your
language. Programming languages /always/ have compromises.

Jorgen Grahn

Oct 24, 2021, 10:20:21 AM
On Thu, 2021-10-21, Juha Nieminen wrote:
> Time and again I see beginner C++ programmers make the same mistake:
> Make functions take objects by non-const reference, even when (in the
> vast, vast majority of cases) the function doesn't modify those objects.

I don't often see that mistake. If they do it, surely they must have
learned from really lousy sources, and do other stupid things, too?

It's important and easy for an author or a teacher to describe the
common cases of parameter passing:

void foo(Bar bar);
void foo(const Bar& bar);

And the more exotic ones:

void foo(Bar& bar);
void foo(Bar&& bar);
void foo(Bar* bar);
void foo(const Bar* bar);
void foo(std::unique_ptr<Bar> bar);
// and some I missed I guess
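
As a rough illustration of what the first few forms mean for the caller,
here is a sketch with a hypothetical Bar type. The functions get distinct
names here (an assumption for the example) because overloads taking Bar
and const Bar& would be ambiguous at the call site:

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical type used only for illustration.
struct Bar {
    std::string name;
};

// By value: the function works on its own copy.
std::size_t by_value(Bar bar) {
    bar.name += "!";
    return bar.name.size();
}

// By const reference: no copy, and no modification through bar.
std::size_t by_const_ref(const Bar& bar) { return bar.name.size(); }

// By non-const reference: the caller's object is modified.
void by_ref(Bar& bar) { bar.name += "?"; }

// By rvalue reference: the function may take over the argument's contents.
Bar by_rvalue_ref(Bar&& bar) { return Bar{std::move(bar.name)}; }

// Exercises each form and returns the name that ends up in the new object.
std::string demo_passing() {
    Bar b{"bar"};
    by_value(b);      // b is unchanged: only the copy got the "!"
    by_const_ref(b);  // b is unchanged
    by_ref(b);        // b.name is now "bar?"
    Bar taken = by_rvalue_ref(std::move(b));
    return taken.name;
}
```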

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Bart

Oct 24, 2021, 6:11:57 PM
On 24/10/2021 11:11, David Brown wrote:
> On 23/10/2021 19:58, Bart wrote:

>> I don't do anything with these at the minute (I think 'out' and 'inout',
>> not shown, are just aliases for '&') but they can do a lot just as
>> annotations.
>>
>
> People can write code in a lazy way (or perhaps an old way), and this
> can cause some inconveniences in using it along with code written in
> newer and better ways. That's true in general - it is not special for
> C, nor special for "const" in C.
>
> Your solution to C's const problem (as you see it), is to have a
> language where you can distinguish between pointers which cannot be used
> to change the data, and pointers which /can/ be used to change the data.
> As long as people use these correctly in your language, or Pascal, or
> Ada, everything works correctly.
>
> That's fine - and a good idea.
>
> It is /exactly/ the same as is done in C. "void f1(char *)" is like an
> "inout" parameter, and "void f2(const char *)" is like an "in" parameter.
>
> Using "char *" as a parameter in C when it is read-only data is exactly
> like using "inout" parameters in Ada or Pascal, or "ichar s" in your
> language - it is lazy, unhelpful, and it stops people using it for
> constant data (without extra effort).
>
> Therefore, I simply don't understand what you have against "const" in C
> - your own language is almost identical except for minor syntax differences.

I've played around with C-style readonly type attributes in the past. I
didn't like them. They disrupt the type system (see the fuss about how
_Generic should work) and they didn't give the necessary protection.

'const' in C only affects one level of a complex type. It doesn't
affect those parts not specified (like the members of a struct type, or
rather those values at the other side of an embedded pointer type, which
is not itself const).

I want 'readonly' to protect an entire data structure, especially one
that is otherwise mutable that is passed to a function, by only
specifying one thing.

Now, I haven't fully implemented such an attribute (I've only reserved
syntax like 'let' and 'in'), I just know how I'd like it to work.

That is, propagate down deep into the data structure. I think that is
possible. I don't think it would be part of the type system; it's likely
to be a property of an expression.

Öö Tiib

Oct 24, 2021, 7:18:34 PM
Can it be that you haven't implemented it because it is what you would
like to want but do not always want?

> That is, propagate down deep into the data structure. I think that is
> possible. I don't think it would be part of the type system; it's likely
> to be a property of an expression.

In most software I have seen, pointers in an object often point at other
objects that are not logically components of that object. So the
pointers often do not go "down deep" but entirely elsewhere.

Bart

Oct 24, 2021, 7:58:42 PM
On 25/10/2021 00:18, Öö Tiib wrote:
> On Monday, 25 October 2021 at 01:11:57 UTC+3, Bart wrote:

>> 'const' in C only affects one level of a complex type. It doesn't
>> affect those parts not specified (like the members of a struct type, or
>> rather those values at the other side of an embedded pointer type, which
>> is not itself const).
>>
>> I want 'readonly' to protect an entire data structure, especially one
>> that is otherwise mutable that is passed to a function, by only
>> specifying one thing.
>>
>> Now, I haven't fully implemented such an attribute (I've only reserved
>> syntax like 'let' and 'in'), I just know how I'd like it to work.
>
> Can it be that you haven't implemented it because it is what you would
> like to want but do not always want?

Partly because I've classed it as low priority; this would not allow me
to do anything new, just apply extra restrictions!

However it is something interesting to explore.

>> That is, propagate down deep into the data structure. I think that is
>> possible. I don't think it would be part of the type system; it's likely
>> to be a property of an expression.
>
> In most software I have seen, pointers in an object often point at other
> objects that are not logically components of that object. So the
> pointers often do not go "down deep" but entirely elsewhere.

Determining the boundaries of a data structure, beyond which
write-protection shouldn't apply or can't be applied, would be one of
the problems to look at.

Juha Nieminen

Oct 25, 2021, 12:51:54 AM
Chris Vine <chris@cvine--nospam--.freeserve.co.uk> wrote:
> I say this in case it is used to put forward the incorrect notion that
> "const" means "thread safe", which I have occasionally seen propagated
> by the ill-informed.

"const means thread-safe" is not said in the context of const references,
but in the context of const member functions, which is a completely
different thing.

(And, in this case, the idea is "const member functions *should be*
re-entrant", rather than "const member functions are thread-safe".)

And when I said "const can make the program more efficient" I was
referring to compile-time literals. Especially ones in a const
array. (When the compiler sees the definition of a const array
full of compile-time literals, it can assume that the contents
of the array will never change, and can start taking values from
it at compile time if it's able to. It doesn't need to assume
that the values may change.)
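
A minimal sketch of the kind of case meant here (the array and function
names are invented for the example). Whether a given compiler actually
folds the lookup is an optimisation detail, not something the language
promises:

```cpp
// A const array of literals: its contents can never legally change,
// so a compiler is free to reduce third_prime() to "return 5;".
static const int kPrimes[] = {2, 3, 5, 7, 11};

int third_prime() { return kPrimes[2]; }

// Externally visible and writable: other code may change it, so the
// value generally has to be loaded at run time.
int g_thresholds[] = {2, 3, 5, 7, 11};

int third_threshold() { return g_thresholds[2]; }
```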

Juha Nieminen

Oct 25, 2021, 1:00:51 AM
Bart <b...@freeuk.com> wrote:
> Neither does the syntax make it that obvious which bit of the type is
> referred to, as in:
>
> const int * const * x;

Actually the syntax *does* make it obvious. You are just reading the type
declaration in the wrong direction. Pointer variable declarations should
be read from right to left (this is a simple but non-obvious trick that
surprisingly few programmers know.) In your example, when we read the
declaration from right to left, it becomes:

"x is a pointer to a const pointer that points to an int that's const".

Or, if you want to be a bit clearer:

"x is a pointer to a (const pointer) that points to an int, the int
itself being const".

(In other words, x itself is not const and can be modified, but it
points to a const pointer, ie. *x cannot be modified, and this
const pointer is pointing to a const int, ie. **x cannot be modified
either.)
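
The three levels can be checked with a short sketch like the one below
(kValue, kInner and reseat_outer are names made up for the example); the
commented-out lines are the ones a compiler refuses:

```cpp
static const int kValue = 7;
static const int* const kInner = &kValue;  // const pointer to const int

int reseat_outer() {
    const int* const* x = &kInner;  // x: non-const pointer to (const pointer to const int)
    int seen = **x;                 // reading through both levels is fine
    // **x = 0;       // error: the int is const
    // *x = nullptr;  // error: the pointer that x points at is const
    x = nullptr;                    // ok: x itself is not const
    return seen + (x == nullptr ? 1 : 0);
}
```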

Racing...@watershipdown.co.uk

Oct 25, 2021, 4:21:22 AM
On Sat, 23 Oct 2021 18:45:22 +0200
David Brown <david...@hesbynett.no> wrote:
>On 23/10/2021 13:06, Bart wrote:
>> So, even /more/ clutter?! I'd also like to see a typedefed version of my
>> example that is not harder to understand.
>>
>
>typedef const int constant_integer;
>typedef constant_integer * pointer_to_constant_integer;
>typedef const pointer_to_constant_integer
>constant_pointer_to_constant_integer;
>typedef constant_pointer_to_constant_integer *
>pointer_to_constant_pointer_to_constant_integer;
>
>pointer_to_constant_pointer_to_constant_integer x;

Consts in C are pointless because it doesn't have references and it's rather
difficult to "accidentally" dereference a pointer to update the value it's
pointing to.

Juha Nieminen

Oct 25, 2021, 5:48:06 AM
Racing...@watershipdown.co.uk wrote:
> Consts in C are pointless because it doesn't have references and it's rather
> difficult to "accidentally" dereference a pointer to update the value it's
> pointing to.

Actually it isn't. Very old-style C code, mostly from before the C89 standard
(though you can sometimes see even modern examples, since for some old-school
C coders habits die hard), often used non-const pointers to char as "strings".
In fact, I think even the famous K&R book has examples with non-const char*'s
being initialized to point to string literals.

The problem with this is that it can be too easy to accidentally try to
modify the contents of the "string" through that pointer. If you are
accustomed to never using 'const' when dealing with char*'s, you'll
probably pay little attention to the fact that some function somewhere
is taking a non-const char* as parameter, and you might at some point
call it with a pointer that's pointing to a string literal. If said
function does modify the "string" it's getting as parameter, that's UB.
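
A hedged sketch of the hazard and of one way around it. upcase_first is a
hypothetical function that writes to its argument; the safe call passes a
writable array initialised from the literal instead of the literal itself:

```cpp
#include <cctype>
#include <cstring>

// Hypothetical function that writes to the buffer it is given.
void upcase_first(char* s) {
    if (*s != '\0') {
        *s = static_cast<char>(std::toupper(static_cast<unsigned char>(*s)));
    }
}

bool demo_writable_copy() {
    // Handing upcase_first a pointer to a string literal would let it
    // write into the literal's storage, which is undefined behaviour.
    char buffer[] = "hello";  // array initialised from the literal: a writable copy
    upcase_first(buffer);
    return std::strcmp(buffer, "Hello") == 0;
}
```

Had the parameter been declared const char*, the write inside
upcase_first would not compile, which is the kind of reminder 'const'
provides.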

Most modern C compilers will give you a warning if you try to assign
a const char* (eg. a string literal) to a non-const char* (or give
one to a function taking a non-const char*), but if you were
determined to never use 'const' and turn off such warnings, such
mistakes are not extraordinarily unlikely.

Bo Persson

Oct 25, 2021, 6:33:54 AM
On 2021-10-25 at 11:47, Juha Nieminen wrote:
> Racing...@watershipdown.co.uk wrote:
>> Consts in C are pointless because it doesn't have references and it's rather
>> difficult to "accidentally" dereference a pointer to update the value it's
>> pointing to.
>
> Actually it isn't. Very old-style C code, mostly prior to the C89 standard,
> but you can see even modern examples sometimes (for some old-school C coders
> habits die hard), often used non-const pointers to char as "strings".
> In fact, I think even the famous K&R book has examples with non-const char*'s
> being initialized to point to string literals.
>

In defense of K&R. :-)

They didn't have const in original C. It was Bjarne who first added it
to C++, and only later did C also adopt the keyword.


Bart

Oct 25, 2021, 7:13:24 AM
If only it was that simple to read declarations!

Ones such as int** can work by going from right to left, but in general
it is inside out.

I noticed you deftly bypassed the fact that 'const' for 'int' can be
written either side of 'int', or both!

At least this example helps highlight which of those ** comes first.

Racing...@watershipdown.co.uk

Oct 25, 2021, 10:19:25 AM
On Mon, 25 Oct 2021 09:47:50 -0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
>Racing...@watershipdown.co.uk wrote:
>> Consts in C are pointless because it doesn't have references and it's rather
>> difficult to "accidentally" dereference a pointer to update the value it's
>> pointing to.
>
>Actually it isn't. Very old-style C code, mostly prior to the C89 standard,
>but you can see even modern examples sometimes (for some old-school C coders
>habits die hard), often used non-const pointers to char as "strings".

And? How do you accidentally write *str or str[0] for example?

>In fact, I think even the famous K&R book has examples with non-const char*'s
>being initialized to point to string literals.

The concept of const didn't exist in K&R C so what would be your alternative?

>The problem with this is that it can be too easy to accidentally try to
>modify the contents of the "string" through that pointer. If you are
>accustomed to never using 'const' when dealing with char*'s, you'll
>probably pay little attention to the fact that some function somewhere
>is taking a non-const char* as parameter, and you might at some point
>call it with a pointer that's pointing to a string literal. If said
>function does modify the "string" it's getting as parameter, that's UB.

No idea what UB means, but what'll happen is it'll crash immediately so you'll
soon find out.

James Kuyper

Oct 25, 2021, 10:30:21 AM
On 10/25/21 4:21 AM, Racing...@watershipdown.co.uk wrote:
...
> Consts in C are pointless because it doesn't have references and it's rather
> difficult to "accidentally" dereference a pointer to update the value it's
> pointing to.

Actually, it isn't. All it takes is unfamiliarity with the functions
you're using. I remember, in particular, I've seen messages from several
people expressing surprise that strtok() writes to the string that you
pass it as its first argument. If the pointers they had tried to pass
to strtok() had been const char * rather than char*, they would have
been reminded of the problem. Of course, they might not have understood
the reminder, if they weren't familiar with functions whose declarations
use "const" appropriately. All of the standard library functions do so;
many other libraries don't.
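
The strtok() surprise described above can be sketched as follows. This is only an illustration: the helper names count_fields() and strtok_modifies_input() are made up for the example, and the input is deliberately a writable array rather than a string literal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts comma-separated fields in buf using strtok().
 * buf must be writable: strtok() overwrites each delimiter it finds
 * with '\0'. (Hypothetical helper, for illustration only.) */
size_t count_fields(char *buf)
{
    size_t n = 0;
    for (char *tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ","))
        n++;
    return n;
}

/* Returns 1 if strtok() changed the caller's buffer, 0 otherwise. */
int strtok_modifies_input(void)
{
    char line[] = "a,b,c";            /* writable copy of the literal */
    const char original[] = "a,b,c";  /* untouched reference copy */

    size_t n = count_fields(line);
    assert(n == 3);

    /* After tokenising, line holds "a\0b\0c", so as a string it is "a". */
    return strcmp(line, original) != 0;
}
```

Had the caller held a const char * instead, passing it to count_fields() would violate a constraint and the compiler would have to issue a diagnostic, which is the reminder the post refers to.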

However, that's only a part of the problem that "const" is intended to
help avoid. The other part is intentionally dereferencing a pointer to
update the value it's pointing at, due to being unaware of the fact
that what it's pointing at is something that shouldn't be written to. In
code which doesn't make proper use of "const", that's a fairly common
mistake, at least in my experience (which is admittedly limited, since
my own code does make proper use of "const").

James Kuyper

Oct 25, 2021, 10:39:30 AM
On 10/25/21 5:47 AM, Juha Nieminen wrote:
...
> Most modern C compilers will give you a warning if you try to assign
> a const char* (eg. a string literal) to a non-const char* (or give
> one to a function taking a non-const char*),

They must do so on assignment; 6.5.16p2 occurs in a "Constraints" section:
"An assignment operator shall have a modifiable lvalue as its left operand."

And if a function prototype is in scope "... the arguments are
implicitly converted, as if by assignment, to the types of the
corresponding parameters ..." (6.5.2.2p7), so the same constraints apply
there, too. For the same reason, they also apply to return statements.

Racing...@watershipdown.co.uk

Oct 25, 2021, 10:39:34 AM
On Mon, 25 Oct 2021 10:30:04 -0400
James Kuyper <james...@alumni.caltech.edu> wrote:
>On 10/25/21 4:21 AM, Racing...@watershipdown.co.uk wrote:
>....
>> Consts in C are pointless because it doesn't have references and it's rather
>> difficult to "accidentally" dereference a pointer to update the value it's
>> pointing to.
>
>Actually, it isn't. All it takes is unfamiliarity with the functions
>you're using. I remember, in particular, I've seen messages from several
>people expressing surprise that strtok() writes to the string that you
>pass it as its first argument. If the pointers they had tried to pass

Those are the sorts of people who should stick to python or javascript.

>However, that's only a part of the problem that "const" is intended to
>help avoid. The other part is intentionally dereferencing a pointer to
>update the value it's pointing at, due to being unaware of the fact
>that what it's pointing at is something that shouldn't be written to. In

They'll soon find out if they try to write to it.


Manfred

Oct 25, 2021, 10:58:18 AM
There's still the point that the C standard describes "type qualifiers"
both in the context of "declaration-specifiers" and "declarators", and,
in the first case, it says that "type specifiers" (e.g. 'int') and "type
qualifiers" (like 'const') may appear "in any order".
This flexibility is handy in simple declarations, but may be seen as
less consistent in case of multiple levels of indirection.
In the case of pointer "declarators", on the other hand, "type
qualifiers", if any, always occur /after/ their respective '*'.

All of this makes sense, after you pay the necessary attention, and it
allows one to specify the desired qualifiers for each level of indirection,
which is a valuable feature. I'd say this is one of the cases where
flexibility comes at a price, which, in this case, is worth its value.

James Kuyper

Oct 25, 2021, 10:59:30 AM
On 10/25/21 10:19 AM, Racing...@watershipdown.co.uk wrote:
> On Mon, 25 Oct 2021 09:47:50 -0000 (UTC)
> Juha Nieminen <nos...@thanks.invalid> wrote:
>> Racing...@watershipdown.co.uk wrote:
>>> Consts in C are pointless because it doesn't have references and it's rather
>>> difficult to "accidentally" dereference a pointer to update the value it's
>>> pointing to.
>>
>> Actually it isn't. Very old-style C code, mostly prior to the C89 standard,
>> but you can see even modern examples sometimes (for some old-school C coders
>> habits die hard), often used non-const pointers to char as "strings".
>
> And? How do you accidentally write *str or str[0], for example?

It's not the *str that's accidental, it's the call to a function that
contains *str, with an argument that points to a string that shouldn't
be written to. That is in fact a fairly easy mistake to make, and when
people didn't use "const" properly, it was actually a fairly common one.

...
>> In fact, I think even the famous K&R book has examples with non-const char*'s
>> being initialized to point to string literals.
>
> The concept of const didn't exist in K&R C so what would be your alternative?

There was no alternative, which is why that was the case. After "const"
was added to the language, K&R 2nd edition was updated accordingly.

...
>> call it with a pointer that's pointing to a string literal. If said
>> function does modify the "string" it's getting as parameter, that's UB.
>
> No idea what UB means, but what'll happen is it'll crash immediately so you'll
> soon find out.

UB means "Undefined Behavior", a technical term from the C standard
which does NOT mean "behavior for which there is no definition". It
means "behavior, upon use of a nonportable or erroneous program
construct or of erroneous data, for which this document imposes no
requirements" (3.4.3). Note that "this document" refers to the C
standard; other documents (such as compiler documentation or ABI
standards) might define the behavior, without changing the fact that it
qualifies as "undefined behavior" as far as the C standard is concerned.

A lot of people have trouble understanding how breath-takingly wide the
scope of "imposes no requirements" is. The standard tries to make that
clear with the following examples: "Possible undefined behavior ranges
from ignoring the situation completely with unpredictable results, to
behaving during translation or program execution in a documented manner
characteristic of the environment (with or without the issuance of a
diagnostic message), to terminating a translation or execution (with the
issuance of a diagnostic message)."

Note, in particular, that the most insidious form of undefined behavior
is that your program can behave exactly the way you incorrectly thought
it was required to behave. The reason that's dangerous is that it leaves
you with no warning that the behavior might change when you recompile
with a different compiler, or with different compiler options, or even
with the same compiler options, or even if you simply run the program a
second time, even if you give it the same inputs as the previous time.
That's how comprehensive the phrase "no requirements" is - the undefined
behavior is NOT required to be the same each time you execute the
offending program.

Getting back to your comment - it's not required to crash immediately -
that would constitute a requirement. And it's actually possible, as a
result of optimizations performed by the compiler, that it might
actually do something quite different. In particular, one possibility is
that the attempt to write to the object might become a no-op (as indicated
by the phrase "ignoring the situation completely").

James Kuyper

Oct 25, 2021, 11:05:47 AM
On 10/25/21 10:39 AM, Racing...@watershipdown.co.uk wrote:
> On Mon, 25 Oct 2021 10:30:04 -0400
> James Kuyper <james...@alumni.caltech.edu> wrote:
...
>> However, that's only a part of the problem that "const" is intended to
>> help avoid. The other part is intentionally dereferencing a pointer to
>> update the value it's pointing at, due to being unaware of the fact
>> that what it's pointing at is something that shouldn't be written to. In
>
> They'll soon find out if they try to write to it.

Not necessarily - the fact that the behavior is undefined gives
implementations the freedom to implement such code any way they want,
including ways that can be quite hard to recognize as errors - even
though they are.
Back when I was first converting a lot of other people's K&R C code to
make use of the new features of C90, I frequently found errors like that
which had been masked for years - the errors were quite capable of
causing serious problems, but for one reason or the other, they had
failed to do so frequently enough for the problem to be successfully
tracked down. Most of that code ran much more reliably after I finished
converting it.

Manfred

Oct 25, 2021, 11:56:54 AM
On 10/25/2021 4:59 PM, James Kuyper wrote:
> On 10/25/21 10:19 AM, Racing...@watershipdown.co.uk wrote:
<snip>
>>
>> No idea what UB means, but what'll happen is it'll crash immediately so you'll
>> soon find out.
>
> UB means "Undefined Behavior", a technical term from the C standard
> which does NOT mean "behavior for which there is no definition". It
> means "behavior, upon use of a nonportable or erroneous program
> construct or of erroneous data, for which this document imposes no
> requirements" (3.4.3). Note that "this document" refers to the C
> standard; other documents (such as compiler documentation or ABI
> standards) might define the behavior, without changing the fact that it
> qualifies as "undefined behavior" as far as the C standard is concerned.

Thanks for the quote, it made me compare it with the definition of UB in
the C++ standard, which simply states "behavior for which this
International Standard imposes no requirements".

The lack of the phrase "upon use of a nonportable or erroneous program
construct or of erroneous data" actually leaves the language at the
mercy of language lawyers, and has led to the UB bloat that affects C++
nowadays.

Racing...@watershipdown.co.uk

Oct 25, 2021, 12:14:48 PM
On Mon, 25 Oct 2021 10:59:15 -0400
James Kuyper <james...@alumni.caltech.edu> wrote:
>On 10/25/21 10:19 AM, Racing...@watershipdown.co.uk wrote:
>>> call it with a pointer that's pointing to a string literal. If said
>>> function does modify the "string" it's getting as parameter, that's UB.
>>
>> No idea what UB means, but what'll happen is it'll crash immediately so
>> you'll soon find out.
>
>UB means "Undefined Behavior", a technical term from the C standard
>which does NOT mean "behavior for which there is no definition". It
>means "behavior, upon use of a nonportable or erroneous program
>construct or of erroneous data, for which this document imposes no
>requirements" (3.4.3). Note that "this document" refers to the C
>standard; other documents (such as compiler documentation or ABI
>standards) might define the behavior, without changing the fact that it
>qualifies as "undefined behavior" as far as the C standard is concerned.

Any attempt to write to a read-only program text area will result in a crash
regardless of the language. It is implicit that it's read-only in C because
C also provides the following initialisation which places the string
(presumably) on the heap:

char str[] = "hello world";


James Kuyper

Oct 25, 2021, 1:14:47 PM
On 10/25/21 12:14 PM, Racing...@watershipdown.co.uk wrote:
...
> Any attempt to write to a read only program text area will result in a crash
> regardless of the language.

Perhaps that is true at the hardware level, on some processors. However,
there are also some processors which don't even have the concept of
read-only memory, and there are fully conforming C implementations that
can target some of those processors.

However, I'm talking about the level of C code, not hardware. The
translation from C code to machine code is defined only in terms of the
required behavior, and when there is NO required behavior, that
translation can get distinctly weird if you believe the mistaken idea
that C is a "portable assembler".

> ... It is implicit that it's read-only in C because
> C also provides the following initialisation which places the string
> (presumably) on the heap:
>
> char str[] = "hello world";

Such code cannot result in the string being placed in read-only memory,
because it's perfectly legal to modify str. On the other hand, both of
the following C declarations do allow strings to be placed in read-only
memory, even if they occur at block scope:

const char str[] = "Hello world!";
char *strptr = "Good bye!";

The first one is allowed to be placed in read-only memory because the
object str is declared "const". The second is allowed to be placed in
read-only memory because it's undefined behavior to write to the memory
pointed at by strptr, despite the fact that, in C, the string literal
does NOT have the type const char[10], as it would in C++.

However, just because it would be permissible for an implementation to
place those objects in read-only memory, it's not actually required that
they be placed there. Many implementations won't do so, especially if
those declarations occur at block scope.

And even if they were placed in read-only memory, writing C code that
attempts to modify that memory need not result in machine language
instructions being executed to attempt such a write. Because the behavior
of such code is undefined, an implementation is free to translate such
source code into machine code that does nothing of the kind - and this
is, in fact, the natural result, in some contexts, of certain optimizations.
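
As a small sketch of the distinction drawn above (the function name is made up for the example): an array initialised from a literal is an ordinary writable object, while the literal itself is only read here, never written.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if writing to an array initialised from a string literal
 * behaves as an ordinary, well-defined modification. */
int array_copy_is_writable(void)
{
    char str[] = "hello world";       /* private writable copy: 12 bytes */
    const char *lit = "hello world";  /* points at a literal; only read */

    str[0] = 'J';                     /* fine: str is a normal array */

    return strcmp(str, "Jello world") == 0 &&
           strcmp(lit, "hello world") == 0 &&
           sizeof str == 12;
}

/* In C the literal "Good bye!" has type char[10]:
 * nine characters plus the terminating '\0'. */
enum { GOODBYE_SIZE = sizeof "Good bye!" };
```

Writing through a plain char * that points at the literal would compile without a required diagnostic, but its behaviour is undefined, so the sketch deliberately never does that.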

Bart

Oct 25, 2021, 1:19:55 PM
On 25/10/2021 17:14, Racing...@watershipdown.co.uk wrote:
> On Mon, 25 Oct 2021 10:59:15 -0400
> James Kuyper <james...@alumni.caltech.edu> wrote:
>> On 10/25/21 10:19 AM, Racing...@watershipdown.co.uk wrote:
>>>> call it with a pointer that's pointing to a string literal. If said
>>>> function does modify the "string" it's getting as parameter, that's UB.
>>>
>>> No idea what UB means, but what'll happen is it'll crash immediately so
>>> you'll soon find out.
>>
>> UB means "Undefined Behavior", a technical term from the C standard
>> which does NOT mean "behavior for which there is no definition". It
>> means "behavior, upon use of a nonportable or erroneous program
>> construct or of erroneous data, for which this document imposes no
>> requirements" (3.4.3). Note that "this document" refers to the C
>> standard; other documents (such as compiler documentation or ABI
>> standards) might define the behavior, without changing the fact that it
>> qualifies as "undefined behavior" as far as the C standard is concerned.
>
> Any attempt to write to a read only program text area will result in a crash
> regardless of the language.

Data is only put into readonly, write-protected memory when the data
values are already known before the program starts.

Lots of uses of 'const' are for data not known until the program starts
execution, and many of these will be reinitialised many times as they
are declared inside blocks.

Other uses will take normally mutable data and make it read-only
when passed to a function.

So using write-protected memory is not that much help.
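
A minimal sketch of the first point, assuming a made-up function sum_multiples(): the const object below gets a different value on every call, so no amount of write-protected memory prepared before the program starts could hold it.

```c
#include <assert.h>

/* Sums step, 2*step, ..., n*step. 'limit' is const, yet its value is
 * only known at run time and is re-initialised on each call. */
int sum_multiples(int n, int step)
{
    if (n <= 0 || step <= 0)
        return 0;

    const int limit = n * step;   /* const, but computed per call */
    int total = 0;
    for (int v = step; v <= limit; v += step)
        total += v;
    /* limit = 0;   would not compile: limit is not a modifiable lvalue */
    return total;
}
```

Here the protection comes entirely from the compiler's diagnostics, not from the hardware.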

Keith Thompson

Oct 25, 2021, 1:49:13 PM
I suggest that you would benefit more here from asking questions than
from making assertions.

That declaration does not place anything on the heap. The contents of
str are placed on the stack if it appears within a function definition,
or in the static data area if it appears outside a function definition.

Others have addressed your errors regarding "const".

Keith Thompson

Oct 25, 2021, 1:56:39 PM
I think the example being referred to was something like:
    char *s;
    s = "hello";
which does not require a diagnostic in C (because C string literals are
not const). The following is recommended in C and required in C++:
    const char *s;
    s = "hello";
but here s is still a modifiable lvalue because the "const" applies to
what s points to, not to s itself.

A case that would invoke the constraint in 6.5.16p2 is:
    char *const s;
    s = "hello";
because s itself is read-only; you can't assign *anything* to it.
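
The two placements of "const" can be put side by side in a compilable sketch. The function name is invented for the example, and the lines that would violate a constraint are left as comments.

```c
#include <assert.h>
#include <string.h>

/* Shows which part each 'const' protects. Returns 1 on success. */
int const_positions_demo(void)
{
    char buffer[] = "hello";

    const char *p = "hello";  /* pointer to const char */
    p = "world";              /* fine: p itself is a modifiable lvalue */
    /* p[0] = 'W';               constraint violation: *p is const */

    char *const q = buffer;   /* const pointer to char; must be initialised */
    q[0] = 'H';               /* fine: the chars q points at are modifiable */
    /* q = buffer + 1;           constraint violation: q is not modifiable */

    return strcmp(p, "world") == 0 && strcmp(buffer, "Hello") == 0;
}
```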

Keith Thompson

Oct 25, 2021, 2:12:03 PM
Manfred <non...@add.invalid> writes:
> On 10/25/2021 4:59 PM, James Kuyper wrote:
>> On 10/25/21 10:19 AM, Racing...@watershipdown.co.uk wrote:
> <snip>
>>>
>>> No idea what UB means, but what'll happen is it'll crash immediately so you'll
>>> soon find out.
>> UB means "Undefined Behavior", a technical term from the C standard
>> which does NOT mean "behavior for which there is no definition". It
>> means "behavior, upon use of a nonportable or erroneous program
>> construct or of erroneous data, for which this document imposes no
>> requirements" (3.4.3). Note that "this document" refers to the C
>> standard; other documents (such as compiler documentation or ABI
>> standards) might define the behavior, without changing the fact that it
>> qualifies as "undefined behavior" as far as the C standard is concerned.
>
> Thanks for the quote, it made me compare it with the definition of UB
> in the C++ standard, which simply states "behavior for which this
> International Standard imposes no requirements".
>
> The lack of the sentence "upon use of a nonportable or erroneous
> program construct or of erroneous data" actually relegates the
> language at the mercy of language lawyers, and led to the UB bloat
> that affects C++ nowadays.
[...]

I don't see how the omission of "upon use of a nonportable or erroneous
program construct or of erroneous data" in the C++ standard makes any
real difference.

C definition, all standard editions:
behavior, upon use of a nonportable or erroneous program construct
or of erroneous data, for which this International Standard imposes
no requirements

C++ definition, before C++11:
behavior, such as might arise upon use of an erroneous program
construct or erroneous data, for which this International Standard
imposes no requirement

C++ definition, C++11 and later:
behavior for which this International Standard imposes no requirements

In all cases, "undefined behavior" is determined either by an explicit
statement or by the omission of any definition of the behavior (or, in
C, by violation of a "shall" outside a constraint).

Chris M. Thomasson

Oct 25, 2021, 3:45:22 PM
Fwiw, when writing code in C, I tend to use the following pattern:

struct foo
{
    unsigned int a;
};


void
foo_init(
    struct foo* const self,
    unsigned int a
){
    self->a = a;
}


int
foo_compute(
    struct foo const* const self,
    unsigned int a
){
    /* self points to const, so compute without writing to *self */
    return self->a * (a + 123);
}


I like to use a const pointer to self so that if I accidentally modify
self, I will get a nice warning. It's basically a habit of mine. 'self'
is akin to the this pointer in C++.

Oh well... ;^)

Chris Vine

Oct 25, 2021, 7:52:27 PM
On Mon, 25 Oct 2021 04:51:35 -0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
> Chris Vine <chris@cvine--nospam--.freeserve.co.uk> wrote:
> > I say this in case it is used to put forward the incorrect notion that
> > "const" means "thread safe", which I have occasionally seen propagated
> > by the ill-informed.
>
> "const means thread-safe" is not said in the context of const references,
> but in the context of const member functions, which is a completely
> different thing.

I think you are confused: the two go together. Where a const reference
references an object, the only member functions of the object that you
may call via that reference are const ones. const member functions are
not thread safe in the general case. If you are suggesting otherwise
you are wrong.

> (And, in this case, the idea is "const member functions *should be*
> re-entrant", rather than "const member functions are thread-safe".)

No, I was referring to misguided suggestions as to the latter.

> And when I said "const can make the program more efficient" I'm
> referring to compile-time literals.

That _is_ a completely different thing: compilers can certainly make
assumptions about literals.

Juha Nieminen

Oct 26, 2021, 1:18:53 AM
James Kuyper <james...@alumni.caltech.edu> wrote:
> A lot of people have trouble understanding how breath-takingly wide the
> scope of "imposes no requirements" is. The standard tries to make that
> clear with the following examples "Possible undefined behavior ranges
> from ignoring the situation completely with unpredictable results, to
> behaving during translation or program execution in a documented manner
> characteristic of the environment (with or without the issuance of a
> diagnostic message), to terminating a translation or execution (with the
> issuance of a diagnostic message)."

No better example of "undefined behavior" causing a major problem than
that bug in the Linux kernel discovered some years ago, where the kernel
would deliberately dereference a null pointer (I don't remember anymore
for what reason), and gcc saw that it was a null pointer dereference,
which according to the C standard is undefined behavior, and since that
allows the compiler to do with it whatever it wants, it (if I remember
correctly) just optimized it away, causing the extraordinarily hard-to-find
bug in the kernel.

(Also, if I remember correctly, it caused quite a discussion about
whether compilers should actually be allowed to "do whatever they want"
with such code, or whether they should do as they are told.)

Juha Nieminen

Oct 26, 2021, 1:29:46 AM
Racing...@watershipdown.co.uk wrote:
> Any attempt to write to a read only program text area will result in a crash
> regardless of the language.

There's absolutely nothing requiring C (or C++) compilers to put string
literals in a read-only memory segment. They are free to put them in a
normal read/write memory segment if they so wish.

Nothing guarantees that the target architecture even *has* such a thing
as "read-only memory segments".

This means that your program may well work "correctly" in one target
architecture but not in another.

> It is implicit that it's read-only in C because
> C also provides the following initialisation which places the string
> (presumably) on the heap:
>
> char str[] = "hello world";

It cannot place it on the heap because that would just be a memory leak
(there would be nothing freeing it). It would allocate that array on
the stack, if it's inside a function (and if it's at the global scope,
whichever segment is dedicated to those).

And that string literal there, if it actually gets generated into the
final binary, will still be in read-only memory (if the architecture
supports such a thing). It's just that its contents are copied to the
array when the array is allocated on the stack.
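
The copying described above can be observed with a short sketch (the function name is hypothetical): because the array has automatic storage duration, each call starts from a fresh copy of the initialiser, whatever the previous call did to its own copy.

```c
#include <assert.h>

/* Returns the first character of a block-scope array as it was on
 * entry, then overwrites it. */
char first_char_then_clobber(void)
{
    char str[] = "hello world";  /* initialised anew on every call */
    char first = str[0];
    str[0] = 'X';                /* changes only this call's copy */
    return first;
}
```

Calling it twice yields 'h' both times.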

(Btw, this is the reason why I say that C has "strings", rather than
strings. They are just char arrays, with a zero byte as an element
that by convention indicates the final character. This causes a
lot of confusion, especially since it induces many people to
think that a char* is a "string". Which it isn't. It's a pointer
to a value of type char. It *might* point to a null-terminated
char array, or it might not. It's not guaranteed that it's a
"string".)

Juha Nieminen

Oct 26, 2021, 1:36:29 AM
Bart <b...@freeuk.com> wrote:
> I noticed you deftly bypassed the fact that 'const' for 'int' can be
> written either side of 'int', or both!

That's not really a problem in the right-to-left reading.

int const *ptr;

can be read as:

"ptr is a pointer to a (const int)."

and:

const int *ptr;

can be read as:

"ptr is a pointer to an int that's const (ie. a const int)."

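That the two spellings name the same type can be checked mechanically. The sketch below assumes a C11 compiler (for _Generic); the function name is made up for the example.

```c
#include <assert.h>

/* Returns 1 if 'int const *' and 'const int *' are the same type. */
int const_spellings_match(void)
{
    int value = 42;
    int const *a = &value;  /* "a is a pointer to const int" */
    const int *b = &value;  /* same type, different spelling */

    /* _Generic selects an association by the type of its operand. */
    int a_matches = _Generic(a, const int *: 1, default: 0);
    int b_matches = _Generic(b, int const *: 1, default: 0);

    a = b;  /* both pointers can be reseated; neither can be written through */

    return a_matches && b_matches && *a == 42;
}
```
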
David Brown

Oct 26, 2021, 2:54:05 AM
On 26/10/2021 07:29, Juha Nieminen wrote:
> Racing...@watershipdown.co.uk wrote:
>> Any attempt to write to a read only program text area will result in a crash
>> regardless of the language.
>
> There's absolutely nothing requiring C (or C++) compilers to put string
> literals in a read-only memory segment. They are free to put them in a
> normal read/write memory segment if they so wish.
>
> Nothing guarantees that the target architecture even *has* such a thing
> as "read-only memory segments".
>
> This means that your program may well work "correctly" in one target
> architecture but not in another.

There is also nothing to guarantee that attempting to write to read-only
memory will result in a "crash". It could result in nothing happening
at all (the write being ignored), or a hang, or a reset of the entire
system, or a write to somewhere different in memory. (I've worked with
systems with all four such behaviours to at least some extent.)

A particular /OS/ might guarantee that attempting to write to read-only
memory segments results in a particular handling of the process, but it
is certainly not guaranteed by C or C++.

And of course, the C compiler might not actually attempt to make the
write, but act as though it had. (I think that would be unlikely in
practice, but it could be done for strings local to a function.)

>
>> It is implicit that it's read-only in C because
>> C also provides the following initialisation which places the string
>> (presumably) on the heap:
>>
>> char str[] = "hello world";
>
> It cannot place it on the heap because that would just be a memory leak
> (there would be nothing freeing it). It would allocate that array on
> the stack, if it's inside a function (and if it's at the global scope,
> whichever segment is dedicated to those).
>

(Hypothetically, it /could/ be allocated on the heap, or elsewhere, if
the compiler also generated code to free it appropriately. While almost
all C implementations use a stack for local data, there are a few
exceptions.)

David Brown

Oct 26, 2021, 3:07:47 AM
On 26/10/2021 07:18, Juha Nieminen wrote:
> James Kuyper <james...@alumni.caltech.edu> wrote:
>> A lot of people have trouble understanding how breath-takingly wide the
>> scope of "imposes no requirements" is. The standard tries to make that
>> clear with the following examples "Possible undefined behavior ranges
>> from ignoring the situation completely with unpredictable results, to
>> behaving during translation or program execution in a documented manner
>> characteristic of the environment (with or without the issuance of a
>> diagnostic message), to terminating a translation or execution (with the
>> issuance of a diagnostic message)."
>
> No better example of "undefined behavior" causing a major problem than
> that bug in the Linux kernel discovered some years ago, where the kernel
> would deliberately dereference a null pointer (I don't remember anymore
> for what reason), and gcc saw that it was a null pointer dereference,
> which according to the C standard is undefined behavior, and since that
> allows the compiler to do with it whatever it wants, it (if I remember
> correctly) just optimized it away, causing the extraordinarily hard-to-find
> bug in the kernel.
>

The compiler did not cause a bug in the kernel. There was a bug in the
source code - the programmer got the order of the code wrong, and
checked the pointer after using it. This was a simple mistake in the
code, and should have been spotted by the reviewer - it was an
embarrassing failure in the development chain of the kernel. (The review
and moderation process in the kernel development usually maintains very
high standards.)

The new optimisation in gcc did not /cause/ the bug, it merely changed
the /consequences/ of the bug. The optimisation was entirely valid.
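
The ordering mistake described above can be sketched roughly as follows. This is not the kernel's actual code: struct device and device_flags() are hypothetical, and the buggy ordering appears only in a comment because calling it with a null pointer would be undefined behaviour.

```c
#include <assert.h>
#include <stddef.h>

struct device {   /* hypothetical type, for illustration only */
    int flags;
};

/* Buggy ordering:
 *
 *     int flags = dev->flags;   // dereference first ...
 *     if (dev == NULL)          // ... check afterwards
 *         return -1;
 *
 * Once dev has been dereferenced, a compiler may assume it is
 * non-null and drop the later check. */

/* Corrected ordering: test the pointer before using it. */
int device_flags(const struct device *dev)
{
    if (dev == NULL)
        return -1;
    return dev->flags;
}
```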

It is, however, also reasonable for a project like an OS kernel to
accept that there is a risk of human error leading to bugs in the code,
and want to reduce the consequences that might result from such bugs.

But we can learn from our mistakes - the kernel gained the feature of
having a memory page at address zero mapped with no access, so that any
later attempt to dereference a null pointer would be caught. At that
point, -fdelete-null-pointer-checks can (and should) be re-enabled,
along with the warning "-Wnull-dereference" that was also added as a
consequence of this issue.

> (Also, if I remember correctly, it caused quite a discussion about
> whether compilers should actually be allowed to "do whatever they want"
> with such code, or whether they should do as they are told.)
>

The compiler /did/ do as it was told. It was not told to do what the
programmer wanted to tell it.

Racing...@watershipdown.co.uk

Oct 26, 2021, 4:19:11 AM
On Mon, 25 Oct 2021 13:14:30 -0400
James Kuyper <james...@alumni.caltech.edu> wrote:
>On 10/25/21 12:14 PM, Racing...@watershipdown.co.uk wrote:
>....
>> Any attempt to write to a read only program text area will result in a crash
>> regardless of the language.
>
>Perhaps that is true at the hardware level, on some processors. However,
>there's also some processors which don't even have the concept of
>read-only memory, and there are fully conforming C implementation that
>can target some of those processors.
>
>However, I'm talking about the level of C code, not hardware. The
>translation from C code to machine code is defined only in terms of the
>required behavior, and when there is NO required behavior, that
>translation can get distinctly weird if you believe the mistaken idea
>that C is a "portable assembler".
>
>> ... It is implicit that it's read-only in C because
>> C also provides the following initialisation which places the string
>> (presumably) on the heap:
>>
>> char str[] = "hello world";
>
>Such code cannot result in the string being placed in read-only memory,
>because it's perfectly legal to modify str. On the other hand, both of

Yes, that was my point. [] means modifiable, * means read-only in every
C implementation I've ever used.


Racing...@watershipdown.co.uk

Oct 26, 2021, 4:21:20 AM
On Mon, 25 Oct 2021 10:48:57 -0700
Keith Thompson <Keith.S.T...@gmail.com> wrote:
>Racing...@watershipdown.co.uk writes:
>> Any attempt to write to a read only program text area will result in a crash
>> regardless of the language. It is implicit that it's read-only in C because
>> C also provides the following initialisation which places the string
>> (presumably) on the heap:
>>
>> char str[] = "hello world";
>
>I suggest that you would benefit more here from asking questions than
>from making assertions.

I suggest you ease up on being patronising.

>That declaration does not place anything on the heap. The contents of
>str is placed on the stack if it appears within a function definition.
>or in the static data area if it appears outside a function definition.

Wherever it's placed, the point is it's modifiable, unlike *str = which isn't.

>Others have addressed your errors regarding "const".

Not really. They're just trying to make a case for const being useful in C.
I've yet to see that.


Racing...@watershipdown.co.uk

Oct 26, 2021, 4:23:32 AM
On Tue, 26 Oct 2021 05:29:31 -0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
>Racing...@watershipdown.co.uk wrote:
>> It is implicit that it's read-only in C because
>> C also provides the following initialisation which places the string
>> (presumably) on the heap:
>>
>> char str[] = "hello world";
>
>It cannot place it on the heap because that would just be a memory leak
>(there would be nothing freeing it). It would allocate that array on

It wouldn't need to be free'd if it existed for the lifetime of the program.

>strings. They are just char arrays, with a zero byte as an element
>that by convention indicates the final character. This causes a

Wow, really? Who knew!


Juha Nieminen

Oct 26, 2021, 4:37:29 AM
Racing...@watershipdown.co.uk wrote:
> Not really. They're just trying to make a case for const being useful in C.
> I've yet to see that.

It can catch errors where you accidentally try to modify the contents of,
for example, a string literal.

(This doesn't mean that you do like
char* str = "hello"; str[0] = 'H';
but it does mean that you might do like
doSomething("hello");
where that doSomething() actually modifies the data behind the pointer
it's given.)

It can also make code more efficient.

What more do you need?
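
A minimal sketch of the point about compile-time checking, assuming two hypothetical helpers (capitalize and count_char are illustrative names, not from the thread). In C a string literal has type char[], so passing one directly to a char * parameter is not diagnosed by default; GCC's -Wwrite-strings option is one way to get a warning there. Holding literals only through const char * pointers makes the compiler reject the call instead:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper that writes through its argument, so the
   parameter is deliberately a pointer to non-const char. */
void capitalize(char *s)
{
    assert(s != NULL);
    if (s[0] != '\0')
        s[0] = (char)toupper((unsigned char)s[0]);
}

/* Hypothetical helper that only reads: 'const' states that in the
   signature and lets the compiler reject accidental writes. */
size_t count_char(const char *s, char c)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s)
        if (*s == c)
            ++n;
    return n;
}

/* With a const-qualified pointer the mistake is caught at compile time:

       const char *greeting = "hello";
       capitalize(greeting);    // diagnosed: discards 'const' qualifier

   Passing a modifiable array is fine. */
int capitalize_demo(void)
{
    char buf[] = "hello";       /* modifiable copy of the literal */
    capitalize(buf);
    return buf[0] == 'H' && count_char(buf, 'l') == 2;
}
```

The read-only helper accepts both literals and modifiable buffers, so adding const to its parameter costs callers nothing.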

Juha Nieminen

Oct 26, 2021, 4:40:43 AM
Racing...@watershipdown.co.uk wrote:
>>strings. They are just char arrays, with a zero byte as an element
>>that by convention indicates the final character. This causes a
>
> Wow, really? Who knew!

A lot of beginner C programmers don't.

And some not-so-beginner C programmers either. (Well, they do tend to know
about the trailing-zero-byte thing, but otherwise they may have a
surprisingly poor grasp of what a "string" in C actually is, and may
even think that a char* is a "string" (which it most definitely is not).)

Racing...@watershipdown.co.uk

Oct 26, 2021, 5:02:21 AM
On Tue, 26 Oct 2021 08:37:13 -0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
>Racing...@watershipdown.co.uk wrote:
>> Not really. They're just trying to make a case for const being useful in C.
>> I've yet to see that.
>
>It can catch errors where you accidentally try to modify the contents of,
>for example, a string literal.
>
>(This doesn't mean that you do like
> char* str = "hello"; str[0] = 'H';
>but it does mean that you might do like
> doSomething("hello");
>where that doSomething() actually modifies the data behind the pointer
>it's given.)

fenris$ cat t.c
#include <stdio.h>

void func(char *str)
{
    str[0] = 0;
}


int main()
{
    char *str = "hello";
    func(str);
    puts("Worked");
    return 0;
}
fenris$ cc t.c
fenris$ a.out
Bus error: 10
fenris$

Juha Nieminen

Oct 26, 2021, 7:15:14 AM
Racing...@watershipdown.co.uk wrote:
> fenris$ a.out
> Bus error: 10

For starters, that's in no way guaranteed to happen. Learn standard C.

Secondly, if you think that a runtime diagnostic is as good as a compile-time
diagnostic, then you have still a LOT to learn about software development.
The earlier in the development process that a bug can be caught, the better.
This is basic software development 101.

The writing-to-a-string-literal might happen only in some cases, not always.
For example, it could depend on the particular contents of some input
file, or a particular action by the user, a particular command line
parameter, or a myriad of other things that can vary from execution to
execution. In the worst case scenarios the error may happen sporadically
and without a clear pattern, which can make it extraordinarily difficult to
debug. Countless hours could be spent in trying to find such an elusive
and obscure bug.

All of which could have been avoided if you just used 'const' and
turned on compiler warnings, and paid attention to them.

There's literally zero reason not to use 'const' for pointers that
are not intended to be used to modify the values they are pointing to.

Bart

Oct 26, 2021, 7:40:14 AM
You've misunderstood then.

But * and [] types are modifiable:

char* s = "ABC";
puts(s);
*s = 'Z';

This shows ABC the first time it's executed. The second time it shows
ZBC; the code has changed the string literal! Where the same literal is
shared across the program, it will change the value of "ABC" everywhere.

This is on those implementations that don't put ABC into readonly memory
(eg. tcc, bcc, DMC, lcc, msvc). Ones like gcc and clang will crash.

You can't compare that with this:

char t[] = "ABC";
puts(t);
t[0] = 'Z';

Here, "ABC" is left unmolested. But the reason is because the
initialisation /copies/ the literal string to the array. So it modifies
a copy. The declaration of s directly points it to the literal.
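
The copy-versus-pointer distinction described above can be sketched as follows. This is an illustrative check only; array_is_a_copy is a hypothetical name, and the pointer is declared const so that nothing here writes to a literal:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if writing to an array initialised from "ABC" leaves the
   literal itself untouched. */
int array_is_a_copy(void)
{
    const char *s = "ABC";  /* s points at the literal itself */
    char t[] = "ABC";       /* t is a separate array initialised from it */

    /* The array holds the three letters plus the terminating zero byte. */
    assert(sizeof t == 4);

    t[0] = 'Z';             /* well defined: t is an ordinary array */

    return strcmp(t, "ZBC") == 0 && strcmp(s, "ABC") == 0;
}
```

Only the array is written to, so the behaviour is defined on every conforming implementation, regardless of where the literal is stored.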


Ben Bacarisse

Oct 26, 2021, 9:29:50 AM
Yes, RR has misunderstood (or is expressing the point in a confusing
way).

> But * and [] types are modifiable:

And this is bad wording. Some objects with pointer type are modifiable
and some are not. No objects with array types are modifiable. But in
fact you seem to be referring to the /target/ of pointer types (again,
some of which are modifiable and some are not) and to array /elements/
about which the same is also true. There is no general rule about "*
and [] types".

> char* s = "ABC";

This relies on a conversion that is valid (but unwise) in C and not
permitted in C++.

> puts(s);
> *s = 'Z';

This is undefined behaviour in both C and C++. The target of the
assignment (the first character of the string) is not a modifiable
object.

> This shows ABC the first time it's executed. The second time it shows
> ZBC; the code has changed the string literal!

It might show ABC again, or it may not get that far. Or, formally,
anything at all could happen.

> This is on those implementations that don't put ABC into readonly
> memory (eg. tcc, bcc, DMC, lcc, msvc). Ones like gcc and clang will
> crash.

It may vary depending on the command-line options, the platform and
compiler version. Talking about what "gcc" or "tcc" does is not very
helpful. Anyway, people should be encouraged to write, where possible,
code that does not depend on such things.

> You can't compare that with this:
>
> char t[] = "ABC";
> puts(t);
> t[0] = 'Z';
>
> Here, "ABC" is left unmolested. But the reason is because the
> initialisation /copies/ the literal string to the array. So it
> modifies a copy. The declaration of s directly points it to the
> literal.

Yes.

--
Ben.

Bart

Oct 26, 2021, 10:12:08 AM
It's a modification of what RR said.

> Some objects with pointer type are modifiable
> and some are not. No objects with array types are modifiable.

I don't know what you mean by that. Unless it is that you can't directly
assign to a whole array object at once; only an element at a time. Or,
going the other way, when the array is a member of a struct and you
assign to the whole struct.

(I know that you can't make a whole array const, only the elements.)

> But in
> fact you seem to be referring to the /target/ of pointer types (again,
> some of which are modifiable and some are not) and to array /elements/
> about which the same is also true. There is no general rule about "*
> and [] types".
>
>> char* s = "ABC";
>
> This relies on a conversion that is valid (but unwise) in C and not
> permitted in C++.

I tried it in C++ before posting (as I'd thought that "ABC" would have
type const char*) but it seemed to work. (Using -Wall -std=c++14.)


>> puts(s);
>> *s = 'Z';
>
> This is undefined behaviour in both C and C++. The target of the
> assignment (the first character of the string) is not a modifiable
> object.

>> This shows ABC the first time it's executed. The second time it shows
>> ZBC; the code has changed the string literal!
>
> It might show ABC again, or it may not get that far. Or, formally,
> anything at all could happen.

>> This is on those implementations that don't put ABC into readonly
>> memory (eg. tcc, bcc, DMC, lcc, msvc). Ones like gcc and clang will
>> crash.
>
> It may vary depending on the command-line options, the platform and
> compiler version. Talking about what "gcc" or "tcc" does is not very
> helpful. Anyway, people should be encouraged to write, where possible,
> code that does not depend on such things.

I'm writing about what is typically observed.

(I don't put string literals into a readonly segment because I haven't
got round to it yet.

It is surprising that a big compiler like MSVC doesn't do so either, but
apparently that's only done when optimising; rather odd.)

James Kuyper

Oct 26, 2021, 10:31:57 AM
On 10/26/21 4:18 AM, Racing...@watershipdown.co.uk wrote:
> On Mon, 25 Oct 2021 13:14:30 -0400
> James Kuyper <james...@alumni.caltech.edu> wrote:
>> On 10/25/21 12:14 PM, Racing...@watershipdown.co.uk wrote:
...
>>> ... It is implicit that its read only in C because
>>> C also provides the following initialisation which places the string
>>> (presumably) on the heap:
>>>
>>> char str[] = "hello world";
>>
>> Such code cannot result in the string being placed in read-only memory,
>> because it's perfectly legal to modify str. On the other hand, both of
>
> Yes, that was my point. [] means modifiable, * means read-only in every
> C implementation I've ever used.

Incorrect. In most declarations, [] means array, and * means pointer.
Neither one means "read only".
I think you may be thinking of a different fact that has nothing to do
with read-only memory. Within the scope of an identifier that identifies
an array, that identifier can only ever identify that particular array.
An identifier that identifies a pointer to an object type need not point
at any actual object, and unless it itself is declared const, can be
changed to point at a different object. But that difference between
arrays and pointers has nothing to do with read-only memory. The address
of a named array is not necessarily stored in any pointer - it is
normally hard-coded into the machine language instructions that refer to
the array, so the fact that you can't change that address is not because
the address is stored in read-only memory.

Exception 1: it's not permitted to declare functions that take arrays as
arguments, but it is permitted to declare a function parameter as if it
were an array. Such a declaration is automatically converted into a
declaration of a pointer to the element type of an array. Thus, the
following two function declarations are functionally identical, despite
being syntactically different:

void func(int array[]);
void func(int *ptr);
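
As a hedged illustration of Exception 1, the sketch below declares one function twice, once with array syntax and once with pointer syntax. A compiler accepts both as declarations of the same function, and writes through the parameter reach the caller's array. The name fill and its extra parameters are assumptions made for the example:

```c
#include <assert.h>
#include <stddef.h>

void fill(int array[], size_t n, int value);  /* array syntax */
void fill(int *ptr, size_t n, int value);     /* pointer syntax, same type */

void fill(int *ptr, size_t n, int value)
{
    size_t i;
    assert(ptr != NULL || n == 0);
    for (i = 0; i < n; ++i)
        ptr[i] = value;   /* writes land in the caller's array */
}

/* Returns 1 if the caller's array was modified through the parameter. */
int fill_demo(void)
{
    int a[3] = {0, 0, 0};
    fill(a, 3, 7);        /* a decays to a pointer to its first element */
    return a[0] == 7 && a[1] == 7 && a[2] == 7;
}
```

If the two prototypes declared different types, the second one would be rejected as a conflicting declaration.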

Exception 2: in a function parameter declaration, the construct [*]
marks the corresponding dimension of the relevant array as having a
variably modified type with an unknown length for that dimension. This
feature cannot be used in the defining declaration for a function,
because the function definition requires that the variable length be
explicitly specified. It is still an array, and not in any sense a
pointer (unless the relevant dimension is the top-most one, in which
case exception 1 described above also applies).
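
A small sketch of Exception 2, assuming a compiler that supports variably modified types (required in C99, optional from C11 onwards). The [*] form appears only in the prototype; the definition spells out the dimensions. sum2d is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

/* Prototype only: [*] stands for "some variable length, given later". */
long sum2d(size_t rows, size_t cols, int m[*][*]);

/* The definition must name the dimensions explicitly. The outermost
   dimension is still adjusted to a pointer, so m has type int (*)[cols]. */
long sum2d(size_t rows, size_t cols, int m[rows][cols])
{
    long total = 0;
    size_t r, c;
    assert(m != NULL);
    for (r = 0; r < rows; ++r)
        for (c = 0; c < cols; ++c)
            total += m[r][c];
    return total;
}

/* Returns 1 if the 2x3 grid sums to 21. */
int sum2d_demo(void)
{
    int grid[2][3] = { {1, 2, 3}, {4, 5, 6} };
    return sum2d(2, 3, grid) == 21;
}
```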

Any attempt to modify the contents of a string literal is undefined. Any
attempt to modify an object whose definition is const-qualified is also
undefined. Those facts permit, but do not require, that those objects be
stored in read-only memory.


Racing...@watershipdown.co.uk

Oct 26, 2021, 10:36:14 AM
On Tue, 26 Oct 2021 11:14:58 -0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
>Racing...@watershipdown.co.uk wrote:
>> fenris$ a.out
>> Bus error: 10
>
>For starters, that's in no way guaranteed to happen. Learn standard C.

It is on *nix and that's good enough for me.

>Secondly, if you think that a runtime diagnostic is as good as a compile-time
>diagnostic, then you have still a LOT to learn about software development.

All I'm saying is the bug would exhibit itself pretty quickly.

>All of which could have been avoided if you just used 'const' and
>turned on compiler warnings, and paid attention to them.

I always have warnings on, so const is not required.

Racing...@watershipdown.co.uk

Oct 26, 2021, 10:37:08 AM
On Tue, 26 Oct 2021 12:39:54 +0100
Bart <b...@freeuk.com> wrote:
>On 26/10/2021 09:18, Racing...@watershipdown.co.uk wrote:
>But * and [] types are modifiable:
>
> char* s = "ABC";
> puts(s);
> *s = 'Z';
>
>This shows ABC the first time it's executed. The second time it shows
>ZBC; the code has changed the string literal! Where the same literal is

I suggest you actually try running that code and see what happens.


James Kuyper

Oct 26, 2021, 10:42:39 AM
On 10/26/21 4:21 AM, Racing...@watershipdown.co.uk wrote:
> On Mon, 25 Oct 2021 10:48:57 -0700
> Keith Thompson <Keith.S.T...@gmail.com> wrote:
>> Racing...@watershipdown.co.uk writes:
>>> Any attempt to write to a read only program text area will result in a crash
>>> regardless of the language. It is implicit that its read only in C because
>>> C also provides the following initialisation which places the string
>>> (presumably) on the heap:
>>>
>>> char str[] = "hello world";
>>
>> I suggest that you would benefit more here from asking questions than
>>from making assertions.
>
> I suggest you ease up on being patronising.

You'll get less patronizing responses when you cease displaying such an
abysmal understanding of C, while believing you understand it better
than others.

>> That declaration does not place anything on the heap. The contents of
>> str is placed on the stack if it appears within a function definition,
>> or in the static data area if it appears outside a function definition.
>
> Wherever it's placed, the point is it's modifiable, unlike *str = which isn't.

You're right, but for the wrong reasons. It's true that *str isn't
modifiable, but that's not just because of the "*", it's because str has
been initialized to point at the first character of a string literal. It
could equally easily have been initialized to point at modifiable
memory. Nothing about the str itself makes it read-only.
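
To illustrate that it is the pointed-to object, not the "*", that decides whether a write is allowed, here is a minimal sketch in which a plain char * refers to a modifiable array. truncate_string is a hypothetical name standing in for a function that writes through its argument:

```c
#include <assert.h>
#include <stddef.h>

/* Writes through the pointer; valid only if the target is modifiable. */
void truncate_string(char *str)
{
    assert(str != NULL);
    str[0] = '\0';
}

/* Returns 1 if the write went through to the underlying array. */
int truncate_demo(void)
{
    char greeting[] = "hello";  /* modifiable array, initialised by copy */
    char *str = greeting;       /* same pointer type as before */

    truncate_string(str);       /* defined behaviour: target is an array */

    /* Only the first byte changed; the rest of the copy is intact. */
    return greeting[0] == '\0' && greeting[1] == 'e';
}
```

The declaration of str is unchanged apart from its initialiser, which is the only thing that differs from the case where it points at a string literal.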

Racing...@watershipdown.co.uk

Oct 26, 2021, 10:42:39 AM
On Tue, 26 Oct 2021 10:31:41 -0400
James Kuyper <james...@alumni.caltech.edu> wrote:
>On 10/26/21 4:18 AM, Racing...@watershipdown.co.uk wrote:
>> On Mon, 25 Oct 2021 13:14:30 -0400
>> James Kuyper <james...@alumni.caltech.edu> wrote:
>>> On 10/25/21 12:14 PM, Racing...@watershipdown.co.uk wrote:
>....
>>>> ... It is implicit that its read only in C because
>>>> C also provides the following initialisation which places the string
>>>> (presumably) on the heap:
>>>>
>>>> char str[] = "hello world";
>>>
>>> Such code cannot result in the string being placed in read-only memory,
>>> because it's perfectly legal to modify str. On the other hand, both of
>>
>> Yes, that was my point. [] means modifiable, * means read-only in every
>> C implementation I've ever used.
>
>Incorrect. In most declarations, [] means array, and * means pointer.
>Neither one means "read only".
>I think you may be thinking of a different fact that has nothing to do
>with read-only memory. Within the scope of an identifier that identifies

No I'm not. The pointer will be pointing to a string literal in the program
static text area which is usually non modifiable.

>Exception 1: it's not permitted to declare functions that take arrays as
>arguments,

Since when?

fenris$ cat t.c
#include <stdio.h>

void func(int a[2][3])
{
    printf("%d\n",a[1][2]);
}


int main()
{
    int a[2][3];
    a[1][2] = 123;
    func(a);
    return 0;
}
fenris$ cc t.c; a.out
123
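
For comparison, a hedged sketch of how a parameter written as int a[2][3] is treated: the compiler adjusts it to a pointer to a row, int (*)[3], which is why the two prototypes below are accepted as declaring the same function. last_cell is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

int last_cell(int a[2][3]);   /* array syntax, as in the program above */
int last_cell(int (*a)[3]);   /* the adjusted type: pointer to int[3] */

int last_cell(int (*a)[3])
{
    assert(a != NULL);
    return a[1][2];
}

/* Returns the value stored in the last cell, read via the parameter. */
int last_cell_demo(void)
{
    int a[2][3] = { {0, 0, 0}, {0, 0, 0} };
    a[1][2] = 123;
    return last_cell(a);      /* a decays to &a[0], of type int (*)[3] */
}
```

The call compiles and prints the expected value because a pointer to the first row is passed, not a copy of the array.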


James Kuyper

Oct 26, 2021, 10:43:24 AM
On 10/26/21 5:02 AM, Racing...@watershipdown.co.uk wrote:
...
> fenris$ cat t.c
> #include <stdio.h>
>
> void func(char *str)
> {
> str[0] = 0;
> }
>
>
> int main()
> {

Try changing the following line:
> char *str = "hello";

to
char greeting[] = "hello";
char *str = greeting;

> func(str);
> puts("Worked");
> return 0;
> }

You shouldn't get a bus error this time. Do you understand why?