
Is C ready to become a safer language?

266 views

Thiago Adams

Feb 7, 2024, 11:02:11 PM

Let's say C compilers can detect all sorts of bugs at compile time.

How would C compilers report that? As an error or a warning?

Let's use this sample:

int main() {
    int a = 1;
    a = a / 0;
}

GCC says:

warning: division by zero is undefined [-Wdivision-by-zero]
5 | a = a / 0;
| ^ ~

If GCC or any other compiler reported this as an error, C programmers
would likely complain. Am I right?

So, even if we had compile-time checks for bugs, C compilers and the
standard are not prepared to handle the implications to make C a safer
language.

From my point of view, we need an error, not a warning. But we also
need a way to ignore the error in case the programmer wants to see what
happens, with a division by zero, for instance. (Please note that this
topic IS NOT about this specific warning; it is just a sample.)

Warnings already work more or less like this. The problem with warnings
is that they are not standardized - neither their names/numbers nor the
way to enable and disable them.

So this is the first problem we need to solve in C to make the language
safer. We need a mechanism.

I also believe we can have a standard profile for warnings that will
change the compiler behaviour; for instance, making undefined behaviour
an error.

The C standard is complacent about the lack of error messages. Sometimes
it merely says "a message is encouraged...". This shifts responsibility
away from the language. But when someone says "C is dangerous," the
language as a whole is blamed, regardless of whether you are using
MISRA, for instance.

Thus, not only are the mechanics of the language unprepared, but the
standard is also not prepared to assume the responsibility of being a
source of guidance and safety.





Lawrence D'Oliveiro

Feb 7, 2024, 11:40:40 PM
On Thu, 8 Feb 2024 01:01:56 -0300, Thiago Adams wrote:

> So, even if we had compile-time checks for bugs, C compilers and the
> standard are not prepared to handle the implications to make C a safer
> language.

Do you want C to turn into Java? In Java, rules about reachability are
built into the language. And these rules are simplistic ones, based on
the state of the art back in the 1990s, not taking account of
improvements in compiler technology since then. For example, in the
following code, the uninitialized declaration of “Result” is a
compile-time error, even though a human looking at the code can figure
out that there is no way it will be left uninitialized at the point of
reference:

char Result; /* not allowed! */
boolean LastWasBackslash = false;
for (;;)
  {
    ++ColNr;
    final int ich = Input.read();
    if (ich < 0)
      {
        --ColNr;
        EOF = true;
        Result = '\n';
        break;
      }
    else if (LastWasCR && (char)ich == '\012')
      {
        /* skip LF following CR */
        --ColNr;
        LastWasCR = false;
        LastWasBackslash = false;
      }
    else if (!LastWasBackslash && (char)ich == '\\')
      {
        LastWasBackslash = true;
        LastWasCR = false;
      }
    else if (!LastWasBackslash || !IsEOL((char)ich))
      {
        Result = (char)ich;
        break;
      }
    else
      {
        ++LineNr;
        ColNr = 0;
        LastWasCR = (char)ich == '\015';
        LastWasBackslash = false;
      } /*if*/
  } /*for*/
EOL = EOF || IsEOL(Result);
LastWasCR = Result == '\015';

Kaz Kylheku

Feb 7, 2024, 11:58:41 PM
On 2024-02-08, Thiago Adams <thiago...@gmail.com> wrote:
> Let's say C compilers can detect all sorts of bugs at compile time.
>
> How would C compilers report that? As an error or a warning?

ISO C doesn't distinguish "error" from "warning"; it speaks only about
diagnostics. It requires diagnostics for certain situations.

In compiler (not ISO C) parlance, we might understand "error" to be a
situation when a diagnostic is issued, and the implementation terminates
the translation, so that a translated program or translation unit is not
produced.

A "warning" is any other situation when a diagnostic is issued and
translation continues.

(These concepts extend into run-time. There could be a run-time diagnostic
which doesn't terminate the program, and one which does.)

With these definitions ...

> Let's use this sample:
>
> int main() {
> int a = 1;
> a = a / 0;
> }
>
> GCC says:
>
> warning: division by zero is undefined [-Wdivision-by-zero]
> 5 | a = a / 0;
> | ^ ~
> In case GCC or any other compiler reports this as an error, then C
> programmers would likely complain. Am I right?

They might.

A programmer taking a standard-based view might remark that
the program is not required to be diagnosed by ISO C.

It does invoke undefined behavior if executed, though.

Based on the deduction that the program has unconditional undefined
behavior, it is legitimate to terminate translating the program with a
diagnostic, since that is a possible consequence of undefined behavior.

> So, even if we had compile-time checks for bugs, C compilers and the
> standard are not prepared to handle the implications to make C a safer
> language.
>
> From my point of view, we need an error, not a warning.

Implementations are free to implement arbitrary diagnostics, and also
nonconforming modes of translation, which stop translating a program
that does not violate any ISO C syntax or constraint rule.

Thus compilers can provide tools to help programmers enforce
rules that don't exist in the language as such.

> But we also
> need a way to ignore the error in case the programmer wants to see what
> happens, with a division by zero, for instance. (Please note that this
> topic IS NOT about this specific warning; it is just a sample.)
>
> Warnings work more or less like this. The problem with warnings is that
> they are not standardized - the "name/number" and the way to
> disable/enable them.
>
> So this is the first problem we need to solve in C to make the language
> safer. We need a mechanism.
>
> I also believe we can have a standard profile for warnings that will
> change the compiler behaviour; for instance, making undefined behaviour
> an error.
>
> The C standard is complacent with the lack of error messages. Sometimes
> it says "message is encouraged...". This shifts the responsibility from

I don't believe so. The C standard absolutely requires diagnostics
for certain situations. Everywhere else, it doesn't.

I don't remember seeing any text "encouraging" a diagnostic.
That's ambiguous: is it really required or not?

> the language. But when someone says "C is dangerous," the language as a
> whole is blamed, regardless of whether you are using MISRA, for instance.
>
> Thus, not only are the mechanics of the language is unprepared, but the
> standard is also not prepared to assume the responsibility of being a
> source of guidance and safety.

The standard specifies which constructs are required to do what under
what conditions, and those situations are safe.

Sometimes "do what" is unspecified (the implementation chooses from a
range of safe behaviors) or implementation-defined (similar, but the
implementation also documents the choice).

The standard also specifies that some situations must be diagnosed,
like syntax and type errors and other constraint rule violations.

Everything else is undefined behavior (some of it potentially
defined by the implementation as a "documented extension").

Avoidance of undefined behavior is left to the careful design
of the program, and whatever tools the implementation and third parties
provide for detecting undefined behavior.

Some languages, like Common Lisp, provide a notion of safety level.
Over individual expressions in Lisp, we can declare optimization
parameters: safety (0-3) and speed (0-3). We can also declare facts to
the Common Lisp compiler like types of operands and results. When we
lie to the compiler, the behavior becomes undefined, but the situation
is nuanced. Safe code diagnoses errors. If we tell the Lisp compiler
that some X is a cons cell, and then access (car X), but at run time, X
turns out to be a string, an error will be signaled. However, if we
compile the code with safety 0, all bets are off: the machine code may
blindly access the string object as if it were a cons cell, with
disastrous consequences.

C could benefit from an approach along these lines. The big problem is
that in C it's hard or impossible to make many undefined behaviors
safe (as in detect them and abort the program).

For instance, there is no way to tell whether a pointer is valid,
or how large an array it points to.

Lisp is a safe, dynamic language first, and an aggressively optimized
language second. It's easy to tell that (car X) is accessing a string
and not a cons cell thanks to the run-time information in the objects.

It's easier to strip away safety from a safe language and generate
unsafe code that works with lower-level machine types than to introduce
safety into a machine-oriented language, because the data
representations don't accommodate the needed run-time bits.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazi...@mstdn.ca

Keith Thompson

Feb 8, 2024, 12:00:04 AM
Thiago Adams <thiago...@gmail.com> writes:
> Let's say C compilers can detect all sorts of bugs at compile time.
>
> How would C compilers report that? As an error or a warning?
>
> Let's use this sample:
>
> int main() {
> int a = 1;
> a = a / 0;
> }
>
> GCC says:
>
> warning: division by zero is undefined [-Wdivision-by-zero]
> 5 | a = a / 0;
> | ^ ~
>
> In case GCC or any other compiler reports this as an error, then C
> programmers would likely complain. Am I right?

Someone will always complain, but a conforming compiler can report this
as a fatal error.

Division by zero has undefined behavior. The standard's definition
of undefined behavior says:

NOTE

Possible undefined behavior ranges from ignoring the situation
completely with unpredictable results, to behaving during
translation or program execution in a documented manner
characteristic of the environment (with or without the issuance
of a diagnostic message), to terminating a translation or
execution (with the issuance of a diagnostic message).

Though it's not quite that simple. Rejecting the program if the
compiler can't prove that the division will be executed would IMHO
be non-conforming. Code that's never executed has no behavior, so it
doesn't have undefined behavior.

But of course any compiler can reject anything it likes in
non-conforming mode. See for example "gcc -Werror".

But even ignoring that, a culture of paying very close attention to
non-fatal warnings could go a long way towards making C safer (assuming
compilers are clever enough to issue good warnings).

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Medtronic
void Void(void) { Void(); } /* The recursive call of the void */

Kaz Kylheku

Feb 8, 2024, 12:00:25 AM
On 2024-02-08, Lawrence D'Oliveiro <l...@nz.invalid> wrote:
> On Thu, 8 Feb 2024 01:01:56 -0300, Thiago Adams wrote:
>
>> So, even if we had compile-time checks for bugs, C compilers and the
>> standard are not prepared to handle the implications to make C a safer
>> language.
>
> Do you want C to turn into Java? In Java, rules about reachability are
> built into the language. And these rules are simplistic ones, based on
> the state of the art back in the 1990s, not taking account of
> improvements in compiler technology since then. For example, in the
> following code, the uninitialized declaration of “Result” is a
> compile-time error, even though a human looking at the code can figure
> out that there is no way it will be left uninitialized at the point of
> reference:

Because C doesn't mandate such a warning, GCC was able to rid itself
of naive, badly implemented diagnostics in this area.

I have some old installations of GCC which still warn about some of my
code (that some variable might be used uninitialized). Newer compilers
are silent on that code, due to doing better analysis.

Keith Thompson

Feb 8, 2024, 12:33:20 AM
Kaz Kylheku <433-92...@kylheku.com> writes:
> On 2024-02-08, Thiago Adams <thiago...@gmail.com> wrote:
[...]
>> The C standard is complacent with the lack of error messages. Sometimes
>> it says "message is encouraged...". This shifts the responsibility from
>
> I don't believe so. The C standard absolutely requires diagnostics
> for certain situations. Everywhere else, it doesn't.
>
> I don't remember seeing any text "encouraging" a diagnostic.
> That's ambiguous: is it really required or not?

Search for "Recommended practice" in the standard. For example,
N1570 6.7.4p9:

Recommended practice

The implementation should produce a diagnostic message for a
function declared with a _Noreturn function specifier that appears
to be capable of returning to its caller.

David Brown

Feb 8, 2024, 3:18:00 AM
On 08/02/2024 05:59, Keith Thompson wrote:
> Thiago Adams <thiago...@gmail.com> writes:
>> Let's say C compilers can detect all sorts of bugs at compile time.
>>
>> How would C compilers report that? As an error or a warning?
>>
>> Let's use this sample:
>>
>> int main() {
>> int a = 1;
>> a = a / 0;
>> }
>>
>> GCC says:
>>
>> warning: division by zero is undefined [-Wdivision-by-zero]
>> 5 | a = a / 0;
>> | ^ ~
>>
>> In case GCC or any other compiler reports this as an error, then C
>> programmers would likely complain. Am I right?
>
> Someone will always complain, but a conforming compiler can report this
> as a fatal error.
>

I'm not /entirely/ convinced. Such code is only undefined behaviour at
run-time, I believe. A compiler could reject the code (give a fatal
error) if it is sure that this will be reached when the code is run.
But can it be sure of that, even if it is in "main()" ? Freestanding
implementations don't need to run "main()" (not all my programs have had
a "main()" function), and the freestanding/hosted implementation choice
is a matter of the implementation, not just the compiler.


> Division by zero has undefine behavior. Under the standard's definition
> of undefined behavior, it says:
>
> NOTE
>
> Possible undefined behavior ranges from ignoring the situation
> completely with unpredictable results, to behaving during
> translation or program execution in a documented manner
> characteristic of the environment (with or without the issuance
> of a diagnostic message), to terminating a translation or
> execution (with the issuance of a diagnostic message).
>
> Though it's not quite that simple. Rejecting the program if the
> compiler can't prove that the division will be executed would IMHO
> be non-conforming. Code that's never executed has no behavior, so it
> doesn't have undefined behavior.
>

Indeed.

In particular, someone could write :

static inline void undefined_behaviour(void) {
    1 / 0;
}

and use that for a "can never happen" indicator for compiler optimisation :

int foo(int x) {
    if (x < 0) undefined_behaviour();
    // From here on, the compiler can optimise using the
    // assumption that x >= 0
    ...
}

gcc and clang have __builtin_unreachable() for this purpose - "calling"
that function is always run-time undefined behaviour.

> But of course any compiler can reject anything it likes in
> non-conforming mode. See for example "gcc -Werror".
>

Yes. Once I have a project roughly in shape, I always enable that so
that I don't miss any warnings.

> But even ignoring that, a culture of paying very close attention to
> non-fatal warnings could go a long way towards making C safer (assuming
> compilers are clever enough to issue good warnings).
>

Yes.

I think it would be better if compilers were stricter in their default
modes, even if their stricter compliant modes have to reduce the
severity of some warnings. People should be allowed to write "weird"
code that looks very much like an error - but perhaps it is those people
who should need extra flags or other effort, not those that write
"normal" code and want to catch their bugs. (I know this can't be done
for tools like gcc, because it could cause problems with existing code.)

Still, I am happy to see that the latest gcc trunk has made some pre-C99
features into fatal errors instead of warnings - implicit function
declarations are now fatal errors.




Thiago Adams

Feb 8, 2024, 7:20:53 AM
On 2/8/2024 1:40 AM, Lawrence D'Oliveiro wrote:
> On Thu, 8 Feb 2024 01:01:56 -0300, Thiago Adams wrote:
>
>> So, even if we had compile-time checks for bugs, C compilers and the
>> standard are not prepared to handle the implications to make C a safer
>> language.
>
> Do you want C to turn into Java?

No. In this topic I am assuming that compilers can catch all sorts
of bugs at compile time.
With this assumption in mind, my point is that the standard is not
prepared to handle the implications.

Because:

1 - There is no universal mechanism to enable/disable warnings. Each
compiler does it in a different way using pragmas, and there is no
guidance in the standard, such as a rule number, for instance.

2 - Because C avoids breaking code, it also needs something like a
warning profile.

3 - Maybe the standard is not prepared to assume the task of being a
central point of guidance about safety.


Another sample:


int main()
{
    int a[2];
    a[3] = 1;
}



GCC says

warning: array index 3 is past the end of the array (that has type
'int[2]') [-Warray-bounds]
5 | a[3] = 1;
| ^ ~


One more:

struct X {double d;};
struct Y {int i;};
int main()
{
    struct X x;
    struct Y *p;
    p = &x;
}

GCC says:

warning: incompatible pointer types assigning to 'struct Y *' from
'struct X *' [-Wincompatible-pointer-types]
7 | p = &x;
| ^ ~~

For this one a C++ compiler gives:

<source>:7:6: error: cannot convert 'X*' to 'Y*' in assignment
7 | p = &x;


So, this may also be related to the "spirit of C". I like the idea
that the programmer can do whatever they want.
BUT, I think by default the compiler must complain.

Then I think the standard could give better guidance about safety
defaults, and, to avoid breaking existing code, it could define
safety profiles.

I would also like to note that languages with one main implementation,
like Rust, do not have this problem, because the specification is more
or less the behaviour of that one compiler.

This will never be the case for C, so I am wondering how the standard
could be written to, at the same time:

- keep compiling existing code
- give guidance and safety guarantees
- make code safe by default

Without this, the C language remains abstract about safety.

Consider the question

"Is C language safe?"

The answer will be: well, the language itself is very vague; it depends
on the compiler you use.

The other problem is that we may need annotations, and maybe other
changes to the type system. C23 now has attributes.

My point is also that safety cannot be on the compiler side only,
because each compiler can have different annotations.





Thiago Adams

Feb 8, 2024, 7:49:01 AM
On 2/8/2024 5:17 AM, David Brown wrote:
...
> I think it would be better if compilers were stricter in their default
> modes, even if their stricter compliant modes have to reduce the
> severity of some warnings.  People should be allowed to write "weird"
> code that looks very much like an error - but perhaps it is those people
> who should need extra flags or other effort, not those that write
> "normal" code and want to catch their bugs.  (I know this can't be done
> for tools like gcc, because it could cause problems with existing code.)


Yes this is my point.
But I believe we need a standard mechanism, and this is the first step
towards safety in C.

Consider this code.

int main(void)
{
    int a = 1, b = 2;

#ifdef _MSC_VER
#pragma warning( push )
#pragma warning( disable : 4706 )
#else
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wparentheses"
#endif

    if (a = b){}

#ifdef _MSC_VER
#pragma warning( pop )
#else
#pragma GCC diagnostic pop
#endif
}

This code wants to use a = b inside the if condition.
The code shows how to disable the warning in GCC and MSVC.

If we had a standard number for the warning, and a standard mechanism
for disabling it, then we could have something like:

int main(void)
{
    int a = 1, b = 2;

    if (a = b) [[disable:4706]]
    {
    }
}

Maybe

int main(void)
{
    int a = 1, b = 2;

    if (a = b) _ignore("I want to assign here...")
    {
    }
}

That would apply to any warning on that specific line.
The advantage is that no warning ID is necessary.



Malcolm McLean

Feb 8, 2024, 9:42:38 AM
Which is a huge advantage. The first proposal creates something which is
a complete mystery to anyone who doesn't intimately know what has been
done. The second anyone can easily understand.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm

David Brown

Feb 8, 2024, 10:26:06 AM
Standard numbers would be a /really/ bad idea - standard names would be
vastly better. There are already a couple of attributes that
standardise manipulation of warnings - [[nodiscard]] and
[[maybe_unused]]. But that syntax is not scalable. Perhaps :

[[ignored(parentheses)]]
[[warn(parentheses)]]
[[error(parentheses)]]


>
> int main(void)
> {
>     int a = 1, b = 2;
>
>     if (a = b) [[disable:4706]]
>     {
>     }
> }
>
> Maybe
>
> int main(void)
> {
>     int a = 1, b = 2;
>
>     if (a = b) _ignore("I want to assign here...")
>     {
>     }
> }
>
> That is applied to any warning on that specific line.
> The advantage is the warning ID is not necessary
>

The disadvantage then is that it affects all warnings, not only the
ones you know are safe to ignore. And anything that applies to a
specific line is a non-starter for C - you would have to attach it to a
statement or other syntactic unit.



Richard Kettlewell

Feb 8, 2024, 10:37:39 AM
Thiago Adams <thiago...@gmail.com> writes:
> From my point of view, we need an error, not a warning. But we also
> need a way to ignore the error in case the programmer wants to see
> what happens, with a division by zero, for instance. (Please note that
> this topic IS NOT about this specific warning; it is just a sample.)

All this already exists. Compilers warn about many possible or actual
problems (such as your example) and the warnings can be selectively or
globally turned into errors. For example see:
https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Warning-Options.html#index-Werror

> Warnings work more or less like this. The problem with warnings is
> that they are not standardized - the "name/number" and the way to
> disable/enable them.

Having to configure warnings and errors separately for each compiler in
a multi-platform environment is a tiny cost compared to the cost of
developing and maintaining cross-platform source code, multiple build
platforms, multiple test platforms, etc. So I don’t think there’s much
benefit to be had from standardization here. Historically, the addition
of useful warning options to compilers has outpaced the development of
the C standard in any case.

> So this is the first problem we need to solve in C to make the
> language safer. We need a mechanism.
>
> I also believe we can have a standard profile for warnings that will
> change the compiler behaviour; for instance, making undefined
> behaviour an error.

This can’t be done completely at compile time, at least not without an
unacceptably high level of false positives.

Tools do exist to detect undefined behavior at runtime and terminate the
program, though. A couple of examples are:

https://clang.llvm.org/docs/MemorySanitizer.html
https://clang.llvm.org/docs/AddressSanitizer.html

However there are severe limitations:
* Coverage is only partial.
* Performance is impacted.
* Some of the tools are unsuited to production environments, see e.g.
https://www.openwall.com/lists/oss-security/2016/02/17/9

You could also look at (partial) hardware-based responses to the
undefined behavior problem such as
https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/

--
https://www.greenend.org.uk/rjk/

Keith Thompson

Feb 8, 2024, 11:04:24 AM
In a freestanding implementation, main() *might* be just another
function. In that case a compiler can't prove that the code will be
invoked.

I was assuming a hosted implementation -- and the compiler knows whether
its implementation is hosted or freestanding.

Malcolm McLean

Feb 8, 2024, 11:15:09 AM
Well, sometimes it can. The boot routine or program entry point is by
definition always invoked, and you can generally prove that at least
some code is always reached from it. In general, though, this is the
halting problem: you cannot decide for every case whether a given piece
of code must be reached or not, regardless of runtime inputs.

Thiago Adams

Feb 8, 2024, 11:15:24 AM
I was having a look at the C# specification. It uses pragmas.

https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/preprocessor-directives#nullable-context

Sample
#pragma warning disable 414, CS3021
#pragma warning restore CS3021

It also has warning numbers; I am not sure whether other C# compilers
use the same warning IDs.





Thiago Adams

Feb 8, 2024, 11:25:37 AM
On 2/8/2024 12:37 PM, Richard Kettlewell wrote:
> Thiago Adams <thiago...@gmail.com> writes:
>> From my point of view, we need an error, not a warning. But we also
>> need a way to ignore the error in case the programmer wants to see
>> what happens, with a division by zero, for instance. (Please note that
>> this topic IS NOT about this specific warning; it is just a sample.)
>
> All this already exists. Compilers warn about many possible or actual
> problems (such as your example) and the warnings can be selectively or
> globally turned into errors. For example see:
> https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Warning-Options.html#index-Werror

It is not standard, which makes safety not portable.

>> Warnings work more or less like this. The problem with warnings is
>> that they are not standardized - the "name/number" and the way to
>> disable/enable them.
>
> Having to configure warnings and errors separately for each compiler in
> a multi-platform environment is a tiny cost compared to the cost of
> developing and maintaining cross-platform source code, multiple build
> platforms, multiple test platforms, etc. So I don’t think there’s much
> benefit to be hand from standardization here. Historically the addition
> of useful warning options to compilers has outpaced the development of
> the C standard in any case.

See my sample comparing the GCC and MSVC pragmas to disable warnings.
I believe that code can be improved.
>> So this is the first problem we need to solve in C to make the
>> language safer. We need a mechanism.
>>
>> I also believe we can have a standard profile for warnings that will
>> change the compiler behaviour; for instance, making undefined
>> behaviour an error.
>
> This can’t be done completely at compile time, at least not without an
> unacceptably high level of false positives.

For now I am just assuming it can; then we can focus on the mechanism
of control.
But since you pointed out the problems: going further, and comparing
with Rust for instance, the static analysis will require flow analysis.
Imagine having to specify how flow analysis should behave in all
compilers - this is a problem that Rust does not have.
Basically, it would be a specification for static analysis, not only
for the language. It could be a separate document.
What are the benefits? I believe the C language as a whole would send a
new and important message about safety.





bart

Feb 8, 2024, 11:31:03 AM
On 08/02/2024 04:01, Thiago Adams wrote:
>
> Let's say C compilers can detect all sorts of bugs at compile time.
>
> How would C compilers report that? As an error or a warning?
>
> Let's use this sample:
>
> int main() {
>     int a = 1;
>     a = a / 0;
> }
>
> GCC says:
>
> warning: division by zero is undefined [-Wdivision-by-zero]
>     5 |  a = a / 0;
>       |        ^ ~
>
> In case GCC or any other compiler reports this as an error, then C
> programmers would likely complain. Am I right?
>
> So, even if we had compile-time checks for bugs, C compilers and the
> standard are not prepared to handle the implications to make C a safer
> language.
>
> From my point of view, we need an error, not a warning. But we also
> need a way to ignore the error in case the programmer wants to see what
> happens, with a division by zero, for instance.

Replace the 0 with a variable containing zero at runtime.

> (Please note that this
> topic IS NOT about this specific warning; it is just a sample.)
>
> Warnings work more or less like this. The problem with warnings is that
> they are not standardized - the "name/number" and the way to
> disable/enable them.
>
> So this is the first problem we need to solve in C to make the language
> safer. We need a mechanism.
>
> I also believe we can have a standard profile for warnings that will
> change the compiler behaviour; for instance, making undefined behaviour
> an error.
>
> The C standard is complacent with the lack of error messages. Sometimes
> it says "message is encouraged...". This shifts the responsibility from
> the language. But when someone says "C is dangerous," the language as a
> whole is blamed, regardless of whether you are using MISRA, for instance.
>
> Thus, not only are the mechanics of the language is unprepared, but the
> standard is also not prepared to assume the responsibility of being a
> source of guidance and safety.

This is something which has long been of fascination to me: how exactly
do you get a C compiler to actually fail a program with a hard error
when there is obviously something wrong, while not also failing on
completely harmless matters.

So, taking gcc as an example, the same program may:

* Pass, and generate a runnable binary

* Warn, and generate a runnable binary still

* Fail with a hard error.

But the latter is unusual; you might get it with a syntax error for example.

The compiler behaviour you get depends on the options that are
supplied, which in turn depend on the programmer: THEY get to choose
whether a program passes or not - something that you might expect to be
the compiler's job!

The compiler power-users here will have their own set-ups that define
which classes of programs will pass or fail. But I think compilers
should do that out of the box.


--------------------------------------------------

c:\c>type c.c
int main(void) {
    int a, b;
    a=b/0;
}

c:\c>tcc c.c

c:\c>gcc c.c
c.c: In function 'main':
c.c:3:5: warning: division by zero [-Wdiv-by-zero]
3 | a=b/0;
| ^

c:\c>mcc c
Compiling c.c to c.exe
Proc: main
MCL Error: Divide by zero Line:3 in:c.c







Keith Thompson

Feb 8, 2024, 11:33:04 AM
Malcolm McLean <malcolm.ar...@gmail.com> writes:
> On 08/02/2024 16:04, Keith Thompson wrote:
[...]
>> In a freestanding implementation, main() *might* be just another
>> function. In that case a compiler can't prove that the code will be
>> invoked.
>>
> Well sometimes it can. The boot routine or program entry point is by
> defintion always invoked, and you can generally prove that at least
> some code is always reached from that.

That's not the case in the example being discussed, which was an
unconditional division by zero in main().

> However it is the halting
> problem, and you can never prove for all cases, even if the code must
> be reached or not reached regardless of runtime inputs.

The halting problem is hardly relevant. A compiler *may* reject a
program if it can prove that undefined behavior will always occur.
It's under no obligation to do so, or even to diagnose it.

bart

Feb 8, 2024, 12:02:59 PM
On 08/02/2024 12:20, Thiago Adams wrote:

> One more:
>
> struct X {double d;};
> struct Y {int i;};
> int main()
> {
>  struct X x;
>  struct Y *p;
>  p = &x;
> }
>
> GCC says:
>
> warning: incompatible pointer types assigning to 'struct Y *' from
> 'struct X *' [-Wincompatible-pointer-types]
>     7 |  p = &x;
>       |    ^ ~~
>
> For this one a C++ compiler gives:
>
> <source>:7:6: error: cannot convert 'X*' to 'Y*' in assignment
>     7 |  p = &x;
>
>
> So, this may be also related with the "spirit of C". I like the idea of
> the programmer can do whatever they want.

They can, they just have to write it like this:

p = (struct Y *)&x;

or, with a common extension:

p = (typeof(p))&x;

(or, in my language, it can be just 'p := cast(&x)'. Without a cast, it

> Consider the question
>
> "Is C language safe?"
>
> The answer  will be , well, the language itself is very vague, depends
> of the compiler you use.


And the options you provide that define the dialect of C and its strictness.

But even if compilers were strict by default, too many things in C are
unsafe, but still legal. That's due to the language design which is not
going to change.
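
It is worth spelling out why the cast is legal but still unsafe: dereferencing `(struct Y *)&x` violates the effective-type ("strict aliasing") rules, so the write compiles and yet has undefined behaviour. The well-defined way to inspect another type's bytes is `memcpy`. A sketch, with a function name of my own:

```c
#include <string.h>

struct X { double d; };
struct Y { int i; };

/* Reads the first sizeof(int) bytes of a struct X, without the
   undefined behaviour of dereferencing a casted struct Y
   pointer. memcpy is defined for the bytes of any object. */
int first_int_of_x(const struct X *x)
{
    int i;
    memcpy(&i, x, sizeof i);   /* sizeof(int) <= sizeof(double) here */
    return i;
}
```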


Thiago Adams

Feb 8, 2024, 12:18:13 PM
Em 2/8/2024 2:02 PM, bart escreveu:
> On 08/02/2024 12:20, Thiago Adams wrote:
>
>> One more:
>>
>> struct X {double d;};
>> struct Y {int i;};
>> int main()
>> {
>>   struct X x;
>>   struct Y *p;
>>   p = &x;
>> }
>>
>> GCC says:
>>
>> warning: incompatible pointer types assigning to 'struct Y *' from
>> 'struct X *' [-Wincompatible-pointer-types]
>>      7 |  p = &x;
>>        |    ^ ~~
>>
>> For this one a C++ compiler gives:
>>
>> <source>:7:6: error: cannot convert 'X*' to 'Y*' in assignment
>>      7 |  p = &x;
>>
>>
>> So, this may be also related with the "spirit of C". I like the idea
>> of the programmer can do whatever they want.
>
> They can, they just have to write it like this:
>
>    p = (struct Y *)&x;
>
> or, with a common extension:
>
>    p = (typeof(p))&x;
>
> (or, in my language, it can be just 'p := cast(&x)'. Without a cast, it

Here casts are a very good solution. Another sample is when mixing enum
types.


>> Consider the question
>>
>> "Is C language safe?"
>>
>> The answer  will be , well, the language itself is very vague, depends
>> of the compiler you use.
>
>
> And the options you provide that define the dialect of C and its
> strictness.
>
> But even if compilers were strict by default, too many things in C are
> unsafe, but still legal. That's due to the language design which is not
> going to change.
>

This depends on how strict you want to be.

Another sample

void f(int a[]){
a[2] = 1;
}

I wish this code would not compile if in strict mode. Because it
requires inter-procedural analysis.

But with this change:

void f(int a[3]){
a[2] = 1;
}
it should compile.
This is similar to the cast: the language already has a mechanism for
talking to the static analyser.

With the progress of static analysis we need more. If we just leave the
problem for static analysers to solve, then safety will not be portable.
Each one will have its own warnings, annotations, etc.

What is the point of the C standard? To make the code portable? But
shouldn't the code's safety guarantees be portable too?






Thiago Adams

Feb 8, 2024, 12:20:04 PM
Em 2/8/2024 2:17 PM, Thiago Adams escreveu:
...
> Another sample
>
> void f(int a[]){
>   a[2] = 1;
> }
>
> I wish this code would not compile if in strict mode. Because it
> requires inter-procedural analysis.

The other reason is that the implementation may not be available.
But the interface of the function allows a check at each call site.


Keith Thompson

Feb 8, 2024, 12:46:10 PM
Thiago Adams <thiago...@gmail.com> writes:
[...]
> One more:
>
> struct X {double d;};
> struct Y {int i;};
> int main()
> {
> struct X x;
> struct Y *p;
> p = &x;
> }
>
> GCC says:
>
> warning: incompatible pointer types assigning to 'struct Y *' from
> 'struct X *' [-Wincompatible-pointer-types]
> 7 | p = &x;
> | ^ ~~
>
> For this one a C++ compiler gives:
>
> <source>:7:6: error: cannot convert 'X*' to 'Y*' in assignment
> 7 | p = &x;
>
>
> So, this may be also related with the "spirit of C". I like the idea
> of the programmer can do whatever they want.
> BUT, I think my default the compiler must complain.

That's simply a constraint violation; there is no implicit conversion
from struct X* to struct Y*. A conforming compiler must diagnose it
(and *may* treat it as a fatal error). (I believe the C++ rules are
equivalent. g++ just happens to be more strict than gcc in this case.)

gcc rejects it with "-pedantic-errors".

[...]

Thiago Adams

Feb 8, 2024, 12:57:07 PM
This sample shows a different mindset between C and C++ compilers.
I am suggesting a "safe mindset" for C, which is basically a profile
for warnings.

Another one.

int * f() {
int i = 0;
return &i;
}

int main(){
int* p = f();
}

<source>:3:12: warning: function returns address of local variable
[-Wreturn-local-addr]
3 | return &i;
| ^~

The mindset is "well... this may be an error..."
I wish for a new mindset:

"this code seems wrong and I will not compile this until you explicitly
annotate why you need this"

The idea that C programmers can do whatever they want is still valid,
but not by default in a safer profile.
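
For completeness, the two well-defined ways to get a value out of a function like `f` above (the function names here are mine, not from the thread): return by value, or give the result storage that outlives the call.

```c
#include <stdlib.h>

/* Alternative 1: return the value itself; nothing dangles. */
int by_value(void)
{
    int i = 42;
    return i;
}

/* Alternative 2: heap storage outlives the function; the caller
   owns the pointer and must free() it. */
int *caller_owned(void)
{
    int *p = malloc(sizeof *p);
    if (p)
        *p = 42;
    return p;
}
```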


Another one

enum E1{A};
enum E2{B};
int main(){
if (A == B){}
}

<source>:4:9: warning: comparison between 'enum E1' and 'enum E2'
[-Wenum-compare]
4 | if (A == B){}

In C++ it is a warning as well.

enum E1{A};
enum E2{B};
int main(){
if (A == B){}
E1 a = B; //error in C++
}


enum E1{A};
enum E2{B};
int main(){
if (A == B){}
enum E1 a = B; //OK in C
}


(I know in C enumerators are int,
but static analysis can handle that internally)
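
The point that C enumerators are plain ints is easy to demonstrate, and one portable (if heavyweight) way to get stricter C++-style behaviour today is to wrap values in distinct struct types, since structs are never implicitly convertible. The wrapper names below are hypothetical:

```c
enum E1 { A };
enum E2 { B };

/* A and B are both ints with value 0, so mixing them is legal C. */
int enums_compare_equal(void)
{
    return A == B;
}

/* Distinct struct types make accidental mixing a constraint
   violation: same_colour((Fruit){0}, (Colour){0}) will not compile. */
typedef struct { int v; } Colour;
typedef struct { int v; } Fruit;

int same_colour(Colour a, Colour b)
{
    return a.v == b.v;
}
```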





Keith Thompson

Feb 8, 2024, 12:57:39 PM
Thiago Adams <thiago...@gmail.com> writes:
[...]
> Here casts are a very good solution. Another sample is when mixing
> enum types.

No casts are required for assignments between different enum types, or
between enums and integers. Stricter enums can be a nice feature, but C
doesn't have them. (C++ does.)

[...]

> This depends on how strict you want to be.
>
> Another sample
>
> void f(int a[]){
> a[2] = 1;
> }
>
> I wish this code would not compile if in strict mode. Because it
> requires inter-procedural analysis.

I think you mean either that you wish C allowed that sample to be
rejected or that your C compiler had the ability to reject it. In fact
it's valid C code (which *might* have undefined behavior depending on
what's passed by the caller).

> But with this change:
>
> void f(int a[3]){
> a[2] = 1;
> }
> it should compile.

Certainly. But all three of these:
void f(int a[])
void f(int a[3])
void f(int *a)

are exactly equivalent according to the C standard. (The fact that the
3 is ignored in the second example is IMHO unfortunate.)

A compiler can take notice of the [3] for the purpose of enabling or
disabling warnings, but there's no such requirement. And since
void f(int a[static 3])
can be used to assert that a points to the initial element of an array
with at least 3 elements, special handling for
void f(int a[3])
seems unlikely.
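
The adjustment described above is directly observable: inside the function the parameter has pointer type no matter which spelling is used, which is why the `3` carries no semantic weight. A small demonstration (gcc and clang will even warn about the `sizeof` here):

```c
#include <stddef.h>

/* Despite the [3], the parameter is adjusted to 'int *', so
   sizeof a yields the size of a pointer, not of three ints. */
size_t param_size(int a[3])
{
    return sizeof a;
}
```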

David Brown

Feb 8, 2024, 3:43:12 PM
gcc and clang both accept #pragmas for controlling warnings. These
could be standardised as an alternative to (or in addition to) attributes.

> It also have warning numbers, I am not sure if other C# compilers uses
> the same warnings "id".
>

No. Warnings are determined by compilers. Well-designed tools use
names, and sometimes there can be compatibility (such as clang
originally copying gcc, then each copying the other for warning flag
names depending on who implements it first). Numbers are not IME at all
consistent - they are a left-over from the days when adding proper error
messages or flag names to a compiler would take too much code space. It
is amateurish that MS has yet to fix this.

David Brown

Feb 9, 2024, 2:14:32 AM
It's reasonable to assume "hosted" unless you have particular reason not to.

But I am not sure that the /compiler/ knows that it is compiling for a
hosted or freestanding implementation. The same gcc can be used for
Linux hosted user code and a freestanding Linux kernel. Does the
compiler always know which when compiling a unit that happens to contain
"main()" ? I don't think gcc's "-ffreestanding" or "-fno-hosted" flags
are much used - after all, virtually every freestanding C implementation
also implements at least a fair part of the C standard library, and
implements the same semantics in the functions it provides, so there
really is no difference for the compiler.

(This is not something I claim to have any kind of clear answer for -
it's an open question for my part.)


David Brown

Feb 9, 2024, 2:19:47 AM
It is not "the halting problem". What you are trying to say, is that it
is undecidable or not a computable problem in the general case.

Compilers and linkers can - and do - map potential reachability, erring
on the side of false positives. It is very common, at least in embedded
systems where you try to avoid wasting space, to trace reachability and
discard code and data that is known to never be reachable. But this is
done by treating any symbol referenced by each function (after dead-code
elimination) as being reachable by that function, which may include
false positives.

Malcolm McLean

Feb 9, 2024, 2:46:21 AM
On 09/02/2024 07:19, David Brown wrote:
> On 08/02/2024 17:14, Malcolm McLean wrote:
>> On 08/02/2024 16:04, Keith Thompson wrote:
>
>>> In a freestanding implementation, main() *might* be just another
>>> function.  In that case a compiler can't prove that the code will be
>>> invoked.
>>>
>> Well sometimes it can. The boot routine or program entry point is by
>> defintion always invoked, and you can generally prove that at least
>> some code is always reached from that. However it is the halting
>> problem, and you can never prove for all cases, even if the code must
>> be reached or not reached regardless of runtime inputs.
>>
>
> It is not "the halting problem".  What you are trying to say, is that it
> is undecidable or not a computable problem in the general case.
>
The "is this code reached?" problem is the halting problem with the
trivial and unimportant difference that the code in question does not
have to be "exit()".

David Brown

Feb 9, 2024, 3:33:24 AM
No, it is not.

The two problems can be shown to be equivalently "hard" - that is, if
you could find a solution to one, it would let you solve the other. But
that does not make them the same problem.

And even if they /were/ the same thing, writing "this is undecidable" or
"this is infeasible to compute" is clear and to the point. Writing
"this is the halting problem" is name-dropping a computer science theory
in order to look smart - and like most such attempts, is more smart-arse
than smart.

Malcolm McLean

Feb 9, 2024, 5:03:22 AM
Well I've been accused of wasting my English degree, and so now I'm
going to accuse you of wasting your mathematics-related degree.

David Brown

Feb 9, 2024, 5:18:08 AM
I haven't seen anyone here accusing you of wasting your English
literature degree. I think you have a lot of trouble communicating with
others here, and I think you have a strong tendency to invent what you
think others might have written, rather than reading what they actually
wrote. It is not helped by your "slapdash" style. I would have
expected someone with a university level degree in English to have a
greater emphasis on reading and understanding, and communicating.

But that does not mean I could or did accuse you of wasting your degree.
I don't know nearly enough about your life to comment on that. If you
are where you are now, and if your career, hobbies, interests,
education, or anything else in your life has benefited from the degree,
then it was not wasted.

I studied mathematics and computation. Both have been very useful in my
work. I can't claim that much of the high-level mathematics is directly
applicable to my job, but the training in logical thinking, problem
solving, and an emphasis on proof is vital. In addition, I was lucky
enough to have a tutor that put a lot of weight on communication and
accurate technical writing, which is an essential part of my job.



Ben Bacarisse

Feb 9, 2024, 5:24:25 AM
David Brown <david...@hesbynett.no> writes:

> On 09/02/2024 08:46, Malcolm McLean wrote:
>> On 09/02/2024 07:19, David Brown wrote:
>>> On 08/02/2024 17:14, Malcolm McLean wrote:
>>>> On 08/02/2024 16:04, Keith Thompson wrote:
>>>
>>>>> In a freestanding implementation, main() *might* be just another
>>>>> function.  In that case a compiler can't prove that the code will be
>>>>> invoked.
>>>>>
>>>> Well sometimes it can. The boot routine or program entry point is by
>>>> defintion always invoked, and you can generally prove that at least
>>>> some code is always reached from that. However it is the halting
>>>> problem, and you can never prove for all cases, even if the code must
>>>> be reached or not reached regardless of runtime inputs.
>>>>
>>>
>>> It is not "the halting problem".  What you are trying to say, is that it
>>> is undecidable or not a computable problem in the general case.
>>>
>> The "is this code reached?" problem is the halting problem with the
>> trivial and unimportant difference that the code in question does not
>> have to be "exit()".
>
> No, it is not.
>
> The two problems can be shown to be equivalently "hard" - that is, if you
> could find a solution to one, it would let you solve the other. But that
> does not make them the same problem.

Sure. But it's not the halting problem for another reason as well.

In the formal models (that have halting problems and so on), it's "the
computation" that is the problem instance. That is, the code and all
the data are presented for an answer. A C compiler has to decide on
reachability without knowing the input.

For example, is this code "undefined":

int c, freq[128];
while ((c = getchar()) != EOF) freq[c]++;

? Maybe it was knocked up as a quick way to collect stats on base64
encoded data streams. If so, it's fine (at least in terms of UB).

--
Ben.

Malcolm McLean

Feb 9, 2024, 6:24:53 AM
Yes, but I specified "regardless of runtime inputs". A bit downthread,
you can be forgiven for having missed that. But DB less so, especially
when I'm accused of being the one who doesn't read things properly. And
even when I agree there might be some truth in that, and explain why, I
am then accused of misrepresenting what an Oxford English degree is like.
If you change the halting problem such that some of the symbols on the
tape are allowed to have unknown values then I don't think you are
changing it in any mathematically very interesting way so it is still
"trival", but if you attempt a halt decider it will substantially change
your programming approach, and so it is no longer "unimportant".

If a signed integer overflows the behaviour is undefined, so you also
have to prove that the input stream is short. And of course you also
forgot to intialise to zero.

bart

Feb 9, 2024, 7:46:03 AM
On 09/02/2024 10:24, Ben Bacarisse wrote:
Perhaps until you read an input that has 2**31 or more of the same
character.


Ben Bacarisse

Feb 9, 2024, 8:02:14 AM
Yes, that's the point. Any UB is dependent on the input. It's not a
static property of the code.

--
Ben.

Ben Bacarisse

Feb 9, 2024, 8:28:37 AM
How can it be the halting problem "regardless of runtime inputs"
considering my point that the HP instances include the input? You may
be thinking of the universal halting problem, but that does not really
fit what you said.

A better way to put it would be just to state that almost all
interesting run-time properties of programs are undecidable. And Rice's
theorem gives any one who is curious the exact definition of
"interesting".

> If a signed integer overflows the behaviour is undefined, so you also have
> to prove that the input stream is short. And of course you also forgot to
> intialise to zero.

I did indeed. Thanks. Overflow is just another example of the point I
was making, but not zeroing wasn't!

--
Ben.
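
Putting together the fixes raised in this exchange (zero-initialisation, and unsigned counters so that wrap-around is defined rather than signed-overflow UB), a corrected sketch of the histogram, recast over a buffer so the input is explicit:

```c
#include <stddef.h>

/* Character-frequency histogram. The counters are zero-initialised,
   and unsigned, so counting pathologically long runs wraps modulo
   2^N instead of hitting undefined signed overflow. */
void count_freq(const unsigned char *buf, size_t n, unsigned freq[128])
{
    for (size_t i = 0; i < 128; i++)
        freq[i] = 0;
    for (size_t i = 0; i < n; i++)
        if (buf[i] < 128)          /* guard the index as well */
            freq[buf[i]]++;
}
```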

David Brown

Feb 9, 2024, 9:03:52 AM
Why should I be "forgiven" for missing that, when I did not miss it and
when it is not at all relevant to what I wrote? I saw it, but responded
only to the clear bit of your claim "it is the halting problem", and
ignored the jumbled part about "runtime inputs". My point stands
whatever you meant to say about runtime inputs, and however what you
meant relates to what you tried to write.

So again, you did not read my post, or you are bizarrely blaming /me/
for something you think Ben did.


> And
> even when I agree there might be some truth in that, and explain why, I
> am then accused of misrepresenting what an Oxford English degree is like.

Again, read my posts. I said your description of your degree does not
match my experience with my own degree at Oxford, nor what I heard from
others at the time who studied English literature. You may have given
an accurate description of your personal experiences - I have no way to
either prove or disprove that, and no reason to suspect you of being
intentionally deceptive. But I believe it to have been unusual, and due
to a bad tutor.

> If you change the halting problem such that some of the symbols on the
> tape are allowed to have unknown values then I don't think you are
> changing it in any mathematically very interesting way so it is still
> "trival", but if you attempt a halt decider it will substantially change
> your programming approach, and so it is no longer "unimportant".
>

I would prefer to think a bit about how a volatile input tape would
relate to the halting problem as it is normally stated, before offering
an opinion on how it may or may not change the result. I suspect you
are correct that it will not change the problem or the results, but I
would want to be a bit more rigorous about what is meant before jumping
to conclusions.

However, I have no idea what you mean by "if you attempt a halt decider
it will substantially change your programming approach".


Malcolm McLean

Feb 9, 2024, 9:04:37 AM
Most real programs have some runtime data and it is rare to have a
program which is designed to calculate a hardcoded value. But often it
doesn't feed into any branches and so it's quite straightforwards to
show that it won't affect flow control. However this won't help you to
write a "statement reached decider" which works for all cases.
> A better way to put it would be just to state that almost all
> interesting run-time properties of programs are undecidable. And Rice's
> theorem gives any one who is curious the exact definition of
> "interesting".
>
Unlike you I'm not a trained computer scientist and I just pick up bits
as I go along. I'm familiar with Turing's proof that a halt decider is
impossible (how could I not be?), but not so comfortable with Rice's
theorem. So whilst I'm sure your explanation is more general and better,
I could not have made it myself. I do qualify as "curious" however.

David Brown

Feb 9, 2024, 9:08:19 AM
Well, /some/ UB is determinable at compile time - such as an identifier
having both internal and external linkage in the same unit, and a few
other bits and pieces that the language designers probably thought were
too burdensome to require compiler implementers to handle.

But mostly UB is a runtime issue, and therefore usually it is dependent
on the input.

Malcolm McLean

Feb 9, 2024, 11:01:46 AM
On 09/02/2024 14:03, David Brown wrote:
> On 09/02/2024 12:24, Malcolm McLean wrote:
>>
>> And even when I agree there might be some truth in that, and explain
>> why, I am then accused of misrepresenting what an Oxford English
>> degree is like.
>
> Again, read my posts.  I said your description of your degree does not
> match my experience with my own degree at Oxford, nor what I heard from
> others at the time who studied English literature.  You may have given
> an accurate description of your personal experiences - I have no way to
> either prove or disprove that, and no reason to suspect you of being
> intentionally deceptive.  But I believe it to have been unusual, and due
> to a bad tutor.
>

OK. So you also attended Oxford. And now I'm surprised. That's what
Oxford English is like. Very much an emphasis on geting things done,
quickly and to deadlines, because most Oxford English graduates will
work in careers where that is important. Only a small minority become
computer programmers like me where a program often has t be perfect ot
it doesn't work at all. Now whislt of course I don;t have direct
experience of other colleges, I'm pretty sure my own college was fairly
normal about this. One tutor was maybe a bit keener than normal.
We got essays for next week on an author most of us had never read on
the very first day, and I dn;t think you;f]=d get that at every college.
But not far off.

I'm really not mischaracterising. Of course you also have to defend the
essay and it is marked. But that's less important. It's very unusual to
receive a mark for an essay which is so low that it means that if you
write a similar essay in finals you will fail. Oxford is extremely
generous with the lower marks and in ensuring that they are not a fail.
Unless of course the candidate submits nothing. In which case it can
only be a fail. So you must submit something which constitutes an essay
for the tutorial, and my tutors were very insistent on that.

I fell out with my tutor catastrophically over moral issues and because
of the type of subject English Literature is, that had profound
implications for my work. He allowed that to happen, he was in the wrong
about our dispute and everything I predicted that would happen in
English Studies has in fact come to pass, and he was therefore a bad
tutor. But to be fair I was an extremely difficult student.

>> If you change the halting problem such that some of the symbols on the
>> tape are allowed to have unknown values then I don't think you are
>> changing it in any mathematically very interesting way so it is still
>> "trival", but if you attempt a halt decider it will substantially
>> change your programming approach, and so it is no longer "unimportant".
>>
>
> I would prefer to think a bit about how a volatile input tape would
> relate to the halting problem as it is normally stated, before offering
> an opinion on how it may or may not change the result.  I suspect you
> are correct that it will not change the problem or the results, but I
> would want to be a bit more rigorous about what is meant before jumping
> to conclusions.
>
Well this is it isn't it. Employ a computer scientist as a mathematician
and you'll get a rigorous proof. Employ an English graduate (and I have
actually held the job title "mathematician" though I am in no way
qualified to describe myself as such) and he quickly guesses. But are
you so surprised that you think I am thereby misrepresenting Oxford
English? And the English graduate does at least produce something
constructive quickly.

>
> However, I have no idea what you mean by "if you attempt a halt decider
> it will substantially change your programming approach".
>
We're changing the model slightly so that instead of all the input tape
being available to the decider, some values are unknown. It still has to
determine whether the program will halt or not, and whilst sometimes
this will now be inherently impossible because the answer depends on the
input, often it will not be so. As I said, I don't think anything much
has actually changed. But it does mean that we now have to write our
halt decider in a different way. It's no longer as simple as replacing
the code we want to know is reached by exit() and declaring that it is
now the halting problem.

The halt decider will fail to work as specified on some inputs, so I say
"attempt a halt decider". You can have a go. But it won't actually work.

Keith Thompson

Feb 9, 2024, 11:28:22 AM
David Brown <david...@hesbynett.no> writes:
[...]
> But I am not sure that the /compiler/ knows that it is compiling for a
> hosted or freestanding implementation. The same gcc can be used for
> Linux hosted user code and a freestanding Linux kernel.
[...]

A conforming compiler must predefine the macro __STDC_HOSTED__ to either
0 or 1 (since C99).
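
A translation unit can therefore test this itself. A trivial sketch:

```c
/* __STDC_HOSTED__ is predefined by conforming C99-and-later
   compilers: 1 for a hosted implementation (full standard
   library, program starts at main), 0 for freestanding. */
int is_hosted(void)
{
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__
    return 1;
#else
    return 0;
#endif
}
```

On a typical desktop compiler with default options this returns 1.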

Keith Thompson

Feb 9, 2024, 11:49:03 AM
Your earlier statement that "it is the halting problem" is at best
questionable, partly because it's not clear what "it" refers to in that
statement.

The issue is not whether we understand what the halting problem is.
The issue is that it's not relevant in this context.

We were discussing whether a compiler can reject a program that has
undefined behavior. It clearly can *if* the undefined behavior will
occur in every execution of the program (for all inputs if the program
has inputs).

If a program includes the expression 1/0, evaluating that expression has
undefined behavior, but there is no undefined behavior if that code is
never reached and the expression is not evaluated.

Determining, for all programs containing the expression 1/0, whether
that expression will be evaluated is (probably) more or less equivalent
to the halting problem, which means that no compiler can perfectly
determine whether it's allowed to reject such programs.

But the halting problem is irrelevant because *compilers are not
required to do that*. Most compilers do *some* flow analysis, and can
determine *in some cases* whether a given expression will always,
sometimes, or never be evaluated, but they cannot, do not, and are not
required to do a perfect job. They are not required to do any such
analysis at all. A conforming compiler can issue a non-fatal warning
whenever it sees a division by a constant 0, not caring whether it can
be reached -- or it can ignore the issue altogether (except that it must
detect it if it occurs in a context that requires a constant
expression).

The example was one in which it's trivially provable that the expression
1/0 will be evaluated whenever the program is executed (assuming a
hosted implementation).

And if you and David want to argue about Oxford English degrees, I urge
*both of you* to do it elsewhere. David, you don't have to respond to
everything.

David Brown

Feb 9, 2024, 12:00:41 PM
On 09/02/2024 17:01, Malcolm McLean wrote:
> On 09/02/2024 14:03, David Brown wrote:
>> On 09/02/2024 12:24, Malcolm McLean wrote:
>>>
>>> And even when I agree there might be some truth in that, and explain
>>> why, I am then accused of misrepresenting what an Oxford English
>>> degree is like.
>>
>> Again, read my posts.  I said your description of your degree does not
>> match my experience with my own degree at Oxford, nor what I heard
>> from others at the time who studied English literature.  You may have
>> given an accurate description of your personal experiences - I have no
>> way to either prove or disprove that, and no reason to suspect you of
>> being intentionally deceptive.  But I believe it to have been unusual,
>> and due to a bad tutor.
>>
>
> OK. So you also attended Oxford. And now I'm surprised.

Why is that surprising?

> That's what
> Oxford English is like. Very much an emphasis on geting things done,
> quickly and to deadlines, because most Oxford English graduates will
> work in careers where that is important. Only a small minority become
> computer programmers like me where a program often has t be perfect ot
> it doesn't work at all. Now whislt of course I don;t have direct
> experience of other colleges, I'm pretty sure my own college was fairly
> normal about this. One tutor was maybe a bit keener than normal.
>  We got essays for next week on an author most of us had never read on
> the very first day, and I dn;t think you;f]=d get that at every college.
> But not far off.
>

Perhaps you should discourage your cat from walking across your keyboard
while trying to type. Surely accurate typing is a skill you need for
programming?

I know literature students (of any language) were expected to work hard,
reading a lot of texts - /all/ students at Oxford had intense workloads.
In computer science, practicals were done in whatever language the
lecturer liked - so you might easily find you have to learn a new
programming language in a couple of weeks, outside of any courses or
tutorials, for answering the practical.

I know literature students were expected to write a lot, quickly - as
were students of most subjects. But IME they were also expected to
write accurately and sensibly. There is no point in doing something
fast, if it is not correct (to the extent that a literature essay can be
"correct").


> I'm really not mischaracterising. Of course you also have to defend the
> essay and it is marked. But that's less important. It's very unusual to
> receive a mark for an essay which is so low that it means that if you
> write a similar essay in finals you will fail. Oxford is extremely
> generous with the lower marks and in ensuring that they are not a fail.

I have seen students at Oxford fail. Not many, but a few.

> Unless of course the candidate submits nothing. In which case it can
> only be a fail. So you must submit something which constitutes an essay
> for the tutorial, and my tutors were very insistent on that.
>

You sound like you were trying to get away with the absolute minimum
possible without failing.

> I fell out with my tutor catastrophically over moral issues and because
> of the type of subject English Literature is, that had profound
> implications for my work. He allowed that to happen, he was in the wrong
> about our dispute  and everything I predicted that would happen in
> English Studies has in fact come to pass, and he was therefore a bad
> tutor. But to be fair I was an extremely difficult student.

Such bad chemistry between a student and a tutor happens sometimes,
unfortunately.

But what you are describing is a situation where you scraped through,
learning little from the tutor. That is not normal university experience.

>
>>> If you change the halting problem such that some of the symbols on
>>> the tape are allowed to have unknown values then I don't think you
>>> are changing it in any mathematically very interesting way so it is
>>> still "trival", but if you attempt a halt decider it will
>>> substantially change your programming approach, and so it is no
>>> longer "unimportant".
>>>
>>
>> I would prefer to think a bit about how a volatile input tape would
>> relate to the halting problem as it is normally stated, before
>> offering an opinion on how it may or may not change the result.  I
>> suspect you are correct that it will not change the problem or the
>> results, but I would want to be a bit more rigorous about what is
>> meant before jumping to conclusions.
>>
> Well this is it isn't it. Employ a computer scientist as a mathematician
> and you'll get a rigorous proof. Employ an English graduate (and I have
> actually held the job title "mathematician" though I am in no way
> qualified to describe myself as such) and he quickly guesses. But are
> you so surprised that you think I am thereby misrepresenting Oxford
> English? And the English graduate does at least produce something
> constructive quickly.
>

I think your situation at university was unusual, as you describe it -
though not impossible.

And I would expect someone who has a degree in English to be more
accurate in the language they write. Perhaps my expectations are
unreasonable, but in comparison to other regulars here, your rate of
typos, spelling mistakes, grammatical errors, and - more importantly -
confusing and unclear wording, is significantly higher. We all make
mistakes at times, but to reach your level shows a lack of care and a
lack of attention to detail. It is not a matter of producing things
quickly - it is laziness.

Put it this way. If you were to apply to my department for a job as a
programmer, and you wrote your CV and cover letter in the style you
write in this newsgroup, I would reject you on that basis alone.


>> However, I have no idea what you mean by "if you attempt a halt
>> decider it will substantially change your programming approach".
>>
> We're changing the model slightly so that instead of all the input tape
> being available to the decider, some values are unknown. It still has to
> determine whether the program will halt or not, and whilst sometimes
> this will now be inherently impossible because the answer depends on the
> input, often it will not be so. As I said, I don't think anything much
> has actually changed. But it does mean that we now have to write our
> halt decider in a different way. It's no longer as simple as replacing
> the code we want to know is reached by exit() and declaring that it is
> now the halting problem.
>
> The halt decider will fail to work as specified on some inputs, so I say
> "attempt a halt decider". You can have a go. But it won't actually work.
>

So when you write "if you attempt a halt decider", you mean something
like attempting to "write" a halt decider, or "design", or "test", or
"run" a halt decider? And doing this will somehow "change your
programming approach" ? Are you trying to say that if you allow some
input values to be unknown, it changes how you design your halt decider?
That would make no sense, because you /can't/ design a general halt
decider (without transfinite computation models and "oracles"). Are you
trying to say that if someone spends time trying to do this, it will
change their attitude to programming in general?

Your first attempt at explaining this made no sense. Your second
attempt did not help at all. Perhaps it is best just to leave it alone.


David Brown

unread,
Feb 9, 2024, 2:16:03 PMFeb 9
to
On 09/02/2024 17:28, Keith Thompson wrote:
> David Brown <david...@hesbynett.no> writes:
> [...]
>> But I am not sure that the /compiler/ knows that it is compiling for a
>> hosted or freestanding implementation. The same gcc can be used for
>> Linux hosted user code and a freestanding Linux kernel.
> [...]
>
> A conforming compiler must predefine the macro __STDC_HOSTED__ to either
> 0 or 1 (since C99).
>

Okay, that looks like a difference. A compiler could, I believe, call
itself "freestanding", define that to 0, and otherwise act exactly like
a hosted implementation. But it seems unlikely.

Malcolm McLean

unread,
Feb 9, 2024, 8:20:20 PMFeb 9
to
On 09/02/2024 17:00, David Brown wrote:
> On 09/02/2024 17:01, Malcolm McLean wrote:
>
>> OK. So you also attended Oxford. And now I'm surprised.
>
> Why is that surprising?
>
You don't seem to have the same understanding as I do about what makes
Oxford tick.

> There is no point in doing something
> fast, if it is not correct (to the extent that a literature essay can be
> "correct").
>
>> I'm really not mischaracterising. Of course you also have to defend
>> the essay and it is marked. But that's less important. It's very
>> unusual to receive a mark for an essay which is so low that it means
>> that if you write a similar essay in finals you will fail. Oxford is
>> extremely generous with the lower marks and in ensuring that they are
>> not a fail.
>
> I have seen students at Oxford fail.  Not many, but a few.
>
There is a point in doing something fast but not correct, and I think I
explained it very clearly. Something fast but not correct may get a low
mark, but as long as it is not too badly incorrect, it will pass.
Something slow and accurate but not submitted on time will receive a
fail. Now maybe this is less so in computer science.

>> Unless of course the candidate submits nothing. In which case it can
>> only be a fail. So you must submit something which constitutes an
>> essay for the tutorial, and my tutors were very insistent on that.
>>
>
> You sound like you were trying to get away the absolute minimum possible
> without failing.
>
My experience is that people at Oxford just don't think like that.
Obviously your experience must have been different. That's what I mean
about being surprised.

>
> Such bad chemistry between a student and a tutor happens sometimes,
> unfortunately.
>
> But what you are describing is a situation where you scraped through,
> learning little from the tutor.  That is not normal university experience.
>
It wasn't really bad chemistry. I disagreed with him over some moral
issues, and ultimately over the direction in which he was taking the other
students, and we fell out. It's not normal to have such a dispute, I
agree. However, whilst he was my personal tutor, he wasn't my only tutor,
and I remained on good terms with the others. In some ways I was a very
difficult student.

> I think your situation at university was unusual, as you describe it -
> though not impossible.
>
Surely you are not saying that this is all invented? It did actually
happen, and so, no, it can't have been impossible.

> And I would expect someone who has a degree in English to be more
> accurate in the language they write.  Perhaps my expectations are
> unreasonable, but in comparison to other regulars here, your rate of
> typos, spelling mistakes, grammatical errors, and - more importantly -
> confusing and unclear wording, is significantly higher.  We all make
> mistakes at times, but to reach your level shows a lack of care and a
> lack of attention to detail.  It is not a matter of producing things
> quickly - it is laziness.
>
> Put it this way.  If you were to apply to my department for a job as a
> programmer, and you wrote your CV and cover letter in the style you
> write in this newsgroup, I would reject you on that basis alone.
>
Whilst I'd have no hesitation at all in recommending you for a job as
a programmer with us.
Now am I unclear or are you obtuse? Are people so contentious that they
can't understand, or is it genuinely my fault?

Tim Rentsch

unread,
Feb 9, 2024, 9:00:14 PMFeb 9
to
bart <b...@freeuk.com> writes:

[...]

> This is something which has long been of fascination to me: how
> exactly do you get a C compiler to actually fail a program with a
> hard error when there is obviously something wrong, while not also
> failing on completely harmless matters.

I think the answer is obvious: unless and until you find someone
who works on a C compiler and who has exactly the same sense that
you do of "when there is obviously something wrong" and of what
conditions fall under the heading of "completely harmless matters",
and also the same sense that you do of how a C compiler should
behave in those cases, you won't get exactly what you want unless
you do it yourself.

Tim Rentsch

unread,
Feb 10, 2024, 5:28:56 AMFeb 10
to
Keith Thompson <Keith.S.T...@gmail.com> writes:

> Thiago Adams <thiago...@gmail.com> writes:
>
>> Let's say C compilers can detect all sorts of bugs at compile time.
>>
>> How would C compilers report that? As an error or a warning?
>>
>> Let's use this sample:
>>
>> int main() {
>> int a = 1;
>> a = a / 0;
>> }
>>
>> GCC says:
>>
>> warning: division by zero is undefined [-Wdivision-by-zero]
>> 5 | a = a / 0;
>> | ^ ~
>>
>> In case GCC or any other compiler reports this as an error, then C
>> programmers would likely complain. Am I right?
>
> Someone will always complain, but a conforming compiler can report this
> as a fatal error.
>
> Division by zero has undefined behavior. Under the standard's definition
> of undefined behavior, it says:
>
> NOTE
>
> Possible undefined behavior ranges from ignoring the situation
> completely with unpredictable results, to behaving during
> translation or program execution in a documented manner
> characteristic of the environment (with or without the issuance
> of a diagnostic message), to terminating a translation or
> execution (with the issuance of a diagnostic message).
>
> Though it's not quite that simple. Rejecting the program if the
> compiler can't prove that the division will be executed would IMHO
> be non-conforming. Code that's never executed has no behavior, so
> it doesn't have undefined behavior.

An implementation can refuse to translate the program, but not
because undefined behavior occurs. The undefined behavior here
happens only when the program is executed, but just compiling the
program doesn't do that. No execution, no undefined behavior.
Still the program may be rejected, because it is not strictly
conforming (by virtue of having output depend on the undefined
behavior if the program is ever run).

bart

unread,
Feb 10, 2024, 3:23:06 PMFeb 10
to
Take this function:

void F() {
F();
F(1);
F(1, 2.0);
F(1, 2.0, "3");
F(1, 2.0, "3", F);
}

Even if /one/ of those calls is correct, the other four can't
possibly be correct as well.

Is there anyone here who doesn't think there is something obviously wrong?

How about this one:

#include <stdio.h>
int main(void) {
int a;
L1:
printf("Hello, World!\n");
}

Ramp up the warnings and a compiler will tell you about unused 'a' and
'L1'. Use -Werror and the compilation will fail.

Is there anyone here who thinks that running this program with those
unused identifiers is not completely harmless?

Richard Harnden

unread,
Feb 10, 2024, 4:30:27 PMFeb 10
to
Gcc says, "warning: passing arguments to 'F' without a prototype is
deprecated in all versions of C and is not supported in C2x
[-Wdeprecated-non-prototype]"
>
> How about this one:
>
>   #include <stdio.h>
>   int main(void) {
>     int a;
>     L1:
>     printf("Hello, World!\n");
>   }
>
> Ramp up the warnings and a compiler will tell you about unused 'a' and
> 'L1'. Use -Werror and the compilation will fail.
>
> Is there anyone here who thinks that running this program with those
> unused identifiers is not completely harmless?

The point of the warning is that maybe you meant to use them. Remove
them if they're not needed, or fix the code so they do get used.


Kaz Kylheku

unread,
Feb 10, 2024, 4:49:40 PMFeb 10
to
On 2024-02-10, bart <b...@freeuk.com> wrote:
> #include <stdio.h>
> int main(void) {
> int a;
> L1:
> printf("Hello, World!\n");
> }
>
> Ramp up the warnings and a compiler will tell you about unused 'a' and
> 'L1'. Use -Werror and the compilation will fail.
>
> Is there anyone here who thinks that running this program with those
> unused identifiers is not completely harmless?

Unused warnings exist because they help catch bugs.

double distance(double x, double y)
{
return sqrt(x*x + x*x);
}

The diagnostic will not catch all bugs of this type, since just one use is
enough to silence it, but catching something is better than nothing.

Removing unused cruft also helps to keep the code clean. Stray material
sometimes gets left behind after refactoring, or careless copy paste.
Unused identifiers are a "code smell".

Sometimes something must be left unused. It's good to be explicit about
that: to have some indication that it's deliberately unused.

When I implemented unused warnings in my Lisp compiler, I found a bug right away.

https://www.kylheku.com/cgit/txr/commit/?id=5ee2cd3b2304287c010237e03be4d181412e066f

In this diff hunk in the assembler:

@@ -217,9 +218,9 @@
(q me.(cur-pos)))
(inc c)
me.(set-pos p)
- (format t "~,5d: ~,08X ~a\n" (trunc p 4) me.(get-word) dis-txt)
+ (format stream "~,5d: ~,08X ~a\n" (trunc p 4) me.(get-word) dis-txt)
(while (< (inc p 4) q)
- (format t "~,5d: ~,08X\n" (trunc p 4) me.(get-word)))
+ (format stream "~,5d: ~,08X\n" (trunc p 4) me.(get-word)))
me.(set-pos q)
(set p q)))
c))

The format function was given argument t, a nickname for standard output, so
this code ignored the stream parameter and always sent output to standard
output.

With the unused warnings, it got diagnosed.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazi...@mstdn.ca

Thiago Adams

unread,
Feb 10, 2024, 6:44:33 PMFeb 10
to
Em 2/10/2024 6:49 PM, Kaz Kylheku escreveu:
> On 2024-02-10, bart <b...@freeuk.com> wrote:
>> #include <stdio.h>
>> int main(void) {
>> int a;
>> L1:
>> printf("Hello, World!\n");
>> }
>>
>> Ramp up the warnings and a compiler will tell you about unused 'a' and
>> 'L1'. Use -Werror and the compilation will fail.
>>
>> Is there anyone here who thinks that running this program with those
>> unused identifiers is not completely harmless?
>
> Unused warnings exist because they help catch bugs.
>
> double distance(double x, double y)
> {
> return sqrt(x*x + x*x);
> }


The unused warning is a good example to explain my point of view.
I want a "warning profile" inside the compiler to do an "automatic code
review".
The criterion is not only to complain about UB etc.; the criterion is the
same one humans use (the context of the program, how critical it is, and
so on) to approve a piece of code or not.

"Unused" is a good example: a human reviewing your code would ask why
something is not used, for instance.

In code review, I also try to review code in a way that does not depend
on a review of the caller.

for instance

void f(int a[])
{
a[1]= 1;
}


this code depends on a review of the caller.

We can create alternatives that will check input.

void f(int n, int a[n])
{

}
I want a warning profile that catches this type of "caller"-dependent
function. The other common example is null pointers, where the function
assumes the caller will not pass a null pointer.

I want to have a classification for that - a "name" - something like
"caller-contract dependent" - to classify such functions.



One way I am thinking of creating a warning profile is as a set of
warnings I want treated as errors, etc.; then the user can configure the
compiler according to the company guidelines.

The idea of making warnings part of the C standard helps to make "safety
portable"; the program can then have the same guarantees independently of
the tool used.








bart

unread,
Feb 10, 2024, 7:02:58 PMFeb 10
to
On 10/02/2024 21:49, Kaz Kylheku wrote:
> On 2024-02-10, bart <b...@freeuk.com> wrote:
>> #include <stdio.h>
>> int main(void) {
>> int a;
>> L1:
>> printf("Hello, World!\n");
>> }
>>
>> Ramp up the warnings and a compiler will tell you about unused 'a' and
>> 'L1'. Use -Werror and the compilation will fail.
>>
>> Is there anyone here who thinks that running this program with those
>> unused identifiers is not completely harmless?
>
> Unused warnings exist because they help catch bugs.
>
> double distance(double x, double y)
> {
> return sqrt(x*x + x*x);
> }
>
> The diagnostic will not catch all bugs of this type, since just one use is
> enough to silence it, but catching something is better than nothing.

> Removing unused cruft also helps to keep the code clean. Stray material
> sometimes gets left behind after refactoring, or careless copy paste.
> Unused identifiers are a "code smell".

This is a different kind of analysis. IMV it doesn't belong in a routine
compilation, just something you do periodically, or when you're stuck
for ideas.

In your example, maybe you did want x*x*2, and the 'y' is either a
parameter no longer needed, or not yet needed, or temporarily not used.

So it is not 'obviously wrong', and by itself, not using a parameter is
harmless.

I'm looking for a result from a compiler which is either Pass or Fail,
not Maybe.


> Sometimes something must be left unused. It's good to be explicit about
> that: to have some indication that it's deliberately unused.
>
> When I implemented unused warnings in my Lisp compiler, I found a bug right away.
>
> https://www.kylheku.com/cgit/txr/commit/?id=5ee2cd3b2304287c010237e03be4d181412e066f
>
> In this diff hunk in the assembler:
>
> @@ -217,9 +218,9 @@
> (q me.(cur-pos)))
> (inc c)
> me.(set-pos p)
> - (format t "~,5d: ~,08X ~a\n" (trunc p 4) me.(get-word) dis-txt)
> + (format stream "~,5d: ~,08X ~a\n" (trunc p 4) me.(get-word) dis-txt)
> (while (< (inc p 4) q)
> - (format t "~,5d: ~,08X\n" (trunc p 4) me.(get-word)))
> + (format stream "~,5d: ~,08X\n" (trunc p 4) me.(get-word)))
> me.(set-pos q)
> (set p q)))
> c))
>
> The format function was given argument t, a nickname for standard output, so
> this code ignored the stream parameter and always sent output to standard
> output.
>
> With the unused warnings, it got diagnosed.


So you use the linty options when you're stuck with a bug, as I suggested.



Keith Thompson

unread,
Feb 10, 2024, 7:15:22 PMFeb 10
to
Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
[snip discussion of a program that divides by zero]

> An implementation can refuse to translate the program, but not
> because undefined behavior occurs. The undefined behavior here
> happens only when the program is executed, but just compiling the
> program doesn't do that. No execution, no undefined behavior.
> Still the program may be rejected, because it is not strictly
> conforming (by virtue of having output depend on the undefined
> behavior if the program is ever run).

Just to be clear, would you say that a conforming hosted implementation
may reject this program:

#include <limits.h>
#include <stdio.h>
int main(void) {
printf("INT_MAX = %d\n", INT_MAX);
}

solely because it's not strictly conforming?

Keith Thompson

unread,
Feb 10, 2024, 7:31:26 PMFeb 10
to
bart <b...@freeuk.com> writes:
> On 10/02/2024 01:59, Tim Rentsch wrote:
>> bart <b...@freeuk.com> writes:
>> [...]
>>
>>> This is something which has long been of fascination to me: how
>>> exactly do you get a C compiler to actually fail a program with a
>>> hard error when there is obviously something wrong, while not also
>>> failing on completely harmless matters.

The only thing that *requires* a compiler to reject a translation unit
is the #error directive. For any violation of a syntax rule or
constraint, the standard only requires a *diagnostic message*, which can
be a non-fatal warning.

Most C compilers reject translation units for other reasons, and most
have options to control those reasons.

>> I think the answer is obvious: unless and until you find someone
>> who works on a C compiler and who has exactly the same sense that
>> you do of "when there is obviously something wrong" and of what
>> conditions fall under the heading of "completely harmless matters",
>> and also the same sense that you do of how a C compiler should
>> behave in those cases, you won't get exactly what you want unless
>> you do it yourself.
>
>
> Take this function:
>
> void F() {
> F();
> F(1);
> F(1, 2.0);
> F(1, 2.0, "3");
> F(1, 2.0, "3", F);
> }
>
> Even if /one/ of those calls is correct, the other four can't
> possibly be correct as well.

True, if by "correct" you mean "avoids undefined behavior". In fact
only the first call is correct (and since it's endlessly recursive, it
prevents any of the others from being executed, something that a
compiler may or may not notice).

No syntax rule or constraint is violated (prior to C23), so no
diagnostic is required.

> Is there anyone here who doesn't think there is something obviously wrong?

Something is wrong, and is easily avoided by not using old-style
function declarations and definitions, which have been obsolescent since
1989.

> How about this one:
>
> #include <stdio.h>
> int main(void) {
> int a;
> L1:
> printf("Hello, World!\n");
> }
>
> Ramp up the warnings and a compiler will tell you about unused 'a' and
> 'L1'. Use -Werror and the compilation will fail.

Use -Werror and the compiler is non-conforming, since it will reject a
translation unit for anything that the authors thought was worth warning
about. Nevertheless, the -Werror option (or equivalent) can be useful.

> Is there anyone here who thinks that running this program with those
> unused identifiers is not completely harmless?

Is there anyone here who thinks that a compiler can be clever enough to
be trusted to determine whether an unused identifier is harmless or not?
If you have an unused identifier, it's likely you've written something
other than what you meant. A compiler can't know what you meant.

There are very good reasons for warning about unused identifiers.
They are very likely to be symptoms of bugs. If you don't want those
particular warnings, there are likely to be ways to disable them.
Aside from command-line arguments, `(void)a;` is likely to inhibit
the warning. As for an unused label, you can delete it or comment
it out.

vallor

unread,
Feb 10, 2024, 7:41:12 PMFeb 10
to
On Sat, 10 Feb 2024 20:22:52 +0000, bart <b...@freeuk.com> wrote in
<uq8lus$3dceu$1...@dont-email.me>:

> How about this one:
>
> #include <stdio.h>
> int main(void) {
> int a;
> L1:
> printf("Hello, World!\n");
> }
>
> Ramp up the warnings and a compiler will tell you about unused 'a' and
> 'L1'. Use -Werror and the compilation will fail.
>
> Is there anyone here who thinks that running this program with those
> unused identifiers is not completely harmless?

Yet you promoted warnings to errors, just to find a way to make
it fail. :(

("Is there anyone here who thinks that" bart's continuous
complaining about options to gcc deserve any merit?)

Regarding the topic, I'm curious why there is resistance
to conditionals written like this:

if( 1 == a)

...that is to say, with the constant first.

I've done that in C and Perl. Are the aesthetics so
bad that I should quit that? Isn't it safer to write
it that way, so that a dropped "=" is pointed out on
compilation?

--
-v

bart

unread,
Feb 10, 2024, 8:07:07 PMFeb 10
to
On 11/02/2024 00:40, vallor wrote:
> On Sat, 10 Feb 2024 20:22:52 +0000, bart <b...@freeuk.com> wrote in
> <uq8lus$3dceu$1...@dont-email.me>:
>
>> How about this one:
>>
>> #include <stdio.h>
>> int main(void) {
>> int a;
>> L1:
>> printf("Hello, World!\n");
>> }
>>
>> Ramp up the warnings and a compiler will tell you about unused 'a' and
>> 'L1'. Use -Werror and the compilation will fail.
>>
>> Is there anyone here who thinks that running this program with those
>> unused identifiers is not completely harmless?
>
> Yet you promoted warnings to errors, just to find a way to make
> it fail. :(

I didn't promote anything. I said IF you increased the warnings, AND
used -Werror, it will fail.

-Werror is what has always been suggested to me when I complain that
some clear error only results in a warning, which means a runnable
program has been produced.


>
> ("Is there anyone here who thinks that" bart's continuous
> complaining about options to gcc deserve any merit?)

Actually I didn't mention gcc. But when I do, if it managed to do its
job without so much effort on my part then there would be fewer complaints.


Keith Thompson

unread,
Feb 10, 2024, 8:39:41 PMFeb 10
to
bart <b...@freeuk.com> writes:
[...]
> So it is not 'obviously wrong', and by itself, not using a parameter
> is harmless.
>
> I'm looking for a result from a compiler which is either Pass or Fail,
> not Maybe.
[...]

So you don't like compiler warnings about things that are not clearly
wrong.

That's a valid preference, but it's not one that's shared by most
programmers or supported by most compilers.

I suppose one way to get what you want is to use "gcc -pedantic-errors"
and ignore all warnings (or do the equivalent for whatever compiler
you're using). Or there may be a way to turn off all warnings that
don't indicate violations of constraints or syntax rules, but since
that's not the behavior I want, I haven't taken the time to investigate.

Keith Thompson

unread,
Feb 10, 2024, 8:51:36 PMFeb 10
to
vallor <val...@cultnix.org> writes:
[...]
> Regarding the topic, I'm curious why there is resistance
> to conditionals written like this:
>
> if( 1 == a)
>
> ...that is to say, with the constant first.
>
> I've done that in C and Perl. Are the aesthetics so
> bad that I should quit that? Isn't it safer to write
> it that way, so that a dropped "=" is pointed out on
> compilation?

I personally find "Yoda conditions" jarring.

I'm aware that `1 == a` and `a == 1` are equivalent, but I read them
differently. The latter asks something about a, namely whether it's
equal to 1. The former asks something about 1, which is inevitably a
silly question; I already know all about 1. I find that when reading
such code, I have to pause for a moment (perhaps a fraction of a second)
and mentally reverse the condition to understand it.

Note that I'm describing, not defending, the way I react to it.

Writing "=" rather than "==" is a sufficienly rare mistake, and likely
to be caught quickly because most compilers warn about it, that it's
just not worth scrambling the code to avoid it.

If you've internalized the commutativity of "==" so well that seeing
`1 == a` rather than `a == 1` doesn't bother you, that's fine.
But consider that some people reading your code are likely to have
reactions similar to mine.

Kaz Kylheku

unread,
Feb 10, 2024, 9:46:33 PMFeb 10
to
On 2024-02-11, bart <b...@freeuk.com> wrote:
> On 10/02/2024 21:49, Kaz Kylheku wrote:
>> On 2024-02-10, bart <b...@freeuk.com> wrote:
>>> #include <stdio.h>
>>> int main(void) {
>>> int a;
>>> L1:
>>> printf("Hello, World!\n");
>>> }
>>>
>>> Ramp up the warnings and a compiler will tell you about unused 'a' and
>>> 'L1'. Use -Werror and the compilation will fail.
>>>
>>> Is there anyone here who thinks that running this program with those
>>> unused identifiers is not completely harmless?
>>
>> Unused warnings exist because they help catch bugs.
>>
>> double distance(double x, double y)
>> {
>> return sqrt(x*x + x*x);
>> }
>>
>> The diagnostic will not catch all bugs of this type, since just one use is
>> enough to silence it, but catching something is better than nothing.
>
>> Removing unused cruft also helps to keep the code clean. Stray material
>> sometimes gets left behind after refactoring, or careless copy paste.
>> Unused identifiers are a "code smell".
>
> This is a different kind of analysis. IMV it doesn't belong in a routine
> compilation, just something you do periodically, or when you're stuck
> for ideas.

"Periodically" translates to "never". If there are some situations you don't
want in the code, the best thing is to intercept any change which
introduces such, and not allow it to be merged.

> In your example, maybe you did want x*x*2, and the 'y' is either a
> parameter no longer needed, or not yet needed, or temporarily not used.

If we take a correct program and add an unused variable to it, it
doesn't break. Everyone knows that. That isn't the point.

> So it is not 'obviously wrong', and by itself, not using a parameter is
> harmless.

While it's not obviously wrong, it's not obviously right either.

Moreover, it is a hard fact that the parameter y is not used.

Granted, not every truth about a program is equally useful, but
experience shows that reporting unused identifiers pays off. My own
experience and experiences of others. That's why such diagnostics are
implemented in compilers.

> I'm looking for a result from a compiler which is either Pass or Fail,
> not Maybe.

Things are fortunately not going to revert to the 1982 state of the
art, though.

The job of the compiler is not only to translate the code or report
failure, but to unearth noteworthy facts about the code and remark on
them.

>> With the unused warnings, it got diagnosed.
>
> So you use the linty options when you're stuck with a bug, as I suggested.

The point is, I would likely not have found that bug to this day without
the diagnostic. You want to be informed /before/ the bug is identified
in the field.

Tim Rentsch

unread,
Feb 10, 2024, 10:54:32 PMFeb 10
to
vallor <val...@cultnix.org> writes:

> [...] I'm curious why there is resistance to conditionals written
> like this:
>
> if( 1 == a)
>
> ...that is to say, with the constant first.
>
> I've done that in C and Perl. Are the aesthetics so
> bad that I should quit that? Isn't it safer to write
> it that way, so that a dropped "=" is pointed out on
> compilation?

Do you know the phrase "too clever by half"? It describes
this coding practice. A partial solution at best, and
what's worse it comes with a cost for both writers and
readers of the code. It's easier and more effective just
to use -Wparentheses, which doesn't muck up the code and
can be turned on and off easily. There are better ways
for developers to spend their time than trying to take
advantage of clever but tricky schemes that don't help
very much and are done more thoroughly and more reliably
by using pre-existing automated tools. Too much buck,
not nearly enough bang.

Tim Rentsch

unread,
Feb 11, 2024, 1:21:49 AMFeb 11
to
In both cases the answer is: it depends.

There are scenarios where I would want the first example to compile
successfully and without any complaints. There are other scenarios
where I would want the second example to be given fatal errors during
compilation. Good compilers provide a range of options, knowing that
different circumstances call for different compilation outcomes.
Even if you want the same set of error and warning conditions in
every single compile that you do, other people don't. So you better
get used to the idea of setting the various options the way you want
them, or else write your own compiler and discover that no one else
will use it because it doesn't offer any way to select the particular
sets of choices they need for the various compilation scenarios that
are important to what they're doing.

Tim Rentsch

unread,
Feb 11, 2024, 2:22:36 AMFeb 11
to
Keith Thompson <Keith.S.T...@gmail.com> writes:

> Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
> [snip discussion of a program that divides by zero]
>
>> An implementation can refuse to translate the program, but not
>> because undefined behavior occurs. The undefined behavior here
>> happens only when the program is executed, but just compiling the
>> program doesn't do that. No execution, no undefined behavior.
>> Still the program may be rejected, because it is not strictly
>> conforming (by virtue of having output depend on the undefined
>> behavior if the program is ever run).
>
> Just to be clear, would you say that a conforming hosted implementation
> may reject this program:
>
> #include <limits.h>
> #include <stdio.h>
> int main(void) {
> printf("INT_MAX = %d\n", INT_MAX);
> }
>
> solely because it's not strictly conforming?

My understanding of the C standard is that a hosted implementation
may choose not to accept the above program and still be conforming,
because this program is not strictly conforming. (Please assume
subsequent remarks always refer to implementations that are both
hosted and conforming.)

Also, assuming we have ruled out cases involving #error, a conforming
implementation may choose not to accept a given program if and only if
the program is not strictly conforming. Being strictly conforming is
the only criterion that matters (again assuming there is no #error) in
deciding whether an implementation may choose not to accept the
program in question.

I'm guessing that what you mean by "may reject" is the same as what
I mean by "may choose not to accept". I'd like to know if you think
that's right, or if you think there is some difference between the
two. (My intention is that the two phrases have the same meaning.)

Does the above adequately address the question you want answered?

Keith Thompson

unread,
Feb 11, 2024, 3:13:15 AMFeb 11
to
I'm not sure. As I recall, I gave up on trying to understand what you
think "accept" means.

N1570 5.1.2.3p6:

A program that is correct in all other aspects, operating on correct
data, containing unspecified behavior shall be a correct program and
act in accordance with 5.1.2.3.

Does that not apply to the program above? How can it do so if it's
rejected (or not "accepted")?

The same paragraph says that "A *conforming hosted implementation* shall
accept any strictly conforming program". Are you reading that as
implying that *only* strictly conforming programs must be accepted?

As a practical matter, an implementation that accepts *only* strictly
conforming programs would be very nearly useless. I don't see anything
in the standard that says a program can be rejected purely because it's
not strictly conforming, and I don't believe that was the intent.

Malcolm McLean

unread,
Feb 11, 2024, 6:01:41 AMFeb 11
to
On 11/02/2024 00:40, vallor wrote:
>
> ("Is there anyone here who thinks that" bart's continuous
> complaining about options to gcc deserve any merit?)
>
The compiler should be invoked with

gcc foo.c

Anything else represents a failure and makes it harder to use, and might
mean that the program fails to compile properly if someone fiddles with
or doesn't understand the command lines. An option is a compromise, not
an added beauty.

Richard Harnden

unread,
Feb 11, 2024, 6:32:02 AMFeb 11
to
On 11/02/2024 11:01, Malcolm McLean wrote:
> On 11/02/2024 00:40, vallor wrote:
>> ("Is there anyone here who thinks that" bart's continuous
>> complaining about options to gcc deserve any merit?)
>>
> The compiler should be invoked with
>
> gcc foo.c
>

As a first stab, I'd use:

gcc -std=c11 -pedantic -W -Wall -Wextra -Werror foo.c

and try very hard to fix any/all warnings.




David Brown

unread,
Feb 11, 2024, 7:07:12 AMFeb 11
to
On 11/02/2024 00:44, Thiago Adams wrote:
> On 2/10/2024 6:49 PM, Kaz Kylheku wrote:
>> On 2024-02-10, bart <b...@freeuk.com> wrote:
>>>     #include <stdio.h>
>>>     int main(void) {
>>>       int a;
>>>       L1:
>>>       printf("Hello, World!\n");
>>>     }
>>>
>>> Ramp up the warnings and a compiler will tell you about unused 'a' and
>>> 'L1'. Use -Werror and the compilation will fail.
>>>
>>> Is there anyone here who thinks that running this program with those
>>> unused identifiers is not completely harmless?
>>
>> Unused warnings exist because they help catch bugs.
>>
>>    double distance(double x, double y)
>>    {
>>      return sqrt(x*x + x*x);
>>    }
>
>
> Unused warning is a good sample to explain my point of view.
> I want a "warning profile" inside the compiler to do a "automatic code
> review".
> The criteria is not only complain about UB etc..the criteria is the same
> used by humans (in the context of the program, how critical etc) to
> approve or not a code.
>

While I appreciate the desire here, it is completely impossible in
practice. There are two main hindrances. One is that different people
want different things from code reviews - code that is fine and
acceptable practice for one group or on one project might be banned
outright in another project or group. The other is that code reviewers
generally know more than you can express in code (even if you have a
language that supports assumptions, assertions, and contracts), and this
knowledge is important in code reviews but cannot be available to
automatic tools.

The best that can be done, is what is done today - compilers have lots
of warnings that can be enabled or disabled individually. Some are
considered important enough and universal enough that they are enabled
by default. There will be a group of warnings (gcc -Wall) that the
compiler developers feel are useful to a solid majority of developers
without having too many false positives on things the developers
consider good code. And there will be an additional group of warnings
(gcc -Wall -Wextra) as a starting point for developers who want stricter
code rules, and who will usually then have explicit flags for
fine-grained control of their particular requirements.

And beyond that, there are a variety of niche checking tools for
particular cases, and large (and often expensive) code quality and
static checking tool suites for more advanced checks.



David Brown

unread,
Feb 11, 2024, 7:15:40 AMFeb 11
to
On 11/02/2024 01:02, bart wrote:
> On 10/02/2024 21:49, Kaz Kylheku wrote:
>> On 2024-02-10, bart <b...@freeuk.com> wrote:
>>>     #include <stdio.h>
>>>     int main(void) {
>>>       int a;
>>>       L1:
>>>       printf("Hello, World!\n");
>>>     }
>>>
>>> Ramp up the warnings and a compiler will tell you about unused 'a' and
>>> 'L1'. Use -Werror and the compilation will fail.
>>>
>>> Is there anyone here who thinks that running this program with those
>>> unused identifiers is not completely harmless?
>>
>> Unused warnings exist because they help catch bugs.
>>
>>    double distance(double x, double y)
>>    {
>>      return sqrt(x*x + x*x);
>>    }
>>
>> The diagnostic will not catch all bugs of this type, since just one
>> use is
>> enough to silence it, but catching something is better than nothing.
>
>> Removing unused cruft also helps to keep the code clean. Stray material
>> sometimes gets left behind after refactoring, or careless copy paste.
>> Unused identifiers are a "code smell".
>
> This is a different kind of analysis. IMV it doesn't belong in a routine
> compilation, just something you do periodically, or when you're stuck
> for ideas.
>

Absolutely not!

You want these checks as soon as possible. You don't want to find out
about a bug because someone complained about glitches and problems in
the delivered system - you want to find out about it as soon as you
wrote it.

If your static error checking, or linting, is not done as part of the
compile, it should be done /before/ the compilation. Not as an
afterthought when you are bored!

You can have extra levels of checking and simulation run separately if
they take significantly longer to run - just like running test suites
and regression tests. It is not uncommon in large development groups to
have big and advanced checks and reports done when code is checked into
development branches of the source code control system - and passing
these is a requirement before moving the code to the master branch. If
these big checking systems take a couple of hours to run, then you can't
run them for every change - but you can run them overnight.

Checking for unused variables takes a couple of milliseconds, and should
always be done.


> In your example, maybe you did want x*x*2, and the 'y' is either a
> parameter no longer needed, or not yet needed, or temporarily not used.
>
> So it is not 'obviously wrong', and by itself, not using a parameter is
> harmless.
>
> I'm looking for a result from a compiler which is either Pass or Fail,
> not Maybe.
>
Early on in a project, you expect to have lots of things like this. You
can temporarily disable such warnings that come up a lot. As your code
solidifies, you enable them again. And once you have got code that
looks like a something worth testing, you enable "-Werror" so that all
warnings are treated as fatal errors - that way, none will be missed as
you build the code.

bart

unread,
Feb 11, 2024, 7:24:54 AMFeb 11
to
On 11/02/2024 02:46, Kaz Kylheku wrote:
> On 2024-02-11, bart <b...@freeuk.com> wrote:

>> This is a different kind of analysis. IMV it doesn't belong in a routine
>> compilation, just something you do periodically, or when you're stuck
>> for ideas.
>
> Periodically translates to never. If there are some situations you don't
> want in the code, the best thing is to intercept any change which
> introduces such, and not allow it to be merged.

I have an option in one of my compilers called '-unused'. It displays a
list of unused parameters, local and global variables.

I should use it more often than I do. But in any case, it is a
by-product of an internal check where no storage is allocated for
variables, and no spilling is done for parameters.

The first unused parameter it reports on one app, is where the function
is part of a suite of functions that need to share the same set of
parameters. Not all functions will use all parameters.

Most unused non-parameters are left-overs from endless modifications.
(Temporary debugging variables are usually written in capitals so are
easy to spot.)

>> In your example, maybe you did want x*x*2, and the 'y' is either a
>> parameter no longer needed, or not yet needed, or temporarily not used.
>
> If we take a correct program and add an unused variable to it, it
> doesn't break. Everyone knows that. That isn't the point.
>
>> So it is not 'obviously wrong', and by itself, not using a parameter is
>> harmless.
>
> While it's not obviously wrong, it's not obviously right either.

> Moreover, it is a hard fact that the parameter y is not used.

> Granted, not every truth about a program is equally useful, but
> experience shows that reporting unused identifiers pays off. My own
> experience and experiences of others. That's why such diagnostics are
> implemented in compilers.

Take these declarations at file-scope:

typedef int A;
static int B;
int C;
typedef long long int int64_t; // visible via stdint.h

They are not used anywhere in this translation unit. gcc will report B
being unused, but not the others.

'C' might be used in other translation units; I don't know if the linker
will pick that up, or maybe that info is not known to it.

A and int64_t can't be reported because the declarations for them may be
inside a header (as is the case for int64_t) used by other modules where
they /are/ used.

But if not, they could also indicate errors. (Maybe there is also
'typedef float D', and some variable should have been type A not D.)

So potentially useful information that you say is important, but can't
be or isn't done by a compiler.

(This is where whole-program compilers like the ones I do come into their own.)
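The behaviour described can be reproduced with a file of nothing but those declarations (gcc with -Wall; version details are my understanding):

```c
/* decls.c - compile with: gcc -Wall -Wextra -c decls.c */
typedef int A;              /* unused, but silent at file scope          */
static int B;               /* gcc -Wall: "'B' defined but not used"     */
int C;                      /* external linkage: may be used in another
                               translation unit, so no warning           */
typedef long long my_i64;   /* also silent at file scope                 */
```

Only B is diagnosed: internal linkage means the compiler can prove B is unused from this translation unit alone; C may be used elsewhere, and file-scope typedefs are left alone because they so often come from headers.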

>> I'm looking for a result from a compiler which is either Pass or Fail,
>> not Maybe.
>
> Things are fortunately not going to revert to the 1982 state of the
> art, though.
>
> The job of the compile is not only to translate the code or report
> failure, but to unearth noteworthy facts about the code and remark on
> them.

That's rubbish. People are quite happy to use endless scripting
languages where the bytecode compiler does exactly that: translate
source to linear bytecode in a no-nonsense fashion.

Those are people who want a fast or even instant turnaround.

Some of us want to treat languages that target native code in the same
way; like scripting languages, but with the benefit of strict
type-checking and faster code!

>>> With the unused warnings, it got diagnosed.
>>
>> So you use the linty options when you're stuck with a bug, as I suggested.
>
> The point is, I would likely not have found that bug to this day without
> the diagnostic. You want to be informed /before/ the bug is identified
> in the field.

So you run that option (like -unused in my product), /before/ it gets to
the field.

I can't routinely use -unused for the 100s of compiles I might do in one
day, even if the source was up-to-date that morning with zero unused
vars, because I will be compiling part-finished or temporarily
commented-out code all the time. Eg. there might be an empty function body.

Or I've deleted the body for a rewrite, but still need the same variables.


David Brown

unread,
Feb 11, 2024, 7:28:09 AMFeb 11
to
I would second that opinion.

People who have a left-to-right language as their native tongue will
find "if (a == 1)" makes more sense, because it reads more like a normal
language sentence. It requires less cognitive effort to interpret it,
and it is therefore more likely that they will interpret it correctly.

You can of course train yourself to be familiar with other arrangements,
and they will eventually look "natural" to you. People who are used to
programming in Forth feel it makes more sense to write "a 1 = if",
because that's the Forth way.

But if you want to write code that is as clear as possible to as many
other programmers as possible (and that is a good aim, though of course
not an overriding aim), try reading the code aloud as a sentence.

And if your compiler does not immediately warn on "if (a = 1)", enable
better warnings on the compiler, or get a better compiler. Or if you
have no choice but to use a poor compiler, get a linter or use a good
compiler for linting in addition to the real target compiler.



bart

unread,
Feb 11, 2024, 7:38:14 AMFeb 11
to
That sounds like an incredibly slow and painful way to code.

During development, you are adding, modifying, commenting, uncommenting
and tearing down code all the time.

C already requires you to dot all the Is and cross all the Ts because of
its syntax and type needs. Why make the job even harder?

Your command line will fail a program because this variable:

int a;

is not yet used, or you've temporarily commented out the code where it
is used.

Instead of concentrating on getting working code, you now have to divert
your attention to all these truly pedantic matters.

Have you thought of WHY you are even allowed to do gcc foo.c?

By all means run that command line when you reach certain stages, and
certainly before you ship.

My complaint is that 'gcc foo.c' is usually too lax, but your set of
options is too draconian.

I want a compiler that allows my unused variable by default, but doesn't
allow this assignment by default:

int a;
char* b=a;

It should fail it without being told it needs to fail it. And without a
million users of gcc each having to create their own suite of options,
in effect creating their own language dialect.

Richard Harnden

unread,
Feb 11, 2024, 8:23:17 AMFeb 11
to
On 11/02/2024 12:38, bart wrote:
> On 11/02/2024 11:31, Richard Harnden wrote:
>> On 11/02/2024 11:01, Malcolm McLean wrote:
>>> On 11/02/2024 00:40, vallor wrote:
>>>> ("Is there anyone here who thinks that" bart's continuous
>>>> complaining about options to gcc deserve any merit?)
>>>>
>>> The compiler should be invoked with
>>>
>>> gcc foo.c
>>>
>>
>> As a first stab, I'd use:
>>
>> gcc -std=c11 -pedantic -W -Wall -Wextra -Werror foo.c
>>
>> and try very hard to fix any/all warnings.
>
>
> That sounds like an incredibly slow and painful way to code.
>
> During development, you are adding, modifying, commenting, uncommenting
> and tearing down code all the time.
>
> C already requires you to dot all the Is and cross all the Ts because of
> its syntax and type needs. Why make the job even harder?

Fixing things early /is/ easier. YMMobviouslyV.


David Brown

unread,
Feb 11, 2024, 9:06:40 AMFeb 11
to
On 11/02/2024 13:24, bart wrote:
> On 11/02/2024 02:46, Kaz Kylheku wrote:
>> On 2024-02-11, bart <b...@freeuk.com> wrote:
>

>> Granted, not every truth about a program is equally useful, but
>> experience shows that reporting unused identifiers pays off. My own
>> experience and experiences of others. That's why such diagnostics are
>> implemented in compilers.
>
> Take these declarations at file-scope:
>
>    typedef int A;
>    static int B;
>    int C;
>    typedef long long int int64_t;        // visible via stdint.h
>
> They are not used anywhere in this translation unit. gcc will report B
> being unused, but not the others.
>
> 'C' might be used in other translation units; I don't know if the linker
> will pick that up, or maybe that info is not known to it.
>

Some linkers certainly can pick up this kind of thing, and use it to
discard code, data, or sections that are not needed. It's not common to
warn about unused symbols, however, since you could easily be
overwhelmed. If you have a static library, or some C files that you are
using as a library, then any given program will probably only need a
small fraction of the functions they provide - unused code and data is
then not an indication of a probable error, but a normal part of coding.

> A and int64_t can't be reported because the declarations for them may be
> inside a header (as is the case for int64_t) used by other modules where
> they /are/ used.

Not quite.

Compiler warnings apply to a compilation, which is done on a translation
unit - generally a C file with some headers included in it. If you
write "static int B;" in a header and use it with two C files, one of
which uses the variable B and the other does not, then you'll get a
warning (with "gcc -Wall") for the one compilation but not for the other.

"int64_t" is defined in a system header. gcc (and other compilers)
treat system headers (any header included with < > brackets) differently
- most warnings are disabled for them, because they often contain lots
of things you don't need, and because they may use different styles than
you choose for your own code.

Unused typedefs don't trigger a warning in gcc (even with -Wall) unless
they are local to a function, because it's only in that case that it is
likely to be because of a bug in the code.

>
> But if not, they could also indicate errors. (Maybe there is also
> 'typedef float D', and some variable should have been type A not D.)
>
> So potentially useful information that you say is important, but can't
> be or isn't done by a compiler.
>

Warnings can never be perfect, can't catch all errors, and can't avoid
all false positives and false negatives.

> (This where whole-program compilers like the ones I do come into their
> own.)

Yes, whole-program analysis can do checks that cannot be done when
analysing individual translation units.

>
>>> I'm looking for a result from a compiler which is either Pass or Fail,
>>> not Maybe.
>>
>> Things are fortunately not going to revert to the 1982 state of the
>> art, though.
>>
>> The job of  the compile is not only to translate the code or report
>> failure, but to unearth noteworthy facts about the code and remark on
>> them.
>
> That's rubbish. People are quite happy to use endless scripting
> languages where the bytecode compiler does exactly that: translate
> source to linear bytecode in a no-nonsense fashion.

Some people might be happy with that. I am not.

To me, my main use of a compiler is a "development" tool. It helps me
develop correct code. It is entirely possible to have a compiler
separate from the linter - this was common in the early days of C, and
the most extensive static error checking is done by dedicated analysis
tools. But for a large amount of common static checking, there is a
strong overlap in the code analysis done by a checker and that done by
an optimiser - it makes sense to combine the two aspects of development.

Compilers can also be used as tools for building or installing software,
run by someone other than the developers of the software. In such
cases, the person running the compiler is far less interested in
warnings - they hope the code is bug-free when they get it. Even then,
however, warnings (as long as there are no false positives) can be
helpful in case something goes wrong.


bart

unread,
Feb 11, 2024, 9:17:45 AMFeb 11
to
If it's something that needs to be fixed, or is even part of the final
product.

A lot of code may be replaced five minutes later.

A program is gradually built and converges to its final form with lots
of deviations along the way. Obviously my mileage /is/ different as I
would find it stifling for it to conform to your standards at every
single step of the way, even for code with an expected half-life
measured in seconds.

(As it happens, I write most substantial projects in a different
language, and generate a C version, if needed, all at once using a
transpiler.

The development process in that other language is more informal, yet its
compiler fails assignments between incompatible pointer types, and
ignores unused labels. The sort of sensible behaviour I'd want in a C
compiler by default.)

Kaz Kylheku

unread,
Feb 11, 2024, 1:32:58 PMFeb 11
to
On 2024-02-11, bart <b...@freeuk.com> wrote:
> On 11/02/2024 02:46, Kaz Kylheku wrote:
>> On 2024-02-11, bart <b...@freeuk.com> wrote:
>
>>> This is a different kind of analysis. IMV it doesn't belong in a routine
>>> compilation, just something you do periodically, or when you're stuck
>>> for ideas.
>>
>> Periodically translates to never. If there are some situations you don't
>> want in the code, the best thing is to intercept any change which
>> introduces such, and not allow it to be merged.
>
> I have an option in one of compilers called '-unused'. It displays a
> list of unused parameter, local and global variables.

If that list is not in the standard error reporting format that
editors understand like:

foo.c:13:warning: unused variable "a"

it's going to be troublesome to use.
These diagnostics would be nice to have. They require that the
compiler check the file provenance of the declaration.

We only want to know that the typedefs and "int C" are not used,
if those declarations are in the same file, not if they
came from a header.

You wouldn't want

#include <stdio.h>

generating warnings that you didn't use ferror, fputs, scanf, ...!

> 'C' might be used in other translation units; I don't know if the linker
> will pick that up, or maybe that info is not known to it.

C might be used in other translation units; yet it would be useful to
have a warning that C is not used in this translation unit.

But only if the declaration didn't come from a header.

>> Things are fortunately not going to revert to the 1982 state of the
>> art, though.
>>
>> The job of the compile is not only to translate the code or report
>> failure, but to unearth noteworthy facts about the code and remark on
>> them.
>
> That's rubbish.

I.e. you're disagreeing with best practices from the software
engineering field.

> People are quite happy to use endless scripting
> languages where the bytecode compiler does exactly that: translate
> source to linear bytecode in a no-nonsense fashion.

This is an informal fallacy known as whataboutism.

Since "anyone can code", legions of dilettantes use poorly engineered
tools today.

So what? Some of them also don't test or document, or use version
control. So neither should you?

"There exist plumbers who string together pipes and hope for the best,
so engineering in the area of fluid dynamics is for shit."

> Those are people who want a fast or even instant turnaround.
>
> Some of use want to treat languages that target native code in the same
> way; like scripting languages, but with the benefit of strict
> type-checking and faster code!

All static information about a program is part of type checking!

Have you heard of the Curry-Howard correspondence? In a nutshell, it's
a mathematical result which says that type systems and formal logic
are equivalent.

Type checking isn't just about rejecting when an integer argument
is given to a string parameter.

Type checking means checking logical consistencies. When a compiler
checks types, it is evaluating logic.

Any logical proposition that we can verify about a program is a type
check.

If we decide to diagnose unused variables, that's a type check.

>>>> With the unused warnings, it got diagnosed.
>>>
>>> So you use the linty options when you're stuck with a bug, as I suggested.
>>
>> The point is, I would likely not have found that bug to this day without
>> the diagnostic. You want to be informed /before/ the bug is identified
>> in the field.
>
> So you run that option (like -unused in my product), /before/ it gets to
> the field.

Then you have to estimate: how many days before releasing to the field
do you do that, based on guessing how many bugs that might uncover.

It's extra grunt work that someone has to be assigned to.

If fixes arise out of it, they will all be root caused to earlier work
items, which are probably associated with closed work tickets. Do you
open new tickets for those, or re-open the old ones? Or just put it
under its own ticket?

If you always have all merged code in a state where there are no unused
identifiers, you don't have any of this.

> I can't routinely use -unused for the 100s of compiles I might do in one
> day, even if the source was up-to-date that morning with zero unused
> vars, because I will be compiling part-finished or temporarily
> commented-out code all the time. Eg. there might be an empty function body.

How recently and for how many years have you worked in a software
engineering team of more than five people, using tools that you didn't
cobble together yourself?

Thiago Adams

unread,
Feb 11, 2024, 1:43:36 PMFeb 11
to
I think it is possible to have the following: a way to specify a set of
warnings/errors (it could be a string, for instance), and a way to make
some warnings in this set standard.

> And beyond that, there are a variety of niche checking tools for
> particular cases, and large (and often expensive) code quality and
> static checking tool suites for more advanced checks.
>
>

Yes, I agree we can have tools, and each tool can solve the problem.

But my point in having something standardized is because we can have
"standardized safety" and "standardized mechanism to control static
analyses tools".

The same assumptions you have in one compiler you can have in another.

We can compare this approach with C++, for instance: when C++ has an
error where C has a warning, that error is part of the C++ language and
works in the same way in any compiler.

The other advantage is not having each tool with its own annotations.
Today GCC has some annotations, MSVC has SAL for instance etc.



Thiago Adams

unread,
Feb 11, 2024, 1:58:16 PMFeb 11
to
On 2/10/2024 9:31 PM, Keith Thompson wrote:
> bart <b...@freeuk.com> writes:
>> On 10/02/2024 01:59, Tim Rentsch wrote:
>>> bart <b...@freeuk.com> writes:
>>> [...]
>>>
>>>> This is something which has long been of fascination to me: how
>>>> exactly do you get a C compiler to actually fail a program with a
>>>> hard error when there is obviously something wrong, while not also
>>>> failing on completely harmless matters.
>
> The only thing that *requires* a compiler to reject a translation unit
> is the #error directive. For any violation of a syntax rule or
> constraint, the standard only requires a *diagnostic message*, which can
> be a non-fatal warning.



I haven't checked the standard but

#if 1/0
#endif


<source>:4:6: error: division by zero in preprocessor expression
4 | #if 1/0
| ~^~


stops compilation.


With constexpr in C23 I guess division by 0 will stop code generation as
well.




Keith Thompson

unread,
Feb 11, 2024, 4:34:05 PMFeb 11
to
Thiago Adams <thiago...@gmail.com> writes:
[...]
> We can compare this approach with C++ for instance, when in C++ we
> have an error and in C a warning, that means the error is part of the
> C++ language, it works in the same way in any compiler.

C++'s requirements for diagnostics are similar to C's:

If a program contains a violation of any diagnosable rule or an
occurrence of a construct described in this document as
“conditionally-supported” when the implementation does not support
that construct, a conforming implementation shall issue at least one
diagnostic message.

As in C, if a program violates language rule, the standard only requires
a diagnostic, which may be a non-fatal warning.

I've seen cases where similar errors are treated as warnings by gcc and
as fatal errors by g++ (in their default modes), but that's just a
choice by the implementers, not a choice imposed by the language. (The
choice for gcc is influenced by a perceived need to accept code that was
valid in earlier versions of the language, something that's less of a
concern in C++.)

And of course both gcc and g++ support the "-pedantic-errors" option.

Keith Thompson

unread,
Feb 11, 2024, 4:47:26 PMFeb 11
to
Then I suggest you check the standard. It's perfectly valid for an
implementation to choose to stop compilation if it encounters 1/0 in a
preprocessor expression, but it's not required. "The resulting tokens
compose the controlling constant expression which is evaluated according
to the rules of 6.6." 6.6, Constraints: "Each constant expression shall
evaluate to a constant that is in the range of representable values for
its type." So that's simply a constraint violation, requiring a
diagnostic but not requiring the translation unit to be rejected.

> With constexpr in C23 I guess division by 0 will stop code generation
> as well.

No, it merely requires a constant expression.
constexpr int n = 1/0;
is a constraint violation, requiring a diagnostic but not requiring
rejection.

Tim Rentsch

unread,
Feb 12, 2024, 2:49:13 AMFeb 12
to
My understanding of the C standard is that 'shall accept' is
meant in the sense of 'shall use its best efforts to complete
translation phases 1 through 8 successfully and produce an
executable'.

Where you say "5.1.2.3p6:" I expect you mean "4p3".

Where you say "the same paragraph" I expect you mean "4p6".

The word "reject" does not appear in the C standard. In my own
writing I am trying henceforth to use "accept" exclusively and
not use "reject". For the safe of discussion I can take "reject"
to mean the logical complement of "accept", which is to say a
program is either accepted or rejected, never both and never
neither. Does that last sentence match your own usage?

The C standard has only one place where a statement is made about
accepting a program, saying in 4p6 that implementations shall
accept any strictly conforming program; no other paragraph in the
standard mentions accepting a program. Given that, it's hard for
me to understand how someone could read the standard as saying
anything other than that a program must be accepted if it is
strictly conforming, but if the program is not strictly conforming
then there is no requirement that it be accepted. In short form, a
program must be accepted if and only if it is strictly conforming.
Does that summary mean something different than your phrase "*only*
strictly conforming programs must be accepted"? My understanding
of the C standard is that strictly conforming programs must be
accepted, but implementations are not required to accept any
program that is not strictly conforming.

In response to your question about 4p3, the short answer is that
any non-strictly-conforming program that an implementation chooses
not to accept is not correct in all other aspects, so 4p3 does not
apply. If you want to talk about that further we should split that
off into a separate thread, because 4p3 has nothing to do with
program acceptance.

I agree that an implementation that chooses not to accept any
program that it can determine to be not strictly conforming has
very little practical utility. On the other hand I don't think
that matters because no one is going to put in the effort needed to
produce such an implementation.

Regarding your last sentence

I don't see anything in the standard that says a
program can be rejected purely because it's not
strictly conforming, and I don't believe that was
the intent.

One, there is nothing in the C standard about rejecting a program,
only about what programs must be accepted, and

Two, in the last clause, I'm not completely sure what the "that"
is that you don't believe, but in any case I have no idea what
reasoning underlies your belief (or lack thereof). Can you
explain what it is you mean by that part of the sentence, and
what your reasoning is or why you think it?

David Brown

unread,
Feb 12, 2024, 5:32:55 AMFeb 12
to
I entirely agree that it would be nice if warnings and warning sets were
standardised - or at least there was a subset of standard warnings
supported. You'd have to pick new names for the subsets, and they would
probably have to be picked explicitly to avoid conflict with existing
code. But you could have a range of named string options for different
warnings, and sets of "standard C warning levels" - "scw1", "scw2", etc.
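Purely as a strawman for what such a standardized mechanism might look like (every name below is invented; nothing here exists in any standard or compiler, and "scw" follows the naming suggested above):

```c
/* HYPOTHETICAL syntax, for illustration only */
#pragma STDC WARNING_PROFILE "scw2"           /* select standard warning set 2 */
#pragma STDC WARNING "unused-variable" off    /* opt out of one named check    */
#pragma STDC WARNING "int-conversion" error   /* promote one check to an error */
```

The point is only that the names, the sets, and the enable/disable/promote mechanism would be identical across compilers, which is exactly what the thread says is missing today.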

gcc and clang already cooperate a lot with warning names. Intel icc
follows them. You'd have to get MS on board to make it a reality - and
they are the ones who would have to do a lot of work on their tools and
IDEs, since they currently use a number system. (It could have been
worse - if they had used names, but different names, it would be more
confusing.)

And for MS to be interested, and to avoid duplication of effort, you'd
want C++ included from day one.

> We can compare this approach with C++ for instance, when in C++ we have
> an error and in C a warning, that means the error is part of the C++
> language, it works in the same way in any compiler.

C++ diagnostics have the same rules as C's, and the main C++ compilers
are part of the same compiler suites as the main C compilers.

The difference is that when C++ started to become mainstream, there was
little in the way of existing code - compilers could mark bad practices
as warnings or even errors, by default. For C, however, there was a
significant body of code and some code that compiled and worked would
fail to compile if the new checks were enabled by default. Thus the C
compilers left them off by default.

So the reason for stricter defaults in the major C++ compilers compared
to their C siblings is backwards compatibility for older C source code.

vallor

Feb 13, 2024, 1:23:12 AM
On Sat, 10 Feb 2024 19:54:13 -0800, Tim Rentsch
<tr.1...@z991.linuxsc.com> wrote in <86sf1z4...@linuxsc.com>:
Thank you all for setting me straight on this. I'll stop
my Yoda conditionals and start using -Wall.

--
-v

Keith Thompson

Feb 18, 2024, 6:29:07 PM
[CORRECTION: 3p4]
>>
>> A program that is correct in all other aspects, operating on
>> correct data, containing unspecified behavior shall be a
>> correct program and act in accordance with 5.1.2.3.
>>
>> Does that not apply to the program above? How can it do so if it's
>> rejected (or not "accepted")?
>>
>> The same paragraph says that "A *conforming hosted implementation*
>> shall accept any strictly conforming program". Are you reading
>> that as implying that *only* strictly conforming programs must be
>> accepted?
>>
>> As a practical matter, an implementation that accepts *only*
>> strictly conforming programs would be very nearly useless. I
>> don't see anything in the standard that says a program can be
>> rejected purely because it's not strictly conforming, and I don't
>> believe that was the intent.
>
> My understanding of the C standard is that 'shall accept' is
> meant in the sense of 'shall use its best efforts to complete
> translation phases 1 through 8 successfully and produce an
> executable'.

That sounds reasonable. I wish the standard actually defined "accept".

> Where you say "5.1.2.3p6:" I expect you mean "4p3".

Yes.

> Where you say "the same paragraph" I expect you mean "4p6".

Yes.

> The word "reject" does not appear in the C standard. In my own
> writing I am trying henceforth to use "accept" exclusively and
> not use "reject". For the sake of discussion I can take "reject"
> to mean the logical complement of "accept", which is to say a
> program is either accepted or rejected, never both and never
> neither. Does that last sentence match your own usage?

Yes, "reject" means "not accept". There might be some nuance that that
definition misses, so I'll try to avoid using the word "reject" in this
discussion.

> The C standard has only one place where a statement is made about
> accepting a program, saying in 4p6 that implementations shall
> accept any strictly conforming program; no other paragraph in the
> standard mentions accepting a program. Given that, it's hard for
> me to understand how someone could read the standard as saying
> anything other than that a program must be accepted if it is
> strictly conforming, but if the program is not strictly conforming
> then there is no requirement that it be accepted. In short form, a
> program must be accepted if and only if it is strictly conforming.
> Does that summary mean something different than your phrase "*only*
> strictly conforming programs must be accepted"? My understanding
> of the C standard is that strictly conforming programs must be
> accepted, but implementations are not required to accept any
> program that is not strictly conforming.

Certainly a conforming implementation must accept any strictly
conforming program (insert handwaving about capacity limits).

I can understand how one might read that requirement as implying that an
implementation need not accept any program that is not strictly
conforming. I don't read it that way.

> In response to your question about 4p3, the short answer is that
> any non-strictly-conforming program that an implementation chooses
> not to accept is not correct in all other aspects, so 4p3 does not
> apply. If you want to talk about that further we should split that
> off into a separate thread, because 4p3 has nothing to do with
> program acceptance.

I say it does. Under 4p3, the above program (that prints the value of
INT_MAX) is a "correct program", so it must "act in accordance with
5.1.2.3". It cannot do so unless it is first accepted.

You're saying that the correctness of a program can depend on whether an
implementation chooses to accept it. I disagree.

An implementation that does not accept the above program is not
conforming because the implementation violates 4p3.

[...]