Does a constraint violation imply undefined behavior?

Keith Thompson

Apr 26, 2006, 10:46:39 PM

This came up in a discussion on comp.lang.c, subject "Size of Void".
Credit goes to Peter Nilsson for the analysis that led to this
question.

Consider this (not very useful) program:

int main(void)
{
    int i = 42;
    void *p;
    p = i;
    return 0;
}

The assignment violates the constraints for a simple assignment,
defined in C99 6.5.16.1p1, since the types of the operands are not
among the cases that are allowed.

But suppose the compiler issues the required diagnostic for this
constraint violation, but then proceeds to generate an executable for
the program.

C99 6.5.16.1p2, describing the semantics of simple assignment, says:

In _simple assignment_ (=), the value of the right operand is
converted to the type of the assignment expression and replaces
the value stored in the object designated by the left operand.

Does this apply even in the presence of a constraint violation? If
the implementation chooses to accept this program after issuing a
diagnostic, is it then required to treat the assignment
    p = i;
as if it were
    p = (void*)i;
?

I have a vague memory that, once a program violates a constraint, its
behavior is undefined; I'm now thinking that I was mistaken on this
point. In other words, I *think* the answer to the question in the
subject is "no", but I'm still not certain whether this is true or
whether it's intended.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Wojtek Lerch

Apr 27, 2006, 12:27:01 AM

"Keith Thompson" <ks...@mib.org> wrote in message
news:ln64kvs...@nuthaus.mib.org...

> I have a vague memory that, once a program violates a constraint, its
> behavior is undefined; I'm now thinking that I was mistaken on this
> point. In other words, I *think* the answer to the question in the
> subject is "no", but I'm still not certain whether this is true or
> whether it's intended.

I suspect the idea is that if a program violates the syntax or constraints
of the C language, it's not a (valid) C program, and therefore the C
standard does not define its semantics.

Keith Thompson

Apr 27, 2006, 12:35:22 AM

But where is that stated?

The standard doesn't require a compiler to reject a program that has a
constraint violation. Does accepting such a program release it from
any obligation to obey the standard?

Francis Glassborow

Apr 27, 2006, 7:10:23 AM

In article <lnslnzr...@nuthaus.mib.org>, Keith Thompson
<ks...@mib.org> writes

>But where is that stated?
>
>The standard doesn't require a compiler to reject a program that has a
>constraint violation. Does accepting such a program release it from
>any obligation to obey the standard?

Where is it stated that compilers should be psychic? If a compiler
accepts code that includes a constraint violation it is using an
extension and what then happens is entirely up to the implementation.
Apart from the QoI issue, would it be wrong if the implementor
documented that the consequence was to reformat your root drive?

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

kuy...@wizard.net

Apr 27, 2006, 7:15:31 AM

Keith Thompson wrote:
...

> I have a vague memory that, once a program violates a constraint, its
> behavior is undefined; I'm now thinking that I was mistaken on this
> point. In other words, I *think* the answer to the question in the
> subject is "no", but I'm still not certain whether this is true or
> whether it's intended.

In some contexts, I believe that when the standard defines the behavior
of a given code construct, it is defining that behavior only for those
cases where the relevant constraints have not been violated. In other
words, when the constraints are violated, the behavior is implicitly
undefined. This interpretation works when the relevant constraints need
to be obeyed in order for the semantics described to be meaningful. It
doesn't work when the constraint is independent of the semantics.
However, I can't cite any text supporting this idea.

Douglas A. Gwyn

Apr 27, 2006, 2:17:36 PM

Keith Thompson wrote:
> Does this apply even in in the presence of a constraint violation?

Technically, if there is a constraint violation then all bets
are off as to applicability of other rules.

However, if some C compiler wants to accept, as an extension,
void*p;p=42; then one would expect it to be the same as
void*p;p=(void*)42; and for "42" to be the virtual byte
address within the process. Of course addressing details
are highly platform-dependent.

Keith Thompson

Apr 27, 2006, 3:19:25 PM

"Douglas A. Gwyn" <DAG...@null.net> writes:
> Keith Thompson wrote:
>> Does this apply even in in the presence of a constraint violation?
>
> Technically, if there is a constraint violation then all bets
> are off as to applicability of other rules.

Does your use of the word "technically" imply that this is actually
stated in the standard? If so, where?

> However, if some C compiler wants to accept, as an extension,
> void*p;p=42; then one would expect it to be the same as
> void*p;p=(void*)42; and for "42" to be the virtual byte
> address within the process. Of course addressing details
> are highly platform-dependent..

Saying that "one would expect" this really doesn't answer my question.
I would expect the same thing myself. My question is what the
standard actually requires (which is why I asked here rather than in
comp.lang.c).

Douglas A. Gwyn

Apr 27, 2006, 6:58:54 PM

Keith Thompson wrote:
> "Douglas A. Gwyn" <DAG...@null.net> writes:
> > Keith Thompson wrote:
> >> Does this apply even in in the presence of a constraint violation?
> > Technically, if there is a constraint violation then all bets
> > are off as to applicability of other rules.
> Does your use of the word "technically" imply that this is actually
> stated in the standard? If so, where?

It's not a strictly conforming program if it doesn't obey the
grammar and constraints (and also in some other circumstances).
A conforming implementation is not obliged to accept any non-
strictly conforming program, but it may accept some of them.
That usually happens via use of some "conforming extension".

> > However, if some C compiler wants to accept, as an extension,
> > void*p;p=42; then one would expect it to be the same as
> > void*p;p=(void*)42; and for "42" to be the virtual byte
> > address within the process. Of course addressing details
> > are highly platform-dependent..
> Saying that "one would expect" this really doesn't answer my question.
> I would expect the same thing myself. My question is what the
> standard actually requires (which is why I asked here rather than in
> comp.lang.c).

6.3.2.3p5 (and corresponding footnote) contains the spec for
integer-to-pointer conversion. Essentially: it's implementation
defined, it's expected to correspond in a "natural" way to the
native addressing scheme, and whether it will work suitably in
any given case depends on getting a lot of details right.

Peter Nilsson

Apr 27, 2006, 7:36:40 PM

Francis Glassborow wrote:
> In article <lnslnzr...@nuthaus.mib.org>, Keith Thompson
> <ks...@mib.org> writes

> > The standard doesn't require a compiler to reject a program that has a
> > constraint violation. Does accepting such a program release it from
> > any obligation to obey the standard?
>
> Where is it stated that compilers should be psychic. If a compiler
> accepts code that includes a constraint violation it is using an
> extension

Is it? I think Keith should have added the context of the original
discussion in clc. Consider...

unsigned char c = -1L;

There is no extension here. The semantics are well defined because
there is an implicit conversion which is well defined.
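
To make this first case concrete, here is a minimal sketch of the conversion. It is not from the original post, and the function name converted_minus_one is made up for illustration. As far as I can tell, C99 6.3.1.3p2 reduces an out-of-range value modulo one more than the maximum of the unsigned type, so the stored value should be UCHAR_MAX on any conforming implementation:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns the value stored by the implicit
   conversion in `unsigned char c = -1L;`. */
unsigned char converted_minus_one(void)
{
    /* long -> unsigned char: no constraint is violated, and the value
       is reduced modulo UCHAR_MAX + 1 (C99 6.3.1.3p2). */
    unsigned char c = -1L;

    assert(c == UCHAR_MAX);
    return c;
}
```

Some compilers may still warn about the narrowing here, which is a reminder that a diagnostic by itself says nothing about whether the semantics are defined.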

Now consider the last declaration in...

int a = 42;
void *p = &a;
intptr_t i = p;

Ignoring the constraint violation, the semantics appear to be equally
well defined (albeit implementation defined), assuming intptr_t is
supported on the implementation. Indeed, the semantics appear identical
to...

intptr_t i = (intptr_t) p;

A compiler doesn't need 'psychic' powers! If the diagnostic is issued
in the form of a warning, then the only 'extension' is in continuing to
translate the program.

The issue is whether that continuance really negates the semantic
rules that would otherwise seem perfectly applicable!

> and what then happens is entirely up to the
> implementation.

But where does the standard state that constraint violations are ipso
facto undefined behaviour?

If the construct that violated the constraint has no valid semantics,
then fair enough, but what about the case where there _are_ valid
semantics?

> Apart from the QoI issue, would it be wrong if the implementor
> documented that the consequence was to reformat your root drive?

How is reformatting the root drive one of the unspecified possibilities
(requiring documentation) in the following...

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int a;

#ifdef INTPTR_MAX
    void *p = &a;
    intptr_t i = p; /* constraint violation */
    int *q = (void *) i;
#else
    int *q = &a;
#endif

    *q = 42;
    printf("%d\n", a);
}

If a given compiler treats the constraint violation by issuing a
warning rather than an error, then why shouldn't the programmer expect,
indeed demand, that the program's output be 42?

--
Peter

Keith Thompson

Apr 27, 2006, 8:07:48 PM

"Douglas A. Gwyn" <DAG...@null.net> writes:
> Keith Thompson wrote:
>> "Douglas A. Gwyn" <DAG...@null.net> writes:
>> > Keith Thompson wrote:
>> >> Does this apply even in in the presence of a constraint violation?
>> > Technically, if there is a constraint violation then all bets
>> > are off as to applicability of other rules.
>> Does your use of the word "technically" imply that this is actually
>> stated in the standard? If so, where?
>
> It's not a strictly conforming program if it doesn't obey the
> grammar and constraints (and also in some other circumstances).
> A conforming implementation is not obliged to accept any non-
> strictly conforming program, but it may accept some of them.
> That usually happens via use of some "conforming extension".

This program:

#include <stdio.h>
int main(void)
{
    if (sizeof(int) > 4) {
        puts("yes");
    }
    else {
        puts("no");
    }
    return 0;
}

is not strictly conforming, since its output depends on
implementation-defined behavior. (If I'm mistaken on this point,
please explain why.) Are you saying an implementation isn't required
to accept it?

>> > However, if some C compiler wants to accept, as an extension,
>> > void*p;p=42; then one would expect it to be the same as
>> > void*p;p=(void*)42; and for "42" to be the virtual byte
>> > address within the process. Of course addressing details
>> > are highly platform-dependent..
>> Saying that "one would expect" this really doesn't answer my question.
>> I would expect the same thing myself. My question is what the
>> standard actually requires (which is why I asked here rather than in
>> comp.lang.c).
>
> 6.3.2.3p5 (and corresponding footnote) contains the spec for
> integer-to-pointer conversion. Essentially: it's implementation
> defined, it's expected to correspond in a "natural" way to the
> native addressing scheme, and whether it will work suitably in
> any given case depends on getting a lot of details right.

Yes, I know the specification for integer-to-pointer conversion. The
only relevant point in this case, since I'm ignoring the result, is
that it does not invoke undefined behavior.

In my original article, I presented this program:

/* Program 1 */

int main(void)
{
    int i = 42;
    void *p;

    p = i; /* constraint violation */
    return 0;
}

Compare it to this one:

/* Program 2 */

int main(void)
{
    int i = 42;
    void *p;

    p = (void*)i;
    return 0;
}

Program 2 is strictly conforming. It stores some implementation-defined
value in p, but the output doesn't depend on that value, and there is
no undefined behavior. It produces no output.

Program 1 obviously contains a constraint violation, and therefore is
not strictly conforming. An implementation must issue a diagnostic,
and is allowed to reject it.

If the implementation chooses to accept Program 1 (after issuing a
diagnostic), and *if* it then invokes undefined behavior, then it is
allowed to, for example, print the string "Kaboom!" during execution.

May a conforming implementation compile Program 1, issue a diagnostic,
accept the program, and generate an executable that prints "Kaboom!"?
Or is the implementation required to obey the semantics of
integer-to-pointer conversion in this case?

Francis Glassborow

Apr 28, 2006, 8:12:23 AM

In article <1146181000.1...@j73g2000cwa.googlegroups.com>,
Peter Nilsson <ai...@acay.com.au> writes

>Now consider the last declaration in...
>
> int a = 42;
> void *p = &a;
> intptr_t i = p;
>
>Ignoring the constraint violation,

What constraint violation? In C (though not in C++) there is an implicit
conversion from void* to int* (which I assume is the type for which
intptr_t is an alias)

ais523

Apr 28, 2006, 9:14:26 AM

Francis Glassborow wrote:

> In article <1146181000.1...@j73g2000cwa.googlegroups.com>,
> Peter Nilsson <ai...@acay.com.au> writes
> >Now consider the last declaration in...
> >
> > int a = 42;
> > void *p = &a;
> > intptr_t i = p;
> >
> >Ignoring the constraint violation,
>
> What constraint violation? In C (though not in C++) there is an implicit
> conversion from void* to int* (which I assume is the type for which
> intptr_t is an alias)
>

No, intptr_t is an integer type guaranteed to be able to hold the value
of a void* (so that

    int a = 42;
    void *p = &a;
    intptr_t i = (intptr_t)p;
    printf("%d\n", *(int*)(void*)i);

must print 42 if intptr_t is defined by the implementation); see
7.18.1.4 in N1124.pdf. Although conversions between integers and
pointers are implementation-defined, converting a void* to an intptr_t
and back again is guaranteed to work if INTPTR_MAX is defined.
(uintptr_t and UINTPTR_MAX would also work; in fact, they are slightly
more likely to be defined.)
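
The round trip described in this post can be sketched as a small function. This is an illustration rather than code from the thread: read_through_intptr is a hypothetical name, and the #else branch is only an assumed fallback for implementations that do not provide intptr_t. Every conversion uses an explicit cast or an implicit void* conversion, so no constraint is violated.

```c
#include <assert.h>
#include <stdint.h>

#ifdef INTPTR_MAX
/* Hypothetical helper: reads *obj through a pointer that has been
   converted to intptr_t and back again. */
int read_through_intptr(int *obj)
{
    void *p = obj;               /* int* -> void*: implicit, always allowed */
    intptr_t i = (intptr_t)p;    /* void* -> intptr_t: explicit cast */
    int *q = (void *)i;          /* intptr_t -> void* -> int* */

    /* C99 7.18.1.4: the void* recovered from the intptr_t compares
       equal to the original pointer. */
    assert(q == obj);
    return *q;
}
#else
/* Assumed fallback when intptr_t is not available. */
int read_through_intptr(int *obj)
{
    return *obj;
}
#endif
```

Called with the address of an int holding 42, the function should return 42. The same shape should work with uintptr_t and UINTPTR_MAX.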

Michael Wojcik

Apr 29, 2006, 3:56:59 PM

In article <lnwtdaq...@nuthaus.mib.org>, Keith Thompson <ks...@mib.org> writes:
> "Douglas A. Gwyn" <DAG...@null.net> writes:
> > Keith Thompson wrote:
> >> Does this apply even in in the presence of a constraint violation?
> >
> > Technically, if there is a constraint violation then all bets
> > are off as to applicability of other rules.
>
> Does your use of the word "technically" imply that this is actually
> stated in the standard? If so, where?

I would argue this follows from the definition of "constraint" in
3.8 (in C99 / N1124):

restriction, either syntactic or semantic, by which the exposition
of language elements is to be interpreted

If a constraint is violated, the TU does not represent a C program,
because it can no longer be interpreted according to the standard.
Thus if an implementation (after issuing the diagnostic) produces a
program, it has not produced a C program.

I agree, though, that the standard does NOT say that violating a
constraint produces UB. It explicitly states that violating a
"shall" rule *outside* a constraint produces UB, which implies that
a violation *inside* a constraint does not. I suspect that's because the
required diagnostic (modulo QoI issues in producing meaningful
diagnostics) means that the implementation has already indicated that
it is not producing a C program, and so there's no need to apply the
catchall UB label. Because the standard no longer applies, it can't
define it; the entire program is now undefined as far as C is
concerned.

Interestingly, this makes a constraint violation more powerful than
ordinary UB: UB only takes effect "upon use of" an erroneous
construct, while constraint violations render the entire program
undefined.

But that's just my interpretation, of course.

--
Michael Wojcik michael...@microfocus.com

The lecturer was detailing a proof on the blackboard. He started to say,
"From the above it is obvious that ...". Then he stepped back and thought
deeply for a while. Then he left the room. We waited. Five minutes
later he returned smiling and said, "Yes, it is obvious", and continued
to outline the proof. -- John O'Gorman

James Dennett

Apr 30, 2006, 7:47:09 AM

Keith Thompson wrote:
> "Douglas A. Gwyn" <DAG...@null.net> writes:
>> Keith Thompson wrote:
>>> "Douglas A. Gwyn" <DAG...@null.net> writes:
>>>> Keith Thompson wrote:
>>>>> Does this apply even in in the presence of a constraint violation?
>>>> Technically, if there is a constraint violation then all bets
>>>> are off as to applicability of other rules.
>>> Does your use of the word "technically" imply that this is actually
>>> stated in the standard? If so, where?

I should check the C standard's definition of "undefined";
I'm assuming (dangerously) that it's similar to C++'s, in
which case omission of a definition is one way that something
can be undefined.

>>>> However, if some C compiler wants to accept, as an extension,
>>>> void*p;p=42; then one would expect it to be the same as
>>>> void*p;p=(void*)42; and for "42" to be the virtual byte
>>>> address within the process. Of course addressing details
>>>> are highly platform-dependent..
>>> Saying that "one would expect" this really doesn't answer my question.
>>> I would expect the same thing myself. My question is what the
>>> standard actually requires (which is why I asked here rather than in
>>> comp.lang.c).
>> 6.3.2.3p5 (and corresponding footnote) contains the spec for
>> integer-to-pointer conversion. Essentially: it's implementation
>> defined, it's expected to correspond in a "natural" way to the
>> native addressing scheme, and whether it will work suitably in
>> any given case depends on getting a lot of details right.
>
> Yes, I know the specification for integer-to-pointer conversion. The
> only relevant point in this case, since I'm ignoring the result, is
> that it does not invoke undefined behavior.
>
> In my original article, I presented this program:
>
> /* Program 1 */
> int main(void)
> {
> int i = 42;
> void *p;
> p = i; /* constraint violation */
> return 0;
> }
>

[snip]

> May a conforming implementation compile Program 1, issue a diagnostic,
> accept the program, and generate an executable that prints "Kaboom!"?
> Or is the implementation required to obey the semantics of
> integer-to-pointer conversion in this case?

As nothing in the standard defines the behavior of
program 1 (or even requires that it compile), its
behavior is undefined -- the most basic form of
undefined, I might say, being lack of definition.

AFAIK, all that is required for code containing a
diagnosable constraint violation is a diagnostic.
In this specific case, no definition is given to
the behavior of p = i (except that it is a constraint
violation requiring a diagnostic).

So: I believe it's perfectly conforming for an
implementation to accept program 1, issue a
diagnostic, and output an executable that prints
"Kaboom!".

-- James

Harald van Dĳk

Apr 30, 2006, 8:02:00 AM

Michael Wojcik wrote:
> In article <lnwtdaq...@nuthaus.mib.org>, Keith Thompson <ks...@mib.org> writes:
> > "Douglas A. Gwyn" <DAG...@null.net> writes:
> > > Keith Thompson wrote:
> > >> Does this apply even in in the presence of a constraint violation?
> > >
> > > Technically, if there is a constraint violation then all bets
> > > are off as to applicability of other rules.
> >
> > Does your use of the word "technically" imply that this is actually
> > stated in the standard? If so, where?
>
> I would argue this follows from the definition of "constraint" in
> 3.8 (in C99 / N1124):
>
> restriction, either syntactic or semantic, by which the exposition
> of language elements is to be interpreted

Maybe more directly from 4#6, "A conforming implementation may have
extensions (including additional library functions), provided they do
not alter the behavior of any strictly conforming program." An
extension which requires a constraint violation to use cannot alter the
behaviour of any strictly conforming program, so is allowed, unless I'm
misunderstanding.

> Interestingly, this makes a constraint violation more powerful than
> ordinary UB: UB only takes effect "upon use of" an erroneous
> construct, while constraint violations render the entire program
> undefined.

OTOH, if a program with a constraint violation is accepted, an
extension is used. All extensions must be documented (4#8). So even if
a compiler can choose to have a program reformat your hard drive if an
int is assigned to a pointer without a cast in a code block which would
never be executed, it must warn you in advance.

That's just my view.

kuy...@wizard.net

Apr 30, 2006, 12:27:52 PM

James Dennett wrote:
> Keith Thompson wrote:
...

> I should check the C standard's definition of "undefined";
> I'm assuming (dangerously) that it's similar to C++'s, in
> which case omission of a definition is one way that something
> can be undefined.

Yes, that same rule applies to C. However, you have to be careful how
to apply it. For instance, 6.5.5p4 says "The result of the binary *
operator is the product of the operands.". Now, that definition says
nothing about the bear that is hiding behind that tree in the north
field. People have made arguments that are logically equivalent to
assuming that since the standard doesn't mention that bear, the
behavior is undefined if the bear decides to sit down. However, the
standard does define the behavior, even in that case, despite the fact
that it doesn't mention the bear. That definition contains no
exclusions based upon the bear's activities, and must therefore
continue to apply, regardless of what those activities are. The
behavior continues to be defined even if he sits on your computer,
preventing it from operating. The correct description of what has
happened in that case is not that this is allowable behavior, because
the behavior is undefined. Rather, the behavior of the bear has turned
your computer system into one that no longer contains a conforming
implementation of C.

In the case of constraint violations, if the standard contains a
description of the behavior that continues to be meaningful if the
constraint is violated, then IMO the behavior continues to be defined;
a diagnostic message is mandatory, but a conforming implementation is
still required to produce behavior consistent with the standard's
specifications. However, in most cases constraint violations render
some aspect of the standard's defined behavior meaningless, so this is
usually not an issue.

...

> > /* Program 1 */
> > int main(void)
> > {
> > int i = 42;
> > void *p;
> > p = i; /* constraint violation */
> > return 0;
> > }
> >
>
> [snip]
>
> > May a conforming implementation compile Program 1, issue a diagnostic,
> > accept the program, and generate an executable that prints "Kaboom!"?
> > Or is the implementation required to obey the semantics of
> > integer-to-pointer conversion in this case?
>
> As nothing in the standard defines the behavior of
> program 1 (or even requires that it compile), its
> behavior is undefined -- the most basic form of
> undefined, I might say, being lack of definition.

As far as I can see, the behavior is not undefined for lack of a
definition. The standard defines the required behavior for that
statement in sections 6.5.16.1p2, 6.3.2.3p5, and 6.2.6.1p5. It's the
explicitly specified undefined behavior mentioned in that last section
that renders the behavior of this program undefined. It's not undefined
just because of violating the constraints in 6.5.16.1p1.

> AFAIK, all that is required for code containing a
> diagnosable constraint violation is a diagnostic.

I can't find anything in the standard that waives the other requirements
imposed by the standard, just because a constraint has been violated.

Keith Thompson

May 1, 2006, 12:15:15 AM

"Harald van Dĳk" <tru...@gmail.com> writes:
> Michael Wojcik wrote:
>> In article <lnwtdaq...@nuthaus.mib.org>, Keith Thompson
>> <ks...@mib.org> writes:
>> > "Douglas A. Gwyn" <DAG...@null.net> writes:
>> > > Keith Thompson wrote:
>> > >> Does this apply even in in the presence of a constraint violation?
>> > >
>> > > Technically, if there is a constraint violation then all bets
>> > > are off as to applicability of other rules.
>> >
>> > Does your use of the word "technically" imply that this is actually
>> > stated in the standard? If so, where?
>>
>> I would argue this follows from the definition of "constraint" in
>> 3.8 (in C99 / N1124):
>>
>> restriction, either syntactic or semantic, by which the exposition
>> of language elements is to be interpreted
>
> Maybe more directly from 4#6, "A conforming implementation may have
> extensions (including additional library functions), provided they do
> not alter the behavior of any strictly conforming program." An
> extension which requires a constraint violation to use cannot alter the
> behaviour of any strictly conforming program, so is allowed, unless I'm
> misunderstanding.

Sure, but accepting a program that includes a constraint violation
isn't necessarily an extension.

The standard doesn't require programs with constraint violations to be
rejected. A conforming implementation could accept such programs even
if the "may have extensions" clause weren't in the standard.

Once the program is accepted, its behavior is defined if and only if
the standard defines the behavior.

I think.

Harald van Dĳk

May 1, 2006, 2:09:49 AM

Keith Thompson wrote:
> "Harald van Dĳk" <tru...@gmail.com> writes:
> > Michael Wojcik wrote:
> >> In article <lnwtdaq...@nuthaus.mib.org>, Keith Thompson
> >> <ks...@mib.org> writes:
> >> > "Douglas A. Gwyn" <DAG...@null.net> writes:
> >> > > Keith Thompson wrote:
> >> > >> Does this apply even in in the presence of a constraint violation?
> >> > >
> >> > > Technically, if there is a constraint violation then all bets
> >> > > are off as to applicability of other rules.
> >> >
> >> > Does your use of the word "technically" imply that this is actually
> >> > stated in the standard? If so, where?
> >>
> >> I would argue this follows from the definition of "constraint" in
> >> 3.8 (in C99 / N1124):
> >>
> >> restriction, either syntactic or semantic, by which the exposition
> >> of language elements is to be interpreted
> >
> > Maybe more directly from 4#6, "A conforming implementation may have
> > extensions (including additional library functions), provided they do
> > not alter the behavior of any strictly conforming program." An
> > extension which requires a constraint violation to use cannot alter the
> > behaviour of any strictly conforming program, so is allowed, unless I'm
> > misunderstanding.
>
> Sure, but accepting a program that includes a constraint violation
> isn't necessarily an extension.

If you meant "not necessarily" as a question about the definition of
"extension", while it's not explicitly defined, according to some of
the examples in J.5, even different semantics to strictly conforming
programs can be considered as extensions (which would make an
implementation nonconforming), so why can't giving different semantics
to programs with constraint violations be?

If you meant that accepting a program that includes a constraint
violation may be an extension on one implementation and not on another,
then I don't see how that matters if that one implementation can still
exist.

> The standard doesn't require programs with constraint violations to be
> rejected. A conforming implementation could accept such programs even
> if the "may have extensions" clause weren't in the standard.
>
> Once the program is accepted, its behavior is defined if and only if
> the standard defines the behavior.
>
> I think.

I can't find support for that idea, and it can cause problems. An
example:

#include <stdlib.h>
static int a = EXIT_FAILURE;
static int \u0061; /* constraint violation; 6.4.3 */
int main(void) {
    return \u0061;
}

Except for the constraint violation, I believe this is perfectly valid.
(If I'm wrong, please do correct me.) But if an implementation decides
to accept this, what should it return? Should \u0061 be the same
variable as a, so should EXIT_FAILURE be returned? Or should \u0061 be
a distinct variable which is default-initialised to 0? The former is
what I would hope implementations which choose to accept it do, the
latter is what I expect simple implementations which choose to accept
it to do (and what I believe is defined by the standard if there were
no constraint violation). Personally, I would think both should be
allowed, and I think both are.

Harald van Dĳk

May 1, 2006, 2:49:51 AM

The example I gave was a bad one, sorry. Here's another attempt:

int main(void) {
    return 1. % 2.; /* 6.5.5 */
}

"If the quotient a/b is representable, the expression (a/b)*b + a%b
shall equal a."

In this sentence no distinction is made between % applied to integers
and % applied to other types, so 1. % 2., if this weren't a constraint
violation, should return 0 (unless there are very strange rounding
errors). Does this mean an implementation can't choose to support % on
doubles behaving as on integers?
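
The arithmetic behind this question can be checked with a short sketch. It is an illustration only: int_remainder and remainder_from_identity are hypothetical names, and the second function is not an implementation of "%" on doubles. It merely solves the quoted identity (a/b)*b + a%b == a for a%b using floating-point division.

```c
#include <assert.h>

/* Integer %, as C99 6.5.5 defines it; integer division truncates. */
int int_remainder(int a, int b)
{
    assert(b != 0);    /* a zero divisor is undefined behavior */
    return a % b;
}

/* Hypothetical: the value the identity (a/b)*b + a%b == a would force
   on a%b if a and b were doubles and a/b were floating-point division. */
double remainder_from_identity(double a, double b)
{
    return a - (a / b) * b;
}
```

For 1 and 2 the integer remainder is 1, because 1/2 truncates to 0. For 1.0 and 2.0 the quotient 0.5 is exact, so the identity leaves a remainder of 0.0. An implementation that made % on doubles "behave as on integers" would therefore not satisfy the identity as written, which seems to be the tension the question points at.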

Keith Thompson

May 1, 2006, 1:38:47 PM

"Harald van Dĳk" <tru...@gmail.com> writes:

> Keith Thompson wrote:
[...]

>> Sure, but accepting a program that includes a constraint violation
>> isn't necessarily an extension.
>
> If you meant "not necessarily" as a question about the definition of
> "extension", while it's not explicitly defined, according to some of
> the examples in J.5, even different semantics to strictly conforming
> programs can be considered as extensions (which would make an
> implementation nonconforming), so why can't giving different semantics
> to programs with constraint violations be?
>
> If you meant that accepting a program that includes a constraint
> violation may be an extension on one implementation and not on another,
> then I don't see how that matters if that one implementation can still
> exist.

My argument is that, since accepting a program with a constraint
violation is allowed by the standard, doing so does not require any
kind of extension. What the implementation does after that is another
question.

>> The standard doesn't require programs with constraint violations to be
>> rejected. A conforming implementation could accept such programs even
>> if the "may have extensions" clause weren't in the standard.
>>
>> Once the program is accepted, its behavior is defined if and only if
>> the standard defines the behavior.
>>
>> I think.
>
> I can't find support for that idea, and it can cause problems. An
> example:

[snip]

Yes, it causes problems. I'll go into that further in another
followup.

Keith Thompson

May 1, 2006, 1:57:41 PM

"Harald van Dĳk" <tru...@gmail.com> writes:
> Harald van Dĳk wrote:
>> Keith Thompson wrote:

[...]

>> > The standard doesn't require programs with constraint violations to be
>> > rejected. A conforming implementation could accept such programs even
>> > if the "may have extensions" clause weren't in the standard.
>> >
>> > Once the program is accepted, its behavior is defined if and only if
>> > the standard defines the behavior.
>> >
>> > I think.
>>
>> I can't find support for that idea, and it can cause problems. An
>> example:
>
> The example I gave was a bad one, sorry. Here's another attempt:
>
> int main(void) {
>     return 1. % 2.; /* 6.5.5 */
> }
>
> "If the quotient a/b is representable, the expression (a/b)*b + a%b
> shall equal a."
>
> In this sentence no distinction is made between % applied to integers
> and % applied to other types, so 1. % 2., if this weren't a constraint
> violation, should return 0 (unless there are very strange rounding
> errors). Does this mean an implementation can't choose to support % on
> doubles behaving as on integers?

Interesting question.

The example I posted earlier is more straightforward:

int main(void)
{
    int i = 42;
    void *p;
    p = i;
    return 0;
}

The assignment "p = i;" violates a constraint, but once the program is
accepted, the semantics *seem* to be well defined by the standard; the
RHS is converted to the type of the LHS, and the result is
implementation-defined.
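
For comparison, here is a minimal sketch of the explicit form of that
conversion, which violates no constraint. The function names are made up
for illustration, and intptr_t is an optional type in C99, so this assumes
an implementation that provides it. The pointer value produced is
implementation-defined and must not be dereferenced.

```c
#include <stdint.h>

/* Explicit form of the conversion: int -> intptr_t -> void *.
 * Going through intptr_t avoids size-mismatch warnings on platforms
 * where int is narrower than a pointer. The resulting pointer value
 * is implementation-defined. */
void *int_to_pointer(int i)
{
    return (void *)(intptr_t)i;
}

/* Converts back. Recovering the original integer is what mainstream
 * implementations do, but the standard does not promise it for an
 * arbitrary integer value. */
int pointer_to_int(void *p)
{
    return (int)(intptr_t)p;
}
```

The open question is whether an implementation that accepts "p = i;" after
its diagnostic must behave like int_to_pointer(i), or may do anything at
all.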

In other cases, things get confusing, probably because the various
Semantics sections in the standard were written under the assumption
that all Constraints have been satisfied. I doubt that anyone thought
about the semantics of floating-point "%" (or struct "%", for that
matter).
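
To make the two readings of the floating-point case concrete, here is a
small sketch. The integer identity is the one quoted from 6.5.5; the two
double-valued functions are hypothetical (nothing like them appears in the
standard) and only show what "1. % 2." would yield if the identity were
applied literally versus if "%" truncated the quotient as integer division
does.

```c
/* Integer case: C99 6.5.5 requires (a/b)*b + a%b == a whenever a/b is
 * representable. The caller must ensure b != 0. */
int identity_holds(int a, int b)
{
    return (a / b) * b + a % b == a;
}

/* Hypothetical "%" on doubles taken literally from the identity, with
 * a/b being ordinary floating-point division. For a = 1, b = 2 the
 * quotient is 0.5, so the remainder comes out as 0. */
double rem_from_identity(double a, double b)
{
    return a - (a / b) * b;
}

/* Hypothetical "%" on doubles that truncates the quotient toward zero,
 * mirroring integer division. Assumes the quotient fits in long long.
 * For a = 1, b = 2 the truncated quotient is 0, so the remainder is 1. */
double rem_truncating(double a, double b)
{
    double q = (double)(long long)(a / b);
    return a - q * b;
}
```

The two hypothetical definitions disagree on 1. and 2., which is the
conflict raised in the quoted question.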

It's tempting to fall back to common sense, but I think C's treatment
of constraint violations already violates what most programmers would
expect. In most languages, the equivalent of a constraint violation
means that the program must be rejected. C's rule that a constraint
violation requires a diagnostic, but doesn't require the program to be
rejected, opens up a huge gray area. As far as I can tell, this
question is *not* clearly resolved by the wording in the standard.

I suggest that the best solution would be to add an explicit statement
to the standard that a program that violates a constraint invokes
undefined behavior. This would not affect any implementations (they'd
be free to continue to act exactly as they do now), but it would
encourage programmers to be more cautious about constraint violations
that happen to be accepted by their compilers. Some would argue that
this added rule would merely state explicitly what's already implicit
in the standard. I'm not sure whether it does, but in any case a
little redundancy wouldn't hurt.

A more radical change would be to require any program with a
constraint violation to be rejected. Like all rules in the standard,
this would apply only when the compiler is invoked in conforming mode
(which needn't be the default). There are compilers that, even in
conforming mode, merely issue warnings for some constraint violations,
and then accept the program anyway; this would have to be changed.
But all compilers would still be free to have a non-conforming mode in
which such programs are accepted with a warning, or even accepted
silently. In my opinion, this would make C's treatment of constraint
violations closer to what most programmers expect, while still
allowing the current behavior *as an extension*.

Douglas A. Gwyn

May 2, 2006, 5:05:51 PM

Michael Wojcik wrote:
> Interestingly, this makes a constraint violation more powerful than
> ordinary UB: UB only takes effect "upon use of" an erroneous
> construct, while constraint violations render the entire program
> undefined.

And that's intentional.

Douglas A. Gwyn

May 2, 2006, 5:04:51 PM

kuy...@wizard.net wrote:
> I can't find anything in the standard that waives the other requirements
> imposed by the standard, just because a constraint has been violated.

But if a program violates enough constraints,
it isn't even close to C anymore, and we don't
want conforming compilers to have to figure out
some "sane" interpretation for such a program.
That's why I pointed to the requirement for
conforming implementations to accept s.c.
programs (and not necessarily anything else),
which a constraint violator is not -- it lets
the implementation off the hook for malformed
programs.

Jordan Abel

May 2, 2006, 5:32:50 PM

Aren't implementations permitted to fail to translate units containing
constraint violations? Once that's allowed, it's not clear what further
requirements can meaningfully be placed on the implementation.

For an example of "surprising" behavior ['example of undefined behavior'
isn't really useful, so I'll say "surprising behavior" for something
other than what someone expects], a conforming C90 implementation can
interpret the presence of two instances of the keyword "long" in
a single declaration (a syntax error) as calling for a type wider than
long int. A diagnostic is of course required; such a compiler might
choose to word its diagnostic as something along the lines of "long long
type not permitted by strict ANSI C".
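
A minimal sketch of the kind of code in question follows. It is valid C99;
under C90 rules the declarations using "long long" need a diagnostic, after
which an implementation may still treat the type as an extension. The
function names are invented for illustration, and the exact diagnostic
wording (for instance from GCC with -std=c90 -pedantic) varies by compiler.

```c
#include <limits.h>

/* Valid in C99. A C90 implementation must diagnose "long long", but may
 * then accept it and give it the obvious meaning as an extension. */
long long largest_long_long(void)
{
    return LLONG_MAX;
}

/* Reports whether the accepted type really is at least as wide as long,
 * which is what C99 requires of long long. */
int long_long_at_least_as_wide_as_long(void)
{
    return sizeof(long long) >= sizeof(long);
}
```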

Keith Thompson

May 2, 2006, 5:51:31 PM

"Douglas A. Gwyn" <DAG...@null.net> writes:

But surely implementations are "on the hook" for more than just
*strictly* conforming programs. As I mentioned earlier, something
like this:

#include <stdio.h>
int main(void)
{
    printf("sizeof(int) = %d\n", (int)sizeof(int));
    return 0;
}

is not strictly conforming, but I'd certainly expect an implementation
to handle it properly.

I don't see how the dividing line between strictly conforming and non
strictly conforming programs is useful in this discussion.

I do think that it would be reasonable to say that the behavior of any
program that violates a constraint (and is nevertheless accepted by
the implementation) should be undefined. I just don't see a clear
statement of that in the standard.

Keith Thompson

May 2, 2006, 6:52:04 PM

Jordan Abel <rand...@gmail.com> writes:
> On 2006-05-02, Douglas A. Gwyn <DAG...@null.net> wrote:
>> kuy...@wizard.net wrote:
>>> I can't find anything in the standard that waives the other requirements
>>> imposed by the standard, just because a constraint has been violated.
>>
>> But if a program violates enough constraints, it isn't even close to
>> C anymore, and we don't want conforming compilers to have to figure
>> out some "sane" interpretation for such a program. That's why
>> I pointed to the requirement for conforming implementations to accept
>> s.c. programs (and not necessarily anything else), which a constraint
>> violator is not -- it lets the implementation off the hook for
>> malformed programs.
>
> Aren't implementations permitted to fail to translate units containing
> constraint violations? Once that's allowed, it's not clear what further
> requirements can meaningfully be placed on the implementation.

Unless the standard places such requirements on the implementation.

In the example that started this thread, a simple assignment without
an explicit cast violated a constraint, but (a literal reading of) the
semantics for a simple assignment imply that an implicit conversion
should then be done. The semantics of that conversion are
implementation-defined, but not undefined.

Certainly an implementation is allowed to reject any program that
contains a constraint violation. It's also allowed to reject a
program with, for example, more than 1023 enumeration constants in a
single enumeration (C99 5.2.4.1) -- but if it chooses to accept it, it
can't decide to implement its own semantics that violate the
requirements of the standard. (At least I *hope* it can't.)

Yes, there are some constraint violations that make the program
nothing more than gobbledygook -- i.e., the standard *doesn't* define
their behavior. But in other cases, the behavior is still defined.

I'd like to see an explicit statement in the standard that the
behavior is actually undefined.

> For an example of "surprising" behavior ['example of undefined behavior'
> isn't really useful, so I'll say "surprising behavior" for something
> other than what someone expects], a conforming C90 implementation can
> interpret the presence of two instances of the keyword "long" in
> a single declaration (a syntax error) as calling for a type wider than
> long int. A diagnostic is of course required; such a compiler might
> choose to word its diagnostic as something along the lines of "long long
> type not permitted by strict ANSI C".

Likewise, a conforming C99 compiler could interpret the presence of
three instances of "long" as calling for a type wider than
long long int. But the standard doesn't define the behavior of
"long long long int", so the behavior is undefined, so the
implementation is free to do whatever it likes. My current concern is
cases where the behavior appears to be defined.

Jordan Abel

May 2, 2006, 9:32:47 PM

On 2006-05-02, Keith Thompson <ks...@mib.org> wrote:
> Jordan Abel <rand...@gmail.com> writes:
>> On 2006-05-02, Douglas A. Gwyn <DAG...@null.net> wrote:
>>> kuy...@wizard.net wrote:
>>>> I can't find anything in the standard that waives the other requirements
>>>> imposed by the standard, just because a constraint has been violated.
>>>
>>> But if a program violates enough constraints, it isn't even close to
>>> C anymore, and we don't want conforming compilers to have to figure
>>> out some "sane" interpretation for such a program. That's why
>>> I pointed to the requirement for conforming implementations to accept
>>> s.c. programs (and not necessarily anything else), which a constraint
>>> violator is not -- it lets the implementation off the hook for
>>> malformed programs.
>>
>> Aren't implementations permitted to fail to translate units containing
>> constraint violations? Once that's allowed, it's not clear what further
>> requirements can meaningfully be placed on the implementation.
>
> Unless the standard places such requirements on the implementation.

Why? It failed to translate and happened to create what looks like
translation output but does something else after external references are
resolved and it is executed.

> Certainly an implementation is allowed to reject any program that
> contains a constraint violation. It's also allowed to reject a
> program with, for example, more than 1023 enumeration constants in a
> single enumeration (C99 5.2.4.1) -- but if it chooses to accept it, it
> can't decide to implement its own semantics that violate the
> requirements of the standard. (At least I *hope* it can't.)

>> For an example of "surprising" behavior ['example of undefined

Keith Thompson

May 2, 2006, 10:36:34 PM

Jordan Abel <rand...@gmail.com> writes:
> On 2006-05-02, Keith Thompson <ks...@mib.org> wrote:
>> Jordan Abel <rand...@gmail.com> writes:
>>> On 2006-05-02, Douglas A. Gwyn <DAG...@null.net> wrote:
>>>> kuy...@wizard.net wrote:
>>>>> I can't find anything in the standard that waives the other requirements
>>>>> imposed by the standard, just because a constraint has been violated.
>>>>
>>>> But if a program violates enough constraints, it isn't even close to
>>>> C anymore, and we don't want conforming compilers to have to figure
>>>> out some "sane" interpretation for such a program. That's why
>>>> I pointed to the requirement for conforming implementations to accept
>>>> s.c. programs (and not necessarily anything else), which a constraint
>>>> violator is not -- it lets the implementation off the hook for
>>>> malformed programs.
>>>
>>> Aren't implementations permitted to fail to translate units containing
>>> constraint violations? Once that's allowed, it's not clear what further
>>> requirements can meaningfully be placed on the implementation.
>>
>> Unless the standard places such requirements on the implementation.
>
> Why? It failed to translate and happened to create what looks like
> translation output but does something else after external references are
> resolved and it is executed.

No, it didn't fail to translate. In the scenario I'm discussing, the
program is successfully translated by a conforming implementation.
The implementation produced a required diagnostic for the constraint
violation (which is all the standard requires it to do).

Jordan Abel

May 2, 2006, 11:33:30 PM

On 2006-05-03, Keith Thompson <ks...@mib.org> wrote:
> Jordan Abel <rand...@gmail.com> writes:
>> On 2006-05-02, Keith Thompson <ks...@mib.org> wrote:
>>> Jordan Abel <rand...@gmail.com> writes:
>>>> On 2006-05-02, Douglas A. Gwyn <DAG...@null.net> wrote:
>>>>> kuy...@wizard.net wrote:
>>>>>> I can't find anything in the standard that waives the other requirements
>>>>>> imposed by the standard, just because a constraint has been violated.
>>>>>
>>>>> But if a program violates enough constraints, it isn't even close to
>>>>> C anymore, and we don't want conforming compilers to have to figure
>>>>> out some "sane" interpretation for such a program. That's why
>>>>> I pointed to the requirement for conforming implementations to accept
>>>>> s.c. programs (and not necessarily anything else), which a constraint
>>>>> violator is not -- it lets the implementation off the hook for
>>>>> malformed programs.
>>>>
>>>> Aren't implementations permitted to fail to translate units containing
>>>> constraint violations? Once that's allowed, it's not clear what further
>>>> requirements can meaningfully be placed on the implementation.
>>>
>>> Unless the standard places such requirements on the implementation.
>>
>> Why? It failed to translate and happened to create what looks like
>> translation output but does something else after external references are
>> resolved and it is executed.
>
> No, it didn't fail to translate. In the scenario I'm discussing, the
> program is successfully translated by a conforming implementation.

Except that a conforming implementation doesn't _have_ to successfully
translate it. It can, as in _my_ scenario, fail to translate and produce
garbage output which might be misinterpreted by the casual reader as
a mistranslation.

> The implementation produced a required diagnostic for the constraint
> violation (which is all the standard requires it to do).

Right - that's ALL it requires it to do. It doesn't even require it to
successfully translate.

Douglas A. Gwyn

May 3, 2006, 11:53:07 AM

Keith Thompson wrote:
> ... As I mentioned earlier, something
> like this:
> #include <stdio.h>
> int main(void)
> {
>     printf("sizeof(int) = %d\n", (int)sizeof(int));
>     return 0;
> }
> is not strictly conforming, but I'd certainly expect an implementation
> to handle it properly.

The real problem is that the C standard has never specified enough
"conformance categories", and that strict conformance of programs
was defined in terms of invariance of output. (At least as of C99
we allow the output to depend on localization.) What was intended,
but which we couldn't think of a good way to say in "standardese",
is that the correct operation of a s.c. program should not depend
on any implementation choice. We wimped out by turning that into
just a requirement on the observable output instead.
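
One way to picture the gap between "output does not vary" and "operation
does not depend on an implementation choice" is the sketch below. It is an
illustration only, not wording from the standard, and both function names
are invented. The first function's result differs between implementations;
the second consults an implementation-defined value (INT_MAX) yet yields
the same result everywhere, because the standard guarantees INT_MAX is at
least 32767.

```c
#include <limits.h>
#include <stddef.h>

/* Result is implementation-defined: commonly 4, but 2 or 8 are equally
 * conforming. Printing it makes a program's output vary. */
size_t size_of_int(void)
{
    return sizeof(int);
}

/* Reads an implementation-defined value, but the result is 1 on every
 * conforming implementation, so output built from it never varies. */
int int_holds_32767(void)
{
    return INT_MAX >= 32767;
}
```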

ena8...@yahoo.com

May 5, 2006, 7:24:42 AM

Keith Thompson wrote:
> This came up in a discussion on comp.lang.c, subject "Size of Void".
> Credit goes to Peter Nilsson for the analysis that led to this
> question.
>
> Consider this (not very useful) program:
>
> int main(void)
> {
>     int i = 42;
>     void *p;
>     p = i;
>     return 0;
> }
>
> The assignment violates the constraints for a simple assignment,
> defined in C99 6.5.16.1p1, since the types of the operands are not
> among the cases that are allowed.
>
> But suppose the compiler issues the required diagnostic for this
> constraint violation, but then proceeds to generate an executable for
> the program.
>
> C99 6.5.16.1p2, describing the semantics of simple assignment says:
>
> In _simple assignment_ (=), the value of the right operand is
> converted to the type of the assignment expression and replaces
> the value stored in the object designated by the left operand.
>
> Does this apply even in the presence of a constraint violation? If
> the implementation chooses to accept this program after issuing a
> diagnostic, is it then required to treat the assignment
> p = i;
> as if it were
> p = (void*)i;
> ?
>
> I have a vague memory that, once a program violates a constraint, its
> behavior is undefined; I'm now thinking that I was mistaken on this
> point. In other words, I *think* the answer to the question in the
> subject is "no", but I'm still not certain whether this is true or
> whether it's intended.

It's undefined behavior. The definition of constraint basically says
that the text defining how programs behave applies only when all
constraints are met. If constraints aren't met, there is no definition,
and no definition means undefined behavior.

kuy...@wizard.net

May 5, 2006, 10:52:02 AM

ena8...@yahoo.com wrote:
...

> It's undefined behavior. The definition of constraint basically says
> that the text defining how programs behave applies only when all
> constraints are met.

The definition of "constraint" in 3.6p1 says: "restrictions, both
syntactic and semantic, by which the exposition of language elements is
to be interpreted".

As a native speaker of English, and someone very comfortable with this kind
of technical jargon, I have to say that I find the second clause of
that definition very obscure. It might mean what you say, but I don't
think that it's at all clear that it does. A much better way to write
it, if that was the intent, would be as follows:

"restrictions, both syntactic and semantic, which must be satisfied in
order for the exposition of language elements given in this standard to
be meaningful."

The fact that this much simpler, clearer wording was not used leaves
me uncertain as to whether this simpler wording correctly describes the
original intent.

Dave Thompson

May 11, 2006, 12:05:47 AM

On Mon, 01 May 2006 17:57:41 GMT, Keith Thompson <ks...@mib.org>
wrote:
<snip>

> It's tempting to fall back to common sense, but I think C's treatment
> of constraint violations already violates what most programmers would
> expect. In most languages, the equivalent of a constraint violation
> means that the program must be rejected. C's rule that a constraint

Which 'most' languages, other than interactive ones? I've seen
Fortran, PL/I, COBOL and Ada compilers that try to 'fix' errors and
continue compilation, and I think every assembler I've used, and I've
heard of Pascals that do. Some LISPs were (in)famous for 'DWIM'.

In the days of batch, this was a damn good idea. With turnaround
rarely much less than an hour and sometimes more than a day, if you
could find and fix at most one error per compile your development time
for any nontrivial program would be years if not decades.

Nowadays with mostly personal and AFAICT entirely interactive systems,
stopping promptly and letting the user fix the error is a much better
approach. But this isn't a feature of the language, it's QoI of the
implementation.

The only (noninteractive) language _specified_ to stop on error that
comes to my mind is perl in 'strict' mode. And perl isn't actually
specified by a standard but by the single, definitive implementation.

> violation requires a diagnostic, but doesn't require the program to be
> rejected, opens up a huge gray area. As far as I can tell, this
> question is *not* clearly resolved by the wording in the standard.
>

- David.Thompson1 at worldnet.att.net

Keith Thompson

May 11, 2006, 2:59:52 PM

Dave Thompson <david.t...@worldnet.att.net> writes:
> On Mon, 01 May 2006 17:57:41 GMT, Keith Thompson <ks...@mib.org>
> wrote:
> <snip>
>> It's tempting to fall back to common sense, but I think C's treatment
>> of constraint violations already violates what most programmers would
>> expect. In most languages, the equivalent of a constraint violation
>> means that the program must be rejected. C's rule that a constraint
>
> Which 'most' languages, other than interactive ones? I've seen
> Fortran, PL/I, COBOL and Ada compilers that try to 'fix' errors and
> continue compilation, and I think every assembler I've used, and I've
> heard of Pascals that do. Some LISPs were (in)famous for 'DWIM'.

I'm not sure of the rules for Fortran, PL/I, or COBOL, but I do know
Ada. It's common for an Ada compiler to (attempt to) continue
compilation after an error, even a syntax error, but only for the
purpose of diagnosing more errors. The compiler *must* reject any
program that contains a syntax or semantic error (the latter
corresponds to C's constraint violation).

> In the days of batch, this was a damn good idea. With turnaround
> rarely much less than an hour and sometimes more than a day, if you
> could find and fix at most one error per compile your development time
> for any nontrivial program would be years if not decades.

Sure, but that doesn't mean the compiler is going to generate a
working executable after reporting all those errors.