char buff[100];
size_t size = strlen(argv[1]) + strlen(argv[2]) + sizeof(buff);
I've been told that size_t is guaranteed to be large enough to represent
the entire address space. Because the program itself takes up some of
the address space, the combined size of objects in the address space
(for example, environment variables, command line arguments, or
statically or dynamically allocated memory) should never exceed SIZE_MAX.
The standard, however, does not appear to make this guarantee
explicitly. It says size_t must be an unsigned integer type (6.5.3.4)
that is capable of representing values up to 65535 (7.18.3). 7.17
recommends that size_t "not have an integer conversion rank greater
than that of signed long int unless the implementation supports objects
large enough to make this necessary".
Is it possible that there are implementations where size_t can
represent the size of any object that can be created on that system,
but adding two sizes can result in an integer overflow?
thanks,
rCs
>
> can values of size_t overflow
No, because it's an unsigned type, so it is guaranteed to wrap, rather
than overflow. If arithmetic on unsigned types gives a result that is
too large to fit in the range, it is "reduced" (which means that
(1+the_max_value_for_the_type) is added to, or taken away from, the
result as many times as is necessary to bring the result into range).
<snip>
> i've been told that size_t is guaranteed to be large enough to
> represent
> the entire address space.
No, it's guaranteed to be at least large enough to represent the size of
the largest object that the implementation can allocate.
<snip>
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Yes, as far as I can see (_pace_ R.H.'s objection to
the word "overflow").
For example, consider a machine where memory is
organized in "segments" and each object must be entirely
contained in a single "segment." On such a machine, size_t
would only need to accommodate byte counts up to the segment
size -- but if there are several segments available to the
program, the sum of the sizes of objects in those segments
could exceed the size of a single segment. Voilà! size_t
overflow.
For example:
Assume that size_t can hold a maximum of only 65535.
size_t sz = 65537;
In this case wrapping is done to yield the result 1.
But
size_t s1 = 65532;
size_t s2 = 65530;
size_t s3 = s1 + s2;/*Integer overflow producing UB*/
In this case we have integer overflow.
"Integer" encompasses both signed and unsigned integer types, so
arithmetic on integer types might produce UB due to overflow even for
an unsigned integer type.
--
Please do NOT use robinton.demon.co.uk addresses
They cease to be valid on July 14
replace with Francis.Glassborow at btinternet.com
What if size_t is a typedef for unsigned short, and unsigned short promotes
to signed int?
Can you elaborate? I am not getting you.
If size_t is a typedef for unsigned int, there is no need for any
promotion or conversion. What do those terms, conversion/promotion,
have to do here?
My point is that arithmetic on integers might lead to overflow, which
is UB, and unsigned integers are NOT an exception.
Unsigned integer arithmetic (different from conversion) might also
lead to overflow.
If I am wrong, quote the relevant part of the standard that contests
my view.
§6.2.5p9:
..."A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned
integer type is reduced modulo the number that is one greater than the
largest value that can be represented by the resulting type."
Robert Gamble
Thanks for that quote.
I should have consulted the standard before posting.
I apologise for the inconvenience caused by my erroneous post.
Same thing:
size_t a, b, c;
...
a = b + c;
The subexpressions 'b' and 'c' promote to unsigned int, and are added,
yielding a result of type unsigned int. If the result of the addition
cannot be represented as an unsigned int, it wraps around (possible if
unsigned short and unsigned int are the same size -- though in that
case the implementer would probably make size_t a typedef for unsigned
int rather than for unsigned short). The result is assigned to a,
after being converted from unsigned int to unsigned short. If the
result cannot be represented as an unsigned short, it wraps around.
(Conversion and addition happen to have the same wraparound rules for
unsigned types. For signed types, the overflow rules for addition and
conversion differ.)
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
As discussed, there is no such guarantee. But void* must be large
enough to cover the entire address space, at least the space used to
store objects, since each byte of each object must have a distinct
address.
(Most modern implementations have a flat address space; the ultimate
limit on the size of a single object is the same as the limit on the
size of the entire address space, and size_t and void* are the same
size. But the C standard doesn't limit itself to such
implementations.)
Why do 'b' and 'c' not promote to signed int? The integer promotion rules
state that unsigned short promotes to unsigned int only if signed int is
not large enough to hold all valid values of type unsigned short, do they
not?
6.3.1.1p2:
"The following may be used in an expression wherever an int or unsigned int
may be used:
-- An object or expression with an integer type whose integer conversion
rank is less than or equal to the rank of int and unsigned int.
-- A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type, the value is
converted to an int; otherwise, it is converted to an unsigned int. These
are called the integer promotions.48) All other types are unchanged by the
integer promotions."
Did you read Harald's question carefully? He asked
about the case where size_t promotes to *signed* int, that
is, where SIZE_MAX <= INT_MAX. In this case, both 'b' and
'c' promote to plain int, the addition is carried out in
plain int arithmetic, and the result is converted to size_t
for storage in 'a'. If the int addition overflows, the
behavior is undefined.
No, I didn't.
> He asked
> about the case where size_t promotes to *signed* int, that
> is, where SIZE_MAX <= INT_MAX. In this case, both 'b' and
> 'c' promote to plain int, the addition is carried out in
> plain int arithmetic, and the result is converted to size_t
> for storage in 'a'. If the int addition overflows, the
> behavior is undefined.
You're right, I was wrong. (But now I'm right again, and all I had to
do was change my mind!)
In order for that to happen we need a somewhat bizarre implementation
where signed int is exactly one bit wider than size_t, so that it can
represent all the values of size_t but has only the sign bit left over.
Technically possible, but I have never come across a system where it
was so.
The really problematic cases are those where all integer types have
the same number of bits, so that no promotion/conversion can provide
any extra capacity. However, that will not give undefined behaviour,
as the final promotion option for an integer type is always to an
unsigned type.
Where does the standard prohibit an implementation from typedefing
size_t as unsigned char/short?
I agree that such an implementation is rare, but I wouldn't call it a
bizarre one just because it takes unsigned char/short for size_t.
--
Jun, Woong (woong at icu.ac.kr)
Samsung Electronics Co., Ltd.
``All opinions expressed are mine, and do not represent
the official opinions of any organization.''
No, but it would be a bizarre one if size_t is an unsigned char/short
type that is exactly one bit smaller than int. If it's more than one
bit smaller, adding two size_t values can't overflow int.
Many of the previous responses have gone off on a detour,
and there is at least one point that I don't recall seeing made.
size_t has to be able to represent the size of any *single*
object (that the implementation supports), not the aggregate
size of *all* objects.
If you add too many large numbers using type size_t, of course
it will wrap around (not overflow).
There is an excellent chance (not guaranteed by the C standard)
that uintmax_t will be wide enough to accommodate the sum of
all object sizes within a single thread of a program, although
one wonders why you think you need to do this.
Program text (functions) can be in a totally separate address
space, and void* is not guaranteed to be useful for pointing
to functions.
<snip>
> Many of the previous responses have gone off on a detour,
> and there is at least one point that I don't recall seeing made.
Which one was that? I made two of your points in my original reply in
this thread.
> size_t has to be able to represent the size of any *single*
> object (that the implementation supports), not the aggregate
> size of *all* objects.
This one...
> If you add too many large numbers using type size_t of course
> it will wrap around (not overflow).
...and this one. And I can't imagine that it's...
> There is an excellent chance (not guaranteed by the C standard)
> that uintmax_t will be wide enough to accommodate the sum of
> all object sizes within a single thread of a program, although
> one wonders why you think you need to do this.
...this one, since it talks about mere probabilities rather than
guarantees (as you yourself note). So it must be...
> Program text (functions) can be in a totally separate address
> space, and void* is not guaranteed to be useful for pointing
> to functions.
...this one, and I can certainly see that it's relevant. But I must
admit it hadn't occurred to me that anyone would think otherwise.
Okay, so that shows a gap in my powers of imagination. :-)
But nobody said there were only two objects. From the original
question:
> can values of size_t overflow when all the operands being added
> are sizes of objects in the process space? for example:
There is no assumption of only two objects, and indeed, the example
has three:
> char buff[100];
> size_t size = strlen(argv[1]) + strlen(argv[2]) + sizeof(buff);
--
Mark Williams
If you promote two unsigned shorts to ints and then add them, the only
cases where you can get overflow are where int is the same width as
unsigned short or exactly one bit wider. In the former case unsigned
short will promote to unsigned int (and hence no UB). It is the latter
case, where int has only the sign bit to spare, which is theoretically
possible but never done in practice.
In all other cases, after promotion from unsigned short (or char) to
int, there will be a sufficient range of values to deal with all
possible results of adding together two unsigned shorts. Demoting an
int to an unsigned short does not result in undefined behaviour (it is
exactly defined by the Standard).
We were (are) addressing a different point (raised by SR Rajesh):
that adding two or more size_t values might result in UB.
Keith Thompson, message dated Thu, 05 Jul 2007 12:56:31 -0700:
> size_t a, b, c;
> ...
> a = b + c;
I count two objects of type size_t being added. No more. No less.
...
> There is no assumption of only two objects, and indeed, the example
> has three:
>
> > char buff[100];
> > size_t size = strlen(argv[1]) + strlen(argv[2]) + sizeof(buff);
That was an earlier example in the same thread; it was not the one
that was in context in the sub-thread leading up to my response. Eric
Sosman responded to Keith's example of a=b+c by saying that it could
result in an overflow. Francis Glassborow responded by pointing out
that it would take a very bizarre implementation for that to be true.
Jun Woong asked why it would be considered bizarre, and I explained. I
can't speak for the others, but Francis' answer only makes sense if it
is restricted to the addition of two size_t values, so I suspect
that's the context he was making it in.
Yes, I missed that context. Since I jumped into the middle of this
discussion, I missed that the example given above Francis Glassborow's
post has only two operands for addition. Sorry to Francis Glassborow.
The standard says "computation involving unsigned operands", which
means all computations involving (say) unsigned char promoted to int
should not overflow but wrap based on UCHAR_MAX+1.
This is what I take from the current words of the standard. In fact
that was what I thought, before following other discussions here and
before being reminded of computations we perform using unsigned char
as int (as in the example below).
Don't you think the words chosen by the standard here are misleading
and don't take integer promotion into account?
Or, when unsigned char is promoted to int, is it guaranteed
(ridiculous!) NOT to overflow, rather than wrap based on the modulo
UCHAR_MAX+1 rule?
That means:
unsigned char c1 = 251;
unsigned char c2 = 253;
int i = c1 + c2;
Assume UCHAR_MAX==255. Integer promotion occurs, followed by
computation. "Commonly", this won't wrap and therefore yields an int
result of 504. If wrapping is done, the result will be different.
But the present words of the standard seem to say that wrapping is
required in this case too!
I prefer "computation involving operands *whose type is unsigned after
promotion*", which clearly expresses the intent.
Any comments?
No; it means no such thing. After promotion to int,
the operands *are* int and are not unsigned; the arithmetic
is performed according to the int rules, and not some other
rules.
Here's a case to ponder, that may (or may not) clarify
the logic of promotion:
unsigned char uch = 100;
double dbl = 100;
dbl *= uch; /* Wrap or no wrap? Why or why not? */
When you've decided about that one, make a tiny change
to the code and ponder again:
unsigned char uch = 100;
/* double */ int dbl = 100;
dbl *= uch; /* Wrap or no wrap? Why or why not? */
Did you answer differently? Why or why not?
> This is what I take from the current words of the standard. In fact
> that was what I thought, before following other discussions here and
> before being reminded of computations we perform using unsigned char
> as int (as in the example below).
>
> Don't you think the words chosen by the standard here are misleading
> and don't take integer promotion into account?
I don't feel deluded. Not about this, anyhow.
> Or, when unsigned char is promoted to int, is it guaranteed
> (ridiculous!) NOT to overflow, rather than wrap based on the modulo
> UCHAR_MAX+1 rule?
I don't understand the question.
> That means:
>
> unsigned char c1 = 251;
> unsigned char c2 = 253;
> int i = c1 + c2;
>
> Assume UCHAR_MAX==255. Integer promotion occurs, followed by
> computation. "Commonly", this won't wrap and therefore yields an int
> result of 504. If wrapping is done, the result will be different.
> But the present words of the standard seem to say that wrapping is
> required in this case too!
No; quite the contrary. Since UCHAR_MAX < INT_MAX (by
assumption), c1 and c2 both promote to int. The addition
takes place in int arithmetic, and yields an int sum of 504.
This does not overflow, because INT_MAX >= 32767, hence
INT_MAX > 504.
> I prefer "computation involving operands *whose type is unsigned after
> promotion*" which clearly expresses the intent.
<Shrug.>
Was the conclusion from the past discussion on calloc() that
size_t is required by the standard to be able to represent the
size of any single object? I don't think so. If that was the intent,
the committee failed to deliver it with the text of the standard.
Well, then it is all a matter of what you call an operand!
I guess C99 does not give any specific definition for the term
"operand". (If it defines that term, let me know.)
unsigned char a = 10;
unsigned char b = 11;
int c = a+b;
Here I am free to call 'a' and 'b' the operands of the + operator, and
I am free to say that the type of the operands is unsigned char.
>
> Here's a case to ponder, that may (or may not) clarify
> the logic of promotion:
>
> unsigned char uch = 100;
> double dbl = 100;
> dbl *= uch; /* Wrap or no wrap? Why or why not? */
>
> When you've decided about that one, make a tiny change
> to the code and ponder again:
>
> unsigned char uch = 100;
> /* double */ int dbl = 100;
> dbl *= uch; /* Wrap or no wrap? Why or why not? */
>
> Did you answer differently? Why or why not?
I am not able to understand, what these illustrations are meant to
illustrate. How are they relevant to my current question?
>
> > This is what I take from the current words of the standard. In fact
> > that was what I thought, before following other discussions here and
> > before being reminded of computations we perform using unsigned char
> > as int (as in the example below).
>
> > Don't you think the words chosen by the standard here are misleading
> > and don't take integer promotion into account?
>
> I don't feel deluded. Not about this, anyhow.
Because you know that the standard is sensible.
I am talking about the words of the standard, NOT its intent. The
words used here do not describe the intent properly. If C99 had
defined the term "operand" as the value you get after promotion, then
my question would be rendered null and void.
The standard would have done well had it used the term "value"
instead of "operands".
That is:
"computation involving unsigned values"
The unsigned char operands are promoted to int "values" or unsigned
int "values".
> > Or, when unsigned char is promoted to int, is it guaranteed
> > (ridiculous!) NOT to overflow, rather than wrap based on the modulo
> > UCHAR_MAX+1 rule?
>
> I don't understand the question.
I think my present elucidation should have clarified the meaning of
that question.
On a little thought, I find that usage of the term "values" here also
produces ambiguity, as "value" also refers to the unsigned char value
in this case.
So I stand by my earlier recommendation: "computation involving
operands *whose type is unsigned after promotion*".
The Standard does not define "operand." It also
does not define "computation" or "execution" or "number."
(Nor "is," in case you know someone named Bill.)
> unsigned char a = 10;
> unsigned char b = 11;
>
> int c = a+b;
>
> Here I am free to call 'a' and 'b' as operands of + operator and I am
> free to say that, the type of operand is unsigned char.
Well, it's a free country. The C language takes another
course, though, because for any expression it needs to decide
what kinds of operators should apply. You see only `+', but
C sees a whole family of addition operators, with different
properties and requirements:
int + int
unsigned int + unsigned int
long + long
unsigned long + unsigned long
long long + long long
unsigned long long + unsigned long long
float + float
double + double
long double + long double
float _Complex + float _Complex
double _Complex + double _Complex
long double _Complex + long double _Complex
pointer + int
pointer + unsigned int
pointer + long
pointer + unsigned long
pointer + long long
pointer + unsigned long long
... and still more if extended types are supported. That's
(counts) eighteen different kinds of `+' even *after* using
promotion to eliminate a lot of cases. Just imagine trying
to extend this list by adding short and unsigned short and
char and signed char and unsigned char and all the different
sizes of signed and unsigned bit-fields -- That's at least
thirty-seven more kinds of `+' right there, tripling the
number of `+' operators to be described (and comprehended).
So we see that promotion has reduced the number of kinds
of `+' by about two-thirds -- but in fact it's done far more
than that! Imagine how bad it would be if you also had to
define separate `+' operators for float + short, float + char,
float + unsigned long long, ... And add to this the fact
that machines usually don't possess instructions for all
these kinds of mixed-mode arithmetic ... Operand promotion
(or some kind of "operand reconciliation") is a fundamental
simplifier for nearly all programming languages.
... and *that's* why you're free to call your operands
whatever you like, but not to try to get C to use your pet
names for them.
>> Here's a case to ponder, that may (or may not) clarify
>>the logic of promotion:
>>
>> unsigned char uch = 100;
>> double dbl = 100;
>> dbl *= uch; /* Wrap or no wrap? Why or why not? */
>>
>> When you've decided about that one, make a tiny change
>>to the code and ponder again:
>>
>> unsigned char uch = 100;
>> /* double */ int dbl = 100;
>> dbl *= uch; /* Wrap or no wrap? Why or why not? */
>>
>> Did you answer differently? Why or why not?
>
>
> I am not able to understand, what these illustrations are meant to
> illustrate. How are they relevant to my current question?
You seem to believe that a computation in which one
of the (nominal) operands is unsigned char should be
performed according to the rules of unsigned arithmetic
and should wrap at UCHAR_MAX+1. The first example above
contains just such a computation, and asks whether you
think the result should be 10000 or 10000 wrapped (most
commonly, 10000%256, or 16). Speak up: Which do you think
it should be: 10000 or 16 (or whatever)?
The second example tries to bring the same computation
closer to home by eliminating the confusion of floating-
point. Once again, we have a (nominal) unsigned operand;
should the result wrap? What's your vote: 10000, or 16?
If you voted for 10000 both times, you have decided
that the expressions' arithmetic should *not* wrap (and
you were right). If you voted for 16 both times, you have
decided in favor of wrapping (and were wrong). If you
voted for 10000 in one case and 16 in the other, you'll
need to explain why you think the cases should be different.
> The standards would have done well had it used the term "value"
> instead of operands.
> That is:
> "computation involving unsigned values"
I'm not sure there's a difference. You (I presume) would
still argue that a variable of type unsigned char holds a value
of type unsigned char, and you'd be right back where you began.
>>>I prefer "computation involving operands *whose type is unsigned after
>>>promotion*" which clearly expresses the intent.
The Standard has a whole section (6.3) devoted to the
topic of conversions, which says more about them than any
amount of fiddling around with "operand" vs "value" vs
"addend" vs "subtrahend" vs "dividend" ... could ever do.
Furthermore, the definitions of the various operators all
specify what conversions occur, and to which operands (some
operators do not apply the same rules to all their operands).
I'm not sure what more one could desire.
Here's the thing: The Standard is not a tutorial, and
was never intended to be one. It is not organized like a
tutorial, it is not written in the language of tutorials,
and it is no substitute for a tutorial. Paradoxically, you
must already know C before you can understand C's official
description -- but this is a situation that is quite common
among standards as a class. They are written in order to
nail down a precise specification of something that's already
understood, perhaps with some degree of vagueness. They are
not teaching aids, and should not be read as such.
6.3 Conversions
1 Several operators convert operand values from one type to another
automatically.
So I was right in calling 'a' and 'b' operands of + considering this
clause. I guess now you understand why I have been saying that the
standard is ambiguous in this regard.
Since I already know C, I would expect 10000 in both cases. But this
particular clause of the standard seems confusing. The correct answer,
deduced to be 10000, is rationalised by 6.3 Conversions.
The clause under discussion doesn't say anything about the behavior of
this code snippet, because what constitutes "computation involving
unsigned operands" is not properly implied here. A "crude
interpretation" tells me that this term *could* refer to a computation
in which all the "operands" (definition as pointed out earlier) are of
a common unsigned type (like all unsigned int, or all unsigned char,
etc.). Beyond that, this term is unclear!
I get all your points. I do subscribe to your view. I too believe
that you need a degree of "common sense" while reading the standard.
And "common sense" will tell you that "operand" in that context should
be interpreted as the value after promotion, because I have had some
experience with C coding.
But there is some other reference which implies (implicitly) that
operand means the value before promotion, as pointed out earlier.
Don't you think this is an inconsistency worth consideration, given
that the standard of a language is something like a constitution for
language users who want to write portable code?
Standards should be as unambiguous as possible.
I don't claim that this is a major mistake on the part of the
standard.
My point is that the present words lend themselves to some other
interpretation, and I point out such an interpretation.
The last time I was involved in such a discussion, as I recall
the consensus was that calloc is not meant as a way to "go beyond"
the largest size representable in size_t, and if it reports
success for some request then the size of that object (thought
of as aliased to an array of unsigned char, for example) is
obliged to fit within size_t.
The reason we say what these typedefs mean is that we want the
programmer to be able to use them with that meaning. size_t is
specified as appropriate for representing the size of an object.
The implementation is obliged to honor that if it is to conform
to the requirements of the standard.
size_t is specified as the result of the sizeof operator (7.17p2).
There are objects whose size cannot be computed with the sizeof
operator, namely dynamically allocated objects.
I agree that your interpretation is sensible, but not that it's what
the standard actually says.
Your wording is unfortunate because, this being comp.std.c, everyone
goes off in the wrong direction when they read "integer overflow".
Fact is: There is no guarantee whatsoever that the total number of
bytes in _two_ separate objects cannot be too large to be represented
in a value of type size_t. And I have used implementations where this
was the case. In your example, it is absolutely possible that "size"
is much smaller than the length of the first string, plus the length
of the second string, plus the size of the buffer.
Consider:
if (p = calloc(BIG,BIG))
n = sizeof *(char(*)[BIG][BIG])p / BIG;
Is that supposed to work, or what?
It doesn't have to work; see DR #266.
According to DR #266, this code is not strictly conforming because it exceeds an environmental limit; but that's just a trivial consequence of the definition of "strictly conforming". Since it's not strictly conforming, implementations are free to reject it; but is there a general rule that makes its behaviour completely undefined even on implementations that accept it?
Or did you just mean to say that it doesn't have to compile?
I agree that "it's not strictly conforming" is not a very useful answer
by itself, but I'm pretty sure from the explanation that the DR is
saying just that: it doesn't have to compile, and if it does, the
behaviour is outside the scope of the standard. Note that it does not
just say that a /minimal/ environmental limit is exceeded, but that an
environmental limit (of the specific implementation) is exceeded. Note
also that no diagnostic is required, even though it's impossible to
return a value that would match any behaviour defined by the standard.
Only if BIG is a constant expression. If BIG is a variable of type
size_t, gcc accepts it without complaint, but I think it's wrong; C99
6.7.5.2p1 says: "Only an ordinary identifier (as defined in 6.2.3)
with both block scope or function prototype scope and no linkage shall
have a variably modified type.", which seems to preclude treating an
allocated object as a VLA.
An alternative interpretation -- one I hope that's intended: an allocated
object has no associated identifier, so 6.7.5.2p1 is not violated. The
English is ambiguous; it's not clear from the text what "only" applies to.
An implementation isn't allowed to reject a program just because it's
not strictly conforming. See C99 4p3:
A program that is correct in all other aspects, operating on
correct data, containing unspecified behavior shall be a correct
program and act in accordance with 5.1.2.3.
For example:
#include <stdio.h>
#include <limits.h>
int main(void)
{
printf("INT_MAX = %d\n", INT_MAX);
return 0;
}
I think the mention of strict conformance in DR 266 is irrelevant.
A conforming implementation can reject non-strictly-conforming code.
That is what the standard says explicitly.
The only requirement on a conforming implementation is that it must
accept strictly conforming code.
>
> A program that is correct in all other aspects, operating on
> correct data, containing unspecified behavior shall be a correct
> program and act in accordance with 5.1.2.3.
What are you trying to prove by quoting this?
>
> For example:
>
> #include <stdio.h>
> #include <limits.h>
> int main(void)
> {
> printf("INT_MAX = %d\n", INT_MAX);
> return 0;
>
> }
I believe that this code should be categorised as strictly
conforming, because we have:
"A strictly conforming program shall use only those features of the
language and library specified in this International Standard.2) It
shall not produce output dependent on any unspecified, undefined, or
implementation-defined behavior, and shall not exceed any minimum
implementation limit."
I see that there are no words in this text which state that, for
strictly conforming code, there needs to be consistency of output
between different implementations.
One more thing: INT_MAX has an implementation-defined value, but that
does not come under implementation-defined behavior.
I believe that whatever the standard explicitly states as
implementation-defined behavior or unspecified behavior should not be
used by a strictly conforming program to produce "output". And I
remember no place where the standard says that using an
implementation-defined value produces implementation-defined behavior.
>
> I think the mention of strict conformance in DR 266 is irrelevant.
It is relevant because the code exceeds the environmental limits, so
the behavior of the code is beyond the scope of the standard.
Can you elucidate your point more clearly?
I don't believe that's correct. The standard says a conforming
implementation "shall accept any strictly conforming program", but
that's not the only requirement.
>> A program that is correct in all other aspects, operating on
>> correct data, containing unspecified behavior shall be a correct
>> program and act in accordance with 5.1.2.3.
>
> What are you trying to prove by quoting this?
That accepting any strictly conforming program is not the only
requirement on a conforming implementation. A program that contains
unspecified behavior is not strictly conforming, but it must still be
accepted.
>> For example:
>>
>> #include <stdio.h>
>> #include <limits.h>
>> int main(void)
>> {
>> printf("INT_MAX = %d\n", INT_MAX);
>> return 0;
>>
>> }
>
> I believe that this code should be categorised as strictly
> conforming.
[...]
> One more thing, INT_MAX has implementation defined value, but that
> does not come under implementation defined behavior.
C99 3.4 defines "behavior" as "external appearance or action". The
behavior of the above program depends on an implementation-defined
value. I'll grant you that the wording is vague, but in my opinion
that qualifies as "implementation-defined behavior".
Supporting the idea that use of an "implementation-defined value" is
"implementation-defined behavior", here's C99 3.4.1:
implementation-defined behavior
unspecified behavior where each implementation documents how the
choice is made
EXAMPLE An example of implementation-defined behavior is the
propagation of the high-order bit when a signed integer is shifted
right.
And here's 6.5.7p5 (using "**" to denote the superscript):
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1
has an unsigned type or if E1 has a signed type and a nonnegative
value, the value of the result is the integral part of the
quotient of E1 / 2**E2. If E1 has a signed type and a negative
value, the resulting value is implementation-defined.
So the example used in the definition of "implementation-defined
behavior" is an implementation-defined value. J.3,
"Implementation-defined behavior", also discusses a number of things
that are implementation-defined values. (Both J.3 and the note in
3.4.1 are informative, not normative, but they reflect the intent.)
But consider a less ambiguous example:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    void *p = malloc(0);
    if (p) {
        puts("malloc(0) is non-null");
    }
    else {
        puts("malloc(0) is null");
    }
    return 0;
}
The behavior of malloc(0) is explicitly implementation-defined; see
C99 7.20.3. The program's output depends on implementation-defined
behavior, so it's not strictly conforming. But according to C99 4p3,
it "shall be a correct program and act in accordance with 5.1.2.3"; an
implementation may not reject it just because it's not strictly
conforming.
[...]
Hmm, OK. Agreed.
But still I believe that the code you presented ought to be considered
strictly conforming, though the present words and intent of the
standard say it isn't.
So I believe that for the next C standard, the committee needs to take
into account code like the one you presented, and come up with either
a good redefinition of "strictly conforming" or a name for this class
of programs, and document the expected behavior.
Since C99 is the current standard, I subscribe to your view that the
code is not strictly conforming.
Now, this has given rise to an interesting question.
int main(void)
{
    return 0;
}
Is this code strictly conforming?
The reason I ask this question is, from 7.20.4.3p5:

    Finally, control is returned to the host environment. If the value
    of status is zero or EXIT_SUCCESS, an implementation-defined form
    of the status successful termination is returned. If the value of
    status is EXIT_FAILURE, an implementation-defined form of the
    status unsuccessful termination is returned. Otherwise the status
    returned is implementation-defined.
This code will be returning an implementation-defined form of the
status "successful termination".
Doesn't this mean it is not strictly conforming?
If somehow you managed to prove that the above code snippet is
strictly conforming, then can you tell me if this code is strictly
conforming?
#include <stdio.h>
#include <limits.h>
int main(void)
{
    int a = INT_MAX;
    puts("Hello!");
    return 0;
}
>
> But consider a less ambiguous example:
>
> #include <stdio.h>
> #include <stdlib.h>
> int main(void)
> {
>     void *p = malloc(0);
>     if (p) {
>         puts("malloc(0) is non-null");
>     }
>     else {
>         puts("malloc(0) is null");
>     }
>     return 0;
> }
>
> The behavior of malloc(0) is explicitly implementation-defined; see
> C99 7.20.3. The program's output depends on implementation-defined
> behavior, so it's not strictly conforming. But according to C99 4p3,
> it "shall be a correct program and act in accordance with 5.1.2.3"; an
> implementation may not reject it just because it's not strictly
> conforming.
C99 4p3 occurs before the classification of programs into strictly
conforming and conforming, so I am not able to understand the intent
of the clause.
But your interpretation makes sense.
>
> [...]
>
Those of us who have been around C for a couple of decades have
sometimes speculated that there is no actual 'strictly conforming'
program. However the concept of strictly conforming is really about
what the writer can assume that all implementations will do. If I write
strictly conforming code and the implementation generates an executable
that does not provide the specified observable behaviour then the
implementation is at fault.
The concept of conforming code is more useful. IIRC an implementation
is required to accept conforming code and, within the limitations of
available resources, generate suitable object code.
The requirements on implementations actually go further than that: an
implementation cannot simply reject source code because it can
potentially result in undefined behaviour (just as well, else no
useful code could be guaranteed to compile -- though that leads us off
into whether an implementation is required to be useful :-)

int foo(int i, int j) {
    return i + j;
}

Has the potential for undefined behaviour :-)
Since the phrase "strictly conforming" has been defined the way it is
since 1989, I suggest that changing the definition now would only
cause confusion. I agree that introducing a new term for some less
strict kind of conformance would be a good thing.
> Since C99 is the current standard, I subscribe to your view that the
> code is not strictly conforming.
>
> Now, this has given rise to an interesting question.
>
> int main( void )
> {
> return 0;
> }
>
> Is this code strictly conforming?
>
> The reason I ask this question is, from 7.20.4.3p5:
>
>     Finally, control is returned to the host environment. If the
>     value of status is zero or EXIT_SUCCESS, an
>     implementation-defined form of the status successful termination
>     is returned. If the value of status is EXIT_FAILURE, an
>     implementation-defined form of the status unsuccessful
>     termination is returned. Otherwise the status returned is
>     implementation-defined.
[...]
The definition says that a SC program shall not produce *output* that
depends on implementation-defined behavior. It depends, at least in
part, on whether the result returned to the environment is considered
"output".
Just as a matter of common sense, I think it should be considered to
be output, but I'm not clear on whether the standard says it is.
But in my opinion it doesn't make sense to consider a program to be
non-SC just because it executes 'return 0;' in main, but a program
should be considered non-SC if it does something like
'return malloc(0) != NULL;'. I'd like to argue that the returning of
the "implementation-defined form of the status successful termination"
occurs outside the scope of the program, and therefore does not affect
strict conformance, but I'm not sure the standard really supports that.
[...]
> If somehow you managed to prove that the above code snippet is
> strictly conforming, then can you tell me if this code is strictly
> conforming?
>
> #include <stdio.h>
> #include <limits.h>
> int main(void)
> {
> int a = INT_MAX;
> puts("Hello!");
> return 0;
> }
Certainly the 'int a = INT_MAX;' has no effect on the program's strict
conformance, since it doesn't affect the program's output.
There have been arguments about whether any program that produces
output can be strictly conforming. It's unspecified whether the
puts() call will succeed; for example, on a Unix-like system, I could
run
./foo > /dev/null/nosuchfile
and it would produce no output (and puts would return EOF). There's
also the issue that, even if it succeeds, the form of the output can
vary from one implementation to another (ASCII vs. EBCDIC, forms of
line endings, etc.).
This *should* be a simple question.
In my opinion, the definition of "strictly conforming" should be
modified to cover common-sense cases like these, and a new class of
conformance should be added to cover things like
printf("INT_MAX = %d\n", INT_MAX);
The term "correct program" in 4p3 comes close to what we're looking
for, but it's not presented as a definition (the term "correct
program" isn't in italics).
No, a "conforming program" is merely one that happens to be
accepted by *some* conforming implementation. The category
with teeth is "strictly conforming program".
While the implementation is obliged to document its interpretation
of the success/failure status, the semantic behavior insofar as the
C standard is concerned is independent of that interpretation;
thus, as an "output" so long as it is semantically the same on every
conforming system it constitutes "the same output", just as the
external characters in "Hello, world!\n" vary depending on the
platform (EBCDIC vs. ASCII vs. UCS-2 etc., also variations in the
external handling of new-line). What matters is the semantic
interface between the C program and the environment, not any details
of the encodings used at that interface.
The idea of "strictly conforming" is that the program "does the
same thing" under any conforming implementation. The reference to
"output" was just a clearer way of expressing what "doing the same
thing" must consist of. As of C99 we also allowed the output to
depend in unimportant ways on locale specifics, since after all we
wanted to encourage appropriate localization as a matter of
*improving* portability.
No, there is also an entire specification that it must conform to.
That's a good explanation. Is there any way to derive it from the
actual wording of the standard?
Then, is this the intent of 4p3, as Keith pointed out earlier?
"A program that is correct in all other aspects, operating on correct
data, containing unspecified behavior shall be a correct program and
act in accordance with 5.1.2.3."
Or are you referring to something else?
You said my code which used INT_MAX is SC, as the output did not
depend on the implementation-defined INT_MAX.
Along the same lines, I would expect the following code to be strictly
conforming.
#include <stdlib.h>
int main(void)
{
    int *p = malloc(0);
    return 0;
}
The following code is also SC,
#include <stdlib.h>
int main(void)
{
    return malloc(0) != NULL;
}
if returning to the host is NOT considered output.
But it must be accepted that the standard needs to define the term
"output"; right now we can only speculate about it. I think it is
imperative for the standard to define the term, so as to relieve
language users from confusion.
> I'd like to argue that the returning of the "implementation-defined
> form of the status successful termination" occurs outside the scope
> of the program, and therefore does not affect strict conformance,
> but I'm not sure the standard really supports that.
This is the primary reason for my question.
> [...]
>
> > If somehow you managed to prove that the above code snippet is
> > strictly conforming, then can you tell me if this code is strictly
> > conforming?
>
> > #include <stdio.h>
> > #include <limits.h>
> > int main(void)
> > {
> > int a = INT_MAX;
> > puts("Hello!");
> > return 0;
> > }
>
> Certainly the 'int a = INT_MAX;' has no effect on the program's strict
> conformance, since it doesn't affect the program's output.
>
> There have been arguments about whether any program that produces
> output can be strictly conforming. It's unspecified whether the
> puts() call will succeed; for example, on a Unix-like system, I could
> run
>
> ./foo > /dev/null/nosuchfile
>
> and it would produce no output (and puts would return EOF). There's
> also the issue that, even if it succeeds, the form of the output can
> vary from one implementation to another (ASCII vs. EBCDIC, forms of
> line endings, etc.).
Those arguments make sense given the present wording of the standard.
>
> This *should* be a simple question.
>
> In my opinion, the definition of "strictly conforming" should be
> modified to cover common-sense cases like these, and a new class of
> conformance should be added to cover things like
> printf("INT_MAX = %d\n", INT_MAX);
> The term "correct program" in 4p3 comes close to what we're looking
> for, but it's not presented as a definition (the term "correct
> program" isn't in italics).
I hope that the next standard will take care of all these things!
>
> --
Only by applying some common sense. The basic point to appreciate
is that the standard could not have been intended to render every
output-producing program not strictly conforming.
That's certainly consistent with the purpose of the specifications
as applied to the implementation.
Yes. Such an implementation exists.
I have in mind several real-mode x86 16-bit compilers using a memory
model called the "large memory model".
In this segmented memory model, size_t is an unsigned int in the range
[0, 65535], and no object can be larger than 65535 bytes, but the sum
of the sizes of all objects can be much larger. With the usual
allocators, the sum of the sizes of all objects can approximately
equal 640 KiB.
These 16-bit compilers weren't conforming to the ISO standard for many
reasons. But I don't think that this memory model is one of them.
The ISO standard seems to permit segmented memory models in that way.
From Wojtek Lerch:
> According to DR #266, this code is not strictly conforming because it
> exceeds an environmental limit; but that's just a trivial consequence of
> the definition of "strictly conforming". Since it's not strictly
> conforming, implementations are free to reject it; but is there a
> general rule that makes its behaviour completely undefined even on
> implementations that accept it?
Yes, in my opinion, there's one.
n1124 6.5-5 contains:
> If an exceptional condition occurs during the evaluation of an
> expression (that is, if the
> result is not mathematically defined or not in the range of
> representable values for its
> type), the behavior is undefined.
Assuming that sizeof(char[BIG][BIG]) is mathematically defined,
nothing in the standard requires that it be representable in size_t
(and on all compilers I'm aware of, it isn't representable for a big
enough BIG).
Consequently, it's unspecified whether the exact size is yielded or
the behavior is undefined. That amounts to undefined behavior, since
there is absolutely no guarantee about the program's behavior.
Note that the same paragraph (6.5-5) is required to make INT_MAX+INT_MAX
undefined behavior:
n1124 6.5.6-5 contains:
> The result of the binary + operator is the sum of the operands.
As with sizeof, this paragraph by itself doesn't specify the behavior
of the program if the sum of the operands is not representable in its
type. 6.5-5 settles the issue.
Keith Thompson wrote:
> An implementation isn't allowed to reject a program just because it's
> not strictly conforming. See C99 4p3:
Right.
From Rajesh S R:
> C99 4p3 occurs before the classification of programs into strictly
> conforming and conforming, so I am not able to understand the
> intent of the clause.
> But your interpretation makes sense.
This is well known on comp.std.c: the concept of strictly conforming,
as it's currently defined, is useless.
A different definition should be given, in my opinion, probably with a
new term, such as "portable", which I'll use here just for the sake of
using a different term. Something that wouldn't exclude
implementation-defined and unspecified behavior, as long as they're
not relied on in a way that would make the program exhibit undefined
behavior later on.
In other words, the behavior wouldn't have to be identical on all
implementations, as it seems it must be now, but the program would
have to have defined behavior on all implementations.
Even with this new wording, excluding unspecified and
implementation-defined behavior from the list of things that a
"portable" program mustn't do, the concept of a "portable" program
would *still* be bogus because of implementation limits.
The following program:
int main(void) {
    int x[2];
    return 0;
}
May exceed implementation limits if, for example, sizeof(int)==32768, but
the limit on the size of objects is 65535.
I would even go as far as saying that it's undefined behavior by
omission, because the standard doesn't describe at all the behavior
of programs exceeding implementation limits.
The only thing that makes the program above work on all
implementations I'm aware of is that no implementation is stupid
enough to have such big integers with such a small limit on object
sizes.
Consequently, I fear that we couldn't come up with any useful
definition of a "portable" program.
Maybe we could add implementation limits, requiring that arrays of
scalar types can have up to 256 (or whatever number the standard
chooses) elements without exceeding any implementation limit.
I fear that the committee would have to add many other minimal limits
and yet end up with a bogus concept.
Francis Glassborow wrote:
> The requirements on implementations actually go further than that, an
> implementation cannot simply reject source code because it potentially
> can result in undefined behaviour
Yes, undefined behavior is a concept that doesn't apply to programs
themselves, but to programs together with their data (e.g. data input
from stdin).
The behavior can become unpredictable as soon as the behavior is
committed to be undefined.
However, I don't know to what extent "oracle" implementations of C
are possible. What about an implementation that knows the data input
on stdin before it is input?
That's probably not possible in a system where stdin gets feedback
from the user, but stdin may be a "pipe" connected to another program
whose behavior might be predicted.
Rajesh S R wrote:
> if returning to host is NOT considered as output.
On this point, the standard is loose. It doesn't specify what
constitutes a program's output.
However:
1) I see nothing that forbids implementations from specifying that
this value is part of the output. They may even document a behavior
where return values are printed on stdout and argue that they are
part of the program's output.
2) Even if "program's output" had to be taken in a restrictive sense,
such as "output on stdout", I'm pretty sure implementations are
allowed to "produce output" for return values.
> I think it is imperative for the standard to define the term, so as
> to relieve language users from confusion.
Yes. At least to say something as simple as "it's an
implementation-defined concept".
Rajesh S R wrote:
> On Jul 11, 10:17 pm, "Douglas A. Gwyn" <DAG...@null.net> wrote:
> > Rajesh S R wrote:
> > > The only requirement of a conforming implementation is that it
> > > should accept strictly conforming code.
> > No, there is also an entire specification that it must conform to.
> Then, is this the intent of 4p3, as Keith pointed out earlier?
> "A program that is correct in all other aspects, operating on correct
> data, containing unspecified behavior shall be a correct program and
> act in accordance with 5.1.2.3."
This paragraph is only ONE requirement on implementations.
But implementations have many other requirements, such as generating
a diagnostic message for this ill-formed program:
int int int 37 ill-formed;
program() {return this is ill formed;}
To the contrary, it is the key conformance category and is used
in defining implementation conformance.
It is true, however, that "strictly conforming" is a tighter
requirement than "portable", and also that there are useful programs
which are neither s.c. nor portable.
> A different definition should be given, in my opinion, probably with
> a new term, such as "portable" ...
> ...
> Consequently, I fear that we couldn't come up with any useful
> definition of a "portable" program.
It's easy to be critical of others' efforts, isn't it?