if i try this:
printf("%p", f);
i get:
warning: format '%p' expects type 'void *', but argument 2 has type
'void (*)(void)'
if i try this:
printf("%p", (void *)f);
i get:
ISO C forbids conversion of function pointer to object pointer type
if i try this:
printf("%x", ((unsigned int)f));
it compiles cleanly, but this solution bugs me because it assumes that
an unsigned int is the same length as a pointer.
any suggestions as to the "correct" way to print the address of a function?
thanks,
rCs
This cannot be done portably. As your message (including the compiler's
warnings) almost pointed out, there is no *printf format specifier for
function pointers, and there is no integer type guaranteed to be large
enough to hold it. (This includes intptr_t and uintptr_t.) Why are you
trying to do this? Depending on your needs, you may be able to store
the address in a variable, and access that variable as an array of
unsigned char, printing each element.
void (*fp)(void);
/* ... */
for (unsigned char *c = (unsigned char *) &fp;
     c != (unsigned char *) (&fp + 1); c++)
    printf("%x ", (unsigned) *c);
You can't, and still be portable. Nothing even tells you the size
of a function pointer. For example, what if the code for that
function is in a shared library, and the pointer has to specify the
file, offset within the file, and size of the actual code, not to
mention how to link that function to any subsidiary functions. Any
attempt to use an unsigned char pointer to access the function
pointer will not know what bits are germane.
What are you trying to do?
> ISO C forbids conversion of function pointer to object pointer type
This is the first I have heard of this restriction. The smallest addressable
unit of memory in C is the byte (sidestepping the technicalities of bitfields
and so forth...). Four separate built-in pointer types are guaranteed to
store accurately the address of a byte:
char*
char signed*
char unsigned*
void*
(and all their const/volatile/restrict variants.)
On these grounds, I don't understand the logic pertaining to why we can't
simply do:
void *p = (void*)SomeFunc;
Sure, the Standard forbids it... but do the laws of physics not guarantee its
success... ?
--
Frederick Gotham
Your question is about the C language, not about the C standard
document, so it's not really appropriate for comp.std.c. Followups
redirected.
There is no way in standard C to directly print the value of a
function pointer. As gcc correctly told you, there is no conversion
from a function pointer to void*, though some implementations may
provide such a conversion as an extension. (I believe gcc will do so
if you turn off some of those command-line flags.) There are
implementations on which function pointers are bigger than void*, and
no meaningful conversion is possible. Similarly, function pointers
(or even object pointers) could be bigger than any integer type. The
language provides "%p" to print object pointers; it doesn't provide
anything similar for function pointers.
You can either convert to void* and use "%p" (and live with the lack
of portability), or you can treat the function pointer as an array of
unsigned char and convert it to hexadecimal yourself.
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
<snip>
> [...] I don't understand the logic pertaining to why we can't
> simply do:
>
> void *p = (void*)SomeFunc;
>
> Sure, the Standard forbids it...
The Standard acknowledges that it is a common extension.
> but do the laws of physics not guarantee
> its success... ?
No. There is no guarantee that void * is sufficiently large to store a
function pointer without information loss.
--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
A function is not an object, nor is there a requirement (even in the laws of
physics) that a function pointer is a pointer to the first byte of the
executable code for that function, which might lead to the behavior you expect.
It's simply a different abstract type. There have to be real architectures
where the address format(s) for code are not the same size as the address
format(s) for data... and I don't have any examples handy, but I'm counting
on other people to provide them.
S.
Harald van Dijk wrote:
> This cannot be done portably. As your message (including the compiler's
> warnings) almost pointed out, there is no *printf format specifier for
> function pointers, and there is no integer type guaranteed to be large
> enough to hold it. (This includes intptr_t and uintptr_t.) Why are you
> trying to do this? Depending on your needs, you may be able to store
> the address in a variable, and access that variable as an array of
> unsigned char, printing each element.
>
> void (*fp)(void);
> /* ... */
> for (unsigned char *c = (unsigned char *) &fp; c != (unsigned char *)
> (&fp + 1); c++)
> printf("%x ", (unsigned) *c);
Or print the individual bytes comprising the function pointer:
void printfp(void (*fp)())
{
    unsigned char * p;
    int i;

    p = (unsigned char *) &fp;
    for (i = 0; i < sizeof(fp); i++)
        printf("%02X", p[i]);
}
Or use a union:
union FuncPtr
{
    void (*fp)();
    unsigned char c[1];
};

void printfp(void (*fp)())
{
    union FuncPtr u;
    int i;

    u.fp = fp;
    for (i = 0; i < sizeof(u.fp); i++)
        printf("%02X", u.c[i]);
}
Both of these invoke undefined behavior, of course.
-drt
There is no printf format specifier to print the value of a function
pointer but you can portably write a function to print the
representation of such a pointer; whether that is useful will depend on
the system. I believe the following program is conforming:
#include <stdio.h>
#include <math.h>
#include <limits.h>

void print_func_addr (void (*fp)()) {
    unsigned char *cp = (unsigned char *)&fp;
    size_t size = sizeof(fp);
    while (size--)
        printf("%.*x", (CHAR_BIT + 3) / 4, *cp++);
    puts("");
}

int main (void) {
    void (*fp)();
    /* Print address of main function */
    fp = (void (*)())main;
    print_func_addr(fp);
    /* Print address of sin function */
    fp = (void (*)())sin;
    print_func_addr(fp);
    return 0;
}
On my system this prints:
be840408
88830408
Which is the byte-backwards value of the addresses of main() and sin()
in the program on my system.
Robert Gamble
Where is the undefined behavior?
Robert Gamble
The sizeof operator will if you ask it nicely.
Robert Gamble
>There have to be real architectures
>where the address format(s) for code are not the same size as the address
>format(s) for data...
Another possibility is that code addresses have the same values as
data addresses but refer to separate memory, as on PDP-11s with
separate I- and D-spaces. On such a machine, if functions were
objects (and C used native addresses as pointers) then two different
objects could have the same address.
-- Richard
This is what I suggested, except with two added problems.
%X expects an unsigned int, yet you're passing it an unsigned char.
unsigned char will promote to signed int (it can be promoted to
unsigned int, but not if <stdio.h> is implemented). There is a
guarantee that va_arg() will accept this, but there is no guarantee
that printf() is implemented using the <stdarg.h> macros.
Printing each byte with %02X, without a separating non-digit may make
several different representations indistinguishable if CHAR_BIT != 8.
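Both problems can be sidestepped in a byte dump. What follows is only a minimal sketch, assuming a hosted C99 implementation; the names `format_fp_bytes` and `sample_target` are made up for illustration. Each byte is converted to unsigned explicitly before it reaches %x, and bytes are separated by a space so that different representations stay distinguishable whatever CHAR_BIT is.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical function whose pointer is dumped in the example. */
void sample_target(void) { }

/* Hypothetical helper: writes the object representation of fp into buf
   as space-separated hex bytes. Returns the number of characters written
   (excluding the terminator), or -1 if buf is too small. */
int format_fp_bytes(void (*fp)(void), char *buf, size_t buflen)
{
    /* &fp is a pointer to an object (the pointer variable itself),
       so viewing it through unsigned char * is allowed. */
    const unsigned char *p = (const unsigned char *)&fp;
    size_t used = 0;

    if (buflen == 0)
        return -1;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof fp; i++) {
        /* The explicit cast makes the argument match %x exactly. */
        int n = snprintf(buf + used, buflen - used, "%s%x",
                         i ? " " : "", (unsigned)p[i]);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Calling `format_fp_bytes(sample_target, buf, sizeof buf)` fills buf with the bytes in storage order; what those bytes mean is still entirely up to the implementation.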
> Or use a union:
>
> union FuncPtr
> {
> void (*fp)();
> unsigned char c[1];
> };
Are there any possible benefits to using a union here instead of a
direct cast?
Answer withheld pending suitable apologies.
--
Francis Glassborow ACCU
Author of 'You Can Do It!' and "You Can Program in C++"
see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects
David R Tribble wrote:
>> [...] print the individual bytes comprising the function pointer: [...]
>
Robert Gamble wrote:
> Where is the undefined behavior?
As you correctly observe, there is no u.b. in the first example,
but there is (I think) in the second:
>> union FuncPtr
>> {
>> void (*fp)();
>> unsigned char c[1];
>> };
>>
>> void printfp(void (*fp)())
>> {
>> union FuncPtr u;
>> int i;
>>
>> u.fp = fp;
>> for (i = 0; i < sizeof(u.fp); i++)
>> printf("%02X", u.c[i]);
>> }
>
IIRC, the u.b. arises because the contents of one member of
the union is used after the other member has been assigned
a value.
-drt
Harald van Dijk wrote:
>> This cannot be done portably. [...]
Harald van Dijk wrote:
David R Tribble wrote:
>> Or print the individual bytes comprising the function pointer:
>>
>> void printfp(void (*fp)())
>> {
>> unsigned char * p;
>> int i;
>>
>> p = (unsigned char *) &fp;
>> for (i = 0; i < sizeof(fp); i++)
>> printf("%02X", p[i]);
>> }
>
Harald van Dijk wrote:
> This is what I suggested, except with two added problems.
>
> %X expects an unsigned int, yet you're passing it an unsigned char.
> unsigned char will promote to signed int (it can be promoted to
> unsigned int, but not if <stdio.h> is implemented). There is a
> guarantee that va_arg() will accept this, but there is no guarantee
> that printf() is implemented using the <stdarg.h> macros.
Does it make a difference if 'unsigned char' is promoted to 'signed
int' if the value is printed using "%X"?
> Printing each byte with %02X, without a separating non-digit may make
> several different representations indistinguishable if CHAR_BIT != 8.
True. I assumed no more than 8 bits per char.
David R Tribble wrote:
>> Or use a union:
>>
>> union FuncPtr
>> {
>> void (*fp)();
>> unsigned char c[1];
>> };
>
Harald van Dijk wrote:
> Are there any possible benefits to using a union here instead of a
> direct cast?
No, I just offered it as another way to do it.
On the other hand, a union might come in handy if the program
needs to reconstruct a function pointer from characters read from
a value previously written to some file. (Obviously, this could
only work within the same program execution.)
-drt
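The round trip described above can be sketched without a union. This version swaps in memcpy to copy the pointer's bytes out and back in; the names `answer`, `int_fn`, `save_fp` and `load_fp` are hypothetical, and the restored pointer is only meaningful within the same program execution.

```c
#include <string.h>

/* Hypothetical function used as the round-trip target. */
int answer(void) { return 42; }

typedef int (*int_fn)(void);

/* Copy the object representation of a function pointer into a buffer. */
void save_fp(int_fn fp, unsigned char out[sizeof(int_fn)])
{
    memcpy(out, &fp, sizeof fp);
}

/* Rebuild the function pointer from bytes produced by save_fp
   earlier in the same program execution. */
int_fn load_fp(const unsigned char in[sizeof(int_fn)])
{
    int_fn fp;
    memcpy(&fp, in, sizeof fp);
    return fp;
}
```

Since the bytes are copied back unchanged into an object of the same type, the restored pointer holds the original value and can be called as usual.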
In practice, probably not. As far as standard C is concerned, yes.
Sorry if my message was confusing, I'll try to be as direct as possible
here. %X is a conversion specifier for unsigned int. The behaviour is
undefined if you pass a signed int.
> void print_func_addr (void (*fp)()) {
> unsigned char *cp = (unsigned char *)&fp;
According to C89 (3.2.2.3 Pointers)
A pointer to void may be converted to or from a pointer to any
incomplete or object type. A pointer to any incomplete or object
type may be converted to a pointer to void and back again;
the result shall compare equal to the original pointer.
In another location, void is defined as an incomplete type that
cannot be completed.
In the section on cast operators (C89 3.3.4):
A pointer to an object or incomplete type may be converted to a
pointer to a different object type or a different incomplete type.
The resulting pointer might not be valid if it is improperly aligned
for the type pointed to. [...]
A pointer to a function of one type may be converted to a pointer
to a function of another type and back again; the result shall compare
equal to the original pointer. [...]
If you examine the code snippet above, you will note that you
are converting a pointer to a function into a pointer to an object.
Although the standard guarantees that pointers to void will have
the same representation and alignment restrictions as pointers to
char, you aren't using pointers to void -- and the standard
does not guarantee that a pointer to a function may be
meaningfully cast to a pointer to anything else. Even converting
a pointer to a function into a pointer to void is not amongst the
defined operations.
--
"It is important to remember that when it comes to law, computers
never make copies, only human beings make copies. Computers are given
commands, not permission. Only people can be given permission."
-- Brad Templeton
This was explicitly undefined in C89 (actually it was unspecified but
the intent was that it be undefined), this verbiage does not exist in
C99 (actually it is erroneously included in non-normative appendix J
but there is no supporting wording in the Standard proper). In C99 the
aliasing rules govern and in this case the behavior is well-defined
(unless sizeof u.fp > 1 in which case you are accessing an invalid
array element).
Robert Gamble
No, a pointer to a pointer to a function is not a pointer to a
function. (Note the & before fp.)
No, I am converting a "pointer *to a pointer* to a function" to a
"pointer to unsigned char".
> Although the standard guarantees that pointers to void will have
> the same representation and alignment restrictions as pointers to
> char, you aren't using pointers to void -- and the standard
> does not guarantee that a pointer to a function may be
> meaningfully cast to a pointer to anything else.
But it does guarantee that a pointer to a function type can be
converted to a pointer to any other function type and back again, which
implies that all function pointers have the same size. This
doesn't necessarily mean that all pointers to functions have the same
representation, and if they don't, then it might be necessary to print the
representation of the function pointer without converting it to a
different function type to achieve the desired result. YMMV.
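A small sketch of that round-trip guarantee, with made-up names (`generic_fn`, `binary_fn`, `add`, `to_generic`, `from_generic`). The generic type is used only for storage; the pointer is converted back to its real type before it is called.

```c
/* A function pointer type used only to store other function pointers. */
typedef void (*generic_fn)(void);

/* The real type of the hypothetical function below. */
typedef int (*binary_fn)(int, int);

int add(int a, int b) { return a + b; }

/* Convert to the storage type ... */
generic_fn to_generic(binary_fn fp)
{
    return (generic_fn)fp;
}

/* ... and back. The result compares equal to the original pointer. */
binary_fn from_generic(generic_fn g)
{
    return (binary_fn)g;
}
```

Calling through the `generic_fn` value directly would be undefined, since the function's actual type does not match; only the converted-back pointer is safe to call.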
> Even converting a pointer to a function into a pointer to void is not amongst the
> defined operations.
Correct.
Robert Gamble
You can avoid the space(s) in the output by rounding CHAR_BIT to a
number of nibbles and printing a fixed number of hex digits for each
byte.
#define CHAR_NIBBLES ((CHAR_BIT + 3)/4)
printf("%0*x", CHAR_NIBBLES, 0u + *c);
--
Peter
> Robert Seacord posted:
>
> > ISO C forbids conversion of function pointer to object pointer type
Actually, the warning is incorrect. The C standard does not forbid
this. It produces undefined behavior by the lack of a definition. The
C standard specifically defines converting a pointer to any object
type to and from pointer to void, and certain other conversions
between pointers to object types with an appropriate cast.
It just plain does not define any conversions for pointers to
functions, hence any attempt to do so is undefined.
> This is the first I have heard of this restriction. The smallest addressable
> unit of memory in C is the byte (sidestepping the technicalities of bitfields
> and so forth...). Four separate built-in pointer types are guaranteed to
> store accurately the address of a byte:
>
> char*
> char signed*
> char unsigned*
> void*
>
> (and all their const/volatile/restrict variants.)
Nowhere in the standard does it say that they can address every byte
that exists in a machine, merely every byte that an executable can
access.
> On these grounds, I don't understand the logic pertaining to why we can't
> simply do:
>
> void *p = (void*)SomeFunc;
You are displaying your ignorance of the depth and breadth of systems
on which C is implemented. You are assuming many things:
1. A pointer to function is an address in the same way that a pointer
to an object type is. There are some systems where this is indeed not
true.
2. That if a pointer to function is indeed a memory address, that it
is in the same memory space as data occupied by objects. Again, there
are platforms where this is not true.
> Sure, the Standard forbids it... but do the laws of physics not guarantee its
> success... ?
No, the standard does not forbid it. And the laws of physics have
nothing at all to do with it. But on some platforms either the OS or the
hardware architecture prevents it from succeeding in any meaningful
way.
--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
Thanks, that's something I'll need to remember myself; it'll surely
make itself useful sooner or later. I doubt I'll use your way of
avoiding a cast, though. :)
Not that it helps here but I seem to remember that it does define
conversions between different function pointer types.
You may think you are but that is not the way that pointers to functions
work. Given:
void fn(void);
fn and &fn are synonymous (how could they be anything else unless the
second expression was meaningless and Ritchie decided otherwise)
Given:
void (*fp)(void);
fp and &fp are not synonymous, and &fp is a pointer to a pointer to a
function.
You are incorrect.
> Given:
>
> void fn(void);
>
> fn and &fn are synonymous (how could they be anything else unless the
> second expression was meaningless and Ritchie decided otherwise)
In this case the function type is converted to a pointer to function
type; my example was taking the address of a pointer to function
variable, not a function.
Robert Gamble
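The distinction can be checked directly. A minimal sketch with hypothetical names (`fn`, `name_and_address_agree`, `pointer_object_round_trip`):

```c
void fn(void) { }

/* fn and &fn yield the same pointer-to-function value. */
int name_and_address_agree(void)
{
    void (*a)(void) = fn;   /* function designator converts to a pointer */
    void (*b)(void) = &fn;  /* explicit address-of gives the same pointer */
    return a == b;
}

/* &fp is something else: a pointer to the pointer variable, which is an
   ordinary object. Its type is void (**)(void), not void (*)(void). */
int pointer_object_round_trip(void)
{
    void (*fp)(void) = fn;
    void (**pp)(void) = &fp;
    return *pp == fn;
}
```

Because `&fp` points to an object, casting it to `unsigned char *` is an object-pointer conversion, not a function-pointer conversion.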
jacob
I know quite well a few memory models used on x86 CPUs in real mode
(and the computer on which I'm typing this message is an x86).
In this mode, many compilers offered six memory models:
"Tiny" memory model: 16-bit object pointers and 16-bit function
pointers.
These pointers are addresses of bytes in the same 64KB
segment.
"Small" memory model: 16-bit object pointers and 16-bit function
pointers.
But these pointers are offsets into different segments (CS != DS), so
even if it is possible to convert a function pointer to an object
pointer, reading through such an object pointer doesn't give you the
bytes of the function's code, only meaningless bytes!
"Medium" memory model: 16-bit object pointers and 32-bit function
pointers (the low-order 16-bit word being the offset and the high-order
16-bit word being the segment). Conversions from function pointers to
object pointers are typically forbidden at compile time.
"Compact" memory model: 32-bit object pointers and 16-bit function
pointers (roughly the opposite of the medium memory model).
"Large" memory model: 32-bit object pointers and 32-bit function
pointers (here they're quite compatible: casting from function
pointers to object pointers makes sense).
"Huge" memory model: 32-bit object pointers and 32-bit function
pointers, but with special pointer arithmetic: the C implementation
turns a segmented memory model (where an array can't be larger than
65535 bytes) into a flat memory model, at the cost of slower pointer
arithmetic.
Of course, those are only a few examples (four examples) of real-world
memory models where function pointers can't be cast to object
pointers.
I'm aware that there are many other platforms with many other different
memory models where function pointers can't be cast to object pointers
(e.g. embedded CPUs where the code memory is a special ROM that is not
accessible via object pointers, and where function pointers and object
pointers have different sizes).
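On any given implementation the two sizes can at least be compared. A minimal sketch; the function names are made up. Note that a large enough void * is necessary but not sufficient: in the small model both pointers are 16 bits wide, yet they are offsets into different segments.

```c
#include <stddef.h>

/* Size in bytes of an object pointer on this implementation. */
size_t object_pointer_size(void)
{
    return sizeof(void *);
}

/* Size in bytes of a function pointer on this implementation. */
size_t function_pointer_size(void)
{
    return sizeof(void (*)(void));
}

/* 1 if a void * has at least as many bytes as a function pointer.
   This says nothing about whether a conversion is meaningful. */
int void_ptr_is_wide_enough(void)
{
    return sizeof(void *) >= sizeof(void (*)(void));
}
```

In the medium model described above, `void_ptr_is_wide_enough()` would presumably return 0, which is one way a cast to void * can lose information.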
> Sure, the Standard forbids it... but do the laws of physics not guarantee its
> success... ?
>
Because you probably don't know what the "physics" are in the strange
world of memory models.
I guess that, because a few popular modern desktop OSes use an
ILP32 *flat* memory model, it's easy to harbor illusions about what
memory models can exist.
But yes, DOS, or rather 16-bit real- and protected- mode environments
typically provided memory models where the (near) pointer to a function
was completely different from a (near) pointer to a data item. They were
offsets into different segments, and while data and code memory was not
physically distinct and could be aliased, the pointers were not
interchangeable.
That said, I don't recall most of these environments being quite ISO C
compliant with regards to pointer handling.
Michal
The code in question:
void print_func_addr (void (*fp)()) {
unsigned char *cp = (unsigned char *)&fp;
applies the unary "&" to an object of a pointer-to-function type, not
to the name of the function itself.
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
There is no standard way to print the value of a pointer to function.
Instead of using unsigned int and %x, you could use uintmax_t and
PRIxMAX. Although it's not guaranteed to work, it's probably your
best shot.
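A sketch of that approach, assuming C99 <inttypes.h>; the names `sample_fn` and `format_fp_as_integer` are hypothetical. The cast from a function pointer to an integer type yields an implementation-defined result, and the behavior is undefined if the value does not fit, so this is a best effort rather than a portable solution.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical function whose pointer is formatted in the example. */
void sample_fn(void) { }

/* Formats fp as a hexadecimal integer into buf. Returns what snprintf
   returns: the number of characters the full result needs. */
int format_fp_as_integer(void (*fp)(void), char *buf, size_t buflen)
{
    /* Implementation-defined conversion; undefined if the value cannot
       be represented in uintmax_t. */
    uintmax_t value = (uintmax_t)fp;
    return snprintf(buf, buflen, "%" PRIxMAX, value);
}
```

Unlike a byte dump, this prints a single number, which on common flat-memory platforms tends to match what a debugger shows for the function's address.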
> Answer withheld pending suitable apologies.
Thanks for your input, CBFalconer. Please note that I have taken note of your
position, and shall require no further reminders.
--
Frederick Gotham
>> Four separate built-in pointer types are guaranteed to store accurately
>> the address of a byte:
>>
>> char* char signed* char unsigned* void*
>>
>> (and all their const/volatile/restrict variants.)
>
> Nowhere in the standard does it say that they can address every byte
> that exists in a machine, merely every byte that an executable can
> access.
I would have thought that that was a given? Maybe I should just be hyper-
specific and start writing ENG at the beginning of every sentence I write
which is in English? ENG What do you think?
>> On these grounds, I don't understand the logic pertaining to why we
>> can't simply do:
>>
>> void *p = (void*)SomeFunc;
>
> You are displaying your ignorance of the depth and breadth of systems
> on which C is implemented.
Yes, which is why I appended "... ?" to the statement, demonstrating my
ignorance and curiosity.
> No, the standard does not forbid it. And the laws of physics have
> nothing at all to do with it.
Sometimes the laws of physics override what a standard says or doesn't say.
Example: Let's say we have a standard for a particular programming language
which doesn't explicitly say whether an unsigned integer type can trap.
However, in a particular part of the standard, it states that an unsigned
integer type shall contain no padding bits. In yet another part of the
standard, it says that the unsigned integer type shall obey modulo
arithmetic of 2^n, where n is the number of bits in the type.
If we have several such statements, we can put them all together and
realise that, by the laws of physics, an unsigned integer type can't
possibly trap because each and every one of its bit-patterns represents a
valid value.
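In C itself, the modulo rule for unsigned types can be observed directly. A tiny sketch; the function names are made up:

```c
#include <limits.h>

/* Unsigned arithmetic is reduced modulo UINT_MAX + 1, so these
   operations wrap around instead of overflowing. */
unsigned wrap_increment(unsigned x)
{
    return x + 1u;
}

unsigned wrap_decrement(unsigned x)
{
    return x - 1u;
}
```

So `wrap_increment(UINT_MAX)` gives 0 and `wrap_decrement(0u)` gives UINT_MAX, with no trap involved.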
Granted though, I realise that the laws of physics don't play a part in the
function pointer discussion.
--
Frederick Gotham
> In article <h45hh2l85bkgfr5j3...@4ax.com>, Jack Klein
> <jack...@spamcop.net> writes
> >It just plain does not define any conversions for pointers to
> >functions, hence any attempt to do so is undefined.
>
> Not that it helps here but I seem to remember that it does define
> conversions between different function pointer types.
Agreed, my sentence could have been more detailed.
Firstly, you need one additional thing to justify that conclusion, which is
indeed present: a statement of the required range of the unsigned integer
type.
Secondly, what you're describing is mathematics, not physics. The laws
of physics are relevant only by making assumptions about the physical
details about how the platform operates. As a test case, consider an
implementation which operates within a human brain. If your "law of
physics" is just as applicable to such an implementation as it is to a
more conventional one, then it probably isn't really a law of physics.
Also, note that in general, the laws allow arbitrarily bad behavior,
for strictly conforming programs, if the platform is damaged in a
suitable fashion while running the program. That's just one of many
different reasons why it's a bad idea to invoke the laws of physics to
resolve questions about the C standard.
> Jack Klein posted:
>
> >> Four separate built-in pointer types are guaranteed to store accurately
> >> the address of a byte:
> >>
> >> char* char signed* char unsigned* void*
> >>
> >> (and all their const/volatile/restrict variants.)
> >
> > Nowhere in the standard does it say that they can address every byte
> > that exists in a machine, merely every byte that an executable can
> > access.
>
>
> I would have thought that that was a given? Maybe I should just be hyper-
> specific and start writing ENG at the beginning of every sentence I write
> which is in English? ENG What do you think?
I don't understand your reference to "ENG". I did not assume that
your post was in anything other than English, if that is what you are
referring to.
In fact, that statement of mine is actually not quite accurate, due to
not being complete. Better wording for the final clause would be,
"merely every byte that an executable can access as an object".
In any case, you may have thought that this was a given, but taken
together with the rest of your post, I thought you were building a
chain of reasoning that led to an incorrect conclusion.
> >> On these grounds, I don't understand the logic pertaining to why we
> >> can't simply do:
> >>
> >> void *p = (void*)SomeFunc;
A momentary digression, I am astonished that you did not define the
pointer as "void const *p", with the cast modified accordingly. Back
to the main topic...
Here is where I think (thought) your chain jumps the track. If you
re-read the quotes from your post to here, it implies (to me) that
your concept is:
Given(1): A function SomeFunc() which my program may call (argument
and return type details irrelevant at the moment) and...
Given(2): A pointer to char can point to the address of a byte.
Then(1): The function SomeFunc() must begin at some byte address...
Then(2): This byte address may be stored in a pointer to char (and/or
pointer to void)...
Then(3): Therefore the expression above (void *p = (void*)SomeFunc;)
must result in something meaningful in relation to the pointer to
function.
Now you happened to state what I call "Given(2)" first, so I started out by
commenting on it.
I inferred you to be taking the statement I name "Then(1)" as the
equivalent of an absolute assumption, a third "Given" as it were. This
is true in many cases on many platforms, in fact probably the
overwhelming majority. So for the moment, let's limit the discussion
to those for which this assumption is true.
Somebody else introduced the term "Harvard architecture" quite
correctly, which I neglected to do. There are processors and DSPs,
some of them used in much higher volume than familiar microprocessors
in servers, workstations, and laptops, that actually work this way.
A valid object, say an int named SomeInt, might happen to reside at
address 0x4000. And it could well be that SomeFunc happens to start
at address 0x4000. But this coincidence is meaningless, because they
are in completely separate memory spaces. If you could assign the
address of SomeFunc to a pointer to unsigned char, and examine the
sequence of bytes via that pointer, you would not be examining the
object code of the function, but instead the object representation of
the int SomeInt.
So the point I was trying to make was that even though a pointer to
char (or void) can address a byte, and even if a pointer to function
happens to contain the address of the first opcode of the function, it
may be a byte that the pointer to void cannot point to, because it is
in a completely distinct memory space.
> > No, the standard does not forbid it. And the laws of physics have
> > nothing at all to do with it.
>
> Sometimes the laws of physics override what a standard says or doesn't say.
> Example: Let's say we have a standard for a particular programming language
> which doesn't explicitly say whether an unsigned integer type can trap.
> However, in a particular part of the standard, it states that an unsigned
> integer type shall contain no padding bits. In yet another part of the
> standard, it says that the unsigned integer type shall obey modulo
> arithmetic of 2^n, where n is the number of bits in the type.
>
> If we have several such statements, we can put them all together and
> realise that, by the laws of physics, an unsigned integer type can't
> possibly trap because each and every one of its bit-patterns represents a
> valid value.
Now here is where we are in complete agreement. There are at least
several cases of behavior that the standard does not explicitly state,
but that can be deduced to be well defined from a combination of other
explicit statements that are made.
> Granted though, I realise that the laws of physics don't play a part in the
> function pointer discussion.
--
> > Robert Seacord posted:
> >
> > > ISO C forbids conversion of function pointer to object pointer type
>
> Actually, the warning is incorrect. The C standard does not forbid
> this. It produces undefined behavior by the lack of a definition.
I think that if ____ causes undefined behaviour according to
the ISO C standard, then it is true to say that ISO C forbids ____ .
That is my understanding of the English word "forbid".
It does not matter whether it is explicitly undefined, or
undefined by omission.
That doesn't match my understanding.
I'd say that the C standard "forbids" only things for which it
requires a diagnostic. An implementation may forbid something that
invokes undefined behavior, but it can also let you get away with it;
it can even define the semantics if it chooses.
If the C standard "forbids" something, it prevents you from doing it.
An implementation is specifically not required to prevent you from
invoking undefined behavior.
Actually, even things that require diagnostics aren't quite what I'd
call "forbidden", since the compiler isn't required to reject anything
other than a #error directive.
(The meaning of the standard in this area is clear enough; what we're
discussing is the meaning of "forbid".)
It depends on what program you are referring to. Behavior that causes
UB is forbidden when a strictly conforming (s.c.) program is being
referred to.
Anyway, since the standard associates a program that causes UB
with *erroneous* constructs, it does give the impression that
causing UB is forbidden, even if that's not always true in practice.
--
Jun, Woong (woong at icu.ac.kr)
Samsung Electronics Co., Ltd.
``All opinions expressed are mine, and do not represent
the official opinions of any organization.''
To forbid something is to say that it should not be done. Except on
those occasions when the standard uses the word "shall", it generally
doesn't say things like that about programs. In general, it doesn't say
that a program shouldn't do something, it merely tells you what the
consequences of doing something are. The consequences of breaking a
syntax rule or violating a constraint are that a diagnostic is
required. The consequences of writing a program that isn't strictly
conforming are that a conforming implementation is not required to
translate and correctly execute the program. The consequences of
writing code with undefined behavior are that even if it does execute
the program, you have no guarantees, of any kind, about what the
behavior of that program will be. However, the standard doesn't say
that you shouldn't do this - that is left as a conclusion for the
reader to derive.
It's also a conclusion that's not necessarily true: if a piece of code
whose behavior is undefined according to the C standard has behavior
that is defined by one or more particular implementations, it may be
quite reasonable to write such code when putting together a program
that is intended to only be used with those implementations. It's often
the case that such code is the only way to achieve the desired effect.
Not every program needs to be portable, or even can be portable.
The standardese is that "____ is not allowed in a strictly
conforming program." Implementations may define a meaning
for any instance of "undefined behavior", and *maximally
portable* programs cannot use such an extension, but less
portable programs may very well want to take advantage of
implementation-specific features.
You can easily introduce UB into a strictly conforming program. It's
not forbidden (nothing is going to stop you from doing it); it merely
renders the program no longer strictly conforming.
UB in a strictly conforming program isn't forbidden, it's logically
impossible.
Actually, a program can be strictly conforming while containing
undefined behavior as long as any output does not depend on the result
of the undefined behavior. The following program is strictly
conforming:
int main (void) {
    int i = 1;
    i = i++ + i++ + i++;
    return 0;
}
Of course, an argument could probably be made that since anything can
occur as the result of undefined behavior, including the emittance of
unintended output, that such a program isn't strictly conforming.
Robert Gamble
And a decent debugging implementation (not that I know of any) actually
would emit such a message, just before it halts the program. That's
assuming it doesn't outright refuse to even compile it.
Your argument only works if you replace "undefined" with
"unspecified". Any program which has undefined behavior has output
which depends upon the undefined behavior.
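To make the undefined/unspecified distinction concrete, here is a minimal sketch of code whose behavior is unspecified but whose result does not depend on it. The names first, second, calls and order_independent_sum are hypothetical, invented for illustration and not taken from anyone's post. The order in which the two operands of + are evaluated is unspecified, yet both possible orders give the same sum and the same call count, so a program built on this could still be strictly conforming.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration; not from the thread. */
static int calls = 0;

static int first(void)  { calls++; return 1; }
static int second(void) { calls++; return 2; }

/* The order in which first() and second() are called is unspecified,
   but the sum and the final value of calls are the same either way,
   so nothing observable depends on the unspecified order. */
int order_independent_sum(void)
{
    return first() + second();
}
```

Printing the result of order_independent_sum() with printf("%d\n", ...) gives the same output on every conforming implementation, which is not something one can say about the i++ example above.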
...
> Of course, an argument could probably be made that since anything can
> occur as the result of undefined behavior, including the emittance of
> unintended output, that such a program isn't strictly conforming.
Yes, such an argument could be made, and would be correct.
If a program contains UB to be triggered, then its output, or even
whether you can get the intended output at all, always depends on that
UB, which makes the program not s.c.
That means that UB and a strictly conforming program cannot
coexist, which was my point, and I think it is not much different from
what I said in my previous post.
If you're saying what I think you are saying, it's not correct.
The following function could be used in a strictly conforming
program:
void foo( void ) {
    if ( 0 )
        1 / 0;
}
Division by zero would result in undefined behavior, but we
can easily see that that operation could never be executed
in this example. (There are more practical examples of this,
such as where something is done one way if a pointer is null,
but if the pointer is non-null, it is used to access the
pointed-to object.)
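A minimal sketch of the more practical null-pointer case mentioned in the parentheses, assuming a hypothetical helper named value_or (not from the thread). Dereferencing a null pointer would be undefined, but the check guarantees that the dereference is never evaluated in that case.

```c
#include <stddef.h>

/* Hypothetical helper: returns the pointed-to value, or a fallback
   when the pointer is null. The dereference would be undefined for a
   null pointer, but the check ensures it is never reached then. */
int value_or(const int *p, int fallback)
{
    if (p == NULL)
        return fallback;   /* null: take the other path */
    return *p;             /* non-null: safe to access the object */
}
```

With int x = 7, the call value_or(&x, -1) returns 7 and value_or(NULL, -1) returns -1; neither call executes an operation with undefined behavior.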
> > If a program contains UB to be triggered, then its output, or even
> > whether you can get the intended output at all, always depends on that
> > UB, which makes the program not s.c.
>
> If you're saying what I think you are saying,
I think that "UB to be triggered"
means the opposite of "that operation could never be executed"
> it's not correct.
> The following function could be used in a strictly conforming
> program:
> void foo( void ) {
> if ( 0 )
> 1 / 0;
> }
> Division by zero would result in undefined behavior, but we
> can easily see that that operation could never be executed
> in this example.
--
pete
In the formal abstract machine model of the C standard, code is not
addressed with bytes... Code is something that's executed by the
environment in an unspecified (by omission) way.
So, many existing implementations don't use the address space of
objects (bytes are something related to objects such as char, int,
etc.) for code. There are implementations where the code is in ROM,
and there are also interpreted implementations (where the source text
or intermediate code is not held entirely in physical memory,
but partly remains on disk).
Or, if you prefer, the problem is not that void* can't address all the
bytes, it's that a function pointer doesn't identify any particular
byte! It identifies a *function* (or some code, if you prefer).
In particular, an implementation is allowed to represent a function
pointer by an ordinal number, used to import the function from a shared
library (thus, allowing lazy loading of the module containing the
function), or it might use any representation it likes.
But, even with naive implementations, there are many platforms where
function pointers are totally incompatible with object pointers (and
their sizes are even different).
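One way to look at a function pointer without converting it to an object pointer or an integer is the approach suggested earlier in the thread: copy its object representation and inspect the bytes. The following is only a sketch; the names sample and dump_function_pointer are hypothetical, and the code assumes 8-bit bytes so that each byte fits in two hex digits. The bytes it produces are implementation-specific and need not resemble an address at all.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical function whose pointer we inspect. */
void sample(void) { }

/* Writes the object representation of a function pointer into buf as
   hex digits and returns the number of bytes in the representation,
   or 0 if buf is too small. Assumes 8-bit bytes (two hex digits per
   byte). The function pointer itself is never converted to an object
   pointer or to an integer; only its stored bytes are copied. */
size_t dump_function_pointer(void (*fp)(void), char *buf, size_t buflen)
{
    unsigned char bytes[sizeof fp];
    size_t i;

    if (buflen < 2 * sizeof fp + 1)
        return 0;
    memcpy(bytes, &fp, sizeof fp);   /* copy the representation */
    for (i = 0; i < sizeof fp; i++)
        sprintf(buf + 2 * i, "%02x", (unsigned) bytes[i]);
    return sizeof fp;
}
```

A caller would declare char buf[2 * sizeof(void (*)(void)) + 1] and call dump_function_pointer(sample, buf, sizeof buf). Comparing the return value with sizeof(void *) shows whether the two kinds of pointer even have the same size on a given platform.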
"
In this International Standard, ''shall'' is to be interpreted as a
requirement on an implementation or on a program; conversely,
''shall not'' is to be interpreted as a prohibition.
"
"
A source file shall not end in a partial preprocessing token or in a
partial comment.
"
That's just an example; there are many other occurrences of "shall" and
"shall not" that apply to programs, and not implementations.
OK, that's not "forbidden", that's only "prohibited", in the words of
the standard.
However, I do think that the wording of the standard is unsatisfactory
here, since programs having UB or constraint violations can clearly be
legal C programs for some specific implementations that document &
allow those UB & constraint violations.
Worse, the definition of a strictly conforming program seems quite
stupid:
"
A strictly conforming program shall use only those features of the
language and library specified in this International Standard. It shall
not produce output dependent on any unspecified, undefined, or
implementation-defined behavior, and shall not exceed any minimum
implementation limit.
"
It's a bit like saying:
You shall not do that.
And if you're strictly conforming, you shall not violate the orders that
I give with the word "shall".
> In general, it doesn't say
> that a program shouldn't do something, it merely tells you what the
> consequences of doing something are.
Not always.
> It's also a conclusion that's not necessarily true: if a piece of code
> whose behavior is undefined according to the C standard, has behavior
> that is defined by one or more particular implementations, it may be
> quite reasonable to write such code when putting together a program
> that is intended to only be used with those implementations.
Yes, I agree. And I think that the C rationale included something like
"allowing non-portable programs".
It also applies to ill-formed programs, since an implementation is
allowed to compile them & give them a documented behavior, and many
implementations do!
> It's often
> the case that such code is the only way to achieve the desired effect.
> Not every program needs to be portable, or even can be portable.
Yes.
Yes, I explicitly acknowledged that fact when I said 'Except on those
occasions when the standard uses the word "shall"...'.
> That's just an example; there are many other occurrences of "shall" and
> "shall not" that apply to programs, and not implementations.
Yes, but the overwhelming majority of the standard's statements
governing programs are not introduced with the word "shall". Of
course, even when it does, as far as the standard is concerned it is
simply a way of describing a constraint (if it occurs in a constraint),
or means exactly the same thing as saying that the behavior is
undefined (when it occurs outside a constraint). However, interpreted
as ordinary English, statements saying that something "shall" or "shall
not" be done are commands, whereas saying that the behaviour is
undefined is simply giving the reader information.
> However, I do think that the wording of the standard is unsatisfactory
> here, since programs having UB or constraint violations can clearly be
> legal C programs for some specific implementations that document &
> allow those UB & constraint violations.
That's precisely what the authors wanted. Most UB is there precisely in
order to allow some implementations to define the behavior. That's less
true of constraints, which is precisely why constraint violations
require a diagnostic.
> > In general, it doesn't say
> > that a program shouldn't do something, it merely tells you what the
> > consequences of doing something are.
> Not always.
"In general" means essentially the same thing as "not always"; it's
just a difference in emphasis, like the difference between "half full"
and "half empty". In any event, if you felt there was a contradiction,
following that sentence up with a counter-example would have been
helpful.
Francis Glassborow wrote:
>> IIRC that is called a Harvard architecture and was also used by some of
>> the DOS models where different segments were used for data and program.
>
Michal Necasek wrote:
> Except DOS was not Harvard architecture.
>
> But yes, DOS, or rather 16-bit real- and protected- mode environments
> typically provided memory models where the (near) pointer to a function
> was completely different from a (near) pointer to a data item. They were
> offsets into different segments, and while data and code memory was not
> physically distinct and could be aliased, the pointers were not
> interchangeable.
>
> That said, I don't recall most of these environments being quite ISO C
> compliant with regards to pointer handling.
True. In particular, most x86 compilers failed to properly convert
16-bit 'near' null pointers (e.g., 6A80:0000) into true 'far' null
pointers (0000:0000).
Most Win32 programmers don't realize that the "flat" 32-bit memory
address model of Windows actually uses the 32-bit segment registers
of the 386+, but simply arranges all of them (DS, ES, FS, GS) to
overlap exactly, so that it appears that all code, data, stack, etc.
segments lie in the same 32-bit range.
If [Win32] programs ever get to the point that 4GB is not enough for
the entire program (code + data + stack + heap), this could be an
opportunity to [re]introduce different segments for code and data.
OTOH, 64-bit CPUs will probably be the preferred solution by then.
-drt
And a strictly conforming program can't see this issue.
A similar issue appears with calling conventions with win32 compilers:
An int (__stdcall *)() can't be implicitly converted to an int (*)().
You must not deduce that the compiler is not compliant. You must just
deduce that int (__stdcall *)() is not a function pointer as defined by
the ISO standard. It's a pseudo-function-pointer, a
__stdcall-function-pointer, which obeys very similar rules to normal
function pointers (the same rules, if you replace all the occurrences of
function-pointer with __stdcall-function-pointer in the ISO standard),
except that it lives in a different world from function pointers.
It's the same thing with __far pointers in small memory models.
__far pointers live in a different world from normal pointers, though
there is a special additional rule (and a few other additional rules)
which allows one to convert a normal pointer to a __far pointer (and
this conversion does not have to obey the pointer-to-pointer conversion
rules of the ISO standard).
AFAIK, Borland C++ 5.0 (which included a DOS compiler) is quite
C89 compliant. The compliance is probably not perfect (well, to this
day, I've not seen any real compliance problem with it in strict mode),
but any shortfall is not fundamentally due to the memory model.
> Most Win32 programmers don't realize that the "flat" 32-bit memory
> address model of Windows actually uses the 32-bit segment registers
> of the 386+, but simply arranges all of them (DS, ES, FS, GS) to
> overlap exactly, so that it appears that all code, data, stack, etc.
> segments lie in the same 32-bit range.
>
> If [Win32] programs ever get to the point that 4GB is not enough for
> the entire program (code + data + stack + heap), this could be an
> opportunity to [re]introduce different segments for code and data.
> OTOH, 64-bit CPUs will probably be the preferred solution by then.
>
IIRC, there was a 32-bit OS on x86 (I don't remember which one) which
allowed the programmer to use as many segments as he wanted, and CS, DS
and SS could refer to different segments.
Note that i586 and lower can't use more than 4GB of physical memory.
i686 and higher can use up to 64GB of physical memory.
> IIRC, there was a 32-bit OS on x86 (I don't remember which one) which
> allowed the programmer to use as many segments as he wanted, and CS, DS
> and SS could refer to different segments.
> Note that i586 and lower can't use more than 4GB of physical memory.
> i686 and higher can use up to 64GB of physical memory.
>
Use of non-flat 32-bit memory models was fairly common with DOS
extenders, as well as in certain specialized environments. For instance
the PharLap 386|DOS extender used a special selector to map the first
megabyte of physical memory.
At least Microsoft and Watcom had (have) compilers that supported
segmented 32-bit programming.
Michal