Josh <Gun...@yahoo.com>
> I am currently writing a card game program for a C programming class and
> one of the requirements is that we use error checking with setjmp and
> longjmp. The library definitions seem a little confusing and I am hoping
> someone can set me in the right direction for a good book, an example,
> a webpage or anything else that might help me to understand this a
> little better. Thanks in advance for the help.
>
Refer to Advanced Programming in the UNIX Environment by W. Richard Stevens.
#include <setjmp.h>
#include <stdio.h>

jmp_buf context;

void jump( char* text, int val )
{
    printf( text );
    longjmp( context, val );
}

void main()
{
    int i;

    i = setjmp( context );
    if( i == 0 )
        jump( "Hello ", 1 );
    else if( i == 1 )
        jump( "world ", 2 );
    printf( "!\n" );
}
As the program shows you can jump across
functions: from jump() to main(). That is
because setjmp() saves the current state
of the stack (stack = list of calling functions
and their parameters.)
The "goto" keyword does not save the stack,
so it cannot be used to jump across functions.
printf("%s",text); would be preferable in case text contains '%'.
> longjmp( context, val );
> }
>
> void main()
You mean int main(void).
> {
> int i;
> i = setjmp( context );
Not allowed. From http://www.dinkumware.com/htm_cl/setjmp.html :
"You can use the macro setjmp only in an expression that:
* has no operators
* has only the unary operator !
* has one of the relational or equality operators
(==, !=, <, <=, >, or >=) with the other operand
an integer constant expression
You can write such an expression only as the expression part
of a do, expression, for, if, if-else, switch, or while statement."
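For illustration, here is a sketch of the quoted program rewritten so that setjmp appears only as the whole controlling expression of a switch, which is one of the permitted forms. The names jump_to, run_hello and returns_seen are made up for this example; other permitted spellings would be if (setjmp(context)), if (!setjmp(context)) or if (setjmp(context) == 0).

```c
#include <setjmp.h>
#include <stdio.h>

static jmp_buf context;
static int returns_seen;       /* static storage, so longjmp leaves it alone */

/* Hypothetical helper: print a word, then jump back to the setjmp point. */
static void jump_to(const char *text, int val)
{
    printf("%s", text);        /* text is passed as data, not as a format */
    longjmp(context, val);     /* never returns */
}

/* Returns how many times setjmp returned: once directly, twice via longjmp. */
int run_hello(void)
{
    returns_seen = 0;

    switch (setjmp(context)) { /* setjmp is the entire controlling expression */
    case 0:
        returns_seen++;
        jump_to("Hello ", 1);
        break;                 /* not reached */
    case 1:
        returns_seen++;
        jump_to("world ", 2);
        break;                 /* not reached */
    default:
        returns_seen++;
        break;
    }
    printf("!\n");
    return returns_seen;
}
```

Called from main, run_hello() prints "Hello world !" and returns 3.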
> if( i == 0 )
> jump( "Hello ", 1 );
> else if( i == 1 )
> jump( "world ", 2 );
> printf( "!\n" );
And throw in a return 0; here.
> }
>
> As the program shows you can jump across
> functions: from jump() to main(). That is
> because setjmp() saves the current state
> of the stack (stack = list of calling functions
> and their parameters.)
The function in which setjmp was called must still be active
when a corresponding longjmp occurs, and local variables that
aren't volatile may be messed up. So setjmp does not necessarily
save the entire current state of the stack; it may save just
enough to remove subsequent function activations.
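To make the point about local variables concrete, here is a small sketch (the names env, fail and count_after_jump are invented for it). The local is changed after setjmp and before longjmp, so it is declared volatile; without the qualifier its value after the jump would be indeterminate. Note also that fail() is called while count_after_jump() is still active.

```c
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical callee that reports failure by jumping back to the caller. */
static void fail(void)
{
    longjmp(env, 1);
}

/* Returns the value of a counter that was changed after setjmp. */
int count_after_jump(void)
{
    volatile int counter = 0;   /* volatile: must survive the longjmp */

    if (setjmp(env) == 0) {
        counter = 42;           /* modified between setjmp and longjmp */
        fail();                 /* this function is still active here */
    }
    return counter;             /* volatile objects keep their last value */
}
```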
> The "goto" keyword does not save the stack,
> so it cannot be used to jump across functions.
In Pascal, goto can be used to jump out of a function or procedure;
the compiler just has to sort it out, probably by looking at the call
stack. (In my experience, it's generally not correctly implemented
in Pascal compilers.) More importantly, there is no way to refer
to a label in another function in C, due to the scope rules.
--
MJSR
> > {
> > int i;
> > i = setjmp( context );
>
> Not allowed. From http://www.dinkumware.com/htm_cl/setjmp.html :
>
> "You can use the macro setjmp only in an expression that:
> * has no operators
> * has only the unary operator !
> * has one of the relational or equality operators
> (==, !=, <, <=, >, or >=) with the other operand
> an integer constant expression
>
> You can write such an expression only as the expression part
> of a do, expression, for, if, if-else, switch, or while statement."
I have to admit that this puzzles me.
I understand that the value of local variables in the function
which calls setjmp() can be indeterminate at the point when
setjmp() returns as a result of a call to longjmp(), but the
assignment of the return value of setjmp() to the variable i
occurs (by definition) *after* setjmp() has returned.
So, it seems to me that a statement such as:
i = setjmp(context);
*should* be OK since even if the value of i is indeterminate at the
moment when setjmp() returns, the assignment of the actual return
value of setjmp to i takes place immediately after that has happened.
(btw if PJP says that it isn't OK to do this I'm sure that he is
correct, but I would really like to understand why this restriction
exists - ie what bizarre implementation of setjmp() the committee
was trying to allow by imposing this restriction).
In particular, if it's OK to say:
switch (setjmp(context)) {
...
}
why is it *not* OK to say:
i = setjmp(context);
switch (i) {
...
}
I can only speculate (not having been on the ANSI C committee -- if I
*had* been, I would have argued for making setjmp/longjmp "work right"
in all cases, through compiler magic as necessary). Consider a
heavily stack-oriented machine where parameters are passed on a stack
in the usual fashion, but where assignments like:
i = j;
are handled as:
lea i # address of i into stack
lod j # value of j into stack
sto # store
Then the call above might be compiled as:
lea i
lea context
call setjmp
sto
Since setjmp works by saving the stack pointer, and the "lea i"
has changed the stack pointer before the call, the result is not
useful and will make "bad things" happen on a longjmp(context,any).
(I believe "goto"s should be implemented directly in the compiler,
rather than made to look like library functions. That makes all
this silliness go away. Just keep chanting "longjmp equals goto",
if that is what it takes... :-) )
--
In-Real-Life: Chris Torek, Berkeley Software Design Inc
El Cerrito, CA Domain: to...@bsdi.com +1 510 234 3167
http://claw.bsdi.com/torek/ (not always up) I report spam to abuse@.
But consider a hypothetical compiler that translated this to
"load &i into register 1; do the setjmp magic with its result
in register 0; assign register 0 to *(register 1)". When longjmp
is called, this will start up somewhere in the setjmp magic; and
then register 1 might contain garbage when the assignment occurs.
> (btw if PJP says that it isn't OK to do this I'm sure that he is
> correct, but I would really like to understand why this restriction
> exists - ie what bizarre implementation of setjmp() the committee
> was trying to allow by imposing this restriction).
From the ANSI C Rationale at http://www.lysator.liu.se/c/ :
"The temporaries may be correct on the initial call to setjmp,
but are not likely to be on any return initiated by a corresponding
call to longjmp. These considerations dictated the constraint that
setjmp be called only from within fairly simple expressions, ones
not likely to need temporary storage."
Right! But only if they are non-volatile auto locals, and have been modified
since the setjmp.
>So, it seems to me that a statement such as:
>
>i = setjmp(context);
>
>*should* be OK since even if the value of i is indeterminate at the
>moment when setjmp() returns, the assignment of the actual return
>value of setjmp to i takes place immediately after that has happened.
The problem is that setjmp() is a macro, not a function. You have to forget
everything you know about functions when considering the meaning of
i = setjmp(context).
The stark reality is that what you have here is a dangerous context
saving/restoring mechanism being applied somewhere in the middle of an
expression with a side effect.
>(btw if PJP says that it isn't OK to do this I'm sure that he is
>correct, but I would really like to understand why this restriction
>exists - ie what bizarre implementation of setjmp() the committee
>was trying to allow by imposing this restriction).
>
>In particular, if it's OK to say:
>
> switch (setjmp(context)) {
>
> ...
>
> }
>
>why is it *not* OK to say:
>
> i = setjmp(context);
>
> switch (i) {
In the latter example, there is a race condition between restoring the context
and storing the new value into i. The context restoring operation might also
restore the old value of i. This is readily possible if the value of i
is mapped to a register, and setjmp/longjmp save and restore that register.
Because setjmp() is a macro not a function call, the restoring of the context
and the assignment to i are jumbled up into some complex expression
whose evaluation order is not well defined.
If you need to pass, through longjmp, a value that needs to be saved, you
need to build up some more sophisticated mechanism. I did this in an
exception handling library (users.footprints.net/~kaz/kazlib.html).
In this software, a thrown exception is passed by searching an exception stack
for a matching node, and then storing the info in that node. The node also
contains the jmp_buf that is jumped to. The target of the jmp_buf retrieves
the info from its node, which is in automatic storage.
In simplified pseudo code, the logic looks something like
{
    struct exception_node n;
    struct exception *e;

    push_exception(&n);

    if (setjmp(n.jumpcontext))
        e = &n.exception;
    else
        e = 0;

    if (e == 0) {
        /* this is the ``try'' code */
    } else {
        /* this is the ``catch'' code */
        switch (e->code) {
        /*... various exception types .. */
        }
    }

    pop_exception();
}
The exception stack provides the global context for passing arbitrary
information from throw to catch, and for matching exceptions to handlers.
The down-side is that this is non-reentrant, so that in a multi-threaded
environment, the push_exception() and pop_exception() functions
have to know how to implicitly locate the specific exception stack of the
calling thread.
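A compilable sketch of the same idea follows. It is not the kazlib interface: the struct layouts, throw_exception() and guarded() are assumptions made for illustration, and there is a single global stack, so it has the non-reentrancy problem just described. The code member is volatile because it is written between the setjmp and the longjmp while living in an automatic object.

```c
#include <setjmp.h>
#include <stddef.h>

struct exception {
    volatile int code;            /* written after setjmp, read after longjmp */
};

struct exception_node {
    struct exception_node *next;  /* enclosing handler, or NULL */
    jmp_buf jumpcontext;
    struct exception exception;
};

/* One global stack of handlers (hypothetical; not thread-safe). */
static struct exception_node *exception_top = NULL;

static void push_exception(struct exception_node *n)
{
    n->next = exception_top;
    exception_top = n;
}

static void pop_exception(void)
{
    exception_top = exception_top->next;
}

/* Store the code in the innermost node, then jump to that node's handler.
   This sketch assumes a handler has been pushed. */
static void throw_exception(int code)
{
    struct exception_node *n = exception_top;

    n->exception.code = code;
    longjmp(n->jumpcontext, 1);
}

/* Returns 0 if the guarded code ran to completion, else the thrown code. */
int guarded(int fail_with)
{
    struct exception_node n;
    int result;

    push_exception(&n);
    if (setjmp(n.jumpcontext) == 0) {
        /* "try" code */
        if (fail_with != 0)
            throw_exception(fail_with);
        result = 0;
    } else {
        /* "catch" code: the details come from the node, not from setjmp */
        result = n.exception.code;
    }
    pop_exception();
    return result;
}
```

The value handed to longjmp is only ever 1 here; everything the handler needs travels through the node.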
OK, the thing that I think was confusing me is the fact that setjmp()
is allowed to be (actually, defined to be) a macro and therefore,
although it *looks* like a function call (which would imply a sequence
point), it isn't and there is no sequence point.
Therefore, in the process of doing whatever magic is necessary to
restore the current context, we have to allow for the possibility
that the setjmp() macro may have done the moral equivalent of i++
(or something even more horrible), the effect of which is not
guaranteed to have taken place before the assignment.
I do agree with Chris Torek, though - this really appears to be a
botch - ie setjmp() *should* have been required to have the semantics
of a function call.
I suppose in future if I want to use the return value from setjmp()
I'll just have to write something like this :-(
switch (setjmp(context)) {
case 0: i = 0; break;
case 1: i = 1; break;
case 2: i = 2; break;
case 3: i = 3; break;
default: i = -1; break;
}
This almost makes C++ exception handling look good ...
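One way to avoid enumerating every return value, sketched under the assumption that a file-scope variable is acceptable: stash the value before calling longjmp and treat the result of setjmp purely as a zero/non-zero flag. The names pending_code, raise_error and run_step are invented for this sketch.

```c
#include <setjmp.h>

static jmp_buf context;
static int pending_code;       /* static storage: not disturbed by longjmp */

/* Hypothetical error path: remember the code, then jump. */
static void raise_error(int code)
{
    pending_code = code;       /* setjmp's return value need not carry it */
    longjmp(context, 1);
}

/* Returns 0 on success, otherwise the code passed to raise_error. */
int run_step(int fail_with)
{
    int i;

    if (setjmp(context) != 0) {  /* comparison with a constant: allowed form */
        i = pending_code;        /* separate statement, after the sequence point */
    } else {
        if (fail_with != 0)
            raise_error(fail_with);
        i = 0;
    }
    return i;
}
```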
>Kaz Kylheku wrote:
>>
>> On Wed, 29 Dec 1999 09:14:17 -0800, Michael Davidson <m...@sco.com> wrote:
>>
>> >why is it *not* OK to say:
>> >
>> > i = setjmp(context);
>> >
>> > switch (i) {
>>
>> In the latter example, there is a race condition between restoring the context
>> and storing the new value into i. The context restoring operation might also
>> restore the old value of i. This is readily possible if the value of i
>> is mapped to a register, and setjmp/longjmp save and restore that register.
...
>I suppose in future if I want to use the return value from setjmp()
>I'll just have to write something like this :-(
>
> switch (setjmp(context)) {
> case 0: i = 0; break;
> case 1: i = 1; break;
> case 2: i = 2; break;
> case 3: i = 3; break;
> default: i = -1; break;
> }
Which would appear to require identical magic to be made to work.
Again there is a race condition, but in this case the Standard defines
who will win. Rather arbitrary, isn't it?
-- Mat.
There is no ``race condition'' in this one. The controlling expression of the
switch statement is a full expression. The standard says that a sequence point
occurs after each full expression is evaluated. The assignment to i is clearly
separated from the setjmp() expression.
(Of course, the actual reason why it's valid is because the setjmp() is
used in a way that meets the requirements of the standard.)
>On Thu, 30 Dec 1999 01:22:31 +0000, Mathew Hendry <sca...@dial.pipex.com> wrote:
>>On Wed, 29 Dec 1999 14:31:39 -0800, Michael Davidson <m...@sco.com>
>>wrote:
>>
>>>I suppose in future if I want to use the return value from setjmp()
>>>I'll just have to write something like this :-(
>>>
>>> switch (setjmp(context)) {
>>> case 0: i = 0; break;
>>> case 1: i = 1; break;
>>> case 2: i = 2; break;
>>> case 3: i = 3; break;
>>> default: i = -1; break;
>>> }
>>
>>Which would appear to require identical magic to be made to work.
>>Again there is a race condition, but in this case the Standard defines
>>who will win. Rather arbitrary, isn't it?
>
>There is no ``race condition'' in this one. The controlling expression of the
>switch statement is a full expression.
But this expression is far from atomic. The value of the expression
must be placed in the temporary sometime after that value is known.
Just as in the
i = setjmp(context);
case, the value may be scrubbed by the context restoration. To prevent
this, some sort of magic must happen.
> The standard says that a sequence point
>occurs after each full expression is evaluated. The assignment to i is clearly
>separated from the setjmp() expression.
>
>(Of course, the actual reason why it's valid is because the setjmp() is
>used in a way that meets the requirements of the standard.)
That's a questionable reason when the requirements are themselves
being questioned. :)
-- Mat.
By the way I've been browsing through setjmp.h and the GCC
implementation defines setjmp as a call to __setjmp
which is a real function. That might explain why the race
condition won't occur in GCC.
P.S. My newsserver carries only Kaz' reply; if this question was
answered in another thread, I haven't seen it.
>> In the latter example, there is a race condition between restoring the context
>> and storing the new value into i. The context restoring operation might also
>> restore the old value of i. This is readily possible if the value of i
>> is mapped to a register, and setjmp/longjmp save and restore that register.
>>
>How is it possible that the race condition occurs when
>"assigning the context value", and not when
>"using the context value" ? It seems that the first
>is a special case of the second ?
"Race conditions" in C relate, if anything, to multiple accesses to objects
between sequence points (where at least one access is a write and a read
isn't used to determine the new value to be written). In the switch
case it could be argued that the return value is not being written to any
object and therefore cannot take part in any race condition. To me that's
the most significant point: the rules make it impossible to access
any object in the expression containing the setjmp() invocation.
>By the way I've been browsing through setjmp.h and the GCC
>implementation defines setjmp as a call to __setjmp
>which is a real function. That might explain why the race
>condition won't occur in GCC.
Just because __setjmp is a real function doesn't mean that it will
return through the normal function return mechanism. When longjmp()
is called that's one thing it can't do.
--
-----------------------------------------
Lawrence Kirby | fr...@genesis.demon.co.uk
Wilts, England | 7073...@compuserve.com
-----------------------------------------
But when setjmp returns, this expression may have already caused
some stuff to happen (e.g., loading the address of i into a register,
which may not be restored as part of the context). The most obvious
strategy for compiling switch(setjmp(context)) wouldn't generate any
switch code before the expression is evaluated. Compiler writers who
want to load the address of a jump table before evaluating the switch
expression would be out of luck; but if setjmp were allowed anywhere
then the efficiency of all assignments and expression evaluations
might suffer for a particular compiler (or setjmp might become very
difficult to implement).
(Of course, some sort of magic must occur with setjmp to get its
value into a known place whether it was the original call or a
return from a later longjmp call; the restrictions in the standard
are apparently aimed at allowing slightly less magic than would be
necessary if setjmp could be used in more complicated expressions.)
The same context problem might be true of the legal expression
setjmp(context) >= 2, which might depend on 2 already being in a
register. The ugliness of the switch statement quoted above
suggests that simple assignment to a local variable should
have been allowed as well.
(I think you were looking at something other than gcc -- gcc does
not have its own setjmp.h.)
In gcc's "calls.c" source file, you can find the following:
if (name != 0 && IDENTIFIER_LENGTH (DECL_NAME (fndecl)) <= 17
    /* Exclude functions not at the file scope, or not `extern',
       since they are not the magic functions we would otherwise
       think they are. */
    && DECL_CONTEXT (fndecl) == NULL_TREE && TREE_PUBLIC (fndecl))
  {
    char *tname = name;

    /* We assume that alloca will always be called by name. It
       makes no sense to pass it as a pointer-to-function to
       anything that does not understand its behavior. */
    *may_be_alloca
      = (((IDENTIFIER_LENGTH (DECL_NAME (fndecl)) == 6
           && name[0] == 'a'
           && ! strcmp (name, "alloca"))
          || (IDENTIFIER_LENGTH (DECL_NAME (fndecl)) == 16
              && name[0] == '_'
              && ! strcmp (name, "__builtin_alloca"))));

    /* Disregard prefix _, __ or __x. */
    if (name[0] == '_')
      {
        if (name[1] == '_' && name[2] == 'x')
          tname += 3;
        else if (name[1] == '_')
          tname += 2;
        else
          tname += 1;
      }

    if (tname[0] == 's')
      {
        *returns_twice
          = ((tname[1] == 'e'
              && (! strcmp (tname, "setjmp")
                  || ! strcmp (tname, "setjmp_syscall")))
             || (tname[1] == 'i'
                 && ! strcmp (tname, "sigsetjmp"))
             || (tname[1] == 'a'
                 && ! strcmp (tname, "savectx")));
        if (tname[1] == 'i'
            && ! strcmp (tname, "siglongjmp"))
          *is_longjmp = 1;
      }
    else if ((tname[0] == 'q' && tname[1] == 's'
              && ! strcmp (tname, "qsetjmp"))
             || (tname[0] == 'v' && tname[1] == 'f'
                 && ! strcmp (tname, "vfork")))
      *returns_twice = 1;
    else if (tname[0] == 'l' && tname[1] == 'o'
             && ! strcmp (tname, "longjmp"))
      *is_longjmp = 1;
    /* XXX should have "malloc" attribute on functions instead
       of recognizing them by name. */
    else if (! strcmp (tname, "malloc")
             || ! strcmp (tname, "calloc")
             || ! strcmp (tname, "realloc")
             /* Note use of NAME rather than TNAME here. These functions
                are only reserved when preceded with __. */
             || ! strcmp (name, "__vn") /* mangled __builtin_vec_new */
             || ! strcmp (name, "__nw") /* mangled __builtin_new */
             || ! strcmp (name, "__builtin_new")
             || ! strcmp (name, "__builtin_vec_new"))
      *is_malloc = 1;
Can you guess what all this is for? :-)
But, had the standard defined setjmp() to be a function rather than
a macro it *could* (and would) have *required* that setjmp() *appear*
to return "as if" it had returned through the normal function return
mechanism.
[ and, I believe that this would actually have been technically
possible even on systems with particularly baroque calling
conventions so I don't agree that this is something that "can't"
be done - it's just something that the standard, in its wisdom,
decided not to require ]
Certainly anything's possible. At worst a compiler could
#define setjmp(env) __builtin_setjmp(env)
and then have special support in the compiler for __builtin_setjmp.
setjmp() never got the status of an intrinsic language feature (e.g.
it isn't a keyword) so its definition is a compromise between
implementability and utility. The current definition arguably works
well enough for most situations.
I guess there are some parallels with the (non-standard) alloca() function.
You're probably aware of the restrictions in the use of that. Things
that play with "stack" frames but aren't fully intrinsic tend to be like
that. At least C99 has VLAs and there should be no more excuse for using
alloca() (although I'm sceptical as to whether there ever was any excuse).
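For comparison, a minimal sketch of the C99 VLA form of a run-time-sized scratch buffer, the sort of thing alloca() was typically used for. The function name sum_of_squares is invented, n is assumed to be small and greater than zero, and VLA support is assumed to be available in the compiler.

```c
#include <stddef.h>

/* Sums 0*0 + 1*1 + ... + (n-1)*(n-1) through a scratch array whose size
   is only known at run time. Assumes n > 0 and small enough for the stack. */
unsigned long sum_of_squares(size_t n)
{
    unsigned long scratch[n];  /* C99 VLA: released when the block exits */
    unsigned long total = 0;
    size_t k;

    for (k = 0; k < n; k++)
        scratch[k] = (unsigned long)k * k;
    for (k = 0; k < n; k++)
        total += scratch[k];
    return total;
}
```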
...
>>There is no ``race condition'' in this one. The controlling expression of the
>>switch statement is a full expression.
>
>But this expression is far from atomic. The value of the expression
>must be placed in the temporary sometime after that value is known.
But that is fully under the control of setjmp() (or longjmp()) until
their execution is complete, i.e. they can do whatever is necessary to make
sure the value is correct. Once their execution is complete the environment
has been fully restored and there is no problem.
>Just as in the
>
> i = setjmp(context);
>
>case, the value may be scrubbed by the context restoration. To prevent
>this, some sort of magic must happen.
The difference here is that there is an evaluation that can occur
outside setjmp()'s or longjmp()'s control and before their execution
is complete, and that is the evaluation of the lvalue i. Consider this:
*p = setjmp(context);
The work in the evaluation of the lvalue *p is more obvious. It can
happen prior to or even mixed with the evaluation of setjmp(). While
less obvious the same is true for i. In any event things are going to
get complex if you start creating rules that allow i but disallow *p.
>The difference here is that there is an evaluation that can occur
>outside setjmp()'s or longjmp()'s control and before their execution
>is complete, and that is the evaluation of the lvalue i. Consider this:
>
> *p = setjmp(context);
>
>The work in the evaluation of the lvalue *p is more obvious. It can
>happen prior to or even mixed with the evaluation of setjmp(). While
>less obvious the same is true for i.
And on real machines (if not in the C virtual machine) the same is
true for temporaries. They too require storage and might be "partially
evaluated" at the point at which the context is changed. In practice
the "assignment" to a temporary and the assignment to i are unlikely
to be significantly different.
> In any event things are going to
>get complex if you start creating rules that allow i but disallow *p.
To allow initialisations might be simpler
int i = setjmp(context);
since pointers and other evil devices are automatically ruled out.
-- Mat.
Whether or not temporary storage is needed depends on the implementation.
The setjmp interface specification is carefully worded specifically to minimize or
eliminate the need for temporary storage.
It's possible, for instance, that setjmp() places its result value into some
register. That register is designated to be clobbered and therefore is not
restored by longjmp().
>On Thu, 30 Dec 99 20:38:57 GMT, fr...@genesis.demon.co.uk (Lawrence
>Kirby) wrote:
>
>>The difference here is that there is an evaluation that can occur
>>outside setjmp()'s or longjmp()'s control and before their execution
>>is complete, and that is the evaluation of the lvalue i. Consider this:
>>
>> *p = setjmp(context);
>>
>>The work in the evaluation of the lvalue *p is more obvious. It can
>>happen prior to or even mixed with the evaluation of setjmp(). While
>>less obvious the same is true for i.
>
>And on real machines (if not in the C virtual machine) the same is
>true for temporaries.
No it isn't. The use of any temporary created within the execution
of setjmp or longjmp is under the control of the setjmp/longjmp code.
i and *p can be at least partially evaluated outside the control
of the setjmp/longjmp code which means the setjmp/longjmp code can't
control that evaluation.
>They too require storage and might be "partially
>evaluated" at the point at which the context is changed.
But such evaluation is as a result of how setjmp/longjmp are implemented.
The evaluation of i and *p are not. For example setjmp/longjmp may
be implemented as hand-written assembly code which forces things to
be done in the "right" order and any temporary it uses is managed
correctly. The evaluation of i and *p are just part of the normal
compilation process which will be oblivious to the requirements of
setjmp/longjmp unless the compiler detects and allows for such usage
(effectively making setjmp at least an intrinsic).
>In practice
>the "assignment" to a temporary and the assignment to i are unlikely
>to be significantly different.
They could be *very* different.
>> In any event things are going to
>>get complex if you start creating rules that allow i but disallow *p.
>
>To allow initialisations might be simpler
>
> int i = setjmp(context);
>
>since pointers and other evil devices are automatically ruled out.
But you are also creating an automatic variable which *might* cause
its own "stack frame" adjustments. It could create a really nasty
conflict.
Now when setjmp() returns (for example by setting the
instruction pointer register), it will return before
the assignment is done. Hence there should be no race
condition.
One thing is clear: there's more to setjmp() than
it seems :)
P.S. The GNU manual page for setjmp() does not
name any restrictions on its use. Perhaps
they worked around it ?
Hmm... it looks like they're doing some form
of function replacement. I guess the "magic
functions" must refer to internal compiler
functions, for example to save the CPU state?
>> Just because __setjmp is a real function doesn't mean that it will
>> return through the normal function return mechanism. When longjmp()
>> is called that's one thing it can't do.
>>
>Agreed, but it does mean that the assignment operation
>will be done after the function call.
There are essentially 3 steps to assignment
1. The left hand lvalue expression is evaluated
2. The right hand value expression is evaluated
3. The value from 2 is converted to the type of the lvalue from 1
and written to the object that the lvalue designates.
Of these steps the only ordering is that 3 depends on the results of
1 and 2 so must come after both. There is no ordering between
steps 1 and 2. That means that step 1 can be messed up by the action
of the setjmp()/longjmp() mechanism. Also consider that setjmp() will
appear to return twice, once from the initial setjmp() invocation, and
again from the longjmp() call. There's no requirement from C's normal
expression evaluation semantics that steps 1 and 3 be able to cope with
this, i.e. reentering the evaluation of an expression partway through.
setjmp()/longjmp() is required to make this work for a limited set
of expression forms and contexts but they all avoid assignment and other
potentially unpleasant issues.
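The lack of ordering between steps 1 and 2 can be observed with ordinary functions, no setjmp involved. In this sketch (all names invented for illustration) the left operand and the right operand each record when they ran; which one goes first is up to the implementation, but the store in step 3 always happens last.

```c
#include <stdio.h>

static int slots[2];
static int calls;              /* counts calls in whatever order they happen */
static int lhs_order, rhs_order;

static int *pick_slot(void)    /* step 1: produces the lvalue */
{
    lhs_order = ++calls;
    return &slots[1];
}

static int compute(void)       /* step 2: produces the value */
{
    rhs_order = ++calls;
    return 99;
}

/* Performs *pick_slot() = compute() and reports which side ran first. */
int assign_demo(void)
{
    calls = 0;
    *pick_slot() = compute();  /* steps 1 and 2 may run in either order */
    printf("left operand evaluated %s\n",
           lhs_order < rhs_order ? "first" : "second");
    return slots[1];           /* step 3 has stored 99 by now */
}
```

If compute() were replaced by a setjmp invocation, a later longjmp would re-enter this expression after step 1 may already have run, which is exactly the situation the restrictions rule out.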
> I.e. it's
>no longer a macro where even that basic assumption
>could fail.
Reentering the evaluation of an expression partway through violates
all normal assumptions about expression evaluation in C.
>Now when setjmp() returns (for example by setting the
>instruction pointer register), it will return before
>the assignment is done. Hence there should be no race
>condition.
The race condition is between steps 1 and 2. My *p=setjmp(...) example
was intended to demonstrate this.
>One thing is clear: there's more to setjmp() than
>it seems :)
>
>P.S. The GNU manual page for setjmp() does not
> name any restrictions on its use. Perhaps
> they worked around it ?
Maybe the sort of architectures that gcc works on makes it easy to
do so. Or maybe the manual simply doesn't specify the restrictions that
exist. In fact since setjmp() and longjmp() are part of the library
they may well have nothing to do with gcc itself, or at the very least
are related to the particular system library you happen to be using with
gcc. There may still be no safe assumptions you can make beyond the
guarantees given in the standard, even if the code is intended only for a
gcc implementation. If gcc has no restrictions, what behaviour does it
define for *p++ = setjmp(...) ?
>> Can you guess what all this is for? :-)
In article <386CA950...@brotherrobot.org>,
Andomar <and...@brotherrobot.org> wrote:
>Hmm... it looks like they're doing some form
>of function replacement. I guess the "magic
>functions" must refer to internal compiler
>functions, for example to save the CPU state?
More or less, yes. Calls to "the setjmp function", for instance,
still generate calls to setjmp -- but the "liveness" of registers
around that call is modified in ways that "liveness" of registers
around other function calls is not. In other words, if you rename
the assembly-language version of setjmp that I wrote -- say, call
it "glomp" -- and compile C code of the form:
if (glomp(jbuf)) {
...
} else {
...
}
that code will often fail. Change that same code back to:
if (setjmp(jbuf)) {
...
} else {
...
}
and the code works again.
Another specific case (alloca) never turns into a function at all,
at least in some cases, because function calls can use the same
stack that alloca modifies. If the compiler did not recognize the
name "alloca" and inline it, the code would often fail mysteriously.