
why/what setjmp()/longjmp()


bhaves...@my-deja.com

Aug 12, 1999
dear everyone,
I am trying to understand setjmp()/longjmp().

From the man pages it did not become clear how and in
which conditions this function should be used?

If someone can demystify these calls that will be of
great help.

with regards,
bhavesh


Sent via Deja.com http://www.deja.com/
Share what you know. Learn what you don't.

Chris Torek

Aug 12, 1999
In article <7ovkg6$sqg$1...@nnrp1.deja.com> <bhaves...@my-deja.com> writes:
>... From the man pages it did not become clear how and in
>which conditions [setjmp and longjmp] should be used?

This is a little like asking when you "should" drive to the store,
or under what conditions you "should" ride a bicycle, etc.; and
learning the C language from man pages is not really a good way to
go. Although setjmp and longjmp "look like" functions, they are
actually basic language elements, rather like "if", "while", and
so forth. (Some compilers may implement them without any help from
a library. Many optimizing compilers must at least recognize calls
to setjmp, and do some special state-saving around them.)

On the other hand, if you already understand the language fairly well:

>If someone can demystify these calls that will be of
>great help.

... then I think the setjmp/longjmp "functions" can be explained
pretty well.

Longjmp is merely a fancy "goto". It is a computed (run-time)
"goto" that goes to the label most recently set via the corresponding
"setjmp". Unlike ordinary "goto", which can only jump around
inside a single function, longjmp can jump entirely out of a
function. The only constraint on "how far" a longjmp can jump is
that it must jump into a function that has not yet "finished".

Thus:

    void endless(void) {
        ...
    loop:
        ...
        goto loop;
    }

is much the same as:

    void endless(void) {
        jmp_buf loop;
        ...
        setjmp(loop); /* top of loop */
        ...
        longjmp(loop, 1);
    }

The second parameter to longjmp() is nearly useless. To see why,
first we must note that if you give it 0, longjmp pretends that
you gave it 1 instead. (I think this is actually a historical
accident, with no good reason to exist today, but it is written
that way in the C Standard, so you must simply memorize it as
Another Useless Fact, for the moment.)

Now, the Standard places a bunch of constraints on setjmp:

    An invocation of the |setjmp| macro shall appear only in one
    of the following contexts:

      . the entire controlling expression of a selection or
        iteration statement;
      . one operand of a relational or equality operator with
        the other operand an integer constant expression, with
        the resulting expression being the entire controlling
        expression of a selection or iteration statement;
      . the operand of a unary ! operator with the resulting
        expression being the entire controlling expression of
        a selection or iteration statement; or
      . the entire expression of an expression statement
        (possibly cast to |void|).

(§4.6.1.1, p. 119, ll. 23--30).

It also says that setjmp() returns zero when called normally,
and "returns the value passed to longjmp" when it gets used as
the target label for the longjmp-goto.

In short, this means you can write:

while (setjmp(label))
or if (setjmp(label))
or switch (setjmp(label))

but not, for instance:

x = setjmp(label);

If you *could* do "x = setjmp", the second argument to longjmp
would wind up in "x". Ignoring "switch" for a moment, the other
possibilities -- a loop, an "if", or ignoring the value entirely
(as in the example loop in endless() above) -- either discard the
value, or are only interested in zero-vs-nonzero. So:

    if (setjmp(gronk)) {
        /*
         * Do special stuff because we got longjmp-ed-to.
         */
        ...
    } else {
        /*
         * Do the usual stuff; we have not been longjmp'd (yet).
         */
        ...
    }

If you call longjmp(gronk,0), this is just like longjmp(gronk,1)
(remember the Useless Fact from above?) -- it goes back to the
setjmp, a la "goto label;", but makes the setjmp appear to "return"
1 this time. And if you call longjmp(gronk,2) -- or indeed any
other nonzero value -- it makes the setjmp "return" that same
value, but all that does is go into the "true" case for the "if".

Thus, the only way you can inspect the actual value is with the
"switch" statement:

    switch (setjmp(retry)) {

    case 0:
        ... do the normal stuff ...
        break;

    case 1:
        ... do the "special case 1" stuff ...
        break;

    case 2:
        ... do the "special case 2" stuff ...
        break;

    /* etc */
    }
    ...
    if (retry_as_special_case)
        longjmp(retry, special_case_number);

Of course, this particular example can -- and probably should -- be
rewritten to use ordinary "goto" (or, better yet, rewritten not to
use *any* kind of goto):

        special_case_number = 0;
    retry:
        switch (special_case_number) {
        case 0: ... break;
        case 1: ... break;
        case 2: ... break;
        /* etc */
        }
        ...
        if (retry_as_special_case)
            goto retry;

which shows that the value passed to longjmp was not "really" used
after all: what we are really switching on is the value of the
variable "special_case_number".

Anyway, in addition to the constraints on the placement of setjmp,
and the fact that longjmp is a "computed goto" (the kind that is
the hardest to follow in real code, and therefore the kind that,
if one wants to avoid "goto"s in general, should be most shunned),
the Standard places another set of onerous constraints on code
that uses setjmp and longjmp. Quoting again from the Standard
(§4.6.2.1, p. 120, ll. 11--14):

    All accessible objects have values as of the time |longjmp|
    was called, except that the values of objects of automatic
    storage duration that are local to the function containing
    the invocation of the corresponding |setjmp| macro that do
    not have volatile-qualified type and have been changed
    between the |setjmp| invocation and |longjmp| call are
    indeterminate.

This rather hairy paragraph means that any code that "calls"
setjmp needs to use "volatile" for every local variable that it
changes after the setjmp and reads after the longjmp. Code like:

    void f(void) {
        int i;
        ...
        i = 3;
        if (setjmp(label)) {
            printf("i is %d\n", i);
            return;
        }
        ...
        i = 4;
        longjmp(label, 1);
    }

could print 3, or 4, or anything else entirely. If we use a plain
"goto" (plus something to tell us that we came back):

        i = 3;
        gone_to = 0;
    label:
        if (gone_to) {
            printf("i is %d\n", i);
            return;
        }
        ...
        i = 4;
        gone_to = 1;
        goto label;

the code becomes well-defined (it must print "i is 4"). The version
with the setjmp needs to use "volatile int i", not just plain "int i".

So, given all the pitfalls with setjmp and longjmp, why would anyone
ever use them? My own ideal answer is "no one ever would", but there
are some reasons. One classic example is in a recursive descent parser.
When the parser discovers a syntax error, it needs to "return" from
all the recursive calls, back to a top-level error handler. Rather
than having each function return a possible error, and doing things
like:

    struct exprtree *mulexpr(void) {
        struct exprtree *lhs, *rhs;

        lhs = castexpr();
        if (lhs == NULL)
            return NULL; /* error */
        rhs = mulexpr();
        if (rhs == NULL)
            return NULL; /* error, again */
        return newexpr(MUL, lhs, rhs);
    }

one can just write:

    struct exprtree *mulexpr(void) {
        struct exprtree *lhs = castexpr();
        return newexpr(MUL, lhs, mulexpr());
    }

and rely on a "longjmp" somewhere to bail us entirely out of
mulexpr() when things go wrong. (On the other hand, with careful
use of a "jamming" lexer state, one could do the exact same thing
by handling NULL inside newexpr(), and letting all the recursion
return normally. So this may not be such a great example after
all. I think such uses of setjmp/longjmp typically occur when some
project failed to account for the possibility of errors, and
error-handling is later retrofitted.)

Another classic example, which is probably better justified, occurs
with signal handlers. In this case, however, most of the details
are outside the scope of the Standard. Discussion as to "correct"
use of longjmp from a signal handler would have to go in
comp.unix.programmer and the like. All the Standard lets you do
is set a variable of type "volatile sig_atomic_t" (if needed) and
then longjmp, e.g., back to a read-eval-print loop in a Lisp
interpreter.

(Many people use setjmp and longjmp to implement things for which
they were never really designed, such as threads. This sometimes
works, and sometimes fails miserably, e.g., on stack-unwinding
longjmp's. If you need threads, you have to write machine- and/or
OS-dependent code; whether that code can use setjmp and longjmp
internally is likewise machine- and/or OS-dependent.)
--
In-Real-Life: Chris Torek, Berkeley Software Design Inc
El Cerrito, CA Domain: to...@bsdi.com +1 510 234 3167
http://claw.bsdi.com/torek/ (not always up) I report spam to abuse@.

Kaz Kylheku

Aug 13, 1999
On Thu, 12 Aug 1999 23:14:21 GMT, bhaves...@my-deja.com
<bhaves...@my-deja.com> wrote:
>dear everyone,
>I am trying to understand setjmp()/longjmp().
>
>From the man pages it did not become clear how and in
>which conditions this function should be used?

They are useful for implementing exception handling. The longjmp function
allows you to discard many function activations and return to an earlier one.
This technique can eliminate tedious return value checking at each level.

Used by themselves, the setjmp macro and longjmp function are only a crude
exception handling mechanism, since they suffer from a few drawbacks. For one
thing, there is no provision for the clean up of resources associated with the
discarded function activations. If some of the discarded functions allocated
dynamic memory or other resources, these will not be freed by the longjmp, but
will be allowed to leak. Another drawback is that longjmp has a specific
target, dictated by what is saved in the jmp_buf object.

These drawbacks can be overcome with some extra coding and discipline, so that
with setjmp and longjmp you can achieve something very similar in functionality
to the exception handling features of languages like C++, and with a high
degree of portability not available to users of these languages.

leonardo...@my-deja.com

Aug 13, 1999
In article <7ovkg6$sqg$1...@nnrp1.deja.com>,

bhaves...@my-deja.com wrote:
> dear everyone,
> I am trying to understand setjmp()/longjmp().
>
> From the man pages it did not become clear how and in
> which conditions this function should be used?
>
> If someone can demystify these calls that will be of
> great help.
>
> with regards,
> bhavesh

setjmp(jmp_buf) saves the most crucial details of the program state
(instruction pointer, stack pointer, and other registers, depending on
the system) and returns zero.

longjmp(jmp_buf, x) restores the state previously saved via
setjmp(jmp_buf), and stores x in the register used for returning
integer values.

The way I use them, they provide an unstructured (and potentially
dangerous) way to exit from a set of deeply nested procedures.

e.g.
        SetupForPrinting();
(1)     if (!setjmp(jbPrinterError)) {  /* jmp_buf jbPrinterError; */
            /* perform printing tasks... they may be sort of complex */
            /* They should not allocate memory, open files or any other */
            /* action that requires any additional clean up */
        }
(2)     CleanupAfterPrinting();

My printing system uses "lprintf" and "lprintstring", both of them based
on "lprintchar", which is responsible for sending characters to the
printer. I use these in conjunction with "harderr" (which is, AFAIK,
implementation dependent; I think the standard way is to use "signal").
If any error arises, lprintchar calls "longjmp(jbPrinterError, 1)", so
control returns to the "if" statement (1) with 1 as the return value
of setjmp; the whole block is skipped, and control is transferred to
the statements immediately following the "if" block (2).

The caveats of this method are:
1.- You should not allocate memory or open files within the code
guarded by the setjmp, since this type of "return" does not free
memory or close files, thus causing memory leaks or orphaned handles.
2.- You ALWAYS have to precede a printing segment with a
setjmp(jbPrinterError), since failing to do so will cause unpredictable
(and certainly incorrect) behaviour if an error arises during printing.

In any case, setjmp/longjmp serve as an archaic form of the exception
handling found in more modern languages.

There may be additional uses, but about these, my more knowledgeable
fellows will surely inform you.

Salud!

John Bode

Aug 13, 1999
In article <7ovkg6$sqg$1...@nnrp1.deja.com>,
bhaves...@my-deja.com wrote:
> dear everyone,
> I am trying to understand setjmp()/longjmp().
>
> From the man pages it did not become clear how and in
> which conditions this function should be used?
>

These functions together allow you to transfer control back to a
well-defined point of the program, removing any stack frames as
necessary.

You can use them to implement something similar to the Try/Throw/Catch
mechanism in C++. Here's an untested, somewhat artificial, but
hopefully easy-to-follow example:

#include <setjmp.h>
#include <stdio.h>

jmp_buf ex;

static int foo (int a, int b)
{
    if (!b)
        longjmp (ex, 1); /* THROW */
    else
        return a/b;
}

int main (void)
{
    int x = 0, y = 1, z = 0;
    if (setjmp (ex) == 0) /* TRY : longjmp branches back to here */
    {
        x = foo(y, z);
    }
    else /* CATCH */
    {
        printf ("Exception: attempt to divide by zero\n");
    }
    return 0;
}

The call to setjmp() saves the current state of the program and
establishes a point for longjmp() to branch to. The first time setjmp()
is called it returns 0, so we call foo(). The second argument to foo()
is zero, which would cause a divide-by-zero error, so foo() calls
longjmp() to branch back to the location established by setjmp(). We
pass a non-zero value in the second argument of longjmp() to indicate
that an error has occurred. Calling longjmp() causes the stack frame
associated with foo() to be popped from the stack (essentially, the
program is back in the state it was in before we called foo()). From
the programmer's perspective, it appears that setjmp() has returned
again, but this time with a non-zero value, which we interpret as an
error, so we branch to the printf() statement.

The important points to remember are:

1. setjmp/longjmp allow you to branch across procedure boundaries
2. longjmp removes stack frames for intervening procedure calls
3. longjmp cannot branch forward: it can only return to a setjmp in a
   function that is still active

Hope this made sense (and was correct).

> If someone can demystify these calls that will be of
> great help.
>
> with regards,
> bhavesh

--
Phrases often heard just before a major disaster:
"How hard can it be?"
"Hey, watch this!"
"Don't worry, this time I *know* what I'm doing..."
