
Exceptions


Thiago R. Adams

Jun 29, 2005, 10:50:59 AM
Hi,
I have a lot of questions about exceptions.

In the sample below:

struct E {
    . . .
    ~E() {}
};

int main()
{
    try {
        throw E();
    }
    catch (const E& e) {
        e.some();
    }
}

When will ~E be called? What is E's scope?

Where will E be created? On the stack?

Are exceptions slow if they are not thrown?

What makes exceptions slower or faster? Does the number of catches have an influence?

When using exceptions, what influences memory use?

And what influences program size?



tony_i...@yahoo.co.uk

Jun 30, 2005, 7:09:33 AM
Firstly, I won't answer "when will ~E be called", because a basic
programming skill is putting a few cout statements in to find out for
yourself. You can also work out whether E is on the stack or not, but
this is a bit more difficult - the answer is yes. My recollection is
that E() will be created on the normal program stack, and the catch
statement may run in a dedicated exception stack (which allows
exceptions to work when the main stack has been exhausted due to
excessive recursion).

Anyway, entering and leaving a try { } block may have overheads in many
environments, but they can vary massively. Again, a basic programming
skill is to benchmark things in your own environment, and/or look at
the assembly output from your compiler (e.g. g++ -S). On Solaris a
couple of years ago, I found reporting errors via exceptions was about 15
times slower than using integral return codes (like libc), but things
may well have changed. I don't recall the overhead for successful
processing. Anyway, whether this is significant depends on many
factors. It just confirmed my impression that the programmer I had
just interviewed was being a bit silly when he insisted that he hadn't
reported an error using anything other than an exception for several
years.
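
For a rough idea, something along these lines (an untested sketch; the
absolute numbers are meaningless, only the ratio on your own platform
matters):

#include <cstdio>
#include <ctime>

int report_by_code(int i) { return i % 2 ? -1 : 0; }
void report_by_throw(int i) { if (i % 2) throw -1; }

int main()
{
    const int N = 1000000;
    volatile long sink = 0; // keep the compiler from discarding the work

    std::clock_t t0 = std::clock();
    for (int i = 0; i < N; ++i)
        sink += report_by_code(i);

    std::clock_t t1 = std::clock();
    for (int i = 0; i < N; ++i)
        try { report_by_throw(i); } catch (int) { ++sink; }

    std::clock_t t2 = std::clock();
    std::printf("codes: %ld clocks, throws: %ld clocks\n",
                (long)(t1 - t0), (long)(t2 - t1));
}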

As for what influences the memory: use of exceptions may require the
compiler to reserve an additional stack, but the standard C++ library
uses exceptions itself, so you should have that overhead anyway. Re
program size, it's just not relevant unless you're in an embedded
environment, in which case I'd say again that you have to write code
and make measurements with your particular compiler and processor.

The general rule is use exceptions to report exceptional circumstances,
not oft-encountered error conditions. Varying from this rule isn't
worth doing unless you have profiling results showing you that you have
to.

A design example: a function written "bool is_odd(int)" probably
shouldn't throw, but "void assert_odd(int)" could, because the caller
is clearly saying it would be exceptional not to succeed. See
Stroustrup's TC++PL3 for some background discussions on use of
exceptions.
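
In code, the distinction looks roughly like this (a sketch; the
exception type is my invention):

#include <stdexcept>

// Querying: a false answer is a perfectly normal result.
inline bool is_odd(int n)
{
    return n % 2 != 0;
}

// Asserting: the caller declares that failure is exceptional.
inline void assert_odd(int n)
{
    if (!is_odd(n))
        throw std::runtime_error("assert_odd: value is even");
}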

Cheers, Tony

Thiago R. Adams

Jun 30, 2005, 1:30:31 PM
Hi Tony,

My question is about the destructor. When will it be called? Yes, I made
some tests with cout. I guess that the destructor is called at the
outermost try. And the compiler is smart if you use a rethrow. But must
this behavior be written down somewhere, or not?

> You can also work out whether E is on the stack or not, but
> this is a bit more difficult - the answer is yes. My recollection is
> that E() will be created on the normal program stack, and the catch
> statement may run in a dedicated exception stack (which allows
> exceptions to work when the main stack has been exhausted due to
> excessive recursion).

I made some tests. I created a recursive function with try/catch and
logged the addresses of stack variables. The stack addresses of the
variables don't change with exceptions. (I used Visual C++ 2005 Express.)
So I also guess that it is on another stack.

> The general rule is use exceptions to report exceptional circumstances,
> not oft-encountered error conditions. Varying from this rule isn't
> worth doing unless you have profiling results showing you that you have
> to.

It's an excellent topic! :) What is an exceptional condition?
I instead use exceptions for errors and for broken preconditions;
for example:

struct X {
    X(AnyInterface* p) {
        if (p == 0)
            throw runtime_error("AnyInterface is mandatory for class X!");
        else
            p->f();
    }
};

I need to know the overhead and behavior of exceptions to convince my
coworkers that they are good.
Today we use a lot of return codes.

TC++PL3 is very good! :)
I have also read:
CUJ August 2004, "When, for what, and how should you use exceptions" by
Sutter,
Exceptional C++ Style and C++ Coding Standards.

Thanks!
http://paginas.terra.com.br/informatica/thiago_adams/eng/index.htm

David Abrahams

Jun 30, 2005, 2:27:44 PM
"Thiago R. Adams" <thiago...@gmail.com> writes:

> Hi,
> I have a lot of questions about exceptions.
>
> In the sample below:
>
> struct E {
>     . . .
>     ~E() {}
> };
>
> int main()
> {
>     try {
>         throw E();
>     }
>     catch (const E& e) {
>         e.some();
>     }
> }
>
> When will ~E be called?

It depends on the implementation, because E's copy ctor may be called
an arbitrary number of times, but the last E will definitely be
destroyed when execution reaches the closing brace of the catch block.

> What is E's scope?

I don't understand the question.

> Where will E be created? On the stack?

That's entirely implementation-dependent. From an abstract
point-of-view, there's "magic memory" in which all exceptions are
stored during unwinding.

> Are exceptions slow if they are not thrown?

That's also implementation dependent. On the "best" implementations,
there's no speed penalty for exceptions until one is thrown.

> What makes exceptions slower or faster? Does the number of catches
> have an influence?

That's also implementation dependent. On the "best" implementations,
there's no speed penalty for a catch block. Generally, those
implementations trade speed in the case where an exception is thrown
for speed in the case where no exception is thrown. In other words,
on implementations I consider inferior, actually throwing an exception
may be faster, but executing the code normally, when there are no
exceptions, will be slower.

> When using exceptions, what influences memory use?
> And what influences program size?

That's also implementation dependent. There is generally no dynamic
memory associated with exception-handling. As for the program size,
on the "best" implementations, tables are generated that associate
program counter values with the unwinding actions and catch blocks
that must be executed when an exception is thrown with any of those
program counter values in an active stack frame. In some
implementations you can limit the number of tables generated by using
the empty exception specification wherever possible.
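
For example (a sketch; whether it actually shrinks the tables is, as
above, implementation dependent):

// An empty exception specification promises that no exception can
// escape f; some implementations use that promise to omit unwinding
// table entries for f's callers.
void f() throw();   // may not throw anything
void g();           // may throw anything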

HTH,

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

David Abrahams

Jul 1, 2005, 5:50:27 AM
"Thiago R. Adams" <thiago...@gmail.com> writes:

>> The general rule is use exceptions to report exceptional circumstances,
>> not oft-encountered error conditions. Varying from this rule isn't
>> worth doing unless you have profiling results showing you that you have
>> to.
> It's an excellent topic! :) What is an exceptional condition?
> I instead use exceptions for errors and for broken preconditions.

Broken preconditions should almost always be handled with an assert
and not an exception. An exception will usually cause a great deal of
code to be executed before you get a chance to diagnose the problem.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Maxim Yegorushkin

Jul 1, 2005, 6:57:53 AM
On Thu, 30 Jun 2005 22:27:44 +0400, David Abrahams
<da...@boost-consulting.com> wrote:

[]

>> When using exceptions, what influences memory use?
>> And what influences program size?
>
> That's also implementation dependent. There is generally no dynamic
> memory associated with exception-handling.

Not sure if exception throwing is exception-handling, but g++ allocates
exceptions on the heap.

http://savannah.gnu.org/cgi-bin/viewcvs/gcc/gcc/libstdc%2B%2B-v3/libsupc%2B%2B/eh_alloc.cc?rev=HEAD&content-type=text/vnd.viewcvs-markup

--
Maxim Yegorushkin
<firstname...@gmail.com>

Thiago R. Adams

Jul 1, 2005, 1:02:33 PM
> Broken preconditions should almost always be handled with an assert
> and not an exception. An exception will usually cause a great deal of
> code to be executed before you get a chance to diagnose the problem.

In many cases asserts work as commentary only;
for example:

void f(pointer* p) {
    assert(p != 0);
    p->f();
}

With exceptions the code will respond to the error.

TC++PL has an example:

template <class X, class A> inline void Assert(A assertion) {
    if (!assertion) throw X();
}

I think that the most useful is:
template <class X, class A> inline void Assert(A assertion) {
    DebugBreak(); // stops in the debugger
    if (!assertion) throw X();
}

void f2(int* p) {
    // ARG_CHECK is a constant: true to check args
    Assert<Bad_arg>(ARG_CHECK || p != 0);
}

If the code is correct then all callers will have tested their arguments.
Then I can remove this test (ARG_CHECK). But is it necessary?

When to use throw and when to use asserts?
Two cases:
void func(vector<int*>& v, int* p) {
    if (p == 0)
        throw bad_arg(); // why not? this function already throws anyway
    v.push_back(p); // push_back can throw, right?
}

void func(vector<int>& v, int* p) throw() {
    // maybe this is better, because this function doesn't throw:
    // it has only one contract with callers in debug and release
    assert(p != 0);

    // yes, I could use a reference, but this is only an example :)
    *p = v.size() + 10;
}

I think this is a long and important topic.
The performance of exceptions is important in deciding whether to use
them. My coworkers say: "If we don't know the behavior and performance
penalties of exceptions, we will use return codes, because return codes
are simpler and well known."


Thanks!
(and sorry, my English is not so good)

David Abrahams

Jul 1, 2005, 9:58:38 PM
"Maxim Yegorushkin" <firstname...@gmail.com> writes:

> On Thu, 30 Jun 2005 22:27:44 +0400, David Abrahams
> <da...@boost-consulting.com> wrote:
>
> []
>
>>> When using exceptions, what influences memory use?
>>> And what influences program size?
>>
>> That's also implementation dependent. There is generally no dynamic
>> memory associated with exception-handling.
>
> Not sure if exception throwing is exception-handling, but g++ allocates
> exceptions on the heap.
>
> http://savannah.gnu.org/cgi-bin/viewcvs/gcc/gcc/libstdc%2B%2B-v3/libsupc%2B%2B/eh_alloc.cc?rev=HEAD&content-type=text/vnd.viewcvs-markup

AFAICT those exceptions are being "dynamically" allocated out of
static memory, in

static one_buffer emergency_buffer[EMERGENCY_OBJ_COUNT];

That looks like the "magic memory" I was referring to. Am I missing
something?

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


David Abrahams

Jul 1, 2005, 10:04:21 PM
"Thiago R. Adams" <thiago...@gmail.com> writes:

>> Broken preconditions should almost always be handled with an assert
>> and not an exception. An exception will usually cause a great deal of
>> code to be executed before you get a chance to diagnose the problem.
>
> In many cases asserts work as commentary only;
> for example:
>
> void f(pointer* p) {
>     assert(p != 0);
>     p->f();
> }

No, they work to stop program execution at the earliest point of
detection of the error.

> With exceptions the code will respond to the error.

If preconditions are broken, your program state is broken, by
definition. Trying to recover is generally ill-advised.

> TC++PL has an example:

That doesn't make it right :)

> template <class X, class A> inline void Assert(A assertion) {
>     if (!assertion) throw X();
> }
>
> I think that the most useful is:
> template <class X, class A> inline void Assert(A assertion) {
>     DebugBreak(); // stops in the debugger

Stop execution so I can debug the program. Good!

>     if (!assertion) throw X();
> }

If the assertion fails when there is no debugger, how do you expect
the program to recover?

> void f2(int* p) {
>     // ARG_CHECK is a constant: true to check args
>     Assert<Bad_arg>(ARG_CHECK || p != 0);
> }
>
> If the code is correct then all callers will have tested their
> arguments. Then I can remove this test (ARG_CHECK). But is it necessary?

I don't understand the question.

> When to use throw and when to use asserts?

Use asserts to detect that the invariants you have designed into your
program are broken. Use throw to indicate that a function will not be
able to fulfill its usual postconditions, and when the immediate
caller is not very likely to be able to handle the error directly and
continue (otherwise, use error return codes and the like).

> Two cases:
> void func(vector<int*>& v, int* p) {
>     if (p == 0)
>         throw bad_arg(); // why not? this function already throws anyway

Why not? Because, I presume, passing 0 is a precondition violation.
It depends on what you put in your documentation. If you say, "you
must pass me a non-null pointer," then use an assert. If you say, "if
you pass a null pointer I'll throw," well, then throw. However, the
former is usually the better course of action.

>     v.push_back(p); // push_back can throw, right?

Yes. So what?

> }
>
> void func(vector<int>& v, int* p) throw() {
>     // maybe this is better, because this function doesn't throw:
>     // it has only one contract with callers in debug and release
>     assert(p != 0);

Even if it were not nothrow, it would have only one contract.

>     // yes, I could use a reference, but this is only an example :)
>     *p = v.size() + 10;
> }
>
> I think this is a long and important topic.
> The performance of exceptions is important in deciding whether to use
> them. My coworkers say: "If we don't know the behavior and performance
> penalties of exceptions, we will use return codes, because return
> codes are simpler and well known."

Yes, it's common FUD. It's easy to do some experiments to get a
feeling for the real numbers.

Do they know the cost in correctness and maintainability of using
return codes?

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Alf P. Steinbach

Jul 2, 2005, 6:23:03 AM
* David Abrahams -> Thiago R. Adams:

>
> >
> > I think that the most useful is:
> > template <class X, class A> inline void Assert(A assertion) {
> >     DebugBreak(); // stops in the debugger
>
> Stop execution so I can debug the program. Good!
>
> >     if (!assertion) throw X();
> > }
>
> If the assertion fails when there is no debugger, how do you expect
> the program to recover?

That's actually a good _C++_ question... ;-)

First, the reason why one would like to 'throw' in this case is usually
not to recover in the sense of continuing normal execution, but to
recover enough to do useful logging, reporting and graceful exit on an
end-user system with no debugger and other programmer's tools (including
no programmer's-level understanding of what goes on).

Why that is a problem in C++: the standard exception hierarchy is not
designed. Uh, I meant, it's not designed with different degrees of
recoverability in mind. At best you can use std::runtime_error for
"soft" (in principle recover-and-continue-normally'able) exceptions, and
classes derived otherwise from std::exception for "hard" exceptions, but
in practice people tend to not restrict themselves to
std::runtime_error, and the Boost library is an example of this common
practice -- the standard's own exception classes are also examples.

So, if you want a really "hard" exception in C++, one that is likely to
propagate all the way up to the topmost control level, you'll have to
use something else than a standard exception.

And even that non-standard exception might be caught (and not rethrown)
by a catch(...) somewhere. Which is not as unlikely as it might seem.
E.g., as I recall, ScopeGuard does that in its destructor¹.

Well, then, why not use std::terminate instead? After all, that's what
it's for, isn't it? And it's configurable.

But no, that's not what it's for. Calling std::terminate does not
guarantee RAII cleanup as a "hard" exception would. In short, I know of
no good portable solution to this problem in standard C++, and thinking
of how extremely easily it could have been supported in the design of
the standard exception classes (there was even existing practice from
earlier languages indicating how it should be) it's very frustrating.


¹) One might argue that calling std::terminate is the only reasonable
failure handling in a destructor, even for ScopeGuard-like objects.
But the standard already provides that draconian measure for the
situation where it's really needed, where you would otherwise have a
double exception (which does not exist in C++). Doing it explicitly
just removes a measure of control from the client code programmer.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?

Maxim Yegorushkin

Jul 3, 2005, 6:10:40 AM
David Abrahams wrote:

[]

> > Not sure if exception throwing is exception-handling, but g++ allocates
> > exceptions on the heap.
> >
> > http://savannah.gnu.org/cgi-bin/viewcvs/gcc/gcc/libstdc%2B%2B-v3/libsupc%2B%2B/eh_alloc.cc?rev=HEAD&content-type=text/vnd.viewcvs-markup
>
> AFAICT those exceptions are being "dynamically" allocated out of
> static memory, in
>
> static one_buffer emergency_buffer[EMERGENCY_OBJ_COUNT];
>
> That looks like the "magic memory" I was referring to. Am I missing
> something?

Only when malloc returns 0. That's why the *emergency* buffer.

--
Maxim Yegorushkin
<firstname...@gmail.com>

Peter Dimov

Jul 3, 2005, 6:11:16 AM
Alf P. Steinbach wrote:
> * David Abrahams -> Thiago R. Adams:
> >
> > >
> > > I think that the most useful is:
> > > template <class X, class A> inline void Assert(A assertion) {
> > >     DebugBreak(); // stops in the debugger
> >
> > Stop execution so I can debug the program. Good!
> >
> > >     if (!assertion) throw X();
> > > }
> >
> > If the assertion fails when there is no debugger, how do you expect
> > the program to recover?
>
> That's actually a good _C++_ question... ;-)
>
> First, the reason why one would like to 'throw' in this case is
> usually not to recover in the sense of continuing normal execution,
> but to recover enough to do useful logging, reporting and graceful
> exit on an end-user system with no debugger and other programmer's
> tools (including no programmer's-level understanding of what goes on).

No recovery is possible after a failed assert. A failed assert means
that we no longer know what's going on. Generally logging and reporting
should be done at the earliest opportunity; if you attempt to "recover"
you may be terminated and no longer be able to log or report.

In some situations (ATM machine in the middle of a transaction, say) it
might make sense to attempt recovery even when an assertion fails, if
things can't possibly get any worse. This is extremely fragile, of
course. You can't test how well the recovery works because by
definition it is only executed in situations that your tests did not
cover (if they did, you'd have fixed the code to no longer assert).

Maxim Yegorushkin

Jul 3, 2005, 6:19:39 AM
On Sat, 02 Jul 2005 14:23:03 +0400, Alf P. Steinbach <al...@start.no>
wrote:

[]

>> If the assertion fails when there is no debugger, how do you expect
>> the program to recover?
>
> That's actually a good _C++_ question... ;-)
>
> First, the reason why one would like to 'throw' in this case is
> usually not to recover in the sense of continuing normal execution,
> but to recover enough to do useful logging, reporting and graceful
> exit on an end-user system with no debugger and other programmer's
> tools (including no programmer's-level understanding of what goes on).

Why would one want a graceful exit when code is broken, rather than dying
as loud as possible leaving a core dump with all state preserved, rather
than unwound? std::abort() is a good tool for that.

--
Maxim Yegorushkin
<firstname...@gmail.com>

Alf P. Steinbach

Jul 3, 2005, 6:25:19 PM
* Peter Dimov:

> Alf P. Steinbach wrote:
> > * David Abrahams -> Thiago R. Adams:
> > >
> > > >
> > > > I think that the most useful is:
> > > > template <class X, class A> inline void Assert(A assertion) {
> > > >     DebugBreak(); // stops in the debugger
> > >
> > > Stop execution so I can debug the program. Good!
> > >
> > > >     if (!assertion) throw X();
> > > > }
> > >
> > > If the assertion fails when there is no debugger, how do you expect
> > > the program to recover?
> >
> > That's actually a good _C++_ question... ;-)
> >
> > First, the reason why one would like to 'throw' in this case is
> > usually not to recover in the sense of continuing normal execution,
> > but to recover enough to do useful logging, reporting and graceful
> > exit on an end-user system with no debugger and other programmer's
> > tools (including no programmer's-level understanding of what goes on).
>
> No recovery is possible after a failed assert. A failed assert means
> that we no longer know what's going on. Generally logging and reporting
> should be done at the earliest opportunity; if you attempt to "recover"
> you may be terminated and no longer be able to log or report.

If "recovery" in the above means continuing on with normal execution,
then the above is a sort of extremist version of my position. I would
not go that far, especially in light of contrary facts. For example,
one of the most used programs in the PC world, the Windows Explorer,
does continue normal execution in this situation simply by restarting
itself, so with this meaning of "recovery" the above is false.

If "recovery" in the above means something generally more reasonable,
like, cleaning up, then again that flies in the face of established
fact. E.g., generally open files are closed by the OS whenever a
process terminates, whatever the cause of the termination, and it's no
big deal to define that as part of the application. In this sense of
"recovery" the above therefore says it's impossible to write an OS in
C++, and that's patently false, too.

So what does the above mean, if anything?

Is it, perhaps, a "proof", from principles, that hummingbirds can't fly?

Then I'd suggest the proof technique and/or the principles are to blame.


> In some situations (ATM machine in the middle of a transaction, say) it
> might make sense to attempt recovery even when an assertion fails, if
> things can't possibly get any worse. This is extremely fragile, of
> course. You can't test how well the recovery works because by
> definition it is only executed in situations that your tests did not
> cover (if they did, you'd have fixed the code to no longer assert).

Happily it's not the case that every ATM machine that encounters a
failed assertion has to be discarded. :-)


Cheers,

- Alf

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Alf P. Steinbach

Jul 3, 2005, 6:29:23 PM
* Maxim Yegorushkin:

>
> Why would one want a graceful exit when code is broken, rather than dying
> as loud as possible leaving a core dump with all state preserved, rather
> than unwound? std::abort() is a good tool for that.

Because almost all code in existence, with the possible exception of
TeX, is broken, and (1) end-users really don't like core dumps, (2) the
maintenance team would like some information about what went wrong so
that it can be fixed, and a discarded end-user's core dump doesn't do,
(3) it's generally impolite and inconsiderate to wreck the city when one
knows one's dying, but that can be the effect of a simple std::abort.

Cheers,

- Alf

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Niklas Matthies

Jul 3, 2005, 10:27:08 PM
On 2005-07-03 10:19, Maxim Yegorushkin wrote:

> On Sat, 02 Jul 2005 14:23:03 +0400, Alf P. Steinbach wrote:
:
>>> If the assertion fails when there is no debugger, how do you expect
>>> the program to recover?
>>
>> That's actually a good _C++_ question... ;-)
>>
>> First, the reason why one would like to 'throw' in this case is
>> usually not to recover in the sense of continuing normal execution,
>> but to recover enough to do useful logging, reporting and graceful
>> exit on an end-user system with no debugger and other programmer's
>> tools (including no programmer's-level understanding of what goes on).
>
> Why would one want a graceful exit when code is broken, rather than
> dying as loud as possible leaving a core dump with all state
> preserved, rather than unwound?

Because the customer expects and demands it. Actually, more often than
not the customer even demands graceful resumption.

-- Niklas Matthies

Albrecht Fritzsche

Jul 3, 2005, 10:27:31 PM
>Why would one want a graceful exit when code is broken, rather than dying
>as loud as possible leaving a core dump with all state preserved, rather
>than unwound?

Because your code is integrated into another program. This program
might, for instance, give some guarantees while your code is working -
like saving all prior actions etc...

If all your program can do is "dying loud" then the integrators won't
be exactly happy.

Ali

Nicola Musatti

Jul 4, 2005, 7:08:15 AM

Maxim Yegorushkin wrote:
[...]


> Why would one want a graceful exit when code is broken, rather than dying
> as loud as possible leaving a core dump with all state preserved, rather
> than unwound? std::abort() is a good tool for that.

You and Peter seem to assume that there can be no knowledge about how
and where the code is broken. I believe that this depends on the kind
of application and how modular it is. In many interactive applications
it only takes a little care to ensure that you can always abort the
current operation and go back to the main event loop as if no problem
ever took place.

I'm working all day long with an IDE that throws at me any kind of
unexpected exceptions; when that happens I usually restart the #@!
thing, but the few times I did try to carry on, I never encountered any
problems.

Cheers,
Nicola Musatti

Motti Lanzkron

Jul 4, 2005, 7:07:18 AM
On 1 Jul 2005 13:02:33 -0400, "Thiago R. Adams"
<thiago...@gmail.com> wrote:

>> Broken preconditions should almost always be handled with an assert
>> and not an exception. An exception will usually cause a great deal of
>> code to be executed before you get a chance to diagnose the problem.
>
> In many cases asserts work as commentary only;
> for example:
>
> void f(pointer* p) {
>     assert(p != 0);
>     p->f();
> }
>
> With exceptions the code will respond to the error.
>
> TC++PL has an example:
>
> template <class X, class A> inline void Assert(A assertion) {
>     if (!assertion) throw X();
> }
>
> I think that the most useful is:
> template <class X, class A> inline void Assert(A assertion) {
>     DebugBreak(); // stops in the debugger
>     if (!assertion) throw X();
> }

We have some macros that do something similar:
They're macros so that the original __LINE__ and __FILE__ are kept and
since a lot of code is COM, errors are indicated by return codes.

Something along the lines of:
#define ASSERT_OR_RETURN(cond, ret) \
    if( cond )                      \
        ;                           \
    else {                          \
        assert(false && #cond);     \
        return ret;                 \
    }

And it's used as:
ASSERT_OR_RETURN( pArg != NULL, E_POINTER);

As a side note, this has drawn some criticism claiming that static
code analyzers can't correctly analyze functions that use such
constructs. Is this true? I'm not familiar with such tools but it
seems that they should have the option of some macro expansion.

Peter Dimov

Jul 4, 2005, 1:43:32 PM
Alf P. Steinbach wrote:
> * Peter Dimov:
>> Alf P. Steinbach wrote:

>>> First, the reason why one would like to 'throw' in this case is
>>> usually not to recover in the sense of continuing normal execution,
>>> but to recover enough to do useful logging, reporting and graceful
>>> exit on an end-user system with no debugger and other programmer's
>>> tools (including no programmer's-level understanding of what goes
>>> on).
>>
>> No recovery is possible after a failed assert. A failed assert means
>> that we no longer know what's going on. Generally logging and
>> reporting should be done at the earliest opportunity; if you attempt
>> to "recover" you may be terminated and no longer be able to log or
>> report.
>
> If "recovery" in the above means continuing on with normal execution,

You said: "one would like to 'throw' in this case" ... "to recover enough to
do useful logging, reporting and graceful exit."

So I assumed that by "recovery" you mean "stack unwinding" (with its usual
side effects), because this is what 'throw' does.

> then the above is a sort of extremist version of my position. I would
> not go that far, especially in light of contrary facts. For example,
> one of the most used programs in the PC world, the Windows Explorer,
> does continue normal execution in this situation simply by restarting
> itself, so with this meaning of "recovery" the above is false.

Normal execution doesn't involve killing the host process, even if it's
automatically restarted after that. Exit+restart is recovery, for a
different definition of "recovery", but it's not performed by throwing an
exception.

> If "recovery" in the above means something generally more reasonable,
> like, cleaning up, then again that flies in the face of established
> fact. E.g., generally open files are closed by the OS whenever a
> process terminates, whatever the cause of the termination, and it's no
> big deal to define that as part of the application. In this sense of
> "recovery" the above therefore says it's impossible to write an OS in
> C++, and that's patently false, too.

No, cleaning up open files after a process dies is not recovery, and it's
not performed by throwing an exception.

> So what does the above mean, if anything?

It means that performing stack unwinding after a failed assert is usually a
bad idea.

Peter Dimov

Jul 4, 2005, 1:45:31 PM
Nicola Musatti wrote:
> Maxim Yegorushkin wrote:
> [...]
>> Why would one want a graceful exit when code is broken, rather than
>> dying as loud as possible leaving a core dump with all state
>> preserved, rather than unwound? std::abort() is a good tool for that.
>
> You and Peter seem to assume that there can be no knowledge about how
> and where the code is broken.

Not really.

Either

(a) you go the "correct program" way and use assertions to verify that your
expectations match the observed behavior of the program, or

(b) you go the "resilient program" way and use exceptions in an attempt to
recover from certain situations that may be caused by bugs.

(a) implies that whenever an assert fails, the program no longer behaves as
expected, so everything you do from this point on is based on _hope_ that
things aren't as bad.

(b) implies that whenever stack unwinding might occur, you must assume that
the conditions that you would've ordinarily tested with an assert do not
hold.

Most people do neither. They write incorrect programs and don't care about
the fact that every stack unwinding must assume a broken program. It's all
wishful thinking. We can't make our programs correct, so why even bother?
Just throw an exception.

Alf P. Steinbach

Jul 4, 2005, 10:28:14 PM
* Peter Dimov:

> > >
> > > No recovery is possible after a failed assert.
>
> [The above] means that performing stack unwinding after a failed
> assert is usually a bad idea.

I didn't think of that interpretation, but OK.

The interpretation, or rather, what you _meant_ to say in the first
place, is an opinion, which makes it more difficult to discuss.

After a failed assert it's known that something, which could be anything
(e.g. full corruption of memory), is wrong. Attempting to execute even
one teeny tiny little instruction might do unimaginable damage. Yet you
think it's all right to not only terminate the process but also to log
things, which involves file handling, as long as one doesn't do a stack
rewind up from the point of the failed assert. This leads me to suspect
that you're confusing a failed assert with a corrupted stack, or that
you think that a failure to clean up 100% might be somehow devastating.
Anyway, an explanation of your opinion would be great, and this time,
please write what you mean, not something entirely different.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


David Abrahams

Jul 5, 2005, 6:52:02 AM
"Peter Dimov" <pdi...@gmail.com> writes:

> Nicola Musatti wrote:
>> Maxim Yegorushkin wrote:
>> [...]
>>> Why would one want a graceful exit when code is broken, rather than
>>> dying as loud as possible leaving a core dump with all state
>>> preserved, rather than unwound? std::abort() is a good tool for that.
>>
>> You and Peter seem to assume that there can be no knowledge about how
>> and where the code is broken.
>
> Not really.
>
> Either
>
> (a) you go the "correct program" way and use assertions to verify that your
> expectations match the observed behavior of the program, or
>
> (b) you go the "resilient program" way and use exceptions in an attempt to
> recover from certain situations that may be caused by bugs.
>
> (a) implies that whenever an assert fails, the program no longer behaves as
> expected, so everything you do from this point on is based on _hope_ that
> things aren't as bad.
>
> (b) implies that whenever stack unwinding might occur, you must assume that
> the conditions that you would've ordinarily tested with an assert do not
> hold.

And while it is possible to do (b) in a principled way, it's much more
difficult than (a), because once you unwind and return to "normal"
code with the usual assumptions about program integrity broken, you
have to either:

1. Test every bit of data obsessively to make sure it's still
reasonable, or

2. Come up with a principled way to decide which kinds of
brokenness you're going to look for and try to circumvent, and
which invariants you're going to assume still hold.

In practice, I think doing a complete job of (1) is really impossible,
so you effectively have to do (2). Note also that once you unwind to
"normal" code, information about the particular integrity check that
failed tends to get lost: all the different throw points unwind into
the same instruction stream, so there really is a vast jungle of
potential problems to consider.

Programming with the basic assumption that the program state might be
corrupt is very difficult, and tends to work against the advantages of
exceptions, cluttering the "normal" flow of control with integrity
tests and attempts to work around the problems. And your program gets
bigger, harder to test and to maintain; if your work is correct, these
tests and workarounds will never be executed at all.

> Most people do neither. They write incorrect programs and don't care
> about the fact that every stack unwinding must assume a broken
> program.

I assume you mean that's the assumption you must make when you throw
in response to failed preconditions.

> It's all wishful thinking. We can't make our programs correct, so
> why even bother? Just throw an exception.

...and make it someone else's problem. Code higher up the call stack
might know how to deal with it, right? ;-)

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Alf P. Steinbach

Jul 5, 2005, 9:03:24 AM
* David Abrahams:

> "Peter Dimov" <pdi...@gmail.com> writes:
>
> > Nicola Musatti wrote:
> >> Maxim Yegorushkin wrote:
> >> [...]
> >>> Why would one want a graceful exit when code is broken, rather than
> >>> dying as loud as possible leaving a core dump with all state
> >>> preserved, rather than unwound? std::abort() is a good tool for that.
> >>
> >> You and Peter seem to assume that there can be no knowledge about how
> >> and where the code is broken.
> >
> > Not really.
> >
> > Either
> >
> > (a) you go the "correct program" way and use assertions to verify that your
> > expectations match the observed behavior of the program, or
> >
> > (b) you go the "resilient program" way and use exceptions in an attempt to
> > recover from certain situations that may be caused by bugs.

[Here responding to Peter Dimov's statement:]

Those are extremes, so the "either" is not very meaningful.

AFAIK the techniques of mathematical proof of program correctness are in
general not used in the industry. One reason is simply that the proofs
(and attendant machinery) tend to be more complex than the programs. Apart
from the work involved, that means a possibly higher chance of errors, for
example from over-generalization being employed as a valid proof technique.

When an assertion fails you have proof that the program isn't correct, and
due to the way we use asserts, an indication that the process should terminate,
so whether (a) or (b) has been employed (and I agree that if one had to choose
between the extremes (a) would be a good choice) is not relevant any more.


> > (a) implies that whenever an assert fails, the program no longer behaves as
> > expected, so everything you do from this point on is based on _hope_ that
> > things aren't as bad.

That is literally correct, but first of all, "the program" is an over-
generalization, because you usually know something much more specific than
that, and secondly, there are degrees of hope, including informed hope.

If you try to execute a halt instruction you're hoping the instruction
space is not corrupted, and further that the OS' handling of illegal
instructions (if any) still works. And I can hear you thinking, those
bad-case scenarios are totally implausible, and even the scenarios
leading someone to try a halt instruction are so implausible that no-one
actually does that. But those scenarios are included in "the program" no
longer behaving as expected, that's what that over-generalization and
absolute -- incorrectly applied -- mathematical logic means.

If you try to terminate the process using a call to something, you're
hoping that this isn't a full stack you're up against, and likewise for
evaluation of any expression whatsoever. Whatever you do, you're doing a
gut-feeling potential cost / clear benefit analysis, and this should in my
opinion be a pragmatic decision, a business decision. It should not be a
decision based on absolute black/white principles thinking where every
small K.O. is equated to a nuclear attack because in both cases you're down.

As an example, the program might run out of handles to GUI objects. In
old Windows that meant that what earlier was very nice graphical
displays suddenly started showing as e.g. white, blank areas. If this is
detected (as it should be) then there's generally nothing that can be done
within this process, so the process should terminate, and that implies
detection by something akin to a C++ 'assert'. A normal exception won't
do, because it might be picked up by some general exception handler. On
the other hand, you'd like that program to clean up. E.g., if it's your
ATM example, you'd like it to eject the user's card before terminating.

And, you'd like it to log and/or report this likely bug, e.g. sending a mail.

And, you don't want to compromise your design by making everything global
just so a common pre-termination handler can do the job.


> > (b) implies that whenever stack unwinding might occur, you must assume that
> > the conditions that you would've ordinarily tested with an assert do not
> > hold.

(b) implies that whenever anything might occur, you must assume that anything
can be screwed up. ;-)


> And while it is possible to do (b) in a principled way, it's much more
> difficult than (a), because once you unwind and return to "normal"
> code with the usual assumptions about program integrity broken, you
> have to either:
>
> 1. Test every bit of data obsessively to make sure it's still
> reasonable, or
>
> 2. Come up with a principled way to decide which kinds of
> brokenness you're going to look for and try to circumvent, and
> which invariants you're going to assume still hold.
>
> In practice, I think doing a complete job of (1) is really impossible,
> so you effectively have to do (2).

[Here responding to David Abrahams' statement:]

I think your points (1) and (2) summarizes approach (b) well, and show
that it's not a technique one would choose if there was a choice.

But as mentioned above, it's an extreme, although in some other
languages (e.g. Java) you get null-pointer exceptions & the like.


> Note also that once you unwind to
> "normal" code, information about the particular integrity check that
> failed tends to get lost: all the different throw points unwind into
> the same instruction stream, so there really is a vast jungle of
> potential problems to consider.

I agree, for the C++ exceptions we do have.

If we did have some kind of "hard" exception supported by the language,
or even just standardized and supported by convention, then the vast jungle
of potential problems that stands in the way of further normal execution
wouldn't matter so much: catching that hard exception at some uppermost
control level you know that the process has to terminate, not continue
with normal execution (which was the problem), and you know what actions
you've designated for that case (also known by throwers, or at least known
to be irrelevant to them), so that's what the code has to attempt to do.


> [snip]


> ...and make it someone else's problem. Code higher up the call stack
> might know how to deal with it, right? ;-)

In the case of a "hard" exception it's not "might", it's a certainty.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


ka...@gabi-soft.fr

Jul 5, 2005, 9:13:05 AM
Alf P. Steinbach wrote:
> * Peter Dimov:

> > > > No recovery is possible after a failed assert.

> > [The above] means that performing stack unwinding after a
> > failed assert is usually a bad idea.

> I didn't think of that interpretation, but OK.

> The interpretation, or rather, what you _meant_ to say in the
> first place, is an opinion, which makes it more difficult to
> discuss.

It's always difficult to discuss a sentence with "usually".
What percentage is "usually"?

My work is almost exclusively on large scale servers, usually on
critical systems. In that field, it is always a mistake to do
anything more than necessary when a program invariant fails; you
back out as quickly as possible, and let the watchdog processes
clean up and restart.

At the client level, I agree that the question is less clear,
although the idea of executing tons of destructors when the
program invariants don't hold sort of scares me even there. As
does the idea that some important information not be displayed
because of the error. For a game, on the other hand, it's no big
deal, and in many cases, of course, you can recover enough to
continue, or at least save the game so it could be restarted.

> After a failed assert it's known that something, which could
> be anything (e.g. full corruption of memory), is wrong.
> Attempting to execute even one teeny tiny little instruction
> might do unimaginable damage.

Well, a no-op is probably safe:-).

> Yet you think it's all right to not only terminate the process
> but also to log things, which involves file handling, as long
> as one doesn't do a stack rewind up from the point of the
> failed assert. This leads me to suspect that you're confusing
> a failed assert with a corrupted stack, or that you think that
> a failure to clean up 100% might be somehow devastating.

I think the idea is that basically, you don't know what stack
unwinding may do, or try to do, because it depends on the global
program state. It's not local, and you have no control over
it. Most of the time, it's probably acceptable to do some
formatting (which, admittedly, may overwrite critical memory if
some pointers are corrupted, but you're not going to do anything
with the memory afterwards), and try to output the results
(which does entail a real risk -- if the file descriptor is
corrupted, you may end up overwriting something you shouldn't).
The point is, if you try to do this from the abort routine,
you know at least exactly what you are trying to do, and can
estimate the risk. Whereas stack unwinding leads you into the
unknown, and you can't estimate the risk.

And of course, there are cases where even the risk of trying to
log the data is unacceptable. You core, the watchdog process
picks up the return status (which under Unix tells you that the
process was terminated by an unhandled signal, and has generated
a core dump), and generates the relevant log entries.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

ka...@gabi-soft.fr

Jul 5, 2005, 9:12:21 AM
Niklas Matthies wrote:
> On 2005-07-03 10:19, Maxim Yegorushkin wrote:
> > On Sat, 02 Jul 2005 14:23:03 +0400, Alf P. Steinbach wrote:

> >>> If the assertion fails when there is no debugger, how do
> >>> you expect the program to recover?

> >> That's actually a good _C++_ question... ;-)

> >> First, the reason why one would like to 'throw' in this
> >> case, which is usually not to recover in the sense of
> >> continuing normal execution, but to recover enough to do
> >> useful logging, reporting and graceful exit on an end-user
> >> system with no debugger and other programmer's tools
> >> (including, no programmer's level understanding of what
> >> goes on).

> > Why would one want a graceful exit when code is broken,
> > rather than dying as loud as possible leaving a core dump
> > with all state preserved, rather than unwound?

> Because the customer expects and demands it. Actually, more
> often than not the customer even demands graceful resumption.

I guess it depends on the customer. None of my customers would
ever have accepted anything but "aborting" anytime we were
unsure of the data. Most of the time, trying to continue the
program after an assertion failed would have been qualified as
"grobe Fahrlässigkeit" -- I think the correct translation is
"criminal negligence".

But I'm not sure that that was the question. My impression
wasn't that people were saying, continue, even if you don't know
what you are doing. My impression was that we were discussing
the best way to shut the program down; basically: with or
without stack walkback. Which can be summarized by something
along the lines of: trying to clean up, but risk doing something
bad, or get out as quickly as possible, with as little risk as
possible, and leave the mess.

My experience (for the most part, in systems which are more or
less critical in some way, and under Unix) is that the operating
system will clean up most of the mess anyway, and that any
attempts should be carefully targeted, to minimize the risk.
Throwing an exception means walking back the stack, which in
turn means executing a lot of unnecessary and potentially
dangerous destructors. I don't think that the risk is that
great, typically, but it is very, very difficult, if not
impossible, to really evaluate. For example, I usually have
transaction objects on the stack. Calling the destructor
without having called commit should normally provoke a roll
back. But if I'm unsure of the global invariants of the
process, it's a risk I'd rather not take; maybe the destructor
will misinterpret some data, and cause a commit, although the
transaction didn't finish correctly. Whereas if I abort, the
connection to the data base is broken (by the OS), and the data
base automatically does its roll back in this case. Why take
the risk (admittedly very small), when a solution with zero risk
exists?
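
(For reference, the kind of transaction object I mean is roughly the
following sketch; Connection is an invented stand-in for a real data
base interface:)

class Connection // stand-in; a real one would talk to the data base
{
public:
    void begin() {}
    void commit() {}
    void rollback() {}
};

class Transaction
{
public:
    explicit Transaction(Connection& c)
        : myConnection(c), myCommitted(false) { myConnection.begin(); }
    void commit() { myConnection.commit(); myCommitted = true; }
    ~Transaction() { if (!myCommitted) myConnection.rollback(); }
private:
    Connection& myConnection;
    bool myCommitted;
    Transaction(Transaction const&);            // not copyable
    Transaction& operator=(Transaction const&); // not assignable
};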

But this is based on my personal experience. I can imagine that
in the case of a light weight graphical client, for example, the
analysis might be different. About all that can go wrong is
that the display is all messed up, and in this case, the user
will kill the process and restart it manually. And of course,
you might luck out, the user might not even notice, and you can
pull one over on him.

Still, if what the program is doing is important, and not just
cosmetic, you must take into account that if the program
invariants don't hold, it may do something wrong. If doing
something wrong can have bad consequences (and not just cosmetic
effects), then you really should limit what you try to do to a
minimum. Regardless of what some naive user might think. In
such cases, walking back the stack calling destructors
represents a significantly greater risk than explicitly
executing a very limited set of targeted clean-up operations.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

David Abrahams

Jul 5, 2005, 1:53:00 PM
al...@start.no (Alf P. Steinbach) writes:

> * Peter Dimov:
>> > >
>> > > No recovery is possible after a failed assert.
>>
>> [The above] means that performing stack unwinding after a failed
>> assert is usually a bad idea.
>
> I didn't think of that interpretation, but OK.
>
> The interpretation, or rather, what you _meant_ to say in the first
> place,

AFAICT that was a *conclusion* based on what Peter had said before.

> is an opinion, which makes it more difficult to discuss.


> After a failed assert it's known that something, which could be anything
> (e.g. full corruption of memory), is wrong. Attempting to execute even
> one teeny tiny little instruction might do unimaginable damage. Yet you
> think it's all right to not only terminate the process but also to log
> things, which involves file handling, as long as one doesn't do a stack
> rewind up from the point of the failed assert.

There are two problems with stack unwinding at that point:

1. It executes more potentially damaging instructions than necessary,
since none of the destructors or catch blocks involved have
anything to do with process termination or logging.

2. The flow of execution proceeds into logic that is usually involved
with the resumption of normal execution, and it's easy to end up
back in code that executes as though everything is still fine.

> This leads me to suspect that you're confusing a failed assert with
> a corrupted stack, or that you think that a failure to clean up 100%
> might be somehow devastating. Anyway, an explanation of your
> opinion would be great, and this time, please write what you mean,
> not something entirely different.

Ouch.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Peter Dimov

Jul 5, 2005, 1:49:28 PM
David Abrahams wrote:

> "Peter Dimov" <pdi...@gmail.com> writes:
>
> > Either
> >
> > (a) you go the "correct program" way and use assertions to verify that your
> > expectations match the observed behavior of the program, or
> >
> > (b) you go the "resilient program" way and use exceptions in an attempt to
> > recover from certain situations that may be caused by bugs.
> >
> > (a) implies that whenever an assert fails, the program no longer behaves as
> > expected, so everything you do from this point on is based on _hope_ that
> > things aren't as bad.
> >
> > (b) implies that whenever stack unwinding might occur, you must assume that
> > the conditions that you would've ordinarily tested with an assert do not
> > hold.
>
> And while it is possible to do (b) in a principled way, it's much more
> difficult than (a), because once you unwind and return to "normal"
> code with the usual assumptions about program integrity broken, you
> have to either:
>
> 1. Test every bit of data obsessively to make sure it's still
> reasonable, or
>
> 2. Come up with a principled way to decide which kinds of
> brokenness you're going to look for and try to circumvent, and
> which invariants you're going to assume still hold.
>
> In practice, I think doing a complete job of (1) is really impossible,
> so you effectively have to do (2).

It's possible to do (b) when you know that the stack unwinding will
completely destroy the potentially corrupted state, and it seems
possible - in theory - to write programs this way.

That is, instead of:

void menu_item_4()
{
    frobnicate( document );
}

one might write

void menu_item_4()
{
    Document tmp( document );

    try
    {
        frobnicate( tmp );
        document.swap( tmp );
    }
    catch( exception const & x )
    {
        // maybe attempt autosave( document ) here
        report_error( x );
    }
}

Even if there are bugs in frobnicate, if it doesn't leave tmp in an
undestroyable state, it's possible to continue.

I still prefer (a), of course. :-)

David Abrahams

Jul 6, 2005, 5:30:29 AM
al...@start.no (Alf P. Steinbach) writes:

>> And while it is possible to do (b) in a principled way, it's much more
>> difficult than (a), because once you unwind and return to "normal"
>> code with the usual assumptions about program integrity broken, you
>> have to either:
>>
>> 1. Test every bit of data obsessively to make sure it's still
>> reasonable, or
>>
>> 2. Come up with a principled way to decide which kinds of
>> brokenness you're going to look for and try to circumvent, and
>> which invariants you're going to assume still hold.
>>
>> In practice, I think doing a complete job of (1) is really impossible,
>> so you effectively have to do (2).
>
> [Here responding to David Abrahams' statement:]
>
> I think your points (1) and (2) summarizes approach (b) well, and show
> that it's not a technique one would choose if there was a choice.

I think that's what Peter meant when he wrote "performing stack
unwinding after a failed assert is usually a bad idea."
                                   ^^^^^^^

> But as mentioned above, it's an extreme
                          ^^^^
What is an extreme?

> although in some other languages (e.g. Java) you get null-pointer
> exceptions & the like.

IMO null-pointer exceptions are a joke; it's a way of claiming that
the language is typesafe and making that sound bulletproof: all you do
is turn programming errors into exceptions with well-defined behavior!
Fantastic! Now my program goes on doing... something... even though
its state might be completely garbled.

>> Note also that once you unwind to "normal" code, information about
>> the particular integrity check that failed tends to get lost: all
>> the different throw points unwind into the same instruction stream,
>> so there really is a vast jungle of potential problems to consider.
>
> I agree, for the C++ exceptions we do have.
>
> If we did have some kind of "hard" exception supported by the
> language, or even just standardized and supported by convention,
> then the vast jungle of potential problems that stands in the way of
> further normal execution wouldn't matter so much: catching that hard
> exception at some uppermost control level you know that the process
> has to terminate, not continue with normal execution (which was the
> problem), and you know what actions you've designated for that case
> (also known by throwers, or at least known to be irrelevant to
> them), so that's what the code has to attempt to do.

And what happens if the corrupted programming state causes a crash
during unwinding (e.g. from a destructor)?

What makes executing the unwinding actions the right thing to do? How
do you know which unwinding actions will get executed at the point
where you detect that your program state is broken?

>> [snip]
>> ...and make it someone else's problem. Code higher up the call stack
>> might know how to deal with it, right? ;-)
>
> In the case of a "hard" exception it's not "might", it's a certainty.

?? I don't care how you flavor the exception; the appropriate recovery
action cannot always be known by the outermost layer of the program.
Furthermore, what the outermost layer of the program knows how to do
becomes irrelevant if there is a crash during unwinding.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


David Abrahams

Jul 6, 2005, 5:59:01 AM
"Peter Dimov" <pdi...@gmail.com> writes:

<snip example that copies program state, modifies, and swaps>

What you've just done -- implicitly -- is to decide which kinds of
brokenness you're going to look for and try to circumvent, and which
invariants you're going to assume still hold. For example, your
strategy assumes that whatever broke invariants in the copy of your
document didn't also stomp on the memory in the original document.
Part of what your strategy does is to increase the likelihood that
your assumptions will be correct, but if you're going to go down the
(b)(2) road in a principled way, you have to recognize where the
limits of your program's resilience are.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Alf P. Steinbach

Jul 6, 2005, 6:03:31 AM
* David Abrahams:

> al...@start.no (Alf P. Steinbach) writes:
>
> > * Peter Dimov:
> >> > >
> >> > > No recovery is possible after a failed assert.
> >>
> >> [The above] means that performing stack unwinding after a failed
> >> assert is usually a bad idea.
> >
> > I didn't think of that interpretation, but OK.
> >
> > The interpretation, or rather, what you _meant_ to say in the first
> > place,
>
> AFAICT that was a *conclusion* based on what Peter had said before.

"It's impossible to do stack unwinding, therefore it's usually a bad
idea to do stack unwinding." I didn't think of that. It's, uh...


> > is an opinion, which makes it more difficult to discuss.
>
>
> > After a failed assert it's known that something, which could be anything
> > (e.g. full corruption of memory), is wrong. Attempting to execute even
> > one teeny tiny little instruction might do unimaginable damage. Yet you
> > think it's all right to not only terminate the process but also to log
> > things, which involves file handling, as long as one doesn't do a stack
> > rewind up from the point of the failed assert.
>
> There are two problems with stack unwinding at that point:
>
> 1. It executes more potentially damaging instructions than necessary,

That is an unwarranted assumption.

In practice the opposite is probably more likely for many classes of
applications, but it does depend: there are situations where stack unwinding
is not advisable, and there are situations where it is advisable.

The decision is just like the judgement call of assert versus
exception versus return value: you judge the severity, the consequences
of this or that way of handling it, what you have time for (:-)), even
what maintenance programmers are likely to understand, etc.


> since none of the destructors or catch blocks involved have
> anything to do with process termination or logging.

Ditto, the above is just assumptions, but you might turn it around
and get a valid statement: _if_ the assumptions above hold, then that
is a situation where stack unwinding would perhaps not be a good idea.


> 2. The flow of execution proceeds into logic that is usually involved
> with the resumption of normal execution, and it's easy to end up
> back in code that executes as though everything is still fine.

Not sure what you mean by "resumption of normal execution", since
destructors are very much oriented towards directly terminating in
such cases. Destructors are the most likely actors that may terminate
the process if further problems manifest themselves. Either by directly
calling abort or terminate, or by throwing (where C++ rules guarantee
termination).

Regarding "code that executes as though everything is still fine":

First of all, most likely everything at higher levels _is_ just as fine,
or not, as it ever was: when you detect that null-pointer argument (say) it
doesn't usually mean more than a simple typo or the like at the calling
site. That goes into the equation for likely good versus bad of doing this
or that. Secondly, destructors are designed to make things fine if they
aren't already: they're cleaners, clean-up routines, and after a destructor
has run successfully there's no longer that class invariant that can be bad,
so by cleaning up you're systematically removing potential badness.


> > This leads me to suspect that you're confusing a failed assert with
> > a corrupted stack, or that you think that a failure to clean up 100%
> > might be somehow devastating. Anyway, an explanation of your
> > opinion would be great, and this time, please write what you mean,
> > not something entirely different.
>
> Ouch.

I wouldn't and didn't put it that way.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Alf P. Steinbach

Jul 6, 2005, 6:40:15 AM
* David Abrahams:

>
> > But as mentioned above, it's an extreme
> ^^^^
> What is an extreme?

Peter's approach (b). Also approach (a), but here the reference was
to approach (b).


> > although in some other languages (e.g. Java) you get null-pointer
> > exceptions & the like.
>
> IMO null-pointer exceptions are a joke; it's a way of claiming that
> the language is typesafe and making that sound bulletproof: all you do
> is turn programming errors into exceptions with well-defined behavior!
> Fantastic! Now my program goes on doing... something... even though
> its state might be completely garbled.

Agreed.


> > If we did have some kind of "hard" exception supported by the
> > language, or even just standardized and supported by convention,
> > then the vast jungle of potential problems that stands in the way of
> > further normal execution wouldn't matter so much: catching that hard
> > exception at some uppermost control level you know that the process
> > has to terminate, not continue with normal execution (which was the
> > problem), and you know what actions you've designated for that case
> > (also known by throwers, or at least known to be irrelevant to
> > them), so that's what the code has to attempt to do.
>
> And what happens if the corrupted programming state causes a crash
> during unwinding (e.g. from a destructor)?

The same as would have happened if we'd initiated that right away: the
difference is that it doesn't have to happen, and generally won't, and
even if it does one might have collected useful info on the way. ;-)


> What makes executing the unwinding actions the right thing to do? How
> do you know which unwinding actions will get executed at the point
> where you detect that your program state is broken?

Both questions are impossible to answer due to built-in assumptions. For
the first question, there is no more "the right thing" than there is "best",
out of context. For the second question, you generally don't know the
unwinding actions, and, well, you know that... ;-)


> >> [snip]
> >> ....and make it someone else's problem. Code higher up the call stack
> >> might know how to deal with it, right? ;-)
> >
> > In the case of a "hard" exception it's not "might", it's a certainty.
>
> ?? I don't care how you flavor the exception; the appropriate recovery
> action cannot always be known by the outermost layer of the program.

When you get to the outermost layer you are (or rather, the program is)
already mostly recovered: you have logged the most basic information (of
course you did that right away, before unwinding), you have removed lots of
potential badness (due to destructors cleaning up), and perhaps as part of
that logged more rich & useful state information, and now all you have to do
is to attempt to do more chancy high level error reporting before terminating
in a hopefully clean way -- if you like to do very dangerous things, as they
evidently do at Microsoft, you could also at this point store descriptions
of the state and attempt a Bird Of Phoenix resurrection, a.k.a. restart.


> Furthermore, what the outermost layer of the program knows how to do
> becomes irrelevant if there is a crash during unwinding.

A crash during unwinding is OK: it's no worse than what you would have had
with no unwinding. Two main potential problems are (1) a hang, which can
occur during cleanup, and (2) unforeseen bad effects such as trashing data
on the disk or committing a transaction erroneously. Such potential problems
will have to be weighed against the probability of obtaining failure data
at all, obtaining rich failure data, pleasing or at least not seriously
annoying the user (in the case of an interactive app), etc., in each case.

Cheers,

- Alf

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


David Abrahams

Jul 6, 2005, 6:01:21 PM
al...@start.no (Alf P. Steinbach) writes:

> * David Abrahams:
>> al...@start.no (Alf P. Steinbach) writes:
>>
>> > * Peter Dimov:
>> >> > >
>> >> > > No recovery is possible after a failed assert.
>> >>
>> >> [The above] means that performing stack unwinding after a failed
>> >> assert is usually a bad idea.
>> >
>> > I didn't think of that interpretation, but OK.
>> >
>> > The interpretation, or rather, what you _meant_ to say in the first
>> > place,
>>
>> AFAICT that was a *conclusion* based on what Peter had said before.
>
> "It's impossible to do stack unwinding, therefore it's usually a bad
> idea to do stack unwinding." I didn't think of that. It's, uh...

You clipped everything but the first sentence of Peter's paragraph,
which makes what he's saying look like a simpleminded tautology, and
now you're ridiculing it. Nice.

>> > After a failed assert it's known that something, which could be anything
>> > (e.g. full corruption of memory), is wrong. Attempting to execute even
>> > one teeny tiny little instruction might do unimaginable damage. Yet you
>> > think it's all right to not only terminate the process but also to log
>> > things, which involves file handling, as long as one doesn't do a stack
>> > rewind up from the point of the failed assert.
>>
>> There are two problems with stack unwinding at that point:
>>
>> 1. It executes more potentially damaging instructions than necessary,
>
> That is an unwarranted assumption.

Let's ignore, for the moment, that the runtime library's unwinding
code is being executed and _its_ invariants may have been violated.

If you arrange your program so that the only automatic objects with
nontrivial destructors you use are specifically designed to do
something when the program state is broken, then you will be executing
the minimal number of potentially damaging instructions. Do you think
that's a likely scenario? I would hate to write programs under a
restriction that I could only use automatic objects with nontrivial
destructors to account for a condition that should never occur. It
would also prevent the use of most libraries.

> In practice the opposite is probably more likely for many classes of
> applications,

Wow, so many classes of applications actually can be written that way? I
presume they wouldn't use the standard (or any other) libraries.

>> 2. The flow of execution proceeds into logic that is usually involved
>> with the resumption of normal execution, and it's easy to end up
>> back in code that executes as though everything is still fine.
>
> Not sure what you mean by "resumption of normal execution"

I mean what happens in most continuously running programs (not
single-pass translators like compilers) after error reporting or
exception translation at the end of a catch block that doesn't end
with a rethrow.

> since destructors are very much oriented towards directly
> terminating in such cases.

Destructors are oriented towards destroying an object, and have no
particular relationship to overall error-handling strategies.

> Destructors are the most likely actors that may terminate the
> process if further problems manifest themselves.

Destructors intentionally taking a program down is a common idiom in
the code you've seen? Can you show me even one example of that in
open source code somewhere? I'm really curious.

> Either by directly calling abort or terminate, or by throwing (where
> C++ rules guarantee termination).

C++ rules don't guarantee termination if a destructor throws, unless
unwinding is already in progress.
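
For example (Thrower is invented for illustration, and the
noexcept(false) is only needed with post-C++11 compilers, where
destructors are noexcept by default):

#include <cstdio>

struct Thrower {
    // noexcept(false) matters only post-C++11; under the C++03
    // rules discussed here, destructors could throw by default.
    ~Thrower() noexcept(false) { throw 42; }
};

void g()
{
    Thrower t;
}   // ~Thrower() throws here; no unwinding is in progress, so the
    // exception simply propagates and can be caught as usual.

int main()
{
    try { g(); }
    catch (int) { std::puts("caught - no termination"); }
    // Had ~Thrower() instead run while the stack was already being
    // unwound for another exception, std::terminate() would be called.
}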

> Regarding "code that executes as though everything is still fine":
>
> First of all, most likely everything at higher levels _is_ just as
> fine,

That is an unwarranted assumption.

> or not, as it ever was: when you detect that null-pointer argument (say) it
> doesn't usually mean more than a simple typo or the like at the calling
> site.

That is an unwarranted assumption.

> That goes into the equation for likely good versus bad of doing this
> or that.

Yes. Those equations are what make (b) difficult. There are no
disciplined ways of understanding the risks of the choices you have to
make.

> Secondly, destructors are designed to make things fine if they
> aren't already: they're cleaners, clean-up routines, and after a
> destructor has run successfully there's no longer that class
> invariant that can be bad, so by cleaning up you're systematically
> removing potential badness.

It's easy to imagine that running a bunch of destructors increases
"badness," e.g. by leaving dangling pointers behind.

>> > This leads me to suspect that you're confusing a failed assert with
>> > a corrupted stack, or that you think that a failure to clean up 100%
>> > might be somehow devastating. Anyway, an explanation of your
>> > opinion would be great, and this time, please write what you mean,
>> > not something entirely different.
>>
>> Ouch.
>
> I wouldn't and didn't put it that way.

I didn't rewrite your text at all. Unless someone is spoofing you,
you put it exactly that way.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


David Abrahams

Jul 6, 2005, 6:02:52 PM
al...@start.no (Alf P. Steinbach) writes:

>> > If we did have some kind of "hard" exception supported by the
>> > language, or even just standardized and supported by convention,
>> > then the vast jungle of potential problems that stands in the way of
>> > further normal execution wouldn't matter so much: catching that hard
>> > exception at some uppermost control level you know that the process
>> > has to terminate, not continue with normal execution (which was the
>> > problem), and you know what actions you've designated for that case
>> > (also known by throwers, or at least known to be irrelevant to
>> > them), so that's what the code has to attempt to do.
>>
>> And what happens if the corrupted programming state causes a crash
>> during unwinding (e.g. from a destructor)?
>
> The same as would have happened if we'd initiated that right away:

Initiated what?

> the difference is that it doesn't have to happen,

Difference between what and what?

> and generally won't,

Based on what do you say that?

> and even if it does one might have collected useful info on the
> way. ;-)

Maybe. As long as you're clear that it's wishful thinking. Also,
there seems to be little good reason to use unwinding to collect that
info. You can establish a chain of reporting frames and traverse that
when your precondition is violated without doing any unwinding (use
TLS to ensure thread safety if you need it).
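
Concretely, such a chain of reporting frames can look like this --
every name is invented, and C++11's thread_local stands in for
whatever TLS facility is at hand:

#include <cstdio>
#include <cstdlib>

struct ReportFrame {
    const char*        what;
    const ReportFrame* next;

    explicit ReportFrame(const char* w) : what(w), next(top)
    { top = this; }

    ~ReportFrame() { top = next; }

    static thread_local const ReportFrame* top;
};

thread_local const ReportFrame* ReportFrame::top = 0;

void precondition_violated()
{
    // Traverse and report *without* unwinding anything, then stop.
    for (const ReportFrame* f = ReportFrame::top; f; f = f->next)
        std::fprintf(stderr, "  in: %s\n", f->what);
    std::abort();
}

void frobnicate(int* p)
{
    ReportFrame frame("frobnicate");
    if (!p) precondition_violated();    // report and abort in place
    *p += 1;
}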

>> What makes executing the unwinding actions the right thing to do? How
>> do you know which unwinding actions will get executed at the point
>> where you detect that your program state is broken?
>
> Both questions are impossible to answer due to built-in assumptions.
> For the first question, there is no more "the right thing" than
> there is "best", out of context.

That's my point exactly. The author of the code where the
precondition violation is detected doesn't _know_ the context, so she
can't know what is appropriate.

> For the second question, you generally don't know the unwinding
> actions, and, well, you know that... ;-)

Yes, also part of my point.

>> >> [snip]
>> >> ....and make it someone else's problem. Code higher up the call stack
>> >> might know how to deal with it, right? ;-)
>> >
>> > In the case of a "hard" exception it's not "might", it's a certainty.
>>
>> ?? I don't care how you flavor the exception; the appropriate recovery
>> action cannot always be known by the outermost layer of the program.
>
> When you get to the outermost layer you are (or rather, the program is)
> already mostly recovered: you have logged the most basic information (of
> course you did that right away, before unwinding),

That's not recovery! There's normally no recovering from broken
invariants, because except for a few places where you second-guess
your entire worldview as a programmer, your whole program has been
written with the assumption that they hold.

> you have removed lots of potential badness (due to destructors
> cleaning up),

That's pretty vague. What kind of "potential badness" do you think
gets removed?

> and perhaps as part of that logged more rich & useful state
> information,

Unwinding is totally unnecessary for that purpose.

> and now all you have to do is to attempt to do more
> chancy high level error reporting before terminating in a hopefully
> clean way -- if you like to do very dangerous things, as they
> evidently do at Microsoft, you could also at this point store
> descriptions of the state and attempt a Bird Of Phoenix
> resurrection, a.k.a. restart.

Again, unwinding is totally unnecessary for that purpose.

>> Furthermore, what the outermost layer of the program knows how to
>> do becomes irrelevant if there is a crash during unwinding.
>
> A crash during unwinding is OK: it's no worse than what you would
> have had with no unwinding.

Of course it is worse if you are postponing anything important until
the outer layer, because the crash will prevent you from getting to
that important action.

AFAICT, the only good reason to unwind is so that you can resume
normal execution. If you're not going to do that, unwinding just
destroys important debugging information and makes your program more
vulnerable to crashes that may occur due to the execution of
noncritical destructors and catch blocks.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Bob Bell

Jul 6, 2005, 6:46:11 PM
Alf P. Steinbach wrote:
> * David Abrahams:

> > Furthermore, what the outermost layer of the program knows how to do
> > becomes irrelevant if there is a crash during unwinding.
>
> A crash during unwinding is OK: it's no worse than what you would have had
> with no unwinding.

You seem to be confusing "crash" and "abort".

Bob

Bob Bell

Jul 6, 2005, 6:46:36 PM
David Abrahams wrote:
> "Peter Dimov" <pdi...@gmail.com> writes:
> > David Abrahams wrote:
> >> "Peter Dimov" <pdi...@gmail.com> writes:
> >>
> >> > Either
> >> >
> >> > (a) you go the "correct program" way and use assertions to verify that your
> >> > expectations match the observed behavior of the program, or
> >> >
> >> > (b) you go the "resilient program" way and use exceptions in an attempt to
> >> > recover from certain situations that may be caused by bugs.

[snip]

> > It's possible to do (b) when you know that the stack unwinding will
> > completely destroy the potentially corrupted state, and it seems
> > possible - in theory - to write programs this way.
>
> <snip example that copies program state, modifies, and swaps>
>
> What you've just done -- implicitly -- is to decide which kinds of
> brokenness you're going to look for and try to circumvent, and which
> invariants you're going to assume still hold. For example, your
> strategy assumes that whatever broke invariants in the copy of your
> document didn't also stomp on the memory in the original document.
> Part of what your strategy does is to increase the likelihood that
> your assumptions will be correct, but if you're going to go down the
> (b)(2) road in a principled way, you have to recognize where the
> limits of your program's resilience are.

And recognize that where those limits are exceeded, you're back to (a)
anyway.

Bob

Gerhard Menzl

Jul 7, 2005, 10:43:30 AM
ka...@gabi-soft.fr wrote:

> My experience (for the most part, in systems which are more or
> less critical in some way, and under Unix) is that the operating
> system will clean up most of the mess anyway, and that any
> attempts should be carefully targeted, to minimize the risk.
> Throwing an exception means walking back the stack, which in
> turn means executing a lot of unnecessary and potentially
> dangerous destructors. I don't think that the risk is that
> great, typically, but it is very, very difficult, if not
> impossible, to really evaluate. For example, I usually have
> transaction objects on the stack. Calling the destructor
> without having called commit should normally provoke a roll
> back. But if I'm unsure of the global invariants of the
> process, it's a risk I'd rather not take; maybe the destructor
> will misinterpret some data, and cause a commit, although the
> transaction didn't finish correctly. Whereas if I abort, the
> connection to the data base is broken (by the OS), and the data
> base automatically does its roll back in this case. Why take
> the risk (admittedly very small), when a solution with zero risk
> exists?

I think the problem with this discussion is that no-one seems to agree
about what we mean by global invariants and what kinds of programs we
are talking about. When flight control software encounters a negative
altitude value, it had better shut down (and, hopefully, let the backup
system take over). On the other hand, a word processor that aborts and
destroys tons of unsaved work just because the spellchecker has met a
violated invariant is just unacceptable.

It is generally agreed that modularity, loose coupling, and
encapsulation are cornerstones of good software design. Provided these
principles are adhered to, I wonder whether global invariants (or
preconditions) that require immediate shutdown when violated are really
as common as this discussion seems to suggest they are.

In my experience, the distinction is rarely that clear-cut, at least not
in interactive, user-centric systems. For example, right now I am
working on the front-end of a multi-user telecommunication system that
needs a central database for some, but not all operations. A corrupted
database table would certainly constitute violated preconditions, yet a
shutdown in such a case would be out of the question. Our customer
insists - justifiably - that operations which do not rely on database
transactions, such as emergency calls, continue to function even if the
database connection is completely broken.


--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".

Alf P. Steinbach

Jul 7, 2005, 8:38:17 PM
* David Abrahams:
> al...@start.no (Alf P. Steinbach) writes:
>
> > * David Abrahams:
> >> al...@start.no (Alf P. Steinbach) writes:
> >>
> >> > * Peter Dimov:
> >> >> > >
> >> >> > > No recovery is possible after a failed assert.
> >> >>
> >> >> [The above] means that performing stack unwinding after a failed
> >> >> assert is usually a bad idea.
> >> >
> >> > I didn't think of that interpretation, but OK.
> >> >
> >> > The interpretation, or rather, what you _meant_ to say in the first
> >> > place,
> >>
> >> AFAICT that was a *conclusion* based on what Peter had said before.
> >
> > "It's impossible to do stack unwinding, therefore it's usually a bad
> > idea to do stack unwinding." I didn't think of that. It's, uh...
>
> You clipped everything but the first sentence of Peter's paragraph,

It so happens that I agree with the literal interpretation of the rest
(although not with the sense it imparts). I.e. there was nothing to
discuss there with Peter, and no need to quote. For completeness, here's
what I clipped and agree literally with, emphasis added:

  A failed assert means that we no longer _know_ what's going on. [Right]

  Generally logging and reporting should be done at the earliest
  opportunity [right again, although what can be logged/reported at
  that early moment, and of what use it can be, is very restricted]; if
  you attempt to "recover" you may be terminated and no longer be able
  to log or report [right, and that holds for anything you do].


> which makes what he's saying look like a simpleminded tautology,

I don't think Peter is simpleminded, quite the opposite, and anyway,
that discussion is off-topic and not one I'd like to participate in.


> and now you're ridiculing it. Nice.

If showing that a statement is incorrect, by quoting the parts it
refers to, is ridicule, then I ridiculed your statement. However,
quoting is normally not considered ridicule. You're off-topic
both regarding Peter's alleged intellectual capacity and my
alleged choice of rhetorical tools.

Cheers,

- Alf

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Peter Dimov

Jul 7, 2005, 9:12:18 PM
Gerhard Menzl wrote:
>
> I think the problem with this discussion is that no-one seems to agree
> about what we mean by global invariants and what kinds of programs we
> are talking about. When flight control software encounters a negative
> altitude value, it had better shut down (and, hopefully, let the backup
> system take over). On the other hand, a word processor that aborts and
> destroys tons of unsaved work just because the spellchecker has met a
> violated invariant is just inacceptable.

The two options aren't "abort and destroy hours of work" and "throw an
exception". The two options are "throw an exception" and "don't throw
an exception".

In particular, nothing prevents the failed assertion handler from attempting
an emergency save, using a different file name (to not clobber the
"last known good" save), a different file format (if the native format
consists of a dump of the data structures and will likely produce an
unreadable file), and a different, extra-paranoid code path. _Then_
abort.
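
A sketch of such a handler; Document, g_document, and the file name
are invented stand-ins, not a real API:

#include <cstdio>
#include <cstdlib>
#include <string>

struct Document { std::string text; };
Document* g_document = 0;

// Extra-paranoid code path: plain stdio, plain text, fixed name.
static bool emergency_save(const Document& d, const char* path)
{
    std::FILE* f = std::fopen(path, "w");
    if (!f) return false;
    bool ok = std::fwrite(d.text.data(), 1, d.text.size(), f)
                  == d.text.size();
    return std::fclose(f) == 0 && ok;
}

void failed_assertion(const char* expr, const char* file, int line)
{
    std::fprintf(stderr, "assertion failed: %s (%s:%d)\n",
                 expr, file, line);

    // A fresh name, so the "last known good" save is not clobbered.
    if (g_document)
        emergency_save(*g_document, "recovered.txt");

    std::abort();   // _then_ abort
}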

> It is generally agreed that modularity, loose coupling, and
> encapsulation are cornerstones of good software design. Providing these
> principles are being adherred to, I wonder whether global invariants (or
> preconditions) that require immediate shutdown when violated are really
> as common as this discussion seems to suggest they are.

Yes, I tried to give an example of that in the other post.
Unfortunately, two important global invariants are "the heap is not
corrupted" and "there are no dangling pointers that are causing
damage", and a violation of those is usually manifested as a violation
of another (possibly local) invariant.

> In my experience, the distinction is rarely that clear-cut, at least not
> in interactive, user-centric systems. For example, right now I am
> working on the front-end of a multi-user telecommunication system that
> needs a central database for some, but not all operations. A corrupted
> database table would certainly constitute violated preconditions, yet a
> shutdown in such a case would be out of the question. Our customer
> insists - justifiably - that operations which do not rely on database
> transactions, such as emergency calls, continue to function even if the
> database connection is completely broken.

A broken database connection, a corrupted table, a corrupted data file
are not logic errors. The number of logic errors in a program is
constant and does not depend on external factors. (Unless the program
itself changes, i.e. you consider the information in the database
"code", a part of the program.)

David Abrahams

Jul 7, 2005, 9:14:39 PM
Gerhard Menzl <gerhar...@hotmail.com> writes:

> I think the problem with this discussion is that no-one seems to agree
> about what we mean by global invariants and what kinds of programs we
> are talking about.

No, that's not the problem, as shown by what you write here:

> When flight control software encounters a negative altitude value,
> it had better shut down (and, hopefully, let the backup system take
> over). On the other hand, a word processor that aborts and destroys
> tons of unsaved work just because the spellchecker has met a
> violated invariant is just unacceptable.

You seem to assume that aborting is the only alternative to unwinding
when a violated invariant is detected. In an interactive application
like a word processor it's usually possible to recover the state of
the document and offer the user an opportunity to save when a violated
invariant is detected. All of that can be done without any
unwinding.

> It is generally agreed that modularity, loose coupling, and
> encapsulation are cornerstones of good software design. Providing
> these principles are being adherred to, I wonder whether global
> invariants (or preconditions) that require immediate shutdown when
> violated are really as common as this discussion seems to suggest
> they are.

a. Nobody's suggesting "immediate" shutdown.

b. I'm not saying that the conditions *require* immediate shutdown.
I'm saying that if you try to continue, making judgements about what
things you can rely on at that point can become very difficult, and
that we don't have a well-developed discipline for doing so. I'm also
saying that dealing with the possibility of broken invariants tends to
complicate and obfuscate regular code, usually to no benefit.

> In my experience, the distinction is rarely that clear-cut, at least
> not in interactive, user-centric systems. For example, right now I
> am working on the front-end of a multi-user telecommunication system
> that needs a central database for some, but not all operations. A
> corrupted database table would certainly constitute violated
> preconditions, yet a shutdown in such a case would be out of the
> question. Our customer insists - justifiably - that operations which
> do not rely on database transactions, such as emergency calls,
> continue to function even if the database connection is completely
> broken.

In that case, a corrupted database table is by definition *not* a
broken precondition. IIUC, you are expected to write software that is
guaranteed to work whether the database table is corrupt or not, and
you seem to accept that challenge. Great! It's similar to writing
software that is robust in the face of invalid user input. If the
user's input is invalid, well, there's some functionality they can't
get to until the input is corrected. Nobody I know would consider
valid user input a precondition in that case.

In your application, calling database table integrity a precondition
is only going to confuse things and make your code more complicated.
Once you understand that database integrity is not a precondition, it
becomes very clear that you need to check for corruption in certain
places and make sure that you do something sensible if you detect it.

An application that continues in the face of broken preconditions is
-- by definition -- going on "a wing and a prayer." It's fine to do
so, as long as you know there are no guarantees at that point. My
favorite example of an appropriate place to hope for the best is a
lighting controller for a rock concert, where it might be better to
keep the lights flashing somehow than to have the stage go dark.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


David Abrahams

Jul 8, 2005, 6:20:20 AM
al...@start.no (Alf P. Steinbach) writes:

> If showing that a statement is incorrect, by quoting the parts it
> refers to, is ridicule, then I ridiculed your statement.

It was not my statement in the first place.

> However, quoting is normally not considered ridicule.

It wasn't your use of quotation that smelled like ridicule to me. It
was

"I didn't think of that. It's, uh..."

> You're off-topic both regarding Peter's alleged intellectual
> capacity

I drew no conclusions about Peter's intellectual capacity nor did I
make claims about your view of said capacity. What I wrote was that
quoting Peter out of context makes his quoted statement seem
simpleminded.

> and my alleged choice of rhetorical tools.

Well I'm willing to consider that I might be off base, but I don't
think I was off-topic. Anyway, if you tell me you meant something
else by all that, I'm willing to forget it. What about the rest of my
post?

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Francis Glassborow

Jul 8, 2005, 9:23:27 AM
In article <uslyqq...@boost-consulting.com>, David Abrahams
<da...@boost-consulting.com> writes

>You seem to assume that aborting is the only alternative to unwinding
>when a violated invariant is detected. In an interactive application
>like a word processor it's usually possible to recover the state of
>the document and offer the user an opportunity to save when a violated
>invariant is detected. All of that can be done without any
>unwinding.

Indeed, and the point that seems to be missed is that the options
aren't just abort or throw. Another viable option is to call an error
handling function, which can, for example, do essential clean-up before
aborting or throwing.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Alf P. Steinbach

Jul 8, 2005, 11:03:08 AM
* Francis Glassborow:
> * David Abrahams:

> > You seem to assume that aborting is the only alternative to unwinding
> > when a violated invariant is detected. In an interactive application
> > like a word processor it's usually possible to recover the state of
> > the document and offer the user an opportunity to save when a violated
> > invariant is detected. All of that can be done without any
> > unwinding.
>
> Indeed and the point that seems to be being missed is that the options
> aren't abort or throw. Another viable option is to call an error
> handling function, which can, for example, do essential clean-up before
> aborting or throwing.

To do that open documents (say) must be available to the clean-up code,
and that means globally accessible structures or registering clean-up
actions with some singleton. Both suffer from the same problems wrt. to
possible broken global state as unwinding does. And they add code that
is more error-prone because, first, it's more code, and second, because
it's invasive code, and third, because there's no (under assumption of
valid higher level states) guarantee of execution, as there is with RAII.

And, it's reasonable and common (it even has built-in language support)
to do recursive clean-up when an object is destroyed. But ensuring
destruction of objects can be difficult when the stack is not unwound.
So with no stack unwinding the objects may have to be designed to be able
to do total clean-up without being destroyed, which effectively means adding
zombie states.

Summing up: avoiding stack unwinding while insisting on clean-up means
globals, invasive support code, probably zombie states, and probably more
that I didn't think of right here, and it's got the same problems wrt.
broken invariants at higher levels as stack unwinding does. Nothing seems
to be _gained_ by that centralized approach. And much seems to be lost.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


ka...@gabi-soft.fr

Jul 8, 2005, 9:37:53 PM
Alf P. Steinbach wrote:
> * Francis Glassborow:
> > * David Abrahams:
> > > You seem to assume that aborting is the only alternative
> > > to unwinding when a violated invariant is detected. In an
> > > interactive application like a word processor it's usually
> > > possible to recover the state of the document and offer
> > > the user an opportunity to save when a violated invariant
> > > is detected. All of that can be done without any
> > > unwinding.

> > Indeed and the point that seems to be being missed is that
> > the options aren't abort or throw. Another viable option is
> > to call an error handling function, which can, for example,
> > do essential clean-up before aborting or throwing.

> To do that open documents (say) must be available to the
> clean-up code, and that means globally accessible structures
> or registering clean-up actions with some singleton. Both
> suffer from the same problems wrt. to possible broken global
> state as unwinding does. And they add code that is more
> error-prone because, first, it's more code, and second,
> because it's invasive code, and third, because there's no
> (under assumption of valid higher level states) guarantee of
> execution, as there is with RAII.

The difference is that in this case, you are doing a targeted
set of clean-up actions. You know exactly what you are trying
to do, and can evaluate the risk. When you throw, you don't
really know what objects might be on the stack between you and
where ever the exception is caught. This makes evaluating the
risk a lot more difficult.

And of course, if necessary, calling specialized code means that
the code you actually call can take the uncertainty of the
global state into consideration. As a simple example, I have a
class which represents temporary files -- it contains the name
of the file (in an std::string), and calls remove in its
destructor, using the name it has, with no further checks. In
an emergency clean-up action, however, I would doubtlessly add a
few coherency checks -- that the name is in the directory (or a
directory) where I normally create temporary files, for
example. So that if the name has been corrupted, I won't
accidentally destroy something critical.
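
In code, the difference might look like this (the TempFile class and
the directory prefix are illustrative):

#include <cstdio>
#include <string>

class TempFile {
    std::string name_;
public:
    explicit TempFile(const std::string& name) : name_(name) {}

    // Normal destructor: trusts its own state, no further checks.
    ~TempFile() { std::remove(name_.c_str()); }

    // Targeted emergency clean-up: global state is suspect, so check
    // that the name still looks like one of ours before removing.
    void emergency_cleanup() const
    {
        const std::string dir = "/var/tmp/myapp/";
        if (name_.compare(0, dir.size(), dir) == 0
                && name_.find("..") == std::string::npos)
            std::remove(name_.c_str());
        // Otherwise the name may be corrupted; better to leak a
        // temporary file than to delete something critical.
    }
};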

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

David Abrahams

Jul 8, 2005, 9:38:48 PM
al...@start.no (Alf P. Steinbach) writes:

> * Francis Glassborow:
>> * David Abrahams:
>> > You seem to assume that aborting is the only alternative to unwinding
>> > when a violated invariant is detected. In an interactive application
>> > like a word processor it's usually possible to recover the state of
>> > the document and offer the user an opportunity to save when a violated
>> > invariant is detected. All of that can be done without any
>> > unwinding.
>>
>> Indeed and the point that seems to be being missed is that the options
>> aren't abort or throw. Another viable option is to call an error
>> handling function, which can, for example, do essential clean-up before
>> aborting or throwing.
>
> To do that open documents (say) must be available to the clean-up code,
> and that means globally accessible structures or registering clean-up
> actions with some singleton.

Guess what? You already did that, because your interactive document
editor supports undo (doesn't it?)

When I wrote desktop document editing software, upon the detection of
a fatal error it was trivial to ask the undo mechanism to restore any
document state that was currently being modified and give the user an
opportunity to save the open documents (under new names, of course).
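
In outline it can be as simple as this -- every name below is
invented, purely to show the shape:

#include <cstddef>
#include <vector>

struct UndoStack   { void rollback_pending(); };
struct Document    { UndoStack& undo_stack(); };

struct Application {
    std::vector<Document*>& open_documents();
    void prompt_save_as_copy(Document&);   // new name: never clobber
    void report_and_terminate();           // the last good save
};

void on_fatal_error(Application& app)
{
    // No stack unwinding: go straight at the state that matters.
    std::vector<Document*>& docs = app.open_documents();
    for (std::size_t i = 0; i < docs.size(); ++i) {
        docs[i]->undo_stack().rollback_pending();
        app.prompt_save_as_copy(*docs[i]);
    }
    app.report_and_terminate();
}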

> Both suffer from the same problems wrt. to possible broken global
> state as unwinding does.

The problems aren't quite the same, because unwinding touches global
state that doesn't need to be touched, while the other strategies go
directly to the state that needs to be worked on and leaves everything
else alone.

> And they add code that is more error-prone
> because, first, it's more code,

Nope; you had to write it anyway.

> and second, because it's invasive code,

Did I mention that you had to write it anyway? ;-)

> and third, because there's no (under assumption of valid higher
> level states) guarantee of execution, as there is with RAII.

Not sure what you had in mind here.

> And, it's reasonable and common (it even has built-in language support)
> to do recursive clean-up when an object is destroyed.

What does an object being destroyed have to do with anything?

> But ensuring destruction of objects can be difficult when the stack
> is not unwound. So with no stack unwinding the objects may have to
> be designed to be able to do total clean-up without being destroyed,
> which effectively means adding zombie states.

No, you're going to stop the program anyway, which means the objects
die automatically.

> Summing up: avoiding stack unwinding while insisting on clean-up
> means globals, invasive support code, probably zombie states, and
> probably more that I didn't think of right here,

Have you ever tried to do it, or are you just speculating? Because it
worked for me, and I didn't experience any problems with it.

> and it's got the same problems wrt. broken invariants at higher
> levels as stack unwinding does. Nothing seems to be _gained_ by
> that centralized approach.

Centralized?


--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

Alf P. Steinbach

Jul 9, 2005, 4:45:47 AM
[Addressing just two basic points]

* David Abrahams -> Alf P. Steinbach:


>
> > and third, because there's no (under assumption of valid higher
> > level states) guarantee of execution, as there is with RAII.
>
> Not sure what you had in mind here.

With exceptions the client code that needs cleanup doesn't need to take
any explicit extra action to have that cleanup performed. That's one less
thing that can go wrong in the initial coding, and that's one main reason
why exceptions are used for less fatal circumstances. For example, if you
create a huge temporary file (as someone else recently used as an example in
this thread), then you can guard against exceptions by using an automatic
object with a suitable destructor,

TempFile f(aLargeSize);

and when the code is all exception based that simple declaration is all
that's needed.

However when cleanup for a fatal error is centralized (to be run at the
detection call-chain level) you have to adapt the TempFile class to support
that centralized clean-up, or do something extra each place a TempFile is
used, to hook it up with the clean-up, or not clean up.

Ideally the destructor should be able to differentiate between an ordinary
exception (where destruction failure could be handled by throwing a "hard"
exception) and a "hard" exception (where it's known the process is going to
terminate anyway, no special action except logging needed on failure). But
as it is a C++ destructor doesn't even know whether it's invoked via
exception-induced stack unwinding or not. As with most other high-level
concepts, e.g. modules, where we have to use the preprocessor, the language
support isn't there, and the concept must be emulated via conventions.


> > And, it's reasonable and common (it even has built-in language support)
> > to do recursive clean-up when an object is destroyed.
>
> What does an object being destroyed have to do with anything?
>
> > But ensuring destruction of objects can be difficult when the stack
> > is not unwound. So with no stack unwinding the objects may have to
> > be designed to be able to do total clean-up without being destroyed,
> > which effectively means adding zombie states.
>
> No, you're going to stop the program anyway, which means the objects
> die automatically.

There's no concept of object death in the standard. ;-)

Assuming that you mean those C++ objects are going to disappear (no more
process, no more objects): yes, and that disappearance nearly without any
trace, leaving behind a mess and little or no information about the detailed
reasons, is what we're trying to avoid.

C++ object destruction is more than internal-to-the-process object
disappearance. C++ object destruction is recursive, and often involves
external resources. To invoke that recursive destruction with cleaning up
of external resources and possibly logging of states, etc., the objects have
to be destroyed via the C++ mechanisms, not just disappeared along with the
process.

But ensuring C++ destruction of objects can be difficult when the stack
is not unwound.

So with no stack unwinding the objects may have to be designed to be able to
do total clean-up without being destroyed, which effectively means adding
zombie states. The TempFile object above is a bad example in this respect
because a file object typically has a possible zombie state anyway, at least
if it's designed to deal with errors. But replace "file" with anything.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


Dave Harris

Jul 9, 2005, 2:28:30 PM
gerhar...@hotmail.com (Gerhard Menzl) wrote (abridged):

> When flight control software encounters a negative altitude value,
> it had better shut down (and, hopefully, let the backup system
> take over). On the other hand, a word processor that aborts and
> destroys tons of unsaved work just because the spellchecker has met a
> violated invariant is just inacceptable.

I think your real problem there is that you have tons of unsaved work. You
can cope with program bugs, maybe, but you probably can't cope with O/S
crashes or power cuts so you need a more general strategy.

My employers write word processors. Our approach is to write the document
to a temporary file every 10 or so minutes. We delete it during a normal
shutdown and also after a successful user-save. Any documents found during
start-up are then unsaved work lost due to a crash, and we ask the user if
she wants them restored.

This mitigates the problem. They never lose more than 10 minutes work.
That's acceptable because crashes should be rare (if they are a daily
occurrence we are already in big trouble). We are confident we can restore
the document because we had a known sane state when it was written out.
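
The mechanics are simple enough to sketch (the file name and details
are invented):

#include <cstdio>
#include <string>

const char* const kAutosavePath = "autosave.tmp";

void autosave_tick(const std::string& document_text)  // every ~10 min
{
    if (std::FILE* f = std::fopen(kAutosavePath, "w")) {
        std::fwrite(document_text.data(), 1, document_text.size(), f);
        std::fclose(f);
    }
}

void on_clean_shutdown_or_user_save()
{
    std::remove(kAutosavePath);   // so its mere presence at start-up
}                                 // means "we crashed"

bool crashed_last_time()          // at start-up: offer to restore?
{
    if (std::FILE* f = std::fopen(kAutosavePath, "r")) {
        std::fclose(f);
        return true;
    }
    return false;
}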

-- Dave Harris, Nottingham, UK.

David Abrahams

Jul 9, 2005, 2:33:47 PM
al...@start.no (Alf P. Steinbach) writes:

> [Addressing just two basic points]
>
> * David Abrahams -> Alf P. Steinbach:
>>
>> > and third, because there's no (under assumption of valid higher
>> > level states) guarantee of execution, as there is with RAII.
>>
>> Not sure what you had in mind here.
>
> With exceptions the client code that needs cleanup doesn't need to take
> any explicit extra action to have that cleanup performed. That's one less
> thing that can go wrong in the initial coding, and that's one main reason
> why exceptions are used for less fatal circumstances.

That's true for things that need to be cleaned up "anyway" and are
allocated on the stack or managed by stack objects, but that leaves
out a lot. For example, your argument doesn't work for the particular
case you cited: interactive document editing software. The document
is not going to be "cleaned up" in the normal course of events.

> For example, if you create a huge temporary file (as someone else
> recently used as an example in this thread), then you can guard
> against exceptions by using an automatic object with a suitable
> destructor,
>
> TempFile f(aLargeSize);
>
> and when the code is all exception based that simple declaration is all
> that's needed.

That's true, but in my opinion temporary files fall into a separate
class of resource -- those that need to be cleaned up even in case of
fatal errors -- and that class should be dealt with by a separate
mechanism. If you are going to try to do something reasonable in the
face of programming errors, you probably need to deal with the case
where an exception specification is violated, or where an exception
gets thrown from a destructor during unwinding, anyway. So these
kinds of cleanups should be registered with atexit (and unregistered
when the corresponding stack object is destroyed) or in the
terminate() handler[1].
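
A sketch of that registration scheme -- TempFileGuard is an invented
name, and note that atexit handlers run on a normal exit but not on
abort, hence also hooking terminate():

#include <cstdio>
#include <cstdlib>
#include <exception>
#include <set>
#include <string>

static std::set<std::string>* g_temp_files;

static void remove_leftover_temp_files()
{
    if (!g_temp_files) return;
    for (std::set<std::string>::iterator i = g_temp_files->begin();
            i != g_temp_files->end(); ++i)
        std::remove(i->c_str());
}

static void cleaning_terminate()
{
    remove_leftover_temp_files();
    std::abort();
}

struct TempFileGuard {
    std::string name;

    explicit TempFileGuard(const std::string& n) : name(n)
    {
        if (!g_temp_files) {                 // first use: hook up
            g_temp_files = new std::set<std::string>;
            std::atexit(remove_leftover_temp_files);
            std::set_terminate(cleaning_terminate);
        }
        g_temp_files->insert(name);
    }

    ~TempFileGuard()
    {
        g_temp_files->erase(name);   // "unregistered" when the stack
        std::remove(name.c_str());   // object is destroyed normally
    }
};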

> However when cleanup for a fatal error is centralized (to be run at the
> detection call-chain level)

I still don't understand what you mean by "centralized."

> you have to adapt the TempFile class to support that centralized
> clean-up, or do something extra each place a TempFile is used, to
> hook it up with the clean-up, or not clean up.

Yep, exactly.

> Ideally the destructor should be able to differentiate between an ordinary
> exception (where destruction failure could be handled by throwing a "hard"
> exception) and a "hard" exception (where it's known the process is going to
> terminate anyway, no special action except logging needed on failure). But
> as it is a C++ destructor doesn't even know whether it's invoked via
> exception-induced stack unwinding or not.

Actually it does. That's what uncaught_exception() tells you. The
only problem with uncaught_exception is that it doesn't tell you when
you're in a

catch(...) { ... throw; }

block[2], but it does tell the truth: no stack unwinding is in progress.
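
For instance -- std::uncaught_exception() is the C++98 interface;
C++17 deprecated it in favour of std::uncaught_exceptions(), and
C++20 removed it:

#include <cstdio>
#include <exception>

struct Probe {
    ~Probe()
    {
        if (std::uncaught_exception())
            std::puts("destroyed during stack unwinding");
        else
            std::puts("destroyed normally");
    }
};

void f() { Probe p; throw 1; }    // unwinding: first message

int main()
{
    { Probe p; }                  // normal exit: second message
    try { f(); } catch (int) {}
}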

> As with most other high-level concepts, e.g. modules, where we have
> to use the preprocessor,

Whoa, preprocessor??! How is that relevant here?

> the language support isn't there, and the concept must be emulated
> via conventions.

You can easily support this with a simple library (in fact I've been
meaning to write one for this purpose for years). I wouldn't call
using a library "emulation via convention."

>> > And, it's reasonable and common (it even has built-in language support)
>> > to do recursive clean-up when an object is destroyed.
>>
>> What does an object being destroyed have to do with anything?

??

>> > But ensuring destruction of objects can be difficult when the stack
>> > is not unwound. So with no stack unwinding the objects may have to
>> > be designed to be able to do total clean-up without being destroyed,
>> > which effectively means adding zombie states.
>>
>> No, you're going to stop the program anyway, which means the objects
>> die automatically.
>
> There's no concept of object death in the standard. ;-)
>
> Assuming that you mean, those C++ objects are going to disappear: no more
> process, no more objects.

There's no concept of "process" in the standard ;-)
Can we just speak English here? Yes, that's what I mean.

> Yes, and that disappearance nearly without any trace, leaving behind
> a mess

Only a small fraction of objects on the stack in a typical program are
responsible for any mess that might be left.

> and little or no information about the detailed reasons,
> is what we're trying to avoid.

The detailed reasons generally have nothing to do with most of what
was on the stack, and you're not going to instrument all of your
classes to dump their state on destruction anyway.

> C++ object destruction is more than internal-to-the-process object
> disappearance. C++ object destruction is recursive, and often involves
> external resources. To invoke that recursive destruction with cleaning up
> of external resources and possibly logging of states, etc.,

Ah, but you're not going to log unconditionally, are you? You don't
want to dump reams of log information for the normal control flow
case! So you need global state, too, if only to decide whether to log
something.

> the objects have to be destroyed via the C++ mechanisms, not just
> disappeared along with the process.

No, they clearly don't. We can see that because many successful C++
programs exit without falling off the end of main(), some even in the
normal case. Only _crucial_ cleanups need to be performed when
aborting due to a broken invariant.

> But ensuring C++ destruction of objects can be difficult when the
> stack is not unwound.
>
> So with no stack unwinding the objects may have to be designed to be able to
> do total clean-up without being destroyed, which effectively means adding
> zombie states.

I know what you mean, but in practice it doesn't especially matter,
because at that point we are on a very direct train to termination.

> The TempFile object above is a bad example in this respect because a
> file object typically has a possible zombie state anyway, at least
> if it's designed to deal with errors. But replace "file" with
> anything.

"Balloon?"

Sorry, I can't imagine a crucial resource for which a manager object
should not have an "empty" state.

Footnotes:
[1] I'm being a little fuzzy about exactly
where because I'd need to consult the standard to make sure I got
everything right

[2] We can argue later about whether those blocks are a good idea.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Gerhard Menzl

Jul 15, 2005, 8:41:45 AM
Peter Dimov wrote:

> The two options aren't "abort and destroy hours of work" and "throw an
> exception". The two options are "throw an exception" and "don't throw
> an exception".
>
> In particular, nothing prevents the failed assertion handler from
> attempting an emergency save, using a different file name (to not clobber
> the "last known good" save), a different file format (if the native
> format consists of a dump of the data structures and will likely
> produce an unreadable file), and a different, extra-paranoid code
> path. _Then_ abort.

How can a global handler obtain the necessary context information to
perform the emergency save?

> Yes, I tried to give an example of that in the other post.
> Unfortunately, two important global invariants are "the heap is not
> corrupted" and "there are no dangling pointers that are causing
> damage", and a violation of those is usually manifested as a violation
> of another (possibly local) invariant.

I agree that in case of a corrupted heap (or stack), there is no point
in continuing. But there are many local invariants that do not affect
the global program state and thus hardly warrant an abort. How would you
handle those?

> A broken database connection, a corrupted table, a corrupted data file
> are not logic errors. The number of logic errors in a program is
> constant and does not depend on external factors. (Unless the program
> itself changes, i.e. you consider the information in the database
> "code", a part of the program.)

Sort of. A considerable part of the program logic rests in stored
procedures in the database itself.

--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


Gerhard Menzl

Jul 15, 2005, 10:48:24 AM
David Abrahams wrote:


> You seem to assume that aborting is the only alternative to unwinding
> when a violated invariant is detected. In an interactive application
> like a word processor it's usually possible to recover the state of
> the document and offer the user an opportunity to save when a violated
> invariant is detected. All of that can be done without any
> unwinding.

No, I was referring to an earlier posting of yours where you wrote:

> If preconditions are broken, your program state is broken, by
> definition. Trying to recover is generally ill-advised.

and

> Stop execution so I can debug the program. Good!

My understanding is that recovery without unwinding requires a separate,
handcrafted mechanism. Otherwise, how would you obtain the necessary
context that is normally not accessible in a global, low-level function
like an assertion handler?

You argued (elsewhere) that the Undo feature in a word processor
provides such a mechanism (or a substantial building block of one)
anyway. Well, what about applications that don't have an Undo feature?

> In that case, a corrupted database table is by definition *not* a
> broken precondition. IIUC, you are expected to write software that is
> guaranteed to work whether the database table is corrupt or not, and
> you seem to accept that challenge. Great! It's similar to writing
> software that is robust in the face of invalid user input. If the
> user's input is invalid, well, there's some functionality they can't
> get to until the input is corrected. Nobody I know would consider
> valid user input a precondition in that case.

I would not compare user input, which is always unpredictable, with the
state of a database. If, as in my case, a database table is accessed
exclusively by my application, which reads in data that have been
written by itself during earlier sessions, corrupt data are closer to a
broken invariant (or precondition) than to invalid user input.

> In your application, calling database table integrity a precondition
> is only going to confuse things and make your code more complicated.
> Once you understand that database integrity is not a precondition, it
> becomes very clear that you need to check for corruption in certain
> places and make sure that you do something sensible if you detect it.

I think it depends on whether you regard the database as an integral
part of your system or something external. In a sense, it is both. An
error caused by someone tinkering manually with the database, or by a
broken network connection is external, but an error caused by your own
application writing garbage that will lead to confusion during the next
run is more like a logical error.

--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


Gerhard Menzl
Jul 15, 2005, 10:47:01 AM

Dave Harris wrote:

> I think your real problem there is that you have tons of unsaved work.
> You can cope with program bugs, maybe, but you probably can't cope
> with O/S crashes or power cuts so you need a more general strategy.
>
> My employers write word processors. Our approach is to write the
> document to a temporary file every 10 or so minutes. We delete it
> during a normal shutdown and also after a successful user-save. Any
> documents found during start-up are then unsaved work lost due to a
> crash, and we ask the user if she wants them restored.
>
> This mitigates the problem. They never lose more than 10 minutes' work.
> That's acceptable because crashes should be rare (if they are a daily
> occurrence we are already in big trouble). We are confident we can
> restore the document because we had a known sane state when it was
> written out.
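
A minimal sketch of that scheme, with hypothetical names
(autosave_path(), save_document()) standing in for the real code:

    #include <cstdio>
    #include <string>

    struct Document;
    bool save_document(const Document& doc, const std::string& path);
    const std::string& autosave_path();  // e.g. a per-user temp file

    // Called from a timer, roughly every ten minutes.
    void on_autosave_timer(const Document& doc)
    {
        save_document(doc, autosave_path());
    }

    // Called on normal shutdown and after a successful user save.
    void discard_autosave()
    {
        std::remove(autosave_path().c_str());
    }

    // Called at start-up: a leftover autosave file means we crashed,
    // so we ask the user whether to restore it.
    bool have_unsaved_work_from_crash()
    {
        if (std::FILE* f = std::fopen(autosave_path().c_str(), "rb")) {
            std::fclose(f);
            return true;
        }
        return false;
    }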

Automated saving is certainly a good idea, but it does not really
address my point, for which I merely used the word processor as an
example. You can do a lot of typing, drawing, or whatever in ten
minutes. Destroying that work without at least attempting to save it
upon a failed assertion is not going to enhance your reputation as a
software vendor.

--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


Peter Dimov
Jul 15, 2005, 11:05:56 AM

Gerhard Menzl wrote:
> Peter Dimov wrote:
>
>> The two options aren't "abort and destroy hours of work" and "throw
>> an exception". The two options are "throw an exception" and "don't
>> throw an exception".
>>
>> In particular, nothing prevents the failed assertion handler from
>> attempting an emergency save, using a different file name (to not
>> clobber the "last known good" save), a different file format (if the
>> native format consists of a dump of the data structures and will
>> likely produce an unreadable file), and a different, extra-paranoid
>> code path. _Then_ abort.
>
> How can a global handler obtain the necessary context information to
> perform the emergency save?

And how is throwing an exception and postponing the emergency save until
unwinding is complete better at obtaining that context information?

>> Yes, I tried to give an example of that in the other post.
>> Unfortunately, two important global invariants are "the heap is not
>> corrupted" and "there are no dangling pointers that are causing
>> damage", and a violation of those is usually manifested as a
>> violation of another (possibly local) invariant.
>
> I agree that in case of a corrupted heap (or stack), there is no point
> in continuing. But there are many local invariants that do not affect
> the global program state and thus hardly warrant an abort. How would
> you handle those?

The problem is that a corrupted heap or a dangling pointer can easily break
local invariants. When you see a broken local invariant, it is not safe to
assume that it is caused by a local error that will go away when the stack
is unwound.

>> A broken database connection, a corrupted table, a corrupted data
>> file are not logic errors. The number of logic errors in a program is
>> constant and does not depend on external factors. (Unless the program
>> itself changes, i.e. you consider the information in the database
>> "code", a part of the program.)
>
> Sort of. A considerable part of the program logic rests in stored
> procedures in the database itself.

Well, the good thing here is that a broken database invariant can't affect
the other program invariants because they don't share the same address
space, so it might be possible for the part of the program that doesn't need
access to the database to function normally.

Alf P. Steinbach
Jul 15, 2005, 12:10:00 PM

* Peter Dimov:

>
> And how is throwing an exception and postponing the emergency save until
> unwinding is complete better at obtaining that context information?

The "and..." seems to have no connection to earlier thread content.


> The problem is that a corrupted heap or a dangling pointer can easily break
> local invariants. When you see a broken local invariant, it is not safe to
> assume that it is caused by a local error that will go away when the stack
> is unwound.

Correct but hardly relevant: nothing is 100% safe to assume at any time, and
generalizing from "can theoretically be hit in the head by a meteorite" to
"should not ever be outdoors" isn't what I'd call practical. Also, it's
inconsistent with your position. It implies that nothing should be done.


> Well, the good thing here is that a broken database invariant can't affect
> the other program invariants because they don't share the same address
> space,

Sorry, but that's incorrect.

Paraphrasing what you wrote just a sentence or two earlier, when you see a
broken local invariant it's not guaranteed that it's caused by a local
error, and depending on the kind of invariant and other context, it may in
some cases be most likely caused by a non-local error.

A broken database invariant can have been caused by a broken invariant
elsewhere in the client program, and/or it can have caused broken invariants
elsewhere in the client program, and/or it can come to cause such.


> so it might be possible for the part of the program that doesn't need
> access to the database to function normally.

Correct but hardly relevant: that's possible anyway.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


David Abrahams
Jul 15, 2005, 12:10:22 PM

Gerhard Menzl <gerhar...@hotmail.com> writes:

> David Abrahams wrote:
>
>
>> You seem to assume that aborting is the only alternative to unwinding
>> when a violated invariant is detected. In an interactive application
>> like a word processor it's usually possible to recover the state of
>> the document and offer the user an opportunity to save when a violated
>> invariant is detected. All of that can be done without any
>> unwinding.
>
> No, I was referring to an earlier posting of yours where you wrote:
>
>> If preconditions are broken, your program state is broken, by
>> definition. Trying to recover is generally ill-advised.
>
> and
>
>> Stop execution so I can debug the program. Good!
>
> My understanding is that recovery without unwinding requires a separate,
> handcrafted mechanism.

Who said anything about recovery? I am talking about stopping
execution at the earliest possible moment when an invariant violation
is detected.

> Otherwise, how would you obtain the necessary context that is
> normally not accessible in a global, low-level function like an
> assertion handler?

I don't understand the question.

> You argued (elsewhere) that the Undo feature in a word processor
> provides such a mechanism (or a substantial building block of one)
> anyway. Well, what about applications that don't have an Undo feature?

I guess you'd better write one ;-)

Imagine you wrote a word processor without Undo. You wouldn't really
expect stack unwinding to automatically undo any edits in progress,
would you?

If you have crucial program state that needs to be restored (and,
e.g. written to a file) before exiting, you need a mechanism to handle
it.

>> In that case, a corrupted database table is by definition *not* a
>> broken precondition. IIUC, you are expected to write software that is
>> guaranteed to work whether the database table is corrupt or not, and
>> you seem to accept that challenge. Great! It's similar to writing
>> software that is robust in the face of invalid user input. If the
>> user's input is invalid, well, there's some functionality they can't
>> get to until the input is corrected. Nobody I know would consider
>> valid user input a precondition in that case.
>
> I would not compare user input, which is always unpredictable, with the
> state of a database. If, as in my case, a database table is accessed
> exclusively by my application, which reads in data that have been
> written by itself during earlier sessions, corrupt data are closer to a
> broken invariant (or precondition) than to invalid user input.

A precondition violation is -- by definition -- unrecoverable.
http://en.wikipedia.org/wiki/Precondition

If you want to use some other definition where it's a fuzzier notion,
be my guest, but then we are going to be talking at cross-purposes. I
also warn you that the term "precondition" loses almost all of its
power when you do that.

>> In your application, calling database table integrity a precondition
>> is only going to confuse things and make your code more complicated.
>> Once you understand that database integrity is not a precondition, it
>> becomes very clear that you need to check for corruption in certain
>> places and make sure that you do something sensible if you detect it.
>
> I think it depends on whether you regard the database as an integral
> part of your system or something external.

No. If your program is expected to be resilient against condition ~X,
then X is not a precondition. Do yourself a favor and use the clear
and accepted terminology. Your programs will benefit.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Alf P. Steinbach
Jul 15, 2005, 9:00:01 PM

* David Abrahams:

>
> A precondition violation is -- by definition -- unrecoverable.
> http://en.wikipedia.org/wiki/Precondition
>
> If you want to use some other definition where it's a fuzzier notion,
> be my guest, but then we are going to be talking at cross-purposes. I
> also warn you that the term "precondition" loses almost all of its
> power when you do that.

Sorry, but that's incorrect.

As a simple counter-example, write a Basic interpreter in C++, then when a
precondition fails in the executing Basic interpreter, recover at the C++
level -- the C++ program can happily continue executing.

Note that the existence of this _one_ counter-example invalidates your
notion of absolute unrecoverability.

As a more extreme example, you can turn off that computer, reboot, and
restore all state from some restore point.

You might and probably will argue that that's not recovery within the
"program". And that's correct as far as it goes, but that argument is
precisely the direction one might need to direct one's attention to
understand this. Can you recover without rebooting the physical computer?
Then you're within the OS "program" (and nowadays some folks prefer to run
everything in a virtual OS just to be able to do this, it's not academic).
Can you recover with some even less extreme measure? Then you might well be
within an ordinary C++ simple application program. Can you recover without
sacrificing a thread or two? Then you're even within the C++ standard's
limited notion of program, and one example was given above. It's all about
abstraction levels. Generalizing from a limited context and level up to
"everything", as you did, is not valid.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?


David Abrahams
Jul 16, 2005, 6:22:13 AM

al...@start.no (Alf P. Steinbach) writes:

> * David Abrahams:
>>
>> A precondition violation is -- by definition -- unrecoverable.
>> http://en.wikipedia.org/wiki/Precondition
>>
>> If you want to use some other definition where it's a fuzzier notion,
>> be my guest, but then we are going to be talking at cross-purposes. I
>> also warn you that the term "precondition" loses almost all of its
>> power when you do that.
>
> Sorry, but that's incorrect.

Only if you insist on complicating the way it's interpreted.

> As a simple counter-example, write a Basic interpreter in C++, then
> when a precondition fails in the executing Basic interpreter,
> recover at the C++ level -- the C++ program can happily continue
> executing.

Well, duh! Write a C++ program. If a precondition fails when
executing the program, recover at the OS level -- the OS can happily
continue executing. Of course I meant "unrecoverable for the program
whose precondition is violated."

> Note that the existence of this _one_ counter-example invalidates
> your notion of absolute unrecoverability.

Oh, totally, I agree. Unless you take the "obvious" interpretation of
what I said.

> As a more extreme example, you can turn off that computer, reboot,
> and restore all state from some restore point.
>
> You might and probably will argue that that's not recovery within the
> "program".

You catch on fast.

> And that's correct as far as it goes

That's as far as it was meant to go.

> but that argument is precisely the direction one might need to
> direct one's attention to understand this. Can you recover without
> rebooting the physical computer? Then you're within the OS
> "program" (and nowadays some folks prefer to run everything in a
> virtual OS just to be able to do this, it's not academic). Can you
> recover with some even less extreme measure? Then you might well be
> within an ordinary C++ simple application program.

You just made a leap from recovering at a meta-level to recovering at
the same level. An "ordinary C++ simple application program" has no
clear isolation between its various parts, so it usually isn't
possible to make a well-founded decision that the brokenness stops
somewhere before the process boundary when a precondition is violated.
You can guesstimate, but then you're in the territory of wishful
thinking, not in that of being able to reliably claim that the program
is functioning normally. The OP needs to be able to keep the program
functioning normally in the face of database corruption, which means
staying out of the territory of wishful thinking.

> Can you recover without sacrificing a thread or two? Then you're
> even within the C++ standard's limited notion of program, and one
> example was given above. It's all about abstraction levels.
> Generalizing from a limited context and level up to "everything", as
> you did, is not valid.

My statement was delivered in a limited context, and I did *not*
generalize it up to "everything." AFAICT you did that and then
criticized me for the consequences of it.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Bob Bell
Jul 16, 2005, 6:23:03 AM

Alf P. Steinbach wrote:
> * David Abrahams:
> >
> > A precondition violation is -- by definition -- unrecoverable.
> > http://en.wikipedia.org/wiki/Precondition
> >
> > If you want to use some other definition where it's a fuzzier notion,
> > be my guest, but then we are going to be talking at cross-purposes. I
> > also warn you that the term "precondition" loses almost all of its
> > power when you do that.
>
> Sorry, but that's incorrect.
>
> As a simple counter-example, write a Basic interpreter in C++, then when a
> precondition fails in the executing Basic interpreter, recover at the C++
> level -- the C++ program can happily continue executing.

If anything, that's an example of a "fuzzier notion."

Bob

James Kanze
Jul 16, 2005, 7:48:28 PM

Gerhard Menzl wrote:
> Dave Harris wrote:

>>I think your real problem there is that you have tons of
>>unsaved work. You can cope with program bugs, maybe, but you
>>probably can't cope with O/S crashes or power cuts so you need
>>a more general strategy.

>>My employers write word processors. Our approach is to write
>>the document to a temporary file every 10 or so minutes. We
>>delete it during a normal shutdown and also after a successful
>>user-save. Any documents found during start-up are then
>>unsaved work lost due to a crash, and we ask the user if she
>>wants them restored.

>>This mitigates the problem. They never lose more than 10
>>minutes' work. That's acceptable because crashes should be
>>rare (if they are a daily occurrence we are already in big
>>trouble). We are confident we can restore the document because
>>we had a known sane state when it was written out.

> Automated saving is certainly a good idea, but it does not really
> address my point, for which I merely used the word processor as an
> example. You can do a lot of typing, drawing, or whatever in ten
> minutes. Destroying that work without at least attempting to save it
> upon a failed assertion is not going to enhance your reputation as a
> software vendor.

I think that part of Dave's point is that failed assertions, in
general, aren't going to enhance your reputation as a software
vendor. So they should be rare enough that the lost ten minutes
doesn't matter. Of course, if the software is to run on a
modern machine, you can do the save every 30 seconds, or even
more often, if it will matter.

Note too that saving the work also guards against system crashes.
And in the code I write (and judging from yours and Dave's
contributions here, probably in your code as well), system
crashes are a lot more frequent than assertion failures.

--
James Kanze mailto: james...@free.fr


Consulting in object-oriented computing

9 pl. Pierre Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34

Dave Harris
Jul 17, 2005, 4:26:21 PM

gerhar...@hotmail.com (Gerhard Menzl) wrote (abridged):
> Automated saving is certainly a good idea, but it does not really
> address my point, for which I merely used the word processor as an
> example. You can do a lot of typing, drawing, or whatever in ten
> minutes.

Well, you can do about 10 minutes' worth. For something like a word
processor it will take at most 10 minutes to recreate the lost work.

The point is the loss is bounded. Whereas if you continue running after
the program state has become invalid, you risk corrupting everything. The
loss is not bounded. So the principle applies even when the app is doing
something like, e.g., telemetry of Mars rover missions, where the data may
be highly valuable and impossible to recreate.

-- Dave Harris, Nottingham, UK.


Gerhard Menzl
Aug 3, 2005, 11:27:08 AM

David Abrahams wrote:

> Who said anything about recovery?

Er, you did. Several times. That's how the Undo example came up.

> I am talking about stopping execution at the earliest possible moment
> when an invariant violation is detected.

And I questioned whether stopping execution upon detection of an
invariant or precondition violation is always the correct thing to do.

>>Otherwise, how would you obtain the necessary context that is
>>normally not accessible in a global, low-level function like an
>>assertion handler?
>
> I don't understand the question.

You: stack unwinding is a no-no
I:   so you abort and lose all unsaved work
You: no, you let a separate clean-up function do the saving
I:   then you need a separate mechanism (which may exist already
     if you have an Undo feature), otherwise how would the clean-up
     function, which cannot possibly know anything about the
     application state, do its work, right?

>>You argued (elsewhere) that the Undo feature in a word processor
>>provides such a mechanism (or a substantial building block of one)
>>anyway. Well, what about applications that don't have an Undo feature?
>
> I guess you'd better write one ;-)

Now it's my turn not to understand. There are applications where Undo is
part of the specification, and applications where it isn't. Because
there are operations that cannot be undone, for example. You can abort a
phone call, but you cannot undo it (although people have been known to
badly wish for such a feature).

> Imagine you wrote a word processor without Undo. You wouldn't really
> expect stack unwinding to automatically undo any edits in progress
> would you?

Not unless input handling were implemented recursively. *g*

> A precondition violation is -- by definition -- unrecoverable.
> http://en.wikipedia.org/wiki/Precondition

That definition doesn't say anything about recovery. It says:

"In computer programming, a precondition is a fact that must always
be true just prior to the execution of some section of code. [...]
If a precondition is violated, the effect of the section of code
becomes undefined and thus may or may not carry out its intended
work."

Whether the malfunction of a "section of code" doesn't matter or
requires stopping all execution or anything in between depends on what
the code does and the importance it has in the overall scheme.

> If you want to use some other definition where it's a fuzzier notion,
> be my guest, but then we are going to be talking at cross-purposes. I
> also warn you that the term "precondition" loses almost all of its
> power when you do that.

I am always eager to unfuzzy my software engineering notions, otherwise
I wouldn't take part in this group. But the definition you cited simply
doesn't say anything about how the rest of the program is affected when
one part cannot fulfill its responsibility.

>>>In your application, calling database table integrity a precondition
>>>is only going to confuse things and make your code more complicated.
>>>Once you understand that database integrity is not a precondition, it
>>>becomes very clear that you need to check for corruption in certain
>>>places and make sure that you do something sensible if you detect it.
>>
>>I think it depends on whether you regard the database as an integral
>>part of your system or something external.
>
> No. If your program is expected to be resilient against condition ~X,
> then X is not a precondition. Do yourself a favor and use the clear
> and accepted terminology. Your programs will benefit.

Suppose you write several distinct values into a database table. Your
program logic is supposed to make sure that the same value is never used
twice, and the database definition ensures that the database engine
applies a second check (by using a unique key, for example). If a
function that operates on the data relies on the values still being
unique after retrieval, that uniqueness is, by your (Wikipedia)
definition, a precondition. Yet you say it's not. Why? Because you
always have to check data retrieved from a database for possible
corruption? I don't think this is always feasible. What am I missing?


--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


David Abrahams
Aug 6, 2005, 5:59:11 AM

Gerhard Menzl <gerhar...@hotmail.com> writes:

> David Abrahams wrote:
>
>> Who said anything about recovery?
>
> Er, you did. Several times. That's how the Undo example came up.

You're mixing things I see as distinct. When your program's supposed
invariants are violated, I am saying there is no recovery.

>> I am talking about stopping execution at the earliest possible moment
>> when an invariant violation is detected.
>
> And I questioned whether stopping execution upon detection of an
> invariant or precondition violation is always the correct thing to
> do.

It's not. There is a small class of applications where it makes sense
to muddle on anyway:

http://groups-beta.google.com/group/comp.lang.c++.moderated/msg/3f398f7f06a956a5?hl=en&

>>>Otherwise, how would you obtain the necessary context that is
>>>normally not accessible in a global, low-level function like an
>>>assertion handler?
>>
>> I don't understand the question.
>
> You: stack unwinding is a no-no

When you detect an assertion violation.

> I: so you abort and lose all unsaved work
> You: no, you let a separate clean-up function do the saving
> I: then you need a separate mechanism (which may exist already
> if you have an Undo feature), otherwise how would the clean-up
> function, which cannot possibly know anything about the
> application state, do its work, right?

Yes, you need a separate mechanism of some kind. Can you imagine a
realistic scenario where you wouldn't want that mechanism anyway?

>>>You argued (elsewhere) that the Undo feature in a word processor
>>>provides such a mechanism (or a substantial building block of one)
>>>anyway. Well, what about applications that don't have an Undo feature?
>>
>> I guess you'd better write one ;-)
>
> Now it's my turn not to understand.

I mean, you'd better write a substantial building block of an Undo
feature.

> There are applications where Undo is part of the specification, and
> applications where it isn't. Because there are operations that
> cannot be undone, for example. You can abort a phone call, but you
> cannot undo it (although people have been known to badly wish for
> such a feature).

Well, of course in those applications there's no way to recover the
unmodified state. So what are you worried about?

>> A precondition violation is -- by definition -- unrecoverable.
>> http://en.wikipedia.org/wiki/Precondition
>
> That definition doesn't say anything about recovery. It says:
>
> "In computer programming, a precondition is a fact that must
> always be true just prior to the execution of some section of
> code. [...] If a precondition is violated, the effect of the
> section of code becomes undefined and thus may or may not carry

                                                 ^^^^^^^^^^^^^^
> out its intended work."

Recovery is about producing defined behavior. Once you enter the land
of undefined behavior, you can't come back. Maybe you have some other
definition of "recovery?"

> Whether the malfunction of a "section of code" doesn't matter or
> requires stopping all execution or anything in between depends on what
> the code does and the importance it has in the overall scheme.

It depends on how reliably it can be isolated from the rest of the
system.

If it truly "doesn't matter," then whatever you've specified as a
precondition was too strong.

>> If you want to use some other definition where it's a fuzzier
>> notion, be my guest, but then we are going to be talking at
>> cross-purposes. I also warn you that the term "precondition" loses
>> almost all of its power when you do that.
>
> I am always eager to unfuzzy my software engineering notions,
> otherwise I wouldn't take part in this group. But the definition you
> cited simply doesn't say anything about how the rest of the program
> is affected when one part cannot fulfill its responsibility.

Very simply: the Wiki definition says that if a precondition is
violated, the behavior becomes undefined. The moment you say, "I
can do something reliably here if I detect a violation," then the
condition being violated is -- by definition -- no longer a precondition
because you're defining what the behavior should be when it happens.
The behavior is no longer undefined, so the condition just describes
one set of valid inputs.

Perhaps more importantly, a violation of X's precondition does not
merely mean that X can't fulfill its responsibility. It also means
that something else somewhere in the program is broken. It may have
been at the bottom of some long call chain that has since returned.
So by the time you detect that X's preconditions are violated, you
really have no idea where the problem is.

>> If your program is expected to be resilient against condition ~X,
>> then X is not a precondition. Do yourself a favor and use the clear
>> and accepted terminology. Your programs will benefit.
>
> Suppose you write several distinct values into a database table. Your
> program logic is supposed to make sure that the same value is never used
> twice, and the database definition ensures that the database engine
> applies a second check (by using a unique key, for example). If a
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I don't know what you mean by that part, but I don't suppose it
matters.

> function that operates on the data relies on the values still being
> unique after retrieval, that uniqueness is, by your (Wikipedia)
> definition, a precondition.

Yes.

> Yet you say it's not.

No I don't. Why do you think so?

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Gerhard Menzl
Aug 8, 2005, 3:48:12 PM

David Abrahams wrote:

>> You're mixing things I see as distinct. When your program's supposed
>> invariants are violated, I am saying there is no recovery.

I guess I just find it hard to get a clear view of what exactly you mean
by "precondition". You cited a definition, but that definition seems to
be in contradiction to several things you said before.

> Yes, you need a separate mechanism of some kind. Can you imagine a
> realistic scenario where you wouldn't want that mechanism anyway?

A mechanism, yes, certainly. A separate mechanism and thus duplicate
logic? Not if I can avoid it.

>>There are applications where Undo is part of the specification, and
>>applications where it isn't. Because there are operations that
>>cannot be undone, for example. You can abort a phone call, but you
>>cannot undo it (although people have been known to badly wish for
>>such a feature).
>
> Well, of course in those applications there's no way to recover the
> unmodified state. So what are you worried about?

I am worried about throwing the baby out with the bathwater.
Specifically, I have my doubts about the underlying assumption that the
effects of a precondition violation are always global and thus require
shutdown.

> It depends on how reliably it can be isolated from the rest of the
> system.
>
> If it truly "doesn't matter," then whatever you've specified as a
> precondition was too strong.
>

> Very simply: the Wiki definition says that if a precondition is
> violated, the behavior becomes undefined. The moment you say, "I
> can do something reliably here if I detect a violation," then the
> condition being violated is -- by definition -- no longer a
> precondition because you're defining what the behavior should be when
> it happens. The behavior is no longer undefined, so the condition just
> describes one set of valid inputs.

Hm, this is beginning to smell like circular logic to me: if it is
possible to continue, it cannot have been a precondition, hence a
precondition is defined as a condition the violation of which makes
further execution impossible, which eventually boils down to: when you
cannot continue, you must stop. Now this is something I will readily
agree with, but it doesn't seem like a useful definition of precondition
to me anymore, and it certainly does not match the definition you cited.

What I am missing particularly is a distinction between "cannot continue
execution of the operation" and "cannot continue execution of the
program". The former does not necessarily mean the latter. My impression
(which may be wrong) is that you restrict the term "precondition
violation" to situations where the latter applies.


> Perhaps more importantly, a violation of X's precondition does not
> merely mean that X can't fulfill its responsibility. It also means
> that something else somewhere in the program is broken. It may have
> been at the bottom of some long call chain that has since returned.
> So by the time you detect that X's preconditions are violated, you
> really have no idea where the problem is.

This is a situation that may occur, but it is different from the Wikipedia
definition. I find your definition of "precondition" very elusive; maybe
that's the source of our disagreement.

>>Suppose you write several distinct values into a database table. Your
>>program logic is supposed to make sure that the same value is never
>>used twice, and the database definition ensures that the database
>>engine applies a second check (by using a unique key, for example). If

> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> I don't know what you mean by that part, but I don't suppose it
> matters.

It's database speak for what std::set guarantees in contrast to
std::multiset.

>>a function that operates on the data relies on the values still being


>>unique after retrieval, that uniqueness is, by your (Wikipedia)
>>definition, a precondition.
>
> Yes.
>
>>Yet you say it's not.
>
> No I don't. Why do you think so?

Because you wrote earlier:

> Once you understand that database integrity is not a precondition, it
> becomes very clear that you need to check for corruption in certain
> places and make sure that you do something sensible if you detect it.

Are we wrestling with conflicting definitions of database integrity as well?


--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


David Abrahams
Aug 9, 2005, 4:19:14 AM

Gerhard Menzl <gerhar...@hotmail.com> writes:

> David Abrahams wrote:
>
>> You're mixing things I see as distinct. When your programs supposed
>> invariants are violated, I am saying there is no recovery.
>
> I guess I just find it hard to get a clear view of what exactly you mean
> by "precondition." You cited a definition,

Yes, I mean precisely what that definition says.

> but that definition seems to be in contradiction to several things
> you said before.

Like what?

>> Yes, you need a separate mechanism of some kind. Can you imagine a
>> realistic scenario where you wouldn't want that mechanism anyway?
>
> A mechanism, yes, certainly. A separate mechanism and thus duplicate
> logic? Not if I can avoid it.

No "duplicate logic" is needed in order to keep track of the critical
cleanups. You can use good old RAII with a common base class that
links itself into a chain of objects that will do their business in
case of a critical failure.
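
A minimal sketch of such a chain, with hypothetical names; it assumes
the cleanup objects live on the stack, so construction and destruction
are strictly LIFO:

    #include <cstdlib>

    class CriticalCleanup {
    public:
        CriticalCleanup() : next_(head_) { head_ = this; }
        virtual ~CriticalCleanup() { head_ = next_; }

        // Walked by the failed-assertion handler, most recent first.
        static void run_all() {
            for (CriticalCleanup* p = head_; p != 0; p = p->next_)
                p->execute();
        }

    private:
        virtual void execute() = 0;  // e.g. emergency-save one document

        static CriticalCleanup* head_;
        CriticalCleanup* next_;
    };

    CriticalCleanup* CriticalCleanup::head_ = 0;

    void on_failed_assertion()
    {
        CriticalCleanup::run_all();  // best effort, no unwinding
        std::abort();
    }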

>>>There are applications where Undo is part of the specification, and
>>>applications where it isn't. Because there are operations that
>>>cannot be undone, for example. You can abort a phone call, but you
>>>cannot undo it (although people have been known to badly wish for
>>>such a feature).
>>
>> Well, of course in those applications there's no way to recover the
>> unmodified state. So what are you worried about?
>
> I am worried about throwing the baby out with the bathwater.

Okay, but that's a separate question. You brought up operations that
can't be undone. What could you possibly do to roll those back using
unwinding?

> Specifically, I have my doubts about the underlying assumption that
> the effects of a precondition violation are always global and thus
> require shutdown.

This isn't about the *effects* of a precondition violation, but what a
precondition violation indicates: a bug in the program logic.

>> It depends on how reliably it can be isolated from the rest of the
>> system.
>>
>> If it truly "doesn't matter," then whatever you've specified as a
>> precondition was too strong.
>>
>> Very simply: the Wiki definition says that if a precondition is
>> violated, the behavior becomes undefined. The moment you say say, "I
>> can do something reliably here if I detect a violation," then the
>> condition being violated is -- by definition -- no longer a
>> precondition because you're defining what the behavior should be when
>> it happens. The behavior is no longer undefined, so the condition just
>> describes one set of valid inputs.
>
> Hm, this is beginning to smell like circular logic to me: if it is
> possible to continue, it cannot have been a precondition

^
without causing undefined behavior


Correct. That's simply true by definition (the Wikipedia definition).
Remember, I have been encouraging you to keep your terms very clear,
so the definition of the word matters.

> hence a precondition is defined as

Whoa; I'm not drawing any conclusions about the definition of
"precondition" from the above. The above *follows* from the
definition of "precondition."

So the rest of what you're saying here doesn't reflect what I'm saying
at all:

> a condition the violation of which makes further execution
> impossible, which eventually boils down to: when you cannot
> continue, you must stop. Now this is something I will readily agree
> with, but it doesn't seem like a useful definition of precondition
> to me anymore, and it certainly does not match the definition you
> cited.

The definition I cited says, among other things:

    If a precondition is violated, the effect of the section of code
    becomes undefined and thus may or may not carry out its intended
    work.

> What I am missing particularly is a distinction between "cannot continue
> execution of the operation" and "cannot continue execution of the
> program". The former does not necessarily mean the latter.

No it does not.

> My impression (which may be wrong) is that you restrict the term
> "precondition violation" to situations where the latter applies.

No I don't. The problem is that it's usually impossible to tell the
former from the latter.

>> Perhaps more importantly, a violation of X's precondition does not
>> merely mean that X can't fulfill its responsibility. It also means
>> that something else somewhere in the program is broken. It may have
>> been at the bottom of some long call chain that has since returned.
>> So by the time you detect that X's preconditions are violated, you
>> really have no idea where the problem is.
>
> This is a situation that may occur, but it is different from the
> Wikipedia definition.

It follows very clearly from the Wikipedia definition, unless you
assume that programmers are intentionally inducing undefined
behavior. Since a precondition violation results in undefined
behavior, and -- I hope you will agree -- undefined behavior is a
clear indicator of a broken program, clearly something is broken
somewhere in the program.

I think I have an inkling what you're thinking about this. It's
something like "Ah, but if I throw an exception, the function never
executes, so I can avoid the undefined behavior by not proceeding with
the function and instead initiating stack unwinding." Right?

That would be wrong. First, the exception throwing is part of what
the function does. If the function's documentation describes a
precondition, the exception happens after entering the function,
i.e. after you've proceeded to execute the "section of code" to which
the precondition applies. So the exception throwing is just part of
the function's undefined behavior. You can't count on it, and use the
knowledge of the exception to make further guarantees about the
reliable behavior of the program.

The moment when the function's documentation says it throws an
exception in response to some condition being violated -- so that you
can rely on it throwing and make further deductions about the behavior
of the program afterwards -- that exception throwing becomes part of
the function's defined behavior and thus the condition is not a
precondition, by definition.

> I find your definition of "precondition" very elusive; maybe that's
> the source of our disagreement.

It's very simple; it is exactly what you'll find at
http://en.wikipedia.org/wiki/Precondition.

>>>Suppose you write several distinct values into a database table. Your
>>>program logic is supposed to make sure that the same value is never
>>>used twice, and the database definition ensures that the database
>>>engine applies a second check (by using a unique key, for example). If
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> I don't know what you mean by that part, but I don't suppose it
>> matters.
>
> It's database speak for what std::set guarantees in contrast to
> std::multiset.

I understood that; I just don't see what kind of "second check" you
can get from that guarantee.

>>>a function that operates on the data relies on the values still being
>>>unique after retrieval, that uniqueness is, by your (Wikipedia)
>>>definition, a precondition.
>>
>> Yes.
>>
>>>Yet you say it's not.
>>
>> No I don't. Why do you think so?
>
> Because you wrote earlier:
>
>> Once you understand that database integrity is not a precondition, it
>> becomes very clear that you need to check for corruption in certain
>> places and make sure that you do something sensible if you detect it.
>
> Are we wrestling with conflicting definitions of database integrity
> as well?

You're taking that statement out of its context. I wasn't making a
sweeping generalization that database integrity is never a
precondition. In the case you were describing above, with the second
check, it clearly is.

At the time of the statement you quote, we were discussing a program
that has to continue to work reliably (i.e. according to some
well-defined specification which might include a reduction in
functionality) even in the face of no database integrity. In *that
case* database integrity isn't a precondition. By definition.

HTH,


--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Simon Bone
Aug 9, 2005, 9:52:41 AM

On Mon, 08 Aug 2005 15:48:12 -0400, Gerhard Menzl wrote in reply to David
Abrahams:

> What I am missing particularly is a distinction between "cannot continue
> execution of the operation" and "cannot continue execution of the
> program". The former does not necessarily mean the latter. My impression
> (which may be wrong) is that you restrict the term "precondition
> violation" to situations where the latter applies.
>
>

For the former ("cannot continue execution of the operation") an exception
is reasonable and can be made safe. For the latter ("cannot continue
execution of the program") standard C++ exceptions cannot be safely used.

I think David's point is that in standard C++ there are many ways to
invoke "undefined behaviour" and they all lead to the latter situation. If
the inputs to a particular routine can be detected as so screwed up as
to make this inevitable, you need to have some response. That can't be an
exception so it has to be something else. Termination is the standard
answer, but you might have a platform specific alternative.

One thing you can't expect to get away with is trying to combine code to
handle both situations. They are fundamentally different in their
implications. What you can do is to combine the code that detects each
situation, at the start of a routine typically.

HTH

Simon Bone

Gerhard Menzl
Aug 10, 2005, 2:37:04 PM

In order to prevent this exchange from breaking into dozens of little
sub-arguments, I will try to focus on what I think is the key point (or
the key misunderstanding):

> I think I have an inkling what you're thinking about this. It's
> something like "Ah, but if I throw an exception, the function never
> executes, so I can avoid the undefined behavior by not proceeding with
> the function and instead initiating stack unwinding." Right?
>
> That would be wrong. First, the exception throwing is part of what
> the function does. If the function's documentation describes a
> precondition, the exception happens after entering the function,
> i.e. after you've proceeded to execute the "section of code" to which
> the precondition applies. So the exception throwing is just part of
> the function's undefined behavior. You can't count on it, and use the
> knowledge of the exception to make further guarantees about the
> reliable behavior of the program.

Your definition allows preconditions to be tested using assertions. But
assertions are, after all, part of what the function does. If I applied
what you wrote above, then triggering an assertion would be just part of
the function's undefined behaviour.

But is it? When you specify a precondition for a function, you need a
clear idea of what the consequences of a violation are, which part of
what the function does is affected, and which part isn't. Otherwise, how
would you be able to specify the precondition in the first place? If,
for instance, as the implementer of flex_array, you specify the precondition

    idx < size()

for operator[](size_type idx), you know that accessing

    &*begin() + idx

in case the precondition is violated possibly invokes undefined
behaviour, but querying the size is okay (if it weren't, you couldn't
even test the precondition). Why should throwing an exception
immediately after testing the precondition be "part of the function's
undefined behavior"? This would only be true if the stack is corrupted,
but that's not what the precondition is about. Or would you argue that
any precondition violation may be due to stack corruption?
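
(In code, the case described above might look like this -- a
hypothetical sketch of flex_array, not the real class:)

    #include <cassert>
    #include <cstddef>

    template <class T>
    class flex_array {
    public:
        typedef std::size_t size_type;

        size_type size() const { return size_; }  // safe to query

        T& operator[](size_type idx) {
            assert(idx < size());  // testing the precondition is fine
            return data_[idx];     // only this access can invoke UB
        }

    private:
        T*        data_;
        size_type size_;
    };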

> The moment when the function's documentation says it throws an
> exception in response to some condition being violated -- so that you
> can rely on it throwing and make further deductions about the behavior
> of the program afterwards -- that exception throwing becomes part of
> the function's defined behavior and thus the condition is not a
> precondition, by definition.
>
>>I find your definition of "precondition" very elusive; maybe that's
>>the source of our disagreement.
>
> It's very simple; it is exactly what you'll find at
> http://en.wikipedia.org/wiki/Precondition.

If I understand you correctly, a precondition ceases to be a
precondition as soon as the function that specifies the precondition
detects its violation and throws an exception. I don't see how this
follows from "a precondition is a fact that must always be true just
prior to the execution of some section of code", though. Anyway, does
the same hold for asserts? If it doesn't, and asserting preconditions is
allowed, but exceptions are not, how do you draw the line? Is it because
exceptions are part of the contract, and assertions are not?

By the way, how does Eiffel, in the context of which the terms
precondition and contract originated, after all, handle this?

--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


David Abrahams
Aug 10, 2005, 4:57:30 PM

Gerhard Menzl <gerhar...@hotmail.com> writes:

> In order to prevent this exchange from breaking into dozens of little
> sub-arguments, I will try to focus on what I think is the key point (or
> the key misunderstanding):
>
>> I think I have an inkling what you're thinking about this. It's
>> something like "Ah, but if I throw an exception, the function never
>> executes, so I can avoid the undefined behavior by not proceeding with
>> the function and instead initiating stack unwinding." Right?
>>
>> That would be wrong. First, the exception throwing is part of what
>> the function does. If the function's documentation describes a
>> precondition, the exception happens after entering the function,
>> i.e. after you've proceeded to execute the "section of code" to which
>> the precondition applies. So the exception throwing is just part of
>> the function's undefined behavior. You can't count on it, and use the
>> knowledge of the exception to make further guarantees about the
>> reliable behavior of the program.
>
> Your definition allows preconditions to be tested using assertions. But
> assertions are, after all, part of what the function does. If I applied
> what you wrote above, then triggering an assertion would be just part of
> the function's undefined behaviour.

Yes!

> But is it? When you specify a precondition for a function, you need a
> clear idea of what the consequences of a violation are,
> which part of what the function does is affected, and which part
> isn't. Otherwise, how would you be able to specify the precondition
> in the first place?

Go back to the definition of precondition from Wikipedia: the
consequence is undefined behavior.

> If, for instance, as the implementer of
> flex_array, you specify the precondition
>
> idx < size()
>
> for operator[](size_type idx), you know that accessing
>
> &*begin() + idx
>
> in case the precondition is violated possibly invokes undefined
> behaviour, but querying the size is okay (if it weren't, you couldn't
> even test the precondition). Why should throwing an exception
> immediately after testing the precondition be "part of the function's
> undefined behavior"? This would only be true if the stack is corrupted,
> but that's not what the precondition is about. Or would you argue that
> any precondition violation may be due to stack corruption?

Yes, any precondition violation could be the result of stack corruption.

>>>I find your definition of "precondition" very elusive; maybe that's
>>>the source of our disagreement.
>>
>> It's very simple; it is exactly what you'll find at
>> http://en.wikipedia.org/wiki/Precondition.
>
> If I understand you correctly, a precondition ceases to be a
> precondition as soon as the function that specifies the precondition
> detects its violation and throws an exception.

No, it ceases to be a precondition the moment that the author of the
function *documents* the throwing of that exception as the function's
response to the condition being violated.

(**) It's inconsistent to say, "This function has X as a precondition.
However, if X is violated, I promise you it will throw an exception
and do nothing else." The correct thing to do if you are going to
make that promise is drop the first sentence.


> I don't see how this follows from "a precondition is a fact that
> must always be true just prior to the execution of some section of
> code", though.

It doesn't.

> Anyway, does the same hold for asserts? If it doesn't, and asserting
> preconditions is allowed, but exceptions are not,

You can replace "throw an exception" with "call abort," "assert(0),"
or anything else you like in (**) and the statement is the same.

> how do you draw the line? Is it because exceptions are part of
> the contract, and assertions are not?

> By the way, how does Eiffel, in the context of which the terms
> precondition and contract originated, after all, handle this?

I don't know.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Niklas Matthies
Aug 10, 2005, 5:11:40 PM

On 2005-08-10 18:37, Gerhard Menzl wrote:
:

> Why should throwing an exception immediately after testing the
> precondition be "part of the function's undefined behavior"? This
> would only be true if the stack is corrupted, but that's not what
> the precondition is about. Or would you argue that any precondition
> violation may be due to stack corruption?

Moreover, any precondition fulfillment may be due to stack corruption.
So the only safe thing to do after detecting it is to terminate the
program.

-- Niklas Matthies

Dave Harris
Aug 14, 2005, 2:49:26 PM

gerhar...@hotmail.com (Gerhard Menzl) wrote (abridged):
> But is it? When you specify a precondition for a function, you need a
> clear idea of what the consequences of a violation are, which part of
> what the function does is affected, and which part isn't.

You don't need a clear idea, and sometimes no part of the function is
affected at all. For example:

    int get_count( int *p ) {
        assert( p != 0 );
        return 0;
    }

is reasonable. One reason for writing such is to reserve a condition for
/future/ implementations to exploit, if they want to. Another reason is to
verify something that you think ought to be true, even if you don't have
any plan to exploit that truth.


> By the way, how does Eiffel, in the context of which the terms
> precondition and contract originated, after all, handle this?

In Eiffel, pre-condition checking is done by the language and can be
turned off as an optimisation, so callers must not rely on it. If you want
to recover from a condition failure, you should write your own code to
check it, not rely on language-level assertions.
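
The C++ analogue of that advice might look like this sketch (the
function names are made up): recovery needs a check that is part of
the defined interface, not an assertion that may be compiled out.

    #include <cassert>
    #include <stdexcept>

    int checked_divide(int a, int b)
    {
        if (b == 0)                                 // defined behaviour,
            throw std::invalid_argument("b == 0");  // part of the contract
        return a / b;
    }

    int fast_divide(int a, int b)
    {
        assert(b != 0);  // precondition; disappears under NDEBUG
        return a / b;    // undefined behaviour if b == 0
    }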

-- Dave Harris, Nottingham, UK.


Gerhard Menzl
Aug 17, 2005, 8:42:44 AM

Dave Harris wrote:

>>But is it? When you specify a precondition for a function, you need a
>>clear idea of what the consequences of a violation are, which part of
>>what the function does is affected, and which part isn't.
>
> You don't need a clear idea, and sometimes no part of the function is
> affected at all. For example:
>
>     int get_count( int *p ) {
>         assert( p != 0 );
>         return 0;
>     }
>
> is reasonable.

As the author of get_count(), I can tell that at present a violation of
the precondition will have no consequences, and that undefined behaviour
will be invoked as soon as the function is changed to dereference p.
That's what I call a clear idea.

> In Eiffel, pre-condition checking is done by the language and can be
> turned off as an optimisation, so callers must not rely on it. If you
> want to recover from a condition failure, you should write your own
> code to check it, not rely on language-level assertions.

How does the Eiffel runtime handle violated preconditions? Display a
diagnostic message and abort?


--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


Gerhard Menzl
Aug 22, 2005, 6:06:55 AM

David Abrahams wrote:

> No, it ceases to be a precondition the moment that the author of the
> function *documents* the throwing of that exception as the function's
> response to the condition being violated.
>
> (**) It's inconsistent to say, "This function has X as a precondition.
> However, if X is violated, I promise you it will throw an exception
> and do nothing else." The correct thing to do if you are going to
> make that promise is drop the first sentence.

Originally, I took you to mean that throwing an exception upon detecting
a violated precondition is always wrong. Now you seem to say that
*documenting* such an exception is inconsistent with the specification
of a precondition. These are two very different statements. The first is
about implementation, the second about interface. Or, to put it
polemically: what's wrong with throwing a ViolatedPreconditionException,
as long as I don't document it?

By the way, the term "precondition" seems to be used in meanings that
differ from the one you have sketched. In the Eiffel world (as far as I
can tell), a precondition is something that the caller must guarantee
before invoking a function. This would rule out stack integrity or other
global conditions which the caller cannot guarantee.

Another conflicting use I have found is from P. J. Plauger's Editor's
Forum in the June issue of C/C++ Users Journal where he writes about a
new C library called "Safer C": "But any implementation of the function
must test that its arguments meet all preconditions. The runtime
equivalent of a diagnostic is to call the diagnostic handler. If the
handler returns, the function then cauterizes any output buffers and
returns an error code."
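
In rough C++ terms, that scheme might look like the following sketch of
mine (not the actual Safer C library; all names are invented):

#include <cstddef>
#include <cstdio>
#include <cstring>

typedef void (*handler_t)(const char *msg);

static void default_handler(const char *msg) {
    std::fprintf(stderr, "constraint violation: %s\n", msg);
}

static handler_t g_handler = default_handler;

int copy_checked(char *dst, std::size_t dstsize, const char *src) {
    if (dst == 0 || src == 0 || dstsize == 0 ||
        std::strlen(src) >= dstsize) {
        g_handler("copy_checked: bad arguments"); // diagnostic handler
        if (dst != 0 && dstsize > 0)
            dst[0] = '\0';   // "cauterize" the output buffer
        return 1;            // error code instead of undefined behaviour
    }
    std::strcpy(dst, src);
    return 0;
}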

--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


David Abrahams

Aug 22, 2005, 10:48:10 AM

Gerhard Menzl <gerhar...@hotmail.com> writes:

> David Abrahams wrote:
>
>> No, it ceases to be a precondition the moment that the author of the
>> function *documents* the throwing of that exception as the function's
>> response to the condition being violated.
>>
>> (**) It's inconsistent to say, "This function has X as a precondition.
>> However, if X is violated, I promise you it will throw an exception
>> and do nothing else." The correct thing to do if you are going to
>> make that promise is drop the first sentence.
>
> Originally, I took you to mean that throwing an exception upon detecting
> a violated precondition is always wrong.

No. It's almost always wrong, but that's a different issue.

> Now you seem to say that *documenting* such an exception is
> inconsistent with the specification of a precondition.

That's correct.

> These are two very different statements.

Of course. I know what I'm saying.

> The first is about implementation, the second about interface. Or,
> to put it polemically: what's wrong with throwing a
> ViolatedPreconditionException, as long as I don't document it?

Well, I've been through the arguments about why it's usually a bad
idea in this thread. You can just review my postings (and not only
the ones in reply to yours).

> By the way, the term "precondition" seems to be used in meanings that
> differ from the one you have sketched. In the Eiffel world (as far as I
> can tell), a precondition is something that the caller must guarantee
> before invoking a function.

In my definition the caller must guarantee the precondition also,
since the alternative is undefined behavior. I don't know what I've
said that could make you think otherwise.

> This would rule out stack integrity or other global conditions which
> the caller cannot guarantee.

If the caller is invoking the callee other than as some expression of
undefined behavior, it can. I'm assuming that even in Eiffel, the
moment stack integrity is violated you have undefined behavior. Once
you enter undefined behavior land, all bets are off and all actions
are part of that undefined behavior. Undefined behavior means "all
bets are off" and no guarantees (not even the ones you /think/ the
caller can ensure) are really valid.

> Another conflicting use I have found is from P. J. Plauger's Editor's
> Forum in the June issue of C/C++ Users Journal where he writes about a
> new C library called "Safer C": "But any implementation of the function
> must test that its arguments meet all preconditions. The runtime
> equivalent of a diagnostic is to call the diagnostic handler. If the
> handler returns, the function then cauterizes any output buffers and
> returns an error code."

I'm not sure that's a conflict either.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Peter Dimov

Aug 23, 2005, 2:43:05 AM

Gerhard Menzl wrote:
> David Abrahams wrote:
>
> > No, it ceases to be a precondition the moment that the author of the
> > function *documents* the throwing of that exception as the function's
> > response to the condition being violated.
> >
> > (**) It's inconsistent to say, "This function has X as a precondition.
> > However, if X is violated, I promise you it will throw an exception
> > and do nothing else." The correct thing to do if you are going to
> > make that promise is drop the first sentence.
>
> Originally, I took you to mean that throwing an exception upon detecting
> a violated precondition is always wrong. Now you seem to say that
> *documenting* such an exception is inconsistent with the specification
> of a precondition.

In principle, yes.

> These are two very different statements. The first is
> about implementation, the second about interface. Or, to put it
> polemically: what's wrong with throwing a ViolatedPreconditionException,
> as long as I don't document it?

In principle, the implementation is allowed to implement the interface
as you suggest. In practice, such an implementation would lock the
customers in, because they'll start relying on
ViolatedPreconditionException. Whether this is right or wrong is
another story. In effect, you are implementing a different interface,
one without a precondition.
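
A sketch of that lock-in, with all names hypothetical:

#include <iostream>

struct ViolatedPreconditionException {};

// Documented precondition: p != 0. The throw is *not* documented.
int frobnicate(int *p) {
    if (p == 0)
        throw ViolatedPreconditionException();
    return *p;
}

int main() {
    int *p = 0;
    try {
        std::cout << frobnicate(p) << '\n';
    } catch (const ViolatedPreconditionException &) {
        // The caller now depends on behaviour that was never part of
        // the interface -- in effect a new, precondition-free contract.
        std::cout << "recovered\n";
    }
}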

> By the way, the term "precondition" seems to be used in meanings that
> differ from the one you have sketched. In the Eiffel world (as far as I
> can tell), a precondition is something that the caller must guarantee
> before invoking a function. This would rule out stack integrity or other
> global conditions which the caller cannot guarantee.

Stack integrity is an implicit global invariant if the integrity of
said stack affects the observable behavior. IOW, if 1+1 yields 3 when
stack integrity is violated, then all programs implicitly assume a
non-broken stack, even when the language doesn't explicitly define this
as an "invariant".

Gerhard Menzl

Aug 23, 2005, 6:59:54 PM

David Abrahams wrote:

>>The first is about implementation, the second about interface. Or,
>>to put it polemically: what's wrong with throwing a
>>ViolatedPreconditionException, as long as I don't document it?
>
> Well, I've been through the arguments about why it's usually a bad
> idea in this thread. You can just review my postings (and not only
> the ones in reply to yours).

Honestly, I've tried, but I still find your rationale elusive.

>>By the way, the term "precondition" seems to be used in meanings that
>>differ from the one you have sketched. In the Eiffel world (as far as
>>I can tell), a precondition is something that the caller must
>>guarantee before invoking a function.
>
> In my definition the caller must guarantee the precondition also,
> since the alternative is undefined behavior. I don't know what I've
> said that could make you think otherwise.

For example:

> Yes, any precondition violation could be the result of stack
> corruption.

An intact stack is a global condition that cannot be guaranteed by a single
caller. Stack corruption could be the result of a broken compiler or of
user code that invokes undefined behaviour somewhere else. Both events
lie outside the language. They cannot be detected reliably and portably
at this level, and they are not covered by the contract between the
caller and the called. From this perspective, they're more like Acts of
God, to extend the law metaphor. Yet you seem to regard such global
failures in the context of the contract. I find this confusing and at
variance with the idea of DbC as I understand it.

>>This would rule out stack integrity or other global conditions which
>>the caller cannot guarantee.
>
> If the caller is invoking the callee other than as some expression of
> undefined behavior, it can. I'm assuming that even in Eiffel, the
> moment stack integrity is violated you have undefined behavior. Once
> you enter undefined behavior land, all bets are off and all actions
> are part of that undefined behavior. Undefined behavior means "all
> bets are off" and no guarantees (not even the ones you /think/ the
> caller can ensure) are really valid.

A violated precondition causes undefined behaviour. That doesn't mean
undefined behaviour caused at a different level (like a broken compiler)
can be regarded as a breach of a local contract.

>>Another conflicting use I have found is from P. J. Plauger's Editor's
>>Forum in the June issue of C/C++ Users Journal where he writes about a
>>new C library called "Safer C": "But any implementation of the
>>function must test that its arguments meet all preconditions. The
>>runtime equivalent of a diagnostic is to call the diagnostic handler.
>>If the handler returns, the function then cauterizes any output
>>buffers and returns an error code."
>
> I'm not sure that's a conflict either.

To me, it is in glaring conflict with

> The moment you say, "I can do something reliably here if I detect
> a violation," then the condition being violated is -- by definition --
> no longer a precondition because you're defining what the behavior
> should be when it happens.

--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


Bob Bell

Aug 24, 2005, 3:58:29 AM

Gerhard Menzl wrote:
> David Abrahams wrote:
> >>By the way, the term "precondition" seems to be used in meanings that
> >>differ from the one you have sketched. In the Eiffel world (as far as
> >>I can tell), a precondition is something that the caller must
> >>guarantee before invoking a function.
> >
> > In my definition the caller must guarantee the precondition also,
> > since the alternative is undefined behavior. I don't know what I've
> > said that could make you think otherwise.
>
> For example:
>
> > Yes, any precondition violation could be the result of stack
> > corruption.
>
> An intact stack is a global condition that cannot be guaranteed by a single
> caller. Stack corruption could be the result of a broken compiler or of
> user code that invokes undefined behaviour somewhere else. Both events
> lie outside the language. They cannot be detected reliably and portably
> at this level, and they are not covered by the contract between the
> caller and the called. From this perspective, they're more like Acts of
> God, to extend the law metaphor. Yet you seem to regard such global
> failures in the context of the contract. I find this confusing and at
> variance with the idea of DbC as I understand it.

I think the point of Dave's remark above is that any precondition, like

assert(index < size());

can fail because the stack is corrupted.

Testing preconditions isn't about assigning blame or deciding on cause.
It's just about testing a condition that must be true, or else the
function can't do what it's supposed to do. If the precondition is
"index < size()", and this condition evaluates to false, the function
cannot know or care why. It could be because the caller just passed in
a bad value. It could be because the stack is corrupted, causing
garbage to be read from index. It could be because the size() function
returned garbage because its invariants are broken. It could be because
of a buggy compiler that emitted garbage code. It could be because of a
hardware fault.

It doesn't matter which of these is the case; all that matters is that
the function cannot continue if index < size() does not hold.
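
To make that concrete, a minimal sketch:

#include <cassert>
#include <cstddef>
#include <vector>

int element_at(const std::vector<int> &v, std::size_t index) {
    // The function only checks that the condition holds; it cannot
    // know or care why it might have failed.
    assert(index < v.size());
    return v[index];
}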

Bob

Gerhard Menzl

Aug 24, 2005, 11:09:49 AM

Bob Bell wrote:

> I think the point of Dave's remark above is that any precondition,
> like
>
> assert(index < size());
>
> can fail because the stack is corrupted.
>
> Testing preconditions isn't about assigning blame or deciding on
> cause. It's just about testing a condition that must be true, or else
> the function can't do what it's supposed to do. If the precondition is
> "index < size()", and this condition evaluates to false, the function
> cannot know or care why. It could be because the caller just passed in
> a bad value. It could be because the stack is corrupted, causing
> garbage to be read from index. It could be because the size() function
> returned garbage because its invariants are broken. It could be
> because of a buggy compiler that emitted garbage code. It could be
> because of a hardware fault.
>
> It doesn't matter which of these is the case; all that matters is that
> the function cannot continue if index < size().

That makes sense, but the same could be said about error conditions that
*are* promised to be handled. A function that throws a documented
InvalidPointerException could be the victim of stack corruption or a
hardware fault just as well.


--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


Bob Bell

Aug 24, 2005, 6:27:39 PM

Gerhard Menzl wrote:
> Bob Bell wrote:
>
> > I think the point of Dave's remark above is that any precondition,
> > like
> >
> > assert(index < size());
> >
> > can fail because the stack is corrupted.
> >
> > Testing preconditions isn't about assigning blame or deciding on
> > cause. It's just about testing a condition that must be true, or else
> > the function can't do what it's supposed to do. If the precondition is
> > "index < size()", and this condition evaluates to false, the function
> > cannot know or care why. It could be because the caller just passed in
> > a bad value. It could be because the stack is corrupted, causing
> > garbage to be read from index. It could be because the size() function
> > returned garbage because its invariants are broken. It could be
> > because of a buggy compiler that emitted garbage code. It could be
> > because of a hardware fault.
> >
> > It doesn't matter which of these is the case; all that matters is that
> > the function cannot continue if index < size().
>
> That makes sense, but the same could be said about error conditions that
> *are* promised to be handled.

No; the difference is that failed preconditions represent undefined
behavior. Error conditions that *are* promised to be handled are, by
definition, well-defined behavior.

It seems to me that the problem being discussed is largely one of
definition of terms. As I understand it, Dave is advocating a
definition of precondition that reads something like "a condition which
must be true when a function is called, or else we have undefined
behavior." I agree with this definition, and view it as synonymous with
"a condition which must be true when a function is called, or else the
function cannot continue."

It sounds like you want a definition that reads something like "a
precondition is a condition which is tested when a function is called;
if true, the function runs as normal; if false, the function throws an
exception." I don't think there's anything wrong with making a
distinction between conditions which cause a function to behave one way
and conditions that make a function behave another. But this emphasis
ignores those conditions which lead to undefined behavior, and
documenting, detecting and avoiding those conditions is _very_
important for designing and implementing a robust system.

> A function that throws a documented
> InvalidPointerException could be the victim of stack corruption or a
> hardware fault just as well.

Suppose you have a function F() that is documented to throw X when
condition Y fails. Suppose you call F(), and Y legitimately fails. You
don't have undefined behavior, an X is thrown, and life goes on.

If, however, F() is called and Y appears to fail because some undefined
behavior has occurred (such as stack corruption or whatnot), well now
you're in undefined behavior land, and anything could happen. The
program could crash, wipe your hard disk, or throw an X. If it manages
to actually throw an X, it doesn't change the fact that your program is
now exhibiting undefined behavior.

The key point is that your condition Y did not help you detect and
avoid undefined behavior.

Bob

Gerhard Menzl

Aug 25, 2005, 7:04:58 PM

Bob Bell wrote:

> It seems to me that the problem being discussed is largely one of
> definition of terms.

Yes, that's what it has turned into.

> As I understand it, Dave is advocating a definition of precondition
> that reads something like "a condition which must be true when a
> function is called, or else we have undefined behavior." I agree with
> this definition, and view it as synonymous with "a condition which
> must be true when a function is called, or else the function cannot
> continue."

"The function cannot continue" would allow for throwing an exception or
returning an error code, "we have undefined behaviour" would not, hence
the two are not synonymous.

> It sounds like you want a definition that reads something like "a
> precondition is a condition which is tested when a function is called;
> if true, the function runs as normal; if false, the function throws an
> exception."

No, I don't *want* a particular definition. What I was hoping to get is
a clear definition that is consistent with the statement that throwing
an exception upon detecting a precondition violation is almost always
wrong. So far I haven't seen one, and my hopes have dwindled.

As to the role of undefined behaviour, I have seen:

- a violated precondition causes undefined behaviour
- a violated precondition may be the result of undefined behaviour
- once you detect a violated precondition, you already have undefined
behaviour
- continuing after detecting a violated precondition would cause
undefined behaviour

All these are different in subtle ways.

I have also tried to understand whether the definition used by David is
specific to C++ or consistent with the notion of precondition used by
the programming community in general, and with the concept of Design by
Contract in Eiffel in particular.

> Suppose you have a function F() that is documented to throw X when
> condition Y fails. Suppose you call F(), and Y legitimately fails. You
> don't have undefined behavior, an X is thrown, and life goes on.
>
> If, however, F() is called and Y appears to fail because some
> undefined behavior has occurred (such as stack corruption or whatnot),
> well now you're in undefined behavior land, and anything could happen.
> The program could crash, wipe your hard disk, or throw an X. If it
> manages to actually throw an X, it doesn't change the fact that your
> program is now exhibiting undefined behavior.
>
> The key point is that your condition Y did not help you detect and
> avoid undefined behavior.

Doesn't the very definition of undefined behaviour render its detection
once it has occurred impossible?

I don't see any point in treating precondition violations separately on
the grounds that they may be the *result* of undefined behaviour: that
is true for any program state, and you cannot detect it anyway. Treating
them separately because going on as if nothing had happened would *lead
to* undefined behaviour is an altogether different story. To me, mixing
these aspects is a major obstacle in this discussion.

--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


Joe

Aug 25, 2005, 7:14:08 PM

Interesting how phrasing can be very subtle. To me, preconditions are
those conditions the caller must meet before my routine has well-defined
meaning. That is, I am willing to guarantee to the caller that my
routine will work if these conditions are met, and I believe this is the
original intent of DbC. Stack corruption is a different thing entirely,
and nothing useful can be done about it. While it is a violation of the
contract, it is not a very interesting one. If the stack is corrupt,
evaluating the precondition is already undefined behavior and there is
no reason to believe that the precondition check itself will return
anything meaningful. I am relatively neutral as to whether exceptions
should be used to report a precondition failure, but I firmly believe
that they have nothing to do with stack corruption detection; the point
is really to check that the caller is invoking our routine correctly, so
that the offending code can be corrected if the preconditions are not
true.

In other words, much like static type checking, preconditions are there
to catch programming errors of a dynamic nature, such as accidentally
passing a 5 where the only sensible values are between 100 and 500.
Stack corruption is a much more insidious problem which we would be
thrilled to catch with a precondition, but that would not be the reason
I would put a precondition in the code.

joe

Bob Bell

Aug 26, 2005, 5:47:53 AM

Gerhard Menzl wrote:
> Bob Bell wrote:
> > As I understand it, Dave is advocating a definition of precondition
> > that reads something like "a condition which must be true when a
> > function is called, or else we have undefined behavior." I agree with
> > this definition, and view it as synonymous with "a condition which
> > must be true when a function is called, or else the function cannot
> > continue."
>
> "The function cannot continue" would allow for throwing an exception or
> returning an error code, "we have undefined behaviour" would not, hence
> the two are not synonymous.

I should be more specific. I interpret "the function cannot continue"
to mean that the function shouldn't be allowed to execute a single
instruction more, not even to throw an exception.

> > It sounds like you want a definition that reads something like "a
> > precondition is a condition which is tested when a function is called;
> > if true, the function runs as normal; if false, the function throws an
> > exception."
>
> No, I don't *want* a particular definition. What I was hoping to get is
> a clear definition that is consistent with the statement that throwing
> an exception upon detecting a precondition violation is almost always
> wrong. So far I haven't seen one, and my hopes have dwindled.

Then how about a more pragmatic definition? When a precondition fails,
it almost always indicates a programmer error (a bug). When a bug
occurs, the last thing you want is to unwind the stack:

-- unwinding the stack destroys state that could help you track
down the bug
-- unwinding the stack may do more damage
-- throwing an exception allows the bug to go unnoticed if a
caller catches and swallows it (e.g., catch (...))
-- throwing an exception gives a (possibly indirect) caller a
chance to respond to the bug; typically, there isn't anything
reasonable a caller can do to respond to a bug

What you really want is to stop the program in a debugger, generate a
core dump, or otherwise examine the state of the program at the instant
the bug was detected. If you throw an exception, you're just allowing
the program to continue running with a bug.
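
Something like the following sketch does that (CHECK_PRE is a made-up
macro): it reports the failure and stops at the point of detection
instead of unwinding:

#include <cstdio>
#include <cstdlib>

#define CHECK_PRE(cond)                                               \
    do {                                                              \
        if (!(cond)) {                                                \
            std::fprintf(stderr, "precondition failed: %s (%s:%d)\n", \
                         #cond, __FILE__, __LINE__);                  \
            std::abort(); /* debugger trap / core dump, no unwind */  \
        }                                                             \
    } while (0)

int get_count(int *p) {
    CHECK_PRE(p != 0);
    return 0;
}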

> I have also tried to understand whether the definition used by David is
> specific to C++ or consistent with the notion of precondition used by
> the programming community in general, and with the concept of Design by
> Contract in Eiffel in particular.

I don't think it is; from what I understand about Eiffel (which is
little) the aim is to keep the program running if a contract is broken.
But I could be wrong about that.

> > Suppose you have a function F() that is documented to throw X when
> > condition Y fails. Suppose you call F(), and Y legitimately fails. You
> > don't have undefined behavior, an X is thrown, and life goes on.
> >
> > If, however, F() is called and Y appears to fail because some
> > undefined behavior has occurred (such as stack corruption or whatnot),
> > well now you're in undefined behavior land, and anything could happen.
> > The program could crash, wipe your hard disk, or throw an X. If it
> > manages to actually throw an X, it doesn't change the fact that your
> > program is now exhibiting undefined behavior.
> >
> > The key point is that your condition Y did not help you detect and
> > avoid undefined behavior.
>
> Doesn't the very definition of undefined behaviour render its detection
> once it has occurred impossible?

You're right, undefined behavior, as defined by the language standard,
is undetectable once it's occurred. It's clear from your response that
applying the term "undefined behavior" to preconditions has been
misleading. In the interest of clarity, I'm going to switch to
"undefined state". Example:

F() is documented to specify that it is the responsibility of all
callers to establish condition Y. Now suppose F() is called and Y is
false. What does this mean? All you know is that some caller failed to
establish Y. Assuming the contract was valid and reasonable, you have
detected a bug. (Even if the contract was invalid, you've still
detected a bug -- only the bug is that F() demands condition Y.)

In practical terms, the program has entered an undefined state -- it's
doing something you didn't think it could do. Whether it entered the
undefined state before Y was tested, as a result of testing Y, etc., is
not that important, as far as I'm concerned. What's important is what
you do about it. If you throw an exception, you allow the program to
continue running. But since it's entered an undefined state, you don't
know what it will do.

This is not exactly the same as "undefined behavior" as defined in the
standard, but it shares a lot of similarities. "Undefined behavior"
means the language standard has nothing to say about what the program
will do. "Undefined state" means that you, the programmer, have nothing
to say about what the program will do.

Getting back to my definition from my previous message, a precondition
is "a condition which must be true when a function is called, or else
the program has entered an undefined state." If this happens, I believe
that the right thing to do is stop the program, so I see this as
synonymous with "a condition which must be true when a function is
called, or else the function cannot continue."

> I don't see any point in treating precondition violations separately on
> the grounds that they may be the *result* of undefined behaviour: that
> is true for any program state, and you cannot detect it anyway. Treating
> them separately because going on as if nothing had happened would *lead
> to* undefined behaviour is an altogether different story. To me, mixing
> these aspects is a major obstacle in this discussion.

I think the right way to think about preconditions is that they detect
bugs.

Bob

Dave Harris

Aug 27, 2005, 3:39:38 PM

gerhar...@hotmail.com (Gerhard Menzl) wrote (abridged):
> As the author of get_count(), I can tell that at present a violation of
> the precondition will have no consequences, and that undefined
> behaviour will be invoked as soon as the function is changed to
> dereference p. That's what I call a clear idea.

I don't think you /need/ a clear idea. I sometimes assert things I expect
to be true without thinking at all about the consequences if they aren't
true.


> How does the Eiffel runtime handle violated preconditions? Display a
> diagnostic message and abort?

If checking is enabled, it throws an exception. If not, then the program
proceeds normally and has undefined behaviour. Programmers are encouraged
to write code as if checking is disabled.
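
A rough C++ analogue of that scheme, with PRE_CHECKS as a hypothetical
build flag:

#include <stdexcept>

#ifdef PRE_CHECKS
#define REQUIRE(cond) \
    do { if (!(cond)) throw std::logic_error("precondition: " #cond); } while (0)
#else
#define REQUIRE(cond) ((void)0) /* checking off: undefined behaviour */
#endif

int get_count(int *p) {
    REQUIRE(p != 0); // callers must behave as if this line weren't here
    return 0;
}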

-- Dave Harris, Nottingham, UK.


David Abrahams

Aug 27, 2005, 3:32:02 PM

Gerhard Menzl <gerhar...@hotmail.com> writes:

>>>By the way, the term "precondition" seems to be used in meanings that
>>>differ from the one you have sketched. In the Eiffel world (as far as
>>>I can tell), a precondition is something that the caller must
>>>guarantee before invoking a function.
>>
>> In my definition the caller must guarantee the precondition also,
>> since the alternative is undefined behavior. I don't know what I've
>> said that could make you think otherwise.
>
> For example:
>
>> Yes, any precondition violation could be the result of stack
>> corruption.
>
> An intact stack is a global condition that cannot be guaranteed by a single
> caller.

So? I didn't say that an intact stack is a precondition. I just said
that a corrupted stack could easily result in any precondition being
violated.

> Stack corruption could be the result of a broken compiler or of
> user code that invokes undefined behaviour somewhere else. Both events
> lie outside the language. They cannot be detected reliably and portably
> at this level, and they are not covered by the contract between the
> caller and the called. From this perspective, they're more like Acts of
> God, to extend the law metaphor. Yet you seem to regard such global
> failures in the context of the contract. I find this confusing and at
> variance with the idea of DbC as I understand it.

Let me put it another way: if a single caller cannot guarantee an
intact stack, it cannot guarantee anything. A corrupted stack throws
everything into doubt. Even if the caller's author _thinks_ he is
testing some condition and guaranteeing it to the callee, if the stack
is corrupt, any otherwise-correct test could easily give false results.

That's why, in general:

Preconditions are not ensured by the caller testing for them and
somehow deciding to avoid making the call. They are ensured by
reading the guarantees made to the caller by the other functions it
is calling, reading the requirements the caller makes on *its*
callers, and combining that information using logic to form a little
proof that the conditions hold.

It's fine for the caller to use asserts or whatever to check his own
logic and root out bugs from his code. However, it's impossible to
write a program that meets any useful specification when at any moment
the programmer's understanding of the program state could turn out to
be wrong, so those tests can at best be a debugging tool -- you'd
better write your program as though they're going to pass.
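
A small sketch of what I mean, reusing the get_count example from
earlier in the thread:

#include <cassert>
#include <vector>

int get_count(int *p) {
    assert(p != 0); // precondition: p != 0
    return 0;
}

int count_of(std::vector<int> &v) {
    if (v.empty())
        return 0;
    // v is non-empty here, so &v[0] is non-null: the precondition holds
    // by a little proof at the call site, not by a runtime test of it.
    return get_count(&v[0]);
}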

You have to decide on a baseline context inside of which your
guarantees live. "Obviously" if some code executed earlier induces
undefined behavior (e.g. by corrupting the stack), your guarantees are
just as meaningless as if radiation had inverted a few bits in memory.

>>>This would rule out stack integrity or other global conditions which
>>>the caller cannot guarantee.

Once again, I never said that stack corruption was a precondition
violation. You seem to be looking hard for ways to find that what
I've said is somehow inconsistent or incoherent. Poking holes in
arguments I never made seems sorta pointless.

It doesn't sound to me as though you're trying to understand what I'm
saying; rather, it seems much more as though you simply don't _like_
what I'm saying. If that's so, I'd like to stop trying to explain
myself now. If not, I apologize in advance for even asking.


>> If the caller is invoking the callee other than as some expression of
>> undefined behavior, it can. I'm assuming that even in Eiffel, the
>> moment stack integrity is violated you have undefined behavior. Once
>> you enter undefined behavior land, all bets are off and all actions
>> are part of that undefined behavior. Undefined behavior means "all
>> bets are off" and no guarantees (not even the ones you /think/ the
>> caller can ensure) are really valid.
>
> A violated precondition causes undefined behaviour. That doesn't
> mean undefined behaviour caused at a different level (like a broken
> compiler) can be regarded as a breach of a local contract.

No, it simply makes the local contract somewhat meaningless, unless
there is strong insulation between the levels (as with processes). A
broken compiler affects everything at a deep level, so there's no
insulation.

>>>Another conflicting use I have found is from P. J. Plauger's Editor's
>>>Forum in the June issue of C/C++ Users Journal where he writes about a
>>>new C library called "Safer C": "But any implementation of the
>>>function must test that its arguments meet all preconditions. The
>>>runtime equivalent of a diagnostic is to call the diagnostic handler.
>>>If the handler returns, the function then cauterizes any output
>>>buffers and returns an error code."
>>
>> I'm not sure that's a conflict either.
>
> To me, it is in glaring conflict with
>
>> The moment you say, "I can do something reliably here if I detect
>> a violation," then the condition being violated is -- by definition --
>> no longer a precondition because you're defining what the behavior
>> should be when it happens.

If you are merely suggesting that Microsoft's "Safer C" specification
uses a different concept of the term "precondition," which allows a
documented response to violations, you'll get no argument from me on
that point. I never claimed that everyone in the world has the same
concept. Clearly you and I don't, and I daresay the fact that
somebody at Microsoft disagreed with me certainly doesn't prove
anything about the coherence of my arguments.

I am making only the following claims:

1. That any concept of "precondition violation" that allows the callee
to guarantee a particular response to a violation is very weak and
close to useless. It's logically indistinguishable from any other
documented behavior, aside from attaching some meaningless moral
judgement to the behavior ("violated preconditions are 'bad'; other
documented behaviors are 'good'"). A technical term is much more
powerful and useful when it distinguishes one thing from another.
Okay, you could use "precondition violation" as a shorthand for
"produces the following category of guaranteed behavior," as the
"safer C" spec appears to do. That brings me to...

2. The concept of precondition violation that allows the callee to
guarantee a particular response to a violation is usually bad for
callers. Because it's logically indistinguishable from any other
documented behavior, the caller almost invariably treats those
responses in the same way as "error" returns (e.g. resource
exhaustion, file locked, etc.) that can actually happen in correct
code. That results in either haphazard "recovery" code that doesn't
actually work consistently, or a huge overhead in code that tries to
accommodate these responses-to-one's-own-bugs correctly. Even if you
do the latter you still end up with the former, most of the time.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Gerhard Menzl

Aug 30, 2005, 7:06:15 AM

Bob Bell wrote:

> I should be more specific. I interpret "the function cannot continue"
> to mean that the function shouldn't be allowed to execute a single
> instruction more, not even to throw an exception.

Not even log and assert?

> Then how about a more pragmatic definition? When a precondition fails,
> it almost always indicates a programmer error (a bug). When a bug
> occurs, the last thing you want is to unwind the stack:
>
> -- unwinding the stack destroys state that could help you track
> down the bug
> -- unwinding the stack may do more damage
> -- throwing an exception allows the bug to go unnoticed if a
> caller catches and swallows it (e.g., catch (...))
> -- throwing an exception gives a (possibly indirect) caller a
> chance to respond to the bug; typically, there isn't anything
> reasonable a caller can do to respond to a bug
>
> What you really want is to stop the program in a debugger, generate a
> core dump, or otherwise examine the state of the program at the
> instant the bug was detected. If you throw an exception, you're just
> allowing the program to continue running with a bug.

Perhaps I am too pragmatic and customer/end-user-oriented. While this is
something that is perfectly suitable for the development phase, you
don't want it to happen at the user's site. According to this
definition, every precondition is a potential source of crash (again
from a user's perspective: users don't distinguish between calls to
assert() or exit() and, say, memory access violations - they just
perceive the program crashing). It seems like an extremely pessimistic
approach to me: pull the emergency brake whenever a condition that
should hold doesn't. There is no such thing as local failure - the bug
is always assumed to be global and catastrophic. Thus, under pressure to
deliver systems that do not "crash", there would be strong motivation to
use preconditions sparingly - which would defeat the purpose of DbC.

> I don't think it is; from what I understand about Eiffel (which is
> little) the aim is to keep the program running if a contract is
> broken. But I could be wrong about that.

That would be the opposite of the when-in-doubt-pull-the-emergency-brake
approach. Considering that making software systems more reliable has been
a major driving force behind Eiffel and DbC, this difference puzzles me.

> You're right, undefined behavior, as defined by the language standard,
> is undetectable once it's occurred. It's clear from you response that
> applying the term "undefined behavior" to preconditions has been
> misleading. In the interest of clarity, I'm going to switch to
> "undefined state". Example:
>
> F() is documented to specify that it is the responsibility of all
> callers to establish condition Y. Now suppose F() is called and Y is
> false. What does this mean? All you know is that some caller failed to
> establish Y. Assuming the contract was valid and reasonable, you have
> detected a bug. (Even if the contract was invalid, you've still
> detected a bug -- only the bug is that F() demands condition Y.)
>
> In practical terms, the program has entered an undefined state -- it's
> doing something you didn't think it could do. Whether it entered the
> undefined state before Y was tested, as a result of testing Y, etc.,
> is not that important, as far as I'm concerned. What's important is
> what you do about it. If you throw an exception, you allow the program
> to continue running. But since it's entered an undefined state, you
> don't know what it will do.

I don't have - and never had - troubles understanding this reasoning.
What I am having doubts about is whether failure to meet Y automatically
means that the entire program is in an undefined state and aborting is
the only sensible reaction - unless, of course, you restrict the
definition of precondition to exactly those cases. From what I have been
able to survey, such a restrictive definition does not seem to be used
universally.

--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


Gerhard Menzl

Aug 30, 2005, 7:11:23 AM

David Abrahams wrote:

> Once again, I never said that stack corruption was a precondition
> violation. You seem to be looking hard for ways to find that what
> I've said is somehow inconsistent or incoherent. Poking holes in
> arguments I never made seems sorta pointless.
>
> It doesn't sound to me as though you're trying to understand what I'm
> saying; rather, it seems much more as though you simply don't _like_
> what I'm saying. If that's so, I'd like to stop trying to explain
> myself now. If not, I apologize in advance for even asking.

I am sorry if you got that impression. Nothing could be further from the
truth. My motivation behind participating in this newsgroup is to learn
about best practice from more experienced peers and pass on my knowledge
to the less experienced. I am really not in for
I-beat-a-Boost-guru-in-a-discussion games. My apologies if I made it
sound like I were.

Since you don't see any inconsistencies or contradictions where I do,
the problem must be communication, I guess.

> If you are merely suggesting that Microsoft's "Safer C" specification
> uses a different concept of the term "precondition," which allows a
> documented response to violations, you'll get no argument from me on
> that point. I never claimed that everyone in the world has the same
> concept. Clearly you and I don't, and I daresay the fact that
> somebody at Microsoft disagreed with me certainly doesn't prove
> anything about the coherence of my arguments.

Of course not. I am just trying to sort out who means what by
"precondition". As you said:

> A technical term is much more powerful and useful when it
> distinguishes one thing from another.

It is also more powerful and useful when there are not several slightly
different definitions being used in the industry.

By the way, I cited P. J. Plauger because of his role as an experienced
C++ Standard Library implementer, not because of his Microsoft
connection. When two experts who work in closely related fields don't
seem to use a technical term in the same way, how are ordinary
programmers supposed to agree on it?


--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


Bob Bell

Aug 30, 2005, 6:06:10 PM

Gerhard Menzl wrote:
> Bob Bell wrote:
>
> > I should be more specific. I interpret "the function cannot continue"
> > to mean that the function shouldn't be allowed to execute a single
> > instruction more, not even to throw an exception.
>
> Not even log and assert?

If I said "it's OK to log and assert", would that invalidate my point
or support yours? The point is to make as few assumptions about the
state of the system as possible, which leads to executing as little
code as possible. The problem with throwing is that it assumes that the
entire state of the system is still good, and that any and all code can
still run.

> > Then how about a more pragmatic definition? When a precondition fails,
> > it almost always indicates a programmer error (a bug). When a bug
> > occurs, the last thing you want is to unwind the stack:
> >
> > -- unwinding the stack destroys state that could help you track
> > down the bug
> > -- unwinding the stack may do more damage
> > -- throwing an exception allows the bug to go unnoticed if a
> > caller catches and swallows it (e.g., catch (...))
> > -- throwing an exception gives a (possibly indirect) caller a
> > chance to respond to the bug; typically, there isn't anything
> > reasonable a caller can do to respond to a bug
> >
> > What you really want is to stop the program in a debugger, generate a
> > core dump, or otherwise examine the state of the program at the
> > instant the bug was detected. If you throw an exception, you're just
> > allowing the program to continue running with a bug.
>
> Perhaps I am too pragmatic and customer/end-user-oriented.

Too pragmatic for a pragmatic definition? ;-)

> While this is
> something that is perfectly suitable for the development phase, you
> don't want it to happen at the user's site.

If you mean that you want to avoid crashes/ungraceful shutdowns when
end-users use the system, I agree.

> According to this
> definition, every precondition is a potential source of crash (again
> from a user's perspective: users don't distinguish between calls to
> assert() or exit() and, say, memory access violations - they just
> perceive the program crashing). It seems like an extremely pessimistic
> approach to me: pull the emergency brake whenever a condition that
> should hold doesn't.

Are you saying that it's OK to let the program continue running when
you know there is a bug, but you don't know anything about how
extensive it is? Isn't the right thing to do to fix the bug? Throwing
an exception gets in the way of fixing the bug (see the above list of
problems caused by throwing in response to a bug). How can throwing
possibly be a good idea?

> There is no such thing as local failure

Until you know differently, there isn't. One way to know that a failure
is local is to actually debug it and determine the cause. Another way
to know the failure is local is if the failure is wrapped in some kind
of firewall, like a separate address space. When all of the state of a
program is in a shared address space, a failure in one part of a
program can be caused by something entirely non-local.

> - the bug
> is always assumed to be global and catastrophic.

The alternative is to assume the bug is always local and not
catastrophic. In my experience, making this assumption causes far more
serious problems than assuming a bug is global and catastrophic.

> Thus, under pressure to
> deliver systems that do not "crash", there would be strong motivation to
> use preconditions sparingly - which would defeat the purpose of DbC.

In practice this doesn't happen (at least, in my practice; can't speak
for anyone else). Instead, liberal usage of assertions to trap
precondition violations as bugs leads to finding and fixing a lot of
bugs. Perhaps you should try it before deciding that it doesn't work.

> > I don't think it is; from what I understand about Eiffel (which is
> > little) the aim is to keep the program running if a contract is
> > broken. But I could be wrong about that.
>
> That would be the opposite of the when-in-doubt-pull-the-emergency-brake
> approach. Considering that making software systems more reliable has been
> a major driving force behind Eiffel and DbC, this difference puzzles me.

I don't know why it should. I'm not programming with Eiffel, and as far
as I know, neither are you, so why should it matter what "precondition"
means in Eiffel? Lots of terms are used differently by the two camps.
You don't seem to have trouble discussing exceptions, despite the fact
that the term means different things in the two languages.

I don't know much about Eiffel, but I don't see how letting a program
continue to run in an undefined state makes a system more reliable.

> > In practical terms, the program has entered an undefined state -- it's
> > doing something you didn't think it could do. Whether it entered the
> > undefined state before Y was tested, as a result of testing Y, etc.,
> > is not that important, as far as I'm concerned. What's important is
> > what you do about it. If you throw an exception, you allow the program
> > to continue running. But since it's entered an undefined state, you
> > don't know what it will do.
>
> I don't have - and never had - troubles understanding this reasoning.
> What I am having doubts about is whether failure to meet Y automatically
> means that the entire program is in an undefined state

What's the alternative? Saying that the program's state is partially
undefined? Or that some subset of the state is undefined, while the
remainder of the state is well-defined? That kind of fuzzy thinking is
something I don't understand. It often turns out to be wrong, and leads
to missed opportunities to fix bugs.

The point isn't "failure to meet Y automatically means that the entire
program is in an undefined state". The point is that any other
assumption is just less safe; it's safer to assume that, until proven
otherwise, the state of the entire program is in an undefined state.
Not coincidentally, the effort required to "prove otherwise" usually
involves gaining enough information to fix the bug.

One other pragmatic reason to stop the program and fix the bug the
moment the bug is detected is that you never know when the bug is going
to recur and you'll get another opportunity.

> and aborting is
> the only sensible reaction - unless, of course, you restrict the
> definition of precondition to exactly those cases. From what I have been
> able to survey, such a restrictive definition does not seem to be used
> universally.

Universally across languages? Or just within the C++ community? I'm
more interested in the term as it is used in the C++ community, and
there I see that there hasn't been much of a consensus. However, the
opinions of several experts I respect match my own intuitive
understanding, so I'm satisfied. Precondition failures indicate bugs,
and the right thing to do is fix the bug; just about the worst thing
you could do is throw an exception, since throwing an exception is
tantamount to ignoring the bug.

Bob

David Abrahams

Aug 30, 2005, 6:03:14 PM

Gerhard Menzl <gerhar...@hotmail.com> writes:

> David Abrahams wrote:
>
>> It doesn't sound to me as though you're trying to understand what I'm
>> saying; rather, it seems much more as though you simply don't _like_
>> what I'm saying. If that's so, I'd like to stop trying to explain
>> myself now. If not, I apologize in advance for even asking.
>
> I am sorry if you got that impression. Nothing could be further from the
> truth. My motivation behind participating in this newsgroup is to learn
> about best practice from more experienced peers and pass on my knowledge
> to the less experienced. I am really not in for
> I-beat-a-Boost-guru-in-a-discussion games. My apologies if I made it
> sound like I were.

Okay, thanks for clearing that up; I won't mention it again.

> As you said:
>
>> A technical term is much more powerful and useful when it
>> distinguishes one thing from another.
>
> It is also more powerful and useful when there are not several
> slightly different definitions being used in the industry.
>
> By the way, I cited P. J. Plauger because of his role as an experienced
> C++ Standard Library implementer, not because of his Microsoft
> connection. When two experts who work in closely related fields don't
> seem to use a technical term in the same way, how are ordinary
> programmers supposed to agree on it?

The best answer I can give you is:

1. I don't think that writing is based so much on Bill (P.J.)'s
_definition_ of "precondition" but on the _usage_ that was adopted
by the authors of the "safer C" specification. IMO, Bill was just
describing the system using the terminology its authors had
already established.

2. I haven't seen a _definition_ of precondition that's both
non-vacuous and consistent with that usage. I think that's an
indication that people using "precondition" that way haven't really
given rigorous thought to what it means when they use the word.

People use words casually and loosely all the time without giving a
second thought to what they mean. I've been suggesting that your
software will benefit from picking a rigorous definition for
"precondition" that clearly distinguishes preconditions from other
things.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Nicola Musatti

Aug 31, 2005, 6:41:55 AM


Bob Bell wrote:
> Gerhard Menzl wrote:
> > Bob Bell wrote:
> >
> > > I should be more specific. I interpret "the function cannot continue"
> > > to mean that the function shouldn't be allowed to execute a single
> > > instruction more, not even to throw an exception.
> >
> > Not even log and assert?
>
> If I said "it's OK to log and assert", would that invalidate my point
> or support yours? The point is to make as few assumptions about the
> state of the system as possible, which leads to executing as little
> code as possible. The problem with throwing is that it assumes that the
> entire state of the system is still good, and that any and all code can
> still run.

Excuse me, but don't you risk assuming too much in the other direction?
Consider for example a function as the following:

double safeSqrt(double arg) {
    if ( arg < 0 )
        // what goes here?
    return std::sqrt(arg);
}

Wouldn't it be a bit extreme to assume the world has ended just because
this function was passed a negative number?

On the other hand I agree that if the world has actually ended, we
wouldn't want to add damage to it. So what can we do about it? You are
probably right that exception handling is not to be trusted and it
seems to me that the least action you can take is to return a
conventional value.

Should we reach the conclusion that returning error codes is better
than exceptions for writing really robust code? ;-)

Cheers,
Nicola Musatti

Bob Bell

Aug 31, 2005, 5:09:06 PM

Nicola Musatti wrote:
> Bob Bell wrote:
> > Gerhard Menzl wrote:
> > > Bob Bell wrote:
> > >
> > > > I should be more specific. I interpret "the function cannot continue"
> > > > to mean that the function shouldn't be allowed to execute a single
> > > > instruction more, not even to throw an exception.
> > >
> > > Not even log and assert?
> >
> > If I said "it's OK to log and assert", would that invalidate my point
> > or support yours? The point is to make as few assumptions about the
> > state of the system as possible, which leads to executing as little
> > code as possible. The problem with throwing is that it assumes that the
> > entire state of the system is still good, and that any and all code can
> > still run.
>
> Excuse me, but don't you risk assuming too much in the other direction?

What's the risk? Does it outweigh the risk of shipping a buggy program?

> Consider for example a function as the following:
>
> double safeSqrt(double arg) {
>     if ( arg < 0 )
>         // what goes here?
>     return std::sqrt(arg);
> }
>
> Wouldn't it be a bit extreme to assume the world has ended just because
> this function was passed a negative number?

It may seem extreme, but making that assumption gives you an
opportunity to detect and fix a bug. Again, the alternative is to
assume that the world is OK when a negative number is passed, and that
is clearly wrong. When balancing "seems a bit extreme" against "clearly
wrong", I'll go with the "seems a bit extreme" option.

> On the other hand I agree that if the world has actually ended, we
> wouldn't want to add damage to it. So what can we do about it? You are
> probably right that exception handling is not to be trusted and it
> seems to me that the least action you can take is to return a
> conventional value.

It comes back to:

-- you have a function with a precondition that says its argument must
be non-negative or the function can't perform its job sensibly;
therefore, if a negative number is passed, there must be a bug
-- writing the function so that it returns a value when a negative
number is passed just lets a bug go unnoticed

> Should we reach the conclusion that returning error codes is better
> than exceptions for writing really robust code? ;-)

I know you're joking, but returning any value (error code or otherwise)
has the same problem that throwing an exception does; it implicitly
assumes that the world is OK and any code can still run.

Bob

David Abrahams

Aug 31, 2005, 5:05:05 PM

"Nicola Musatti" <nicola....@gmail.com> writes:

> Bob Bell wrote:
>> Gerhard Menzl wrote:
>> > Bob Bell wrote:
>> >
>> > > I should be more specific. I interpret "the function cannot continue"
>> > > to mean that the function shouldn't be allowed to execute a single
>> > > instruction more, not even to throw an exception.
>> >
>> > Not even log and assert?
>>
>> If I said "it's OK to log and assert", would that invalidate my point
>> or support yours? The point is to make as few assumptions about the
>> state of the system as possible, which leads to executing as little
>> code as possible. The problem with throwing is that it assumes that the
>> entire state of the system is still good, and that any and all code can
>> still run.
>
> Excuse me, but don't you risk assuming too much in the other direction?
> Consider for example a function as the following:
>
> double safeSqrt(double arg) {
>     if ( arg < 0 )
>         // what goes here?
>     return std::sqrt(arg);
> }
>
> Wouldn't it be a bit extreme to assume the world has ended just because
> this function was passed a negative number?

Sure. If you want to throw an exception there, just document it and
don't call arg >= 0 a precondition.

If you call it a precondition, invoking safeSqrt with a negative
number becomes a bug. Then what's "safe" about that function? All
the caller knows when an exception emanates from it is that he's got a
bug somewhere in his code.

If it's a precondition, and you want to write code that tries to take
the safeSqrt of a million numbers, you have to check each one first to
make sure it's non-negative, or you have a bug in your code. If it's
not a precondition, wrap the whole thing in a try/catch block. "It's
easier to ask forgiveness than permission" ;-)
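
A sketch of the two call-site styles; the throwing contract and the
choice of std::domain_error are mine, for illustration only:

#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Variant 1: arg >= 0 is a precondition; a negative arg is a bug.
double sqrtPre(double arg) {
    // precondition: arg >= 0 -- violating it is undefined behaviour
    return std::sqrt(arg);
}

// Variant 2: a negative arg is documented to throw; no precondition.
double sqrtThrow(double arg) {
    if (arg < 0)
        throw std::domain_error("negative argument");
    return std::sqrt(arg);
}

// With the precondition, every value must be vetted before the call:
double sumPre(const std::vector<double> &v) {
    double sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] >= 0)  // the caller establishes the precondition
            sum += sqrtPre(v[i]);
    return sum;
}

// Without it, one try/catch around the whole loop suffices:
double sumThrow(const std::vector<double> &v) {
    try {
        double sum = 0;
        for (std::size_t i = 0; i < v.size(); ++i)
            sum += sqrtThrow(v[i]);
        return sum;
    } catch (const std::domain_error &) {
        return -1;      // arbitrary sentinel for the documented failure
    }
}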

> On the other hand I agree that if the world has actually ended, we
> wouldn't want to add damage to it. So what can we do about it? You are
> probably right that exception handling is not to be trusted

That's not Bob's point at all.

> and it seems to me that the least action you can take is to return a
> conventional value.
>
> Should we reach the conclusion that returning error codes is better
> than exceptions for writing really robust code? ;-)

No. If you want your code to be robust, be clear about the difference
between preconditions and the conditions that generate error codes and
exceptions.

Some people like to avoid the word "error" in connection with the
latter category, so

_Programmer errors_ lead to precondition failures which invoke
undefined behavior

_Exceptional conditions_, such as resource allocation failures
and negative arguments to a safeSqrt that throws, are expected,
and generate exceptions or abnormal return values or
...whatever other mechanism you choose to report the condition.

Is that clearer?
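
For contrast, a minimal sketch of the precondition flavour, with the
hypothetical name sqrt_pre:

    #include <cassert>
    #include <cmath>

    // Precondition flavour: arg >= 0 is documented as a precondition,
    // so a negative argument is a programmer error. The assert traps
    // the bug in debug builds; in release builds the behavior is
    // simply undefined.
    double sqrt_pre(double arg) {
        assert(arg >= 0 && "sqrt_pre: precondition violated");
        return std::sqrt(arg);
    }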


--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Gerhard Menzl

Sep 1, 2005, 7:17:46 AM

Bob Bell wrote:

> If I said "it's OK to log and assert", would that invalidate my point
> or support yours? The point is to make as few assumptions about the
> state of the system as possible, which leads to executing as little
> code as possible. The problem with throwing is that it assumes that
> the entire state of the system is still good, and that any and all
> code can still run.

No, I am not trying to invalidate your point in any way. When I point
out what I perceive as inconsistencies I do so in order to increase
my understanding and, hopefully, achieve mutual agreement on a more
refined level.

When you say that "the function shouldn't be allowed to execute a single
instruction more", logging and asserting would be impossible. "Executing
as little code as possible", on the other hand, sounds reasonable to me
and eliminates the contradiction. I still cannot reconcile this
guideline with Dave's point that unwinding the stack is (almost always)
wrong, but starting a separate undo/recovery mechanism isn't. This may
be due to misunderstanding.

> If you mean that you want to avoid crashes/ungraceful shutdowns when
> end-users use the system, I agree.

That is what I am concerned about, and it's not just because of trying
to be nice to users. The software I am currently working on is part of a
larger system that has an abysmal record. The team was recently
confronted with an ultimatum set by the customer along the lines of:
you've got six more weeks until the final test; if there's one single
crash (and from a customer's view, this includes assertions and the
like), you're out - the end of a contract of several million dollars as
well as dozens of jobs. From this perspective, one cannot help eyeing
statements like "terminating the program is good because it helps
debugging" with a certain reserve. You don't have to tell me that there
has to be something seriously wrong with the development process to get
into a situation like this in the first place, but unfortunately,
development in the real world does not always take the perfect (or even
reasonably sound) process path.

> In practice this doesn't happen (at least, in my practice; can't speak
> for anyone else). Instead, liberal usage of assertions to trap
> precondition violations as bugs leads to finding and fixing a lot of
> bugs. Perhaps you should try it before deciding that it doesn't work.

I am sorry if I should have created the impression that I have decided
the approach doesn't work. I *do* make liberal use of assertions. But
you have to take into account that the more liberally you use
assertions, the more likely it is that you err on the other side, i.e.
that you designate a condition as a result of a bug when in reality it
is a possible, if exotic program state.

> I don't know why it should. I'm not programming with Eiffel, and as
> far as I know, neither are you, so why should it matter what
> "precondition" means in Eiffel? Lots of terms are used differently by
> the two camps. You don't seem to have trouble discussing exceptions,
> despite the fact that the term means different things in the two
> languages.

The answer to this is easy: because, to the best of my knowledge, the
concept of Design by Contract originated in and is most fervently
advocated by the Eiffel camp. Although it is supported directly in
Eiffel, it is abstract enough not to be tied to that language. There may
be differences in implementation, but the fundamentals should be the
same. If, however, there is a sound argument why preconditions should be
defined and/or handled differently in C++, I would like to see it.

I am well aware that certain technical terms mean different things in
different parts of the software engineering community, but I also think
that redefining terms gratuitously should be avoided.

> What's the alternative? Saying that the program's state is partially
> undefined? Or that some subset of the state is undefined, while the
> remainder of the state is well-defined? That kind of fuzzy thinking is
> something I don't understand. It often turns out to be wrong, and
> leads to missed opportunities to fix bugs.

How about an example? Suppose you have a telephony application with a
phone book. The phone book module uses std::binary_search on a
std::vector. A precondition for this algorithm is that the range be
sorted. A bug causes an unsorted range to be passed. Leaving aside the
fact that detecting the violation of this precondition may be a bit
costly, how would you expect the application to react? Abort and thus
disconnect the call in progress although it's only the phone book that
is broken? Exhibit undefined behaviour, such as (hopefully not more
than) displaying garbage? Notify the user of the problem but let him
finish the call? Would the latter cause the precondition to cease to be a
precondition? This is not meant to be polemic; I am genuinely interested.

As for fuzzy thinking, "something's amiss somewhere" sounds more fuzzy
to me than "something's amiss in this module/function". Sure, a bug can
surface at a point far from its source: writing to arbitrary locations
in memory is an example. But is it feasible always to react as if this
were the case, although in the majority of cases the cause is probably
to be found locally?

> One other pragmatic reason to stop the program and fix the bug the
> moment the bug is detected is that you never know when the bug is
> going to recur and you'll get another opportunity.

What kind of scenario do you have in mind? If your program aborts at a
remote user site, no immediate fixing is going to take place. I fully
agree that masking bugs and just plodding on is bad practice, I just
doubt that aborting is the only means of preventing it.

> Precondition failures indicate bugs, and the right thing to do is fix
> the bug; just about the worst thing you could do is throw an
> exception, since throwing an exception is tantamount to ignoring the
> bug.

Why do you equate throwing exceptions with ignoring bugs? In my
application, the top level exception handler tries to write as much
information as possible to a log file. It then informs the user that a
serious error has happened, that the application may be in a shaky state
and had better be terminated, where the log file is, and that an
administrator should be called to collect the information and forward it
to the vendor. Admittedly, the user could choose to ignore the notice
and carry on, but then he could also restart an aborted application and
carry on. Or are you concerned about sloppy exception handling practices
in larger teams?


--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


Nicola Musatti

Sep 2, 2005, 7:03:08 AM

David Abrahams wrote:
[...]

> _Programmer errors_ lead to precondition failures which invoke
> undefined behavior

Ok. What you are saying is you don't know where you are so your best
option is to give up immediately, lest you cause additional damage,
correct?

I see two issues that arise from this point of view: how to implement
"giving up" and what kind of recovery can be performed.

Ideally one would want to collect as much information as possible on
what went wrong for diagnostic purposes. On Unix systems generating a
core dump is a convenient option, on other systems it might not be so
easy. On the system I work on I don't have core dumps, but my debugger
breaks on throws, so at least in development builds I implement
assertions by throwing exceptions.

As far as recovery is concerned I agree that the current module/process
cannot be trusted anymore and recovery should take place at a different
level: by some monitoring process, by automatic reboot, by requiring
human intervention or whatever.

Cheers,
Nicola Musatti

David Abrahams

Sep 2, 2005, 7:10:37 AM

Gerhard Menzl <gerhar...@hotmail.com> writes:

> Bob Bell wrote:
>
> > ...something...


>
> When you say that "the function shouldn't be allowed to execute a single
> instruction more", logging and asserting would be impossible. "Executing
> as little code as possible", on the other hand, sounds reasonable to me
> and eliminates the contradiction.

I've only ever said the latter, FWIW.

> I still cannot reconcile this guideline with Dave's point that
> unwinding the stack is (almost always) wrong, but starting a
> separate undo/recovery mechanism isn't. This may be due to
> misunderstanding.

If you find my point to be in contradiction with the goal of
"executing as little code as possible," then there probably has been a
misunderstanding. At the point you detect a violated precondition,
there can usually only be partial recovery. You should do just enough
work to avoid total catastrophe, if you can. At the point you detect
a violated precondition, you don't have any way to ensure that the
actions taken by stack unwinding will be minimal. On the other hand,
if you have a separate mechanism for registering critical recovery
actions -- and an agreement among components to use it -- you can
invoke that, and avoid any noncritical actions.
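
One possible shape for such a registration mechanism, sketched with
hypothetical names and making no claim about any particular project's
implementation:

    #include <cstddef>
    #include <cstdlib>
    #include <vector>

    // Registry for critical recovery actions, as an alternative to
    // stack unwinding: components register only the actions that must
    // run during an emergency shutdown; noncritical cleanup is
    // deliberately skipped.
    class recovery_registry {
    public:
        typedef void (*action)();

        static recovery_registry& instance() {
            static recovery_registry r;
            return r;
        }

        void add(action a) { actions_.push_back(a); }

        // Called when a precondition violation is detected: run the
        // critical actions, then terminate without unwinding.
        void emergency_shutdown() {
            for (std::size_t i = 0; i < actions_.size(); ++i)
                actions_[i]();
            std::abort();
        }

    private:
        std::vector<action> actions_;
    };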

The other reason that unwinding is almost always wrong is that it is
very prone to losing the information that a bug was detected, and
allowing execution to proceed as though full recovery has occurred.
All it takes is passing through a layer like this one:

try
{
    ...something that detects a precondition violation...
}
catch(e1& x)
{
    translate_or_report(x);
}
catch(e2& x)
{
    translate_or_report(x);
}
...
catch(...)
{
    translate_or_report_unknown_error();
}

which often occurs at subsystem boundaries.

>> If you mean that you want to avoid crashes/ungraceful shutdowns when
>> end-users use the system, I agree.
>
> That is what I am concerned about, and it's not just because of
> trying to be nice to users. The software I am currently working on
> is part of a larger system that has an abysmal record. The team was
> recently confronted with an ultimatum set by the customer along the
> lines of: you've got six more weeks until the final test; if there's
> one single crash (and from a customer's view, this includes
> assertions and the like), you're out - the end of a contract of
> several million dollars as well as dozens of jobs. From this
> perspective, one cannot help eyeing statements like "terminating the
> program is good because it helps debugging" with a certain
> reserve.

Understandable.

Your best option, if you have time for it -- and if a clean emergency
shutdown will not be interpreted as a crash -- is to institute a
recovery subsystem for critical things that must happen during
emergency shutdown. In non-shipping code, asserts should immediately
invoke the debugger, and then invoke emergency recovery and shutdown.
In shipping code, obviously, there's no debugger.

If you can't do that, then you may have to resort to using the
exception mechanism in order to protect your jobs. However, in
non-shipping code, asserts should *still* invoke the debugger
immediately, and you should take care that you don't confuse unwinding
from a precondition violation with "recovery." Your program is in a
bad state, and if you continue after unwinding, you're doing so "on a
wing and a prayer." Cross your fingers, get out your mojo hand, and
light your voodoo candles.

Good Luck.

>> In practice this doesn't happen (at least, in my practice; can't speak
>> for anyone else). Instead, liberal usage of assertions to trap
>> precondition violations as bugs leads to finding and fixing a lot of
>> bugs. Perhaps you should try it before deciding that it doesn't work.
>
> I am sorry if I should have created the impression that I have decided
> the approach doesn't work. I *do* make liberal use of assertions. But
> you have to take into account that the more liberally you use
> assertions, the more likely it is that you err on the other side, i.e.
> that you designate a condition as a result of a bug when in reality it
> is a possible, if exotic program state.

That can only happen if you assert some condition that isn't in the
called function's set of documented preconditions. If the assertion
matches the function's documentation, then it *is* catching a bug.

>> I don't know why it should. I'm not programming with Eiffel, and as
>> far as I know, neither are you, so why should it matter what
>> "precondition" means in Eiffel? Lots of terms are used differently by
>> the two camps. You don't seem to have trouble discussing exceptions,
>> despite the fact that the term means different things in the two
>> languages.
>
> The answer to this is easy: because, to the best of my knowledge,
> the concept of Design by Contract originated in and is most
> fervently advocated by the Eiffel camp. Althogh it is being
> supported directly in Eiffel, it is abstract enough not to be tied
> to that language. There may be differences in implementation, but
> the fundamentals should be the same. If, however, there is a sound
> argument why preconditions should be defined and/or handled
> differently in C++, I would like to see it.

As far as I can tell, the Eiffel camp has a similar understanding.
Because the throw-in-response-to-precondition-violation behavior can
be turned on and off globally, you basically can't count on it. Think
of it as one possible expression of undefined behavior. In some
languages, throwing an exception is basically the only way to get a
debuggable stack trace. If that's the case in Eiffel, it would
explain why they have the option to throw: it's as close as possible
to invoking the debugger (perhaps it even does so).

I should also point out that there's some variation among languages
(and even among C++ compilers) in _when_ stack unwinding actually
occurs. For example, in C++, if the exception is never caught, there
may not ever be any unwinding (it's up to the implementation). In
Python, no unwinding happens until the exception backtrace is
_explicitly_ discarded or the next exception is thrown. I don't know
about the details of Eiffel's exception mechanism, but all of these
variations can have a major impact on the danger of throwing in
response to a precondition violation. In other words, you may have to
look a lot deeper to understand the proper relationship of Eiffel to
C++.

> I am well aware that certain technical terms mean different things
> in different parts of the software engineering community, but I also
> think that redefining terms gratuitously should be avoided.

Absolutely. But I don't think there are as many different definitions
as you seem to think there are. Have you found *any* definitions of
"precondition" other than the Wikipedia one? I'm not talking about
meanings of the word you infer from seeing it used in context. I'm
talking about _definitions_.


Also: read the section called "Run-time Assertion Monitoring" at
docs.eiffel.com/eiffelstudio/general/guided_tour/language/tutorial-09.html
AFAICT, that is in nearly perfect agreement with everything I've been
saying.

>> What's the alternative? Saying that the program's state is partially
>> undefined? Or that some subset of the state is undefined, while the
>> remainder of the state is well-defined? That kind of fuzzy thinking is
>> something I don't understand. It often turns out to be wrong, and
>> leads to missed opportunities to fix bugs.
>
> How about an example? Suppose you have a telephony application with
> a phone book. The phone book module uses std::binary_search on a
> std::vector. A precondition for this algorithm is that the range be
> sorted. A bug causes an unsorted range to be passed. Leaving aside
> the fact that detecting the violation of this precondition may be a
> bit costly, how would you expect the application to react?

------reactions-------


> Abort and thus disconnect the call in progress although it's only
> the phone book that is broken?

you don't know that ;-)

> Exhibit undefined behaviour, such as (hopefully not more than)
> displaying garbage? Notify the user of the problem but let him
> finish the call?

-------reactions--------

More on this in a moment.

> Would the latter cause the precondition cease to be a precondition?

I would just say yes, but that would be slightly too simple an answer.

First of all, that the range is sorted is always a precondition of
std::binary_search. Nothing you can do can ever change that, since
you are not the author of std::binary_search or, more importantly, of
its specification. If you write some function phone_book that accepts
a range and calls binary_search, you have two choices: either make
sortedness a precondition of phone_book, or do something in phone_book
to detect non-sortedness, and document the well-defined behavior
phone_book will give in response to that condition. If you make
sortedness a precondition, you are allowed to skip detecting
non-sortedness, or you can detect it and take whatever voodoo action
you think gives you the least chance of getting fired... just so long
as you remember you're in voodoo land now.
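
Expressed in code, the two choices for a hypothetical phone_book lookup
might look like this (is_sorted_range is hand-rolled here; std::is_sorted
does not arrive until C++11):

    #include <algorithm>
    #include <cassert>
    #include <cstddef>
    #include <stdexcept>
    #include <string>
    #include <vector>

    bool is_sorted_range(const std::vector<std::string>& v) {
        for (std::size_t i = 1; i < v.size(); ++i)
            if (v[i] < v[i - 1])
                return false;
        return true;
    }

    // Choice 1: sortedness is a precondition. The assert documents it
    // and traps violations in debug builds; the O(n) cost of the check
    // is one reason it may be compiled out of shipping code.
    bool lookup_pre(const std::vector<std::string>& book,
                    const std::string& name) {
        assert(is_sorted_range(book));
        return std::binary_search(book.begin(), book.end(), name);
    }

    // Choice 2: non-sortedness is detected and given documented,
    // well-defined behavior, so it is no longer a precondition.
    bool lookup_checked(const std::vector<std::string>& book,
                        const std::string& name) {
        if (!is_sorted_range(book))
            throw std::invalid_argument("lookup_checked: book not sorted");
        return std::binary_search(book.begin(), book.end(), name);
    }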

On to the reactions. Which reaction is most appropriate for that
particular application is outside my domain of expertise. I can tell
you what my personal expectations are, but I'm not sure if that is
much help. Of course I would not expect the condition to be violated
in the first place, because it's the sort of condition that can
usually be easily guaranteed with a little bit of care and logical
deduction. I therefore wouldn't expect the application to insert
explicit checks for it, so if the condition were somehow violated, I'd
expect the application to do something crazy like displaying garbage
or looping infinitely, etc.

> This is not meant to be polemic; I am genuinely interested.
>
> As for fuzzy thinking, "something's amiss somewhere" sounds more fuzzy
> to me than "something's amiss in this module/function".

Yes, but you don't know that the latter is true. Thinking the latter
is true when you only know the former is more fuzzy thinking.

>> One other pragmatic reason to stop the program and fix the bug the
>> moment the bug is detected is that you never know when the bug is
>> going to recur and you'll get another opportunity.
>
> What kind of scenario do you have in mind? If your program aborts at a
> remote user site, no immediate fixing is going to take place. I fully
> agree that masking bugs and just plodding on is bad practice, I just
> doubt that aborting is the only means of preventing it.
>
>> Precondition failures indicate bugs, and the right thing to do is fix
>> the bug; just about the worst thing you could do is throw an
>> exception, since throwing an exception is tantamount to ignoring the
>> bug.
>
> Why do you equate throwing exceptions with ignoring bugs?

For what it's worth, _I_ don't equate those. However, throwing an
exception can easily lead to ignoring a bug as I demonstrated above.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


David Abrahams

unread,
Sep 2, 2005, 10:39:00 AM9/2/05
to
"Nicola Musatti" <nicola....@gmail.com> writes:

> David Abrahams wrote:
> [...]
>> _Programmer errors_ lead to precondition failures which invoke
>> undefined behavior
>
> Ok. What you are saying is you don't know where you are so your best
> option is give up immediately, lest you cause additional damage,
> correct?

No, at least not in that sentence I'm not. All I'm saying there is
what I wrote and not anything about how to respond.

Stopping quickly and as gracefully as possible is usually the best
option. Sometimes you can't afford that, though. For example, if
your program is running the stage lights at a rock concert, you don't
want them to stop flashing. That would be weird. However, you ought
to be thinking about alternatives, like, "can I reboot the system
quickly enough?" And if you're writing critical systems like life
support you ought to be thinking about having backup hardware in place
that can take over while you shut this hardware down.

> I see two issues that arise from this point of view: how to implement
> "giving up" and what kind of recovery can be performed.
>
> Ideally one would want to collect as much information as possible on
> what went wrong for diagnostic purposes. On Unix systems generating a
> core dump is a convenient option, on other systems it might not be so
> easy. On the system I work on

Which one, please?

> I don't have core dumps, but my debugger
> breaks on throws,

Unconditionally? That could severely impair debuggability of some
kinds of legitimate code.

> so at least in development builds I implement assertions by throwing
> exceptions.

If that's your only option, it's your only option. What can I say?
It might be better to invoke the debugger directly, if it's possible,
though.
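
As an illustration, an assertion that tries to stop in the debugger
rather than throw might be sketched as follows; assert_break is a
hypothetical name, and __builtin_trap is a GCC/Clang intrinsic (MSVC
has __debugbreak()):

    #include <cstdio>
    #include <cstdlib>

    // Stops at the point of failure: if a debugger is attached, the
    // trap instruction breaks right here, with the full stack intact.
    void assert_break(bool cond, char const* what) {
        if (!cond) {
            std::fprintf(stderr, "assertion failed: %s\n", what);
    #if defined(__GNUC__)
            __builtin_trap();
    #else
            std::abort();
    #endif
        }
    }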

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Bob Bell

Sep 2, 2005, 12:39:53 PM

Gerhard Menzl wrote:
> Bob Bell wrote:
> No, I am not trying to invalidate your point in any way. When I point
> out what I perceive as inconsistencies I do so in order to increase
> my understanding and, hopefully, achieve mutual agreement on a more
> refined level.

That's entirely reasonable; sorry if I seemed a little touchy.

> When you say that "the function shouldn't be allowed to execute a single
> instruction more", logging and asserting would be impossible.

Well, maybe I overstated things a bit there.

> "Executing
> as little code as possible", on the other hand, sounds reasonable to me
> and eliminates the contradiction. I still cannot reconcile this
> guideline with Dave's point that unwinding the stack is (almost always)
> wrong, but starting a separate undo/recovery mechanism isn't. This may
> be due to misunderstanding.

To me, "execute as little code as possible" and "don't throw an
exception" are completely consistent with each other, due to the fact
that "throw an exception" is equivalent to "allow any code to run." A
separate recovery mechanism is OK because it can be constrained to
"execute as little code as possible" before the system shuts down. (As
for basing this recovery mechanism on an undo mechanism, I don't have a
strong opinion.)

> > If you mean that you want to avoid crashes/ungraceful shutdowns when
> > end-users use the system, I agree.
>
> That is what I am concerned about, and it's not just because of trying
> to be nice to users. The software I am currently working on is part of a
> larger system that has an abysmal record. The team was recently
> confronted with an ultimatum set by the customer along the lines of:
> you've got six more weeks until the final test; if there's one single
> crash (and from a customer's view, this includes assertions and the
> like), you're out - the end of a contract of several million dollars as
> well as dozens of jobs. From this perspective, one cannot help eyeing
> statements like "terminating the program is good because it helps
> debugging" with a certain reserve. You don't have to tell me that there
> has to be something seriously wrong with the development process to get
> into a situation like this in the first place, but unfortunately,
> development in the real world does not always take the perfect (or even
> reasonably sound) process path.

Ouch. I sympathize; sometimes what we should do is not what we're
allowed to do. I've been trying to keep the discussion on the level of
what we should do.

If I was working for an employer that said "one more crash and you're
fired," I'd probably be bending over backward to make sure that, no
matter what, the system didn't crash, despite the fact that I would
likely do nothing to make the system more stable in any real sense.

> > Perhaps you should try it before deciding that it doesn't work.
>
> I am sorry if I should have created the impression that I have decided
> the approach doesn't work.

I shouldn't have put that last sentence in, I was out of line.

> I *do* make liberal use of assertions. But
> you have to take into account that the more liberally you use
> assertions, the more likely it is that you err on the other side, i.e.
> that you designate a condition as a result of a bug when in reality it
> is a possible, if exotic program state.

There are always risks of mistakes no matter what you do, so I don't
disagree with this point. But I weigh it against the alternative, and
find the alternative worse; the risk of not detecting bugs by leaving
out assertions is worse to me than the risk of mistaking a tolerable
condition as a bug.

Keep in mind also that the more assertions there are, the more likely
you are to find a bug close to its cause.

In any case, if an assertion fires on a condition that should be
tolerated, you've still detected a bug -- the incorrect assertion. ;-)

> I am well aware that certain technical terms mean different things in
> different parts of the software engineering community, but I also think
> that redefining terms gratuitously should be avoided.

Sure, but gratuitous similarities should be avoided as well. The notion
of "class" is significantly different in C++ than it is in, say, CLOS,
but that doesn't prevent the C++ community from using the term
productively. I'm not too concerned with (what I think are) minor
differences between the term "precondition" in C++ and Eiffel. The
important thing (like with "class") is to come up with a definition
that works for C++.

> How about an example? Suppose you have a telephony application with a
> phone book. The phone book module uses std::binary_search on a
> std::vector. A precondition for this algorithm is that the range be
> sorted. A bug causes an unsorted range to be passed. Leaving aside the
> fact that detecting the violation of this precondition may be a bit
> costly, how would you expect the application to react? Abort and thus
> disconnect the call in progress although it's only the phone book that
> is broken?

The phrase "although it's only the phone book that is broken" adds an
unjustified bias to the question: how do you know that only the phone
book is broken? Until you debug it, you're just hoping. I would "abort
and thus disconnect the call in progress" and then attempt to fix the
bug.

I typically turn assertions off when a program is delivered, so this is
unlikely to happen to an end-user. What will happen, though, I cannot
predict. Maybe his call will work. Or maybe he'll be billed at ten
times his normal rate.

> Notify the user of the problem but let him
> finish the call? Would the latter cause the precondition cease to be a
> precondition?

It would cease to be useful to call "vector must be sorted" a
precondition.

If "precondition" means "a condition which must be true when a function
is called in order for a function to work", then any condition which
fails but still allows the function to work is not a precondition. (I'm
including any normal return or a thrown exception in "work".)

Roughly speaking, I can partition "conditions before a function is
called" into three groups:

1) conditions which allow a function to "succeed"
2) conditions which lead to a function "not succeeding"
3) conditions which are the result of programmer errors

The important thing is that a system must gracefully tolerate 1) and
2); there is no requirement (in general, there can't be) for a system
to gracefully tolerate 3). We can reason about the system as long as we
only have 1) and 2); we can draw conclusions about its correctness, and
make predictions about what it will do and what new states it will
enter. With 3), we cannot reason reliably about the system, and cannot
predict what it will do.

Thus, I think it's very important to distinguish 1) and 2) from 3).

Do you agree that this is a useful distinction to make (regardless of
the term used to label that distinction)?

When it turns out that the phone book vector is not sorted, we have 3).
Trying to continue running is treating it like 1) or 2).
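
A small illustration of the three groups in a single function; open_log
and its behavior are hypothetical:

    #include <cassert>
    #include <cstdio>
    #include <stdexcept>

    std::FILE* open_log(const char* path) {
        assert(path != 0);            // (3) a null path is a caller bug
        std::FILE* f = std::fopen(path, "a");
        if (!f)                       // (2) expected failure mode, part
            throw std::runtime_error( //     of the defined behavior
                "open_log: cannot open file");
        return f;                     // (1) success
    }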

> As for fuzzy thinking, "something's amiss somewhere" sounds more fuzzy
> to me than "something's amiss in this module/function". Sure, a bug can
> surface at a point far from its source: writing to arbitrary locations
> in memory is an example. But is it feasible always to react as if this
> were the case, although in the majority of cases the cause is probably
> to be found locally?

In my experience, yes. Most of the time, you're right; the bug is
local, and usually quite simple. Sometimes, the bug is quite nasty and
takes a bit longer; sometimes, the problem is not local at all. In any
case, I stop the program and debug it.

I don't really have a problem with deciding a priori that a bug is
local or far-reaching in scope. What I have a problem with is starting
with "the bug is probably local" as a premise and concluding "we can
probably just let the program keep running for a while."

> > One other pragmatic reason to stop the program and fix the bug the
> > moment the bug is detected is that you never know when the bug is
> > going to recur and you'll get another opportunity.
>
> What kind of scenario do you have in mind?

Any bug that is difficult to reproduce. For example, a memory
corruption bug; these often appear intermittently, and can be quite
difficult to track down. If such a bug is detected, it's better to stop
now and fix it, because who knows when it will recur again?

> If your program aborts at a
> remote user site, no immediate fixing is going to take place.

True, but what's the alternative?

> I fully
> agree that masking bugs and just plodding on is bad practice, I just
> doubt that aborting is the only means of preventing it.

There are several means of preventing bugs, and they all complement
each other. Assertions are just one part of the process. Assertions
combined with rigorous testing will uncover a _lot_ of bugs. (Even
assertions plus minimal testing is better than nothing.)

Not aborting when a bug is detected makes it much harder to fix.

Suppose you modify the program such that it can tolerate the bug (e.g.,
you throw an exception when the bug is detected, and some caller
responds to the exception by successfully isolating the affected parts
of the system, perhaps reinitializing them). Is the condition really a
bug anymore? I don't think so; the condition, and the system's
response, becomes part of the well-defined behavior of the system. If
this is the case, it doesn't make sense to call the condition a
precondition anymore.

> > Precondition failures indicate bugs, and the right thing to do is fix
> > the bug; just about the worst thing you could do is throw an
> > exception, since throwing an exception is tantamount to ignoring the
> > bug.
>
> Why do you equate throwing exceptions with ignoring bugs?

Because it allows the system to continue running.

> In my
> application, the top level exception handler tries to write as much
> information as possible to a log file. It then informs the user that a
> serious error has happened, that the application may be in a shaky state
> and had better be terminated, where the log file is, and that an
> administrator should be called to collect the information and forward it
> to the vendor. Admittedly, the user could choose to ignore the notice

> and carry on. but then he could also restart an aborted application adn


> carry on. Or are you concerned about sloppy exception handling practices
> in larger teams?

Sloppy exception handling definitely makes things worse, but even
excellent exception handling can allow the system to continue.

In your scenario, the bug certainly isn't ignored by the user. From the
point of view of the code, however, it essentially was ignored, because
after doing all the logging and so forth, the program goes back to
doing what it did before.

If you throw an exception, it's possible for any code path in the
system to be executed, even though you know at least one path is
broken, and you don't know how extensive the damage is.

Long-windedly yours,

Bob

Gerhard Menzl

Sep 8, 2005, 10:05:17 AM

Bob Bell wrote:

> There are always risks of mistakes no matter what you do, so I don't
> disagree with this point. But I weigh it against the alternative, and
> find the alternative worse; the risk of not detecting bugs by leaving
> out assertions is worse to me than the risk of mistaking a tolerable
> condition as a bug.
>
> Keep in mind also that the more assertions there are, the more likely
> you are to find a bug close to its cause.
>
> In any case, if an assertion fires on a condition that should be
> tolerated, you've still detected a bug -- the incorrect assertion. ;-)

No qualms about this in non-shipping code (see also my response to David
in this and all other regards). Spurious assertions at the customer's
site can do a lot of damage, though.

> The phrase "although it's only the phone book that is broken" adds an
> unjustified bias to the question: how do you know that only the phone
> book is broken? Until you debug it, you're just hoping. I would "abort
> and thus disconnect the call in progress" and then attempt to fix the
> bug.
>
> I typically turn assertions off when a program is delivered, so this
> is unlikely to happen to an end-user. What will happen, though, I
> cannot predict. Maybe his call will work. Or maybe he'll be billed at
> ten times his normal rate.

If you had stated this from the beginning, we could have saved quite
some effort and bandwidth. :-) My concerns have always been related to
what happens at the user's site. Interestingly though, the practice of
turning off assertions in shipping code has been condemned here many times.

> If "precondition" means "a condition which must be true when a
> function is called in order for a function to work", then any
> condition which fails but still allows the function to work is not a
> precondition. (I'm including any normal return or a thrown exception
> in "work".)
>
> Roughly speaking, I can partition "conditions before a function is
> called" into three groups:
>
> 1) conditions which allow a function to "succeed"
> 2) conditions which lead to a function "not succeeding"
> 3) conditions which are the result of programmer errors
>
> The important thing is that a system must gracefully tolerate 1) and
> 2); there is no requirement (in general, there can't be) for a system
> to gracefully tolerate 3). We can reason about the system as long as
> we only have 1) and 2); we can draw conclusions about its correctness,
> and make predictions about what it will do and what new states it will
> enter. With 3), we cannot reason reliably about the system, and cannot
> predict what it will do.
>
> Thus, I think it's very important to distinguish 1) and 2) from 3).
>
> Do you agree that this is a useful distinction to make (regardless of
> the term used to label that distinction)?

Absolutely. The dispute has always been about how to handle situations
of type 3.

> In my experience, yes. Most of the time, you're right; the bug is
> local, and usually quite simple. Sometimes, the bug is quite nasty and
> takes a bit longer; sometimes, the problem is not local at all. In any
> case, I stop the program and debug it.

Under lab conditions, you can. In the field, you can't, and the
trade-offs are often different.

> I don't really have a problem with deciding a priori that a bug is
> local or far-reaching in scope. What I have a problem with is starting
> with "the bug is probably local" as a premise and concluding "we can
> probably just let the program keep running for a while."

This is not and has never been my premise. Exceptions may be *abused* to
make a program behave like this, but I contest the argument that this is
in their nature. It depends on your exception handling concept. If you
don't have a good concept, your program will be hard to debug and maintain.

>>If your program aborts at a
>>remote user site, no immediate fixing is going to take place.
>
> True, but what's the alternative?

Trying to log as much as possible, notifying the user, and giving him
the chance to shut down himself and remain in charge of the situation,
as opposed to humiliating the user by pretending that a myopic piece of
code is a better judge. Yes, I am being polemic here. I am aware that
sometimes a piece of code *is* a better judge; it's just that I see a
tendency among technical people to assume this is the case in general.

> There are several means of preventing bugs, and they all complement
> each other. Assertions are just one part of the process. Assertions
> combined with rigorous testing will uncover a _lot_ of bugs. (Even
> assertions plus minimal testing is better than nothing.)
>
> Not aborting when a bug is detected makes it much harder to fix.

Under lab conditions, sure.

> Sloppy exception handling definitely makes things worse, but even
> excellent exception handling can allow the system to continue.
>
> In your scenario, the bug certainly isn't ignored by the user. From
> the point of view of the code, however, it essentially was ignored,
> because after doing all the logging and so forth, the program goes
> back to doing what it did before.
>
> If you throw an exception, it's possible for any code path in the
> system to be executed, even though you know at least one path is
> broken, and you don't know how extensive the damage is.

That depends a lot on the type of application and how you handle
exceptions. Also note that if you turn assertions off in shipping code,
your program doesn't just go back to doing what it did before, it goes
on following the broken path willingly! Surely this is worse than
aborting the broken path and taking the bet that at least backing away
from the bug zone works.


--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


Gerhard Menzl

Sep 8, 2005, 10:08:16 AM

David Abrahams wrote:

> If you find my point to be in contradiction with the goal of
> "executing as little code as possible," then there probably has been a
> misunderstanding. At the point you detect a violated precondition,
> there can usually only be partial recovery. You should do just enough
> work to avoid total catastrophe, if you can. At the point you detect
> a violated precondition, you don't have any way to ensure that the
> actions taken by stack unwinding will be minimal. On the other hand,
> if you have a separate mechanism for registering critical recovery
> actions -- and an agreement among components to use it -- you can
> invoke that, and avoid any noncritical actions.

I think one reason why I am having difficulties coming to terms with
your position (and why this discussion keeps on going) is that, to me,
the distinction between what *has already happened* when a violated
precondition is detected and what *is going to happen* when the function
continues nevertheless is still somewhat blurred.

There is no dispute that continuing would cause hell to break loose.
After all, that's why the author of the function specified the
precondition in the first place. However, I cannot agree with the
general assumption that hell *has already broken loose* at the point of
detection, and that nothing and nobody can be trusted anymore. Sure, it
means that there is a bug that prevents the function from achieving its
goal. But does it also mean that the same bug will interfere with the
operations performed during unwinding? I think I know what your answer
to this is going to be:

> you don't know that ;-)

Do I, or better: does the original function have to? If another
operation performed during unwinding relies on the same condition,
surely it will have specified the same precondition, and if it doesn't,
it should not be affected. Admittedly, this raises the thorny issue of
exceptions thrown during unwinding, but if an exception should really
leave a destructor as an effect of a situation like this, terminate
would be called, which conforms to what you advocate anyway.

Precondition specifications aren't normally about complex global states,
they demand that certain local conditions of limited scope be met. They
don't say "the stack is uncorrupted", they say: "this particular vector
must be sorted". If it isn't, it's usually because the author of the
client forgot to sort the vector, or called another function after the
sort that push_backs an element. In well-designed programs that exhibit
a high degree of encapsulation and low coupling, this should never
affect code that doesn't rely on the sorting of the vector - unless the
violation is a mere effect of a greater mess, like a buffer overrun. But
in that case, starting a separate recovery mechanism is acting "on a
wing and a prayer" as well. Ultimately, your bet is: the precondition
does not hold, great evil is afoot, I will just manage to perform a
number of critical operations, but performing the non-critical ones
would awaken more evil things, hence I skip them and bail out.

Now I will readily agree that there is a wide range of applications for
which this is just the right bet. With many types of programs, it will
be even better not to try any recovery at all. I am not convinced,
however, that this should be a general guideline, regardless of the
concrete problem domain and application type. You have brought up the
stage lighting example yourself - perhaps we just differ on the question
how rare or how frequent these applications are. And if I remember
correctly, it was a statement of mine along the lines of having
different strategies for different applications that started the whole
discussion.

> The other reason that unwinding is almost always wrong is that it is
> very prone to losing the information that a bug was detected, and
> allowing execution to proceed as though full recovery has occurred.
> All it takes is passing through a layer like this one:
>
> try
> {
>     ...something that detects a precondition violation...
> }
> catch(e1& x)
> {
>     translate_or_report(x);
> }
> catch(e2& x)
> {
>     translate_or_report(x);
> }
> ...
> catch(...)
> {
>     translate_or_report_unknown_error();
> }
>
> which often occurs at subsystem boundaries.

I fully agree that ignoring errors and masking bugs is a bad thing and a
reason for concern. But what you are saying here is that because
exceptions might be suppressed or handled in a light-hearted way the
should not be thrown in the first place. In other words, a function that
detects a precondition violation cannot trust a top-level component. How
then can it trust the assertion/termination mechanism? After all, it
might have been defined like this:

void do_assert (char const* expr,
                char const* file,
                unsigned int line)
{
    write_log (expr, file, line); // don't feel like terminating
}

I don't think that the mere theoretical possibility of a component
screwing up justifies not giving it a chance. The way I understand
Design by Contract, it's a methodology that pervades the entire program.
It's not a local measure. Abstracting from the other reasons you have
brought forward against throwing exceptions on detecting precondition
violations, the handler would have to look like:

try
{
    // something that detects a precondition violation
}
catch (precondition_violation& pv)
{
    // do whatever is appropriate for the type of application:
    // log, display a message, abort, whatever
}

If someone gets this wrong, they are likely to get the separate
violation handler wrong as well.

This also raises the question at what level it is appropriate to decide
how to react to a violated precondition. If the proper reaction depends
on the type of application (and by bringing up your stage lighting
example you admit that it does), the decision can only be taken at a
higher level, at any rate not in a general purpose function that isn't
aware of the type of application it resides in. Otherwise, it would not
even be possible for the stage lighting controller to carry on and
project random colours.

Talking of application-independent, low-level code, how are (especially
third-party) libraries supposed to handle a violated precondition? Note
that I am referring to the actual implementation here, not the interface
documentation. You can't throw an exception, because you would have to
document it, and then there wouldn't be a precondition anymore. assert()
or terminate()? Carry on and let the client taste the full consequences
of its negligence? How do you handle this at Boost?

> Your best option, if you have time for it -- and if a clean emergency
> shutdown will not be interpreted as a crash -- is to institute a
> recovery subsystem for critical things that must happen during
> emergency shutdown.

This is perfectly ok for technical people like you and me. Most
customers (those I referred to, at any rate), however, don't care about
this sort of thing. There is no such thing as a graceful shutdown for
them. If the program stops working, it's a crash. They have little
esteem even for the most elaborate and elegant shutdown mechanism. Of
course the anger will be less if their documents get saved, compared to
a real crash, where they aren't, but they are still angry. And you know
what? They are right!

I really find myself wearing two hats here: as a developer, I always
take the worst case into consideration and want the world to come to a
standstill whenever a bug is detected, but as a user's advocate and,
more still, as a user myself I don't want to be bugged by programs that
decide to drop dead.

A good example is the Mozilla family of Web browsers. Every now and
again, the otherwise much loved monster will declare that an error has
happened, and that the application will be terminated, and would I be so
kind to fill out the quality feedback form. It then dies and takes
everything that is not saved (such as things you have just typed into a
Web form) with it. I have never looked at the source of the Mozilla
project, but this behaviour looks suspiciously like
abort-on-contract-breach to me. Every time this happens, my developer's
admiration for the refined bug reporting mechanism is quickly
extinguished by my user's rage. It's technology-centric behaviour. It
humiliates users. Yes, there is a theoretical possibility that
unwinding, notifying me and offering me the chance to close the browser
myself might wake a sleeping demon that goes and formats my hard disk.
But that danger is probably much higher with applications that don't
bother with DbC in the first place. In all likelihood, the worst thing
that would happen is garbage on the display.

> In non-shipping code, asserts should immediately
> invoke the debugger, and then invoke emergency recovery and shutdown.
> In shipping code, obviously, there's no debugger.

I am grateful that you make the distinction between non-shipping and
shipping code here. Let me emphasize that from the beginning of this
exchange my reservations have been related exclusively to the latter.
With non-shipping code, i.e. under lab conditions, my practice has
always been to stop immediately, so no objections there. After all,
that's the standard behaviour of the C standard library assert() on most
platforms. Automatically invoking the debugger is nice if your platform
supports it (mine causes the debugger itself to freeze in nine out of
ten cases), but that's a detail.

That leaves the question what to do in shipping code. Standard C
practice (in the sense of what most platforms seem to do - I don't know
what the C Standard says) is to let the preprocessor suppress the test
and boldly stomp into what may be disastrous. Incidentally, the Eiffel
practice (thanks for the link, by the way) seems to be similar:
assertion monitoring is usually turned off in shipping code. This is in
stark contrast to what has been frequently advocated in this newsgroup.
The standard argument is: disabling assertions in shipping code is like
leaving the life jackets ashore when you set sail. I find this metaphor
rather misleading - assertions are more like self-destruction devices
than life jackets - yet the argument cannot be dismissed so easily. What
is your position on this? Should assertions in shipping code do nothing,
do the same as in non-shipping code, or do something else? Ironically,
one of the suggestions I remember having read here is that they should
throw exceptions. :-)
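
For what it's worth, a configurable assertion along those lines might be
sketched as follows; MY_ASSERT, my_assert_failed, and the SHIPPING flag
are hypothetical, and the shipping-build reaction shown (log and carry
on) is only one of the options under discussion:

    #include <cstdio>
    #include <cstdlib>

    void my_assert_failed(char const* expr, char const* file,
                          unsigned int line) {
    #ifdef SHIPPING
        // Shipping build: record the violation but keep running.
        std::fprintf(stderr, "assertion '%s' failed at %s:%u (continuing)\n",
                     expr, file, line);
    #else
        // Development build: stop at once so the bug can be debugged.
        std::fprintf(stderr, "assertion '%s' failed at %s:%u\n",
                     expr, file, line);
        std::abort();
    #endif
    }

    #define MY_ASSERT(e) \
        ((e) ? (void)0 : my_assert_failed(#e, __FILE__, __LINE__))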

> Good Luck.

Thanks, but the ordeal has been passed already. I may now build in
crashes again. *g*

> That can only happen if you assert some condition that isn't in the
> called function's set of documented preconditions. If the assertion
> matches the function's documentation, then it *is* catching a bug.

I was referring to preconditions/assertions that aren't, i.e. the kind
of error where you think something always holds only to discover there
are situations where it legitimately doesn't. In other words, the bug is
in your analysis.

> As far as I can tell, the Eiffel camp has a similar understanding.
> Because the throw-in-response-to-precondition-violation behavior can
> be turned on and off globally, you basically can't count on it. Think
> of it as one possible expression of undefined behavior.

According to the description in the Eiffel tutorial you pointed me to,
the behaviour can be specified at the class level, with higher level
defaults. It is not up to the individual function to decide. Thus, at
least within the bounds of the code you maintain and compile yourself,
you tell the runtime what to do when a contract is violated. That is,
you *can* count on it. A typical, simple strategy is to turn all checks
on during development and turn them off before you ship.

> In some languages, throwing an exception is basically the only way to
> get a debuggable stack trace. If that's the case in Eiffel, it would
> explain why they have the option to throw: it's as close as possible
> to invoking the debugger (perhaps it even does so).
>
> I should also point out that there's some variation among languages
> (and even among C++ compilers) in _when_ stack unwinding actually
> occurs. For example, in C++, if the exception is never caught, there
> may not ever be any unwinding (it's up to the implementation). In
> Python, no unwinding happens until the exception backtrace is
> _explicitly_ discarded or the next exception is thrown. I don't know
> about the details of Eiffel's exception mechanism, but all of these
> variations can have a major impact on the danger of throwing in
> response to a precondition violation. In other words, you may have to
> look a lot deeper to understand the proper relationship of Eiffel to
> C++.

Certainly. The default effect of an exception in Eiffel seems to be
termination. Whether this involves unwinding or not I cannot tell. On my
platform (.NET), an exception is a convenient way to get a stack trace.

> Absolutely. But I don't think there are as many different definitions
> as you seem to think there are. Have you found *any* definitions of
> "precondition" other than the Wikipedia one? I'm not talking about
> meanings of the word you infer from seeing it used in context. I'm
> talking about _definitions_.

I think we have reached agreement on the definition; it's the resulting
conclusions and practices, and their applicability to different types of
software where doubts remain.


--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


David Abrahams

Sep 12, 2005, 1:27:13 PM

Gerhard Menzl <gerhar...@hotmail.com> writes:

> David Abrahams wrote:
>
>> If you find my point to be in contradiction with the goal of
>> "executing as little code as possible," then there probably has been a
>> misunderstanding. At the point you detect a violated precondition,
>> there can usually only be partial recovery. You should do just enough
>> work to avoid total catastrophe, if you can. At the point you detect
>> a violated precondition, you don't have any way to ensure that the
>> actions taken by stack unwinding will be minimal. On the other hand,
>> if you have a separate mechanism for registering critical recovery
>> actions -- and an agreement among components to use it -- you can
>> invoke that, and avoid any noncritical actions.
>
> I think one reason why I am having difficulties coming to terms with
> your position (and why this discussion keeps on going) is that, to
> me, the distinction between what *has already happened* when a
> violated precondition is detected and what *is going to happen* when
> the function continues nevertheless is still somewhat blurred.

Of course it is blurry, because you don't know anything about the
severity of the problem once a violated precondition is detected.
Your options are to be conservative and take emergency shutdown
measures, continue -- whether by going forward or unwinding to some
other place in the code -- and hope the breakage is not too bad, or
sit somewhere in between, placing bets on a case-by-case basis.

> There is no dispute that continuing would cause hell to break loose.
> After all, that's why the author of the function specified the
> precondition in the first place. However, I cannot agree with the
> general assumption that hell *has already broken loose* at the point
> of detection, and that nothing and nobody can be trusted anymore.

It is certainly true that hell has already broken loose. By the time
the violation is detected, somebody somewhere has already done
something they were told not to do. Whether or not anything can be
trusted is a matter of opinion; you can make your own judgements.

> Sure, it means that there is a bug that prevents the function from
> achieving its goal. But does it also mean that the same bug will
> interfere with the operations performed during unwinding? I think I
> know what your answer to this is going to be:
>
>> you don't know that ;-)

Exactly.

> Do I, or better: does the original function have to?

If it is going to make guarantees about robustness in the face of
these conditions, then yes. That's where we started this discussion:
you wanted to provide documented guarantees of behavior in the face of
violated preconditions. If the original function is going to guess
about the brokenness of the context in which it was called, then no,
it doesn't need to know. However, as I've repeatedly said, the called
function usually has little or no knowledge about the context in which
it is called, so it's very difficult to make an educated guess.

> Precondition specifications aren't normally about complex global states,
> they demand that certain local conditions of limited scope be met.

Exactly. That's what makes the detecting code particularly unsuited
to making educated guesses about severity.

> They don't say "the stack is uncorrupted", they say: "this
> particular vector must be sorted". If it isn't, it's usually because
> the author of the client forgot to sort the vector, or called
> another function after the sort that push_backs an element.

On what do you base that assessment? Do you have data, or is it just
intuition?

> In well-designed programs that exhibit a high degree of
> encapsulation and low coupling, this should never affect code that
> doesn't rely on the sorting of the vector - unless the violation is
> a mere effect of a greater mess, like a buffer overrun.

True, but what makes you think that sortedness is not part of some
much larger global invariant? The sortedness of the vector might be
fundamental to the operation of most of the program.
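
To pin down the example, here is a sketch of such a local
precondition check (the assertion is just one possible response):

#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

bool contains(const std::vector<int>& v, int x)
{
    // Precondition: v is sorted. The check is purely local -- it can
    // detect the violation, but it cannot tell whether the cause is a
    // forgotten sort, a stray push_back, or a scribbled-over buffer,
    // nor how much of the rest of the program depends on sortedness.
    assert(std::adjacent_find(v.begin(), v.end(),
                              std::greater<int>()) == v.end());
    return std::binary_search(v.begin(), v.end(), x);
}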

> But in that case, starting a separate recovery mechanism is acting
> "on a wing and a prayer" as well.

Exactly. At that point, everything is a shot in the dark. I bet on
the recovery mechanism avoiding total catastrophe because it's the
best I can do.

> Ultimately, your bet is: the precondition does not hold, great evil
> is afoot, I will just manage to perform a number of critical
> operations, but performing the non-critical ones would awaken more
> evil things, hence I skip them and bail out.

Right.

> Now I will readily agree that there is a wide range of applications
> for which this is just the right bet. With many types of programs,
> it will be even better not to try any recovery at all. I am not
> convinced, however, that this should be a general guideline,
> regardless of the concrete problem domain and application type. You
> have brought up the stage lighting example yourself - perhaps we
> just differ on the question how rare or how frequent these
> applications are.

Maybe, maybe not. Of course it's a matter of degree.

Programmers in general seldom make the distinction carefully between
violated preconditions and conditions that are known to be
recoverable. You yourself seem to have had that problem. The pull to
throw from a violated precondition, and hope that code somewhere else
can deal with the problem, is quite strong. We're loath to admit
that the program is broken, so we bet that something can be done about
it elsewhere. Once you start trying to unwind-and-continue from a
violated precondition, you -- or someone on your team -- will
typically begin to add code for defensive programming (which has a
high development cost and often doesn't actually work), because you
now have to make the program "work" even in a broken state.

When I say, "it's almost always a mistake to throw from a violated
precondition," I am addressing that problem: I want people to think
much more carefully about the consequences and be much more
conservative about the idea of doing so. If you determine, for
whatever reason, that your application is better off betting that
things "aren't broken too badly," you should still design the program
as though preconditions are never actually violated. In other words,
the program should not count on these exceptions and expect to respond
to them in useful ways. Anything else leads to a mess.

> And if I remember correctly, it was a statement of
> mine along the lines of having different strategies for different
> applications that started the whole discussion.

If you did make such a remark, that wasn't what prompted me to get
involved. It was the blurring of the notion of precondition that
incited my interest.

>> The other reason that unwinding is almost always wrong is that it is
>> very prone to losing the information that a bug was detected, and
>> allowing execution to proceed as though full recovery has occurred.
>> All it takes is passing through a layer like this one:
>>
>> try
>> {
>>     ...something that detects a precondition violation...
>> }
>> catch(e1& x)
>> {
>>     translate_or_report(x);
>> }
>> catch(e2& x)
>> {
>>     translate_or_report(x);
>> }
>> ...
>> catch(...)
>> {
>>     translate_or_report_unknown_error();
>> }
>>
>> which often occurs at subsystem boundaries.
>
> I fully agree that ignoring errors and masking bugs is a bad thing and a
> reason for concern. But what you are saying here is that because
> exceptions might be suppressed or handled in a light-hearted way they
> should not be thrown in the first place. In other words, a function that
> detects a precondition violation cannot trust a top-level
> component.

No, that's not the issue. In general, catch blocks like the one above
do the right thing. They're not swallowing errors. In this case the
precondition violation gets treated like a recoverable error simply
because its exception passes through a translation layer. At such a
language or subsystem boundary, even if the programmer can anticipate
the "violated precondition" exception type thrown by low level code,
what would you have him do? What sort of response would you deem
"trustworthy?"

> I don't think that the mere theoretical possibility of a component
> screwing up justifies not giving it a chance. The way I understand
> Design by Contract, it's a methodology that pervades the entire program.
> It's not a local measure. Abstracting from the other reasons you have
> brought forward against throwing exceptions on detecting precondition
> violations, the handler would have to look like:
>
> try
> {
>     // something that detects a precondition violation
> }
> catch (precondition_violation& pv)
> {
>     // do whatever is appropriate for the type of application:
>     // log, display a message, abort, whatever
> }

These layers occur in libraries, where we might not know what's
appropriate for the application.

> This also raises the question at what level it is appropriate to decide
> how to react to a violated precondition. If the proper reaction depends
> on the type of application (and by bringing up your stage lighting
> example you admit that it does), the decision can only be taken at a
> higher level, at any rate not in a general purpose function that isn't
> aware of the type of application it resides in. Otherwise, it would not
> even be possible for the stage lighting controller to carry on and
> project random colours.

Yes. See BOOST_ASSERT, which I think uses a sensible approach.
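
By default it behaves like the standard assert(), but if you define
BOOST_ENABLE_ASSERT_HANDLER, a failed BOOST_ASSERT calls a handler
the application supplies, so the policy decision moves to the top
level. Roughly:

#define BOOST_ENABLE_ASSERT_HANDLER
#include <boost/assert.hpp>
#include <cstdio>
#include <cstdlib>

// The application -- not the library that detects the violation --
// decides what a failed assertion means: log and abort here, but a
// stage lighting controller could choose something quite different.
namespace boost
{
    void assertion_failed(char const* expr, char const* function,
                          char const* file, long line)
    {
        std::fprintf(stderr, "%s:%ld: %s: assertion `%s' failed\n",
                     file, line, function, expr);
        std::abort();
    }
}

int deref(int* p)
{
    BOOST_ASSERT(p != 0);   // precondition: p must not be null
    return *p;
}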

> Talking of application-independent, low-level code, how are (especially
> third-party) libraries supposed to handle a violated precondition? Note
> that I am referring to the actual implementation here, not the interface
> documentation. You can't throw an exception, because you would have to
> document it, and then there wouldn't be a precondition anymore. assert()
> or terminate()? Carry on and let the client taste the full consequences
> of its negligence? How do you handle this at Boost?

Ditto. :)

>> Your best option, if you have time for it -- and if a clean emergency
>> shutdown will not be interpreted as a crash -- is to institute a
>> recovery subsystem for critical things that must happen during
>> emergency shutdown.
>
> This is perfectly ok for technical people like you and me. Most
> customers (those I referred to, at any rate), however, don't care about
> this sort of thing. There is no such thing as a graceful shutdown for
> them. If the program stops working, it's a crash.

Okay, so in your case a clean emergency shutdown will be interpreted
as a crash.

I once added that strategy to a program that was historically unstable
and my customers were immensely grateful that their work wasn't being
lost. Of course I fixed a lot of bugs too, but didn't manage to nail
all of them. However, the clean emergency shutdown, along with users
sending me their files and a description of the action that caused the
crash, allowed me to go on fixing more problems until the program was
in pretty good shape.

> They have little esteem even for the most elaborate and elegant
> shutdown mechanism. Of course the anger will be less if their
> documents get saved, compared to a real crash, where they aren't,
> but they are still angry. And you know what? They are right!

Sure. The way to avoid that is to eliminate bugs from the program,
not to try to hobble along anyway when a bug is detected. The
customer will usually be just as angry when the program doesn't behave
as expected because some internal assumption is violated. And you
know what? They are right!

Hobbling along can even be dangerous for the customer's data, since
for them, too, the pull not to admit something is really broken is
strong. Who knows what effect the next attempted editing operation or
transaction will actually have?

> I really find myself wearing two hats here: as a developer, I always
> take the worst case into consideration and want the world to come to
> a standstill whenever a bug is detected, but as a user's advocate
> and, more still, as a user myself I don't want to be bugged by
> programs that decide to drop dead.

Dropping dead is not the same as a clean emergency shutdown.

> A good example is the Mozilla family of Web browsers. Every now and
> again, the otherwise much loved monster will declare that an error
> has happened, and that the application will be terminated, and would
> I be so kind to fill out the quality feedback form. It then dies and
> takes everything that is not saved (such as things you have just
> typed into a Web form) with it. I have never looked at the source of
> the Mozilla project, but this behaviour looks suspiciously like
> abort-on-contract-breach to me. Every time this happens, my
> developer's admiration for the refined bug reporting mechanism is
> quickly extinguished by my user's rage. It's technology-centric
> behaviour. It humiliates users. Yes, there is a theoretical
> possibility that unwinding, notifying me and offering me the chance
> to close the browser myself might wake a sleeping demon that goes
> and formats my hard disk. But that danger is probably much higher
> with applications that don't bother with DbC in the first place. In
> all likelihood, the worst thing that would happen is garbage on the
> display.

Maybe. You don't really know that, do you?

Anyway browsers are an unusual case, since they're primarily viewers.
If they don't contain some carefully-crafted bit of the user's work
that will be lost on a crash, it's probably okay... hmm, but wait:
there's webmail. So I could lose this whole message unless it gets
written to disk and I get the chance to start over. Oh, and there are
all kinds of scam sites that masquerade as secure and trustworthy,
which might be easily mistaken for legit if the user begins to
overlook garbage on the screen. As a matter of fact, security is a
big deal for web browsers. They're used in all kinds of critical
applications, including banking. Oh, and there are plugins, which
could be malicious and might be the cause of the violation detected.
No, I don't think we want the browser pressing ahead when it detects a
bug, not at all. I think this is a perfect case-in-point.

> That leaves the question what to do in shipping code. Standard C
> practice (in the sense of what most platforms seem to do - I don't
> know what the C Standard says) is to let the preprocessor suppress
> the test and boldly stomp into what may be disastrous. Incidentally,
> the Eiffel practice (thanks for the link, by the way) seems to be
> similar: assertion monitoring is usually turned off in shipping
> code.

That can be a good policy, because programmers concerned about
efficiency will never be deterred from writing assertions on the basis
of slowing down the shipping program.
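
(For reference, that's how the standard assert() macro behaves:
defining NDEBUG -- the usual setting for shipping builds -- compiles
the check away entirely. A sketch:)

#include <cassert>

int divide(int a, int b)
{
    assert(b != 0);   // checked in debug builds; expands to nothing
                      // when NDEBUG is defined, so the shipping code
                      // stomps ahead into the division regardless
    return a / b;
}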

> This is in stark contrast to what has been frequently advocated in
> this newsgroup. The standard argument is: disabling assertions in
> shipping code is like leaving the life jackets ashore when you set
> sail.

One or two vocal advocates of that approach do not a consensus make.
I've never agreed with it.

> I find this metaphor rather misleading - assertions are more like
> self-destruction devices than life jackets - yet the argument cannot
> be dismissed so easily. What is your position on this? Should
> assertions in shipping code do nothing, do the same as in
> non-shipping code, or do something else?

The correct policy depends on the application and your degree of
confidence in the code.

> Ironically, one of the suggestions I remember having read here is
> that they should throw exceptions. :-)

There are lots of different opinions out there. I'm sure you could
find an advocate for anything if you look hard enough.

>> That can only happen if you assert some condition that isn't in the
>> called function's set of documented preconditions. If the assertion
>> matches the function's documentation, then it *is* catching a bug.
>
> I was referring to preconditions/assertions that aren't, i.e. the kind
> of error where you think something always holds only to discover there
> are situations where it legitimately doesn't. In other words, the bug is
> in your analysis.

Yeah, I was accounting for that possibility. It's still a bug.

>> Have you found *any* definitions of "precondition" other than the
>> Wikipedia one? I'm not talking about meanings of the word you
>> infer from seeing it used in context. I'm talking about
>> _definitions_.
>
> I think we have reached agreement on the definition;

Well, that's progress, anyway.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com


Gerhard Menzl
Sep 14, 2005, 11:46:52 AM

David Abrahams wrote:

>>Do I, or better: does the original function have to?
>
> If it is going to make guarantees about robustness in the face of
> these conditions, then yes. That's where we started this discussion:
> you wanted to provide documented guarantees of behavior in the face of
> violated preconditions. If the original function is going to guess
> about the brokenness of the context in which it was called, then no,
> it doesn't need to know. However, as I've repeatedly said, the called
> function usually has little or no knowledge about the context in which
> it is called, so it's very difficult to make an educated guess.
>
>>Precondition specifications aren't normally about complex global
>>states; they demand that certain local conditions of limited scope be
>>met.
>
> Exactly. That's what makes the detecting code particularly unsuited
> to making educated guesses about severity.

Full agreement here. That's why I have reservations about terminating
the program (in shipping code) whenever a precondition is violated: it
is based on an educated guess of maximum severity.

>>They don't say "the stack is uncorrupted", they say: "this
>>particular vector must be sorted". If it isn't, it's usually because
>>the author of the client forgot to sort the vector, or called
>>another function after the sort that push_backs an element.
>
> On what do you base that assessment? Do you have data, or is it just
> intuition?

On my personal experience regarding the relative frequency of possible
causes:

Hardware failure: not that I remember
Compiler error: once or twice
Stack overflow: hardly ever
Buffer overrun: rare
Simple thinko: most of the time

I do not claim general validity. Your mileage may vary.

> True, but what makes you think that sortedness is not part of some
> much larger global invariant? The sortedness of the vector might be
> fundamental to the operation of most of the program.

I have a hard time imagining a non-trivial and well-designed program for
which this is the case, but let's just assume it. If the sortedness is
part of a global invariant, it must be specified as such, not just as a
local precondition. In other words, the contract surveillance mechanism
would trigger whatever action it is supposed to trigger everywhere the
invariant matters.

>>But in that case, starting a separate recovery mechanism is acting
>>"on a wing and a prayer" as well.
>
> Exactly. At that point, everything is a shot in the dark. I bet on
> the recovery mechanism avoiding total catastrophe because it's the
> best I can do.

I understood that. It's still a bet - or an educated guess.

> Programmers in general seldom make the distinction carefully between
> violated preconditions and conditions that are known to be
> recoverable. You yourself seem to have had that problem. The pull to
> throw from a violated precondition, and hope that code somewhere else
> can deal with the problem, is quite strong. We're loath to admit
> that the program is broken, so we bet that something can be done about
> it elsewhere. Once you start trying to unwind-and-continue from a
> violated precondition, you -- or someone on your team -- will
> typically begin to add code for defensive programming (which has a
> high development cost and often doesn't actually work), because you
> now have to make the program "work" even in a broken state.

Maybe we have different views of what throwing an exception means. When
I throw an exception, I do not hope that some code up there fixes the
mess and carries on as before - unless the catching code can clearly
tell by the nature of the exception and the high-level program state
that it is safe to do so. In general, it is easier to make such
decisions upwards from the point of detection.

> When I say, "it's almost always a mistake to throw from a violated
> precondition," I am addressing that problem: I want people to think
> much more carefully about the consequences and be much more
> conservative about the idea of doing so. If you determine, for
> whatever reason, that your application is better off betting that
> things "aren't broken too badly," you should still design the program
> as though preconditions are never actually violated. In other words,
> the program should not count on these exceptions and expect to respond
> to them in useful ways. Anything else leads to a mess.

My bet is not that things "aren't broken too badly". My bet also isn't
that the program has degenerated to a pulp of random bits. My bet is
that there is just enough stability left to perform a graceful exit.
Again: I am referring to shipping code here.

I also do not advocate high-level code to *count* on these exceptions; I
merely want it to handle them accordingly. In an interactive program,
this could mean informing the user and asking him to exit.

> No, that's not the issue. In general, catch blocks like the one above
> do the right thing. They're not swallowing errors. In this case the
> precondition violation gets treated like a recoverable error simply
> because its exception passes through a translation layer. At such a
> language or subsystem boundary, even if the programmer can anticipate
> the "violated precondition" exception type thrown by low level code,
> what would you have him do? What sort of response would you deem
> "trustworthy?"

A precondition violation in a separate subsystem normally means that the
subsystem has a bug. The more encapsulated and self-sufficient a
component is, the more inappropriate I would consider it for such a
component to terminate the entire program. I have worked with components
that do. They caused havoc.

Ideally, upon catching a violated precondition exception, the subsystem
would enter a global error state that would cause all further calls to
fail instantly. The external caller would be notified of the partial
"shutdown" and could decide whether it is possible to continue without
the subsystem (e.g. work offline), or initiate a shutdown itself.
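
A minimal sketch of what I have in mind (all names hypothetical):

#include <stdexcept>

struct precondition_violation : std::logic_error
{
    precondition_violation() : std::logic_error("precondition violated") {}
};

// Hypothetical subsystem facade that latches a failed state once a
// precondition violation is caught: all further calls fail instantly,
// and the caller can decide to work without the subsystem or to
// initiate a shutdown itself.
class subsystem
{
public:
    subsystem() : broken_(false) {}

    bool usable() const { return !broken_; }

    int do_work(int input)
    {
        if (broken_)
            throw std::runtime_error("subsystem disabled after internal bug");
        try
        {
            return do_work_impl(input);
        }
        catch (precondition_violation&)
        {
            broken_ = true;   // enter the global error state
            throw std::runtime_error("subsystem shut down: internal bug");
        }
    }

private:
    int do_work_impl(int input)
    {
        if (input < 0)                       // documented precondition
            throw precondition_violation();  // detected contract breach
        return input * 2;
    }

    bool broken_;
};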

> Sure. The way to avoid that is to eliminate bugs from the program,
> not to try to hobble along anyway when a bug is detected. The
> customer will usually be just as angry when the program doesn't behave
> as expected because some internal assumption is violated. And you
> know what? They are right!

Again, fully agreed. This is not about either terminating or continue as
if nothing had happened. It's about gracefully handling situations that
should never happen.

> Anyway browsers are an unusual case, since they're primarily viewers.
> If they don't contain some carefully-crafted bit of the user's work
> that will be lost on a crash, it's probably okay... hmm, but wait:
> there's webmail. So I could lose this whole message unless it gets
> written to disk and I get the chance to start over. Oh, and there are
> all kinds of scam sites that masquerade as secure and trustworthy,
> which might be easily mistaken for legit if the user begins to
> overlook garbage on the screen. As a matter of fact, security is a
> big deal for web browsers. They're used in all kinds of critical
> applications, including banking. Oh, and there are plugins, which
> could be malicious and might be the cause of the violation detected.
> No, I don't think we want the browser pressing ahead when it detects a
> bug, not at all. I think this is a perfect case-in-point.

Again, we are not talking about pressing ahead. There's a trade-off.
Terminating minimizes the chance of a catastrophic chain reaction and
risks destroying user data for harmless reasons. Giving the user a
chance to bail out in a controlled way minimizes the chance of data loss
and risks executing malicious code. Your bet is always to assume the
worst case. Personally, I prefer my browser to be wary and suspicious,
but not paranoid and suicidal, especially because I am a much better
judge of whether a website might be forged or a plugin might be from a
dubious source.

>>That leaves the question what to do in shipping code. Standard C
>>practice (in the sense of what most platforms seem to do - I don't
>>know what the C Standard says) is to let the preprocessor suppress
>>the test and boldly stomp into what may be disastrous. Incidentally,
>>the Eiffel practice (thanks for the link, by the way) seems to be
>>similar: assertion monitoring is usually turned off in shipping
>>code.
>
> That can be a good policy, because programmers concerned about
> efficiency will never be deterred from writing assertions on the basis
> of slowing down the shipping program.

Now you've lost me. You go to great lengths to convince me that
pressing ahead is potentially disastrous, and then you call turning off
assertions in shipping mode a good policy? In other words, carefully
backing away (throwing an exception) is more dangerous than plunging
headlong into the abyss (ignoring the violation and executing the normal
case)? I'm sorry, but this doesn't make sense to me.

>>This is in stark contrast to what has been frequently advocated in
>>this newsgroup. The standard argument is: disabling assertions in
>>shipping code is like leaving the life jackets ashore when you set
>>sail.
>
> One or two vocal advocates of that approach do not a consensus make.
> I've never agreed with it.

We're not talking about a suggestion from a few passing amateurs. I
don't remember exactly who it was, but they were trusted experts. James
Kanze may have been one of them. What is more, I cannot remember having
seen objections posted.

>>I find this metaphor rather misleading - assertions are more like
>>self-destruction devices than life jackets - yet the argument cannot
>>be dismissed so easily. What is your position on this? Should
>>assertions in shipping code do nothing, do the same as in
>>non-shipping code, or do something else?
>
> The correct policy depends on the application and your degree of
> confidence in the code.

Which has been my position from the beginning. :-)

--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".


dave_abrahams
Sep 14, 2005, 9:16:40 PM

> David Abrahams wrote:
>
> > Exactly. That's what makes the detecting code particularly unsuited
> > to making educated guesses about severity.
>
> Full agreement here. That's why I have reservations about terminating
> the program (in shipping code) whenever a precondition is violated: it
> is based on an educated guess of maximum severity.

Having read your whole message, I don't understand what policy you're
advocating, nor which viewpoint you're arguing with. Later in this
message you clearly state that you're not for "pressing ahead," but
instead are interested in a "graceful exit." I have recommended a
graceful exit to you in this thread, so surely you know I do not
support immediate termination in most cases. As far as I can tell, a
graceful exit means "take emergency measures and then terminate."
However, you also objected to my suggestion on the grounds that your
customers would see it as a crash. It only makes sense to terminate
immediately when no emergency measures are necessary or possible and
your customers can tolerate the rare core dump (or whatever) without
much human-readable explanation.

> >>They don't say "the stack is uncorrupted", they say: "this
> >>particular vector must be sorted". If it isn't, it's usually because
> >>the author of the client forgot to sort the vector, or called
> >>another function after the sort that push_backs an element.
>
> > On what do you base that assessment? Do you have data, or is it just
> > intuition?
>
> On my personal experience regarding the relative frequency of possible
> causes:
>
> Hardware failure: not that I remember
> Compiler error: once or twice
> Stack overflow: hardly ever
> Buffer overrun: rare
> Simple thinko: most of the time

Sure. The question is whether the thinko is local to where the
violation is detected. That's a different issue altogether.

> > Programmers in general seldom make the distinction carefully between
> > violated preconditions and conditions that are known to be
> > recoverable. You yourself seem to have had that problem. The pull to
> > throw from a violated precondition, and hope that code somewhere else
> > can deal with the problem, is quite strong. We're loath to admit
> > that the program is broken, so we bet that something can be done about
> > it elsewhere. Once you start trying to unwind-and-continue from a
> > violated precondition, you -- or someone on your team -- will
> > typically begin to add code for defensive programming (which has a
> > high development cost and often doesn't actually work), because you
> > now have to make the program "work" even in a broken state.
>
> Maybe we have different views of what throwing an exception means. When
> I throw an exception, I do not hope that some code up there fixes the
> mess and carries on as before -

If it's not going to carry on in some sense, why bother throwing? Why
not just quit?

> unless the catching code can clearly tell by the nature of the
> exception and the high-level program state that it is safe to do
> so. In general, it is easier to make such decisions upwards from the
> point of detection.

It's been my experience that in general, an application that can
recover from one kind of exception can recover from almost any
exception -- as long as exceptions aren't used to indicate that the
program is in a broken state from which no recovery is possible, of
course.

> > When I say, "it's almost always a mistake to throw from a violated
> > precondition," I am addressing that problem: I want people to think
> > much more carefully about the consequences and be much more
> > conservative about the idea of doing so. If you determine, for
> > whatever reason, that your application is better off betting that
> > things "aren't broken too badly," you should still design the program
> > as though preconditions are never actually violated. In other words,
> > the program should not count on these exceptions and expect to respond
> > to them in useful ways. Anything else leads to a mess.
>
> My bet is not that things "aren't broken too badly". My bet also isn't
> that the program has degenerated to a pulp of random bits. My bet is
> that there is just enough stability left to perform a graceful exit.
> Again: I am referring to shipping code here.

The question remains how throwing an exception is going to help you
achieve a graceful exit. And if you really mean "just enough
stability," then throwing an exception is the wrong choice because it
will almost always do more than is necessary for a graceful exit. So,
really, your bet is that there's enough stability left to run all the
catch blocks and destructors of automatic objects between the point of
the throw and the point where emergency measures are taken, and then
perform a graceful exit. There's nothing inherently wrong with making
that bet, but you ought to be honest with yourself about what you're
counting on.

> I also do not advocate high-level code to *count* on these exceptions; I
> merely want it to handle them accordingly.

The problem is that it's an extra discipline for the programmer to
carefully distinguish recoverable from unrecoverable exceptions. I'm
saying that any benefits you get from unwinding are usually not worth
the cost of maintaining that distinction, especially in a project with
developers who may not have considered all the issues that deeply.

> In an interactive program,
> this could mean informing the user and asking him to exit.

I guess we have different user interface philosophies. I am not one
of those people who thinks every interface should be dumb, but one of
the things I expect from my programs is that they'll do their best to
protect me from really bad things. If I have open documents and I
save over the old ones before exiting, I could end up with nothing
useful. If I happen to hit return as the error message is coming up
and miss the dialog box, I don't want to miss the chance to save all
my documents.

> > No, that's not the issue. In general, catch blocks like the one above
> > do the right thing. They're not swallowing errors. In this case the
> > precondition violation gets treated like a recoverable error simply
> > because its exception passes through a translation layer. At such a
> > language or subsystem boundary, even if the programmer can anticipate
> > the "violated precondition" exception type thrown by low level code,
> > what would you have him do? What sort of response would you deem
> > "trustworthy?"
>
> A precondition violation in a separate subsystem normally means that the
> subsystem has a bug. The more encapsulated and self-sufficient a
> component is, the more inappropriate I would consider it for such a
> component to terminate the entire program.

Agreed.

> I have worked with components that do. They caused havoc.
>
> Ideally, upon catching a violated precondition exception, the subsystem
> would enter a global error state that would cause all further calls to
> fail instantly. The external caller would be notified of the partial
> "shutdown" and could decide whether it is possible to continue without
> the subsystem (e.g. work offline), or initiate a shutdown itself.

Not bad.

> > Anyway browsers are an unusual case, since they're primarily viewers.
> > If they don't contain some carefully-crafted bit of the user's work
> > that will be lost on a crash, it's probably okay... hmm, but wait:
> > there's webmail. So I could lose this whole message unless it gets
> > written to disk and I get the chance to start over. Oh, and there are
> > all kinds of scam sites that masquerade as secure and trustworthy,
> > which might be easily mistaken for legit if the user begins to
> > overlook garbage on the screen. As a matter of fact, security is a
> > big deal for web browsers. They're used in all kinds of critical
> > applications, including banking. Oh, and there are plugins, which
> > could be malicious and might be the cause of the violation detected.
> > No, I don't think we want the browser pressing ahead when it detects a
> > bug, not at all. I think this is a perfect case-in-point.
>
> Again, we are not talking about pressing ahead.

So, in the case of the browser, what _are_ we talking about? What do
you think should happen?

> There's a trade-off. Terminating minimizes the chance of a
> catastrophic chain reaction and risks destroying user data for
> harmless reasons. Giving the user a chance to bail out in a
> controlled way minimizes the chance of data loss and risks executing
> malicious code. Your bet is always to assume the worst
> case.

Yes. I would assume the worst and *force* the user to bail out in a
way that saves as much relevant data as possible.

> Personally, I prefer my browser to be wary and suspicious, but not
> paranoid and suicidal, especially because I am a much better judge
> of whether a website might be forged or a plugin might be from a
> dubious source.

That's true until the screen display begins to show you stuff that
doesn't correspond to what's actually going on at the website you're
visiting because of some broken invariant. Can't you imagine what
happens when the little "security lock icon" becomes permanently stuck
in the "on" state?

> >>That leaves the question what to do in shipping code. Standard C
> >>practice (in the sense of what most platforms seem to do - I don't
> >>know what the C Standard says) is to let the preprocessor suppress
> >>the test and boldly stomp into what may be disastrous. Incidentally,
> >>the Eiffel practice (thanks for the link, by the way) seems to be
> >>similar: assertion monitoring is usually turned off in shipping
> >>code.
>
> > That can be a good policy, because programmers concerned about
> > efficiency will never be deterred from writing assertions on the basis
> > of slowing down the shipping program.
>
> Now you've lost me. You go to great lengths to convince me that
> pressing ahead is potentially disastrous

No, I was trying to convince you that unwinding usually does more harm
than good when a precondition violation is detected.

> and then you call turning off assertions in shipping mode a good
> policy?

Depends on the application, your degree of confidence in your unit
tests, etc. Certainly the STL would have little value for many
applications if implementations were all forced to support the checks
used in many debugging implementations even in shipping mode.
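
For instance, with one particular implementation (GCC's libstdc++
debug mode -- an assumption about the toolchain), the very same code
is checked in the debug build and unchecked in the shipping build:

// Debug build:    g++ -D_GLIBCXX_DEBUG prog.cpp  (aborts with a diagnostic)
// Shipping build: g++ -O2 prog.cpp               (no check; undefined behavior)
#include <vector>

int main()
{
    std::vector<int> v(3);
    return v[7];   // out of range: violates operator[]'s precondition
}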

> In other words, carefully backing away (throwing an exception) is
> more dangerous than plunging headlong into the abyss (ignoring the
> violation and executing the normal case)? I'm sorry, but this
> doesn't make sense to me.

Me neither. Fortunately, I never said that :)

> >>This is in stark contrast to what has been frequently advocated in
> >>this newsgroup. The standard argument is: disabling assertions in
> >>shipping code is like leaving the life jackets ashore when you set
> >>sail.
>
> > One or two vocal advocates of that approach do not a consensus make.
> > I've never agreed with it.
>
> We're not talking about a suggestion from a few passing amateurs. I
> don't remember exactly who it was, but they were trusted experts. James
> Kanze may have been one of them.

Yes he was. James and I have had a few big disagreements in the past.

> What is more, I cannot remember having seen objections posted.

You may find it hard to believe, but I don't find it necessary to
argue with every assertion I disagree with. :)

> >>I find this metaphor rather misleading - assertions are more like
> >>self-destruction devices than life jackets - yet the argument cannot
> >>be dismissed so easily. What is your position on this? Should
> >>assertions in shipping code do nothing, do the same as in
> >>non-shipping code, or do something else?
>
> > The correct policy depends on the application and your degree of
> > confidence in the code.
>
> Which has been my position from the beginning. :-)

Maybe there's nothing left to say about all this, then.

--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

Gerhard Menzl
Sep 19, 2005, 1:16:45 PM

dave_abrahams wrote:

> Having read your whole message, I don't understand what policy you're
> advocating, nor which viewpoint you're arguing with. Later in this
> message you clearly state that you're not for "pressing ahead," but
> instead are interested in a "graceful exit." I have recommended a
> graceful exit to you in this thread, so surely you know I do not
> support immediate termination in most cases. As far as I can tell, a
> graceful exit means "take emergency measures and then terminate."

I advocate a policy that is tailored to the type of application and its
sensitivity to security issues. In the case of interactive applications,
I advocate a policy that - ideally - makes users feel they are still in
charge of the situation and doesn't dumb them down by pretending the
program is always smarter. What this means in detail is probably
off-topic here.

I understand that you consider the C++ exception mechanism largely
unsuitable for fulfilling these goals. Although your arguments have not
convinced me that this is the case most of the time, I am now more
aware of the dangers. Thanks for broadening my horizon.

> It's been my experience that in general, an application that can
> recover from one kind of exception can recover from almost any
> exception -- as long as exceptions aren't used to indicate that the
> program is in a broken state from which no recovery is possible, of
> course.

It depends on what you mean by "recover". An application that may easily
handle database-related exceptions and carry on may have more trouble
recovering from std::bad_alloc. Hm, this reminds me of std::bad_cast -
wouldn't you agree that this exception type usually signals a
programming error?
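
(For concreteness: std::bad_cast is thrown by a failed dynamic_cast
to a reference type, which usually means the caller's assumption
about an object's dynamic type was simply wrong. A sketch:)

#include <typeinfo>   // std::bad_cast

struct Shape  { virtual ~Shape() {} };
struct Circle : Shape { double radius; };
struct Square : Shape { double side; };

double radius_of(Shape& s)
{
    // If s is not actually a Circle, this throws std::bad_cast --
    // typically a programming error in the caller rather than a
    // recoverable runtime condition.
    return dynamic_cast<Circle&>(s).radius;
}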

> The problem is that it's an extra discipline for the programmer to
> carefully distinguish recoverable from unrecoverable exceptions. I'm
> saying that any benefits you get from unwinding are usually not worth
> the cost of maintaining that distinction, especially in a project with
> developers who may not have considered all the issues that deeply.

There is also an extra discipline for the programmer to maintain an
extra emergency cleanup mechanism and carefully distinguish resources
which need to be released even in case of a contract breach from those
that don't.

> I guess we have different user interface philosophies. I am not one
> of those people who thinks every interface should be dumb, but one of
> the things I expect from my programs is that they'll do their best to
> protect me from really bad things.

That may be the case. I don't think user interfaces should be dumb, but
I am more concerned about user interfaces that make users look dumb. But
I am straying into the off-topic zone again.

>>Personally, I prefer my browser to be wary and suspicious, but not
>>paranoid and suicidal, especially because I am a much better judge
>>of whether a website might be forged or a plugin might be from a
>>dubious source.
>
> That's true until the screen display begins to show you stuff that
> doesn't correspond to what's actually going on at the website you're
> visiting because of some broken invariant. Can't you imagine what
> happens when the little "security lock icon" becomes permanently stuck
> in the "on" state?

I can contrive lots of freak accidents caused by code that throws an
exception upon detecting a contract breach, just as I can contrive freak
accidents caused by code that doesn't throw and shuts down instead. All
I am saying is that there is a balance, and that there are odds, and that
I have doubts about the odds being as clear as you seem to think they are.

>>Now you've lost me. You go to great lengths to convince me that
>>pressing ahead is potentially disastrous
>
> No, I was trying to convince you that unwinding usually does more harm
> than good when a precondition violation is detected.
>
>>and then you call turning off assertions in shipping mode a good
>>policy?
>
> Depends on the application, your degree of confidence in your unit
> tests, etc. Certainly the STL would have little value for many
> applications if implementations were all forced to support the checks
> used in many debugging implementations even in shipping mode.

That's a key point, so I must insist. You claim that throwing is almost
always a bad choice because there is *a chance* that code executed
during unwinding is rendered broken. Yet turning off assertions may be
okay, although in case of a contract breach this would cause code to be
executed which is *known* to be broken. To me, this is a glaring
contradiction. In the awkward case of a programming error slipping
through your tightly knit mesh of unit tests, the odds of avoiding
further damage are surely better for throwing an exception than they are
for continuing normally, notwithstanding the fact that they may be even
better for aborting.

>>We're not talking about a suggestion from a few passing amateurs. I
>>don't remember exactly who it was, but they were trusted experts.
>>James Kanze may have been one of them.
>
> Yes he was. James and I have had a few big disagreements in the past.

It's a pity he hasn't taken the bait yet. I would be interested to
hear his views on this. Maybe his endless-thread-filter is on.

>>What is more, I cannot remember having seen objections posted.
>
> You may find it hard to believe, but I don't find it necessary to
> argue with every assertion I disagree with. :)

I didn't mean to say you do. But when strong opinions posted here
repeatedly by long-term participants remain unchallenged, this is often
a hint (although, of course, no proof) that something is established
best practice. Hence my surprise.


--
Gerhard Menzl

#dogma int main ()

Humans may reply by replacing the thermal post part of my e-mail address
with "kapsch" and the top level domain part with "net".
