
Guru of the Week #61: Solution


Herb Sutter

Oct 29, 1999
-------------------------------------------------------------------
Guru of the Week problems and solutions are posted periodically
on news:comp.lang.c++.moderated. For past problems and solutions
see the GotW archive at www.peerdirect.com/resources.
(c) 1999 H P Sutter. News archives may keep copies of this article.
-------------------------------------------------------------------

_______________________________________________________

GotW #61: CHALLENGE EDITION: ACID Programming

Difficulty: 10 / 10
_______________________________________________________

INTRODUCTION
=======================================================

Exception Safety Recap
----------------------

>1. Consider again the Abrahams exception safety
> guarantees, as frequently described in GotW, and any
> others that you may know of.

Recapping from GotW #59, Dave Abrahams' guarantees are
usually stated as follows.

A1. Basic Guarantee: If an exception is thrown, no
resources are leaked, and objects remain in a
destructible and usable -- but not necessarily
predictable -- state. This is considered by many to
be the weakest usable exception safety guarantee,
and is appropriate where client code can cope with
failed operations that have already made changes to
objects' state.

A point I want to make here is that this essentially
means "there are no bugs." It's not much different
from saying: "If an error occurs, no resources are
leaked, and objects remain in a destructible and usable
-- but not necessarily predictable -- state."

For example, consider a Container::AppendElements()
operation that appends multiple elements onto the end
of an existing container. If called to append 10
elements, and it successfully adds 6 elements before
encountering an error and stopping, then the operation
provides only the basic guarantee: The container is
still in a consistent state (it still meets the
Container invariants, and further operations on the
container will work fine), but the state is not
necessarily predictable (the caller probably has no way
to know in advance what the final state will be if an
error is encountered; it may or may not be the initial
state).
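A minimal sketch of such an AppendElements() makes the point concrete (the class, its members, and the failure trigger are all invented for illustration): elements appended before the failure stay appended, so the container is valid but its final state is not predictable by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical sketch of a Container::AppendElements() that provides
// only the basic guarantee: it appends elements one at a time and
// stops at the first failure, leaving earlier appends in place.
class Container {
public:
    // Appends all of 'values'; 'failAt' simulates a failure partway
    // through (it would not exist in a real interface).
    void AppendElements(const std::vector<int>& values, std::size_t failAt) {
        for (std::size_t i = 0; i < values.size(); ++i) {
            if (i == failAt)
                throw std::runtime_error("append failed partway");
            data_.push_back(values[i]);  // NOT rolled back on later failure
        }
    }
    std::size_t Size() const { return data_.size(); }
private:
    std::vector<int> data_;  // invariant preserved: always a valid vector
};
```

After a failed ten-element append that got through six elements, the container still meets its invariants, but holds six new elements rather than zero or ten.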

A2. Strong Guarantee: If an exception is thrown,
program state remains unchanged. This guarantee
always implies global commit-or-rollback semantics,
including that no references or iterators into a
container be invalidated if an operation fails.

Note that here we are already heading toward
transactional semantics.

In addition, certain functions must provide an even
stricter guarantee in order to make the above exception
safety guarantees possible in higher functions:

A3. Nothrow Guarantee: The function will not emit an
exception under any circumstances. It turns out
that it is sometimes impossible to implement the
strong or even the basic guarantee unless certain
functions are guaranteed not to throw (e.g.,
destructors, deallocation functions). For example,
an important feature of the standard auto_ptr is
that no auto_ptr operation will throw.

Other experts have also proposed exception safety
guarantees. Building on earlier work by Harald Mueller
in [2], Jack Reeves in [3] suggested three guarantees
regarding the state of an object after an exception had
occurred:

R1. Good: The object meets its invariants.

R2. Bad: The object has been corrupted in that it no
longer meets its invariants, and the only thing that
you are guaranteed to be able to safely do with it
is destroy it.

R3. Undefined: The object is thoroughly corrupted and
cannot be safely used or even destroyed.

Over the years, other possible guarantees have been
suggested, notably the "destructible guarantee," which
is another way of saying that if an exception occurs
then an object is in Reeves' "Bad" state.


Analyzing a Motivating Example
------------------------------

> Then, review the
> technique described in GotW #59 (Question #4 and
> Conclusion #2). Which Abrahams (or other) guarantee
> does that technique provide? Discuss.

In GotW #59, the main technique presented was that a
class contain its member objects by auto_ptr<Pimpl>
rather than directly by value:

// The general solution to
// Cargill's Widget Example
//
class Widget
{
    // ...
private:
    class Impl;
    auto_ptr<Impl> pimpl_;
    // ... provide destruction, copy construction
    // and assignment that work correctly, or
    // suppress them ...
};

class Widget::Impl
{
public:
    // ...
    T1 t1_;
    T2 t2_;
};

void Widget::Swap( Widget& other ) throw()
{
    auto_ptr<Impl> temp( pimpl_ );
    pimpl_ = other.pimpl_;
    other.pimpl_ = temp;
}

Widget& Widget::operator=( const Widget& other )
{
    Widget temp( other ); // do all the work off to the side

    Swap( temp );         // then "commit" the work using
    return *this;         // nonthrowing operations only
}

See GotW #59 for further details.
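As a compilable sketch of the technique (substituting std::unique_ptr for auto_ptr, which modern C++ has since removed, and using an invented element type whose copy constructor can be made to fail), we can observe the key property: a failed assignment leaves the target's local state untouched, because all throwing work happens off to the side before the nonthrowing Swap().

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Modern sketch of the GotW #59 technique. std::unique_ptr stands in
// for auto_ptr; T1, gFailCopy, Value(), and SetValue() are invented
// here purely to make the guarantee observable.
static bool gFailCopy = false;  // when true, copying a T1 throws

struct T1 {
    int value = 0;
    T1() = default;
    T1(const T1& other) : value(other.value) {
        if (gFailCopy) throw std::runtime_error("T1 copy failed");
    }
};

class Widget {
public:
    Widget() : pimpl_(new Impl) {}
    Widget(const Widget& other) : pimpl_(new Impl(*other.pimpl_)) {}
    Widget& operator=(const Widget& other) {
        Widget temp(other);   // do all the work off to the side
        Swap(temp);           // then "commit" using nonthrowing ops only
        return *this;
    }
    void Swap(Widget& other) noexcept { pimpl_.swap(other.pimpl_); }
    int Value() const { return pimpl_->t1_.value; }
    void SetValue(int v) { pimpl_->t1_.value = v; }
private:
    struct Impl { T1 t1_; };
    std::unique_ptr<Impl> pimpl_;
};
```

If the copy throws, *this is never touched; if it succeeds, Swap() commits it without any possibility of failure.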

Now we get to the crux of the matter: That GotW then
made the claim that the above Widget::operator=()
provides the strong guarantee without any
exception-safety requirements on T1 and T2. As Dave
Abrahams has pointed out, this isn't quite true: For
example, if constructing a T1 or T2 object causes side
effects (such as changing a global variable or
launching a rocket) and then throws an exception, and
that side effect is not cancelled by the corresponding
T1 or T2 destructor, then Widget::operator=() has
caused a side effect and so doesn't fully meet the
requirements of the strong guarantee. (It does fully
meet the requirements of the strong guarantee in all
other ways besides the possibility of side effects.)

So here's the rub: The above solution to Cargill's
Widget Example is an important technique, it does
provide a safety guarantee, and that guarantee appears
to be a fundamental concept -- it is the strongest
exception safety guarantee one can make for assignment
without requiring any assumptions at all about the used
type (more on this below). Better still, it is a
useful guarantee to be able to make; for example, it is
useful to know that upon failure a change to a
container object will produce no local effects on the
container, even though it might invalidate existing
iterators into the container. More on this in the
final section.

So, since it is a guarantee, what guarantee is it? It
doesn't meet the requirements of the Abrahams strong or
nothrow guarantees, but at the same time it seems to
give more than just the Abrahams basic guarantee or the
Reeves Good guarantee.

There's got to be a name for a concept as useful and as
fundamental as "the strongest guarantee you can make
without requiring assumptions about the used type" --
and that's not a name, that's just an explanation. It
bothered me greatly that we had no succinct name for
it.


TOWARD TRANSACTIONAL PROGRAMMING
=======================================================

Ideas From the Database World
-----------------------------

Could another area of computer science yield any
insights that would help? To introduce the answer,
let's take a quick trip into the database world and its
existing methods of analyzing program correctness with
respect to database manipulation:

>2. In the context of databases, what do the following
> terms mean:
>
> a) transaction

A transaction is a unit of work. That is, it
accomplishes some task (typically involving multiple
database changes) in such a way that it meets the ACID
requirements to a specified degree appropriate for its
particular task:

> b) ACID

ACID is an acronym for a set of four requirements:

Requirement Description (Databases)
----------- ----------------------------------------
Atomicity The operation is indivisible, all-or-
nothing. A transaction can be
"committed" or "rolled back"; if it is
committed all of its effects must
complete fully, otherwise if it is
rolled back it must have no effect.

Consistency The transaction transforms the database
from one consistent state to another
consistent state. Here "consistent"
implies obeying not just relational
integrity constraints but also business
rules and semantics, such as that a
given Customer record's Address and ZIP
fields must not specify inconsistent
information (e.g., an address in New
York and a ZIP code in Alabama).

Isolation Two transactions are fully isolated if
they can be started simultaneously and
the result is guaranteed to be the same
as though one was run entirely before
the other ("Serializable"). In practice,
databases support various levels of
isolation, each of which trades off
better isolation at the expense of
concurrency or vice versa. Some typical
isolation levels, in order from best to
worst concurrency, are: Read Uncommitted
(a.k.a. Dirty Read), Read Committed,
Repeatable Read, and Serializable
(highest isolation); these levels
describe what kinds of intermediate
states of other transactions might be
visible.

Note 1: Two transactions operating on nonoverlapping
parts of the database are by definition always fully
isolated.

The following notes show why isolation is
disproportionately important:

Note 2: Isolation interacts with atomicity. For
example, consider three transactions T1, T2, and T3
that run concurrently. All three are atomic. T1 and
T2 run at Serializable isolation and do not interact
with each other; that is, because of their isolation
levels, T1 and T2 are also guaranteed to be atomic with
respect to each other. T3, however, runs at Read
Uncommitted isolation, and it is possible for T3 to see
some but not all of the changes made by T1 and T2
before those other transactions commit; that is,
because of its isolation level, T1 and T2 are NOT
atomic as seen by T3.

Note 3: Because isolation interacts with atomicity, it
also transitively interacts with consistency. This is
because lower isolation levels may permit states to be
seen by other transactions that would not otherwise be
visible, and relying carelessly on such intermediate
states can create inconsistencies in the database. For
example, consider what could happen if transaction T4
observes some intermediate state of transaction T5,
makes a change based on that observed state, but T5 is
subsequently rolled back. Now T4 has made further
changes based on phantom data that, as it turns out,
never really existed.

Durability If a transaction is committed, its
effects are not lost even if there is
a subsequent system crash. This
requirement is usually implemented using
a transaction log which is used to "roll
forward" the state of the database on
system restart.

For more information, see a standard database text like
[4].

In the database world, atomicity, consistency, and
durability are fully required in virtually all real-world
database systems. It is the isolation aspect that
offers tradeoffs for concurrency (and, as noted, the
tradeoffs can indirectly affect atomicity and
consistency too). In practice, running at anything
less than Serializable isolation opens windows wherein
the database could get into a state that violates
business rules (and hence the consistency requirement),
but it's done all the time. In short, we live with
this quite happily today, even in some financial
database systems with stringent integrity requirements,
because we can measure the exposure and we accept the
tradeoff in order to achieve better concurrency.
Nearly all production databases in the world run at
less than the strongest isolation levels at least some
of the time.

More information about ACID and transaction semantics,
especially as they apply to a distributed database
environment, will be covered in my next book [7].


Mapping ACID Concepts to Non-SQL Programming
--------------------------------------------

Note that in the following discussion, for an operation
to "fail" can mean fail in any way, so that the
analysis is the same whether the failure is reported by
an error return code or by an exception.

>3. The key insight is to notice that ACID transactional
> analysis can be applied to program correctness,
> including C++ exception safety. Demonstrate a
> parallel between the two by showing a programming
> analog for each aspect of ACID.

The elements of ACID were developed to describe
requirements in database environments, but each has
analogs that make sense in programming environments.
Note that these analogs apply to reasoning about
program correctness in general, both with and without
exceptions and including exception support in other
languages, not just to C++ exception handling.

The following analysis distinguishes between local
effects and side effects:

1. Local effects include all effects on the visible
state of the immediate object(s) being manipulated.
Here the visible state is defined as the state
visible through member functions as well as free
functions (e.g., operator<<()) that form part of the
interface. For more on what constitutes the
interface of a type, see the Interface Principle as
described in [5] and in Items 31-34 of [6].

2. Side effects include all other changes, specifically
changes to other program states and structures as
well as any changes caused outside the program (such
as invalidating an iterator into a manipulated
container, writing to a disk file, or launching a
rocket), that affect the system's business rules, as
they are expressed in object and system invariants.
So, for example, a counter of the number of times a
certain function is called would normally not be
considered a side effect on the state of the system
if the counter is simply a debugging aid; it would
be a side effect on the state of the system if it
recorded something like the number of transactions
made in a bank account, because that affects
business rules or program invariants. Note: In
this article I use the terms "business rules" and
"invariants" interchangeably.

Of course, if an effect is visible as a side effect but
also through the manipulated object's interface (such
as via ReadByte() or WasLaunched() functions), it is
both a local effect and a side effect.
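A small, hypothetical sketch of the distinction (all names invented here): Deposit() has a local effect on the account's visible state, and a side effect on a system-wide counter that a business rule depends on.

```cpp
#include <cassert>

// Side-effect target: suppose a system invariant counts transactions
// (e.g., for auditing), so changes to it are not mere debugging aids.
int gTransactionCount = 0;

class Account {
public:
    void Deposit(int amount) {
        balance_ += amount;   // local effect: visible through Balance()
        ++gTransactionCount;  // side effect: changes wider system state
    }
    int Balance() const { return balance_; }  // part of the interface
private:
    int balance_ = 0;
};
```

If Deposit() failed after incrementing the counter but before updating the balance, it would have produced a side effect with no local effect, which is exactly the kind of case the isolation axis below is designed to describe.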

I propose that the following descriptions of the ACID
requirements are suitable for describing program
correctness in general and exception safety in
particular. They form four fundamental axes of ACID
exception safety analysis:

Axis Description (Programming)
----------- ----------------------------------------
Atomicity The operation is indivisible, all-or-
nothing, with respect to local effects.

Consistency The operation transforms system from one
consistent state to another consistent
state. Here "consistent" implies
obeying not just object integrity
constraints (i.e., objects continue to
fulfill their invariants) but also
wider system invariants. Note "no
[memory or resource] leaks" is usually a
required program invariant.

Before tackling Isolation, a note: In the database
world, isolation is often all about concurrency -- that
is, access and manipulation of the same data elements
by different transactions operating concurrently. The
same concepts we use to describe database transaction
isolation map directly onto multithreaded and
multiprocess access and manipulation of shared
resources; the same issues apply -- read locks, write
locks, read/write locks, deadlocks, livelocks, and
similar issues.

There is a difference, however, between relational
databases and programs. Standard relational database
systems provide direct support in the database engine
to enforce isolation options such as "Read Committed"
by automatically creating, maintaining, and freeing
locks on individual data objects. They can do this
because the engine "owns," or acts as an arbiter and
access point for, all the data. No such support exists
in most programming languages (including C++ and Java;
Java has some thread safety primitives, but they are
not very useful and do not actually solve the thread
safety problem even in simple cases). In languages
like C++ and Java, if you want locking you have to do
it yourself in an application-specific way by manually
writing calls to acquire and release locks on program
resources like shared variables.
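For example, guarding a shared variable in C++ is purely a matter of application convention (std::mutex, shown here, postdates this article; the pairing of the lock with the variable it protects exists only in the programmer's head and comments):

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Application-managed locking: nothing in the language ties gShared
// to gSharedMutex; the programmer must remember to acquire the lock
// around every access.
static int gShared = 0;
static std::mutex gSharedMutex;  // by convention, guards gShared

void Increment(int times) {
    for (int i = 0; i < times; ++i) {
        std::lock_guard<std::mutex> lock(gSharedMutex);  // manual acquire;
        ++gShared;                                       // released at scope exit
    }
}
```

Forgetting the lock_guard in even one caller silently destroys the isolation guarantee, which is precisely the support a database engine provides automatically and most programming languages do not.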

Isolation Two operations are fully isolated if
they can be started simultaneously and
the result is guaranteed to be the same
as though one was run entirely before
the other ("Serializable"). In practice,
programs implement various levels of
isolation, each of which trades off
better isolation at the expense of
concurrency or vice versa. Some typical
isolation levels, in order from best to
worst concurrency, are: Read Uncommitted
(a.k.a. Dirty Read), Read Committed,
Repeatable Read, and Serializable
(highest isolation); these levels
describe what kinds of intermediate
states of other operations might be
visible.

Interestingly, however, note that concurrency isolation
issues only arise between transactions manipulating the
same data; so, as already noted earlier, isolation is
fundamentally concerned not only with concurrency but
also with the interaction of side effects. As Note 1
above stated:
two transactions operating on nonoverlapping parts of
the database are by definition always fully isolated.

In the interests of space, this article does not pursue
concurrency issues in depth (which would be a lengthy
discussion beyond the immediate scope of the article),
except to note the following: The ACID program
analysis method proposed in this article can be
directly extended to address concurrency issues in the
same way that those issues are addressed in database
environments, with the caveat that most programming
languages do not provide the same native support
provided inherently by relational database management
systems for basic concurrency control and that these
must therefore be implemented by the application. A
later article will expand on this wider aspect of
isolation.

This article then focuses specifically on the aspects
of isolation of interest in a nonshared/serialized
environment, namely the interaction of side effects:

Isolation Regardless of threading and other
(continued) concurrency issues, an operation on one
or more objects is fully isolated if it
does not cause side effects (i.e., only
causes local effects on those objects).

Durability If an operation completes successfully,
all its effects are preserved even if
there is a subsequent system crash. For
example, a rocket once fired stays in
flight even if the launching software
in the submarine crashes afterwards.

Here's a difference: In the database world, atomicity,
consistency, and durability are always fully required
and it is usually isolation that is traded off (with
some indirect effects on atomicity and consistency).
In the programming world, atomicity is normally assumed
(but if there are asynchronous interrupts or
exceptions, atomicity has to be explicitly programmed
for), consistency in terms of maintaining invariants is
also normally assumed, and durability is not often
considered except in database-like applications, where
the durability is made the responsibility of an
underlying database or file system.

Really, it looks like what we are building up to is a
transactional approach to program correctness,
including C++ exception safety, arrived at by adapting
concepts that are already well-understood in another
area of computer science.


A Notation for Safety Guarantees
--------------------------------

Next, let's analyze the possible statements we can make
about each axis. For each axis, we can make one of
three statements:

a) that there is no guarantee at all (designated by a
dash, '-', or simply omitting mention of the
guarantee);

b) a statement about effects if an operation may fail
(designated by a lowercase letter, such as 'i'); or

c) in addition to b), a statement about effects if an
operation does not or cannot fail (designated by an
uppercase letter, such as 'I').

Note that these are true "levels" in that each subsumes
the one before it. I omit a fourth possible case ("a
statement about effects if an operation succeeds [but
with no statement about what happens on failure]")
because it is no better than a) for analyzing failure
cases.

Here's what we get:

Level Description
----------- ----------------------------------------

Atomicity:

- No guarantee.

a If the operation fails, the manipulated
object(s) are in their initial state,
otherwise the manipulated object(s) are
in their final state.

A The operation is guaranteed to succeed
and the manipulated object(s) are in
their final state. (For exception-
related failures, this is spelled
"throw()" in C++; but an operation
could still fail in ways other than
throwing an exception.)

Note that atomicity as defined here concerns itself
only with local effects, that is, with the state
of a directly manipulated object.

Consistency:

- No guarantee.

c If the operation fails, the system is in
a consistent state. (Depending on what
atomicity guarantee the operation also
makes, and on whether and how a failure
occurred, the consistent state may be
the initial state, the final state, or
possibly an intermediate state.)

C If the operation succeeds or fails, the
system is in a consistent state.

Any operation that can result in objects no longer
meeting their invariants should be considered
buggy. No program safety, let alone exception
safety, is possible without a guarantee of
consistency on success, so the rest of this article
will generally rely on C. Further, we require that
at the beginning of the operation the system is in
a consistent state, else no reasoning about
consistency is possible or useful.

Isolation:

- No guarantee.

i If the operation fails, there will be no
side effects.

I If the operation either succeeds or
fails, there will be no side effects.

Note that isolation as defined here concerns itself
only with side effects. In many ways it is a
mirror of atomicity.

Note that 'a'+'i' => 'c', because we require that
the objects and the system were initially in a
consistent state.

Durability:

- No guarantee.

d If the operation fails, whatever state
was achieved will persist.

D If the operation succeeds or fails, the
state persists.


Interlude: FAQs and Objections
------------------------------

At this point, several questions or objections arise,
and it makes sense to pause briefly and handle them
now:

Q: "I think 'c' and 'd' may not be useful, on the
grounds that they seem backward: Shouldn't
operations first specify whether they produce
consistent and/or durable results if they succeed,
and then optionally what they do if they fail?"

A: An interesting outgrowth of the above ACID view,
then, is that it is actually more important first to
state what operations do if they fail, and only then
to state in addition what they do if they succeed.
(I grant that programmers who find it annoying that
their companies force them to document their
functions' failure modes may also dislike the
previous statement.)

Q: "The database version of 'atomicity' includes all
effects, whereas the above doesn't. Is that right?"

A: Yes. It is useful in programming to distinguish
between local effects and side effects, and the
global atomicity guarantee still exists and is
expressed as "both 'a' and 'i'."

This last point leads nicely into the question of
combining guarantees:


A Notation for Combining Guarantees
-----------------------------------

The four axes of correctness analysis are not entirely
independent; one can affect the other, as for example
above 'a'+'i' => 'c'.

For notational convenience I will designate each
combination of guarantee levels as a 4-tuple (such as
"-CID" or "aC-D"), where each position specifies the
level of the atomicity, consistency, isolation, and
durability axis (respectively) that is guaranteed, if
any. For convenience, the '-'s could also be omitted.

This article focuses primarily on the first three axes,
so in the following discussion the durability guarantee
level will often be shown as - or omitted. Because the
durability axis is independent of the others, this can
be done without loss of generality.

Note that, if the operation guarantees that it will
always succeed (A), then it is not very meaningful to
make a guarantee about what happens if the operation
fails without a guarantee about what happens if the
operation succeeds (c, i, or d). So, for example, in
every case where ACi is guaranteed, AC is equivalent
and should be specified instead.


(RE)VIEWING EXISTING APPROACHES THROUGH THE
LENS OF TRANSACTIONAL PROGRAMMING
=======================================================

Describing Existing Guarantees as ACID Guarantees
-------------------------------------------------

The first goal of this taxonomy was:

> - It should be inclusive of existing work, covering
> existing exception safety analysis methods.

Hence Question #4:

>4. Treat the analysis from Question #3 as a taxonomy
> and use it to describe the Abrahams and other prior
> guarantees. Discuss.

The ACID model does indeed provide a concise and
explicit notation to describe the Abrahams guarantees,
Reeves' Good, Bad, and Undefined states, and the
destructible guarantee:

---- Reeves Undefined state guarantee
---- Reeves Bad state guarantee
---- Destructible guarantee

-C-- Abrahams basic guarantee
-C-- Reeves Good state guarantee

aCi- Abrahams strong guarantee

AC-- Abrahams nothrow guarantee

Note that this analysis points out at least two
weaknesses in prior models:

First, the prior models generally ignored the issue of
isolation. Note that none of the above-listed prior
guarantees included I. When using the prior models we
frequently found that we had to add amplifications
like, "it meets the strong guarantee except there could
be side effects." Descriptions like that implicitly
point to missing guarantees, which are now accounted
for above.

Second, the above ACID analysis makes it clear why
Mueller's and Reeves' "Bad" and "Undefined" states, and
the sometimes-cited "destructible" guarantee, are all
the same thing from a program correctness point of
view.

I have always thought that the "Bad," "Undefined," and
"destructible" ideas had problems. There's nothing
magical about exiting a function via a C++ exception
compared to returning an error value indicating the
operation failed, so consider: Say that a programmer
on your team wrote a function that could never throw
and just returned a code to indicate success or
failure, and that could leave an object in a state
where it no longer met its invariants (or, even worse,
couldn't even be destroyed). Would you accept the
work, or would you send the programmer back to his desk
to fix his bug? I think most of us would agree that
the function just described is plainly buggy (and so is
its specification if the specification says it can
render objects invariant-breaking). So why should the
answer be any different in the case of failure reported
by throwing an exception? Clearly it shouldn't, and in
fact none of the Bad, Undefined, or destructible
"guarantees" really gives any kind of useful program
correctness or safety guarantee, because all useful
program correctness guarantees include at least the
concept of preserving object and system invariants.

[Aside: Consider "Bad" and "destructible" a little
further. If a function f() can leave any object t in a
state wherein all you can ever do is call t's
destructor, then f() is a sink in all but name, and
might as well destroy t too.]


Filling Holes: ACID and GotW #59
--------------------------------

The second goal of this taxonomy was:

> - It should help to fill holes in existing methods,
> such as a situation where given sample code was
> known to support some exception-safety guarantee but
> that guarantee could not be adequately described
> using the existing models alone.

This brings us to Question #5:

>5. Does the answer from Question #4 equip us to give a
> better answer to Question #1? Explain.

When we considered Question #1, I said that it seems
like there ought to be a name for a concept as
fundamental as "the strongest guarantee you can make
(for assignment) without requiring assumptions about
the used type." In the ACID model, it turns out that
there is:

aC-- The guarantee provided by the technique
shown in GotW #59, if T construction and
destruction provide at least -C-- (which
should be a given for correct programs).

Without making any assumptions about the used type(s),
is it possible to make a stronger guarantee than aC for
a generalized assignment operator? I believe it is
possible to show fairly rigorously that the answer is
No, as follows:

1. Consider a simplified version of the T::operator=()
code.

// (Simplified) solution to Cargill's Widget Example
//
class Widget { /*...*/ struct Impl* pimpl_; };

class Widget::Impl { /*...*/ T1 t1_; T2 t2_; };

Widget& Widget::operator=( const Widget& other )
{
    Impl* temp = new Impl( *other.pimpl_ );
    // construction could fail

    delete pimpl_;
    pimpl_ = temp;
    return *this;
}

As before, the above code invokes the Widget::Impl
constructor to create a new object from the existing
one. If the construction fails, C++'s language
rules automatically destroy any fully-constructed
subobjects (such as t1_, if the t2_ construction
fails). So we could end up getting any of the
following function call sequences:

Case 1 (Failure): T1::T1() fails
Case 2 (Failure): T1::T1(), T2::T2() fails, T1::~T1()
Case 3 (Success): T1::T1(), T2::T2()

2. Consider the C++ language rules. The C++ language
rules strictly define the meaning of construction
and destruction (in both cases we assume the
Consistency guarantee because, for example, any
type whose constructor does not yield a valid object
of that type is just buggy):

C++ Constructor Guarantee: aC--
A constructor, if successful, must create a
valid object of the indicated type; if it
fails, it has not created anything.

C++ Destructor Guarantee: AC--
A destructor destroys a valid object of the
indicated type, and so is an inverse of any
constructor of the same type. Further, a
destructor is guaranteed to succeed, at least
in the sense that the object it operates on is
considered fully-destroyed, or as-destroyed-as-
it-can-ever-be because you can't re-run a
failed destructor. (Further, the standard
requires that any destructor used in the
standard library may not throw; this includes
the destructor of any type you can legally
instantiate a container with.)

At any rate, the above solution assumes
implicitly that at least T1::~T1() guarantees
A---, else there is no way to write a correct
Widget::operator=() anyway. For more on these
issues, see Item 16 in [6].

3. Analyze Atomicity (i.e., local effects). In each of
the two failure cases, the only local effects are
the possible construction or destruction of the t1_
and t2_ member objects, because these are the only
objects being directly manipulated.

Case 1 (Failure): T1::T1() fails
The only attempted operation promises at least
'a' atomicity, so no local effects occur and
the entire assignment has no local effects.

Case 2 (Failure): T1::T1(), T2::T2() fails, T1::~T1()
T2::T2() promises at least 'a' atomicity, so
does not have any local effects. But T1::~T1()
is guaranteed to succeed and undoes all local
effects of T1::T1(), and so the entire
assignment has no local effects.

Case 3 (Success): T1::T1(), T2::T2()
Success, all local effects have been
successfully produced.

Failure is possible, but all failure cases have no
local effects. Therefore the assignment's best
possible atomicity guarantee level is 'a' -- thanks
to the guarantees inherent in the C++ language.

4. Analyze Consistency. Failure is possible, but in
each case each suboperation guarantees the
progression from one consistent state to another
regardless of success or failure, so the entire
assignment ends in a consistent state. Therefore
the assignment's best possible consistency guarantee
level is 'C' -- again thanks to the guarantees
inherent in the C++ language.

5. Analyze Isolation. No guarantee is possible here,
because no suboperation makes any guarantee about
side effects. For example, in Case 1, T1::T1() may
fail in a way that has already affected global
state.

5. Analyze Durability. No guarantee is possible here
either, because no suboperation makes any guarantee
about durability. For example, in Case 1, T1::T1()
may fail in a way that does not achieve persistence
of the final state.

So aC is the strongest exception safety guarantee one
can make for assignment without requiring any
assumptions at all about the types used. That's a
useful piece of information, especially in generic
programming using language facilities like C++
templates where we cannot always assume knowledge of
the guarantees provided by arbitrary and even
unknowable types.
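
The case analysis above can also be exercised directly
in code. The following is a minimal sketch using
hypothetical instrumented types T1 and T2 (the names
match the article's roles, nothing more) showing that
the language itself delivers Case 2's "no local
effects": when T2's constructor throws, T1's
destructor runs during stack unwinding.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical instrumentation, not from the article.
static int t1_alive = 0;
static bool t2_should_throw = false;

struct T1 {
    T1()  { ++t1_alive; }   // may throw in general; at least 'a'-atomic
    ~T1() { --t1_alive; }   // A-safe: never throws
};

struct T2 {
    T2()  {
        if (t2_should_throw)
            throw std::runtime_error("T2::T2 failed");
    }
    ~T2() { }               // A-safe: never throws
};

// Members are constructed in declaration order, so if
// T2::T2() throws, the language runs T1::~T1() during
// unwinding -- the article's Case 2, no local effects.
struct Impl {
    T1 t1_;
    T2 t2_;
};
```

Constructing an Impl while t2_should_throw is set
propagates the exception and leaves t1_alive at zero,
i.e., no local effects remain.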

Better still, we can make an even stronger statement
with a little knowledge about the types we're using:

aCI- The guarantee provided by the technique
shown in GotW #59, *if* T1 and T2
construction and destruction provide at
least -CI-.

Interestingly, note that this guarantee (aCI-) promises
more than the Abrahams strong guarantee (aCi-).

In short:

- With no knowledge of the used types T1 and T2, it is
always possible to write an assignment operator for
T that gives "better than the basic guarantee" --
specifically, such an assignment operator is
aC-safe.

- With the additional knowledge that T1 and T2
construction and destruction are CI-safe, it is
always possible to write an assignment operator for
T that gives "better than the strong guarantee" --
specifically, such an assignment operator is
aCI-safe.

Neither of these guarantees is concisely expressible
using prior exception and program safety models.

TRANSACTIONAL PROGRAM CORRECTNESS ANALYSIS
=======================================================

The third goal of this taxonomy was:

> - It should help us reason about and analyze the
> safety of code, making it less a craft and more a
> methodical engineering task.

Let's see how well we can do.


ACID Safety Analysis
--------------------

It's important that the ACID program correctness
taxonomy precisely describes the prior Abrahams and
other guarantees, but perhaps the best thing about the
ACID model is that it simplifies our analysis because
the compound guarantees are no longer the important
thing at all.

Consider first the following principle, which is
perhaps the most important result of this article:

The Transactional Principle

All programming operations should be considered as
transactions, with associated ACID guarantees. This
applies regardless of whether exceptions are
possible or absent.

All programming operations are amenable to this new
style of ACID transactional analysis. After years of
differing with Dave Abrahams on this point, I can now
in good conscience come around to his point of view
that "exceptions are not magical" -- they are simply
another failure case, albeit with complex programming
semantics in C++ that come into play when you actually
use them, but conceptually no different from an error
return code. The reason I can agree with Dave
Abrahams only now is that, after writing about and
advancing this topic for several years, I finally
understand exception safety well enough to reason
about it confidently as simply another form of
"early return."

By defining distinct and largely independent axes, the
ACID model gives us a simpler and systematic -- even
mechanical -- way to analyze an operation's
correctness: Analyze it according to the individual
safety guarantees, one axis at a time, not according to
aggregated "compound guarantees." The way to analyze a
function, then, is to examine each axis' guarantee
level in turn:

1. If the axis has no guarantee, do nothing.

2. If the axis has a guarantee about a failure case,
then for each possible EARLY return (each possible
failure return, or each possible throw or interrupt
point, i.e., each non-A-safe function call we
perform as a suboperation), ensure the guarantee is
met if the early return is taken.

3. If the axis makes a guarantee about a success case,
then for each possible NORMAL return, ensure the
guarantee is met if the return is taken.
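
As a sketch of this mechanical, per-return analysis,
consider a hypothetical function (the name and the
container are my own, not the article's) with each
possible return annotated with the check it must pass:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical function, analyzed return by return as
// the three steps above prescribe.
void AppendName(std::vector<std::string>& names,
                const std::string& n)
{
    if (n.empty())
        return;          // EARLY return: [a] holds trivially --
                         // nothing has been modified yet.

    names.push_back(n);  // possible EARLY exit: push_back may
                         // throw (e.g. std::bad_alloc), but the
                         // standard gives it the strong
                         // guarantee, so 'names' is unchanged
                         // on failure: [a] holds.

    // NORMAL return: [C] holds -- 'names' has one more
    // element and its invariants are intact.
}
```

Enumerating every early and normal return this way is
exactly the axis-by-axis checklist, applied one return
at a time.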

Let's consider examples to show why this can make
reasoning about exception safety simpler, more
systematic, more mechanical -- which is to say more
like engineering and less like craft:


Example: Analyzing for aCi-Safety (the Strong Guarantee)
--------------------------------------------------------

Say you are writing a function that must guarantee
aCi-safety (the Abrahams strong guarantee). How do you
analyze whether it meets the guarantee? By considering
each axis independently as follows:

1. [a] For each possible EARLY return, ensure the
immediate object is returned to the initial state if
the early return is taken.

2. [C] For each NORMAL return, ensure the immediate
object's state is consistent.

3. [i] For each possible EARLY return, ensure that side
effects are undone if the early return is taken.

Note that we didn't need to consider the c-safe aspect
because ai-safe => aci-safe since we require that the
initial state of the system be consistent.


Another Example: Analyzing for aC-Safety
----------------------------------------

Now say you are writing a function that must guarantee
aC-safety. How do you analyze whether it meets the
guarantee? By considering each axis independently as
follows:

1. [a] For each possible EARLY return, ensure the
immediate object is returned to the initial state if
the early return is taken.

2. [c] For each possible EARLY return, ensure that the
system is in a consistent state if the early return
is taken.

3. [C] For each NORMAL return, ensure the immediate
object's state is consistent.


OTHER RESULTS AND CONCLUSIONS
=======================================================

The Importance of "Big-A" Atomicity
-----------------------------------

As pointed out by Abrahams and Colvin and others, to
guarantee any kind of exception safety it is essential
that certain operations -- notably destructors and
deallocation functions -- be required not to throw.
That is, they must be at least A-safe. We rely on this
in all sorts of situations, notably:

- The "do all the work off to the side, and commit
using only nonthrowing operations" method is not
possible if there are no suitable nonthrowing
(A-safe) operations.

- The related "create a temporary and Swap()" method,
which is the standard form of exception-safe copy
assignment, is not possible without a nonthrowing
(A-safe) Swap().
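
To illustrate why the A-safe Swap() matters, here is a
minimal sketch of the "create a temporary and Swap()"
form of assignment. The Widget class and its
std::string member are hypothetical, chosen only so
the idiom is self-contained:

```cpp
#include <cassert>
#include <string>

// Hypothetical class demonstrating copy-and-swap.
class Widget {
public:
    explicit Widget(const std::string& name = "")
        : name_(name) { }

    // A-safe: std::string::swap never throws.
    void Swap(Widget& other) { name_.swap(other.name_); }

    Widget& operator=(const Widget& other) {
        Widget temp(other);  // all work that can fail
                             // happens here, off to the side
        Swap(temp);          // commit using only a
                             // nonthrowing (A-safe) operation
        return *this;
    }

    const std::string& Name() const { return name_; }

private:
    std::string name_;
};
```

If copying into temp throws, *this is untouched
(a-safety); once control reaches Swap(), nothing can
fail. Note that this form also handles self-assignment
without a special test.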

An interesting point raised by Greg Colvin is that for
this reason it does not appear to be possible to write
exception-safe code in Java. The Java specification
currently allows the JVM to throw an exception at any
time at arbitrary points in the program, for example
asynchronously when stop is called on a Thread object.
In practice it isn't that bad, but it is worrisome that
the Java specification can apparently be shown to
preclude writing exception-safe code because there is
no way to absolutely guarantee A-safe operations in
Java.


Side Effects Last
-----------------

It seems best to try to perform side-effecting
operations last, in order to put off side effects.
That way, if a failure occurs earlier in the
operation, no side effects will have begun. This
helps to limit the scope for side effects after
failures.


Future Directions
-----------------

An important area for further study is a rigorous
treatment of analyzing compound operations. For
example, as illustrated in one of the analysis examples
above, the consistency guarantees operate like a chain;
an operation is c-safe only if all suboperations are
c-safe (any one that isn't becomes the weak link that
breaks the chain), and an operation is C-safe only if
all suboperations are C-safe. That's fine for
consistency... now, is there anything we can conclude
similarly about atomicity, isolation, durability, or
combinations thereof in suboperations?


Summary
-------

When I started this taxonomy, I had in mind applying
database concepts to better analyze and describe C++
exception safety. In the process of doing so, however,
I kept on coming back to the simple terms "successful
operation" and "failed operation," and finally realized
(strongly influenced, I am sure, by papers like [1])
that the latter applied completely and equally to ALL
forms of failure -- C++ exceptions, Java exceptions, C
return codes, asynchronous interrupts, and any other
imaginable failure mode and reporting mechanism. That
made me realize that what I was developing might not
really be just a taxonomy for analyzing C++ exception
safety at all, but rather something much more general:
A taxonomy for describing program correctness in all
its aspects in all languages. Further use, critique,
and development will show whether this approach is
indeed that widely applicable or not.

Finally, consider two fundamental implications:

1. The Transactional Principle: All programming
operations should be considered as transactions,
with associated guarantees, and are amenable to this
new style of ACID transactional analysis. This
applies regardless of whether exceptions are
possible or absent.

2. As pointed out by Dave Abrahams in [1] and other
places, "exception safety" is a bit of a misnomer,
and the ACID model bears this out by not
distinguishing between different kinds of "early
returns," whether they be through error codes,
through exceptions, or through some other method.
Exception-related "safety" is not some magical
entity or unknowable phantom; "safety" is just
another name for "correctness." When we want to
talk about "exception safety," we should instead
just talk about "program correctness" without
arbitrarily highlighting one form of correctness.
Unfortunately, this only underscores that languages
that have exceptions but do not allow exception-safe
programming (such as by not supporting the strongest
no-failure atomicity guarantee, or A-safety)
actually have a worse problem, namely insufficient
support for general program correctness.

If this taxonomy and its related analysis methods are
at all useful, they will be subject to refinement. I
invite readers to analyze and expand upon these ideas.


Acknowledgments
---------------

Thanks to Dave Abrahams, Greg Colvin, and Scott Meyers
for incisive reviews of this article, along with much
other fun and productive discussion of C++ topics over
the years.

[1] Abrahams, D. "Exception-Safety in Generic Components,"
in "Generic Programming: Proceedings of a Dagstuhl
Seminar," Jazayeri M., R. Loos, and D. Musser, eds.
(Springer Verlag, 1999).

[2] Mueller, H. "10 Rules for Handling Exception
Handling Successfully" (C++ Report, 8(1), January
1996).

[3] Reeves, J. "Coping with Exceptions"
(C++ Report, 8(3), March 1996). Available online at
http://meyerscd.awl.com/DEMO/MAGAZINE/RE_FRAME.HTM.

[4] Date, C.J. "An Introduction to Database Systems,
7th Edition" (Addison-Wesley, 1999).

[5] Sutter, H. "Namespaces and the Interface Principle"
(C++ Report, 11(3), March 1999).

[6] Sutter, H. "Exceptional C++" (Addison-Wesley, 2000).

[7] Sutter, H. "Peer Distributed Databases"
(Addison-Wesley, forthcoming).


---
Herb Sutter (mailto:hsu...@peerdirect.com)

PeerDirect Inc. 2695 North Sheridan Way, Suite 150
www.peerdirect.com Mississauga Ontario Canada L5K 2N6

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]


Bill Wade
Nov 1, 1999

Herb Sutter wrote in message <38192d73...@nntp.netcom.ca>...
> [ Moderator might complain if I quoted 46k of text]

Lots of very good points. I'll play devil's advocate and complain about the
things I don't like.

I intend this to be constructive criticism for Herb's use, so I'll tend to
use 'you' where most readers will want to read 'he.'

I don't like overloading the meaning of ACID this way. There are some
interesting parallels, but in the traditional sense each letter of ACID
refers to the system as a whole rather than parts of the system (say the
contents of a record as "local" information and the entries in an index as
"global" information). Herb specifically applies 'A' to local state and 'I'
to non-local state or behavior. I believe it would be most helpful to pick
another acronym. People are likely to mix up his meaning of 'I' with the DB
meaning of 'I' when attempting to validate designs.

The notion of a consistent system is clearly context dependent.
set<int>::insert should meet the strong guarantee (meaning it is at least
aCi), and since integer construction has no side effects it should be aCI.
However a successful insert may temporarily leave the system as a whole in
violation of one of one of its invariants (perhaps all interesting integers
are in set A and all interesting prime integers are in set B, the invariant
is temporarily off between the two insertions).

In a similar vein, it is not clear to me where you are drawing the boundary
between the "immediate" or directly manipulated object and "side effects."
In your definition of atomicity you discuss state changes to "directly
manipulated" objects. In your example analysis you refer to "the immediate
object." If an object's constructor manipulates some global objects, when,
if ever, do 'atomic' rules apply to them?

Finally I think any such taxonomy should indicate the importance of
operations which can be undone even after they have successfully completed.
This seems to be orthogonal to the "local" vs. "side effect" issue and is
related to your "importance of big A" and "Side Effect Last" sections. If
"can be undone" is 'U' we get a preferred partial ordering of operations:
Ua < ua (prefer to do things that can be undone first).
ua < uA (prefer to do things that can fail first).
UA items (can be undone and never fail) can safely be inserted anywhere.
'new' is normally a Ua operation (easily undone, sometimes fails). 'delete'
is a uA operation (usually can't be undone, never fails). If I wanted an
assignment operator that had a side effect of launching a rocket I would
want to be sure to modify your operator= as follows:

Widget& Widget::operator=(const Widget& other)
{
    auto_ptr<Impl> temp(new Impl(*other.pimpl_)); // May fail. Easily undone.
    LaunchRocket();           // May fail. If it succeeds, can't be undone.
    delete pimpl_;            // Never fails. Can't be undone.
    pimpl_ = temp.release();
    return *this;
}

Both the copy-constructor line and the LaunchRocket line might be aCi.
However to make operator= as a whole aCi it is important that LaunchRocket
be done after the copy construction and before the delete.

Pierre Baillargeon
Nov 2, 1999

Herb Sutter wrote:

> Second, the above ACID analysis makes it clear why
> Mueller's and Reeves' "Bad" and "Undefined" states, and
> the sometimes-cited "destructible" guarantee, are all
> the same thing from a program correctness point of
> view.
>

[...]

> fact none of the Bad, Undefined, or destructible
> "guarantees" really gives any kind of useful program
> correctness or safety guarantee, because all useful
> program correctness guarantees include at least the
> concept of preserving object and system invariants.

I disagree here. There is a vast body of programs that need that
guarantee, and the ones I'm currently working on are part of it. These
programs are the ones in which any failure causes the program to exit.
But to be able to exit, all local and static variables must be
destroyable. Thus that guarantee is useful.

Providing less could cause the program to crash (which is not treated
the same as exiting with an error code), and providing more is just
wasting time.

[...]


> An interesting point raised by Greg Colvin is that for
> this reason it does not appear to be possible to write
> exception-safe code in Java. The Java specification
> currently allows the JVM to throw an exception at any
> time at arbitrary points in the program, for example
> asynchronously when stop is called on a Thread object.
> In practice it isn't that bad, but it is worrisome that
> the Java specification can apparently be shown to
> preclude writing exception-safe code because there is
> no way to absolutely guarantee A-safe operations in
> Java.

I almost forgot I'm reading a C++ newsgroup where cheap shots against
Java seem almost a sport. The above statement seems to imply that C++ is
different. There is nothing called "exit()", "abort()", "unexpected",
"signals" (from which exit() can be called), "division by zero", "bad
pointers", and such in C++, of course. Use threads in C++, and you won't
get the same result. Noooo. Of course not.

Can anyone tell me what is the difference between a stop exception going
up the Java call stack and a call to exit() from a signal in C++? I
didn't think so.

{This is an interesting point. My comment wasn't meant to be
argumentative, and if you're a reader of GotW you know that
most of my complaints in it are about C++! :-) Regards, -hps}

Pete Becker
Nov 3, 1999

Pierre Baillargeon wrote:
>
> I almost forgot I'm reading a C++ newsgroup where cheap shots against
> Java seem almost a sport.

No, they're just a warmup exercise. Too easy a target to be a real
sport.

--
Pete Becker
Dinkumware, Ltd.
http://www.dinkumware.com

Dave Abrahams
Nov 4, 1999

In article <381F10DD...@artquest.net> , Pierre Baillargeon
<p...@artquest.net> wrote:

>Herb wrote:
>> Second, the above ACID analysis makes it clear why
>> Mueller's and Reeves' "Bad" and "Undefined" states, and
>> the sometimes-cited "destructible" guarantee, are all
>> the same thing from a program correctness point of
>> view.
>>

> [...]


>
>> fact none of the Bad, Undefined, or destructible
>> "guarantees" really gives any kind of useful program
>> correctness or safety guarantee, because all useful
>> program correctness guarantees include at least the
>> concept of preserving object and system invariants.
>

> I disagree here. There is a vast body of programs that need that
> guarantee, and the ones I'm currently working on are part of it. These
> programs are the ones in which any failure causes the program to exit.

I agree that this is a very important class of programs.
Nonetheless, from a logical standpoint, "destructible" cannot be thought of
as a separate condition outside the class invariant. In your case, the
conditions associated with "destructible" may be the only invariants your
class really has. This is not just a pedantic argument: any operation you
can perform on an object can occur after an exception has been thrown,
provided there is a catch block. If the class invariant is broken at that
point, what will the result be?

> But to be able to exit, all local and static variables must be
> destroyable.

I beg to differ (18.2.2/5): "Automatic objects are not destroyed as a
result of calling exit()." You could just call exit(). Why bother
unwinding at all in this case?

> Thus that guarantee is useful.

> Providing less could cause the program to crash (which is not treated
> the same as exiting with an error code), and providing more is just
> wasting time.

It's been my experience that the effort required to go from the
"destructible" guarantee to the basic guarantee is vanishingly small, and
well worth the effort. You don't have to worry so much about what you might
want to do in a catch block, and your classes then have a hope of a future
in systems which need to recover. I've encountered large systems built on
the model "kill the process and start over in case of an exception" which
are struggling to adapt in environments which can't tolerate that strategy.
Designing useful class invariants goes a long way toward ensuring the
long-term viability of your code.

> [...]


>> An interesting point raised by Greg Colvin is that for
>> this reason it does not appear to be possible to write
>> exception-safe code in Java. The Java specification
>> currently allows the JVM to throw an exception at any
>> time at arbitrary points in the program, for example
>> asynchronously when stop is called on a Thread object.
>> In practice it isn't that bad, but it is worrisome that
>> the Java specification can apparently be shown to
>> preclude writing exception-safe code because there is
>> no way to absolutely guarantee A-safe operations in
>> Java.
>

> I almost forgot I'm reading a C++ newsgroup where cheap shots against

> Java seem almost a sport. The above statement seem to imply that C++ is
> different. There are nothing called "exit()", "abort()", "unexpected",
> "signals" (from which exit() can be called), "division by zero", "bad
> pointers", and such in C++, of course. Use threads in C++, and you won't
> get the same result. Noooo. Of course not.

Um, actually, there's no reason a well-designed thread package in C++ ought
to do that. There really is a big difference here between the two languages.
Writing recovery code absolutely depends on being able to count on certain
operations not throwing an exception. This behavior is guaranteed by the C++
language. As I understand it, the Java language specification specifically
allows exceptions to occur at any time.

And, BTW, I consider the invocation of the "division by zero" and "bad
pointer" cases to be a red herring. These fall into the realm of "undefined
behavior", and are relatively easily avoided.

> Can anyone tell me what is the difference between a stop exception going
> up the Java call stack and a call to exit() from a signal in C++ ?

To begin with, they're not the same thing at all. In the Java case, there is
an exception. In the C++ case there is none (exit() is not an exception -
local variables are not destroyed). Furthermore (as I understand it) this
behavior is defined into the Java language, but signals are not defined in
C++. An implementation is free to choose any number of responses to signals.
Many platforms don't support signals. The upshot is that if you're
programming in Java on any platform, you should expect an exception at any
time. The same is not true of C++.

>I didn't think so.

Think again ;)

You're-only-forced-to-accept-what-appears-in-the-standard'ly-yr's,
Dave

Pete Becker
Nov 5, 1999

Dave Abrahams wrote:
>
>
> > Can anyone tell me what is the difference between a stop exception going
> > up the Java call stack and a call to exit() from a signal in C++ ?
>
> To begin with, they're not the same thing at all. In the Java case, there is
> an exception. In the C++ case there is none (exit() is not an exception -
> local variables are not destroyed). Furthermore (as I understand it) this
> behavior is defined into the Java language, but signals are not defined in
> C++. An implementation is free to choose any number of responses to signals.
> Many platforms don't support signals. The upshot is that if you're
> programming in Java on any platform, you should expect an exception at any
> time. The same is not true of C++.
>

In fairness to Java, here's the straight dope on asynchronously stopping
a thread in Java. It's legal, but at present deprecated. Here's the
explanation, from the JDK 1.2 online documentation:

"Deprecated. This method is inherently unsafe. Stopping a thread with
Thread.stop causes it to unlock all of the monitors that it has locked
(as a natural consequence of the unchecked ThreadDeath exception
propagating up the stack). If any of the objects previously protected by
these monitors were in an inconsistent state, the damaged objects become
visible to other threads, potentially resulting in arbitrary behavior.
Many uses of stop should be replaced by code that simply modifies some
variable to indicate that the target thread should stop running. The
target thread should check this variable regularly, and return from its
run method in an orderly fashion if the variable indicates that it is to
stop running. If the target thread waits for long periods (on a
condition variable, for example), the interrupt method should be used to
interrupt the wait."

Of course, they also made disparaging comments about
java.io.PrintStream, and deprecated its two constructors in JDK 1.1.5.
In JDK 1.2 the constructors are no longer deprecated.

--
Pete Becker
Dinkumware, Ltd.
http://www.dinkumware.com
