Vindicated? Sutter and COW Strings

Joshua Lehrer

Sep 7, 2004, 6:34:59 PM

I was happy as I read Sutter's article about COW strings. We here at
FactSet wrote a COW string some time ago. I have always claimed it
was thread safe, when written properly, using atomic integral
operations:

http://snipurl.com/8wsm
http://snipurl.com/8wsj

As I read Sutter's article, I felt vindicated. Our code is very
similar, and behaves in almost exactly the same way.

In fact, our string::swap even returns a reference. This is something
that I threw in because I thought it would be convenient. I'm glad to
see that I was "ahead" of the curve. :)

joshua lehrer
factset research systems
NYSE:FDS

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

ka...@gabi-soft.fr

Sep 8, 2004, 6:00:05 PM

usene...@lehrerfamily.com (Joshua Lehrer) wrote in message
news:<31c49f0d.04090...@posting.google.com>...

> I was happy as I read Sutter's article about COW strings. We here at
> FactSet wrote a COW string some time ago. I have always claimed it
> was thread safe, when written properly, using atomic integral
> operations:

> http://snipurl.com/8wsm
> http://snipurl.com/8wsj

> As I read Sutter's article, I felt vindicated. Our code is very
> similar, and behaves in almost exactly the same way.

The problem is that Herb's code isn't thread safe. At least, not using
the definition of thread safety he uses, and supposing that his class
supports an interface compatible with std::string.

It is quite possible to write a thread safe String class using atomic
increment and atomic decrement. I don't think that it can be done if
the class returns raw non-const references, or if it has an iterator
which returns raw non-const references, however. To date, all attempts
that I've seen have failed. (Herb's code is almost exactly like that of
the g++ implementation, and that is only thread-safe for the perverted
definition of thread-safe that g++ uses -- that I need to protect access
even if no thread ever modifies the object.)

In the second message you quote above, there is one point on which I am
not clear. You say "If you have a global std::string, that multiple
threads may be modifying and reading, then it must be locked upon
access." The problems occur when you have a global non-const string
that nobody modifies. My typical example is some configuration
parameter which is initialized in main, from the command line arguments
or from a configuration file, before any threads have been started, and
then is never modified. Logically, IMHO, several threads should be able
to access this variable without external locking. If the string class
is not COW, they can. According to Posix, they can (or could, supposing
that std::string were a type known to Posix). Regretfully, it doesn't
work with Herb's implementation, nor with the g++ implementation.
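A minimal sketch of the configuration-parameter scenario described above. The names `g_config`, `count_char` and `read_from_threads` are hypothetical, and `std::thread` postdates this thread; it is used only to make the example self-contained. The readers go through a const reference, so only const members are called. This is a user-side workaround under those assumptions, not a fix to any string class, and it does not address the point that the non-const calls ought logically to work as well.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical configuration parameter: written once before any thread
// starts, never modified afterwards.
std::string g_config = "verbose";

// Readers take a const reference, so only const members are called and
// no copy-on-write "isolate" step is requested from the string.
std::size_t count_char(const std::string& s, char c) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == c) ++n;   // const operator[]
    }
    return n;
}

// Starts `readers` threads that each read g_config without any lock and
// returns the sum of their individual counts.
std::size_t read_from_threads(int readers, char c) {
    assert(readers > 0);
    std::vector<std::size_t> results(static_cast<std::size_t>(readers));
    std::vector<std::thread> threads;
    for (int t = 0; t < readers; ++t) {
        threads.emplace_back([&results, t, c] {
            results[static_cast<std::size_t>(t)] = count_char(g_config, c);
        });
    }
    for (auto& th : threads) th.join();
    std::size_t total = 0;
    for (std::size_t r : results) total += r;
    return total;
}
```

Each thread writes only its own slot of `results`, so the only object shared between the readers is the string itself.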

For an example of the problem, consider the following scenario: the
string has been initialized, as above. You then start three threads.

Thread 1 makes a local copy of the string. The reference count is thus
two.

Thread 2 decides to read a character in the string, by calling
operator[]. Since the string is declared non-const, thread 2 calls
the non-const version of operator[], even though it is not modifying
the string. However, because the implementation of String cannot
know that no modification is intended, it must isolate the
implementation (i.e. execute the copy on write) -- in Herb's code,
this means call ensureUnique(), in the g++ implementation, call
_M_mutate(). Both of these functions do basically the same thing:
they check whether the string is shared (it is), and if (since) it
is, they allocate a new copy, and copy the data.

At this point, however, thread 2 is interrupted by thread 3.

Thread 3 does exactly the same thing as thread 2. Since the string is
still shared, it also allocates a new image, and starts copying the
data. At this point, it gets interrupted.

Thread 1 comes back to life. Its local copy of the string goes out of
scope, decrementing the reference count. The reference count is now
one.

Thread 2 comes back to life. It finishes copying the data, and
"disconnects" the old image -- i.e. it decrements the reference
count (which was one), and if the results are zero, it deletes the
old data. And thread 2 goes on to use the new, reserved data.

Thread 3 comes back to life. By a bit of bad luck, it already had the
pointer to the old data loaded in a register, ready to copy. So it
copies the old data that thread 2 has just deleted. The standard
qualifies this as undefined behavior: that's bad number 1.

When thread 3 finishes copying, it disconnects the "old" image:
depending on whether the compiler has kept the pointer in a
register, or rereads it from the object, it disconnects the image
that thread 2 has already deleted, possibly deleting it twice
(depending on whether the reference counter is signed or not -- this
is bad 2), or it disconnects the image thread 2 has just installed,
deleting it (since its reference count was necessarily one).

Thread 2 comes back to life, and uses the reference which the operator[]
returned. If thread 3 disconnected the image it just installed,
this points to deleted memory -- yet another bad.
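The steps of the scenario above can be lined up against code. What follows is a deliberately naive sketch, not Herb's code and not the g++ implementation: `Image` and `NaiveCowString` are hypothetical names, and `std::atomic` stands in for the platform's atomic increment and decrement. It behaves correctly when one thread at a time touches a given string object; the comments mark the window in which the interleaving described above goes wrong.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical shared image: a reference count plus the characters.
struct Image {
    std::atomic<int> refs;
    std::size_t      len;
    char*            chars;
    Image(const char* s, std::size_t n)
        : refs(1), len(n), chars(new char[n + 1]) {
        std::memcpy(chars, s, n);
        chars[n] = '\0';
    }
    ~Image() { delete[] chars; }
};

class NaiveCowString {
public:
    explicit NaiveCowString(const char* s)
        : image_(new Image(s, std::strlen(s))) {}
    NaiveCowString(const NaiveCowString& other) : image_(other.image_) {
        ++image_->refs;                        // atomic increment
    }
    NaiveCowString& operator=(const NaiveCowString&) = delete;
    ~NaiveCowString() { release(image_); }

    std::size_t size() const { return image_->len; }
    bool shared() const { return image_->refs.load() > 1; }

    // Non-const access must assume that the caller may write.
    char& operator[](std::size_t i) {
        assert(i < size());
        ensureUnique();
        return image_->chars[i];
    }
    char operator[](std::size_t i) const {
        assert(i < size());
        return image_->chars[i];
    }

private:
    static void release(Image* p) {
        if (--p->refs == 0) delete p;          // atomic decrement
    }
    void ensureUnique() {
        if (image_->refs.load() == 1) return;  // (a) test for sharing
        Image* old = image_;                   // (b) pointer held locally
        Image* fresh = new Image(old->chars, old->len);  // (c) copy
        // Race window: a second thread running (a)-(c) on the SAME string
        // object holds the same `old`. Whichever thread reaches (d) first
        // may delete the image the other is still copying from.
        release(old);                          // (d) disconnect old image
        image_ = fresh;                        // (e) install new image
    }
    Image* image_;
};
```

The count itself is never corrupted here; the trouble is that (d) and (e) are two separate updates on an object that two threads both believe they are only reading.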

Note that if you are doing a deep copy, none of this is any problem. As
long as you don't really modify the string, no part of its
implementation is modified, and the Posix rules apply. Note too that if
all modifications pass through a function in the string (rather than
having a function which returns a pointer or a reference allowing a
possible modification), there is no problem -- the string class knows
for sure when it will be modified (and we all agree that IF any thread
is modifying the object, all threads must protect all of their
accesses). One way of doing this is to have operator[] (and operator*
of the iterator) return a proxy. The standard doesn't allow this, but
I've used it quite successfully on some of my own, pre-standard string
classes -- it also has the advantage that operator[] or obtaining a
non-const iterator don't inhibit copy on write for all of the future.
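A minimal sketch of the proxy idea, under stated assumptions: `ProxyCowString` and `CharProxy` are hypothetical names, and `std::shared_ptr` with `use_count()` is used only to keep the example short. It illustrates the interface in a single-threaded setting and makes no claim about how the pre-standard classes mentioned above were written. Reading through the proxy leaves the image shared; only assignment through the proxy isolates it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

class ProxyCowString {
public:
    // Returned by the non-const operator[] in place of a raw char&.
    class CharProxy {
    public:
        CharProxy(ProxyCowString& owner, std::size_t index)
            : owner_(owner), index_(index) {}
        // Read access: no isolation needed.
        operator char() const { return (*owner_.data_)[index_]; }
        // Write access: the string now knows for sure it is being modified.
        CharProxy& operator=(char c) {
            owner_.unshare();
            (*owner_.data_)[index_] = c;
            return *this;
        }
    private:
        ProxyCowString& owner_;
        std::size_t     index_;
    };

    explicit ProxyCowString(const std::string& s)
        : data_(std::make_shared<std::string>(s)) {}
    // The implicit copy constructor copies the shared_ptr: a shallow copy.

    std::size_t size() const { return data_->size(); }
    bool shared() const { return data_.use_count() > 1; }

    CharProxy operator[](std::size_t i) {
        assert(i < size());
        return CharProxy(*this, i);
    }
    char operator[](std::size_t i) const {
        assert(i < size());
        return (*data_)[i];
    }

private:
    void unshare() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
    }
    std::shared_ptr<std::string> data_;
};
```

Because no raw reference ever escapes, the string never has to be marked as permanently unshareable after an indexing operation.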

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Kai-Uwe Bux

Sep 10, 2004, 10:01:57 AM

ka...@gabi-soft.fr wrote:

Thank you very much for this detailed account.

What about using more powerful atomic integer operations like atomic
compare_and_swap?

    atomic
    bool
    atomic_compare_and_swap ( size_t& addr,
                              size_t expected, size_t put_there ) {
      if ( addr == expected ) {
        addr = put_there;
        return( true );
      } else {
        return( false );
      }
    }
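For readers with a C++11 or later compiler, the pseudocode above corresponds closely to `std::atomic<T>::compare_exchange_strong`. The wrapper below is a sketch under that assumption; `release_if_shared` is a hypothetical helper, not something proposed in this thread, that shows the usual retry loop for "decrement, but never below 1".

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Same contract as the pseudocode: if `addr` holds `expected`, store
// `put_there` and return true; otherwise leave `addr` alone, return false.
inline bool atomic_compare_and_swap(std::atomic<std::size_t>& addr,
                                    std::size_t expected,
                                    std::size_t put_there) {
    // compare_exchange_strong overwrites its first argument on failure;
    // `expected` is a by-value copy, so the caller never sees that.
    return addr.compare_exchange_strong(expected, put_there);
}

// Hypothetical helper: drop one reference, but only while the count is
// still above 1. Returns false once the caller is the sole owner.
inline bool release_if_shared(std::atomic<std::size_t>& ref_count) {
    std::size_t current = ref_count.load();
    assert(current >= 1);
    while (current > 1) {
        if (ref_count.compare_exchange_weak(current, current - 1)) {
            return true;
        }
        // On failure `current` has been refreshed with the latest value.
    }
    return false;
}
```

Note that this only makes the update of the counter atomic. It says nothing about a second variable, such as the pointer to the buffer, that has to change together with it.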

Now, one could implement ensureUnique() as follows:

    void EnsureUnique ( void ) {
      std::size_t current_count = atomic_read( data_ptr->ref_count );
      // This object has a unique handle on the buffer.
      // Nothing to do:
      if ( current_count == 1 ) {
        return;
      }
      // Copy the data into a privately held new buffer object.
      // Be sure *not* to change the handle yet.
      /*
        This line really is a dummy for what needs to be done here
        to create a local copy with refcount 1.
      */
      Data* new_data_ptr = new Data ( data_ptr->buffer, this->size(), 1 );
      do {
        if ( atomic_compare_and_swap( data_ptr->ref_count,
                                      current_count, current_count-1 ) ) {
          // The reference count in *data_ptr has been adjusted.
          // Note that it did not drop to 0!
          // Release the handle:
          data_ptr = new_data_ptr;
          return;
        }
        // Something strange happened in the outside world
        // and changed the ref_count. We should update our
        // knowledge:
        current_count = atomic_read( data_ptr->ref_count );
      } while ( current_count != 1 );
      // Leaving the loop this way, we know that by some events
      // happening outside, we became the sole owner of *data_ptr.
      // Our copying was not necessary:
      delete new_data_ptr;
    }

Now your story would read as follows:

Thread 1 makes a local copy of the string. The reference count is 2.

Thread 2 decides to read a character in the string, by calling
operator[]. This triggers EnsureUnique(). New data is allocated and the
buffer is copied. The value for current_count(thread 2) is 2.

At this point, however, thread 2 is interrupted by thread 3.

Thread 3 does exactly the same thing as thread 2. Since the string is
still shared, it also allocates a new image, and starts copying the
data.

current_count(thread 3) == 2

At this point, it gets interrupted.

Thread 1 comes back to life. Its local copy of the string goes out of
scope, decrementing the reference count. The reference count is now
one.

Thread 2 comes back to life. It finishes copying the data, and
enters the loop. Calling

    atomic_compare_and_swap( data_ptr->ref_count,
                             current_count, current_count-1 )

returns *false* since the ref_count has changed to 1. This value
will be read, the loop is left at the bottom exit and thread 2
continues using the old buffer.

Thread 3 comes back to life and faces the same fate as thread 2.

Would that work?

Best

Kai-Uwe Bux

Bill Wade

Sep 10, 2004, 10:09:52 AM

ka...@gabi-soft.fr wrote in message news:<d6652001.04090...@posting.google.com>...

> ... The problems occur when you have a global non-const string
> that nobody modifies. ...

I'd say that the problem is the false belief that it is possible to
call a non-const member (such as nc-begin or nc-[]) and pretend that
it is just a read access. I agree that COW std::string cannot be
thread safe. It cannot be thread safe because so many developers
believe the fallacy.

James Kanze

Sep 12, 2004, 6:06:40 AM

wa...@stoner.com (Bill Wade) writes:

|> ka...@gabi-soft.fr wrote in message
|> news:<d6652001.04090...@posting.google.com>...

|> > ... The problems occur when you have a global non-const string
|> > that nobody modifies. ...

|> I'd say that the problem is the false belief that it is possible to
|> call a non-const member (such as nc-begin or nc-[]) and pretend that
|> it is just a read access. I agree that COW std::string cannot be
|> thread safe. It cannot be thread safe because so many developers
|> believe the fallacy.

The fallacy is more fundamental than that. The problem is that you have
non-const functions that don't modify the string. If I don't modify the
string, I've maintained my end of the contract; the const-ness of the
actual function the compiler chooses to call is irrelevant.

The real problem is in the interface defined by the standard. An
interface in which it is fully natural to call non-const functions but
not to modify the object.

--
James Kanze

Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung

9 place Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34

Herb Sutter

Sep 13, 2004, 3:58:47 PM

Let me jump in to note that most of James' bug report is actually a FAQ,
well known to me and not a bug. The rest is also not a bug in my code, but
a legitimate request for a std::string redesign where the typical solution
to what he wants to do is to use an immutable string type.

On 8 Sep 2004 18:00:05 -0400, ka...@gabi-soft.fr wrote:
>The problem is that Herb's code isn't thread safe. At least, not using
>the definition of thread safety he uses, and supposing that his class
>supports an interface compatible with std::string.

That's untrue. Two points:

First, the code is thread-safe. FWIW I've received this particular "bug
report" so often, including even from multiple fellow CUJ columnists, that
this summer I finally decided to write an article about this non-bug so
that I wouldn't have to answer it again. :-) It appeared as my September
2004 CUJ column under the title "'Just Enough' Thread Safety."

Second, what you actually want (as you noted) is not a more thread-safe
implementation, but rather a more thread-friendly design. Specifically,
you're really asking for a redesigned std::string that offers operations
that can be knowably non-mutating (unlike op[], which falls into the
category of possibly-mutating operations), so that you can document
guarantees like "you don't need to do external locking if there are only
reader threads and no (possibly-)writer threads" for more useful common
cases.

Incidentally (and I know James probably knows this, but for completeness),
note that if you have _all_ non-mutating operations (i.e., an immutable
string type) then the whole issue of locking disappears. That's why other
languages/frameworks have gone that way. Of course, there are
corresponding disadvantages (e.g., every string operation makes a new
string, which has interesting performance consequences as well as
interesting implementation consequences such as that you implement += in
terms of + instead of the std::string-"normal" other way around).
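A minimal sketch of the immutable-string approach, with `+=` written in terms of `+` as described above. `ImmutableString` is a hypothetical name and `std::shared_ptr` is an assumption made for brevity. The character data is never modified after construction, so any number of threads may read the same value; rebinding a handle with `+=` still modifies that one handle object and remains subject to the usual rules for that object.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

class ImmutableString {
public:
    ImmutableString() : text_(std::make_shared<std::string>()) {}
    explicit ImmutableString(std::string s)
        : text_(std::make_shared<std::string>(std::move(s))) {}

    // Every accessor is const and none hands out a mutable reference.
    std::size_t size() const { return text_->size(); }
    char operator[](std::size_t i) const {
        assert(i < size());
        return (*text_)[i];
    }
    const std::string& str() const { return *text_; }

    // Concatenation always builds a new value.
    friend ImmutableString operator+(const ImmutableString& a,
                                     const ImmutableString& b) {
        return ImmutableString(*a.text_ + *b.text_);
    }

    // += is implemented in terms of +: it rebinds this handle to the new
    // value and leaves every other handle to the old value untouched.
    ImmutableString& operator+=(const ImmutableString& rhs) {
        *this = *this + rhs;
        return *this;
    }

private:
    std::shared_ptr<const std::string> text_;   // shared, never written
};
```

The cost mentioned above is visible in `operator+=`: each append allocates and copies a complete new string.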

Finally, it's worth noting that std::string has even more constraints for
COW than you or I have mentioned here, and that are entirely unrelated to
thread safety. In particular, it's not clear to me how any COW
implementation of std::string can be standards-conforming because of this
situation (which I've been pointing out for years in hallway conversations
and talks, but it now occurs to me I don't think I've written about):

string s1( "Hello" );
string s2( s1 );

According to the standard, I don't believe a conforming implementation may
do a shallow copy here. The reason is that there's no way to know whether
the very next use of the string will mention op[]:

s1[0]; // or, s2[0]

If string did use COW and did a shallow copy when constructing s2, then
str[0] must do a deep copy, which means it will allocate memory, which
means it could throw a bad_alloc exception, and that is nonconforming
because when the index is within bounds no exception may be thrown. The
standard doesn't allow an exception to be thrown here, but no COW string
implementation I know of could realistically avoid this possibility.

Here COW is nonconforming, but, worse, this violation of the explicit
contract of the type is highly surprising to users when it does happen.

Herb

---
Herb Sutter (www.gotw.ca)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

ka...@gabi-soft.fr

Sep 13, 2004, 4:30:17 PM

Kai-Uwe Bux <jkher...@gmx.net> wrote in message
news:<chppr0$m4o$1...@murdoch.acc.Virginia.EDU>...

> ka...@gabi-soft.fr wrote:
> > usene...@lehrerfamily.com (Joshua Lehrer) wrote in message
> > news:<31c49f0d.04090...@posting.google.com>...

[...]

> What about using more powerful atomic integer operations like atomic
> compare_and_swap.

I don't think that a thread safe version could be implemented using this
either. The basic problem is that when you isolate, you must update two
variables, the use count AND the pointer to the image. Compare and swap
won't prevent another thread from interrupting between the two, finding
that the image is isolated, but reading the old, pre-isolated pointer.

While I'm at it, I should mention that it's important to take into
account one or two things that Herb and I have not yet mentioned, for
reasons of simplification. Basically, if the function called is
returning a reference or an iterator, not only does it need a unique
copy of the image, it needs to ensure that this copy remains unique
after leaving the function. G++, for example, uses a special flag value
(-1, I think) in the reference count to signal this. This flag can only
be reset by a function which is documented to possibly invalidate
iterators or references. If we ignore for the moment c_str() and
data(), these are all functions which really do modify the string --
which in our problem case, won't be called. This means that once this
flag is set, no more problems can occur. (This also means that in my
scenario, only the first call to operator[] can cause any problems.)
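A single-threaded sketch of the flag mechanism just described. All names here (`Rep`, `kUnshareable`, `FlaggedCowString`, `assign`) are hypothetical, the count is a plain `int`, and the details of the real g++ implementation are not reproduced. The point is only the state machine: once a reference has escaped, copies must be deep until an invalidating operation resets the flag.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical representation. kUnshareable plays the role of the
// special flag value stored in the reference count.
struct Rep {
    enum { kUnshareable = -1 };
    int         refs;   // >= 1: number of owners; kUnshareable: ref escaped
    std::string chars;
    explicit Rep(const std::string& s) : refs(1), chars(s) {}
};

class FlaggedCowString {
public:
    explicit FlaggedCowString(const std::string& s) : rep_(new Rep(s)) {}
    FlaggedCowString(const FlaggedCowString& other) {
        if (other.rep_->refs == Rep::kUnshareable) {
            rep_ = new Rep(other.rep_->chars);   // forced deep copy
        } else {
            rep_ = other.rep_;                   // shallow copy
            ++rep_->refs;
        }
    }
    FlaggedCowString& operator=(const FlaggedCowString&) = delete;
    ~FlaggedCowString() {
        if (rep_->refs == Rep::kUnshareable || --rep_->refs == 0) delete rep_;
    }

    std::size_t size() const { return rep_->chars.size(); }
    bool sharesWith(const FlaggedCowString& o) const { return rep_ == o.rep_; }

    char& operator[](std::size_t i) {
        assert(i < size());
        if (rep_->refs > 1) {                    // shared: isolate first
            Rep* fresh = new Rep(rep_->chars);
            --rep_->refs;
            rep_ = fresh;
        }
        rep_->refs = Rep::kUnshareable;          // the reference may live on
        return rep_->chars[i];
    }

    // An operation documented to invalidate references may clear the flag.
    void assign(const std::string& s) {
        if (rep_->refs > 1) {
            Rep* fresh = new Rep(s);
            --rep_->refs;
            rep_ = fresh;
        } else {
            rep_->chars = s;
            rep_->refs = 1;                      // shareable again
        }
    }

private:
    Rep* rep_;
};
```

After the first `operator[]` on a string the count test never finds it shared again, which matches the observation above that only the first call can race.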

This introduces some added complexity to any proposed solution.
However, even without this problem, your code fails.

Here's where you might have a problem. See below. (Note that it is the
new pointer which is isolated, not the old one.)

>         data_ptr = new_data_ptr;
>         return;
>       }
>       // Something strange happened in the outside world
>       // and changed the ref_count. We should update our
>       // knowledge:
>       current_count = atomic_read( data_ptr->ref_count );
>     } while ( current_count != 1 );
>     // Leaving the loop this way, we know that by some events
>     // happening outside, we became the sole owner of *data_ptr.
>     // Our copying was not necessary:
>     delete new_data_ptr;
>   }

> Now your story would read as follows:

> Thread 1 makes a local copy of the string. The reference count is 2.

Nope.

Of course, in real life, we'd need the isolated state. But regardless.
I've still not found any way of eliminating the race condition between
the atomic_compare_and_swap and assigning to the data pointer. In the
end, with four threads, it always fails: 2 threads make copies, pushing
the count up to three. Third thread intervenes, gets through the
atomic_compare_and_swap, which decrements the count to two, but is
interrupted before assigning the data_ptr. Fourth thread comes in, does
the same. Then both threads set the data_ptr. Of course, the last
one overwrites the first; so the first has leaked memory.

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Bill Wade

Sep 13, 2004, 4:38:12 PM

James Kanze <ka...@gabi-soft.fr> wrote in message news:<m2acvxv...@lns-vlq-28-82-254-78-252.adsl.proxad.net>...

> The real problem is in the interface defined by the standard. An
> interface in which it is fully natural to call non-const functions but
> not to modify the object.

No argument there. The std::string interface is dangerous even for
single-threaded COW. About all an implementation can safely do is
copy-on-ptr-or-write (COPOW), where "ptr" is any operation that gives
out an iterator, pointer, or reference (including const_iterator,...).

COPOW works with any std::container<POD>. It has all of the
performance penalty of COW, and not quite all of the performance
benefits of COW. In particular, for std::string (and certainly for
other std::containers) allocation must occur even in cases where the
deep-copy is avoided. I can't see using it by default for any
std::container (strings are too short, and the other containers rely
too much on the iterator interface). I could see an implementation
making it optionally available (perhaps via an allocator_trait), but I
suspect that most developers with an application that needs COPOW (or
COW) will write their own containers. If I am sure that I need COW
(or I am sure that COW must not be used), then I don't use std::string
for "real" development.

David Abrahams

Sep 14, 2004, 8:22:11 AM

Herb Sutter <hsu...@gotw.ca> writes:

> Finally, it's worth noting that std::string has even more constraints for
> COW than you or I have mentioned here, and that are entirely unrelated to
> thread safety. In particular, it's not clear to me how any COW
> implementation of std::string can be standards-conforming because of this
> situation (which I've been pointing out for years in hallway conversations
> and talks, but it now occurs to me I don't think I've written about):
>
> string s1( "Hello" );
> string s2( s1 );
>
> According to the standard, I don't believe a conforming implementation may
> do a shallow copy here. The reason is that there's no way to know whether
> the very next use of the string will mention op[]:
>
> s1[0]; // or, s2[0]
>
> If string did use COW and did a shallow copy when constructing s2, then
> str[0] must do a deep copy, which means it will allocate memory, which
> means it could throw a bad_alloc exception, and that is nonconforming
> because when the index is within bounds no exception may be thrown.

Since when? Standard references please!

I think there may be reasons that a conforming COW string is
impossible, but this ain't one of them, AFAICT.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

ne...@pgxml.net

Sep 14, 2004, 8:23:24 AM

ka...@gabi-soft.fr writes:

[...COW strings]

>
>> Thread 1 makes a local copy of the string. The reference count is
>> 2.
>
> Nope.
>
> Of course, in real life, we'd need the isolated state. But
> regardless. I've still not found any way of eliminating the race
> condition between the atomic_compare_and_swap and assigning to the
> data pointer. In the end, with four threads, it always fails: 2
> threads make copies, pushing the count up to three. Third thread
> intervenes, gets through the atomic_compare_and_swap, which decrements
> the count to two, but is interrupted before assigning the data_ptr.
> Fourth thread comes in, does the same. Then both threads set the
> data_ptr. Of course, the last one overwrites the first; so the first
> has leaked memory.

*sigh*

<my_oafishness>
man stackframe
in the name of KISS, do we need COW-strings?
</my_oafishness>

greets, scnr

chr- "one cross each, please" -istoph

ka...@gabi-soft.fr

Sep 14, 2004, 6:46:07 PM

Herb Sutter <hsu...@gotw.ca> wrote in message
news:<jb3ak010au9medvi0...@4ax.com>...

> Let me jump in to note that most of James' bug report is actually a
> FAQ, well known to me and not a bug. The rest is also not a bug in my
> code, but a legitimate request for a std::string redesign where the
> typical solution to what he wants to do is to use an immutable string
> type.

> On 8 Sep 2004 18:00:05 -0400, ka...@gabi-soft.fr wrote:
> >The problem is that Herb's code isn't thread safe. At least, not
> >using the definition of thread safety he uses, and supposing that his
> >class supports an interface compatible with std::string.

> That's untrue.

Which part is untrue? Obviously, when talking about thread safety, we
have to define what we mean by the term. Your article clearly said that
it should behave as a non-COW string does with regards to thread-safety.
I interpret that to mean that anything that would reliably work for non
COW strings would work with your proposed implementation. You
specifically spoke about an operator[] which returned a reference --
that raises most, if not all of the problems the standard string class
has. In particular, if operator[] returns a reference, you must call
ensureUnique (or something equivalent) in it.

Given that, I presented an example of code which would work with a non
COW string, but which would fail with your implementation. So where is
my error: in my interpretation of the guarantees which you intended, in
my suppositions concerning the interface which you wanted to implement,
or in my example of where it failed? (If it is the latter, you'll have
to show me the exact flaw in my reasoning, because I've pored over that
one a lot.)

> Two points:

> First, the code is thread-safe.

For what definition of thread-safe? It does not offer the same
guarantees as a non-COW implementation would, which is the definition
you seem to offer. It doesn't work even in some cases where the user
does not actually modify the string, but does use the non-const
operator[]. You even suggested that the const operator[] should return
a reference, and that the user should be able to cast away const on it
and modify the string -- if that is the case, it doesn't even work in
some cases where the user doesn't call any non-const functions.

I spent considerable time in analysing your code, because I know that
you are aware of the issues, and you are capable of finding solutions
where others can't. I took the time to explain my analysis in detail,
including my pre-suppositions concerning the definition (contract) of
thread-safety. (I personally felt that that was a weak point of the
article, that a simple phrase like "the same guarantees as a non COW
implementation" is rather vague, and that those guarantees should have
been spelled out.)

Just saying that I was wrong is not enough. I may be wrong, but if so,
you are going to have to be more explicit, and point out exactly where I
went astray.

> FWIW I've received this particular "bug report" so often, including
> even from multiple fellow CUJ columnists, that this summer I finally
> decided to write an article about this non-bug so that I wouldn't have
> to answer it again. :-) It appeared as my September 2004 CUJ column
> under the title "'Just Enough' Thread Safety."

That's exactly the article we are criticizing. See above: you gave a
definition of the desired thread safety (the same as a non-COW class
would offer), you speak at least a little about an operator[] which
returns a reference, and you present code which, if used with an
operator[] which returns a reference, requires an external, user
provided lock in a context where a non-COW string doesn't require one.

> Second, what you actually want (as you noted) is not a more
> thread-safe implementation, but rather a more thread-friendly
> design. Specifically, you're really asking for a redesigned
> std::string that offers operations that can be knowably non-mutating
> (unlike op[], which falls into the category of possibly-mutating
> operations), so that you can document guarantees like "you don't need
> to do external locking if there are only reader threads and no
> (possibly-)writer threads" for more useful common cases.

That too. What I think we really need is that the standard address the
threading issue. My own preference is for the guarantees given by the
SGI implementation and Posix -- maybe because they are the only
guarantees I've actually seen clearly specified. And I'd like to see an
interface to std::string which would allow these guarantees with a COW
implementation, at least in the presence of an atomic increment or some
such.

> Incidentally (and I know James probably knows this, but for
> completeness), note that if you have _all_ non-mutating operations
> (i.e., an immutable string type) then the whole issue of locking
> disappears. That's why other languages/frameworks have gone that
> way. Of course, there are corresponding disadvantages (e.g., every
> string operation makes a new string, which has interesting performance
> consequences as well as interesting implementation consequences such
> as that you implement += in terms of + instead of the
> std::string-"normal" other way around).

My personal feeling is that in a correctly *designed* string interface,
the only non const functions would be assignment operators. But that's
from a design point of view -- as you say, it may have certain
implications with regards to performance. (My own personal feeling is
that the performance problems can be addressed and solved without
compromising the design. But in my own work, performance of strings has
always been more or less irrelevant, so I've not actually studied the
question in detail. So my feeling isn't really much more than simple
intuition.)

The alternative is to use proxy classes for the modification operators.
In theory, at least, a good compiler can optimize the proxy classes out,
so the performance overhead should be minimal. But that's just in
theory. I don't know how well real compilers do in practice.

> Finally, it's worth noting that std::string has even more constraints
> for COW than you or I have mentioned here, and that are entirely
> unrelated to thread safety. In particular, it's not clear to me how
> any COW implementation of std::string can be standards-conforming
> because of this situation (which I've been pointing out for years in
> hallway conversations and talks, but it now occurs to me I don't think
> I've written about):

> string s1( "Hello" );
> string s2( s1 );

> According to the standard, I don't believe a conforming implementation
> may do a shallow copy here. The reason is that there's no way to know
> whether the very next use of the string will mention op[]:

> s1[0]; // or, s2[0]

> If string did use COW and did a shallow copy when constructing s2,
> then str[0] must do a deep copy, which means it will allocate memory,
> which means it could throw a bad_alloc exception, and that is
> nonconforming because when the index is within bounds no exception may
> be thrown. The standard doesn't allow an exception to be thrown here,
> but no COW string implementation I know of could realistically avoid
> this possibility.

§17.4.4.8/3: "No destructor operation defined in the C++ Standard
Library will throw an exception. Any other functions defined in the C++
Standard Library that do not have an exception-specification may throw
implementation-defined exceptions unless otherwise specified."

There is no exception specification on operator[].

With regards to whether this is a good thing, or what a quality
implementation might do... But the standard does formally allow it.

> Here COW is nonconforming, but, worse, this violation of the explicit
> contract of the type is highly surprising to users when it does
> happen.

Here COW is conforming. But that's all you can say for it -- for the
rest, I totally agree. If we're talking about quality implementations,
I don't think that COW is viable for any class as fundamental as
string -- IMHO, independently of threading, I have a right to expect
that at the very least, a normal read operation cannot throw. I'm of
two minds with regards to writing, but there are certainly strong
arguments that a normal write operation to an existing object should not
throw if the value written is legal.

Ideally, what we would like is that std::string be as robust and as
simple to use as int. That's not attainable, but we should definitely
consider just how close we can come.

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Bill Wade

Sep 14, 2004, 7:06:04 PM

Herb Sutter <hsu...@gotw.ca> wrote in message news:<jb3ak010au9medvi0...@4ax.com>...

> Finally, it's worth noting that std::string has even more constraints for
> COW than you or I have mentioned here, and that are entirely unrelated to
> thread safety. In particular, it's not clear to me how any COW
> implementation of std::string can be standards-conforming because of this
> situation (which I've been pointing out for years in hallway conversations
> and talks, but it now occurs to me I don't think I've written about):
>
> string s1( "Hello" );
> string s2( s1 );
>
> According to the standard, I don't believe a conforming implementation may
> do a shallow copy here. The reason is that there's no way to know whether
> the very next use of the string will mention op[]:
>
> s1[0]; // or, s2[0]
>
> If string did use COW and did a shallow copy when constructing s2, then
> str[0] must do a deep copy, which means it will allocate memory, which
> means it could throw a bad_alloc exception, and that is nonconforming
> because when the index is within bounds no exception may be thrown. The
> standard doesn't allow an exception to be thrown here,

I don't believe the words in the standard caught the intent of the
writers. It is pretty clear that the writers intended COW to be
legal. I expect that the writers understood that typical COW
implementations could throw here. I agree that the appropriate
license was not included in the standard.

> but no COW string
> implementation I know of could realistically avoid this possibility.

No existing COW std::string implementation I know of avoids this
possibility. Realistically, it is possible to avoid.

In cases where COW is a big win the savings are in the copy (O(N)), not
in the allocation (expect fast O(1) for single-threaded). An
implementation may defer the copy without deferring the allocation.

Spare (allocated, but not yet used) heap objects (string
representations) may be maintained by either the built-in allocator
(on a per-size basis), by string objects directly, or by string
representations.

It may be most efficient (excellent memory-access behavior) to
maintain the spares in the built-in allocator, but that doesn't work
for custom allocators.

The spares can be maintained as a single-linked list in the
representation object. This is space-efficient, since the pointer can
replace the reference count.

Even more space-efficient (but less copy-efficient), is to put all
pointers in the string objects (not in the shared-heap objects). With
small-string optimizations, a modern string object typically has room
for all the pointers needed:

    charT* __begin, __end;
    mutable charT* __begin_owned, __end_owned;
    mutable stringT* __prev, __next; // strings sharing __begin

The shared heap object consists of nothing but the actual characters,
no reference count or size information. Optionally you might add
another pointer so that the shared buffer and owned buffer could have
different allocated sizes. With this implementation, the mutable
basic_string members prevent shareable strings from living in
read-only memory.
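
A minimal single-threaded sketch of "defer the copy without deferring the allocation", assuming a hypothetical class EagerCowString (the name and layout are illustrative only and are not taken from any shipping std::string). The copy constructor shares the characters but allocates a spare representation right away, keeping it on a singly linked list in the shared representation. The later unshare inside at() then only copies bytes, so it cannot throw bad_alloc.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <stdexcept>

// Hypothetical, single-threaded sketch: the copy is deferred, the allocation is not.
class EagerCowString {
    struct Rep {
        explicit Rep(std::size_t n) : size(n), data(new char[n + 1]) { data[n] = '\0'; }
        std::size_t refs = 1;          // strings currently sharing this representation
        std::size_t size;
        Rep* spares = nullptr;         // preallocated, unused representations
        std::unique_ptr<char[]> data;
    };

public:
    explicit EagerCowString(const char* text) : rep_(new Rep(std::strlen(text))) {
        std::memcpy(rep_->data.get(), text, rep_->size);
    }

    // Shares the characters, but allocates the buffer a later unshare will need.
    // Any bad_alloc is thrown here, where throwing is not in dispute.
    EagerCowString(const EagerCowString& other) : rep_(other.rep_) {
        Rep* spare = new Rep(rep_->size);
        spare->spares = rep_->spares;
        rep_->spares = spare;
        ++rep_->refs;
    }

    EagerCowString& operator=(const EagerCowString&) = delete;  // omitted for brevity

    ~EagerCowString() {
        if (--rep_->refs == 0) { delete rep_; return; }
        Rep* unused = rep_->spares;    // invariant: number of spares == refs - 1
        rep_->spares = unused->spares;
        delete unused;
    }

    char& at(std::size_t pos) {
        if (pos >= rep_->size) throw std::out_of_range("EagerCowString::at");
        unshare();                     // no allocation happens on this path
        return rep_->data[pos];
    }

    const char* c_str() const { return rep_->data.get(); }
    bool shares_with(const EagerCowString& other) const { return rep_ == other.rep_; }

private:
    void unshare() noexcept {
        if (rep_->refs == 1) return;
        Rep* mine = rep_->spares;      // take one preallocated representation
        rep_->spares = mine->spares;
        mine->spares = nullptr;
        std::memcpy(mine->data.get(), rep_->data.get(), rep_->size + 1);
        --rep_->refs;
        rep_ = mine;
    }

    Rep* rep_;
};
```

The cost is that every shallow copy still pays for an allocation, which is the trade-off described above: the saving is in the O(N) copy, not in the allocation.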

Joshua Lehrer

Sep 14, 2004, 8:49:24 PM

ka...@gabi-soft.fr wrote in message news:<d6652001.04091...@posting.google.com>...

> While I'm at it, I should mention that it's important to take into
> account one or two things that Herb and I have not yet mentioned, for
> reasons of simplification. Basically, if the function called is
> returning a reference or an iterator, not only does it need a unique
> copy of the image, it needs to ensure that this copy remains unique

We do that as well. We use a word to keep track of the reference
count, with a special bit meaning that the String is no longer
referenceable.
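
One way such a word could be laid out, as a rough sketch only: the class name RefWord, the choice of the top bit as the flag, and the member names are all assumptions for illustration, not the FactSet code. The count lives in the low bits, and a sharer is added with a compare-and-swap loop so that a string already marked unshareable is never shared again.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical reference-count word: low 31 bits count sharers, the top bit
// means "a reference or iterator has escaped, do not share this buffer".
class RefWord {
public:
    static constexpr std::uint32_t kUnshareable = 0x80000000u;

    std::uint32_t count() const noexcept {
        return word_.load(std::memory_order_acquire) & ~kUnshareable;
    }
    bool shareable() const noexcept {
        return (word_.load(std::memory_order_acquire) & kUnshareable) == 0;
    }

    // Returns true if a sharer was added; false means the caller must deep copy.
    bool try_share() noexcept {
        std::uint32_t old = word_.load(std::memory_order_relaxed);
        do {
            if (old & kUnshareable) return false;
        } while (!word_.compare_exchange_weak(old, old + 1,
                                              std::memory_order_acq_rel,
                                              std::memory_order_relaxed));
        return true;
    }

    // The caller is assumed to hold the only reference when calling this.
    void mark_unshareable() noexcept {
        word_.fetch_or(kUnshareable, std::memory_order_acq_rel);
    }

    // Returns true if the last sharer just went away and the buffer can be freed.
    bool release() noexcept {
        std::uint32_t now = word_.fetch_sub(1, std::memory_order_acq_rel) - 1;
        return (now & ~kUnshareable) == 0;
    }

private:
    std::atomic<std::uint32_t> word_{1};
};
```

A copy constructor built on this would call try_share() and fall back to a deep copy when it returns false.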

We solved the operator[] problem by returning proxies. Sure, it isn't
std compliant, but being std compliant was not part of our spec. In
fact, our spec required us to be compliant with an old dec::compaq::hp
string class.

Our current issue, which cannot be fixed, is with "begin()" and "end()"
which suffer from the same problems as operator[].

joshua lehrer
factset research systems
NYSE:FDS

Ben Hutchings

Sep 14, 2004, 8:55:24 PM

Herb Sutter wrote:
<snip>

> Finally, it's worth noting that std::string has even more constraints for
> COW than you or I have mentioned here, and that are entirely unrelated to
> thread safety. In particular, it's not clear to me how any COW
> implementation of std::string can be standards-conforming because of this
> situation (which I've been pointing out for years in hallway conversations
> and talks, but it now occurs to me I don't think I've written about):
>
> string s1( "Hello" );
> string s2( s1 );
>
> According to the standard, I don't believe a conforming implementation may
> do a shallow copy here. The reason is that there's no way to know whether
> the very next use of the string will mention op[]:
>
> s1[0]; // or, s2[0]
>
> If string did use COW and did a shallow copy when constructing s2, then
> str[0] must do a deep copy, which means it will allocate memory, which
> means it could throw a bad_alloc exception, and that is nonconforming
> because when the index is within bounds no exception may be thrown. The
> standard doesn't allow an exception to be thrown here, but no COW string
> implementation I know of could realistically avoid this possibility.

<snip>

How do you figure that? The standard doesn't specify whether any
particular members of basic_string call the allocator. None of the
"Throws:" paragraphs include bad_alloc among the possible exceptions.

--
Ben Hutchings
If at first you don't succeed, you're doing about average.

Herb Sutter

Sep 14, 2004, 8:56:21 PM

On 14 Sep 2004 08:22:11 -0400, David Abrahams <da...@boost-consulting.com>
wrote:

>Herb Sutter <hsu...@gotw.ca> writes:
> > Finally, it's worth noting that std::string has even more constraints for
> > COW than you or I have mentioned here, and that are entirely unrelated to
> > thread safety. In particular, it's not clear to me how any COW
> > implementation of std::string can be standards-conforming because of this
> > situation (which I've been pointing out for years in hallway conversations
> > and talks, but it now occurs to me I don't think I've written about):
> >
> > string s1( "Hello" );
> > string s2( s1 );
> >
> > According to the standard, I don't believe a conforming implementation may
> > do a shallow copy here. The reason is that there's no way to know whether
> > the very next use of the string will mention op[]:
> >
> > s1[0]; // or, s2[0]
> >
> > If string did use COW and did a shallow copy when constructing s2, then
> > str[0] must do a deep copy, which means it will allocate memory, which
> > means it could throw a bad_alloc exception, and that is nonconforming
> > because when the index is within bounds no exception may be thrown.
>
>Since when? Standard references please!
>
>I think there may be reasons that a conforming COW string is
>impossible, but this ain't one of them, AFAICT.

Ah, I forgot string []/at mirrored vector's. (We must fix that, but that's
a separate issue.) So just change [] to at:

s1.at(0); // or, s2.at(0)

Now it's clear that the above is not allowed to throw, but could do so on
a COW implementation.

For those keeping score at home, this is because of wording in the
standard that Dave himself was instrumental in crafting:

17.4.4.8(1)
Any of the functions defined in the C++ Standard Library can
report a failure by throwing an exception of the type(s) described
in their Throws: paragraph and/or their exception-specification
(15.4). An implementation may strengthen the exception-
specification for a non-virtual function by removing listed
exceptions. 175)

Footnote 175:
That is, an implementation of the function will have an explicit
exception-specification that lists fewer exceptions than those
specified in this International Standard. It may not, however,
change the types of exceptions listed in the exception-
specification from those specified, nor add others.

In short, in the standard library, any operation with a "Throws:" paragraph
may throw only listed exceptions. For basic_string<>::at, we have:

21.3.4(3):
Throws: out_of_range if pos >= size().

That is, at may not throw an exception of any type besides out_of_range,
and it may throw no exceptions if pos < size(). That being the case here,
COW cannot be a conforming implementation of std::string AFAICS because of
the given sequence:

string s1( "Anything" );
string s2( s1 ); // this cannot be a shallow copy
s2.at(0); // because this might be the immediately following
// operation, and it cannot throw anything and must
// return a true reference into the string's buffer
s2.at(0) = 'x'; // and in particular this must not mutate s1

I know of no way of making all the above standard-required behavior
guaranteed to be always true in the presence of COW.
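
The two observable requirements in that sequence (s1 is untouched by a write through s2.at(0), and at() reports a bad index with out_of_range) can be checked against any implementation with a small sketch like the following. The function names are made up for illustration, and the check says nothing about whether at() may also throw bad_alloc, which is the point in dispute.

```cpp
#include <stdexcept>
#include <string>

// Returns true if writing through s2.at(0) left s1 unchanged.
inline bool copy_is_independent() {
    std::string s1("Anything");
    std::string s2(s1);
    s2.at(0) = 'x';
    return s1 == "Anything" && s2 == "xnything";
}

// Returns true if at() rejects pos == size() with std::out_of_range.
inline bool at_rejects_bad_index() {
    std::string s("abc");
    try {
        (void)s.at(s.size());
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```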

Herb

---
Herb Sutter (www.gotw.ca)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

Alexander Terekhov

Sep 15, 2004, 5:56:40 AM

Herb Sutter wrote:
[...]

> For those keeping score at home, this is because of wording in the
> standard that Dave himself was instrumental in crafting: ...

You get -1. "and/or". at() has an open-ended ES.

regards,
alexander.

Herb Sutter

Sep 15, 2004, 5:58:28 AM

First, thanks again for a typically insightful and deeply thought-through
analysis.

On 14 Sep 2004 18:46:07 -0400, ka...@gabi-soft.fr wrote:
>Which part is untrue? Obviously, when talking about thread safety, we
>have to define what we mean by the term. Your article clearly said that
>it should behave as a non-COW string does with regards to thread-safety.

Close enough. I said that:

So String has to do just enough extra serialization work to enable
calling code to work correctly as long as the calling code does its
usual thing for thread safe operation, namely to serialize access to
individual visible String objects that the calling code knows are or
can be actually shared across threads.

Note that I am not saying anything about if only readers are involved.

>Given that, I presented an example of code which would work with a non
>COW string, but which would fail with your implementation. So where is
>my error: in my interpretation of the guarantees which you intended, in
>my suppositions concerning the interface which you wanted to implement,
>or in my example of where it failed. (If it is the latter, you'll have
>to show me the exact flaw in my reasoning, because I've pored over that
>one a lot.)

Sure, and I do understand you think deeply about things before you post.

I do disagree with the statement that it works with a non-COW string but
fails with my implementation. Your example boils down to:

// global
String s = "Hello";

// thread 1
String s2 = s;

// thread 2
s[0];

// thread 3
s[0];

Here's a stretched but valid example that breaks the above code for even a
non-COW String: Imagine that a non-COW String (conformingly) decides to
cache a count of the number of member function calls performed on it, that
the count is a member of a type that does not guarantee atomic writes (and
there's no internal synchronization), and op[] updates the count. That's a
perfectly valid instrumented implementation that's perfectly safe for
single-threaded use. But in the multi-threaded use above, there is a race
condition between threads 2 and 3 and you can get corrupted counts; you
can also get reads of partially written counts (e.g., if thread 1 cares to
read the instrumented count), etc.
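
For illustration, here is a sketch of such an instrumented non-COW string, under the assumption of a hypothetical class CountingString. The version shown makes the counter a std::atomic so that concurrent calls are well defined. With a plain non-atomic member incremented in the same place, the two reader threads would race exactly as described.

```cpp
#include <atomic>
#include <cstddef>
#include <string>
#include <thread>
#include <utility>

// Hypothetical instrumented string: no sharing of buffers at all, but op[]
// still writes to internal state, so "read-only" calls are physically mutating.
class CountingString {
public:
    explicit CountingString(std::string text) : text_(std::move(text)) {}

    char operator[](std::size_t i) const {
        // A plain `++calls_` on a non-atomic mutable member would be a data
        // race when two threads call op[] on the same object.
        calls_.fetch_add(1, std::memory_order_relaxed);
        return text_[i];
    }

    std::size_t calls() const { return calls_.load(std::memory_order_relaxed); }

private:
    std::string text_;
    mutable std::atomic<std::size_t> calls_{0};
};

// Two threads index the same object without external locking.
inline std::size_t hammer(const CountingString& s, int per_thread) {
    auto reader = [&s, per_thread] {
        for (int i = 0; i < per_thread; ++i) (void)s[0];
    };
    std::thread t2(reader);
    std::thread t3(reader);
    t2.join();
    t3.join();
    return s.calls();
}
```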

More fundamentally, you want and assume that it's safe to call nonmutating
or even possibly-mutating member functions on the same object on different
threads without external locking, as long as the operations are not
_actually_ mutating the physical (not logical) state of the object. That's
an understandable but problematic assumption.

First, I do not assume that it's safe to call even const member functions
on the same object of any type on different threads without external
locking unless that type explicitly documents that's safe. I understand
that you want that to be true (and I agree that's a good idea and it is in
fact true for some types on some platforms), but std::string doesn't say
whether or not it should/must be true (although admittedly a more
fundamental reason for that is because the standard doesn't talk about
threads), and at best we're left with implementation-defined behavior (the
std::string implementer has to say whether it's true for their
implementation).

Second, even if that is supported, then the type really has to document
very clearly exactly what constitutes mutation. And the problem isn't just
that op[] used above is a non-const function, because this issue is deeper
than const vs. non-const functions can express, because of logical vs.
physical constness. And it is physical constness we care about, because of
the possibility that some mutable internal state will be in a partially
written mode when another thread tries to read it.

>> Finally, it's worth noting that std::string has even more constraints
>> for COW than you or I have mentioned here, and that are entirely
>> unrelated to thread safety. In particular, it's not clear to me how
>> any COW implementation of std::string can be standards-conforming
>> because of this situation (which I've been pointing out for years in
>> hallway conversations and talks, but it now occurs to me I don't think
>> I've written about):
>
>> string s1( "Hello" );
>> string s2( s1 );
>
>> According to the standard, I don't believe a conforming implementation
>> may do a shallow copy here. The reason is that there's no way to know
>> whether the very next use of the string will mention op[]:
>
>> s1[0]; // or, s2[0]
>
>> If string did use COW and did a shallow copy when constructing s2,
>> then str[0] must do a deep copy, which means it will allocate memory,
>> which means it could throw a bad_alloc exception, and that is
>> nonconforming because when the index is within bounds no exception may
>> be thrown. The standard doesn't allow an exception to be thrown here,
>> but no COW string implementation I know of could realistically avoid
>> this possibility.
>
>§17.4.4.8/3: "No destructor operation defined in the C++ Standard
>Library will throw an exception. Any other functions defined in the C++
>Standard Library that do not have an exception-specification may throw
>implementation-defined exceptions unless otherwise specified."

Right. It is otherwise specified in 17.4.4.8/1 for any operation with a
"throws:" specification:

17.4.4.8(1)
Any of the functions defined in the C++ Standard Library can
report a failure by throwing an exception of the type(s) described
in their Throws: paragraph and/or their exception-specification

I don't see any dispensation anywhere for an operation with a "Throws:"
clause to throw other kinds of exceptions. It's clear from what follows
that any exception specification in particular in the standard is clearly
meant to be interpreted and enforced this way. Further, one other passage
I didn't cite but also had been reading was:

17.3.1.3(3)
Descriptions of function semantics contain the following elements
(as appropriate): [...]
- Throws: any exceptions thrown by the function, and the conditions
that would cause the exception

This reaffirms 17.4.4.8(1) and specifies further that even the listed
legal exceptions may only be thrown under the listed conditions. If
there's no "Throws:" clause then the implementation has leeway to throw
other things and/or under other conditions, but AIUI if there is a
"Throws:" then it does not.

On that basis, the specification of basic_string<>::at seems pretty
strict:

const_reference at(size_type pos) const;
reference at(size_type pos);

Requires: pos < size()

| Throws: out_of_range if pos >= size().

Returns: operator[](pos).

I therefore understand this to mean that an exception other than
out_of_range may not be thrown, and even that exception may not be thrown
if pos < size().

It could be that I'm misunderstanding the force of the "Throws:"
specification, but if so either I'm missing a different passage in the
standard that contradicts the above or the passages quoted above should be
reworded to reflect a looser intent.

>Ideally, what we would like is that std::string be as robust and as
>simple to use as int. That's not attainable, but we should definitely
>consider just how close we can come.

Immutable strings are probably the best approximation.

Herb

---
Herb Sutter (www.gotw.ca)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

ka...@gabi-soft.fr

Sep 15, 2004, 2:29:23 PM

usene...@lehrerfamily.com (Joshua Lehrer) wrote in message
news:<31c49f0d.04091...@posting.google.com>...

> ka...@gabi-soft.fr wrote in message
> news:<d6652001.04091...@posting.google.com>...

> > While I'm at it, I should mention that it's important to take into
> > account one or two things that Herb and I have not yet mentioned,
> > for reasons of simplification. Basically, if the function called is
> > returning a reference or an iterator, not only does it need a unique
> > copy of the image, it needs to ensure that this copy remains unique

> We do that as well. We use a word to keep track of the reference
> count, with a special bit meaning that the String is no longer
> referenceable.

Hmm. If you don't allow modification except through explicit member
functions (don't leak modifiability?), it shouldn't be necessary.

> We solved the operator[] problem by returning proxies. Sure, it isn't
> std compliant, but being std compliant was not part of our spec. In
> fact, our spec required us to be compliant with an old dec::compaq::hp
> string class.

> Our current issue that can not be fixed is with "begin()" and "end()"
> which suffer from the same problems as operator[].

Why are begin() and end() a problem? Obviously, the non-const iterators
have to return a proxy (and not a reference) for the * operator. But
once you've done that, where is the problem -- the proxy should ensure
that all single character modifications go through a set function, and
that this function is never called unless the client actually intends
modification.

And of course, we agree that if the client intends real modification, he
is responsible for locking.
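
A rough single-threaded sketch of the proxy idea, assuming hypothetical names (ProxyString, CharProxy, get, set) and using std::shared_ptr only to keep the example short: reading through the proxy never unshares, and every single-character write goes through set(), which is the only place a deep copy is made. An iterator's operator* could return the same proxy type.

```cpp
#include <cstddef>
#include <memory>
#include <string>

class ProxyString {
public:
    // Stand-in for a reference: converts to char for reads, forwards writes to set().
    class CharProxy {
    public:
        CharProxy(ProxyString& owner, std::size_t pos) : owner_(owner), pos_(pos) {}
        operator char() const { return owner_.get(pos_); }
        CharProxy& operator=(char c) { owner_.set(pos_, c); return *this; }
        CharProxy& operator=(const CharProxy& other) {
            return *this = static_cast<char>(other);
        }
    private:
        ProxyString& owner_;
        std::size_t pos_;
    };

    explicit ProxyString(const char* text)
        : rep_(std::make_shared<std::string>(text)) {}

    // The implicit copy constructor shares the representation (shallow copy).

    CharProxy operator[](std::size_t pos) { return CharProxy(*this, pos); }

    char get(std::size_t pos) const { return (*rep_)[pos]; }

    // Only called when the client actually writes a character.
    void set(std::size_t pos, char c) {
        if (rep_.use_count() > 1) {
            rep_ = std::make_shared<std::string>(*rep_);  // unshare on real modification
        }
        (*rep_)[pos] = c;
    }

    bool shares_with(const ProxyString& other) const { return rep_ == other.rep_; }

private:
    std::shared_ptr<std::string> rep_;
};
```

Because the proxy is not a true reference, code such as `char& r = s[0];` no longer compiles, which is the non-conformance with std::string mentioned above.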

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

David Abrahams

Sep 15, 2004, 3:31:20 PM

wa...@stoner.com (Bill Wade) writes:

> Herb Sutter <hsu...@gotw.ca> wrote in message news:<jb3ak010au9medvi0...@4ax.com>...
>
>> Finally, it's worth noting that std::string has even more constraints for
>> COW than you or I have mentioned here, and that are entirely unrelated to
>> thread safety. In particular, it's not clear to me how any COW
>> implementation of std::string can be standards-conforming because of this
>> situation (which I've been pointing out for years in hallway conversations
>> and talks, but it now occurs to me I don't think I've written about):
>>
>> string s1( "Hello" );
>> string s2( s1 );
>>
>> According to the standard, I don't believe a conforming implementation may
>> do a shallow copy here. The reason is that there's no way to know whether
>> the very next use of the string will mention op[]:
>>
>> s1[0]; // or, s2[0]
>>
>> If string did use COW and did a shallow copy when constructing s2, then
>> str[0] must do a deep copy, which means it will allocate memory, which
>> means it could throw a bad_alloc exception, and that is nonconforming
>> because when the index is within bounds no exception may be thrown. The
>> standard doesn't allow an exception to be thrown here,
>
> I don't believe the words in the standard caught the intent of the
> writers.

Maybe Herb's just misinterpreting the words ;-)

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

David Abrahams

Sep 15, 2004, 3:32:10 PM

Herb Sutter <hsu...@gotw.ca> writes:

> On 14 Sep 2004 08:22:11 -0400, David Abrahams <da...@boost-consulting.com>
> wrote:
>>Herb Sutter <hsu...@gotw.ca> writes:

>
> Ah, I forgot string []/at mirrored vector's. (We must fix that, but that's
> a separate issue.) So just change [] to at:
>
> s1.at(0); // or, s2.at(0)
>
> Now it's clear that the above is not allowed to throw, but could do so on
> a COW implementation.

Not to me.

> For those keeping score at home, this is because of wording in the
> standard that Dave himself was instrumental in crafting:
>
> 17.4.4.8(1)
> Any of the functions defined in the C++ Standard Library can
> report a failure by throwing an exception of the type(s) described
> in their Throws: paragraph and/or their exception-specification
> (15.4). An implementation may strengthen the exception-
> specification for a non-virtual function by removing listed
> exceptions. 175)
>
> Footnote 175:
> That is, an implementation of the function will have an explicit
> exception-specification that lists fewer exceptions than those
> specified in this International Standard. It may not, however,
> change the types of exceptions listed in the exception-
> specification from those specified, nor add others.

I had nothing to do with that wording. It was there before I arrived
on the scene.

> In short, the standard library, any operation with a "Throws:" paragraph
> may throw only listed exceptions.

No, that's just an allowance, not a prohibition.

> For basic_string<>::at, we have:
>
> 21.3.4(3):
> Throws: out_of_range if pos >= size().
>
> That is, at may not throw an exception of any type besides
> out_of_range,

There's a reason it doesn't say "iff" above.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

David Abrahams

Sep 15, 2004, 3:33:05 PM

Herb Sutter <hsu...@gotw.ca> writes:

> >§17.4.4.8/3: "No destructor operation defined in the C++ Standard
> >Library will throw an exception. Any other functions defined in the C++
> >Standard Library that do not have an exception-specification may throw
> >implementation-defined exceptions unless otherwise specified."
>
> Right. It is otherwise specified in 17.4.4.8/1 for any operation with a
> "throws:" specification:
>
> 17.4.4.8(1)
> Any of the functions defined in the C++ Standard Library can
> report a failure by throwing an exception of the type(s) described
> in their Throws: paragraph and/or their exception-specification

That's an allowance, not a prohibition on anything. It's almost
meaningless except as an encouragement to implementors.

> I don't see any dispensation anywhere for an operation with a "Throws:"
> clause to throw other kinds of exceptions.
>
> It's clear from what follows that any exception specification in
> particular in the standard is clearly meant to be interpreted and
> enforced this way.

Yes, but a "Throws:" clause is not an exception specification.

> Further, one other passage I didn't cite but also
> had been reading was:
>
> 17.3.1.3(3)
> Descriptions of function semantics contain the following elements
> (as appropriate): [...]
> - Throws: any exceptions thrown by the function, and the conditions
> that would cause the exception
>
> This reaffirms 17.4.4.8(1) and specifies further that even the listed
> legal exceptions may only be thrown under the listed conditions.

Once again, an allowance but not a prohibition. It's the difference
between "a if b" and "a if and only if b".

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

ka...@gabi-soft.fr

Sep 15, 2004, 4:13:18 PM

Herb Sutter <hsu...@gotw.ca> wrote in message
news:<j9pek0tqmqmlrc8e7...@4ax.com>...

> s1.at(0); // or, s2.at(0)

Note that neither operator[] nor at() have exception specifiers in my
copy of the standard (the 1998 version).

> Now it's clear that the above is not allowed to throw,

Even considering the second sentence in §17.4.4.8/3: "Any other
functions [than a destructor] defined in the C++ Standard Library that
do not have an exception-specification may throw implementation-defined
exceptions unless otherwise specified."

> but could do so on a COW implementation.

> For those keeping score at home, this is because of wording in the
> standard that Dave himself was instrumental in crafting:

> 17.4.4.8(1)
> Any of the functions defined in the C++ Standard Library can
> report a failure by throwing an exception of the type(s) described
> in their Throws: paragraph and/or their exception-specification
> (15.4). An implementation may strengthen the exception-
> specification for a non-virtual function by removing listed
> exceptions. 175)

> Footnote 175:
> That is, an implementation of the function will have an explicit
> exception-specification that lists fewer exceptions than those
> specified in this International Standard. It may not, however,
> change the types of exceptions listed in the exception-
> specification from those specified, nor add others.

> In short, the standard library, any operation with a "Throws:"
> paragraph may throw only listed exceptions.

That's not quite what the text you quote says. Although the first
sentence speaks of "their Throws: paragraph and/or their
exception-specification", the others only say
"exception-specification". And neither operator[] nor at() have an
exception specification.

I don't know if this is a defect in the standard, or whether it was
intentional. One can easily imagine that the authors really intended to
cover three cases:

- No specifications as to what happens on implementation defined
errors -- the function has neither a throws clause nor an exception
specification. (This is the most common case.)

- The standard specifies exactly and fully all possible exceptions:
the function has an exception specification (and probably a throws
clause as well).

- The standard specifies precise exceptions for certain types of
errors, but leaves the implementation free, as in the first case,
for all other possible errors: the function has a throws clause, but
no exception specification.

Unless there is a defect report on this, I'd say that the most probable
intention was to allow the three cases, and that at() falls into the
third -- an illegal index throws a well defined exception, but an
implementation is free to throw an implementation defined exception.
(IMHO, footnote 178 seems to confirm this interpretation.)

As far as I can see, something like:

template< typename CharT, typename Traits, typename Alloc >
typename std::basic_string< CharT, Traits, Alloc >::const_reference
std::basic_string< CharT, Traits, Alloc >::at( size_type ) const
{
throw 3.14159 ;
}

is a perfectly conforming implementation. Not a very useful one, of
course, but the standard doesn't require usefulness.

If you think that this is a misinterpretation of the standard, please
explain why.

If you think that it is a defect, or at least, something we should
change, I'm behind you 100%. It means that in theory, at least, it is
impossible to write exception safe code without depending on
implementation defined behavior. (Note, for example, that
std::string::swap doesn't have an exception specification either.)

> For basic_string<>::at, we have:

> 21.3.4(3):
> Throws: out_of_range if pos >= size().

> That is, at may not throw an exception of any type besides
> out_of_range, and it may throw no exceptions if pos < size().

All that says is that basic_string<>::at MUST throw the designated
exception for this particular error. The paragraph you quote says that
a function "can" report an error by means of the specified exception.
It does not say that it "shall not" raise any other exceptions.

(On rereading it: I don't like the use of "can" either. Normally,
requirements on an implementation use the verb "shall". I would much
prefer text which clearly required an implementation to report errors
described in the Throws: clause by means of the described exception, or
one derived from it. I rather think that this was the intent, but my
interpretation of "can", in the context of the standard, is that an
implementation is allowed to, but not required to.)

At any rate, I've changed the subject, and added comp.std.c++ to the
groups, since this really has nothing to do with your implementation of
COW strings. (I'd also have set followups if my newsreader allowed it.)

> That being the case here, COW cannot be a conforming implementation of
> std::string AFAICS because of the given sequence:

--

James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std...@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]

Herb Sutter

Sep 15, 2004, 7:45:25 PM

On 14 Sep 2004 20:55:24 -0400, Ben Hutchings
<ben-publ...@decadentplace.org.uk> wrote:
>How do you figure that? The standard doesn't specify whether any
>particular members of basic_string call the allocator.

True in general, although you have for example:

21.3.1 basic_string constructors
In all basic_string constructors, a copy of the Allocator argument is
used for any memory allocation performed by the constructor or
member functions during the lifetime of the object.

>None of the
>"Throws:" paragraphs include bad_alloc among the possible exceptions.

Most of the basic_string functions, including most constructors, don't
have Throws: paragraphs, which means they can throw anything.

But you do actually point out a difficulty with my statement about Throws:
paragraphs: At least resize, reserve, and the versions of construction and
append (et al.) that take another basic_string and a position as input
have a Throws: paragraph about out_of_range or length_error, but clearly those operations
might have to perform allocations.

I have always understood a Throws: paragraph to be restrictive, Dave
Abrahams has mentioned in private email that he has always understood a
Throws: paragraph to be open to throwing other exceptions, and the above
leans more toward Dave's view of it. I still think that we need to make
this clearer, because at least to my reading Throws: looks restrictive as
the standard is written right now, but it now seems to me that in other
places this wasn't intended.

If Throws: isn't intended to be restrictive, then ignore my aside from the
original discussion about COW being fundamentally an invalid
implementation of std::string as a red herring. :-)

Herb

---
Herb Sutter (www.gotw.ca)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

David Abrahams

Sep 16, 2004, 11:14:40 AM

ka...@gabi-soft.fr writes:

> As far as I can see, something like:
>
> template< typename CharT, typename Traits, typename Alloc >
> typename std::basic_string< CharT, Traits, Alloc >::const_reference
> std::basic_string< CharT, Traits, Alloc >::at( size_type ) const
> {
> throw 3.14159 ;
> }
>
> is a perfectly conforming implementation. Not a very useful one, of
> course, but the standard doesn't require usefulness.
>
> If you think that this is a misinterpretation of the standard, please
> explain why.

You understand correctly.

> If you think that it is a defect, or at least, something we should
> change, I'm behind you 100%. It means that in theory, at least, it is
> impossible to write exception safe code without depending on
> implementation defined behavior.

That's already possible.

> (Note, for example, that std::string::swap doesn't have an exception
> specification either.)

Bingo. That should be changed, just to keep people from making
mistakes with it. But it's not, strictly speaking, a defect.

>> For basic_string<>::at, we have:
>
>> 21.3.4(3):
>> Throws: out_of_range if pos >= size().
>
>> That is, at may not throw an exception of any type besides
>> out_of_range, and it may throw no exceptions if pos < size().
>
> All that says is that basic_string<>::at MUST throw the designated
> exception for this particular error.

I'm not even sure it's that strong.

> The paragraph you quote says that a function "can" report an error
> by means of the specified exception. It does not say that it "shall
> not" raise any other exceptions.
>
> (On rereading it: I don't like the use of "can" either. Normally,
> requirements on an implementation use the verb "shall". I would much
> prefer text which clearly required an implementation to report errors
> described in the Throws: clause by means of the described exception, or
> one derived from it. I rather think that this was the intent, but my
> interpretation of "can", in the context of the standard, is that an
> implementation is allowed to, but not required to.)

Correct. I think it's meant to be "guidance for implementors".

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

ka...@gabi-soft.fr

Sep 16, 2004, 10:26:46 PM

Herb Sutter <hsu...@gotw.ca> wrote in message

news:<p31fk0li82lc5s55b...@4ax.com>...

> First, thanks again for a typically insightful and deeply
> thought-through analysis.

> On 14 Sep 2004 18:46:07 -0400, ka...@gabi-soft.fr wrote:
> >Which part is untrue? Obviously, when talking about thread safety,
> >we have to define what we mean by the term. Your article clearly
> >said that it should behave as a non-COW string does with regards to
> >thread-safety.

> Close enough. I said that:

> So String has to do just enough extra serialization work to enable
> calling code to work correctly as long as the calling code does its
> usual thing for thread safe operation, namely to serialize access
> to individual visible String objects that the calling code knows
> are or can be actually shared across threads.

I don't have the article here to reread, but I'm pretty sure you made a
point about the guarantees which a non-COW string makes. Personally, I
wish you'd have spent more time discussing what the actual contract was,
but I understand that that can often become far more technically
abstract than you probably wanted.

If I understand you correctly, you are saying that as soon as the object
is used by more than one thread, all accesses must be protected by a
lock. If this is the case, then your implementation conforms to the
contract. My personal opinion is that this isn't a very good guarantee,
perhaps only because I'm used to more, from Posix or from the SGI
implementation of the STL.

> Note that I am not saying anything about if only readers are involved.

Your referring to the guarantees of a non-COW implementation led me
astray, because a non-COW implementation (including the one available at
the SGI site) gives the usual Posix guarantee -- if no thread ever
modifies the object, then you do not need synchronization to read it.

> >Given that, I presented an example of code which would work with a
> >non COW string, but which would fail with your implementation. So
> >where is my error: in my interpretation of the guarantees which you
> >intended, in my suppositions concerning the interface which you
> >wanted to implement, or in my example of where it failed. (If it is
> >the latter, you'll have to show me the exact flaw in my reasoning,
> >because I've pored over that one a lot.)

> Sure, and I do understand you think deeply about things before you post.

Not always, but I've been through this with the g++ implementation of
string (which is basically the same thing as you proposed).

> I do disagree with the statement that it works with a non-COW string
> but fails with my implementation. Your example boils down to:

> // global
> String s = "Hello";

> // thread 1
> String s2 = s;

> // thread 2
> s[0];

> // thread 3
> s[0];

> Here's a stretched but valid example that breaks the above code for
> even a non-COW String: Imagine that a non-COW String (conformingly)
> decides to cache a count of the number of member function calls
> performed on it, that the count is a member of a type that does not
> guarantee atomic writes (and there's no internal synchronization), and
> op[] updates the count. That's a perfectly valid instrumented
> implementation that's perfectly safe for single-threaded use. But in
> the multi-threaded use above, there is a race condition between
> threads 2 and 3 and you can get corrupted counts; you can also get
> reads of partially written counts (e.g., if thread 1 cares to read the
> instrumented count), etc.
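
The instrumented string described above might look roughly like the following sketch. CountingString and its members are hypothetical names, and the code is only meant to show where the hidden write sits, not how any real library is built.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical instrumented, non-COW string: each object owns its buffer,
// but a logically const read still writes to a mutable counter.
class CountingString {
public:
    explicit CountingString(const std::string& s) : data_(s), calls_(0) {}

    char operator[](std::size_t i) const
    {
        ++calls_;          // unsynchronized write inside a const function
        return data_[i];
    }

    unsigned long calls() const { return calls_; }

private:
    std::string data_;
    mutable unsigned long calls_;  // physical state that reads modify
};

// Single-threaded usage is well defined.
inline void demo_counting_string()
{
    CountingString s(std::string("Hello"));
    assert(s[0] == 'H');
    assert(s.calls() == 1);
}
```

If two threads call s[0] on the same CountingString without external locking, both execute ++calls_ on the same object, which is the race in question even though neither thread intends to modify the string.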

You don't have to get that exotic. Obviously, by non-COW strings, I
supposed the standard implementation, e.g. the one at SGI. A priori, an
infinity of implementations are possible. In practice, of course, I'm
only concerned with practical, usable implementations.

> More fundamentally, you want and assume that it's safe to call
> nonmutating or even possibly-mutating member functions on the same
> object on different threads without external locking, as long as the
> operations are not _actually_ mutating the physical (not logical)
> state of the object. That's an understandable but problematic
> assumption.

It's the guarantee that I get with all of the implementations of
std::string I know of except g++. More to the point, it's the guarantee
I get with other components, and with the basic types under Posix. It
is, more or less, the "standard" guarantee. If only because outside of
SGI and Posix, I've not found any formal specifications of what is
guaranteed.

> First, I do not assume that it's safe to call even const member
> functions on the same object of any type on different threads without
> external locking unless that type explicitly documents that's safe.

I agree there. For that matter, unless the class documents its
guarantees, you cannot do anything with it in multiple threads, even
with external locking. A legal implementation of std::string, for
example, could call opendir/readdir even in const functions -- under
Solaris, for example, these functions do NOT always work, even if you
protect them with locks.

Unless you have a contract, you haven't a leg to stand on.

The SGI implementation of std::basic_string (which is used in the
STLport) does guarantee this, explicitly. So does the Rogue Wave
implementation that comes with Sun CC, and it uses COW (but it also uses
a lot of locks, with somewhat unpleasant consequences on performance).
I don't know what formal guarantees Dinkumware gives us, but since from
what I hear, they don't use COW (and I'm pretty sure that they don't go
around inserting extra writes in const functions just to cause
problems), in practice, they also give this guarantee.

In the case of g++, the guarantee is present in all of the standard
containers except basic_string, because they use the SGI implementation
for all of the rest. Both guarantees are explicitly documented.
(Although IMHO, the documentation could be a little bit more explicit as
to which classes get which guarantee. Basically, it says that the
classes of the STL which they derive from the SGI implementation have
the SGI guarantee. If the naïve programmer goes to the SGI site, he
will find that there, basic_string IS part of their current STL
implementation.)

> I understand that you want that to be true (and I agree that's a good
> idea and it is in fact true for some types on some platforms), but
> std::string doesn't say whether or not it should/must be true
> (although admittedly a more fundamental reason for that is because the
> standard doesn't talk about threads), and at best we're left with
> implementation-defined behavior (the std::string implementer has to
> say whether it's true for their implementation).

I totally agree with you there. To be very clear -- my contention with
your code was based on "the same guarantees as a non-COW string."

> Second, even if that is supported, then the type really has to
> document very clearly exactly what constitutes mutation. And the
> problem isn't just that op[] used above is a non-const function,
> because this issue is deeper than const vs. non-const functions can
> express, because of logical vs. physical constness. And it is physical
> constness we care about, because of the possibility that some mutable
> internal state will be in a partially written mode when another thread
> tries to read it.

Agreed. There is a fundamental problem in that the implementation
mutates internally without the user knowing it, and without any intent
on the part of the user to mutate the value. That's not to say that it
can't be done -- Joshua Lehrer has mentioned his implementation, which
uses COW and conforms to the SGI/Posix threading guarantees. But he
needs a proxy to do it, and the standard doesn't allow std::basic_string
to use a proxy.
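
To make the proxy idea concrete, here is a minimal single-threaded sketch. It is not Joshua Lehrer's code; CowString, CharProxy and shares_with are hypothetical names, and std::shared_ptr merely stands in for a hand-written reference count. The point is that a non-const operator[] which returns a proxy can tell a read from a write, and unshare only on the write.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical single-threaded COW string whose non-const operator[]
// returns a proxy instead of a raw char&.
class CowString {
public:
    class CharProxy {
    public:
        CharProxy(CowString& owner, std::size_t index)
            : owner_(owner), index_(index) {}

        // Reading through the proxy never copies the shared buffer.
        operator char() const { return (*owner_.rep_)[index_]; }

        // Writing unshares first, then modifies the private copy.
        CharProxy& operator=(char c)
        {
            owner_.unshare();
            (*owner_.rep_)[index_] = c;
            return *this;
        }

    private:
        CowString& owner_;
        std::size_t index_;
    };

    explicit CowString(const std::string& s)
        : rep_(std::make_shared<std::string>(s)) {}

    CharProxy operator[](std::size_t i) { return CharProxy(*this, i); }
    char operator[](std::size_t i) const { return (*rep_)[i]; }

    // True while both objects still point at the same buffer.
    bool shares_with(const CowString& other) const { return rep_ == other.rep_; }

private:
    void unshare()
    {
        if (rep_.use_count() > 1) {
            rep_ = std::make_shared<std::string>(*rep_);
        }
    }

    std::shared_ptr<std::string> rep_;
};

inline void demo_cow_proxy()
{
    CowString a(std::string("Hello"));
    CowString b = a;             // shares the buffer
    char first = b[0];           // read through the proxy: still shared
    assert(first == 'H' && a.shares_with(b));
    b[0] = 'J';                  // write through the proxy: b unshares
    assert(!a.shares_with(b));
}
```

With a raw char& the class cannot know whether s[0] will be read or written, so it has to unshare (a physical mutation) on every non-const access. A thread-safe version would still need the atomic count handling discussed earlier in the thread; this sketch ignores that entirely.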

(Note that there was a long comment on the CD2 by the French national
body concerning the validity of iterators and references in
basic_string. We presented several solutions, including the possibility
of reworking the interface enough to allow the use of proxies --
something I consider necessary for COW to work even in a non-threaded
environment. In the end, the committee adopted the minimalist proposal;
in some ways, I was disappointed, but I can understand their not wanting
to rework the text more than necessary at that late date. Still, had
they adopted our more radical approach, allowing proxies, COW would be
very viable, even in a multi-threaded environment.)

[Discussion of whether operator[] or at() can throw...]

I've addressed this in another post. David Abrahams seems to have
addressed it as well, in more or less the same terms, and he is much
more familiar with this part of the standard than I am. Rather than
develop the ideas here, and end up with yet another subthread talking
about the same thing, I'll leave my other posting and that of David
Abrahams to do the work. (I presume you will comment on at least one of
them, and if I have anything new to contribute, I will do so there.)

[...]

> >Ideally, what we would like is that std::string be as robust and as
> >simple to use as int. That's not attainable, but we should
> >definitely consider just how close we can come.

> Immutable strings are probably the best approximation.

That's what I've always thought. In general, I try to design my value
classes so that the only non-const functions are assignment operators.
Still, with proxies you can do a pretty good job even if you want to
support modification.

Of course, std::basic_string is neither immutable, nor does it allow
proxies. But it's all we've got; it's far from perfect, but for most
everyday use, it gets the job done.

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Herb Sutter

Sep 16, 2004, 11:00:48 PM

On 15 Sep 2004 15:33:05 -0400, David Abrahams <da...@boost-consulting.com>
wrote:

>Herb Sutter <hsu...@gotw.ca> writes:
>> >§17.4.4.8/3: "No destructor operation defined in the C++ Standard
>> >Library will throw an exception. Any other functions defined in the C++
>> >Standard Library that do not have an exception-specification may throw
>> >implementation-defined exceptions unless otherwise specified."
>>
>> Right. It is otherwise specified in 17.4.4.8/1 for any operation with a
>> "throws:" specification:
>>
>> 17.4.4.8(1)
>> Any of the functions defined in the C++ Standard Library can
>> report a failure by throwing an exception of the type(s) described
>> in their Throws: paragraph and/or their exception-specification
>
>That's an allowance, not a prohibition on anything. It's almost
>meaningless except as an encouragement to implementors.

What about the "Throws: Nothing" clauses, such as on many std::list
operations (e.g., list::splice)? Are you saying they should be considered
to be meaningless non-normative encouragement?

FWIW, I'm becoming convinced that the standard is inconsistent. Further
below...

>> I don't see any dispensation anywhere for an operation with a "Throws:"
>> clause to throw other kinds of exceptions.
>>
>> It's clear from what follows that any exception specification in
>> particular in the standard is clearly meant to be interpreted and
>> enforced this way.
>
>Yes, but a "Throws:" clause is not an exception specification.

The two are taken in the same breath and the same context, and "Throws:"
additionally gets further dispensation elsewhere.

FWIW I now agree that there's a problem here. It now looks to me like the
standard uses Throws: inconsistently: Under either interpretation of
"Throws:" there is a set of Throws: clauses that seem to me to be broken.

Specifically, exactly one of the following should be true. Either:

1. "Throws:" wasn't intended to be restrictive. If so, that is not clearly
indicated at all (to my eye) and IMO needs to be reworded in the
paragraphs I cited. More fundamentally, however, if this is the intent
then some functions with Throws: paragraphs appear to be badly specified,
including the slew of "Throws: Nothing"s in Clause 23.2.2 (std::list)
which I think we would agree were clearly intended to be restrictive and
had better be restrictive (e.g., list::splice, for which the nothrow
guarantee is an essential design feature).

Or else:

2. "Throws:" was intended to be restrictive. If so, I'm fine with the
wording cited above which I think already accomplishes this, but we should
probably make it even more explicit because other experts such as yourself
read it differently which means it's not as clear as it needs to be. But,
more fundamentally again, if this is the intent then a different set of
functions with Throws: paragraphs appear to be badly specified, in
particular as was pointed out on this thread various basic_string
functions (e.g., the constructor that takes another string and offset(s),
whose Throws: appears in 21.3.1(4), should allow more than just
out_of_range), and we should check that catchall paragraphs like the
newly-added-in-C++03 paragraph 21.3(4a) are still okay (i.e., if "Throws:"
is restrictive it should be made so with explicit "unless otherwise
specified" wording).

But in either case, if this is confusing us and if "Throws:" is being used
inconsistently (e.g., nonrestrictively in std::basic_string, restrictively
in std::list), this needs fixing in the standard, doesn't it?
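
As an illustration of why the restrictive reading matters for std::list, here is a sketch of the usual "stage, then commit" pattern, with append_all as a hypothetical helper name. The commit step is only safe if splice really cannot throw.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical helper: append every item to target, or leave target untouched.
void append_all(std::list<std::string>& target,
                const std::vector<std::string>& items)
{
    std::list<std::string> staged;
    for (const std::string& item : items) {
        staged.push_back(item);   // may throw; target is not modified yet
    }
    // Commit step: relinks nodes only, so it relies on splice not throwing.
    target.splice(target.end(), staged);
}

inline void demo_append_all()
{
    std::list<std::string> names{"a"};
    append_all(names, {"b", "c"});
    assert(names.size() == 3);
}
```

If "Throws: Nothing" on splice were only guidance, the strong guarantee this function appears to give would depend on the implementation.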

Herb

---
Herb Sutter (www.gotw.ca)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

Andrew Koenig

Sep 17, 2004, 11:29:55 AM

<ka...@gabi-soft.fr> wrote in message
news:d6652001.04091...@posting.google.com...

> As far as I can see, something like:

>
> template< typename CharT >
> std::basic_string< CharT >::const_reference
> std::basic_string< CharT >::at() const
> {
> throw 3.14159 ;
> }
>
> is a perfectly conforming implementation. Not a very useful one, of
> course, but the standard doesn't require usefulness.

I agree.

IIRC, we had an extended discussion on this issue at the Boston meeting, at
which we concluded:

1) Implementations should be permitted to throw exceptions if they
exceed their limitations;

2) We did not know how to require that implementations should exceed
their limitations only at times that are convenient to their users; and

3) Implementations that throw exceptions too often at inconvenient times
were unlikely to attract many users.

In other words, we left greater latitude to "quality of implementation"
issues than some of us might have liked, but only because we could not find
an unambiguous way of specifying otherwise.

Herb Sutter

Sep 17, 2004, 11:29:56 AM

>If you think that it is a defect, or at least, something we should
>change, I'm behind you 100%. It means that in theory, at least, it is
>impossible to write exception safe code without depending on
>implementation defined behavior.

That's my fear now. Let me use this article to repost in this new thread
the essentials of a separate reply on the original thread.

To recap, the basic point under discussion is what "Throws:"
specifications actually specify: Are they restrictive, or can
implementations throw additional types of exceptions and/or under
different circumstances than described in a "Throws:" clause?

Some interpret 17.4.4.8(1) and (3) to mean that only exception
specifications may not be violated, and that "Throws:" clauses are not
restrictive so that implementations are free to throw additional
exceptions and/or under additional circumstances.

Others interpret 17.4.4.8(1) to mean that Throws: clauses and exception
specifications taken together may not be violated, thus that both are
restrictive so that implementations are not free to throw additional
exceptions and/or under additional circumstances. In particular, (1) does
not say that a standard library function "can" report a failure by
throwing other types of exceptions. Further, we have 17.3.1.3(3) which
says that "Throws:" clauses document "any exceptions thrown by the
function, and the conditions that would cause the exception."

Here are a few potential defects:

1. In 17.4.4.8(1), almost certainly "can" should be replaced by "shall" at
least for "Throws:" paragraphs, unless we really intended to additionally
grant license for functions to not throw the specified exceptions under
the specified conditions in "Throws:" paragraphs. Alternatively, we could
change "would" to "shall" in 17.3.1.3(3) line 6.

2. We need to reword 17.4.4.8 and 17.3.1.3 to make it clear whether or not
"Throws:" is restrictive.

3. If we intended/decide that "Throws:" shouldn't be restrictive, we
probably want to fix some functions whose Throws: paragraphs appear to be
inappropriate under that interpretation, including the "Throws: Nothing"
paragraphs in Clause 23.2.2 (std::list) which AFAIK were definitely
intended to be restrictive (e.g., list::splice).

4. If we intended/decide that "Throws:" should be restrictive, we probably
want to fix a different set of functions whose Throws: paragraphs appear
to be inappropriate under _that_ interpretation, including the
basic_string constructor that takes another string and offset(s), whose
Throws: appears in 21.3.1(4) and should probably allow more than just
out_of_range (e.g., bad_alloc if allocation fails).
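
For reference, the documented case in item 4 can be observed directly. In this sketch substring_ctor_rejects is a hypothetical helper. It catches only out_of_range, so an allocation failure during the copy would still propagate as whatever the implementation throws (typically bad_alloc).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns true if constructing a substring copy starting at pos
// reports std::out_of_range, as the Throws: paragraph describes.
bool substring_ctor_rejects(const std::string& s, std::size_t pos)
{
    try {
        std::string copy(s, pos);   // copies s from pos to the end
        (void)copy;
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

inline void demo_substring_ctor()
{
    assert(!substring_ctor_rejects("hello", 5));  // pos == size() is allowed
    assert(substring_ctor_rejects("hello", 6));   // pos > size() throws
}
```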

James notes:

>I don't know if this is a defect in the standard, or whether it was
>intentional. One can easily imagine that the authors really intended to
>cover three cases:
>
> - No specifications as to what happens on implementation defined
> errors -- the function has neither a throws clause nor an exception
> specification. (This is the most common case.)
>
> - The standard specifies exactly and fully all possible exceptions:
> the function has an exception specification (and probably a throws
> clause as well).
>
> - The standard specifies precise exceptions for certain types of
> errors, but leaves the implementation free, as in the first case,
> for all other possible errors: the function has a throws clause, but
> no exception specification.
>
>Unless there is a defect report on this, I'd say that the most probable
>intention was to allow the three cases,
>

>If you think that this is a misinterpretation of the standard, please
>explain why.

This is one possible intent, but I don't think the above is consistent
with the "Throws: Nothing" clauses on std::list functions, which would
fall into your third case but which almost certainly were intended to be
guaranteed-nonthrowing operations.

I suspect that no single interpretation of "Throws:" clauses in the
current standard can be consistent right now, because "Throws:" appears to
be used inconsistently -- nonrestrictively in some places (e.g.,
std::basic_string) and restrictively in others (e.g., std::list).

Herb

---
Herb Sutter (www.gotw.ca)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

ka...@gabi-soft.fr

Sep 17, 2004, 1:35:30 PM

Herb Sutter <hsu...@gotw.ca> wrote in message

news:<ccejk0dvthosg73hj...@4ax.com>...

> On 15 Sep 2004 15:33:05 -0400, David Abrahams
> <da...@boost-consulting.com> wrote:
> >Herb Sutter <hsu...@gotw.ca> writes:
> >> >§17.4.4.8/3: "No destructor operation defined in the C++ Standard
> >> >Library will throw an exception. Any other functions defined in
> >> >the C++ Standard Library that do not have an
> >> >exception-specification may throw implementation-defined
> >> >exceptions unless otherwise specified."

> >> Right. It is otherwise specified in 17.4.4.8/1 for any operation with a
> >> "throws:" specification:

> >> 17.4.4.8(1)
> >> Any of the functions defined in the C++ Standard Library can
> >> report a failure by throwing an exception of the type(s)
> >> described in their Throws: paragraph and/or their
> >> exception-specification

> >That's an allowance, not a prohibition on anything. It's almost
> >meaningless except as an encouragement to implementors.

> What about the "Throws: Nothing" clauses, such as on many std::list
> operations (e.g., list::splice)? Are you saying they should be
> considered to be meaningless non-normative encouragement?

Are you saying that the use of "can" instead of the normally normative
"shall" (or "shall not") is without significance?

Frankly, I think the paragraph needs rework. The use of "can" in a
standard is very suspicious. If there is an actual requirement, the
correct words are "shall" and "shall not". If there is no actual
requirement, if it is a recommendation, the word "may" may be used, but
IMHO, it is much better to first specify the exact requirements, then
add something along the lines of "The intent is...".

[...]

> Specifically, exactly one of the following should be true. Either:

> 1. "Throws:" wasn't intended to be restrictive. If so, that is not
> clearly indicated at all (to my eye) and IMO needs to be reworded in
> the paragraphs I cited. More fundamentally, however, if this is the
> intent then some functions with Throws: paragraphs appear to be badly
> specified, including the slew of "Throws: Nothing"s in Clause 23.2.2
> (std::list) which I think we would agree were clearly intended to be
> restrictive and had better be restrictive (e.g., list::splice, for
> which the nothrow guarantee is an essential design feature).

And basic_string::swap has neither a Throws: clause nor an exception
specification (although we certainly don't want it to throw anything
either). I suspect that there is more than a little work to be done in
this respect. If it is an essential part of the design that a function
doesn't throw, then, as the standard is presently written, that function
should have an empty exception specifier.
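
The reason a non-throwing swap matters so much is the copy-and-swap idiom. A minimal sketch follows, with Record as a hypothetical class; the commit step assumes std::string::swap does not throw, which is exactly what the standard text does not promise.

```cpp
#include <cassert>
#include <string>

// Hypothetical record type using the copy-and-swap idiom.
class Record {
public:
    Record(const std::string& name, const std::string& note)
        : name_(name), note_(note) {}

    Record& operator=(const Record& other)
    {
        Record tmp(other);   // all copying (and any throwing) happens here
        swap(tmp);           // commit: assumed not to throw
        return *this;
    }

    void swap(Record& other)
    {
        name_.swap(other.name_);
        note_.swap(other.note_);
    }

    const std::string& name() const { return name_; }
    const std::string& note() const { return note_; }

private:
    std::string name_;
    std::string note_;
};

inline void demo_copy_and_swap()
{
    Record a("a", "x");
    Record b("b", "y");
    a = b;
    assert(a.name() == "b" && a.note() == "y");
}
```

If the second swap call could throw after the first succeeded, the object would be left half-assigned and the strong guarantee would be lost.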

Note that the standard is somewhat inconsistent even with regards to
destructors, which are guaranteed not to throw. Some of the destructors
(e.g. exception) have an empty exception specification, others (e.g.
basic_string) have no exception specification.

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

David Abrahams

Sep 18, 2004, 7:01:08 AM

Herb Sutter <hsu...@gotw.ca> writes:

> But in either case, if this is confusing us and if "Throws:" is being used
> inconsistently (e.g., nonrestrictively in std::basic_string, restrictively
> in std::list), this needs fixing in the standard, doesn't it?

I agree with you. Please file a LWG defect.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

David Abrahams

Sep 18, 2004, 7:01:53 AM

ka...@gabi-soft.fr writes:

>> Note that I am not saying anything about if only readers are
>> involved.
>
> Your referring to the guarantees of a non-COW implementation led me
> astray, because a non-COW implementation (including the one
> available at the SGI site) gives the usual Posix guarantee -- if no
> thread ever modifies the object, then you do not need
> synchronization to read it.

This has nothing to do with POSIX really. It's all about the
difference between logical and physical constness. Herb is claiming
that even a non-COW implementation of std::string is allowed to have
mutable internal state that is altered when you read it, and in that
case multiple readers need to be synchronized. While he's technically
correct, I have the feeling it's an unfair argument, though I can't
really justify my feeling. It's certainly possible to implement a
non-COW string that doesn't have mutable internal state, and in fact,
the most naive implementation works that way. Probably the right
measure for minimal thread safety is "as threadsafe as int", not "as
threadsafe as a non-COW version," since as Herb has demonstrated,
"it's non-COW" is almost devoid of thread-safety implications.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

Herb Sutter

Sep 18, 2004, 10:26:35 AM

On 17 Sep 2004 13:35:30 -0400, ka...@gabi-soft.fr wrote:
>Frankly, I think the paragraph needs rework. The use of "can" in a
>standard is very suspicious.

Agreed.

>And basic_string::swap has neither a Throws: clause nor an exception
>specification (although we certainly don't want it to throw anything
>either). I suspect that there is more than a little work to be done in
>this respect. If it is an essential part of the design that a function
>doesn't throw, then, as the standard is presently written, that function
>should have an empty exception specifier.

Agreed (though I'd like to see some dispensation that an empty exception
spec doesn't need to actually appear on the function as long as the
function really won't throw).

>Note that the standard is somewhat inconsistent even with regards to
>destructors, which are guaranteed not to throw. Some of the destructors
>(e.g. exception) have an empty exception specification, others (e.g.
>basic_string) have no exception specification.

At least that part is already covered by the catchall. It would probably
be nice to be consistent about the exception specs on stdlib dtors. But
the more substantial change I would like to see discussed for C++0x is to
elevate the stdlib's prohibition into the language itself, and declare
undefined behavior if dtors throw. That would be controversial, but worth
discussing to see if there might be agreement. Frankly, as it is it's
little better than undefined behavior even now, because you can't even
reliably make use of a type whose dtor can throw (if its ctor can throw
too) because just trying to create an object of such a type could
terminate your program.

Herb

---
Herb Sutter (www.gotw.ca)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

Ben Hutchings

Sep 19, 2004, 6:43:06 AM

ka...@gabi-soft.fr wrote:
<snip>

> The SGI implementation of std::basic_string (which is used in the
> STLport) does guarantee this, explicitly. So does the Rogue Wave
> implementation that comes with Sun CC, and it uses COW (but it also uses
> a lot of locks, with somewhat unpleasant consequences on performance).
> I don't know what formal guarantees Dinkumware gives us, but since from
> what I hear, they don't use COW (and I'm pretty sure that they don't go
> around inserting extra writes in const functions just to cause
> problems), in practice, they also give this guarantee.

<snip>

The old Dinkumware implementation of std::basic_string shipped with
Visual C++ 6.0 uses COW and its reference-counting is not thread-safe.
The current implementation does not use COW.

--
Ben Hutchings
I say we take off; nuke the site from orbit. It's the only way to be sure.

Alexander Terekhov

Sep 19, 2004, 6:50:27 AM

Herb Sutter wrote:
[...]

> Agreed (though I'd like to see some dispensation that an empty exception
> spec doesn't need to actually appear on the function as long as the
> function really won't throw).

That's sorta cheating. Fix ES by mandating 2-phase EH with
std::unexpected() invoked at throw point.

[...]

> the more substantial change I would like to see discussed for C++0x is to
> elevate the stdlib's prohibition into the language itself, and declare
> undefined behavior if dtors throw.

Dtors shall simply have implicit throw() ES by default.
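
For what it's worth, C++11 later went roughly this way: a destructor is noexcept by default unless it is declared otherwise. A small sketch of how that can be checked with the C++11 type traits (Plain and Throwing are hypothetical types):

```cpp
#include <type_traits>

struct Plain {
    ~Plain() {}                      // implicitly noexcept since C++11
};

struct Throwing {
    ~Throwing() noexcept(false) {}   // must opt in explicitly to a throwing dtor
};

static_assert(std::is_nothrow_destructible<Plain>::value,
              "destructors are non-throwing by default");
static_assert(!std::is_nothrow_destructible<Throwing>::value,
              "noexcept(false) opts out");
```

An exception that escapes a noexcept destructor calls std::terminate, so the default is enforced at run time rather than left undefined.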

regards,
alexander.

David Abrahams

Sep 19, 2004, 6:50:49 AM

Herb Sutter <hsu...@gotw.ca> writes:

No, if the ctor throws the dtor isn't called.

I can actually imagine legit uses for intentionally throwing dtors,
though I've never needed it badly enough to do it myself.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

Sean Kelly

Sep 20, 2004, 1:47:27 AM

David Abrahams wrote:
>
> This has nothing to do with POSIX really. It's all about the
> difference between logical and physical constness. Herb is claiming
> that even a non-COW implementation of std::string is allowed to have
> mutable internal state that is altered when you read it, and in that
> case multiple readers need to be synchronized. While he's technically
> correct, I have the feeling it's an unfair argument, though I can't
> really justify my feeling.

I think it's just a matter of expected behavior, similar in a sense to
the discussion about "const" in another thread. If a programmer
attempts a non-modifying operation on a valid subrange of what is
essentially a primitive type then it is unreasonable to suggest that the
operation may throw or may result in undefined behavior. This does
suggest, however, that any mention of multithreading in the standard
might require a careful review of all library classes, just to ensure
that behavior guarantees are worded appropriately.

> It's certainly possible to implement a
> non-COW string that doesn't have mutable internal state, and in fact,
> the most naive implementation works that way. Probably the right
> measure for minimal thread safety is "as threadsafe as int", not "as
> threadsafe as a non-COW version," since as Herb has demonstrated,
> "it's non-COW" is almost devoid of thread-safety implications.

Perhaps "as threadsafe as a dynamic array of the same element type?"
The suggestion that it is "as threadsafe as int" implies that even write
operations might be marginally atomic, which may be too strong of a
guarantee. In any case, I agree that string should behave logically
similar to any primitive type with respect to thread and exception
safety. The programmer is always free to use a standalone
implementation with platform-specific behavior if it suits the situation.

Sean

Herb Sutter

Sep 20, 2004, 1:47:56 AM

On 19 Sep 2004 06:50:49 -0400, David Abrahams <da...@boost-consulting.com>
wrote:

^^^^^^
array
My typo.

> > terminate your program.
>
>No, if the ctor throws the dtor isn't called.
>
>I can actually imagine legit uses for intentionally throwing dtors,
>though I've never needed it badly enough to do it myself.

---
Herb Sutter (www.gotw.ca)

Convener, ISO WG21 (C++ standards committee) (www.gotw.ca/iso)
Contributing editor, C/C++ Users Journal (www.gotw.ca/cuj)
Architect, Developer Division, Microsoft (www.gotw.ca/microsoft)

Hyman Rosen

Sep 20, 2004, 2:07:00 AM

David Abrahams wrote:
> I can actually imagine legit uses for intentionally throwing dtors,
> though I've never needed it badly enough to do it myself.

I've written classes, generally versions of output formatters,
that are intended to be created only as temporaries and which
use operator<< to gather up arguments and do the real output
work in their destructors. That work may involve operations
which can throw. This is slightly risky - if one of the calls
to generate an argument throws and then the destructor also
throws, the program would abort, but I find the idiom useful
and the likelihood of a problem small.
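
The idiom described here can be sketched roughly as follows. This is only an illustration: the class name `LogLine` and the vector used as a sink are invented for the example, and `noexcept(false)` is needed in C++11 and later (where destructors are non-throwing by default) to reproduce the "destructor may throw" behaviour under discussion.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical formatter meant to live only as a temporary:
// operator<< collects arguments, the destructor does the real output.
class LogLine {
public:
    explicit LogLine(std::vector<std::string>& sink) : sink_(sink) {}
    LogLine(const LogLine&) = delete;
    LogLine& operator=(const LogLine&) = delete;

    template <typename T>
    LogLine& operator<<(const T& value) {
        buffer_ << value;  // only gathers text, no output yet
        return *this;
    }

    // The output step may throw (push_back can fail to allocate), so the
    // destructor is explicitly declared as potentially throwing.
    ~LogLine() noexcept(false) {
        sink_.push_back(buffer_.str());
    }

private:
    std::vector<std::string>& sink_;
    std::ostringstream buffer_;
};

// Each temporary is destroyed, and so flushed, at the end of its statement.
std::vector<std::string> format_two_lines() {
    std::vector<std::string> sink;
    LogLine(sink) << "x=" << 42;
    LogLine(sink) << "pi~" << 3.5;
    return sink;
}
```

The risk mentioned above is visible in the shape of the code: if evaluating one of the `<<` arguments throws, the temporary is destroyed during unwinding, and a second exception from the destructor at that point terminates the program.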

Francis Glassborow

Sep 20, 2004, 11:30:25 AM

In article <ubrg3d...@boost-consulting.com>, David Abrahams
<da...@boost-consulting.com> writes

>No, if the ctor throws the dtor isn't called.

That is true today but not once we have forwarding ctors. This is an
issue I had not considered and one that I think needs consideration.
Perhaps it is as simple as saying that having a throwing dtor in a class
with a throwing forwarding ctor is an unwise design but perhaps we
should say more.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

ka...@gabi-soft.fr

Sep 20, 2004, 2:32:33 PM

Andrew Koenig <a...@acm.org> wrote in message
news:<8Y62d.595706$Gx4.3...@bgtnsc04-news.ops.worldnet.att.net>...

> <ka...@gabi-soft.fr> wrote in message
> news:d6652001.04091...@posting.google.com...

> > As far as I can see, something like:

> > template< typename CharT >
> > std::basic_string< CharT >::const_reference
> > std::basic_string< CharT >::at() const
> > {
> > throw 3.14159 ;
> > }

> > is a perfectly conforming implementation. Not a very useful one, of
> > course, but the standard doesn't require usefulness.

> I agree.

> IIRC, we had an extended discussion on this issue at the Boston
> meeting, at which we concluded:

> 1) Implementations should be permitted to throw exceptions if they
> exceed their limitations;

> 2) We did not know how to require that implementations should
> exceed their limitations only at times that are convenient to their
> users; and

> 3) Implementations that throw exceptions too often at inconvenient
> times were unlikely to attract many users.

> In other words, we left greater latitude to "quality of
> implementation" issues than some of us might have liked, but only
> because we could not find an unambiguous way of specifying otherwise.

I suspected something along these lines. (And I agree that an
implementation like the above is unlikely to attract many users.) But
isn't there already something in the standard to the effect that
exceeding (unspecified) implementation resource limits is undefined
behavior? So even without the above, the implementation could throw an
exception. Or crash, or reformat the hard disk, or...

Or is the presence of this sentence to be interpreted as a sort of a
very weak recommendation that throwing an exception might be a better
solution than some of the alternatives? (IMHO, it's certainly a better
solution than formatting the hard disk. Compared to simply crashing, or
invoking a user defined callback which aborts by default, I'm less
sure.)

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


ka...@gabi-soft.fr

Sep 20, 2004, 3:08:57 PM

David Abrahams <da...@boost-consulting.com> wrote in message
news:<u656c2...@boost-consulting.com>...
> ka...@gabi-soft.fr writes:

> >> Note that I am not saying anything about if only readers are
> >> involved.

> > Your referring to the guarantees of a non-COW implementation led me
> > astray, because a non-COW implementation (including the one
> > available at the SGI site) gives the usual Posix guarantee -- if no
> > thread ever modifies the object, then you do not need
> > synchronization to read it.

> This has nothing to do with POSIX really. It's all about the
> difference between logical and physical constness.

Sort of. Even with a non-COW string, we are talking about logical
constness. What eventually gets modified is not the bits in the
actual string object, but rather the chars in an array which is pointed
to by the string object.

The key difference is that in one case, the string object can undertake
modifications even when the user code doesn't modify anything. In the
other, it doesn't.

> Herb is claiming that even a non-COW implementation of std::string is
> allowed to have mutable internal state that is altered when you read
> it, and in that case multiple readers need to be synchronized. While
> he's technically correct, I have the feeling it's an unfair argument,
> though I can't really justify my feeling.

The problem is simple: if we are talking about what an implementation
has a right to do, any implementation can contain modifiable state.
There's no doubt about it. Once we start speaking of COW, however, we
are talking about specific implementations. If we understand non-COW to
mean all possible implementations except one particular implementation,
then Herb is right. If we understand it simply as a set of concrete,
reasonable implementations, then he isn't, as long as no implementation
which mutates when it isn't necessary is in that set.
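
To make the distinction concrete, here is a small sketch of the two kinds of implementation being contrasted. Both class names are invented for the example and neither is taken from any real library: `PlainText` touches no state in its const function, while `LazyText` writes to `mutable` members the first time it is read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Reading this object never writes to it, so concurrent readers need no lock.
class PlainText {
public:
    explicit PlainText(const char* s) : text_(s), length_(std::strlen(s)) {}
    std::size_t length() const { return length_; }
    const char* c_str() const { return text_; }
private:
    const char* text_;
    std::size_t length_;
};

// Logically const, physically not: the first call to length() stores a
// cached value. Two unsynchronized readers would race on these writes.
class LazyText {
public:
    explicit LazyText(const char* s) : text_(s) {}
    std::size_t length() const {
        if (!cached_) {
            length_ = std::strlen(text_);  // write inside a const function
            cached_ = true;
        }
        return length_;
    }
    bool has_cached() const { return cached_; }
private:
    const char* text_;
    mutable std::size_t length_ = 0;
    mutable bool cached_ = false;
};

// Shows that a "read" of a const LazyText changes its internal state.
bool const_read_mutates() {
    const LazyText t("hello");
    const bool before = t.has_cached();
    const std::size_t n = t.length();
    return !before && t.has_cached() && n == 5;
}
```

Nothing in the interface tells the two apart, which is why the guarantee has to be stated in the documentation rather than inferred from "it's non-COW".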

> It's certainly possible to implement a non-COW string that doesn't
> have mutable internal state, and in fact, the most naive
> implementation works that way. Probably the right measure for minimal
> thread safety is "as threadsafe as int", not "as threadsafe as a
> non-COW version," since as Herb has demonstrated, "it's non-COW" is
> almost devoid of thread-safety implications.

Agreed. That's really what I mean when I speak of the "Posix
guarantee". Because, of course, Posix doesn't even know
std::basic_string exists, much less guarantee anything for it.

While I'm at it, I'll repeat a plea I've made before. What does Windows
guarantee with regards to threads and int? The same as Posix? More?
Less? Or does Windows just count on the fact that its platform is a
single processor IA-32 architecture, where the hardware gives enough
useful guarantees that no more is really necessary, and compilers which
don't optimize too aggressively?

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


ka...@gabi-soft.fr

Sep 20, 2004, 3:09:18 PM

Ben Hutchings <ben-publ...@decadentplace.org.uk> wrote in message
news:<slrnckpm59.pk.b...@decadentplace.org.uk>...

> ka...@gabi-soft.fr wrote:
> <snip>
> > The SGI implementation of std::basic_string (which is used in the
> > STLport) does guarantee this, explicitly. So does the Rogue Wave
> > implementation that comes with Sun CC, and it uses COW (but it also
> > uses a lot of locks, with somewhat unpleasant consequences on
> > performance). I don't know what formal guarantees Dinkumware gives
> > us, but since from what I hear, they don't use COW (and I'm pretty
> > sure that they don't go around inserting extra writes in const
> > functions just to cause problems), in practice, they also give this
> > guarantee.
> <snip>

> The old Dinkumware implementation of std::basic_string shipped with
> Visual C++ 6.0 uses COW and its reference-counting is not thread-safe.
> The current implementation does not use COW.

I was aware that the old implementation did use COW, but I'm sort of
surprised that it isn't thread safe. From what I understand, most
Windows applications are multi-threaded. And I know with the non-thread
safe version of COW string in g++ 2.95.2, you generally get a core dump
in less than a minute in a multithreaded environment, as soon as you
start your load tests.
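
For readers who have not seen one, the mechanism at issue looks roughly like this. It is a deliberately tiny, single-threaded sketch, not the Dinkumware or g++ code: `CowString` is a made-up name, and `std::shared_ptr` stands in for the hand-written reference count those implementations used. Note that it hands out no non-const references, which is the part that makes a real std::string so much harder.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical copy-on-write string: copies share one buffer until a write.
class CowString {
public:
    explicit CowString(const char* s)
        : data_(std::make_shared<std::string>(s)) {}

    // Copying only bumps the shared reference count (compiler-generated).

    const std::string& str() const { return *data_; }

    bool shares_buffer_with(const CowString& other) const {
        return data_ == other.data_;
    }

    void append(char c) {
        detach();            // take a private copy first if the buffer is shared
        data_->push_back(c);
    }

private:
    void detach() {
        // This check-then-copy is the step that needs care with threads.
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
    }

    std::shared_ptr<std::string> data_;
};

// A copy shares the buffer; the first write to the copy separates them.
bool copy_then_write_detaches() {
    CowString a("abc");
    CowString b = a;
    if (!a.shares_buffer_with(b)) return false;
    b.append('d');
    return !a.shares_buffer_with(b) && a.str() == "abc" && b.str() == "abcd";
}
```

If the count behind `a` and `b` is updated with plain, non-atomic increments while the two objects are used from different threads, it can end up wrong, and a buffer may be freed while still in use. That is consistent with the crashes under load described above.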

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


David Abrahams

Sep 20, 2004, 3:11:27 PM

Sean Kelly <se...@f4.ca> writes:

> > It's certainly possible to implement a
> > non-COW string that doesn't have mutable internal state, and in fact,
> > the most naive implementation works that way. Probably the right
> > measure for minimal thread safety is "as threadsafe as int", not "as
> > threadsafe as a non-COW version," since as Herb has demonstrated,
> > "it's non-COW" is almost devoid of thread-safety implications.
>
> Perhaps "as threadsafe as a dynamic array of the same element type?"
> The suggestion that it is "as threadsafe as int" implies that even write
> operations might be marginally atomic,

What does "marginally atomic" mean? Either it's atomic or it isn't.
I think I understand what you're getting at, though I'm not sure it
clarifies more than it complicates.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com


David Abrahams

Sep 20, 2004, 3:11:49 PM

Hyman Rosen <hyr...@mail.com> writes:

> David Abrahams wrote:
> > I can actually imagine legit uses for intentionally throwing dtors,
> > though I've never needed it badly enough to do it myself.
>
> I've written classes, generally versions of output formatters,
> that are intended to be created only as temporaries and which
> use operator<< to gather up arguments and do the real output
> work in their destructors. That work may involve operations
> which can throw. This is slightly risky - if one of the calls
> to generate an argument throws and then the destructor also
> throws, the program would abort, but I find the idiom useful
> and the likelihood of a problem small.

<shiver> That's more than slightly risky -- it's just as serious a
case as any that involves a throwing dtor.
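
One way to take the edge off this, sketched below, relies on `std::uncaught_exceptions()`, which only arrived in C++17 and so was not available when this was written. The class `Flusher` and its `fail` flag are invented for the example. The destructor throws only when no new exception has started unwinding since the object was constructed.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical object whose cleanup step can fail.
class Flusher {
public:
    explicit Flusher(bool fail)
        : fail_(fail), in_flight_(std::uncaught_exceptions()) {}

    ~Flusher() noexcept(false) {
        if (!fail_) return;
        // Same count as at construction: we are not being destroyed by
        // stack unwinding, so throwing cannot collide with another exception.
        if (std::uncaught_exceptions() == in_flight_) {
            throw std::runtime_error("flush failed");
        }
        // Otherwise an exception is already propagating; swallow the failure.
    }

private:
    bool fail_;
    int in_flight_;
};

// Returns the message of whichever exception reaches the caller.
std::string observe(bool throw_in_body) {
    try {
        Flusher f(true);
        if (throw_in_body) throw std::logic_error("body failed");
    } catch (const std::exception& e) {
        return e.what();
    }
    return "no exception";
}
```

This avoids the abort, at the price of silently dropping the destructor's error in exactly the case where two things went wrong at once.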

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com


ka...@gabi-soft.fr

Sep 20, 2004, 3:33:42 PM

Herb Sutter <hsu...@gotw.ca> wrote in message

news:<2dgjk0hpr9iium9i2...@4ax.com>...

> >If you think that it is a defect, or at least, something we should
> >change, I'm behind you 100%. It means that in theory, at least, it
> >is impossible to write exception safe code without depending on
> >implementation defined behavior.

> That's my fear now. Let me use this article to repost in this new
> thread the essentials of a separate reply on the original thread.

Good idea. That consolidates the discussion in one thread.

> To recap, the basic point under discussion is what "Throws:"
> specifications actually specify: Are they restrictive, or can
> implementations throw additional types of exceptions and/or under
> different circumstances than described in a "Throws:" clause?

> Some interpret 17.4.4.8(1) and (3) to mean that only exception
> specifications may not be violated, and that "Throws:" clauses are not
> restrictive so that implementations are free to throw additional
> exceptions and/or under additional circumstances.

> Others interpret 17.4.4.8(1) to mean that Throws: clauses and
> exception specifications taken together may not be violated, thus that
> both are restrictive so that implementations are not free to throw
> additional exceptions and/or under additional circumstances. In
> particular, (1) does not say that a standard library function "can"
> report a failure by throwing other types of exceptions. Further, we
> have 17.3.1.3(3) which says that "Throws:" clauses document "any
> exceptions thrown by the function, and the conditions that would cause
> the exception."

Note that Andrew Koenig and Dave Abrahams have weighed in saying that
the first interpretation was what was intended. Given their respective
roles in the standardization and the definition of exceptions in the
library, I find that rather conclusive.

With regards to intent, at least.

> Here are a few potential defects:

> 1. In 17.4.4.8(1), almost certainly "can" should be replaced by
> "shall" at least for "Throws:" paragraphs, unless we really intended
> to additionally grant license for functions to not throw the
> specified exceptions under the specified conditions in "Throws:"
> paragraphs. Alternatively, we could change "would" to "shall" in
> 17.3.1.3(3) line 6.

I sort of agree, with perhaps one hesitation -- what happens if the
implementation exceeds some resource limit at the same time it
encounters the condition requiring the exception? But more on this
later. (I mention it here because Andy mentioned it as the motivation
for the current wording.)

> 2. We need to reword 17.4.4.8 and 17.3.1.3 to make it clear whether or
> not "Throws:" is restrictive.

> 3. If we intended/decide that "Throws:" shouldn't be restrictive, we
> probably want to fix some functions whose Throws: paragraphs appear to
> be inappropriate under that interpretation, including the "Throws:
> Nothing" paragraphs in Clause 23.2.2 (std::list) which AFAIK were
> definitely intended to be restrictive (e.g., list::splice).

> 4. If we intended/decide that "Throws:" should be restrictive, we
> probably want to fix a different set of functions whose Throws:
> paragraphs appear to be inappropriate under _that_ interpretation,
> including the basic_string constructor that takes another string and
> offset(s), whose Throws: appears in 21.3.1(4) and should probably
> allow more than just out_of_range (e.g., bad_alloc if allocation
> fails).
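
The constructor in point 4 is a convenient test case, so here is a small probe for it. The helper and enum names are made up for the example. It reports which of the two exceptions in question, if either, the constructor produced.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <string>

enum class CtorOutcome { Constructed, OutOfRange, BadAlloc };

// Calls basic_string(const basic_string&, size_type pos) and classifies
// the result.
CtorOutcome try_substring_ctor(const std::string& source,
                               std::string::size_type pos) {
    try {
        std::string copy(source, pos);  // documented: out_of_range if pos > size()
        (void)copy;
        return CtorOutcome::Constructed;
    } catch (const std::out_of_range&) {
        return CtorOutcome::OutOfRange;
    } catch (const std::bad_alloc&) {
        // Not listed in the Throws: paragraph, yet a copy has to allocate.
        return CtorOutcome::BadAlloc;
    }
}
```

For a five-character source, `pos == 5` is still valid and yields an empty string, while `pos == 6` gives out_of_range. The bad_alloc branch is the case the Throws: paragraph does not mention.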

I'll throw the following idea out for discussion:

- We reword 17.4.4.8 to say that both Throws: and exception
specifications are normative: the Throws: specifies what the function
*must* throw, under what condition, and the exception specification
specifies what it can throw.

- We add text (probably in the form of a note) there to remind that in
all cases, exceeding implementation limits is undefined behavior,
which means that, amongst other things, the implementation can throw
anything it wants.

- We then clean up all of the Throws: clauses and all of the exception
specifications to conform to what is wanted, according to the
above. In particular, Throws: nothing should disappear.

On the other hand, maybe there is an advantage in indicating somehow
that an implementation should try to avoid throwing from a
particular function, if possible, without making it absolutely
mandatory. (I'm not sure if I am really clear here. My idea is
something along the lines that if an implementation does detect when
resources are exceeded, and throws, that it would be better if at
all possible that it do so in another function.)

Anyhow, as I said, these are just ideas for discussion. I wasn't
present when the issue was originally discussed, and there are probably
important points which I'm not taking into account.

Also, we shouldn't ignore the point that the current specification IS
working. There is some value in leaving a maximum of freedom to the
implementation, as long as it is understood that that freedom will only
be used when absolutely necessary. And that is, in fact, the case
today. Implementations aren't abusing this freedom.

> James notes:
> >I don't know if this is a defect in the standard, or whether it was
> >intentional. One can easily imagine that the authors really intended
> >to cover three cases:

> > - No specifications as to what happens on implementation defined
> > errors -- the function has neither a throws clause nor an exception
> > specification. (This is the most common case.)

> > - The standard specifies exactly and fully all possible exceptions:
> > the function has an exception specification (and probably a throws
> > clause as well).

> > - The standard specifies precise exceptions for certain types of
> > errors, but leaves the implementation free, as in the first case,
> > for all other possible errors: the function has a throws clause, but
> > no exception specification.

> >Unless there is a defect report on this, I'd say that the most probable
> >intention was to allow the three cases,

> >If you think that this is a misinterpretation of the standard, please
> >explain why.

> This is one possible intent, but I don't think the above is consistent
> with the "Throws: Nothing" clauses on std::list functions, which would
> fall into your third case but which almost certainly were intended to
> be guaranteed-nonthrowing operations.

Agreed. My comments only concern §17.4.4.8; it is quite possible that
there are inconsistencies elsewhere.

Of course, it is also possible that the intent behind the "Throws:
Nothing" clauses is to simply recommend: don't throw if you can possibly
avoid it (as opposed to "throwing here will make you non-conformant",
e.g. as in destructors or when there is an empty exception
specification).

I do think that there is some justification for three levels of
specification:

- A fully restrictive one: the standard specifies exactly what should
be thrown, and when. Of course, the "exceeding resources" cop-out
still exists. If you get a stack overflow when calling
basic_string::at(), then what happens is undefined behavior, and
there is no guarantee that you will get an out_of_range, even if
your index is out of range.

- A recommended practice: the standard suggests what is the preferred
behavior, but an implementation doesn't automatically become
non-conformant if for some reason, this preferred behavior is
not reasonable on its target platform.

- The standard leaves full liberty up to the implementation. Offhand,
I'd suggest that this be the case for most non-const functions, with
just enough exceptions to make exception safe code possible
(e.g. functions like swap).
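
From the user's side, the first level looks like the sketch below (the helper and enum names are invented for the example). Code written against a fully restrictive Throws: clause catches the documented exception. The catch-all branch is the one that can only be reached if the implementation uses the latitude discussed in this thread.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

enum class AtResult { Ok, OutOfRange, OtherException };

// Calls basic_string::at() and reports how the call ended.
AtResult probe_at(const std::string& s, std::string::size_type index) {
    try {
        (void)s.at(index);            // documented: out_of_range if index >= size()
        return AtResult::Ok;
    } catch (const std::out_of_range&) {
        return AtResult::OutOfRange;  // the exception named in the Throws: clause
    } catch (...) {
        return AtResult::OtherException;  // anything else the implementation chose
    }
}
```

Whether a conforming library may ever take the last branch is exactly what depends on how 17.4.4.8 is read.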

> I suspect that no single interpretation of "Throws:" clauses in the
> current standard can be consistent right now, because "Throws:"
> appears to be used inconsistently -- nonrestrictively in some places
> (e.g., std::basic_string) and restrictively in others (e.g.,
> std::list).

It's hard to say, because the standard never really officially takes a
position on recommended practices. It does so in an ad hoc manner
(e.g. like in the case of the semantics of a reinterpret_cast). If the
Throws: clauses really are just "recommended practice", maybe what you
see as inconsistency is intentional. (Of course, if this is the case,
I'd much prefer that the standard says so explicitly. Something along
the lines of "The Throws: paragraphs are not normative, but it is
expected that an implementation respect them whenever reasonable.")

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Sean Kelly

Sep 21, 2004, 7:14:46 AM

Francis Glassborow wrote:

> In article <ubrg3d...@boost-consulting.com>, David Abrahams
> <da...@boost-consulting.com> writes
>
>>No, if the ctor throws the dtor isn't called.
>
> That is true today but not once we have forwarding ctors. This is an
> issue I had not considered and one that I think needs consideration.
> Perhaps it is as simple as saying that having a throwing dtor in a class
> with a throwing forwarding ctor is an unwise design but perhaps we
> should say more.

So say I have something like this (assume the syntax is meant to
illustrate a forwarding ctor):

class C {
public:
    C() {}
    C(int) { this->C(); throw ""; }
    ~C() {}
};

You're saying that the dtor would be called? If so, what's the
reasoning behind this?
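
For reference, the feature was eventually standardized in C++11 as delegating constructors, with a different syntax from the placeholder above. Under those rules the object counts as constructed once the target constructor has finished, so the destructor does run if the delegating constructor then exits with an exception. A sketch that counts destructor calls (the counter and helper function are additions for the example):

```cpp
#include <cassert>
#include <stdexcept>

struct C {
    static int dtor_calls;

    C() {}
    // Delegates to C(), then fails in its own body.
    explicit C(int) : C() { throw std::runtime_error("delegating ctor failed"); }
    ~C() { ++dtor_calls; }
};

int C::dtor_calls = 0;

// Returns how many times ~C ran while the failed construction unwound.
int dtor_calls_after_failed_construction() {
    const int before = C::dtor_calls;
    try {
        C c(1);
        (void)c;
    } catch (const std::runtime_error&) {
        // construction failed; nothing to do
    }
    return C::dtor_calls - before;
}
```

The reasoning is that by the time the body of `C(int)` runs, `C()` has completed and every member is fully constructed, so the destructor is the right thing to clean it up.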

Sean

Sean Kelly

Sep 21, 2004, 7:17:08 AM

David Abrahams wrote:

> Sean Kelly <se...@f4.ca> writes:
>
>> > It's certainly possible to implement a
>> > non-COW string that doesn't have mutable internal state, and in fact,
>> > the most naive implementation works that way. Probably the right
>> > measure for minimal thread safety is "as threadsafe as int", not "as
>> > threadsafe as a non-COW version," since as Herb has demonstrated,
>> > "it's non-COW" is almost devoid of thread-safety implications.
>>
>>Perhaps "as threadsafe as a dynamic array of the same element type?"
>>The suggestion that it is "as threadsafe as int" implies that even write
>>operations might be marginally atomic,
>
> What does "marginally atomic" mean? Either it's atomic or it isn't.
> I think I understand what you're getting at, though I'm not sure it
> clarifies more than it complicates.

I suppose I could have chosen better wording :) I meant that on some
architectures (notably x86) integer writes are atomic in that no
synchronization is required to maintain data integrity (assuming the
data is aligned properly). However this ignores the issue of memory
visibility, so I settled on "marginally atomic" to describe a situation
where data integrity is maintained but algorithm integrity may be lost.

Considering a vector as a unit comparable to int however, it is possible
that an unsynchronized write operation (push_back, for example) may
result in data corruption.
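
As a sketch of what the synchronized version looks like, here is the same push_back guarded by a mutex. It uses the C++11 thread library, which postdates this discussion, and the function name is invented for the example. Without the lock, concurrent push_back calls race on the vector's size and storage.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Several threads append to one vector; the mutex serializes each push_back.
std::size_t fill_from_threads(int thread_count, int pushes_per_thread) {
    std::vector<int> values;
    std::mutex guard;
    std::vector<std::thread> workers;

    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&values, &guard, pushes_per_thread, t] {
            for (int i = 0; i < pushes_per_thread; ++i) {
                std::lock_guard<std::mutex> lock(guard);
                values.push_back(t);  // may reallocate, hence the lock
            }
        });
    }
    for (std::thread& worker : workers) {
        worker.join();
    }
    return values.size();
}
```

With the lock in place every element arrives; the comparison with int breaks down because a single push_back is many separate writes, not one.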

Sean

Balog Pal

Sep 21, 2004, 7:22:00 AM

"David Abrahams" <da...@boost-consulting.com> wrote in message
news:u656c2...@boost-consulting.com...

> Probably the right

> measure for minimal thread safety is "as threadsafe as int", not "as
> threadsafe as a non-COW version,"

How about "as threadsafe as double" instead of int?

I think int is supposed to work for any bit pattern in it -- so one could think
it is okay to read and use a however-broken int. The value can be used in a
comparison, or as input to CAS. A broken double, on the other hand, could nuke the system.

Francis Glassborow

Sep 21, 2004, 2:26:14 PM

In article <P6T8gbRu...@robinton.demon.co.uk>, Francis Glassborow
<fra...@robinton.demon.co.uk> writes

>In article <ubrg3d...@boost-consulting.com>, David Abrahams
><da...@boost-consulting.com> writes
>>No, if the ctor throws the dtor isn't called.
>
>That is true today but not once we have forwarding ctors. This is an
>issue I had not considered and one that I think needs consideration.
>Perhaps it is as simple as saying that having a throwing dtor in a class
>with a throwing forwarding ctor is an unwise design but perhaps we
>should say more.

At a meeting today we briefly discussed this issue and then realised
that it is no different from the current case where a ctor throws and a
sub-object's dtor throws. If programmers insist on writing dtors that
throw there will be consequences.

David Abrahams

Sep 21, 2004, 2:31:48 PM

"Balog Pal" <pa...@lib.hu> writes:

> "David Abrahams" <da...@boost-consulting.com> wrote in message
> news:u656c2...@boost-consulting.com...
>
> > Probably the right
> > measure for minimal thread safety is "as threadsafe as int", not "as
> > threadsafe as a non-COW version,"
>
> How about "as threadsafe as double" instead of int?
>
> I think int is supposed to work for any bit pattern in it

Is that in the standard somewhere?? That would surprise me.
Certainly that's not true of pointers.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com


Ben Hutchings

Sep 21, 2004, 2:34:04 PM

ka...@gabi-soft.fr wrote:
<snip>

> While I'm at it, I'll repeat a plea I've made before. What does Windows
> guarantee with regards to threads and int?

As you know, the Windows memory model is somewhat underspecified, but
the Platform SDK documentation says:

"Simple reads and writes to properly-aligned 32-bit variables are
atomic. In other words, when one thread is updating a 32-bit
variable, you will not end up with only one portion of the
variable updated; all 32 bits are updated in an atomic fashion.
...

"Simple reads and writes to properly aligned 64-bit variables are
atomic on 64-bit Windows. Reads and writes to 64-bit values are
not guaranteed to be atomic on 32-bit Windows. Reads and writes to
variables of other sizes are not guaranteed to be atomic on any
platform.

"The interlocked functions should be used to perform complex
operations in an atomic manner."

(Quoted from
<http://msdn.microsoft.com/library/en-us/dllproc/base/interlocked_variable_access.asp>.)
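
The last quoted paragraph is the one that matters for a COW string's reference count: an increment or decrement is a read-modify-write, not a "simple read or write". A portable sketch follows, using C++11 `std::atomic` as a stand-in for the Win32 interlocked functions. That substitution is an assumption made for the example, as are the class and function names.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference count of the kind a shared string buffer needs.
class RefCount {
public:
    // One more owner: an atomic read-modify-write, not a plain ++count.
    void add_ref() { count_.fetch_add(1, std::memory_order_relaxed); }

    // One owner fewer; returns true for the caller that dropped the last
    // reference and therefore has to free the buffer.
    bool release() {
        return count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

private:
    std::atomic<int> count_{1};  // starts with a single owner
};

// Two owners release in turn; only the second one is told to clean up.
bool last_release_is_detected() {
    RefCount rc;
    rc.add_ref();
    const bool first = rc.release();
    const bool second = rc.release();
    return !first && second;
}
```

Because the decrement and the test of its result are one atomic step, exactly one thread sees the count reach zero, which is the property a plain aligned 32-bit write does not give you.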

> The same as Posix? More? Less? Or does Windows just count on the
> fact that its platform is a single processor IA-32 architecture,
> where the hardware gives enough useful guarantees that no more is
> really necessary, and compilers which don't optimize too
> aggressively?

The average Windows programmer may still assume that the IA32 single-
processor memory model is valid everywhere, but the implementers of
Windows certainly don't. Windows NT and CE have collectively run on
at least 10 different architectures and there are current versions of
Windows NT for multiprocessor i386, PowerPC, IA64 and AMD64 systems.
(The PowerPC version is for use in the X-Box 2 and is not otherwise
available.)

--
Ben Hutchings
The generation of random numbers is too important to be left to chance.

Balog Pal

Sep 21, 2004, 6:50:02 PM

"David Abrahams" <da...@boost-consulting.com> wrote in message

news:ullf3y...@boost-consulting.com...

> > I think int is supposed to work for any bit pattern in it
>
> Is that in the standard somewhere?? That would surprise me.

Not in the C++ document (where only unsigned char is mentioned with that
property). If anywhere it can be in the C standard, where restriction on
representation (1,2 complement, SM) also appear. Unfortunately I don't
have that standard to check, hopefully someone can help out.

> Certainly that's not true of pointers.

Sure, it's not true for most stuff, so why not choose something for the picture
that rings the proper bells aloud? (As despite the explicit requirements, int
really covers all the bits on most platforms used today.)

Paul

ka...@gabi-soft.fr

Sep 22, 2004, 2:41:01 PM

Ben Hutchings <ben-publ...@decadentplace.org.uk> wrote in message

news:<slrncl07t0.pk.b...@decadentplace.org.uk>...

> ka...@gabi-soft.fr wrote:
> <snip>
> > While I'm at it, I'll repeat a plea I've made before. What does
> > Windows guarantee with regards to threads and int?

> As you know, the Windows memory model is somewhat underspecified,

So is Posix, for that matter:-). But I know where to find what is
specified. That's really the difference.

> but the Platform SDK documentation says:

> "Simple reads and writes to properly-aligned 32-bit variables are
> atomic. In other words, when one thread is updating a 32-bit
> variable, you will not end up with only one portion of the
> variable updated; all 32 bits are updated in an atomic fashion.
> ...

> "Simple reads and writes to properly aligned 64-bit variables are
> atomic on 64-bit Windows. Reads and writes to 64-bit values are
> not guaranteed to be atomic on 32-bit Windows. Reads and writes to
> variables of other sizes are not guaranteed to be atomic on any
> platform.

That's interesting, but it doesn't help much. (It's really just a
statement of what Windows expects from the hardware.)

> "The interlocked functions should be used to perform complex
> operations in an atomic manner."

Atomicity is only part of the problem. There are ordering issues.
Suppose I have two global variables a and b of an atomic type AtomicInt,
and both are initially 0. I then execute the following code in a
thread:

a = 1 ;
b = 1 ;

Obviously, another thread can see 0,0; 1,0 or 1,1. According to Posix,
unless I add additional primitives, it can also see 0,1, and this can
really occur in practice on much hardware. (I seem to recall reading
that IA-32 guarantees that it cannot.)
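
The "additional primitives" can be spelled portably since C++11. The sketch below is written with `std::atomic`, which did not exist at the time of this thread; the `Snapshot` struct and the function names are additions for the example. A release store to b paired with an acquire load of b rules out the 0,1 observation.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> a{0};
std::atomic<int> b{0};

// Meant to run in one thread.
void writer() {
    a.store(1, std::memory_order_relaxed);
    b.store(1, std::memory_order_release);  // publishes the earlier write to a
}

struct Snapshot {
    int a;
    int b;
};

// Meant to run in another thread: reads b first, then a.
Snapshot reader() {
    Snapshot s;
    s.b = b.load(std::memory_order_acquire);  // pairs with the release store
    s.a = a.load(std::memory_order_relaxed);
    return s;
}

// With release/acquire in place, a == 0 together with b == 1 cannot be seen.
bool is_allowed(const Snapshot& s) {
    return !(s.a == 0 && s.b == 1);
}

// Single-threaded walk through the before and after states.
bool publish_then_observe() {
    const Snapshot before = reader();
    writer();
    const Snapshot after = reader();
    return is_allowed(before) && after.a == 1 && after.b == 1;
}
```

If both stores and both loads are relaxed, 0,1 becomes a permitted outcome again, which is the situation described above for hardware with a weak memory model.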

> (Quoted from
> <http://msdn.microsoft.com/library/en-us/dllproc/base/interlocked_variable_access.asp>.)

> > The same as Posix? More? Less? Or does Windows just count on the
> > fact that its platform is a single processor IA-32 architecture,
> > where the hardware gives enough useful guarantees that no more is
> > really necessary, and compilers which don't optimize too
> > aggressively?

> The average Windows programmer may still assume that the IA32 single-
> processor memory model is valid everywhere, but the implementers of
> Windows certainly don't.

The implementers of Windows NT came largely from DEC, and were certainly
familiar with the Alpha (which offers about the weakest guarantees
around). The question isn't so much what the implementers of the system
know, but what they are willing to contractually guarantee.

> Windows NT and CE have collectively run on at least 10 different
> architectures and there are current versions of Windows NT for
> multiprocessor i386, PowerPC, IA64 and AMD64 systems. (The PowerPC
> version is for use in the X-Box 2 and is not otherwise available.)

And in the past, they ran on Alpha. But in practice, the IA-32 platform
is by far the most important one.

In fact, I'm not too worried about Windows itself. I feel fairly
confident in the ability of the people implementing the OS to implement
whatever guarantees they make correctly. The problem is more one of
documentation -- what guarantees do they implement intentionally, as
opposed to things that just happen to work on one common platform.

--
James Kanze GABI Software http://www.gabi-soft.fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Graeme Prentice

Sep 22, 2004, 2:48:59 PM

On 21 Sep 2004 18:50:02 -0400, Balog Pal wrote:

>"David Abrahams" <da...@boost-consulting.com> wrote in message
>news:ullf3y...@boost-consulting.com...
>
>> > I think int is supposed to work for any bit pattern in it
>>
>> Is that in the standard somewhere?? That would surprise me.
>
>Not in the C++ document (where only unsigned char is mentioned with that
>property). If anywhere it can be in the C standard, where restriction on
>representation (1,2 complement, SM) also appear. Unfortunately I don't
>have that standard to check, hopefully someone can help out.
>

3.9 para 4 in the C++ standard refers to non-normative footnote 37 which
says "The intent is that the memory model of C++ is compatible with
that of ISO/IEC 9899 Programming Language C."

and the C99 standard says explicitly that trap representations can occur
for all types other than character types. - 6.2.6.1 para 5
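
In practice this is why code that wants to look at the bits of an int or a double goes through unsigned char. A small sketch, with helper names invented for the example, that copies an object's representation out and back without ever reading a possibly invalid pattern as the original type:

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Copies the object representation of a trivially copyable value into
// unsigned chars, the one type that is valid for every bit pattern.
template <typename T>
std::array<unsigned char, sizeof(T)> bytes_of(const T& value) {
    std::array<unsigned char, sizeof(T)> out{};
    std::memcpy(out.data(), &value, sizeof(T));
    return out;
}

// Rebuilds a value from bytes that were produced by bytes_of for the same T.
template <typename T>
T from_bytes(const std::array<unsigned char, sizeof(T)>& bytes) {
    T value;
    std::memcpy(&value, bytes.data(), sizeof(T));
    return value;
}
```

A round trip such as `from_bytes<double>(bytes_of(1.5))` is safe because the bytes came from a valid double. Feeding arbitrary bytes to `from_bytes<double>` and then using the result is where a trap representation could, in principle, bite.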

Graeme

Sean Kelly

Sep 23, 2004, 4:13:18 AM

"Balog Pal" <pa...@lib.hu> wrote in message news:<4150...@andromeda.datanet.hu>...

> "David Abrahams" <da...@boost-consulting.com> wrote in message
> news:ullf3y...@boost-consulting.com...
>
> > > I think int is supposed to work for any bit pattern in it
> >
> > Is that in the standard somewhere?? That would surprise me.
>
> Not in the C++ document (where only unsigned char is mentioned with that
> property). If anywhere it can be in the C standard, where restriction on
> representation (1,2 complement, SM) also appear. Unfortunately I don't
> have that standard to check, hopefully someone can help out.

I can't find anything like that in the C99 standard. From my reading
of 6.2.6.1 C99 only provides that guarantee for unsigned char.

Sean

Ben Hutchings

Sep 24, 2004, 11:42:46 AM

ka...@gabi-soft.fr wrote:
> Ben Hutchings <ben-publ...@decadentplace.org.uk> wrote in message
> news:<slrncl07t0.pk.b...@decadentplace.org.uk>...
>> ka...@gabi-soft.fr wrote:
>> <snip>
>> > While I'm at it, I'll repeat a plea I've made before. What does
>> > Windows guarantee with regards to threads and int?
>
>> As you know, the Windows memory model is somewhat underspecified,
>
> So is Posix, for that matter:-). But I know where to find what is
> specified. That's really the difference.

<snip>

>> "The interlocked functions should be used to perform complex
>> operations in an atomic manner."
>
> Atomicity is only part of the problem. There are ordering issues.

I realise that. I left it implicit that ordering is *not* guaranteed
for ordinary variable access. The interlocked functions provide
memory barriers as well as atomic addition, CAS, etc.

The odd thing is that there is no obvious way to read with an acquire
or any other kind of memory barrier. It is possible to do this using
InterlockedCompareExchange with the exchange and comparand set to the
same value, preferably an unlikely one to avoid invalidating cache
lines. (John Torjo suggested this on the Boost mailing list.) This
provides a full memory barrier; recent versions of Windows have
variants of the function that provide an acquire or release barrier.
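
For comparison, here is roughly how the same trick looks with std::atomic
from later standard C++ (C++11, which postdates this thread). It is only a
sketch of the idea and not the Win32 call itself: read_via_cas is a made-up
name, and -1 is an arbitrary stand-in for the "unlikely" value.

```cpp
#include <atomic>
#include <cassert>

// Reads v through a compare-and-swap whose comparand and exchange value
// are identical, so the stored value never changes.
long read_via_cas(std::atomic<long>& v)
{
    long expected = -1;              // arbitrary "unlikely" value (assumption)
    const long same = expected;      // exchange value == comparand
    v.compare_exchange_strong(expected, same, std::memory_order_seq_cst);
    // On failure, expected has been overwritten with the current value.
    // On success, v already held -1, which is what expected still holds.
    return expected;
}
```

With std::atomic the detour is not actually needed, since
v.load(std::memory_order_acquire) is a direct read with acquire ordering;
the sketch only mirrors the workaround described above.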

> Suppose I have two global variables a and b of an atomic type AtomicInt,
> and both are initially 0. I then execute the following code in a
> thread:
>
> a = 1 ;
> b = 1 ;
>
> Obviously, another thread can see 0,0; 1,0 or 1,1. According to Posix,
> unless I add additional primitives, it can also see 0,1, and this can
> really occur in practice on much hardware.

The same is true in Windows environments, and this can happen in
practice on IA64.
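
To make the "additional primitives" concrete, here is a sketch using
std::atomic from later standard C++ (C++11, not available when this was
written). The names publish and observe are invented for illustration. The
release store to b paired with the acquire load of b is what rules out the
0,1 observation; with relaxed ordering on both, 0,1 would be permitted.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Stand-ins for the globals a and b in the quoted example.
std::atomic<int> a{0};
std::atomic<int> b{0};

// Writer: the release store to b keeps the earlier store to a from
// being reordered after it.
void publish()
{
    a.store(1, std::memory_order_relaxed);
    b.store(1, std::memory_order_release);
}

// Reader: the acquire load of b keeps the later load of a from being
// reordered before it. Returns the observed (a, b) pair.
std::pair<int, int> observe()
{
    int seen_b = b.load(std::memory_order_acquire);
    int seen_a = a.load(std::memory_order_relaxed);
    return std::make_pair(seen_a, seen_b);
}
```

If publish() runs in one thread and observe() in another, the reader can
still get 0,0 or 1,0 or 1,1, but a result with b == 1 implies a == 1.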

> (I seem to recall reading that IA-32 guarantees that it cannot.)

Right.

>> > The same as Posix? More? Less?

The answer to this is, in summary, "much the same".

<snip>

> In fact, I'm not too worried about Windows itself. I feel fairly
> confident in the ability of the people implementing the OS to implement
> whatever guarantees they make correctly. The problem is more one of
> documentation -- what guarantees do they implement intentionally, as
> opposed to things that just happen to work on one common platform.

Documentation of Win32 does tend to improve gradually in response to
queries and comments.

--
Ben Hutchings
Horngren's Observation:
Among economists, the real world is often a special case.

Ben Hutchings

Sep 25, 2004, 6:22:39 AM9/25/04

to

I wrote:
> ka...@gabi-soft.fr wrote:
> > Ben Hutchings <ben-publ...@decadentplace.org.uk> wrote in message
> > news:<slrncl07t0.pk.b...@decadentplace.org.uk>...
> >> ka...@gabi-soft.fr wrote:
> >> <snip>
> >> > While I'm at it, I'll repeat a plea I've made before. What does
> >> > Windows guarantee with regards to threads and int?
> >
> >> As you know, the Windows memory model is somewhat underspecified,
> >
> > So is Posix, for that matter:-). But I know where to find what is
> > specified. That's really the difference.
><snip>
> >> "The interlocked functions should be used to perform complex
> >> operations in an atomic manner."
> >
> > Atomicity is only part of the problem. There are ordering issues.
>
> I realise that. I left it implicit that ordering is *not* guaranteed
> for ordinary variable access. The interlocked functions provide
> memory barriers as well as atomic addition, CAS, etc.

Actually I'm not certain that they do provide memory barriers, except
that:

> recent versions of Windows have variants of
> [InterlockedCompareExchange] that provide an acquire or release
> barrier.

--
Ben Hutchings
Reality is just a crutch for people who can't handle science fiction.

Alexander Terekhov

Sep 25, 2004, 6:31:48 AM9/25/04

to

Ben Hutchings wrote:
[...]

> The odd thing is that there is no obvious way to read with an acquire
> or any other kind of memory barrier. It is possible to do this using
> InterlockedCompareExchange with the exchange and comparand set to the
> same value, preferably an unlikely one to avoid invalidating cache
> lines. (John Torjo suggested this on the Boost mailing list.) This
> provides a full memory barrier; recent versions of Windows have
> variants of the function that provide a acquire or release barrier.

Take a closer look at MS docu. See also:

http://google.com/groups?threadm=XIRDc.192766%24Ly.83841%40attbi_s01
(Subject: Re: DCSI - thread safe singleton)

I mean followups too.

regards,
alexander.

Sean Kelly

Sep 26, 2004, 5:40:43 AM9/26/04

to

Ben Hutchings <ben-publ...@decadentplace.org.uk> wrote in message news:<slrncl63k0.pk.b...@decadentplace.org.uk>...

>
> The odd thing is that there is no obvious way to read with an acquire
> or any other kind of memory barrier. It is possible to do this using
> InterlockedCompareExchange with the exchange and comparand set to the
> same value, preferably an unlikely one to avoid invalidating cache
> lines. (John Torjo suggested this on the Boost mailing list.) This
> provides a full memory barrier; recent versions of Windows have
> variants of the function that provide a acquire or release barrier.

Assuming all writes involve a memory barrier, do reads even need one?

Sean

Ben Hutchings

Sep 26, 2004, 12:05:29 PM9/26/04

to

Sean Kelly wrote:
> Ben Hutchings <ben-publ...@decadentplace.org.uk> wrote in message
> news:<slrncl63k0.pk.b...@decadentplace.org.uk>...
> >
> > The odd thing is that there is no obvious way to read with an acquire
> > or any other kind of memory barrier. It is possible to do this using
> > InterlockedCompareExchange with the exchange and comparand set to the
> > same value, preferably an unlikely one to avoid invalidating cache
> > lines. (John Torjo suggested this on the Boost mailing list.) This
> > provides a full memory barrier; recent versions of Windows have
> > variants of the function that provide a acquire or release barrier.
>
> Assuming all writes involve a memory barrier, do reads even need one?

Consider this function that attempts to cache the result of a complex
calculation:

LONG calculate()
{
    static volatile LONG cached_result;
    static volatile LONG cache_valid;
    LONG result;
    if (cache_valid)
    {
        result = cached_result;
    }
    else
    {
        // calculation of result omitted
        cached_result = result;
        InterlockedCompareExchangeRelease(&cache_valid, 1, 0);
    }
    return result;
}

The volatile qualifier prevents the compiler from reordering or adding
reads, but not the processor (in practice). So supposing two threads
race through the function, this could happen:

Thread 1                                 Thread 2

read cached_result == 0 (speculative)
read cache_valid == 0; take branch
calculate result                         read cached_result == 0 (speculative)
set cached_result = result
set cache_valid = 1
return (correct result)                  read cache_valid == 1; don't take branch
                                         set result = 0 (speculatively read)
                                         return (wrong result)

I think this could be fixed by adding an acquire memory barrier to the
test of valid, e.g. replacing:
    if (cache_valid)
with
    if (InterlockedCompareExchangeAcquire(&cache_valid, 0, 0))
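
As a point of comparison, the same cache can be sketched with std::atomic
from later standard C++ (C++11, which postdates this thread), where an
acquire load is available directly. This is not the Win32 code above:
expensive_calculation is a hypothetical placeholder for the omitted
calculation, and cached_result is made atomic so that two threads racing
through the slow path do not form a data race.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical placeholder for the "calculation of result omitted" step.
long expensive_calculation()
{
    return 42;
}

long calculate()
{
    static std::atomic<long> cached_result{0};
    static std::atomic<bool> cache_valid{false};

    // Acquire load: if it sees true, the read of cached_result below
    // cannot be satisfied by a value fetched before the flag was read.
    if (cache_valid.load(std::memory_order_acquire))
        return cached_result.load(std::memory_order_relaxed);

    long result = expensive_calculation();
    cached_result.store(result, std::memory_order_relaxed);
    // Release store: the write to cached_result is visible to any thread
    // whose acquire load observes cache_valid == true.
    cache_valid.store(true, std::memory_order_release);
    return result;
}
```

Several threads may still run the calculation at the same time, as in the
original; the sketch only guarantees that a thread taking the fast path
sees the stored result.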

--
Ben Hutchings
Make three consecutive correct guesses and you will be considered an expert.

Sean Kelly

Sep 27, 2004, 4:47:59 PM9/27/04

to

Gotcha. So why not use inline assembler in such cases? Or is the
compiler not required to preserve the blocks as-is? Granted, the code
wouldn't be portable, but then neither is Interlocked*.

Sean

Peter Dimov

Sep 29, 2004, 1:31:19 PM9/29/04

to

Ben Hutchings <ben-publ...@decadentplace.org.uk> wrote in message news:<slrncldd0l.pk.b...@decadentplace.org.uk>...

I'm not sure which platform is being discussed. Is it IA32? If so, I
believe the above can't happen; read cache_valid == 1 in thread 2 will
see a newer value and invalidate the speculatively read cached_result.
If IA64, ...

> I think this could be fixed by adding an acquire memory barrier to the
> test of valid, e.g. replacing:
> if (cache_valid)
> with
> if (InterlockedCompareExchangeAcquire(&cache_valid, 0, 0))

... this doesn't seem to fix it, because ICXA only exchanges with
acquire semantics when the comparison doesn't fail.

Vladimir Marko

Oct 1, 2004, 11:43:33 AM10/1/04

to

usene...@lehrerfamily.com (Joshua Lehrer) wrote in message news:<31c49f0d.04090...@posting.google.com>...
[snip]

Thinking of COW strings I came to this example:

#include <string>
#include <iostream>

void foo(std::string& s){
    const std::string& const_s=s;
    const char& const_c=const_s[0];
    char& c=s[0]; // invalidates const_c ???
    c='X';
    std::cout << const_c << " vs. " << c << std::endl;
}

int main(){
    std::string s1("Hello!");
    std::string s2(s1);
    foo(s1);
}

I'm not sure if the marked line should invalidate const_c. To decide
this one should use 21.3/5 which says

21.3/5 References, pointers, and iterators referring to the elements
of a basic_string sequence may be invalidated by the following
uses of that basic_string object:

// 4 bullets not listing const version of operator [] ()

(5th bullet) Subsequent to any of the above uses except the forms
of insert() and erase() which return iterators, the first call to
non-const member functions operator[](), at(), begin(), rbegin(),
end(), or rend().

I went as far as to check the word "subsequent" at www.dictionary.com
(my native language is not English) to make sure I understand it
correctly, and AFAICT 21.3/5 does _not_ cover my example. But any COW
string implementation will invalidate const_c when s is shared. More
precisely, it will leave const_c pointing to the original copy of the
string, whereas in a non-COW implementation const_c and c would point
to the same char. That makes const_c a valid reference from the
language point of view and an invalid one from the library point of
view.

Please tell me that I've got something wrong here. Otherwise the COW
string would definitely be a wrong implementation (even without the
issues of thread safety and exception specifications).
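
The effect can be reproduced without relying on any particular library by
using a toy COW string. The sketch below is hypothetical and single-threaded
(CowString and references_diverge are invented names, and it uses
std::shared_ptr from later standard C++); it is not the FactSet, g++ or GotW
implementation, only the smallest thing that unshares on non-const access.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Toy copy-on-write string: copies share one buffer until a non-const
// access forces the accessing object to take a private copy.
class CowString {
public:
    explicit CowString(const char* text)
        : rep_(std::make_shared<std::string>(text)) {}

    // Const access never unshares.
    const char& operator[](std::size_t i) const { return (*rep_)[i]; }

    // Non-const access must unshare first, because the caller may write
    // through the returned reference.
    char& operator[](std::size_t i)
    {
        if (rep_.use_count() > 1)
            rep_ = std::make_shared<std::string>(*rep_);
        return (*rep_)[i];
    }

private:
    std::shared_ptr<std::string> rep_;
};

// Same sequence of operations as foo(); returns true if the two
// references ended up naming different characters.
bool references_diverge(CowString& s)
{
    const CowString& const_s = s;
    const char& const_c = const_s[0];   // refers into the shared buffer
    char& c = s[0];                     // unshares: s gets a fresh buffer
    c = 'X';
    return &const_c != &c && const_c != c;
}
```

When s shares its buffer with another CowString, references_diverge
returns true: const_c still names the 'H' in the old buffer (kept alive by
the other copy) while c names the 'X' in the new one. When s is unshared,
both references name the same char and it returns false.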

Vladimir Marko

Alexander Terekhov

Oct 1, 2004, 1:02:48 PM10/1/04

to

Peter Dimov wrote:

[... DCCI ...]

> > if (InterlockedCompareExchangeAcquire(&cache_valid, 0, 0))
>
> ... this doesn't seem to fix it, because ICXA only exchanges with
> acquire semantics when the comparison doesn't fail.

Atomic PODs aside for a moment, (double-checked concurrent init)

class stuff {

    atomic<lazy const *> m_ptr;

public:

    /* ... */

    lazy const & lazy_instance() {
        lazy const * ptr;
        if (!(ptr = m_ptr.load(msync::ddhlb)) &&
            !m_ptr.attempt_update(0, ptr = new lazy(), msync::ssb)) {
            delete ptr;
            ptr = m_ptr.load(msync::ddhlb);
        }
        return *ptr;
    }
};

m_ptr.load(msync::ddhlb) is a simple fetch on IA64 (because *ptr
is data-dependent). Only Alpha would need a barrier here, AFAIK.
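
For readers without the atomic<> and msync names used above, the same
double-checked concurrent init can be approximated with std::atomic from
later standard C++ (C++11). This is a sketch under assumptions: it treats
msync::ddhlb as roughly a dependency-ordered load and msync::ssb as roughly
a release, and it substitutes the stronger memory_order_acquire and
memory_order_acq_rel for them. The lazy struct and its value member are
placeholders.

```cpp
#include <atomic>
#include <cassert>

struct lazy {
    int value = 7;   // hypothetical payload
};

class stuff {
public:
    stuff() : m_ptr(nullptr) {}
    ~stuff() { delete m_ptr.load(std::memory_order_relaxed); }
    stuff(const stuff&) = delete;
    stuff& operator=(const stuff&) = delete;

    const lazy& lazy_instance()
    {
        const lazy* ptr = m_ptr.load(std::memory_order_acquire);
        if (!ptr) {
            const lazy* fresh = new lazy();
            const lazy* expected = nullptr;
            // Try to install our object; exactly one thread succeeds.
            if (m_ptr.compare_exchange_strong(expected, fresh,
                                              std::memory_order_acq_rel,
                                              std::memory_order_acquire)) {
                ptr = fresh;
            } else {
                // Another thread won the race: discard ours and use the
                // pointer the failed exchange wrote into expected.
                delete fresh;
                ptr = expected;
            }
        }
        return *ptr;
    }

private:
    std::atomic<const lazy*> m_ptr;
};
```

Unlike the original, the losing thread takes the winner's pointer from the
failed compare-exchange instead of issuing a second load; the observable
behaviour is the same.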

regards,
alexander.

Ben Hutchings

Oct 5, 2004, 6:12:01 PM10/5/04

to

Peter Dimov wrote:
> Ben Hutchings <ben-publ...@decadentplace.org.uk> wrote in message
> news:<slrncldd0l.pk.b...@decadentplace.org.uk>...
<snip>

>> The volatile qualifier prevents the compiler from reordering or adding
>> reads, but not the processor (in practice). So supposing two threads
>> race through the function, this could happen:
>>
>> Thread 1                                 Thread 2
>>
>> read cached_result == 0 (speculative)
>> read cache_valid == 0; take branch
>> calculate result                         read cached_result == 0 (speculative)
>> set cached_result = result
>> set cache_valid = 1
>> return (correct result)                  read cache_valid == 1; don't take branch
>>                                          set result = 0 (speculatively read)
>>                                          return (wrong result)
>
> I'm not sure which platform is being discussed.

An arbitrary Win32 platform. Currently IA64 has the weakest memory
ordering of any architecture on which Win32 has been implemented, so
let's assume that.

<snip>

> If IA64, ...
>
>> I think this could be fixed by adding an acquire memory barrier to the
>> test of valid, e.g. replacing:
>> if (cache_valid)
>> with
>> if (InterlockedCompareExchangeAcquire(&cache_valid, 0, 0))
>
> ... this doesn't seem to fix it, because ICXA only exchanges with
> acquire semantics when the comparison doesn't fail.

Well I wasn't sure about that ("I think") - what would you suggest?

--
Ben Hutchings
Nothing is ever a complete failure; it can always serve as a bad example.

Alexander Terekhov

Oct 6, 2004, 9:23:30 AM10/6/04

to

Ben Hutchings wrote:
[...]

> > I'm not sure which platform is being discussed.
>
> An arbitrary Win32 platform. Currently IA64 has the weakest memory
> ordering of any architecture on which Win32 has been implemented, so
> let's assume that.

Well,

http://www.theinquirer.net/?article=14407
http://www.gamepro.com/microsoft/xbox/hardware/news/35216.shtml

;-)

regards,
alexander.

Peter Dimov

Oct 6, 2004, 6:47:39 PM10/6/04

to

Ben Hutchings <ben-publ...@decadentplace.org.uk> wrote in message news:<slrncm4tg3.fqi.b...@decadentplace.org.uk>...

> Peter Dimov wrote:
> > Ben Hutchings <ben-publ...@decadentplace.org.uk> wrote in message
> > news:<slrncldd0l.pk.b...@decadentplace.org.uk>...
> <snip>
> >> The volatile qualifier prevents the compiler from reordering or adding
> >> reads, but not the processor (in practice). So supposing two threads
> >> race through the function, this could happen:
> >>
> >> Thread 1                                 Thread 2
> >>
> >> read cached_result == 0 (speculative)
> >> read cache_valid == 0; take branch
> >> calculate result                         read cached_result == 0 (speculative)
> >> set cached_result = result
> >> set cache_valid = 1
> >> return (correct result)                  read cache_valid == 1; don't take branch
> >>                                          set result = 0 (speculatively read)
> >>                                          return (wrong result)
> >
> > I'm not sure which platform is being discussed.
>
> An arbitrary Win32 platform. Currently IA64 has the weakest memory
> ordering of any architecture on which Win32 has been implemented, so
> let's assume that.

That would be Win64 then. ;-)

> <snip>
> > If IA64, ...
> >
> >> I think this could be fixed by adding an acquire memory barrier to the
> >> test of valid, e.g. replacing:
> >> if (cache_valid)
> >> with
> >> if (InterlockedCompareExchangeAcquire(&cache_valid, 0, 0))
> >
> > ... this doesn't seem to fix it, because ICXA only exchanges with
> > acquire semantics when the comparison doesn't fail.
>
> Well I wasn't sure about that ("I think") - what would you suggest?

I know next to nothing about IA64, but it is supposed to have an
ld.acq instruction. It doesn't seem to have a Windows equivalent,
unfortunately.