
Value of s[s.size()] when s is a string


Paul

Oct 15, 2018, 3:21:46 PM
I think that, when s is a std::string,
s[s.size()] == 0. However, I haven't been able to confirm it.
It works like that in my compiler.

Is this correct?

Thanks,

Paul Epstein

james...@alumni.caltech.edu

Oct 15, 2018, 4:22:18 PM
[Re: std::basic_string<charT> s; s[pos]]

"Returns: *(begin() + pos) if pos < size(). Otherwise, returns a
reference to an object of type charT with value charT(), where modifying
the object to any value other than charT() leads to undefined behavior."
(20.3.2.5p2)

For std::string, charT is char, so for typical uses charT() is essentially
the same as 0. That will also be true, except for the type of the result,
for the other three types that the standard templates are required to be
specialized for: char16_t, char32_t, and wchar_t.
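
The quoted wording can be checked with a small sketch. The helper name
terminator_is_zero is made up for illustration, and a C++11 or later
library is assumed:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the standard library): true if reading
// at pos == size() yields a value-initialized charT, as the quoted
// "Returns" clause requires.
template <typename String>
bool terminator_is_zero(const String& s)
{
    using CharT = typename String::value_type;
    return s[s.size()] == CharT();   // reading at pos == size() is allowed
}
```

Calling it with a std::string, std::wstring, std::u16string or
std::u32string should return true in each case.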

Paul

Oct 15, 2018, 4:34:15 PM
That seems to read as though s[s.size() + 1] is also 0, but that doesn't appear to be true with my compiler. I tested it with the code below: s[s.size()] == 0, but s[s.size() + ...] is not usually 0.

Paul

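#include <string>
#include <iostream>

int main()
{
    std::string s;   // empty, so s.size() == 0
    // Only i == 0 satisfies i <= s.size(); every later index is out of range,
    // so whatever gets printed for i >= 1 is not guaranteed by the standard.
    for (int i = 0; i < 10; ++i)
        std::cout << int(s[i]) << "\n";
}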

james...@alumni.caltech.edu

Oct 15, 2018, 5:13:42 PM
On Monday, October 15, 2018 at 4:34:15 PM UTC-4, Paul wrote:
> On Monday, October 15, 2018 at 9:22:18 PM UTC+1, james...@alumni.caltech.edu wrote:
> > On Monday, October 15, 2018 at 3:21:46 PM UTC-4, Paul wrote:
> > > I think that, when s is a std::string,
> > > s[s.size()] == 0; However, I haven't been able to confirm it.
> > > It works like that in my compiler.
> > >
> > > Is this correct?
> > >
> > > Thanks,
> > >
> > > Paul Epstein
> >
> > [Re: std::basic_string<charT> s; s[pos]]
> >
> > "Returns: *(begin() + pos) if pos < size(). Otherwise, returns a
> > reference to an object of type charT with value charT(), where modifying
> > the object to any value other than charT() leads to undefined behavior."
> > (20.3.2.5p2)
> >
> > charT defaults to char, so for typical uses charT() is essentially the
> > same as 0. That will also be true, except for the type of the result,
> > for the other three types that the standard templates are required to be
> > specialized for: char16_t, char32_t, and wchar_t, respectively.
>
> That seems to read as though s[s.size() + 1] is also 0 but that doesn't appear

Not quite. 20.3.2.5p1 says "Requires: pos <= size()." Violating a "Requires" clause renders the behavior undefined.
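
For code that needs a defined result on a bad index, the checked accessor
at() is the usual alternative. A minimal sketch follows; the helper name
at_throws is made up for illustration. Note that at() is stricter than
operator[]: it throws std::out_of_range already at pos == size().

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: reports whether s.at(pos) throws std::out_of_range.
bool at_throws(const std::string& s, std::size_t pos)
{
    try {
        (void)s.at(pos);          // checked access, throws if pos >= size()
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With s = "abc", at_throws(s, 2) is false while at_throws(s, 3) and
at_throws(s, 4) are true; s[3] is still a valid read of the terminator.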

Alf P. Steinbach

Oct 15, 2018, 6:15:13 PM
On 15.10.2018 22:22, james...@alumni.caltech.edu wrote:
> On Monday, October 15, 2018 at 3:21:46 PM UTC-4, Paul wrote:
>> I think that, when s is a std::string,
>> s[s.size()] == 0; However, I haven't been able to confirm it.
>> It works like that in my compiler.
>>
>> Is this correct?
>>
>
> [Re: std::basic_string<charT> s; s[pos]]
>
> "Returns: *(begin() + pos) if pos < size(). Otherwise, returns a
> reference to an object of type charT with value charT(), where modifying
> the object to any value other than charT() leads to undefined behavior."
> (20.3.2.5p2)
>
> charT defaults to char, so for typical uses charT() is essentially the
> same as 0. That will also be true, except for the type of the result,
> for the other three types that the standard templates are required to be
> specialized for: char16_t, char32_t, and wchar_t, respectively.

Worth noting for those using older C++03 compilers: until C++11,
this guarantee and well-definedness were offered only for the `const`
accessor.

C++03 §21.3.4/1:
“Returns: If pos < size(), returns data()[pos]. Otherwise, if pos ==
size(), the const version returns charT(). Otherwise, the behavior is
undefined.”

The C++11 guarantees are cleaner. But C++03 supported efficient
reference-counted string buffers, called “COW” strings (copy on write),
and used by g++'s standard library implementation, by regarding pointers
and references as invalidated by calls to `data` or `c_str`, and by the
first call to non-`const` `operator[]`, `at`, `begin`, `rbegin`, `end`
and `rend`. In C++11 and later these operations don't invalidate, i.e.
C++11 std::string is easier to use correctly, and furthermore indexing
is required to be O(1), which rules out a COW buffer copying on first
access. C++17 `std::string_view` indexing is correspondingly designed to
return a reference to constant `char` instead of the `char` itself,
relying on the compiler to optimize away that indirection overhead.
The price paid for that: COW strings won't be re-introduced.
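
As a rough illustration of the C++11 invalidation rule, here is a sketch
that assumes a conforming non-COW library (for instance libstdc++ with its
C++11 ABI). The function name is made up. Under a C++03-style COW library
the first comparison could fail, because the first non-const operator[]
would detach s from the buffer that p points into.

```cpp
#include <cassert>
#include <string>

// Hypothetical check: a pointer taken before a copy still refers to s's
// own buffer after the first non-const operator[] call.
bool pointer_survives_first_index()
{
    std::string s = "hello";
    const char* p = s.data();   // pointer obtained before any copy exists
    std::string t = s;          // a COW library could share the buffer here
    s[0] = 'j';                 // in C++11 this must not invalidate p
    return p == s.data() && p[0] == 'j' && t == "hello";
}
```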

My impression is that the decision to rule out COW strings was caused by
a bug in the g++ implementation. Rather than admit that it was a bug,
g++ supporters took the position that a safe implementation was just
impossible, and so, implicitly, that the buggy one was a practical
trade-off, the best that one could do. They got downright nasty
about the possibility of implementing COW correctly if the standard
didn't stand actively in the way. As I recall (this was years ago) I
supplied proof in the form of an actual implementation, to no avail.
Hardcore fans of this or that are usually immune to facts and logic, and
I've encountered that a great many times so I should have known, but.


Cheers!,

- Alf

Richard

Oct 15, 2018, 7:44:52 PM
[Please do not mail me a copy of your followup]

"Alf P. Steinbach" <alf.p.stein...@gmail.com> spake the secret code
<pq33h3$6fv$1...@dont-email.me> thusly:

>My impression is that the decision to rule out COW strings was caused by
>a bug in the g++ implementation. Rather than admit that it was a bug,
>g++ supporters took the position that a safe implementation was just
>impossible, and so, implicitly, that the buggy one was a practical
>trade-off, the best that one could do. They got downright nasty
>about the possibility of implementing COW correctly if the standard
>didn't stand actively in the way. As I recall (this was years ago) I
>supplied proof in the form of an actual implementation, to no avail.

When I gave a presentation on strings to the Utah C++ Programmers
meetup in January 2018[1], I pointed out that many large code bases
have custom strings that better suit their problem domain. For
instance, LLVM/clang has custom string classes such as SmallString,
StringRef and Twine.[2] Personally I found StringRef too easy to misuse,
and therefore not in line with Scott Meyers's advice to make interfaces
easy to use correctly and hard to use incorrectly. If
you've got an application that is doing lots of string manipulation
and copy-on-write would provide benefit, then by all means pick a CoW
string implementation and use it throughout your code base.

Alf, are you aware of any CoW string implementation that is
maintained?

[1] <https://www.meetup.com/Utah-Cpp-Programmers/events/xxrjflyxcbnb/>
[2] <http://llvm.org/docs/ProgrammersManual.html#string-like-containers>
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Terminals Wiki <http://terminals-wiki.org>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>

Öö Tiib

Oct 16, 2018, 9:23:16 AM
I do not understand what the "bug" is there. The GNU libstdc++
is part of the mainline GCC distribution, not a stand-alone portable
implementation of the standard library. Such a standard library is not
required to be implemented as source code text files. If parts of
it are text files, then those are not required to contain C++ code.
Even if they contain C++-like code, it is not required to be
standard-conforming C++ code.

A lot of the library features added by the C++11 standard are
outright arcane and magical, relying on exactly the allowances
just described. Look at std::initializer_list, std::thread, std::atomic
and so on: it is pure magic, not possible to express in standard
C++.

Therefore, while isocpp.org is full of priggish language lawyers,
I do not believe that the disappearance of CoW strings was because
of such sanctimonious arguments. In a lot of programming languages
the string is part of the core language.

However, the non-sanctimonious arguments that I have heard are
also plentiful:
1) The majority of strings handled by actual code are shorter than 30
bytes, but CoW offers no optimization around that fact.
2) Copying a CoW string involves a thread-safe increment of the
reference count, and that makes it more expensive than a copy with
the short string optimization.
3) Efficient usage of CoW strings involves strict const-correctness,
but even the talented programmers of major shops do not honor
that everywhere.
4) The immediate availability of the string's size is very important for
a lot of algorithms (and can make those more efficient than with
the zero-terminated strings of C), but with CoW it is behind an additional
level of indirection.

So the major shops voted for making the one-pointer CoW string illegal,
and now all of them are (usually ... 32 byte) non-CoW strings.

Programmers who want to be very efficient with text processing
should make extensive use of C++17's std::string_view. Inconveniences,
complications, dangling references and thread safety are back on
the shoulders of programmers with it, but it is a very powerful
performance optimization. But in 95% of code std::string is
efficient enough, and non-CoW is a bit more robust and scalable
than CoW for that code.
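
To make the std::string_view point concrete, here is a small sketch
(C++17 assumed; the helper name first_word is made up). The returned view
refers to the caller's characters instead of copying them, which is also
why it dangles if the underlying string goes away first.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: returns the first space-delimited word as a view
// into the caller's buffer. Nothing is allocated or copied.
std::string_view first_word(std::string_view text)
{
    const auto end = text.find(' ');   // npos if there is no space
    return text.substr(0, end);        // a count of npos means "the rest"
}
```

For std::string line = "copy on write", first_word(line) compares equal to
"copy" and its data() pointer equals line.data().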

bol...@cylonhq.com

Oct 16, 2018, 9:54:32 AM
On Tue, 16 Oct 2018 06:23:00 -0700 (PDT)
Öö Tiib <oot...@hot.ee> wrote:
>On Tuesday, 16 October 2018 01:15:13 UTC+3, Alf P. Steinbach wrote:
>expressed. Look at std::initializer_list, std::thread, std::atomic
>and so on, it is pure magic not possible to express in standard
>C++.

std::initializer_list is just a clever vararg. You could do it in standard C
never mind C++.


Öö Tiib

Oct 16, 2018, 10:47:30 AM
No, it is not, and no, we could not implement it in C. It is a core language
feature. The expression {1,2,3} just magically makes a constexpr instance
of std::initializer_list<int> poof into existence. The very concept
of constexpr is missing from C, so you have to use preprocessor
metaprogramming to handle things like that in it.

My point was that magic is fully allowed by the standard, so in the same way
the expression "Hello World!" could make a constexpr instance of
std::basic_string<char> (AKA std::string) poof into existence.
The standard just hasn't chosen to do that. There is no such thing
as a constexpr std::string in C++. So we have to use an explicitly
made std::string_view("Hello World!") in constexpr functions.


Öö Tiib

Oct 16, 2018, 11:15:23 AM
On Tuesday, 16 October 2018 17:47:30 UTC+3, Öö Tiib wrote:
> So we have to use explicitly
> made std::string_view("Hello World!") in constexpr functions.

Here I was a bit too croaky; we can use "Hello World!"sv as a shortcut
for it.
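
A sketch of what that looks like in practice, assuming C++17. The names
greeting and count_char are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

using namespace std::literals::string_view_literals;

// The sv suffix builds a std::string_view usable in constant expressions.
constexpr std::string_view greeting = "Hello World!"sv;
static_assert(greeting.size() == 12, "length is known at compile time");
static_assert(greeting.front() == 'H', "contents are readable at compile time");

// Hypothetical constexpr function taking a view: counts occurrences of c.
constexpr std::size_t count_char(std::string_view text, char c)
{
    std::size_t n = 0;
    for (char ch : text)
        if (ch == c)
            ++n;
    return n;
}
static_assert(count_char(greeting, 'l') == 3, "evaluated by the compiler");
```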

bol...@cylonhq.com

Oct 16, 2018, 11:30:34 AM
On Tue, 16 Oct 2018 07:47:18 -0700 (PDT)
Öö Tiib <oot...@hot.ee> wrote:
>On Tuesday, 16 October 2018 16:54:32 UTC+3, bol...@cylonhq.com wrote:
>> On Tue, 16 Oct 2018 06:23:00 -0700 (PDT)
>> Öö Tiib <oot...@hot.ee> wrote:
>> >On Tuesday, 16 October 2018 01:15:13 UTC+3, Alf P. Steinbach wrote:
>> >expressed. Look at std::initializer_list, std::thread, std::atomic
>> >and so on, it is pure magic not possible to express in standard
>> >C++.
>>
>> std::initializer_list is just a clever vararg. You could do it in standard C
>> never mind C++.
>
>No it is not and no we could not implement it in C. It is core language
>feature Expression {1,2,3} just magically makes constexpr instance
>of std::initializer_list<int> to poof into existence. The very concept
>of constexpr is missing from C and so you have to use preprocessor
>metaprogramming to handle things like that in it.

And?


Bonita Montero

Oct 16, 2018, 12:06:35 PM
AFAIK C++ strings are not guaranteed to be zero-terminated without
calling c_str(). I.e. when you do s[s.size()] you may be accessing the
string out of bounds.

james...@alumni.caltech.edu

Oct 16, 2018, 12:35:56 PM
20.3.2.5p2 guarantees that, if s is an instance of
std::basic_string<charT>, then s[s.size()] returns a reference to an
object with a value of charT().

Öö Tiib

Oct 16, 2018, 12:40:40 PM
It works with text substitution. So its usage is ugly and hard to debug, and
in cases where recursive #include is needed (and for a list of unknown
length it might be needed) it is also often slow. There is
no type safety, and there are no comprehensible diagnostics. Consider:

constexpr auto x = {1, 2, "huh"};

If we forget to include <initializer_list>, the compiler explicitly tells
us that we are required to include it for deducing from a brace-enclosed
initializer list. Pure magic and miracles. :D After including it we will
get: "error: unable to deduce 'const std::initializer_list<const auto>'
from '{1, 2, "huh"}' note: deduced conflicting types for parameter
'auto' ('int' and 'const char*')".

Can you post how your preprocessor implementation of
std::initializer_list looks like and how it behaves in similar
situation?
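
For contrast with the failing case above, a sketch of the well-formed
deduction (C++11 assumed; the function name is made up). When every
element has the same type, auto deduces std::initializer_list of that type:

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>

// Hypothetical helper: sums a braced list whose type the compiler deduces.
inline int sum_of_braced_list()
{
    auto x = {1, 2, 3};   // deduced as std::initializer_list<int>
    static_assert(
        std::is_same<decltype(x), std::initializer_list<int>>::value,
        "all elements are int, so the list type is initializer_list<int>");
    int total = 0;
    for (int v : x)
        total += v;
    return total;
}
```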

Bonita Montero

Oct 16, 2018, 2:09:18 PM
> 20.3.2.5p2 guarantees that, if s is an instance of
> std::basic_string<charT>, then s[s.size()] returns a reference to an
> object with a value of charT().

... and also that this charT is 0?

Öö Tiib

Oct 16, 2018, 2:55:05 PM
When charT is char (as it is in the case of std::string) then yes.

james...@alumni.caltech.edu

Oct 16, 2018, 3:56:59 PM
charT is the template parameter. The templates must be specialized for
charT == char, char16_t, char32_t, and wchar_t, and for each of those
types charT() works out to a value of 0 with the specified type. If you
want to create your own character class myChar, and provide a traits
class for it (for instance, by specializing std::char_traits<myChar>),
myChar() can return whatever value you want it to. Note that
std::basic_string<myChar> will treat myChar() as the terminating
character for strings, just as '\0' is the terminating character for
std::basic_string<char>.
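
For the plain char case, the terminator's role can be sketched as follows.
The helper names are made up. Construction from a bare pointer stops at
the first '\0', while construction with an explicit count keeps embedded
zeros, and s[s.size()] is '\0' either way.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: size of a string built from a bare pointer, where
// the length is found by scanning for the terminating '\0'.
std::size_t length_from_pointer(const char* p)
{
    return std::string(p).size();
}

// Hypothetical helper: size of a string built from pointer plus count,
// where embedded '\0' characters are ordinary contents.
std::size_t length_from_count(const char* p, std::size_t n)
{
    return std::string(p, n).size();
}
```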

Juha Nieminen

Oct 17, 2018, 3:34:07 AM
If you take a raw pointer to the contents with s.data() or &s[0] then
you are correct: It's not guaranteed to be a null terminated raw string.

However, the question was about s[s.size()], which is different.

bol...@cylonhq.com

Oct 17, 2018, 4:56:11 AM
I've never found a use for constexpr in real-world code, so the fact that
it can't be emulated in C is irrelevant to me. And auto is dynamic typing;
if you want that, use a scripting language. I hate it. It saves the coder a few
seconds but makes debugging a pain. As for complex type or function parameter
initialisation, C99 introduced compound literals.

Chris Vine

Oct 17, 2018, 5:50:42 AM
On Wed, 17 Oct 2018 07:33:54 -0000 (UTC)
Juha Nieminen <nos...@thanks.invalid> wrote:
> Bonita Montero <Bonita....@gmail.com> wrote:
> > AFAIK C++-strings are not guaranteed to be zero-terminated without
> > calling c_str(). I.e. when you do s[s.size()] you may be access the
> > string out of bounds.
>
> If you take a raw pointer to the contents with s.data() ... then
> you are correct: It's not guaranteed to be a null terminated raw string.

std::string::data() does terminate the string with null from C++11
onwards, so there seems to be no difference from std::string::c_str().
This change probably arises because, from C++11 onwards,
std::string::size() must be of constant complexity so null termination
won't have any significant effect on efficiency.
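
A quick way to see the C++11 behavior of data(), as a sketch with a
made-up helper name:

```cpp
#include <cassert>
#include <string>

// Hypothetical check: since C++11, data() and c_str() return the same
// pointer, and the element at index size() is the null terminator.
bool data_is_null_terminated(const std::string& s)
{
    return s.data() == s.c_str()
        && s.data()[s.size()] == '\0';
}
```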

Alf P. Steinbach

Oct 18, 2018, 8:47:20 PM
On 16.10.2018 01:44, Richard wrote:
> [snip]
> Alf, are you aware of any CoW string implementation that is
> maintained?

No, sorry.

I guess Mr. Google can help, or the weekly posted list of C++ libraries.

Off the cuff, aspects to look for in a COW string type that doesn't aim
to conform to the `std::string` specification, but instead to take
maximum advantage:

* Does it construct from literal in constant time?
If not then it's clearly ungood.

* Does it support repeated concatenation in linear time, as in CPython?
If not then it's clearly ungood.

* Does it support guaranteed cloning for thread safety?
If not then it's clearly ungood.

* Does it avoid imposing silly and inefficient thread safety measures?
If not then it's clearly ungood.

* Does it provide substrings in constant time?
Nice if it does, but also we now have `std::string_view`.

Cheers!,

- Alf

Alf P. Steinbach

Oct 18, 2018, 8:59:23 PM
It was possible to invalidate a pointer into a string's buffer, by
operations that one would not expect to do that. I'm a bit hazy on the
details, except that it necessarily involves making a (logical) copy.


> The GNU libstdc++
> is part of mainline GCC distribution not stand-alone portable
> implementation of standard library. Such standard library is not
> required to be implemented as source code text files. If parts of
> it are text files then those are not required to contain C++ code.
> Even if those contain C++-like code then it is not required to be
> standard-conforming C++ code.
>
> Lot of library features added by C++11 standard are something
> outright arcane and magical exactly with all these excuses
> expressed. Look at std::initializer_list, std::thread, std::atomic
> and so on, it is pure magic not possible to express in standard
> C++.
>
> Therefore while isocpp.org is full of priggish language lawyers
> I do not believe that disappearance of CoW strings was because
> of such sanctimonious arguments. In lot of programming languages
> string is part of core language.
>
> However the non-sanctimonious arguments that I have heard are
> also plenty:

Hm, I had to google that word, “sanctimonious”. Learned something. :-)


> 1) Majority of strings handled by actual code are shorter than 30
> bytes but CoW offers no optimization around that fact.
> 2) Copy of string with CoW involves thread-safe increment of
> reference count and that makes it more expensive than copy with
> short string optimization.
> 3) Efficient usage of CoW strings involves strict const-correctness
> but even the talented programmers of major shops do not honor
> that everywhere.
> 4) The immediate availability of size of string is very important for
> lot of algorithms (and can make those more efficient than with
> zero terminated strings of C) but with CoW it is behind additional
> level of indirection.

These are all wrong. COW and short buffer optimization are orthogonal
things. Copying of COW strings need not be more thread safe than copying
of non-COW strings; copying of `std::string` is not thread safe. COW
string efficiency has little or nothing to do with `const` correctness.
Nothing prevents `size()` from being available directly in a COW string: the string
size can't be changed by other COW strings that refer to the buffer.


> So the major shops voted for making one-pointer CoW string illegal
> and now there are all (usually ... 32 byte) non-CoW strings.

Well, the offered arguments do not cut it. And I think my hypothesis has
at least some chance of being at least partially right. But the battle
for possible COW implementations of `std::string` is lost so it doesn't
matter much except for how to relate to discussions about it...


> Programmers who want to be very efficient with text processing
> should extensively use std::string_view of C++17. Inconveniences,
> complications, dangling references and thread safety are back on
> shoulders of programmers with it, but it is very powerful
> performance optimization. But in 95% of code the std::string is
> efficient enough and non-CoW is bit more robust and scalable
> than CoW for that code.

For the `std::string` specification yes, I agree the C++11 spec makes
for more robust client code than the COW-supporting C++03 spec.

So something was gained, and something was lost. :)


Cheers!,

- Alf

Öö Tiib

Oct 19, 2018, 7:44:03 AM
Can't find cites of it. Was it about thread safety?

> > The GNU libstdc++
> > is part of mainline GCC distribution not stand-alone portable
> > implementation of standard library. Such standard library is not
> > required to be implemented as source code text files. If parts of
> > it are text files then those are not required to contain C++ code.
> > Even if those contain C++-like code then it is not required to be
> > standard-conforming C++ code.
> >
> > Lot of library features added by C++11 standard are something
> > outright arcane and magical exactly with all these excuses
> > expressed. Look at std::initializer_list, std::thread, std::atomic
> > and so on, it is pure magic not possible to express in standard
> > C++.
> >
> > Therefore while isocpp.org is full of priggish language lawyers
> > I do not believe that disappearance of CoW strings was because
> > of such sanctimonious arguments. In lot of programming languages
> > string is part of core language.
> >
> > However the non-sanctimonious arguments that I have heard are
> > also plenty:
>
> Hm, I had to google that word, “sanctimonious”. Learned something. :-)

That indicates I have read a needlessly large amount of philosophical and
political stuff. :D

> > 1) Majority of strings handled by actual code are shorter than 30
> > bytes but CoW offers no optimization around that fact.
> > 2) Copy of string with CoW involves thread-safe increment of
> > reference count and that makes it more expensive than copy with
> > short string optimization.
> > 3) Efficient usage of CoW strings involves strict const-correctness
> > but even the talented programmers of major shops do not honor
> > that everywhere.
> > 4) The immediate availability of size of string is very important for
> > lot of algorithms (and can make those more efficient than with
> > zero terminated strings of C) but with CoW it is behind additional
> > level of indirection.
>
> These are all wrong. COW and short buffer optimization are orthogonal
> things. Copying of COW strings need not be more thread safe than copying
> of non-COW strings; copying of `std::string` is not thread safe. COW
> string efficiency has little or nothing to do with `const` correctness.
> Nothing prevents `size()` available directly in a COW string: the string
> size can't be changed by other COW strings that refer to the buffer.

The arguments were specifically about C++ std::string in its different
implementations, not about CoW in general. C++03 did not formally have
threads, but in practice we had more and more multi-core systems to
work with. There, 2) and 3) are very convincing. 2) A CoW string has to have
an unexposed reference count that can be shared between threads, with no
external way to tell whether it is shared or not. 3) CoW obviously provides
benefits only when refcount > 1 often, and efficient usage avoids altering
that when not needed. Calls of non-const methods have to check that
refcount == 1 and make an actual copy when it is not. The string interface
has a lot of non-const overloads: at, [], front, back, begin, end, rbegin,
rend and, since C++17, also data(). Avoiding unneeded calls of those
means const correctness. Points 1) and 4) were perhaps a critique of
some concrete CoW string implementations.
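
The "check the refcount, copy if shared" step can be sketched as below.
This is a single-threaded toy, not how any real library implements it. The
class and member names are made up, and std::shared_ptr stands in for the
hidden reference count.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, single-threaded copy-on-write wrapper for illustration.
class CowString {
public:
    explicit CowString(std::string text)
        : buf_(std::make_shared<std::string>(std::move(text))) {}

    // Const access never copies, whatever the reference count is.
    char operator[](std::size_t i) const { return (*buf_)[i]; }
    std::size_t size() const { return buf_->size(); }

    // Mutation detaches first when the buffer is shared (refcount > 1).
    void set(std::size_t i, char c)
    {
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::string>(*buf_);   // the actual copy
        (*buf_)[i] = c;
    }

    bool shares_buffer_with(const CowString& other) const
    {
        return buf_ == other.buf_;
    }

private:
    std::shared_ptr<std::string> buf_;
};
```

Using a set() member instead of a non-const operator[] that returns a
reference sidesteps the question of references escaping into a buffer
that is shared later, which is the part the C++03 rules had to handle by
invalidation.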

> > So the major shops voted for making one-pointer CoW string illegal
> > and now there are all (usually ... 32 byte) non-CoW strings.
>
> Well, the offered arguments do not cut it. And I think my hypothesis has
> at least some chance of being at least partially right. But the battle
> for possible COW implementations of `std::string` is lost so it doesn't
> matter much except for how to relate to discussions about it...

Specialists at major shops can also buy political, philosophical and
scientific arguments, so it may be. There the decisions also have to be
supported with explanations of why profiling shows what it shows about the
actual products, and of what it will cost to improve it. The thing about
Usenet is that here we are free of that.

> > Programmers who want to be very efficient with text processing
> > should extensively use std::string_view of C++17. Inconveniences,
> > complications, dangling references and thread safety are back on
> > shoulders of programmers with it, but it is very powerful
> > performance optimization. But in 95% of code the std::string is
> > efficient enough and non-CoW is bit more robust and scalable
> > than CoW for that code.
>
> For the `std::string` specification yes, I agree the C++11 spec makes
> for more robust client code than the COW-supporting C++03 spec.
>
> So something was gained, and something was lost. :)

Yes and surgery that makes things simpler isn't typically worse
than the disease that it cures. :D

Juha Nieminen

Oct 19, 2018, 10:24:23 AM
Alf P. Steinbach <alf.p.stein...@gmail.com> wrote:
> Copying of COW strings need not be more thread safe than copying
> of non-COW strings; copying of `std::string` is not thread safe.

I think one problem with an "unofficial" (in the sense that the standard
doesn't mandate it, nor define how exactly it should work) copy-on-write
scheme in std::string is that, by the virtue of being "unofficial",
there's no "standard" way of checking if the current string is being
shared by another instance, and to make a deep copy if that's so.

In some situations you want to make a deep copy of a data structure
(eg. to have a copy that's local to a thread, and not shared with
any other thread), but you may want to do this deep-copying only
if it's necessary, ie. if the data is being currently shared.
Always doing a deep copy "just in case", even if it's not needed,
could be inefficient.

And, of course, there's also the issue that if you have something
like this:

void foo(std::string s)
{
    std::size_t length = s.length();
    ...
}

you normally wouldn't expect having to enclose those reads in
mutexes because 's' is being taken by value and should be a
local copy. When you suddenly can't trust that a parameter
taken by value cannot be handled without mutexes, it raises all
kinds of problems.

Alf P. Steinbach

Oct 19, 2018, 12:22:20 PM
On 19.10.2018 16:24, Juha Nieminen wrote:
> Alf P. Steinbach <alf.p.stein...@gmail.com> wrote:
>> Copying of COW strings need not be more thread safe than copying
>> of non-COW strings; copying of `std::string` is not thread safe.
>
> I think one problem with an "unofficial" (in the sense that the standard
> doesn't mandate it, nor define how exactly it should work) copy-on-write
> scheme in std::string is that, by the virtue of being "unofficial",
> there's no "standard" way of checking if the current string is being
> shared by another instance, and to make a deep copy if that's so.
>
> In some situations you want to make a deep copy of a data structure
> (eg. to have a copy that's local to a thread, and not shared with
> any other thread), but you may want to do this deep-copying only
> if it's necessary, ie. if the data is being currently shared.
> Always doing a deep copy "just in case", even if it's not needed,
> could be inefficient.

Agreed. The apparently simple solution of just making a copy and
pretend-modifying it, would introduce inefficiency for non-COW strings.
So that's not portable.


> And, of course, there's also the issue that if you have something
> like this:
>
> void foo(std::string s)
> {
> std::size_t length = s.length();
> ...
> }
>
> you normally wouldn't expect having to enclose those reads in
> mutexes because 's' is being taken by value and should be a
> local copy. When you suddenly can't trust that a parameter
> taken by value cannot be handled without mutexes, it raises all
> kinds of problems.

But here it appears that you confuse responsibilities. It's not foo's
responsibility to make a deep copy of s in order to support execution of
foo in a separate thread. It's the caller's responsibility.

In C++03 there were no standard threads, so it was not addressed by the
language.

In C++11 threads were introduced and the COW support of the design
removed, and so in C++11 this issue is not addressed by the language
either. One way it could have been addressed is, as you note at the
start, by providing a deep copy operation. Then the caller of foo would
have to use that for using foo as a thread function.
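
std::string has no member for this, but a caller-side deep copy can be
sketched as follows. The function name deep_copy is made up. Building the
new string from a pointer and a length copies the characters even under a
library that would share buffers on plain copy construction.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns a string with its own storage by copying
// the characters, instead of relying on the copy constructor.
std::string deep_copy(const std::string& s)
{
    return std::string(s.data(), s.size());
}
```

The caller would then hand deep_copy(original) to the thread function
instead of original itself.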


Cheers!,

- Alf