In the following code snippet, when the <string, vector> pair is
inserted into the map, is it necessary for the contents of the string
and vector to be duplicated, or does this just shuffle pointers
around?
{
    string s(100, 'x');
    vector<int> v(100);
    map<string, vector<int>> m;
    m.insert(make_pair(s, v));
}
Is the full 100 bytes of the string s duplicated and then the original
freed? Since the local variables s and v are no longer needed after
the end of the scope immediately following the insert, it seems quite
unnecessary to duplicate and then free the originals. Can this be
avoided? Do any implementations of STL implement copy-on-write
semantics for string or vector?
Thanks,
Shaun
--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
make_pair creates a new pair object and returns it by value, so "in
principle" s and v will be copied. Of course, the compiler is allowed to
perform any optimizations that don't change the observable behavior,
so there's no "guarantee" that the copies will be made.
In C++0x you will be able to use make_pair(T1&&, T2&&) and "move" s and
v, avoiding the copies. As far as I know, the syntax will probably be
m.insert(make_pair(move(s), move(v))) (with all std:: omitted).
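Spelled out with the std:: qualifiers, the move-based insert might look like the sketch below. It assumes a compiler with C++11 (then C++0x) support; the function name build_by_move is made up for this example. After std::move, s and v are left in a valid but unspecified state, so the sketch does not read them again.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: builds the map from the original example,
// moving s and v into the pair instead of copying them.
std::map<std::string, std::vector<int> > build_by_move() {
    std::string s(100, 'x');
    std::vector<int> v(100);
    std::map<std::string, std::vector<int> > m;

    // make_pair receives rvalues, so the members of the temporary pair are
    // move-constructed; insert then takes that temporary by rvalue reference.
    m.insert(std::make_pair(std::move(s), std::move(v)));

    // s and v are valid but unspecified from here on; they are not used again.
    assert(m.size() == 1);
    return m;
}
```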
--
Seungbeom Kim
"In principle" then, would m.insert(pair<string, vector<int>>(s, v));
avoid making a copy? I had been treating make_pair as a syntactic
nicety, but completely equivalent to the constructor of pair.
Cheers,
Shaun
--
I'm not 100% sure, but I think STLPort implements string with CoW.
> avoided? Do any implementations of STL implement copy-on-write
> semantics for string or vector?
copy-on-write doesn't work well with threads so nobody uses it much
anymore.
Current STLPort-5.2.1 and STLPort 4 bundled with the latest Sun C++
compilers do not use CoW. One of the most disappointing features of
STLPort's std::string is that the default constructor allocates storage;
in other words, a memory allocation is performed even for empty strings.
GNU std::string uses CoW. I hear that the standard interface of
std::string cannot possibly be satisfied by a CoW implementation of
std::string; nevertheless, I find CoW implementations of std::string the
most practical.
The GNU C++ library also provides another string class,
__gnu_cxx::__versa_string, which does not do CoW and I've heard that
there are plans to make it the default std::string implementation in the
future, although the CoW std::string implementation will still be
available under a different name.
--
Max
It is perfectly possible to combine CoW and threads when the CoW
classes use atomic-reference counting. This is precisely how the Qt
framework provides fully reentrant implementations of string,
container, and other implicitly shared classes.
MSVC(VC6)/Dinkumware used to do COW for std::string, but found that
small string optimization provided better overall performance.
Jeff
Since s and v are lvalues, they will be copied. Whether that means that
all the characters of the string object are duplicated is another
question, and it depends on whether the implementation uses CoW or not.
Recent libstdc++ versions still use CoW (in a thread-safe way) for
std::string. I'm not sure about std::vector; probably not.
> > In C++0x you will be able to use make_pair(T1&&, T2&&) and "move" s and
> > v, avoiding the copies. As far as I know, the syntax will probably be
> > m.insert(make_pair(move(s), move(v))) (with all std:: omitted).
>
> "In principle" then, would m.insert(pair<string, vector<int>>(s, v));
> avoid making a copy?
No. They have to be copied since s and v are lvalues. But again, that
doesn't imply that the string's elements are copied (due to CoW). But
CoW is not mandated, only a possibility.
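One way to check this on a given compiler is to count copy-constructor calls with an instrumented key type. The sketch below is only an illustration: the type Counted and the three helper functions are made up for this example, and it assumes C++11 or later so that the moved variant compiles.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical instrumented key type: counts copy constructions only.
struct Counted {
    static int copies;
    int id;
    explicit Counted(int i) : id(i) {}
    Counted(const Counted& o) : id(o.id) { ++copies; }
    Counted(Counted&& o) noexcept : id(o.id) {}   // moves are not counted
    bool operator<(const Counted& o) const { return id < o.id; }
};
int Counted::copies = 0;

// Lvalue key through make_pair: the key has to be copied into the pair.
int copies_with_make_pair() {
    Counted::copies = 0;
    Counted k(1);
    std::map<Counted, int> m;
    m.insert(std::make_pair(k, 0));
    return Counted::copies;
}

// Lvalue key through the pair constructor: same situation as make_pair.
int copies_with_pair_ctor() {
    Counted::copies = 0;
    Counted k(1);
    std::map<Counted, int> m;
    m.insert(std::pair<Counted, int>(k, 0));
    return Counted::copies;
}

// Rvalue key: every step can use the move constructor instead.
int copies_with_move() {
    Counted::copies = 0;
    Counted k(1);
    std::map<Counted, int> m;
    m.insert(std::make_pair(std::move(k), 0));
    return Counted::copies;
}
```

Calling the first two functions should report at least one copy each, and the same number for both, while the moved variant should report none.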
In C++0x you will be able to write
m.emplace(move(s), move(v));
without having to worry about any copying. The pair object will be
directly constructed "into the map" using its templated constructor
that forwards the rvalue references of move(s) and move(v) to the
constructors of std::string and std::vector.
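A fuller version of that call, as a sketch assuming a C++11 or later library that provides map::emplace (the function name build_by_emplace is made up for this example):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: constructs the element in place from moved arguments.
std::map<std::string, std::vector<int> > build_by_emplace() {
    std::string s(100, 'x');
    std::vector<int> v(100);
    std::map<std::string, std::vector<int> > m;

    // The arguments are forwarded to the constructor of the pair stored in
    // the map, so no temporary pair is named at the call site.
    auto result = m.emplace(std::move(s), std::move(v));
    assert(result.second);  // true here: the key was not present before

    return m;
}
```

If the key might already be present, note that an implementation is allowed to construct the element before it checks for a duplicate, so s and v may end up moved from even when nothing is inserted.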
> I had been treating make_pair as a syntactic nicety, but completely
> equivalent to the constructor of pair.
I think it's safe to say that it is if your compiler supports RVO
(return value optimization) and inlining. You can expect make_pair to
be as efficient as a direct constructor call. If you're concerned
about the performance you could do something like this:
    if (m.find(s) == m.end())
        m[s].swap(v);
This will create a pair with a copied string as key and a default-
constructed vector. The new vector is immediately swapped with v.
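Wrapped into a function, the swap idiom might look like the following sketch. The name insert_by_swap and its signature are made up for this example; it needs nothing beyond C++03.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: inserts (s, v) only if s is not already a key.
// Returns true if a new element was created. On success v is left empty,
// because it now holds the default-constructed vector's (empty) contents.
bool insert_by_swap(std::map<std::string, std::vector<int> >& m,
                    const std::string& s, std::vector<int>& v) {
    if (m.find(s) != m.end())
        return false;   // key already present: leave m and v untouched
    // operator[] copies the key and default-constructs the mapped vector;
    // swap then exchanges the two vectors' buffers in constant time.
    m[s].swap(v);
    return true;
}
```

The string key is still copied here; only the vector's elements avoid duplication.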
Cheers,
SG
How does that preclude CoW for non-small strings?
While I agree that this is the best solution, I think it would also
make sense to have container::insert(const T &&value), which makes use
of the move constructor, making m.insert(make_pair(move(s), move(v)))
almost as efficient. (I do not remember if C++0x has this or not).
It doesn't, it is a separate optimization, but with an overlap in
effect for short strings.
The original use of CoW for std::string was shown not to work in
practice. Due to the semantics it becomes copy-on-potential-write,
which is most often a net loss - especially for multi-threaded use.
Herb Sutter did the measurements that surprised most people:
http://www.gotw.ca/gotw/045.htm
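To illustrate the "copy-on-potential-write" point, here is a toy single-threaded sketch. It is not how any real std::string is implemented; the class CowString and the two helper functions are made up for this example, and std::shared_ptr (C++11) stands in for a hand-written reference count. The non-const operator[] returns a reference the caller could write through, so the class has to unshare the buffer even when the caller only reads.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical toy class: copies of a CowString share one buffer until
// one of them might be written to.
class CowString {
public:
    explicit CowString(const std::string& text)
        : data_(std::make_shared<std::string>(text)) {}

    // Const access cannot modify the buffer, so sharing is preserved.
    char operator[](std::size_t i) const { return (*data_)[i]; }

    // Non-const access returns a writable reference, so the buffer must be
    // unshared first, whether or not the caller actually writes.
    char& operator[](std::size_t i) {
        detach();
        return (*data_)[i];
    }

    bool shares_buffer_with(const CowString& other) const {
        return data_ == other.data_;
    }

private:
    // Make a private copy if anyone else still refers to the buffer.
    // (Checking use_count like this is only adequate single-threaded.)
    void detach() {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::string>(*data_);
    }

    std::shared_ptr<std::string> data_;
};

// A read through a non-const object still forces the deep copy.
bool read_through_nonconst_unshares() {
    CowString a("hello");
    CowString b = a;                  // cheap: buffer is shared
    assert(a.shares_buffer_with(b));
    char c = b[0];                    // only a read, but b is non-const
    assert(c == 'h');
    return !a.shares_buffer_with(b);
}

// The same read through a const object keeps the buffer shared.
bool read_through_const_keeps_sharing() {
    CowString a("hello");
    const CowString b = a;
    char c = b[0];
    assert(c == 'h');
    return a.shares_buffer_with(b);
}
```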
Bo Persson
The latest draft, N2960, has insert member functions that take
rvalue references (P&&, not const P&&, though) for all containers
that have insert member functions.
--
Seungbeom Kim