(This question is based upon document ISO/IEC 14882:2011(E).)
I'm writing a parser that's going to end up generating lots of instances of standard strings and vectors, and I'm trying to be as unobtrusive in terms of memory as I can. I'm trying to move the generated strings and vectors for obvious reasons, but it got me thinking about the state of the carcass of the moved instance.
For strings, I found the language in § 21.4.2 paragraph 2 that "In the second form, str is left in a valid state with an unspecified value." The second form refers to the rvalue-reference constructor of basic_string.
From this language, I expect I should be able to call methods on a previously moved instance without introducing undefined behavior, but the results of those calls are unspecified (calling c_str could return an empty string or anything else). What I gather from this is that if I wanted to reuse a previously moved string instance, I would need to re-initialize it in order to get to a known state.
As an example, if I wrote:
    std::function<void(std::string&&)> sink = ...;
    std::string s;
    for (int i = 0; i < 26; ++i)
    {
        s.push_back(static_cast<char>('a' + i));
        sink(std::move(s));
    }
It would be conformant for sink to receive "a", "b", "c"... or to receive "a", "ab", "abc", or really anything else after initially receiving "a".
Question boils down to: To be in a known & valid state, do I need to reinitialize a moved type in order to be able to guarantee behavior?
i.e. Do I need to do the following in order to ensure sink receives "a", "b", "c"...:
    std::function<void(std::string&&)> sink = ...;
    std::string s;
    for (int i = 0; i < 26; ++i)
    {
        s.push_back(static_cast<char>('a' + i));
        sink(std::move(s));
        s = std::string(); // possibly redundant?
    }
In general would this assumption hold across all standard movable types?
Is the extra re-initialization necessary to ensure cross platform/implementation behavior?
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussio...@isocpp.org.
To post to this group, send email to std-dis...@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-discussion/.
Got it, Richard. Appreciate the response, and, unfortunately, confirming my fears. The small string example really helped to make it clear why.
'Fear' was probably too strong a word to use, yet Richard's explanation confirmed that I shouldn't rely upon the observed behavior of one compiler.
--
Are you saying that this output is undefined, i.e. that on another machine the output might be something different?
an object state that is not specified except that the object’s invariants are met and operations on the object behave as specified for its type
[ Example: If an object x of type std::vector<int> is in a valid but unspecified state, x.empty() can be called unconditionally, and x.front() can be called only if x.empty() returns false. —end example ]
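The standard's example can be sketched in code. This is only an illustration, and `try_front` is a hypothetical helper name, not something from the standard or from this thread: `empty()` has no precondition, so it may be called on a vector in a valid but unspecified state, while `front()` is only called once `empty()` has returned false.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: copies the first element into `out` only if one exists.
// empty() has no precondition, so calling it on a moved-from vector is fine.
// front() requires a non-empty vector, so it is guarded by the empty() check.
bool try_front(const std::vector<int>& x, int& out) {
    if (x.empty()) {
        return false;
    }
    out = x.front();
    return true;
}
```

Calling `try_front` on a moved-from vector is well defined either way; what is not specified is whether it returns true or false.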
On 2015–04–22, at 9:51 AM, Nevin Liber <ne...@eviloverlord.com> wrote:
On 21 April 2015 at 20:44, <gmis...@gmail.com> wrote:
> Are you saying that this output is undefined, i.e. that on another machine the output might be something different?
Yes. Not quite undefined, but in a valid but unspecified state.
On 2015–04–22, at 12:16 PM, gmis...@gmail.com wrote:
the standard should change to make such code work as expected for things like string.
e.g.: guarantee this assert will never fail.
    std::string s1 = "hello";
    std::string s2 = std::move(s1);
    assert(s1.empty());
For std::vector (over std::allocator), as far as I know, the moved-from state is constrained to be empty by the requirement that iterators to the moved-from object remain valid, pointing to the moved-to object.
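A small sketch of the iterator argument, assuming the behavior described above (it matches the mainstream implementations I am aware of, but treat it as an illustration rather than a statement of what the standard guarantees). The function name `iterator_follows_move` is made up for this example.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Takes an iterator into `src`, move-constructs `dst` from `src`, and reports
// whether the old iterator now refers to the first element of `dst`.
// Assumption: move construction transfers the buffer instead of copying it.
bool iterator_follows_move(std::vector<int> src) {
    assert(!src.empty());
    std::vector<int>::iterator it = src.begin();
    std::vector<int> dst = std::move(src);
    return it == dst.begin() && *it == dst.front();
}
```

If the iterator must keep pointing at the elements, and the elements now belong to the moved-to vector, the moved-from vector has nothing left to own, which is the reasoning behind expecting it to be empty.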
I have seen a lot of code like that, which is technically broken then. I've written code like that myself regarding string.
I think perhaps the standard should change to make such code work as expected for things like string.
You are exactly correct about what I think the "problem" is and what I think the possible "solution" might be.
Most implementations (it appears) already conform to the design that I find least surprising and that's probably exactly why.
But if we are saying code shouldn't rely on that, then I think we need compilers to diagnose our wrong assumptions here and/or have some major implementations disagree, so we are less likely to rely on non guaranteed behaviour.
If we do nothing, I think eventually there will be a large body of code that needs support and we'll standardize it anyway.
Don't get me wrong, I think move is awesome, but can we do more to manage its impact better?
Maybe we designate a [[safe_after_move]] class or function attribute or something and then have the compiler attempt to warn on use of any other method after a move has been seen. It might not be bullet proof but maybe it might help.
On Apr 22, 2015, at 6:09 PM, gmis...@gmail.com wrote:
>
> On Thursday, April 23, 2015 at 2:45:19 AM UTC+12, Howard Hinnant wrote:
> ———
>
>> Questions:
>>
>> 1. Is this guarantee intended only for std::string, and not other instantiations of std::basic_string?
>>
>> 2. Is this guarantee intended only for the move constructor, or is it also intended for the move assignment operator? Or what about other operations within the std::lib that take a std::string&& (or std::basic_string<…>&&) such as vector<string>::insert(const_iterator, string&&)?
>>
>
> It's also not practical for me to answer your questions explicitly because I suspect the answer requires a detailed analysis of not just string but other types like vector etc. too.
It is difficult to analyze a proposal without knowing exactly what is being proposed.
>
> Maybe we designate a [[safe_after_move]] class or function attribute or something and then have the compiler attempt to warn on use of any other method after a move has been seen. It might not be bullet proof but maybe it might help.
Perhaps it would be instructive to recast the question in terms we’re familiar with from C++98. What should this program output?
#include <iostream>
#include <algorithm>
#include <string>
int
main()
{
    std::string array[] = {"a", "b", "c"};
    const std::size_t sz = sizeof(array)/sizeof(array[0]);
    std::remove(array, array+sz, "a");
    std::cout << array[sz-1] << '\n';
}
What operations would it be safe to do with the value array[sz-1] after std::remove is called? Is the answer to that question different between C++98, C++03, C++11 and C++14?
Is there any practical difference between array[sz-1] and a C++11 moved-from std::string? Is the issue with std::remove one that has needed (and continues to need) to be solved (since C++98)?
If we were to put the call to std::remove under a try/catch, in the catch clause what operations would be safe to perform on the elements of array?
I guess what I’m getting around to is: Valid but unspecified states are not a new thing with C++11. We have lived with them everywhere we have basic exception safety, and in a few more places such as std::remove, std::remove_if and std::unique. If moved-from std::strings need fixing, don’t we also need to address these other areas as well for consistency? Or is move that special?
Howard
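One way to read the std::remove question, sketched under my own assumptions (the helper name `remove_and_reset_tail` is hypothetical): the elements before the returned iterator have known values, the elements after it are valid but unspecified, and giving the tail a known value again is always allowed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Removes every element equal to `value`, then clears the leftover tail so
// that every element of `v` has a known value again.
// Returns the number of kept elements.
std::size_t remove_and_reset_tail(std::vector<std::string>& v,
                                  const std::string& value) {
    std::vector<std::string>::iterator new_end =
        std::remove(v.begin(), v.end(), value);
    // [new_end, v.end()) holds valid but unspecified strings at this point.
    for (std::vector<std::string>::iterator it = new_end; it != v.end(); ++it) {
        it->clear();  // clear() has no precondition, so this is always fine
    }
    return static_cast<std::size_t>(new_end - v.begin());
}
```

In everyday code the tail is usually erased rather than reset; the reset here only makes the point that the tail elements can be brought back to a known state.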
On 2015–04–22, at 10:59 PM, Nevin Liber <ne...@eviloverlord.com> wrote:
And what about calls which only conditionally do a move, such as set<T>::insert(T&&)?
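A hedged sketch of the conditional-move case, with a hypothetical helper name `insert_and_reset`. The caller cannot tell from the call alone whether the string was actually moved from, so the sketch puts it into a known state afterwards in both cases.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Offers `candidate` to the set as an rvalue and reports whether it was
// inserted. Whether `candidate` was moved from when the key already exists is
// not something this code relies on: it is cleared unconditionally.
bool insert_and_reset(std::set<std::string>& seen, std::string& candidate) {
    bool inserted = seen.insert(std::move(candidate)).second;
    candidate.clear();  // known state whether or not a move took place
    return inserted;
}
```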
On Apr 22, 2015, at 8:48 PM, gmis...@gmail.com wrote:
>
> And secondly, once applied, it obliterates the object pretty profoundly.
This is a notion I’d prefer to reword. In the early days of move this was the prevailing thought. I.e. you can’t do anything with a moved-from object but destruct it! But that really isn’t right. After all, if that were true, std::swap wouldn’t swap values, it would just destruct them. :-) That is, std::swap brings moved-from values “back from the dead”.
Indeed, during a sort, everything gets moved, and every moved-from value is given a new one.
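The std::swap remark can be made concrete with a hand-written swap built from three moves. This is a sketch for illustration (the name `swap_by_moves` is mine), not a claim about how any particular library implements std::swap.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Swap written as three moves. Each moved-from object is assigned a new value
// right away, so no unspecified value is ever observed.
template <class T>
void swap_by_moves(T& a, T& b) {
    T tmp(std::move(a));  // a: valid but unspecified
    a = std::move(b);     // a: known again; b: valid but unspecified
    b = std::move(tmp);   // b: known again; tmp is only destroyed
}
```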
> None of this is to say that the design of move is wrong.
No worries, I’m not interpreting your thoughts that way.
> But what bothers me is that even for my simple example, the result is that apparently (listening to the replies already in) can't guarantee even that code will reliably output the same thing on different platforms AND that no compiler I have seen currently diagnoses the issue. It may not even be possible. That's what worries me.
Your worries are not without justification.
> Your earlier question acknowledges the kinds of things that need to be answered which ever road you go down and the complexity of answering that which is why I haven't tried.
>
> All I'm doing is just laying out what I see as the current state and saying they seem less than ideal and asking what can we do about it. I want the compiler to be able to spot errant use after move as much as we spot errant use after delete and/or to make sensible use after move less platform specific. There may be no perfect here. But better might be obtainable. I like move I just want to make using it less of a source of bugs and am interested in ways of achieving that if it's possible.
I appreciate your efforts in this area. I am hearing from multiple sources that std::move is overused and that moved-from values are not being correctly handled. Right now I do not know what the solution is, nor if a solution is possible, nor even if this is something that needs to be solved beyond education on std::move. I remain interested in the domain, and will contribute whatever background knowledge I can towards everyone's efforts to better C++.
Howard
My view point 1 is that an object that has been moved from should generally attempt to put itself into a state in which it has the same usability guarantees as it would have if it had been default constructed. This model seems natural and it would fix a certain source of bugs and it would make the simple string example work consistently which I find appealing.
But my contrary view says that offering any such a guarantee might be sub-optimal because my guess is that most objects are never used again after they have been moved from so not having to reset to a definite state might improve performance.
And the most bothersome kind of code is code that conditionally moves (either by error or by design) because that seems to invite a dangerous source of bugs - i.e. where you can end up appending to a string you thought was empty but it turns out that it isn't because a move didn't happen for some reason.
Adopting a clear "don't re-use after move" rule would seem to help in the latter cases but it makes my simple code example unreliable and I hate that.
For example, imagine if this were the default:
    void take_me_lord(my_object&& ashes_to_ashes);
    my_object goodbye_cruel_world("so long");
    take_me_lord( goodbye_cruel_world );
    goodbye_cruel_world = "resurrection"; // COMPILER ERROR:
A re-constructor could put an existing object into a state where it can be destructed or reused as if it had just been constructed.
On Thursday, April 23, 2015 at 2:27:15 AM UTC-4, gmis...@gmail.com wrote:
> My view point 1 is that an object that has been moved from should generally attempt to put itself into a state in which it has the same usability guarantees as it would have if it had been default constructed. This model seems natural and it would fix a certain source of bugs and it would make the simple string example work consistently which I find appealing.
That sounds very similar to a valid but unspecified state. The difference being just that you don't know what's in the object, and by using std::move() you've explicitly said that you don't care, because you're either going to destroy it or set its state to something known.
> But my contrary view says that offering any such a guarantee might be sub-optimal because my guess is that most objects are never used again after they have been moved from so not having to reset to a definite state might improve performance.
> And the most bothersome kind of code is code that conditionally moves (either by error or by design) because that seems to invite a dangerous source of bugs - i.e. where you can end up appending to a string you thought was empty but it turns out that it isn't because a move didn't happen for some reason.
> Adopting a clear "don't re-use after move" rule would seem to help in the latter cases but it makes my simple code example unreliable and I hate that.
It's not "don't re-use after move", it's "don't re-use after move until you've put it into a known state".
> For example, imagine if this were the default:
>     void take_me_lord(my_object&& ashes_to_ashes);
>     my_object goodbye_cruel_world("so long");
>     take_me_lord( goodbye_cruel_world );
>     goodbye_cruel_world = "resurrection"; // COMPILER ERROR:
That being a compiler error would be very bad. That is perfectly valid code today for well-behaved classes such as std::string, assuming the addition of the missing std::move() in the take_me_lord() call, since it won't compile as-is because the argument is not an rvalue. I've heard several people say, and I agree, that if you read std::move() as std::rvalue_cast(), then it makes more sense that you're not actually moving when you do std::move(), you're telling the called function that it's okay to move from this object.
> A re-constructor could put an existing object into a state where it can be destructed or reused as if it had just been constructed.
You don't need a re-constructor for this, as this is mostly what already happens. You can reuse it as long as you use a function that sets the entire state, and not one that simply modifies the existing state. So you can use std::string::operator=(), but std::string::operator+=() will give you unspecified results.
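The distinction between setting the entire state and modifying the existing state might look like this in code. A minimal sketch; `take_and_reassign` is a hypothetical name used only here.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Moves the contents of `src` into the return value, then gives `src` a known
// value again by assigning to it.
std::string take_and_reassign(std::string& src, const std::string& fresh) {
    std::string taken = std::move(src);
    // src += "x";  // well defined, but the result would depend on whatever
    //              // unspecified value src holds after the move
    src = fresh;    // operator= replaces the whole value, so src is known again
    return taken;
}
```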
On Friday, April 24, 2015 at 8:33:35 AM UTC+12, Greg Marr wrote:
> On Thursday, April 23, 2015 at 2:27:15 AM UTC-4, gmis...@gmail.com wrote:
>> My view point 1 is that an object that has been moved from should generally attempt to put itself into a state in which it has the same usability guarantees as it would have if it had been default constructed. This model seems natural and it would fix a certain source of bugs and it would make the simple string example work consistently which I find appealing.
> That sounds very similar to a valid but unspecified state. The difference being just that you don't know what's in the object, and by using std::move() you've explicitly said that you don't care, because you're either going to destroy it or set its state to something known.
The difference is consistency - programmers' mental model.
Currently, after a move (such as on a string) a type's "valid but unspecified state" is so unspecified that even my simple string example from earlier can't be guaranteed to work, i.e. even length cannot be guaranteed to be 0.
But because it feels natural that a string should be empty after a move, and because it usually does turn out to be in practice, programmers are making the mistake of relying on that. If you changed the implementation of string to artificially change length to a random number after move, I think a lot of code would break / you'd find a lot of bugs.
If types could be generally said to reset to their default initialized state this would fix some of those bugs.
If we don't improve on the situation as it is today we have bugs like my string example going undetected, and 'just don't use it again after a move' starts to become the safest thing to do to avoid bugs even though technically that is overkill.
Hi,
> In general would this assumption hold across all standard movable types? Is the extra re-initialization necessary to ensure cross platform/implementation behavior?
    std::function<void(std::string&&)> sink = ...;
    for (int i = 0; i < 26; ++i) {
        // A fresh string is constructed on every iteration, so no moved-from object is reused.
        std::string s(1, static_cast<char>('a' + i));
        sink(std::move(s));
    }
On 23 April 2015 at 17:14, <gmis...@gmail.com> wrote:
> On Friday, April 24, 2015 at 8:33:35 AM UTC+12, Greg Marr wrote:
>> On Thursday, April 23, 2015 at 2:27:15 AM UTC-4, gmis...@gmail.com wrote:
>>> My view point 1 is that an object that has been moved from should generally attempt to put itself into a state in which it has the same usability guarantees as it would have if it had been default constructed. This model seems natural and it would fix a certain source of bugs and it would make the simple string example work consistently which I find appealing.
>> That sounds very similar to a valid but unspecified state. The difference being just that you don't know what's in the object, and by using std::move() you've explicitly said that you don't care, because you're either going to destroy it or set its state to something known.
> The difference is consistency - programmers' mental model.
Not fitting your mental model is not a bug in the standard.
I don't know how you solve the problem of people not reading documentation, because the number of different mental models that people can make up that don't fit the specification is unbounded.
> Currently, after a move (such as on a string) a type's "valid but unspecified state" is so unspecified that even my simple string example from earlier can't be guaranteed to work, i.e. even length cannot be guaranteed to be 0.
Because unspecified means not specified. I really don't know how it can be made any clearer. Would you prefer we pick a different word out of a thesaurus? You seem to think that unspecified should mean specified.
> But because it feels natural that a string should be empty after a move and because it usually does turn out to be in practice programmers are making the mistake of relying on that. If you changed the implementation of string to artificially change length to a random number after move, I think a lot of code would break / you'd find a lot of bugs.
Many people have an oversimplified mental model of the underlying machine too, with no caches and atomic access to each and every primitive type. They sprinkle "volatile" all over their variables because they think it magically solves threading and concurrency problems. What should we do about that, as the bugs it causes are far more insidious?
> If types could be generally said to reset to their default initialized state this would fix some of those bugs
But not all of them, because the mental model would still be wrong. It would just make correct code slower.
Your mental model is inconsistent with what an arbitrary developer can do with his/her move constructor for a user defined type.
Plus, there are reasons to use r-value references besides intent to move. It indicates an unnamed temporary, and can be useful in things like expression templates, where the classes should not be holding on to references/pointers past the full expression it is a part of.
> If we don't improve on the situation as it is today we have bugs like my string example going undetected and the 'just don't use it again after a move' starts to become the safest thing to do to avoid bugs even though technically that is overkill.
What's wrong with that, other than it being overly pessimistic?
Over time, you can teach them a better mental model:
- Assignment will put the object back into a specific state and make it once again effectively usable.
- For standard library containers, calling clear() will also put the object back into a specific state and make it once again effectively usable.
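Applied to the loop from the original question, the clear() advice could look like the sketch below. `feed_letters` is a name made up for this example; the sink is whatever callable the caller supplies.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Feeds "a", "b", ..., "z" to `sink`, reusing a single string object.
void feed_letters(const std::function<void(std::string&&)>& sink) {
    std::string s;
    for (int i = 0; i < 26; ++i) {
        s.push_back(static_cast<char>('a' + i));
        sink(std::move(s));
        s.clear();  // back to a known (empty) state, whether or not the sink
                    // actually moved from s
    }
}
```

The clear() call also covers a sink that only reads its argument and never moves from it, in which case s would otherwise keep growing.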
While that may not always be 100% optimal, IMNSHO anything more fine-grained needs a measured performance bottleneck before pursuing.
Or you can write your own library that matches your mental model.
The repchar() function never calls
anything like clear() that might be considered a "reset to a known
state" function, but repchar() can be expected to work perfectly well on
a string with arbitrary but known contents, so I think it can reasonably
be expected to work the same way on a moved-from string. A moved-from
string is just a string with unknown but valid content, not some kind of
delicate fragile half-constructed value that's only usable in certain
contexts.
I don't think there's anything wrong with the current specification of
move semantics,
A moved from string that has a length that is unspecified or that has to be assigned an empty string to be usable is odd at first inspection.
The issue isn't about performance, it's about reducing bugs related to using move.
On Fri, Apr 24, 2015 at 12:59 PM, <gmis...@gmail.com> wrote:
> A moved from string that has a length that is unspecified or that has to be assigned an empty string to be usable is odd at first inspection.
An 'int' that is uninitialized and has to be assigned to in order to be usable is odd at first impression.
> The issue isn't about performance, it's about reducing bugs related to using move.
But it is. There is a balance between the guarantees that you want to provide and the cost that providing those guarantees imposes on the program.
The main use of functions taking rvalue-references is to take advantage of an rvalue that won't be used after this operation. Yes, you can bind an rvalue-reference to an lvalue and steal from the lvalue, but that is not the primary use case. Adding code that resets the string to a well known state would make every use of the move operations incur a cost, small as it may be, that is not needed.
Consider a large vector<string>, say 10M elements, and erasing the first element. Do you want to trigger 10M 'clear()' followed by move assignments? I'd rather avoid the 'clear' as the move assignments will guarantee the final value without going through some intermediate guaranteed to be empty state.
David
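The vector<string> erase scenario can be sketched by hand. This is an illustration under my own assumptions, not the library's actual erase implementation, and `erase_front` is a hypothetical name. Each moved-from slot is overwritten by the next move assignment, so resetting it in between would be wasted work.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hand-written stand-in for v.erase(v.begin()): shift every element one slot
// to the left by move assignment, then drop the last slot.
void erase_front(std::vector<std::string>& v) {
    if (v.empty()) {
        return;
    }
    for (std::size_t i = 1; i < v.size(); ++i) {
        // v[i] becomes valid but unspecified here and is overwritten on the
        // next iteration, so nothing ever reads its unspecified value.
        v[i - 1] = std::move(v[i]);
    }
    v.pop_back();  // the final moved-from element is simply destroyed
}
```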