I don't have a proposal or a real solution. This is more intended to highlight some of the problems related to unique ownership and move semantics.
1. It's too easy to accidentally use a moved-from object
Consider this code:
    void Container::addThing(unique_ptr<Thing> p) {
        this->_things.push_back(std::move(p));
        if (logging_enabled) {
            std::cout << "Added thing " << p->name() << " " << p->foo() << " " << p->bar() << std::endl;
        }
        return;
    }
I've seen this kind of bug many times. It's very easy to write, and the compiler provides no help: after the move p is null, so every p-> call in the logging branch is undefined behavior. Even worse, since the bug is hidden when logging is disabled, it could easily slip into production. It's also a subtle, expert-level problem that is hard to explain to novices.
Now here's an attempt to fix the bug the right way. But we still have a mistake! Again, no help from the compiler. And unless you're really paying attention, a quick skim through the code will likely miss this one.
    void Container::addThing(unique_ptr<Thing> p) {
        auto* pc = p.get();
        this->_things.push_back(std::move(p));
        if (logging_enabled) {
            std::cout << "Added thing " << pc->name() << " " << pc->foo() << " " << p->bar() << std::endl;
        }
        return;
    }
After we move p into _things, we never need to touch p again. Unfortunately, in this case and many others, we can't really introduce scopes with {} to remove p from the local scope and prevent these kinds of bugs.
A possible solution here might be some kind of [[discard]] attribute, or a std::move_final() that combines std::move() and [[discard]], so that the compiler would warn on any use of p after the move.
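Until something like [[discard]] exists, one pattern that sidesteps the trap is to make the move the last and only use of p, and do everything else through a reference to the stored object. This is only a sketch under assumptions: Thing here is a hypothetical stand-in with just name(), store() and describe() are hypothetical helpers, and addThing() returns the log line as a string instead of writing to std::cout so the result can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Thing type used in the post.
struct Thing {
    std::string label;
    const std::string& name() const { return label; }
};

class Container {
public:
    // The move is the only use of p in this function, so there is
    // nothing left to misuse afterwards.
    std::string addThing(std::unique_ptr<Thing> p) {
        return describe(store(std::move(p)));
    }

    std::size_t size() const { return _things.size(); }

private:
    // Hypothetical helper: takes ownership and hands back a reference
    // to the object that now lives inside the container.
    const Thing& store(std::unique_ptr<Thing> p) {
        assert(p != nullptr);
        _things.push_back(std::move(p));
        return *_things.back();
    }

    // Hypothetical helper: builds the log line from the stored object.
    static std::string describe(const Thing& t) {
        return "Added thing " + t.name();
    }

    std::vector<std::unique_ptr<Thing>> _things;
};
```

This does not give a compiler diagnostic; it only shrinks the window in which p can be misused. On the tooling side, clang-tidy's bugprone-use-after-move check is aimed at the original pattern, though it is a separate tool rather than a compiler warning.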
2. unique_ptr<T> and T* being different types can cause pessimizations because of language rules.
Of course it's a good thing that these types are different. One is an owner and one is not. Using a different type means we can enlist the help of the compiler to enforce correctness.
Sometimes I have a container which keeps a set of unique_ptrs, and I want to view that set.
    class OwningContainer {
    public:
        array_view<const unique_ptr<Thing>> getThings() const { return _things; }
    private:
        std::vector<unique_ptr<Thing>> _things;
    };

    class NonOwningContainer {
    public:
        array_view<const Thing*> getThings() const { return _things; }
    private:
        std::vector<Thing*> _things;
    };

    // Out of line function, maybe lives in a 3rd party library.
    void f(array_view<const Thing*> things);

    void g(const OwningContainer& c) {
        f(c.getThings()); // Compiler Error
    }

    void h(const NonOwningContainer& c) {
        f(c.getThings()); // Ok
    }
The getThings() methods of OwningContainer and NonOwningContainer have exactly the same semantics: we're getting a const view of the stored Thing pointers. With the default deleter, on typical implementations the returned objects are even bitwise identical and compile to the same machine code after optimization, but they are "marked up" by the compiler with different types.
From the limited perspective of the code calling getThings(), whether or not the container owns the pointers is an implementation detail. The caller does not and should not care whether the pointers are stored raw or in unique_ptr. They just want to view the collection and do something with it.
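The "bitwise identical" claim is not something the standard promises, but it can be checked on a given toolchain. A small sketch, assuming a hypothetical Thing and the default deleter; sameBits() is a made-up helper name:

```cpp
#include <cassert>
#include <cstring>
#include <memory>

struct Thing { int value; };  // hypothetical stand-in

// Holds on the mainstream standard libraries, but is an implementation
// detail rather than a guarantee.
static_assert(sizeof(std::unique_ptr<Thing>) == sizeof(Thing*),
              "unique_ptr with the default deleter is not pointer-sized here");

// Compares the object representation of the unique_ptr with the raw
// pointer it holds.
bool sameBits(const std::unique_ptr<Thing>& p) {
    assert(p != nullptr);
    Thing* raw = p.get();
    return std::memcmp(&p, &raw, sizeof(raw)) == 0;
}
```

Even where this check passes, the language still treats the two as unrelated types, so it does not make reinterpreting one as the other legal.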
The big problem here is that the return types of getThings() are different. This means that in a generic context, you need to start introducing templates in order to handle all possible pointer types. Adding templates complicates the code and slows down compile times.
Also, since const unique_ptr<T> and const T* are semantically and even bitwise identical, using templates here will unnecessarily bloat your code with 2 functions that do the exact same thing. This increases your binary size and puts more pressure on the icache.
In this example, we must change f() to be a template, even though the actual compiled machine code will be identical. The biggest problem I have with this example is that the problem comes from artificial language rules and not from physical limits on how hardware and memory work. That goes against the zero-overhead principle of C++.
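To make the cost concrete, here is roughly what the templated version looks like. This is a sketch under assumptions: array_view is not a standard type, so a minimal hypothetical one is defined here, Thing is a stand-in with a single field, and sumValues() plays the role of f() with an invented body.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Thing { int value; };  // hypothetical stand-in

// Minimal hypothetical array_view: a pointer plus a length.
template <class T>
class array_view {
public:
    template <class U>
    explicit array_view(const std::vector<U>& v)
        : data_(v.data()), size_(v.size()) {}

    T* begin() const { return data_; }
    T* end() const { return data_ + size_; }

private:
    T* data_;
    std::size_t size_;
};

// Stand-in for f(): it has to be a template only because the element
// type differs, not because the logic differs.
template <class Ptr>
int sumValues(array_view<const Ptr> things) {
    int total = 0;
    for (const Ptr& p : things) {
        assert(p != nullptr);
        total += p->value;
    }
    return total;
}
```

Calling sumValues() with a view over vector<unique_ptr<Thing>> and with a view over vector<Thing*> produces two instantiations of the same loop. An optimizer or linker may fold them, but nothing in the language requires it.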
In order to avoid making f() a template here, there are a few options today we can try with OwningContainer:
1. Return vector<Thing*> by value, doing a copy and memory allocations at every call. (very slow)
2. Store a second vector<Thing*> inside OwningContainer, keep it in sync with the unique_ptr version, and return it in getThings(). (twice the memory usage, slow, complicated, error prone)
3. Abandon unique_ptr inside of OwningContainer, and go back to manually managing the memory. (error prone and greatly increases development time)
Possible solutions to this include:
1. Add some kind of way to alias a unique_ptr<T> into a T*, letting me essentially convert an array_view<const unique_ptr<T>> to array_view<const T*>. In terms of how the implementation actually works on the machine, this is a trivial no-op. In terms of language rules it's a complete nightmare.
2. Provide a specialized unique_vector<T*> which essentially operates like vector<unique_ptr<T>>, but exposes T* in its const interface. Then we can construct array_view<const T*> over both vector<T*> and unique_vector<T*>, hiding the ownership implementation details and avoiding the need for artificial templates. This would solve the immediate example I've shown, but I'm not sure if it's too specific and leaves out other similar situations.
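A rough sketch of what option 2 could look like, to make the idea concrete. Everything here is hypothetical: unique_vector does not exist in the standard library, it is parameterized on the pointee type for brevity, the interface is cut down to the bare minimum, and a const std::vector<T*>& stands in for array_view.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in that counts live instances so cleanup is observable.
struct Thing {
    int value;
    static int& alive() { static int n = 0; return n; }
    explicit Thing(int v) : value(v) { ++alive(); }
    Thing(const Thing&) = delete;
    ~Thing() { --alive(); }
};

// Hypothetical owning vector: stores raw pointers, deletes them on
// destruction, and exposes them as plain T* to readers.
template <class T>
class unique_vector {
public:
    unique_vector() = default;
    unique_vector(const unique_vector&) = delete;
    unique_vector& operator=(const unique_vector&) = delete;
    ~unique_vector() {
        for (T* p : items_) delete p;
    }

    void push_back(std::unique_ptr<T> p) {
        assert(p != nullptr);
        items_.push_back(p.get());  // if this throws, p still owns the object
        p.release();                // hand over ownership only after success
    }

    // Read-only access uses the same type a non-owning container would use.
    const std::vector<T*>& view() const { return items_; }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<T*> items_;
};

// A non-template consumer that works for owning and non-owning storage.
int sumValues(const std::vector<Thing*>& things) {
    int total = 0;
    for (const Thing* t : things) total += t->value;
    return total;
}
```

The consumer no longer needs to be a template, at the price of reimplementing ownership by hand inside one container type, which is exactly the kind of code unique_ptr was meant to remove.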
How would you solve these 2 issues?
Have you seen any other complexities show up in your code from adopting unique_ptr?