std::optional - support for sentinel values

589 views
Skip to first unread message

Igor Baidiuk

unread,
Jun 29, 2017, 5:51:18 AM6/29/17
to ISO C++ Standard - Future Proposals
At the moment, std::optional type, in the way it was accepted by committee, always has storage block for its value and a flag field, which determines if actual value is present.
Although, there are often safe wrapper types (like std::reference_wrapper, or wrapper around OS handle, or a raw pointer) which have kind of "impossible state" on their own. For such types, it's perfectly possible to represent std::nullopt state as this "impossible" value:
  • Allow to define optional sentinel value for a type. Current approach in STL seems to be trait struct for the type and trait adaptor used in std::optional to access values (similar to std::allocator<T> + std::allocator_traits<Alloc>). There might be a requirement that sentinel value should be constexpr.
  • If type defines sentinel value, then std::optional's nullopt is represented with that sentinel value, and there's no additional flag

There are several benefits coming with this approach:

  • Space efficiency for types with "invalid value"
  • Type safety. Sentinel value simply cannot be used the same way as normal value. Roughly saying, no "null pointer dereference" issue

The other extension, orthogonal to first one, is to allow value to be stored using alternate representation. As an example, reference type can be stored as a pointer and simply presented as reference. This will also allow to have cheap and relatively easy support for storing references in std::optional, without need of completely separate type specialization.


Any thoughts on this?



Ville Voutilainen

unread,
Jun 29, 2017, 6:30:31 AM6/29/17
to ISO C++ Standard - Future Proposals
That's a type different from optional. And optional-with-sentinel that
considers a certain raw
pointer value to be the sentinel value is not the same thing as an
optional that adds a nullopt
state. The latter can't be made to act like the former the way you
suggest, because once a trait
is present, it can't be removed, or we have an odr violation.

You need a new type.

Igor Baidiuk

unread,
Jun 29, 2017, 6:47:51 AM6/29/17
to ISO C++ Standard - Future Proposals
Okay, this may be a new type. I doubt it'll be possible to alter std::optional definition, even with default tempate parameter. It's sad that includes still give us sticks in a wheel.
Although, I think the whole idea is worth discussing, as it allows more tools for safe programming.

Ville Voutilainen

unread,
Jun 29, 2017, 6:55:56 AM6/29/17
to ISO C++ Standard - Future Proposals
On 29 June 2017 at 13:47, Igor Baidiuk <targe...@gmail.com> wrote:
> Okay, this may be a new type. I doubt it'll be possible to alter
> std::optional definition, even with default tempate parameter. It's sad that
> includes still give us sticks in a wheel.

It has nothing to do with includes. Altering the definition of
optional to conditionally use
a trait would be remotely feasible, but it would also be an abi-breaking change.

> Although, I think the whole idea is worth discussing, as it allows more
> tools for safe programming.

Sure.

Igor Baidiuk

unread,
Jun 29, 2017, 7:09:16 AM6/29/17
to ISO C++ Standard - Future Proposals


On Thursday, June 29, 2017 at 1:55:56 PM UTC+3, Ville Voutilainen wrote:

It has nothing to do with includes. Altering the definition of
optional to conditionally use
a trait would be remotely feasible, but it would also be an abi-breaking change.


Maybe I didn't express it well enough. I meant the case when someone declares such "sentinel" locally for module, and then return that option to unit where sentinel is not known.
Then I realized this is well-possible even if new optional would have trait parameter.

Ville Voutilainen

unread,
Jun 29, 2017, 7:14:04 AM6/29/17
to ISO C++ Standard - Future Proposals
On 29 June 2017 at 14:09, Igor Baidiuk <targe...@gmail.com> wrote:
>> It has nothing to do with includes. Altering the definition of
>> optional to conditionally use
>> a trait would be remotely feasible, but it would also be an abi-breaking
>> change.
>>
>
> Maybe I didn't express it well enough. I meant the case when someone
> declares such "sentinel" locally for module, and then return that option to
> unit where sentinel is not known.
> Then I realized this is well-possible even if new optional would have trait
> parameter.

Right; multiple translation units must agree, which makes it an odr-problem, not
an include-problem. The trait-approach might still work, it just
requires that the trait
gives the same result in every TU, but injecting such a trait-approach
into the current
optional is an abi break.

Going for a separate type is safe from that abi break, which
automatically removes some
automatic opposition from the picture.

Tony V E

unread,
Jun 29, 2017, 7:16:52 AM6/29/17
to ISO C++ Standard - Future Proposals
std::optional<std::opt_sentineled<int,-1>>



Sent from my BlackBerry portable Babbage Device
  Original Message  
From: Ville Voutilainen
Sent: Thursday, June 29, 2017 7:13 AM
To: ISO C++ Standard - Future Proposals
Reply To: std-pr...@isocpp.org
Subject: Re: [std-proposals] std::optional - support for sentinel values
--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/CAFk2RUaVNMX9ktdTTTZywzmmOh6zJVShRcnL_ppJeFnFDpm0nw%40mail.gmail.com.

Igor Baidiuk

unread,
Jun 29, 2017, 7:50:11 AM6/29/17
to ISO C++ Standard - Future Proposals
If we introduce such "special wrapper", we'll lose in ergonomics, even if this will save us from ABI breakage.
As for ABI, major compilers like MSVC or GCC do ABI breakage on major version bump anyway.
Though I'd rather put that particular question aside for now. Let's imagine we're up to a new type.

Ville Voutilainen

unread,
Jun 29, 2017, 8:04:35 AM6/29/17
to ISO C++ Standard - Future Proposals
On 29 June 2017 at 14:50, Igor Baidiuk <targe...@gmail.com> wrote:
> If we introduce such "special wrapper", we'll lose in ergonomics, even if
> this will save us from ABI breakage.
> As for ABI, major compilers like MSVC or GCC do ABI breakage on major
> version bump anyway.

That's becoming less correct for msvc and is not at all correct for libstdc++.

> Though I'd rather put that particular question aside for now. Let's imagine
> we're up to a new type.

The type is certainly useful, because it's more compact than
std::optional is. I think it's worth
having in the standard, but I can imagine some saying "too many types"
without quantifying how
many is too many and what the deciding factor is for that. :)

Arthur O'Dwyer

unread,
Jun 29, 2017, 7:05:35 PM6/29/17
to ISO C++ Standard - Future Proposals
On Thursday, June 29, 2017 at 2:51:18 AM UTC-7, Igor Baidiuk wrote:
At the moment, std::optional type, in the way it was accepted by committee, always has storage block for its value and a flag field, which determines if actual value is present.
Although, there are often safe wrapper types (like std::reference_wrapper, or wrapper around OS handle, or a raw pointer) which have kind of "impossible state" on their own. For such types, it's perfectly possible to represent std::nullopt state as this "impossible" value:
  • Allow to define optional sentinel value for a type. Current approach in STL seems to be trait struct for the type and trait adaptor used in std::optional to access values (similar to std::allocator<T> + std::allocator_traits<Alloc>). There might be a requirement that sentinel value should be constexpr.
  • If type defines sentinel value, then std::optional's nullopt is represented with that sentinel value, and there's no additional flag

There are several benefits coming with this approach:

  • Space efficiency for types with "invalid value"
  • Type safety. Sentinel value simply cannot be used the same way as normal value. Roughly saying, no "null pointer dereference" issue


Mark Zeren's C++Now talk on strings will be interesting to you. Following that talk, I implemented his ideas for a "nullable trait" in my "STL From Scratch" under the name std::tombstone_traits<T>.
However, it hit some obstacles.

Here are the problems with your approach:
- Using a sentinel value of the type T is philosophically incorrect w.r.t. lifetime management. If I construct a disengaged optional, and then destroy it, I must not call the destructor of T. If I copy a disengaged optional, I must not call the copy constructor of T. If you unconditionally construct a T no matter whether the optional is engaged or disengaged, then what you have is philosophically not an optional<T>; it is literally a T.  If that's what you want, just write T.
- Using a sentinel value of type T is technically impossible if the type T has no "available" value; for example, bool. However, it is common for there to be "available" bit-patterns that are distinguishable from any possible value of T.  You raised the example of a reference_wrapper<T> in the all-bits-zero state.  For another example, there's std::vector in the state "ptr=null, size=42, capacity=0" — that's a bit-pattern that does not correspond to any value of type std::vector.
- In fact, if you can enumerate the available bit-patterns, you can store a lot of extra information in the unused bits of a T. My STL uses this fact to make sizeof(optional<optional<optional<bool>>>) == 1. Your approach would be completely unable to optimize this; you'd have to use at least 4 bytes.

Okay, now for the problems with my approach:
- The implementation of tombstone_traits<T> might need to be a friend of T, and certainly needs to be kept in sync with the implementation details of T, which is too bad, but not fatal.
- Because this approach relies on a kind of type-punning — a safe kind, AFAIK, but a kind — therefore it cannot be done constexpr, which means std::optional<T> is no longer usable in constexpr contexts. IMO this is absolutely fatal for pursuing this approach in the STL.  (Unless someone can come up with a way of doing it constexpr that I didn't think of...)

 

The other extension, orthogonal to first one, is to allow value to be stored using alternate representation. As an example, reference type can be stored as a pointer and simply presented as reference. This will also allow to have cheap and relatively easy support for storing references in std::optional, without need of completely separate type specialization.


This is optional<reference_wrapper<T>>.
As for why optional<T&> is a terrible, horrible, no-good, very bad idea, please see Matt Calabrese's talk from C++Now.

HTH,
Arthur

Igor Baidiuk

unread,
Jun 30, 2017, 10:21:35 AM6/30/17
to ISO C++ Standard - Future Proposals
On Friday, June 30, 2017 at 2:05:35 AM UTC+3, Arthur O'Dwyer wrote:
Mark Zeren's C++Now talk on strings will be interesting to you. Following that talk, I implemented his ideas for a "nullable trait" in my "STL From Scratch" under the name std::tombstone_traits<T>.
Thanks, will check it.

 
Here are the problems with your approach:
- Using a sentinel value of the type T is philosophically incorrect w.r.t. lifetime management. If I construct a disengaged optional, and then destroy it, I must not call the destructor of T. If I copy a disengaged optional, I must not call the copy constructor of T. If you unconditionally construct a T no matter whether the optional is engaged or disengaged, then what you have is philosophically not an optional<T>; it is literally a T.  If that's what you want, just write T.

I disagree.
Following your thought, C++ is already incorrect w.r.t. lifetimes - some value, which is moved out, is still accessible and requires destructor call.
Next, I'm thinking more from correctness and usability points of view.
The values used for sentinels are usually "invalid" type states. The closest example is what remains of value if it's moved out. In such case destructor is trivial.
Next, such "invalid" state simply cannot be used in place of normal value. Null pointer cannot be used to access object. In fact, pointers should have never been allowed to be dereferenced, from philosophical correctness point of view. Typed pointer is kind of abomination - it can be rebound to arbitrary location via simple arithmetic operations, and it can be used to access data like reference.
The approach with sentinel values is exactly for this purpose - it allows to reuse effective representation (the "free" bits you mention below) and provide safe interface to it - there can be normal value, or no sensible value at all.
 
- Using a sentinel value of type T is technically impossible if the type T has no "available" value; for example, bool. However, it is common for there to be "available" bit-patterns that are distinguishable from any possible value of T.  You raised the example of a reference_wrapper<T> in the all-bits-zero state.  For another example, there's std::vector in the state "ptr=null, size=42, capacity=0" — that's a bit-pattern that does not correspond to any value of type std::vector.
Of course, sentinel values don't exist for all the types on the table. And for vector, which you mention, empty state is perfectly valid - we don't care how it's represented, we simply shouldn't violate its contract.
 
- In fact, if you can enumerate the available bit-patterns, you can store a lot of extra information in the unused bits of a T. My STL uses this fact to make sizeof(optional<optional<optional<bool>>>) == 1. Your approach would be completely unable to optimize this; you'd have to use at least 4 bytes.
Let's imagine someone tries to take reference to doubly-optional bool from your triple-optional bool. And assign it some value. Also note that Some(Some(Some(false))), Some(Some(None)), Some(None) are three different state of your triply-optional bool. Even with space compression, you'll need at least 2 bytes.
 
 

The other extension, orthogonal to first one, is to allow value to be stored using alternate representation. As an example, reference type can be stored as a pointer and simply presented as reference. This will also allow to have cheap and relatively easy support for storing references in std::optional, without need of completely separate type specialization.


This is optional<reference_wrapper<T>>.
As for why optional<T&> is a terrible, horrible, no-good, very bad idea, please see Matt Calabrese's talk from C++Now.
This is disputable, true. I just remembered how some folks complained that optional<T&> isn't possible.
 

HTH,
Arthur

Matthew Woehlke

unread,
Jun 30, 2017, 11:44:57 AM6/30/17
to std-pr...@isocpp.org, Igor Baidiuk
On 2017-06-30 10:21, Igor Baidiuk wrote:
> On Friday, June 30, 2017 at 2:05:35 AM UTC+3, Arthur O'Dwyer wrote:
>> - In fact, if you can enumerate the available bit-patterns, you can store
>> a *lot* of extra information in the unused bits of a T. My STL uses this
>> fact to make sizeof(optional<optional<optional<bool>>>) == 1. Your
>> approach would be completely unable to optimize this; you'd have to use at
>> least 4 bytes.
>
> Let's imagine someone tries to take reference to doubly-optional bool from
> your triple-optional bool. And assign it some value.

I can imagine ways this can still work. (The complexity will depend on
what happens when you assign `true` to the inner-most `bool`, i.e. if it
stores 1 (0x1) or -1 (0xff).)

> Also note that
> Some(Some(Some(false))), Some(Some(None)), Some(None) are three different
> state of your triply-optional bool. Even with space compression, you'll
> need at least 2 bytes.

Er... why?

An `optional<T>` needs `bitsof(T)+1` bits of storage. A `bool` has two
legal states, so `bitsof(bool) == 1`. Thus, an `optional<bool>` needs 2
bits of storage.

Applying this recursively:

bitsof(optional<bool>) == 2
bitsof(optional<optional<bool>>) == 3
bitsof(optional<optional<optional<bool>>>) == 4

Last I checked, there are (probably / at least) 8 bits in a byte. I
hardly need *16* bits to store optional^3<bool>. In fact, I could
(theoretically) store up to optional^7<bool> in *one* byte.

--
Matthew

Arthur O'Dwyer

unread,
Jun 30, 2017, 12:49:58 PM6/30/17
to ISO C++ Standard - Future Proposals
On Fri, Jun 30, 2017 at 8:44 AM, Matthew Woehlke <mwoehlk...@gmail.com> wrote:
On 2017-06-30 10:21, Igor Baidiuk wrote:
> On Friday, June 30, 2017 at 2:05:35 AM UTC+3, Arthur O'Dwyer wrote:
>> - In fact, if you can enumerate the available bit-patterns, you can store
>> a *lot* of extra information in the unused bits of a T. My STL uses this
>> fact to make sizeof(optional<optional<optional<bool>>>) == 1. Your
>> approach would be completely unable to optimize this; you'd have to use at
>> least 4 bytes.
>
> Let's imagine someone tries to take reference to doubly-optional bool from
> your triple-optional bool. And assign it some value.

I can imagine ways this can still work. (The complexity will depend on
what happens when you assign `true` to the inner-most `bool`, i.e. if it
stores 1 (0x1) or -1 (0xff).)

Correct; and you mean "complexity" in the lines-of-code sense, not in the big-O sense, just so we're all on the same page. :)
My tombstone_traits doesn't care what the value-carrying representations are; all it needs to know is how many non-value-carrying representations it has to play with.
 
> Also note that
> Some(Some(Some(false))), Some(Some(None)), Some(None) are three different
> state of your triply-optional bool. Even with space compression, you'll
> need at least 2 bytes.

Er... why?

An `optional<T>` needs `bitsof(T)+1` bits of storage. A `bool` has two
legal states, so `bitsof(bool) == 1`. Thus, an `optional<bool>` needs 2
bits of storage.

Applying this recursively:

  bitsof(optional<bool>) == 2
  bitsof(optional<optional<bool>>) == 3
  bitsof(optional<optional<optional<bool>>>) == 4

You can do better (and I do).
With a single byte, we have 256 representations to play with.
2 of those representations are value-carrying: 0x00 means "false" and 0x01 means "true".  (If you let "true" be represented by 0xff, that's also fine; nothing changes.)  So that leaves representations 0x02 through 0xFF available for tombstones.

My std::optional<T> asks std::tombstone_traits<T> for the count of available tombstone representations. If T is int, then tombstone_traits<int>::spare_representations == 0, and so std::optional<int> will be composed of an int and a bool.

If T is bool, then tombstone_traits<int>::spare_representations == 254, and so std::optional<bool> will be composed of a single bool (which may or may not be constructed). Iff the optional is engaged, the contained bool's representation will be 0x00 or 0x01. Iff the optional is disengaged, the contained bool's representation (which is not a value-carrying representation) will be 0x02.  Representations 0x03 through 0xFF remain "spare", and this is reflected in std::tombstone_traits<std::optional<bool>>.

So now consider std::optional<std::optional<bool>>.  That's std::optional<U> where std::tombstone_traits<U>::spare_representations == 253.  And so again, std::optional<std::optional<bool>> will be composed of a single std::optional<bool> (that is, represented by a single byte) (which may or may not be constructed). Iff the inner optional<bool> is engaged, the bool's representation will be 0x00 or 0x01. Iff the inner optional<bool> is disengaged, the bool's representation will be 0x02. And otherwise the inner optional<bool> must not exist, which is to say, the outer optional<bool> must be disengaged; we borrow representation 0x03 for that.  Representations 0x04 through 0xFF remain "spare", and this is reflected in std::tombstone_traits<std::optional<std::optional<bool>>>.

This all works, btw; you can clone my from-scratch repo and play with it if you want.  The fatal problem with it is that optional<T> becomes non-constexpr.

 
Last I checked, there are (probably / at least) 8 bits in a byte. I
hardly need *16* bits to store optional^3<bool>. In fact, I could
(theoretically) store up to optional^7<bool> in *one* byte.

Try optional^254<bool>. :)

–Arthur

inkwizyt...@gmail.com

unread,
Jun 30, 2017, 1:24:04 PM6/30/17
to ISO C++ Standard - Future Proposals, targe...@gmail.com

16 because you need full byte object in optional otherwise you will get `std::vector<bool>` mk2.

Arthur O'Dwyer

unread,
Jun 30, 2017, 1:24:32 PM6/30/17
to ISO C++ Standard - Future Proposals
On Fri, Jun 30, 2017 at 7:21 AM, Igor Baidiuk <targe...@gmail.com> wrote:
On Friday, June 30, 2017 at 2:05:35 AM UTC+3, Arthur O'Dwyer wrote:
Mark Zeren's C++Now talk on strings will be interesting to you. Following that talk, I implemented his ideas for a "nullable trait" in my "STL From Scratch" under the name std::tombstone_traits<T>.
Thanks, will check it.

 
Here are the problems with your approach:
- Using a sentinel value of the type T is philosophically incorrect w.r.t. lifetime management. If I construct a disengaged optional, and then destroy it, I must not call the destructor of T. If I copy a disengaged optional, I must not call the copy constructor of T. If you unconditionally construct a T no matter whether the optional is engaged or disengaged, then what you have is philosophically not an optional<T>; it is literally a T.  If that's what you want, just write T.

I disagree.
Following your thought, C++ is already incorrect w.r.t. lifetimes - some value, which is moved out, is still accessible and requires destructor call.

Nope.  An object of type T, even in a "moved-from state", still has a value of type T.  That value might be unspecified; it might even have unusual restrictions on what you can do with it (although this is exceedingly rare); the most common case is that the object's value is some variety of "trivial value" (e.g. the empty std::string) or some variety of "empty value" (e.g. a std::future which is not valid(), or a std::function whose target_type() is typeid(void)).

The way you can tell that the object still has a value (and therefore is still of type T) is that you're required to call the destructor ~T() on it (usually implicitly). Every object of type T must be destroyed by calling its destructor; and vice versa, you're only ever allowed to call ~T() on an object that really is of type T.

 
Next, I'm thinking more from correctness and usability points of view.
The values used for sentinels are usually "invalid" type states. The closest example is what remains of value if it's moved out. In such case destructor is trivial.

Well, that's just false. Go look at the destructor of std::unique_lock sometime, or std::future, or std::string, or std::shared_ptr. What happens if you move-out-of such an object? Does its destructor magically become trivial?

 
Next, such "invalid" state simply cannot be used in place of normal value. Null pointer cannot be used to access object. In fact, pointers should have never been allowed to be dereferenced, from philosophical correctness point of view. Typed pointer is kind of abomination - it can be rebound to arbitrary location via simple arithmetic operations, and it can be used to access data like reference.

Here, I agree with you.  Nullable pointers are the "billion-dollar mistake".  And C++ gives us-the-user the power to correct that mistake!  Let's make a non-nullable pointer type:

class int_ptr {
    int *p_;
public:
    int_ptr() = delete;
    int_ptr(int *p) { REQUIRE(p); p_ = p; ASSERT(p_); }
    int_ptr(const int_ptr& rhs) { ASSERT(rhs.p_); p_ = rhs.p_; }
    int_ptr& operator=(const int_ptr& rhs) { ASSERT(p_ && rhs.p_); p_ = rhs.p_; }
    ~int_ptr() { ASSERT(p_); }
    int& operator*() const noexcept { return *p_; }
};

The copy operations and deleted zero-argument constructor are technically unnecessary, but I put them in so that we'd be able to talk sensibly about the invariants of this class type.
I claim that the invariant of this type is "It is never NULL, and therefore it is always safe to dereference with operator*()."  We have one "narrow contract" entrypoint that checks its arguments and throws (via the REQUIRE macro) if there's no way to build a legal (invariant-satisfying) int_ptr value out of those arguments.
Okay. So now I have a non-nullable pointer type where NULL is actually excluded from the domain of values of this type.

Show me how you'd build a space-efficient std::optional<int_ptr> using your mechanism.  I claim that you won't be able to do it, because, again, by definition, every legal value of int_ptr is a legal value of int_ptr. You're never going to find a legal value of int_ptr that is "available" for your use, just like you couldn't find one for bool.  A type always totally covers its domain, by definition: The domain of a type is the set of all possible values of that type.

Similarly, I claim that if you tried to build a std::optional<int *> — an optional that could hold a nullable pointer or else be disengaged — using your mechanism, you'd find it impossible to distinguish between

    std::optional<int*> o1 = std::nullopt;  // this optional is disengaged
    std::optional<int*> o2 = nullptr;  // this optional is engaged, and holds a nullable pointer with value NULL

If that's true, then what you've built isn't optional<int*>, because it can't hold "a nullable pointer or disengaged." What you've built is equivalent to either optional<int_ptr> (because it can hold "a non-nullable pointer or disengaged") or simply int* (because it can hold "a non-nullable pointer or null", which is to say, it can hold "a nullable pointer").  This is why I said in my original response:— If you mean "T", just say "T". Don't use the phrase "optional<T>" to mean "T"; that's just going to confuse people.

I've addressed your comments about tombstone_traits in my response to Matthew Woehlke. Hopefully that will help you see how it's implemented. The code is still available too. :)

–Arthur

Arthur O'Dwyer

unread,
Jun 30, 2017, 1:32:03 PM6/30/17
to ISO C++ Standard - Future Proposals
On Fri, Jun 30, 2017 at 10:24 AM, <inkwizyt...@gmail.com> wrote:

16 because you need full byte object in optional otherwise you will get `std::vector<bool>` mk2.

I hate vector<bool> as much as you do, I promise. :)  The analogy doesn't work here. vector<bool> is terrible because it purports to expose accessors returning bool& but its in-memory representation doesn't contain a full-byte bool anywhere, so it has to cheat and use proxy iterators and just generally mess things up.

In the case of std::optional<std::optional<bool>>, it purports to expose accessors returning bool& if and only if both the outer and inner optionals are engaged. When either optional is disengaged, the bool member does not exist and therefore there is no need to keep it around in memory (in fact, we are required by the Standard to destroy it[1]). Which means that we have a whole unused byte to play around with. We just need to make sure that when the optionals are engaged, the bool really truly exists — and I do make sure of that.

Notice that I left out the case where "both optionals are disengaged." That's because no such state exists. When the outer optional is disengaged, there is no inner optional — the inner optional is not in any state at all — it has been destroyed, and will not reappear until you engage the outer optional again.

An optional<T> which is disengaged does not hold a T.

–Arthur

[1] modulo the as-if rule

inkwizyt...@gmail.com

unread,
Jun 30, 2017, 2:03:31 PM6/30/17
to ISO C++ Standard - Future Proposals

How you exactly check what state is now without UB? You need poke bytes that could be `bool` or `byte` that store "engaged" levels.
Only way I see that could made this work is (but still this could be UB):
struct optional
{
   
union { bool v; char i; };
   
bool isSet()
   
{
       
bool t = true, f = false;
       
return memcmp(this, &t, 1) || memcmp(this, &f, 1);
   
}
}



Arthur O'Dwyer

unread,
Jun 30, 2017, 2:29:13 PM6/30/17
to ISO C++ Standard - Future Proposals
See the repository, please.  Basically, access the bytes as (unsigned char) to avoid UB.

--
You received this message because you are subscribed to a topic in the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this topic, visit https://groups.google.com/a/isocpp.org/d/topic/std-proposals/fghaQHk8oe8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to std-proposals+unsubscribe@isocpp.org.

To post to this group, send email to std-pr...@isocpp.org.

Barry Revzin

unread,
Jun 30, 2017, 3:28:12 PM6/30/17
to ISO C++ Standard - Future Proposals

This is optional<reference_wrapper<T>>.
As for why optional<T&> is a terrible, horrible, no-good, very bad idea, please see Matt Calabrese's talk from C++Now.


That's not at all the impression I got from that talk. One of the things he specifically mentions as a good idea is to support references for optional/variant, but internally wrap the references so that they behave as regular types (that is, like pointers. so operator= would rebind references, not assign through them).

Arthur O'Dwyer

unread,
Jun 30, 2017, 3:42:29 PM6/30/17
to ISO C++ Standard - Future Proposals
I'll have to re-watch it, then. I would say that "an optional of a regular type that behaves like a pointer" is a good idea, but that one is spelled optional<T*>.  Notice that part of behaving like a regular type is making sure that the semantics of operator== match the semantics of your copy-constructor: a copy of a value should compare == to the original value, and modifying the copy should not modify the original.  I'm pretty sure Matt talks about that pretty extensively.

The places where we put "foo<T&>" into the standard are either places where the rebinding question never comes up (e.g. std::promise<T&>) or places where the rebinding question does come up and the answer is "oh crap, maybe that was a bad idea" (e.g. std::tuple<T&>).  I said as much during Matt's talk (hence why I haven't watched the video).  IMO, if C++11 had had proper multiple-assignment, then it wouldn't have needed assign-throughable std::tie, and if it didn't have std::tie, then it wouldn't have had std::tuple<T&>.  For the cases where we use std::tie and std::forward_as_tuple today, we could equally well be using a "std::forwarding_tuple<T>" with perfect-forwarding read-through semantics and no operator= at all.  But, hindsight is 20/20.

–Arthur

Marc Mutz

unread,
Jul 1, 2017, 2:06:59 PM7/1/17
to std-pr...@isocpp.org
On 2017-06-30 01:05, Arthur O'Dwyer wrote:
[...]
> Okay, now for the problems with _my_ approach:
> - The implementation of tombstone_traits<T> might need to be a friend
> of T, and certainly needs to be kept in sync with the implementation
> details of T, which is too bad, but not fatal.

For a library-only solution, this is true. But a solution for the
standard could use compiler hooks to identify fixed-value bits (e.g.
LSBs of a sizeof(T) > 1 T*, MSBs of T*s which are outside the target
architecture's physical address space, ...) or corresponding bits of
objects containing such T*s) into which to stuff side information.

> - Because this approach relies on a kind of type-punning — a safe
> kind, AFAIK, but a kind — therefore it cannot be done constexpr,
> which means std::optional<T> is no longer usable in constexpr
> contexts. IMO this is absolutely fatal for pursuing this approach in
> the STL. (Unless someone can come up with a way of doing it constexpr
> that I didn't think of...)

Again, for a library-only solution, this is true (modulo your last
qualification). But as a std library feature, we could just *define*
constexpr functions that extract the side information from a type, to be
implemented as compiler hooks, if necessary (GCC seems to allow type
punning in constexpr functions already as an extension, so there it
would be library-only):

constexpr const T &get_object(const T &) noexcept;
constexpr T &get_object(T &) noexcept;
constexpr sideband<T> get_sideband(const T &) noexcept;

where sideband<T> is some kind of integral type that maps the
information that can be stored in the fixed-value bits of T into an
integer range [0, N[, get_object launders its argument from the sideband
information (masks the sideband bits of a T*), and get_sideband launders
its argument from the type's value-representing bits.

This exposes the sideband information storable in any T to library
writers (not just std::optional, but std::variant, too, and even Boost
and Qt).

Thanks,
Marc

Arthur O'Dwyer

unread,
Jul 1, 2017, 6:51:50 PM7/1/17
to ISO C++ Standard - Future Proposals
On Sat, Jul 1, 2017 at 11:01 AM, Marc Mutz <marc...@kdab.com> wrote:
On 2017-06-30 01:05, Arthur O'Dwyer wrote:
[...]
Okay, now for the problems with _my_ approach:
- The implementation of tombstone_traits<T> might need to be a friend
of T, and certainly needs to be kept in sync with the implementation
details of T, which is too bad, but not fatal.

For a library-only solution, this is true. But a solution for the standard could use compiler hooks to identify fixed-value bits (e.g. LSBs of a sizeof(T) > 1 T*, MSBs of T*s which are outside the target architecture's physical address space, ...) or corresponding bits of objects containing such T*s) into which to stuff side information.

I can see we're trying to solve the same problem here, but I think you'll run into trouble distinguishing "pre-punned" and "post-punned" pointer types, for example.  I mean if you have a user-defined class with the layout

struct TaggedPointer {
    int *p;
};

how is the compiler supposed to deduce how many bits of the "p" field are available for punning? And if the answer is "it can't", then we're back to square one — you haven't made any dent in the problem.

 

- Because this approach relies on a kind of type-punning — a safe
kind, AFAIK, but a kind — therefore it cannot be done constexpr,
which means std::optional<T> is no longer usable in constexpr
contexts. IMO this is absolutely fatal for pursuing this approach in
the STL.  (Unless someone can come up with a way of doing it constexpr
that I didn't think of...)

Again, for a library-only solution, this is true (modulo your last qualification). But as a std library feature, we could just *define* constexpr functions that extract the side information from a type, to be implemented as compiler hooks, if necessary (GCC seems to allow type punning in constexpr functions already as an extension, so there it would be library-only):

   constexpr const T &get_object(const T &) noexcept;
   constexpr T &get_object(T &) noexcept;
   constexpr sideband<T> get_sideband(const T &) noexcept;

where sideband<T> is some kind of integral type that maps the information that can be stored in the fixed-value bits of T into an integer range [0, N[, get_object launders its argument from the sideband information (masks the sideband bits of a T*), and get_sideband launders its argument from the type's value-representing bits.

Notice that my tombstone_traits doesn't store any data in "the padding bits" of a type; it actually counts the available representations and uses those. This is why I can store optional^254<bool> in a single byte, as opposed to just optional^7<bool>.
This is important because for example std::string has zero "padding bits":

struct string {
    char *p;
    int size;
    int cap;
};

but it does have approximately 8 billion billion "spare representations" (those where size > cap).  And fitting std::optional<std::string> in 16 bytes instead of 24 bytes would be a nice win.

This exposes the sideband information storable in any T to library writers (not just std::optional, but std::variant, too, and even Boost and Qt).

That's already my intention with tombstone_traits<T>: that it be exploitable by std::optional and also whoever else wants to use it. (One of my examples is an open-addressed hash table, although I haven't bothered to implement it yet. The hash table needs two spare representations: one for "empty" and one for "tombstone".)

Your "use by std::variant" idea is interesting. I thought about that a tiny bit a while back, but I couldn't see how to "take the intersection of" the spare representations of two different types. Of course taking the intersection is easy if you're dealing only in "padding bits", but if you deal in "padding bits" then you can't optimize std::optional<std::string>.  In fact I can't think of any run-of-the-mill types that have "padding bits" except for bool, whereas I can't think of any run-of-the-mill non-scalar types that don't have "spare representations".

Using tombstone_traits<T> with the partial specialization std::variant<T, monostate, monostate...> would be possible. I wonder if that's interesting.

–Arthur

Matt Calabrese

unread,
Jul 5, 2017, 10:24:28 AM7/5/17
to ISO C++ Standard - Future Proposals
On Thu, Jun 29, 2017 at 7:05 PM, Arthur O'Dwyer <arthur....@gmail.com> wrote:

The other extension, orthogonal to first one, is to allow value to be stored using alternate representation. As an example, reference type can be stored as a pointer and simply presented as reference. This will also allow to have cheap and relatively easy support for storing references in std::optional, without need of completely separate type specialization.


This is optional<reference_wrapper<T>>.
As for why optional<T&> is a terrible, horrible, no-good, very bad idea, please see Matt Calabrese's talk from C++Now.

Erm, that's not what I said in my talk. In fact, quite the opposite. I advocate support for optional<T&> 

Arthur O'Dwyer

unread,
Jul 5, 2017, 5:47:29 PM7/5/17
to ISO C++ Standard - Future Proposals
Huh!  Well, you fooled me.  What I took away from your talk was mostly the many examples of why value semantics and regular types are a good thing, and the subtlety of what it really means to be a "regular" type (for example, that copying ought to produce an identical, substitutable duplicate of the original, which implies that you can't be "regular" without supporting operator==).  I also remember the examples of how to mess things up by injecting "assign-through" types into generic value-semantic contexts, e.g.


int main() {
    int i=3, j=2, k=1;
    using assign_through = std::tuple<int&>;
    std::vector<assign_through> v = { {i}, {j}, {k} };
    std::sort(v.begin(), v.end());
    printf("%d %d %d\n", i, j, k);  // "3 3 3"
}

I thought you were holding up assign_through as a "bad", "irregular" type, not as a model.

I also recall (more hazily) discussion of the sins of vector<bool>, although that might have been a different talk.

Anyway, it sounds like I've misrepresented Matt's opinion. Let me backtrack and just say that I personally wouldn't support the addition of a type to the library which was (A) irregular and (B) spelled as if it were a specialization of an existing type that is regular.

–Arthur
Reply all
Reply to author
Forward
0 new messages