[boost] [optional] generates unnecessary code for trivial types


Hite, Christopher

Jan 25, 2012, 12:28:45 PM
to bo...@lists.boost.org
When decompiling my code I noticed a bunch of unnecessary code caused by boost::optional.

1) destruction
typedef boost::optional<int> optional_int;

void deconstruct_boost_optional(optional_int& o){
o.~optional_int();
}

One would expect this to do nothing. Instead gcc 4.6.0 with -O3 generates the equivalent of:

if(m_initialized){
    // do nothing
    m_initialized = false;
}

00000000 <deconstruct_boost_optional(boost::optional<int>&)>:
0: 8b 44 24 04 mov 0x4(%esp),%eax
4: 80 38 00 cmpb $0x0,(%eax)
7: 74 03 je c <deconstruct_boost_optional(boost::optional<int>&)+0xc>
9: c6 00 00 movb $0x0,(%eax)
c: f3 c3 repz ret


This one could easily be fixed by removing the store that sets m_initialized to false, since the object is being destroyed anyway.
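
A minimal sketch of the proposed fix, assuming a simplified optional: the class tiny_optional and its members are hypothetical stand-ins and not Boost's actual implementation. The destructor skips the store to the flag, while reset() keeps it because the object stays alive afterwards.

```cpp
#include <cassert>
#include <new>

// Hypothetical, simplified stand-in for boost::optional<T>; not Boost's code.
template <class T>
class tiny_optional {
public:
    tiny_optional() : m_initialized(false) {}
    explicit tiny_optional(const T& v) : m_initialized(false) {
        new (m_storage) T(v);
        m_initialized = true;
    }
    tiny_optional(const tiny_optional&) = delete;             // copying omitted in this sketch
    tiny_optional& operator=(const tiny_optional&) = delete;

    // Destructor: destroy the value if present, but never store to
    // m_initialized, since the flag dies together with the object.
    ~tiny_optional() {
        if (m_initialized) value().~T();
    }

    // reset() must still clear the flag because the optional lives on.
    void reset() {
        if (m_initialized) {
            value().~T();
            m_initialized = false;
        }
    }

    bool is_initialized() const { return m_initialized; }
    T& value() { return *reinterpret_cast<T*>(m_storage); }

private:
    bool m_initialized;
    alignas(T) unsigned char m_storage[sizeof(T)];
};

// Small usage example.
inline void demo_tiny_optional() {
    tiny_optional<int> o(13);
    assert(o.is_initialized() && o.value() == 13);
    o.reset();
    assert(!o.is_initialized());
}
```

For T = int the destructor body has no observable effect, so an optimizer is free to drop the test of the flag as well.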

2) assignment also generates these problems:

void assign_boost_optional(optional_int& o){
o=13;
}

Here there's a semantic issue: depending on whether o is already initialized, optional has to decide whether to use T's copy constructor or operator=. That branch is wasteful for POD types or any type for which has_trivial_copy<> holds.

3) Even more expensive is if we want to copy an optional<int>

void assign_boost_optional(optional_int& a,optional_int& b){
a=b;
}

00000000 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)>:
0: 8b 44 24 04 mov 0x4(%esp),%eax
4: 8b 54 24 08 mov 0x8(%esp),%edx
8: 80 38 00 cmpb $0x0,(%eax)
b: 74 0b je 18 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)+0x18>
d: 80 3a 00 cmpb $0x0,(%edx)
10: 75 16 jne 28 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)+0x28>
12: c6 00 00 movb $0x0,(%eax)
15: c3 ret
16: 66 90 xchg %ax,%ax
18: 80 3a 00 cmpb $0x0,(%edx)
1b: 74 09 je 26 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)+0x26>
1d: 8b 52 04 mov 0x4(%edx),%edx
20: c6 00 01 movb $0x1,(%eax)
23: 89 50 04 mov %edx,0x4(%eax)
26: f3 c3 repz ret
28: 8b 52 04 mov 0x4(%edx),%edx
2b: 89 50 04 mov %edx,0x4(%eax)
2e: c3 ret

Three possible branches! Theoretically a single 64-bit copy would do the job. I'm tempted to say: it would be best if, for any T, has_trivial_copy< optional<T> > iff has_trivial_copy<T>. It might make sense to make an exception for huge T, where copying an unused T is more expensive than the branching.


4) has_trivial_destructor<T> should imply has_trivial_destructor< optional<T> >, but this is hard to implement without specialization of optional.

Checking has_trivial_destructor might also take care of the complexity of optional<T&>, since has_trivial_destructor< T& > holds.
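
One way to get this without specializing optional itself is to move the storage into a base class that is specialized on the trait. The following is only a sketch under that assumption: optional_storage and sketch_optional are hypothetical names, and std::is_trivially_destructible is used as the modern counterpart of has_trivial_destructor.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <type_traits>

// Hypothetical storage base. Primary template: T needs a real destructor call.
template <class T, bool TrivialDtor = std::is_trivially_destructible<T>::value>
struct optional_storage {
    bool m_initialized = false;
    alignas(T) unsigned char m_bytes[sizeof(T)];

    ~optional_storage() {
        if (m_initialized) reinterpret_cast<T*>(m_bytes)->~T();
    }
};

// Specialization for trivially destructible T: no destructor is declared,
// so the storage (and anything deriving from it) stays trivially destructible.
template <class T>
struct optional_storage<T, true> {
    bool m_initialized = false;
    alignas(T) unsigned char m_bytes[sizeof(T)];
};

template <class T>
class sketch_optional : private optional_storage<T> {
public:
    sketch_optional() = default;
    explicit sketch_optional(const T& v) {
        new (this->m_bytes) T(v);
        this->m_initialized = true;
    }
    sketch_optional(const sketch_optional&) = delete;             // copying omitted in this sketch
    sketch_optional& operator=(const sketch_optional&) = delete;

    bool is_initialized() const { return this->m_initialized; }
    const T& get() const { return *reinterpret_cast<const T*>(this->m_bytes); }
};

static_assert(std::is_trivially_destructible<sketch_optional<int>>::value,
              "trivial T gives a trivially destructible optional");
static_assert(!std::is_trivially_destructible<sketch_optional<std::string>>::value,
              "non-trivial T still gets its destructor run");

// Small usage example.
inline void demo_sketch_optional() {
    sketch_optional<int> a(7);
    assert(a.is_initialized() && a.get() == 7);
    sketch_optional<std::string> s(std::string("hello"));
    assert(s.get() == "hello");
}
```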


I'd be willing to fix #1. The other issues need some discussion.

Chris


_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Kim Barrett

Jan 25, 2012, 5:22:34 PM
to bo...@lists.boost.org
On Jan 25, 2012, at 12:28 PM, Hite, Christopher wrote:
> When decompiling my code I noticed a bunch of unnecessary code caused by boost::optional.

I happen to have been looking at the source and generated code for boost::optional recently myself, so jumping in here with a few comments.

> 1) deconstruction
> typedef boost::optional<int> optional_int;
>
> void deconstruct_boost_optional(optional_int& o){
> o.~optional_int();
> }
>
> One would expect this to do nothing. Instead gcc 4.6.0 with O3 generates:
>
> if(m_initialized){
> // do nothing
> m_initialized = false;
> }
>
> 00000000 <deconstruct_boost_optional(boost::optional<int>&)>:
> 0: 8b 44 24 04 mov 0x4(%esp),%eax
> 4: 80 38 00 cmpb $0x0,(%eax)
> 7: 74 03 je c <deconstruct_boost_optional(boost::optional<int>&)+0xc>
> 9: c6 00 00 movb $0x0,(%eax)
> c: f3 c3 repz ret
>
>
> This one could be easily fixed by removing the bit that sets m_initialized to false, since we're deconstructing anyway.

This sounds right to me. Note that eliminating the assignment of m_initialized would (in this case of a trivial destructor for T) leave the entire clause controlled by the conditional empty after optimization, allowing the compiler to optimize away the conditional too.

What's going on here is that the destructor is calling the destroy() helper function, which does more work than the destructor actually needs, specifically setting m_initialized to false. Other callers of destroy() do need that assignment.

> 3) Even more expensive is if we want to copy an optional<int>
>
> void assign_boost_optional(optional_int& a,optional_int& b){
> a=b;
> }
>
> 00000000 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)>:
> 0: 8b 44 24 04 mov 0x4(%esp),%eax
> 4: 8b 54 24 08 mov 0x8(%esp),%edx
> 8: 80 38 00 cmpb $0x0,(%eax)
> b: 74 0b je 18 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)+0x18>
> d: 80 3a 00 cmpb $0x0,(%edx)
> 10: 75 16 jne 28 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)+0x28>
> 12: c6 00 00 movb $0x0,(%eax)
> 15: c3 ret
> 16: 66 90 xchg %ax,%ax
> 18: 80 3a 00 cmpb $0x0,(%edx)
> 1b: 74 09 je 26 <assign_boost_optional(boost::optional<int>&, boost::optional<int>&)+0x26>
> 1d: 8b 52 04 mov 0x4(%edx),%edx
> 20: c6 00 01 movb $0x1,(%eax)
> 23: 89 50 04 mov %edx,0x4(%eax)
> 26: f3 c3 repz ret
> 28: 8b 52 04 mov 0x4(%edx),%edx
> 2b: 89 50 04 mov %edx,0x4(%eax)
> 2e: c3 ret
>
> Three possible branches! Theoretically a single 64-bit copy would do the job. I'm tempted to say: it would be best if, for any T, has_trivial_copy< optional<T> > iff has_trivial_copy<T>. It might make sense to make an exception for huge T, where copying an unused T is more expensive than the branching.

I think the generated code gets somewhat simplified once issue (1) is addressed.

I think it would be a mistake to just blindly copy the value of b when b.m_initialized is false, if for no other reason than doing so will lead to endless user complaints about compiler and valgrind warnings. Also, invoking undefined behavior can result in the compiler doing very nasty and unexpected things, even in the absence of runtime issues from reading an "uninitialized" location. Consider the possibility that the compiler can prove that the optional being copied from is uninitialized, and so can conclude that the read of its value is undefined behavior. Probably the *best* one can hope for in such a situation is a compiler warning, and many far worse results are possible.

While I think this shouldn't be necessary from a theoretical standpoint, in a practical sense it might make the optimizer's job a little easier (and so increase the chances of getting the code you are looking for) to change the assign(optional) member functions that presently look something like

if (is_initialized())
    if (rhs.is_initialized())
        assign_value(...)
    else destroy()
else if (rhs.is_initialized())
    construct(...)
to instead be something like

if (rhs.is_initialized())
    if (is_initialized())
        assign_value(...)
    else construct(...)
else destroy()

or

if ( ! rhs.is_initialized())
    destroy()
else if (is_initialized())
    assign_value(...)
else construct(...)
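
For reference, the last variant can be written out as compilable code. This is only a sketch: flow_optional is a hypothetical, simplified class whose helper names mirror the pseudocode above, not Boost's actual source.

```cpp
#include <cassert>
#include <new>

// Hypothetical, simplified optional used only to illustrate the control flow.
template <class T>
class flow_optional {
public:
    flow_optional() : m_initialized(false) {}
    flow_optional(const T& v) : m_initialized(false) { construct(v); }
    flow_optional(const flow_optional& rhs) : m_initialized(false) {
        if (rhs.m_initialized) construct(rhs.get());
    }
    ~flow_optional() { if (m_initialized) get().~T(); }

    // Test the source first; each of the three outcomes is a single leaf.
    flow_optional& operator=(const flow_optional& rhs) {
        if (!rhs.is_initialized())
            destroy();
        else if (is_initialized())
            assign_value(rhs.get());
        else
            construct(rhs.get());
        return *this;
    }

    bool is_initialized() const { return m_initialized; }
    T& get() { return *reinterpret_cast<T*>(m_storage); }
    const T& get() const { return *reinterpret_cast<const T*>(m_storage); }

private:
    void construct(const T& v) { new (m_storage) T(v); m_initialized = true; }
    void assign_value(const T& v) { get() = v; }
    void destroy() {
        if (m_initialized) { get().~T(); m_initialized = false; }
    }

    bool m_initialized;
    alignas(T) unsigned char m_storage[sizeof(T)];
};

// Small usage example covering all three paths.
inline void demo_flow_optional() {
    flow_optional<int> a(1), b(2), empty;
    a = b;      // assign_value path
    assert(a.is_initialized() && a.get() == 2);
    a = empty;  // destroy path
    assert(!a.is_initialized());
    a = b;      // construct path
    assert(a.is_initialized() && a.get() == 2);
}
```

Whether this ordering actually produces fewer branches depends on the compiler and would have to be checked in the generated code.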

Simonson, Lucanus J

Jan 25, 2012, 6:20:46 PM
to bo...@lists.boost.org
I don't personally think that the style of programming that optional is intended for is suitable for high performance/performance critical situations in the first place. Pass by reference and return a bool for a conditional return value. Pass the bool and the object separately for a conditional argument. Pass or return a pointer and check if it is null. Yes, my advice really is to not use optional if you want performance. Even if we did everything you can think of to make optional fast you are still better off designing your interfaces in such a way that you don't need it if your goal is performance. That copy that you are counting branches in is probably unnecessary in the first place. Safety, on the other hand, is also important. All this looking at assembly code generated by optional smacks of premature optimization. If you agree with the idea that optional is valuable because of safety considerations, then write your application using optional without worrying much about performance and get the functionality right. Then measure your performance and optimize the places where it matters by stripping out usage of optional, or whatever else is slowing you down, so you get safety most of the time (with most of the benefit) and performance where you actually need it. Life is about tradeoffs. Optional will never be perfect.

I find that it is quite easy to write safe C++ interfaces without using optional, so I see no reason why you can't design code that is both safe and fast without it.

I know the author of optional and you haven't convinced me that we should bother him.

Regards,
Luke

Kim Barrett

Jan 25, 2012, 7:39:36 PM
to bo...@lists.boost.org
On Jan 25, 2012, at 6:20 PM, Simonson, Lucanus J wrote:
> I don't personally think that the style of programming that optional is intended for is suitable for high performance/performance critical situations in the first place. Pass by reference and return a bool for a conditional return value. Pass the bool and the object separately for a conditional argument. Pass or return a pointer and check if it is null. Yes, my advice really is to not use optional if you want performance.

All of the offered suggestions require the caller to construct an initial object that can be passed (by reference / pointer) to the callee for replacement. That may be either inefficient (object is expensive to construct) or impossible (caller doesn't have access to an appropriate constructor).

Hite, Christopher

Jan 26, 2012, 10:22:36 AM
to bo...@lists.boost.org
Thanks for your feedback.

> I think the generated code gets somewhat simplified once issue (1) is addressed.

It would help, but I think won't get rid of all the branches. Your refactoring might help more.

> I think it would be a mistake to just blindly copy the value of b when b.m_initialized is false,
> if for no other reason than doing so will lead to endless user complaints about compiler and
> valgrind warnings. Also, invoking undefined behavior can result in the compiler doing very
> nasty and unexpected things, even in the absence of runtime issues from reading an
> "uninitialized" location. Consider the possibility that the compiler can prove that the
> optional being copied from is uninitialized, and so can conclude that the read of its value
> is undefined behavior. Probably the *best* one can hope for in such a situation is a
> compiler warning, and many far worse results are possible.

Consider the completely legal code below:

struct cheap_optional_int{
cheap_optional_int() : m_initialized() {} // don't init m_data

bool m_initialized;
int m_data;
};

void assign_boost_cheap_optional_int(cheap_optional_int& a,cheap_optional_int& b){
a=b; // default impl
}

The compiler generates nothing but 32-bit moves from the source to the destination. This is completely fine for valgrind. It only complains if a branch is taken based on uninitialized data.

00000000 <assign_boost_cheap_optional_int(cheap_optional_int&, cheap_optional_int&)>:
0: 53 push %ebx
1: 8b 44 24 0c mov 0xc(%esp),%eax
5: 8b 58 04 mov 0x4(%eax),%ebx
8: 8b 08 mov (%eax),%ecx
a: 8b 44 24 08 mov 0x8(%esp),%eax
e: 89 08 mov %ecx,(%eax)
10: 89 58 04 mov %ebx,0x4(%eax)
13: 5b pop %ebx
14: c3 ret

Sorry the assembler is so poorly formatted after it's mailed.

The cool thing is cheap_optional_int has_trivial_destructor and has_trivial_copy because we haven't overridden the defaults.

Unfortunately, overriding the default ctor/dtor always breaks these traits, even if the code could be optimized out. It may not even be possible for a compiler to solve.
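
The effect can be checked at compile time. The sketch below uses std::is_trivially_copyable and std::is_trivially_destructible as the modern standard counterparts of has_trivial_copy and has_trivial_destructor; handwritten_optional_int is a hypothetical variant added only for comparison.

```cpp
#include <cassert>
#include <type_traits>

// The struct from the message above: all special members are compiler-generated.
struct cheap_optional_int {
    cheap_optional_int() : m_initialized() {}  // m_data deliberately left uninitialized
    bool m_initialized;
    int m_data;
};

// Hypothetical variant: the copy constructor and destructor are user-provided,
// even though they do nothing a compiler-generated version could not do.
struct handwritten_optional_int {
    handwritten_optional_int() : m_initialized() {}
    handwritten_optional_int(const handwritten_optional_int& o)
        : m_initialized(o.m_initialized), m_data(o.m_initialized ? o.m_data : 0) {}
    ~handwritten_optional_int() {}
    bool m_initialized;
    int m_data;
};

static_assert(std::is_trivially_copyable<cheap_optional_int>::value,
              "defaulted copy keeps the type trivially copyable");
static_assert(std::is_trivially_destructible<cheap_optional_int>::value,
              "defaulted destructor is trivial");
static_assert(!std::is_trivially_copyable<handwritten_optional_int>::value,
              "a user-provided copy constructor is never trivial");
static_assert(!std::is_trivially_destructible<handwritten_optional_int>::value,
              "a user-provided destructor is never trivial, even if empty");

// Small usage example: the traits are ordinary compile-time constants.
inline bool traits_differ() {
    return std::is_trivially_copyable<cheap_optional_int>::value &&
           !std::is_trivially_copyable<handwritten_optional_int>::value;
}
```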


Chris


Hite, Christopher

Jan 26, 2012, 10:28:47 AM
to bo...@lists.boost.org
> I don't personally think that the style of programming that optional is intended for is suitable for high performance/performance critical situations in the first place.

You may be right, but you're talking about different use cases. I've got protocol de/encoders, so I want a friendly high-level representation of messages that I can hand off between modules. Imagine a struct with an optional substruct.

Valid alternatives: a pointer to the substruct. Even if I can put the second structure on the stack, this might mean fewer cache hits. The total extra size also increases, from a bool to a pointer.

Another option sometimes possible is a nullable value. FAST-FIX's nullable integer for example increments all non-negative values and uses 0 to represent a null.
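
A minimal sketch of that kind of encoding, following only the description in the paragraph above: null maps to 0 and non-negative values are shifted up by one. The names nullable_int32, encode_nullable and decode_nullable are hypothetical, negative values are assumed to pass through unchanged, and the byte-level wire format of the protocol is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical in-memory representation of a nullable 32-bit integer.
struct nullable_int32 {
    bool has_value;
    std::int32_t value;
};

// Encode: null -> 0, non-negative v -> v + 1, negative v unchanged.
// The result is widened to 64 bits so that INT32_MAX + 1 cannot overflow.
inline std::int64_t encode_nullable(const nullable_int32& n) {
    if (!n.has_value) return 0;
    const std::int64_t v = n.value;
    return v >= 0 ? v + 1 : v;
}

// Decode: the inverse mapping, assuming the input came from encode_nullable.
inline nullable_int32 decode_nullable(std::int64_t encoded) {
    if (encoded == 0) return {false, 0};
    return {true, static_cast<std::int32_t>(encoded > 0 ? encoded - 1 : encoded)};
}

// Small usage example: round trip of a present zero and of a null.
inline void demo_nullable() {
    const nullable_int32 zero = decode_nullable(encode_nullable({true, 0}));
    assert(zero.has_value && zero.value == 0);
    assert(!decode_nullable(encode_nullable({false, 0})).has_value);
}
```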

Another option is to use a presence map at the top of a structure with one bit(or byte) per optional field. That might help with alignment.
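
As an illustration of the presence-map idea, here is a small sketch with one bit per optional field. The message type order_message and its fields are hypothetical and chosen only for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical message: one presence bit per optional field, fields stored inline.
struct order_message {
    enum field_bit { has_price = 1, has_quantity = 2 };

    std::uint8_t presence = 0;  // bitmask of field_bit values
    std::int32_t price = 0;
    std::int32_t quantity = 0;

    void set_price(std::int32_t p) { price = p; presence |= has_price; }
    void set_quantity(std::int32_t q) { quantity = q; presence |= has_quantity; }
    void clear_price() { presence &= static_cast<std::uint8_t>(~has_price); }

    bool price_present() const { return (presence & has_price) != 0; }
    bool quantity_present() const { return (presence & has_quantity) != 0; }
};

// No user-provided copy operations or destructor, so copying stays trivial.
static_assert(std::is_trivially_copyable<order_message>::value,
              "the whole message can be copied without branching");

// Small usage example.
inline void demo_presence_map() {
    order_message m;
    assert(!m.price_present() && !m.quantity_present());
    m.set_price(100);
    assert(m.price_present() && !m.quantity_present());
    m.clear_price();
    assert(!m.price_present());
}
```

The trade-off is that the presence checks are no longer tied to the field access, so the safety that optional gives has to be enforced by convention.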

> I find that it is quite easy to write safe C++ interfaces without using optional...

Yes I used optional because I knew it would do things correctly.

> you haven't convinced me

Just focus on #1 first. Not writing to m_initialized in the destructor would benefit all use cases of optional.

It can't be the solution to just not use Boost every time there's a performance issue.

Vicente J. Botet Escriba

Jan 26, 2012, 1:23:17 PM
to bo...@lists.boost.org
On 26/01/12 00:20, Simonson, Lucanus J wrote:
Hi,

the user cannot always redesign an interface that uses optional<T>, as he
may not be its owner (for example, when using third-party libraries).
I'm sure the author/maintainer of optional would adopt some patches if
they are proven to improve performance in some specific cases.

Best,
Vicente

Domagoj Saric

Jan 27, 2012, 11:29:51 AM
to bo...@lists.boost.org
On 26.1.2012. 0:20, Simonson, Lucanus J wrote:
> I don't personally think that the style of programming that optional is intended for is suitable for high performance/performance critical situations in the first place. Pass by reference and return a bool for a conditional return value. Pass the bool and the object separately for a conditional argument. Pass or return a pointer and check if it is null. Yes, my advice really is to not use optional if you want performance. Even if we did everything you can think of to make optional fast you are still better off designing your interfaces in such a way that you don't need it if your goal is performance. That copy that you are counting branches in is probably unnecessary in the first place. Safety, on the other hand, is also important. All this looking at assembly code generated by optional smacks of premature optimization. If you agree with the idea that optional is valuable because of safety considerations then write your application using optional and not worrying much about performance and get the functionality right then measure your performance and optimize the places where it matters by stripping out usage of optional or whatever else is slowing you down so you get safety most of the time (with most of the benefit) and performance where you actually need it. Life is about tradeoffs. Optional will never be perfect.
>
> I find that it is quite easy to write safe C++ interfaces without using optional, so I see no reason why you can't design code that is both safe and fast without it.

I see no reason why we can't have safe _and_ fast _and_ optional?

The rationale you gave is just typical premature pessimization apologetics that
also somehow assumes that C++ is "safe and slow" and that you have to go "bare
metal C" to have performance. Luckily that's just plain incorrect, to put it
mildly. Sadly, that rationale nonetheless also too often gives us such bloatware
as std::streams, lexical_cast or boost::filesystem...

When you design such a generic library how can there be "premature optimization"?


--
"What Huxley teaches is that in the age of advanced technology, spiritual
devastation is more likely to come from an enemy with a smiling face than
from one whose countenance exudes suspicion and hate."
Neil Postman

Domagoj Saric

Jan 27, 2012, 11:32:50 AM
to bo...@lists.boost.org
On 25.1.2012. 18:28, Hite, Christopher wrote:
> When decompiling my code I noticed a bunch of unnecessary code caused by boost::optional.

Hi, I've recently created an improved internal version of boost::optional to
help work around two issues:
- suboptimal codegen
- concurrent access.

You can now find this version under
https://svn.boost.org/svn/boost/sandbox/optional. So far the following has been
done:

a) the lifetime management bool was changed into a properly typed pointer (this
actually takes the same amount of space while it provides a no-op get_ptr()
member function as well as easier debugging as the contents of optional can
now clearly be seen through the pointer, as opposed to gibberish in an opaque
storage array)
b) added another conditional constructor that accepts an in-place factory
c) uses the safe bool idiom implementation from Boost.Range (which generates
better code on pre MSVC10 compilers)
d) skips redundant/dead stores of marking itself as uninitialised [including, but
not limited to, in its destructor (if it has one)]
e) streamlined internal assign paths to help the compiler avoid unnecessary
branching
f) added direct_create() and direct_destroy() member functions that allow the
user to bypass the internal lifetime management (they only assert correct
usage) in situations where the user's own external logic already implicitly
knows the state of the optional
g) optional now declares and defines a destructor only if the contained type has
a non-trivial destructor (this prevents the compiler from detecting false EH
states and thus generating bogus EH code)
h) optional marks itself as uninitialised _before_ calling the contained
object's destructor (this makes it a little more robust in race conditions;
it is of course not a complete solution for such scenarios, those require
external "help" and/or (m)-reference counting to be implemented)
i) extracted the "placeholder" functionality into a standalone class (basically
what would be left of optional<> if the lifetime management "bool" member and
logic was removed) so that it can be reused (e.g. for singleton like classes,
or when more complex custom lifetime management is required)
j) added compiler specific "aids" to work around situations when the compiler is
unable to detect that placement new will never return a nullptr (and then
generates bogus branching) - IOW "optional<int> optional_number( 3 );" no
longer generates a branch before storing "3" (yes "LOL":)
k) the lifetime management pointer is now stored after the actual contained
object (this helps in avoiding more complex/offset addressing when accessing
optionals through pointers w/o checking whether they are initialised)
l) removed support for antediluvian compilers (MSVC6, BCB5)

todo:

m) lifetime management policy: bool, pointer, reference count (+ a more generic
abstraction/interop with smart_ptr)...

n) zero size overhead for optional references (requires (m))

o) avoid branching in assignment and copy construction of optionals that hold
PODs smaller than N * sizeof( void * ) where N is some small number


- temporarily renamed to optional2 to avoid collision with the original
optional
- passes all optional unit tests (after being renamed back to optional) with
MSVC10 SP1 and Apple Clang 3.0 (from Xcode 4.2.1)


Hope it helps ;)


ps. AFAICT the only real obstacle in having really nice codegen with
boost::optional<a_fundamental_type> is lack of proper ABI/compiler support for
passing and returning small structs in registers...



Simonson, Lucanus J

Jan 27, 2012, 1:57:30 PM
to bo...@lists.boost.org
> I see no reason why we can't have safe _and_ fast _and_ optional?

I'm actually glad to see you putting effort into making that happen. The effort required is the only reason. Performance wasn't the reason I don't use optional, but for those who do use it I'm sure it will be valuable.

>The rationale you gave is just typical premature pessimization apologetics that
>also somehow assumes that C++ is "safe and slow" and that you have to go "bare
>metal C" to have performance. Luckily that's just plain incorrect, to put it
>mildly. Sadly, that rationale nonetheless also too often gives us such bloatware
>as std::streams, lexical_cast or boost::filesystem...

I said safe and fast C++ without optional, which isn't the same thing as "bare metal C". Bare metal C wouldn't qualify as safe. I'm as annoyed by the "C is faster than C++, ergo I never learned C++" guys as you are. Optional was implemented to be safe and slow because it was targeting safe and slow use cases. For POD types and anything that has a default constructor a std::pair<bool, T> seems fine to me.

Regards,
Luke

Andrey Semashev

Jan 27, 2012, 4:31:25 PM
to bo...@lists.boost.org
On Friday, January 27, 2012 18:57:30 Simonson, Lucanus J wrote:
>
> Optional was implemented to be safe and slow because it was targeting safe
> and slow use cases. For POD types and anything that has a default
> constructor a std::pair<bool, T> seems fine to me.

I'm failing to see why optional should be slow. I use it extensively, POD
types included, and I don't consider pair<bool, T> as a valid replacement.
I'll be glad if it gets optimized for POD types, why not?

Nevin Liber

Jan 27, 2012, 5:10:48 PM
to bo...@lists.boost.org
On 27 January 2012 12:57, Simonson, Lucanus J
<lucanus.j...@intel.com> wrote:

>  Optional was implemented to be safe and slow because it was targeting safe and slow use cases.

You are saying it is *deliberately* slow??

> For POD types and anything that has a default constructor a std::pair<bool, T> seems fine to me.

I don't want to write a different style of code depending on whether
or not a type T is default constructible. I can't easily pass things
like this to templates because I have to write special cases all over
the place. Optional models my intent.

I want to take advantage of RVO.

Some things that are default constructible are still very expensive to
construct (such as std::deque under gcc).

When you initialize class members, do you use member initializer lists
for default constructible types or do you just throw a bunch of
assignments in the body of the constructor? If the former, why do you
do it, given that default construction followed by assignment "seems
fine" to you?
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404

Nevin Liber

Jan 27, 2012, 5:14:08 PM
to bo...@lists.boost.org
On 27 January 2012 10:32, Domagoj Saric <domago...@littleendian.com> wrote:

> Hi, I've recently created an improved internal version of boost::optional to
> help workaround two issues:
>  - suboptimal codegen
>  - concurrent access.

Your changes sound interesting! (I'm not as sure about the
"concurrent access" stuff, but only because I haven't given it much
thought yet.)

Regards,


--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404


Olaf van der Spek

Jan 28, 2012, 7:34:29 AM
to bo...@lists.boost.org
On Fri, Jan 27, 2012 at 5:32 PM, Domagoj Saric
<domago...@littleendian.com> wrote:
> a) the lifetime management bool was changed into a properly typed pointer
>   (this actually takes the same amount of space while it provides a no-op

AFAIK bool and pointer aren't the same size. How can it still take the
same amount of space?

Olaf

Joshua Boyce

Jan 28, 2012, 9:14:08 AM
to bo...@lists.boost.org

My guess would be that the compiler promotes the size of a bool to that of
the native word size of the machine because the ease and speed of aligned
memory access outweigh the 'size savings' (as typically your object is
going to need to occupy an entire register, word on the stack, etc --
except when in an array, but that's actually another reason you want to
have the size of the object promoted, as again, unaligned memory access is
slow).

Afaik though, in code it will typically still be treated as if it were e.g.
1 byte on x86 (using the AL register instead of EAX), and simply ignoring
the high portion of the register.

Note: Not a compiler/optimization/cpu/etc expert. This is just my amateur
'guess'. If you are really curious, just compile a test and disassemble it
with GDB/WinDbg/IDA/etc, testing the codegen for various scenarios and
optimization flags.
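
Such a test can be sketched without a disassembler, since the question is about object size. The two layouts below are hypothetical approximations of "flag plus value" and "value plus pointer" and are not the actual members of either implementation. All sizes are implementation-defined; on a typical x86-64 ABI one would expect 8 and 16 bytes for T = int, and 8 and 8 bytes on typical 32-bit x86, because alignment padding after the bool already rounds the first layout up.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout of an optional that tracks its state with a bool.
template <class T>
struct flag_layout {
    bool initialized;
    T value;
};

// Hypothetical layout of an optional that tracks its state with a typed pointer.
template <class T>
struct pointer_layout {
    T value;
    T* ptr;  // null when empty, otherwise points at value
};

// Size difference in bytes between the two layouts for a given T.
template <class T>
constexpr std::size_t extra_bytes_for_pointer() {
    return sizeof(pointer_layout<T>) - sizeof(flag_layout<T>);
}

// Small usage example.
inline void demo_layouts() {
    // For a pointer-aligned T the padding after the bool makes both layouts equal.
    assert(extra_bytes_for_pointer<void*>() == 0);
    // For char the bool needs no padding, so the pointer version is larger.
    assert(sizeof(pointer_layout<char>) > sizeof(flag_layout<char>));
}
```

So the "same amount of space" claim can hold when T is at least pointer-aligned, while small types such as char are likely to grow.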

Dave Abrahams

Jan 30, 2012, 1:51:42 PM
to bo...@lists.boost.org

on Wed Jan 25 2012, "Simonson, Lucanus J" <lucanus.j.simonson-AT-intel.com> wrote:

> I don't personally think that the style of programming that optional
> is intended for is suitable for high performance/performance critical
> situations in the first place.

Why not? It seems like a great candidate for common compiler
optimizations.

> Pass by reference and return a bool for a conditional return value.
> Pass the bool and the object separately for a conditional argument.
> Pass or return a pointer and check if it is null. Yes, my advice
> really is to not use optional if you want performance.

Why?

> Even if we did everything you can think of to make optional fast you
> are still better off designing your interfaces in such a way that you
> don't need it if your goal is performance.

Why do you say that?

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com

Dave Abrahams

Jan 30, 2012, 1:58:07 PM
to bo...@lists.boost.org

I support this work. Optional should be optimal :-)

on Fri Jan 27 2012, Domagoj Saric <domagoj.saric-AT-littleendian.com> wrote:

> a) the lifetime management bool was changed into a properly typed pointer (this
> actually takes the same amount of space while it provides a no-op get_ptr()
> member function as well as easier debugging as the contents of optional can
> now clearly be seen through the pointer, as opposed to gibberish in an opaque
> storage array)

Seems to me this potentially makes optional<char> much bigger. No?

> b) added another conditional constructor that accepts an in-place factory
> c) uses the safe bool idiom implementation from Boost.Range (which generates
> better code on pre MSVC10 compilers)
> d) skips redundant/dead stores of marking itself as uninitialised [including but
> limited to, in its destructor (if it has one)]
> e) streamlined internal assign paths to help the compiler avoid unnecessary
> branching
> f) added direct_create() and direct_destroy() member functions that allow the
> user to bypass the internal lifetime management (they only assert correct
> usage) in situations where the user's own external logic already implicitly
> knows the state of the optional
> g) optional now declares and defines a destructor only if the contained type has
> a non-trivial destructor (this prevents the compiler from detecting false EH
> states and thus generating bogus EH code)
> h) optional marks itself as uninitialised _before_ calling the contained
> object's destructor (this makes it a little more robust in race conditions;

I generally disagree with this sort of defensive programming. Won't it just
mask bugs?

Please tell me that at least *some* C++ compiler does that nowadays...?

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com

Sebastian Redl

Jan 30, 2012, 3:00:43 PM
to bo...@lists.boost.org

All Linux compilers on x64 platforms follow the AMD64 ABI, possibly with minor variations/bugs. This ABI specifies that classes are passed in registers if
- they are trivially copyable and destructible (optional should be specialized for types that fulfill these criteria to ensure this),
- they have no virtual functions or bases,
- they are smaller than 2 qwords (4 qwords if all members are float, double, or SSE types), and
- they don't contain any weird stuff, like 80-bit long doubles or unaligned fields.
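
The first three criteria can be approximated with type traits. The helper below is a hypothetical, rough check that follows the list above; it is not the ABI's actual classification algorithm, it reads the size limit as "at most 16 bytes", and it ignores the rules about member types and alignment.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: rough approximation of the criteria listed above.
template <class T>
constexpr bool roughly_register_passable() {
    return std::is_trivially_copyable<T>::value &&
           std::is_trivially_destructible<T>::value &&
           !std::is_polymorphic<T>::value &&
           sizeof(T) <= 16;
}

// A flag plus an int with compiler-generated special members.
struct small_pod {
    bool m_initialized;
    int m_data;
};

// Same members, but a user-provided destructor, even an empty one,
// makes the type non-trivially destructible.
struct with_dtor {
    bool m_initialized;
    int m_data;
    ~with_dtor() {}
};

static_assert(roughly_register_passable<small_pod>(),
              "small trivially copyable struct passes the rough check");
static_assert(!roughly_register_passable<with_dtor>(),
              "a user-provided destructor fails the rough check");
```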

The Mac ABI for x64 is very close, though I don't know the differences.

The Win64 ABI is far less nice about registers. It passes the first four arguments in registers, and spills everything else onto the stack. It does not pack multiple values into a register. If a value is larger than 8 bytes, it is not split across registers. The ABI description says that "aggregates" can be passed in registers, but it doesn't elaborate on whether this refers to the C++ definition of aggregates (unlikely!) or whatever else the definition is. It sounds pretty useless.

I'm not aware of any x86-32 calling convention that passes classes of any kind in registers.

Sebastian

Sebastian Redl

Jan 30, 2012, 3:30:38 PM
to bo...@lists.boost.org

Correcting myself: the Common C++ ABI for x86-32 actually specifies that trivially copyable and destructible classes are treated just like simple values for parameter passing, so they can be passed and returned in registers. Of course, the far smaller register file of x86-32 makes that still not very useful.

Simonson, Lucanus J

Jan 30, 2012, 3:49:36 PM
to bo...@lists.boost.org
From: Dave Abrahams

>> I don't personally think that the style of programming that optional
>> is intended for is suitable for high performance/performance critical
>> situations in the first place.

>Why not? It seems like a great candidate for common compiler
>optimizations.

To some extent it depends what style of programming optional is intended for. What I had in mind was the highly object oriented defensive programming style that emphasizes safety often at the expense of performance in vogue around the time Java came out.

>> Pass by reference and return a bool for a conditional return value.
>> Pass the bool and the object separately for a conditional argument.
>> Pass or return a pointer and check if it is null. Yes, my advice
>> really is to not use optional if you want performance.

>Why?

I like pass by reference and return a bool over returning an optional for performance because we allocate memory for the result of the function outside of the function call and there is no transfer of ownership of the result. Even with move semantics, you have just changed an unnecessary copy into a cheaper unnecessary move.

>> Even if we did everything you can think of to make optional fast you
>> are still better off designing your interfaces in such a way that you
>> don't need it if your goal is performance.

>Why do you say that?

I don't trust the compiler to always inline what I want it to if it is busy inlining the optional function calls. The compiler heuristics for inlining can get overloaded and confused as the number of nested inline functions grows. There are no inline function calls to check the bool return value of a function or use the reference passed to the function. I believe that getting the ownership of the data at the right place in the code for performance is preferable to transferring ownership, even with move. It also helps the compiler optimize to be given less code that looks more like what you want the compiler to produce at the end so that it has less opportunity to fail to give you what you wanted. We can imagine an arbitrarily good compiler that always does what we intend, but a compiler that generates a branch for "if(m_initialized) m_initialized = false" is clearly not the ideal compiler we imagine.

I did come around to supporting optimization of optional; it might as well be as good a trade-off between safety and performance as we can make it. I don't use optional myself because I prefer alternative syntax for reasons of simplicity, convenience, fewer dependencies, etc., and not even for performance reasons.

Regards,
Luke

Joshua Boyce

Jan 30, 2012, 4:15:36 PM1/30/12
to bo...@lists.boost.org
On Tue, Jan 31, 2012 at 7:49 AM, Simonson, Lucanus J <
lucanus.j...@intel.com> wrote:

> From: Dave Abrahams
> >> I don't personally think that the style of programming that optional
> >> is intended for is suitable for high performance/performance critical
> >> situations in the first place.
>
> >Why not? It seems like a great candidate for common compiler
> >optimizations.
>
> To some extent it depends what style of programming optional is intended
> for. What I had in mind was the highly object oriented defensive
> programming style that emphasizes safety often at the expense of
> performance in vogue around the time Java came out.
>
>

But if we can maintain the same level of safety, while at the same time
increasing efficiency, doesn't that benefit everyone?

Kim Barrett

Jan 30, 2012, 4:29:20 PM1/30/12
to bo...@lists.boost.org
On Jan 30, 2012, at 3:49 PM, Simonson, Lucanus J wrote:
> I like pass by reference and return a bool over returning an optional for performance because we allocate memory for the result of the function outside of the function call and there is no transfer of ownership of the result.

Personally, I like returning values rather than modifying arguments. But more importantly, the caller might not even be able to construct that object to be passed by reference, due to lack of access to an appropriate combination of constructor and initialization arguments, such as when the class has no default constructor.

> Even with move semantics, you have just changed an unnecessary copy into cheaper unnecessary move.

If one cares about performance and one's compiler is not capable of doing RVO for optionals, perhaps one should be looking for a better compiler, and not just for better handling of optionals.
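
A minimal sketch of the two interface styles being compared, assuming a hypothetical `endpoint` class that has no default constructor. `std::optional` (C++17) stands in for `boost::optional` here, and every name in the block is invented for illustration.

```cpp
#include <optional>
#include <string>

// Hypothetical result type with no default constructor.
class endpoint {
public:
    endpoint(const std::string& host, int port) : host_(host), port_(port) {}
    const std::string& host() const { return host_; }
    int port() const { return port_; }
private:
    std::string host_;
    int port_;
};

// Shared helper: splits "host:port" and rejects malformed input.
inline bool split_host_port(const std::string& text, std::string& host, int& port) {
    const std::string::size_type colon = text.find(':');
    if (colon == std::string::npos || colon == 0 || colon + 1 == text.size())
        return false;
    int value = 0;
    for (std::string::size_type i = colon + 1; i < text.size(); ++i) {
        if (text[i] < '0' || text[i] > '9') return false;
        value = value * 10 + (text[i] - '0');
        if (value > 65535) return false;  // also keeps `value` from overflowing
    }
    host = text.substr(0, colon);
    port = value;
    return true;
}

// Style 1: out-parameter plus bool. The caller must already own an endpoint,
// so it has to invent constructor arguments for a placeholder object.
inline bool parse_endpoint_into(const std::string& text, endpoint& out) {
    std::string host;
    int port = 0;
    if (!split_host_port(text, host, port)) return false;
    out = endpoint(host, port);
    return true;
}

// Style 2: return an optional. No placeholder object is needed.
inline std::optional<endpoint> parse_endpoint(const std::string& text) {
    std::string host;
    int port = 0;
    if (!split_host_port(text, host, port)) return std::nullopt;
    return endpoint(host, port);
}
```

With style 1 the caller writes something like `endpoint e("placeholder", 0);` before the call, which is exactly the construction problem raised above; style 2 avoids it.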

Olaf van der Spek

Jan 30, 2012, 4:48:50 PM1/30/12
to bo...@lists.boost.org
On Mon, Jan 30, 2012 at 9:00 PM, Sebastian Redl
<sebasti...@getdesigned.at> wrote:
> All Linux compilers on x64 platforms follow the AMD64 ABI, possibly with minor variations/bugs. This ABI specifies that classes are passed in registers if

Does that also apply to functions that aren't exported? I'd assume the
compiler is free to do whatever it wants in that case.

Olaf

Dave Abrahams

Jan 30, 2012, 6:09:02 PM1/30/12
to bo...@lists.boost.org

on Mon Jan 30 2012, Kim Barrett <kab.conundrums-AT-verizon.net> wrote:

> On Jan 30, 2012, at 3:49 PM, Simonson, Lucanus J wrote:
>> I like pass by reference and return a bool over returning an
>> optional for performance because we allocate memory for the result
>> of the function outside of the function call and there is no
>> transfer of ownership of the result.
>
> Personally, I like returning values rather than modifying arguments.
> But more importantly, the caller might not even be able to construct
> that object to be passed by reference, due to lack of access to an
> appropriate combination of constructor and initialization arguments,
> such as when the class has no default constructor.
>
>> Even with move semantics, you have just changed an unnecessary copy into cheaper unnecessary move.
>
> If one cares about performance and one's compiler is not capable of
> doing RVO for optionals, perhaps one should be looking for a better
> compiler, and not just for better handling of optionals.

IIRC, RVO is now mandated where it's possible, so the whole move
argument is kinda moot.

--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com

Hite, Christopher

Feb 1, 2012, 11:53:22 AM2/1/12
to bo...@lists.boost.org
> On 27.1.2012. 11:32, Domagoj Saric wrote:

> a) the lifetime management bool was changed into a properly typed pointer (this
> actually takes the same amount of space while it provides a no-op get_ptr()
> member function as well as easier debugging as the contents of optional can
> now clearly be seen through the pointer, as opposed to gibberish in an opaque
> storage array)

I'd support this only if it were configurable. It takes more space for small or
non-word-aligned data. It might be more expensive on some systems to calculate
the address and store it.

I did think about defaulting to int_fast8_t for the bool if its alignment >=
alignment of T. On my x86 system it's still 1 byte, though. On some systems it
might help.

It would also break has_trivial_copy. If someone was naughty and memcopied them,
the new version would lead to a very hard to find bug.

As for the debugger, the new C++ allows a union to contain a class. So a
placeholder implementation using such a union would show the data in the debugger.

> d) skips redundant/dead stores of marking itself as uninitialised [including but
> not limited to, in its destructor (if it has one)]
> e) streamlined internal assign paths to help the compiler avoid unnecessary
> branching

Sounds like what I'm after.

> f) added direct_create() and direct_destroy() member functions that allow the
> user to bypass the internal lifetime management (they only assert correct
> usage) in situations where the user's own external logic already implicitly
> knows the state of the optional

Sounds good. I also wanted these.

> g) optional now declares and defines a destructor only if the contained type has
> a non-trivial destructor (this prevents the compiler from detecting false EH
> states and thus generating bogus EH code)

Yes, that's what I want.

> h) optional marks itself as uninitialised _before_ calling the contained
> object's destructor (this makes it a little more robust in race conditions;

> it is of course not a complete solution for such scenarios, those require
> external "help" and/or (m)-reference counting to be implemented)

Seems to contradict (g). I'd support something like that only if it can be
configured out. Maybe there's some case completely out of optional's scope
where you use atomic ops.

If you factor out the aligned storage you can build something else that does
ref-counting or a thread safe state machine or whatever.

> i) extracted the "placeholder" functionality into a standalone class (basically
> what would be left of optional<> if the lifetime management "bool" member and
> logic was removed) so that it can be reused (e.g. for singleton like classes,
> or when more complex custom lifetime management is required)

I 100% agree with this. I think there should be one placeholder implementation.
I think boost::function should use it as well. I think it may be useful to users.

> k) the lifetime management pointer is now stored after the actual contained
> object (this helps in avoiding more complex/offset addressing when accessing
> optionals through pointers w/o checking whether they are initialised)

Seems weird. If the front of T is more likely to be used (and old char buffer),
your pointer may wind up in a different cache line.

> o) avoid branching in assignment and copy construction of optionals that hold
> PODs smaller than N * sizeof( void * ) where N is some small number

Again it would be cool if the user had control over this.

I'm going to have to check out your code.


So the big thing I take away from all this is that it would be really nice if some
things were configurable. How do we do that without breaking code?

Changing the signature to optional<T,Properties=optional_traits<T> >, might
break code that uses boost::optional as a template template parameter.

You could just refer to optional_traits inside and force the user to specialize
it for his T, but that could create violations of the one definition rule.

Also is it OK for optional to depend on enable_if/SFINAE and type traits?

Chris

Domagoj Saric

Feb 2, 2012, 9:45:37 AM2/2/12
to bo...@lists.boost.org
On 28.1.2012. 13:34, Olaf van der Spek wrote:
> On Fri, Jan 27, 2012 at 5:32 PM, Domagoj Saric
> <domago...@littleendian.com> wrote:
>> a) the lifetime management bool was changed into a properly typed pointer
>> (this
>> actually takes the same amount of space while it provides a no-op
>
> AFAIK bool and pointer aren't the same size. How can it still take the
> same amount of space?

A "language lawyer" might be more precise but, for optional<T>:
- if the bool member is before the T member the compiler has to add
alignment_of<T>::value - sizeof( bool ) bytes of padding after the bool so that
the T member would be properly aligned
- if the bool member is after the T member the compiler has to add the same
amount of padding after the bool member to satisfy the requirement that there
are no "holes" between individual (properly aligned) instances of optional<T> in
arrays of optional<T>...

IOW, my statement in (a) does not for example hold for chars or shorts...
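
The padding argument can be checked with two stand-in layouts. This is only a sketch: the struct names are hypothetical, neither is the actual layout of `boost::optional`, and the sizes in the comments assume a typical ABI where `sizeof(bool)` is 1 and pointers are aligned to their own size.

```cpp
// Layout 1: a bool flag followed by the payload.
// The compiler pads after the flag so that `value` is properly aligned.
template <typename T>
struct flag_then_value {
    bool initialized;
    T value;
};

// Layout 2: the payload followed by a typed pointer (null would mean "empty").
// The struct is padded up to the pointer's alignment.
template <typename T>
struct value_then_pointer {
    T value;
    T* ptr;
};

// For a pointer-sized, pointer-aligned T both layouts occupy two words,
// so swapping the bool for a pointer costs nothing.
static_assert(sizeof(flag_then_value<void*>) == sizeof(value_then_pointer<void*>),
              "same size for word-sized payloads");

// For char the bool layout needs 2 bytes, while the pointer layout needs
// at least sizeof(char*) + 1, rounded up to the pointer's alignment.
static_assert(sizeof(flag_then_value<char>) < sizeof(value_then_pointer<char>),
              "pointer layout is larger for small payloads");
```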


--
"What Huxley teaches is that in the age of advanced technology, spiritual
devastation is more likely to come from an enemy with a smiling face than
from one whose countenance exudes suspicion and hate."
Neil Postman

Stewart, Robert

Feb 2, 2012, 10:08:28 AM2/2/12
to bo...@lists.boost.org
Hite, Christopher wrote:
>
> So the big thing I take away from all this is that it would be really
> nice if some things were configurable. How do we do that
> without breaking code?

Quite possibly, you'll need to introduce a new type that provides the configurability you want, while hardcoding backward compatible choices for the existing optional.

_____
Rob Stewart robert....@sig.com
Software Engineer using std::disclaimer;
Dev Tools & Components
Susquehanna International Group, LLP http://www.sig.com



Domagoj Saric

Feb 2, 2012, 10:18:49 AM2/2/12
to bo...@lists.boost.org
On 1.2.2012. 17:53, Hite, Christopher wrote:
>> On 27.1.2012. 11:32, Domagoj Saric wrote:
>
>> a) the lifetime management bool was changed into a properly typed pointer (this
>> actually takes the same amount of space while it provides a no-op get_ptr()
>> member function as well as easier debugging as the contents of optional can
>> now clearly be seen through the pointer, as opposed to gibberish in an opaque
>> storage array)
> I'd support this only if were configurable. It takes more space for small or
> non-word-aligned data.

True, I was planning on automatically deciding between bool and pointer based on
sizeof( T ) after adding lifetime management policy support (m)...


> It might be more expensive on some systems to calculate
> the address and store it.

How? You have to fetch the address either way...


> It would also break has_trivial_copy. If someone was naughty and memcopied them,
> the new version would lead to a very hard to find bug.

True, didn't think about trivial copy until Sebastian outlined the
pass-POD-in-register requirements of the AMD x64 ABI. WRT this, it boils down
to whether you want a no-op get_ptr() or your platform and compiler actually
support passing PODs in registers _and_ most of the types you store in optionals
actually satisfy the compiler/ABI requirements for that _and_ you mostly pass
and return those optionals by value...


> As for the debugger the new C++ allows for a union to contain a class. So if
> a placeholder implemention using such a union would show the data in debug.

But the pointer approach would also work with "real world" compilers ;)


>> h) optional marks itself as uninitialised _before_ calling the contained
>> object's destructor (this makes it a little more robust in race conditions;
>> it is of course not a complete solution for such scenarios, those require
>> external "help" and/or (m)-reference counting to be implemented)
> Seems to contradict (g). I'd support something like that only if it can be
> configured out. Maybe there's some case completely out of optional's scope
> where you use atomic ops.

It doesn't (contradict (g)), this applies only to situations where you actually
have to mark the optional as empty (such as when reset() is called).
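
A small sketch of the ordering described in (h), assuming a hypothetical `toy_slot` class (not the sandbox implementation): `reset()` clears the flag before running the payload's destructor, while the slot's own destructor does not touch the flag at all. The `probe` type exists only to observe the order. The flag is a plain `bool`, so nothing here is thread-safe.

```cpp
#include <new>

template <typename T>
class toy_slot {
public:
    toy_slot() : initialized_(false) {}
    ~toy_slot() {
        if (initialized_) value_.~T();  // destructor: the flag is never written
    }
    toy_slot(const toy_slot&) = delete;             // copying is out of scope here
    toy_slot& operator=(const toy_slot&) = delete;

    void emplace(const T& v) {
        reset();
        ::new (static_cast<void*>(&value_)) T(v);
        initialized_ = true;
    }

    void reset() {
        if (initialized_) {
            initialized_ = false;  // mark empty first ...
            value_.~T();           // ... then destroy the payload
        }
    }

    bool has_value() const { return initialized_; }

private:
    bool initialized_;
    union {
        char dummy_;  // keeps the union usable while no T is alive
        T value_;
    };
};

// Records, from inside its destructor, whether the owning slot already
// reported itself as empty.
struct probe {
    probe(const toy_slot<probe>* o, bool* s) : owner(o), saw_empty(s) {}
    ~probe() { *saw_empty = !owner->has_value(); }
    const toy_slot<probe>* owner;
    bool* saw_empty;
};
```

With this ordering a payload destructor that looks back at its slot sees it as empty; with the opposite ordering it would still see a live value that is in the middle of being destroyed.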


> If you factor out the aligned storage you can build something else that does
> ref-counting or a thread safe state machine or whatever.

With (m) I'd rather (in some distant future:) add a refcounting policy to
optional (or some future underlying more generic class) so that users don't have
to reimplement this...


>> k) the lifetime management pointer is now stored after the actual contained
>> object (this helps in avoiding more complex/offset addressing when accessing
>> optionals through pointers w/o checking whether they are initialised)
> Seems weird. If the front of T is more likely to be used (and old char buffer),
> your pointer may wind up in a different cache line.

Well yes, as I said this benefits only the cases where the pointer/bool is not
accessed (when an optional is accessed through a pointer/reference). IOW in
99.9% of real world cases the point is quite moot but it did make sense at a
particular stage of a project I'm working on (when you have dozens or hundreds
of template generated functions you can actually measure savings in code size
when you do even such micromanagement). It no longer matters for me but the
layout of optional2 is still like that (currently) purely because it turned out
like that (in the current stage of development) so I wrote point (k) nonetheless
just for the feedback ;)


> So the big thing I take away from all this is that it would be really nice if some things
> were configurable. How do we do that without breaking code?
>
> Changing the signature to optional<T,Properties=optional_traits<T> >, might
> break code that uses boost::optional as a template template parameter.

Judging for example from the rationale for the lack of smart_ptr configurability
or from the feedback I got for my improved boost::function proposal, it would be
very difficult for this type of configurability for optional to get accepted.

I was rather planning on making the best of optional with automatic/self
configuration based on properties of T and then later (in a galaxy far far
away:) propose an underlying library ("smart resource" or something like that),
that would separate the lifetime management and storage concerns in a maximally
configurable manner, on top of which the traditional optional and smart_ptr could be
built...



Domagoj Saric

Feb 2, 2012, 10:20:27 AM2/2/12
to bo...@lists.boost.org
On 30.1.2012. 19:58, Dave Abrahams wrote:
> I support this work. Optional should be optimal :-)

Or, optimal should not be optional (everything should be optimal :D).


>> a) the lifetime management bool was changed into a properly typed pointer (this
>> actually takes the same amount of space while it provides a no-op get_ptr()
>> member function as well as easier debugging as the contents of optional can
>> now clearly be seen through the pointer, as opposed to gibberish in an opaque
>> storage array)
>
> Seems to me this potentially makes optional<char> much bigger. No?

True (see my answer to Olaf and Christopher).


>> h) optional marks itself as uninitialised _before_ calling the contained
>> object's destructor (this makes it a little more robust in race conditions;
>
> I generally disagree with this sort of defensive programming. Won't it just
> mask bugs?

I generally disagree too, but only in cases where there is actual "defensive
programming", i.e. handling of invalid/buggy usage. The typical example is code
that asserts that a pointer is not null and then handles the case if it is.
There is none of that here. Imagine writing optional from scratch: at one point
you would have to decide the same thing, when to mark the optional as empty -
before or after calling the destructor. Either way you choose won't make a
difference (semantic or performance wise) for correct code. Incorrect code will
crash less. Isn't that a good thing (considering there is no actual handling of
incorrect code)?
Considering that "there is no bug free software" (one wonders about laser brain
surgery robots :), wouldn't it be better to "a priori crash less" and add
separate sanity checks for invalid concurrent access in order to catch bugs
(obviously this is more work and I don't know if any Boost component does
anything like this actually)?

Perhaps there is no "right" answer to this question and it's more a matter of
preference, so consider the above as "my 2 cents"...


ps. and yes, I forgot the buzz: (p) rvalue references support :)



Domagoj Saric

Feb 2, 2012, 10:41:50 AM2/2/12
to bo...@lists.boost.org
On 30.1.2012. 21:30, Sebastian Redl wrote:
>> All Linux compilers on x64 platforms follow the AMD64 ABI, possibly with minor variations/bugs. This ABI specifies that classes are passed in registers if
>> - they are trivially copyable and destructible (optional should be specialized for types that fulfill these criteria to ensure this),
>> - they have no virtual functions or bases,
>> - they are smaller than 2 qwords (4 qwords if all members are float, double, or SSE types), and
>> - they don't contain any weird stuff, like 80-bit long doubles or unaligned fields.
>>
>> The Mac ABI for x64 is very close, though I don't know the differences.

Thanks for the summary (didn't know there was a separate OS X x64 ABI).
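
The criteria quoted above can be approximated with a compile-time check. This is a rough sketch only: `roughly_register_passable` is a hypothetical helper that mirrors the bullet points as summarized in this thread, not the ABI's real classification algorithm (it ignores the 4-qword SSE case, `long double` and unaligned fields).

```cpp
#include <string>
#include <type_traits>

// Approximates "trivially copyable and destructible, no virtual functions,
// at most 2 qwords". Hypothetical helper, not part of any library.
template <typename T>
struct roughly_register_passable
    : std::integral_constant<bool,
          std::is_trivially_copyable<T>::value &&
          std::is_trivially_destructible<T>::value &&
          !std::is_polymorphic<T>::value &&
          sizeof(T) <= 2 * 8> {};

// Stand-ins for what an optional<int> and an optional<std::string> hold.
struct flag_and_int {
    bool initialized;
    int value;
};

struct flag_and_string {
    bool initialized;
    std::string value;
};

static_assert(roughly_register_passable<flag_and_int>::value,
              "a flag plus an int is small and trivial");
static_assert(!roughly_register_passable<flag_and_string>::value,
              "std::string is not trivially copyable");
```

The first bullet is the one optional controls: if optional<T> declares its own destructor or copy operations, it fails the check even when T itself would pass.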


>> The Win64 ABI is far less nice about registers. It passes the first four arguments in registers, and spills everything else onto the stack. It does not pack multiple values into a register. If a value is larger than 8 bytes, it is not split across registers. The ABI description says that "aggregates" can be passed in registers, but it doesn't elaborate on whether this refers to the C++ definition of aggregates (unlikely!) or whatever else the definition is. It sounds pretty useless.

Right, the Windows/MSVC x64 ABI is a major !?wth!?...I just can't think of a
reason why they had to invest resources into making their own ABI that is so
complicated and so inferior to the AMD proposed one (e.g. you can't pass an SSE
vector through an XMM register??).


> Correcting myself: the Common C++ ABI for x86-32 actually specifies that trivially copyable and destructible classes are treated just like simple values for parameter passing, so they can be passed and returned in registers. Of course, the far smaller register file of x86-32 makes that still not very useful.

Unfortunately I have never seen MSVC pass or return any struct through registers
even though it has interprocedural optimizations and link time code generation
capabilities so it can "invent" (as the documentation claims) its own calling
conventions for non exported functions. Don't know whether any other x86
compiler is able to do so...

In any case, the problem is that there is no nearly
portable/standard/wide-spread way (pragma, decl specifier...) to tell the
compiler to return small PODs in registers, especially not just for a particular
function and/or POD type. GCC has -freg-struct-return but that seems nearly
useless because it applies to the whole binary and so it requires the OS to be
built with that option.



Olaf van der Spek

Feb 2, 2012, 1:19:54 PM2/2/12
to bo...@lists.boost.org
On Thu, Feb 2, 2012 at 4:20 PM, Domagoj Saric
<domago...@littleendian.com> wrote:
> There is none of that here. Imagine writing optional from scratch, at one
> point you would have to decide the same thing, when to mark the optional as
> empty - before or after calling the destructor. Either way you choose won't
> make a difference (semantic or performance wise) for correct code. Incorrect
> code will crash less. Isn't that a good thing (considering there is no
> actual handling of incorrect code)?

Isn't this about the destructor of optional? Marking it as empty seems
unneeded there.

Olaf

Hite, Christopher

Feb 7, 2012, 4:36:56 PM2/7/12
to bo...@lists.boost.org
Sorry for not coming back quicker. I've been sick.

I did some experimenting in my own codebase with an "array_vector" which,
like vector, constructs things when they're added but, like boost::array, uses
a fixed-size array.

I tested the techniques I would use to improve optional. So I think I can
deliver this very small set of goals cleanly:

1) ~optional doesn't set m_initialized.

2) has_trivial_destructor<T> implies has_trivial_destructor<optional<T> >

3) has_trivial_copy<T> and has_trivial_assign<T> imply the same for optional<T>,
unless sizeof(T) exceeds some constant max_trivial_copy_Size, which
can also be overridden.

4) I'll define an optional_traits<T> with defaults and an
optional_with_traits<T,Traits=optional_traits<T> >
which can be used to make optionals which override features and from which
optional<T> will derive. That's the best compromise if I can't change
the signature of optional (Is Robert Stewart right?). I think we should use
the traits technique for any new libraries.

Thanks Sebastian Redl and Domagoj Saric for pointing out that (2) and (3)
may help some compilers put cheap optionals in registers.
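
Goals (1) and (2) can be sketched by selecting the storage base on a type trait. The names `toy_storage` and `toy_optional` are hypothetical and this is not the proposed patch; it uses the C++11 `std::is_trivially_destructible` where the Boost code of the time would use `has_trivial_destructor`, and a C++11 unrestricted union for the payload.

```cpp
#include <new>
#include <string>
#include <type_traits>

// General case: T's destructor must run, so the storage declares one.
template <typename T,
          bool TrivialDtor = std::is_trivially_destructible<T>::value>
struct toy_storage {
    toy_storage() : initialized(false) {}
    ~toy_storage() {
        if (initialized) value.~T();  // goal 1: no store to `initialized`
    }
    bool initialized;
    union {
        char dummy;
        T value;
    };
};

// Trivial case: no destructor is declared at all (goal 2).
template <typename T>
struct toy_storage<T, true> {
    toy_storage() : initialized(false) {}
    bool initialized;
    union {
        char dummy;
        T value;
    };
};

template <typename T>
class toy_optional : private toy_storage<T> {
public:
    toy_optional() {}
    explicit toy_optional(const T& v) {
        ::new (static_cast<void*>(&this->value)) T(v);
        this->initialized = true;
    }
    bool has_value() const { return this->initialized; }
    // Precondition: has_value() is true.
    const T& get() const { return this->value; }
    // Copying is implicit: trivial when T is trivially copyable, and
    // implicitly deleted otherwise because of the union member.
};

static_assert(std::is_trivially_destructible<toy_optional<int> >::value,
              "trivial T gives a trivially destructible optional");
static_assert(!std::is_trivially_destructible<toy_optional<std::string> >::value,
              "non-trivial T still gets a real destructor");
```

The same selection technique would extend to goal (3) by choosing a base on the copy and assignment traits plus a size limit.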

Shall I continue? Should I make a branch or do it in trunk?

Andrey Semashev

Feb 7, 2012, 5:00:04 PM2/7/12
to bo...@lists.boost.org
On Tuesday, February 07, 2012 22:36:56 Hite, Christopher wrote:
> Sorry for not coming back quicker. I've been sick.
>
> I did some experimenting in my own codebase with a "array_vector" which acts
> like vector constructs things when they're added, but like boost::array
> uses a fixed size array.
>
> I tested the techniques I would use to improve optional. So I think I can
> deliver this very small set of goals cleanly:
>
> 1) ~optional doesn't set m_initialized.
>
> 2) has_trivial_destructor<T> implies has_trivial_destructor<optional<T> >
>
> 3) has_trivial_copy<T> and has_trivial_assign<T> imply the same for optional<T>,
> unless sizeof(T) exceeds some constant max_trivial_copy_Size, which
> can also be overridden.
>
> 4) I'll define a optional_traits<T> with defaults and an
> optional_with_traits<T,Traits=optional_traits<T> >
> which can be used to make optionals which override features and from which
> optional<T> will derive. That's the best compromise if I can't change
> the signature of optional (Is Robert Stewart right?). I think we should use
> the traits technique for any new libraries.

Do I understand it correctly that optional_with_traits is an advanced
replacement for optional? If so, will the good old optional be optimized? I
think, it is possible to optimize the current optional without changing its
signature if we specialize optional_detail::optional_base on the types or
traits we're interested in.

BTW, I would really like to see optional< T& > optimized to store T*
internally.

> Shall I continue? Should I make branch or do it in trunk?

I think, a branch or sandbox is a good start.

Hite, Christopher

Feb 8, 2012, 1:05:26 PM2/8/12
to bo...@lists.boost.org
On Tuesday, February 07, 2012 17:00:04 Andrey Semashev wrote:
> Do I understand it correctly that optional_with_traits is an advanced
> replacement for optional? If so, will the good old optional be optimized?
No, optional will be ("isA") optional_with_traits. It's just a work around.
I'd prefer to redefine optional:
template<typename T, typename Traits=optional_traits<T> >
class optional;

That might, in rare cases, break user code like:
mpl::quote1<optional>

Personally I doubt that this is much of an issue, but to have the best of both
worlds I can define a temporary "optional_with_traits"; when Boost
goes to 2.0 we could deprecate it and add the parameter to optional.
template<typename T, typename Traits=optional_traits<T> >
class optional_with_traits;

template<typename T>
class optional : public optional_with_traits<T> {...};

Do we guarantee Boost users that templates will never add default parameters?
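
A sketch of the workaround and of the breakage it avoids, using hypothetical `toy_` names rather than the real Boost declarations. Under the rules in force at the time of this thread, a template with two parameters could not be bound to a `template <typename> class` parameter even when the second one has a default (C++17 later relaxed this), which is why the one-parameter wrapper keeps such code compiling.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical defaults; `max_trivial_copy_size` is an assumed knob name.
template <typename T>
struct toy_optional_traits {
    typedef std::integral_constant<std::size_t, 4 * sizeof(void*)>
        max_trivial_copy_size;
};

// Configurable implementation class (storage and logic elided).
template <typename T, typename Traits = toy_optional_traits<T> >
class toy_optional_with_traits {
public:
    typedef Traits traits_type;
};

// The public name keeps its one-parameter signature.
template <typename T>
class toy_optional : public toy_optional_with_traits<T> {};

// User code that takes optional as a template template parameter.
template <template <typename> class Opt>
struct int_holder {
    typedef Opt<int> type;
};

// Still well-formed because toy_optional has exactly one parameter.
typedef int_holder<toy_optional>::type optional_int;

// A user who wants different behaviour opts in explicitly.
struct never_trivial_copy {
    typedef std::integral_constant<std::size_t, 0> max_trivial_copy_size;
};
typedef toy_optional_with_traits<int, never_trivial_copy> tuned_optional_int;
```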

> BTW, I would really like to see optional< T& > optimized to store T*
> internally.

I'm going to say something provocative here. I agree with Lucanus. I see no
reason for optional<T&>. As far as I can tell you could use a T*. The only
justification I can think of is that on a system without memory protection you can
build checks into operator*().

Maybe if you're mixing code with old libraries where T* might imply ownership
you might use optional<T&> to imply no ownership and some temporary validity.

Perhaps we should define a new "smart pointer" called dumb_ptr<T> which can't
be assigned into auto_ptr,unique_ptr,shared_ptr, or any pointer type which
implies ownership.

Maybe I'm missing something, but I don't see the justification.

Chris

Andrey Semashev

Feb 8, 2012, 10:18:59 PM2/8/12
to bo...@lists.boost.org
On Wednesday, February 08, 2012 19:05:26 Hite, Christopher wrote:
>
> Do we guarantee Boost users that templates will never add default parameters?

This would be a breaking change, so yes, such a change should be avoided if
possible. Personally, I didn't grasp the benefit of using these traits to
justify the breakage. What will the traits be used for, and what will they provide?

> > BTW, I would really like to see optional< T& > optimized to store T*
> > internally.
>
> I'm going to say something provocative here. I agree with Lucanus. I see no
> reason for optional<T&>. As far as I can tell you could use a T*. The only
> justification I can think of is on system without memory protection you can
> build checks into operator*().
>

> Maybe I'm missing something, but I don't see the justification.

optional< T& > is a useful thing when you want to apply operators (such as
relation operators or streaming) to the referred value. In generic code you
don't have to specialize for pointers to do the right thing. I'm going to use
this property in my Boost.Log library.

Nevin Liber

Feb 9, 2012, 3:59:15 AM2/9/12
to bo...@lists.boost.org
On 8 February 2012 17:18, Andrey Semashev <andrey....@gmail.com> wrote:
> optional< T& > is a useful thing when you want to apply operators (such as
> relation operators or streaming) to the referred value. In generic code you
> don't have to specialize for pointers to do the right thing.

+1 here. Please keep the interfaces the same, unless you have a
*very* compelling reason not to.

(Unless, of course, you are one of those who *likes* vector<bool>... :-))


--
Nevin ":-)" Liber <mailto:ne...@eviloverlord.com> (847) 691-1404

Thorsten Ottosen

Feb 9, 2012, 4:24:54 AM2/9/12
to bo...@lists.boost.org
Den 09-02-2012 09:59, Nevin Liber skrev:
> On 8 February 2012 17:18, Andrey Semashev<andrey....@gmail.com> wrote:
>> optional< T& > is a useful thing when you want to apply operators (such as
>> relation operators or streaming) to the referred value. In generic code you
>> don't have to specialize for pointers to do the right thing.
>
> +1 here. Please keep the interfaces the same, unless you have a
> *very* compelling reason not to.

+1.

optional<T*> cannot optimize the bool away, because it can be null. So
optional<T&> is both more efficient and more handy.
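
A minimal sketch of both points, assuming a hypothetical `toy_optional_ref` (not the Boost implementation): an optional reference can use a null pointer as its empty state, so it needs no separate bool, and its comparison looks through to the referred values instead of comparing addresses the way a raw `T*` would.

```cpp
// Hypothetical optional-reference: the null pointer doubles as "empty".
template <typename T>
class toy_optional_ref {
public:
    toy_optional_ref() : ptr_(nullptr) {}
    explicit toy_optional_ref(T& r) : ptr_(&r) {}
    explicit operator bool() const { return ptr_ != nullptr; }
    // Precondition: the optional is not empty.
    T& operator*() const { return *ptr_; }
private:
    T* ptr_;
};

// Two empty optionals are equal; otherwise the referred values are compared.
template <typename T>
bool operator==(const toy_optional_ref<T>& a, const toy_optional_ref<T>& b) {
    if (!a || !b) return !a && !b;
    return *a == *b;
}

static_assert(sizeof(toy_optional_ref<int>) == sizeof(int*),
              "no extra flag is stored");
```

Two optionals referring to distinct objects with equal values compare equal here, whereas the corresponding raw pointers would not, which is the property generic code relies on.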

-Thorsten

Andrzej Krzemienski

Feb 9, 2012, 6:20:41 AM2/9/12
to bo...@lists.boost.org
> optional< T& > is a useful thing when you want to apply operators (such as
> relation operators or streaming) to the referred value. In generic code you
> don't have to specialize for pointers to do the right thing. I'm going to use
> this property in my Boost.Log library.

Andrey, would you mind giving us a short example of how you want to use an
optional reference? I am in the middle of designing the new optional
interface for TR2, and came to the conclusion that in order to avoid
counter-intuitive semantics for optional reference assignment, I had better
remove it altogether; that is, optional references are to be limited: they
should provide no assignment. You could still use optional values and
optional references in generic code, but with a reduced interface:

template <typename T> // T is a ref or a value
void use( std::tr2::optional<T> opt, T nval )
{
  if (opt) {
    std::cout << *opt; // fine
    *opt = nval; // fine, assigning T's not optionals
    opt = nval; // invalid if T is a ref
    opt = opt; // invalid if T is a ref
  }

  if (needToRebindAReference()) {
    opt.emplace(nval); // valid - always rebinds
  }
}

Would such a limited interface as I described above be enough for your
generic usage of optional?
Regards,
&rzej

Andrey Semashev

Feb 9, 2012, 7:04:44 AM2/9/12
to bo...@lists.boost.org
On Thu, Feb 9, 2012 at 3:20 PM, Andrzej Krzemienski <akrz...@gmail.com> wrote:
>
> Andrey, would you mind giving us a short example of how you want to use an
> optional reference? I am in the middle of designing the new optional
> interface for TR2, and came to the conclusion that in order to avoid
> counter-intuitive semantics for optional reference assignment, I had better
> remove it at all; that is, optional references are to be limited: they
> should provide no assignment. You could still use optional values and
> optional references in generic code but with reduced interface:
>
> template <typename T> // T is a ref or a value
> void use( std::tr2::optional<T> opt, T nval )
> {
>   if (opt) {
>     std::cout << *opt; // fine
>     *opt = nval; // fine, assigning T's not optionals
>     opt = nval; // invalid if T is a ref
>     opt = opt; // invalid if T is a ref
>   }
>
>   if (needToRebindAReference()) {
>     opt.emplace(nval); // valid - always rebinds
>   }
> }
>
> Would such a limited interface as I described above be enough for your
> generic usage of optional?

My code is not yet set in stone, and actually, it doesn't use optional
references yet. But it looks like it should be enough for my needs,
assuming optional<T&> is still going to support the same relation
operators as optional<T> with the same semantics. The general idea is
that I build a Boost.Phoenix expression (usually, a predicate or a
streaming expression) where some of the terminals may result in
optional references (references avoid expensive copying).
When the expression is executed, it should operate on the referred
values, if present, or execute fallback logic otherwise. In my
context, the required behavior is almost exactly what optional<T&>
provides.

I have a question though. Why prohibit opt = opt assignment? It looks
quite safe and has a fairly obvious behavior. If I have an optional
reference as a member of my class, the lack of assignment in optional
would force me to define a custom assignment operator for my class.
This seems to be an unnecessary requirement. Also, in the source code
I dealt with I often saw people writing something like opt =
optional<T>() to clear the value. This would break with references for
no apparent reason.

Domagoj Saric

Feb 9, 2012, 7:45:49 AM2/9/12
to bo...@lists.boost.org
On 7.2.2012. 22:36, Hite, Christopher wrote:
> I tested the techniques I would use to improve optional. So I think I can
> deliver this very small set of goals cleanly:
>
> 1) ~optional doesn't set m_initialized.
>
> 2) has_trivial_destructor<T> implies has_trivial_destructor<optional<T> >
>
> 3) has_trivial_copy<T> and has_trivial_assign<T> imply the same for optional<T>
> unless sizeof(T) exceeds some constant max_trivial_copy_Size, which
> can also be overridden.
>
> 4) I'll define an optional_traits<T> with defaults and an
> optional_with_traits<T,Traits=optional_traits<T> >
> which can be used to make optionals which override features and from which
> optional<T> will derive. That's the best compromise if I can't change
> the signature of optional (Is Robert Stewart right?). I think we should use
> the traits technique for any new libraries.
>
> Thanks Sebastian Redl and Domagoj Saric for pointing out that (2) and (3)
> may help some compilers put cheap optionals in registers.
>
> Shall I continue? Should I make a branch or do it in trunk?

The optional in sandbox (that passes regression tests) already does 1 and 2
(among many other things) so doing it from scratch again would be reinventing
the wheel.

ad 3) I would agree to such a compromise: that a bool be used for small PODs (so
that they get trivial copy and assign) and a pointer for everything else (so
that these get a no-op get_ptr() and nice debugging)...
[In my version PODs always/implicitly get "nice debugging" regardless of the
lifetime management implementation (bool/pointer/...).]
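A minimal sketch of what points 2 and 3 buy, assuming C++11 type traits (the 2012 Boost equivalents would be has_trivial_destructor and friends). The class name trivial_optional is hypothetical and is not the sandbox implementation; it only handles trivially copyable T and value-initializes the payload instead of using raw storage. Because no destructor or copy operation is user-provided, the compiler can copy the flag and the payload as one block of bytes:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical sketch, not Boost.Optional: a bool flag plus the payload,
// restricted to trivially copyable T so every special member stays implicit.
template <typename T>
class trivial_optional {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch only handles trivially copyable types");
public:
    trivial_optional() : initialized_(false), value_() {}
    trivial_optional(const T& v) : initialized_(true), value_(v) {}

    explicit operator bool() const { return initialized_; }
    const T& operator*() const { return value_; }

private:
    bool initialized_;
    T value_;  // value-initialized when empty; a real optional uses raw storage
};

// Triviality of T propagates to the wrapper (goals 2 and 3 in the list above).
static_assert(std::is_trivially_destructible<trivial_optional<int> >::value,
              "destructor should be trivial");
static_assert(std::is_trivially_copy_assignable<trivial_optional<int> >::value,
              "copy assignment should be trivial");

// Copies one optional into another without any branch on the flag.
inline int copy_and_read(int v) {
    trivial_optional<int> a(v);
    trivial_optional<int> b;
    b = a;
    return b ? *b : -1;
}
```

Whether a given compiler then keeps such an object in registers is a separate question that this sketch does not answer.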

ad 4) As said before, even though my personal prima facie stance is always "the
more configurability the better", it is highly unlikely (for reasons previously
given) that changing optional's signature would pass.
Given that, the best workaround IMO for such "ancient"/"written in stone"
constructs that suffer from the "Joe Sixpack" approach, i.e. they are good
enough for 90% use cases, is to:
- create a separate configurable construct and use it as an implementation
detail of the original construct that maximally auto-configures based on T
(improving the "good enough percentage" to "98%")
- provide global configuration (that overrides auto-configuration) for the
original construct (improving the "good enough percentage" to "99.8%")
...and the remaining "0.2%" can use the new construct directly...

So far this corresponds to your optional_with_traits approach except that I
don't think that providing global configuration by overriding/specializing the
default traits is the correct approach. As you noted, this can violate the ODR
and AFAIK users are not used to the idea that changing a _type_ can violate the ODR and
change the behaviour of another type. I'd rather use macros for that (e.g.
#define BOOST_OPTIONAL_MAX_BRANCHLESS_COPY_SIZE 4 * sizeof( void * )) because
programmers are already used/"trained" to be careful with macros WRT the ODR
_and_ because there already exist tools/compilers which can detect macro ODR
violations at link time (e.g. MSVC10)...
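A rough sketch of how such a macro knob could drive the auto-configuration, assuming C++11. Only the macro name comes from the message above; the trait use_branchless_copy and the two test structs are hypothetical names made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Global knob with a default; a project could define it before including
// the header (consistently in every translation unit, or the ODR bites).
#ifndef BOOST_OPTIONAL_MAX_BRANCHLESS_COPY_SIZE
#define BOOST_OPTIONAL_MAX_BRANCHLESS_COPY_SIZE (4 * sizeof(void*))
#endif

// Hypothetical trait: true when optional<T> may copy flag and payload
// unconditionally instead of branching on the flag.
template <typename T>
struct use_branchless_copy
    : std::integral_constant<bool,
          std::is_trivially_copyable<T>::value &&
          sizeof(T) <= BOOST_OPTIONAL_MAX_BRANCHLESS_COPY_SIZE> {};

struct small_pod { int a; int b; };        // small and trivially copyable
struct big_pod   { char bytes[1024]; };    // trivially copyable but too large
```

With the default limit, int and small_pod qualify, while big_pod (too large) and std::string (not trivially copyable) fall back to the branching path.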


--
"What Huxley teaches is that in the age of advanced technology, spiritual
devastation is more likely to come from an enemy with a smiling face than
from one whose countenance exudes suspicion and hate."
Neil Postman

Domagoj Saric

Feb 9, 2012, 7:50:16 AM2/9/12
to bo...@lists.boost.org

IMO it seems that, yes, you are making the same mistake as Lucanus, thinking
about "The Universe" only as/through your POV of your personal problem domain:
a) (optional models optionally holding an object) + (objects can be held by
value and by reference) = optional<T&> perfectly logical
b) creating special cases (e.g. for T&) creates special problems in generic code



Domagoj Saric

Feb 9, 2012, 8:06:56 AM2/9/12
to bo...@lists.boost.org
On 2.2.2012. 16:18, Domagoj Saric wrote:
> On 1.2.2012. 17:53, Hite, Christopher wrote:
>>> On 27.1.2012. 11:32, Domagoj Saric wrote:
>>> k) the lifetime management pointer is now stored after the actual contained
>>> object (this helps in avoiding more complex/offset addressing when accessing
>>> optionals through pointers w/o checking whether they are initialised)
>> Seems weird. If the front of T is more likely to be used (and old char buffer),
>> your pointer may wind up in a different cache line.
>
> Well yes, as I said this benefits only the cases where the pointer/bool is not
> accessed (when an optional is accessed through a pointer/reference). IOW in
> 99.9% of real world cases the point is quite moot but it did make sense at a
> particular stage of a project I'm working on (when you have dozens or hundreds
> of template generated functions you can actually measure savings in code size
> when you do even such micromanagement). It no longer matters for me but the
> layout of optional2 is still like that (currently) purely because it turned out
> like that (in the current stage of development) so I wrote point (k) nonetheless
> just for the feedback ;)

Actually I forgot a, personally, much more important reason why placing the
contained object at the beginning/same address as optional itself was more
desirable.
My optional use cases generally fall into two categories, optionals of
fundamental types (bools, ints and floats) and small PODs and optionals of
nontrivial GUI objects. The latter case usually looks like this (a compile-time
generated Model-View-Controller design where the "controller" is "short
circuited" for simplicity and efficiency):

template <typename T>
class Model
{
    optional<View<Model>> optionalGUI_;
};

Without a "controller", View<Model> needs to access its Model instance and
instead of storing a Model pointer it can simply deduce its address from its own
address (knowing that Views only ever exist as members of Models). When View is
inside an optional it first needs to calculate the address of
optional<View<Model>> from its own address and then the address of the Model
parent from the optional address. And the crux of the problem is: to calculate
the address of the optional it needs to know the layout of optional...

(Incidentally the current/original optional allowed for an ugly way to calculate
the offset of the contained object by using a helper class that derives from
optional_base...)
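For illustration, the one-level version of that address trick (no optional in between) can be sketched as below. This is an assumption-laden sketch: the non-template Model and View and the member names are hypothetical, both types must be standard-layout for offsetof to be usable, and the strict conformance of this "container of" idiom is debated even though it is widely used.

```cpp
#include <cassert>
#include <cstddef>

struct Model;

// The View stores no pointer to its Model; it relies on always being the
// `view` member of a Model.
struct View {
    Model& model();
    int widget_state;
};

struct Model {
    int data;
    View view;
};

// Step back from the View's own address by the offset of `view` in Model.
inline Model& View::model() {
    char* self = reinterpret_cast<char*>(this);
    return *reinterpret_cast<Model*>(self - offsetof(Model, view));
}

// Small self-check used to exercise the sketch.
inline bool view_finds_its_model() {
    Model m;
    m.data = 42;
    m.view.widget_state = 0;
    return &m.view.model() == &m && m.view.model().data == 42;
}
```

With an optional wrapped around the View, a second subtraction is needed for the payload's offset inside the optional, which is exactly the layout knowledge discussed above; if the payload sits at offset 0 of the optional, that second step disappears.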

Andrzej Krzemienski

Feb 9, 2012, 8:39:55 AM2/9/12
to bo...@lists.boost.org
> I have a question though. Why prohibit opt = opt assignment? It looks
> quite safe and has a fairly obvious behavior. If I have an optional
> reference as a member of my class, the lack of assignment in optional
> would force me to define a custom assignment operator for my class.
> This seems to be an unnecessary requirement. Also, in the source code
> I dealt with I often saw people writing something like opt =
> optional<T>() to clear the value. This would break with references for
> no apparent reason.

First, let me show why opt = nval; is controversial. Current semantics for
boost::optional<T&> is to rebind on assignment, which means that in the
following code:

int i = 1;
int j = 2;
optional<int&> oi = i;
oi = j;

The effect of this program is that i remains 1, j remains 2 and oi holds a
reference to j. This is surprising to those that think of optional as
delayed initialization. I lean towards disabling opt = opt because it is
very similar to opt = nval;

int i = 1;
int j = 2;
optional<int&> oi = i;
optional<int&> oj = j;
i = j;

The effect here is that i remains 1, j remains 2 and oi and oj both hold a
reference to j. You may find it less surprising but if you think of
optional reference as a regular reference that is initialized a bit later,
the behavior is not what you would expect.

Note that if you store a normal reference to int as class member, you
already have to write your assignment yourself. Changing a normal reference
to an optional reference should not come as something irregular.
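The point about plain reference members can be checked directly, assuming C++11 type traits; the two struct names are hypothetical. A reference member makes the implicit copy assignment operator deleted, while a pointer member does not:

```cpp
#include <cassert>
#include <type_traits>

// A class holding a plain reference: the compiler cannot generate
// copy assignment, because a reference cannot be reseated.
struct holds_reference {
    explicit holds_reference(int& r) : ref(r) {}
    int& ref;
};

// The same class holding a pointer: copy assignment is generated as usual.
struct holds_pointer {
    explicit holds_pointer(int& r) : ptr(&r) {}
    int* ptr;
};

static_assert(!std::is_copy_assignable<holds_reference>::value,
              "reference member deletes the implicit copy assignment");
static_assert(std::is_copy_assignable<holds_pointer>::value,
              "pointer member keeps the implicit copy assignment");
```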

Andrey Semashev

Feb 9, 2012, 9:05:30 AM2/9/12
to bo...@lists.boost.org
On Thu, Feb 9, 2012 at 5:39 PM, Andrzej Krzemienski <akrz...@gmail.com> wrote:
>
> First, let me show why opt = nval; is controversial.

I understand and agree that opt = nval may be ambiguous and even dangerous.

> I lean towards disabling opt = opt because it is
> very similar to opt = nval;
>
>  int i = 1;
>  int j = 2;
>  optional<int&> oi = i;
>  optional<int&> oj = j;
>  i = j;
>
> The effect here is that i remains 1, j remains 2 and oi and oj both hold a
> reference to j. You may find it less surprising but if you think of
> optional reference as a regular reference that is initialized a bit later,
> the behavior is not what you would expect.

I see your point but, IMHO, optional<T&> is too different from normal
references to attempt to draw this association. After all,
optional<T&> is an object in the sense it has internal state and you
can take its address while T& is not (one may call it an alias of an
object). It is therefore logical that optional<T&>::operator= operates
on the object (i.e. on the optional contents) and hypothetical
T&::operator= operates on the referred object.

> Note that if you store a normal reference to int as class member, you
> already have to write your assignment yourself. changing a normal reference
> to an optional reference should not come as something irregular.

Again, I tend to see optional<T&> as an object, with no apparent
reason why it cannot be assigned to.

Andrzej Krzemienski

Feb 9, 2012, 9:33:03 AM2/9/12
to bo...@lists.boost.org
> I see your point but, IMHO, optional<T&> is too different from normal
> references to attempt to draw this association. After all,
> optional<T&> is an object in the sense it has internal state and you
> can take its address while T& is not (one may call it an alias of an
> object). It is therefore logical that optional<T&>::operator= operates
> on the object (i.e. on the optional contents) and hypothetical
> T&::operator= operates on the referred object.
> > Note that if you store a normal reference to int as class member, you
> > already have to write your assignment yourself. changing a normal reference
> > to an optional reference should not come as something irregular.
>
> Again, I tend to see optional<T&> as an object, with no apparent
> reason why it cannot be assigned to.

I understand (I think) your point of view. Let me clarify one thing. I am
thinking of disabling the assignment not because I think it does not belong
to references, but because there are two ways of implementing it, and
implementing it either way would surprise a different group of programmers.
And this would be a "run-time surprise". Instead my choice (not necessarily
the best one) is to provide a "compile-time surprise".
With your view of optional reference (if I got it right it is a pointer
with a somewhat different syntax) your expectation of rebinding assignment
comes as natural. With a different model of optional reference a
non-rebinding assignment comes as more natural. I believe that
optional<reference_wrapper<T>> would serve your purpose best. Or would it
also introduce the lack of uniformity?
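The two candidate semantics can be put side by side in a small sketch. Both wrapper types are hypothetical stand-ins built on a raw pointer; neither is boost::optional<T&>, and the "empty" state is left out to keep the contrast visible:

```cpp
#include <cassert>

// Model 1: assignment rebinds, like a pointer with reference syntax.
struct rebinding_ref {
    int* target;
    explicit rebinding_ref(int& r) : target(&r) {}
    rebinding_ref& operator=(int& r) { target = &r; return *this; }
};

// Model 2: assignment writes through, like a built-in reference.
struct assign_through_ref {
    int* target;
    explicit assign_through_ref(int& r) : target(&r) {}
    assign_through_ref& operator=(const int& v) { *target = v; return *this; }
};

// Returns the value of i after `oi = j` under the rebinding model.
inline int i_after_rebinding() {
    int i = 1, j = 2;
    rebinding_ref oi(i);
    oi = j;            // oi now refers to j; i is untouched
    return i;
}

// Returns the value of i after `oi = j` under the assign-through model.
inline int i_after_assign_through() {
    int i = 1, j = 2;
    assign_through_ref oi(i);
    oi = j;            // writes j's value into i
    return i;
}
```

The same statement `oi = j` leaves i at 1 in the first model and changes it to 2 in the second, which is the run-time surprise either group of programmers would hit.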

paul Fultz

Feb 9, 2012, 10:48:47 AM2/9/12
to bo...@lists.boost.org

Actually, you could just take the optional_traits as the first parameter. So you define
optional<T> or optional<optional_traits<my_traits<T> > >. Then optional would be
specialized for optional_traits that will get the user-defined traits.

Andrey Semashev

Feb 9, 2012, 12:00:07 PM2/9/12
to bo...@lists.boost.org
On Thursday, February 09, 2012 15:33:03 Andrzej Krzemienski wrote:

> I believe that
> optional<reference_wrapper<T>> would serve your purpose best. Or would it
> also introduce the lack of uniformity?

Interesting. I'm not sure it's going to work because reference_wrapper won't
have T's operators. The compiler may find the operators via ADL, but calling
them will require an implicit cast from reference_wrapper to T&, which may
mess up overload resolution. I think, optional<T&> is closer to my needs.

Andrey Semashev

Feb 9, 2012, 12:54:42 PM2/9/12
to bo...@lists.boost.org
On Thursday, February 09, 2012 15:33:03 Andrzej Krzemienski wrote:
> I believe that
> optional<reference_wrapper<T>> would serve your purpose best. Or would it
> also introduce the lack of uniformity?

One additional note on this. I would like optional<T&> to be perfectly
implementable as a wrapper around T*. optional<reference_wrapper<T>> does not
allow this, at least not with its generic interface, since optional<T>::get()
returns T&, which effectively forces optional<reference_wrapper<T>> to store
reference_wrapper internally (along with the value presence flag).
Specializing optional on std::reference_wrapper does not solve the problem
entirely because there is also boost::reference_wrapper. It would be odd if
optional worked differently with different reference_wrappers.

Domagoj Saric

Feb 9, 2012, 2:01:08 PM2/9/12
to bo...@lists.boost.org
"Domagoj Saric" wrote in a message to the newsgroup:
jh0f5v$ljr$1...@dough.gmane.org...

> So far this corresponds to your optional_with_traits approach except that
> I don't think that providing global configuration by
> overriding/specializing the default traits is the correct approach. As you
> noted, this can violate the ODR and AFAIK users are not used that changing
> a _type_ can violate the ODR and change the behaviour of another type.

Or I might just be babbling :) That's what traits are for (when per type as
opposed to per instantiation configuration is enough/desired)...

Domagoj Saric

Feb 9, 2012, 2:09:52 PM2/9/12
to bo...@lists.boost.org
"paul Fultz" wrote in a message to the newsgroup:
1328802527.475...@web112602.mail.gq1.yahoo.com...

> Actually, you could just take the optional_traits as the first parameter.
> So you define
> optional<T> or optional<optional_traits<my_traits<T> > >. Then optional
> would be
> specialized for optional_traits that will get the user-defined traits.

(possibly a bit of work to still get the special trivial destructor and
assignment functionality in the specialization, but) Clever ;)



paul Fultz

Feb 9, 2012, 3:58:48 PM2/9/12
to bo...@lists.boost.org
----- Original Message -----

> From: Domagoj Saric <dsa...@gmail.com>
> To: bo...@lists.boost.org
> Cc:
> Sent: Thursday, February 9, 2012 2:09 PM
> Subject: Re: [boost] [optional] generates unnessesary code for trivial types
>

> "paul Fultz" wrote in a message to the newsgroup:
> 1328802527.475...@web112602.mail.gq1.yahoo.com...
>> Actually, you could just take the optional_traits as the first parameter.
>> So you define
>> optional<T> or optional<optional_traits<my_traits<T> > >. Then optional
>> would be
>> specialized for optional_traits that will get the user-defined traits.
>
> (possibly a bit of work to still get the special trivial destructor and
> assignment functionality in the specialization, but) Clever ;)


Actually, you could use an optional_impl class, that always uses traits. And then
when the user is not passing in their own traits you would pass in default_traits.
Something like this:

template<class T>
class optional : public optional_impl<default_traits<T> >
{
    // Forward constructors and operators
};

template<class Trait>
class optional<optional_traits<Trait> > : public optional_impl<Trait>
{
    // Forward constructors and operators
};

Then the assign operator would forward to an assign method in the base class.
Of course, this would mean that if T is trivially assignable, optional<T> would not
be trivially assignable. Was that one of your goals of the original design?
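The triviality concern can be demonstrated with a cut-down sketch, assuming C++11. The names forwarding_optional and defaulted_optional, and the value_type typedef inside default_traits, are hypothetical; optional_impl here is a toy with a bool and a payload, not a real implementation:

```cpp
#include <cassert>
#include <type_traits>

template <typename T>
struct default_traits { typedef T value_type; };

// Toy base: implicit copy assignment, so it stays trivial for trivial payloads.
template <typename Traits>
class optional_impl {
public:
    typedef typename Traits::value_type value_type;
    optional_impl() : initialized_(false), value_() {}
    void assign(const optional_impl& other) {
        initialized_ = other.initialized_;
        value_ = other.value_;
    }
private:
    bool initialized_;
    value_type value_;
};

// Hand-written forwarding operator=: user-provided, hence never trivial.
template <typename T>
class forwarding_optional : public optional_impl<default_traits<T> > {
public:
    forwarding_optional& operator=(const forwarding_optional& other) {
        this->assign(other);
        return *this;
    }
};

// Defaulted operator=: triviality is inherited from the base.
template <typename T>
class defaulted_optional : public optional_impl<default_traits<T> > {
public:
    defaulted_optional& operator=(const defaulted_optional&) = default;
};

static_assert(!std::is_trivially_copy_assignable<forwarding_optional<int> >::value,
              "forwarding operator= is not trivial");
static_assert(std::is_trivially_copy_assignable<defaulted_optional<int> >::value,
              "defaulted operator= stays trivial");
```

So the derived class keeps trivial assignment only if it leaves operator= implicit or defaulted; `= default` is a C++11 feature, so a C++03 version would have to omit the operator entirely.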

Nathan Ridge

Feb 9, 2012, 9:13:23 PM2/9/12
to Boost Developers Mailing List

> int i = 1;
> int j = 2;
> optional<int&> oi = i;
> optional<int&> oj = j;
> i = j;
>
> The effect here is that i remains 1, j remains 2 and oi and oj both hold a
> reference to j. You may find it less surprising but if you think of
> optional reference as a regular reference that is initialized a bit later,
> the behavior is not what you would expect.

Huh? How does oi come to hold a reference to j?

Regards,
Nate

Andrzej Krzemienski

Feb 10, 2012, 3:10:10 AM2/10/12
to bo...@lists.boost.org
> > int i = 1;
> > int j = 2;
> > optional<int&> oi = i;
> > optional<int&> oj = j;
> > i = j;

> > The effect here is that i remains 1, j remains 2 and oi and oj both hold a
> > reference to j. You may find it less surprising but if you think of
> > optional reference as a regular reference that is initialized a bit later,
> > the behavior is not what you would expect.
>
> Huh? How does oi come to hold a reference to j?

Apologies, I meant to say

int i = 1;
int j = 2;
optional<int&> oi = i;
optional<int&> oj = j;

oi = oj;

Boost.Optional documentation explains it better:
http://www.boost.org/doc/libs/1_48_0/libs/optional/doc/html/boost_optional/rebinding_semantics_for_assignment_of_optional_references.html

Regards,
&rzej

Hite, Christopher

Feb 10, 2012, 9:35:28 AM2/10/12
to bo...@lists.boost.org
I think paul Fultz wrote:
> Actually, you could just take the optional_traits as the first parameter. So you define
> optional<T> or optional<optional_traits<my_traits<T> > >. Then optional would be
> specialized for optional_traits that will get the user-defined traits.

That new optional might break old code:

template<typename T>
void print(const optional<T>& o){
    if(o){
        T copy=*o;
        std::cout<<copy;
    }
}

That copy is going to be an optional_traits<>. My method would also force you to
add traits to the signature, but at least you wouldn't need to write two specializations:

template<typename T,typename Traits>
void print(const optional_with_traits<T, Traits >& o);

We have to ask ourselves if configuration is worth it. I think sometimes you don't
find that out until it's too late and someone wants an option.

Things I can think of making configurable:
* bool type - a word might be faster on RISC machines
* use_trivial_destruction - there might be a case for overriding this
* use_trivial_copy - can't see a case for overriding this
* bool_first
* alignment - maybe 4 byte alignment is legal but 8 byte is faster
* enable_assertions
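
As a rough sketch, the knobs listed above could be collected in a traits class like the following, assuming C++11. The member names mirror the list; flag_type, word_flag_traits and optional_with_traits_layout are hypothetical names, and the layout struct only shows how the flag type would be consumed, not a working optional:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Defaults derived from T; each member corresponds to one item in the list.
template <typename T>
struct optional_traits {
    typedef bool flag_type;  // "bool type"
    static constexpr bool use_trivial_destruction =
        std::is_trivially_destructible<T>::value;
    static constexpr bool use_trivial_copy =
        std::is_trivially_copyable<T>::value;
    static constexpr bool bool_first = true;
    static constexpr std::size_t alignment = alignof(T);
    static constexpr bool enable_assertions = true;
};

// Example override: a word-sized flag, everything else inherited.
template <typename T>
struct word_flag_traits : optional_traits<T> {
    typedef unsigned int flag_type;
};

// Illustrates only how Traits would select the flag type in the layout.
template <typename T, typename Traits = optional_traits<T> >
struct optional_with_traits_layout {
    typename Traits::flag_type initialized;
    T value;
};
```

A user wanting the word-sized flag would then name optional_with_traits_layout<int, word_flag_traits<int> > explicitly, which is the signature cost mentioned above.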

Chris

Hite, Christopher

Feb 10, 2012, 10:16:49 AM2/10/12