C++ 2017 -- win, lose and draw

woodb...@gmail.com

Jan 22, 2017, 4:24:45 PM

std::string_view == win

But there should be support for appending a string_view to
std::string. That seems to still be missing from gcc 7.0
and clang 3.9.1. That's kind of frustrating.

std::variant == win
std::any == lose
std::optional == draw

That's my take on these additions. I tried replacing a use of
std::unique_ptr with std::optional. The resulting executable
was 588 bytes bigger (~1.5%). It avoids the allocation/
deallocation, but not without its own cost. In a single-threaded
program perhaps unique_ptr would be better.
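
A minimal sketch of the trade-off described above, assuming a hypothetical `Config` payload type that is not part of the original post. It only illustrates where the object lives in each case; it says nothing about the 588-byte difference in executable size.

```cpp
#include <cassert>
#include <memory>
#include <optional>

// Hypothetical payload type, used only for illustration.
struct Config {
    int values[8];
};

// std::optional stores the Config inside the optional object itself,
// so no heap allocation is needed.
inline std::optional<Config> make_inline() {
    return Config{};
}

// std::unique_ptr stores only a pointer; the Config lives on the heap.
inline std::unique_ptr<Config> make_on_heap() {
    return std::make_unique<Config>();
}

// The optional has to hold the payload plus an "engaged" flag.
static_assert(sizeof(std::optional<Config>) > sizeof(Config),
              "optional carries the payload plus a flag");

inline void demo() {
    auto a = make_inline();
    auto b = make_on_heap();
    assert(a.has_value());
    assert(b != nullptr);
    a.reset();  // destroys the Config in place; nothing is freed
    b.reset();  // destroys the Config and frees its heap block
    assert(!a.has_value());
    assert(b == nullptr);
}
```

Calling `demo()` exercises both paths. The optional trades a larger object for the absence of an allocation, which is the cost/benefit being weighed in the post.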



Brian
Ebenezer Enterprises - In G-d we trust.
http://webEbenezer.net

Mr Flibble

Jan 22, 2017, 5:11:46 PM
On 22/01/2017 21:24, woodb...@gmail.com wrote:
>
> std::string_view == win
>
> But there should be support for appending a string_view to
> std::string. That seems to still be missing from gcc 7.0
> and clang 3.9.1. That's kind of frustrating.
>
> std::variant == win
> std::any == lose
> std::optional == draw
>
> That's my take on these additions. I tried replacing a use of
> std::unique_ptr with std::optional. The resulting executable
> was 588 bytes bigger (~1.5%). It avoids the allocation/
> deallocation, but not without it's own cost. In a single
> threaded program perhaps unique_ptr would be better.

std::optional is not meant to be an alternative to std::unique_ptr and
shouldn't be considered such.

/Flibble

woodb...@gmail.com

Jan 29, 2017, 4:38:50 PM
On Sunday, January 22, 2017 at 3:24:45 PM UTC-6, woodb...@gmail.com wrote:
> std::string_view == win
>
> But there should be support for appending a string_view to
> std::string. That seems to still be missing from gcc 7.0
> and clang 3.9.1. That's kind of frustrating.
>

I found support for this with gcc7 now. I had to use
-std=c++17 rather than the -std=c++1z that clang likes.
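
A small sketch of what that support looks like with a C++17 standard library. The helper name `join` is made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Appends views to a string using the C++17 overloads of append() and
// operator+=; no temporary std::string is created for either view.
inline std::string join(std::string_view head, std::string_view tail) {
    std::string out;
    out.reserve(head.size() + tail.size());
    out.append(head);  // append() accepts types convertible to string_view
    out += tail;       // so does operator+=
    return out;
}

inline void demo() {
    assert(join("std::", "string_view") == "std::string_view");
}
```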

And I found a bug with my usage of string_view. I was passing
the result of its "data" member function to a constructor
that takes a char const*. I knew that that function doesn't
guarantee to return a null-terminated string, but forgot about
that.

This page says:
http://en.cppreference.com/w/cpp/experimental/basic_string_view/data

"Returns a pointer to the underlying character array. The pointer is such that the range [data(); data() + size()) is valid and the values in it correspond to the values of the view. (n.b. Unlike basic_string::data() and string literals, data() may return a pointer to a buffer that is not null-terminated. Therefore it is typically a mistake to pass data() to a routine that takes just a const CharT* and expects a null-terminated string.) "

One thing I would add to that is that if you encounter this problem,
you might want to add a function that takes a string_view. If you
already have a string_view, then it's good to use that. That's what I did
in fixing the problem and it will help me avoid a few calls to strlen.
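
The pitfall and the fix can be sketched as follows. The function names `length_c` and `length_sv` are hypothetical stand-ins for the constructor mentioned above, which is not shown in the post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Legacy-style interface: needs a null-terminated string and calls strlen.
inline std::size_t length_c(char const* s) {
    return std::strlen(s);
}

// View-based overload: uses the size carried by the view, so it needs
// neither strlen nor a terminator.
inline std::size_t length_sv(std::string_view s) {
    return s.size();
}

inline void demo() {
    std::string_view whole = "hello world";
    std::string_view word = whole.substr(0, 5);  // views "hello"; no terminator follows it

    assert(length_sv(word) == 5);
    // word.data() still points into "hello world", so a strlen-based routine
    // runs past the end of the view and sees the whole literal.
    assert(length_c(word.data()) == 11);
}
```

Here the overrun happens to stop at the literal's own terminator. With a view into a buffer that has no terminator at all, the same call would read out of bounds.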


I've been looking at Clang and GCC's implementations of string_view.

This is from Clang 3.9.1:

private:
const value_type* __data;
size_type __size;

---------------------------

And this is from GCC 7.0

private:
// blah blah
size_t _M_len;
const _CharT* _M_str;

---------------------------

I think GCC's version is better in terms of the name _M_len
rather than __size.

It seems though that putting the pointer first might help
in terms of preventing some padding in the type if the pointer
is 8 bytes and the length member is 4. So maybe Clang's way
is better... I just wrote a little test program and the
size of string_view is 16 on both clang and gcc. But who
needs support for such long strings? Does the standard
require that? I could live with 4 billion as a limit for the
length of strings.

woodb...@gmail.com

Jan 30, 2017, 8:14:46 PM
On Sunday, January 29, 2017 at 3:38:50 PM UTC-6, woodb...@gmail.com wrote:
> I've been looking at Clang and GCC's implementations of string_view.
>
> This is from Clang 3.9.1:
>
> private:
> const value_type* __data;
> size_type __size;
>
> ---------------------------
>
> And this is from GCC 7.0
>
> private:
> // blah blah
> size_t _M_len;
> const _CharT* _M_str;
>
> ---------------------------
>
> I think GCC's version is better in terms of the name _M_len
> rather than __size.
>
> It seems though that putting the pointer first might help
> in terms of preventing some padding in the type if the pointer
> is 8 bytes and the length member is 4. So maybe Clang's way
> is better... I just wrote a little test program and the
> size of string_view is 16 on both clang and gcc. But who
> needs support for such long strings? Does the standard
> require that? I could live with 4 billion as a limit for the
> length of strings.
>
>

When string_view is 16 bytes it's probably a good idea
to use (lvalue) references with it. If it were 12 bytes,
taking it by value would be more palatable. And if I
understand correctly, taking it by rvalue reference
would be the same as taking it by value.

Are others interested in a 12 byte string_view?


Brian
Ebenezer Enterprises - "If I give to a needy soul, but don't
have love then who is poor?" for KING & COUNTRY

https://duckduckgo.com/?q=%22proof+of+your+love%22+king+and+country&t=ffsb&ia=videos&iax=1&iai=b-2dKOfbC9c

http://webEbenezer.net

Öö Tiib

Jan 31, 2017, 3:52:15 AM
We go back to such first grade basics now?

Imagine that we were content with views of up to 255 bytes,
so that a 1-byte size would be fine. Would sizeof string_view
then be 9? Wrong!

#include <cstdint>
#include <iostream>

struct nah {void* ptr; std::uint8_t size;};

int main()
{
    std::cout << "Size is still " << sizeof (nah)
              << " bytes, Brian.\n";
}

What it answers?
Why it is so? ;-)

Scott Lurndal

Jan 31, 2017, 9:06:42 AM
woodb...@gmail.com writes:
>On Sunday, January 29, 2017 at 3:38:50 PM UTC-6, woodb...@gmail.com wrote:
>> I've been looking at Clang and GCC's implementations of string_view.
>>
>> This is from Clang 3.9.1:
>>
>> private:
>> const value_type* __data;
>> size_type __size;
>>
>> ---------------------------
>>
>> And this is from GCC 7.0
>>
>> private:
>> // blah blah
>> size_t _M_len;
>> const _CharT* _M_str;
>>
>> ---------------------------
>>
>> I think GCC's version is better in terms of the name _M_len
>> rather than __size.
>>
>> It seems though that putting the pointer first might help
>> in terms of preventing some padding in the type if the pointer
>> is 8 bytes and the length member is 4.

On Linux systems, size_t is the same size as a pointer
(4 bytes on ia32, 8 bytes on x86_64). size_t must be a
type large enough to represent the size of the largest
possible object, which in practice means the whole address
space.

Can't speak to Windows, but I'd find it unusual for size_t to be
4 bytes on any 64-bit system.


> So maybe Clang's way
>> is better... I just wrote a little test program and the
>> size of string_view is 16 on both clang and gcc. But who
>> needs support for such long strings? Does the standard
>> require that? I could live with 4 billion as a limit for the
>> length of strings.
>>
>>
>
>When string_view is 16 bytes it's probably a good idea
>to use (lvalue) references with it.

Not necessarily. A processor-specific ABI may pass it
in a 128-bit SIMD register when passed by value, or it may
pass a 128-bit value in two 64-bit integer registers, e.g.
as required by the Intel x86_64 psABI:

The classification of aggregate (structures and arrays) and union types works
as follows:
1. If the size of an object is larger than two eightbytes, or it contains
   unaligned fields, it has class MEMORY.
2. If a C++ object has either a non-trivial copy constructor or a non-trivial
   destructor it is passed by invisible reference (the object is replaced in
   the parameter list by a pointer that has class INTEGER).
3. If the size of the aggregate exceeds a single eightbyte, each is classified
   separately. Each eightbyte gets initialized to class NO_CLASS.
4. Each field of an object is classified recursively so that always two fields
   are considered. The resulting class is calculated according to the classes
   of the fields in the eightbyte:
   (a) If both classes are equal, this is the resulting class.
   (b) If one of the classes is NO_CLASS, the resulting class is the other
       class.
   (c) If one of the classes is MEMORY, the result is the MEMORY class.
   (d) If one of the classes is INTEGER, the result is the INTEGER.
   (e) If one of the classes is X87, X87UP, COMPLEX_X87 class, MEMORY is
       used as class.
   (f) Otherwise class SSE is used.
5. Then a post merger cleanup is done:
   (a) If one of the classes is MEMORY, the whole argument is passed in
       memory.
   (b) If SSEUP is not preceded by SSE, it is converted to SSE.
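
As a rough sketch of how these rules bear on string_view: the two helper functions below are hypothetical, and `fits_in_two_eightbytes()` only checks the preconditions of rules 1 and 2 (size and triviality). Whether a particular compiler actually uses two registers depends on the target ABI and is not verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <type_traits>

// By value: the callee receives its own pointer/length pair.
inline std::size_t count_spaces_by_value(std::string_view s) {
    std::size_t n = 0;
    for (char c : s) n += (c == ' ');
    return n;
}

// By const reference: the callee receives the address of the caller's view.
inline std::size_t count_spaces_by_ref(std::string_view const& s) {
    std::size_t n = 0;
    for (char c : s) n += (c == ' ');
    return n;
}

// Preconditions for register passing under rules 1 and 2: no larger than
// two eightbytes, and trivially copyable and destructible. Reported as a
// run-time value rather than assumed, since neither is checked by the ABI
// text itself for any particular library.
inline bool fits_in_two_eightbytes() {
    return sizeof(std::string_view) <= 16 &&
           std::is_trivially_copyable<std::string_view>::value &&
           std::is_trivially_destructible<std::string_view>::value;
}

inline void demo() {
    assert(count_spaces_by_value("a b c") == 2);
    assert(count_spaces_by_ref("a b c") == 2);
}
```

Both functions compute the same result. The difference is only in how the argument reaches the callee, which is why a benchmark is needed to tell them apart.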




>If it were 12 bytes,
>taking it by value would be more palatable. And if I
>understand correctly, taking it by rvalue reference
>would be the same as taking it by value.
>
>Are others interested in a 12 byte string_view?

Nyet.

woodb...@gmail.com

Jan 31, 2017, 11:43:46 AM
16.

> Why it is so? ;-)

I think it's due to alignment and arrays. In some cases, a reordering
of the data members can help reduce the size of a class, but not in
this case. Thank you for your reply.

Scott Lurndal

Jan 31, 2017, 12:25:19 PM
woodb...@gmail.com writes:
>On Tuesday, January 31, 2017 at 2:52:15 AM UTC-6, Öö Tiib wrote:

>> #include <iostream>
>>
>> struct nah {void* ptr; uint8_t size;};
>>
>> int main()
>> {
>> std::cout << "Size is still " << sizeof (nah)
>> << " bytes, Brian.\n";
>> }
>>
>> What it answers?
>
>16.
>
>> Why it is so? ;-)
>
>I think it's due to alignment and arrays. In some cases, a reordering
>of the data members can help reduce the size of a class, but not in
>this case. Thank you for your reply.

It is so because the compiler is required to align members of a structure (MoS)
on natural boundaries. The natural boundary for size_t and any pointer will
be 8 bytes (since both are 8-byte quantities) on modern 64-bit architectures.

If the uint8_t precedes the pointer, then the compiler will need to allocate
7 filler bytes before the pointer. If the uint8_t follows the pointer, the
filler bytes will be added after it, because the structure has a minimum
alignment derived from the largest minimum alignment of any MoS, which in this
case is again eight bytes (consider, for example, an array of this structure:
for the pointer to be aligned correctly in all elements of the array, the size
of each element must be a multiple of 8 bytes).
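
The two orderings discussed above can be checked with a short sketch. The struct names `PtrFirst` and `PtrLast` are made up for illustration, and the checks are written to hold on any conforming implementation rather than hard-coding 16.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pointer first: any padding goes after the small member.
struct PtrFirst {
    void* ptr;
    std::uint8_t size;
};

// Small member first: any padding goes between the members.
struct PtrLast {
    std::uint8_t size;
    void* ptr;
};

inline void demo() {
    // A struct is at least as strictly aligned as its most-aligned member.
    assert(alignof(PtrFirst) >= alignof(void*));
    assert(alignof(PtrLast) >= alignof(void*));
    // sizeof is always a multiple of the alignment, so every element of an
    // array of the struct stays correctly aligned.
    assert(sizeof(PtrFirst) % alignof(PtrFirst) == 0);
    assert(sizeof(PtrLast) % alignof(PtrLast) == 0);
    // The pointer in PtrLast sits at an offset suitable for a pointer.
    assert(offsetof(PtrLast, ptr) % alignof(void*) == 0);
    // Neither ordering can be smaller than a pointer plus one byte; with
    // 8-byte aligned pointers that lower bound of 9 rounds up to 16.
    assert(sizeof(PtrFirst) >= sizeof(void*) + 1);
    assert(sizeof(PtrLast) >= sizeof(void*) + 1);
}
```

Printing `sizeof(PtrFirst)`, `sizeof(PtrLast)` and `alignof(std::size_t)` on the target of interest settles the question for that platform.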

You can specify that the structure should be "packed" using
implementation-defined mechanisms (__attribute__((packed)) in gcc, #pragma pack
in other compilers) if you really don't want padding between the fields and
are prepared to take the substantial performance hit from accessing unaligned
data. That access requires trapping and fixup on architectures that don't
allow direct access to unaligned data, and causes substantial pipeline
bubbles on architectures that do.

The linux tool 'pahole' will extract the structure definition from the
DWARF data in the ELF executable and show where the holes are and how
large they are.

e.g.:

struct tm {
        int           tm_sec;      /*     0     4 */
        int           tm_min;      /*     4     4 */
        int           tm_hour;     /*     8     4 */
        int           tm_mday;     /*    12     4 */
        int           tm_mon;      /*    16     4 */
        int           tm_year;     /*    20     4 */
        int           tm_wday;     /*    24     4 */
        int           tm_yday;     /*    28     4 */
        int           tm_isdst;    /*    32     4 */

        /* XXX 4 bytes hole, try to pack */

        long int      tm_gmtoff;   /*    40     8 */
        const char *  tm_zone;     /*    48     8 */

        /* size: 56, cachelines: 1, members: 11 */
        /* sum members: 52, holes: 1, sum holes: 4 */
        /* last cacheline: 56 bytes */
};

Chris M. Thomasson

Jan 31, 2017, 4:54:04 PM
On 1/31/2017 9:25 AM, Scott Lurndal wrote:
> woodb...@gmail.com writes:
>> On Tuesday, January 31, 2017 at 2:52:15 AM UTC-6, Öö Tiib wrote:
>
>>> #include <iostream>
>>>
>>> struct nah {void* ptr; uint8_t size;};
>>>
>>> int main()
>>> {
>>> std::cout << "Size is still " << sizeof (nah)
>>> << " bytes, Brian.\n";
>>> }
>>>
>>> What it answers?
>>
>> 16.
>>
>>> Why it is so? ;-)
>>
>> I think it's due to alignment and arrays. In some cases, a reordering
>> of the data members can help reduce the size of a class, but not in
>> this case. Thank you for your reply.
>
> It is so because the compiler is required to align members of structure (MoS)
> on natural boundaries. The natural boundary for size_t and any pointer will
> be 8-bytes (since both are 8-byte quantities) on modern 64-bit architectures.

Well, just to be safe: alignof(size_t)?

woodb...@gmail.com

Jan 31, 2017, 9:53:56 PM
I did a little test on an i3 laptop and wasn't able to detect
a pattern. Sometimes the test that used references was slightly
faster and other times the test that used values was slightly
faster. So it may not make as much of a difference as I thought.
I'll probably pass string_view by value for the time being.



Brian
Ebenezer Enterprises
http://webEbenezer.net

Juha Nieminen

Feb 1, 2017, 2:56:47 AM
woodb...@gmail.com wrote:
> It seems though that putting the pointer first might help
> in terms of preventing some padding in the type if the pointer
> is 8 bytes and the length member is 4.

Can you name a system where pointers are 8 bytes long but size_t is 4 bytes?

Robert Wessel

Feb 1, 2017, 4:20:31 AM
The old AS/400 C compiler had 16-byte pointers and 4-byte size_t's. In
the early days of that, you couldn't have any individual object bigger
than 64KB, so a 2-byte size_t would have been possible, but unless
memory is failing me, size_t was an int and 4 bytes.

woodb...@gmail.com

Feb 1, 2017, 11:59:05 AM
No. I'm reconsidering 32 bit operating systems though in order
to get a more reasonable size_t.



Brian
Ebenezer Enterprises - "I was a hopeless fool, now I'm hopelessly
devoted to You."

https://duckduckgo.com/?q=love+broke+thru+tobymac&t=ffsb&ia=videos&iax=1&iai=44l9PRI4c2M

Scott Lurndal

Feb 1, 2017, 12:33:02 PM
woodb...@gmail.com writes:
>On Wednesday, February 1, 2017 at 1:56:47 AM UTC-6, Juha Nieminen wrote:
>> woodb...@gmail.com wrote:
>> > It seems though that putting the pointer first might help
>> > in terms of preventing some padding in the type if the pointer
>> > is 8 bytes and the length member is 4.
>>
>> Can you name a system where pointers are 8 bytes long but size_t is 4 bytes?
>
>No. I'm reconsidering 32 bit operating systems though in order
>to get a more reasonable size_t.

What's wrong with an 8-byte size_t?

Note that even iOS is deprecating 32-bit apps. Good luck with that.

woodb...@gmail.com

Feb 1, 2017, 1:55:32 PM
On Wednesday, February 1, 2017 at 11:33:02 AM UTC-6, Scott Lurndal wrote:
> woodb...@gmail.com writes:
> >On Wednesday, February 1, 2017 at 1:56:47 AM UTC-6, Juha Nieminen wrote:
> >> woodb...@gmail.com wrote:
> >> > It seems though that putting the pointer first might help
> >> > in terms of preventing some padding in the type if the pointer
> >> > is 8 bytes and the length member is 4.
> >>
> >> Can you name a system where pointers are 8 bytes long but size_t is 4 bytes?
> >
> >No. I'm reconsidering 32 bit operating systems though in order
> >to get a more reasonable size_t.
>
> What's wrong with an 8-byte size_t?

I don't need support for such mammoth string lengths.


>
> Note that even IOS is deprecating 32-bit apps. Good luck with that.

https://www.gotquestions.org/God-is-in-control.html

The advent of the C++ Middleware Writer is further evidence
of G-d's sovereignty.

I didn't say that I plan to start using 32 bit operating systems
to support the C++ Middleware Writer.


Brian
Ebenezer Enterprises - "And one called out to another and said, "Holy,
Holy, Holy, is the L-RD of hosts, The whole earth is full of His glory."
Isaiah 6:3

http://webEbenezer.net

Daniel

Feb 1, 2017, 10:01:36 PM
On Wednesday, February 1, 2017 at 1:55:32 PM UTC-5, woodb...@gmail.com wrote:
>
> The advent of the C++ Middleware Writer is further evidence
> of G-d's sovereignty.
>
It's nice to see that Brian has a sense of self deprecating humour!

Daniel

Öö Tiib

Feb 2, 2017, 8:50:23 AM
Serious fundamentalistic positions are typically indistinguishable
from self-irony and parody.

woodb...@gmail.com

Feb 2, 2017, 12:42:01 PM
You may wish to cast me as a villain because you also consider
Moses and Aaron to be villains. By faith they stood up to
Pharaoh, and G-d freed millions of people who had been slaves
to a powerful regime. I'm with those who realize they are prone
to slavery and need G-d as much as the ancient Israelites did.


Brian
Ebenezer Enterprises - "My brothers, G-d called you to be free.
But do not use your freedom as an excuse for letting your physical
desires control you, but through love seek the best for one another." Galatians 5:13

Daniel

Feb 2, 2017, 1:07:11 PM
On Thursday, February 2, 2017 at 12:42:01 PM UTC-5, woodb...@gmail.com wrote:

> consider Moses and Aaron to be villains.

A Canaanite would have thought so, much as modern civilized man regards Hitler:
massacres of men and women and children with the intent to kill an entire people
tend to evoke that reaction. But keep in mind that the Pentateuch is generally
considered to have been written much later than the purported events, in the
monarchy period.

Daniel

Gareth Owen

Feb 2, 2017, 2:35:03 PM
woodb...@gmail.com writes:

> On Wednesday, February 1, 2017 at 9:01:36 PM UTC-6, Daniel wrote:
>> On Wednesday, February 1, 2017 at 1:55:32 PM UTC-5, woodb...@gmail.com wrote:
>> >
>> > The advent of the C++ Middleware Writer is further evidence
>> > of G-d's sovereignty.
>> >
>> It's nice to see that Brian has a sense of self deprecating humour!
>>
>
> You may wish to cast me as a villain

I consider you a buffoon. You're some way below villain.

> because you also consider Moses and Aaron to be villains

I rarely consider them at all, and when I do
a) I don't consider them villains
b) you really shouldn't flatter yourself with comparisons

Chris Vine

Feb 2, 2017, 3:13:52 PM
On Thu, 2 Feb 2017 09:41:53 -0800 (PST)
woodb...@gmail.com wrote:

> On Wednesday, February 1, 2017 at 9:01:36 PM UTC-6, Daniel wrote:
> > On Wednesday, February 1, 2017 at 1:55:32 PM UTC-5,
> > woodb...@gmail.com wrote:
> > >
> > > The advent of the C++ Middleware Writer is further evidence
> > > of G-d's sovereignty.
> > >
> > It's nice to see that Brian has a sense of self deprecating humour!
> >
>
> You may wish to cast me as a villain because you also consider
> Moses and Aaron to be villains. By faith they stood up to
> Pharoah, and G-d freed millions of people who had been slaves
> to a powerful regime. I'm with those who realize they are prone
> to slavery and need G-d as much as the ancient Israelites did.

Relax, he wasn't casting you as a villain. He was just making fun of
you for your delusions of grandeur. He was treating you as a joke, not
a villain, if that makes you feel better about yourself.

You are also illogical. No one had at any stage mentioned, or thought
of, Moses or Aaron. That is just another instance of your
self-delusion.

I suspect you are more prone to insanity than slavery.

Juha Nieminen

Feb 7, 2017, 2:04:34 AM
woodb...@gmail.com wrote:
>> Note that even IOS is deprecating 32-bit apps. Good luck with that.
>
> https://www.gotquestions.org/God-is-in-control.html
>
> The advent of the C++ Middleware Writer is further evidence
> of G-d's sovereignty.

In a conversation that has absolutely nothing to do with religion,
you suddenly, out of the blue, throw in proselytizing.

How about you stop being a retarded asshole?

woodb...@gmail.com

Feb 7, 2017, 2:25:11 PM
On Tuesday, February 7, 2017 at 1:04:34 AM UTC-6, Juha Nieminen wrote:
> woodb...@gmail.com wrote:
> >> Note that even IOS is deprecating 32-bit apps. Good luck with that.
> >
> > https://www.gotquestions.org/God-is-in-control.html
> >
> > The advent of the C++ Middleware Writer is further evidence
> > of G-d's sovereignty.
>
> In a conversation that has absolutely nothing to do with religion,
> you suddenly, out of the blue, throwin in proselytizing.
>

I'm explaining that "luck" has nothing to do with it... many
thieves around here... If I were lucky as a business man, I
wouldn't deserve the spoils of victory. Their many attempts
over many years to derail my work/company have failed. So they
resort to this sort of crap to make others think that those who
work hard and risk everything they have don't deserve huge
rewards. I'm blessed not lucky. If people fall for their crap,
they will feel justified in attempting to steal what G-d has
given me. They have an impoverished, deep-pockets mentality.
I hope you will realize that everyone has a right to defend
themselves.


Brian
Ebenezer Enterprises - "Free at last, free at last. Thank G-d
Almighty we are free at last." Martin Luther King, Jr.

http://webEbenezer.net

Daniel

Feb 7, 2017, 3:49:57 PM
On Tuesday, February 7, 2017 at 2:25:11 PM UTC-5, woodb...@gmail.com wrote:
> I hope you will realize that everyone has a right to defend
> themselves.
>
Don't worry about it, when you're willing to work for free, you can do anything
that you like, you don't have to do what other people want you to do, and it
doesn't matter if nobody wants to pay for it.

Best regards,
Daniel

Chris Vine

Feb 7, 2017, 7:34:21 PM
On Tue, 7 Feb 2017 11:25:02 -0800 (PST)
woodb...@gmail.com wrote:
> On Tuesday, February 7, 2017 at 1:04:34 AM UTC-6, Juha Nieminen wrote:
> > woodb...@gmail.com wrote:
> > >> Note that even IOS is deprecating 32-bit apps. Good luck with
> > >> that.
> > >
> > > https://www.gotquestions.org/God-is-in-control.html
> > >
> > > The advent of the C++ Middleware Writer is further evidence
> > > of G-d's sovereignty.
> >
> > In a conversation that has absolutely nothing to do with religion,
> > you suddenly, out of the blue, throwin in proselytizing.
> >
>
> I'm explaining that "luck" has nothing to do with it... many
> thieves around here... If I were lucky as a business man, I
> wouldn't deserve the spoils of victory. Their many attempts
> over many years to derail my work/company have failed. So they
> resort to this sort of crap to make others think that those who
> work hard and risk everything they have don't deserve huge
> rewards. I'm blessed not lucky. If people fall for their crap,
> they will feel justified in attempting to steal what G-d has
> given me. They have an impoverished, deep-pockets mentality.
> I hope you will realize that everyone has a right to defend
> themselves.

Brian,

Please don't swear. We don't use the word Cr-p here. This is a clean
living newsgroup.

As an aside, no one on this newsgroup has tried to derail your
company. No one has resorted to the activity to which you have rather
offensively alluded. You seem to be suffering from paranoia as well as
a potty mouth. Please stop it.

Mr Flibble

Feb 7, 2017, 7:48:23 PM
<woodb...@gmail.com> wrote:
> On Tuesday, February 7, 2017 at 1:04:34 AM UTC-6, Juha Nieminen wrote:
>> woodb...@gmail.com wrote:
>>>> Note that even IOS is deprecating 32-bit apps. Good luck with that.
>>>
>>> https://www.gotquestions.org/God-is-in-control.html
>>>
>>> The advent of the C++ Middleware Writer is further evidence
>>> of G-d's sovereignty.
>>
>> In a conversation that has absolutely nothing to do with religion,
>> you suddenly, out of the blue, throwin in proselytizing.
>>
>
> I'm explaining that "luck" has nothing to do with it... many
> thieves around here... If I were lucky as a business man, I
> wouldn't deserve the spoils of victory. Their many attempts
> over many years to derail my work/company have failed. So they
> resort to this sort of crap to make others think that those who
> work hard and risk everything they have don't deserve huge
> rewards. I'm blessed not lucky. If people fall for their crap,
> they will feel justified in attempting to steal what G-d has
> given me. They have an impoverished, deep-pockets mentality.
> I hope you will realize that everyone has a right to defend
> themselves.

Brian, please don't swear here you potty mouthed cunt.

/Flibble

woodb...@gmail.com

Feb 15, 2017, 12:12:22 PM
On Sunday, January 22, 2017 at 3:24:45 PM UTC-6, woodb...@gmail.com wrote:
> std::string_view == win
>
> But there should be support for appending a string_view to
> std::string. That seems to still be missing from gcc 7.0
> and clang 3.9.1. That's kind of frustrating.
>
> std::variant == win

But why is std::variant<float> allowed? If you have
std::variant<float, double> then sometimes it could
have a float and other times a double.

"to exhibit or undergo change <the sky was constantly varying>"
https://www.merriam-webster.com/dictionary/vary


Brian
Ebenezer Enterprises - In G-d we trust.
http://webEbenezer.net


> std::any == lose
> std::optional == draw
>
> That's my take on these additions. I tried replacing a use of
> std::unique_ptr with std::optional. The resulting executable
> was 588 bytes bigger (~1.5%). It avoids the allocation/
> deallocation, but not without it's own cost. In a single
> threaded program perhaps unique_ptr would be better.
>
>
>

Mr Flibble

Feb 15, 2017, 12:37:03 PM
On 15/02/2017 17:12, woodb...@gmail.com wrote:
> On Sunday, January 22, 2017 at 3:24:45 PM UTC-6, woodb...@gmail.com wrote:
>> std::string_view == win
>>
>> But there should be support for appending a string_view to
>> std::string. That seems to still be missing from gcc 7.0
>> and clang 3.9.1. That's kind of frustrating.
>>
>> std::variant == win
>
> But why is std::variant<float> allowed? If you have
> std::variant<float, double> then sometimes it could
> have a float and other times a double.

Asking why is it allowed is a naive question.

Sometimes you might want variant<T, float>, sometimes you might want
variant<T, double> and sometimes you might want variant<T, float,
double> or even a plain variant<float, double>.

Just because you cannot think of a reason why you might want that
doesn't mean that other people can't think of a reason.

One example I can think of off the top of my head would be to use a
variant to represent types of objects in a scripting language and that
scripting language supports both 32-bit and 64-bit floating point types
just like C++.
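
A minimal sketch of that scripting-language use, assuming a hypothetical `ScriptValue` alias and `type_name` helper (neither comes from any real engine):

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical value type for a small scripting engine that supports both
// 32-bit and 64-bit floating point, plus strings.
using ScriptValue = std::variant<float, double, std::string>;

// Names the type currently held, dispatching with std::visit.
inline std::string type_name(ScriptValue const& v) {
    return std::visit([](auto const& held) -> std::string {
        using T = std::decay_t<decltype(held)>;
        if constexpr (std::is_same_v<T, float>) return "float32";
        else if constexpr (std::is_same_v<T, double>) return "float64";
        else return "string";
    }, v);
}

inline void demo() {
    ScriptValue v = 1.5f;            // holds a float
    assert(type_name(v) == "float32");
    v = 1.5;                         // now holds a double
    assert(type_name(v) == "float64");
    v = std::string("text");         // now holds a string
    assert(type_name(v) == "string");
}
```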

/Flibble



Scott Lurndal

Feb 15, 2017, 12:51:22 PM
Mr Flibble <flibbleREM...@i42.co.uk> writes:
>On 15/02/2017 17:12, woodb...@gmail.com wrote:
>> On Sunday, January 22, 2017 at 3:24:45 PM UTC-6, woodb...@gmail.com wrote:
>>> std::string_view == win
>>>
>>> But there should be support for appending a string_view to
>>> std::string. That seems to still be missing from gcc 7.0
>>> and clang 3.9.1. That's kind of frustrating.
>>>
>>> std::variant == win
>>
>> But why is std::variant<float> allowed? If you have
>> std::variant<float, double> then sometimes it could
>> have a float and other times a double.
>
>Asking why is it allowed is a naive question.
>
>Sometimes you might want variant<T, float>, sometimes you might want
>variant<T, double> and sometimes you might want variant<T, float,
>double> or even a plain variant<float, double>.

While std::variant is very similar to Pascal variant records,
personally I don't see any advantage over a plain old
union (which when I use them, I generally am using them to
access _both_ variants, e.g. if I need access to a subset
(say a bitfield) of a larger value or to access the floating
point value directly as a collection of bitfields (sign, exponent,
mantissa)).

e.g.

union REGISTER_NAME {
    uint64_t u;
    struct REGISTER_NAME_S {
#if __BYTE_ORDER == __BIG_ENDIAN
        uint64_t reserved_48_63: 16;
        uint64_t vmid          : 16;
        uint64_t base          : 32;
#else
        uint64_t base          : 32;
        uint64_t vmid          : 16;
        uint64_t reserved_48_63: 16;
#endif
    } s;
};

REGISTER_NAME f;
f.u = 0xabcd12349876fedc;
uint64_t vmid = f.s.vmid;

(much more readable than vmid = (f.u >> 32) & 0xffffull;)

Scott Lurndal

Feb 15, 2017, 12:53:10 PM
woodb...@gmail.com writes:

>> That's my take on these additions. I tried replacing a use of
>> std::unique_ptr with std::optional. The resulting executable
>> was 588 bytes bigger (~1.5%). It avoids the allocation/
>> deallocation, but not without it's own cost. In a single
>> threaded program perhaps unique_ptr would be better.

Why do you consider 588 bytes significant?

woodb...@gmail.com

Feb 15, 2017, 1:47:04 PM
That shows a weakness of std::variant that I hadn't thought
of -- there are no names associated with the members.

No one has said why something like this is interesting:
union REGISTER_NAME {
uint64_t u;
};

Or

::std::variant<uint64_t>



Brian
Ebenezer Enterprises

Mr Flibble

Feb 15, 2017, 1:47:29 PM
On 15/02/2017 17:51, Scott Lurndal wrote:
> Mr Flibble <flibbleREM...@i42.co.uk> writes:
>> On 15/02/2017 17:12, woodb...@gmail.com wrote:
>>> On Sunday, January 22, 2017 at 3:24:45 PM UTC-6, woodb...@gmail.com wrote:
>>>> std::string_view == win
>>>>
>>>> But there should be support for appending a string_view to
>>>> std::string. That seems to still be missing from gcc 7.0
>>>> and clang 3.9.1. That's kind of frustrating.
>>>>
>>>> std::variant == win
>>>
>>> But why is std::variant<float> allowed? If you have
>>> std::variant<float, double> then sometimes it could
>>> have a float and other times a double.
>>
>> Asking why is it allowed is a naive question.
>>
>> Sometimes you might want variant<T, float>, sometimes you might want
>> variant<T, double> and sometimes you might want variant<T, float,
>> double> or even a plain variant<float, double>.
>
> While std::variant is very similar to Pascal variant records,
> personally I don't see any advantage over a plain old
> union (which when I use them, I generally am using them to
> access _both_ variants, e.g. if I need access to a subset
> (say a bitfield) of a larger value or to access the floating
> point value directly as a collection of bitfields (sign, exponent,
> mantissa)).

A std::variant is a type safe discriminated union whilst an ordinary
union is not.
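
What "type safe" means in practice can be sketched like this. The helper name `wrong_access_is_detected` is made up for illustration.

```cpp
#include <cassert>
#include <variant>

// Returns true if reading the inactive alternative is rejected at run time.
inline bool wrong_access_is_detected() {
    std::variant<float, double> v = 2.0;  // holds a double
    try {
        (void)std::get<float>(v);         // asks for the inactive alternative
    } catch (std::bad_variant_access const&) {
        return true;                      // the variant knew it held a double
    }
    return false;
}

inline void demo() {
    std::variant<float, double> v = 2.0;
    assert(v.index() == 1);               // the discriminator: alternative #1
    assert(std::get<double>(v) == 2.0);   // reading the active member is fine
    assert(wrong_access_is_detected());
}
```

With a plain union, the equivalent read of the inactive member compiles and runs without any such check.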

/Flibble

Scott Lurndal

Feb 15, 2017, 3:02:03 PM
Like I said, it's a Pascal variant record, and I don't see
a use for it in modern code. Whereas union _does_ have uses.

Paavo Helde

Feb 15, 2017, 3:52:24 PM
There is a massive use case for a type-safe discriminated union, namely
supporting dynamically typed variables in a scripting language engine.
Mr Flibble already pointed this out.

AFAIK using a union for type punning is still UB in C++ (unlike in C),
though many compilers support it.

Öö Tiib

Feb 15, 2017, 4:15:08 PM
On Wednesday, 15 February 2017 19:51:22 UTC+2, Scott Lurndal wrote:
> While std::variant is very similar to Pascal variant records,
> personally I don't see any advantage over a plain old
> union (which when I use them, I generally am using them to
> access _both_ variants, e.g. if I need access to a subset
> (say a bitfield) of a larger value or to access the floating
> point value directly as a collection of bitfields (sign, exponent,
> mantissa)).

What you describe is type-punning. I am not fully sure if
such usage of union is valid by standard but it seems that all
compilers support it. It is usage that std::variant does
not allow. Type-punning can be achieved with 'reinterpret_cast'
or with 'memcpy' so union is also not needed for it.

For me the most important use of variant is that it makes it possible
to have dynamic polymorphism of potentially unrelated types
AND without dynamic memory management AND type-safely (unlike
'void*' or 'union' or 'reinterpret_cast').

Paavo Helde

unread,
Feb 15, 2017, 4:24:16 PM2/15/17
to
On 15.02.2017 23:14, Öö Tiib wrote:
> On Wednesday, 15 February 2017 19:51:22 UTC+2, Scott Lurndal wrote:
>> While std::variant is very similar to Pascal variant records,
>> personally I don't see any advantage over a plain old
>> union (which when I use them, I generally am using them to
>> access _both_ variants, e.g. if I need access to a subset
>> (say a bitfield) of a larger value or to access the floating
>> point value directly as a collection of bitfields (sign, exponent,
>> mantissa)).
>
> What you describe is type-punning. I am not fully sure if
> such usage of union is valid by standard but it seems that all
> compilers support it. It is usage that std::variant does
> not allow. Type-punning can be achieved with 'reinterpret_cast'
> or with 'memcpy' so union is also not needed for it.

Reinterpret_cast of pointers is "more UB" than union punning in
practice; compilers are known to generate invalid code for it.

OTOH, memcpy() is guaranteed to do the correct thing and will be most
probably compiled to the same machine code anyway nowadays, so portable
C++ code should use memcpy.
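For the sign/exponent/mantissa case mentioned earlier in the thread, a memcpy-based version might look like the sketch below. It assumes a 32-bit IEEE 754 `float` (checked with static_assert); `FloatParts` and `decompose` are names invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Assumption: float is a 32-bit IEEE 754 binary32 value.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
static_assert(std::numeric_limits<float>::is_iec559, "float must be IEEE 754");

struct FloatParts {          // hypothetical helper type
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, biased by 127
    std::uint32_t mantissa;  // 23 bits, without the implicit leading 1
};

inline FloatParts decompose(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // copies the object representation
    return { bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu };
}

inline void decompose_example() {
    const FloatParts p = decompose(-1.5f);  // -1.5 is -1.1 (binary) times 2^0
    assert(p.sign == 1 && p.exponent == 127 && p.mantissa == 0x400000u);
}
```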

Bo Persson

unread,
Feb 15, 2017, 7:30:16 PM2/15/17
to
But there is a get<type>(v) to access a value of the selected type.


>
> No one has said why something like this is interesting:
> union REGISTER_NAME {
> uint64_t u;
> };
>
> Or
>
> ::std::variant<uint64_t>

This can be the accidental result of using std::variant<types...>.
Should we forbid this when sizeof...(types) == 1?
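A sketch of how the one-alternative case can arise from generic code. The `Register` template is hypothetical; the commented-out static_assert shows what forbidding it would look like.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Hypothetical generic holder: the alternatives come from a parameter pack.
template <class... Types>
struct Register {
    // Uncommenting this line would forbid the one-alternative case:
    // static_assert(sizeof...(Types) > 1, "a variant of one type is pointless");
    std::variant<Types...> value;
};

inline void single_alternative_example() {
    Register<std::uint64_t> r{std::uint64_t{7}};  // std::variant<uint64_t>
    static_assert(std::variant_size_v<decltype(r.value)> == 1, "one alternative");
    assert(r.value.index() == 0);
    assert(std::get<std::uint64_t>(r.value) == 7);
}
```

Generic code that forwards a pack like this cannot easily avoid the single-type instantiation, which is one argument against forbidding it.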



Bo Persson

woodb...@gmail.com

unread,
Feb 15, 2017, 9:08:39 PM2/15/17
to
I think so. At least the documentation should explain that
there's no good reason to use a variant<T>.


Brian
Ebenezer Enterprises - "When you like your work, every day is
a holiday." Frank Tyger

http://webEbenezer.net

Scott Lurndal

unread,
Feb 16, 2017, 8:42:07 AM2/16/17
to
Öö Tiib <oot...@hot.ee> writes:
>On Wednesday, 15 February 2017 19:51:22 UTC+2, Scott Lurndal wrote:
>> While std::variant is very similar to Pascal variant records,
>> personally I don't see any advantage over a plain old
>> union (which when I use them, I generally am using them to
>> access _both_ variants, e.g. if I need access to a subset
>> (say a bitfield) of a larger value or to access the floating
>> point value directly as a collection of bitfields (sign, exponent,
>> mantissa)).
>
>What you describe is type-punning. I am not fully sure if
>such usage of union is valid by standard but it seems that all
>compilers support it. It is usage that std::variant does
>not allow. Type-punning can be achieved with 'reinterpret_cast'
>or with 'memcpy' so union is also not needed for it.

Why use an extremely ugly inefficient solution like memcpy or
reinterpret_cast when the union works just fine? And has worked just
fine for forty years. And will work fine for the foreseeable future.

As for the uses of std::variant, they're limited to a very few
real-world applications; it's not generally useful.

Scott Lurndal

unread,
Feb 16, 2017, 8:43:46 AM2/16/17
to
Most of my code is extremely performance intensive (cpu bound) and
littering the code with memcpy's would be silly.

Fundamentally, type punning with unions has worked for forty years
and will continue to work into the future, and it yields efficient
and readable code. The lack of type-safety frankly isn't an issue.

Paavo Helde

unread,
Feb 16, 2017, 9:39:01 AM2/16/17
to
On 16.02.2017 15:41, Scott Lurndal wrote:
> Öö Tiib <oot...@hot.ee> writes:
>> On Wednesday, 15 February 2017 19:51:22 UTC+2, Scott Lurndal wrote:
>>> While std::variant is very similar to Pascal variant records,
>>> personally I don't see any advantage over a plain old
>>> union (which when I use them, I generally am using them to
>>> access _both_ variants, e.g. if I need access to a subset
>>> (say a bitfield) of a larger value or to access the floating
>>> point value directly as a collection of bitfields (sign, exponent,
>>> mantissa)).
>>
>> What you describe is type-punning. I am not fully sure if
>> such usage of union is valid by standard but it seems that all
>> compilers support it. It is usage that std::variant does
>> not allow. Type-punning can be achieved with 'reinterpret_cast'
>> or with 'memcpy' so union is also not needed for it.
>
> Why use an extremely ugly inefficient solution like memcpy or
> reinterpret_cast when the union works just fine?

Ugly is in the eye of beholder and can be hidden away in a service function.

About inefficient: both memcpy and union get compiled into the same
machine code:

tmp>g++ test4.cpp -std=c++11 -O2 -Wall
tmp>./a.out f f f
Line 14
40
tmp>g++ test4.cpp -std=c++11 -O2 -Wall -DUSE_UNION
tmp>./a.out f f f
Line 24
40
tmp>g++ test4.cpp -std=c++11 -O2 -Wall -S -o n.s
tmp>g++ test4.cpp -std=c++11 -O2 -Wall -DUSE_UNION -S -o u.s
tmp>diff n.s u.s
19c19
< movl $14, %esi
---
> movl $24, %esi
51c51
< movl $14, %esi
---
> movl $24, %esi

IOW, the only difference is the __LINE__ constant.

------------------------------------------------------------------------

tmp>cat test4.cpp
#include <stdint.h>
#include <string.h>
#include <stdio.h>

struct Reg {
    uint64_t base : 32;
    uint64_t vmid : 16;
    uint64_t reserved_48_63: 16;
};

#ifndef USE_UNION

uint64_t ReadVmid(uint64_t x) {
    printf("Line %d\n", __LINE__);
    Reg y;
    static_assert(sizeof(x)==sizeof(y), "");
    memcpy(&y, &x, sizeof(x));
    return y.vmid;
}

#else

uint64_t ReadVmid(uint64_t x) {
    printf("Line %d\n", __LINE__);
    union {
        uint64_t a;
        Reg b;
    } u;
    u.a = x;
    return u.b.vmid;
}

#endif

int main(int argc, char* argv[]) {
    printf("%lu\n", ReadVmid(uint64_t(argc)*0xA00000000));
}

------------------------------------------------------------------------

tmp>cat n.s
.file "test4.cpp"
.section .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "Line %d\n"
.section .text.unlikely,"ax",@progbits
.LCOLDB1:
.text
.LHOTB1:
.p2align 4,,15
.globl _Z8ReadVmidm
.type _Z8ReadVmidm, @function
_Z8ReadVmidm:
.LFB26:
.cfi_startproc
pushq %rbx
.cfi_def_cfa_offset 16
.cfi_offset 3, -16
movq %rdi, %rbx
movl $14, %esi
shrq $32, %rbx
movl $.LC0, %edi
xorl %eax, %eax
call printf
movzwl %bx, %eax
popq %rbx
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE26:
.size _Z8ReadVmidm, .-_Z8ReadVmidm
.section .text.unlikely
.LCOLDE1:
.text
.LHOTE1:
.section .rodata.str1.1
.LC2:
.string "%lu\n"
.section .text.unlikely
.LCOLDB3:
.section .text.startup,"ax",@progbits
.LHOTB3:
.p2align 4,,15
.globl main
.type main, @function
main:
.LFB27:
.cfi_startproc
pushq %rbx
.cfi_def_cfa_offset 16
.cfi_offset 3, -16
movl $14, %esi
movl %edi, %ebx
xorl %eax, %eax
movl $.LC0, %edi
call printf
leaq (%rbx,%rbx,4), %rsi
movl $.LC2, %edi
xorl %eax, %eax
addq %rsi, %rsi
andl $65534, %esi
call printf
xorl %eax, %eax
popq %rbx
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE27:
.size main, .-main
.section .text.unlikely
.LCOLDE3:
.section .text.startup
.LHOTE3:
.ident "GCC: (SUSE Linux) 5.3.1 20160301 [gcc-5-branch revision 233849]"
.section .note.GNU-stack,"",@progbits


Mr Flibble

unread,
Feb 16, 2017, 2:49:29 PM2/16/17
to
The "modern code" you write must be very limited in scope if you never
come across the need to use a variant. I use variants all the time
(certainly more often than I use unions).

/Flibble


Öö Tiib

unread,
Feb 16, 2017, 5:58:18 PM2/16/17
to
On Thursday, 16 February 2017 15:42:07 UTC+2, Scott Lurndal wrote:
> Öö Tiib <oot...@hot.ee> writes:
> >On Wednesday, 15 February 2017 19:51:22 UTC+2, Scott Lurndal wrote:
> >> While std::variant is very similar to Pascal variant records,
> >> personally I don't see any advantage over a plain old
> >> union (which when I use them, I generally am using them to
> >> access _both_ variants, e.g. if I need access to a subset
> >> (say a bitfield) of a larger value or to access the floating
> >> point value directly as a collection of bitfields (sign, exponent,
> >> mantissa)).
> >
> >What you describe is type-punning. I am not fully sure if
> >such usage of union is valid by standard but it seems that all
> >compilers support it. It is usage that std::variant does
> >not allow. Type-punning can be achieved with 'reinterpret_cast'
> >or with 'memcpy' so union is also not needed for it.
>
> Why use an extremely ugly inefficient solution like memcpy or
> reinterpret_cast when the union works just fine?

What inefficiency? AFAIK reinterpret_cast is explicitly not
allowed to generate any run-time instructions. How can it be
inefficient? You likely haven't measured it any.

> And has worked just fine for forty
> years. And will work fine for the foreseeable future.

It is all likely so but no one of us can predict future.

>
> As for the uses of std::variant, they're limited to a very few
> real-world applications; it's not generally useful.

You think that union is really in the C (and so C++) language for
type-punning as its primary feature? Why do you think so? To my
knowledge the main purpose of union is to save memory by using the
same memory region for storing different objects at different times.
Interpreting exactly the same byte values as different objects at
different times is also sometimes needed, but a lot less often.

Chris Vine

unread,
Feb 16, 2017, 6:21:47 PM2/16/17
to
On Thu, 16 Feb 2017 14:58:08 -0800 (PST)
Öö Tiib <oot...@hot.ee> wrote:
> On Thursday, 16 February 2017 15:42:07 UTC+2, Scott Lurndal wrote:
> > Öö Tiib <oot...@hot.ee> writes:
> > >On Wednesday, 15 February 2017 19:51:22 UTC+2, Scott Lurndal
> > >wrote:
> > >> While std::variant is very similar to Pascal variant records,
> > >> personally I don't see any advantage over a plain old
> > >> union (which when I use them, I generally am using them to
> > >> access _both_ variants, e.g. if I need access to a subset
> > >> (say a bitfield) of a larger value or to access the floating
> > >> point value directly as a collection of bitfields (sign,
> > >> exponent, mantissa)).
> > >
> > >What you describe is type-punning. I am not fully sure if
> > >such usage of union is valid by standard but it seems that all
> > >compilers support it. It is usage that std::variant does
> > >not allow. Type-punning can be achieved with 'reinterpret_cast'
> > >or with 'memcpy' so union is also not needed for it.
> >
> > Why use an extremely ugly inefficient solution like memcpy or
> > reinterpret_cast when the union works just fine?
>
> What inefficiency? AFAIK reinterpret_cast is explicitly not
> allowed to generate any run-time instructions. How can it be
> inefficient? You likely haven't measured it any.

The problem with type punning via reinterpret_cast is that except when
casting to char* or unsigned char*, dereferencing the cast pointer
violates the strict aliasing rule in §3.10/10 of the standard. (Yes,
you can get round that by wrapping the reinterpret_cast within a
function call, but then std::memcpy or a union is clearer and more
convenient.)
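To illustrate the char exception, here is a small sketch that inspects an object's bytes through `unsigned char*`, which the aliasing rule allows. The function name is hypothetical, and the violating variant is left commented out on purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts the non-zero bytes in the object representation of `value`.
// Reading any object through unsigned char is permitted by the aliasing rule.
inline std::size_t nonzero_bytes(const std::uint32_t& value) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    std::size_t n = 0;
    for (std::size_t i = 0; i < sizeof value; ++i)
        if (p[i] != 0) ++n;
    return n;
}

inline void aliasing_example() {
    assert(nonzero_bytes(0u) == 0);
    assert(nonzero_bytes(0x00FF00FFu) == 2);  // independent of byte order

    // By contrast, this reads a float through a uint32_t lvalue and so
    // violates the aliasing rule; it is deliberately not compiled:
    //   float f = 1.0f;
    //   std::uint32_t bits = *reinterpret_cast<std::uint32_t*>(&f);
}
```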

> > And has worked just fine for forty
> > years. And will work fine for the foreseeable future.
>
> It is all likely so but no one of us can predict future.
>
> >
> > As for the uses of std::variant, they're limited to a very few
> > real-world applications; it's not generally useful.
>
> You think that union is really in C (and so in C++) language for
> type-punning as its primary feature? Why you think so? To my
> knowledge the main purpose of union is to save memory by using the
> same memory region for storing different objects at different times.
> Interpreting exactly same byte values as different objects at
> different times is also sometimes needed but lot less often.

In practice type punning probably is its main use. Using unions for
type punning is a long established tradition which although fine (and
ubiquitous) in C99/11 is apparently undefined behavior in C++. I say
"apparently", because it is asserted by Stroustrup in TC++PL 4th
edition (and I have also seen it asserted by others) that you have
undefined behavior if you read a union member which was not the last
one written to, although I have not managed to find the part of the
standard which says so explicitly: it seems to be taken to be implied
by the statement in the standard that "at most one of the non-static
data members of an object of union type can be active at any time".

However the construct is accepted in C++ as an extension by gcc and
clang, and probably most other compilers. In practice it is OK.

std::memcpy() is guaranteed by the standard and the copy will be
optimized out by the compiler. It will end up like a union without the
small prospect of undefined behavior on unknown compilers.
reinterpret_cast is a last resort.
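The memcpy approach can be hidden behind a small helper so call sites stay readable. A minimal sketch follows; the name `pun_cast` is made up for this example, and the expected bit pattern assumes IEEE 754 binary64 doubles.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>
#include <type_traits>

// Hypothetical helper: reinterprets the bytes of `from` as a value of type To.
template <class To, class From>
To pun_cast(const From& from) {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof to);  // typically compiled to a register move
    return to;
}

inline void pun_cast_example() {
    static_assert(std::numeric_limits<double>::is_iec559 &&
                  sizeof(double) == sizeof(std::uint64_t),
                  "example assumes IEEE 754 binary64");
    const std::uint64_t bits = pun_cast<std::uint64_t>(1.0);
    assert(bits == 0x3FF0000000000000ull);  // sign 0, exponent 1023, mantissa 0
    assert(pun_cast<double>(bits) == 1.0);  // the round trip preserves the value
}
```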

David Brown

unread,
Feb 17, 2017, 3:07:46 AM2/17/17
to
Wrapping in a function call does not help avoid strict aliasing rules
unless the function is defined in a separately compiled TU - in which
case it is inefficient. And if your setup includes link-time
optimisation of sufficient sophistication, the function call does not
help at all.

>
>>> And has worked just fine for forty
>>> years. And will work fine for the foreseeable future.
>>
>> It is all likely so but no one of us can predict future.
>>
>>>
>>> As for the uses of std::variant, they're limited to a very few
>>> real-world applications; it's not generally useful.
>>
>> You think that union is really in C (and so in C++) language for
>> type-punning as its primary feature? Why you think so? To my
>> knowledge the main purpose of union is to save memory by using the
>> same memory region for storing different objects at different times.
>> Interpreting exactly same byte values as different objects at
>> different times is also sometimes needed but lot less often.
>
> In practice type punning probably is its main use. Using unions for
> type punning is a long established tradition which although fine (and
> ubiquitous) in C99/11 is apparently undefined behavior in C++. I say
> "apparently", because it is asserted by Stroustrup in TC++PL 4th
> edition (and I have also seen it asserted by others) that you have
> undefined behavior if you read a union member which was not the last
> one written to, although I have not managed to find the part of the
> standard which says so explicitly: it seems to be taken to be implied
> by the statement in the standard that "at most one of the non-static
> data members of an object of union type can be active at any time".

The C89 standard said that type-punning through unions was undefined
behaviour. C++ based their rules on that C standard, but as far as I
can tell reading a union member that was not the last one written
(except for the "common start sequence of structs" exception) is
undefined behaviour in C++ by virtue of not having a defined behaviour
- unlike in C89 where it is explicitly labelled "undefined".
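The common initial sequence exception mentioned above can be sketched as follows. The struct and function names are hypothetical; the point is only that reading `i.kind` while `f` is active is permitted because both structs are standard-layout and start with the same member.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical message types sharing a leading `kind` member.
struct IntMsg   { int kind; int    value; };
struct FloatMsg { int kind; double value; };

static_assert(std::is_standard_layout<IntMsg>::value &&
              std::is_standard_layout<FloatMsg>::value,
              "the rule applies to standard-layout structs only");

union Msg {
    IntMsg   i;
    FloatMsg f;
};

// Inspects the shared leading member regardless of which struct is active.
inline int kind_of(const Msg& m) {
    return m.i.kind;
}

inline void common_initial_sequence_example() {
    Msg m;
    m.f = FloatMsg{2, 3.5};   // f is now the active member
    assert(kind_of(m) == 2);  // reads i.kind, part of the common initial sequence
}
```

Reading `m.i.value` at that point would fall outside the exception, since `int value` and `double value` are not part of the common sequence.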

>
> However the construct is accepted in C++ as an extension by gcc and
> clang, and probably most other compilers. In practice it is OK.
>
> std::memcpy() is guaranteed by the standard and the copy will be
> optimized out by the compiler. It will end up like a union without the
> small prospect of undefined behavior on unknown compilers.
> reinterpret_cast is a last resort.
>

Yes - memcpy is usually efficient for such purposes. But it is still
ugly :-)

Chris Vine

unread,
Feb 17, 2017, 7:20:26 AM2/17/17
to
On Fri, 17 Feb 2017 09:07:36 +0100
David Brown <david...@hesbynett.no> wrote:
[snip]
> Wrapping in a function call does not help avoid strict aliasing rules
> unless the function is defined in a separately compiled TU - in which
> case it is inefficient. And if your setup includes link-time
> optimisation of sufficient sophistication, the function call does not
> help at all.

Types have gone by link time. I think it would be impossible to treat
the return value or out parameter of a function in the way you mention
at that time. Indeed, many functions (iconv comes to mind) rely on this
by returning arrays of bytes which you are expected to cast to an array
of an integer type of the right size, which may not in fact represent
the dynamic type used in constructing the byte array. Some of the
BSD-style networking functions do something similar.

David Brown

unread,
Feb 17, 2017, 8:42:53 AM2/17/17
to
On 17/02/17 13:20, Chris Vine wrote:
> On Fri, 17 Feb 2017 09:07:36 +0100
> David Brown <david...@hesbynett.no> wrote:
> [snip]
>> Wrapping in a function call does not help avoid strict aliasing rules
>> unless the function is defined in a separately compiled TU - in which
>> case it is inefficient. And if your setup includes link-time
>> optimisation of sufficient sophistication, the function call does not
>> help at all.
>
> Types have gone by link time.

Not if you have link-time optimisation. The compiler can keep whatever
information it wants, allowing for exactly the same kinds of
optimisations and transforms as if all the definitions were in one huge
file.

My point is that using a function call to try to force an isolation like
this is not something you can rely on.

Tim Rentsch

unread,
Feb 18, 2017, 11:59:02 AM2/18/17
to
Chris Vine writes:

> On Thu, 16 Feb 2017 14:58:08 -0800 (PST) Tiib <oot...@hot.ee> wrote:
>> On Thursday, 16 February 2017 15:42:07 UTC+2, Scott Lurndal wrote:
>>> Tiib <oot...@hot.ee> writes:
>>>> On Wednesday, 15 February 2017 19:51:22 UTC+2, Scott Lurndal
>>>> wrote:
>>>>> While std::variant is very similar to Pascal variant records,
>>>>> personally I don't see any advantage over a plain old
>>>>> union (which when I use them, I generally am using them to
>>>>> access _both_ variants, e.g. if I need access to a subset
>>>>> (say a bitfield) of a larger value or to access the floating
>>>>> point value directly as a collection of bitfields (sign,
>>>>> exponent, mantissa)).
>>>>
>>>> What you describe is type-punning. I am not fully sure if
>>>> such usage of union is valid by standard but it seems that all
>>>> compilers support it. It is usage that std::variant does
>>>> not allow. Type-punning can be achieved with 'reinterpret_cast'
>>>> or with 'memcpy' so union is also not needed for it.

[...]

>> You think that union is really in C (and so in C++) language for
>> type-punning as its primary feature? Why you think so? To my
>> knowledge the main purpose of union is to save memory by using the
>> same memory region for storing different objects at different times.
>> Interpreting exactly same byte values as different objects at
>> different times is also sometimes needed but lot less often.
>
> In practice type punning probably is its main use. Using unions for
> type punning is a long established tradition which although fine (and
> ubiquitous) in C99/11 is apparently undefined behavior in C++. I say
> "apparently", because it is asserted by Stroustrup in TC++PL 4th
> edition (and I have also seen it asserted by others) that you have
> undefined behavior if you read a union member which was not the last
> one written to, although I have not managed to find the part of the
> standard which says so explicitly: it seems to be taken to be implied
> by the statement in the standard that "at most one of the non-static
> data members of an object of union type can be active at any time".

After seeing this comment I looked into this question (again), a
little more deeply this time. I am now of the opinion that
accessing a member other than the active member is defined
behavior, and not undefined behavior, subject of course to the
usual caveats about relative sizes, depending on representations,
etc. My reasoning is as follows (referencing C++14 of n4296).

In section 5.20 paragraph 2, defining a core constant expression,
one of the exclusions listed is (2.5)

an operation that would have undefined behavior

A little further on, in (2.8), there is another exclusion

an lvalue-to-rvalue conversion [...] that refers to a
non-active member of a union [...]

There would be no reason to list the second case if it were
already undefined behavior. I believe it is generally accepted
that writing in the C++ standard tries to avoid this sort of
redundancy. Hence it follows that (most likely) the authors
consider such accesses defined behavior rather than undefined
behavior (as before, subject to the usual caveats).

What do y'all think?
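The constant-expression restriction quoted above can be observed directly. In this sketch (the union and variable names are hypothetical) the read of the active member is accepted at compile time, while the commented-out read of the other member is the case the second exclusion covers.

```cpp
#include <cassert>

union IntOrFloat {  // hypothetical union for this illustration
    int   i;
    float f;
};

// Brace initialization makes `i` the active member.
constexpr IntOrFloat kValue{42};

// Reading the active member is fine in a constant expression.
constexpr int kActive = kValue.i;
static_assert(kActive == 42, "active member is readable at compile time");

// Reading the non-active member is not a core constant expression, so a
// conforming compiler rejects the following line:
// constexpr float kInactive = kValue.f;

inline void constexpr_union_example() {
    assert(kActive == 42);
}
```

This only shows what is rejected in constant expressions; it does not settle whether the same read at run time is undefined.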

Alf P. Steinbach

unread,
Feb 18, 2017, 3:20:50 PM2/18/17
to
On 18.02.2017 17:58, Tim Rentsch wrote:
There is a paragraph with an intended-to-be-exhaustive bullet point list
of supported reinterpretations of bitpatterns. The gcc folks call it
/the strict aliasing rule/ of C++ and are pretty zealot-like about it
because of the silly behavior of that compiler: that one is supposed to
introduce inefficiency via `memcpy`, ideally two calls of `memcpy` per
reinterpretation, and hope to have that inefficiency optimized away by
the compiler, just to avoid its warnings and possibly
royally-screw-things-up pseudo-optimizations, where otherwise one would
have two pointers to different types really pointing to the same object,
i.e. aliased pointers. The standard doesn't call it strict aliasing so
I'm not sure how to search for it, and I always forget where it is.

Anyway,

• The list is not really exhaustive, it has at least one one-way
reinterpretation.

• The reinterpretations supported are those where some common part of the
bitpattern really means the same in the two interpretations, e.g. signed
versus unsigned interpretation for non-negative integers.

• The reinterpretations of interest via type punning in e.g. a union,
are generally those not in the strict aliasing rule bullet list.

What you quoted seems to be about any reinterpretation at all not being
permitted in a constant expression.


Cheers & hth.,

- Alf

Chris Vine

unread,
Feb 18, 2017, 6:52:07 PM2/18/17
to
On strict aliasing, the provision is §3.10/10 of the standard.

Reinterpreting bit patterns through a union does not fall foul of the
strict aliasing rule set out there because of the sixth bullet of that
paragraph. It is directly equivalent to the fifth bullet of §6.5/7 of
C11 which is what permits (with §6.2.6.1/6 and /7 read with footnote 95
of §6.5.2.3/3) type punning through a union in C11.

Assuming bit patterns and alignment are compatible, it is the separate
rule about reading a union member other than the current active one
which is (as I understand it) the problem in C++ (if there is one).

Anyway, common compilers support this idiom.

Tim Rentsch

unread,
Feb 24, 2017, 4:40:06 PM2/24/17
to
> search for it, and I always forget where it is. [...]

Like Chris Vine said these rules are in section 3.10 p10. The
rules do not interfere with access through union members, as I
explain below, but let me go on to the individual items.

> * The list is not really exhaustive, it has at least one one-way
> reinterpretation.

I assume you're talking about the asymmetric rule involving
aggregate or union types. That rule has its origins in the
original C standard. I agree it looks suspicious. Despite that
I believe there are sensible reasons for the asymmetry. In any
case that clause isn't relevant here because the accesses I'm
talking about are through the union's members, never through the
union as a whole.


> * The reinterpretations supported are those where some common part of the
> bitpattern really means the same in the two interpretations,
> e.g. signed versus unsigned interpretation for non-negative integers.

That is true for arbitrary accesses, independent of unions. For
example it's okay to access an (unsigned int) using an (int *).
But this sort of thing is not what's happening when accesses
are done through union members. See next.


> * The reinterpretations of interest via type punning in e.g. a union,
> are generally those not in the strict aliasing rule bullet list.

That statement is true, but actually not important. To see why,
consider this picture:

+--------+--------+--------+--------+--------+--------+--------+--------+
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
+--------+--------+--------+--------+--------+--------+--------+--------+
<---- u.small ---->
<------------- u.large ------------->
<----------------------------------- u --------------------------------->

The region shown as 'u' is a union of 8 bytes. The union type
has two members, 'small' of type uint16_t, and 'large' of type
uint32_t. Suppose we store into u.large and read u.small:

u.large = 23;
std::cout << u.small;

The type of the lvalue is uint16_t. The dynamic type of the
object being read is the type of the u.small object, which is
also uint16_t. This access is allowed under the first bullet
item of 3.10 p10.

To be sure what I just said about dynamic type is true, I read
through the definition of dynamic type, and some other terms like
"object", so as not to be misled. (I hope others will do this
also, to give me a double check.) Because of what region is
accessed and how the access is done, the only candidate object is
u.small, whose dynamic type is uint16_t. That matches the lvalue
type of the access, so we are good to go.

I do have a "however" to add to this, please see below.


> What you quoted seems to be about any reinterpretation at all not
> being permitted in a constant expression.

Yes, it definitely is, but that doesn't affect the reasoning. If
reading one member to access a different member-last-written were
undefined behavior under other circumstances there is no reason
to make it undefined behavior under these circumstances. The
dynamic type rules of 3.10 p10 do not interfere with these cases
of union member access, as I have explained above.


And now for the big "having said that....".

I have gone through the various parts of the C++ standard
checking and double checking the reasoning given above. I am
pretty sure the reasoning is sound.

However.... Different passages in the C++ standard look a bit
schizophrenic about unions. In some places it looks like unions
"hold" only one object at a time. In other places it looks like
unions always have all their objects - with the understanding that
only one member may be the "active" member, but the other members
are still identified with "regions of memory", which is what an
object is. It's easy to see how different people might reach
different conclusions about what the semantics are in these
cases.

In C, the wording regarding unions is more straightforward, and
it's fairly clear what is meant to happen. Moreover, if that
weren't enough, the C standard includes a prominent footnote
which says explicitly that union access through another member
will reinterpret the bits of the member last written. For
whatever reason the C++ standard has chosen to muddy the waters
about how unions are supposed to behave. So I think the best
we can say for sure is that the issue is open to debate, and
the C++ standard deserves a big Defect Report as long as that
is true.

I hope you have enjoyed this long explanation and probably
pointless exercise. :)

woodb...@gmail.com

unread,
Oct 20, 2018, 12:56:05 AM10/20/18
to
Originally in this thread I called std::variant a "win".
But after watching this:

CppCon 2018: Rishi Wani “Datum: A Compact Bitwise Copyable Variant Type”

https://duckduckgo.com/?q=datum+cppcon&t=ffab&ia=videos&iax=videos&iai=YdzbrFerlRY

my opinion of std::variant is not so high. Bloomberg has
done something interesting here with Datum. I'm glad I
haven't spent much time working on serialization support for
std::variant as datum is superior to it in a number of ways.


Brian
Ebenezer Enterprises - Enjoying programming again.
https://github.com/Ebenezer-group/onwards

Öö Tiib

unread,
Oct 20, 2018, 2:54:28 AM10/20/18
to
That lecture concentrates on how to pack stuff into the 52 often unused
bits of a signaling NaN of a double on a 32-bit platform. Sure, we can
store a lot of stuff into 52 bits on a 32-bit platform if we so want.
I did not know that Bloomberg was so heavily into embedded development.
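For readers unfamiliar with the general trick, here is a generic NaN-boxing sketch. It is not Datum's actual layout: the tag values and the `Boxed` class are assumptions for illustration, it uses a quiet NaN pattern rather than a signaling one, and it assumes IEEE 754 binary64 doubles.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(double) == sizeof(std::uint64_t) &&
              std::numeric_limits<double>::is_iec559,
              "sketch assumes IEEE 754 binary64 doubles");

// Hypothetical NaN-boxed value holding either a double or a 32-bit integer.
class Boxed {
public:
    static Boxed from_double(double d) {
        Boxed b;
        std::memcpy(&b.bits_, &d, sizeof d);
        return b;
    }
    static Boxed from_int(std::int32_t i) {
        Boxed b;
        b.bits_ = kIntTag | static_cast<std::uint32_t>(i);
        return b;
    }
    bool is_int() const { return (bits_ & kTagMask) == kIntTag; }
    std::int32_t as_int() const {
        assert(is_int());
        const std::uint32_t payload = static_cast<std::uint32_t>(bits_);
        std::int32_t i;
        std::memcpy(&i, &payload, sizeof i);
        return i;
    }
    double as_double() const {
        assert(!is_int());
        double d;
        std::memcpy(&d, &bits_, sizeof d);
        return d;
    }
private:
    // An all-ones exponent with a non-zero mantissa is a NaN. The upper
    // mantissa bits act as a type tag; the low 32 bits carry the payload.
    static constexpr std::uint64_t kTagMask = 0xFFFFFFFF00000000ull;
    static constexpr std::uint64_t kIntTag  = 0x7FF8000100000000ull;
    std::uint64_t bits_ = 0;
};

static_assert(sizeof(Boxed) == 8, "the whole value fits in one double");
```

The storage is kept as a `std::uint64_t` so that the tagged pattern never passes through floating-point registers, which on some platforms can alter NaN payloads.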

Öö Tiib

unread,
Oct 20, 2018, 4:23:23 AM10/20/18
to
On Saturday, 20 October 2018 07:56:05 UTC+3, woodb...@gmail.com wrote:
> On Wednesday, February 15, 2017 at 11:53:10 AM UTC-6, Scott Lurndal wrote:
> > woodb...@gmail.com writes:
> >
> > >> That's my take on these additions. I tried replacing a use of
> > >> std::unique_ptr with std::optional. The resulting executable
> > >> was 588 bytes bigger (~1.5%). It avoids the allocation/
> > >> deallocation, but not without it's own cost. In a single
> > >> threaded program perhaps unique_ptr would be better.
> >
> > Why do you consider 588 bytes significant?
>
> Originally in this thread I called std::variant a "win".
> But after watching this:
>
> CppCon 2018: Rishi Wani “Datum: A Compact Bitwise Copyable Variant Type”
>
> https://duckduckgo.com/?q=datum+cppcon&t=ffab&ia=videos&iax=videos&iai=YdzbrFerlRY

Note that the benchmark at 26:17 is clearly defective.

The worst case mentioned, tag+string, is 40 bytes on a 64-bit platform, so
100 000 variants should take 4 000 000 bytes. His chart, however, shows
memory usage of 28 000 000 bytes. What was the object with a size of 280
bytes? How was it packed into 6.5 bytes of signaling NaNs?

I trust that I would stick with StarOffice or MSOffice. No way I would
consider BloombergOffice since the guys there can't calculate.
