Proposal for a uuid library

Marius Bancila

Dec 18, 2017, 10:51:27 AM

to ISO C++ Standard - Future Proposals

I have written the draft of a proposal for a universally unique identifier (uuid) library. The draft is available here: https://github.com/mariusbancila/stduuid/blob/master/paper.md.

I have also written an implementation of the proposed library, that is available on Github here: https://github.com/mariusbancila/stduuid.

I have not submitted this proposal yet. I am looking for feedback first, both on the library design and the proposal document itself (I know it is not the best possible, but it's my first attempt).

Note: Although the proposal specifies that the library should be provided as part of a new namespace called std::uuids, the implementation on Github uses the namespace uuids so that it can actually be used as it is today.

Jan Wilmans

Dec 18, 2017, 11:04:20 AM

to std-pr...@isocpp.org

Hi Marius,

I'm not an expert on proposals, nor on uuid's so I'm not sure I'm qualified to judge here, but, looking at the proposal, this came to my mind:

1) you have an implementation, great! On what compilers was it tested?

2) is there a way to use it without dynamic memory allocation?

ps. minor typo "to and frmo strings,"

Nicolas Lesser

Dec 18, 2017, 11:26:21 AM

to ISO C++ Standard - Future Proposals

Didn't we agree that nested namespaces in std should be abolished? I can't find the paper proposing it, and I don't know whether it was accepted or not, but I completely agree. Having to write `std::uuids::make_uuid()` is a bit repetitive. No need for a nested namespace IMO.

> The library proposed on this paper is a lite one

Did you mean "light" instead of "lite"?

For the sake of consistency, I propose you name the two string conversion functions `to_string` and template it like std::bitset::to_string did. The constructor of uuid for the two string types should also be templated.

Shouldn't you be using std::byte instead of std::uint8_t?

If you add iterator support, I think it would be nice to also allow construction from a pair of iterators.

Andrey Semashev

Dec 18, 2017, 12:07:37 PM

to std-pr...@isocpp.org

On 12/18/17 18:51, Marius Bancila wrote:
> I have written the draft of a proposal for a universally unique
> identifier (uuid) library. The draft is available here:
> https://github.com/mariusbancila/stduuid/blob/master/paper.md.
>
> I have also written an implementation of the proposed library, that is
> available on Github here: https://github.com/mariusbancila/stduuid.
>
> I did not submit this proposal yet. I am looking for feedback first,
> both on the library design and the proposal document itself (I know it is
> not the best possible, but it's my first attempt).
>

> /Note/: Although the proposal specifies that the library should be
> provided as part of a new namespace called /std::uuids/, the
> implementation on Github uses the namespace /uuids/ so that it can
> actually be used as it is today.

I've been thinking about writing a proposal for std::uuid myself one
day, so I'm glad someone did this sooner. :)

I have a few comments:

1. I'd rather remove any mention of std::wstring and only deal with
narrow strings.

2. Parsing string representation should accept std::string_view instead
of std::string.

3. I think, parsing should be a separate algorithm rather than a
constructor. I think, the current convention is std::from_chars.

http://en.cppreference.com/w/cpp/utility/from_chars

4. Conversion to string should probably be a free function std::to_string.

http://en.cppreference.com/w/cpp/string/basic_string/to_string

Although I could be convinced of having a member function as well.

The reason for having free functions instead of member functions is that
it can be useful in generic code that could work with different types,
including uuids (think of generic parsing/formatting libraries, like
Boost.Spirit).

5. This constructor:

explicit uuid(uint8_t const * const bytes);

is potentially unsafe. Yes, we know `bytes` should point to 16 bytes,
but this interface makes it not obvious and not enforceable. I would
rather replace this constructor and std::array-based one with this:

template< typename ForwardIterator >
explicit uuid(ForwardIterator bytes_begin, ForwardIterator bytes_end);

This also follows the conventions of other sequences/containers in the
standard library.

6. The default constructor should not be explicit. It should be
constexpr and noexcept.

7. Where does uuid_variant come from and why is it necessary? I may be
forgetting, but I don't remember RFC4122 mentioning variants.

8. Random UUID generation algorithm should accept the random number
generator. The proposal should not require the implementation to use
some pre-defined, possibly global RNG. It should allow reusing the RNG
to generate multiple UUIDs. I would also name the algorithm something
like `generate_random_uuid`.

9. I still think the proposal should include other means of generating
UUIDs, such as generating UUID from a name. When generating from a name,
the algorithm should also accept the hashing function to use.

10. I would like the proposal to explicitly mention that the alignment
of the `uuid` type is allowed and encouraged to be higher than 1, if it
allows for a faster implementation (read: using vector instructions).
Also, the proposal should make clear that the internal representation
may not be a straightforward array of bytes and may have arbitrary
endianness. This means that iterators should not be defined as pointers.
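
As an illustration of the iterator-range constructor suggested in point 5, a minimal sketch might look like the following. The `sketch::uuid` class, its `bytes()` accessor, and the choice to throw on a range that is not exactly 16 bytes long are all assumptions made for this example; the paper may specify different behavior.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <stdexcept>

namespace sketch {

// Hypothetical stand-in for the proposed class, not the paper's definition.
class uuid {
public:
    // Nil UUID; non-explicit, constexpr and noexcept as asked for in point 6.
    constexpr uuid() noexcept = default;

    // Iterator-range constructor in the spirit of point 5.
    // Assumption: a range that is not exactly 16 elements long is an error.
    template <typename ForwardIterator>
    explicit uuid(ForwardIterator first, ForwardIterator last) {
        if (std::distance(first, last) != 16) {
            throw std::invalid_argument("uuid requires exactly 16 bytes");
        }
        std::size_t i = 0;
        for (; first != last; ++first) {
            bytes_[i++] = static_cast<std::uint8_t>(*first);
        }
    }

    // Hypothetical accessor used only to inspect the stored value.
    const std::array<std::uint8_t, 16>& bytes() const noexcept { return bytes_; }

private:
    std::array<std::uint8_t, 16> bytes_{};
};

}  // namespace sketch
```

Unlike the raw-pointer constructor, the length of the input is visible at the call site and can be checked, for example `sketch::uuid id(std::begin(raw), std::end(raw));` for a 16-element array `raw`.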

Nicol Bolas

Dec 18, 2017, 1:07:46 PM

to ISO C++ Standard - Future Proposals

On Monday, December 18, 2017 at 11:26:21 AM UTC-5, Nicolas Lesser wrote:

> Didn't we agree that nested namespaces in std should be abolished? I can't find the paper proposing it, and I don't know whether it was accepted or not, but I completely agree.

There was a paper discussing it, but that doesn't mean there was agreement with that paper.

That being said, the proposal contains 1 class, 2 enums, and a small number of functions. That's hardly a good reason to stick them in a namespace, regardless of the question of nested namespaces in `std`.

> Having to write `std::uuids::make_uuid()` is a bit repetitive. No need for a nested namespace IMO.
>
>> The library proposed on this paper is a lite one
> Did you mean "light" instead of "lite"?
>
> For the sake of consistency, I propose you name the two string conversion functions `to_string` and template it like std::bitset::to_string did. The constructor of uuid for the two string types should also be templated.
>
> Shouldn't you be using std::byte instead of std::uint8_t?

Especially since `uint8_t` is not required to be implemented.

Marius Bancila

Dec 19, 2017, 5:56:19 AM

to std-pr...@isocpp.org

Thank you all for the good feedback so far. Let me try to respond to your comments.

> you have an implementation, great! On what compilers was it tested?

Visual C++ 2017 15.4.5 on Windows and Clang Apple LLVM version 9.0.0 (clang-900.0.39.2) on Mac.

> is there a way to use it without dynamic memory allocation?

What sort of dynamic memory allocation? Apologies, but not sure I get the question.

> Didn't we agree that nested namespaces in std should be abolished? I can't find the paper proposing it, and I don't know whether it was accepted or not, but I completely agree. Having to write `std::uuids::make_uuid()` is a bit repetitive. No need for a nested namespace IMO.
>
> There was a paper discussing it, but that doesn't mean there was agreement with that paper.
>
> That being said, the proposal contains 1 class, 2 enums, and a small number of functions. That's hardly a good reason to stick them in a namespace, regardless of the question of nested namespaces in `std`.

I was not aware of such an agreement. My personal opinion is that there should be separate namespaces for different libraries, not just a big bloated namespace (to bring them all and in the darkness bind them). But if that is the agreement, or you believe the library is too small to have its own namespace, I'm fine with std too.

> Shouldn't you be using std::byte instead of std::uint8_t?

My thoughts were that I want to make it simply constructible from an array of unsigned chars, without having to explicitly convert every single one to std::byte. But I see there is a consensus std::byte should be the one to go with, so I will do that.

> For the sake of consistency, I propose you name the two string conversion functions `to_string` and template it like std::bitset::to_string did. The constructor of uuid for the two string types should also be templated.
>
> Conversion to string should probably be a free function std::to_string.
> http://en.cppreference.com/w/cpp/string/basic_string/to_string

I have considered that myself and after some thoughts I decided to go with member functions string()/wstring(). The reason for that was I have seen the same thing was done for filesystem::path, which is the newest library to the standard, so I thought that should be a good model. You seem to think that is not the case.

> I'd rather remove any mention of std::wstring and only deal with narrow strings.

Can you please explain why?

> Parsing string representation should accept std::string_view instead of std::string.

Right. I have changed that.

> I think, parsing should be a separate algorithm rather than a constructor. I think, the current convention is std::from_chars.
> http://en.cppreference.com/w/cpp/utility/from_chars

I think that in the end some things are all about preferences and style. I rather prefer simply constructing an object by passing an argument to its constructor and not call a free function to do so. Again, filesystem::path supports constructing paths from strings without relying on free functions. Should the parsing be a free function and not a member one, I would rather call it make_uuid() and not from_chars, from_string, to_uuid or others. I would prefer to have overloads for make_uuid(), for the simplicity of the API.

> This constructor:
> explicit uuid(uint8_t const * const bytes);
> is potentially unsafe. Yes, we know `bytes` should point to 16 bytes, but this interface makes it not obvious and not enforceable. I would rather replace this constructor and std::array-based one with this:
> template< typename ForwardIterator >
> explicit uuid(ForwardIterator bytes_begin, ForwardIterator bytes_end);
> This also follows the conventions of other sequences/containers in the standard library.

I was having my own doubts about that particular constructor. I followed your suggestion and removed it and added an iterator based constructor.

> The default constructor should not be explicit. It should be constexpr and noexcept.

Done.

> Where does uuid_variant come from and why is it necessary? I may be forgetting, but I don't remember RFC4122 mentioning variants.

It does under section 4.1.1 (https://tools.ietf.org/html/rfc4122#section-4.1.1), although three of the four variants are reserved either for backwards compatibility or future use. Therefore, in practice we only get uuid_variant::rfc UUIDs; variant 2 (i.e. uuid_variant::microsoft) was used for early GUIDs on the Windows platform. For instance, 00000000-0000-0000-C000-000000000046 is the uuid for the IUnknown COM interface. This is a variant 2 uuid.
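
The variant lookup from section 4.1.1 of the RFC can be sketched as below: the variant lives in the most significant bits of octet 8. The enumerator names follow the ones mentioned in this thread, while the `sketch` namespace and the free function `variant_of` are hypothetical names chosen for this example, not the paper's interface.

```cpp
#include <array>
#include <cstdint>

namespace sketch {

// Enumerator names as used in this thread; the enum itself is a stand-in.
enum class uuid_variant { ncs, rfc, microsoft, future };

// Hypothetical helper: decode the variant from the top bits of octet 8
// (clock_seq_hi_and_reserved in RFC 4122, section 4.1.1).
inline uuid_variant variant_of(const std::array<std::uint8_t, 16>& bytes) noexcept {
    const std::uint8_t octet = bytes[8];
    if ((octet & 0x80) == 0x00) return uuid_variant::ncs;        // 0xxx xxxx
    if ((octet & 0xC0) == 0x80) return uuid_variant::rfc;        // 10xx xxxx
    if ((octet & 0xE0) == 0xC0) return uuid_variant::microsoft;  // 110x xxxx
    return uuid_variant::future;                                 // 111x xxxx
}

}  // namespace sketch
```

For the IUnknown identifier quoted above, octet 8 is 0xC0 (binary 1100 0000), so the top three bits are 110 and the function reports the Microsoft variant.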

> 8. Random UUID generation algorithm should accept the random number generator. The proposal should not require the implementation to use some pre-defined, possibly global RNG. It should allow reusing the RNG to generate multiple UUIDs. I would also name the algorithm something like `generate_random_uuid`.
> 9. I still think the proposal should include other means of generating UUIDs, such as generating UUID from a name. When generating from a name, the algorithm should also accept the hashing function to use.
> 10. I would like the proposal to explicitly mention that the alignment of the `uuid` type is allowed and encouraged to be higher than 1, if it allows for a faster implementation (read: using vector instructions). Also, the proposal should make clear that the internal representation may not be a straightforward array of bytes and may have arbitrary endianness. This means that iterators should not be defined as pointers.

I need a bit more time to think about those ones.

Thank you again,

Marius


Andrey Semashev

Dec 19, 2017, 7:29:25 AM

to std-pr...@isocpp.org

On 12/19/17 13:55, Marius Bancila wrote:
>
> Shouldn't you be using std::byte instead of std::uint8_t?
>
>
> My thoughts were that I want to make it simply constructible from an
> array of unsigned chars, without having to explicitly convert every
> single one to std::byte. But I see there is a consensus std::byte should
> be the one to go with, so I will do that.

My personal preference is to keep the support for interpreting `unsigned
char`s and `std::uint8_t`s. There are a lot of instances where UUIDs are
obtained from external sources, including C code, which has no idea of
`std::byte`. Also, `std::byte` requires explicit conversion from
integers, which clutters UUID constant initialization for no reason. You
can see that in the examples presented in the paper.

As a side note, I'm not that fond of `std::byte` as its use is
unnecessarily complicated in real programs. You never ever get or pass
pointers to `std::byte`, and `std::byte` constants always require
casting. Seems too much effort to use this type for no gain.
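
The difference in initialization syntax can be seen in a small sketch. The array names and byte values below are arbitrary placeholders chosen for this example, not values taken from the paper.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// With std::uint8_t, integer literals initialize the elements directly.
inline constexpr std::array<std::uint8_t, 4> as_uint8{0x47, 0x18, 0x3b, 0x6c};

// With std::byte, every element needs an explicit conversion, because
// std::byte is an enum class and does not convert implicitly from int.
inline constexpr std::array<std::byte, 4> as_byte{
    std::byte{0x47}, std::byte{0x18}, std::byte{0x3b}, std::byte{0x6c}};

// Reading a std::byte back as a number also needs an explicit call.
inline int first_byte_value() noexcept {
    return std::to_integer<int>(as_byte[0]);
}
```

Both arrays hold the same bits; only the amount of ceremony at the call site differs, which is the clutter described above for 16-byte UUID constants.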

> For the sake of consistency, I propose you name the two string
> conversion functions `to_string` and template it like
> std::bitset::to_string did. The constructor of uuid for the two
> string types should also be templated.
>
>
> Conversion to string should probably be a free function std::to_string.
> http://en.cppreference.com/w/cpp/string/basic_string/to_string
>
>
> I have considered that myself and after some thoughts I decided to go
> with member functions string()/wstring(). The reason for that was I have
> seen the same thing was done for filesystem::path, which is the newest
> library to the standard, so I thought that should be a good model. You
> seem to think that is not the case.
>
> I'd rather remove any mention of std::wstring and only deal with
> narrow strings.
>
> Can you please explain why?

First, the type should not deal with character code conversion. Note
that wchar_t encoding is not defined by the standard. Second, if you
intend to support wchar_t, then the next question is why not char16_t
and char32_t as well? This would unnecessarily complicate the
implementation and interface of the component. Third, this is the
current convention with respect to other types in the standard library.

> I think, parsing should be a separate algorithm rather than a
> constructor. I think, the current convention is std::from_chars.
> http://en.cppreference.com/w/cpp/utility/from_chars
>
> I think that in the end some things are all about preferences and style.
> I rather prefer simply constructing an object by passing an argument to
> its constructor and not call a free function to do so. Again,
> filesystem::path supports constructing paths from strings without
> relying on free functions.

filesystem::path is not an adequate comparison because it mostly wraps a
string. It is natural for the path to be constructible from a string -
it may not even require any parsing. UUIDs are not strings though, and
constructing from a string is not a mere conversion and definitely
requires parsing. Such action should better be more explicit and not
tied to the type itself.

> Should the parsing be a free function and not
> a member one, I would rather call it make_uuid() and not from_chars,
> from_string, to_uuid or others. I would prefer to have overloads for
> make_uuid(), for the simplicity of the API.

make_uuid is ambiguous. In your proposal you used this name to generate
random UUIDs. You could also use this function to generate UUIDs from
names. That conflicts with using the same function to parse a UUID from
a string.

I would prefer a separate name for each algorithm, a name that describes
the behavior most clearly. So generating a random UUID should be
something like `generate_random_uuid`, generating from a name
`generate_name_uuid`, and parsing a UUID from a string `from_chars` or
`from_string`; preferably something that has a set convention for this
purpose.

> Where does uuid_variant come from and why is it necessary? I may be
> forgetting, but I don't remember RFC4122 mentioning variants.
>
> It does under section 4.1.1
> (https://tools.ietf.org/html/rfc4122#section-4.1.1), although three of
> the four variants are reserved either for backwards compatibility or
> future use. Therefore, in practice we only get uuid_variant::rfc UUIDs,
> variant 2 (i.e. uuid_variant::microsoft) was used for early GUIDs on the
> Windows platform. For instance, 00000000-0000-0000-C000-000000000046 is
> the uuid for the IUnknown COM interface. This is a variant 2 uuid.

I see, although I doubt there's a practical need for this field.

Andrey Semashev

Dec 19, 2017, 7:38:09 AM

to std-pr...@isocpp.org

UUIDs are defined as 128-bit values. The RFC also defines contents of
specific bits of the value. So, it might make sense to precondition UUIDs
on the support for `std::uint8_t`.

Jan Wilmans

Dec 19, 2017, 7:49:39 AM

to std-pr...@isocpp.org

>> is there a way to use it without dynamic memory allocation?
>
> What sort of dynamic memory allocation? Apologies, but not sure I get the question.

Well, this is not really a specific remark on the proposal itself, but more on any proposal to come.

Given the recent discussions of outcome and/or expected, I'm not sure it makes sense to standardize more libraries that support exceptions as their only error-handling mechanism.

Also, work on the freestanding proposal is ongoing (it was discussed in the December SG14 telecon).

I would argue that it also does not make sense to standardize any more libraries that are not freestanding but _could_ be made freestanding (and constexpr, for that matter).

Thiago Macieira

Dec 19, 2017, 12:40:50 PM

to std-pr...@isocpp.org

Why can't you store 128 bits in a type that is 144 bits long?

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Andrey Semashev

Dec 19, 2017, 4:13:29 PM

to std-pr...@isocpp.org

On 12/19/17 20:40, Thiago Macieira wrote:
> On terça-feira, 19 de dezembro de 2017 04:38:06 PST Andrey Semashev wrote:
>> On 12/18/17 21:07, Nicol Bolas wrote:
>>> On Monday, December 18, 2017 at 11:26:21 AM UTC-5, Nicolas Lesser wrote:
>>> Shouldn't you be using std::byte instead of std::uint8_t?
>>>
>>> Especially since `uint8_t` is not required to be implemented.
>>
>> UUIDs are defined as 128-bit values. The RFC also defines contents of
>> specific bits of the value. So, it might make sense to precondition UUIDs
>> on the support for `std::uint8_t`.
>
> Why can't you store 128 bits in a type that is 144 bits long?

You can, but you will have problems importing/exporting the value as
exactly 128 bits with no padding because that is the format defined by
the RFC.

Arthur O'Dwyer

Dec 19, 2017, 5:05:14 PM

to ISO C++ Standard - Future Proposals

On Tuesday, December 19, 2017 at 2:56:19 AM UTC-8, Marius Bancila wrote:

>> is there a way to use it without dynamic memory allocation?
>
> What sort of dynamic memory allocation? Apologies, but not sure I get the question.

The then-current version had a hard dependency on std::string, which you've now fixed, except that I'm still not sure there's any way to extract the "string representation" of a std::uuid object without using std::string. One not-really-solution — which is probably a good idea anyway — is to provide a stream operator

ostream& operator<<(ostream&, const uuid&);

that streams the result directly into the ostream rather than needing dynamic memory allocation to convert it to std::string first. Of course, people who want to avoid memory allocation often want to avoid iostreams as well; that's why I say this is a not-really-solution. I have more comments on I/O below.

>> Shouldn't you be using std::byte instead of std::uint8_t?
>
> My thoughts were that I want to make it simply constructible from an array of unsigned chars, without having to explicitly convert every single one to std::byte. But I see there is a consensus std::byte should be the one to go with, so I will do that.

I don't think there's consensus. std::byte is for bytes, not for an array that really boils down to a single __uint128_t anyway. The obvious way to represent a UUID in memory is as a __uint128_t. The second most obvious way is as a pair of uint64_ts, or a set-of-16 uint8_ts. I don't see why std::byte should be relevant at all.

Also, you should remove the begin() and end() methods from class uuid. Iterators are used to enable algorithms over sequences, and a UUID is not a sequence. Now, it can be serialized into a sequence of 16 bytes, or 2 uint64_ts, or ~35 characters, but it is not itself a sequence of anything. It's just an object.

>> For the sake of consistency, I propose you name the two string conversion functions `to_string` and template it like std::bitset::to_string did. The constructor of uuid for the two string types should also be templated.

For consistency with std::filesystem::path, I recommend leaving .string() and .wstring() alone at least for now.

http://en.cppreference.com/w/cpp/filesystem/path/string

I might equally well bikeshed them to .str() and .wstr(), or suggest implementing operator<< and not providing any other stringification API, but these are minor details and changing them will not improve the "consistency" of the C++ standard library, which is fundamentally inconsistent in this area.

>> Conversion to string should probably be a free function std::to_string.
>> http://en.cppreference.com/w/cpp/string/basic_string/to_string
>
> I have considered that myself and after some thoughts I decided to go with member functions string()/wstring(). The reason for that was I have seen the same thing was done for filesystem::path, which is the newest library to the standard, so I thought that should be a good model. You seem to think that is not the case.

The massive overload set of std::to_string() was a mistake in standardization that should never have happened.

What should exist, and does exist in every codebase AFAICT, is

    #include <sstream>
    #include <utility>

    template<class T>
    auto to_string(T&& t) {
        std::ostringstream oss;
        oss << std::forward<T>(t);
        return oss.str();
    }

Adding overloads to deal with primitive types as special cases is possible but definitely not necessary.

>> I'd rather remove any mention of std::wstring and only deal with narrow strings.
>
> Can you please explain why?

wchar_t is not useful in portable code; whereas plain old `char` effectively is portable, because most people "just know" that char is 8-bit ASCII or a superset thereof, and std::string works with plain `char`, and so on. So the question is really just whether it's worth all the "administrative overhead" — extra verbiage in the standard, and the entry for .wstring() on cppreference, and so on — to deal with a case (wchar_t) that nobody is going to care about in practice.

However, if you go with operator<<, you don't have to worry as much about the administrative overhead. Wide streams and regular streams work pretty much the same at the high level, which is all you need in this case. Take a look at the one-line specifications of operator<< in [syserr.errcode.nonmembers], [util.smartptr.shared.io], etc.

>> I think, parsing should be a separate algorithm rather than a constructor. I think, the current convention is std::from_chars.
>> http://en.cppreference.com/w/cpp/utility/from_chars

std::from_chars / std::to_chars should not be used as a model of how to do things. They're very low-level and difficult to use correctly. Marius's intuition is correct here.

>> This constructor:
>> explicit uuid(uint8_t const * const bytes);
>> is potentially unsafe. Yes, we know `bytes` should point to 16 bytes, but this interface makes it not obvious and not enforceable. I would rather replace this constructor and std::array-based one with this:
>> template< typename ForwardIterator >
>> explicit uuid(ForwardIterator bytes_begin, ForwardIterator bytes_end);
>> This also follows the conventions of other sequences/containers in the standard library.
>
> I was having my own doubts about that particular constructor. I followed your suggestion and removed it and added an iterator based constructor.

Well, you definitely need a way to construct a UUID from a string. I would expect that to be spelled

explicit uuid(char const *);

though, rather than

explicit uuid(uint8_t const *);

It's just a cosmetic difference (assuming your platform's uint8_t is a typedef for plain char and not unsigned char), but it would make it clearer what you expect to happen when this constructor is called.

>> 8. Random UUID generation algorithm should accept the random number generator. The proposal should not require the implementation to use some pre-defined, possibly global RNG. It should allow reusing the RNG to generate multiple UUIDs. I would also name the algorithm something like `generate_random_uuid`.

If you mean make_uuid(), I believe the problem there is that make_uuid() is specified to create a "version 1" UUID using the platform's own OS-dependent method of generating UUIDs. This is not going to be parameterizable with anything like a C++ random number engine. (Nor should it be.)

It would be possible to provide another factory for "version 4" RNG-based UUIDs, parameterized on the RNG engine.
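
Such a factory could be sketched as below. The name `generate_random_uuid` is the one suggested earlier in the thread; the `sketch::uuid` struct and the exact signature are assumptions for this example. The bit manipulation follows RFC 4122 section 4.4: the top four bits of octet 6 hold the version (4), and the top two bits of octet 8 hold the RFC variant (10).

```cpp
#include <array>
#include <cstdint>
#include <random>

namespace sketch {

// Hypothetical stand-in for the proposed class.
struct uuid {
    std::array<std::uint8_t, 16> bytes{};
};

// Version 4 (random) UUID built from a caller-supplied engine, so the same
// engine can be reused to generate many UUIDs.
template <class UniformRandomBitGenerator>
uuid generate_random_uuid(UniformRandomBitGenerator& rng) {
    // uniform_int_distribution is not specified for 8-bit types, so draw
    // unsigned ints in [0, 255] and narrow them.
    std::uniform_int_distribution<unsigned int> octet(0, 255);
    uuid id;
    for (auto& b : id.bytes) {
        b = static_cast<std::uint8_t>(octet(rng));
    }
    id.bytes[6] = static_cast<std::uint8_t>((id.bytes[6] & 0x0F) | 0x40);  // version 4
    id.bytes[8] = static_cast<std::uint8_t>((id.bytes[8] & 0x3F) | 0x80);  // RFC 4122 variant
    return id;
}

}  // namespace sketch
```

A caller would write something like `std::mt19937 rng(std::random_device{}()); auto id = sketch::generate_random_uuid(rng);`. Whether a general-purpose engine such as std::mt19937 is acceptable for UUID generation is a separate question that this sketch does not settle.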

Speaking of versions, why do you have

uuid_variant::future

uuid_version::none

I would naively expect to see "uuid_foo::unknown" for both of these. In particular, "future" seems like a bad name because it collides with "std::future" and because it implies something about historical time which might or might not be accurate by the time programmers start using this library. The name "unknown" would be accurate and easy to remember, if used consistently.

>> 9. I still think the proposal should include other means of generating UUIDs, such as generating UUID from a name. When generating from a name, the algorithm should also accept the hashing function to use.

Sadly, RFC 4122 defines "version" numbers 3 and 5 for the relatively obsolete MD5 and SHA-1 hashes respectively, and does not define any "version" numbers for present-day hashes such as SHA-256. I'd say it's very important to make sure that such UUID-generating functions can be written using the primitives this library provides; but it is not important to provide those UUID-generating functions out of the box as part of this library.

>> 10. I would like the proposal to explicitly mention that the alignment of the `uuid` type is allowed and encouraged to be higher than 1, if it allows for a faster implementation (read: using vector instructions). Also, the proposal should make clear that the internal representation may not be a straightforward array of bytes and may have arbitrary endianness. This means that iterators should not be defined as pointers.
>
> I need a bit more time to think about those ones.

Iterators should not be provided at all, because a UUID is not a container or sequence of objects.

But yes, I would expect the natural alignment of `class uuid` on, say, a RISC machine, to be the same as the alignment of `__uint128_t` or `uint64_t`.

my $.02,

Arthur

Thiago Macieira

Dec 19, 2017, 5:53:32 PM

to std-pr...@isocpp.org

Why would I have such problems?

Please answer in a hypothetical 9-bit machine that has 8-bit network and
filesystem capabilities.

Andrey Semashev

Dec 19, 2017, 5:58:57 PM

to std-pr...@isocpp.org

On 12/20/17 01:05, Arthur O'Dwyer wrote:
> On Tuesday, December 19, 2017 at 2:56:19 AM UTC-8, Marius Bancila wrote:
>
>
> is there a way to use it without dynamic memory allocation?
>
>
> What sort of dynamic memory allocation? Apologies, but not sure I
> get the question.
>
>
> The then-current version had a hard dependency on std::string, which
> you've now fixed, except that I'm still not sure there's any way to
> extract the "string representation" of a std::uuid object without using
> std::string. One not-really-solution — which is probably a good idea
> anyway — is to provide a stream operator
>
> ostream& operator<<(ostream&, const uuid&);
>
> that streams the result directly into the ostream rather than needing
> dynamic memory allocation to convert it to std::string first. Of course,
> people who want to avoid memory allocation often want to avoid iostreams
> as well; that's why I say this is a not-really-solution. I have more
> comments on I/O below.

If you want to have a malloc-free interface then iostreams are not the
way to go. You would need a method that would produce the string
representation of a UUID in an externally provided buffer.

char buf[100];
uuid.to_string(buf, sizeof(buf));

Unfortunately, such an interface is not convenient when you can malloc
and just want to get the string in-line:

throw std::runtime_error("Transaction " + uuid.to_string() + " failed");

> Shouldn't you be using std::byte instead of std::uint8_t?
>
> My thoughts were that I want to make it simply constructible from an
> array of unsigned chars, without having to explicitly convert every
> single one to std::byte. But I see there is a consensus std::byte
> should be the one to go with, so I will do that.
>
> I don't think there's consensus. std::byte is for bytes, not for an
> array that really boils down to a single __uint128_t anyway. The obvious
> way to represent a UUID in memory is as a __uint128_t. The second most
> obvious way is as a pair of uint64_ts, or a set-of-16 uint8_ts. I don't
> see why std::byte should be relevant at all.

Not std::byte itself, but std::uuid should be constructible from an
array of 16 bytes, whose contents are defined in the RFC:

https://tools.ietf.org/html/rfc4122#section-4.1.2

> Also, you should remove the begin() and end() methods from class uuid.
> Iterators are used to enable algorithms over sequences, and a UUID is
> not a sequence. Now, it /can be serialized into/ a sequence of 16 bytes,
> or 2 uint64_ts, or ~35 characters, but it is not /itself/ a sequence of
> anything. It's just an object.

Iterators offer the natural way of serialization:

std::uint8_t bytes[16];
std::copy(uuid.begin(), uuid.end(), bytes);

If iterators are not provided then std::uuid should provide a dedicated
method for exporting its contents. Which might not be a bad idea.

> The massive overload set of std::to_string() was a mistake in
> standardization that should never have happened.
> What should exist, and does exist in every codebase AFAICT, is
>
> template<class T>
> auto to_string(T&& t) {
>     std::ostringstream oss;
>     oss << std::forward<T>(t);
>     return oss.str();
> }
>
> Adding overloads to deal with primitive types as special cases is
> /possible/ but definitely not /necessary/.

This is essentially what `boost::lexical_cast< std::string >(t)` is.
One of the benefits of `std::to_string` is that it _doesn't_ use
iostreams. The above might be good as the default fallback
implementation, but people (and compiler writers) should be highly
encouraged to provide more efficient implementations for specific types.

> I think, parsing should be a separate algorithm rather than a
> constructor. I think, the current convention is std::from_chars.
> http://en.cppreference.com/w/cpp/utility/from_chars
>
> std::from_chars / std::to_chars should not be used as a model of how to
> do things. They're very low-level and difficult to use correctly.
> Marius's intuition is correct here.

The interface looks like what no-exceptions people would want to use.
The throwing alternative could look like this:

uuid uuid_from_string(std::string_view str);

But in any case, such conversion should not be the std::uuid's constructor.
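
To make the two shapes concrete, here is a rough sketch of a non-throwing parser with a throwing wrapper layered on top. It is only an illustration: `uuid_bytes`, `hex_value` and `try_uuid_from_string` are hypothetical names, the result is a plain byte array rather than the proposed uuid class, and only the canonical 36-character form is accepted.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string_view>

using uuid_bytes = std::array<std::uint8_t, 16>;  // hypothetical stand-in for uuid

// Returns 0..15 for a hex digit, or -1 for anything else.
inline int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Non-throwing form: an empty optional signals a malformed string.
inline std::optional<uuid_bytes> try_uuid_from_string(std::string_view str) noexcept
{
    if (str.size() != 36) return std::nullopt;
    uuid_bytes out{};
    std::size_t pos = 0;
    for (std::size_t i = 0; i < out.size(); ++i) {
        if (i == 4 || i == 6 || i == 8 || i == 10) {
            if (str[pos] != '-') return std::nullopt;
            ++pos;
        }
        const int hi = hex_value(str[pos]);
        const int lo = hex_value(str[pos + 1]);
        if (hi < 0 || lo < 0) return std::nullopt;
        out[i] = static_cast<std::uint8_t>(hi * 16 + lo);
        pos += 2;
    }
    return out;
}

// Throwing form, with the signature shape suggested above.
inline uuid_bytes uuid_from_string(std::string_view str)
{
    if (auto parsed = try_uuid_from_string(str)) return *parsed;
    throw std::invalid_argument("malformed UUID string");
}
```

Either function stays outside the class, so the uuid type itself needs no string-aware constructor.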

> This constructor:
> explicit uuid(uint8_t const * const bytes);
> is potentially unsafe. Yes, we know `bytes` should point to 16
> bytes, but this interface makes it not obvious and not
> enforceable. I would rather replace this constructor and
> std::array-based one with this:
> template< typename ForwardIterator >
> explicit uuid(ForwardIterator bytes_begin, ForwardIterator
> bytes_end);
> This also follows the conventions of other sequences/containers
> in the standard library.
>
>
> I was having my own doubts about that particular constructor. I
> followed your suggestion and removed it and added an iterator based
> constructor.
>
>
> Well, you definitely need a way to construct a UUID from a string. I
> would expect that to be spelled
>
> explicit uuid(char const *);
>
> though, rather than
>
> explicit uuid(uint8_t const *);
>
> It's just a cosmetic difference (assuming your platform's uint8_t is a
> typedef for plain char and not unsigned char), but it would make it
> clearer what you expect to happen when this constructor is called.

I believe this was not a constructor from a string representation of a
UUID but from the binary representation (i.e. the pointer points to 16
bytes that are the UUID's value).
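
For the binary case, the iterator-pair constructor discussed above might look roughly like the sketch below. The class name `uuid_sketch` is hypothetical, and throwing on a wrong length is an assumption made for illustration; the proposal could equally make the length a precondition.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <iterator>
#include <stdexcept>

// Hypothetical minimal uuid; not the class from the proposal.
class uuid_sketch
{
public:
    uuid_sketch() = default;  // all-zero (nil) UUID

    // Binary representation: the range must hold exactly 16 octets, which
    // makes the size requirement visible and checkable at the call site.
    template <typename ForwardIterator>
    explicit uuid_sketch(ForwardIterator first, ForwardIterator last)
    {
        if (std::distance(first, last) != 16)
            throw std::length_error("a UUID needs exactly 16 bytes");
        std::copy(first, last, data_.begin());
    }

    const std::array<std::uint8_t, 16>& bytes() const noexcept { return data_; }

private:
    std::array<std::uint8_t, 16> data_{};
};
```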

> 8. Random UUID generation algorithm should accept the random
> number generator. The proposal should not require the
> implementation to use some pre-defined, possibly global RNG. It
> should allow reusing the RNG to generate multiple UUIDs. I would
> also name the algorithm something like `generate_random_uuid`.
>
>
> If you mean make_uuid(), I believe the problem there is that make_uuid()
> is specified to create a "version 1" UUID using the platform's own
> OS-dependent method of generating UUIDs. This is not going to be
> parameterizable with anything like a C++ random number engine. (Nor
> should it be.)

The proposal wasn't clear about it, and I'm not very familiar with the
system APIs mentioned there, but my understanding is that those APIs
generate RNG-based UUIDs, i.e. they should be version 4. I think,
nowadays version 4 is the only version being widely generated, possibly
with the exception of version 5, the name-based UUIDs, when one needs
stable name-to-UUID mapping.

In any case, the standard cannot assume the underlying OS has the means
to generate UUIDs directly. At most, it can assume some (P)RNG like
random_device or Mersenne twister, which is stateful. Requiring
make_uuid() to be self-contained on such systems basically means making
that (P)RNG global or performing expensive initialization of the RNG on
each call. In fact, that's what happens in uuid_generate that was
referenced in the proposal, which does stat() and IO on /dev/urandom or
/dev/random (although the actual IO may be avoided if getrandom() is
present).
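
A sketch of what passing the RNG in could look like follows. It assumes a plain 16-octet array in place of the proposed class (`uuid_bytes` is a hypothetical alias) and borrows the name `generate_random_uuid` from the suggestion quoted above. The function itself holds no state; the caller decides how and when the engine is seeded and can reuse it across calls.

```cpp
#include <array>
#include <cstdint>
#include <random>

using uuid_bytes = std::array<std::uint8_t, 16>;  // hypothetical stand-in for uuid

// Stateless algorithm: everything that persists between calls lives in `engine`.
template <typename Engine>
uuid_bytes generate_random_uuid(Engine& engine)
{
    // uniform_int_distribution is not specified for 8-bit types, so draw
    // unsigned values in [0, 255] and narrow them.
    std::uniform_int_distribution<unsigned> octet(0, 255);
    uuid_bytes id{};
    for (auto& b : id)
        b = static_cast<std::uint8_t>(octet(engine));
    id[6] = static_cast<std::uint8_t>((id[6] & 0x0F) | 0x40);  // version 4
    id[8] = static_cast<std::uint8_t>((id[8] & 0x3F) | 0x80);  // RFC 4122 variant
    return id;
}
```

A caller would typically seed one engine (for example a std::mt19937 seeded from std::random_device) and hand it to every call, so the expensive initialization happens once.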

> Speaking of versions, why do you have
> uuid_variant::future
> uuid_version::none
> I would naively expect to see "uuid_foo::unknown" for both of these. In
> particular, "future" seems like a bad name because it collides with
> "std::future" and because it implies something about historical time
> which might or might not be accurate by the time programmers start using
> this library. The name "unknown" would be accurate and easy to
> remember, if used consistently.

+1
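
For reference, a sketch of how the two fields can be decoded with `unknown` as the catch-all in both enumerations. The enumerator names other than `unknown` are illustrative and not taken from the proposal; the bit positions follow RFC 4122, sections 4.1.1 and 4.1.3.

```cpp
#include <array>
#include <cstdint>

using uuid_bytes = std::array<std::uint8_t, 16>;  // hypothetical stand-in for uuid

// Illustrative names; `unknown` follows the suggestion quoted above.
enum class uuid_variant { ncs, rfc, microsoft, unknown };
enum class uuid_version { unknown = 0, time_based = 1, dce_security = 2,
                          name_based_md5 = 3, random_number_based = 4,
                          name_based_sha1 = 5 };

// The variant is encoded in the top bits of octet 8.
inline uuid_variant variant_of(const uuid_bytes& id) noexcept
{
    const std::uint8_t b = id[8];
    if ((b & 0x80) == 0x00) return uuid_variant::ncs;        // 0xxxxxxx
    if ((b & 0xC0) == 0x80) return uuid_variant::rfc;        // 10xxxxxx
    if ((b & 0xE0) == 0xC0) return uuid_variant::microsoft;  // 110xxxxx
    return uuid_variant::unknown;                            // 111xxxxx, reserved
}

// The version is the high nibble of octet 6.
inline uuid_version version_of(const uuid_bytes& id) noexcept
{
    const unsigned v = id[6] >> 4;
    return (v >= 1 && v <= 5) ? static_cast<uuid_version>(v)
                              : uuid_version::unknown;
}
```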

> 9. I still think the proposal should include other means of
> generating UUIDs, such as generating UUID from a name. When
> generating from a name, the algorithm should also accept the
> hashing function to use.
>
> Sadly, RFC 4122 defines "version" numbers 3 and 5 for the relatively
> obsolete MD5 and SHA-1 hashes respectively, and does not define any
> "version" numbers for present-day hashes such as SHA-256.

This is why I mentioned that the algorithm should accept a hashing
function to use.

> I'd say it's
> very important to make sure that such UUID-generating functions /can be
> written/ using the primitives this library provides; but it is not
> important to provide those UUID-generating functions out of the box as
> /part/ of this library.

That's certainly a possible approach, but I would still prefer if
std::uuid offered a more complete support for UUIDs and not just a
wrapper around `std::array<std::uint8_t, 16>`.
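
One possible shape for a name-based generator that accepts the hashing function is sketched below. Everything here is an assumption for illustration: `generate_name_uuid` and `uuid_bytes` are hypothetical names, the hash is any callable returning a container of at least 16 octets, and the caller supplies the version number to stamp (3 and 5 are the only name-based versions RFC 4122 defines). No real hash is implemented.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

using uuid_bytes = std::array<std::uint8_t, 16>;  // hypothetical stand-in for uuid

// `hash` takes the bytes to digest (namespace UUID followed by the name)
// and returns a digest of at least 16 octets, e.g. a 20-octet SHA-1 result.
template <typename Hash>
uuid_bytes generate_name_uuid(const uuid_bytes& namespace_id,
                              std::string_view name,
                              Hash&& hash,
                              unsigned version)
{
    std::vector<std::uint8_t> input(namespace_id.begin(), namespace_id.end());
    for (char c : name)
        input.push_back(static_cast<std::uint8_t>(c));

    const auto digest = hash(input);
    uuid_bytes id{};
    std::copy_n(digest.begin(), id.size(), id.begin());  // truncate to 128 bits

    id[6] = static_cast<std::uint8_t>((id[6] & 0x0F) | (version << 4));
    id[8] = static_cast<std::uint8_t>((id[8] & 0x3F) | 0x80);  // RFC 4122 variant
    return id;
}
```

With this shape the library would only need to ship the hashes it chooses to support, while users could plug in others at their own risk regarding interoperability.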

Andrey Semashev

Dec 19, 2017, 6:18:50 PM

to std-pr...@isocpp.org

On 12/20/17 01:53, Thiago Macieira wrote:
> On terça-feira, 19 de dezembro de 2017 13:13:24 PST Andrey Semashev wrote:
>> On 12/19/17 20:40, Thiago Macieira wrote:
>>> On terça-feira, 19 de dezembro de 2017 04:38:06 PST Andrey Semashev wrote:
>>>> On 12/18/17 21:07, Nicol Bolas wrote:
>>>>> On Monday, December 18, 2017 at 11:26:21 AM UTC-5, Nicolas Lesser wrote:
>>>>> Shouldn't you be using std::byte instead of std::uint8_t?
>>>>>
>>>>> Especially since `uint8_t` is not required to be implemented.
>>>>
>>>> UUIDs are defined as 128-bit values. The RFC also defines contents of
>>>> specific bits of the value. So, it might make sense to precondition UUIDs
>>>> on the support for `std::uint8_t`.
>>>
>>> Why can't you store 128 bits in a type that is 144 bits long?
>>
>> You can, but you will have problems importing/exporting the value as
>> exactly 128 bits with no padding because that is the format defined by
>> the RFC.
>
> Why would I have such problems?
>
> Please answer in a hypothetical 9-bit machine that has 8-bit network and
> filesystem capabilities.

How exactly such a machine operates in 8-bit environment? How does one
represent a 128-bit message on such a machine? I can imagine multiple
possibilities, and none of them looks suitable to me, unless I'm missing
something.

For example, we could use 8 least significant bits in each 9-bit byte to
store the corresponding 8 bits of the UUID. Thus we have 1-bit padding
in every byte. There are two problems with this:

- What to do with the padding, especially when importing the UUID into
std::uuid? The padding bits must not be part of the UUID value, so the
bits must be either masked or the conversion should fail. The latter I
find unacceptable, because all the normal world expects this operation
to never fail and be noexcept.

- How to transmit these 144 bits as a 128-bit message to another
machine? This means that the program that has exported the value into an
array of 9-bit bytes now has to compact the bits somehow and then send
only the 128 significant bits to network or something. I have no idea
how that would happen. In any case, that kind of conversion is specific
to that 9-bit machine and is not needed on 8-bit machines.

Or, we could always use the compacted representation, but then the 9-bit
machine would only use 15-byte buffer with some bit padding. This makes
the surrounding code different (e.g. buffer sizes may be smaller) and
possibly incompatible with 8-bit machines. Then there is that magic to
send exactly 128 bits over the network.

With all that complexity and potentially non-portable code, I would
rather limit UUIDs to machines with 8-bit bytes and make that interface
and implementation most efficient there.

Marius Bancila

Dec 20, 2017, 2:12:39 AM

to std-pr...@isocpp.org

We could approach the problem of generating uuids in a similar manner as done in boost::uuid. We could have a generator like this one that uses a random number generator (and a uniform distribution) to produce uuids:

template <typename UniformRandomNumberGenerator>
class uuid_random_generator
{
public:
    typedef uuid result_type;

    uuid_random_generator();
    explicit uuid_random_generator(UniformRandomNumberGenerator& gen);
    explicit uuid_random_generator(UniformRandomNumberGenerator* pGen);

    uuid operator()();
};

boost also has a name generator, but that requires SHA1 support.

I would propose a "default generator", one that may rely on the operating system's uuid support, as make_uuid() does in my current proposal.

class uuid_default_generator
{
public:
    typedef uuid result_type;

    uuid operator()();
};

Then, we could have two overloads for make_uuid() that would look like this:

uuid make_uuid()
{
    return uuid_default_generator{}();
}

template <typename Generator>
uuid make_uuid(Generator & g)
{
    return g();
}

This would enable users to write their own generators, such as one that is based on name hashing, and use them with make_uuid.

That would allow us to write code like the following:

auto id1 = make_uuid();

uuid_default_generator dgen;
auto id2 = make_uuid(dgen);
auto id3 = make_uuid(dgen);

std::random_device rd;
std::mt19937 mtgen(rd());
uuid_random_generator<std::mt19937> rgen(mtgen);
auto id4 = make_uuid(rgen);
auto id5 = make_uuid(rgen);

What do you think of this?

Marius

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.

To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/50277626-6c7a-38db-ac37-463d0aac5caf%40gmail.com.

Thiago Macieira

Dec 20, 2017, 3:27:51 AM

to std-pr...@isocpp.org

On terça-feira, 19 de dezembro de 2017 15:18:46 PST Andrey Semashev wrote:
> How exactly such a machine operates in 8-bit environment? How does one
> represent a 128-bit message on such a machine? I can imagine multiple
> possibilities, and none of them looks suitable to me, unless I'm missing
> something.

You tell me. That's exactly the issue: there's nothing inherently wrong with
those machines and they do operate with an outside world 8-bit environment. So
why exactly should UUID in the standard library be forbidden?

> For example, we could use 8 least significant bits in each 9-bit byte to
> store the corresponding 8 bits of the UUID. Thus we have 1-bit padding
> in every byte. There are two problems with this:

I believe that's how they do strings.

> - What to do with the padding, especially when importing the UUID into
> std::uuid? The padding bits must not be part of the UUID value, so the
> bits must be either masked or the conversion should fail. The latter I
> find unacceptable, because all the normal world expects this operation
> to never fail and be noexcept.

If you consider the UUID as a char[16], I don't see why that would be a
problem at all.

> - How to transmit these 144 bits as a 128-bit message to another
> machine? This means that the program that has exported the value into an
> array of 9-bit bytes now has to compact the bits somehow and then send
> only the 128 significant bits to network or something. I have no idea
> how that would happen. In any case, that kind of conversion is specific
> to that 9-bit machine and is not needed on 8-bit machines.

That's a solved problem. Such machines already know how to talk to the outside
world. I asked you to assume such a solution exists (and it does for such
machines).

Usually, the C library of those machines has functions to convert the low 32
bits of a 36-bit multi-byte word into four octets, stored in consecutive
9-bit bytes. Similarly for 18-bit words into 2 bytes.

> With all that complexity and potentially non-portable code, I would
> rather limit UUIDs to machines with 8-bit bytes and make that interface
> and implementation most efficient there.

What are you talking about in non-portable code? The standard library for this
platform is designed for this platform by the compiler vendor. There's no
portability issue.

If you meant with the API, then there is only a portability issue if the API
creates one in the first place. So create an API that doesn't have portability
issues: simply replace uint8_t with unsigned char or std::byte and the problem
is solved. (C++ requires a byte to be at least 8 bits wide)

Andrey Semashev

Dec 20, 2017, 1:56:15 PM

to std-pr...@isocpp.org

On 12/20/17 10:12, Marius Bancila wrote:
> We could approach the problem of generating uuids in a similar manner as
> done in boost::uuid. We could have generator like this one that uses a
> random number generator (and a uniform distribution) to produce uuids:
>
>
> template <typename UniformRandomNumberGenerator>
> class uuid_random_generator
> {
> public:
> typedef uuid result_type;
>
> uuid_random_generator();
> explicit uuid_random_generator(UniformRandomNumberGenerator& gen);
> explicit uuid_random_generator(UniformRandomNumberGenerator* pGen);
>
> uuid operator()();
> };
>
>
> boost also has a name generator, but that requires SHA1 support.

I don't see the problem of implementing SHA1 in the standard library. It
would have value of its own if implemented as a generic component, but
in case of std::uuid it can be an internal part of the generator.

> I would propose a "default generator" one that may rely on the operating
> system uuid support, as make_uuid() does in my current proposal.
>
> class uuid_default_generator
> {
> public:
> typedef uuid result_type;
>
> uuid operator()();
> };

If you put this uuid_default_generator in the standard you effectively
*require* the OS to provide this kind of utility. My point is that this
requirement is unreasonable and possibly not every OS that currently
supports C++ qualifies.

I don't really see the point in having the generator as an object
either. This is one part of Boost.UUID design that I don't like. You
really want a function, which takes the RNG (in case of random
generator) or name and hash function state (in case of name generator)
as arguments. It's these arguments that may need to be persistent across
multiple calls to the generation algorithm; the algorithm itself has no
state.

Andrey Semashev

Dec 20, 2017, 2:19:34 PM

to std-pr...@isocpp.org

On 12/20/17 11:27, Thiago Macieira wrote:
> On terça-feira, 19 de dezembro de 2017 15:18:46 PST Andrey Semashev wrote:
>> How exactly such a machine operates in 8-bit environment? How does one
>> represent a 128-bit message on such a machine? I can imagine multiple
>> possibilities, and none of them looks suitable to me, unless I'm missing
>> something.
>
> You tell me.

Sorry, it works backwards. I have no idea. None of what I could imagine
makes sense, so I consider the idea of supporting non-8-bit machines
nonsensical. If you can suggest a sensible way to support those
machines, please do.

>> - What to do with the padding, especially when importing the UUID into
>> std::uuid? The padding bits must not be part of the UUID value, so the
>> bits must be either masked or the conversion should fail. The latter I
>> find unacceptable, because all the normal world expects this operation
>> to never fail and be noexcept.
>
> If you consider the UUID as a char[16], I don't see why that would be a
> problem at all.

That should be unsigned char[16]. The problem is that unsigned char[16]
does not represent a 128-bit value on a non-8-bit machine. For example,
two 144-bit "UUIDs" may not compare equal because of the padding bits.
You have to define the meaning and behavior with regard to the bits that
are not part of the UUID value. That, in turn, may penalize 8-bit machines.

>> - How to transmit these 144 bits as a 128-bit message to another
>> machine? This means that the program that has exported the value into an
>> array of 9-bit bytes now has to compact the bits somehow and then send
>> only the 128 significant bits to network or something. I have no idea
>> how that would happen. In any case, that kind of conversion is specific
>> to that 9-bit machine and is not needed on 8-bit machines.
>
> That's a solved problem. Such machines already know how to talk to the outside
> world. I asked you to assume such a solution exists (and it does for such
> machines).
>
> Usually, the C library of those machines has functions to convert the low 32
> bits of a 36-bit multi-byte word into four octets, stored in consecutive
> 9-bit bytes. Similarly for 18-bit words into 2 bytes.

Who calls these functions, and when? Does the C/C++ program need to call
these functions to convert the 9-bit bytes to 8-bit bytes before sending
a message to network? If the answer is yes then that code that does this
is not portable.

>> With all that complexity and potentially non-portable code, I would
>> rather limit UUIDs to machines with 8-bit bytes and make that interface
>> and implementation most efficient there.
>
> What are you talking about in non-portable code? The standard library for this
> platform is designed for this platform by the compiler vendor. There's no
> portability issue.

I was talking about a hypothetical implementation that used 128
consecutive bits to represent a UUID. So on a 9-bit machine one could write:

unsigned char buf[15]; // 135 bits
uuid.copy_to(buf); // writes 128 consecutive bits
network.send_bits(buf, 128 /* bits */);

Such code would not work on an 8-bit machine because unsigned char[15]
would not be enough to store a UUID.

> If you meant with the API, then there is only a portability issue if the API
> creates one in the first place. So create an API that doesn't have portability
> issues: simply replace uint8_t with unsigned char or std::byte and the problem
> is solved. (C++ requires a byte to be at least 8 bits wide)

It's not solved until you define the behavior on non-8-bit machines.

Nicol Bolas

Dec 20, 2017, 2:50:14 PM

to ISO C++ Standard - Future Proposals

On Wednesday, December 20, 2017 at 2:19:34 PM UTC-5, Andrey Semashev wrote:

On 12/20/17 11:27, Thiago Macieira wrote:
> On terça-feira, 19 de dezembro de 2017 15:18:46 PST Andrey Semashev wrote:
>> How exactly such a machine operates in 8-bit environment? How does one
>> represent a 128-bit message on such a machine? I can imagine multiple
>> possibilities, and none of them looks suitable to me, unless I'm missing
>> something.
>
> You tell me.

Sorry, it works backwards. I have no idea. None of what I could imagine
makes sense, so I consider the idea of supporting non-8-bit machines
nonsensical. If you can suggest a sensible way to support those
machines, please do.

And yet they do work. Your imagination is not required.

Let them work out how to handle their stuff.

>> - What to do with the padding, especially when importing the UUID into
>> std::uuid? The padding bits must not be part of the UUID value, so the
>> bits must be either masked or the conversion should fail. The latter I
>> find unacceptable, because all the normal world expects this operation
>> to never fail and be noexcept.
>
> If you consider the UUID as a char[16], I don't see why that would be a
> problem at all.

That should be unsigned char[16]. The problem is that unsigned char[16]
does not represent a 128-bit value on a non-8-bit machine. For example,
two 144-bit "UUIDs" may not compare equal because of the padding bits.
You have to define the meaning and behavior with regard to the bits that
are not part of the UUID value. That, in turn, may penalize 8-bit machines.

But there wouldn't be padding bits. If a byte is 9 bits, and you set the value 128 in a 9-bit byte, the top bit is zero. That's not padding; that's how you encode 128 in 9 bits.

So the same would be here. An `std::byte[16]` would contain 16 9-bit bytes. The high bit of each one would be zero, since each byte only contains values in the range [0, 255].

So two 144-bit "UUIDs" will compare equal if their sequence of 8-bit bytes are equal.

Andrey Semashev

Dec 20, 2017, 3:00:06 PM

to std-pr...@isocpp.org

On 12/20/17 22:50, Nicol Bolas wrote:
> On Wednesday, December 20, 2017 at 2:19:34 PM UTC-5, Andrey Semashev wrote:
>
> On 12/20/17 11:27, Thiago Macieira wrote:
> > On terça-feira, 19 de dezembro de 2017 15:18:46 PST Andrey
>

> >> - What to do with the padding, especially when importing the UUID into
> >> std::uuid? The padding bits must not be part of the UUID value, so the
> >> bits must be either masked or the conversion should fail. The latter I
> >> find unacceptable, because all the normal world expects this operation
> >> to never fail and be noexcept.
> >
> > If you consider the UUID as a char[16], I don't see why that would be a
> > problem at all.
>
> That should be unsigned char[16]. The problem is that unsigned char[16]
> does not represent a 128-bit value on a non-8-bit machine. For example,
> two 144-bit "UUIDs" may not compare equal because of the padding bits.
> You have to define the meaning and behavior with regard to the bits that
> are not part of the UUID value. That, in turn, may penalize 8-bit machines.
>
> But there wouldn't be padding bits. If a byte is 9 bits, and you set the
> value 128 in a 9-bit byte, the top bit is zero. That's not padding;
> that's how you encode 128 in 9 bits.

Yes, sorry, by "padding" I meant the additional significant bit.

> So the same would be here. An `std::byte[16]` would contain 16 9-bit
> bytes. The high bit of each one would be zero, since each byte only
> contains values in the range [0, 255].
>
> So two 144-bit "UUIDs" will compare equal if their sequence of 8-bit
> bytes are equal.

You will have to deal with the 9th bit when importing UUID from
`std::byte[16]` buffer. What if one of the bytes is 256, for example?
Should it be considered invalid input?

Nicol Bolas

Dec 20, 2017, 3:14:47 PM

to ISO C++ Standard - Future Proposals

Each byte of such a valid UUID buffer would be required to be on the range [0, 255]. That's all that needs to be said.

Andrey Semashev

Dec 20, 2017, 3:19:52 PM

to std-pr...@isocpp.org

I was asking what if that is not the case. UB?

Arthur O'Dwyer

Dec 20, 2017, 3:46:01 PM

to ISO C++ Standard - Future Proposals

Andrey Semashev also wrote:

> If you put this uuid_default_generator in the standard you effectively *require* the OS
> to provide this kind of utility. My point is that this requirement is unreasonable and
> possibly not every OS that currently supports C++ qualifies.

This is not a problem. The C++11 standard requires std::random_device to exist, and not every OS has /dev/urandom. (See https://reviews.llvm.org/D41316 for a very recent manifestation of this issue.) Heck, not every OS has a filesystem, and we just got std::filesystem into C++17!

The important thing is that the functionality be available under a shared (standard) name. If the functionality is a bit truncated on certain rare platforms, that's totally fine with me.

> I don't really see the point in having the generator as an object either. This is one part
> of Boost.UUID design that I don't like. You really want a function, which takes the RNG
> (in case of random generator) or name and hash function state (in case of name generator)
> as arguments. It's these arguments that may need to be persistent across multiple calls
> to the generation algorithm; the algorithm itself has no state.

I generally agree with Andrey here; but statelessness (or near-statelessness) is not a rock-solid argument against objectness. I will explain below that what we are looking for here is exactly a distribution object, as defined in C++11.

On Tue, Dec 19, 2017 at 11:12 PM, Marius Bancila <marius....@gmail.com> wrote:

We could approach the problem of generating uuids in a similar manner as done in boost::uuid. We could have generator like this one that uses a random number generator (and a uniform distribution) to produce uuids:

template <typename UniformRandomNumberGenerator>
class uuid_random_generator
{
public:
typedef uuid result_type;

uuid_random_generator();
explicit uuid_random_generator(UniformRandomNumberGenerator& gen);
explicit uuid_random_generator(UniformRandomNumberGenerator* pGen);

uuid operator()();
};

If there is one thing we have learned from std::independent_bits_engine, it's that ephemeral adaptor objects should not own their adaptees. (See libc++'s internal __independent_bits_engine, which is exactly the same as independent_bits_engine except that it is non-owning, because you need non-owningness in order to implement uniform_int_distribution efficiently.)

So what you want here is a non-owning object whose operator() takes a UniformRandomNumberGenerator& as an argument. And we have a name for that in C++11; it's a distribution.

class uniform_uuid_distribution
{
public:
    using result_type = uuid;

    uniform_uuid_distribution() = default;

    template<class G> requires(UniformRandomBitGenerator<G>)
    uuid operator()(G& g) const;
};
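
The declaration above uses concept syntax that was not yet standard at the time. A C++17 rendering of the same idea might look like the sketch below; it is an editorial illustration, not Arthur's code. The UniformRandomBitGenerator requirement is only documented rather than checked, the result type is a plain byte array (`uuid_bytes` is a hypothetical alias), and drawing four 32-bit words is just one way to fill the value.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <random>

using uuid_bytes = std::array<std::uint8_t, 16>;  // hypothetical stand-in for uuid

class uniform_uuid_distribution
{
public:
    using result_type = uuid_bytes;

    uniform_uuid_distribution() = default;

    // Non-owning: the generator is borrowed only for the duration of the call.
    // G is expected to model UniformRandomBitGenerator (not enforced here).
    template <class G>
    result_type operator()(G& g) const
    {
        std::uniform_int_distribution<std::uint32_t> word;  // full 32-bit range
        result_type id{};
        for (std::size_t i = 0; i < id.size(); i += 4) {
            const std::uint32_t w = word(g);
            id[i]     = static_cast<std::uint8_t>(w >> 24);
            id[i + 1] = static_cast<std::uint8_t>(w >> 16);
            id[i + 2] = static_cast<std::uint8_t>(w >> 8);
            id[i + 3] = static_cast<std::uint8_t>(w);
        }
        id[6] = static_cast<std::uint8_t>((id[6] & 0x0F) | 0x40);  // version 4
        id[8] = static_cast<std::uint8_t>((id[8] & 0x3F) | 0x80);  // RFC 4122 variant
        return id;
    }
};
```

Because the object carries no state, a temporary works as well as a named instance: `uniform_uuid_distribution()(engine)`.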

I would propose a "default generator" one that may rely on the operating system uuid support, as make_uuid() does in my current proposal.

class uuid_default_generator
{
public:
typedef uuid result_type;

uuid operator()();
};

This seems unnecessary, at least in my formulation.

Then, we could have two overloads for make_uuid() that would look like this:

uuid make_uuid()
{
return uuid_default_generator{}();
}

template <typename Generator>
uuid make_uuid(Generator & g)
{
return g();
}

I'd say like this:

uuid make_uuid()
{ ...OS-specific... };

template<class G> requires(UniformRandomBitGenerator<G>)
uuid make_uuid(G& g)
{ return uniform_uuid_distribution()(g); }

This would enable users to write their own generators, such as one that is based on name hashing, and use them with make_uuid.

That would allow us to write code like the following:

auto id1 = make_uuid();

uuid_default_generator dgen;
auto id2 = make_uuid(dgen);
auto id3 = make_uuid(dgen);

std::random_device rd;
std::mt19937 mtgen(rd());
uuid_random_generator<std::mt19937> rgen(mtgen);
auto id4 = make_uuid(rgen);
auto id5 = make_uuid(rgen);

If I have my own generator, say, rgen, why on earth would I want to write make_uuid(rgen) when I could just write rgen()? You should introduce new objects and classes only when they provide some useful functionality. There's no point in having a function make_uuid(g) that invariably returns g().

HTH,

Arthur

Arthur O'Dwyer

Dec 20, 2017, 3:46:38 PM

to ISO C++ Standard - Future Proposals

Knock it off, you three. Or at least take it to private email.


Thiago Macieira

Dec 20, 2017, 8:56:27 PM

to std-pr...@isocpp.org

On quarta-feira, 20 de dezembro de 2017 12:56:10 CST Andrey Semashev wrote:
> I don't really see the point in having the generator as an object
> either. This is one part of Boost.UUID design that I don't like. You
> really want a function, which takes the RNG (in case of random
> generator) or name and hash function state (in case of name generator)
> as arguments. It's these arguments that may need to be persistent across
> multiple calls to the generation algorithm; the algorithm itself has no
> state.

Actually, no. You want ONLY an MD5 or SHA-1 generator and no state. That's
what RFC 4122 says you should use.

Thiago Macieira

Dec 20, 2017, 9:00:09 PM

to std-pr...@isocpp.org

On quarta-feira, 20 de dezembro de 2017 14:19:48 CST Andrey Semashev wrote:
> > Each byte of such a valid UUID buffer would be required to be on the
> > range [0, 255]. That's all that needs to be said.
>
> I was asking what if that is not the case. UB?

Precondition violation. Apply GIGO.
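
A sketch of how such a precondition could be expressed without costing anything on 8-bit-byte platforms. The class name `octet_uuid` and the use of assert are assumptions for illustration; a standard specification would simply state the requirement and leave a violation undefined.

```cpp
#include <array>
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical uuid stored as unsigned char rather than std::uint8_t, so the
// type also exists on platforms where CHAR_BIT > 8.
class octet_uuid
{
public:
    // Precondition: every element of `octets` is in the range [0, 255].
    explicit octet_uuid(const unsigned char (&octets)[16]) noexcept
    {
        for (std::size_t i = 0; i < 16; ++i) {
#if CHAR_BIT > 8
            // Only wider-byte platforms can violate the precondition, so the
            // check is not even compiled where bytes are exactly 8 bits.
            assert(octets[i] <= 0xFF && "octet out of range");
#endif
            data_[i] = octets[i];
        }
    }

    friend bool operator==(const octet_uuid& a, const octet_uuid& b) noexcept
    {
        return a.data_ == b.data_;
    }

private:
    std::array<unsigned char, 16> data_{};
};
```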

Andrey Semashev

Dec 21, 2017, 3:08:34 AM

to std-pr...@isocpp.org

On 12/21/17 04:56, Thiago Macieira wrote:
> On quarta-feira, 20 de dezembro de 2017 12:56:10 CST Andrey Semashev wrote:
>> I don't really see the point in having the generator as an object
>> either. This is one part of Boost.UUID design that I don't like. You
>> really want a function, which takes the RNG (in case of random
>> generator) or name and hash function state (in case of name generator)
>> as arguments. It's these arguments that may need to be persistent across
>> multiple calls to the generation algorithm; the algorithm itself has no
>> state.
>
> Actually, no. you want ONLY an MD5 or SHA-1 generator and no state. That's
> what RFC 4122 says you should use.

There's reason to believe that a new RFC will come out with a
stronger hash function for name UUIDs. The UUID generation algorithm
should allow for that extension, even if done by the user.

Andrey Semashev

Dec 21, 2017, 3:11:10 AM

to std-pr...@isocpp.org

On 12/21/17 05:00, Thiago Macieira wrote:
> On quarta-feira, 20 de dezembro de 2017 14:19:48 CST Andrey Semashev wrote:
>>> Each byte of such a valid UUID buffer would be required to be on the
>>> range [0, 255]. That's all that needs to be said.
>>
>> I was asking what if that is not the case. UB?
>
> Precondition violation. Apply GIGO.

Ok, I guess that makes sense and doesn't penalise conventional platforms.

Thiago Macieira

Dec 21, 2017, 7:33:11 AM

to std-pr...@isocpp.org

Fair enough, but the library should make it easy to follow existing practice.

Making it so that the user has to jump through hoops to get the expected
behaviour only because some RFC in the future may add another hash function is
just bad design.

Marius Bancila

Jan 22, 2018, 4:10:44 AM

to std-pr...@isocpp.org

After winter vacation and further thinking I have made changes to the library proposal based on the feedback I got from here (although it's hard to satisfy all opinions).

Here is a list of changes:

- iterators are no longer defined as pointers; the proposal states that the internal representation may not be an array and may have arbitrary endianness.
- conversion functions string() and wstring() have been removed; non-member functions to_string() and to_wstring() have been added instead.
- make_uuid() functions have been removed; generator function objects have been added to create version 4 (random number-based) and 5 (name-based with SHA1 hashing) UUIDs.

Again, the link to the project is https://github.com/mariusbancila/stduuid, and to the proposal: https://github.com/mariusbancila/stduuid/blob/master/paper.md.

Thank you


j c

unread,

Jan 22, 2018, 7:47:56 AM1/22/18

to std-pr...@isocpp.org

The std::hash function involves the abomination that is std::stringstream. Surely there's a better way?

Since these UUIDs are supposed to be unique perhaps you could use part of the 16 byte array as a hash?


Nicolas Lesser

unread,

Jan 22, 2018, 11:39:07 AM1/22/18

to ISO C++ Standard - Future Proposals

- Why do you have two swap functions inside `std::uuid`? Isn't that a bit redundant, and/or can lead to an ambiguous call?

- There is no mention of string or wstring in the TS. Is this intended?

- What are uuid_iterator and uuid_const_iterator? They are accessible to everyone, because the default access specifier of a struct is public. You probably meant for them to be implementation-defined; in that case, what the standard normally does is something like:

typedef /*implementation-defined*/ iterator; // see [uuid]

using const_iterator = implementation-defined; // see [uuid]

- What do you think of marking every function constexpr? I don't know if this is going to be tricky to implement, but it seems like the standard library starts to go in the direction of making a lot of things constexpr.

conversion functions string() and wstring() have been removed; non-member functions to_string() and to_wstring() have been added instead

You still call string and wstring in the example under "String Conversion".

Andrey Semashev

unread,

Jan 22, 2018, 12:23:09 PM1/22/18

to std-pr...@isocpp.org

On 01/22/18 12:10, Marius Bancila wrote:
> After winter vacation and further thinking I have made changes to the
> library proposal based on the feedback I got from here (although it's
> hard to satisfy all opinions).
>
> Here is a list of changes:
>

> * iterators are no longer defined as pointers; the proposal states

> that the internal representation may not be an array and may have
> arbitrary endianness.

> * conversion functions /string()/ and /wstring()/ have been removed;
> non-member functions /to_string()/ and /to_wstring()/ have been
> added instead
> * /make_guid()/ functions have been removed; generator function

> objects have been added to create version 4 (random number-based)
> and 5 (name-based with SHA1 hashing) UUIDs.
>
> Again, the link to the project is
> https://github.com/mariusbancila/stduuid, and to the proposal:
> https://github.com/mariusbancila/stduuid/blob/master/paper.md.

Looks much better now. A few comments on the proposal:

- I still think that constructors from strings are a mistake, but you
seem to prefer them to separate functions. Fair enough.

- I believe, `swap` should be overloaded, not specialized.

- The section "Iterators constructors" lists two examples, neither of
which uses iterators to construct a UUID.

- The "Capacity" section should probably be named "Size" since you
describe the `size` method there. Also, `nil` has nothing to do with
size, and its description is better moved elsewhere.

- The example in the "String conversion" section still uses member
functions `string` and `wstring`. Also, I still don't see the point in
`to_wstring`.

- In the "Generating new uuids" section, why is `uuid_name_generator`
initialized with a UUID? How do you use that generator with another
hash function?

On the "Technical Specifications" section:

- Why is `uuid_variant::future` needed?

- `uuid::uuid_const_iterator` and `uuid::uuid_iterator` don't need to be
in the specification. The type of the iterators can simply be specified
as implementation-defined.

- I think `uuid_default_generator` still has to provide some guarantees
on the produced UUIDs. Having a generator that produces no one knows what
is hardly useful. I assume it is supposed to produce random UUIDs using
some system-specific source of randomness, so why not say that? Also, if
it is a random generator, its name should reflect that.

Arthur O'Dwyer

unread,

Jan 22, 2018, 12:55:04 PM1/22/18

to ISO C++ Standard - Future Proposals

On Mon, Jan 22, 2018 at 9:23 AM, Andrey Semashev <andrey....@gmail.com> wrote:

On 01/22/18 12:10, Marius Bancila wrote:

Again, the link to the project is https://github.com/mariusbancila/stduuid, and to the proposal: https://github.com/mariusbancila/stduuid/blob/master/paper.md.

- I believe, `swap` should be overloaded, not specialized.

Nicolas Lesser also observed that `swap` was funky.

Marius, here's what you currently have:

    // BAD
    namespace my {
        struct X {
            void swap(X&);  // member swap
            friend void swap(X&, X&);  // Barton-Nackman trick provides ADL swap
        };
    }

    namespace std {
        template<> void swap(my::X&, my::X&);  // bogus "specialization" of a function template not in your namespace
    }

And here is what you should have:

    // GOOD
    namespace my {
        struct X {
            void swap(X&);  // member swap
        };

        inline void swap(X&, X&);  // ADL swap
    }

Your specialization for std::hash is correct; std::hash is a customization point that operates by template specialization.

But std::swap is not a customization point. The way you provide `swap` for your type is via ADL, not via template specialization of std::swap.

(std::swap actually has tons of overloads, many of which are templates; providing a "template specialization" for an overloaded function is bad times.)

It would be acceptable to continue using the Barton-Nackman trick, too. I.e., this is also acceptable:

    // GOOD
    namespace my {
        struct X {
            void swap(X&);  // member swap
            friend void swap(X&, X&);  // Barton-Nackman trick provides ADL swap
        };
    }

But standard library types do not use the Barton-Nackman trick, and so I recommend that you don't use it either, in things targeting the standard library.
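To make the call side of the ADL approach concrete, here is a minimal sketch. The type `my::X` and the helper `exchange_values` are hypothetical names used only for illustration; they are not part of the proposal. The point is that generic code writes `using std::swap;` followed by an unqualified `swap(a, b)`, which picks up the overload in the type's own namespace.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <utility>

namespace my {

// Hypothetical stand-in for the proposed uuid class: 16 bytes of state.
struct X {
    std::array<std::uint8_t, 16> bytes{};

    // Member swap: exchanges the underlying bytes.
    void swap(X& other) noexcept { bytes.swap(other.bytes); }
};

// Non-member overload in the type's own namespace; found by ADL.
inline void swap(X& a, X& b) noexcept { a.swap(b); }

}  // namespace my

// Generic code enables ADL instead of calling std::swap directly.
template <class T>
void exchange_values(T& a, T& b) {
    using std::swap;  // fallback for types without their own swap
    swap(a, b);       // unqualified call: selects my::swap for my::X
}
```

With this arrangement nothing has to be added to namespace std, and types without their own `swap` still fall back to `std::swap`.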

Speaking of std::hash, it might be preferable to write

using result_type = std::hash<std::string>::result_type;

and eliminate the static_cast in operator()().

- I think `uuid_default_generator` still has to provide some guarantees on the produced UUIDs. Having a generator that produces no one knows what is hardly useful. I assume it is supposed to produce random UUIDs using some system-specific source of randomness, so why not say that? Also, if it is a random generator, its name should reflect that.

Yes. Please never ever name anything "default_"-anything. There are two conflicting meanings for "default_" in the C++ standard:

(1) "Default" as in "the best one to use." There will be a reason that it is the best one. State that reason. For example, "my::mt19937" is a good name for a default random number engine, and "my::sha256" is a good name for the default hash.

(2) "Default" as in "implementation-defined." Such a typedef will never be used by anyone, because its behavior will not be portable. Prefer to eliminate such useless cruft from the wording completely.

HTH,

Arthur

j c

unread,

Jan 22, 2018, 2:31:15 PM1/22/18

to std-pr...@isocpp.org

the signature of std::hash may be correct but the implementation leaves a lot to be desired. There's too much overhead in converting the uuid to a string and then hashing the string.


Arthur O'Dwyer

unread,

Jan 22, 2018, 2:48:42 PM1/22/18

to ISO C++ Standard - Future Proposals

On Mon, Jan 22, 2018 at 11:31 AM, j c <james.a...@gmail.com> wrote:

the signature of std::hash may be correct but the implementation leaves a lot to be desired. There's too much overhead in converting the uuid to a string and then hashing the string.

FWIW, I agree that the std::hash implementation should be improved by vendors if possible. However:

(1) The semantics seem correct to me. I think it is at least slightly valuable for std::hash<>(uu) and std::hash<>(uu.to_string()) to have the same integer value. Or, equivalently, for std::hash<>(s) and std::hash<>(uuid(s)) to have the same integer value.

(2) The details of the implementation (e.g. whether it allocates memory from the free store) can be left up to the vendor, under the as-if rule.

Should there be a "to_string" member function overload that takes a buffer-of-char and writes the result into it? That would eliminate free-store allocation, since the maximum length of a UUID is known ahead of time (right?). But the exact signature of such a member function might be hard to agree on, in the absence of a standardized writeable string_view class. My suggestion would be the classic `size_t to_string(char *, size_t)`, just to rub it in the faces of the "string_view is implicitly const" crowd. :P

That is:

    template<>
    struct hash<uuids::uuid> {
        using argument_type = uuids::uuid;
        using result_type = typename std::hash<std::string_view>::result_type;

        result_type operator()(argument_type const& uuid) const {
            char buffer[100]; // should be enough for anybody
            auto length = uuid.to_string(buffer, sizeof buffer);
            return std::hash<std::string_view>()(std::string_view(buffer, length));
        }
    };
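For reference, a buffer-writing `to_string` member of the kind the hash above relies on could look roughly like the following. This is only a sketch: the `uuid` struct here is a hypothetical stand-in that stores its 16 bytes in RFC 4122 order, and returning 0 for a too-small buffer is an assumption, not something the proposal specifies.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for the proposed class; bytes are in RFC 4122 order.
struct uuid {
    std::array<std::uint8_t, 16> bytes{};

    // Writes the 36-character canonical form (8-4-4-4-12, lowercase hex)
    // into buf and returns the number of characters written, or 0 if the
    // buffer is too small. No terminating NUL is written.
    std::size_t to_string(char* buf, std::size_t size) const noexcept {
        constexpr std::size_t length = 36;
        if (size < length) return 0;
        constexpr char digits[] = "0123456789abcdef";
        std::size_t out = 0;
        for (std::size_t i = 0; i < bytes.size(); ++i) {
            // Hyphens precede bytes 4, 6, 8 and 10.
            if (i == 4 || i == 6 || i == 8 || i == 10) buf[out++] = '-';
            buf[out++] = digits[bytes[i] >> 4];
            buf[out++] = digits[bytes[i] & 0x0F];
        }
        return out;
    }
};
```

Since the canonical text form is always 36 characters, a 36-byte stack buffer is enough and no free-store allocation is involved.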

–Arthur

j c

unread,

Jan 22, 2018, 3:26:08 PM1/22/18

to std-pr...@isocpp.org

On Monday, January 22, 2018, Arthur O'Dwyer <arthur....@gmail.com> wrote:

On Mon, Jan 22, 2018 at 11:31 AM, j c <james.a...@gmail.com> wrote:
the signature of std::hash may be correct but the implementation leaves a lot to be desired. There's too much overhead in converting the uuid to a string and then hashing the string.

FWIW, I agree that the std::hash implementation should be improved by vendors if possible. However:
(1) The semantics seem correct to me. I think it is at least slightly valuable for std::hash<>(uu) and std::hash<>(uu.to_string()) to have the same integer value. Or, equivalently, for std::hash<>(s) and std::hash<>(uuid(s)) to have the same integer value.

it's not the same though. A raw uuid is an array of 16 bytes. A string representation of this is those 16 bytes converted to human readable characters and then having hyphens inserted where appropriate.

If you want these to hash to the same value then I'd argue that there should be a std::uuid::from_string function.

Arthur O'Dwyer

unread,

Jan 22, 2018, 3:54:27 PM1/22/18

to ISO C++ Standard - Future Proposals

Marius does provide such a function.

https://github.com/mariusbancila/stduuid/blob/master/include/uuid.h#L604-L619

However, you make a good point there: suppose we have

    std::string string_with_dashes = "47183823-2574-4bfd-b411-99ed177d3e43";
    std::string string_without_dashes = "4718382325744bfdb41199ed177d3e43";

    stdext::uuid uuid_with_dashes(string_with_dashes);
    stdext::uuid uuid_without_dashes(string_without_dashes);

    std::hash<std::string> string_hasher;
    std::hash<stdext::uuid> uuid_hasher;

    const bool string_hashes_arent_equal = (string_hasher(string_with_dashes) != string_hasher(string_without_dashes));

    const bool uuid_hash_doesnt_behave_like_string_hash = not (
        (uuid_hasher(uuid_with_dashes) == string_hasher(string_with_dashes)) &&
        (uuid_hasher(uuid_without_dashes) == string_hasher(string_without_dashes))
    );

Then we expect to have

    assert(string_with_dashes != string_without_dashes); // we MUST have this
    assert(uuid_with_dashes == uuid_without_dashes); // we MUST have this
    expect(string_hashes_arent_equal); // we SHOULD have this
    assert(uuid_hasher(uuid_with_dashes) == uuid_hasher(uuid_without_dashes)); // we MUST have this (since the two hasher arguments are equal)

    if (string_hashes_arent_equal && uuid_hasher(uuid_with_dashes) == string_hasher(string_with_dashes)) {
        assert(uuid_hasher(uuid_without_dashes) == string_hasher(string_with_dashes)); // we MUST have this (by transitivity of equality on size_t)
        assert(uuid_hasher(uuid_without_dashes) != string_hasher(string_without_dashes)); // if the string hashes aren't equal, we MUST have this
        assert(uuid_hash_doesnt_behave_like_string_hash);
    }

So okay, I agree, making hash<uuid> and hash<string> behave "the same" is just asking for trouble. hash<uuid> should be reimplemented in terms of hashing the underlying bytes. It could probably just XOR them all together in size_t-sized chunks. And the standard should make it implementation-defined, i.e., let the vendor figure it out.
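As an illustration of the "XOR them all together in size_t-sized chunks" idea, a byte-based hash might be sketched like this. The `stdext::uuid` struct below is a hypothetical stand-in holding a plain 16-byte array; a real vendor implementation would be free to choose a different mixing function.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>

namespace stdext {
// Hypothetical stand-in for the proposed class.
struct uuid {
    std::array<std::uint8_t, 16> bytes{};
    friend bool operator==(const uuid& a, const uuid& b) { return a.bytes == b.bytes; }
};
}  // namespace stdext

namespace std {
template <>
struct hash<stdext::uuid> {
    std::size_t operator()(const stdext::uuid& id) const noexcept {
        static_assert(16 % sizeof(std::size_t) == 0, "chunks must tile the 16 bytes");
        std::size_t result = 0;
        for (std::size_t offset = 0; offset < 16; offset += sizeof(std::size_t)) {
            std::size_t chunk;
            // memcpy avoids alignment and aliasing problems.
            std::memcpy(&chunk, id.bytes.data() + offset, sizeof chunk);
            result ^= chunk;
        }
        return result;
    }
};
}  // namespace std
```

This needs no string conversion and no allocation. Plain XOR is a weak mixer (a UUID whose chunks are all identical hashes to zero when the chunk count is even), which is probably acceptable for random UUIDs but is one reason to leave the exact function to the vendor.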

–Arthur

Andrey Semashev

unread,

Jan 22, 2018, 8:30:20 PM1/22/18

to std-pr...@isocpp.org

On 01/22/18 22:48, Arthur O'Dwyer wrote:
> On Mon, Jan 22, 2018 at 11:31 AM, j c <james.a...@gmail.com
> <mailto:james.a...@gmail.com>> wrote:
>
> the signature of std::hash may be correct but the implementation
> leaves a lot to be desired. There's too much overhead in converting
> the uuid to a string and then hashing the string.
>
>
> FWIW, I agree that the std::hash implementation should be improved by
> vendors if possible. However:
> (1) The semantics seem correct to me. I think it is at least slightly
> valuable for std::hash<>(uu) and std::hash<>(uu.to_string()) to have the
> same integer value. Or, equivalently, for std::hash<>(s) and
> std::hash<>(uuid(s)) to have the same integer value.
> (2) The details of the implementation (e.g. whether it allocates memory
> from the free store) can be left up to the vendor, under the as-if rule.
>
> Should there be a "to_string" member function overload that takes a
> buffer-of-char and writes the result into it?

I don't think std::hash<std::uuid> should be string-based at all, but
we're discussing the proposal, and it doesn't need to suggest a
particular implementation of the specialization. In fact, I think it
should *not* suggest a particular implementation.

Edward Catmur

unread,

Jan 24, 2018, 4:02:18 PM1/24/18

to ISO C++ Standard - Future Proposals

Would it be acceptable to make it constexpr, noexcept or (preferably) both? That would preclude the slow, allocating stringstream approach without mandating any particular implementation.

Andrey Semashev

unread,

Jan 24, 2018, 5:23:37 PM1/24/18

to ISO C++ Standard - Future Proposals

I think, constexpr and noexcept would be nice regardless. They don't
preclude a string-based implementation though, as one could generate a
string in a local buffer. Not that I see why one would implement it
that way.

Arthur O'Dwyer

unread,

Jan 24, 2018, 7:12:15 PM1/24/18

to ISO C++ Standard - Future Proposals

That's exactly what I meant by the suggestion

> > Should there be a "to_string" member function overload that takes a
> > buffer-of-char and writes the result into it?

There probably should be such a member function.

Either that, or

(A) uuid::operator<< should be specified not to use heap-allocation, and

(B) the standard should provide an allocator-aware version of std::ostringstream (P0407),

which would let people stringify a UUID to a local char buffer by jumping through only O(1) hoops.

–Arthur

Marius Bancila

unread,

Jan 31, 2018, 2:45:13 AM1/31/18

to std-pr...@isocpp.org

I got some good comments after the last update. I'll try to summarize the main points and answer here:

The section "Iterators constructors" lists two examples, neither of which uses iterators to construct UUID.

That was an error after some edits. It's fixed.

The "Capacity" section should probably be named "Size" since you describe the `size` method there. Also, `nil` has nothing to do with size, and its description is better moved elsewhere.

Done. I made two sections, one called Size and one called Nil.

I believe, `swap` should be overloaded, not specialized.

Done. I removed the specialization of swap and provided an overload.

Why is `uuid_variant::future` needed?

It's actually called uuid_variant::reserved. It's not really needed, but it is specified by RFC 4122, so at least in theory it may occur (in the future), and therefore the enum should have, in my opinion, an enumerator for it.

`uuid::uuid_const_iterator` and `uuid::uuid_iterator` don't need to be in the specification. The type of the iterators can simply be specified as implementation-defined.

typedef /*implementation-defined*/ iterator; // see [uuid]
using const_iterator = implementation-defined; // see [uuid]

That was my intention in the first place, but I had written it in a form that looked more like a specification. I changed it to what was suggested.

I don't think std::hash<std::uuid> should be string-based at all, but we're discussing the proposal, and it doesn't need to suggest a particular implementation of the specialization. In fact, I think it should *not* suggest a particular implementation.

I am not suggesting any implementation. Yes, I did a string-based implementation, but that doesn't mean it has to be so.

I think `uuid_default_generator` still has to provide some guarantees on the produced UUIDs. Having a generator that produces no one knows what is hardly useful. I assume it is supposed to produce random UUIDs using some system-specific source of randomness, so why not say that? Also, if it is a random generator, its name should reflect that.

Yes. Please never ever name anything "default_"-anything. There are two conflicting meanings for "default_" in the C++ standard:
(1) "Default" as in "the best one to use." There will be a reason that it is the best one. State that reason. For example, "my::mt19937" is a good name for a default random number engine, and "my::sha256" is a good name for the default hash.
(2) "Default" as in "implementation-defined." Such a typedef will never be used by anyone, because its behavior will not be portable. Prefer to eliminate such useless cruft from the wording completely.

I removed the uuid_default_generator completely from the specification. Although in my particular implementation I still have that, I have renamed it to uuid_system_generator and explained it uses system resources for generating a UUID. However, it's not part of the library specification any more.

I hope this addresses most of the last concerns.

Marius


Marius Bancila

unread,

Mar 6, 2018, 4:48:32 PM3/6/18

to std-pr...@isocpp.org

My proposal on the uuid library, P0959R0, made it to the pre-Jacksonville mailing (https://isocpp.org/blog/2018/02/2018-02-pre-jacksonville-mailing-available), but I cannot be there to present it. If anyone who will be there is interested in presenting it, please drop me a message.

Thank you

Tony V E

unread,

Jun 13, 2018, 11:11:34 PM6/13/18

to Standard Proposals

The committee discussed this paper briefly on Saturday, at the very end of the week.

Some comments (a mix of just my opinions, and LEWG comments)

- don't construct from a string_view. Make that a "factory" function, i.e. a static member function uuid from_string(string_view). Either return a nil uuid on error, or throw an exception. (Note that nil is a valid result, so returning nil means you don't know whether it is an error or a valid parse, unfortunately. Might also consider expected<uuid>, but nothing else in the STL uses expected, and it is not in the STL yet.)

It should be a function because it is doing parsing, which feels more like a function (takes input, gives output) than a constructor.
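A rough sketch of such a factory function follows. Because `expected` is not available, this version returns `std::optional<uuid>` (C++17) so that a parse failure stays distinguishable from a valid nil UUID; that choice is an assumption made here for illustration, not one of the options listed above. The class layout and the helper `hex_value` are likewise hypothetical, and only the canonical 36-character form is accepted.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

class uuid {
public:
    uuid() = default;  // nil UUID (all bytes zero)

    // Parses the canonical 8-4-4-4-12 form; returns std::nullopt on error.
    static std::optional<uuid> from_string(std::string_view text) noexcept {
        if (text.size() != 36) return std::nullopt;
        uuid result;
        std::size_t pos = 0;
        for (std::size_t i = 0; i < 16; ++i) {
            // A hyphen is required before bytes 4, 6, 8 and 10.
            if (i == 4 || i == 6 || i == 8 || i == 10) {
                if (text[pos++] != '-') return std::nullopt;
            }
            const int hi = hex_value(text[pos++]);
            const int lo = hex_value(text[pos++]);
            if (hi < 0 || lo < 0) return std::nullopt;
            result.bytes_[i] = static_cast<std::uint8_t>(hi * 16 + lo);
        }
        return result;
    }

    bool is_nil() const noexcept {
        for (std::uint8_t b : bytes_) {
            if (b != 0) return false;
        }
        return true;
    }

    friend bool operator==(const uuid& a, const uuid& b) noexcept {
        return a.bytes_ == b.bytes_;
    }

private:
    // Returns the value of a hex digit, or -1 if c is not a hex digit.
    static int hex_value(char c) noexcept {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    std::array<std::uint8_t, 16> bytes_{};
};
```

A caller can then test `has_value()` for the parse result and `is_nil()` for the parsed value separately, which a nil-on-error return cannot offer.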

- construct from a fixed-size span<const std::byte, 16>. This makes "wrong size" not your problem.

- not sure whether the iterator constructor should stay - I suppose it is useful for non-contiguous cases. Committee might make it UB if it is not 16 bytes.

- remove container-like stuff: size and iterators, typedefs, etc - just have a to_span() function that returns a span<const std::byte, 16>

(it probably never should have had mutable iterators, btw - we don't want to allow changing single bytes in a uuid)

Oh, wait - you were saying that the internal representation might not be just bytes - is that important?

I think we need to standardize the order. Construction from a sequence of bytes should give the same uuid on all machines.

And we also need to define how the bytes turn into strings.

Basically, I would say the byte order turns directly into string order. For bytes a,b,c,d... you get string "aabbccdd-eeff-gghh-iijj-kkllmmnnoopp"

- remove state_size. sizeof(uuid) is good enough (Hmm, although it is technically not guaranteed to be == 16, unless we add that in writing. But it doesn't matter - use size() on the returned span)

- rename nil to is_nil (or remove altogether - checking == {} works)

- add assignment from another uuid

- should be trivially copyable, etc

Basically, we want a class that is not much more than 16 bytes gathered up. They should almost always be treated as a whole, not parts. So once you have the unique bytes inside the uuid, you can be sure that they stay unique (as unique as wherever you constructed them from).
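The "not much more than 16 bytes gathered up" requirement can be stated as compile-time checks. The sketch below uses a hypothetical `uuid` class with a byte-array member. The `sizeof` assertion holds on mainstream implementations but, as noted above, is not guaranteed by the standard unless the wording says so.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical minimal value type: no mutable access to single bytes.
class uuid {
public:
    constexpr uuid() noexcept = default;  // nil UUID
    constexpr explicit uuid(const std::array<std::uint8_t, 16>& bytes) noexcept
        : bytes_(bytes) {}

    friend bool operator==(const uuid& a, const uuid& b) noexcept {
        return a.bytes_ == b.bytes_;
    }

private:
    std::array<std::uint8_t, 16> bytes_{};
};

// Copy and assignment come for free and are trivial.
static_assert(std::is_trivially_copyable_v<uuid>);
static_assert(std::is_standard_layout_v<uuid>);
// Assumption: true on common ABIs, not mandated by the standard.
static_assert(sizeof(uuid) == 16);
```

Being trivially copyable means a `uuid` can be copied with `memcpy`, stored in raw buffers, and passed cheaply by value.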

The invariant of the class is the uniqueness. You can't actually guarantee that, but you can at least guarantee that a uuid (if not nil) _maintains_ the invariant, if you assume all uuids are correct on construction.

(Would be interesting to consider move-only, to maintain uniqueness)


--

Be seeing you,

Tony

Patrice Roy

unread,

Jun 14, 2018, 12:06:27 AM6/14/18

to std-pr...@isocpp.org

Thanks for the follow-up, Tony. I'm sure Marius appreciates it :)


Marius Bancila

unread,

Jun 14, 2018, 1:59:12 AM6/14/18

to std-pr...@isocpp.org

Thank you for the feedback. I will work it out following the comments.

As a side note, I have already renamed nil() to is_nil(), although I did not submit an update of the paper only for that, as I wanted to get this kind of feedback first.

Marius


Thiago Macieira

unread,

Jun 14, 2018, 2:36:45 AM6/14/18

to std-pr...@isocpp.org

On Wednesday, 13 June 2018 20:11:31 PDT Tony V E wrote:
> - remove container-like stuff: size and iterators, typedefs, etc - just
> have a to_span() function that returns a span<const std::byte, 16>
> (it probably never should have had mutable iterators, btw - we don't want
> to allow changing single bytes in a uuid)
>
> Oh, wait - you were saying that the internal representation might not be
> just bytes - is that important?
> I think we need to standardize the order. Construction from a sequence of
> bytes should give the same uuid on all machines.

Suggest referring to RFC 4122 section 4.1.2 for the layout and byte order.

Andrey Semashev

unread,

Jun 14, 2018, 5:01:20 AM6/14/18

to std-pr...@isocpp.org

On 06/14/18 06:11, Tony V E wrote:
> The committee discussed this paper briefly on Saturday, at the very end
> of the week.
>
> Some comments (a mix of just my opinions, and LEWG comments)
>

> - remove container-like stuff: size and iterators, typedefs, etc - just
> have a to_span() function that returns a span<const std::byte, 16>
> (it probably never should have had mutable iterators, btw - we don't
> want to allow changing single bytes in a uuid)
>
> Oh, wait - you were saying that the internal representation might not be
> just bytes - is that important?

This was my suggestion. I don't think we should use a span to expose the
contents of uuid because it mandates representation (an array of bytes,
big endian). This will harm performance. The point of iterators is that
they may not be just pointers. For example, they could be reverse
iterators on little endian machines.

If we had iterator_range, we could use that. As long as we don't, I
think exposing iterators is the best we can do.

> I think we need to standardize the order. Construction from a sequence
> of bytes should give the same uuid on all machines.

The order of bytes exposed by uuid's iterators must be fixed, as it is
fixed in RFC. The important part is that the bytes stored in the
implementation can have a different order and, actually, can be stored
not as bytes but as words or a SIMD vector. The conversion logic is
hidden in the iterators.

The constructor from bytes would accept the portable representation
(equivalent to RFC) and convert it to the internal representation.
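To illustrate the separation between internal and portable representation, here is a sketch of a UUID stored as two native 64-bit words. Everything in it is hypothetical (class layout, the `byte(i)` accessor that stands in for the iterators discussed above); it only shows that the exposed byte order can be fixed while the storage is not a byte array.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical uuid that stores two native 64-bit words instead of bytes.
class uuid {
public:
    constexpr uuid() noexcept = default;

    // Accepts the portable form (RFC 4122, most significant byte first)
    // and converts it to the internal word representation.
    explicit constexpr uuid(const std::array<std::uint8_t, 16>& rfc) noexcept {
        for (std::size_t i = 0; i < 8; ++i) {
            hi_ = (hi_ << 8) | rfc[i];
            lo_ = (lo_ << 8) | rfc[i + 8];
        }
    }

    // Returns byte i of the portable form, whatever the host endianness.
    constexpr std::uint8_t byte(std::size_t i) const noexcept {
        const std::uint64_t word = (i < 8) ? hi_ : lo_;
        return static_cast<std::uint8_t>(word >> (8 * (7 - (i % 8))));
    }

    // Comparisons work on whole words; no byte loop is needed.
    friend constexpr bool operator==(const uuid& a, const uuid& b) noexcept {
        return a.hi_ == b.hi_ && a.lo_ == b.lo_;
    }
    friend constexpr bool operator<(const uuid& a, const uuid& b) noexcept {
        return a.hi_ != b.hi_ ? a.hi_ < b.hi_ : a.lo_ < b.lo_;
    }

private:
    std::uint64_t hi_ = 0;  // bytes 0..7 of the portable form
    std::uint64_t lo_ = 0;  // bytes 8..15 of the portable form
};
```

Because the conversion uses shifts rather than memory reinterpretation, the same source produces the same portable bytes on little-endian and big-endian hosts.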

> And we also need to define how the bytes turn into strings.
>
> Basically, I would say the byte order turns directly into string order.
> For bytes a,b,c,d... you get string "aabbccdd-eeff-gghh-iijj-kkllmmnnoopp"

The string format is described in the RFC, we could reproduce it in the
wording.

> - remove state_size. sizeof(uuid) is good enough (Hmm, although it is
> technically not guaranteed to be == 16, unless we add that in writing.
> But it doesn't matter - use size() on the returned span)

I think, uuid::size() should do. Using sizeof to determine the size of
the UUID doesn't feel right.

> - rename nil to is_nil (or remove altogether - checking == {} works)

is_nil could be more efficient.

> Basically, we want a class that is not much more than 16 bytes gathered
> up. They should almost always be treated as a whole, not parts. So
> once you have the unique bytes inside the uuid, you can be sure that
> they stay unique (as unique as wherever you constructed them from).
> The invariant of the class is the uniqueness. You can't actually
> guarantee that, but you can at least guarantee that a uuid (if not nil)
> _maintains_ the invariant, if you assume all uuids are correct on
> construction.

I don't agree. Uniqueness is not an invariant of a UUID. It's an
invariant of a random UUID generator, although technically it's not
guaranteed. As for UUID, as soon as it is constructed, it's just an id.
It doesn't have to be unique (e.g. you can copy it or you can construct
multiple UUIDs from the same bytes).

> (Would be interesting to consider move-only, to maintain uniqueness)

I don't think having it movable only would be useful.

Arthur O'Dwyer

unread,

Jun 14, 2018, 2:23:16 PM6/14/18

to ISO C++ Standard - Future Proposals

On Thu, Jun 14, 2018 at 2:01 AM, Andrey Semashev <andrey....@gmail.com> wrote:

On 06/14/18 06:11, Tony V E wrote:

The committee discussed this paper briefly on Saturday, at the very end of the week.

Some comments (a mix of just my opinions, and LEWG comments)

- rename nil to is_nil (or remove altogether - checking == {} works)

is_nil could be more efficient.

FWIW,

- I forget the motivation for letting a UUID get into the "empty" state

- I forget why the "empty" state should be considered distinct from the perfectly valid "all-bytes-zero" state which is easy to test for and not special in any way

- I forget why the "empty" state is not named "empty"

Basically, we want a class that is not much more than 16 bytes gathered up. They should almost always be treated as a whole, not parts. So once you have the unique bytes inside the uuid, you can be sure that they stay unique (as unique as wherever you constructed them from).
The invariant of the class is the uniqueness. You can't actually guarantee that, but you can at least guarantee that a uuid (if not nil) _maintains_ the invariant, if you assume all uuids are correct on construction.

I don't agree. Uniqueness is not an invariant of a UUID. It's an invariant of a random UUID generator, although technically it's not guaranteed. As for UUID, as soon as it is constructed, it's just an id. It doesn't have to be unique (e.g. you can copy it or you can construct multiple UUIDs from the same bytes).

(Would be interesting to consider move-only, to maintain uniqueness)

I don't think having it movable only would be useful.

Right. There is absolutely no value, and quite a lot of downside, to making a key type such as UUID (or ISBN, or a string type used to store person's name) non-copyable. The whole point of making a key is that you can use it to refer to things from multiple different places, which means being able to copy it freely.

–Arthur

Thiago Macieira

unread,

Jun 14, 2018, 3:03:36 PM6/14/18

to std-pr...@isocpp.org

On Thursday, 14 June 2018 02:01:16 PDT Andrey Semashev wrote:
> > - remove container-like stuff: size and iterators, typedefs, etc - just
> > have a to_span() function that returns a span<const std::byte, 16>
> > (it probably never should have had mutable iterators, btw - we don't
> > want to allow changing single bytes in a uuid)
> >
> > Oh, wait - you were saying that the internal representation might not be
> > just bytes - is that important?
>
> This was my suggestion. I don't think we should use a span to expose the
> contents of uuid because it mandates representation (an array of bytes,
> big endian). This will harm performance. The point of iterators is that
> they may not be just pointers. For example, they could be reverse
> iterators on little endian machines.

Right, providing a function that returns std::span<byte, 16> may not work;
a function returning a std::array<byte, 16> could provide the actual bytes
instead, if one needs them for some reason.

For example, the implementation on Windows may want to be layout-compatible
with the GUID structure[1] to facilitate copying to and from that type (QUuid
does that). As a side-effect, it means the first 8 bytes are not in the
correct order for a byte-level access on little-endian machines.
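The byte-order issue can be made concrete with a small sketch. The struct `guid_like` below only mirrors the field layout of the Windows GUID structure (a 32-bit field, two 16-bit fields, and eight bytes); it is a hypothetical stand-in, not the real Windows type, and `to_rfc_bytes` is an illustrative name.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the Windows GUID field layout.
struct guid_like {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

// Produces the RFC 4122 byte sequence (most significant byte first),
// independent of how the host stores the integer fields in memory.
inline std::array<std::uint8_t, 16> to_rfc_bytes(const guid_like& g) noexcept {
    std::array<std::uint8_t, 16> out{};
    out[0] = static_cast<std::uint8_t>(g.data1 >> 24);
    out[1] = static_cast<std::uint8_t>(g.data1 >> 16);
    out[2] = static_cast<std::uint8_t>(g.data1 >> 8);
    out[3] = static_cast<std::uint8_t>(g.data1);
    out[4] = static_cast<std::uint8_t>(g.data2 >> 8);
    out[5] = static_cast<std::uint8_t>(g.data2);
    out[6] = static_cast<std::uint8_t>(g.data3 >> 8);
    out[7] = static_cast<std::uint8_t>(g.data3);
    // The last eight bytes are already stored in sequence order.
    for (int i = 0; i < 8; ++i) out[8 + i] = g.data4[i];
    return out;
}
```

On a little-endian machine a raw `memcpy` of such a struct would yield the first three fields byte-swapped relative to the RFC order, which is why a by-value conversion is safer than exposing a span over the storage.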

By the way, what's the policy on allowing the implementation to provide member
functions to convert to and from native structures like GUID?

[1] https://msdn.microsoft.com/en-us/library/windows/desktop/aa373931.aspx

Tony V E

Jun 14, 2018, 3:04:32 PM

to Standard Proposals

On Thu, Jun 14, 2018 at 5:01 AM, Andrey Semashev <andrey....@gmail.com> wrote:

On 06/14/18 06:11, Tony V E wrote:

The committee discussed this paper briefly on Saturday, at the very end of the week.

Some comments (a mix of just my opinions, and LEWG comments)

- remove container-like stuff: size and iterators, typedefs, etc - just have a to_span() function that returns a span<const std::byte, 16>
(it probably never should have had mutable iterators, btw - we don't want to allow changing single bytes in a uuid)

Oh, wait - you were saying that the internal representation might not be just bytes - is that important?

This was my suggestion. I don't think we should use a span to expose the contents of uuid because it mandates representation (an array of bytes, big endian). This will harm performance. The point of iterators is that they may not be just pointers. For example, they could be reverse iterators on little endian machines.

How much potential performance are we talking about? Is it really worth it?

The RFC mandates network byte order (most significant first). Does that mean < can be just memcmp?

Being able to use span simplifies a lot.

And I'd like uuid to NOT be a container. It is an ID. Container-ness and bytes are implementation details.

But if performance can be shown to be important, then it is important.

If we had iterator_range, we could use that. As long as we don't, I think exposing iterators is the best we can do.

I think we need to standardize the order. Construction from a sequence of bytes should give the same uuid on all machines.

The order of bytes exposed by uuid's iterators must be fixed, as it is fixed in RFC. The important part is that the bytes stored in the implementation can have a different order and, actually, can be stored not as bytes but as words or a SIMD vector. The conversion logic is hidden in the iterators.

The constructor from bytes would accept the portable representation (equivalent to RFC) and convert it to the internal representation.

And we also need to define how the bytes turn into strings.

Basically, I would say the byte order turns directly into string order. For bytes a,b,c,d... you get string "aabbccdd-eeff-gghh-iijj-kkllmmnnoopp"

The string format is described in the RFC, we could reproduce it in the wording.

- remove state_size. sizeof(uuid) is good enough (Hmm, although it is technically not guaranteed to be == 16, unless we add that in writing. But it doesn't matter - use size() on the returned span)

I think, uuid::size() should do. Using sizeof to determine the size of the UUID doesn't feel right.

If it is not a container, you don't need size or sizeof at all. It is 16 by definition. (Hmmm, what if you are on a system with 9 or 16 bit bytes? I guess there are still 16 *octets* and the iterators or span would still have 16 entries.)

I guess uint8_t is better than std::byte, for that reason.

- rename nil to is_nil (or remove altogether - checking == {} works)

is_nil could be more efficient.

Basically, we want a class that is not much more than 16 bytes gathered up. They should almost always be treated as a whole, not parts. So once you have the unique bytes inside the uuid, you can be sure that they stay unique (as unique as wherever you constructed them from).
The invariant of the class is the uniqueness. You can't actually guarantee that, but you can at least guarantee that a uuid (if not nil) _maintains_ the invariant, if you assume all uuids are correct on construction.

I don't agree. Uniqueness is not an invariant of a UUID. It's an invariant of a random UUID generator, although technically it's not guaranteed. As for UUID, as soon as it is constructed, it's just an id. It doesn't have to be unique (e.g. you can copy it or you can construct multiple UUIDs from the same bytes).

(Would be interesting to consider move-only, to maintain uniqueness)

I don't think having it movable only would be useful.

Andrey Semashev

Jun 14, 2018, 3:11:33 PM

to ISO C++ Standard - Future Proposals

On Thu, Jun 14, 2018 at 9:23 PM, Arthur O'Dwyer
<arthur....@gmail.com> wrote:
> On Thu, Jun 14, 2018 at 2:01 AM, Andrey Semashev <andrey....@gmail.com>
> wrote:
>> On 06/14/18 06:11, Tony V E wrote:
>>>
>>> - rename nil to is_nil (or remove altogether - checking == {} works)
>>
>> is_nil could be more efficient.
>
> FWIW,
> - I forget the motivation for letting a UUID get into the "empty" state
> - I forget why the "empty" state should be considered distinct from the
> perfectly valid "all-bytes-zero" state which is easy to test for and not
> special in any way
> - I forget why the "empty" state is not named "empty"

The nil value is not an empty state; a nil UUID still contains a
value. This value is defined in the RFC and is designated as such:

https://tools.ietf.org/html/rfc4122#section-4.1.7

I would say that having a special function to test against this value
is useful because it is often used as an indicator of a
default-constructed, unassigned, or unused value. The RFC
specifically mentions that value, so having a way to explicitly test
for it seems reasonable.
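A minimal sketch of the distinction being drawn here, assuming a hypothetical `uuid_sketch` class that stores 16 bytes (the class name and its members are illustrative, not the proposal's wording). A default-constructed object holds the all-zero nil value from RFC 4122 section 4.1.7, and `is_nil()` tests for exactly that value:

```cpp
#include <algorithm>
#include <array>

// Hypothetical stand-in for the proposed uuid class; names are illustrative only.
class uuid_sketch {
public:
    // Default construction yields the nil UUID (all 128 bits set to zero).
    uuid_sketch() noexcept = default;

    explicit uuid_sketch(const std::array<unsigned char, 16>& bytes) noexcept
        : data_(bytes) {}

    // True only for the nil value; the object still holds 16 bytes of data.
    bool is_nil() const noexcept {
        return std::all_of(data_.begin(), data_.end(),
                           [](unsigned char b) { return b == 0; });
    }

    friend bool operator==(const uuid_sketch& a, const uuid_sketch& b) noexcept {
        return a.data_ == b.data_;
    }

private:
    std::array<unsigned char, 16> data_{};
};
```

With this shape, `u.is_nil()` and `u == uuid_sketch{}` give the same answer; whether a dedicated member is measurably faster than the comparison is not something this sketch demonstrates.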

Tony V E

Jun 14, 2018, 3:17:14 PM

to Standard Proposals

We definitely don't want size() == 16 and empty() == true.

Which is what a nil uuid would say, if we renamed is_nil to empty.

Andrey Semashev

Jun 14, 2018, 3:26:12 PM

to ISO C++ Standard - Future Proposals

On Thu, Jun 14, 2018 at 10:03 PM, Thiago Macieira <thi...@macieira.org> wrote:
> On Thursday, 14 June 2018 02:01:16 PDT Andrey Semashev wrote:
>> > - remove container-like stuff: size and iterators, typedefs, etc - just
>> > have a to_span() function that returns a span<const std::byte, 16>
>> > (it probably never should have had mutable iterators, btw - we don't
>> > want to allow changing single bytes in a uuid)
>> >
>> > Oh, wait - you were saying that the internal representation might not be
>> > just bytes - is that important?
>>
>> This was my suggestion. I don't think we should use a span to expose the
>> contents of uuid because it mandates representation (an array of bytes,
>> big endian). This will harm performance. The point of iterators is that
>> they may not be just pointers. For example, they could be reverse
>> iterators on little endian machines.
>
> Right, it may not work to provide a function returning std::span<byte, 16>,
> but instead a std::array<byte, 16> to get the actual bytes, if one needs them
> for some reason.

This would mean that in order to export the portable representation one
would have to copy the bytes as std::array first and then copy bytes
from that array to where they need to be (e.g. a shared memory region
or write to a file). In this case it might be better to have an export
method of std::uuid that takes an output iterator, which will receive
the bytes.

> For example, the implementation on Windows may want to be layout-compatible
> with the GUID structure[1] to facilitate copying to and from that type (QUuid
> does that). As a side-effect, it means the first 8 bytes are not in the
> correct order for a byte-level access on little-endian machines.

I'm not sure having std::uuid layout-compatible with GUID would be
helpful. Users would have to perform a reinterpret_cast in order to
employ this feature, and this would generally be UB and certainly not
something we want to encourage.

> By the way, what's the policy on allowing the implementation to provide member
> functions to convert to and from native structures like GUID?

We have native_handle() in std::mutex and std::condition_variable. We
could add something like this to std::uuid, but frankly, I would
prefer if std::uuid didn't use GUID as implementation as it would make
its ordering inefficient.

Andrey Semashev

Jun 14, 2018, 3:43:21 PM

to ISO C++ Standard - Future Proposals

On Thu, Jun 14, 2018 at 10:04 PM, Tony V E <tvan...@gmail.com> wrote:
> On Thu, Jun 14, 2018 at 5:01 AM, Andrey Semashev <andrey....@gmail.com>
> wrote:
>> On 06/14/18 06:11, Tony V E wrote:
>>>
>>> - remove container-like stuff: size and iterators, typedefs, etc - just
>>> have a to_span() function that returns a span<const std::byte, 16>
>>> (it probably never should have had mutable iterators, btw - we don't want
>>> to allow changing single bytes in a uuid)
>>>
>>> Oh, wait - you were saying that the internal representation might not be
>>> just bytes - is that important?
>>
>>
>> This was my suggestion. I don't think we should use a span to expose the
>> contents of uuid because it mandates representation (an array of bytes, big
>> endian). This will harm performance. The point of iterators is that they may
>> not be just pointers. For example, they could be reverse iterators on little
>> endian machines.
>
> How much potential performance are we talking about? Is it really worth it?
> The RFC mandates network byte order (most significant first). Does that
> mean < can be just memcmp?
>
> Being able to use span simplifies a lot.
>
> And I'd like uuid to NOT be a container. It is an ID. Container-ness and
> bytes are implementation details.
>
> But if performance can be shown to be important, then it is important.

My primary concern is about the ordering operators. For example, in
Boost.UUID there is an optimized implementation of operator< for SSE
[1], in which you can see the additional trickery that had to be
implemented to accommodate the little-endian native byte order.
Ordering operators are often used in associative containers and
sorting, which means lots of calls in a tight loop, so I would expect
this to be a rather hot operation. I would really like it if std::uuid
didn't enforce this kind of implementation.

[1] https://github.com/boostorg/uuid/blob/develop/include/boost/uuid/detail/uuid_x86.ipp#L98
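For comparison, here is a minimal sketch of the memcmp-based ordering asked about earlier in the thread. It assumes a hypothetical `uuid_bytes` type that stores the 16 bytes most-significant-first, as the RFC's byte order prescribes; in that case `std::memcmp`, which compares bytes as `unsigned char`, yields the same ordering as comparing the 128-bit values. It makes no claim about speed relative to the SSE implementation linked above:

```cpp
#include <array>
#include <cstring>

// Hypothetical representation: 16 bytes stored in RFC (big-endian) order.
struct uuid_bytes {
    std::array<unsigned char, 16> data{};
};

// Lexicographic byte comparison; valid as a numeric ordering only because
// the bytes are stored most-significant-first.
inline bool operator<(const uuid_bytes& a, const uuid_bytes& b) noexcept {
    return std::memcmp(a.data.data(), b.data.data(), a.data.size()) < 0;
}
```

An implementation that keeps the value as native words or a SIMD vector would need a different comparison, which is the trade-off being debated here.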

>>> - remove state_size. sizeof(uuid) is good enough (Hmm, although it is
>>> technically not guaranteed to be == 16, unless we add that in writing. But
>>> it doesn't matter - use size() on the returned span)
>>
>> I think, uuid::size() should do. Using sizeof to determine the size of the
>> UUID doesn't feel right.
>
> If it is not a container, you don't need size or sizeof at all. It is 16 by
> definition.

You do need a size if you want to export or import bytes into UUID.
Yes, the size is always 16, but having it as a constant or in another
symbolic form would help avoid magic constants and silly typos in
the user's code.

> (Hmmm, what if you are on a system with 9 or 16 bit bytes? I
> guess there are still 16 *octets* and the iterators or span would still have
> 16 entries.
>
> I guess uint8_t is better than std::byte, for that reason.

Yes, we had this discussion before. The net result is that though the
RFC defines UUID as a 128-bit object, a platform with non-8-bit
bytes would leave the excess bits of every byte unused and set to a
predefined value so that they don't participate in the UUID value.

Edward Catmur

Jun 14, 2018, 4:01:28 PM

to std-pr...@isocpp.org

On Thu, Jun 14, 2018 at 8:26 PM, Andrey Semashev <andrey....@gmail.com> wrote:

On Thu, Jun 14, 2018 at 10:03 PM, Thiago Macieira <thi...@macieira.org> wrote:
> On Thursday, 14 June 2018 02:01:16 PDT Andrey Semashev wrote:
>> > - remove container-like stuff: size and iterators, typedefs, etc - just
>> > have a to_span() function that returns a span<const std::byte, 16>
>> > (it probably never should have had mutable iterators, btw - we don't
>> > want to allow changing single bytes in a uuid)
>> >
>> > Oh, wait - you were saying that the internal representation might not be
>> > just bytes - is that important?
>>
>> This was my suggestion. I don't think we should use a span to expose the
>> contents of uuid because it mandates representation (an array of bytes,
>> big endian). This will harm performance. The point of iterators is that
>> they may not be just pointers. For example, they could be reverse
>> iterators on little endian machines.
>
> Right, it may not work to provide a function returning std::span<byte, 16>,
> but instead a std::array<byte, 16> to get the actual bytes, if one needs them
> for some reason.

This would mean that in order to export portable representation one
would have to copy the bytes as std::array first and then copy bytes
from that array to where they need to be (e.g. a shared memory region
or write to a file). In this case it might be better to have an export
method of std::uuid that takes an output iterator, which will receive
the bytes.

With Ranges TS one can write ranges::copy(uuid.as_bytes(), OutputIterator). I don't think an export() method would really add anything.

> For example, the implementation on Windows may want to be layout-compatible
> with the GUID structure[1] to facilitate copying to and from that type (QUuid
> does that). As a side-effect, it means the first 8 bytes are not in the
> correct order for a byte-level access on little-endian machines.

I'm not sure having std::uuid layout compatible with GUID would be
helpful. Users would have to perform a reinterpret_cast in order to
employ this feature, and this would be generally UB and certainly not
something we want to encourage.

Or memcpy, or bit_cast. It doesn't have to be UB. However, this would still encourage non-obviously platform-dependent code, which should be avoided.

> By the way, what's the policy on allowing the implementation to provide member
> functions to convert to and from native structures like GUID?

We have native_handle() in std::mutex and std::condition_variable. We
could add something like this to std::uuid, but frankly, I would
prefer if std::uuid didn't use GUID as implementation as it would make
its ordering inefficient.

That wouldn't imply using GUID internally; native() could convert and return by value. But what about platforms that don't have a native uuid type?

Tony V E

Jun 14, 2018, 4:09:36 PM

to Standard Proposals

OK, revising, based on latest emails:

On Wed, Jun 13, 2018 at 11:11 PM, Tony V E <tvan...@gmail.com> wrote:

The committee discussed this paper briefly on Saturday, at the very end of the week.

Some comments (a mix of just my opinions, and LEWG comments)

- don't construct from a string_view. Make that a "factory" function, i.e. a static member function uuid from_string(string_view). Either return nil uuid on error, or throw an exception. (Note that nil is a valid result, so returning nil means you don't know if it is an error or a valid parse, unfortunately. Might also consider expected<uuid> but nothing else in the STL uses expected (and it is not in the STL yet))
It should be a function because it is doing parsing, which feels more like a function (takes input, gives output) than a constructor.

still yes to the above
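A minimal sketch of the throwing variant of such a factory, under some stated assumptions: the name `from_string_sketch` is hypothetical, the result is returned as a plain 16-byte array rather than a uuid class, `std::invalid_argument` stands in for whatever exception type the proposal would choose, and `const std::string&` is used instead of `string_view` only to keep the sketch free of C++17 requirements. Because failure is reported by throwing, the nil string remains distinguishable from a parse error:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns the value of one hex digit, or -1 if c is not a hex digit.
inline int hex_value(char c) noexcept {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical factory: parses "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" into
// 16 bytes in string order and throws on malformed input.
inline std::array<unsigned char, 16> from_string_sketch(const std::string& s) {
    if (s.size() != 36) {
        throw std::invalid_argument("uuid text must be 36 characters");
    }
    std::array<unsigned char, 16> bytes{};
    std::size_t pos = 0;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // Dashes precede bytes 4, 6, 8 and 10 (the 8-4-4-4-12 grouping).
        if (i == 4 || i == 6 || i == 8 || i == 10) {
            if (s[pos] != '-') throw std::invalid_argument("expected '-'");
            ++pos;
        }
        const int hi = hex_value(s[pos]);
        const int lo = hex_value(s[pos + 1]);
        if (hi < 0 || lo < 0) throw std::invalid_argument("expected a hex digit");
        bytes[i] = static_cast<unsigned char>(hi * 16 + lo);
        pos += 2;
    }
    assert(pos == s.size());
    return bytes;
}
```

Parsing `"00000000-0000-0000-0000-000000000000"` succeeds and yields all-zero bytes, while `"not-a-uuid"` throws, so the caller never has to guess which case occurred.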

- construct from a fixed-size span<const std::byte, 16>. This makes "wrong size" not your problem.

still yes. Except uint8_t.

We need to say that the order matches the RFC

- not sure whether the iterator constructor should stay - I suppose it is useful for non-contiguous cases. Committee might make it UB if it is not 16 bytes.

Do we even need the end iterator?

- remove container-like stuff: size and iterators, typedefs, etc - just have a to_span() function that returns a span<const std::byte, 16>
(it probably never should have had mutable iterators, btw - we don't want to allow changing single bytes in a uuid)

Oh, wait - you were saying that the internal representation might not be just bytes - is that important?

OK, so keep the container interface. But only const_iterators. Not mutable.

Is that a problem for writing into an existing uuid? Construct a new one and assign? Let the compiler optimize it.

Or do we need the mutable iterators? They seem wrong to me.

I think we need to standardize the order. Construction from a sequence of bytes should give the same uuid on all machines.

yes (refer to RFC)

And we also need to define how the bytes turn into strings.

yes (refer to RFC)
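A minimal sketch of the byte-to-string mapping described above, assuming the bytes are already in RFC order; the function name `to_string_sketch` is hypothetical. Each byte becomes two hex digits in sequence, with dashes producing the 8-4-4-4-12 grouping, and the digits are emitted in lower case as RFC 4122 specifies for output:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Formats 16 bytes, given most-significant-first, as
// "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx".
inline std::string to_string_sketch(const std::array<unsigned char, 16>& bytes) {
    static const char hex[] = "0123456789abcdef";
    std::string out;
    out.reserve(36);
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // Dashes precede bytes 4, 6, 8 and 10.
        if (i == 4 || i == 6 || i == 8 || i == 10) {
            out.push_back('-');
        }
        out.push_back(hex[(bytes[i] >> 4) & 0x0F]);
        out.push_back(hex[bytes[i] & 0x0F]);
    }
    assert(out.size() == 36);
    return out;
}
```

For the bytes 0x00 through 0x0f in order, this produces `00010203-0405-0607-0809-0a0b0c0d0e0f`, so byte order and string order coincide.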

- remove state_size. sizeof(uuid) is good enough (Hmm, although it is technically not guaranteed to be == 16, unless we add that in writing. But it doesn't matter - use size() on the returned span)

size()

- rename nil to is_nil (or remove altogether - checking == {} works)

yes

- add assignment from another uuid

yes

- should be trivially copyable, etc

yes?

Basically, we want a class that is not much more than 16 bytes gathered up. They should almost always be treated as a whole, not parts. So once you have the unique bytes inside the uuid, you can be sure that they stay unique (as unique as wherever you constructed them from).
The invariant of the class is the uniqueness. You can't actually guarantee that, but you can at least guarantee that a uuid (if not nil) _maintains_ the invariant, if you assume all uuids are correct on construction.

(Would be interesting to consider move-only, to maintain uniqueness)

no.

Andrey Semashev

Jun 14, 2018, 4:12:49 PM

to ISO C++ Standard - Future Proposals

On Thu, Jun 14, 2018 at 10:43 PM, Andrey Semashev

In case you're interested, here are some benchmarks, including against
a memcmp-based version:

https://svn.boost.org/trac10/ticket/8509
https://svn.boost.org/trac10/attachment/ticket/8509/uuid_operators.txt

I didn't test a version that doesn't account for endianness, though.

Andrey Semashev

Jun 14, 2018, 4:28:13 PM

to ISO C++ Standard - Future Proposals

On Thu, Jun 14, 2018 at 11:01 PM, 'Edward Catmur' via ISO C++ Standard
- Future Proposals <std-pr...@isocpp.org> wrote:
> On Thu, Jun 14, 2018 at 8:26 PM, Andrey Semashev <andrey....@gmail.com>
> wrote:
>> On Thu, Jun 14, 2018 at 10:03 PM, Thiago Macieira <thi...@macieira.org>
>> wrote:
>> >
>> > Right, it may not work to provide a function returning std::span<byte,
>> > 16>,
>> > but instead a std::array<byte, 16> to get the actual bytes, if one needs
>> > them
>> > for some reason.
>>
>> This would mean that in order to export portable representation one
>> would have to copy the bytes as std::array first and then copy bytes
>> from that array to where they need to be (e.g. a shared memory region
>> or write to a file). In this case it might be better to have an export
>> method of std::uuid that takes an output iterator, which will receive
>> the bytes.
>
> With Ranges TS one can write ranges::copy(uuid.as_bytes(), OutputIterator).
> I don't think an export() method would really add anything.

If as_bytes() returns std::span<std::byte, 16> then the problem with
enforced internal representation remains.

>> We have native_handle() in std::mutex and std::condition_variable. We
>> could add something like this to std::uuid, but frankly, I would
>> prefer if std::uuid didn't use GUID as implementation as it would make
>> its ordering inefficient.
>
> That wouldn't imply using GUID internally, native() could convert and return
> by value.

Oh, right.

> But what about platforms that don't have a native uuid type?

AFAIR, the standard says that native_handle() may not be present if
the platform doesn't provide it. std::uuid::native() could be defined
similarly.

Tony V E

Jun 14, 2018, 4:36:06 PM

to Standard Proposals

I'd rather just make it easy to write to_GUID as a free function.

It is a good test case to see whether we defined the API well.

Andrey Semashev

Jun 14, 2018, 4:43:49 PM

to ISO C++ Standard - Future Proposals

On Thu, Jun 14, 2018 at 11:09 PM, Tony V E <tvan...@gmail.com> wrote:
> OK, revising, based on latest emails:
>
> On Wed, Jun 13, 2018 at 11:11 PM, Tony V E <tvan...@gmail.com> wrote:
>>
>> - construct from a fixed-size span<const std::byte, 16>. This makes "wrong
>> size" not your problem.
>
> still yes. Except uint8_t.
> We need to say that the order matches the RFC

Personally, I'd prefer span<const T, 16> where T is one of std::byte,
unsigned char, signed char or char. std::byte is rather cumbersome to
use; most of the time you get some variant of char from external
sources.

uint8_t would limit std::uuid to only platforms with 8-bit bytes. This
is probably unnecessary.

>> - not sure whether the iterator constructor should stay - I suppose it is
>> useful for non-contiguous cases. Committee might make it UB if it is not 16
>> bytes.
>
> Do we even need the end iterator?

Some implementations might want to use it for debugging purposes.
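A minimal sketch of how an end iterator could serve as a debugging aid, assuming a hypothetical `uuid_iter` class with an iterator-pair constructor. The constructor reads at most 16 elements and, in debug builds only, asserts that the range held exactly 16; what the standard would specify for a wrong-length range is left open in the thread and is not decided here:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical class used only to illustrate the iterator-pair constructor.
class uuid_iter {
public:
    template <class InputIt>
    uuid_iter(InputIt first, InputIt last) {
        std::size_t n = 0;
        for (; n < data_.size() && first != last; ++n, ++first) {
            data_[n] = static_cast<unsigned char>(*first);
        }
        // Debug-only length check; compiled out when NDEBUG is defined.
        assert(n == data_.size() && first == last);
    }

    const std::array<unsigned char, 16>& bytes() const noexcept { return data_; }

private:
    std::array<unsigned char, 16> data_{};
};
```

Without the end iterator the constructor would have to trust that 16 elements are readable, which is the trade-off raised by the question above.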

>> - remove container-like stuff: size and iterators, typedefs, etc - just
>> have a to_span() function that returns a span<const std::byte, 16>
>> (it probably never should have had mutable iterators, btw - we don't want
>> to allow changing single bytes in a uuid)
>>
>> Oh, wait - you were saying that the internal representation might not be
>> just bytes - is that important?
>
> OK, so keep the container interface. But only const_iterators. Not
> mutable.
> Is that a problem for writing into an existing uuid? Construct a new one
> and assign? Let the compiler optimize it.
> Or do we need the mutable iterators? They seem wrong to me.

I'd be ok with only const_iterators.

>> - should be trivially copyable, etc
>
> yes?

I don't see an immediate use case that requires trivial copyability,
but if it happens to be trivially copyable, why not.

Tony V E

Jun 14, 2018, 4:58:43 PM

to Standard Proposals

On Thu, Jun 14, 2018 at 4:43 PM, Andrey Semashev <andrey....@gmail.com> wrote:

On Thu, Jun 14, 2018 at 11:09 PM, Tony V E <tvan...@gmail.com> wrote:
> OK, revising, based on latest emails:
>
> On Wed, Jun 13, 2018 at 11:11 PM, Tony V E <tvan...@gmail.com> wrote:
>>
>> - construct from a fixed-size span<const std::byte, 16>. This makes "wrong
>> size" not your problem.
>
> still yes. Except uint8_t.
> We need to say that the order matches the RFC

Personally, I'd prefer span<const T, 16> where T is one of std::byte,
unsigned char, signed char or char. std::byte is rather cumbersome to
use, most of the time you get some variant of char from external
sources.

Let's keep signed char and char out of it. We are really talking numbers between 0 and 255.

So

unsigned char - fine

byte - hard to use

uint8_t - not guaranteed to exist

uint_least8_t?

short short short int?

uint8_t would limit std::uuid to only platforms with 8-bit bytes. This
is probably unnecessary.

>> - not sure whether the iterator constructor should stay - I suppose it is
>> useful for non-contiguous cases. Committee might make it UB if it is not 16
>> bytes.
>
> Do we even need the end iterator?

Some implementations might want to use it for debugging purposes.

>> - remove container-like stuff: size and iterators, typedefs, etc - just
>> have a to_span() function that returns a span<const std::byte, 16>
>> (it probably never should have had mutable iterators, btw - we don't want
>> to allow changing single bytes in a uuid)
>>
>> Oh, wait - you were saying that the internal representation might not be
>> just bytes - is that important?
>
> OK, so keep the container interface. But only const_iterators. Not
> mutable.
> Is that a problem for writing into an existing uuid? Construct a new one
> and assign? Let the compiler optimize it.
> Or do we need the mutable iterators? They seem wrong to me.

I'd be ok with only const_iterators.

>> - should be trivially copyable, etc
>
> yes?

I don't see an immediate use case that requires trivial copyability,
but if it happens to be trivially copyable, why not.

Edward Catmur

Jun 14, 2018, 5:20:23 PM

to std-pr...@isocpp.org

On Thu, Jun 14, 2018 at 9:28 PM, Andrey Semashev <andrey....@gmail.com> wrote:

On Thu, Jun 14, 2018 at 11:01 PM, 'Edward Catmur' via ISO C++ Standard
- Future Proposals <std-pr...@isocpp.org> wrote:
> On Thu, Jun 14, 2018 at 8:26 PM, Andrey Semashev <andrey....@gmail.com>
> wrote:
>> On Thu, Jun 14, 2018 at 10:03 PM, Thiago Macieira <thi...@macieira.org>
>> wrote:
>> >
>> > Right, it may not work to provide a function returning std::span<byte,
>> > 16>,
>> > but instead a std::array<byte, 16> to get the actual bytes, if one needs
>> > them
>> > for some reason.
>>
>> This would mean that in order to export portable representation one
>> would have to copy the bytes as std::array first and then copy bytes
>> from that array to where they need to be (e.g. a shared memory region
>> or write to a file). In this case it might be better to have an export
>> method of std::uuid that takes an output iterator, which will receive
>> the bytes.
>
> With Ranges TS one can write ranges::copy(uuid.as_bytes(), OutputIterator).
> I don't think an export() method would really add anything.

If as_bytes() returns std::span<std::byte, 16> then the problem with
enforced internal representation remains.

Sure, that's why it should return std::array.

If you're concerned about runtime efficiency, std::array<std::byte, 16> can fit into the two return registers (%rax, %rdx). But far more likely it would be inlined and the codegen for copy(as_bytes(), OutputIterator) would be identical to that for an export() member function.
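A minimal sketch of the by-value approach, assuming a hypothetical `uuid_words` class whose internal representation is two 64-bit words rather than bytes. `as_bytes()` converts to the portable most-significant-first order and returns a `std::array`, and a free `export_bytes` helper (also hypothetical) copies it to any output iterator. It uses `std::copy` rather than the Ranges TS `ranges::copy` mentioned above so that it compiles without ranges support:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical uuid whose storage is two native 64-bit words, not bytes.
class uuid_words {
public:
    uuid_words(std::uint64_t hi, std::uint64_t lo) noexcept : hi_(hi), lo_(lo) {}

    // Returns the portable (most-significant-first) bytes by value, so no
    // particular internal layout is exposed.
    std::array<unsigned char, 16> as_bytes() const noexcept {
        std::array<unsigned char, 16> out{};
        for (std::size_t i = 0; i < 8; ++i) {
            const unsigned shift = static_cast<unsigned>(56 - 8 * i);
            out[i]     = static_cast<unsigned char>((hi_ >> shift) & 0xFF);
            out[8 + i] = static_cast<unsigned char>((lo_ >> shift) & 0xFF);
        }
        return out;
    }

private:
    std::uint64_t hi_;
    std::uint64_t lo_;
};

// Copies the portable bytes to any output iterator.
template <class OutputIt>
OutputIt export_bytes(const uuid_words& u, OutputIt out) {
    const std::array<unsigned char, 16> bytes = u.as_bytes();
    return std::copy(bytes.begin(), bytes.end(), out);
}
```

Whether the temporary array is optimized away depends on the compiler and on inlining; the sketch only shows that the interface does not force a byte-array representation.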

Edward Catmur

Jun 14, 2018, 5:33:26 PM

to std-pr...@isocpp.org

We'd need access to the individual parts:

    struct _GUID to_GUID(std::uuid u) {
        struct _GUID g{u.time_low(), u.time_mid(), u.time_hi_and_version()};
        ranges::copy(u.as_bytes() | view::slice(8, 16), g.Data4);
        return g;
    }

But this is still non-trivial to write.

Thiago Macieira

Jun 14, 2018, 5:35:00 PM

to std-pr...@isocpp.org

On Thursday, 14 June 2018 12:26:10 PDT Andrey Semashev wrote:
> > For example, the implementation on Windows may want to be
> > layout-compatible
> > with the GUID structure[1] to facilitate copying to and from that type
> > (QUuid does that). As a side-effect, it means the first 8 bytes are not
> > in the correct order for a byte-level access on little-endian machines.
>
> I'm not sure having std::uuid layout compatible with GUID would be
> helpful. Users would have to perform a reinterpret_cast in order to
> employ this feature, and this would be generally UB and certainly not
> something we want to encourage.

I didn't mean to have this as a front-end visible feature. Instead, it would
be an internal optimisation for the conversion function:

> > By the way, what's the policy on allowing the implementation to provide
> > member functions to convert to and from native structures like GUID?
>
> We have native_handle() in std::mutex and std::condition_variable. We
> could add something like this to std::uuid, but frankly, I would
> prefer if std::uuid didn't use GUID as implementation as it would make
> its ordering inefficient.

It's a QoI matter which conversion the implementation chooses to make
efficient: the copy to the byte representation specified by RFC 4122 or
the copy to the GUID representation.

One of the two will have three byte-swap operations.

Andrey Semashev

Jun 14, 2018, 5:35:18 PM

to ISO C++ Standard - Future Proposals

On Thu, Jun 14, 2018 at 11:58 PM, Tony V E <tvan...@gmail.com> wrote:
> On Thu, Jun 14, 2018 at 4:43 PM, Andrey Semashev <andrey....@gmail.com>
> wrote:
>>
>> On Thu, Jun 14, 2018 at 11:09 PM, Tony V E <tvan...@gmail.com> wrote:
>> > OK, revising, based on latest emails:
>> >
>> > On Wed, Jun 13, 2018 at 11:11 PM, Tony V E <tvan...@gmail.com> wrote:
>> >>
>> >> - construct from a fixed-size span<const std::byte, 16>. This makes
>> >> "wrong
>> >> size" not your problem.
>> >
>> > still yes. Except uint8_t.
>> > We need to say that the order matches the RFC
>>
>> Personally, I'd prefer span<const T, 16> where T is one of std::byte,
>> unsigned char, signed char or char. std::byte is rather cumbersome to
>> use, most of the time you get some variant of char from external
>> sources.
>
> Let's keep signed char and char out of it. We are really talking numbers
> between 0 and 255.
> So
> unsigned char - fine
> byte - hard to use
> uint8_t - not guaranteed to exist
>
> uint_least8_t?
> short short short int?

I think, unsigned char, std::byte and uint8_t (if supported) are fine.

Edward Catmur

Jun 14, 2018, 5:36:13 PM

to std-pr...@isocpp.org

We want to be able to pass uuid by value, in registers (2 general-purpose or 1 SSE). This requires trivial copyability (or [[trivial_abi]], but that's platform-dependent at present).
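A minimal sketch of how the trivial-copyability requirement could be checked at compile time, assuming a hypothetical `uuid_pod` type and a hypothetical `native_id` struct standing in for a platform's 16-byte identifier. It also shows the `memcpy` route mentioned earlier in the thread for converting between same-sized trivially copyable types without `reinterpret_cast`. The size assertion holds on mainstream implementations with 8-bit bytes but is not guaranteed by the standard:

```cpp
#include <array>
#include <cstring>
#include <type_traits>

// Hypothetical uuid representation: nothing but 16 bytes.
struct uuid_pod {
    std::array<unsigned char, 16> data{};
};

// Hypothetical stand-in for a native 16-byte identifier type.
struct native_id {
    unsigned char raw[16];
};

static_assert(std::is_trivially_copyable<uuid_pod>::value,
              "uuid_pod must be trivially copyable");
static_assert(sizeof(uuid_pod) == sizeof(native_id),
              "assumes no padding; true on mainstream implementations");

// Copies the object representation; well-defined because both types are
// trivially copyable and have the same size.
inline native_id to_native(const uuid_pod& u) noexcept {
    native_id n;
    std::memcpy(&n, &u, sizeof n);
    return n;
}
```

Whether such a type is actually passed in registers is decided by the platform ABI, as the Windows remark later in the thread illustrates.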

Thiago Macieira

Jun 14, 2018, 5:43:14 PM

to std-pr...@isocpp.org

On Thursday, 14 June 2018 14:20:20 PDT 'Edward Catmur' via ISO C++ Standard -
Future Proposals wrote:
> If you're concerned about runtime efficiency, std::array<std::byte, 16> can
> fit into the two return registers (%rax, %rdx).

Right. Code gen leaves a bit to be desired though:
https://godbolt.org/g/r8jw4g

The Windows ABI requires this type to be returned via an implicit first
parameter, though. But as you say:

> But far more likely it
> would be inlined and the codegen for copy(as_bytes(), OutputIterator) would
> be identical to that for an export() member function.

I agree. That applies to the byte-swapping version as well.

Andrey Semashev

Jun 14, 2018, 5:47:24 PM

to ISO C++ Standard - Future Proposals

It doesn't have to have access to individual parts. to_GUID could
populate GUID members from portable bytes in RFC format. With SSE it
might be possible to do this with a single byte shuffle.
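A minimal scalar sketch of that idea (not the SSE shuffle), assuming a hypothetical `guid_like` struct that mirrors the field layout of the Windows GUID structure so the code compiles on any platform. The first three fields are assembled from the big-endian portable bytes, which is where the three byte swaps mentioned above come from on little-endian machines; the last eight bytes are copied unchanged:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the Windows GUID field layout.
struct guid_like {
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    unsigned char Data4[8];
};

// Builds the native-endian fields from 16 bytes in RFC (big-endian) order,
// using only the portable byte view of the uuid.
inline guid_like to_guid_like(const std::array<unsigned char, 16>& b) noexcept {
    guid_like g{};
    g.Data1 = (static_cast<std::uint32_t>(b[0]) << 24) |
              (static_cast<std::uint32_t>(b[1]) << 16) |
              (static_cast<std::uint32_t>(b[2]) << 8) |
              static_cast<std::uint32_t>(b[3]);
    g.Data2 = static_cast<std::uint16_t>((b[4] << 8) | b[5]);
    g.Data3 = static_cast<std::uint16_t>((b[6] << 8) | b[7]);
    for (std::size_t i = 0; i < 8; ++i) {
        g.Data4[i] = b[8 + i];
    }
    return g;
}
```

Because the function needs nothing beyond the 16 portable bytes, it can be written as a free function outside the class, which is the API test suggested earlier in the thread.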

Tony V E

Jun 15, 2018, 8:54:44 AM

to Standard Proposals

On Thu, Jun 14, 2018 at 4:43 PM, Andrey Semashev <andrey....@gmail.com> wrote:

On Thu, Jun 14, 2018 at 11:09 PM, Tony V E <tvan...@gmail.com> wrote:
> OK, revising, based on latest emails:
>
> On Wed, Jun 13, 2018 at 11:11 PM, Tony V E <tvan...@gmail.com> wrote:
>>
>> - construct from a fixed-size span<const std::byte, 16>. This makes "wrong
>> size" not your problem.
>
> still yes. Except uint8_t.
> We need to say that the order matches the RFC

Personally, I'd prefer span<const T, 16> where T is one of std::byte,
unsigned char, signed char or char. std::byte is rather cumbersome to
use, most of the time you get some variant of char from external
sources.

uint8_t would limit std::uuid to only platforms with 8-bit bytes. This
is probably unnecessary.

>> - not sure whether the iterator constructor should stay - I suppose it is
>> useful for non-contiguous cases. Committee might make it UB if it is not 16
>> bytes.
>
> Do we even need the end iterator?

Some implementations might want to use it for debugging purposes.

That reminds me - the end iterator should be a separate type from the begin iterator, i.e. Iterator begin, Sentinel end, like Ranges.

Or, actually, just take a Range, since Ranges are likely in C++20.

>> - remove container-like stuff: size and iterators, typedefs, etc - just
>> have a to_span() function that returns a span<const std::byte, 16>
>> (it probably never should have had mutable iterators, btw - we don't want
>> to allow changing single bytes in a uuid)
>>
>> Oh, wait - you were saying that the internal representation might not be
>> just bytes - is that important?
>
> OK, so keep the container interface. But only const_iterators. Not
> mutable.
> Is that a problem for writing into an existing uuid? Construct a new one
> and assign? Let the compiler optimize it.
> Or do we need the mutable iterators? They seem wrong to me.

I'd be ok with only const_iterators.

>> - should be trivially copyable, etc
>
> yes?

I don't see an immediate use case that requires trivial copyability,
but if it happens to be trivially copyable, why not.
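
The fixed-size construction discussed in this message, with the byte-like element types Andrey lists, might look roughly like the sketch below. It is not the proposal's interface: `uuid_sketch` and `is_byte_like_v` are hypothetical names, and a reference to a 16-element array stands in for `span<const T, 16>` so that the example compiles as C++17.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical trait covering the element types named in the thread.
template <class T>
constexpr bool is_byte_like_v =
    std::is_same_v<T, std::byte> || std::is_same_v<T, unsigned char> ||
    std::is_same_v<T, signed char> || std::is_same_v<T, char>;

class uuid_sketch {
public:
    uuid_sketch() = default;

    // Accepts exactly 16 byte-like elements; any other extent or element
    // type fails to compile, so "wrong size" is caught before run time.
    template <class T, std::enable_if_t<is_byte_like_v<T>, int> = 0>
    explicit uuid_sketch(const T (&bytes)[16]) {
        for (std::size_t i = 0; i < 16; ++i)
            data_[i] = static_cast<std::byte>(bytes[i]);
    }

    std::byte operator[](std::size_t i) const { return data_[i]; }

private:
    std::array<std::byte, 16> data_{};
};
```

Only read access is exposed, in line with the preference for const-only iteration expressed above.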

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.

To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/CAEhD%2B6Djfqg17%2BCRyX4qxWA0YuEAk6doySZQ6Q-e5MFp5V9zYA%40mail.gmail.com.

Marius Bancila

Jun 27, 2018, 9:26:07 AM

to std-pr...@isocpp.org

Thank you for all the feedback here. I was a bit busy but finally made some time to look into all the recent discussions here and summarize the things that should be done in order to move further.

Here is a list of changes I have made based on the feedback:

- Removed the string constructors and replaced them with an overloaded free function, from_string().
- Parsing a string into a uuid now throws a uuid_error exception on failure instead of creating a nil uuid.
- {} is now accepted in the supported format for parsing, i.e. "{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}".
- Removed state_size.
- Renamed the member function nil() to is_nil().
- The default constructor is defaulted.
- Added a converting constructor from std::span<std::byte, 16>.
- Added the free function as_bytes() to convert the uuid into a view of its underlying bytes.
- Constructing a uuid from a range with a size other than 16 is undefined behaviour.
- Removed mutable iterators (but preserved the constant iterators).
- Removed typedefs and other container-like parts.
- Defined the correlation between the internal UUID bytes and the string representation.
- Added the UUID layout and byte order specification from the RFC 4122 document.

To be honest, I am unsure whether the member function size() should stay or not. As we don't want to make the uuid a container-like type perhaps this should be removed too.

The revised proposal document is available here https://github.com/mariusbancila/stduuid/blob/master/P0959.md for further review.

Thank you,

Marius

Nicolas Lesser

Jun 27, 2018, 10:56:38 AM

to std-pr...@isocpp.org

You seem to have a lot of functions/constructors that need constexpr :)

- uuid(std::span) -> std::span is constexpr enabled, so I don't see a reason for this not to be constexpr

- uuid(FwdIt, FwdIt) -> this can also be constexpr

- swaps are now constexpr if possible (P0879), which in your case is okay

- begin and end can also be constexpr, there isn't a reason not to

- as_bytes can also be constexpr as std::span is constexpr enabled

- operator==, operator!= and operator< can also be constexpr
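
As a rough illustration of the constexpr suggestions in the list above, the sketch below shows a construction, a nil check and equality that can all be evaluated at compile time. The class name `uuid_sketch` and its array-based constructor are assumptions for the example (the proposal uses std::span). The comparison is a hand-written loop because std::array's own operator== is not constexpr before C++20.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

class uuid_sketch {
public:
    constexpr uuid_sketch() = default;
    constexpr explicit uuid_sketch(const std::array<std::uint8_t, 16>& bytes) noexcept
        : data_(bytes) {}

    constexpr bool is_nil() const noexcept {
        for (std::size_t i = 0; i < 16; ++i)
            if (data_[i] != 0) return false;
        return true;
    }

    // Hidden friends: non-member functions declared inside the class.
    friend constexpr bool operator==(const uuid_sketch& a, const uuid_sketch& b) noexcept {
        for (std::size_t i = 0; i < 16; ++i)
            if (a.data_[i] != b.data_[i]) return false;
        return true;
    }
    friend constexpr bool operator!=(const uuid_sketch& a, const uuid_sketch& b) noexcept {
        return !(a == b);
    }

private:
    std::array<std::uint8_t, 16> data_{};
};

// Both objects and both checks are evaluated at compile time.
constexpr uuid_sketch nil_id{};
constexpr uuid_sketch some_id{std::array<std::uint8_t, 16>{1, 2, 3}};
static_assert(nil_id.is_nil());
static_assert(nil_id != some_id);
```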

Is there a reason why there are both member and non-member functions for operator== and operator<?

Does it really make sense to provide a free function as_bytes? I don't think we want that, at least I don't :)

I think it makes sense to use operator<=> instead of operator== and operator!= (as an additional benefit, you can then use uuid as a non-type template parameter).

I'm not an LWG person, but I think that everywhere in the standard library using type = ... is used instead of typedef.

Nicol Bolas

Jun 27, 2018, 11:13:51 AM

to ISO C++ Standard - Future Proposals

On Wednesday, June 27, 2018 at 10:56:38 AM UTC-4, Nicolas Lesser wrote:

> Does it really make sense to provide a free function as_bytes? I don't think we want that, at least I don't :)

Yes. `std::as_bytes` is a free function from `span` that takes any `span` and returns a `span` of bytes over its sequence. So it makes sense that if a UUID is considered span-like, then it should be convertible with a similar interface.

Now granted, I'm not sure it's a good idea, but that's more due to lifetime issues.

Nicolas Lesser

Jun 27, 2018, 11:26:21 AM

to std-pr...@isocpp.org

On Wed, Jun 27, 2018 at 5:13 PM Nicol Bolas <jmck...@gmail.com> wrote:

> On Wednesday, June 27, 2018 at 10:56:38 AM UTC-4, Nicolas Lesser wrote:
> > Does it really make sense to provide a free function as_bytes? I don't think we want that, at least I don't :)
>
> Yes. `std::as_bytes` is a free function from `span` that takes any `span` and returns a `span` of bytes over its sequence. So it makes sense that if a UUID is considered span-like, then it should be convertible with a similar interface.

Didn't know about this :) Thanks for mentioning it.

> Now granted, I'm not sure it's a good idea, but that's more due to lifetime issues.

I agree.

Tony V E

Jun 27, 2018, 1:25:39 PM

to Standard Proposals

I'm not a big fan of std::as_bytes.

Particularly the non-const version. Basically asking for UB.

I've been considering writing a paper...

Tony V E

Jun 27, 2018, 1:44:26 PM

to Standard Proposals

On Wed, Jun 27, 2018 at 9:25 AM, Marius Bancila <marius....@gmail.com> wrote:

> Thank you for all the feedback here. I was a bit busy but finally made some time to look into all the recent discussions here and summarize the things that should be done in order to move further.
>
> Here is a list of changes I have done based on the feedback:
> Removed string constructors and replaced with free overloaded function from_string().

from_string needs to be a static function inside of uuid. ie

uuid id = uuid::from_string("00....");

We can't take the name "from_string" for all of std and use it only for uuid. If anything, some day there may be a templatized from_string

int x = from_string<int>("00...");

uuid id = from_string<uuid>("00....");

But uuid::from_string is easier, as it doesn't open up a big discussion about a generalized from_string<>
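
A static `from_string` along these lines could be sketched as below. This is not the proposal's implementation: `uuid_sketch` is a hypothetical class, `uuid_error` is modelled on the exception name mentioned in the thread, and the parser accepts only the canonical 36-character form with optional surrounding braces.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string_view>

// Hypothetical exception type, named after the uuid_error in the thread.
struct uuid_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

class uuid_sketch {
public:
    // Static member, so the name from_string is not claimed for all of std.
    static uuid_sketch from_string(std::string_view s) {
        // Optional braces: "{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}".
        if (s.size() == 38 && s.front() == '{' && s.back() == '}')
            s = s.substr(1, 36);
        if (s.size() != 36) throw uuid_error("bad uuid length");

        uuid_sketch id;
        std::size_t out = 0;
        for (std::size_t i = 0; i < 36;) {
            if (i == 8 || i == 13 || i == 18 || i == 23) {
                if (s[i] != '-') throw uuid_error("missing '-'");
                ++i;
                continue;
            }
            // Two hex digits form one byte, most significant nibble first.
            id.data_[out++] =
                static_cast<std::uint8_t>(hex(s[i]) * 16 + hex(s[i + 1]));
            i += 2;
        }
        return id;
    }

    std::uint8_t operator[](std::size_t i) const { return data_[i]; }

private:
    static int hex(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        throw uuid_error("invalid hex digit");
    }

    std::array<std::uint8_t, 16> data_{};
};
```

With this shape, `uuid_sketch::from_string("...")` reads like the `uuid::from_string` call suggested above, and a failed parse surfaces as an exception rather than a nil value.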

> Parsing strings to uuid throws exception uuid_error instead of creating a nil uuid when the operation fails.
> {} included in the supported format for parsing, i.e. "{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}".
> Removed state_size.
> Renamed member function nil() to is_nil().
> The default constructor is defaulted.
> Added a conversion constructor from std::span<std::byte, 16>.
> Added the free function as_bytes() to convert the uuid into a view of its underlying bytes.

why allow the non-const access version of as_bytes?

> Constructing a uuid from a range with a size other than 16 is undefined behaviour.
> Removed mutable iterators (but preserved the constant iterators).
> Removed typedefs and other container-like parts.
>
> Defined the correlation between the internal UUID bytes and the string representation.

Is this the same as RFC 4122?

> Added UUID layout and byte order specification from the RFC 4122 document.
> To be honest, I am unsure whether the member function size() should stay or not. As we don't want to make the uuid a container-like type perhaps this should be removed too.

We need to decide to have either as_bytes, or a container-like interface, not both.

We go with container-like if we think that implementations will use a non-byte representation (ie SIMD, etc) internally.

Exposing as_bytes forces a byte order representation. Fine, but then container-like is unnecessary.

Maybe you should include a discussion of this in the paper, along with pros/cons of non-byte-order representations.

--
Be seeing you,
Tony

Marius Bancila

Jun 29, 2018, 3:33:18 AM

to std-pr...@isocpp.org

> You seem to have a lot of functions/constructors that need constexpr :)
> - uuid(std::span) -> std::span is constexpr enabled, so I don't see a reason for this not to be constexpr
> - uuid(FwdIt, FwdIt) -> this can also be constexpr
> - swaps are now constexpr if possible (P0879), which in your case is okay
> - begin and end can also be constexpr, there isn't a reason not to
> - as_bytes can also be constexpr as std::span is constexpr enabled
> - operator==, operator!= and operator< can also be constexpr

Right, that's correct. I worked on the paper and the implementation at the same time. Some of these functions use copy or swap, which are indeed constexpr in C++20, but no compiler supports that yet. But sure, they should be specified as constexpr, and I have now done so in the paper.

> Is there a reason why there are a member and non member functions for operator== and operator<?

They are all non-member.

> Does it really make sense to provide a free function as_bytes? I don't think we want that, at least I don't :)

That was a mistake on my part. I first made it a free function and updated the paper; then, while making other changes, I realized it should be a member function and changed it, but forgot to update the proposal accordingly. That is fixed now.

> I think it makes sense to use operator<=> instead of operator== and operator!= (as an additional benefit, you can then use uuid as a non-type template parameter).

My initial idea was to only be able to compare uuid for equality. operator < is necessary to be able to use uuids with associative containers. It does not really make sense to compare whether a uuid is less than another for any other reasons. Should we want to have all these comparison operators, then yes, it makes more sense to use the <=> operator.

constexpr auto operator <=>(uuid const & lhs, uuid const & rhs) noexcept;

> I'm not an LWG person, but I think that everywhere in the standard library using type = ... is used instead of typedef.

You're probably right on that. I replaced the typedefs with using statements.

> from_string needs to be a static function inside of uuid. ie
> uuid id = uuid::from_string("00....");
> We can't take the name "from_string" for all of std and use it only for uuid. If anything, some day there may be a templatized from_string
> int x = from_string<int>("00...");
> uuid id = from_string<uuid>("00....");
> But uuid::from_string is easier, as it doesn't open up a big discussion about a generalized from_string<>

Yes, you are definitely right about that. That was kinda stupid of me to do.

> why allow the non-const access version of as_bytes?

Indeed, that doesn't make too much sense either, so I removed it and only kept the const version.

> Is this the same as RFC 4122?

Yes, RFC 4122 specifies that (and I put that in the proposal too) "The fields are encoded as 16 octets, with the sizes and order of the fields defined above, and with each field encoded with the Most Significant Byte first (known as network byte order)."

Marius

Nicolas Lesser

Jun 29, 2018, 3:43:17 AM

to std-pr...@isocpp.org

> > Is there a reason why there are a member and non member functions for operator== and operator<?
>
> They are all non-member.

Whoops, I didn't realize they were friends.

> > I think it makes sense to use operator<=> instead of operator== and operator!= (as an additional benefit, you can then use uuid as a non-type template parameter).
>
> My initial idea was to only be able to compare uuid for equality. operator < is necessary to be able to use uuids with associative containers. It does not really make sense to compare whether a uuid is less than another for any other reasons. Should we want to have all these comparison operators, then yes, it makes more sense to use the <=> operator.
>
> constexpr auto operator <=>(uuid const & lhs, uuid const & rhs) noexcept;

Just because we don't want the other operators doesn't mean we can't use the spaceship operator:

constexpr std::strong_equality operator<=>(uuid const&, uuid const&) noexcept = default;

The operator< can always be provided as an extra overload.

Tony V E

Jul 3, 2018, 12:21:10 PM

to Standard Proposals

If we only want map/set interop, then define a std::less<uuid>, not operator<.

Alternatively, recognize that unordered_map/set are associative containers that don't require an ordering.

Arthur O'Dwyer

Jul 3, 2018, 12:48:30 PM

to ISO C++ Standard - Future Proposals

Please don't do this.

You should never explicitly specialize std::less<T> for any type, ever. The primary reason is that it breaks all the user's expectations. The secondary reason is that it won't always work! Heterogeneous containers use std::less<>, not std::less<T>.
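
The "won't always work" point can be checked at compile time. In the sketch below, `opaque_id` is a hypothetical type with no operator<, and the std::less specialization is written out only to demonstrate the pitfall. The transparent comparator `std::less<>` forwards to operator< directly, so it never sees the specialization.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <functional>
#include <type_traits>

// Hypothetical id type with no operator< of its own.
struct opaque_id {
    std::array<std::uint8_t, 16> bytes{};
};

// The approach being argued against: an explicit specialization of std::less.
namespace std {
template <>
struct less<opaque_id> {
    bool operator()(const opaque_id& a, const opaque_id& b) const {
        return a.bytes < b.bytes;
    }
};
}  // namespace std

// std::less<opaque_id> is callable, but std::less<> is not, because it
// requires the expression `a < b` to be valid for the operands themselves.
constexpr bool usable_with_less_T =
    std::is_invocable_v<std::less<opaque_id>, const opaque_id&, const opaque_id&>;
constexpr bool usable_with_transparent_less =
    std::is_invocable_v<std::less<>, const opaque_id&, const opaque_id&>;

static_assert(usable_with_less_T);
static_assert(!usable_with_transparent_less);
```

So a `std::set<opaque_id>` would compile, while a `std::set<opaque_id, std::less<>>` would not.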

If your type has operator<, then it should have the full complement of < <= == >= > != (which in C++2a means "it should have operator<=>").

Orthogonally to that consideration, std::less<T>'s whole purpose is to expose the semantics of everybody's operator< in a consistent and functorish way.

Your proposal has nothing to do with std::less, and therefore should not touch it.

If you really hate comparison operators for some reason, then the STL-favored thing to do is add a free function named foo_less with the semantics you want for your comparator, and force people to type out

using uuidset_t = std::set<std::uuid, std::foo_less>;

Prior art: https://en.cppreference.com/w/cpp/memory/owner_less

However, we can tell that this is probably a bad idea in your case by considering what is the appropriate name for "foo_less" in this case. It's comparing uuids, and it's comparing them on the only thing it makes sense to compare uuids on — their values — so the logical name for the free function is "uuid_less".

using uuidset_t = std::set<std::uuid, std::uuid_less>;

But C++ has a better way to spell "uuid_less(const uuid&, const uuid&)"! That spelling is very old-school-C. The C++-ish spelling for "uuid_less" is...

bool operator<(const uuid&, const uuid&);

This conveys the same information — "compares uuids, for less-than" — but it conveys it in a shorter and simpler way that also composes (semi)nicely with generic containers such as std::set and std::map.
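
A minimal sketch of that composition follows. It assumes a hypothetical `uuid_sketch` type and a bytewise lexicographic ordering, which is one possible choice and not something the thread has settled on.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical stand-in for the proposed type.
struct uuid_sketch {
    std::array<std::uint8_t, 16> bytes{};
};

// Assumption: order by lexicographic comparison of the 16 bytes.
inline bool operator<(const uuid_sketch& a, const uuid_sketch& b) {
    return a.bytes < b.bytes;
}

// No comparator argument is needed; std::set uses operator< by default.
using uuidset_t = std::set<uuid_sketch>;
```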

> Alternatively, recognize that unordered_map/set are associative containers that don't require an ordering.

I personally wouldn't mind if std::uuid, like std::byte, didn't have any "less-than" operation at all, and if you wanted to compare them (binary-search an array of them, whatever) you had to cast them to string or something. But that's because I wouldn't use std::uuid much. If I did use it, I'd probably notice the lack and complain about it.

(And I say this as someone who has often railed against the dangers of tuple::operator< and optional::operator<. Consistency is, etc.)

–Arthur

Marius Bancila

Jul 5, 2018, 1:37:55 AM

to std-pr...@isocpp.org

I don't think specializing std::less for uuid is a good idea. I prefer having operator <, which, as said earlier, should actually be operator <=>.

Richard Hodges

Jul 5, 2018, 6:33:41 AM

to std-pr...@isocpp.org

The discussion on operator< is interesting.

I had a look at the wikipedia page for UUID and note that there is a specific warning about using UUIDs as primary keys in ordered indexes.

This begs the question: when would ordering be required for a UUID?

Equality and std::hash I can completely understand, but a UUID itself encodes no information on relative order. Its salient property is merely universal uniqueness.

I would argue that the only operators that should be defined are:

binary operator== - test for equality

binary operator!= - test for inequality

plus a specialisation of std::hash
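
The equality-plus-hash interface listed above is enough for the unordered containers, as this sketch suggests. `uuid_sketch` is a hypothetical type, and FNV-1a is used only as an illustrative hash, not as anything the proposal specifies.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>

// Hypothetical stand-in for the proposed type.
struct uuid_sketch {
    std::array<std::uint8_t, 16> bytes{};
};

inline bool operator==(const uuid_sketch& a, const uuid_sketch& b) {
    return a.bytes == b.bytes;
}
inline bool operator!=(const uuid_sketch& a, const uuid_sketch& b) {
    return !(a == b);
}

namespace std {
template <>
struct hash<uuid_sketch> {
    size_t operator()(const uuid_sketch& id) const noexcept {
        // 64-bit FNV-1a over the 16 bytes; an illustrative choice only.
        std::uint64_t h = 14695981039346656037ull;
        for (std::uint8_t b : id.bytes) {
            h ^= b;
            h *= 1099511628211ull;
        }
        return static_cast<size_t>(h);
    }
};
}  // namespace std
```

No ordering is involved: `std::unordered_set<uuid_sketch>` and `std::unordered_map` need only these two operations.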

very much more arguably (i.e. I would expect more argument over these):

operator bool - returns true iff the uuid is not null

unary ! - returns true if the uuid is a null uuid

In my view the following operators make no sense:

operator <

operator << (other than to stream to basic_ostream<>)

operator >

operator >> (other than to deserialise from basic_istream<>)

operator +, -, /, *,

R

Gašper Ažman

Jul 5, 2018, 6:39:21 AM

to std-pr...@isocpp.org

An ordering of UUIDs is very much important for implementing unique() and flatsets.

I would argue against inclusion of any type that can meet Stepanov Regular (Regular + TotallyOrdered) but does not.

G

j c

Jul 5, 2018, 8:24:36 AM

to std-pr...@isocpp.org

Yes, most of the online advertising industry uses UUIDs extensively (serialisation and lookups)

They really need to be as usable as normal types.

I don't think an anecdote on Wikipedia is sufficient reason to not include these operators.

Richard Hodges

Jul 5, 2018, 10:52:30 AM

to std-pr...@isocpp.org

On Thu, 5 Jul 2018 at 12:39, Gašper Ažman <gasper...@gmail.com> wrote:

> An ordering of UUIDs is very much important for implementing unique() and flatsets.

I understand. There is nothing to prevent a library user from defining their own meaning of "UUID order" through a custom Compare function, which is free to interpret the UUID's data any way it wishes.

> I would argue against inclusion of any type that can meet Stepanov Regular (Regular + TotallyOrdered) but does not.

What I am saying is that there is no standardised concept of "order" when it comes to UUIDs. It's simply not a universally valid concept.

Therefore to build one into a library seems like an error to me.

G

R

Thiago Macieira

Jul 5, 2018, 11:09:59 AM

to std-pr...@isocpp.org

On Thursday, 5 July 2018 07:52:16 PDT Richard Hodges wrote:
> Therefore to build one into a library seems like an error to me.

And yet placing such a type in containers that require ordering operations is
a useful use-case too. Sorting is an internal detail of that container, not a
specific goal (though of course people may come to use it).

How do we solve this?

Also, please note that UUID would have multiple sorting possibilities. For
example, which of these two sort first?

00000001-0000-0000-0000-000000000000
01000000-0000-0000-0000-000000000000

Many implementations treat UUIDs as a sequence of components, the first of
which is a 32-bit number.

Ditto for

01000000-0000-0000-0000-000000000000
00000000-0000-0000-0000-000000000001

Another valid implementation would be a plain 16-byte compare (SSE2
instruction PCMPGT), which due to x86's little-endianness placing the MSB in
the last byte would make the second one be sorted first.

Whatever solution we come up with, it should be made clear that the order is
not specified and thus cannot be relied upon to be cross-platform.
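
To make the ambiguity concrete, the sketch below compares the first pair of values under two plausible orderings and shows that they disagree. Both function names are hypothetical. The second ordering decodes the 16 bytes as one little-endian integer explicitly, so the result does not depend on the machine running it; it stands in for "some implementation that loads the bytes natively", not for any specific instruction.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using bytes16 = std::array<std::uint8_t, 16>;

// Ordering 1: lexicographic compare of the bytes as they appear in the text.
inline bool less_bytewise(const bytes16& a, const bytes16& b) { return a < b; }

// Ordering 2: treat the 16 bytes as one little-endian integer, so the last
// byte is the most significant and is compared first.
inline bool less_as_little_endian(const bytes16& a, const bytes16& b) {
    for (std::size_t i = 16; i-- > 0;) {
        if (a[i] != b[i]) return a[i] < b[i];
    }
    return false;
}

// 00000001-0000-0000-0000-000000000000 and 01000000-0000-0000-0000-000000000000
constexpr bytes16 first{0x00, 0x00, 0x00, 0x01};
constexpr bytes16 second{0x01, 0x00, 0x00, 0x00};
```

Under the bytewise ordering `first` sorts before `second`; under the little-endian ordering the result is reversed.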

Richard Hodges

Jul 5, 2018, 11:53:33 AM

to std-pr...@isocpp.org

On Thu, 5 Jul 2018 at 17:09, Thiago Macieira <thi...@macieira.org> wrote:

> On Thursday, 5 July 2018 07:52:16 PDT Richard Hodges wrote:
> > Therefore to build one into a library seems like an error to me.
>
> And yet placing such a type in containers that require ordering operations is
> a useful use-case too. Sorting is an internal detail of that container, not a
> specific goal (though of course people may come to use it).

If the sort order cannot be specified, then the container is in any case conceptually unsorted, since different systems/compilers/library implementations will yield different iteration orders for the same source code. In which case, why would one not use an unordered_ container, which is already documented to have this behaviour?

As a code maintainer, if I see std::set<Foo>, then I will logically infer that Foo objects have a defined concept of ordering. i.e. that (foo1 < foo2) will yield the same result for any given foo1 and foo2, no matter what hardware my code is running on.

If I see std::set<std::uuid>, what am I to infer? That there is some implicit sense of order between UUIDs? There is not. The code will make no sense.

Again, if the user wants to sort by bytes, bits, integer representation or any other criteria, it would be trivial to write the Compare-conforming function object.

> How do we solve this?

Forcing the user to conceptualise and define for himself what he means by UUID order may hopefully prompt him to conclude that it is a bad idea.

> Also, please note that UUID would have multiple sorting possibilities. For
> example, which of these two sort first?
>
> 00000001-0000-0000-0000-000000000000
> 01000000-0000-0000-0000-000000000000
>
> Many implementations treat UUIDs as a sequence of components, the first of
> which is a 32-bit number.
>
> Ditto for
>
> 01000000-0000-0000-0000-000000000000
> 00000000-0000-0000-0000-000000000001
>
> Another valid implementation would be a plain 16-byte compare (SSE2
> instruction PCMPGT), which due to x86's little-endianness placing the MSB in
> the last byte would make the second one be sorted first.
>
> Whatever solution we come up with, it should be made clear that the order is
> not specified and thus cannot be relied upon to be cross-platform.
>
> --
> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
> Software Architect - Intel Open Source Technology Center

Nicol Bolas

Jul 5, 2018, 12:13:08 PM

to ISO C++ Standard - Future Proposals

On Thursday, July 5, 2018 at 11:09:59 AM UTC-4, Thiago Macieira wrote:

> On Thursday, 5 July 2018 07:52:16 PDT Richard Hodges wrote:
> > Therefore to build one into a library seems like an error to me.
>
> And yet placing such a type in containers that require ordering operations is
> a useful use-case too. Sorting is an internal detail of that container, not a
> specific goal (though of course people may come to use it).

I think that rather depends on the container in question. While you could say that `map`/`flat_map`'s primary purpose is to allow you to quickly access values through keys (with sorting being an implementation detail, despite it being the basis of the container's range interface), the primary purpose of `set`/`flat_set` is to be sorted. To be able to iterate through a list of items in a specific order.

If someone doesn't really care what the order is, and simply needs some order to be able to use certain containers, that's fine. A `std::less` specialization would accomplish that adequately. Or if we don't like `std::less`, we can have some other `uuid_order` functor that orders them in an arbitrary-yet-total way. So if you need a map of them, you could use `std::map<uuid, value, uuid_order>`.
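
A sketch of the opt-in comparator approach is shown below. `uuid_sketch` and the body of `uuid_order` are assumptions for illustration; the bytewise comparison is just one arbitrary total order.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the proposed type; note there is no operator<.
struct uuid_sketch {
    std::array<std::uint8_t, 16> bytes{};
};

// Hypothetical comparator: arbitrary but total, here a bytewise comparison.
struct uuid_order {
    bool operator()(const uuid_sketch& a, const uuid_sketch& b) const {
        return a.bytes < b.bytes;
    }
};

// The ordering has to be requested explicitly at the point of use.
using uuid_names = std::map<uuid_sketch, std::string, uuid_order>;
```

Compared with a built-in operator<, the extra template argument makes it visible at the call site that the order is a convenience and carries no meaning of its own.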

Putting such an order into the primary interface of UUID leads to a question: what is that order? What does it mean for a UUID to compare less than another?

There is no standard answer for that. So it makes sense that the UUID type reflect the fact that there is no standard ordering; to do otherwise makes us liars.

Yes, this makes it slightly more uncomfortable to use UUIDs in ordered containers. But so long as we provide a solution that is relatively easy to use, then we will have done our duty.

Jake Arkinstall

Jul 5, 2018, 1:24:13 PM

to std-pr...@isocpp.org

The pedant in me agrees that ordering of UUIDs is meaningless and that implementing it would be misleading and maybe even lend itself to abuse.

The lazy part of me thinks that searching for a UUID will be in enough demand to justify the inclusion of ordering. Although a UUID is a wasteful key to use in terms of binary size (by definition of a UUID - it's big enough to prevent any collisions on a vast scale), it is still something that goes on a lot. It makes no sense for a user to have to implement this ordering themselves, so along this line of thought a uuid_order is justified.

Or is it?

Including a uuid_order operation is also misleading for the same reasons that an in-built ordering is, and providing it would just mean an unnecessary step for something that we would be implicitly supporting by including uuid_order in the first place.

One could argue that lexicographical ordering is not semantically sensical (a string built up of arbitrary symbols that don't have much meaning on their own, but still have a defined ordering by consensus just so we can find stuff easier), but it's still used so heavily in the real world that its functionality is built in. We include it for booleans (equally nonsensical). And I guarantee that the vast majority of overloads of operator< are for convenience rather than any truly logical meaning.

So the lazy side of me is winning over the pedantic side. I definitely agree that ordering of UUIDs is nonsensical, but what it lacks in sense it more than makes up for in real world use.

Arthur O'Dwyer

Jul 5, 2018, 2:00:36 PM

to ISO C++ Standard - Future Proposals

On Thu, Jul 5, 2018 at 9:13 AM, Nicol Bolas <jmck...@gmail.com> wrote:

On Thursday, July 5, 2018 at 11:09:59 AM UTC-4, Thiago Macieira wrote:
On Thursday, 5 July 2018 07:52:16 PDT Richard Hodges wrote:
> Therefore to build one into a library seems like an error to me.

And yet placing such a type in containers that require ordering operations is
a useful use-case too. Sorting is an internal detail of that container, not a
specific goal (though of course people may come to use it).

I think that rather depends on the container in question. While you could say that `map`/`flat_map`'s primary purpose is to allow you to quickly access values through keys (with sorting being an implementation detail, despite it being the basis of the container's range interface), the primary purpose of `set`/`flat_set` is to be sorted. To be able to iterate through a list of items in a specific order.

"The primary purpose of a `set` is to be sorted" is right up there with "A `byte` is not necessarily eight bits" in the list of claims that only C++ people make (while all the other languages laugh at them for it).

A set is simply a set — a collection of unique elements.

This is how C++ is able to have two varieties of it: one that uses operator< to ensure element uniqueness, and one that uses std::hash and operator== to ensure element uniqueness.

We probably all agree that std::set should originally have been named something like "std::ordered_set", at which point we could say that what distinguishes an "ordered_set" from an "unordered_set" is precisely that the former is ordered and the latter is unordered.

But that's not the world we live in. We live in this ugly world, where C++ has a data type named "std::set", representing a "set", that just happens to use operator< as an implementation detail. If your type does not provide operator< — or provides it with non-totally-ordering semantics (*cough* float *cough*) — then you are not allowed to create a "set" of those objects. And that's a horrible restriction to put on normal programmers. So we (library designers) work around it by adding operator< promiscuously for any type where a total ordering is conceivable.

The "correct" thing to have done, originally, in the original STL, would have been

template<class T> struct less { ... }; // as usual; not a customization point

template<class T> struct total_order : less<T> {}; // customization point, like std::hash

template<class T, class Cmp = total_order<T>> class ordered_set { ... };

This would have let us do the following desirable things:

- std::ordered_set<std::uuid> without providing uuid::operator<

- ability to store NaN in std::ordered_set<float> without undefined behavior

- and of course a pedagogical distinction between "ordered_set" and "unordered_set"

- and of course a pedagogical distinction between "operator<" and "the property of being totally ordered"
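
As a rough sketch of how the three declarations above might fit together: everything below lives in a made-up `sketch` namespace, `ordered_set` is just an alias for `std::set` with a different default comparator, and the `uuid` struct is a hypothetical stand-in. The `float` specialization assumes IEEE 754 binary32 and orders values by a transformed bit pattern, so NaN gets a definite position (and -0.0f and +0.0f become distinct keys). None of this is an existing standard facility.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <limits>
#include <set>

namespace sketch {

// Customization point: falls back to operator< through std::less<T>.
template <class T>
struct total_order : std::less<T> {};

// Here simply std::set with total_order as the default comparator.
template <class T, class Cmp = total_order<T>>
using ordered_set = std::set<T, Cmp>;

// Hypothetical uuid stand-in with no operator< at all.
struct uuid {
    std::array<std::uint8_t, 16> bytes{};
};

// Opt uuid into ordered containers without giving it operator<.
template <>
struct total_order<uuid> {
    bool operator()(const uuid& a, const uuid& b) const { return a.bytes < b.bytes; }
};

// Assumes IEEE 754 binary32: map each float to an unsigned key whose
// integer order is a strict total order, NaN included.
template <>
struct total_order<float> {
    static std::uint32_t key(float f) {
        static_assert(sizeof(float) == sizeof(std::uint32_t), "binary32 assumed");
        std::uint32_t u;
        std::memcpy(&u, &f, sizeof u);
        // Negative values: flip all bits. Non-negative: set the sign bit.
        return (u & 0x80000000u) ? ~u : (u | 0x80000000u);
    }
    bool operator()(float a, float b) const { return key(a) < key(b); }
};

// Example: inserting NaN twice is well defined and stores it once.
inline ordered_set<float> example_float_set() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    ordered_set<float> s;
    s.insert(1.0f);
    s.insert(nan);
    s.insert(-2.0f);
    s.insert(nan);
    return s;
}

}  // namespace sketch
```

Types that already have a sensible `operator<`, such as `int`, need no specialization: `sketch::ordered_set<int>` picks up `std::less<int>` through the primary template.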

Shower thought: It would be nice if someone started a libc++ fork for "STL v2" with all these nice ABI breaks in it. (Also vector<bool>. Also is_relocatable<T>. Also tombstone optionals. Also fully regular tuple<>, and change tie() to return a tuple_ref<>.) Maybe it could gain enough traction that people start using it for real.

Post-shower thought: Ehh, I'll do it tomorrow. :P

–Arthur

Dejan Milosavljevic

Jul 5, 2018, 2:10:00 PM

to std-pr...@isocpp.org

To me, a UUID appears to be just a random string, with no CRC or anything like it.

See the small competition at: https://stackoverflow.com/questions/440133/how-do-i-create-a-random-alpha-numeric-string-in-c

Jake Arkinstall

Jul 5, 2018, 3:13:03 PM

to std-pr...@isocpp.org

On Thu, 5 Jul 2018, 19:09 Dejan Milosavljevic, <dmi...@gmail.com> wrote:

To me, a UUID appears to be just a random string.

It is. It holds no meaning, but it is unique, which means there will be some people using it in a searchable manner (it is an identifier, after all).

For this reason and this reason alone, ordering makes sense. And if some ordering is provided by stdlib, I see no reason why users should jump through hoops to use it (e.g. through explicit use of a uuid_order).

Richard Hodges

Jul 5, 2018, 4:02:49 PM

to std-pr...@isocpp.org

On Thu, 5 Jul 2018 at 21:13, Jake Arkinstall <jake.ar...@gmail.com> wrote:

On Thu, 5 Jul 2018, 19:09 Dejan Milosavljevic, <dmi...@gmail.com> wrote:
To me, a UUID appears to be just a random string.

It is. It holds no meaning, but it is unique, which means there will be some people using it in a searchable manner (it is an identifier, after all).

Use a hash_map. It's O(1) lookup which easily outperforms the O(logN) lookup of an ordered map. The only reason we'd use an ordered map would be if order mattered. With a UUID it (by design) does not.

For this reason and this reason alone, ordering makes sense. And if some ordering is provided by stdlib, I see no reason why users should jump through hoops to use it (e.g. through explicit use of a uuid_order).

The presence of a hash map in the std library obviates any need to use any kind of ordered container with UUIDs. There is no logical basis for a greater/less-than relationship between two UUIDs.

People storing data temporally will have an ordered index on monotonic timestamps or sequence numbers (which are obviously ordered). People using UUIDs for identity lookups will always want a hash-based index for the purpose.
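
For illustration, a hash-based identity lookup might look roughly like this. The `uuid` struct is a hypothetical stand-in with equality only, and the `std::hash` specialization (FNV-1a over the 16 bytes) is an assumption made for the example, not something taken from the proposal text; `make_example_index` is likewise a made-up helper.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in: equality only, no ordering of any kind.
struct uuid {
    std::array<std::uint8_t, 16> bytes{};
    friend bool operator==(const uuid& a, const uuid& b) { return a.bytes == b.bytes; }
};

namespace std {
// Assumed hash for the example: 64-bit FNV-1a over the raw bytes
// (truncated if size_t is narrower than 64 bits).
template <>
struct hash<uuid> {
    size_t operator()(const uuid& u) const noexcept {
        std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
        for (std::uint8_t b : u.bytes) {
            h ^= b;
            h *= 1099511628211ull;                  // FNV prime
        }
        return static_cast<size_t>(h);
    }
};
}  // namespace std

// Identity lookup needs only hashing and operator==.
inline std::unordered_map<uuid, std::string> make_example_index() {
    std::unordered_map<uuid, std::string> index;
    uuid k;
    k.bytes[15] = 7;
    index[k] = "seven";
    return index;
}
```

Lookups here are average-case O(1); the worst case for `std::unordered_map` remains linear in the number of elements.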

inkwizyt...@gmail.com

Jul 5, 2018, 4:55:02 PM

to ISO C++ Standard - Future Proposals

On Thursday, July 5, 2018 at 7:24:13 PM UTC+2, Jake Arkinstall wrote:

The pedant in me agrees that ordering of UUIDs is meaningless and that implementing it would be misleading and maybe even lend itself to abuse.

The lazy part of me thinks that searching for a UUID will be in enough demand to justify the inclusion of ordering. Although a UUID is a wasteful key to use in terms of binary size (by definition of a UUID - it's big enough to prevent any collisions on a vast scale), it is still something that goes on a lot. It makes no sense for a user to have to implement this ordering themselves, so along this line of thought a uuid_order is justified.

Or is it?

Including a uuid_order operation is also misleading for the same reasons that an in-built ordering is, and providing it would just mean an unnecessary step for something that we would be implicitly supporting by including uuid_order in the first place.

But what if it were named `uuid_lexical_order` or `uuid_byte_order`? The standard could provide both, or even more. Users would choose the version that best fits their goals, or roll their own.

Richard Hodges

Jul 5, 2018, 5:06:11 PM

to std-pr...@isocpp.org

On Thu, 5 Jul 2018 at 22:55, <inkwizyt...@gmail.com> wrote:

On Thursday, July 5, 2018 at 7:24:13 PM UTC+2, Jake Arkinstall wrote:
The pedant in me agrees that ordering of UUIDs is meaningless and that implementing it would be misleading and maybe even lend itself to abuse.

The lazy part of me thinks that searching for a UUID will be in enough demand to justify the inclusion of ordering. Although a UUID is a wasteful key to use in terms of binary size (by definition of a UUID - it's big enough to prevent any collisions on a vast scale), it is still something that goes on a lot. It makes no sense for a user to have to implement this ordering themselves, so along this line of thought a uuid_order is justified.

Or is it?

Including a uuid_order operation is also misleading for the same reasons that an in-built ordering is, and providing it would just mean an unnecessary step for something that we would be implicitly supporting by including uuid_order in the first place.

But what if it were named `uuid_lexical_order` or `uuid_byte_order`? The standard could provide both, or even more. Users would choose the version that best fits their goals, or roll their own.

What would be the motivating use case for such a utility?

Without a compelling one, this would just represent more code to maintain without reason.

One could argue that lexicographical ordering is not semantically sensical (a string built up of arbitrary symbols that don't have much meaning on their own, but still have a defined ordering by consensus just so we can find stuff easier), but it's still used so heavily in the real world that its functionality is built in. We include it for booleans (equally nonsensical). And I guarantee that the vast majority of overloads of operator< are for convenience rather than any truly logical meaning.

So the lazy side of me is winning over the pedantic side. I definitely agree that ordering of UUIDs is nonsensical, but what it lacks in sense it more than makes up for in real world use.

Tony V E

Jul 5, 2018, 5:06:48 PM

to Standard Proposals

On Tue, Jul 3, 2018 at 12:48 PM, Arthur O'Dwyer <arthur....@gmail.com> wrote:

On Tue, Jul 3, 2018 at 9:21 AM, Tony V E <tvan...@gmail.com> wrote:
On Fri, Jun 29, 2018 at 3:43 AM, Nicolas Lesser <blitz...@gmail.com> wrote:
Is there a reason why there are a member and non member functions for operator== and operator<?

They are all non-member.

Whoops, didn't realize they were friends.

I think it makes sense to use operator<=> instead of operator== and operator!= (as an additional benefit, you can then use uuid as a non-type template parameter).

My initial idea was to only be able to compare uuid for equality. operator < is necessary to be able to use uuids with associative containers. It does not really make sense to compare whether a uuid is less than another for any other reasons. Should we want to have all these comparison operators, then yes, it makes more sense to use the <=> operator.

constexpr auto operator <=>(uuid const & lhs, uuid const & rhs) noexcept;

Just because we don't want the other operators doesn't mean we can't use the spaceship operator:

constexpr std::strong_equality operator<=>(uuid const&, uuid const&) noexcept = default;

The operator< can always be provided as an extra overload.

If we only want map/set interop, then define a std::less<uuid>, not operator<.

Please don't do this.
You should never explicitly specialize std::less<T> for any type, ever. The primary reason is that it breaks all the user's expectations. The secondary reason is that it won't always work! Heterogeneous containers use std::less<>, not std::less<T>.

I mostly agree with you. I'm all for "take back std::less". But we already do this for std::complex, for example.

But I'd love a better answer - like fix map and set to call std::order when std::less doesn't work. Or something like that.
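
The point about heterogeneous containers can be checked with a small sketch. `widget_id`, `can_compare`, and `distinct_ids` are made-up names for this example. The type has no `operator<`, only an explicit `std::less<widget_id>` specialization; the detection trait then shows that `std::less<>` (which forwards to `operator<`) cannot compare it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <type_traits>
#include <utility>

// Hypothetical type with no operator<.
struct widget_id {
    int value;
};

namespace std {
// The approach being argued against: specializing std::less<T>.
template <>
struct less<widget_id> {
    bool operator()(const widget_id& a, const widget_id& b) const {
        return a.value < b.value;
    }
};
}  // namespace std

// Detects whether Cmp can be called with two const T& arguments.
template <class Cmp, class T, class = void>
struct can_compare : std::false_type {};

template <class Cmp, class T>
struct can_compare<Cmp, T,
                   decltype(void(std::declval<const Cmp&>()(
                       std::declval<const T&>(), std::declval<const T&>())))>
    : std::true_type {};

// std::less<widget_id> picks up the specialization...
static_assert(can_compare<std::less<widget_id>, widget_id>::value,
              "std::less<widget_id> is callable");
// ...but the transparent std::less<> uses operator<, which does not exist.
static_assert(!can_compare<std::less<>, widget_id>::value,
              "std::less<> is not callable for widget_id");

// std::set<widget_id> defaults to std::less<widget_id>, so this compiles.
inline std::size_t distinct_ids() {
    std::set<widget_id> s;
    s.insert(widget_id{1});
    s.insert(widget_id{2});
    s.insert(widget_id{1});
    return s.size();
}
```

By contrast, inserting into a `std::set<widget_id, std::less<>>` would fail to compile, which is the "won't always work" case.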

If your type has operator<, then it should have the full complement of < <= == >= > != (which in C++2a means "it should have operator<=>").
Orthogonally to that consideration, std::less<T>'s whole purpose is to expose the semantics of everybody's operator< in a consistent and functorish way.
Your proposal has nothing to do with std::less, and therefore should not touch it.

If you really hate comparison operators for some reason, then the STL-favored thing to do is add a free function named foo_less with the semantics you want for your comparator, and force people to type out
using uuidset_t = std::set<std::uuid, std::foo_less>;
Prior art: https://en.cppreference.com/w/cpp/memory/owner_less
However, we can tell that this is probably a bad idea in your case by considering what is the appropriate name for "foo_less" in this case. It's comparing uuids, and it's comparing them on the only thing it makes sense to compare uuids on — their values — so the logical name for the free function is "uuid_less".
using uuidset_t = std::set<std::uuid, std::uuid_less>;
But C++ has a better way to spell "uuid_less(const uuid&, const uuid&)"! That spelling is very old-school-C. The C++-ish spelling for "uuid_less" is...
bool operator<(const uuid&, const uuid&);
This conveys the same information — "compares uuids, for less-than" — but it conveys it in a shorter and simpler way that also composes (semi)nicely with generic containers such as std::set and std::map.

Alternatively, recognize that unordered_map/set are associative containers that don't require an ordering.

I personally wouldn't mind if std::uuid, like std::byte, didn't have any "less-than" operation at all, and if you wanted to compare them (binary-search an array of them, whatever) you had to cast them to string or something. But that's because I wouldn't use std::uuid much. If I did use it, I'd probably notice the lack and complain about it.

(And I say this as someone who has often railed against the dangers of tuple::operator< and optional::operator<. Consistency is, etc.)

–Arthur

inkwizyt...@gmail.com

Jul 5, 2018, 6:46:32 PM

to ISO C++ Standard - Future Proposals

On Thursday, July 5, 2018 at 11:06:11 PM UTC+2, Richard Hodges wrote:

On Thu, 5 Jul 2018 at 22:55, <inkwizyt...@gmail.com> wrote:

On Thursday, July 5, 2018 at 7:24:13 PM UTC+2, Jake Arkinstall wrote:
The pedant in me agrees that ordering of UUIDs is meaningless and that implementing it would be misleading and maybe even lend itself to abuse.

The lazy part of me thinks that searching for a UUID will be in enough demand to justify the inclusion of ordering. Although a UUID is a wasteful key to use in terms of binary size (by definition of a UUID - it's big enough to prevent any collisions on a vast scale), it is still something that goes on a lot. It makes no sense for a user to have to implement this ordering themselves, so along this line of thought a uuid_order is justified.

Or is it?

Including a uuid_order operation is also misleading for the same reasons that an in-built ordering is, and providing it would just mean an unnecessary step for something that we would be implicitly supporting by including uuid_order in the first place.

But what if it were named `uuid_lexical_order` or `uuid_byte_order`? The standard could provide both, or even more. Users would choose the version that best fits their goals, or roll their own.

What would be the motivating use case for such a utility?

Without a compelling one, this would just represent more code to maintain without reason.

I was only referring to the name and the possibility of it being "misleading". If the name explicitly states the ordering mechanism, we do not imply that uuid has any natural order. Each ordering can have different tradeoffs, and users can choose whichever they need or prefer.