The mini-review of Andy Tompkins UUID library starts today, November 23rd
2008, and will end on November 30th.
I really hope to see your vote and your participation in the discussions on
the Boost mailing lists!
The library can be downloaded from the vault here (it's the file
uuid_v13.zip in the topmost directory): http://tinyurl.com/6xx28b
The initial review of the UUID library ended with a provisional acceptance
(read here: http://article.gmane.org/gmane.comp.lib.boost.user/27774/).
This mini review is meant to allow for a final decision after looking at the
changed parts of the library. Here is a list of things fixed and changed:
- fixed the licensing issues revealed by the initial review
- fixed the security vulnerability revealed by the initial review
- renamed to uuid, moved all classes, functions, etc. to namespace
boost::uuids
- implemented sha1 hash function (thus no license problem)
- new class basic_uuid_generator to create random-based uuids. The random
number generator is no longer hard coded. It can use any random number
generator, default is boost::mt19937
- implemented a good seed function for random number generators
- all functions are now reentrant, all classes are as thread safe as an int
and the library is no longer dependent on Boost.Thread
---------------------------------------------------
Please always state in your review, whether you think the library should be
accepted as a Boost library!
Additionally please consider giving feedback on the following general
topics:
- What is your evaluation of the design?
- What is your evaluation of the implementation?
- What is your evaluation of the documentation?
- What is your evaluation of the potential usefulness of the library?
- Did you try to use the library? With what compiler? Did you have any
problems?
- How much effort did you put into your evaluation? A glance? A quick
reading? In-depth study?
- Are you knowledgeable about the problem domain?
Regards Hartmut
Review Manager
_______________________________________________
Boost-users mailing list
Boost...@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users
This link didn't work for me. Is there a problem with it?
Dave Jenkins
This link didn't work for me. Is there a problem with it?
The library can be downloaded from the vault here (it's the file
uuid_v13.zip in the topmost directory): http://tinyurl.com/6xx28b
>
> Please always state in your review, whether you think the library should be
> accepted as a Boost library!
I vote for acceptance.
>
> Additionally please consider giving feedback on the following general
> topics:
>
> - What is your evaluation of the design?
The design I like and fits into my world of c++ programming.
> - What is your evaluation of the implementation?
Looks good. I like the programming style. I would recommend to clean
up the files, in particular the commented out headers. Also, the
file uuid_serialize.hpp contains only commented out code besides the
macro instantiation. Is that functionality working?
How about having a uuid_all.hpp header?
> - What is your evaluation of the documentation?
Good.
> - What is your evaluation of the potential usefulness of the library?
Huge. We are using UUID's a lot and this library seems to be a good
candidate for the next code refactoring.
> - Did you try to use the library? With what compiler? Did you have any
> problems?
Yes, I compiled the test program which Paul Bristow posted and it
worked without a glitch. I'm using Visual Studio 8.
> - How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?
A quick reading.
> - Are you knowledgeable about the problem domain?
I'm a user.
Thanks for another very useful lib.
Christian
i just did a quick compile/testsuite test for now, and i see three
trivial issues:
1. it doesn't compile with gcc-4.4, because of a missing include.
patch:
--- a/boost/uuid/seed_rng.hpp
+++ b/boost/uuid/seed_rng.hpp
@@ -26,6 +26,7 @@
#include <memory.h>^M
#include <ctime>^M
#include <cstdlib>^M
+#include <cstdio>^M
#include <boost/uuid/sha1.hpp>^M
//#include <boost/nondet_random.hpp> //forward declare boost::random_device^M
^M
2. the testsuite fails on x86_64 because it is using hardcoded 32-bit
hash values:
Running 1 test case...
libs/uuid/test/test_uuid.cpp(146): error in "test_main_caller( argc, argv )": check uuid_hasher(uuid()) == 3565488559U failed [17562157925649023279 != 3565488559]
libs/uuid/test/test_uuid.cpp(147): error in "test_main_caller( argc, argv )": check uuid_hasher(u1) == 4159045843U failed [6622376135548557651 != 4159045843]
libs/uuid/test/test_uuid.cpp(148): error in "test_main_caller( argc, argv )": check uuid_hasher(u2) == 2713274306U failed [4700797755868797762 != 2713274306]
*** 3 failures detected in test suite "Test Program"
3. the sources use cr+lf eol style, i guess correctly importing the
sources into subversion will fix this
maybe i find some time to do a full review later this week. anyway,
i have been using the library for quite some time and would like to
see it being part of boost. thus i vote for acceptance.
cheers, tim
--
t...@klingt.org
http://tim.klingt.org
The composer makes plans, music laughs.
Morton Feldman
< snip >
> > - What is your evaluation of the implementation?
>
> Looks good. I like the programming style. I would recommend to clean
> up the files, in particular the commented out headers. Also, the
> file uuid_serialize.hpp contains only commented out code besides the
> macro instantiation. Is that functionality working?
>
> How about having a uuid_all.hpp header?
If users want this, I have no problem creating a uuid_all.hpp header.
I'm also considering creating a uuid_fwd.hpp header.
< snip >
>
> Christian
>
Thanks,
Andy Tompkins
I'll fix this.
>
> 2. the testsuite fails on x86_64 because it is using hardcoded 32-bit
> hash values:
> Running 1 test case...
> libs/uuid/test/test_uuid.cpp(146): error in "test_main_caller( argc, argv
> )": check uuid_hasher(uuid()) == 3565488559U failed [17562157925649023279
> != 3565488559]
> libs/uuid/test/test_uuid.cpp(147): error in "test_main_caller( argc, argv
> )": check uuid_hasher(u1) == 4159045843U failed [6622376135548557651 !=
> 4159045843]
> libs/uuid/test/test_uuid.cpp(148): error in "test_main_caller( argc, argv
> )": check uuid_hasher(u2) == 2713274306U failed [4700797755868797762 !=
> 2713274306]
>
> *** 3 failures detected in test suite "Test Program"
Hmm, I assume that the value that uuid_hasher produces is correct since
I
mimicked the way Boost.Hash does it. So, I believe I just need to
adjust the
code in test_uuid.cpp in a 64 bit environment where sizeof(std::size_t)
!= 32?
>
>
> 3. the sources use cr+lf eol style, i guess correctly importing the
> sources into subversion will fix this
I believe so this to be true.
>
>
> maybe i find some time to do a full review later this week. anyway,
> i have been using the library for quite some time and would like to
> see it being part of boost. thus i vote for acceptance.
>
> cheers, tim
>
> --
> t...@klingt.org
> http://tim.klingt.org
>
> The composer makes plans, music laughs.
> Morton Feldman
>
Thanks,
Andy Tompkins
right ... i suppose the hardcoded value in test_uuid.cpp is only correct on
32 bit platforms ...
cheers, tim
--
t...@klingt.org
http://tim.klingt.org
The price an artist pays for doing what he wants is that he has to do
it.
William S. Burroughs
I like the design, but wish it had time-based UUIDs.
> - What is your evaluation of the implementation?
I had no problem reading the code. But there's almost nothing in
"uuid.ipp". Could you merge it into "uuid.hpp" for simplicity?
> - What is your evaluation of the documentation?
I'd like to see the Rationale section come first, before you give the whole
Class Synopsis. That would make it easier for casual readers to understand
the purpose of the library before reading further.
A small example program near the beginning of the document would help too.
There is a spelling error in the Rationale section, s/indended/intended/.
> - What is your evaluation of the potential usefulness of the library?
It's a nice addition to Boost.
> - Did you try to use the library? With what compiler? Did you have any
> problems?
Yes, I tried running some test code and had no problems.
I vote Yes, to include the UUID library in Boost.
Regards,
Dave Jenkins
Understood. I will look into this again.
> > - What is your evaluation of the implementation?
>
> I had no problem reading the code. But there's almost nothing in
> "uuid.ipp". Could you merge it into "uuid.hpp" for simplicity?
I could. The idea was to put non-template functions in uuid.ipp so that
one could compile it in a .cpp to create a library. I agree that there
is little in it. I should have moved the definition of many uuid member
function to uuid.ipp. But I think you are right, I should just merge
uuid.ipp into uuid.hpp.
> > - What is your evaluation of the documentation?
>
> I'd like to see the Rationale section come first, before you give the
> whole Class Synopsis. That would make it easier for casual readers to
> understand the purpose of the library before reading further.
Good idea. I'll do this.
> A small example program near the beginning of the document would
> help too.
Sure. I'll do this.
> There is a spelling error in the Rationale section,
> s/indended/intended/.
Thanks.
> > - What is your evaluation of the potential usefulness of the
> > library?
>
> It's a nice addition to Boost.
>
> > - Did you try to use the library? With what compiler? Did you have
> > any problems?
>
> Yes, I tried running some test code and had no problems.
>
> I vote Yes, to include the UUID library in Boost.
>
> Regards, Dave Jenkins
>
Thanks,
Andy Tompkins
Hartmut Kaiser wrote:
> The mini-review of Andy Tompkins UUID library starts today, November 23rd
> 2008, and will end on November 30th.
> I really hope to see your vote and your participation in the discussions on
> the Boost mailing lists!
>
I only saw a couple of issues:
uuid::get_showbraces_index() is not reentrant
(there is a race the first time it is called.)
operator<< doesn't handle setw.
operator>> ignores spaces. Is this intended?
In Christ,
Steven Watanabe
I would like to fix this, but I'm not yet sure how.
> operator<< doesn't handle setw.
> operator>> ignores spaces. Is this intended?
I intended to follow standard practices. I forgot setw. I will
fix this. I didn't realize that operator>> ignores spaces. I will
fix this too.
> In Christ,
> Steven Watanabe
Thanks,
Andy Tompkins
I also prefer using zero in the name. That said,
http://www.faqs.org/rfcs/rfc4122.html calls it the "nil" UUID. But
.NET calls it Guid.Empty, and python seems to only call it
uuid.UUID(), so nil may not be any clearer.
Though really, why not just call it operator unspecified_bool_type?
assert(!int() && !shared_ptr<T>() && !uuid());
>
> static uuid create(uuid const& namespace_uuid, char const* name, int
> name_length);
> Can this be a generator instead?
>
> std::string name="www.widgets.com";
> name_based_generator gen(name.begin(), name.end());
> uuid id = gen();
>
A generator seems logical. I certainly don't like the char*+length version.
std::string name="www.widgets.com";
sha1_name_based_generator gen(dns_namespace_uuid);
uuid id = gen(name);
uuid id2 = gen(name.begin(), name.end());
assert(id == id2)
I'd rather know whether it was v3 or v5 somewhere more obvious than
the design notes, though. If I were using v3 (MD5) uuids and switched
to boost not knowing that v5 (SHA1) uuids exist, I'd likely be very
confused when my code stops working with my data.
Good point. I wonder if they should both be generators. A generator
that takes a string or an iterator range for a string. And a different
generator that takes an array of bytes or an iterator range for bytes.
> std::string to_string() const;
> std::wstring to_wstring() const;
> template <typename ch, typename char_traits, typename alloc>
> std::basic_string<ch, char_traits, alloc> to_basic_string() const;
>
> Any reason to use to_string instead of std::basic_ostream operators
> instead?
No reason to use one over the other, just preference. It has been
mentioned before and I will likely remove the to_..._string functions.
> size_type size() The underlying container is boost::array, shouldn't
> this function be made static? Or is the size of the uuid
> implementation defined and the user shouldn't count on it's length?
It could easily be made static. It will _always_ return 16.
> is_null() What does this mean? Ah, ok, from the docs I see it's
> a magic uuid (all zero) that is_null. Maybe is_zero() would be
> more clear?
Maybe. Maybe uninitialized, nil. I don't think it matters too
much because I don't think there is a name that everyone will think
is obvious.
> static uuid create(uuid const& namespace_uuid, char const* name, int
> name_length); Can this be a generator instead?
>
> std::string name="www.widgets.com";
> name_based_generator gen(name.begin(), name.end());
> uuid id = gen();
I agree this should be a generator.
> Shouldn't basic_uuid_generator be named random_uuid_generator? It
> doesn't seem more basic than any other generator.
Yes it should have random in its name.
> - What is your evaluation of the design?
>
> Id like to see more separation between the generators and the core
> uuid class. IMO generating functionality should not be part of the
> uuid class at all, but separated into own headers with well
> documented behaviour. I'd like to se random_generator<class
> RandomGen> parsing_generator<class CharIterator>
> native_generator<..> (wrapper around os API)
Yes, I like this.
> - What is your evaluation of the implementation?
>
> It seems fine, but it's not organized IMO. As a user I don't want to
> pay compile time for features I do not use (from uuid.hpp):
> #include <boost/operators.hpp>
> #include <boost/io/ios_state.hpp>
> #include <boost/random/uniform_int.hpp>
> #include <boost/random/variate_generator.hpp>
> #include <boost/random/mersenne_twister.hpp>
> #include <boost/uuid/seed_rng.hpp>
> #include <sstream>
> #include <iosfwd>
> #include <locale>
This could easily be fixed by separating functionality out into
different
header files. I will do this.
> - What is your evaluation of the documentation?
>
> I'd like to see more information about how to create uuids and if the
> library can help me doing that. Also what I can expect from the
> different generators (speed vs uniqueness).
Yes, I will do this.
I not keen on this. The fact that boost::uuid is implemented with
boost::array is just an implementation detail. That may change. I
want the public interface to remain the same.
> I admit the last part is from the top of my head right now, there's
> probably flaws with it =) But separating io, generation and container
> should IMO be done either way.
I agree, separating io, generation, ... should be done. I will do this.
> Regards, Christian
Thanks,
Andy Tompkins
> size_type size() The underlying container is boost::array, shouldn'tIt could easily be made static. It will _always_ return 16.
> this function be made static? Or is the size of the uuid
> implementation defined and the user shouldn't count on it's length?
Maybe. Maybe uninitialized, nil. I don't think it matters too
> is_null() What does this mean? Ah, ok, from the docs I see it's
> a magic uuid (all zero) that is_null. Maybe is_zero() would be
> more clear?
much because I don't think there is a name that everyone will think
is obvious.
I not keen on this. The fact that boost::uuid is implemented with
> Maybe this is out of the question to most, but is the uuid class
> needed at all? I'd be happy to see the following working (cause then I
> can use our own uuid class with the algorithms provided by the
> library).
>
> #include <boost/uuid/random_generator.hpp>
> #include <boost/uuid/support/array.hpp>
> #include <boost/uuid/io.hpp>
>
> typedef boost::array<char, 16> id_type;
> boost::uuid::random_generator<id_type, boost::mt19937> uuid_gen(...);
> my_id id = uuid_gen();
> std::cout << boost::uuid::format(id) <<std::endl;
boost::array is just an implementation detail. That may change. I
want the public interface to remain the same.
First of all, my opinion is that this kind of classes, sometimes called
concrete types, are the very plus plus suggested by the language's name. A
well designed user type behaves and cost, exactly the same as any built in
type. This makes C++ unique in the plethora of available languages.
That said, I must mention that I am aware of the flaws pointed out by many
reviewers, specially those concerning interface. In particular, I don't like
the "default null constructor". For me, the default constructor MUST
generate a valid object, period. I have already copied/used Andy's code,
which I luckily found in your vault, and one thing I changed was this
default constructor [uuid_t uuid; // generates a valid random uuid]. If the
user is short of resources, she must use the copy constructor [uuid_t
nil(uuid_t::null());]. But I don't provide a bool is_null() const; method,
for operator==() does the job pretty well.
Despite this weaknesses, I think the library is ready for out-of-the-box
usage, so my vote is YES, this library should be part of boost.
Sincerely,
Kenneth
--
View this message in context: http://www.nabble.com/-Review--UUID-library-%28mini-%29review-starts-today%2C-November-23rd-tp20652849p20807056.html
Sent from the Boost - Users mailing list archive at Nabble.com.
"behaves and cost, exactly the same as any built in type" and the
"default constructor [...] generates a valid random uuid" seem rather
opposed. int() is not a random int, it's 0. No fundamental type --
no standard library type either, that I can think of -- has T() !=
T(), yet you want uuid() != uuid().
And why is the random UUID generator deemed more important than the
time-based one, so much so that it gets the default constructor?
... No fundamental type --
no standard library type either, that I can think of -- has T() !=
T(), yet you want uuid() != uuid().
You are right, there is an apparent contradiction in my wording above. By
"behaves and cost", I mean implementing comparison operators, inserters and
extractors, for example. Where applicable, of course, and with minimum
overhead. But an UUID is more than its bare 128-bit representation. IMHO,
what justifies the truly exceptional uuid() != uuid() is the "unique" in
"universally unique id".
Yours,
Kenneth
>And why is the random UUID generator deemed more important than the
>time-based one, so much so that it gets the default constructor?
Because Andy has not [yet] implemented the time-based generator, which poses
portability problems.
_______________________________________________
Boost-users mailing list
Boost...@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users
--
View this message in context: http://www.nabble.com/-Review--UUID-library-%28mini-%29review-starts-today%2C-November-23rd-tp20652849p20820796.html
Sent from the Boost - Users mailing list archive at Nabble.com.
_______________________________________________
But it's the mapping of the ID to something that's unique, not the
objects themselves. The only time you can do something useful with a
UUID is when you have two that are the same, so I still don't think
breaking normal semantics is justified.
The usage I was looking at for my project never needs the UUID library
to generate one itself, in fact. They're all generated in the
database (http://dev.mysql.com/doc/refman/5.0/en/miscellaneous-functions.html#function_uuid),
but the class is still useful for parsing and serialization.
Changing tacks somewhat...
I was just looking deeper into the implementation, and saw some things
that worry me. For example, this looks like an aliasing violation in
uuid.hpp line 279:
*reinterpret_cast<unsigned long*>(&data[i]) = _v_gen();
It's at least clearly byte-order-dependant, which isn't good, since
I'd expect a generator seeded the same way to always produce the same
UUIDs, regardless of architecture.
I'm pleased to see proper shifting and masking in the SHA1 code, which
made me think of something: Why not just make the UUID class a PoD?
It has an architecture-independent in-memory format (rfc4122, section
4.1.2, which is already followed), so turning it into one would be
straight-forward, and would allow it to be very easily & safely
splatted to and from files.
This would require moving the constructors and such to generators,
stream operators, and similar, but I think those are positive changes
in any case. It certainly solves the "what should the constructor do"
problem, and places all the various generators (nil, random, name,
time, ...) at equal footing.
It's a value type, so let's make it as much like an int as possible.
Would it make any sense in defining data_type as array< unsigned char,
128 / CHAR_BIT >? I'm interested in portability issues, but I have no
access to platforms other than intel-x86, so I would like to hear some
advice on the subject.
Thank you.
Agustín K-ballo Bergé.-
Fair. I can add operator unspecified_bool_type. And, possibility
remove null().
> >
> > static uuid create(uuid const& namespace_uuid, char const* name, int
> > name_length); Can this be a generator instead?
> >
> > std::string name="www.widgets.com"; name_based_generator
> > gen(name.begin(), name.end()); uuid id = gen();
> >
>
> A generator seems logical. I certainly don't like the char*+length
> version.
Agreed.
> std::string name="www.widgets.com";
> sha1_name_based_generator gen(dns_namespace_uuid);
> uuid id = gen(name);
> uuid id2 = gen(name.begin(), name.end());
> assert(id == id2)
>
> I'd rather know whether it was v3 or v5 somewhere more obvious than
> the design notes, though. If I were using v3 (MD5) uuids and switched
> to boost not knowing that v5 (SHA1) uuids exist, I'd likely be very
> confused when my code stops working with my data.
Hmm, I'd be surprised if your code stopped working. But regardless,
would it be enough to add a/some member functions to uuid so that one
could ask what type of uuid it was? For example:
boost::uuids::uuid u;
// generate a uuid
bool bIsV3 = u.is_v3();
//or
boost::uuids::uuid::version_type v = u.version();
Regards,
Andy Tompkins
Well, if I had some UUIDs in a database that were generated from a v3
algorithm, and I tried to look one up from the base string, but
generated the UUID from the v5 algorithm, the lookup would fail.
Though I have no idea whether this would be an issue in practice.
> But regardless,
> would it be enough to add a/some member functions to uuid so that one
> could ask what type of uuid it was? For example:
>
> boost::uuids::version_type v = u.version();
>
I think that's an excellent idea.
I don't think it solves the issue, though, since I likely wouldn't
think to check going from some hypothetical old C API's
UuidCreateByName(&id, UUID_NAMESPACE_DNS, "www.widgets.com", 15);
to
id = boost::uuids::uuid::create(dns_namespace_uuid, "www.widgets.com");
Inter,
Kenneth
--
View this message in context: http://www.nabble.com/-Review--UUID-library-%28mini-%29review-starts-today%2C-November-23rd-tp20652849p20865931.html
Sent from the Boost - Users mailing list archive at Nabble.com.
_______________________________________________
I will document this.
< snip >
> > > Maybe this is out of the question to most, but is the uuid class
> > > needed at all? I'd be happy to see the following working (cause
> > > then I can use our own uuid class with the algorithms provided by
> > > the library).
> > >
> > > #include <boost/uuid/random_generator.hpp> #include
> > > <boost/uuid/support/array.hpp> #include <boost/uuid/io.hpp>
> > >
> > > typedef boost::array<char, 16> id_type;
> > > boost::uuid::random_generator<id_type, boost::mt19937>
> > > uuid_gen(...); my_id id = uuid_gen(); std::cout <<
> > > boost::uuid::format(id) <<std::endl;
> >
> > I not keen on this. The fact that boost::uuid is implemented with
> > boost::array is just an implementation detail. That may change. I
> > want the public interface to remain the same.
>
> Ok. For the sake of interface, does it make sense to you adding const
> char* uuid::data() const and static std::size_t uuid::size() that you
> won't change in future releases?
Hmm, I'm not sure. There is another post that I haven't got to
replying to that talks about platforms with different sized types
(char, int, ...). If I make sure that sizeof(boost::uuids::uuid) == 16
bytes on every platform then some platforms may get a performance
penalty. On a platform where sizeof(char) != 1 or sizeof(int) != 4
then boost::uuids::uuid will have to do more work to put bits where
they belong.
Then again, I don't think I can actually guarantee on every platform
that sizeof(boost::uuids::uuid) == 16.
What I have done is make sure that each of the 16 blocks of bits in a
boost::uuids::uuid is at least 8 bits. Thus boost::uuids::uuid::begin()
and boost::uuids::uuid::end() will walk the 16 blocks correctly.
I know this didn't really answer your question. I would like to offer
const char* uuid::data() const, but I'm not sure that I can.
> I think it's an important feature for a uuid to store its 128bit value
> as a continues range of bytes, and that users can rely on using it as
> such where needed. Also, that the ordering of the bytes won't change
> in future releases. Uuids are likely to be persistent somewhere, and
> not necessarily trough the means of boost.serialization where version
> data can be attached.
I agree. I did provide begin() and end() with this purpose in mind.
And the order will never change.
> Regards, Christian
Regards,
Andy Tompkins wrote:
> On a platform where sizeof(char) != 1
sizeof(char) == 1 by definition.
In Christ,
Steven Watanabe
I also think that uuid() == uuid() is a good thing. I do understand the
other point of view and where it is coming from. It really does make me
think that there should not be a default constructor. Then, one must be
explicit if they want a 'null' uuid, something like:
boost::uuids::uuid u1(boost::uuids::null());
and one must be explicit if they want a unique uuid, something like:
boost::uuids::uuid u2(boost::uuids::uuid_generator());
> The usage I was looking at for my project never needs the UUID library
> to generate one itself, in fact. They're all generated in the
> database (http://dev.mysql.com/doc/refman/5.0/en/miscellaneous-functions.html#function_uuid),
> but the class is still useful for parsing and serialization.
>
> Changing tacks somewhat...
>
> I was just looking deeper into the implementation, and saw some things
> that worry me. For example, this looks like an aliasing violation in
> uuid.hpp line 279:
>
> *reinterpret_cast<unsigned long*>(&data[i]) = _v_gen();
I do not believe this is an aliasing violation.
> It's at least clearly byte-order-dependant, which isn't good, since
> I'd expect a generator seeded the same way to always produce the same
> UUIDs, regardless of architecture.
Hmm, I was only trying to be efficient. The generator produces values
of size = sizeof(unsigned long) and I wanted to use all those bytes. Do
you have a suggestion that would improve this?
> I'm pleased to see proper shifting and masking in the SHA1 code, which
> made me think of something: Why not just make the UUID class a PoD? It
> has an architecture-independent in-memory format (rfc4122, section
> 4.1.2, which is already followed), so turning it into one would be straight-
> forward, and would allow it to be very easily & safely splatted to
> and from files.
I do really like having a class encapsulate the data and functions.
Also, I don't have much experience with platforms where sizeof(int)
!= 4 or sizeof(char) != 1, and I want to make sure that I can do as
you suggest.
> This would require moving the constructors and such to generators,
> stream operators, and similar, but I think those are positive changes
> in any case. It certainly solves the "what should the constructor do"
> problem, and places all the various generators (nil, random, name,
> time, ...) at equal footing.
I also think that all the ways to create a uuid can be made as
generators, and by not providing a default constructor will keep
them equal.
> It's a value type, so let's make it as much like an int as possible.
Regards,
Andy Tompkins
Hmm, can anyone verify this? I will address this if it is a problem.
What are the guarantees of the size of a byte? Is a byte always 8 bits?
> Would it make any sense in defining data_type as array< unsigned char,
> 128 / CHAR_BIT >? I'm interested in portability issues, but I have no
> access to platforms other than intel-x86, so I would like to
> hear some advice on the subject.
I also have no access to platforms other than x86. Would this work?
Would there be performance penalities on some platforms?
> Thank you.
>
> Agustín K-ballo Bergé.-
Regards, Andy Tompkins
Ah, I understand. Yes this would be a problem.
> Though I have no idea whether this would be an issue in practice.
Good question. I don't know how boost::uuids can prevent this besides
documentation.
> > But regardless, would it be enough to add a/some member functions to
> > uuid so that one could ask what type of uuid it was? For example:
> >
> > boost::uuids::version_type v = u.version();
> >
>
> I think that's an excellent idea.
>
> I don't think it solves the issue, though, since I likely wouldn't
> think to check going from some hypothetical old C API's
>
> UuidCreateByName(&id, UUID_NAMESPACE_DNS, "www.widgets.com", 15);
>
> to
>
> id = boost::uuids::uuid::create(dns_namespace_uuid,
> "www.widgets.com");
I agree, it doesn't solve the issue. I think the best guard against
this would be with unit tests for their code.
Regards,
Andy Tompkins
By C++ definition, a byte is the size of a char, and contains at least 8 bits.
My understanding is that posix sockets require CHAR_BIT == 8, so
outside of DSP chips and other special hardware, that's almost always
the case. Considering that UUIDs were originally designed for RPC --
typically over sockets -- and that the v1 algorithm is defined using
MAC addresses, it's probably a safe assumption.
3.10/15 from n2723 (and essentially the same as C++03):
If a program attempts to access the stored value of an object through
an lvalue of other than one of the following types the behavior is
undefined:
— the dynamic type of the object,
— a cv-qualified version of the dynamic type of the object,
— a type similar (as defined in 4.4) to the dynamic type of the object,
— a type that is the signed or unsigned type corresponding to the
dynamic type of the object,
— a type that is the signed or unsigned type corresponding to a
cv-qualified version of the dynamic type of the object,
— an aggregate or union type that includes one of the
aforementioned types among its members (including, recursively, a
member of a subaggregate or contained union),
— a type that is a (possibly cv-qualified) base class type of the
dynamic type of the object,
— a char or unsigned char type.
The dynamic type of the stored objects is unsigned char, and they are
being accessed through an lvalue of type unsigned long.
>> It's at least clearly byte-order-dependant, which isn't good, since
>> I'd expect a generator seeded the same way to always produce the same
>> UUIDs, regardless of architecture.
>
> Hmm, I was only trying to be efficient. The generator produces values
> of size = sizeof(unsigned long) and I wanted to use all those bytes. Do
> you have a suggestion that would improve this?
>
Actually, the default generator produces uint32_t, so (in the thread
on boost-dev I started earlier) I just switched to using that, and
used the same code that the sha1 uses to put 4-byte ints into 4 bytes.
>> I'm pleased to see proper shifting and masking in the SHA1 code, which
>> made me think of something: Why not just make the UUID class a PoD? It
>> has an architecture-independent in-memory format (rfc4122, section
>> 4.1.2, which is already followed), so turning it into one would be straight-
>> forward, and would allow it to be very easily & safely splatted to
>> and from files.
>
> I do really like having a class encapsulate the data and functions.
>
In general, I agree, but POD doesn't prevent having a class. Also,
the UUID itself imposes no invariant on the stored data other than
that it's initialized, and allowing arbitrary modification can't break
that.
That said, POD-ness is the least important of the aspects I've raised.
I would like mutating iterators into the array, though.
~ Scott
On Nov 23, 2008, at 7:13 PM, Hartmut Kaiser wrote:
> The mini-review of Andy Tompkins UUID library starts today,
> November 23rd
> 2008, and will end on November 30th.
> I really hope to see your vote and your participation in the
> discussions on
> the Boost mailing lists!
>
> The library can be downloaded from the vault here (it's the file
> uuid_v13.zip in the topmost directory): http://tinyurl.com/6xx28b
>
> The initial review of the UUID library ended with a provisional
> acceptance
> (read here: http://article.gmane.org/gmane.comp.lib.boost.user/
> 27774/).
>
> This mini review is meant to allow for a final decision after
> looking at the
> changed parts of the library. Here is a list of things fixed and
> changed:
>
> - fixed the licensing issues revealed by the initial review
> - fixed the security vulnerability revealed by the initial review
> - renamed to uuid, moved all classes, functions, etc. to namespace
> boost::uuids
> - implemented sha1 hash function (thus no license problem)
Others, including myself, are working on encoding libraries. If this
library, and an encoding library are added, we could change the
implementation to use the new encoding library.
> - new class basic_uuid_generator to create random-based uuids. The
> random
> number generator is no longer hard coded. It can use any random
> number
> generator, default is boost::mt19937
> - implemented a good seed function for random number generators
> - all functions are now reentrant, all classes are as thread safe
> as an int
> and the library is no longer dependent on Boost.Thread
>
> ---------------------------------------------------
>
> Please always state in your review, whether you think the library
> should be
> accepted as a Boost library!
Yes, please accept it.
>
> Additionally please consider giving feedback on the following general
> topics:
>
> - What is your evaluation of the design?
Why is there a container interface for inspecting the value's
octets? For a similar problem in my code (the MD5 stuff in the
Sandbox SVN), I defined a copying-out member function template that
takes an output iterator and returns the updated version of same.
You could do that, making sure to define some (compile-time) item
indicating the length (16), to facilitate easier changes to internal
storage.
Don't some operating systems provide UUID generation? Could there be
a way to add a creation function that uses the OS's code?
Is default construction useful besides when initialization can't be
done in one step? Instead of an "is_null" member function, maybe
(pseudo-)Boolean conversion (and "operator not") can be used.
I don't think the multitude of string conversion techniques are
needed. Keep the constructor with "char const *" for pseudo-
literals. Everything else, in either direction, could use
"lexical_cast".
> - What is your evaluation of the implementation?
The "generator_iterator" substitute in "seed_rng.hpp" doesn't
completely help. It is supposed to improve on the version in "boost/
generator_iterator.hpp" in the area of end-iterator support. I
originally said more here, but I erased it because when I filed a
ticket on this issue, I found one already (#2428) and put my notes
there.
The implementation of the constructor that takes a byte-returning
input-iterator shows why we need a double-bounded copy algorithm.
(Now in new ticket #2578.)
The copy constructor, destructor, and copy-assignment operator don't
do anything different than the automatically-defined versions would.
So they could be omitted.
Maybe the "create_name_based" and "get_showbraces_index" member
functions need to be in a new mandatory source file. If so, you
could move some common error strings there too. BTW, why do
"create" (and "create_name_based") need the length based in as a
separate argument? Just use "std::strlen".
The I/O function templates could be tightened up some. (I noticed it
since I work on I/O quite a bit. It could be done later.) The
optimization would be more important if my dump-most-of-the-string-
conversions-for-lexical_cast idea is used.
Should the serialization be done in a separate header? BTW, how is a
custom primitive type handled? Is it just a byte-level save/load of
memory? (What if the type isn't POD, like this one?)
> - What is your evaluation of the documentation?
It's adequate, but maybe detailed Doxygen information on each item
should be added.
> - What is your evaluation of the potential usefulness of the library?
It can be useful for (temporary) IDs. We have to make sure that the
type, especially its object representation, doesn't get heavy (i.e.
stay quasi-POD).
> - Did you try to use the library? With what compiler? Did you
> have any
> problems?
I used it on Mac OS X 10.4 (PowerPC) with XCode. All the tests, and
Paul Bristow's example, seemed to work.
> - How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?
Two days of reading and trying out the code. Not too much of the
theory, though.
> - Are you knowledgeable about the problem domain?
Not really.
--
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com
Thank you. So I think the uuid library will require either that
CHAR_BIT == 8
or CHAR_BIT % 8 == 0. That is to say that the platform must have 8 bit
bytes.
Andy.
Thank you. I understand now. I will change this.
> >> It's at least clearly byte-order-dependant, which isn't good, since
> >> I'd expect a generator seeded the same way to always produce the same
> >> UUIDs, regardless of architecture.
> >
> > Hmm, I was only trying to be efficient. The generator produces values
> > of size = sizeof(unsigned long) and I wanted to use all those bytes. Do
> > you have a suggestion that would improve this?
> >
>
> Actually, the default generator produces uint32_t, so (in the thread
> on boost-dev I started earlier) I just switched to using that, and
> used the same code that the sha1 uses to put 4-byte ints into 4 bytes.
Yes, I like that.
> >> I'm pleased to see proper shifting and masking in the SHA1 code, which
> >> made me think of something: Why not just make the UUID class a PoD? It
> >> has an architecture-independent in-memory format (rfc4122, section
> >> 4.1.2, which is already followed), so turning it into one would be straight-
> >> forward, and would allow it to be very easily & safely splatted to
> >> and from files.
> >
> > I do really like having a class encapsulate the data and functions.
> >
>
> In general, I agree, but POD doesn't prevent having a class. Also,
> the UUID itself imposes no invariant on the stored data other than
> that it's initialized, and allowing arbitrary modification can't break
> that.
>
> That said, POD-ness is the least important of the aspects I've raised.
> I would like mutating iterators into the array, though.
I understand your desire. I would like to put in mutating iterators. I
wrote
in my last post, should the uuid library require CHAR_BITS == 8, or can
it
relax and only require that CHAR_BITS % 8 == 0. That is the uuid
library only
needs 8 bit bytes. I guess the difference will be in implementating the
iterators.
> ~ Scott
Regards, Andy Tompkins
Yes, I would gladly change the uuid library to use the new encoding
library.
> > - new class basic_uuid_generator to create random-based uuids. The
> > random number generator is no longer hard coded. It can use any
> > random number generator, default is boost::mt19937
> > - implemented a good seed function for random number generators
> > - all functions are now reentrant, all classes are as thread safe as
> > an int and the library is no longer dependent on Boost.Thread
> >
> > ---------------------------------------------------
> >
> > Please always state in your review, whether you think the library
> > should be accepted as a Boost library!
>
> Yes, please accept it.
>
> >
> > Additionally please consider giving feedback on the following
> > general topics:
> >
> > - What is your evaluation of the design?
>
> Why is there a container interface for inspecting the value's octets?
> For a similar problem in my code (the MD5 stuff in the Sandbox SVN), I
> defined a copying-out member function template that takes an output
> iterator and returns the updated version of same. You could do that,
> making sure to define some (compile-time) item indicating the length
> (16), to facilitate easier changes to internal storage.
This was a requested feature.
> Don't some operating systems provide UUID generation? Could there be
> a way to add a creation function that uses the OS's code?
Yes, I will write generators that use OS code.
> Is default construction useful besides when initialization can't be
> done in one step? Instead of an "is_null" member function, maybe (pseudo-
> )Boolean conversion (and "operator not") can be used.
The default constructor is unintuitive. I will likely remove it. I
will
add a 'boolean' conversion and possibility operator!.
> I don't think the multitude of string conversion techniques are
> needed. Keep the constructor with "char const *" for pseudo-
> literals. Everything else, in either direction, could use
> "lexical_cast".
I agree, lexical_cast is good. I will remove the to_..._string
functions.
The constructor that takes a string will likely move to a generator.
> > - What is your evaluation of the implementation?
>
> The "generator_iterator" substitute in "seed_rng.hpp" doesn't
> completely help. It is supposed to improve on the version in
> "boost/ generator_iterator.hpp" in the area of end-iterator
> support. I originally said more here, but I erased it because when
> I filed a ticket on this issue, I found one already (#2428) and put
> my notes there.
Thanks. Ticket #2428 was added in response to the uuid library by
someone else.
> The implementation of the constructor that takes a byte-returning input-
> iterator shows why we need a double-bounded copy algorithm. (Now in
> new ticket #2578.)
>
> The copy constructor, destructor, and copy-assignment operator don't
> do anything different than the automatically-defined versions would.
> So they could be omitted.
Fair.
> Maybe the "create_name_based" and "get_showbraces_index" member
> functions need to be in a new mandatory source file. If so, you could
> move some common error strings there too. BTW, why do "create" (and
> "create_name_based") need the length based in as a separate argument?
> Just use "std::strlen".
The create_name_based function will be moved to a generator. I added
the
length so that one could use it with any given object. But this is not
what people want, so yes, it can just use std::strlen (as well as have
overloads for std::basic_string<>).
> The I/O function templates could be tightened up some. (I noticed it
> since I work on I/O quite a bit. It could be done later.) The
> optimization would be more important if my dump-most-of-the-string-
> conversions-for-lexical_cast idea is used.
Yes, the i/o functions need improving. I will likely need some help
when
the time comes.
> Should the serialization be done in a separate header? BTW, how is a
> custom primitive type handled? Is it just a byte-level save/load of
> memory? (What if the type isn't POD, like this one?)
Yes, it is a byte-level save/load. My reason was so that exactly 128
bytes
are saved/loaded.
> > - What is your evaluation of the documentation?
>
> It's adequate, but maybe detailed Doxygen information on each item
> should be added.
Agreed.
> > - What is your evaluation of the potential usefulness of the
> > library?
>
> It can be useful for (temporary) IDs. We have to make sure that the
> type, especially its object representation, doesn't get heavy (i.e.
> stay quasi-POD).
Agreed.
> > - Did you try to use the library? With what compiler? Did you have
> > any problems?
>
> I used it on Mac OS X 10.4 (PowerPC) with XCode. All the tests, and
> Paul Bristow's example, seemed to work.
>
> > - How much effort did you put into your evaluation? A glance? A
> > quick reading? In-depth study?
>
> Two days of reading and trying out the code. Not too much of the
> theory, though.
>
> > - Are you knowledgeable about the problem domain?
>
>
> Not really.
>
> --
> Daryle Walker Mac, Internet, and Video Game Junkie darylew AT
> hotmail DOT com
>
Thanks,
Andy Tompkins
I think it's reasonable for the uuid class itself to work iff
boost/cstdint.hpp provides a uint8_t, and for the generators to work
if boost/cstdint.hpp provides a uint32_t. (The latter is mostly for
the hash, but requiring it for the random too is convenient).
I don't see the added implementation complexity to support unusual
architectures as being worth it. For example, even the C compiler for
the last mcu I used (one of the weaker MSP430s) provided both, despite
it being about 500 (or 65000, depending) times slower than and having
one millionth the ram of my laptop.
That said, I recall a discussion on the LLVM list a bit back about an
architecture with 96-bit registers separated into four 24-bit
channels. Is anyone running Boost on such things?
> On Tue, 9 Dec 2008 03:34:01 -0500, "Daryle Walker"
> <dar...@hotmail.com> said:
>>
[SNIP]
>> Is default construction useful besides when initialization can't be
>> done in one step? Instead of an "is_null" member function, maybe
>> (pseudo-
>> )Boolean conversion (and "operator not") can be used.
>
> The default constructor is unintuitive. I will likely remove it. I
> will
> add a 'boolean' conversion and possibility operator!.
>
>> I don't think the multitude of string conversion techniques are
>> needed. Keep the constructor with "char const *" for pseudo-
>> literals. Everything else, in either direction, could use
>> "lexical_cast".
>
> I agree, lexical_cast is good. I will remove the to_..._string
> functions.
> The constructor that takes a string will likely move to a generator.
[SNIP]
So, there won't be any constructors besides the constructor template
that takes an input iterator for sixteen octet reads (and an implicit
copy constructor)? You could make UUID a POD type if you could move
that constructor too. (It could probably be moved to a creation
function, done as a static member function for UUID. The Nil-UUID
can be implemented this way too.)
>
>> Should the serialization be done in a separate header? BTW, how is a
>> custom primitive type handled? Is it just a byte-level save/load of
>> memory? (What if the type isn't POD, like this one?)
>
> Yes, it is a byte-level save/load. My reason was so that exactly 128
> bytes
> are saved/loaded.
[TRUNCATE]
You mean 128 bits, or 16 octets. An alternative I used, for a MD5
bit-queue, was to group the bits into sextets (6-packs), map that to
a base-64 string, and serialize that string. For 128 bits, you get
21 sextets and 2 remaining bits, for a string of 22 characters.
--
Daryle Walker
Mac, Internet, and Video Game Junkie
darylew AT hotmail DOT com
_______________________________________________
The last condition (CHAR_BIT % 8 == 0) does not prove that the
platform has 8 bit bytes, just consider DSP controllers with
CHAR_BIT = 32 - we have those. I have not completely followed
the discussion, but it is of-course possible to realize effective
8 bit calculations even on those systems with CHAR_BIT > 8 by
effectively using a math with corresponding bit masks hiding
everything beyond the 8 bit limit.
Side note: The C++ is going to introduce non-optional new
character types char16_t and char32_t. But these are not required
to exactly fit into 16 bit or 32 bit, just they are required to be
capable to *represent* corresponding values. This seems quite
similar to the discussion here, if I did not misunderstand it.
Greetings from Bremen,
Daniel Krügler
Yes, all the user defined constructors could be moved into generators /
static functions / free functions. The uuid class could be a POD. What
are the benefits of making it a POD? I don't yet understand the
advantages.
> >> Should the serialization be done in a separate header? BTW, how is
> >> a custom primitive type handled? Is it just a byte-level save/load
> >> of memory? (What if the type isn't POD, like this one?)
> >
> > Yes, it is a byte-level save/load. My reason was so that exactly
> > 128 bytes are saved/loaded.
> [TRUNCATE]
>
> You mean 128 bits, or 16 octets. An alternative I used, for a MD5 bit-
> queue, was to group the bits into sextets (6-packs), map that to a base-
> 64 string, and serialize that string. For 128 bits, you get 21
> sextets and 2 remaining bits, for a string of 22 characters.
Yes, thanks, I mean 128 bits or 16 octets. One will be able to use
lexical_cast to convert a uuid to/from a string. I can also see the
usefulness to be able to convert to/from a base-64 string, a single
integer value, or other types of binary to text encoding (like ascii85),
but I'm not sure these are in the scope of uuid.
> --
> Daryle Walker Mac, Internet, and Video Game Junkie darylew AT
> hotmail DOT com
Regards, Andy Tompkins