constexpr seed generator built-in


dgutson .

Nov 17, 2015, 3:08:37 PM
to std-proposals
We are developing a compile-time random number generator.
In order to provide the seed, we coded a very simple constexpr
function that operates on the chars of __DATE__ and __TIME__.
However, I would like to improve this, e.g. by adding more accuracy.
This leads either to inventing a long __EPOCH__-like macro with enough
precision (or a similar macro such as __MSECS_FROM_MIDNIGHT__) or to a
built-in, compiler-implemented constexpr function, such as std::seed().
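
A minimal sketch of what such a seed function might look like, assuming C++14 relaxed constexpr. The names `fnv1a` and `build_seed` are hypothetical, and FNV-1a is only a stand-in for whatever mixing function is actually used:

```cpp
#include <cstdint>

// FNV-1a (64-bit): a simple non-cryptographic hash over a C string,
// usable in constant expressions since C++14.
constexpr std::uint64_t fnv1a(const char* s) {
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    for (; *s != '\0'; ++s) {
        h ^= static_cast<unsigned char>(*s);
        h *= 1099511628211ull;                  // FNV prime (wraps mod 2^64)
    }
    return h;
}

// __DATE__ and __TIME__ expand to string literals, so adjacent literals
// are concatenated. The value changes at most once per second.
constexpr std::uint64_t build_seed = fnv1a(__DATE__ " " __TIME__);
```

Because the input has one-second resolution, two compilations within the same second get the same seed, which is the precision limit described above.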

Any thoughts?

We will implement in (initially forked) gcc whatever looks best here.

Thanks,

Daniel.

--
Who’s got the sweetest disposition?
One guess, that’s who?
Who’d never, ever start an argument?
Who never shows a bit of temperament?
Who's never wrong but always right?
Who'd never dream of starting a fight?
Who get stuck with all the bad luck?

Thiago Macieira

Nov 17, 2015, 3:23:46 PM
to std-pr...@isocpp.org
On Tuesday 17 November 2015 17:08:35 dgutson . wrote:
> We are developing a compile-time random number generator.
> In order to provide the seed, we coded a very simple constexpr
> function that operates on the chars of __DATE__ and __TIME__.
> However, I would like to improve this, e.g. by adding more accuracy.
> This leads either to invent a long __EPOCH__-like macro with enough
> precision (or a similar macro such as __MSECS_FROM_MIDNIGHT__) or a
> built-in
> compiler-implemented constexpr function, such as std::seed().
>
> Any thoughts?
>
> We will implement in (initially forked) gcc whatever looks best here.

If you're forking GCC, you can add an intrinsic that can be called in that
constexpr context and returns a pseudo-random value of its own.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Matt Calabrese

Nov 17, 2015, 3:31:56 PM
to ISO C++ Standard - Future Proposals
On Tue, Nov 17, 2015 at 12:08 PM, dgutson . <daniel...@gmail.com> wrote:
> We are developing a compile-time random number generator.
> In order to provide the seed, we coded a very simple constexpr
> function that operates on the chars of __DATE__ and __TIME__.
> However, I would like to improve this, e.g. by adding more accuracy.
> This leads either to invent a long __EPOCH__-like macro with enough
> precision (or a similar macro such as __MSECS_FROM_MIDNIGHT__) or a
> built-in
> compiler-implemented constexpr function, such as std::seed().

Compile-time random sounds pretty cool. I can think of a couple of uses, but I'm curious what your primary uses are.

Regarding a compiler-implemented constexpr function that produces a unique seed, I think you'd very quickly run into ODR issues unless you guaranteed that all translation units produced the same seed (i.e. you'd probably have to seed your seed via a macro or something). Have you considered just providing the seed directly from the outside via a compiler option, such as -D, and generating that seed via your build system? I realize that this is outside of the realm of the standard, but it's probably worth considering anyway. The facility would likely be just as useful in practice, unless I'm missing something.
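
A minimal sketch of the -D approach, assuming a hypothetical macro name `BUILD_SEED` that the build system defines identically for every translation unit. The fallback value is arbitrary and only there so the file still compiles without the option:

```cpp
#include <cstdint>

// Hypothetical macro name. The build system would generate one value and
// pass it to every translation unit, e.g.:
//   g++ -std=c++14 -DBUILD_SEED=0x1234abcdULL -c file.cpp
#ifndef BUILD_SEED
#  define BUILD_SEED 0x9e3779b97f4a7c15ULL  // fixed fallback: repeatable build
#endif

// Same value in every translation unit, as long as the build system
// passes the same -D everywhere.
constexpr std::uint64_t build_seed = BUILD_SEED;
```

Since the seed comes from outside the compiler, rebuilding with the same -D value reproduces the same binary, and changing it gives each release its own value.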

dgutson .

Nov 17, 2015, 3:41:00 PM
to std-proposals
On Tue, Nov 17, 2015 at 5:23 PM, Thiago Macieira <thi...@macieira.org> wrote:
> On Tuesday 17 November 2015 17:08:35 dgutson . wrote:
>> We are developing a compile-time random number generator.
>> In order to provide the seed, we coded a very simple constexpr
>> function that operates on the chars of __DATE__ and __TIME__.
>> However, I would like to improve this, e.g. by adding more accuracy.
>> This leads either to invent a long __EPOCH__-like macro with enough
>> precision (or a similar macro such as __MSECS_FROM_MIDNIGHT__) or a
>> built-in
>> compiler-implemented constexpr function, such as std::seed().
>>
>> Any thoughts?
>>
>> We will implement in (initially forked) gcc whatever looks best here.
>
> If you're forking GCC, you can add an intrinsic that can be called in that
> constexpr context and returns a pseudo-random value of its own.

Yes, that's one of the alternatives we are considering as the
"compiler magic" version,
though I'd like to see the frontend standard first.


dgutson .

Nov 17, 2015, 3:44:58 PM
to std-proposals
On Tue, Nov 17, 2015 at 5:31 PM, 'Matt Calabrese' via ISO C++ Standard
- Future Proposals <std-pr...@isocpp.org> wrote:
> On Tue, Nov 17, 2015 at 12:08 PM, dgutson . <daniel...@gmail.com> wrote:
>>
>> We are developing a compile-time random number generator.
>> In order to provide the seed, we coded a very simple constexpr
>> function that operates on the chars of __DATE__ and __TIME__.
>> However, I would like to improve this, e.g. by adding more accuracy.
>> This leads either to invent a long __EPOCH__-like macro with enough
>> precision (or a similar macro such as __MSECS_FROM_MIDNIGHT__) or a
>> built-in
>> compiler-implemented constexpr function, such as std::seed().
>
>
> Compile-time random sounds pretty cool. I can think of a couple of uses, but
> I'm curious what your primary uses are.

Unfortunately I cannot provide many details; one use is cryptography
(so each binary is released with some uniqueness).
Another potential use is the pivot selection of quicksort, though we
are not using this.

>
> Regarding a compiler-implemented constexpr function that produces a unique
> seed, I think you'd very quickly run into ODR issues unless you guaranteed
> that all translation units produced the same seed (i.e. you'd probably have
> to seed your seed via a macro or something). Have you considered just
> providing the seed directly from the outside via a compiler option, such as
> -D, and generating that seed via your build system? I realize that this is
> outside of the realm of the standard, but it's probably worth considering
> anyway. The facility would likely be just as useful in practice, unless I'm
> missing something.

Although I'm looking for a potential proposal, the -D idea hadn't
occurred to me, thanks. We are actually thinking of deep gcc support
that could be tested with its native test framework (DejaGnu), so a
new macro or built-in is the way to go.


Thiago Macieira

Nov 17, 2015, 6:00:01 PM
to std-pr...@isocpp.org
On Tuesday 17 November 2015 17:40:59 dgutson . wrote:
> > If you're forking GCC, you can add an intrinsic that can be called in that
> > constexpr context and returns a pseudo-random value of its own.
>
> Yes, that's one of the alternatives we are considering as the
> "compiler magic" version,
> though I'd like to see the frontend standard first.

I don't think this could be standardised. We want reliable builds.

Nevin Liber

Nov 17, 2015, 6:15:24 PM
to std-pr...@isocpp.org
On 17 November 2015 at 16:59, Thiago Macieira <thi...@macieira.org> wrote:
> On Tuesday 17 November 2015 17:40:59 dgutson . wrote:
> > > If you're forking GCC, you can add an intrinsic that can be called in that
> > > constexpr context and returns a pseudo-random value of its own.
> >
> > Yes, that's one of the alternatives we are considering as the
> > "compiler magic" version,
> > though I'd like to see the frontend standard first.
>
> I don't think this could be standardised. We want reliable builds.

I'm guessing you meant repeatable, not reliable.  If so, I agree.  If not, please elaborate.

I'm also wary about the cryptographic applications of such a function/macro, as that is usually based on keeping something secret.  And if Daniel cannot talk about it, it becomes that much harder to get support for the feature.

As for the pivot selection of quicksort, why does having a constexpr seed generator particularly help (I suppose you could get a tiny bit of inlining performance when applying it to nearly sorted data, but I'd like to see benchmarks before drawing any conclusions)?

--
 Nevin ":-)" Liber  <mailto:ne...@cplusplusguy.com>  +1-847-691-1404

dgutson .

Nov 17, 2015, 7:00:16 PM
to std-proposals


On 17/11/2015 20:15, "Nevin Liber" <ne...@cplusplusguy.com> wrote:
>
> On 17 November 2015 at 16:59, Thiago Macieira <thi...@macieira.org> wrote:
>>
>> On Tuesday 17 November 2015 17:40:59 dgutson . wrote:
>> > > If you're forking GCC, you can add an intrinsic that can be called in that
>> > > constexpr context and returns a pseudo-random value of its own.
>> >
>> > Yes, that's one of the alternatives we are considering as the
>> > "compiler magic" version,
>> > though I'd like to see the frontend standard first.
>>
>> I don't think this could be standardised. We want reliable builds.
>
>
> I'm guessing you meant repeatable, not reliable.  If so, I agree.  If not, please elaborate.

What's the standard's definition of 'repeatable'? Does the standard require always generating the same binary?

>
> I'm also wary about the cryptographic applications of such a function/macro, as that is usually based on keeping something secret.  And if Daniel cannot talk about it, it becomes that much harder to get support for the feature.

OK, I'll try. Let's suppose we have a production line for some military application where, for security reasons, each binary has to be released in such a way that if it gets hacked, other binaries (produced from the same code) can't be hacked with the same codes.

>
> As for the pivot selection of quicksort, why does having a constexpr seed generator particularly help (I suppose you could get a tiny bit of inlining performance when applying it to nearly sorted data, but I'd like to see benchmarks before drawing any conclusions)?

Because we are already able to have associative containers that are stored in ROM, whose nontrivial constructors are executed at compile time (in constexpr contexts), along with some hashing functions that are also executed in constexpr contexts and that also require an RNG. A couple of months ago I started a thread on this list where people suggested that the current STL containers should be made ROM-friendly rather than adding a brand new category (such as static_unordered_map). We are already there, but this will be a very lengthy paper.


Thiago Macieira

Nov 17, 2015, 7:43:39 PM
to std-pr...@isocpp.org
On Tuesday 17 November 2015 21:00:10 dgutson . wrote:
> >> I don't think this could be standardised. We want reliable builds.
> >
> > I'm guessing you meant repeatable, not reliable. If so, I agree. If
> > not, please elaborate.

Yes, I meant repeatable.

> What's the std definition of 'repeatable'? Does the std require to generate
> always the same binary?

The same output is produced with the same inputs (where that includes the
compiler itself and any switches you may toggle). There's also a reason why
GCC added -Wdate-time to warn about the use of __DATE__ and __TIME__, as those
produce non-repeatable builds.

The standard doesn't require that, of course. But I think it's the objective
of compiler writers. Repeatable builds allow for verification against
compromising of the binaries. The reason why -Wdate-time was actually added to
GCC was because some Linux distributions (notably those using OBS) compare
binaries to see whether an update was required.

> > I'm also wary about the cryptographic applications of such a
> > function/macro, as that is usually based on keeping something secret. And
> > if Daniel cannot talk about it, it becomes that much harder to get support
> > for the feature.
>
> Ok I'll try. Let's suppose we have a production line for some mil
> application where for security reasons each binary has to be released in a
> way that if it gets hacked, other binaries (produced with the same code)
> don't get hacked by the same codes.

Then you'd have some expert tools implement that.

If randomising the layout of the program is a good prevention method (and it
is), you probably want to implement that in the compiler so it rearranges the
code and possibly adds a slight disturbance to the stack frame sizes.

A random seed available to the source code is not really that useful, IMHO.

> Because we are already able to have associative containers that are stored
> in ROM, which nontrivial constructors are executed in constexpr time, and
> some hashing functions that are also executed in constexpr contexts which
> also require RNG.

PRNG-based hashing functions are a good idea, but you usually want the hash
seed to be computed at runtime, not at compile time. That's also the same
principle as ASLR found on modern OSes.

I get it that you have a static (ROM) hashing table, but it's very unlikely
that you implemented it with constexpr. The most common case is that you have
a code generation tool that laid the hashing table out for you (and you should
consider a perfect hashing algorithm instead, like gperf). Either way, either
this tool generated the random seed or one was given to it. Moreover, you need
the same seed in all translation units that access that hash, which means a
buildsystem option, not a compiler one.

Unless you saved the seed in the hashing table, which probably defeats your
purpose in the first place.

> I already started a thread in this list a couple of
> months ago where people suggested that current STL containers should be ROM
> friendly rather than a brand new category (such as static_unordered_map).
> We are already there but this will be a very lengthy paper.

ROM-friendly I agree. But in case of unordered_map, I think a perfect hashing
is a superior solution.

Arthur O'Dwyer

Nov 17, 2015, 7:50:34 PM
to ISO C++ Standard - Future Proposals
On Tuesday, November 17, 2015 at 12:31:56 PM UTC-8, Matt Calabrese wrote:
> On Tue, Nov 17, 2015 at 12:08 PM, dgutson . <daniel...@gmail.com> wrote:
> > We are developing a compile-time random number generator.
> > In order to provide the seed, we coded a very simple constexpr
> > function that operates on the chars of __DATE__ and __TIME__.
> > However, I would like to improve this, e.g. by adding more accuracy.
> > This leads either to invent a long __EPOCH__-like macro with enough
> > precision (or a similar macro such as __MSECS_FROM_MIDNIGHT__) or a
> > built-in compiler-implemented constexpr function, such as std::seed().

I would guess that anybody who expects the compiler to provide them with an accurate value for the current timestamp "is, of course, in a state of sin."
We've got timezones, Daylight Savings, leap seconds, clock drift, platform-specific resolution... These are problems worth tackling in the Standard Library, but not in the core language. The core language, ever since 1989, has punted on the whole question by providing __DATE__ and __TIME__ only in a really unconsumable format: strings with implementation-defined contents.

Apple has gone so far as to build a macro "__CF_COMPILE_DAY_OF_EPOCH__" that constructs the number of days since 2001-01-01; you could imagine extending this approach to create __CF_COMPILE_SECONDS_OF_UNIX_EPOCH__, but of course getting the current millisecond isn't possible without compiler cooperation.
 
> Compile-time random sounds pretty cool. I can think of a couple of uses, but I'm curious what your primary uses are.
>
> Regarding a compiler-implemented constexpr function that produces a unique seed, I think you'd very quickly run into ODR issues unless you guaranteed that all translation units produced the same seed (i.e. you'd probably have to seed your seed via a macro or something). Have you considered just providing the seed directly from the outside via a compiler option, such as -D, and generating that seed via your build system? I realize that this is outside of the realm of the standard, but it's probably worth considering anyway. The facility would likely be just as useful in practice, unless I'm missing something.

In order of "plausibility" (implementability, aesthetics, simplicity), if I were you I'd be looking at:
(A) Just generate a seed in the build system and pass it in with -D.
(B) Implement a (cryptographic?) hash function H as a macro, metafunction, or constexpr function, and use H(__COUNTER__) in your source code.
(C) Implement __RANDOM__, or __builtin_random_constant(), or some such. I would strongly advise against making __builtin_rand() work in constant contexts, because that would just be crazy surprising.
(D) Implement __EPOCH_MILLISECONDS__ or some such. Notice that there are lots and lots of people asking for (the semantics, not necessarily the name) __EPOCH_SECONDS__ on StackOverflow. Also notice that GNU provides a builtin macro named __TIMESTAMP__ that holds a string representation of the last modification time of the current source file (as opposed to the __TIME__ at which it was compiled). So that name is taken.
(E) Implement some kind of built-in constexpr PRNG along the lines of your first post. :)

Notice that (A) and (B) and (C) all solve different problems:
(A) solves the problem of getting a single optionally-reproducible random number into a single place in the code.
(B) solves the problem of getting a sequence of reproducible random numbers into arbitrarily many places in the code. (Or hash __TIME__ into there too, for optional non-reproducibility as long as you're not compiling multiple times per second.)
(C) and (E) solve the problem of getting a sequence of non-reproducible random numbers into arbitrarily many places in the code.
(D) solves the problem of injecting non-reproducibility into (B) for those who like to compile multiple times per second.
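
Option (B) could be sketched roughly as follows. This assumes the vendor extension __COUNTER__ (available in GCC, Clang and MSVC), uses the SplitMix64 finalizer as a non-cryptographic stand-in for H, and the names `base_seed` and `COMPILE_TIME_RANDOM` are hypothetical:

```cpp
#include <cstdint>

// SplitMix64 mixing step, standing in for the hash function H.
// It is a bijection on 64-bit values, so distinct inputs give
// distinct outputs.
constexpr std::uint64_t splitmix64(std::uint64_t x) {
    x += 0x9e3779b97f4a7c15ull;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ull;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebull;
    return x ^ (x >> 31);
}

// Arbitrary fixed base; could instead come from -D or a hash of __TIME__.
constexpr std::uint64_t base_seed = 0x243f6a8885a308d3ull;

// Each expansion sees a fresh __COUNTER__ value, hence a fresh number.
#define COMPILE_TIME_RANDOM() (splitmix64(base_seed + __COUNTER__))

constexpr std::uint64_t r0 = COMPILE_TIME_RANDOM();
constexpr std::uint64_t r1 = COMPILE_TIME_RANDOM();
static_assert(r0 != r1, "each expansion yields a different value");
```

With a fixed `base_seed` the sequence is reproducible within a translation unit, though the values depend on how often __COUNTER__ was expanded earlier in that unit.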

So the most "plausible" feature requests I hear here are:

__COUNTER__, because it's already vendor-standard and just needs a proposal written as far as I'm aware
__RANDOM__, for people who either need a stream of pseudorandom numbers and don't want to bother with HASH(__COUNTER__), or who need a single non-reproducible random number
__EPOCH_SECONDS__, because it's in great demand

I don't like the idea of __EPOCH_MILLISECONDS__, because it doesn't really give the programmer any useful information; it's effectively indistinguishable from

constexpr int millis = __RANDOM__;
#define EPOCH_MILLISECONDS (__EPOCH_SECONDS__*1000 + millis)

I don't think it makes sense to describe a constexpr random-number function: that would be a function that gives a different result every time it's called, but (being constexpr) is guaranteed to give the same result every time it's called?
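
One way around that contradiction is to make the generator state explicit, so that every call is a pure function of its inputs. A minimal sketch assuming C++14, with xorshift32 as an arbitrary example generator and hypothetical names `Xorshift32` and `nth_value`:

```cpp
#include <cstdint>

// xorshift32: the state lives in the object, so next() has no hidden
// global state and can run inside a constant expression (C++14).
struct Xorshift32 {
    std::uint32_t state;  // must be non-zero
    constexpr std::uint32_t next() {
        state ^= state << 13;
        state ^= state >> 17;
        state ^= state << 5;
        return state;
    }
};

// Returns the n-th output (n >= 1) of a generator started from `seed`.
// Same arguments always give the same result, as constexpr requires.
constexpr std::uint32_t nth_value(std::uint32_t seed, int n) {
    Xorshift32 gen{seed};
    std::uint32_t v = 0;
    for (int i = 0; i < n; ++i) v = gen.next();
    return v;
}

static_assert(nth_value(1, 1) == nth_value(1, 1), "pure: same inputs, same output");
static_assert(nth_value(1, 1) != nth_value(1, 2), "successive outputs differ");
```

Only the seed then has to come from outside the language, which is where a macro such as the proposed __RANDOM__ would plug in.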

I do think that __RANDOM__ would solve your problem of "seed a constexpr PRNG", although it might not solve other related problems of yours or anyone else's.
I'd certainly feel awful if __RANDOM__ somehow got standardized ahead of __COUNTER__. :P

–Arthur

Zhihao Yuan

Nov 17, 2015, 8:30:21 PM
to std-pr...@isocpp.org

On Tue, Nov 17, 2015 at 2:44 PM, dgutson . <daniel...@gmail.com> wrote:
> Although I'm looking for a potential proposal, the -D idea hadn't
> occurred to me, thanks. We are actually thinking of deep gcc support
> that could be tested with its native test framework (DejaGnu), so a
> new macro or built-in is the way to go.

Leaving aside the `constexpr` part, std::seed() is still worth adding.  It's a
separation of concerns, and may be beneficial in terms of portability,
quality, and/or performance (compared to random_device).

About the `constexpr` part, my biggest concern is the quality of the
seed... I've heard epoch, time, etc., but those are all of low quality.  The
only trustworthy way that comes to mind is for the compiler to link to the C++
standard library and call std::seed(), which may be implemented using
getentropy(2)/getrandom(2)/RDSEED, etc., and substitute the result for
std::seed() when constant evaluation is requested.  Anyhow, my suggestion is:
forget about what the seed actually is when you propose this.

A side note: it might be worth allowing such a utility to generate seeds
of different widths, like 32/64 bits.  The interface may be seed<unsigned>
or something.

--
Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.
___________________________________________________
4BSD -- http://bit.ly/blog4bsd

Thiago Macieira

Nov 17, 2015, 10:37:05 PM
to std-pr...@isocpp.org
On Tuesday 17 November 2015 19:30:16 Zhihao Yuan wrote:
> On Tue, Nov 17, 2015 at 2:44 PM, dgutson . <daniel...@gmail.com> wrote:
> > Although I'm looking for a potential proposal, the -D idea hadn't
> > occurred to me, thanks. We are actually thinking of deep gcc support
> > that could be tested with its native test framework (DejaGnu), so a
> > new macro or built-in is the way to go.
>
> Leaving aside the `constexpr` part, std::seed() is still worth adding. It's a
> separation of concerns, and may be beneficial in terms of portability,
> quality, and/or performance (compared to random_device).

What would std::seed() do that std::random_device() cannot do?

> About the `constexpr` part, my biggest concern is the quality of the
> seed... I've heard epoch, time, etc., but those are all of low quality. The
> only trustworthy way that comes to mind is for the compiler to link to the C++
> standard library and call std::seed(), which may be implemented using
> getentropy(2)/getrandom(2)/RDSEED, etc., and substitute the result for
> std::seed() when constant evaluation is requested. Anyhow, my suggestion is:
> forget about what the seed actually is when you propose this.

> A side note: it might be worth allowing such a utility to generate seeds
> of different widths, like 32/64 bits. The interface may be seed<unsigned>
> or something.

std::random_device can do that.

Thiago Macieira

Nov 17, 2015, 10:47:08 PM
to std-pr...@isocpp.org
On Tuesday 17 November 2015 16:50:34 Arthur O'Dwyer wrote:
> I don't think it makes sense to describe a constexpr random-number
> function: that would be a function that gives a different result every time
> it's called, but (being constexpr) is guaranteed to give the same result
> every time it's called?
>
> I do think that __RANDOM__ would solve your problem of "seed a constexpr
> PRNG", although it might not solve other related problems of yours or
> anyone else's.
> I'd certainly feel awful if __RANDOM__ somehow got standardized ahead of
> __COUNTER__.

Note that constexpr is implicitly inline and must be present in all
translation units that use it. And by ODR, it must have the same definition
and value.

So if you were to do:

constexpr int seed = __RANDOM__;

and include that header from two source files, you'd violate ODR. The same
would go for __TIME__ or __EPOCH_SECONDS__ because different translation units
may be compiled at different times. Even __COUNTER__ is a bad idea because it
might have been used in another header.

At best, you could do in a header file:

extern const MyRomHash hash_in_rom;

which, as you can see, has no constexpr and, if you don't write the
initialisation code properly, could end up initialised dynamically (not in
ROM).

The proper way of doing this would be to access via a function that returns
the pointer to the data (a possibly unnecessary indirection).

Thiago Macieira

Nov 17, 2015, 10:57:27 PM
to std-pr...@isocpp.org
On Tuesday 17 November 2015 19:47:03 Thiago Macieira wrote:
> At best, you could do in a header file:
>
> extern const MyRomHash hash_in_rom;
>
> which, as you can see, has no constexpr and, if you don't write the
> initialisation code properly, could end up initialised dynamically (not in
> ROM).

I take this part back. This is allowed:

extern const MyRomHash hash_in_rom;
constexpr MyRomHash hash_in_rom = make_hash();
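
A single-file sketch of that pattern. `MyRomHash` and `make_hash` are the names used above, but their contents here are made up purely for illustration:

```cpp
// Hypothetical table type: a literal type, so it can be a constexpr object.
struct MyRomHash {
    unsigned seed;
    unsigned buckets[4];
};

// Hypothetical builder; a real one would lay out the table from its keys.
constexpr MyRomHash make_hash() {
    return MyRomHash{42u, {3u, 1u, 0u, 2u}};
}

// In the header: a declaration only, so every translation unit refers to
// the same object with external linkage.
extern const MyRomHash hash_in_rom;

// In exactly one source file: the definition, constant-initialised, so
// the object is eligible for placement in read-only memory.
constexpr MyRomHash hash_in_rom = make_hash();

static_assert(hash_in_rom.seed == 42u, "usable in constant expressions here");
```

Translation units that only see the `extern const` declaration cannot use the object in constant expressions, but they all agree on a single definition, which sidesteps the ODR problem.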

T. C.

Nov 18, 2015, 5:05:19 AM
to ISO C++ Standard - Future Proposals


On Tuesday, November 17, 2015 at 10:47:08 PM UTC-5, Thiago Macieira wrote:

> So if you were to do:
>
>         constexpr int seed = __RANDOM__;
>
> and include that header from two source files, you'd violate ODR.

Not necessarily. Assuming that it's at namespace scope, 'seed' defaults to internal linkage.

Of course, ODR means that you can't really use it in many contexts.
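
A small sketch of where the trouble starts, using a fixed literal as a stand-in for the hypothetical __RANDOM__ so that the example compiles. `scramble` is a made-up name:

```cpp
// Imagine this is a header included from several translation units.

// Namespace-scope constexpr variable: internal linkage, so each
// translation unit gets its own copy and the variable itself is fine.
constexpr int seed = 12345;  // stand-in for the hypothetical __RANDOM__

// constexpr functions are implicitly inline, so every translation unit
// must see the same definition. With a fixed literal that holds. If
// `seed` differed per translation unit, the definitions would use
// different values and the program would violate the ODR.
constexpr int scramble(int x) { return x ^ seed; }
```

So the variable alone is harmless, but any inline function, template, or class definition that depends on its value inherits the requirement that the value be identical everywhere.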

Zhihao Yuan

Nov 18, 2015, 9:13:55 PM
to std-pr...@isocpp.org
On Tue, Nov 17, 2015 at 9:36 PM, Thiago Macieira <thi...@macieira.org> wrote:
>
>> Leaving aside the `constexpr` part, std::seed() is still worth adding. It's a
>> separation of concerns, and may be beneficial in terms of portability,
>> quality, and/or performance (compared to random_device).
>
> What would std::seed() do that std::random_device() cannot do?

We haven't decided what std::seed() gives, which is why I call it a separation
of concerns. There are multiple questions we can raise here, from
"can random_device{}() be portably used for seeding?" (unpredictability
is not portably guaranteed) to "how high is the entropy
given by random_device{}()?" (it is not required to give full entropy),
etc., and you can see that the directions are different...

>> A side note: it might be worth allowing such a utility to generate seeds
>> of different widths, like 32/64 bits. The interface may be seed<unsigned>
>> or something.
>
> std::random_device can do that.

No, it cannot. Its `result_type` is fixed to `unsigned int`.

Rather than seed<T>, another technique I previously used is to
return an empty object, with a templated conversion operator
to IntT, so that .seed(Some-int-type) will receive the seeds of
the right width if invoked with .seed(get_seed()).
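
A rough sketch of that conversion-operator technique, under the assumption that the entropy comes from std::random_device (which, as noted above, is not guaranteed to be unpredictable on every platform). The names `seed_proxy` and `get_seed` are hypothetical:

```cpp
#include <cstdint>
#include <random>
#include <type_traits>

// Empty proxy object: the requested width is decided by the type it is
// converted to, not by the caller of get_seed().
struct seed_proxy {
    template <class IntT,
              class = typename std::enable_if<std::is_unsigned<IntT>::value>::type>
    operator IntT() const {
        std::random_device rd;
        // Gather 64 bits from two draws, then truncate to the target width.
        std::uint64_t bits = (static_cast<std::uint64_t>(rd()) << 32) ^ rd();
        return static_cast<IntT>(bits);
    }
};

inline seed_proxy get_seed() { return {}; }
```

With this, `std::uint32_t s32 = get_seed();` and `std::uint64_t s64 = get_seed();` each receive a value of their own width, and a call such as `engine.seed(get_seed())` should select the engine's integer overload and deduce `IntT` from its `result_type`.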