Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

One agnostic random function for big project (4 GB of code)

43 views
Skip to first unread message

Frederick Gotham

unread,
Dec 12, 2019, 6:53:23 AM12/12/19
to

I'm working on a large embedded Linux project with about 4 gigs of source code.

I'm trying to create one header file that can be included in any sub-project, which will have a function for generating random bytes of data. The only two requirement of using this random number generator will be:
(1) You must include "agnostic_random.hpp" in the source file(s) where it's used
(2) When linking, you must link with "libdl.so"

In this header file, I'm taking advantage of the "inline" keyword because of its unique linking characteristics, specifically that I can have a static object inside an inline function and it will only exist once (and it will only be initialised once for the current process).

This random number generator must be fully thread-safe.

At the moment I've written it to use "RAND_bytes" provided by OpenSSL, but this might change later (e.g. I might use tpm2-tss directly).

Please give me a critique of this header file and let me know if I've overlooked anything.

#ifndef HPP_INCLUSION_GUARD_AGNOSTIC_RANDOM
#define HPP_INCLUSION_GUARD_AGNOSTIC_RANDOM

#ifndef __cplusplus
# error "This is a C++ header file but you've included it in a C source file (or you've used a C compiler on a C++ source file). Don't do that."
#endif

#include <cstdint> /* uintmax_t */

#include <atomic> /* atomic_flag */
#include <limits> /* numeric_limits<int> */

// The next two lines are declarations of functions found in
// "libdl.so". (Unfortunately the compiler won't let me put
// these declarations inside the body of a function).
extern "C" void *dlopen(char const *, int);
extern "C" void *dlsym(void *,char const *);

inline void Agnostic_Random(void *pvoid, std::uintmax_t len_bytes) // len_bytes can be 0
{
static std::atomic_flag is_not_first_time = ATOMIC_FLAG_INIT;

typedef int (*RAND_bytes_t)(char unsigned *, int);
static std::atomic<RAND_bytes_t> ptr_to_RAND_bytes(nullptr);

if ( !is_not_first_time.test_and_set() )
{
void *const handle = dlopen("libcrypto.so", 1); /* 1 == RTLD_LAZY */

if ( nullptr != handle )
ptr_to_RAND_bytes = reinterpret_cast<RAND_bytes_t>( dlsym(handle, "RAND_bytes") );
}

if ( nullptr == ptr_to_RAND_bytes )
return;

while ( 0 != len_bytes )
{
int bytes_this_time; // RAND_bytes unfortunately takes an 'int' for length

if ( len_bytes > std::numeric_limits<int>::max() )
bytes_this_time = std::numeric_limits<int>::max();
else
bytes_this_time = len_bytes;

ptr_to_RAND_bytes( static_cast<char unsigned *>(pvoid), bytes_this_time );

pvoid = static_cast<char*>(pvoid) + bytes_this_time; // Cast to char pointer for arithmetic
len_bytes -= bytes_this_time;
}
}

#endif // ifndef HPP_INCLUSION_GUARD_JCI_INSIGHT_RANDOM

Bart

unread,
Dec 12, 2019, 7:02:15 AM12/12/19
to
On 12/12/2019 11:53, Frederick Gotham wrote:
>
> I'm working on a large embedded Linux project with about 4 gigs of source code.

4GB of source files?

Out of interest, how long does it take to compile that lot?

Frederick Gotham

unread,
Dec 12, 2019, 7:06:37 AM12/12/19
to
Around about 55min - 65min with our PC's that have 20 cores.

bol...@nowhere.co.uk

unread,
Dec 12, 2019, 7:23:06 AM12/12/19
to
Even aircraft avionic systems don't require that much code. Sounds to me
like a load of inefficient framework junk has been thrown in the mix but
even then I'm struggling to see how any system can require that much code
no matter what it does.

Paavo Helde

unread,
Dec 12, 2019, 7:31:43 AM12/12/19
to
On 12.12.2019 13:53, Frederick Gotham wrote:
>
> I'm working on a large embedded Linux project with about 4 gigs of source code.
>
> I'm trying to create one header file that can be included in any sub-project, which will have a function for generating random bytes of data. The only two requirement of using this random number generator will be:
> (1) You must include "agnostic_random.hpp" in the source file(s) where it's used
> (2) When linking, you must link with "libdl.so"
>
> In this header file, I'm taking advantage of the "inline" keyword because of its unique linking characteristics, specifically that I can have a static object inside an inline function and it will only exist once (and it will only be initialised once for the current process).
>
> This random number generator must be fully thread-safe.
>
> At the moment I've written it to use "RAND_bytes" provided by OpenSSL, but this might change later (e.g. I might use tpm2-tss directly).
>
> Please give me a critique of this header file and let me know if I've overlooked anything.

TBH, at the moment it looks like you are trying to sneak in "accidental"
back-doors into this large Linux project. More below.

>
> #ifndef HPP_INCLUSION_GUARD_AGNOSTIC_RANDOM
> #define HPP_INCLUSION_GUARD_AGNOSTIC_RANDOM
>
> #ifndef __cplusplus
> # error "This is a C++ header file but you've included it in a C source file (or you've used a C compiler on a C++ source file). Don't do that."
> #endif
>
> #include <cstdint> /* uintmax_t */
>
> #include <atomic> /* atomic_flag */
> #include <limits> /* numeric_limits<int> */
>
> // The next two lines are declarations of functions found in
> // "libdl.so". (Unfortunately the compiler won't let me put
> // these declarations inside the body of a function).
> extern "C" void *dlopen(char const *, int);
> extern "C" void *dlsym(void *,char const *);
>

Namespace is missing. All C++ code should be in a namespace, especially
if used so widely.

> inline void Agnostic_Random(void *pvoid, std::uintmax_t len_bytes) // len_bytes can be 0

Buffer length should be in size_t, not some random type.

> {
> static std::atomic_flag is_not_first_time = ATOMIC_FLAG_INIT;
>
> typedef int (*RAND_bytes_t)(char unsigned *, int);
> static std::atomic<RAND_bytes_t> ptr_to_RAND_bytes(nullptr);
>
> if ( !is_not_first_time.test_and_set() )
> {
> void *const handle = dlopen("libcrypto.so", 1); /* 1 == RTLD_LAZY */

The handle is leaked. There should be a matching call to dlclose(). This
could be achieved pretty easily by having a static C++ class object with
a destructor.

>
> if ( nullptr != handle )
> ptr_to_RAND_bytes = reinterpret_cast<RAND_bytes_t>( dlsym(handle, "RAND_bytes") );
> }

A race condition. There is a time slice when the flag is set, but the
function pointer is not yet assigned. If another thread calls this
function during this time, it will see a wrong (NULL) function pointer.

Instead suggesting a simple:

static RAND_bytes_t ptr_to_RAND_bytes = SomeInitFunction();

This is guaranteed to do the correct thing since C++11.

> if ( nullptr == ptr_to_RAND_bytes )
> return;

No error return or exception throw. Extremely bad style, especially if
this has anything to do with crypto.

>
> while ( 0 != len_bytes )
> {
> int bytes_this_time; // RAND_bytes unfortunately takes an 'int' for length
>
> if ( len_bytes > std::numeric_limits<int>::max() )

Signed-unsigned comparison, compilers will produce warnings.

> bytes_this_time = std::numeric_limits<int>::max();
> else
> bytes_this_time = len_bytes;
>
> ptr_to_RAND_bytes( static_cast<char unsigned *>(pvoid), bytes_this_time );

No error check! Extremely bad style, especially in anything related to
crypto!

>
> pvoid = static_cast<char*>(pvoid) + bytes_this_time; // Cast to char pointer for arithmetic
> len_bytes -= bytes_this_time;
> }
> }
>
> #endif // ifndef HPP_INCLUSION_GUARD_JCI_INSIGHT_RANDOM

Comment is wrong.


Frederick Gotham

unread,
Dec 12, 2019, 7:48:11 AM12/12/19
to
On Thursday, December 12, 2019 at 12:31:43 PM UTC, Paavo Helde wrote:

> > extern "C" void *dlopen(char const *, int);
> > extern "C" void *dlsym(void *,char const *);
> >
>
> Namespace is missing. All C++ code should be in a namespace, especially
> if used so widely.


"libdl.so" is a dynamically-loaded shared library on Linux, and it's written in C. If you know how to link with a C library and put everything inside a namespace then please tell me how. I can't change the "libdl.so" library in this project.


> Buffer length should be in size_t, not some random type.


I chose uintmax_t specifically because I want the biggest native type. I could always use a "bignum" library if I wanted even bigger. (Although 16.8 million terabytes can be addressed with a 64-Bit unsigned int).


> The handle is leaked. There should be a matching call to dlclose(). This
> could be achieved pretty easily by having a static C++ class object with
> a destructor.


Linux handles all that when the process dies.


> A race condition. There is a time slice when the flag is set, but the
> function pointer is not yet assigned. If another thread calls this
> function during this time, it will see a wrong (NULL) function pointer.
>
> Instead suggesting a simple:
>
> static RAND_bytes_t ptr_to_RAND_bytes = SomeInitFunction();
>
> This is guaranteed to do the correct thing since C++11.


Thank you, I missed this race condition.

<snip>
> No error return or exception throw. Extremely bad style, especially if
> this has anything to do with crypto.
<snip>
> No error check! Extremely bad style, especially in anything related to
> crypto!


That's another day's work..... I've to chat to the team(s) about what should happen when we can't get random numbers from the TPM2 chip. I might even have to power off the unit.

Paavo Helde

unread,
Dec 12, 2019, 8:15:59 AM12/12/19
to
On 12.12.2019 14:47, Frederick Gotham wrote:
> On Thursday, December 12, 2019 at 12:31:43 PM UTC, Paavo Helde wrote:
>
>>> extern "C" void *dlopen(char const *, int);
>>> extern "C" void *dlsym(void *,char const *);
>>>
>>
>> Namespace is missing. All C++ code should be in a namespace, especially
>> if used so widely.
>
>
> "libdl.so" is a dynamically-loaded shared library on Linux, and it's written in C. If you know how to link with a C library and put everything inside a namespace then please tell me how. I can't change the "libdl.so" library in this project.
>

Sorry, I meant that your Agnostic_Random() should be in a namespace.

C library interfaces are indeed in the global namespace, that's one of
the reasons why all C++ code should be in a namespace.

>
>> Buffer length should be in size_t, not some random type.
>
>
> I chose uintmax_t specifically because I want the biggest native type. I could always use a "bignum" library if I wanted even bigger. (Although 16.8 million terabytes can be addressed with a 64-Bit unsigned int).

The caller cannot have a buffer which does not fit in std::size_t. Using
anything else is just obfuscation.

>
>> The handle is leaked. There should be a matching call to dlclose(). This
>> could be achieved pretty easily by having a static C++ class object with
>> a destructor.
>
>
> Linux handles all that when the process dies.

That's why I wrote "should", not "must".

>
>> A race condition. There is a time slice when the flag is set, but the
>> function pointer is not yet assigned. If another thread calls this
>> function during this time, it will see a wrong (NULL) function pointer.
>>
>> Instead suggesting a simple:
>>
>> static RAND_bytes_t ptr_to_RAND_bytes = SomeInitFunction();
>>
>> This is guaranteed to do the correct thing since C++11.
>
>
> Thank you, I missed this race condition.

You are welcome!

> <snip>
>> No error return or exception throw. Extremely bad style, especially if
>> this has anything to do with crypto.
> <snip>
>> No error check! Extremely bad style, especially in anything related to
>> crypto!
>
>
> That's another day's work..... I've to chat to the team(s) about what should happen when we can't get random numbers from the TPM2 chip. I might even have to power off the unit.

Seems a good idea.


Frederick Gotham

unread,
Dec 12, 2019, 8:27:25 AM12/12/19
to
On Thursday, December 12, 2019 at 1:15:59 PM UTC, Paavo Helde wrote:

> >> static RAND_bytes_t ptr_to_RAND_bytes = SomeInitFunction();
> >>
> >> This is guaranteed to do the correct thing since C++11.


Let me try to get my ahead around this.

So you're saying. . . that since C++11. . .

If you have an object of static duration inside the body of an inline function which is defined in a header file, and if this object has an initialiser, then even in a multi-threaded program where this function might be "re-entered" and have two or three simultaneous executions, it is guaranteed that the initialiser of the object will only be evaluated once.

Is that what you're saying?

Well I tip my hat the compiler writer's if they managed to pull that off. Can you point me to the relevant paragraph in the C++11 Standard?

I definitely think I would protect that line of code as follows:

#if __cplusplus >= 201103L
/* Let the compiler worry about any race conditions here */
static RAND_bytes_t ptr_to_RAND_bytes = SomeInitFunction();
#else
# error "I need a compiler with supreme built-in thread safety. This one won't do."
#endif

Paavo Helde

unread,
Dec 12, 2019, 9:26:03 AM12/12/19
to
On 12.12.2019 15:27, Frederick Gotham wrote:
> On Thursday, December 12, 2019 at 1:15:59 PM UTC, Paavo Helde wrote:
>
>>>> static RAND_bytes_t ptr_to_RAND_bytes = SomeInitFunction();
>>>>
>>>> This is guaranteed to do the correct thing since C++11.
>
>
> Let me try to get my ahead around this.
>
> So you're saying. . . that since C++11. . .
>
> If you have an object of static duration inside the body of an inline function which is defined in a header file, and if this object has an initialiser, then even in a multi-threaded program where this function might be "re-entered" and have two or three simultaneous executions, it is guaranteed that the initialiser of the object will only be evaluated once.
>
> Is that what you're saying?

Yes.
>
> Well I tip my hat the compiler writer's if they managed to pull that off. Can you point me to the relevant paragraph in the C++11 Standard?

This is in 9.7 in the latest standard [stmt.dcl]:

"Dynamic initialization of a block-scope variable with static storage
duration (6.7.1) or thread storage duration (6.7.2) is performed the
first time control passes through its declaration; [...] If control
enters the declaration concurrently while the variable is being
initialized, the concurrent execution shall wait for completion of the
initialization."

>
> I definitely think I would protect that line of code as follows:
>
> #if __cplusplus >= 201103L
> /* Let the compiler worry about any race conditions here */
> static RAND_bytes_t ptr_to_RAND_bytes = SomeInitFunction();
> #else
> # error "I need a compiler with supreme built-in thread safety. This one won't do."
> #endif

Beware this would probably not work in MSVC, at least not yet. They have
been very reluctant to update __cplusplus.


Frederick Gotham

unread,
Dec 12, 2019, 9:33:45 AM12/12/19
to
On Thursday, December 12, 2019 at 2:26:03 PM UTC, Paavo Helde wrote:

> > Is that what you're saying?
>
> Yes.


okay here's Take Two:

#ifndef HPP_INCLUSION_GUARD_AGNOSTIC_RANDOM
#define HPP_INCLUSION_GUARD_AGNOSTIC_RANDOM

#ifndef __cplusplus
# error "This is a C++ header file but you've included it in a C source file (or you've used a C compiler on a C++ source file). You can't do that."
#endif

#include <cstddef> /* size_t */

#include <limits> /* numeric_limits<int> */
#include <stdexcept> /* runtime_error */

// The next two lines are declarations of functions found in
// "libdl.so". (Unfortunately the compiler won't let me put
// these declarations inside the body of a function).
extern "C" void *dlopen(char const *, int);
extern "C" void *dlsym(void *,char const *);

namespace private_namespace_4e970662b9b64fb8a6bb1aea3aa7f631 { /* This is a 128-bit UUID */

typedef int (*RAND_bytes_t)(char unsigned *, int);

inline RAND_bytes_t GetFuncPointer(void)
{
void *const handle = dlopen("libcrypto.so", 1); /* 1 == RTLD_LAZY */

if ( nullptr == handle )
throw std::runtime_error("Agnostic_Random : dlopen : Could not load libcrypto.so");

RAND_bytes_t const pfunc = reinterpret_cast<RAND_bytes_t>(dlsym(handle, "RAND_bytes")); // Linux allows func_pointer = obj_pointer

if ( nullptr == pfunc )
throw std::runtime_error("Agnostic_Random : dlsym : Could not get address of RAND_bytes in libcrypto.so");

return pfunc;
}
}

inline void Agnostic_Random(void *pvoid, std::size_t len_bytes) // len_bytes can be 0
{
using namespace private_namespace_4e970662b9b64fb8a6bb1aea3aa7f631;

# if __cplusplus >= 201103L
/* Let the C++11 compiler solve any race conditions here */
static RAND_bytes_t const ptr_to_RAND_bytes = GetFuncPointer();
# else
# error "I need a compiler with supreme built-in thread safety. This one won't do."
# endif

while ( 0 != len_bytes )
{
int bytes_this_time; // RAND_bytes unfortunately takes an 'int' for length

if ( len_bytes > std::numeric_limits<int>::max() )
bytes_this_time = std::numeric_limits<int>::max();
else
bytes_this_time = len_bytes;

ptr_to_RAND_bytes( static_cast<char unsigned *>(pvoid), bytes_this_time );

pvoid = static_cast<char*>(pvoid) + bytes_this_time; // Cast to char pointer for arithmetic
len_bytes -= bytes_this_time;
}
}

#endif // ifndef HPP_INCLUSION_GUARD_AGNOSTIC_RANDOM

Paavo Helde

unread,
Dec 12, 2019, 9:40:19 AM12/12/19
to
On 12.12.2019 16:25, Paavo Helde wrote:

> This is in 9.7 in the latest standard [stmt.dcl]:

> If control
> enters the declaration concurrently while the variable is being
> initialized, the concurrent execution shall wait for completion of the
> initialization."

In C++11 standard this is in 6.7/4 [stmt.dcl]. Identical wording.


Frederick Gotham

unread,
Dec 12, 2019, 10:26:35 AM12/12/19
to
On Thursday, December 12, 2019 at 2:40:19 PM UTC, Paavo Helde wrote:

> In C++11 standard this is in 6.7/4 [stmt.dcl]. Identical wording.

I think you missed my post at 2:33pm

Paavo Helde

unread,
Dec 12, 2019, 10:37:36 AM12/12/19
to
I don't have to answer all posts, there are other people in this NG too.
Anyway, the new version 2:33 pm post looked better, though the
Agnostic_Random() function was still in global namespace and there was
still no error check on the actual function call.

If Agnostic_Random() were in a namespace with a meaningful name, then
you would not need to invent strange UUID namespace names, a simple
inner 'namespace detail' would suffice.

Frederick Gotham

unread,
Dec 12, 2019, 10:49:05 AM12/12/19
to
On Thursday, December 12, 2019 at 3:37:36 PM UTC, Paavo Helde wrote:

> > I think you missed my post at 2:33pm
>
> I don't have to answer all posts, there are other people in this NG too.


I mentioned it because our msgs crossed in the post.


> Anyway, the new version 2:33 pm post looked better, though the
> Agnostic_Random() function was still in global namespace and there was
> still no error check on the actual function call.


I've read the C++11 standard where it says about thread safety and the initialisation of static variables inside functions, and I want to take advantage of the part about what happens when the initialiser throws an exception (it's quite intricate).

I'm keeping the function in the global namespace with the name of my company and the product prefixed to it. So something like "Volkswagon_Beatle_Random". I'm doing this because there will be a few different RNG's available (e.g. C++ Standard, OpenSSL, Intel RDrand), and so I want my function name to be verbose. I don't want someone doing something like:

void SomeFunc(void)
{
using namespace_for_my_company_and_product;
Rand();
}

Instead I want it verbose like this:

void SomeFunc(void)
{
Company_And_Product_Rand();
}

Also I might have to write it in C also, and so I want the function names to be the same.

Frederick Gotham

unread,
Dec 12, 2019, 11:27:27 AM12/12/19
to
On Thursday, December 12, 2019 at 3:49:05 PM UTC, Frederick Gotham wrote:

> Also I might have to write it in C also, and so I want the function names to be the same.


I realise that this is a C++ newsgroup, but anyway here's what I've got so far with porting it to C. There's no namespaces so I've put a UUID suffix on all global identifiers.


#ifndef H_INCLUSION_GUARD_COMPANY_PRODUCT_RANDOM
#define H_INCLUSION_GUARD_COMPANY_PRODUCT_RANDOM

#ifdef __cplusplus
# error "This is a C header file but you've included it in a C++ source file (or you've used a C++ compiler on a C source file). You can't do that."
#endif

#include <stdbool.h> /* bool */

#include <stddef.h> /* NULL, size_t */

#include <limits.h> /* INT_MAX */

#include <pthread.h>

// The next two lines are declarations of functions found in
// "libdl.so". (Unfortunately the compiler won't let me put
// these declarations inside the body of a function).
extern void *dlopen(char const *, int);
extern void *dlsym(void *,char const *);

typedef int (*RAND_bytes_t_35466a8ef14e4bc2b50bfc28a6d19b66)(char unsigned *, int);

bool g_random_first_time_35466a8ef14e4bc2b50bfc28a6d19b66 = true;

RAND_bytes_t_35466a8ef14e4bc2b50bfc28a6d19b66 g_ptr_to_RAND_bytes_35466a8ef14e4bc2b50bfc28a6d19b66 = NULL;

pthread_mutex_t g_mutex_35466a8ef14e4bc2b50bfc28a6d19b66 = PTHREAD_MUTEX_INITIALIZER;

inline void Company_Product_Random(void *pvoid, size_t len_bytes) // len_bytes can be 0
{
pthread_mutex_lock(&g_mutex_35466a8ef14e4bc2b50bfc28a6d19b66);

if ( g_random_first_time_35466a8ef14e4bc2b50bfc28a6d19b66 )
{
g_random_first_time_35466a8ef14e4bc2b50bfc28a6d19b66 = false;

void *const handle = dlopen("libcrypto.so", 1); /* 1 == RTLD_LAZY */

if ( NULL == handle )
return;

g_ptr_to_RAND_bytes_35466a8ef14e4bc2b50bfc28a6d19b66 = \
(RAND_bytes_t_35466a8ef14e4bc2b50bfc28a6d19b66)dlsym(handle, "RAND_bytes"); // Linux allows func_pointer = obj_pointer
}

pthread_mutex_unlock(&g_mutex_35466a8ef14e4bc2b50bfc28a6d19b66);

if ( NULL == g_ptr_to_RAND_bytes_35466a8ef14e4bc2b50bfc28a6d19b66 )
return;

while ( 0 != len_bytes )
{
int bytes_this_time; // RAND_bytes unfortunately takes an 'int' for length

if ( len_bytes > (size_t)INT_MAX )
bytes_this_time = INT_MAX;
else
bytes_this_time = len_bytes;

g_ptr_to_RAND_bytes_35466a8ef14e4bc2b50bfc28a6d19b66( (char unsigned *)pvoid, bytes_this_time );

pvoid = (char*)pvoid + bytes_this_time; // Cast to char pointer for arithmetic
len_bytes -= bytes_this_time;
}
}

#endif // ifndef H_INCLUSION_GUARD_COMPANY_PRODUCT_RANDOM


Manfred

unread,
Dec 12, 2019, 11:42:01 AM12/12/19
to
On 12/12/2019 1:31 PM, Paavo Helde wrote:
>> #ifndef HPP_INCLUSION_GUARD_AGNOSTIC_RANDOM
>> #define HPP_INCLUSION_GUARD_AGNOSTIC_RANDOM
>>
>> #ifndef __cplusplus
>> #    error "This is a C++ header file but you've included it in a C
>> source file (or you've used a C compiler on a C++ source file). Don't
>> do that."
>> #endif
>>
>> #include <cstdint>  /* uintmax_t */
>>
>> #include <atomic>   /* atomic_flag */
>> #include <limits>   /* numeric_limits<int> */
>>
>> // The next two lines are declarations of functions found in
>> // "libdl.so". (Unfortunately the compiler won't let me put
>> // these declarations inside the body of a function).
>> extern "C" void *dlopen(char const *, int);
>> extern "C" void *dlsym(void *,char const *);
>>
>
> Namespace is missing. All C++ code should be in a namespace, especially
> if used so widely.

Actually dlopen, dlclose (missing) and dlsym are standard POSIX
functions which should not be declared this way, #include <dlfcn.h> instead.
This would land them in the global namespace, but since they belong to
some system library this is how it should be IMO.
If these declrations should be hidden from clients, then a C++ wrapper
as you suggest later is more appropriate.

Vir Campestris

unread,
Dec 13, 2019, 4:51:56 PM12/13/19
to
Look at icecream

https://github.com/icecc/icecream

It shares your compilation around the network.

Andy
0 new messages