Random-Generator Interface similar to the ClockInterface


Andreas Heigl

Sep 26, 2022, 2:46:39 AM
to php...@googlegroups.com
Hey all.

A recent Twitter thread[1] left me thinking that, while we by now have
multiple Clock implementations to ease testing with dates and times,
there is currently no real way to ease testing with randomness.

PHP itself provides the random_int and random_bytes functions, and some
libraries provide additional sources of randomness. But there is no way
to "assert a certain random value", which makes functions that require
randomness hard to test.

And so far I have not found any interoperability standard in that
direction in the PHP ecosystem. There are some libraries (not a huge
number), but I did not find an interface for interoperability and better
testing.

I would therefore like to propose a "RandomGeneratorInterface", similar
to the "ClockInterface": an interface that lets you inject either a
"SystemRandomGenerator" or a "FrozenRandomGenerator" into a class. Tests
can then use predefined values and stay meaningful and reusable, while
production code uses a decent source of randomness.

An added advantage is that future improvements in randomness generation
can easily be adopted by applications, as only a new implementation of
the RandomGeneratorInterface needs to be injected.

My current idea of an interface would look like this, but this is open
for debate within a working group.

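interface RandomGeneratorInterface
{
    public function generateInt(int $min = PHP_INT_MIN, int $max = PHP_INT_MAX): int;

    public function generateBytes(int $length = 64): string;

    // $chars lists the characters to pick from; the default stands for [a-zA-Z0-9].
    public function generateString(
        int $length = 64,
        string $chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
    ): string;
}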

This interface is influenced by Anthony Ferrara's random-lib[2] as well
as by the PHP-internal CSPRNG functions.
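
To make the idea concrete, here is a minimal sketch of the two
implementations mentioned above. It assumes PHP 8.0 or later. The class
bodies and the ALNUM constant are illustrative assumptions, not part of
the proposal; in particular the frozen generator simply returns whatever
it was constructed with and ignores ranges and lengths.

```php
<?php
declare(strict_types=1);

// Assumption: the proposed interface, with the character set spelled out
// as a string constant so that the default value is valid PHP.
interface RandomGeneratorInterface
{
    public const ALNUM = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';

    public function generateInt(int $min = PHP_INT_MIN, int $max = PHP_INT_MAX): int;
    public function generateBytes(int $length = 64): string;
    public function generateString(int $length = 64, string $chars = self::ALNUM): string;
}

// Production implementation: delegates to PHP's CSPRNG functions.
final class SystemRandomGenerator implements RandomGeneratorInterface
{
    public function generateInt(int $min = PHP_INT_MIN, int $max = PHP_INT_MAX): int
    {
        return random_int($min, $max);
    }

    public function generateBytes(int $length = 64): string
    {
        return random_bytes($length);
    }

    public function generateString(int $length = 64, string $chars = self::ALNUM): string
    {
        $result = '';
        $last = strlen($chars) - 1;
        for ($i = 0; $i < $length; $i++) {
            // Pick one character uniformly from $chars.
            $result .= $chars[random_int(0, $last)];
        }
        return $result;
    }
}

// Test implementation: always returns the values given to the constructor.
final class FrozenRandomGenerator implements RandomGeneratorInterface
{
    public function __construct(
        private int $int,
        private string $bytes,
        private string $string
    ) {
    }

    public function generateInt(int $min = PHP_INT_MIN, int $max = PHP_INT_MAX): int
    {
        return $this->int;
    }

    public function generateBytes(int $length = 64): string
    {
        return $this->bytes;
    }

    public function generateString(int $length = 64, string $chars = self::ALNUM): string
    {
        return $this->string;
    }
}
```

A stricter frozen generator could additionally check that its canned
values fit the requested range or length.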

Is this something the FIG would benefit from?

Looking forward to a discussion.

Cheers

Andreas


[1]: https://twitter.com/ramsey/status/1574099413861306368
[2]: https://packagist.org/packages/ircmaxell/random-lib

Bruce Weirdan

Sep 26, 2022, 3:00:59 AM
to php...@googlegroups.com
On Mon, Sep 26, 2022 at 2:46 AM Andreas Heigl <and...@heigl.org> wrote:
>
> while we have by now
> multiple Clock-implementations to ease testing with Dates and times
> there is not really a way currently to ease testing with randomness.
>
> PHP itself provides the random_int and random_bytes functions and some
> libraries provide additional sources of randomness. But there is no way
> of being able to "assert a certain random value" which makes functions
> requiring randomness hard to test.
>
> And so far I have not found any interoperability standard in that
> direction in the PHP-Ecosystem. There are some libraries (not a huge
> number) but I did not find an interface for interoperability and better
> testing.

Isn't it already covered natively by https://wiki.php.net/rfc/rng_extension?

Andreas Heigl

Sep 26, 2022, 3:28:47 AM
to php...@googlegroups.com
Hey Bruce

Thanks for the feedback!
IMO Yes and No.

The rng_extension describes the Random\Engine interface. This interface
has only the `generate` method, which returns a string. The length of
this string cannot be specified.

In general the idea of the whole package was to "Create a single
Randomizer class which provides various randomization methods (like get
int/bytes, shuffle string/arrays). This class will take an Engine
interface in the constructor which can be swapped based on users needs.
Some essential RNG engines will be prepackaged for convenience but an
Interface will also be provided so that algorithms can be easily added."

The main task here is to ease adding randomizers in the core later.

It only serves interoperability to a certain extent (due to everyone
using those methods), and sadly it does not ease testing, as the
methods of the provided Randomizer class are not based on an interface.

So to be able to test this with fixed values you will have to depend on
the Random\Randomizer-class and inject your own fixed Random\Engine
implementation. There is no way to replace the Random\Randomizer with
something else.

So if that is enough interoperability, then I'm fine with that and
there is no need for any further investigation.

Cheers

Andreas

Tim Düsterhus

Sep 26, 2022, 11:52:11 AM
to php...@googlegroups.com
Hi

On 9/26/22 09:28, Andreas Heigl wrote:
>> Isn't it already covered natively by https://wiki.php.net/rfc/rng_extension?
> IMO Yes and No.
>
> The rng_extension describes the Random\Engine interface. This interface
> has only the `generate` method which returns a string. The length of
> this string can not be specified.

Indeed, because the engines represent some externally specified
algorithm that operates on a well-specified "output size".

> In general the idea of the whole package was to "Create a single
> Randomizer class which provides various randomization methods (like get
> int/bytes, shuffle string/arrays). This class will take an Engine
> interface in the constructor which can be swapped based on users needs.

ext/random provides Random\Randomizer as a high-level interface to an
engine's randomness. It will call the underlying engine as needed to
obtain enough random bits to perform the requested operation, without
introducing any biases.

> Some essential RNG engines will be prepackaged for convenience but an
> Interface will also be provided so that algorithms can be easily added."
>
> The main task here is to ease adding randomizers in the core later.
>
> The task is only to a certain extent to allow interoperability (due to
> everyone using those methods) but sadly not to ease testing as the
> methods of the provided Randomizer class are not based on an interface.
>
> So to be able to test this with fixed values you will have to depend on
> the Random\Randomizer-class and inject your own fixed Random\Engine
> implementation. There is no way to replace the Random\Randomizer with
> something else.

The engine is intended to be the pluggable part, not the Randomizer. I'd
say if you pass along an instance of Random\Randomizer, then you are
doing it wrong. Your service should create a Randomizer based on the
provided engine by itself if it wants to use the Randomizer's high level
interface.

The same is true if the Randomizer's API does not provide what you need.
If you use a library providing additional high level functionality, then
that library should take an engine. If it uses the Randomizer internally
as a building block, then that should be considered an implementation
detail.
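
A minimal sketch of that structure, assuming PHP 8.2 with ext/random.
`DiceRoller` is a hypothetical service name used only for illustration:
the constructor accepts a Random\Engine, and the Randomizer built from
it stays a private implementation detail.

```php
<?php
declare(strict_types=1);

use Random\Engine;
use Random\Engine\Secure;
use Random\Engine\Xoshiro256StarStar;
use Random\Randomizer;

// Hypothetical service: the engine is the pluggable part.
final class DiceRoller
{
    private Randomizer $randomizer;

    public function __construct(Engine $engine)
    {
        // The Randomizer is created internally and never exposed.
        $this->randomizer = new Randomizer($engine);
    }

    public function roll(): int
    {
        return $this->randomizer->getInt(1, 6);
    }
}

// Production wiring: the CSPRNG-backed engine.
$production = new DiceRoller(new Secure());

// Test wiring: a seeded engine yields a reproducible sequence.
$seeded = new DiceRoller(
    new Xoshiro256StarStar(hash('sha256', 'My PHPUnit Seed', true))
);
```

Callers only ever choose the engine; swapping the Randomizer for another
high-level helper would not change the constructor signature.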

For testing purposes you can provide an engine that is seeded with a
fixed seed (new Xoshiro256StarStar(hash('sha256', 'My PHPUnit Seed',
true)) would work), or you can provide your own userland engine that
returns whatever values you need. You just need to be careful that the
userland engine is not too biased, because otherwise the Randomizer
might be unable to generate an unbiased result and throw a
BrokenRandomEngineError.
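
As a rough illustration of the fixed-seed approach, assuming PHP 8.2
with ext/random: two engines built from the same seed produce the same
sequence, so a test can rely on reproducible values. `drawSequence` is a
hypothetical helper name.

```php
<?php
declare(strict_types=1);

use Random\Engine\Xoshiro256StarStar;
use Random\Randomizer;

// Hypothetical helper: draws $count integers in [1, 100] from an engine
// seeded with the SHA-256 hash (32 raw bytes) of $seedText.
function drawSequence(string $seedText, int $count): array
{
    $engine = new Xoshiro256StarStar(hash('sha256', $seedText, true));
    $randomizer = new Randomizer($engine);

    $values = [];
    for ($i = 0; $i < $count; $i++) {
        $values[] = $randomizer->getInt(1, 100);
    }
    return $values;
}

$first = drawSequence('My PHPUnit Seed', 5);
$second = drawSequence('My PHPUnit Seed', 5);

// Same seed, same sequence.
var_dump($first === $second);
```

The concrete numbers depend on the algorithm, so a test that hardcodes
them is tied to that engine.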

> So if that is enough of interoperability, then I'm fine with that and
> there is no need for any further investigation.
>
I'd say that "testing randomness" is in general pretty hard, and the
same is true when mocking the randomness. You can't really verify that
your logic exhibits the properties it should have. For example, you
can't test that Randomizer::getInt() really is unbiased:
https://dilbert.com/strip/2001-10-25.

The tests in PHP itself only verify that getInt() does not return
numbers outside of the requested range for a sufficiently large number of
attempts:
https://github.com/php/php-src/blob/master/ext/random/tests/03_randomizer/methods/getInt.phpt

For some superficial tests based on hardcoded results, the pluggable
Engine should be sufficient.

Best regards
Tim Düsterhus

Andreas Heigl

Sep 26, 2022, 12:53:42 PM
to php...@googlegroups.com
Hey Tim.

On 26.09.22 17:52, Tim Düsterhus wrote:
> Hi
>
> On 9/26/22 09:28, Andreas Heigl wrote:
>>> Isn't it already covered natively by
>>> https://wiki.php.net/rfc/rng_extension?
>> IMO Yes and No.
>>
>> The rng_extension describes the Random\Engine interface. This interface
>> has only the `generate` method which returns a string. The length of
>> this string can not be specified.
>
> Indeed, because the engines represent some externally specified
> algorithm that operates on a well-specified "output size".
>
>> In general the idea of the whole package was to "Create a single
>> Randomizer class which provides various randomization methods (like get
>> int/bytes, shuffle string/arrays). This class will take an Engine
>> interface in the constructor which can be swapped based on users needs.
>
> ext/random provides Random\Randomizer as a high-level interface to an
> engine's randomness. It will call the underlying engine as needed to
> obtain enough random bits to perform the requested operation, without
> introducing any biases.

As far as I understood the RFC, Random\Randomizer isn't an interface but
a final class. So there is no way to use it to inject either a
randomizer that provides random values or one that provides known values
(yes, I know very well that known values aren't random any more).
>
>> Some essential RNG engines will be prepackaged for convenience but an
>> Interface will also be provided so that algorithms can be easily added."
>>
>> The main task here is to ease adding randomizers in the core later.
>>
>> The task is only to a certain extent to allow interoperability (due to
>> everyone using those methods) but sadly not to ease testing as the
>> methods of the provided Randomizer class are not based on an interface.
>>
>> So to be able to test this with fixed values you will have to depend on
>> the Random\Randomizer-class and inject your own fixed Random\Engine
>> implementation. There is no way to replace the Random\Randomizer with
>> something else.
>
> The engine is intended to be the pluggable part, not the Randomizer. I'd
> say if you pass along an instance of Random\Randomizer, then you are
> doing it wrong. Your service should create a Randomizer based on the
> provided engine by itself if it wants to use the Randomizer's high level
> interface.
>
> The same is true if the Randomizer's API does not provide what you need.
> If you use a library providing additional high level functionality, then
> that library should take an engine. If it uses the Randomizer internally
> as a building block, then that should be considered an implementation
> detail.

The interface I was proposing here was intended for those high-level
libraries. Whether they use the Random-extension or paragonie/random or
the random_* functions is up to them and should not be part of the
proposal.

>
> For testing purposes you can provide an engine that is seeded with a
> fixed seed (new Xoshiro256StarStar(hash('sha256', 'My PHPUnit Seed',
> true)) would work) - or you can provide your own userland engine that
> provides whatever values you need. You just need to be careful that the
> userland engine is not too biased, because otherwise the Randomizer
> might be unable to generate an unbiased result and throw
> (BrokenRandomEngineError).

This is not what the proposed interface is about. The idea is to have at
least two possible implementations of a high-level interface:

* One for testing that provides a predictable set of values and
* One for production that provides an unpredictable set of values.

The proposal should *not* make assumptions about the quality of the
unpredictability, how it is achieved, or how unpredictable it really
is. That is what the implementing libraries are there for. They do a
great job at providing a facade to that complexity.
>
>> So if that is enough of interoperability, then I'm fine with that and
>> there is no need for any further investigation.
>>
> I'd say that "testing randomness" is in general pretty hard and the same is
> true when mocking the randomness. You can't really verify that your
> logic exhibits the properties it should have. For example you can't test
> that Randomizer::getInt() really is unbiased:
> https://dilbert.com/strip/2001-10-25.

The idea is explicitly *not* to test the randomness with the proposed
interface. That is the task of the library that implements said interface.

The interface proposed here should make it possible for developers to
test their code with *known* values, to make sure that certain values
will cause a defined action.

In the production environment these *known* values will then be replaced
with the random values but the randomness does not need to be tested as
that was tested in the used library.

This allows decoupled code and better separation of concerns.
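
A small sketch of what that could look like in practice, assuming PHP
8.0 or later. The interface is trimmed to a single method for brevity,
and `CouponIssuer` and `frozen()` are hypothetical names invented for
this example.

```php
<?php
declare(strict_types=1);

// Assumption: the proposed interface, reduced to the one method used here.
interface RandomGeneratorInterface
{
    public function generateInt(int $min = PHP_INT_MIN, int $max = PHP_INT_MAX): int;
}

// Hypothetical consumer: depends only on the interface.
final class CouponIssuer
{
    public function __construct(private RandomGeneratorInterface $random)
    {
    }

    public function issue(): string
    {
        // A draw of 1 out of 1..10 yields the large coupon.
        return $this->random->generateInt(1, 10) === 1 ? 'LARGE' : 'SMALL';
    }
}

// Hypothetical test double: always returns the given value.
function frozen(int $value): RandomGeneratorInterface
{
    return new class($value) implements RandomGeneratorInterface {
        public function __construct(private int $value)
        {
        }

        public function generateInt(int $min = PHP_INT_MIN, int $max = PHP_INT_MAX): int
        {
            return $this->value;
        }
    };
}

// Known values cause a defined action.
var_dump((new CouponIssuer(frozen(1)))->issue());
var_dump((new CouponIssuer(frozen(7)))->issue());
```

In production the same class would receive an implementation backed by
random_int or any other source; the test never needs to know which.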

In my opinion it targets a different use-case than the random-extension.

For me the random extension is a great way to provide the underlying
randomness within a random-library, but for the higher-level
unit-testing a decoupling from the actual implementation via a separate
interface is necessary.

Tim Düsterhus

Sep 26, 2022, 1:54:53 PM
to php...@googlegroups.com
Hi

On 9/26/22 18:53, Andreas Heigl wrote:
>> ext/random provides Random\Randomizer as a high-level interface to an
>> engine's randomness. It will call the underlying engine as needed to
>> obtain enough random bits to perform the requested operation, without
>> introducing any biases.
>
> As far as I understood the RFC Random\Randomizer isn't an interface but
> a final class. So there is no way to use that to either inject a random
> providing randomizer or a known values providing randomizer (yes! I know
> very well that known values aren't random any more).

Yes, that is correct.

>> The engine is intended to be the pluggable part, not the Randomizer. I'd
>> say if you pass along an instance of Random\Randomizer, then you are
>> doing it wrong. Your service should create a Randomizer based on the
>> provided engine by itself if it wants to use the Randomizer's high level
>> interface.
>>
>> The same is true if the Randomizer's API does not provide what you need.
>> If you use a library providing additional high level functionality, then
>> that library should take an engine. If it uses the Randomizer internally
>> as a building block, then that should be considered an implementation
>> detail.
>
> The interface I was proposing here was intended for those high-level
> libraries. Whether they use the Random-extension or paragonie/random or
> the random_* functions is up to them and should not be part of the
> proposal.

I don't see much value in that, see below.

>> For testing purposes you can provide an engine that is seeded with a
>> fixed seed (new Xoshiro256StarStar(hash('sha256', 'My PHPUnit Seed',
>> true)) would work) - or you can provide your own userland engine that
>> provides whatever values you need. You just need to be careful that the
>> userland engine is not too biased, because otherwise the Randomizer
>> might be unable to generate an unbiased result and throw
>> (BrokenRandomEngineError).
>
> This is not what the proposed interface is about. The idea is to have at
> least two possible implementations of a high-level interface:
>

Such an interface would be incredibly limiting, because there are all
kinds of things that you could randomly generate:

- Uniformly distributed integers within a given range
- Integers that follow a binomial distribution
- Strings with arbitrary bytes
- Strings with bytes taken from some input string ("Replacement sampling")
- Strings containing a UUIDv4
- A UUID object containing a UUIDv4
- Uniformly distributed floats in [0, 1)
- booleans ("Bernoulli distribution")
- DateTimeImmutables

And there are many more examples in https://github.com/FakerPHP/Faker.
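
To illustrate how different kinds of values can be derived from one
engine, here is a rough sketch assuming PHP 8.2 with ext/random. The
helper names `randomBool` and `randomUuidV4` are hypothetical, and the
UUID formatting is a simplified illustration rather than a replacement
for a UUID library.

```php
<?php
declare(strict_types=1);

use Random\Engine;
use Random\Engine\Xoshiro256StarStar;
use Random\Randomizer;

// Hypothetical helper: a fair coin flip (Bernoulli with p = 0.5).
function randomBool(Engine $engine): bool
{
    return (new Randomizer($engine))->getInt(0, 1) === 1;
}

// Hypothetical helper: a string containing a version 4 UUID.
function randomUuidV4(Engine $engine): string
{
    $bytes = (new Randomizer($engine))->getBytes(16);

    // Set the version nibble to 4 and the variant bits to 10xx.
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);

    // 32 hex characters split into 8 chunks, laid out as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

$engine = new Xoshiro256StarStar(hash('sha256', 'My PHPUnit Seed', true));

$flag = randomBool($engine);
$uuid = randomUuidV4($engine);
```

Both helpers take only an engine, so neither needs a shared high-level
interface to be testable with a seeded engine.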

>>> So if that is enough of interoperability, then I'm fine with that and
>>> there is no need for any further investigation.
>>>
>> I'd say that "testing randomness" is in general pretty hard and the same is
>> true when mocking the randomness. You can't really verify that your
>> logic exhibits the properties it should have. For example you can't test
>> that Randomizer::getInt() really is unbiased:
>> https://dilbert.com/strip/2001-10-25.
>
> The idea is explicitly *not* to test the randomness with the proposed
> interface. That is the task of the library that implements said interface.
>
> The here proposed interface should make it possible for the developer to
> test their code with *known* values, to make sure that certain values
> will cause a defined action.

I would expect that the "randomness" would usually be generated in the
outer layers (e.g. the controller) and the randomly generated value
(e.g. a randomly generated UUID object) would then be passed to the
functions that do whatever they need to do. These functions could then be
independently tested with specific values to ensure that they properly
handle all the edge cases. For an end-to-end / integration test a seeded
engine can be provided.
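
A minimal sketch of that layering, assuming nothing beyond random_int.
`discountForRoll` and `handleRequest` are hypothetical names, and the
discount rules are made up for the example.

```php
<?php
declare(strict_types=1);

// Hypothetical inner function: deterministic, receives the value that
// was already generated elsewhere.
function discountForRoll(int $roll): int
{
    if ($roll < 1 || $roll > 100) {
        throw new InvalidArgumentException('Roll must be between 1 and 100.');
    }
    if ($roll === 100) {
        return 50;
    }
    if ($roll > 90) {
        return 10;
    }
    return 0;
}

// Hypothetical outer layer (e.g. a controller): the only place that
// touches the random source.
function handleRequest(): int
{
    return discountForRoll(random_int(1, 100));
}

// Edge cases of the inner function can be checked with specific values.
var_dump(discountForRoll(100)); // int(50)
var_dump(discountForRoll(91));  // int(10)
var_dump(discountForRoll(90));  // int(0)
```

The inner function needs no mock at all; only the thin outer layer
depends on the random source.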

> For me the random extension is a great way to provide the underlying
> randomness within a random-library, but for the higher-level
> unit-testing a decoupling from the actual implementation via a separate
> interface is necessary.
>

Best regards
Tim Düsterhus