Any function where a zero would propagate.
This can be exactly as bad as accidentally comparing a NULL in SQL.
-Craig
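To make the propagation concrete, here is a minimal Python sketch (the `scaled_stat` helper is hypothetical, not code from this thread):

```python
import random

def scaled_stat(base, rng=random.random):
    """Hypothetical stat helper: scale a base value by a random factor."""
    return base * rng()

# A 0.0 draw zeroes out the whole product, and the zero then
# propagates through every later multiplication:
assert scaled_stat(100, rng=lambda: 0.0) == 0.0
assert 0.0 <= scaled_stat(100) < 100.0   # random() is in [0.0, 1.0)
```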
That's too vague for me.
Are you saying it is a common enough use pattern to divide by a
random number? Are there other reasons why a float() =:= 0.0 is fatal?
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB
It is relatively common whenever it is guaranteed to be safe! Otherwise it becomes a guarded expression.
Sure, that is a case of "well, just write it so that it can't do that" -- but the original function spec told us we didn't need to do that, so there is code out there that would rely on not using a factor of 0.0. I've probably written some in game servers, actually.
Propagating the product of multiplication by 0.0 is the more common problem I've seen, by the way, as opposed to division.
Consider: character stat generation in games, offset-by-random-factor calculations where accidentally getting exactly the same result is catastrophic, anti-precision routines in some aiming devices and simulations, adding wiggle to character pathfinding, unstuck() type routines, mutating a value in evolutionary design algorithms, and so on.
Very few of these cases are catastrophic and many would simply be applied again if the initial attempt failed, but a few can be very bad depending on how the system in which they are used is designed. The problem isn't so much that "there aren't many use cases" or "the uses aren't common" as much as the API was originally documented that way, and it has changed for no apparent reason. Zero has a very special place in mathematics and should be treated carefully.
I think ROK would have objected a lot less had the original spec been 0.0 =< X =< 1.0 (which is different from being 0.0 =< X < 1.0; which is another point of potentially dangerous weirdness). I'm curious to see what examples he comes up with. The ones above are just off the top of my head, and like I mentioned most of my personal examples don't happen to be really catastrophic in most cases because many of them involve offsetting from a known value (which would be relatively safe to reuse) or situations where failures are implicitly assumed to provoke retries.
-Craig
The spec did not match the reality. Either had to be corrected.
It is in general safer to change the documentation to match the reality.
So I do not agree that the spec changed for no apparent reason.
Furthermore, Java's Random.nextFloat(), Python's random.random() and
Ruby's Random.rand all generate in the same interval:
http://docs.oracle.com/javase/6/docs/api/java/util/Random.html#nextFloat()
https://docs.python.org/3/library/random.html#random.random
http://ruby-doc.org/core-2.0.0/Random.html#method-i-rand
I think this all boils down to the fact that digital floating point values
(IEEE 754) have limited precision and, in the interval 0.0 to 1.0, are better
regarded as 53-bit fixed-point values.
A half-open interval matches integer random number generators, which also
in general use half-open intervals.
With half-open intervals you can generate numbers in [0.0,1.0) and other
numbers in [1.0,2.0), where the number 1.0 belongs to only one of these intervals.
This, I think, is a good default behaviour.
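A quick Python sketch of the partition argument (Python's random.random() uses the same [0.0, 1.0) convention cited above; `draw` is an illustrative helper):

```python
import math
import random

def draw(lo, hi):
    """Uniform draw from the half-open interval [lo, hi)."""
    return lo + (hi - lo) * random.random()   # random() is in [0.0, 1.0)

# [0.0, 1.0) and [1.0, 2.0) tile the number line without overlap:
# 1.0 belongs to exactly one interval, so classifying a draw by
# math.floor() is unambiguous.
x = draw(0.0, 1.0)
y = draw(1.0, 2.0)
assert math.floor(x) == 0
assert math.floor(y) == 1
```

With closed intervals the boundary value 1.0 would belong to both intervals, and the floor-based classification would no longer be well defined.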
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB
> It is in general safer to change the documentation to match the reality.
Wow.
I certainly hope this is not the general policy for OTP. We program
against the documentation. The documentation *is* our reality.
It also seems it's not even listed in the release notes. We program
against the documentation; if the documentation has breaking changes,
it would be great to know about them.
--
Loïc Hoguin
https://ninenines.eu
> I certainly hope this is not the general policy for OTP. We program
> against the documentation. The documentation *is* our reality.
I'm not arguing against that, but rather against the claim that it's
"in general safer to change the documentation".
It should be a case by case basis but it's also important to recognize
that users write software against the documentation, and to take this
into account when making breaking changes.
To give an example, if such a thing were to happen in Cowboy, which
follows semver, two cases could happen:
* The documentation is wrong but the next version is a patch release:
fix the code to match the documentation. The rule is: don't break
people's programs.
* The documentation is wrong but the next version is a major release:
fix the documentation AND announce it as a breaking change (with all
details; and probably release a patch release for the old version as
above). The rule is: breaking people's programs is OK, just make sure
you tell them about it!
> I don't think you can make blanket statements on which way you should
> lean because there are good counterexamples in both "camps" so to speak.
Properly matching people's expectations is a lot more important than
whatever counterexamples may exist.
I disagree. Take this example:
https://lwn.net/SubscriberLink/732420/9b9f8f2825f1877f/ The printk()
function in the Linux kernel was documented to print new logs to new
lines unless the KERN_CONT option was passed. In reality it didn't
always start new lines, and people expected (maybe even relied on)
this - and when the code was updated to match the documentation, they
were genuinely surprised when their code broke.
The other route is to make existing functions do what they say they are going to whenever possible, add functions that provide the prescribed functionality, and deprecate and annotate (with warnings where appropriate) the ones that cannot provide whatever they originally claimed to. And be quite noisy about all of this.
OTP has many, many examples of this. It prevents surprise breakage of old code that depends on some particular (and occasionally peculiar) behavior while forging a path ahead -- allowing users to make an informed decision to review and update old code or stick with an older version of the runtime (which tends to be the more costly choice in many cases, but at least it can be an informed decision).
Consider what happened with now/0, for example. Now we have a more complex family of time functions but never was it viewed as an acceptable approach to simply shift documentation around a bit here and there in a rather quiet manner while adding in contextual execution features (that is to say, hidden states) that would cause now/0 to behave in a new way. And now/0 is deprecated.
> I think it is fair to evaluate on a case by case basis.
OK. I'll buy that.
In an EXTREMELY limited number of cases you will have a function that simply cannot live up to its spec without a ridiculous amount of nitpicky work that wouldn't really matter to anyone. This is not one of those cases. And in this case we are talking about providing a largely pure API in the standard library, not some meta behavior that acts indirectly through a rewrite system based on some proofing mechanics where the effects of improper definitions are magnified with each transformation.
So I get what you're saying, but this is not one of those cases, and for those odd cases it is much safer to deprecate functions, mark them as unsafe, provide compiler warnings and so on if the situation is just THAT BAD, and write a new function that is properly documented in a way that won't suddenly change later. For a functional language's standard library the majority of functions aren't going to be magically tricky, and specs are concrete promises while implementations are ephemeral.
At least this change happened in a major release, not a minor one. If it is forgivable anywhere, it is in a major release. The tricky bit is that the promises a language's standard libs make to authors are a bit more sticky than those made by separate libraries provided in a given language. And yes, that is at least as much part of the social contract inherent in the human part of the programming world as it is part of the technical contract implicit in published documentation. The social part of the contract is more important, from what I've seen. Consider why Ruby and many previously popular JS frameworks are considered to be cancer now -- it's not just that things changed, it is that the way they changed jerked people around.
The issue I am addressing is a LOT more important than whether `0 =< X < 1.0`, of course (yeah, on this one issue, we'll figure it out). It is a general attitude that is absolutely dangerous.
>> It is in general safer to change the documentation to match the reality.
This is as corrosive a statement as can be. We need to think very carefully about that before this sort of thinking starts becoming common in other areas of OTP in general.
-Craig
This story is not about people following the documentation and then having
the documentation be "fixed" under their feet without them noticing; it
is in fact the complete opposite.
--
Loïc Hoguin
https://ninenines.eu
DHOH! Yes, I know, I know...
0.0 =< X < 1.0
/(>.<)\
The moral of the story: people program against
behavior/implementation, not documentation. In these cases fixing the
implementation instead of the documentation has a very real possibility
of breaking existing programs. Of course, one can tell one's users that
"it's your fault you haven't followed the documentation!" but it
doesn't necessarily make those users happy...
Maybe in the Linux kernel. Outside, where there is such a thing as
documentation (comments are not documentation), if the code behaves
differently than the documentation, you open a ticket... And in that
case, yes, for a limited time, you will program against the behavior and
not against the documentation. But it's the exception, not the rule.
--
Loïc Hoguin
https://ninenines.eu
There was once a boy who always rode his bike on the right side of the streets in his neighborhood. Sure, the signs all said "keep left" but, well, everyone just ignores the signs where he lives.
One day a new sign was in its place that said "keep right".
Now what should he do?
-Craig
Well it's a good thing I agreed with this in an email sent almost 6
hours ago then. :-)
--
Loïc Hoguin
https://ninenines.eu
I had no idea that statement would be so flammable. :-)
I simply wanted to point out that from the point of view of a developer of
a mature product like Erlang/OTP, it has happened too many times that
a subtle behaviour change breaks something for a customer.
And that is something that programmers writing new code often do not
appreciate since they simply want the libraries to be "right" where it is
a very reasonable view that the documentation defines what is "right".
I also realize that in this particular case to stop returning 0.0 from
rand:uniform() would also have been a safe choice since that would be
almost impossible to detect and almost certainly cause no harm.
And no, I did not state an OTP policy. We decide from case to case.
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB
If I try to be philosophical: picking a random number from the real-number
range 0.0 to 1.0, the probability of getting exactly 0.0
(or exactly 1.0) is infinitesimally small. Therefore the open range (0.0,1.0)
is more natural.
>
> My belief is that the [0,1) distribution is the most common because it is the easiest to implement given the IEEE floating point standard format. However, I would also like to be proven wrong, to have more faith in the current situation.
I think that is very possible.
We cannot forget the fact that digital floating point numbers will always
be some kind of integer values in disguise.
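CPython makes the "integers in disguise" view directly observable; its random module documents random() as having 53-bit precision:

```python
import random

# CPython's random() has 53-bit precision: every result is k / 2**53
# for some integer k in [0, 2**53), so scaling back up by 2**53
# always recovers a whole number.
r = random.random()
assert (r * 2 ** 53).is_integer()

# Exactly 0.0 is one value out of 2**53 -- roughly 1.1e-16,
# vanishingly unlikely but not impossible.
p_zero = 2.0 ** -53
assert p_zero < 1.2e-16
```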
>
> I have some examples that can make this desire a bit clearer:
>
> https://github.com/CloudI/cloudi_core/blob/a1c10a02245f0f4284d701a2ee5f07aad17f6e51/src/cloudi_core_i_runtime_testing.erl#L139-L149
>
> % use Box-Muller transformation to generate Gaussian noise
> % (G. E. P. Box and Mervin E. Muller,
> % A Note on the Generation of Random Normal Deviates,
> % The Annals of Mathematical Statistics (1958),
> % Vol. 29, No. 2 pp. 610–611)
> X1 = random(),
> X2 = PI2 * random(),
> K = StdDev * math:sqrt(-2.0 * math:log(X1)),
math:log(X1) will badarith if X1 =:= 0.0. You need a generator for X1
that does not return 0.0, just as RO'K says.
> Result1 = erlang:max(erlang:round(Mean + K * math:cos(X2)), 1),
> Result2 = erlang:max(erlang:round(Mean + K * math:sin(X2)), 1),
If random() for X2 is in [0.0,1.0] then both 0.0 and 1.0 will produce the
same value after math:cos(X2) or math:sin(X2), which I am convinced will
bias the result since that particular value will have twice the probability
compared to all other values. I think you should use a generator for X2
that can return at most one of the endpoints.
Actually, it seems a generator for (0.0,1.0] would be more appropriate
here...
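A hedged Python sketch of the fix being suggested: shift X1 into (0.0, 1.0] via 1.0 - random() so the logarithm is always defined (this `box_muller` is an illustration, not the CloudI code):

```python
import math
import random

def open_closed_unit():
    """Uniform draw in (0.0, 1.0]: flip the [0.0, 1.0) draw around."""
    return 1.0 - random.random()

def box_muller(mean, std_dev):
    """One Gaussian deviate; X1 must avoid 0.0 since log(0.0) fails
    (a ValueError in Python, a badarith on math:log(0.0) in Erlang)."""
    x1 = open_closed_unit()               # (0.0, 1.0]: log(x1) is finite
    x2 = 2.0 * math.pi * random.random()
    return mean + std_dev * math.sqrt(-2.0 * math.log(x1)) * math.cos(x2)

assert 0.0 < open_closed_unit() <= 1.0
```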
> sleep(Result2),
>
>
> https://github.com/CloudI/cloudi_core/blob/a1c10a02245f0f4284d701a2ee5f07aad17f6e51/src/cloudi_core_i_runtime_testing.erl#L204-L210
>
> X = random(),
> if
> X =< Percent ->
> erlang:exit(monkey_chaos);
> true ->
> ok
> end,
In this kind of code, I think that (when thinking in integers, since we are
talking about integers in disguise) half-open intervals are more correct.
The interval [0.0,0.1] contains, say, N+1 numbers and the interval [0.0,0.2]
contains 2*N+1 numbers, so subtracting the first interval from the second
gives the interval (0.1,0.2], which has N numbers. So you get a bias
because you include both endpoints.
In this case I believe more in a generator that gives [0.0,1.0) and the
test X < Percent, since that is what I would have written using integers to
avoid off-by-one errors.
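The combination argued for here can be sketched in Python (the `chaos_monkey` helper is illustrative, not the CloudI code):

```python
import random

def chaos_monkey(percent):
    """Trigger with probability `percent`, given random() in [0.0, 1.0).

    In the fixed-point view, X = k / 2**53 with k uniform in
    [0, 2**53), so X < percent holds for exactly percent * 2**53 of
    the k values (when percent is itself a multiple of 2**-53) --
    the strict test counts neither endpoint twice.
    """
    return random.random() < percent

assert chaos_monkey(1.0)       # X < 1.0 always holds for X in [0.0, 1.0)
assert not chaos_monkey(0.0)   # X < 0.0 never holds
```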
>
> with:
> random() ->
> quickrand:strong_float().
>
> These are code segments used for the CloudI service configuration options monkey_latency and monkey_chaos so that normal distribution latency values and random service deaths can occur, respectively (more commonly known as Latency Monkey and Chaos Monkey, with the words switched to make the concepts easier to find and associate). For the Box-Muller transformation, it really does want a definite range [0,1], and it helps make the monkey_chaos service death easier to understand at a glance.
Please explain why the Box-Muller transformation needs a definite range
[0.0,1.0].
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB
Why not use rand:normal/3?
It uses the Ziggurat Method and is supposed to be much faster and
numerically more stable than the basic Box-Muller method.
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB
That was my understanding after not having modified that routine for a decent amount of time, so I must have been mistaken. I will need to fix this source code, and I regret not seeing these problems in the Box-Muller transformation source code. Thank you for pointing them out. At least this shows a need for a (0.0,1.0] function.
Thanks,
Michael
Yes. That is easily produced by (pointed out earlier in this thread):
1.0 - rand:uniform()
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB
Simpler - yes.
The basic benchmark in rand_SUITE indicates that rand:normal() is only
about 50% slower than rand:uniform(1 bsl 58) (internal word size),
which I think is a very good number.
The Box-Muller transform method needs 4 calls to the 'math' module for
non-trivial floating point functions, i.e. log(), sqrt(), cos() and sin(),
which is why I thought it "must" be slower.
But I have also not measured... :-/
Looking forward to hearing your results!
--
/ Raimo Niskanen, Erlang/OTP, Ericsson AB
Core i7 2670QM 2.2GHz 1 cpu, 4 cores/cpu, 2 hts/core
L2:4×256KB L3:6MB RAM:8GB:DDR3-1333MHz
Sandy Bridge-HE-4 (Socket G2)
Best Regards,
Michael
Thank you for sharing these numbers!
>
> A rough look at the latency associated with the normal distribution method, ignoring the latency for random number source is:
> rand:normal/0
> 3441.6 us = 6832.0 us - (rand:uniform/1 3390.4 us)
Should not the base value come from rand:uniform/0 instead? I know the
difference is not big - rand_SUITE:measure/1 suggests 3% - but it also
suggests that rand:normal/0 is about 50% slower than rand:uniform/0, while
your numbers suggest 100% slower. Slightly strange...
> quickrand_cache_normal:box_muller/2
> 3553.5 us = 9329.5 us - (quickrand_cache:floatR/0 5776.0 us)
> quickrand_cache_normal:box_muller/3
> 3213.4 us = 8917.7 us - (quickrand_cache:floatR/1 5704.3us)
It is really interesting to see that the calls to the 'math' module
do not slow that algorithm down very much (hardly noticeable)!
>
> So, this helps to show that the latency with both methods is very similar if you ignore the random number generation. However, it likely requires some explanation: The quickrand_cache module is what I am using here for random number generation, which stores cached data from crypto:strong_rand_bytes/1 with a default size of 64KB for the cache. The difference between the functions quickrand_cache_normal:box_muller/2 and quickrand_cache_normal:box_muller/3 is that the first uses the process dictionary while the second uses a state variable. Using the large amount of cached random data, the latency associated with individual calls to crypto:strong_rand_bytes/1 is avoided at the cost of the extra memory consumption, and the use of the cache makes the speed of random number generation similar to the speed of pseudo-random number generation that occurs in the rand module.
We should add a 'rand' plugin to the 'crypto' module that does this
buffered crypto:strong_rand_bytes/1 trick. There is something like that
in rand_SUITE, but we should really have an official one.
I also wonder where the sweet spot is? 64 KB seems like a lot of buffer.
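As a sketch of the buffering idea (with os.urandom() standing in for crypto:strong_rand_bytes/1, and the 64 KB default purely illustrative, not a recommendation):

```python
import os

class BufferedStrongRandom:
    """Amortize the per-call cost of the OS entropy source by
    refilling a large buffer in one read."""

    def __init__(self, buffer_size=64 * 1024):
        self._buffer_size = buffer_size
        self._buffer = b""
        self._pos = 0

    def bytes(self, n):
        """Return n random bytes, refilling the buffer when it runs dry."""
        if self._pos + n > len(self._buffer):
            self._buffer = os.urandom(max(self._buffer_size, n))
            self._pos = 0
        chunk = self._buffer[self._pos:self._pos + n]
        self._pos += n
        return chunk

    def uniform(self):
        """Float in [0.0, 1.0) built from 53 buffered random bits."""
        k = int.from_bytes(self.bytes(7), "big") >> 3   # 56 bits -> 53
        return k / (1 << 53)
```

Varying `buffer_size` in a benchmark like this is one way to probe for the sweet spot.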
>
> In CloudI, I instead use quickrand_normal:box_muller/2 to avoid the use of cached data to keep the memory use minimal (the use-case there doesn't require avoiding the latency associated with crypto:strong_rand_bytes/1 because it is adding latency for testing (at https://github.com/CloudI/cloudi_core/blob/299df02e6d22103415c8ba14379e90ca8c3d3b82/src/cloudi_core_i_runtime_testing.erl#L138) and it is best using a cryptographic random source to keep the functionality widely applicable). However, the same function calls occur in the quickrand Box-Muller transformation source code, so the overhead is the same.
>
> I used Erlang/OTP 20.0 (without HiPE) using the hardware below:
> Core i7 2670QM 2.2GHz 1 cpu, 4 cores/cpu, 2 hts/core
> L2:4×256KB L3:6MB RAM:8GB:DDR3-1333MHz
> Sandy Bridge-HE-4 (Socket G2)
>
> Best Regards,
> Michael
Best regards