James Dow Allen <gm...@jamesdowallen.nospam> writes:
> Tim Rentsch <
t...@alumni.caltech.edu> might have writ, in
>
news:kfn38bi...@x-alumni2.alumni.caltech.edu:
>
>> James Dow Allen <
jdall...@yahoo.com> writes:
>>
>>> I've placed some code online
>>>
http://fabpedigree.com/james/j_random.c
>>> which addresses a variety of issues related to PNRG's.
>>> (You may also want j_random.h and j_rdemo.c from
>>> the same directory.) Comments welcome.
>>
>> sigh...
>>
>> It's a good illustration of Hoare's aphorism.
Let me preface any further comments by saying the sigh was only
an expression of an internal non-analytic response. It wasn't
meant to be nasty, or even critical, just how I felt trying to
write a reply, and I'm sorry that how it came across was
something other than that.
> I wonder what you think you mean. The PNRG's in that file, which
> do appear very complicated, are taken directly from some of the
> best available.
I didn't look closely at the ma_rand() and me_rand() function
bodies. I assumed they are faithful implementations of some
well-known RNG's. It would be nice to know which ones, ie, it
would be nice if the code gave references or even just the names
in a comment, but I didn't consider the internals of these
functions to be of any concern. Going back and looking at them
now, I have mostly the same reaction, but I wonder what parts are
taken directly from the originals and what parts were added (by
you or someone else) that were not included in the orginal
algorithms. But that's about documentation, not about the code
per se.
> If you're referring to rand_index() with its complicated-looking
> ir_range/ir_vari, then it's obvious you fail to grasp what problem
> the mechanism solves -- despite that I mentioned it upthread.
I believe I understand the problem exactly. A few months ago I
posted some code in c.l.c for a simplified version of the same
concern. (The upthread description was given in the post I was
responding to - I wouldn't have left that part out of what was
quoted if my response were specifically about that area.)
> The special handling is useful only when one is using a very
> expensive RNG, but it's a FUN problem and,
I agree, it is a fun problem.
> of course its complexity (assuming that's what your "sigh" was
> about) is irrelevant, since the code is done.
In my perception this isn't a hard problem. It's a tricky
problem, but not really a hard problem.
The code though I think is more complicated than what the problem
calls for. That condition is always relevant IMO, even for code
that is already written. The result may not merit revising the
existing code, but it's always worth noting.
> So:
>
> (1) What does your nasty little sigh refer to?
A combination of disappointment and feeling unable to respond in
a constructive way. I was looking forward to seeing how you
solved the problem, especially knowing you are a smart guy. It
was frustrating trying to understand how that was done and not
being able to (at least not without putting in a lot more effort
than it seems like should have been necessary). Then I wanted
to write some constructive comments in response, but found the
prospect overwhelming: to figure out what to say (let alone how
to say it) looked like it would take just an enormous amount of
time. The result of all that came out as just a sigh.
> (2) Assuming it's rand_index(), try to grasp the purpose for the
> complexity; then present your own improved solution.
I will give code below.
> (3) Feel free to criticise the rest of that code in that module.
> For example, present your improved code for the setup for Walker's
> algorithm.
I didn't really look at this part of the code before. Looking at
it now (and making an educated guess about what it's supposed to
do, which wasn't obvious to me before), it would take me a little
while to write out a good solution (ie, one I had confidence in
that it works, and never screws up). I can say, I probably would
change the approach slightly, in two ways. One, even if the
input likelihoods are given in floating point, I would want to do
the calculations using integers, both to be sure the results are
repeatable and that the "filling" algorithm works correctly. (In
situations like this using floating point makes me nervous.)
Two, when making a choice (ie, in rand_walker()), it would be
nice to make a choice without using floating point, but instead
making a sequence of binary decisions, stopping as soon as the
outcome is determined. This approach uses somewhat mroe entropy
(I believe two bits on the average, versus one or less (yes?)
with how rand_walker()/rand_decide() are written now), but is
certainly going to be repeatable, which the floating point
calculation may not be. Alternatively, maybe the same method
that rand_decide() uses now could be used, except with scaled
integers rather than floating point so the results will be
repeatable. (Obviously I need to give more thought to this
before proposing any concrete alternatives, which is kinda where
I started in talking about this.)
Switching to more general comments...
My high order bit is that the code is opaque. I don't mean
it's incomprehensible, which is something very different. For
example, the ma_rand() is incomprehensible, by which I mean how
it works or what it's supposed to be doing are not evident.
However this is primarily a problem of documentation, not
code quality or organization.
Code that is opaque has a different quality. It's hard to
understand just how everything fits together. I feel like I
have to understand everything all at once before I understand
anything; there isn't good high-level organization. When
reading code, normally I take little bits and understand them a
piece at a time. Reading this code I am unable to do that -
there is too much inter-connection and inter-dependency. The
code seems to have problems, but the problems aren't obvious,
and it takes a lot of effort to figure out whether it works.
For example, I think there is a dependence on unspecifed
behavior, specifically order of evaluation; depending on which
compiler is used, or what optimization settings, there may be
different results out of, eg, rand32(). But it's hard to be
sure of that, because there isn't much layering, and the
cross-function linkages make it hard to reason about how things
work. I can't say the code is wrong, but /for sure/ I can say
it's hard to tell if it's right. IMO that derives mostly from
there being a lot of cross-function dependencies and not enough
structure in how the code is organized.
I'm sorry if the above seems critical but not helpful. It's
partly because I was having so much trouble expressing my
reaction in a constructive way that I wrote what I did before.
A lot of effort went into composing this response, and for now
it's the best I can do.
Returning to the earlier topic of rand_index(), two items.
First I would write rand_bits() in terms of rand_index() rather
than the other way around, as for example (and please excuse
minor changes in function or type names)
U32
uniform_bits_of_width( unsigned w ){
return w > 31 ? uniform_32() : uniform_32_below( 1uL << w );
}
This change simplifies the implementation of this function, and
allows the other function to be self-contained.
For the entropy-preserving "choose a number less than k", I
might write it this way, at least to illustrate an approach:
U32
uniform_32_below( U32 k ){
return k < 2 ? 0 : uu32( k, - (-k%k + 1) );
}
U32
uu32( U32 k, U32 z ){
static U64 W = 1, X = 0; // invariant: X < W
U32 u, r, m, y;
return k > W
? u = uniform_32(), u > z
? y = z+1, W *= -y, X = X*-y + (u-y), uu32( k, z )
: (m = 1 + z/k, W *= m, X = X*m + u/k, u%k)
: X >= (y = W - W%k)
? W -= y, X -= y, uu32( k, z )
: (r = X % k, W /= k, X /= k, r)
;
}
Obviously the style is rather terse, and some comments are needed
to make what's going on comprehensible (mainly the roles of W, X,
y, and z). But the organization is not complicated. There are
four cases, corresponding to whether the entropy already stored
suffices or a new random number needs to be generated, and in
each case whether the value being used is in the biased range or
the unbiased range. The two sub-cases where the value is in the
biased range update W and X and go again via a recursive call.
By contrast, to see what rand_index is doing, more than 100 lines
of code, including four function definitions, need examining. I
can't help feeling that the resulting complications are more than
the problem warrants.