strategies for implementation of USES?

Krishna Myneni

unread,

Feb 4, 2012, 6:47:10 PM2/4/12

to

To avoid any misunderstandings up front: This is *not* a proposal for
standardizing a new word in Forth.

The discussion of advanced tools programming in Forth touched on the
subject of being able to generate dependency trees for words. Such a
tool would likely be useful both academically and practically. It
seems to me that the basic factors for writing this tool are TRAVERSE-
WORDLIST and a word such as "USES?". The word USES? might have the
following stack diagram:

USES? ( wid1 nt1 wid2 nt2 -- flag )

where the pair (wid1, nt1) uniquely represents a word in wid1 and the
pair (wid2, nt2) uniquely represents a word in wid2. The returned flag
is true if the word represented by (wid1, nt1) is used in the
definition of the word represented by (wid2, nt2), and false
otherwise.

Now, one way of implementing such a word is for the Forth system to
generate and save the required data at compile time, so that USES? can
simply look up the flag. But, that seems prohibitively expensive and
wasteful. Are there more efficient/realistic strategies for
implementing USES? . Or perhaps, USES? isn't the correct factor for
accomplishing the desired goal?

Krishna

Krishna Myneni

unread,

Feb 4, 2012, 10:25:24 PM2/4/12

to

On Feb 4, 5:47 pm, Krishna Myneni <krishna.myn...@ccreweb.org> wrote:
...

> The discussion of advanced tools programming in Forth touched on the

> subject of being able to generate dependency trees for words. ...

Oops. I realize now that the term "dependency tree" is strictly not
correct. According to Wikipedia, in graph theory, "a tree is an
undirected graph in which any two vertices are connected by exactly
one simple path". Dependencies between words (vertices) do not fit
this description. What is the topologically correct term for a
dependency graph?

Krishna

BruceMcF

unread,

Feb 4, 2012, 10:37:15 PM2/4/12

to

On Feb 4, 10:25 pm, Krishna Myneni <krishna.myn...@ccreweb.org> wrote:
> What is the topologically correct term for a
> dependency graph?

Yes.

IOW, dependency graph *is* the correct term, meaning a directed graph
representing dependencies of several objects towards each other.

Krishna Myneni

unread,

Feb 4, 2012, 11:16:42 PM2/4/12

to

Some graphs have names. For example, see

http://en.wikipedia.org/wiki/Gallery_of_named_graphs

I was wondering if a word dependency graph fit the structure of a
known, named graph. I'd be surprised if computer science doesn't have
a specific name, since such a dependency graph is of far more general
applicability than to Forth words.

Krishna

Krishna Myneni

unread,

Feb 4, 2012, 11:21:26 PM2/4/12

to

On Feb 4, 10:16 pm, Krishna Myneni <krishna.myn...@ccreweb.org> wrote:
> On Feb 4, 9:37 pm, BruceMcF <agil...@netscape.net> wrote:
>
> > On Feb 4, 10:25 pm, Krishna Myneni <krishna.myn...@ccreweb.org> wrote:
>
> > > What is the topologically correct term for a
> > > dependency graph?
>
> > Yes.
>
> > IOW, dependency graph *is* the correct term, meaning a directed graph
> > representing dependencies of several objects towards each other.
>

> Some graphs have names. ...

You're right. The correct term does appear to be "dependency graph":

http://en.wikipedia.org/wiki/Dependency_graph

Thanks.

Krishna

BruceMcF

unread,

Feb 5, 2012, 8:26:57 AM2/5/12

to

On Feb 4, 6:47 pm, Krishna Myneni <krishna.myn...@ccreweb.org> wrote:

> Now, one way of implementing such a word is for the Forth system to
> generate and save the required data at compile time, so that USES? can
> simply look up the flag. But, that seems prohibitively expensive and
> wasteful. Are there more efficient/realistic strategies for
> implementing USES? . Or perhaps, USES? isn't the correct factor for
> accomplishing the desired goal?

Is the individual word the correct unit of analysis? I think of
dependencies in Forth in terms of depencies between sets of words.

Krishna Myneni

unread,

Feb 5, 2012, 8:51:07 AM2/5/12

to

You mean dependencies among wordlists? Yes, that can be useful too.
Just as with dependencies among Forth modules, the graph of which we
envision could be generated automatically.

However, I think the utility of such dependency graphs extends down to
the individual unit of a word in a specific wordlist. High level words
may end up having dozens of words (even excluding intrinsic Forth
words) in their dependency graph. Such a graph may be very useful in
the systematic analysis of the execution failure of such words.

Krishna

BruceMcF

unread,

Feb 5, 2012, 10:12:32 AM2/5/12

to

On Feb 5, 8:51 am, Krishna Myneni <krishna.myn...@ccreweb.org> wrote:
> On Feb 5, 7:26 am, BruceMcF <agil...@netscape.net> wrote:
>
> > On Feb 4, 6:47 pm, Krishna Myneni <krishna.myn...@ccreweb.org> wrote:
>
> > > Now, one way of implementing such a word is for the Forth system to
> > > generate and save the required data at compile time, so that USES? can
> > > simply look up the flag. But, that seems prohibitively expensive and
> > > wasteful. Are there more efficient/realistic strategies for
> > > implementing USES? . Or perhaps, USES? isn't the correct factor for
> > > accomplishing the desired goal?
>
> > Is the individual word the correct unit of analysis? I think of
> > dependencies in Forth in terms of depencies between sets of words.
>
> You mean dependencies among wordlists?

No, I meant dependencies among sets of words, since a capacity is so
often provided by a set of words rather than by an individual word.

But dependencies among wordlists would also be interesting.

Andrew Haley

unread,

Feb 7, 2012, 4:13:02 AM2/7/12

to

Krishna Myneni <krishna...@ccreweb.org> wrote:
> To avoid any misunderstandings up front: This is *not* a proposal for
> standardizing a new word in Forth.
>
> The discussion of advanced tools programming in Forth touched on the
> subject of being able to generate dependency trees for words. Such a
> tool would likely be useful both academically and practically. It
> seems to me that the basic factors for writing this tool are TRAVERSE-
> WORDLIST and a word such as "USES?". The word USES? might have the
> following stack diagram:
>
> USES? ( wid1 nt1 wid2 nt2 -- flag )
>
> where the pair (wid1, nt1) uniquely represents a word in wid1 and the
> pair (wid2, nt2) uniquely represents a word in wid2. The returned flag
> is true if the word represented by (wid1, nt1) is used in the
> definition of the word represented by (wid2, nt2), and false
> otherwise.
>
> Now, one way of implementing such a word is for the Forth system to
> generate and save the required data at compile time, so that USES? can
> simply look up the flag. But, that seems prohibitively expensive and
> wasteful.

I don't think it is: on the contrary, I think that's exactly the right
way to do it. You don't need to do this all the time.

Andrew.

Krishna Myneni

unread,

Feb 7, 2012, 6:22:35 AM2/7/12

to

On Feb 7, 3:13 am, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:

On second thought, I agree with your statement. Let's say we have a
medium sized dictionary of 10,000 words. Assume that the average word
directly uses on the order of 10 distinct words (excluding standard
Forth words) -- that's probably an over estimate. Then, the overhead
in storage for each word is 20 cells, where each of the 10
dependencies requires 2 cells to hold the wid and nt. Thus, for our
10K dictionary, we have an overhead of 200K cells to be able to
represent the internal dependencies between words. Furthermore, if the
nt uniquely identifies a word, in all wordlists, the overhead reduces
to 100K cells. So, it wouldn't be practical to store such information
on a number of embedded systems, but certainly should not be a problem
for desktop systems.

Another concern is the impact on compiler speed, although since the
compiler has to look up the word tokens anyway, it might not be much
of a hit in efficiency for the compiler to store the dependency list
for a word.

Krishna

Anton Ertl

unread,

Feb 7, 2012, 12:01:11 PM2/7/12

to

Andrew Haley <andr...@littlepinkcloud.invalid> writes:

>Krishna Myneni <krishna...@ccreweb.org> wrote:
>> The discussion of advanced tools programming in Forth touched on the
>> subject of being able to generate dependency trees for words. Such a
>> tool would likely be useful both academically and practically. It
>> seems to me that the basic factors for writing this tool are TRAVERSE-
>> WORDLIST and a word such as "USES?". The word USES? might have the
>> following stack diagram:
>>
>> USES? ( wid1 nt1 wid2 nt2 -- flag )
>>
>> where the pair (wid1, nt1) uniquely represents a word in wid1 and the
>> pair (wid2, nt2) uniquely represents a word in wid2. The returned flag
>> is true if the word represented by (wid1, nt1) is used in the
>> definition of the word represented by (wid2, nt2), and false
>> otherwise.

That appears inefficient to me. You would need to enumerate all words
to find out which ones are used. It's also incomplete because you
wouldn't see uses of anonymous words or words in private wordlists.
It seems to me that a word that enumerates the xts of words called by
a given colon definition would be as easy or hard to implement as
USES? and would lead to a more efficient and complete dependency
graph.

>> Now, one way of implementing such a word is for the Forth system to
>> generate and save the required data at compile time, so that USES? can
>> simply look up the flag. But, that seems prohibitively expensive and
>> wasteful.
>
>I don't think it is: on the contrary, I think that's exactly the right
>way to do it. You don't need to do this all the time.

It's certainly an approach I would have used in recent decades because
the executable code became more opaque; in contrast, in the 80s I
wrote such a tool for fig-Forth that worked on the threaded code.

Anyway, for a tool that works during compilation rather than on the
compiled code, it would be very useful to have hooks in the text
interpreter or in COMPILE,.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2011: http://www.euroforth.org/ef11/

Stephen Pelc

unread,

Feb 7, 2012, 12:56:01 PM2/7/12

to

On Tue, 7 Feb 2012 03:22:35 -0800 (PST), Krishna Myneni
<krishna...@ccreweb.org> wrote:

>Let's say we have a
>medium sized dictionary of 10,000 words. Assume that the average word
>directly uses on the order of 10 distinct words (excluding standard
>Forth words) -- that's probably an over estimate. Then, the overhead
>in storage for each word is 20 cells, where each of the 10
>dependencies requires 2 cells to hold the wid and nt. Thus, for our
>10K dictionary, we have an overhead of 200K cells to be able to
>represent the internal dependencies between words

Just to give you a couple of real-world data points, a 300k binary
cross-compiled embedded app produces a cross-reference file on VFX
of 500kb. A 22Mb desktop app produces a 75Mb cross-reference file.

Stephen

--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads

Paul Rubin

unread,

Feb 7, 2012, 7:47:46 PM2/7/12

to

In the "Valentine Bingo" thread I tried writing an implementation and
was surprised to see that there seems to be no random number generator
in the gforth library. Since (AFAIK) gforth is a full featured ANS
implementation, I infer that there's no RNG specified in the standard.
It looks like VFX supplies something called "CHOOSE" which I guess is
non-standard. I ended up supplying my own RNG, using a fairly poor
algorithm that I felt was ok for a simple demo program of this nature,
but choosing and coding a good algorithm requires knowing what you're
doing. So it wouldn't surprise me if there's an awful lot of Forth code
out there containing crappy RNG's.

It therefore seems worthwhile to me that the standard include a PRNG
spec, not for the algorithm itself, but something library implementers
can code to. Something like:

: RANDSEED ( addr n -- ) ... ;

takes n bytes from memory and crunches them to some internal state
that seeds the PRNG

: RANDUPDATE ( addr n -- ) ... ;

takes n bytes and folds them into the existing state.

: RANDSIZE ( -- n ) ... ;

puts the size of the internal state onto the stack. n should be a fixed
parameter of the PRNG algorithm, i.e. RANDSIZE should normally always
return the same result in a given implementation. (It's ok if an
implementation has some additional parametrization words that change n
explicitly).

: RANDCLONE ( addr n -- ) ... ;

copies the internal state to the supplied memory area. n must be
at least as large as the state size or else the word signals an error.

: NRAND ( -- n ) .... ;

generates a pseudorandom integer (uniform probability) and puts
it on the stack, updating the internal state.

: FRAND ( -- f:x ) ... ;

generates a pseudorandom floating point number uniformly distributed on
the unit interval, and puts it on the floating point stack, updating the
internal state.

: GENSEED ( -- ) \ optional, system dependent

Initializes the random state based on a non-deterministic source such as
reading the system timer.

The above words are supposed to supply pseudorandom streams with
reasonable statistical properties for purposes like simulations and
friendly games. Applications needing random numbers that withstand
adverserial attempts to predict the stream or recover the initial state
should use cryptographic algorithms and high-entropy physical randomness
sources.

I'm not really attached to the above specification. I only wrote it out
for the sake of having a concrete proposal that other people can
improve, instead of just making a vague suggestion that there should be
a PRNG.

Krishna Myneni

unread,

Feb 7, 2012, 9:16:45 PM2/7/12

to

On Feb 7, 11:56 am, stephen...@mpeforth.com (Stephen Pelc) wrote:
> On Tue, 7 Feb 2012 03:22:35 -0800 (PST), Krishna Myneni
>

> <krishna.myn...@ccreweb.org> wrote:
> >Let's say we have a
> >medium sized dictionary of 10,000 words. Assume that the average word
> >directly uses on the order of 10 distinct words (excluding standard
> >Forth words) -- that's probably an over estimate. Then, the overhead
> >in storage for each word is 20 cells, where each of the 10
> >dependencies requires 2 cells to hold the wid and nt. Thus, for our
> >10K dictionary, we have an overhead of 200K cells to be able to
> >represent the internal dependencies between words
>
> Just to give you a couple of real-world data points, a 300k binary
> cross-compiled embedded app produces a cross-reference file on VFX
> of 500kb. A 22Mb desktop app produces a 75Mb cross-reference file.
>

Thanks, Stephen. Can you give us the word counts for your embedded and
desktop apps (excluding standard Forth words)? Also, are the cross-
reference files text or binary, and do they contain additional info
beyond what is needed to generate a dependency graph? My simple
analysis suggests that an app using 10K of words requires additional
storage of < 1 MB to retain info needed to make a dependency graph.
This is not far out of line with your embedded app's cross reference
file.

Krishna

Andrew Haley

unread,

Feb 8, 2012, 4:57:20 AM2/8/12

to

Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>
> Anyway, for a tool that works during compilation rather than on the
> compiled code, it would be very useful to have hooks in the text
> interpreter or in COMPILE,.

Even better, and certainly more powerful, is the ability to write your
own text interpreter that we discussed a few months back. There's no
guarantee that an implementation uses COMPILE, ; it only has to
provide it.

Andrew.

Stephen Pelc

unread,

Feb 8, 2012, 6:13:03 AM2/8/12

to

On Tue, 7 Feb 2012 18:16:45 -0800 (PST), Krishna Myneni
<krishna...@ccreweb.org> wrote:

>> Just to give you a couple of real-world data points, a 300k binary
>> cross-compiled embedded app produces a cross-reference file on VFX
>> of 500kb. A 22Mb desktop app produces a 75Mb cross-reference file.
>

>Can you give us the word counts for your embedded and
>desktop apps (excluding standard Forth words)? Also, are the cross-
>reference files text or binary, and do they contain additional info
>beyond what is needed to generate a dependency graph?

The cross reference file is a binary file containing symbol names,
dictionary links, and reference chains that allow both cross reference
and simple decompilation. The format is as defined by
Lib/xref.fth
in any VFX Forth distribution.

I do not have the embedded app's source to hand, but the desktop app
is 23 Mb of binary from 45 Mb of source and has 103,000 words,
measured using #ALL-WIDS from
Examples/DictStats.fth

Anton Ertl

unread,

Feb 8, 2012, 9:16:01 AM2/8/12

to

Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>>
>> Anyway, for a tool that works during compilation rather than on the
>> compiled code, it would be very useful to have hooks in the text
>> interpreter or in COMPILE,.
>
>Even better, and certainly more powerful, is the ability to write your
>own text interpreter that we discussed a few months back.

That's certainly easier to agree on between different implementors
(although still hard enough). But I see two drawbacks compared to the
hook approach:

1) If you need to do something at the lowest layer of the text
interpreter, say "COMPILE,", you need to reimplement all of it. Sure,
words could be standardized that would help with that, but it would
still be significantly more cumbersome than just hooking into
"COMPILE,".

2) A hook would work even for invocations of the text interpreter that
the programmer had not considered in advance, like system-specific
words invoked by the user (who may be a different person than the
programmer).

In conclusion, both approaches have their advantages and
disadvantages, and they would complement each other: A few hooks in
places that every standard system must have (together with a
convention for dealing with having several independent things to plug
in), and fill in the rest by reimplementing the functionality with
provided words.

Anyway, that's a very long-term project as far as I can see.

> There's no
>guarantee that an implementation uses COMPILE, ; it only has to
>provide it.

Sure. So one would specify the hook in a more general way (if we want
a hook there at all).

Andrew Haley

unread,

Feb 8, 2012, 11:22:18 AM2/8/12

to

Paul Rubin <no.e...@nospam.invalid> wrote:

> In the "Valentine Bingo" thread I tried writing an implementation
> and was surprised to see that there seems to be no random number
> generator in the gforth library. Since (AFAIK) gforth is a full
> featured ANS implementation, I infer that there's no RNG specified
> in the standard. It looks like VFX supplies something called
> "CHOOSE" which I guess is non-standard. I ended up supplying my own
> RNG, using a fairly poor algorithm that I felt was ok for a simple
> demo program of this nature, but choosing and coding a good
> algorithm requires knowing what you're doing. So it wouldn't
> surprise me if there's an awful lot of Forth code out there
> containing crappy RNG's.
>
> It therefore seems worthwhile to me that the standard include a PRNG
> spec, not for the algorithm itself, but something library
> implementers can code to.

Sometimes you need something super fast, and you don't care about the
period or minor statistical anomalies. Like this:

\ Marsaglia, "Xorshift RNGs". http://www.jstatsoft.org/v08/i14/paper
variable rng
: random ( - n)
rng @
dup 0= or
dup 6 lshift xor
dup 21 rshift xor
dup 7 lshift xor
dup rng ! ;

This actually has pretty good statistical properties, even though it
only has a period of 2^32-1. It's great for things like CSMA/CD-style
collision backoff, dither noise, and so on. It'd be fine for
Valentine Bingo, too.

You can have something much more heavyweight like the interface you
proposed, which would be good enough even for a crypto-strong
generator. So, it's a case of horses for courses, I think. Anyone
who actually wants an RNG for an application can find one.

Andrew.

Marcel Hendrix

unread,

Feb 8, 2012, 2:13:51 PM2/8/12

to

Paul Rubin <no.e...@nospam.invalid> writes Re: suggestion: standardize random number generator?

> In the "Valentine Bingo" thread I tried writing an implementation and
> was surprised to see that there seems to be no random number generator
> in the gforth library. Since (AFAIK) gforth is a full featured ANS
> implementation, I infer that there's no RNG specified in the standard.
> It looks like VFX supplies something called "CHOOSE" which I guess is
> non-standard. I ended up supplying my own RNG, using a fairly poor
> algorithm that I felt was ok for a simple demo program of this nature,
> but choosing and coding a good algorithm requires knowing what you're
> doing. So it wouldn't surprise me if there's an awful lot of Forth code
> out there containing crappy RNG's.

You did not look very far? Already 'Starting Forth' had a random number
called generator, called CHOOSE, and it followed Knuth's recommendations.

Your search tells us more about gforth and VFX than it does about random
generators in relation to Forth. You should have heard about FSL and FFL
by now? Taygeta archives? CFL threads (quite a while back of course)?

Here's what a search through the iForth distribution returns:

anneal.frt(56): -- Random bit generator. Press et al, 7.4, 'Generation of Random Bits'.
gaussj.frt(2946): \ "Minimal" random number generator of Park and Miller. Returns a uniform deviate
gaussj.frt(2952): \ "Minimal" random number generator of Park and Miller with Bays-Durham shuffle
gaussj.frt(2958): \ Long period (> 2e18) random number generator of L'Ecuyer with Bays-Durham shuffle
gaussj.frt(2968): \ Returns a uniform random deviate between 0.0 and 1.0, generated by pseudo-DES
gaussj.frt(2977): : poidev ( F: mean -- r ) ( Poisson distributed random deviate )
gaussj.frt(2979): -- Return a matrix of random numbers chosen from the uniform distribution with parameters LO and HI.
gaussj.frt(2985): -- Returns a matrix of random numbers chosen from the normal distribution with parameters MU and SIGMA.
gaussj.frt(2991): -- Returns a matrix of random numbers chosen from the exponential distribution with parameter MU.
gaussj.frt(2997): -- Returns a matrix of random numbers chosen from the Poisson distribution with parameter MEAN.
kiss.frt(4): * DESCRIPTION : Fast Random Number Generator with large period
mertwist.frt(4): * DESCRIPTION : Fast Random Number Generator "Mersenne Twister: A 623-Dimensionally
Equidistributed Uniform Pseudo-Random Number Generator",
miscutil.frt(1691): CR ." RANDOMIZE \ ( -- ) (Re)initialize random number generator"
miscutil.frt(1692): CR ." RANDOM \ ( -- u ) Fairly random number (integer)"
miscutil.frt(1693): CR ." CHOOSE \ ( n -- u ) 0 <= u < n pick random number from range"
miscutil.frt(1694): CR ." FRANDOM \ ( F: -- r ) Random floating-point number < 2^31"
miscutil.frt(1695): CR ." FCHOOSE \ ( n -- ) ( F: -- ur ) 0 <= ur < r pick random number from range"
mwc256.frt(4): * DESCRIPTION : Reasonably fast Random Number Generator with large period
r250.frt(4): * DESCRIPTION : R250 Pseudo-Random number generator
ran-next.frt(25): * Knuth's recommended random number generator from TAOCP, *
ran4.frt(49): RAN4 is capable of generating 2^32 different random sequences {in a
ffl\rdg.fs(3): \ rdg - the distributed pseudo random number generator module in the ffl
ffl\rdg.fs(64): ( Distributed random generator creation, initialisation and destruction )
ffl\rdg.fs(66): : rdg-init ( x xt rdg -- = Initialise the generator with the random generator xt and its data x )
ffl\rdg.fs(73): : rdg-create ( x xt "<spaces>name" -- ; -- rdg = Create a named random generator in the dictionary with the random generator xt and its data x )
ffl\rdg.fs(78): : rdg-new ( x xt -- rdg = Create a new random generator on the heap with the random generator xt and its data x )
ffl\rdg.fs(83): : rdg-free ( rdg -- = Free the random generator from the heap )
ffl\rdg.fs(89): : rdg-gen-[0,1> ( F: -- r ; rdg -- = Generate a random number [0,1> )
ffl\rdg.fs(96): : rdg-gen-<0,1> ( F: -- r ; rdg -- = Generate a positive random number <0,1> )
ffl\rdg.fs(107): ( Random generator words )
ffl\rdg.fs(109): : rdg-uniform ( F: r1 r2 -- r3 ; rdg -- = Generate a random number with a uniform distribution in the range of [r1,r2> )
ffl\rdg.fs(121): : rdg-normal ( F: r1 r2 -- r3 ; rdg -- = Generate a random number with a normal or Gaussian distribution with mu or mean r1 and sigma or standard deviation r2 )
ffl\rdg.fs(151): : rdg-exponential ( F: r1 -- r2 ; rdg -- = Generate a random number with an exponential distribution with mu or mean r1 [>0] )
ffl\rdg.fs(152): rdg-gen-<0,1> \ look for positive random number
ffl\rdg.fs(161): : rdg-gamma ( F: r1 r2 -- r3 ; rdg -- = Generate a random number with a gamma distribution with alpha r1 [>0] and beta r2 [>0], alpha*beta = mean, alpha*beta^2 = variance )
ffl\rdg.fs(207): : rdg-beta ( F: r1 r2 -- r3 ; rdg -- = Generate a random number with a beta distribution with alpha r1 [>0] and beta r2 [>0], alpha*beta = mean, alpha*beta^2 = variance )
ffl\rdg.fs(216): : rdg-binomial ( F: r -- ; u1 rdg -- u2 = Generate a random number with a binomial distribution with probability r [0,1] and trails u1 [>=0])
ffl\rdg.fs(253): : rdg-poisson ( F: r -- ; rdg -- u = Generate a random number with a Poisson distribution with mean r [>=0] )
ffl\rdg.fs(283): : rdg-pareto ( F: r1 r2 -- r3 ; rdg -- = Generate a random number with a Pareto distribution with alpha r1 [>0] the scale parameter and r2 [>0] the shape parameter )
ffl\rdg.fs(290): : rdg-weibull ( F: r1 r2 -- r3 ; rdg -- = Generate a random number with a Weibull distribution with alpha r1 [>0] the scale parameter and beta r2 [>0] the shape parameter )
ffl\rng.fs(3): \ rng - the pseudo random number generator module in the ffl
ffl\rng.fs(39): ( rng = Pseudo random number generator module )
ffl\rng.fs(40): ( The rng module implements a pseudo random number generator; )
ffl\rng.fs(63): ( Random generator structure )
ffl\rng.fs(133): ( Random generator creation, initialisation and destruction )
ffl\rng.fs(140): : rng-create ( u "<spaces>name" -- ; -- rng = Create a named random generator in the dictionary with seed u )
ffl\rng.fs(145): : rng-new ( u -- rng = Create a new random generator on the heap with seed u )
ffl\rng.fs(150): : rng-free ( rng -- = Free the random generator from the heap )
ffl\rng.fs(155): ( Random generator words )
ffl\rng.fs(162): : rng-next-number ( rng -- n = Calculate the next pseudo random number, 32 bit )
ffl\rng.fs(180): : rng-next-float ( rng -- r = Calculate the next pseudo random float number, range [0,1> )
ffl\rng.fs(188): : rng-dump ( rng -- = Dump the random generator )
C:/dfwforth/examples/bignum/diehard.frt(51): : prng ( F: -- random# )
C:/dfwforth/examples/ent/merstwist.frt(4): * DESCRIPTION : Generate sequence of "random" numbers with a cycle of 2^19937-1
C:/dfwforth/examples/ent/test-random.frt(79): : TEST-RANDOM ( c-addr u sz TRUE=terse -- )
C:/dfwforth/examples/ent/test-random.frt(92): ['] RANDOM [IS] GENERATOR S" RANDOM"
C:/dfwforth/examples/ent/test-random.frt(93): ['] iran0 [IS] GENERATOR S" iran0"
C:/dfwforth/examples/ent/test-random.frt(94): ['] iran1 [IS] GENERATOR S" iran1"
C:/dfwforth/examples/ent/test-random.frt(95): ['] iran2 [IS] GENERATOR S" iran2"
C:/dfwforth/examples/ent/test-random.frt(96): ['] iran3 [IS] GENERATOR S" iran3"
C:/dfwforth/examples/ent/test-random.frt(97): ['] ran-KISS [IS] GENERATOR S" ran-KISS"
C:/dfwforth/examples/ent/test-random.frt(98): ['] genrand [IS] GENERATOR S" genrand"
C:/dfwforth/examples/ent/test-random.frt(99): ['] genrand_int32 [IS] GENERATOR S" genrand_int32"
C:/dfwforth/examples/ent/test-random.frt(100): ['] ran-MWC256 [IS] GENERATOR S" ran-MWC256"
C:/dfwforth/examples/ent/test-random.frt(101): ['] lcm_rand [IS] GENERATOR S" lcm_rand"
C:/dfwforth/examples/ent/test-random.frt(102): ['] r250 [IS] GENERATOR S" r250"
C:/dfwforth/examples/ent/test-random.frt(103): ['] ran-Next [IS] GENERATOR S" ran-Next"
C:/dfwforth/examples/ent/test-random.frt(104): ['] random-byte [IS] GENERATOR S" random-byte"
C:/dfwforth/examples/ent/test-random.frt(105): ['] iprng [IS] GENERATOR S" iprng"
C:/dfwforth/examples/gmpfr/alt-gmpfr/libgmp-iforth.frt(403): AliasedExtern: mpf_urandomb void __gmpf_urandomb (mpf_ptr rop, gmp_randstate_t state, mp_bitcnt_t nbits);
C:/dfwforth/examples/misc/shannon.frt(132): -- generate a pseudo-random number with a Gaussian distribution
C:/dfwforth/examples/nrc/chapter7_Random_Numbers/random.frt(4): * DESCRIPTION : Access to the random number routines in the nrclib.dll
C:/dfwforth/examples/numeric/random3.frt(4): * DESCRIPTION : super random number generator
C:/dfwforth/examples/numeric/wurstkessel.frt(5): / Wurstkessel data from www.random.org
C:/dfwforth/examples/shoot-2001-06-05/bench/random/random.frt(17): : 100e_random ( F: -- r )
...
Found 1785 occurrence(s)

-marcel

Krishna Myneni

unread,

Feb 8, 2012, 9:10:16 PM2/8/12

to

On Feb 8, 5:13 am, stephen...@mpeforth.com (Stephen Pelc) wrote:
...

> I do not have the embedded app's source to hand, but the desktop app
> is 23 Mb of binary from 45 Mb of source and has 103,000 words,

..

Wow! 103K words... I bet there are quite a few which are constants,
variables, or buffers. Such words would be dependencies for other
words, but would have no dependencies of their own.

Krishna

Krishna Myneni

unread,

Feb 8, 2012, 9:12:42 PM2/8/12

to

On Feb 7, 6:47 pm, Paul Rubin <no.em...@nospam.invalid> wrote:
> ... So it wouldn't surprise me if there's an awful lot of Forth code

> out there containing crappy RNG's.

> ...

Maybe so, but there are quite a number of useful generators as well,
as Marcel pointed out. You can also check out the Random Number
Generators section of

http://ccreweb.org/software/kforth/kforth4.html

Krishna

Paul Rubin

unread,

Feb 9, 2012, 2:50:16 AM2/9/12

to

m...@iae.nl (Marcel Hendrix) writes:

> Your search tells us more about gforth and VFX than it does about random
> generators in relation to Forth. You should have heard about FSL and FFL
> by now? Taygeta archives? CFL threads (quite a while back of course)?
>
> Here's what a search through the iForth distribution returns:

Well, I also looked in the ANS wordlist and found an implementation on
forth.org in some old Forth magazine. But, the presence of all those
different implementations and presumably differing interfaces seems to
me to be an argument for standardization. This is a type of function
that's pretty easy to get wrong. E.g. from the Wikipedia page about the
the notorious RANDU:

It is widely considered to be one of the most ill-conceived random
number generators designed. Notably, it fails the spectral test
badly for dimensions greater than 2, and every result is odd. ...

As a result of the wide use of RANDU in the early '70s, many results
from that time are seen as suspicious.

Anton Ertl

unread,

Feb 10, 2012, 12:53:10 PM2/10/12

to

Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>So, it's a case of horses for courses, I think. Anyone
>who actually wants an RNG for an application can find one.

The same could be said about lots of other features, e.g., SEARCH:
Sometimes the straightforward algorithm is what's needed, sometimes
something like Boyer-Moore is more appropriate. And yet we have
standardized SEARCH, and that's good; if we have specific
requirements, we can still implement a specific algorithm, but for
most cases the built-in SEARCH with its unspecified characteristics is
good enough.

Andrew Haley

unread,

Feb 10, 2012, 1:54:10 PM2/10/12

to

Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>So, it's a case of horses for courses, I think. Anyone
>>who actually wants an RNG for an application can find one.
>
> The same could be said about lots of other features, e.g., SEARCH:
> Sometimes the straightforward algorithm is what's needed, sometimes
> something like Boyer-Moore is more appropriate.

But SEARCH doesn't have any unspecified characteristics except how
long it takes.

> And yet we have standardized SEARCH, and that's good; if we have
> specific requirements, we can still implement a specific algorithm,
> but for most cases the built-in SEARCH with its unspecified
> characteristics is good enough.

Indeed, but I don't think that reasoning can with a reasonable
expectation of success be extended to random number generators. "Give
me a random muber" sounds seductively simple, but I think we'll find
very quickly that the actual requirements for speed versus quality
vary so widely that they can't be satisfied with a single generator.
We had this discussion only a few weeks ago.

It would be unfortunate to repeat the mistake of C's rand(), which is
mostly useless: if you want something fast it may not be fast enough
so you look for a fast one, and if you want something strong it
probably isn't strong enough so you look for a strong one. I suppose
there are cases where you don't care at all about the properties of
the generator, such as student assignments.

Andrew.

Marcel Hendrix

unread,

Feb 10, 2012, 2:47:52 PM2/10/12

to

Andrew Haley <andr...@littlepinkcloud.invalid> writes Re: suggestion: standardize random number generator?

> Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>>So, it's a case of horses for courses, I think. Anyone
>>>who actually wants an RNG for an application can find one.

>> The same could be said about lots of other features, e.g., SEARCH:
>> Sometimes the straightforward algorithm is what's needed, sometimes
>> something like Boyer-Moore is more appropriate.

> But SEARCH doesn't have any unspecified characteristics except how
> long it takes.

>> And yet we have standardized SEARCH, and that's good; if we have
>> specific requirements, we can still implement a specific algorithm,
>> but for most cases the built-in SEARCH with its unspecified
>> characteristics is good enough.

[..]

Because the functionality is guaranteed to be called SEARCH, you can
blindly include a file that overloads the system's SEARCH with e.g.
Boyer-Moore, and the final application will behave as expected - anywhere.

Likewise the RNG -- with a fairly complete interface wordset it doesn't
matter what the peculiarities of the system supplied set are -- simply
overload (with SYNONYM :-) the key words.

-marcel

Paul Rubin

unread,

Feb 10, 2012, 5:47:28 PM2/10/12

to

Andrew Haley <andr...@littlepinkcloud.invalid> writes:
> But SEARCH doesn't have any unspecified characteristics except how
> long it takes.

Are RNG's different?

> "Give me a random muber" sounds seductively simple, but I think we'll
> find very quickly that the actual requirements for speed versus
> quality vary so widely that they can't be satisfied with a single
> generator. We had this discussion only a few weeks ago.

I missed the earlier discussion but I'll try to find it. My initial
thought was that there are about three useful levels of RNG:

1) very fast, simple ones like the LFSR you posted or a comparable
LCG, suitable for casual use or (e.g.) noise dithering, where complex
statistical properties are less important than speed and simplicity.

2) fast ones (slower than the very simplest) that are designed to not
have any significant statistical anomalies in applications (like
scientific simulations) whose possible sensitivity to imperfect
randomness isn't specifically designed around the algorithm. Mersenne
Twister is a popular algorithm of this class.

3) Slower ones designed to withstand adversarial analysis, i.e. a
determined adversary with a lot of resources and with knowledge of the
algorithm, shouldn't be able to distinguish the PRNG output from a
true random stream without knowing the seed. Applications include
data security protocols, online games where real money is involved and
cheating is an issue, etc. To handle this case you need cryptographic
algorithms. It's still not terribly complicated from an
implementation point of view, but performance will suffer compared to
the simpler PRNG's (unless you have special hardware such as the
AES-NI instruction set in newer x86's).

Anyway I think it's reasonable to wrap the same API around all three
types of generator. It's possible that I'm missing something and
there are some more useful levels, but I don't see it as like numerical
analysis where you want 100 different eigenvalue routines for different
special classes of matrix. For applications where RNG speed isn't
a big issue, it's fine to use "level 3" RNG's for just about everything.

Andrew Haley

unread,

Feb 11, 2012, 4:26:58 AM2/11/12

to

Paul Rubin <no.e...@nospam.invalid> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>> But SEARCH doesn't have any unspecified characteristics except how
>> long it takes.
>
> Are RNG's different?

Yes, in almost every way.

>> "Give me a random muber" sounds seductively simple, but I think we'll
>> find very quickly that the actual requirements for speed versus
>> quality vary so widely that they can't be satisfied with a single
>> generator. We had this discussion only a few weeks ago.
>
> I missed the earlier discussion but I'll try to find it. My initial
> thought was that there are about three useful levels of RNG:
>
> 1) very fast, simple ones like the LFSR you posted or a comparable
> LCG, suitable for casual use or (e.g.) noise dithering, where complex
> statistical properties are less important than speed and simplicity.
>
> 2) fast ones (slower than the very simplest) that are designed to not
> have any significant statistical anomalies in applications (like
> scientific simulations) whose possible sensitivity to imperfect
> randomness isn't specifically designed around the algorithm. Mersenne
> Twister is a popular algorithm of this class.
>
> 3) Slower ones designed to withstand adversarial analysis, i.e. a
> determined adversary with a lot of resources and with knowledge of the
> algorithm, shouldn't be able to distinguish the PRNG output from a
> true random stream without knowing the seed. Applications include
> data security protocols, online games where real money is involved and
> cheating is an issue, etc. To handle this case you need cryptographic
> algorithms. It's still not terribly complicated from an
> implementation point of view, but performance will suffer compared to
> the simpler PRNG's (unless you have special hardware such as the
> AES-NI instruction set in newer x86's).

I like that classification. There is also 4) the provably strong
(with certain assumptions) generator like Blum-Blum-Shub, but I
suppose that could be a subset of 3.

> Anyway I think it's reasonable to wrap the same API around all three
> types of generator.

I'm not sure about 1, where perhaps you want the simplest possible
interface, but otherwise there's something to that.

Andrew.