Random number streams

Gaëtan de Menten

Apr 29, 2013, 6:47:10 AM
to liam...@googlegroups.com

On 26/04/2013 21:50, Howard Redway wrote:

> LIAM2 appears to draw random numbers from a single stream in
> the order required by the simulation, with the only control being the
> specification of the initial seed.

This is indeed true.

> Genesis manages random numbers in such a way that if a procedure is
> unchanged then, if required, the same random numbers can always be
> drawn even if the number of random draws required for earlier
> procedures differs. This is an important feature of Genesis as it
> reduces stochastic differences to a minimum when comparing two
> scenarios. This is essential for the analysis of gainers and losers
> from a policy proposal as it ensures that for most policy options the
> same units exist in two runs and can be compared at the micro level.
> This would be a major obstacle were we to wish to convert Genesis
> models to the current release of LIAM2.

As I said in my other email, solving this problem is planned for the 0.8
release. However, I would like some input as to what modellers would prefer.

The goal is to use a different random stream for each "process using
random numbers" (let us call them "random processes"), so that two
models using the same random processes are more comparable (even if one
such process draws more numbers in one model than in the other). This
would be relatively easy to implement.
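As a rough illustration of the idea (not LIAM2 code), here is a minimal Python sketch using only the standard library. The helper `make_streams`, the process names "birth" and "death", and the way the seed is combined with the name are all assumptions made up for the example.

```python
import random


def make_streams(global_seed, process_names):
    # Hypothetical helper: one independent generator per named random process.
    # A string seed is hashed deterministically by random.Random.
    return {name: random.Random(f"{global_seed}:{name}") for name in process_names}


def run_model(global_seed, n_birth_draws):
    # Toy "model": the birth process draws a variable number of values,
    # the death process always draws four.
    streams = make_streams(global_seed, ["birth", "death"])
    births = [streams["birth"].random() for _ in range(n_birth_draws)]
    deaths = [streams["death"].random() for _ in range(4)]
    return births, deaths


births_a, deaths_a = run_model(42, n_birth_draws=3)
births_b, deaths_b = run_model(42, n_birth_draws=5)

# The death draws are identical even though the birth process
# consumed a different number of random values in each model.
same_deaths = deaths_a == deaths_b
```

With a single shared stream, the two extra birth draws in the second model would have shifted every death draw.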

However, what would be interesting is to be able to compare two models
which have a different set of random processes. And thus, the random
processes which are common to both models should use the same random
streams.

The tricky part is to assign each random process an "order number"
(that of its random stream) in a consistent way across models. One way to
do this would be to let the modeller specify that "order number" manually
for each random process. This would be a robust yet tedious solution.

Another obvious option would be to use the order in which they are
defined, but then if the modeller inserts a new "random process" in the
middle of the simulation, the order numbers of all later processes would
shift and comparability would break. To avoid that, the best option
I can think of currently is to use the definition order *except* for
processes which have an order number defined explicitly. This way, a
modeller would only need to give manual order numbers for those
processes which are specific to one or some of the models. This is more
convenient, but there is the risk that a modeller will forget to do that.
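One possible reading of this scheme, sketched in Python. Everything here is an assumption for illustration: the function `assign_stream_numbers`, the process names, and the convention that explicit numbers are chosen outside the automatic range (here 1000) so they cannot collide with definition-order numbers.

```python
def assign_stream_numbers(processes):
    """Assign a stream number to each random process.

    processes: list of (name, explicit_number_or_None) in definition order.
    Processes without an explicit number are numbered 0, 1, 2, ... in
    definition order; explicitly numbered processes keep their number and
    do not consume a definition-order slot.
    """
    assigned = {}
    next_auto = 0
    for name, explicit in processes:
        if explicit is None:
            assigned[name] = next_auto
            next_auto += 1
        else:
            assigned[name] = explicit
    if len(set(assigned.values())) != len(assigned):
        raise ValueError("explicit order number collides with another stream")
    return assigned


base = [("birth", None), ("death", None), ("marriage", None)]
# The variant inserts a model-specific process in the middle,
# with an explicit order number.
variant = [("birth", None), ("new_benefit", 1000), ("death", None), ("marriage", None)]

base_numbers = assign_stream_numbers(base)
variant_numbers = assign_stream_numbers(variant)
common_match = all(variant_numbers[name] == num for name, num in base_numbers.items())
```

If the modeller forgot the explicit number for `new_benefit`, it would take number 1 and push `death` and `marriage` onto different streams, which is the risk mentioned above.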

I have tried to come up with a way to use the procedure name or
information like that to make this completely automatic so that users
would not have to care about this, but I have been unable to find a
scheme that would be robust enough yet.

Any thoughts?


--
Gaëtan de Menten

Federal Planning Bureau
Economic Analyses & Forecasts
Avenue des Arts, 47-49 | 1000 Bruxelles
tel. +32 (0)2 507 7459
fax +32 (0)2 507 7373
email : g...@plan.be | www.plan.be


----------------------------------------------------------------------------

Disclaimer: please see "www.plan.be/disclaimer.html"

Please consider your environmental responsibility before printing this email

----------------------------------------------------------------------------

how...@howard-redway.co.uk

Jun 28, 2013, 4:57:35 PM
to liam...@googlegroups.com, g...@plan.be

Assigning a random number stream to a process only partially solves the problem. Suppose we make a small change to the employment rate (which we do every time we get a new set of assumptions from our finance department, The Treasury). If the rate is reduced then everyone has a reduced chance of being in work. So there are fewer people eligible to join a pension scheme and fewer random draws required for this process. As soon as a random number is not drawn for the first record with a different work status, every other record in that and later periods gets a different random number from the stream. Once this occurs, other outputs will change due to changes in explanatory variables.

Any changes in births, deaths, or partnership are particularly problematic as the analysis of gainers and losers becomes much less meaningful. Our policy users regard gainers-and-losers analysis as an essential requirement.

Another benefit of replicating random numbers arises when changes are being made to a model that should not change any outputs. I have recently been working on a modification of one of our other models to increase its flexibility so it can model not only current pension rules but also a new set. Being able to draw exactly the same random numbers enables the outputs from the two versions to be compared at the micro level to check the changes before coding for the new rules is added.

The way we deal with this in Pensim2 is to use a different seed for every random number draw. The seed is a function of four variables; in LIAM2 terminology these are: process, period, entity, and random_seed. The function used is quite complex and specific to Pensim2 (and our other Genesis models), so it would not be suitable for a more general tool such as LIAM2. We are currently investigating alternatives.
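To illustrate the idea only (this is not the Genesis function, which is not shown in this thread), a generic Python sketch could derive each draw from a hash of the four values. The function `draw`, the process name "join_pension", and the reading of "entity" as the identifier of an individual record (`record_id`) are assumptions for the example.

```python
import hashlib
import random


def draw(process, period, record_id, random_seed):
    # Hypothetical per-draw seeding: hash the four values into a 64-bit seed,
    # then take a single uniform number from a generator seeded with it.
    key = f"{process}|{period}|{record_id}|{random_seed}".encode("utf-8")
    seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return random.Random(seed).random()


def pension_draws(workers, period, random_seed=12345):
    # One draw per record currently in work (and so eligible to join a scheme).
    return {rid: draw("join_pension", period, rid, random_seed) for rid in workers}


baseline = pension_draws([1, 2, 3, 4, 5], period=2013)
# In the scenario, record 2 is no longer in work and needs no draw.
scenario = pension_draws([1, 3, 4, 5], period=2013)

# Every record that still needs a draw gets the same number as in the baseline.
unchanged = all(scenario[rid] == baseline[rid] for rid in scenario)
```

Creating a generator for every draw is slow; the sketch only shows why a draw keyed on the record does not depend on how many other records drew before it.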

Using the order of definition for dealing with changes in the “process” factor of the Genesis equation for the seed would be acceptable.  We already deal with this by retaining the order of definitions when we require random numbers to be replicated.  Any new definitions are added at the end and redundant ones retained.

(For a general description of Genesis and Pensim2 see my posts in the thread: Comparison of LIAM2 and Genesis (SAS based model))

Gaëtan de Menten

Jul 2, 2013, 10:50:15 AM
to liam...@googlegroups.com

Thanks for your input. It is really appreciated.

On 28/06/2013 22:57, how...@howard-redway.co.uk wrote:
> Assigning a random number stream to a process only partially solves the
> problem. Suppose we make a small change to the employment rate (which we
> do every time we get a new set of assumptions from our finance
> department, The Treasury). If the rate is reduced then everyone has a
> reduced chance of being in work. So there are fewer people eligible to
> join a pension scheme and fewer random draws required for this
> process. As soon as a random number is not drawn for the first record
> with a different work status, every other record in that and later
> periods gets a different random number from the stream. Once this occurs,
> other outputs will change due to changes in explanatory variables.
>
> Any changes in births, deaths, or partnership are particularly problematic
> as the analysis of gainers and losers becomes much less meaningful. Our
> policy users regard gainers-and-losers analysis as an essential requirement.

I initially thought about that but discarded the idea based on the fact
that aggregate values would be different if the population is different.
But since that seems to be useful after all, I will use a different
stream for each period, as you suggest, since it is not really more
difficult to implement.

> The way we deal with this in Pensim2 is to use a different seed for
> every random number draw. The seed is a function of four variables; in
> LIAM2 terminology these are: process, period, entity, and random_seed. The
> function used is quite complex and specific to Pensim2 (and our other
> Genesis models), so it would not be suitable for a more general tool such
> as LIAM2. We are currently investigating alternatives.
>
> Using the order of definition for dealing with changes in the “process”
> factor of the Genesis equation for the seed would be acceptable. We
> already deal with this by retaining the order of definitions when we
> require random numbers to be replicated. Any new definitions are added at
> the end and redundant ones retained.

I am now leaning towards using some kind of hash of (entity_name,
procedure_name, order of definition within procedure, period,
random_seed) as the seed for each "random process". This means that a
model with additional procedures can still be compared to the original
model wherever those additional procedures are defined, but that within
procedures of the same name, the order of definition (of random
processes) matters and one might need to introduce dummy processes to
keep variants of the same procedure comparable. This seems like a
reasonable compromise.
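A minimal sketch of what such a hash-based seed could look like in Python. This is only an illustration under assumptions, not the planned LIAM2 implementation: the function name `stream_seed`, the use of SHA-256, the 64-bit truncation and the example names are all made up.

```python
import hashlib
import random


def stream_seed(entity_name, procedure_name, order_in_procedure, period, random_seed):
    # Hypothetical: build a stable text key from the five values and hash it.
    # hashlib is used instead of the built-in hash(), which is randomised
    # between interpreter runs for strings.
    key = repr((entity_name, procedure_name, order_in_procedure, period, random_seed))
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def stream_for(entity_name, procedure_name, order_in_procedure, period, random_seed):
    # One generator per "random process" and period.
    seed = stream_seed(entity_name, procedure_name, order_in_procedure, period, random_seed)
    return random.Random(seed)


# The first random process of person.death in 2013 gets the same seed in any
# model, whichever other procedures are defined around it.
seed_model_a = stream_seed("person", "death", 0, 2013, 42)
seed_model_b = stream_seed("person", "death", 0, 2013, 42)

# Inserting a random process before it *within* the death procedure moves it
# to position 1 and changes its seed, hence the need for dummy processes.
seed_shifted = stream_seed("person", "death", 1, 2013, 42)

first_draws = [stream_for("person", "death", 0, 2013, 42).random() for _ in range(2)]
```

Both entries of `first_draws` are equal, since each call rebuilds the generator from the same seed.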

Gaetan