On 12/3/13, 8:16 PM, Ross Boylan wrote:
> On Tue, 2013-12-03 at 15:44 -0800, Ryan Brackney wrote:
>> I've spent a little more time exploring this issue, and I'm starting
>> to wonder whether it's there's a bug in STAN instead of an issue with
>> my own ignorance. I ran a simple experiment in which I set initial
>> values for a model using a simple normal distribution, or a simple
>> exponential distribution.
>>
>>
>> In both cases, I ran the model without overtly setting the initial
>> values of the chain. I then took the STAN generated initial values,
>> using the "get_inits" function, and reran the model with those initial
>> values that were previously used.
>>
>>
>> The normal distribution model runs fine this way. However, the
>> exponential model reverts the initial values to the lower bounds of
>> lambda, and only samples from that.
> What exactly do you mean by "reverts"?
I assume you (Ryan) mean that the only value sampled for
the parameter is the lower bound. This could indicate a problem
with the model, the data, or with Stan itself.
By the way, you don't need the upper=positive_infinity()
constraint --- no constraint is the default. It's OK
to leave it --- there's just a run-time conditional evaluation
that reduces it to the lower-bound-only transform if there is a +infinity
upper bound.
> If you mean the chain ends up at the lower bounds, you may have a
> problem getting estimates from the initial values, but not a difference
> based on how you set the initial values.
>
> Specifically, your script has (with labels added)
> #run a exponential distribution fit without specific initial values
> # fit 1
> exp_fit1 <- stan(file = 'model_ex_simple.txt', data = norm_dat,
> iter = 100, chains = 2)
> used_exp_inits1 <- get_inits(exp_fit1)
> la_exp1 <- extract(exp_fit1)
>
> #rerun the exponential distribution fit with the previously used initial values
> # fit 2
> exp_fit2 <- stan(file = 'model_ex_simple.txt', data = norm_dat,
> iter = 100, chains = 2, init = used_exp_inits1)
>
> used_exp_inits2 <- get_inits(exp_fit2)
> la_exp2 <- extract(exp_fit2)
>
> are you saying the chains from fit 1 and fit 2 are different?
Unless you specify a seed in the call to Stan, you can expect
results to vary across calls.
> Your problem sounds similar to mine (see the "tuning troubles" thread
> from last month). For some initial values the chain simply stayed on
> the initial values for every step. It sounds as if you got some
> movement, but it moved to a point at which it stuck (like my initial
> values).
>
> I suggest setting the seed argument to assure you (and others) can
> reproduce your results. Actually, I'm not sure if the seed argument to
> stan will affect the initial value creation (previous discussion on this
> list said it would not affect the values drawn from a user-defined
> initialization function, but you are not using such a function).
The seed passed to Stan determines all randomization within calls to stan().
I'm pretty sure that includes initialization if you don't specify initial
values. Our intent is to make the results exactly reproducible. I opened
an issue to clarify the RStan doc on this point:
https://github.com/stan-dev/rstan/issues/31
> To be
> sure, you could call set.seed at the start of the script (as well as
> using the seed argument to the stan function).
set.seed() will determine the randomization within R.
If you use a specified seed in the function stan(), set.seed() won't have
any effect on the sampling within the call to stan().
If you do not specify a seed for the call to stan(), then R generates
the seed and passes it to Stan. (See the doc for stan() in the RStan doc.)
- Bob