On 2/8/14, 4:52 PM, Marco Inacio wrote:
>
>
>
> On 14-02-07 03:09 PM, Marcus Brubaker wrote:
>> On Fri, Feb 7, 2014 at 10:12 AM, Bob Carpenter <ca...@alias-i.com> wrote:
>>
>> I have no idea how people typically do this. There's no
>> good way to constrain a vector beta so that for arbitrary
>> vector $x$ we have $x' * beta > 0$ (or $x' * beta < 0$).
>>
>>
>> Actually this isn't too hard to do in Stan if one of those vectors is data. Let's say $x$ is a fixed $N$-dimensional
>> data vector. Then use something like Gram-Schmidt to find a set of $N-1$ unit vectors $v_i$ that are orthogonal to
>> $x$ (i.e., such that dot_product(x, v_i) = 0). Then take $\beta = \alpha x + \sum_i \gamma_i v_i$, where the $\gamma_i$
>> are unconstrained and $\alpha$ is constrained to be negative. It's then simple to show that $x' \beta = \alpha x' x$
>> will always be negative.
>
>
> Thanks. I wonder if there's a way to extend this to a matrix of data $x$. We could do it for each column of the matrix,
> but it seems a little too strong to force every $\beta * x_i$ ($x_i$ being column $i$ of matrix $x$) to be negative
> instead of just their sum. But no need to worry about this now; I'd prefer to use a log link anyway.
>
>
> Could you please help with this other problem? I'd like to know whether I need to add a Jacobian adjustment in this
> case, because this model is taking more than 10 hours for just 100 iterations, with a very low n_eff.
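
Before getting to the model: to make Marcus's construction above concrete, here's a rough Stan sketch.
The names are made up, and the orthogonal basis V (columns orthogonal to x, e.g. from Gram-Schmidt) is
assumed to be computed outside Stan and passed in as data:

  data {
    int<lower=1> N;
    vector[N] x;             // fixed data vector
    matrix[N, N - 1] V;      // columns orthogonal to x, e.g. from Gram-Schmidt
  }
  parameters {
    real<upper=0> alpha;     // coefficient on x, constrained to be negative
    vector[N - 1] gamma;     // unconstrained coefficients on the v_i
  }
  transformed parameters {
    vector[N] beta;
    beta <- alpha * x + V * gamma;
  }
  model {
    // ... priors on alpha and gamma, and the likelihood in terms of beta ...
  }

Because every column of V is orthogonal to x, dot_product(x, beta) = alpha * dot_product(x, x),
which is negative whenever x is nonzero.
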
As for the model itself: I don't understand it, but here are some
suggestions and questions.
Given that log(sigma) shows up, you almost certainly want to be
declaring sigma with a lower bound:

  real<lower=0> sigma;

The constraints declared on variables need to match their support
in the model.
Add a Jacobian for what? You only need a Jacobian adjustment
for a change of variables.
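
For example (this isn't your model, just an illustration of when the adjustment is needed): if sigma is
declared as a parameter but the prior is put on log(sigma), that's a change of variables, and the log
absolute Jacobian of the transform works out to -log(sigma):

  parameters {
    real<lower=0> sigma;
  }
  model {
    log(sigma) ~ normal(0, 1);        // prior placed on a transform of a parameter
    increment_log_prob(-log(sigma));  // log Jacobian: log | d log(sigma) / d sigma | = -log(sigma)
  }

If you put the prior directly on sigma (sigma ~ ...), no adjustment is needed.
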
I take it in increment_log_prob(sum(summands)) that
each summand is supposed to be a gamma log density?
lgamma() is very expensive to evaluate, so if the loop is large,
this is going to be slow.
You want to define a local variable log_sigma = log(sigma)
rather than continually recomputing it; see the sketch below.
Unless you want to inspect all those transformed parameters
in the output, you can just make them local variables in the
model.
exp(log(a) - b) is more efficiently coded as a * exp(-b)
as long as exp(-b) won't overflow or underflow.
exp(2 * (eta1[i] - log(sigma)))
  = square(exp(eta1[i] - log(sigma)))
  = square(exp(eta1[i]) / sigma)
  = square(exp(eta1[i])) / square(sigma)
and now you can just save square(sigma) and reuse it
rather than recomputing it. Same for exp(eta1[i]),
which gets reused in the summands.
You define eta0 and zeta1 but don't use them. This
is expensive in Stan because it causes unused derivative
evaluation and chain-rule propagations.
It won't make much difference to speed in this model, but you
don't need to build up summands or alpha as vectors.
You can make them locals in the loop and just call increment_log_prob()
for each of what is now summands[i] --- increment_log_prob()
accumulates everything you pass it into the log posterior, so the
explicit sum() isn't needed. Just define the terms you reuse
(log(sigma) and square(sigma)) as locals outside the loop.
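
Roughly, the structure would look like this. The data and parameters blocks are just scaffolding so the
sketch stands alone, and the summand expression is a placeholder (I don't know what summands[i] actually
is), so substitute your own term:

  data {
    int<lower=1> N;
  }
  parameters {
    real<lower=0> sigma;
    vector[N] eta1;
  }
  model {
    real log_sigma;
    real sigma_sq;
    log_sigma <- log(sigma);      // computed once, outside the loop
    sigma_sq <- square(sigma);    // computed once, outside the loop
    for (i in 1:N) {
      real exp_eta1;
      exp_eta1 <- exp(eta1[i]);   // reused within the iteration
      // placeholder term --- replace with whatever summands[i] really is
      increment_log_prob(-log_sigma - square(exp_eta1) / sigma_sq);
    }
  }
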
Why is there a final increment_log_prob(-log(sigma)); ?
Part of the computational problem may be overly vague priors:

  beta ~ normal(0, 100);
  gamma ~ normal(0, 100);

If you don't expect beta or gamma to be in the 100s, don't use
priors this wide.
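
For example, if you expect the coefficients to be on the order of single digits (an assumption on my
part, since I don't know the scale of your predictors), something like

  beta ~ normal(0, 5);
  gamma ~ normal(0, 5);

will be much better behaved.
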
In general, though, I'd suggest you get the model working before trying
to optimize too heavily. Then inspect what's going on with the samples
and see if the parameters are identified or if they need tighter priors.
- Bob