extracting parameters from optimization in RStan?

721 views
Skip to first unread message

David Chudzicki (dchudz@gmail.com)

unread,
Feb 2, 2014, 7:11:04 PM2/2/14
to stan-...@googlegroups.com
Calling the extract method on a stanfit object (returned by sampling or stan) puts the parameters in a very convenient form for working with. Is there a nice way to get the parameters resulting from optimizing in a similarly convenient form?

Naively, it seems to me like maybe it would make sense for the return value of optimizing to share a class with the return value of sampling. We do some of the same things with both types of fits, and I'd like the way I handle them to be similar. Maybe optimizing should give you a stanfit object with 1 chain and 1 iteration? Or maybe they're different subclasses of the same parent class?

I've pasted below an example session, and attached the code/data to run it (example is based on https://github.com/stan-dev/stan/tree/develop/src/models/misc/gaussian-process). At the bottom, you can see I faked a stanfit object by replacing the parameters from sampling 1 chain and 1 iteration with my optimized parameters. If I understand right, calling extract on that object did basically what I want. But obviously that's not ideal.

-David



> library("rstan")
> source("gp-fit.data.R")
> gpModel <- stan_model(file="gp-fit.stan")

TRANSLATING MODEL 'gp-fit' FROM Stan CODE TO C++ CODE NOW.
COMPILING THE C++ CODE FOR MODEL 'gp-fit' NOW.
> gpFitSampling <- sampling(gpModel, data=list(x=x,N=N,y=y), iter=200, chains=3)
SAMPLING FOR MODEL 'gp-fit' NOW (CHAIN 1).
Iteration:   1 / 200 [  0%]  (Warmup)
.
.
.
> class(gpFitSampling)
[1] "stanfit"
attr(,"package")
[1] "rstan"
> gpParametersSampling <- extract(gpFitSampling)
> dim(gpParametersSampling$Sigma)
[1] 300 101 101
> # Now I can do useful things with Sigma
> gpFitOptimizing <- optimizing(gpModel, data=list(x=x,N=N,y=y), iter=200, chains=3)
STAN OPTIMIZATION COMMAND (BFGS)
.
.
.
Optimization terminated normally: Convergence detected: change in objective function was below tolerance
> class(gpFitOptimizing)
[1] "list"
> gpFitOptimizing$par[1:20]
     eta_sq      rho_sq    sigma_sq  Sigma[1,1]  Sigma[2,1]  Sigma[3,1]  Sigma[4,1]  Sigma[5,1]  Sigma[6,1]  Sigma[7,1] 
 0.82396581  1.00554826  0.09551201  0.91947781  0.81572195  0.79148197  0.75267211  0.70151432  0.64081574  0.57371437 
 Sigma[8,1]  Sigma[9,1] Sigma[10,1] Sigma[11,1] Sigma[12,1] Sigma[13,1] Sigma[14,1] Sigma[15,1] Sigma[16,1] Sigma[17,1] 
 0.50341274  0.43293090  0.36490423  0.30144295  0.24406037  0.19366687  0.15061885  0.11480723  0.08576795  0.06279814 
> # Now it's a pain to do useful things with any of the parameters, especially Sigma.

> gpFitSamplingTemplate <- sampling(gpModel, data=list(x=x,N=N,y=y), iter=1, chains=1)
SAMPLING FOR MODEL 'gp-fit' NOW (CHAIN 1).
Iteration: 1 / 1 [100%]  (Sampling)
Elapsed Time: 2e-06 seconds (Warm-up)
              0.008863 seconds (Sampling)
              0.008865 seconds (Total)

> parsList <- (as.list(gpFitOptimizing$par))
> for (ixParameter in seq_along(parsList)) {
+ gpFitSamplingTemplate@sim$samples[[1]][[ixParameter]] <- parsList[[ixParameter]]
+ }
> optimizedParametersExtracted <- extract(gpFitSamplingTemplate, inc_warmup=TRUE)
> gpFitOptimizing$par[1:10]
    eta_sq     rho_sq   sigma_sq Sigma[1,1] Sigma[2,1] Sigma[3,1] Sigma[4,1] Sigma[5,1] Sigma[6,1] Sigma[7,1] 
0.82396581 1.00554826 0.09551201 0.91947781 0.81572195 0.79148197 0.75267211 0.70151432 0.64081574 0.57371437 
> optimizedParametersExtracted$eta_sq
[1] 0.8239658
> optimizedParametersExtracted$sigma_sq
[1] 0.09551201
> (optimizedParametersExtracted$Sigma)[1,1,1]
[1] 0.9194778
gp-fit.data.R
gp-fit.R
gp-fit.stan

Bob Carpenter

unread,
Feb 2, 2014, 8:06:26 PM2/2/14
to stan-...@googlegroups.com
I think structured output for optimizing() in RStan is
a very good idea, so I created a feature request:

https://github.com/stan-dev/rstan/issues/41

Users should feel free to submit feature requests, too.

CmdStan is set up so that the output of optimization and
MCMC are the same format, but those outputs are simple
CSV files with other material encoded as comments, and it's
not easy to extract structure from the headers.

If anyone wants to see this kind of thing happen (or other
things in RStan), we're always happy to take pull requests
(that come with reasonable test and doc) or welcome new dev
team members. We're also happy to help out if you need
help getting started building features or understanding how
things are currently coded.

- Bob
> --
> You received this message because you are subscribed to the Google Groups "stan users mailing list" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
Reply all
Reply to author
Forward
0 new messages