I take it you don't want to call the Stan C++ code
directly, but want to use rstan::optimizing().
Which means you need a compiled Stan model with data
associated.
That's going to be inefficient reading it in and out
if you have to do it every time, because most of the
data is real data that stays fixed.
It sounds like you need some currying (as in Haskell B.,
not stew:
https://en.wikipedia.org/wiki/Currying ).
Suppose we have a Stan program populated with data
it defines
log q(theta | y)
the log unnormalized posterior for fixed data y and parameters
theta.
Then what I think you want to do is pick out a subset of the
parameters of theta, let's say alpha, so that
theta = (alpha, beta).
And what you want is to have is the curried function f,
defined by
f = lambda alpha . lambda beta . log q(alpha, beta | y).
where that's the lambda abstraction operator so popular
with the functional programming set (
https://en.wikipedia.org/wiki/Lambda_calculus ).
That'll let you plug in some actual values alpha' for alpha
and get a new function g defined by:
g = f(alpha') = lambda beta . log q(alpha', beta | y)
Now if we now take values beta' for beta and apply g, we get
g(beta') = log q(alpha', beta' | y).
The key is that we can create the function g that is
differentiable so we can plug it into our optimizers.
You are doing that the hard way with two Stan programs. I
was suggesting the above functional abstraction, which R
supports, and seems much much cleaner to me if you're just
experimenting with optimization --- but it sounds like you need
the rest of Stan's solvers, too (do we have a general name
for optimization, MCMC, and VI?)
When it comes time to do it in C++, we'll essentially roll
our own bind operation, and provide a function that'll return
another function, which will look something like this:
stan_model marginalize(const vector<string>& labels,
const vetor<double>& alpha);
where in the returned stan_model, you get something
that implements the model base class and concepts.
I think that should be sufficient, and it won't involve changing
the modeling language syntax. But it will involve a somewhat
expensive run-time bind operation (mainly for weaving the parameters
back together when we do g(beta')).
This could be precompiled if we had labels in the program itself. But
I'd really rather stay away from that if at all possible and
let the program just define the density.
Whew.
- Bob