scaling and "prediction" with random walks

Alex

Aug 19, 2013, 2:24:03 PM
to r-inla-disc...@googlegroups.com
Hello,

In the random walk manual an option regarding the scaling is mentioned.
scale.model=T
When I use it inside f(), inla() crashes in an "unusual way".
As for the scale argument, I can't figure out what it expects. The help says: "A scaling vector. Its meaning depends on the model." I passed a mean and an sd, but that does not work.

1) Is there a way to define the scaling within the f(), or a simple f(scale(x), model="rw2") is enough?
it does scale the precision/variable, but just in case there is an option within the f() or in case there is a difference.

2) To calculate the (eg median) linear predictor for a new observation (when the covariate is rw2) is it enough to cut the variable according to result$summary.random$covariate$ID, find the interval in which the new observation falls, and assign that interval's effect?
- If yes, then equivalently(?), if inla.group was used, is it enough to follow the inla.group construction and find the interval that the new observation belongs to? The code below shows what I mean.
- If no, do I have to calculate the (conditional) effect manually?

Best,
Alex


which.breaks <- result$summary.random$x$ID
n <- length(which.breaks)

f.copy.inla.group.core <- function(x, n, which.breaks, method, pred.x) {
  # follow the inla.group construction to rebuild the intervals
  if (method == "cut") {
    a <- cut(x, n, dig.lab = 3)
  } else {
    aq <- unique(quantile(x, probs = c(0, ppoints(n - 1), 1)))
    a <- cut(x, breaks = as.numeric(aq), include.lowest = TRUE, dig.lab = 3)
  }
  # from ?cut: recover the interval bounds from the factor labels
  # (the first label starts with "[" when include.lowest = TRUE)
  labs <- levels(a)
  bounds.cut <- data.frame(
    lower = as.numeric(sub("^.([^,]*),.*", "\\1", labs)),
    upper = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", labs))
  )
  # index of the interval that each new value falls in
  temp <- rep(NA_integer_, length(pred.x))
  for (i in seq_along(pred.x)) {
    # intervals are (lower, upper], so a value on a boundary is still matched
    junk <- which(pred.x[i] > bounds.cut$lower & pred.x[i] <= bounds.cut$upper)
    if (length(junk) == 0) {
      # outside all intervals: use the last or the first one (ID is ordered)
      junk <- if (pred.x[i] > max(bounds.cut$upper)) length(labs) else 1
    }
    temp[i] <- junk[1]
  }
  which.breaks[temp]
}


INLA help

Aug 19, 2013, 2:33:15 PM
to Alex, r-inla-disc...@googlegroups.com
On Mon, 2013-08-19 at 11:24 -0700, Alex wrote:
> Hello,
>
> In the random walk manual an option regarding the scaling is
> mentioned.
> scale.model=T


You have reached an option we're working on at the moment, and it is
work in progress. Hopefully there will be a tutorial document soon
describing the intended use. We'll announce it here on the list. Please
stay tuned ;-)

Best
H


--
Håvard Rue
he...@r-inla.org

INLA help

Aug 19, 2013, 2:59:25 PM
to Alex, r-inla-disc...@googlegroups.com
On Mon, 2013-08-19 at 11:24 -0700, Alex wrote:

> 1) Is there a way to define the scaling within the f(), or a simple
> f(scale(x), model="rw2") is enough? it does scale the
> precision/variable, but just in case there is an option within the f()
> or in case there is a difference.

The experimental new option 'scale.model' would scale the RW2 model so
that the 'generalized variance' is 1 (geometric mean of the variances
from the proper part of the model). In this way

f(idx, model="rw2", scale.model=TRUE)

would give the same results using

idx = 1:n
idx = n*(1:n)
idx = (1:n)/n

assuming no numerical difficulties arise. However, the main point is to
be able to set meaningful priors, or to set priors in a controlled
way.

More info about this will come, soon, we hope.
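
To make the 'generalized variance' idea concrete, here is a minimal base-R sketch. It is not INLA's implementation: the helper names (rw2_structure, generalized_variance, scale_structure) are made up for illustration, and it assumes an equally spaced RW2 whose structure matrix has rank n - 2. It computes the geometric mean of the marginal variances of the generalized inverse and rescales the structure matrix so that this quantity is 1. If rescaling the index only multiplies the structure matrix by a constant, the scaled matrix comes out the same.

```r
# RW2 structure matrix for n equally spaced nodes: Q = D'D, rank n - 2
rw2_structure <- function(n) {
  D <- diff(diag(n), differences = 2)    # (n - 2) x n second-difference matrix
  crossprod(D)
}

# Geometric mean of the marginal variances of the generalized inverse of Q
generalized_variance <- function(Q, rank_def = 2) {
  e <- eigen(Q, symmetric = TRUE)        # eigenvalues in decreasing order
  keep <- seq_len(nrow(Q) - rank_def)    # drop the null space (improper part)
  V <- e$vectors[, keep, drop = FALSE]
  Qinv <- V %*% (t(V) / e$values[keep])  # V diag(1 / lambda) V'
  exp(mean(log(diag(Qinv))))
}

# Rescale Q so that its generalized variance equals 1
scale_structure <- function(Q) Q * generalized_variance(Q)

Q  <- rw2_structure(30)
Qs <- scale_structure(Q)
generalized_variance(Qs)                 # close to 1
all.equal(Qs, scale_structure(1000 * Q)) # a constant factor in Q drops out
```

After this rescaling, the precision hyperparameter has the same meaning whatever constant factor the unscaled structure matrix carried, which is what makes a prior on it easier to interpret.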



> 2) To calculate the (eg median) linear predictor for a new observation
> (when the covariate is rw2) is it enough to cut the variable
> according

Maybe just add

control.predictor = list(compute = TRUE)

to get the posterior of the linear predictor, and if you want a certain
configuration of the covariates, include that in the model with an NA
response.
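
A minimal sketch of this suggestion, assuming a Gaussian response and a single covariate with an rw2 effect. The data are simulated, and names such as d_all, idx_new and xg are made up for illustration. The INLA part runs only if the package is installed.

```r
set.seed(1)
n_obs <- 200
n_new <- 5
x_obs <- runif(n_obs)
y_obs <- sin(2 * pi * x_obs) + rnorm(n_obs, sd = 0.3)
x_new <- seq(0.1, 0.9, length.out = n_new)

# Stack observed and new rows; the new rows get an NA response
d_all <- data.frame(y = c(y_obs, rep(NA, n_new)), x = c(x_obs, x_new))
idx_new <- n_obs + seq_len(n_new)

if (requireNamespace("INLA", quietly = TRUE)) {
  library(INLA)
  # Bin x over observed and new values together so they share the same groups
  d_all$xg <- inla.group(d_all$x, n = 25)
  result <- inla(y ~ f(xg, model = "rw2"), data = d_all,
                 control.predictor = list(compute = TRUE))
  # Posterior summaries of the linear predictor at the new rows
  print(result$summary.linear.predictor[idx_new, c("mean", "0.5quant")])
}
```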

Alex

Aug 19, 2013, 3:14:31 PM
to r-inla-disc...@googlegroups.com, Alex, he...@r-inla.org
Great. Thanks for the info and the prompt reply.


> to get the posterior of the linear predictor, and if you want a certain
> configuration of the covariates, include that in the model with an NA
> response.

For a large number of new observations (~10^6) this takes time. With SPDE models it takes even longer, even though I don't set the link option to compute the predictions on the response (probability) scale.
Since I need only the linear predictor (just the median, without any measure of variation) and the projected SPDE effect can be calculated instantly, I was wondering whether I can avoid including the new observations in the inla() call.

Best,
Alex

Finn Lindgren

Aug 19, 2013, 3:25:02 PM
to Alex, r-inla-disc...@googlegroups.com, he...@r-inla.org
On 19 August 2013 22:14, Alex <bof...@gmail.com> wrote:
>> to get the posterior of the linear predictor, and if you want a certain
>> configuration of the covariates, include that in the model with an NA
>> response.
> For large amount of new observations (~10^6) this takes time. Also with the
> use of SPDE models it takes more time despite the fact that I don't set the
> link option to calculate the predictions to the response (probability)
> scale.

Yes, due to the original internal inla-design this is not as efficient
as it could be.
I don't think there is much to gain in terms of only calculating the
median (or the mean) since that still requires doing the integration
over the posterior for the parameters, but a lot of speedup can be
obtained by doing two (or more) inla() calls, where the prediction
part is only included in the second call, with the result of the first
call as known parameter posterior mode.
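
A rough sketch of the two-call idea, under stated assumptions: the data are simulated, the names (d_all, is_obs, fit1, fit2) are made up, and passing the hyperparameter mode through control.mode with restart = FALSE is one way to reuse the first fit, not necessarily the exact recipe given earlier on the forum. The INLA part runs only if the package is installed.

```r
set.seed(2)
n_obs <- 200
n_new <- 1000
d_all <- data.frame(
  y = c(rnorm(n_obs), rep(NA, n_new)),   # NA response marks prediction rows
  x = c(runif(n_obs), runif(n_new))
)
is_obs <- !is.na(d_all$y)

if (requireNamespace("INLA", quietly = TRUE)) {
  library(INLA)
  d_all$xg <- inla.group(d_all$x, n = 25)
  form <- y ~ f(xg, model = "rw2")

  # Call 1: observed data only, to locate the hyperparameter mode
  fit1 <- inla(form, data = d_all[is_obs, ])

  # Call 2: include the prediction rows, starting at the mode from call 1
  # and skipping the search for it
  fit2 <- inla(form, data = d_all,
               control.predictor = list(compute = TRUE),
               control.mode = list(theta = fit1$mode$theta, restart = FALSE))
}
```

The second call still integrates over the hyperparameters around that mode; it only avoids repeating the optimisation with the prediction rows included.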

We've answered how to do that on the forum before, but I don't have
the direct link to the explanation; It _really_ should be in the FAQ,
but it doesn't seem to be there...

Finn

INLA help

Aug 19, 2013, 5:10:06 PM
to Finn Lindgren, Alex, r-inla-disc...@googlegroups.com
On Mon, 2013-08-19 at 22:25 +0300, Finn Lindgren wrote:
> On 19 August 2013 22:14, Alex <bof...@gmail.com> wrote:
> >> to get the posterior of the linear predictor, and if you want a certain
> >> configuration of the covariates, include that in the model with an NA
> >> response.
> > For large amount of new observations (~10^6) this takes time. Also with the
> > use of SPDE models it takes more time despite the fact that I don't set the
> > link option to calculate the predictions to the response (probability)
> > scale.
>
> Yes, due to the original internal inla-design this is not as efficient
> as it could be.
> I don't think there is much to gain in terms of only calculating the
> median (or the mean) since that still requires doing the integration
> over the posterior for the parameters, but a lot of speedup can be
> obtained by doing two (or more) inla() calls, where the prediction
> part is only included in the second call, with the result of the first
> call as known parameter posterior mode.

An alternative is to reduce the graph internally in the 'find the mode of
the hyperpar' step, removing nodes that come from prediction only. A second
alternative is to write a wrapper that does this in two steps. But if
there are many prediction points, this would not help; the ~10^6 case
above is simply too much. It should be possible to write
a 'predict' function that gives only a point estimate for a new
configuration by parsing the inla object. This could be intended for
'large prediction sets' like the one above.
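
As an illustration only, such a point-estimate 'predict' could look like the base-R sketch below. No such function exists in the thread; predict_point, rw2_summary and intercept_mean are hypothetical names, and the numbers stand in for result$summary.random$x and the intercept's posterior mean. It adds posterior means (which is exact for the mean of a sum, unlike medians) and linearly interpolates the rw2 effect between its nodes, which is an approximation.

```r
# Stand-ins for result$summary.random$x (columns ID, mean) and the
# posterior mean of the intercept; values are made up for illustration
rw2_summary <- data.frame(ID = c(0, 1, 2, 3), mean = c(-0.5, 0.0, 0.4, 0.2))
intercept_mean <- 1.0

# Point estimate of the linear predictor at new covariate values
predict_point <- function(pred.x, rw2_summary, intercept_mean) {
  # linear interpolation between nodes; rule = 2 clamps values outside the range
  eff <- approx(rw2_summary$ID, rw2_summary$mean, xout = pred.x, rule = 2)$y
  intercept_mean + eff
}

predict_point(c(-1, 0.5, 2, 10), rw2_summary, intercept_mean)
# 0.50 0.75 1.40 1.20
```

Because the lookup is vectorised, a prediction set of ~10^6 values is a single approx() call with no inla() run.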




--
Håvard Rue
he...@r-inla.org
