Groups keyboard shortcuts have been updated
Dismiss
See shortcuts

[DLM with INLA] Interventions

124 views
Skip to first unread message

Dominique Soudant

unread,
Jul 1, 2024, 10:08:23 AM7/1/24
to R-inla discussion group
Hello,
In time series analysis, let's define an "intervention" as a modification of the model parameters to take into account of exogenous information. I'm considering two types of "intervention" here: 1) dealing with an exceptional value (i.e. neither false nor dubious) also known as an outlier 2) dealing with a level change. With the dlm package, I was doing an "intervention" for an outlier by allowing the observation variance to be greater than the routine observation variance (i.e. V_outlier = (1+exp(delta))*V_routine, delta estimated by maximum likelihood). Similarly, for a level change, I was allowing the evolution variance of the mean level to be greater than the routine evolution variance of the mean level.

In INLA, let's say that Obs are the observations.

Outlier
Let's say that the outlier is the 200th observation (RW1 process). The way I found to add an intervention for such outlier is as follow :
ObsMat <- matrix(cbind(Obs,NA),n,2)
ObsMat[200,1] <- NA
ObsMat[200,2] <- Obs[200]

RW1 = ObsMat ~ 0 + f(i, model="rw1",constr=FALSE)

# Estimation
model.RW1 = inla(RW1,
  data = data.frame(i),
  family = rep("gaussian", 2), # Two gaussian noise 1) regular observations 2) outlier
  control.family = list(
    list()
    ,list()    
  ),
  control.predictor = list(compute = TRUE)
)
This induces that an observation variance is specifically estimated for the outlier and my preliminary tests show that it works rather well.

Question : is there a way to specify that this outlier variance has to be greater than the "routine" observation variance ?

Level shifts
Let's say that in addition to the previous outlier, two interventions for level shifts have  to be specified at the 400th and the 500th observations.
k <- rep(0,n)
k[400:length(Obs)] <- 1
l <- rep(0,n)
l[500:length(Obs)] <- 1
RW1 = ObsMat ~ 0 + f(i, model="rw1",constr=FALSE) + k + l

# Estimation
model.RW1 = inla(RW1,
  data = data.frame(i,k,l),
  family = rep("gaussian", 2), # Two gaussian noise 1) regular observations 2) outlier
  control.family = list(
    list()
    ,list()    
  ),
  control.predictor = list(compute = TRUE)
)
This works pretty well too. But it's not the same idea as enlarging the evolution variance of the mean level.

Question : I can live with this way of handling level change interventions, but, alternatively, I wonder if it's possible to specify a higher evolution variance of the mean level for this type of intervention ?

Thank you for all the job done around INLA.

DS

Finn Lindgren

unread,
Jul 2, 2024, 8:48:18 AM7/2/24
to R-inla discussion group
Hi,

more general versions of such models can be implemented with rgeneric
and generic, but in your particular example, there are some simpler
alternatives:

Since you treat the "outliers" as having known location in the data,
you could introduce a random effect component that is added only to
those observations, in effect giving them variance variance_obs +
extra_variance_outlier, instead of what you did, which was to have two
separate models. By adding the random effect only to the outliers, you
will be guaranteed that their model variance is larger than for
non-outliers. This can be accomplished by
+f(additional_variability, model="iid")
where additional_variability has NA for all non-outliers, and indices
1, 2, 3, 4, ..., number_of_outliers, for the outliers.

Similarly, to get increased innovation variance after a specific time
point, you could add another random walk component there, constrained
to start at zero at that time, and applied only after that time (by
having NA in the input vector for earlier time points.
The combined effect would be that of an increased innovation variance
from that time (drawback of that approach is that it doesn't allow you
to estimate whether the change is a decrease or an increase).
Something like this:
+ f(extra_walk,model="rw1",constr=FALSE,extraconstr=list(A=matrix(rep(c(0,1,0),c(start_time-1,1,total_time-start_time)),
e = 0))
> --
> You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
> To view this discussion on the web, visit https://groups.google.com/d/msgid/r-inla-discussion-group/853cb581-c2f8-43d1-b938-b00edd2da360n%40googlegroups.com.



--
Finn Lindgren
email: finn.l...@gmail.com

Dominique Soudant

unread,
Jul 3, 2024, 8:52:47 AM7/3/24
to R-inla discussion group
Hi,
Thank you very much for your reply. I'll get back to you as soon as possible.
Regards,
DS

Dominique Soudant

unread,
Jul 3, 2024, 11:09:56 AM7/3/24
to R-inla discussion group
Hi,
Let's consider the following data simulation :
# SSsimple is used to simulate the time series.
if(!require("SSsimple")){
  install.packages("SSsimple",repos="http://cran.irsn.fr")
  require("SSsimple")
}

# The model is :
# Y_t = F*Theta_t + epsilon_t, epsilon_t ~ N(0,V)
# Theta_t = GG*Theta_{t-1} + omega_t, omega_t ~ N(0,W)

# F and GG are structurally derived from the model described above.
# The model described above and the package dlm were used to estimate V, W, and
# Beta0 from actual data.

#set.seed(2)

# Observation "matrix"
F <- matrix(c(1),1,1)

# Evolution matrix
GG <- matrix(c(1)
,1
,1
,byrow=TRUE)

# Observation variance
V <- 0.2721225

# Evolution variance matrix
W <- matrix(c(10**-6)
,1
,1
,byrow=TRUE)

# Time 0 state "matrix"
Beta0 <- c(0.4250634)

# Simulating the time series :
sim <- SS.sim(F=GG
,Q=W
,H=F
,R=V
,600
,Beta0
)

Obs <- sim$Z

# adding an outlier
Obs[round(length(Obs)/3,0)] <- 10

# Adding level shift
Obs[(2*round(length(Obs)/3,0)):length(Obs)] <- Obs[(2*round(length(Obs)/3,0)):length(Obs)]+2
Obs[500:length(Obs)] <- Obs[500:length(Obs)]-1
If I'm right, the INLA model with the outlier as an iid noise is :
# INLA estimation of the model _________________________________________________
# Formula ______________________________________________________________________
n=length(Obs)
i <- 1:n

# outlier
j <- rep(NA,n)
j[round(length(Obs)/3,0)] <- 1

k <- rep(0,n)
k[(2*round(length(Obs)/3,0)):length(Obs)] <- 1

l <- rep(0,n)
l[500:length(Obs)] <- 1

RW1 = Obs ~ 0 + f(i, model="rw1",constr=FALSE)  +
              f(j,model="iid") +

              k +
              l

# Estimation
# loading the INLA library
# ------------------------
require(INLA)

model.RW1 = inla(RW1,
  data = data.frame(i,j,k,l),
  family = rep("gaussian", 1),
  control.family = list(
    list()

  ),
  control.predictor = list(compute = TRUE)
)
But the behaviour is not as expected : 
rm("r")
r <- model.RW1

SIM <- Obs
INLA <- r$summary.fitted.values[1:n,1]
LL <- r$summary.fitted.values[1:n,3]
UL <- r$summary.fitted.values[1:n,5]
YRange <- range(Obs,INLA,LL,UL,na.rm=TRUE)
plot(SIM,col="lightgray",ylim=YRange)
lines(INLA)
lines(LL,lty="dashed")
lines(UL,lty="dashed")
abline(v=seq(1,length(Obs),by=52),col="lightgray")
The fitted values jump up to the outlier instead of ignoring it. I think I'd already tried that way and my conclusion was that the intervention for the outlier has to be on the observation variance, not on the unobserved component.

Regards
DS

Finn Lindgren

unread,
Jul 3, 2024, 1:36:29 PM7/3/24
to Dominique Soudant, R-inla discussion group
Comment on your final sentence:

The "fitted values" predictor includes the random effects, as it
includes _all_ the latent variables in the model. Since the outlier
effect is explicitly modelled by a random effect component, it will
also be part of the "fitted values" predictor, so it _should_ "jump
up" to the outlier.
Since you're using the extra component to "capture" outliers, you're
probably looking for the posterior prediction that does _not_ include
that in the expression. This is probably most easily done by posterior
sampling; that allows you to compute any functional of the model
components you like.

Put differently:
The extra component does not explicitly need to be part of the
observation variance if you spell out the _whole_ model in
mathematical form, i.e. including an extra random term for the
observation noise. It will be something like
all_except_noise + regular_observation_noise + outlier_noise
The "fitted values" you looked at are "all_except_noise +
outlier_noise", but you (in your specific model case) were only
interested in all_except_noise. The solution is to make sure you're
looking at all_except_noise, and not the "fitted values", since the
"fitted values" are not what your model interpretation is interested
in. What you want to do is simply different from the normal type of
models, so the default "fitted values" isn't the right thing for you.
Solve that by looking at the right thing (e.g. via posterior sampling
that lets you compute anything you want).
"fitted values" is only the right thing when you want "the sum of
_everything_ in the latent predictor model". In your case the extra
outlier noise component is specifically there because you're
interested in "the sum of everything _except_ the additional noise
induced by outliers", so "fitted values" isn't what you need.

Finn
> To view this discussion on the web, visit https://groups.google.com/d/msgid/r-inla-discussion-group/a5089232-46e5-4c80-8747-f139a5561e7fn%40googlegroups.com.

Dominique Soudant

unread,
Jul 4, 2024, 3:16:29 AM7/4/24
to R-inla discussion group
Ok I get it, thank you so much for taking the time to reply.
Regards
DS

Dominique Soudant

unread,
Apr 8, 2025, 10:09:25 AMApr 8
to R-inla discussion group
Hello,
I've decided to stick to the first way I presented of dealing with outliers. I had taken my example here:

https://sites.google.com/a/r-inla.org/www/models/tools#h.p_ID_138

Which reads:
There is no constraint in INLA that the type of likelihood must be the same for all observations.

I was wondering if there was a paper or book supporting this statement. This might do the trick:
https://becarioprecario.bitbucket.io/inla-gitbook/ch-INLAfeatures.html#sec:sevlik

Any other suggestions are welcome.
Thanks in advance
Regards
DS

Helpdesk (Haavard Rue)

unread,
Apr 14, 2025, 1:43:28 PMApr 14
to Dominique Soudant, R-inla discussion group

It simply says that every observation could have its own likelihood. this is the
design from the very beginning
Håvard Rue
he...@r-inla.org

Zazaland

unread,
Apr 22, 2025, 11:48:16 AMApr 22
to R-inla discussion group
Okay 

Dominique Soudant

unread,
Apr 25, 2025, 10:30:06 AMApr 25
to R-inla discussion group
Hello,

I'm coming back specifically to Finn Lindgren's suggestion for a mean level intervention. Attached is a script to implement this suggestion. The code section in the script implementing the suggestion is :
extra_walk <- c(rep(NA,SecondShift-1),SecondShift:SampleSize)

formula <- Y ~ 0 +
               f(i, model = "rw2", constr = FALSE) +
               f(extra_walk
                 ,model="rw1"
                 ,constr=FALSE
                 # ,extraconstr=list(A=matrix(rep(c(1,0)
                 #     ,c(1,SampleSize-SecondShift)
                 #              )
                 # ,1
                 # ,SampleSize-SecondShift+1
                 # )
                 # ,e = 0
                 # )
                 )

This works quite well (i.e. there's a jump in the mean level, very similar to what you get with a fixed effect) but doesn't include the extraconstr statement. If we add the extraconstr statement (which I've adapted to the best of my understanding of the syntax on the one hand and Finn's intention on the other), there's no longer any jump in the mean level. While the fixed-effects approach does the trick, I'm still curious whether it's possible to make an intervention by inflating the innovation variance of the mean level.

if by any chance you have some time to spare on this non-priority subject, thank you in advance.
DS

Le mardi 2 juillet 2024 à 14:48:18 UTC+2, Finn Lindgren a écrit :
Finn Lindgren 01.r

Finn Lindgren

unread,
Apr 28, 2025, 3:50:21 AMApr 28
to Dominique Soudant, R-inla discussion group
Hi,

it wasn't clear to me in the earlier discussion that you were looking for a temporal mean-shift model; I interpreted "outlier" in the sense of "at a _specific_ time, the value is anomalous", and not as "at a specific time, the expectation shifts and stays shifted in the future", which is what your simulation does. That's probably why your extraconstr doesn't do what you wanted; it's not designed for what you actually wanted to do.

The rw1 and rw2 models as implemented in inla have constant innovation noise variance, so cannot themselves accommodate mean-shifts of this type; like I mentioned before though, you can implement this yourself with rgeneric/cgeneric. But note that for rw1 at least, this is can be mathematically equivalent to having a constant-variance rw1 model plus a combination of step functions with weights being iid random variables. The case where the steps have a common variance parameter constant variance is simple to implement; in inlabru, one can do it with a matrix mapping to iid;

+ steps(cbind(step1, step2, step3), model = "iid", mapper = bru_mapper_matrix(3), hyper = ...)

In plain INLA, you'd need to supply the cbind(...) as an A-matrix connected to the steps component in inla.stack() instead.

Finn

--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages