Preferential sampling with both random and fixed effects

Scott Foster

Nov 14, 2022, 3:47:00 AM
to R-inla discussion group
Hi,

I'm trying to implement a model of preferential sampling. The information I can find online is very helpful, but I think it falls a little short of where I would like to be. I'm happy with the marked point process (two likelihoods) and with using f(., copy='i', .), but I'd also like to "copy" the fixed effects and have them share a common preference parameter.

That may be a little vague, so let me try to firm things up a bit.

Let Y be the marks at sites X.  Model the intensity of the LGCP for X as:

log( E(X|u)) = \eta = X\tau+u

and the model for the marks as

g(E(Y|X,u)) = \alpha + \beta * \eta
            = \alpha + \beta * X\tau + \beta * u

Currently, the examples online (the book, https://becarioprecario.bitbucket.io/spde-gitbook/ch-lcox.html#model-fitting-under-preferential-sampling) show how to add the final term, \beta * u, but not the term with the fixed effects, \beta * X\tau.

I'd settle for a model with a separate parameter for fixed and random contributions, viz

g(E(Y|X,u)) = \alpha + \beta_f * X\tau + \beta_r * u

with \beta_f != \beta_r, but it would be nice to have the option of a single shared \beta.

Any advice would be greatly appreciated.  I'm a bit lost right now.

Thanks in advance,

Scott

Helpdesk (Haavard Rue)

Nov 14, 2022, 1:27:51 PM
to Scott Foster, R-inla discussion group

You need to zero out the contributions you do not want when you define
the joint (or union) of the two models, either with 0's in the covariates
or with idx[]=NA in f(idx,...).

--
Håvard Rue
he...@r-inla.org

Scott Foster

Nov 15, 2022, 1:03:11 AM
to Helpdesk (Haavard Rue), R-inla discussion group
Thanks for the response.  It is greatly appreciated.

However, I don't understand how to use it in relation to the proposed model. I think my description of the model may have been ambiguous, so I'll try again (and then try to explain my confusion).

Consider a marked point process and let Y be the marks measured at sites X.  The conditional model for the marks is

g( E(Y|u)) = \eta_Y 
             = W_Y\tau + u_Y

where W_Y is a design matrix for the covariates at the sites X, \tau is a vector of coefficients and u_Y is a vector containing realisations of a Gaussian process at sites X.  This is 'just' a standard geostatistical model for the marks.

The model for the point process allows for possible preferential sampling. It is 

log( E(X|u)) = \alpha + \beta*\eta 
             = \alpha + \beta*W_X\tau + \beta*u

where W_X is a design matrix (dimension M_X \times p) that depends upon the approximation to the Poisson point process (on- or off-grid, for example), \tau and u are as before, \alpha is a constant and \beta is a preference parameter.  If \beta>0 then the sites are associated with higher values of E(Y|u).

Please note that, since my last email, I have swapped \beta and \alpha so that they now affect the locations and not the marks.  The same problem persists though (this is just a rescaling to make the preference more obvious).

Note that in the INLA call, the design matrix will be of dimension (|X|+M_X) \times p, to allow for the two outcomes, and will consist of the two W matrices above stacked on top of each other.

Adding the scalar parameter \alpha is easy, as it involves only adding a column of |X| zeros and M_X ones to the design matrix fed into INLA.

Adding \beta is where my confusion lies.  If it were known a priori then I could simply multiply the relevant part of the design matrix.  If it were additive, then I could add columns to the design matrix (like \alpha).  In the model for the locations, the design matrix retains the p columns but the coefficients are now \beta*\tau -- a 'compound' parameter (my made-up term).

So, I don't think that the model can be specified by altering the design matrix -- the terms of interest are 'compound' parameters, not additive.  The \beta scalar parameter also spans many covariates (columns of the design matrix).
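
In matrix form, the difficulty described above can be sketched as follows. This is only an illustration in the notation of this message; \eta_X (the log-intensity at the M_X point-process locations), u_X (u evaluated at those locations), 0_{|X|} and 1_{M_X} (columns of zeros and ones) are symbols introduced here and are not used elsewhere in the thread.

```latex
\begin{pmatrix} \eta_Y \\ \eta_X \end{pmatrix}
=
\begin{pmatrix} W_Y & 0_{|X|} \\ \beta W_X & 1_{M_X} \end{pmatrix}
\begin{pmatrix} \tau \\ \alpha \end{pmatrix}
+
\begin{pmatrix} u_Y \\ \beta u_X \end{pmatrix}
```

The lower-left block \beta W_X contains the unknown \beta, so the stacked predictor is not linear in (\tau, \alpha, \beta) jointly: the same \tau appears with coefficient 1 in the marks and with coefficient \beta in the locations.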

As I read the help file for f(), it applies only to single covariates, not to several at once as needed here.

Am I missing something? Is this even possible, given that it is (just) a non-linear model?

Thanks again,

Scott

Elias T. Krainski

Nov 15, 2022, 5:43:26 AM
to R-inla discussion group
Hi,

In order to copy the entire linear predictor, you can define "fake zero" observations, so that
 0 = W\tau + u - v
where v ~ N(0, bigVariance).
Then you copy 'v' instead of 'u' for the LGCP part. Please check Section 3.3 of the book.

Elias
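
Spelled out in the notation of the earlier messages, the suggestion above might look like the sketch below. It assumes the fake zeros are defined at the M_X point-process locations and are given a Gaussian likelihood with a very small, fixed error variance; the error term \epsilon, the variances \sigma_\epsilon^2 and \sigma_v^2, and the symbol u_X (u at those locations) are assumptions made here for illustration and are not stated in the reply.

```latex
\begin{align*}
0 &= W_X\tau + u_X - v + \epsilon,
  & \epsilon &\sim N(0, \sigma_\epsilon^2 I), \quad \sigma_\epsilon^2 \text{ tiny and fixed} \\
v &\sim N(0, \sigma_v^2 I),
  & \sigma_v^2 &\text{ large and fixed (the ``bigVariance'')} \\
\Rightarrow\quad v &\approx W_X\tau + u_X \\
\log E(X|u) &= \alpha + \beta v
  \approx \alpha + \beta W_X\tau + \beta u_X
\end{align*}
```

Because v is barely constrained by its own prior and the fake zeros are almost noise-free, v is forced to track the whole linear predictor W_X\tau + u_X. Copying v into the point-process likelihood with an estimated scaling then gives a single \beta multiplying both the fixed and the random contributions, which is the shared preference parameter asked for at the start of the thread.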