Fitting lgcp: error "arguments imply differing number of rows"

121 views
Skip to first unread message

Michael Smith

unread,
Jun 16, 2021, 5:00:50 AM6/16/21
to R-inla discussion group
Dear all,

I'm running into a strange error message fitting a model with certain covariates, but not with others. Here is the error:

> fit1 <- lgcp(comp1, data = grid2, samplers = boundary2, domain = list(coordinates = Mesh6))
Error in data.frame(x = integ$loc[, 1], y = integ$loc[, 2], weight = integ$weight,  :
  arguments imply differing number of rows: 0, 1
> traceback()
7: stop(gettextf("arguments imply differing number of rows: %s",
       paste(unique(nrows), collapse = ", ")), domain = NA)
6: data.frame(x = integ$loc[, 1], y = integ$loc[, 2], weight = integ$weight,
       group = g)
5: bru_int_polygon(domain, poly_segm, method = int.args$method,
       nsub = int.args$nsub2)
4: ipoints(samplers, domain$coordinates, group = samp.dim, int.args = int.args)
3: ipmaker(samplers = samplers, domain = domain, dnames = response,
       int.args = options[["bru_int_args"]])
2: like(family = "cp", formula = formula, data = data, samplers = samplers,
       E = E, ips = ips, domain = domain, options = options)
1: lgcp(comp1, data = grid2, samplers = boundary2, domain = list(coordinates = Mesh6))

...and here is the code that gave rise to the error, (Windows 10, Inlabru 2.3.1) along with some code that didn't throw an error:

require(INLA)
require(rgdal)
require(rgeos)
library(inlabru)
library(sp)
library(raster)
library(spdep)
# read in data
grid2 <- readOGR(paste(getwd(), "/31K_envs_clipped_cop95c.shp", sep = "")) 
# change m to km scale
grid2 <- spTransform(grid2, "+proj=utm +zone=54 +south +ellps=WGS72 +towgs84=0,0,1.9,0,0,0.814,-0.38 +units=km +no_defs")
# read in boundary
boundary2 <- readOGR(paste(getwd(), "/NG_boundary_clipped_byecoreg_UTM_snglprt.shp", sep = ""))
boundary2 <- spTransform(boundary2, "+proj=utm +zone=54 +south +ellps=WGS72 +towgs84=0,0,1.9,0,0,0.814,-0.38 +units=km +no_defs")
# remove records with NAs
require(spatialEco)
grid2 <- sp.na.omit(grid2, col.name = "cost")
grid2 <- sp.na.omit(grid2, col.name = "cec")
grid2 <- sp.na.omit(grid2, col.name = "temp10")
grid2 <- sp.na.omit(grid2, col.name = "bio7")
# scale variables
grid2@data[c("cost", "cec", "temp10", "therm")] <- scale(grid2@data[c("cost", "cec", "temp10", "therm")])
grid2@data[c("tri", "fpar", "lai", "soc0_5")] <- scale(grid2@data[c("tri", "fpar", "lai", "soc0_5")])
grid2@data[c("sq3", "sq5", "bio7", "dem")] <- scale(grid2@data[c("sq3", "sq5", "bio7", "dem")])
grid2@data$cop95_1 <- as.factor(grid2@data$cop95_1)
Mesh6 <- inla.mesh.2d(loc=grid2, boundary=boundary2, max.edge = c(25, 250), cutoff = 10,
                      crs="+proj=utm +zone=54 +south +ellps=WGS72 +units=km +no_defs")
matern3 <- inla.spde2.pcmatern(Mesh6,
                               prior.sigma = c(1, 0.01),
                               prior.range = c(25, 0.01))

# this works:
comp2 <- coordinates ~
  cost(cost, model = "linear") +
  mySmooth(coordinates, model = matern3) + Intercept(1)
fit1 <- lgcp(comp1, data = grid2, samplers = boundary2, domain = list(coordinates = Mesh6))

# this formula leads to an error when calling lgcp:
comp1 <- coordinates ~
  tri(tri, model = "linear") +
  mySmooth(coordinates, model = matern3) + Intercept(1)
# so does this
comp1 <- coordinates ~
  cop95_1(cop95_1, model = "factor_contrast") + Intercept(1)
# and this
comp1 <- coordinates ~
  cop95_1(cop95_1, model = "factor_full") - 1

...so the regression produces results for any combination of a dozen or so covariates and/or a spatial term, but produces an error when 'tri' or the factor 'cop95_1' are introduced. There doesn't seem to be anything odd about them: all have 27335 observations, all covariates except the factor are numeric:
> summary(grid2@data$tri)
Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-1.2830 -0.8967 -0.1923  0.0000  0.7430  5.2999
> sum(is.na(grid2@data$tri))
[1] 0

I can supply the .shp files on request but they are a bit too large to post here I think.

Best wishes,
Michael



Finn Lindgren

unread,
Jun 16, 2021, 5:28:46 AM6/16/21
to Michael Smith, R-inla discussion group
Hi,
Since the error appears in the likelihood setup part of inlabru::lgcp, can you please report this as an inlabru Issue on github so I can help debug it from there instead?

Finn

On 16 Jun 2021, at 10:00, 'Michael Smith' via R-inla discussion group <r-inla-disc...@googlegroups.com> wrote:


--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/r-inla-discussion-group/9d4d14a3-13b9-439b-a61c-c428786f3c34n%40googlegroups.com.

Michael Smith

unread,
Jun 16, 2021, 6:54:30 AM6/16/21
to R-inla discussion group
Of course - will do!

Michael Smith

unread,
Jun 17, 2021, 6:54:30 AM6/17/21
to R-inla discussion group
Here's an update in case it helps any other beginners....

What I had failed to understand was that for lgcp I need covariates over the entire region, not just at the observation points. My only excuse is that I can understand the lgcp helpfile, but it's harder to work out how the formula should be set up. So I now read in covariates as rasters and convert to a list of Spatial Pixels dataframes:

```

require(rgdal)

require(rgeos)

library(inlabru)

library(sp)

library(raster)

library(spdep)

 

# read in shapefile with data points and coordinates as Spatial Points

# this .shp had to be altered in QGIS using 'Multipart to singleparts' function otherwise it caused an error

grid2 <- readOGR(paste(getwd(), "/31K_envs_clipped_tobuffer_UTM_singlepoint.shp", sep = ""))

# read in boundary of New Guinea as Spatial Polygons

boundary2 <- readOGR(paste(getwd(), "/NG_boundary_UTM_10km_buffer_exBGV.shp", sep = ""))

# transform both objects to use km instead of m scale

grid2 <- spTransform(grid2, "+proj=utm +zone=54 +south +ellps=WGS72 +towgs84=0,0,1.9,0,0,0.814,-0.38 +units=km +no_defs")

boundary2 <- spTransform(boundary2, "+proj=utm +zone=54 +south +ellps=WGS72 +towgs84=0,0,1.9,0,0,0.814,-0.38 +units=km +no_defs")

 

# read in all environmental rasters in folder and add to list

filesshp <- list.files("C:/Users/smith/Documents/NG_rasters/inlabru", pattern=".tif")

covlist <- list()

for(i in 1:length(filesshp)) {

     r1 <- raster(paste("C:/Users/smith/Documents/NG_rasters/inlabru/", filesshp[i], sep = ""))

     # convert raster to Spatial Pixels

     s1 <- as(r1,"SpatialPixelsDataFrame")

     # check there are no NA values in SpatialPixelsDataFrame

     print(table(is.na(s1@data)))

     # again change projection to km

     s1 <- spTransform(s1, "+proj=utm +zone=54 +south +ellps=WGS72 +towgs84=0,0,1.9,0,0,0.814,-0.38 +units=km +no_defs")

     # append to list

     covlist <- c(covlist, s1)

}

 

# set up mesh and SPDE

Mesh6 <- inla.mesh.2d(loc=grid2, boundary=boundary2, max.edge = c(25, 250), cutoff = 3,

                      crs="+proj=utm +zone=54 +south +ellps=WGS72 +units=km +no_defs")

matern3 <- inla.spde2.pcmatern(Mesh6,

                               prior.sigma = c(1, 0.01),

                               prior.range = c(25, 0.01))

# set up formula with 2 covariates, a spatial term and an intercept

comp2 <- coordinates ~

  cost(covlist[[5]], model = "linear") +

  dem(covlist[[6]], model = "linear") +

  mySmooth(coordinates, model = matern3) + Intercept(1)

# fit model (throws error)

fit2 <- lgcp(comp2, data = grid2, samplers = boundary2, domain = list(coordinates = Mesh6))

 

```

This gives the error:

Error in .M.kindC(clx) :  not yet implemented for SpatialPointsDataFrame

In addition: Warning message:

In is.na(input) : is.na() applied to non-(list or vector) of type 'S4'

> traceback()

22: stop(gettextf(" not yet implemented for %s", clx@className),  domain = NA)

21: .M.kindC(clx)

20: .M.kind(x)

19: Matrix::sparseMatrix(i = which(ok), j = rep(1, sum(ok)), x = input[ok],  dims = c(NROW(input), ibm_n(mapper)))

18: ibm_amatrix.bru_mapper_linear(mapper[["mappers"]][[x]], input = input[[x]],  multi = multi - 1)

17: ibm_amatrix(mapper[["mappers"]][[x]], input = input[[x]], multi = multi -  1)

16: FUN(X[[i]], ...)

15: lapply(indexing, function(x) {

        ibm_amatrix(mapper[["mappers"]][[x]], input = input[[x]],

            multi = multi - 1)

    })

14: ibm_amatrix.bru_mapper_multi(component$mapper, input = val)

13: ibm_amatrix(component$mapper, input = val)

12: amatrix_eval.component(x, data = data, ...)

11: amatrix_eval(x, data = data, ...)

10: FUN(X[[i]], ...)

9: lapply(components, function(x) amatrix_eval(x, data = data, ...))

8: amatrix_eval.component_list(model$effects[included], data = lh[["data"]])

7: amatrix_eval(model$effects[included], data = lh[["data"]])

6: FUN(X[[i]], ...)

5: lapply(lhoods, function(lh) {

       included <- parse_inclusion(names(model[["effects"]]), lh[["include_components"]],

           lh[["exclude_components"]])

       amatrix_eval(model$effects[included], data = lh[["data"]])

   })

4: evaluate_A(model, lhoods)

3: iinla(model = info[["model"]], lhoods = info[["lhoods"]], options = info[["options"]])

2: bru(components, lik, options = options)

1: lgcp(comp2, data = grid2, samplers = boundary2, domain = list(coordinates = Mesh6))

```

...which is an impressively long traceback! But it may just be that some of the data points fall outside the raster boundary (in the sea) and return NA values.

If so, my next step is to go and find some R code to remove the offending data points, and that would be a question for another forum. But it I need to do something on Github please let me know.
Thanks,
Michael

Finn Lindgren

unread,
Jun 17, 2021, 7:17:43 AM6/17/21
to Michael Smith, R-inla discussion group
It appears that this may be an issue with the Matrix package changing its behaviour, as I ran in to a similar unexpected sparseMatrix() error earlier today.
If you reopen the inlabru issue and post the new code there I can debug it later.

Finn



--
Finn Lindgren
email: finn.l...@gmail.com

Finn Lindgren

unread,
Jun 17, 2021, 7:23:41 AM6/17/21
to Michael Smith, R-inla discussion group
No, wait, the message says it's getting a SpatialPointsDataFrame as covariate input. That shouldn't happen with your code, since you convert it to SpatialPixelsDataFrame (that should get evaluated to a vector of values internally).
Can you check what covlist[[5]] and covlist[[6]] actually contain? Are they still Pixels, or are they Points?

Finn

Michael Smith

unread,
Jun 17, 2021, 8:47:26 AM6/17/21
to R-inla discussion group
Hi Finn, thanks for having a look....sorry - I should have double checked! They have turned back into Spatial Points(!)

What converts them back into Spatial Points is the spTransform function. Oh dear.

Michael Smith

unread,
Jun 17, 2021, 8:49:23 AM6/17/21
to R-inla discussion group
I think all I need to do is transform first, then convert to pixels. I will let you know..

Finn Lindgren

unread,
Jun 17, 2021, 9:00:50 AM6/17/21
to Michael Smith, R-inla discussion group
Ah, yes, if Pixels need to be on a regular grid, that happens...
I don’t recall if SpatialGridDataFrame solves that?

Alternatively, I think if  you use CRS info consistently (including in  the mesh) then you may not need to do the transformation of the covariates; I believe the internal evaluation method can evaluate the values regardless of what CRS is used for storage. The important thing for m vs km units is that the model uses km, I.e. the mesh.

Finn

On 17 Jun 2021, at 13:47, 'Michael Smith' via R-inla discussion group <r-inla-disc...@googlegroups.com> wrote:



Finn Lindgren

unread,
Jun 17, 2021, 9:02:34 AM6/17/21
to Michael Smith, R-inla discussion group
...but some parts of the code may not be happy unless the same CRS is used everywhere. It’s o the long list of checks and fixes to add to the package...

And yes, if the only change is the units, then the results should be possible to store as Pixels, so transforming first should help.

Finn

On 17 Jun 2021, at 14:00, Finn Lindgren <finn.l...@gmail.com> wrote:

Ah, yes, if Pixels need to be on a regular grid, that happens...

Michael Smith

unread,
Jun 17, 2021, 2:11:55 PM6/17/21
to R-inla discussion group
The code does pick up differing CRS which is perhaps a useful warning system...

I'll have another go without transforming anything - but coercing the Spatial Points back to Spatial Pixels seems to work, crude maybe but avoids me having to learn about transforming rasters:

for(i in 1:length(filesshp)) {

     r1 <- raster(paste("C:/Users/smith/Documents/NG_rasters/inlabru/", filesshp[i], sep = ""))

     s1 <- as(r1,"SpatialPixelsDataFrame")

     s2 <- spTransform(s1, "+proj=utm +zone=54 +south +ellps=WGS72 +towgs84=0,0,1.9,0,0,0.814,-0.38 +units=km +no_defs")

     s1 <- as(s2,"SpatialPixelsDataFrame")

  covlist <- c(covlist, s1)

}

...lgcp also fills in NA values somehow, which is useful (in this case where there are very few).

Thanks again,
Michael

Finn Lindgren

unread,
Jun 17, 2021, 2:19:30 PM6/17/21
to Michael Smith, R-inla discussion group
Re-converting to Pixels from Points sounds like the best option here.
The transformation from m to km doesn’t break the regularity of the grid, so as long as the rest of the CRS is the same, is should be fine. And yes, inlabru calls bru_fill_missing() to do nearest-available-value infilling, which should be ok for isolated missing values in the grid.

Finn

On 17 Jun 2021, at 19:12, 'Michael Smith' via R-inla discussion group <r-inla-disc...@googlegroups.com> wrote:


Reply all
Reply to author
Forward
0 new messages