Binary data and Error in seq.default and Hyman splinefun

364 views
Skip to first unread message

Hedvig Skirgård

unread,
Apr 14, 2022, 4:25:12 PM4/14/22
to R-inla discussion group
Dearest R-INLA support community,

We're encountering a problem that is hard to de-bug. Basically, what we want to do is run a series of INLA models on a set of binary variables and it's failing in unexpected ways.

I've opted for a for-loop over the traits *, running inla() each time. While running through the loop over the traits, the script unmistakably fails some way in the traits it's looping over and spits out:

Error in seq.default(r[[1L]], r[[2L]], length = n) :
  'from' must be a finite number


What mystifies us is that
a) it makes it a fair way into the loop of traits before failing
b) when we run the trait it is currently on in the loop in isolation outside of the loop inla() completes just fine.

This is starting to make us wonder, is it do with memory somehow? What could be going on? Any and all advice is appreciated!

More details
To make it easier for others to observe and recreate the problem I've set up a little GitHub repos with some scripts and example data. The crucial script is this one:

The data the analysis is working with is unfortunately not published just yet, so I've recreated the workflow with a very similar albeit smaller dataset (the Sahul-survey of Reesink, Dunn and others). Ironically, as I'm looping through those using the same set-up I actually fails in a different way.

Error in splinefun(xnew, xx, method = "hyman") : zero non-NA

We would be very grateful for any advice or assistance anyone can give us, in particular when it comes to the seq.default error.

I'm sorry that my attempt at a reprex of sorts has failed. I hope that looking at the code might prove insightful nonetheless.

You will need to clone the entire repos to get all the bits and bobs for the example I've set up. Beware, the scripts will be installing some packages when run (INLA and its dependencies, paceman, tidyverse, reshape2, ape, jsonlite, phytools, missForest and fields. Nothing crazy).

The data is a series of linguistic traits (grammatical) of languages. We're running several different models: one that accounts only for a phylogenetic vcv, one for the spatial vcv, one for a discrete way of spatially grouping languages and then combinations of the previous.

As it stands right now, the script gets through a few features and then fails with the consol output about the seq.default-error. Before doing so it does spit out some warnings about negative Hessian values. For now, I'm just concentrating on solving the issue that's failing the entire run, and then we'll turn our attention to other concerns.

I'd be more than happy to outline more details of what's going on, but I'll keep it relatively short and sweet for the introduction I think.

I'm new to INLA and this community, please forgive me if I've done anything wrong.

All the best! Happy Easter!
Hedvig

* Yes I might re-write with apply() or another solution, but for now it's a for-loop to make it conceptually easy to read.

Hedvig Skirgård

unread,
Apr 15, 2022, 2:52:35 AM4/15/22
to R-inla discussion group
My apologies, I sent you all the link to the wrong branch at the repos. Here is the main branch version, which is the appropriate version:


/Hedvig

Hedvig Skirgård

unread,
Apr 29, 2022, 8:31:53 AM4/29/22
to R-inla discussion group
Hi again, 

Maybe this problem isn't something anyone else has gotten before, I appreciate that it's maybe a bit niche.

Is there anyone else in general in this group who has experiences applying INLA to binary data and/or looping INLA?


Med vänliga hälsningar,

Hedvig Skirgård


PhD,  Australian National University

Postdoc at Department of Linguistic and Cultural Evolution, Max Planck Institute for Evolutionary Anthropology 

Website



--
You received this message because you are subscribed to the Google Groups "R-inla discussion group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to r-inla-discussion...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/r-inla-discussion-group/e0ffbaed-f762-4620-a281-37db04d88786n%40googlegroups.com.

r.di...@gmail.com

unread,
Apr 29, 2022, 11:17:01 AM4/29/22
to R-inla discussion group
Hi Hedvig!
Welcome to R-INLA discussion group. I'd be happy to look over your code and help to debug, but I am a little short on time at the moment. I could probably carve out some time to have a look sometime around the end of next week, if that is not too late for you? Otherwise, anyone else here can take a look too, though I strongly suspect the issue is not related to INLA itself. More likely something to do with R environments and the eval(substitute(..)) pattern you are using, but that is just from a very quick scan of your code. I notice you also cite my 2020 paper in your code. My recommendations for this type of model have evolved somewhat since I published that and I would be happy to discuss further about the model structure you are using if you are interested. Feel free to contact me via email for further discussion.

Cheers,
Russell

--
Russell Dinnage PhD
Research Assistant Professor
Institute of Environment
Department of Biological Sciences
Florida International University
Miami, Florida, United States

Helpdesk

unread,
May 1, 2022, 4:05:05 AM5/1/22
to Hedvig Skirgård, R-inla discussion group
I get this error still. looks like an R-issue to me, as this works well
in other cases of yours.

I would try to do this differently to make it more transparent what is
going on , like

formula = yy ~ 1 + xx

and then

data = data.frame(yy = DATA$SAHUL072, xx = DATA$SAHUL072.xx)

so the formula stays the same and only the data change. then it should
be easier to find the error or what cause it.

also make sure that the name 'SAHUL4.3.09a' does not create an issue in
itself, _and_ is actually there

Best
H



# Running the spatial-only model on feature SAHUL072. That means I'm
87.57% done.
# Running the spatial-only model on feature SAHUL075. That means I'm
88.14% done.
# Running the spatial-only model on feature SAHUL032. That means I'm
88.7% done.
# Running the spatial-only model on feature SAHUL059. That means I'm
89.27% done.
# Running the spatial-only model on feature SAHUL060. That means I'm
89.83% done.
# Running the spatial-only model on feature SAHUL036. That means I'm
90.4% done.
# Running the spatial-only model on feature SAHUL063. That means I'm
90.96% done.
# Running the spatial-only model on feature SAHUL038. That means I'm
91.53% done.
# Running the spatial-only model on feature SAHUL050. That means I'm
92.09% done.
# Running the spatial-only model on feature SAHUL061. That means I'm
92.66% done.
# Running the spatial-only model on feature SAHUL062. That means I'm
93.22% done.
# Running the spatial-only model on feature SAHUL064. That means I'm
93.79% done.
# Running the spatial-only model on feature SAHUL010. That means I'm
94.35% done.
# Running the spatial-only model on feature SAHUL011. That means I'm
94.92% done.
# Running the spatial-only model on feature SAHUL030. That means I'm
95.48% done.
# Running the spatial-only model on feature SAHUL005. That means I'm
96.05% done.
# Running the spatial-only model on feature SAHUL006. That means I'm
96.61% done.
# Running the spatial-only model on feature SAHUL017. That means I'm
97.18% done.
# Running the spatial-only model on feature SAHUL028. That means I'm
97.74% done.
# Running the spatial-only model on feature SAHUL004. That means I'm
98.31% done.
# Running the spatial-only model on feature SAHUL007. That means I'm
98.87% done.
# Running the spatial-only model on feature SAHUL003. That means I'm
99.44% done.
All done with the spatial only model, 100% done!#### AUTOTYP_area only
model ####
Starting INLA AUTOTYP-area only runs featurewise runs at 2022-05-01
10:59:37 .
# Running the autotyp_area-only model on feature SAHUL045. That means
I'm 0% done.
# Running the autotyp_area-only model on feature SAHUL4.3.09a. That
means I'm 0.56% done.
Error in eval(parse(text = formula[2]), envir = data, enclos =
.parent.frame) :
object 'SAHUL4.3.09a' not found

*** inla.core.safe: inla.program has crashed: rerun to get better
initial values. try=1/2
Error in eval(parse(text = formula[2]), envir = data, enclos =
.parent.frame) :
object 'SAHUL4.3.09a' not found

*** inla.core.safe: inla.program has crashed: rerun to get better
initial values. try=2/2
Error in eval(parse(text = formula[2]), envir = data, enclos =
.parent.frame) :
object 'SAHUL4.3.09a' not found
Error in inla.core.safe(formula = formula, family = family, contrasts =
contrasts, :
*** Fail to get good enough initial values. Maybe it is due to
something else.
# Running the autotyp_area-only model on feature SAHUL4.3.09c. That
means I'm 1.13% done.
Error in eval(parse(text = formula[2]), envir = data, enclos =
.parent.frame) :
object 'SAHUL4.3.09c' not found

*** inla.core.safe: inla.program has crashed: rerun to get better
initial values. try=1/2
Error in eval(parse(text = formula[2]), envir = data, enclos =
.parent.frame) :
object 'SAHUL4.3.09c' not found


--
Håvard Rue
he...@r-inla.org

Hedvig Skirgård

unread,
May 1, 2022, 4:34:32 AM5/1/22
to Helpdesk, R-inla discussion group
Thank you Håvard for going through that, that's very generous of you.

That's interesting, that's a third type of error that I haven't seen before. 

I'll tweak the code as per your suggestions and introduce another check to check that the variable is indeed there. Maybe I'll make it a bit more talky too to increase debug material.


Med vänliga hälsningar,

Hedvig Skirgård


PhD,  Australian National University

Postdoc at Department of Linguistic and Cultural Evolution, Max Planck Institute for Evolutionary Anthropology 

Website


Helpdesk

unread,
May 1, 2022, 4:36:02 AM5/1/22
to Hedvig Skirgård, R-inla discussion group

I guess you're using a recent INLA version, then adding

library(INLA)
inla.setOption(inla.mode="experimental")

will speed it up a little. There was a post a few weeks ago about the
papers behind this

Hedvig Skirgård

unread,
May 1, 2022, 4:57:30 AM5/1/22
to Helpdesk, R-inla discussion group
Oh cool, thanks!


Med vänliga hälsningar,

Hedvig Skirgård


PhD,  Australian National University

Postdoc at Department of Linguistic and Cultural Evolution, Max Planck Institute for Evolutionary Anthropology 

Website


Hedvig Skirgård

unread,
May 9, 2022, 2:12:16 PM5/9/22
to R-inla discussion group
I'm sorry I haven't amended code or gotten back with details sooner, I was preoccupied with another task last week but this week is INLA-detective-time!

Hedvig Skirgård

unread,
May 9, 2022, 4:29:58 PM5/9/22
to R-inla discussion group
I think I've got closer.

What I think is happening, and please forgive me if I've misunderstood something, is that when eigenvalues of the Hessian are negative, you get marginal.hyperpars of Inf 0.000000e+00. When we're then applying inla.tmarginal() and taking the square of those values root, all breaks down.

> phylo_effect_iid_model = inla.tmarginal(function(x) 1/sqrt(x),
+                                         output$marginals.hyperpar$`Precision for phy_id_iid_model`,
+                                         method = "linear") %>%
+     inla.qmarginal(c(0.025, 0.5, 0.975), .)
Error in seq.default(xmin, xmax, len = n) :
  'from' must be a finite number


Does that make sense?

What I've now done is check for Inf values and if they occur,  not apply this inla.tmarginal() and render a df from it but just make a df with the same cols but with NA where the margins would have been.


Not the prettiest solution, I agree and perhaps I'll think of something more smart tmrw. Of course, I need to get to the bottom of why we're getting negative eigenvalues in the Hessian at all

/Hedvig

Helpdesk

unread,
May 10, 2022, 2:39:22 AM5/10/22
to Hedvig Skirgård, R-inla discussion group
with the most recent package, there is a change on how this is handled
internally, which might be useful. If you upgrade R to 4.2 and then
install the most recent testing version of INLA

Hedvig Skirgård

unread,
May 10, 2022, 5:46:56 AM5/10/22
to Helpdesk, R-inla discussion group
Oh interesting. Thank you very much for your help Håvard!

Will this feature of the testing version be folded into the non-testing version? 

We're hoping to make this code as accessible as possible and reproducible, so I might make use of pkgs like groundhougr or something to keep track of which version of INLA we're using so that it's retrievable for other users.


Med vänliga hälsningar,

Hedvig Skirgård


PhD,  Australian National University

Postdoc at Department of Linguistic and Cultural Evolution, Max Planck Institute for Evolutionary Anthropology 

Website


Helpdesk

unread,
May 10, 2022, 5:57:59 AM5/10/22
to Hedvig Skirgård, R-inla discussion group

hopefully, the new experimental mode will be the default later this
year. the 'classic' (current default) will then be there optionally

Hedvig Skirgård

unread,
May 10, 2022, 6:36:27 AM5/10/22
to Helpdesk, R-inla discussion group
Okay, understood. I'll rejig some of our set-up and write messages and comments indicating what's a-happening.


Med vänliga hälsningar,

Hedvig Skirgård


PhD,  Australian National University

Postdoc at Department of Linguistic and Cultural Evolution, Max Planck Institute for Evolutionary Anthropology 

Website


Hedvig Skirgård

unread,
May 11, 2022, 8:19:22 AM5/11/22
to Helpdesk, R-inla discussion group
Unfortunately, using R 4.2 is not going to be a good fit for us for this project right now. There are 3 pkgs that won't install under 4.2: rcompanion, ggpubr and geiger.

I'll inspect where we are using functions from these pkgs and if an alternative is available, hopefully there is something that can be done but I'll have to double check some things.

I must admit that I was a bit surprised that ggpubr wasn't ready for R 4.2, out of those three.


Med vänliga hälsningar,

Hedvig Skirgård


PhD,  Australian National University

Postdoc at Department of Linguistic and Cultural Evolution, Max Planck Institute for Evolutionary Anthropology 

Website


Finn Lindgren

unread,
May 11, 2022, 8:44:56 AM5/11/22
to Hedvig Skirgård, Helpdesk, R-inla discussion group
Hi,
precisely what error messages do you get when trying to install those packages?
The CRAN pages indicates that it's available, with no 4.2 errors:
https://cran.r-project.org/web/packages/ggpubr/index.html
https://cran.r-project.org/web/packages/rcompanion/
https://cran.r-project.org/web/packages/geiger/
And the binary builds match the source versions for all platforms there.

Finn
--
Finn Lindgren
email: finn.l...@gmail.com

Finn Lindgren

unread,
May 11, 2022, 8:46:14 AM5/11/22
to Hedvig Skirgård, Helpdesk, R-inla discussion group
The geiger version seems to have been updated 2 days ago, so it may be
worth trying again, in case it was just a temporary glitch in the cran
mirrors.
Finn

Hedvig Skirgård

unread,
May 12, 2022, 9:49:22 AM5/12/22
to Finn Lindgren, Helpdesk, R-inla discussion group
Thanks Finn, 

I run through and install and load all packages in one script using pacman, like this:

I got a lot of output, after running it and installing everything else I ran it again, and I've copy-pasted in the output below.

Weirdly, I just tried plain install.packages and that worked... so maybe the problem is somehow with pacman::p_load() and these packages?

Anyway, that problem is "solved" now, more or less. So back to INLA testing verison :)

trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.2/ggpubr_0.4.0.tgz'
Content type 'application/x-gzip' length 1905174 bytes (1.8 MB)
==================================================
downloaded 1.8 MB


The downloaded binary packages are in
/var/folders/2r/drwc5p6n26ncm5cws9qsntrw0000gq/T//RtmpgaimyD/downloaded_packages

ggpubr installed

  There is a binary version available but the source version is later:
        binary source needs_compilation
geiger 2.0.7.1  2.0.9              TRUE

Do you want to install from sources the package which needs compilation? (Yes/no/cancel)
installing the source package ‘geiger’

trying URL 'https://cran.rstudio.com/src/contrib/geiger_2.0.9.tar.gz'
Content type 'application/x-gzip' length 361143 bytes (352 KB)
==================================================
downloaded 352 KB

* installing *source* package ‘geiger’ ...
** package ‘geiger’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG  -I'/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include' -I/usr/local/include   -fPIC  -Wall -g -O2  -c APE_reorder_phylo.c -o APE_reorder_phylo.o
clang++ -mmacosx-version-min=10.13 -std=gnu++14 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG  -I'/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include' -I/usr/local/include   -fPIC  -Wall -g -O2  -c branches.cpp -o branches.o
clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG  -I'/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include' -I/usr/local/include   -fPIC  -Wall -g -O2  -c init.c -o init.o
clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG  -I'/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include' -I/usr/local/include   -fPIC  -Wall -g -O2  -c mkn.c -o mkn.o
clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG  -I'/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include' -I/usr/local/include   -fPIC  -Wall -g -O2  -c pic.c -o pic.o
clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG  -I'/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include' -I/usr/local/include   -fPIC  -Wall -g -O2  -c util.c -o util.o
clang++ -mmacosx-version-min=10.13 -std=gnu++14 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG  -I'/Library/Frameworks/R.framework/Versions/4.2/Resources/library/Rcpp/include' -I/usr/local/include   -fPIC  -Wall -g -O2  -c utilities.cpp -o utilities.o
clang++ -mmacosx-version-min=10.13 -std=gnu++14 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o geiger.so APE_reorder_phylo.o branches.o init.o mkn.o pic.o util.o utilities.o -L/Library/Frameworks/R.framework/Resources/lib -lRblas -L/usr/local/gfortran/lib/gcc/x86_64-apple-darwin18/8.2.0 -L/usr/local/gfortran/lib -lgfortran -lquadmath -lm -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
ld: warning: directory not found for option '-L/usr/local/gfortran/lib/gcc/x86_64-apple-darwin18/8.2.0'
ld: warning: directory not found for option '-L/usr/local/gfortran/lib'
ld: library not found for -lgfortran
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [geiger.so] Error 1
ERROR: compilation failed for package ‘geiger’
* removing ‘/Library/Frameworks/R.framework/Versions/4.2/Resources/library/geiger’

The downloaded source packages are in
‘/private/var/folders/2r/drwc5p6n26ncm5cws9qsntrw0000gq/T/RtmpgaimyD/downloaded_packages’
Great, INLA was already installed, loading now.
Warning messages:
1: In utils::install.packages(package, ...) :
  installation of package ‘geiger’ had non-zero exit status
2: In p_install(package, character.only = TRUE, ...) :
3: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘geiger’
4: In pacman::p_load(tidyverse, readr, fields, reshape2, broom, broom.mixed,  :
  Failed to install/load:
ggpubr, geiger



Med vänliga hälsningar,

Hedvig Skirgård


PhD,  Australian National University

Postdoc at Department of Linguistic and Cultural Evolution, Max Planck Institute for Evolutionary Anthropology 

Website


Finn Lindgren

unread,
May 12, 2022, 11:09:16 AM5/12/22
to Hedvig Skirgård, Helpdesk, R-inla discussion group
Ah,

the info listed for geiger shows that version 2.0.9 was uploaded to
CRAN _today_, which means that the source vs binary issue for that one
is just a symptom of the new geiger version making its way through the
CRAN build process for the different architectures. This usually
resolves within a day or two (sometimes more). This type of thing
happens often, especially around new R releases when new bugs are
detected and fixed, making many packages get new uploads, so a package
like INLA that involves many dependencies is likely to notice it.
If pacman has an option to give preference to prebuilt binary
packages, that should avoid also the temporary issue.

Finn

Hedvig Skirgård

unread,
May 13, 2022, 5:26:27 AM5/13/22
to Finn Lindgren, Helpdesk, R-inla discussion group
Thanks Finn!


Med vänliga hälsningar,

Hedvig Skirgård


PhD,  Australian National University

Postdoc at Department of Linguistic and Cultural Evolution, Max Planck Institute for Evolutionary Anthropology 

Website


Reply all
Reply to author
Forward
0 new messages