inla.mkl uses 100% of CPU

Dominique Soudant

Jan 15, 2024, 5:43:10 AM
to R-inla discussion group
Hello,
Short version
I've just installed INLA on Debian 12, and running my code uses 100% of the CPU, all the RAM, and a large amount of swap.

Long version
Debian 12 (Cinnamon) is installed in VirtualBox, with R version 4.3.2 and the stable version of INLA. Installing INLA required the following packages first:
sudo apt install libssl-dev
sudo apt install libfontconfig1-dev
sudo apt install libxml2-dev
sudo apt install libharfbuzz-dev libfribidi-dev
sudo apt install libcurl4-openssl-dev
sudo apt install libfreetype6-dev libpng-dev libtiff5-dev libjpeg-dev
sudo apt install libudunits2-dev
sudo apt install libgsl-dev
sudo apt install libgdal-dev

In this state, the code I'm running uses 100% of the CPU etc. and takes a long time to complete (several minutes), ending with this error:

Killed
Error in inla.inlaprogram.has.crashed() :
  The inla-program exited with an error. Unless you interupted it yourself, please rerun with verbose=TRUE and check the output carefully.
  If this does not help, please contact the developers at <he...@r-inla.org>.

 *** inla.core.safe: inla.program has crashed: rerun to get better initial values. try=1/1

 *** inla.core.safe: rerun with improved initial values

or sometimes without it. Since the process using 100% of the CPU was inla.mkl, I assumed that mkl referred to Intel MKL and installed the corresponding package (sudo apt install intel-mkl), but the results are the same.

I get the same issue with Debian 12 installed directly on the hard disk.

Previously, on Debian 12 with R 4.3.3 but INLA_23.04.24 (built 2023-04-24 19:15:35 UTC), the code sometimes produced an error, but the result came back quickly and without inla.mkl using 100% of the CPU.

I attach the code I use.
Thanks in advance for your help
Dominique
inla.mkl code issue.r

Dominique Soudant

Jan 15, 2024, 5:47:15 AM
to R-inla discussion group
Correction: in the last line, read R 4.3.2 instead of R 4.3.3.

Dominique Soudant

Jan 15, 2024, 12:39:02 PM
to R-inla discussion group
New attempt on Fedora 39, after installing:
open-ssl-devel
fribidi-devel
libcurl-devel
udunits2-devel
libjpeg-turbo-devel
libtiff-devel
gdal-devel
gsl-devel
proj-devel
geos-devel
I get the same symptoms: CPU at 100%. I waited for maybe more than 30 minutes and finally shut down the virtual machine.

Thank you for your help.
Dominique

Helpdesk (Haavard Rue)

Jan 15, 2024, 4:21:32 PM
to Dominique Soudant, R-inla discussion group
I'm also running Fedora39, with

> inla.version()
R-INLA version ..........: 24.01.14
Date ....................: 2024-01-14
Maintainers .............: Havard Rue <hr...@r-inla.org>
                         : Finn Lindgren <finn.l...@gmail.com>


and it runs fine in less than a second.


Is there any change if you run with the Fedora 39 binaries? Run

inla.binary.install()

and choose the Fedora one.

--
Håvard Rue
he...@r-inla.org

Dominique Soudant

Jan 16, 2024, 5:53:28 AM
to R-inla discussion group
Hi,

I installed the binary and ran the code. I got a response in less than a second, but:
+   ,verbose=TRUE
+ )

Error in inla.inlaprogram.has.crashed() :
  The inla-program exited with an error. Unless you interupted it yourself, please rerun with verbose=TRUE and check the output carefully.
  If this does not help, please contact the developers at <he...@r-inla.org>.

 *** inla.core.safe:  inla.program has crashed: rerun to get better initial values. try=1/1
Error in inla.inlaprogram.has.crashed() :
  The inla-program exited with an error. Unless you interupted it yourself, please rerun with verbose=TRUE and check the output carefully.
  If this does not help, please contact the developers at <he...@r-inla.org>.
Error in inla.core.safe(formula = formula, family = family, contrasts = contrasts,  :
  *** Failed to get good enough initial values. Maybe it is due to something else.

I also checked whether this code still executes perfectly on my previous system, and the answer is yes. Suspecting that VirtualBox may have something to do with it, I'm going to install Fedora 39 on my hard disk and run another test.
Thank you for your help
Dominique

Helpdesk (Haavard Rue)

Jan 16, 2024, 8:07:33 AM
to Dominique Soudant, R-inla discussion group
On Tue, 2024-01-16 at 02:53 -0800, Dominique Soudant wrote:
> I also checked that this code executes perfectly on my previous system
> and the results is still : yes. Assuming that the virtualbox may have
> had something to do with it, I'm going to install a fedora39 on my
> hard disk and run another test.

Thanks. I run Fedora 39 myself on my Dell XPS 13, and everything works fine for
me at least.

--
Håvard Rue
he...@r-inla.org

Thierry Onkelinx

Jan 16, 2024, 8:31:47 AM
to Dominique Soudant, R-inla discussion group
I'm having similar issues. With some small datasets, inla.mkl uses only one of the available CPUs. The amount of RAM used slowly increases. When the RAM is full, the PC starts using swap, and when the swap is also full, everything crashes. I notice that it depends on the data and the prior: slightly changing the response variable or the prior makes the model run in a few seconds.

I'll make a reproducible example soon.
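
For a runaway case like the one described above (RAM fills, then swap, then a crash), one possible safeguard while a reproducible example is being prepared is to start R under an address-space cap, so the process is refused memory early instead of dragging the whole machine down. This is only a sketch: run_capped is a hypothetical helper name, the limit value is arbitrary, and whether `ulimit -v` behaves usefully with a multi-threaded MKL build is an assumption, since virtual address space can be much larger than resident memory.

```shell
# Sketch: run a command under an address-space cap so that a runaway
# process is refused more memory instead of exhausting RAM and swap.
# run_capped is a hypothetical helper name, not part of INLA.
run_capped() {
    limit_kb=$1   # cap in KiB, the unit `ulimit -v` expects
    shift
    # The subshell keeps the limit from leaking into the calling shell.
    ( ulimit -v "$limit_kb" && "$@" )
}
```

For example, `run_capped 12582912 Rscript script.R` would cap the run at about 12 GiB (script.R stands for whatever script calls inla()).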

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry....@inbo.be
Havenlaan 88 bus 73, 1000 Brussel
Postadres: Koning Albert II-laan 15 bus 186, 1210 Brussel
Mail sent to this address is scanned and delivered digitally to the addressee. This allows the Flemish government to handle its files entirely digitally. Mail marked 'confidential' is not scanned but delivered unopened to the addressee.
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////





Helpdesk (Haavard Rue)

Jan 16, 2024, 8:38:14 AM
to Thierry Onkelinx, Dominique Soudant, R-inla discussion group
Does this also happen with the most recent testing version?

Håvard Rue
he...@r-inla.org

Thierry Onkelinx

Jan 16, 2024, 8:46:42 AM
to Helpdesk, Dominique Soudant, R-inla discussion group
At least with the last two stable versions. I'll test the testing version.


Dominique Soudant

Jan 17, 2024, 4:35:51 PM
to R-inla discussion group
Hi,

I installed Fedora 39 on my disk, updated it, installed R 4.3.2, and installed INLA, which required installing:
sudo dnf install openssl-devel
sudo dnf install harfbuzz-devel fribidi-devel
sudo dnf install libcurl-devel
sudo dnf install udunits2-devel
sudo dnf install gdal-devel
sudo dnf install freetype-devel libpng-devel libtiff-devel libjpeg-devel
sudo dnf install proj-devel
sudo dnf install gsl-devel
sudo dnf install geos-devel
(The previous list wasn't right.) I then ran the code I've made available here three or four times in succession. On the last run I clearly saw all 8 threads at between 95 and 99%; RAM usage went up to 100% (16 GB), then swap also went up to 100% (24 GB), and finally the R process crashed.

DELL Latitude 5500, 
Intel© Core™ i7-8665U CPU @ 1.90GHz × 4, 
RAM 16 GB
Fedora Linux 39 Cinnamon (x86-64)
Linux kernel: 6.6.11-200.fc39.x86_64
Cinnamon 5.8
> R.version
               _                          
platform       x86_64-redhat-linux-gnu    
arch           x86_64                      
os             linux-gnu                  
system         x86_64, linux-gnu          
status                                    
major          4                          
minor          3.2                        
year           2023                        
month          10                          
day            31                          
svn rev        85441                      
language       R                          
version.string R version 4.3.2 (2023-10-31)
nickname       Eye Holes   

> require(INLA)
Loading required package: INLA
Loading required package: Matrix
Loading required package: sp
This is INLA_23.09.09 built 2023-10-16 17:29:11 UTC.

Please let me know how I can help you.
Thank you for your work.

Dominique

PS: with the binary installed (i.e. via inla.binary.install()), I repeatedly get:
Error in inla.inlaprogram.has.crashed() :
  The inla-program exited with an error. Unless you interupted it yourself, please rerun with verbose=TRUE and check the output carefully.
  If this does not help, please contact the developers at <he...@r-inla.org>.

 *** inla.core.safe:  inla.program has crashed: rerun to get better initial values. try=1/1
Error in inla.inlaprogram.has.crashed() :
  The inla-program exited with an error. Unless you interupted it yourself, please rerun with verbose=TRUE and check the output carefully.
  If this does not help, please contact the developers at <he...@r-inla.org>.

Havard Rue

Jan 17, 2024, 4:39:43 PM
to R-inla discussion group, Dominique Soudant
Thanks. Can you upgrade to the most recent testing version? Also add the options

inla(…, verbose=T, debug=T)

and send me the output.

-- 
Håvard Rue 

Dominique Soudant

Jan 18, 2024, 10:52:10 AM
to R-inla discussion group
Hello,

I deleted the R directory where INLA was installed and installed the testing version. I ran my code more than ten times, each time with an execution time of less than a second and error-free output. I repeated this with another version of the code producing graphics and visually I get with this testing version what I got with the previous version of INLA (i.e. INLA_23.04.24 built 2023-04-24 19:15:35 UTC).

Thus, I doubt it's useful to reproduce here the feedback induced by "verbose=TRUE, debug=TRUE". I propose to go back to the stable version of INLA and see if I can recover the information produced with "verbose=TRUE, debug=TRUE".

To be continued.
Dominique.

Håvard Rue

Jan 18, 2024, 12:02:50 PM
to Dominique Soudant, R-inla discussion group
Thank you. The newest testing version is also linked with the newest MKL,
which *might* be the explanation...

--
Håvard Rue
hr...@r-inla.org

Dominique Soudant

Jan 19, 2024, 4:55:41 AM
to R-inla discussion group
Hello,

I deleted the R directory where INLA was installed and installed the stable version. After several runs without problems, the symptom appeared, and it took 40 minutes for R to crash. I redirected the messages and output to a file with sink(), but I have the feeling that most of the information generated by verbose and debug is sent to the console instead. This is why I copied and pasted (select all + copy) the contents of the console into the file created by sink(). Unfortunately, only the lines visible in the console were copied rather than the 10,000 lines of the buffer, and then tmux (from which I run R --vanilla) also crashed, so those 10,000 lines are lost. I repeated the test with the binary installed. The two files are attached.
Thanks again for your work.
Dominique

Inla message and output with binary.txt
Inla message and output without binary.txt
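
On capturing the output: sink() only redirects output that R itself writes, so if the verbose and debug text is printed by the external inla.mkl process (an assumption that would fit what is described above), it has to be captured from outside R. A minimal sketch, with log_run as a hypothetical helper name:

```shell
# Sketch: run a command with stdout and stderr both appended to one
# log file, so nothing depends on the terminal's scrollback buffer.
# log_run is a hypothetical helper name.
log_run() {
    logfile=$1
    shift
    # 2>&1 sends stderr to the same file as stdout.
    "$@" >> "$logfile" 2>&1
}
```

For example, `log_run inla.log Rscript script.R` writes everything to inla.log as the run proceeds, so the log survives even if the terminal or tmux session dies afterwards (script.R is a placeholder name).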

Håvard Rue

Jan 19, 2024, 6:43:42 AM
to Dominique Soudant, R-inla discussion group

Yes, it crashes:

/home/dsoudantfedora39/R/x86_64-redhat-linux-gnu-library/4.3/INLA/bin/linux/64bit/inla.mkl.run: line 55: 15482 Illegal instruction (core dumped) ldd -r "$DIR/$prog"


I have tried to rerun it on my laptop but could not recreate the
crash.

The question is whether this is a build issue or an actual bug.
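
On the build-issue side of that question: an "illegal instruction" core dump like the one quoted above generally means the binary executed a CPU instruction that the processor, or the virtual machine in front of it, does not support. The thread does not establish which instruction was involved, so the flag names mentioned in the usage note (avx2, avx512f) are only assumptions about what an optimized build might use. A minimal sketch for checking what the CPU advertises on x86 Linux, with has_cpu_flag as a hypothetical helper name:

```shell
# Sketch: report whether a CPU feature flag is advertised in a
# cpuinfo-style file (default: /proc/cpuinfo on x86 Linux).
# has_cpu_flag is a hypothetical helper name.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # -w matches the flag as a whole word, so "avx" does not match "avx2".
    grep -m1 '^flags' "$file" | grep -qw -- "$flag"
}
```

Running something like `has_cpu_flag avx2 && echo yes || echo no` both on the host and inside VirtualBox would show whether the guest hides a feature that the host has.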

Could you use the most recent testing version? It should say


> inla.version()
R-INLA version ..........: 24.01.18

Then install the Fedora binary with

> inla.binary.install()
* Looking for Version_24.01.18 and os='<choose interactively>'
Available alternatives:
Alternative 1 is ./CentOS Linux-7 (Core)/Version_24.01.18/64bit.tgz
Alternative 2 is ./Fedora Linux-39 (Workstation Edition)/Version_24.01.18/64bit.tgz


and choose Alternative 2.

I edited your script, adding the verbose and debug flags:

control.predictor = list(compute = TRUE)
, verbose = T, debug = T
)


I named it runme.R and then ran

$ while Rscript runme.R >> OUT 2>&1; do true; done


which will stop if inla crashes. If it crashes, can you share 'OUT'? It ran
for a long time for me without crashing, and I stopped it in the end.
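
The loop above reruns the script for as long as Rscript exits successfully, so it stops at the first crash but otherwise never ends. A bounded variant that also reports which run failed might look like this sketch (repeat_until_fail is a hypothetical helper name, not part of INLA):

```shell
# Sketch: rerun a command until it fails or a maximum number of runs is
# reached, and report the run on which it failed.
# repeat_until_fail is a hypothetical helper name.
repeat_until_fail() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@"; then
            echo "failed on run $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max runs"
    return 0
}
```

For example, `repeat_until_fail 500 Rscript runme.R >> OUT 2>&1` collects the same 'OUT' file but gives up after 500 clean runs.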


If this crashes, can you try a less optimized build by downloading

https://inla.r-inla-download.org/Linux-builds/Fedora%20Linux-39%20(Workstation%20Edition)/devel/64bit.tgz

and unpacking it somewhere, for example as ~/Download/64bit.

Then redo the 'while...' loop using this binary instead, by pointing INLA at the new
binary in 'runme.R' after loading INLA:

library(INLA)
inla.setOption(inla.call="~/Download/64bit/inla.mkl.run")

and check whether that one fails as well.
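
The unpack step can be sketched as below. unpack_build is a hypothetical helper name, and the layout of the archive (a 64bit directory containing inla.mkl.run) is assumed from the path used in inla.call above, which is why the sketch searches for the file rather than hard-coding its location.

```shell
# Sketch: unpack a downloaded build archive and print the path of the
# inla.mkl.run launcher found inside it.
# unpack_build is a hypothetical helper name; the archive layout is assumed.
unpack_build() {
    archive=$1
    dest=$2
    mkdir -p "$dest" || return 1
    tar -xzf "$archive" -C "$dest" || return 1
    # Search instead of assuming the exact directory structure.
    find "$dest" -type f -name 'inla.mkl.run' | head -n 1
}
```

After fetching the archive (for example with `curl -L -o 64bit.tgz` followed by the URL above), `unpack_build 64bit.tgz ~/Download` prints the path to pass to inla.setOption(inla.call = ...).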

let me know
Havard

Dominique Soudant

Jan 22, 2024, 12:12:44 PM
to R-inla discussion group
Hello,

In my VirtualBox machine, I updated Fedora 39, deleted the R directory, and installed the latest testing version of INLA:
=> This is INLA_24.01.20 built 2024-01-19 21:28:11 UTC.
I did a first run without the binary and let the program run for 1 hour: no crash, RAM usage at 20%, and OUT reached 54 MB.
Then I installed the binaries and ran the code again for 1 hour: no crashes, but RAM filled up to 95%, swap then went up to 40%, and the OUT file reached 30 MB.

However, as I wrote on Jan 18, 2024 (16:52:10), using the testing version solved my issue; the problem was with the stable version.
Dominique.
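
RAM and swap figures like the ones reported above can be logged alongside the run rather than read off a system monitor. A minimal sketch for Linux, assuming the usual /proc/meminfo fields (MemTotal, MemAvailable, SwapTotal, SwapFree); used_pct is a hypothetical helper name:

```shell
# Sketch: percentage in use for a (total, free) pair of fields in a
# meminfo-style file (default: /proc/meminfo on Linux).
# used_pct is a hypothetical helper name.
# Prints nothing if the total is zero (for example, no swap configured).
used_pct() {
    awk -v t="$1:" -v f="$2:" '
        $1 == t { total = $2 }
        $1 == f { free = $2 }
        END { if (total > 0) printf "%d\n", (total - free) * 100 / total }
    ' "${3:-/proc/meminfo}"
}
```

`used_pct MemTotal MemAvailable` gives the RAM percentage and `used_pct SwapTotal SwapFree` the swap percentage; calling both every few seconds in a second terminal next to the 'while Rscript ...' run gives a timeline of memory growth.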

Helpdesk (Haavard Rue)

Jan 22, 2024, 4:09:27 PM
to Dominique Soudant, R-inla discussion group
I think the Intel MKL is updated in the testing version but not in the stable
one; that could be it. I think it's time to update the stable one then...

Please use the testing one for now; there shouldn't be any issues with it.

Dominique Soudant

Jan 29, 2024, 8:31:02 AM
to R-inla discussion group
Hi,
Thank you very much for your help with this issue and for all the work done by the INLA team.
Kind regards.
DS