Error "if passing levelsAM, then ... must be a numeric matrix"

claram...@googlemail.com

Oct 17, 2019, 11:40:37 AM

to SpaDES Users

Hi everyone,

Have you ever had this error? I am running a modified wolfAlps module in parallel with the doParallel package, starting it many times. With the original wolfAlps, this seems to work well for at least 100 repetitions, like this:

(...)
inputs <- data.frame(file = c("wolves2008.asc", "packs2008.asc",
                              "CMR.asc", "HabitatSuitability.asc"))

wolfModuleStart <- simInit(times = times, # params = list(wolfAlps = wolfiparameters),
                           modules = modules, inputs = inputs, paths = paths)

registerDoParallel(cores = 30) # 32 per node
getDoParWorkers()

Italy_1000 <- foreach(i = 1:1000, .packages = c("devtools", "NetLogoR", "SpaDES", "plyr", "data.table")) %dopar% {
  spades(wolfModuleStart, progress = 10, debug = TRUE)
  gc()
}

When I am doing it with my version, I get the following error:

Error in { :
  task 24 failed - "if passing levelsAM, then ... must be a numeric matrix"
Calls: %dopar% -> <Anonymous>
Execution halted

I think I need to review every change I made to see where the error comes from, but I am really confused, because I am running repetitions and the initialization should always be exactly the same. However, the problem usually only appears after some repetitions (20, 24, 136, ...), not in the first one. It seems to be independent of the machine: I get it on a desktop computer as well as on a cluster. It would be great if you had any idea of where to look, or why the problem only appears sometimes. Could it be that it happens because all wolves are dead at some point, so that some object is empty?

Thanks, Clara

Alex Chubaty

Oct 17, 2019, 12:22:00 PM

to SpaDES Users

Hi Clara, it looks like that error is generated by NetLogoR (not SpaDES) when creating an agentMatrix object, but without seeing your code it's hard to guess what the trigger might be. If you are using version control, finding the change that introduced the error should be fairly quick (e.g., using `git bisect`).

Eliot McIntire

Oct 17, 2019, 10:39:59 PM

to SpaDES Users

Hi Clara,
Sadly, these infrequent errors, which are difficult and time-consuming to reproduce, are hard to debug. I think I would go with your suggestion that it may be a case where all wolves are dead. But I can't be sure.

Can you reproduce them in a single-core situation, i.e., not in the foreach parallel loop? Ideally you would know the seed and the replicate number:

seed <- .Random.seed

If you keep this for each rep as the reps go by, then when your code fails you will have pinpointed the problem replicate and can debug it interactively.
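The seed-keeping idea can be sketched in base R as follows. This is only an illustration: `run_replicate()` is a hypothetical stand-in for the real `spades(...)` call, and the 5% failure rate is invented so that the loop fails quickly.

```r
# Hypothetical stand-in for one simulation run: fails on rare random draws.
run_replicate <- function() {
  x <- runif(1)
  if (x < 0.05) stop("rare edge case hit")
  x
}

set.seed(42)                       # make sure .Random.seed exists
failed <- NULL
for (i in 1:1000) {
  seed_before <- .Random.seed      # RNG state at the start of replicate i
  res <- tryCatch(run_replicate(), error = function(e) e)
  if (inherits(res, "error")) {
    # Remember which replicate failed and the RNG state it started from
    failed <- list(rep = i, seed = seed_before, msg = conditionMessage(res))
    break
  }
}

# Restore the saved RNG state and rerun only the failing replicate
if (!is.null(failed)) {
  assign(".Random.seed", failed$seed, envir = globalenv())
  replay <- tryCatch(run_replicate(), error = function(e) conditionMessage(e))
}
```

Because the RNG state is restored before the rerun, the replay should hit the same error, which is what makes interactive debugging of that one replicate possible.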

C M

Nov 7, 2019, 10:06:13 AM

to SpaDES Users

Hi,

I redid everything starting from the original wolfAlps module, and so far everything seems okay. Weirdly. I first redid the changes that can coexist with Italian Alps data, and had no errors whatsoever. Next I changed to German data, which requires many changes at the same time. Now I get a non-fatal error:

9.999 wolfAlps saveEnd 5
10 wolfAlps saveStart 5
10 wolfAlps yearly 5
10.001 wolfAlps dispersal 5
10.999 wolfAlps saveEnd 5
11 wolfAlps saveStart 5
11 wolfAlps yearly 5
11.001 wolfAlps dispersal 5
11.999 wolfAlps saveEnd 5
12 wolfAlps saveStart 5
12 wolfAlps yearly 5

Because of an interrupted spades call, the sim object at the time of interruption was saved in
SpaDES.core:::.pkgEnv$.sim . It will be deleted on next call to spades

This is the current event, printed as it is happening:
eventTime moduleName eventType eventPriority
0 checkpoint init 5
0 save init 5
0 progress init 5
0 load init 5
0 wolfAlps init 5
0 wolfAlps saveStart 5
0 wolfAlps yearly 5
0.001 wolfAlps dispersal 5
0.011 wolfAlps dispersal 5
0.021 wolfAlps dispersal 5
0.031 wolfAlps dispersal 5
0.041 wolfAlps dispersal 5
0.051 wolfAlps dispersal 5

in between repetitions. (Above the error is normal output from the previous run; below it is the start of the next run.)

Do you think this error is a problem? What could it mean?

Cheers, Clara

C M

Nov 7, 2019, 10:19:36 AM

to SpaDES Users

Unfortunately it does not happen if I don't run it in parallel. That could also be because I then can't do very many repetitions. But maybe I could try it on the HPC again, with a very long runtime. I will do that if my current method of redoing everything fails.

Tati Micheletti

Nov 7, 2019, 1:02:09 PM

to SpaDES Users

Hi Clara,

I don't have a definitive answer, but there are some things I can tell you about my experience. Hope it helps!

The message normally happens when something errors and the SpaDES call is interrupted. I would think this might have happened to one of the workers (maybe it hit an edge case when doing an operation?).

The way to debug it would be to try identifying when it happens, which might not be easy. I have had a similar problem where, in one of every 5 or 6 runs, I would hit an edge case in which my raster contained only NAs. It was only easier to diagnose because I am a "messaging freak" (meaning: I add messages at several steps of the simulation, including inside my own functions, so I can follow where things are and narrow down the places where things might be going wrong).
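A minimal base-R sketch of this kind of messaging, combined with catching the error per replicate so that one failing run is recorded instead of stopping the whole batch. `run_one()` is a hypothetical stand-in for a single simulation run, and the failure in replicate 7 is invented for illustration; the same wrapper could in principle be used as the body of a parallel loop.

```r
# Hypothetical stand-in for one simulation run; replicate 7 hits an "edge case".
run_one <- function(i) {
  message("rep ", i, ": starting")
  if (i == 7L) stop("raster contains only NAs")
  message("rep ", i, ": finished")
  i^2
}

# Wrap each replicate so a failure is recorded together with its replicate number.
safe_run <- function(i) {
  tryCatch(
    list(rep = i, ok = TRUE, value = run_one(i)),
    error = function(e) list(rep = i, ok = FALSE, value = conditionMessage(e))
  )
}

results <- lapply(1:10, safe_run)

# Replicate numbers of the runs that failed
failed_reps <- vapply(Filter(function(r) !r$ok, results),
                      function(r) r$rep, integer(1))
```

Inspecting `results[[failed_reps[1]]]$value` then gives the error message of the first failing replicate, and the last message printed before it shows how far that run got.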

Another thing: have you tried using the new `future` package for doing things in parallel? I have had problems in the past with `parallel`, especially when using RStudio. I have been using future and future.apply for about 2 weeks now, and I couldn't be happier. It's an easy, straightforward way of doing things in parallel, especially if you want to replace an `lapply`. Sorry I can't be more helpful. Let us know if you manage to get things going.

C M

Nov 22, 2019, 10:32:49 AM

to SpaDES Users

Hi, thank you for your help! It took me a while to redo everything; the good news is that the error above is gone. But I have two different errors now, depending on the input data (packs and wolves).

1) just one pack

7.261     wolfAlps   dispersal 5
7.271     wolfAlps   dispersal 5
7.281     wolfAlps   dispersal 5
7.291     wolfAlps   dispersal 5
7.301     wolfAlps   dispersal 5
7.311     wolfAlps   dispersal 5
7.999     wolfAlps   saveEnd   5
8         wolfAlps   saveStart 5
8         wolfAlps   yearly    5
8.001     wolfAlps   dispersal 5
8.011     wolfAlps   dispersal 5
8.021     wolfAlps   dispersal 5
8.031     wolfAlps   dispersal 5
8.Error in { :
  task 9 failed - "numbers of columns of arguments do not match"


Calls: %dopar% -> <Anonymous>
Execution halted

I tried to find where it stops using messaging, and put lots of print statements in the event types, like this:

..
  } else if (eventType == "dispersal") {
print("SCHEDULE EVENT 24")
    sim <- sim$wolfAlpsDispersal(sim) # dispersal movement
print("SCHEDULE EVENT 25")
    sim <- sim$wolfAlpsEstablish(sim) # join a pack or build a new territory
print("SCHEDULE EVENT 26")
    sim <- sim$wolfAlpsSaveTerrSize(sim) # save the size of the new created territories
print("SCHEDULE EVENT 27")
    if(NLcount(NLwith(agents = sim$wolves, var = "dispersing", val = 1)) != 0){
print("SCHEDULE EVENT 28")
      sim <- scheduleEvent(sim, time(sim, "year") + 0.01, "wolfAlps", "dispersal")
    }
print("SCHEDULE EVENT 29")
  } else {
print("SCHEDULE EVENT 30")
    warning(paste("Undefined event type: '", current(sim)[1, "eventType", with = FALSE],
                  "' in module '", current(sim)[1, "moduleName", with = FALSE], "'", sep = ""))
  }
print("SCHEDULE EVENT 31")
  return(invisible(sim))
}

and the last print I got was SCHEDULE EVENT 31.

How is that possible? Nothing seems to happen afterwards.
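For reference, in base R the message "numbers of columns of arguments do not match" is raised by `rbind()` when it is given data frames with different numbers of columns. Whether that is the call involved in the module is an assumption; the data frames below are made up purely to show where such a message can come from.

```r
# Two made-up data frames with a different number of columns
a <- data.frame(x = 1, y = 2)
b <- data.frame(x = 1, y = 2, z = 3)

# rbind() on data frames refuses to combine them; capture the message
msg <- tryCatch(rbind(a, b), error = function(e) conditionMessage(e))
print(msg)

# A cheap guard that could be placed before a suspect rbind() call
same_cols <- ncol(a) == ncol(b)
```

If this is the mechanism, checking `ncol()` (or `names()`) of the pieces just before they are combined would narrow down which object has lost or gained a column.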

2) 16 packs (less danger of extinction)

1.821     wolfAlps   dispersal 5
1.831     wolfAlps   dispersal 5
1.841     wolfAlps   dispersal 5
1.851     wolfAlps   dispersal 5
1.861     wolfAlps   dispersal 5
1.871     wolfAlps   dispersal 5
1.881     wolfAlps   dispersal 5
1.999     wolfAlps   saveEnd   5
2         wolfAlps   saveStart 5
2         wolfAlps   yearly    5
2.001     wolfAlps   dispersal 5


Because of an interrupted spades call, the sim object at the time of interruption  was saved in
SpaDES.core:::.pkgEnv$.sim . It will be deleted on next call to spades

Error in { :
  task 1 failed - "'vec' must be sorted non-decreasingly and not contain NAs"


Calls: %dopar% -> <Anonymous>
Execution halted

From extensive messaging I now know that it breaks down during what would be line 635 in the original wolfAlps.R:

   # Now each wolf has its subset of cells as potential next locations for dispersal
    # Probability of going to the potential next locations regarding their directions
    headDispersers <- of(agents = dispersers, var = "heading")
    nextLocs[,nextAngle:={
      dnorm(mean = 0, sd = params(sim)$wolfAlps$sigma, # calculate the probability using the Normal distribution of ...
            subHeadings(angle1 = headDispersers[id], # ... the rotation of each wolf's heading to ...
                        angle2 = towards(agents = turtle(sim$wolves, who = whoDispersers[id]), # ... the direction towards each of its next potential locations
                                         agents2 = cbind(x = x, y = y))
            )
      )}, by = id] # data.table use of by = id, so each of the above happens within each id

    # Probability of going to the potential next locations regarding habitat suitability
    data.table::set(nextLocs,,"suitabilityValGood", sim$suitabilityValGood[nextLocs$indices])

    # Probability of going to the potential next locations regarding the directions, habitat suitability and other wolves presence
    data.table::set(nextLocs,,"prob", nextLocs$nextAngle * nextLocs$suitabilityValGood * nextLocs$empty)
    probLoc <- runif(n = NLcount(dispersers), min = 0, max = 1)
    setkeyv(nextLocs, c("id"))
    # Selected next potential locations, based on probLoc
    nextLocs <- nextLocs[,.SD[findInterval(probLoc[id], cumsum(prob/sum(prob))) + 1],
                         by = id, .SDcols = c("x", "y", "prob")]
    selectedLocID <- as.matrix(nextLocs)[,c(2,3,1), drop = FALSE]

Do you think this might be a problem with cumsum, that it gets too large? Or what could it be?
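A small base-R sketch of one way this message can arise, offered as a possibility rather than a diagnosis. The error text comes from `findInterval()`, which rejects a `vec` containing NA or NaN. Since `cumsum(prob/sum(prob))` is a running sum of normalised values and ends near 1, it is unlikely to get too large; but if all `prob` values for one id are 0, `sum(prob)` is 0 and the division gives NaN. The vectors below are made up, and `safe_pick()` is a hypothetical helper.

```r
# Made-up case: every candidate location has zero probability
prob <- c(0, 0, 0)
cum <- cumsum(prob / sum(prob))   # 0 / 0 is NaN, so cum is all NaN

msg <- tryCatch(findInterval(0.3, cum), error = function(e) conditionMessage(e))
print(msg)

# Hypothetical guard: return NA instead of failing when prob cannot be normalised
safe_pick <- function(u, prob) {
  total <- sum(prob)
  if (is.na(total) || total <= 0) return(NA_integer_)
  findInterval(u, cumsum(prob / total)) + 1L
}

ok  <- safe_pick(0.3, c(0.2, 0.5, 0.3))  # normal case
bad <- safe_pick(0.3, prob)              # degenerate case, no error raised
```

An NA anywhere in `nextAngle`, `suitabilityValGood`, or `empty` would propagate into `prob` and trigger the same message, so printing `summary(nextLocs$prob)` just before the failing line may show which case applies.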

(@Tati Now I also went crazy with messaging and I think it helps, so thank you for reminding me :) Also, I think I tried the future package in the beginning and couldn't get it to work at all, so I decided to go with parallel. )

Thank you :)
