Resetting simmer environment for Monte Carlo simulation


Philemon Cyclone

Apr 5, 2023, 11:20:57 AM
to simmer-devel
I would like to run Monte Carlo replications of my simmer simulation and was wondering if there is a way to do this without calling simmer::simmer() during each replication while still ensuring that the simmer::timeout() values are randomly sampled.

At a high level, my setup is as follows:

env <- setup_sim(input_params)
results <- run_sim(env, n_mc = 100)

where

setup_sim() is a function that initializes the simmer environment and adds resources and trajectories. I would like to separate environment creation from executing the simulation, since I am defining the environment programmatically and it takes a non-negligible amount of time to set up. I then call run_sim() to execute the simulation, resetting the simmer environment each time:

run_sim <- function(.env, run_until = Inf, n_mc = 1, use_parallel = FALSE) {
  # Run simulation
  start_time <- Sys.time()

  print('Simulating...')
  if (use_parallel == TRUE & n_mc > 1) {
    envs <- parallel::mclapply(1:n_mc, function(i) {
      .env |>
        simmer::reset() |>
        simmer::run(until = run_until) |>
        simmer::wrap()
      })
    } else {
      envs <- purrr::map(
        seq.int(1, n_mc),
        function(i) {
          .env |>
            simmer::reset() |>
            simmer::run(
              until = run_until,
              progress = simmer_progress
              )
        },
        .progress = pb_bar_options
        )
      }

  duration <- Sys.time() - start_time
  print('Simulation complete. ')
  print(duration)

  envs
}

However, I noticed that the random numbers generated for timeouts do not change across replications, even though they are generated from function calls:

... |>
    simmer::timeout(
        \() {
          step_nbr <- simmer::get_attribute(env, keys = 'step_nbr')
          cycle_time <- get_cycle_time_wrapper(
            step_nbr,
            simmer::get_selected(env),
            rand_seed = sample.int(100000, 1)
            ) *
            simmer::get_attribute(env, keys = 'TimeMultiplier')

          cycle_time
        }
        ) |> 
...

where get_cycle_time_wrapper() is a function that randomly samples values from a lookup table.

Is there a way to ensure a new random sample for each replication given my setup, or do I have to recreate the simmer environment each time?

Thanks!

Iñaki Ucar

Apr 5, 2023, 12:41:16 PM
to simmer...@googlegroups.com
Did you set a parallel-friendly RNG? E.g. RNGkind("L'Ecuyer-CMRG")
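
A minimal sketch of what that setup could look like with the base parallel package. The rexp() call is only a stand-in for one replication's random sampling, and the seed and core count are arbitrary assumptions:

```r
library(parallel)

# Forking is unavailable on Windows, where mclapply() only accepts mc.cores = 1.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

RNGkind("L'Ecuyer-CMRG")  # RNG that provides separate streams for child processes
set.seed(2023)            # arbitrary seed, so the set of replications is reproducible

draws <- mclapply(
  1:4,
  function(i) rexp(1, rate = 1),  # stand-in for the sampling done in one replication
  mc.cores = n_cores,
  mc.set.seed = TRUE              # the default: each child process gets its own stream
)
draws <- unlist(draws)
```

With this in place, each forked child draws from its own stream, so the replications should not repeat each other's random numbers.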

--
Iñaki Úcar

Iñaki Ucar

Apr 7, 2023, 6:32:42 AM
to simmer...@googlegroups.com
Sorry that these messages didn't get through. Apparently, they were marked as spam for some reason. Copying here the hints I gave you privately for more visibility:
  1. mclapply and family rely on forking, which is not available on Windows, so they don't parallelize there. The simmer environment (like any other external pointer) is not serializable, and therefore cannot be shared with the workers in any other parallelization method. If you have access to a Linux machine, you can parallelize very efficiently via mclapply, by building the environment once and then sharing it. But on Windows you need to build the environment at least once in every worker (then you can simulate more than once per worker with the same environment).
  2. Whenever you parallelize, but also when you replicate via lapply/purrr tools by resetting and reusing the environment, you need to wrap() the final result. In a parallel run, this means that the results of your simulation can be exported back to the main process. In serial lapply/purrr runs, this means that you get a list of all the different simulation results, and not a list pointing to the same simulation environment, which only holds the results for your last simulation.
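
As a rough illustration of point 2, here is a minimal serial sketch. The one-arrival toy model, the seed, and the names traj, env and runs are hypothetical and not taken from the original setup:

```r
library(simmer)

# Hypothetical toy model: a single arrival at time 0 with a random service time.
traj <- trajectory("job") |>
  timeout(function() rexp(1, rate = 1))

# Build the environment once, outside the replication loop.
env <- simmer("mc") |>
  add_generator("job", traj, at(0))

set.seed(1)  # arbitrary seed for reproducibility
runs <- lapply(1:5, function(i) {
  env |>
    reset() |>            # clear time, event queue and statistics
    run(until = 100) |>
    wrap()                # snapshot this replication's results before the next reset
})

# One row per replication, labelled by the 'replication' column.
arrivals <- get_mon_arrivals(runs)
```

Without the wrap() call, every element of runs would refer to the same environment, so all of them would show the results of the last replication only.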
Hope it helps.
Iñaki

On Fri, 7 Apr 2023 at 11:35, Philemon Cyclone <phil.m.f...@gmail.com> wrote:
The issue does not occur when I use parallel::mclapply(), but I just realized that this does not actually parallelize under Windows and just runs serially. Indeed, when trying to parallelize under Windows, I get
Error in { : task 1 failed - "external pointer is not valid"

I tried using furrr::future_map() as well as foreach %dopar%, but both approaches result in the same error. Is this maybe a case of "Non-exportable objects" discussed here: A Future for R: Common Issues with Solutions (r-project.org)?

Are there any examples of successful parallelization of simmer under Windows?
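
One possible pattern, sketched here under the assumption of a PSOCK cluster from the base parallel package (which also runs on Windows): build the environment once inside each worker, then reset and reuse it there, and only send wrapped results back. The toy model, the seed, and the name sim_env are hypothetical:

```r
library(parallel)

cl <- makeCluster(2)                   # PSOCK cluster; does not rely on forking
clusterSetRNGStream(cl, iseed = 2023)  # separate L'Ecuyer-CMRG stream per worker

# Build the environment once in every worker's global environment.
# Return NULL so the external pointer is never sent back to the main process.
clusterEvalQ(cl, {
  library(simmer)
  traj <- trajectory("job") |>
    timeout(function() rexp(1, rate = 1))
  sim_env <- simmer("mc") |>
    add_generator("job", traj, at(0))
  NULL
})

# Each task reuses the worker-local environment and returns a wrapped copy.
runs <- parLapply(cl, 1:6, function(i) {
  sim_env <- get("sim_env", envir = globalenv())
  simmer::wrap(simmer::run(simmer::reset(sim_env), until = 100))
})
stopCluster(cl)

arrivals <- simmer::get_mon_arrivals(runs)
```

Because only the wrapped results cross process boundaries, the "external pointer is not valid" error should not arise with this layout.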



--
Iñaki Úcar