Stan parallel works in Windows 10


Bob K

May 17, 2016, 4:28:07 PM
to Stan users mailing list
This may be old news, but it seems to me that RStan (at least on Windows 10 on my 2-core laptop) can run chains in parallel on Windows without any extra programming effort (e.g., setting up a cluster for parLapply). Merely setting the cores=x option now makes Stan launch additional R processes and redirect their output to the "Viewer" in RStudio. Maybe this feature was added at some point, but I don't see it anywhere in the RStan docs (?).

I am posting this merely so that other users of Windows who don't ever bother with parLapply but enjoy the ease of parallel usage with mclapply, etc. on Linux/Mac, can go ahead and use Rstan in parallel without any worries.

Note, though, that picking the cores value with parallel::detectCores() on Windows will by default report "logical cores", which are not true hardware cores. You can find the true number of cores on a Windows 10 machine by going to Task Manager -> Performance tab. It's not hard to find the true number of cores on other versions either. Be sure to then set the cores option in stan() manually to the true number of cores, or running stan() in parallel will slow your machine down considerably.
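
Editor's sketch of the point above, assuming the base parallel package is used to count cores. detectCores(logical = FALSE) asks for physical cores, which R documents as honoured on Windows and macOS but not everywhere, so treat the result as a starting point. The helper pick_cores() and the names model.stan and stan_data are hypothetical, not part of RStan.

```r
library(parallel)

# Logical cores include hyperthreads; physical cores do not.
logical_cores  <- detectCores(logical = TRUE)
physical_cores <- detectCores(logical = FALSE)

# Hypothetical helper: never use more cores than chains or physical cores.
# detectCores() can return NA on some platforms, so fall back to 1.
pick_cores <- function(n_chains, physical = physical_cores) {
  if (is.na(physical)) physical <- 1L
  max(1L, min(as.integer(n_chains), as.integer(physical)))
}

n_cores <- pick_cores(n_chains = 4)

# Hypothetical call; model.stan and stan_data are placeholders:
# fit <- rstan::stan("model.stan", data = stan_data, chains = 4, cores = n_cores)
```

As Bob Carpenter notes later in the thread, the best value still depends on memory and CPU contention, so it is worth timing a short run with a couple of settings.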

Andrew Gelman

May 17, 2016, 6:01:47 PM
to stan-...@googlegroups.com
Hi, I have a feeling I'm not understanding most of this, but . . . when I run rstan on my Mac laptop, it automatically runs 4 chains in parallel, one chain on each processor.
A

--
You received this message because you are subscribed to the Google Groups "Stan users mailing list" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stan-users+...@googlegroups.com.
To post to this group, send email to stan-...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Mahdi Akbarzadeh

May 17, 2016, 6:42:45 PM
to stan-...@googlegroups.com
Hi, when we're working on Windows, what do you suggest? Should we let rstan choose automatically, or is it better to set the true number of cores?
--


Sincerely,

Mahdi Akbarzadeh,

PhD Candidate in Biostatistics;

School of Public Health,

Department of Biostatistics and Epidemiology,


 

 
 

Andrew Gelman

May 17, 2016, 6:46:30 PM
to stan-...@googlegroups.com
For that, I have no idea!

Bob Carpenter

May 17, 2016, 7:26:34 PM
to stan-...@googlegroups.com
The true number of cores is usually better, but you may
want to experiment because performance is going to depend on
memory and CPU contention, which aren't going to be the same
for all models/data and background processes.

- Bob


Mahdi Akbarzadeh

May 17, 2016, 7:46:26 PM
to stan-...@googlegroups.com
Hi Bob, I can't tell whether my chains are running in parallel or not!
When I run an rstan process with 4 chains, they run one after another. Does this mean that my chains are not running in parallel?!

Bob Carpenter

May 17, 2016, 8:10:34 PM
to stan-...@googlegroups.com
Run something that doesn't finish before the next process starts.
You should see something like this, where it takes a while to kick off
the chains. We've already finished 2000 iterations on each of the first
three chains before the fourth even starts.

- Bob

SAMPLING FOR MODEL 'power' NOW (CHAIN 1).

Chain 1, Iteration: 1 / 200000 [ 0%] (Warmup)
Chain 1, Iteration: 1000 / 200000 [ 0%] (Warmup)
SAMPLING FOR MODEL 'power' NOW (CHAIN 2).

Chain 2, Iteration: 1 / 200000 [ 0%] (Warmup)
Chain 1, Iteration: 2000 / 200000 [ 1%] (Warmup)
Chain 2, Iteration: 1000 / 200000 [ 0%] (Warmup)
Chain 1, Iteration: 3000 / 200000 [ 1%] (Warmup)
SAMPLING FOR MODEL 'power' NOW (CHAIN 3).

Chain 3, Iteration: 1 / 200000 [ 0%] (Warmup)
Chain 2, Iteration: 2000 / 200000 [ 1%] (Warmup)
Chain 1, Iteration: 4000 / 200000 [ 2%] (Warmup)
Chain 3, Iteration: 1000 / 200000 [ 0%] (Warmup)
Chain 3, Iteration: 2000 / 200000 [ 1%] (Warmup)
Chain 2, Iteration: 3000 / 200000 [ 1%] (Warmup)
SAMPLING FOR MODEL 'power' NOW (CHAIN 4).


Mahdi Akbarzadeh

May 17, 2016, 8:32:19 PM
to stan-...@googlegroups.com
Is this your original output? Some of the rows seem to be out of order!!

Mahdi Akbarzadeh

May 17, 2016, 8:48:44 PM
to stan-...@googlegroups.com
This is my original output with 4 chains and 100 iterations (as a test). The OS was Windows.

SAMPLING FOR MODEL 'BSEM_model' NOW (CHAIN 1).

Iteration:  1 / 100 [  1%]  (Warmup)
Iteration: 10 / 100 [ 10%]  (Warmup)
Iteration: 20 / 100 [ 20%]  (Warmup)
Iteration: 30 / 100 [ 30%]  (Warmup)
Iteration: 40 / 100 [ 40%]  (Warmup)
Iteration: 50 / 100 [ 50%]  (Warmup)
Iteration: 51 / 100 [ 51%]  (Sampling)
Iteration: 60 / 100 [ 60%]  (Sampling)
Iteration: 70 / 100 [ 70%]  (Sampling)
Iteration: 80 / 100 [ 80%]  (Sampling)
Iteration: 90 / 100 [ 90%]  (Sampling)
Iteration: 100 / 100 [100%]  (Sampling)
#  Elapsed Time: 86.891 seconds (Warm-up)
#                121.494 seconds (Sampling)
#                208.385 seconds (Total)


SAMPLING FOR MODEL 'BSEM_model' NOW (CHAIN 2).

Iteration:  1 / 100 [  1%]  (Warmup)
Iteration: 10 / 100 [ 10%]  (Warmup)
Iteration: 20 / 100 [ 20%]  (Warmup)
Iteration: 30 / 100 [ 30%]  (Warmup)
Iteration: 40 / 100 [ 40%]  (Warmup)
Iteration: 50 / 100 [ 50%]  (Warmup)
Iteration: 51 / 100 [ 51%]  (Sampling)
Iteration: 60 / 100 [ 60%]  (Sampling)
Iteration: 70 / 100 [ 70%]  (Sampling)
Iteration: 80 / 100 [ 80%]  (Sampling)
Iteration: 90 / 100 [ 90%]  (Sampling)
Iteration: 100 / 100 [100%]  (Sampling)
#  Elapsed Time: 79.707 seconds (Warm-up)
#                101.746 seconds (Sampling)
#                181.453 seconds (Total)
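
Editor's note: a rough way to read the timings above, using only the two chain totals shown. When chains run one after another, wall-clock time is about the sum of the totals; when they run in parallel on separate cores, it is closer to the slowest chain. This ignores worker startup and contention, so it is an estimate rather than a measurement.

```r
# Total seconds reported for the two chains shown above
chain_totals <- c(chain1 = 208.385, chain2 = 181.453)

# Sequential: chains queue up, so times add
sequential_wall <- sum(chain_totals)

# Parallel (idealized): wall time is bounded by the slowest chain
parallel_wall <- max(chain_totals)

cat("sequential:", sequential_wall, "s; parallel (ideal):", parallel_wall, "s\n")
```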

Bob K

May 18, 2016, 1:38:38 AM
to Stan users mailing list
Hi Mahdi -- 

Yes, you need to specify the "cores" option in the RStan call to run in parallel. Cores should be set to however many chains you will run in parallel.

Bob, it is true, but it can also be a bit of a hassle because it can make Windows extremely sluggish if the number of processes > number of cores. I discovered that Stan can do Windows in parallel when I accidentally ran code from a cluster and had 10 parallel chains running. It did work (albeit very slowly).

Re: Andrew, Windows cannot do parallel as easily as Macs/Linux because the mclapply function, which is the easiest way to set it up, doesn't work there (it relies on forking, which Windows does not support). One has to instead use a different function, parLapply, which requires a lot more work.

My RStudio output currently looks like this:
Click the Refresh button to see progress of the chains
starting worker pid=5740 on localhost:11545 at 06:28:57.814
starting worker pid=17108 on localhost:11545 at 06:28:58.714

SAMPLING FOR MODEL 'Nominate: 1 dimension' NOW (CHAIN 1).

Chain 1, Iteration:   1 / 1000 [  0%]  (Warmup)
SAMPLING FOR MODEL 'Nominate: 1 dimension' NOW (CHAIN 2).

Chain 2, Iteration:   1 / 1000 [  0%]  (Warmup)
Chain 2, Iteration: 100 / 1000 [ 10%]  (Warmup)
Chain 1, Iteration: 100 / 1000 [ 10%]  (Warmup)

There are two R processes running, each using 1/3 of processor power and consuming only 100 MB of memory. Anyways, this all could be old news for Stan developers, but I had just realized it, so I thought I'd share.
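
Editor's sketch of the manual parLapply setup mentioned above, for comparison with simply passing cores to stan(). It uses only the base parallel package. run_chain() is a hypothetical stand-in for one chain's work, and this is not a description of what RStan does internally.

```r
library(parallel)

# Hypothetical stand-in for the work done by one chain
run_chain <- function(chain_id) {
  set.seed(chain_id)
  mean(rnorm(1000))
}

# A socket cluster starts separate R processes, so it also works on Windows,
# where the fork-based mclapply cannot run tasks in parallel.
cl <- makeCluster(2)

# Always shut the workers down, even if a task fails
results <- tryCatch(
  parLapply(cl, 1:2, run_chain),
  finally = stopCluster(cl)
)
```

The extra work compared with mclapply is the explicit cluster creation and teardown, plus exporting any data or packages the workers need.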

Mahdi Akbarzadeh

May 18, 2016, 7:56:58 AM
to stan-...@googlegroups.com
Hi. Thanks a lot Bob.


Bob Carpenter

May 18, 2016, 2:41:35 PM
to stan-...@googlegroups.com

> On May 18, 2016, at 1:38 AM, Bob K <bobku...@gmail.com> wrote:
>
> Hi Mahdi --
>
> Yes you need to specify the "cores" option in the Rstan call to use parallel. Cores should be set to however many chains you will run in parallel.
>
> Bob, it is true, but it can also be a bit of a hassle because it can make Windows extremely sluggish if the number of processes > number of cores.

The question was about Intel's hyperthreading, which reports
the number of cores as 2 * the number of physical cores.

You can have problems with sluggishness at fewer processes for
Stan than that if other things are going on in the environment.

> I discovered that Stan can do Windows in parallel when I accidentally ran code from a cluster and had 10 parallel chains running. It did work (albeit very slowly).
>
> Re: Andrew, Windows cannot do parallel as easily as Macs/Linux because the mclapply function doesn't work (for reasons unique to the operating system), which is the easiest way to set it up. One has to instead use a different function, parLapply, which requires a lot more work.

I thought Windows parallelization was baked into RStan 2.9.
But I never run Windows, so don't have any experience there.

Have you set

> options(mc.cores = 4);

or however many you want?

- Bob
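
Editor's sketch of the two ways to set the core count discussed in this thread: the session-wide mc.cores option, which stan() uses as the default for its cores argument, and the per-call cores argument. The file name model.stan and the object stan_data are placeholders.

```r
# Session-wide default: stan() reads this option when `cores` is not given.
old <- options(mc.cores = 2)
default_cores <- getOption("mc.cores", 1L)

# Per-call override (hypothetical model file and data object):
# fit <- rstan::stan("model.stan", data = stan_data, chains = 4, cores = 2)

# Restore the previous setting when done
options(old)
```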

Bob K

May 18, 2016, 4:46:34 PM
to Stan users mailing list
Apparently some kind of parallel processing for Windows is standard in RStan 2.9; I just never noticed that it had that capability until now. There's nothing in the docs about it that I have found. A very pleasant surprise. Apparently one can use Windows and be a Bayesian, who knew.

I just make sure to pass the cores option to stan directly and it works fine for me. 

Mahdi Akbarzadeh

May 25, 2016, 11:42:30 PM
to stan-...@googlegroups.com
I would recommend that you use the Linux operating system. 
