Parallel collections don't seem very parallel

Samuel Ainsworth

Feb 19, 2014, 4:32:19 PM

to scala...@googlegroups.com

I've been trying to use the parallel collections but without much real success. For example, running

val pv = scala.collection.parallel.immutable.ParVector.tabulate(1000000)(x => x)

pv.map(x => List.range(1, x))

spins up one of my four cores to 100% but the other three remain idle. What's up with that? At first, I thought it wasn't recognizing all of my cores for some reason but both "collection.parallel.ForkJoinTasks.defaultForkJoinPool.getParallelism" and "Runtime.getRuntime.availableProcessors" are 4 as I expect. I also tested ParList, ParArray, and some of the others. Why don't these parallel collections use all of my cores?

Thanks,

Samuel

√iktor Ҡlang

Feb 19, 2014, 4:35:01 PM

to Samuel Ainsworth, scala-user

Works on my machine:

val pv = scala.collection.parallel.immutable.ParVector.tabulate(1000000)(x => x)

➜ ~ scala
Welcome to Scala version 2.10.3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0).
Type in expressions to have them evaluated.
Type :help for more information.

--

Cheers,

√

———————
Viktor Klang

Chief Architect - Typesafe

Twitter: @viktorklang

Tim Pigden

Feb 19, 2014, 5:18:12 PM

to Samuel Ainsworth, scala-user

Always worked for me (I tried them extensively while doing one of the Coursera algorithms courses).

However, I found that the unit of work you carry out in parallel has to be more than a couple of instructions to make it worthwhile.

--

Tim Pigden
Optrak Distribution Software Limited
+44 (0)1992 517100
http://www.linkedin.com/in/timpigden
http://optrak.com

Samuel Ainsworth

Feb 19, 2014, 5:19:21 PM

to scala...@googlegroups.com, Samuel Ainsworth

Hmm, I'm running Scala 2.10.0 on a MacBook Air:

"Welcome to Scala version 2.10.0 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_17)."

Maybe it's a difference between Java 1.7 vs 1.8.

Samuel

Adriaan Moors

Feb 19, 2014, 5:28:57 PM

to Samuel Ainsworth, scala-user

FWIW, I just tried this on a mac on Java 1.6/1.7/1.8, albeit on Scala 2.11.0-M8, and all cores lit up.

Alan Burlison

Feb 19, 2014, 6:32:32 PM

to Samuel Ainsworth, scala...@googlegroups.com

Works for me, just doesn't scale all that well ;-)

load averages: 69.4, 37.0, 15.4;  up 2+08:14:22  23:30:10
88 processes: 84 sleeping, 2 zombie, 2 on cpu
CPU states: 72.5% idle, 26.7% user, 0.7% kernel, 0.0% iowait, 0.0% swap
Kernel: 32214 ctxsw, 233 trap, 21363 intr, 516419 syscall, 6 flt
Memory: 255G phys mem, 212G free mem, 256G total swap, 256G free swap

  PID USERNAME NLWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
22271 alanbur   351  30    0  391M  384M cpu/**  268:38 27.39% java

--
Alan Burlison
--

Alan Burlison

Feb 20, 2014, 5:58:39 AM

to scala...@googlegroups.com, Samuel Ainsworth

On 19/02/2014 23:32, Alan Burlison wrote:

> Works for me, just doesn't scale all that well ;-)

Looks like it pegs out somewhere around 64 threads; at that point
further progress is limited by the GC caused by the creation of all the
List.ranges.

Out of interest, is it *really* the case that the only way of globally
configuring the default number of threads for parallel collections is
this horrid hack?

http://stackoverflow.com/questions/17865823/how-do-i-set-the-default-number-of-threads-for-scala-2-10-parallel-collections

--
Alan Burlison
--

Samuel Ainsworth

Feb 20, 2014, 7:23:08 PM

to scala...@googlegroups.com, Samuel Ainsworth

Yes, there are ways to control the number of threads and more generally how the work is handled.

http://docs.scala-lang.org/overviews/parallel-collections/configuration.html

I'll try to see what happens in Scala 2.10.3. Also, is there any way to debug the innards of parallel collections?

Samuel

Alan Burlison

Feb 20, 2014, 7:43:40 PM

to Samuel Ainsworth, scala...@googlegroups.com

On 21/02/2014 00:23, Samuel Ainsworth wrote:

>> http://stackoverflow.com/questions/17865823/how-do-i-set-the-default-number-of-threads-for-scala-2-10-parallel-collections
>
> Yes, there are ways to control the number of threads and more generally how
> the work is handled.
>
> http://docs.scala-lang.org/overviews/parallel-collections/configuration.html

Unless I'm mistaken that's per collection and not global, correct? The
horrid reflection hack is because there seems to be no other way of
setting the default number of threads to use.
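
For reference, the per-collection approach from the configuration page can be sketched roughly as follows. This is only a sketch under assumptions: it targets Scala 2.12, where parallel collections ship with the standard library and ForkJoinTaskSupport accepts a java.util.concurrent.ForkJoinPool. On 2.10 and 2.11 the pool class would be scala.concurrent.forkjoin.ForkJoinPool instead, and on 2.13 and later the separate scala-parallel-collections module is needed. The object and method names are made up for illustration.

```scala
import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.ForkJoinTaskSupport
import scala.collection.parallel.immutable.ParVector

// Hypothetical helper for illustration; not part of the Scala library.
object TaskSupportSketch {
  // Maps over a ParVector on a dedicated pool of `threads` workers and returns
  // the parallelism level that collection reports, plus the squared values.
  def squares(n: Int, threads: Int): (Int, Vector[Int]) = {
    val pool = new ForkJoinPool(threads)
    try {
      val pv = ParVector.tabulate(n)(x => x)
      // Affects only this collection; others keep using the default pool.
      pv.tasksupport = new ForkJoinTaskSupport(pool)
      (pv.tasksupport.parallelismLevel, pv.map(x => x * x).seq)
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = {
    val (level, result) = squares(10, 2)
    println(s"parallelism level: $level, result: $result")
  }
}
```

Because the task support is attached to one collection, every collection that should deviate from the default has to be configured this way, which is exactly the per-collection limitation described above.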

> I'll try to see what happens in Scala 2.10.3. Also, is there any way to
> debug the innards of parallel collections?

I'm using 2.10.3. There are also some JVM properties you can use, the
names of which escape me, but they don't work because of a bug. And the
older way via
collection.parallel.ForkJoinTasks.defaultForkJoinPool.setParallelism has
been removed in later 2.10 versions, although the getter is still there
and it seems that the default parallelism is set to the number of cores.

Programmatically setting the default collection parallelism seems like a
common thing to want to do; I'm puzzled as to why it isn't available.

--
Alan Burlison
--

Seth Tisue

Feb 20, 2014, 9:44:25 PM

to scala-user

On Thu, Feb 20, 2014 at 7:43 PM, Alan Burlison <alan.b...@gmail.com> wrote:

> There are also some JVM properties you can use, the names of which escape me, but they don't work because of a bug.

Ticket number?

Samuel Ainsworth

Feb 20, 2014, 10:12:49 PM

to scala...@googlegroups.com, Samuel Ainsworth

OK, after playing around a bit and poking at it with VisualVM, I don't think this can possibly be a problem with Scala's implementation of the parallel collections. In VisualVM I can see that 4 worker threads are started and run smoothly. The reason why all of my cores didn't run at capacity previously remains a bit of a mystery, but I have two hypotheses:

1. Less reasonable: I tested it first with my laptop unplugged. I'm guessing that the CPU runs in slightly different modes when plugged in or unplugged to save power. When I plugged in my computer, I saw slightly more load on the other processors but not much.
2. More probable: The garbage collector constrains threads in certain ways when it is under stress. I tried running a similar experiment that was less memory intensive. Instead of building lists from 1 to n, I tried factorizing the numbers, which is obviously more CPU intensive but will hit the GC much less. It cranked up all of my cores to 100% as expected. I guess the moral of the story here is that when using parallel collections, avoid hitting the GC and go for CPU-intensive computations instead. Results may vary based on heap size, CPU, day of week, etc.
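
A rough sketch of the CPU-bound variant, for anyone who wants to repeat the comparison. The trial-division factorize below is a stand-in, since the exact factorization code was not posted, and the object name is made up. It assumes Scala 2.12 or earlier, where parallel collections are part of the standard library (2.13 and later need the scala-parallel-collections module).

```scala
import scala.collection.parallel.immutable.ParVector

// Hypothetical stand-in for the factorization experiment; names are illustrative.
object CpuBoundSketch {
  // Trial division: prime factors of n in ascending order (empty for n < 2).
  def factorize(n: Long): List[Long] = {
    var remaining = n
    var divisor = 2L
    val factors = List.newBuilder[Long]
    while (divisor * divisor <= remaining) {
      if (remaining % divisor == 0) {
        factors += divisor
        remaining /= divisor
      } else {
        divisor += 1
      }
    }
    if (remaining > 1) factors += remaining
    factors.result()
  }

  def main(args: Array[String]): Unit = {
    val pv = ParVector.tabulate(1000000)(x => x.toLong)
    // Each element costs real CPU time but allocates only a short list,
    // unlike List.range(1, x), which allocates x - 1 cells per element.
    val factorCounts = pv.map(x => factorize(x).length)
    println(factorCounts.sum)
  }
}
```

Watching this and the original List.range version side by side in VisualVM should make the difference in GC activity visible, though the numbers will depend on heap size and hardware.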

Hope this helps someone in the future!

Best,

Samuel

Alan Burlison

Feb 21, 2014, 4:32:36 AM

to Seth Tisue, scala-user

See:

http://stackoverflow.com/questions/14207762/set-the-parallelism-level-for-all-collections-in-scala-2-10
https://issues.scala-lang.org/browse/SI-7399

Supposed to be fixed, as far as I can tell, isn't.

I always find it virtually impossible to figure out which Scala version
a bug has been fixed in from the bug tracker. Is there a way?

--
Alan Burlison
--

Alan Burlison

Feb 21, 2014, 5:39:40 AM

to Samuel Ainsworth, scala...@googlegroups.com

On 21/02/2014 03:12, Samuel Ainsworth wrote:

> 2. More probable: The garbage collector constrains threads in certain
> ways when it is under stress. I tried running a similar experiment but that
> was less memory intensive. Instead of building lists from 1 to n, I tried
> factorizing the numbers which is obviously more CPU intensive but will hit
> the GC much less. It cranked up all of my cores to 100% as expected. I
> guess the moral of the story here is that when using parallel collections
> avoid hitting the GC and go for CPU intensive computations instead. Results
> may vary based on heap size, CPU, day of week, etc.

With your original code on a 256-core machine I was seeing more GC
threads active than application threads. And if I abused it badly
enough I could even induce the dreaded 'GC overhead limit exceeded',
which indicates that over 98% of the time was being spent in GC.

I know that there's a current meme that immutability + actors = instant
scalability. Immutability is clearly a big help as it helps avoid
locking; I'm less persuaded by actors. However, they are both subject to
the same limitations that you were hitting with your example: either in
the underlying implementation (thread pool), or because you don't
actually have an infinite amount of memory (garbage collection), or
because of platform limitations (OS scalability). Basically, the
oft-promised free lunch has still not arrived :-)

--
Alan Burlison
--

Jason Zaugg

Feb 21, 2014, 6:03:29 AM

to Alan Burlison, Seth Tisue, scala-user

On Fri, Feb 21, 2014 at 10:32 AM, Alan Burlison <alan.b...@gmail.com> wrote:

> http://stackoverflow.com/questions/14207762/set-the-parallelism-level-for-all-collections-in-scala-2-10
> https://issues.scala-lang.org/browse/SI-7399
>
> Supposed to be fixed, as far as I can tell, isn't.
>
> I always find it virtually impossible to figure out which Scala version a bug has been fixed in from the bug tracker. Is there a way?

I've just updated the fix-by version to 2.11.0-M4 (the milestone of the linked pull request, https://github.com/scala/scala/pull/2424).

We don't have an automated link between Jira and Git to do this; it is supposed to be done manually when the ticket is closed.

-jason

Alan Burlison

Feb 21, 2014, 6:15:59 AM

to Jason Zaugg, Seth Tisue, scala-user

On 21/02/2014 11:03, Jason Zaugg wrote:

>> I always find it virtually impossible to figure out which Scala version a
>> bug has been fixed in from the bug tracker. Is there a way?
>
> I've just updated the fix-by version to 2.11.0-M4 (the milestone of the
> linked pull request <https://github.com/scala/scala/pull/2424>)
>
> We don't have an automated link between Jira and Git to do this; it is
> supposed to be done manually when the ticket is closed.

Thank you :-)

One question: in 2.11, will JVM properties be the only way of setting
thread parallelism, or will
collection.parallel.ForkJoinTasks.defaultForkJoinPool.setParallelism or
its equivalent be making a reappearance? It would be nice to be able to
set the required degree of parallelism programmatically as well as on
the command line.

--
Alan Burlison
--

Jason Zaugg

Feb 21, 2014, 6:21:09 AM

to Alan Burlison, Seth Tisue, scala-user

AFAIK the underlying ForkJoinPool now needs to know the desired parallelism upon construction, and you can't change it later. So if we offered a programmatic means to configure this, it wouldn't work if the pool had already been created.

-jason

Alan Burlison

Feb 21, 2014, 6:59:10 AM

to Jason Zaugg, Seth Tisue, scala-user

On 21/02/2014 11:21, Jason Zaugg wrote:

> AFAIK the underlying ForkJoinPool now needs to know the desired parallelism
> upon construction, and you can't change it later. So if we offered a
> programmatic means to configure this, it wouldn't work if the pool had
> already been created.

Ah, I think you nailed it there. From the JDK doc [1]:

"A ForkJoinPool is constructed with a given target parallelism level; by
default, equal to the number of available processors."

and there's no setter in the API either. Thanks.

[1]
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ForkJoinPool.html
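
A small sketch of what that means in practice, using only java.util.concurrent.ForkJoinPool from the JDK. The object and method names are made up for illustration, and it reflects the Java 7 API quoted above; later JDK releases may differ.

```scala
import java.util.concurrent.ForkJoinPool

// Hypothetical helper for illustration only.
object PoolParallelismSketch {
  // Builds a pool with the requested parallelism and reports what it holds.
  // The level is fixed by the constructor argument; the Java 7 API exposes
  // getParallelism but no matching setter.
  def parallelismOf(requested: Int): Int = {
    val pool = new ForkJoinPool(requested)
    try pool.getParallelism
    finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = {
    println(parallelismOf(2))
    // The no-argument constructor uses the number of available processors.
    val defaultPool = new ForkJoinPool()
    try println(defaultPool.getParallelism)
    finally defaultPool.shutdown()
  }
}
```

So a different parallelism level means building a new pool, which is why a global setter could not affect a default pool that already exists.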

--
Alan Burlison
--
