parallel Haskell concept map?


Eric Kow

Feb 24, 2011, 4:29:52 AM
to parallel...@googlegroups.com
Hello parallel Haskell fans!

I'd like to put some of my cluelessness to service for the parallel
Haskell community.

There seems to be a profusion of concepts and acronyms (eg. STM, SMP,
GPH, CnC, NDP, DpH) related to parallel Haskell.  I suspect that this
could be intimidating for people who are new to parallelism and
concurrency and/or to Haskell itself.  It might create the impression
that you have to learn a lot of stuff to get started (which seems
false), or perhaps create a "tyranny of choice" situation (argh! where
do I start?)

Maybe some sort of map would help?  My hope is that something that
puts all these concepts on the same page and relates them together
(without trying to explain them) could help fellow newbies to build
up their confidence.

Diagram goals
-------------
1. Coverage - see all the words that people seem to
   throw around a lot, in one place

2. Grouping and layering - rough sense of the territory.
   Associations between words and relationships (eg. X is about
   Parallelism, Y about Concurrency, Z is subordinate to Z').

3. Goal orientation - concretely worded programmer-oriented
   objectives for each main concept.  Tricky subtlety #1 is that the
   point is not what the technology is trying to accomplish, but what
   the programmer is trying to accomplish that might make them
   interested in the technology.  Tricky subtlety #2 is that this may
   mean communicating, in the same place, objectives that programmers
   may not even realise they should have.

4. Availability marking - a sense of what technologies are mature and
   well understood by the community.  If I'm working in the trenches,
   maybe I'm more hesitant to use a technology in progress, not
   necessarily for fear of instability but more for simple things
   like not being able to find help on it

General questions
-----------------
Attached is a first draft, ugly and full of holes to give a rough idea
what sort of thing I'm aiming for.

What do you think of the idea in general?  Maybe a diagram is overkill
and simple wiki page is good enough. Is there a better way?

Or maybe the goals are not focused enough.  For example, I'm concerned
that my attempt to arrange technologies on a continuum from use it now!
to current research may result in a contrived ordering of
technologies.

Do you have any suggestions for filling in the diagram?  Are there
any key words that I really ought to be putting in there?  Should I be
narrowing my choices about what to include in the diagram?

Specific questions
------------------
1. Should I maintain the dashed line separating Parallelism and
   Concurrency?  Or is this something that's better treated as
   being on a continuum (NB: from the user's perspective)?

2. How do you feel about the attempt to arrange concepts on a
   continuum of use it now! <-> current research?  Note that my
   classification may be incorrect (given my ignorance), but I
   wonder if more generally there is a more useful Y-axis
   dimension for fellow newbies.

3. I'd like some words I can use to call "basic" parallelism and
   concurrency.  Is there something reasonable I can use?  For now,
   I'm using Control.Parallel and Concurrent Haskell respectively.

4. Can you help me improve my user-oriented characterisation of the
   concepts?

   I'm trying to balance between

    a. accuracy (this is hard for me given my newbieness)
    b. user-orientedness
    c. brevity
    d. concreteness

   For example I have semi-nicked "separate computation from
   coordination" from /Parallel and Distributed Haskells/, which
   I hope satisfies (a), but it seems a bit too abstract (d).
   Can you come up with better?  I'm worried also that my
   characterisation of the concurrency stuff may be somewhat
   missing the point.

Thanks very much for any help you may be able to offer!

Eric

PS. If you'd like to send patches to the diagram (Inkscape SVG)

PPS. I gather from the name of our mailing list that we are converging
     towards "Parallel Haskell" as a catch-all term for Haskell
     Parallelism and Concurrency?  Is that right?  It's very helpful to
     have a *short* name (one word is nice, ie. X or Haskell X)
     that captures the essence of what we're doing.  "Multicore" seems
     like another candidate (seeing Don's Multicore Haskell Now!
     slides), but I get a feeling that something more abstract would be
     preferable.

parallel.png

Kevin Hammond

Feb 24, 2011, 5:08:26 AM
to parallel...@googlegroups.com, Phil Trinder
Hi Eric,

I'm writing a book with Phil Trinder on Parallel Haskell. We can give you the current glossary entries if that helps?
Obviously we'll look at the ones you're defining.


Best Wishes,
Kevin

--------

Kevin Hammond, Professor of Computer Science, University of St Andrews

T: +44-1334 463241 F: +44-1334-463278 W: http://www.cs.st-andrews.ac.uk/~kh


Jost Berthold

Feb 24, 2011, 5:26:32 AM
to parallel...@googlegroups.com, hackpar
Hi Eric,

Your diagram looks like a good starting point.  Yet the first thing
is, it is more a table than a diagram.  In my view, the transition from
parallel to concurrent/distributed is not really a fixed borderline (for
instance, look at the Haskell contributions to the "programming language
shootout": they look rather imperative and use Concurrent Haskell, but
for the purpose of speeding up, i.e. parallelism).

There are some scientific papers around that might help understanding,
especially one:
"Parallel and Distributed Haskells" by P.Trinder, H.W.Loidl and
R.Pointon, in a special issue of JFP.

http://www.lmgtfy.com/?q=parallel%20and%20distributed%20Haskells

While already a bit older (2002), it collects a number of
research-oriented approaches to parallel Haskell not shown in your
overview, gives a historical perspective, and includes a simple scheme
in two dimensions (location-awareness and explicitness) that will help
improve yours.
To fill the "hole" in red, you can put the word "Eden" (and I would say
move it up a bit), and there is an ongoing Haskell+MPI reimplementation
(the first one was in 2000).

Finally, thanks for bringing this up! The idea to collect "a big
picture" in this forum is very good. I am forwarding this message to
another list (including Phil Trinder and Hans-Wolfgang Loidl, btw) for
information.

Jost

Oleg Lobachev

Feb 24, 2011, 5:48:55 AM
to parallel...@googlegroups.com
Hello Eric,

On Feb 24, 2011, at 10:29 , Eric Kow wrote:

> Maybe some sort of map would help? My hope is that something that
> puts all these concepts on the same page and relates them together
> (without trying to explain them) could help fellow newbies to build
> up their confidence.

As a more narrow task of classifying parallel Haskell dialects, I would suggest some kind of single-dimension classification. An example would be PCAM from Foster's 1995 book "Designing and Building Parallel Programs". Foster separates four phases: partitioning, communication, agglomeration and mapping. For a classification--inspired by Rita Loogen's lectures--one would need to denote which phases are done by the compiler and which are left to the programmer. So we can denote a semi-explicit parallel language as P---, as only partitioning is the programmer's task. A "control" parallel language would be PCA-. A fully explicit language would be PCAM.
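Oleg's PCAM idea can be sketched as a toy Haskell encoding (the type and function names below are mine, purely illustrative, not anything from Foster): classify a language by which of the four phases are left to the programmer, and render strings like P--- or PCA-.

```haskell
-- Toy encoding of Foster's PCAM classification (illustrative names only):
-- a language is characterised by which phases the programmer handles.
data Phase = Partitioning | Communication | Agglomeration | Mapping
  deriving (Eq, Show, Enum, Bounded)

-- Render a classification string such as "P---" or "PCA-".
render :: [Phase] -> String
render programmerPhases =
  [ if p `elem` programmerPhases then head (show p) else '-'
  | p <- [minBound .. maxBound] ]

main :: IO ()
main = do
  putStrLn (render [Partitioning])                                -- semi-explicit: "P---"
  putStrLn (render [Partitioning, Communication, Agglomeration])  -- "control" parallel: "PCA-"
  putStrLn (render [minBound .. maxBound])                        -- fully explicit: "PCAM"
```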

In a broader context, you will need to identify paradigms (e.g., SMP), approaches (e.g., STM) and software tools (e.g., Eden TV).

Greetings,
Oleg

Paul Bone

Feb 24, 2011, 8:47:46 PM
to parallel...@googlegroups.com
On Thu, Feb 24, 2011 at 01:29:52AM -0800, Eric Kow wrote:
> Hello parallel Haskell fans!
>
> Specific questions
> ------------------
> 1. Should I maintain the dashed line separating Parallelism and
> Concurrency? Or is this something that's better treated as
> being on a continuum (NB: from the user's perspective)?
>

I'd like to comment on this point in particular. The confusion between
Parallelism and Concurrency is one of my pet hates, and it's also something
that many people misunderstand.

Parallelism - is about using more than one processor (or SIMD instructions on a
single processor) during execution to speed up a program. The program does not
necessarily need to be written for parallelism. That is to say, parallelism is
an implementation detail, one that is only realized at runtime. For
example, an arbitrary pure functional program can be evaluated in parallel if
the compiler and runtime support it, or the same program can be evaluated
sequentially.

Concurrency - is about expressing concurrent 'threads' of execution in a
program. This is a programming abstraction that helps the programmer separate
concerns within a program. Consider a web server serving many documents to many
clients simultaneously. The programmer may wish to use one thread per
client, making it easier to manage concurrent actions.
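A minimal Concurrent Haskell sketch of that thread-per-client idea (a toy `serve` standing in for a real request handler, purely illustrative; `forkIO` and `MVar` come from `Control.Concurrent`):

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Toy "request handler": each client gets its own thread and replies
-- through its own MVar, so handlers are written independently.
serve :: Int -> IO String
serve clientId = return ("document for client " ++ show clientId)

main :: IO ()
main = do
  boxes <- forM [1 .. 3 :: Int] $ \i -> do
    box <- newEmptyMVar
    _ <- forkIO (serve i >>= putMVar box)          -- one handler thread per client
    return box
  mapM_ (\box -> takeMVar box >>= putStrLn) boxes  -- collect replies in order
```

Each `takeMVar` blocks until the corresponding handler thread has replied, so the output order is deterministic even though the handlers run concurrently.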

These concepts are often confused because a programmer usually needs
concurrency to achieve parallelism, and therefore thinks that concurrency is
parallelism. This is possibly more confusing now that multicore CPUs are
ubiquitous, since in most cases concurrency will create parallelism (provided
that the compiler and runtime system support it).

Despite this there are cases where concurrency doesn't imply parallelism, one
example is the use of generators in Python. There are also cases where
parallelism doesn't imply concurrency, such as auto-parallelizing compilers.

Therefore, when classifying technologies as Parallel or Concurrent I believe
that these concepts should be kept separate. Everything can be classified as:
+ Neither Parallel nor Concurrent.
+ Parallel but not Concurrent.
+ Concurrent but not Parallel.
+ Both Parallel and Concurrent.
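For the "Parallel but not Concurrent" quadrant, a minimal sketch using `par`/`pseq` from `Control.Parallel` (the parallel package): the annotations only suggest parallel evaluation and never change the result, so the same program also runs correctly on a single core.

```haskell
import Control.Parallel (par, pseq)

-- Deliberately naive Fibonacci: the two recursive calls are independent,
-- so one is sparked with `par` while `pseq` forces the other first.
nfib :: Int -> Integer
nfib n
  | n < 2     = 1
  | otherwise = x `par` (y `pseq` x + y)
  where
    x = nfib (n - 1)
    y = nfib (n - 2)

main :: IO ()
main = print (nfib 20)  -- 10946, with or without parallel evaluation
```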


Trinder, Philip W

Feb 25, 2011, 6:09:10 AM
to Kevin Hammond, parallel...@googlegroups.com, E.Y...@brighton.ac.uk
Hi Eric,

A few thoughts:

What are you trying to do with your diagram? Is it to introduce
parallelism and concurrency options to Haskell programmers? What do you
plan to do with it? Could we use it if we like it?

Assuming you plan to introduce parallelism and concurrency options to
Haskell programmers, here are some comments.

* Please write 'Evaluation Strategies', rather than 'Strategies'

* Suggest you don't cover
+ MPI (it's too low level) or
+ Distributed Haskells - too experimental for most Haskell programmers

* Some aspects you've missed:

+ What Haskell implementations support the models. Perhaps this doesn't
matter if you assume people only use GHC, but even then they may need to
know that there's no parallelism in ghci

+ The distinction between shared memory and distributed memory
architectures. Currently GHC only supports shared-memory parallelism,
and not distributed memory. Example distributed memory implementations
are GUM and Eden.

> > 1. Should I maintain the dashed line separating Parallelism and
> > Concurrency? Or is this something that's better treated as
> > being on a continuum (NB: from the user's perspective)?

Yes, they are fundamentally different. Concurrent (IO) threads are
stateful, require mandatory scheduling etc.



> > 2. How do you feel about the attempt to arrange concepts on a
> > continuum of use it now! <-> current research? Note that my
> > classification may be incorrect (given my ignorance), but I
> > wonder if more generally there is a more useful Y-axis
> > dimension for fellow newbies.

Fine.

HTH,

Phil


Simon Marlow

Feb 28, 2011, 6:34:59 AM
to parallel...@googlegroups.com
Hi Eric,

> I'd like to put some of my cluelessness to service for the parallel Haskell
> the community.
>
> There seems to be a profusion of concepts and acronyms (eg. STM, SMP, GPH,
> CnC, NDP, DpH) related to parallel Haskell.  I suspect that this could be
> intimidating for people who are new to parallelism and concurrency and/or to
> Haskell itself.  It might create the impression that you have to learn a lot
> of stuff to get started (which seems false), or perhaps create a "tyranny of
> choice" situation (argh! where do I start?)

I completely agree. Haskell's flexibility as a substrate is its undoing here.

The diagram is a good start, and I like the idea. We should have something like this on the front page of parallel.haskell.org. My main suggestion would be to separate it into two:

- one aimed at *users*, people who want to pick up the existing tools
and write concurrent or parallel programs. For this audience the
picture can be simplified considerably by removing all of the
experimental stuff (e.g. I'd leave out CnC, CHP, DPH, etc.). For
parallelism we have par/pseq and Strategies, for concurrency we have
forkIO, MVars and STM - these are the technologies we fully support.

- the full picture, including all experimental and research topics, and
libraries on Hackage such as Orc and parallel-io.
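As a taste of that supported concurrency toolkit, here is a minimal STM sketch (using `Control.Concurrent.STM` from the stm package; a hypothetical shared counter, not any example from the thread). Ten threads increment a `TVar` atomically, and `retry` blocks the main thread until all increments have landed.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
  (atomically, modifyTVar', newTVarIO, readTVar, readTVarIO, retry)
import Control.Monad (forM_, unless)

main :: IO ()
main = do
  counter <- newTVarIO (0 :: Int)
  -- Ten threads each increment the shared counter atomically:
  -- no locks, and no lost updates.
  forM_ [1 .. 10 :: Int] $ \_ ->
    forkIO (atomically (modifyTVar' counter (+ 1)))
  -- Block until every increment has landed, using STM's retry.
  atomically $ do
    n <- readTVar counter
    unless (n >= 10) retry
  readTVarIO counter >>= print  -- 10
```

The `retry`-based wait is what distinguishes STM from a plain lock: the transaction automatically re-runs when the `TVar` it read changes.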

I like the way you made a top-level distinction between concurrency and parallelism. The quicker users understand the distinction, the easier their lives will be.

> PPS. I gather from the name of our mailing list that we are converging
>      towards "Parallel Haskell" as a catch-all term for Haskell
>      Parallelism and Concurrency?  Is that right?  It's very helpful to
>      have a *short* name (one word is nice, ie. X or Haskell X)
>      that captures the essence of what we're doing.  "Multicore" seems
>      like another candidate (seeing Don's Multicore Haskell Now!
>      slides), but I get a feeling that something more abstract would be
>      preferable.

I think it makes sense to deal with them both together. Concurrent Haskell can be used as a parallel programming model, after all (it's just not the one we should recommend using without a good reason, such as the use of a nondeterministic algorithm).

Cheers,
Simon

Eric Kow

Mar 4, 2011, 9:45:07 AM
to parallel...@googlegroups.com, P.W.T...@hw.ac.uk
Hi everybody,

Thanks for all your responses!

It's given me a lot to reflect on. Here's my attempt to summarise your
responses so far and hopefully respond to them below.  Do shout if I've
misrepresented you in any way!

Note: I should have mentioned that it may take me a week or so to
respond to emails on this list.  When doing so, I will tend to send
batch responses.

Overview
======================================================================

Kevin Hammond and Phil Trinder
------------------------------
 * Offer for glossary from Parallel Haskell book [Kevin]
 * QUESTION: what trying to accomplish?
 * TODO: write "evaluation strategies" instead of "strategies"
 * SUGGEST:
   - no MPI (too low-level) or Distributed Haskells (too experimental)
   - which Haskell implementations support X?
   - distinguish between shared/distributed memory model?
 * +1 separating parallel/concurrent

Jost Berthold
-------------
 * appears to be table at the moment
 * SUGGEST Eden and MPI binding efforts for Distributed
 * -1 separating parallel/concurrent?
   (eg. shootout use concurrency to achieve parallelism)

Oleg Lobachev
-------------
 * SUGGEST: single-dimension classification (if classifying PH dialects
   only), eg. Foster (example: PCA- means the programmer's tasks are the first 3)
    - Partitioning
    - Communication
    - Agglomeration
    - Mapping
 * TODO: distinguish between paradigms (SMP), approaches (STM) and
   software tools (Eden TV)

Paul Bone
---------
 * parallel/concurrent often confused
   - parallelism: implementation detail - faster due to running
     on more than one processor
   - concurrent: thread as programming abstraction
 * should keep separate, because can have
     P  C
     P ~C : eg. auto-parallelizing compilers,
    ~P  C : eg. Python generators
    ~P ~C

Simon Marlow
------------
 * flexibility as a substrate is Haskell's undoing
 * SUGGEST: two diagrams?
   - simple user-oriented with only stuff we can support (par/pseq,
     forkIO, MVars, STM)
   - full picture (current research, historical concepts)
 * +1 separating parallel/concurrent

Kevin Hammond
======================================================================
> We can give you the current glossary entries if that helps?

Many thanks!  I'll be referring to this a lot.  One of my goals [1] is
to develop some sort of Parallel Haskell monthly roundup, of which one
of the features would be a /word of the day/.  May I use terms from the
glossary for this feature?  I may have to tweak them a bit to fit the
word-of-the-day format.

Phil Trinder
======================================================================
> What are you trying to do with your diagram?

My target audience is essentially people who either
(i) don't know very much about parallel/concurrent programming and/or
(ii) don't know very much about Haskell or maybe both
("Gee maybe I should get into functional programming; I hear it's good
for concurrency").  My aim overall is to reduce static inertia, that is,
to make it easier for my audience to get started with Parallel/Concurrent
Haskell, and also to build up the momentum to become bona fide users.
More specifically, my hope is to put into context the key words that the
presumably confused public may have encountered whilst searching online
for "Parallel Haskell" (or the like).

Also, I believe this means that I should treat other worthy uses for a
concept map as secondary.  For example, presenting historical
developments in Parallel/Concurrent Haskell may help people to
understand current research directions; but could potentially detract
from the main objective.  On the other hand, talking about the
historical approaches may also be useful in this context, if only to
let users know explicitly which words they've seen can be
safely ignored for now (for being too old or too new, etc.).

> What Haskell implementations support the models. 

Thinking in stark market share terms, I'll assume everybody is using
GHC at the moment.  Thanks for the observation about GHCi!

> The distinction between shared memory and distributed memory
> architectures. 

I think the need to make this distinction may go away if I simplify
out some of the more experimental work from the diagram.

Jost Berthold
======================================================================
> There are some scientific papers around that might help understanding, 
...
> To fill the "hole" in red, you can put the word "Eden" (and I would say 
> move it up a bit), and there is an ongoing Haskell+MPI reimplementation 

Thanks for the pointers!  I hope to work through Parallel and
Distributed Haskell over time.  If I do talk about Distributed stuff, at
least Eden sounds like something to mention.

> Yet, the first thing is, it is more a table than a diagram.

I assume this is referring to the current draft of the diagram. :-)
Yeah, so there are two directions to explore that I know of.  Either
run with the idea that this is a table (simpler is better), or make
more use of the canvas.  I'm going with the latter approach for now,
ie. continuing with a 2D canvas, anticipating room for further
experimentation (eg, connecting concepts with lines, etc).

> In my view, the transition from parallel to concurrent/distributed is
> not really a fixed borderline

I'm interested in this distinction between parallel and concurrent
Haskells.  After taking all of your comments into account, I think I
shall retain some sort of distinction.  But I may like to write up a
blog post about these three words (parallel, concurrent, distributed),
perhaps "reconciling" different ways of understanding the
(non)-distinction.

Oleg Lobachev
----------------------------------------------------------------------
> As a more narrow task of a classification of parallel Haskell
> dialects, I would suggest some kind of a single-dimension
> classification. 

Hmm, schemes like PCAM look useful for understanding indeed.
You called this a "single" dimension classification, showing
P--- to PCA- to PCAM being the programmers' tasks.  Does this
mean that whether or not the tasks are left to the programmer
is somehow constrained by the sequential order of the phases?
In other words, that there would be no such thing as a P-C- or -CAM
language worth discussing?

> In a broader context, you will need to identify paradigms (e.g., SMP),
> approaches (e.g., STM) and software tools (e.g., Eden TV).

Could you please clarify the distinction between a paradigm and an
approach?  Is it safe to take on a tree-shaped view of the three,
that there are potentially several tools that work within an approach,
and several approaches towards a single paradigm?  Taking a concrete
example, what would be the paradigm that corresponds to STM?  In any
case, thanks to you, I certainly hope to make at least the distinction
between tools and paradigms/approaches.  Thanks!

Paul Bone
======================================================================
> These concepts are often confused because a programmer usually needs
> concurrency to achieve parallelism, and therefore thinks that concurrency is
> parallelism

Could I just confirm what you mean when you say a programmer
/usually/ needs concurrency (as opposed to always)?  Reading your
separation of the concepts, I think you are saying that programmers
often manipulate some thread-of-execution abstraction to get parallelism
(as Jost points out about the shootout code).  Would using things like
par/pseq or NDP thus count as "not using concurrency to achieve
parallelism"?

> Therefore, when classifying technologies as Parallel or Concurrent I believe
> that these concepts should be kept separate.  Everything can be classified as:
>     + Neither Parallel or Concurrent.
>     + Parallel but not Concurrent.
>     + Concurrent but not Parallel.
>     + Both Parallel and Concurrent.

Viewing Parallel and Concurrent as orthogonal concepts is helping me
to see things a bit more clearly, thanks! I'll be experimenting to see
if reflecting this in the diagram is useful for my needs.

Simon Marlow
======================================================================
> We should have something like this on the front page of parallel.haskell.org.

That sounds good!  I'm hoping to do a survey of the sources of
documentation out there (wiki pages, personal pages, project pages, API
doc, etc) and work out some sort of scheme for unifying them.  Where
things may get delicate is that the attempt to unify may also contribute
to the proliferation of authoritative sources (eg. "should I put this on
the Haskell wiki, or the Parallel Haskell wiki?").

> My main suggestion would be to separate it into two: 

I think I like this idea.  I still might play with the combined diagram
for a while, at least to the extent that users may find it helpful to
see "distractor" concepts being explicitly mentioned (for example,
having unsupported technologies represented in faded text), as I mentioned
to Phil.   But if that should fail, two diagrams it is (also good
because each diagram can have radically different perspectives, axes,
representation schemes, etc).

Paul Bone

Mar 7, 2011, 8:53:31 PM
to parallel...@googlegroups.com
On Fri, Mar 04, 2011 at 06:45:07AM -0800, Eric Kow wrote:
> Paul Bone
> ======================================================================
> > These concepts are often confused because a programmer usually needs
> > concurrency to achieve parallelism, and therefore thinks that concurrency is
> > parallelism
>
> Could I just confirm what you mean when you say a programmer
> /usually/ needs concurrency (as opposed to always)? Reading your
> separation of the concepts, I think you are saying that programmers
> often manipulate some thread-of-execution abstraction to get parallelism
> (as Jost points out about the shootout code). Would using things like
> par/pseq or NDP thus count as "not using concurrency to achieve
> parallelism"?

Yes. I don't think par/pseq are about concurrency (nor are evaluation
strategies). They're hints* to the compiler/runtime about where parallel
evaluation should be used. But they're not about concurrency, because the
programmer doesn't use them to express the program in a different way (as
they would with STM, message passing, etc.).

*I'm not sure how people prefer to classify par. They could be called hints,
as the annotated expressions will be sparked, but that doesn't guarantee that
they'll be executed in parallel. Perhaps 'pragma' is a better term.
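To make the "annotation, not restructuring" point concrete, a minimal Evaluation Strategies sketch (using `Control.Parallel.Strategies` from the parallel package): `using` bolts a strategy onto an ordinary pure expression, and deleting the annotation leaves the result unchanged.

```haskell
import Control.Parallel.Strategies (parList, rdeepseq, using)

-- The computation is plain `map`; the strategy only says how it may be
-- evaluated (spark each list element), not what is computed.
squares :: [Int] -> [Int]
squares xs = map (^ 2) xs `using` parList rdeepseq

main :: IO ()
main = print (sum (squares [1 .. 1000]))  -- 333833500 either way
```

With a non-threaded runtime the sparks simply fizzle and the program still produces the same answer, which is exactly the separation of computation from coordination discussed above.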

In general (imperative and declarative languages) programmers usually need
concurrency to gain parallelism. In particular, threads are often used for
this. I only mention this because I think it's how people become confused
about the terms parallel and concurrent.

> > Therefore, when classifying technologies as Parallel or Concurrent I believe
> > that these concepts should be kept separate. Everything can be classified as:
> > + Neither Parallel or Concurrent.
> > + Parallel but not Concurrent.
> > + Concurrent but not Parallel.
> > + Both Parallel and Concurrent.
>
> Viewing Parallel and Concurrent as orthogonal concepts is helping me
> to see things a bit more clearly, thanks! I'll be experimenting to see
> if reflecting this in the diagram is useful for my needs.

I'm glad! I was worried that I was being preachy, or that everyone on the list
already knew and I was beating a dead horse.

> Simon Marlow
> ======================================================================


> > My main suggestion would be to separate it into two:
>
> I think I like this idea. I still might play with the combined diagram
> for a while, at least to the extent that users may find it helpful to
> see "distractor" concepts being explicitly mentioned (for example,
> having unsupported technologies represent in faded texts) as I mentioned
> to Phil. But if that should fail, two diagrams it is (also good
> because each diagram can have radically different perspectives, axes,
> representation schemes, etc).

Can I vote-up this idea?
