parallel Haskell concept map?


Eric Kow

Feb 24, 2011, 4:29:52 AM
to parallel...@googlegroups.com
Hello parallel Haskell fans!

I'd like to put some of my cluelessness to service for the parallel
Haskell community.

There seems to be a profusion of concepts and acronyms (eg. STM, SMP,
GPH, CnC, NDP, DpH) related to parallel Haskell.  I suspect that this
could be intimidating for people who are new to parallelism and
concurrency and/or to Haskell itself.  It might create the impression
that you have to learn a lot of stuff to get started (which seems
false), or perhaps create a "tyranny of choice" situation (argh! where
do I start?)

Maybe some sort of map would help?  My hope is that something that
puts all these concepts on the same page and relates them together
(without trying to explain them) could help fellow newbies to build
up their confidence.

Diagram goals
-------------
1. Coverage - see all the words that people seem to
   throw around a lot, in one place

2. Grouping and layering - rough sense of the territory.
   Associations between words and relationships (eg. X is about
   Parallelism, Y about Concurrency, Z is subordinate to Z').

3. Goal orientation - concretely worded programmer-oriented
   objectives for each main concept.  Tricky subtlety #1 is that the
   point is not what the technology is trying to accomplish, but what
   the programmer is trying to accomplish that might make them
   interested in the technology.  Tricky subtlety #2 is that this may
   mean communicating, in the same place, objectives that programmers
   may not even realise they should have.

4. Availability marking - a sense of what technologies are mature and
   well understood by the community.  If I'm working in the trenches,
   maybe I'm more hesitant to use a technology in progress, not
   necessarily for fear of instability but more for simple things
   like not being able to find help on it

General questions
-----------------
Attached is a first draft, ugly and full of holes to give a rough idea
what sort of thing I'm aiming for.

What do you think of the idea in general?  Maybe a diagram is overkill
and simple wiki page is good enough. Is there a better way?

Or maybe the goals are not focused enough.  For example, I'm concerned
that my attempt to arrange technologies on a continuum from use it now!
to current research may result in a contrived ordering of
technologies.

Do you have any suggestions for filling in the diagram?  Are there
any key words that I really ought to be putting in there?  Should I be
narrowing my choices about what to include in the diagram?

Specific questions
------------------
1. Should I maintain the dashed line separating Parallelism and
   Concurrency?  Or is this something that's better treated as
   being on a continuum (NB: from the user's perspective)?

2. How do you feel about the attempt to arrange concepts on a
   continuum of use it now! <-> current research?  Note that my
   classification may be incorrect (given my ignorance), but I
   wonder if more generally there is a more useful Y-axis
   dimension for fellow newbies.

3. I'd like some words I can use to call "basic" parallelism and
   concurrency.  Is there something reasonable I can use?  For now,
   I'm using Control.Parallel and Concurrent Haskell respectively.

4. Can you help me improve my user-oriented characterisation of the
   concepts?

   I'm trying to balance between

    a. accuracy (this is hard for me given my newbieness)
    b. user-orientedness
    c. brevity
    d. concreteness

   For example I have semi-nicked "separate computation from
   coordination" from /Parallel and Distributed Haskells/, which
   I hope satisfies (a), but it seems a bit too abstract (d).
   Can you come up with better?  I'm worried also that my
   characterisation of the concurrency stuff may be somewhat
   missing the point.

Thanks very much for any help you may be able to offer!

Eric

PS. If you'd like to send patches to the diagram (Inkscape SVG)

PPS. I gather from the name of our mailing list that we are converging
     towards "Parallel Haskell" as a catch-all term for Haskell
     Parallelism and Concurrency?  Is that right?  It's very helpful to
     have a *short* name (one word is nice, ie. X or Haskell X)
     that captures the essence of what we're doing.  "Multicore" seems
     like another candidate (seeing Don's Multicore Haskell Now!
     slides), but I get a feeling that something more abstract would be
     preferable.

parallel.png

Kevin Hammond

Feb 24, 2011, 5:08:26 AM
to parallel...@googlegroups.com, Phil Trinder
Hi Eric,

I'm writing a book with Phil Trinder on Parallel Haskell. We can give you the current glossary entries if that helps?
Obviously we'll look at the ones you're defining.


Best Wishes,
Kevin

--------

Kevin Hammond, Professor of Computer Science, University of St Andrews

T: +44-1334 463241 F: +44-1334-463278 W: http://www.cs.st-andrews.ac.uk/~kh


Jost Berthold

Feb 24, 2011, 5:26:32 AM
to parallel...@googlegroups.com, hackpar
Hi Eric,

Your diagram looks like a good starting point.  Yet the first thing
is, it is more a table than a diagram.  In my view, the transition from
parallel to concurrent/distributed is not really a fixed borderline (for
instance, look at the Haskell contributions to the "programming language
shootout": they look rather imperative and use Concurrent Haskell, but
for the purpose of speeding up, i.e. parallelism).

There are some scientific papers around that might help understanding,
especially one:
"Parallel and Distributed Haskells" by P.Trinder, H.W.Loidl and
R.Pointon, in a special issue of JFP.

http://www.lmgtfy.com/?q=parallel%20and%20distributed%20Haskells

While already a bit older (2002), it collects a number of
research-oriented approaches to parallel Haskell not shown in your
overview, gives a historical perspective, and includes a simple scheme
in two dimensions (location-awareness and explicitness) that will help
improve yours.
To fill the "hole" in red, you can put the word "Eden" (and I would say
move it up a bit), and there is an ongoing Haskell+MPI reimplementation
(the first one was in 2000).

Finally, thanks for bringing this up! The idea to collect "a big
picture" in this forum is very good. I am forwarding this message to
another list (including Phil Trinder and Hans-Wolfgang Loidl, btw) for
information.

Jost

Oleg Lobachev

Feb 24, 2011, 5:48:55 AM
to parallel...@googlegroups.com
Hello Eric,

On Feb 24, 2011, at 10:29 , Eric Kow wrote:

> Maybe some sort of map would help? My hope is that something that
> puts all these concepts on the same page and relates them together
> (without trying to explain them) could help fellow newbies to build
> up their confidence.

As a more narrow task of classifying parallel Haskell dialects, I would suggest some kind of single-dimension classification. An example would be PCAM from Foster's 1995 book "Designing and Building Parallel Programs". Foster separates four phases: partitioning, communication, agglomeration and mapping. For a classification--inspired by Rita Loogen's lectures--one would need to denote which phases are done by the compiler and which are left to the programmer. So we can denote a semi-explicit parallel language as P---, as only partitioning is the programmer's task. A "control" parallel language would be PCA-. A fully explicit language would be PCAM.
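Oleg's PCAM idea can be sketched as a toy Haskell encoding (the type and function names below are mine, purely illustrative, not anything from Foster): classify a language by which of the four phases are left to the programmer, and render strings like P--- or PCA-.

```haskell
-- Toy encoding of Foster's PCAM classification (illustrative names only):
-- a language is characterised by which phases the programmer handles.
data Phase = Partitioning | Communication | Agglomeration | Mapping
  deriving (Eq, Show, Enum, Bounded)

-- Render a classification string such as "P---" or "PCA-".
render :: [Phase] -> String
render programmerPhases =
  [ if p `elem` programmerPhases then head (show p) else '-'
  | p <- [minBound .. maxBound] ]

main :: IO ()
main = do
  putStrLn (render [Partitioning])                                -- semi-explicit: "P---"
  putStrLn (render [Partitioning, Communication, Agglomeration])  -- "control" parallel: "PCA-"
  putStrLn (render [minBound .. maxBound])                        -- fully explicit: "PCAM"
```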

In a broader context, you will need to identify paradigms (e.g., SMP), approaches (e.g., STM) and software tools (e.g., Eden TV).

Greetings,
Oleg

Paul Bone

Feb 24, 2011, 8:47:46 PM
to parallel...@googlegroups.com
On Thu, Feb 24, 2011 at 01:29:52AM -0800, Eric Kow wrote:
> Hello parallel Haskell fans!
>
> Specific questions
> ------------------
> 1. Should I maintain the dashed line separating Parallelism and
> Concurrency? Or is this something that's better treated as
> being on a continuum (NB: from the user's perspective)?
>

I'd like to comment on this point in particular. The confusion between
Parallelism and Concurrency is one of my pet hates, and it's also something
that many people misunderstand.

Parallelism - is about using more than one processor (or SIMD instructions on a
single processor) during execution to speed up a program. The program does not
necessarily need to be written for parallelism. That is to say, parallelism is
an implementation detail, one that is only realized at runtime. For
example, an arbitrary pure functional program can be evaluated in parallel if
the compiler and runtime support it, or the same program can be evaluated
sequentially.

Concurrency - is about expressing concurrent 'threads' of execution in a
program. This is a programming abstraction that helps the programmer separate
concerns within a program. Consider a web server serving many documents to many
clients simultaneously. The programmer may wish to use one thread per
client, making it easier to manage concurrent actions.
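A minimal Concurrent Haskell sketch of that thread-per-client idea (a toy `serve` standing in for a real request handler, purely illustrative; `forkIO` and `MVar` come from `Control.Concurrent`):

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Toy "request handler": each client gets its own thread and replies
-- through its own MVar, so handlers are written independently.
serve :: Int -> IO String
serve clientId = return ("document for client " ++ show clientId)

main :: IO ()
main = do
  boxes <- forM [1 .. 3 :: Int] $ \i -> do
    box <- newEmptyMVar
    _ <- forkIO (serve i >>= putMVar box)          -- one handler thread per client
    return box
  mapM_ (\box -> takeMVar box >>= putStrLn) boxes  -- collect replies in order
```

Each `takeMVar` blocks until the corresponding handler thread has replied, so the output order is deterministic even though the handlers run concurrently.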

These concepts are often confused because a programmer usually needs
concurrency to achieve parallelism, and therefore thinks that concurrency is
parallelism. This is possibly more confusing now that multicore CPUs are
ubiquitous, since in most cases concurrency will create parallelism (provided
that the compiler and runtime system support it).

Despite this there are cases where concurrency doesn't imply parallelism, one
example is the use of generators in Python. There are also cases where
parallelism doesn't imply concurrency, such as auto-parallelizing compilers.

Therefore, when classifying technologies as Parallel or Concurrent I believe
that these concepts should be kept separate. Everything can be classified as:
+ Neither Parallel nor Concurrent.
+ Parallel but not Concurrent.
+ Concurrent but not Parallel.
+ Both Parallel and Concurrent.
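For the "Parallel but not Concurrent" quadrant, a minimal sketch using `par`/`pseq` from `Control.Parallel` (the parallel package): the annotations only suggest parallel evaluation and never change the result, so the same program also runs correctly on a single core.

```haskell
import Control.Parallel (par, pseq)

-- Deliberately naive Fibonacci: the two recursive calls are independent,
-- so one is sparked with `par` while `pseq` forces the other first.
nfib :: Int -> Integer
nfib n
  | n < 2     = 1
  | otherwise = x `par` (y `pseq` x + y)
  where
    x = nfib (n - 1)
    y = nfib (n - 2)

main :: IO ()
main = print (nfib 20)  -- 10946, with or without parallel evaluation
```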


Trinder, Philip W

Feb 25, 2011, 6:09:10 AM
to Kevin Hammond, parallel...@googlegroups.com, E.Y...@brighton.ac.uk
Hi Eric,

A few thoughts:

What are you trying to do with your diagram? Is it to introduce
parallelism and concurrency options to Haskell programmers? What do you
plan to do with it? Could we use it if we like it?

Assuming you plan to introduce parallelism and concurrency options to
Haskell programmers, here are some comments.

* Please write 'Evaluation Strategies', rather than 'Strategies'

* Suggest you don't cover
+ MPI (it's too low level) or
+ Distributed Haskells - too experimental for most Haskell programmers

* Some aspects you've missed:

+ What Haskell implementations support the models. Perhaps this doesn't
matter if you assume people only use GHC, but even then they may need to
know that there's no parallelism in ghci

+ The distinction between shared memory and distributed memory
architectures. Currently GHC only supports shared-memory parallelism,
and not distributed memory. Example distributed memory implementations
are GUM and Eden.

> > 1. Should I maintain the dashed line separating Parallelism and
> > Concurrency? Or is this something that's better treated as
> > being on a continuum (NB: from the user's perspective)?

Yes, they are fundamentally different. Concurrent (IO) threads are
stateful, require mandatory scheduling etc.



> > 2. How do you feel about the attempt to arrange concepts on a
> > continuum of use it now! <-> current research? Note that my
> > classification may be incorrect (given my ignorance), but I
> > wonder if more generally there is a more useful Y-axis
> > dimension for fellow newbies.

Fine.

HTH,

Phil


Simon Marlow

Feb 28, 2011, 6:34:59 AM
to parallel...@googlegroups.com
Hi Eric,

> I'd like to put some of my cluelessness to service for the parallel Haskell
> the community.
>
> There seems to be a profusion of concepts and acronyms (eg. STM, SMP, GPH,
> CnC, NDP, DpH) related to parallel Haskell.  I suspect that this could be
> intimidating for people who are new to parallelism and concurrency and/or to
> Haskell itself.  It might create the impression that you have to learn a lot
> of stuff to get started (which seems false), or perhaps create a "tyranny of
> choice" situation (argh! where do I start?)

I completely agree. Haskell's flexibility as a substrate is its undoing here.

The diagram is a good start, and I like the idea. We should have something like this on the front page of parallel.haskell.org. My main suggestion would be to separate it into two:

- one aimed at *users*, people who want to pick up the existing tools
and write concurrent or parallel programs. For this audience the
picture can be simplified considerably by removing all of the
experimental stuff (e.g. I'd leave out CnC, CHP, DPH, etc.). For
parallelism we have par/pseq and Strategies, for concurrency we have
forkIO, MVars and STM - these are the technologies we fully support.

- the full picture, including all experimental and research topics, and
libraries on Hackage such as Orc and parallel-io.
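As a taste of that supported concurrency toolkit, here is a minimal STM sketch (using `Control.Concurrent.STM` from the stm package; a hypothetical shared counter, not any example from the thread). Ten threads increment a `TVar` atomically, and `retry` blocks the main thread until all increments have landed.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
  (atomically, modifyTVar', newTVarIO, readTVar, readTVarIO, retry)
import Control.Monad (forM_, unless)

main :: IO ()
main = do
  counter <- newTVarIO (0 :: Int)
  -- Ten threads each increment the shared counter atomically:
  -- no locks, and no lost updates.
  forM_ [1 .. 10 :: Int] $ \_ ->
    forkIO (atomically (modifyTVar' counter (+ 1)))
  -- Block until every increment has landed, using STM's retry.
  atomically $ do
    n <- readTVar counter
    unless (n >= 10) retry
  readTVarIO counter >>= print  -- 10
```

The `retry`-based wait is what distinguishes STM from a plain lock: the transaction automatically re-runs when the `TVar` it read changes.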

I like the way you made a top-level distinction between concurrency and parallelism. The quicker users understand the distinction, the easier their lives will be.

> PPS. I gather from the name of our mailing list that we are converging
>      towards "Parallel Haskell" as a catch-all term for Haskell
>      Parallelism and Concurrency?  Is that right?  It's very helpful to
>      have a *short* name (one word is nice, ie. X or Haskell X)
>      that captures the essence of what we're doing.  "Multicore" seems
>      like another candidate (seeing Don's Multicore Haskell Now!
>      slides), but I get a feeling that something more abstract would be
>      preferable.

I think it makes sense to deal with them both together. Concurrent Haskell can be used as a parallel programming model, after all (it's just not the one we should recommend using without a good reason, such as the use of a nondeterministic algorithm).

Cheers,
Simon

Eric Kow

Mar 4, 2011, 9:45:07 AM
to parallel...@googlegroups.com, P.W.T...@hw.ac.uk
Hi everybody,

Thanks for all your responses!

It's given me a lot to reflect on. Here's my attempt to summarise your
responses so far and hopefully respond to them below.  Do shout if I've
misrepresented you in any way!

Note: I should have mentioned that it may take me a week or so to
respond to emails on this list.  When doing so, I will tend to send
batch responses.

Overview
======================================================================

Kevin Hammond and Phil Trinder
------------------------------
 * Offer for glossary from Parallel Haskell book [Kevin]
 * QUESTION: what trying to accomplish?
 * TODO: write "evaluation strategies" instead of "strategies"
 * SUGGEST:
   - no MPI (too low-level) or Distributed Haskells (too experimental)
   - which Haskell implementations support X?
   - distinguish between shared/distributed memory model?
 * +1 separating parallel/concurrent

Jost Berthold
-------------
 * appears to be table at the moment
 * SUGGEST Eden and MPI binding efforts for Distributed
 * -1 separating parallel/concurrent?
   (eg. shootout use concurrency to achieve parallelism)

Oleg Lobachev
-------------
 * SUGGEST: single-dimension classification (if classifying PH dialects
   only), eg. Foster (example: PCA- means the programmer's tasks are the first 3)
    - Partitioning
    - Communication
    - Agglomeration
    - Mapping
 * TODO: distinguish between paradigms (SMP), approaches (STM) and
   software tools (Eden TV)

Paul Bone
---------
 * parallel/concurrent often confused
   - parallelism: implementation detail - faster due to running
     on more than one processor
   - concurrent: thread as programming abstraction
 * should keep separate, because can have
     P  C
     P ~C : eg. auto-parallelizing compilers,
    ~P  C : eg. Python generators
    ~P ~C

Simon Marlow
------------
 * flexibility as a substrate is Haskell's undoing
 * SUGGEST: two diagrams?
   - simple user-oriented with only stuff we can support (par/pseq,
     forkIO, MVars, STM)
   - full picture (current research, historical concepts)
 * +1 separating parallel/concurrent

Kevin Hammond
======================================================================
> We can give you the current glossary entries if that helps?

Many thanks!  I'll be referring to this a lot.  One of my goals [1] is
to develop some sort of Parallel Haskell monthly roundup, of which one
of the features would be a /word of the day/.  May I use terms from the
glossary for this feature?  I may have to tweak them a bit to fit the
word-of-the-day format.

Phil Trinder
======================================================================
> What are you trying to do with your diagram?

My target audience is essentially people who either
(i) don't know very much about parallel/concurrent programming and/or
(ii) don't know very much about Haskell or maybe both
("Gee maybe I should get into functional programming; I hear it's good
for concurrency").  My aim overall is to reduce static inertia, that is,
to make it easier for my audience to get started with Parallel/Concurrent
Haskell, and also to build up the momentum to become bona fide users.
More specifically, my hope is to put into context the key words that the
presumably confused public may have encountered whilst searching online
for "Parallel Haskell" (or the like).

Also, I believe this means that I should treat other worthy uses for a
concept map as secondary.  For example, presenting historical
developments in Parallel/Concurrent Haskell may help people to
understand current research directions; but could potentially detract
from the main objective.  On the other hand, talking about the
historical approaches may also be useful in this context, if only to
let users know explicitly which words they've seen can be
safely ignored for now (for being too old or too new, etc.).

> What Haskell implementations support the models. 

Thinking in stark market share terms, I'll assume everybody is using
GHC at the moment.  Thanks for the observation about GHCi!

> The distinction between shared memory and distributed memory
> architectures. 

I think the need to make this distinction may go away if I simplify
out some of the more experimental work from the diagram.

Jost Berthold
======================================================================
> There are some scientific papers around that might help understanding, 
...
> To fill the "hole" in red, you can put the word "Eden" (and I would say 
> move it up a bit), and there is an ongoing Haskell+MPI reimplementation 

Thanks for the pointers!  I hope to work through Parallel and
Distributed Haskell over time.  If I do talk about Distributed stuff, at
least Eden sounds like something to mention.

> Yet, the first thing is, it is more a table than a diagram.

I assume this is referring to the current draft of the diagram. :-)
Yeah, so there are two directions to explore that I know of.  Either
run with the idea that this is a table (simpler is better), or make
more use of the canvas.  I'm going with the latter approach for now,
ie. continuing with a 2D canvas, anticipating room for further
experimentation (eg, connecting concepts with lines, etc).

> In my view, the transition from parallel to concurrent/distributed is
> not really a fixed borderline

I'm interested in this distinction between parallel and concurrent
Haskells.  After taking all of your comments into account, I think I
shall retain some sort of distinction.  But I may like to write up a
blog post about these three words (parallel, concurrent, distributed),
perhaps "reconciling" different ways of understanding the
(non)-distinction.

Oleg Lobachev
----------------------------------------------------------------------
> As a more narrow task of a classification of parallel Haskell
> dialects, I would suggest some kind of a single-dimension
> classification. 

Hmm, schemes like PCAM look useful for understanding indeed.
You called this a "single" dimension classification, showing
P--- to PCA- to PCAM being the programmers' tasks.  Does this
mean that whether or not the tasks are left to the programmer
is somehow constrained by the sequential order of the phases?
In other words, that there would be no such thing as a P-C- or -CAM
language worth discussing?

> In a broader context, you will need to identify paradigms (e.g., SMP),
> approaches (e.g., STM) and software tools (e.g., Eden TV).

Could you please clarify the distinction between a paradigm and an
approach?  Is it safe to take on a tree-shaped view of the three,
that there are potentially several tools that work within an approach,
and several approaches towards a single paradigm?  Taking a concrete
example, what would be the paradigm that corresponds to STM?  In any
case, thanks to you, I certainly hope to make at least the distinction
between tools and paradigms/approaches.  Thanks!

Paul Bone
======================================================================
> These concepts are often confused because a programmer usually needs
> concurrency to achieve parallelism, and therefore thinks that concurrency is
> parallelism

Could I just confirm what you mean when you say a programmer
/usually/ needs concurrency (as opposed to always)?  Reading your
separation of the concepts, I think you are saying that programmers
often manipulate some thread-of-execution abstraction to get parallelism
(as Jost points out about the shootout code).  Would using things like
par/pseq or NDP thus count as "not using concurrency to achieve
parallelism"?

> Therefore, when classifying technologies as Parallel or Concurrent I believe
> that these concepts should be kept separate.  Everything can be classified as:
>     + Neither Parallel or Concurrent.
>     + Parallel but not Concurrent.
>     + Concurrent but not Parallel.
>     + Both Parallel and Concurrent.

Viewing Parallel and Concurrent as orthogonal concepts is helping me
to see things a bit more clearly, thanks! I'll be experimenting to see
if reflecting this in the diagram is useful for my needs.

Simon Marlow
======================================================================
> We should have something like this on the front page of parallel.haskell.org.

That sounds good!  I'm hoping to do a survey of the sources of
documentation out there (wiki pages, personal pages, project pages, API
doc, etc) and work out some sort of scheme for unifying them.  Where
things may get delicate is that the attempt to unify may also contribute
to the proliferation of authoritative sources (eg. "should I put this on
the Haskell wiki, or the Parallel Haskell wiki?").

> My main suggestion would be to separate it into two: 

I think I like this idea.  I still might play with the combined diagram
for a while, at least to the extent that users may find it helpful to
see "distractor" concepts being explicitly mentioned (for example,
having unsupported technologies represented in faded text), as I mentioned
to Phil.   But if that should fail, two diagrams it is (also good
because each diagram can have radically different perspectives, axes,
representation schemes, etc).

Paul Bone

Mar 7, 2011, 8:53:31 PM
to parallel...@googlegroups.com
On Fri, Mar 04, 2011 at 06:45:07AM -0800, Eric Kow wrote:
> Paul Bone
> ======================================================================
> > These concepts are often confused because a programmer usually needs
> > concurrency to achieve parallelism, and therefore thinks that concurrency is
> > parallelism
>
> Could I just confirm what you mean when you say a programmer
> /usually/ needs concurrency (as opposed to always)? Reading your
> separation of the concepts, I think you are saying that programmers
> often manipulate some thread-of-execution abstraction to get parallelism
> (as Jost points out about the shootout code). Would using things like
> par/pseq or NDP thus count as "not using concurrency to achieve
> parallelism"?

Yes. I don't think par/pseq are about concurrency (nor are evaluation
strategies). They're hints* to the compiler/runtime about where parallel
evaluation should be used. But they're not about concurrency, because the
programmer doesn't use them to express the program in a different way (as
they would with STM, message passing, etc.).

*I'm not sure how people prefer to classify par. They could be called hints,
as the annotated expressions will be sparked, but that doesn't guarantee that
they'll be executed in parallel. Perhaps 'pragma' is a better term.
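To make the "annotation, not restructuring" point concrete, a minimal Evaluation Strategies sketch (using `Control.Parallel.Strategies` from the parallel package): `using` bolts a strategy onto an ordinary pure expression, and deleting the annotation leaves the result unchanged.

```haskell
import Control.Parallel.Strategies (parList, rdeepseq, using)

-- The computation is plain `map`; the strategy only says how it may be
-- evaluated (spark each list element), not what is computed.
squares :: [Int] -> [Int]
squares xs = map (^ 2) xs `using` parList rdeepseq

main :: IO ()
main = print (sum (squares [1 .. 1000]))  -- 333833500 either way
```

With a non-threaded runtime the sparks simply fizzle and the program still produces the same answer, which is exactly the separation of computation from coordination discussed above.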

In general (imperative and declarative languages) programmers usually need
concurrency to gain parallelism. In particular, threads are often used for
this. I only mention this because I think it's how people become confused
about the terms parallel and concurrent.

> > Therefore, when classifying technologies as Parallel or Concurrent I believe
> > that these concepts should be kept separate. Everything can be classified as:
> > + Neither Parallel or Concurrent.
> > + Parallel but not Concurrent.
> > + Concurrent but not Parallel.
> > + Both Parallel and Concurrent.
>
> Viewing Parallel and Concurrent as orthogonal concepts is helping me
> to see things a bit more clearly, thanks! I'll be experimenting to see
> if reflecting this in the diagram is useful for my needs.

I'm glad! I was worried that I was being preachy, or that everyone on the list
already knew and I was beating a dead horse.

> Simon Marlow
> ======================================================================


> > My main suggestion would be to separate it into two:
>
> I think I like this idea. I still might play with the combined diagram
> for a while, at least to the extent that users may find it helpful to
> see "distractor" concepts being explicitly mentioned (for example,
> having unsupported technologies represent in faded texts) as I mentioned
> to Phil. But if that should fail, two diagrams it is (also good
> because each diagram can have radically different perspectives, axes,
> representation schemes, etc).

Can I vote-up this idea?
