
Multicore programming - is that really hard?


hzm...@hotmail.com
Feb 13, 2008, 5:43:23 PM

I read a number of people saying few know how to program on a
multicore computer. They did not elaborate, and to me this has become
rhetoric. We have had symmetric multiprocessors and pthreads for a
long time. Can't we just use pthreads, or maybe some other things
too, to program on a multicore computer? Or do multicores somehow
present new challenges that we did not have before on SMPs? Please
enlighten me.

--

Eugene Miya
Feb 13, 2008, 5:54:33 PM
In article <d7cee607-dd33-4c54...@p69g2000hsa.googlegroups.com>,
<hzm...@hotmail.com> wrote:
>I read a number of people saying few know how to program on a
>multicore computer. They did not elaborate, and to me this has become
>rhetoric. We have had symmetric multiprocessors and pthreads for a
>long time. Can't we just use pthreads, or maybe some other things
>too, to program on a multicore computer? Or do multicores somehow
>present new challenges that we did not have before on SMPs? Please
>enlighten me.

Moderator hat off.

I would doubt that there are new challenges significantly distinct from
other threading; it remains to be shown otherwise.
Your "can't we?" question can also be turned around by substituting
message passing: the more loosely coupled crowd would say that's enough.

%A Marc Mezard
%T Passing Messages Between Disciplines
%J Science
%V 301
%N 5640
%D 19 September 2003
%P 1685-1686
%K physics/computer science (AI) perspectives,
information theory error correction, belief propagation (BP),
discrete optimization satisfiability, statistical physics spin glasses,
%X Nice short paper on the meeting of 2 disciplines.
It glosses over certain global/long distance topics, but quite nice.

The CS community can't even agree on a decent parallel programming
paradigm or model.

And a further step removed is: why can't we just do it all automagically?
Those people are merely sitting back waiting; they can't stand message
passing, and have perhaps been burned by attempts at using threads.

Knuth is in this camp. He challenges anyone to parallelize TeX and get
significant speed up in a reasonably portable way. He noted last April
or May that he wrote it serially. No prize money mentioned.


Anyone else?

--

Chris Thomasson
Feb 14, 2008, 12:05:49 PM
[comp.programming.threads added into discussion]

<hzm...@hotmail.com> wrote in message
news:d7cee607-dd33-4c54...@p69g2000hsa.googlegroups.com...

Are you trying to achieve very high scalability? What is your exact problem
domain? PThreads will work fine in most cases... However, if you want to
realize outstanding performance and scalability characteristics, then you
might want to take a look at some other types of algorithms:

http://groups.google.com/group/comp.programming.threads/msg/fdc665e616176dce

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/ef3f26bbe1c52e0b

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/1f6e0e7eca04b6d0

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/205dcaed77941352

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/380ee52d7237fe3a

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/96f280d49a63bb9f

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/8329a6ddcb95b8ac

(try and read as much as possible...)


Keep in mind that future multi-core designs are going to revolve around
NUMA, which means heavily distributed multi-threaded programming
techniques are essential...:

http://groups.google.com/group/comp.programming.threads/msg/4230a9d3cbc20ea9

http://groups.google.com/group/comp.arch/msg/a7501814f19731cd

http://groups.google.com/group/comp.arch/msg/ac41b1c179f0e7ca


Designing high-speed distributed algorithms is VERY difficult if you don't
know _exactly_ what you're doing...

Any questions/comments?

--

tim
Feb 14, 2008, 12:10:42 PM
In article <d7cee607-dd33-4c54...@p69g2000hsa.googlegroups.com>,
<hzm...@hotmail.com> wrote:
>>I read a number of people saying few know how to program on a
>>multicore computer. They did not elaborate, and to me this has become
>>rhetoric. We have had symmetric multiprocessors and pthreads for a
>>long time. Can't we just use pthreads, or maybe some other things
>>too, to program on a multicore computer? Or do multicores somehow
>>present new challenges that we did not have before on SMPs? Please
>>enlighten me.

Some programs are relatively easy to split into massively parallel
threads. A good example is the generation of 3D graphics. Even so, this is
regarded as a difficult area to program, and dropout rates in at least one
graphics programming course I am aware of are very high as a result of the
technical difficulties. To get good performance all sorts of arcane matters
need to be taken into account. Have a look at any of the guides to 3D
graphics programming on the ATI or Nvidia sites for examples.

There is also a big move to exploit the graphics chips for computation. In
some cases this is very successful, but often at a high cost. Algorithms
may have to be redone from scratch. For example, traditional sort methods
do not parallelize well, and you may need something like bitonic sort
(sketched below).
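
For concreteness, a minimal serial C sketch of the bitonic network
(illustrative only; n must be a power of two, and a GPU version would run
each compare-exchange stage of the merge as one data-parallel pass):

/* Compare and, if needed, exchange a[i] and a[j]; `up` selects the
   direction of the ordering. */
static void cmpxchg(int a[], int i, int j, int up)
{
    if ((a[i] > a[j]) == up) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}

/* Merge the bitonic sequence a[lo..lo+n) into sorted order. */
static void bitonic_merge(int a[], int lo, int n, int up)
{
    if (n > 1) {
        int m = n / 2;
        for (int i = lo; i < lo + m; i++)
            cmpxchg(a, i, i + m, up);
        bitonic_merge(a, lo, m, up);
        bitonic_merge(a, lo + m, m, up);
    }
}

/* Sort a[lo..lo+n), n a power of two.  The two recursive sorts, and the
   compare-exchanges within each merge stage, are independent of each
   other, which is what makes the network friendly to GPUs. */
void bitonic_sort(int a[], int lo, int n, int up)
{
    if (n > 1) {
        int m = n / 2;
        bitonic_sort(a, lo, m, 1);        /* ascending half */
        bitonic_sort(a, lo + m, m, 0);    /* descending half */
        bitonic_merge(a, lo, n, up);
    }
}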

Transactional systems such as many web sites can often be split into
threads in a way that is transparent to the programmer. As a result,
transactional programming is not that hard, but you need support from
experts when things go wrong. Suddenly your application starts running
slowly. Why? It could be database contention or many other things. Maybe
you have memory cache line clashes.

Sometimes you can spin off a thread to do some computationally intensive
work while the foreground thread looks after the ordinary work. I did this
for a wav2midi program I was trying to get going. Even in this simple
case, it was annoyingly tedious to get the synchronizing of the threads
working correctly. And because the problems are timing related, they do
not always recur on schedule. So it is hard to test these sorts of
applications. Debugging a large and seriously parallel program can quickly
develop a nightmarish quality.
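
The skeleton itself is small; the details around it are what bite. A
sketch of that spin-off-and-wait shape in pthreads, with invented
wav2midi-flavoured names:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  done_cv = PTHREAD_COND_INITIALIZER;
static int    done = 0;
static double result;

static void *analyze(void *arg)    /* the computationally intensive part */
{
    (void)arg;
    double r = 0.0;
    /* ... long-running pitch analysis would go here ... */
    pthread_mutex_lock(&lock);
    result = r;                    /* publish the result under the lock */
    done = 1;
    pthread_cond_signal(&done_cv);
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t worker;
    pthread_create(&worker, NULL, analyze, NULL);

    /* ... the foreground thread does the ordinary work here ... */

    pthread_mutex_lock(&lock);     /* wait properly instead of spinning */
    while (!done)
        pthread_cond_wait(&done_cv, &lock);
    pthread_mutex_unlock(&lock);
    pthread_join(worker, NULL);

    printf("result = %f\n", result);
    return 0;
}

The while loop around pthread_cond_wait matters: condition waits can wake
spuriously, so the flag must be re-tested, and forgetting that is exactly
the kind of timing-dependent bug that does not recur on schedule.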

Operating systems and application servers and databases have had to deal
with these issues for some time. Have a look at the Linux code to deal
with multiple CPUs. It is quite complex.

I'm writing a compiler and I found I can spread the load across multiple
CPUs but to do this required several days of work rewriting parts of the
code. I had to ensure each stage could process its work incrementally,
passing incremental outputs as soon as they were available. I had to put
locks around common data items (like global error counts).
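
For instance, the error-count lock comes down to a few lines of pthreads
(names invented for illustration):

#include <pthread.h>

static pthread_mutex_t err_lock = PTHREAD_MUTEX_INITIALIZER;
static int error_count = 0;   /* shared by every compiler stage */

/* Without the lock, two stages incrementing at once can lose a count:
   both read N, both write back N + 1. */
void report_error(void)
{
    pthread_mutex_lock(&err_lock);
    error_count++;
    pthread_mutex_unlock(&err_lock);
}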

Sometimes the CPU or compiler can parallelize work for you. Modern CPUs do
this automatically, but the limits are quite modest. A lot of hardware is
needed to ensure nothing interferes when it shouldn't. Compilers also can
parallelize work, with varying degrees of success. Gcc does this, which in
part explains why GCC compiles take many more processing cycles than they
used to.

In many cases there is no simple and easy way to split a program into
parallel components in a way that will work correctly and will perform well.

Part of the problem is that programming languages lack good, standardised
support for parallel programming constructs. Another challenge is that the
degree of parallelism required is rapidly increasing.

It all gets harder again when you have non-uniform access to memory.
Search for NUMA for information about this, or look at the PS3 chip
performance documentation.

Tim Josling
au.org.melbpc@tej words backwards.

--

Chris Thomasson
Feb 14, 2008, 12:20:39 PM
"tim" <Ti...@internet.com> wrote in message
news:13r81lc...@corp.supernews.com...

> In article <d7cee607-dd33-4c54...@p69g2000hsa.googlegroups.com>,
>> <hzm...@hotmail.com> wrote:
>>>I read a number of people saying few know how to program on a
>>>multicore computer. They did not elaborate, and to me this has become
>>>rhetoric. We have had symmetric multiprocessors and pthreads for a
>>>long time. Can't we just use pthreads, or maybe some other things
>>>too, to program on a multicore computer? Or do multicores somehow
>>>present new challenges that we did not have before on SMPs? Please
>>>enlighten me.
>
> Some programs are relatively easy to split into massively parallel
> threads. A good example is the generation of 3D graphics. Even so, this is
> regarded as a difficult area to program, and dropout rates in at least one
> graphics programming course I am aware of are very high as a result of the
> technical difficulties. To get good performance all sorts of arcane
> matters need to be taken into account. Have a look at any of the guides
> to 3D graphics programming on the ATI or Nvidia sites for examples.
>
> There is also a big move to exploit the graphics chips for computation.
[...]

Indeed:

http://groups.google.com/group/comp.programming.threads/search?hl=en&group=comp.programming.threads&q=%22Chris+Thomasson%22+GPU


The GPU can crunch arrays better than the CPU... Turn the array into a
texture and tell the GPU to render it in special ways...

--

David Schwartz
Feb 14, 2008, 4:57:23 PM
On Feb 14, 9:05 am, "Chris Thomasson" <cris...@comcast.net> wrote:
> Any questions/comments?

When you hear someone say "multicore programming is hard", 99 times
out of 100 what they really mean is that multi-threading is hard.
This is both true and false.

If you are already familiar with multi-thread programming on SMP
machines, there is almost nothing new you need to learn to program for
multi-core processors. Even if you are writing operating systems, the
differences (caused by things like shared caches) are minuscule.

So if you've been writing high-end applications for a decade or so
that run on SMP machines, you're in great shape. Desktop machines are
just starting to look the way servers always have.

On the other hand, if you never bothered to learn multi-threading
because you were writing mass-market programs and very few of the
machines they ran on had more than one CPU ...

DS

--

Szabolcs Ferenczi
Feb 14, 2008, 5:59:23 PM
On Feb 14, 6:05 pm, "Chris Thomasson" <cris...@comcast.net> wrote:
> Any questions/comments?

Yes, concurrent programming is hard.

The situation is similar to the relation between classical physics and
quantum physics. If you have learnt the basic principles of classical
physics, you will be in trouble in the indeterministic world of
quantum physics.

Similarly, if you are very much adapted to sequential/deterministic
programming concepts, you cannot easily adapt to the non-deterministic
nature of concurrent programs.

It is well illustrated by the inadequate program fragments that
newcomers publish in this forum. They proudly announce that they have
come up with a super-fast queue, but the get operation returns false
when the queue is empty. Thus, they are not aware of the non-
deterministic nature of concurrent programs. Q.E.D.
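
For concreteness, a minimal pthread sketch of the difference (the queue
layout and names are invented for illustration): a get written with
concurrency in mind treats "empty" as a transient state and waits for
the condition to become true, instead of returning false:

#include <pthread.h>

#define QCAP 64

typedef struct {
    pthread_mutex_t lock;      /* init with pthread_mutex_init() */
    pthread_cond_t  nonempty;  /* init with pthread_cond_init() */
    int buf[QCAP];
    int head, count;
} queue_t;

/* The concurrent-minded get: "empty" is transient, so block on it. */
int queue_get(queue_t *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)                  /* guard re-tested on wakeup */
        pthread_cond_wait(&q->nonempty, &q->lock);
    int v = q->buf[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return v;
}

void queue_put(queue_t *q, int v)
{
    pthread_mutex_lock(&q->lock);
    /* (a full version would likewise block while count == QCAP) */
    q->buf[(q->head + q->count) % QCAP] = v;
    q->count++;
    pthread_cond_signal(&q->nonempty);     /* wake one blocked getter */
    pthread_mutex_unlock(&q->lock);
}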

Best Regards,
Szabolcs

--

Dmitriy V'jukov
Feb 14, 2008, 8:49:06 PM
[Your PST c.p. spam filtering moderator is going home for the evening.
Post to be sent out tomorrow; I have V-day commitments.]

On 15 Feb, 01:59, Szabolcs Ferenczi <szabolcs.feren...@gmail.com> wrote:

> It is well illustrated by the inadequate program fragments that
> newcomers publish in this forum. They proudly announce that they have
> come up with a super-fast queue, but the get operation returns false
> when the queue is empty. Thus, they are not aware of the non-
> deterministic nature of concurrent programs. Q.E.D.

Citation, please. Provide some links to 'non-deterministic aware'
algorithms.
What must they return? Trulse? Falue?

Dmitriy V'jukov

--

Eugene Miya
Feb 15, 2008, 5:50:10 PM
The c.p. moderator has to run a field experiment out in Death Valley for
the long weekend. I may have some wireless access, but don't count on it
(like a Starbucks, or if I swing past JPL or Caltech).
I will for certain be back online Wednesday. Get posts to me before
4:30 PST (just under 2 hrs.) or you will have to wait until Wed., or
trim the Newsgroups down to c.p.t. (losing the c.p.).

--

Phil Hobbs
Feb 21, 2008, 5:50:57 PM

Part of the difficulty is that performance on different classes of
problem is so dependent on the degree of connection among processors and
between them and memory. To achieve flat or nearly uniform memory
access, as in a classical SMP, the amount of cache-consistency traffic
grows astronomically with the number of processors and the size of the
caches, and so does its cost and power consumption. Since
single-processor performance requires lots of cache, there's a lot of
pressure to make machines less and less SMP-like. Highly multicore
machines will be doing several more or less unrelated things at once,
and so most of the cache consistency traffic will be meaningless. It's
like the old saw about advertising: "I know half our advertising budget
is wasted, I just don't know which half." It's hard to justify that
much extra hardware when most of it is doing nothing useful, most of the time.

Getting good performance out of a NUMA-type machine requires pretty deep
knowledge of how the machine is organized, and even of what else is
going to be running. That's intrinsically much more complicated and
machine-specific than any uniprocessor or small-SMP program, and it gets
harder and harder as the number of cores increases, which progressively
degrades the interconnection bandwidth and latency.
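
Even the first step, keeping a thread near the memory it touches,
already drags machine-specific detail into the code; a Linux-only
sketch using the GNU extension pthread_setaffinity_np (not portable):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to one core, so that its cache contents and
   (under a first-touch allocation policy) the pages it initializes
   stay on the local node. */
static int pin_to_core(int core)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof set, &set);
}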

That's a significant part of the reason the programming model remains
broken, IMO.

Cheers,

Phil Hobbs

--

hzm...@hotmail.com
Feb 29, 2008, 1:39:35 PM

I just read the article "Multicore puts screws to parallel-programming
models"
(http://www.eetimes.com/showArticle.jhtml;jsessionid=3BL0Y4BPOCN4WQSNDLRSKH0CJUNN2JVN?articleID=206504466).
The first sentence reads "Leaders in mainstream computing are
intensifying efforts to find a parallel-programming model to feed the
multicore processors already on chip makers' drawing boards." It somehow
implies that the pthreads model is not good enough, but it does not
mention pthreads at all. Therefore, the question does not seem to be
whether multithreaded programming is hard at all. It appears that either
the author and the "leaders in mainstream computing" are ignorant of
pthreads, or they do not think pthreads or any other existing
parallel-programming model, such as OpenMP, HPF, ..., is good enough, so
they need to find a new one. Any comments?

--

Eugene Miya
Feb 29, 2008, 6:57:42 PM
In article <1aa0c6d9-20b6-4554...@s13g2000prd.googlegroups.com>,

Well, this isn't anything really new. It's been around for a few years;
the key acronym to search for is RAMP.

Videos exist of Dave talking about this.

Now I'm not certain what you want in the way of comment.
The first key word is "mainstream": read this as clueless. Really.
If you really want to go back, the guy who would be most steamed about
this, because he's been working on it for decades, is Dave Kuck.
And he now works for Intel.

Now the author is just some reporter. He might or might not know about
the issues of threading. He might or might not know the players:
it simply depends whom Merritt studied under. Squires, when he ran DARPA
in the 90s, claimed something like 1 in 3 programmers was capable of
parallel programming. Whether parallel programming is hard is a
laughable debate which has gone on for decades. Pick your side.
One can match comment for comment on ease or difficulty. So threading or
message-passing is kind of irrelevant. I would tend to side with the
"hard" side, because if it weren't it wouldn't likely be such an issue.
You merely end up adding layers and layers of context and adjective
qualifications (oh, it's no harder than systems programming; oh...).

What I can tell you from the non-mainstream is they are looking for
performance gains of say 50-100 times. Consistently. Give them 1,000
times if you can. And no one can. Not for the amounts of money being
offered. That's independent of ease of use.

When Dave was President of the ACM, he called for the psychological and
sociological study of programmers (the best guys). No one took him
up on it. No one would pay for it, and the eccentricities of a lot of
guys make some feel uncomfortable. This is some of the RAMP back story.

The good part of the article is Bill Dally's comment about not wanting
to get locked into a bad paradigm early. They want to avoid what
happened to Fortran. The problem this creates is a deadlock that keeps
people from doing real work. It creates academic employment.

--

hzm...@hotmail.com
Mar 2, 2008, 10:56:28 AM

If I carefully studied all the research mentioned, I might be able
to find my answer. But forgive my laziness. Let me go back to one of
my original questions: How are multicores different from
multiprocessors (i.e. sockets)? Didn't all these problems these
researchers are trying to solve now already exist when multiprocessors
emerged many years ago? Are these researchers re-discovering these
problems and perhaps re-inventing the solutions (or the solutions had
never been found)?
--

hzm...@hotmail.com
Mar 2, 2008, 10:57:44 AM

On Feb 29, 3:57 pm, Eugene Miya wrote:
> The good part of the article is Bill Dally's comment about not wanting
> to get locked into a bad paradigm early.  They want to avoid what
> happened to Fortran.  The problem this creates is a deadlock that keeps
> people from doing real work.  It creates academic employment.
I am a novice in this field. Could you share with me what you mean by
"what happened to Fortran"? What happened?
About getting locked in a paradigm, well, I guess Dally might be over-
worried. Multiprocessors have been there for a long time, and if we
have not already locked in a particular paradigm, it is unlikely that
we would soon lock in a paradigm, good or bad, for multicores. People
will continue to develop all kinds of parallel paradigms. In fact,
I read a paper the other day that says there are already too many
parallel programming models. On the other hand, if we already locked
in a paradigm for multiprocessors, that paradigm would very likely be
the paradigm that we will use for multicores (unless multicores are
fundamentally different from multiprocessors, then it goes back to my
original question: what is that fundamental difference?) - and we are
already locked in.

--

tim
Mar 3, 2008, 11:50:28 AM

[Moderator: Tim, I edited your post to be slightly clearer.]

On Sun, 02 Mar 2008 07:57:44 -0800, hzmonte wrote:
> I read a paper the other day that says there are already too many
> parallel programming models.

"When there are many solutions, that's because none of them are any good".

To,

[Moderator: This part here after the comma...]

--

Eugene Miya
Mar 3, 2008, 12:04:57 PM
In article <57925c46-74a3-412a...@e10g2000prf.googlegroups.com>,
<hzm...@hotmail.com> wrote:
>On Feb 29, 3:57 pm, Eugene Miya wrote:
>> bad paradigm early.
>> Fortran.

>I am a novice in this field. Could you share with me what you mean by
>"what happened to Fortran"? What happened?

Well, Fortran was the best early attempt at standardizing programming
languages. Outside academia (as well as inside, but outside CS depts.),
millions of lines (if not by now billions) are invested in that knowledge.

Failure to consider that has already sunk dozens of companies and
projects. A long history of syntactic and semantic problems and
difficulties exists with the existing language and the installed base.

Change itself is part of the problem. No one really wants to change. And
at the same time it's not all Fortran's problem.

>About getting locked in a paradigm, well, I guess Dally might be over-
>worried.

I would not be so certain about that.

>Multiprocessors have been there for a long time, and if we
>have not already locked in a particular paradigm, it is unlikely that
>we would soon lock in a paradigm, good or bad, for multicores. People
>will continue to develop all kinds of parallel paradigms. In fact,
>I read a paper the other day that says there are already too many
>parallel programming models.

Which paper?

Oh there probably are too many.

>On the other hand, if we already locked
>in a paradigm for multiprocessors, that paradigm would very likely be
>the paradigm that we will use for multicores (unless multicores are
>fundamentally different from multiprocessors, then it goes back to my
>original question: what is that fundamental difference?) - and we are
>already locked in.

MPs have been around but this is the first real commodity cycle where
processors are too cheap to meter. I would not claim that we've had
multiprocessors that long.

I think it's too early to lock in any paradigm or make any such claim.
People worry too much about hardware. We don't even have adequate
programming languages of any kind.

--

Szabolcs Ferenczi
Mar 5, 2008, 11:16:49 AM
On Mar 2, 4:56 pm, hzmo...@hotmail.com wrote:
> ... How are multicores different from
> multiprocessors (i.e. sockets)?  Didn't all these problems these
> researchers are trying to solve now already exist when multiprocessors
> emerged many years ago?  Are these researchers re-discovering these
> problems and perhaps re-inventing the solutions (or the solutions had
> never been found)?

I think multicores are not any different from (shared memory)
multiprocessors from the point of view of the programming model. On
the other hand, multicores are quite different from the parallel
computers such as vector processors.

Once upon a time (1966), a guy named Flynn made a classification of
computers with respect to the instruction and data streams: multicores
belong to the Multiple Instruction Multiple Data (MIMD) stream class
while the vector processors belong to the Single Instruction Multiple
Data (SIMD) stream class.

However, the MIMD class is further subdivided into two subclasses
according to the assignments of memory segments to processing units.
If different processing units share common memory segments,
traditionally we can speak of tightly coupled machines or
multiprocessors, while if processing units have their private memory
segments, we can speak of loosely coupled machines or multicomputers.

Now, from the programming point of view, there emerged two basic
paradigms: (1) the shared resource based concurrent programs, and
(2) message based concurrent programs. What we call multi-threading today,
belongs to the first class. You have a correct guess that in early
days of the programming of shared memory multiprocessors there have
been attempts to come up with language level solutions such as
Critical Regions (CR), Conditional Critical Regions (CCR), and
Monitors. Unfortunately, none of these language concepts are taken
over properly in today's multi-threaded programming repertoire.

Probably the researchers of multicore programming should do nothing
else but carefully study and properly adapt what has been already
discovered in the programming of shared memory multiprocessors.

Best Regards,
Szabolcs

--

Chris Thomasson
Mar 6, 2008, 1:04:56 PM

[comp.programming.threads added...]
[C.P. Moderator: OK.]

"Szabolcs Ferenczi" <szabolcs...@gmail.com> wrote in message
news:3c6d1366-fe78-4f67...@e10g2000prf.googlegroups.com...


> On Mar 2, 4:56 pm, hzmo...@hotmail.com wrote:
>> ... How are multicores different from
>> multiprocessors (i.e. sockets)? Didn't all these problems these
>> researchers are trying to solve now already exist when multiprocessors
>> emerged many years ago? Are these researchers re-discovering these
>> problems and perhaps re-inventing the solutions (or the solutions had
>> never been found)?
>
> I think multicores are not any different from (shared memory)
> multiprocessors from the point of view of the programming model. On
> the other hand, multicores are quite different from the parallel
> computers such as vector processors.

Multi-Core systems are new in the sense that there are now multiple
processors on a single chip. Inter-processor communication is cheaper,
making multi-threading more efficient. If you want something like vector
processors, well, try some general-purpose GPU programming:

http://groups.google.com/group/comp.parallel/msg/7869ec858cfe7883


[...]


> Now, from the programming point of view, there emerged two basic
> paradigms: (1) the shared resource based concurrent programs, and
> (2) message based concurrent programs.

How do you account for the fact that nearly all of the existing
message-based languages are implemented with shared-memory multi-threading?


> What we call multi-threading today, belongs to the first class.

Humm, if I understand you correctly, that's not entirely true. For
instance, you still use multi-threading techniques all over the place on
the Cell processor. I would say that the Cell is a hybrid of
shared-memory and DMA-channel communications. Each SPU has a main thread
which uses a DMA-channel interface for communication with the other SPUs
or the main PPC(s), where there can be multiple threads:

http://groups.google.com/group/comp.arch/msg/198a048b98fabe45

http://groups.google.com/group/comp.arch/browse_frm/thread/93b1619857...

They have memory barriers and LL/SC synchronization; they even have LL/SC
hooked up to the DMA channel interface; very cool!:

http://groups.google.com/group/comp.arch/msg/56b9fcc296e8564a

What do you think of the Cell architecture?


> You have a correct guess that in early
> days of the programming of shared memory multiprocessors there have
> been attempts to come up with language level solutions such as
> Critical Regions (CR), Conditional Critical Regions (CCR), and
> Monitors. Unfortunately, none of these language concepts are taken
> over properly in today's multi-threaded programming repertoire.

Nonsense! How do you explain POSIX Threads? Of course, we have been over
this:

http://groups.google.com/group/comp.programming.threads/msg/ed695d0cb48fc551

http://groups.google.com/group/comp.programming.threads/msg/7cfbb24f5e6d5a3f


> Probably the researchers of multicore programming should do nothing
> else but carefully study and properly adapt what has been already
> discovered in the programming of shared memory multiprocessors.

Tell that to SUN research:

http://research.sun.com/projects/dashboard.php?id=29


They are into developing new lock/wait-free algorithms... However, quite a
few of these algorithms originated in the 80s, after CAS was invented...
Anyway, why should I not try to invent new non-blocking synchronization
algorithms?

--

Szabolcs Ferenczi
Mar 6, 2008, 5:02:16 PM

On Mar 6, 7:04 pm, "Chris Thomasson" <cris...@comcast.net> wrote:
> "Szabolcs Ferenczi" <szabolcs.feren...@gmail.com> wrote in message

> news:3c6d1366-fe78-4f67...@e10g2000prf.googlegroups.com...
> > On Mar 2, 4:56 pm, hzmo...@hotmail.com wrote:
> >> ... How are multicores different from
> >> multiprocessors (i.e. sockets)? Didn't all these problems these
> >> researchers are trying to solve now already exist when multiprocessors
> >> emerged many years ago? Are these researchers re-discovering these
> >> problems and perhaps re-inventing the solutions (or the solutions had
> >> never been found)?
>
> > I think multicores are not any different from (shared memory)
> > multiprocessors from the point of view of the programming model. On
> > the other hand, multicores are quite different from the parallel
> > computers such as vector processors.
>
> Multi-Core systems are new in the sense that there are now multiple
> processors on a single chip. Inter-processor communication is cheaper,
> making multi-threading more efficient.

What you are talking about are qualitative issues only, not
architectural issues. Besides, I was talking about the programming model.

From the point of view of the programming model, I should not even
have used the term memory; instead I should have used address space,
which would have been more appropriate.

> If you want something like vector
> processors,

No, I do not want any vector processors. If I want one, I can get access
to one. I just mentioned them as an example of a non-MIMD architecture.

> ...
> > Now, from the programming point of view, there emerged two basic
> > paradigms: (1) the shared resource based concurrent programs, and
> > (2) message based concurrent programs.
>
> How do you account for the fact that nearly all of the existing
> message-based languages are implemented with shared-memory multi-threading?

I am afraid that, once again, you do not know what you are talking about.
You can implement message-based programs on a single processor as well.
You can even implement multi-threading on a single processor. Oh yes,
your favourite Cell model can also be implemented on a sequential
computer. So your counter-argument just does not apply. You are
intermixing separate issues again.

> > What we call multi-threading today, belongs to the first class.
>
> Humm, if I understand you correctly;

No, I am afraid you do not.

> ...


> What do you think of the Cell architecture?

I think we are not talking about it in this discussion thread. If you
want to discuss it, please open a discussion thread about it and
dump your valuable thoughts there. You are welcome.

> > You have a correct guess that in early
> > days of the programming of shared memory multiprocessors there have
> > been attempts to come up with language level solutions such as
> > Critical Regions (CR), Conditional Critical Regions (CCR), and
> > Monitors. Unfortunately, none of these language concepts are taken
> > over properly in today's multi-threaded programming repertoire.
>
> Nonsense! How do you explain POSIX Threads?

You are calling something you do not know about "nonsense". On the
other hand, if you do not know POSIX threads either, I am not going to
explain them to you here; just use Google. Google is your friend.
Besides, POSIX threads are not at the language level, and I was talking
about language means. If you do not understand this, please refrain
from calling it "nonsense".

> ...


> > Probably the researchers of multicore programming should do nothing
> > else but carefully study and properly adapt what has been already
> > discovered in the programming of shared memory multiprocessors.
>
> Tell that to SUN research:
>
> http://research.sun.com/projects/dashboard.php?id=29
>
> They are into developing new lock/wait-free algorithms...

Lock-free algorithms are very low-level programming techniques, but
they have no language consequences whatsoever. At least not so far.
You can try and work on the language issues for the lock-free
techniques, of course. Nevertheless, since lock-free techniques are
even more complicated and therefore more error-prone, they make
multicore programming even harder.

Best Regards,
Szabolcs

--

Eugene Miya
Mar 6, 2008, 5:15:17 PM
In article <d800ca6e-33de-4b48...@s37g2000prg.googlegroups.com>,
<hzm...@hotmail.com> wrote:
>my original questions:
>How are multicores different from multiprocessors (i.e. sockets)?

Well, I come from software. A socket means Berkeley-style IPC to me.

>Didn't all these problems these researchers are trying to
>solve now already exist when multiprocessors emerged many years ago?

Well, let me ask you rhetorically:
do you think those researchers solved those problems years ago?

Think about that for a moment. Give yourself context and assumptions.

No cheating. Take the moment.

The answer should be no. Two reasons for this: we overconcentrated on
hardware (the easy part), and we never came up with software transition
aids that didn't have great limitations. The community assumed we
would leave that to programmers. Ha.

The other question is: which multiprocessors? We have no standardized
architecture, and likely won't have one in the foreseeable future, and
you can expect to wallow in that.


>Are these researchers re-discovering these problems and
>perhaps re-inventing the solutions (or the solutions had
>never been found)?

General solutions have likely never been found.
Well, we have had some limited success with vectorization (easy
vectorization; the history of that is somewhat interesting but not seen
by most programmers). I'm not certain I would use the word
"re-discovered." Who do you think these researchers are?
Do you mean compiler people?
Do you realize that the number of institutions in the USA where a person
can go to learn about parallelizing compilers can be counted on one hand,
even missing a couple of fingers?


--

Chris Thomasson
Mar 6, 2008, 7:49:51 PM

"Szabolcs Ferenczi" <szabolcs...@gmail.com> wrote in message
news:095060e2-43a5-4ab0...@m36g2000hse.googlegroups.com...

>
> On Mar 6, 7:04 pm, "Chris Thomasson" <cris...@comcast.net> wrote:
>> "Szabolcs Ferenczi" <szabolcs.feren...@gmail.com> wrote in message
>> news:3c6d1366-fe78-4f67...@e10g2000prf.googlegroups.com...
>> > On Mar 2, 4:56 pm, hzmo...@hotmail.com wrote:
[...]

>> > Now, from the programming point of view, there emerged two basic
>> > paradigms: (1) the shared resource based concurrent programs, and
>> > (2) message based concurrent programs.
>>
>> How do you account for the fact that nearly all of the existing
>> message-based languages are implemented with shared-memory
>> multi-threading?
>
> I am afraid you do not know again what you are talking about.

I am pointing out that there are more than 2 basic paradigms. You forget
hybrid approaches.


> You can
> implement message-based programs on the single processor as well. You
> can even implement multi-threading on a single processor. Oh yes, you
> favourite cell model can also be implemented on the sequential
> computer. So your counter argument just does not apply. You intermix
> some issues again.


>> > What we call multi-threading today, belongs to the first class.
>>
>> Humm, if I understand you correctly;
>
> No, I am afraid you do not.
>
>> ...
>> What do you think of the Cell architecture?
>
> I think we are not talking about it in this discussion thread. If you
> want to discuss them, please open a discussion thread about it and
> dump your valuable thoughts about it there. You are welcome.

The Cell architecture is a good example of a shared-memory /
message-passing hybrid. You mention those two models as if they were
mutually exclusive. I am trying to tell you that they are not. It seems
like you fail to understand that shared-memory multi-threading and
message-passing can be efficiently combined. It's not one or the other.


>> > You have a correct guess that in early
>> > days of the programming of shared memory multiprocessors there have
>> > been attempts to come up with language level solutions such as
>> > Critical Regions (CR), Conditional Critical Regions (CCR), and
>> > Monitors. Unfortunately, none of these language concepts are taken
>> > over properly in today's multi-threaded programming repertoire.
>>
>> Nonsense! How do you explain POSIX Threads?
>
> You are calling something you do not know about "nonsense". On the
> other hand, if you do not know POSIX threads either, I am not going to
> explain them to you here; just use Google. Google is your friend.
> Besides, POSIX threads are not at the language level, and I was talking
> about language means. If you do not understand this, please refrain
> from calling it "nonsense".

I have to say nonsense again... your assertion that POSIX threads are not
at the language level is erroneous. POSIX Threads is most certainly at the
language level, as it puts restrictions on what a C compiler can do. Please
read here:

http://groups.google.com/group/comp.programming.threads/msg/729f412608a8570d

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/63f6360d939612b3

POSIX Threads and C have a relationship that goes back many years.


>> > Probably the researchers of multicore programming should do nothing
>> > else but carefully study and properly adapt what has been already
>> > discovered in the programming of shared memory multiprocessors.
>>
>> Tell that to SUN research:
>>
>> http://research.sun.com/projects/dashboard.php?id=29
>>
>> They are into developing new lock/wait-free algorithms...
>
> Lock-free algorithms are very low-level programming techniques, but
> they have no language consequences whatsoever. At least not so far.

What do you mean they have no language consequences? I suggest you read this
entire thread:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/29ea516c5581240e


> You can try and work on the language issues for the lock-free
> techniques, of course.

Standard C++ is going to support all the tools you need to create high-end
non-blocking algorithms. Atomic operations and memory-barrier functionality
are going to be defined by the Standard and become an integral part of the
C++ language.

Java has standardized support for lock/wait-free algorithms as well.
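
For a taste of what that machinery enables, here is a minimal sketch of
the classic Treiber-stack push, written with C11's <stdatomic.h> (which
adopted essentially this model); pop is omitted because it additionally
needs ABA protection, which is exactly the kind of subtlety that makes
these algorithms hard:

#include <stdatomic.h>
#include <stdlib.h>

typedef struct node {
    struct node *next;
    int value;
} node_t;

static _Atomic(node_t *) top = NULL;

/* Lock-free push: keep retrying the CAS until our node is installed
   on top of whatever head we last observed. */
void push(int value)
{
    node_t *n = malloc(sizeof *n);
    if (n == NULL)
        return;                /* error handling elided for brevity */
    n->value = value;
    n->next = atomic_load(&top);
    while (!atomic_compare_exchange_weak(&top, &n->next, n))
        ;  /* on failure the current top is written into n->next: retry */
}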


> Nevertheless, since lock-free techniques are
> even more complicated and therefore more error prone ones, they make
> multicore programming even harder.

This kind of sounds like somebody attacking C because it does not check
array boundaries or something. Anyway, I am happy to know that C++ will
allow programmers to create very high-performance multi-threaded
applications under a common Standard.

--

Eugene Miya
Mar 13, 2008, 5:53:41 PM

All these archive links are dead.

--

Chris Thomasson
Mar 14, 2008, 2:59:29 PM

"Eugene Miya" <eug...@cse.ucsc.edu> wrote in message
news:47d99455$1@darkstar...


>>http://groups.google.com/group/comp.arch/browse_frm/thread/93b1619857...
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The link above is the only one which shows up with a [no subject] error.

They take to proper Google Groups pages when I click on them. Humm, weird...
On my end, all but one link you listed is dead; perhaps Google was doing
some maintenance or something...

[Moderator: Possible; even likely.]

Szabolcs Ferenczi
Mar 14, 2008, 3:59:20 PM
On Feb 13, 11:43 pm, hzmo...@hotmail.com wrote:
> I read a number of people saying few know how to program on a
> multicore computer.  They did not elaborate, and to me this has become
> rhetoric.  We have had symmetric multiprocessors and pthreads for a
> long time.

First of all, let me stress that multicore programming is hard.
It is hard because it is basically concurrent programming.
Concurrent programming requires a kind of thinking quite different from
sequential programming, and it is hard to get rid of old habits. We can
see the symptoms when people try to write concurrent programs with a
mind imprinted with sequential/deterministic programming habits. Those
attempts result in over-controlling the interactions, thus making the
system inefficient.

Let me note also that, from the point of view of scaling, shared memory
multiprocessors (called multicores nowadays) are not as good as
distributed memory multiprocessors. From the programming point of
view, on the other hand, programming with shared memory is easier,
since the data structures are somewhat closer to the ones in
sequential programs. Only access conflicts must be avoided, which is
not easy, though.

From the programming point of view, the key feature is whether a
programming language has any statement suitable for handling non-
deterministic events. Practically, this means that the Guarded Commands
(GC) of Dijkstra must be adapted. This holds irrespective of whether
communication happens via message exchange or via shared resources.
Dijkstra's GC has already been adapted to message-based language
concepts, since it is one of the basic elements in the Communicating
Sequential Processes (CSP) language proposal of Hoare. Brinch Hansen,
on the other hand, has shown in the Edison language how to adapt GC to
shared-resource communication.

In the Edison language it is like this:

when B1 do SL1
else B2 do SL2
else Bn do SLn
end

When a process arrives at this statement, it is delayed until one
of the conditions B1 .. Bn is true. When one is true, the
corresponding statement is performed; if several are true, one is
selected and the corresponding statement is performed. Note that the
`when' statement in Edison is a Conditional Critical Region with
multiple conditions. If you want a simple Critical Region you just
specify the condition `true': when true do S.
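
The `when' statement has a direct, if verbose, rendering in today's
thread libraries. A minimal pthread sketch of `when B do S', where B()
and S() are hypothetical placeholders for a guard over the shared state
and the region's body:

#include <pthread.h>

extern int  B(void);   /* hypothetical guard over the shared state */
extern void S(void);   /* hypothetical body of the critical region */

static pthread_mutex_t lock    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  changed = PTHREAD_COND_INITIALIZER;

/* Emulating Edison's "when B do S": B() must read only state guarded
   by `lock`, and every writer of that state must broadcast on
   `changed` so that delayed threads re-evaluate their guards. */
void when_B_do_S(void)
{
    pthread_mutex_lock(&lock);
    while (!B())                       /* delayed until the guard holds */
        pthread_cond_wait(&changed, &lock);
    S();                               /* the region's body */
    pthread_cond_broadcast(&changed);  /* state changed: re-test guards */
    pthread_mutex_unlock(&lock);
}

The multi-guard form (else B2 do SL2 ...) has the same shape, with a
chain of if/else over the guards inside the loop.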

Multicore programming is especially hard because one observation from
Dijkstra is often ignored by the novice. Anyone writing concurrent
programs should always keep in mind that in a uniprogramming
environment we have "once all false, always all false", which
justifies the use of deterministic selection statements such as
`if' in sequential programs. In a multiprogramming environment,
inspecting the state of the shared variables or communication channels
should not result in the thread going on and "doing something else"
when "all false". Rather, the thread must wait until one of the
conditions becomes true and then take the corresponding action.

> Can't we just use pthreads, or maybe some other things
> too, to program on a multicore computer?  Or do multicores somehow
> present new challenges that we did not have before on SMPs?
> Please enlighten me.

Working with library-level calls such as the pthread library is almost
as low-level and error-prone in concurrent programming as programming
in assembly language is in sequential programming.

It is necessary that high-level programming languages incorporate new
constructions such as Guarded Commands and Conditional Critical
Regions, so that programmers are able to think at a higher level
of abstraction. Compilers will then also have a chance to filter out
most of the programming errors that are made nowadays just because of
the low level of programming (e.g. pthread library calls).

The other low-level feature in the pthread-like libraries is forking.
In high-level languages the main tool for process creation should be a
structured form of parallel statement. There are some attempts towards
this, but (of course) it is not called a parallel statement but rather
something like `fork/join'.

Best Regards,
Szabolcs

--

Chris Thomasson
Mar 17, 2008, 1:48:20 PM

"Chris Thomasson" <cri...@comcast.net> wrote in message
news:7q2dnbmwifeeMETa...@comcast.com...

>
> "Eugene Miya" <eug...@cse.ucsc.edu> wrote in message
> news:47d99455$1@darkstar...
[...]

>> ...
>>>http://groups.google.com/group/comp.arch/msg/56b9fcc296e8564a
>> ...
>>>> Critical Regions (CR), Conditional Critical Regions (CCR), and
>> ...
>>>http://groups.google.com/group/comp.programming.threads/msg/ed695d0cb48fc551
>>>
>>>http://groups.google.com/group/comp.programming.threads/msg/7cfbb24f5e6d5a3f
>>
>>
>> All these archive links are dead.
>
> They take to proper Google Groups pages when I click on them. Humm,
> weird...
> On my end, all but one link you listed is dead; perhaps Google was doing
> some maintenance or something...

Crap!! Well, I meant to say:

All but _one_ link takes me to proper Google Groups pages when I click on
them. Humm, weird... On my end, only _one_ link you listed is dead;
perhaps Google was doing some maintenance or something...

> [Moderator: Possible; even likely.]

I think I cut a couple of characters off the dead one (e.g., no subject).

Sorry for the major typo!

Jesper Larsson Traff
Mar 19, 2008, 1:12:50 PM

Just for information, some of the issues that have come
up in this thread will be discussed at the IPDPS 2008 Panel:

"How to avoid making the same Mistakes all over again:
What the parallel-processing Community has (failed) to offer
the multi/many-core Generation"

Panelists

Hideharu Amano, Keio University
Anwar Ghuloum, Intel
John Gustafson, ClearSpeed
Keshav Pingali, University of Texas at Austin
Vivek Sarkar, Rice University, Texas
Uzi Vishkin, University of Maryland
Kathy Yelick, University of California, Berkeley

best regards

Jesper Larsson Traff, IPDPS 2008 Panel moderator
NEC Laboratories Europe, St. Augustin, Germany

Szabolcs Ferenczi
Apr 19, 2008, 4:20:50 PM

On Mar 19, 7:12 pm, Jesper Larsson Traff <tr...@ccrl-nece.de> wrote:
> Just for information, some of the issues that have come
> up in this thread will be discussed at the IPDPS 2008 Panel:
>
> "How to avoid making the same Mistakes all over again:
> What the parallel-processing Community has (failed) to offer
> the multi/many-core Generation"
>
> Panelists
>
> Hideharu Amano, Keio University
> Anwar Ghuloum, Intel
> John Gustafson, ClearSpeed
> Keshav Pingali, University of Texas at Austin
> Vivek Sarkar, Rice University, Texas
> Uzi Vishkin, University of Maryland
> Kathy Yelick, University of California, Berkeley
>
> best regards
>
> Jesper Larsson Traff, IPDPS 2008 Panel moderator
> NEC Laboratories Europe, St. Augustin, Germany

Can you summarise the discussion for us, please?

Best Regards,
Szabolcs

--
