wf2 post-mortem

Carlos Bueno

Jul 20, 2008, 5:47:06 PM
to wide-finder
Ok, you've convinced me. Everyday multicore programming is in a very
sorry state.

I was less concerned with tweaking regexps or whatnot than trying to
multiply the known throughput of one logrep process by N CPUs, without
drastically changing the shape of the program. So I went into it
deliberately naive. I'm not an "average" programmer but I'm close to
the median. All of the approaches I came up with foundered after 2 or
4 CPUs. I then peeked at the other entrants for ideas:

* spawn children instead of threads in Python
* mmap the file
* read the file in chunks
* instead of reading & passing data, develop a signal interface to
tell children what chunks to read next
* etc...
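
For illustration, the child-process, chunking, and "tell children what to read" ideas from the list above might be combined roughly as follows. This is a minimal sketch, not any entrant's actual code: the function names, the regex, and the chunk count are all assumptions made for the example, and it uses plain seek/read rather than mmap. The parent passes only (path, start, end) tuples to the children, so no log data crosses the process boundary.

```python
import os
import re
from collections import Counter
from multiprocessing import Pool

# Hypothetical pattern: pull the requested path out of each log line.
PATTERN = re.compile(rb"GET (\S+) ")


def chunk_ranges(path, n_chunks):
    """Split the file into byte ranges that each end on a line boundary."""
    size = os.path.getsize(path)
    step = max(1, size // n_chunks)
    ranges, start = [], 0
    with open(path, "rb") as f:
        while start < size:
            f.seek(min(start + step, size))
            f.readline()  # move forward to the end of the current line
            end = f.tell()
            ranges.append((path, start, end))
            start = end
    return ranges


def count_chunk(args):
    """Runs in a child: open the file and read only the assigned range."""
    path, start, end = args
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read(end - start)
    return Counter(PATTERN.findall(data))


def parallel_count(path, workers=4):
    """Hand byte ranges (not data) to child processes and merge the counts."""
    total = Counter()
    with Pool(workers) as pool:
        jobs = chunk_ranges(path, workers * 4)
        for partial in pool.imap_unordered(count_chunk, jobs):
            total.update(partial)
    return total
```

A call such as parallel_count("access.log", workers=8) would need to sit under an `if __name__ == "__main__":` guard on platforms that start children by re-importing the main module. Even this small version has to deal with line boundaries and platform differences, which is the kind of overhead described here.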

Each new technique required a lot of reading, learning, testing,
platform-specific bugs, etc. That sucked. Python's GIL sucks. Even if
there were no Big Giant Lock in the way, Python threading & queuing
sucks. All of that tricky work just to end up where most everyone else
did: with a big pile of confusing code.

Multicore programming is hard. But a lot of things are hard to
understand, like process scheduling. I think the real problem is that
both good and bad solutions look messy, which means either that these
concepts are beyond the ken of mere mortals, or (more likely) that
current languages are not equipped to express them well.

So where does that leave me, Joe Median Programmer? I'm not, by
definition, qualified to add good support for concurrency to my
favorite programming languages. Am I just condemned to wait?

Yuri Schimke

Jul 21, 2008, 5:54:12 AM
to wide-finder
I work on conceptually similar problems in Investment Banking using
Java In Memory caching products (Coherence, GigaSpaces etc). These
are very simple problems complicated by moderately large datasets
shared across a cluster (e.g. 20 VMs). I think these problems are
relatively easy to solve within a specific domain. At least
abstractions can make programmers largely unaware of the heavy lifting
underneath, i.e. one person/group writes the framework and many others
leverage it.

Solutions like Google App Engine look like interesting solutions to a
broader class of problems at the expense of fundamental changes in
approaches that perform well.

But I don't think there is necessarily a magic bullet to make
multicore processing as simple as a basic procedural program. The
best Java really achieved for multi-threaded programming was getting
programmers to stop writing programs using Thread, volatile and
synchronized, and instead using the concurrent library.