I want to know which one is going to be best for parallel processing
on current and near-future CPUs.
I read that Go's goroutines are more lightweight than a
thread; Scala also claims its threading is more lightweight.
http://java.dzone.com/articles/scala-threadless-concurrent
I read here "shared memory threads are quite heavyweight and incur
severe performance penalties from context switching overheads.". I
remember when I was programming a C++ server in the past, about 8
years ago, we had 1 thread per client, of course it started having
overhead of context switching so we had to drop that idea and go for a
different model.
Also it says in that link: "Hence Scala actors have been designed as
lightweight event objects, which get scheduled and executed on an
underlying worker thread pool that gets automatically resized when all
threads block on long running operations". This seems similar to
Go?
Regardless, can someone compare and contrast the differences and
benefits of each?
Assuming I am trying to write some software which works well on 8-core
CPUs in the future, say 2 or N years from now, should I be using
Scala, or should I be using Go? I know this is only one consideration and
that many other factors weigh in the decision, but for discussion I
feel it's best to limit this to the threading topic.
Thanks! Phil
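To make the "lightweight" claim concrete on the Go side, here is a minimal sketch of many goroutines being multiplexed onto a small number of OS threads. The helper name spawnAndSum and the count of 10000 are made up for illustration; this only shows that starting that many goroutines is routine, it is not a benchmark.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// spawnAndSum (hypothetical helper) starts n goroutines, each adding its
// index to a shared total, and waits for all of them to finish.
func spawnAndSum(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			mu.Lock()
			total += v
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return total
}

func main() {
	const n = 10000
	// GOMAXPROCS(0) only queries the limit on threads running Go code.
	fmt.Println("threads allowed to run Go code:", runtime.GOMAXPROCS(0))
	fmt.Println("sum computed by", n, "goroutines:", spawnAndSum(n))
}
```

The goroutines here far outnumber the threads the runtime uses to run them, which is the same general idea as actors scheduled on a worker pool.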
You can get a pretty good idea of what Erlang and Scala look like when
doing (very simple) parallel stuff. Anybody want to take that code
and write the Go version? Maybe we can do some quick measurements.
Michael
Scala runs on top of a JVM.
Erlang doesn't have writable variables.
Those are simple and maybe obvious but they color a lot of
the rest of the language and the way it gets used.
Russ
I'm in your court regarding super-scalability, and that is essentially my area of expertise.
My current thought is that the most scalable model existing now is that of the autonomous actor implementing periodically-updated actions received via an authenticated centralized command and control point.
As a complete Java-as-a-language/syntax-hater, I still have to admit that the VM has very much matured performance-wise since its early days, and I see Scala as a promising way to make it newly enjoyable for me to code in as well as pushing towards the nirvana that is unified interfaces and instruction sets.
At the same time, I reinforce your thoughts that the "surface level" features as you describe are mostly meaningless; instead I focus on how a language interfaces with the "weird" constraints of the mainstream hardware and operating systems that exist these days.
When a language makes it easy to scale and propagate both blocking and non-blocking operations across multiple cores -> multiple CPUs -> distributed NUMA, it makes the grand plan shine like Baby Jesus. Like others, I see goroutines as crude but effective, which is definitely the right solution at times.
Muxing blocking and non-blocking ops into an OS-level thread pool that can hit every processor as needed is ugly, but a step in the right direction as far as I'm concerned :)
I really don't understand these languages without writable variables.
I mean, I do program in them (I program in Scala), but I don't like them.
In the past, I used to just grab a big block of RAM and do whatever
I wanted with it: malloc a huge block, free it later. Then I didn't have
to deal with memory management and I could write where I wanted.
recursiveFunction(Thing) ->
    CandidateNewThing = Thing + 1,
    if
        CandidateNewThing > 20 -> NewThing = 20;
        CandidateNewThing =< 20 -> NewThing = CandidateNewThing
    end,
    %% do some interesting stuff with NewThing, and tail-recurse:
    recursiveFunction(NewThing).
See? Did you miss being able to do Thing++? Not me. Every variable
tells you why it's there. Some call it cumbersome, some love it. Very
few people in-between.
Before we go too far away from the subject, let's get back to writing
a test. I'll throw something up in Erlang today. I'll need help from
you guys to translate it to Scala and Go.
Michael
Maybe Go has a fair shot at this. Hey, I'm allowed to ask for anything
in dreamland :)
On the message payload issue... I'm a little stumped. It would be
meaningless to pass payload-less messages. To be useful, the messages
should carry at least the ID of the sending actor. In Erlang, I can
make that an atom by using the built-in function to register processes.
"Registering" a process in Erlang binds an atom (the name) to the
process ID. Is there something equivalent in Scala and in Go?
Michael
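On the Go side there is no built-in process registry that I know of, so a name-to-mailbox lookup has to be built by hand. Here is a minimal sketch, assuming one buffered channel per name; Registry, Register, Send and Message are hypothetical names invented for this example, not part of any Go library.

```go
package main

import (
	"fmt"
	"sync"
)

// Message mirrors the Erlang tuple {fire, FromAlias, NumMessagesSoFar}.
type Message struct {
	From  string
	Count int
}

// Registry is a hypothetical stand-in for Erlang's register/1 and name-based
// send: it maps a name to a mailbox channel.
type Registry struct {
	mu    sync.RWMutex
	boxes map[string]chan Message
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{boxes: make(map[string]chan Message)}
}

// Register binds name to a new buffered mailbox and returns it.
func (r *Registry) Register(name string) chan Message {
	ch := make(chan Message, 16)
	r.mu.Lock()
	r.boxes[name] = ch
	r.mu.Unlock()
	return ch
}

// Send delivers msg to the mailbox registered under name.
// It reports false if nothing is registered under that name.
func (r *Registry) Send(name string, msg Message) bool {
	r.mu.RLock()
	ch, ok := r.boxes[name]
	r.mu.RUnlock()
	if ok {
		ch <- msg
	}
	return ok
}

func main() {
	reg := NewRegistry()
	inbox := reg.Register("2")
	done := make(chan struct{})
	go func() {
		m := <-inbox
		fmt.Printf("sheep 2 received message #%d from sheep %s\n", m.Count, m.From)
		close(done)
	}()
	reg.Send("2", Message{From: "1", Count: 0})
	<-done
}
```

In practice a Go program would more likely just hand the channel itself to whoever needs it, which is why the sender's ID travels in the message payload here rather than being looked up.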
On Jan 22, 9:31 am, David Flemström <david.flemst...@gmail.com> wrote:
> On Fri, Jan 22, 2010 at 2:38 PM, Michael <mbi...@gmail.com> wrote:
>
> > Before we go too far away from the subject, let's get back to writing
> > a test. I'll throw something up in Erlang today. I'll need help from
> > you guys to translate it to Scala and Go.
>
> > Michael
>
> I could do the translation to Scala, but since Scala is one of those
> languages that allows you to do something in many different ways, there
> should be other translation candidates as well.
>
> FWIW, a list of known weaknesses that the Scala model has, worth considering
> when creating the benchmark:
>
> - Passing around primitive data like integers is expensive, because of
> the JVM legacy.
> - Since real OS threads are used for each actor (when using the
------------- snip -------------
# Turing Makefile
.SUFFIXES: .erl .beam

.erl.beam:
	erlc -W $<

ERL = erl -boot start_clean -sname concurrencytest -setcookie cookie1

MODS = test shepherd sheep

all: compile
	${ERL} -pa 'YOUR/PATH/HERE' -s test run

compile: ${MODS:%=%.beam}

clean:
	rm -rf *.beam erl_crash.dump
------------- /snip -------------
Careful when you paste the Makefile. The indents MUST be one tab, not
a bunch of spaces. Make is finicky. Also, be sure to tell it where
the code is by modifying the path string (YOUR/PATH/HERE).
Next is "test.erl". It doesn't do much except set some initial
conditions. Think of this as your project file.
------------- snip -------------
%% Author: mbilca
%% Created: Jan 21, 2010
%% Description: test.erl: This is an Erlang concurrency test that uses one shepherd
%% to spawn many sheep that send messages to each other.
%% The shepherd measures the time it takes for a given number of messages
%% to randomly make their way through the population.
-module(test).
-export([run/0]).

run() ->
    io:format("~nStarting Test ...~n"),
    NumSheep = 10,
    MaxMessages = 100,
    io:format("Initializing Shepherd with ~p sheep that will pass ~p messages...~n",
              [NumSheep, MaxMessages]),
    register(shepherd, spawn(shepherd, init, [NumSheep, MaxMessages])).
------------- /snip -------------
Next is the "shepherd.erl":
------------- snip -------------
%% Author: mbilca
%% Created: Jan 21, 2010
%% Description: shepherd.erl: The shepherd spawns the sheep, and sends the first message,
%% starting the sheep talking to each other
-module(shepherd).
-import(io, [format/1, format/2]).
-import(lists, [seq/2]).
-export([init/2]).

init(NumSheep, MaxMessages) ->
    SheepAliases = [list_to_atom(integer_to_list(X)) || X <- seq(1, NumSheep)],
    [register(Alias, spawn(sheep, init, [Alias, SheepAliases, NumSheep, MaxMessages]))
        || Alias <- SheepAliases],
    loop(0, NumSheep, MaxMessages).

loop(START_Timestamp, NumSheep, MaxMessages) ->
    receive
        {start} ->
            format("Starting shepherd...~n"),
            {MegaSecs, Secs, MicroSecs} = now(),
            START_Timestamp_new = (MegaSecs * 1000000 + Secs) * 1000000 + MicroSecs,
            '1' ! {'fire', '2', 0}, %% kick off the party
            loop(START_Timestamp_new, NumSheep, MaxMessages);
        {stop} ->
            {MegaSecs, Secs, MicroSecs} = now(),
            STOP_Timestamp = (MegaSecs * 1000000 + Secs) * 1000000 + MicroSecs,
            ElapsedTime = STOP_Timestamp - START_Timestamp,
            ElapsedSecs = ElapsedTime / 1000000,
            format("Elapsed time for ~p sheep with ~p messages: ~p seconds~n",
                   [NumSheep, MaxMessages, ElapsedSecs]);
        {save} ->
            void
    end.
------------- /snip ------------
Last is the "sheep.erl"
------------- snip -------------
%% Author: mbilca
%% Created: Jan 21, 2010
%% Description: sheep.erl: The only thing the sheep know to do is to receive a message
%% from other sheep, increment the NumMessagesSoFar by one, and send a new message along
%% to a new randomly chosen sheep.
-module(sheep).
-import(io, [format/2]).
-export([init/4]).

init(MyAlias, SheepAliases, NumSheep, MaxMessages) ->
    format("Starting sheep #~p~n", [MyAlias]),
    loop(MyAlias, SheepAliases, NumSheep, MaxMessages).

loop(MyAlias, SheepAliases, NumSheep, MaxMessages) ->
    receive
        {'fire', FromAlias, NumMessagesSoFar} ->
            format("RECEIVE: ~p <<< ~p; Message # ~p~n",
                   [MyAlias, FromAlias, NumMessagesSoFar]),
            RandomTarget = list_to_atom(integer_to_list(random:uniform(NumSheep))),
            NewNumMessagesSoFar = NumMessagesSoFar + 1,
            if
                NewNumMessagesSoFar =< MaxMessages ->
                    begin
                        format(" SEND: ~p >>> ~p~n", [MyAlias, RandomTarget]),
                        RandomTarget ! {'fire', MyAlias, NewNumMessagesSoFar}
                    end;
                true ->
                    begin
                        format(" Trying to stop ~p~n", [MyAlias]),
                        stop(MyAlias, SheepAliases)
                    end
            end,
            loop(MyAlias, SheepAliases, NumSheep, MaxMessages);
        {'die'} ->
            format(" ... sheep ~p dying!~n", [MyAlias])
    end.

stop(MyAlias, SheepAliases) ->
    format(" ... sheep ~p stopping.~n", [MyAlias]),
    %% the first sheep that reaches MaxMessages commits mass murder:
    [X ! {'die'} || X <- SheepAliases -- [MyAlias]],
    shepherd ! {stop}.
------------- /snip ------------
We'll call this the babbling sheep test. They don't do anything
interesting except pass one message around randomly until MaxMessages
is hit. We'll be able to vary NumSheep and MaxMessages to make the
test runnable.
This is very raw, and not ready to actually do scaling tests. It is
offered so we can agree on the strategy first.
In the real test, we would have to turn off all the printouts, at
least. They're in there so you get an idea of what is happening under
the hood. This is why it starts with 10 sheep and only goes to 100
messages. Also, the Erlang runtime defaults to a maximum of 32K processes. This
can be overridden, though.
Let's first talk about whether this approach is a valid test of
scalability, since there are some things in here that go beyond pure
scalability (using random numbers, for example). To David's point, the
messages have to carry some things to be useful. I think the above is
a realistic enough piece of code. I don't know if I could simplify
it much and still retain its usefulness.
If you think of better (simpler or not) ways to do it, great, but
let's make sure the Scala and Go versions do the same thing, and are
similar in approach.
So, David volunteered to do the Scala bit, any Go takers?
Regards,
Michael
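As a starting point for the Go side, here is a rough sketch of the babbling sheep test, assuming one goroutine and one channel per sheep, with the caller of runFlock playing the shepherd. The names runFlock and fire and the channel layout are choices made for this sketch, not an agreed design, and the printouts are left out.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// fire mirrors the Erlang message {fire, FromAlias, NumMessagesSoFar}.
type fire struct {
	from  int
	count int
}

// runFlock starts numSheep goroutines and passes a single message between
// randomly chosen sheep until the counter would exceed maxMessages. It
// returns the number of sheep-to-sheep sends and the elapsed time measured
// by the caller, which acts as the shepherd.
func runFlock(numSheep, maxMessages int) (int, time.Duration) {
	inbox := make([]chan fire, numSheep)
	for i := range inbox {
		// A buffer of 1 lets a sheep pick itself as the random target.
		inbox[i] = make(chan fire, 1)
	}
	// stop carries the final count back to the shepherd.
	stop := make(chan int)
	// die is closed to tell every sheep to exit.
	die := make(chan struct{})
	var wg sync.WaitGroup

	for i := range inbox {
		wg.Add(1)
		go func(me int) {
			defer wg.Done()
			for {
				select {
				case m := <-inbox[me]:
					next := m.count + 1
					if next <= maxMessages {
						inbox[rand.Intn(numSheep)] <- fire{from: me, count: next}
					} else {
						stop <- m.count
					}
				case <-die:
					return
				}
			}
		}(i)
	}

	start := time.Now()
	inbox[0] <- fire{from: 1 % numSheep, count: 0} // kick off the party
	sends := <-stop
	elapsed := time.Since(start)
	close(die)
	wg.Wait()
	return sends, elapsed
}

func main() {
	const numSheep, maxMessages = 10, 100
	sends, elapsed := runFlock(numSheep, maxMessages)
	fmt.Printf("Elapsed time for %d sheep with %d messages: %v\n", numSheep, sends, elapsed)
}
```

Only one message is ever in flight, so like the Erlang version this mostly measures message-passing latency rather than how well the work spreads across cores.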
After you make the project (just type make in the src directory), it
will automatically start an Erlang runtime instance. You'll see the
sheep being built, but the messaging won't start until you tell the
shepherd to start, like so:
shepherd ! {start}.
That ought to do it.
Michael
When benchmarking Erlang, please note that you will have to try out a
few switches to get the best performance.
For compiling: bytecode vs. native.
For running: whether to tell Erlang to use multiple cores (SMP), etc.
The flags needed for best performance will vary with the hardware used.
- Taru Karttunen
We'll have to do our best to match these OS-level concerns across the
languages, and also across multiple machines. Who knows, we may
discover some interesting variances between two cores and, say, eight
hyper-threaded cores. We'll complete the matrix as we go, once a few of
us compile the three contenders on whatever machines we have
available. I'll run my tests on a Core 2 Duo with OS X 10.6. I hope someone
with a four-core Core i7 will volunteer. And an AMD64 person :)
Michael