Concurrent Ruby?

Kyle Murphy

Jul 29, 2008, 12:17:22 AM

Apologies if this is a really stupid question; I am new to programming,
but after reading about Erlang and its speed increase on multi-core
devices I had to ask.

With Matz supposedly making Ruby 2.0 right now, is it possible to make
it concurrent like Erlang so as to take advantage of the future
multi-core devices? Thank you.
--
Posted via http://www.ruby-forum.com/.

David Masover

Jul 29, 2008, 2:55:43 AM

On Monday 28 July 2008 23:17:22 Kyle Murphy wrote:
> With Matz supposedly making Ruby 2.0 right now, is it possible to make
> it concurrent like Erlang

Not like Erlang, no.

Erlang does a couple of things differently. The most obvious one, which makes
it so scalable, is the message-passing -- Erlang uses "processes" and
message-passing almost as a programming paradigm. We talk
about "Object-Oriented Programming"; Erlang people talk
about "Concurrency-Oriented Programming".

These are much easier to write and scale than threads, and they perform much
better than single threads.

There are a few of us working to rectify this situation, at least
semantically -- there's Revactor, Dramatis, and my own unreleased project
which I've been wasting a few weekend hours on.

Another reason, which I'm running into while working on the above project, is
that Erlang has no mutable data. It even goes so far as to make variables
single-assignment, which is just annoying, but the data structures themselves
are never changed. Take a simple (contrived) Ruby example:

  def some_function(options={})
    options[:foo] ||= 'Foo'
    options[:bar] ||= 'Bar'
    options[:foobar] ||= options[:foo] + options[:bar]

    some_file.each_line do |line|
      line.chomp!
      line.gsub!(/curses/i, '******')
      puts line
    end
  end

See, we're changing things. Arrays, strings, whatever -- it's actually the
characters inside the string that are changing.

In Erlang, (almost) no data ever changes, you just create new data. Which
means that when you send a message to another process, it's as simple as
sending a pointer across -- which means it's not only a constant-time
operation, it's an absurdly cheap constant-time operation. So the data is
shared, but because it never changes, you don't have to lock it.

Which means that in Erlang, message-passing is so cheap we don't have to worry
about it. If we ported the message-passing to Ruby, it's either unreliable or
it's massively expensive and still somewhat unreliable. I'm not sure there's
a good way around this, though if there is, I intend to find it.
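
To make the trade-off concrete, here is a minimal sketch using only Ruby's standard Queue (the variable names are illustrative and not taken from any of the projects mentioned). Sending a reference is cheap, but the sender can still change the message afterwards; sending a copy avoids that, at a cost proportional to the size of the message.

```ruby
require 'thread' # Queue; loaded by default on modern Rubies

mailbox = Queue.new

# Send by reference: cheap, but sender and receiver share one String.
message = 'hello'.dup          # .dup guarantees a mutable string
mailbox.push(message)
message << ' world'            # sender keeps mutating after "sending"
shared = mailbox.pop
puts shared                    # the receiver sees the later mutation

# Send a copy: immune to later mutation, but costs O(size) per message.
message = 'hello'.dup
mailbox.push(message.dup)
message << ' world'
copied = mailbox.pop
puts copied                    # the receiver sees the message as sent
```

With real threads the first case becomes a race rather than a predictable surprise, which is the "unreliable" half of the problem; the second case is the "expensive" half.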

> so as to take advantage of the future
> multi-core devices? Thank you.

This might happen -- maybe, sort of. Keeping all of the above in mind,
threading in Ruby is modeled after the traditional C and Java model, which
means threads are probably more expensive to create, and certainly more
dangerous, so there won't be as many of them.

On top of all that...

Right now, Ruby shares a problem with Python called the GIL -- the Global (or
Giant) Interpreter Lock. What this means is that only one Ruby instruction
may execute at a time. So even though they're using separate OS threads, and
even though different Ruby threads might run on different cores, the speed of
your program (at least the Ruby part) is limited to the speed of a single
core.

The standard response, which you'll probably already see (since I'm taking the
time to write a longer answer), is that you can do threading in two ways:
Either fork off a whole new Ruby process, so you probably can't have any
shared-memory problems -- and/or write the expensive parts in C, and have
your C extension release the Ruby GIL.

(See, you can have more than one bit of C code running in a Ruby program at
once, even alongside all the Ruby stuff -- at least until they need to do
something with Ruby itself.)
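
The fork approach can be sketched roughly as follows, assuming a Unix-like system (fork is not available on Windows). The helper names spawn_worker and collect are hypothetical. Each child does its share of the work in its own interpreter and sends the result back over a pipe, so nothing is shared and the result is copied.

```ruby
# Start a child process that runs the block and writes the marshalled
# result to a pipe. Returns the child's pid and the read end of the pipe.
def spawn_worker
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.binmode
    writer.write(Marshal.dump(yield))  # result is copied, never shared
    writer.close
    exit!(0)                           # skip at_exit handlers in the child
  end
  writer.close
  reader.binmode
  [pid, reader]
end

# Read one worker's result and reap the child process.
def collect(pid, reader)
  result = Marshal.load(reader.read)
  reader.close
  Process.wait(pid)
  result
end

# Both children run at the same time, each in its own Ruby process.
workers = [1..500_000, 500_001..1_000_000].map do |range|
  spawn_worker { range.inject(0) { |acc, n| acc + n } }
end

total = workers.map { |pid, reader| collect(pid, reader) }.inject(:+)
puts total
```

The price is that every result has to be serialised and copied back, so this only pays off when the work done in the child is much larger than the data returned.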

There's also JRuby, which uses Java's native threads, and has no GIL. There
have been some problems with them lately, but they should work -- but again,
keep all of the above in mind. You'll be threading as well as Java does, not
as well as Erlang does.

As you can probably tell, I'm not really happy about all of this.

Now, unlike Python, it looks as though the Ruby GIL might eventually be
removed. And there is JRuby. And there's the various actor projects (mine
included). So it's conceivable that we'd get Ruby scalable to arbitrary
numbers of processors.

But again, I suspect Erlang is still going to do it better, if all you care
about is multicore and efficiency. (Ruby is doing a better job of Unicode,
has much more library support, and I much prefer its syntax.)

Florian Gilcher

Jul 29, 2008, 7:01:10 AM

On Jul 29, 2008, at 8:55 AM, David Masover wrote:

> a long mail

Nice writeup. You forgot one thing about Erlang, though: it is
(mostly) side-effect-free, while object-oriented languages always
rely on side effects. This makes it harder when it comes to concurrency.

Regards,
Skade

Robert Klemme

Jul 29, 2008, 11:03:24 AM

2008/7/29 Florian Gilcher <f...@andersground.net>:

> On Jul 29, 2008, at 8:55 AM, David Masover wrote:
>
>> a long mail
>
> Nice writeup.

Absolutely agree. Thanks David!

> You forgot one thing about Erlang, though: It is (mostly)
> sideeffect-free while

Well, he said that data does not change which is basically the same.

> object orientated languages always rely on sideeffects.

I'd rather say "usually" because immutable classes are quite common.

> This makes it harder when it comes to concurrency.

Obviously.

Kind regards

robert

--
use.inject do |as, often| as.you_can - without end

David Masover

Jul 29, 2008, 12:00:26 PM

On Tuesday 29 July 2008 06:01:10 Florian Gilcher wrote:
>
> On Jul 29, 2008, at 8:55 AM, David Masover wrote:
>
> > a long mail
>
> Nice writeup.

Thanks!

> You forgot one thing about Erlang, though: It is
> (mostly) sideeffect-free while
> object orientated languages always rely on sideeffects.

If I understand it right, side effects in Erlang simply take a different form.
Nothing's stopping me from sending random, spurious messages in the middle of
a supposedly-innocuous function.

I did talk about data not being mutable, which provides both a semantic
(lock-free) and a technical advantage (raw speed).

I'm trying to figure out how to at least partly duplicate the semantic
advantage in Ruby, but it's not easy -- I'm stuck either #freeze-ing
everything, or wrapping every message in an actor of its own, and both
approaches seem more obnoxious and error-prone than forcing the developer to
deal with it.
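
For illustration, the "#freeze everything" route might look something like the sketch below. deep_freeze is a hypothetical helper, not part of Ruby's core library, and it only handles messages built from Arrays, Hashes and Strings; objects with their own instance variables would slip through, which is part of why this approach is error-prone.

```ruby
# Recursively freeze a message made of Hashes, Arrays and leaf values.
def deep_freeze(obj)
  case obj
  when Hash
    obj.each { |key, value| deep_freeze(key); deep_freeze(value) }
  when Array
    obj.each { |element| deep_freeze(element) }
  end
  obj.freeze
end

message = deep_freeze({ :to => 'logger', :lines => ['one', 'two'] })

begin
  message[:lines] << 'three'   # any mutation now raises
rescue RuntimeError => e       # FrozenError on Ruby 2.5+, a RuntimeError subclass
  puts "rejected: #{e.class}"
end
```

Note that freezing is one-way: the sender loses the ability to modify its own data too, unless it sends a frozen copy instead.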

Charles Oliver Nutter

Jul 29, 2008, 12:56:43 PM

David Masover wrote:
> There's also JRuby, which uses Java's native threads, and has no GIL. There
> have been some problems with them lately, but they should work -- but again,
> keep all of the above in mind. You'll be threading as well as Java does, not
> as well as Erlang does.

I'm not sure what you mean by problems...there have not been problems
with them lately; they work as you'd expect native threads to work. They
do require a bit more diligence on your part if you're sharing data
across the threads, since for performance reasons we don't do any extra
synchronization of e.g. Array, Hash, String. But native threads work
fine on JRuby.
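
As a small sketch of what that diligence looks like in practice (variable names are illustrative), a shared Array can be guarded with a Mutex from the standard library, so that only one thread appends at a time:

```ruby
require 'thread' # Mutex; loaded by default on modern Rubies

results = []
lock = Mutex.new

threads = 4.times.map do |i|
  Thread.new do
    1000.times do |n|
      # Only one thread at a time may touch the shared Array.
      lock.synchronize { results << [i, n] }
    end
  end
end
threads.each(&:join)

puts results.size
```

Without the lock, the outcome on an implementation with truly parallel threads depends on how the Array happens to behave under concurrent writes, which is exactly the kind of thing the collection classes do not promise.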

- Charlie

ara.t.howard

Jul 29, 2008, 1:13:53 PM

On Jul 29, 2008, at 10:00 AM, David Masover wrote:

> I'm trying to figure out how to at least partly duplicate the semantic
> advantage in Ruby, but it's not easy -- I'm stuck either #freeze-ing
> everything, or wrapping every message in an actor of its own, and both
> approaches seem more obnoxious and error-prone than forcing the
> developer to
> deal with it.

fan out multiple processes with a message queue each - easy to do with
drb. naive impl:

cfp:~> ruby a.rb
b got "hello" (pid=94677)
a got "hello" (pid=94676)

cfp:~> cat a.rb

a =
  actor {
    recv_msg { |msg|
      puts "a got #{ msg.inspect } (pid=#{ Process.pid })"
    }
  }

b =
  actor {
    recv_msg { |msg|
      puts "b got #{ msg.inspect } (pid=#{ Process.pid })"
      a.send_msg msg
    }
  }

b.send_msg 'hello'

STDIN.gets

BEGIN {

  require 'rubygems'
  require 'thread'
  require 'drb'
  require 'slave'

  class Actor
    include ::DRb::DRbUndumped

    def initialize &block
      @q = Queue.new
      @block = block
      act!
    end

    def act!
      @thread = Thread.new do
        Thread.current.abort_on_exception = true
        instance_eval &@block
      end
    end

    def send_msg message
      @q.push message
    end

    def recv_msg
      while(( message = @q.pop ))
        yield message
      end
    end
  end

  def actor(*a, &b)
    Slave.new{ Actor.new(*a, &b) }.object
  end

  STDOUT.sync = true

}

a @ http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being
better. simply reflect on that.
h.h. the 14th dalai lama

David Masover

Jul 30, 2008, 12:33:04 AM

On Tuesday 29 July 2008 12:13:53 ara.t.howard wrote:
>
> On Jul 29, 2008, at 10:00 AM, David Masover wrote:
>
> > I'm trying to figure out how to at least partly duplicate the semantic
> > advantage in Ruby, but it's not easy -- I'm stuck either #freeze-ing
> > everything, or wrapping every message in an actor of its own, and both
> > approaches seem more obnoxious and error-prone than forcing the
> > developer to
> > deal with it.
>
> fan out multiple processes with a message queue each - easy to do with
> drb.

That implies a full copy (I think), which isn't always what's needed.

Without actually testing your implementation, what happens when I send, say, a
reference to an actor? (Kind of an essential feature.)

And without actually doing any benchmarks (how's that for naive?), I still
find it hard to believe that DRb+Queue would scale better than Thread+Queue,
for large numbers of actors. (Keep in mind, it's not unusual for an Erlang
program to have thousands of processes.)

Given that I still have a vague hope that YARV will eventually remove the GIL,
I'd rather stick to Threads, if I can make them safe.
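
For comparison, a Thread+Queue actor can be sketched in a few lines. This is a minimal illustration, not the unreleased project mentioned earlier; MiniActor, send_msg and stop are hypothetical names. Each actor is one thread draining one mailbox, and a reference to another actor is just an ordinary object reference, so nothing is copied.

```ruby
require 'thread' # Queue; loaded by default on modern Rubies

# A minimal in-process actor: one thread draining one mailbox.
class MiniActor
  STOP = Object.new # sentinel message that ends the actor's loop

  def initialize(&handler)
    @mailbox = Queue.new
    @thread = Thread.new do
      while (message = @mailbox.pop) != STOP
        handler.call(message)
      end
    end
  end

  # Enqueue a message; returns immediately.
  def send_msg(message)
    @mailbox.push(message)
    self
  end

  # Ask the actor to finish its queued messages and wait for it.
  def stop
    @mailbox.push(STOP)
    @thread.join
  end
end

log = Queue.new
a = MiniActor.new { |msg| log.push("a got #{msg.inspect}") }
b = MiniActor.new { |msg| log.push("b got #{msg.inspect}"); a.send_msg(msg) }

b.send_msg('hello')
b.stop
a.stop

received = []
received << log.pop until log.empty?
puts received
```

Nothing here prevents a handler from mutating a message that the sender still holds, which is the safety problem under discussion, and on an interpreter with a GIL the actors will not run Ruby code in parallel.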

David Masover

Jul 30, 2008, 12:36:01 AM

On Tuesday 29 July 2008 11:56:43 Charles Oliver Nutter wrote:
> David Masover wrote:
> > There's also JRuby, which uses Java's native threads, and has no GIL. There
> > have been some problems with them lately, but they should work -- but again,
> > keep all of the above in mind. You'll be threading as well as Java does, not
> > as well as Erlang does.
>
> I'm not sure what you mean by problems...there have not been problems
> with them lately;

Maybe it wasn't actually "lately".

And there's still the rest of it:

> They
> do require a bit more diligence on your part if you're sharing data
> across the threads,

That's the whole problem that I'm attacking right now -- while a pure actor
model wouldn't share any data, I'm not even sure I can safely clone
everything properly, if I was going that route. And I'd rather not, for
obvious performance reasons.

ara.t.howard

Jul 30, 2008, 1:33:53 AM

On Jul 29, 2008, at 10:33 PM, David Masover wrote:

> That implies a full copy (I think), which isn't always what's needed.
>
> Without actually testing your implementation, what happens when I
> send, say, a
> reference to an actor? (Kind of an essential feature.)

DRb handles references. DRbUndumped provides a means to pass
references to remote objects around.

>
>
> And without actually doing any benchmarks (how's that for naive?), I
> still
> find it hard to believe that DRb+Queue would scale better than Thread
> +Queue,
> for large numbers of actors. (Keep in mind, it's not unusual for an
> Erlang
> program to have thousands of processes.)

no doubt that's true. processes can help you now though - especially
since threads don't scale right now in ruby with multi processor
machines.

>
>
> Given that I still have a vague hope that YARV will eventually
> remove the GIL,
> I'd rather stick to Threads, if I can make them safe.

sure, but if you want to burn up processors you simply have to use
processes attm.

you might find this interesting

http://groups.google.com/group/ruby-talk-google/browse_thread/thread/b4e346478eeeead4/0cbc4a86f2237476?lnk=gst&q=threadify+jruby#0cbc4a86f2237476

David Masover

Jul 30, 2008, 2:02:03 AM

On Wednesday 30 July 2008 00:33:53 ara.t.howard wrote:
>
> On Jul 29, 2008, at 10:33 PM, David Masover wrote:
>
> > That implies a full copy (I think), which isn't always what's needed.
> >
> > Without actually testing your implementation, what happens when I
> > send, say, a
> > reference to an actor? (Kind of an essential feature.)
>
> DRb handles references. DRbUndumped provides a means to pass
> references to remote objects around.

Alright. What if I send a complex datastructure? Strings, I can live with, but
what about multidimensional arrays?
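
As a rough illustration of what a full copy means for nested data: DRb serialises ordinary arguments with Marshal unless they include DRbUndumped, so the sketch below uses a plain Marshal round trip as a stand-in for crossing the process boundary. That substitution is an assumption made for the example; no DRb server is involved.

```ruby
grid = [[1, 2], [3, 4]]

# Serialise, then rebuild an independent copy, inner arrays included.
copy = Marshal.load(Marshal.dump(grid))

copy[0][0] = 99
puts grid.inspect   # the original is untouched
puts copy.inspect   # only the copy changed

# By contrast, #dup is shallow: the inner arrays are still shared.
shallow = grid.dup
puts shallow[0].equal?(grid[0])
```

So a multidimensional array does arrive intact, but as a separate object graph: the cost grows with the size of the structure, and changes made on one side are never seen on the other.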

> > And without actually doing any benchmarks (how's that for naive?), I
> > still
> > find it hard to believe that DRb+Queue would scale better than Thread
> > +Queue,
> > for large numbers of actors. (Keep in mind, it's not unusual for an
> > Erlang
> > program to have thousands of processes.)
>
> no doubt that's true. processes can help you now though - especially
> since threads don't scale right now in ruby with multi processor
> machines.

I believe work is going on to make Threads scale in 1.9 -- current 1.9 still
has a GIL, though.

They do scale in JRuby, and probably in IronRuby (haven't tried).

> > Given that I still have a vague hope that YARV will eventually
> > remove the GIL,
> > I'd rather stick to Threads, if I can make them safe.
>
>
> sure, but if you want to burn up processors you simply have to use
> processes attm.

Or I could use JRuby. Or IronRuby.

I don't want to burn up processors atm. I want to build an architecture which
will be able to burn up processors in the future. I want to solve concurrency
on a single machine once and be done with it -- without having to use Erlang.

> http://groups.google.com/group/ruby-talk-google/browse_thread/thread/b4e346478eeeead4/0cbc4a86f2237476?lnk=gst&q=threadify+jruby#0cbc4a86f2237476

From that link:

"the sync overhead is prohibitive
for in memory stuff"

I am, specifically, interested in doing in-memory stuff. If I can solve that
problem, I'm not as worried about the network stuff, especially as others
have already solved that well enough (DRb and friends).

Charles Oliver Nutter

Jul 30, 2008, 3:10:19 AM

Well if there are specific threading issues, we'd like to solve them.
And at this very moment we're debating and working on ways to make the
core collection types (String, Array, Hash) at least not dump a stack
trace when they're used unsafely. So I think there's little reason why
you couldn't implement a decent Actor framework on top of JRuby.

Also, we recently added Rubinius's MVM API atop our existing MVM
support, so that's another route you can go and really isolate
instances. But of course, they eat up more memory that way.

- Charlie

Tony Arcieri

Jul 30, 2008, 1:41:50 PM

On Mon, Jul 28, 2008 at 10:17 PM, Kyle Murphy <kmur...@hotmail.com> wrote:

> Apologies if this is a really stupid question, I am new to programming,
> but after reading about Erlang and it's speed increase on multi-core
> devices I had to ask.
>
> With Matz supposedly making Ruby 2.0 right now, is it possible to make
> it concurrent like Erlang so as to take advantage of the future
> multi-core devices? Thank you.
>

Rubinius is able to spawn a VM per CPU core, and allow quasi-Erlang style
concurrency using Actor objects which can communicate across inter-VM
message buses.

It's not as elegant as Erlang's SMP scheduler (something like that really
isn't possible without a shared-nothing process architecture), but it more
or less provides the same approach Erlang uses for distributed systems (i.e.
each CPU is a "node")

--
Tony Arcieri
medioh.com

Charles Oliver Nutter

Jul 30, 2008, 3:22:29 PM

Tony Arcieri wrote:
> On Mon, Jul 28, 2008 at 10:17 PM, Kyle Murphy <kmur...@hotmail.com> wrote:
>
>> Apologies if this is a really stupid question, I am new to programming,
>> but after reading about Erlang and it's speed increase on multi-core
>> devices I had to ask.
>>
>> With Matz supposedly making Ruby 2.0 right now, is it possible to make
>> it concurrent like Erlang so as to take advantage of the future
>> multi-core devices? Thank you.
>>
>
> Rubinius is able to spawn a VM per CPU core, and allow quasi-Erlang style
> concurrency using Actor objects which can communicate across inter-VM
> message buses.
>
> It's not as elegant as Erlang's SMP scheduler (something like that really
> isn't possible without a shared-nothing process architecture), but it more
> or less provides the same approach Erlang uses for distributed systems (i.e.
> each CPU is a "node")

It's worth mentioning JRuby also supports the MVM API, and sub-VMs share
nothing with their parents save the message queue. Sub-VMs also are
launched in their own native thread (though of course JRuby has native
threads within a given VM as well). It wouldn't be much of a leap to
implement the Actor model as well.

- Charlie