After a short discussion with benlangfeld on IRC, it was suggested I bring this up here. Looking at the current Celluloid mailbox implementation and DCell, I couldn't help but wonder whether it was ever considered that the Celluloid mailbox implementation could be swapped out for 0mq directly.
That is, every actor would have a 0mq, well, queue, with the transport being inproc, IPC (if needed), or TCP.
In any case, the benefits are that the lock-heavy mailbox implementation could be retired and, possibly, when toggling the transport to TCP, the DCell project could be merged(?).
On Sat, Jun 30, 2012 at 5:26 AM, Dotan Nahum <dip...@gmail.com> wrote:
> Hello group,
> After a short discussion with benlangfeld on IRC, it was suggested I bring
> this up here.
> Looking at the current Celluloid mailbox implementation and DCell, I
> couldn't help but wonder, if it
> were ever considered that the Celluloid mailbox implementation could be
> swapped out with
> 0mq directly.
It was considered, and not doing so was a deliberate design decision. Among other
reasons, 0mq is something of an onerous native dependency. The existing
code ensures that 0mq is only a dependency of DCell, not Celluloid itself.
The preferred way to utilize 0mq in such a setup is to use the inproc
transport and serialize all messages despite the fact they are being sent
in process. While this does have the positive effects that the semantics in
a distributed context are closer to the in-process context, and that by
virtue of copying all messages concurrent mutation issues are sidestepped,
serializing everything even in-process is much slower and limits the types
of objects that can be used.
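
To make the trade-off concrete, here is a minimal sketch using only the Ruby standard library. It does not use 0mq or Celluloid; `deliver_by_reference` and `deliver_serialized` are hypothetical helper names, and `Marshal` stands in for whatever serializer a byte-oriented transport would need.

```ruby
# Deliver a message by reference through an in-process queue:
# sender and receiver end up holding the very same object.
def deliver_by_reference(message)
  mailbox = Queue.new
  mailbox << message
  mailbox.pop
end

# Deliver a message as serialized bytes, as a byte-oriented transport
# would require: the receiver gets an independent deep copy.
def deliver_serialized(message)
  mailbox = Queue.new
  mailbox << Marshal.dump(message)
  Marshal.load(mailbox.pop)
end

original = { "body" => +"hello" }

shared = deliver_by_reference(original)
shared["body"] << " world"        # also visible through `original`
puts original["body"]

copy = deliver_serialized(original)
copy["body"] << "!"               # only the copy changes
puts original["body"]

# Not every object can be serialized; a lambda is one example.
begin
  deliver_serialized(-> { :callback })
rescue TypeError => e
  puts "cannot serialize: #{e.class}"
end
```

The copy sidesteps shared mutation, but every send pays for a dump and a load, and anything `Marshal` rejects (procs, IO objects, and so on) cannot be sent at all.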
Even as purely a notification system, using I/O objects instead of mutexes
and conditions is slower. Celluloid::IO uses nio4r's wakeup mechanism which
writes a byte to a pipe instead of a ConditionVariable and is only about
80% of the speed of Celluloid itself, and also requires 2 file descriptors
per actor.
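
As a rough illustration of the two wakeup styles, the sketch below puts a mutex-and-condition wakeup next to a pipe-based one. Both class names are made up for this example, and the pipe version only imitates the general idea; it is not nio4r's actual implementation, and it makes no claim about the relative speed.

```ruby
# Wakeup via Mutex + ConditionVariable: no file descriptors involved.
class ConditionWakeup
  def initialize
    @mutex = Mutex.new
    @condition = ConditionVariable.new
    @signaled = false
  end

  def signal
    @mutex.synchronize do
      @signaled = true
      @condition.signal
    end
  end

  def wait
    @mutex.synchronize do
      @condition.wait(@mutex) until @signaled
      @signaled = false
    end
  end
end

# Wakeup via a pipe: the waiter blocks in IO.select and the signaler writes
# a byte. Each instance holds two file descriptors (read end and write end).
class PipeWakeup
  def initialize
    @reader, @writer = IO.pipe
  end

  def signal
    @writer.write_nonblock("\0")
  rescue IO::WaitWritable
    # The pipe buffer is full, so a wakeup is already pending.
  end

  def wait
    IO.select([@reader])
    @reader.read_nonblock(1024, exception: false) # drain pending wakeup bytes
  end

  def close
    @reader.close
    @writer.close
  end
end

[ConditionWakeup.new, PipeWakeup.new].each do |wakeup|
  waiter = Thread.new { wakeup.wait; :woken }
  wakeup.signal
  puts "#{wakeup.class}: #{waiter.value}"
end
```

The pipe variant has the advantage that the waiter can select on other I/O objects at the same time, which is the reason to accept the extra descriptors and syscalls.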
Via Celluloid's use of dependency injection and duck typing, it isn't
necessary to depend on 0mq as a one-stop-shop for message transport. Any
mailbox you want can be dropped in. With the system designed in this way, a
lot of the advantages of using 0mq everywhere are irrelevant.
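
The drop-in idea can be sketched as follows. This is a toy, not Celluloid's code: `LockingMailbox`, `QueueMailbox`, and `EchoActor` are hypothetical names, and the assumption is simply that an actor only ever calls `<<` and `receive` on whatever mailbox it is handed.

```ruby
# Lock-based mailbox: Mutex + ConditionVariable around an Array.
class LockingMailbox
  def initialize
    @messages = []
    @mutex = Mutex.new
    @condition = ConditionVariable.new
  end

  def <<(message)
    @mutex.synchronize do
      @messages << message
      @condition.signal
    end
    self
  end

  def receive
    @mutex.synchronize do
      @condition.wait(@mutex) while @messages.empty?
      @messages.shift
    end
  end
end

# Alternative backend built on the standard library's thread-safe Queue.
class QueueMailbox
  def initialize
    @queue = Queue.new
  end

  def <<(message)
    @queue << message
    self
  end

  def receive
    @queue.pop
  end
end

# A toy actor that relies only on `<<` and `receive`, so any object that
# responds to both can be injected as its mailbox.
class EchoActor
  def initialize(mailbox)
    @mailbox = mailbox
    @replies = Queue.new
    @thread = Thread.new do
      loop do
        message = @mailbox.receive
        break if message == :stop
        @replies << "echo: #{message}"
      end
    end
  end

  def tell(message)
    @mailbox << message
  end

  def next_reply
    @replies.pop
  end

  def stop
    tell(:stop)
    @thread.join
  end
end

[LockingMailbox.new, QueueMailbox.new].each do |mailbox|
  actor = EchoActor.new(mailbox)
  actor.tell("ping")
  puts "#{mailbox.class}: #{actor.next_reply}"
  actor.stop
end
```

A 0mq-backed mailbox would be one more class with the same two methods, which is why the core does not need to depend on it.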
The existing design also opens up the opportunity to change Celluloid's
entire underlying concurrency model. For example, Kilim provides
"microthreads" on the JVM along with its own messaging system. Because
Celluloid is not coupled to 0mq, it could potentially take advantage of
this on JRuby (and Kilim/JRuby integration is underway as part of this
year's GSoC).
> In any case, the values are that the locks heavy mailbox implementation can
> be retired, and
> possibly when toggling transport to TCP, the DCell project can be
> merged(?).
Note that even if Celluloid switched to using 0mq in process, all of the
0mq sockets would still need to be guarded with locks as they do not
provide thread safety out of the box. Also note that uncontended locks are
cheap because they synchronize with userspace mechanisms unless contended.
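
A minimal sketch of what guarding a non-thread-safe socket looks like. `FakeSocket` is a hypothetical stand-in that just records messages (no 0mq binding is used here), and `GuardedSocket` is an illustrative wrapper name.

```ruby
# Hypothetical stand-in for a socket that must not be used by two threads
# at once; it only collects the messages it is asked to send.
class FakeSocket
  attr_reader :sent

  def initialize
    @sent = []
  end

  def send_message(message)
    @sent << message
  end
end

# Wraps a socket-like object so that only one thread calls into it at a time.
class GuardedSocket
  def initialize(socket)
    @socket = socket
    @lock = Mutex.new
  end

  def send_message(message)
    @lock.synchronize { @socket.send_message(message) }
  end
end

socket = FakeSocket.new
guarded = GuardedSocket.new(socket)

threads = 4.times.map do |worker|
  Thread.new { 1_000.times { |n| guarded.send_message([worker, n]) } }
end
threads.each(&:join)

puts socket.sent.size
```

So a lock ends up on the send path either way; the question is only whether it protects an in-memory array or a socket.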
I also think the fact that Celluloid and DCell are decoupled is a strength
of the system, not a weakness. It means Celluloid itself is quite flexible
and its internals can be modified on a case-by-case basis for different
purposes (e.g. Celluloid::IO).
> On Sat, Jun 30, 2012 at 5:26 AM, Dotan Nahum <dip...@gmail.com> wrote:
>> Hello group,
>> After a short discussion with benlangfeld on IRC, it was suggested I bring this up here.
>> Looking at the current Celluloid mailbox implementation and DCell, I couldn't help but wonder, if it
>> were ever considered that the Celluloid mailbox implementation could be swapped out with
>> 0mq directly.
> It was considered, and a deliberate design decision not to. Among other reasons, 0mq is something of an onerous native dependency. The existing code ensures that 0mq is only a dependency of DCell, not Celluloid itself.
In my mind, it could be possible to decouple the Mailbox backend from the rest of Celluloid. Though, having this is more an academic advantage than anything else.
> The preferred way to utilize 0mq in such a setup is to use the inproc transport and serialize all messages despite the fact they are being sent in process. While this does have the positive effects that the semantics in a distributed context are closer to the in-process context, and that by virtue of copying all messages concurrent mutation issues are sidestepped, serializing everything even in-process is much slower and limits the types of objects that can be used.
> Even as purely a notification system, using I/O objects instead of mutexes and conditions is slower. Celluloid::IO uses nio4r's wakeup mechanism which writes a byte to a pipe instead of a ConditionVariable and is only about 80% of the speed of Celluloid itself, and also requires 2 file descriptors per actor.
> Via Celluloid's use of dependency injection and duck typing, it isn't necessary to depend on 0mq as a one-stop-shop for message transport. Any mailbox you want can be dropped in. With the system designed in this way, a lot of the advantages of using 0mq everywhere are irrelevant.
An example of this is when, yesterday, I used the Celluloid::Task "framework" to do simple co-operative scheduling on top of a ZMQ-based reactor. It was reasonably easy to use this single part to abstract the creation, suspension, and resumption of Fibers.
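
For readers who have not used Fibers this way, here is a minimal sketch of that kind of abstraction in plain Ruby. `SimpleTask` is a hypothetical class written for this example; it is not Celluloid::Task's API, and there is no reactor here, only the create/suspend/resume cycle.

```ruby
# Hypothetical minimal task wrapper around Fiber.
class SimpleTask
  def initialize(&block)
    @finished = false
    @fiber = Fiber.new do |*args|
      result = block.call(*args)
      @finished = true
      result
    end
  end

  # Run the task until it next suspends or finishes. Returns the value given
  # to SimpleTask.suspend, or the block's result once the task has finished.
  def resume(value = nil)
    @fiber.resume(value)
  end

  def finished?
    @finished
  end

  # Called from inside a task to hand control back to whoever resumed it.
  # Returns the value passed to the next #resume call.
  def self.suspend(value = nil)
    Fiber.yield(value)
  end
end

task = SimpleTask.new do
  reply = SimpleTask.suspend(:waiting_for_io)
  "got #{reply}"
end

p task.resume          # task runs up to the suspend point
p task.resume("data")  # a reactor would call this once I/O is ready
p task.finished?
```

In a reactor setup, the value handed to `suspend` would tell the event loop what the task is waiting on, and the loop would call `resume` when that event fires.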
> The existing design also opens up the opportunity to change Celluloid's entire underlying concurrency model. For example, Kilim provides "microthreads" on the JVM along with its own messaging system. Because Celluloid is not coupled to 0mq, it could potentially take advantage of this on JRuby (and Kilim/JRuby integration is underway as part of this year's GSoC)
>> In any case, the values are that the locks heavy mailbox implementation can be retired, and
>> possibly when toggling transport to TCP, the DCell project can be merged(?).
> Note that even if Celluloid switched to using 0mq in process, all of the 0mq sockets would still need to be guarded with locks as they do not provide thread safety out of the box. Also note that uncontended locks are cheap because they synchronize with userspace mechanisms unless contended.
> I also think the fact that Celluloid and DCell are decoupled is a strength of the system, not a weakness. It means Celluloid itself is quite flexible and its internals can be modified on a case-by-case basis for different purposes (e.g. Celluloid::IO)
I have thought that having the ability to run many DCell::Nodes in a single VM would also be an interesting academic exercise. This would, in theory, allow the option for a 1-to-1 actor-to-node relationship yielding an entirely ZMQ-based Actor space.
Attempting these things, I believe, will expose new and interesting compositions.
On 30 June 2012 20:15, Tony Arcieri <tony.arci...@gmail.com> wrote:
> While this does have the positive effects that the semantics in a
> distributed context are closer to the in-process context, and that by virtue
> of copying all messages concurrent mutation issues are sidestepped,
> serializing everything even in-process is much slower and limits the types
> of objects that can be used.
It might be nice to make this possible in Celluloid; that is, copying
of all messages. Call it a mutation-safe mode, if you will. Thoughts?
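
One possible shape for such a mode, sketched under assumptions: `CopyingMailbox` is a hypothetical name, it wraps a plain `Queue` rather than Celluloid's real mailbox, and it uses a `Marshal` round trip as the copier, so it inherits the cost and the object-type limits discussed above.

```ruby
# Hypothetical "mutation-safe" mailbox: every message is deep-copied on the
# way in, so later changes by the sender are invisible to the receiver.
class CopyingMailbox
  def initialize(&copier)
    @queue = Queue.new
    # Default copier: a Marshal round trip. A custom block can be supplied.
    @copier = copier || ->(message) { Marshal.load(Marshal.dump(message)) }
  end

  def <<(message)
    @queue << @copier.call(message)
    self
  end

  def receive
    @queue.pop
  end
end

mailbox = CopyingMailbox.new
order = { "items" => ["tea"] }

mailbox << order
order["items"] << "coffee"   # sender keeps mutating after the send

p mailbox.receive            # receiver sees the state at send time
p order                      # sender's object has both items
```

Because it is opt-in and per mailbox, the default by-reference path stays as fast as it is today.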
That certainly filled in some blanks in my understanding of how Celluloid came to be. I actually answered around an hour after Tim posted, but Chrome crashed while I was replying. As a result, it took me a while to recover my train of thought from the very long answer I had written :)
I'll try to summarize this time. I, too, think Mailbox can be abstracted out. I would *really* love to know that in some cases I have the possibility to use "native" Java concurrent collections and executors. In recent days I came to the conclusion that, unless I have a good reason not to, I should always base long-running processes on JRuby rather than MRI, and it would be even better to know that I don't need to optimize that bit myself.
Having said that, Celluloid does give a great answer and solution, and I'll keep using it either way. The benefits really overshadow the micro-optimizations for me right now, so needless to say you have my thanks!
On Sunday, July 1, 2012 3:00:20 AM UTC+3, Ben Langfeld wrote:
> On 30 June 2012 20:15, Tony Arcieri <tony.arci...@gmail.com> wrote:
> > While this does have the positive effects that the semantics in a
> > distributed context are closer to the in-process context, and that by virtue
> > of copying all messages concurrent mutation issues are sidestepped,
> > serializing everything even in-process is much slower and limits the types
> > of objects that can be used.
> It might be nice to make this possible in Celluloid; that is, copying
> of all messages. Call it a mutation-safe mode, if you will. Thoughts?