[Eventmachine-talk] EventMachine 2.0

Aman Gupta

Jun 28, 2009, 1:30:11 AM

to eventmac...@rubyforge.org

I'd like to announce a new initiative I've been discussing with
Francis and various other members of the EM community over the past
week: EventMachine 2.0.

The current EM code base works well and is used by many people in
high-traffic production environments, but the code has become
increasingly messy and unmaintainable over the years. Adding new
features is unnecessarily painful as the C and ruby wrappers for the
C++ reactor have to be written and maintained by hand, and the lack of
a solid test suite has led to a huge feature disparity between the
various reactors (C++, Java and pure Ruby).

We'd like to take a step back and redesign EM using the lessons
learned over the project's history. This will likely involve porting
the reactor to C, designing a clean and modular set of APIs that will
allow more advanced usage of EM (multiple reactors, embedding EM
directly into the ruby interpreter, etc), and embracing BDD to develop
a solid test suite for the project. Jake Douglas has started a wiki
detailing the plan so far:
http://wiki.github.com/eventmachine/eventmachine/em2-eventmachine-20

I'd like to invite everyone in the community to provide their
feedback: what would you like to see added or removed from EM? what's
missing from our plan?

Aman
_______________________________________________
Eventmachine-talk mailing list
Eventmac...@rubyforge.org
http://rubyforge.org/mailman/listinfo/eventmachine-talk

Chuck Remes

Jun 28, 2009, 10:40:22 AM

to eventmac...@rubyforge.org

On Jun 28, 2009, at 12:30 AM, Aman Gupta wrote:

> I'd like to announce a new initiative I've been discussing with
> Francis and various other members of the EM community over the past
> week: EventMachine 2.0.

> [snip]

>
> I'd like to invite everyone in the community to provide their
> feedback: what would you like to see added or removed from EM? what's
> missing from our plan?

I have a few feature ideas that I'd like to mention.

1. User-defined event types

There was some code hacked into EM by Francis a few years ago that was
starting to support this idea but it never went anywhere. I think it
would be interesting to see it revisited in the rewrite. The main
benefit of something like this would be the forced abstraction of the
event concept as opposed to how it works now where an event is always
either a change on a file descriptor or an expiring timer.

2. Support FSM concepts directly in the framework

Evented code is usually pretty difficult to understand due to its
asynchronous nature. To make it readable and testable I usually try to
build little state machines all over the place.

It would be awesome if EM supported a state machine DSL directly, but
I'd be just as happy if the framework didn't get in the way to allow
someone else to layer this on top.
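
As a rough illustration of the kind of DSL described above, here is a minimal sketch in plain Ruby. The `TinyFSM` module and its `initial_state`, `transition` and `fire` names are hypothetical and are not part of EventMachine; the sketch only suggests that such a layer could sit on top of any callback-driven code, as long as the framework stays out of the way.

```ruby
# Hypothetical mini state-machine DSL; none of these names exist in EventMachine.
module TinyFSM
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Table of allowed transitions: { [from_state, event] => to_state }
    def transitions
      @transitions ||= {}
    end

    # Set (with an argument) or read (without) the starting state.
    def initial_state(name = nil)
      @initial_state = name if name
      @initial_state
    end

    # Declare that +event+ moves the machine from +from+ to +to+.
    def transition(event, from:, to:)
      transitions[[from, event]] = to
    end
  end

  def state
    @state ||= self.class.initial_state
  end

  # Apply an event; raise if it is not legal in the current state.
  def fire(event)
    target = self.class.transitions[[state, event]]
    raise ArgumentError, "cannot #{event} while #{state}" unless target
    @state = target
  end
end

# Example: a handler whose callbacks would drive the machine.
class Handshake
  include TinyFSM
  initial_state :waiting
  transition :greeting_received, from: :waiting, to: :greeted
  transition :auth_ok,           from: :greeted, to: :ready
end
```

An evented handler would call `fire` from its callbacks, which keeps the legal orderings in one table instead of scattered across conditionals.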

3. Pluggable task schedulers

The current codebase only provides FIFO file descriptor event
delivery. Some tasks require event prioritization or some other kind
of reordering. It would be nice to have an architecture that allowed
us to replace the basic FIFO with (for example) a multi-level priority
feedback queue, random queue, reordering based on a field in the event
itself (e.g. timestamp), or whatever crazy idea someone invents.

I'd prefer this architecture to allow for ruby-based plugins. There
may be some utility to also allowing native code plugins (C and Java)
for the truly performance focused folks.
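
To make the idea a little more concrete, here is a minimal sketch of what a Ruby-based scheduler plugin might look like. Everything in it is an assumption for illustration: EventMachine exposes no such interface, and the `Event`, `FifoScheduler`, `OrderedScheduler` and `drain` names are invented here. The only contract is that a scheduler responds to `push`, `pop` and `empty?`.

```ruby
# Hypothetical scheduler plug-in interface; EventMachine has no such API.
Event = Struct.new(:name, :priority, :timestamp)

# Default behaviour: deliver events in arrival order.
class FifoScheduler
  def initialize
    @queue = []
  end

  def push(event)
    @queue.push(event)
  end

  def pop
    @queue.shift
  end

  def empty?
    @queue.empty?
  end
end

# Alternative: deliver the event with the smallest sort key first.
# The key comes from a block, so the same class covers priority
# ordering, timestamp ordering, and so on.
class OrderedScheduler
  def initialize(&key)
    @key = key
    @queue = []
    @seq = 0 # arrival counter, keeps ties in FIFO order
  end

  def push(event)
    @queue << [@key.call(event), @seq, event]
    @seq += 1
  end

  def pop
    entry = @queue.min_by { |key, seq, _event| [key, seq] }
    return nil unless entry
    @queue.delete(entry)
    entry.last
  end

  def empty?
    @queue.empty?
  end
end

# Stand-in for the reactor's dispatch step: deliver in whatever
# order the plugged-in scheduler chooses.
def drain(scheduler)
  order = []
  order << scheduler.pop.name until scheduler.empty?
  order
end
```

A linear scan in `pop` keeps the sketch short; a real plugin would more likely use a heap, and a native one could implement the same three methods.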

4. Project spinoffs

EM ships with a few protocols (HTTP, STOMP, LineAndText) but I suggest
that it may be better to spin these off to a separate project. The
project could be split up much like the Sequel ORM or Merb.

em-core: C reactor, I/O library
em-protocol: HTTP, STOMP, LineAndText, Pluggable architecture for
protocol extension
em-scheduler: FIFO, Priority Queue, etc, pluggable architecture for
scheduler extension
em-fsm: DSL for writing evented code

Those are my ideas. I bet there will be a lot of discussion on IRC but
I submit that it would be better for that discussion to occur here (or
*at the least* be summarized here) since IRC is not logged.

cr

Steve Hull

Jun 29, 2009, 12:55:04 AM

to eventmac...@rubyforge.org

FWIW, I also like the idea of being able to prioritize events.

I would also like to see some sort of basic status built in to EM. Just like #servers running, #clients running, then for each server/client: sends/sec, receives/sec. For servers, # of connected clients. It would be cool to have this served up via HTTP on a configurable port or something, but if that step (building the page in HTML and serving it via HTTP) were left to the users, that'd be ok too. Just at least have the info/counters available.

-Steve

Marco Ceresa

Jun 29, 2009, 5:23:29 AM

to eventmac...@rubyforge.org

On Sun, Jun 28, 2009 at 6:30 AM, Aman Gupta<themast...@gmail.com> wrote:

> We'd like to take a step back and redesign EM using the lessons
> learned over the project's history. This will likely involve porting
> the reactor to C, designing a clean and modular set of APIs that will
> allow more advanced usage of EM (multiple reactors, embedding EM
> directly into the ruby interpreter, etc), and embracing BDD to develop
> a solid test suite for the project.

This is an excellent idea. A few steps to bring EM to the next level,
and Rubyists won't envy Twisted Matrix anymore.

> I'd like to invite everyone in the community to provide their
> feedback: what would you like to see added or removed from EM? what's
> missing from our plan?

Pretty much everything is in Jake's list and Chuck's ideas in this thread.
In my opinion, spinoffs and plugins are very important, so we should
also dedicate some effort to developing

*) a cleaner API
*) better documentation.

Especially documentation. I know it has been a pain in the past, but EM
really kicks ass and it's a sin that it scares people away just because
it's difficult to grasp at the beginning.

Marco

Roger Pack

Jun 29, 2009, 2:05:42 PM

to eventmac...@rubyforge.org

> I'd like to invite everyone in the community to provide their
> feedback: what would you like to see added or removed from EM? what's
> missing from our plan?

More efficient calling into ruby land would be nice. Also nice would
be to experiment with using libev for the backend and see if it
improves speed or not.
Cheers!
=r

Jos Backus

Jun 29, 2009, 2:13:45 PM

to eventmac...@rubyforge.org

On Mon, Jun 29, 2009 at 12:05:42PM -0600, Roger Pack wrote:
> More efficient calling into ruby land would be nice. Also nice would
> be to experiment with using libev for the backend and see if it
> improves speed or not.
> Cheers!
> =r

Fwiw, Rev uses libev. Maybe I'm missing something, but wouldn't it make more
sense to work on Rev?

http://rev.rubyforge.org/rdoc/

--
Jos Backus
jos at catnook.com

Chuck Remes

Jun 29, 2009, 3:09:49 PM

to eventmac...@rubyforge.org

On Jun 29, 2009, at 1:13 PM, Jos Backus wrote:

> On Mon, Jun 29, 2009 at 12:05:42PM -0600, Roger Pack wrote:
>> More efficient calling into ruby land would be nice. Also nice would
>> be to experiment with using libev for the backend and see if it
>> improves speed or not.
>> Cheers!
>> =r
>
> Fwiw, Rev uses libev. Maybe I'm missing something, but wouldn't it
> make more
> sense to work on Rev?

Based upon the irc discussions I've read, there is no desire to use
any existing I/O library for the rewrite. If you read the wiki link
from the OP, the reasoning is spelled out.

cr

Kirk Haines

Jun 29, 2009, 2:49:38 PM

to eventmac...@rubyforge.org

On Sun, Jun 28, 2009 at 10:55 PM, Steve Hull<p.w...@gmail.com> wrote:
> FWIW, I also like the idea of being able to prioritize events.
> I would also like to see some sort of basic status built in to EM. Just
> like #servers running, #clients running, then for each server/client:
> sends/sec, receives/sec. For servers, # of connected clients. It would be
> cool to have this served up via HTTP on a configurable port or something,
> but if that step (building the page in HTML and serving it via HTTP) were
> left to the users, that'd be ok too. Just at least have the info/counters
> available.

I'm against trying to count sends/second and receives/second in the
reactor core. One can trivially do that oneself, if one wants that
information, and if one doesn't want it (and most of the time, it is
irrelevant), why have it in there adding to the complexity of the
reactor code?

A raw count of clients and servers is simple to put into the reactor,
but I still question whether there is value to that. I know that when
I have wanted a count along those lines, my needs have been more
complex than a raw count, so even were that available, I'd still have
to implement my own stuff. So I question the value of that facility
too, though not to the extent that I question building in some windowed
measurement of throughput.
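
For readers wondering what "trivially do that oneself" might look like, here is a minimal sketch. `ThroughputCounter` is a hypothetical mixin, not an EventMachine feature; it assumes only the `send_data` and `receive_data` method names that EM connection handlers use. `FakeConnection` is a stand-in so the example runs without the reactor.

```ruby
# Sketch only: ThroughputCounter is a hypothetical mixin, not part of EventMachine.
module ThroughputCounter
  def stats
    @stats ||= { sends: 0, receives: 0, bytes_out: 0, bytes_in: 0 }
  end

  # Count the call, then hand off to the real implementation.
  def send_data(data)
    stats[:sends] += 1
    stats[:bytes_out] += data.bytesize
    super
  end

  def receive_data(data)
    stats[:receives] += 1
    stats[:bytes_in] += data.bytesize
    super
  end
end

# Stand-in for a connection class; in real use this would be an
# EM::Connection subclass with its own receive_data logic.
class FakeConnection
  attr_reader :written

  def initialize
    @written = []
  end

  def send_data(data)
    @written << data
  end

  def receive_data(data)
    send_data(data.upcase) # trivial "protocol": echo in upper case
  end
end

class CountedConnection < FakeConnection
  prepend ThroughputCounter # prepend so the counters wrap the handler's methods
end
```

Per-second rates would then come from sampling `stats` on a periodic timer and taking differences, which keeps the windowing policy in user code rather than in the reactor core.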

Kirk Haines

Roger Pack

Jun 29, 2009, 3:21:56 PM

to eventmac...@rubyforge.org

>> Fwiw, Rev uses libev. Maybe I'm missing something, but wouldn't it make
>> more
>> sense to work on Rev?

rev is nice but since it's mostly written in ruby, it's slower.

> Based upon the irc discussions I've read, there is no desire to use any
> existing I/O library for the rewrite. If you read the wiki link from the OP,
> the reasoning is spelled out.

it would seem to me that wrapping an existing I/O library might make
the code "less" [thus easier to understand] as well as possibly
faster.
Thoughts?
=r

Jake Douglas

Jun 29, 2009, 3:34:27 PM

to eventmac...@rubyforge.org

This would probably be appropriate to have as a build option or similar. There really are not that many places that statistics collection would occur.

--
Jake Douglas
206-795-9207

Jake Douglas

Jun 29, 2009, 3:40:55 PM

to eventmac...@rubyforge.org

I think the outlined reasons against outweigh the benefit here. It's easier to understand until we discover a problem or inadequacy within libfoo and then have to go digging around inside of it, fork it, etc to accommodate our needs.

--
Jake Douglas
206-795-9207

Aman Gupta

Jun 29, 2009, 3:54:20 PM

to eventmac...@rubyforge.org

On Mon, Jun 29, 2009 at 12:21 PM, Roger Pack<roger...@gmail.com> wrote:

>>> Fwiw, Rev uses libev. Maybe I'm missing something, but wouldn't it make
>>> more
>>> sense to work on Rev?
>
> rev is nice but since it's mostly written in ruby, it's slower.
>
>
>
>> Based upon the irc discussions I've read, there is no desire to use any
>> existing I/O library for the rewrite. If you read the wiki link from the OP,
>> the reasoning is spelled out.
>
> it would seem to me that wrapping an existing I/O library might make
> the code "less" [thus easier to understand] as well as possibly
> faster.
> Thoughts?

The wiki has a good list of reasons we decided not to use an existing
reactor library:

Why don’t you use libevent or libev?
*) General opinions of these libraries seem to be poor
*) Unfamiliar – overhead in learning APIs
*) Do they support everything we need, the way we need it?
   We could think so now, and find problems later
   In which case we would be glued to a library we don’t have control over
*) They are pretty old. Maybe we’ll put new spin on things.
*) If you want libev, you can use Rev

It boils down to the fact that this is meant to be an iteration of
EventMachine. We know the EM reactor is already solid and fast and the
goal is to improve it. If you really want to use another reactor,
there are great projects like Rev and ruby-ffi that make it easy to do
so.

Aman

Roger Pack

Jun 29, 2009, 4:07:29 PM

to eventmac...@rubyforge.org

> The wiki has a good list of reasons we decided not to use an existing
> reactor library:
>
> Why don’t you use libevent or libev?
> General opinions of these libraries seem to be poor
> Unfamiliar – overhead in learning APIs
> Do they support everything we need, the way we need it?
> We could think so now, and find problems later
> In which case we would be glued to a library we don’t have control over
> They are pretty old. Maybe we’ll put new spin on things.

old?

> If you want libev, you can use Rev
>
> It boils down to the fact that this is meant to be an iteration of
> EventMachine. We know the EM reactor is already solid and fast and the
> goal is to improve it. If you really want to use another reactor,
> there are great projects like Rev and ruby-ffi that make it easy to do
> so.

Oh, my apologies, I thought you wanted a full rewrite :)

I guess I'll have to do it myself if nobody else wants to. I agree
that the EM reactor is indeed solid and fast, just thought a "massive
overhaul" might be a good chance to try it.

Question: does anybody actually use/like the idle timeouts currently
in place in EM? Just wondering.

Aman Gupta

Jun 29, 2009, 4:36:01 PM

to eventmac...@rubyforge.org

On Mon, Jun 29, 2009 at 1:07 PM, Roger Pack<roger...@gmail.com> wrote:
>> The wiki has a good list of reasons we decided not to use an existing
>> reactor library:
>>
>> Why don’t you use libevent or libev?
>> General opinions of these libraries seem to be poor
>> Unfamiliar – overhead in learning APIs
>> Do they support everything we need, the way we need it?
>> We could think so now, and find problems later
>> In which case we would be glued to a library we don’t have control over
>> They are pretty old. Maybe we’ll put new spin on things.
>
> old?
>
>> If you want libev, you can use Rev
>>
>> It boils down to the fact that this is meant to be an iteration of
>> EventMachine. We know the EM reactor is already solid and fast and the
>> goal is to improve it. If you really want to use another reactor,
>> there are great projects like Rev and ruby-ffi that make it easy to do
>> so.
> Oh, my apologies, I thought you wanted a full rewrite :)
>
> I guess I'll have to do it myself if nobody else wants to. I agree
> that the EM reactor is indeed solid and fast, just thought a "massive
> overhaul" might be a good chance to try it.
>
> Question: does anybody actually use/like the idle timeouts currently
> in place in EM? Just wondering.

Yes. TCP timeouts are unreliable and don't provide much granularity.

Aman

Daniel DeLeo

Jun 30, 2009, 3:15:33 PM

to eventmac...@rubyforge.org

Based on experiences with AMQP and the issues that repeatedly pop up on that list, I think an EM-friendly sleep() and/or blocking catchup() method would be useful for running within a loop or iterator. Perhaps this is just a special case of allowing more user control over scheduling in general (analogous to Fiber.resume vs. automatic scheduling with Threads)?
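
One way to picture a reactor-friendly sleep is with fibers: the calling fiber is suspended and a timer resumes it later, so the loop itself never blocks. The sketch below is an assumption-laden toy, not EventMachine code. `ToyLoop` stands in for a reactor and uses a virtual clock, and `fiber_sleep` is a hypothetical helper name.

```ruby
require 'fiber' # needed for Fiber.current on older Rubies

# Sketch only: ToyLoop is a stand-in for a reactor, not EventMachine.
class ToyLoop
  attr_reader :now

  def initialize
    @timers = [] # [fire_at, callback] pairs
    @now = 0.0   # virtual clock, so the example runs instantly
  end

  def add_timer(delay, &callback)
    @timers << [@now + delay, callback]
  end

  # Start a block inside its own fiber so it may call fiber_sleep.
  def spawn(&block)
    Fiber.new(&block).resume
  end

  # Suspend the current fiber; a timer resumes it after +delay+.
  def fiber_sleep(delay)
    fiber = Fiber.current
    add_timer(delay) { fiber.resume }
    Fiber.yield
  end

  # Fire timers in due-time order until none remain.
  def run
    until @timers.empty?
      entry = @timers.min_by { |fire_at, _callback| fire_at }
      @timers.delete(entry)
      @now = entry[0]
      entry[1].call
    end
  end
end

log = []
reactor = ToyLoop.new
reactor.spawn do
  3.times do |i|
    log << [:tick, i, reactor.now]
    reactor.fiber_sleep(1.0) # reads like a blocking call, but only this fiber pauses
  end
end
reactor.add_timer(0.5) { log << [:other, reactor.now] } # still runs between ticks
reactor.run
```

The loop body reads sequentially, yet other timers keep firing in between, which is the kind of explicit scheduling control the Fiber analogy above points at.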

The ideas proposed by Chuck, Steve and Marco all sound interesting and worthwhile.

Also, out of curiosity, is the plan to connect the new C reactor via FFI and make it universal that way? Or continue with C/MRI, Pure Ruby and JRuby variants but use the test suite to "enforce" feature parity?

Anyway, this sounds exciting, good luck and thanks for the code!

Dan DeLeo

Steve Hull

Jul 6, 2009, 10:20:48 PM

to eventmac...@rubyforge.org

Awesome! I think making this (measurement of throughput) a build option (or similar) would be excellent. :)

Another item on my wishlist:

Some sort of dependable/easy way to test EM clients and servers. Before I was really relying on EM's event loop, I was doing testing outside the event loop with Test::Unit and checking the "sent_data" for what I expected the output to be for some given input. Now I'm having to redo everything in em/spec, but I'm having problems figuring out how/when to do my assertions (or rather "should" thingies).

The problem is that I would *like* to be able to call "should" after the server calls "send_data", but it might not be the case that the server is done after the first call to send_data. So ideally, I could tell when my server module is done processing (that is, the first time through an event loop where there is no processing being done by the server module), and then it could call some block defined somewhere that has assertions.

Or something like that. I'm not really sure how it "should" work (hah), but I do know that there's not an easy way to do it properly right now.
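
One possible shape for this, sketched under heavy assumptions: the handler counts its outstanding asynchronous callbacks and runs a test-supplied block when the count drops to zero. `PendingTracker`, `on_quiet`, `track`, `FakeDb` and `GameHandler` are all names invented for this example; none of them come from EventMachine, em/spec or em/mysql.

```ruby
# Sketch only: every name here is hypothetical.
module PendingTracker
  # Register a block to run once no tracked async work is outstanding.
  def on_quiet(&block)
    @on_quiet = block
  end

  # Wrap an async callback so the handler knows work is still in flight.
  def track(&block)
    @pending = (@pending || 0) + 1
    lambda do |*args|
      block.call(*args)
      @pending -= 1
      @on_quiet.call if @pending.zero? && @on_quiet
    end
  end
end

# Stand-in for an async database client: queries complete later, on flush.
class FakeDb
  def initialize
    @callbacks = []
  end

  def query(_sql, &callback)
    @callbacks << callback
  end

  def flush
    @callbacks.shift.call(["row for test"]) until @callbacks.empty?
  end
end

class GameHandler
  include PendingTracker
  attr_reader :sent

  def initialize(db)
    @db = db
    @sent = []
  end

  def send_data(data) # stub: record output instead of writing to a socket
    @sent << data
  end

  def receive_data(msg)
    send_data("ack #{msg}")
    done = track { |rows| send_data("result #{rows.size}") }
    @db.query("SELECT score FROM players", &done)
  end
end

db = FakeDb.new
handler = GameHandler.new(db)
handler.on_quiet do
  # Assertions go here; every tracked callback has finished by now.
  raise "unexpected output" unless handler.sent == ["ack join", "result 1"]
end
handler.receive_data("join")
db.flush # simulate the database answering
```

Note the limitation: the block only fires after at least one tracked callback completes, so a message handled purely synchronously would need its assertions run directly after `receive_data`.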

-Steve

Use case:

I have implemented a custom protocol for a game I'm working on at work. Every time the server receives a message, I call a corresponding method in my module after parsing the message, and then the server may or may not call send_data in that method. Many of the methods rely on em/mysql, and make asynchronous calls to the DB, then complete their processing inside the block passed to the asynchronous mysql call (in which they might very well call "send_data"). I guess I could introduce some sort of state variable, but I don't want to do this just to support testing.

Maybe I just need a good solid blog posting explaining how to properly do testing with EM?
