Computer Language Benchmarks Game

72 views
Skip to first unread message

Yang Zhang

unread,
Dec 1, 2008, 3:52:56 PM12/1/08
to felix-impl...@googlegroups.com
Hi, would you consider adding Felix to the Computer Language Benchmarks
Game (http://shootout.alioth.debian.org/) to give others a rough sense
of how performance stacks up against other languages? This could also
be a big publicity boost for Felix.
--
Yang Zhang
http://www.mit.edu/~y_z/

Erick Tryzelaar

unread,
Dec 1, 2008, 9:49:17 PM12/1/08
to felix-impl...@googlegroups.com
Hi Yang! We have some unfortunate history with the shootout. We
actually used to be in it years ago, but then we were kicked out
because we were accused of doing unfair optimizations, like optimizing
threads away into loops. They also said we didn't have enough of a
userbase to be worthwhile having in the competition. Skaller also got
into a couple arguments over this, and it created some bad blood. That
said, it's been years, so maybe the wounds have scarred over by now.
We do actually have the shootout tests in our repository (the speed
directory), along with sources for other languages to compare against.
Could be interesting to collect that and at least post it to the
website.

Yang Zhang

unread,
Dec 2, 2008, 2:06:54 PM12/2/08
to felix-impl...@googlegroups.com
Ah, interesting stories. :)

I'm sure you're well aware of this, but for a systems programming
language, support for real system threads is fairly important.

I'm surprised that they used "not enough users" against you, since I'm
seeing certain new languages on there like ATS, which I would not expect
to have any more users than you do.

Anyway, if you think it's OK to revisit the shootout, I would highly
encourage this!

BTW, who is actively involved in the project at this point?

Yang

Erick Tryzelaar

unread,
Dec 3, 2008, 12:15:23 AM12/3/08
to felix-impl...@googlegroups.com
On Dec 2, 2008, at 11:06 AM, Yang Zhang wrote:

> Ah, interesting stories. :)
>
> I'm sure you're well aware of this, but for a systems programming
> language, support for real system threads is fairly important.

We actually do support real threads too :) I dug up the last
conversation about this:

http://article.gmane.org/gmane.comp.lang.felix.general/1220

Anyway, we actually have two threading constructs. One, fthreads, are
userspace-scheduled threads. These then are run within pthreads, which
are real os-scheduled threads. fthreads are very lightweight and are
designed to have thousands of threads, but they're only active one at
a time on a real os thread. These can be routed through a thread pool
so that you can get concurrency. What happened with the thread-ring
(that the right test skaller?) is that it's a trivial test. You've got
one thread that passes data to the next thread and that to the next
thread. What we did was we used fthreads to model each thread, but the
compiler was able to optimize away the threads into tail calls and
then we did the test in constant space. We actually scored negative,
and were accused of cheating :)

skaller, I couldn't find the source for this in the speed directory
though. Do you still have it?


> I'm surprised that they used "not enough users" against you, since I'm
> seeing certain new languages on there like ATS, which I would not
> expect
> to have any more users than you do.


Yeah it'd be nice to get back in, and get some more people interested.


> Anyway, if you think it's OK to revisit the shootout, I would highly
> encourage this!
>
> BTW, who is actively involved in the project at this point?


So we've got a decent chunk of subscribers to the sourceforge mailing
list. Folks never got around to migrating to the googlegroups list.
The main guy is John Skaller. He's pretty much written most of it.
Then there's Rhythmic Fistman who did a lot of the async io. I've been
writing a new build system for the past, oh, year or so. I also made
the website, and poke at the standard library. It's been a bit quiet
though, but skaller's started playing around with things again, so
hopefully he'll have some new stuff coming.

Yang Zhang

unread,
Dec 3, 2008, 5:36:03 AM12/3/08
to felix-impl...@googlegroups.com
Erick Tryzelaar wrote:
> On Dec 2, 2008, at 11:06 AM, Yang Zhang wrote:
>
>> Ah, interesting stories. :)
>>
>> I'm sure you're well aware of this, but for a systems programming
>> language, support for real system threads is fairly important.
>
> We actually do support real threads too :) I dug up the last
> conversation about this:
>
> http://article.gmane.org/gmane.comp.lang.felix.general/1220
>
> Anyway, we actually have two threading constructs. One, fthreads, are
> userspace-scheduled threads. These then are run within pthreads, which
> are real os-scheduled threads. fthreads are very lightweight and are
> designed to have thousands of threads, but they're only active one at
> a time on a real os thread. These can be routed through a thread pool
> so that you can get concurrency. What happened with the thread-ring
> (that the right test skaller?) is that it's a trivial test. You've got
> one thread that passes data to the next thread and that to the next
> thread. What we did was we used fthreads to model each thread, but the
> compiler was able to optimize away the threads into tail calls and
> then we did the test in constant space. We actually scored negative,
> and were accused of cheating :)

Hrm. Was Felix really the only language to have used lightweight
threading for this benchmark? Other languages have this as well, and I
think it's how some of them obtain the scores they do.

Anyway, requiring OS threads for such a benchmark sounds silly, since
(for most reasonable languages) I'd assume the cost would be dominated
by the kernel.

>
> skaller, I couldn't find the source for this in the speed directory
> though. Do you still have it?
>
>
>> I'm surprised that they used "not enough users" against you, since I'm
>> seeing certain new languages on there like ATS, which I would not
>> expect
>> to have any more users than you do.
>
>
> Yeah it'd be nice to get back in, and get some more people interested.
>
>
>> Anyway, if you think it's OK to revisit the shootout, I would highly
>> encourage this!
>>
>> BTW, who is actively involved in the project at this point?
>
>
> So we've got a decent chunk of subscribers to the sourceforge mailing
> list. Folks never got around to migrating to the googlegroups list.
> The main guy is John Skaller. He's pretty much written most of it.
> Then there's Rhythmic Fistman who did a lot of the async io. I've been
> writing a new build system for the past, oh, year or so. I also made
> the website, and poke at the standard library. It's been a bit quiet
> though, but skaller's started playing around with things again, so
> hopefully he'll have some new stuff coming.

I'd love to see a Related Works study that juxtaposes Felix, D, BitC,
and even more distant systems programming languages (Cyclone, perhaps
OCaml) under a critical eye.

"async io" makes me very curious. What is this? Any chance this means
"cooperative threading"? I primarily write distributed systems, so this
is near and dear to me. After a long journey exploring cooperative
threading libraries (tame/tamer, boost.coro, more[3]), I settled on
ST[4], but it's still a headache (doesn't play nicely with pthreads,
Valgrind, etc.). I'm still looking for something that even comes close
to this AF library[5] for Python.

[1] http://read.cs.ucla.edu/tamer/
[2] http://www.crystalclearsoftware.com/soc/coroutine/
[3] http://www.gnu.org/software/pth/related.html
[4] http://state-threads.sf.net/
[5] svn://scripts.mit.edu:36900/mit/golem/Share/svn/af

Erick Tryzelaar

unread,
Dec 3, 2008, 6:35:56 PM12/3/08
to felix-impl...@googlegroups.com
On Dec 3, 2008, at 2:36 AM, Yang Zhang wrote:
> Hrm. Was Felix really the only language to have used lightweight
> threading for this benchmark? Other languages have this as well,
> and I
> think it's how some of them obtain the scores they do.


I know at least erlang and scala have similar threading models. I'm
not sure if they try to optimize away the threads though. Looking at
erlang's code it looks like the test isn't actually run with multiple
os-threads, so it's essentially the same as us.


> Anyway, requiring OS threads for such a benchmark sounds silly, since
> (for most reasonable languages) I'd assume the cost would be dominated
> by the kernel.

We'd actually prefer more complicated tests. This just happens to hit
an easy optimization edge case.

> I'd love to see a Related Works study that juxtaposes Felix, D, BitC,
> and even more distant systems programming languages (Cyclone, perhaps
> OCaml) under a critical eye.

Thats a nice idea, although I'm only tangentially aware of what the
other languages are doing. Maybe I'll have time someday :) If you
happen to write up something, let me know I'll put it on the website :)

> "async io" makes me very curious. What is this? Any chance this
> means
> "cooperative threading"? I primarily write distributed systems, so
> this
> is near and dear to me. After a long journey exploring cooperative
> threading libraries (tame/tamer, boost.coro, more[3]), I settled on
> ST[4], but it's still a headache (doesn't play nicely with pthreads,
> Valgrind, etc.). I'm still looking for something that even comes
> close
> to this AF library[5] for Python.

Oh, the async io is designed to work with our lightweight threads.
Since the fthreads are user-space, they can block the main thread in a
system call. We wrap the standard async io libraries like select,
poll, and etc to detect whether or not a call will block, and switch
to a new fthread if it would.

Looking at your references, I believe we're doing exactly what you
describe. We might even be relatively similar to AF. The api is pretty
low level. We don't have mutexes and the like implemented, we just use
a channel to talk between fthreads. Unlike erlang, it's not actually a
queue, the fthread will just block in the write until the other end of
the channel calls read, then the data is exchanged directly. You could
implement all the higher level constructs using this technique, but we
never got around to it.

We haven't got around to doing any actual distributed stuff across
multiple machines like erlang though, but I want to some day.

Finally, if you want to see more of the async io stuff, we do have a
test webserver, which you'll find either in tools/webserver.flx after
you've built felix, or here:

http://felix.svn.sourceforge.net/viewvc/felix/felix/trunk/lpsrc/flx_faio_tools.pak?revision=2073&view=markup

Yang Zhang

unread,
Dec 3, 2008, 8:28:43 PM12/3/08
to felix-impl...@googlegroups.com
Erick Tryzelaar wrote:
> On Dec 3, 2008, at 2:36 AM, Yang Zhang wrote:
>> Hrm. Was Felix really the only language to have used lightweight
>> threading for this benchmark? Other languages have this as well,
>> and I
>> think it's how some of them obtain the scores they do.
>
>
> I know at least erlang and scala have similar threading models. I'm
> not sure if they try to optimize away the threads though. Looking at
> erlang's code it looks like the test isn't actually run with multiple
> os-threads, so it's essentially the same as us.

Scala doesn't actually have lightweight threads. Scala has actors that
are somewhat similar to Erlang actors, but they run atop system threads,
where "system" = "JVM" (also the actors have to loop in a single
tail-recursive function).

>
>
>> Anyway, requiring OS threads for such a benchmark sounds silly, since
>> (for most reasonable languages) I'd assume the cost would be dominated
>> by the kernel.
>
> We'd actually prefer more complicated tests. This just happens to hit
> an easy optimization edge case.
>
>> I'd love to see a Related Works study that juxtaposes Felix, D, BitC,
>> and even more distant systems programming languages (Cyclone, perhaps
>> OCaml) under a critical eye.
>
> Thats a nice idea, although I'm only tangentially aware of what the
> other languages are doing. Maybe I'll have time someday :) If you
> happen to write up something, let me know I'll put it on the website :)

Not any time soon for me either :)

>
>> "async io" makes me very curious. What is this? Any chance this
>> means
>> "cooperative threading"? I primarily write distributed systems, so
>> this
>> is near and dear to me. After a long journey exploring cooperative
>> threading libraries (tame/tamer, boost.coro, more[3]), I settled on
>> ST[4], but it's still a headache (doesn't play nicely with pthreads,
>> Valgrind, etc.). I'm still looking for something that even comes
>> close
>> to this AF library[5] for Python.
>
> Oh, the async io is designed to work with our lightweight threads.
> Since the fthreads are user-space, they can block the main thread in a
> system call. We wrap the standard async io libraries like select,
> poll, and etc to detect whether or not a call will block, and switch
> to a new fthread if it would.
>
> Looking at your references, I believe we're doing exactly what you
> describe. We might even be relatively similar to AF. The api is pretty
> low level. We don't have mutexes and the like implemented, we just use
> a channel to talk between fthreads. Unlike erlang, it's not actually a
> queue, the fthread will just block in the write until the other end of
> the channel calls read, then the data is exchanged directly. You could
> implement all the higher level constructs using this technique, but we
> never got around to it.

Are Felix's lightweight threads cooperative (as with most C/C++ user
threading packages, or in any language where shared mutable state is
common) or preemptive (Haskell, Erlang, etc.) or either?

BTW, there are some slides in that svn repo giving an overview of AF.
The design is really nice, the key being the composability of
asynchronous tasks. E.g.:

yield s1.read() | s2.read() | (ch1.take() & ch2.take()) | sleep(5)

where:

a&b = start tasks a and b, waiting for both to finish
a|b = start tasks a and b, with one winner (wait for any to finish)

So the above means: "read from socket s1, read from s2, or take an
element off of both channels ch1 and ch2, whichever happens first, all
with a timeout of 5 seconds."

Crucial to this is the semantics of |, which cleanly aborts the losers
(there is always exactly one winner) so that they don't end up (e.g.)
dropping read data packets onto the floor. This involves execution of a
two-phase commit protocol among (potentially nested) tree of tasks.

>
> We haven't got around to doing any actual distributed stuff across
> multiple machines like erlang though, but I want to some day.

If you've wrapped select, poll, etc., then you've got distribution. :)

>
> Finally, if you want to see more of the async io stuff, we do have a
> test webserver, which you'll find either in tools/webserver.flx after
> you've built felix, or here:
>
> http://felix.svn.sourceforge.net/viewvc/felix/felix/trunk/lpsrc/flx_faio_tools.pak?revision=2073&view=markup
>
> >


Erick Tryzelaar

unread,
Dec 3, 2008, 9:57:44 PM12/3/08
to felix-impl...@googlegroups.com
On Dec 3, 2008, at 5:28 PM, Yang Zhang wrote:

> Scala doesn't actually have lightweight threads. Scala has actors
> that
> are somewhat similar to Erlang actors, but they run atop system
> threads,
> where "system" = "JVM" (also the actors have to loop in a single
> tail-recursive function).

Oh, I thought they mapped multiple actors to one thread.

> Are Felix's lightweight threads cooperative (as with most C/C++ user
> threading packages, or in any language where shared mutable state is
> common) or preemptive (Haskell, Erlang, etc.) or either?

I believe they're cooperative but I could be wrong. If I remember
correctly, fthreads are really just continuations, as well as the main
felix thread. These continuations are stored in some queue and when
one is suspended the next one in the queue is resumed. There might be
some intelligence as to picking a good continuation instead of just
the next one in the list. These are implemented though a classic
trampolines in c++.

> BTW, there are some slides in that svn repo giving an overview of AF.
> The design is really nice, the key being the composability of
> asynchronous tasks. E.g.:
>
> yield s1.read() | s2.read() | (ch1.take() & ch2.take()) | sleep(5)

Ah, no, we don't do anything like that, but I think we've got all the
building blocks so that it could be implemented. You'd run into
problems without preemption though if you've only got one os-thread.
Hmm. Do you know how the other languages are doing preemption?
inserting yields every x instructions, or do they use interrupts?

>> We haven't got around to doing any actual distributed stuff across
>> multiple machines like erlang though, but I want to some day.
>
> If you've wrapped select, poll, etc., then you've got distribution. :)

I've been looking into distributed filesystems recently, so I've got
that usage on my mind :)

Yang Zhang

unread,
Dec 3, 2008, 11:20:49 PM12/3/08
to felix-impl...@googlegroups.com
Erick Tryzelaar wrote:
> On Dec 3, 2008, at 5:28 PM, Yang Zhang wrote:
>> BTW, there are some slides in that svn repo giving an overview of AF.
>> The design is really nice, the key being the composability of
>> asynchronous tasks. E.g.:
>>
>> yield s1.read() | s2.read() | (ch1.take() & ch2.take()) | sleep(5)
>
> Ah, no, we don't do anything like that, but I think we've got all the
> building blocks so that it could be implemented. You'd run into
> problems without preemption though if you've only got one os-thread.
> Hmm. Do you know how the other languages are doing preemption?
> inserting yields every x instructions, or do they use interrupts?

You explicitly yield. This isn't as strange as it sounds; what's
frequently done is to make "blocking operations" your yield-points. In
fact, it feels quite natural. This is a nice default for systems
programming, which involves a lot of IO and not a lot of computation -
it's usually fine to run till the next blocking IO operation without
yielding. (Usually, if you're also performing computation, you farm the
work out to a different OS thread). This threading model is what the
Sun JVM implemented in ages past (they called them "green threads").

A result of this is that there are well-defined points in your program
that can yield, making concurrent programming much safer/easier - the
need for locking virtually disappears, as do inadvertent race
conditions. Managing shared state is easier. In a preemptive threading
system, code is implicitly yielding and explicitly synchronized (locks);
in a cooperative threading model, code is implicitly synchronized and
explicitly yielding.

Synchronization and thread safety are problems only when you're mutating
shared state. Languages like Haskell and Erlang inherently
disallow/discourage this. The dual to shared state is partitioned
state, where each thread manages just its own portion of the total state
(and message-passing is used for communication, as opposed to
communication via the shared state and synchronization).

Anyway, at least in Haskell, the runtime is basically a big looping
evaluator, so it can keep a count on the number of reductions performed
and suspend the thread once it performs a certain number of reductions -
so in that sense you're dynamically inserting yields every n
instructions (statically inserting yields that take place every n
instructions would be at least difficult and, for any non-trivial
program, impossible). But a more efficient way may be to maintain a
timeout interrupt; according to Wikipedia, that's what's done in GHC:

http://en.wikipedia.org/wiki/Green_threads

Erick Tryzelaar

unread,
Dec 4, 2008, 12:17:47 AM12/4/08
to felix-impl...@googlegroups.com
On Dec 3, 2008, at 8:20 PM, Yang Zhang wrote:
> You explicitly yield. This isn't as strange as it sounds; what's
> frequently done is to make "blocking operations" your yield-points.
> In
> fact, it feels quite natural. This is a nice default for systems
> programming, which involves a lot of IO and not a lot of computation -
> it's usually fine to run till the next blocking IO operation without
> yielding. (Usually, if you're also performing computation, you farm
> the
> work out to a different OS thread). This threading model is what the
> Sun JVM implemented in ages past (they called them "green threads").
>
> A result of this is that there are well-defined points in your program
> that can yield, making concurrent programming much safer/easier - the
> need for locking virtually disappears, as do inadvertent race
> conditions. Managing shared state is easier.

Ah, okay. We do that right now.

> Synchronization and thread safety are problems only when you're
> mutating
> shared state. Languages like Haskell and Erlang inherently
> disallow/discourage this. The dual to shared state is partitioned
> state, where each thread manages just its own portion of the total
> state
> (and message-passing is used for communication, as opposed to
> communication via the shared state and synchronization).

Our preferred mechanism is to use message passing, but I think we
actually do allow modification of shared state. You could technically
pass a pointer through a channel to allow direct modification, or
access global data, but I at least don't encourage it.

> Anyway, at least in Haskell, the runtime is basically a big looping
> evaluator, so it can keep a count on the number of reductions
> performed
> and suspend the thread once it performs a certain number of
> reductions -
> so in that sense you're dynamically inserting yields every n
> instructions (statically inserting yields that take place every n
> instructions would be at least difficult and, for any non-trivial
> program, impossible). But a more efficient way may be to maintain a
> timeout interrupt; according to Wikipedia, that's what's done in GHC:

Ah, thanks. That helped me track down an article on how they implement
it. ghc just uses a timer to schedule context switches. I suppose it
wouldn't be too hard to do that, other than figuring out how to
suspend a continuation that's in the middle of running from a signal
handler :)

Isaac Gouy

unread,
Dec 5, 2008, 4:33:19 PM12/5/08
to Felix Implementation


On Dec 2, 9:15 pm, Erick Tryzelaar <eri...@felix-lang.org> wrote:
> On Dec 2, 2008, at 11:06 AM, Yang Zhang wrote:
>
> > Ah, interesting stories. :)
>
> > I'm sure you're well aware of this, but for a systems programming
> > language, support for real system threads is fairly important.
>
> We actually do support real threads too :) I dug up the last
> conversation about this:
>
> http://article.gmane.org/gmane.comp.lang.felix.general/1220

"Isaac's biased view"
"Isaac completely misunderstood"
"how much of a Nazi he was"
"Gouy refused to listen to rational argument"
"Gouy ended up looking like so much of a fool"
"Isaac is so insecure and paranoid"
"Moron face decided Felix wasn't passing the test because"


"> However Felix is not in it, can you please add it?

No.
1) Too obscure - no users.
2) Too few tests completed.
"
http://lists.alioth.debian.org/pipermail/shootout-list/2005-July/002370.html


> Yeah it'd be nice to get back in, and get some more people interested.

1) Note how few languages are now shown in the benchmarks game
http://shootout.alioth.debian.org/u32q/

2) Note that you can download and use the new Python measurement
scripts (and any of the programs) from the benchmarks game
http://shootout.alioth.debian.org/u32q/faq.php#measurementscripts

Erick Tryzelaar

unread,
Dec 5, 2008, 5:57:23 PM12/5/08
to felix-impl...@googlegroups.com
On Dec 5, 2008, at 1:33 PM, Isaac Gouy wrote:
> "Isaac's biased view"
> "Isaac completely misunderstood"
> "how much of a Nazi he was"
> "Gouy refused to listen to rational argument"
> "Gouy ended up looking like so much of a fool"
> "Isaac is so insecure and paranoid"
> "Moron face decided Felix wasn't passing the test because"

Hi Isaac! I can't speak for John, but I apologize for those comments.
I can understand why he feels that way, but I don't think it was
really appropriate to use a lot of that language.

> "> However Felix is not in it, can you please add it?
>
> No.
> 1) Too obscure - no users.
> 2) Too few tests completed.
> "
> http://lists.alioth.debian.org/pipermail/shootout-list/2005-July/002370.html

Yeah I know, that's why I didn't ask :) Maybe some day we'll get more
of a momentum. From our perspective, it would definitely motivate us
to compete, but that's not your goal with the shootout.

>> Yeah it'd be nice to get back in, and get some more people
>> interested.
>
> 1) Note how few languages are now shown in the benchmarks game
> http://shootout.alioth.debian.org/u32q/
>
> 2) Note that you can download and use the new Python measurement
> scripts (and any of the programs) from the benchmarks game
> http://shootout.alioth.debian.org/u32q/faq.php#measurementscripts

Thanks for the link. I think we've got some of the old tests in our
repository from the other languages in the repository for speed
comparisons, but it'd be nice to update it. nanobench seems
interesting too.

Reply all
Reply to author
Forward
0 new messages