Interesting benchmark of HTTP servers: Crystal is the fastest even with single threading.


raydf

Jun 28, 2016, 1:35:57 PM
to Crystal
Hello everyone:

Just wanted to share with the group this interesting benchmark of HTTP server libraries from various programming languages and standard libraries.

https://github.com/costajob/app-servers

raydf

Tim Uckun

Jun 28, 2016, 7:27:20 PM
to crysta...@googlegroups.com
Hey, can you include JRuby in there?

Also, did you record other things like memory usage, CPU, etc.?

--
You received this message because you are subscribed to the Google Groups "Crystal" group.
To unsubscribe from this group and stop receiving emails from it, send an email to crystal-lang...@googlegroups.com.
To post to this group, send email to crysta...@googlegroups.com.
Visit this group at https://groups.google.com/group/crystal-lang.
To view this discussion on the web visit https://groups.google.com/d/msgid/crystal-lang/41a2be4e-6696-4a82-8d5a-655d9a32444a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

raydf

Jun 29, 2016, 1:23:04 AM
to Crystal
Hello Tim:

This benchmark was made by Michele Costa; if you'd like more feedback, please ask him directly on github.com.

I'm just sharing it because it's interesting to see Crystal's great performance in comparison with other languages. You could take the Jetty/Java results into consideration as an estimate of JVM performance.

Best regards,

raydf

Tim Uckun

Jun 30, 2016, 4:26:53 AM
to crysta...@googlegroups.com
Just to follow up on my post....

I downloaded the benchmark and installed the latest versions of JRuby and Ruby on my MacBook Air. I ran the benchmark with the same command lines the benchmark uses for Puma, but since Puma doesn't support multiple workers under JRuby, I left that out. It turns out that Puma under JRuby delivers a third more requests per second than Puma under Ruby 2.3 for this benchmark. I then installed the TorqueBox 4 beta gem and ran it with TorqueBox, and it delivers twice as many req/s as Ruby 2.3! Wow.

Having said that, Crystal delivered three times as many requests per second as Ruby :)
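For anyone curious what the Ruby side of these numbers is actually measuring, the server boils down to a bare Rack app (a hypothetical sketch; the repo's actual handler may differ):

```ruby
# A Rack application is just an object responding to #call that takes
# the request env hash and returns a [status, headers, body] triple.
app = ->(env) { [200, { "Content-Type" => "text/plain" }, ["Hello World"]] }

# Invoking it directly, the way a server like Puma does per request:
status, headers, body = app.call({})
puts status     # 200
puts body.join  # Hello World
```

Everything above this interface (threading, pre-forking, the HTTP parser) is the server's job, which is why Puma, TorqueBox, etc. can differ so much while running the same app.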


Michele Costa

Jun 30, 2016, 9:12:11 AM
to Crystal
Hi all,

Michele Costa here: I included JRuby initially, but removed it since there was no significant throughput advantage over MRI and, honestly, JRuby is not a language of its own, just a different Ruby implementation.
If you dare anyway, you can look at an old commit including JRuby (you do not have to use workers with Puma, just threads).


Greetings,
M.

Michele Costa

Jun 30, 2016, 9:17:59 AM
to Crystal
Addendum: JVM/Jetty is much faster than JRuby, just as it's faster than other languages that run on the JVM (Groovy).
I suppose the tokenization phase to support Ruby syntax is the real bottleneck of JRuby, along with throwing away the compiler's static-type optimizations and other memory-efficient primitives (fixed-size arrays, immutable strings, etc.).

Best

Michele Costa

Jun 30, 2016, 9:37:25 AM
to Crystal
Good point: I will try to profile CPU/RAM too, although on OS X I will rely on the standard Activity Monitor only.
If you have any clue on how to measure it in a better way, any suggestion is welcome ;)
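For what it's worth, one crude option that behaves the same on OS X and Linux is sampling resident memory through `ps` (a rough sketch with a hypothetical helper name; no substitute for a real profiler):

```ruby
# Sample a process's resident set size (in KB) by shelling out to ps.
# The `rss=` format specifier prints the value with no header line.
def rss_kb(pid)
  `ps -o rss= -p #{pid}`.strip.to_i
end

# Example: sample this very process before and after allocating ~10 MB.
before_kb = rss_kb(Process.pid)
strings = Array.new(100_000) { "x" * 100 }
after_kb = rss_kb(Process.pid)

puts "RSS before: #{before_kb} KB, after: #{after_kb} KB"
```

Polling this in a loop while the load generator runs gives a cheap memory profile per server process without any extra tooling.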

Tim Uckun

Jun 30, 2016, 9:52:14 AM
to crysta...@googlegroups.com
In my completely offhand and unscientific test, Puma under JRuby with the same parameters had roughly 30% more throughput than Puma under MRI. TorqueBox was significantly better than both; having said that, I think it's because it's using Netty. Jetty-based servers like Mizuno and Trinidad were not that much better than Puma.

Caveat: I don't really know how to maximize JVM performance or how to tune any of the servers. I did use JRUBY_OPTS=--server, though.

Hristo Kochev

Jun 30, 2016, 2:28:11 PM
to Crystal
Sorry, but your bootstrap on Elixir seems unfair to the others ... you could at least compile :)

Tim Uckun

Jun 30, 2016, 5:47:43 PM
to crysta...@googlegroups.com
Elixir doesn't fare well, though. I don't know what's wrong, but it's roughly the speed of Ruby.

Ylan Segal

Jun 30, 2016, 11:29:26 PM
to crysta...@googlegroups.com
I was also surprised by Elixir's performance in that benchmark.

I have tried locally running a basic Hello World Rack application vs a Cowboy application, and Elixir/Erlang was more than an order of magnitude faster. I don't have the results published, though, so take it as an anecdote.


Michele Costa

Jul 1, 2016, 2:58:10 AM
to Crystal
I followed the suggestion from the Plug README; I am also compiling the server, as you can see, by:

c "lib/plug_server.ex"

As said, Elixir/Erlang is not aimed at pure performance; its use case is more about keeping many processes alive and restarting them on failure.

The numbers seem to confirm other benchmarks around (https://www.techempower.com/benchmarks/#section=data-r12&hw=peak&test=json&l=2hzi8&f=zijx1b-zik0zj-zik0zj-zifmyn-ziiku7-1ekf or https://github.com/mroth/phoenix-showdown), considering Sinatra is about 3x/3.5x slower than pure Rack.

Any enhancement suggestions are very welcome (fork and pull).

Michele Costa

Jul 4, 2016, 8:23:35 AM
to Crystal
Added CPU and memory consumption.
Re-added JRuby with Puma (the TorqueBox experience reminds me of my Java days...).

Michele Costa

Jul 5, 2016, 4:20:39 AM
to Crystal
Changed the Elixir server to a full application and compiled it using Mix's "prod" flag: some minor enhancements in throughput, but better consistency.

Hristo Kochev

Jul 5, 2016, 10:48:07 AM
to Crystal
Well, I think you should state in your benchmarks the number of CPUs involved in each example.

In your benchmarks you stated:
Since both Ruby and Node starts multiple processes (9) i reported average total RAM consumption and express CPUs usage as a range of percentages.

So if you want to make sure you benchmark apples to apples, you should make all of them use 9 threads, or make all of them single-process only.

Regards,
Hristo Kochev 

Michele Costa

Jul 6, 2016, 11:31:53 AM
to Crystal
Hmmm, this does not make sense to me, since some languages simply deliver parallelism by using threads on a single process, while others (Ruby and Node here) do not allow that and must rely on process pre-forking.

Also, what I am aiming at here is comparing a standard installation of different languages (both Node and Puma come with integrated process balancing).

What's the point of balancing multiple processes for the JVM, using an external balancer, when it just uses all of the CPUs? Better GC on single processes? I doubt that.
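To make the pre-forking point concrete, here is a minimal sketch (hypothetical, not taken from the repo) of what the Puma/Node-style built-in process balancing amounts to: fork N workers that all accept on one inherited listening socket, letting the kernel spread connections across them. Requires a Unix-like system with fork.

```ruby
require "socket"

# Create the listening socket BEFORE forking so every worker inherits it.
server = TCPServer.new("127.0.0.1", 0)  # port 0 = pick a free port
port = server.addr[1]

# Pre-fork: each worker blocks in accept on the shared socket; the
# kernel hands each incoming connection to exactly one worker.
workers = 3.times.map do
  fork do
    loop do
      client = server.accept
      client.write "HTTP/1.1 200 OK\r\nContent-Length: 12\r\n\r\nHello World!"
      client.close
    end
  end
end

# The parent acts as a client to show any worker can serve a request.
response = TCPSocket.open("127.0.0.1", port) { |s| s.read }
puts response.lines.first  # the status line

workers.each { |pid| Process.kill("TERM", pid); Process.wait(pid) }
```

This is all an "external balancer" would add for Puma or Node, whereas a JVM or Crystal-with-threads server gets the same fan-out inside one process.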

Anyway you're welcome to fork and propose setup changes ;) 

Mike Perham

Jul 6, 2016, 2:19:53 PM
to Crystal
If one Crystal process can dominate, I'd be curious to see a benchmark result with $NUM_CORES forked Crystal processes.

Tim Uckun

Jul 7, 2016, 4:22:13 AM
to crysta...@googlegroups.com
FYI, if you are using Puma on JRuby, it's using Java threads.



Michele Costa

Jul 8, 2016, 6:15:16 AM
to Crystal
That's pretty obvious; I stated that Puma has a built-in process balancer (like Node), something Crystal and Nim do not have.
By the way, I'd rather rely on multiple threads for parallelism, abstracted by some higher-level primitives (routines and channels): it's much more efficient than pre-forking.

As Mike said, once Crystal runs on multiple threads, it will be interesting to see its throughput ;)

Michele Costa

Jul 13, 2016, 11:28:50 AM
to Crystal
Just added Rust+Iron to the bench.