Using crystal in production

Kostya M

Dec 14, 2015, 12:45:44 PM
to Crystal
My experience of using Crystal in production.

About 1.5 months ago I rewrote our critical Ruby service (EventMachine, 2000 SLOC) in Crystal (900 SLOC). It was quite easy; I spent only a week on it.
The service listens on a socket and handles 10 mln/requests per day. It runs as a daemon, reading and updating values held in memory.
It also listens on HTTP to render simple admin pages, has ~10 coroutines with cron_scheduler that run periodic updates of the in-memory data, and ~20 coroutines with outgoing socket connections.
I didn't want to rewrite it in Go, because I need code that our Ruby team can read and maintain easily.

It has now been running in production for a month without a restart. It uses 185 MB of memory, with peak CPU usage of 30%
(the Ruby daemon leaked memory and had to be restarted every day, and its peak CPU usage was > 100%, maybe 300%).
We are quite happy.

Main problems with converting from ruby:

1) No open hashes as in Ruby:

stats = {"key1" => value1}
stats["key2"] = value2 if something2
stats["key3"] = value3 if something3
stats["key4"] = value4 if something4

All the values have different types, and the hash is then sent to the user with stats.to_json.
Maybe Crystal should introduce an open named tuple, or allow open hashes (where the value types merge)?

2) No compact marshal format for storing data structures on disk.
I tried JSON, but it is quite big, and the daemon somehow leaked memory when it generated a 40 MB JSON document every 2 minutes and saved it to disk.

3) Sometimes I need to load the same class from both YAML and JSON, and I have to repeat
all the fields in JSON.mapping and YAML.mapping.

Serdar Doğruyol

Dec 14, 2015, 12:51:48 PM
to Crystal
Wow that's awesome :)

Care to share the app(url e.g) if it's public?

Serdar

On Monday, December 14, 2015 at 19:45:44 UTC+2, Kostya M wrote:

Ryan Gonzalez

Dec 14, 2015, 12:57:33 PM
to crysta...@googlegroups.com, Kostya M
What if you set the value type to a union? Like https://carc.in/#/r/o7v.
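
The union suggestion can be sketched roughly as follows. This is only an illustration and not the contents of the carc.in link; the `StatValue` alias, the keys, and the `include_host` flag are hypothetical stand-ins for the `value1`/`something2` placeholders in the original post.

```crystal
require "json"

# Hypothetical value union for this sketch; list whichever types the stats can hold.
alias StatValue = Int32 | Float64 | String | Bool

include_host = true # stands in for `something2` from the original post

stats = {} of String => StatValue
stats["requests"] = 1200
stats["load"] = 0.3
stats["host"] = "example" if include_host # entries can still be added conditionally
stats["healthy"] = true

puts stats.to_json
```

The trade-off is that every value type has to be listed up front, so this is a closed set of types rather than a truly open hash.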

>and then send it to user with stats.to_json
>maybe crystal should introduce open named tuple or allow open hashes(when types merging)?
>
>2) no compact marshal format to store data structures on disk.
>I tried json, but it quite big, and daemon somehow leaked, when generate 40Mb json every 2 min and save to disk.

What about using MessagePack (http://msgpack.org/index.html) or Protocol Buffers (https://developers.google.com/protocol-buffers/)? I doubt it would be difficult to implement something like to_json for MessagePack; not sure about Protocol Buffers.

>3) sometimes needed to load the same class from YAML and JSON, and i need to repeat
>all fields in JSON.mapping and YAML.mapping


Brian J. Cardiff

Dec 14, 2015, 1:04:24 PM
to crysta...@googlegroups.com, Kostya M
Nice! Thanks for sharing!

Regarding 2) 

--
You received this message because you are subscribed to the Google Groups "Crystal" group.
To unsubscribe from this group and stop receiving emails from it, send an email to crystal-lang...@googlegroups.com.
To post to this group, send email to crysta...@googlegroups.com.
Visit this group at https://groups.google.com/group/crystal-lang.
To view this discussion on the web visit https://groups.google.com/d/msgid/crystal-lang/CBFA371F-DF68-45F9-B10B-BB54F7BA2B2C%40gmail.com.
For more options, visit https://groups.google.com/d/optout.
--
Brian J. Cardiff

Ary Borenszweig

Dec 14, 2015, 1:08:15 PM
to crysta...@googlegroups.com
Woooooooow!! That's really amazing! I had no idea someone would be using Crystal in production. It's fantastic that you did that experiment, because now we have real proof that the language can be a valuable tool :-)

When you say "10 mln/requests per day", is that "10 million requests per day"?

I'm also curious about the number of lines dropping from 2000 to 900, why is that?

As for your questions/requests:

1. In the future you'll be able to do `{} of String => Object` and put any object as a value. We have to think a bit more about how to then serialize that with JSON (as not every Object will implement `to_json`, or maybe yes).
2. @waj started Marshal here: https://github.com/manastech/crystal/tree/marshalling . It's not finished but it looks quite promising, and of course much more compact than JSON.
3. Yes, that's something that we need to solve. I guess one way would be defining the mapping in a constant and then passing the constant's value to JSON.mapping and YAML.mapping (but that's not possible right now).
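
For point 3, later Crystal releases (not available when this thread was written) added the `JSON::Serializable` and `YAML::Serializable` modules, which read the field list from the instance variable declarations, so the fields are written only once. A minimal sketch, assuming a recent Crystal; the `ServiceConfig` class and its fields are hypothetical:

```crystal
require "json"
require "yaml"

# Hypothetical config class: fields are declared once and shared by both formats.
class ServiceConfig
  include JSON::Serializable
  include YAML::Serializable

  getter host : String
  getter port : Int32
end

from_json = ServiceConfig.from_json(%({"host": "localhost", "port": 8080}))
from_yaml = ServiceConfig.from_yaml("host: localhost\nport: 8080\n")

puts from_json.host
puts from_yaml.port
```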

Another question: how are you rendering the Admin pages? Oh, and another one: did others on your team have to code in Crystal? How was their experience?

Kostya M

Dec 14, 2015, 2:14:28 PM
to crysta...@googlegroups.com
It's private code.
Yes, 7-10 million requests per day (the response is mostly a hash encoded as JSON).
The daemon also generates 2 GB of logs per day.
The drop from 2000 lines to 900 is mainly due to refactoring and rethinking some of the logic.
Admin pages are rendered with ECR; routes are parsed by hand, which is a little weird.
Right now I'm the only one working with Crystal.


Alex Fedorov

Dec 14, 2015, 3:02:52 PM
to Crystal
That is just WOOOW.

It would be nice if you could write a blog post about that; I would tweet about it :)

So far I have used Crystal only for generating production code ('json-schema => json-hyper-schema' generator), instead of actually running it in production :)


--
Best Regards,
Oleksii Fedorov,
Sr Ruby, Clojure, Crystal, Golang Developer,
Microservices Backend Engineer,
+49 15757 486 476

Benoist Claassen

Dec 14, 2015, 3:06:27 PM
to Crystal
We've been running crystal in production since the HTTP timeouts were released :)
Dramatically increased performance for us. We're doing some realtime stats calculations and requests went from timing out after 60 seconds to responding in < 5 secs. 

Luis Lavena

Dec 16, 2015, 8:55:04 AM
to Crystal
Thank you for sharing!

Can I ask you about the following details?

- Are you using a cluster of individual processes, a single process, or a cluster of forked processes (listen_fork)?
- If using a cluster, are you using nginx as reverse proxy/load balancer or anything else?
- If SSL is on, are you using the SSL support built into Crystal or doing SSL termination at the reverse proxy/load balancer?

Last but not least, can you share a bit of the hardware specs where this service is running? We have seen a wide performance difference by deploying Crystal on real hardware versus VM (Runabove, Digital Ocean and Docker images) or even Heroku (last one completely sucks)

Thank you in advance for your responses,

--
Luis Lavena

Kostya M

Dec 16, 2015, 9:04:11 AM
to crysta...@googlegroups.com
This is one process. The task is very specific: it uses memory as shared storage, so I can't easily run it as multiple processes. If I could, I would probably have solved it with Ruby.

For multiple processes I would just use HAProxy, and nginx for SSL.

The hardware is our own server, no VM.


Kostya M

Jul 1, 2016, 11:54:29 AM
to crysta...@googlegroups.com
I now have another tool in production.

The task was to map parallel incoming socket connections (1000 in parallel) to outgoing HTTP requests (150 in parallel) with pooling. First I wrote it in Ruby (Celluloid): 300 lines, 3 hours of work. But in production CPU usage was 90-100% and there were many concurrency bugs (more pools created than requested, pools with the wrong size, as if mutexes did not work at all). I don't know why; maybe it was a Celluloid bug. I tried to debug it, but that was hard because of the high concurrency.

Then I thought this should work fine in Crystal. In 1 hour I rewrote the code in Crystal (300 lines), mostly copied from the Ruby code. When I deployed it, it just worked (CPU usage 5%), with no concurrency bugs (pool sizes and pool counts were exactly as expected). It now handles ~1 million requests per day (and surely it can handle much more). Later I found one serious bug. It was not easy to track down, because in some strange cases a few pools just froze while the others worked as expected.

The pseudocode looks like this:

```
@tcp_server = TCPServer.new(host, port)
spawn { loop { spawn handle_socket(@tcp_server.accept) } }

def handle_socket(socket)
  data = socket.gets
  ch = process(data)
  result = ch.receive
  socket.puts(result.to_json)
rescue IO::EOFError
ensure
  socket.try &.close
end

def process(data)
  ch = Channel(Result).new
  
  spawn(...) do
    # some actions
    spawn(...) do
      # some actions
      ch.send(result)
    end
  end
  
  ch
end

sleep
```

The bug was that `ch.send(result)` hung forever and blocked that coroutine. I don't know exactly why, because `ch.receive` should surely be executed before this send: even in the best case about a second passes before the result is sent. Or maybe the coroutine running handle_socket got an exception and died, or something like that. It was not easy to find. The fix was to change Channel(Result).new to Channel::Buffered(Result).new(1).

Ary Borenszweig

Jul 1, 2016, 12:05:38 PM
to crysta...@googlegroups.com
Wow, that's really impressive, ~1 million requests per day! :-)

`channel.send` with an unbuffered channel will block the coroutine until `channel.receive` is executed on another coroutine. I don't know why it blocked in your case, but it would be really, really nice if you could reduce it and report something that we can reproduce. Tracking and fixing this kind of bugs is really important for us.
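
The difference between the two channel kinds can be sketched as below. This assumes a recent Crystal, where the capacity is passed to `Channel(T).new` rather than through the `Channel::Buffered` class mentioned earlier in the thread; the variable names are illustrative only.

```crystal
# Buffered (capacity 1): the first send returns even though nobody is receiving yet.
buffered = Channel(Int32).new(1)
buffered.send(42) # does not block: the value is parked in the buffer
from_buffered = buffered.receive

# Unbuffered: send suspends the sending fiber until another fiber receives.
unbuffered = Channel(Int32).new
done = Channel(Nil).new

spawn do
  unbuffered.send(7) # completes only once the main fiber calls receive
  done.send(nil)
end

from_unbuffered = unbuffered.receive
done.receive # wait for the sender fiber to finish

puts from_buffered
puts from_unbuffered
```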

--
Ary Borenszweig         Manas Technology Solutions
[ar.phone]                      5258.5240       #ARY(279)
[us.phone]                      312.612.1050    #ARY(279)
[email]                         aboren...@manas.com.ar
[web]                           www.manas.com.ar

Kostya M

Jul 2, 2016, 12:15:48 PM
to crysta...@googlegroups.com
Here is the code: https://gist.github.com/kostya/7388768b1363e95852c5367a16b5900f
I removed any significant data from it, so it is just an example.
As you can see, the `every` pattern is very common; it comes from Celluloid. Maybe add it to the language?
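
For readers without the gist at hand, an `every`-style helper might look like the sketch below. It is a hypothetical stand-in, not the code from the gist and not part of the standard library: it simply runs a block in its own fiber once per interval.

```crystal
# Hypothetical helper: runs the block in a separate fiber once per interval
# for as long as the program keeps running.
def every(interval : Time::Span, &block : ->)
  spawn do
    loop do
      sleep interval
      block.call
    end
  end
end

ticks = 0
every(10.milliseconds) { ticks += 1 }

sleep 55.milliseconds # let the timer fiber run a few times
puts ticks
```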

Kostya M

Jul 2, 2016, 2:16:33 PM
to crysta...@googlegroups.com
It seems I found the cause of the channel bug: the code added one fetch object to the pool twice (a typo, line 273 should be deleted): https://gist.github.com/kostya/7388768b1363e95852c5367a16b5900f#file-xx-cr-L273-L274
The first result sender was then fine, but the second hung, because no one received from it.
So there is no Crystal bug.
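
The failure mode described here, two senders on an unbuffered channel with only one receiver, can be reproduced in isolation with a sketch like the following. It is not taken from the gist: it assumes a recent Crystal with `select` and `timeout`, and uses the timeout only to make the stuck sender observable instead of hanging.

```crystal
ch = Channel(String).new          # unbuffered, like Channel(Result).new in the post
outcomes = Channel(Symbol).new(2) # buffered so that reporting never blocks

2.times do |i|
  spawn do
    select
    when ch.send("result #{i}")
      outcomes.send(:delivered)
    when timeout(100.milliseconds)
      # With a plain ch.send this fiber would stay suspended forever.
      outcomes.send(:nobody_received)
    end
  end
end

first = ch.receive # the only receive, as in handle_socket
report = [outcomes.receive, outcomes.receive]

puts first
p report
```

A buffer of size 1 hides the symptom because the extra send is parked in the buffer, which is presumably why the earlier workaround appeared to fix it.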

Ary Borenszweig

Jul 3, 2016, 6:25:06 PM
to crysta...@googlegroups.com
Phew...! So good there was no bug in Crystal :-)

We recently started working on supporting multiple threads with Juan. Once we finish that we can start thinking and adding some patterns to the standard library. We could add those patterns right now, but they will need a lot of changes once multiple threads are supported, so it's better to wait to avoid double work.



Kostya M

Jul 20, 2016, 5:09:54 PM
to crysta...@googlegroups.com
I wrote another piece of production middleware in Crystal. The process accepts HTTP task requests (Kemal) and reads the task result from the memcached cache if it is there; otherwise it sends the task to Redis with lpush (for another system), handles the result via a Redis subscribe in another coroutine with a little processing, saves it to memcached, and sends it back to the HTTP connection as the result (through a channel). It also writes logs and counters to statsd. It was so fast and easy to write (200 lines), yet it solves a problem that is not easy. I can't imagine coding it as quickly and as stably in Ruby. Shards used: sdogruyol/kemal, stefanwille/crystal-redis, kuende/memcached, ysbaddaden/pool, kostya/redis-reconnect, miketheman/statsd.cr. This is one process; it uses 25%-50% CPU and handles ~500-800 tasks in parallel and ~2 mln tasks per day. Memory usage is 80 MB. So Crystal is really nice for writing this kind of middleware between systems.


Ary Borenszweig

Jul 20, 2016, 7:27:23 PM
to crysta...@googlegroups.com
Wow, that's really nice to read, Kostya!

I think microservices is an area where Crystal can really shine :-)

Out of curiosity, because of the heavy load of your services, what does your company do? Or, hmmm... can the name be disclosed?

