So I have a project (https://github.com/knewter/rules_engine) that loops a function 10k times. The function is very basic: it does a numeric comparison on a field in a two-level nested struct and returns a different struct depending on the result. It gets through 10k function invocations in 2.4ms.
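To give a sense of the shape of that function without digging into the repo, here is a minimal sketch. Every module, struct, and field name in it is made up for illustration and is not what rules_engine actually uses; only the overall pattern (match into a nested struct, compare a number, return one of two structs) follows the description above.

```elixir
# Hypothetical names throughout; this is not the code from rules_engine.
defmodule Sketch.Customer do
  defstruct age: 0
end

defmodule Sketch.Order do
  # Two levels of nesting: Order -> Customer -> age
  defstruct customer: %Sketch.Customer{}
end

defmodule Sketch.Approved do
  defstruct reason: nil
end

defmodule Sketch.Rejected do
  defstruct reason: nil
end

defmodule Sketch.Rule do
  alias Sketch.{Order, Customer, Approved, Rejected}

  # Numeric comparison on a field inside the nested struct,
  # returning a different struct depending on the outcome.
  def check(%Order{customer: %Customer{age: age}}) when age >= 18 do
    %Approved{reason: :adult}
  end

  def check(%Order{}) do
    %Rejected{reason: :minor}
  end

  # Invoke the rule n times, the way the benchmark loops it.
  def run(order, n) do
    Enum.each(1..n, fn _ -> check(order) end)
  end
end
```

Calling `Sketch.Rule.run(%Sketch.Order{customer: %Sketch.Customer{age: 30}}, 10_000)` would then correspond to one "10k" iteration of the benchmark.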
I then compiled my whole project using HiPE, hoping for a speedup and also just to play a bit. Counterintuitively, it runs more slowly after compiling with HiPE. I've included a couple of benchfella results below.
I've got protocols consolidated in all environments.
In these examples, the number I'm looking at is the "running 10000 compiled rules" benchmark. The full runs are at the bottom, but the comparison (HiPE comes out roughly 12% slower) is:
- with HiPE: 2.7ms
- with BEAM: 2.4ms
When asking in #erlang about things that might make HiPE go slower, I got this response:
<kvakvs> jadams: when not all modules in your app are compiled with hipe and it has to switch many times between hipe, erlang and bifs/nifs
<kvakvs> jadams: there is stat you can check, somewhere in process or vm info, context switches
<kvakvs> erlang:statistics/1 with arg 'context_switches'
<kvakvs> if it grows fast, you're in trouble with different hipe/non hipe modules coexisting in a busy app
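For anyone who wants to try the suggestion above from Elixir, here is a rough sketch of reading that counter before and after a piece of work. The `ContextSwitchProbe` module and its `measure/1` function are names I made up; the only real API used is `:erlang.statistics(:context_switches)`, which returns a `{count, 0}` tuple. Note that the counter is VM-wide, so other processes contribute to it, and whether its growth really reflects HiPE/BEAM transitions is the claim from the chat, not something this sketch verifies.

```elixir
# Sketch only: `work` stands in for whatever code is being benchmarked.
defmodule ContextSwitchProbe do
  # Runs a zero-arity function and returns {result, context_switch_delta}.
  def measure(work) when is_function(work, 0) do
    {before_count, 0} = :erlang.statistics(:context_switches)
    result = work.()
    {after_count, 0} = :erlang.statistics(:context_switches)
    {result, after_count - before_count}
  end
end

{_result, switches} = ContextSwitchProbe.measure(fn -> Enum.sum(1..10_000) end)
IO.puts("context switches during run: #{switches}")
```

Comparing the delta for the same workload with and without native compilation would be the obvious way to use it.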
My only remaining thought is that Elixir itself is not compiled with HiPE, and that this is what causes the slowdown once my project's modules are.
Any thoughts? I'm going to go compile Elixir using HiPE and see if I get better results.
# With HiPE modules: 2711.80 µs/op
To be clear, I compiled the modules with HiPE using `ERL_COMPILER_OPTIONS="[native,{hipe, [verbose, o3]}]" mix compile --force` and confirmed (thanks to Jose's help) that they were compiled with the appropriate options. Adding `verbose` also makes this pretty clear in the compiler output; not having it was a mistake I was making previously.
[jadams:~/elixir/rules_engine] master(+28/-21)* 9s ± mix bench
Settings:
duration: 1.0 s
## RulesTimingBench
[07:02:22] 1/4: running 10000 rules
[07:02:27] 2/4: running 10000 compiled rules
[07:02:29] 3/4: running 100 rules
[07:02:31] 4/4: running 100 compiled rules
Finished in 11.92 seconds
## RulesTimingBench
running 100 compiled rules        100000   30.39 µs/op
running 10000 compiled rules         500   2711.80 µs/op
running 100 rules                     50   36600.28 µs/op
running 10000 rules                    1   4531512.00 µs/op
# With BEAM modules: 2413.99 µs/op
[jadams:~/elixir/rules_engine] master(+28/-21)* ± mix bench
Settings:
duration: 1.0 s
## RulesTimingBench
[07:04:23] 1/4: running 10000 rules
[07:04:27] 2/4: running 10000 compiled rules
[07:04:30] 3/4: running 100 rules
[07:04:32] 4/4: running 100 compiled rules
Finished in 10.83 seconds
## RulesTimingBench
running 100 compiled rules         50000   30.51 µs/op
running 10000 compiled rules        1000   2413.99 µs/op
running 100 rules                     50   34522.36 µs/op
running 10000 rules                    1   4127699.00 µs/op
--
Josh Adams