QLC Performances

Jean Parpaillon

unread,

Oct 6, 2021, 8:37:42 AM10/6/21

to erlang-questions, Hans Bolinder, Richard Carlsson, Björn Gustavsson, io.pasc...@gmail.com

Dear all,
With Pascal in CC, I'm working on an elixir Ecto to mnesia adapter.

Ecto queries are represented as an AST, which is compiled into database
specific calls:
* `mnesia:read/2` for basic queries
* Qlc for others

We are facing huge performance issues and have tried to optimize the
library in many ways, but it looks like the performance bottleneck is
now in Qlc itself.

The library now includes a benchmark. In this merge request, you will
find a benchmark that fetches 1000 records randomly from a table, by a
non-key column.

There a 3 implementations:
* mnesia.get.int.idx -> `mnesia:index_read/3`
* qlc.get.int.idx -> use hand written qlc query ([Rec || {'_', '_',
IndexedIntField, '_', '_', '_', '_', '_'} = Rec <-
mnesia:table('test_table'), IndexedIntField == B0].')
* ecto.get.int.idx -> use Ecto adapter

Here is the result of the benchmark:

Name ips average deviation
median 99th %
mnesia.get.int.idx 30.38 0.0329 s ±21.16% 0.0308
s 0.0655 s
qlc.get.int.idx 0.62 1.61 s ±0.81% 1.60
s 1.63 s
ecto.get.int.idx 0.34 2.94 s ±1.90% 2.93
s 3.02 s

Comparison:
mnesia.get.int.idx 30.38
qlc.get.int.idx 0.62 - 48.82x slower +1.57 s
ecto.get.int.idx 0.34 - 89.44x slower +2.91 s

Memory usage statistics:

Name average deviation median
99th %
mnesia.get.int.idx 0.00496 GB ±0.10% 0.00496 GB
0.00497 GB
qlc.get.int.idx 1.71 GB ±0.00% 1.71 GB
1.71 GB
ecto.get.int.idx 1.55 GB ±0.00% 1.55 GB
1.55 GB

Comparison:
mnesia.get.int.idx 0.00496 GB
qlc.get.int.idx 1.71 GB - 344.04x memory usage +1.70 GB
ecto.get.int.idx 1.55 GB - 313.04x memory usage +1.55 GB

We can observe a small overhead between plain qlc and Ecto, but a MAJOR
performance drop between plain mnesia and qlc.

Has this performance drop been observed in other applications ? May I
expect better performance with a different usage of qlc ?

I may try to fix qlc, would some bottlenecks have been identified or
someone give me some hints on where those bottlenecks can be.

I have also had a look at erl_syntax module from syntax_tools. Does it
seem a better approach to use erl_syntax for generating query handles,
rather than existing qlc parse transform ?

I need and willing to help on this topic, and every hint is welcome.

Best regards,
--
Jean Parpaillon
--
Software Engineering Consultant
Chairman @ OW2 Consortium - https://ow2.org/
Président L'Insolite Compagnie - https://insolitecompagnie.com
--
Phone: +33 6 30 10 92 86
linkedin: http://www.linkedin.com/in/jeanparpaillon/en

Jean Parpaillon

unread,

Oct 6, 2021, 9:04:23 AM10/6/21

to erlang-questions, Hans Bolinder, Björn Gustavsson, io.pasc...@gmail.com

Sorry, the missing links:
Merge request with benchmarks:
https://gitlab.com/patatoid/ecto3_mnesia/-/merge_requests/56

Ecto Query to Qlc compiler:
https://gitlab.com/patatoid/ecto3_mnesia/-/blob/master/lib/ecto/adapters/mnesia/query/qlc.ex

Cheers,
Jean

Dániel Szoboszlay

unread,

Oct 8, 2021, 5:16:35 AM10/8/21

to Jean Parpaillon, erlang-questions, Hans Bolinder, Björn Gustavsson, io.pasc...@gmail.com

Hi Jean,

We have observed some performance issues with qlc-s and select-s that we were expecting to use a secondary index, but they turned out not to. To quote my colleague, Andreas (not on this list):

It seems like mnesia select is able to use it index only when doing exact match and when the match spec that it receives have the value to match in the match head (not in the match body):

Fast:
mnesia:dirty_select(pacc, ets:fun2ms(fun(#pacc{ano = 4611713374746180065} = P) -> P end)).

Slow:
mnesia:dirty_select(pacc, ets:fun2ms(fun(#pacc{ano = A} = P) when A =:= 4611713374746180065 -> P end)).

One problem with qlc is that it is easy to shoot yourself in the foot. It is important to put the guards that translate to match specs first.

However, I tried your hand written qlc query from the benchmark PR (the version you inlined in the email is not correct, the '_'-s within the qlc will be translated to matches against the literal underscore atom, not treated as wildcards), and it still executes a dirty_index_read (not even a dirty_select), which is fast. Qlc still has a significant overhead, in my case ~1300 us vs 45 us for a plain dirty_index_read. But that should be independent of the table's size, and shouldn't be measured in seconds.

Can you verify with tracing what are the actual mnesia calls your qlc queries perform? They should do (dirty_)index_read-s, but maybe they do a (dirty_)select or even full table scan instead?