Strange behavio(u)r in my tests

131 views
Skip to first unread message

Sébastien Varlet

unread,
Jun 26, 2016, 11:46:16 AM6/26/16
to elixir-lang-talk
Hi,

I'm currently working on a client for the Tumblr API. Turns out it's way more work than anticipated but still a good way to practice.

However, I'm running at the moment into a problem with a test case where each test is taking about 2 seconds to run. These tests setup their own cowboy server (thanks to bypass) in order to simulate the calls. I have written such tests previously and never had this issue.

This is the file I'm working on:

As you can see, the JSON response is read from a file 'posts.json' also in the github repo. It is a rather large file (400KB). So my first suspicion was about the size of the file and the time taken by Poison to parse it.

So I used ExProf to verify that:
FUNCTION                                                  CALLS        %    TIME  [uS / CALLS]
--------                                                  -----  -------    ----  [----------]
[...]
'Elixir.Enum':'-reduce/3-fun-1-'/4                          207     0.03     196  [      0.95]
'Elixir.ExTumblr.Utils.Parsing':'-to_struct/3-fun-0-'/2     206     0.04     198  [      0.96]
lists:reverse/1                                             105     0.04     226  [      2.15]
lists:foldl/3                                               236     0.04     246  [      1.04]
erlang:binary_to_atom/2                                     206     0.04     246  [      1.19]
hackney_bstr:'-to_lower/1-lbc$^0/2-0-'/2                    130     0.05     282  [      2.17]
inet_tcp:recv/3                                             280     0.05     303  [      1.08]
hackney_tcp_transport:recv/3                                280     0.05     303  [      1.08]
hackney_http:ce_identity/1                                  280     0.05     306  [      1.09]
prim_inet:recv/3                                            280     0.06     311  [      1.11]
prim_inet:recv0/3                                           280     0.06     320  [      1.14]
hackney_http:content_decode/3                               280     0.06     321  [      1.15]
'Elixir.String.Chars':to_string/1                           342     0.06     322  [      0.94]
'Elixir.String.Chars.Atom':to_string/1                      342     0.06     326  [      0.95]
gen_tcp:recv/3                                              280     0.06     326  [      1.16]
erlang:atom_to_binary/2                                     343     0.06     333  [      0.97]
hackney_response:stream_body_recv/2                         279     0.06     340  [      1.22]
'Elixir.Map':do_take/3                                      363     0.06     342  [      0.94]
erlang:port_get_data/1                                      282     0.06     344  [      1.22]
hackney_response:stream_body/2                              279     0.06     353  [      1.27]
hackney_http:te_identity/2                                  280     0.06     358  [      1.28]
hackney_http:transfer_decode/2                              280     0.06     363  [      1.30]
hackney_response:read_body/3                                281     0.07     382  [      1.36]
maps:'-fold/3-fun-0-'/3                                     207     0.07     398  [      1.92]
maps:find/2                                                 343     0.07     402  [      1.17]
hackney_bstr:char_to_lower/1                                115     0.08     475  [      4.13]
hackney_http:'-parse_body/1-fun-3-'/1                       280     0.11     608  [      2.17]
'Elixir.String.Chars':impl_for/1                            342     0.11     626  [      1.83]
'Elixir.ExTumblr.Post':'-parse/1-fun-0-'/1                  340     0.11     632  [      1.86]
'Elixir.String.Chars':'impl_for!'/1                         342     0.11     639  [      1.87]
hackney_http:execute/1                                      286     0.12     652  [      2.28]
'Elixir.Enum':'-map/2-lists^map/1-0-'/2                     381     0.12     670  [      1.76]
hackney_http:'-parse_body/1-fun-2-'/2                       280     0.12     674  [      2.41]
hackney_response:stream_body1/2                             560     0.12     685  [      1.22]
inet_db:lookup_socket/1                                     282     0.12     693  [      2.46]
prim_inet:async_recv/3                                      280     0.12     700  [      2.50]
hackney_response:recv/1                                     280     0.13     714  [      2.55]
prim_inet:enc_time/1                                        281     0.13     724  [      2.58]
prim_inet:ctl_cmd/3                                         287     0.13     741  [      2.58]
hackney_http:parse_body/1                                   561     0.13     761  [      1.36]
erlang:port_control/3                                       287     0.14     785  [      2.74]
'Elixir.Poison.Parser':array_values/3                       711     0.14     791  [      1.11]
'Elixir.Poison.Parser':number_frac/2                        788     0.15     830  [      1.05]
'Elixir.Poison.Parser':number_start/1                       788     0.15     838  [      1.06]
erlang:binary_to_integer/1                                  788     0.15     839  [      1.06]
'Elixir.Poison.Parser':number_int/2                         784     0.15     853  [      1.09]
hackney_response:stream_body/1                              281     0.16     883  [      3.14]
'Elixir.Poison.Parser':number_exp/3                         788     0.16     903  [      1.15]
maps:from_list/1                                            729     0.27    1514  [      2.08]
'Elixir.Poison.Parser':number_complete/2                    788     0.28    1571  [      1.99]
'Elixir.Poison.Parser':number_digits/1                      784     0.29    1620  [      2.07]
hackney_http:execute/2                                      567     0.29    1641  [      2.89]
erts_internal:port_control/3                                287     0.34    1905  [      6.64]
erlang:setelement/3                                        1728     0.39    2228  [      1.29]
'Elixir.Poison.Parser':object_pairs/3                      5077     1.00    5634  [      1.11]
'Elixir.Poison.Parser':number_digits_count/2               5947     1.20    6765  [      1.14]
'Elixir.Poison.Parser':object_name/2                       5077     1.80   10167  [      2.00]
'Elixir.Poison.Parser':value/2                             5769     2.14   12061  [      2.09]
erlang:iolist_to_binary/1                                  9324     2.23   12558  [      1.35]
erlang:binary_to_integer/2                                11587     2.58   14537  [      1.25]
'Elixir.Poison.Parser':string_escape/2                    19399     4.10   23114  [      1.19]
'Elixir.Poison.Parser':skip_whitespace/1                  21712     7.95   44850  [      2.07]
'Elixir.Poison.Parser':string_continue/2                  46845     9.96   56241  [      1.20]
'Elixir.Poison.Parser':string_chunk_size/2               303284    60.04  338888  [      1.12]
-------------------------------------------------------  ------  -------  ------  [----------]
Total:                                                   457768  100.00%  564404  [      1.23]

Indeed, parsing takes time (value indicated in µs) but this is about half a second. When I run my 4 tests I could expect the test suite to take 2 seconds if they run sequentially. Yet, the test suite takes more than 10 seconds to run.

> mix test --trace test/ex_tumblr/posts_test.exs

ExTumblr.PostTest
  * test parse/1 can transform a generic map into a Post struct (6.5ms)

ExTumblr.PostsTest
  * test emits request to the right path (195.3ms)
  * test parses the http response into a Posts struct (55.6ms)
  * test specifies the api key in the query (33.7ms)
  * test sends a GET request (28.5ms)


Finished in 10.1 seconds
5 tests, 0 failures

What I notice every time the suite runs is a ~10 seconds pause between the two test cases. To be more specific:
> mix test --trace test/ex_tumblr/posts_test.exs

ExTumblr.PostTest
  * test parse/1 can transform a generic map into a Post struct (6.5ms)
############### 10 SECONDS PAUSE HERE ##############
ExTumblr.PostsTest
  * test emits request to the right path (195.3ms)
  * test parses the http response into a Posts struct (55.6ms)
  * test specifies the api key in the query (33.7ms)
  * test sends a GET request (28.5ms)


Finished in 10.1 seconds
5 tests, 0 failures
If I comment all these tests but one, I observe the same behaviour with a ~2 seconds pause. Every test basically adds its own ~2s pause.
 
I already posted this in the slack chat where @micmus and @ericmj were really kind and helpful but we couldn't find the cause.

Can anybody find the problem ?

Thanks 

OvermindDL1

unread,
Jun 27, 2016, 10:06:00 AM6/27/16
to elixir-lang-talk
I've not seen those cowboy mocking libraries but is it possible that they have to be told to stop somehow, and instead there is maybe a 5s timeout on its process ending after yours ends and maybe another 5s start on the next?  5s is the default and you say 10s so this is the first thing that jumps out at me.  Maybe watch sleeping processes waiting on something during those 2 5s chunks?

Sébastien Varlet

unread,
Jun 27, 2016, 10:51:29 AM6/27/16
to elixir-lang-talk
Hi,

Thanks for your answer.

The Bypass documentation explains that it takes care of shutting down what was setup for each test (I guess their macros setup an on_exit callback). Also I coded many other tests the exact same way in the same codebase and they never caused this problem.

I verified, each test adds its own ~2 seconds not 5 (I have 5 tests therefore the test case takes 10 secs).

Also, when I replace the JSON document by a very small one such as:
 
 @prebaked_response """
  {
    "
response": {
      "
blog": {},
      "
posts": []
    }
  }
  """

the problem disappears. So if the size of the document is the issue, I'm back to square one: is the json decoding taking too long ? apparently no since mix test --trace reports a duration for each test between 30ms and 200ms. Is the JSON document passed/copied around and it's taking too long ? apparently no for the same reason...

I'm really confused here.

José Valim

unread,
Jun 27, 2016, 11:16:44 AM6/27/16
to elixir-l...@googlegroups.com
The next course of action is to try to isolate the error. If your tests do nothing, do you still get the slowdown? What if you add a Bypass to a new application? Maybe something is holding the requests on the server? Try removing or moving components around until you find the minimum sample that reproduces it.
--
You received this message because you are subscribed to the Google Groups "elixir-lang-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-ta...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-talk/fb765078-480d-4078-bd9c-2e4b0a2d76ef%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--


José Valim
Skype: jv.ptec
Founder and Director of R&D

Sébastien Varlet

unread,
Jun 27, 2016, 11:27:36 AM6/27/16
to elixir-lang-talk, jose....@plataformatec.com.br
Hi Jose,

In case it helps, I also ran this test case and watched some surprising results in observer. They are concerning but they still don't tell me what's going on.

"If your tests do nothing, do you still get the slowdown?"
-> I tried a test "lol" do assert true end, it's fast :)

"What if you add a Bypass to a new application?"
-> What do you mean ?


To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-talk+unsubscribe@googlegroups.com.

Sébastien Varlet

unread,
Jul 4, 2016, 4:28:27 AM7/4/16
to elixir-lang-talk, jose....@plataformatec.com.br
Hi there,

There is an update to this problem. As explained by @mosic in the elixir-forum (https://elixirforum.com/t/strange-behavio-u-r-in-my-tests/939/5), the timings improve a lot if the prebaked_response module attribute is replaced by a context property set by a setup callback.

It is very counterintuitive as I explain in this thread of the elixir-forum as module attributes are supposed to be set once at compile time. Doing the same in a setup callback means the load is moved from compile time to runtime and the JSON file will be read as many times as there are tests in the test case.

After applying @mosic's patch, I observed the same significant improvement of the test timings.

@mosic and I would really like to understand why.

José Valim

unread,
Jul 4, 2016, 4:31:24 AM7/4/16
to Sébastien Varlet, elixir-lang-talk
I will investigate. Can you please post your exact Erlang and Elixir versions? If you start IEx with "iex" you can see both.



José Valim
Skype: jv.ptec
Founder and Director of R&D

To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-ta...@googlegroups.com.

Sébastien Varlet

unread,
Jul 4, 2016, 5:30:28 AM7/4/16
to elixir-lang-talk, sebastie...@gmail.com, jose....@plataformatec.com.br
Hi José,

Sure:
Erlang/OTP 18 [erts-7.3] [source] [64-bit] [smp:8:8] [async-threads:10] [hipe] [kernel-poll:false]
Interactive Elixir (1.3.0)

Thanks
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-talk+unsubscribe@googlegroups.com.

José Valim

unread,
Jul 7, 2016, 5:01:17 AM7/7/16
to Sébastien Varlet, elixir-lang-talk
For those following I have explained the final reason in the forum: https://elixirforum.com/t/strange-behavio-u-r-in-my-tests/939/10



José Valim
Skype: jv.ptec
Founder and Director of R&D

To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-ta...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages