strange crashes of logger and supervisor

5 views
Skip to first unread message

Andreas Schultz

unread,
Oct 9, 2020, 5:47:37 AM10/9/20
to erlang-questions
Hi,

If got some really strange crashes of the logger and errors from supervisor on 22.1.7:

2020-10-08T20:29:11.310911+00:00 error: Generic server <0.934.0> terminating. Reason: {function_clause,[{logger_proxy,handle_info,[{#Ref<0.3534272603.1552154625.127680>,ok},no_state],[{file,"logger_proxy.erl"},{line,136}]},{logger_olp,try_callback_call,4,[{file,"logger_olp.erl"},{line,598}]},{logger_olp,handle_info,2,[{file,"logger_olp.erl"},{line,270}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,680}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,756}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}. Last message: {#Ref<0.3534272603.1552154625.127680>,ok}. State: #{burst_limit_enable => false,burst_limit_max_count => 500,burst_limit_window_time => 1000,burst_msg_count => 0,burst_win_ts => -576460751230297,cb_state => no_state,drop_mode_qlen => 1000,flush_qlen => 5000,id => logger_proxy,idle => false,last_load_ts => -575306488755097,last_qlen => 1,mode => async,mode_ref => {logger_olp,logger_proxy},module => logger_proxy,overload_kill_enable => false,overload_kill_mem_size => 3000000,overload_kill_qlen => 20000,overload_kill_restart_after => 5000,sync_mode_qlen => 500}.
2020-10-08T20:29:12.179509+00:00 error: Supervisor received unexpected message: {#Ref<0.3534272603.1029701639.174367>,ok}
2020-10-08T20:29:13.992900+00:00 error: Supervisor received unexpected message: {#Ref<0.3534272603.1029701639.174370>,ok}
2020-10-08T20:29:12.797713+00:00 error: crasher: initial call: logger_olp:init/1, pid: <0.934.0>, registered_name: [], error: {function_clause,[{logger_proxy,handle_info,[{#Ref<0.3534272603.1552154625.127680>,ok},no_state],[{file,"logger_proxy.erl"},{line,136}]},{logger_olp,try_callback_call,4,[{file,"logger_olp.erl"},{line,598}]},{logger_olp,handle_info,2,[{file,"logger_olp.erl"},{line,270}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,680}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,756}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}, ancestors: [logger_sup,kernel_sup,<0.902.0>], message_queue_len: 1, messages: [{#Ref<0.3534272603.1552154625.128792>,ok}], links: [<0.932.0>], dictionary: [{olp_ref,{logger_proxy,<0.934.0>,{logger_olp,logger_proxy}}}], trap_exit: true, status: running, heap_size: 987, stack_size: 28, reductions: 84949; neighbours:
2020-10-08T20:29:15.492989+00:00 error: Supervisor received unexpected message: {#Ref<0.3534272603.1322254338.95824>,ok}
2020-10-08T20:29:21.399535+00:00 error: Supervisor received unexpected message: {#Ref<0.3534272603.1322254338.95915>,ok}
2020-10-08T20:29:22.982660+00:00 error: Supervisor: {local,logger_sup}. Context: child_terminated. Reason: {function_clause,[{logger_proxy,handle_info,[{#Ref<0.3534272603.1552154625.127680>,ok},no_state],[{file,"logger_proxy.erl"},{line,136}]},{logger_olp,try_callback_call,4,[{file,"logger_olp.erl"},{line,598}]},{logger_olp,handle_info,2,[{file,"logger_olp.erl"},{line,270}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,680}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,756}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}. Offende

It seems that somehow unhandled {Ref, ok} message got into the inbox of the servers. I'm pretty sure that my code is not sending those messages to random processes. Does anyone have an idea how this could happen?

One clue might be that the system was under heavy load, possibly overloaded as this log entry suggests:

2020-10-08T20:28:15.676829+00:00 error: Error in process <0.4111.816> on node 'gtphub-c-node@ash-prd-tvp-gtphub-mvne-2' with exit value:, {{badmatch,{error,timeout}},[{cpu_sup,measurement_server_init,0,[{file,"cpu_sup.erl"},{line,502}]}]}

The stray message could be left over from a timed out gen call. If that is the case and late replies from calls can happen, then clearly the logger process should handle them more gracefully.


--

Andreas Schultz

Lukas Larsson

unread,
Oct 9, 2020, 7:44:21 AM10/9/20
to Andreas Schultz, erlang-questions
Hello,

This seems to be the logger proxy, i.e. the process that logs messages sent by the emulator and other nodes. 

It should not crash when receiving these late replies to gen_server:call/2. I will prepare a patch for it.

In OTP-24 there is a proposal for process aliases (https://github.com/erlang/otp/pull/2735) that should get rid of these types of bugs.

Lukas

Lukas Larsson

unread,
Dec 1, 2020, 2:56:25 AM12/1/20
to Andreas Schultz, erlang-questions
Merged a fix to maint for this problem: https://github.com/erlang/otp/pull/2884
Reply all
Reply to author
Forward
0 new messages