Hi,
If got some really strange crashes of the logger and errors from supervisor on 22.1.7:
2020-10-08T20:29:11.310911+00:00 error: Generic server <0.934.0> terminating. Reason: {function_clause,[{logger_proxy,handle_info,[{#Ref<0.3534272603.1552154625.127680>,ok},no_state],[{file,"logger_proxy.erl"},{line,136}]},{logger_olp,try_callback_call,4,[{file,"logger_olp.erl"},{line,598}]},{logger_olp,handle_info,2,[{file,"logger_olp.erl"},{line,270}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,680}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,756}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}. Last message: {#Ref<0.3534272603.1552154625.127680>,ok}. State: #{burst_limit_enable => false,burst_limit_max_count => 500,burst_limit_window_time => 1000,burst_msg_count => 0,burst_win_ts => -576460751230297,cb_state => no_state,drop_mode_qlen => 1000,flush_qlen => 5000,id => logger_proxy,idle => false,last_load_ts => -575306488755097,last_qlen => 1,mode => async,mode_ref => {logger_olp,logger_proxy},module => logger_proxy,overload_kill_enable => false,overload_kill_mem_size => 3000000,overload_kill_qlen => 20000,overload_kill_restart_after => 5000,sync_mode_qlen => 500}.
2020-10-08T20:29:12.179509+00:00 error: Supervisor received unexpected message: {#Ref<0.3534272603.1029701639.174367>,ok}
2020-10-08T20:29:13.992900+00:00 error: Supervisor received unexpected message: {#Ref<0.3534272603.1029701639.174370>,ok}
2020-10-08T20:29:12.797713+00:00 error: crasher: initial call: logger_olp:init/1, pid: <0.934.0>, registered_name: [], error: {function_clause,[{logger_proxy,handle_info,[{#Ref<0.3534272603.1552154625.127680>,ok},no_state],[{file,"logger_proxy.erl"},{line,136}]},{logger_olp,try_callback_call,4,[{file,"logger_olp.erl"},{line,598}]},{logger_olp,handle_info,2,[{file,"logger_olp.erl"},{line,270}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,680}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,756}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}, ancestors: [logger_sup,kernel_sup,<0.902.0>], message_queue_len: 1, messages: [{#Ref<0.3534272603.1552154625.128792>,ok}], links: [<0.932.0>], dictionary: [{olp_ref,{logger_proxy,<0.934.0>,{logger_olp,logger_proxy}}}], trap_exit: true, status: running, heap_size: 987, stack_size: 28, reductions: 84949; neighbours:
2020-10-08T20:29:15.492989+00:00 error: Supervisor received unexpected message: {#Ref<0.3534272603.1322254338.95824>,ok}
2020-10-08T20:29:21.399535+00:00 error: Supervisor received unexpected message: {#Ref<0.3534272603.1322254338.95915>,ok}
2020-10-08T20:29:22.982660+00:00 error: Supervisor: {local,logger_sup}. Context: child_terminated. Reason: {function_clause,[{logger_proxy,handle_info,[{#Ref<0.3534272603.1552154625.127680>,ok},no_state],[{file,"logger_proxy.erl"},{line,136}]},{logger_olp,try_callback_call,4,[{file,"logger_olp.erl"},{line,598}]},{logger_olp,handle_info,2,[{file,"logger_olp.erl"},{line,270}]},{gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,680}]},{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,756}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,226}]}]}. Offende
It seems that somehow unhandled {Ref, ok} message got into the inbox of the servers. I'm pretty sure that my code is not sending those messages to random processes. Does anyone have an idea how this could happen?
One clue might be that the system was under heavy load, possibly overloaded as this log entry suggests:
2020-10-08T20:28:15.676829+00:00 error: Error in process <0.4111.816> on node 'gtphub-c-node@ash-prd-tvp-gtphub-mvne-2' with exit value:, {{badmatch,{error,timeout}},[{cpu_sup,measurement_server_init,0,[{file,"cpu_sup.erl"},{line,502}]}]}
The stray message could be left over from a timed out gen call. If that is the case and late replies from calls can happen, then clearly the logger process should handle them more gracefully.
--