Prometheus Mnesia Error Upgrading 3.8 -> 3.9

Tommy Curley

Aug 19, 2021, 3:34:58 PM
to rabbitmq-users
Hi All,

I recently tried to upgrade my RabbitMQ installation (a 3-node cluster running 3.8.0-alpine, hosted in Kubernetes) to 3.9.x. After the upgrade, the RabbitMQ service came up and was functioning, but no metrics were being reported to Prometheus. I was getting the following error:

2021-08-18 15:36:11.732445+00:00 [erro] <0.735.0> 
2021-08-18 15:36:13.898384+00:00 [info] <0.3462.0> accepting AMQP connection <0.3462.0> (10.32.0.32:56732 -> 10.32.6.55:5672)
2021-08-18 15:36:13.898507+00:00 [erro] <0.3462.0> closing AMQP connection <0.3462.0> (10.32.0.32:56732 -> 10.32.6.55:5672):
2021-08-18 15:36:13.898507+00:00 [erro] <0.3462.0> {bad_header,<<"GET /met">>}
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>   crasher:
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     initial call: cowboy_stream_h:request_process/3
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     pid: <0.3484.0>
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     registered_name: []
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     exception error: an error occurred when evaluating an arithmetic expression
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>       in operator  +/2
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>          called as undefined + 0
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>       in call from prometheus_mnesia_collector:'-get_memory_usage/0-fun-0-'/2 (src/collectors/mnesia/prometheus_mnesia_collector.erl, line 199)
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>       in call from lists:foldl/3 (lists.erl, line 1267)
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>       in call from prometheus_mnesia_collector:get_memory_usage/0 (src/collectors/mnesia/prometheus_mnesia_collector.erl, line 201)
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>       in call from prometheus_mnesia_collector:metrics/1 (src/collectors/mnesia/prometheus_mnesia_collector.erl, line 124)
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>       in call from prometheus_mnesia_collector:collect_mf/2 (src/collectors/mnesia/prometheus_mnesia_collector.erl, line 108)
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>       in call from prometheus_collector:collect_mf/3 (src/prometheus_collector.erl, line 156)
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>       in call from prometheus_registry:'-collect/2-lc$^0/1-0-'/3 (src/prometheus_registry.erl, line 86)
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     ancestors: [<0.735.0>,<0.686.0>,<0.684.0>,<0.683.0>,<0.681.0>,
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>                   rabbit_web_dispatch_sup,<0.599.0>]
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     message_queue_len: 0
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     messages: []
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     links: [<0.735.0>,#Port<0.723>]
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     dictionary: []
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     trap_exit: false
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     status: running
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     heap_size: 6772
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     stack_size: 28
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>     reductions: 4853
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0>   neighbours:
2021-08-18 15:36:21.731849+00:00 [erro] <0.3484.0> 
2021-08-18 15:36:21.732461+00:00 [erro] <0.735.0> Ranch listener {acceptor,{0,0,0,0,0,0,0,0},15692}, connection process <0.735.0>, stream 107 had its request process <0.3484.0> exit with reason badarith and stacktrace [{erlang,'+',[undefined,0],[{error_info,#{module => erl_erts_errors}}]},{prometheus_mnesia_collector,'-get_memory_usage/0-fun-0-',2,[{file,"src/collectors/mnesia/prometheus_mnesia_collector.erl"},{line,199}]},{lists,foldl,3,[{file,"lists.erl"},{line,1267}]},{prometheus_mnesia_collector,get_memory_usage,0,[{file,"src/collectors/mnesia/prometheus_mnesia_collector.erl"},{line,201}]},{prometheus_mnesia_collector,metrics,1,[{file,"src/collectors/mnesia/prometheus_mnesia_collector.erl"},{line,124}]},{prometheus_mnesia_collector,collect_mf,2,[{file,"src/collectors/mnesia/prometheus_mnesia_collector.erl"},{line,108}]},{prometheus_collector,collect_mf,3,[{file,"src/prometheus_collector.erl"},{line,156}]},{prometheus_registry,'-collect/2-lc$^0/1-0-',3,[{file,"src/prometheus_registry.erl"},{line,86}]}]
2021-08-18 15:36:21.732461+00:00 [erro] <0.735.0> 
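Reading the stack trace, the crash is a badarith from evaluating undefined + 0 inside prometheus_mnesia_collector:get_memory_usage/0, which folds over every mnesia table and sums a per-table memory figure. A minimal sketch of what that fold seems to be doing (reconstructed only from the trace, not copied from the prometheus.erl source; names and structure are illustrative) would be:

%% Hypothetical reconstruction based only on the stack trace above,
%% not the actual prometheus.erl code.
get_memory_usage() ->
    WordSize = erlang:system_info(wordsize),
    Sum = lists:foldl(
            fun(Table, Acc) ->
                    %% mnesia:table_info/2 apparently returns 'undefined' here
                    %% (e.g. for a table with no local copy on this node), and
                    %% then 'undefined + Acc' raises badarith, matching the log.
                    mnesia:table_info(Table, memory) + Acc
            end,
            0,
            mnesia:system_info(tables)),
    WordSize * Sum.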


There were no alarms, all diagnostics checks were green, and as far as I could tell the system was running exactly as expected, with the exception of emitting metrics to Prometheus.

Does anyone have any insight into this?

I haven't found a way around this; for now, I have been able to upgrade to 3.8.21 successfully instead.
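For anyone who wants to check whether the same thing is happening on their node, my guess (and it is only a guess) is that mnesia:table_info/2 returns undefined for at least one table on the node being scraped. Running an Erlang expression like this through rabbitmqctl eval should show it:

%% List each mnesia table with the memory value the collector would read;
%% an 'undefined' entry would explain the 'undefined + 0' badarith above.
[{T, mnesia:table_info(T, memory)} || T <- mnesia:system_info(tables)].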

Thanks,
Tommy

Iliia Khaprov

Aug 20, 2021, 3:10:46 AM
to rabbitm...@googlegroups.com
Hi,

Please create an issue for prometheus.erl: https://github.com/deadtrickster/prometheus.erl

Thanks

Tommy Curley

Aug 20, 2021, 10:37:48 AM
to rabbitmq-users
Since this dependency is breaking a major piece of functionality, is there any possibility of reverting the version bump, or at least of verifying that this issue is not specific to my installation?

Security vulnerabilities have been addressed in 3.9.x, but my cluster relies heavily on the Prometheus + Grafana integration for monitoring.

Thanks,
Tommy
