Thruk becomes slow and Grafana using Thruk fails when a backend is unreachable


Fabrice Le Dorze

Dec 16, 2024, 5:10:05 AM
to Thruk
Hi,
We had a power outage at one site, which made its CheckMK satellite unreachable.
Since then, the central Thruk has been slow and the Grafana panels that use the central Thruk fail.
The installation is:
* Thruk 3.16
* Grafana 11.2 with sni-thruk-datasource @ 2.0.4

Setting hidden to 1 for that backend, either through the web GUI or in /etc/thruk/thruk_local.conf, solves the problem.

But is there a way to do it from the command line?
Thanks
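
One possible command-line route, as a rough sketch rather than an official Thruk tool: since hidden is just a setting inside a backend's <peer> block, a small script can flip it in the config file. The Python below assumes the usual Apache-style <peer> ... </peer> layout in /etc/thruk/thruk_local.conf with one "name = ..." line per peer. The function name set_peer_hidden and the sample peer names are made up for illustration.

```python
import re

# Assumption: each backend is a <peer> ... </peer> block containing a
# "name = ..." line and, optionally, a "hidden = 0|1" line.
NAME_RE = re.compile(r"^\s*name\s*=\s*(.*?)\s*$")
HIDDEN_RE = re.compile(r"^(\s*)hidden\s*=.*$")


def _patch_block(block, peer_name, value):
    """Set hidden=<value> in one <peer> block if its name matches."""
    names = [m.group(1).strip("'\"") for m in map(NAME_RE.match, block) if m]
    if peer_name not in names:
        return block
    if any(HIDDEN_RE.match(line) for line in block):
        # Rewrite the existing hidden line, keeping its indentation.
        return [HIDDEN_RE.sub(rf"\g<1>hidden = {value}", line) for line in block]
    # No hidden line yet: add one right after the first name line.
    patched, added = [], False
    for line in block:
        patched.append(line)
        if not added and NAME_RE.match(line):
            indent = line[: len(line) - len(line.lstrip())]
            patched.append(f"{indent}hidden = {value}")
            added = True
    return patched


def set_peer_hidden(conf_text, peer_name, hidden):
    """Return conf_text with hidden set for the peer called peer_name."""
    value = 1 if hidden else 0
    out, block = [], None
    for line in conf_text.splitlines():
        tag = line.strip().lower()
        if block is None:
            if tag == "<peer>":
                block = [line]
            else:
                out.append(line)
        else:
            block.append(line)
            if tag == "</peer>":
                out.extend(_patch_block(block, peer_name, value))
                block = None
    if block is not None:  # unterminated block: leave it untouched
        out.extend(block)
    return "\n".join(out) + "\n"


# Hypothetical config text, for demonstration only.
SAMPLE = """<Component Thruk::Backend>
  <peer>
    name = site_a
    type = livestatus
  </peer>
</Component>
"""
print(set_peer_hidden(SAMPLE, "site_a", True))
```

To apply it, read the file, pass its text through set_peer_hidden, and write the result back. Depending on the setup, Thruk may need a reload before it picks up the change.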

Sven Nierlein

Dec 16, 2024, 5:25:20 AM
to th...@googlegroups.com, Fabrice Le Dorze
Hi,

I suggest using LMD when connecting to multiple or remote backends.
See https://thruk.org/documentation/lmd.html

Regards,
Sven
> --
> You received this message because you are subscribed to the Google Groups "Thruk" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to thruk+un...@googlegroups.com <mailto:thruk+un...@googlegroups.com>.
> To view this discussion visit https://groups.google.com/d/msgid/thruk/3f2d9bbf-a474-4d0f-96b4-5e1cfd85c214n%40googlegroups.com <https://groups.google.com/d/msgid/thruk/3f2d9bbf-a474-4d0f-96b4-5e1cfd85c214n%40googlegroups.com?utm_medium=email&utm_source=footer>.


Fabrice Le Dorze

Dec 16, 2024, 5:40:08 AM
to Thruk
Even when the backends are CheckMK ones?
I got:
lmd error - /var/cache/thruk/lmd/live.sock:
peer $VAR1 = '/var/cache/thruk/lmd/live.sock';
statement $VAR1 = 'GET hosts
Backends: 36a9e 229c2
Columns: accept_passive_checks acknowledged action_url action_url_expanded active_checks_enabled address alias check_command check_freshness check_interval check_options check_period check_type checks_enabled childs comments current_attempt current_notification_number event_handler event_handler_enabled execution_time custom_variable_names custom_variable_values first_notification_delay flap_detection_enabled groups has_been_checked high_flap_threshold icon_image icon_image_alt icon_image_expanded is_executing is_flapping last_check last_notification last_state_change latency low_flap_threshold max_check_attempts name next_check notes notes_expanded notes_url notes_url_expanded notification_interval notification_period notifications_enabled num_services_crit num_services_ok num_services_pending num_services_unknown num_services_warn num_services obsess_over_host parents percent_state_change perf_data plugin_output process_performance_data retry_interval scheduled_downtime_depth state state_type modified_attributes_list last_time_down last_time_unreachable last_time_up display_name in_check_period in_notification_period has_long_plugin_output last_state_change_order lmd_last_cache_update depends_exec depends_notify peer_key
Filter: state = 1
Filter: has_been_checked = 1
And: 2
Filter: groups !>= OT
Filter: groups !>= NonProd
Filter: groups !>= DockerContainers
And: 4
Sort: name asc
OutputFormat: wrapped_json
ResponseHeader: fixed16
';

Sven Nierlein

Dec 16, 2024, 5:45:15 AM
to th...@googlegroups.com, Fabrice Le Dorze
It should work with anything that speaks Livestatus, and that includes CheckMK.
So far I don't see an error message, just the query itself. Is it
possible to update Thruk as well?
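
Since the output quoted above shows only the query and not why it failed, one way to narrow things down might be to send a small Livestatus query to the LMD socket directly. The sketch below is not taken from Thruk or LMD. It builds a query with the same "ResponseHeader: fixed16" option seen in the dump and parses the 16-byte header (3-digit status code, a space, an 11-character length, a newline). The helper names are hypothetical.

```python
import json
import socket


def build_query(table, columns):
    """Build a Livestatus query asking for JSON output and a fixed16 header."""
    return (
        f"GET {table}\n"
        f"Columns: {' '.join(columns)}\n"
        "OutputFormat: json\n"
        "ResponseHeader: fixed16\n"
        "\n"  # an empty line ends the query
    )


def parse_fixed16(header):
    """Split the 16-byte response header into (status_code, body_length)."""
    if len(header) != 16:
        raise ValueError("fixed16 header must be exactly 16 bytes")
    text = header.decode("ascii")
    return int(text[:3]), int(text[4:15])


def _read_exact(sock, size):
    """Read exactly size bytes or fail if the peer closes early."""
    buf = b""
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before full response")
        buf += chunk
    return buf


def query_lmd(socket_path, table, columns, timeout=5.0):
    """Send one query to a Livestatus unix socket and return the decoded rows."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        sock.sendall(build_query(table, columns).encode())
        status, length = parse_fixed16(_read_exact(sock, 16))
        body = _read_exact(sock, length).decode()
    if status != 200:
        # On errors the body carries the server's message instead of data.
        raise RuntimeError(f"livestatus error {status}: {body.strip()}")
    return json.loads(body)
```

For example, query_lmd("/var/cache/thruk/lmd/live.sock", "sites", ["peer_name", "status", "last_error"]) would return one row per backend, assuming the installed LMD version offers a sites table with those columns. That part is my assumption and worth checking against the LMD documentation.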



Fabrice Le Dorze

Dec 16, 2024, 8:09:31 AM
to Thruk
I installed the latest 3.20 version from https://download.opensuse.org/repositories/home:/naemon:/daily/Debian_12/.
It did not help.
I get the error messages below in /var/log/thruk/thruk.log.
In the web GUI, the backends are shown in red.

[2024/12/16 14:04:54][viz-dev][ERROR] >>>>>>>>>>>>>>>>>>>>>
[2024/12/16 14:04:54][viz-dev][ERROR] page:     GET https://thruk-dev.vic.verkor.com/thruk/cgi-bin/status.cgi?style=combined&nav=1&hidetop=1&title=All+Unhandled+Problems&section=Bookmarks&newname=&link_target=&bookmarksp=Bookmarks::Prod+IT+(unhandled)&bookmarksp=Bookmarks::Prod+IT+(all)&bookmarksp=Bookmarks::Prod+OT+(unhandled)&bookmarksp=Bookmarks::Prod+OT+(all)&bookmarksp=Bookmarks::Windows/Linux+Updates&view_mode=html&hst_s0_hoststatustypes=4&hst_s0_hostprops=10250&hst_s0_type=hostgroup&hst_s0_val_pre=&hst_s0_op=!=&hst_s0_value=OT&hst_s0_type=hostgroup&hst_s0_val_pre=&hst_s0_op=!=&hst_s0_value=NonProd&hst_s0_type=hostgroup&hst_s0_val_pre=&hst_s0_op=!=&hst_s0_value=DockerContainers&update=&svc_s0_hoststatustypes=3&svc_s0_servicestatustypes=20&svc_s0_hostprops=10250&svc_s0_serviceprops=10250&svc_s0_type=hostgroup&svc_s0_val_pre=&svc_s0_op=!=&svc_s0_value=OT&svc_s0_type=hostgroup&svc_s0_val_pre=&svc_s0_op=!=&svc_s0_value=NonProd&svc_s0_type=hostgroup&svc_s0_val_pre=&svc_s0_op=!=&svc_s0_value=DockerContainers&svc_s0_type=servicegroup&svc_s0_val_pre=&svc_s0_op=!=&svc_s0_value=Server_Updates&svc_s0_type=servicegroup&svc_s0_val_pre=&svc_s0_op=!=&svc_s0_value=Services_NonIT
[2024/12/16 14:04:54][viz-dev][ERROR] params:   {'bookmarksp' => ['Bookmarks::Prod IT (unhandled)','Bookmarks::Prod IT (all)','Bookmarks::Prod OT (unhandled)','Bookmarks::Prod OT (all)','Bookmarks::Windows/Linux Updates'],'hidetop' => '1','hst_s0_hostprops' => '10250','hst_s0_hoststatustypes' =...
[2024/12/16 14:04:54][viz-dev][ERROR] user:     admin
[2024/12/16 14:04:54][viz-dev][ERROR] address:  10.15.6.10
[2024/12/16 14:04:54][viz-dev][ERROR] duration: 9.0s
[2024/12/16 14:04:54][viz-dev][ERROR] No backend available
[2024/12/16 14:04:54][viz-dev][ERROR] None of the selected Backends could be reached, please have a look at the logfile for detailed information and make sure the core is up and running.
[2024/12/16 14:04:54][viz-dev][ERROR] lmd error - /var/cache/thruk/lmd/live.sock:
[2024/12/16 14:04:54][viz-dev][ERROR] peer                $VAR1 = '/var/cache/thruk/lmd/live.sock';
[2024/12/16 14:04:54][viz-dev][ERROR] statement           $VAR1 = 'GET hosts
[2024/12/16 14:04:54][viz-dev][ERROR] Backends: 36a9e 229c2
[2024/12/16 14:04:54][viz-dev][ERROR] Columns: accept_passive_checks acknowledged action_url action_url_expanded active_checks_enabled address alias check_command check_freshness check_interval check_options check_period check_type checks_enabled childs comments current_attempt current_notification_number event_handler event_handler_enabled execution_time custom_variable_names custom_variable_values first_notification_delay flap_detection_enabled groups has_been_checked high_flap_threshold icon_image icon_image_alt icon_image_expanded is_executing is_flapping last_check last_notification last_state_change latency low_flap_threshold max_check_attempts name next_check notes notes_expanded notes_url notes_url_expanded notification_interval notification_period notifications_enabled num_services_crit num_services_ok num_services_pending num_services_unknown num_services_warn num_services obsess_over_host parents percent_state_change perf_data plugin_output process_performance_data retry_interval scheduled_downtime_depth state state_type modified_attributes_list last_time_down last_time_unreachable last_time_up display_name in_check_period in_notification_period has_long_plugin_output last_state_change_order lmd_last_cache_update depends_exec depends_notify peer_key
[2024/12/16 14:04:54][viz-dev][ERROR] Filter: has_been_checked = 1
[2024/12/16 14:04:54][viz-dev][ERROR] Filter: state = 1
[2024/12/16 14:04:54][viz-dev][ERROR] And: 2
[2024/12/16 14:04:54][viz-dev][ERROR] Filter: scheduled_downtime_depth = 0
[2024/12/16 14:04:54][viz-dev][ERROR] Filter: acknowledged = 0
[2024/12/16 14:04:54][viz-dev][ERROR] Filter: is_flapping = 0
[2024/12/16 14:04:54][viz-dev][ERROR] Filter: notifications_enabled = 1
[2024/12/16 14:04:54][viz-dev][ERROR] And: 4
[2024/12/16 14:04:54][viz-dev][ERROR] And: 2
[2024/12/16 14:04:54][viz-dev][ERROR] Filter: groups !>= OT
[2024/12/16 14:04:54][viz-dev][ERROR] Filter: groups !>= NonProd
[2024/12/16 14:04:54][viz-dev][ERROR] Filter: groups !>= DockerContainers
[2024/12/16 14:04:54][viz-dev][ERROR] And: 4
[2024/12/16 14:04:54][viz-dev][ERROR] Sort: name asc
[2024/12/16 14:04:54][viz-dev][ERROR] OutputFormat: wrapped_json
[2024/12/16 14:04:54][viz-dev][ERROR] ResponseHeader: fixed16
[2024/12/16 14:04:54][viz-dev][ERROR] ';
[2024/12/16 14:04:54][viz-dev][ERROR] Can't use an undefined value as an ARRAY reference at /usr/share/thruk/lib/Thruk/Controller/status.pm line 1228.
