Bug#1005325: pcp: pmlogger.service fails with "protocol-error"


Martin Pitt

Feb 11, 2022, 5:10:04 AM
Package: pcp
Version: 5.2.6-1
Severity: important
Tags: upstream fixed-upstream

Hello,

Debian stable's pcp version has a rather annoying bug: after a while,
pmlogger.service fails to start up:

systemd[1]: pmlogger.service: Failed with result 'protocol'.
systemd[1]: Failed to start Performance Metrics Archive Logger.
systemd[1]: pmlogger.service: Scheduled restart job, restart counter is at 1.

Then it retries a few times and eventually fails. The root cause is that
something during startup completely messes up the ownership of the log directory:

# ls -ld /var/log/pcp/pmlogger
drwxrwxr-x 3 1000 wheel 4096 Sep 19 22:13 /var/log/pcp/pmlogger

The only way out of this is to run

chown -R pcp:pcp /var/log/pcp/pmlogger
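
A minimal sketch of how the broken ownership could be detected before running the chown, for example from a test harness. The helper names `owner_of` and `needs_chown` are hypothetical and not part of pcp; the sketch assumes GNU coreutils `stat` (as on Debian) and uses the directory from the `ls` output above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of pcp.
# Assumes GNU coreutils `stat -c` (Linux systems such as Debian).

# Print the owning user name of a path.
owner_of() {
    stat -c '%U' "$1"
}

# Succeed (exit 0) if the path is NOT owned by the expected user.
# $1 = path, $2 = expected user name
needs_chown() {
    [ "$(owner_of "$1")" != "$2" ]
}

# Example use (requires root and an existing pcp user):
# if needs_chown /var/log/pcp/pmlogger pcp; then
#     chown -R pcp:pcp /var/log/pcp/pmlogger
# fi
```

This only detects and repairs the symptom; it does not address whatever changes the ownership during startup.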

I think this is the same problem as reported in
https://bugzilla.redhat.com/show_bug.cgi?id=2013937 , and there was a
corresponding upstream fix:

https://github.com/performancecopilot/pcp/commit/b9ff7d65b5e11

The essence of that is to drop the -C option from pmlogger_check.service.

However, I tried to apply this to Debian 11 by appending

PMLOGGER_CHECK_PARAMS="--skip-primary"

to /etc/default/pmlogger_timers. Unfortunately that still doesn't help; our
tests keep running into this bug:

https://logs.cockpit-project.org/logs/pull-16979-20220211-085507-4343f4f8-debian-stable/log.html#298
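
For reference, the override described above can be appended in a repeatable way, so that re-running a provisioning script does not duplicate the line. This is only a sketch: `append_once` is a hypothetical helper, and as noted above the setting itself did not resolve the failure on Debian 11.

```shell
#!/bin/sh
# Hypothetical helper, not part of pcp:
# append a line to a file only if that exact line is not already present.
# $1 = file, $2 = line to add
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example use (requires root):
# append_once /etc/default/pmlogger_timers 'PMLOGGER_CHECK_PARAMS="--skip-primary"'
```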

At this point I'm running out of ideas. This feels like quite a major bug, as
it's not at all obvious how to get out of the situation or how to prevent it
from happening.

Note that this does not affect any other operating system that cockpit tests on
(Debian testing, Ubuntu 20.04 and 21.10, Fedora 34/35, CentOS/RHEL 8/9, Arch),
only Debian 11. So I'm fairly sure this is fixed in current upstream versions.

Thanks,

Martin