Hi,
[tl;dr: atop seems to hang on s390x]
On 12-08-2022 12:23, Marc Haber wrote:
> On Thu, Aug 11, 2022 at 10:51:32PM +0200, Paul Gevers wrote:
>> On 10-08-2022 12:03, Marc Haber wrote:
>>> Unfortunately, this bug report suffers from multiple cut&paste or
>>> template errors. The ci link points to the mercurial page for amd64, the
>>> text alternates between s390x, armhf, arm64 and amd64.
>>
>> There was only one that I'm aware of, the link to mercurial. But I
>> understand it if the text was a bit confusing.
>
> You said autopkgtest fails on amd64, which was never the case. Maybe
> amd64 and arm64 got confused.
What I *wanted* to convey is that arm64 and amd64 *failures* are in our
RC policy and all other *regressions* are RC too. I did mix that up.
>>> I tried the (dead simple) autopkgtest on the s390x and arm64 porterboxes
>>> and it succeeded in a second's time. I have sharpened the expression
>>> that counts the CPUs in lscpu's output and hope this will fix the issue.
>>
>> ooo, CPU count. Yes, some of those archs run on hosts with lots of CPUs.
>> armhf has 160, s390x has 10.
>
> I am testing locally on amd64 with a machine with 12 CPUs. The armhf
> tests succeed (see
> https://ci.debian.net/data/autopkgtest/testing/armhf/a/atop/24578667/log.gz).
Great, same on arm64. s390x still times out though.
> The complete test is:
> #!/bin/bash
>
> # atop reports number of CPU and two extra lines
> ATOPSOPINION="$(atop -P cpu 5 1 | grep -vE '^(RESET|SEP)' | wc -l)"
When I run `atop` manually (on stable), it doesn't do anything...
root@ci-worker-s390x-01:~# atop
^C
I started up a clean unstable lxc container and installing atop takes
quite some time between:
Created symlink
/etc/systemd/system/timers.target.wants/atop-rotate.timer ->
/lib/systemd/system/atop-rotate.timer.
Created symlink /etc/systemd/system/multi-user.target.wants/atop.service
-> /lib/systemd/system/atop.service.
Created symlink
/etc/systemd/system/multi-user.target.wants/atopacct.service ->
/lib/systemd/system/atopacct.service.
and
Could not execute systemctl: at /usr/bin/deb-systemd-invoke line 145.
Running atop from unstable also hangs:
root@elbrus:~# atop
^C
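In case it helps whoever picks this up, the hang could be checked without sitting at a terminal and pressing ^C. What follows is only a minimal sketch, not part of the atop package or its autopkgtest: `runs_within` is a hypothetical helper name, it assumes `timeout` from GNU coreutils is available, and it relies on `timeout` exiting with status 124 when the limit is reached.

```shell
#!/bin/bash
# Hypothetical helper (an assumption for illustration, not from the atop
# package): run a command under a time limit and report whether it finished.
runs_within() {
    local limit rc
    limit="$1"
    shift
    # timeout(1) from GNU coreutils exits with status 124 when the
    # command is still running after the limit has expired.
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "HANG: '$*' did not finish within ${limit}s"
        return 1
    fi
    echo "OK: '$*' exited with status $rc"
    return 0
}
```

With that in place, something like `runs_within 15 atop -P cpu 5 1` would exercise the same invocation as the quoted test (one 5-second sample) with a generous margin. A process that ignores SIGTERM would still need `timeout -k`, which this sketch leaves out.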
> There is no loop, and nothing that could fail on a big number. In my
> understanding, this could run on a box with 2000 cores and still work.
Except, it doesn't. Seems like atop is seriously broken on s390x on the
hosts that we have.
> Also, the test does not time out on zelenka when manually invoked in an
> schroot (setting PATH to point to an executable atop is necessary, as it
> does not seem to be possible to install an arbitrary package that is not
> in the archive). Also, the test is successful if invoked after installing
> atop 2.7.1-2 from the archive.
Maybe we need to involve the s390x porters? I have put them in CC to
draw their attention.
Paul