Hi Chris,
Chris Hofstaedtler wrote:
> > As far as I can see, I didn't get a reply back from you on these
> > suggestions of mine. Maybe my mail fell through the cracks. But I
> > think we should take the discussion up again, probably in this bug
> > report.
>
> Right. I think I forgot to reply back then - sorry.
Happens…
> Experimental should have util-linux-extra 2.38-4+exp1 very soon,
> with irqtop installed. Obviously this can only be used for testing.
Thanks. That package though seems to miss the "Conflicts: irqtop". :-/
But I was aware of it and uninstalled irqtop beforehand. :-)
> Personally I think we should have only one irqtop - from my point of
> view it does not matter which one. Maybe the new version is
> superior.
Hmmmm.
> In any case we should not confuse our users.
Fully agree. Nevertheless, Debian is a lot about having choice between
different implementations (compared to e.g. Ubuntu). And choice
sometimes makes things less easy to understand.
> > Another point which comes to my mind now is that it might make sense
> > to rename the current irqtop package to irqtop-nf (or irqtop-ruby)
> > just to make clear that it does not contain the irqtop tool from
> > util-linux.
>
> Might be an idea. But lets see what the differences are, first.
Ack.
zhenwei pi wrote:
> The main difference between the two versions:
> - original irqtop shows separated interrupt information
> - new irqtop shows aggregate interrupt information
Thanks for that summary.
(I btw. just noticed that zhenwei is actually the author of
util-linux's implementation of irqtop:
https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/commit/?id=d511011c
refers to https://github.com/pizhenwei/irqtop as the previous place of
development. :-)
> Test env: Debian 10; 96 CPUs on a server, 230 characters per line in
> termial.
>
> - irqtop (original version) shows uncompleted interrupts(31 / 96 CPUs).
Hrm, interesting.
> n194-087-081 - irqtop - 2022-04-15 09:42:48 +0800
> CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15 CPU16 CPU17 CPU18 CPU19 CPU20 CPU21 CPU22 CPU23 CPU24 CPU25 CPU26 CPU27 CPU28 CPU29 CPU30 […]
> cpuUtil: 0.0 0.0 0.4 0.0 1.3 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.2 0.2 0.2 0.2 0.0 0.2
> %irq: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
> %sirq: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
> irqTotal: 32 293 477 5 34 7 5 1805 112 51 3 2 28 2 1901 1 13 16 0 6 1 19 2 2 67 29 51 51 42 9 34
> i 9: . 2 0 0 0 . . . . . . . . . . . . . . . . . . . . . . . . . .
> i 48: . . . . . . 0 . . . . . . . . . . . . . . . . . . . . . . . .
> i 49: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
> i 50: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
> i 51: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
I currently only have access to boxes with 32 cores, but there it shows
all of them, plus some additional information in the last column which
seems to have been cut off in your instance, probably due to the
limited terminal width. Mine looks like this and also shows IRQ names
instead of numbers:
somehost - irqtop - 2022-04-15 15:13:12 +0000
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15 CPU16 CPU17 CPU18 CPU19 CPU20 CPU21 CPU22 CPU23 CPU24 CPU25 CPU26 CPU27 CPU28 CPU29 CPU30 CPU31
cpuUtil: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 3.8 0.0 total CPU utilization %
%irq: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 hardware IRQ CPU util%
%sirq: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 software IRQ CPU util%
irqTotal: 5 0 1 0 0 5 0 0 0 0 0 0 5 0 0 36 0 0 0 1 0 0 0 0 0 0 0 0 0 0 17 0 total hardware IRQs
i 117: . . . . . . . . . . . . 3 . . . . . . . . . . . . . . . . . . . IR-PCI-MSI 4194317-edge i40e-eno1-TxRx-12
i LOC: 5 0 1 0 0 5 0 0 0 0 0 0 1 0 0 36 0 0 0 1 0 0 0 0 0 0 0 0 0 0 17 0 Local timer interrupts
s TIMER: 5 0 0 0 0 7 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
s NET_RX: 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
s SCHED: 5 0 0 0 0 7 0 0 0 0 0 0 1 0 0 33 0 0 0 1 0 0 0 0 0 0 0 0 0 0 9 0
s RCU: 1 0 0 0 0 7 0 0 0 0 0 0 1 0 0 7 0 0 0 1 0 0 0 0 0 0 0 0 0 0 13 0
(I currently suspect that zhenwei used the irqtop from Debian 10
Buster, i.e. version 2.3 instead of the current version 2.6 as can be
found in Debian Unstable and Testing. That might have caused these
differences.)
> - irqtop (from util-linux) shows aggregate interrupt information.
> irqtop | total: 575548749447 delta: 518913 | n148-134-075 | 2022-04-15 10:02:14+08:00
>
> IRQ TOTAL DELTA NAME
>
> LOC 218396041027 393883 Local timer interrupts
> RES 217686711923 50039 Rescheduling interrupts
> PIN 40532867503 10053 Posted-interrupt notification event
> CAL 15012540676 2013 Function call interrupts
> PIW 13810059255 57692 Posted-interrupt wakeup event
> TLB 8699607720 1597 TLB shootdowns
> 221 4235495788 88 IR-PCI-MSI 50331656-edge eth0-4
That's quite a difference IMHO.
The irqtop from util-linux, though, also shows some per-CPU (or rather
per-core) stats on my box (Debian Unstable, with irqtop from
Christian's util-linux-extra package version 2.38-4+exp1 from Debian
Experimental):
irqtop | total: 22014142315 delta: 9471 | c6 | 2022-04-15 16:47:26+02:00
cpu0 cpu1 cpu2 cpu3
%irq: 30.4 24.1 20.0 25.5
%delta: 36.1 18.5 16.5 28.8
IRQ TOTAL DELTA NAME
LOC 14019433563 6020 Local timer interrupts
129 2573943707 1722 IR-PCI-MSI 520192-edge enp0s31f6
RES 2091668263 575 Rescheduling interrupts
130 794066902 17 IR-PCI-MSI 376832-edge ahci[0000:00:17.0]
CAL 763171794 140 Function call interrupts
138 612474851 790 IR-PCI-MSI 524288-edge nvkm
128 463147433 67 IR-PCI-MSI 327680-edge xhci_hcd
TLB 455459030 0 TLB shootdowns
137 221045281 140 IR-PCI-MSI 514048-edge snd_hda_intel:card0
131 19266170 0 IR-PCI-MSI 1572864-edge xhci_hcd
NMI 198160 0 Non-maskable interrupts
PMI 198160 0 Performance monitoring interrupts
MCP 68615 0 Machine check polls
17 217 0 IR-IO-APIC 17-fasteoi snd_hda_intel:card1
> Other enhanced features from the new version:
> - sort by several rules, include IRQ, TOTAL, DELTA and NAME.
> - specify cpus in list format to monitor.
> - specify output columns to print.
> - enable/disable per-cpu statistics by specified mode.
From my point of view, they seem to have quite a different feature
set. The main advantage of the irqtop from util-linux seems to be that
it is more readable with a lot of CPUs, but it gives less detailed
statistics. The ruby-written irqtop does more detailed
per-cpu/per-core statistics, which might be helpful with a few cores,
but you'll lose the overview with a lot of cores, as seen in zhenwei's
"screenshot" which is truncated at CPU 30.
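
To make the difference between the two views a bit more concrete, here
is a minimal Python sketch of mine. It is not taken from either irqtop
implementation; it only assumes the usual /proc/interrupts layout (a
header line with CPU names, then one line per IRQ). The function names
and the sample numbers are made up for illustration. parse_interrupts()
keeps the per-CPU counters (the "separated" view), aggregate() sums
them per IRQ and sorts by total (the "aggregate" view):

```python
# Made-up sample in /proc/interrupts layout, only for illustration.
SAMPLE = """\
           CPU0       CPU1
  9:          0          2   IR-IO-APIC    9-fasteoi   acpi
LOC:        500        300   Local timer interrupts
RES:         40         60   Rescheduling interrupts
ERR:          0
"""


def parse_interrupts(text):
    """Return (cpu_names, rows); each row is (irq, per_cpu_counts, description)."""
    lines = text.strip("\n").splitlines()
    cpus = lines[0].split()
    rows = []
    for line in lines[1:]:
        if ":" not in line:
            continue
        irq, rest = line.split(":", 1)
        fields = rest.split()
        counts = []
        # Leading numeric fields are the per-CPU counters; lines such as
        # ERR carry a single counter instead of one per CPU.
        for field in fields[:len(cpus)]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        rows.append((irq.strip(), counts, description))
    return cpus, rows


def aggregate(rows):
    """Sum the counters per IRQ and sort by total, highest first."""
    totals = [(irq, sum(counts), desc) for irq, counts, desc in rows]
    return sorted(totals, key=lambda row: row[1], reverse=True)


cpus, rows = parse_interrupts(SAMPLE)
for irq, total, desc in aggregate(rows):
    print(f"{irq:>4} {total:>8} {desc}")
```

On a real Linux box one would feed it the contents of /proc/interrupts
instead of SAMPLE. The per-CPU rows grow in width with the number of
cores, while the aggregated list always stays at three columns.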
> - performance improvement. New irqtop written by C uses a little CPU when
> running 'irqtop -d 1', the Ruby version spends more time(quite obvious on a
> 96 CPUs platform).
Yeah, it's obvious, but the reason is not the 96 CPUs but the fact
that it's written in an interpreted language and not compiled.
Anyway, IMHO we should:
* Figure out how to get the util-linux implementation into Debian
proper.
* irqtop from util-linux should in some way become the future default,
as it's probably what the user usually expects. The ruby-written
irqtop is only a niche tool written for analysing the performance of
the ipt_NETFLOW.ko iptables plugin kernel module. (But seems to have
been useful elsewhere, too, as probably shown by the fact that
util-linux added a similar tool, which is probably less focussed on
that one job. :-)
Regarding the ruby-written irqtop:
* It is currently in danger of being removed from testing because of
the horribly outdated ruby-curses (https://bugs.debian.org/958973)
in Debian, which is also no longer maintained; see
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=959115#10 and
https://bugs.debian.org/1009727 (Christian: I X-Debbugs-Cc'ed you on
#1009727 for that and because I know that you're also active in
Debian's Ruby packaging.)
* It has a higher popularity than I expected:
https://qa.debian.org/popcon-graph.php?packages=irqtop&show_installed=on&show_vote=on&want_legend=on&beenhere=1
Because I, as a user and Linux admin, prefer having choice and because
the two irqtop implementations seem to be rather different, I really
would prefer to keep the ruby-written irqtop in Debian nevertheless, at
least for now.
My currently preferred variant (probably needs to be a bit more
polished) to go forward is:
* Renaming the current irqtop package (and binary) to irqtop-nf.
* Making "irqtop" a transitional package which pulls in either
irqtop-nf or util-linux-extra, i.e. has a
Depends: irqtop-nf | util-linux-extra
in its control file. That way those who upgrade automatically get
the same implementation as before. But those who look at the package
see that there are two choices.
* After the Bookworm release, the "irqtop" package should be removed
and provided by the util-linux-extra package, so that those who do
"apt install irqtop" actually get the more expected implementation
from util-linux.
* I think we should also try to use /etc/alternatives/irqtop +
update-alternatives, with irqtop from util-linux-extra having the
higher priority, so that those who install both get the probably
more expected implementation from util-linux-extra by default.
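
Just to illustrate the intended effect of the priorities: in automatic
mode, update-alternatives points the link at the registered alternative
with the highest priority. The tiny Python sketch below only models
that selection rule. The two paths and both priority values are
hypothetical placeholders of mine, nothing has been decided on them
yet:

```python
from typing import NamedTuple


class Alternative(NamedTuple):
    path: str
    priority: int


def pick_auto(alternatives):
    """Model of automatic mode: the highest priority wins."""
    return max(alternatives, key=lambda alt: alt.priority)


# Hypothetical paths and priorities, for illustration only.
CANDIDATES = [
    Alternative("/usr/bin/irqtop-nf", 20),
    Alternative("/usr/bin/irqtop.util-linux", 50),
]

print(pick_auto(CANDIDATES).path)
```

With only irqtop-nf installed, it would be the sole candidate and hence
what /usr/bin/irqtop resolves to. With both installed, the
implementation from util-linux would win unless the admin overrides it
manually.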
In case you agree, I'd upload an updated iptables-netflow source
package to Debian Experimental implementing these changes so we can
test cross-installability and upgrade paths.