Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

systemd 252.6-1 produces an Instruction fault, sysvinit works

3 views
Skip to first unread message

Frank Scheiner

unread,
May 18, 2023, 7:20:05 AM5/18/23
to
Hi all,

subject says it all: I yesterday upgraded my root FS(es) on my DS25 and
noticed the following issue with the systemd version right where the
login prompt should appear (I seem to remember that I recognized
something similar already late last year with a self-compiled kernel but
attributed it to the kernel being self-compiled):

```
aboot: Linux/Alpha SRM bootloader version 1.0_pre20040408
aboot: switching to OSF/1 PALcode version 1.92
aboot: loading initrd (5376720 bytes/10502 blocks) at 0xfffffc00ffacc000
aboot: starting kernel network with arguments root=/dev/nfs
ip=:::::enP2p2s5:dhcp console=ttyS0,9600n8
[ 0.000000] Linux version 6.1.0-9-alpha-smp
(debian...@lists.debian.org) (gcc-12 (Debian 12.2.0-9) 12.2.0, GNU
ld (GNU Binutils for Debian) 2.40) #1 SMP Debian
6.1.27-1 (2023-05-08)
[ 0.000000] Booting GENERIC on Titan variation Granite using machine
vector PRIVATEER from SRM
[ 0.000000] Major Options: SMP MAGIC_SYSRQ
[ 0.000000] Command line: root=/dev/nfs ip=:::::enP2p2s5:dhcp
console=ttyS0,9600n8
[...]
Begin: Running /scripts/nfs-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
[ 9.820307] systemd[1]: systemd 252.6-1 running in system mode (+PAM
+AUDIT +SELINUX +APPARMOR +IMA +SMACK -SECCOMP +GCRYPT -GNUTLS +OPENSSL
+ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP
+LIBFDISK +PCRE2 -PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ
+ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT
default-hierarchy=unified)
[ 10.202143] systemd[1]: Detected architecture alpha.

Welcome to Debian GNU/Linux 12 (bookworm)!

[ 11.864251] systemd[1]: Queued start job for default target
graphical.target.
[ 11.958978] CPU 1
[ 11.958978] systemd(1): Instruction fault 4
[ 12.032220] pc = [<fffffc0005163bfc>] ra = [<fffffc0005163bf8>] ps
= 0000 Not tainted
[ 12.131829] pc is at 0xfffffc0005163bfc
[ 12.177728] ra is at 0xfffffc0005163bf8
[ 12.223626] v0 = 0000000000000000 t0 = 0000000000000023 t1 =
fffffc00066eb800
[ 12.310540] t2 = fffffc000512e680 t3 = 0000000000f00000 t4 =
0000000000000008
[ 12.398431] t5 = 0000000000000001 t6 = 0000000000000000 t7 =
fffffc0005160000
[ 12.486321] a0 = 0000000000000000 a1 = fffffc0005163bc0 a2 =
fffffc0005163bf8
[ 12.573235] a3 = 0000000000000001 a4 = 00000002c8cf86cc a5 =
0000000000000001
[ 12.661126] t8 = 0000000000000080 t9 = 0000000000000001 t10=
fffffc0002891148
[ 12.749016] t11= 0000000000000000 pv = fffffc00011d4a40 at =
5f19e10505e118bf
[ 12.835930] gp = fffffc0002871148 sp = 00000000440a695e
[ 12.899407] Disabling lock debugging due to kernel taint
[ 12.962883] Trace:
[ 12.987298] [<fffffc00011155d8>] cgroup_migrate_execute+0x338/0x600
[ 13.062493] [<fffffc0001115da8>] cgroup_update_dfl_csses+0x2c8/0x330
[ 13.138665] [<fffffc000111867c>] cgroup_subtree_control_write+0x56c/0x5e0
[ 13.219719] [<fffffc000110dc24>] cgroup_file_write+0xa4/0x1a0
[ 13.288079] [<fffffc0001379cd4>] kernfs_fop_write_iter+0x1a4/0x330
[ 13.362297] [<fffffc00012a06c0>] vfs_write+0x250/0x4c0
[ 13.423821] [<fffffc00012a0b1c>] ksys_write+0x8c/0x140
[ 13.485344] [<fffffc000101158c>] entSys+0xac/0xc0
[ 13.541985]
[ 13.559563] Code:
[ 13.559563] fffffc00
[ 13.582024] 00000000
[ 13.610344] 00000000
[ 13.638664] 05163bfc
[ 13.666985] fffffc00
[ 13.695305] 02871148
[ 13.723625] <fffffc00>
[ 13.751946] 00000000
[ 13.779289]
```

With the sysvinit version of the root FS and initramfs everything works
(did some `7z b` runs and an `openssl speed -elapsed` w/o an issue after
booting, too):

```
[...]
INIT: Entering runlevel: 2
Using makefile-style concurrent boot in runlevel 2.
Setting up console font and keymap...done.
[ 55.348604] loop0: detected capacity change from 0 to 2097152
[ 55.389620] Adding 1048568k swap on /dev/loop0. Priority:-2
extents:1 across:1048568k SSFS
Starting enhanced syslogd: rsyslogd.
Starting SMP IRQ Balancer: irqbalance.
Starting periodic command scheduler: cron.
Loading cpufreq kernel modules...done (none).
Starting system message bus: dbus.
Starting OpenBSD Secure Shell server: sshd.
[ 57.059541] exim4(1425): unaligned trap at 0000020000900088:
000000005f9dc335 28 2
[ 57.060517] exim4(1425): unaligned trap at 00000200008fff14:
000000005f9dc335 28 1
[ 57.062470] exim4(1425): unaligned trap at 0000020000900088:
000000005f9dc335 28 2
Starting MTA: exim4.

Debian GNU/Linux 12 ds25 ttyS0

ds25 login: [ 63.064420] do_entUnaUser: 15 callbacks suppressed
[ 63.064420] exim4(1445): unaligned trap at 0000020000900088:
00000000979d736e 28 2
[ 63.212858] exim4(1445): unaligned trap at 00000200008fff14:
00000000979d736e 28 1
[ 63.304655] exim4(1445): unaligned trap at 0000020000900088:
00000000979d736e 28 2
[ 63.395475] exim4(1445): unaligned trap at 00000200008fff14:
00000000979d736e 28 1
[ 63.488248] exim4(1445): unaligned trap at 0000020000900088:
00000000979d736e 28 2

ds25 login:
```

I'll try to downgrade systemd with some versions from snapshot.debian.org.

UPDATE: I did downgrade to 251.6-1 and 250.4-1, which both didn't help
with Linux 6.1.0-9, though, Interestingly when booting with Linux
5.3.0-3 everything works. And as 6.1.0-9 works well with sysvinit
instead of systemd, it looks like the combination of kernel and systemd
makes the problems. Looking back to [1], this already seems to happen
for Linux 6.0.7.

[1]: https://pastebin.com/FbFG7jHx

Is an instruction fault (1) like an illegal instruction or (2) something
else? The question remains, why sysvinit doesn't hit this problem. Does
someone observe similar problems on their machines or e.g. the buildds
with systemd and recent kernels (>= 6.x.x)?

Cheers,
Frank

Michael Cree

unread,
May 20, 2023, 8:30:05 PM5/20/23
to
On Thu, May 18, 2023 at 01:01:25PM +0200, Frank Scheiner wrote:
> Welcome to Debian GNU/Linux 12 (bookworm)!
>
> [ 11.958978] CPU 1
> [ 11.958978] systemd(1): Instruction fault 4
> [ 12.032220] pc = [<fffffc0005163bfc>] ra = [<fffffc0005163bf8>] ps
> = 0000 Not tainted
> [ 12.131829] pc is at 0xfffffc0005163bfc
> [ 12.177728] ra is at 0xfffffc0005163bf8

Yeah, I think I have seen this one too when I tried out the new kernel.

Instruction fault 4 occurs on the execution of a reserved instruction
(i.e. an invalid opcode) or the execution of a privileged instruction
in user mode.

Interestingly the program counter is in kernel space. So raises the
question is it the kernel that tried to execute an invalid instruction?

Looking at the code:

> [ 13.559563] Code:
> [ 13.559563] fffffc00
> [ 13.582024] 00000000
> [ 13.610344] 00000000
> [ 13.638664] 05163bfc
> [ 13.666985] fffffc00
> [ 13.695305] 02871148
> [ 13.723625] <fffffc00>
> [ 13.751946] 00000000
> [ 13.779289]

These do not appear to be valid code. They look more like addresses.
It has the appearance that the kernel has jumped into data and tried
to execute it as code!

Maybe report it to the Linux Kernel Mail List?

Cheers,
Michael

John Paul Adrian Glaubitz

unread,
May 21, 2023, 4:10:04 AM5/21/23
to
Hello Frank!
As you can see from the backtrace, the issue is triggered somewhere in the
CGroup code. Since SysVInit doesn't use CGroups, it's not surprising that
you don't see the backtrace there.

Adrian

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

Frank Scheiner

unread,
May 22, 2023, 6:10:04 AM5/22/23
to
Dear Michael,
Thanks for your explanations and analysis, that's more than helpful.

> Maybe report it to the Linux Kernel Mail List?

The thread is here:

https://marc.info/?l=linux-alpha&m=168474811430816&w=2

I would have posted a link to the debian-alpha mailing list, but the
assumed greylisting there prevented my message to appear there yet.

Cheers,
Frank
0 new messages