[gentoo-user] my 5.15.93 kernel keeps rebooting

John Covici

Feb 14, 2023, 9:10:04 AM
Hi. So, foolish me, I decided to go from a working 5.10.155 system to
try the latest LTS of 5.15, which is 5.15.93. Compile and install went well,
but the system keeps rebooting. It gets all the way and even starts
the local services and then here are the last few lines, which may be
relevant or not:

Feb 14 06:36:31 ccs.covici.com systemd[1]: Starting local.service...
Feb 14 06:36:31 ccs.covici.com systemd[1]: Starting
systemd-update-utmp-runlevel.service...
Feb 14 06:36:31 ccs.covici.com bash[5753]: rm: cannot remove
'/etc/ppp/provider_is_up': No such file or directory
Feb 14 06:36:31 ccs.covici.com systemd[1]:
systemd-update-utmp-runlevel.service: Deactivated successfully.
Feb 14 06:36:31 ccs.covici.com systemd[1]: Finished
systemd-update-utmp-runlevel.service.
-- Boot 5c394be675854680a9cb616208f374f3 --

Any troubleshooting suggestions as to what is making the system
reboot?

Thanks in advance.

--
Your life is like a penny. You're going to lose it. The question is:
How do
you spend it?

John Covici wb2una
cov...@ccs.covici.com

Rich Freeman

Feb 14, 2023, 2:10:06 PM
On Tue, Feb 14, 2023 at 9:08 AM John Covici <cov...@ccs.covici.com> wrote:
>
> Hi. So, foolish me, I decided to go from a working 5.10.155 system to
> try the latest LTS of 5.15, which is 5.15.93. Compile and install went well,
> but the system keeps rebooting. It gets all the way and even starts
> the local services and then here are the last few lines, which may be
> relevant or not:
>
> Feb 14 06:36:31 ccs.covici.com systemd[1]: Starting local.service...
> Feb 14 06:36:31 ccs.covici.com systemd[1]: Starting
> systemd-update-utmp-runlevel.service...
> Feb 14 06:36:31 ccs.covici.com bash[5753]: rm: cannot remove
> '/etc/ppp/provider_is_up': No such file or directory
> Feb 14 06:36:31 ccs.covici.com systemd[1]:
> systemd-update-utmp-runlevel.service: Deactivated successfully.
> Feb 14 06:36:31 ccs.covici.com systemd[1]: Finished
> systemd-update-utmp-runlevel.service.
> -- Boot 5c394be675854680a9cb616208f374f3 --
>
> Any troubleshooting suggestions as to what is making the system
> reboot?
>

Where are you getting this from, the system log/journal? This doesn't
seem like a clean shutdown, so if it is a kernel PANIC I wouldn't
expect the most critical info to be in the log (since it will stop
syncing to protect the filesystem). The details you need probably
will be displayed on the console briefly. You can also enable a
network console, which will send the dmesg output continuously over
UDP to another device. This won't be interrupted by a PANIC unless
there is some issue with the hardware or networking stack.

If you can get the final messages on dmesg and the panic core dump
that would help.

The other thing you can do is try to capture a kernel core dump, but
that is a bit more complicated to set up.

Otherwise your log is just going to say that everything was fine until
it wasn't.

--
Rich

John Covici

Feb 14, 2023, 3:00:05 PM
Thanks a lot for responding.

OK, how would I set up logging to a network and what would I have to
do on another computer -- which in my case is Windows? I do have a
terminal program on there called teraterm which can do ssh, but that
is about what I have -- unless there is some other program I can put
on there.

Rich Freeman

Feb 14, 2023, 4:30:04 PM
On Tue, Feb 14, 2023 at 2:54 PM John Covici <cov...@ccs.covici.com> wrote:
>
> On Tue, 14 Feb 2023 14:08:34 -0500,
> Rich Freeman wrote:
>>
> > will be displayed on the console briefly. You can also enable a
> > network console, which will send the dmesg output continuously over
> > UDP to another device.
>
> OK, how would I set up logging to a network and what would I have to
> do on another computer -- which in my case is Windows?

The docs are at:
https://www.kernel.org/doc/Documentation/networking/netconsole.txt

(you can also google for linux netconsole for some wiki articles on it)

I have on my command line: netconsole=@/,66...@10.1.0.52

That IP is the host I want the log traffic to go to. (Read the docs
if you have a more complicated networking setup - I assume that will
just run ARP and send stuff out without using a gateway/etc.)
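
For reference, the netconsole documentation linked above gives the parameter layout as [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr], with empty fields falling back to defaults. A minimal sketch of assembling such a string follows. The function name build_netconsole_param is made up for illustration, and the addresses and eth0 are example values, not a tested configuration for any machine in this thread.

```shell
#!/bin/sh
# Assemble a netconsole= boot parameter from its parts.
# Layout: [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# Fields passed as empty strings are left empty so the kernel defaults apply.
build_netconsole_param() {
    src_port=$1 src_ip=$2 dev=$3 tgt_port=$4 tgt_ip=$5 tgt_mac=$6
    # The target IP is the one field that cannot be left out.
    if [ -z "$tgt_ip" ]; then
        echo "target IP is required" >&2
        return 1
    fi
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' \
        "$src_port" "$src_ip" "$dev" "$tgt_port" "$tgt_ip" "$tgt_mac"
}

# Example values only (assumptions): sender 192.168.0.1 on eth0,
# receiver 192.168.0.2 listening on UDP port 6666.
build_netconsole_param "" 192.168.0.1 eth0 6666 192.168.0.2 ""
```

Note the trailing slash after the target IP: the documented syntax has one even when the MAC address is left empty, so it is probably safest to keep it.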

Then on a receiving linux host I'd run (I think - it has been a while):
nc -u -l -p 6666

Now, you mentioned Windows. I've never used it, but nmap has a
program available in a windows version called ncat that might do the
job: https://nmap.org/ncat/

You just want to make sure you have it listening on port 6666 for UDP.
Make sure you use UDP or you won't receive anything.

If it is working you should get a ton of log spam when your host boots
- anything that shows up in dmesg will show up in the network console.
It is sent in realtime.

--
Rich

John Covici

Feb 14, 2023, 5:10:05 PM
Sounds great -- I notice you omitted the IP address. My network
device is brought up by a systemd unit file, so will I need to specify
the device, then? I was thinking of netconsole=@192.168.0.1/eno1 --
would this be correct, assuming the IP address is correct?

Grant Edwards

Feb 15, 2023, 10:00:05 AM
On 2023-02-14, Rich Freeman <ri...@gentoo.org> wrote:

> Where are you getting this from, the system log/journal? This doesn't
> seem like a clean shutdown, so if it is a kernel PANIC I wouldn't
> expect the most critical info to be in the log (since it will stop
> syncing to protect the filesystem). The details you need probably
> will be displayed on the console briefly. You can also enable a
> network console, which will send the dmesg output continuously over
> UDP to another device. This won't be interrupted by a PANIC unless
> there is some issue with the hardware or networking stack.

If you've got a serial port[1], you could also set up serial
logging. Though using serial ports has become a bit of a lost art,
the serial console code in the kernel is pretty carefully designed to
be the last man standing when things start to die. It's possible
(though I wouldn't say probable) that a serial console will be able to
show you stuff closer to the event horizon than a network console can.

Anyway, since I'm still in the serial port business (yes, there are
still plenty of people using serial ports in industrial settings) I
had to mention it...
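
As a rough sketch of what serial logging involves on the kernel side: the console= boot parameter can be given more than once, and naming a serial port there sends kernel messages to it. The helper below only prints the arguments. Its name and the ttyS0/115200 defaults are assumptions for illustration, so check which ttyS device your header actually maps to.

```shell
#!/bin/sh
# Print console= arguments that keep the local screen and add a serial port.
# Kernel messages go to every console= listed; the last one listed
# is the one that becomes /dev/console.
serial_console_args() {
    port=${1:-ttyS0}     # assumed default: first on-board UART
    baud=${2:-115200}    # assumed default speed
    # "n8" means no parity, 8 data bits.
    printf 'console=tty0 console=%s,%sn8\n' "$port" "$baud"
}

# Append the printed string to the kernel command line in your boot loader.
serial_console_args ttyS0 115200
```

The other end of the null modem cable then needs a terminal program set to the same speed, 8 data bits, and no parity.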

[1] For this purpose you want a plain old UART on the motherboard type
serial port. You'd be surprised how many motherboards still have
them. Even though they're never brought out to a DB9 connector on
the back panel, there's often an 8-pin header on the edge of the
board somewhere, so you'd need one of these:

https://www.amazon.com/C2G-27550-Adapter-Bracket-Motherboards/dp/B0002J27R8/


--
Grant

John Covici

Feb 15, 2023, 10:20:05 AM
I do have one which I use for my speech synthesizer. I also have one
on my other box which I could hook up -- if I can find my null modem
cable. I think I will try the netconsole first and the serial console
if that does not work.

Thanks for the hint.

Grant Edwards

Feb 15, 2023, 11:50:04 AM
On 2023-02-15, Grant Edwards <grant.b...@gmail.com> wrote:

> [1] For this purpose you want a plain old UART on the motherboard type
> serial port. You'd be surprised how many motherboards still have
> them. Even though they're never brought out to a DB9 connector on
> the back panel, there's often an 8-pin header on the edge of the
> board somewhere, so you'd need one of these:

Oops, it's a 10pin (2x5) header not an 8-pin header, as I'm sure you'd
have figured out.

John Covici

Feb 16, 2023, 7:00:05 AM
On Wed, 15 Feb 2023 09:50:27 -0500,
Grant Edwards wrote:
>
Still having problems with the netconsole -- I am determined to get
this working, so let me explain a bit more.

The sending computer has two nics, eno1 for the internal network and
eno2 is on the internet. So, my netconsole stanza said
netconsole=@192.168.0.1/eno1,@192.168.0.2

The box which is at 192.168.0.2 has netcat (windows version) and I
tried the following:
netcat -u -v -l 192.168.0.2 6666 and I also tried 192.168.0.1 6666
which is the ip address of the linux console which I am trying to
debug.

I also tried 0.0.0.0 6666 which did not work either, but I think the
windows firewall was blocking, and I did fix that, but did not try the
0.0.0.0 after that.

So, what am I doing wrong here?

Rich Freeman

Feb 16, 2023, 7:20:04 AM
On Thu, Feb 16, 2023 at 6:50 AM John Covici <cov...@ccs.covici.com> wrote:
>
> The sending computer has two nics, eno1 for the internal network and
> eno2 is on the internet. So, my netconsole stanza said
> netconsole=@192.168.0.1/eno1,@192.168.0.2

Is CONFIG_NETCONSOLE enabled for your kernel?
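
One way to check, sketched under the assumption that the kernel config is available as a plain file (for example /usr/src/linux/.config, or /proc/config.gz unpacked when CONFIG_IKCONFIG_PROC is set). The function name kconfig_state is hypothetical.

```shell
#!/bin/sh
# Report how a kernel option is set in a given config file.
# Prints the value (typically y or m), or "unset" if there is no such line.
kconfig_state() {
    symbol=$1 config=$2
    # Match "SYMBOL=value" exactly, so CONFIG_NETCONSOLE does not
    # also match CONFIG_NETCONSOLE_DYNAMIC.
    value=$(sed -n "s/^${symbol}=\(.*\)$/\1/p" "$config" | head -n 1)
    printf '%s\n' "${value:-unset}"
}

# Typical use (paths are assumptions; adjust to where your config lives):
#   kconfig_state CONFIG_NETCONSOLE /usr/src/linux/.config
#   zcat /proc/config.gz > /tmp/running.config
#   kconfig_state CONFIG_NETCONSOLE /tmp/running.config
```

A netconsole= parameter on the kernel command line is handled by built-in netconsole (=y); with =m the same string would instead be passed as a module option when the module is loaded.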

I'm not sure if the kernel will assign the names eno1/2 to interfaces
- I think those might be assigned by udev, which probably won't have
run before the kernel parses this instruction. You might need to use
eth0/1 - and your guess is as good as mine which one corresponds to
which.

If it isn't one of those it might not hurt to put the target mac
address in there just to be safe. I haven't needed that but maybe
there are situations where ARP won't work (it would be needed if you
are crossing subnets, in which case you'd need the gateway MAC). Keep
in mind that this is a low-level function that doesn't use any
routing/userspace/etc. It was designed to be robust in the event of a
PANIC and to be able to be enabled fairly early during boot, so it
can't rely on the sorts of things we just take for granted with
networking.

>
> The box which is at 192.168.0.2 has netcat (windows version) and I
> tried the following:
> netcat -u -v -l 192.168.0.2 6666 and I also tried 192.168.0.1 6666
> which is the ip address of the linux console which I am trying to
> debug.
>
> I also tried 0.0.0.0 6666 which did not work either, but I think the
> windows firewall was blocking, and I did fix that, but did not try the
> 0.0.0.0 after that.
>

So I'm pretty sure that netcat requires listing the destination IP,
since it has to open a socket to listen on that IP. You can
optionally set a source address/port in which case it will ignore
anything else, but by default it will accept packets from any source.

I was definitely going to suggest making sure that a windows firewall
wasn't blocking the inbound connections. That's fairly default
behavior on windows.

--
Rich

John Covici

Feb 16, 2023, 9:10:05 AM
Hmmm, but what should I use for the source IP? I only assign those
when I bring the interface up -- I have something like this:
[Unit]
Description=Network Connectivity for %i
Documentation=man:ip
Before=network.target
Wants=network.target
BindsTo=sys-subsystem-net-devices-%i.device
After=sys-subsystem-net-devices-%i.device
[Service]
Type=oneshot
RemainAfterExit=yes
EnvironmentFile=/etc/conf.d/network@%i
ExecStart=/bin/ip link set dev %i up
ExecStart=/bin/ip addr add ${address}/${netmask} broadcast ${broadcast} dev %i
ExecStart=-/bin/bash -c "test -n ${gateway} && /bin/ip route add default via ${gateway}"
ExecStart=-/bin/bash -c "test -f /etc/conf.d/postup@%i.sh&&/bin/bash -c /etc/conf.d/postup@%i.sh"
ExecStop=/bin/ip addr flush dev %i
ExecStop=/bin/ip link set dev %i down
ExecStop=-/bin/bash -c "test -f /etc/conf.d/postdown@%i.sh&&/bin/bash -c /etc/conf.d/postdown@%i.sh"

[Install]
WantedBy=multi-user.target

and the /etc/conf.d/network@eno1 is

address=192.168.0.1
netmask=24
broadcast=192.168.0.255
So, before I run this, I don't think the card has any ip address, does
it?

Mark Knecht

Feb 16, 2023, 9:20:04 AM


> address=192.168.0.1
> netmask=24
> broadcast=192.168.0.255
> So, before I run this, I don't think the card has any ip address, does
> it?

From what you hope is the receiving machine:

arp -a 

?

HTH,
Mark

Rich Freeman

Feb 16, 2023, 9:40:04 AM
On Thu, Feb 16, 2023 at 9:08 AM John Covici <cov...@ccs.covici.com> wrote:
>
> Hmmm, but what should I use for the source IP? I only assign those
> when I bring the interface up -- I have something like this:
> [Unit]
> Description=Network Connectivity for %i
> ...
> So, before I run this, I don't think the card has any ip address, does
> it?

So, "cards" don't have an IP address. The kernel assigns an IP
address to an interface, which is entirely a software construct. It
happens to be a software construct that the network console feature
largely ignores anyway.

I didn't go reading the source code, but I'm guessing it is just
constructing raw UDP packets and it will happily set the IP to
whatever you want it to be. After all, it is just a field on the
packet.

So you can make the source IP whatever you want it to be. Just expect
the packets to show up with the IP you set on them. There is no
connection, so the IP doesn't need to be reachable by anything else.
You could stick literally anything in there as long as some firewall
isn't going to object and drop the packet. The destination IP matters
because that is where it is going to go, and the interface matters
because if it gets sent out on the wrong interface then obviously it
won't make it there.

I have no idea if the netconsole packets get seen by netfilter, but if
this is on some kind of router that might be something you need to
check, because if netfilter is configured to drop unassociated UDP
from the firewall to the LAN that could be an issue. However, it is
possible this just bypasses netfilter entirely.

If you have the dynamic netconsole option enabled you could have a
script update the settings after your network is up to set the source
IP to the one assigned by DHCP and make sure it is on the right
interface. As you point out though at boot time the interface won't
have an IP. It won't even be "up," not that this is likely to bother
the kernel.
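
A minimal sketch of that idea, assuming CONFIG_NETCONSOLE_DYNAMIC is enabled, netconsole is loaded, and configfs is mounted at /sys/kernel/config. The attribute names (dev_name, local_ip, remote_ip, remote_mac, enabled) are the ones described in the netconsole documentation; the function name, target name, and addresses are illustrative only.

```shell
#!/bin/sh
# Create and enable a netconsole target by writing its configfs attributes.
# The root directory is a parameter so the function can be dry-run
# against an ordinary directory before touching the real configfs tree.
setup_netconsole_target() {
    root=$1 name=$2 dev=$3 local_ip=$4 remote_ip=$5 remote_mac=$6
    target="$root/$name"
    mkdir -p "$target" || return 1
    echo "$dev"       > "$target/dev_name"  || return 1
    echo "$local_ip"  > "$target/local_ip"  || return 1
    echo "$remote_ip" > "$target/remote_ip" || return 1
    # Only set the MAC when one is supplied; otherwise the default stays.
    if [ -n "$remote_mac" ]; then
        echo "$remote_mac" > "$target/remote_mac" || return 1
    fi
    # Enable last, once every field is in place.
    echo 1 > "$target/enabled"
}

# Run as root after the network is up (example values, not a tested setup):
#   modprobe netconsole
#   setup_netconsole_target /sys/kernel/config/netconsole target1 \
#       eno1 192.168.0.1 192.168.0.2 ""
```

Because this runs from userspace after udev has finished, the renamed interface (eno1 rather than eth0/eth1) is the name to use here. The trade-off is that nothing is logged before the script runs.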


--
Rich

John Covici

Feb 17, 2023, 2:10:04 PM
On Thu, 16 Feb 2023 12:37:51 -0500,
Laurence Perkins wrote:
> https://wiki.gentoo.org/wiki/Kernel_Crash_Dumps is another option if you're somehow not getting enough information out of the console. More complex to set up, but you can take an actual debugger to the result and hopefully find out exactly what's going on.

Well, some progress, but no joy. I found actual messages from
netconsole and it seems no matter what device I put for the source,
netconsole says it doesn't exist. I tried my eno1, and also eth0 and
eth1. In my normal boot sequence, I see that udev renamed eth1 to
eno1, but netconsole still said it does not exist. So, I may have to
use the serial console method, I have to find my cables for that. I
did also try to add net.ifnames=0 to my boot options, but no joy
there.

Mark Knecht

Feb 17, 2023, 3:20:05 PM


On Fri, Feb 17, 2023 at 12:03 PM John Covici <cov...@ccs.covici.com> wrote:
<SNIP>

> Well, some progress, but no joy.  I found actual messages from
> netconsole and it seems no matter what device I put for the source,
> netconsole says it doesn't exist.  I tried my eno1, and also eth0 and
> eth1.  In my normal boot sequence, I see that udev renamed eth1 to
> eno1, but netconsole still said it does not exist.  So, I may have to
> use the serial console method, I have to find my cables for that.  I
> did also try to add net.ifnames=0 to my boot options, but no joy
> there.
>
> --
> Your life is like a penny.  You're going to lose it.  The question is:
> How do
> you spend it?
>
>          John Covici wb2una
>          cov...@ccs.covici.com

John,
   I did a bad job at trying to point you in this direction the other day,
and in my testing I'm not sure how well it works. However, another
option you might investigate: on the receiving end you can
apparently associate an IP address with the transmitter's
MAC address. Supposedly you would execute
something like the following, with extra spaces added
for readability:

sudo arp -s 192.168.86.244      90:e6:ba:10:a3:e7      temp

which supposedly says 'when you see a packet with this
MAC address associate it with this IP address'. The temp
part says don't add it to the permanent tables.

After executing this you are supposed to be able to use tools 
that filter by IP address but I didn't have great results.

Hope this helps,
Mark 

John Covici

Feb 17, 2023, 4:40:05 PM
My problem is that the sender aborts netconsole, so there is nothing
to receive.

On Fri, 17 Feb 2023 15:13:52 -0500,
Mark Knecht wrote:
>

John Covici

Apr 16, 2023, 6:20:04 AM
On Thu, 16 Feb 2023 12:37:51 -0500,
Laurence Perkins wrote:
>
>
>
> >-----Original Message-----
> >From: John Covici <cov...@ccs.covici.com>
> >Sent: Wednesday, February 15, 2023 7:20 AM
> >To: gento...@lists.gentoo.org
> >Subject: Re: [gentoo-user] Re: my 5.15.93 kernel keeps rebooting
> >
> https://wiki.gentoo.org/wiki/Kernel_Crash_Dumps is another option if you're somehow not getting enough information out of the console. More complex to set up, but you can take an actual debugger to the result and hopefully find out exactly what's going on.

So, since I could get nothing out of the net console (it kept saying
that the device was not found) and my null modem connection between
the computer and another box only seemed to work from the box to the
computer with the problem but not the other way, I am trying to set up
to get a crash dump.

A few questions about this -- my root partition is zfs, whereas the
article seems to use /dev/something for the root.

I am using systemd, so what do I need in /etc/kexec.conf -- do I put
all my kernel boot parameters in that file?

John Covici

Apr 16, 2023, 6:30:05 AM
On Thu, 16 Feb 2023 12:37:51 -0500,
Laurence Perkins wrote:
>
>
>
> >-----Original Message-----
> >From: John Covici <cov...@ccs.covici.com>
> >Sent: Wednesday, February 15, 2023 7:20 AM
> >To: gento...@lists.gentoo.org
> >Subject: Re: [gentoo-user] Re: my 5.15.93 kernel keeps rebooting
> >
> https://wiki.gentoo.org/wiki/Kernel_Crash_Dumps is another option if you're somehow not getting enough information out of the console. More complex to set up, but you can take an actual debugger to the result and hopefully find out exactly what's going on.

Second try, don't know what happened.

So, after not getting any results from net console and somehow my null
modem cable only seemed to work from another computer to the one with
the problem kernel, I am trying to figure out how to set up for
getting a crash dump.

When looking at the article, it seems to want a root partition -- I
use zfs, which automatically detects the root partition, so can I just
forget about that one?

Also I am using systemd, so there is no /etc/local.d, but I do have
another location where I put commands to run after everything else
has run -- do I put the start up script there?

Also, there is a file /etc/conf.d/kexec.conf and I got a notice to
move it to /etc/kexec.conf, what do I put there?