PXE BOOT Fetch error in 15.5

Mark Harrop

Apr 29, 2024, 1:15:54 PM
to kiwi
Hi All,

I am continuing the appliance development that J Mixer has done with openSUSE.
I am updating our working openSUSE 15.3 appliance to openSUSE 15.5.

I have been able to build an ISO image that loads fine, but my PXE image keeps getting a "Failed to Fetch" error and reboots, preventing any investigation. I have tried adding rd.debug and rd.shell to the kernelcmdline, but that did not show any more information. I am running out of ideas on how to debug this problem and hope someone might have some insight.

I find this odd as the kernel and initrd image download fine.

I have tested the curl command on the server with the iso image installed.
localhost:/>curl -k sftp://172.27.18.54/install/cp9test/Ricoh-DFE-BOS.x86_64-1.15.5.md5 --user downloader:R1c0h
f377cdc1691f267efb8d3fc12267f924 935296 8192
localhost:/>curl -k sftp://172.27.18.54/install/cp9test/Ricoh-DFE-BOS.x86_64-1.15.5.md5 -o my.md5 --user downloader:R1c0h
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    45  100    45    0     0    113      0 --:--:-- --:--:-- --:--:--   113
100    45  100    45    0     0    113      0 --:--:-- --:--:-- --:--:--   113
 
My current guess is that the initrd is changing the network and it is no longer correct or available when trying to download the .xz file.

Was there a big change between 15.3 and 15.5 that would affect PXE?

Please let me know what additional information I can provide.

Thanks in advance.
Mark Harrop

Marcus Schäfer

Apr 30, 2024, 3:19:02 AM
to kiwi-...@googlegroups.com
Hi,

> I am updating our working openSUSE 15.3 appliance to openSUSE 15.5.
> I have been able to build an ISO image that loads fine, but my PXE image
> keeps getting a "Failed to Fetch" error and reboots, preventing any
> investigation. I have tried adding rd.debug and rd.shell to the
> kernelcmdline, but that did not show any more information. I am running
> out of ideas on how to debug this problem and hope someone might have
> some insight.

If you get a "Failed to fetch..." error this means some issue with
the curl command. As you already tested the curl call outside of the
initrd environment I assume the client has in theory all access
granted.

So we need to dig a bit deeper. Please repeat the deployment with
the debug options:

rd.debug rd.kiwi.debug rd.shell

You will be dropped into a shell inside of the initrd now. There are
two sources of information:

/run/initramfs/log/boot.kiwi
/tmp/fetch.info

The latter, fetch.info, is the output and error (2>&1) of the curl
command that kiwi called.

I think these two data sources should give us a hint as to what went wrong.
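
As a rough sketch of how those two files could be inspected from the rescue shell (the helper name show_kiwi_fetch_logs and its optional path arguments are made up for illustration, and it assumes tail is available inside the initrd):

```shell
# Print the last lines of the kiwi boot log and of the captured curl output.
# Hypothetical helper; the defaults are the two paths named in this thread.
show_kiwi_fetch_logs() {
    boot_log="${1:-/run/initramfs/log/boot.kiwi}"
    fetch_info="${2:-/tmp/fetch.info}"
    for f in "$boot_log" "$fetch_info"; do
        echo "== $f =="
        if [ -r "$f" ]; then
            # Only the tail, since rd.debug output can be very long
            tail -n 40 "$f"
        else
            echo "(not found)"
        fi
    done
}
```

Calling show_kiwi_fetch_logs without arguments uses the default paths; if tail is missing, cat on each file gives the same information unfiltered.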

Regards,
Marcus

Mark Harrop

Apr 30, 2024, 7:41:36 PM
to kiwi
Marcus,

Thank you for your quick response.

I discovered where I needed to add "rd.debug" through testing after sending off my email and added "rd.kiwi.debug rd.shell" afterwards.

Your clues showed me where to look to prove that my guess was true: no network was available.
[Attached screenshot: fetch.png]
This was the same for all network devices, but eth0 is the one connected.
The file /run/initramfs/log/boot.kiwi showed the failed fetch (and failed retries). It also contains more information that I need to dig into.

I now have a command-line prompt that allows further debugging.
I am not sure what is missing from my current 15.5 initrd.
I am also trying to replicate this on my working 15.3 system for comparison.

Thank you,
Mark Harrop

Mark Harrop

May 2, 2024, 12:20:28 AM
to kiwi
Hi All,

I attempted to debug from the rd.shell prompt but could not get the NIC up and configured.
I then decided to look at the initrd differences between 15.3 and 15.5.
I untarred the ()install.tar files of the working 15.3 and non-working 15.5 builds to get the ().initrd files.
I uncompressed these to get the initrd directories and files (xz compression for 15.3, zstd for 15.5).
I then created a "tree xml" file of each initrd directory to look for differences.
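
A comparison like this can also be sketched directly on the two unpacked directories, without going through tree. The function name and the directory names in the usage line are hypothetical, and it assumes find, sort and comm are available on the build host:

```shell
# List paths that exist in the first unpacked initrd tree
# but are missing from the second one.
initrd_missing_in_new() {
    old_root="$1"
    new_root="$2"
    old_list=$(mktemp)
    new_list=$(mktemp)
    # Relative paths, sorted the same way, so the two lists are comparable
    (cd "$old_root" && find . | LC_ALL=C sort) > "$old_list"
    (cd "$new_root" && find . | LC_ALL=C sort) > "$new_list"
    # comm -23 keeps only the lines unique to the first input
    LC_ALL=C comm -23 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}
```

For example, initrd_missing_in_new initrd-15.3 initrd-15.5 | grep -v -e '/lib/firmware' -e '/lib/modules' would hide the firmware and kernel module noise.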

Scanning through the many changes, I saw many in firmware and kernel extensions.
I removed these from the files and was left with 125 differences.
15.3 - 469 directories - 2834 files
15.5 - 509 directories - 3075 files
Scanning these for items missing in 15.5, the following stood out:
/bin/arp
/bin/ping
/sysconfig/network/(ifcfg eth0/eth1/lo)
/lib/dracut/hooks/pre-udev/60-net-genrules.sh
/sbin/dhclient.conf
/usr/bin/arping
/usr/bin/ping
/usr/lib/systemd/system/initrd.target.wants (directory)
/lib64/libwicked
/sbin/arping
/sbin/wicked

I have googled and grepped trying to figure out how to enable the network in dracut and only came up with useless results.
I am not sure what dracut module to modify to get/start network services.

In case it helps, I have attached the 15.3 and 15.5 modified tree xml files.
My kiwi config.xml file has minimal changes between 15.3 and 15.5.


Any help would be appreciated.
Thanks,
Mark
15.3_initrd_tree_mod.out
15.5_initrd_tree_mod.out

Marcus Schäfer

May 2, 2024, 4:29:40 AM
to kiwi-...@googlegroups.com
Hi,

> I attempted to debug from the rd.shell prompt but could not get the nic
> interface up and configured.

I suggest you try the following:

1. Make sure to pass the following kernel cmdline options:

   rd.debug rd.shell rd.neednet=1 ip=dhcp

   This assumes the client can get an IP address via DHCP. If this
   is not the case in your network, search for the ip= setting in
   dracut for static IP/route assignment.

2. Make sure in your kiwi image description you install
   all components such that dracut can include all software
   needed to set up a network. I usually pull in NetworkManager,
   but systemd-networkd or wicked should also work. Dracut has
   several modules to set up the network. Just try:

   <package name="NetworkManager"/>
   <package name="iproute2"/>

3. Build the image and search in the build log file for any
   suspicious error messages from dracut (they are not treated as fatal).
   So look up the EXEC call for "dracut ..."; all lines after it
   in the kiwi log are dracut output. This can help to find issues.

4. Boot up. If it fails you will land in
   the rescue shell. From there we need to check:

   1. Do you have a NIC at all?

      cat /proc/net/dev

   2. In case of NetworkManager, dracut starts the nm service.
      So what does it say?

      systemctl status nm-initrd

   3. Overall, the systemd journal can be helpful:

      journalctl

Just some ideas to try
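
The three rescue-shell checks could be wrapped up roughly as follows. This is only a sketch: the function names are invented here, the optional file argument exists just so the parser can be tried on a saved copy, and it assumes sed, systemctl and journalctl are present in the initrd:

```shell
# Print each interface from /proc/net/dev with its packet counters.
list_nics() {
    dev_file="${1:-/proc/net/dev}"
    # Drop the two header lines, then split "name: counters" at the colon
    sed '1,2d' "$dev_file" | while IFS=: read -r name counters; do
        # Unquoted on purpose: word splitting trims the column padding
        set -- $name $counters
        # Field 3 is received packets, field 11 is transmitted packets
        echo "$1 rx_packets=$3 tx_packets=${11}"
    done
}

# Run all three checks in order (hypothetical wrapper).
rescue_net_checks() {
    list_nics
    systemctl status nm-initrd --no-pager
    journalctl -b --no-pager
}
```

An interface that is listed but shows zero packets in both directions is present as a device but has not carried any traffic yet.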

Mark Harrop

May 3, 2024, 1:14:04 AM
to kiwi

Marcus,

Thank you. Your suggestions allowed me to make some progress.

> 1. Make sure to pass the following kernel cmdline options
> rd.debug rd.shell rd.neednet=1 ip=dhcp 

This did allow the network to be configured, though the fetch still failed.
According to /run/initramfs/rdsosreport.txt, the network got an IP address before the fetch failed, but /tmp/fetch.log
shows the network down. I think this might be the network timing issue we saw in 15.3, where we needed to extend the wait time.

> 2. Make sure in your kiwi image description you install
> all components such that dracut can include all software
> needed to set up a network. I usually pull in NetworkManager,
> but systemd-networkd or wicked should also work. Dracut has
> several modules to set up the network. Just try:

I am already including the packages iproute2, systemd-networkd, and wicked.


> 3. Build the image and search in the build log file for any
> suspicious error messages from dracut (they are not treated as fatal).
> So look up the EXEC call for "dracut ..."; all lines after it
> in the kiwi log are dracut output. This can help to find issues.

I didn't see any EXEC entries in my kiwi logs.
I am using --logfile=file in my kiwi-ng commands.
The kiwi-ng logs are a lot smaller in 15.5 than in 15.3, where "DEBUG: time | EXEC" entries exist.
Checking the docs, I see that I should be using the --debug flag.
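
For reference, a sketch of what that could look like. The description and target directory names in the comment are placeholders, the helper name dracut_log_section is made up, and the pattern only assumes that the dracut call shows up on a line containing both EXEC and dracut:

```shell
# Build with debug logging, for example:
#   kiwi-ng --debug --logfile=build.log system build \
#       --description ./my-description --target-dir ./my-target

# Print the first EXEC line that mentions dracut and everything after it.
dracut_log_section() {
    sed -n '/EXEC.*dracut/,$p' "$1"
}
```

Running dracut_log_section build.log then shows the dracut call together with the output that follows it in the log.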

> 4. Boot up. If it fails you will land in
> the rescue shell. From there we need to check
> 1. Do you have a nic at all ?

This did show the expected network devices, with some data packets received and transferred.


> 2. In case of network manager, dracut starts the nm service

systemctl status wicked -> service not found
[Attached screenshot: systemclt_status.png]

> 3. Overall systemd journal can be helpful
Still digging through this.


Thank you,
Mark

