pxe boot kernel parameters


Andrew

Apr 10, 2014, 10:19:47 AM
to coreo...@googlegroups.com
I'm interested in running CoreOS on bare metal with a custom-built image. I can PXE boot just fine using the stock images outlined in the docs. Here's my pxelinux.cfg:

default linux
prompt 0
timeout 1
label linux
        kernel /images/coreos/coreos_production_pxe.vmlinuz
        append initrd=/images/coreos/coreos_production_pxe_image.cpio.gz sshkey=".." state=tmpfs: root=squashfs:

However, I'm having trouble PXE booting my own custom-built image. Here are the steps I used to build the image:

 $ ./build_packages
 $ ./build_image prod
 $ ./image_to_vm.sh --from=../build/images/amd64-usr/282.0.0+2014-04-09-1610-a1 --board=amd64-usr --prod_image --format=pxe

I can then boot the pxe image just fine using qemu:

 $ ./coreos_production_pxe.sh -curses -append 'sshkey="..."'

I copied coreos_production_pxe_image.cpio.gz and coreos_production_pxe.vmlinuz to my boot server and attempted to PXE boot the new image on a bare-metal machine, but it seems to hang: the kernel boots but never reaches the login screen.

I found that if I remove the state=tmpfs: root=squashfs: kernel parameters from pxelinux.cfg, it gets farther in the boot process and reaches the login screen, but networking fails to come up (the network interface is down, no DHCP lease, etc.). I can still log in on the console using the local core user account.

Am I missing some steps building the image to support these kernel parameters? Do I need to specify a --disk_layout or --mem when running image_to_vm.sh? 

Thanks in advance for any pointers.

--Andrew



Michael Marineau

Apr 10, 2014, 10:40:31 AM
to coreos-dev

Sorry for the trouble, but the basic issue is that I am still finishing the transition from our old 'amd64-generic' images to the new 'amd64-usr' images, which use a different filesystem layout. The instructions on the website for PXE are still for the old images. You already found part of the answer: the new type, which you built, should not be given the root= and state= parameters. I am not sure why networking is not coming up for you; that should be working, but we now use networks, which doesn't yet handle everything our previous DHCP client did. Could you provide some details on the network/DHCP setup and the full journal of one of these half-successful boots?
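
As a rough illustration of dropping the root= and state= parameters, here is a minimal sketch that strips them from an existing append line with sed. The helper name strip_legacy_params is made up for this example, and the sample line is the one from the pxelinux.cfg earlier in the thread; treat it as a sketch, not an official migration step.

```shell
#!/bin/sh
# Hypothetical helper: remove the legacy root=... and state=... kernel
# parameters (and the whitespace before them) from lines read on stdin.
strip_legacy_params() {
    sed -E 's/[[:space:]]+(root|state)=[^[:space:]]*//g'
}

# The append line from the pxelinux.cfg quoted earlier in the thread.
echo 'append initrd=/images/coreos/coreos_production_pxe_image.cpio.gz sshkey=".." state=tmpfs: root=squashfs:' \
    | strip_legacy_params
```

Only parameters named exactly root= or state= are removed; everything else on the line, including sshkey=, is left untouched.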

Also, if you want to try to build the old image type just pass --board=amd64-generic to each of the commands. I'm guessing it won't resolve the network issue but who knows.

Thanks!

Michael Marineau

Apr 10, 2014, 10:42:45 AM
to coreos-dev

On Apr 10, 2014 7:40 AM, "Michael Marineau" <michael....@coreos.com> wrote:
>
> Sorry for the trouble but the basic issue is that I am still finishing the transition from our old 'amd64-generic' images and the new 'amd64-usr' images which use a different filesystem layout. The instructions on the website for PXE are still for the old images. You already found part of the answer, the new type which you built should not be given the root= and state= parameters. I am not sure why networking is not coming up for you, that should be working but we now use networks

Darn autocorrect, that is networkd :)

Andrew Bruno

Apr 10, 2014, 11:24:59 AM
to coreo...@googlegroups.com
On Thu, Apr 10, 2014 at 10:40 AM, Michael Marineau
<michael....@coreos.com> wrote:
> Sorry for the trouble but the basic issue is that I am still finishing the
> transition from our old 'amd64-generic' images and the new 'amd64-usr'
> images which use a different filesystem layout. The instructions on the
> website for PXE are still for the old images. You already found part of the
> answer, the new type which you built should not be given the root= and
> state= parameters. I am not sure why networking is not coming up for you,
> that should be working but we now use networks which doesn't yet handle
> everything out previous dhcp client did. Could you provide some details on
> the network/dhcp setup and the full journal of one of these half-successful
> boots?

How do I capture the full journal? I'd be happy to provide more
detailed info if that helps.

We have jumbo frames enabled (MTU 9000), and after booting the console
fills up with log messages showing the MTU bouncing back and forth. For example:

[..] igb changing MTU from 1500 to 9000
[..] igb changing MTU from 9000 to 1500
[..] en01 nic is up ...
[..] igb changing MTU from 1500 to 9000
[..] igb changing MTU from 9000 to 1500
[..] en01 nic is up ...
..
..

Thanks,

--Andrew

Michael Marineau

Apr 10, 2014, 11:46:54 AM
to coreos-dev

Haha, cute. I'll look into it.

Andrew Bruno

Apr 10, 2014, 3:30:20 PM
to coreo...@googlegroups.com
Here's some more info in case it helps. If I change the interface-mtu
option on the DHCP server to 1500, it works great: I get a lease and the
network interface stays up.

snip from dhcpd.conf:

host coreos.xxx.xx {
    hardware ethernet xxxx;
    option interface-mtu 9000;
    fixed-address 10.x.x.x;
    option host-name "coreos";
    filename "/pxelinux.0";
    next-server xxxx;
}

After booting up here's a snip from journalctl:
....
Apr 10 18:26:37 coreos systemd-hostnamed[2515]: Changed host name to 'coreos'
Apr 10 18:26:37 coreos systemd-networkd[2494]: eno1: carrier off
Apr 10 18:26:37 coreos systemd-networkd[2494]: eno1: DHCP lease lost
Apr 10 18:26:37 coreos kernel: igb 0000:01:00.0: changing MTU from 9000 to 1500
Apr 10 18:26:38 coreos systemd-networkd[2494]: eno1: link configured
Apr 10 18:26:38 localhost systemd-hostnamed[2515]: Changed host name to 'localhost'
Apr 10 18:26:40 localhost kernel: igb: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Apr 10 18:26:40 localhost systemd-networkd[2494]: eno1: carrier on
Apr 10 18:26:43 localhost systemd-networkd[2494]: eno1: DHCPv4 address 10.xx.xx.xx/24 via 10.xx.x.xx
Apr 10 18:26:43 localhost kernel: igb 0000:01:00.0: changing MTU from 1500 to 9000
Apr 10 18:26:43 coreos systemd-hostnamed[2515]: Changed host name to 'coreos'
Apr 10 18:26:43 coreos systemd-networkd[2494]: eno1: carrier off
Apr 10 18:26:43 coreos systemd-networkd[2494]: eno1: DHCP lease lost
Apr 10 18:26:44 coreos kernel: igb 0000:01:00.0: changing MTU from 9000 to 1500
Apr 10 18:26:44 coreos systemd-networkd[2494]: eno1: link configured
Apr 10 18:26:44 localhost systemd-hostnamed[2515]: Changed host name to 'localhost'
..
..
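
To put a number on the flapping in the journal excerpt above, here is a small sketch that counts MTU changes and lost leases. The function names are hypothetical, and the sample input is just a few lines copied from that excerpt.

```shell
#!/bin/sh
# A few lines copied from the journal excerpt above.
journal_sample() {
cat <<'EOF'
Apr 10 18:26:37 coreos systemd-networkd[2494]: eno1: DHCP lease lost
Apr 10 18:26:37 coreos kernel: igb 0000:01:00.0: changing MTU from 9000 to 1500
Apr 10 18:26:43 localhost kernel: igb 0000:01:00.0: changing MTU from 1500 to 9000
Apr 10 18:26:43 coreos systemd-networkd[2494]: eno1: DHCP lease lost
Apr 10 18:26:44 coreos kernel: igb 0000:01:00.0: changing MTU from 9000 to 1500
EOF
}

# Hypothetical helpers: count matching journal lines read from stdin.
count_mtu_changes()  { grep -c 'changing MTU from'; }
count_lease_losses() { grep -c 'DHCP lease lost'; }

echo "MTU changes:  $(journal_sample | count_mtu_changes)"
echo "lease losses: $(journal_sample | count_lease_losses)"
```

On a live machine the same helpers could presumably be fed from `journalctl -b --no-pager`, and `journalctl -b --no-pager > journal.txt` should save the full journal for the current boot.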

If I change the interface-mtu to 1500 on the dhcp server:

host coreos.xxx.xx {
    option interface-mtu 1500;
}

Then run:

$ systemctl restart systemd-networkd
$ journalctl
...
Apr 10 19:03:45 localhost systemd-networkd[2716]: eno1: DHCPv4 address 10.xx.x.xx/24 via 10.xx.x.xx
Apr 10 19:03:45 coreos systemd-hostnamed[2515]: Changed host name to 'coreos'
Apr 10 19:03:45 coreos systemd-networkd[2716]: eno1: link configured
Apr 10 19:05:06 coreos systemd[1]: Starting system-sshd.slice.
Apr 10 19:05:06 coreos systemd[1]: Created slice system-sshd.slice.
Apr 10 19:05:06 coreos systemd[1]: Found device dev-disk-by\x2dlabel-STATE.device.
Apr 10 19:05:06 coreos systemd[1]: Started Resize STATE partition.
Apr 10 19:05:06 coreos systemd[1]: Mounted /media/state.
..
..


This sounds like it may be more of a systemd-networkd issue than a CoreOS
one, or perhaps a misconfiguration of our DHCP server, but I thought I'd
pass along the info. In any event, I now have a custom CoreOS image PXE
booting with networking (sans jumbo frames). Thanks again for the help.
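
For anyone wanting to confirm which MTU an interface actually ended up with, here is a minimal sketch that reads it from sysfs. It assumes a Linux host exposing /sys/class/net; the function name list_mtus and its optional directory argument are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print "<interface><TAB><mtu>" for every interface
# found under a sysfs net directory (default: /sys/class/net).
list_mtus() {
    net_dir="${1:-/sys/class/net}"
    for dev in "$net_dir"/*; do
        # Skip entries without a readable mtu file (or an unmatched glob).
        [ -r "$dev/mtu" ] || continue
        printf '%s\t%s\n' "${dev##*/}" "$(cat "$dev/mtu")"
    done
}

list_mtus
```

With the DHCP option set to 1500, the eno1 line would be expected to show 1500 once the lease is applied.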

--Andrew




Brandon Philips

Apr 10, 2014, 3:38:53 PM
to coreos-dev
This does look like a systemd-networkd bug.

Is there any chance you can grab a tcpdump of the traffic and add
SYSTEMD_LOG_LEVEL=debug to the environment of systemd-networkd?

You can also just run
`SYSTEMD_LOG_LEVEL=debug /usr/lib/systemd/systemd-networkd`
directly after `systemctl stop systemd-networkd`.
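
The two debugging steps above could be sketched roughly as follows. This is only a sketch: the interface name eno1 comes from the journal output earlier in the thread, while the capture path /tmp/dhcp-mtu.pcap, the DHCP port filter, and the helper names are assumptions. By default the script prints the commands instead of running them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root) to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

IFACE="${IFACE:-eno1}"              # interface name seen in the journal output
PCAP="${PCAP:-/tmp/dhcp-mtu.pcap}"  # hypothetical capture file

# Capture DHCP traffic (UDP ports 67/68) to a pcap file.
capture_dhcp() {
    run tcpdump -n -i "$IFACE" -w "$PCAP" port 67 or port 68
}

# Stop the service, then run networkd in the foreground with debug logging.
debug_networkd() {
    run systemctl stop systemd-networkd
    run env SYSTEMD_LOG_LEVEL=debug /usr/lib/systemd/systemd-networkd
}

# With the default DRY_RUN=1 this only prints the plan.
capture_dhcp
debug_networkd
```

When running for real, each function is best called from its own root shell, since both tcpdump and networkd stay in the foreground until interrupted.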

Thanks!

Brandon

Andrew

Apr 11, 2014, 1:56:45 PM
to coreo...@googlegroups.com
Attached are the tcpdump and systemd debug log output. Let me know if you need anything else.

--Andrew 
systemd-networkd-debug.txt
tcpdump-mtu-9000.txt