Matchbox & dnsmasq PXE boot/install hangs


Scott Vernon

Dec 19, 2017, 12:26:43 AM
to CoreOS User
Good afternoon,

I need some help. I have a feeling it will be something obvious, but I'm not sure how to troubleshoot any further.

I've started a matchbox container with the following command:

sudo docker run --net=host --rm -v /var/lib/matchbox:/var/lib/matchbox:Z -v /etc/matchbox:/etc/matchbox:Z,ro quay.io/coreos/matchbox:latest -address=0.0.0.0:8080 -rpc-address=0.0.0.0:8081 -log-level=debug

Also a dnsmasq container:

sudo docker run --rm --cap-add=NET_ADMIN --net=host quay.io/coreos/dnsmasq -d -q --dhcp-range=10.110.204.45,10.110.204.47 --enable-tftp --tftp-root=/var/lib/tftpboot --dhcp-userclass=set:ipxe,iPXE --dhcp-boot=tag:#ipxe,undionly.kpxe --dhcp-boot=tag:ipxe,http://matchbox.pd.origin.foxtel.com.au:8080/boot.ipxe --address=/matchbox.pd.origin.foxtel.com.au/10.110.204.41 --log-queries --log-dhcp

I've copied some example configs into /var/lib/matchbox/groups, profiles and ignition.

I can start a VMware virtual machine, and I see it get a DHCP address, match a group and profile, and pull the image:

time="2017-12-19T05:03:05Z" level=info msg="HTTP GET /boot.ipxe"
time="2017-12-19T05:03:05Z" level=info msg="HTTP GET /ipxe?uuid=4228964f-6646-3833-04d1-ed4eae5e498f&mac=00-50-56-a8-22-eb&domain=&hostname=&serial=VMware-42%2028%2096%204f%2066%2046%2038%2033-04%20d1%20ed%204e%20ae%205e%2049%208f"
time="2017-12-19T05:03:05Z" level=debug msg="Matched an iPXE config" labels=map[serial:VMware-42 28 96 4f 66 46 38 33-04 d1 ed 4e ae 5e 49 8f uuid:4228964f-6646-3833-04d1-ed4eae5e498f mac:00:50:56:a8:22:eb domain: hostname:] profile=simple
time="2017-12-19T05:03:05Z" level=info msg="HTTP GET /assets/coreos/1576.4.0/coreos_production_pxe.vmlinuz"
time="2017-12-19T05:03:06Z" level=info msg="HTTP GET /assets/coreos/1576.4.0/coreos_production_pxe_image.cpio.gz"
time="2017-12-19T05:03:22Z" level=info msg="HTTP GET /ignition?uuid=4228964f-6646-3833-04d1-ed4eae5e498f&mac=00-50-56-a8-22-eb"
time="2017-12-19T05:03:22Z" level=debug msg="Matched an Ignition or Container Linux Config template" group=default labels=map[mac:00:50:56:a8:22:eb uuid:4228964f-6646-3833-04d1-ed4eae5e498f] profile=simple

However, the VM doesn't completely boot. If I try a simple PXE boot with no install, I can ping the address that's allocated but cannot SSH into the machine. If I try a boot and install, I cannot even ping. The VM console hangs at the point shown in the attached image.

If anyone can offer any guidance or suggestions it would be much appreciated.

Many thanks,
Scott 

corepxe.PNG

Benjamin Gilbert

Dec 19, 2017, 12:47:22 AM
to CoreOS User, Scott Vernon
On Mon, Dec 18, 2017 at 9:26 PM, Scott Vernon <sc0tt....@gmail.com> wrote:
However the VM doesn't completely boot.  [...]  The VM console hangs at the point shown in the attached image.

This usually means that Ignition failed and dropped to an emergency shell, but the emergency shell is running on a different console.  Check the kernel command line arguments being passed to the machine in the iPXE config.  If "console=tty0" is not the last "console=" argument, move it to the end of the kernel command line.  Then boot the machine and use "journalctl -t ignition" at the emergency shell prompt to see the Ignition logs.
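
As a rough sketch of that reordering (not part of matchbox itself): the Python below takes a profile's "args" list and moves "console=tty0" to the end, so that it becomes the last "console=" argument. The kernel uses the last "console=" entry as the primary console, which is where the emergency shell prompt appears. The function name move_console_last and the sample list are made up for illustration.

```python
def move_console_last(args, console="console=tty0"):
    """Return a copy of args with `console` moved to the end, if present.

    Hypothetical helper for illustration; `args` mirrors the "args" list
    in a matchbox profile's "boot" section.
    """
    if console not in args:
        return list(args)
    reordered = [a for a in args if a != console]
    reordered.append(console)
    return reordered


# Illustrative kernel arguments (assumed ordering, not taken from a real profile).
sample_args = [
    "initrd=coreos_production_pxe_image.cpio.gz",
    "coreos.first_boot=yes",
    "console=tty0",
    "console=ttyS0",
    "coreos.autologin",
]

for arg in move_console_last(sample_args):
    print(arg)
```

The other arguments keep their relative order; only the position of "console=tty0" changes.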

--Benjamin Gilbert

Scott Vernon

Dec 19, 2017, 7:41:26 PM
to CoreOS User
Thank you Benjamin,

I have tried your suggestion; however, I am still unable to SSH to the new VM. I wasn't clear enough yesterday, but I do actually get a "connection refused" message back from the VM. Unfortunately, changing the position of the tty0 argument has made no difference.

{
  "id": "simple-install",
  "name": "Simple CoreOS Container Linux Alpha Install",
  "boot": {
    "kernel": "/assets/coreos/1576.4.0/coreos_production_pxe.vmlinuz",
    "initrd": ["/assets/coreos/1576.4.0/coreos_production_pxe_image.cpio.gz"],
    "args": [
      "initrd=coreos_production_pxe_image.cpio.gz",
      "coreos.first_boot=yes",
      "console=ttyS0",
      "console=tty0",
      "coreos.autologin"
    ]
  },
  "ignition_id": "install-reboot.yaml"
}


core@sydipdmespxe01 /var/lib/matchbox/groups $ ssh -vvv co...@10.110.204.46
OpenSSH_7.4p1, OpenSSL 1.0.2k  26 Jan 2017
debug1: Reading configuration data /etc/ssh/ssh_config
debug2: resolving "10.110.204.46" port 22
debug2: ssh_connect_direct: needpriv 0
debug1: Connecting to 10.110.204.46 [10.110.204.46] port 22.
debug1: connect to address 10.110.204.46 port 22: Connection refused
ssh: connect to host 10.110.204.46 port 22: Connection refused

Benjamin Gilbert

Dec 22, 2017, 12:04:06 AM
to CoreOS User, Scott Vernon
On Tue, Dec 19, 2017 at 4:41 PM, Scott Vernon <sc0tt....@gmail.com> wrote:
I have tried your suggestion; however, I am still unable to SSH to the new VM. I wasn't clear enough yesterday, but I do actually get a "connection refused" message back from the VM. Unfortunately, changing the position of the tty0 argument has made no difference.

Right, that's expected.  Ignition is designed so that if it fails for whatever reason, the machine will fail to boot.  The purpose of moving the tty0 argument is to get a functioning graphical console so you can obtain debug logs from Ignition.  Once you have an emergency shell prompt on the console, use "journalctl -t ignition" to see the logs.
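
For what it's worth, "connection refused" fits that picture: a refusal means the host answered but nothing was listening on port 22, whereas a timeout would suggest the machine is not reachable at all. A minimal Python sketch that tells the two apart, where the function name probe_tcp and the 3-second timeout are assumptions for illustration:

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Classify a TCP connection attempt as 'open', 'refused' or 'unreachable'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host replied, but no service is listening on this port.
        return "refused"
    except OSError:
        # Covers timeouts, "no route to host" and similar failures.
        return "unreachable"


# Example call (address taken from the ssh output earlier in the thread):
#   probe_tcp("10.110.204.46", 22)
```

A "refused" result only says that the network stack is up and sshd is not listening; the reason still has to come from the console and the Ignition logs.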

--Benjamin Gilbert

Scott Vernon

Jan 11, 2018, 11:19:24 PM
to CoreOS User
Hi Benjamin, 

I managed to get a usable console in vSphere by removing the tty entries completely from the config. I'm slightly embarrassed to admit that a lack of memory was the reason the VM wouldn't boot. Once I upped the RAM from 1 GB to 4 GB, it booted successfully.

Thanks again,
Scott