Shell provisioner never completes when reboot occurs in the middle

Shawn Neal

Dec 2, 2013, 4:28:45 PM
to packe...@googlegroups.com
For a Windows _guest_ I have a provisioner block like so:

  "provisioners": [
    {
      "type": "shell", "inline": [
        "shutdown /r /t 5 /f /d p:4:1 /c \"Packer Reboot\"",
        "ping -n 30 127.0.0.1"
      ]
    },
    {
      "type": "shell", "inline": [
        "ping www.google.com"
      ]
    }
  ]

The first block schedules a reboot and then pings localhost 30 times, which should take about 30 seconds; this has the same effect as sleep 30 on Linux. The shell provisioner never receives the exit channel message because the reboot happens before the script finishes. The problem is that the next script, which pings Google, never starts: Packer is still waiting for the first provisioner to report completion.

It seems like I'm suffering from this problem (from http://www.packer.io/docs/provisioners/shell.html):

put a long sleep after the reboot so that SSH will eventually be killed automatically. Some OS configurations don't properly kill all network connections on reboot, causing the provisioner to hang despite a reboot occurring. In this case, make sure you shut down the network interfaces on reboot or in your shell script.

I've tried various things, such as using netsh to disable/enable the NIC and restarting the OpenSSHd service, but nothing seems to force Packer to move on. A provisioner timeout might be handy here. I'm running OpenSSH on the Windows Server 2008 R2 guest via packer-windows. Here's the relevant log snippet from around the time the guest reboots:

...
    vmware: Approximate round trip times in milli-seconds:
2013/12/02 10:59:46 ui:     vmware: Approximate round trip times in milli-seconds:
    vmware: Minimum = 0ms, Maximum = 0ms, Average = 0ms
2013/12/02 10:59:46 ui:     vmware: Minimum = 0ms, Maximum = 0ms, Average = 0ms
    vmware: Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
2013/12/02 10:59:47 ui:     vmware: Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
2013/12/02 10:59:51 /Users/sneal/src/packer/bin/packer-builder-vmware: 2013/12/02 10:59:51 Opening conn for SSH to tcp 192.168.211.154:22
2013/12/02 11:00:06 /Users/sneal/src/packer/bin/packer-builder-vmware: 2013/12/02 11:00:06 background SSH connection checker failure: dial tcp 192.168.211.154:22: i/o timeout
2013/12/02 11:00:11 /Users/sneal/src/packer/bin/packer-builder-vmware: 2013/12/02 11:00:11 Opening conn for SSH to tcp 192.168.211.154:22
2013/12/02 11:00:16 /Users/sneal/src/packer/bin/packer-builder-vmware: 2013/12/02 11:00:16 Opening conn for SSH to tcp 192.168.211.154:22
...

The "Opening conn for SSH to tcp 192.168.211.154:22" line then repeats forever.

ja...@geteventstore.com

Apr 27, 2014, 6:35:42 PM
to packe...@googlegroups.com
I hit this problem earlier today - it seems a workable solution is to kill sshd instead of the ping trick:

    {
      "type": "shell", "inline": [
        "shutdown /r /t 5 /f /d p:4:1 /c \"Packer Reboot\"",
        "taskkill /im sshd.exe /f"
      ]
    },

Seems to do the trick - Packer moves on to the next provisioner as soon as the SSH connection drops.

Shawn Neal

Apr 27, 2014, 7:38:46 PM
to packe...@googlegroups.com
You might have better luck with the pause_before provisioner param - just put it in the provisioner that runs immediately after the reboot. See https://github.com/mitchellh/packer/pull/737

"pause_before": "30s"
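
For illustration, here is a rough sketch of where that setting would sit in the template from the original post. It assumes the first provisioner only schedules the reboot and that 30 seconds is long enough for the guest to come back up; neither assumption has been tested here, so adjust the duration for your own guest.

```json
{
  "provisioners": [
    {
      "type": "shell",
      "inline": [
        "shutdown /r /t 5 /f /d p:4:1 /c \"Packer Reboot\""
      ]
    },
    {
      "type": "shell",
      "pause_before": "30s",
      "inline": [
        "ping www.google.com"
      ]
    }
  ]
}
```

The pause applies before the second provisioner starts, so it belongs on the provisioner that follows the reboot, not on the one that triggers it.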



ja...@geteventstore.com

Apr 29, 2014, 9:22:06 AM
to packe...@googlegroups.com
Doh, didn't spot that - would have saved me some time!

In my particular case the shutdown might take a while (I'm not sure how long), so killing sshd seems preferable, as I'd otherwise need a really long pause.

Cheers,


James