I've been testing Nomad's behaviour on client crashes and came across something strange.
I have a Nomad server and client configured on separate hosts. I ran a system job from the server with one Docker task called store-app-va, which successfully created a Docker container on the client. The Docker container id for this job was e88e6630f436.
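In case it's useful, the job file was roughly equivalent to the minimal sketch below. The image name, registry address, and static port binding are taken from the log output further down; the job name, datacenter, and other values are placeholders, not the exact file I used.

```hcl
# Minimal sketch of the job, reconstructed from the logs below.
# Image and static port come from the log output; everything else
# (job/group names, datacenter, resources) is assumed.
job "store-app" {
  datacenters = ["dc1"]
  type        = "system"

  group "store" {
    task "store-app-va" {
      driver = "docker"

      config {
        image = "192.168.0.70:5000/store-va"
        port_map {
          http = 80
        }
      }

      resources {
        network {
          port "http" {
            static = 80
          }
        }
      }
    }
  }
}
```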
Next I rebooted the client host. When it came back up, the Docker container was in a stopped state but still visible with docker ps -a. At this point the Nomad client hadn't started yet, since I haven't configured it to start automatically. When I started the Nomad client I got the following log entries:
Feb 04 04:20:16 nomad-store1 nomad[3356]: 2016/02/04 04:20:16 [ERR] client: failed to open handle to task 'store-app-va' for alloc '51c01501-1e1d-d85d-9e16-494e5f14433e': Failed to find container e88e6630f436248176e0b034d3ce67e760b1801ff06521d96ad4b9247a03bad8: <nil>
Feb 04 04:20:16 nomad-store1 nomad[3356]: 2016/02/04 04:20:16 [DEBUG] client: starting task context for 'store-app-va' (alloc '51c01501-1e1d-d85d-9e16-494e5f14433e')
Feb 04 04:20:16 nomad-store1 nomad[3356]: 2016/02/04 04:20:16 [DEBUG] client: starting runner for alloc '51c01501-1e1d-d85d-9e16-494e5f14433e'
Feb 04 04:20:16 nomad-store1 nomad[3356]: 2016/02/04 04:20:16 [DEBUG] driver.docker: binding directories []string{"/var/log/nomad/alloc/51c01501-1e1d-d85d-9e16-494e5f14433e/alloc:/alloc:rw,z", "/var/log/nomad/alloc/51c01501-1e1d-d85d-9e16-494e5f14433e/store-app-va:/local:rw,Z"} for 192.168.0.70:5000/store-va
Feb 04 04:20:16 nomad-store1 nomad[3356]: 2016/02/04 04:20:16 [DEBUG] driver.docker: setting container name to: store-app-va-51c01501-1e1d-d85d-9e16-494e5f14433e
Feb 04 04:20:16 nomad-store1 docker[3303]: time="2016-02-04T04:20:16.300378727Z" level=info msg="POST /containers/create?name=store-app-va-51c01501-1e1d-d85d-9e16-494e5f14433e"
Feb 04 04:20:16 nomad-store1 docker[3303]: time="2016-02-04T04:20:16.305185725Z" level=error msg="Handler for POST /containers/create returned error: Conflict. The name \"store-app-va-51c01501-1e1d-d85d-9e16-494e5f14433e\" is already in use by container e88e6630f436. You have to delete (or rename) that container to be able to reuse that name."
Feb 04 04:20:16 nomad-store1 docker[3303]: time="2016-02-04T04:20:16.305264522Z" level=error msg="HTTP Error" err="Conflict. The name \"store-app-va-51c01501-1e1d-d85d-9e16-494e5f14433e\" is already in use by container e88e6630f436. You have to delete (or rename) that container to be able to reuse that name." statusCode=409
Feb 04 04:20:16 nomad-store1 docker[3303]: time="2016-02-04T04:20:16.305735310Z" level=info msg="GET /containers/json?all=1&filters=%7B%22name%22%3A%5B%22store-app-va-51c01501-1e1d-d85d-9e16-494e5f14433e%22%5D%7D"
Feb 04 04:20:16 nomad-store1 nomad[3356]: 2016/02/04 04:20:16 [INFO] driver.docker: a container with the name store-app-va-51c01501-1e1d-d85d-9e16-494e5f14433e already exists; will attempt to purge and re-create
Feb 04 04:20:27 nomad-store1 nomad[3356]: 2016/02/04 04:20:27 [INFO] driver.docker: purged container store-app-va-51c01501-1e1d-d85d-9e16-494e5f14433e
Feb 04 04:20:27 nomad-store1 docker[3303]: time="2016-02-04T04:20:27.403240577Z" level=info msg="POST /containers/create?name=store-app-va-51c01501-1e1d-d85d-9e16-494e5f14433e"
Feb 04 04:20:28 nomad-store1 nomad[3356]: 2016/02/04 04:20:28 [ERR] client: failed to start task 'store-app-va' for alloc '51c01501-1e1d-d85d-9e16-494e5f14433e': Failed to start container 0c9911612ce71af682727670bd29d682b97dcb35e3d4b09bd6d1567634f2be2a: API error (500): Cannot start container 0c9911612ce71af682727670bd29d682b97dcb35e3d4b09bd6d1567634f2be2a: Bind for 192.168.0.71:80 failed: port is already allocated
Feb 04 04:20:28 nomad-store1 nomad[3356]: 2016/02/04 04:20:28 [DEBUG] client: terminating runner for alloc '51c01501-1e1d-d85d-9e16-494e5f14433e'
The end result is that the original container has been purged and two new containers created. The container id in the first logged error looks odd; it has the real container id at the beginning, followed by a sequence of extra characters (it appears to be the full 64-character container id rather than the truncated 12-character form that docker ps shows). What is the expected behaviour in this case? Should the client agent find the old container and start it again, or will it always create a new container when a client restarts? If the latter, why did it create two new containers?