Docker Swarm Overlay Network Problems

Jacob Henderson

Feb 23, 2022, 11:24:45 PM
to ClusterHAT
After following the instructions here to set up a Docker swarm on CNAT, I deployed a swarm stack with an overlay network. All nodes show up as active with `docker network ls`.

The Docker services on the Pi Zeros all fail to start with this error:

Feb 24 04:07:29 p1 dockerd[10001]: time="2022-02-24T04:07:29Z" level=error msg="enabling default vlan on bridge br0 failed open /sys/class/net/br0/bridge/default_pvid: permission denied"
Feb 24 04:07:29 p1 dockerd[10001]: time="2022-02-24T04:07:29.470598391Z" level=error msg="reexec to set bridge default vlan failed exit status 1"
Feb 24 04:07:45 p1 dockerd[10001]: time="2022-02-24T04:07:45.506170587Z" level=info msg="ignoring event" container=f3a9592298684ed2915e91fbfe3e6927fa8c18ffff79be748c19d159e63fa69c module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Feb 24 04:07:50 p1 dockerd[10001]: time="2022-02-24T04:07:50.137140295Z" level=error msg="fatal task error" error="task: non-zero exit (139)" module=node/agent/taskmanager
Feb 24 04:07:51 p1 dockerd[10001]: time="2022-02-24T04:07:51.765275512Z" level=warning msg="Error (Unable to complete atomic operation, key modified) deleting object [endpoint q1jpb5oeysmfn3k6uoq4p3maz 7848fd2496431f838fac9506f0f3f8e686a1fcdf7a54fbd6936b6b3c62ea0715], retrying...."
Feb 24 04:07:52 p1 dockerd[10001]: time="2022-02-24T04:07:52.017525461Z" level=warning msg="rmServiceBinding handleEpTableEvent huginn_web 25221c5d06c8e5af7a8525e73f36957d4f4fadb31a3b6a9a6afc1ae3847b3bdb aborted c.serviceBindings[skey] !ok"
Feb 24 04:07:52 p1 dockerd[10001]: time="2022-02-24T04:07:52.022805439Z" level=warning msg="rmServiceBinding handleEpTableEvent huginn_postgres 467cedda402a1cc269829637ff29d75d5be029443de5ca22291ee06f9a9b5d93 aborted c.serviceBindings[skey] !ok"
Feb 24 04:07:52 p1 dockerd[10001]: time="2022-02-24T04:07:52.025887427Z" level=info msg="initialized VXLAN UDP port to 4789 "
Feb 24 04:07:52 p1 dockerd[10001]: time="2022-02-24T04:07:52.123869018Z" level=info msg="initialized VXLAN UDP port to 4789
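
For anyone who wants to pull just the error-level entries out of dockerd journal output like the above, something along these lines might help. It is only a rough sketch and not part of any Docker tooling: the names `ENTRY` and `entries_at_level` are made up here, and it assumes the `time="..." level=... msg="..."` field order seen in these logs.

```python
import re

# Assumed field order, taken from the log entries above:
#   time="..." level=... msg="..."
ENTRY = re.compile(
    r'time="(?P<time>[^"]+)"\s+'
    r'level=(?P<level>\w+)\s+'
    r'msg="(?P<msg>(?:[^"\\]|\\.)*)"'
)


def entries_at_level(log_text, level="error"):
    """Return (time, msg) pairs for every entry logged at the given level.

    Works whether the entries are one per line or run together on one line,
    because it scans for the time/level/msg fields rather than splitting lines.
    Entries whose msg has no closing quote (truncated output) are skipped.
    """
    return [
        (m.group("time"), m.group("msg"))
        for m in ENTRY.finditer(log_text)
        if m.group("level") == level
    ]


if __name__ == "__main__":
    # Two entries copied from the log above.
    sample = (
        'Feb 24 04:07:29 p1 dockerd[10001]: time="2022-02-24T04:07:29Z" '
        'level=error msg="enabling default vlan on bridge br0 failed open '
        '/sys/class/net/br0/bridge/default_pvid: permission denied"\n'
        'Feb 24 04:07:52 p1 dockerd[10001]: time="2022-02-24T04:07:52.025887427Z" '
        'level=info msg="initialized VXLAN UDP port to 4789 "\n'
    )
    for when, msg in entries_at_level(sample):
        print(when, msg)
```

Run against the full log, this leaves the two `br0` default VLAN errors and the `fatal task error` entry, which makes it easier to see that the permission failure comes first.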

This is the same issue described on GitHub here. It only occurs with services attached to an overlay network. Unfortunately, that covers roughly 90% of the use cases where you'd even use a cluster. Can anyone help me out? I've also heard from others that doing any real cluster work requires CBRIDGE rather than CNAT. Is there any truth to that?
