Re: [ovs-discuss] SB flows not being created in OVN K8 Stateful set


Dumitru Ceara

Aug 5, 2020, 3:54:11 PM
to Brendan Doyle, gmood...@nvidia.com, ovn-kub...@googlegroups.com, ovs-discuss, Girish Moodalbail, aginwala, pcam...@redhat.com
On 8/5/20 5:14 PM, Brendan Doyle wrote:
> Folks,
>
> I'm stumped here: I have the K8s ovnkube-db-raft StatefulSet up and
> running, but when I create a simple network, no SB flows are generated.
>
> ovn-nbctl show shows my network, and ovn-sbctl show shows the physical
> systems in my network. But I can't ping between any hosts because
> ovn-sbctl lflow-list is empty, and there are no errors or warnings in
> the logs. The OVN cluster says it is up and healthy.
>
> Anybody got any ideas why this might be?
>

Hi Brendan,

Maybe I missed it, but is ovn-northd running anywhere? If so, could you
please share its logs?

Thanks,
Dumitru

Dumitru Ceara

Aug 6, 2020, 7:31:33 AM
to Brendan Doyle, gmood...@nvidia.com, ovn-kub...@googlegroups.com, ovs-discuss, Girish Moodalbail, aginwala, pcam...@redhat.com
On 8/6/20 11:54 AM, Brendan Doyle wrote:
> I don't see any ovn-northd.log; I only see one when I'm running
> OVN outside the K8s cluster.
> Before I start the StatefulSet on my K8s nodes I run:
>
> ovn-ctl stop_northd
> ovn-ctl stop_ovsdb
> rm -rf /usr/etc/ovn/*.db
>
>
> The only logs I see are ovn-controller.log (I'm running ovn-controller
> on the K8s nodes), ovsdb-server-nb.log, and ovsdb-server-sb.log.
> These logs look normal.
>

ovn-northd is the daemon that translates OVN_Northbound DB records to
OVN_Southbound DB records (including logical flows). If your deployment
doesn't start ovn-northd, the SB won't get populated.
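
A quick way to check is to query its status over its control socket (a
sketch; this assumes ovn-northd's default pidfile/socket location):

  ovn-appctl -t ovn-northd status

If that fails because there is no socket, ovn-northd isn't running
anywhere.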

Regards,
Dumitru

> K8s node 1 logs
> ============
> 2020-08-05T13:44:51.958Z|00001|vlog|INFO|opened log file
> /var/log/ovn/ovsdb-server-nb.log
> 2020-08-05T13:44:51.961Z|00002|ovsdb_server|INFO|ovsdb-server (Open
> vSwitch) 2.13.90
> 2020-08-05T13:44:51.961Z|00003|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connecting...
> 2020-08-05T13:44:51.961Z|00004|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connected
> 2020-08-05T13:44:51.962Z|00005|raft_rpc|INFO|learned cluster ID 6e05
> 2020-08-05T13:44:51.962Z|00006|raft|INFO|tcp:[253.255.0.34]:6643:
> learned server ID c8a5
> 2020-08-05T13:44:51.962Z|00007|raft|INFO|server c8a5 is leader for term 1
> 2020-08-05T13:44:51.962Z|00008|raft|INFO|rejecting append_request
> because previous entry 1,5 not in local log (mismatch past end of log)
> 2020-08-05T13:44:51.964Z|00009|raft|INFO|server c8a5 added to configuration
> 2020-08-05T13:44:51.966Z|00010|raft|INFO|server 0d3d added to configuration
> 2020-08-05T13:44:51.966Z|00011|raft|INFO|tcp:253.255.0.34:50744: learned
> server ID c8a5
> 2020-08-05T13:44:51.966Z|00012|raft|INFO|tcp:253.255.0.34:50744: learned
> remote address tcp:[253.255.0.34]:6643
> 2020-08-05T13:44:52.015Z|00013|raft|INFO|tcp:253.255.0.35:51780: learned
> server ID 941e
> 2020-08-05T13:44:52.015Z|00014|raft|INFO|tcp:253.255.0.35:51780: learned
> remote address tcp:[253.255.0.35]:6643
> 2020-08-05T13:44:52.015Z|00015|raft|INFO|adding 941e (941e at
> tcp:[253.255.0.35]:6643) to cluster 6e05 failed (not leader)
> 2020-08-05T13:44:52.015Z|00016|raft|INFO|server 941e added to configuration
> 2020-08-05T13:44:52.016Z|00017|reconnect|INFO|tcp:[253.255.0.35]:6643:
> connecting...
> 2020-08-05T13:44:52.016Z|00018|reconnect|INFO|tcp:[253.255.0.35]:6643:
> connected
> 2020-08-05T13:45:01.962Z|00019|memory|INFO|8424 kB peak resident set
> size after 10.0 seconds
> 2020-08-05T13:45:01.962Z|00020|memory|INFO|cells:40 monitors:0
> raft-connections:4
> 2020-08-05T14:10:03.593Z|00021|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connection closed by peer
> 2020-08-05T14:10:03.595Z|00022|raft|INFO|server 941e is leader for term 2
> 2020-08-05T14:10:04.594Z|00023|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connecting...
> 2020-08-05T14:10:04.594Z|00024|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:04.594Z|00025|reconnect|INFO|tcp:[253.255.0.34]:6643:
> waiting 2 seconds before reconnect
> 2020-08-05T14:10:06.594Z|00026|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connecting...
> 2020-08-05T14:10:06.594Z|00027|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:06.594Z|00028|reconnect|INFO|tcp:[253.255.0.34]:6643:
> waiting 4 seconds before reconnect
> 2020-08-05T14:10:08.453Z|00029|raft|INFO|received leadership transfer
> from 941e in term 2
> 2020-08-05T14:10:08.453Z|00030|raft|INFO|term 3: starting election
> 2020-08-05T14:10:08.453Z|00031|reconnect|INFO|tcp:[253.255.0.35]:6643:
> connection closed by peer
>
> 2020-08-05T13:44:48.971Z|00001|vlog|INFO|opened log file
> /var/log/ovn/ovsdb-server-sb.log
> 2020-08-05T13:44:48.973Z|00002|ovsdb_server|INFO|ovsdb-server (Open
> vSwitch) 2.13.90
> 2020-08-05T13:44:48.973Z|00003|reconnect|INFO|tcp:[253.255.0.34]:6644:
> connecting...
> 2020-08-05T13:44:48.973Z|00004|reconnect|INFO|tcp:[253.255.0.34]:6644:
> connected
> 2020-08-05T13:44:48.973Z|00005|raft_rpc|INFO|learned cluster ID 281c
> 2020-08-05T13:44:48.973Z|00006|raft|INFO|tcp:[253.255.0.34]:6644:
> learned server ID eeed
> 2020-08-05T13:44:48.974Z|00007|raft|INFO|server eeed is leader for term 1
> 2020-08-05T13:44:48.974Z|00008|raft|INFO|rejecting append_request
> because previous entry 1,4 not in local log (mismatch past end of log)
> 2020-08-05T13:44:48.976Z|00009|raft|INFO|server eeed added to configuration
> 2020-08-05T13:44:48.977Z|00010|raft|INFO|server 5098 added to configuration
> 2020-08-05T13:44:48.977Z|00011|raft|INFO|tcp:253.255.0.34:50628: learned
> server ID eeed
> 2020-08-05T13:44:48.977Z|00012|raft|INFO|tcp:253.255.0.34:50628: learned
> remote address tcp:[253.255.0.34]:6644
> 2020-08-05T13:44:49.044Z|00013|raft|INFO|tcp:253.255.0.35:49594: learned
> server ID b9de
> 2020-08-05T13:44:49.044Z|00014|raft|INFO|tcp:253.255.0.35:49594: learned
> remote address tcp:[253.255.0.35]:6644
> 2020-08-05T13:44:49.044Z|00015|raft|INFO|adding b9de (b9de at
> tcp:[253.255.0.35]:6644) to cluster 281c failed (not leader)
> 2020-08-05T13:44:49.044Z|00016|raft|INFO|server b9de added to configuration
> 2020-08-05T13:44:49.044Z|00017|reconnect|INFO|tcp:[253.255.0.35]:6644:
> connecting...
> 2020-08-05T13:44:49.045Z|00018|reconnect|INFO|tcp:[253.255.0.35]:6644:
> connected
> 2020-08-05T13:44:58.975Z|00019|memory|INFO|8408 kB peak resident set
> size after 10.0 seconds
> 2020-08-05T13:44:58.975Z|00020|memory|INFO|cells:39 monitors:0
> raft-connections:4
> 2020-08-05T14:10:03.593Z|00021|reconnect|INFO|tcp:[253.255.0.34]:6644:
> connection closed by peer
> 2020-08-05T14:10:03.595Z|00022|raft|INFO|server b9de is leader for term 2
> 2020-08-05T14:10:04.595Z|00023|reconnect|INFO|tcp:[253.255.0.34]:6644:
> connecting...
> 2020-08-05T14:10:04.595Z|00024|reconnect|INFO|tcp:[253.255.0.34]:6644:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:04.595Z|00025|reconnect|INFO|tcp:[253.255.0.34]:6644:
> waiting 2 seconds before reconnect
>
> 2020-08-05T13:47:27.231Z|00001|vlog|INFO|opened log file
> /usr/var/log/ovn/ovn-controller.log
> 2020-08-05T13:47:27.233Z|00002|reconnect|INFO|unix:/usr/var/run/openvswitch/db.sock:
> connecting...
> 2020-08-05T13:47:27.233Z|00003|reconnect|INFO|unix:/usr/var/run/openvswitch/db.sock:
> connected
> 2020-08-05T13:47:27.235Z|00004|main|INFO|OVS IDL reconnected, force
> recompute.
> 2020-08-05T13:47:27.236Z|00005|reconnect|INFO|tcp:253.255.0.35:6642:
> connecting...
> 2020-08-05T13:47:27.236Z|00006|main|INFO|OVNSB IDL reconnected, force
> recompute.
> 2020-08-05T13:47:27.236Z|00007|reconnect|INFO|tcp:253.255.0.35:6642:
> connected
> 2020-08-05T13:47:27.237Z|00008|ofctrl|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting to switch
> 2020-08-05T13:47:27.237Z|00009|rconn|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting...
> 2020-08-05T13:47:27.237Z|00010|rconn|WARN|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connection failed (No such file or directory)
> 2020-08-05T13:47:27.237Z|00011|rconn|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> waiting 1 seconds before reconnect
> 2020-08-05T13:47:28.239Z|00012|rconn|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting...
> 2020-08-05T13:47:28.239Z|00013|rconn|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connected
> 2020-08-05T13:47:28.240Z|00001|pinctrl(ovn_pinctrl0)|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting to switch
> 2020-08-05T13:47:28.240Z|00002|rconn(ovn_pinctrl0)|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting...
> 2020-08-05T13:47:28.240Z|00003|rconn(ovn_pinctrl0)|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connected
> 2020-08-05T13:47:28.436Z|00014|dpif_netlink|INFO|The kernel module does
> not support meters.
> 2020-08-05T14:10:05.486Z|00015|main|INFO|OVNSB commit failed, force
> recompute next time.
> 2020-08-05T14:10:05.486Z|00016|jsonrpc|WARN|tcp:253.255.0.35:6642:
> receive error: Connection reset by peer
> 2020-08-05T14:10:05.486Z|00017|reconnect|WARN|tcp:253.255.0.35:6642:
> connection dropped (Connection reset by peer)
> 2020-08-05T14:10:05.486Z|00018|reconnect|INFO|tcp:253.255.0.33:6642:
> connecting...
> 2020-08-05T14:10:05.486Z|00019|reconnect|INFO|tcp:253.255.0.33:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:05.487Z|00020|reconnect|INFO|tcp:253.255.0.34:6642:
> connecting...
> 2020-08-05T14:10:05.487Z|00021|reconnect|INFO|tcp:253.255.0.34:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:06.488Z|00022|reconnect|INFO|tcp:253.255.0.35:6642:
> connecting...
> 2020-08-05T14:10:06.488Z|00023|reconnect|INFO|tcp:253.255.0.35:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:06.488Z|00024|reconnect|INFO|tcp:253.255.0.35:6642:
> waiting 2 seconds before reconnect
> 2020-08-05T14:10:08.490Z|00025|reconnect|INFO|tcp:253.255.0.33:6642:
> connecting...
> 2020-08-05T14:10:08.490Z|00026|reconnect|INFO|tcp:253.255.0.33:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:08.490Z|00027|reconnect|INFO|tcp:253.255.0.33:6642:
> waiting 4 seconds before reconnect
> 2020-08-05T14:10:12.492Z|00028|reconnect|INFO|tcp:253.255.0.34:6642:
> connecting...
> 2020-08-05T14:10:12.492Z|00029|reconnect|INFO|tcp:253.255.0.34:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:12.492Z|00030|reconnect|INFO|tcp:253.255.0.34:6642:
> continuing to reconnect in the background but suppressing further logging
>
> K8s node 2
> ---------------
> 2020-08-05T13:43:11.094Z|00001|vlog|INFO|opened log file
> /var/log/ovn/ovsdb-server-nb.log
> 2020-08-05T13:43:11.095Z|00002|raft|INFO|term 1: 40130328487 ms timeout
> expired, starting election
> 2020-08-05T13:43:11.095Z|00003|raft|INFO|term 1: elected leader by 1+ of
> 1 servers
> 2020-08-05T13:43:11.098Z|00004|ovsdb_server|INFO|ovsdb-server (Open
> vSwitch) 2.13.90
> 2020-08-05T13:43:12.074Z|00005|jsonrpc|WARN|unix#4: send error: Broken pipe
> 2020-08-05T13:43:12.074Z|00006|reconnect|WARN|unix#4: connection dropped
> (Broken pipe)
> 2020-08-05T13:43:15.942Z|00007|jsonrpc|WARN|tcp:253.255.0.35:60448:
> receive error: Connection reset by peer
> 2020-08-05T13:43:15.942Z|00008|reconnect|WARN|tcp:253.255.0.35:60448:
> connection dropped (Connection reset by peer)
> 2020-08-05T13:43:15.947Z|00009|raft|INFO|tcp:253.255.0.33:46572: learned
> server ID 0d3d
> 2020-08-05T13:43:15.947Z|00010|raft|INFO|tcp:253.255.0.33:46572: learned
> remote address tcp:[253.255.0.33]:6643
> 2020-08-05T13:43:15.947Z|00011|raft|INFO|starting to add server 0d3d
> (0d3d at tcp:[253.255.0.33]:6643) to cluster 6e05
> 2020-08-05T13:43:15.950Z|00012|raft|INFO|cluster 6e05: installed
> snapshot on server 0d3d  up to 1:1
> 2020-08-05T13:43:15.951Z|00013|reconnect|INFO|tcp:[253.255.0.33]:6643:
> connecting...
> 2020-08-05T13:43:15.952Z|00014|reconnect|INFO|tcp:[253.255.0.33]:6643:
> connected
> 2020-08-05T13:43:15.952Z|00015|raft|INFO|adding 0d3d (0d3d at
> tcp:[253.255.0.33]:6643) to cluster 6e05 succeeded (completed)
> 2020-08-05T13:43:15.997Z|00016|raft|INFO|tcp:253.255.0.35:60500: learned
> server ID 941e
> 2020-08-05T13:43:15.997Z|00017|raft|INFO|tcp:253.255.0.35:60500: learned
> remote address tcp:[253.255.0.35]:6643
> 2020-08-05T13:43:15.997Z|00018|raft|INFO|starting to add server 941e
> (941e at tcp:[253.255.0.35]:6643) to cluster 6e05
> 2020-08-05T13:43:16.000Z|00019|raft|INFO|cluster 6e05: installed
> snapshot on server 941e  up to 1:1
> 2020-08-05T13:43:16.001Z|00020|reconnect|INFO|tcp:[253.255.0.35]:6643:
> connecting...
> 2020-08-05T13:43:16.001Z|00021|reconnect|INFO|tcp:[253.255.0.35]:6643:
> connected
> 2020-08-05T13:43:16.002Z|00022|raft|INFO|adding 941e (941e at
> tcp:[253.255.0.35]:6643) to cluster 6e05 succeeded (completed)
> 2020-08-05T13:43:21.101Z|00023|memory|INFO|8452 kB peak resident set
> size after 10.0 seconds
> 2020-08-05T13:43:21.101Z|00024|memory|INFO|cells:40 monitors:0
> raft-connections:4
>
> 2020-08-05T13:43:11.105Z|00001|vlog|INFO|opened log file
> /var/log/ovn/ovsdb-server-sb.log
> 2020-08-05T13:43:11.106Z|00002|raft|INFO|term 1: 40130328497 ms timeout
> expired, starting election
> 2020-08-05T13:43:11.106Z|00003|raft|INFO|term 1: elected leader by 1+ of
> 1 servers
> 2020-08-05T13:43:11.108Z|00004|ovsdb_server|INFO|ovsdb-server (Open
> vSwitch) 2.13.90
> 2020-08-05T13:43:12.959Z|00005|raft|INFO|tcp:253.255.0.33:46140: learned
> server ID 5098
> 2020-08-05T13:43:12.959Z|00006|raft|INFO|tcp:253.255.0.33:46140: learned
> remote address tcp:[253.255.0.33]:6644
> 2020-08-05T13:43:12.959Z|00007|raft|INFO|starting to add server 5098
> (5098 at tcp:[253.255.0.33]:6644) to cluster 281c
> 2020-08-05T13:43:12.962Z|00008|raft|INFO|cluster 281c: installed
> snapshot on server 5098  up to 1:1
> 2020-08-05T13:43:12.963Z|00009|reconnect|INFO|tcp:[253.255.0.33]:6644:
> connecting...
> 2020-08-05T13:43:12.963Z|00010|reconnect|INFO|tcp:[253.255.0.33]:6644:
> connected
> 2020-08-05T13:43:12.963Z|00011|raft|INFO|adding 5098 (5098 at
> tcp:[253.255.0.33]:6644) to cluster 281c succeeded (completed)
> 2020-08-05T13:43:12.972Z|00012|jsonrpc|WARN|tcp:253.255.0.35:40706:
> receive error: Connection reset by peer
> 2020-08-05T13:43:12.972Z|00013|reconnect|WARN|tcp:253.255.0.35:40706:
> connection dropped (Connection reset by peer)
> 2020-08-05T13:43:13.027Z|00014|raft|INFO|tcp:253.255.0.35:44172: learned
> server ID b9de
> 2020-08-05T13:43:13.027Z|00015|raft|INFO|tcp:253.255.0.35:44172: learned
> remote address tcp:[253.255.0.35]:6644
> 2020-08-05T13:43:13.027Z|00016|raft|INFO|starting to add server b9de
> (b9de at tcp:[253.255.0.35]:6644) to cluster 281c
> 2020-08-05T13:43:13.029Z|00017|raft|INFO|cluster 281c: installed
> snapshot on server b9de  up to 1:1
> 2020-08-05T13:43:13.030Z|00018|reconnect|INFO|tcp:[253.255.0.35]:6644:
> connecting...
> 2020-08-05T13:43:13.030Z|00019|reconnect|INFO|tcp:[253.255.0.35]:6644:
> connected
> 2020-08-05T13:43:13.031Z|00020|raft|INFO|adding b9de (b9de at
> tcp:[253.255.0.35]:6644) to cluster 281c succeeded (completed)
> 2020-08-05T13:43:21.109Z|00021|memory|INFO|8604 kB peak resident set
> size after 10.0 seconds
> 2020-08-05T13:43:21.109Z|00022|memory|INFO|cells:39 monitors:0
> raft-connections:4
>
> 2020-08-05T13:45:52.338Z|00001|vlog|INFO|opened log file
> /usr/var/log/ovn/ovn-controller.log
> 2020-08-05T13:45:52.340Z|00002|reconnect|INFO|unix:/usr/var/run/openvswitch/db.sock:
> connecting...
> 2020-08-05T13:45:52.340Z|00003|reconnect|INFO|unix:/usr/var/run/openvswitch/db.sock:
> connected
> 2020-08-05T13:45:52.342Z|00004|main|INFO|OVS IDL reconnected, force
> recompute.
> 2020-08-05T13:45:52.342Z|00005|reconnect|INFO|tcp:253.255.0.35:6642:
> connecting...
> 2020-08-05T13:45:52.342Z|00006|main|INFO|OVNSB IDL reconnected, force
> recompute.
> 2020-08-05T13:45:52.342Z|00007|reconnect|INFO|tcp:253.255.0.35:6642:
> connected
> 2020-08-05T13:45:52.344Z|00008|ofctrl|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting to switch
> 2020-08-05T13:45:52.344Z|00009|rconn|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting...
> 2020-08-05T13:45:52.344Z|00010|rconn|WARN|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connection failed (No such file or directory)
> 2020-08-05T13:45:52.344Z|00011|rconn|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> waiting 1 seconds before reconnect
> 2020-08-05T13:45:53.344Z|00012|rconn|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting...
> 2020-08-05T13:45:53.345Z|00013|rconn|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connected
> 2020-08-05T13:45:53.346Z|00014|dpif_netlink|INFO|The kernel module does
> not support meters.
> 2020-08-05T13:45:53.348Z|00001|pinctrl(ovn_pinctrl0)|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting to switch
> 2020-08-05T13:45:53.348Z|00002|rconn(ovn_pinctrl0)|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting...
> 2020-08-05T13:45:53.348Z|00003|rconn(ovn_pinctrl0)|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connected
> 2020-08-05T14:08:29.432Z|00015|main|INFO|OVNSB commit failed, force
> recompute next time.
> 2020-08-05T14:08:29.432Z|00016|jsonrpc|WARN|tcp:253.255.0.35:6642:
> receive error: Connection reset by peer
> 2020-08-05T14:08:29.432Z|00017|reconnect|WARN|tcp:253.255.0.35:6642:
> connection dropped (Connection reset by peer)
> 2020-08-05T14:08:29.432Z|00018|reconnect|INFO|tcp:253.255.0.33:6642:
> connecting...
> 2020-08-05T14:08:29.432Z|00019|reconnect|INFO|tcp:253.255.0.33:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:08:29.433Z|00020|reconnect|INFO|tcp:253.255.0.34:6642:
> connecting...
> 2020-08-05T14:08:29.433Z|00021|reconnect|INFO|tcp:253.255.0.34:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:08:30.434Z|00022|reconnect|INFO|tcp:253.255.0.35:6642:
> connecting...
> 2020-08-05T14:08:30.434Z|00023|reconnect|INFO|tcp:253.255.0.35:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:08:30.434Z|00024|reconnect|INFO|tcp:253.255.0.35:6642:
> waiting 2 seconds before reconnect
> 2020-08-05T14:08:32.436Z|00025|reconnect|INFO|tcp:253.255.0.33:6642:
> connecting...
> 2020-08-05T14:08:32.436Z|00026|reconnect|INFO|tcp:253.255.0.33:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:08:32.436Z|00027|reconnect|INFO|tcp:253.255.0.33:6642:
> waiting 4 seconds before reconnect
> 2020-08-05T14:08:36.439Z|00028|reconnect|INFO|tcp:253.255.0.34:6642:
> connecting...
> 2020-08-05T14:08:36.439Z|00029|reconnect|INFO|tcp:253.255.0.34:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:08:36.439Z|00030|reconnect|INFO|tcp:253.255.0.34:6642:
> continuing to reconnect in the background but suppressing further logging
>
> K8s node 3
> --------------
> 2020-08-05T13:44:52.007Z|00001|vlog|INFO|opened log file
> /var/log/ovn/ovsdb-server-nb.log
> 2020-08-05T13:44:52.009Z|00002|ovsdb_server|INFO|ovsdb-server (Open
> vSwitch) 2.13.90
> 2020-08-05T13:44:52.010Z|00003|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connecting...
> 2020-08-05T13:44:52.010Z|00004|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connected
> 2020-08-05T13:44:52.010Z|00005|raft_rpc|INFO|learned cluster ID 6e05
> 2020-08-05T13:44:52.010Z|00006|raft|INFO|tcp:[253.255.0.34]:6643:
> learned server ID c8a5
> 2020-08-05T13:44:52.011Z|00007|raft|INFO|server c8a5 is leader for term 1
> 2020-08-05T13:44:52.011Z|00008|raft|INFO|rejecting append_request
> because previous entry 1,6 not in local log (mismatch past end of log)
> 2020-08-05T13:44:52.013Z|00009|raft|INFO|server c8a5 added to configuration
> 2020-08-05T13:44:52.014Z|00010|raft|INFO|server 0d3d added to configuration
> 2020-08-05T13:44:52.014Z|00011|reconnect|INFO|tcp:[253.255.0.33]:6643:
> connecting...
> 2020-08-05T13:44:52.014Z|00012|reconnect|INFO|tcp:[253.255.0.33]:6643:
> connected
> 2020-08-05T13:44:52.015Z|00013|raft|INFO|server 941e added to configuration
> 2020-08-05T13:44:52.015Z|00014|raft|INFO|tcp:253.255.0.34:52448: learned
> server ID c8a5
> 2020-08-05T13:44:52.015Z|00015|raft|INFO|tcp:253.255.0.34:52448: learned
> remote address tcp:[253.255.0.34]:6643
> 2020-08-05T13:44:52.015Z|00016|raft|INFO|tcp:253.255.0.33:56762: learned
> server ID 0d3d
> 2020-08-05T13:44:52.015Z|00017|raft|INFO|tcp:253.255.0.33:56762: learned
> remote address tcp:[253.255.0.33]:6643
> 2020-08-05T13:45:02.010Z|00018|memory|INFO|8884 kB peak resident set
> size after 10.0 seconds
> 2020-08-05T13:45:02.010Z|00019|memory|INFO|cells:40 monitors:0
> raft-connections:4
> 2020-08-05T14:10:03.593Z|00020|raft|INFO|received leadership transfer
> from c8a5 in term 1
> 2020-08-05T14:10:03.593Z|00021|raft|INFO|term 2: starting election
> 2020-08-05T14:10:03.593Z|00022|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connection closed by peer
> 2020-08-05T14:10:03.594Z|00023|raft|INFO|term 2: elected leader by 2+ of
> 3 servers
> 2020-08-05T14:10:04.594Z|00024|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connecting...
> 2020-08-05T14:10:04.594Z|00025|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:04.594Z|00026|reconnect|INFO|tcp:[253.255.0.34]:6643:
> waiting 2 seconds before reconnect
> 2020-08-05T14:10:06.595Z|00027|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connecting...
> 2020-08-05T14:10:06.595Z|00028|reconnect|INFO|tcp:[253.255.0.34]:6643:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:06.595Z|00029|reconnect|INFO|tcp:[253.255.0.34]:6643:
> waiting 4 seconds before reconnect
>
>
> cat /usr/var/log/ovn//backups/ovsdb-server-sb.13
>
> 2020-08-05T13:44:49.037Z|00001|vlog|INFO|opened log file
> /var/log/ovn/ovsdb-server-sb.log
> 2020-08-05T13:44:49.039Z|00002|ovsdb_server|INFO|ovsdb-server (Open
> vSwitch) 2.13.90
> 2020-08-05T13:44:49.040Z|00003|reconnect|INFO|tcp:[253.255.0.34]:6644:
> connecting...
> 2020-08-05T13:44:49.040Z|00004|reconnect|INFO|tcp:[253.255.0.34]:6644:
> connected
> 2020-08-05T13:44:49.040Z|00005|raft_rpc|INFO|learned cluster ID 281c
> 2020-08-05T13:44:49.040Z|00006|raft|INFO|tcp:[253.255.0.34]:6644:
> learned server ID eeed
> 2020-08-05T13:44:49.040Z|00007|raft|INFO|server eeed is leader for term 1
> 2020-08-05T13:44:49.040Z|00008|raft|INFO|rejecting append_request
> because previous entry 1,5 not in local log (mismatch past end of log)
> 2020-08-05T13:44:49.043Z|00009|raft|INFO|server eeed added to configuration
> 2020-08-05T13:44:49.043Z|00010|raft|INFO|server 5098 added to configuration
> 2020-08-05T13:44:49.043Z|00011|reconnect|INFO|tcp:[253.255.0.33]:6644:
> connecting...
> 2020-08-05T13:44:49.043Z|00012|reconnect|INFO|tcp:[253.255.0.33]:6644:
> connected
> 2020-08-05T13:44:49.044Z|00013|raft|INFO|server b9de added to configuration
> 2020-08-05T13:44:49.044Z|00014|raft|INFO|tcp:253.255.0.34:33550: learned
> server ID eeed
> 2020-08-05T13:44:49.044Z|00015|raft|INFO|tcp:253.255.0.34:33550: learned
> remote address tcp:[253.255.0.34]:6644
> 2020-08-05T13:44:49.044Z|00016|raft|INFO|tcp:253.255.0.33:58002: learned
> server ID 5098
> 2020-08-05T13:44:49.044Z|00017|raft|INFO|tcp:253.255.0.33:58002: learned
> remote address tcp:[253.255.0.33]:6644
> 2020-08-05T13:44:59.041Z|00018|memory|INFO|8652 kB peak resident set
> size after 10.0 seconds
> 2020-08-05T13:44:59.041Z|00019|memory|INFO|cells:39 monitors:0
> raft-connections:4
> 2020-08-05T14:10:03.593Z|00020|raft|INFO|received leadership transfer
> from eeed in term 1
> 2020-08-05T14:10:03.593Z|00021|raft|INFO|term 2: starting election
> 2020-08-05T14:10:03.593Z|00022|reconnect|INFO|tcp:[253.255.0.34]:6644:
> connection closed by peer
> 2020-08-05T14:10:03.594Z|00023|raft|INFO|term 2: elected leader by 2+ of
> 3 servers
> 2020-08-05T14:10:04.594Z|00024|reconnect|INFO|tcp:[253.255.0.34]:6644:
> connecting...
> 2020-08-05T14:10:04.594Z|00025|reconnect|INFO|tcp:[253.255.0.34]:6644:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:04.594Z|00026|reconnect|INFO|tcp:[253.255.0.34]:6644:
> waiting 2 seconds before reconnect
> 2020-08-05T14:10:05.378Z|00027|reconnect|INFO|tcp:[253.255.0.33]:6644:
> connection closed by peer
>
>
> 2020-08-05T13:47:29.329Z|00001|vlog|INFO|opened log file
> /usr/var/log/ovn/ovn-controller.log
> 2020-08-05T13:47:29.331Z|00002|reconnect|INFO|unix:/usr/var/run/openvswitch/db.sock:
> connecting...
> 2020-08-05T13:47:29.331Z|00003|reconnect|INFO|unix:/usr/var/run/openvswitch/db.sock:
> connected
> 2020-08-05T13:47:29.334Z|00004|main|INFO|OVS IDL reconnected, force
> recompute.
> 2020-08-05T13:47:29.334Z|00005|reconnect|INFO|tcp:253.255.0.34:6642:
> connecting...
> 2020-08-05T13:47:29.334Z|00006|main|INFO|OVNSB IDL reconnected, force
> recompute.
> 2020-08-05T13:47:29.334Z|00007|reconnect|INFO|tcp:253.255.0.34:6642:
> connected
> 2020-08-05T13:47:29.336Z|00008|ofctrl|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting to switch
> 2020-08-05T13:47:29.336Z|00009|rconn|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting...
> 2020-08-05T13:47:29.336Z|00010|rconn|WARN|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connection failed (No such file or directory)
> 2020-08-05T13:47:29.336Z|00011|rconn|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> waiting 1 seconds before reconnect
> 2020-08-05T13:47:30.337Z|00012|rconn|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting...
> 2020-08-05T13:47:30.338Z|00013|rconn|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connected
> 2020-08-05T13:47:30.340Z|00014|dpif_netlink|INFO|The kernel module does
> not support meters.
> 2020-08-05T13:47:30.341Z|00001|pinctrl(ovn_pinctrl0)|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting to switch
> 2020-08-05T13:47:30.341Z|00002|rconn(ovn_pinctrl0)|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connecting...
> 2020-08-05T13:47:30.341Z|00003|rconn(ovn_pinctrl0)|INFO|unix:/usr/var/run/openvswitch/br-int.mgmt:
> connected
> 2020-08-05T14:10:03.592Z|00015|main|INFO|OVNSB commit failed, force
> recompute next time.
> 2020-08-05T14:10:03.593Z|00016|jsonrpc|WARN|tcp:253.255.0.34:6642:
> receive error: Connection reset by peer
> 2020-08-05T14:10:03.593Z|00017|reconnect|WARN|tcp:253.255.0.34:6642:
> connection dropped (Connection reset by peer)
> 2020-08-05T14:10:03.593Z|00018|reconnect|INFO|tcp:253.255.0.33:6642:
> connecting...
> 2020-08-05T14:10:03.593Z|00019|reconnect|INFO|tcp:253.255.0.33:6642:
> connected
> 2020-08-05T14:10:05.378Z|00020|main|INFO|OVNSB commit failed, force
> recompute next time.
> 2020-08-05T14:10:05.378Z|00021|jsonrpc|WARN|tcp:253.255.0.33:6642:
> receive error: Connection reset by peer
> 2020-08-05T14:10:05.378Z|00022|reconnect|WARN|tcp:253.255.0.33:6642:
> connection dropped (Connection reset by peer)
> 2020-08-05T14:10:05.379Z|00023|reconnect|INFO|tcp:253.255.0.35:6642:
> connecting...
> 2020-08-05T14:10:05.379Z|00024|reconnect|INFO|tcp:253.255.0.35:6642:
> connected
> 2020-08-05T14:10:05.485Z|00025|main|INFO|OVNSB commit failed, force
> recompute next time.
> 2020-08-05T14:10:05.485Z|00026|jsonrpc|WARN|tcp:253.255.0.35:6642:
> receive error: Connection reset by peer
> 2020-08-05T14:10:05.485Z|00027|reconnect|WARN|tcp:253.255.0.35:6642:
> connection dropped (Connection reset by peer)
> 2020-08-05T14:10:06.486Z|00028|reconnect|INFO|tcp:253.255.0.34:6642:
> connecting...
> 2020-08-05T14:10:06.487Z|00029|reconnect|INFO|tcp:253.255.0.34:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:06.487Z|00030|reconnect|INFO|tcp:253.255.0.34:6642:
> waiting 2 seconds before reconnect
> 2020-08-05T14:10:08.488Z|00031|reconnect|INFO|tcp:253.255.0.33:6642:
> connecting...
> 2020-08-05T14:10:08.488Z|00032|reconnect|INFO|tcp:253.255.0.33:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:08.488Z|00033|reconnect|INFO|tcp:253.255.0.33:6642:
> waiting 4 seconds before reconnect
> 2020-08-05T14:10:12.492Z|00034|reconnect|INFO|tcp:253.255.0.35:6642:
> connecting...
> 2020-08-05T14:10:12.493Z|00035|reconnect|INFO|tcp:253.255.0.35:6642:
> connection attempt failed (Connection refused)
> 2020-08-05T14:10:12.493Z|00036|reconnect|INFO|tcp:253.255.0.35:6642:
> continuing to reconnect in the background but suppressing further logging

Dumitru Ceara

Aug 6, 2020, 9:44:22 AM
to Brendan Doyle, gmood...@nvidia.com, ovn-kub...@googlegroups.com, ovs-discuss, Girish Moodalbail, aginwala, pcam...@redhat.com
On 8/6/20 2:03 PM, Brendan Doyle wrote:
>
>
> On 06/08/2020 12:31, Dumitru Ceara wrote:
>> On 8/6/20 11:54 AM, Brendan Doyle wrote:
>>> I don't see any ovn-northd.log; I only see one when I'm running
>>> OVN outside the K8s cluster.
>>> Before I start the StatefulSet on my K8s nodes I run:
>>>
>>> ovn-ctl stop_northd
>>> ovn-ctl stop_ovsdb
>>> rm -rf /usr/etc/ovn/*.db
>>>
>>>
>>> The only logs I see are ovn-controller.log (I'm running ovn-controller
>>> on the K8s nodes), ovsdb-server-nb.log, and ovsdb-server-sb.log.
>>> These logs look normal.
>>>
>> ovn-northd is the daemon that translates OVN_Northbound DB records to
>> OVN_Southbound DB records (including logical flows). If your deployment
>> doesn't start ovn-northd, the SB won't get populated.
>
> Well, I use ovnkube-db-raft.yaml and ovn-setup.yaml as per the
> documentation; they call ovndb-raft-functions.sh and start the OVN
> processes as:
>
> run_as_ovs_user_if_needed \
>       ${OVNCTL_PATH} run_${db}_ovsdb --no-monitor \
>       --db-${db}-cluster-local-addr=[${ovn_db_host}] \
>       --db-${db}-cluster-local-port=${raft_port} \
>       --db-${db}-cluster-local-proto=${transport} \
>       ${db_ssl_opts} \
>       --ovn-${db}-log="${ovn_loglevel_db}" &
>

This seems to be starting the NB/SB DBs only.

> Are you saying that there is more to do? As in, invoke run-ovn-northd()

Yes, something has to start ovn-northd.

> in ovnkube.sh which is called
> from the ovnkube-master.yaml file? I see that
> https://github.com/ovn-org/ovn-kubernetes says
>
> Apply OVN DaemonSet and Deployment yamls.
>
> # Create OVN namespace, service accounts, ovnkube-db headless service,
> # configmap, and policies
> kubectl create -f $HOME/work/src/github.com/ovn-org/ovn-kubernetes/dist/yaml/ovn-setup.yaml
> # Run ovnkube-db deployment.
> kubectl create -f $HOME/work/src/github.com/ovn-org/ovn-kubernetes/dist/yaml/ovnkube-db.yaml
> # Run ovnkube-master deployment.
> kubectl create -f $HOME/work/src/github.com/ovn-org/ovn-kubernetes/dist/yaml/ovnkube-master.yaml
> # Run ovnkube daemonset for nodes
> kubectl create -f $HOME/work/src/github.com/ovn-org/ovn-kubernetes/dist/yaml/ovnkube-node.yaml
>
>
> But it makes no mention of ovnkube-db-raft.yaml. So is the above the
> correct procedure? What then is ovnkube-db-raft.yaml for?

I'm not sure about the ovnkube specifics; I'll let someone from the
ovn-k8s team comment on this. But as mentioned above, something needs
to start ovn-northd, otherwise the SB won't get populated.
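
For example, something along these lines should do it (a sketch, not
necessarily how ovnkube does it; the script path and the NB client
port 6641 are assumptions to adapt to your deployment, while 6642 is
the SB client port seen in your logs):

  /usr/share/ovn/scripts/ovn-ctl start_northd --no-monitor \
      --ovn-manage-ovsdb=no \
      --ovn-northd-nb-db=tcp:253.255.0.33:6641,tcp:253.255.0.34:6641,tcp:253.255.0.35:6641 \
      --ovn-northd-sb-db=tcp:253.255.0.33:6642,tcp:253.255.0.34:6642,tcp:253.255.0.35:6642

--ovn-manage-ovsdb=no tells ovn-ctl not to spawn its own NB/SB
ovsdb-servers, since yours already run in the StatefulSet.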

Regards,
Dumitru

>
> Note I don't want to replace Flannel with OVN as the CNI; I just want
> to run OVN central in a K8s StatefulSet in a cluster that uses Flannel
> as the CNI.
>
> Is it documented anywhere how I can do this? I would have thought that
> ovnkube-db-raft.yaml was the way to go, and that it would start all
> that is needed?
>
> Do I need to run all of the above yamls too, or just a subset? And
> will they interfere with ovnkube-db-raft.yaml and/or Flannel? Do I
> need to just drop ovnkube-db-raft.yaml?
>
> Can someone provide the exact yamls I need to do this, please?
>
> Thanks
>
>
> Brendan.
>
>
>

Girish Moodalbail

Aug 6, 2020, 11:19:17 AM
to Brendan Doyle, Dumitru Ceara, Girish Moodalbail, ovn-kub...@googlegroups.com, ovs-discuss, aginwala, pcam...@redhat.com


On Thu, Aug 6, 2020 at 7:36 AM Brendan Doyle <brenda...@oracle.com> wrote:
OK thanks, perhaps Girish can comment. I'm thinking that the steps are:

# Create OVN namespace, service accounts, ovnkube-db headless service, configmap, and policies
kubectl create -f $HOME/work/src/github.com/ovn-org/ovn-kubernetes/dist/yaml/ovn-setup.yaml


# Run ovnkube-db deployment.
kubectl apply -f $HOME/work/src/github.com/ovn-org/ovn-kubernetes/dist/yaml/ovnkube-db-raft.yaml


# Run ovnkube-master deployment.
kubectl create -f $HOME/work/src/github.com/ovn-org/ovn-kubernetes/dist/yaml/ovnkube-master.yaml

# Run ovnkube daemonset for nodes
kubectl create -f $HOME/work/src/github.com/ovn-org/ovn-kubernetes/dist/yaml/ovnkube-node.yaml

Yes, those are the steps to get the OVN K8s CNI up and running with the OVN DBs in clustered mode.
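
(To double-check the clustered DBs themselves, a cluster/status query
should show the raft membership; the ctl socket paths below are the
ovn-ctl defaults and may differ in your setup:

  ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
  ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound

Each server should report itself as leader or follower and list all
three cluster members.)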

However, you also say below:

Note I don't want to replace Flannel with OVN as the CNI; I just want
to run OVN central in a K8s StatefulSet in a cluster that uses Flannel
as the CNI.
So, my question is: what are you trying to do? How are you mixing Flannel and the OVN DBs?

Do you want to run OVN DBs in clustered mode as a service (or K8s application) using Flannel as the CNI for your K8s cluster?

Regards,
~Girish

Girish Moodalbail

Aug 6, 2020, 11:49:13 AM
to Brendan Doyle, Dumitru Ceara, Girish Moodalbail, ovn-kub...@googlegroups.com, ovs-discuss, aginwala, pcam...@redhat.com


On Thu, Aug 6, 2020 at 8:23 AM Brendan Doyle <brenda...@oracle.com> wrote:
Yes, I want to use Flannel as the CNI and just have the clustered OVN DBs as a K8s Service,
providing an HA OVN central for the ovn-controllers on the hypervisors in my network. So it
sounds like the above steps won't work for me, and I have to hand-craft/modify the raft yaml
to start northd but not use the rest of the yamls?

Providing clustered OVN DBs as a service is not the goal of the ovn-kubernetes project. However, you can re-use a lot of the project's yamls and container entrypoint scripts to achieve what you want to do:

1. Apply ovn-setup.yaml and ovnkube-db-raft.yaml as you captured above.
2. Edit ovnkube-master.yaml to have only the ovn-northd container and nothing else, and name it ovn-north.yaml. Apply this ovn-north.yaml.
3. Point all the ovn-controller instances in your network at the OVN SB DB instances (see the sketch below for one way to do this). The OVN DB pods run with hostNetwork set to `true`, so they will not be on a flannel network and will therefore be reachable from your hypervisors directly.
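
A minimal version of step 3 on each hypervisor could look like this
(the IPs are the DB node addresses from this thread; 6642 is the SB DB
client port seen in your logs):

  ovs-vsctl set open_vswitch . \
      external_ids:ovn-remote="tcp:253.255.0.33:6642,tcp:253.255.0.34:6642,tcp:253.255.0.35:6642"

ovn-controller reads external_ids:ovn-remote from the local Open
vSwitch database to find the SB DB.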

