On 12/17/20 7:55 PM, Ben Pfaff wrote:
> On Thu, Dec 17, 2020 at 08:54:56AM -0800, Girish Moodalbail wrote:
>> Hello all,
>>
>> Say ovn-nbctl is started in daemon mode with options set for certs, and
>> those certs do not exist on the file system. For example, in the following
>> invocation assume that the `/ovn-cert` folder is empty:
>>
>> ovn-nbctl -vconsole:dbg --pidfile=/tmp/ovn-nbctl.pid
>> --db=ssl:10.0.64.7:6641,ssl:10.0.64.6:6641,ssl:10.0.64.4:6641
>> --log-file=/tmp/ovn-nbctl.log
>> --detach -p /ovn-cert/ovncontroller-privkey.pem -c
>> /ovn-cert/ovncontroller-cert.pem -C /ovn-cert/ca-cert.pem
>>
>> Now, if we run a command against that daemon via....
>>
>> ovs-appctl -t /var/run/ovn/ovn-nbctl.32254.ctl list-commands
> 
> [...]
> 
>> This is my theory. In ovn-nbctl.c`server_loop(), we have this infinite loop
>>
>>     for (;;) {
>>         if (ovsdb_idl_has_ever_connected(idl)) {
>>             daemonize_complete();
>>             unixctl_server_run(server);
>>         }
>>         ovsdb_idl_wait(idl);
>>         unixctl_server_wait(server);
>>         poll_block();
>>     }
>>
>> Since ovsdb_idl_has_ever_connected()  is not true due to missing certs, we
>> never get a chance to run the command from ovs-appctl and then poll_block()
>> will return immediately and we enter an infinite loop?
> 
> (The above is a partial snippet, there's actually more in the loop.)
> 
> It's always an infinite loop; it's just that it wastes CPU in that case.
> I think that you're right about the cause.  I think we should only call
> unixctl_server_wait() if we'd call unixctl_server_run(), so the right
> thing to do appears to be to move the unixctl_server_wait() call into the
> "if" block.

In that case an "ovn-appctl -t ... <command>" will just block until the
IDL connects at least once.

Instead, would there be a concern with calling unixctl_server_run()
unconditionally?

This would allow users to actually interact with the nbctl daemon
and, for example, gracefully stop it if it can't connect for whatever
reason:

    # Start the ovn-nbctl daemon without first starting the NB DB.
    # This blocks because the IDL cannot connect.
    export OVN_NB_DAEMON=$(ovn-nbctl --detach)

    # In a different terminal, enable debug logs, exit, etc.
    ovn-appctl -t /var/run/ovn/ovn-nbctl.18042.ctl vlog/set dbg
    ovn-appctl -t /var/run/ovn/ovn-nbctl.18042.ctl exit

Regards,
Dumitru