[PATCH master] Add design for simplified node-add process


Michael Hanselmann

Nov 16, 2012, 9:53:22 AM
to ganeti...@googlegroups.com
Instead of initiating many SSH connections to copy files using “scp”, a
JSON structure is passed to a program running on the node to be added.
The design is similar to the one used for SSH setup.

Signed-off-by: Michael Hanselmann <han...@google.com>
---
Makefile.am | 1 +
doc/design-draft.rst | 1 +
doc/design-node-add.rst | 92 +++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 94 insertions(+), 0 deletions(-)
create mode 100644 doc/design-node-add.rst

diff --git a/Makefile.am b/Makefile.am
index 8a45a33..f3d1c0e 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -360,6 +360,7 @@ docrst = \
doc/design-linuxha.rst \
doc/design-multi-reloc.rst \
doc/design-network.rst \
+ doc/design-node-add.rst \
doc/design-oob.rst \
doc/design-ovf-support.rst \
doc/design-partitioned.rst \
diff --git a/doc/design-draft.rst b/doc/design-draft.rst
index d22f861..51a7bb0 100644
--- a/doc/design-draft.rst
+++ b/doc/design-draft.rst
@@ -18,6 +18,7 @@ Design document drafts
design-monitoring-agent.rst
design-remote-commands.rst
design-linuxha.rst
+ design-node-add.rst

.. vim: set textwidth=72 :
.. Local Variables:
diff --git a/doc/design-node-add.rst b/doc/design-node-add.rst
new file mode 100644
index 0000000..de4d868
--- /dev/null
+++ b/doc/design-node-add.rst
@@ -0,0 +1,92 @@
+Design for adding node to cluster
+=================================
+
+.. contents:: :depth: 3
+
+
+Current state and shortcomings
+------------------------------
+
+Adding a node to a cluster (master node excluded) currently involves
+setting up SSH (recently :doc:`simplified <design-ssh-setup>`) and then
+copying more than 25 files using ``scp`` before the node daemon can be
+started. No verification is being done before files are copied. Once the
+node daemon was started, an opcode is submitted to the master daemon,
+which will then copy more files, such as the configuration and job queue
+for master candidates, using RPC.
+
+This process is somewhat fragile and requires initiating many SSH
+connections.
+
+Proposed changes
+----------------
+
+Similar to how the :doc:`SSH setup was changed <design-ssh-setup>`, the
+process of copying files and starting the node daemon will be moved into
+a dedicated program. On its standard input it will receive a
+standardized JSON structure (defined :ref:`below
+<node-daemon-setup-json>`). Once the input data has been successfully
+decoded, the received values are verified for sanity, the program
+proceeds to write the values to files and then starts the node daemon
+(``ganeti-noded``).
+
+To add a new node to the cluster, the master node will have to gather
+all values, build the data structure, and then invoke the newly added
+``node-daemon-setup`` program via SSH. In this way only a single SSH
+connection is needed and the values can be verified before being written
+to files.
+
+If the program exists successfully, the node is ready to be added to the
+master daemon's configuration.
+
+.. _node-daemon-setup-json:
+
+JSON structure
+~~~~~~~~~~~~~~
+
+The data is given in an object containing the keys described below.
+Unless specified otherwise, all entries are optional.
+
+``cluster_name``
+ Required string with the cluster name. If a local cluster name is
+ found, the join process is aborted unless the passed cluster name
+ matches the local name. The cluster name is also included in the
+ dictionary given via the ``ssconf`` entry.
+``node_daemon_certificate``
+ Public and private part of cluster's node daemon certificate in PEM
+ format. If a local node certificate is found, the process is aborted
+ unless it matches.
+``rapi_daemon_certificate``
+ Remote API certificate, see ``node_daemon_certificate``.
+``spice_certificate``
+ SPICE server certificate for KVM, see ``node_daemon_certificate``.
+``spice_ca_certificate``
+ SPICE server certificate authority (CA), see
+ ``node_daemon_certificate``.
+``confd_hmac_key``
+ HMAC key for confd (due to a bug in the original confd implementation,
+ this must end with a newline, e.g. ``abcdef\n``).
+``ssconf``
+ Dictionary with ssconf names and their values. Both are strings.
+ Example:
+
+ .. highlight:: javascript
+
+ ::
+
+   {
+     "cluster_name": "cluster.example.com",
+     "master_ip": "192.168.2.1",
+     "master_netdev": "br0",
+     # …
+   }
+
+``start_node_daemon``
+ Boolean denoting whether the node daemon should be started (or
+ restarted if it was running for some reason).
+
+.. vim: set textwidth=72 :
+.. Local Variables:
+.. mode: rst
+.. fill-column: 72
+.. End:
--
1.7.7.3
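
The decode-and-verify step described in the patch above can be sketched roughly as follows. This is only an illustration, not the code from the patch series: the names ``JoinError`` and ``load_and_verify`` are hypothetical, and the checks cover only what the design text states (required ``cluster_name``, abort on a mismatching local cluster name, string-to-string ``ssconf``, boolean ``start_node_daemon``).

```python
import json


class JoinError(Exception):
    """Raised when the received data fails verification (hypothetical name)."""


def load_and_verify(text, local_cluster_name=None):
    """Decode the JSON document read from stdin and sanity-check it.

    local_cluster_name is the name already found on the node, if any.
    Nothing is written to disk here; that would only happen afterwards.
    """
    try:
        data = json.loads(text)
    except ValueError as err:
        raise JoinError("cannot decode input: %s" % err)

    if not isinstance(data, dict):
        raise JoinError("top-level value must be an object")

    # cluster_name is the only entry the design marks as required.
    name = data.get("cluster_name")
    if not isinstance(name, str) or not name:
        raise JoinError("cluster_name is required and must be a string")

    # Abort if the node already belongs to a differently named cluster.
    if local_cluster_name is not None and local_cluster_name != name:
        raise JoinError("local cluster name %r does not match %r"
                        % (local_cluster_name, name))

    # ssconf maps ssconf names to values; both are strings.
    ssconf = data.get("ssconf", {})
    if not isinstance(ssconf, dict) or not all(
            isinstance(k, str) and isinstance(v, str)
            for k, v in ssconf.items()):
        raise JoinError("ssconf must map strings to strings")

    if not isinstance(data.get("start_node_daemon", False), bool):
        raise JoinError("start_node_daemon must be a boolean")

    return data
```

In a real program the text would come from ``sys.stdin.read()``, and files would be written only after this function returns without raising.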

Iustin Pop

Nov 16, 2012, 9:58:23 AM
to Michael Hanselmann, ganeti...@googlegroups.com
On Fri, Nov 16, 2012 at 03:53:22PM +0100, Michael Hanselmann wrote:
> Instead of initiating many SSH connections to copy files using “scp”, a
> JSON structure is passed to a program running on the node to be added.
> The design is similar to the one used for SSH setup.

Since this is very similar and also related to the changed ssh setup,
please fold into that. These two together are about changing how node
adds are done, so feel free to rename the old one if you wish.

iustin

Michael Hanselmann

Nov 19, 2012, 10:59:48 AM
to ganeti...@googlegroups.com
Instead of initiating many SSH connections to copy files using “scp”, a
JSON structure is passed to a program running on the node to be added.
The design is similar to the one used for SSH setup.

Signed-off-by: Michael Hanselmann <han...@google.com>
---
doc/design-node-add.rst | 86 +++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 84 insertions(+), 2 deletions(-)

diff --git a/doc/design-node-add.rst b/doc/design-node-add.rst
index 7b5fe4c..f9e5306 100644
--- a/doc/design-node-add.rst
+++ b/doc/design-node-add.rst
@@ -20,10 +20,20 @@ requires a tight coupling and equality between nodes (e.g. paths to
files being the same). Most of the logic and error handling is also done
on the connecting machine.

+Once a node's SSH daemon has been configured, more than 25 files need to
+be copied using ``scp`` before the node daemon can be started. No
+verification is being done before files are copied. Once the node daemon
+is started, an opcode is submitted to the master daemon, which will then
+copy more files, such as the configuration and job queue for master
+candidates, using RPC. This process is somewhat fragile and requires
+initiating many SSH connections.

Proposed changes
----------------

+SSH
+~~~
+
The main goal is to move more logic to the newly added node. Instead of
having a relatively large script executed on the master node, most of it
is moved over to the added node.
@@ -42,10 +52,35 @@ SSH client and to drop the dependency on Paramiko for Ganeti itself

Eventually ``setup-ssh`` can be removed.

+
+Node daemon
+~~~~~~~~~~~
+
+Similar to SSH setup changes, the process of copying files and starting
+the node daemon will be moved into a dedicated program. On its standard
+input it will receive a standardized JSON structure (defined :ref:`below
+<node-daemon-setup-json>`). Once the input data has been successfully
+decoded and the received values were verified for sanity, the program
+proceeds to write the values to files and then starts the node daemon
+(``ganeti-noded``).
+
+To add a new node to the cluster, the master node will have to gather
+all values, build the data structure, and then invoke the newly added
+``node-daemon-setup`` program via SSH. In this way only a single SSH
+connection is needed and the values can be verified before being written
+to files.
+
+If the program exists successfully, the node is ready to be added to the
+master daemon's configuration.
+
+
+Data structures
+---------------
+
.. _prepare-node-join-json:

-JSON structure
-~~~~~~~~~~~~~~
+JSON structure for SSH setup
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The data is given in an object containing the keys described below.
Unless specified otherwise, all entries are optional.
@@ -78,6 +113,53 @@ and public part of the key. Example:
("dsa", "-----BEGIN DSA PRIVATE KEY-----...", "ssh-dss AAAA..."),
]

+
+.. _node-daemon-setup-json:
+
+JSON structure for node daemon setup
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. vim: set textwidth=72 :
.. Local Variables:
.. mode: rst
--
1.7.7.3
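
The rule that the process is aborted unless a locally found node certificate matches could look something like the sketch below. The function name, the ``JoinError`` exception and the comparison by stripped PEM text are all assumptions made for illustration; the design does not say how the match is determined.

```python
import os


class JoinError(Exception):
    """Raised when an existing file conflicts with the received value."""


def verify_or_write_certificate(path, pem):
    """Write pem to path unless a different certificate is already present.

    Returns True if the file was written, False if an identical
    certificate was already in place. Comparing stripped text is an
    assumption of this sketch, not something the design specifies.
    """
    if os.path.exists(path):
        with open(path, "r") as fh:
            existing = fh.read()
        if existing.strip() != pem.strip():
            raise JoinError("certificate at %s does not match" % path)
        return False

    # The PEM data includes the private part, so keep the file
    # readable by its owner only and refuse to overwrite anything.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(pem)
    return True
```

Checking before writing is what lets the program reject a node that already belongs to another cluster without touching its files.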

Iustin Pop

Nov 21, 2012, 10:43:28 AM
to Michael Hanselmann, ganeti...@googlegroups.com
On Mon, Nov 19, 2012 at 04:59:48PM +0100, Michael Hanselmann wrote:
> Instead of initiating many SSH connections to copy files using “scp”, a
> JSON structure is passed to a program running on the node to be added.
> The design is similar to the one used for SSH setup.

Thanks for sending this, I have a few questions below.
I wonder if all these files are needed for the node join. For example,
the spice parts, or the rapi parts, should not be needed for the join
itself; if we simply distribute them in the LU itself, via the regular
file redistribution, we'd simplify the (non-LU) join process.

I think the ssconf files (or part of them), cluster name, node daemon
cert are needed, but not the others.

thanks,
iustin

Michael Hanselmann

Nov 21, 2012, 6:48:28 PM
to Iustin Pop, ganeti...@googlegroups.com
2012/11/21 Iustin Pop <ius...@google.com>:
> On Mon, Nov 19, 2012 at 04:59:48PM +0100, Michael Hanselmann wrote:
> […]
>> +``confd_hmac_key``
>> + HMAC key for confd (due to a bug in the original confd implementation,
>> + this must end with a newline, e.g. ``abcdef\n``).
>
> I wonder if all these files are needed for the node join. For example,
> the spice parts, or the rapi parts, should not be needed for the join
> itself; if we simply distribute them in the LU itself, via the regular
> file redistribution, we'd simplify the (non-LU) join process.
>
> I think the ssconf files (or part of them), cluster name, node daemon
> cert are needed, but not the others.

I had to double-check, but I think you're right. I left the full set
of ssconf files, as otherwise we'd have to define somewhere which ones
need to be copied and that could lead to subtle bugs in the future.

Interdiff:

--- a/doc/design-node-add.rst
+++ b/doc/design-node-add.rst
@@ -131,16 +131,6 @@ Unless specified otherwise, all entries are optional.
Public and private part of cluster's node daemon certificate in PEM
format. If a local node certificate is found, the process is aborted
unless it matches.
-``rapi_daemon_certificate``
- Remote API certificate, see ``node_daemon_certificate``.
-``spice_certificate``
- SPICE server certificate for KVM, see ``node_daemon_certificate``.
-``spice_ca_certificate``
- SPICE server certificate authority (CA), see
- ``node_daemon_certificate``.
-``confd_hmac_key``
- HMAC key for confd (due to a bug in the original confd implementation,
- this must end with a newline, e.g. ``abcdef\n``).
``ssconf``
Dictionary with ssconf names and their values. Both are strings.
Example:

Michael
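
With the key set reduced as in the interdiff above, the master-side work of gathering the values and building the data structure might look like this sketch. The helper names, the ``root`` user and the assumption that ``node-daemon-setup`` is directly invocable on the new node are all hypothetical; only the JSON keys come from the design.

```python
import json


def build_payload(cluster_name, node_cert_pem, ssconf, start_node_daemon=True):
    """Serialize the values the new node needs before the LU can run."""
    values = dict(ssconf)
    # The design notes that the cluster name is also part of ssconf.
    values.setdefault("cluster_name", cluster_name)
    return json.dumps({
        "cluster_name": cluster_name,
        "node_daemon_certificate": node_cert_pem,
        "ssconf": values,
        "start_node_daemon": start_node_daemon,
    })


def build_ssh_command(node, program="node-daemon-setup", user="root"):
    """Return the argv for the single SSH connection (assumed layout)."""
    return ["ssh", "%s@%s" % (user, node), program]
```

The payload would then be fed to the remote program's standard input, for example with ``subprocess.run(cmd, input=payload.encode(), check=True)``, so that a non-zero exit status aborts the node add before ``OpNodeAdd`` is submitted.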

Iustin Pop

Nov 22, 2012, 4:50:07 AM
to Michael Hanselmann, ganeti...@googlegroups.com
On Thu, Nov 22, 2012 at 12:48:28AM +0100, Michael Hanselmann wrote:
> 2012/11/21 Iustin Pop <ius...@google.com>:
> > On Mon, Nov 19, 2012 at 04:59:48PM +0100, Michael Hanselmann wrote:
> > […]
> >> +``confd_hmac_key``
> >> + HMAC key for confd (due to a bug in the original confd implementation,
> >> + this must end with a newline, e.g. ``abcdef\n``).
> >
> > I wonder if all these files are needed for the node join. For example,
> > the spice parts, or the rapi parts, should not be needed for the join
> > itself; if we simply distribute them in the LU itself, via the regular
> > file redistribution, we'd simplify the (non-LU) join process.
> >
> > I think the ssconf files (or part of them), cluster name, node daemon
> > cert are needed, but not the others.
>
> I had to double-check, but I think you're right. I left the full set
> of ssconf files, as otherwise we'd have to define somewhere which ones
> need to be copied and that could lead to subtle bugs in the future.

Yes, that I didn't want to go into.

> Interdiff:
>
> --- a/doc/design-node-add.rst
> +++ b/doc/design-node-add.rst
> @@ -131,16 +131,6 @@ Unless specified otherwise, all entries are optional.
> Public and private part of cluster's node daemon certificate in PEM
> format. If a local node certificate is found, the process is aborted
> unless it matches.
> -``rapi_daemon_certificate``
> - Remote API certificate, see ``node_daemon_certificate``.
> -``spice_certificate``
> - SPICE server certificate for KVM, see ``node_daemon_certificate``.
> -``spice_ca_certificate``
> - SPICE server certificate authority (CA), see
> - ``node_daemon_certificate``.
> -``confd_hmac_key``
> - HMAC key for confd (due to a bug in the original confd implementation,
> - this must end with a newline, e.g. ``abcdef\n``).
> ``ssconf``
> Dictionary with ssconf names and their values. Both are strings.
> Example:

Mmm, interdiff is good, but please add a note that this doesn't copy all
nodes, and logically speaking the node will be fully joined only after
the LU is run (this step only prepares the node such that the LU can
run).

thanks,
iustin

Michael Hanselmann

Nov 22, 2012, 10:31:13 AM
to Iustin Pop, ganeti...@googlegroups.com
2012/11/22 Iustin Pop <ius...@google.com>:
> Mmm, interdiff is good, but please add a note that this doesn't copy all
> nodes, and logically speaking the node will be fully joined only after
> the LU is run (this step only prepares the node such that the LU can
> run).

Interdiff (fixes a typo in “exits”, too):

--- a/doc/design-node-add.rst
+++ b/doc/design-node-add.rst
@@ -70,8 +70,10 @@ all values, build the data structure, and then invoke the newly added
connection is needed and the values can be verified before being written
to files.

-If the program exists successfully, the node is ready to be added to the
-master daemon's configuration.
+If the program exits successfully, the node is ready to be added to the
+master daemon's configuration. The node daemon will be running, but
+``OpNodeAdd`` needs to be run before it becomes a full node. The opcode
+will copy more files, such as the :doc:`RAPI certificate <rapi>`.


Data structures

LGTY?

Michael

Iustin Pop

Nov 22, 2012, 10:47:58 AM
to Michael Hanselmann, ganeti...@googlegroups.com
Indeed, thanks a lot!

(Official LGTM here: ^W)

iustin