The option is added to allow us the passing of arbitrary ext-params
during disk addition (gnt-instance modify --disk add: ...), same as
we do during instance add.
The option is needed because during gnt-instance modify parameters'
validation, we do not know the type of the disk template
(whereas in gnt-instance add we find it from the -t option).
The option is called 'arbit' and not 'ext' params because it may be
useful in other contexts too, in the future.
The corresponding prereq checks for the existence of the 'provider'
parameter in case of an 'ext' template are also taken care off in
this commit. It serves as a general introduction of the 'ext'
template in gnt-instance modify.
# even if here we process the result, we allow submit only
result = SubmitOrSend(op, opts)
@@ -1608,7 +1609,8 @@ commands = {
[BACKEND_OPT, DISK_OPT, FORCE_OPT, HVOPTS_OPT, NET_OPT, SUBMIT_OPT,
DISK_TEMPLATE_OPT, SINGLE_NODE_OPT, OS_OPT, FORCE_VARIANT_OPT,
OSPARAMS_OPT, DRY_RUN_OPT, PRIORITY_OPT, NWSYNC_OPT, OFFLINE_INST_OPT,
- ONLINE_INST_OPT, IGNORE_IPOLICY_OPT, RUNTIME_MEM_OPT],
+ ONLINE_INST_OPT, IGNORE_IPOLICY_OPT, RUNTIME_MEM_OPT,
+ ALLOW_ARBITPARAMS_OPT],
"<instance>", "Alters the parameters of an instance"),
"shutdown": (
GenericManyOps("shutdown", _ShutdownInstance), [ArgInstance()],
diff --git a/lib/cmdlib.py b/lib/cmdlib.py
index 406ae2f..4554752 100644
--- a/lib/cmdlib.py
+++ b/lib/cmdlib.py
@@ -12241,7 +12241,10 @@ class LUInstanceSetParams(LogicalUnit):
for (op, _, params) in mods:
assert ht.TDict(params)
- utils.ForceDictType(params, key_types)
+ # If key_types is an empty dict, we assume we have an 'ext' template
+ # and thus do not ForceDictType
+ if key_types:
+ utils.ForceDictType(params, key_types)
if op == constants.DDM_REMOVE:
if params:
@@ -12277,9 +12280,18 @@ class LUInstanceSetParams(LogicalUnit):
params[constants.IDISK_SIZE] = size
- elif op == constants.DDM_MODIFY and constants.IDISK_SIZE in params:
- raise errors.OpPrereqError("Disk size change not possible, use"
- " grow-disk", errors.ECODE_INVAL)
+ elif op == constants.DDM_MODIFY:
+ if constants.IDISK_SIZE in params:
+ raise errors.OpPrereqError("Disk size change not possible, use"
+ " grow-disk", errors.ECODE_INVAL)
+ if constants.IDISK_MODE not in params:
+ raise errors.OpPrereqError("Disk 'mode' is the only kind of"
+ " modification supported, but missing",
+ errors.ECODE_NOENT)
+ if len(params) > 1:
+ raise errors.OpPrereqError("Disk modification doesn't support"
+ " additional arbitrary parameters",
+ errors.ECODE_INVAL)
@staticmethod
def _VerifyNicModification(op, params):
@@ -12330,14 +12342,26 @@ class LUInstanceSetParams(LogicalUnit):
if self.op.hvparams:
_CheckGlobalHvParams(self.op.hvparams)
if self.op.disks and self.op.disk_template is not None:
raise errors.OpPrereqError("Disk template conversion and other disk"
@@ -12491,6 +12515,33 @@ class LUInstanceSetParams(LogicalUnit):
self.diskmod = PrepareContainerMods(self.op.disks, None)
self.nicmod = PrepareContainerMods(self.op.nics, _InstNicModPrivate)
+ # Check the validity of the `provider' parameter
+ if instance.disk_template in constants.DT_EXT:
+ for mod in self.diskmod:
+ ext_provider = mod[2].get(constants.IDISK_PROVIDER, None)
+ if mod[0] == constants.DDM_ADD:
+ if ext_provider is None:
+ raise errors.OpPrereqError("Instance template is '%s' and parameter"
+ " '%s' missing, during disk add" %
+ (constants.DT_EXT,
+ constants.IDISK_PROVIDER),
+ errors.ECODE_NOENT)
+ elif mod[0] == constants.DDM_MODIFY:
+ if ext_provider:
+ raise errors.OpPrereqError("Parameter '%s' is invalid during disk"
+ " modification" %
+ constants.IDISK_PROVIDER,
+ errors.ECODE_INVAL)
+ else:
+ for mod in self.diskmod:
+ ext_provider = mod[2].get(constants.IDISK_PROVIDER, None)
+ if ext_provider is not None:
+ raise errors.OpPrereqError("Parameter '%s' is only valid for"
+ " instances of type '%s'" %
+ (constants.IDISK_PROVIDER,
+ constants.DT_EXT),
+ errors.ECODE_INVAL)
+
# OS change
if self.op.os_name and not self.op.force:
_CheckNodeHasOS(self, instance.primary_node, self.op.os_name,
diff --git a/lib/opcodes.py b/lib/opcodes.py
index 961192f..92e7798 100644
--- a/lib/opcodes.py
+++ b/lib/opcodes.py
@@ -1536,6 +1536,7 @@ class OpInstanceSetParams(OpCode):
"""
TestNicModifications = _TestInstSetParamsModList(_TestNicDef)
TestDiskModifications = _TestInstSetParamsModList(_TDiskParams)
+ TestExtDiskModifications = _TestInstSetParamsModList(_TExtDiskParams)
OP_DSC_FIELD = "instance_name"
OP_PARAMS = [
@@ -1569,9 +1570,63 @@ class OpInstanceSetParams(OpCode):
("wait_for_sync", True, ht.TBool,
"Whether to wait for the disk to synchronize, when changing template"),
("offline", None, ht.TMaybeBool, "Whether to mark instance as offline"),
+ ("allow_arbit_params", None, ht.TMaybeBool,
+ "Whether to allow the passing of arbitrary parameters to --disk(s)"),
]
OP_RESULT = _TSetParamsResult
+ def Validate(self, set_defaults):
+ """Validate opcode parameters, optionally setting default values.
+
+ @type set_defaults: bool
+ @param set_defaults: Whether to set default values
+ @raise errors.OpPrereqError: When a parameter value doesn't match
+ requirements
+
+ """
+ # pylint: disable=E1101
+ # as OP_ID has been dynamically defined
+
+ # Check if the template is DT_EXT
+ allow_arbitrary_params = False
+ for (attr_name, _, _, _) in self.GetAllParams():
+ if hasattr(self, attr_name):
+ if attr_name == "allow_arbit_params" and \
+ getattr(self, attr_name) == True:
+ allow_arbitrary_params = True
+
+ for
With this commit we introduce the External Storage Interface
to Ganeti, abbreviated: ExtStorage Interface.
The ExtStorage Interface provides Ganeti with the ability to interact
with externally connected shared storage pools, visible by all
VM-capable nodes. This means that Ganeti is able to handle VM disks
that reside inside a NAS/SAN or any distributed block storage provider.
The ExtStorage Interface provides a clear API, heavily inspired by the
gnt-os-interface API, that can be used by storage vendors or sysadmins
to write simple ExtStorage Providers (correlated to gnt-os-interface's
OS Definitions). Those Providers will glue externally attached shared
storage with Ganeti, without the need of preprovisioned block devices
on Ganeti VM-capable nodes as confined be the current `blockdev' disk
template.
To do so, we implement a new disk template called `ext' (of type
DTS_EXT_MIRROR) that passes control to externally provided scripts
(the ExtStorage Provider) for the template's basic functions:
create / attach / detach / remove / grow
The scripts reside under ES_SEARCH_PATH (correlated to OS_SEARCH_PATH)
and only one ExtStorage Provider is supported called `ext'.
The disk's logical id is the tuple ('ext', UUID.ext.diskX), where UUID
is generated as in disk template `plain' and X is the disk's index.
+# --with-extstorage-search-path=...
+# same black sed magic for quoting of the strings in the list
+AC_ARG_WITH([extstorage-search-path],
+ [AS_HELP_STRING([--with-extstorage-search-path=LIST],
+ [comma separated list of directories to]
+ [ search for External Storage Providers]
+ [ (default is /srv/ganeti/extstorage)]
+ )],
+ [es_search_path=`echo -n "$withval" | sed -e "s/\([[^,]]*\)/'\1'/g"`],
+ [es_search_path="'/srv/ganeti/extstorage'"])
+AC_SUBST(ES_SEARCH_PATH, $es_search_path)
+
# --with-iallocator-search-path=...
# do a bit of black sed magic to for quoting of the strings in the list
AC_ARG_WITH([iallocator-search-path],
diff --git a/lib/bdev.py b/lib/bdev.py
index d6c1a66..b0ed7c0 100644
--- a/lib/bdev.py
+++ b/lib/bdev.py
@@ -33,6 +33,7 @@ import logging
from ganeti import utils
from ganeti import errors
from ganeti import constants
+from ganeti import pathutils
from ganeti import objects
from ganeti import compat
from ganeti import netutils
@@ -2648,11 +2649,328 @@ class RADOSBlockDevice(BlockDev):
result.fail_reason, result.output)
+class ExtStorageDevice(BlockDev):
+ """A block device provided by an ExtStorage Provider.
+
+ This class implements the External Storage Interface, which means
+ handling of the externally provided block devices.
+
+ """
+ def __init__(self, unique_id, children, size, params):
+ """Attaches to an extstorage block device.
+
+ """
+ super(ExtStorageDevice, self).__init__(unique_id, children, size, params)
+ if not isinstance(unique_id, (tuple, list)) or len(unique_id) != 2:
+ raise ValueError("Invalid configuration data %s" % str(unique_id))
+
+ self.driver, self.vol_name = unique_id
+
+ self.major = self.minor = None
+ self.Attach()
+
+ @classmethod
+ def Create(cls, unique_id, children, size, params):
+ """Create a new extstorage device.
+
+ Provision a new volume using an extstorage provider, which will
+ then be mapped to a block device.
+
+ """
+ if not isinstance(unique_id, (tuple, list)) or len(unique_id) != 2:
+ raise errors.ProgrammerError("Invalid configuration data %s" %
+ str(unique_id))
+
+ # Call the External Storage's create script,
+ # to provision a new Volume inside the External Storage
+ _ExtStorageAction(constants.ES_ACTION_CREATE, unique_id, str(size))
+
+ return ExtStorageDevice(unique_id, children, size, params)
+
+ def Remove(self):
+ """Remove the extstorage device.
+
+ """
+ if not self.minor and not self.Attach():
+ # The extstorage device doesn't exist.
+ return
+
+ # First shutdown the device (remove mappings).
+ self.Shutdown()
+
+ # Call the External Storage's remove script,
+ # to remove the Volume from the External Storage
+ _ExtStorageAction(constants.ES_ACTION_REMOVE, self.unique_id)
+
+ def Rename(self, new_id):
+ """Rename this device.
+
+ """
+ pass
+
+ def Attach(self):
+ """Attach to an existing extstorage device.
+
+ This method maps the extstorage volume that matches our name with
+ a corresponding block device and then attaches to this device.
+
+ """
+ self.attached = False
+
+ # Call the External Storage's attach script,
+ # to attach an existing Volume to a block device under /dev
+ self.dev_path = _ExtStorageAction(constants.ES_ACTION_ATTACH,
+ self.unique_id)
+
+ try:
+ st = os.stat(self.dev_path)
+ except OSError, err:
+ logging.error("Error stat()'ing %s: %s", self.dev_path, str(err))
+ return False
+
+ if not stat.S_ISBLK(st.st_mode):
+ logging.error("%s is not a block device", self.dev_path)
+ return False
+
+ self.major = os.major(st.st_rdev)
+ self.minor = os.minor(st.st_rdev)
+ self.attached = True
+
+ return True
+
+ def Assemble(self):
+ """Assemble the device.
+
+ """
+ pass
+
+ def Shutdown(self):
+ """Shutdown the device.
+
+ """
+ if not self.minor and not self.Attach():
+ # The extstorage device doesn't exist.
+ return
+
+ # Call the External Storage's detach script,
+ # to detach an existing Volume from it's block device under /dev
+ _ExtStorageAction(constants.ES_ACTION_DETACH, self.unique_id)
+
+ self.minor = None
+ self.dev_path = None
+
+ def Open(self, force=False):
+ """Make the device ready for I/O.
+
+ """
+ pass
+
+ def Close(self):
+ """Notifies that the device will no longer be used for I/O.
+
+ """
+ pass
+
+ def Grow(self, amount, dryrun, backingstore):
+ """Grow the Volume.
+
+ @type amount: integer
+ @param amount: the amount (in mebibytes) to grow with
+ @type dryrun: boolean
+ @param dryrun: whether to execute the operation in simulation mode
+ only, without actually increasing the size
+
+ """
+ if not backingstore:
+ return
+ if not self.Attach():
+ _ThrowError("Can't attach to extstorage device during Grow()")
+
+ if dryrun:
+ # we do not support dry runs of resize operations for now.
+ return
+
+ new_size = self.size + amount
+
+ # Call the External Storage's grow script,
+ # to grow an existing Volume inside the External Storage
+ _ExtStorageAction(constants.ES_ACTION_GROW, self.unique_id,
+ str(self.size), grow=str(new_size))
+
+
+def _ExtStorageAction(action, unique_id, size=None, grow=None):
+ """Take an External Storage action.
+
+ Take an External Storage action concerning or affecting
+ a specific Volume inside the External Storage.
+
+ @type action: string
+ @param action: which action to perform. One of:
+ create / remove / grow / attach / detach
+ @type unique_id: tuple (driver, vol_name)
+ @param unique_id: a tuple containing the type of ExtStorage (driver)
+ and the Volume name
+ @type size: integer
+ @param size: the size of the Volume in mebibytes
+ @type grow: integer
+ @param grow: the new size in mebibytes (after grow)
+ @rtype: None or a block device path (during attach)
+
+ """
+ driver, vol_name = unique_id
+
+ # Create an External Storage instance of type `driver'
+ status, inst_es = ExtStorageFromDisk(driver)
+ if not status:
+ _ThrowError("%s" % inst_es)
+
+ # Create the basic environment for the driver's scripts
+ create_env = _ExtStorageEnvironment(unique_id, size, grow)
+
+ # Do not use log file for action `attach' as we need
+ # to get the output from RunResult
+ # TODO: find a way to have a log file for attach too
+ logfile = None
+ if action is not constants.ES_ACTION_ATTACH:
+ logfile = _VolumeLogName(action, driver, vol_name)
+
+ # Find out which external script to run according the given action
+ script_name = action + "_script"
+ script = getattr(inst_es, script_name)
+
+ # Run the external script
+ result = utils.RunCmd([script], env=create_env,
+ cwd=inst_es.path, output=logfile,)
+ if result.failed:
+ logging.error("External storage's %s command '%s' returned"
+
This is the ExtStorage Interface patch series rebased on top of
master branch's commit 0e1e0b6. It passes all current lint/unit
tests.
The first commit (design doc) has been already reviewed by Iustin
and the only question was about separate providers between disks,
which I think has been answered, after the discussion concerning
Storage Pools.
I'll be also sending a sample ExtStorage provider, which you can
use to test the patch series, without the need of setting up a
NAS and writing custom scripts.
Add support for passing parameters to the ext template (ext-params).
Take advantage of disk-params, that don't seem to make much sense in
this template (ExtStorage Providers are not predefined and we don't
know their needs) and use them to pass the ext-params dynamically to
the template.
ext-params are correlated with gnt-os-interface's os-params.
All ext-params are exported to the ExtStorage Provider through it's
environment, with variables prefixed with 'EXTP_' (similarly to the
OS interface's 'OSP_' params).
ext-params are passed through the --disk option. If the disk template
is of type `ext' during instance add, then any additional options that
are not in IDISK_PARAMS given to --disk are considered ext-params
e.g.:
Finally, we introduce a new IDISK_PARAM called IDISK_PROVIDER, that is
mandatory for template `ext' and is used to select the desired
ExtStorage Provider. This parameter is not valid for other template
types.
The IDISK_PROVIDER parameter becomes the first element of the
disk's unique_id tuple e.g.:
@@ -2700,7 +2703,8 @@ class ExtStorageDevice(BlockDev):
# Call the External Storage's remove script,
# to remove the Volume from the External Storage
- _ExtStorageAction(constants.ES_ACTION_REMOVE, self.unique_id)
+ _ExtStorageAction(constants.ES_ACTION_REMOVE, self.unique_id,
+ self.ext_params)
def Rename(self, new_id):
"""Rename this device.
@@ -2720,7 +2724,7 @@ class ExtStorageDevice(BlockDev):
# Call the External Storage's attach script,
# to attach an existing Volume to a block device under /dev
self.dev_path = _ExtStorageAction(constants.ES_ACTION_ATTACH,
- self.unique_id)
+ self.unique_id, self.ext_params)
try:
st = os.stat(self.dev_path)
@@ -2754,7 +2758,8 @@ class ExtStorageDevice(BlockDev):
# Call the External Storage's detach script,
# to detach an existing Volume from it's block device under /dev
- _ExtStorageAction(constants.ES_ACTION_DETACH, self.unique_id)
+ _ExtStorageAction(constants.ES_ACTION_DETACH, self.unique_id,
+ self.ext_params)
self.minor = None
self.dev_path = None
@@ -2795,10 +2800,10 @@ class ExtStorageDevice(BlockDev):
# Call the External Storage's grow script,
# to grow an existing Volume inside the External Storage
_ExtStorageAction(constants.ES_ACTION_GROW, self.unique_id,
- str(self.size), grow=str(new_size))
+ self.ext_params, str(self.size), grow=str(new_size))
Take an External Storage action concerning or affecting
@@ -2810,6 +2815,8 @@ def _ExtStorageAction(action, unique_id, size=None, grow=None):
@type unique_id: tuple (driver, vol_name)
@param unique_id: a tuple containing the type of ExtStorage (driver)
and the Volume name
+ @type ext_params: dict
+ @type ext_params: ExtStorage parameters
@type size: integer
@param size: the size of the Volume in mebibytes
@type grow: integer
@@ -2825,7 +2832,7 @@ def _ExtStorageAction(action, unique_id, size=None, grow=None):
_ThrowError("%s" % inst_es)
# Create the basic environment for the driver's scripts
- create_env = _ExtStorageEnvironment(unique_id, size, grow)
+ create_env = _ExtStorageEnvironment(unique_id, ext_params, size, grow)
# Do not use log file for action `attach' as we need
# to get the output from RunResult
@@ -2891,7 +2898,9 @@ def ExtStorageFromDisk(name, base_dir=None):
# an optional one
es_files = dict.fromkeys(constants.ES_SCRIPTS, True)
- for filename in es_files:
+ es_files[constants.ES_PARAMETERS_FILE] = True
+
+ for (filename, _) in es_files.items():
es_files[filename] = utils.PathJoin(es_dir, filename)
try:
@@ -2909,21 +2918,35 @@ def ExtStorageFromDisk(name, base_dir=None):
return False, ("File '%s' under path '%s' is not executable" %
(filename, es_dir))
+ parameters = []
+ if constants.ES_PARAMETERS_FILE in es_files:
+ parameters_file = es_files[constants.ES_PARAMETERS_FILE]
+ try:
+ parameters = utils.ReadFile(parameters_file).splitlines()
+ except EnvironmentError, err:
+ return False, ("Error while reading the EXT parameters file at %s: %s" %
+ (parameters_file, utils.ErrnoOrStr(err)))
+ parameters = [v.split(None, 1) for v in parameters]
+
es_obj = \
objects.ExtStorage(name=name, path=es_dir,
create_script=es_files[constants.ES_SCRIPT_CREATE],
remove_script=es_files[constants.ES_SCRIPT_REMOVE],
grow_script=es_files[constants.ES_SCRIPT_GROW],
attach_script=es_files[constants.ES_SCRIPT_ATTACH],
- detach_script=es_files[constants.ES_SCRIPT_DETACH])
+ detach_script=es_files[constants.ES_SCRIPT_DETACH],
+ verify_script=es_files[constants.ES_SCRIPT_VERIFY],
+ supported_parameters=parameters)
return True, es_obj
-def _ExtStorageEnvironment(unique_id, size=None, grow=None):
+def _ExtStorageEnvironment(unique_id, ext_params, size=None, grow=None):
"""Calculate the environment for an External Storage script.
@type unique_id: tuple (driver, vol_name)
@param unique_id: ExtStorage pool and name of the Volume
+ @type ext_params: dict
+ @param ext_params: the EXT parameters
@type size: integer
@param size: size of the Volume in mebibytes
@rtype: dict
@@ -2935,6 +2958,10 @@ def _ExtStorageEnvironment(unique_id, size=None, grow=None):
result = {}
result['VOL_NAME'] = vol_name
+ # EXT params
+ for pname, pvalue in ext_params.items():
+ result["EXTP_%s" % pname.upper()] = str(pvalue)
+
if size is not None:
result['VOL_SIZE'] = size
diff --git a/lib/cmdlib.py b/lib/cmdlib.py
index 90b2117..406ae2f 100644
--- a/lib/cmdlib.py
+++ b/lib/cmdlib.py
@@ -7071,6 +7071,7 @@ class LUInstanceRecreateDisks(LogicalUnit):
# TODO: Implement support changing VG while recreating
constants.IDISK_VG,
constants.IDISK_METAVG,
+ constants.IDISK_PROVIDER,
]))
Ganeti daemons: **ganeti-watcher**(8) (automatic instance restarter),
**ganeti-cleaner**(8) (job queue cleaner), **ganeti-noded**(8) (node
diff --git a/man/ganeti-extstorage-interface.rst b/man/ganeti-extstorage-interface.rst
new file mode 100644
index 0000000..7266ee4
--- /dev/null
+++ b/man/ganeti-extstorage-interface.rst
@@ -0,0 +1,212 @@
+ganeti-extstorage-interface(7) Ganeti | Version @GANETI_VERSION@
+================================================================
+
+Name
+----
+
+ganeti-extstorage-interface - Specifications for ExtStorage providers
+
+DESCRIPTION
+-----------
+
+The method for supporting external shared storage in Ganeti is to have
+an ExtStorage provider for each external shared storage hardware type.
+The ExtStorage provider is a set of files (executable scripts and text
+files), contained inside a directory which is named after the provider.
+This directory must be present across all nodes of a nodegroup (Ganeti
+doesn't replicate it), in order for the provider to be usable by Ganeti
+for this nodegroup (valid). The external shared storage hardware should
+also be accessible by all nodes of this nodegroup too.
+
+REFERENCE
+---------
+
+There are seven required files: *create*, *attach*, *detach*, *remove*,
+*grow*, *verify* (executables) and *parameters.list* (text file).
+
+Common environment
+~~~~~~~~~~~~~~~~~~
+
+All commands will get their input via environment variables. A common
+set of variables will be exported for all commands, and some of them
+might have extra ones. Note that all counts are zero-based.
+
+Since Ganeti version 2.5, the environment will be cleaned up before
+being passed to scripts, therefore they will not inherit the environment
+in with which the ganeti node daemon was started. If you depend on any
+environment variables (non-Ganeti), then you will need to define or
+source them appropriately.
+
+VOL_NAME
+ The name of the volume. This is unique for Ganeti and it uses it
+ to refer to a specific volume inside the external storage. Its + format is ``UUID.ext.diskX`` where ``UUID`` is produced by Ganeti
+ and is unique inside the Ganeti context. ``X`` is the number of the
+ disk count.
+
+VOL_SIZE
+ The volume's size in mebibytes.
+
+VOL_NEW_SIZE
+ Available only to the **grow** script. It declares the new size of
+ the volume after grow (in mebibytes). To find the amount of grow,
+ the scipt should calculate the number VOL_NEW_SIZE - VOL_SIZE.
+
+EXTP_*name*
+ Each ExtStorage parameter (see below) will be exported in its own
+ variable, prefixed with ``EXTP_``, and upper-cased. For example, a
+ ``fromsnap`` parameter will be exported as ``EXTP_FROMSNAP``.
+
+EXECUTABLE SCRIPTS
+------------------
+
+
+create
+~~~~~~
+
+The **create** command is used for creating a new volume inside the
+external storage. The ``VOL_NAME`` denotes the volume's name, which
+should be unique. After creation, Ganeti will refer to this volume by
+this name for all other actions.
+
+Ganeti produces this name dynamically and ensures its uniqueness inside
+the Ganeti context. Therefore, you should make sure not to provision
+manually additional volumes inside the external storage with this type
+of name, because this will lead to conflicts and possible loss of data.
+
+The ``VOL_SIZE`` variable denotes the size of the new volume to be
+created in mebibytes.
+
+If the script ends successfully, a new volume of size ``VOL_SIZE``
+should exist inside the external storage. e.g:: a lun inside a NAS
+appliance.
+
+The script returns ``0`` on success.
+
+attach
+~~~~~~
+
+This command is used in order to make an already created volume visible
+to the physical node which will host the instance. This is done by
+mapping the already provisioned volume to a block device inside the host
+node.
+
+The ``VOL_NAME`` variable denotes the volume to be mapped.
+
+After successful attachment the script returns to its stdout a string,
+which is the full path of the block device to which the volume is
+mapped. e.g:: /dev/dummy1
+
+When attach returns, this path should be a valid block device on the
+host node.
+
+The attach script should be idempotent if the volume is already mapped.
+If the requested volume is already mapped, then the script should just
+return to its stdout the path which is already mapped to.
+
+detach
+~~~~~~
+
+This command is used in order to unmap an already mapped volume from the
+host node. Detach undoes everything attach did. This is done by
+unmapping the requested volume from the block device it is mapped to.
+
+The ``VOL_NAME`` variable denotes the volume to be unmapped.
+
+``detach`` doesn't affect the volume itself. It just unmaps it from the
+host node. The volume continues to exist inside the external storage.
+It's just not accessible by the node anymore. This script doesn't return
+anything to its stdout.
+
+The detach script should be idempotent if the volume is already
+unmapped. If the volume is not mapped, the script doesn't perform any
+action at all.
+
+The script returns ``0`` on success.
+
+remove
+~~~~~~
+
+This command is used to remove an existing volume from the external
+storage. The volume is permanently removed from inside the external
+storage along with all its data.
+
+The ``VOL_NAME`` variable denotes the volume to be removed.
+
+The script returns ``0`` on success.
+
+grow
+~~~~
+
+This command is used to grow an existing volume of the external storage.
+
+The ``VOL_NAME`` variable denotes the volume to grow.
+
+The ``VOL_SIZE`` variable denotes the current volume's size (in
+mebibytes). The ``VOL_NEW_SIZE`` variable denotes the final size after
+the volume has been grown (in mebibytes).
+
+The amount of grow can be easily calculated by the scipt and is:
+
+grow_amount = VOL_NEW_SIZE - VOL_SIZE (in mebibytes)
+
+Ganeti ensures that: ``VOL_NEW_SIZE`` > ``VOL_SIZE``
+
+If the script returns successfully, then the volume inside the external
+storage will have a new size of ``VOL_NEW_SIZE``. This isn't immediately
+reflected to the instance's disk. See ``gnt-instance grow`` for more
+details on when the running instance becomes aware of its grown disk.
+
+The script returns ``0`` on success.
+
+verify
+~~~~~~
+
+The *verify* script is used to verify consistency of the external
+parameters (ext-params) (see below). The command should take one or more
+arguments denoting what checks should be performed, and return a proper
+exit code depending on whether the validation failed or succeeded.
+
+Currently, the script is not invoked by Ganeti, but should be present
+for future use and consistency with gnt-os-interface's verify script.
+
+The script should return ``0`` on success.
+
+TEXT FILES
+----------
+
+
+parameters.list
+~~~~~~~~~~~~~~~
+
+This file declares the parameters supported by the ExtStorage provider,
+one parameter per line, with name and description (space and/or tab
+separated). For example::
+
+ fromsnap Snapshot name to create the volume from
+ nas_ip The IP of the NAS appliance +
+The parameters can then be used during instance add as follows::
+
+ # gnt-instance add --disk=0:fromsnap="file_name",nas_ip="1.2.3.4" ...
+
+NOTES
+-----
+
+Backwards compatibility
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The ExtStorage Interface was introduced in Ganeti 2.6.
+Ganeti 2.6 and up is compatible with the ExtStorage Interface.
+
+Common behaviour
+~~~~~~~~~~~~~~~~
+
+All the scripts should display
diff --git a/doc/design-shared-storage.rst b/doc/design-shared-storage.rst
index c175476..7080182 100644
--- a/doc/design-shared-storage.rst
+++ b/doc/design-shared-storage.rst
@@ -64,15 +64,11 @@ The design addresses the following procedures:
filesystems.
- Introduction of shared block device disk template with device
adoption.
+- Introduction of an External Storage Interface.
Additionally, mid- to long-term goals include:
- Support for external “storage pools”.
-- Introduction of an interface for communicating with external scripts,
- providing methods for the various stages of a block device's and
- instance's life-cycle. In order to provide storage provisioning
- capabilities for various SAN appliances, external helpers in the form
- of a “storage driver” will be possibly introduced as well.
Refactoring of all code referring to constants.DTS_NET_MIRROR
=============================================================
@@ -159,6 +155,104 @@ The shared block device template will make the following assumptions:
- The device will be available with the same path under all nodes in the
node group.
+Introduction of an External Storage Interface
+==============================================
+Overview
+--------
+
+To extend the shared block storage template and give Ganeti the ability
+to control and manipulate external storage (provisioning, removal,
+growing, etc.) we need a more generic approach. The generic method for
+supporting external shared storage in Ganeti will be to have an
+ExtStorage provider for each external shared storage hardware type. The
+ExtStorage provider will be a set of files (executable scripts and text
+files), contained inside a directory which will be named after the
+provider. This directory must be present across all nodes of a nodegroup
+(Ganeti doesn't replicate it), in order for the provider to be usable by
+Ganeti for this nodegroup (valid). The external shared storage hardware
+should also be accessible by all nodes of this nodegroup too.
+
+An “ExtStorage provider” will have to provide the following methods:
+
+- Create a disk
+- Remove a disk
+- Grow a disk
+- Attach a disk to a given node
+- Detach a disk from a given node
+- Verify its supported parameters
+
+The proposed ExtStorage interface borrows heavily from the OS
+interface and follows a one-script-per-function approach. An ExtStorage
+provider is expected to provide the following scripts:
+
+- `create`
+- `remove`
+- `grow`
+- `attach`
+- `detach`
+- `verify`
+
+All scripts will be called with no arguments and get their input via
+environment variables. A common set of variables will be exported for
+all commands, and some of them might have extra ones.
+
+- `VOL_NAME`: The name of the volume. This is unique for Ganeti and it
+ uses it to refer to a specific volume inside the external storage.
+- `VOL_SIZE`: The volume's size in mebibytes.
+- `VOL_NEW_SIZE`: Available only to the `grow` script. It declares the
+ new size of the volume after grow (in mebibytes).
+- `EXTP_name`: ExtStorage parameter, where `name` is the parameter in
+ upper-case (same as OS interface's `OSP_*` parameters).
+
+All scripts except `attach` should return 0 on success and non-zero on
+error, accompanied by an appropriate error message on stderr. The
+`attach` script should return a string on stdout on success, which is
+the block device's full path, after it has been successfully attached to
+the host node. On error it should return non-zero.
+
+Implementation
+--------------
+
+To support the ExtStorage interface, we will introduce a new disk
+template called `ext`. This template will implement the existing Ganeti
+disk interface in `lib/bdev.py` (create, remove, attach, assemble,
+shutdown, grow), and will simultaneously pass control to the external
+scripts to actually handle the above actions. The `ext` disk template
+will act as a translation layer between the current Ganeti disk
+interface and the ExtStorage providers.
+
+We will also introduce a new IDISK_PARAM called `IDISK_PROVIDER =
+provider`, which will be used at the command line to select the desired
+ExtStorage provider. This parameter will be valid only for template
+`ext` e.g.::
+
+ gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1
+
+The Extstorage interface will support different disks to be created by
+different providers. e.g.::
+
+ gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1
+ --disk=1:size=1G,provider=sample_provider2
+ --disk=2:size=3G,provider=sample_provider1
+
+Finally, the ExtStorage interface will support passing of parameters to
+the ExtStorage provider. This will also be done per disk, from the
+command line::
+
+ gnt-instance add -t ext --disk=0:size=1G,provider=sample_provider1,
+ param1=value1,param2=value2
+
+The above parameters will be exported to the ExtStorage provider's
+scripts as the enviromental variables:
+
+- `EXTP_PARAM1 = str(value1)`
+- `EXTP_PARAM2 = str(value2)`
+
+We will also introduce a new Ganeti client called `gnt-storage` which
+will be used to diagnose ExtStorage providers and show information about
+them, similarly to the way `gnt-os diagose` and `gnt-os info` handle OS
+definitions.
+
Long-term shared storage goals
==============================
Storage pool handling
@@ -166,7 +260,7 @@ Storage pool handling
A new cluster configuration attribute will be introduced, named
“storage_pools”, modeled as a dictionary mapping storage pools to
-external storage drivers (see below), e.g.::
+external storage providers (see below), e.g.::
{
"nas1": "foostore",
@@ -188,93 +282,19 @@ Furthermore, the storage pools will be used to indicate the availability
of storage pools to different node groups, thus specifying the
instances' “mobility domain”.
-New disk templates will also be necessary to facilitate the use of external
-storage. The proposed addition is a whole template namespace created by
-prefixing the pool names with a fixed string, e.g. “ext:”, forming names
-like “ext:nas1”, “ext:foo”.
-
-Interface to the external storage drivers
------------------------------------------
-
-In addition to external storage pools, a new interface will be
-introduced to allow external scripts to provision and manipulate shared
-storage.
-
-In order to provide storage provisioning and manipulation (e.g. growing,
-renaming) capabilities, each instance's disk template can possibly be
-associated with an external “storage driver” which, based on the
-instance's configuration and tags, will perform all supported storage
-operations using auxiliary means (e.g. XML-RPC, ssh, etc.).
-
-A “storage driver” will have to provide the following methods:
-
-- Create a disk
-- Remove a disk
-- Rename a disk
-- Resize a disk
-- Attach a disk to a given node
-- Detach a disk from a given node
-
-The proposed storage driver architecture borrows heavily from the OS
-interface and follows a one-script-per-function approach. A storage
-driver is expected to provide the following scripts:
-
-- `create`
-- `resize`
-- `rename`
-- `remove`
-- `attach`
-- `detach`
-
-These executables will be called once for each disk with no arguments
-and all required information will be passed through environment
-variables. The following environment variables will always be present on
-each invocation:
-
-- `INSTANCE_NAME`: The instance's name
-- `INSTANCE_UUID`: The instance's UUID
-- `INSTANCE_TAGS`: The instance's tags
-- `DISK_INDEX`: The current disk index.
-- `LOGICAL_ID`: The disk's logical id (if existing)
-- `POOL`: The storage pool the instance belongs to.
-
-Additional variables may be available in a per-script context (see
-below).
-
-Of particular importance is the disk's logical ID, which will act as
-glue between Ganeti and the external storage drivers; there are two
-possible ways of using a disk's logical ID in a storage driver:
-
-1. Simply use it as a unique identifier (e.g. UUID) and keep a separate,
- external database linking it to the actual storage.
-2. Encode all useful storage information in the logical ID and have the
- driver decode it at runtime.
-
-All scripts should return 0 on success and non-zero on error accompanied by
-an appropriate error message on stderr. Furthermore, the following
-special cases are defined:
-
-1. `create` In case of success, a string representing the disk's logical
- id must be returned on stdout, which will be saved in the instance's
- configuration and can be later used by the other scripts of the same
- storage driver. The logical id may be based on instance name,
- instance uuid and/or disk index.
-
- Additional environment variables present:
- - `DISK_SIZE`: The requested disk size in MiB
-
-2. `resize` In case of success, output the new disk size.
-
- Additional environment variables present:
- - `DISK_SIZE`: The requested disk size in MiB
-
-3. `rename` On success, a new logical id should be returned, which will
- replace the old one. This script is meant to rename the instance's
- backing store and update the disk's logical ID in case one of them is
- bound to the instance name.
+The pool, in which to put the new instance's disk, will be defined at
+the command line during `instance add`. This will become possible by
+replacing the IDISK_PROVIDER
Add a new client called 'gnt-storage'.
The client interacts with the ExtStorage interface, similarly to
the way gnt-os interacts with the OS interface.
For now, only two commands are supported: 'info' and 'diagnose'.
'diagnose' calculates the node status of each provider on each node,
similarly to gnt-os diagnose. Furthermore, for every provider, it
calculates it's nodegroup validity for each nodegroup. This is done
inside the LU and not the client (marked as 'TODO' for the global
validity of gnt-os diagnose).
In the future, gnt-storage can be used to manage storage pools,
or even be extended to diagnose other storage types supported by
Ganeti, such as lvm, drbd (INT_MIRROR) or rbd (EXT_MIRROR).
+def DiagnoseExtStorage(top_dirs=None):
+ """Compute the validity for all ExtStorage Providers.
+
+ @type top_dirs: list
+ @param top_dirs: the list of directories in which to
+ search (if not given defaults to
+ L{pathutils.ES_SEARCH_PATH})
+ @rtype: list of L{objects.ExtStorage}
+ @return: a list of tuples (name, path, status, diagnose, parameters)
+ for all (potential) ExtStorage Providers under all
+ search paths, where:
+ - name is the (potential) ExtStorage Provider
+ - path is the full path to the ExtStorage Provider
+ - status True/False is the validity of the ExtStorage Provider
+ - diagnose is the error message for an invalid ExtStorage Provider,
+ otherwise empty
+ - parameters is a list of (name, help) parameters, if any
+
+ """
+ if top_dirs is None:
+ top_dirs = pathutils.ES_SEARCH_PATH
+
+ result = []
+ for dir_name in top_dirs:
+ if os.path.isdir(dir_name):
+ try:
+ f_names = utils.ListVisibleFiles(dir_name)
+ except EnvironmentError, err:
+ logging.exception("Can't list the ExtStorage directory %s: %s",
+ dir_name, err)
+ break
+ for name in f_names:
+ es_path = utils.PathJoin(dir_name, name)
+ status, es_inst = bdev.ExtStorageFromDisk(name, base_dir=dir_name)
+ if status:
+ diagnose = ""
+ parameters = es_inst.supported_parameters
+ else:
+ diagnose = es_inst
+ parameters = []
+ result.append((name, es_path, status, diagnose, parameters))
+
+ return result
+
+
def BlockdevGrow(disk, amount, dryrun, backingstore):
"""Grow a stack of block devices.
OPT_COMPL_ALL = frozenset([
OPT_COMPL_MANY_NODES,
OPT_COMPL_ONE_NODE,
OPT_COMPL_ONE_INSTANCE,
OPT_COMPL_ONE_OS,
+ OPT_COMPL_ONE_EXTSTORAGE,
OPT_COMPL_ONE_IALLOCATOR,
OPT_COMPL_INST_ADD_NODES,
OPT_COMPL_ONE_NODEGROUP,
diff --git a/lib/client/gnt_storage.py b/lib/client/gnt_storage.py
new file mode 100644
index 0000000..09ac14e
--- /dev/null
+++ b/lib/client/gnt_storage.py
@@ -0,0 +1,196 @@
+#
+#
+
+# Copyright (C) 2012 Google Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+# 02110-1301, USA.
+
+"""External Storage related commands"""
+
+# pylint: disable=W0401,W0613,W0614,C0103
+# W0401: Wildcard import ganeti.cli
+# W0613: Unused argument, since all functions follow the same API
+# W0614: Unused import %s from wildcard import (since we need cli)
+# C0103: Invalid name gnt-storage
+
+from ganeti.cli import *
+from ganeti import opcodes
+from ganeti import utils
+
+
+def ShowExtStorageInfo(opts, args):
+ """List detailed information about ExtStorage providers.
+
+ @param opts: the command line options selected by the user
+ @type args: list
+ @param args: empty list or list of ExtStorage providers' names
+ @rtype: int
+ @return: the desired exit code
+
+ """
+ op = opcodes.OpExtStorageDiagnose(output_fields=["name", "nodegroup_status",
+ "parameters"],
+ names=[])
+
+ result = SubmitOpCode(op, opts=opts)
+
+ if not result:
+ ToStderr("Can't get the ExtStorage providers list")
+ return 1
+
+ do_filter = bool(args)
+
+ for (name, nodegroup_data, parameters) in result:
+ if do_filter:
+ if name not in args:
+ continue
+ else:
+ args.remove(name)
+
+ nodegroups_valid = []
+ for nodegroup_name, nodegroup_status in nodegroup_data.iteritems():
+ if nodegroup_status:
+ nodegroups_valid.append(nodegroup_name)
+
+ ToStdout("%s:", name)
+
+ if nodegroups_valid != []:
+ ToStdout(" - Valid for nodegroups:")
+ for ndgrp in utils.NiceSort(nodegroups_valid):
+ ToStdout(" %s", ndgrp)
+ ToStdout(" - Supported parameters:")
+ for pname, pdesc in parameters:
+ ToStdout(" %s: %s", pname, pdesc)
+ else:
+ ToStdout(" - Invalid for all nodegroups")
+
+ ToStdout("")
+
+ if args:
+ for name in args:
+ ToStdout("%s: Not Found", name)
+ ToStdout("")
+
+ return 0
+
+
+def _ExtStorageStatus(status, diagnose):
+ """Beautifier function for ExtStorage status.
+
+ @type status: boolean
+ @param status: is the ExtStorage provider valid
+ @type diagnose: string
+ @param diagnose: the error message for invalid ExtStorages
+ @rtype: string
+ @return: a formatted status
+
+ """
+ if status:
+ return "valid"
+ else:
+ return "invalid - %s" % diagnose
+
+
+def DiagnoseExtStorage(opts, args):
+ """Analyse all ExtStorage providers.
+
+ @param opts: the command line options selected by the user
+ @type args:
On Wed, Sep 26, 2012 at 05:38:17PM +0300, Constantinos Venetsanopoulos wrote:
> Update the shared storage design document to reflect the current
> changes, after the implementation of the ExtStorage interface.
> diff --git a/doc/design-shared-storage.rst b/doc/design-shared-storage.rst
> index c175476..7080182 100644
> --- a/doc/design-shared-storage.rst
> +++ b/doc/design-shared-storage.rst
> @@ -64,15 +64,11 @@ The design addresses the following procedures:
> filesystems.
> - Introduction of shared block device disk template with device
> adoption.
> +- Introduction of an External Storage Interface.
> Additionally, mid- to long-term goals include:
> - Support for external “storage pools”.
> -- Introduction of an interface for communicating with external scripts,
> - providing methods for the various stages of a block device's and
> - instance's life-cycle. In order to provide storage provisioning
> - capabilities for various SAN appliances, external helpers in the form
> - of a “storage driver” will be possibly introduced as well.
> Refactoring of all code referring to constants.DTS_NET_MIRROR
> =============================================================
> @@ -159,6 +155,104 @@ The shared block device template will make the following assumptions:
> - The device will be available with the same path under all nodes in the
> node group.
> +Introduction of an External Storage Interface
> +==============================================
> +Overview
> +--------
> +
> +To extend the shared block storage template and give Ganeti the ability
> +to control and manipulate external storage (provisioning, removal,
> +growing, etc.) we need a more generic approach. The generic method for
> +supporting external shared storage in Ganeti will be to have an
> +ExtStorage provider for each external shared storage hardware type. The
> +ExtStorage provider will be a set of files (executable scripts and text
> +files), contained inside a directory which will be named after the
> +provider. This directory must be present across all nodes of a nodegroup
> +(Ganeti doesn't replicate it), in order for the provider to be usable by
> +Ganeti for this nodegroup (valid).
How will Ganeti behave if they are not consistent? Report errors? (in
cluster verify?) Ignore the provider? Etc.
> The external shared storage hardware
> +should also be accessible by all nodes of this nodegroup too.
> +
> +An “ExtStorage provider” will have to provide the following methods:
> +
> +- Create a disk
> +- Remove a disk
> +- Grow a disk
> +- Attach a disk to a given node
> +- Detach a disk from a given node
> +- Verify its supported parameters
> +
> +The proposed ExtStorage interface borrows heavily from the OS
> +interface and follows a one-script-per-function approach. An ExtStorage
> +provider is expected to provide the following scripts:
> +
> +- `create`
> +- `remove`
> +- `grow`
> +- `attach`
> +- `detach`
> +- `verify`
> +
> +All scripts will be called with no arguments and get their input via
> +environment variables. A common set of variables will be exported for
> +all commands, and some of them might have extra ones.
> +
> +- `VOL_NAME`: The name of the volume. This is unique for Ganeti and it
> + uses it to refer to a specific volume inside the external storage.
> +- `VOL_SIZE`: The volume's size in mebibytes.
> +- `VOL_NEW_SIZE`: Available only to the `grow` script. It declares the
> + new size of the volume after grow (in mebibytes).
> +- `EXTP_name`: ExtStorage parameter, where `name` is the parameter in
> + upper-case (same as OS interface's `OSP_*` parameters).
> +
> +All scripts except `attach` should return 0 on success and non-zero on
> +error, accompanied by an appropriate error message on stderr. The
> +`attach` script should return a string on stdout on success, which is
> +the block device's full path, after it has been successfully attached to
> +the host node. On error it should return non-zero.
> +
> +Implementation
> +--------------
> +
> +To support the ExtStorage interface, we will introduce a new disk
> +template called `ext`. This template will implement the existing Ganeti
> +disk interface in `lib/bdev.py` (create, remove, attach, assemble,
> +shutdown, grow), and will simultaneously pass control to the external
> +scripts to actually handle the above actions. The `ext` disk template
> +will act as a translation layer between the current Ganeti disk
> +interface and the ExtStorage providers.
> +
> +We will also introduce a new IDISK_PARAM called `IDISK_PROVIDER =
> +provider`, which will be used at the command line to select the desired
> +ExtStorage provider. This parameter will be valid only for template
> +`ext` e.g.::
> +
> + gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1
> +
> +The Extstorage interface will support different disks to be created by
> +different providers. e.g.::
> +
> + gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1
> + --disk=1:size=1G,provider=sample_provider2
> + --disk=2:size=3G,provider=sample_provider1
This (also in the context of your other design changes) makes me a bit
uneasy, with regards to coordinating changes across multiple providers
in live migration and similar changes (even startup). Have you thought
about this?
> +Finally, the ExtStorage interface will support passing of parameters to
> +the ExtStorage provider. This will also be done per disk, from the
> +command line::
> +
> + gnt-instance add -t ext --disk=0:size=1G,provider=sample_provider1,
> + param1=value1,param2=value2
> +
> +The above parameters will be exported to the ExtStorage provider's
> +scripts as the enviromental variables:
> +
> +- `EXTP_PARAM1 = str(value1)`
> +- `EXTP_PARAM2 = str(value2)`
> +
> +We will also introduce a new Ganeti client called `gnt-storage` which
> +will be used to diagnose ExtStorage providers and show information about
> +them, similarly to the way `gnt-os diagose` and `gnt-os info` handle OS
> +definitions.
Hmm… you got me here (I was hoping to avoid new python-based CLI
front-ends, but it's too early for the switch ;-).
Except for the above two small comments, LGTM, thanks.
> On Wed, Sep 26, 2012 at 05:38:17PM +0300, Constantinos Venetsanopoulos wrote:
>> Update the shared storage design document to reflect the current
>> changes, after the implementation of the ExtStorage interface.
>> diff --git a/doc/design-shared-storage.rst b/doc/design-shared-storage.rst
>> index c175476..7080182 100644
>> --- a/doc/design-shared-storage.rst
>> +++ b/doc/design-shared-storage.rst
>> @@ -64,15 +64,11 @@ The design addresses the following procedures:
>> filesystems.
>> - Introduction of shared block device disk template with device
>> adoption.
>> +- Introduction of an External Storage Interface.
>> Additionally, mid- to long-term goals include:
>> - Support for external “storage pools”.
>> -- Introduction of an interface for communicating with external scripts,
>> - providing methods for the various stages of a block device's and
>> - instance's life-cycle. In order to provide storage provisioning
>> - capabilities for various SAN appliances, external helpers in the form
>> - of a “storage driver” will be possibly introduced as well.
>> Refactoring of all code referring to constants.DTS_NET_MIRROR
>> =============================================================
>> @@ -159,6 +155,104 @@ The shared block device template will make the following assumptions:
>> - The device will be available with the same path under all nodes in the
>> node group.
>> +Introduction of an External Storage Interface
>> +==============================================
>> +Overview
>> +--------
>> +
>> +To extend the shared block storage template and give Ganeti the ability
>> +to control and manipulate external storage (provisioning, removal,
>> +growing, etc.) we need a more generic approach. The generic method for
>> +supporting external shared storage in Ganeti will be to have an
>> +ExtStorage provider for each external shared storage hardware type. The
>> +ExtStorage provider will be a set of files (executable scripts and text
>> +files), contained inside a directory which will be named after the
>> +provider. This directory must be present across all nodes of a nodegroup
>> +(Ganeti doesn't replicate it), in order for the provider to be usable by
>> +Ganeti for this nodegroup (valid).
> How will Ganeti behave if they are not consistent? Report errors? (in
> cluster verify?) Ignore the provider? Etc.
The ExtStorage code follows exactly the behavior of the code
handling OS definitions: It produces appropriate error messages
and also comes with `gnt-storage {diagnose, info}' similarly to
`gnt-os {diagnose, info}'.
There is only one difference compared to the way OS defs are
handled:
The ExtStorage diagnose code calculates the validity of each provider
for each nodegroup in the cmdlib logic rather than in the client.
This was marked as 'TODO'' inside cmdlib for OS diagnose.
This gives you the flexibility to do neat things easily, such as running
the LU from inside cluster verify and producing validity statuses
each provider-nodegroup combination. So, presumably this can also
be used inside `gnt-cluster verify' in the future.
>> The external shared storage hardware
>> +should also be accessible by all nodes of this nodegroup too.
>> +
>> +An “ExtStorage provider” will have to provide the following methods:
>> +
>> +- Create a disk
>> +- Remove a disk
>> +- Grow a disk
>> +- Attach a disk to a given node
>> +- Detach a disk from a given node
>> +- Verify its supported parameters
>> +
>> +The proposed ExtStorage interface borrows heavily from the OS
>> +interface and follows a one-script-per-function approach. An ExtStorage
>> +provider is expected to provide the following scripts:
>> +
>> +- `create`
>> +- `remove`
>> +- `grow`
>> +- `attach`
>> +- `detach`
>> +- `verify`
>> +
>> +All scripts will be called with no arguments and get their input via
>> +environment variables. A common set of variables will be exported for
>> +all commands, and some of them might have extra ones.
>> +
>> +- `VOL_NAME`: The name of the volume. This is unique for Ganeti and it
>> + uses it to refer to a specific volume inside the external storage.
>> +- `VOL_SIZE`: The volume's size in mebibytes.
>> +- `VOL_NEW_SIZE`: Available only to the `grow` script. It declares the
>> + new size of the volume after grow (in mebibytes).
>> +- `EXTP_name`: ExtStorage parameter, where `name` is the parameter in
>> + upper-case (same as OS interface's `OSP_*` parameters).
>> +
>> +All scripts except `attach` should return 0 on success and non-zero on
>> +error, accompanied by an appropriate error message on stderr. The
>> +`attach` script should return a string on stdout on success, which is
>> +the block device's full path, after it has been successfully attached to
>> +the host node. On error it should return non-zero.
>> +
>> +Implementation
>> +--------------
>> +
>> +To support the ExtStorage interface, we will introduce a new disk
>> +template called `ext`. This template will implement the existing Ganeti
>> +disk interface in `lib/bdev.py` (create, remove, attach, assemble,
>> +shutdown, grow), and will simultaneously pass control to the external
>> +scripts to actually handle the above actions. The `ext` disk template
>> +will act as a translation layer between the current Ganeti disk
>> +interface and the ExtStorage providers.
>> +
>> +We will also introduce a new IDISK_PARAM called `IDISK_PROVIDER =
>> +provider`, which will be used at the command line to select the desired
>> +ExtStorage provider. This parameter will be valid only for template
>> +`ext` e.g.::
>> +
>> + gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1
>> +
>> +The Extstorage interface will support different disks to be created by
>> +different providers. e.g.::
>> +
>> + gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1
>> + --disk=1:size=1G,provider=sample_provider2
>> + --disk=2:size=3G,provider=sample_provider1
> This (also in the context of your other design changes) makes me a bit
> uneasy, with regards to coordinating changes across multiple providers
> in live migration and similar changes (even startup). Have you thought
> about this?
I'm not sure I can understand your point completely. Given the
diagnose functionality described above, are you concerned providers
are going to be in inconsistent state among nodes? Is it a matter
of how the allocator decides the target node given different providers?
Can you expand on the "coordinating changes across multiple providers
in live migration and similar changes (even startup)" part of your
question? Perhaps with some examples?
>> +Finally, the ExtStorage interface will support passing of parameters to
>> +the ExtStorage provider. This will also be done per disk, from the
>> +command line::
>> +
>> + gnt-instance add -t ext --disk=0:size=1G,provider=sample_provider1,
>> + param1=value1,param2=value2
>> +
>> +The above parameters will be exported to the ExtStorage provider's
>> +scripts as the enviromental variables:
>> +
>> +- `EXTP_PARAM1 = str(value1)`
>> +- `EXTP_PARAM2 = str(value2)`
>> +
>> +We will also introduce a new Ganeti client called `gnt-storage` which
>> +will be used to diagnose ExtStorage providers and show information about
>> +them, similarly to the way `gnt-os diagose` and `gnt-os info` handle OS
>> +definitions.
> Hmm… you got me here (I was hoping to avoid new python-based CLI
> front-ends, but it's too early for the switch ;-).
This makes me happy :-)
I understand this may need some porting to Haskell later on.
I hope my Haskell skills will have improved enough by then, for me to
be able to contribute to the effort.
On Thu, Sep 27, 2012 at 12:37:41PM +0300, Constantinos Venetsanopoulos wrote:
> On 09/26/2012 07:21 PM, Iustin Pop wrote:
> >On Wed, Sep 26, 2012 at 05:38:17PM +0300, Constantinos Venetsanopoulos wrote:
> >>Update the shared storage design document to reflect the current
> >>changes, after the implementation of the ExtStorage interface.
> >>diff --git a/doc/design-shared-storage.rst b/doc/design-shared-storage.rst
> >>index c175476..7080182 100644
> >>--- a/doc/design-shared-storage.rst
> >>+++ b/doc/design-shared-storage.rst
> >>@@ -64,15 +64,11 @@ The design addresses the following procedures:
> >> filesystems.
> >> - Introduction of shared block device disk template with device
> >> adoption.
> >>+- Introduction of an External Storage Interface.
> >> Additionally, mid- to long-term goals include:
> >> - Support for external “storage pools”.
> >>-- Introduction of an interface for communicating with external scripts,
> >>- providing methods for the various stages of a block device's and
> >>- instance's life-cycle. In order to provide storage provisioning
> >>- capabilities for various SAN appliances, external helpers in the form
> >>- of a “storage driver” will be possibly introduced as well.
> >> Refactoring of all code referring to constants.DTS_NET_MIRROR
> >> =============================================================
> >>@@ -159,6 +155,104 @@ The shared block device template will make the following assumptions:
> >> - The device will be available with the same path under all nodes in the
> >> node group.
> >>+Introduction of an External Storage Interface
> >>+==============================================
> >>+Overview
> >>+--------
> >>+
> >>+To extend the shared block storage template and give Ganeti the ability
> >>+to control and manipulate external storage (provisioning, removal,
> >>+growing, etc.) we need a more generic approach. The generic method for
> >>+supporting external shared storage in Ganeti will be to have an
> >>+ExtStorage provider for each external shared storage hardware type. The
> >>+ExtStorage provider will be a set of files (executable scripts and text
> >>+files), contained inside a directory which will be named after the
> >>+provider. This directory must be present across all nodes of a nodegroup
> >>+(Ganeti doesn't replicate it), in order for the provider to be usable by
> >>+Ganeti for this nodegroup (valid).
> >How will Ganeti behave if they are not consistent? Report errors? (in
> >cluster verify?) Ignore the provider? Etc.
> The ExtStorage code follows exactly the behavior of the code
> handling OS definitions: It produces appropriate error messages
> and also comes with `gnt-storage {diagnose, info}' similarly to
> `gnt-os {diagnose, info}'.
> There is only one difference compared to the way OS defs are
> handled:
> The ExtStorage diagnose code calculates the validity of each provider
> for each nodegroup in the cmdlib logic rather than in the client.
> This was marked as 'TODO'' inside cmdlib for OS diagnose.
> This gives you the flexibility to do neat things easily, such as running
> the LU from inside cluster verify and producing validity statuses
> each provider-nodegroup combination. So, presumably this can also
> be used inside `gnt-cluster verify' in the future.
> >>The external shared storage hardware
> >>+should also be accessible by all nodes of this nodegroup too.
> >>+
> >>+An “ExtStorage provider” will have to provide the following methods:
> >>+
> >>+- Create a disk
> >>+- Remove a disk
> >>+- Grow a disk
> >>+- Attach a disk to a given node
> >>+- Detach a disk from a given node
> >>+- Verify its supported parameters
> >>+
> >>+The proposed ExtStorage interface borrows heavily from the OS
> >>+interface and follows a one-script-per-function approach. An ExtStorage
> >>+provider is expected to provide the following scripts:
> >>+
> >>+- `create`
> >>+- `remove`
> >>+- `grow`
> >>+- `attach`
> >>+- `detach`
> >>+- `verify`
> >>+
> >>+All scripts will be called with no arguments and get their input via
> >>+environment variables. A common set of variables will be exported for
> >>+all commands, and some of them might have extra ones.
> >>+
> >>+- `VOL_NAME`: The name of the volume. This is unique for Ganeti and it
> >>+ uses it to refer to a specific volume inside the external storage.
> >>+- `VOL_SIZE`: The volume's size in mebibytes.
> >>+- `VOL_NEW_SIZE`: Available only to the `grow` script. It declares the
> >>+ new size of the volume after grow (in mebibytes).
> >>+- `EXTP_name`: ExtStorage parameter, where `name` is the parameter in
> >>+ upper-case (same as OS interface's `OSP_*` parameters).
> >>+
> >>+All scripts except `attach` should return 0 on success and non-zero on
> >>+error, accompanied by an appropriate error message on stderr. The
> >>+`attach` script should return a string on stdout on success, which is
> >>+the block device's full path, after it has been successfully attached to
> >>+the host node. On error it should return non-zero.
> >>+
> >>+Implementation
> >>+--------------
> >>+
> >>+To support the ExtStorage interface, we will introduce a new disk
> >>+template called `ext`. This template will implement the existing Ganeti
> >>+disk interface in `lib/bdev.py` (create, remove, attach, assemble,
> >>+shutdown, grow), and will simultaneously pass control to the external
> >>+scripts to actually handle the above actions. The `ext` disk template
> >>+will act as a translation layer between the current Ganeti disk
> >>+interface and the ExtStorage providers.
> >>+
> >>+We will also introduce a new IDISK_PARAM called `IDISK_PROVIDER =
> >>+provider`, which will be used at the command line to select the desired
> >>+ExtStorage provider. This parameter will be valid only for template
> >>+`ext` e.g.::
> >>+
> >>+ gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1
> >>+
> >>+The Extstorage interface will support different disks to be created by
> >>+different providers. e.g.::
> >>+
> >>+ gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1
> >>+ --disk=1:size=1G,provider=sample_provider2
> >>+ --disk=2:size=3G,provider=sample_provider1
> >This (also in the context of your other design changes) makes me a bit
> >uneasy, with regards to coordinating changes across multiple providers
> >in live migration and similar changes (even startup). Have you thought
> >about this?
> I'm not sure I can understand your point completely. Given the
> diagnose functionality described above, are you concerned providers
> are going to be in inconsistent state among nodes? Is it a matter
> of how the allocator decides the target node given different providers?
Ah no, see below.
> Can you expand on the "coordinating changes across multiple providers
> in live migration and similar changes (even startup)" part of your
> question? Perhaps with some examples?
I'll try :)
I have a _very slight_ worry on that handling "complex" instances will
become more tricky if the behaviour of different storage providers or
disk templates (this is in the context of the other designs) differs.
For example, let's say we have an instance with first disk DRBD, second
disk ext,provider=p1, third disk ext,provider=p2.
We know we can live migrate an instance across node groups for DRBD, and
we now we can migrate ext providers if they are available in both
groups. But combining all these checks across multiple disks is just 2%
more tricky: we need to move from "disk_template in
constants.DTS_MIRRORED" to something like "does all instance disks allow
migration/failover/move from (nodegroup A, nodes [a,b]) to (nodegroup B,
nodes [c,d])" (where A could be equal to B)?
This is doable, just means that a lot of decisions about the instance
behaviour (can be moved, can be live migrated, etc.) will move away from
the instance level (disk_template) and become an aggregate of the
instance's disk capabilities.
Which is all fine, now that I thought it through, just something that
we need to keep in mind.
> >>+Finally, the ExtStorage interface will support passing of parameters to
> >>+the ExtStorage provider. This will also be done per disk, from the
> >>+command line::
> >>+
> >>+ gnt-instance add -t ext --disk=0:size=1G,provider=sample_provider1,
> >>+ param1=value1,param2=value2
> >>+
> >>+The above parameters will be exported to the ExtStorage provider's
> >>+scripts as the enviromental variables:
> >>+
> >>+- `EXTP_PARAM1 = str(value1)`
> >>+- `EXTP_PARAM2 = str(value2)`
> >>+
> >>+We will also introduce a new Ganeti client called `gnt-storage` which
> >>+will be used to diagnose ExtStorage providers and show information about
> >>+them, similarly to the way `gnt-os diagose` and `gnt-os info` handle OS
> >>+definitions.
> >Hmm… you got me here (I was hoping to avoid new python-based CLI
> >front-ends, but it's too early for the switch ;-).
> This makes me happy :-)
> I understand this may need some porting to Haskell later on.
> I hope my Haskell skills will have improved enough by then, for me to
> be able to contribute to the effort.
> On Thu, Sep 27, 2012 at 12:37:41PM +0300, Constantinos Venetsanopoulos wrote:
>> On 09/26/2012 07:21 PM, Iustin Pop wrote:
>>> On Wed, Sep 26, 2012 at 05:38:17PM +0300, Constantinos Venetsanopoulos wrote:
>>>> Update the shared storage design document to reflect the current
>>>> changes, after the implementation of the ExtStorage interface.
>>>> diff --git a/doc/design-shared-storage.rst b/doc/design-shared-storage.rst
>>>> index c175476..7080182 100644
>>>> --- a/doc/design-shared-storage.rst
>>>> +++ b/doc/design-shared-storage.rst
>>>> @@ -64,15 +64,11 @@ The design addresses the following procedures:
>>>> filesystems.
>>>> - Introduction of shared block device disk template with device
>>>> adoption.
>>>> +- Introduction of an External Storage Interface.
>>>> Additionally, mid- to long-term goals include:
>>>> - Support for external “storage pools”.
>>>> -- Introduction of an interface for communicating with external scripts,
>>>> - providing methods for the various stages of a block device's and
>>>> - instance's life-cycle. In order to provide storage provisioning
>>>> - capabilities for various SAN appliances, external helpers in the form
>>>> - of a “storage driver” will be possibly introduced as well.
>>>> Refactoring of all code referring to constants.DTS_NET_MIRROR
>>>> =============================================================
>>>> @@ -159,6 +155,104 @@ The shared block device template will make the following assumptions:
>>>> - The device will be available with the same path under all nodes in the
>>>> node group.
>>>> +Introduction of an External Storage Interface
>>>> +==============================================
>>>> +Overview
>>>> +--------
>>>> +
>>>> +To extend the shared block storage template and give Ganeti the ability
>>>> +to control and manipulate external storage (provisioning, removal,
>>>> +growing, etc.) we need a more generic approach. The generic method for
>>>> +supporting external shared storage in Ganeti will be to have an
>>>> +ExtStorage provider for each external shared storage hardware type. The
>>>> +ExtStorage provider will be a set of files (executable scripts and text
>>>> +files), contained inside a directory which will be named after the
>>>> +provider. This directory must be present across all nodes of a nodegroup
>>>> +(Ganeti doesn't replicate it), in order for the provider to be usable by
>>>> +Ganeti for this nodegroup (valid).
>>> How will Ganeti behave if they are not consistent? Report errors? (in
>>> cluster verify?) Ignore the provider? Etc.
>> The ExtStorage code follows exactly the behavior of the code
>> handling OS definitions: It produces appropriate error messages
>> and also comes with `gnt-storage {diagnose, info}' similarly to
>> `gnt-os {diagnose, info}'.
>> There is only one difference compared to the way OS defs are
>> handled:
>> The ExtStorage diagnose code calculates the validity of each provider
>> for each nodegroup in the cmdlib logic rather than in the client.
>> This was marked as 'TODO'' inside cmdlib for OS diagnose.
>> This gives you the flexibility to do neat things easily, such as running
>> the LU from inside cluster verify and producing validity statuses
>> each provider-nodegroup combination. So, presumably this can also
>> be used inside `gnt-cluster verify' in the future.
> Sounds very good, thanks!
>>>> The external shared storage hardware
>>>> +should also be accessible by all nodes of this nodegroup too.
>>>> +
>>>> +An “ExtStorage provider” will have to provide the following methods:
>>>> +
>>>> +- Create a disk
>>>> +- Remove a disk
>>>> +- Grow a disk
>>>> +- Attach a disk to a given node
>>>> +- Detach a disk from a given node
>>>> +- Verify its supported parameters
>>>> +
>>>> +The proposed ExtStorage interface borrows heavily from the OS
>>>> +interface and follows a one-script-per-function approach. An ExtStorage
>>>> +provider is expected to provide the following scripts:
>>>> +
>>>> +- `create`
>>>> +- `remove`
>>>> +- `grow`
>>>> +- `attach`
>>>> +- `detach`
>>>> +- `verify`
>>>> +
>>>> +All scripts will be called with no arguments and get their input via
>>>> +environment variables. A common set of variables will be exported for
>>>> +all commands, and some of them might have extra ones.
>>>> +
>>>> +- `VOL_NAME`: The name of the volume. This is unique for Ganeti and it
>>>> + uses it to refer to a specific volume inside the external storage.
>>>> +- `VOL_SIZE`: The volume's size in mebibytes.
>>>> +- `VOL_NEW_SIZE`: Available only to the `grow` script. It declares the
>>>> + new size of the volume after grow (in mebibytes).
>>>> +- `EXTP_name`: ExtStorage parameter, where `name` is the parameter in
>>>> + upper-case (same as OS interface's `OSP_*` parameters).
>>>> +
>>>> +All scripts except `attach` should return 0 on success and non-zero on
>>>> +error, accompanied by an appropriate error message on stderr. The
>>>> +`attach` script should return a string on stdout on success, which is
>>>> +the block device's full path, after it has been successfully attached to
>>>> +the host node. On error it should return non-zero.
>>>> +
>>>> +Implementation
>>>> +--------------
>>>> +
>>>> +To support the ExtStorage interface, we will introduce a new disk
>>>> +template called `ext`. This template will implement the existing Ganeti
>>>> +disk interface in `lib/bdev.py` (create, remove, attach, assemble,
>>>> +shutdown, grow), and will simultaneously pass control to the external
>>>> +scripts to actually handle the above actions. The `ext` disk template
>>>> +will act as a translation layer between the current Ganeti disk
>>>> +interface and the ExtStorage providers.
>>>> +
>>>> +We will also introduce a new IDISK_PARAM called `IDISK_PROVIDER =
>>>> +provider`, which will be used at the command line to select the desired
>>>> +ExtStorage provider. This parameter will be valid only for template
>>>> +`ext` e.g.::
>>>> +
>>>> + gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1
>>>> +
>>>> +The Extstorage interface will support different disks to be created by
>>>> +different providers. e.g.::
>>>> +
>>>> + gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1
>>>> + --disk=1:size=1G,provider=sample_provider2
>>>> + --disk=2:size=3G,provider=sample_provider1
>>> This (also in the context of your other design changes) makes me a bit
>>> uneasy, with regards to coordinating changes across multiple providers
>>> in live migration and similar changes (even startup). Have you thought
>>> about this?
>> I'm not sure I can understand your point completely. Given the
>> diagnose functionality described above, are you concerned providers
>> are going to be in inconsistent state among nodes? Is it a matter
>> of how the allocator decides the target node given different providers?
> Ah no, see below.
>> Can you expand on the "coordinating changes across multiple providers
>> in live migration and similar changes (even startup)" part of your
>> question? Perhaps with some examples?
> I'll try :)
> I have a _very slight_ worry on that handling "complex" instances will
> become more tricky if the behaviour of different storage providers or
> disk templates (this is in the context of the other designs) differs.
> For example, let's say we have an instance with first disk DRBD, second
> disk ext,provider=p1, third disk ext,provider=p2.
> We know we can live migrate an instance across node groups for DRBD, and
> we now we can migrate ext providers if they are available in both
> groups. But combining all these checks across multiple disks is just 2%
> more tricky: we need to move from "disk_template in
> constants.DTS_MIRRORED" to something like "does all instance disks allow
> migration/failover/move from (nodegroup A, nodes [a,b]) to (nodegroup B,
> nodes [c,d])" (where A could be equal to B)?
> This is doable, just means that a lot of decisions about the instance
> behaviour (can be moved, can be live migrated, etc.) will move away from
> the instance level (disk_template) and become an aggregate of the
> instance's disk capabilities.
Exactly! Wrt the ExtStorage patchset, we won't need to make changes
in the decision making because everything still stays at instance level,
even though we have different providers on different disks. All we have
to do, is make sure all providers are present at the node/nodegroup we
want to migrate/failover/move (I have tested live migrations of instances
with let's say disk0 ext,provider=p1, disk1 ext,provider=p2 without
changing anything in the current allocation logic).
When we introduce Storage Pools and the ability to have different
disks of an instance residing in different Storage Pools, then we will
have to do exactly as you are saying (and is also written in the design
doc). We should move the decision logic from operating at instance
level, to operating at the aggregation of the instance's disks Storage
Pools. At that point, we also don't have a problem with providers,
because providers will be moved from an IDISK_PARAM (which we
need right now as a transition level) to a parameter of the Storage
Pool. Thus, the decision logic will not need to know anything about
providers as it doesn't need to know now.
As you say, we will move from:
"disk_template in constants.DTS_MIRRORED"
to:
"are all the instance's disks Storage Pools connected to the
nodegroup we want to
On Wed, Sep 26, 2012 at 05:38:18PM +0300, Constantinos Venetsanopoulos wrote:
> With this commit we introduce the External Storage Interface
> to Ganeti, abbreviated: ExtStorage Interface.
> The ExtStorage Interface provides Ganeti with the ability to interact
> with externally connected shared storage pools, visible by all
> VM-capable nodes. This means that Ganeti is able to handle VM disks
> that reside inside a NAS/SAN or any distributed block storage provider.
> The ExtStorage Interface provides a clear API, heavily inspired by the
> gnt-os-interface API, that can be used by storage vendors or sysadmins
> to write simple ExtStorage Providers (correlated to gnt-os-interface's
> OS Definitions). Those Providers will glue externally attached shared
> storage with Ganeti, without the need of preprovisioned block devices
> on Ganeti VM-capable nodes as confined be the current `blockdev' disk
> template.
> To do so, we implement a new disk template called `ext' (of type
> DTS_EXT_MIRROR) that passes control to externally provided scripts
> (the ExtStorage Provider) for the template's basic functions:
> create / attach / detach / remove / grow
> The scripts reside under ES_SEARCH_PATH (correlated to OS_SEARCH_PATH)
> and only one ExtStorage Provider is supported called `ext'.
> The disk's logical id is the tuple ('ext', UUID.ext.diskX), where UUID
> is generated as in disk template `plain' and X is the disk's index.
A few general comments, or rather questions. Some of the are rather
design level, but they become more apparent when reading the code only,
sorry.
I'm not sure creating one log file per operation is a good idea. While
installing/renaming an instance is a (somewhat) rare operation,
activating the disks of an instance is a much often one. I know that
right now in attach you don't create a log, but that seems to be the
intention (hence the TODO). Thoughts?
Also, it seems by reading the code that Ganeti generates an UUID for the
volume, and it asks the provider to create one such drive; basically,
the provider will get "please create disk f0559a.0 of size 500G", right?
This means that only the ganeti configuration has the mapping instance
name to disks, and that losing the ganeti configuration would mean
instantly losing all instance data since we can't map it back to a VM.
For 'plain' disks, we explicitly tag the LV names with the name of the
instance, to safeguard against such problems. Do you think this is not a
problem? Or could we add the instance name to the external provider
create call, such that providers could do such an association
themselves, if they care?
> +# --with-extstorage-search-path=...
> +# same black sed magic for quoting of the strings in the list
> +AC_ARG_WITH([extstorage-search-path],
> + [AS_HELP_STRING([--with-extstorage-search-path=LIST],
> + [comma separated list of directories to]
> + [ search for External Storage Providers]
> + [ (default is /srv/ganeti/extstorage)]
> + )],
> + [es_search_path=`echo -n "$withval" | sed -e "s/\([[^,]]*\)/'\1'/g"`],
> + [es_search_path="'/srv/ganeti/extstorage'"])
> +AC_SUBST(ES_SEARCH_PATH, $es_search_path)
> +
> # --with-iallocator-search-path=...
> # do a bit of black sed magic to for quoting of the strings in the list
> AC_ARG_WITH([iallocator-search-path],
> diff --git a/lib/bdev.py b/lib/bdev.py
> index d6c1a66..b0ed7c0 100644
> --- a/lib/bdev.py
> +++ b/lib/bdev.py
> @@ -33,6 +33,7 @@ import logging
> from ganeti import utils
> from ganeti import errors
> from ganeti import constants
> +from ganeti import pathutils
> from ganeti import objects
> from ganeti import compat
> from ganeti import netutils
> @@ -2648,11 +2649,328 @@ class RADOSBlockDevice(BlockDev):
> result.fail_reason, result.output)
> +class ExtStorageDevice(BlockDev):
> + """A block device provided by an ExtStorage Provider.
> +
> + This class implements the External Storage Interface, which means
> + handling of the externally provided block devices.
> +
> + """
> + def __init__(self, unique_id, children, size, params):
> + """Attaches to an extstorage block device.
> +
> + """
> + super(ExtStorageDevice, self).__init__(unique_id, children, size, params)
> + if not isinstance(unique_id, (tuple, list)) or len(unique_id) != 2:
> + raise ValueError("Invalid configuration data %s" % str(unique_id))
> +
> + self.driver, self.vol_name = unique_id
> +
> + self.major = self.minor = None
> + self.Attach()
> +
> + @classmethod
> + def Create(cls, unique_id, children, size, params):
> + """Create a new extstorage device.
> +
> + Provision a new volume using an extstorage provider, which will
> + then be mapped to a block device.
> +
> + """
> + if not isinstance(unique_id, (tuple, list)) or len(unique_id) != 2:
> + raise errors.ProgrammerError("Invalid configuration data %s" %
> + str(unique_id))
> +
> + # Call the External Storage's create script,
> + # to provision a new Volume inside the External Storage
> + _ExtStorageAction(constants.ES_ACTION_CREATE, unique_id, str(size))
> +
> + return ExtStorageDevice(unique_id, children, size, params)
> +
> + def Remove(self):
> + """Remove the extstorage device.
> +
> + """
> + if not self.minor and not self.Attach():
> + # The extstorage device doesn't exist.
> + return
> +
> + # First shutdown the device (remove mappings).
> + self.Shutdown()
> +
> + # Call the External Storage's remove script,
> + # to remove the Volume from the External Storage
> + _ExtStorageAction(constants.ES_ACTION_REMOVE, self.unique_id)
> +
> + def Rename(self, new_id):
> + """Rename this device.
> +
> + """
> + pass
> +
> + def Attach(self):
> + """Attach to an existing extstorage device.
> +
> + This method maps the extstorage volume that matches our name with
> + a corresponding block device and then attaches to this device.
> +
> + """
> + self.attached = False
> +
> + # Call the External Storage's attach script,
> + # to attach an existing Volume to a block device under /dev
> + self.dev_path = _ExtStorageAction(constants.ES_ACTION_ATTACH,
> + self.unique_id)
> +
> + try:
> + st = os.stat(self.dev_path)
> + except OSError, err:
> + logging.error("Error stat()'ing %s: %s", self.dev_path, str(err))
> + return False
> +
> + if not stat.S_ISBLK(st.st_mode):
> + logging.error("%s is not a block device", self.dev_path)
> + return False
> +
> + self.major = os.major(st.st_rdev)
> + self.minor = os.minor(st.st_rdev)
> + self.attached = True
> +
> + return True
> +
> + def Assemble(self):
> + """Assemble the device.
> +
> + """
> + pass
> +
> + def Shutdown(self):
> + """Shutdown the device.
> +
> + """
> + if not self.minor and not self.Attach():
> + # The extstorage device doesn't exist.
> + return
> +
> + # Call the External Storage's detach script,
> + # to detach an existing Volume from it's block device under /dev
> + _ExtStorageAction(constants.ES_ACTION_DETACH, self.unique_id)
> +
> + self.minor = None
> + self.dev_path = None
> +
> + def Open(self, force=False):
> + """Make the device ready for I/O.
> +
> + """
> + pass
> +
> + def Close(self):
> + """Notifies that the device will no longer be used for I/O.
> +
> + """
> + pass
> +
> + def Grow(self, amount, dryrun, backingstore):
> + """Grow the Volume.
> +
> + @type amount: integer
> + @param amount: the amount (in mebibytes) to grow with
> + @type dryrun: boolean
> + @param dryrun: whether to execute the operation in simulation mode
> + only, without actually increasing the size
> +
> + """
> + if not backingstore:
> + return
> + if not self.Attach():
> + _ThrowError("Can't attach to extstorage device during Grow()")
> +
> + if dryrun:
> + # we do not support dry runs of resize operations for now.
> + return
> +
> + new_size = self.size + amount
> +
> + # Call the External Storage's grow script,
> + # to grow an existing Volume inside the External Storage
> + _ExtStorageAction(constants.ES_ACTION_GROW, self.unique_id,
> + str(self.size),
On Wed, Sep 26, 2012 at 05:38:20PM +0300, Constantinos Venetsanopoulos wrote:
> The option is added to allow us the passing of arbitrary ext-params
> during disk addition (gnt-instance modify --disk add: ...), same as
> we do during instance add.
> The option is needed because during gnt-instance modify parameters'
> validation, we do not know the type of the disk template
> (whereas in gnt-instance add we find it from the -t option).
> The option is called 'arbit' and not 'ext' params because it may be
> useful in other contexts too, in the future.
> The corresponding prereq checks for the existence of the 'provider'
> parameter in case of an 'ext' template are also taken care off in
> this commit. It serves as a general introduction of the 'ext'
> template in gnt-instance modify.
This is not a review of the patch, but just reading the commit message,
I find "--allow-arbit-params" too hard to read. What about
"--flexible-params"?
> On Wed, Sep 26, 2012 at 05:38:18PM +0300, Constantinos Venetsanopoulos wrote:
>> With this commit we introduce the External Storage Interface
>> to Ganeti, abbreviated: ExtStorage Interface.
>> The ExtStorage Interface provides Ganeti with the ability to interact
>> with externally connected shared storage pools, visible by all
>> VM-capable nodes. This means that Ganeti is able to handle VM disks
>> that reside inside a NAS/SAN or any distributed block storage provider.
>> The ExtStorage Interface provides a clear API, heavily inspired by the
>> gnt-os-interface API, that can be used by storage vendors or sysadmins
>> to write simple ExtStorage Providers (correlated to gnt-os-interface's
>> OS Definitions). Those Providers will glue externally attached shared
>> storage with Ganeti, without the need of preprovisioned block devices
>> on Ganeti VM-capable nodes as confined be the current `blockdev' disk
>> template.
>> To do so, we implement a new disk template called `ext' (of type
>> DTS_EXT_MIRROR) that passes control to externally provided scripts
>> (the ExtStorage Provider) for the template's basic functions:
>> create / attach / detach / remove / grow
>> The scripts reside under ES_SEARCH_PATH (correlated to OS_SEARCH_PATH)
>> and only one ExtStorage Provider is supported called `ext'.
>> The disk's logical id is the tuple ('ext', UUID.ext.diskX), where UUID
>> is generated as in disk template `plain' and X is the disk's index.
> A few general comments, or rather questions. Some of the are rather
> design level, but they become more apparent when reading the code only,
> sorry.
No problem.
> I'm not sure creating one log file per operation is a good idea. While
> installing/renaming an instance is a (somewhat) rare operation,
> activating the disks of an instance is a much often one. I know that
> right now in attach you don't create a log, but that seems to be the
> intention (hence the TODO). Thoughts?
Indeed, creating a log file during attach was the initial intention, however
I understand your point since specifically attach is run multiple times
even during a single instance addition. On the other hand, I think creation
and removal of a disk should reside in different files because they do not
happen so often. Furthermore, these log files are the only place where the
admin actually can see what happens, when something goes wrong with
the external hardware and are also very helpful when writing custom
ExtStorage providers.
What do you think? Another option would be to keep the log files for
create/remove/grow and do not log the attach/detach. Or keep them
for now to see the workflow and if they produce a lot of unnecessary
info that we do not want to keep, merge all actions in a single log file
for each disk (but this needs code refactoring).
> Also, it seems by reading the code that Ganeti generates an UUID for the
> volume, and it asks the provider to create one such drive; basically,
> the provider will get "please create disk f0559a.0 of size 500G", right?
That's right.
> This means that only the ganeti configuration has the mapping instance
> name to disks, and that losing the ganeti configuration would mean
> instantly losing all instance data since we can't map it back to a VM.
> For 'plain' disks, we explicitly tag the LV names with the name of the
> instance, to safeguard against such problems. Do you think this is not a
> problem?
To be honest, I haven't thought about this case, but I understand
your point.
> Or could we add the instance name to the external provider
> create call, such that providers could do such an association
> themselves, if they care?
That would be fine, but I can't see an easy way to do that, since the
instance object doesn't find its way inside bdev, and it wouldn't be clean
to add new parameters in the Create call. What we could do though,
would be to prepend the volume name with the instance's name.
Much like you do with 'plain' disks. Now the disk name has the following
format:
'04859fac4.ext.diskX' where X is the index.
We could change that to 'instancename-04859fac4.ext.diskX' and
document it appropriately, so that the mapping will be there for the
providers to use it, if they care. Would that be OK?
>> +# --with-extstorage-search-path=...
>> +# same black sed magic for quoting of the strings in the list
>> +AC_ARG_WITH([extstorage-search-path],
>> + [AS_HELP_STRING([--with-extstorage-search-path=LIST],
>> + [comma separated list of directories to]
>> + [ search for External Storage Providers]
>> + [ (default is /srv/ganeti/extstorage)]
>> + )],
>> + [es_search_path=`echo -n "$withval" | sed -e "s/\([[^,]]*\)/'\1'/g"`],
>> + [es_search_path="'/srv/ganeti/extstorage'"])
>> +AC_SUBST(ES_SEARCH_PATH, $es_search_path)
>> +
>> # --with-iallocator-search-path=...
>> # do a bit of black sed magic to for quoting of the strings in the list
>> AC_ARG_WITH([iallocator-search-path],
>> diff --git a/lib/bdev.py b/lib/bdev.py
>> index d6c1a66..b0ed7c0 100644
>> --- a/lib/bdev.py
>> +++ b/lib/bdev.py
>> @@ -33,6 +33,7 @@ import logging
>> from ganeti import utils
>> from ganeti import errors
>> from ganeti import constants
>> +from ganeti import pathutils
>> from ganeti import objects
>> from ganeti import compat
>> from ganeti import netutils
>> @@ -2648,11 +2649,328 @@ class RADOSBlockDevice(BlockDev):
>> result.fail_reason, result.output)
>> +class ExtStorageDevice(BlockDev):
>> + """A block device provided by an ExtStorage Provider.
>> +
>> + This class implements the External Storage Interface, which means
>> + handling of the externally provided block devices.
>> +
>> + """
>> + def __init__(self, unique_id, children, size, params):
>> + """Attaches to an extstorage block device.
>> +
>> + """
>> + super(ExtStorageDevice, self).__init__(unique_id, children, size, params)
>> + if not isinstance(unique_id, (tuple, list)) or len(unique_id) != 2:
>> + raise ValueError("Invalid configuration data %s" % str(unique_id))
>> +
>> + self.driver, self.vol_name = unique_id
>> +
>> + self.major = self.minor = None
>> + self.Attach()
>> +
>> + @classmethod
>> + def Create(cls, unique_id, children, size, params):
>> + """Create a new extstorage device.
>> +
>> + Provision a new volume using an extstorage provider, which will
>> + then be mapped to a block device.
>> +
>> + """
>> + if not isinstance(unique_id, (tuple, list)) or len(unique_id) != 2:
>> + raise errors.ProgrammerError("Invalid configuration data %s" %
>> + str(unique_id))
>> +
>> + # Call the External Storage's create script,
>> + # to provision a new Volume inside the External Storage
>> + _ExtStorageAction(constants.ES_ACTION_CREATE, unique_id, str(size))
>> +
>> + return ExtStorageDevice(unique_id, children, size, params)
>> +
>> + def Remove(self):
>> + """Remove the extstorage device.
>> +
>> + """
>> + if not self.minor and not self.Attach():
>> + # The extstorage device doesn't exist.
>> + return
>> +
>> + # First shutdown the device (remove mappings).
>> + self.Shutdown()
>> +
>> + # Call the External Storage's remove script,
>> + # to remove the Volume from the External Storage
>> + _ExtStorageAction(constants.ES_ACTION_REMOVE, self.unique_id)
>> +
>> + def Rename(self, new_id):
>> + """Rename this device.
>> +
>> + """
>> + pass
>> +
>> + def Attach(self):
>> + """Attach to an existing extstorage device.
>> +
>> + This method maps the extstorage volume that matches our name with
>> + a corresponding block device and then attaches to this device.
>> +
>> + """
>> + self.attached = False
>> +
>> + # Call the External Storage's attach script,
>> + # to attach an existing Volume to a block device under /dev
>> + self.dev_path = _ExtStorageAction(constants.ES_ACTION_ATTACH,
>> + self.unique_id)
>> +
>> + try:
>> + st = os.stat(self.dev_path)
>> + except OSError, err:
>> + logging.error("Error stat()'ing %s: %s", self.dev_path, str(err))
>> + return False
>> +
>> + if not stat.S_ISBLK(st.st_mode):
>> + logging.error("%s is not a
> On Wed, Sep 26, 2012 at 05:38:20PM +0300, Constantinos Venetsanopoulos wrote:
>> The option is added to allow us the passing of arbitrary ext-params
>> during disk addition (gnt-instance modify --disk add: ...), same as
>> we do during instance add.
>> The option is needed because during gnt-instance modify parameters'
>> validation, we do not know the type of the disk template
>> (whereas in gnt-instance add we find it from the -t option).
>> The option is called 'arbit' and not 'ext' params because it may be
>> useful in other contexts too, in the future.
>> The corresponding prereq checks for the existence of the 'provider'
>> parameter in case of an 'ext' template are also taken care off in
>> this commit. It serves as a general introduction of the 'ext'
>> template in gnt-instance modify.
Oooops :)
I've been rebasing a hundred times (always resolving conflicts) the last
two weeks to catch up with you guys so this must have slipped.
Could you please remove it when applying?
> This is not a review of the patch, but just reading the commit message,
> I find "--allow-arbit-params" too hard to read. What about
> "--flexible-params"?
I think "allow" should be in there. Maybe it becomes more clear when
you see the code. For "arbit" I'm not sure either. In any case, we can
change that to "--flexible-params" during the review if you think that
fits best at the end.
On Thu, Sep 27, 2012 at 02:46:40PM +0300, Constantinos Venetsanopoulos wrote:
> On 09/27/2012 02:02 PM, Iustin Pop wrote:
> >On Thu, Sep 27, 2012 at 12:37:41PM +0300, Constantinos Venetsanopoulos wrote:
> >>On 09/26/2012 07:21 PM, Iustin Pop wrote:
> >>>On Wed, Sep 26, 2012 at 05:38:17PM +0300, Constantinos Venetsanopoulos wrote:
> >>>>Update the shared storage design document to reflect the current
> >>>>changes, after the implementation of the ExtStorage interface.
> >>>>diff --git a/doc/design-shared-storage.rst b/doc/design-shared-storage.rst
> >>>>index c175476..7080182 100644
> >>>>--- a/doc/design-shared-storage.rst
> >>>>+++ b/doc/design-shared-storage.rst
> >>>>@@ -64,15 +64,11 @@ The design addresses the following procedures:
> >>>> filesystems.
> >>>> - Introduction of shared block device disk template with device
> >>>> adoption.
> >>>>+- Introduction of an External Storage Interface.
> >>>> Additionally, mid- to long-term goals include:
> >>>> - Support for external “storage pools”.
> >>>>-- Introduction of an interface for communicating with external scripts,
> >>>>- providing methods for the various stages of a block device's and
> >>>>- instance's life-cycle. In order to provide storage provisioning
> >>>>- capabilities for various SAN appliances, external helpers in the form
> >>>>- of a “storage driver” will be possibly introduced as well.
> >>>> Refactoring of all code referring to constants.DTS_NET_MIRROR
> >>>> =============================================================
> >>>>@@ -159,6 +155,104 @@ The shared block device template will make the following assumptions:
> >>>> - The device will be available with the same path under all nodes in the
> >>>> node group.
> >>>>+Introduction of an External Storage Interface
> >>>>+==============================================
> >>>>+Overview
> >>>>+--------
> >>>>+
> >>>>+To extend the shared block storage template and give Ganeti the ability
> >>>>+to control and manipulate external storage (provisioning, removal,
> >>>>+growing, etc.) we need a more generic approach. The generic method for
> >>>>+supporting external shared storage in Ganeti will be to have an
> >>>>+ExtStorage provider for each external shared storage hardware type. The
> >>>>+ExtStorage provider will be a set of files (executable scripts and text
> >>>>+files), contained inside a directory which will be named after the
> >>>>+provider. This directory must be present across all nodes of a nodegroup
> >>>>+(Ganeti doesn't replicate it), in order for the provider to be usable by
> >>>>+Ganeti for this nodegroup (valid).
> >>>How will Ganeti behave if they are not consistent? Report errors? (in
> >>>cluster verify?) Ignore the provider? Etc.
> >>The ExtStorage code follows exactly the behavior of the code
> >>handling OS definitions: It produces appropriate error messages
> >>and also comes with `gnt-storage {diagnose, info}' similarly to
> >>`gnt-os {diagnose, info}'.
> >>There is only one difference compared to the way OS defs are
> >>handled:
> >>The ExtStorage diagnose code calculates the validity of each provider
> >>for each nodegroup in the cmdlib logic rather than in the client.
> >>This was marked as 'TODO'' inside cmdlib for OS diagnose.
> >>This gives you the flexibility to do neat things easily, such as running
> >>the LU from inside cluster verify and producing validity statuses
> >>each provider-nodegroup combination. So, presumably this can also
> >>be used inside `gnt-cluster verify' in the future.
> >Sounds very good, thanks!
> >>>>The external shared storage hardware
> >>>>+should also be accessible by all nodes of this nodegroup too.
> >>>>+
> >>>>+An “ExtStorage provider” will have to provide the following methods:
> >>>>+
> >>>>+- Create a disk
> >>>>+- Remove a disk
> >>>>+- Grow a disk
> >>>>+- Attach a disk to a given node
> >>>>+- Detach a disk from a given node
> >>>>+- Verify its supported parameters
> >>>>+
> >>>>+The proposed ExtStorage interface borrows heavily from the OS
> >>>>+interface and follows a one-script-per-function approach. An ExtStorage
> >>>>+provider is expected to provide the following scripts:
> >>>>+
> >>>>+- `create`
> >>>>+- `remove`
> >>>>+- `grow`
> >>>>+- `attach`
> >>>>+- `detach`
> >>>>+- `verify`
> >>>>+
> >>>>+All scripts will be called with no arguments and get their input via
> >>>>+environment variables. A common set of variables will be exported for
> >>>>+all commands, and some of them might have extra ones.
> >>>>+
> >>>>+- `VOL_NAME`: The name of the volume. This is unique for Ganeti and it
> >>>>+ uses it to refer to a specific volume inside the external storage.
> >>>>+- `VOL_SIZE`: The volume's size in mebibytes.
> >>>>+- `VOL_NEW_SIZE`: Available only to the `grow` script. It declares the
> >>>>+ new size of the volume after grow (in mebibytes).
> >>>>+- `EXTP_name`: ExtStorage parameter, where `name` is the parameter in
> >>>>+ upper-case (same as OS interface's `OSP_*` parameters).
> >>>>+
> >>>>+All scripts except `attach` should return 0 on success and non-zero on
> >>>>+error, accompanied by an appropriate error message on stderr. The
> >>>>+`attach` script should return a string on stdout on success, which is
> >>>>+the block device's full path, after it has been successfully attached to
> >>>>+the host node. On error it should return non-zero.
> >>>>+
> >>>>+Implementation
> >>>>+--------------
> >>>>+
> >>>>+To support the ExtStorage interface, we will introduce a new disk
> >>>>+template called `ext`. This template will implement the existing Ganeti
> >>>>+disk interface in `lib/bdev.py` (create, remove, attach, assemble,
> >>>>+shutdown, grow), and will simultaneously pass control to the external
> >>>>+scripts to actually handle the above actions. The `ext` disk template
> >>>>+will act as a translation layer between the current Ganeti disk
> >>>>+interface and the ExtStorage providers.
> >>>>+
> >>>>+We will also introduce a new IDISK_PARAM called `IDISK_PROVIDER =
> >>>>+provider`, which will be used at the command line to select the desired
> >>>>+ExtStorage provider. This parameter will be valid only for template
> >>>>+`ext` e.g.::
> >>>>+
> >>>>+ gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1
> >>>>+
> >>>>+The Extstorage interface will support different disks to be created by
> >>>>+different providers. e.g.::
> >>>>+
> >>>>+ gnt-instance add -t ext --disk=0:size=2G,provider=sample_provider1
> >>>>+ --disk=1:size=1G,provider=sample_provider2
> >>>>+ --disk=2:size=3G,provider=sample_provider1
> >>>This (also in the context of your other design changes) makes me a bit
> >>>uneasy, with regards to coordinating changes across multiple providers
> >>>in live migration and similar changes (even startup). Have you thought
> >>>about this?
> >>I'm not sure I can understand your point completely. Given the
> >>diagnose functionality described above, are you concerned providers
> >>are going to be in inconsistent state among nodes? Is it a matter
> >>of how the allocator decides the target node given different providers?
> >Ah no, see below.
> >>Can you expand on the "coordinating changes across multiple providers
> >>in live migration and similar changes (even startup)" part of your
> >>question? Perhaps with some examples?
> >I'll try :)
> OK. now its clear, thanks.
> >I have a _very slight_ worry on that handling "complex" instances will
> >become more tricky if the behaviour of different storage providers or
> >disk templates (this is in the context of the other designs) differs.
> >For example, let's say we have an instance with first disk DRBD, second
> >disk ext,provider=p1, third disk ext,provider=p2.
> >We know we can live migrate an instance across node groups for DRBD, and
> >we now we can migrate ext providers if they are available in both
> >groups. But combining all these checks across multiple disks is just 2%
> >more tricky: we need to move from "disk_template in
> >constants.DTS_MIRRORED" to something like "does all instance disks allow
> >migration/failover/move from (nodegroup A, nodes [a,b]) to (nodegroup B,
> >nodes [c,d])" (where A could be equal to B)?
> >This is doable, just means that a lot of decisions about the instance
> >behaviour (can be moved, can be live migrated, etc.) will move away from
> >the instance level (disk_template) and become an aggregate of the
> >instance's disk capabilities.
> Exactly! Wrt the ExtStorage patchset, we won't need to make changes
> in the decision making because everything still stays at instance level,
> even though we have different providers on different disks. All we have
> to do, is make sure all providers are present at the node/nodegroup we
> want to migrate/failover/move (I have tested live migrations of instances
> with let's say disk0 ext,provider=p1, disk1 ext,provider=p2 without
> changing anything in the current allocation logic).
> When we introduce Storage Pools and the ability to have different
> disks of an instance residing in different Storage Pools, then we will
> have to do exactly as you are saying (and is also written in the design
> doc).
Hah, sorry, I didn't read those designs except very briefly :)
> We should move the decision logic from operating at instance
> level, to operating at the aggregation of the instance's disks Storage
> Pools. At that point, we also don't have a problem with providers,
> because providers will be moved from an IDISK_PARAM
On Thu, Sep 27, 2012 at 05:43:56PM +0300, Constantinos Venetsanopoulos wrote:
> On 09/27/2012 04:17 PM, Iustin Pop wrote:
> >On Wed, Sep 26, 2012 at 05:38:18PM +0300, Constantinos Venetsanopoulos wrote:
> >>With this commit we introduce the External Storage Interface
> >>to Ganeti, abbreviated: ExtStorage Interface.
> >>The ExtStorage Interface provides Ganeti with the ability to interact
> >>with externally connected shared storage pools, visible by all
> >>VM-capable nodes. This means that Ganeti is able to handle VM disks
> >>that reside inside a NAS/SAN or any distributed block storage provider.
> >>The ExtStorage Interface provides a clear API, heavily inspired by the
> >>gnt-os-interface API, that can be used by storage vendors or sysadmins
> >>to write simple ExtStorage Providers (correlated to gnt-os-interface's
> >>OS Definitions). Those Providers will glue externally attached shared
> >>storage with Ganeti, without the need of preprovisioned block devices
> >>on Ganeti VM-capable nodes as confined be the current `blockdev' disk
> >>template.
> >>To do so, we implement a new disk template called `ext' (of type
> >>DTS_EXT_MIRROR) that passes control to externally provided scripts
> >>(the ExtStorage Provider) for the template's basic functions:
> >> create / attach / detach / remove / grow
> >>The scripts reside under ES_SEARCH_PATH (correlated to OS_SEARCH_PATH)
> >>and only one ExtStorage Provider is supported called `ext'.
> >>The disk's logical id is the tuple ('ext', UUID.ext.diskX), where UUID
> >>is generated as in disk template `plain' and X is the disk's index.
> >A few general comments, or rather questions. Some of the are rather
> >design level, but they become more apparent when reading the code only,
> >sorry.
> No problem.
> >I'm not sure creating one log file per operation is a good idea. While
> >installing/renaming an instance is a (somewhat) rare operation,
> >activating the disks of an instance is a much often one. I know that
> >right now in attach you don't create a log, but that seems to be the
> >intention (hence the TODO). Thoughts?
> Indeed, creating a log file during attach was the initial intention, however
> I understand your point since specifically attach is run multiple times
> even during a single instance addition. On the other hand, I think creation
> and removal of a disk should reside in different files because they do not
> happen so often. Furthermore, these log files are the only place where the
> admin actually can see what happens, when something goes wrong with
> the external hardware and are also very helpful when writing custom
> ExtStorage providers.
I'm not sure I understand this. Do you mean about debug/informational
messages? I believe that actual errors will simply propagate to ganeti
as failures in activation/other ops.
> What do you think? Another option would be to keep the log files for
> create/remove/grow and do not log the attach/detach. Or keep them
> for now to see the workflow and if they produce a lot of unnecessary
> info that we do not want to keep, merge all actions in a single log file
> for each disk (but this needs code refactoring).
Indeed.
Maybe a separate option would be, as you say, log only
create/remove/grow and recommend that attach/detach options log to
syslog? Or request that all logging goes to syslog?
> >Also, it seems by reading the code that Ganeti generates an UUID for the
> >volume, and it asks the provider to create one such drive; basically,
> >the provider will get "please create disk f0559a.0 of size 500G", right?
> That's right.
> >This means that only the ganeti configuration has the mapping instance
> >name to disks, and that losing the ganeti configuration would mean
> >instantly losing all instance data since we can't map it back to a VM.
> >For 'plain' disks, we explicitly tag the LV names with the name of the
> >instance, to safeguard against such problems. Do you think this is not a
> >problem?
> To be honest, I haven't thought about this case, but I understand
> your point.
> > Or could we add the instance name to the external provider
> >create call, such that providers could do such an association
> >themselves, if they care?
> That would be fine, but I can't see an easy way to do that, since the
> instance object doesn't find its way inside bdev, and it wouldn't be clean
> to add new parameters in the Create call. What we could do though,
> would be to prepend the volume name with the instance's name.
> Much like you do with 'plain' disks. Now the disk name has the following
> format:
> '04859fac4.ext.diskX' where X is the index.
> We could change that to 'instancename-04859fac4.ext.diskX' and
> document it appropriately, so that the mapping will be there for the
> providers to use it, if they care. Would that be OK?
Hmm, not really. The instance name can and will change, so having it as
part of the disk ID is not good.
But note that the instance "info" is already passed to BlockDevCreate,
and while not passed to the actual Create call, is passed to
BlockDevice.SetInfo()? Could we reuse that here too?
On Thu, Sep 27, 2012 at 05:51:05PM +0300, Constantinos Venetsanopoulos wrote:
> On 09/27/2012 04:26 PM, Iustin Pop wrote:
> >On Wed, Sep 26, 2012 at 05:38:20PM +0300, Constantinos Venetsanopoulos wrote:
> >>The option is added to allow us the passing of arbitrary ext-params
> >>during disk addition (gnt-instance modify --disk add: ...), same as
> >>we do during instance add.
> >>The option is needed because during gnt-instance modify parameters'
> >>validation, we do not know the type of the disk template
> >>(whereas in gnt-instance add we find it from the -t option).
> >>The option is called 'arbit' and not 'ext' params because it may be
> >>useful in other contexts too, in the future.
> >>The corresponding prereq checks for the existence of the 'provider'
> >>parameter in case of an 'ext' template are also taken care off in
> >>this commit. It serves as a general introduction of the 'ext'
> >>template in gnt-instance modify.
> Oooops :)
> I've been rebasing a hundred times (always resolving conflicts) the last
> two weeks to catch up with you guys so this must have slipped.
> Could you please remove it when applying?
Sure, no problem.
> >This is not a review of the patch, but just reading the commit message,
> >I find "--allow-arbit-params" too hard to read. What about
> >"--flexible-params"?
> I think "allow" should be in there. Maybe it becomes more clear when
> you see the code. For "arbit" I'm not sure either. In any case, we can
> change that to "--flexible-params" during the review if you think that
> fits best at the end.
Hmm, I read the code again and I am still undecided.
If I understood the patch correctly, the problem it tries to solve is
that during the _opcode_ validation, which is done outside of the LU
context, we don't have access to the disk template, hence no way to
check the parameters to the opcode. And you fix this by adding a
parameter which directs the code to use the extended disk template
specific code.
If this is correct, then I'm not sure it's the best solution. Since we
can't do actual validation at the opcode layer, then maybe we shouldn't;
what about delaying validation until we are in an LU context, and we
have access to the instance? Yes, the job will fail later, but that
would seem cleaner than "propagating" the disk template into the opcode
layer.
> On Thu, Sep 27, 2012 at 05:43:56PM +0300, Constantinos Venetsanopoulos wrote:
>> On 09/27/2012 04:17 PM, Iustin Pop wrote:
>>> On Wed, Sep 26, 2012 at 05:38:18PM +0300, Constantinos Venetsanopoulos wrote:
>>>> With this commit we introduce the External Storage Interface
>>>> to Ganeti, abbreviated: ExtStorage Interface.
>>>> The ExtStorage Interface provides Ganeti with the ability to interact
>>>> with externally connected shared storage pools, visible by all
>>>> VM-capable nodes. This means that Ganeti is able to handle VM disks
>>>> that reside inside a NAS/SAN or any distributed block storage provider.
>>>> The ExtStorage Interface provides a clear API, heavily inspired by the
>>>> gnt-os-interface API, that can be used by storage vendors or sysadmins
>>>> to write simple ExtStorage Providers (correlated to gnt-os-interface's
>>>> OS Definitions). Those Providers will glue externally attached shared
>>>> storage with Ganeti, without the need of preprovisioned block devices
>>>> on Ganeti VM-capable nodes as confined be the current `blockdev' disk
>>>> template.
>>>> To do so, we implement a new disk template called `ext' (of type
>>>> DTS_EXT_MIRROR) that passes control to externally provided scripts
>>>> (the ExtStorage Provider) for the template's basic functions:
>>>> create / attach / detach / remove / grow
>>>> The scripts reside under ES_SEARCH_PATH (correlated to OS_SEARCH_PATH)
>>>> and only one ExtStorage Provider is supported called `ext'.
>>>> The disk's logical id is the tuple ('ext', UUID.ext.diskX), where UUID
>>>> is generated as in disk template `plain' and X is the disk's index.
>>> A few general comments, or rather questions. Some of the are rather
>>> design level, but they become more apparent when reading the code only,
>>> sorry.
>> No problem.
>>> I'm not sure creating one log file per operation is a good idea. While
>>> installing/renaming an instance is a (somewhat) rare operation,
>>> activating the disks of an instance is a much often one. I know that
>>> right now in attach you don't create a log, but that seems to be the
>>> intention (hence the TODO). Thoughts?
>> Indeed, creating a log file during attach was the initial intention, however
>> I understand your point since specifically attach is run multiple times
>> even during a single instance addition. On the other hand, I think creation
>> and removal of a disk should reside in different files because they do not
>> happen so often. Furthermore, these log files are the only place where the
>> admin actually can see what happens, when something goes wrong with
>> the external hardware and are also very helpful when writing custom
>> ExtStorage providers.
> I'm not sure I understand this. Do you mean about debug/informational
> messages? I believe that actual errors will simply propagate to ganeti
> as failures in activation/other ops.
Actual errors do propagate to ganeti. However the log doesn't
(except the "n" last lines of the logfile). The user has to go back to
the log file to see what happened exactly. From my experience writing
ExtStorage providers, it really helps being able to look at the attach
script's stderr.
>> What do you think? Another option would be to keep the log files for
>> create/remove/grow and do not log the attach/detach. Or keep them
>> for now to see the workflow and if they produce a lot of unnecessary
>> info that we do not want to keep, merge all actions in a single log file
>> for each disk (but this needs code refactoring).
> Indeed.
> Maybe a separate option would be, as you say, log only
> create/remove/grow and recommend that attach/detach options log to
> syslog? Or request that all logging goes to syslog?
After more consideration, I think that not logging attach/detach ops is
not the right thing to do.
Currently no logging for attach happens, due to the issue with RunResult().
When this gets fixed, I think it would be best to log everything related
to a single disk in the same log file, so we don't miss anything, keep
related things together, and do not create too many small files. As an
example the file would contain a series of
[create,attach,attach,attach,detach,attach,...,grow,...,detach,remove] ops.
If you are OK with this approach, I will contribute the relevant patches as
soon as the code gets merged and the issue in the TODO comment gets fixed.
>>> Also, it seems by reading the code that Ganeti generates an UUID for the
>>> volume, and it asks the provider to create one such drive; basically,
>>> the provider will get "please create disk f0559a.0 of size 500G", right?
>> That's right.
>>> This means that only the ganeti configuration has the mapping instance
>>> name to disks, and that losing the ganeti configuration would mean
>>> instantly losing all instance data since we can't map it back to a VM.
>>> For 'plain' disks, we explicitly tag the LV names with the name of the
>>> instance, to safeguard against such problems. Do you think this is not a
>>> problem?
>> To be honest, I haven't thought about this case, but I understand
>> your point.
>>> Or could we add the instance name to the external provider
>>> create call, such that providers could do such an association
>>> themselves, if they care?
>> That would be fine, but I can't see an easy way to do that, since the
>> instance object doesn't find its way inside bdev, and it wouldn't be clean
>> to add new parameters in the Create call. What we could do though,
>> would be to prepend the volume name with the instance's name.
>> Much like you do with 'plain' disks. Now the disk name has the following
>> format:
>> '04859fac4.ext.diskX' where X is the index.
>> We could change that to 'instancename-04859fac4.ext.diskX' and
>> document it appropriately, so that the mapping will be there for the
>> providers to use it, if they care. Would that be OK?
> Hmm, not really. The instance name can and will change, so having it as
> part of the disk ID is not good.
> But note that the instance "info" is already passed to BlockDevCreate,
> and while not passed to the actual Create call, is passed to
> BlockDevice.SetInfo()? Could we reuse that here too?
If I understand correctly, you prefer disk names to be independent of
the instance name, then go and tag the disk object itself, e.g., the lv in
the case of lvm.
A similar approach would be for me to implement the
ExtStorageDevice.SetInfo() call which runs an external "setinfo" script.
The setinfo script would mark the disk with the instance name or other
metadata in a provider-specific way. For example the volumes of a NAS
would be marked with vendor specific metadata and the admin would be
able to recover instance data by examining these.
However, even if this happens, it seems I can only find code setting lv
tags during blockdev creation and not instance rename. So, how do
lv tags stay consistent after renames? Thus, even we follow the approach
you propose, the admin will still have problems tracking down instance
disks after loosing the ganeti config. Thoughts?
On Tue, Oct 09, 2012 at 06:59:55PM +0300, Constantinos Venetsanopoulos wrote:
> On 10/08/2012 04:55 PM, Iustin Pop wrote:
> >On Thu, Sep 27, 2012 at 05:43:56PM +0300, Constantinos Venetsanopoulos wrote:
> >>On 09/27/2012 04:17 PM, Iustin Pop wrote:
> >>>On Wed, Sep 26, 2012 at 05:38:18PM +0300, Constantinos Venetsanopoulos wrote:
> >>>>With this commit we introduce the External Storage Interface
> >>>>to Ganeti, abbreviated: ExtStorage Interface.
> >>>>The ExtStorage Interface provides Ganeti with the ability to interact
> >>>>with externally connected shared storage pools, visible by all
> >>>>VM-capable nodes. This means that Ganeti is able to handle VM disks
> >>>>that reside inside a NAS/SAN or any distributed block storage provider.
> >>>>The ExtStorage Interface provides a clear API, heavily inspired by the
> >>>>gnt-os-interface API, that can be used by storage vendors or sysadmins
> >>>>to write simple ExtStorage Providers (correlated to gnt-os-interface's
> >>>>OS Definitions). Those Providers will glue externally attached shared
> >>>>storage with Ganeti, without the need of preprovisioned block devices
> >>>>on Ganeti VM-capable nodes as confined be the current `blockdev' disk
> >>>>template.
> >>>>To do so, we implement a new disk template called `ext' (of type
> >>>>DTS_EXT_MIRROR) that passes control to externally provided scripts
> >>>>(the ExtStorage Provider) for the template's basic functions:
> >>>> create / attach / detach / remove / grow
> >>>>The scripts reside under ES_SEARCH_PATH (correlated to OS_SEARCH_PATH)
> >>>>and only one ExtStorage Provider is supported called `ext'.
> >>>>The disk's logical id is the tuple ('ext', UUID.ext.diskX), where UUID
> >>>>is generated as in disk template `plain' and X is the disk's index.
> >>>A few general comments, or rather questions. Some of the are rather
> >>>design level, but they become more apparent when reading the code only,
> >>>sorry.
> >>No problem.
> >>>I'm not sure creating one log file per operation is a good idea. While
> >>>installing/renaming an instance is a (somewhat) rare operation,
> >>>activating the disks of an instance is a much often one. I know that
> >>>right now in attach you don't create a log, but that seems to be the
> >>>intention (hence the TODO). Thoughts?
> >>Indeed, creating a log file during attach was the initial intention, however
> >>I understand your point since specifically attach is run multiple times
> >>even during a single instance addition. On the other hand, I think creation
> >>and removal of a disk should reside in different files because they do not
> >>happen so often. Furthermore, these log files are the only place where the
> >>admin actually can see what happens, when something goes wrong with
> >>the external hardware and are also very helpful when writing custom
> >>ExtStorage providers.
> >I'm not sure I understand this. Do you mean about debug/informational
> >messages? I believe that actual errors will simply propagate to ganeti
> >as failures in activation/other ops.
> Actual errors do propagate to ganeti. However the log doesn't
> (except the "n" last lines of the logfile). The user has to go back to
> the log file to see what happened exactly. From my experience writing
> ExtStorage providers, it really helps being able to look at the attach
> script's stderr.
> >>What do you think? Another option would be to keep the log files for
> >>create/remove/grow and do not log the attach/detach. Or keep them
> >>for now to see the workflow and if they produce a lot of unnecessary
> >>info that we do not want to keep, merge all actions in a single log file
> >>for each disk (but this needs code refactoring).
> >Indeed.
> >Maybe a separate option would be, as you say, log only
> >create/remove/grow and recommend that attach/detach options log to
> >syslog? Or request that all logging goes to syslog?
> After more consideration, I think that not logging attach/detach ops is
> not the right thing to do.
> Currently no logging for attach happens, due to the issue with RunResult().
> When this gets fixed, I think it would be best to log everything related
> to a single disk in the same log file, so we don't miss anything, keep
> related things together, and do not create too many small files. As an
> example the file would contain a series of
> [create,attach,attach,attach,detach,attach,...,grow,...,detach,remove] ops.
> If you are OK with this approach, I will contribute the relevant patches as
> soon as the code gets merged and the issue in the TODO comment gets fixed.
Sounds good. Thanks for taking the time to think this through!
> >>>Also, it seems by reading the code that Ganeti generates an UUID for the
> >>>volume, and it asks the provider to create one such drive; basically,
> >>>the provider will get "please create disk f0559a.0 of size 500G", right?
> >>That's right.
> >>>This means that only the ganeti configuration has the mapping instance
> >>>name to disks, and that losing the ganeti configuration would mean
> >>>instantly losing all instance data since we can't map it back to a VM.
> >>>For 'plain' disks, we explicitly tag the LV names with the name of the
> >>>instance, to safeguard against such problems. Do you think this is not a
> >>>problem?
> >>To be honest, I haven't thought about this case, but I understand
> >>your point.
> >>> Or could we add the instance name to the external provider
> >>>create call, such that providers could do such an association
> >>>themselves, if they care?
> >>That would be fine, but I can't see an easy way to do that, since the
> >>instance object doesn't find its way inside bdev, and it wouldn't be clean
> >>to add new parameters in the Create call. What we could do though,
> >>would be to prepend the volume name with the instance's name.
> >>Much like you do with 'plain' disks. Now the disk name has the following
> >>format:
> >>'04859fac4.ext.diskX' where X is the index.
> >>We could change that to 'instancename-04859fac4.ext.diskX' and
> >>document it appropriately, so that the mapping will be there for the
> >>providers to use it, if they care. Would that be OK?
> >Hmm, not really. The instance name can and will change, so having it as
> >part of the disk ID is not good.
> >But note that the instance "info" is already passed to BlockDevCreate,
> >and while not passed to the actual Create call, is passed to
> >BlockDevice.SetInfo()? Could we reuse that here too?
> If I understand correctly, you prefer disk names to be independent of
> the instance name, then go and tag the disk object itself, e.g., the lv in
> the case of lvm.
Yes.
> A similar approach would be for me to implement the
> ExtStorageDevice.SetInfo() call which runs an external "setinfo" script.
> The setinfo script would mark the disk with the instance name or other
> metadata in a provider-specific way. For example the volumes of a NAS
> would be marked with vendor specific metadata and the admin would be
> able to recover instance data by examining these.
Yes, exactly.
> However, even if this happens, it seems I can only find code setting lv
> tags during blockdev creation and not instance rename. So, how do
> lv tags stay consistent after renames? Thus, even we follow the approach
> you propose, the admin will still have problems tracking down instance
> disks after loosing the ganeti config. Thoughts?
That is actually a current bug (only internally raised so far). We do
plan to fix the rename LU to include this, by exposing SetInfo via an
rpc call.
So yes, let's take this approach, and fix the rename issue separately.
> On Tue, Oct 09, 2012 at 06:59:55PM +0300, Constantinos Venetsanopoulos wrote:
>> On 10/08/2012 04:55 PM, Iustin Pop wrote:
>>> On Thu, Sep 27, 2012 at 05:43:56PM +0300, Constantinos Venetsanopoulos wrote:
>>>> On 09/27/2012 04:17 PM, Iustin Pop wrote:
>>>>> On Wed, Sep 26, 2012 at 05:38:18PM +0300, Constantinos Venetsanopoulos wrote:
>>>>>> With this commit we introduce the External Storage Interface
>>>>>> to Ganeti, abbreviated: ExtStorage Interface.
>>>>>> The ExtStorage Interface provides Ganeti with the ability to interact
>>>>>> with externally connected shared storage pools, visible by all
>>>>>> VM-capable nodes. This means that Ganeti is able to handle VM disks
>>>>>> that reside inside a NAS/SAN or any distributed block storage provider.
>>>>>> The ExtStorage Interface provides a clear API, heavily inspired by the
>>>>>> gnt-os-interface API, that can be used by storage vendors or sysadmins
>>>>>> to write simple ExtStorage Providers (correlated to gnt-os-interface's
>>>>>> OS Definitions). Those Providers will glue externally attached shared
>>>>>> storage with Ganeti, without the need of preprovisioned block devices
>>>>>> on Ganeti VM-capable nodes as confined be the current `blockdev' disk
>>>>>> template.
>>>>>> To do so, we implement a new disk template called `ext' (of type
>>>>>> DTS_EXT_MIRROR) that passes control to externally provided scripts
>>>>>> (the ExtStorage Provider) for the template's basic functions:
>>>>>> create / attach / detach / remove / grow
>>>>>> The scripts reside under ES_SEARCH_PATH (correlated to OS_SEARCH_PATH)
>>>>>> and only one ExtStorage Provider is supported called `ext'.
>>>>>> The disk's logical id is the tuple ('ext', UUID.ext.diskX), where UUID
>>>>>> is generated as in disk template `plain' and X is the disk's index.
>>>>> A few general comments, or rather questions. Some of the are rather
>>>>> design level, but they become more apparent when reading the code only,
>>>>> sorry.
>>>> No problem.
>>>>> I'm not sure creating one log file per operation is a good idea. While
>>>>> installing/renaming an instance is a (somewhat) rare operation,
>>>>> activating the disks of an instance is a much often one. I know that
>>>>> right now in attach you don't create a log, but that seems to be the
>>>>> intention (hence the TODO). Thoughts?
>>>> Indeed, creating a log file during attach was the initial intention, however
>>>> I understand your point since specifically attach is run multiple times
>>>> even during a single instance addition. On the other hand, I think creation
>>>> and removal of a disk should reside in different files because they do not
>>>> happen so often. Furthermore, these log files are the only place where the
>>>> admin actually can see what happens, when something goes wrong with
>>>> the external hardware and are also very helpful when writing custom
>>>> ExtStorage providers.
>>> I'm not sure I understand this. Do you mean about debug/informational
>>> messages? I believe that actual errors will simply propagate to ganeti
>>> as failures in activation/other ops.
>> Actual errors do propagate to ganeti. However the log doesn't
>> (except the "n" last lines of the logfile). The user has to go back to
>> the log file to see what happened exactly. From my experience writing
>> ExtStorage providers, it really helps being able to look at the attach
>> script's stderr.
> I see.
>>>> What do you think? Another option would be to keep the log files for
>>>> create/remove/grow and do not log the attach/detach. Or keep them
>>>> for now to see the workflow and if they produce a lot of unnecessary
>>>> info that we do not want to keep, merge all actions in a single log file
>>>> for each disk (but this needs code refactoring).
>>> Indeed.
>>> Maybe a separate option would be, as you say, log only
>>> create/remove/grow and recommend that attach/detach options log to
>>> syslog? Or request that all logging goes to syslog?
>> After more consideration, I think that not logging attach/detach ops is
>> not the right thing to do.
>> Currently no logging for attach happens, due to the issue with RunResult().
>> When this gets fixed, I think it would be best to log everything related
>> to a single disk in the same log file, so we don't miss anything, keep
>> related things together, and do not create too many small files. As an
>> example the file would contain a series of
>> [create,attach,attach,attach,detach,attach,...,grow,...,detach,remove] ops.
>> If you are OK with this approach, I will contribute the relevant patches as
>> soon as the code gets merged and the issue in the TODO comment gets fixed.
> Sounds good. Thanks for taking the time to think this through!
>>>>> Also, it seems by reading the code that Ganeti generates an UUID for the
>>>>> volume, and it asks the provider to create one such drive; basically,
>>>>> the provider will get "please create disk f0559a.0 of size 500G", right?
>>>> That's right.
>>>>> This means that only the ganeti configuration has the mapping instance
>>>>> name to disks, and that losing the ganeti configuration would mean
>>>>> instantly losing all instance data since we can't map it back to a VM.
>>>>> For 'plain' disks, we explicitly tag the LV names with the name of the
>>>>> instance, to safeguard against such problems. Do you think this is not a
>>>>> problem?
>>>> To be honest, I haven't thought about this case, but I understand
>>>> your point.
>>>>> Or could we add the instance name to the external provider
>>>>> create call, such that providers could do such an association
>>>>> themselves, if they care?
>>>> That would be fine, but I can't see an easy way to do that, since the
>>>> instance object doesn't find its way inside bdev, and it wouldn't be clean
>>>> to add new parameters in the Create call. What we could do though,
>>>> would be to prepend the volume name with the instance's name.
>>>> Much like you do with 'plain' disks. Now the disk name has the following
>>>> format:
>>>> '04859fac4.ext.diskX' where X is the index.
>>>> We could change that to 'instancename-04859fac4.ext.diskX' and
>>>> document it appropriately, so that the mapping will be there for the
>>>> providers to use it, if they care. Would that be OK?
>>> Hmm, not really. The instance name can and will change, so having it as
>>> part of the disk ID is not good.
>>> But note that the instance "info" is already passed to BlockDevCreate,
>>> and while not passed to the actual Create call, is passed to
>>> BlockDevice.SetInfo()? Could we reuse that here too?
>> If I understand correctly, you prefer disk names to be independent of
>> the instance name, then go and tag the disk object itself, e.g., the lv in
>> the case of lvm.
> Yes.
>> A similar approach would be for me to implement the
>> ExtStorageDevice.SetInfo() call which runs an external "setinfo" script.
>> The setinfo script would mark the disk with the instance name or other
>> metadata in a provider-specific way. For example the volumes of a NAS
>> would be marked with vendor specific metadata and the admin would be
>> able to recover instance data by examining these.
> Yes, exactly.
>> However, even if this happens, it seems I can only find code setting lv
>> tags during blockdev creation and not instance rename. So, how do
>> lv tags stay consistent after renames? Thus, even we follow the approach
>> you propose, the admin will still have problems tracking down instance
>> disks after loosing the ganeti config. Thoughts?
> That is actually a current bug (only internally raised so far). We do
> plan to fix the rename LU to include this, by exposing SetInfo via an
> rpc call.
> So yes, let's take this approach, and fix the rename issue separately.
OK. So, to sum up, I'll implement the SetInfo functionality, fix all the
small issues we discussed in previous emails and send an interdiff.
When we fix the RunResult TODO and the code gets merged, I'll send
a patch that will merge all log files into a single one for each disk.
Finally, after the merge, I will also send a separate patch that
documents the SetInfo functionality in the design doc and man pages.
On Wed, Oct 10, 2012 at 12:07:31PM +0300, Constantinos Venetsanopoulos wrote:
> On 10/09/2012 07:10 PM, Iustin Pop wrote:
> >On Tue, Oct 09, 2012 at 06:59:55PM +0300, Constantinos Venetsanopoulos wrote:
> >>On 10/08/2012 04:55 PM, Iustin Pop wrote:
> >>>On Thu, Sep 27, 2012 at 05:43:56PM +0300, Constantinos Venetsanopoulos wrote:
> >>>>On 09/27/2012 04:17 PM, Iustin Pop wrote:
> >>>>>On Wed, Sep 26, 2012 at 05:38:18PM +0300, Constantinos Venetsanopoulos wrote:
> >>>>>>With this commit we introduce the External Storage Interface
> >>>>>>to Ganeti, abbreviated: ExtStorage Interface.
> >>>>>>The ExtStorage Interface provides Ganeti with the ability to interact
> >>>>>>with externally connected shared storage pools, visible by all
> >>>>>>VM-capable nodes. This means that Ganeti is able to handle VM disks
> >>>>>>that reside inside a NAS/SAN or any distributed block storage provider.
> >>>>>>The ExtStorage Interface provides a clear API, heavily inspired by the
> >>>>>>gnt-os-interface API, that can be used by storage vendors or sysadmins
> >>>>>>to write simple ExtStorage Providers (correlated to gnt-os-interface's
> >>>>>>OS Definitions). Those Providers will glue externally attached shared
> >>>>>>storage with Ganeti, without the need of preprovisioned block devices
> >>>>>>on Ganeti VM-capable nodes as confined be the current `blockdev' disk
> >>>>>>template.
> >>>>>>To do so, we implement a new disk template called `ext' (of type
> >>>>>>DTS_EXT_MIRROR) that passes control to externally provided scripts
> >>>>>>(the ExtStorage Provider) for the template's basic functions:
> >>>>>> create / attach / detach / remove / grow
> >>>>>>The scripts reside under ES_SEARCH_PATH (correlated to OS_SEARCH_PATH)
> >>>>>>and only one ExtStorage Provider is supported called `ext'.
> >>>>>>The disk's logical id is the tuple ('ext', UUID.ext.diskX), where UUID
> >>>>>>is generated as in disk template `plain' and X is the disk's index.
> >>>>>A few general comments, or rather questions. Some of the are rather
> >>>>>design level, but they become more apparent when reading the code only,
> >>>>>sorry.
> >>>>No problem.
> >>>>>I'm not sure creating one log file per operation is a good idea. While
> >>>>>installing/renaming an instance is a (somewhat) rare operation,
> >>>>>activating the disks of an instance is a much often one. I know that
> >>>>>right now in attach you don't create a log, but that seems to be the
> >>>>>intention (hence the TODO). Thoughts?
> >>>>Indeed, creating a log file during attach was the initial intention, however
> >>>>I understand your point since specifically attach is run multiple times
> >>>>even during a single instance addition. On the other hand, I think creation
> >>>>and removal of a disk should reside in different files because they do not
> >>>>happen so often. Furthermore, these log files are the only place where the
> >>>>admin actually can see what happens, when something goes wrong with
> >>>>the external hardware and are also very helpful when writing custom
> >>>>ExtStorage providers.
> >>>I'm not sure I understand this. Do you mean about debug/informational
> >>>messages? I believe that actual errors will simply propagate to ganeti
> >>>as failures in activation/other ops.
> >>Actual errors do propagate to ganeti. However the log doesn't
> >>(except the "n" last lines of the logfile). The user has to go back to
> >>the log file to see what happened exactly. From my experience writing
> >>ExtStorage providers, it really helps being able to look at the attach
> >>script's stderr.
> >I see.
> >>>>What do you think? Another option would be to keep the log files for
> >>>>create/remove/grow and do not log the attach/detach. Or keep them
> >>>>for now to see the workflow and if they produce a lot of unnecessary
> >>>>info that we do not want to keep, merge all actions in a single log file
> >>>>for each disk (but this needs code refactoring).
> >>>Indeed.
> >>>Maybe a separate option would be, as you say, log only
> >>>create/remove/grow and recommend that attach/detach options log to
> >>>syslog? Or request that all logging goes to syslog?
> >>After more consideration, I think that not logging attach/detach ops is
> >>not the right thing to do.
> >>Currently no logging for attach happens, due to the issue with RunResult().
> >>When this gets fixed, I think it would be best to log everything related
> >>to a single disk in the same log file, so we don't miss anything, keep
> >>related things together, and do not create too many small files. As an
> >>example the file would contain a series of
> >>[create,attach,attach,attach,detach,attach,...,grow,...,detach,remove] ops.
> >>If you are OK with this approach, I will contribute the relevant patches as
> >>soon as the code gets merged and the issue in the TODO comment gets fixed.
> >Sounds good. Thanks for taking the time to think this through!
> >>>>>Also, it seems by reading the code that Ganeti generates an UUID for the
> >>>>>volume, and it asks the provider to create one such drive; basically,
> >>>>>the provider will get "please create disk f0559a.0 of size 500G", right?
> >>>>That's right.
> >>>>>This means that only the ganeti configuration has the mapping instance
> >>>>>name to disks, and that losing the ganeti configuration would mean
> >>>>>instantly losing all instance data since we can't map it back to a VM.
> >>>>>For 'plain' disks, we explicitly tag the LV names with the name of the
> >>>>>instance, to safeguard against such problems. Do you think this is not a
> >>>>>problem?
> >>>>To be honest, I haven't thought about this case, but I understand
> >>>>your point.
> >>>>> Or could we add the instance name to the external provider
> >>>>>create call, such that providers could do such an association
> >>>>>themselves, if they care?
> >>>>That would be fine, but I can't see an easy way to do that, since the
> >>>>instance object doesn't find its way inside bdev, and it wouldn't be clean
> >>>>to add new parameters in the Create call. What we could do though,
> >>>>would be to prepend the volume name with the instance's name.
> >>>>Much like you do with 'plain' disks. Now the disk name has the following
> >>>>format:
> >>>>'04859fac4.ext.diskX' where X is the index.
> >>>>We could change that to 'instancename-04859fac4.ext.diskX' and
> >>>>document it appropriately, so that the mapping will be there for the
> >>>>providers to use it, if they care. Would that be OK?
> >>>Hmm, not really. The instance name can and will change, so having it as
> >>>part of the disk ID is not good.
> >>>But note that the instance "info" is already passed to BlockDevCreate,
> >>>and while not passed to the actual Create call, is passed to
> >>>BlockDevice.SetInfo()? Could we reuse that here too?
> >>If I understand correctly, you prefer disk names to be independent of
> >>the instance name, then go and tag the disk object itself, e.g., the lv in
> >>the case of lvm.
> >Yes.
> >>A similar approach would be for me to implement the
> >>ExtStorageDevice.SetInfo() call which runs an external "setinfo" script.
> >>The setinfo script would mark the disk with the instance name or other
> >>metadata in a provider-specific way. For example the volumes of a NAS
> >>would be marked with vendor specific metadata and the admin would be
> >>able to recover instance data by examining these.
> >Yes, exactly.
> >>However, even if this happens, it seems I can only find code setting lv
> >>tags during blockdev creation and not instance rename. So, how do
> >>lv tags stay consistent after renames? Thus, even we follow the approach
> >>you propose, the admin will still have problems tracking down instance
> >>disks after loosing the ganeti config. Thoughts?
> >That is actually a current bug (only internally raised so far). We do
> >plan to fix the rename LU to include this, by exposing SetInfo via an
> >rpc call.
> >So yes, let's take this approach, and fix the rename issue separately.
> OK. So, to sum up, I'll implement the SetInfo functionality, fix all the
> small issues we discussed in previous emails and send an interdiff.
> When we fix the RunResult TODO and the code gets merged, I'll send
> a patch that will merge all log files into a single one for each disk.
> Finally, after the merge, I will also send a separate patch that
> documents the SetInfo functionality in the design doc and man pages.
+ def SetInfo(self, text):
+ """Update metadata with info text.
-def _ExtStorageAction(action, unique_id, size=None, grow=None):
+ """
+ # Replace invalid characters
+ text = re.sub("^[^A-Za-z0-9_+.]", "_", text)
+ text = re.sub("[^-A-Za-z0-9_+.]", "_", text)
+
+ # Only up to 128 characters are allowed
+ text = text[:128]
+
+ # Call the External Storage's setinfo script,
+ # to set metadata for an existing Volume inside the External Storage
+ _ExtStorageAction(constants.ES_ACTION_SETINFO, self.unique_id,
+ metadata=text)
+
+
+def _ExtStorageAction(action, unique_id, size=None, grow=None, metadata=None):
"""Take an External Storage action.
Take an External Storage action concerning or affecting
@@ -2814,6 +2830,8 @@ def _ExtStorageAction(action, unique_id, size=None, grow=None):
@param size: the size of the Volume in mebibytes
@type grow: integer
@param grow: the new size in mebibytes (after grow)
+ @type metadata: string
+ @param metadata: metadata info of the Volume, for use by the provider
@rtype: None or a block device path (during attach)
# Create the basic environment for the driver's scripts
- create_env = _ExtStorageEnvironment(unique_id, size, grow)
+ create_env = _ExtStorageEnvironment(unique_id, size, grow, metadata)
# Do not use log file for action `attach' as we need
# to get the output from RunResult
@@ -2834,6 +2852,12 @@ def _ExtStorageAction(action, unique_id, size=None, grow=None):
if action is not constants.ES_ACTION_ATTACH:
logfile = _VolumeLogName(action, driver, vol_name)
+ # Make sure the given action results in a valid script
+ target_script = action
+ if target_script not in constants.ES_SCRIPTS:
+ _ThrowError("Action '%s' doesn't result in a valid ExtStorage script" %
+ action)
+
# Find out which external script to run according the given action
script_name = action + "_script"
script = getattr(inst_es, script_name)
@@ -2852,12 +2876,12 @@ def _ExtStorageAction(action, unique_id, size=None, grow=None):
if action is not constants.ES_ACTION_ATTACH:
lines = [utils.SafeEncode(val)
for val in utils.TailFile(logfile, lines=20)]
- _ThrowError("External storage's %s script failed (%s), last"
- " lines in the log file:\n%s",
- action, result.fail_reason, "\n".join(lines))
else:
- _ThrowError("External storage's %s script failed (%s)",
- action, result.fail_reason)
+ lines = result.output[-20:]
+
+ _ThrowError("External storage's %s script failed (%s), last"
+ " lines of output:\n%s",
+ action, result.fail_reason, "\n".join(lines))
-def _ExtStorageEnvironment(unique_id, size=None, grow=None):
+def _ExtStorageEnvironment(unique_id, size=None, grow=None, metadata=None):
"""Calculate the environment for an External Storage script.
@type unique_id: tuple (driver, vol_name)
@param unique_id: ExtStorage pool and name of the Volume
- @type size: integer
- @param size: size of the Volume in mebibytes
+ @type size: string
+ @param size: size of the Volume (in mebibytes)
+ @type grow: string
+ @param grow: new size of Volume after grow (in mebibytes)
+ @type metadata: string
+ @param metadata: metadata info of the Volume
@rtype: dict
@return: dict of environment variables
> On Wed, Oct 10, 2012 at 12:07:31PM +0300, Constantinos Venetsanopoulos wrote:
>> On 10/09/2012 07:10 PM, Iustin Pop wrote:
>>> On Tue, Oct 09, 2012 at 06:59:55PM +0300, Constantinos Venetsanopoulos wrote:
>>>> On 10/08/2012 04:55 PM, Iustin Pop wrote:
>>>>> On Thu, Sep 27, 2012 at 05:43:56PM +0300, Constantinos Venetsanopoulos wrote:
>>>>>> On 09/27/2012 04:17 PM, Iustin Pop wrote:
>>>>>>> On Wed, Sep 26, 2012 at 05:38:18PM +0300, Constantinos Venetsanopoulos wrote:
>>>>>>>> With this commit we introduce the External Storage Interface
>>>>>>>> to Ganeti, abbreviated: ExtStorage Interface.
>>>>>>>> The ExtStorage Interface provides Ganeti with the ability to interact
>>>>>>>> with externally connected shared storage pools, visible by all
>>>>>>>> VM-capable nodes. This means that Ganeti is able to handle VM disks
>>>>>>>> that reside inside a NAS/SAN or any distributed block storage provider.
>>>>>>>> The ExtStorage Interface provides a clear API, heavily inspired by the
>>>>>>>> gnt-os-interface API, that can be used by storage vendors or sysadmins
>>>>>>>> to write simple ExtStorage Providers (correlated to gnt-os-interface's
>>>>>>>> OS Definitions). Those Providers will glue externally attached shared
>>>>>>>> storage with Ganeti, without the need of preprovisioned block devices
>>>>>>>> on Ganeti VM-capable nodes as confined be the current `blockdev' disk
>>>>>>>> template.
>>>>>>>> To do so, we implement a new disk template called `ext' (of type
>>>>>>>> DTS_EXT_MIRROR) that passes control to externally provided scripts
>>>>>>>> (the ExtStorage Provider) for the template's basic functions:
>>>>>>>> create / attach / detach / remove / grow
>>>>>>>> The scripts reside under ES_SEARCH_PATH (correlated to OS_SEARCH_PATH)
>>>>>>>> and only one ExtStorage Provider is supported called `ext'.
>>>>>>>> The disk's logical id is the tuple ('ext', UUID.ext.diskX), where UUID
>>>>>>>> is generated as in disk template `plain' and X is the disk's index.
>>>>>>> A few general comments, or rather questions. Some of the are rather
>>>>>>> design level, but they become more apparent when reading the code only,
>>>>>>> sorry.
>>>>>> No problem.
>>>>>>> I'm not sure creating one log file per operation is a good idea. While
>>>>>>> installing/renaming an instance is a (somewhat) rare operation,
>>>>>>> activating the disks of an instance is a much often one. I know that
>>>>>>> right now in attach you don't create a log, but that seems to be the
>>>>>>> intention (hence the TODO). Thoughts?
>>>>>> Indeed, creating a log file during attach was the initial intention, however
>>>>>> I understand your point since specifically attach is run multiple times
>>>>>> even during a single instance addition. On the other hand, I think creation
>>>>>> and removal of a disk should reside in different files because they do not
>>>>>> happen so often. Furthermore, these log files are the only place where the
>>>>>> admin actually can see what happens, when something goes wrong with
>>>>>> the external hardware and are also very helpful when writing custom
>>>>>> ExtStorage providers.
>>>>> I'm not sure I understand this. Do you mean about debug/informational
>>>>> messages? I believe that actual errors will simply propagate to ganeti
>>>>> as failures in activation/other ops.
>>>> Actual errors do propagate to ganeti. However the log doesn't
>>>> (except the "n" last lines of the logfile). The user has to go back to
>>>> the log file to see what happened exactly. From my experience writing
>>>> ExtStorage providers, it really helps being able to look at the attach
>>>> script's stderr.
>>> I see.
>>>>>> What do you think? Another option would be to keep the log files for
>>>>>> create/remove/grow and do not log the attach/detach. Or keep them
>>>>>> for now to see the workflow and if they produce a lot of unnecessary
>>>>>> info that we do not want to keep, merge all actions in a single log file
>>>>>> for each disk (but this needs code refactoring).
>>>>> Indeed.
>>>>> Maybe a separate option would be, as you say, log only
>>>>> create/remove/grow and recommend that attach/detach options log to
>>>>> syslog? Or request that all logging goes to syslog?
>>>> After more consideration, I think that not logging attach/detach ops is
>>>> not the right thing to do.
>>>> Currently no logging for attach happens, due to the issue with RunResult().
>>>> When this gets fixed, I think it would be best to log everything related
>>>> to a single disk in the same log file, so we don't miss anything, keep
>>>> related things together, and do not create too many small files. As an
>>>> example the file would contain a series of
>>>> [create,attach,attach,attach,detach,attach,...,grow,...,detach,remove] ops.
>>>> If you are OK with this approach, I will contribute the relevant patches as
>>>> soon as the code gets merged and the issue in the TODO comment gets fixed.
>>> Sounds good. Thanks for taking the time to think this through!
>>>>>>> Also, it seems by reading the code that Ganeti generates an UUID for the
>>>>>>> volume, and it asks the provider to create one such drive; basically,
>>>>>>> the provider will get "please create disk f0559a.0 of size 500G", right?
>>>>>> That's right.
>>>>>>> This means that only the ganeti configuration has the mapping instance
>>>>>>> name to disks, and that losing the ganeti configuration would mean
>>>>>>> instantly losing all instance data since we can't map it back to a VM.
>>>>>>> For 'plain' disks, we explicitly tag the LV names with the name of the
>>>>>>> instance, to safeguard against such problems. Do you think this is not a
>>>>>>> problem?
>>>>>> To be honest, I haven't thought about this case, but I understand
>>>>>> your point.
>>>>>>> Or could we add the instance name to the external provider
>>>>>>> create call, such that providers could do such an association
>>>>>>> themselves, if they care?
>>>>>> That would be fine, but I can't see an easy way to do that, since the
>>>>>> instance object doesn't find its way inside bdev, and it wouldn't be clean
>>>>>> to add new parameters in the Create call. What we could do though,
>>>>>> would be to prepend the volume name with the instance's name.
>>>>>> Much like you do with 'plain' disks. Now the disk name has the following
>>>>>> format:
>>>>>> '04859fac4.ext.diskX' where X is the index.
>>>>>> We could change that to 'instancename-04859fac4.ext.diskX' and
>>>>>> document it appropriately, so that the mapping will be there for the
>>>>>> providers to use it, if they care. Would that be OK?
>>>>> Hmm, not really. The instance name can and will change, so having it as
>>>>> part of the disk ID is not good.
>>>>> But note that the instance "info" is already passed to BlockDevCreate,
>>>>> and while not passed to the actual Create call, is passed to
>>>>> BlockDevice.SetInfo()? Could we reuse that here too?
>>>> If I understand correctly, you prefer disk names to be independent of
>>>> the instance name, then go and tag the disk object itself, e.g., the lv in
>>>> the case of lvm.
>>> Yes.
>>>> A similar approach would be for me to implement the
>>>> ExtStorageDevice.SetInfo() call which runs an external "setinfo" script.
>>>> The setinfo script would mark the disk with the instance name or other
>>>> metadata in a provider-specific way. For example the volumes of a NAS
>>>> would be marked with vendor specific metadata and the admin would be
>>>> able to recover instance data by examining these.
>>> Yes, exactly.
>>>> However, even if this happens, it seems I can only find code setting lv
>>>> tags during blockdev creation and not instance rename. So, how do
>>>> lv tags stay consistent after renames? Thus, even we follow the approach
>>>> you propose, the admin will still have problems tracking down instance
>>>> disks after loosing the ganeti config. Thoughts?
>>> That is actually a current bug (only internally raised so far). We do
>>> plan to fix the rename LU to include this, by exposing SetInfo via an
>>> rpc call.
>>> So yes, let's take this approach, and fix the rename issue separately.
>> OK. So, to sum up, I'll implement the SetInfo functionality, fix all the
>> small issues we discussed in previous emails and send an interdiff.
>> When we fix the RunResult TODO and the code gets merged, I'll send
>> a patch that will merge all log files into a single one for each disk.
>> Finally, after the merge, I will also send a separate patch that
>> documents the SetInfo functionality in the design doc and man pages.