Problem while attaching existing disk to instance


Valentin

Feb 8, 2021, 2:35:10 PM
to ganeti
Hello,

We're using an external storage provider (lvm [1]) and seem unable to attach an existing disk to an instance. Creating new disks is no problem at all, but we also need to be able to use disks with pre-existing data.

When doing

gnt-instance modify --disk attach:name=data-disk test-instance

we only get

Failure: prerequisites not met for this operation:
error type: wrong_input, error details:
Missing provider for template 'ext'

but when the provider is explicitly set, it doesn't work either:

gnt-instance modify --disk attach:name=data-disk,provider=lvm test-instance

Failure: prerequisites not met for this operation:
error type: wrong_input, error details:
Only one argument is permitted in attach op, either name or uuid 

After looking around for a bit, I can't find a way to attach a disk with existing data to a Ganeti instance with any other storage template either.

What is the suggested way/syntax of doing this?

Thanks in advance.


[1] http://www.goodbytez.de/ganeti/lvm.tar.gz

Phil Regnauld

Feb 8, 2021, 7:01:33 PM
to gan...@googlegroups.com
Valentin (valentin.k93) writes:
>
> After looking around for a bit, I can't find a way to attach a disk with
> existing data to a Ganeti instance with any other storage template either.

The only method I know of is "adopt", see gnt-instance manpage --
but it only works for "plain" disk type (LVM).

When using the adopt key in the disk definition, Ganeti will
reuse those volumes (instead of creating new ones) as the
instance's disks. Ganeti will rename these volumes to the
standard format, and (without installing the OS) will use them
as-is for the instance. This allows migrating instances from
non-managed mode (e.g. plain KVM with LVM) to being managed via
Ganeti. Please note that this works only for the `plain' disk
template (see below for template details).
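
For illustration, adoption happens at instance creation time via the
adopt disk parameter, roughly like this (a sketch: node, OS, size and
volume names are placeholders; the size should match the existing LV,
which must already live in the volume group Ganeti manages):

# gnt-instance add -t plain -n node1 -o noop --no-start \
    --disk 0:size=10G,adopt=data-disk test-instance

Note that this only works with gnt-instance add, not with
gnt-instance modify.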

Valentin

Feb 9, 2021, 7:30:39 AM
to gan...@googlegroups.com
Thanks for your response.

On 09/02/2021 01:01, Phil Regnauld wrote:
>
> The only method I know of is "adopt", see gnt-instance manpage --
> but it only works for "plain" disk type (LVM).

I looked into it and there are multiple issues I have with this:
A) No migration is possible with plain disks and there is no way to
change the storage template once it is plain.
B) The adoption process copies data from the source disk, which is not
feasible for big storage volumes.
C) There is no way to attach an existing disk after a VM was created.

I realized another way to adopt disks is to use blockdev as a disk
template, but C) still applies.
Additionally there are many valid block devices, e.g. attached logical
volumes, that can't really be used due to the restriction that blockdev
paths have to start with /dev/disk.

At this point I'd like to outline the situation we're in:
We are in the process of updating our virtualization hardware and
software and we'd like to use ganeti for its cluster functionality and
ease of use in many aspects.

Our current setup is:
* Virtualization Hosts using a xen/xl stack
* networked storage with ATA over Ethernet
* clustered storage management with lvm using lvm2-lockd (previously clvm)

This allows us to run each VM on any of the hosts and do live
migrations. But we need to manually take care of the VMs' Xen
configurations and make sure machines are only started once, as there
is nothing stopping one machine from running twice on the cluster.

With our new hardware we are preparing and testing a migration to Ganeti
for the parts of our virtualization setup that run well. So we are
testing the xen/xl stack thoroughly and trying to integrate the
clustered storage setup with lvmlockd. In this process we have
contributed some PRs and issues to Ganeti (e.g. [1][2][3]).

Right now we are evaluating whether we want to continue with Ganeti or
use another stack, as the attachment of existing disks is a core
functionality we need.
We would be willing to contribute more to Ganeti if there is a clear
path forward to achieve this and the possibility to get those changes
upstream.

I see multiple ways to achieve the wanted changes that would benefit
Ganeti in general, with one change common to all of them:
*allow adoption/attachment of pre-existing disks to already-created VMs*
Options for different storage interfaces would be:
* lvm: allow migration when clustering is active
* lvm: allow adoption of existing lvs without copying content
* exstorage: add interface to adopt disks
* blockdev: allow valid blockdevs outside /dev/disk

We'd be really interested in the thoughts of ganeti developers on such a
proposal.

Cheers,
Valentin

[1] https://github.com/ganeti/ganeti/pull/1582
[2] https://github.com/ganeti/instance-debootstrap/pull/12
[3] https://github.com/ganeti/ganeti/issues/1581

Brian Candler

Feb 9, 2021, 8:58:44 AM
to ganeti
On Tuesday, 9 February 2021 at 12:30:39 UTC Valentin wrote:
> there are multiple issues I have with this:
> A) No migration is possible with plain disks and there is no way to
> change the storage template once it is plain.

You can certainly convert from "plain" to "drbd", I've done that many times.

I'm not sure if you can change to "ext", although the documentation says:
"conversions between all the available templates are supported, except
the diskless and the blockdev templates"

> B) The adoption process copies data from the source disk, which is not
> feasible for big storage volumes.

No, it just renames the existing LVM volume to UUID form; there's no data
copying. But I don't know how that works with the clustered LVM that
you've been using.

> C) There is no way to attach an existing disk after a VM was created.

I think that's correct: you can only adopt when creating a new instance
(gnt-instance add), not modify.

Disk management is somewhat limited in ganeti, especially if you want to
mix types (e.g. use LVM for boot disk and Ceph for a secondary disk).

Phil Regnauld

Feb 9, 2021, 9:20:31 AM
to gan...@googlegroups.com
Brian Candler (b.candler) writes:
>
> Disk management is somewhat limited in ganeti, especially if you want to
> mix types (e.g. use LVM for boot disk and Ceph for a secondary disk).

I actually do that - if using external storage templates, you can
have different disks on different storage backends. system on SSD/LVM,
data on CEPH.

Valentin

Feb 10, 2021, 3:37:45 PM
to ganeti
Thanks for the response. I'm sorry, I got a bit confused there, because we tried both plain lvm and an external storage provider with lvm (clustered...).

> > there are multiple issues I have with this:
> > A) No migration is possible with plain disks and there is no way to
> > change the storage template once it is plain.
>
> You can certainly convert from "plain" to "drbd", I've done that many times.
>
> I'm not sure if you can change to "ext", although the documentation says:
> "conversions between all the available templates are supported, except
> the diskless and the blockdev templates"

You are absolutely right, but this conversion uses dd to copy the data...

> > B) The adoption process copies data from the source disk, which is not
> > feasible for big storage volumes.
>
> No, it just renames the existing LVM volume to UUID form; there's no data
> copying. But I don't know how that works with the clustered LVM that
> you've been using.

Yes, that's right, but it is still not what we need: it leaves us with a
clustered lvm setup that can't migrate machines...

> > C) There is no way to attach an existing disk after a VM was created.
>
> I think that's correct: you can only adopt when creating a new instance
> (gnt-instance add), not modify.
>
> Disk management is somewhat limited in ganeti, especially if you want to
> mix types (e.g. use LVM for boot disk and Ceph for a secondary disk).

I fear that is a deal-breaker for us. E.g. we sometimes need to attach big
datasets on a disk read-only to multiple VMs, or add such a dataset to an
existing migratable VM without copying, which is not possible this way...
Still, if there is interest from the community and developers in a real
attach feature and/or an adoption mechanism for ext storage, we'd consider
implementing those.

Cheers

Phil Regnauld

Feb 10, 2021, 3:47:06 PM
to gan...@googlegroups.com
Valentin (valentin.k93) writes:
>
> I fear that is a deal-breaker for us. E.g. we sometimes need to attach big
> datasets on a disk read-only to multiple VMs, or add such a dataset to an
> existing migratable VM without copying, which is not possible this way...
> Still, if there is interest from the community and developers in a real
> attach feature and/or an adoption mechanism for ext storage, we'd consider
> implementing those.

Nothing stops you from creating the VM with an empty volume,
noting its UUID, stopping the VM, removing the LV, renaming your
existing, pre-created disk to the correct name (the UUID from the
config), and restarting the VM.
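
A rough sketch of those steps (untested; the VG "ganeti", the names and
the sizes are assumptions; lvrename only works within one VG, so the
pre-created LV must already sit in the VG Ganeti uses, and the size
given to -s should match that LV):

# gnt-instance add -t plain -s 100G -o noop --no-start \
    --no-name-check --no-ip-check placeholder
# gnt-instance info placeholder     (note the <uuid>.disk0 logical_id)
# gnt-instance deactivate-disks placeholder
# lvremove -f ganeti/<uuid>.disk0
# lvrename ganeti data-disk <uuid>.disk0
# gnt-instance startup placeholder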

It's hackish, but I have done it before. An external provider
that could deal with all this would be great, too :)

(or even better, a rewrite of the ganeti disk management code
so that disks became "first rank" citizens that could exist
independently of VMs).

Brian Candler

Feb 10, 2021, 4:03:18 PM
to ganeti
On Wednesday, 10 February 2021 at 20:47:06 UTC Phil Regnauld wrote:
> (or even better, a rewrite of the ganeti disk management code
> so that disks became "first rank" citizens that could exist
> independently of VMs).

This has been a long-standing intended feature, with a design document:
https://docs.ganeti.org/docs/ganeti/3.0/html/design-disks.html
Oddly, this has been marked as implemented in 2.14, but I'm pretty sure that's not the case (at least, not all of it).  In ganeti 2.15.2, disks are still children of instances.

Sascha Lucas

Feb 10, 2021, 5:35:35 PM
to ganeti
Hi,

sorry for jumping in late. What I have read is that an external storage
interface (ESI) should be used to move disks between instances. The
Ganeti function for de-/attach seems not to work or to be incomplete.

As a workaround it could be possible to use the ESI to trick Ganeti with
just the normal --disk add/remove operations. On `remove` the ESI should
not delete the disk (e.g. if the disk has a special ExtStorage parameter
like keep=true). Later, with `add` and an additional parameter like
`adopt=name_tag_or_id_of_lv`, the ESI can reuse/rename an existing LV
instead of creating a new one.

That's more of a theory. I haven't used ESI yet.
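
For illustration, a minimal sketch of what such a `remove` script could
look like (untested; the keep parameter, its EXTP_KEEP environment
variable and the volume group are assumptions of this example, while
VOL_NAME is part of the documented ExtStorage environment):

#!/bin/bash
# Hypothetical ExtStorage 'remove' script honouring a keep=true parameter.
set -e
if [ "${EXTP_KEEP:-false}" = "true" ]; then
    # Ganeti forgets the disk, but the LV and its data survive.
    exit 0
fi
lvremove -f "xenvg/${VOL_NAME}"

The matching `create` script could then check an adopt parameter (e.g.
EXTP_ADOPT) and lvrename the named LV instead of running lvcreate.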

The lvmlockd approach looks interesting (omitting the clvm cluster foo).

If I remember correctly I've used de-/attaching disks successfully with
`blockdev` and Ganeti 2.14.

Disks seem to be independent objects in the Ganeti config:

"instances": {
"204acd6e-a799-41c4-8425-29b3db78b192": {
"name": "test.vm",
...
"disks": [
"6536c4f8-cdb2-4b23-a6fb-626396ff524c",
"a7217dd0-b26e-4505-9305-da911c036448"
],

and

"disks": {
"6536c4f8-cdb2-4b23-a6fb-626396ff524c": {
...
"a7217dd0-b26e-4505-9305-da911c036448": {

Maybe it's just a simple fix to make attach work for ext storage?

Any help in this area is highly appreciated.

Thanks, Sascha.

Brian Candler

Feb 11, 2021, 3:18:20 AM
to ganeti
On Wednesday, 10 February 2021 at 22:35:35 UTC sascha...@web.de wrote:
> Disks seem to be independent objects in the Ganeti config:
>
>   "instances": {
>     "204acd6e-a799-41c4-8425-29b3db78b192": {
>       "name": "test.vm",
>       ...
>       "disks": [
>         "6536c4f8-cdb2-4b23-a6fb-626396ff524c",
>         "a7217dd0-b26e-4505-9305-da911c036448"
>       ],


Thanks: it looks like I was wrong then.  In the gnt-instance manpage, I see that gnt-instance modify does now have an option to detach and keep a disk:

       The above apply also to the --disk detach option, which removes
       a disk from an instance but keeps it in the configuration and
       doesn't destroy it.

and a corresponding "--disk N:attach" option.

However, I can't see how you'd find detached disks; there's no "gnt-disk list" for example.  And ideally you'd be able to adopt an LVM volume to form a detached disk, separately from attaching it.

Brian Candler

Feb 11, 2021, 4:26:12 AM
to ganeti
Looking into this a bit more (with 2.15.2):

- an instance must have a single disk_template: there is not a separate template for each disk

root@nuc1:~# gnt-instance list -o name,disk_template,disk.sizes
Instance             Disk_template Disk_sizes
elk.home.example.net ext           20480

- gnt-instance modify -t <template> modifies the template for *all* disks on an instance

- you can remove the primary disk:

root@nuc1:~# gnt-instance add -t plain -s 2G -o noop --no-start --no-name-check --no-ip-check foo
...
root@nuc1:~# gnt-instance modify --disk 0:detach foo
Modified instance foo
 - disk/0 -> detach
Please don't forget that most parameters take effect only at the next (re)start of the instance initiated by ganeti; restarting from within the instance will not be enough.
root@nuc1:~# gnt-instance list -o name,disk_template,disk.sizes
Instance             Disk_template Disk_sizes
elk.home.example.net ext           20480
foo                  diskless


However, to reattach it you need to have noted either its ganeti UUID or to have given it a name.  Unfortunately, the ganeti disk UUID is not the same as the LVM UUID!  For example, after creating a new instance 'bar', gnt-instance info shows:

  Disks:
    - disk/0: plain, size 1.0G
      access mode: rw
      logical_id: ganeti/6719b82d-9bc1-4c99-af84-dd26893d344f.disk0
      on primary: /dev/ganeti/6719b82d-9bc1-4c99-af84-dd26893d344f.disk0 (253:4)
      name: None
      UUID: 9d435613-a1ed-4779-b1b6-a6fec4e0166b

So the only way to find the UUID of my missing disk seems to be "jq .disks /var/lib/ganeti/config.data"
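
As an aside, a jq filter along these lines (assuming the config layout
shown above) would list only the disk UUIDs that no instance references,
i.e. the detached ones:

root@nuc1:~# jq -r '[.instances[].disks[]] as $attached
    | .disks | keys[]
    | select(. as $u | ($attached | index($u)) == null)' \
    /var/lib/ganeti/config.data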

OK, I found it.  Can I reattach it?  Nope, because the instance is now using the "diskless" template:

root@nuc1:~# gnt-instance modify --disk 0:attach,uuid=31cf98b3-e5b1-4823-843b-30659a2c9f65 foo
Failure: prerequisites not met for this operation:
error type: wrong_input, error details:
Instance allocation to group {'name': 'default', 'tags': [], 'ipolicy': {}, 'serial_no': 2, 'uuid': '764b0d55-a3c7-4c9c-8893-93abf5a75e66', 'ndparams': {}, 'diskparams': {}, 'mtime': 1499533056.725239, 'alloc_policy': 'preferred', 'networks': {'a767eb97-2e70-4cff-8639-4512e5f9e70b': {'link': 'communication_rt', 'mode': 'routed', 'vlan': u''}}, 'ctime': 0} (default) violates policy: Disk template diskless is not allowed (allowed templates drbd, plain, ext, file)
root@nuc1:~# gnt-instance modify -t plain foo
Failure: prerequisites not met for this operation:
error type: wrong_input, error details:
Conversion from the 'diskless' disk template is not supported
root@nuc1:~# gnt-instance modify -t plain --disk 0:attach,uuid=31cf98b3-e5b1-4823-843b-30659a2c9f65 foo
Failure: prerequisites not met for this operation:
error type: wrong_input, error details:
Disk template conversion and other disk changes not supported at the same time

Neither can I attach it to some other instance whose existing template is "ext":

root@nuc1:~# gnt-instance modify --disk 1:attach,uuid=31cf98b3-e5b1-4823-843b-30659a2c9f65 elk.home.example.net
Failure: prerequisites not met for this operation:
error type: wrong_input, error details:
Missing provider for template 'ext'

But I can attach it to some other instance which already has a disk and is using template "plain":

root@nuc1:~# gnt-instance modify --disk 1:attach,uuid=31cf98b3-e5b1-4823-843b-30659a2c9f65 bar
Thu Feb 11 09:14:07 2021  - INFO: Waiting for instance bar to sync disks
Thu Feb 11 09:14:07 2021  - INFO: Instance bar's disks are in sync
Modified instance bar
 - disk/1 -> attach:size=2048,mode=rw
Please don't forget that most parameters take effect only at the next (re)start of the instance initiated by ganeti; restarting from within the instance will not be enough.
root@nuc1:~# gnt-instance list -o name,disk_template,disk.sizes
Instance             Disk_template Disk_sizes
bar                  plain         1024,2048
elk.home.example.net ext           20480
foo                  diskless

So I'd say that the disk design document <https://docs.ganeti.org/docs/ganeti/3.0/html/design-disks.html> has only been partially implemented.  In particular:

- you can't mix disks of different types on an instance
- unless it's lurking in a non-obvious place, you can't list detached disks
- you can't do other operations on detached disks (e.g. create them, delete them, tag them, change their type)

Phil Regnauld

Feb 11, 2021, 4:54:17 AM
to gan...@googlegroups.com
Brian Candler (b.candler) writes:
> Looking into this a bit more (with 2.15.2):
>
> - an instance must have a single disk_template: there is not a separate
> template for each disk

That's right, but you can mix ext providers when using ext as the
template -- so you can create an instance with one disk using one
provider, and a second disk using another provider (both ext):

# gnt-instance add -n vm1 -o noop --no-start --no-install --no-name-check --no-ip-check -t ext --disk 0:size=1G,provider=rbd-ssd,access=userspace disk-dummy

Thu Feb 11 10:39:52 2021 * disk 0, size 1.0G
Thu Feb 11 10:39:52 2021 * creating instance disks...
Thu Feb 11 10:39:53 2021 adding instance disk-dummy to cluster config
Thu Feb 11 10:39:53 2021 adding disks to cluster config
Thu Feb 11 10:39:54 2021 - INFO: Waiting for instance disk-dummy to sync disks
Thu Feb 11 10:39:54 2021 - INFO: Instance disk-dummy's disks are in sync

# gnt-instance modify --disk 1:add,provider=zfs,size=2G disk-dummy

Thu Feb 11 10:40:51 2021 * disk 1, size 2.0G
Thu Feb 11 10:40:52 2021 - INFO: Waiting for instance disk-dummy to sync disks
Thu Feb 11 10:40:52 2021 - INFO: Instance disk-dummy's disks are in sync
Modified instance disk-dummy
- disk/1 -> add:size=2048,mode=rw

# gnt-instance info disk-dummy

[...]
Disks:
- disk/0: ext, size 1.0G
access mode: rw
logical_id: ['rbd-ssd', 'fa7955f5-ca8b-4e11-8221-238818597de4.ext.disk0']
on primary: /dev/rbd7 (252:112)
name: None
UUID: 1ea0ae4a-33a2-4407-8d5e-f0559382c8d3
- disk/1: ext, size 2.0G
access mode: rw
logical_id: ['zfs', 'ca727632-cfc8-4b0b-a50d-c3c26fad11d3.ext.disk1']
on primary: /dev/zvol/zfs/vm/ca727632-cfc8-4b0b-a50d-c3c26fad11d3.ext.disk1 (230:0)
name: None
UUID: bd5a42fc-42b4-436a-bd1d-2f0485251837
[...]

# gnt-instance list -o name,disk_template,disk.size/0,disk.size/1 disk-dummy
Instance Disk_template Disk/0 Disk/1
disk-dummy ext 1.0G 2.0G

... and yes, it starts fine, I use this in production :) It's actually a simple
way to work around the limitations of the disk architecture in Ganeti,
assuming you don't want that instance to have both DRBD and ext (which
could be useful: system disk on drbd SSD, data disk on RBD CEPH).

Side note: when modifying an instance to add disks, say:

1. gnt-instance add ... --disk 0:size=1G
2. gnt-instance modify --disk 1:add,size=2G
3. gnt-instance modify --disk 1:add,size=3G

... and you specify disk 1 twice as above, you end up with:

Instance Disk_template Disk/0 Disk/1 Disk/2
disk-dummy ext 1.0G 3.0G 2.0G

This is actually consistent with 'gnt-instance modify --disk X:remove',
which will also renumber the disk units and shift all X+n disks one
down -- but better be careful which disk you are manipulating when
adding/removing disks :)


> So the only way to find the UUID of my missing disk seems to be "jq .disks
> /var/lib/ganeti/config.data"
>
> OK, I found it. Can I reattach it? Nope, because the instance is now
> using the "diskless" template:

Ah that's interesting.

> Neither can I attach it to some other instance whose existing template is
> "ext":
>
> root@nuc1:~# *gnt-instance modify --disk
> 1:attach,uuid=31cf98b3-e5b1-4823-843b-30659a2c9f65 elk.home.example.net*
> Failure: prerequisites not met for this operation:
> error type: wrong_input, error details:
> Missing provider for template 'ext'

... but that's because you didn't specify the ext provider?
What storage template is 'elk.home.example.net' using?

> But I *can* attach it to some other instance which already has a disk *and*
> is using template "plain":
>
> root@nuc1:~# *gnt-instance modify --disk
> 1:attach,uuid=31cf98b3-e5b1-4823-843b-30659a2c9f65 bar*

Ok, that's somewhat useful.

> So I'd say that the disk design document
> <https://docs.ganeti.org/docs/ganeti/3.0/html/design-disks.html> has only
> been partially implemented. In particular:
>
> - you can't mix disks of different types on an instance
> - unless it's lurking in a non-obvious place, you can't list detached disks
> - you can't do other operations on detached disks (e.g. create them, delete
> them, tag them, change their type)

I think some of this work happened around the time the internal Ganeti
group at Google was being reassigned to other tasks, and it was never
finalized.

Cheers,
Phil

Brian Candler

Feb 11, 2021, 9:40:52 AM
to ganeti
On Thursday, 11 February 2021 at 09:54:17 UTC Phil Regnauld wrote:
> What storage template is 'elk.home.example.net' using?


This zfs provider (my own version, which doesn't spoof LVM, so it can't be used as DRBD backing, but which allows both LVM and ZFS to coexist in ganeti).

I can't attach a ganeti LVM disk to this unless I implemented LVM storage as another ext provider. Even if I did, ganeti wouldn't know that this disk was in use, so wouldn't prevent me from attaching it to a second instance with disastrous consequences.

Brian Candler

Feb 11, 2021, 9:44:43 AM
to ganeti
TBH, an extstorage provider for Linstor <https://brian-candler.medium.com/linstor-networked-storage-without-the-complexity-c3178960ce6b> would solve this problem.  At that point though, you don't really need anything other than extstorage: Linstor is way better than Ganeti's built-in LVM and DRBD storage anyway (except that it uses Java).

Valentin

Feb 12, 2021, 10:19:54 AM
to gan...@googlegroups.com
Hi,

sascha...@web.de wrote:
>As a workaround it could be possible to use the ESI to trick Ganeti just
>with normal --disk add/remove operations. On `remove` the ESI should not
>delete the disk (i.e. if the disk has a special ExtStorage parameter like
>keep=true). Later with `add` and an additional parameter like
>`adopt=name_tag_or_id_of_lv` the ESI can reuse/rename an existing LV,
>instead of creating a new one.

Thanks for this idea; from what I understand it would work. But a
bugfix and a complete implementation of independent disk objects would
be better than such a hack.

regn...@gmail.com wrote:
>> So the only way to find the UUID of my missing disk seems to be "jq .disks
>> /var/lib/ganeti/config.data"
>>
>> OK, I found it. Can I reattach it? Nope, because the instance is now
>> using the "diskless" template:

The manual still states:
"Also, there is no support for conversions to or from the diskless
template."
which is kind of wrong, as Ganeti obviously converts an instance to
diskless when its last disk is removed. But you still can't convert from
diskless to any other template.
So you can make your instances pretty useless with this, which I think
is a bug.


regn...@gmail.com wrote:
> Ah that's interesting.
>
>> Neither can I attach it to some other instance whose existing template is
>> "ext":
>>
>> root@nuc1:~# *gnt-instance modify --disk
>> 1:attach,uuid=31cf98b3-e5b1-4823-843b-30659a2c9f65 elk.home.example.net*
>> Failure: prerequisites not met for this operation:
>> error type: wrong_input, error details:
>> Missing provider for template 'ext'
>
> ... but that's because you didn't specify the ext provider?
> What storage template is 'elk.home.example.net' using?

If I'm correct, it doesn't matter...
I just tried to:
* create instance luigi with ext storage and 2 disks
* detach the secondary disk "luigi-a"
* attach the secondary disk

the errors were:
# gnt-instance modify --disk 1:attach luigi

Failure: prerequisites not met for this operation:
error type: wrong_input, error details:
Missing provider for template 'ext'

# gnt-instance modify --disk 1:attach,name=luigi-a,provider=lvm luigi

Failure: prerequisites not met for this operation:
error type: wrong_input, error details:
Only one argument is permitted in attach op, either name or uuid

# gnt-instance modify --disk 1:attach,provider=lvm luigi

Failure: prerequisites not met for this operation:
error type: wrong_input, error details:
Only one argument is permitted in attach op, either name or uuid

To be clear: this is a disk already saved in the Ganeti configuration,
so the check whether a provider was given should not be there; the
provider for this disk is already known. As I understand it, the error
is raised by the call to CheckDiskExtProvider in [1], which would need
to a) check which action is used on the disk or b) read the provider
from the config in order to work correctly.

Brian Candler (b.candler) wrote:
> TBH, an extstorage provider for Linstor
> <https://brian-candler.medium.com/linstor-networked-storage-without-the-complexity-c3178960ce6b> would
> solve this problem. At that point though, you don't really need anything
> other than extstorage: Linstor is way better than Ganeti's built-in LVM and
> DRBD storage anyway (except that it uses Java)

I don't think so, as the possibilities would still be restricted by
Ganeti (e.g. the adoption of disks), and it would be nice not to have to
do everything via ext storage providers just because you can mix them
but can't mix different storage templates (see also [2]).
Still, if you want an extstorage provider for Linstor, it's literally
just a matter of reading [3] and writing 10 lines of code.


When I started this thread I didn't know of the disk design document
https://docs.ganeti.org/docs/ganeti/3.0/html/design-disks.html
We'd be interested in the proposed changes denoted there and are willing
to implement what's missing, as well as to fix bugs in the already
implemented parts, like the one with attaching disks of template ext.
Because of our use case we would probably prioritize the interface to
"Allow creation/modification/deletion of disks that are not attached to
any instance" and add this functionality to gnt-storage.
But because we are completely new to Ganeti and haven't seen any Ganeti
clusters besides ours, we are hesitant to do so without feedback from
the Ganeti developers. We want to avoid starting in a completely wrong
way and missing something that should be obvious.

What I've read in the different responses now is that many of you
effectively work around Ganeti's storage and need possibilities similar
to ours. Maybe opening some connected bugs on Ganeti's GitHub would be a
way to structure this a bit and attract attention from developers?

Cheers,
Valentin

[1]
https://github.com/ganeti/ganeti/blob/4d8c16a0739c93400471023b3cf9adf73a037b8f/lib/cmdlib/instance_storage.py#L368
[2]
https://docs.ganeti.org/docs/ganeti/3.0/html/design-disks.html#eliminating-the-disk-template-from-the-instance
[3]
https://docs.ganeti.org/docs/ganeti/3.0/html/man-ganeti-extstorage-interface.html

Brian Candler

Feb 12, 2021, 1:09:35 PM
to ganeti
On Friday, 12 February 2021 at 15:19:54 UTC Valentin wrote:
> When I started this thread I didn't know of the disk design document
> https://docs.ganeti.org/docs/ganeti/3.0/html/design-disks.html
> We'd be interested in the proposed changes denoted there and are willing
> to implement what's missing, as well as to fix bugs in the already
> implemented parts, like the one with attaching disks of template ext.
> Because of our use case we would probably prioritize the interface to
> "Allow creation/modification/deletion of disks that are not attached to
> any instance" and add this functionality to gnt-storage.

As far as I can see, that part was never fleshed out into concrete commands. In particular it says
  • Create a new Disk object of a given template and save it to the config.
  • Remove an existing Disk object from the config.
(without giving the CLI commands), and it doesn't mention the need to list detached disks.

"gnt-storage" currently only does extstorage.  I guess it could be extended (e.g. "gnt-storage add -t drbd ..." to create a disk), instead of having a new command "gnt-disk".

There are a load of commands which either need duplicating, or should be changed from instance commands to disk commands.  In particular, things like changing the template of a disk, growing a disk, adopting a disk, drbd failover (swapping primary/secondary) - these are all things you might want to do to a disk while it's detached.  (And if it is attached to an instance, generally that instance needs to be shut down anyway.)  Even plain volumes will need the equivalent of "gnt-instance move", where you are moving just the disk, not the instance+disk pair.

There's another problematic issue: the concept of "primary" and "secondary" nodes for instances. Suppose you have disk X (-t drbd) which is replicated between node A and node B, and another disk Y which is replicated between node A and node C.  I don't think an instance could use both of these at the same time, since "primary"/"secondary" failover is something you do at the instance level.  Hence it will probably need to refuse to attach a drbd disk to an instance, in the case where there are already one or more attached drbd disks, and the disk you're trying to attach is on a different pair of nodes (*) (**).

As the design spec says, it's necessary to drop the concept of instance-level disk template entirely:

"The Instance object will no longer contain the list of disk objects that are attached to it or a disk template. Instead, an Instance object will refer to its disks using their UUIDs and the disks will contain their own template."

Hence gnt-instance list needs modifying to include -o options "disk.templates" and "disk.template/n"; perhaps the current "disk_template" returns the same as disk.template/0, or "diskless" if there are no disks attached.

However, this also means that even the concept of whether an instance *has* a secondary node will depend on whether it has a drbd disk attached at that point in time.  And the options for failover or migrate may be constrained by the other attached disks (e.g. if it also has a plain disk attached, then migration is not possible at all; but a ceph disk attached doesn't constrain it). This is a big can of worms.

These problems don't exist with Linstor, since each disk can be accessed from any node anyway; the instance doesn't have to execute on the "primary" node at all, and if different volumes were replicated between different sets of nodes, it wouldn't care.

Personally my feeling is: if lxd had a Linstor plugin, I would probably drop ganeti now for good.  lxd can now run full-fat virtual machines as well as containers, it has clustering, it does most of what ganeti does (except perhaps the hbal stuff), it has a good story around starting cloud images properly, and it is supported by Canonical.  The drbd replication is the main selling point left for ganeti, but the drbd8 primary-secondary replication is now a serious handicap.  drbd9's multinode replication (which Linstor uses) makes drbd work like a SAN.

Regards,

Brian.

(*) I did a quick check to see if the disk objects in the config keep a record of which nodes they are on.  They do:

root@nuc1:~# gnt-instance add -t drbd -n nuc1:nuc2 -s 1G -o noop --no-start --no-name-check --no-ip-check --no-wait-for-sync bar
...
root@nuc1:~# jq .disks /var/lib/ganeti/config.data
{
  "1aeb0d4d-ab13-4d91-a789-d4c98ec2b67f": {
    "logical_id": [
      "ced13e8c-a3ee-47e0-922e-58db0134c0dc",
      "73c16aad-8540-4535-b5b6-329c0b81758b",
      11001,
      0,
      0,
      "13f578aed78008614124bdf3014bab33482f4c4c"
    ],
    "dev_type": "drbd",
    "children": [
      {
        "logical_id": [
          "ganeti",
          "7daafd25-6802-4d7b-9715-454fb350dd9f.disk0_data"
        ],
        "dev_type": "plain",
        "children": [],
        "nodes": [
          "ced13e8c-a3ee-47e0-922e-58db0134c0dc",
          "73c16aad-8540-4535-b5b6-329c0b81758b"
        ],
        "iv_name": "",
        "size": 1024,
        "mode": "rw",
        "params": {},
        "uuid": "a9211b10-5534-49e8-b5dc-39606996ab5f",
        "serial_no": 1,
        "ctime": 1613151115.034525,
        "mtime": 1613151115.034515
      },
      {
        "logical_id": [
          "ganeti",
          "7daafd25-6802-4d7b-9715-454fb350dd9f.disk0_meta"
        ],
        "dev_type": "plain",
        "children": [],
        "nodes": [
          "ced13e8c-a3ee-47e0-922e-58db0134c0dc",
          "73c16aad-8540-4535-b5b6-329c0b81758b"
        ],
        "iv_name": "",
        "size": 128,
        "mode": "rw",
        "params": {},
        "uuid": "73242f08-29ab-4fc1-9f80-ac235d4cee24",
        "serial_no": 1,
        "ctime": 1613151115.034568,
        "mtime": 1613151115.034558
      }
    ],
    "nodes": [
      "ced13e8c-a3ee-47e0-922e-58db0134c0dc",
      "73c16aad-8540-4535-b5b6-329c0b81758b"
    ],
    "iv_name": "disk/0",
    "size": 1024,
    "mode": "rw",
    "params": {},
    "uuid": "1aeb0d4d-ab13-4d91-a789-d4c98ec2b67f",
    "serial_no": 1,
    "ctime": 1613151115.034598,
    "mtime": 1613151115.034589
  },
...
}

Another question though: does the "nodes":[] list ordering imply primary/secondary?  Apparently not, because if I fail over the instance, the disk info doesn't change.  This begs the question of what happens if you *activate* a disk while it's detached.  Maybe this will not be possible - but then that's limiting, as it means you can't access the content of a detached disk.

root@nuc1:~# gnt-instance failover bar
Failover will happen to image bar. This requires a shutdown of the
instance. Continue?
y/[n]/?: y
Fri Feb 12 17:47:08 2021  - INFO: Not checking memory on the secondary node as instance will not be started
Fri Feb 12 17:47:08 2021 Failover instance bar
Fri Feb 12 17:47:08 2021 * not checking disk consistency as instance is not running
Fri Feb 12 17:47:08 2021 * shutting down instance on source node
Fri Feb 12 17:47:09 2021 * deactivating the instance's disks on source node
root@nuc1:~# jq .disks /var/lib/ganeti/config.data
{
  "1aeb0d4d-ab13-4d91-a789-d4c98ec2b67f": {
    "logical_id": [
      "ced13e8c-a3ee-47e0-922e-58db0134c0dc",
      "73c16aad-8540-4535-b5b6-329c0b81758b",
      11001,
      0,
      0,
      "13f578aed78008614124bdf3014bab33482f4c4c"
    ],
    "dev_type": "drbd",
    "children": [
      {
        "logical_id": [
          "ganeti",
          "7daafd25-6802-4d7b-9715-454fb350dd9f.disk0_data"
        ],
        "dev_type": "plain",
        "children": [],
        "nodes": [
          "ced13e8c-a3ee-47e0-922e-58db0134c0dc",
          "73c16aad-8540-4535-b5b6-329c0b81758b"
        ],
        "iv_name": "",
        "size": 1024,
        "mode": "rw",
        "params": {},
        "uuid": "a9211b10-5534-49e8-b5dc-39606996ab5f",
        "serial_no": 1,
        "ctime": 1613151115.034525,
        "mtime": 1613151115.034515
      },
      {
        "logical_id": [
          "ganeti",
          "7daafd25-6802-4d7b-9715-454fb350dd9f.disk0_meta"
        ],
        "dev_type": "plain",
        "children": [],
        "nodes": [
          "ced13e8c-a3ee-47e0-922e-58db0134c0dc",
          "73c16aad-8540-4535-b5b6-329c0b81758b"
        ],
        "iv_name": "",
        "size": 128,
        "mode": "rw",
        "params": {},
        "uuid": "73242f08-29ab-4fc1-9f80-ac235d4cee24",
        "serial_no": 1,
        "ctime": 1613151115.034568,
        "mtime": 1613151115.034558
      }
    ],
    "nodes": [
      "ced13e8c-a3ee-47e0-922e-58db0134c0dc",
      "73c16aad-8540-4535-b5b6-329c0b81758b"
    ],
    "iv_name": "disk/0",
    "size": 1024,
    "mode": "rw",
    "params": {},
    "uuid": "1aeb0d4d-ab13-4d91-a789-d4c98ec2b67f",
    "serial_no": 1,
    "ctime": 1613151115.034598,
    "mtime": 1613151115.034589
  },
...
}

(**) If a detached disk is on the "wrong" nodes, then it will need to be possible to migrate it to the correct nodes while it's still detached.  But the current mechanism for doing this depends on moving the secondary and failing over; this means that detached disks *will* need a concept of primary and secondary, which they don't currently have.  Or else you'll have to change to plain, do the equivalent of "gnt-instance move", and then change back to drbd.

Sascha Lucas

Feb 15, 2021, 5:41:44 PM
to gan...@googlegroups.com, ganeti...@googlegroups.com
Hi,

thanks a lot for the discussion. I think Brian made clear what
completing the design doc regarding disks would mean. From my point of
view the "big can of worms" is only caused by DRBD 8.4's
primary/secondary nature. All the other (redundant) disk templates,
which Ganeti calls "externally mirrored", do not have such limitations.

It might be considered to not allow de-/attach for DRBD on the first
run, or even at all (same for file and plain?). Maybe the use case for
de-/attach itself is somewhat special. Completing the disk design doc
would open the way for mixing disk types, again limited for the DRBD
case, but not for the externally mirrored ones.

I also agree with Valentin that solving these types of problems with
the external storage interface instead of inside Ganeti is not the most
valuable variant. Besides that, it can be observed that these
"functions as executables" interfaces, namely OS and storage, tend to
lead to everyone reinventing the wheel. Probably a subclassing and more
integrated approach would be better?

Valentin offered to contribute towards completing the disk design doc,
which would be a great improvement for the Ganeti project. Personally
I'd like to support this from my end, but this must be seen more from
the point of view of a longtime user than of a Ganeti developer (my
programming capabilities are enough for smaller fixes, but easily get
lost inside Haskell). So I'm cc-ing ganeti-devel here, probably to
catch some others' thoughts as well. The full discussion can be found
at [1].

Thanks, Sascha.

[1] https://groups.google.com/g/ganeti/c/FwMZhBP9Ru0

Valentin

Feb 16, 2021, 7:46:40 AM
to ganeti...@googlegroups.com, gan...@googlegroups.com
Hi,

thanks for posting this to ganeti-devel, I was planning to do so myself.


On 15/02/2021 23:41, Sascha Lucas wrote:
> thanks a lot for the discussion. I think Brian made clear what
> completing the design doc regarding disks would mean. From my point of
> view the "big can of worms" is only caused by DRBD 8.4's
> primary/secondary nature. All the other (redundant) disk templates,
> which Ganeti calls "externally mirrored", do not have such limitations.
>
> It might be considered to not allow de-/attach for DRBD on the first
> run, or even at all (same for file and plain?). Maybe the use case for
> de-/attach itself is somewhat special. Completing the disk design doc
> would open the way for mixing disk types, again limited for the DRBD
> case, but not for the externally mirrored ones.

I think if the design doc is implemented nicely, all of this can be
solved by doing the right checks on the drbd volumes. With mixed disk
types, all instance modification commands impacting the storage need to
check the attached disks and their states anyway, instead of just
checking the disk templates.
If we do this, it should only be a small step to checking a drbd disk's
hosts and suggesting an action to move the drbd disk to the
primary/secondary hosts that correspond to the desired instance.

> Valentin offered to contribute towards completing the disk design doc,
> which would be a great improvement for the Ganeti project. Personally
> I'd like to support this from my end, but this must be seen more from
> the point of view of a longtime user than of a Ganeti developer (my
> programming capabilities are enough for smaller fixes, but easily get
> lost inside Haskell). So I'm cc-ing ganeti-devel here, probably to
> catch some others' thoughts as well.

I have started to come up with an implementation plan together with my
colleagues.
As this is our main use case, we would start with the changes to the
CLI needed to manage detached disks and with the ExtStorage interface
part of it, but we try to keep the bigger picture in mind to make the
changes suitable for widespread use with other disk types.
Our proposed changes would consist of an extension of the gnt-storage
CLI to manage disks, plus appropriate changes to the instance
modification commands and to the model, to be able to handle mixed disk
templates.

gnt-storage cli (a usage sketch follows this list):
* list -- show all disks (maybe allow selectors/filtering)
* add -- add a disk to the config (and create it if needed)
* remove -- remove a disk from the config
  * parameter keep -- optionally preserve the data on the disk (for all
    types of storage; should be easy enough)
* compatibility with the current gnt-instance commands should be
  maintained (but maybe not extended)
* special commands that might be useful:
  * drbd -- switch primary/secondary
  * lvm/plain -- copy data between storage/nodes
  * convert -- disk type conversion
  * mount -- allow introspection of a disk on a node (only for disks
    that aren't in use; possible issues may arise from the storage
    location)
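
To make the proposal concrete, hypothetical usage could look like this
(none of these subcommands exist today; names, flags and syntax are
purely illustrative, except the final attach command, whose syntax
already exists):

# gnt-storage list
# gnt-storage add -t ext --disk size=100G,provider=lvm,name=data-disk
# gnt-storage remove --keep data-disk
# gnt-instance modify --disk attach:name=data-disk test-instance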

attach/detach of disks should be instance operations (reasoning: on all
hypervisors, disk operations are part of the instance management; also
compatibility with the current command line interfaces).

storage configuration for instances:
* disk templates
  * use disk templates in the disk configs only
  * allow disks to be attached to any instance (Ganeti has to deduce
    from all the disk templates how a machine may be migrated or
    started; also see below)
* disks need thorough checking of their availability for a (running)
  instance
* the diskless template needs to be dropped; checks for it should
  instead test whether no disk is attached
* instance-debootstrap needs to be modified to not use disk templates
  (cfgupgrade as well)
* all instance modification commands that impact storage need to check
  the attached disks and their state(s) (availability, ...) instead of
  just checking the instance disk template
* operations that operate on the instance and extract the disk
  template, e.g. for the creation of a new disk, will require an
  additional parameter for the disk template. Several commands already
  provide an optional parameter to override the instance setting; this
  will become required.

In addition, we want to implement for the ESI (ExtStorage Interface):
* an adopt option
* a fix for the issues with exclusive/shared mode outlined in [1]

I would love to hear some feedback from you on those plans.

In the meantime we have opened some smaller issues and PRs on github.
[2][3][4]

Thanks,
Valentin


[1] https://github.com/ganeti/ganeti/issues/1581
[2] https://github.com/ganeti/ganeti/issues/1583
[3] https://github.com/ganeti/ganeti/pull/1582
[4] https://github.com/ganeti/instance-debootstrap/pull/12

Phil Regnauld

Feb 16, 2021, 7:58:32 AM
to Sascha Lucas, gan...@googlegroups.com, ganeti...@googlegroups.com
Sascha Lucas (sascha_lucas) writes:
>
> It might be considered to not allow de-/attach for DRBD on the first
> run, or even at all (same for file and plain?). Maybe the use case for
> de-/attach itself is somewhat special. Completing the disk design doc
> would open the way for mixing disk types, again limited for the DRBD
> case, but not for the externally mirrored ones.

Since nothing is known about the "migratability" of a given
external disk template, it is a bit all-or-nothing with DRBD, yes.

But, we would love to be able to have system disks on DRBD SSD storage,
and have the second/data disks on RBD (CEPH), so we'd be able to migrate.

Maybe having the external disk template provide some sort of
flag/feature "can_migrate" or similar?

Cheers,
Phil

Valentin

Feb 16, 2021, 12:29:47 PM
to ganeti
Hi,

On Tuesday, February 16, 2021 at 1:58:32 PM UTC+1 regn...@gmail.com wrote:
> Since nothing is known about the "migratability" of a given
> external disk template, it is a bit all-or-nothing with DRBD, yes.
>
> But, we would love to be able to have system disks on DRBD SSD storage,
> and have the second/data disks on RBD (CEPH), so we'd be able to migrate.
>
> Maybe having the external disk template provide some sort of
> flag/feature "can_migrate" or similar?

As I understand it, extStorage currently assumes that migrations can be done.  As stated in [1]:
"The method for supporting external shared storage in Ganeti is to have an ExtStorage provider..."
ExtStorage providers are made for shared storage only.

Adding such a flag could be a feature, but it's hard to implement. If disks might be available on one, all, or some hosts, the info about the nodes where a disk is available would have to be provided by a bash script of that provider.

On the other hand, having the possibility to mix disk types and still be able to migrate if all attached disks support it would be on my list to implement, as it's also suggested in the disk design doc [2] as I understand it.

Cheers,
Valentin

[1] https://docs.ganeti.org/docs/ganeti/3.0/html/design-shared-storage.html
[2] https://docs.ganeti.org/docs/ganeti/3.0/html/design-disks.html

Phil Regnauld

Feb 16, 2021, 3:28:11 PM
to gan...@googlegroups.com
Valentin (valentin.k93) writes:
>
> As I understand it, extStorage currently assumes that migrations can be
> done. As stated in [1]:

Right, it's actually explicit - thanks. So, no need to bother :)

> On the other hand, having the possibility to mix disk types and still be
> able to migrate if all attached disks support it would be on my list to
> implement, as it's also suggested in the disk design doc [2] as I understand

That would certainly be useful.

Cheers,
Phil
