Ganeti 3.0 with Qemu 5 (on Debian 11)?

Thomas Rieschl

Sep 5, 2021, 8:43:11 AM
to gan...@googlegroups.com
Hi there!

I've been trying to get a new node to join the cluster. The current
cluster runs Ganeti 3 from Debian buster-backports; its Qemu version is
_not_ from backports.
The new node has Debian 11 installed, so its Qemu version is 5.2. On
that node the instance would not start because of various deprecation
warnings.

Is Ganeti supported on version 5 of Qemu?
I _think_ I also tried Qemu5 from buster-backports with the same
deprecations.


I managed to get the instance running with the following tricks:

The first deprecation is because of the -usbdevice param:
"Could not start instance 'test-buster.test.local': Hypervisor error:
Failed to start instance test-buster.test.local: exited with exit code 1
(kvm: -usbdevice tablet: '-usbdevice' is deprecated, please use '-device
usb-...' instead"

I tried hacking
/usr/share/ganeti/3.0/ganeti/hypervisor/hv_kvm/__init__.py to use
"-device usb-tablet" instead. That resolved that issue, but another one
remained:

In the logfile: /var/log/ganeti/kvm/test-buster.test.local.log I got this:
"kvm: -vnc :5123: keymap include files are not supported any more"

Removing the keymap param from the instance resolved that (I had "de" in
there). Easy one..

Thanks for your help!

If I should open an issue on GitHub instead, just tell me :)

Regards,
Thomas

Sascha Lucas

Sep 6, 2021, 3:06:44 AM
to gan...@googlegroups.com
Hi Thomas,

On Sun, 5 Sep 2021, Thomas Rieschl wrote:

> When trying to join a new node which has Debian 11 installed and therefore
> the Qemu version is 5.2 the instance would not start because of different
> deprecation warnings.

Thanks a lot for discovering this issue.

> Is Ganeti supported on version 5 of Qemu?

Yes. Ganeti has been tested to work on Debian Bullseye. Sadly the tests
(qa-suite) do not seem to cover every aspect, such as the VNC/keymap
combination in your case.

> The first deprecation is because of the -usbdevice param:
> "Could not start instance 'test-buster.test.local': Hypervisor error: Failed
> to start instance test-buster.test.local: exited with exit code 1 (kvm:
> -usbdevice tablet: '-usbdevice' is deprecated, please use '-device usb-...'
> instead"

I think this one is "just" a warning. Qemu still starts with "-usbdevice
tablet". However this deprecation should be addressed, too.

> In the logfile: /var/log/ganeti/kvm/test-buster.test.local.log I got this:
> "kvm: -vnc :5123: keymap include files are not supported any more"
>
> Removing the keymap param from the instance resolved that (I had "de" in
> there). Easy one..

That is interesting. I can confirm that this bug is triggered when VNC is
used and a keymap is supplied.

> If I should open an issue on GitHub instead, just tell me :)

Yes please, any help is highly appreciated. You may even investigate the
reasons behind this change inside Qemu and/or propose a PR on how to
handle this inside Ganeti.

Thanks, Sascha.

Thomas Rieschl

Sep 7, 2021, 10:28:03 AM
to gan...@googlegroups.com
Hi,
You were right, the keymap error was the culprit. The usbdevice warning
just got me distracted. After fixing the keymap issue, the deprecation
wasn't even shown anymore because the output of the kvm process is only
shown when it returns a non-zero exit code.

As a side-note: the keymap error is not displayed in the error output
because of the "-D" param in the kvm invocation. Maybe that logfile
should be printed when kvm exits with != 0? (or at least just the
logfile path).
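
A minimal sketch of that suggestion, not Ganeti's actual code: the helper
name `run_and_report` and its parameters are hypothetical. It runs a
command and, on a non-zero exit code, raises an error that carries the
captured stderr, the log file path and the last lines of the log file, so
a message that was written only to the log would not be lost.

```python
import subprocess
from pathlib import Path


def run_and_report(cmd, logfile, tail_lines=20):
    """Run cmd; on failure raise an error that names the log file.

    Hypothetical helper for illustration only. The error message contains
    the exit code, the captured stderr, the log path and the log's tail.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return result

    log_path = Path(logfile)
    tail = ""
    if log_path.is_file():
        # Keep only the last few lines so the error stays readable.
        lines = log_path.read_text(errors="replace").splitlines()
        tail = "\n".join(lines[-tail_lines:])

    raise RuntimeError(
        "exited with exit code %d (%s); log file %s:\n%s"
        % (result.returncode, result.stderr.strip(), log_path, tail)
    )
```

With this shape, the "-usbdevice" warning on stderr and the real keymap
error from the log file would show up in the same error message.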

I also looked up when the -usbdevice deprecation was introduced [1] (it
was around Qemu v2.10) and discovered that it was un-deprecated in Qemu
v6 [2], so nothing to be done here.

>
>> If I should open an issue on GitHub instead, just tell me :)
>
> Yes please, any help is highly appreciated. You may even investigate on
> the reasons inside Qemu behind this change and/or propose a PR on how to
> handle this inside Ganeti.

Will try, but I'm a Python noob ;)
Is there a minimum required Qemu version for Ganeti? When providing a PR
should I just check if Qemu > v4 [3] or just drop the whole "include"
fallback stuff [4]?


Best regards,
Thomas

>
> Thanks, Sascha.
>

[1]
https://github.com/qemu/qemu/commit/a358a3af4558a24398a541951cad7a6c458df72b
[2]
https://github.com/qemu/qemu/commit/6db34277e3b3071707a3a20afb82176e4f229b8f
[3]
https://github.com/qemu/qemu/commit/2a7bece653913a962a12672eb43aa796aea5ea23
[4]
https://github.com/ganeti/ganeti/blob/master/lib/hypervisor/hv_kvm/__init__.py#L1897-L1900

Georg Faerber

Sep 7, 2021, 1:34:33 PM
to gan...@googlegroups.com
Hi all,

FWIW, this was also reported very recently via Debian.

The following mail was sent to the package tracker -- I'm not sure if
there is a public tracker of these mails, which is why I'm dumping the
complete mail in here.

Cheers,
Georg

----- Forwarded message from Santiago Garcia Mantinan <ma...@debian.org> -----

Date: Thu, 2 Sep 2021 13:05:38 +0200
From: Santiago Garcia Mantinan <ma...@debian.org>
To: gan...@packages.debian.org
Subject: Migrating from buster to bullseye, not so easy

Hi!

I have recently migrated a small ganeti cluster from buster to bullseye and
I wanted to share with you the problems I found, in case you want me to
open bugs or whatever for any of them.

This is in no way any rant or anything, I enjoy pretty much ganeti and in
buster everything was going perfect with it, thanks for your great job.

The first thing I did was fetch the package and read the NEWS in case there
was something I should be aware of, it was good to have a message there
saying that I could migrate from buster's version to bullseye, you also
stated the typical setup, both packages installed and then migrate and
remove the old one.

This was my first problem: a dist-upgrade would remove the old package, so I
tried a simple apt install ganeti and the same thing happened. I had had
those problems before with other python2-based stuff, so I went for the same
solution: apt install python-is-python2. After that, I could indeed
install both versions of ganeti on the same machine. Maybe we should add a
hint for this or some other solution for these cases?

After this, I started migrating all the guests to one node while I was
upgrading and rebooting the other nodes. It was bad to see that, once the
nodes were upgraded and rebooted, I couldn't bring the guests back to my
upgraded nodes. Ganeti would refuse to move them because there was a
problem; when I looked at the ganeti logs I could only see a message:

ganeti.errors.OpExecError: Could not pre-migrate instance prejitsi:
Failed to accept instance: Failed to start instance prejitsi: exited
with exit code 1 (kvm: -usbdevice tablet: '-usbdevice' is deprecated,
please use '-device usb-...' instead

This was completely misleading, as this message is only a warning, not an
error. The real error was:

kvm: -vnc 127.0.0.1:5142: keymap include files are not supported any more

I just couldn't see this anywhere until I looked at the kvm logs. I don't
know why I missed it; now I see it was in Ganeti's logs as well.

Anyway, the error was caused by me having the keymap set to Spanish, so what
I did was:

gnt-instance modify -H keymap= guest

for every guest, but having them change the keymap meant a reboot of all
guests.
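
For clusters with many guests, this per-guest step could be scripted. The
sketch below is an illustration under assumptions: the function names are
made up, and it only builds the `gnt-instance modify` commands shown above
so they can be reviewed before anything is run. `list_instances` relies on
`gnt-instance list --no-headers -o name` and has to run on the master node.

```python
import subprocess


def clear_keymap_commands(instances):
    """Return one 'gnt-instance modify -H keymap=' command per instance."""
    return [
        ["gnt-instance", "modify", "-H", "keymap=", name]
        for name in instances
    ]


def list_instances():
    """Ask Ganeti for all instance names (run this on the master node)."""
    out = subprocess.run(
        ["gnt-instance", "list", "--no-headers", "-o", "name"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return out.split()


def print_plan(instances):
    """Dry run: print the commands instead of executing them."""
    for cmd in clear_keymap_commands(instances):
        print(" ".join(cmd))
```

Calling `print_plan(list_instances())` prints the plan. As noted above, the
new setting only takes effect once each guest has been rebooted.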

Having done that, I thought I had finished: all the nodes were updated to
bullseye and cleaned, the cluster was running OK and everything looked fine.
So I started moving the guests to their default node and... guests
started to freeze as they reached the destination node :-(

When I tried to see what had happened to those guests by connecting to the
console, I got:

# gnt-instance console sid
Instance sid is paused, unpausing

Further investigation of what had happened revealed:

# cat /var/log/ganeti/kvm/sid.log
kvm: Could not open '/var/run/ganeti/instance-disks/sid:0': Permission denied
# ls -l /var/run/ganeti/instance-disks/sid:0
lrwxrwxrwx 1 root root 11 sep 2 12:48 /var/run/ganeti/instance-disks/sid:0 -> /dev/drbd11
# ls -l /dev/drbd*
brw-rw---- 1 root disk 147, 0 sep 2 12:50 /dev/drbd0
brw-rw---- 1 root disk 147, 1 sep 2 12:50 /dev/drbd1
brw-rw---- 1 root disk 147, 10 sep 2 12:50 /dev/drbd10
brw-rw---- 1 root disk 147, 11 sep 2 12:48 /dev/drbd11
# id sid
uid=123(sid) gid=105(kvm) grupos=105(kvm)

I run the machine as user sid; of course user sid cannot open the drbd
device, and I don't think it should be able to either.

To test whether this was the real problem, I changed the group of the
device from disk to kvm on the secondary node of sid, and the migration
then went through without any problem.
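
The permission problem can be reproduced on paper with the classic
owner/group/other check. This is a simplified sketch (it ignores ACLs,
capabilities and root's override); the function name is hypothetical, and
gid 6 for the disk group is an assumption, since the listing above only
shows the group name.

```python
import stat


def can_open_rw(mode, owner_uid, owner_gid, uid, gids):
    """Simplified Unix check: may (uid, gids) open the file read-write?

    Exactly one permission class applies: owner, then group, then other.
    """
    if uid == owner_uid:
        needed = stat.S_IRUSR | stat.S_IWUSR
    elif owner_gid in gids:
        needed = stat.S_IRGRP | stat.S_IWGRP
    else:
        needed = stat.S_IROTH | stat.S_IWOTH
    return (mode & needed) == needed


DISK_GID = 6   # assumption: not shown in the listing above
KVM_GID = 105  # from "id sid"
SID_UID = 123  # from "id sid"

# /dev/drbd11 is brw-rw---- root:disk, user sid is only in group kvm.
print(can_open_rw(0o660, 0, DISK_GID, SID_UID, {KVM_GID}))  # False
# After changing the device group from disk to kvm.
print(can_open_rw(0o660, 0, KVM_GID, SID_UID, {KVM_GID}))   # True
```

The second call mirrors the test described above: once the device group is
kvm, the group permission bits apply to user sid and the open succeeds.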

I feel like the last problem I found is a bug and I'll try to submit it as
soon as possible unless you tell me not to, as for the other two, I don't
know if they qualify as such or not, maybe we should add some info on the
release notes or similar?

You tell me what to do.

Thanks in advance.
--
Manty/BestiaTester -> http://manty.net

----- End forwarded message -----

Sascha Lucas

Sep 8, 2021, 4:56:29 AM
to gan...@googlegroups.com
Hi Thomas,

On Tue, 7 Sep 2021, Thomas Rieschl wrote:

> As a side-note: the keymap error is not displayed in the error output because
> of the "-D" param in the kvm invocation. Maybe that logfile should be printed
> when kvm exits with != 0? (or at least just the logfile path).

That is true... and another enhancement that could be done.

> I also searched when the deprecation was introduced [1] (it was around Qemu
> v2.10) and discovered that it was un-deprecated in Qemu v6 [2], so nothing to
> be done here.

Thanks for the hint. Looks like one less problem.

> Will try, but I'm a Python noob ;)

NP. Me, too :-)

> Is there a minimum required Qemu version for Ganeti?

Not really. But we decided to support Ubuntu 18.04 in Ganeti-3.0 as the
oldest version. Speaking of Qemu, that is 2.11.

> When providing a PR
> should I just check if Qemu > v4 [3] or just drop the whole "include"
> fallback stuff [4]?

Ah, this is broken since Qemu>=4.0, we really should have noticed this
earlier in Ubuntu 20.04.

So it seems necessary to not create the runtime InstanceKeymapFile and
to pass the keymap value directly to Qemu's `-k` argument, if Qemu>=4.0.

However, Ganeti is all about continuous upgrades, so live migration from
Qemu<4 to >=4.0 must be considered, too. That leads to upgrading the kvm
runtime (see example in commit[a]). Hopefully this change would be safe
for live migration?

Thanks, Sascha.

[a] https://github.com/ganeti/ganeti/commit/80262fd405167802b782d0db6f932ba8a0c4de86

Sascha Lucas

Sep 8, 2021, 5:11:21 AM
to gan...@googlegroups.com
Hi Georg,

thanks for sharing your information. This is very useful feedback. If I
summarize:

* ganeti-2.16 Debian package got removed on dist-upgrade to bullseye, if
python-is-python2 is not installed: this is very Debian centric and should
be mentioned in the NEWS.Debian? Another way would be to upgrade to
Ganeti-3.0 using buster-backports and then dist-upgrade to bullseye.

* keymap include files are not supported any more: that's the one
discovered by Thomas here

* paused instance on bullseye live migration: this one is known[1] and
should be fixed in the Debian package. Alternatively, users can work around
it by adding their security users to the disk group.

So if, by any chance, you are able to fix the Debian package or reach out
to Apollon, please do.

[1] https://github.com/ganeti/ganeti/pull/1603

Georg Faerber

Sep 8, 2021, 10:28:17 AM
to gan...@googlegroups.com
Hi Sascha, all,

On 21-09-08 11:11:18, Sascha Lucas wrote:
> thanks for sharing your information. This is very useful feedback.

Thanks, and thank you as well.

> If I summarize:
>
> * ganeti-2.16 Debian package got removed on dist-upgrade to bullseye, if
> python-is-python2 is not installed: this is very Debian centric and should
> be mentioned in the NEWS.Debian? Another way would be to upgrade to
> Ganeti-3.0 using buster-backports and then dist-upgrade to bullseye.

That's tracked via [1], more specifically, Santiago Garcia Mantinan
reported how to fix this via [2].

> * keymap include files are not supported any more: that's the one
> discovered by Thomas here

ACK, this might deserve a note in NEWS as well.

> * paused instance on bullseye live migration: this one is known[1] and
> should be fixed in the Debian package. Alternatively users can work around,
> by adding their security users to the disk group.

That's tracked via [3], and it seems unfixed in the relevant Debian
context.

> So by chance, if you are able to fix the Debian package or reach out
> to Apollon, please do.

ACK, I'll try to reach out to Apollon and see what we're able to do.

Cheers,
Georg


[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=987907
[2] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=987907#56
[3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993920

Sascha Lucas

Sep 8, 2021, 10:53:23 AM
to gan...@googlegroups.com
On Wed, 8 Sep 2021, Sascha Lucas wrote:

> However, Ganeti is all about continuous upgrades, live migration from
> Qemu<4 to >=4.0 must be considered, too. Which leads to upgrading the kvm
> runtime (see example in commit[a]). Hopefully this change would be safe
> for live migration?

Please forget about this part. The `-k` option seems not part of the
`kvm_cmd` in `/var/run/ganeti/kvm-hypervisor/conf/instanceXXX.runtime`,
which I assumed blindly. So just fixing in `_ExecuteKVMRuntime` should be
enough.

I think it makes sense, to keep old behavior if Qemu<4 and switch to new
if >=4.0.
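
The version switch proposed here might look like the sketch below. It is
an illustration only, not the code from Ganeti: `parse_qemu_version`,
`keymap_args` and the `keymap_file` parameter are hypothetical names, and
the legacy branch simply assumes that the path of a previously written
keymap file is handed to `-k`.

```python
import re


def parse_qemu_version(version_output):
    """Extract (major, minor) from text like 'QEMU emulator version 5.2.0'."""
    match = re.search(r"version (\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no version found in %r" % version_output)
    return int(match.group(1)), int(match.group(2))


def keymap_args(keymap, qemu_version, keymap_file=None):
    """Return the '-k' arguments for the given keymap and Qemu version.

    Qemu >= 4.0: pass the keymap name (e.g. 'de') directly.
    Older Qemu:  pass the path of a keymap file (assumed to exist already).
    """
    if not keymap:
        return []
    if qemu_version >= (4, 0):
        return ["-k", keymap]
    if keymap_file is None:
        raise ValueError("Qemu < 4.0 needs a keymap file path")
    return ["-k", keymap_file]
```

For example, `keymap_args("de", parse_qemu_version("QEMU emulator version
5.2.0"))` yields `["-k", "de"]`, and an empty keymap yields no arguments
at all.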

Thanks, Sascha.

Sascha Lucas

Oct 8, 2021, 2:44:26 PM
to gan...@googlegroups.com
Hi,

just for the record: see PR1612[1]. Thanks to Thomas for reporting!

On Wed, 8 Sep 2021, Sascha Lucas wrote:

> I think it makes sense, to keep old behavior if Qemu<4 and switch to new
> if >=4.0.

No, I no longer think so. The old behavior is dropped entirely.

Sascha.

[1] https://github.com/ganeti/ganeti/pull/1612