Strange QMP behavior...


Dimitris Aragiorgis

Jan 20, 2014, 10:47:42 AM
to ganeti...@googlegroups.com
Hi all!


Recently we observed some strange TypeErrors in node-daemon.log related to
QMP message handling; specifically, the exceptions occurred inside
GetInstanceInfo().

Please do the following test:

1) Client A connects to qmp socket with socat
2) Client A gets the greeting message {"QMP": {"version": ...}}
3) Client A waits (select on the socket's fd)
4) Client B tries to connect to the *same* qmp socket with socat
5) Client B does *NOT* get any greeting message
6) Client B waits (select on the socket's fd)
7) Client B closes connection (kill socat)
8) Client A quits too
9) Client C connects to qmp socket
10) Client C gets *two* greeting messages!!!

This case is reproducible in Ganeti:
1) Allocator runs and invokes GetAllInstancesInfo()
2) GetInstanceInfo() creates a QMPConnection()
3) Meanwhile a client wants to get instance runtime info
4) GetInstanceInfo() creates a new QMPConnection(), but _Recv() blocks when
trying to read from the socket and a timeout exception is raised. This
is caught and re-raised as HypervisorError, which is *silently* caught
inside GetInstanceInfo()
5) First connection closes
6) When a third connection arrives, it gets *two* greetings!
7) The first greeting will be consumed inside connect()
8) The second will be consumed as response to qmp_capabilities execute command
9) info cpus gets {'return': {}}, which results in a TypeError.
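The off-by-one consumption in steps 7-9 can be sketched with a toy message
queue (purely illustrative; message contents and the real QMPConnection
logic differ):

```python
from collections import deque

# Toy simulation of the shifted message stream after a stale greeting
# is left behind; payloads are illustrative, not actual QMP replies.
server = deque([
    {"QMP": {"version": {}}},      # stale greeting from the dead client
    {"QMP": {"version": {}}},      # the greeting meant for this client
    {"return": {}},                # real reply to qmp_capabilities
    {"return": [{"CPU": 0}]},      # real reply to the cpus query
])

greeting = server.popleft()        # connect() consumes the *stale* greeting
caps_reply = server.popleft()      # ...so the "caps reply" is our greeting
cpus_reply = server.popleft()      # ...and info cpus gets the caps reply
print(cpus_reply)                  # {'return': {}} instead of a CPU list
```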

After digging into the qemu source, we found that the monitor's output
buffer is shared between clients. Only one client can interact with the
monitor; the others are considered blocked. Still, their greeting message
is written to the output buffer and a callback is registered to flush the
message to the socket when possible. This means that if a client closes
the connection before getting the greeting, this data remains in the
buffer and will be flushed upon a future connection.

To fix this on the Ganeti side, first of all, a finally: section that
closes the socket must be added for each QMPConnection(); this practically
leaves a smaller window for concurrent connections. Additionally, we
should parse as many greetings as are available when starting the
connection, and finally parse the output of the qmp_capabilities command.
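A rough sketch of both ideas (split_pending_greetings, get_info, and the
execute/close calls are illustrative stand-ins, not the actual Ganeti
API):

```python
import json

def split_pending_greetings(raw):
    """Parse everything received right after connect and return all
    pending greeting messages -- there may be more than one if earlier
    clients disconnected before reading theirs.  QEMU terminates each
    QMP message with CRLF."""
    greetings = []
    for line in raw.split(b"\r\n"):
        if not line:
            continue
        msg = json.loads(line)
        if "QMP" in msg:
            greetings.append(msg)
    return greetings

def get_info(make_connection):
    """Always close the monitor socket, even when a timeout is raised,
    to keep the window for concurrent connections as small as possible."""
    conn = make_connection()
    try:
        return conn.execute("query-cpus")  # hypothetical wrapper call
    finally:
        conn.close()

# Two greetings left on the socket would both be consumed up front:
raw = b'{"QMP": {"version": {}}}\r\n{"QMP": {"version": {}}}\r\n'
print(len(split_pending_greetings(raw)))   # 2
```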

Do you agree with all the above?

Should I proceed with a patch? Against stable 2.9?

Thanks,
dimara


PS: All of this has been tested with QEMU 1.7

Apollon Oikonomopoulos

Jan 20, 2014, 11:46:35 AM
to Dimitris Aragiorgis, ganeti...@googlegroups.com
Hi Dimitris,

On 17:47 Mon 20 Jan , Dimitris Aragiorgis wrote:
> Hi all!
>
>
> Recently we observed some strange TypeErrors in node-daemon.log related to
> QMP message handling. Specifically the exceptions were inside GetInstanceInfo().
>
> Please do the following test:
>
> 1) Client A connects to qmp socket with socat
> 2) Client A gets greeting message {"QMP": {"version": ..}
> 3) Client A waits (select on the socket's fd)
> 4) Client B tries to connect to the *same* qmp socket with socat
> 5) Client B does *NOT* get any greating message
> 6) Client B waits (select on the socket's fd)
> 7) Client B closes connection (kill socat)
> 8) Client A quits too
> 9) Client C connects to qmp socket
> 10) Client C gets *two* greeting messages!!!
>
> This case is reproducable in Ganeti:
> 1) Allocator runs and invokes GetAllInstancesInfo()

Doesn't the IAllocator run with the nodes locked?

> 2) GetInstanceInfo() creates a QMPConnection()
> 3) Meanwhile a client wants to get instance runtime info
> 4) GetInstanceInfo() creates a new QMPConnection() but _Recv() blocks when
> trying to read from socket and a timeout exception is raised. This
> is catched and raised as HypervisorError which is *silently* catched inside
> GetInstanceInfo()
> 5) First connection closes
> 6) When a third connection will arive it will get *two* greetings!
> 7) The first greeting will be consumed inside connect()
> 8) The second will be consumed as response to qmp_capabilities execute command
> 9) info cpus will get {'return': {}}!! which will result to TypeError.
>
> After digging qemu source we found that monitor's output buffer is
> shared between clients. Only one client can interact with the monitor.
> The others are considered as blocked. Still their greeting message is
> written to the output buffer and a callback is registered to flush the
> message to the socket when possible. This means that if a client closes
> the connection before getting the greeting, these data remain in the
> buffer and will be flushed upon a future connection.

This sounds like a qemu bug, and as such it should IMHO be discussed
with the qemu upstream first. Is it also present in qemu-1.6? Earlier
versions?
>
> To fix this on the Ganeti side, first of all, a finally: section that closes
> the socket must be added for each QMPConnection(), that practically leaves
> smaller timeslot for more than one concurrent connections. Additionally we
> should parse as many greetings available when starting the connection,
> and finally parse the output of qmp_capabilities command.

On top of that you would need a per-monitor lock to be safe, but even if
you do all of this, you will not be safe from external tools trying to
use the QMP socket (including the KVM daemon IIUC?).

Regards,
Apollon

Hrvoje Ribicic

Jan 22, 2014, 5:39:58 AM
to Apollon Oikonomopoulos, Dimitris Aragiorgis, Ganeti Development
On Mon, Jan 20, 2014 at 5:46 PM, Apollon Oikonomopoulos <apo...@gmail.com> wrote:
Hi Dimitris,

On 17:47 Mon 20 Jan     , Dimitris Aragiorgis wrote:
> Hi all!
>
>
> Recently we observed some strange TypeErrors in node-daemon.log related to
> QMP message handling. Specifically the exceptions were inside GetInstanceInfo().
>
> Please do the following test:
>
> 1) Client A connects to qmp socket with socat
> 2) Client A gets greeting message {"QMP": {"version": ..}
> 3) Client A waits (select on the socket's fd)
> 4) Client B tries to connect to the *same* qmp socket with socat
> 5) Client B does *NOT* get any greating message
> 6) Client B waits (select on the socket's fd)
> 7) Client B closes connection (kill socat)
> 8) Client A quits too
> 9) Client C connects to qmp socket
> 10) Client C gets *two* greeting messages!!!
>
> This case is reproducable in Ganeti:
> 1) Allocator runs and invokes GetAllInstancesInfo()

Doesn't the IAllocator run with the nodes locked?

It does, and with all of the nodes if opportunistic locking is not used, yet instance info calls do not require these locks, AFAIK.


> 2) GetInstanceInfo() creates a QMPConnection()
> 3) Meanwhile a client wants to get instance runtime info
> 4) GetInstanceInfo() creates a new QMPConnection() but _Recv() blocks when
> trying to read from socket and a timeout exception is raised. This
> is catched and raised as HypervisorError which is *silently* catched inside
> GetInstanceInfo()
> 5) First connection closes
> 6) When a third connection will arive it will get *two* greetings!
> 7) The first greeting will be consumed inside connect()
> 8) The second will be consumed as response to qmp_capabilities execute command
> 9) info cpus will get {'return': {}}!! which will result to TypeError.
>
> After digging qemu source we found that monitor's output buffer is
> shared between clients. Only one client can interact with the monitor.
> The others are considered as blocked. Still their greeting message is
> written to the output buffer and a callback is registered to flush the
> message to the socket when possible. This means that if a client closes
> the connection before getting the greeting, these data remain in the
> buffer and will be flushed upon a future connection.

This sounds like a qemu bug, and as such it should IMHO be discussed
with the qemu upstream first. Is it also present in qemu-1.6? Earlier
versions?

It should definitely be discussed with the qemu upstream, yet if this is reproducible with any version of qemu, Ganeti should provide a workaround.
Chances are that many users will be stuck with a certain qemu version for a while, and we don't want to get bug reports regarding this a year or two from now.
 
>
> To fix this on the Ganeti side, first of all, a finally: section that closes
> the socket must be added for each QMPConnection(), that practically leaves
> smaller timeslot for more than one concurrent connections. Additionally we
> should parse as many greetings available when starting the connection,
> and finally parse the output of qmp_capabilities command.

On top of that you would need a per-monitor lock to be safe, but even if
you do all of this, you will not be safe from external tools trying to
use the QMP socket (including the KVM daemon IIUC?).

This is a valid point, yet I believe that is what the socket-closing code was meant to try and mitigate.
Short of a qemu fix, I do not believe that we can be fully safe - only try to warn the user of the unfortunate interactions when extra greeting messages arrive.
 

Regards,
Apollon

Hrvoje Ribicic

Jan 22, 2014, 5:47:03 AM
to Dimitris Aragiorgis, Ganeti Development
I believe that the bug should be reported to the QEMU community first, to hear their response. As you have more details on how to do this and a reproduction case readily available, I hope you might be able to do it?
If not, I can do the deed as well.

But once this is registered as a bug, a patch would be welcome. If this happens early enough, even stable-2.8 could be a good target for the patch - I'll try to reproduce the bug there.
 

> Thanks,
> dimara
>
>
> PS: All this have been tested with QEMU 1.7

Thanks for the detailed analysis!

Cheers,
Riba

Apollon Oikonomopoulos

Jan 22, 2014, 6:03:33 AM
to Hrvoje Ribicic, Dimitris Aragiorgis, Ganeti Development
Hi Hrvoje,

On 11:39 Wed 22 Jan , Hrvoje Ribicic wrote:
> On Mon, Jan 20, 2014 at 5:46 PM, Apollon Oikonomopoulos
> <apo...@gmail.com>wrote:
>
> > > This case is reproducable in Ganeti:
> > > 1) Allocator runs and invokes GetAllInstancesInfo()
> >
> > Doesn't the IAllocator run with the nodes locked?
> >
>
> It does, and all of them if opportunistic locking is not used, yet instance
> info calls do not require these locks, AFAIK.

Yes, I forgot about that :)

> >
> > This sounds like a qemu bug, and as such it should IMHO be discussed
> > with the qemu upstream first. Is it also present in qemu-1.6? Earlier
> > versions?
> >
>
> It should definitely be discussed with the qemu upstream, yet if this is
> reproducible, with any version of qemu, Ganeti should provide a workaround.
> Chances are that many users will be stuck with a certain qemu version for a
> while, and we don't want getting bug reports regarding this a year or two
> from now.

Agreed. Personally I can't reproduce it with qemu-kvm 1.1.2 using the
test case (A connects, B connects, B disconnects, A disconnects, C
connects and gets only one greeting), but I can reproduce it with 1.6.0.

Regards,
Apollon

Apollon Oikonomopoulos

Jan 22, 2014, 6:15:59 AM
to Hrvoje Ribicic, Dimitris Aragiorgis, Ganeti Development
On 13:03 Wed 22 Jan , Apollon Oikonomopoulos wrote:
> >
> > It should definitely be discussed with the qemu upstream, yet if this is
> > reproducible, with any version of qemu, Ganeti should provide a workaround.
> > Chances are that many users will be stuck with a certain qemu version for a
> > while, and we don't want getting bug reports regarding this a year or two
> > from now.
>
> Agreed. Personally I can't reproduce it with qemu-kvm 1.1.2 using the
> test case (A connects, B connects, B disconnects, A disconnects, C
> connects and gets only one greeting), but I can reproduce it with 1.6.0.

Even worse, this seems to affect the human monitor as well.

Dimitris Aragiorgis

Jan 22, 2014, 9:03:02 AM
to Hrvoje Ribicic, Ganeti Development
Hi,

* Hrvoje Ribicic <ri...@google.com> [2014-01-22 11:47:03 +0100]:
Sorry for the delay. We were doing some digging in the QEMU source to
find out why this is happening. Yes, we were already planning to report
this misbehavior on qemu-devel today, so we will have news soon.

> But after this is very likely registered as a bug, a patch would be
> welcome. If this happens early enough, even stable-2.8 could be a good
> target for the patch - I'll try and reproduce the bug there.
>
>

Primarily I thought of 2.9 (and not 2.8) because this is the Debian
wheezy version, where qemu 1.7 is available too.

I was wondering if a quick and dirty workaround is the better way to go.
The interaction with QMP inside GetInstanceInfo() is somewhat "optional",
because cpu/memory values are also statically available. So if we caught
this TypeError there (and added a long comment), it wouldn't make much
difference. But there is also the QmpConnection() related to the SPICE
password..

A proper fix with the current code is more difficult, since we would have
to parse an unknown number of greeting messages upon qmp connect() until
we receive a socket.timeout, meaning that no other data is available
(since the socket is blocking). I think the cost would be unacceptable!

What we could do is flush the entire buffer after parsing the first
greeting message, and then execute "qmp_capabilities".
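As a sketch of that idea (illustrative only; the real QmpConnection code
differs), draining whatever is already buffered can be done with a short
non-blocking read. Here a local socketpair stands in for the QMP socket:

```python
import socket

def drain_pending(sock):
    """Read and discard everything currently buffered on the socket
    (e.g. stale greetings left by clients that never read theirs),
    without blocking on an empty buffer."""
    sock.setblocking(False)
    drained = b""
    try:
        while True:
            chunk = sock.recv(4096)
            if not chunk:          # peer closed the connection
                break
            drained += chunk
    except BlockingIOError:        # EAGAIN: nothing left to read
        pass
    finally:
        sock.setblocking(True)     # restore blocking mode for normal use
    return drained

# Demo: two stale greetings are flushed before talking to the monitor.
qmp, qemu = socket.socketpair()
qemu.sendall(b'{"QMP": {"version": {}}}\r\n{"QMP": {"version": {}}}\r\n')
stale = drain_pending(qmp)
print(stale.count(b'"QMP"'))       # 2
```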

> >
> > PS: All this have been tested with QEMU 1.7
> >

As Apollon said, this is NOT reproducible with QEMU 1.1, which means it
is certainly a bug introduced in later versions. Let's hope it gets fixed
upstream soon enough.. Since the code inside QEMU is shared, this
misbehavior can be observed with the human monitor as well (but it has no
real effect on Ganeti, since we do not parse any greeting messages there).

>
> Thanks for the detailed analysis!
>

:) So what do you think?

Cheers,
dimara


PS: Note that in 2.10, along with the hotplug implementation, the
QMP-related code has been refactored, since the MonitorSocket class was
introduced. So this will need a bit of effort on your side to
forward-port it.

