Marek Marczykowski-Górecki writes:
> Strace can be helpful, see
> https://github.com/QubesOS/qubes-issues/issues/4191
> Do you know what exact action triggers this problem? Given the trace
> in the above mentioned issue, I'd guess some network setting operation.
I updated, rebooted, started a few of my usual app qubes, then noticed the CPU wasn't going idle as expected. Haven't rebooted again since then, so I haven't had a chance yet to check exactly when the problem starts. But in the meantime, here's a trace:
$ sudo strace -fp $(pgrep xenstored) 2>&1 | grep -v "lseek\|read\|fcntl\|write\|ioctl\|poll"
strace: Process 2248 attached
accept(3, NULL, NULL) = 8
open("/var/lib/xenstored/tdb.0x2072860", O_WRONLY|O_CREAT|O_TRUNC, 0640) = 19
open("/var/lib/xenstored/tdb.0x2072860", O_RDWR) = 20
fstat(20, {st_mode=S_IFREG|0640, st_size=655360, ...}) = 0
close(19) = 0
rename("/var/lib/xenstored/tdb.0x2072860", "/var/lib/xenstored/tdb") = 0
close(21) = 0
unlink("/var/lib/xenstored/tdb.0x2072860") = -1 ENOENT (No such file or directory)
close(8) = 0
accept(3, NULL, NULL) = 8
open("/var/lib/xenstored/tdb.0x20721e0", O_WRONLY|O_CREAT|O_TRUNC, 0640) = 19
open("/var/lib/xenstored/tdb.0x20721e0", O_RDWR) = 21
fstat(21, {st_mode=S_IFREG|0640, st_size=655360, ...}) = 0
close(19) = 0
rename("/var/lib/xenstored/tdb.0x20721e0", "/var/lib/xenstored/tdb") = 0
close(20) = 0
unlink("/var/lib/xenstored/tdb.0x20721e0") = -1 ENOENT (No such file or directory)
close(8) = 0
accept(3, NULL, NULL) = 8
open("/var/lib/xenstored/tdb.0x20721e0", O_WRONLY|O_CREAT|O_TRUNC, 0640) = 19
open("/var/lib/xenstored/tdb.0x20721e0", O_RDWR) = 20
fstat(20, {st_mode=S_IFREG|0640, st_size=655360, ...}) = 0
close(19) = 0
accept(3, NULL, NULL) = 19
open("/var/lib/xenstored/tdb.0x2074a50", O_WRONLY|O_CREAT|O_TRUNC, 0640) = 26
open("/var/lib/xenstored/tdb.0x2074a50", O_RDWR) = 27
fstat(27, {st_mode=S_IFREG|0640, st_size=655360, ...}) = 0
close(26) = 0
rename("/var/lib/xenstored/tdb.0x20721e0", "/var/lib/xenstored/tdb") = 0
close(21) = 0
unlink("/var/lib/xenstored/tdb.0x20721e0") = -1 ENOENT (No such file or directory)
close(8) = 0
close(27) = 0
unlink("/var/lib/xenstored/tdb.0x2074a50") = 0
open("/var/lib/xenstored/tdb.0x2072860", O_WRONLY|O_CREAT|O_TRUNC, 0640) = 8
open("/var/lib/xenstored/tdb.0x2072860", O_RDWR) = 21
fstat(21, {st_mode=S_IFREG|0640, st_size=655360, ...}) = 0
close(8) = 0
accept(3, NULL, NULL) = 8
open("/var/lib/xenstored/tdb.0x20742f0", O_WRONLY|O_CREAT|O_TRUNC, 0640) = 26
open("/var/lib/xenstored/tdb.0x20742f0", O_RDWR) = 27
fstat(27, {st_mode=S_IFREG|0640, st_size=655360, ...}) = 0
close(26) = 0
rename("/var/lib/xenstored/tdb.0x20742f0", "/var/lib/xenstored/tdb") = 0
close(20) = 0
unlink("/var/lib/xenstored/tdb.0x20742f0") = -1 ENOENT (No such file or directory)
close(21) = 0
unlink("/var/lib/xenstored/tdb.0x2072860") = 0
close(8) = 0
open("/var/lib/xenstored/tdb.0x2072860", O_WRONLY|O_CREAT|O_TRUNC, 0640) = 8
open("/var/lib/xenstored/tdb.0x2072860", O_RDWR) = 20
fstat(20, {st_mode=S_IFREG|0640, st_size=655360, ...}) = 0
close(8) = 0
^C
That trace covers about 5 seconds. Without filtering out read, write, and the other calls excluded by the grep above, I get the typical trace that seberm mentioned in issue #4191.
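For anyone comparing traces, a per-syscall tally makes the open/rename/unlink churn on the tdb files easier to see than the raw log. This is only a sketch: the function name summarize_trace and the file name trace.txt are made up for illustration, and it assumes the trace was saved as plain strace output like the one above.

```shell
# Hypothetical helper: count syscalls by name in saved strace output.
# Reads the trace on stdin and prints "<count> <syscall>", most frequent first.
summarize_trace() {
  awk '
    # Drop the "[pid N] " prefix that strace -f adds for child processes.
    { sub(/^\[pid +[0-9]+\] /, "") }
    # Count only lines that begin with "name(", which skips "strace: ..." and "^C".
    match($0, /^[a-z_0-9]+\(/) { n[substr($0, 1, RLENGTH - 1)]++ }
    END { for (s in n) print n[s], s }
  ' | sort -rn
}

# Usage (trace.txt is an assumed file name holding the saved trace):
#   summarize_trace < trace.txt
```

strace can also produce a summary itself with -c, but that replaces the per-call log, so a post-processing step like this is handy when the full trace is still wanted.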
To mitigate the bug, I attached and paused xenstored using gdb, and the CPU went idle. I leave it paused, but temporarily resume it when I need to start or stop a qube, use qvm-run, etc. That's how I've been getting by for the past week.
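The pause-and-resume routine can also be sketched without gdb by sending SIGSTOP and SIGCONT. This is a substitute technique, not what I actually did, and I'm assuming a stopped xenstored behaves the same as one held in gdb. The function names, the KILL_CMD variable, and the qube name "work" are made up for illustration.

```shell
# Sketch only: suspend and resume a process with signals instead of gdb.
# KILL_CMD is a hypothetical knob; set it to "sudo kill" for root-owned
# processes such as xenstored in dom0.
KILL_CMD=${KILL_CMD:-kill}

pause_proc()  { $KILL_CMD -STOP "$1"; }   # suspend the process
resume_proc() { $KILL_CMD -CONT "$1"; }   # let it run again

# Resume the process, run one command, then suspend it again.
# Returns the exit status of the command that was run.
with_resumed() {
  wr_pid=$1; shift
  resume_proc "$wr_pid" || return 1
  "$@"; wr_status=$?
  pause_proc "$wr_pid"
  return "$wr_status"
}

# Usage (assumed qube name "work"):
#   KILL_CMD="sudo kill"
#   pause_proc "$(pgrep -x xenstored)"
#   with_resumed "$(pgrep -x xenstored)" qvm-start work
```

One caveat with the wrapper: it suspends the process again as soon as the command returns, so anything that keeps talking to xenstored after that point would block until the next resume.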