#
Whoops. A critical error has occurred. This is most likely a bug in Qubes Manager
FileNotFoundError: [Errno 2]
No such file or directory
at line 9
of file /usr/bin/qubes-qube-manager
#
Line 9 reads: load_entry_point('qubesmanager==4.0.16', 'console_scripts', 'qubes-qube-manager')()
Ok, so the weird thing is that this works fine half the time. On half of my boot ups, I don't encounter this problem. So if there is no such file or directory, it's not there half the time. qubes.xml looks good (to my untrained eyes), and df -h shows nothing at more than 1% utilization except for /dev/nvme0n1p1 mounted on /boot/efi which is 56% of 200MB. nvme0n1p1 is, I believe, the GPT table?
I'm worried about coming to rely on this installation if at some point the error doesn't go away every other reboot and becomes permanent. Am trying updates now--maybe that will help.
Guy
Updating the software in dom0 doesn't make the problem disappear, though now the main error message is:
QubesDaemonCommunicationError: Failed to connect to qubesd service: [Errno 2] No such file or directory
Thanks awokd! I'll give these a try next time I run into the problem
Ok, so on my next reboot, it ran into this problem again. I made a copy of the journalctl log and tried to restart qubesd, to no effect.
In the attached file, jnlctlErr.txt, if you scroll down to 09:24:43, I think you can see where the Qubes OS daemon fails. It is immediately preceded by the 1d.2 PCI device worker failing, suggesting that something about this failure is preventing the daemon from starting (which occurs below the blank line I added to the log). 1d.2 is a PCI Bridge, Intel Corp Device a332. No idea what exactly this is or how to find out (not a hardware person).
One thing I thought of is the fact that there's a PS/2 card in the machine to which a PS/2 keyboard & mouse are attached. Neither has ever worked in Qubes (though they worked in Windows), so maybe that's what's triggering the problem? Will do some testing.
When I attempt to start qubes daemon w/ sudo systemctl restart qubesd, journalctl log shows other errors. The qubes daemon doesn't get started and I can't use the system.
What I can do is reboot. And about every other time, Qubes comes up and is fine. My concern is that at some point it'll stop doing this, so I'd really like to figure out how to solve this problem.
Guy
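A minimal sketch of how one might check the daemon's state when this happens. The second error message says the connection to the qubesd service failed with "No such file or directory", which is what a client sees when the socket it connects to is absent. The socket path below is an assumption (it is not stated anywhere in this thread), and `socket_state` is just an illustrative helper name.

```shell
#!/bin/sh
# Assumption: qubesd listens on this UNIX socket; adjust the path if yours differs.
QUBESD_SOCKET="${QUBESD_SOCKET:-/var/run/qubesd.sock}"

# Print "present" if the given path is a UNIX socket, otherwise "missing".
socket_state() {
    if [ -S "$1" ]; then
        echo present
    else
        echo missing
    fi
}

# Usage (run in dom0):
#   socket_state "$QUBESD_SOCKET"
#   systemctl status qubesd     # is the unit active, failed, or still starting?
#   journalctl -b -u qubesd     # this boot's messages from the qubesd unit
```

If the socket is missing while `systemctl status qubesd` shows the unit as failed or timed out, the Qubes Manager error is only a symptom and the journal for the unit is the place to look.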
Looking at the relevant errors, in context (and the time between them):
...
Sep 13 09:20:23 localhost kernel: usb 1-10.1: New USB device found, idVendor=413c, idProduct=2002
Sep 13 09:20:23 localhost kernel: usb 1-10.1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Sep 13 09:20:23 localhost kernel: usb 1-10.1: Product: Dell USB Keyboard Hub
Sep 13 09:20:23 localhost kernel: usb 1-10.1: Manufacturer: Dell
Sep 13 09:20:23 localhost kernel: input: Dell Dell USB Keyboard Hub as /devices/pci0000:00/0000:00:14.0/usb1/1-10/1-10.1/1-10.1:1.0/0003:413C:2002.0001/input/input3
Sep 13 09:20:23 localhost kernel: hid-generic 0003:413C:2002.0001: input,hidraw0: USB HID v1.10 Keyboard [Dell Dell USB Keyboard Hub] on usb-0000:00:14.0-10.1/input0
Sep 13 09:20:23 localhost kernel: input: Dell Dell USB Keyboard Hub as /devices/pci0000:00/0000:00:14.0/usb1/1-10/1-10.1/1-10.1:1.1/0003:413C:2002.0002/input/input4
Sep 13 09:20:23 localhost kernel: usb 4-3: new low-speed USB device number 2 using ohci-pci
...
Sep 13 09:20:23 localhost kernel: hid-generic 0003:413C:2002.0002: input,hidraw1: USB HID v1.10 Device [Dell Dell USB Keyboard Hub] on usb-0000:00:14.0-10.1/input1
...
Sep 13 09:21:43 dom0 kernel: dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.2)
...
Sep 13 09:21:44 dom0 kernel: acpi PNP0C14:03: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
Sep 13 09:21:44 dom0 kernel: wmi_bus wmi_bus-PNP0C14:04: WQBC data block query control method not found
Sep 13 09:21:44 dom0 kernel: acpi PNP0C14:04: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
Sep 13 09:21:44 dom0 kernel: input: PC Speaker as /devices/platform/pcspkr/input/input12
Sep 13 09:21:44 dom0 kernel: input: Dell AIO WMI hotkeys as /devices/virtual/input/input13
...
Sep 13 09:21:44 dom0 kernel: dell-wmi 9DBB5994-A997-11DA-B012-B622A1EF5492: Dell descriptor buffer has invalid buffer length (32768)
Sep 13 09:21:44 dom0 kernel: dell-wmi 9DBB5994-A997-11DA-B012-B622A1EF5492: Detected Dell WMI interface version 1
Sep 13 09:21:44 dom0 kernel: input: Dell WMI hotkeys as /devices/platform/PNP0C14:04/wmi_bus/wmi_bus-PNP0C14:04/9DBB5994-A997-11DA-B012-B622A1EF5492/input/input14
Sep 13 09:21:44 dom0 systemd[1]: Found device /dev/disk/by-uuid/A482-5EDF.
Sep 13 09:21:44 dom0 systemd-udevd[1677]: Error calling EVIOCSKEYCODE on device node '/dev/input/event14' (scan code 0x150, key code 190): Invalid argument
...
Sep 13 09:21:49 dom0 kernel: snd_hda_codec_realtek hdaudioC0D0: Failed to find dell wmi symbol dell_micmute_led_set
...
Sep 13 09:22:00 dom0 kernel: input: HDA Intel PCH HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input21
Sep 13 09:22:43 dom0 systemd-udevd[1652]: seq 2961 '/devices/pci0000:00/0000:00:1d.2/0000:05:00.0/0000:06:00.0/usb4' is taking a long time
Sep 13 09:23:44 dom0 systemd[1]: systemd-udev-settle.service: Main process exited, code=exited, status=1/FAILURE
Sep 13 09:23:44 dom0 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-udev-settle comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Sep 13 09:23:44 dom0 systemd[1]: Failed to start udev Wait for Complete Device Initialization.
Sep 13 09:23:44 dom0 kernel: audit: type=1130 audit(1536848624.020:69): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-udev-settle comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Sep 13 09:23:44 dom0 systemd[1]: systemd-udev-settle.service: Unit entered failed state.
Sep 13 09:23:44 dom0 systemd[1]: systemd-udev-settle.service: Failed with result 'exit-code'.
...
Sep 13 09:23:44 dom0 systemd-logind[2748]: Watching system buttons on /dev/input/event14 (Dell WMI hotkeys)
Sep 13 09:23:44 dom0 systemd-logind[2748]: Watching system buttons on /dev/input/event13 (Dell AIO WMI hotkeys)
...
Sep 13 09:23:45 dom0 qmemman.systemstate[2758]: stat: xenfree=29212956823 memset_reqs=[]
Sep 13 09:24:43 dom0 systemd-udevd[1652]: seq 2961 '/devices/pci0000:00/0000:00:1d.2/0000:05:00.0/0000:06:00.0/usb4' killed
Sep 13 09:24:43 dom0 systemd-udevd[1652]: worker [1674] terminated by signal 9 (Killed)
Sep 13 09:24:43 dom0 systemd-udevd[1652]: worker [1674] failed while handling '/devices/pci0000:00/0000:00:1d.2/0000:05:00.0/0000:06:00.0/usb4'
#pci device 1d.2 is: 00:1d.2 PCI bridge: Intel Corporation Device a332 (rev f0), which I imagine might be related to: 00:17.0 SATA controller: Intel Corporation Device a352 (rev 10)
Sep 13 09:25:14 dom0 systemd[1]: qubesd.service: Start operation timed out. Terminating.
Sep 13 09:25:14 dom0 systemd[1]: Failed to start Qubes OS daemon.
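To pull the same lines out of a full journal without scrolling, something like the following might help. It is only a sketch: the patterns are copied from the messages in the excerpt above, and the function names `boot_stalls` and `stalled_devices` are made up for illustration.

```shell
#!/bin/sh
# Keep only journal lines that point at a stalled or failed boot step.
# The patterns are taken from the messages in the excerpt above.
boot_stalls() {
    grep -E 'is taking a long time|terminated by signal|failed while handling|Failed to start|Start operation timed out'
}

# Print the sysfs path of each device that udev reported as slow.
stalled_devices() {
    sed -n "s/.*seq [0-9]* '\(.*\)' is taking a long time.*/\1/p"
}

# Usage (run in dom0):
#   journalctl -b --no-pager | boot_stalls
#   journalctl -b --no-pager | stalled_devices
#   lspci -k -s 06:00.0    # the last PCI address in the stalled path above
```

Note that the stalled path does not end at the 1d.2 bridge itself: it continues through 05:00.0 and 06:00.0 to a USB bus (usb4), so the device sitting behind the bridge is probably the more interesting one to identify.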
If this guy is correct https://bugs.freedesktop.org/show_bug.cgi?id=75875#c1 about `you can safely ignore these errors`, then the following workaround attempts are probably not going to make any difference (but hey, I tried):
What's the output of `lsmod|grep -i wmi`? I'm guessing there should be something like `dell_wmi`, which means it should be possible to blacklist it (even if only temporarily) with something like `modprobe.blacklist=dell_wmi` as a kernel parameter (i.e. what `cat /proc/cmdline` shows). Or should it be dell-wmi, that is, with a dash instead of an underscore? There's a list of ways to do that here: https://askubuntu.com/questions/110341/how-to-blacklist-kernel-modules
But if you're using UEFI, that would mean appending that to the `kernel=` line (of the default= kernel) in /boot/efi/EFI/qubes/xen.cfg and then rebooting. However, I do recommend having a working copy first, as a backup, just in case your xen.cfg modifications render the system unbootable. For inspiration on how to do that, maybe see here: https://groups.google.com/d/msg/qubes-users/CZ5vMNL_c7k/btiRvk9eBAAJ
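A rough sketch of how the blacklist suggestion could be checked before and after editing. Two things here are assumptions rather than facts from this thread: that xen.cfg is an INI-style file with a `default=` entry naming a `[section]` that holds the `kernel=` line, and that the module is really called `dell_wmi` (lsmod lists module names with underscores). The helper names `is_blacklisted` and `default_kernel_line` are invented for this example.

```shell
#!/bin/sh
# Print "yes" if module $1 is listed in a modprobe.blacklist= parameter
# on the kernel command line read from stdin, otherwise "no".
is_blacklisted() {
    if tr ' ' '\n' | grep '^modprobe\.blacklist=' | tr '=,' '\n\n' | grep -qx "$1"; then
        echo yes
    else
        echo no
    fi
}

# Print the kernel= line of the section named by default= in a xen.cfg on stdin.
# Assumes the INI-style layout described in the lead-in.
default_kernel_line() {
    awk '
        /^default=/ { def = substr($0, 9) }                      # text after "default="
        /^\[/       { section = substr($0, 2, length($0) - 2) }  # strip [ and ]
        /^kernel=/ && section == def { print }
    '
}

# Usage (run in dom0):
#   lsmod | grep -i wmi                      # is a Dell WMI module loaded at all?
#   sudo cp /boot/efi/EFI/qubes/xen.cfg /boot/efi/EFI/qubes/xen.cfg.bak
#   sudo cat /boot/efi/EFI/qubes/xen.cfg | default_kernel_line
#   is_blacklisted dell_wmi < /proc/cmdline  # after the reboot: did it take effect?
```

A plain file copy like the one above only helps if you can still reach the partition from rescue media; the separate boot entry described in the linked thread is the safer fallback.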
If you don't want to blacklist that dell_wmi (guessing) module, or can't, then maybe consider temporarily commenting out this whole block of lines:
# Dell Latitude microphone mute
evdev:name:Dell WMI hotkeys:dmi:bvn*:bvr*:bd*:svnDell*:pnLatitude*
# Dell Precision microphone mute
evdev:name:Dell WMI hotkeys:dmi:bvn*:bvr*:bd*:svnDell*:pnPrecision*
KEYBOARD_KEY_150=f20 # Mic mute toggle, should be micmute
in dom0 file: /usr/lib/udev/hwdb.d/60-keyboard.hwdb
which would mean that some multimedia keys won't work but it should also not stall your boot process. Inspiration for this solution is from: https://ubuntuforums.org/showthread.php?t=2250210&p=13153308#post13153308
(or maybe more/different lines need to be blacklisted? unsure)
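For locating that block and making the edit take effect, a sketch along these lines may be useful. The helper name `dell_wmi_entries` is made up. The two commands at the end are the standard systemd tools for rebuilding and re-applying the hardware database, since edits to .hwdb files are not picked up until the binary database is regenerated.

```shell
#!/bin/sh
# Show the Dell WMI hotkey match lines, with line numbers, from a hwdb file on stdin.
dell_wmi_entries() {
    grep -n 'evdev:name:Dell WMI hotkeys'
}

# Usage (run in dom0):
#   dell_wmi_entries < /usr/lib/udev/hwdb.d/60-keyboard.hwdb
#   # after editing the file, rebuild the binary hwdb and re-apply it:
#   sudo systemd-hwdb update
#   sudo udevadm trigger
```

Keep in mind that files under /usr/lib/udev/hwdb.d belong to a package, so a later update may silently restore the original lines.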
Another solution would be to use a newer kernel (like 4.18.7 from the unstable repo; see file /etc/yum.repos.d/qubes-dom0.repo if you want to set `enabled = 1` under section [qubes-dom0-unstable]). Before you do, I still recommend having another UEFI entry with the normal kernel(s), just in case the new one will not boot at all (though unlikely), so you can use your BIOS boot menu to select between the two (though be aware that the currently running entry will be updated, xen.cfg-wise, when you `sudo qubes-dom0-update` to a newer kernel). However, I do note that your UEFI partition is only 200MiB, so you might need to copy only the running kernel (the one referenced by `default=`) instead of everything, or you may not have enough space on it. This only makes sense in this context: https://groups.google.com/d/msg/qubes-users/CZ5vMNL_c7k/btiRvk9eBAAJ
Thanks awokd and Marcus! The points you made got me to unplug the non-working PS/2 keyboard and mouse on my computer (which plug into a PCI card). I've rebooted 5 times since and have not run into an error. So it looks like something about the PS/2 peripherals was causing the problem. Which of course leads to the next questions: why does this cause problems, and why don't the keyboard & mouse work? Well, questions for another thread.