HCL - Dell Inspiron 15 5000 (5575) AMD Ryzen 5 2500U w/ Vega 8 Graphics

Claudia

Dec 18, 2019, 7:13:57 PM
to qubes...@googlegroups.com
This is R4.1 build 20191013

It works pretty well, definitely better than 4.0, but there are some weird boot issues. If I let it boot with everything as default, it will boot loop before reaching the disk password screen. I found I can get it to boot successfully if I add to the Xen commandline
noreboot=1 loglvl=all
and remove from the linux commandline
rhgb quiet rd.qubes.hide_all_usb

Still working on narrowing down which of those is/are responsible for fixing the problem (I can't figure out why any of them would).

Improvements since 4.0:
Screen power management works - brightness controls, and screen poweroff after inactivity (in 4.0
it would just blank but not power off)
Audio works, which it did not in 4.0 even after many days of troubleshooting
amdgpu works correctly - doesn't freeze when booting without nomodeset
Multimedia keys - not sure if they worked in 4.0 or not

Still working:
UEFI mode
wifi
touchpad
keyboard

Still NOT working:
Suspend/resume

Not tested (yet):
Legacy mode
HDMI audio & video
USB Qube
SD card reader
Microphone
Webcam
Wired networking

I'll try to do some more testing and update this thread when I have a chance. Just putting this out
there for now.

Qubes-HCL-Dell_Inc_-Inspiron_5575-20191218-015203.yml

Claudia

Dec 20, 2019, 9:29:55 AM
to qubes...@googlegroups.com
December 19, 2019 12:13 AM, "Claudia" <clau...@disroot.org> wrote:

> This is R4.1 build 20191013
>
> It works pretty well, definitely better than 4.0, but there are some weird boot issues. If I let it
> boot with everything as default, it will boot loop before reaching the disk password screen. I
> found I can get it to boot successfully if I add to the Xen commandline
> noreboot=1 loglvl=all
> and remove from the linux commandline
> rhgb quiet rd.qubes.hide_all_usb
>
> Still working on narrowing down which of those is/are responsible for fixing the problem (I can't
> figure out why any of them would).

Looks like rd.qubes.hide_all_usb is what's causing it to crash. When I remove it, it boots fine with the graphical splash and passphrase prompt. Another AMD Ryzen user mentioned having the same problem a while back; apparently it has something to do with AMD's IOMMU grouping of USB controllers.

I'm planning on installing kernel-latest and I'll test it again when I do.

brenda...@gmail.com

Dec 20, 2019, 1:05:28 PM
to qubes-users
On Friday, December 20, 2019 at 9:29:55 AM UTC-5, Claudia wrote:
> December 19, 2019 12:13 AM, "Claudia" <clau...@disroot.org> wrote:
>
>> This is R4.1 build 20191013
>>
>> It works pretty well, definitely better than 4.0, but there are some weird boot issues. If I let it
>> boot with everything as default, it will boot loop before reaching the disk password screen. I
>
> ...Looks like rd.qubes.hide_all_usb is what's causing it to crash. When I remove it, it boots fine with the graphical splash and passphrase prompt. Another AMD Ryzen user mentioned having the same problem a while back. Something about AMD's IOMMU grouping of USB controllers, or something.

Unfortunately, my understanding is that removing it exposes dom0 to the USB ports. Even if you have a sys-usb, dom0 is still exposed temporarily during boot.


B

Claudia

Dec 20, 2019, 4:13:29 PM
to brenda...@gmail.com, qubes-users
Thanks for the tip. I actually thought removing that parameter just simply disabled USB Qube functionality and attached all devices to dom0, but I guess that's only when sys-usb is not running. Once sys-usb is running, it takes over the USB controllers from dom0, I guess. It's just that they're exposed before sys-usb starts, in that case.

It would be nice to have working, but I've never had a USB Qube before, even on my old machine, so I haven't lost anything. I don't use a lot of USB devices, and it's not a big part of my threat model. I'll play around with it when I have a chance.

Claudia

Dec 21, 2019, 1:25:28 PM
to qubes...@googlegroups.com
I realized I had disabled autostart for all VMs including sys-usb to speed up boot time (systemctl disable sys-{net,usb,firewall,whonix}@qubesvm.service), and I hadn't actually run sys-usb at all since then. I decided to start sys-usb while the system was running, and everything went to hell: the screen froze, audio stopped, even the caps lock light wouldn't come on.

So this isn't limited to hide_all_usb, just USB controller passthru in general on this machine.

So I reinstalled 4.0.2-rc3, once again, this time without USB Qube, and everything works (except suspend/resume which doesn't work anywhere except Fedora 30, not even F29 iirc), including audio and amdgpu without nomodeset. All this time I was blaming it on the old-ness of 4.0, but it was hide_all_usb all along. The only reason I tried getting rid of hide_all_usb in 4.1 is because the nomodeset trick didn't work in 4.1 so I had to continue troubleshooting, whereas in 4.0 it worked and I moved on. And also grub makes it way more convenient to modify boot options.

So in summary, for 4.0 and 4.1 alike, everything works except suspend/resume, as long as you don't set up a USB Qube. I attached updated HCL reports to reflect this, and will update them as I do more testing.

I'm going to try and figure out some of this IOMMU grouping stuff and start another thread about this issue. But like I said, I never had a USB Qube before, so I'm not going to miss it.
Qubes-HCL-Dell_Inc_-Inspiron_5575-20191220-220355.yml
Qubes-HCL-Dell_Inc_-Inspiron_5575-20191221-021654.yml

Vít Šesták

Dec 22, 2019, 5:33:48 AM
to qubes-users
As far as I know/understand:

* At start, all the PCI devices are assigned to dom0.
* When a qube with an attached PCI device starts, dom0 assigns the PCI device to the qube, so it is no longer attached to dom0. It is never automatically re-attached to dom0 afterwards (until reboot). If it were, a malicious qube could, for example, flash malicious firmware to a USB device and then shut down in order to connect the malicious device to dom0.
* In order to have some additional protection, rd.qubes.hide_all_usb hides all USB devices. That is, the USB PCI device is attached to dom0, but Linux ignores it, maybe it blacklists related kernel modules.

The behavior you have observed suggests that both detached USB controller and ignored USB controller cause an issue. So, maybe the problem is not in the process of detaching a controller that is being used, but rather in not having the controller available.

Intel includes some USB controller in CPU and quick Googling suggests that so does AMD. Maybe there is some AMD-specific code in Linux kernel that expects the USB controller to be available for whatever weird reason. Yes, it sounds strange, but it is the least implausible explanation I was able to find.

Regards,
Vít Šesták 'v6ak'

Claudia

Dec 22, 2019, 9:17:28 AM
to Vít Šesták, qubes-users
December 22, 2019 10:33 AM, "Vít Šesták"
<groups-no-private-mail--con...@v6ak.com> wrote:

Thanks for the info. Yes, that sounds correct from what I could tell, too. More specifically, what rd.qubes.hide_all_usb actually does is look up the USB controllers with lspci and then, for each one, write its address to three sysfs files under /sys/bus/pci: the current driver's unbind, then pciback's new_slot and bind. So basically, it forces the Linux USB driver (xhci_pci, in my case) to detach from the USB controller, and then attaches the controller to Xen's pciback driver so that it can be used by sys-usb later (although I don't know if this step is actually necessary, since starting sys-usb without hide_all_usb works just fine).

I'm assuming this happens before udev trigger, which probes devices including USB devices. Maybe that's why the hook assigns pciback instead of just unbinding the USB driver: so udev doesn't see that the device has no driver and attempt to bind the USB driver to it again. Or maybe the act of binding pciback is what actually places the USB controller under IOMMU isolation by Xen (otherwise the USB controller could still perform a DMA attack even with no driver bound to it).

I don't know which step -- unbinding xhci_pci, or binding pciback -- actually causes the crash in this case. I can say, however, that one of them does cause an immediate crash, before sys-usb ever starts or has a chance to take over the USB controllers.

The same hook does the same thing for networking devices as well, so that those are never exposed. In my case this doesn't cause a problem because both network cards are their own devices on their own buses and have their own IOMMU groups, unlike my USB controllers.
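For anyone who wants to compare groupings on their own hardware, here is a minimal sketch that lists which PCI devices share an IOMMU group. It assumes the standard Linux sysfs layout under /sys/kernel/iommu_groups; as far as I understand, that directory is only populated when Linux itself drives the IOMMU, so it would probably have to be run while booted without Xen. The function name and the optional directory argument are made up for illustration.

```shell
#!/bin/sh
# List each PCI device together with its IOMMU group number.
# The optional argument overrides the sysfs directory (useful for a dry run
# against a fake tree); it defaults to the usual Linux location.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # the glob matched nothing
        group=${dev#"$root"/}          # strip the leading directory
        group=${group%%/*}             # keep only the group number
        echo "group $group: ${dev##*/}"
    done
}
```

Calling `list_iommu_groups` with no argument prints one "group N: address" line per device. If the USB controllers turn out to share a group with other devices, that would point towards the grouping theory; if each sits in a group of its own, it would point away from it.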

Here's the actual code from /usr/lib/dracut/modules.d/99qubes-pciback/qubes-pciback.sh

#!/usr/bin/sh

type getarg >/dev/null 2>&1 || . /lib/dracut-lib.sh

# Find all networking devices currenly installed...
HIDE_PCI="`lspci -mm -n | grep '^[^ ]* "02'|awk '{print $1}'`"

# ... and optionally all USB controllers...
if getargbool 0 rd.qubes.hide_all_usb; then
    HIDE_PCI="$HIDE_PCI `lspci -mm -n | grep '^[^ ]* "0c03'|awk '{print $1}'`"
fi

HIDE_PCI="$HIDE_PCI `getarg rd.qubes.hide_pci | tr ',' ' '`"

modprobe xen-pciback 2>/dev/null || :

# ... and hide them so that Dom0 doesn't load drivers for them
for dev in $HIDE_PCI; do
    BDF=0000:$dev
    if [ -e /sys/bus/pci/devices/$BDF/driver ]; then
        echo -n $BDF > /sys/bus/pci/devices/$BDF/driver/unbind
    fi
    echo -n $BDF > /sys/bus/pci/drivers/pciback/new_slot
    echo -n $BDF > /sys/bus/pci/drivers/pciback/bind
done
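
To see what the class filters in the script select without touching any device, the same grep/awk pipeline can be run on canned input. This is only a sketch: the function name is made up, the sample records are abbreviated to the first four fields of `lspci -mm -n` output, the network line is hypothetical, and the two USB lines use the vendor:device IDs 1022:15e0 and 1022:15e1 that lspci reports for this machine's USB controllers.

```shell
#!/bin/sh
# Print the bus addresses of devices whose PCI class starts with the given
# prefix, reading `lspci -mm -n` style records from stdin.
# "02" = network controllers, "0c03" = USB controllers (as in the hook).
select_class() {
    grep "^[^ ]* \"$1" | awk '{print $1}'
}

# Abbreviated sample records; the 0280 line is a made-up wifi card.
sample='01:00.0 "0280" "168c" "0042"
03:00.3 "0c03" "1022" "15e0"
03:00.4 "0c03" "1022" "15e1"'

printf '%s\n' "$sample" | select_class 0c03
```

With this sample the last line prints 03:00.3 and 03:00.4, one per line. The hook then prefixes each address with 0000: to form the BDF it writes into sysfs.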

As for USB controllers being on the CPU, yes that's what I found as well; all current CPUs bundle their integrated USB controllers right on the chip. I don't think code in the Linux kernel expects the USB controllers to be available. Rather I think it has to do with IOMMU grouping, which tends to be structured differently for AMD than Intel. I'm going to start a new thread about that here soon.

Claudia

Dec 22, 2019, 1:20:17 PM
to Vít Šesták, qubes...@googlegroups.com
December 22, 2019 5:45 PM, "Vít Šesták" <v...@v6ak.com> wrote:

> Hello,
>
> Dec 22, 2019 15:17:27 Claudia :
>
>> I don't know which step -- unbinding xhci_pci, or binding pciback -- actually causes the crash in
>> this case. I can say, however, that one of them does cause an immediate crash, before sys-usb ever
>> starts or has a chance to take over the USB controllers.
>

> That (and the specific script) is an interesting finding. Maybe it would be possible to run the
> commands one-by-one to see which one crashes the system.
>

I think I might try that sometime, just out of curiosity.
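
A rough sketch of how that could be done, one write per invocation, so that whatever was printed last before a freeze identifies the culprit. The three paths are the ones the hook script writes to; the function name and the optional sysfs-root argument (for dry runs on a fake directory tree) are inventions for illustration, and none of this has been tried on this machine.

```shell
#!/bin/sh
# Perform exactly ONE step of the hide sequence for one device.
# usage: hide_step unbind|new_slot|bind BDF [sysfs_root]
hide_step() {
    step=$1
    bdf=$2
    sysfs=${3:-/sys}   # override only for dry runs against a fake tree
    case $step in
        unbind)   target=$sysfs/bus/pci/devices/$bdf/driver/unbind ;;
        new_slot) target=$sysfs/bus/pci/drivers/pciback/new_slot ;;
        bind)     target=$sysfs/bus/pci/drivers/pciback/bind ;;
        *) echo "unknown step: $step" >&2; return 2 ;;
    esac
    echo "about to write $bdf to $target"
    sync               # flush pending writes before the risky step
    printf '%s' "$bdf" > "$target"
}
```

For example, `hide_step unbind 0000:03:00.3`, then `hide_step new_slot 0000:03:00.3`, then `hide_step bind 0000:03:00.3`, pausing after each to see whether the system is still alive. Running it from a text console rather than a graphical session would probably make the last message easier to catch.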

> BTW, could you confirm that the USB Controller is an AMD one? This could be helpful for anyone
> wanting a Qubes laptop with AMD.
>

Yes, as far as I can tell.

03:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1 [1022:15e0]
03:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Raven USB 3.1 [1022:15e1]

qubes123

Dec 23, 2019, 2:31:20 AM
to qubes-users
On Thursday, December 19, 2019 at 12:13:57 AM UTC, Claudia wrote:
> [...]
> Still NOT working:
> Suspend/resume

The suspend/resume problem is most likely caused by a recently added security feature in Xen that compares CPUID after resume against the CPUID recorded at boot time. This is to ensure that the CPU microcode level, along with the resulting Spectre/Meltdown etc. mitigations, still persists after resume and that no features are missing.
On many AMD systems (e.g. Trinity/Richland), some of the high CPUID bits change after suspend, resulting in a Xen panic (see xen/arch/x86/acpi/power.c). More investigation would be needed to find out why the CPUID bits change after resume and whether that has any security implications.
For the time being, if you accept the possible security implications, you can disable that check, e.g. by commenting out the panic line after "recheck_cpu_features" in the above-mentioned power.c file, compiling Xen for dom0 via qubes-builder, and testing it on your system.

Vít Šesták

Dec 23, 2019, 2:51:33 AM
to qubes...@googlegroups.com
Qubes123, that's an interesting finding. AFAIU, the CPUID check is rather a sanity check: it verifies that the microcode is loaded properly. Also, this should be no issue if Qubes provides fresh microcode…

(And maybe it can cause a crash if you suspend after a BIOS update, provided that the BIOS contains newer microcode.)

Regards,
Vít Šesták 'v6ak'

qubes123

Dec 23, 2019, 6:21:52 AM
to qubes-users
I use a Corebooted system, where the latest AMD microcode is compiled into the BIOS statically. And yes, I use a newer version of the AMD Fam15h microcode than the version in the Linux firmware package.
The change in some of the CPUID bits after resume could be a result of Xen or the kernel trying to load the published microcode and failing because the BIOS version is newer.
However, the microcode version reported in /proc/cpuinfo always stays the same: the BIOS version (assuming the /proc/cpuinfo output is updated on any microcode upgrade attempt).

As noted, I have a "special" use case, so testing the recommended change in power.c on Claudia's newer AMD system could show whether this CPUID change after resume is "special" to my case or "general" for some AMD users.

BR,
Qubes123
 

Claudia

Dec 23, 2019, 6:45:10 AM
to qubes123, qubes-users
December 23, 2019 7:31 AM, "qubes123" <dm1.l...@gmail.com> wrote:
> Suspend/resume problem is most likely caused by a recently added security feature in Xen, that
> checks CPUID after resume with the previously (at boot time) known CPUID. This is to ensure, that
> the CPU microcode level - along with the resulting Spectre/Meltdown etc. mitigations - still
> persist after system resume and there are no features missing.
>

How recent? Is it present in Xen 4.8-fc25 (R4.0)? Xen 4.12-fc29 (R4.1)?

> For many AMD systems (eg. Trinity/Richland) CPUID changes after suspend (some of the high bits),
> resulting in Xen Panic (see xen/arch/x86/acpi/power.c). So, more investigation would be needed to
> check why the CPUID bits are changing after resume and whether it had any security implications or
> not.
> For the time being - if you accept the possible security implications - you can disable that check
> eg. by commenting the panic line out after "recheck_cpu_features" in the above mentioned power.c
> file, compile xen for dom0 via qubes builder and test it in your system.

Thanks for the info.

I'm not sure this is the problem, though, because I get the same symptoms when suspending in a Fedora 25 livecd. Which makes me think it's a Fedora problem not a Xen problem, at least for R4.0. In Fedora 29 I think the symptoms were slightly different, the system was responsive but the screen just didn't power back on after resume. I don't think suspend/resume actually worked correctly until Fedora 30. We should have an F31-based R4.1 developer release by the end of the month, which would be a more accurate test.

What are the symptoms of a Xen panic? Would it prevent the screen from powering back on? Would it reboot after five seconds? Or would it just hang?

I'll try booting qubes R4.1 on bare metal without Xen and try suspend/resume. If it works I'll post cpuinfo before and after.

qubes123

Dec 23, 2019, 8:17:30 AM
to qubes-users
> How recent? Is it present in Xen 4.8-fc25 (R4.0)? Xen 4.12-fc29 (R4.1)?

Well, maybe "recent" was not accurate: in R4.0 this started after xen-4.8.3.3, from mid 2018.
In R4.1 the issue has been there from the beginning, as it has not been fixed since, being a minor problem affecting a few (not so modern Fam15h) systems.

>> For many AMD systems (eg. Trinity/Richland) CPUID changes after suspend (some of the high bits),
>> resulting in Xen Panic (see xen/arch/x86/acpi/power.c). So, more investigation would be needed to
>> check why the CPUID bits are changing after resume and whether it had any security implications or
>> not.
>> For the time being - if you accept the possible security implications - you can disable that check
>> eg. by commenting the panic line out after "recheck_cpu_features" in the above mentioned power.c
>> file, compile xen for dom0 via qubes builder and test it in your system.
>
> Thanks for the info.
>
> I'm not sure this is the problem, though, because I get the same symptoms when suspending in a Fedora 25 livecd. Which makes me think it's a Fedora problem not a Xen problem, at least for R4.0. In Fedora 29 I think the symptoms were slightly different, the system was responsive but the screen just didn't power back on after resume. I don't think suspend/resume actually worked correctly until Fedora 30. We should have an F31-based R4.1 developer release by the end of the month, which would be a more accurate test.
>
> What are the symptoms of a Xen panic? Would it prevent the screen from powering back on? Would it reboot after five seconds? Or would it just hang?
>
> I'll try booting qubes R4.1 on bare metal without Xen and try suspend/resume. If it works I'll post cpuinfo before and after.


As I remember, distribution releases (Fedora, Debian) without Xen were OK for me; with upstream Xen installed, however, all of them had this suspend issue.
I don't have a debug card, so all I can report are the symptoms: when the computer wakes up with this issue, the screen remains blank, the processor starts to heat up, and the PC remains unresponsive. It doesn't turn off automatically; you have to forcibly turn it off by holding the power button for more than 5 seconds.

Brendan Hoar

Dec 23, 2019, 9:26:56 AM
to Claudia, qubes-users, qubes123
On Mon, Dec 23, 2019 at 6:45 AM Claudia <clau...@disroot.org> wrote:

> I'm not sure this is the problem, though, because I get the same symptoms when suspending in a Fedora 25 livecd. Which makes me think it's a Fedora problem not a Xen problem, at least for R4.0. In Fedora 29 I think the symptoms were slightly different, the system was responsive but the screen just didn't power back on after resume. I don't think suspend/resume actually worked correctly until Fedora 30. We should have an F31-based R4.1 developer release by the end of the month, which would be a more accurate test.

See Marek’s post near the end of this issue; he has a link to an F31 R4.1 ISO built on openqa:

Claudia

Dec 23, 2019, 9:46:08 AM
to Brendan Hoar, qubes-users, qubes123

Awesome, in time for Christmas even! Downloading it now. Looks like it failed a few tests, so I don't know if it'll be usable enough to really test suspend/resume on it, but we'll see. Not sure if I'll get a chance to install it today, but I'll follow up when I do. Thanks for the link, Brendan.

Brendan Hoar

Dec 23, 2019, 10:08:21 AM
to Claudia, qubes-users, qubes123
On Mon, Dec 23, 2019 at 9:46 AM Claudia <clau...@disroot.org> wrote:
> Awesome, in time for Christmas even! Downloading it now. Looks like it failed a few tests, so I don't know if it'll be usable enough to really test suspend/resume on it but we'll see. Not sure if I'll get a chance to install it today but I'll follow up when I do. Thanks for the link brendan.

Yeah. Looks like Openqa can’t (or isn’t configured to) utilize iommu under the kvm hosting...and with Xen 4.13 the approach taken to workaround that in the test bed cannot be used any more.

So it’s possible that the large percentage of failed/incomplete tests might not have a significant impact on how well it works in your case.

That being said I’m curious about how Marek will get openqa working with Xen 4.13 and onward.

Brendan

Claudia

Dec 23, 2019, 10:08:37 AM
to qubes123, qubes-users
December 23, 2019 1:17 PM, "qubes123" <dm1.l...@gmail.com> wrote:
> As I remember, distribution releases (Fedora, Debian) without Xen were OK for me, however, with
> upstream Xen installed, obviously all had this suspend issue.
> I don't have a debug card, so the symptoms are: when the computer wakes up with this issue, the
> screen remains blank and the processor start to heat up and the PC remains unresponsive. It doesn't
> turn off automatically you have to forcibly turn it off using the power button >5 sec.

When you say it remains blank, do you mean the screen is totally powered off, or do you mean the backlight comes on but it just displays a black screen?

Going from memory here, but I *think* in F29 (without Xen) the backlight would come on, but the screen was just blank, and I could make the caps lock light come on and hear sounds from the OS, and ctrl-alt-delete would cause it to reboot after 60 seconds as expected. Possibly a graphics driver problem.

In Qubes R4.1 (F29-based) with Xen, when I resume I can hear the fans come on, but that's it. The backlight remains powered off, the caps lock light won't come on, sound doesn't resume playing, and I have to hold the power button to force reboot. Sounds to me like it could be a Xen panic, although I believe this is the same as what happened in F25, if memory serves.

Also, I don't know what a debug card is, but my BIOS has an option called "USB Debugging" which is enabled. Do you know anything about that, or how to make use of it? I'm not looking to get into any serial/UART type stuff, but USB might be an option, depending on what it does, what you need to have, and how difficult it is.

qubes123

Dec 24, 2019, 5:30:13 AM
to qubes-users

> When you say it remains blank, do you mean the screen is totally powered off, or do you mean the backlight comes on but it just displays a black screen?

I had a powered-off screen when Xen panicked; the screen did not even blink (so it doesn't reach the kernel or the GPU driver; the system halts before that).

> Going from memory here, but I *think* in F29 (without Xen) the backlight would come on, but the screen was just blank, and I could make the caps lock light come on and hear sounds from the OS, and ctrl-alt-delete would cause it to reboot after 60 seconds as expected. Possibly a graphics driver problem.

This looks like a GPU driver issue to me, too. In that case the logs (kernel, hypervisor, xorg) are available on the next boot for further checking.

> In Qubes R4.1 (F29-based) with Xen, when I resume I can hear the fans come on, but that's it. The backlight remains powered off, the caps lock light won't come on, sound doesn't resume playing, and I have to hold the power button to force reboot. Sounds to me like it could be a Xen panic, although I believe this is the same as what happened in F25, if memory serves.
>
> Also, I don't know what a debug card is, but my BIOS has an option called "USB Debugging" which is enabled. Do you know anything about that, or how to make use of it? I'm not looking to get into any serial/UART type stuff, but USB might be an option, depending on what it does, what you need to have, and how difficult it is.

Yes, this is what I experienced when Xen panics. To be able to see what really happens in this case, you'd need a serial console (built in, or an add-on PCI debug card providing a serial/UART port) and you'd have to configure Xen to send its console output to that serial port. With the Xen command-line option "loglvl=all" you'd see how far Xen gets after resume.
The BIOS option "USB Debugging" on the 5575 could mean that one of your USB ports can be used for debugging (as a serial port, using USB2/EHCI mode), but IMHO this would also require special cables and another PC to use as a debug console...
 

Claudia

Dec 24, 2019, 2:16:03 PM
to Brendan Hoar, qubes-users
December 23, 2019 3:08 PM, "Brendan Hoar" <brenda...@gmail.com> wrote:

Installed the new F31-based 4.1. Near the end of the installation, it said

The following error occurred while installing the boot loader. This system will not be bootable. Would you like to continue?
failed to write boot loader configuration

I don't know why that happened, but sure enough grub.cfg was missing. I guess the installer keeps logs in /tmp, but I didn't know that at the time. Rebooted into recovery, chrooted, and ran grub2-mkconfig.

I also noticed the UEFI boot menu situation changed. It added two entries, "QubesOS" and "Fedora", and installed some other efi file in the default path \EFI\BOOT\. The former two pointed to the shim binary (under different paths); not sure about the default one. All three of them caused an instant reboot. Weird, but I didn't really investigate. I ended up just using my old "Qubes" boot entry (for grubx64.efi) after generating grub.cfg.

After installation it worked fine during my hour or two of casual usage. When I was using it, it seemed noticeably faster than previous versions. So you're right it probably just had to do with the new Xen under openqa. Suspend/resume still doesn't work though.

Claudia

Dec 24, 2019, 5:05:43 PM
to qubes123, qubes-users
December 23, 2019 7:31 AM, "qubes123" <dm1.l...@gmail.com> wrote:

> Suspend/resume problem is most likely caused by a recently added security feature in Xen, that
> checks CPUID after resume with the previously (at boot time) known CPUID. This is to ensure, that
> the CPU microcode level - along with the resulting Spectre/Meltdown etc. mitigations - still
> persist after system resume and there are no features missing.
>
> For many AMD systems (eg. Trinity/Richland) CPUID changes after suspend (some of the high bits),
> resulting in Xen Panic (see xen/arch/x86/acpi/power.c). So, more investigation would be needed to
> check why the CPUID bits are changing after resume and whether it had any security implications or
> not.
> For the time being - if you accept the possible security implications - you can disable that check
> eg. by commenting the panic line out after "recheck_cpu_features" in the above mentioned power.c
> file, compile xen for dom0 via qubes builder and test it in your system.

So I installed the new F31-based R4.1 with Xen 4.13. Suspend/resume still isn't working; same
symptoms as before. A few corrections: I dug up some old threads and found that suspend/resume
actually did work correctly in F29, and on F25 the screen would power on, but just remain blank. So
in fact I never got the same symptoms on Qubes as I did with Fedora 25. This means it very likely
could be a Xen panic.

Something new: I booted R4.1 on bare metal without Xen, and it resumes fine. It probably would
under R4.0 without Xen, too, but I haven't tried yet. So apparently it's not a version issue.

While booted without Xen, I checked /proc/cpuinfo before and after resume and they were the same
except for clock rates. The output is significantly different under Xen than bare metal, but the
microcode version is the same. In Xen, I obviously can't compare before- and after-resume outputs.
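
For reference, the comparison can be done roughly like this. It is a minimal sketch: the function name and the file names in the usage comments are arbitrary, and it simply drops the "cpu MHz" lines so that clock-rate noise doesn't show up in the diff. Keep in mind that /proc/cpuinfo only shows what the kernel chooses to report, so identical output doesn't prove the raw CPUID values are unchanged.

```shell
#!/bin/sh
# Print /proc/cpuinfo (or the file given as $1) without the "cpu MHz" lines,
# which change constantly and would only add noise to a before/after diff.
cpuinfo_snapshot() {
    grep -v '^cpu MHz' "${1:-/proc/cpuinfo}"
}

# Intended use (file names are arbitrary):
#   cpuinfo_snapshot > cpuinfo.before
#   ... suspend and resume ...
#   cpuinfo_snapshot > cpuinfo.after
#   diff cpuinfo.before cpuinfo.after && echo "no change"
```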

Not sure what to do. I'm really not looking forward to patching Xen.

> I use a Corebooted system, where the latest AMD microcode is compiled into
> the BIOS statically. And yes, I use a newer version of the AMD Fam15h
> microcode, than the version in the Linux Firmware package.
> This change in some of the CPUID bits after resume could be a result of
> xen/kernel trying to load the published microcode, and then fails because
> the BIOS version is newer.
> However, the /proc/cpuinfo reported microcode version always stays the same
> - the BIOS version. (..assuming the /proc/cpuinfo output is updated on any
> microcode upgrade attempts..)
>
> As noted, I have a "special" use case, so testing the recommended change in
> power.c for Claudia's newer AMD system could show, that this CPUID change
> issue after resume is "special" for my case or "general" for some AMD users.

That kind of makes sense. With your patch applied, can you see the CPUID bits in /proc/cpuinfo change after resume, or is the output the same as before?

Like I said, in my case nothing in /proc/cpuinfo changes before and after resume without Xen (although it could be different under Xen).

Claudia

Dec 28, 2019, 6:19:07 PM
to qubes123, qubes-users
December 23, 2019 7:31 AM, "qubes123" <dm1.l...@gmail.com> wrote:

> For many AMD systems (eg. Trinity/Richland) CPUID changes after suspend (some of the high bits),
> resulting in Xen Panic (see xen/arch/x86/acpi/power.c). So, more investigation would be needed to
> check why the CPUID bits are changing after resume and whether it had any security implications or
> not.
> For the time being - if you accept the possible security implications - you can disable that check
> eg. by commenting the panic line out after "recheck_cpu_features" in the above mentioned power.c
> file, compile xen for dom0 via qubes builder and test it in your system.

I decided to give this a try, but I don't really know how to use the build system. I did `make vmm-xen`, modified the file chroot-dom0-fc29/home/user/rpmbuild/BUILD/xen-4.12.1/xen/arch/x86/acpi/power.c, but it appears after running `make vmm-xen` again my changes have been reverted. After it finishes the line is no longer commented out. Do I have to commit the change, or generate a patch file, or something like that?

qubes123

Dec 30, 2019, 1:44:13 PM
to qubes-users
To answer your earlier question, my CPU capability information bits change like this after suspend:

(XEN) Entering ACPI S3 state.
(XEN) AMD-Vi: Applying erratum 746 workaround for IOMMU at 0000:00:00.2
(XEN) Finishing wakeup from ACPI S3 state.
(XEN) CPU0: cap[ 1] is 3e98320b (expected b698320b)
(XEN) Missing previously available feature(s).
(XEN) Enabling non-boot CPUs  ...

Without the patch, this results in a Xen panic.
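
The bits that differ between the two words in the "cap[ 1]" log line can be worked out with plain shell arithmetic. A small sketch (the function name is made up); it only reports bit positions within that word. Which feature each bit stands for depends on Xen's feature-word layout, which is not shown here.

```shell
#!/bin/sh
# Compare two 32-bit capability words given as hex strings (without 0x).
# usage: cap_diff ACTUAL_AFTER_RESUME EXPECTED_FROM_BOOT
cap_diff() {
    actual=$((0x$1))
    expected=$((0x$2))
    changed=$((actual ^ expected))   # bits that differ between the words
    bit=0
    while [ "$bit" -lt 32 ]; do
        if [ $(( (changed >> bit) & 1 )) -eq 1 ]; then
            if [ $(( (expected >> bit) & 1 )) -eq 1 ]; then
                echo "bit $bit: set at boot, clear after resume"
            else
                echo "bit $bit: clear at boot, set after resume"
            fi
        fi
        bit=$((bit + 1))
    done
}

cap_diff 3e98320b b698320b
```

For the values from my log this reports that bit 27 is newly set after resume and bit 31 is no longer set; the lost bit 31 is presumably what makes Xen report missing features.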


> I decided to give this a try, but I don't really know how to use the build system. I did `make vmm-xen`, modified the file chroot-dom0-fc29/home/user/rpmbuild/BUILD/xen-4.12.1/xen/arch/x86/acpi/power.c, but it appears after running `make vmm-xen` again my changes have been reverted. After it finishes the line is no longer commented out. Do I have to commit the change, or generate a patch file, or something like that?


After you configure qubes-builder (easiest done with ./setup) and download all the sources (including the qubes-vmm-xen extra sources), put the patch file in the qubes-src/vmm-xen directory.
Then, include this patch in the xen.spec.in file (somewhere below the last patch line, e.g. as Patch 1202: xxx, where the filename is relative to the vmm-xen directory).
Then, change the release number (rel file) to avoid mixing up the "official" and your custom versions.
Then run: make vmm-xen, and the new rpms will be available in the pkgs folder.
While Xen compiles, you can check the logs to see whether your patch was applied successfully.
Then, install the new builds of all the Xen rpms (7) that are currently installed in dom0 (the devel and debug packages etc. are not needed).
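
If you need to generate the patch file itself, one way is to diff a pristine copy of the file against your edited copy with `diff -ruN`. The sketch below is only an illustration: the function name and its arguments are made up, and it stages the two copies under temporary a/ and b/ directories so that the patch headers carry the source-tree-relative path.

```shell
#!/bin/sh
# Create a unified diff between two versions of one source file.
# usage: make_patch ORIGINAL MODIFIED PATH_IN_TREE OUTPUT_PATCH
make_patch() {
    orig=$1
    modified=$2
    rel=$3
    out=$4
    work=$(mktemp -d) || return 1
    mkdir -p "$work/a/$(dirname "$rel")" "$work/b/$(dirname "$rel")"
    cp "$orig" "$work/a/$rel"
    cp "$modified" "$work/b/$rel"
    # diff exits with status 1 when the trees differ, the expected case here
    (cd "$work" && diff -ruN a b) > "$out"
    status=$?
    rm -rf "$work"
    [ "$status" -eq 1 ]
}
```

For example, `make_patch power.c.orig power.c xen/arch/x86/acpi/power.c my-resume.patch` (file names hypothetical) writes a patch whose header lines start with `--- a/xen/arch/x86/acpi/power.c`. The function returns non-zero if the two files are identical, since an empty patch is most likely a mistake.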

PS: my patch looks like this (it will show the CPUID capability bits changing in the hypervisor log)

diff -ruN a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
--- a/xen/arch/x86/acpi/power.c 2019-12-15 18:26:11.183000000 +0100
+++ b/xen/arch/x86/acpi/power.c 2019-12-15 18:23:15.439000000 +0100
@@ -257,7 +257,7 @@
     microcode_resume_cpu(0);
 
     if ( !recheck_cpu_features(0) )
-        panic("Missing previously available feature(s).");
+             printk(XENLOG_ERR "Missing previously available feature(s).\n");
 
     /* Re-enabled default NMI/#MC use of MSR_SPEC_CTRL. */
     ci->spec_ctrl_flags |= (default_spec_ctrl_flags & SCF_ist_wrmsr);

 

brenda...@gmail.com

Dec 30, 2019, 4:56:02 PM
to qubes-users
On Monday, December 30, 2019 at 1:44:13 PM UTC-5, qubes123 wrote:
> Answering to your earlier question, my CPU capability information bits change like this after suspend:
>
> (XEN) Entering ACPI S3 state.
> (XEN) AMD-Vi: Applying erratum 746 workaround for IOMMU at 0000:00:00.2
> (XEN) Finishing wakeup from ACPI S3 state.
> (XEN) CPU0: cap[ 1] is 3e98320b (expected b698320b)
> (XEN) Missing previously available feature(s).
> (XEN) Enabling non-boot CPUs  ...
>
> Without the patch, this results in a Xen panic.
>
> PS: my patch looks like this (it will show the CPUID capability bits changing in the hypervisor log)
>
> diff -ruN a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> --- a/xen/arch/x86/acpi/power.c 2019-12-15 18:26:11.183000000 +0100
> +++ b/xen/arch/x86/acpi/power.c 2019-12-15 18:23:15.439000000 +0100
> @@ -257,7 +257,7 @@
>      microcode_resume_cpu(0);
>  
>      if ( !recheck_cpu_features(0) )
> -        panic("Missing previously available feature(s).");
> +             printk(XENLOG_ERR "Missing previously available feature(s).\n");
>  
>      /* Re-enabled default NMI/#MC use of MSR_SPEC_CTRL. */
>      ci->spec_ctrl_flags |= (default_spec_ctrl_flags & SCF_ist_wrmsr);


Interesting. This call to recheck_cpu_features appears to be a check/panic that is also in Qubes R4.0, and was backported to Xen 4.8.3 last year:

 
q123 - do you happen to know what the exact two flag bits that changed represent in the cpu features? I wonder if the true issue is something not being properly preserved during the suspend, or properly re-initialized during the resume.

Brendan

qubes123

Jan 1, 2020, 7:30:34 AM
to qubes-users
> q123 - do you happen to know what the exact two flag bits that changed represent in the cpu features? I wonder if the true issue is something not being properly preserved during the suspend, or properly re-initialized during the resume.
>
> Brendan

I checked, and it seems bits 27 (0 -> 1) and 31 (1 -> 0) change (for CPU0, capability[1]). However, I couldn't determine which register (eax, ebx, ecx, or edx) corresponds to capability[1], so I cannot draw deeper conclusions.
Looking at the "cpufeatures.h" file, though, where all the Xen-relevant CPUID feature bits are listed (and assuming capability[1] refers to these features), bits 27-31 are not relevant to Xen. The code might therefore have to be changed to check only the relevant feature bits, not the whole 32 bits, in the case of AMD (Fam15h at least).
But I could be wrong, and this feature-bit change may only be a consequence of what you wrote above.
Please also note that I use coreboot and a newer AMD microcode than what is generally available in the mainstream Linux distros, so my case could be unique.

Qubes123

Claudia

Jan 1, 2020, 12:10:04 PM
to qubes123, qubes-users

Thanks for the patch and the instructions. The qubes-builder documentation is outdated and sorely
lacking (it doesn't even mention ./setup!). I applied the patch to Marek's 4.1 repo, but I couldn't
get it to produce an fc31 package. It kept building for fc29, which I don't currently have installed.
Then I built it for fc25 (4.0 stable), but the patch wouldn't apply cleanly, so I just modified the
existing patch-x86-check-feature-flags-after-resume.patch to print an error instead of panicking, and
changed the message slightly.

patch-x86-check-feature-flags-after-resume.patch
diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index 3d26d4be31..e8fb3f6f31 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -255,6 +255,9 @@ static int enter_state(u32 state)
 
     microcode_resume_cpu(0);
 
+    if ( !recheck_cpu_features(0) )
+        printk(XENLOG_ERR "Missing previously available feature(s). Ignoring.\n");
+
     /* Re-enabled default NMI/#MC use of MSR_SPEC_CTRL. */
     ci->spec_ctrl_flags |= (default_spec_ctrl_flags & SCF_ist_wrmsr);
 
     spec_ctrl_exit_idle(ci);

Installed the seven packages already present in dom0. In case anyone is wondering those are:
xen-libs-4.8.5-14custom.fc25.x86_64.rpm
xen-4.8.5-14custom.fc25.x86_64.rpm
xen-hypervisor-4.8.5-14custom.fc25.x86_64.rpm
xen-runtime-4.8.5-14custom.fc25.x86_64.rpm
python3-xen-4.8.5-14custom.fc25.x86_64.rpm
xen-licenses-4.8.5-14custom.fc25.x86_64.rpm
xen-hvm-4.8.5-14custom.fc25.x86_64.rpm

Note that 4.8.5-14 -> 4.8.5-14custom shows up as a downgrade.
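
A likely reason for the "downgrade", sketched below under assumptions: RPM compares release strings segment by segment (runs of digits or runs of letters), so "fc" in the official release is compared against "custom" and sorts higher. The function simple_rpmvercmp is a simplified re-implementation written for this illustration, not librpm's code; it ignores the special "~" and "^" characters.

```python
import re

def rpm_segments(s):
    # Split into runs of digits or runs of letters; separators are dropped.
    return re.findall(r"\d+|[A-Za-z]+", s)

def simple_rpmvercmp(a, b):
    """Return 1 if a is newer than b, -1 if older, 0 if equal (simplified rules)."""
    sa, sb = rpm_segments(a), rpm_segments(b)
    for x, y in zip(sa, sb):
        if x.isdigit() and y.isdigit():
            xi, yi = int(x), int(y)
            if xi != yi:
                return 1 if xi > yi else -1
        elif x.isdigit() != y.isdigit():
            # A numeric segment is considered newer than an alphabetic one.
            return 1 if x.isdigit() else -1
        elif x != y:
            # Both alphabetic: plain string comparison.
            return 1 if x > y else -1
    if len(sa) == len(sb):
        return 0
    # All shared segments equal: the string with segments left over is newer.
    return 1 if len(sa) > len(sb) else -1

# Release of the official package vs. the custom build from this post.
print(simple_rpmvercmp("14.fc25", "14custom.fc25"))
```

If that is what is happening, a release suffix that sorts after "fc" (or simply a higher number) should avoid the downgrade, though that was not tried here.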

Ran `strings -a /boot/efi/EFI/qubes/xen.efi | grep Ignoring` to check for my unique message, just to be sure.

Rebooted. Checked xl info. Looks good. (Yes, it actually truncated the last character of the
version, apparently. Odd.)
xen_major : 4
xen_minor : 8
xen_extra : .5-14custom.fc2
xen_version : 4.8.5-14custom.fc2
cc_compile_date : Wed Jan 1 01:11:51 UTC 2020
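
A possible explanation for the missing character, offered as an assumption rather than something verified on this machine: Xen's public interface declares the extra-version string as a 16-byte buffer (xen_extraversion_t), which leaves room for 15 characters plus the terminating NUL. The constant and helper names below are made up for the illustration.

```python
# Assumed size of Xen's xen_extraversion_t buffer, including the NUL terminator.
XEN_EXTRAVERSION_LEN = 16

def truncate_extraversion(extra: str) -> str:
    """Mimic copying a string into a fixed 16-byte, NUL-terminated buffer."""
    return extra[: XEN_EXTRAVERSION_LEN - 1]

full = ".5-14custom.fc25"          # 16 characters, one too many
print(len(full), truncate_extraversion(full))
```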

Hit suspend from the XFCE menu. Waited 30 seconds or so. Crossed my fingers and resumed.

And... SUCCESS!

xl dmesg
(XEN) Preparing system for ACPI S3 state.
(XEN) Disabling non-boot CPUs ...


(XEN) Entering ACPI S3 state.

(XEN) Finishing wakeup from ACPI S3 state.

(XEN) CPU0: cap[ 1] is 7ed8320b (expected f6d8320b)
(XEN) Missing previously available feature(s). Ignoring.


(XEN) Enabling non-boot CPUs ...

Thank you for your help! It appears your machine is not a special case. Exact same result for both of us. Bit 27 flips on and bit 31 flips off (xor of 0x88000000). No idea what those mean, though.
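
For anyone who wants to repeat the comparison on their own log, here is a small sketch that pulls the two values out of a "cap[ N] is ... (expected ...)" line and lists the bits that differ. The function name is made up. One possible reading, not confirmed in this thread: if cap[1] corresponds to CPUID leaf 1 ECX, as Xen's cpufeatureset.h appears to define it, then bit 27 would be OSXSAVE and bit 31 the hypervisor-present bit.

```python
import re

CAP_RE = re.compile(r"cap\[\s*(\d+)\] is ([0-9a-f]{8}) \(expected ([0-9a-f]{8})\)")

def changed_bits(line):
    """Return (word index, xor mask, {bit: (expected, actual)}) for a log line."""
    m = CAP_RE.search(line)
    if m is None:
        raise ValueError("not a cap[] mismatch line")
    word = int(m.group(1))
    actual = int(m.group(2), 16)
    expected = int(m.group(3), 16)
    diff = actual ^ expected
    changes = {}
    for bit in range(32):
        if (diff >> bit) & 1:
            changes[bit] = ((expected >> bit) & 1, (actual >> bit) & 1)
    return word, diff, changes

# The two lines reported in this thread.
for line in (
    "(XEN) CPU0: cap[ 1] is 3e98320b (expected b698320b)",
    "(XEN) CPU0: cap[ 1] is 7ed8320b (expected f6d8320b)",
):
    word, diff, changes = changed_bits(line)
    print(word, hex(diff), changes)
```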

However, I still have a long road ahead of me. I did several suspend/resume cycles, and each time I had a different combination of problems, including the mouse sticking, the keyboard not working, and finally input/output errors and segmentation faults in the terminal. But the Xen problem has been identified nonetheless. I'll try kernel-latest and see if that changes anything.

Thanks again.

BTW, have you reported this to upstream or do you have any plans to?

Claudia

Jan 3, 2020, 12:53:31 PM
to qubes...@googlegroups.com
January 1, 2020 5:09 PM, "Claudia" <clau...@disroot.org> wrote:

> However, I still have a long road ahead of me. I did several suspend/resume cycles, and each time I
> had a different combination of problems, including the mouse sticking, the keyboard not working,
> and finally input/output errors and segmentation faults in the terminal. But the Xen problem has
> been identified nonetheless. I'll try kernel-latest and see if that changes anything.

Installed kernel-latest from stable, 5.3.11-1.qubes.x86, and no difference as far as I can tell. It resumes fine the first time usually, but after the second or third cycle, I get a bunch of io errors, as though someone unplugged the SATA connector. I think this is actually the underlying cause of the other symptoms. This is with no VMs running. No swap.

ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: limiting SATA link speed to 3.0 Gbps
ata1.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
ata1.00: revalidation failed (errno=-5)
ata1.00: disabled
sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 0:0:0:0: [sda] tag#21 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 0:0:0:0: [sda] tag#21 CDB: Write(10) 2a 00 3c 9f [...]
blk_update_request: I/O error, dev sda, sector [...] op 0x1: (WRITE) flags 0x100000 phys_seg 1 prio class 0
BTRFS error (device dm-0): bdev /dev/mapper/luks-[...] errs: wr 1, rd 0, flush 0, corrupt 0, gen0

Note this is different from the Fedora 25 resume behavior. In F25 with 4.8.6, the screen doesn't power on, but the system seems responsive otherwise. For example, ctrl-alt-delete reboots after 60 seconds as expected. (In Qubes, after resuming a second or third time and getting disk errors, when you try to shut down it will just hang indefinitely.) But F25 was running from a USB drive, so I wouldn't necessarily know if there were SATA errors in that case.

I'll see if I can figure out how to apply the patch to the latest 4.1 (F31-based) and try it from there. In the meantime, if anyone has any ideas, please share.

Vít Šesták

Jan 3, 2020, 1:48:55 PM
to qubes...@googlegroups.com
While comparing Qubes 4 to Fedora 25 might be tempting, they are not as similar as they might seem. Qubes 4 is based on Fedora 25, but some parts, including the kernel, are independent. So seeing different kernel-related behavior in Fedora 25 and Qubes 4 is definitely not a surprise.

Regards,
Vít Šesták 'v6ak'

brenda...@gmail.com

Jan 3, 2020, 2:17:04 PM
to qubes-users
On Friday, January 3, 2020 at 12:53:31 PM UTC-5, Claudia wrote:
January 1, 2020 5:09 PM, "Claudia" <clau...@disroot.org> wrote:

> I'll see if I can figure out how to apply the patch to the latest 4.1 (F31-based) and try it from there. In the mean time, if anyone has any ideas please share.


Maybe not directly helpful, but I've been looking to be able to better debug Xen issues, so reposting this from https://github.com/QubesOS/qubes-issues/issues/5529
...

Since it appears the old made-for-purpose USB 2.0 EHCI Debug port dongles are impossible to find these days, I've been looking around for alternatives and stumbled upon use of the raspberry pi zero w/ USB Gadget drivers to log chromebook coreboot debug data. Pretty sure (but not 100%) the same could be done for Xen debug data:

https://johnlewis.ie/pi-zero-w-flashrom-and-usb-gadget-debug/
https://johnlewis.ie/wp-content/uploads/2017/04/ehcidebug.gif
raspberrypi/linux#1907
https://gist.github.com/gbaman/50b6cca61dd1c3f88f41


So...now I have a pi zero on the way.


Brendan

Claudia

Jan 3, 2020, 3:48:33 PM
to brenda...@gmail.com, qubes...@googlegroups.com

Funny you should mention that. I happened to have a Pi Zero W lying around, and I almost did go that route. However when I started looking into USB 2.0 EHCI debug (thanks to user Qubes123 for the tip), it looked pretty complicated and somewhat unreliable, so I decided to try some simpler techniques first. Also my USB controllers don't list the debug capability so I don't think it would work on this machine. Luckily Qubes123's patch worked, or at least fixed the Xen panic, so I don't think I have a need for USB debugging at the moment.

However it is something I'd like to learn more about in case I need it in the future. Please let me know how you make out!

Also something you might be interested in is USB 3.0 XHCI Debug Capability, or DbC, which is built into the USB 3.0 spec. It's a host-to-host protocol so it doesn't require any OTG/gadget hardware, just two devices that support USB 3.0 Enhanced Superspeed, a USB 3.0 Enhanced Superspeed cable, and the target device (USB controller) must support XHCI Debug Capability (DbC). https://www.kernel.org/doc/html/v4.17/driver-api/usb/usb3-debug-port.html

The machine I was trying to debug does have a USB 3.1 controller, but it doesn't list either the XHCI or the EHCI debug capability, even when USB debug is enabled in firmware. Just because there's a setting for it in firmware doesn't necessarily mean the hardware supports it, I suppose.

brenda...@gmail.com

Jan 3, 2020, 4:09:29 PM
to qubes-users
On Friday, January 3, 2020 at 3:48:33 PM UTC-5, Claudia wrote:
January 3, 2020 7:17 PM, brend...@gmail.com wrote:

> Since it appears the old made-for-purpose USB 2.0 EHCI Debug port dongles are impossible to find
> these days, I've been looking around for alternatives and stumbled upon use of the raspberry pi
> zero w/ USB Gadget drivers to log chromebook coreboot debug data. Pretty sure (but not 100%) the
> same could be done for Xen debug data:

... 

> So...now I have a pi zero on the way.

> Funny you should mention that. I happened to have a Pi Zero W lying around, and I almost did go that route. However when I started looking into USB 2.0 EHCI debug (thanks to user Qubes123 for the tip), it looked pretty complicated and somewhat unreliable, so I decided to try some simpler techniques first. Also my USB controllers don't list the debug capability so I don't think it would work on this machine.


On the W520, lspci -nvvD shows two PCI USB EHCI devices with Debug enabled. Probably the same PCI device presenting itself twice. From what I've read elsewhere, the USB 2.0 port on the back of the unit is the one wired up for debug use. Presumably the X230 units are similarly equipped.
 
Haven't checked the P52 yet.
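
For checking other machines, a rough sketch of scanning saved lspci output for controllers that advertise a debug port. The function name is made up, and the sample text is illustrative only (not copied from a W520); it assumes lspci prints the capability as a "Capabilities: [..] Debug port" line and that it was run as root, since capabilities are hidden from unprivileged users.

```python
def devices_with_debug_port(lspci_text):
    """Return the addresses of devices whose capability list mentions a debug port."""
    found = []
    current = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Unindented lines start a new device; the first field is its address.
            current = line.split()[0]
        elif current and "Capabilities:" in line and "Debug port" in line:
            if current not in found:
                found.append(current)
    return found

# Illustrative sample only; real output has many more lines per device.
sample = """\
0000:00:1a.0 0c03: 8086:1c2d (rev 04) (prog-if 20 [EHCI])
\tCapabilities: [50] Power Management version 2
\tCapabilities: [58] Debug port: BAR=1 offset=00a0
0000:00:1f.2 0106: 8086:1c03 (rev 04) (prog-if 01 [AHCI 1.0])
\tCapabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
"""
print(devices_with_debug_port(sample))
```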

> Luckily Qubes123's patch worked, or at least fixed the Xen panic, so I don't think I have a need for USB debugging at the moment.
>
> However it is something I'd like to learn more about in case I need it in the future. Please let me know how you make out!


Will do.
 

> Also something you might be interested in is USB 3.0 XHCI Debug Capability, or DbC, which is built into the USB 3.0 spec. It's a host-to-host protocol so it doesn't require any OTG/gadget hardware, just two devices that support USB 3.0 Enhanced Superspeed, a USB 3.0 Enhanced Superspeed cable, and the target device (USB controller) must support XHCI Debug Capability (DbC). https://www.kernel.org/doc/html/v4.17/driver-api/usb/usb3-debug-port.html
>
> The machine I was trying to debug does have a USB 3.1 controller, but it doesn't list either the XHCI or the EHCI debug capability, even when USB debug is enabled in firmware. Just because there's a setting for it in firmware doesn't necessarily mean the hardware supports it, I suppose.


I'm not entirely sure that the XHCI capability is as "low level" friendly as the EHCI capability (it might require at least a mini USB stack to be utilized). Haven't dived into that fully yet.

B

Claudia

Jan 3, 2020, 7:45:56 PM
to qubes...@googlegroups.com


And... SUCCESS 2.0!

Perhaps it's still too early to celebrate, but after six months of troubleshooting I think I might finally have working suspend/resume. I did some googling around, and eventually came across a rather inconspicuous post[1] from 2013 in the Xen archives that mentioned something I hadn't tried or heard about before. All I had to do was add to the Xen command line "dom0_max_vcpus=1 dom0_vcpus_pin". And that's it. Couldn't have been simpler. I should not have had to go to the 20th page of search results to find out about this.

This runs dom0 on CPU0 and only CPU0. My understanding is that it has to be running on the boot CPU at the exact moment of suspend and resume. Or something like that. Not sure of the specifics. Note that this may have a performance impact depending on your situation.

At first, I thought maybe this would render the Xen patch unnecessary: i.e. that it was suspending on one core and resuming on another, causing an apparent change in cpuid bits. But I can see from the log the cpuid capability bits are still changing as before. (Those of you just tuning in, the patch and instructions are earlier in this thread. However you probably won't need it unless you have an AMD Fam15h processor. Note that there may be security implications associated with this patch.)

I've only had a chance to test about 15-20 cycles or so, but it works great so far. Suspends fast, resumes fast, lid-switch triggers both suspend and resume, WiFi automatically reconnects. I suspended in the middle of a YouTube video and came back up seamlessly. However, after resume all instances of Firefox seem to jump to 100% CPU (but not frozen) until I close them, but that appears to be a known issue outside of Qubes and Xen as well.

Tested on R4.0 stable with kernel-latest 5.3.11-1.qubes.x86 on Xen 4.8.5-14.fc25 (patched). I haven't tried this yet on the default kernel but I think it would probably work just as well. It also very well might work on other Qubes/Xen versions. I'll update my HCL accordingly when I have a chance.

[1] https://lists.gt.net/xen/devel/270965#270965

M

Jan 23, 2020, 11:57:26 AM
to qubes-users
Hello Claudia


It seems you got Qubes OS working with your AMD Ryzen 5 2500U with integrated Vega 8 Graphics.

I have bought an AMD Ryzen 3 3200G, also with Vega 8 Graphics, and I have tried to install Qubes 4.0.1 and 4.0.2 on the PC (from a DL-DVD, in both UEFI and Legacy mode), but without any luck so far.

In UEFI mode, the cursor ends up blinking on a black screen.

In Legacy mode, I get to just before the graphical interface loads and then see an error message: Failing to load kernel modules/X startup failed -> Aborting installation.

For screenshots, see here: https://github.com/QubesOS/qubes-issues/issues/4510

I have also tried the other possibilities to install Qubes that the two ISO-DVD’s offer. But that also ended with the same result.

Maybe I can learn from your experiences...

I have tried to read your threads in this post, but as a newbie it isn’t explained in a way that I can try to follow.

So I hope you will explain to me what I can try to do to make Qubes OS running on my pc.


Here are my hardware details, in case they are relevant:

Motherboard: Asrock X570M Pro4
CPU: AMD Ryzen 3 3200G w. integrated Vega 8 Graphics
Ram: 32 GB G.Skill
Hard drive: SSD + HDD

Claudia

Jan 24, 2020, 12:22:59 AM
to M, qubes-users

For me, the boot and install mostly just worked out of the box. I never experienced the installer drop to shell or "X failed to start" or anything like that. I did have the installer screen freeze sometimes on one version, I think 4.0.2-rc2, but I was able to get past it and never really investigated the cause. In my case, it was the post-installation stuff that took some real troubleshooting. So I don't have much to offer beyond the generic troubleshooting tips.

I looked at your thread, but it doesn't appear to have /tmp/X.log, please post that if you can. You're at least making it to the console, so that's good. I would definitely try booting with nomodeset if you haven't already. It can fix a wide variety of different X-related problems.

Also please mention what Qubes ISO versions and kernel versions you've tried. You may want to try an R4.1 pre-release build. Look for the link Brendan posted earlier in this thread. You may also want to try installing Qubes on a different machine, upgrading to kernel-latest, and then moving the disk or USB drive to the target machine.

For what it's worth, the "[Firmware bug]" and "ACPI Error" lines are quite common, if not universal, on Ryzen systems. However they don't seem to be related to any specific problems in practice, so I wouldn't worry too much about those. The "Failed to load kernel modules" error seems to be common in Qubes and even other OSes, regardless of hardware, so I wouldn't worry about that either. I doubt any of those are directly related to the X failure you're experiencing.

I can also say, sys-usb causes all manner of problems on my machine and for some other Ryzen users as well. So when you do finally make it to that point, I definitely would not recommend enabling that option until you have everything else working.

M

Jan 24, 2020, 7:55:26 AM
to qubes-users
Thank you for your answer.

What do you mean by “nomodeset” ? - is it regarding legacy and UEFI mode or... ?

As I have only tried the Qubes OS stable versions 4.0.1 and 4.0.2, and am now going to try 4.0.3, the kernel version is 4.19. How can I try to install Qubes with a newer kernel?

Could an idea be to try to install Linux Mint or Fedora 31 if 4.0.3 doesn’t work either ? - just to make sure they work and rule basic things out.

Both PartedMagic (24/12-2019) and Trial 4.2.2 live-versions run fine (and also Windows 10).

Claudia

Jan 24, 2020, 8:28:25 AM
to M, qubes-users
January 24, 2020 12:55 PM, "M" <annee...@gmail.com> wrote:

> Thank you for your answer.
>
> What do you mean by “nomodeset” ? - is it regarding legacy and UEFI mode or... ?

In 4.0, to enable nomodeset you have to edit the bootloader files on the installation media. I just realized, since you're using DVDs instead of USB, this is going to be a lot more difficult. You'll have to unpack the ISO, modify the boot loader file, and then repack the ISO and burn it. I would recommend using a USB drive in this case if you can. That way you can do the modifications directly to the USB drive, and you don't have to waste additional DVDs.

In R4.1, you just have to press 'e' at the boot menu, and you can make last minute changes to the boot parameters without modifying anything. This would probably be the easiest option.

nomodeset is a kernel command line option that disables kernel-modesetting and prevents graphics drivers from being loaded, so they just use a basic minimal driver essentially. In 4.0 this would be the "kernel=" line of the xen.cfg file.

> As I only have tried with Qubes OS stable version 4.0.1 and 4.0.2 and is now going to try 4.0.3 the
> kernel version is 4.19. How can I try to install Qubes with a newer kernel.

I'm not sure if there's any easy way to install a newer kernel into the installer. The way most people do it is to do the installation on a different machine, install kernel-latest, and then move the drive to the other machine. However 4.0.3 should come with a newer LTS kernel at least, so try that first.

When the installer fails, copy or screenshot /tmp/X.log and post it.

> Could an idea be to try to install Linux Mint or Fedora 31 if 4.0.3 doesn’t work either ? - just to
> make sure they work and rule basic things out.

R4.0 is based on Fedora 25, so you could try booting that just to make sure it works, just to rule that out. However there's still a big difference between Qubes and Fedora 25, so it won't tell us very much.

M

Jan 24, 2020, 10:02:15 AM
to qubes-users
I use DVD’s so that the files can’t be edited or a malicious file can’t be placed on the installation media in case it’s inserted in a compromised pc. But yes, it seems to require a lot of disc as Qubes OS not always develops linear. :j

Regarding editing ISO-files, I’m not as technical as you. So that would require some detailed instructions.

Sorry, it was Tails 4.2.2 instead of Trial.

Just to rule the option out: Could it be possible that when installing from a burned ISO-file the installation fails, while installing from a USB with a transferred ISO-file using Rufus in dd-mode the installation of Qubes OS succeed ? - If so, I would try that first. But then I have to buy a new USB flash drive.

Claudia

Jan 24, 2020, 10:56:48 AM
to M, qubes-users
January 24, 2020 3:02 PM, "M" <annee...@gmail.com> wrote:

> I use DVD’s so that the files can’t be edited or a malicious file can’t be placed on the
> installation media in case it’s inserted in a compromised pc. But yes, it seems to require a lot of
> disc as Qubes OS not always develops linear. :j

This is good security practice. I recommend it if you don't mind the inconvenience.

> Regarding editing ISO-files, I’m not as technical as you. So that would require some detailed
> instructions.

The process for editing the ISO kernel parameters is described in https://www.qubes-os.org/doc/uefi-troubleshooting/ , except in your case you are adding the "nomodeset" option instead of the ones they tell you in the guide (based on your symptoms). Add "nomodeset" to the end of each "kernel=" line in xen.cfg.
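
As a rough sketch of that edit, assuming xen.cfg uses plain "key=value" lines as described above: the helper below appends an option to every line for a given key and leaves lines that already carry it alone. The function name and the sample file contents are made up for illustration and are not taken from a real installer image.

```python
def append_option(cfg_text, key, option):
    """Append `option` to every '<key>=' line that does not already contain it."""
    out = []
    for line in cfg_text.splitlines():
        if line.startswith(key + "="):
            existing = line.split("=", 1)[1].split()
            if option not in existing:
                line = line.rstrip() + " " + option
        out.append(line)
    return "\n".join(out) + "\n"

# Made-up example contents, NOT the real installer xen.cfg.
sample = """\
[global]
default=qubes

[qubes]
options=loglvl=all
kernel=vmlinuz rhgb quiet
ramdisk=initrd.img
"""
print(append_option(sample, "kernel", "nomodeset"))
```

The same helper could in principle be pointed at the "options=" line for Xen parameters, for example the dom0_max_vcpus=1 and dom0_vcpus_pin settings mentioned earlier in this thread, one option per call.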

Note: after you rebuild the ISO, optionally you may want to run it in a VM to make sure you got everything right, before you burn a DVD. Don't expect it to actually work correctly, but just make sure you're able to select the "Install Qubes" boot menu entry and that it doesn't complain about a bad config file or anything. If everything goes as it should, most likely you'll get a "sorry, this system doesn't support virtualization" type of message because it's already running in a VM. If so, that's good, burn to DVD.

However, that being said...

Honestly, the easiest thing right now would be download the R4.1 pre-release, burn it, try it with default settings first, and if you get the same problem as before, add nomodeset. To do that just press 'e' when you see the grub boot menu (with option "Qubes R4.1, with Xen hypervisor" highlighted) and then add "nomodeset" at the end of the kernel line (it looks something like "multiboot2 vmlinuz ...").

> Just to rule the option out: Could it be possible that when installing from a burned ISO-file the
> installation fails, while installing from a USB with a transferred ISO-file using Rufus in dd-mode
> the installation of Qubes OS succeed ? - If so, I would try that first. But then I have to buy a
> new USB flash drive.

In this situation, I kind of doubt it. The problem seems to be at a higher level than that, since you're getting to an anaconda console at least. It's probably a kernel version issue or a graphics driver issue.

M

Jan 25, 2020, 11:08:59 AM
to qubes-users
1) I have tried to install R4.0.3 in legacy mode with the same result. And I have also tried the other installation possibilities that the DVD offer with the same result.
Can I somehow save the logs on a USB-drive as I’m running the installer from a DVD, or is it only possible to get access to the log files afterwards by installing Qubes OS from a USB-drive ?

2) I have tried to look for the “xen.cfg” file you mentioned, but can’t find a file with that name in the downloaded ISO-file. Where to find it or is it called something else ?

3) By “R4.1 pre-release” do you then mean “R4.0.1” ?
I have tried to load R4.0.1 in legacy mode and when the grub boot menu appears, there isn’t any option labeled “Qubes R4.1, with Xen Hypervisor”. Just the same installer menu as on the later versions. And if I press “e”, nothing seems to happen.

4) I’m also not totally sure where to add “nomodeset” when you just say at the end of the kernel line (it looks something like “multiboot2 vmlinuz”)... Sorry, could you be a little more precise on where I shall write it? Maybe show it in a picture... Just to make sure I add it the right place.

Claudia

Jan 25, 2020, 11:59:17 AM
to M, qubes-users
January 25, 2020 4:09 PM, "M" <annee...@gmail.com> wrote:

> 1) I have tried to install R4.0.3 in legacy mode with the same result. And I have also tried the
> other installation possibilities that the DVD offer with the same result.
> Can I somehow save the logs on a USB-drive as I’m running the installer from a DVD, or is it only
> possible to get access to the log files afterwards by installing Qubes OS from a USB-drive ?
>
> 2) I have tried to look for the “xen.cfg” file you mentioned, but can’t find a file with that name
> in the downloaded ISO-file. Where to find it or is it called something else ?

Make sure you're looking in the first partition (/dev/sdb1). I'm not sure what directory it's in on the installer and I don't have a copy of it handy. On an installed system it's under (/dev/sdb1)/EFI/qubes/xen.cfg. Note this is for R4.0 only.

You might want to start a new thread about that, so someone with more experience in the installer can help you with that.

> 3) By “R4.1 pre-release” do you then mean “R4.0.1” ?

No, R4.1 is an upcoming version that hasn't been released yet, but has unstable builds available.

https://openqa.qubes-os.org/tests/5493/asset/iso/Qubes-4.1-20200113-x86_64.iso

R4.0.x versions are the same as R4.0, but with updates preinstalled.

> I have tried to load R4.0.1 in legacy mode and when the grub boot menu appears, there isn’t any
> option labeled “Qubes R4.1, with Xen Hypervisor”. Just the same installer menu as on the later
> versions. And if I press “e”, nothing seems to happen.

It might be called something different. It'll most likely be the default menu entry which is already highlighted, usually the first in the list.

I have no idea why "e" doesn't work. Can you move up and down to different menu entries?

> 4) I’m also not totally sure where to add “nomodeset” when you just say at the end of the kernel
> line (it looks something like “multiboot2 vmlinuz”)... Sorry, could you be a little more precise on
> where I shall write it? Maybe show it in a picture... Just to make sure I add it the right place.


Here's a copy from an installed R4.1 system, but the entry in the installer should look similar. nomodeset is on the second-to-last "module2" line (don't type the asterisks). In legacy mode this line will start with "linux /vmlinuz-" (I think) but works the same way. If you add it in the wrong place, don't worry just reboot and try again.

menuentry 'Qubes, with Xen hypervisor' --class qubes --class gnu-linux --class gnu --class os --class xen $menuentry_id_option 'xen-gnulinux-simple-2136e4d1-da52-4921-90c1-f7617ab8a31f' {
    insmod part_gpt
    insmod ext2
    set root='hd0,gpt2'
    if [ x$feature_platform_search_hint = xy ]; then
        search --no-floppy --fs-uuid --set=root --hint-bios=hd0,gpt2 --hint-efi=hd0,gpt2 --hint-baremetal=ahci0,gpt2 039dd247-6c6e-40c5-a8ec-890d4462da53
    else
        search --no-floppy --fs-uuid --set=root ...
    fi
    echo 'Loading Xen 4.13.0 ...'
    if [ "$grub_platform" = "pc" -o "$grub_platform" = "" ]; then
        xen_rm_opts=
    else
        xen_rm_opts="no-real-mode edd=off"
    fi
    multiboot2 /xen-4.13.0.gz placeholder console=none dom0_mem=min:1024M dom0_mem=max:4096M iommu=no-igfx ucode=scan smt=off ${xen_rm_opts}
    echo 'Loading Linux 4.19.89-1.pvops.qubes.x86_64 ...'
    module2 /vmlinuz-4.19.89-1.pvops.qubes.x86_64 placeholder root=UUID=... ro rd.luks.uuid=luks-... plymouth.ignore-serial-consoles rhgb quiet **nomodeset**
    echo 'Loading initial ramdisk ...'
    module2 --nounzip /initramfs-4.19.89-1.pvops.qubes.x86_64.img
}

M

Jan 25, 2020, 1:46:41 PM
to qubes-users
Thank you very much ! - I’ll try that...

By grub menu you probably mean this one: https://www.qubes-os.org/attachment/wiki/InstallationGuide/grub-boot-menu.png

But I do not get so far as it seems to first show after the installation.

The menu I see is this: https://www.qubes-os.org/attachment/wiki/InstallationGuide/boot-screen.png

Claudia

Jan 25, 2020, 4:24:25 PM
to M, qubes-users

Grub doesn't work on R4.0 in UEFI mode, so the installer probably uses syslinux at least for 4.0, not sure about 4.1. Looking at the screenshot, it looks like you just have to press tab instead of "e" and the rest should be the same. Also the troubleshooting menu is probably worth checking out.

M

Jan 25, 2020, 5:14:33 PM
to qubes-users
I have tried pressing Tab and also tried the options in the troubleshooting menu.

Both ending with the same result...
