R4 system requirements; AMD compatibility?

Claudia

May 22, 2019, 9:46:12 AM
to qubes-users
Hello,

I've read the system requirements page, the HCL, and the "advice on
finding a Qubes compatible notebook" thread, all of which seem to refer
to Intel almost exclusively. I've also done some searching around, but
there seems to be very little information about Qubes' compatibility on
AMD machines.

I already know the conventional wisdom is "buy it, try it, and return it
if it doesn't work," and that's basically what I intend to do. But
before I do, I'm hoping for some reassurance, or advice on whether I
should just skip over AMD altogether.

Is Qubes support for AMD about as good as it is for Intel? Or, is there
a reason to pay the extra money for an Intel machine? Why is there so
little information about Qubes on AMD, and so few AMD machines on the
HCL? Surely I can't be the only person wondering about this.

A few things, more specifically:

1) The system requirements page says that Intel integrated graphics are
strongly recommended over Nvidia or Radeon, for compatibility reasons.
What about AMD's integrated "Vega" graphics?

2) It's harder to find compatibility-related information for AMD
processors. In particular, information about whether RVI ( = Intel EPT,
= SLAT) is supported by a given processor. Official specs, and even
sites like wikichip.org and cpu-monkey.com, often mention Intel EPT but
not AMD RVI. (AMD-V and IOMMU are usually mentioned, though). Is there a
specific cpu flag or something that I should be looking for in order to
know if RVI is supported? Or is it pretty much safe to assume that any
recent AMD processor with AMD-V and IOMMU will also support RVI?

3) Do AMD processors have integrated chipsets like Intel (4th gen and
up) processors? Or does the chipset remain on the motherboard in AMD
machines? Not a dealbreaker, but integrated chipsets definitely make
searching much easier.

For what it's worth, in particular I'm looking at a few laptops with
processors such as Ryzen 5 2500U, Ryzen 7 2700U, and Ryzen 5 3500U. They
seem like a great deal for the money. For example, the 2500U has about the
same specs/performance as an i5-8250U, but laptops with the 2500U tend
to be around $200 cheaper than the Intel version. But, if Qubes
compatibility is most likely going to be an issue, then I'm willing to
pay the extra money for an Intel machine and avoid the hassle.

So, can I expect to have about the same luck with AMD as Intel? Or
should I just pay the extra money and play it safe with Intel? Any
advice on these particular processors, or recent AMD processors in
general, would make me feel a lot better before I go buying and
trying at random.

Chris Laprise

May 22, 2019, 12:42:47 PM
to Claudia, qubes-users
On 5/22/19 9:44 AM, Claudia wrote:
> Hello,
>
> I've read the system requirements page, the HCL, and the "advice on
> finding a Qubes compatible notebook" thread, all of which seem to refer
> to Intel almost exclusively. I've also done some searching around, but
> there seems to be very little information about Qubes' compatibility on
> AMD machines.
>
> I already know the conventional wisdom is "buy it, try it, and return it
> if it doesn't work," and that's basically what I intend to do. But
> before I do, I'm hoping for some reassurance, or advice on whether I
> should just skip over AMD altogether.
>
> Is Qubes support for AMD about as good as it is for Intel? Or, is there
> a reason to pay the extra money for an Intel machine? Why is there so
> little information about Qubes on AMD, and so few AMD machines on the
> HCL? Surely I can't be the only person wondering about this.
>
> A few things, more specifically:
>
> 1) The system requirements page says that Intel integrated graphics are
> strongly recommended over Nvidia or Radeon, for compatibility reasons.
> What about AMD's integrated "Vega" graphics?

That recommendation is overly conservative because early in Qubes'
history only Intel processors had been explored and integrated graphics
was the safe bet. If I were to re-word that phrase, I'd make it sound
more like "avoid Nvidia unless you know exactly what you're doing". IOW,
AMD is not a big worry when it comes to Linux graphics.

>
> 2) It's harder to find compatibility-related information for AMD
> processors. In particular, information about whether RVI ( = Intel EPT,
> = SLAT) is supported by a given processor. Official specs, and even
> sites like wikichip.org and cpu-monkey.com, often mention Intel EPT but
> not AMD RVI. (AMD-V and IOMMU are usually mentioned, though). Is there a
> specific cpu flag or something that I should be looking for in order to
> know if RVI is supported? Or is it pretty much safe to assume that any
> recent AMD processor with AMD-V and IOMMU will also support RVI?

I myself am just getting started with AMD on a circa 2013 laptop. Even
on this 6yr old A10 laptop, all of the virtualization features are there
– although coreboot is required for proper initialization (no surprise:
the laptop is a Lenovo Ideapad, not a Thinkpad).

FWIW, I do think you're right that any x86 processor that supports both
HVM (AMD-V) and IOMMU would also have RVI support. In the Qubes HCL, all
of the Ryzen systems were reported to have RVI(SLAT) working. One entry
also reports that IOMMU was present but not working on a "gaming"
motherboard.

These days, the main issue with these virtualization features is not
with the CPUs/chipsets themselves, but whether the BIOS/UEFI for a
specific computer model enables these features and initializes them
correctly. This is rarely an issue with business class computers from
reputable vendors.
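
On the question of a CPU flag for RVI: on a regular Linux boot (a live USB, say), /proc/cpuinfo reports `svm` for AMD-V and `npt` for nested page tables, which is the kernel's name for RVI; the Intel counterparts are `vmx` and `ept`. Below is a minimal Python sketch of that check. The function name is made up for illustration, the flags may be hidden inside Xen dom0, and IOMMU support does not show up in these flags at all.

```python
def virt_flags(cpuinfo_text):
    """Report HVM and SLAT support based on the text of /proc/cpuinfo."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # Every logical CPU has its own "flags : ..." line; the first is enough.
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            break
    return {
        "hvm": bool(flags & {"svm", "vmx"}),   # AMD-V or Intel VT-x
        "slat": bool(flags & {"npt", "ept"}),  # AMD RVI or Intel EPT
    }
```

To try it, pass in the contents of /proc/cpuinfo, e.g. `virt_flags(open("/proc/cpuinfo").read())`. Whether the firmware actually enables these features is a separate matter, as noted above.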

>
> 3) Do AMD processors have integrated chipsets like Intel (4th gen and
> up) processors? Or does the chipset remain on the motherboard in AMD
> machines? Not a dealbreaker, but integrated chipsets definitely make
> searching much easier.

Someone else with direct Ryzen experience could better answer this
question. I know that in the 15h (2013) era the chipsets were separate
(but this was also true for Intel at least up to 2012).

>
> For what it's worth, in particular I'm looking at a few laptops with
> processors such as Ryzen 5 2500U, Ryzen 7 2700U, and Ryzen 5 3500U. They
> seem like a great deal for the money. For example, the 2500U has about the
> same specs/performance as an i5-8250U, but laptops with the 2500U tend
> to be around $200 cheaper than the Intel version. But, if Qubes
> compatibility is most likely going to be an issue, then I'm willing to
> pay the extra money for an Intel machine and avoid the hassle.
>
> So, can I expect to have about the same luck with AMD as Intel? Or
> should I just pay the extra money and play it safe with Intel? Any
> advice on these particular processors, or recent AMD processors in
> general, would make me feel a lot better before I go buying and
> trying at random.

You can make the process less random by focusing on sources that have a
history of delivering better quality BIOS/firmware. Systems with the
best chance of compatibility will be business-class and indicate
compatibility with Linux (or Ubuntu/Redhat) in addition to Windows,
typically from vendors Lenovo, Dell, and HP. I would have included
Purism and System76 except they don't offer AMD laptops. In general,
avoid consumer and gaming models (e.g. choose Thinkpad over Ideapad).

Do I think it's worth it to try? Definitely, as long as your choice isn't
random consumer gear. AMD processors tend to be more secure in the face
of side-channel attacks and suffer less performance degradation[1] from
resulting security patches; that's certainly a reason to go in AMD's
direction.

1.
https://www.tomshardware.com/news/intel-amd-mitigations-performance-impact,39381.html

--

Chris Laprise, tas...@posteo.net
https://github.com/tasket
https://twitter.com/ttaskett
PGP: BEE2 20C5 356E 764A 73EB 4AB3 1DC4 D106 F07F 1886

Chris Laprise

May 23, 2019, 12:35:18 PM
to Claudia, qubes-users
On 5/22/19 3:17 PM, Claudia wrote:
> Thanks for all the info! All good news except for the part about
> BIOS/UEFI feature support, which doesn't come as a surprise.
>
> I understand that, for maximum certainty, one should look at high-end,
> business-class, Linux-friendly product lines. However, this kind of
> defeats the whole point of looking for a laptop with a low-cost
> good-performance processor such as Ryzen. Those product lines are pricy
> to begin with (relative to specs), and they rarely ever go on sale.
>
> Also, it stands to reason that low-cost processors are usually found in
> low-cost machines. I found a couple of lists[1][2] of laptops with these
> processors. While there is a Thinkpad on the list, it looks to be mostly
> consumer-level laptops (although I'm not real familiar with computer
> makes and product lines so I could be wrong).
>
> To be honest, I was kind of looking at this[3] Inspiron 5575 with 2500U
> available at Walmart for $350 (was down to $330 for a few days). After
> upgrading the RAM and perhaps adding an SSD, it still looks like a good
> deal for the money. Good enough to take a chance on, I think -- if it
> doesn't run Qubes, I'll just have to return it and look at some
> higher-end models.

It sounds like you have an approach worked out.

If the low cost model doesn't run Qubes, then consider: Now that AMD is
being taken seriously for business laptops (where they used to be
extremely rare), there should now be more choices that are lower cost
_because_ they have an AMD processor.

FWIW, firmware compatibility isn't good on consumer models regardless of
the CPU vendor: Intel based consumer products suffer from it, too.

>
> I did some quick searching, and Inspirons appear to have reasonably good
> Linux compatibility, at least for consumer-level laptops. Inspiron 5575
> is even on the arch wiki[4] (which is definitely not conclusive, but a
> good sign nonetheless, no?)

Dell is said to have a larger number of models that are _intentionally_
Linux compatible. It's worth checking their website to see if that model
(or any adjacent to it) is labeled as Linux compatible. The Dell
support page for Inspiron 5575 actually lists Ubuntu as a compatible
OS... a good sign!

>
> For what it's worth, I took a Qubes installer USB to Walmart, and tried
> it on a couple of consumer-level machines, although they didn't have any
> Inspirons on display. The only one that didn't work was an older HP
> Pavilion with an A10 processor (it dropped to shell before graphical
> installer came up), but it did work on an HP Graphite Mist i5-8250U and
> one other machine which I can't recall at the moment. (Note by "worked"
> I mean it made it to the final point in the installer before you
> actually write to the disk.)
>
> So, unless there are any red flags I'm not seeing, I'm leaning towards
> giving the Inspiron 5575 a try.

Only red flag is consumer designation. As is often the case, Dell may
prefer to keep virtualization features turned off (and not provide a
BIOS screen option to turn them on) to avoid support costs.

>
> Thanks once again for your comprehensive and helpful reply! Any
> additional thoughts/advice welcome.

Good luck! And let us know how it goes and if you have any specific
questions.


>
> [1] https://www.notebookcheck.net/AMD-Ryzen-5-2500U-SoC.258646.0.html
> [2]
> https://www.reddit.com/r/Amd/comments/8v6e1u/list_of_ryzen_and_vega_amd_laptops_6302018/
>
> [3]
> https://www.walmart.com/ip/Dell-Inspiron-15-5000-5575-Laptop-15-6-AMD-Ryzen-5-2500U-with-Radeon-Vega8-Graphics-1TB-HDD-4GB-RAM-i5575-A410BLU-PUS/212669685
>
> [4] https://wiki.archlinux.org/index.php/Dell_Inspiron_5575

Claudia

May 23, 2019, 9:36:47 PM
to Chris Laprise, qubes-users
Chris Laprise:
Thanks for all the info! All good news except for the part about
BIOS/UEFI feature support, which doesn't come as a surprise.

I understand that, for maximum certainty, one should look at high-end,
business-class, Linux-friendly product lines. However, this kind of
defeats the whole point of looking for a laptop with a low-cost
good-performance processor such as Ryzen. Those product lines are pricy
to begin with (relative to specs), and they rarely ever go on sale.

Also, it stands to reason that low-cost processors are usually found in
low-cost machines. I found a couple of lists[1][2] of laptops with these
processors. While there is a Thinkpad on the list, it looks to be mostly
consumer-level laptops (although I'm not real familiar with computer
makes and product lines so I could be wrong).

To be honest, I was kind of looking at this[3] Inspiron 5575 with 2500U
available at Walmart for $350 (was down to $330 for a few days). After
upgrading the RAM and perhaps adding an SSD, it still looks like a good
deal for the money. Good enough to take a chance on, I think -- if it
doesn't run Qubes, I'll just have to return it and look at some
higher-end models.

I did some quick searching, and Inspirons appear to have reasonably good
Linux compatibility, at least for consumer-level laptops. Inspiron 5575
is even on the arch wiki[4] (which is definitely not conclusive, but a
good sign nonetheless, no?)

For what it's worth, I took a Qubes installer USB to Walmart, and tried
it on a couple of consumer-level machines, although they didn't have any
Inspirons on display. The only one that didn't work was an older HP
Pavilion with an A10 processor (it dropped to shell before graphical
installer came up), but it did work on an HP Graphite Mist i5-8250U and
one other machine which I can't recall at the moment. (Note by "worked"
I mean it made it to the final point in the installer before you
actually write to the disk.)

So, unless there are any red flags I'm not seeing, I'm leaning towards
giving the Inspiron 5575 a try.

Thanks once again for your comprehensive and helpful reply! Any
additional thoughts/advice welcome.

Claudia

May 30, 2019, 10:01:07 AM
to Chris Laprise, qubes-users
Chris Laprise:
(Sorry it took me this long to reply.)

Indeed, that's what it sounded like to me from your initial reply. I was
more worried about CPU vendor, mostly due to the way the system
requirements page is worded, and lack of relevant search results for
"Qubes on AMD." But as you pointed out, BIOS/UEFI support (and
therefore, the specific computer/motherboard model) is a much more
important consideration. So I'm glad I asked.

Also, just out of curiosity, as far as consumer vs. business and
firmware support: in your experience does firmware support also depend
significantly on the vendor, and not just product class? e.g. perhaps
vendor A's consumer-level products work even better than B's
business-class. Do any particular vendors use generally "better"
firmwares? (I know you mentioned Lenovo, Dell, and HP with regards to
Linux support.) Or is firmware support about the same across vendors,
within each product class?

>> I did some quick searching, and Inspirons appear to have reasonably
>> good Linux compatibility, at least for consumer-level laptops.
>> Inspiron 5575 is even on the arch wiki[4] (which is definitely not
>> conclusive, but a good sign nonetheless, no?)
>
> Dell is said to have a larger number of models that are _intentionally_
> Linux compatible. It's worth checking their website to see if that model
> (or any adjacent to it) is labeled as Linux compatible. The Dell
> support page for Inspiron 5575 actually lists Ubuntu as a compatible
> OS... a good sign!

Just to be clear, what relationship is there between Linux
compatibility, and firmware support, if any? Are these the same thing?
Roughly correlated? Or totally different?

> Good luck! And let us know how it goes and if you have any specific
> questions.

I'll make sure to follow up. I'm waiting to order it until I know I'll
have enough free time to play around with it, as I'll only have a
limited time to decide if I'm keeping it or returning it. Hopefully
within the next week or two.

I thought of some more questions in the meantime. Hope you don't mind me
picking your brain a little.

1) Should I update the BIOS before attempting to install Qubes? Is this
a generally recommended practice for Qubes, and if so, why isn't it
mentioned in the installation guide? I wonder how many people gave up on
a Qubes install without ever trying a firmware update. Should I keep the
firmware up to date thereafter? Firmware updates used to be rare, and it
was only recommended to install them if something was actually broken. I
guess now that the firmware is practically an OS in itself, we should be
updating it like one?

Luckily, Dell provides a way to update firmware from Linux, and a way to
do it with no OS at all. However, I imagine some (most) vendors require
Windows in order to install firmware updates. Just out of curiosity, how
do Linux users do firmware updates on machines without fwupd or a
self-updating firmware?

Also, what about microcode? Can microcode updates affect Qubes
compatibility? I know microcode is typically loaded by Linux each boot,
but if the system can't boot, I guess you have to install a permanent
update through the BIOS?

These are just things I thought of and was curious about, but I guess I
don't have to worry unless I actually hit a problem.

2) Do you think 4GB RAM will be enough to do the install? The system
requirements list 4GB as minimum, so I'm assuming it'll work. I'd rather
not buy the RAM until I know I'm keeping the machine, although I will if
I have to. But if I am going to need RAM for the install, I should order
it ahead of time.

brenda...@gmail.com

May 30, 2019, 1:38:51 PM
to qubes-users
On Thursday, May 30, 2019 at 10:01:07 AM UTC-4, Claudia wrote:
> 1) Should I update the BIOS before attempting to install Qubes? Is this
> a generally recommended practice for Qubes, and if so, why isn't it
> mentioned in the installation guide?

I know you're asking Chris, so feel free to discount my comments here. :)

My strategy before Qubes installation is:
1. Update the BIOS/UEFI PC firmware. Must be done on target workstation/server.
2. Update the SSD firmware. Can be done on a different workstation.
3. Reset the factory DEK on the SSD if it supports OPAL or OPALite. Can be done on a different workstation.
4. NEW: turn off hyperthreading in the BIOS/UEFI config.

> I wonder how many people gave up on
> a Qubes install without ever trying a firmware update. Should I keep the
> firmware up to date thereafter? Firmware updates used to be rare, and it
> was only recommended to install them if something was actually broken. I
> guess now that the firmware is practically an OS in itself, we should be
> updating it like one?

I do, usually after some new Intel CPU vulnerability is published. :)

> Luckily, Dell provides a way to update firmware from Linux, and a way to
> do it with no OS at all. However, I imagine some (most) vendors require
> Windows in order to install firmware updates. Just out of curiosity, how
> do Linux users do firmware updates on machines without fwupd or a
> self-updating firmware?

I install a test Windows instance, download the firmware from the vendor and install it using the vendor tools. That's probably not the most secure method out there.

> Also, what about microcode? Can microcode updates affect Qubes
> compatibility? I know microcode is typically loaded by Linux each boot,
> but if the system can't boot, I guess you have to install a permanent
> update through the BIOS?

One thing that enterprise-serving PC manufacturer *firmware* updates generally cover is the microcode updates. Dell, Lenovo, etc. generally update the firmware with the microcode. Sometimes the BIOS firmware might get the update first, sometimes Linux distros might.

> 2) Do you think 4GB RAM will be enough to do the install? The system
> requirements list 4GB as minimum, so I'm assuming it'll work. I'd rather
> not buy the RAM until I know I'm keeping the machine, although I will if
> I have to. But if I am going to need RAM for the install, I should order
> it ahead of time.

I semi-recently installed 4.0 (can't remember which ISO release) on a system w/ only 4GB of RAM and it installed fine. However, it freaked out trying to start up VMs, repeating and failing continuously. Part of the issue there, I think, is a defect that has already been fixed in testing (and recently pushed to current) where dom0 reserved more RAM than necessary, severely limiting the RAM available to VMs.

So for 4.0 I recommend at least 8GB and preferably 12GB. If you'll be running HVMs and/or Windows VMs that cannot participate in memory balancing like the PVH VMs can, 16GB minimum.

Brendan

Sphere

Jun 4, 2019, 5:51:51 AM
to qubes-users
I haven't personally tried this, but I am highly confident that Qubes will work on an ASRock X370 Taichi

https://www.asrock.com/mb/AMD/X370%20Taichi/

It's the motherboard that has aced GPU passthrough to a Windows guest VM on a Linux host, so all in all it should be very reliable for Qubes' many VM requirements. It also seems to run various Linux distros reliably, judging from the many pages/articles I've read about doing GPU passthrough on Linux.

All in all, it's really worth a try, and even if it doesn't work with Qubes, it may work with alternatives worth considering, such as Subgraph OS.
https://subgraph.com/

X470 and X570 boards are out already, but I'm not sure how they fare with Qubes OS, given that there's a lot of new stuff in them right now that may not be compatible or work well with Linux.

Claudia

Jul 22, 2019, 11:37:52 AM
to Chris Laprise, qubes-users
Chris Laprise:
So I finally got around to doing this.

Qubes works and all the basic features are supported (AMD-V and AMD-Vi,
the AMD equivalents of VT-x and VT-d, and so on), as far as I can tell.
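
For anyone wanting to double-check this from dom0: `xl info` prints a `virt_caps` line, and on Qubes 4.0-era Xen the entries `hvm` and `hvm_directio` correspond to hardware virtualization and usable IOMMU passthrough. A rough Python sketch that reads that line follows; the function name is hypothetical, it says nothing about SLAT, and newer Xen versions may list additional entries.

```python
def xen_virt_caps(xl_info_text):
    """Extract HVM and IOMMU passthrough support from `xl info` output."""
    for line in xl_info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "virt_caps":
            caps = set(value.split())
            return {
                "hvm": "hvm" in caps,              # AMD-V / VT-x usable by Xen
                "iommu": "hvm_directio" in caps,   # AMD-Vi / VT-d usable by Xen
            }
    # No virt_caps line found: treat both as unavailable.
    return {"hvm": False, "iommu": False}
```

The text to pass in is whatever `xl info` prints in a dom0 terminal. The `qubes-hcl-report` tool in dom0 gives a fuller summary and is probably the better choice for an HCL submission.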

One major issue, hardware/firmware-wise:

1) It doesn't come back from suspend. The fan stops, but there are no
blinking lights (actually, no lights besides AC and caps lock), and
nothing I do wakes it. I have to long-press the power button, then press
again to turn on the machine. It's probably an ACPI issue, probably not
a graphics driver issue as there's no dGPU. I played around with
acpi_osi= and some basic troubleshooting but the odds of me being able
to fix it are slim to none.

Minor issues:

2) Buggy BIOS / ACPI impl. Dom0 kernel dmesg complains about "[Firmware
bug]" and ACPI issues. Though no noticeable problems except suspend. The
firmware's OS-less update-from-USB-drive feature seems broken, I tried
several times. Still might be able to update from fwupd, though. No UI
for managing secure boot keys, etc., it seems to only have the bare
minimum options.

So, it appears you were totally right about consumer-grade laptops and
buggy firmware. But suspend/resume is problematic even among
high-end/business-class laptops, too, isn't it? It's just something
Linux has never been good at.

3) USB qube isn't working. I installed with USB qube, and the microphone
shows up fine. But flash drives and the card reader don't show up. When
I plug in a USB drive, its LED blinks on for a fraction of a second,
then turns off (on other machines it stays on). No sign of it in lsusb
or lsblk in either sys-usb or dom0.

However, when I remove the rd.qubes.hide_all_usb kernel flag, it works
normally, so I think this is just a software issue.

4) Screen power management (turn off display) doesn't work, although I
had the same problem with a machine where suspend does work, and I think
I narrowed it down to a fedora/X11 issue. The display does turn off when
the lid is closed and lid-switch is set to "do nothing," though.

5) ...plus a few other minor issues probably not hardware related.
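
The firmware complaints from item 2 can be pulled out of the kernel log, which is handy for a bug report or for comparing before and after a BIOS update. A small Python sketch, assuming the usual kernel prefixes (`[Firmware Bug]`, `[Firmware Warn]`, `ACPI Error`, `ACPI Warning`, `ACPI BIOS Error`); the function name is hypothetical and the pattern is deliberately loose.

```python
import re

# Matches the kernel's firmware-blame prefixes and common ACPI complaints.
FIRMWARE_PATTERN = re.compile(
    r"\[firmware (bug|warn|info)\]|\bACPI (error|warning|bios error)",
    re.IGNORECASE,
)


def firmware_complaints(dmesg_text):
    """Return the dmesg lines that look like firmware or ACPI complaints."""
    return [
        line for line in dmesg_text.splitlines()
        if FIRMWARE_PATTERN.search(line)
    ]
```

The input is the saved output of `dmesg` from dom0. An empty result does not prove the firmware is fine; it only means none of these prefixes appeared.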



Right now I'm trying to decide if I can live without suspend. But, this
is such a common problem that I'm afraid the next one I trade it in for
would have the same problem, and the next one after that. Then I spent
twice the money and got nowhere. This issue is all-too-common on laptops
running Linux. It could be fixed (or broken) on any machine at any time
in a random kernel update, too, but who knows.

This is especially a problem because Xen doesn't support hibernation at
all (not to mention whether it would actually work), and Qubes doesn't
support Xen's "save VM state" feature, either of which I could live with
instead. So my only choices are "on" and "off."

Besides suspend being broken, I actually really like it, and you can't
go wrong for the price.

I think I'm going to try installing Ubuntu and testing suspend from
there, and also trying to update the firmware from fwupd, but I'm not
holding my breath.

So, any advice on troubleshooting suspend... or advice on what to do
next, I guess... would be appreciated. Ugh, this is totally frustrating.

Chris Laprise

Jul 22, 2019, 12:06:14 PM
to Claudia, qubes-users
On 7/22/19 11:38 AM, Claudia wrote:
> So I finally got around to doing this.
>
> Qubes works and all the basic features are supported, VT-x VT-d, and so
> on, as far as I can tell.
>
> One major issue, hardware/firmware-wise:
>
> 1) It doesn't come back from suspend. The fan stops, but there are no
> blinking lights (actually, no lights besides AC and caps lock), and
> nothing I do wakes it. I have to long-press the power button, then press
> again to turn on the machine. It's probably an ACPI issue, probably not
> a graphics driver issue as there's no dGPU. I played around with
> acpi_osi= and some basic troubleshooting but the odds of me being able
> to fix it are slim to none.

This is good to hear!

>
> Minor issues:
>
> 2) Buggy BIOS / ACPI impl. Dom0 kernel dmesg complains about "[Firmware
> bug]" and ACPI issues. Though no noticeable problems except suspend. The
> firmware's OS-less update-from-USB-drive feature seems broken, I tried
> several times. Still might be able to update from fwupd, though. No UI
> for managing secure boot keys, etc., it seems to only have the bare
> minimum options.
>
> So, it appears you were totally right about consumer-grade laptops and
> buggy firmware. But suspend/resume is problematic even among
> high-end/business-class laptops, too, isn't it? It's just something
> Linux has never been good at.

TBH, I haven't had long-term problems with suspend on Thinkpads _except_
when anti-evil-maid is enabled; that combo hasn't worked for about 2 years.

My experience is that consumer BIOS will result in poor suspend
compatibility.

>
> 3) USB qube isn't working. I installed with USB qube, and the microphone
> shows up fine. But flash drives and the card reader don't show up. When
> I plug in a USB drive, its LED blinks on for a fraction of a second,
> then turns off (on other machines it stays on). No sign of it in lsusb
> or lsblk in either sys-usb or dom0.
>
> However, when I remove the rd.qubes.hide_all_usb kernel flag, it works
> normally, so I think this is just a software issue.

This could have to do with how the USB controllers respond to the steps
involved in placing it under IOMMU passthrough. I think without
hide_all_usb set, they get reset twice (once in dom0 and once in sys-usb)?

>
> 4) Screen power management (turn off display) doesn't work, although I
> had the same problem with a machine where suspend does work, and I think
> I narrowed it down to a fedora/X11 issue. The display does turn off when
> the lid is closed and lid-switch is set to "do nothing," though.

I usually have to switch to KDE with sddm to get this working.

>
> 5) ...plus a few other minor issues probably not hardware related.
>
>
>
> Right now I'm trying to decide if I can live without suspend. But, this
> is such a common problem that I'm afraid the next one I trade it in for
> would have the same problem, and the next one after that. Then I spent
> twice the money and got nowhere. This issue is all-too-common on laptops
> running Linux. It could be fixed (or broken) on any machine at any time
> in a random kernel update, too, but who knows.
>
> This is especially a problem because Xen doesn't support hibernation at
> all (not to mention whether it would actually work), and Qubes doesn't
> support Xen's "save VM state" feature, either of which I could live with
> instead. So my only choices are "on" and "off."

This is an excellent point, and I think there is a Qubes issue about VM
hibernation...

>
> Besides suspend being broken, I actually really like it, and you can't
> go wrong for the price.
>
> I think I'm going to try installing Ubuntu and testing suspend from
> there, and also trying to update the firmware from fwupd, but I'm not
> holding my breath.

That's what I would also try first. Qubes 2.0 used to make my ethernet
NIC go dead, but booting temporarily with an Ubuntu live cd would get it
working again and I could use it in Qubes after that until I did a
Qubes-to-Qubes reboot. Problem stopped around Qubes 4.0. :)

>
> So, any advice on troubleshooting suspend... or advice on what to do
> next, I guess... would be appreciated. Ugh, this is totally frustrating.

You should try these:

* Find which wifi modules are being used in sys-net (i.e. do "sudo
lsmod") then add them to /rw/config/suspend-module-blacklist. I find
this is usually required to get suspend working right. For an Intel wifi
card, you would add both 'iwldvm' and 'iwlwifi' in that order.

* Upgrade the dom0 and vm kernels to 4.19 or later. The 4.19 versions
from qubes*testing have been very stable for me. OTOH, there are also
5.x versions available.
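
For the first suggestion, the order can be worked out from the "Used by" column of `lsmod`: a module that uses another one (iwldvm uses iwlwifi) is listed first. Here is a minimal Python sketch of that ordering. The function name is made up, picking which modules count as wifi modules is still up to you, and it assumes the blacklist file is processed top to bottom, which is what "in that order" suggests.

```python
def blacklist_order(lsmod_text, wanted):
    """Order the wanted modules so that users come before what they use."""
    users = {}
    for line in lsmod_text.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0] == "Module":
            continue  # skip the header and malformed lines
        # The optional fourth column lists the modules that use this one.
        users[parts[0]] = parts[3].split(",") if len(parts) > 3 else []
    remaining = [m for m in wanted if m in users]
    ordered = []
    while remaining:
        # A module is ready once none of its users is still waiting.
        ready = [m for m in remaining
                 if not any(u in remaining for u in users[m])]
        if not ready:  # unexpected dependency cycle: keep the given order
            ready = list(remaining)
        ordered.extend(ready)
        remaining = [m for m in remaining if m not in ready]
    return ordered
```

Feed it the output of `sudo lsmod` from sys-net plus the module names you picked, then put the resulting names into /rw/config/suspend-module-blacklist by hand (one name per line is my assumption about the file layout).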

Claudia

Jul 22, 2019, 4:55:30 PM
to Chris Laprise, qubes-users
Chris Laprise:
Not what I wanted to hear of course, but it does seem that way. I can't
say you didn't warn me. But that's the *only* firmware/hardware issue
I've run into so far. I feel like I'm so close!

>
>>
>> 3) USB qube isn't working. I installed with USB qube, and the
>> microphone shows up fine. But flash drives and the card reader don't
>> show up. When I plug in a USB drive, its LED blinks on for a fraction
>> of a second, then turns off (on other machines it stays on). No sign
>> of it in lsusb or lsblk in either sys-usb or dom0.
>>
>> However, when I remove the rd.qubes.hide_all_usb kernel flag, it works
>> normally, so I think this is just a software issue.
>
> This could have to do with how the USB controllers respond to the steps
> involved in placing it under IOMMU passthrough. I think without
> hide_all_usb set, they get reset twice (once in dom0 and once in sys-usb)?

As long as it's not strictly firmware/hardware-related, I'll worry about
it later and start another thread for it. If I have to get rid of the
USB qube or something it's not that big of a deal.

>
>>
>> 4) Screen power management (turn off display) doesn't work, although I
>> had the same problem with a machine where suspend does work, and I
>> think I narrowed it down to a fedora/X11 issue. The display does turn
>> off when the lid is closed and lid-switch is set to "do nothing," though.
>
> I usually have to switch to KDE with sddm to get this working.

I think I had it working temporarily on my current machine by directly
using X11 commands, it was just XFCE not using them correctly or
something. It must happen to a lot of people, if you ran into it as
well. That's something I can worry about later too. I'll keep sddm in
mind and see if that fixes it.

>
>>
>> 5) ...plus a few other minor issues probably not hardware related.
>>
>>
>>
>> Right now I'm trying to decide if I can live without suspend. But,
>> this is such a common problem that I'm afraid the next one I trade it
>> in for would have the same problem, and the next one after that. Then
>> I spent twice the money and got nowhere. This issue is all-too-common
>> on laptops running Linux. It could be fixed (or broken) on any machine
>> at any time in a random kernel update, too, but who knows.
>>
>> This is especially a problem because Xen doesn't support hibernation
>> at all (not to mention whether it would actually work), and Qubes
>> doesn't support Xen's "save VM state" feature, either of which I could
>> live with instead. So my only choices are "on" and "off."
>
> This is an excellent point, and I think there is a Qubes issue about VM
> hibernation...

#2414, which hasn't had any activity in two years.

>
>>
>> Besides suspend being broken, I actually really like it, and you can't
>> go wrong for the price.
>>
>> I think I'm going to try installing Ubuntu and testing suspend from
>> there, and also trying to update the firmware from fwupd, but I'm not
>> holding my breath.
>
> That's what I would also try first. Qubes 2.0 used to make my ethernet
> NIC go dead, but booting temporarily with an Ubuntu live cd would get it
> working again and I could use it in Qubes after that until I did a
> Qubes-to-Qubes reboot. Problem stopped around Qubes 4.0. :)
>
>>
>> So, any advice on troubleshooting suspend... or advice on what to do
>> next, I guess... would be appreciated. Ugh, this is totally frustrating.
>
> You should try these:
>
> * Find which wifi modules are being used in sys-net (i.e. do "sudo
> lsmod") then add them to /rw/config/suspend-module-blacklist. I find
> this is usually required to get suspend working right. For an Intel wifi
> card, you would add both 'iwldvm' and 'iwlwifi' in that order.

I didn't think of wifi modules preventing suspend. I'll definitely give
it a try, but I'm not sure it could cause the kind of problem I'm
having. It doesn't seem like it's even trying to wake up when I press a
key or the power button. It just stays sleeping. The only observable
difference between sleep and off is that in sleep pressing the power
button doesn't turn the machine on until I power cycle. Also, I tried
enabling "USB Wake Support" in BIOS, but it didn't seem to make a
difference.

>
> * Upgrade the dom0 and vm kernels to 4.19 or later. The 4.19 versions
> from qubes*testing have been very stable for me. OTOH, there are also
> 5.x versions available.
>

I did `qubes-dom0-update --enablerepo=qubes-dom0-unstable kernel` and I
still couldn't get a newer kernel. 4.14.199-2 popped up and it said
"nothing to do." What am I doing wrong?

It doesn't seem like VM kernels would make a difference with suspend,
but I can try upgrading them anyways.

Thanks again

Claudia

Jul 22, 2019, 5:10:17 PM7/22/19
to Chris Laprise, qubes-users
Claudia:
Got it. current-testing, not -unstable. Sometimes I don't know how to
read, haha. I guess I just assumed unstable was newer than testing.

panina

Jul 23, 2019, 4:46:41 PM7/23/19
to qubes-users
A lot of this sounds like my issues, so I thought I'd give my side. I
run an AMD thinkpad A485 with ryzen 2500u pro. Not what I'd call a
low-end laptop, but the issues are mostly the same.

On 7/22/19 11:09 PM, Claudia wrote:
> Claudia:
>> Chris Laprise:
>>> On 7/22/19 11:38 AM, Claudia wrote:
>>>> So I finally got around to doing this.
>>>>
>>>> Qubes works and all the basic features are supported, VT-x VT-d, and
>>>> so on, as far as I can tell.
>>>>
>>>> One major issue, hardware/firmware-wise:
>>>>
>>>> 1) It doesn't come back from suspend. The fan stops, but there are
>>>> no blinking lights (actually, no lights besides AC and caps lock),
>>>> and nothing I do wakes it. I have to long-press the power button,
>>>> then press again to turn on the machine. It's probably an ACPI
>>>> issue, probably not a graphics driver issue as there's no dGPU. I
>>>> played around with acpi_osi= and some basic troubleshooting but the
>>>> odds of me being able to fix it are slim to none.
>>>
>>> This is good to hear!
>>

I've got the same issue on my thinkpad. It seems to suspend fine, but
when I try to wake it, only the fan and keyboard backlight turn on.
Nothing else. However, when I hard reset it, it seems to think it's
resuming from suspend. Not sure if that's a clue.

I did try fedora on this machine though, and it had the same issues
there, so it doesn't look qubes-specific, at least for me.

>>>
>>>>
>>>> Minor issues:
>>>>
>>>> 2) Buggy BIOS / ACPI impl. Dom0 kernel dmesg complains about
>>>> "[Firmware bug]" and ACPI issues. Though no noticeable problems
>>>> except suspend. The firmware's OS-less update-from-USB-drive feature
>>>> seems broken, I tried several times. Still might be able to update
>>>> from fwupd, though. No UI for managing secure boot keys, etc., it
>>>> seems to only have the bare minimum options.
>>>>
>>>> So, it appears you were totally right about consumer-grade laptops
>>>> and buggy firmware. But suspend/resume is problematic even among
>>>> high-end/business-class laptops, too, isn't it? It's just something
>>>> Linux has never been good at.

I think I have the same issue. I get ACPI errors, one for each CPU
core. So I'd say this is not a "cheap laptop" issue; mine is
mid-range, marketed towards IT security staff.

>>>
>>> TBH, I haven't had long-term problems with suspend on Thinkpads
>>> _except_ when anti-evil-maid is enabled; that combo hasn't worked for
>>> about 2 years.
>>>
>>> My experience is that consumer BIOS will result in poor suspend
>>> compatibility.
>>
>> Not what I wanted to hear of course, but it does seem that way. I
>> can't say you didn't warn me. But that's the *only* firmware/hardware
>> issue I've run into so far. I feel like I'm so close!
>>
>>>
>>>>
>>>> 3) USB qube isn't working. I installed with USB qube, and the
>>>> microphone shows up fine. But flash drives and the card reader don't
>>>> show up. When I plug in a USB drive, its LED blinks on for a
>>>> fraction of a second, then turns off (on other machines it stays
>>>> on). No sign of it in lsusb or lsblk in either sys-usb or dom0.
>>>>
>>>> However, when I remove the rd.qubes.hide_all_usb kernel flag, it
>>>> works normally, so I think this is just a software issue.
>>>
>>> This could have to do with how the USB controllers respond to the
>>> steps involved in placing it under IOMMU passthrough. I think without
>>> hide_all_usb set, they get reset twice (once in dom0 and once in
>>> sys-usb)?
>>
>> As long as it's not strictly firmware/hardware-related, I'll worry
>> about it later and start another thread for it. If I have to get rid
>> of the USB qube or something it's not that big of a deal.

Same issue for me here as well. Mine looks to me to be
IOMMU-group-related. I believe that something in the USB controllers'
IOMMU group is connected to amdgpu. If I boot with rd.qubes.hide_all_usb,
I have to have nomodeset. It looks like a collision with something in the
graphics driver; I think I saw something floating past the screen during
one crash. Sadly I have no record.

>>
>>>
>>>>
>>>> 4) Screen power management (turn off display) doesn't work, although
>>>> I had the same problem with a machine where suspend does work, and I
>>>> think I narrowed it down to a fedora/X11 issue. The display does
>>>> turn off when the lid is closed and lid-switch is set to "do
>>>> nothing," though.
>>>
>>> I usually have to switch to KDE with sddm to get this working.
>>
>> I think I had it working temporarily on my current machine by directly
>> using X11 commands, it was just XFCE not using them correctly or
>> something. It must happen to a lot of people, if you ran into it as
>> well. That's something I can worry about later too. I'll keep sddm in
>> mind and see if that fixes it.
>>

This one works for me, but I also run KDE. Don't remember it from XFCE
though, I think that worked for me there.
This looks interesting, I'll give this a try. I just shelved my
suspension issues, since I thought it wasn't qubes-specific, was going
to nag lenovo about it...

>>
>>>
>>> * Upgrade the dom0 and vm kernels to 4.19 or later. The 4.19 versions
>>> from qubes*testing have been very stable for me. OTOH, there are also
>>> 5.x versions available.
>>>
>>
>> I did `qubes-dom0-update --enablerepo=qubes-dom0-unstable kernel` and
>> I still couldn't get a newer kernel. 4.14.199-2 popped up and it said
>> "nothing to do." What am I doing wrong?
>
> Got it. current-testing, not -unstable. Sometimes I don't know how to
> read, haha. I guess I just assumed unstable was newer than testing.
>

Confused me as well... But yeah, 4.19 seems really important for ryzen,
a lot apparently happened there.

>>
>> It doesn't seem like VM kernels would make a difference with suspend,
>> but I can try upgrading them anyways.
>>
>> Thanks again
>
>

So yeah, a lot of this looks like it's ryzen-specific, and not due to
consumer-grade hardware. Mine isn't the most expensive thinkpad around,
I guess, but I can't really call it cheapish.

<3
/panina

Claudia

Jul 25, 2019, 9:59:36 AM7/25/19
to Chris Laprise, qubes-users
Chris Laprise:

> My experience is that consumer BIOS will result in poor suspend
> compatibility.

Not surprising :( But there are indeed some Thinkpads, MSI, System76,
Purism, and other high-end business-class machines on the HCL with
suspend not working. So it's still a bit of a crapshoot. Just don't want
anyone thinking they're "safe" if they buy an expensive machine.


>
> You should try these:
>
> * Find which wifi modules are being used in sys-net (i.e. do "sudo
> lsmod") then add them to /rw/config/suspend-module-blacklist. I find
> this is usually required to get suspend working right. For an Intel wifi
> card, you would add both 'iwldvm' and 'iwlwifi' in that order.

I did that for ath, ath10k_pci, and ath10k_core in sys-usb. No dice.
Would it help at all to try blacklisting them from dom0, just for
testing purposes? Also: order matters?
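
For reference, the blacklist step can be sketched as a small helper, run inside the qube that owns the wifi device. This is only a rough sketch, not an official Qubes tool: the function name `add_suspend_blacklist` and the `BLACKLIST` override are made up for illustration, and only the file path and the module names come from this thread. Chris's Intel example lists the dependent module (iwldvm) before the one it depends on (iwlwifi); the ath10k order in the usage comment follows the same pattern, which is an assumption about how the list is processed.

```shell
# Sketch only: append kernel module names to the suspend blacklist, skipping
# names that are already listed. The function name and the BLACKLIST override
# are hypothetical; the default path is the one mentioned in this thread.
add_suspend_blacklist() {
    local file="${BLACKLIST:-/rw/config/suspend-module-blacklist}"
    local mod
    touch "$file" || return 1
    for mod in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string
        grep -qxF -- "$mod" "$file" || printf '%s\n' "$mod" >> "$file"
    done
}

# Example (as root, inside the qube with the wifi card), dependents first:
#   add_suspend_blacklist ath10k_pci ath10k_core ath
```

Running it twice with the same names leaves the file unchanged, so it should be safe to re-run while experimenting.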

>
> * Upgrade the dom0 and vm kernels to 4.19 or later. The 4.19 versions
> from qubes*testing have been very stable for me. OTOH, there are also
> 5.x versions available.
>

Upgraded to latest dom0-current-unstable, 4.19.something. Only thing
this changed was that the boot process hangs between Xen and dom0 kernel
about 50% of the time. Tried this and the above fix together. Where can
I get these 5.x versions?

---

I also was able to upgrade the BIOS to 1.3.2 using Freedos. No dice.

BUT! I made a small breakthrough. It suspends and resumes flawlessly
under Ubuntu and Fedora. This stands to reason, as according to Dell the
machine officially supports Ubuntu. So it appears nothing is
fundamentally broken here.

I did some testing to try and narrow it down and here's what I found so far:
Qubes R4 with 4.14 & 4.19 - doesn't work
Ubuntu 19.04 with 5.0.0 - works
Fedora 30 with 5.0.9 - works
Fedora 29 with 4.18.16 - works

Initially I was hoping the fix is already in ~5.0 mainline. But isn't
the Qubes kernel based directly on the Fedora kernel? Why would it be
working on an *older* Fedora kernel and not on a *newer* Qubes unstable
kernel?

So what's the outlier here? Xen? IOMMU? XFCE? (test systems were all GNOME)

This part is interesting. I tried disabling VT-x and VT-d in BIOS and
booting Qubes. It suspends as usual, except now, when I hit a key to
resume it, the fan comes on (unlike before), and it sort of sounds like
the disk spins up, but the display doesn't power back on. Also, the
light on the caps lock key doesn't come on when I press it, and
Alt-SysRq-B doesn't do anything in this state, even after enabling it
beforehand. In other words it feels like it's trying, but won't fully
resume.

So I guess this tells me that it's partly virtualization related
(because it behaves slightly differently when virtualization is
disabled) and partly not (because it behaves differently under Qubes
with virtualization disabled than it does under Fedora). Maybe it's a
Xen issue, regardless of virtualization enabled/disabled?

This also really complicates the search process. It's not good enough to
find a laptop that is known to suspend under Linux. It's a wonder
*anyone* has working suspend in Qubes.

Note I'm assuming it's a kernel-related issue, but I guess it could be
something else entirely -- power management utility, udev, xfce, who
knows what.

Point is, if it's just a version issue, I can deal without suspend for a
while until the fix makes it to Qubes. But if it's some deeply-rooted
virtualization issue, I sort of doubt it'll ever be fixed, whether in
kernel or in BIOS.

Are there any other Xen-based distros out there I could test?

I know it's a lot of guesswork, but I value your advice.

awokd

Jul 25, 2019, 10:18:29 AM7/25/19
to qubes...@googlegroups.com
Claudia:

> This also really complicates the search process. It's not good enough to
> find a laptop that is known to suspend under Linux. It's a wonder
> *anyone* has working suspend in Qubes.

Unfortunately, this is true! Lots of things have to work together in
order for suspend to function. See
https://www.mail-archive.com/qubes...@googlegroups.com/msg27682.html
for an example with my AMD laptop.

> Are there any other Xen-based distros out there I could test?

You can add Xen to your stock Fedora install. That takes it roughly to
where Qubes begins, but you might want to use the same version of Fedora
dom0 uses.

brenda...@gmail.com

Jul 25, 2019, 11:04:53 AM7/25/19
to qubes-users
On Thursday, July 25, 2019 at 10:18:29 AM UTC-4, awokd wrote:
>> Are there any other Xen-based distros out there I could test?
>
> You can add Xen to your stock Fedora install. That takes it roughly to
> where Qubes begins, but you might want to use the same version of Fedora
> dom0 uses.


Claudia:

I'm sure the *last* thing you want to do is add additional variables, as things are easier to diagnose with fewer, but...

...there are very very early test-builds of Qubes R4.1 out there utilizing Xen 4.12 and Fedora 30. This 2019-07-01 build appears semi-stable in light testing. It is, at a high level (and Marek can correct me if I am wrong), the R4.01 codebase with up-to-date Xen and dom0 Fedora w/ any Qubes-related changes to ensure these work with the Qubes code base.

I was able to install that particular test build on a Thinkpad X230 for testing: https://openqa.qubes-os.org/tests/3021

(note: click on assets tab for link to download the ISO)

It might be worth installing one of those as well on another drive to see if newer Xen/Fedora combinations resolve sleep issues or get you closer to resolution.

Brendan

Chris Laprise

Jul 25, 2019, 12:16:00 PM7/25/19
to qubes...@googlegroups.com, Claudia, Brendan Hoar
On 7/25/19 11:04 AM, brenda...@gmail.com wrote:
> ...there are very very early test-builds of Qubes R4.1 out there
> utilizing Xen 4.12 and Fedora 30. This 2019-07-01 build appears
> semi-stable in light testing. It is, at a high level (and Marek can
> correct me if I am wrong), the R4.01 codebase with up-to-date Xen and
> dom0 Fedora w/ any Qubes-related changes to ensure these work with the
> Qubes code base.
>
> I was able to install that particular test build on a Thinkpad X230 for
> testing: https://openqa.qubes-os.org/tests/3021
>
> (note: click on assets tab for link to download the ISO)
>
> It might be worth installing one of those as well on another drive to
> see if newer Xen/Fedora combinations resolve sleep issues or get you
> closer to resolution.

Interesting link!

I was just thinking this could be an "old Fedora" problem. But that also
suggests there may be a way to patch the current dom0 to handle suspend
correctly. Of course, domU is a factor if sys-net or sys-usb are running.

Claudia:

Have you tried suspend/resume with no VMs running at all? This can be
accomplished by manually shutting down appVMs then service VMs, or with
the command 'qvm-shutdown --all --force --wait'.

If this works, then the problem may be in the VMs such as sys-net or
sys-usb. This is possible because those VMs control hardware that must
also respond to special commands related to sleep/wake.

If it doesn't work, then the problem is probably entirely in dom0 and
Fedora 25. Assuming you already have the testing 4.19 kernel, have you
thought of upgrading it to the even newer 5.x one as 'latest'? The
latest kernel is installed by specifying the special package named
'kernel-latest'.
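
The "no VMs running" test above could be scripted roughly as follows, to be run in dom0. This is a minimal sketch under stated assumptions: `run`, `suspend_without_vms` and the `DRY_RUN` switch are hypothetical helpers, the `qvm-shutdown` invocation is the one quoted above, and using `systemctl suspend` (the generic systemd command) instead of the desktop menu is an assumption.

```shell
# Sketch only: stop every qube, then suspend dom0. With DRY_RUN set, the
# commands are printed instead of executed. DRY_RUN and both function names
# are hypothetical helpers, not part of Qubes.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

suspend_without_vms() {
    # Command taken from the message above.
    run qvm-shutdown --all --force --wait || return 1
    # Generic systemd suspend; an assumption, not a Qubes-specific command.
    run systemctl suspend
}

# Example (in dom0):
#   DRY_RUN=1 suspend_without_vms   # only show what would run
#   suspend_without_vms
```

If resume works after this but not with sys-net or sys-usb running, that points at the VMs rather than dom0.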

Brendan Hoar

Jul 25, 2019, 2:03:24 PM7/25/19
to Chris Laprise, Claudia, qubes...@googlegroups.com
On Thu, Jul 25, 2019 at 12:15 PM Chris Laprise <tas...@posteo.net> wrote:
> If it doesn't work, then the problem is probably entirely in dom0 and
> Fedora 25. Assuming you already have the testing 4.19 kernel, have you
> thought of upgrading it to the even newer 5.x one as 'latest'? The
> latest kernel is installed by specifying the special package named
> 'kernel-latest'.

qubes-dom0-update kernel-latest # will update available dom0 kernels list and sets most recent as default

qubes-dom0-update kernel-latest-qubes-vm # will update available VM kernels list and requires manual checking to see if it changed your global defaults and/or changed a per-VM kernel setting when older ones are removed

I use both and generally pull from current + current-testing repos. Currently running 5.1.17-1 in dom0 and a mix in VMs under R4.01.
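
After an update and reboot it can help to confirm which kernel dom0 actually booted. A minimal sketch, assuming GNU `sort` with version sorting (`-V`) is available; `kernel_at_least` is a hypothetical helper name, not a Qubes command.

```shell
# Sketch only: succeed if a kernel version is at least the wanted version.
# kernel_at_least is a hypothetical helper; it relies on GNU sort's -V
# (version sort) option. The second argument defaults to the running kernel.
kernel_at_least() {
    local want="$1"
    local have="${2:-$(uname -r)}"
    # The older of the two versions sorts first; if that is the wanted one,
    # the kernel being checked is new enough.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Example (in dom0):
#   kernel_at_least 4.19 && echo "4.19 or newer" || echo "older than 4.19"
```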

B

brenda...@gmail.com

Jul 26, 2019, 7:52:04 AM7/26/19
to qubes-users


On Thursday, July 25, 2019 at 12:16:00 PM UTC-4, Chris Laprise wrote:
> On 7/25/19 11:04 AM, brend...@gmail.com wrote:
>> I was able to install that particular test build on a Thinkpad X230 for
>> testing: https://openqa.qubes-os.org/tests/3021
>>
>> (note: click on assets tab for link to download the ISO)
>
> Interesting link!

As a side note, when you opened the ticket asking about back-porting "lvm-thin tools" to R4.01 (Fedora 25) for your backup tool work...I was wondering if you were aware that some of the (again, very very early test) builds of R4.1 were semi usable.

Not that we should have to wait to do fast, stable backups until R4.1... :)
 
Brendan

Chris Laprise

Jul 26, 2019, 9:21:48 AM7/26/19
to brenda...@gmail.com, qubes-users
I wasn't aware, but there's no ETA for R4.1 anyway.

Claudia

Jul 26, 2019, 1:24:57 PM7/26/19
to Chris Laprise, qubes...@googlegroups.com, Brendan Hoar
Chris Laprise:
> On 7/25/19 11:04 AM, brenda...@gmail.com wrote:
>> ...there are very very early test-builds of Qubes R4.1 out there
>> utilizing Xen 4.12 and Fedora 30. This 2019-07-01 build appears
>> semi-stable in light testing. It is, at a high level (and Marek can
>> correct me if I am wrong), the R4.01 codebase with up-to-date Xen and
>> dom0 Fedora w/ any Qubes-related changes to ensure these work with the
>> Qubes code base.
>>
>> I was able to install that particular test build on a Thinkpad X230
>> for testing: https://openqa.qubes-os.org/tests/3021
>>
>> (note: click on assets tab for link to download the ISO)
>>
>> It might be worth installing one of those as well on another drive to
>> see if newer Xen/Fedora combinations resolve sleep issues or get you
>> closer to resolution.
>
> Interesting link!
>
> I was just thinking this could be an "old Fedora" problem. But that also
> suggests there may be a way to patch the current dom0 to handle suspend
> correctly. Of course, domU is a factor if sys-net or sys-usb are running.

... But it works on Fedora 29 with 4.18, and not Qubes R4.0.1 with 4.19.

Is a Fedora 29 kernel 4.18 different than a Fedora 25 kernel 4.18? i.e.
different branches/backports or something? Or could something besides
the kernel be causing the difference in behavior?



Trying Fedora 25 with 4.8.6, just for good measure... This is
interesting. At resume, the fan comes on but the screen doesn't power on
(just as in Qubes), but this time, unlike in Qubes, pressing caps lock
*does* turn on the light. Alt-SysRq-B doesn't do anything (it might be
disabled), but incidentally, when I pressed Print Screen I heard a
camera shutter sound effect, and Ctrl-Alt-Del powered it off after 60
seconds. So, the OS is responsive after resume (unlike Qubes), just the
screen is still powered off.

Also, instead of powering off the screen due to inactivity, it just
turns it black instead, much like Qubes.

So it's almost starting to sound like it was a graphics driver issue or
something that was fixed sometime between Fedora 25 and Fedora 29, but
somehow still hasn't made it to Qubes. Still way too early to say, though.

>
> Claudia:
>
> Have you tried suspend/resume with no VMs running at all? This can be
> accomplished by manually shutting down appVMs then service VMs, or with
> the command 'qvm-shutdown --all --force --wait'.

Pretty sure I did, but re-tested to be certain. With all VMs stopped,
resuming makes the fan come on, but the screen doesn't power on, and
pressing the caps-lock key doesn't turn its light on. I have to
long-press the power button and restart.

I could have sworn when I tested this before, resume wouldn't make the
fan come on or anything at all. Although, it would be easy to mistake
one for the other, because the fan isn't the most reliable output
channel. From the time I tested Ubuntu onward, I started using
`sha256sum /dev/urandom` to intentionally get the fan running before
suspend. So it makes sense, in prior testing the fan probably just
wasn't running loud enough to hear.
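
The fan trick can be wrapped so the load is easy to start before suspend and stop after resume. A rough sketch: the function names and the `FAN_LOAD_PID` variable are hypothetical, and only the `sha256sum /dev/urandom` command comes from the testing described here.

```shell
# Sketch only: keep one CPU core busy so the fan is audible before suspend.
# Function names and FAN_LOAD_PID are hypothetical.
start_fan_load() {
    sha256sum /dev/urandom > /dev/null 2>&1 &
    FAN_LOAD_PID=$!
}

stop_fan_load() {
    [ -n "${FAN_LOAD_PID:-}" ] || return 0
    kill "$FAN_LOAD_PID" 2> /dev/null || true
    # Reap the background job so it does not linger as a zombie.
    wait "$FAN_LOAD_PID" 2> /dev/null || true
    FAN_LOAD_PID=
}

# Example:
#   start_fan_load    # wait until the fan is clearly audible, then suspend
#   stop_fan_load     # after a successful resume
```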

So, if that's the case, that means I'm getting the same result in normal
Qubes as in Qubes with VT-x & VT-d disabled. This is good news, as it
means it's probably not some firmware-related virtualization problem, right?

Just to humor myself, I was going to try testing if I could hear sound
from Qubes after resume, but it seems audio isn't working at all. Which
is a whole 'nother problem. Aplay says "... unable to open slave; audio
open error: no such file or directory." `echo -e '\a'` doesn't work even
on a TTY (lsmod shows pcspkr), and `beep` isn't installed.

> If this works, then the problem may be in the VMs such as sys-net or
> sys-usb. This is possible because those VMs control hardware that must
> also respond to special commands related to sleep/wake.

Makes sense, but unfortunately (or fortunately) I couldn't even get it
to resume properly under Qubes even with VT-x & VT-d disabled.

> If it doesn't work, then the problem is probably entirely in dom0 and
> Fedora 25. Assuming you already have the testing 4.19 kernel, have you

Well, that's good news. At least, I'd rather the problem be in dom0
kernel than in firmware! It also means it's probably not device-related,
if I'm following correctly. But, there's still a chance it could be Xen,
too, right?

I don't suppose it's possible to upgrade Qubes to Fedora 29, or so?

> thought of upgrading it to the even newer 5.x one as 'latest'? The
> latest kernel is installed by specifying the special package named
> 'kernel-latest'.
>

Installed 5.1.15-1. Hangs between Xen and dom0 kernel (blank screen with
no underscore) about 50% of the time. Same result. Fan comes back on,
but no caps lock light. Tried again with all VMs stopped. Same result.

So all this kind of makes me think
1) there are special kernel patches or something in Fedora and Ubuntu
that aren't in Qubes, or
2) the issue is caused by something other than the kernel entirely, or
3) it's Xen

I still have to try the R4.1 test build.

brenda...@gmail.com

Jul 26, 2019, 1:37:58 PM7/26/19
to qubes-users

On Friday, July 26, 2019 at 1:24:57 PM UTC-4, Claudia wrote:
> Just to humor myself, I was going to try testing if I could hear sound
> from Qubes after resume, but it seems audio isn't working at all. Which
> is a whole 'nother problem. Aplay says "... unable to open slave; audio
> open error: no such file or directory." `echo -e '\a'` doesn't work even
> on a TTY (lsmod shows pcspkr), and `beep` isn't installed.

Notably, Xen does not pass the real PC speaker device to dom0 and while it simulates it for dom0, it does not actually invoke the hardware. Something something considered dangerous to expose adjacent-to-speaker hardware in dom0, apparently.

So terminal beeps, etc. do nothing (except maybe flash the terminal window, depending on your config).

I use a snippet I got from google:
function create_beep () {
    # Create our beep file, as Xen does not expose the real "bell" device
    # within the Qubes configuration and a simple echo "\007" does not work.
    > /tmp/sinewave.wav
    # 44-byte WAV header: 8000 Hz, mono, 8-bit PCM, 8000 bytes of sample data.
    printf "\x52\x49\x46\x46\x64\x1F\x00\x00\x57\x41\x56\x45\x66\x6D\x74\x20" >> /tmp/sinewave.wav
    printf "\x10\x00\x00\x00\x01\x00\x01\x00\x40\x1F\x00\x00\x40\x1F\x00\x00\x01\x00\x08\x00" >> /tmp/sinewave.wav
    printf "\x64\x61\x74\x61\x40\x1F\x00\x00" >> /tmp/sinewave.wav
    for n in {0..999}
    do
            printf "\x80\x26\x00\x26\x7F\xD9\xFF\xD9" >> /tmp/sinewave.wav
    done
}

And then I invoke the following in dom0 scripts when I need a beep (this should probably be made a function):

  aplay -q /tmp/sinewave.wav --duration=1
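
The "should be made a function" idea could look something like this. It is only a sketch: `beep`, `BEEP_WAV` and `BEEP_PLAYER` are hypothetical names for illustration, and the `aplay` options are the ones from the command above.

```shell
# Sketch only: play the generated beep file, failing clearly if it is missing.
# BEEP_WAV and BEEP_PLAYER are hypothetical overrides for illustration.
beep() {
    local wav="${BEEP_WAV:-/tmp/sinewave.wav}"
    if [ ! -s "$wav" ]; then
        echo "beep: $wav is missing or empty; run create_beep first" >&2
        return 1
    fi
    "${BEEP_PLAYER:-aplay}" -q "$wav" --duration=1
}
```

A script would call `create_beep` once at startup and then `beep` wherever an audible marker is wanted.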
 
Brendan

Claudia

Jul 26, 2019, 8:57:09 PM7/26/19
to qubes...@googlegroups.com
brenda...@gmail.com:
The OS comes with some default wave files. It's just that aplay doesn't
seem to be able to play anything. It detects the audio devices
correctly, but it errors out when I try to play something. That's why I
was trying the console bell, as I thought it might use a different
(simpler) interface or something. Thanks for the info about Xen. I don't
suppose there's a flag or something to tell Xen to actually invoke the
hardware?

Not a top priority at the moment, but some kind of audio could be
helpful for debugging suspend.

zach...@gmail.com

Feb 7, 2020, 1:00:28 AM2/7/20
to qubes-users
I have a Thinkpad T495 with an AMD Ryzen Pro 3700U and Vega 10 graphics. Everything seems to be working besides suspend/resume, which is crucial for me since I'm on the go a lot. I had to build my own Qubes R4.0 ISO to get the installer to work, because the graphics driver needs a 5.0+ kernel. I installed `kernel-latest` from qubes-dom0-current-testing but suspend still didn't work. After trying every kernel option on the face of this Earth I decided to use an experimental Qubes R4.1 build, as some things were pointing to dom0 Fedora 25 being the issue. On dom0 Fedora 31 it's still an issue with a 5.4 kernel. It has been driving me nuts, as I've spent almost the whole day trying to figure the issue out.

When I suspend, it clearly suspends but when I open it back up the screen is off but the power LED is on. I can hear the fan spin up for a bit but nothing happens. CTRL + ALT + Backspace does nothing. I also tried switching to text mode before suspending with CTRL + ALT + F2. Nothing... I also disabled the compositor in XFCE to give it a try in both R4.0 and R4.1, no difference. It totally seems like an X server or amdgpu issue but I really don't know what to do.

I don't have any VMs running when I test the suspend and I don't have a sys-usb VM to take that out of the equation. Any ideas? I'm scratching my head over here and I'm at a loss on what to try next.

Claudia

Feb 7, 2020, 8:03:14 AM2/7/20
to zach...@gmail.com, qubes-users
February 7, 2020 6:00 AM, zach...@gmail.com wrote:

> I have a Thinkpad T495 with an AMD Ryzen Pro 3700U and Vega 10 graphics. Everything seems to be
> working besides suspend/resume which is crucial for me since I'm on the go a lot. I had to build my
> own Qubes R4.0 ISO to get the installer to work due to it needing a 5.0+ kernel for the graphics
> driver. I installed `kernel-latest` from qubes-dom0-current-testing but still didn't work. After
> trying every kernel option on the face of this Earth I decided to use an experimental Qubes R4.1
> build as some things were pointing to dom0 Fedora 25 being the issue. On dom0 Fedora 31 it's still
> an issue with a 5.4 kernel. Has been driving me nuts as I've spent almost the whole day trying to
> figure the issue out.
>
> When I suspend, it clearly suspends but when I open it back up the screen is off but the power LED
> is on. I can hear the fan spin up for a bit but nothing happens. CTRL + ALT + Backspace does
> nothing. I also tried switching to text mode before suspending with CTRL + ALT + F2. Nothing... I
> also disabled the compositor in XFCE to give it a try in both R4.0 and R4.1, no difference. It
> totally seems like an X server or amdgpu issue but I really don't know what to do.
>
> I don't have any VMs running when I test the suspend and I don't have a sys-usb VM to take that out
> of the equation. Any ideas? I'm scratching my head over here and I'm at a loss on what to try next.
>

Did you try the Xen power.c patch?

It sounds like a Xen panic. Some or all AMD Fam15h processors change their CPUID feature bits after resume, which triggers a Xen panic (LEDs and fans on, screen off, keyboard and power button unresponsive). There is a patch and instructions towards the end of this thread: https://www.mail-archive.com/qubes...@googlegroups.com/msg31517.html - It takes some work but it sounds very likely it will fix your problem. Sys-usb causes other problems on a lot of Ryzen machines, so continue to keep it disabled for now.

It doesn't sound like a graphics problem. Usually X or amdgpu issues result in the screen's backlight coming on but displaying a blank screen, and often the keyboard is responsive just not the screen. At least in my experience.

PS: when replying to mailing lists please write your response *below* the quoted text you're replying to.

Claudia

Feb 7, 2020, 8:15:57 AM2/7/20
to zach...@gmail.com, qubes-users

Also I forgot to mention, if the patch works but you still run into other post-resume problems, you may have to pin dom0 to CPU0. See https://www.mail-archive.com/qubes...@googlegroups.com/msg31737.html

zach...@gmail.com

Feb 7, 2020, 3:07:34 PM2/7/20
to qubes-users
Thanks for responding Claudia! I haven't tried that patch but I saw it in your other thread. I guess my options are pretty exhausted at this point so I'll give it a try. I've never actually built an RPM outside of qubes-builder. I'm assuming I should just build the entire Qubes R4.0 (stable) ISO with the edit included. I've never had such a complex issue before so this is all new to me.

Should we get the Qubes team to include this patch as a fix for AMD? I'm not sure what the security implications are but I would assume it could introduce an issue where the Spectre/Meltdown microcode patches would not be applied when resuming? I'm also assuming the code is functioning as intended, as it panics but what would the real solution be? I wonder if there's any official fix by Xen in the works rather than commenting out that panic line. Even in Qubes R4.1 with Xen 4.13 the issue persists.

Sorry about the email above yours, Google groups wants to put it above your quote by default for some reason. I was also exhausted from trying 1000 kernel boot options lol.

zach...@gmail.com

Feb 7, 2020, 3:10:17 PM2/7/20
to qubes-users
Also... The patch shouldn't really have any security implications assuming your BIOS has the latest microcode patches right? I'm guessing this is only for microcode packages installed on the OS.

Claudia

Feb 7, 2020, 4:15:38 PM2/7/20
to zach...@gmail.com, qubes-users

Not sure what you mean about building an RPM outside qubes-builder. This is all done within qubes-builder. So if you have any experience at all with that, then you're already off to a way better start than I was. It wasn't nearly as bad as I thought it would be either. The GUI script does most of the work for you. I tried to leave a sufficiently detailed diary of what I did because I knew it would come in handy later (whether for myself or others). And, you can always ask the mailing list for help.

I just built the RPMs. Building the whole ISO apparently takes many hours and many GB of disk space. And, as with building anything, keep in mind if something goes wrong mid-build, there's a good chance you'll have to start over from the beginning. It took me several attempts. Either method should work the same though, so it's up to you.

>> Should we get the Qubes team to include this patch as a fix for AMD? I'm not sure what the security
>> implications are but I would assume it could introduce an issue where the Spectre/Meltdown
>> microcode patches would not be applied when resuming? I'm also assuming the code is functioning as
>> intended, as it panics but what would the real solution be? I wonder if there's any official fix by
>> Xen in the works rather than commenting out that panic line. Even in Qubes R4.1 with Xen 4.13 the
>> issue persists.

I've been thinking about that. I asked the original author if he reported it to upstream or intended to, but I never heard back from him. I think the Qubes devs would probably just say it's Xen's responsibility, and I can't say I disagree. I've been meaning to mention it on xen-devel but haven't gotten around to it. You're welcome to do so too if you want (if you do, please CC me). My thought was adding a Xen command line argument to override this check, e.g. recheck_cpuid_bits=false (default true, of course), but I have no idea if it would be accepted.

>> Sorry about the email above yours, Google groups wants to put it above your quote by default for
>> some reason. I was also exhausted from trying 1000 kernel boot options lol.

No worries, trust me you're not the first one. Terrible decision on google's part.

> Also... The patch shouldn't really have any security implications assuming your BIOS has the latest
> microcode patches right? I'm guessing this is only for microcode packages installed on the OS.

I have no idea really. I haven't been able to figure out what those feature bits actually mean, if anything. I get the feeling the original authors of that code didn't know either. It kind of looks to me like someone just noticed some bits changing and decided to add a panic just to be on the safe side. Mainly because that code doesn't actually look for any specific bits, it just compares the entire set of flags before and after resume. But I don't know. Use it at your own risk.
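
As a rough illustration of the behavior described above (compare the whole set of flags before and after resume, and stop on any difference), here is a minimal Python sketch. It is a toy model, not Xen's code: `recheck_features`, `FeatureMismatch` and the `allow_mismatch` switch are hypothetical names, the switch stands in for the override parameter floated earlier in this message, and the sample words are made up.

```python
# Toy model of a post-resume feature check as described in the post:
# every capability word recorded at boot is compared with the value read
# after resume, and any difference is treated as fatal.
# `allow_mismatch` is a hypothetical stand-in for the proposed override;
# it is not an existing Xen option.


class FeatureMismatch(Exception):
    """Raised where the real hypervisor would panic."""


def recheck_features(boot_caps, resume_caps, allow_mismatch=False):
    """Return the (index, observed, expected) words that differ."""
    mismatches = [
        (i, now, expected)
        for i, (expected, now) in enumerate(zip(boot_caps, resume_caps))
        if now != expected
    ]
    if mismatches and not allow_mismatch:
        raise FeatureMismatch(mismatches)
    return mismatches


if __name__ == "__main__":
    boot = [0x0000000F, 0x000000F0]    # made-up values
    resume = [0x0000000F, 0x000000E1]  # word 1 differs after "resume"
    for index, now, expected in recheck_features(boot, resume, allow_mismatch=True):
        print(f"cap[{index}] is {now:08x} (expected {expected:08x})")
```

With `allow_mismatch` left at its default the same call raises instead of returning, which mirrors the "default true, of course" idea above.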

BTW, if the patch works for you, please post the output of `xl dmesg` showing which bits have changed after resume. I'm curious if it's the same on all machines.

zach...@gmail.com

Feb 7, 2020, 4:35:25 PM
to qubes-users
Took me a bit to figure out but I applied the changes to `patch-x86-check-feature-flags-after-resume.patch` and I'm building the whole ISO now. I'm not actually sure how to build the RPM itself, and not the ISO. To be honest I've got like 100 tabs open so maybe I'm missing something... Once the ISO builds and I install it, I assume I just want to add vmm-xen to my yum/dnf excludes so no new vmm-xen package gets installed without the patch.
 

>>> Should we get the Qubes team to include this patch as a fix for AMD? I'm not sure what the security
>>> implications are but I would assume it could introduce an issue where the Spectre/Meltdown
>>> microcode patches would not be applied when resuming? I'm also assuming the code is functioning as
>>> intended, as it panics but what would the real solution be? I wonder if there's any official fix by
>>> Xen in the works rather than commenting out that panic line. Even in Qubes R4.1 with Xen 4.13 the
>>> issue persists.
>
> I've been thinking about that. I asked the original author if he reported it to upstream or intended to, but I never heard back from him. I think the Qubes devs would probably just say it's Xen's responsibility, and I can't say I disagree. I've been meaning to mention it on xen-devel but haven't gotten around to it. You're welcome to do so too if you want (if you do, please CC me). My thought was adding a Xen command line argument to override this check, e.g. recheck_cpuid_bits=false (default true, of course), but I have no idea if it would be accepted.


I preemptively submitted this PR to see what the Qubes team thinks. https://github.com/QubesOS/qubes-vmm-xen/pull/70

I agree it probably should be fixed upstream, although I've seen the Qubes team make exceptions and apply their own changes. Upstream would probably take a huge amount of time to get merged and tested. I'm not a developer though so I'm sure you could explain the issue better than I. If you do mention it, CC me as well! I like the CLI argument idea, that's probably a much cleaner way of doing it and defaulting it to true. That way users could disable it if needed due to hardware screw-ups.

>>> Sorry about the email above yours, Google groups wants to put it above your quote by default for
>>> some reason. I was also exhausted from trying 1000 kernel boot options lol.
>
> No worries, trust me you're not the first one. Terrible decision on google's part.
>
>> Also... The patch shouldn't really have any security implications assuming your BIOS has the latest
>> microcode patches right? I'm guessing this is only for microcode packages installed on the OS.
>
> I have no idea really. I haven't been able to figure out what those feature bits actually mean, if anything. I get the feeling the original authors of that code didn't know either. It kind of looks to me like someone just noticed some bits changing and decided to add a panic just to be on the safe side. Mainly because that code doesn't actually look for any specific bits, it just compares the entire set of flags before and after resume. But I don't know. Use it at your own risk.
>
> BTW, if the patch works for you, please post the output of `xl dmesg` showing which bits have changed after resume. I'm curious if it's the same on all machines.


That sounds like a good explanation to me; it was probably just to be on the safe side. I will report back if it works. It's probably going to take a while to build since Fedora 25 likes to compile/build with single-threaded GCC lol...

brenda...@gmail.com

Feb 7, 2020, 5:13:57 PM
to qubes-users
On Friday, February 7, 2020 at 9:35:25 PM UTC, zach...@gmail.com wrote:
> I preemptively submitted this PR to see what the Qubes team thinks. https://github.com/QubesOS/qubes-vmm-xen/pull/70
>
> I agree it probably should be fixed upstream, although I've seen the Qubes team make exceptions and apply their own changes. Upstream would probably take a huge amount of time to get merged and tested. I'm not a developer though so I'm sure you could explain the issue better than I. If you do mention it, CC me as well! I like the CLI argument idea, that's probably a much cleaner way of doing it and defaulting it to true. That way users could disable it if needed due to hardware screw-ups.

Marek is somewhat active on xen-devel. Submitting the PR to Qubes is probably as good a place as any to start the (GitHub) discussion, I suppose.

I expect Claudia is correct that it's really a Xen defect to address, either with a flag to disable the check, or security/stability focused checks only.

Xen might point upwards again, of course, and tell AMD to fix their microcode or manufacturers to fix their BIOSes...

...but if a disable flag could be added (--yes_i_know_what_im_doing caveat, of course) that'd be a good short term workaround for the larger Qubes user base that is less likely to be able to figure out how to get a build working and rpm applied and keep up to date with upstream.

Brendan

Marek Marczykowski-Górecki

Feb 8, 2020, 9:09:05 PM
to brenda...@gmail.com, qubes-users
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
(continuing discussion from the above PR)

The patch as it is, is not acceptable, as it may introduce security
and/or stability issues on some machines. Xen (and Linux too) assumes
what CPU features is can use based on CPUID flags. If those changes
during system runtime (including suspend/resume) some instructions or
control registers may no longer be valid (->crash) or safe to use
(->security issue).

If that's just about microcode updates, that's probably BIOS bug - if it
applies microcode update on system startup, it should do the same on
system resume too. Anyway it's worth trying updating linux-firmware
package, which carries microcode updates for AMD. This should make Xen
apply microcode updates too - before checking those flags.
I've just uploaded updated version of the package to the current-testing
repository (both R4.0 and R4.1).

If that's about something else, then fixing it would require finding
what exactly is changing (and preferably also why). And only then find
how to mitigate this issue. If specific flags would turn out to be not
related to security features or otherwise having unwanted effects, then
ignoring those changes would be an option. But ignoring _only those
flags verified to be safe to ignore_, not all of them.

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl4/abcACgkQ24/THMrX
1yxEGgf/SG+V7TKM8f7QZ5JFVSr++QasDbMefkuc30OeUkXKtFXsTNMH2fp1S8zq
lTgxfrrGH+N7sfP1KkjAZ7ri+DJgmoCyqULUNZAez5DdGlaLJRtsz5rRBtTr4t9F
nmJNC859/RPEpbozwxlM6K8JRhlxVg35Sl46E9lYHbNsTBqAywxhTUgENsZlrblh
gXn2MgnzDHvwShCltlNL2l29HaAXBzIICpPcgiRWLEY/Y1OTNHvYPiTgZdRtkkEM
5tM97EwxZF31k5i7wGpRed84xCid2bXvufq2Xjo2jWxXuQ01r+bv6v/lVwDvd5tz
iOWJsjj4tXLo3bcpuaCM5XvHI9x0yg==
=h62J
-----END PGP SIGNATURE-----

zach...@gmail.com

Feb 9, 2020, 4:29:26 AM
to qubes-users
On Saturday, February 8, 2020 at 8:09:05 PM UTC-6, Marek Marczykowski-Górecki wrote:

Appreciate you joining the discussion Marek, your input is really valued :)
 

> If that's just about microcode updates, that's probably BIOS bug - if it
> applies microcode update on system startup, it should do the same on
> system resume too. Anyway it's worth trying updating linux-firmware
> package, which carries microcode updates for AMD. This should make Xen
> apply microcode updates too - before checking those flags.
> I've just uploaded updated version of the package to the current-testing
> repository (both R4.0 and R4.1).

I totally understand the resistance to merge the PR and that the real cause of the issue should be fixed in BIOS or OS CPU microcode.

I will be testing that new linux-firmware package on R4.0 shortly and will report back, thanks for uploading it. I've used my laptop on and off all day with suspending it multiple times using the "hacky" patch. If the new microcode works, it shouldn't write the log line saying it has missing features. I still have the vmm-xen packages I built with the modified patch installed.
 

> If that's about something else, then fixing it would require finding
> what exactly is changing (and preferably also why). And only then find
> how to mitigate this issue. If specific flags would turn out to be not
> related to security features or otherwise having unwanted effects, then
> ignoring those changes would be an option. But ignoring _only those
> flags verified to be safe to ignore_, not all of them.

I hope it doesn't come to that but we'll see. I wouldn't really know what else to do besides complain to Lenovo and hope they yell up at AMD to investigate. I assume it's something weird between hardware manufacturers and AMD.
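
For reference, the selective approach quoted above (ignore only flags verified to be safe, keep checking everything else) might be sketched along these lines. This is a minimal Python model, not Xen code: `unexpected_changes` and the `ignorable` table are hypothetical, and the mask entry is a placeholder, since nobody in this thread has verified any bit as safe to ignore.

```python
# Sketch of "ignore only the flags verified to be safe to ignore".
# `ignorable` maps a capability-word index to a mask of bits whose changes
# are tolerated. The entry used below is a made-up placeholder, not a
# verified result.


def unexpected_changes(boot_caps, resume_caps, ignorable):
    """Return {word index: mask of changed bits that are not allowlisted}."""
    result = {}
    for i, (expected, now) in enumerate(zip(boot_caps, resume_caps)):
        changed = (expected ^ now) & ~ignorable.get(i, 0) & 0xFFFFFFFF
        if changed:
            result[i] = changed
    return result


if __name__ == "__main__":
    boot = [0b1010]            # made-up 4-bit example
    resume = [0b0011]          # bits 0 and 3 differ from boot
    allowlist = {0: 0b0001}    # pretend bit 0 was verified as harmless
    leftover = unexpected_changes(boot, resume, allowlist)
    print({i: bin(mask) for i, mask in leftover.items()})  # {0: '0b1000'}
```

Any word that still appears in the result would be handled exactly as today, so the default stays strict.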
 


P.S. the bits do change for me as stated in the Xen log when I resume from a suspend. Here is what mine says.

(XEN) CPU0: cap[ 1] is 7ed8320b (expected f6d8320b)
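
For anyone collecting these lines from several machines, a small parser makes them easier to compare. The sketch below is Python and assumes the log format is exactly what the single line above shows; `parse_cap_line` is a hypothetical helper and real `xl dmesg` output may differ.

```python
import re

# Pattern inferred from the one log line quoted above; treat it as an
# assumption about the format rather than a specification.
CAP_LINE = re.compile(
    r"CPU(?P<cpu>\d+): cap\[\s*(?P<word>\d+)\] is (?P<observed>[0-9a-fA-F]{8}) "
    r"\(expected (?P<expected>[0-9a-fA-F]{8})\)"
)


def parse_cap_line(line):
    """Return (cpu, word, observed, expected) as integers, or None if no match."""
    match = CAP_LINE.search(line)
    if match is None:
        return None
    return (
        int(match["cpu"]),
        int(match["word"]),
        int(match["observed"], 16),
        int(match["expected"], 16),
    )


if __name__ == "__main__":
    sample = "(XEN) CPU0: cap[ 1] is 7ed8320b (expected f6d8320b)"
    cpu, word, observed, expected = parse_cap_line(sample)
    print(cpu, word, hex(observed), hex(expected))  # 0 1 0x7ed8320b 0xf6d8320b
```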


zach...@gmail.com

Feb 9, 2020, 4:51:53 AM
to qubes-users
I installed the new linux-firmware in dom0, rebooted, and tested a suspend/resume. Sadly, the Xen log still says the bits changed, so I'm guessing this will have to be addressed by AMD and hardware manufacturers. Will have to put some thought into what to do next.

Claudia

Feb 9, 2020, 9:15:11 AM
to zach...@gmail.com, Marek Marczykowski-Górecki, qubes...@googlegroups.com, Chris Laprise, brenda...@gmail.com, qubes123

Thanks for sharing that. Just as I expected, bits 31 and 27, xor 0x88000000. That makes three of us now.
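
The arithmetic can be checked directly. A short Python sketch, using the two values from the log line posted earlier in the thread (`changed_bits` is just an illustrative helper):

```python
def changed_bits(a, b):
    """Return the XOR of two 32-bit words and the positions of the differing bits."""
    diff = (a ^ b) & 0xFFFFFFFF
    return diff, [bit for bit in range(32) if (diff >> bit) & 1]


if __name__ == "__main__":
    diff, bits = changed_bits(0x7ED8320B, 0xF6D8320B)
    print(f"{diff:#010x}", bits)  # 0x88000000 [27, 31]
```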

I finally did some digging. I'm wondering if it has to do with the RDRAND issue which has been well known since at least May 7, 2019 to affect Fam15h. This stands to reason, as I immediately updated to the May 19 BIOS update when I bought this machine. awokd had suggested this update, specifically an AGESA update contained within, might have been the cause of an unrelated problem.

https://www.phoronix.com/scan.php?page=news_item&px=AMD-CPUs-RdRand-Suspend
https://linuxreviews.org/AMD_finally_submits_kernel_patch_for_broken_RDRAND_on_older_AMD_APUs
https://www.mail-archive.com/qubes...@googlegroups.com/msg31568.html - User awokd's note about AGESA update
https://www.mail-archive.com/qubes...@googlegroups.com/msg31689.html - User Qubes123's investigation into CPUID bits

From linuxreviews.org:
"There have been reports of RDRAND issues after resuming from suspend on some AMD family 15h and family 16h systems. [...] RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND support using CPUID, including the kernel, will believe that RDRAND is not supported. "

According to the page below, RDRAND is bit 30 in ECX, not 31. And that still doesn't explain the 27th bit turning on after resume.
27: OSXSAVE (turns ON)
30: RDRAND (unchanged)
31: Not used, always 0 (turns ON)

https://www.felixcloutier.com/x86/cpuid#fig-3-7
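
To look at individual named bits rather than the whole word, something like the following Python sketch can help. It assumes, as this thread does, that the word in question holds CPUID leaf 1 ECX; the bit names come from the list above, and `describe` is a hypothetical helper.

```python
# Bit names as listed in this message (CPUID leaf 1, ECX).
ECX_BITS = {27: "OSXSAVE", 30: "RDRAND", 31: "bit 31"}


def describe(observed, expected, names=ECX_BITS):
    """Return (bit, name, set_in_expected, set_in_observed) for each named bit."""
    return [
        (bit, name, bool((expected >> bit) & 1), bool((observed >> bit) & 1))
        for bit, name in sorted(names.items())
    ]


if __name__ == "__main__":
    # Values from the "is ... (expected ...)" log line posted earlier.
    for bit, name, in_expected, in_observed in describe(0x7ED8320B, 0xF6D8320B):
        print(f"{bit:2d} {name:8s} expected={in_expected} observed={in_observed}")
```

For that log line, bit 30 is set in both values, which agrees with RDRAND being unchanged. Which of the two values is "before" and which is "after" depends on how Xen labels them, so the direction of each change is worth double-checking against the source.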

So it doesn't sound like the same problem at all, but all my search queries seem to lead back to the RDRAND issue. I'm hoping someone with more expertise in this area can make some better sense of this.

Brendan Hoar

Feb 9, 2020, 9:41:51 AM
to Claudia, Chris Laprise, Marek Marczykowski-Górecki, qubes...@googlegroups.com, qubes123, zach...@gmail.com
On Sun, Feb 9, 2020 at 9:15 AM Claudia <clau...@disroot.org> wrote:
> From linuxreviews.org:
> "There have been reports of RDRAND issues after resuming from suspend on some AMD family 15h and family 16h systems. [...] RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND support using CPUID, including the kernel, will believe that RDRAND is not supported. "
>
> According to the page below, RDRAND is bit 30 in ECX, not 31. And that still doesn't explain the 27th bit turning on after resume.
> 27: OSXSAVE (turns ON)
> 30: RDRAND (unchanged)
> 31: Not used, always 0 (turns ON)
>
> https://www.felixcloutier.com/x86/cpuid#fig-3-7
>
> So it doesn't sound like the same problem at all, but all my search queries seem to lead back to the RDRAND issue. I'm hoping someone with more expertise in this area can make some better sense of this.

Hmm bit 27 can be influenced by software. Might be an issue where the value was saved for reference before the OS/hypervisor modified it?
OSXSAVE: A value of 1 indicates that the OS has set CR4.OSXSAVE[bit 18] to enable XSETBV/XGETBV instructions to access XCR0 and to support processor extended state management using XSAVE/XRSTOR.
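
A toy Python model of that hypothesis, for illustration only: the CPUID bit simply mirrors the CR4 bit, so a snapshot taken before CR4.OSXSAVE is set differs from a later read in bit 27 and nothing else. `visible_ecx` and the base value are made up; this does not show what Xen actually does.

```python
CR4_OSXSAVE = 1 << 18    # CR4 bit named in the description above
CPUID_OSXSAVE = 1 << 27  # CPUID leaf 1 ECX bit that reflects it


def visible_ecx(hardware_ecx, cr4):
    """Model ECX as software sees it: bit 27 follows CR4.OSXSAVE."""
    if cr4 & CR4_OSXSAVE:
        return hardware_ecx | CPUID_OSXSAVE
    return hardware_ecx & ~CPUID_OSXSAVE


if __name__ == "__main__":
    base = 0x76D8320B                                # illustrative value only
    snapshot = visible_ecx(base, cr4=0)              # read before the OS enables XSAVE
    later = visible_ecx(base, cr4=CR4_OSXSAVE)       # read after CR4.OSXSAVE is set
    print(f"{snapshot:08x} {later:08x} {snapshot ^ later:08x}")
```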

Claudia

Feb 9, 2020, 10:19:34 AM
to Marek Marczykowski-Górecki, brenda...@gmail.com, zach...@gmail.com, qubes-users
> marmarek:
> This is a very bad idea to "fix" it. Those missing/changed CPUID bits later on will cause issues.
> And given most of the microcode updates recently are about speculative execution, missing those
> features will make the host vulnerable to those issues again. There are multiple ways it can
> manifest - from crashes when Xen uses (now not present) CPU feature, to silent failures when Xen
> tries to use some feature and assume it protects the system, while it does not in practice.
>
> For this particular case (microcode included in BIOS newer than in OS), I see two options: make
> BIOS (coreboot, right?) apply microcode update on resume too, or include newer microcode in OS.

I want to make one thing clear: I am **not** suggesting this check be removed altogether. I am suggesting adding an **optional**, even undocumented, override parameter which defaults to the **current behavior** which is to panic.

I've found the patch to be quite stable so far. Unpatched is guaranteed to cause a crash (xen
panic) at resume; patched so far has not caused any noticeable stability issues for the four of us
using it, afaik. Just saying.

Also, not everyone has the option of coreboot. And we're not even completely certain this is a
post-resume microcode update issue, either.

> lunarthegray:
> @marmarek the "fix" is a hack for sure but it's currently the only way to get some AMD Ryzen
> laptops to work with Qubes. I built Qubes R4.1 the other day and with kernel 5.4 and Xen 4.13 the
> issue remains. Laptop users often suspend and are on the go as I am. There was some discussion on
> the qubes-users mailing list about other solutions. I'm no firmware/Xen expert though. Would
> pinning dom0 to 1 vCPU prevent the issue of missing or changed CPU bits? I'm not exactly sure what
> the fix would be with standard BIOS, as I'm not brave enough to flash coreboot on my very new
> ThinkPad. Should I start trying to get in contact with Lenovo? I'm assuming AMD needs to release a
> microcode patch as it's not really an issue with Xen itself.

At least in my case, CPU pinning did not fix this issue. The bits still change and (would) cause a
Xen panic as before. Pinning dom0 to CPU0 merely fixed a separate post-resume issue with my SATA
controller. In that thread, I link to the original Xen archives thread about pinning which had
nothing to do with Ryzen.

February 9, 2020 2:09 AM, "Marek Marczykowski-Górecki" <marm...@invisiblethingslab.com> wrote:
> (continuing discussion from the above PR)
>
> The patch as it is, is not acceptable, as it may introduce security
> and/or stability issues on some machines. Xen (and Linux too) assumes
> what CPU features is can use based on CPUID flags. If those changes
> during system runtime (including suspend/resume) some instructions or
> control registers may no longer be valid (->crash) or safe to use
> (->security issue).

Like I said, it's been very stable for me so far. I've only had one bad resume in the months I've been using it, suspending at least once a day. Security issues on the other hand are indeed unknown at this point.

Also worth noting that this is Xen-specific. Afaik, the Linux kernel doesn't check for these changes. So everyone using plain old Ubuntu or whatever would be subject to the same stability and security implications caused by this patch.

> If that's just about microcode updates, that's probably BIOS bug - if it
> applies microcode update on system startup, it should do the same on

Weird that it's happening equally on various vendor BIOSes as well as coreboot; the only thing they have in common is Ryzen 2xxx-3xxx chips. It doesn't sound to me like a **BIOS** bug, per se, unless all these vendors and the coreboot developers wrote the same bug independently. More likely an AMD bug, imo.

> system resume too. Anyway it's worth trying updating linux-firmware
> package, which carries microcode updates for AMD. This should make Xen
> apply microcode updates too - before checking those flags.
> I've just uploaded updated version of the package to the current-testing
> repository (both R4.0 and R4.1).

Thanks for the tip. I'll try it when I have a chance. `--enablerepo=qubes-dom0-current-testing kernel-latest linux-firmware` I'm guessing?

> If that's about something else, then fixing it would require finding
> what exactly is changing (and preferably also why). And only then find
> how to mitigate this issue. If specific flags would turn out to be not
> related to security features or otherwise having unwanted effects, then
> ignoring those changes would be an option. But ignoring _only those
> flags verified to be safe to ignore_, not all of them.

See my other reply about that.

But I would like to mention, there are already all kinds of options and parameters throughout the Xen, Qubes, and Linux codebases that come with stability/security implications. This isn't Apple iOS. You can easily shoot yourself in the foot. That's the nature of the beast. It is not Qubes' purpose to hide these from the user or take away control.

By that logic, we should also patch Xen so that "smt=off" is hardcoded, because as it is now someone might open xen.cfg and see that parameter and decide to turn it on for performance, which we all know is dangerous. Same with Qubes' "no-strict-reset", or dm-crypt's weak upstream default crypto parameters, I could go on and on.

So, again, I'm not suggesting we skip this check for everybody. I'm suggesting we make it into an undocumented Xen cmdline parameter known only to those who, as they say, have been warned. As it is right now, all of us who are affected by this are patching our own machines anyway, so what's the difference to anyone else?


> - --
> Best Regards,
> Marek Marczykowski-Górecki


Thank you for your consideration and for taking the time to follow up on the ML. I look forward to hearing your thoughts.

brenda...@gmail.com

Feb 9, 2020, 12:25:56 PM
to qubes-users
On Sunday, February 9, 2020 at 3:19:34 PM UTC, Claudia wrote:
>> marmarek:
>> This is a very bad idea to "fix" it. Those missing/changed CPUID bits later on will cause issues.
>> And given most of the microcode updates recently are about speculative execution, missing those
>> features will make the host vulnerable to those issues again. There are multiple ways it can
>> manifest - from crashes when Xen uses (now not present) CPU feature, to silent failures when Xen
>> tries to use some feature and assume it protects the system, while it does not in practice.
>>
>> For this particular case (microcode included in BIOS newer than in OS), I see two options: make
>> BIOS (coreboot, right?) apply microcode update on resume too, or include newer microcode in OS.
>
> I want to make one thing clear: I am **not** suggesting this check be removed altogether. I am suggesting adding an **optional**, even undocumented, override parameter which defaults to the **current behavior** which is to panic.
>
> I've found the patch to be quite stable so far. Unpatched is guaranteed to cause a crash (xen
> panic) at resume; patched so far has not caused any noticeable stability issues for the four of us
> using it, afaik. Just saying.



Has anyone tried utilizing the Xen command line options to mask bits in the cpuid (in particular section 1.2.35, cpuid_mask_ecx)?

The man page below says that "Settings applied here take effect globally, including for Xen and all guests." This *might* mean it is applied *before* the resume from sleep CPU bit checks (but I'm not promising anything, as I have not traced through the source). And also "Warning: This option is not fully effective on Family 15h processors or later."


Excerpted:

```

1.2.34 cpuid_mask_cpu

= fam_0f_rev_[cdefg] | fam_10_rev_[bc] | fam_11_rev_b

Applicability: AMD

If none of the other cpuid_mask_* options are given, Xen has a set of pre-configured masks to make the current processor appear to be family/revision specified.

See below for general information on masking.

Warning: This option is not fully effective on Family 15h processors or later.

1.2.35 cpuid_mask_ecx

1.2.36 cpuid_mask_edx

1.2.37 cpuid_mask_ext_ecx

1.2.38 cpuid_mask_ext_edx

1.2.39 cpuid_mask_l7s0_eax

1.2.40 cpuid_mask_l7s0_ebx

1.2.41 cpuid_mask_thermal_ecx

1.2.42 cpuid_mask_xsave_eax

= <integer>

Applicability: x86. Default: ~0 (all bits set)

The availability of these options are model specific. Some processors don't support any of them, and no processor supports all of them. Xen will ignore options on processors which are lacking support.

These options can be used to alter the features visible via the CPUID instruction. Settings applied here take effect globally, including for Xen and all guests.

Note: Since Xen 4.7, it is no longer necessary to mask a host to create migration safety in heterogeneous scenarios. All necessary CPUID settings should be provided in the VM configuration file. Furthermore, it is recommended not to use this option, as doing so causes an unnecessary reduction of features at Xen's disposal to manage guests.

```
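
If someone wants to experiment with this, the mask value is just "all bits set" with the unwanted bits cleared. A Python sketch for computing it follows; `mask_clearing` is a hypothetical helper, bits 27 and 31 are used only because they are the ones reported as changing in this thread, and the hexadecimal form assumes the option accepts it. Nothing here establishes that the option is honored on these processors, that it is applied before the resume check, or that masking these bits is safe.

```python
def mask_clearing(bits, width=32):
    """Return a width-bit mask with all bits set except the given positions."""
    mask = (1 << width) - 1
    for bit in bits:
        mask &= ~(1 << bit)
    return mask


if __name__ == "__main__":
    value = mask_clearing([27, 31])
    print(f"cpuid_mask_ecx={value:#x}")  # cpuid_mask_ecx=0x77ffffff
```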

brenda...@gmail.com

Feb 9, 2020, 12:28:13 PM
to qubes-users
On Sunday, February 9, 2020 at 5:25:56 PM UTC, brend...@gmail.com wrote:

> Has anyone tried utilizing the xen command line options to mask bits in the cpuid, in particular section 1.2.35 cpuid_mask_ecx)?
>
> The man page below says that "Settings applied here take effect globally, including for Xen and all guests." This *might* mean it is applied *before* the resume from sleep CPU bit checks (but I'm not promising anything, as I have not traced through the source). And also "Warning: This option is not fully effective on Family 15h processors or later."

Just noticed that the warning applies only to 1.2.34, which is AMD-only, apparently. It's unclear to me whether the other items (1.2.35 and higher), which are marked "x86", apply only to Intel or to all x86 processors.
 
B

Marek Marczykowski-Górecki

Feb 9, 2020, 3:52:57 PM
to brenda...@gmail.com, qubes-users
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On Sun, Feb 09, 2020 at 09:28:13AM -0800, brenda...@gmail.com wrote:
> On Sunday, February 9, 2020 at 5:25:56 PM UTC, brend...@gmail.com wrote:
> >
> >
> > Has anyone tried utilizing the xen command line options to mask bits in
> > the cpuid, in particular section 1.2.35 cpuid_mask_ecx)?
> >
> > The man page below says that "Settings applied here take effect globally,
> > including for Xen and all guests." This *might* mean it is applied *before*
> > the resume from sleep CPU bit checks (but I'm not promising anything, as I
> > have not traced through the source). And also "*Warning: This option is
> > not fully effective on Family 15h processors or later.*"
> >
>
> Just noticed that the warning applies only to 1.2.34, which is AMD-only,
> apparently. Unclear to me if the other items 1.2.35 and higher, which is
> for "x86" apply only to intel or to all x86 architecture.

I may be missing it in this thread, but have anybody tried Qubes 4.1
builds (with Xen 4.13) on such system? Does it have the same issue?

- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl5AcSAACgkQ24/THMrX
1yzQ/ggAmQOFWyP0GNVs5dMuSzKx6mo7myoJ0tlJaKdpNPKZZnYjaLAqhUPig5YG
rd5iv26TjVq/bl8uiRE0/qwV0/sjqgmLTqPIQanzxsB5Cnok3OZyswghGJY/UY8Y
j5ADzpzRtCC7WhQkvhtPSwcC3c72rgmjfQg2IjKfYU6qyv+0aJ2HuJQj/kA49cG6
kzwGRIJJlxVfCsnlXSwmHa17PyiolvYqpQFhCN8EIM3KYFcjrBK+kP7nqdNXuQ8R
atZqH66h8wxp/BvGO9xGZPmWV6uhrC+JIKfdlaspKO4fWFxXuBwxGgS+favkn5wT
vBJcU6wxj2Qwk6MvJV17BMV1dwqntg==
=HtGL
-----END PGP SIGNATURE-----

zach...@gmail.com

Feb 9, 2020, 4:30:40 PM
to qubes-users
On Sunday, February 9, 2020 at 2:52:57 PM UTC-6, Marek Marczykowski-Górecki wrote:
> On Sun, Feb 09, 2020 at 09:28:13AM -0800, brend...@gmail.com wrote:
>> On Sunday, February 9, 2020 at 5:25:56 PM UTC, brend...@gmail.com wrote:
>>>
>>>
>>> Has anyone tried utilizing the xen command line options to mask bits in
>>> the cpuid, in particular section 1.2.35 cpuid_mask_ecx)?
>>>
>>> The man page below says that "Settings applied here take effect globally,
>>> including for Xen and all guests." This *might* mean it is applied *before*
>>> the resume from sleep CPU bit checks (but I'm not promising anything, as I
>>> have not traced through the source). And also "*Warning: This option is
>>> not fully effective on Family 15h processors or later.*"
>>>
>>
>> Just noticed that the warning applies only to 1.2.34, which is AMD-only,
>> apparently. Unclear to me if the other items 1.2.35 and higher, which is
>> for "x86" apply only to intel or to all x86 architecture.
>
> I may be missing it in this thread, but have anybody tried Qubes 4.1
> builds (with Xen 4.13) on such system? Does it have the same issue?


I have, yes. A few days ago I built an ISO using R4.1 and tested it. The same issue happens with a fresh R4.1 install and Xen 4.13 + 5.4 kernel.

awokd

Feb 9, 2020, 5:00:30 PM
to qubes...@googlegroups.com
Claudia:
>> marmarek:

>> For this particular case (microcode included in BIOS newer than in OS), I see two options: make
>> BIOS (coreboot, right?) apply microcode update on resume too, or include newer microcode in OS.
>
> I want to make one thing clear: I am **not** suggesting this check be removed altogether. I am suggesting adding an **optional**, even undocumented, override parameter which defaults to the **current behavior** which is to panic.
>
> I've found the patch to be quite stable so far. Unpatched is guaranteed to cause a crash (xen
> panic) at resume; patched so far has not caused any noticeable stability issues for the four of us
> using it, afaik. Just saying.
>
> Also, not everyone has the option of coreboot. And we're not even completely certain this a
> post-resume microcode update issue, either.

FWIW, my corebooted AMD has the same issue and resolution. Of course,
much of the source code came from AMD so it could be something common to
most/all. I wonder if there's a fix that could be made at that level.

--
- don't top post
Mailing list etiquette:
- trim quoted reply to only relevant portions
- when possible, copy and paste text instead of screenshots

Claudia

Feb 9, 2020, 5:23:15 PM
to Marek Marczykowski-Górecki, brenda...@gmail.com, qubes-users
February 9, 2020 8:52 PM, "Marek Marczykowski-Górecki" <marm...@invisiblethingslab.com> wrote:

> On Sun, Feb 09, 2020 at 09:28:13AM -0800, brenda...@gmail.com wrote:
>> On Sunday, February 9, 2020 at 5:25:56 PM UTC, brend...@gmail.com wrote:
>>>
>>>
>>> Has anyone tried utilizing the xen command line options to mask bits in
>>> the cpuid, in particular section 1.2.35 cpuid_mask_ecx)?
>>>
>>> The man page below says that "Settings applied here take effect globally,
>>> including for Xen and all guests." This *might* mean it is applied *before*
>>> the resume from sleep CPU bit checks (but I'm not promising anything, as I
>>> have not traced through the source). And also "*Warning: This option is
>>> not fully effective on Family 15h processors or later.*"
>>>
>>
>> Just noticed that the warning applies only to 1.2.34, which is AMD-only,
>> apparently. Unclear to me if the other items 1.2.35 and higher, which is
>> for "x86" apply only to intel or to all x86 architecture.
>
> I may be missing it in this thread, but have anybody tried Qubes 4.1
> builds (with Xen 4.13) on such system? Does it have the same issue?

I also had the same problem with unpatched Xen 4.13, on the fc31-based R4.1 build from right before Christmas. The check was introduced in 4.8.3.3 and probably hasn't changed. For what it's worth, R4.1 and R4.0 both resume fine when booted without Xen. See https://www.mail-archive.com/qubes...@googlegroups.com/msg31518.html
