KVM PCI Passthrough NVidia GeForce GTX 1080 Ti error code 43

Ramon Hofer

Nov 12, 2017, 2:50:06 PM
Dear all,

Please help me pass my GPU through to a KVM guest.

The system I am using:
lshw: https://pastebin.com/tB7FqqxN

Host OS: Debian 9 Stretch
Mainboard: Supermicro C7Z170-M (activated VT-d in BIOS)
CPU: Intel Core i7-7700K CPU @ 4.20GHz
GPU: EVGA GeForce GTX1080 Ti

The GPU is not listed because I have blacklisted it:
$ cat /etc/modprobe.d/blacklist.conf
blacklist nouveau

lspci: https://pastebin.com/6qYuJRPg

I found this guide:
https://scottlinux.com/2016/08/28/gpu-passthrough-with-kvm-and-debian-linux/

After installing Win7 guest, enabling PCI passthrough using
virt-manager, installing the NVidia driver in the guest, Windows reports
the error 43 for the GPU.

Windows has stopped this device because it has reported problems.
(Code 43)

This is described in the above mentioned post and a workaround is
linked:
https://www.reddit.com/r/VFIO/comments/479xnx/guests_with_nvidia_gpus_can_enable_hyperv/

Unfortunately I do not know how to apply the workaround. I understand
that I should create a file '/usr/libexec/qemu-kvm-hv-vendor' with the
following content:

#!/bin/sh
exec /usr/bin/qemu-kvm \
`echo "\$@" | sed 's|hv_time|hv_time,hv_vendor_id=whatever|g'`

Or according to the original redhat mailing list post by Alex
Williamson:
https://www.redhat.com/archives/vfio-users/2016-March/msg00092.html

$ cat /usr/libexec/qemu-kvm-hv-vendor
#!/bin/sh
exec /usr/bin/qemu-kvm \
`echo "\$@" | sed 's|hv_time|hv_time,hv_vendor_id=KeenlyKVM|g'`

But since there is no qemu-kvm present and the directory '/usr/libexec'
does not exist on my system, I wonder how I should proceed.
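
A minimal sketch of how the wrapper above might be adapted for Debian, assuming the emulator binary is /usr/bin/qemu-system-x86_64 (the path used by Debian's qemu-system-x86 package). The function names add_hv_vendor_id and run_wrapped and the QEMU_BIN variable are hypothetical, chosen only for this illustration. Unlike the backtick one-liner, it rewrites the arguments one at a time, so arguments that contain spaces are not split again.

```shell
#!/bin/sh
# Emulator to run; the default is an assumption for Debian 9.
QEMU_BIN="${QEMU_BIN:-/usr/bin/qemu-system-x86_64}"

# Append hv_vendor_id to an argument that contains hv_time.
# Arguments without hv_time are printed unchanged.
add_hv_vendor_id() {
    printf '%s\n' "$1" | sed 's|hv_time|hv_time,hv_vendor_id=KeenlyKVM|g'
}

# Rotate through the positional parameters once, replacing each one
# with its rewritten form, then hand everything to the emulator.
run_wrapped() {
    n=$#
    while [ "$n" -gt 0 ]; do
        arg=$1
        shift
        set -- "$@" "$(add_hv_vendor_id "$arg")"
        n=$((n - 1))
    done
    exec "$QEMU_BIN" "$@"
}
```

To use this as a wrapper, one would add run_wrapped "$@" as the last line, save the file somewhere such as /usr/local/bin/qemu-hv-vendor (a hypothetical location), make it executable, and point the guest's <emulator> element at it. That last step is an assumption about how the linked post wires the wrapper into libvirt.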

Any help fixing my problem would be highly appreciated.


Thank you very much in advance and best regards,
Ramon

Alexander V. Makartsev

Nov 13, 2017, 3:50:07 AM
This is an interesting topic, and I hope to find some spare time to implement and test this setup on my system.
I can't suggest anything yet, because this "Code 43" error is generic and can happen even on normal systems.
The possible causes are many, from a driver version conflict to a BIOS/UEFI firmware bug in your motherboard.
I wonder, what VEN_ID and DEV_ID are reported for your VGA in the Windows guest?
Have you tried Windows 8.1 or 10 as guests? They could have more support for virtualization in general.

-- 
With kindest regards, Alexander.

⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org
⠈⠳⣄⠀⠀⠀⠀ 

Ramon Hofer

Nov 13, 2017, 4:50:06 PM
Dear Alexander,

Thank you very much for your reply.
> This is an interesting topic, and I hope to find some spare time to
> implement and test this setup on my system. I can't suggest anything
> yet, because this "Code 43" error is generic and can happen even on
> normal systems. The possible causes are many, from a driver version
> conflict to a BIOS/UEFI firmware bug in your motherboard. I wonder,
> what VEN_ID and DEV_ID are reported for your VGA in the Windows guest?
> Have you tried Windows 8.1 or 10 as guests? They could have more
> support for virtualization in general.

Interesting. I thought I was just not able to set up KVM / QEMU
properly. Because I read and heard that NVidia deliberately switches
the card off when the driver detects that it is virtualised.

I am using the newest BIOS version (updated on Sunday) on the
motherboard:
Supermicro C7Z170-M
BIOS Version: 2.0a
BIOS Tag: 1088B
Date: 07/17/2017
Time: 15:51:37

Unfortunately I do not know anything about a BIOS/UEFI firmware bug. Is
this a known issue of my mainboard version?

In the BIOS for the "Boot mode select" setting, I have chosen
"Legacy" (there would also be "UEFI" or "DUAL"). Do you think it might
be worth trying to change it to the other two?

I have uploaded dmesg output if it helps:
dmesg: https://pastebin.com/79Us7WMf

In the Windows 7 guest, the reported IDs are:
VEN_ID: 10DE
DEV_ID: 1B06

The driver version in the Windows 7 guest is:
23.21.13.8813 (Date: 27.10.2017)

I have thought about buying a Windows 10 copy, but it is not possible
to get the direct download version in Switzerland, so I postponed the
purchase due to lack of patience.

But if it helps, here is the information from a Debian 9 guest with the
nvidia-driver package installed:
ID: 10de:1b06
VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1b06]
(rev a1)

This is also what nvidia-detect says:
$ sudo nvidia-detect
Detected NVIDIA GPUs:
00:09.0 VGA compatible controller [0300]: NVIDIA Corporation Device
[10de:1b06] (rev a1)

Checking card: NVIDIA Corporation Device 1b06 (rev a1)
Your card is supported by the default drivers.
It is recommended to install the
nvidia-driver
package.

So I installed nvidia-driver and rebooted the Debian 9 guest.
The display setting in XFCE4 still does not show the NVidia card and
the nvidia-settings program reports that I should run nvidia-xconfig as
root, which I did.
This is the resulting config file:
xorg.conf: https://pastebin.com/sCe30emi

Unfortunately lightdm fails to start. Here is the suggested log:
systemctl status lightdm.service: https://pastebin.com/VYgKuCy1

Since there is not much information in that log I have created a
pastebin for dmesg of the failed Debian 9 guest boot attempt:
dmesg for lightdm fail: https://pastebin.com/Djx2YycH


I am not sure if I can still get a copy of Windows 8 somewhere, but if
you think it helps, I can go and buy Windows 10.
Please let me know how I can help you help me :-)

The problem got me thinking yesterday and today I asked around if
anybody wants the card and if I should buy an AMD GPU. But since my
card came with a pre-installed water block no potential buyer could be
found...


Thank you again and best regards,
Ramon

Alexander V. Makartsev

Nov 13, 2017, 5:40:05 PM
On 14.11.2017 02:37, Ramon Hofer wrote:
> Dear Alexander,
>
> Thank you very much for your reply.
> Interesting. I thought I was just not able to set up KVM / QEMU
> properly. Because I read and heard that NVidia deliberately switches
> the card off when the driver detects that it is virtualised.
>
> I am using the newest BIOS version (updated on Sunday) on the
> motherboard:
> Supermicro C7Z170-M
> BIOS Version: 2.0a
> BIOS Tag: 1088B
> Date: 07/17/2017
> Time: 15:51:37
>
> Unfortunately I do not know anything about a BIOS/UEFI firmware bug. Is
> this a known issue of my mainboard version?
>
> In the BIOS for the "Boot mode select" setting, I have chosen
> "Legacy" (there would also be "UEFI" or "DUAL"). Do you think it might
> be worth trying to change it to the other two?

I can't tell for sure if it will help. Basically, you have to enable VT-d, IOMMU in BIOS and stick with it.
An updated BIOS is great because this at least removes one thing from the equation.
You can check if all features of QEMU are enabled on your host by typing:
    $ virt-host-validate

> I have uploaded dmesg output if it helps:
> dmesg: https://pastebin.com/79Us7WMf
>
> In the Windows 7 guest, the reported IDs are:
> VEN_ID: 10DE
> DEV_ID: 1B06
>
> The driver version in the Windows 7 guest is:
> 23.21.13.8813 (Date: 27.10.2017)

I'd try different versions of the nVidia drivers, not only the most recent one, and perform a clean install of the drivers.

> I have thought about buying a Windows 10 copy, but it is not possible
> to get the direct download version in Switzerland, so I postponed the
> purchase due to lack of patience.
>
> I am not sure if I can still get a copy of Windows 8 somewhere, but if
> you think it helps, I can go and buy Windows 10.
> Please let me know how I can help you help me :-)

You can download and try Windows 10 for free and play with it for a 90-day trial period. Get the regular one, not LTSB.
https://www.microsoft.com/en-us/evalcenter/evaluate-windows-10-enterprise
This might be your best bet to try KVM with a Windows 10 guest without spending any money.

> The problem got me thinking yesterday and today I asked around if
> anybody wants the card and if I should buy an AMD GPU. But since my
> card came with a pre-installed water block no potential buyer could be
> found...
>
>
> Thank you again and best regards,
> Ramon

Ramon Hofer

Nov 13, 2017, 6:20:04 PM
Dear Adam,

On Mon, 13 Nov 2017 22:49:02 +0100
Adam Cécile <adam....@hitec.lu> wrote:

> Here is my notes/scripts when I did that to attach an nvidia card
> inside a KVM virtual machine:
> https://github.com/eLvErDe/nvidia-docker-cuda-kvm-with-passthru/blob/master/create-kvm-for-nvidia-docker.sh

Thank you very much for your script.

Since I am new to virtualisation and a bit of a purist, I hoped to not use
docker or at least understand why I should use it.
Could you please tell a word or two about it?
Is it only because of the workaround you mention in the script?

With your script I have successfully created a VM and installed
nvidia-detect but it reports:
# nvidia-detect
Detected NVIDIA GPUs:
00:04.0 VGA compatible controller [0300]: NVIDIA Corporation Device
[10de:1b06] (rev a1) Uh oh. Your card is not supported by any driver
version up to 340.102. A newer driver may add support for your card.
Newer driver releases may be available in backports, unstable or
experimental.

I hope to have time to do further tests tomorrow by changing your
script to use Debian 9 and its newer drivers.


Thanks again and best regards,
Ramon

Ramon Hofer

Nov 13, 2017, 7:20:04 PM
Dear Adam,

On Mon, 13 Nov 2017 22:49:02 +0100
Adam Cécile <adam....@hitec.lu> wrote:

> Here is my notes/scripts when I did that to attach an nvidia card
> inside a KVM virtual machine:
> https://github.com/eLvErDe/nvidia-docker-cuda-kvm-with-passthru/blob/master/create-kvm-for-nvidia-docker.sh

I remembered that I can just upgrade to Stretch the usual way by changing
the sources.list files.
Unfortunately the result is exactly the same as what I described in my
previous email to Alexander. I could install and configure the driver but
startxfce4 does not start.

startxfce4: https://pastebin.com/0MeCU492

xorg.conf: https://pastebin.com/a4qGhsxz

Maybe I am not running startxfce4 correctly?
Do I need to change xorg.conf?
Or is there any way to check if the driver works in the guest?


Thanks again for your help.


Best regards,
Ramon

Ramon Hofer

Nov 16, 2017, 6:00:05 PM
Dear Alexander,

Thanks for your reply and sorry for my late response.
If I may ask you to reply to all and keep me in CC, this way I get the
email in my client and can easily answer.

> On 14.11.2017 02:37, Ramon Hofer wrote:
> > Thank you very much for your reply.
> > Interesting. I thought I was just not able to set up KVM / QEMU
> > properly. Because I read and heard that NVidia deliberately switches
> > the card off when the driver detects that it is virtualised.
> >
> > I am using the newest BIOS version (updated on Sunday) on the
> > motherboard:
> > Supermicro C7Z170-M
> > BIOS Version: 2.0a
> > BIOS Tag: 1088B
> > Date: 07/17/2017
> > Time: 15:51:37
> >
> > Unfortunately I do not know anything about a BIOS/UEFI firmware
> > bug. Is this a known issue of my mainboard version?
> >
> > In the BIOS for the "Boot mode select" setting, I have chosen
> > "Legacy" (there would also be "UEFI" or "DUAL"). Do you think it
> > might be worth trying to change it to the other two?
>
> I can't tell for sure if it will help. Basically, you have to enable
> VT-d, IOMMU in BIOS and stick with it.

I have enabled VT-d but could not find an IOMMU setting in the BIOS.
But it seems to be fine, since:

> You can check if all features of QEMU are enabled on your host by
> typing: $ virt-host-validate

The command reports everything enabled.

virt-host-validate: https://pastebin.com/FUbNst11
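
As an additional check beyond virt-host-validate, here is a minimal sketch that lists the IOMMU groups the kernel has created and tests whether the Intel IOMMU was requested on the kernel command line. It assumes a Linux host with sysfs mounted. The function names and the optional path arguments are hypothetical; the path arguments exist only so the functions can be pointed at test data.

```shell
#!/bin/sh
# List every IOMMU group and the PCI addresses it contains.
# $1: group directory, default /sys/kernel/iommu_groups
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for group in "$root"/*; do
        # Skip the unexpanded glob when no groups exist.
        [ -d "$group/devices" ] || continue
        for dev in "$group"/devices/*; do
            [ -e "$dev" ] || continue
            printf 'group %s: %s\n' "${group##*/}" "${dev##*/}"
        done
    done
}

# Succeed if the kernel command line contains intel_iommu=on.
# $1: command line file, default /proc/cmdline
intel_iommu_requested() {
    grep -q 'intel_iommu=on' "${1:-/proc/cmdline}"
}
```

If list_iommu_groups prints nothing, the IOMMU is probably not active. For passthrough, the GPU and its audio function would normally be expected to show up together in one group.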


> > I have uploaded dmesg output if it helps:
> > dmesg: https://pastebin.com/79Us7WMf
> >
> > In the Windows 7 guest, the reported IDs are:
> > VEN_ID: 10DE
> > DEV_ID: 1B06
> >
> > The driver version in the Windows 7 guest is:
> > 23.21.13.8813 (Date: 27.10.2017)
> >
> I'd try different versions of nVidia drivers, not only most recent one
> and perform clean install of the drivers.

Just to be sure: I do not need the drivers on the Debian host since the
GPU is passed through to the guest. I had clean installs of everything.
I will have to look at the weekend for an older driver and test it...


> > I have [...] postponed the purchase due to lack of patience.
>
> You can download and try Windows 10 for free and play with it for a
> 90-days trial period. Get the regular one, not LTSB.
> https://www.microsoft.com/en-us/evalcenter/evaluate-windows-10-enterprise
> Might be your best bet to try KVM with Windows 10 guest without
> spending any money.

Thanks for the tip. I created a Windows 10 Enterprise AMD64 guest.
Unfortunately with the same result.


Some other thing I was thinking: I read on the Supermicro homepage that
the C7Z170-M supports 7th generation i7s (like my i7-7700K). But in the
printed manual it was written that it only supports 6th Generation i7.
But then I guess it would not even boot up, if the CPU was not
supported. And the following findings also say otherwise.


To test if the card is not dead, I have just removed the nouveau
blacklist, and the options vfio-pci ids=10de:1b06,10de:10ef
in /etc/modprobe.d/vfio.conf.
After a reboot, I then installed nvidia-detect, nvidia-driver, and
nvidia-xconfig, as well as task-xfce-desktop on the Debian 9 host.
But when I booted, the display of the NVidia card remained black (just
like previously in the Debian guests). It is still possible to
Alt+Ctrl+F1 into a different terminal.
Then I reset the BIOS setting to the defaults and booted again into the
original Debian 9 host's XFCE4. This time it worked. I am now running
unigine benchmark and it runs quite well.


Now I think I do not understand the basic concept of PCI passthrough
correctly.
I have added again the /etc/modprobe.d/vfio.conf option and rebooted.
The nouveau driver was already blacklisted by a softlink
to /etc/alternatives/glx--nvidia-blacklists-nouveau.conf.
Probably I have to remove the nvidia-driver again to be able to reserve
the card for KVM?
First I set the primary video card in the BIOS to the internal
graphics of the mainboard/CPU.
Then I have retried it with Windows 10. Still no luck.
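
One way to see whether the card is actually reserved for KVM is to look at which kernel driver is bound to it on the host. This is a minimal sketch, assuming a Linux host with sysfs. The function name bound_driver is hypothetical, and the PCI address has to be taken from the lspci output because it is not given in this thread.

```shell
#!/bin/sh
# Print the kernel driver currently bound to a PCI device, or "none".
# $1: PCI address, e.g. 0000:01:00.0 (hypothetical example value)
# $2: sysfs device directory, default /sys/bus/pci/devices
#     (this parameter exists only to allow testing)
bound_driver() {
    link="${2:-/sys/bus/pci/devices}/$1/driver"
    if [ -L "$link" ]; then
        # The driver entry is a symlink whose last component is the name.
        basename "$(readlink "$link")"
    else
        echo none
    fi
}
```

If this prints vfio-pci for both the GPU and its audio function before the guest is started, the stub driver has claimed the card. If it prints nvidia or nouveau, the host has taken the card instead. The same information appears in the "Kernel driver in use" line of lspci -nnk.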


Not sure what to try next.
Thanks again for your much appreciated help and time.


Best regards,
Ramon

Alexander V. Makartsev

Nov 17, 2017, 3:20:05 AM
How many video adapters are in your host machine? In the BIOS you have to select the video adapter to be initialized first (e.g. IGFX). It has to be one other than the GTX1080, and you can't use the 1080 in your host OS if you want it to be passed through to the guest OS. This is the reason why you need the vfio stub drivers: to protect the 1080 from host OS interference.

I don't think I can provide more information on the subject than what is already in the ArchWiki.
Personally, I think nVidia's anti-VM driver protection is the cause of your problems, not least because you see the proper VEN_ID and DEV_ID for the video card in the guest OS, which means you installed the stub drivers (vfio) correctly in the host OS.
This article section suggests that the last unprotected driver version was 337.88, which means you won't be able to install it on a Windows 10 guest, and your Pascal-based video card is not supported by it.
https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#.22Error_43_:_Driver_failed_to_load.22_on_Nvidia_GPUs_passed_to_Windows_VMs
So blindly installing previous versions of the nVidia drivers won't work, and the next step should be to fool the anti-VM driver protection with the workarounds described in this article section:
https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF#Troubleshooting

Ramon Hofer

Nov 17, 2017, 2:20:05 PM
Dear Alexander,

On Fri, 17 Nov 2017 13:09:38 +0500
"Alexander V. Makartsev" <avb...@gmail.com> wrote:

> How many video adapters are in your host machine? In the BIOS you
> have to select the video adapter to be initialized first (e.g. IGFX).
> It has to be one other than the GTX1080, and you can't use the 1080 in
> your host OS if you want it to be passed through to the guest OS. This
> is the reason why you need the vfio stub drivers: to protect the 1080
> from host OS interference.

On the host machine I have the integrated graphics and the additional
NVidia card.

Not sure how to initialize it first. But I have chosen IGFX to be the
default.
The BIOS is shown on the monitor attached to the IGFX.
In the first setting I disabled the NVidia card I believe. Now it is
activated but not the primary card.

Thank you a lot for the explanations. I have a monitor attached to the
GTX1080, but only because I will want to use it in the guest.

I will check the configuration of the host: No NVidia driver,
blacklisted nouveau, vfio stub, and all of the virsh settings.
Also I will try a clean install of the host system and recreate the
guest(s).

What I just did is apply the vendor_id and hidden state settings with
virsh edit:
/etc/libvirt/qemu/win10.xml: https://pastebin.com/7Rewq21Y
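
To confirm that both settings are present in the definition libvirt actually holds, a minimal sketch like the following could be used. It assumes the settings take the usual libvirt form, that is a vendor_id element under hyperv and a hidden element under kvm, each with state set to on. The function name has_nvidia_workarounds is hypothetical.

```shell
#!/bin/sh
# Read libvirt domain XML on stdin and succeed only if both settings
# appear: a vendor_id element with state on and a hidden element with
# state on. Either quote style around the attribute value is accepted.
has_nvidia_workarounds() {
    xml=$(cat)
    printf '%s\n' "$xml" | grep -q "<vendor_id state=['\"]on['\"]" &&
        printf '%s\n' "$xml" | grep -q "<hidden state=['\"]on['\"]"
}
```

For the guest above this could be run as: virsh dumpxml win10 | has_nvidia_workarounds && echo present. It only checks the text of the XML. Whether the running QEMU process received the options can be checked by looking for hv_vendor_id and kvm=off in its command line.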

Then I started the Windows 10 guest and, after checking that it did not
help, removed the GeForce card and the PCI devices in the Windows "device
drivers" screen. I then shut down the guest, removed and re-added the
PCI devices in virt-manager and restarted the guest, made Windows
reinstall the driver, rebooted the guest again and still found the
error 43.

What makes me wonder: You write that the deliberate disabling of the
card is only implemented in the Windows version of the driver. But I
do find the same error using a Debian 9 guest.

Do you have any other ideas what I could try?


Thanks and best regards,
Ramon

Ramon Hofer

Nov 19, 2017, 7:50:06 AM
Dear all

I have contacted NVidia support for help.

In essence they said that they do not support KVM.
But when I asked about the deliberate disabling when virtualization is
detected, they wanted to help me and I had to grab some more
information about the guest system.

Full log: https://pastebin.com/UeqrEZgp

And the mentioned Windows system information file that I had to collect:
Windows system information: https://www.sendspace.com/file/l713tf

I will keep you updated with their answer.


Best regards,
Ramon