-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
On Sat, Jan 09, 2021 at 11:25:57PM -0800, Rama McIntosh wrote:
> On Saturday, January 9, 2021 at 8:37:18 PM UTC-10 Rama McIntosh wrote:
> > Dell soldered the new AX500 pci card to my motherboard (HCL
> > https://groups.google.com/g/qubes-users/c/Fa65-e8vqdM).
> >
> > Please let me know if a qubes-issue is a better place to track this.
> >
> > So when the driver tries to initialize msi:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/net/wireless/ath/ath11k/pci.c?h=v5.10.5#n640
> >
> > It gets -28 in sys-net:
> >
> > [ 5.706045] ath11k_pci 0000:00:06.0: WARNING: ath11k PCI support is
> > experimental!
> > [ 5.706448] ath11k_pci 0000:00:06.0: BAR 0: assigned [mem
> > 0xf2000000-0xf20fffff 64bit]
> > [ 5.734248] ath11k_pci 0000:00:06.0: failed to get 32 MSI vectors, only
> > -28 available
> > [ 5.734289] ath11k_pci 0000:00:06.0: failed to enable msi: -28
> > [ 5.736589] ath11k_pci: probe of 0000:00:06.0 failed with error -28
- -28 is ENOSPC (No space left on device). Likely too many MSI vectors
were requested. Let's see below.
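For anyone who wants to double-check such a return value, here is a
minimal Python sketch that maps a negative kernel-style error code to its
errno name and message. The helper name describe_kernel_error is made up
for illustration, and the mapping is the one of the machine running the
script (ENOSPC is 28 on Linux).

```python
import errno
import os


def describe_kernel_error(ret):
    """Map a negative kernel-style return value to (errno name, message).

    Hypothetical helper for illustration; it uses the errno table of the
    host running the script, which matches the kernel's on Linux.
    """
    code = -ret
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # The value reported by ath11k_pci in the log above.
    print(describe_kernel_error(-28))
```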
> > The driver works fine in bare metal fedora 33. I tried permissive and
> > no-strict-reset. I've been looking at xen and suspect the bug is there.
> > I did notice a patch to fix a intel wifi driver to work with xen.
> >
> > Any help of what needs fixing would be appreciated. Thanks.
> more info:
>
> From xen/console/guest-sys-net-dm.log:
> [2021-01-09 18:51:45] [00:06.0] xen_pt_realize: Assigning real physical
> device 05:00.0 to devfn 0x30
> [2021-01-09 18:51:45] [00:06.0] xen_pt_register_regions: IO region 0
> registered (size=0x00100000 base_addr=0xd2100000 type: 0x4)
> [2021-01-09 18:51:45] [00:06.0] xen_pt_config_reg_init: Offset 0x000e
> mismatch! Emulated=0x0080, host=0x0000, syncing to 0x0000.
> [2021-01-09 18:51:45] [00:06.0] xen_pt_config_reg_init: Offset 0x0010
> mismatch! Emulated=0x0000, host=0xd2100004, syncing to 0xd2100004.
> [2021-01-09 18:51:45] [00:06.0] xen_pt_config_reg_init: Offset 0x0042
> mismatch! Emulated=0x0000, host=0x0003, syncing to 0x0003.
> [2021-01-09 18:51:45] [00:06.0] xen_pt_config_reg_init: Offset 0x0074
> mismatch! Emulated=0x0000, host=0x5908fc0, syncing to 0x5908fc0.
> [2021-01-09 18:51:45] [00:06.0] xen_pt_config_reg_init: Offset 0x007a
> mismatch! Emulated=0x0000, host=0x0010, syncing to 0x0010.
> [2021-01-09 18:51:45] [00:06.0] xen_pt_config_reg_init: Offset 0x0082
> mismatch! Emulated=0x0000, host=0x1012, syncing to 0x1012.
> [2021-01-09 18:51:45] [00:06.0] xen_pt_realize: no pin interrupt
> [2021-01-09 18:51:45] [00:06.0] xen_pt_realize: Real physical device
> 05:00.0 registered successfully
There may be some hint above, but I haven't tried to decode it yet
(specifically, to match the offsets to the capabilities you list below).
> lspci from dom0:
>
> 05:00.0 Network controller: Qualcomm Device 1101 (rev 01)
> Subsystem: Rivet Networks Device a501
> Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
> <MAbort- >SERR- <PERR- INTx-
> Region 0: Memory at d2100000 (64-bit, non-prefetchable) [size=1M]
(...)
> Capabilities: [50] MSI: Enable- Count=1/32 Maskable+ 64bit-
> Address: 00000000 Data: 0000
> Masking: 00000000 Pending: 00000000
This looks interesting, specifically "Count=1/32". I haven't seen many
other devices with MSI (but not MSI-X) that have more than one vector.
> lspci on sys-net:
>
> 00:06.0 Network controller: Qualcomm Device 1101 (rev 01)
> Subsystem: Rivet Networks Device a501
> Physical Slot: 6
> Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx-
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
> <MAbort- >SERR- <PERR- INTx-
> Region 0: Memory at f2000000 (64-bit, non-prefetchable) [size=1M]
(...)
> Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit-
> Address: 00000000 Data: 0000
And here it is presented as "Count=1/1". This looks very much related.
It may be caused by some of the masking listed in the qemu output
(guest-sys-net-dm.log file), but it could also be somewhere else. Anyway,
the most likely place responsible is qemu.
So I've looked into the qemu sources and, indeed, found this comment[1]:
/* Currently no support for multi-vector */
if (*val & PCI_MSI_FLAGS_QSIZE) {
XEN_PT_WARN(&s->dev, "Tries to set more than 1 vector ctrl %x\n", *val);
}
Later in the code you can also find the relevant register definition:
/* Message Control reg */
{
.offset = PCI_MSI_FLAGS,
.size = 2,
.init_val = 0x0000,
.res_mask = 0xFE00,
.ro_mask = 0x018E,
.emu_mask = 0x017E,
.init = xen_pt_msgctrl_reg_init,
.u.w.read = xen_pt_word_reg_read,
.u.w.write = xen_pt_msgctrl_reg_write,
},
Multi-vector is covered by bits 4-6 (mask 0x70)[2] and you can see above
that they are emulated (emu_mask), which means qemu provides its own
values instead of passing them through from the hardware, and the values
for those 3 bits are 0 (init_val).
I'm not sure how hard it would be to implement multi-vector support here,
but it's clearly not there.
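To make the effect of emu_mask concrete, here is a rough Python sketch
(not qemu code) of how an emulated register and the hardware register are
merged into what the guest reads. It assumes the usual rule that emulated
bits come from qemu's value and all other bits from the hardware. The
host value 0x010A is my reconstruction from the dom0 lspci output
(Enable-, 32 vectors capable, Maskable+, 64bit-), not something read from
the device, and the names guest_view and describe are made up for
illustration.

```python
# Bit layout of the 16-bit MSI Message Control register.
MSI_ENABLE = 0x0001   # bit 0: MSI enabled
MSI_QMASK = 0x000E    # bits 1-3: Multiple Message Capable (log2 of count)
MSI_QSIZE = 0x0070    # bits 4-6: Multiple Message Enable (log2 of count)
MSI_64BIT = 0x0080    # bit 7: 64-bit address capable
MSI_MASKBIT = 0x0100  # bit 8: per-vector masking capable

EMU_MASK = 0x017E     # emu_mask from the register definition quoted above
INIT_VAL = 0x0000     # init_val from the same definition


def guest_view(host, emulated=INIT_VAL, emu_mask=EMU_MASK):
    """Assumed merge rule: emulated bits from qemu, the rest from hardware."""
    return (emulated & emu_mask) | (host & ~emu_mask & 0xFFFF)


def describe(ctrl):
    """Decode a Message Control value into the fields lspci prints."""
    return {
        "enable": bool(ctrl & MSI_ENABLE),
        "vectors_enabled": 1 << ((ctrl & MSI_QSIZE) >> 4),
        "vectors_capable": 1 << ((ctrl & MSI_QMASK) >> 1),
        "maskable": bool(ctrl & MSI_MASKBIT),
        "64bit": bool(ctrl & MSI_64BIT),
    }


# Assumption: host register value reconstructed from the dom0 lspci output.
HOST_CTRL = 0x010A

if __name__ == "__main__":
    print("dom0   :", describe(HOST_CTRL))
    print("sys-net:", describe(guest_view(HOST_CTRL)))
```

Under these assumptions the guest sees only 1 vector as capable and no
per-vector masking, which is consistent with the "Count=1/1 Maskable-"
shown by lspci in sys-net, and would explain why a request for 32 vectors
cannot be satisfied there.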
[1]
https://github.com/qemu/qemu/blob/master/hw/xen/xen_pt_config_init.c#L1104
[2]
https://wiki.osdev.org/PCI#Enabling_MSI
- --
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----
iQEzBAEBCAAdFiEEhrpukzGPukRmQqkK24/THMrX1ywFAl/7JeMACgkQ24/THMrX
1yxE4wf/bsj5hMkzKUatMHjUf+StfA0MObQFtrkStdG3XEynbJQgC/yNXkJiXokD
r7HGG6pxob+/2mNFc+OmMAvOIO7yLhew2EEiLGDiDkyPGIZsc92IdKE7Wijzh5Kf
6R/jrRf/lQMBD3CBrY8FaRpsqTBGp1uhxpeAmsf6Qjdh7bO1kjPzY08xmuoWf7gu
szSu/iStRuBo0irtFZL5W7DmcMrbtzq/sfnJzqJD5bxz4lqrzl5NQbgeJhP+nGWq
Ht0kSVvPxXMa9e/VMMQWLS+9ZElEGeDWhJw4f1fI3+1W533G+Y7z1ONzfYE+Jz9B
NnNzXTMuuDQkb2A8vaW1S8f1xjhtnQ==
=XcGW
-----END PGP SIGNATURE-----