Bug#994721: linux-image-5.10.0-0.bpo.8-amd64: Freeze on i915 Broxton with linux >=5.7 and old mesa; not prevented by intel_iommu=intgpu_off

Peter Nowee

Sep 19, 2021, 5:10:03 PM
Package: src:linux
Version: 5.10.46-4~bpo10+1
Severity: important

Dear Maintainer,

*** Reporter, please consider answering these questions, where appropriate ***

* What led up to the situation?
* What exactly did you do (or not do) that was effective (or
ineffective)?
* What was the outcome of this action?
* What outcome did you expect instead?

*** End of the template - remove these template lines ***


-- Package-specific info:
** Version:
Linux version 5.10.0-0.bpo.8-amd64 (debian...@lists.debian.org) (gcc-8 (Debian 8.3.0-6) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP Debian 5.10.46-4~bpo10+1 (2021-08-07)

** Command line:
BOOT_IMAGE=/vmlinuz-5.10.0-0.bpo.8-amd64 root=/dev/mapper/disruption--vg-disruption--debstable--root ro ipv6.disable=1 intel_iommu=off

** Not tainted

** Kernel log:
Unable to read kernel log; any relevant messages should be attached

** Model information
sys_vendor: ASUSTeK COMPUTER INC.
product_name: E402NA
product_version: 1.0
chassis_vendor: ASUSTeK COMPUTER INC.
chassis_version: 1.0
bios_vendor: American Megatrends Inc.
bios_version: E402NA.317
board_vendor: ASUSTeK COMPUTER INC.
board_name: E402NA
board_version: 1.0

** Loaded modules:
ctr
ccm
cmac
rfcomm
appletalk
psnap
llc
bnep
snd_hda_codec_hdmi
ath9k
ath9k_common
ath3k
btusb
ath9k_hw
btrtl
btbcm
btintel
bluetooth
ath
jitterentropy_rng
mac80211
snd_sof_pci
x86_pkg_temp_thermal
intel_powerclamp
coretemp
snd_sof_intel_byt
snd_sof_intel_ipc
snd_sof_intel_hda_common
kvm_intel
mei_hdcp
snd_sof_xtensa_dsp
snd_sof
intel_rapl_msr
snd_sof_intel_hda
snd_hda_codec_realtek
snd_soc_skl
snd_hda_codec_generic
ledtrig_audio
snd_soc_hdac_hda
snd_hda_ext_core
snd_soc_sst_ipc
kvm
snd_soc_sst_dsp
snd_soc_acpi_intel_match
snd_soc_acpi
snd_hda_intel
cfg80211
snd_intel_dspcfg
soundwire_intel
soundwire_generic_allocation
snd_soc_core
snd_compress
soundwire_cadence
drbg
snd_hda_codec
joydev
ansi_cprng
irqbypass
rapl
snd_hda_core
wdat_wdt
intel_cstate
snd_hwdep
asus_nb_wmi
hid_multitouch
watchdog
asus_wmi
soundwire_bus
ecdh_generic
wmi_bmof
sparse_keymap
efi_pstore
serio_raw
pcspkr
snd_pcm
ecc
rfkill
libarc4
intel_xhci_usb_role_switch
snd_timer
roles
sg
mei_me
snd
soundcore
mei
processor_thermal_device
intel_rapl_common
intel_soc_dts_iosf
ac
evdev
nft_ct
nf_conntrack
int3403_thermal
int340x_thermal_zone
int3400_thermal
acpi_thermal_rel
asus_wireless
intel_pmc_core
nf_defrag_ipv6
nf_defrag_ipv4
nf_log_ipv4
nf_log_common
binfmt_misc
nft_log
nft_limit
nft_counter
parport_pc
nf_tables
ppdev
libcrc32c
lp
nfnetlink
parport
fuse
configfs
efivarfs
ip_tables
x_tables
autofs4
ext4
crc16
mbcache
jbd2
crc32c_generic
algif_skcipher
af_alg
dm_crypt
dm_mod
usbhid
sd_mod
t10_pi
crc_t10dif
crct10dif_generic
uas
usb_storage
i915
crct10dif_pclmul
crct10dif_common
crc32_pclmul
crc32c_intel
hid_generic
ghash_clmulni_intel
i2c_algo_bit
drm_kms_helper
cec
rtsx_pci_sdmmc
xhci_pci
mmc_core
ahci
libahci
xhci_hcd
drm
libata
r8169
aesni_intel
usbcore
libaes
crypto_simd
cryptd
glue_helper
scsi_mod
realtek
mdio_devres
usb_common
rtsx_pci
lpc_ich
libphy
intel_lpss_pci
i2c_i801
intel_lpss
idma64
i2c_smbus
i2c_hid
wmi
hid
battery
button
video

** Network interface configuration:
*** /etc/network/interfaces:

source /etc/network/interfaces.d/*

auto lo
iface lo inet loopback

** Network status:
*** IP interfaces and addresses:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: enp1s0f2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
link/ether 2c:fd:a1:7f:c1:72 brd ff:ff:ff:ff:ff:ff
3: wlp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether f0:03:8c:be:1a:77 brd ff:ff:ff:ff:ff:ff
inet 10.0.1.27/24 brd 10.0.1.255 scope global dynamic wlp2s0
valid_lft 515398266sec preferred_lft 515398266sec

*** Device statistics:
Inter-| Receive | Transmit
face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
lo: 9763976 32212 0 0 0 0 0 0 9763976 32212 0 0 0 0 0 0
enp1s0f2: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
wlp2s0: 126272033 139268 0 0 0 0 0 0 18566882 105230 0 0 0 0 0 0

*** Protocol statistics:
Ip:
Forwarding: 2
167614 total packets received
11 with invalid addresses
0 forwarded
0 incoming packets discarded
163227 incoming packets delivered
136646 requests sent out
112 dropped because of missing route
4 fragments received ok
8 fragments created
Icmp:
45 ICMP messages received
1 input ICMP message failed
ICMP input histogram:
destination unreachable: 44
echo requests: 1
44 ICMP messages sent
0 ICMP messages failed
ICMP output histogram:
destination unreachable: 43
echo replies: 1
IcmpMsg:
InType3: 44
InType8: 1
OutType0: 1
OutType3: 43
Tcp:
3108 active connection openings
1404 passive connection openings
1078 failed connection attempts
955 connection resets received
1 connections established
128671 segments received
99858 segments sent out
181 segments retransmitted
2 bad segments received
2567 resets sent
Udp:
34500 packets received
2 packets to unknown port received
0 packet receive errors
37363 packets sent
0 receive buffer errors
0 send buffer errors
IgnoredMulti: 6
UdpLite:
TcpExt:
717 TCP sockets finished time wait in fast timer
1 packetes rejected in established connections because of timestamp
230 delayed acks sent
1 delayed acks further delayed because of locked socket
Quick ack mode was activated 117 times
72265 packet headers predicted
12654 acknowledgments not containing data payload received
7752 predicted acknowledgments
TCPSackRecovery: 7
Detected reordering 61 times using SACK
Detected reordering 3 times using time stamp
3 congestion windows fully recovered without slow start
3 congestion windows partially recovered using Hoe heuristic
TCPDSACKUndo: 1
2 congestion windows recovered without slow start after partial ack
TCPLostRetransmit: 121
7 fast retransmits
TCPTimeouts: 166
TCPLossProbes: 32
TCPLossProbeRecovery: 1
TCPBacklogCoalesce: 1111
TCPDSACKOldSent: 104
TCPDSACKOfoSent: 1
TCPDSACKRecv: 19
1056 connections reset due to unexpected data
73 connections reset due to early user close
18 connections aborted due to timeout
TCPDSACKIgnoredNoUndo: 6
TCPSackShifted: 7
TCPSackMerged: 8
TCPSackShiftFallback: 92
TCPDeferAcceptDrop: 1204
TCPRcvCoalesce: 60157
TCPOFOQueue: 8173
TCPOFOMerge: 1
TCPChallengeACK: 2
TCPSYNChallenge: 2
TCPSpuriousRtxHostQueues: 6
TCPAutoCorking: 858
TCPWantZeroWindowAdv: 1
TCPSynRetrans: 2
TCPOrigDataSent: 25838
TCPHystartDelayDetect: 7
TCPHystartDelayCwnd: 196
TCPKeepAlive: 1237
TCPDelivered: 26405
TCPAckCompressed: 3567
TcpTimeoutRehash: 148
TCPDSACKRecvSegs: 19
IpExt:
InBcastPkts: 1528
OutBcastPkts: 6
InOctets: 133706090
OutOctets: 23895690
InBcastOctets: 112938
OutBcastOctets: 468
InNoECTPkts: 167614


** PCI devices:
00:00.0 Host bridge [0600]: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series Host Bridge [8086:5af0] (rev 0b)
Subsystem: ASUSTeK Computer Inc. Celeron N3350/Pentium N4200/Atom E3900 Series Host Bridge [1043:1300]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0

00:00.1 Signal processing controller [1180]: Intel Corporation Device [8086:5a8c] (rev 0b)
Subsystem: ASUSTeK Computer Inc. Device [1043:1300]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin B routed to IRQ 24
Region 0: Memory at 91310000 (64-bit, non-prefetchable) [size=32K]
Capabilities: <access denied>
Kernel driver in use: proc_thermal
Kernel modules: processor_thermal_device

00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 500 [8086:5a85] (rev 0b) (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. HD Graphics 500 [1043:1300]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 129
Region 0: Memory at 90000000 (64-bit, non-prefetchable) [size=16M]
Region 2: Memory at 80000000 (64-bit, prefetchable) [size=256M]
Region 4: I/O ports at f000 [size=64]
[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: i915
Kernel modules: i915

00:0e.0 Audio device [0403]: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series Audio Cluster [8086:5a98] (rev 0b)
Subsystem: ASUSTeK Computer Inc. Celeron N3350/Pentium N4200/Atom E3900 Series Audio Cluster [1043:1300]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 131
Region 0: Memory at 91318000 (64-bit, non-prefetchable) [size=16K]
Region 4: Memory at 91000000 (64-bit, non-prefetchable) [size=1M]
Capabilities: <access denied>
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel, snd_soc_skl, snd_sof_pci

00:0f.0 Communication controller [0780]: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series Trusted Execution Engine [8086:5a9a] (rev 0b)
Subsystem: ASUSTeK Computer Inc. Celeron N3350/Pentium N4200/Atom E3900 Series Trusted Execution Engine [1043:1300]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 130
Region 0: Memory at 91327000 (64-bit, non-prefetchable) [size=4K]
Capabilities: <access denied>
Kernel driver in use: mei_me
Kernel modules: mei_me

00:12.0 SATA controller [0106]: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series SATA AHCI Controller [8086:5ae3] (rev 0b) (prog-if 01 [AHCI 1.0])
Subsystem: ASUSTeK Computer Inc. Celeron N3350/Pentium N4200/Atom E3900 Series SATA AHCI Controller [1043:1300]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 127
Region 0: Memory at 9131c000 (32-bit, non-prefetchable) [size=8K]
Region 1: Memory at 91324000 (32-bit, non-prefetchable) [size=256]
Region 2: I/O ports at f090 [size=8]
Region 3: I/O ports at f080 [size=4]
Region 4: I/O ports at f060 [size=32]
Region 5: Memory at 91323000 (32-bit, non-prefetchable) [size=2K]
Capabilities: <access denied>
Kernel driver in use: ahci
Kernel modules: ahci

00:13.0 PCI bridge [0604]: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series PCI Express Port A #1 [8086:5ad8] (rev fb) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 122
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 0000e000-0000efff
Memory behind bridge: 91200000-912fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: <access denied>
Kernel driver in use: pcieport

00:13.1 PCI bridge [0604]: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series PCI Express Port A #2 [8086:5ad9] (rev fb) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin B routed to IRQ 123
Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
Memory behind bridge: 91100000-911fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
BridgeCtl: Parity- SERR+ NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: <access denied>
Kernel driver in use: pcieport

00:15.0 USB controller [0c03]: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series USB xHCI [8086:5aa8] (rev 0b) (prog-if 30 [XHCI])
Subsystem: ASUSTeK Computer Inc. Celeron N3350/Pentium N4200/Atom E3900 Series USB xHCI [1043:1300]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 128
Region 0: Memory at 91300000 (64-bit, non-prefetchable) [size=64K]
Capabilities: <access denied>
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci

00:16.0 Signal processing controller [1180]: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series I2C Controller #1 [8086:5aac] (rev 0b)
Subsystem: ASUSTeK Computer Inc. Celeron N3350/Pentium N4200/Atom E3900 Series I2C Controller [1043:1300]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 27
Region 0: Memory at 91322000 (64-bit, non-prefetchable) [size=4K]
Region 2: Memory at 91321000 (64-bit, non-prefetchable) [size=4K]
Capabilities: <access denied>
Kernel driver in use: intel-lpss
Kernel modules: intel_lpss_pci

00:17.0 Signal processing controller [1180]: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series I2C Controller #5 [8086:5ab4] (rev 0b)
Subsystem: ASUSTeK Computer Inc. Celeron N3350/Pentium N4200/Atom E3900 Series I2C Controller [1043:1300]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 31
Region 0: Memory at 91320000 (64-bit, non-prefetchable) [size=4K]
Region 2: Memory at 9131f000 (64-bit, non-prefetchable) [size=4K]
Capabilities: <access denied>
Kernel driver in use: intel-lpss
Kernel modules: intel_lpss_pci

00:1f.0 ISA bridge [0601]: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series Low Pin Count Interface [8086:5ae8] (rev 0b)
Subsystem: ASUSTeK Computer Inc. Celeron N3350/Pentium N4200/Atom E3900 Series Low Pin Count Interface [1043:1300]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Kernel driver in use: lpc_ich
Kernel modules: lpc_ich

00:1f.1 SMBus [0c05]: Intel Corporation Atom/Celeron/Pentium Processor N4200/N3350/E3900 Series SMBus Controller [8086:5ad4] (rev 0b)
Subsystem: ASUSTeK Computer Inc. Celeron N3350/Pentium N4200/Atom E3900 Series SMBus Controller [1043:1300]
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin A routed to IRQ -2147483648
Region 0: Memory at 9131e000 (64-bit, non-prefetchable) [size=256]
Region 4: I/O ports at f040 [size=32]
Kernel driver in use: i801_smbus
Kernel modules: i2c_i801

01:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5286 PCI Express Card Reader [10ec:5286] (rev 01)
Subsystem: ASUSTeK Computer Inc. RTS5286 PCI Express Card Reader [1043:202f]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin B routed to IRQ 124
Region 0: Memory at 91200000 (32-bit, non-prefetchable) [size=64K]
Capabilities: <access denied>
Kernel driver in use: rtsx_pci
Kernel modules: rtsx_pci

01:00.2 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller [10ec:8136] (rev 06)
Subsystem: ASUSTeK Computer Inc. RTL810xE PCI Express Fast Ethernet controller [1043:200f]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 22
Region 0: I/O ports at e000 [size=256]
Region 2: Memory at 91214000 (64-bit, non-prefetchable) [size=4K]
Region 4: Memory at 91210000 (64-bit, prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: r8169
Kernel modules: r8169

02:00.0 Network controller [0280]: Qualcomm Atheros QCA9565 / AR9565 Wireless Network Adapter [168c:0036] (rev 01)
Subsystem: AzureWave QCA9565 / AR9565 Wireless Network Adapter [1a3b:218d]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 23
Region 0: Memory at 91100000 (64-bit, non-prefetchable) [size=512K]
Expansion ROM at 91180000 [disabled] [size=64K]
Capabilities: <access denied>
Kernel driver in use: ath9k
Kernel modules: ath9k


** USB devices:
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 012: ID 13d3:3490 IMC Networks
Bus 001 Device 006: ID 413c:3200 Dell Computer Corp. Mouse
Bus 001 Device 005: ID 046d:c31c Logitech, Inc. Keyboard K120
Bus 001 Device 003: ID 05e3:0606 Genesys Logic, Inc. USB 2.0 Hub / D-Link DUB-H4 USB 2.0 Hub
Bus 001 Device 010: ID 0781:5583 SanDisk Corp. Ultra Fit
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub


-- System Information:
Debian Release: 10.10
APT prefers oldstable-updates
APT policy: (500, 'oldstable-updates'), (500, 'oldstable-proposed-updates'), (500, 'oldstable'), (90, 'stable-security'), (90, 'proposed-updates'), (90, 'unstable'), (90, 'testing'), (90, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-0.bpo.8-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages linux-image-5.10.0-0.bpo.8-amd64 depends on:
ii initramfs-tools [linux-initramfs-tool] 0.133+deb10u1
ii kmod 26-1
ii linux-base 4.6

Versions of packages linux-image-5.10.0-0.bpo.8-amd64 recommends:
ii apparmor 2.13.2-10
ii firmware-linux-free 3.4

Versions of packages linux-image-5.10.0-0.bpo.8-amd64 suggests:
pn debian-kernel-handbook <none>
ii grub-efi-amd64 2.02+dfsg1-20+deb10u4
pn linux-doc-5.10 <none>

Versions of packages linux-image-5.10.0-0.bpo.8-amd64 is related to:
pn firmware-amd-graphics <none>
ii firmware-atheros 20210315-3~bpo10+1
pn firmware-bnx2 <none>
pn firmware-bnx2x <none>
pn firmware-brcm80211 <none>
pn firmware-cavium <none>
pn firmware-intel-sound <none>
pn firmware-intelwimax <none>
pn firmware-ipw2x00 <none>
pn firmware-ivtv <none>
pn firmware-iwlwifi <none>
pn firmware-libertas <none>
pn firmware-linux-nonfree <none>
ii firmware-misc-nonfree 20210315-3~bpo10+1
pn firmware-myricom <none>
pn firmware-netxen <none>
pn firmware-qlogic <none>
ii firmware-realtek 20210315-3~bpo10+1
pn firmware-samsung <none>
pn firmware-siano <none>
pn firmware-ti-connectivity <none>
pn xen-hypervisor <none>

-- no debconf information

Peter Nowee

Sep 19, 2021, 5:30:03 PM
Details to follow tomorrow (bisected already, how to reproduce)

Peter Nowee

Sep 20, 2021, 6:00:03 PM
Hi Debian kernel team,

Sorry for the short report yesterday. Reportbug sent it out earlier
than I expected. Here is the full report:

* What led up to the situation?

On Debian 10 buster, I upgraded the linux kernel packages from
4.19+105+deb10u12 (buster) to 5.10.46-4~bpo10+1 (buster-backports).

Since then, the system has been hanging regularly. These are always
complete freezes: I cannot move the mouse pointer, cannot switch to a
virtual console, and the machine no longer responds to network pings.


* What exactly did you do (or not do) that was effective (or
ineffective)?

Use Firefox or Evince. See below for details.

* What was the outcome of this action?

Complete system hang/freeze.

* What outcome did you expect instead?

No hang.


Git bisect results:

Bisect Debian: https://salsa.debian.org/kernel-team/linux.git
First bad commit: 3fcc0ffb or 0fc228cb. Probably 3fcc0ffb, the first
update from 5.6 to 5.7.

* | a2f70104 [amd64] Update "x86: Make x32 syscall support conditional ..." for 5.7
* | 0fc228cb lockdown: Update Secure Boot support patches for 5.7 <--- git bisect bad
* | 3fcc0ffb Update to 5.7-rc4 <--- git bisect skip, because it fails to build
* | 6e17c1ca Enable support for fsverity <--- git bisect good
|/
o b49338be (tag: debian/5.6.7-1) Prepare to release linux (5.6.7-1).


Bisect upstream:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/

Notes about bisecting upstream:
- The pristine upstream kernel does not by default show the bug,
  because it disables `intel_iommu` by default. You need to either:
  - Apply the following two Debian patches:
      features/x86/intel-iommu-add-option-to-exclude-integrated-gpu-only.patch
      features/x86/intel-iommu-add-kconfig-option-to-exclude-igpu-by-default.patch
  - Or boot the pristine upstream kernel with `intel_iommu=on`.
- The first bad commit is in a range of commits that fail to build
  because of an unrelated problem:
    depmod: ERROR: Cycle detected: drm_kms_helper -> drm -> drm_kms_helper
  Cherry-pick the following later revert commits to solve that:
    $ git cherry-pick -x 6ae1a4bb^..09912635
    $ git commit --allow-empty
    $ git cherry-pick --continue
  (Only relevant for bisecting one specific old branch, otherwise not
  relevant to this bug report.)
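
The "fails to build, so skip" handling described above can also be
scripted. The following is only a minimal sketch and not what was used
for this report: `bisect_step` is a hypothetical helper, and its two
arguments stand for whatever build and check commands apply. It relies
on the `git bisect run` convention that exit status 125 means "this
commit cannot be tested, skip it". A complete freeze cannot be detected
by a script running on the frozen machine, so for this bug the check
itself still has to be done by hand; the sketch mainly shows how a build
failure maps to a skip.

```shell
#!/bin/sh
# bisect_step BUILD_CMD CHECK_CMD   (hypothetical helper, for illustration)
#
# Exit status follows the `git bisect run` convention:
#   0   -> commit is good
#   125 -> commit cannot be tested (git bisect skips it)
#   1   -> commit is bad (any status 1-127 except 125 counts as bad)
bisect_step() {
    sh -c "$1" || return 125   # build failed: ask git bisect to skip
    sh -c "$2" || return 1     # check failed: mark the commit bad
    return 0                   # build and check both succeeded
}
```

A wrapper script that calls `bisect_step` and exits with its status
could then be passed to `git bisect run`, so that an unbuildable range
is skipped automatically instead of by hand.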

First bad commit upstream: bf72c8c6, first merged in torvalds/master
with v5.7-rc1.

commit bf72c8c6ee77d46f74a2b143303a9c9923f9e7a7 (refs/bisect/bad)
Author: Chris Wilson <ch...@chris-wilson.co.uk>
Date: Thu Jan 30 09:22:38 2020 +0000

    drm/i915/gt: Skip global serialisation of clear_range for bxt vtd

    VT'd on Broxton and on Braswell require serialisation of GGTT updates.
    However, it seems to only be required for insertion, so drop the
    complication and heavyweight stop_machine() for clears. The range will
    be serialised again before use.

    Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
    Reviewed-by: Mika Kuoppala <mika.k...@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20200130092239....@chris-wilson.co.uk

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index fdfed921..f83070b5 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -350,31 +350,6 @@ static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
 	stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
 }

-struct clear_range {
-	struct i915_address_space *vm;
-	u64 start;
-	u64 length;
-};
-
-static int bxt_vtd_ggtt_clear_range__cb(void *_arg)
-{
-	struct clear_range *arg = _arg;
-
-	gen8_ggtt_clear_range(arg->vm, arg->start, arg->length);
-	bxt_vtd_ggtt_wa(arg->vm);
-
-	return 0;
-}
-
-static void bxt_vtd_ggtt_clear_range__BKL(struct i915_address_space *vm,
-					  u64 start,
-					  u64 length)
-{
-	struct clear_range arg = { vm, start, length };
-
-	stop_machine(bxt_vtd_ggtt_clear_range__cb, &arg, NULL);
-}
-
 static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 				  u64 start, u64 length)
 {
@@ -881,8 +856,6 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 	    IS_CHERRYVIEW(i915) /* fails with concurrent use/update */) {
 		ggtt->vm.insert_entries = bxt_vtd_ggtt_insert_entries__BKL;
 		ggtt->vm.insert_page = bxt_vtd_ggtt_insert_page__BKL;
-		if (ggtt->vm.clear_range != nop_clear_range)
-			ggtt->vm.clear_range = bxt_vtd_ggtt_clear_range__BKL;
 	}

 	ggtt->invalidate = gen8_ggtt_invalidate;



How to reproduce:
- BIOS: VT-d enabled.
and
- Kernel releases since v5.7-rc1. I can still reproduce it with drm-tip
  e61e3604 of 2021-09-17 based on v5.15-rc1.
and
- Kernel from:
  - Debian official packages, or
  - Built from Debian linux kernel source (salsa), or
  - Built from pristine upstream patched with these 2 Debian patches:
      features/x86/intel-iommu-add-option-to-exclude-integrated-gpu-only.patch
      features/x86/intel-iommu-add-kconfig-option-to-exclude-igpu-by-default.patch
    or
  - Built from pristine upstream without the 2 Debian patches, but then
    use boot parameter `intel_iommu=on`.
and
- Linux boot parameter:
  - Debian packages, or upstream patched with the 2 Debian patches:
    - No boot parameter, or
    - `intel_iommu=intgpu_off` (Debian-specific, Debian default), or
    - `intel_iommu=on`.
  or
  - With pristine upstream, without Debian patches:
    - `intel_iommu=on`.
and
- The buster-versions of binaries from these sources:
  - mesa 18.3.6-2+deb10u1 (not 20.3.5-1~bpo10+1), and
  - libglvnd 1.1.0-1 (not 1.3.2-1~bpo10+2)
and
- Xorg driver:
  - modesetting: Easiest to reproduce with Evince: Hang in <2 minutes,
    usually <30 seconds. Also reproducible with Firefox with
    Compositing: OpenGL, though it will take longer, see Steps below.
  or
  - intel: Have not been able to reproduce with Evince, only with
    Firefox with Compositing: OpenGL. Sometimes <1 min, sometimes 20
    minutes. See Steps below.
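
To check which of the boot-parameter cases above a running system falls
into, something like the following could be used. It is a minimal
sketch: both function names are made up for this illustration, each
`intel_iommu=` value is treated as a single word (no comma-separated
values), and the verdict only encodes the observations listed above for
Debian kernels (or upstream with the 2 Debian patches). It says nothing
about the other conditions (VT-d in BIOS, kernel version, mesa version).

```shell
#!/bin/sh
# Print the value of every intel_iommu= option found on a kernel command
# line, one per line.
# $1: command line text, e.g. "$(cat /proc/cmdline)"
intel_iommu_args() (
    set -f   # no glob expansion while $1 is split into words
    for tok in $1; do
        case "$tok" in
            intel_iommu=*) printf '%s\n' "${tok#intel_iommu=}" ;;
        esac
    done
)

# Classify a Debian-kernel command line according to this report:
#   intel_iommu=off            -> no-freeze
#   no parameter, on,
#   intgpu_off, strict, ...    -> freeze-possible
debian_cmdline_verdict() {
    for val in $(intel_iommu_args "$1"); do
        if [ "$val" = off ]; then
            echo no-freeze
            return 0
        fi
    done
    echo freeze-possible
}
```

On a live system this would be called as
`debian_cmdline_verdict "$(cat /proc/cmdline)"`.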

To prevent the bug:
- BIOS: VT-d disabled.
or
- Linux version 5.6 or lower.
or
- Boot parameter `intel_iommu=off` (this is the default for pristine
  upstream without the 2 Debian patches)
or
- The buster-backports-versions of binaries from these sources:
  - mesa 20.3.5-1~bpo10+1, and
  - libglvnd 1.3.2-1~bpo10+2
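
For the boot-parameter workaround, the parameter can be made permanent
through the GRUB defaults file. The following is a minimal sketch, not
a tested procedure from this report: `add_kernel_param` is a
hypothetical helper, it assumes GNU sed (`sed -i`) and a file with a
double-quoted `GRUB_CMDLINE_LINUX_DEFAULT="..."` line as in Debian's
default /etc/default/grub, and it does not handle parameters containing
`|` or `&`.

```shell
#!/bin/sh
# add_kernel_param PARAM FILE   (hypothetical helper, for illustration)
# Append PARAM to the GRUB_CMDLINE_LINUX_DEFAULT="..." line in FILE,
# unless that line already contains it as a separate word.
add_kernel_param() {
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*[\" ]$1[\" ]" "$2"; then
        return 0   # already present, nothing to do
    fi
    # Insert " PARAM" just before the closing double quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $1\"|" "$2"
}

# Intended use, as root, followed by regenerating the GRUB config:
#   add_kernel_param intel_iommu=off /etc/default/grub && update-grub
```

After a reboot, /proc/cmdline should then show `intel_iommu=off`.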

No influence:
- Boot parameter `intel_iommu=strict iommu.strict=1` does not prevent
the bug.


Steps to reproduce:

I usually use Xorg modesetting + Evince, because it triggers the bug
fastest and most reliably. Firefox with OpenGL can trigger the bug with
Xorg modesetting or Xorg intel, but takes longer.

- With Evince (with Xorg modesetting, not Xorg intel driver): Open a
  PDF with Evince (tested with 3.30.2-3+deb10u1) and quickly and
  repeatedly scroll up and down the document. System should hang in
  less than 2 minutes, usually even less than 30 seconds.
  - Choose a PDF that can be scrolled through, but is not so big that
    Evince will spend time "Loading..." during scrolling. For example,
    on Debian I used:
    - /usr/share/doc/quilt/quilt.pdf (12 pages)
    - /usr/share/doc/dbconfig-common/dbapp-policy.pdf.gz (7 pages)
- With Firefox (tested with 78.14.0esr-1~deb10u1), in `about:config`:
    gfx.webrender.all false
    layers.acceleration.force-enabled true
    layers.acceleration.disabled false
  Now `about:support` should show `Compositing: OpenGL`. Now scroll and
  switch between tabs, such as about:config, about:preferences,
  about:performance, about:support, about:performance, some random
  offline documentation, or a random bug on bugs.debian.org. System may
  hang in <1 min, but it may also take 20 minutes (or hours even).
  - Make sure the page is not so long that it goes blank during
    scrolling. The page should be long enough to be scrollable,
    but short enough that the contents stay visible even during fast
    scrolling.
  - Not sure if it can reproduce with Firefox with Compositing: Basic
    or Compositing: WebRender.
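
Instead of setting the three Firefox preferences by hand in
`about:config`, they could be placed in a `user.js` file in the Firefox
profile directory, which Firefox reads at startup. A minimal sketch,
with `write_compositing_prefs` being a hypothetical helper name and the
profile directory left for the caller to supply:

```shell
#!/bin/sh
# write_compositing_prefs PROFILE_DIR   (hypothetical helper)
# Append the three preferences from the steps above to PROFILE_DIR/user.js.
write_compositing_prefs() {
    {
        printf '%s\n' 'user_pref("gfx.webrender.all", false);'
        printf '%s\n' 'user_pref("layers.acceleration.force-enabled", true);'
        printf '%s\n' 'user_pref("layers.acceleration.disabled", false);'
    } >> "$1/user.js"
}
```

After restarting Firefox, `about:support` can be used to confirm that it
shows `Compositing: OpenGL` before starting the scroll test.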


Further notes:
- I searched the upstream bug tracker for drm/intel a bit and so far
  found only this report that may be related:
  - https://gitlab.freedesktop.org/drm/intel/-/issues/4082
    (System hangs during parallel media transcode operations after
    enabling VT-d)
  However, the discussion there focused more on the fact that they had
  temporarily ignored `intel_iommu=igfx_off` for their CI, causing the
  bug to surface. They did not really look into the underlying issue,
  except:
  > this kernel bug may have been there for longer time, or be HW/FW
  > issue needing WA.
  (I guess HW/FW=Hardware/Firmware, WA=Workaround.)
  and:
  > Complete system hang sounds like a possible hw bug.
  and:
  > I'm more inclined to think the issue being some race condition in
  > kernel, which could trigger BUG/Oops/panic (i.e. cause machine also
  > to drop network connection and not answer pings any more). IOMMU
  > changes performance a bit, so it could trigger races that do not
  > trigger with IOMMU off.
- Actually, I feel there are two bugs here:
  - One Debian bug, in the 2 Debian intel iommu patches: I would expect
    that the Debian-specific `INTEL_IOMMU_DEFAULT_ON_INTGPU_OFF` would
    result in the same behavior as `intel_iommu=off`, but here it
    actually seems to behave like `intel_iommu=on`. Or maybe I am
    misunderstanding something.
  - One upstream bug, since bf72c8c6, first merged in v5.7-rc1. Still
    in drm-tip e61e3604 of 2021-09-17 based on v5.15-rc1. I thought I
    had better first hear what you (the Debian kernel team) have to say
    about this, so I did not report upstream yet.
- I do not know what to make of the fact that mesa and libglvnd from
  buster-backports make the bug disappear. Perhaps this means that once
  I upgrade to Debian 11 bullseye, I will not experience the bug
  anymore anyway. But I thought mesa and libglvnd are like "user-space"
  from the kernel's point-of-view and should never be able to make the
  system freeze, whatever their version. How likely is it that later
  some other user-space program not depending on mesa, or perhaps even
  a later version of mesa, is able to trigger the bug again?
- Attached is a kernel log of a boot without any `intel_iommu` boot
  parameters on which I was able to reproduce the bug. No log data for
  the exact moment the bug is triggered, unfortunately. Note that the
  MCE error does not occur on every boot and I have also seen hangs on
  boots that did not have the MCE error.
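
One way to look further into the first suspicion above (that
`intgpu_off` behaves like `intel_iommu=on` for the GPU) might be to
check whether the GPU at 00:02.0 is attached to an IOMMU group under the
different boot parameters. This is a minimal sketch under assumptions:
`has_iommu_group` is a hypothetical helper, and whether a GPU that is
excluded from the IOMMU really lacks the `iommu_group` link on these
kernels is an assumption, not something confirmed in this report.

```shell
#!/bin/sh
# has_iommu_group DEVICE_DIR   (hypothetical helper, for illustration)
# Succeeds if the sysfs directory of a PCI device contains an
# iommu_group entry, i.e. the device has been placed in an IOMMU group.
has_iommu_group() {
    [ -e "$1/iommu_group" ]
}

# Intended use for the integrated GPU from the PCI listing (00:02.0):
#   if has_iommu_group /sys/bus/pci/devices/0000:00:02.0; then
#       echo "GPU is in an IOMMU group"
#   else
#       echo "GPU is not in an IOMMU group"
#   fi
```

Comparing the result between `intel_iommu=off`, `intel_iommu=intgpu_off`
and `intel_iommu=on` may help to see which of the two the Debian default
resembles.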

Hope I did not forget anything, otherwise I will send more info later.

Thank you for your attention, and for all the work you do on packaging
the kernel. Really impressed by the sheer amount of work you all must
be doing to get all those packages out.

Best regards,
Peter Nowee
kernel-log-5.10.0-Debian-bug-994721.txt

Peter Nowee

Sep 21, 2021, 2:00:02 AM
Just to clarify: The first post of this bug (message #5) shows boot
parameter `intel_iommu=off`. With that parameter, the bug does NOT
reproduce. I was only practicing with reportbug in a safe environment
when it unexpectedly sent out the report.

To reproduce the bug, use `intel_iommu=on`, `intel_iommu=intgpu_off` or
no boot parameter at all, as described in messages #10 and #15.