Bug#1019700: mmc0: Timeout waiting for hardware cmd interrupt.

Hank Barta

Sep 13, 2022, 12:00:03 PM
Package: src:linux
Version: 5.19.6-1
Severity: important
X-Debbugs-Cc: hba...@gmail.com

Dear Maintainer,

*** Reporter, please consider answering these questions, where appropriate ***

* What led up to the situation?

Apparent inability to initialize/connect to the SD card H/W. This leads to the message
below, which is repeated about every 10s. It can manifest in several ways.

1. Failure to boot - continuous retries to read SD card.
2. If a USB SSD is connected, it can skip the SD card and boot from the SATA SSD. (That is
the condition as I prepare this report.)
3. Completes boot, message repeats and there are no /dev/mmc* entries and WiFi H/W is
not recognized.
4. Completes boot, messages are repeated but /dev/mmc* entries are present and can
mount/read an SD card. And WiFi appears to be working.
5. Completes boot, no SD card timeout messages are reported and system operates normally.


* What exactly did you do (or not do) that was effective (or
ineffective)?
* What was the outcome of this action?
* What outcome did you expect instead?

I built kernel 5.19.8 and found the same problem behavior. I booted a different SSD with
Bullseye installed, running a 5.10.0 kernel, and do not see this issue. (Likely unrelated -
The 5.10 and 5.19.0 kernels had a lot of vc4 related errors that seem to be fixed in 5.19.8)

Additional information:

The 5.19.8 kernel was built with the options found at
https://github.com/HankB/Debian-Arm64-kernel-for-Pi-4B-on-X86_64

I have saved dmesg output from a normal boot and a boot that exhibited the timeout
(but was otherwise able to complete booting) on paste.debian.net

Normal - https://paste.debian.net/1253718/
Timeout - https://paste.debian.net/1253719/

Since the kernel log below doesn't include the information at the beginning of `dmesg`
I will capture again. Or I won't. It already overflowed the dmesg buffer. If needed
for this kernel I can duplicate the situation and capture before it overflows.


-- Package-specific info:
** Version:
Linux version 5.19.0-1-arm64 (debian...@lists.debian.org) (gcc-11 (Debian 11.3.0-5) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP Debian 5.19.6-1 (2022-09-01)

** Command line:
video=HDMI-A-1:1600x1200M@60 dma.dmachans=0x37f5 bcm2709.boardrev=0xc03111 bcm2709.serial=0x44557cae bcm2709.uart_clock=48000000 bcm2709.disk_led_gpio=42 bcm2709.disk_led_active_low=0 smsc95xx.macaddr=DC:A6:32:09:C6:71 vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000 console=tty0 console=ttyS1,115200 root=LABEL=RASPIROOT rw fsck.repair=yes net.ifnames=0 rootwait

** Tainted: WC (1536)
* kernel issued warning
* staging driver was loaded
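
For readers unfamiliar with the taint value, here is a minimal sketch of how the number 1536 maps to the letters "WC". The bit-to-letter table follows the kernel's tainted-kernels documentation for bits 0 to 10; the function name `decode_taint` is made up for illustration, and higher bits are reported as unknown rather than guessed.

```python
# Taint flags by bit position (bits 0..10 only), per the kernel's
# tainted-kernels documentation. decode_taint is a hypothetical helper.
TAINT_FLAGS = [
    ("P", "proprietary module was loaded"),
    ("F", "module was force loaded"),
    ("S", "kernel running on an out of specification system"),
    ("R", "module was force unloaded"),
    ("M", "processor reported a Machine Check Exception"),
    ("B", "bad page referenced or unexpected page flags"),
    ("U", "taint requested by userspace application"),
    ("D", "kernel died recently (OOPS or BUG)"),
    ("A", "ACPI table overridden by user"),
    ("W", "kernel issued warning"),
    ("C", "staging driver was loaded"),
]


def decode_taint(value):
    """Return a list of (bit, letter, meaning) for every set bit in value."""
    found = []
    bit = 0
    while value >> bit:
        if (value >> bit) & 1:
            if bit < len(TAINT_FLAGS):
                letter, meaning = TAINT_FLAGS[bit]
            else:
                letter, meaning = "?", "not listed in this sketch"
            found.append((bit, letter, meaning))
        bit += 1
    return found


if __name__ == "__main__":
    # 1536 = 512 + 1024, i.e. bits 9 and 10
    for bit, letter, meaning in decode_taint(1536):
        print(f"bit {bit:2d}  {letter}  {meaning}")
```

The two set bits match the two bullet lines in the report above.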

** Kernel log:
[ 723.735217] mmc0: sdhci: Timeout: 0x00000000 | Int stat: 0x00018000
[ 723.741743] mmc0: sdhci: Int enab: 0x00ff1003 | Sig enab: 0x00ff1003
[ 723.748270] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
[ 723.754797] mmc0: sdhci: Caps: 0x45ee6432 | Caps_1: 0x0000a525
[ 723.761324] mmc0: sdhci: Cmd: 0x00000502 | Max curr: 0x00080008
[ 723.767851] mmc0: sdhci: Resp[0]: 0x000001aa | Resp[1]: 0x00000000
[ 723.774379] mmc0: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
[ 723.780905] mmc0: sdhci: Host ctl2: 0x00000000
[ 723.785404] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
[ 723.791930] mmc0: sdhci: ============================================
[ 733.923993] mmc0: Timeout waiting for hardware cmd interrupt.
[ 733.929837] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 733.936364] mmc0: sdhci: Sys addr: 0x00000000 | Version: 0x00001002
[ 733.942892] mmc0: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
[ 733.949420] mmc0: sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
[ 733.955946] mmc0: sdhci: Present: 0x1fff0000 | Host ctl: 0x00000001
[ 733.962473] mmc0: sdhci: Power: 0x0000000f | Blk gap: 0x00000080
[ 733.969001] mmc0: sdhci: Wake-up: 0x00000000 | Clock: 0x0000fa07
[ 733.975528] mmc0: sdhci: Timeout: 0x00000000 | Int stat: 0x00018000
[ 733.982055] mmc0: sdhci: Int enab: 0x00ff1003 | Sig enab: 0x00ff1003
[ 733.988582] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
[ 733.995109] mmc0: sdhci: Caps: 0x45ee6432 | Caps_1: 0x0000a525
[ 734.001636] mmc0: sdhci: Cmd: 0x00000502 | Max curr: 0x00080008
[ 734.008163] mmc0: sdhci: Resp[0]: 0x000001aa | Resp[1]: 0x00000000
[ 734.014689] mmc0: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
[ 734.021216] mmc0: sdhci: Host ctl2: 0x00000000
[ 734.025716] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
[ 734.032242] mmc0: sdhci: ============================================
[ 744.164283] mmc0: Timeout waiting for hardware cmd interrupt.
[ 744.170128] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 744.176655] mmc0: sdhci: Sys addr: 0x00000000 | Version: 0x00001002
[ 744.183183] mmc0: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
[ 744.189711] mmc0: sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
[ 744.196239] mmc0: sdhci: Present: 0x1fff0000 | Host ctl: 0x00000001
[ 744.202767] mmc0: sdhci: Power: 0x0000000f | Blk gap: 0x00000080
[ 744.209294] mmc0: sdhci: Wake-up: 0x00000000 | Clock: 0x0000fa07
[ 744.215821] mmc0: sdhci: Timeout: 0x00000000 | Int stat: 0x00018000
[ 744.222349] mmc0: sdhci: Int enab: 0x00ff1003 | Sig enab: 0x00ff1003
[ 744.228877] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
[ 744.235411] mmc0: sdhci: Caps: 0x45ee6432 | Caps_1: 0x0000a525
[ 744.241946] mmc0: sdhci: Cmd: 0x00000502 | Max curr: 0x00080008
[ 744.248474] mmc0: sdhci: Resp[0]: 0x000001aa | Resp[1]: 0x00000000
[ 744.255002] mmc0: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
[ 744.261537] mmc0: sdhci: Host ctl2: 0x00000000
[ 744.266037] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
[ 744.272564] mmc0: sdhci: ============================================
[ 754.404552] mmc0: Timeout waiting for hardware cmd interrupt.
[ 754.410395] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 754.416924] mmc0: sdhci: Sys addr: 0x00000000 | Version: 0x00001002
[ 754.423452] mmc0: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
[ 754.429979] mmc0: sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
[ 754.436506] mmc0: sdhci: Present: 0x1fff0000 | Host ctl: 0x00000001
[ 754.443033] mmc0: sdhci: Power: 0x0000000f | Blk gap: 0x00000080
[ 754.449560] mmc0: sdhci: Wake-up: 0x00000000 | Clock: 0x0000fa07
[ 754.456087] mmc0: sdhci: Timeout: 0x00000000 | Int stat: 0x00018000
[ 754.462614] mmc0: sdhci: Int enab: 0x00ff1003 | Sig enab: 0x00ff1003
[ 754.469141] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
[ 754.475668] mmc0: sdhci: Caps: 0x45ee6432 | Caps_1: 0x0000a525
[ 754.482194] mmc0: sdhci: Cmd: 0x00000502 | Max curr: 0x00080008
[ 754.488722] mmc0: sdhci: Resp[0]: 0x000001aa | Resp[1]: 0x00000000
[ 754.495249] mmc0: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
[ 754.501776] mmc0: sdhci: Host ctl2: 0x00000000
[ 754.506276] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
[ 754.512802] mmc0: sdhci: ============================================
[ 764.644802] mmc0: Timeout waiting for hardware cmd interrupt.
[ 764.650643] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 764.657171] mmc0: sdhci: Sys addr: 0x00000000 | Version: 0x00001002
[ 764.663699] mmc0: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
[ 764.670227] mmc0: sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
[ 764.676754] mmc0: sdhci: Present: 0x1fff0000 | Host ctl: 0x00000001
[ 764.683280] mmc0: sdhci: Power: 0x0000000f | Blk gap: 0x00000080
[ 764.689807] mmc0: sdhci: Wake-up: 0x00000000 | Clock: 0x0000fa07
[ 764.696334] mmc0: sdhci: Timeout: 0x00000000 | Int stat: 0x00018001
[ 764.702860] mmc0: sdhci: Int enab: 0x00ff1003 | Sig enab: 0x00ff1003
[ 764.709387] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
[ 764.715913] mmc0: sdhci: Caps: 0x45ee6432 | Caps_1: 0x0000a525
[ 764.722441] mmc0: sdhci: Cmd: 0x0000371a | Max curr: 0x00080008
[ 764.728968] mmc0: sdhci: Resp[0]: 0x00400120 | Resp[1]: 0x00000000
[ 764.735494] mmc0: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
[ 764.742021] mmc0: sdhci: Host ctl2: 0x00000000
[ 764.746521] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
[ 764.753048] mmc0: sdhci: ============================================
[ 774.885039] mmc0: Timeout waiting for hardware cmd interrupt.
[ 774.890883] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 774.897412] mmc0: sdhci: Sys addr: 0x00000000 | Version: 0x00001002
[ 774.903941] mmc0: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
[ 774.910468] mmc0: sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
[ 774.916995] mmc0: sdhci: Present: 0x1fff0000 | Host ctl: 0x00000001
[ 774.923522] mmc0: sdhci: Power: 0x0000000f | Blk gap: 0x00000080
[ 774.930049] mmc0: sdhci: Wake-up: 0x00000000 | Clock: 0x0000fa07
[ 774.936576] mmc0: sdhci: Timeout: 0x00000000 | Int stat: 0x00018001
[ 774.943103] mmc0: sdhci: Int enab: 0x00ff1003 | Sig enab: 0x00ff1003
[ 774.949630] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
[ 774.956157] mmc0: sdhci: Caps: 0x45ee6432 | Caps_1: 0x0000a525
[ 774.962684] mmc0: sdhci: Cmd: 0x0000371a | Max curr: 0x00080008
[ 774.969211] mmc0: sdhci: Resp[0]: 0x00000120 | Resp[1]: 0x00000000
[ 774.975738] mmc0: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
[ 774.982264] mmc0: sdhci: Host ctl2: 0x00000000
[ 774.986763] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
[ 774.993291] mmc0: sdhci: ============================================
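
The `Cmd` and `Int stat` values in the dumps above can be unpacked by hand. The following is a rough sketch, assuming the standard SDHCI register layout (command index in bits 13:8, response type in bits 1:0, and the interrupt bits as named by the `SDHCI_INT_*` constants in the kernel's `drivers/mmc/host/sdhci.h`). The helper names `decode_cmd` and `decode_int_status` are invented for this example.

```python
# Sketch only: decodes two fields of an SDHCI register dump.
# Bit positions assume the standard SDHCI layout; helper names are made up.
INT_BITS = {
    0x00000001: "command complete",
    0x00008000: "error summary",
    0x00010000: "command timeout",
    0x00020000: "command CRC error",
    0x00040000: "command end-bit error",
    0x00080000: "command index error",
}

RESPONSE_TYPES = {0: "none", 1: "136-bit", 2: "48-bit", 3: "48-bit with busy"}


def decode_cmd(reg):
    """Split the 16-bit command register into its fields."""
    return {
        "index": (reg >> 8) & 0x3F,          # the N in CMDN
        "response": RESPONSE_TYPES[reg & 0x03],
        "crc_check": bool(reg & 0x08),
        "index_check": bool(reg & 0x10),
        "data_present": bool(reg & 0x20),
    }


def decode_int_status(reg):
    """Name the interrupt status bits this sketch knows about."""
    return [name for mask, name in INT_BITS.items() if reg & mask]


if __name__ == "__main__":
    for cmd in (0x00000502, 0x0000371A):
        print(hex(cmd), decode_cmd(cmd))
    for stat in (0x00018000, 0x00018001):
        print(hex(stat), decode_int_status(stat))
```

Under that layout, `Cmd: 0x00000502` is CMD5 and `Cmd: 0x0000371a` is CMD55, both of which belong to the card identification sequence, and `Int stat: 0x00018000` has the error summary and command timeout bits set. That would be consistent with the host repeatedly probing for a card and getting no usable response, though the dump alone does not say why.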

** Model information
Device Tree model: Raspberry Pi 4 Model B Rev 1.1

** Loaded modules:
btsdio
brcmfmac
brcmutil
rfcomm
algif_hash
qrtr
algif_skcipher
af_alg
bnep
hci_uart
btqca
btrtl
btbcm
btintel
bluetooth
jitterentropy_rng
sha512_generic
nls_ascii
bcm2835_v4l2(C)
nls_cp437
sha512_arm64
vfat
bcm2835_mmal_vchiq(C)
fat
videobuf2_vmalloc
videobuf2_memops
videobuf2_v4l2
videobuf2_common
videodev
aes_neon_bs
mc
snd_bcm2835(C)
aes_neon_blk
joydev
drbg
evdev
cpufreq_dt
ansi_cprng
vchiq(C)
iproc_rng200
snd_soc_hdmi_codec
ecdh_generic
rng_core
ecc
bcm2711_thermal
bcm2835_wdt
pwm_bcm2835
raspberrypi_cpufreq
leds_gpio
cfg80211
sg
rfkill
nf_tables
libcrc32c
nfnetlink
fuse
configfs
ip_tables
x_tables
autofs4
ext4
crc16
mbcache
jbd2
crc32c_generic
hid_logitech_hidpp
sd_mod
hid_logitech_dj
t10_pi
crc64_rocksoft
crc64
hid_generic
crc_t10dif
crct10dif_generic
uas
usbhid
usb_storage
scsi_mod
hid
scsi_common
vc4
snd_soc_core
snd_pcm_dmaengine
snd_pcm
snd_timer
snd
soundcore
cec
broadcom
bcm_phy_lib
rc_core
drm_display_helper
dwc2
xhci_pci
drm_cma_helper
xhci_hcd
genet
udc_core
roles
mdio_bcm_unimac
drm_kms_helper
of_mdio
fixed_phy
fwnode_mdio
usbcore
reset_raspberrypi
libphy
sdhci_iproc
crct10dif_ce
crct10dif_common
drm
sdhci_pltfm
i2c_bcm2835
usb_common
sdhci
gpio_regulator
phy_generic
fixed

** PCI devices:
not available

** USB devices:
Bus 002 Device 003: ID 174c:55aa ASMedia Technology Inc. ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge, ASM1153E SATA 6Gb/s bridge
Bus 002 Device 002: ID 2109:0813 VIA Labs, Inc. VL813 Hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 005: ID 046d:c52b Logitech, Inc. Unifying Receiver
Bus 001 Device 004: ID 04d9:0132 Holtek Semiconductor, Inc. USB Keyboard
Bus 001 Device 003: ID 2109:2813 VIA Labs, Inc. VL813 Hub
Bus 001 Device 002: ID 2109:3431 VIA Labs, Inc. Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub


-- System Information:
Debian Release: bookworm/sid
APT prefers testing
APT policy: (500, 'testing'), (102, 'unstable')
Architecture: arm64 (aarch64)

Kernel: Linux 5.19.0-1-arm64 (SMP w/4 CPU threads)
Kernel taint flags: TAINT_WARN, TAINT_CRAP
Locale: LANG=en_US.UTF-8, LC_CTYPE=C.UTF-8 (charmap=locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages linux-image-5.19.0-1-arm64 depends on:
ii initramfs-tools [linux-initramfs-tool] 0.142
ii kmod 30+20220630-3
ii linux-base 4.9

Versions of packages linux-image-5.19.0-1-arm64 recommends:
ii apparmor 3.0.7-1
ii firmware-linux-free 20200122-1

Versions of packages linux-image-5.19.0-1-arm64 suggests:
pn debian-kernel-handbook <none>
pn linux-doc-5.19 <none>

Versions of packages linux-image-5.19.0-1-arm64 is related to:
pn firmware-amd-graphics <none>
pn firmware-atheros <none>
pn firmware-bnx2 <none>
pn firmware-bnx2x <none>
ii firmware-brcm80211 20210818-1
pn firmware-cavium <none>
pn firmware-intel-sound <none>
pn firmware-intelwimax <none>
pn firmware-ipw2x00 <none>
pn firmware-ivtv <none>
pn firmware-iwlwifi <none>
pn firmware-libertas <none>
pn firmware-linux-nonfree <none>
pn firmware-misc-nonfree <none>
pn firmware-myricom <none>
pn firmware-netxen <none>
pn firmware-qlogic <none>
pn firmware-realtek <none>
pn firmware-samsung <none>
pn firmware-siano <none>
pn firmware-ti-connectivity <none>
pn xen-hypervisor <none>

-- debconf information:
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LC_CTYPE = "C.UTF-8",
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory

Hank Barta

Sep 13, 2022, 12:20:04 PM
Behavior not seen before - panic during boot.



--
Beautiful Sunny Winfield

Bjørn Mork

Sep 13, 2022, 12:50:03 PM
Hank Barta <hba...@gmail.com> writes:

> ** Kernel log:
> [ 723.735217] mmc0: sdhci: Timeout: 0x00000000 | Int stat: 0x00018000
> [ 723.741743] mmc0: sdhci: Int enab: 0x00ff1003 | Sig enab: 0x00ff1003
> [ 723.748270] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
> [ 723.754797] mmc0: sdhci: Caps: 0x45ee6432 | Caps_1: 0x0000a525
> [ 723.761324] mmc0: sdhci: Cmd: 0x00000502 | Max curr: 0x00080008
> [ 723.767851] mmc0: sdhci: Resp[0]: 0x000001aa | Resp[1]: 0x00000000
> [ 723.774379] mmc0: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
> [ 723.780905] mmc0: sdhci: Host ctl2: 0x00000000
> [ 723.785404] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
> [ 723.791930] mmc0: sdhci: ============================================
> [ 733.923993] mmc0: Timeout waiting for hardware cmd interrupt.

These repeated messages are normal on the RPi4 if you boot it without an
SD card. E.g. from USB or network. If that's what you intend to do,
then you can avoid the repeated messages by adding

dtparam=sd_poll_once=on

to the config.txt file in your firmware partition. Often mounted as
/boot/firmware/.

The effect depends on which device-tree you are using. I believe it
will only work with the ones coming with the Raspberry Pi firmware. See

https://github.com/raspberrypi/firmware/blob/master/boot/overlays/README

for docs.
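
A small sketch of applying that setting without adding it twice. It only works on the file's text; the function name `ensure_dtparam` is made up for illustration, and it simply appends at the end, which assumes the file does not finish inside a conditional section (such as `[pi4]`) that would limit where the line applies.

```python
def ensure_dtparam(config_text, param="sd_poll_once=on"):
    """Return config_text with 'dtparam=<param>' present exactly once.

    Hypothetical helper: commented-out copies of the line are ignored,
    and the setting is appended at the end if it is missing.
    """
    wanted = "dtparam=" + param
    if any(line.strip() == wanted for line in config_text.splitlines()):
        return config_text  # already set, leave the file untouched
    if config_text and not config_text.endswith("\n"):
        config_text += "\n"
    return config_text + wanted + "\n"


# Possible usage, assuming the firmware partition is mounted at
# /boot/firmware as mentioned above (run as root, keep a backup):
#
#   path = "/boot/firmware/config.txt"
#   with open(path) as f:
#       text = f.read()
#   with open(path, "w") as f:
#       f.write(ensure_dtparam(text))
```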


Bjørn

Hank Barta

Sep 13, 2022, 4:40:05 PM


---------- Forwarded message ---------
From: Hank Barta <hba...@gmail.com>
Date: Tue, Sep 13, 2022 at 12:54 PM
Subject: Re: Bug#1019700: mmc0: Timeout waiting for hardware cmd interrupt.
To: Bjørn Mork <bj...@mork.no>


Hi Bjørn,

Many thanks for the prompt reply. In the meantime I have done the following:

* Reimaged my SD card with `20220808_raspi_4_bookworm.img.xz` from Debian Tested images. (5.18.14-1 kernel)
* Booted and noted no SD card timeouts. Rebooted and power cycled 3 times each with the same result.
* Performed `apt update && apt upgrade -y` and rebooted. (5.19.6-1 kernel)
* First boot - repeated SD timeouts and unable to log in. Power cycled to force reboot
* Second reboot - no SD card timeouts. Added `dtparam=sd_poll_once=on` to `/boot/firmware/config.txt`
* Third boot - repeated SD card timeouts.

Eventually I was able to log in to the console. Network is not fully up. The repeated SD timeouts seem to be slowing normal boot. Actually I may not have been logged in but in the console that presents when there is a problem booting. I exited and now I see a login prompt. And Ethernet finally came up. 737 seconds post boot according to console messages. (It was some time later before I could ssh in.)

The SD timeout messages stopped. I have a login prompt at the console but it takes about 30s to login. The system is now responsive, but WiFi modules did not load. I count 52 timeout messages in dmesg output. There is no response to <ctrl><alt><del> at the console. Tried to shutdown using `shutdown -r now` and the system hangs.

The system is most certainly not operating normally.
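
For anyone who wants to repeat the count on their own logs, here is a minimal sketch that counts the timeout messages in saved `dmesg` text and measures the spacing between them. The function names are invented for this example, and the sample lines are copied from the kernel log in the original report.

```python
import re

# Matches e.g. "[ 733.923993] mmc0: Timeout waiting for hardware cmd interrupt."
TIMEOUT_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+mmc\d+: Timeout waiting for hardware cmd interrupt"
)


def timeout_timestamps(dmesg_text):
    """Return the timestamp (seconds since boot) of every timeout message."""
    stamps = []
    for line in dmesg_text.splitlines():
        match = TIMEOUT_RE.match(line.strip())
        if match:
            stamps.append(float(match.group(1)))
    return stamps


def intervals(stamps):
    """Seconds between consecutive timeouts, rounded to milliseconds."""
    return [round(b - a, 3) for a, b in zip(stamps, stamps[1:])]


SAMPLE = """\
[ 733.923993] mmc0: Timeout waiting for hardware cmd interrupt.
[ 733.929837] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 744.164283] mmc0: Timeout waiting for hardware cmd interrupt.
[ 754.404552] mmc0: Timeout waiting for hardware cmd interrupt.
"""

if __name__ == "__main__":
    stamps = timeout_timestamps(SAMPLE)
    print(len(stamps), "timeouts, spacing:", intervals(stamps))
```

On the sample lines the spacing comes out at 10.24 s, which matches the "about every 10s" observation in the original report.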

Does Debian use the device tree? This is a Debian system, not R-Pi OS.

If I reboot enough times I will get a clean boot followed by normal operation. I have tried different SD cards, USB SSDs and Pi 4Bs all with the same result so I do not believe this is a H/W problem. I do recall the previous SD timeout issue and I worked around that by inserting an SD card post boot but that no longer works. This seems to be a new problem.

best,
hank

On Tue, Sep 13, 2022 at 11:32 AM Bjørn Mork <bj...@mork.no> wrote:
Hank Barta <hba...@gmail.com> writes:

> ** Kernel log:
> [  723.735217] mmc0: sdhci: Timeout:   0x00000000 | Int stat: 0x00018000
> [  723.741743] mmc0: sdhci: Int enab:  0x00ff1003 | Sig enab: 0x00ff1003
> [  723.748270] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
> [  723.754797] mmc0: sdhci: Caps:      0x45ee6432 | Caps_1:   0x0000a525
> [  723.761324] mmc0: sdhci: Cmd:       0x00000502 | Max curr: 0x00080008
> [  723.767851] mmc0: sdhci: Resp[0]:   0x000001aa | Resp[1]:  0x00000000
> [  723.774379] mmc0: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
> [  723.780905] mmc0: sdhci: Host ctl2: 0x00000000
> [  723.785404] mmc0: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x00000000
> [  723.791930] mmc0: sdhci: ============================================
> [  733.923993] mmc0: Timeout waiting for hardware cmd interrupt.

These repeated messages are normal on the RPi4 if you boot it without an
SD card.  E.g. from USB or network.  If that's what you intend to do,
then you can avoid the repeated messages by adding

 dtparam=sd_poll_once=on

to the config.txt file in your firmware partition.  Often mounted as
/boot/firmware/.

The effect depends on which device-tree you are using.  I believe it
will only work with the ones coming with the Raspberry Pi firmware.  See

https://github.com/raspberrypi/firmware/blob/master/boot/overlays/README

for docs.


Bjørn


--
Beautiful Sunny Winfield



Hank Barta

Sep 22, 2022, 11:30:03 AM
I've gone through "git bisect" on the repo and results are at https://paste.debian.net/1254605/

Each candidate was tested with 6 reboots (including 3 that involved power cycling.)

There were four stages that did not build and at which I executed 'git bisect skip' to get a buildable candidate. They are listed in the paste linked above.

I have notes on the build errors if that would be useful.
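
Because the failure is intermittent, a bad commit can pass a fixed number of boots by chance and be marked good. The sketch below estimates that risk. It assumes each boot of a bad kernel fails independently with the same probability; that probability is not reported anywhere in this thread, so the values used here are purely hypothetical, as are the function names.

```python
import math


def false_good_probability(p_fail, boots):
    """Chance that a bad kernel survives `boots` boots without failing.

    Assumes independent boots with a constant per-boot failure
    probability p_fail (a hypothetical figure, not measured here).
    """
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    return (1.0 - p_fail) ** boots


def boots_needed(p_fail, max_risk):
    """Smallest number of boots that keeps the false-good risk <= max_risk."""
    if not (0.0 < p_fail < 1.0 and 0.0 < max_risk < 1.0):
        raise ValueError("both arguments must be strictly between 0 and 1")
    return math.ceil(math.log(max_risk) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    for p in (0.5, 0.25):  # hypothetical per-boot failure rates
        risk = false_good_probability(p, 6)
        print(f"p_fail={p}: risk after 6 boots = {risk:.4f}, "
              f"boots for <=1% risk = {boots_needed(p, 0.01)}")
```

If the per-boot failure rate were 50%, six boots would wrongly pass a bad commit about 1.6% of the time; at lower failure rates the risk grows quickly, which is worth keeping in mind when reading bisect results for a race like this one.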

--
Beautiful Sunny Winfield

Diederik de Haas

Dec 6, 2022, 7:20:04 PM
Control: tag -1 moreinfo

Hi Hank,

On Tuesday, 13 September 2022 17:48:22 CET Hank Barta wrote:
> Package: src:linux
> Version: 5.19.6-1
>
> * What led up to the situation?
>
> Apparent inability to initialize/connect to the SD card H/W. This leads to
> the message below that is repeated about every 10s. It can manifest in
> several ways.
>
> 1. Failure to boot - continuous retries to read SD card.
> 2. If a USB SSD is connected, it can skip the SD card and boot from the SATA
> SSD. (That is the condition as I prepare this report.)
> 3. Completes boot, message repeats and there are no /dev/mmc* entries and
> WiFi H/W is not recognized.
> 4. Completes boot, messages are repeated but /dev/mmc* entries are present
> and can mount/read an SD card. And WiFi appears to be working
> 5. Completes boot, no SD card timeout messages are reported and system
> operates normally.
>
The title of this bug and the above quoted part of the kernel log seem to be
the same as the problem reported in https://bugs.debian.org/985630.

Do you agree?
Does that make this bug the same as the other one (and should it therefore be
merged)? The main reason I'm hesitant to merge them is that both bugs also
describe other issues.
While the repeated messages aren't 'nice', they themselves are harmless AFAICT.
But what you further described is more than just harmless.

Can you clarify? And while you're at it, also tell us whether the issue is the
same, resolved, or worse with e.g. a 6.0 kernel? It would be great if you
could also try it with the 6.1-rcX kernel from Experimental.

Cheers,
Diederik

Hank Barta

Dec 7, 2022, 6:10:03 PM
Hi Diederik,

This may not be the same bug. The one you referenced was provoked when the SD card slot was empty and could be suppressed by putting a card in the slot. Also, that bug was fixed and the workaround was no longer necessary. The one I experienced could happen with the SD card in place, and when the timeout was reported the SD card was sometimes recognized and sometimes not. Let me clarify the situations I encountered while testing. Again, I performed testing while booting from a USB connected SSD.

* Normal boot, no timeout reported and SD card recognized.
* Timeout reported following boot and SD card recognized and working.
* Timeout reported and SD card not recognized.

Repeating the boot process could result in any of the three conditions and it did not seem to matter if a warm or cold boot was involved.

I have updated my test install to the 6.0 kernel, identified as

hbarta@boson:~$ uname -a
Linux boson 6.0.0-5-arm64 #1 SMP Debian 6.0.10-1 (2022-11-26) aarch64 GNU/Linux
hbarta@boson:~$

I rebooted and power cycled several times and was ready to declare this fixed, but the most recent reboot is demonstrating the issue - e.g. MMC timeouts and no SD card in /dev. I inserted an SD card in the slot and the timeout messages are continuing after the card is initialized.

[  274.788978] mmc1: new ultra high speed DDR50 SDHC card at address aaaa
[  274.805432] mmcblk1: mmc1:aaaa SL16G 14.8 GiB  
[  274.837154]  mmcblk1: p1 p2
[  281.828861] mmc0: Timeout waiting for hardware cmd interrupt.
[  281.836160] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
[  281.844122] mmc0: sdhci: Sys addr:  0x00000000 | Version:  0x00009902
[  281.852082] mmc0: sdhci: Blk size:  0x00000000 | Blk cnt:  0x00000000
[  281.860026] mmc0: sdhci: Argument:  0x00000c00 | Trn mode: 0x00000000
[  281.867973] mmc0: sdhci: Present:   0x01ff0001 | Host ctl: 0x00000001
[  281.875905] mmc0: sdhci: Power:     0x0000000f | Blk gap:  0x00000000
[  281.883839] mmc0: sdhci: Wake-up:   0x00000000 | Clock:    0x00007187
[  281.891779] mmc0: sdhci: Timeout:   0x00000000 | Int stat: 0x00018000
[  281.899716] mmc0: sdhci: Int enab:  0x00ff0003 | Sig enab: 0x00ff0003
[  281.907658] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
[  281.915602] mmc0: sdhci: Caps:      0x00000000 | Caps_1:   0x00000000
[  281.923543] mmc0: sdhci: Cmd:       0x0000341a | Max curr: 0x00000001
[  281.931481] mmc0: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0x00000000
[  281.939414] mmc0: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
[  281.947338] mmc0: sdhci: Host ctl2: 0x00000000
[  281.953232] mmc0: sdhci: ============================================

I'm not sure if this matters, but when the timeouts are reported, orderly shutdown takes several minutes longer than normal but eventually completes.

best,

--
Beautiful Sunny Winfield

Hank Barta

Dec 7, 2022, 9:40:03 PM
I tested with the 6.1 kernel from testing.

hbarta@boson:~$ uname -a
Linux boson 6.1.0-0-arm64 #1 SMP Debian 6.1~rc7-1~exp1 (2022-12-01) aarch64 GNU/Linux
hbarta@boson:~$

The problem seems to be worse. It manifests with every cold boot (4 tries.) I was only able to test one warm boot immediately following installation of 6.1 (and shutting down 6.0) and the issue manifested then also. I was unable to further test warm boot because the system never completed shutdown even after waiting 10 minutes. Inserting an SD card got the following message (from dmesg)

[   46.858495] mmc1: new high speed SDIO card at address 0001

That message was followed by a single block from the timeout message referencing mmc1 and which seemed not to be repeated. The /dev/ entry for the SD card was never created.

Update: While composing this message it did complete a shutdown and reboot. Unfortunately the SD card was (sort of) bootable and the system tried to boot from it and hung, forcing a power cycle and cold boot. On this subsequent cold boot the timeout did not manifest. A subsequent warm boot also did not manifest. The third warm boot did manifest the timeout.

best,

--
Beautiful Sunny Winfield

Hank Barta

Dec 7, 2022, 9:50:03 PM
On Wed, Dec 7, 2022 at 8:34 PM Hank Barta <hba...@gmail.com> wrote:
> I tested with the 6.1 kernel from testing.

The kernel was from experimental, not testing. Apologies for the error.

--
Beautiful Sunny Winfield

Chen-Yu Tsai

Dec 24, 2022, 1:00:04 AM
When it doesn't work, I see something like the following in
/proc/interrupts:

34: 0 0 0 0 GICv2 158 Level mmc0
35: 100001 0 0 0 GICv2 158 Level mmc1

which doesn't make sense, since this is a shared interrupt and should map
to the same interrupt number.

On a working system the following is seen:

34: 18707 0 0 0 GICv2 158 Level mmc1, mmc0

It looks like this might be caused by some sort of race among mapping
shared IRQs, disabling of "nobody cared" interrupt lines, and the mmc
driver/core probing the attached card.

Combining two changes, disabling async probing of the sdhci-iproc driver
and marking the SD card as non-removable using an overlay, seems to decrease
the chances of hitting this, but does not guarantee avoiding it.
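
A rough sketch of how one might check for the symptom described above automatically: it parses `/proc/interrupts`-style text and reports any hardware interrupt that shows up under more than one Linux IRQ number. The function names are made up for this example, and the parser assumes the column layout shown in the excerpts (IRQ number, per-CPU counts, chip, hardware IRQ, trigger, handler names); other line formats are skipped.

```python
from collections import defaultdict


def parse_interrupt_line(line):
    """Parse one /proc/interrupts line, or return None if it doesn't fit."""
    tokens = line.split()
    if len(tokens) < 2 or not tokens[0].rstrip(":").isdigit():
        return None
    irq = int(tokens[0].rstrip(":"))
    i = 1
    counts = []
    while i < len(tokens) and tokens[i].isdigit():  # per-CPU counters
        counts.append(int(tokens[i]))
        i += 1
    rest = tokens[i:]
    if len(rest) < 3:
        return None
    chip, hwirq, trigger = rest[0], rest[1], rest[2]
    actions = [a.strip() for a in " ".join(rest[3:]).split(",") if a.strip()]
    return {"irq": irq, "counts": counts, "chip": chip,
            "hwirq": hwirq, "trigger": trigger, "actions": actions}


def split_shared_irqs(text):
    """Map (chip, hwirq) -> entries, for hw IRQs listed on several lines."""
    by_hw = defaultdict(list)
    for line in text.splitlines():
        entry = parse_interrupt_line(line)
        if entry:
            by_hw[(entry["chip"], entry["hwirq"])].append(entry)
    return {key: entries for key, entries in by_hw.items() if len(entries) > 1}


BROKEN = """\
34: 0 0 0 0 GICv2 158 Level mmc0
35: 100001 0 0 0 GICv2 158 Level mmc1
"""
WORKING = "34: 18707 0 0 0 GICv2 158 Level mmc1, mmc0\n"

if __name__ == "__main__":
    print("broken :", sorted(split_shared_irqs(BROKEN)))
    print("working:", sorted(split_shared_irqs(WORKING)))
```

On a live system the input would come from reading `/proc/interrupts`; an empty result means every hardware interrupt is listed only once, as in the working case.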

Cyril Brulebois

Feb 19, 2023, 9:20:08 AM
Hi,

Hank Barta <hba...@gmail.com> (2022-09-22):
FWIW bisecting this issue with a CM4 (and eMMC, no external storage) has
been on my todo list for a long while… but results seem consistent with
my (now) vague recollection: regression in the late 5.1X versions.

I was initially chasing down the PCIe regression and was also
encountering a black screen regression, so everything is a bit fuzzy,
sorry… A quick search returns these though:
https://lore.kernel.org/linux-arm-kernel/20220529011526....@mraw.org/
https://lore.kernel.org/linux-arm-kernel/20220602191757....@mraw.org/


Cheers,
--
Cyril Brulebois (ki...@debian.org) <https://debamax.com/>
D-I release manager -- Release team member -- Freelance Consultant

Cyril Brulebois

Feb 19, 2023, 9:42:52 AM
Cyril Brulebois <ki...@debian.org> (2023-02-19):
> FWIW bisecting this issue with a CM4 (and eMMC, no external storage) has
> been on my todo list for a long while… but results seem consistent with
> my (now) vague recollection: regression in the late 5.1X versions.

Sorry, that was with external storage, swapping the SD card again and
again, always starting with a fresh one to get consistent results. So
the sd_poll_once trick mentioned by Bjørn would be totally out of scope
here.

Cyril Brulebois

Mar 8, 2023, 10:02:01 AM
Control: tag -1 fixed-upstream
Control: found -1 6.1.15-1

Hi Bjørn and Hank,

Bjørn Mork <bj...@mork.no> (2022-09-13):
> Hank Barta <hba...@gmail.com> writes:
>
> > ** Kernel log:
> > [ 723.735217] mmc0: sdhci: Timeout: 0x00000000 | Int stat: 0x00018000
> > [ 723.741743] mmc0: sdhci: Int enab: 0x00ff1003 | Sig enab: 0x00ff1003
> > [ 723.748270] mmc0: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
> > [ 723.754797] mmc0: sdhci: Caps: 0x45ee6432 | Caps_1: 0x0000a525
> > [ 723.761324] mmc0: sdhci: Cmd: 0x00000502 | Max curr: 0x00080008
> > [ 723.767851] mmc0: sdhci: Resp[0]: 0x000001aa | Resp[1]: 0x00000000
> > [ 723.774379] mmc0: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000
> > [ 723.780905] mmc0: sdhci: Host ctl2: 0x00000000
> > [ 723.785404] mmc0: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
> > [ 723.791930] mmc0: sdhci: ============================================
> > [ 733.923993] mmc0: Timeout waiting for hardware cmd interrupt.
>
> These repeated messages are normal on the RPi4 if you boot it without an
> SD card.

This is definitely a blocking issue for booting on both Pi 4 (SD card)
and CM4 (SD card or internal eMMC). That's also seen with v6.1.15
(annotating accordingly) and v6.2 (not annotating as we have no such
package in the archive at the moment). Thankfully that's gone in
v6.3-rc1.

You can find my analysis in:
https://lore.kernel.org/linux-arm-kernel/20230308144105....@mraw.org/

Hopefully we'll get that fixed via stable/6.1.y soon. I might propose a
patch series against the Debian package right away though, Raspberry Pi
users have suffered long enough from that regression…

Diederik de Haas

Mar 8, 2023, 10:40:04 AM
On Wednesday, 8 March 2023 15:49:03 CET Cyril Brulebois wrote:
> This is definitely a blocking issue for booting on both Pi 4 (SD card)
> and CM4 (SD card or internal eMMC). That's also seen with v6.1.15
> (annotating accordingly) and v6.2 (not annotating as we have no such
> package in the archive at the moment). Thankfully that's gone in
> v6.3-rc1.
>
> You can find my analysis in:
>
> https://lore.kernel.org/linux-arm-kernel/20230308144105.di552lbogqv2s7fk@mraw.org/
>
> Hopefully we'll get that fixed via stable/6.1.y soon. I might propose a
> patch series against the Debian package right away though, Raspberry Pi
> users have suffered long enough from that regression…

The relevant patches are queued up for 6.1.16:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/commit/queue-6.1?id=a5ed16fdb1a2a3a9b50c3da79abcfe9366b2c47a