Dell Latitude 7400 - nvme0: Missing interrupt


Pavel Timofeev

Aug 16, 2021, 10:44:30 PM
to freebsd-current
Hello
I've got a Dell Latitude 7400 and tried installing the latest 14.0-CURRENT
(main-n248636-d20e9e02db3) on it.
Among other issues, the odd one that concerns me is the
nvme0: Missing interrupt
message that I sometimes get on the console.
It seems I get it only after a reboot of the laptop, i.e. I don't
get the message if I power cycle the laptop; at least I haven't seen
it in such cases so far.
So after a reboot I can't even use
nvmecontrol(8) quickly.
Well, it still works, but it takes tens of seconds to return the output.


# pciconf -lv
hostb0@pci0:0:0:0: class=0x060000 rev=0x0c hdr=0x00 vendor=0x8086
device=0x3e34 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Coffee Lake HOST and DRAM Controller'
class = bridge
subclass = HOST-PCI
vgapci0@pci0:0:2:0: class=0x030000 rev=0x02 hdr=0x00 vendor=0x8086
device=0x3ea0 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'WhiskeyLake-U GT2 [UHD Graphics 620]'
class = display
subclass = VGA
none0@pci0:0:4:0: class=0x118000 rev=0x0c hdr=0x00 vendor=0x8086
device=0x1903 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal
Subsystem'
class = dasp
none1@pci0:0:8:0: class=0x088000 rev=0x00 hdr=0x00 vendor=0x8086
device=0x1911 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core
Processor Gaussian Mixture Model'
class = base peripheral
pchtherm0@pci0:0:18:0: class=0x118000 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9df9 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP Thermal Controller'
class = dasp
none2@pci0:0:19:0: class=0x070000 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9dfc subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP Integrated Sensor Hub'
class = simple comms
subclass = UART
xhci0@pci0:0:20:0: class=0x0c0330 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9ded subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP USB 3.1 xHCI Controller'
class = serial bus
subclass = USB
none3@pci0:0:20:2: class=0x050000 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9def subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP Shared SRAM'
class = memory
subclass = RAM
iwm0@pci0:0:20:3: class=0x028000 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9df0 subvendor=0x8086 subdevice=0x4030
vendor = 'Intel Corporation'
device = 'Cannon Point-LP CNVi [Wireless-AC]'
class = network
ig4iic0@pci0:0:21:0: class=0x0c8000 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9de8 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP Serial IO I2C Controller'
class = serial bus
ig4iic1@pci0:0:21:1: class=0x0c8000 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9de9 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP Serial IO I2C Controller'
class = serial bus
ig4iic2@pci0:0:21:3: class=0x0c8000 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9deb subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
class = serial bus
none4@pci0:0:22:0: class=0x078000 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9de0 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP MEI Controller'
class = simple comms
ig4iic3@pci0:0:25:0: class=0x0c8000 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9dc5 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP Serial IO I2C Host Controller'
class = serial bus
pcib1@pci0:0:28:0: class=0x060400 rev=0xf0 hdr=0x01 vendor=0x8086
device=0x9dbc subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP PCI Express Root Port'
class = bridge
subclass = PCI-PCI
pcib6@pci0:0:29:0: class=0x060400 rev=0xf0 hdr=0x01 vendor=0x8086
device=0x9db3 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
class = bridge
subclass = PCI-PCI
pcib7@pci0:0:29:4: class=0x060400 rev=0xf0 hdr=0x01 vendor=0x8086
device=0x9db4 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP PCI Express Root Port'
class = bridge
subclass = PCI-PCI
isab0@pci0:0:31:0: class=0x060100 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9d84 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP LPC Controller'
class = bridge
subclass = PCI-ISA
hdac0@pci0:0:31:3: class=0x040380 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9dc8 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP High Definition Audio Controller'
class = multimedia
subclass = HDA
none5@pci0:0:31:4: class=0x0c0500 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9da3 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP SMBus Controller'
class = serial bus
subclass = SMBus
none6@pci0:0:31:5: class=0x0c8000 rev=0x30 hdr=0x00 vendor=0x8086
device=0x9da4 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'Cannon Point-LP SPI Controller'
class = serial bus
pcib2@pci0:1:0:0: class=0x060400 rev=0x02 hdr=0x01 vendor=0x8086
device=0x15da subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'JHL6340 Thunderbolt 3 Bridge (C step) [Alpine Ridge 2C
2016]'
class = bridge
subclass = PCI-PCI
pcib3@pci0:2:0:0: class=0x060400 rev=0x02 hdr=0x01 vendor=0x8086
device=0x15da subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'JHL6340 Thunderbolt 3 Bridge (C step) [Alpine Ridge 2C
2016]'
class = bridge
subclass = PCI-PCI
pcib4@pci0:2:1:0: class=0x060400 rev=0x02 hdr=0x01 vendor=0x8086
device=0x15da subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'JHL6340 Thunderbolt 3 Bridge (C step) [Alpine Ridge 2C
2016]'
class = bridge
subclass = PCI-PCI
pcib5@pci0:2:2:0: class=0x060400 rev=0x02 hdr=0x01 vendor=0x8086
device=0x15da subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'JHL6340 Thunderbolt 3 Bridge (C step) [Alpine Ridge 2C
2016]'
class = bridge
subclass = PCI-PCI
none7@pci0:3:0:0: class=0x088000 rev=0x02 hdr=0x00 vendor=0x8086
device=0x15d9 subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'JHL6340 Thunderbolt 3 NHI (C step) [Alpine Ridge 2C 2016]'
class = base peripheral
xhci1@pci0:57:0:0: class=0x0c0330 rev=0x02 hdr=0x00 vendor=0x8086
device=0x15db subvendor=0x1028 subdevice=0x08e1
vendor = 'Intel Corporation'
device = 'JHL6340 Thunderbolt 3 USB 3.1 Controller (C step) [Alpine
Ridge 2C 2016]'
class = serial bus
subclass = USB
rtsx0@pci0:58:0:0: class=0xff0000 rev=0x01 hdr=0x00 vendor=0x10ec
device=0x525a subvendor=0x1028 subdevice=0x08e1
vendor = 'Realtek Semiconductor Co., Ltd.'
device = 'RTS525A PCI Express Card Reader'
nvme0@pci0:59:0:0: class=0x010802 rev=0x00 hdr=0x00 vendor=0x1c5c
device=0x1639 subvendor=0x1c5c subdevice=0x1639
vendor = 'SK hynix'
class = mass storage
subclass = NVM


# nvmecontrol devlist
nvme0: PC611 NVMe SK hynix 512GB
nvme0ns1 (488386MB)

dmesg when power cycled -
https://drive.google.com/file/d/1dB27oB1O2CcnZy6DvOOhmFO8SN8V8SwJ
dmesg when rebooted -
https://drive.google.com/file/d/1DsKTMkihp_OmUcirByLaVO4o2mU38Bxh

Pavel Timofeev

Aug 21, 2021, 12:44:51 AM
to Chuck Tuffli, freebsd-current
Pavel Timofeev <tim...@gmail.com>:

>
> Chuck Tuffli <ctu...@gmail.com>:
>
>> On Mon, Aug 16, 2021 at 7:43 PM Pavel Timofeev <tim...@gmail.com> wrote:
>> >
>> > Hello
>> > I've got a Dell Latitude 7400 and tried installing the latest
>> 14.0-CURRENT
>> > (main-n248636-d20e9e02db3) on it.
>> > Despite other things the weird one which concerns me is
>> > nvme0: Missing interrupt
>> > message I get sometimes on the console.
>> > It seems like I get it only after the reboot of the laptop, i. e. not
>> > getting that message if I power cycle the laptop, at least I haven't
>> seen
>> > them for now in such cases.
>> > So when the laptop is rebooted I can't even take advantage of
>> > nvmecontrol(8) quickly.
>> > Well, it still works, but it takes tens of seconds to return the output.
>> ...
>> I'm sort of curious about the time stamps for the log messages in the
>> failing case. Something like:
>>
>> $ grep "nv\(me\|d\)" /var/log/messages
>>
>> --chuck
>>
>
> Well, I can't see timestamps in the verbose boot log. Am I missing some
> configuration for that?
>
> $ grep "nv\(me\|d\)" /var/log/messages
> nvme0: <Generic NVMe Device> mem
> 0xcc100000-0xcc103fff,0xcc105000-0xcc105fff,0xcc104000-0xcc104fff at device
> 0.0 on pci6
> nvme0: attempting to allocate 5 MSI-X vectors (17 supported)
> nvme0: using IRQs 133-137 for MSI-X
> nvme0: CapLo: 0x140103ff: MQES 1023, CQR, TO 20
> nvme0: CapHi: 0x00000030: DSTRD 0, NSSRS, CSS 1, MPSMIN 0, MPSMAX 0
> nvme0: Version: 0x00010300: 1.3
> nvme0: Missing interrupt
> nvme0: Missing interrupt
> nvme0: Missing interrupt
> nvme0: Missing interrupt
> nvme0: Missing interrupt
> nvme0: Missing interrupt
> nvme0: Missing interrupt
> nvme0: Missing interrupt
> nvme0: Missing interrupt
> nvme0: Missing interrupt
> nvme0: Missing interrupt
> nvme0: Missing interrupt
> nvd0: <PC611 NVMe SK hynix 512GB> NVMe namespace
> GEOM: new disk nvd0
> nvd0: 488386MB (1000215216 512 byte sectors)
>


Ah, sorry, I provided the wrong output.
Here is what you requested:
$ grep "nv\(me\|d\)" /var/log/messages
Aug 21 04:34:36 nostromo kernel: nvme0: <Generic NVMe Device> mem 0xcc100000-0xcc103fff,0xcc105000-0xcc105fff,0xcc104000-0xcc104fff at device 0.0 on pci6
Aug 21 04:34:36 nostromo kernel: nvme0: attempting to allocate 5 MSI-X vectors (17 supported)
Aug 21 04:34:36 nostromo kernel: nvme0: using IRQs 133-137 for MSI-X
Aug 21 04:34:36 nostromo kernel: nvme0: CapLo: 0x140103ff: MQES 1023, CQR, TO 20
Aug 21 04:34:36 nostromo kernel: nvme0: CapHi: 0x00000030: DSTRD 0, NSSRS, CSS 1, MPSMIN 0, MPSMAX 0
Aug 21 04:34:36 nostromo kernel: nvme0: Version: 0x00010300: 1.3
Aug 21 04:34:36 nostromo kernel: nvme0: Missing interrupt
Aug 21 04:34:36 nostromo kernel: nvme0: Missing interrupt
Aug 21 04:34:36 nostromo kernel: nvme0: Missing interrupt
Aug 21 04:34:36 nostromo kernel: nvd0: <PC611 NVMe SK hynix 512GB> NVMe namespace
Aug 21 04:34:36 nostromo kernel: GEOM: new disk nvd0
Aug 21 04:34:36 nostromo kernel: nvd0: 488386MB (1000215216 512 byte sectors)
Aug 21 04:34:42 nostromo kernel: nvme0: Missing interrupt
Aug 21 04:35:36 nostromo kernel: nvme0: Missing interrupt
Aug 21 04:35:50 nostromo kernel: nvme0: Missing interrupt
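
To see how the messages are spread over time, the timestamped lines can be tallied per second. This is only a minimal sketch, assuming the syslog layout shown above (month, day, time, host, then the kernel message); the function name missing_by_time is made up for illustration.

```sh
# Tally "Missing interrupt" messages per syslog timestamp.
# $1: path to a log file in the /var/log/messages layout shown above.
missing_by_time() {
    awk '/nvme[0-9]+: Missing interrupt/ { n[$1 " " $2 " " $3]++ }
         END { for (t in n) print n[t], t }' "$1" | LC_ALL=C sort -k2
}

# Example:
#   missing_by_time /var/log/messages
```

Each output line is a count followed by the timestamp, so a burst at attach time and later isolated messages show up as separate rows.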

Warner Losh

Aug 21, 2021, 12:51:46 AM
to Pavel Timofeev, Chuck Tuffli, freebsd-current
What happens if you set hw.nvme.use_nvd=0 and hw.cam.nda.nvd_compat=1
in the boot loader and reboot? Same thing except nda where nvd was? Or does
it work?

Something weird is going on in the interrupt assignment, I think, but I
wanted to get any nvd vs nda issues out of the way first.

Warner
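
Loader tunables such as hw.nvme.use_nvd are normally set in /boot/loader.conf. The sketch below is one hedged way to do that from a shell; set_tunable is a hypothetical helper, not a FreeBSD tool, and it takes the file path as an argument so it can be tried on a scratch copy first.

```sh
# Set one loader tunable in a loader.conf-style file.
# $1: file path, $2: tunable name, $3: value.
set_tunable() {
    conf=$1; name=$2; value=$3
    tmp="$conf.tmp.$$"
    # Keep every line that does not already assign this tunable.
    if [ -f "$conf" ]; then
        awk -v n="$name=" 'index($0, n) != 1' "$conf" > "$tmp"
    else
        : > "$tmp"
    fi
    # Append the new assignment in loader.conf's name="value" form.
    printf '%s="%s"\n' "$name" "$value" >> "$tmp"
    mv "$tmp" "$conf"
}

# Example (as root on the affected machine, followed by a reboot):
#   set_tunable /boot/loader.conf hw.nvme.use_nvd 0
```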

Pavel Timofeev

Aug 21, 2021, 5:09:48 PM
to Warner Losh, Chuck Tuffli, freebsd-current
Warner Losh <i...@bsdimp.com>:
Do you mean kern.cam.nda.nvd_compat instead of hw.cam.nda.nvd_compat?
kern.cam.nda.nvd_compat is 1 by default now.

So I set hw.nvme.use_nvd to 0 as suggested, but I still see
nvme0: Missing interrupt
and now also
Root mount waiting for: CAM
messages in addition to those.

Warner Losh

Aug 21, 2021, 5:26:07 PM
to Pavel Timofeev, Chuck Tuffli, freebsd-current
OK. That all makes sense. I'd forgotten that nvd_compat=1 by default these
days.

I'll take a look on Monday, starting with the differences in interrupt
assignment that are apparent when you cold boot vs. reboot.

Thanks for checking... I'd hoped this was a cheap fix, but also didn't
really expect it to be.

Warner

Pavel Timofeev

Oct 8, 2021, 4:44:47 PM
to Warner Losh, Chuck Tuffli, freebsd-current
Sat, Aug 21, 2021 at 15:22, Warner Losh <i...@bsdimp.com>:
I've recently upgraded to main-n249974-17f790f49f5 and it has gotten even
worse.
A clean power-on works as before.
But after a reboot the NVMe drive refuses to work, whereas before the
upgrade it only complained about missing interrupts.

Currently dmesg shows this:
nvme0: <Generic NVMe Device> mem
0xcc100000-0xcc103fff,0xcc105000-0xcc105fff,0xcc104000-0xcc104fff at device
0.0 on pci6
nvd0: <PC611 NVMe SK hynix 512GB> NVMe namespace
nvd0: 488386MB (1000215216 512 byte sectors)
nvme0: <Generic NVMe Device> mem
0xcc100000-0xcc103fff,0xcc105000-0xcc105fff,0xcc104000-0xcc104fff at device
0.0 on pci6
nvme0: RECOVERY_START 9585870784 vs 9367036288
nvme0: timeout with nothing complete, resetting
nvme0: Resetting controller due to a timeout.
nvme0: RECOVERY_WAITING
nvme0: resetting controller
nvme0: aborting outstanding admin command
nvme0: IDENTIFY (06) sqid:0 cid:15 nsid:0 cdw10:00000001 cdw11:00000000
nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:15 cdw0:0
nvme0: nvme_identify_controller failed!
nvme0: waiting
nvme0: <Generic NVMe Device> mem
0xcc100000-0xcc103fff,0xcc105000-0xcc105fff,0xcc104000-0xcc104fff at device
0.0 on pci6
nvme0: RECOVERY_START 9362778467 vs 9361830561
nvme0: timeout with nothing complete, resetting
nvme0: Resetting controller due to a timeout.
nvme0: RECOVERY_WAITING
nvme0: resetting controller
nvme0: aborting outstanding admin command
nvme0: IDENTIFY (06) sqid:0 cid:15 nsid:0 cdw10:00000001 cdw11:00000000
nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:15 cdw0:0
nvme0: nvme_identify_controller failed!
nvme0: waiting

Warner Losh

Oct 8, 2021, 4:51:17 PM
to Pavel Timofeev, Chuck Tuffli, freebsd-current
Why is this showing up twice? Or is everything above this line left over
from the first, working boot?


> nvme0: RECOVERY_START 9585870784 vs 9367036288
> nvme0: timeout with nothing complete, resetting
> nvme0: Resetting controller due to a timeout.
> nvme0: RECOVERY_WAITING
> nvme0: resetting controller
> nvme0: aborting outstanding admin command
> nvme0: IDENTIFY (06) sqid:0 cid:15 nsid:0 cdw10:00000001 cdw11:00000000
> nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:15 cdw0:0
> nvme0: nvme_identify_controller failed!
> nvme0: waiting
>

Clearly something bad is going on with the drive here... We looked into the
completion queues when we didn't get an interrupt and there was nothing
complete there....

The only thing I can think of is that this means there's a phase error
between the drive and the system. I recently removed a second reset and
made it an option NVME_2X_RESET. Can you see if adding
'options NVME_2X_RESET' to your kernel config fixes this?

Warner
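
For anyone repeating this test, a kernel option like this goes into a custom kernel configuration file. The sketch below writes one that includes GENERIC; the ident NVME2XRESET and the helper name write_kernconf are arbitrary choices made here, and the paths in the comments assume an amd64 source tree in the default location.

```sh
# Write a custom kernel configuration that adds NVME_2X_RESET to GENERIC.
# $1: output path for the configuration file.
write_kernconf() {
    cat > "$1" <<'EOF'
include GENERIC
ident   NVME2XRESET
options NVME_2X_RESET
EOF
}

# Example (assumed default paths):
#   write_kernconf /usr/src/sys/amd64/conf/NVME2XRESET
#   cd /usr/src && make kernel KERNCONF=NVME2XRESET
```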

Pavel Timofeev

Oct 9, 2021, 10:46:39 AM
to Warner Losh, Chuck Tuffli, freebsd-current
Fri, Oct 8, 2021 at 14:49, Warner Losh <i...@bsdimp.com>:
Sorry, it's showing twice due to multiple reboots. For one boot it's like:
nvme0: <Generic NVMe Device> mem
0xcc100000-0xcc103fff,0xcc105000-0xcc105fff,0xcc104000-0xcc104fff at device
0.0 on pci6
nvme0: RECOVERY_START 9633303481 vs 9365971423
nvme0: timeout with nothing complete, resetting
nvme0: Resetting controller due to a timeout.
nvme0: RECOVERY_WAITING
nvme0: resetting controller
nvme0: aborting outstanding admin command
nvme0: IDENTIFY (06) sqid:0 cid:15 nsid:0 cdw10:00000001 cdw11:00000000
nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:15 cdw0:0
nvme0: nvme_identify_controller failed!
nvme0: waiting

Well, neither Windows nor Linux has any problems with the device. I
understand they may be hiding the problem or working around it somehow.

I'll try setting NVME_2X_RESET in the kernel config and report back in a
while.

Warner Losh

Oct 9, 2021, 5:01:45 PM
to Pavel Timofeev, Chuck Tuffli, freebsd-current
Yea, I'm trying to figure out why your machine is different than mine, and
what Windows or Linux do that is different. It may be dodgy hardware, but
others have no trouble...

> I'll try setting NVME_2X_RESET in the kernel config and report back in a
> while.
>

Thanks. If it helps, that tells me something. If it doesn't, that tells me
something else.

I suspect that it is somewhere else in the system, tbh, but I need to find
it systematically.

Warner


Pavel Timofeev

Oct 11, 2021, 12:50:49 AM
to Warner Losh, Chuck Tuffli, freebsd-current
Sat, Oct 9, 2021 at 14:59, Warner Losh <i...@bsdimp.com>:
Surprisingly, setting NVME_2X_RESET in the kernel config hasn't changed
anything, i.e. it didn't help.

Warner Losh

Oct 11, 2021, 1:03:04 AM
to Pavel Timofeev, Chuck Tuffli, freebsd-current
While it would have been nice to have this be the fix, I'm not that
surprised either. It was the biggest change of late, apart from the big
re-arrangement that I'd done.

So the other changes have moved us from the occasional missing interrupt
message (which the old code would print when a command wasn't completed
within the timeout period, but which we found to be done when we scanned
the completion queue) to a state where the device is detected fine (as
before), but then doesn't do I/O at all (including not completing the
identify command!), which is worse. This is unexpected, and I'm trying to
understand what happens on reboot that 'changes' the working state and
why the new code behaves so differently.

Warner

Warner Losh

Oct 12, 2021, 3:59:17 PM
to Pavel Timofeev, Chuck Tuffli, freebsd-current
One piece of data that would be good to have:

nvmecontrol identify nvme0

There's an optional feature that none of my drives have, but for which
the Linux driver does oddly weird things when it is enabled. The output
of that command will help me understand if that may be in play. Maybe we
need to do oddly weird things too :)

Warner

Pavel Timofeev

Oct 12, 2021, 8:53:53 PM
to Warner Losh, Chuck Tuffli, freebsd-current
Tue, Oct 12, 2021 at 13:56, Warner Losh <i...@bsdimp.com>:
Sure, here it is:
Controller Capabilities/Features
================================
Vendor ID: 1c5c
Subsystem Vendor ID: 1c5c
Serial Number: ND03N46381010423H
Model Number: PC611 NVMe SK hynix 512GB
Firmware Version: 11000111
Recommended Arb Burst: 4
IEEE OUI Identifier: 2e e4 ac
Multi-Path I/O Capabilities: Not Supported
Max Data Transfer Size: 262144 bytes
Sanitize Crypto Erase: Not Supported
Sanitize Block Erase: Supported
Sanitize Overwrite: Not Supported
Sanitize NDI: Not Supported
Sanitize NODMMAS: Undefined
Controller ID: 0x0001
Version: 1.3.0

Admin Command Set Attributes
============================
Security Send/Receive: Supported
Format NVM: Supported
Firmware Activate/Download: Supported
Namespace Management: Not Supported
Device Self-test: Supported
Directives: Not Supported
NVMe-MI Send/Receive: Not Supported
Virtualization Management: Not Supported
Doorbell Buffer Config: Not Supported
Get LBA Status: Not Supported
Sanitize: block,
Abort Command Limit: 4
Async Event Request Limit: 8
Number of Firmware Slots: 3
Firmware Slot 1 Read-Only: No
Per-Namespace SMART Log: No
Error Log Page Entries: 256
Number of Power States: 5
Total NVM Capacity: 0 bytes
Unallocated NVM Capacity: 0 bytes
Firmware Update Granularity: 00 (Not Reported)
Host Buffer Preferred Size: 0 bytes
Host Buffer Minimum Size: 0 bytes

NVM Command Set Attributes
==========================
Submission Queue Entry Size
Max: 64
Min: 64
Completion Queue Entry Size
Max: 16
Min: 16
Number of Namespaces: 1
Compare Command: Supported
Write Uncorrectable Command: Supported
Dataset Management Command: Supported
Write Zeroes Command: Supported
Save Features: Supported
Reservations: Not Supported
Timestamp feature: Supported
Verify feature: Not Supported
Fused Operation Support: Not Supported
Format NVM Attributes: Per-NS Erase, Per-NS Format
Volatile Write Cache: Present

NVM Subsystem Name:

Warner Losh

Oct 12, 2021, 10:49:14 PM
to Pavel Timofeev, Chuck Tuffli, freebsd-current
Thanks. Good news / bad news based on this...


> Controller Capabilities/Features
> ================================
> Vendor ID: 1c5c
> Subsystem Vendor ID: 1c5c
> Serial Number: ND03N46381010423H
> Model Number: PC611 NVMe SK hynix 512GB
>

OK. I have a PC511 256GB that I've been trying to get rotated
into my test harness... I'll make it next up tomorrow based on this...
With luck, it will have the same problem as yours. It was in a batch
of drives I got to try to sort out reported problems... (unluck of the
draw that it was pushed to the back of the list....)


> Firmware Version: 11000111
> Recommended Arb Burst: 4
> IEEE OUI Identifier: 2e e4 ac
> Multi-Path I/O Capabilities: Not Supported
> Max Data Transfer Size: 262144 bytes
> Sanitize Crypto Erase: Not Supported
> Sanitize Block Erase: Supported
> Sanitize Overwrite: Not Supported
> Sanitize NDI: Not Supported
> Sanitize NODMMAS: Undefined
> Controller ID: 0x0001
> Version: 1.3.0
>
> Admin Command Set Attributes
> ============================
> Security Send/Receive: Supported
> Format NVM: Supported
> Firmware Activate/Download: Supported
> Namespace Management: Not Supported
> Device Self-test: Supported
> Directives: Not Supported
> NVMe-MI Send/Receive: Not Supported
> Virtualization Management: Not Supported
> Doorbell Buffer Config: Not Supported
> Get LBA Status: Not Supported
>

Hmmm, I was hoping it was related to Doorbell Buffer Config.
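
For readers checking their own drive, a single field can be pulled out of saved identify output rather than read by eye. A minimal sketch, assuming the "Name: value" layout quoted in this thread; identify_field is a hypothetical helper name.

```sh
# Print the value of one "Name: value" line from saved identify output.
# $1: field name as printed by nvmecontrol, $2: file holding the output.
identify_field() {
    awk -F ': *' -v name="$1" '$1 == name { print $2; exit }' "$2"
}

# Example:
#   nvmecontrol identify nvme0 > identify.txt
#   identify_field 'Doorbell Buffer Config' identify.txt
```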


> Sanitize: block,
> Abort Command Limit: 4
> Async Event Request Limit: 8
> Number of Firmware Slots: 3
> Firmware Slot 1 Read-Only: No
> Per-Namespace SMART Log: No
> Error Log Page Entries: 256
> Number of Power States: 5
> Total NVM Capacity: 0 bytes
> Unallocated NVM Capacity: 0 bytes
> Firmware Update Granularity: 00 (Not Reported)
> Host Buffer Preferred Size: 0 bytes
> Host Buffer Minimum Size: 0 bytes
>
> NVM Command Set Attributes
> ==========================
> Submission Queue Entry Size
> Max: 64
> Min: 64
> Completion Queue Entry Size
> Max: 16
> Min: 16
> Number of Namespaces: 1
>

These are all typical / normal as well...


> Compare Command: Supported
> Write Uncorrectable Command: Supported
> Dataset Management Command: Supported
> Write Zeroes Command: Supported
> Save Features: Supported
> Reservations: Not Supported
> Timestamp feature: Supported
> Verify feature: Not Supported
> Fused Operation Support: Not Supported
> Format NVM Attributes: Per-NS Erase, Per-NS Format
> Volatile Write Cache: Present
>

Also typical, except write uncorrectable (which we don't use).

Warner


> NVM Subsystem Name:
>
>
>

Warner Losh

Oct 17, 2021, 1:28:14 PM
to Alexander Motin, Pavel Timofeev, Chuck Tuffli, freebsd-current
On Sun, Oct 17, 2021, 11:19 AM Alexander Motin <mav...@gmail.com> wrote:

> It may be noise, but comparing the logs I also see, in the reboot case,
> "acpi_ec0: not getting interrupts, switched to polled mode". I am
> wondering whether the problem may be caused not by the SSD, but by some
> resource conflict/misconfiguration in the system. Pavel, can you
> compare `devinfo -vr` and `lspci -vvvvv` in both cases, looking for any
> differences? Are you running the latest BIOS?
>

I'm leaning the same way since I have an identical drive not showing
problems in my system. It's also weird that the completion record for the
identify didn't show up after reboot. It makes me think it went to the
wrong place or didn't make it back up the bridge hierarchy.

Warner
> --
> Alexander Motin
>

Alexander Motin

Oct 17, 2021, 1:28:44 PM
to Warner Losh, Pavel Timofeev, Chuck Tuffli, freebsd-current
It may be noise, but comparing the logs I also see, in the reboot case,
"acpi_ec0: not getting interrupts, switched to polled mode". I am
wondering whether the problem may be caused not by the SSD, but by some
resource conflict/misconfiguration in the system. Pavel, can you
compare `devinfo -vr` and `lspci -vvvvv` in both cases, looking for any
differences? Are you running the latest BIOS?

On 12.10.2021 15:56, Warner Losh wrote:
--
Alexander Motin
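
The comparison requested here can be scripted: save each command's output once after a power cycle and once after a reboot, then look only at the lines that changed. A minimal sketch; changed_lines is a made-up helper name, and the .ok/.nok file names are just a convention.

```sh
# Print only the added/removed lines between two saved snapshots.
# $1: output saved after a cold boot, $2: output saved after a reboot.
changed_lines() {
    # Skip the two "---"/"+++" header lines, keep lines starting with + or -.
    diff -u "$1" "$2" | sed -n '3,$p' | grep '^[+-]'
}

# Example:
#   devinfo -vr > devinfo.ok      (after a power cycle)
#   devinfo -vr > devinfo.nok     (after a reboot)
#   changed_lines devinfo.ok devinfo.nok
```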

Pavel Timofeev

Oct 17, 2021, 7:55:24 PM
to Alexander Motin, Warner Losh, Chuck Tuffli, freebsd-current
Sun, Oct 17, 2021 at 11:19, Alexander Motin <mav...@gmail.com>:

> It may be noise, but comparing the logs I also see, in the reboot case,
> "acpi_ec0: not getting interrupts, switched to polled mode". I am
> wondering whether the problem may be caused not by the SSD, but by some
> resource conflict/misconfiguration in the system. Pavel, can you
> compare `devinfo -vr` and `lspci -vvvvv` in both cases, looking for any
> differences? Are you running the latest BIOS?
>
> On 12.10.2021 15:56, Warner Losh wrote:
> --
> Alexander Motin
>


Thanks for the reply.
It's using the latest firmware. That is the first thing I do in such cases.


Attaching devinfo and lspci output.
These diffs show the differences between a clean boot and a reboot:

$ diff -u devinfo.ok devinfo.nok
--- devinfo.ok 2021-10-17 17:48:07.964346000 -0600
+++ devinfo.nok 2021-10-17 17:48:07.886881000 -0600
@@ -214,10 +214,6 @@
nvme0 pnpinfo vendor=0x1c5c device=0x1639 subvendor=0x1c5c
subdevice=0x1639 class=0x010802 at slot=0 function=0 dbsf=pci0:59:0:0
handle=\_SB_.PCI0.RP13.PXSX
Interrupt request lines:
0x85
- 0x86
- 0x87
- 0x88
- 0x89
pcib7 memory window:
0xcc100000-0xcc103fff
0xcc104000-0xcc104fff
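
The request lines in the devinfo diff are hexadecimal. Converting them shows they are the same IRQs 133-137 that the earlier boot log reported for MSI-X, and that only 133 (0x85) remains listed after a reboot. A small sketch of the conversion; irq_to_dec is a hypothetical helper name.

```sh
# Convert hexadecimal IRQ numbers (as printed by devinfo) to decimal.
irq_to_dec() {
    for irq in "$@"; do
        printf '%d\n' "$irq"
    done
}

# The five request lines listed for nvme0 in devinfo.ok:
irq_to_dec 0x85 0x86 0x87 0x88 0x89
```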

$ diff -u lspci.ok lspci.nok
--- lspci.ok 2021-10-17 17:48:15.894470000 -0600
+++ lspci.nok 2021-10-17 17:48:15.341379000 -0600
@@ -132,7 +132,7 @@
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
- Address: 00000000fee06000 Data: 0033
+ Address: 00000000fee06000 Data: 0034
Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE- FLReset+
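
The only lspci difference is the MSI data word of one device. On x86, in the plain (non-remapped) MSI format, the low byte of the data word is the interrupt vector and address bits 19:12 carry the destination APIC ID. The sketch below decodes the two values under that assumption; msi_decode is a made-up helper name. Note that the hunk belongs to a Root Complex Integrated Endpoint whose identity is not visible in the diff, and a different vector across boots may simply reflect allocation order rather than a fault.

```sh
# Split an x86 MSI address/data pair into its main fields.
# Assumes the compatibility (non-remapped) format.
# $1: MSI address, $2: MSI data (hex constants, e.g. 0xfee06000).
msi_decode() {
    addr=$(( $1 )); data=$(( $2 ))
    dest=$(( (addr >> 12) & 0xff ))   # destination APIC ID, address bits 19:12
    vector=$(( data & 0xff ))         # interrupt vector, data bits 7:0
    mode=$(( (data >> 8) & 0x7 ))     # delivery mode, data bits 10:8
    echo "dest=$dest vector=$vector delivery_mode=$mode"
}

msi_decode 0xfee06000 0x0033   # value from lspci.ok
msi_decode 0xfee06000 0x0034   # value from lspci.nok
```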

Pavel Timofeev

Jan 14, 2022, 1:56:52 PM
to Alexander Motin, Warner Losh, Chuck Tuffli, freebsd-current


Sun, Oct 17, 2021 at 17:52, Pavel Timofeev <tim...@gmail.com>:
Hi,
I hope everyone is doing well.
Several BIOS updates have been released since then; the BIOS version is now 1.15.1, but the behavior is the same, at least on a CURRENT built several days ago (main-n252414-0e8181c0123).
Interestingly, iwm(4) and iwlwifi(4) behave the same way: only a full power cycle makes Wi-Fi functional, while a simple reboot breaks it in most cases.
Does anybody have any idea?