Problem with the connection status (Link-status) after loading the driver ntb_hw_intel


Ka Vi

Jun 4, 2018, 7:07:16 AM
to linux-ntb

    Hi

    After loading the ntb.ko and ntb_hw_intel drivers, the /sys/kernel/debug/ntb_hw_intel/0000:00:03.0/info file appears and reports that Link-Status is Down.
    When the ntb_transport driver is loaded, Link-Status changes to Up.
    When the ntb_netdev driver is loaded, the eth0 device appears. After assigning IP addresses to eth0 on both hosts, ping works between them.

    After loading the ntc_ntb_msi driver, the directories /sys/kernel/debug/ntc_ntb_msi, /sys/kernel/debug/ntrdma, and /sys/class/infiniband are empty. The /sys/class/infiniband_verbs directory contains only one file, "abi_version", which holds the single character "6".


    Why does the link not go Up when ntb_hw_intel is loaded? It comes up only after ntb_transport is loaded, even though the two seem like separate things.
    Also, what needs to be done so that the link goes Up as soon as ntb_hw_intel is loaded?


    //---------------------------------------------------------------
    Spread spectrum clock is turned off by default.

    NTB Window Size:
    PBAR23SZ 22
    PBAR45SZ 20
    SBAR23SZ 20
    SBAR45SZ 20

    System: Gooxi, Debian 4.9, 16 GB RAM, Intel® Xeon® E5-2603 v4
photo_2018-06-04_13-23-40.jpg

Allen Hubbe

Jun 4, 2018, 8:08:42 AM
to Ka Vi, linux-ntb
On Mon, Jun 4, 2018 at 7:07 AM, Ka Vi <yato...@gmail.com> wrote:
>
> Hi
>
> After loading the ntb.ko and ntb_hw_intel drivers, the
> /sys/kernel/debug/ntb_hw_intel/0000:00:03.0/info file appears, indicating
> that Link-Status is Down.
> When the ntb_transport driver is loaded, the Link-Status state goes to Up.
> When the ntb_netdev driver is loaded, the eth0 device appears. When
> assigning ip-addressing to both devices eth0, we have a ping between them.
>
> After loading the ntc_ntb_msi driver, the directories
> /sys/kernel/debug/ntc_ntb_msi, /sys/kernel/debug/ntrdma, and
> /sys/class/infiniband are empty. In the /sys/class/infiniband_verbs
> directory there is only one "abi_version" file containing one "6" character.

Was the ntb_transport driver unloaded before loading ntc_ntb_msi?

> Why does not the link go into UP status when loading ntb_hw_intel?

The hw driver only provides an interface for the hardware.

> And it
> appears only after ntb_transport, it's kind of like different things.

The transport driver tells the hw driver to bring up the interface.

> Also, what needs to be done so that the link goes Up when
> ntb_hw_intel is loaded?

If the link is allowed to go up before the transport driver is loaded,
then some other signal would be needed to indicate when the transport
driver becomes ready for communication. Doing it this way, the link
up signal not only indicates that the link is physically up, but also
that the peer is ready for communication over the link.
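
The hand-off described above can be watched from user space by reading the hw driver's debugfs info file before and after the client driver is loaded. A minimal sketch, assuming debugfs is mounted at /sys/kernel/debug and that the info file has a line whose label starts with "Link Status" or "Link-Status" (the exact layout may differ between kernel versions). The function name is made up for illustration.

```shell
#!/bin/sh
# Print the link state ("Up", "Down", ...) found in an ntb_hw_intel debugfs
# info file. Hypothetical helper: the "Link Status" label is an assumption
# about the file layout, so both "Link Status" and "Link-Status" are accepted.
ntb_link_status() {
    info_file=$1
    # Take the first matching line and keep its last whitespace-separated field.
    awk 'tolower($0) ~ /^link[ -]status/ { print $NF; exit }' "$info_file"
}

# Usage (path taken from the original report):
# ntb_link_status /sys/kernel/debug/ntb_hw_intel/0000:00:03.0/info
```

Running it once after loading ntb_hw_intel and again after loading the client driver should show the change that the original post describes.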

Ka Vi

Jun 4, 2018, 9:23:29 AM
to linux-ntb
Hi, Allen! Thanks for answering!

> 1. Was the ntb_transport driver unloaded before loading ntc_ntb_msi?
No. I want to create a bridge using NTRDMA over PCIe NTB, bypassing the TCP/IP protocol.

I read the wiki and did everything as described there. It does not use the ntb_transport module.

After editing the configuration file (https://github.com/ntrdma/ntrdma/wiki#module-loading-1), I load the module and get an error: [ 1273.099294] ntb_hw_intel: unknown parameter 'no_msix' ignored
The relevant dmesg output:
[ 1252.681245] ioatdma: Intel(R) QuickData Technology Driver 4.00
[ 1273.099294] ntb_hw_intel: unknown parameter 'no_msix' ignored
[ 1273.099420] Intel(R) PCI-E Non-Transparent Bridge Driver kkk  3.0
[ 1273.099624] ntb_hw_intel 0000:00:03.0: enabling bus mastering
[ 1273.100007] ntb_hw_intel 0000:00:03.0: NTB device registered.
[ 1325.970171] ntc: NTC Driver Framework 0.2
[ 1325.974509] ntc_ntb: NTC Non Transparent Bridge 0.3

What is the no_msix parameter?

After spending a lot of time trying to figure out the problem, I decided to go back to the beginning and check whether there is any connection between the controllers without loading ntb_transport. Since I had reached a dead end, I suspected I had made a mistake somewhere at the start, which is why I asked about the link status after loading ntb_hw_intel.

Maybe I am looking for a solution in the wrong direction. If you know what is wrong or can help, I would be very glad.

Allen Hubbe

Jun 4, 2018, 10:00:46 AM
to Ka Vi, linux-ntb
On Mon, Jun 4, 2018 at 9:23 AM, Ka Vi <yato...@gmail.com> wrote:
> Hi, Allen! Thanks for answering!
>
> 1. Was the ntb_transport driver unloaded before loading ntc_ntb_msi?
> No, I want to create a bridge using NTRDMA over PCIe NTB, bypassing the TCP
> / IP protocol
>
> I watched the wiki, and I did everything as indicated. And there is no usage
> of the ntb_transport module.
>
> After editing the configuration file
> (https://github.com/ntrdma/ntrdma/wiki#module-loading-1), I load the module
> and get an error: [ 1273.099294] ntb_hw_intel: unknown parameter 'no_msix'
> ignored
> All the necessary output in dmesg:
> [ 1252.681245] ioatdma: Intel(R) QuickData Technology Driver 4.00
> [ 1273.099294] ntb_hw_intel: unknown parameter 'no_msix' ignored
> [ 1273.099420] Intel(R) PCI-E Non-Transparent Bridge Driver kkk 3.0
> [ 1273.099624] ntb_hw_intel 0000:00:03.0: enabling bus mastering
> [ 1273.100007] ntb_hw_intel 0000:00:03.0: NTB device registered.
> [ 1325.970171] ntc: NTC Driver Framework 0.2
> [ 1325.974509] ntc_ntb: NTC Non Transparent Bridge 0.3

Please load ntc_ntb_msi with module parameter dyndbg=+p

Can you also please provide the output of:
ls -ld /sys/bus/ntb/*/driver

If it was loaded with dyndbg, and that is the last thing in dmesg,
then it looks like the probe function was never called.
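
A minimal sketch of those two checks. It assumes NTB devices are listed under /sys/bus/ntb/devices, as on other Linux buses, and that a device claimed by a driver has a `driver` symlink. The helper name is hypothetical.

```shell
#!/bin/sh
# List each device on a bus with the driver bound to it, or "(none)" when no
# driver has claimed it. Hypothetical helper; the first argument is the bus
# directory, e.g. /sys/bus/ntb.
list_bound_drivers() {
    bus_dir=$1
    for dev in "$bus_dir"/devices/*; do
        [ -e "$dev" ] || continue          # glob did not match: no devices
        if [ -L "$dev/driver" ]; then
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"
        fi
        printf '%s %s\n' "$(basename "$dev")" "$drv"
    done
}

# Usage, after reloading the client with debug messages enabled:
# modprobe -r ntc_ntb_msi && modprobe ntc_ntb_msi dyndbg=+p
# list_bound_drivers /sys/bus/ntb
# dmesg | tail
```

An empty listing would mean that no NTB device is registered on the bus at all, while "(none)" next to a device would be consistent with probe never having been called or having failed.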

>
> What is the parameter no_msix ?

That parameter was used in development, to force the hw driver to
request MSI interrupts instead of MSI-X. It is no longer needed, and
it is ok that the parameter was ignored.

vk en

Jun 4, 2018, 10:43:01 AM
to linux-ntb


On Monday, 4 June 2018 17:00:46 UTC+3, Allen Hubbe wrote:
On Mon, Jun 4, 2018 at 9:23 AM, Ka Vi <yato...@gmail.com> wrote:
> Hi, Allen! Thanks for answering!
>
> 1. Was the ntb_transport driver unloaded before loading ntc_ntb_msi?
> No, I want to create a bridge using NTRDMA over PCIe NTB, bypassing the TCP
> / IP protocol
>
> I watched the wiki, and I did everything as indicated. And there is no usage
> of the ntb_transport module.
>
> After editing the configuration file
> (https://github.com/ntrdma/ntrdma/wiki#module-loading-1), I load the module
> and get an error: [ 1273.099294] ntb_hw_intel: unknown parameter 'no_msix'
> ignored
> All the necessary output in dmesg:
> [ 1252.681245] ioatdma: Intel(R) QuickData Technology Driver 4.00
> [ 1273.099294] ntb_hw_intel: unknown parameter 'no_msix' ignored
> [ 1273.099420] Intel(R) PCI-E Non-Transparent Bridge Driver kkk  3.0
> [ 1273.099624] ntb_hw_intel 0000:00:03.0: enabling bus mastering
> [ 1273.100007] ntb_hw_intel 0000:00:03.0: NTB device registered.
> [ 1325.970171] ntc: NTC Driver Framework 0.2
> [ 1325.974509] ntc_ntb: NTC Non Transparent Bridge 0.3

Please load ntc_ntb_msi with module parameter dyndbg=+p
 
[  190.999503] ioatdma: Intel(R) QuickData Technology Driver 4.00
[  239.385371] ntc: NTC Driver Framework 0.2
[  239.390995] ntc_ntb: NTC Non Transparent Bridge 0.3
[  239.391018] probe ntb 0000:00:03.0
[  239.391020] no dma for new device 0000:00:03.0
 
Can you also please provide the output of:
ls -ld /sys/bus/ntb/*/driver

  root@ENGINE-1:~# ls -ld /sys/bus/ntb/*
drwxr-xr-x 2 root root    0 Jun  4 10:28 /sys/bus/ntb/devices
drwxr-xr-x 3 root root    0 Jun  4 10:16 /sys/bus/ntb/drivers
-rw-r--r-- 1 root root 4096 Jun  4 10:28 /sys/bus/ntb/drivers_autoprobe
--w------- 1 root root 4096 Jun  4 10:28 /sys/bus/ntb/drivers_probe
--w------- 1 root root 4096 Jun  4 10:28 /sys/bus/ntb/uevent

vk en

Jun 4, 2018, 10:50:51 AM
to linux-ntb
The driver reported that it did not find a DMA device.
Does that mean I should already have a DMA device, or should the driver create a virtual one itself?
Or am I missing some module for DMA?



Allen Hubbe

Jun 4, 2018, 11:31:14 AM
to vk en, linux-ntb
On Mon, Jun 4, 2018 at 10:50 AM, vk en <yato...@gmail.com> wrote:
> He wrote that he did not find dma.
> It turns out I should already have a dma, or should he create a virtual dma
> himself?
> Or do I lack some module for dma?

modprobe ioatdma

https://github.com/ntrdma/ntrdma/wiki#module-loading-1

And also check that there is at least one channel not in use:
cat /sys/class/dma/*/in_use
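
The same check as a small helper that counts the free channels. This is only a sketch: it assumes each channel directory under /sys/class/dma has an `in_use` file holding a client count, and the function name is made up for illustration.

```shell
#!/bin/sh
# Count DMA channels that report in_use == 0 under a dmaengine class
# directory (normally /sys/class/dma). Hypothetical helper for illustration.
count_free_dma_channels() {
    dma_dir=${1:-/sys/class/dma}
    free=0
    for f in "$dma_dir"/*/in_use; do
        [ -r "$f" ] || continue            # no channels registered at all
        if [ "$(cat "$f")" = "0" ]; then
            free=$((free + 1))
        fi
    done
    echo "$free"
}

# Usage:
# modprobe ioatdma
# count_free_dma_channels    # 0 means no usable channel
```

A result of 0 covers two different cases, all channels busy or no channels registered at all, so it is worth also listing the directory itself.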

Dave Jiang

Jun 4, 2018, 12:36:43 PM
to Allen Hubbe, vk en, linux-ntb


On 06/04/2018 08:31 AM, Allen Hubbe wrote:
> On Mon, Jun 4, 2018 at 10:50 AM, vk en <yato...@gmail.com> wrote:
>> He wrote that he did not find dma.
>> It turns out I should already have a dma, or should he create a virtual dma
>> himself?
>> Or do I lack some module for dma?
>
> modprobe ioatdma
>
> https://github.com/ntrdma/ntrdma/wiki#module-loading-1
>
> And also check that there is at least one channel not in use:
> cat /sys/class/dma/*/in_use

On 06/04/2018 07:43 AM, vk en wrote:
> [ 190.999503] ioatdma: Intel(R) QuickData Technology Driver 4.00
> [ 239.385371] ntc: NTC Driver Framework 0.2
> [ 239.390995] ntc_ntb: NTC Non Transparent Bridge 0.3
> [ 239.391018] probe ntb 0000:00:03.0
> [ 239.391020] no dma for new device 0000:00:03.0

^ So he has the ioatdma driver already loaded.

1. Please make sure CBDMA is enabled in your BIOS options. It is probably
under the IIO section.
2. Please provide the lspci output and check whether the DMA device shows up, e.g.

00:04.0 System peripheral: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMA Channel 0 (rev 02)
        Subsystem: Intel Corporation Device 35c8
        Flags: bus master, fast devsel, latency 0, IRQ 52, NUMA node 0
        Memory at 383ffff2c000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [80] MSI-X: Enable+ Count=1 Masked-
        Capabilities: [90] Express Root Complex Integrated Endpoint, MSI 00
        Capabilities: [e0] Power Management version 3
        Kernel driver in use: ioatdma
        Kernel modules: ioatdma
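
The "Kernel driver in use" line can also be cross-checked from sysfs without lspci. A rough sketch, assuming the usual layout in which a PCI driver's directory under /sys/bus/pci/drivers holds one symlink per bound device, named by its domain:bus:device.function address. The helper name is hypothetical.

```shell
#!/bin/sh
# Print the PCI addresses of all devices bound to a given driver.
# Hypothetical helper: $1 is the driver name, $2 optionally overrides the
# drivers directory (default /sys/bus/pci/drivers).
pci_devices_bound_to() {
    driver_dir=${2:-/sys/bus/pci/drivers}/$1
    # Bound devices appear as symlinks named like 0000:00:04.0; other entries
    # (bind, unbind, module, ...) do not match this pattern.
    for link in "$driver_dir"/*:*:*.*; do
        [ -L "$link" ] || continue
        basename "$link"
    done
}

# Usage:
# pci_devices_bound_to ioatdma
```

If ioatdma is loaded but this prints nothing, that would be consistent with the DMA devices not being exposed to the OS at all, which is why the BIOS setting is the first thing to check.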

vk en

Jun 5, 2018, 3:05:11 AM
to linux-ntb


On Monday, 4 June 2018 18:31:14 UTC+3, Allen Hubbe wrote:
On Mon, Jun 4, 2018 at 10:50 AM, vk en <yato...@gmail.com> wrote:
> He wrote that he did not find dma.
> It turns out I should already have a dma, or should he create a virtual dma
> himself?
> Or do I lack some module for dma?

modprobe ioatdma

[  142.410353] ioatdma: Intel(R) QuickData Technology Driver 4.00    <--------
[  180.957289] ntb_hw_intel: unknown parameter 'no_msix' ignored
[  180.957410] Intel(R) PCI-E Non-Transparent Bridge Driver kkk  3.0
[  180.957622] ntb_hw_intel 0000:00:03.0: enabling bus mastering
[  180.957933] ntb_hw_intel 0000:00:03.0: NTB device registered.
[  217.011155] ntc: NTC Driver Framework 0.2
[  217.016905] ntc_ntb: NTC Non Transparent Bridge 0.3
[  217.016928] probe ntb 0000:00:03.0
[  217.016930] no dma for new device 0000:00:03.0

https://github.com/ntrdma/ntrdma/wiki#module-loading-1

And also check that there is at least one channel not in use:
cat /sys/class/dma/*/in_use
 
The dma directory is empty.
root@ENGINE-1:~# ls -l /sys/class/dma/
total 0





 

vk en

Jun 5, 2018, 4:21:23 AM
to linux-ntb
1. I could not find a CBDMA option in the BIOS; it is not there.
We have written to the manufacturer to ask where to find it.
2. Empty

Dave Jiang

Jun 5, 2018, 12:16:48 PM
to vk en, linux-ntb
On 06/05/2018 01:21 AM, vk en wrote:
> 1. I did not find the CBDMA in BIOS, it's not there.
> But we wrote to the manufacturer to find out where to find CBDMA.

That is pretty unusual. All Intel Xeons that are E5 based or higher (or whatever
the equivalent is now, thanks marketing) have ioatdma (CBDMA). It's a
matter of whether your BIOS enables it or not. If your platform
has NTB, then you should have the DMA engine as well, since both are part of
the platform storage extension features. Which Xeon platform are you on
and who is your supplier?


> 2. Empty
>
>
>
> On Monday, 4 June 2018 19:36:43 UTC+3, dave.jiang wrote:
>
>
> ^ So he has the ioatdma driver already loaded.
>
> 1. Please make sure CBDMA is enabled in your BIOS options? Probably
> under IIO section.
> 2. Please provide lspci and show if the DMA shows up? i.e.
>
> 00:04.0 System peripheral: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core
> i7 DMA Channel 0 (rev 02)
>         Subsystem: Intel Corporation Device 35c8
>         Flags: bus master, fast devsel, latency 0, IRQ 52, NUMA node 0
>         Memory at 383ffff2c000 (64-bit, non-prefetchable) [size=16K]
>         Capabilities: [80] MSI-X: Enable+ Count=1 Masked-
>         Capabilities: [90] Express Root Complex Integrated Endpoint,
> MSI 00
>         Capabilities: [e0] Power Management version 3
>         Kernel driver in use: ioatdma
>         Kernel modules: ioatdma
>
>

vk en

Jun 6, 2018, 5:06:48 AM
to linux-ntb


On Tuesday, 5 June 2018 19:16:48 UTC+3, dave.jiang wrote:
On 06/05/2018 01:21 AM, vk en wrote:
> 1. I did not find the CBDMA in BIOS, it's not there.
> But we wrote to the manufacturer to find out where to find CBDMA.

That is pretty unusual. All Intel E5 or higher based Xeon (or whatever
the equivalent is now, thanks marketing) has ioatdma (CBDMA). It's a
matter of whether your BIOS enable it or not. Of course if your platform
has NTB, then you should have the DMA engine as well as they are part of
the platform storage extension features. Which Xeon platform are you on
and who is your supplier?
 System Gooxi, Intel® Xeon® E5-2603 v4

Dave Jiang

Jun 6, 2018, 12:46:02 PM
to vk en, linux-ntb
Ok, yes. You need to ask your vendor how to enable the CBDMA devices in
your BIOS. I'm surprised the option is not there, since it's a
standard feature for E5 and is enabled in the base BIOS code that Intel
provides to BIOS vendors.

vk en

Jun 8, 2018, 5:23:17 AM
to linux-ntb
Hey.

1. I switched to a new system.
It has 2 controllers with 2 processors each (the previous one had 2 controllers with one processor each).

2. I successfully created the ntrdma device, without any errors.

Problems:
1. If only one controller is running, the link status in ntb_hw_intel still shows Up.
That should not happen, since the link needs both controllers connected. (On the first system this worked correctly: if I turned off one controller or switched off ntb_transport, the link status was Down.)
Why is this so?

2. I successfully created the device, but I could not connect the clusters with each other.
NTRDMA also created an interface, but after configuring it on both sides there is no ping between the clusters.

What have I done wrong, or is there something I forgot to do?

Allen Hubbe

Jun 9, 2018, 9:56:49 PM
to vk en, linux-ntb
On Fri, Jun 8, 2018 at 5:23 AM vk en <yato...@gmail.com> wrote:
>
> Hey.
>
> 1. I changed the system to a new one
> it has 2 controllers with 2 processors each. (before 2 controllers one processor each)
>
> 2. I successfully created ntrdma device, without any errors.
>
> Problems:
> 1. If only 1 controller works, then the Status link in ntb_hw_intel still shows UP.
> So it should not be, since I need to connect 2 controllers. (On the first system, this worked correctly, if I turned off one controller or switched off (ntb_transport), then the link had the status of DOWN)
> Why is this so?

What is meant by controller?

> 2. I successfully created the device, but I did not manage to connect the clusters with each other.
> NTRDMA also created an interface, but after their configuration, there is no ping between clusters.
>
> What have I done wrong? or can I be something I forgot to do?

More information would help. Can you share the debugfs "info" files
for ntb_hw_intel, ntc, and ntrdma?

vk en

Jun 13, 2018, 7:27:10 AM
to linux-ntb

> What is meant by controller?
The SAN has 2 units (controller = unit).

> More information would help. Can you share the debugfs "info" files
> for ntb_hw_intel, ntc, and ntrdma?

We found some problems at the physical level. Once they are fixed, we will try again.