Issue with booting Linux

KARIM

Feb 16, 2021, 3:33:35 PM

to black-parrot

Hello,

I am writing this post because I am having issues booting Linux on BlackParrot on an Alveo U280 card.

I am using the vu37p project available in [1].

(1) This project allows me to load the kernel through PCIe, which is connected to an XDMA bridge with an AXI-Lite interface. To write through such an interface, the kernel should be 32-bit aligned. Before jumping to Linux, I decided to start with a simple program such as bubble sort. After compiling the program I end up with an NBF file. Every line in the NBF file has the following format (e.g., 03_0000200001_0000000000000001). On the host side I wrote a script that parses each line and writes it through DMA to address 0x10 in the following way:

data_1="0x"$(echo $line | cut -c23-30)

data_2="0x"$(echo $line | cut -c15-22)

data_3="0x"$(echo $line | cut -c6-13)

data_4="0x0000"$(echo $line | cut -c1-2)"00"

dma-ctl qdma08000 reg write bar 2 0x10 $data_1

dma-ctl qdma08000 reg write bar 2 0x10 $data_2

dma-ctl qdma08000 reg write bar 2 0x10 $data_3

dma-ctl qdma08000 reg write bar 2 0x10 $data_4
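For reference, the four `cut` expressions above can be written as one parsing step. This is a minimal Python sketch, assuming the opcode_address_data layout of the example line (2, 10 and 16 hex digits). The name `split_nbf_line` is made up for illustration and is not part of BlackParrot or the host tools. Like the script above, it hard-codes the low byte of the fourth word to 00 and drops the top two address digits (characters 4-5).

```python
def split_nbf_line(line):
    """Split one NBF line into the four 32-bit words written by the script above."""
    opcode, addr, data = line.strip().split("_")
    if (len(opcode), len(addr), len(data)) != (2, 10, 16):
        raise ValueError("unexpected NBF line layout: %r" % line)
    data_val = int(data, 16)
    word1 = data_val & 0xFFFFFFFF        # cut -c23-30: low half of the data
    word2 = data_val >> 32               # cut -c15-22: high half of the data
    word3 = int(addr, 16) & 0xFFFFFFFF   # cut -c6-13: low 8 address digits
    word4 = int(opcode, 16) << 8         # "0x0000" + opcode + "00"
    return [word1, word2, word3, word4]


if __name__ == "__main__":
    for word in split_nbf_line("03_0000200001_0000000000000001"):
        print("0x%08x" % word)
```

Parsing every line inside one process, instead of spawning `cut` four times per line, may also reduce the per-line overhead of the loader.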

After the program is loaded, I issue multiple read commands to address 0x20 and print the output on the console (eg: dma-ctl qdma08000 reg read bar 2 0x20).

For a simple program such as bubble sort I am able to see correct output, matching the simulation with Verilator. With Linux, however, there are two problems. First, it takes about 4 hours to load (mainly due to line parsing and line-by-line kernel loading). Second, when I issue a read command after loading the kernel, I only get deadbeef.

I obtained the Linux kernel by typing “make linux” in bp_common/test, then converted the .riscv file to .nbf using the makefiles available in the BlackParrot git repository.

What I want to ask is whether the approach I took should allow me to boot Linux, and whether I am doing things right. What is confusing is that I was able to run a simple program but not the Linux kernel, even though both were compiled using the same method.

(2) On a different note, I subsequently discovered that you are using Litex to boot Linux, with simple instructions in the README. I followed the instructions up to the FPGA command, where I believe I have to modify genesys2.py with the pinout of the Alveo U280. Here I would like to ask a question.

The instructions give the path as LITEX/litex/boards/genesys2.py, but there are two genesys2.py files, both located one level further down the hierarchy, i.e., in LITEX/litex/boards/platforms and LITEX/litex/boards/targets. I would really appreciate it if you could indicate which one is meant, so that I can modify it accordingly.

[1] https://github.com/black-parrot-examples/bsg_fpga/tree/master/vu37p

Thanks for taking the time to read my post.

Prof. Michael Taylor

Feb 17, 2021, 12:40:08 AM

to Sadullah Canakci, KARIM, black-parrot

Sadullah, can you advise regarding Litex?

--
You received this message because you are subscribed to the Google Groups "black-parrot" group.
To unsubscribe from this group and stop receiving emails from it, send an email to black-parrot...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/black-parrot/64b1f392-2a6f-405d-994e-c65dc29fbdbfn%40googlegroups.com.

--

Apologies for all typos and autocorrects. Message sent by combination of an approximate neural network and a smartphone.

Sadullah Canakci

Feb 17, 2021, 2:40:20 PM

to black-parrot

Hi Karim,

I can answer your second question. Firstly, please take a look at the following link that describes the steps for booting up Linux using Litex+BlackParrot.

https://github.com/black-parrot/litex/tree/working_linux

Since BlackParrot and Litex are under active development, we compiled a set of compatible commits under that repo. Regarding your path question, it is true that I did not provide the proper path there. It should be boards/targets/genesys2.py if you want to generate the bitstream (I will update the README). However, I am not sure whether modifying only targets/genesys2.py is enough to achieve your goal (i.e., adapting the pinout for the Alveo U280). I used the genesys2.py scripts as they are, without modification, and I never spent the time to understand the steps needed to adapt pinouts.

Hope this helps,

Sadullah

KARIM

Feb 17, 2021, 3:23:37 PM

to black-parrot

Hi Sadullah,

Thanks for clarifying and for the feedback.

Do you think I could get some feedback regarding my first question?

What I am trying to find out is whether I should be able to boot Linux using just the project in [1].

I was able to run simple programs, i.e., load instructions and read from IO, but when it comes to Linux I do not get any response from IO.

Best regards

[1] https://github.com/black-parrot-examples/bsg_fpga/tree/master/vu37p

Sadullah Canakci

Feb 17, 2021, 3:30:44 PM

to KARIM, Dan Petrisko, black-parrot

Hi,

You are welcome. It is actually a project whose details Dan may be aware of. Is that the case, @Dan Petrisko?

Regards,

Sadullah

Dan Petrisko

Feb 17, 2021, 3:42:23 PM

to Sadullah Canakci, KARIM, black-parrot

Hi Karim,

Your setup looks generally correct. We take an additional step in our build, which is to convert the NBF file to a word-aligned series of accesses and then DMA them using the scripts found here: https://github.com/gaozihou/dma_ip_drivers

Perhaps the larger program size exposes a bug in your DMA approach that the bubble sort program does not trigger?

Best,

Dan

Prof. Michael Taylor

Feb 17, 2021, 5:05:25 PM

to Dan Petrisko, KARIM, Sadullah Canakci, black-parrot

Hi,

Just a follow up — we are going to break out the unique pieces of our Xilinx repo fork into some BP/bsg repos. I was not aware that we had some stuff hanging out in here!

M

Dan Petrisko

Feb 17, 2021, 7:17:45 PM

to Sadullah Canakci, KARIM, black-parrot

Hi Karim,

I went through and updated the dma_ip_drivers repo to be a little easier to use. This is closer to a push-button flow. Pardon our dust while we rearrange and cleanup these repos to be more user-facing!

git clone g...@github.com:black-parrot-examples/bsg_fpga.git

cd bsg_fpga/vu37p

make

make generate_bitstream program_fpga

cd black-parrot

make prep -j 20

make -C bp_common/test/ linux

git clone https://github.com/gaozihou/dma_ip_drivers

cd dma_ip_drivers/XDMA/linux-kernel/tools

make linux.run BLACKPARROT_DIR=<black-parrot-directory>

Best, Dan

Dan Petrisko

Feb 17, 2021, 7:19:13 PM

to Sadullah Canakci, KARIM, black-parrot

And note that because of the nature of the XDMA driver, you will have to reboot the PC after programming the FPGA.

Glenn Baxter

Feb 17, 2021, 7:30:44 PM

to Dan Petrisko, Sadullah Canakci, KARIM, black-parrot

It is not strictly true that you have to reboot. If you are emplacing the same image, it does not require a reboot. If you change things post XDMA, it usually does not require a reboot. We use partial reconfig as well as full config. The only time we’ve really had to reboot is when we change the PCIe mem map significantly. YMMV. Experiment to see :)

KARIM

Feb 18, 2021, 11:54:01 AM

to black-parrot

Hi,

Thanks Dan for the instructions you put together and everybody for the feedback.

I have checked the process that you suggested. It seems quite similar to what I am doing, except that I am using QDMA instead of XDMA and an old BlackParrot commit.

I would like to follow the steps you suggested without much change (except for the FPGA pinout), but I am facing issues with the BlackParrot commit referenced in the Makefile of vu37p. I couldn't check out commit 88876ed; I get the error "error: pathspec '88876ed' did not match any file(s) known to git". And if I clone the repo separately (i.e., the master branch) and place it in the vu37p folder, bp_fpga.tcl can't find the following files:

bp_be/src/v/bp_be_checker/bp_be_scoreboard.sv

bp_top/src/include/bp_top_defines.svh

bp_be/src/include/bp_be_defines.svh

bp_me/src/include/bp_me_defines.svh

Best regards

Dan Petrisko

Feb 18, 2021, 3:57:51 PM

to KARIM, black-parrot

The commit is strongly coupled to the file list. The commits currently in the makefile should work.

KARIM

Feb 18, 2021, 5:57:45 PM

to black-parrot

Hi Dan,

Perhaps I didn't clearly deliver what I wanted to say.

The makefile in vu37p clones the BlackParrot repo and then checks out the commit with ID 88876ed.

It seems that this command (i.e., git checkout 88876ed) in the vu37p makefile doesn't work, so I cannot use that commit while building the FPGA example.

Thanks. I appreciate your time

Dan Petrisko

Feb 18, 2021, 6:03:34 PM

to KARIM, black-parrot

Hi Karim,

Yes, sorry, I was unclear. By "currently in the makefile" I meant that I updated the pointers yesterday; they were stale. If you pull bsg_fpga again you can see the latest commits, which I have verified to work.

https://github.com/black-parrot-examples/bsg_fpga/blob/master/vu37p/Makefile#L16

KARIM

Feb 21, 2021, 2:58:36 PM

to black-parrot

Hi Dan,

Thanks for clarifying. I have tried to boot Linux using the new branch.

It seems that the BlackParrot branch referenced in the Makefile of vu37p generates a slightly different kernel, which started to boot but got stuck at "Freeing unused kernel memory: 5760K". I have attached screenshots of the boot messages. I am not sure if it has something to do with the kernel itself.

Best regards

Attachments: 3.png, 1.png, 2.png

Dan Petrisko

Feb 21, 2021, 7:47:25 PM

to KARIM, black-parrot

Shot in the dark but when I've encountered that hang in the past, it was a spurious I$ fetch to an uncached address. In our host program, we simply ignore these requests and send 0 as a loopback, preventing deadlock: https://github.com/gaozihou/dma_ip_drivers/blob/master/XDMA/linux-kernel/tools/nbf.c#L299. In general, I would suggest looking carefully at our host tools (https://github.com/gaozihou/dma_ip_drivers/tree/master/XDMA/linux-kernel/tools) to see where they diverge from your own. Not to say it's impossible that the hardware has a bug, but it has been validated in both FPGA and simulation under a variety of timing conditions so if you're using the same RTL and same project setup with a different board and different host, I would lean towards blaming the host.

- Dan

KARIM

Feb 25, 2021, 2:23:41 PM

to black-parrot

Hi Dan,

I have switched to fully using the scripts you provided (i.e., the host tools you shared, including nbf.c, with XDMA).

Below I list all the changes I made to adapt the design to the Alveo U280 card. I think these changes shouldn't cause the freeze problem I am still facing (i.e., "Freeing unused kernel memory: 5760K"). I have attached screenshots of the beginning and end of what comes out on the terminal. If you also agree that these changes shouldn't cause the freeze, I am not sure what could have gone wrong. I would really appreciate any feedback you might provide.

On another note, I would like to ask about the boot process. After examining the RTL, I realized that when the instructions are loaded through NBF, they go directly to L2$. The instructions then move up the memory hierarchy to be executed. At the same time the instructions are looped back to HBM. The instructions are fetched from HBM only when the program cannot fit in L2$. My conjecture is based on the fact that when I disable the HBM in the FPGA example, or the DRAM in simulation, and run a simple program, it executes without any problem. This seems unorthodox, because traditionally, when booting Linux or running a simple program, a bootloader loads the whole kernel into memory, and only upon completion do the instructions start to be fetched and executed. Did I understand the boot process correctly, or am I missing something?

Best regards

------------ Modifications done on vu37p to adapt to au280 ----------------

1) design_1_wrapper.v

Disabled the following (because I do not know where they should map in au280 and do not seem critical):

PCIE1_FPGA_CPERSTN,PCIE0_FPGA_CPRSNT,PCIE0_FPGA_CPWRON,PCIE0_FPGA_CWAKE

PCIE0_SWITCH,PCIE1_FPGA_CPRSNT,PCIE1_FPGA_CPWRON,PCIE1_FPGA_CWAKE, PCIE1_SWITCH

Changed the following:

pcie_perstn = !PCIE0_FPGA_CPERSTN —> pcie_perstn = PCIE0_FPGA_CPERSTN; (because PCIE0_FPGA_CPERSTN pinout is set to active low pin)

2) bp_fpga.tcl

Changed the following to avoid compilation errors and to adapt to alveo u280

CONFIG.pcie_blk_locn {PCIE4_X1Y0} —> CONFIG.pcie_blk_locn {PCIE4_X0Y1}

CONFIG.select_quad {GTY_Quad_233} —> CONFIG.select_quad {GTY_Quad_227}

xcvu37p-fsvh2892-2LV-e —> xcu280-fsvh2892-2L-e

3) design_1.xdc

# PCIe Refclk

set_property PACKAGE_PIN AR14 [get_ports {pcie_refclk_clk_n}]

set_property PACKAGE_PIN AR15 [get_ports {pcie_refclk_clk_p}]

create_clock -period 10.000 -name pcie_refclk_n [get_ports {pcie_refclk_clk_n}]

create_clock -period 10.000 -name pcie_refclk_p [get_ports {pcie_refclk_clk_p}]

# PCIe x4 channel

# First step: reset locations to default

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_rxn[0]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_rxn[1]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_rxn[2]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_rxn[3]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_rxp[0]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_rxp[1]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_rxp[2]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_rxp[3]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_txn[0]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_txn[1]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_txn[2]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_txn[3]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_txp[0]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_txp[1]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_txp[2]}]

set_property PACKAGE_PIN {} [get_ports {pci_express_x4_txp[3]}]

# Second step: set new locations

set_property PACKAGE_PIN AL1 [get_ports {pci_express_x4_rxn[0]}]

set_property PACKAGE_PIN AM3 [get_ports {pci_express_x4_rxn[1]}]

set_property PACKAGE_PIN AN5 [get_ports {pci_express_x4_rxn[2]}]

set_property PACKAGE_PIN AN1 [get_ports {pci_express_x4_rxn[3]}]

set_property PACKAGE_PIN AL2 [get_ports {pci_express_x4_rxp[0]}]

set_property PACKAGE_PIN AM4 [get_ports {pci_express_x4_rxp[1]}]

set_property PACKAGE_PIN AN6 [get_ports {pci_express_x4_rxp[2]}]

set_property PACKAGE_PIN AN2 [get_ports {pci_express_x4_rxp[3]}]

set_property PACKAGE_PIN AL10 [get_ports {pci_express_x4_txn[0]}]

set_property PACKAGE_PIN AM8 [get_ports {pci_express_x4_txn[1]}]

set_property PACKAGE_PIN AN10 [get_ports {pci_express_x4_txn[2]}]

set_property PACKAGE_PIN AP8 [get_ports {pci_express_x4_txn[3]}]

set_property PACKAGE_PIN AL11 [get_ports {pci_express_x4_txp[0]}]

set_property PACKAGE_PIN AM9 [get_ports {pci_express_x4_txp[1]}]

set_property PACKAGE_PIN AN11 [get_ports {pci_express_x4_txp[2]}]

set_property PACKAGE_PIN AP9 [get_ports {pci_express_x4_txp[3]}]

# PCIe Sideband

set_property PACKAGE_PIN BH26 [get_ports {PCIE0_FPGA_CPERSTN}]

set_property IOSTANDARD LVCMOS18 [get_ports {PCIE0_FPGA_CPERSTN}]

# Reset

set_property PACKAGE_PIN L30 [get_ports {rstn}]

set_property IOSTANDARD LVCMOS18 [get_ports {rstn}]
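The PCIe lane constraints above are repetitive enough that they could be generated rather than typed. This is a minimal Python sketch that reproduces the two steps (reset, then assign) from a pin table copied from the listing above. The names `PCIE_PINS` and `pcie_lane_constraints` are made up for illustration, and the sketch is not part of the bsg_fpga flow.

```python
# Pin assignments copied from the constraints listed above.
PCIE_PINS = {
    "rxn": ["AL1", "AM3", "AN5", "AN1"],
    "rxp": ["AL2", "AM4", "AN6", "AN2"],
    "txn": ["AL10", "AM8", "AN10", "AP8"],
    "txp": ["AL11", "AM9", "AN11", "AP9"],
}


def pcie_lane_constraints(pins, prefix="pci_express_x4"):
    """Return XDC lines that first clear, then set, every lane's PACKAGE_PIN."""
    lines = ["# First step: reset locations to default"]
    for group, lane_pins in pins.items():
        for lane in range(len(lane_pins)):
            lines.append("set_property PACKAGE_PIN {} [get_ports {%s_%s[%d]}]"
                         % (prefix, group, lane))
    lines.append("# Second step: set new locations")
    for group, lane_pins in pins.items():
        for lane, pin in enumerate(lane_pins):
            lines.append("set_property PACKAGE_PIN %s [get_ports {%s_%s[%d]}]"
                         % (pin, prefix, group, lane))
    return lines


if __name__ == "__main__":
    print("\n".join(pcie_lane_constraints(PCIE_PINS)))
```

Keeping the pins in one table makes it easier to compare them against the board's pinout file when porting to another card.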

Attachments: 1.png, 2.png

Dan Petrisko

Feb 25, 2021, 8:22:43 PM

to KARIM, black-parrot

Hey Karim,

Those changes look fine to me.

It sounds like you understand the boot process. The NBF essentially replaces the zeroth-stage bootloader in this system. So each word is written to the L2, spilling to HBM when over capacity. The NBF also writes some configuration registers in the core, one of which is the 'freeze register', which halts the core. When the entire program is loaded into the L2/HBM, the nbf loader will send a message to unfreeze the core and start execution. However, even if the entire program can fit into L2, the L2 is write-allocate so it will still access HBM during the boot process.
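The ordering described above (freeze, load, unfreeze) can be sketched as a toy model. This is a minimal Python illustration, not BlackParrot code: `ToyCore` and `nbf_style_load` are hypothetical names, memory is a plain dictionary, and the only detail taken from the thread is the DRAM base address 0x8000_0000 mentioned elsewhere in it. The 8-byte stride matches the 64-bit data field of an NBF line but is otherwise an assumption.

```python
DRAM_BASE = 0x80000000  # DRAM base address given elsewhere in this thread


class ToyCore:
    """Hypothetical stand-in for a core with a freeze register."""

    def __init__(self):
        self.frozen = False
        self.memory = {}  # stands in for L2 plus HBM

    def fetch(self, addr):
        if self.frozen:
            raise RuntimeError("core is frozen; nothing executes during load")
        return self.memory[addr]


def nbf_style_load(core, words):
    """Freeze the core, write every word, then unfreeze to start execution."""
    core.frozen = True                    # configuration write: halt the core
    for index, word in enumerate(words):  # program load, one 64-bit word at a time
        core.memory[DRAM_BASE + 8 * index] = word
    core.frozen = False                   # final message: unfreeze the core


if __name__ == "__main__":
    core = ToyCore()
    nbf_style_load(core, [0x11, 0x22, 0x33])
    print(hex(core.fetch(DRAM_BASE + 8)))
```

The point of the model is only that no fetch can happen before the last word has been written, which is the role a zeroth-stage bootloader would otherwise play.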

>At the same time the instructions are looped back to HBM.

Could you explain what you mean by this?

Best,
Dan

Glenn Baxter

Feb 25, 2021, 9:51:09 PM

to Dan Petrisko, KARIM, black-parrot

Hi Dan,

Assuming there is another way to load HBM, can you please explain in detail what specific things must be written (address and data content of every write) to bring Linux up?

I ask because I am very close to having a U280 design where XDMA has access to 40 GB of memory (4 GB HBM + 4 GB HBM + 16 GB DDR4 + 16 GB DDR4), along with a Xilinx 16550 UART and a Xilinx 1G/10G/25G Ethernet solution across the QSFP+ connector.

Once I resolve the 64-bit BAR not being able to get >1GB window in PCIe (need UEFI Ubuntu), then I can test the new bitfile without crashing the PC. That is nearly done, so the next step will be to get the detailed instructions as to where you are placing the linux.riscv file in HBM and then how you tell BPs to wake up and boot.

Thank you,

Glenn

Dan Petrisko

Feb 26, 2021, 12:16:53 AM

to Glenn Baxter, KARIM, black-parrot

Hi Glenn,

The boot sequence, including configuration, is generated entirely by this script: https://github.com/black-parrot/black-parrot/blob/dev/bp_common/software/py/nbf.py

The DRAM base address in the system is 0x8000_0000

KARIM

Feb 26, 2021, 10:37:25 AM

to black-parrot

Hi Dan,

What I meant by "At the same time the instructions are looped back to HBM" is this: the first instructions (i.e., the first ones to make it into L2$) would not go to HBM right away. Later in the execution they might be evicted but still be needed, so they have to be stored somewhere, and that should be HBM. So the instructions that make it into L2$ at the beginning should also be stored in HBM; the only question is when that happens. Either it happens at the same time as they are loaded into L2$, as part of the boot-loading process (this is what I meant by "looped back"), or it happens later, when they are about to be evicted. Since it is a cache, and please correct me if I am wrong, I am leaning towards the latter.

Concerning booting Linux, thanks for checking the modifications I made. For now I will try to find out what happens on the FPGA by comparing it to the simulation results. However, when I check the host (bp_nonsynth_host.sv), I can't see how spurious I$ fetches to uncached addresses are handled (i.e., sending back 0 as a loopback). Yet it works.

Anyhow, I believe the simulation output shown below is normal operation.

[ 1.873380] NET: Registered protocol family 17

[ 2.008937] Freeing unused kernel memory: 5760K

[ 2.009833] Run /init as init process

Starting syslogd: OK

Starting klogd: OK

Running sysctl: OK

Starting network: OK

Running test.sh

Stopping network: OK

Stopping klogd: OK

Stopping syslogd: OK

umount: can't unmount /: Invalid argument

The system is going down NOW!

Sent SIGTERM to all processes

--> test.sh is the shell script in bp_common/test/cfg/test.sh (by default it's executing "poweroff").

I believe I could modify the script to test other linux commands

Best regards

Dan Petrisko

Feb 26, 2021, 12:42:57 PM

to KARIM, black-parrot

Yep, it'll happen when it's about to be evicted, which may happen during boot or may happen later.
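The behavior discussed here (a write miss still touches backing memory, and dirty data reaches it only on eviction) can be illustrated with a toy cache. This is a minimal Python sketch of a generic write-back, write-allocate cache with LRU replacement. It is not a model of the BlackParrot L2: the class name, the single-word lines, and the dictionary standing in for HBM are all assumptions made for illustration.

```python
from collections import OrderedDict


class ToyWriteBackCache:
    """Fully associative, LRU, write-back, write-allocate cache (toy model)."""

    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing      # dict standing in for HBM
        self.lines = OrderedDict()  # addr -> (value, dirty), oldest first
        self.backing_reads = 0
        self.backing_writes = 0

    def _allocate(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)  # hit: mark as most recently used
            return
        if len(self.lines) >= self.capacity:
            victim, (value, dirty) = self.lines.popitem(last=False)
            if dirty:                     # write-back happens only on eviction
                self.backing[victim] = value
                self.backing_writes += 1
        # Write-allocate: a miss fetches the line from backing memory first.
        # (A real line is wider than one write, so the rest must be fetched.)
        self.backing_reads += 1
        self.lines[addr] = (self.backing.get(addr, 0), False)

    def write(self, addr, value):
        self._allocate(addr)
        self.lines[addr] = (value, True)

    def read(self, addr):
        self._allocate(addr)
        return self.lines[addr][0]


if __name__ == "__main__":
    hbm = {}
    cache = ToyWriteBackCache(capacity=2, backing=hbm)
    cache.write(0, "insn0")
    cache.write(1, "insn1")
    print(cache.backing_reads, cache.backing_writes, hbm)
    cache.write(2, "insn2")  # evicts address 0, which is dirty
    print(cache.backing_reads, cache.backing_writes, hbm)
```

In this model a program that fits in the cache never writes to backing memory, yet every first write to a line still reads from it.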

You can see the loopback here. Basically, bp_nonsynth_host.sv will always send back a response, regardless of whether the request explicitly maps to anything in the host.

https://github.com/black-parrot/black-parrot/blob/master/bp_top/test/common/bp_nonsynth_host.sv#L111

Awesome work! Modifying that script should allow you to run whatever you like; something like a bash terminal will let you run programs interactively (more useful on FPGA).

- Dan

KARIM

Mar 2, 2021, 1:57:50 PM

to black-parrot

Hi Dan,

Thanks for the feedback. I appreciate it.

Currently I have a working Linux kernel on the simulation side, but I am still experiencing a kernel freeze on the FPGA. So I compared the host’s behavior on the FPGA (i.e., nbf.c) with the host’s behavior in the simulation project (i.e., bp_nonsynth_host), and I found a few differences that I cannot explain.

You mentioned that the freeze happens because of a spurious I$ fetch to an uncached address, and that it is handled by sending 0 as a loopback. However, the last reads I get before the freeze are to address 0x00100000, which is handled by the condition else if ((addr_result>>12) == 0x100) in nbf.c; that branch writes 0xFFFFFFFF twice, not 0. Below I have included the log of the last addresses I get before the freeze.

On the simulation side, in contrast, the response is handled by assign domain_io_resp_lo = '{header: io_cmd_lo.header, data: '0}; in bp_nonsynth_host, which returns the same header (and therefore the same address) as the one that came in, and always 0 for data. When address 0x00100000 comes in, pop(); is called, which is a C++ implementation of a queue pop. So, unlike in nbf.c, receiving this address does not prompt writing back 0xFFFFFFFF twice.

I believe the header is handled the same way in the FPGA project, in bp_stream_mmio, but the data is handled differently (i.e., what is sent through the DMA is either 0 or FFFFFFFF, depending on the address).

That is my understanding and please correct me if I am missing something here.
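To make the comparison concrete, here is a minimal Python sketch of the two read-response policies as they are described in this post. It illustrates that description only and is not code taken from nbf.c or bp_nonsynth_host. The function names are made up, each response is shown as one 64-bit value (an assumption), and every address outside the 0x100 page is treated as the zero-loopback case, which is a simplification.

```python
GETCHAR_PAGE = 0x100  # (addr >> 12) value tested in the nbf.c condition quoted above


def fpga_host_read_data(addr):
    """Data returned by the FPGA host (nbf.c), as described in this post."""
    if (addr >> 12) == GETCHAR_PAGE:
        # 0xFFFFFFFF written twice, shown here as one 64-bit value (assumption)
        return 0xFFFFFFFFFFFFFFFF
    return 0  # zero loopback for everything else (simplification)


def sim_host_read_data(addr):
    """Data returned by bp_nonsynth_host, as described in this post: always 0."""
    return 0


def diverging_addresses(addrs):
    """Return the addresses for which the two hosts answer differently."""
    return [a for a in addrs
            if fpga_host_read_data(a) != sim_host_read_data(a)]


if __name__ == "__main__":
    for addr in diverging_addresses([0x100000, 0x101000, 0x200000]):
        print(hex(addr))
```

Under these assumptions the only address on which the two hosts disagree is 0x00100000, the one that appears repeatedly in the log below.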

I am hoping this comparison could shed some light on what might have gone wrong causing the kernel freeze on FPGA.

Best regards

------ LOG when running nbf.c -------

read_result = 2

addr_result = 101000

data_result = a

read_result = 2

addr_result = 100000

data_result = a

ff twice

read_result = 2

addr_result = 100000

data_result = a

ff twice

read_result = 2

addr_result = 100000

data_result = a

ff twice

read_result = 2

addr_result = 100000

data_result = a

ff twice

read_result = 2

addr_result = 100000

data_result = a

ff twice

read_result = 2

addr_result = 100000

data_result = a

ff twice

read_result = 2

addr_result = 100000

data_result = a

ff tw

Glenn Baxter

Mar 2, 2021, 2:41:22 PM

to KARIM, black-parrot

Hi Karim,

You are much further ahead than I am with the U280. I have a 4x BP design built for the U280, but I cannot get the PCIe to respond. I’ve done numerous tests to find the issue and have narrowed down a few things. But I am wondering if you could share your design, or at least the BD and PNGs of the XDMA settings you are using. I’d like to get to at least where you are. Also, do you know if you have an ES U280 or a production one? There is an erratum for the ES that discusses issues in HBM when going over specific address limits. That *might* be a hint to the issue you are seeing below, should you have an ES board.

Thank you,

Glenn

Glenn Baxter

Mar 2, 2021, 4:55:15 PM

to KARIM, black-parrot

Gentlemen,

One other point of interest, and your data could be VERY helpful. I *think* I may have uncovered an error in Xilinx XDMA driver. The device driver repo that was supplied as ‘working’ with the BP repo is in fact out of date. I know this because when trying to compile it gets odd messages for the kernel:

glenn@xxxx:~/work/black-parrot/dma_ip_drivers/XDMA/linux-kernel/xdma(master)$ make
Makefile:10: XVC_FLAGS: .
make -C /lib/modules/5.4.0-66-generic/build M=/home/glenn/work/black-parrot/dma_ip_drivers/XDMA/linux-kernel/xdma modules
make[1]: Entering directory '/usr/src/linux-headers-5.4.0-66-generic'
/home/glenn/work/black-parrot/dma_ip_drivers/XDMA/linux-kernel/xdma/Makefile:10: XVC_FLAGS: .
CC [M] /home/glenn/work/black-parrot/dma_ip_drivers/XDMA/linux-kernel/xdma/libxdma.o
/home/glenn/work/black-parrot/dma_ip_drivers/XDMA/linux-kernel/xdma/libxdma.c: In function ‘engine_start’:
/home/glenn/work/black-parrot/dma_ip_drivers/XDMA/linux-kernel/xdma/libxdma.c:642:2: error: implicit declaration of function ‘mmiowb’ [-Werror=implicit-function-declaration]
mmiowb();
^~~~~~
cc1: some warnings being treated as errors
scripts/Makefile.build:269: recipe for target '/home/glenn/work/black-parrot/dma_ip_drivers/XDMA/linux-kernel/xdma/libxdma.o' failed
make[2]: *** [/home/glenn/work/black-parrot/dma_ip_drivers/XDMA/linux-kernel/xdma/libxdma.o] Error 1
Makefile:1760: recipe for target '/home/glenn/work/black-parrot/dma_ip_drivers/XDMA/linux-kernel/xdma' failed
make[1]: *** [/home/glenn/work/black-parrot/dma_ip_drivers/XDMA/linux-kernel/xdma] Error 2
make[1]: Leaving directory '/usr/src/linux-headers-5.4.0-66-generic'
Makefile:27: recipe for target 'all' failed
make: *** [all] Error 2

glenn@xxxx:~/work/black-parrot/dma_ip_drivers/XDMA/linux-kernel/xdma(master)$ uname -a
Linux xxxx 5.4.0-66-generic #74~18.04.2-Ubuntu SMP Fri Feb 5 11:17:31 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

There is some change beginning in the 5.3 kernel that Xilinx had to address with respect to kernel headers. I have confirmed that it fails on Ubuntu 18.04.2 with kernel 5.4.0-66. It also fails on a machine running Ubuntu 19.10 with a 5.3 kernel:

glenn@hal:~$ uname -a
Linux hal 5.3.0-46-lowlatency #38 SMP PREEMPT Tue Apr 21 18:56:08 MDT 2020 x86_64 x86_64 x86_64 GNU/Linux

The typical symptom you will see is where it fails to detect the config BAR. This *ONLY* occurs when M_AXI_LITE is enabled (e.g. BAR0)

dmesg | grep xdma
[ 6.201409] xdma: loading out-of-tree module taints kernel.
[ 6.231944] xdma:xdma_mod_init: Xilinx XDMA Reference Driver xdma v2020.1.8
[ 6.231946] xdma:xdma_mod_init: desc_blen_max: 0xfffffff/268435455, timeout: h2c 10 c2h 10 sec.
[ 6.235145] xdma:xdma_device_open: xdma device 0000:42:00.0, 0x00000000b4b9ec63.
[ 6.235427] xdma:map_single_bar: BAR0 at 0x82300000 mapped at 0x000000009c213a7c, length=1048576(/1048576)
[ 6.639077] xdma:map_single_bar: BAR1 at 0x82400000 mapped at 0x000000002a573bdb, length=65536(/65536)
[ 7.042563] xdma:map_bars: Failed to detect XDMA config BAR
[ 7.453031] xdma:probe_one: pdev 0x00000000b4b9ec63, err -22.
[ 7.453037] xdma:xpdev_free: xpdev 0x0000000021856d67, destroy_interfaces, xdev 0x00000000787221fd.
[ 7.453038] xdma:xpdev_free: xpdev 0x0000000021856d67, xdev 0x00000000787221fd xdma_device_close.
[ 7.453095] xdma: probe of 0000:42:00.0 failed with error -22

Can you please share what version of what OS you are using and what kernel? That will help a lot. We have a CentOS machine with a very old kernel 3.10.0-1127.19.el7.x86_64 which does not appear to have this issue. One can argue it is the OS, but I am betting it is the kernel header changes which began circa 5.3.
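When collecting reports like these, it can help to compare kernel release strings mechanically. This is a minimal Python sketch that parses a `uname -r` style string and flags releases at or above 5.3. The 5.3 threshold is only the suspicion stated in this post, not a confirmed boundary, and the names `kernel_version`, `is_suspect` and `SUSPECT_FROM` are made up for illustration.

```python
import re

# Assumption taken from this post: trouble is suspected to begin "circa 5.3".
SUSPECT_FROM = (5, 3, 0)


def kernel_version(release):
    """Extract (major, minor, patch) from a release string such as '5.4.0-66-generic'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def is_suspect(release):
    """True if the release is at or above the suspected threshold."""
    return kernel_version(release) >= SUSPECT_FROM


if __name__ == "__main__":
    # Release strings reported in this post.
    for release in ("5.4.0-66-generic",
                    "5.3.0-46-lowlatency",
                    "3.10.0-1127.19.el7.x86_64"):
        print(release, "suspect" if is_suspect(release) else "not suspect")
```

On a Linux host, `platform.release()` from the Python standard library returns the running kernel's release string and could be passed to `is_suspect`.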

I have verified this fails on U280, U250 and U200, all identically. Any time you enable M_AXI_LITE, you're toast.

Cheers,

Glenn

Dan Petrisko

Mar 2, 2021, 9:24:53 PM

to Glenn Baxter, KARIM, black-parrot

Hey Glenn,

We're using CentOS 7.7.1908, kernel 4.19.82.

Dan Petrisko

Mar 2, 2021, 9:30:49 PM

to Glenn Baxter, KARIM, black-parrot

Hey Karim,

address 0x10_0000 is getchar, 0x10_1000 is putchar as seen here:

https://github.com/black-parrot/black-parrot/blob/master/bp_top/test/common/bp_nonsynth_host.sv#L86

We return the character from a scan() on getchar in the bp_nonsynth_host. The C code behaves similarly.

So it looks like it's polling the "uart", which is expected. I'm not sure why this would hang the PCIE.
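With that address map, the nbf.c log earlier in the thread can be decoded into getchar and putchar events. This is a minimal Python sketch, assuming the log values are printed in hexadecimal and that both devices are matched on the 4 KiB page of the address (the thread only shows that test for 0x10_0000). The helper names are made up for illustration.

```python
GETCHAR_ADDR = 0x100000  # getchar, per the address map above
PUTCHAR_ADDR = 0x101000  # putchar, per the address map above


def classify(addr):
    """Name the host device that a request address falls into."""
    page = addr >> 12
    if page == GETCHAR_ADDR >> 12:
        return "getchar"
    if page == PUTCHAR_ADDR >> 12:
        return "putchar"
    return "other"


def summarize(log_text):
    """Turn 'addr_result = X' / 'data_result = Y' pairs into (device, data) events."""
    events = []
    addr = None
    for raw in log_text.splitlines():
        key, sep, value = raw.partition("=")
        if not sep:
            continue  # lines such as "ff twice" carry no key/value pair
        key, value = key.strip(), value.strip()
        if key == "addr_result":
            addr = int(value, 16)
        elif key == "data_result" and addr is not None:
            events.append((classify(addr), int(value, 16)))
            addr = None
    return events


if __name__ == "__main__":
    sample = (
        "read_result = 2\naddr_result = 101000\ndata_result = a\n"
        "read_result = 2\naddr_result = 100000\ndata_result = a\nff twice\n"
    )
    for device, data in summarize(sample):
        print(device, hex(data))
```

Read this way, the log shows one putchar of 0x0a (a newline) followed by repeated getchar polls, which matches the description of the core polling the "uart".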

- Dan
