driver 'pcieportal.c' signature mismatch


Yaswanth Tavva

Jul 4, 2019, 7:49:46 AM
to connectal
Hi all,

I have been trying to run riscy-OOO (https://github.com/csail-csg/riscy-OOO), which uses connectal, on AWS F1, and I ran into a core dump.

Here is the output I see when I run './ubuntu.exe --core-num 1 --mem-size 2048 --ignore-user-stucks 1000000 --rom rom_core_1 --elf bbl':

////////////////////output start
ubuntu@ip-192-168-0-13:~/riscy-OOO/procs/build/RV64G_OOO.core_1.core_SMALL.cache_LARGE.weak.l1_cache_lru.check_deadlock/awsf1/bin$  ./ubuntu.exe --core-num 1 --mem-size 2048 --ignore-user-stucks 1000000 --rom rom_core_1 --elf bbl
subprocess pid 28930 completed status=0 0
[initPortalHardwareOnce:284] fd 6 len 0
[checkSignature:176] read status from '/dev/connectal' was only 0 bytes long
checkSignature: driver 'pcieportal.c' signature mismatch 9a4e6520472c69f1577083d9f60b826c 47e961a4b445762e6b6024086a3ff16b
[checkSignature:170] failed to open /dev/portalmem No such file or directory
ubuntu.exe: ../../riscv-fesvr/fesvr/elfloader.cc:27: std::map<std::__cxx11::basic_string<char>, long unsigned int> load_elf(const char*, memif_t*, reg_t*): Assertion `buf != MAP_FAILED' failed.
Aborted (core dumped)

////////////////output end


Also, here is the output of running 'ldd ubuntu.exe':


///////////////start

ubuntu@ip-192-168-0-13:~/riscy-OOO/procs/build/RV64G_OOO.core_1.core_SMALL.cache_LARGE.weak.l1_cache_lru.check_deadlock/awsf1/bin$ ldd ubuntu.exe
    linux-vdso.so.1 =>  (0x00007ffe95dd5000)
    libfesvr.so => /home/ubuntu/riscy-OOO/tools/RV64G/lib/libfesvr.so (0x00007fe1b4eac000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe1b4c8f000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fe1b490d000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fe1b46f7000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe1b432d000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe1b50d6000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fe1b4024000)

///////////end

What could be causing the signature mismatch?


Here is the flow I followed (I have been unable to set up the Bluespec compiler on AWS, so I use a local machine for the Bluespec part and transfer the generated Verilog back to AWS):

1. [AWS c4] make gen.awsf1
2. copy connectal/out and build/RV64G .../awsf1 to the local machine
3. [local machine] make verilog.awsf1
4. [AWS c4] make bits
5. [AWS F1] make exe
6. flash the FPGA image on F1
7. run 'ubuntu.exe'

Thanks!!

John Ankcorn

Jul 4, 2019, 9:36:08 PM
to Yaswanth Tavva, connectal
The 'signature mismatch' is displayed when the md5 of the pcieportal.c
_source code_ in the connectal project source tree does not match the
md5 computed for the same source code when the linux driver was
compiled (which you are using at runtime).

When in doubt, I would check out the same connectal source tree on your
AWS machine and run the following from the top level of the connectal tree:
    make pciedrivers-clean
    make pciedrivers
    make install
This will, hopefully, install an updated driver into your
/etc/modules-load.d directory, which should make the message go away.

jca

Jamey Hicks

Jul 15, 2019, 4:08:16 PM
to Yaswanth Tavva, connectal
Hi Yaswanth,

I'm back from vacation.

There is a message about the portalmem driver not being present, but I don't think that driver is actually used by the riscy-OOO software.

The failed assertion is in the code that loads an ELF executable (load_elf in riscv-fesvr/fesvr/elfloader.cc, line 27, according to the assertion message in your output).

That code does seem to check that it got a valid file descriptor, but I suggest adding an fprintf to stderr that prints errno and the fd if that call to mmap fails.
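
As a rough sketch of that kind of diagnostic, assuming the loader opens the file, calls fstat for its size, and then mmaps it read-only: the helper name map_file_verbose and its signature are hypothetical and not part of riscv-fesvr. The same fprintf calls could be placed directly before the existing assertion instead.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper (not from riscv-fesvr): map a file read-only and, if any
// step fails, print the fd, size, and errno to stderr before returning.
// Returns MAP_FAILED on error, otherwise the address of the mapping.
static void* map_file_verbose(const char* path, size_t* size_out) {
  assert(path != nullptr);

  int fd = open(path, O_RDONLY);
  if (fd == -1) {
    int saved = errno;  // save errno before any other call can overwrite it
    fprintf(stderr, "open(%s) failed: errno=%d (%s)\n",
            path, saved, strerror(saved));
    return MAP_FAILED;
  }

  struct stat s;
  if (fstat(fd, &s) < 0) {
    int saved = errno;
    fprintf(stderr, "fstat(%s) failed: fd=%d errno=%d (%s)\n",
            path, fd, saved, strerror(saved));
    close(fd);
    return MAP_FAILED;
  }

  size_t size = static_cast<size_t>(s.st_size);
  void* buf = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
  if (buf == MAP_FAILED) {
    int saved = errno;
    // On Linux, a zero-length mapping fails with EINVAL, so printing the
    // size makes an empty input file easy to spot.
    fprintf(stderr, "mmap(%s) failed: fd=%d size=%zu errno=%d (%s)\n",
            path, fd, size, saved, strerror(saved));
  }

  close(fd);  // the mapping, if any, stays valid after the fd is closed
  if (buf != MAP_FAILED && size_out != nullptr) *size_out = size;
  return buf;
}
```

Calling map_file_verbose("bbl", &size) from the directory where ubuntu.exe runs should show whether the file can be opened at all, and which errno mmap reports if it cannot be mapped.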

Hope this helps,
Jamey


