How to compile Dhrystone with the Zephyr SDK?


Tommy Thorn

Nov 19, 2018, 1:43:04 AM
to RISC-V Soft CPU Discussion
Normally compiling and running Dhrystone wouldn't cause me a headache,
but the combination of "thou must use these sources" and "thou must use
this SDK" makes this an annoying distraction.

I tried following Antti's helpful suggestions [1] (which I can't tell are ok by the rules,
but at least it's something). Unfortunately I ran into some issues:
- sysutils.c doesn't exist, but syscall.c does
- you also need encoding.h from riscv-tools
- there's no HZ macro in util.h

If I try compiling with the horrific Zephyr SDK, it can't find libraries:

$ riscv32-zephyr-elf-gcc -I/opt/zephyr-sdk/sysroots/riscv32-zephyr-elf/usr/include -march=RV32I -O3 -fno-inline *.S *.c
...
/opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/../../libexec/riscv32-zephyr-elf/gcc/riscv32-zephyr-elf/6.1.0/real-ld: cannot find crt0.o: No such file or directory
/opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/../../libexec/riscv32-zephyr-elf/gcc/riscv32-zephyr-elf/6.1.0/real-ld: cannot find crtbegin.o: No such file or directory
/opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/../../libexec/riscv32-zephyr-elf/gcc/riscv32-zephyr-elf/6.1.0/real-ld: cannot find -lgcc
/opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/../../libexec/riscv32-zephyr-elf/gcc/riscv32-zephyr-elf/6.1.0/real-ld: cannot find -lc
/opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/../../libexec/riscv32-zephyr-elf/gcc/riscv32-zephyr-elf/6.1.0/real-ld: cannot find -lgloss
collect2: error: ld returned 1 exit status

and indeed none appear to be supplied. It seems to me that the Zephyr SDK can't be used standalone as a regular compiler. Surely I'm missing something.

I'm very tempted to screw these rules as they don't make sense and they are ridiculous.

Tommy
[1] https://github.com/micro-FPGA/riscv-contest-2018/wiki/Dhrystone

Antti Lukats

Nov 19, 2018, 3:36:24 AM
to RISC-V Soft CPU Discussion
On Monday, 19 November 2018 07:43:04 UTC+1, tommy wrote:
Normally compiling and running Dhrystone wouldn't cause me a headache,
but the combination of "thou must use these sources" and "thou must use
this SDK" makes this an annoying distraction.

When my daughter splits hairs, she splits them into 16 (I do not know how to translate this into English).

I have spent a lot of time reading the rules, again and again:

1) The rules do not specify which sources you must use for Dhrystone (except 3 files, which are not sufficient). I would, however, find it logical that all files provided by the riscv benchmarking GitHub repository are used with as few changes as possible. This is possible and not that hard.

2) The rules DO NOT SAY you must use an SDK. I just ran a search for "SDK"; the word is not used. The rules say: Zephyr 1.13 must be used, and it can be a GitHub module.

Zephyr 1.13, when pulled from GitHub, does not provide any GCC; it's just not there. Also, the Zephyr project's official website says nothing about providing a cross compiler for RISC-V.

But yes, I know that there also exists something called "Zephyr SDK". It is something that Windows users CANNOT USE, as it is a binary installer for Linux only. Inside that SDK there is a GCC, but see, the rules do not say you need to use the SDK.

And now comes the split-to-32 trick: the rules do not say that you have to use the GCC provided by Zephyr (or the Zephyr SDK); the rules only say that you cannot modify it.

So to comply to the rules:

1) you should make sure that you are not modifying any C compiler provided by the Zephyr SDK; best of all, do not modify any C compiler at all and use the official mainstream precompiled cross compiler
2) for the cross compiler, use "standard GCC". Zephyr does not provide standard GCC; as a matter of fact, it uses a "pulpino"-patched GCC and has a special config setting, "Use standard GCC", which changes the compilation to emit different opcodes so that things work on pulpino

 
I tried following Antti's helpful suggestions [1] (which I can't tell are ok by the rules,
but at least it's something).  Unfortunately I ran into some issues:
- sysutils.c doesn't exist, but syscall.c does
Sorry, I typed too fast and no one noticed until now. Fixed in the wiki; it should read syscalls.c there.
 
- you also need encoding.h from riscv-tools
yes 
- there's no HZ macro in util.h

This is documented in the wiki as the "minimal change" required to the provided sources; those 2 lines should be inserted into util.h.
 
If I try compiling with the horrific Zephyr SDK, it can't find libraries:

$ riscv32-zephyr-elf-gcc -I/opt/zephyr-sdk/sysroots/riscv32-zephyr-elf/usr/include -march=RV32I -O3 -fno-inline *.S *.c


Windows BAT file to use standard GCC (as required by the rules):

set TOOLCHAIN_PATH=X:\GIT\riscv-contest\riscv\bin

set HZ=50000000LL

%TOOLCHAIN_PATH%\riscv-none-embed-gcc.exe -march=rv32i -mabi=ilp32 -static -nostdlib --std=gnu99 -O3 -fno-inline -fno-common -fno-builtin-printf -DCPU_CLK=%HZ% -I ..\common -I ..\..\env -T ..\common\test.ld ..\common\crt.S ..\common\syscalls.c dhrystone.c dhrystone_main.c -Wl,-nostdlib,-nostartfiles,-lc,-lm,-lgcc -ooutfile
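
For anyone not on Windows, the same invocation can be sketched as a small Python helper that builds the argument list. The flags and file names are copied from the BAT file above; the function name `dhrystone_gcc_command` and the toolchain directory passed in are placeholders, and the linker options are passed through exactly as posted rather than verified.

```python
import os


def dhrystone_gcc_command(toolchain_dir, hz="50000000LL", output="outfile"):
    """Return the argv list equivalent to the BAT file's gcc command line."""
    gcc = os.path.join(toolchain_dir, "riscv-none-embed-gcc")
    common = os.path.join("..", "common")
    return [
        gcc,
        "-march=rv32i", "-mabi=ilp32",
        "-static", "-nostdlib", "--std=gnu99",
        "-O3", "-fno-inline", "-fno-common", "-fno-builtin-printf",
        "-DCPU_CLK=" + hz,                      # same define as in the BAT file
        "-I", common,
        "-I", os.path.join("..", "..", "env"),
        "-T", os.path.join(common, "test.ld"),  # linker script from the benchmark tree
        os.path.join(common, "crt.S"),
        os.path.join(common, "syscalls.c"),
        "dhrystone.c", "dhrystone_main.c",
        "-Wl,-nostdlib,-nostartfiles,-lc,-lm,-lgcc",
        "-o", output,
    ]
```

Run from the dhrystone directory, the list can be handed to `subprocess.run(cmd, check=True)`.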

 
...
/opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/../../libexec/riscv32-zephyr-elf/gcc/riscv32-zephyr-elf/6.1.0/real-ld: cannot find crt0.o: No such file or directory
/opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/../../libexec/riscv32-zephyr-elf/gcc/riscv32-zephyr-elf/6.1.0/real-ld: cannot find crtbegin.o: No such file or directory
/opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/../../libexec/riscv32-zephyr-elf/gcc/riscv32-zephyr-elf/6.1.0/real-ld: cannot find -lgcc
/opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/../../libexec/riscv32-zephyr-elf/gcc/riscv32-zephyr-elf/6.1.0/real-ld: cannot find -lc
/opt/zephyr-sdk/sysroots/x86_64-pokysdk-linux/usr/bin/riscv32-zephyr-elf/../../libexec/riscv32-zephyr-elf/gcc/riscv32-zephyr-elf/6.1.0/real-ld: cannot find -lgloss
collect2: error: ld returned 1 exit status

and indeed none appear to be supplied.  It seems to me that the Zephyr SDK can't be used standalone as a regular compiler.  Surely I'm missing something.

see above

1) the rules do not mention Zephyr SDK at all
2) the rules do not say that you must use zephyr GCC, only that you can not modify it
3) the rules do not say that you must use zephyr GCC to compile Dhrystone

 
I'm very tempted to screw these rules as they don't make sense and they are ridiculous.

I am very much in the same boat as you here. The requirement to use the GCC provided by the SDK (which is actually not a requirement), if applied strictly, means that I need to find 2 more days before the deadline.

I would say that using common sense and leaving it to the judges would be the best option.

What I have also tried to propose is: "use the original rules as is (the first version!) with common sense, document and explain every exception(s) and leave it to the judges to approve or reject"

 
Tommy
[1] https://github.com/micro-FPGA/riscv-contest-2018/wiki/Dhrystone

Antti
 

Nelson Ribeiro

Nov 19, 2018, 3:55:55 AM
to softcpu...@riscv.org
Don't know if it is OK or not for the contest (my common sense tells me that it is OK), but you may use Charles's implementation as reference for Dhrystones:


With this Makefile you are able to compile Dhrystones with Zephyr GCC correctly (with isa=rv32im)

Nelson



--
You received this message because you are subscribed to the Google Groups "RISC-V Soft CPU Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to softcpu-discu...@riscv.org.
To post to this group, send email to softcpu...@riscv.org.
Visit this group at https://groups.google.com/a/riscv.org/group/softcpu-discuss/.
To view this discussion on the web visit https://groups.google.com/a/riscv.org/d/msgid/softcpu-discuss/f116d954-5f4c-4c4a-a493-2e9ef80f143f%40riscv.org.
For more options, visit https://groups.google.com/a/riscv.org/d/optout.

Antti Lukats

Nov 19, 2018, 3:59:47 AM
to RISC-V Soft CPU Discussion
On Monday, 19 November 2018 09:55:55 UTC+1, Nelson Ribeiro wrote:
Don't know if it is OK or not for the contest (my common sense tells me that it is OK), but you may use Charles's implementation as reference for Dhrystones:


With this Makefile you are able to compile Dhrystones with Zephyr GCC correctly (with isa=rv32im)

Nelson


He is using custom-made

src/stdlib.c src/start.S

I guess this would also be accepted, but I prefer to use the files provided by the benchmark suite and not introduce any new source code files at all

and it is all about 

isa=rv32i


g
Antti 

Nelson Ribeiro

Nov 19, 2018, 4:28:11 AM
to Antti Lukats, softcpu...@riscv.org
For the area metric it is OK to use that very ugly syscall.c (implementation), but for the performance metric those files are not good.

How would you compare a performance implementation with https://www.sifive.com/cores/e34, which is reported to have 1.61 DMIPS/MHz? (I find this value a bit suspicious for a 32-bit implementation. It may be achievable depending on several factors, but it is still suspicious considering the CoreMark score. I am giving it the benefit of the doubt...)

Using a custom library and *.S is common practice when running Dhrystones.

Function inlining and file merging are not allowed as per the Dhrystones rules, but unfortunately they are sometimes used by companies that want to market their soft CPU cores, giving the illusion that their soft CPUs have better performance than they actually have.

Nelson


Antti Lukats

Nov 19, 2018, 4:37:36 AM
to RISC-V Soft CPU Discussion, antti....@gmail.com


On Monday, 19 November 2018 10:28:11 UTC+1, Nelson Ribeiro wrote:
For the area metric it is OK to use that very ugly syscall.c (implementation), but for the performance metric those files are not good.

Yes, ugly, but it is the only one provided in the same GitHub repository as the ugly Dhrystone files :)

How would you compare a performance implementation with https://www.sifive.com/cores/e34, which is reported to have 1.61 DMIPS/MHz? (I find this value a bit suspicious for a 32-bit implementation. It may be achievable depending on several factors, but it is still suspicious considering the CoreMark score. I am giving it the benefit of the doubt...)

None of the Dhrystone reports for the contest would be comparable outside the contest; there are too many unknowns and open issues.

 
Using custom library and *.S is common practice on running Dhrystones. 


probably yes, this makes it all even more "not comparable"
 
Function inlining and file merging are not allowed as per the Dhrystones rules, but unfortunately they are sometimes used by companies that want to market their soft CPU cores, giving the illusion that their soft CPUs have better performance than they actually have.

It is interesting that merging is not mentioned in the contest rules.

RISC-V has an article about Dhrystone; it again uses DIFFERENT files than those for the contest.

Charles Papon

Nov 19, 2018, 4:40:04 AM
to RISC-V Soft CPU Discussion, antti....@gmail.com
1.61 DMIPS/MHz seems fine to me. The reason we aren't used to that level of performance is probably that MCUs in general don't have branch prediction, while the SiFive core / Rocket core use it quite a lot.

There is a VexRiscv perf sample for a performance-oriented configuration:
VexRiscv full max perf -> (RV32IM, 1.44 DMIPS/MHz, 16KB-I$,16KB-D$, single cycle barrel shifter, debug module, catch exceptions, dynamic branch prediction in the fetch stage, branch and shift operations done in the Execute stage) ->

Nelson Ribeiro

Nov 19, 2018, 7:53:53 AM
to Charles Papon, softcpu...@riscv.org, Antti Lukats
With the following rv32im simulator (which has a CPI=1) 


running a slightly modified version of your implementation of Dhrystones I get 1.71 DMIPS/MHz.

So I know that the reported value is achievable. That is why I am giving the benefit of the doubt.
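
For reference, the DMIPS/MHz figures quoted in this thread come from a simple normalization: Dhrystones per second divided by 1757 (the VAX 11/780 score that defines 1 DMIPS), then divided by the clock in MHz. A minimal sketch follows; the function name `dmips_per_mhz` and the run and cycle counts in the note below are made up for illustration and are not measurements from any core discussed here.

```python
VAX_DHRYSTONES_PER_SECOND = 1757  # VAX 11/780 reference score, defines 1 DMIPS


def dmips_per_mhz(runs, cycles):
    """DMIPS/MHz from the number of Dhrystone runs and the cycles they took.

    Dhrystones/s = runs * f_clk / cycles, so the clock frequency cancels:
    DMIPS/MHz = 1e6 * runs / (cycles * 1757).
    """
    return 1e6 * runs / (cycles * VAX_DHRYSTONES_PER_SECOND)
```

With hypothetical numbers, 500 runs in 166,000 cycles (332 cycles per run) works out to about 1.71 DMIPS/MHz; 1 DMIPS/MHz corresponds to roughly 569 cycles per run.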

For single-issue processors, I see values of ~1.6 DMIPS/MHz for some MIPS targets (f32c achieves that value without an actual hardware divider!), but that is achieved by having pipeline models of the CPU architecture in the Machine Description files inside GCC, of which we have none for RISC-V (I am aware of the -mbranch-cost=N switch, which in some cases may help decrease the penalty for branches), and by using some other interesting optimization flags (-fselective-scheduling; -fno-crossjumping; -fipa-pta; -fira-algorithm=priority). I don't know if those can be used for the contest, but they could certainly be used by SiFive in their benchmark runs.

By the way, Rocket is a 64-bit implementation which has a big advantage in the strcpy function: it can load 64 bits per clock cycle and store the 64 bits some clock cycles later to avoid RAW hazards. That is why it can score 1.72 DMIPS/MHz!

Nelson



Antti Lukats

Nov 19, 2018, 8:04:29 AM
to RISC-V Soft CPU Discussion, charles....@gmail.com, antti....@gmail.com
your srec bootloader is pretty neat !

Antti 

Nelson Ribeiro

Nov 19, 2018, 8:10:16 AM
to Antti Lukats, softcpu...@riscv.org, Charles Papon
Be careful with this kind of bootloader: you may need to set the baud rate to 9600, and you may need a small HW FIFO on the UART RX side.
It depends a bit on whether the CPU is fast enough to process each S-record line...
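
To give a sense of the per-line work such a bootloader does, here is a minimal host-side sketch of decoding one Motorola S-record data line (S1/S2/S3) and verifying its checksum. It follows the standard S-record layout, not the actual bootloader being discussed, which is not shown in this thread; `parse_srec_line` is an illustrative name.

```python
ADDRESS_BYTES = {"S1": 2, "S2": 3, "S3": 4}  # data record types and address widths


def parse_srec_line(line):
    """Decode one S1/S2/S3 record; return (address, data) or raise ValueError."""
    line = line.strip()
    kind = line[:2]
    if kind not in ADDRESS_BYTES:
        raise ValueError("not a data record: %r" % kind)
    raw = bytes.fromhex(line[2:])
    count, body, checksum = raw[0], raw[1:-1], raw[-1]
    # The count field covers address, data and checksum bytes.
    if count != len(raw) - 1:
        raise ValueError("byte count mismatch")
    # Checksum is the ones' complement of the low byte of the sum of the
    # count, address and data bytes.
    if (~(count + sum(body))) & 0xFF != checksum:
        raise ValueError("bad checksum")
    n = ADDRESS_BYTES[kind]
    return int.from_bytes(body[:n], "big"), body[n:]
```

A soft CPU has to do the equivalent hex decoding and summing between received characters, which is why a slow core may need a lower baud rate or a receive FIFO.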

Nelson  


Frank Buss

Nov 19, 2018, 8:16:26 AM
to RISC-V Soft CPU Discussion, charles....@gmail.com, antti....@gmail.com
I wonder if it is allowed to download everything over UART as well, like some microcode for the CPU itself (nice side effect: then it could emulate any CPU you want with different microcode) and all the apps. Then I wouldn't need to fight with the flash.

Antti Lukats

Nov 19, 2018, 8:19:05 AM
to RISC-V Soft CPU Discussion, charles....@gmail.com, antti....@gmail.com
I think we all can do whatever we want, and the judges will approve or exclude at their judgement, there is not much happening before the submit deadline.

Antti

Frank Buss

Nov 19, 2018, 10:42:34 AM
to RISC-V Soft CPU Discussion, charles....@gmail.com, antti....@gmail.com
I guess I could "sell" it as a creative idea :-) It would even allow testing new RISC-V developments and instructions on real hardware quickly, without the need to change the HDL.

I think I will do it this way, unless one of the organizers says it is forbidden. It could even be a nice platform for CPU core development after the contest.

Tommy Thorn

Nov 19, 2018, 11:26:08 AM
to Frank Buss, RISC-V Soft CPU Discussion, charles....@gmail.com, antti....@gmail.com
I hope so because that’s what I want to do.

Tommy

Antti Lukats

Nov 19, 2018, 11:48:17 AM
to RISC-V Soft CPU Discussion, programmer...@gmail.com, charles....@gmail.com, antti....@gmail.com
On Monday, 19 November 2018 17:26:08 UTC+1, tommy wrote:
I hope so because that’s what I want to do.

Tommy


It's only my educated guess, but most likely we are left alone until the deadline to do whatever we do.
The total SoW for a contest entry is of such a magnitude that the judges will likely take it a bit more relaxed.

On Lattice the SPI stuff is integrated to the tools, and easy to use, no need to create your own utilities,
and I am "spending" for SPI bootstrap about

20 bytes of EBR
5 LUT
3 DFF

I guess the SPI bootstrap to load RAM on Microsemi would be about 30 LUT/DFF, but when I look at the Microsemi SPI flash PDF files, I don't much want to use them; they talk about KEIL and IAR, and only target SmartFusion2, not IGLOO2.

So a UART bootstrap/loader is pretty much the easy option.

Antti

Frank Buss

Nov 19, 2018, 4:48:36 PM
to RISC-V Soft CPU Discussion, programmer...@gmail.com, charles....@gmail.com, antti....@gmail.com
On Monday, November 19, 2018 at 5:26:08 PM UTC+1, tommy wrote:
I hope so because that’s what I want to do.


I wrote a first test for the iCE40 breakout board:


After reset (btw, is there an internal reset signal?) the green LED goes on. When starting to upload data, the green LED goes off and the red LED goes on. After 10 bytes are received, the green LED blinks and the data is sent back at 1 Hz, and after this the green LED is constantly on again.

TX pin is on pin 6, RX pin on pin 9. That's pin 18 and 20 on "header c" (top left). You can use a 3.3 V TTL USB serial adapter, or you can solder some wires and configure the EEPROM of the integrated FTDI chip to use the unused second port, as described here, where you can see the location of the 2 pins on header c as well:


You can test it in a terminal program, like HTERM in Windows, or minicom in Linux. Settings: 8N1, 115200 baud, no handshake.

For testing it is good to have a Python script which can write and read to the serial port. For sending a binary file, it looks like this in Windows:

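from serial import Serial

# Open the port at 115200 baud (pyserial defaults to 8N1, no handshake).
ser = Serial("COM8", 115200)
filename = "test.bin"
with open(filename, "rb") as f:
    # Send the file one byte at a time until end of file.
    b = f.read(1)
    while b:
        ser.write(b)
        b = f.read(1)
ser.close()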

Should work in Linux as well, then with /dev/ttyUSB0 (or USB1, if you use my mod). This can be enhanced to read back debug data, or for the compliance test. I tested it with Python 3.7.1, which you can get here:
You might need to change the PATH environment variable:
c:\Users\yourname\AppData\Local\Programs\Python\Python37-32
c:\Users\yourname\AppData\Local\Programs\Python\Python37-32\scripts

And you need the serial port library, if not already installed:
python -m pip install pyserial

You might need to upgrade PIP first:
python -m pip install --upgrade pip
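
The read-back enhancement mentioned above could look roughly like this: send the file, then collect whatever the board sends back. This is a sketch only; `send_and_read_back` is a made-up name, the 2-second timeout in the usage comment is an assumption, and the function accepts any object with `write` and `read` methods so it can be tried without hardware.

```python
def send_and_read_back(port, payload, chunk_size=64):
    """Write payload to port, then read until a read returns no data."""
    port.write(payload)
    received = bytearray()
    while True:
        chunk = port.read(chunk_size)
        if not chunk:  # with a pyserial timeout set, b"" means the read timed out
            break
        received += chunk
    return bytes(received)


# Possible use with pyserial (port name and timeout are assumptions):
#
#   from serial import Serial
#   with Serial("COM8", 115200, timeout=2) as ser, open("test.bin", "rb") as f:
#       echoed = send_and_read_back(ser, f.read())
#       print(echoed.hex())
```

Opening the port with a timeout matters here: without one, `read` blocks until the full chunk arrives and the loop never ends.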

Only problem is that it already needs 202 LUT4. I have no idea how brouhaha managed to implement it all in 220 LUT4, including the core. Needs some serious optimization, or maybe I need to change it to flash load, because I think I can implement the SPI load function with less LUTs than the serial port receiver. But for debugging it is easier if I can just boot it over serial port, without the need to write the flash first, because then I can write a Python script which runs all compliance tests automatically.

I guess I don't have enough time to finish it by the deadline, but it is still fun to try.

Antti Lukats

Nov 20, 2018, 3:24:30 AM
to RISC-V Soft CPU Discussion, programmer...@gmail.com, charles....@gmail.com, antti....@gmail.com
I have no precise metrics for the resource utilization of unpublished entries. ">220" means more than 220; how much it will be I have no idea. Maybe it will be 219, maybe 300.

For the UART bootstrap, hints:

1) you need only RX in RTL; TX can be bit-banged
2) use the absolute highest usable baud rate; it makes the baud rate divider smaller
3) use the PLL to deliver the lowest possible clock for the UART baud rate divider; if possible, select an output frequency that allows dividing by a power of two
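
Hints 2 and 3 can be checked with a little arithmetic: the divider is the clock divided by the baud rate, and its counter needs about log2(divider) flip-flops. A rough sketch follows; the 12 MHz clock is the breakout-board crystal mentioned further down the thread, the baud rates are just examples, and `baud_divider` is an illustrative name.

```python
def baud_divider(clock_hz, baud):
    """Return (divider, counter_bits, error_percent) for an integer divider."""
    divider = max(1, round(clock_hz / baud))
    counter_bits = (divider - 1).bit_length()  # bits needed to count 0 .. divider-1
    actual_baud = clock_hz / divider
    error_percent = 100.0 * (actual_baud - baud) / baud
    return divider, counter_bits, error_percent
```

At 12 MHz, 115200 baud gives a divider of 104 (7 counter bits, about 0.16% error), while 9600 baud needs 1250 (11 bits), so the faster rate saves counter logic.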

Super trick:

Use 1 bit per byte: transmit 0x00 for a 0 bit and 0xFE for a 1 bit, so you only need to capture once per received bit. Then you need only a 4-bit shift register for the SPRAM (it has 4-bit write capability).
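
On the host side, the one-bit-per-byte trick amounts to expanding every payload byte into eight UART bytes. A minimal sketch: the 0x00/0xFE values are from the post, while the MSB-first bit order and the name `expand_bits` are assumptions, since the post does not say in which order the receiver shifts bits in.

```python
def expand_bits(payload):
    """Encode each payload bit as one UART byte: 0x00 for 0, 0xFE for 1."""
    out = bytearray()
    for byte in payload:
        for shift in range(7, -1, -1):  # MSB first (assumed order)
            out.append(0xFE if (byte >> shift) & 1 else 0x00)
    return bytes(out)
```

The expanded stream is eight times longer than the original, so the upload takes eight times as long; the trade is host-side time for fewer LUTs in the receiver.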

I can not estimate how low it goes, but should go to sub 50 LUT
..

Do not underestimate the documentation time, and you also need to set up a GitHub account and project.


g
Antti




 

Frank Buss

Nov 21, 2018, 4:33:19 AM
to RISC-V Soft CPU Discussion, programmer...@gmail.com, charles....@gmail.com, antti....@gmail.com
On Tuesday, November 20, 2018 at 9:24:30 AM UTC+1, Antti Lukats wrote:

3) use PLL to deliver lowest possible clock for uart baudrate divider, if possible select output frequency that allows divide by power of divider

How do I do this with the iCE40? I tried the PLL from the IP Catalog, but looks like it is limited to an output frequency range of 16 MHz to 275 MHz, and the crystal oscillator on the breakout board is 12 MHz. And wouldn't this use extra LUTs for the divider anyway, which would be the same regarding the resource usage as a baudrate divider in UART entity?

Antti Lukats

Nov 21, 2018, 9:40:33 AM
to RISC-V Soft CPU Discussion, programmer...@gmail.com, charles....@gmail.com, antti....@gmail.com
My bad, my board has 27 MHz, so the PLL would deliver a slightly lower clock with 0 LUTs and 0 FFs.

g
Antti 