Building RISC code with llvm


Gnanasekar R

May 10, 2018, 1:21:18 PM
to RISC-V SW Dev
Hi,

I was able to build the LLVM toolchain successfully in Release mode. I am building some code for RISC-V with LLVM using the following options:

-Os -march=rv32imc -mabi=ilp32

When I use these with gcc, I see a mix of 16-bit and 32-bit instructions being generated. With LLVM I see only 32-bit instructions. Am I missing something in my compiler flags?
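
For anyone wanting to tell the two widths apart from raw bytes rather than from a disassembler: in the RISC-V encoding, an instruction whose lowest two bits are both 1 is 32 bits wide (or longer), and anything else is a 16-bit compressed instruction. A minimal sketch of that check follows. The function name insn_width is made up for illustration, and the sketch ignores the longer-than-32-bit formats.

```shell
#!/bin/sh
# insn_width BYTE: print the encoding width implied by the first
# (lowest-addressed) byte of a RISC-V instruction, given as two hex digits.
# Hypothetical helper for illustration only.
insn_width() {
  if [ $(( 0x$1 & 3 )) -eq 3 ]; then
    echo 32   # low two bits are 11: standard 32-bit (or longer) encoding
  else
    echo 16   # anything else: compressed 16-bit encoding
  fi
}

insn_width 2e   # first byte of a compressed add
insn_width 13   # first byte of an uncompressed addi a0, ...
```

The first byte is the one printed leftmost by a byte-wise disassembly listing, since RISC-V instructions are stored little-endian.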

Bruce Hoult

May 10, 2018, 8:05:15 PM
to Gnanasekar R, RISC-V SW Dev
The C extension is not yet supported by the built-in llvm assembler. In the meantime you can tell llvm to output assembly language using the -S option and use gcc (or as/ld directly) to assemble and link your final program.

I've also found it's necessary to do this because the llvm built-in assembler is not emitting the attribute to say whether the code is hard float or soft float (it's soft float only at the moment), and then the linker gets upset and complains about linking soft float and hard float code together. Going via a .s file and using the GNU assembler works correctly.
 

--
You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sw-dev+unsubscribe@groups.riscv.org.
To post to this group, send email to sw-...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/sw-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/CANUXKs8ttsqeM4hXVvCwdXimx%3DfAmM0Hd%3DofBH%2BbMvDQ6CggiA%40mail.gmail.com.

Gnanasekar

May 10, 2018, 9:43:30 PM
to Bruce Hoult, RISC-V SW Dev
Thanks for the info, Bruce. That makes me wonder: what advantages does the LLVM assembler have over gcc? I could use gcc to do everything I want. What is the motivation for using LLVM to generate assembly and gcc to link?

Bruce Hoult

May 10, 2018, 11:14:45 PM
to Gnanasekar, RISC-V SW Dev
Right now, little or nothing unless you're working on improving LLVM. gcc works well.

Over time, LLVM is commonly thought to be easier to modify, so it's likely to be the compiler of choice for people implementing custom extensions in RISC-V CPUs and it may well be the first compiler to support new standard extensions.


Alex Bradbury

May 11, 2018, 2:33:46 AM
to Bruce Hoult, Gnanasekar R, RISC-V SW Dev
I'd like to clear up some confusion in this thread.

Could you please make sure you're using the latest upstream LLVM (i.e.
checked out from svn/git). There _is_ support for the compressed
extension, which has seen extensive testing and evaluation. It works
both when emitting your output object directly (which in LLVM means .s
is never generated), or when passing through the assembler.
Demonstration below:

# First, consider a simple C program
$ cat example.c
int foo(int a, int b, int c, int d) {
  return a + b - c + d - 13;
}

# Compile to .o
$ ./bin/clang -target riscv32 -march=rv32imc -O1 -c example.c

# Dump the output. You'll notice compressed 16-bit instructions were generated
$ ./bin/llvm-objdump -d example.o

example.o: file format ELF32-riscv

Disassembly of section .text:
foo:
0: 2e 95 add a0, a0, a1
2: 4d 15 addi a0, a0, -13
4: 11 8d sub a0, a0, a2
6: 36 95 add a0, a0, a3
8: 82 80 ret

# We can also do this by emitting assembler and having the assembler
# choose compressed instructions
$ ./bin/clang -target riscv32 -march=rv32imc -O1 -S example.c
$ cat example.s
.text
.file "example.c"
.globl foo # -- Begin function foo
.p2align 1
.type foo,@function
foo: # @foo
# %bb.0: # %entry
add a0, a0, a1
addi a0, a0, -13
sub a0, a0, a2
add a0, a0, a3
ret
.Lfunc_end0:
.size foo, .Lfunc_end0-foo
# -- End function

.ident "clang version 7.0.0 (http://llvm.org/git/clang.git 1d6049b739c78b327ed25599c19ed7764cb21947) (http://llvm.org/git/llvm.git a02e1ac80b01142ab76751baa642e336cc570d71)"
.section ".note.GNU-stack","",@progbits
$ ./bin/clang -target riscv32 -march=rv32imc -c example.s
$ ./bin/llvm-objdump -d example.o

example.o: file format ELF32-riscv

Disassembly of section .text:
foo:
0: 2e 95 add a0, a0, a1
2: 4d 15 addi a0, a0, -13
4: 11 8d sub a0, a0, a2
6: 36 95 add a0, a0, a3
8: 82 80 ret
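
To tally the two widths in a longer listing, the byte columns of the disassembly above can be counted. This is a rough sketch, assuming the llvm-objdump layout shown here (an address followed by one column per byte). count_insn_widths is a hypothetical helper name, and GNU objdump groups the bytes differently, so it would need a different pattern.

```shell
#!/bin/sh
# count_insn_widths: read "llvm-objdump -d" style text on stdin and report
# how many instructions are encoded in 2 bytes and how many in 4 bytes.
# Hypothetical helper; assumes one two-digit hex column per byte.
count_insn_widths() {
  awk '
    $1 ~ /^[0-9a-f]+:$/ {
      n = 0
      # count the byte columns that follow the address field
      for (i = 2; i <= NF && $i ~ /^[0-9a-f][0-9a-f]$/; i++) n++
      if (n == 2) c16++
      else if (n == 4) c32++
    }
    END { printf "16-bit: %d, 32-bit: %d\n", c16, c32 }
  '
}

# Sample input copied from the disassembly above.
count_insn_widths <<'EOF'
foo:
0: 2e 95 add a0, a0, a1
2: 4d 15 addi a0, a0, -13
4: 11 8d sub a0, a0, a2
6: 36 95 add a0, a0, a3
8: 82 80 ret
EOF
```

On the sample input this prints "16-bit: 5, 32-bit: 0". On a real object file it could be used as: ./bin/llvm-objdump -d example.o | count_insn_widths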

Best,

Alex


Bruce Hoult

May 11, 2018, 3:03:14 AM
to Alex Bradbury, Gnanasekar R, RISC-V SW Dev
I'm using the latest revision of https://github.com/lowRISC/riscv-llvm.git (only the README has changed recently), and with the above test program and invocation I get only 32-bit instructions.

My understanding was that with respect to RISC-V this repo contains the most recent work.



Gnanasekar R

May 11, 2018, 3:29:43 AM
to Bruce Hoult, Alex Bradbury, RISC-V SW Dev
Hi Alex,

I tried the same, but I do not see any compressed instructions being generated. I am also using the latest revision of the repository Bruce mentioned.

Regards,
Gnanasekar



Tommy Murphy

May 11, 2018, 3:59:14 AM
to RISC-V SW Dev, br...@hoult.org, gnanase...@gmail.com
On Friday, 11 May 2018 07:33:46 UTC+1, asb wrote:
I'd like to clear up some confusion in this thread.

Could you please make sure you're using the latest upstream LLVM (i.e.
checked out from svn/git). 

Alex Bradbury

May 11, 2018, 4:21:19 AM
to Gnanasekar R, Bruce Hoult, RISC-V SW Dev
On 11 May 2018 at 08:29, Gnanasekar R <gnanase...@gmail.com> wrote:
> Hi Alex,
>
> Even I tried the same but I do not see any compressed instruction being
> generated. I am also using the latest revision as mentioned by Bruce.

Apologies for the confusion. "Upstream" typically refers to the
original project repository
(https://en.wikipedia.org/wiki/Upstream_(software_development)).

In this case, that would be LLVM and Clang repositories hosted by llvm.org:
http://llvm.org/docs/GettingStarted.html#checkout

I will try to be more explicit in future.

Best,

Alex

Bruce Hoult

May 11, 2018, 4:30:14 AM
to Alex Bradbury, Gnanasekar R, RISC-V SW Dev
The latest upstream llvm (using the llvm-project-20170507 monorepo) doesn't compile for me if I enable RISCV. It fails with:

cc: error: unrecognized command line option ‘-fforce-enable-int128’

It works if I go back to the Feb 28 commit before Mandeep Singh Grang committed "For RISCV32, we must force enable int128 for compiling long double routines using the flag -fforce-enable-int128.", i.e. Zachary Turner's fa085c7426d3 in that repo.

The compiler thus generated also gives only base 32-bit instructions even when asked to use C.




Bruce Hoult

May 11, 2018, 4:34:05 AM
to Alex Bradbury, Gnanasekar R, RISC-V SW Dev
The llvm-project-20170507 monorepo I am using mirrors the llvm project svn continuously. At the moment of posting this, the latest commit is by George Rimar, dated Fri May 11 08:11:25 2018 +0000, with the message:

[ELF] - Revert of: r332038, r332054, r332060, r332061, r332062, r332063
    
This reverts "Mitigate relocation overflow [part 1 of 2]." and the following commits which were trying to fix the bots.


Gnanasekar R

May 11, 2018, 4:36:02 AM
to Alex Bradbury, Bruce Hoult, RISC-V SW Dev
When checking out, I checked out REV=326957 and then applied the patches as described in https://github.com/lowRISC/riscv-llvm . Could that be the problem?

Should I just check out from http://llvm.org/svn/llvm-project/llvm/trunk and build it without applying the patches?

Alex Bradbury

May 11, 2018, 4:41:44 AM
to Gnanasekar R, Bruce Hoult, RISC-V SW Dev
On 11 May 2018 at 09:36, Gnanasekar R <gnanase...@gmail.com> wrote:
> While checking out I checked out REV=326957 and then applied the patches as
> mentioned in https://github.com/lowRISC/riscv-llvm . Could that be the
> problem?
>
> Should I just checkout from http://llvm.org/svn/llvm-project/llvm/trunk
> and then use it to build without applying patches?

Yes, I tried to clarify this in a recent README update:

"""
As of May 2018, the vast majority of these patches are now upstream
and most users wishing to experiment with support for RISC-V in LLVM
projects will likely be best served by building directly from the
upstream repositories. You may prefer to follow this repository if you
want to study how the backend is put together.
"""

I will add further clarification.

Best,

Alex

Gnanasekar R

May 11, 2018, 4:44:08 AM
to Alex Bradbury, Bruce Hoult, RISC-V SW Dev
My bad. I overlooked this statement. Apologies. I am trying to build from upstream repository now.

Bruce Hoult

May 11, 2018, 4:52:11 AM
to Alex Bradbury, Gnanasekar R, RISC-V SW Dev
> As of May 2018, the vast majority of these patches are now upstream

Implies that some are not.

> most users wishing to experiment with support for RISC-V in LLVM
> projects will likely be best served by building directly from the
> upstream repositories

I don't want to merely experiment with RISC-V in LLVM, I want to enhance it.

Therefore I want to have an LLVM with *everything* in it, so I can find exactly what works and what does not, so as to not duplicate work already done by others.




Bruce Hoult

May 11, 2018, 5:48:02 AM
to Alex Bradbury, Gnanasekar R, RISC-V SW Dev
OK, I found the problem. It turns out that it doesn't like having compiler-rt checked out/symlinked into llvm/projects.

Without this, the latest revision from either svn or the llvm-project-20170507 monorepo builds and will produce compressed instructions on demand.
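
A quick way to check for that condition before configuring is sketched below. It assumes the LLVM sources sit in an llvm subdirectory of the given root, as in the directory structure listed in this message. check_compiler_rt is a hypothetical helper name, not part of any LLVM tooling.

```shell
#!/bin/sh
# check_compiler_rt SRC_ROOT: report whether SRC_ROOT/llvm/projects contains
# a compiler-rt checkout or symlink. Returns non-zero if it does.
# Hypothetical helper; the llvm/projects layout is an assumption.
check_compiler_rt() {
  p="$1/llvm/projects/compiler-rt"
  # -e misses dangling symlinks, so test -L as well
  if [ -e "$p" ] || [ -L "$p" ]; then
    echo "compiler-rt present in $1/llvm/projects"
    return 1
  fi
  echo "compiler-rt not present in $1/llvm/projects"
}

# Check the current directory; "|| true" keeps the script going either way.
check_compiler_rt "$PWD" || true
```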

HOWEVER, it's not very useful for actually building programs as a simple hello world fails with:

$ ../llvm-project-20170507/_build/bin/clang -target riscv32 -march=rv32imc -O1 hello.c -o hello
hello.c:1:10: fatal error: 'stdio.h' file not found
#include <stdio.h>
         ^~~~~~~~~
1 error generated.

In contrast, the code in lowrisc/riscv-llvm (i.e. the patches, applied to the recommended svn revision 326957) produces a compiler which does successfully find the library (newlib) headers, but doesn't use compressed instructions and fails in linking:

brucehoult@gamma06:~/riscv/play$ ../lowrisc-llvm/_build/bin/clang -target riscv32 -march=rv32im -O1 hello.c -o hello
/home/brucehoult/riscv/_install/lib/gcc/riscv32-unknown-linux-gnu/7.2.0/../../../../bin/riscv32-unknown-elf-ld: /home/brucehoult/riscv/_install/lib/gcc/riscv32-unknown-linux-gnu/7.2.0/crtbegin.o: can't link hard-float modules with soft-float modules
/home/brucehoult/riscv/_install/lib/gcc/riscv32-unknown-linux-gnu/7.2.0/../../../../bin/riscv32-unknown-elf-ld: failed to merge target specific data of file /home/brucehoult/riscv/_install/lib/gcc/riscv32-unknown-linux-gnu/7.2.0/crtbegin.o
/home/brucehoult/riscv/_install/lib/gcc/riscv32-unknown-linux-gnu/7.2.0/../../../../bin/riscv32-unknown-elf-ld: /home/brucehoult/riscv/_install/lib/gcc/riscv32-unknown-linux-gnu/7.2.0/crtend.o: can't link hard-float modules with soft-float modules
/home/brucehoult/riscv/_install/lib/gcc/riscv32-unknown-linux-gnu/7.2.0/../../../../bin/riscv32-unknown-elf-ld: failed to merge target specific data of file /home/brucehoult/riscv/_install/lib/gcc/riscv32-unknown-linux-gnu/7.2.0/crtend.o
clang-7.0: error: riscv32-unknown-elf-ld command failed with exit code 1 (use -v to see invocation)

Using gcc to link works:

brucehoult@gamma06:~/riscv/play$ ../lowrisc-llvm/_build/bin/clang -target riscv32 -march=rv32im -O1 hello.c -c
brucehoult@gamma06:~/riscv/play$ riscv32-unknown-elf-gcc hello.o -o hello
brucehoult@gamma06:~/riscv/play$ qemu-riscv32 hello
Hello world!

The only problem from my point of view is compressed instructions are not generated. But I can at least build real code and run it on qemu.

In all cases I'm using exactly the same cmake setup:

cmake -G Ninja -DCMAKE_BUILD_TYPE="Debug" \
  -DBUILD_SHARED_LIBS=True -DLLVM_USE_SPLIT_DWARF=True \
  -DLLVM_OPTIMIZED_TABLEGEN=True -DLLVM_BUILD_TESTS=True \
  -DDEFAULT_SYSROOT="../../_install/riscv32-unknown-elf" \
  -DGCC_INSTALL_PREFIX="../../_install" \
  -DLLVM_DEFAULT_TARGET_TRIPLE="riscv32-unknown-elf" \
  -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD="RISCV" ../llvm

My directory structure is:

riscv
  _install # contains products from riscv-gnu-toolchain
  llvm-project-20170507
    _build
    llvm
  lowrisc-llvm
    _build
    llvm
  svn-llvm
    _build
    llvm
  play
    example.c
    hello.c

In summary:

svn and llvm-project-20170507: produce compressed instructions, can't find headers
lowrisc-llvm: finds headers, no compressed instructions

Bruce Hoult

May 11, 2018, 6:44:18 AM
to Alex Bradbury, Gnanasekar R, RISC-V SW Dev
The problem with the svn/llvm-project-20170507 version not finding header files can be solved by adding to the clang command line:

 -Xclang -iwithsysroot -Xclang /include

There is still a linking problem which is caused by calling the system (i.e. x86) gcc to do the linking instead of riscv32-unknown-elf-gcc.

I don't know whether these can be solved once and for all with more options to cmake.

Bruce Hoult

May 11, 2018, 6:59:22 AM
to Alex Bradbury, Gnanasekar R, RISC-V SW Dev
And the linking problem can be solved hackily by adding -ccc-gcc-name riscv32-unknown-elf-gcc (with the full path if it's not in PATH)...

So:

_build/bin/clang  -Xclang -iwithsysroot -Xclang /include \
  -ccc-gcc-name riscv32-unknown-elf-gcc \
  -target riscv32 -march=rv32imc -O1 hello.c -o hello

Ugly.

Gnanasekar R

May 11, 2018, 12:53:22 PM
to Bruce Hoult, Alex Bradbury, RISC-V SW Dev
I built the toolchain from upstream. When I compile, I get the following error:

fatal error: 'stdio.h' file not found.

Did I miss a flag when building the toolchain? I do not see stdio.h in the include folder of the toolchain. These are my build options:

cmake -G Ninja -DCMAKE_BUILD_TYPE="Release" -DBUILD_SHARED_LIBS=True -DCMAKE_INSTALL_PREFIX="/path/to/_install" -DLLVM_USE_SPLIT_DWARF=True -DLLVM_OPTIMIZED_TABLEGEN=True -DDEFAULT_SYSROOT="/path/to/riscv32-unknown-elf" -DGCC_INSTALL_PREFIX="/path/to/riscv-gnu-toolchain/_install" -DLLVM_DEFAULT_TARGET_TRIPLE="riscv32-unknown-elf" -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD="RISCV" ../

cmake --build .

I am pretty sure the path given for the sysroot is correct (it points to the RISC-V gcc toolchain).

Bruce Hoult

May 11, 2018, 6:44:41 PM
to Gnanasekar R, Alex Bradbury, RISC-V SW Dev
I already mentioned this problem and a linking problem and the hacky work-arounds I found for them in my previous couple of messages in this thread 11 to 12 hours ago.

Instructions simply to "use the upstream" are not really adequate when such a complex cmake invocation and work-arounds are needed to get something that can actually build working programs. I've spent a couple of years in teams working on projects modifying certain parts of llvm but it's a huge project and I certainly don't know every part of it. If I'm struggling to figure out how to use what's in the llvm repo then I don't know what someone who isn't a compiler engineer and just wants to build and run some code is going to do.


Bruce Hoult

May 12, 2018, 2:45:24 AM
to RISC-V SW Dev, gnanase...@gmail.com, a...@asbradbury.org, br...@hoult.org
On Saturday, May 12, 2018 at 10:44:41 AM UTC+12, Bruce Hoult wrote:
Instructions simply to "use the upstream" are not really adequate when such a complex cmake invocation and work-arounds are needed to get something that can actually build working programs. I've spent a couple of years in teams working on projects modifying certain parts of llvm but it's a huge project and I certainly don't know every part of it. If I'm struggling to figure out how to use what's in the llvm repo then I don't know what someone who isn't a compiler engineer and just wants to build and run some code is going to do.

Rather than simply complain that the instructions aren't very good, here is what I consider to be reasonable instructions.

By simply copying & pasting the following, absolutely verbatim with no changes, you can get a working clang from the head of the official llvm master branch. I've tested it on a completely virgin install of Ubuntu 16.04 LTS server. It should work on other versions of Ubuntu and on other Debian-based distros. (If it doesn't then please let me know)

Note that you currently must use a soft float version of the rv32 compiler and libraries.

You can skip the gnu toolchain and/or qemu parts if you already have those.

# about 17 GB of disk space is needed
# entire process takes (not including apt-get, which is about 90 seconds on AWS):
#   87m10s on i7-8650U Intel NUC with 32 GB RAM (on 30 Mbps VDSL internet in NZ)
#   20m40s on server with Xeon E5-2667v4 @3.20GHz (16 cores)
#   17m30s on an AWS m5.12xlarge with fresh Ubuntu 16.04 AMI (ami-4e79ed36)

#harmless if things are already installed. Obviously you can't do it if you're not an admin
sudo apt-get update
sudo apt-get -y dist-upgrade
sudo apt-get -y install \
  binutils build-essential libtool texinfo \
  gzip zip unzip patchutils curl git \
  make cmake ninja-build automake bison flex gperf \
  grep sed gawk python bc \
  zlib1g-dev libexpat1-dev libmpc-dev \
  libglib2.0-dev libfdt-dev libpixman-1-dev 

mkdir riscv
cd riscv
mkdir _install
export PATH=`pwd`/_install/bin:$PATH

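# assumes riscv-gnu-toolchain, riscv-qemu and llvm (the llvm-project monorepo) are already cloned into this directory
pushd riscv-gnu-toolchain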
./configure --prefix=`pwd`/../_install --with-arch=rv32gc --with-abi=ilp32
make -j`nproc`
popd

pushd riscv-qemu
./configure --prefix=`pwd`/../_install --target-list=riscv32-linux-user
make -j`nproc` install
popd

pushd llvm
(cd llvm/tools; ln -s ../../clang .) 
mkdir _build
cd _build
cmake -G Ninja -DCMAKE_BUILD_TYPE="Release" \
  -DBUILD_SHARED_LIBS=True -DLLVM_USE_SPLIT_DWARF=True \
  -DLLVM_OPTIMIZED_TABLEGEN=True -DLLVM_BUILD_TESTS=False \
  -DDEFAULT_SYSROOT="../../_install/riscv32-unknown-elf" \
  -DGCC_INSTALL_PREFIX="../../_install" \
  -DLLVM_DEFAULT_TARGET_TRIPLE="riscv32-unknown-elf" \
  -DLLVM_TARGETS_TO_BUILD="" \
  -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD="RISCV" \
  ../llvm
cmake --build .
alias clang-rv32="`pwd`/bin/clang -Xclang -iwithsysroot -Xclang /include \
  -ccc-gcc-name riscv32-unknown-elf-gcc \
  -target riscv32 -march=rv32imc"
popd

cat >hello.c <<END
#include <stdio.h>

int main(){
  printf("Hello RISCV!\n");
  return 0;
}
END

clang-rv32 -O1 hello.c -o hello
qemu-riscv32 hello

Luke Kenneth Casson Leighton

May 12, 2018, 8:21:34 AM
to Bruce Hoult, RISC-V SW Dev, gnanase...@gmail.com, a...@asbradbury.org, br...@hoult.org
I'm not at my laptop at the moment. I will record this on a wiki page when I can, otherwise the message rushes past and is lost. I will send a link when done; if anyone else wants to do it instead, feel free. Very handy, Bruce.


--
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

Alex Bradbury

May 14, 2018, 7:37:34 AM
to Gnanasekar R, Bruce Hoult, RISC-V SW Dev
On 11 May 2018 at 17:53, Gnanasekar R <gnanase...@gmail.com> wrote:
> I built the toolchain from upstream. When I compile I am getting the
> following error.
>
> fatal error: 'stdio.h' file not found.

My apologies, that's my fault. When I wrote my original response I had
forgotten that the Clang patch for riscv32-unknown-elf isn't yet
upstream. For anyone who's interested I've cleaned up the patch,
expanded it with tests and put it up for review here:
https://reviews.llvm.org/D46822

I've added a mini FAQ which gives information on compiling upstream
LLVM/Clang as well as answers to some other common queries.

https://github.com/lowRISC/riscv-llvm#mini-faq

For cross compiling, the RISC-V Clang driver support should mean you
only need to tell clang:
1) The target: pass -target when invoking clang or set
-DLLVM_DEFAULT_TARGET_TRIPLE when first building it
2) The sysroot: pass --sysroot when invoking clang or set
-DDEFAULT_SYSROOT when first building it
3) The location of a built RISC-V gcc toolchain: pass --gcc-toolchain
or set -DGCC_INSTALL_PREFIX when first building it
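
As an illustration of the three settings above, here is a sketch that only prints a clang command line rather than running it. The install prefix in RISCV_PREFIX and the helper name rv32_clang_cmd are assumptions made for this example. The sysroot and toolchain locations mirror the _install layout used earlier in the thread.

```shell
#!/bin/sh
# Hypothetical install prefix of a built RISC-V GNU toolchain; adjust as needed.
RISCV_PREFIX="${RISCV_PREFIX:-$HOME/riscv/_install}"

# rv32_clang_cmd ARGS...: print (not run) a clang invocation that carries
# 1) the target, 2) the sysroot, and 3) the gcc toolchain location.
rv32_clang_cmd() {
  echo clang \
    -target riscv32-unknown-elf \
    --sysroot="$RISCV_PREFIX/riscv32-unknown-elf" \
    --gcc-toolchain="$RISCV_PREFIX" \
    "$@"
}

rv32_clang_cmd -march=rv32imc -O1 hello.c -o hello
```

Setting the same three values through the cmake options named above when building clang should make the per-invocation flags unnecessary.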

Best,

Alex

Bruce Hoult

May 14, 2018, 8:49:21 AM
to Alex Bradbury, Gnanasekar R, Bruce Hoult, RISC-V SW Dev
Hi Alex, thanks for the updated information. It was very frustrating that the lowrisc repo worked but didn't have all features (e.g. no C support) while the upstream repo did not work.

May I just say that I really dislike templated instructions such as ...
./configure --prefix=/your/gccinstallpath

Since you are actually showing instructions for getting and building riscv-gnu-toolchain, why not add an explicit mkdir for the install directory, cd .. (or popd) after building the gnu stuff, and make the templated stuff explicit?

That doesn't stop someone who knows what they are doing, someone who already has the gnu toolchain somewhere else (or wants to put it somewhere else) from modifying the instructions. But it makes the instructions actually work with a simple copy&paste for those who simply want a working compiler to use to build their code without having to learn how the compiler works or is put together.

On the other hand, I can't see any single advantage to writing templated instructions where the user has to understand what is templated and what is not, find all the templated bits, and know what to replace them with. Some people won't be able to do that at all and go away discouraged.

The riscv-gnu-toolchain itself is I think a great example of how to do it, with a number of different projects fetched, configured, built, and installed with a couple of lines of code.



Luke Kenneth Casson Leighton

May 15, 2018, 8:20:02 AM
to Bruce Hoult, Alex Bradbury, Gnanasekar R, Bruce Hoult, RISC-V SW Dev
recorded here:

http://libre-riscv.org/llvm_on_riscv/

also a reminder, for anyone trying debug builds: standard ld has always
required all object files to be RAM-resident, otherwise it puts the
entire machine into total thrash-meltdown: step even a few MB over the
boundary and a build that previously took 30-60 mins will instead take
several days (or segfault, as you found out, gnanasekar).

nobody really noticed or investigated the problem; however, as the
number of programs that have increased in complexity to that point
grows, this is becoming more and more prevalent (firefox: 7GB
resident RAM...) and it's becoming much more urgent that this get "fixed".

one "solution" is of course to build without debug symbols... that
will however simply avoid the problem until such time as even a
release build goes beyond the size of resident RAM.


below is the advice of binutils' primary developer,

https://sourceware.org/bugzilla/show_bug.cgi?id=22831#c1

Comment 1, H.J. Lu, 2018-02-11 13:01:35 UTC

Please try binutils 2.30 with "-Wl,--no-keep-memory".

gnanasekar did you get an opportunity to try that so that you could
have a debug build?
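
for an LLVM debug build, one way that advice might be applied is through CMake's standard linker-flag variables. this is only a sketch: low_mem_link_args is a hypothetical helper name, and whether the flag helps depends on using GNU ld from a binutils version that supports it.

```shell
#!/bin/sh
# low_mem_link_args: print extra cmake arguments that pass
# --no-keep-memory to the linker for executables and shared libraries.
# Hypothetical helper; CMAKE_EXE_LINKER_FLAGS and CMAKE_SHARED_LINKER_FLAGS
# are standard CMake variables.
low_mem_link_args() {
  printf '%s\n' \
    '-DCMAKE_EXE_LINKER_FLAGS=-Wl,--no-keep-memory' \
    '-DCMAKE_SHARED_LINKER_FLAGS=-Wl,--no-keep-memory'
}

low_mem_link_args
```

usage would be along the lines of: cmake -G Ninja -DCMAKE_BUILD_TYPE="Debug" $(low_mem_link_args) ../llvm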

l.

Gnanasekar R

May 16, 2018, 11:30:27 PM
to Luke Kenneth Casson Leighton, Bruce Hoult, Alex Bradbury, Bruce Hoult, RISC-V SW Dev
No, I did not attempt a debug build after that. All I wanted was an LLVM toolchain that I can use to generate compressed instructions, so I am happy with the Release build for now!

Luke Kenneth Casson Leighton

May 20, 2018, 6:48:01 PM
to Gnanasekar R, Bruce Hoult, Alex Bradbury, Bruce Hoult, RISC-V SW Dev


Ok, rats :) This linker thrashing is becoming increasingly significant.

Does anyone involved in gcc know what I am vaguely on about? I heard that gcc detects available resident RAM and uses up to, but no more than, that amount unless absolutely necessary. If RAM is available the compile can go faster, and it avoids swap, which would have the opposite effect.

The binutils ld command is totally missing this functionality, as it's different teams on gcc and binutils. If someone from gcc could point the binutils devs at the relevant code in gcc it would result in dramatic performance increases in the linker phase for a huge, huge number of people worldwide.



Bruce Hoult

May 20, 2018, 8:44:43 PM
to Luke Kenneth Casson Leighton, Gnanasekar R, Alex Bradbury, Bruce Hoult, RISC-V SW Dev

It's true that ld can take an annoyingly large amount of RAM (and time!) for debug builds. In the case of LLVM, quite a few of the programs need up to 8 or 9 GB while linking with GNU ld. A macOS debug build of LLVM needs far less time and RAM than a Linux one on the same hardware for this reason.

I've read that ld.gold is also much better, but haven't tried it.

However, the LLVM build process has a work-around for this problem: set -DLLVM_PARALLEL_LINK_JOBS=1 as an option to cmake when you configure the build.
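For anyone wanting to copy this, here is a minimal sketch of the configure step. It only prints the cmake command rather than running it. The source path is a placeholder, and the Ninja generator and Debug build type are assumptions not stated above; as far as I know, LLVM_PARALLEL_LINK_JOBS only takes effect with the Ninja generator.

```shell
#!/bin/sh
# Sketch only: builds and prints a cmake configure line for a debug build of
# LLVM with link parallelism capped. Nothing is executed.
LINK_JOBS=1                 # value from the message above
LLVM_SRC=/path/to/llvm      # placeholder, point this at your LLVM source tree

# Assumptions: Ninja generator and a Debug build type.
CMAKE_ARGS="-G Ninja -DCMAKE_BUILD_TYPE=Debug -DLLVM_PARALLEL_LINK_JOBS=$LINK_JOBS"

# Print the command so it can be reviewed, then pasted into a build directory.
echo "cmake $CMAKE_ARGS $LLVM_SRC"
```

Run the printed command from an empty build directory, then build as usual.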

You can use a number bigger than 1 if you have more than 16 GB of RAM. My tests on many machines at AWS suggest that for best speed (including not clobbering the disk cache too much) take the amount of RAM in the machine in GB, subtract 12, then divide by 6 and round.

So:

GB Links
== =====
16   1
32   3
48   6
64   9
72   10

If you have 128 GB or more then you don't need to limit the number of link jobs. The same applies if you have fewer CPU threads than the number of link jobs given above.
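The rule of thumb behind the table can be sketched as a small shell function. The name link_jobs_for_ram is made up for illustration, and two details are assumptions rather than anything stated above: the result is clamped to a minimum of 1, and "round" is taken as round-half-up.

```shell
#!/bin/sh
# Rule of thumb from the message above: (RAM in GB - 12) / 6, rounded.
# link_jobs_for_ram is a hypothetical helper name.
link_jobs_for_ram() {
    ram_gb=$1
    # Adding 3 (half of 6) before the integer division rounds to nearest.
    n=$(( (ram_gb - 12 + 3) / 6 ))
    # Assumption: never return less than one link job.
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Print the suggested value for each RAM size listed in the table.
for gb in 16 32 48 64 72; do
    printf '%3d GB -> %d link jobs\n' "$gb" "$(link_jobs_for_ram "$gb")"
done
```

The result is what you would pass as -DLLVM_PARALLEL_LINK_JOBS when configuring.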

If you have 8 GB or less then don't even think about doing debug builds of LLVM. I suppose it will work if you have swap space configured, but I hate to think how long it will take. It's exactly an hour on a c5.2xlarge (8 thread, 16 GB) and that's bad enough. Less real RAM might easily make it take a day.

My recommendation: m5.12xlarge (48 thread, 192 GB), at 8m05s. The m5.24xlarge (6m40s) and c5.18xlarge (7m20s) are a little quicker, but not enough to make up for their higher cost.

My tiny i7-8650U NUC (8 threads, 32 GB, turbo to 4.2 GHz) takes 37m30s. Any modern quad core i5 or i7 is going to be around that with 32 GB, or an hour with 16 GB. A m5.2xlarge (also 8 thread, 32 GB but Xeon so lower MHz) is a bit slower at 43m55s.

Jim Wilson

May 20, 2018, 10:31:54 PM
to Luke Kenneth Casson Leighton, Gnanasekar R, Bruce Hoult, Alex Bradbury, Bruce Hoult, RISC-V SW Dev
On Sun, May 20, 2018 at 3:47 PM, Luke Kenneth Casson Leighton
<lk...@lkcl.net> wrote:
> Does anyone involved in gcc know what I am vaguely on about? I heard that
> gcc detects available resident RAM and uses up to, but no more than, that
> amount unless absolutely necessary. If RAM is available the compile can go
> faster, and it avoids swap, which would have the opposite effect.

There is no magic code in gcc that solves the binutils problem.

> The binutils ld command is totally missing this functionality, as it's
> different teams on gcc and binutils. If someone from gcc could point the
> binutils devs at the relevant code in gcc it would result in dramatic
> performance increases in the linker phase for a huge, huge number of people
> worldwide.

Most binutils developers are also gcc developers.

Ld uses no more memory than necessary. Unfortunately, for some very
large programs, this may exceed the size of available memory. Ld by
default keeps input files in memory to reduce file I/O. You can
disable this by using --no-keep-memory, which reduces memory usage at
the expense of increasing file I/O, as input files will then have to
be read multiple times. Normally, this will make the linker slower,
but if you are swapping, it may make the linker faster, depending on
how much time you lose swapping.
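As a rough illustration of passing this option through a CMake-based build such as LLVM's: the compiler driver forwards anything after "-Wl," to ld, and CMake's CMAKE_EXE_LINKER_FLAGS and CMAKE_SHARED_LINKER_FLAGS variables add flags to the link commands. The sketch below only prints the configure line; the source path is a placeholder and the Debug build type is an assumption.

```shell
#!/bin/sh
# Sketch only: prints a cmake configure line that asks GNU ld not to keep
# input files cached in memory. Nothing is executed.
LD_FLAGS="-Wl,--no-keep-memory"   # -Wl, hands the option from the driver to ld
LLVM_SRC=/path/to/llvm            # placeholder source directory

# Apply the flag to both executable and shared-library links.
CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Debug"
CMAKE_ARGS="$CMAKE_ARGS -DCMAKE_EXE_LINKER_FLAGS=$LD_FLAGS"
CMAKE_ARGS="$CMAKE_ARGS -DCMAKE_SHARED_LINKER_FLAGS=$LD_FLAGS"

echo "cmake $CMAKE_ARGS $LLVM_SRC"
```

Whether this helps depends on the trade-off described above: it should only pay off when the link would otherwise be swapping.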

Figuring out when to enable --no-keep-memory is a very difficult
optimization problem, as we can't know in advance how much memory we
will need, or when we will start swapping, or how much time we will
spend swapping, or which input files will be read the fewest number of
times, etc. If there was any easy solution to this problem, it would
have been found decades ago.

Repeatedly complaining about the problem doesn't help. You are just
annoying and insulting all of the people who could help you.

Jim

David Chisnall

May 21, 2018, 8:05:24 AM
to Jim Wilson, Luke Kenneth Casson Leighton, Gnanasekar R, Bruce Hoult, Alex Bradbury, Bruce Hoult, RISC-V SW Dev
On 21 May 2018, at 03:31, Jim Wilson <ji...@sifive.com> wrote:
>
> Ld uses no more memory than necessary

Given that GNU Gold, LLVM’s lld, and Apple’s ld64 all use significantly less memory than GNU BFD ld, this is quite obviously not the case. BFD ld uses significantly more memory than is necessary, largely as a result of the layering design that hides the ELF, Mach-O and PE/COFF-specific details from some binary-file-agnostic code.
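For anyone who wants to try one of these linkers on the LLVM build discussed in this thread, here is a minimal sketch. To my knowledge LLVM's CMake option LLVM_USE_LINKER adds -fuse-ld=<name> to the link invocation. The choice of gold, the placeholder source path and the Debug build type are assumptions, and the script only prints the configure line.

```shell
#!/bin/sh
# Sketch only: prints a cmake configure line that selects an alternative
# linker for an LLVM build. Nothing is executed.
LINKER=gold                 # "lld" is the other portable choice named above
LLVM_SRC=/path/to/llvm      # placeholder source directory

# LLVM_USE_LINKER is LLVM's own CMake option for choosing the linker.
CMAKE_ARGS="-DCMAKE_BUILD_TYPE=Debug -DLLVM_USE_LINKER=$LINKER"

echo "cmake $CMAKE_ARGS $LLVM_SRC"
```

The chosen linker has to be installed and supported by the host compiler for the configure step to succeed.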

David

