[PATCH] kbuild: support 'LLVM' to switch the default tools to Clang/LLVM

Masahiro Yamada

Apr 3, 2020, 1:17:58 AM
to linux-...@vger.kernel.org, Nick Desaulniers, Nathan Chancellor, clang-bu...@googlegroups.com, Masahiro Yamada, Jonathan Corbet, Michal Marek, linu...@vger.kernel.org, linux-...@vger.kernel.org
As Documentation/kbuild/llvm.rst implies, building the kernel with a
full set of LLVM tools gets very verbose and unwieldy.

Provide a single switch 'LLVM' to use Clang and LLVM tools instead of
GCC and Binutils. You can pass LLVM=1 from the command line or as an
environment variable. Then, Kbuild will use LLVM toolchains in your
PATH environment.

Please note LLVM=1 does not turn on the LLVM integrated assembler.
You need to explicitly pass AS=clang to use it. When the upstream
kernel is ready for the integrated assembler, I think we can make
it default.

We discussed what we need, and we agreed to go with a simple boolean
switch (https://lkml.org/lkml/2020/3/28/494).

Some items in the discussion:

- LLVM_DIR

When multiple versions of LLVM are installed, I just thought supporting
LLVM_DIR=/path/to/my/llvm/bin/ might be useful.

CC = $(LLVM_DIR)clang
LD = $(LLVM_DIR)ld.lld
...

However, we can handle this by modifying PATH. So, we decided to not do
this.

- LLVM_SUFFIX

Some distributions (e.g. Debian) package specific versions of LLVM with
naming conventions that use the version as a suffix.

CC = clang$(LLVM_SUFFIX)
LD = ld.lld$(LLVM_SUFFIX)
...

will allow a user to pass LLVM_SUFFIX=-11 to use clang-11 etc.,
but the suffixed versions in /usr/bin/ are symlinks to binaries in
/usr/lib/llvm-#/bin/, so this can also be handled by PATH.

- HOSTCC, HOSTCXX, etc.

We can switch the host compilers in the same way:

ifneq ($(LLVM),)
HOSTCC = clang
HOSTCXX = clang++
else
HOSTCC = gcc
HOSTCXX = g++
endif

This may be the right thing to do, but I could not make up my mind.
Because we do not frequently switch the host compiler, a counter
solution I had in my mind was to leave it to the default of the
system.

HOSTCC = cc
HOSTCXX = c++

Many distributions support update-alternatives to switch the default
to GCC, Clang, or whatever, but reviewers were opposed to this
approach. So, this commit does not touch the host tools.
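
A minimal sketch of the PATH approach mentioned for LLVM_DIR and LLVM_SUFFIX, assuming the wanted LLVM binaries live in a single directory. The /usr/lib/llvm-11/bin default and the prepend_path helper are illustrative assumptions, not part of Kbuild:

```shell
#!/bin/sh
# Hypothetical install location; adjust to where your LLVM binaries live
# (for example a versioned directory such as /usr/lib/llvm-11/bin).
LLVM_BIN_DIR="${LLVM_BIN_DIR:-/usr/lib/llvm-11/bin}"

# Prepend a directory to PATH unless it is already the first entry.
prepend_path() {
    case "$PATH" in
        "$1":*|"$1") ;;            # already at the front, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

prepend_path "$LLVM_BIN_DIR"

# Unsuffixed names such as clang and ld.lld are now looked up in that
# directory first, which is what "make LLVM=1" relies on.
echo "PATH now starts with: ${PATH%%:*}"
```

With PATH set this way, neither a directory variable nor a version suffix is needed on the make command line.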

Signed-off-by: Masahiro Yamada <masa...@kernel.org>
---

Documentation/kbuild/kbuild.rst | 5 +++++
Documentation/kbuild/llvm.rst | 5 +++++
Makefile | 20 ++++++++++++++++----
3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/Documentation/kbuild/kbuild.rst b/Documentation/kbuild/kbuild.rst
index 510f38d7e78a..2d1fc03d346e 100644
--- a/Documentation/kbuild/kbuild.rst
+++ b/Documentation/kbuild/kbuild.rst
@@ -262,3 +262,8 @@ KBUILD_BUILD_USER, KBUILD_BUILD_HOST
These two variables allow to override the user@host string displayed during
boot and in /proc/version. The default value is the output of the commands
whoami and host, respectively.
+
+LLVM
+----
+If this variable is set to 1, Kbuild will use Clang and LLVM utilities instead
+of GCC and GNU binutils to build the kernel.
diff --git a/Documentation/kbuild/llvm.rst b/Documentation/kbuild/llvm.rst
index d6c79eb4e23e..4602369f6a4f 100644
--- a/Documentation/kbuild/llvm.rst
+++ b/Documentation/kbuild/llvm.rst
@@ -55,6 +55,11 @@ additional parameters to `make`.
READELF=llvm-readelf HOSTCC=clang HOSTCXX=clang++ HOSTAR=llvm-ar \\
HOSTLD=ld.lld

+You can use a single switch `LLVM=1` to use LLVM utilities by default (except
+for building host programs).
+
+ make LLVM=1 HOSTCC=clang HOSTCXX=clang++ HOSTAR=llvm-ar HOSTLD=ld.lld
+
Getting Help
------------

diff --git a/Makefile b/Makefile
index c91342953d9e..6db89ecdd942 100644
--- a/Makefile
+++ b/Makefile
@@ -409,16 +409,28 @@ KBUILD_HOSTLDFLAGS := $(HOST_LFS_LDFLAGS) $(HOSTLDFLAGS)
KBUILD_HOSTLDLIBS := $(HOST_LFS_LIBS) $(HOSTLDLIBS)

# Make variables (CC, etc...)
-LD = $(CROSS_COMPILE)ld
-CC = $(CROSS_COMPILE)gcc
CPP = $(CC) -E
+ifneq ($(LLVM),)
+CC = clang
+LD = ld.lld
+AR = llvm-ar
+NM = llvm-nm
+OBJCOPY = llvm-objcopy
+OBJDUMP = llvm-objdump
+READELF = llvm-readelf
+OBJSIZE = llvm-size
+STRIP = llvm-strip
+else
+CC = $(CROSS_COMPILE)gcc
+LD = $(CROSS_COMPILE)ld
AR = $(CROSS_COMPILE)ar
NM = $(CROSS_COMPILE)nm
-STRIP = $(CROSS_COMPILE)strip
OBJCOPY = $(CROSS_COMPILE)objcopy
OBJDUMP = $(CROSS_COMPILE)objdump
-OBJSIZE = $(CROSS_COMPILE)size
READELF = $(CROSS_COMPILE)readelf
+OBJSIZE = $(CROSS_COMPILE)size
+STRIP = $(CROSS_COMPILE)strip
+endif
PAHOLE = pahole
LEX = flex
YACC = bison
--
2.17.1
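
For readers who want to check the selection logic outside of Kbuild, here is a rough shell mirror of the `ifneq ($(LLVM),)` block in the patch above. pick_tool is a hypothetical helper written only for illustration; the tool names are taken from the diff:

```shell
#!/bin/sh
# Mirror of the patch's "ifneq ($(LLVM),)" block: any non-empty LLVM value
# selects the LLVM tool; otherwise the GNU tool gets the CROSS_COMPILE prefix.
# pick_tool is a hypothetical helper, not part of Kbuild.
pick_tool() {
    if [ -n "${LLVM:-}" ]; then
        case "$1" in
            CC)      echo clang ;;
            LD)      echo ld.lld ;;
            AR)      echo llvm-ar ;;
            NM)      echo llvm-nm ;;
            OBJCOPY) echo llvm-objcopy ;;
            OBJDUMP) echo llvm-objdump ;;
            READELF) echo llvm-readelf ;;
            OBJSIZE) echo llvm-size ;;
            STRIP)   echo llvm-strip ;;
            *)       return 1 ;;
        esac
    else
        case "$1" in
            CC)      echo "${CROSS_COMPILE:-}gcc" ;;
            LD)      echo "${CROSS_COMPILE:-}ld" ;;
            AR)      echo "${CROSS_COMPILE:-}ar" ;;
            NM)      echo "${CROSS_COMPILE:-}nm" ;;
            OBJCOPY) echo "${CROSS_COMPILE:-}objcopy" ;;
            OBJDUMP) echo "${CROSS_COMPILE:-}objdump" ;;
            READELF) echo "${CROSS_COMPILE:-}readelf" ;;
            OBJSIZE) echo "${CROSS_COMPILE:-}size" ;;
            STRIP)   echo "${CROSS_COMPILE:-}strip" ;;
            *)       return 1 ;;
        esac
    fi
}

# Print the tools chosen for the current LLVM / CROSS_COMPILE environment.
for v in CC LD AR NM OBJCOPY OBJDUMP READELF OBJSIZE STRIP; do
    printf '%-8s = %s\n' "$v" "$(pick_tool "$v")"
done
```

Note that the test in this version of the patch is for a non-empty value, so LLVM=0 would also select the LLVM tools, and the LLVM tool names do not get the CROSS_COMPILE prefix.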

Nathan Chancellor

Apr 3, 2020, 4:57:21 AM
to Masahiro Yamada, linux-...@vger.kernel.org, Nick Desaulniers, clang-bu...@googlegroups.com, Jonathan Corbet, Michal Marek, linu...@vger.kernel.org, linux-...@vger.kernel.org
Hi Masahiro,

On Fri, Apr 03, 2020 at 02:17:09PM +0900, Masahiro Yamada wrote:
> As Documentation/kbuild/llvm.rst implies, building the kernel with a
> full set of LLVM tools gets very verbose and unwieldy.
>
> Provide a single switch 'LLVM' to use Clang and LLVM tools instead of
> GCC and Binutils. You can pass LLVM=1 from the command line or as an
> environment variable. Then, Kbuild will use LLVM toolchains in your
> PATH environment.
>
> Please note LLVM=1 does not turn on the LLVM integrated assembler.
> You need to explicitly pass AS=clang to use it. When the upstream
> kernel is ready for the integrated assembler, I think we can make
> it default.

I agree this should be the default but I think it should probably be
called out somewhere in the documentation as well since users might not
expect to have to have a cross assembler installed.
I would personally like to see this but I do not have the strongest
opinion.
I have verified that the variables get their correct value with LLVM=1
and that they are still overridable.

Reviewed-by: Nathan Chancellor <natecha...@gmail.com>
Tested-by: Nathan Chancellor <natecha...@gmail.com> # build

Masahiro Yamada

Apr 3, 2020, 5:27:11 AM
to Nathan Chancellor, Linux Kbuild mailing list, Nick Desaulniers, clang-built-linux, Jonathan Corbet, Michal Marek, open list:DOCUMENTATION, Linux Kernel Mailing List
Hi Nathan,


On Fri, Apr 3, 2020 at 5:57 PM Nathan Chancellor
<natecha...@gmail.com> wrote:
>
> Hi Masahiro,
>
> On Fri, Apr 03, 2020 at 02:17:09PM +0900, Masahiro Yamada wrote:
> > As Documentation/kbuild/llvm.rst implies, building the kernel with a
> > full set of LLVM tools gets very verbose and unwieldy.
> >
> > Provide a single switch 'LLVM' to use Clang and LLVM tools instead of
> > GCC and Binutils. You can pass LLVM=1 from the command line or as an
> > environment variable. Then, Kbuild will use LLVM toolchains in your
> > PATH environment.
> >
> > Please note LLVM=1 does not turn on the LLVM integrated assembler.
> > You need to explicitly pass AS=clang to use it. When the upstream
> > kernel is ready for the integrated assembler, I think we can make
> > it default.
>
> I agree this should be the default but I think it should probably be
> called out somewhere in the documentation as well since users might not
> expect to have to have a cross assembler installed.


I will add the following info to llvm.rst:

`LLVM=1` does not turn on the LLVM integrated assembler, so you still need
the assembler from GNU binutils. You can pass `AS=clang` to use the integrated
assembler, but it is experimental as of this writing.



--
Best Regards
Masahiro Yamada

Nick Desaulniers

Apr 3, 2020, 2:24:11 PM
to Masahiro Yamada, Linux Kbuild mailing list, Nathan Chancellor, clang-built-linux, Jonathan Corbet, Michal Marek, Linux Doc Mailing List, LKML, Matthias Männich, Sandeep Patil
On Thu, Apr 2, 2020 at 10:17 PM Masahiro Yamada <masa...@kernel.org> wrote:
>
> As Documentation/kbuild/llvm.rst implies, building the kernel with a
> full set of LLVM tools gets very verbose and unwieldy.
>
> Provide a single switch 'LLVM' to use Clang and LLVM tools instead of
> GCC and Binutils. You can pass LLVM=1 from the command line or as an
> environment variable. Then, Kbuild will use LLVM toolchains in your
> PATH environment.
>
> Please note LLVM=1 does not turn on the LLVM integrated assembler.
> You need to explicitly pass AS=clang to use it. When the upstream
> kernel is ready for the integrated assembler, I think we can make
> it default.

Having this behavior change over time may be surprising. I'd rather
that if you want to not use the integrated assembler, you explicitly
negate it, or just don't use the LLVM=1 syntax, ie. `make CC=clang
LD=ld.lld ...`.

We could modify how `-no-integrated-as` is chosen when LLVM=1.

make LLVM=1 LLVMIA=0 ... # add `-no-integrated-as`
# what the flag is doesn't really matter to me, something shorter might be nice.
make LLVM=1 # use all LLVM tools

Since we got rid of $(AS), it would be appropriate to remove/change it
there, since no one really relies on AS=clang right now. (We do have 1
of our 60+ CI targets using it, but we can also change that trivially.)
So I think we have a lot of freedom to change how `-no-integrated-as`
is set.

This could even be independent of this patch.
update-alternatives assumes you've installed Clang via a package manager?
$ update-alternatives --list cc
/usr/bin/gcc
On my system even though clang and friends are in my PATH.

And previously, there was feedback that maybe folks don't want to
change `cc` on their systems just for Clang kernel builds.
https://lkml.org/lkml/2020/3/30/836
https://lkml.org/lkml/2020/3/30/838

A goal for ClangBuiltLinux is to build a kernel image with no GCC or
binutils installed on the host. Let the record reflect that. And
there have been multiple complaints that the existing syntax is too long
for specifying all of the tools.

LLVM=1 is meant to be one flag. Not `make LLVM=1 HOSTCC=clang
HOSTCXX=clang++`. If folks want fine-grained flexibility, use the
existing command line interface, which this patch does not change.
LLVM=1 is opinionated, and inflexible, because it makes a strong
choice to enable LLVM for everything.

Another reason why I don't want to change these over time, and why I
want them all to be in sync is that there are 4 different CI systems
for the kernel, and they are currently fragmented in terms of who is
using what tools:

KernelCI: CC=clang only
Kbuild test robot aka 0day bot: CC=clang LD=ld.lld
Linaro TCWG: CC=clang only
our CI: a complete mix due to combinatorial explosion, but more
coverage of LLVM than everyone else.

That is a mess that we must solve. Having 1 flag that works
consistently across systems is one solution. Now if those were all
using LLVM=1, but some were enabling Clang's integrated assembler, and
some weren't because we changed the default over time, then we'd be
right back to this mismatch between systems. I'd much rather draw the
line in the sand, and say "this is how this flag will work, since day
1." Maybe it's too rigid, but it's important to me that if we create
something new to solve multiple objectives (1. simplifies existing
interface. 2. turns on everything.) that it does so. It is a partial
solution, if it eliminates some of the flags while leaving others. I
want a full solution.

If folks want the flexibility to mix and match tools, the existing
interface is capable. But for us to track who is using what, we need
1 flag that we know is not different depending on the cc of the
system. Once clang's integrated assembler is good to go, we will
begin recommending LLVM=1 to everyone. And we want feedback if we
regress building the host utilities during a kernel build, even if
there are not many.

I'm on the fence about having all of the above satisfied by one patch,
or taking this patch as is and following up on the above two points
(related to disabling `-no-integrated-as` and setting HOSTCC). I
trust your judgement and respect your decisions, so I'll defer to you
Masahiro, but I need to make explicit the design goals. Maybe with
this additional context it can help inform the design.
Tested-by: Nick Desaulniers <ndesau...@google.com>
I would like this to be the preferred method of building with LLVM, so
it should go first, followed by a footnote that says something along
the lines of "if you need something more flexible, the tools can be
specified in a more fine-grained manner via the traditional syntax
below:"
--
Thanks,
~Nick Desaulniers

Masahiro Yamada

Apr 5, 2020, 12:46:19 PM
to Nick Desaulniers, Linux Kbuild mailing list, Nathan Chancellor, clang-built-linux, Jonathan Corbet, Michal Marek, Linux Doc Mailing List, LKML, Matthias Männich, Sandeep Patil
I also thought a boolean flag is preferred.

AS=clang will not live long anyway, and
I hesitated to break the compatibility
for the short-term workaround.

But, if this is not a big deal, I can
replace AS=clang with LLVMIA=1.
Thanks for the comments.

I'd rather do this incrementally,
making sure I am doing it right.


The meaning of LLVM=1 may change over time.
It means "the recommended settings" at the moment.

If CI does not want to change the behavior across
kernel versions, it can pass individual variables
explicitly.

Fangrui Song

Apr 5, 2020, 7:55:12 PM
to Masahiro Yamada, Nick Desaulniers, Linux Kbuild mailing list, Nathan Chancellor, clang-built-linux, Jonathan Corbet, Michal Marek, Linux Doc Mailing List, LKML, Matthias Männich, Sandeep Patil
My only complaint is that it may be difficult to infer the intention (integrated
assembler) from the abbreviation "IA" in "LLVMIA" :/

Something with "AS" in the name may be easier for a user to understand,
e.g. CLANG_AS or LLVM_AS.

Masahiro Yamada

Apr 5, 2020, 9:34:39 PM
to Fangrui Song, Nick Desaulniers, Linux Kbuild mailing list, Nathan Chancellor, clang-built-linux, Jonathan Corbet, Michal Marek, Linux Doc Mailing List, LKML, Matthias Männich, Sandeep Patil
I see 'llvm-as' in my PATH,
but it is a different kind of tool, right?
(converter from LLVM assembler *.ll to LLVM bit code *.bc)

So, I thought "LLVM_AS" might be confusing.

Fangrui Song

Apr 5, 2020, 11:14:47 PM
to Masahiro Yamada, Nick Desaulniers, Linux Kbuild mailing list, Nathan Chancellor, clang-built-linux, Jonathan Corbet, Michal Marek, Linux Doc Mailing List, LKML, Matthias Männich, Sandeep Patil
You are right. llvm-as converts a textual form of LLVM IR (.ll) to
a binary form bitcode (.bc). LLVM_AS is confusing. CLANG_AS/CLANGAS might be
suitable.

clang a.c '-###' => clang -cc1 # like gcc invokes cc1
clang a.s '-###' => clang -cc1as # this invokes the integrated assembler
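
A small sketch of how that check could be scripted. classify_clang_job is a hypothetical helper that only looks for the `-cc1as` / `-cc1` markers in `-###` output; it assumes that output has the shape shown above:

```shell
#!/bin/sh
# classify_clang_job is a hypothetical helper: it reads "clang -###" output
# on stdin and reports which internal job clang planned to run.
classify_clang_job() {
    out=$(cat)
    case "$out" in
        *-cc1as*) echo "integrated assembler (cc1as)" ;;  # check first: also contains "-cc1"
        *-cc1*)   echo "compiler frontend (cc1)" ;;
        *)        echo "unknown" ;;
    esac
}

# Usage, only if clang is in PATH. -### prints the planned jobs to stderr
# without running them, so reading the source from stdin ("-") is harmless.
if command -v clang >/dev/null 2>&1; then
    clang '-###' -x assembler -c - -o /dev/null </dev/null 2>&1 | classify_clang_job
fi
```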

Sedat Dilek

Apr 6, 2020, 5:12:21 AM
to Masahiro Yamada, Fangrui Song, Nick Desaulniers, Linux Kbuild mailing list, Nathan Chancellor, clang-built-linux, Jonathan Corbet, Michal Marek, Linux Doc Mailing List, LKML, Matthias Männich, Sandeep Patil
Indeed LLVM_AS is confusing as llvm-as is different to Clang's
integrated assembler.
So CLANG_AS might be a better choice.

- sed@ -

Matthias Maennich

Apr 6, 2020, 7:22:23 AM
to Masahiro Yamada, linux-...@vger.kernel.org, Nick Desaulniers, Nathan Chancellor, clang-bu...@googlegroups.com, Jonathan Corbet, Michal Marek, linu...@vger.kernel.org, linux-...@vger.kernel.org
On Fri, Apr 03, 2020 at 02:17:09PM +0900, Masahiro Yamada wrote:
What about HOSTLD? I saw recently that setting HOSTLD=ld.lld does not
yield the expected result (some tools, e.g. fixdep, still require
an `ld` to be in PATH to be built). I have not found the time to look into
that yet, but I would like to consistently switch to the LLVM toolchain
(including the linker and possibly more) for hostprogs as well.

Cheers,
Matthias

Masahiro Yamada

Apr 7, 2020, 12:17:53 PM
to Matthias Maennich, Linux Kbuild mailing list, Nick Desaulniers, Nathan Chancellor, clang-built-linux, Jonathan Corbet, Michal Marek, open list:DOCUMENTATION, Linux Kernel Mailing List
HOSTLD=ld.lld worked for me, but HOSTCC=clang did not.



HOSTCC=clang without CC=clang fails to build objtool.

The build system of objtool is meh. :(


HOSTCC scripts/mod/sumversion.o
HOSTLD scripts/mod/modpost
CALL scripts/checksyscalls.sh
CALL scripts/atomic/check-atomics.sh
DESCEND objtool
error: unknown warning option '-Wstrict-aliasing=3'; did you mean
'-Wstring-plus-int'? [-Werror,-Wunknown-warning-option]
HOSTCC /home/masahiro/workspace/linux-kbuild/tools/objtool/fixdep.o
HOSTLD /home/masahiro/workspace/linux-kbuild/tools/objtool/fixdep-in.o
LINK /home/masahiro/workspace/linux-kbuild/tools/objtool/fixdep
CC /home/masahiro/workspace/linux-kbuild/tools/objtool/exec-cmd.o
CC /home/masahiro/workspace/linux-kbuild/tools/objtool/help.o
CC /home/masahiro/workspace/linux-kbuild/tools/objtool/pager.o

Nick Desaulniers

Apr 7, 2020, 1:01:18 PM
to Masahiro Yamada, Matthias Maennich, Linux Kbuild mailing list, Nathan Chancellor, clang-built-linux, Jonathan Corbet, Michal Marek, open list:DOCUMENTATION, Linux Kernel Mailing List
Let's tackle that in a follow-up, with the goal of build hermeticity
in mind. I think there's good feedback in this thread to inform the
design of a v2:
1. CLANG_AS=0 to disable the integrated assembler. Hopefully we won't need this
much longer, so we don't need to spend too much time on this; Masahiro,
please just choose a name for this. llvm-as's naming convention
doesn't follow the rest of binutils.
2. HOSTCC=clang HOSTLD=ld.lld set by LLVM=1 for helping with build hermeticity.



--
Thanks,
~Nick Desaulniers

Masahiro Yamada

Apr 7, 2020, 1:47:00 PM
to Nick Desaulniers, Matthias Maennich, Linux Kbuild mailing list, Nathan Chancellor, clang-built-linux, Jonathan Corbet, Michal Marek, open list:DOCUMENTATION, Linux Kernel Mailing List
I am not so familiar with the terminology in LLVM,
but I feel 'integrated' is the keyword here.
I prefer LLVM_IA=1 (or LLVM_INTEGRATED_AS=1).


> 2. HOSTCC=clang HOSTLD=ld.lld set by LLVM=1 for helping with build hermeticity.
>




Nick Desaulniers

Apr 7, 2020, 1:53:21 PM
to Masahiro Yamada, Matthias Maennich, Linux Kbuild mailing list, Nathan Chancellor, clang-built-linux, Jonathan Corbet, Michal Marek, open list:DOCUMENTATION, Linux Kernel Mailing List, Jian Cai, Stephen Hines, Luis Lozano
I'm happy with either, and I trust your judgement. You choose.
Hopefully we will fix all our assembler bugs soon and won't need the
flag much longer.

--
Thanks,
~Nick Desaulniers

Fangrui Song

Apr 7, 2020, 3:19:28 PM
to 'Nick Desaulniers' via Clang Built Linux, Masahiro Yamada, Matthias Maennich, Linux Kbuild mailing list, Nathan Chancellor, Jonathan Corbet, Michal Marek, open list:DOCUMENTATION, Linux Kernel Mailing List, Jian Cai, Stephen Hines, Luis Lozano
Maybe "IAS", e.g. LLVM_IAS=1 or CLANG_IAS=1

IAS is referred to in a few places. IA is not a common abbreviation.

I don't have a strong opinion here, and thank Masahiro a lot for the
improvement!


Masahiro Yamada

Apr 7, 2020, 9:23:59 PM
to Fangrui Song, 'Nick Desaulniers' via Clang Built Linux, Matthias Maennich, Linux Kbuild mailing list, Nathan Chancellor, Jonathan Corbet, Michal Marek, open list:DOCUMENTATION, Linux Kernel Mailing List, Jian Cai, Stephen Hines, Luis Lozano
OK, I will rename it to LLVM_IAS.

Thanks for the advice.