[PATCH v2] Makefile: Add clang-tidy and static analyzer support to makefile

162 views
Skip to first unread message

Nathan Huckleberry

unread,
Jul 8, 2020, 2:21:35 PM7/8/20
to masa...@kernel.org, micha...@markovi.net, linux-...@vger.kernel.org, linux-...@vger.kernel.org, clang-bu...@googlegroups.com, Nathan Huckleberry
This patch adds clang-tidy and the clang static-analyzer as make
targets. The goal of this patch is to make static analysis tools
usable and extendable by any developer or researcher who is familiar
with basic c++.

The current static analysis tools require intimate knowledge of the internal
workings of the static analysis. Clang-tidy and the clang static analyzers
expose an easy to use api and allow users unfamiliar with clang to
write new checks with relative ease.

===Clang-tidy===

Clang-tidy is an easily extendable 'linter' that runs on the AST.
Clang-tidy checks are easy to write and understand. A check consists of
two parts, a matcher and a checker. The matcher is created using a
domain specific language that acts on the AST
(https://clang.llvm.org/docs/LibASTMatchersReference.html). When AST
nodes are found by the matcher a callback is made to the checker. The
checker can then execute additional checks and issue warnings.

Here is an example clang-tidy check to report functions that have calls
to local_irq_disable without calls to local_irq_enable and vice-versa.
Functions flagged with __attribute((annotation("ignore_irq_balancing")))
are ignored for analysis. (https://reviews.llvm.org/D65828)

===Clang static analyzer===

The clang static analyzer is a more powerful static analysis tool that
uses symbolic execution to find bugs. Currently there is a check that
looks for potential security bugs from invalid uses of kmalloc and
kfree. There are several more general purpose checks that are useful for
the kernel.

The clang static analyzer is well documented and designed to be
extensible.
(https://clang-analyzer.llvm.org/checker_dev_manual.html)
(https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf)

The main draw of the clang tools is how accessible they are. The clang
documentation is very nice and these tools are built specifically to be
easily extendable by any developer. They provide an accessible method of
bug-finding and research to people who are not overly familiar with the
kernel codebase.

Signed-off-by: Nathan Huckleberry <nh...@google.com>
---
Changes V1 -> V2:
* Remove dependencies on GNU Parallel
* * Clang-tidy/analyzer now invoked directly from python
Link: https://lkml.org/lkml/2019/8/6/941

Makefile | 3 +
scripts/clang-tools/Makefile.clang-tools | 23 ++++++
.../{ => clang-tools}/gen_compile_commands.py | 0
scripts/clang-tools/run-clang-tools.py | 77 +++++++++++++++++++
4 files changed, 103 insertions(+)
create mode 100644 scripts/clang-tools/Makefile.clang-tools
rename scripts/{ => clang-tools}/gen_compile_commands.py (100%)
create mode 100755 scripts/clang-tools/run-clang-tools.py

diff --git a/Makefile b/Makefile
index fe0164a654c7..3e2df010b342 100644
--- a/Makefile
+++ b/Makefile
@@ -747,6 +747,7 @@ KBUILD_CFLAGS += $(call cc-option,-fno-allow-store-data-races)

include scripts/Makefile.kcov
include scripts/Makefile.gcc-plugins
+include scripts/clang-tools/Makefile.clang-tools

ifdef CONFIG_READABLE_ASM
# Disable optimizations that make assembler listings hard to read.
@@ -1543,6 +1544,8 @@ help:
@echo ' export_report - List the usages of all exported symbols'
@echo ' headerdep - Detect inclusion cycles in headers'
@echo ' coccicheck - Check with Coccinelle'
+ @echo ' clang-analyzer - Check with clang static analyzer'
+ @echo ' clang-tidy - Check with clang-tidy'
@echo ''
@echo 'Tools:'
@echo ' nsdeps - Generate missing symbol namespace dependencies'
diff --git a/scripts/clang-tools/Makefile.clang-tools b/scripts/clang-tools/Makefile.clang-tools
new file mode 100644
index 000000000000..e09dc1a8efff
--- /dev/null
+++ b/scripts/clang-tools/Makefile.clang-tools
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) Google LLC, 2020
+#
+# Author: Nathan Huckleberry <nh...@google.com>
+#
+PHONY += clang-tidy
+clang-tidy:
+ifdef CONFIG_CC_IS_CLANG
+ $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
+ $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
+else
+ $(error Clang-tidy requires CC=clang)
+endif
+
+PHONY += clang-analyzer
+clang-analyzer:
+ifdef CONFIG_CC_IS_CLANG
+ $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
+ $(PYTHON3) scripts/clang-tools/run-clang-tools.py static-analyzer compile_commands.json
+else
+ $(error Clang-analyzer requires CC=clang)
+endif
diff --git a/scripts/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
similarity index 100%
rename from scripts/gen_compile_commands.py
rename to scripts/clang-tools/gen_compile_commands.py
diff --git a/scripts/clang-tools/run-clang-tools.py b/scripts/clang-tools/run-clang-tools.py
new file mode 100755
index 000000000000..d429a150e23a
--- /dev/null
+++ b/scripts/clang-tools/run-clang-tools.py
@@ -0,0 +1,77 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) Google LLC, 2020
+#
+# Author: Nathan Huckleberry <nh...@google.com>
+#
+"""A helper routine run clang-tidy and the clang static-analyzer on
+compile_commands.json."""
+
+import argparse
+import json
+import logging
+import multiprocessing
+import os
+import re
+import subprocess
+
+def parse_arguments():
+ """Set up and parses command-line arguments.
+ Returns:
+ args: Dict of parsed args
+ Has keys 'file' and 'type'
+ """
+ usage = """Run clang-tidy or the clang static-analyzer on a
+ compilation database."""
+ parser = argparse.ArgumentParser(description=usage)
+
+ type_help = ('Type of analysis to be performed')
+ parser.add_argument('type', choices=['clang-tidy', 'static-analyzer'],
+ help=type_help)
+ file_path_help = ('Path to the compilation database to parse')
+ parser.add_argument('file', type=str, help=file_path_help)
+
+ args = parser.parse_args()
+
+ return args
+
+def init(l,t):
+ global lock
+ global analysis_type
+ lock = l
+ analysis_type = t
+
+def run_analysis(entry):
+ filename = entry['file']
+ p = None
+ if(analysis_type == "clang-tidy"):
+ p = subprocess.run(["clang-tidy", "-p", os.getcwd(),
+ "-checks=-*,linuxkernel-*", filename],
+ stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+ if(analysis_type == "static-analyzer"):
+ p = subprocess.run(["clang-tidy", "-p", os.getcwd(),
+ "-checks=-*,clang-analyzer-*", filename],
+ stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+ lock.acquire()
+ print(entry['file'])
+ os.write(1, p.stdout)
+ os.write(2, p.stderr)
+ lock.release()
+
+
+def main():
+ args = parse_arguments()
+ filename = args.file
+
+ #Read JSON data into the datastore variable
+ if filename:
+ with open(filename, 'r') as f:
+ datastore = json.load(f)
+
+ lock = multiprocessing.Lock()
+ pool = multiprocessing.Pool(initializer=init, initargs=(lock,args.type,))
+ pool.map(run_analysis,datastore)
+
+if __name__ == '__main__':
+ main()
--
2.27.0.383.g050319c2ae-goog

Nick Desaulniers

unread,
Jul 8, 2020, 3:11:07 PM7/8/20
to Nathan Huckleberry, Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, LKML, clang-built-linux, Tom Roeder, Bill Wendling, Pirama Arumuga Nainar
+ Tom for the rename.

I think we should add scripts/clang-tools/ to MAINTAINERS under
CLANG/LLVM SUPPORT:
```
diff --git a/MAINTAINERS b/MAINTAINERS
index c87b94e6b2f6..42602231929c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4211,6 +4211,7 @@ W: https://clangbuiltlinux.github.io/
B: https://github.com/ClangBuiltLinux/linux/issues
C: irc://chat.freenode.net/clangbuiltlinux
F: Documentation/kbuild/llvm.rst
+F: scripts/clang-tools/
K: \b(?i:clang|llvm)\b

CLEANCACHE API
```
that way we get cc'ed properly on proposed changes (should folks use
scripts/get_maintainer.pl).
s/Clang/clang/ to match the case of the target.

> +endif
> +
> +PHONY += clang-analyzer
> +clang-analyzer:
> +ifdef CONFIG_CC_IS_CLANG
> + $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> + $(PYTHON3) scripts/clang-tools/run-clang-tools.py static-analyzer compile_commands.json
> +else
> + $(error Clang-analyzer requires CC=clang)

s/Clang/clang/ to match the case of the target.
I don't know if the kernel has a preferred style for Python, but I
think it would be good to be consistent in the use of single vs double
quotes for strings. My preference is for double quotes, but I don't
know enough about the various PEPs for style or if the kernel has a
preferred style for these.

+ Bill who knows a bit about Python style.

> +
> + args = parser.parse_args()
> +
> + return args
> +
> +def init(l,t):
> + global lock
> + global analysis_type
> + lock = l
> + analysis_type = t

Is this canonical Python? Maybe wrap these functions into methods of
an object you construct, that way you can assign these as instance
variables against `self`, rather than using global variables.

> +
> +def run_analysis(entry):
> + filename = entry['file']
> + p = None
> + if(analysis_type == "clang-tidy"):
> + p = subprocess.run(["clang-tidy", "-p", os.getcwd(),
> + "-checks=-*,linuxkernel-*", filename],
> + stdout=subprocess.PIPE, stderr=subprocess.PIPE)
> + if(analysis_type == "static-analyzer"):
> + p = subprocess.run(["clang-tidy", "-p", os.getcwd(),
> + "-checks=-*,clang-analyzer-*", filename],
> + stdout=subprocess.PIPE, stderr=subprocess.PIPE)

When you have a fair amount of duplication between two branches of an
if/else (for instance, same method invocation and number of
parameters, just slight differences in parameter values), consider if
you can use a ternary to simplify or make the code more concise. That
would also help avoid initializing `p` to `None`:

checks = "-checks=-*,linuxkernel-*" if analysis_type == "clang-tidy"
else "-checks=-*,clang-analyzer-*"
p = subprocess.run(["clang-tidy", "-p", os.getcwd(), checks,
stdout=subprocess.PIPE, stderr=subprocess.PIPE]

then maybe do some validation of the analysis_type when validating
command line arguments earlier.

> + lock.acquire()
> + print(entry['file'])
> + os.write(1, p.stdout)
> + os.write(2, p.stderr)

Please use sys.stdout and sys.stderr rather than magic constants for
their file descriptors.

> + lock.release()
> +
> +
> +def main():
> + args = parse_arguments()
> + filename = args.file
> +
> + #Read JSON data into the datastore variable
> + if filename:

Isn't there a way to make command line arguments required with
Argparse? In that case, would you still need the conditional?

> + with open(filename, 'r') as f:
> + datastore = json.load(f)
> +
> + lock = multiprocessing.Lock()
> + pool = multiprocessing.Pool(initializer=init, initargs=(lock,args.type,))
> + pool.map(run_analysis,datastore)

Please use a space to separate parameters in a parameter list.

> +
> +if __name__ == '__main__':
> + main()

So rather than call a function named main, you could simply construct
an object, then call a method on it or have the constructor simply
kick off the analysis (essentially a mix of `main` and `init`).

--
Thanks,
~Nick Desaulniers

Nathan Huckleberry

unread,
Jul 9, 2020, 1:56:20 PM7/9/20
to Nick Desaulniers, Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, LKML, clang-built-linux, Tom Roeder, Bill Wendling, Pirama Arumuga Nainar
I did this to allow shared locks between processes, see
https://stackoverflow.com/questions/25557686/python-sharing-a-lock-between-processes

>
> > +
> > +def run_analysis(entry):
> > + filename = entry['file']
> > + p = None
> > + if(analysis_type == "clang-tidy"):
> > + p = subprocess.run(["clang-tidy", "-p", os.getcwd(),
> > + "-checks=-*,linuxkernel-*", filename],
> > + stdout=subprocess.PIPE, stderr=subprocess.PIPE)
> > + if(analysis_type == "static-analyzer"):
> > + p = subprocess.run(["clang-tidy", "-p", os.getcwd(),
> > + "-checks=-*,clang-analyzer-*", filename],
> > + stdout=subprocess.PIPE, stderr=subprocess.PIPE)
>
> When you have a fair amount of duplication between two branches of an
> if/else (for instance, same method invocation and number of
> parameters, just slight differences in parameter values), consider if
> you can use a ternary to simplify or make the code more concise. That
> would also help avoid initializing `p` to `None`:
>
> checks = "-checks=-*,linuxkernel-*" if analysis_type == "clang-tidy"
> else "-checks=-*,clang-analyzer-*"
> p = subprocess.run(["clang-tidy", "-p", os.getcwd(), checks,
> stdout=subprocess.PIPE, stderr=subprocess.PIPE]
>
> then maybe do some validation of the analysis_type when validating
> command line arguments earlier.

Argparse should already handle validation of the analysis type.
Thanks,
Nathan Huckleberry

Nick Desaulniers

unread,
Jul 13, 2020, 6:09:38 PM7/13/20
to Nathan Huckleberry, Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, LKML, clang-built-linux, Tom Roeder, Bill Wendling, Pirama Arumuga Nainar
(bumping a few points for v3)

On Thu, Jul 9, 2020 at 10:56 AM Nathan Huckleberry <nh...@google.com> wrote:
>
> On Wed, Jul 8, 2020 at 2:11 PM Nick Desaulniers <ndesau...@google.com> wrote:
> >
> > On Wed, Jul 8, 2020 at 11:21 AM 'Nathan Huckleberry' via Clang Built
> > Linux <clang-bu...@googlegroups.com> wrote:
> > >
> > I think we should add scripts/clang-tools/ to MAINTAINERS under
> > CLANG/LLVM SUPPORT:
> > ```
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index c87b94e6b2f6..42602231929c 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -4211,6 +4211,7 @@ W: https://clangbuiltlinux.github.io/
> > B: https://github.com/ClangBuiltLinux/linux/issues
> > C: irc://chat.freenode.net/clangbuiltlinux
> > F: Documentation/kbuild/llvm.rst
> > +F: scripts/clang-tools/
> > K: \b(?i:clang|llvm)\b
> >
> > CLEANCACHE API
> > ```
> > that way we get cc'ed properly on proposed changes (should folks use
> > scripts/get_maintainer.pl).

bump

> > > --- /dev/null
> > > +++ b/scripts/clang-tools/run-clang-tools.py
> > > @@ -0,0 +1,77 @@
> > > +#!/usr/bin/env python
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +#
> > > +# Copyright (C) Google LLC, 2020
> > > +#
> > > +# Author: Nathan Huckleberry <nh...@google.com>
> > > +#
> > > +"""A helper routine run clang-tidy and the clang static-analyzer on
> > > +compile_commands.json."""
> > > +
> > > +import argparse
> > > +import json
> > > +import logging
> > > +import multiprocessing
> > > +import os
> > > +import re

Is `re` being used anywhere?

> > > +import subprocess
> > > +
> > > +def parse_arguments():
> > > + """Set up and parses command-line arguments.
> > > + Returns:
> > > + args: Dict of parsed args
> > > + Has keys 'file' and 'type'
> > > + """
> > > + usage = """Run clang-tidy or the clang static-analyzer on a
> > > + compilation database."""
> > > + parser = argparse.ArgumentParser(description=usage)
> > > +
> > > + type_help = ('Type of analysis to be performed')
> > > + parser.add_argument('type', choices=['clang-tidy', 'static-analyzer'],
> > > + help=type_help)
> > > + file_path_help = ('Path to the compilation database to parse')
> > > + parser.add_argument('file', type=str, help=file_path_help)
> >
> > I don't know if the kernel has a preferred style for Python, but I
> > think it would be good to be consistent in the use of single vs double
> > quotes for strings. My preference is for double quotes, but I don't
> > know enough about the various PEPs for style or if the kernel has a
> > preferred style for these.

double quotes.

> > > +
> > > + args = parser.parse_args()
> > > +
> > > + return args
> > > +
> > > +def init(l,t):
> > > + global lock
> > > + global analysis_type
> > > + lock = l
> > > + analysis_type = t
> >
> > Is this canonical Python? Maybe wrap these functions into methods of
> > an object you construct, that way you can assign these as instance
> > variables against `self`, rather than using global variables.
>
> I did this to allow shared locks between processes, see
> https://stackoverflow.com/questions/25557686/python-sharing-a-lock-between-processes

Ah, ok, I see the problem. In that case, I'm less worried about this.
`global` just sets off red flags unless there is a very good reason to
use it.

>
> >
> > > +
> > > +def run_analysis(entry):
> > > + filename = entry['file']
> > > + p = None
> > > + if(analysis_type == "clang-tidy"):
> > > + p = subprocess.run(["clang-tidy", "-p", os.getcwd(),
> > > + "-checks=-*,linuxkernel-*", filename],
> > > + stdout=subprocess.PIPE, stderr=subprocess.PIPE)
> > > + if(analysis_type == "static-analyzer"):
> > > + p = subprocess.run(["clang-tidy", "-p", os.getcwd(),
> > > + "-checks=-*,clang-analyzer-*", filename],

Is it worthwhile to NOT run `-*` passes and only run
`clang-analyzer-*`? Otherwise `make clang-analyzer` and `make
clang-tidy` contain a ton of duplicate info.

> > > + stdout=subprocess.PIPE, stderr=subprocess.PIPE)
> >
> > When you have a fair amount of duplication between two branches of an
> > if/else (for instance, same method invocation and number of
> > parameters, just slight differences in parameter values), consider if
> > you can use a ternary to simplify or make the code more concise. That
> > would also help avoid initializing `p` to `None`:
> >
> > checks = "-checks=-*,linuxkernel-*" if analysis_type == "clang-tidy"
> > else "-checks=-*,clang-analyzer-*"
> > p = subprocess.run(["clang-tidy", "-p", os.getcwd(), checks,
> > stdout=subprocess.PIPE, stderr=subprocess.PIPE]
> >
> > then maybe do some validation of the analysis_type when validating
> > command line arguments earlier.
>
> Argparse should already handle validation of the analysis type.

Cool, I still think the ternary can simplify a v3.

>
> >
> > > + lock.acquire()
> > > + print(entry['file'])
> > > + os.write(1, p.stdout)
> > > + os.write(2, p.stderr)
> >
> > Please use sys.stdout and sys.stderr rather than magic constants for
> > their file descriptors.

Also, I'm not a fan of how clang-tidy writes the errors to stdout.

$ make LLVM=1 -j71 defconfig clang-tidy 2> log.txt
write part of the log, and spews to stdout. Do you think it would
make sense to redirect stdout from clang-tidy to stderr for this
script?

$ grep error: log.txt | sort -u
$ grep clang-analyzer log.txt | sort -u
Checking some of the clang-tidy warnings, some seem harmless.
linux-next/net/core/devlink.c:9527:6: error: redefinition of
'devlink_compat_running_version' [clang-diagnostic-error]
looks legit, though not terribly important to fix ASAP. So that's cool.
The clang-analyzer report is a little beefier, once the traces start
getting long they become fairly hard to follow. Is it possible to dump
the html report of these? I guess the issue with that is that we
wouldn't be able to join them in the python script.

$ grep clang-analyzer log.txt | sort -u | cut -d '[' -f 2 | sort -u
clang-analyzer-core.CallAndMessage]
clang-analyzer-core.DivideZero]
clang-analyzer-core.NonNullParamChecker]
clang-analyzer-core.NullDereference]
clang-analyzer-core.StackAddressEscape]
clang-analyzer-core.UndefinedBinaryOperatorResult]
clang-analyzer-core.uninitialized.ArraySubscript]
clang-analyzer-core.uninitialized.Assign]
clang-analyzer-core.uninitialized.Branch]
clang-analyzer-core.uninitialized.UndefReturn]
clang-analyzer-deadcode.DeadStores]
clang-analyzer-optin.performance.Padding]
clang-analyzer-optin.portability.UnixAPI]
clang-analyzer-security.insecureAPI.bcmp]
clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling]
clang-analyzer-security.insecureAPI.strcpy]
clang-analyzer-unix.cstring.NullArg]
clang-analyzer-unix.Malloc]
clang-analyzer-valist.Uninitialized]

interesting. I like how clang-analyzer warns about bcmp and yet llvm
will generate calls to bcmp...
--
Thanks,
~Nick Desaulniers

Tom Roeder

unread,
Jul 13, 2020, 6:51:09 PM7/13/20
to Nathan Huckleberry, Nick Desaulniers, Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, LKML, clang-built-linux, Bill Wendling, Pirama Arumuga Nainar
The rename is fine with me.

Nathan Huckleberry

unread,
Jul 14, 2020, 7:24:17 PM7/14/20
to masa...@kernel.org, micha...@markovi.net, linux-...@vger.kernel.org, linux-...@vger.kernel.org, clang-bu...@googlegroups.com, Nathan Huckleberry
Changes v2 -> v3
* Redirect clang-tidy output to stderr
* Style fixes
* Add directory to MAINTAINERS
MAINTAINERS | 1 +
Makefile | 3 +
scripts/clang-tools/Makefile.clang-tools | 23 +++++++
.../{ => clang-tools}/gen_compile_commands.py | 0
scripts/clang-tools/run-clang-tools.py | 69 +++++++++++++++++++
5 files changed, 96 insertions(+)
create mode 100644 scripts/clang-tools/Makefile.clang-tools
rename scripts/{ => clang-tools}/gen_compile_commands.py (100%)
create mode 100755 scripts/clang-tools/run-clang-tools.py

diff --git a/MAINTAINERS b/MAINTAINERS
index 1d4aa7f942de..a444564e5572 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4198,6 +4198,7 @@ W: https://clangbuiltlinux.github.io/
B: https://github.com/ClangBuiltLinux/linux/issues
C: irc://chat.freenode.net/clangbuiltlinux
F: Documentation/kbuild/llvm.rst
+F: scripts/clang-tools/
K: \b(?i:clang|llvm)\b

CLEANCACHE API
diff --git a/Makefile b/Makefile
index fe0164a654c7..3e2df010b342 100644
--- a/Makefile
+++ b/Makefile
@@ -747,6 +747,7 @@ KBUILD_CFLAGS += $(call cc-option,-fno-allow-store-data-races)

include scripts/Makefile.kcov
include scripts/Makefile.gcc-plugins
+include scripts/clang-tools/Makefile.clang-tools

ifdef CONFIG_READABLE_ASM
# Disable optimizations that make assembler listings hard to read.
@@ -1543,6 +1544,8 @@ help:
@echo ' export_report - List the usages of all exported symbols'
@echo ' headerdep - Detect inclusion cycles in headers'
@echo ' coccicheck - Check with Coccinelle'
+ @echo ' clang-analyzer - Check with clang static analyzer'
+ @echo ' clang-tidy - Check with clang-tidy'
@echo ''
@echo 'Tools:'
@echo ' nsdeps - Generate missing symbol namespace dependencies'
diff --git a/scripts/clang-tools/Makefile.clang-tools b/scripts/clang-tools/Makefile.clang-tools
new file mode 100644
index 000000000000..7ad3308c1937
--- /dev/null
+++ b/scripts/clang-tools/Makefile.clang-tools
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) Google LLC, 2020
+#
+# Author: Nathan Huckleberry <nh...@google.com>
+#
+PHONY += clang-tidy
+clang-tidy:
+ifdef CONFIG_CC_IS_CLANG
+ $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
+ $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
+else
+ $(error clang-tidy requires CC=clang)
+endif
+
+PHONY += clang-analyzer
+clang-analyzer:
+ifdef CONFIG_CC_IS_CLANG
+ $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
+ $(PYTHON3) scripts/clang-tools/run-clang-tools.py static-analyzer compile_commands.json
+else
+ $(error clang-analyzer requires CC=clang)
+endif
diff --git a/scripts/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
similarity index 100%
rename from scripts/gen_compile_commands.py
rename to scripts/clang-tools/gen_compile_commands.py
diff --git a/scripts/clang-tools/run-clang-tools.py b/scripts/clang-tools/run-clang-tools.py
new file mode 100755
index 000000000000..00b8532c1729
--- /dev/null
+++ b/scripts/clang-tools/run-clang-tools.py
@@ -0,0 +1,69 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) Google LLC, 2020
+#
+# Author: Nathan Huckleberry <nh...@google.com>
+#
+"""A helper routine run clang-tidy and the clang static-analyzer on
+compile_commands.json."""
+
+import argparse
+import json
+import logging
+import multiprocessing
+import os
+import subprocess
+import sys
+
+def parse_arguments():
+ """Set up and parses command-line arguments.
+ Returns:
+ args: Dict of parsed args
+ Has keys "file" and "type"
+ """
+ usage = """Run clang-tidy or the clang static-analyzer on a
+ compilation database."""
+ parser = argparse.ArgumentParser(description=usage)
+
+ type_help = ("Type of analysis to be performed")
+ parser.add_argument("type", choices=["clang-tidy", "static-analyzer"],
+ help=type_help)
+ file_path_help = ("Path to the compilation database to parse")
+ parser.add_argument("file", type=str, help=file_path_help)
+
+ args = parser.parse_args()
+
+ return args
+
+def init(l,t):
+ global lock
+ global analysis_type
+ lock = l
+ analysis_type = t
+
+def run_analysis(entry):
+ filename = entry["file"]
+ checks = "-checks=-*,linuxkernel-*" if (analysis_type == "clang-tidy") else "-checks=-*,clang-analyzer-*"
+ p = subprocess.run(["clang-tidy", "-p", os.getcwd(),
+ checks, filename],
+ stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
+ lock.acquire()
+ print(filename, file=sys.stderr)
+ sys.stderr.buffer.write(p.stdout)
+ lock.release()
+
+
+def main():
+ args = parse_arguments()
+ filename = args.file
+
+ #Read JSON data into the datastore variable
+ with open(filename, "r") as f:
+ datastore = json.load(f)
+ lock = multiprocessing.Lock()
+ pool = multiprocessing.Pool(initializer=init, initargs=(lock, args.type))
+ pool.map(run_analysis,datastore)
+
+if __name__ == "__main__":
+ main()
--
2.27.0.389.gc38d7665816-goog

Nick Desaulniers

unread,
Jul 14, 2020, 8:18:27 PM7/14/20
to Nathan Huckleberry, Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, LKML, clang-built-linux
^ note below on `static-analyzer`
Rather than "static-analyzer", how about "clang-analyzer" to be
consistent with the `make` target? Top level Makefile would need to
pass that here, too.

> + help=type_help)
> + file_path_help = ("Path to the compilation database to parse")
> + parser.add_argument("file", type=str, help=file_path_help)
> +
> + args = parser.parse_args()
> +
> + return args
> +
> +def init(l,t):
> + global lock
> + global analysis_type
> + lock = l
> + analysis_type = t
> +
> +def run_analysis(entry):
> + filename = entry["file"]
> + checks = "-checks=-*,linuxkernel-*" if (analysis_type == "clang-tidy") else "-checks=-*,clang-analyzer-*"

I'm not sure that the parens are necessary ^, but it's not a big
enough deal to necessitate a v4, IMO.

Though this is still running `-*,` for static-analyzer which I would
like removed.

> + p = subprocess.run(["clang-tidy", "-p", os.getcwd(),
> + checks, filename],
> + stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
> + lock.acquire()
> + print(filename, file=sys.stderr)

Should we drop printing the filename? Analyzing the output of `make
LLVM=1 -j71 defconfig clang-tidy 2> log.txt`, for example I see:
Error while processing
/linux-next/drivers/net/ethernet/freescale/fman/fman_sp.c.
...
<hundreds of lines from different files>
drivers/net/ethernet/freescale/fman/fman_sp.c

It's surprising to me how these appear out of order; maybe buffering
or not has something to do with it?

Anyways, if we print the filename anyways per error, and files with no
errors are just printed kind of meaninglessly, then I think we don't
need to print the filename being analyzed again. For clang-analyzer,
the errors also have the filename per line of the warning printed.

> + sys.stderr.buffer.write(p.stdout)
> + lock.release()
> +
> +
> +def main():
> + args = parse_arguments()
> + filename = args.file
> +
> + #Read JSON data into the datastore variable
> + with open(filename, "r") as f:
> + datastore = json.load(f)
> + lock = multiprocessing.Lock()
> + pool = multiprocessing.Pool(initializer=init, initargs=(lock, args.type))
> + pool.map(run_analysis,datastore)
> +
> +if __name__ == "__main__":
> + main()
> --


--
Thanks,
~Nick Desaulniers

Nathan Huckleberry

unread,
Jul 20, 2020, 7:35:53 PM7/20/20
to masa...@kernel.org, micha...@markovi.net, linux-...@vger.kernel.org, linux-...@vger.kernel.org, clang-bu...@googlegroups.com, pir...@google.com, mo...@google.com, Nathan Huckleberry
Changes v3->v4
* Update usages of static-analyzer to clang-analyzer
* Clarify -* explicitly in comment
* Remove filename printing
index 000000000000..5c9d76f77595
--- /dev/null
+++ b/scripts/clang-tools/Makefile.clang-tools
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) Google LLC, 2020
+#
+# Author: Nathan Huckleberry <nh...@google.com>
+#
+PHONY += clang-tidy
+clang-tidy:
+ifdef CONFIG_CC_IS_CLANG
+ $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
+ $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
+else
+ $(error clang-tidy requires CC=clang)
+endif
+
+PHONY += clang-analyzer
+clang-analyzer:
+ifdef CONFIG_CC_IS_CLANG
+ $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
+ $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-analyzer compile_commands.json
+else
+ $(error clang-analyzer requires CC=clang)
+endif
diff --git a/scripts/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
similarity index 100%
rename from scripts/gen_compile_commands.py
rename to scripts/clang-tools/gen_compile_commands.py
diff --git a/scripts/clang-tools/run-clang-tools.py b/scripts/clang-tools/run-clang-tools.py
new file mode 100755
index 000000000000..44527b3663e9
+ parser.add_argument("type", choices=["clang-tidy", "clang-analyzer"],
+ help=type_help)
+ file_path_help = ("Path to the compilation database to parse")
+ parser.add_argument("file", type=str, help=file_path_help)
+
+ args = parser.parse_args()
+
+ return args
+
+def init(l,t):
+ global lock
+ global analysis_type
+ lock = l
+ analysis_type = t
+
+def run_analysis(entry):
+ filename = entry["file"]
+ #Disable all checks, then re-enable the ones we want
+ checks = "-checks=-*,linuxkernel-*" if (analysis_type == "clang-tidy") else "-checks=-*,clang-analyzer-*"
+ p = subprocess.run(["clang-tidy", "-p", os.getcwd(),
+ checks, filename],
+ stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
+ lock.acquire()
+ sys.stderr.buffer.write(p.stdout)
+ lock.release()
+
+
+def main():
+ args = parse_arguments()
+ filename = args.file
+
+ #Read JSON data into the datastore variable
+ with open(filename, "r") as f:
+ datastore = json.load(f)
+ lock = multiprocessing.Lock()
+ pool = multiprocessing.Pool(initializer=init, initargs=(lock, args.type))
+ pool.map(run_analysis,datastore)
+
+if __name__ == "__main__":
+ main()
--
2.28.0.rc0.105.gf9edc3c819-goog

Bill Wendling

unread,
Jul 21, 2020, 8:17:09 AM7/21/20
to Nathan Huckleberry, Masahiro Yamada, micha...@markovi.net, linux-...@vger.kernel.org, LKML, clang-built-linux, Pirama Arumuga Nainar
On Mon, Jul 20, 2020 at 4:35 PM Nathan Huckleberry <nh...@google.com> wrote:
This patch adds clang-tidy and the clang static-analyzer as make
targets. The goal of this patch is to make static analysis tools
usable and extendable by any developer or researcher who is familiar
with basic c++.

The current static analysis tools require intimate knowledge of the
internal
workings of the static analysis.  Clang-tidy and the clang static
analyzers
expose an easy to use api and allow users unfamiliar with clang to
write new checks with relative ease.

Please reflow this comment.
Should this be formatted in PEP8 style instead of our internal style?
Nit: Please add a space after the "#" in comments.
 
+  checks = "-checks=-*,linuxkernel-*" if (analysis_type == "clang-tidy") else "-checks=-*,clang-analyzer-*"

This isn't very readable. The only bit that changes is the "linuxkernel" / "clang-analyzer".

+  p = subprocess.run(["clang-tidy", "-p", os.getcwd(),
+                    checks, filename],
+                    stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
+  lock.acquire()
+  sys.stderr.buffer.write(p.stdout)
+  lock.release()
+
Probably best to use a with statement:

  with lock:
    sys.stderr.buffer.write(p.stdout)

+
+def main():
+  args = parse_arguments()
+  filename = args.file
+
+  #Read JSON data into the datastore variable
+  with open(filename, "r") as f:
+    datastore = json.load(f)
+    lock = multiprocessing.Lock()
+    pool = multiprocessing.Pool(initializer=init, initargs=(lock, args.type))
+    pool.map(run_analysis,datastore)

Nit: It's good practice to have only necessary statements in a "with" statement so that you don't hold the file open too long.

Nathan Huckleberry

unread,
Jul 22, 2020, 7:11:25 PM7/22/20
to masa...@kernel.org, micha...@markovi.net, linux-...@vger.kernel.org, linux-...@vger.kernel.org, clang-bu...@googlegroups.com, pir...@google.com, mo...@google.com, Nathan Huckleberry
This patch adds clang-tidy and the clang static-analyzer as make
targets. The goal of this patch is to make static analysis tools
usable and extendable by any developer or researcher who is familiar
with basic c++.

The current static analysis tools require intimate knowledge of the
internal workings of the static analysis. Clang-tidy and the clang
static analyzers expose an easy to use api and allow users unfamiliar
with clang to write new checks with relative ease.

Changes v4->v5
* Use PEP8 style
* Other misc style fixes
MAINTAINERS | 1 +
Makefile | 3 +
scripts/clang-tools/Makefile.clang-tools | 23 ++++++
.../{ => clang-tools}/gen_compile_commands.py | 0
scripts/clang-tools/run-clang-tools.py | 76 +++++++++++++++++++
5 files changed, 103 insertions(+)
index 000000000000..41ed15d99933
--- /dev/null
+++ b/scripts/clang-tools/run-clang-tools.py
@@ -0,0 +1,76 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) Google LLC, 2020
+#
+# Author: Nathan Huckleberry <nh...@google.com>
+#
+"""A helper routine run clang-tidy and the clang static-analyzer on
+compile_commands.json.
+"""
+
+import argparse
+import json
+import logging
+import multiprocessing
+import os
+import subprocess
+import sys
+
+
+def parse_arguments():
+ """Set up and parses command-line arguments.
+ Returns:
+ args: Dict of parsed args
+ Has keys: [file, type]
+ """
+ usage = """Run clang-tidy or the clang static-analyzer on a
+ compilation database."""
+ parser = argparse.ArgumentParser(description=usage)
+
+ type_help = "Type of analysis to be performed"
+ parser.add_argument("type",
+ choices=["clang-tidy", "clang-analyzer"],
+ help=type_help)
+ file_path_help = "Path to the compilation database to parse"
+ parser.add_argument("file", type=str, help=file_path_help)
+
+ args = parser.parse_args()
+
+ return args
+
+
+def init(l, t):
+ global lock
+ global analysis_type
+ lock = l
+ analysis_type = t
+
+
+def run_analysis(entry):
+ filename = entry["file"]
+ # Disable all checks, then re-enable the ones we want
+ checks = "-checks=-*,{}".format("linuxkernel-*" if analysis_type ==
+ "clang-tidy" else "clang-analyzer-*")
+ p = subprocess.run(
+ ["clang-tidy", "-p", os.getcwd(), checks, filename],
+ stdout=subprocess.PIPE,
+ stderr=subprocess.STDOUT)
+ with lock:
+ sys.stderr.buffer.write(p.stdout)
+
+
+def main():
+ args = parse_arguments()
+ filename = args.file
+
+ lock = multiprocessing.Lock()
+ pool = multiprocessing.Pool(initializer=init, initargs=(lock, args.type))
+ # Read JSON data into the datastore variable
+ with open(filename, "r") as f:
+ datastore = json.load(f)
+ pool.map(run_analysis, datastore)
+

Nick Desaulniers

unread,
Jul 23, 2020, 2:45:40 PM7/23/20
to Nathan Huckleberry, Bill Wendling, Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, LKML, clang-built-linux, Pirama Arumuga Nainar
ugh, sorry, I have one last nit. This line bothers me for two reasons:
1. The use of ternary statements is preferred when it is more concise
than than an if/else statement. If the ternary doesn't fit on one
line, is it still more concise? The wrapping of the condition here
is...upsetting.
2. `format` is nice when string interpolation is needed within the
middle of a string, but when you're simply appending to the end of a
string, it might be more concise to use the `+=` operator.

I think this statement would would be nicer as:
```python
checks = "-check=-*,"
checks += "linuxkernel-*" if analysis_type == "clang-tidy" else
"clang-analyzer-*"
```
If PEP8 requires line length <= 80; then maybe
```python
checks = "-check=-*,"
if analysis_type == "clang-tidy":
checks += "linuxkernel-*"
else:
checks += "clang-analyzer-*"
```
is more appropriate. Bill, thoughts?

> + p = subprocess.run(
> + ["clang-tidy", "-p", os.getcwd(), checks, filename],
> + stdout=subprocess.PIPE,
> + stderr=subprocess.STDOUT)
> + with lock:
> + sys.stderr.buffer.write(p.stdout)
> +
> +
> +def main():
> + args = parse_arguments()
> + filename = args.file
> +
> + lock = multiprocessing.Lock()
> + pool = multiprocessing.Pool(initializer=init, initargs=(lock, args.type))
> + # Read JSON data into the datastore variable
> + with open(filename, "r") as f:
> + datastore = json.load(f)
> + pool.map(run_analysis, datastore)
> +
> +
> +if __name__ == "__main__":
> + main()
> --
> 2.28.0.rc0.105.gf9edc3c819-goog
>

Bill Wendling

unread,
Jul 23, 2020, 3:00:47 PM7/23/20
to Nick Desaulniers, Nathan Huckleberry, Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, LKML, clang-built-linux, Pirama Arumuga Nainar
I like the second option better.

-bw 

Nathan Huckleberry

unread,
Jul 24, 2020, 3:38:53 PM7/24/20
to masa...@kernel.org, micha...@markovi.net, linux-...@vger.kernel.org, linux-...@vger.kernel.org, clang-bu...@googlegroups.com, pir...@google.com, mo...@google.com, Nathan Huckleberry
Changes v5->v6
* Minor style fixes
MAINTAINERS | 1 +
Makefile | 3 +
scripts/clang-tools/Makefile.clang-tools | 23 ++++++
.../{ => clang-tools}/gen_compile_commands.py | 0
scripts/clang-tools/run-clang-tools.py | 77 +++++++++++++++++++
5 files changed, 104 insertions(+)
index 000000000000..1f4cd706ec01
--- /dev/null
+++ b/scripts/clang-tools/run-clang-tools.py
@@ -0,0 +1,77 @@
+ return parser.parse_args()
+
+
+def init(l, t):
+ global lock
+ global analysis_type
+ lock = l
+ analysis_type = t
+
+
+def run_analysis(entry):
+ filename = entry["file"]
+ # Disable all checks, then re-enable the ones we want
+ checks = "-checks=-*,"
+ if analysis_type == "clang-tidy":
+ checks += "linuxkernel-*"
+ else:
+ checks += "clang-analyzer-*"
+ p = subprocess.run(
+ ["clang-tidy", "-p", os.getcwd(), checks, filename],
+ stdout=subprocess.PIPE,
+ stderr=subprocess.STDOUT)
+ with lock:
+ sys.stderr.buffer.write(p.stdout)
+
+
+def main():
+ args = parse_arguments()
+ filename = args.file
+
+ lock = multiprocessing.Lock()
+ pool = multiprocessing.Pool(initializer=init, initargs=(lock, args.type))
+ # Read JSON data into the datastore variable
+ with open(filename, "r") as f:
+ datastore = json.load(f)
+ pool.map(run_analysis, datastore)
+
+
+if __name__ == "__main__":
+ main()
--
2.28.0.rc0.142.g3c755180ce-goog

Nick Desaulniers

unread,
Jul 24, 2020, 7:52:39 PM7/24/20
to Nathan Huckleberry, Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, LKML, clang-built-linux, Pirama Arumuga Nainar, Bill Wendling, Tom Roeder
Thanks, patch LGTM now, but when I run clang-tidy for x86, I get a
bunch of errors that make this look broken.

Let's sort this out internally, next week?

For aarch64 defconfig clang-tidy already observes 3 instances of
linuxkernel-must-check-errs. Nice.

I think the script that generates compile_commands.json doesn't limit
itself to the current arch; if I don't do a bunch of `ARCH=x make
clean` I get additional garbage in the analyses from garbage in the
compile_commands.json.
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-li...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20200724193551.2158677-1-nhuck%40google.com.



--
Thanks,
~Nick Desaulniers

Nathan Huckleberry

unread,
Jul 27, 2020, 8:47:58 PM7/27/20
to masa...@kernel.org, micha...@markovi.net, linux-...@vger.kernel.org, linux-...@vger.kernel.org, clang-bu...@googlegroups.com, pir...@google.com, mo...@google.com, Nathan Huckleberry
---
Changes v6->v7
* Fix issues with relative paths
* Additional style fixes
MAINTAINERS | 1 +
Makefile | 3 +
scripts/clang-tools/Makefile.clang-tools | 23 ++++++
.../{ => clang-tools}/gen_compile_commands.py | 0
scripts/clang-tools/run-clang-tools.py | 74 +++++++++++++++++++
5 files changed, 101 insertions(+)
index 000000000000..fa7655c7cec0
--- /dev/null
+++ b/scripts/clang-tools/run-clang-tools.py
@@ -0,0 +1,74 @@
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) Google LLC, 2020
+#
+# Author: Nathan Huckleberry <nh...@google.com>
+#
+"""A helper routine run clang-tidy and the clang static-analyzer on
+compile_commands.json.
+"""
+
+import argparse
+import json
+import multiprocessing
+import os
+import subprocess
+import sys
+
+
+def parse_arguments():
+ """Set up and parses command-line arguments.
+ Returns:
+ args: Dict of parsed args
+ Has keys: [path, type]
+ """
+ usage = """Run clang-tidy or the clang static-analyzer on a
+ compilation database."""
+ parser = argparse.ArgumentParser(description=usage)
+
+ type_help = "Type of analysis to be performed"
+ parser.add_argument("type",
+ choices=["clang-tidy", "clang-analyzer"],
+ help=type_help)
+ path_help = "Path to the compilation database to parse"
+ parser.add_argument("path", type=str, help=path_help)
+
+ return parser.parse_args()
+
+
+def init(l, a):
+ global lock
+ global args
+ lock = l
+ args = a
+
+
+def run_analysis(entry):
+ # Disable all checks, then re-enable the ones we want
+ checks = "-checks=-*,"
+ if args.type == "clang-tidy":
+ checks += "linuxkernel-*"
+ else:
+ checks += "clang-analyzer-*"
+ p = subprocess.run(["clang-tidy", "-p", args.path, checks, entry["file"]],
+ stdout=subprocess.PIPE,
+ stderr=subprocess.STDOUT,
+ cwd=entry["directory"])
+ with lock:
+ sys.stderr.buffer.write(p.stdout)
+
+
+def main():
+ args = parse_arguments()
+
+ lock = multiprocessing.Lock()
+ pool = multiprocessing.Pool(initializer=init, initargs=(lock, args))
+ # Read JSON data into the datastore variable
+ with open(args.path, "r") as f:

Nick Desaulniers

unread,
Jul 28, 2020, 4:35:20 PM7/28/20
to Nathan Huckleberry, Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, LKML, clang-built-linux, Pirama Arumuga Nainar, Bill Wendling, Sami Tolvanen
^ unused import. Maybe Masahiro would be so kind as to drop that line
for you when merging v7?

That said, I hammered on this in testing, and it now LGTM.

Reviewed-by: Nick Desaulniers <ndesau...@google.com>
Tested-by: Nick Desaulniers <ndesau...@google.com>

For testing, I did `make clean` for x86/arm/arm64/powerpc, then a
defconfig (pseries_defconfig for ppc) build, then:
$ make -j71 LLVM=1 clang-tidy 2> log.txt
$ grep -e linuxkernel- -e clang- log.txt | sort -u | cut -d '[' -f 2 |
sort | uniq -c
$ make -j71 LLVM=1 clang-analyzer 2> log.txt
$ grep -e linuxkernel- -e clang- log.txt | sort -u | cut -d '[' -f 2 |
sort | uniq -c

Looks like it's already finding a couple bugs. I'll probably disable
clang-diagnostic-* in a follow up, as those seem pretty noisy, though
clang-diagnostic-incompatible-pointer-types will likely be of interest
for the CFI work.

Thanks for all of the work that went into this, and tolerating my pedantry.
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-li...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20200728004736.3590053-1-nhuck%40google.com.



--
Thanks,
~Nick Desaulniers

Lukas Bulwahn

unread,
Aug 1, 2020, 3:23:16 PM8/1/20
to Nathan Huckleberry, Nick Desaulniers, masa...@kernel.org, micha...@markovi.net, linux-...@vger.kernel.org, linux-...@vger.kernel.org, clang-bu...@googlegroups.com, pir...@google.com, mo...@google.com
Hi Nathan, Hi Nick,

I have been busy with other topics around the kernel and static analysis;
but then, I read clang and static analysis in my mailbox in this patch.

So, I thought let me give this patch a try on the weekend.

I applied the patch on next-2020729; that worked.

Then:
$ make clang-tidy
scripts/clang-tools/Makefile.clang-tools:13: *** clang-tidy requires
CC=clang. Stop.

Okay, that is a good and clear error message.

Then:

$ make CC=clang-10 defconfig
$ make CC=clang-10 clang-tidy

python3 scripts/clang-tools/gen_compile_commands.py
WARNING: Found 8 entries. Have you compiled the kernel?
python3 scripts/clang-tools/run-clang-tools.py clang-tidy
compile_commands.json

Then actually an error in clang-tidy.
Error: no checks enabled.
USAGE: clang-tidy [options] <source0> [... <sourceN>]
...

I will get to that later how I fixed that for my setup.

Okay, good, that is clear... I need to compile it first, got it.

$ make CC=clang-10
$ make CC=clang-10 clang-tidy

Okay, I run except for the fix I needed.

Where is the output from clang-tidy?

It prints:

python3 scripts/clang-tools/gen_compile_commands.py
python3 scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json

That is it. Does that mean 0 warnings, or where do I find the output?
The script suggests it should be in stderr once all the parallel runs
collected it, right?

I was confused; maybe a short summary output might help here.

Then, I ran

$ make CC=clang-10 clang-analyzer

And I see a lot of warnings... I guess that is intended.

There is a lot of:

Suppressed XX warnings (XX in non-user code).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.

To an outsider, it is unclear if that is intended or if the tool is broken
in this setup.

Is there are way to silent that meta-warning? Or is my setup broken?

In summary, it is pretty clear how to run clang-tidy and clang-analyzer
and it was a pretty smooth experience, even with no documentation at hand.

It was fun for me. Keep up the good work!

Just one issue... see below.
You hardcoded here: clang-tidy

But in my Ubuntu 18.04 setup, I got multiple versions of clang and
clang-tidy installed; yeah, maybe my setup is broken, but maybe those from
others are similar.

When I run:

make CC=clang-10 clang-tidy

it picks up the "wrong" clang-tidy version...

My setup is:

$ which clang-tidy
/usr/bin/clang-tidy

$ which clang-tidy-10
/usr/bin/clang-tidy-10

$ clang-tidy --version
LLVM (http://llvm.org/):
LLVM version 6.0.0

Optimized build.
Default target: x86_64-pc-linux-gnu
Host CPU: znver1

$ clang-tidy-10 --version
LLVM (http://llvm.org/):
LLVM version 10.0.1

Optimized build.
Default target: x86_64-pc-linux-gnu
Host CPU: znver1

When I run make CC=clang-10 clang-tidy, I would expect it to use
clang-tidy-10, not clang-tidy. (clang-tidy errors just because it is too
old; I guess it does have the linuxkernel-* options.)

Now, I cannot fix that without touching your script. There is no way I can
tell the build target to use clang-tidy-10.

With a quick touch:

- p = subprocess.run(["clang-tidy", "-p", args.path, checks, entry["file"]],
+ p = subprocess.run(["clang-tidy-10", "-p", args.path, checks, entry["file"]],

I got it to work.

Maybe you have a good idea how to get make clang-tidy to pick
up the intended version without touching the python script itself?

It is a minor issue, but it would be nice if that setting would work
somehow.

Thanks a lot.

Best regards,

Lukas

> + stdout=subprocess.PIPE,
> + stderr=subprocess.STDOUT,
> + cwd=entry["directory"])
> + with lock:
> + sys.stderr.buffer.write(p.stdout)
> +
> +
> +def main():
> + args = parse_arguments()
> +
> + lock = multiprocessing.Lock()
> + pool = multiprocessing.Pool(initializer=init, initargs=(lock, args))
> + # Read JSON data into the datastore variable
> + with open(args.path, "r") as f:
> + datastore = json.load(f)
> + pool.map(run_analysis, datastore)
> +
> +
> +if __name__ == "__main__":
> + main()
> --
> 2.28.0.rc0.142.g3c755180ce-goog
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-li...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20200728004736.3590053-1-nhuck%40google.com.
>

Nick Desaulniers

unread,
Aug 3, 2020, 2:49:29 PM8/3/20
to Lukas Bulwahn, Nathan Huckleberry, Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, LKML, clang-built-linux, Pirama Arumuga Nainar, Bill Wendling
On Sat, Aug 1, 2020 at 12:23 PM Lukas Bulwahn <lukas....@gmail.com> wrote:
>
> Hi Nathan, Hi Nick,
>
> I have been busy with other topics around the kernel and static analysis;
> but then, I read clang and static analysis in my mailbox in this patch.
>
> So, I thought let me give this patch a try on the weekend.
>
> I applied the patch on next-2020729; that worked.
>
> Then:
> $ make clang-tidy
> scripts/clang-tools/Makefile.clang-tools:13: *** clang-tidy requires
> CC=clang. Stop.
>
> Okay, that is a good and clear error message.
>
> Then:
>
> $ make CC=clang-10 defconfig
> $ make CC=clang-10 clang-tidy
>
> python3 scripts/clang-tools/gen_compile_commands.py
> WARNING: Found 8 entries. Have you compiled the kernel?
> python3 scripts/clang-tools/run-clang-tools.py clang-tidy
> compile_commands.json
>
> Then actually an error in clang-tidy.
> Error: no checks enabled.
> USAGE: clang-tidy [options] <source0> [... <sourceN>]
> ...
>
> I will get to that later how I fixed that for my setup.
>
> Okay, good, that is clear... I need to compile it first, got it.

Hi Lukas,
Thank you so much for taking the time to apply the patch and help test it.

For the case of not doing a build first: gen_compile_commands.py
parses the .*.d files to build the compilation database and warns if
not many were found. I think it might be interesting for it to just
invoke a build if it sees that, or maybe for the clang-tidy and
clang-analyzer targets to somehow invoke the default make target. The
issue there might be that you need to invoke `make clang-tidy` with
`make CC=clang LD=ld.lld ... clang-tidy` in order to trigger a build
successfully.

Also, I wonder if gen_compile_commands.py should set a return code in
that case so that callers can handle such an exceptional case? In
that case, I'd consider that a papercut or potential improvement to
scripts/get_compile_commands.py orthogonal to this patch.

>
> $ make CC=clang-10
> $ make CC=clang-10 clang-tidy
>
> Okay, I run except for the fix I needed.
>
> Where is the output from clang-tidy?
>
> It prints:
>
> python3 scripts/clang-tools/gen_compile_commands.py
> python3 scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
>
> That is it. Does that mean 0 warnings, or where do I find the output?
> The script suggests it should be in stderr once all the parallel runs
> collected it, right?
>
> I was confused; maybe a short summary output might help here.

I was also caught by this; for x86 defconfig, the kernel is actually
clean of instances of linuxkernel-* clang-tidy checks (there was also
an issue with the CWD for x86 in v6, but was resolved in v7 of this
patch). I had to add a case that should fail to make clang-tidy have
output, and the check only checks for unchecked "ERR_PTR", "PTR_ERR",
"IS_ERR", "IS_ERR_OR_NULL", "ERR_CAST", "PTR_ERR_OR_ZERO".
(Documentation for that should be improved.) So if you add a function
that just constructs an `ERR_PTR(0)` and call it from code that gets
compiled in, then you'll start to see warnings from clang-tidy for x86
defconfig. For aarch64 and arm, you'll see there are some unchecked
cases that look like low hanging fruit to fix.

It probably can be improved to note that there was no output, but that
will require more machinery to track how much output there was. I'd
prefer to follow up with additional polish on top of this once this
lands.

>
> Then, I ran
>
> $ make CC=clang-10 clang-analyzer
>
> And I see a lot of warnings... I guess that is intended.
>
> There is a lot of:
>
> Suppressed XX warnings (XX in non-user code).
> Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
>
> To an outsider, it is unclear if that is intended or if the tool is broken
> in this setup.
>
> Is there are way to silent that meta-warning? Or is my setup broken?

See also my comment about disabling the clang-diagnostic-* analyzer
checks. We haven't had time to sort out the cause of them yet, and for
now they just look like noise.

>
> In summary, it is pretty clear how to run clang-tidy and clang-analyzer
> and it was a pretty smooth experience, even with no documentation at hand.
>
> It was fun for me. Keep up the good work!
>
> Just one issue... see below.
>
Ah right, sorry, I tend to forget about the use case of having
multiple versions of clang installed.

I think the best approach here might be for the user (you, in this
case) to ensure that list of PATHs in the path list has the path to
the intended version of clang-tidy you'd like to run listed before
others. That is similar to the recommendation for the LLVM=1 patch
set. ie.
commit a0d1c951ef08e ("kbuild: support LLVM=1 to switch the default
tools to Clang/LLVM")
specifically this part of the commit message:

> the suffixed versions in /usr/bin/ are symlinks to binaries in
> /usr/lib/llvm-#/bin/, so this can also be handled by PATH.

If `clang-tidy` on your system points to an old version of clang-tidy,
it may be worthwhile to uninstall the old version, and update the
symlink to point to a newer version. That may be simpler than trying
to support invoking `make clang-tidy` for arbitrary versions or binary
names of clang-tidy. I can understand having multiple versions of a
compiler installed for quickly checking compatibility (though these
days I prefer godbolt.org for that) or if a particular codebase is
stuck on an old version of a toolchain for whatever reason; but having
multiple versions of clang-tidy installed and supporting all of them
is a little harder to justify.
--
Thanks,
~Nick Desaulniers

Lukas Bulwahn

unread,
Aug 3, 2020, 4:29:48 PM8/3/20
to Nick Desaulniers, Lukas Bulwahn, Nathan Huckleberry, Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, LKML, clang-built-linux, Pirama Arumuga Nainar, Bill Wendling
I think the workflow is good as it is.

This already indicates to me that the clang-tidy result will depend on the
kernel config and the build, which is helpful to know.

The coccicheck target, e.g., ignores the kernel config; other basic
'syntactic' checks in the Makefile, such as make includecheck, seem also
to be config-independent, but you cannot run it without having a config...
that is a bit inconsistent in that case.

> Also, I wonder if gen_compile_commands.py should set a return code in
> that case so that callers can handle such an exceptional case? In
> that case, I'd consider that a papercut or potential improvement to
> scripts/get_compile_commands.py orthogonal to this patch.
>

I cannot see the immediate use case yet, but maybe I can provide a
specific use case once I try using clang-tidy in my attempt to work with
Ericsson's CodeChecker Web UI. It is still a very investigation for me
here with that.
Agree, it is somehow unfortunate that clang-tidy cannot provide such
summary. No big issue, though. You will just have further developers
asking the same question when more developers are attracted...

> >
> > Then, I ran
> >
> > $ make CC=clang-10 clang-analyzer
> >
> > And I see a lot of warnings... I guess that is intended.
> >
> > There is a lot of:
> >
> > Suppressed XX warnings (XX in non-user code).
> > Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
> >
> > To an outsider, it is unclear if that is intended or if the tool is broken
> > in this setup.
> >
> > Is there are way to silent that meta-warning? Or is my setup broken?
>
> See also my comment about disabling the clang-diagnostic-* analyzer
> checks. We haven't had time to sort out the cause of them yet, and for
> now they just look like noise.
>

Okay, good to know. So, my setup is not broken and the tool works :)

So, then I guess I finished with this result:

Tested-by: Lukas Bulwahn <lukas....@gmail.com>
I can live with that. I can modify the script or modify the symbolic link
of clang-tidy. No need to support arbitrary broken setups :)


Other question I came across while playing around with clang-tidy on the
kernel sources.

Is it possible to simply have a .clang-tidy in the repository root to
configure the clang-tidy invocation rather than having it in the python
script? If not, why?

I tried it, but I could not really figure out what I really needed to do
to get the same output behavior than Nathan's command-line invocation
from the python script.

Here is my .clang-tidy attempt:

$ cat .clang-tidy
Checks: '-*,linuxkernel-*'

$ clang-tidy -dump-config
---
Checks: 'clang-diagnostic-*,clang-analyzer-*,-*,linuxkernel-*'
WarningsAsErrors: ''
HeaderFilterRegex: ''
AnalyzeTemporaryDtors: false
FormatStyle: none
User: lukas
CheckOptions:
- key:
google-readability-braces-around-statements.ShortStatementLines
value: '1'
- key: google-readability-function-size.StatementThreshold
value: '800'
- key:
google-readability-namespace-comments.ShortNamespaceLines
value: '10'
- key:
google-readability-namespace-comments.SpacesBeforeComments
value: '2'
- key: modernize-loop-convert.MaxCopySize
value: '16'
- key: modernize-loop-convert.MinConfidence
value: reasonable
- key: modernize-loop-convert.NamingStyle
value: CamelCase
- key: modernize-pass-by-value.IncludeStyle
value: llvm
- key: modernize-replace-auto-ptr.IncludeStyle
value: llvm
- key: modernize-use-nullptr.NullMacros
value: 'NULL'
...

I could not understand why 'clang-diagnostic-*,clang-analyzer-*' is still
there when running clang-tidy. I think that actually leads to more
(unwanted) findings, when I run for example:

$ find ./kernel/trace/ -name '*.c' -exec clang-tidy {} +

such as:

./arch/x86/include/asm/apic.h:107:2: error: expected '(' after 'asm'
[clang-diagnostic-error]
alternative_io("movl %0, %P1", "xchgl %0, %P1", X86_BUG_11AP,
^
./arch/x86/include/asm/alternative.h:240:2: note: expanded from macro
'alternative_io'
asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature) \
^
./include/linux/compiler_types.h:239:24: note: expanded from macro
'asm_inline'
#define asm_inline asm __inline
^

Is there no way to completely override the standard clang-tidy
configuration with .clang-tidy file?

I expected that ',-*' deactivates everything that was activated to the
left of this checks expression, but it seems not to be the case.

Did you also try that and hence then settled for passing that checks as
command-line argument because there is no way to have a sane .clang-tidy
file in the root? Or did I stumble on some .clang-tidy file
misunderstanding and we can actually place a .clang-tidy file for simple
use cases as the example, find ... -exec clang-tidy {} +, above?


Lukas

> --
> Thanks,
> ~Nick Desaulniers
>

Masahiro Yamada

unread,
Aug 6, 2020, 4:44:54 AM8/6/20
to Nathan Huckleberry, Michal Marek, Linux Kbuild mailing list, Linux Kernel Mailing List, clang-built-linux, Pirama Arumuga Nainar, Bill Wendling
You can unify the almost same two rules.

PHONY += clang-tidy clang-analyzer
clang-tidy clang-analyzer:
ifdef CONFIG_CC_IS_CLANG
$(PYTHON3) scripts/clang-tools/gen_compile_commands.py
$(PYTHON3) scripts/clang-tools/run-clang-tools.py $@
compile_commands.json
else
$(error $@ requires CC=clang)
endif




But, before we proceed, please tell me
what this check is intended for.





Case 1)
Build the kernel with CC=clang,
and then run clang-tidy without CC=clang.

$ make CC=clang defconfig
$ make CC=clang -j$(nproc)
$ make clang-tidy

scripts/clang-tools/Makefile.clang-tools:13: *** clang-tidy requires
CC=clang. Stop.




Case 2)
Build the kernel using GCC,
and then run clang-tidy with CC=clang.

$ make defconfig
$ make -j$(nproc)
$ make CC=clang clang-tidy

This patch happily runs clang-tidy
although compile_commands.json
contains GCC commands.





So, it checks if you have passed CC=clang
to "make clang-tidy", where I do not see
any user of the $(CC) variable.

It does not care whether you have built
the kernel with GCC or Clang.



What happens if you run clang-tidy against
compile_commands.json that contains GCC
commands?


I also care about stale commands
in compile_commands.json.


gen_compile_commands.py creates compile_commands.json
based on *.cmd files it found.

If you rebuild the kernel for various settings
using GCC/clang without "make clean",
stale .*.cmd files will grow.

compile_commands.json will pick up commands
from older compilation, i.e. the end up with
the mixture of GCC/Clang commands.

So, I'd like to know how clang-tidy handles
the GCC commands from compile_commands.json
--
Best Regards
Masahiro Yamada

Nathan Huckleberry

unread,
Aug 6, 2020, 5:42:13 PM8/6/20
to Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, Linux Kernel Mailing List, clang-built-linux, Pirama Arumuga Nainar, Bill Wendling
I like this.

>
>
>
> But, before we proceed, please tell me
> what this check is intended for.
>

Clang-tidy invokes clang using the command line
options specified in the compile_commands.json file.
Using gcc command line options causes a bunch of
errors for unknown options.

>
>
>
>
> Case 1)
> Build the kernel with CC=clang,
> and then run clang-tidy without CC=clang.
>
> $ make CC=clang defconfig
> $ make CC=clang -j$(nproc)
> $ make clang-tidy
>
> scripts/clang-tools/Makefile.clang-tools:13: *** clang-tidy requires
> CC=clang. Stop.
>

I suppose this case could allow clang-tidy to
be run.

>
>
>
> Case 2)
> Build the kernel using GCC,
> and then run clang-tidy with CC=clang.
>
> $ make defconfig
> $ make -j$(nproc)
> $ make CC=clang clang-tidy
>
> This patch happily runs clang-tidy
> although compile_commands.json
> contains GCC commands.
>

This is the worst of the two cases. I'm not
sure how to prevent this other than parsing the
compiler invocation in run-clang-tools.py.

I'm open to better suggestions.

>
>
>
>
> So, it checks if you have passed CC=clang
> to "make clang-tidy", where I do not see
> any user of the $(CC) variable.
>
> It does not care whether you have built
> the kernel with GCC or Clang.
>
>
>
> What happens if you run clang-tidy against
> compile_commands.json that contains GCC
> commands?

Clang-tidy itself uses the command line options from
compile_commands.json to invoke clang. If you run
clang-tidy against GCC commands you get lots of
errors similar to this

Found compiler error(s).
12 warnings and 8 errors generated.
Error while processing /usr/local/google/home/nhuck/linux/arch/x86/lib/iomem.c.
error: unknown argument: '-fconserve-stack' [clang-diagnostic-error]
error: unknown argument: '-fno-var-tracking-assignments'
[clang-diagnostic-error]
error: unknown argument: '-mindirect-branch-register' [clang-diagnostic-error]
error: unknown argument: '-mindirect-branch=thunk-extern'
[clang-diagnostic-error]
error: unknown argument: '-mno-fp-ret-in-387' [clang-diagnostic-error]
error: unknown argument: '-mpreferred-stack-boundary=3' [clang-diagnostic-error]
error: unknown argument: '-mskip-rax-setup' [clang-diagnostic-error]

>
>
> I also care about stale commands
> in compile_commands.json.
>

I agree with this point, but it's more of a bug with
gen_compile_commands.py. Maybe gen_compile_commands.py
could emit a warning when stale commands are detected in the
.*.cmd files.

Masahiro Yamada

unread,
Aug 6, 2020, 6:10:59 PM8/6/20
to Nathan Huckleberry, Michal Marek, Linux Kbuild mailing list, Linux Kernel Mailing List, clang-built-linux, Pirama Arumuga Nainar, Bill Wendling
Nathan, thanks for your comments.

I can improve this
so compile_commands.json contains
only commands from the last build.

Working on a patch.

Nathan Huckleberry

unread,
Aug 11, 2020, 9:24:17 PM8/11/20
to Masahiro Yamada, Michal Marek, Linux Kbuild mailing list, Linux Kernel Mailing List, clang-built-linux, Pirama Arumuga Nainar, Bill Wendling
Sounds good. Do you think this patch is ready to land then?

Masahiro Yamada

unread,
Aug 12, 2020, 1:49:05 PM8/12/20
to Nathan Huckleberry, Michal Marek, Linux Kbuild mailing list, Linux Kernel Mailing List, clang-built-linux, Pirama Arumuga Nainar, Bill Wendling
On Wed, Aug 12, 2020 at 10:24 AM 'Nathan Huckleberry' via Clang Built
Linux <clang-bu...@googlegroups.com> wrote:
>
> Sounds good. Do you think this patch is ready to land then?


I do not think so.

I pointed out the CC=clang check was not working.
I see false positive errors from GCC commands.



This patch does not use the benefit of Makefile.

Makefile is used to describe the dependency
between a target and its prerequisites,
and how to update the target.

Make compares the timestamps between the
targets and prerequisites, then determines
which targets need updating.


See your code.


clang-tidy:
ifdef CONFIG_CC_IS_CLANG
$(PYTHON3) scripts/clang-tools/gen_compile_commands.py
$(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-tidy
compile_commands.json
else
$(error clang-tidy requires CC=clang)
endif


This always runs two commands sequentially.
It rebuilds compile_commands.json even if
nothing in the source tree has been changed.

If you do this, there is no strong reason to use Make,
and actually you can rewrite it in a shell script:


clang_tidy () {
if [ "$CONFIG_CC_IS_CLANG = "y" ]; then
$PYTHON3 scripts/clang-tools/gen_compile_commands.py
$PYTHON3 scripts/clang-tools/run-clang-tools.py clang-tidy
compile_commands.json
else
echo "clang-tidy requires CC=clang"
exit 1
fi
}




I changed the rules to Makefile-ish style.

https://patchwork.kernel.org/project/linux-kbuild/list/?series=331893


I will wait for comments for the new version.
Reply all
Reply to author
Forward
0 new messages