On Mon, Mar 01, 2021 at 02:09:43PM +0100, Marco Elver wrote:
> It's 2021, and I'd like to check if we have all the pieces in place
> for KCSAN support on arm64. While it might not be terribly urgent
> right now, I think we have all the blockers resolved.
>
> On Wed, 23 Sept 2020 at 13:47, Mark Rutland <mark.r...@arm.com> wrote:
> [...]
> > The main issues are:
> >
> > * Current builds of clang miscompile generated functions when BTI is
> > enabled, leading to build-time warnings (and potentially runtime
> > issues). I was hoping this was going to be fixed soon (and was
> > originally going to wait for the clang 11 release), but this seems to
> > be a larger structural issue with LLVM that we will have to work around
> > for the time being.
> >
> > This needs some Makefile/Kconfig work to forbid the combination of BTI
> > with any feature relying on compiler-generated functions, until clang
> > handles this correctly.
>
> I think https://reviews.llvm.org/D85649 fixed the BTI issue with
> Clang. Or was there something else missing?

I *think* so, but I haven't had a chance to go test with a recent clang
build. I see there's now an 11.1.0 build out on llvm.org, so I can try
to give that a spin in a bit, if no-one else does.

> > * KCSAN currently instruments some functions which are not safe to
> > instrument (e.g. code used during code patching, exception entry),
> > leading to crashes and hangs for common configurations (e.g. with LSE
> > atomics). This has also highlighted some existing issues in this area
> > (e.g. with other instrumentation).
> >
> > I'm auditing and reworking code to address this, but I don't have a
> > good enough patch series yet. I intend to post that prework after rc1,
> > and hopefully the necessary bits are small enough that KCSAN can
> > follow in the same merge window.

On this part, I know we still need to do a couple of things:

* Deal with instrumentation of early boot code. We need to set the
  per-cpu offset earlier, and might also need to mark more of this as
  noinstr.

  I'll go respin the per-cpu offset patch in a moment as that's trivial.

* Prevent instrumentation of the patching/alternatives code, which I saw
  blow up when instrumented. For KCSAN we can probably survive with a
  simple refactoring and marking a few things as noinstr, but there's a
  more general unsoundness problem here since the patching code calls
  code which can be instrumented or patched (e.g. bitops, cache
  maintenance, common ID register accessors), and making this watertight
  will require some more invasive rework that I hadn't quite figured
  out.

* I have a vague recollection that there was some problem with atomics,
  and that in some cases we'd need to use arch_atomic() rather than
  atomic(), but I can't remember whether that was to do with the
  patching code or elsewhere.

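
To make the atomics point concrete, here is a minimal userspace sketch
(not kernel code) of why the choice of helper matters. It assumes the
usual layering in which an instrumented wrapper first calls a sanitizer
hook and then the raw arch_ operation. The model_* names and the hook
counter are hypothetical and exist only for this illustration; the two
atomic helpers merely borrow the kernel's naming, and the increment uses
the GCC/Clang __atomic_add_fetch builtin as a stand-in for the real arch
implementation.

```c
#include <assert.h>

/* Userspace model only: a stand-in for the kernel's atomic_t. */
typedef struct { int counter; } atomic_t;

/* Hypothetical counter standing in for the sanitizer runtime. */
static unsigned long model_hook_calls;

/* Hypothetical hook: the call an instrumented wrapper would make. */
static void model_instrument_rw(const volatile void *addr, unsigned long size)
{
	(void)addr;
	(void)size;
	model_hook_calls++;
}

/* Raw operation: performs the access and never calls the hook. */
static inline int arch_atomic_inc_return(atomic_t *v)
{
	return __atomic_add_fetch(&v->counter, 1, __ATOMIC_SEQ_CST);
}

/* Instrumented wrapper: hook first, then the raw operation. */
static inline int atomic_inc_return(atomic_t *v)
{
	model_instrument_rw(v, sizeof(*v));
	return arch_atomic_inc_return(v);
}

/* Models ordinary code, where calling the hook is fine. */
static inline int model_normal_path(atomic_t *v)
{
	return atomic_inc_return(v);
}

/* Models code that must stay free of instrumentation (e.g. patching). */
static inline int model_patching_path(atomic_t *v)
{
	return arch_atomic_inc_return(v);
}

/* Usage: only the instrumented wrapper reaches the hook. */
void model_check(void)
{
	atomic_t v = { 0 };
	unsigned long before = model_hook_calls;

	assert(model_normal_path(&v) == 1);
	assert(model_hook_calls == before + 1);
	assert(model_patching_path(&v) == 2);
	assert(model_hook_calls == before + 1);
}
```

In this model the patching path never reaches the hook, which is the
property that using the arch_ variant in a noinstr region is meant to
give. Whether the problem I half-remember was exactly this is still an
open question.
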
> [...]
> > > -----Original Message-----
> > > From: Marco Elver <el...@google.com>
> [...]
> > > Let's see which one comes first: BTI getting fixed with Clang; or mainlining GCC support [1] and having GCC 11 released.
>
> If Clang still has issues, KCSAN works with GCC 11, which will be
> released this year.
>
> Mark, was there anything else blocking?

I think it's just the bits above, but I haven't had the chance to look
at this actively for a short while, so there might be more issues that
have cropped up since I last looked.

Thanks,
Mark.