[llvm-dev] Inclusion of the ORC runtime in compiler-rt.


Lang Hames via llvm-dev

Apr 12, 2021, 3:26:13 PM
to LLVM Developers Mailing List, jlet...@apple.com
Hi All,

I'd like to add the ORC runtime library (preview available at https://github.com/lhames/llvm-project/tree/orc-runtime-preview) to compiler-rt.

Background:

ORC, like MCJIT, can link JIT'd code either into the current process ("in-process" mode) or into a remote executor process ("cross-process" mode). Some JIT features require support code in the executor process, but the existing ORC libraries are only linked into the JIT process. This has made cross-process mode support for those features (which include static initializers, thread local variables, exception handling, and others) awkward or impractical to implement. The ORC runtime library aims to provide the necessary support code in a form that is loadable by the JIT itself, which should allow these features to work uniformly in both modes.

My prototype branch of the ORC runtime (available at https://github.com/lhames/llvm-project/tree/orc-runtime-preview) has advanced to the point where it can provide uniform support for static initializers, destructors, exceptions, thread locals, and language registration for Objective-C and Swift code. This support is all MachO/Darwin only so far, but should be easily adaptable for ELF/Linux/BSD support.

Proposal:

The proof of concept implementation has been very successful, so I would like to move future development to the LLVM main branch so that others can benefit from this.

Before I start posting patches, though:

Does anyone see any problems with including this in compiler-rt?

Does anyone think that there is a more reasonable home for the ORC runtime within the llvm-project? I considered LLVM itself, or a new top-level project, but neither seemed as natural a fit as compiler-rt.

Finally, if everyone is happy for it to be included in principle, are there any volunteers to review ORC runtime patches?

Regards,
Lang.

Petr Hosek via llvm-dev

Apr 12, 2021, 5:37:01 PM
to Lang Hames, LLVM Developers Mailing List, jlet...@apple.com
I'd like to better understand the structure of the ORC runtime and its dependencies (both build and runtime). Does it use the C or C++ standard library? Does it depend on other parts of LLVM? Do you plan on reusing some of the existing compiler-rt libraries like sanitizer_common?

To give a bit more background on why I'm interested: compiler-rt has grown fairly organically, and this has been making maintenance more and more difficult, at least from the build perspective. There are some runtimes that only use C, some that use C++, and some that use the C++ standard library. When building compiler-rt together with other runtimes like libc or libc++, it's difficult to pick the right order, which is why we have several entry points into compiler-rt's build system to build different subsets, and that's been a maintenance headache.

I've been thinking about this quite a bit recently and I repeatedly came to the conclusion that compiler-rt would ideally be broken up into several subprojects, but that should probably be discussed as a separate topic. However, understanding the build and runtime dependencies of the ORC runtime could help us decide whether it should be a part of compiler-rt or be a separate subproject.

Lang Hames via llvm-dev

Apr 12, 2021, 6:36:41 PM
to Petr Hosek, LLVM Developers Mailing List, jlet...@apple.com
Hi Petr,

> I'd like to better understand the structure of the ORC runtime and its dependencies (both build and runtime). Does it use the C or C++ standard library? Does it depend on other parts of LLVM? Do you plan on reusing some of the existing compiler-rt libraries like sanitizer_common?

The ORC runtime currently uses the C++ standard library.
Since the ORC runtime needs to communicate with the LLVM ORC library it also currently uses some header-only includes from LLVM. It does not depend on any LLVM libraries. We could duplicate this code, but I'd prefer to share it if possible.
I have not used sanitizer_common, but some parts of it look like they may be useful.

I gravitated towards implementing the ORC runtime in compiler-rt because I need to be able to write parts of it in platform-specific assembly (which compiler-rt supports), and because the runtime should be built for all targets, not just the host (which seems to be the standard way that compiler-rt is configured).

> To give a bit more background on why I'm interested: compiler-rt has grown fairly organically, and this has been making maintenance more and more difficult, at least from the build perspective. There are some runtimes that only use C, some that use C++, and some that use the C++ standard library. When building compiler-rt together with other runtimes like libc or libc++, it's difficult to pick the right order, which is why we have several entry points into compiler-rt's build system to build different subsets, and that's been a maintenance headache.
>
> I've been thinking about this quite a bit recently and I repeatedly came to the conclusion that compiler-rt would ideally be broken up into several subprojects, but that should probably be discussed as a separate topic. However, understanding the build and runtime dependencies of the ORC runtime could help us decide whether it should be a part of compiler-rt or be a separate subproject.

That makes sense to me. I think of the ORC runtime as a compiler-rt-style runtime with a libc++ dependency. In that sense I think it's similar to libFuzzer, and whatever solution we come up with for libFuzzer would probably be applicable to the ORC runtime too.

-- Lang.

Eric Christopher via llvm-dev

Apr 12, 2021, 6:51:30 PM
to Lang Hames, LLVM Developers Mailing List, jlet...@apple.com
On Mon, Apr 12, 2021 at 6:36 PM Lang Hames <lha...@gmail.com> wrote:
> Hi Petr,
>
> > I'd like to better understand the structure of the ORC runtime and its dependencies (both build and runtime). Does it use the C or C++ standard library? Does it depend on other parts of LLVM? Do you plan on reusing some of the existing compiler-rt libraries like sanitizer_common?
>
> The ORC runtime currently uses the C++ standard library.
> Since the ORC runtime needs to communicate with the LLVM ORC library it also currently uses some header-only includes from LLVM. It does not depend on any LLVM libraries. We could duplicate this code, but I'd prefer to share it if possible.
> I have not used sanitizer_common, but some parts of it look like they may be useful.
>
> I gravitated towards implementing the ORC runtime in compiler-rt because I need to be able to write parts of it in platform-specific assembly (which compiler-rt supports), and because the runtime should be built for all targets, not just the host (which seems to be the standard way that compiler-rt is configured).


From this it sounds like "convenient reusing of the build system" rather than "should be included in compiler-rt as a library"? If that's the case maybe making it clear or lifting the common build system support out might be maintainable without the "this is a runtime library for the system" sort of thing?

-eric

Lang Hames via llvm-dev

Apr 12, 2021, 7:38:57 PM
to Eric Christopher, LLVM Developers Mailing List, jlet...@apple.com
> From this it sounds like "convenient reusing of the build system" rather than "should be included in compiler-rt as a library"? If that's the case maybe making it clear or lifting the common build system support out might be maintainable without the "this is a runtime library for the system" sort of thing?

Yeah. It sounds like in an ideal world we'd lift out the common build system support, then have a new set of sub-projects that re-use that generic build system.

Do you have any sense of how difficult it would be to lift out that common build system code? If that's relatively easy then maybe the right approach is to do that first, then land the ORC runtime. Otherwise the ORC runtime could go into compiler-rt for now, then be split out with the rest of compiler-rt when it's broken up -- it doesn't require any meaningful changes to existing compiler-rt code, so it should be very easy to break back out again later.

-- Lang.

Lang Hames via llvm-dev

Apr 14, 2021, 1:35:08 PM
to Eric Christopher, LLVM Developers Mailing List, jlet...@apple.com
Hi All,

Petr -- since the ORC runtime's dependencies are similar to libFuzzer's, is there any reason not to land the ORC runtime in compiler-rt now and then break it out again later? If the compiler-rt refactor is likely to happen soon then it's worth waiting, otherwise I think landing it in compiler-rt sooner rather than later is the best option, so that any kinks in the integration can be worked out before any future compiler-rt refactor.

-- Lang.


Lang Hames via llvm-dev

Apr 17, 2021, 4:52:08 PM
to Eric Christopher, LLVM Developers Mailing List, jlet...@apple.com
Hi All,

I've broken the compiler-rt cmake changes and new directories out of the ORC runtime prototype and posted them for review in https://reviews.llvm.org/D100711.

Most of this was adapted from xray's cmake files and project layout. I'm not a CMake expert, so I expect there's room for improvement here, but otherwise I'm hoping it's a pretty canonical "new compiler-rt library".

Kind Regards,
Lang. 

Chris Lattner via llvm-dev

Apr 19, 2021, 12:08:52 AM
to Lang Hames, LLVM Developers Mailing List, jlet...@apple.com
Hey Lang, 

Is your goal here to make this part of compiler_rt the generated library, or part of the subproject?  This seems conceptually very different than compiler_rt (which was supposed to be entry points implicitly generated by the compiler).  Should this be its “own thing”?

-Chris


Lang Hames via llvm-dev

Apr 19, 2021, 12:50:21 AM
to Chris Lattner, LLVM Developers Mailing List, jlet...@apple.com
Hi Chris,

My understanding is that compiler-rt is an umbrella project that builds a number of libraries (builtins, asan, tsan, fuzzer, etc.), and my thought was that the ORC runtime could fit in as an addition to that set. Arguments in favor are that the build requirements are very similar, and there are some conceptual parallels (the ORC runtime is providing implementations for entry points generated by the compiler and linker, among other things). On the other hand the implementations are all JIT specific, which is definitely different from everything else in compiler-rt.

If not compiler-rt, is there anywhere else that you think this project would fit? Would it make sense to introduce it as a new top-level project?

-- Lang.

Lang Hames via llvm-dev

Apr 19, 2021, 1:32:00 PM
to Chris Lattner, LLVM Developers Mailing List, jlet...@apple.com
> Arguments in favor are that the build requirements are very similar, and there are some conceptual parallels (the ORC runtime is providing implementations for entry points generated by the compiler and linker, among other things). On the other hand the implementations are all JIT specific, which is definitely different from everything else in compiler-rt.

After a night to think about it I think I'd rephrase this: The ORC runtime is a JIT-specific compiler runtime library. From a JIT developer's point of view it is an excellent fit for compiler-rt, and from a static compiler developer's point of view it's a non-entity, so just dead weight.

If compiler-rt were broken up then it would make sense to have the ORC runtime be a separate subproject and client of the common compiler-rt build infrastructure. That would be the ideal solution. Until then I think compiler-rt seems like the best home for it -- making the ORC runtime its own subproject now would duplicate compiler-rt's build system only for us to have to reconcile it later (or worse, maintain the duplication indefinitely).

Petr, Eric -- Do you have a sense of how difficult it would be to lift compiler-rt's common build infrastructure out? Being new to compiler-rt I'm wary of tackling that, but if you think it'd be easy I'm happy to work with you to try to get it done.

-- Lang.

Petr Hosek via llvm-dev

Apr 19, 2021, 2:52:25 PM
to Lang Hames, LLVM Developers Mailing List, jlet...@apple.com
We have other runtimes that are not part of compiler-rt but provide an ABI used by the compiler (for example, libcxxabi). The primary difference, at least from my perspective, is that the ABI between the compiler and compiler-rt is unstable and these runtimes are version locked to the compiler (that's why they're installed in lib/clang/<version>/lib/<target>/libclang_rt.<name>.a), whereas the runtimes that live outside of compiler-rt typically provide a standard ABI and you can have multiple interchangeable implementations (for example, libcxxrt or libstdcxx).
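
As a rough illustration of the version-locked layout mentioned above, the following Python sketch assembles the install path from its three variable parts. The helper name and the example version, target, and library values are assumptions chosen for illustration; only the path pattern itself comes from the message.

```python
from pathlib import PurePosixPath


def clang_rt_archive_path(version: str, target: str, name: str) -> PurePosixPath:
    """Build the install-prefix-relative path of a compiler-rt archive.

    Follows the pattern lib/clang/<version>/lib/<target>/libclang_rt.<name>.a.
    This helper is hypothetical and not part of any LLVM tool.
    """
    return PurePosixPath(
        "lib", "clang", version, "lib", target, f"libclang_rt.{name}.a"
    )


# Illustrative values only: two compiler versions map to two distinct paths,
# so runtimes locked to different compiler versions can sit side by side.
old = clang_rt_archive_path("12.0.0", "x86_64-unknown-linux-gnu", "builtins")
new = clang_rt_archive_path("13.0.0", "x86_64-unknown-linux-gnu", "builtins")
print(old)
print(new)
```

Because the compiler version is a path component, an archive built against one compiler release is never picked up by another, which is the practical meaning of "version locked" here.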

There are exceptions to this rule, as always. For example, the builtins library is supposed to be ABI compatible with libgcc. libFuzzer, on the other hand, doesn't provide any ABI used by the compiler; it's just a regular library and, as far as I can tell, the only reason it lives in compiler-rt is convenience (it's also the first library I'd like to see extracted out of compiler-rt).

This can help guide the appropriate location for ORC runtime, but as you can also see, there are no hard rules and either compiler-rt or a standalone top-level project can be made to work.

If we decide to go with a new top-level project, I wouldn't try to reuse the compiler-rt CMake build. That build has a number of issues which we're trying to fix, but it might take a while before the situation improves. Instead, I'd start with the simplest possible CMake setup and then see if you could reuse common parts of the libc and libcxx builds, and if so we can extract those into a common location as needed.

Lang Hames via llvm-dev

Apr 19, 2021, 6:59:08 PM
to Petr Hosek, LLVM Developers Mailing List, jlet...@apple.com
Hi Petr,

> The primary difference, at least from my perspective, is that the ABI between the compiler and compiler-rt is unstable and these runtimes are version locked to the compiler...

This is the case for the ORC runtime as well, at least for now.

> This can help guide the appropriate location for ORC runtime, but as you can also see, there are no hard rules and either compiler-rt or a standalone top-level project can be made to work.
>
> If we decide to go with a new top-level project, I wouldn't try to reuse the compiler-rt CMake build. That build has a number of issues which we're trying to fix, but it might take a while before the situation improves. Instead, I'd start with the simplest possible CMake setup and then see if you could reuse common parts of the libc and libcxx builds, and if so we can extract those into a common location as needed.

A couple of the properties that I was hoping to rely on are support for platform specific assembly and multi-slice archives as output. Compiler-rt seems to support those nicely, but I can't see equivalent support in any of the other runtimes. Do you know whether the compiler-rt build issues permeate the cmake code that supports those features? If so, are there any cleaner implementations of those features in other runtimes?

-- Lang.

Petr Hosek via llvm-dev

Apr 20, 2021, 1:37:33 AM
to Lang Hames, LLVM Developers Mailing List, jlet...@apple.com
Support for building universal binaries is currently only available in the compiler-rt build so if you need it, compiler-rt is likely the best option right now. I'd like to see that support even for other runtimes because we already have a use for it, for example when building libc++ for Darwin, but it's not yet clear what that's going to look like and it might be a while before it's available.

Lang Hames via llvm-dev

Apr 21, 2021, 1:32:54 PM
to Petr Hosek, LLVM Developers Mailing List, jlet...@apple.com
Hi Petr,

Thanks!

> I'd like to see that support even for other runtimes because we already have a use for it...

That makes sense. If/when you get to breaking out that support I would be happy to help test it out. 

-- Lang.


Chris Lattner via llvm-dev

Apr 22, 2021, 7:20:47 PM
to Lang Hames, LLVM Developers Mailing List, jlet...@apple.com
Ok, so you’re saying you want it to be part of the llvm/llvm-project/tree/main/compiler-rt tree, but not part of the compiler_rt library itself, got it.

Does compiler-rt currently depend on LLVM libraries?  Does this change the dependence graph of the llvm and compiler-rt subproject?

-Chris

Lang Hames via llvm-dev

Apr 24, 2021, 5:09:58 PM
to Chris Lattner, LLVM Developers Mailing List, jlet...@apple.com
Hi Chris,

> Ok, so you’re saying you want it to be part of the llvm/llvm-project/tree/main/compiler-rt tree, but not part of the compiler_rt library itself, got it.
> Does compiler-rt currently depend on LLVM libraries?  Does this change the dependence graph of the llvm and compiler-rt subproject?

As far as I know no compiler-rt libraries depend on LLVM. The ORC runtime definitely doesn't, so there would be no change to the dependence graph.

In my prototype branch I have used some header-only includes from LLVM for shared data structures and serialization utilities. I'm not sure what the policy is on that, but if we want to prohibit any LLVM includes then those files could just be duplicated -- they're quite small and self contained.

-- Lang.  

David Blaikie via llvm-dev

Apr 24, 2021, 6:56:40 PM
to Lang Hames, LLVM Developers Mailing List, jlet...@apple.com
Yeah, for buildability (for instance, Google's build system doesn't
have any special dispensations for header only libraries - doesn't
allow arbitrary includes without dependencies - and I think this is
generally a good thing) it's probably best/necessary to duplicate.

Though now that I think about it, I think there is /something/ shared
between compiler-rt and LLVM - for xray and instrumented profiling, I
think there are some common constants or data structures or something.
Might be worth checking how those work?

Petr Hosek via llvm-dev

Apr 24, 2021, 7:30:05 PM
to David Blaikie, LLVM Developers Mailing List, jlet...@apple.com
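InstrProfData.inc is shared between LLVM and compiler-rt:

https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ProfileData/InstrProfData.inc
https://github.com/llvm/llvm-project/blob/main/compiler-rt/include/profile/InstrProfData.inc

There's no special mechanism to keep those in sync, we manage them manually.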
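
One way to catch drift between manually synchronized copies is a byte-for-byte comparison. The Python sketch below is an assumption about how such a check could look, not an existing LLVM mechanism; the function names and the throwaway file names are hypothetical, and the demonstration runs on temporary files rather than a real checkout.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def copies_in_sync(first: Path, second: Path) -> bool:
    """True when two manually maintained copies are byte-for-byte identical."""
    return file_digest(first) == file_digest(second)


# Demonstration on throwaway files standing in for the two shared copies.
with tempfile.TemporaryDirectory() as tmp:
    llvm_copy = Path(tmp, "llvm_copy.inc")
    rt_copy = Path(tmp, "compiler_rt_copy.inc")
    llvm_copy.write_text("/* shared definitions */\n")
    rt_copy.write_text("/* shared definitions */\n")
    in_sync_before = copies_in_sync(llvm_copy, rt_copy)

    # Simulate an edit that was applied to only one of the two copies.
    rt_copy.write_text("/* shared definitions, edited on one side only */\n")
    in_sync_after = copies_in_sync(llvm_copy, rt_copy)

print(in_sync_before, in_sync_after)
```

In a real checkout the two arguments would be llvm/include/llvm/ProfileData/InstrProfData.inc and compiler-rt/include/profile/InstrProfData.inc.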

Lang Hames via llvm-dev

Apr 24, 2021, 8:28:27 PM
to David Blaikie, LLVM Developers Mailing List, jlet...@apple.com
> Yeah, for buildability (for instance, Google's build system doesn't
> have any special dispensations for header only libraries - doesn't
> allow arbitrary includes without dependencies - and I think this is
> generally a good thing) it's probably best/necessary to duplicate.

Would it make sense to have a config or build step duplicate the headers?
 
> Though now that I think about it, I think there is /something/ shared
> between compiler-rt and LLVM - for xray and instrumented profiling, I
> think there are some common constants or data structures or something.
> Might be worth checking how those work?

If I look for LLVM header includes with

% grep -iR 'include.*llvm' compiler-rt

most of the results are comments. Xray does have some LLVM includes, but they're all in the lib/xray/tests subdirectory. That may still be a violation, but at least it's not going to impact the libraries.

There is one file that uses LLVM libraries directly:

compiler-rt/lib/sanitizer_common/symbolizer/sanitizer_symbolize.cpp:#include "llvm/DebugInfo/Symbolize/DIPrinter.h"
compiler-rt/lib/sanitizer_common/symbolizer/sanitizer_symbolize.cpp:#include "llvm/DebugInfo/Symbolize/Symbolize.h"

But it doesn't seem to be being built, at least on my system.
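
Since most of the grep hits are comments, a stricter scan can separate real `#include` directives from mere mentions and flag whether each hit is under a tests directory. The Python sketch below is an illustration under stated assumptions: the function name, the file-suffix list, and the header and file names in the demonstration tree are hypothetical, and the line-anchored pattern does not understand `/* ... */` block comments.

```python
import re
import tempfile
from pathlib import Path

# Matches a preprocessor include of an LLVM header at the start of a line,
# e.g. `#include "llvm/Foo/Bar.h"`. A line that begins with `//` does not match.
LLVM_INCLUDE = re.compile(r'^\s*#\s*include\s*[<"](llvm/[^">]+)[">]')


def find_llvm_includes(root: Path):
    """Yield (relative path, header, is_test) for each LLVM include under root."""
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in {".h", ".c", ".cpp", ".inc"}:
            continue
        rel = path.relative_to(root)
        is_test = "tests" in rel.parts
        for line in path.read_text(errors="replace").splitlines():
            match = LLVM_INCLUDE.match(line)
            if match:
                yield rel.as_posix(), match.group(1), is_test


# Demonstration on a throwaway tree; the file and header names are made up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "lib" / "xray" / "tests").mkdir(parents=True)
    (root / "lib" / "demo").mkdir(parents=True)
    (root / "lib" / "xray" / "tests" / "t.cpp").write_text(
        '#include "llvm/Support/Example.h"\n'
    )
    (root / "lib" / "demo" / "d.cpp").write_text(
        "// mirrors include/llvm/Example.h\nint x;\n"
    )
    hits = list(find_llvm_includes(root))

for rel, header, is_test in hits:
    print(rel, header, "test-only" if is_test else "library")
```

Only the file with an actual include directive is reported; the comment that merely mentions an LLVM path is ignored.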

-- Lang.


Lang Hames via llvm-dev

Apr 24, 2021, 8:31:43 PM
to Petr Hosek, LLVM Developers Mailing List, jlet...@apple.com
> InstrProfData.inc is shared between LLVM and compiler-rt:
>
> https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ProfileData/InstrProfData.inc
> https://github.com/llvm/llvm-project/blob/main/compiler-rt/include/profile/InstrProfData.inc
>
> There's no special mechanism to keep those in sync, we manage them manually.

Thanks Petr. That's about the same amount of code that ORC will need to share I think. Glad there's a precedent to follow.

-- Lang. 