_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
On 6/10/21 11:14 AM, Reid Kleckner via llvm-dev wrote:
> Hey all,
>
> Long ago, the LLD project contributors decided that they weren't going to design LLD as a library, which stands in opposition to the way that the rest of LLVM strives to be a reusable library. Part of the reasoning was that, at the time, LLD wasn't done yet, and the top priority was to finish making LLD a fast, useful, usable product. If sacrificing reusability helped LLD achieve its project goals, the contributors at the time felt that was the right tradeoff, and that carried the day.
>
> However, it is now ${YEAR} 2021, and I think we ought to reconsider this design decision. LLD was a great success: it works, it is fast, it is simple, many users have adopted it, it has many ports (COFF/ELF/mingw/wasm/new MachO). Today, we have actual users who want to run the linker as a library, and they aren't satisfied with the option of launching a child process. Some users are interested in process reuse as a performance optimization, some are including the linker in the frontend. Who knows. I try not to pre-judge any of these efforts, I think we should do what we can to enable experimentation.
>
> So, concretely, what could change? The main points of reusability are:
> - Fatal errors and warnings exit the process without returning control to the caller
> - Conflicts over global variables between threads
>
> Error recovery is the big imposition here. To avoid a giant rewrite of all error handling code in LLD, I think we should *avoid* returning failure via the llvm::Error class or std::error_code. We should instead use an approach more like clang, where diagnostics are delivered to a diagnostic consumer on the side. The success of the link is determined by whether any errors were reported. Functions may return a simple success boolean in cases where higher level functions need to exit early. This has worked reasonably well for clang. The main failure mode here is that we miss an error check, and crash or report useless follow-on errors after an error that would normally have been fatal.
>
> Another motivation for all of this is increasing the use of parallelism in LLD. Emitting errors in parallel from threads and then exiting the process is risky business. A new diagnostic context or consumer could make this more reliable. MLIR has this issue as well, and I believe they use this pattern. They use some kind of thread shard index to order the diagnostics, LLD could do the same.
>
> Finally, we'd work to eliminate globals. I think this is mainly a small matter of programming (SMOP) and doesn't need much discussion, although the `make` template presents interesting challenges.
>
> Thoughts? Tomatoes? Flowers? I apologize for the lack of context links to the original discussions. It takes more time than I have to dig those up.
>
> Reid
>
I think it would be great to move in this direction. It's a little bit unclear
at the moment whether or not the library use case is supported, because we
do include headers and libraries in the install targets.
As a package maintainer my wish list for an LLD library is:
1. Single shared object: https://reviews.llvm.org/D85278
2. All symbols are assumed private unless explicitly given library visibility.
3. Some subset of the API that is stable across major releases.
The single shared object is consistent with clang and llvm, but
not the private symbols by default. We have discussed changing
this in clang and llvm, though, and for a library like lld that is
smaller than the clang and llvm libraries, it seems like it would
be an easier task and something that would be useful to do right
from the start.
-Tom
> 3. Some subset of the API that is stable across major releases.
>
>
> A limited stable C API seems plausible to me, if there's need.
>
> - Dave
>
>
> -Tom
>
>
>
+1
> - Fatal errors and warnings exit the process without returning control to the caller
This means every single fatal() call needs scrutiny. The function is
noreturn and there are 147 references.
In many places, returning from fatal() would indeed crash.
> Another motivation for all of this is increasing the use of parallelism in LLD. Emitting errors in parallel from threads and then exiting the process is risky business. A new diagnostic context or consumer could make this more reliable. MLIR has this issue as well, and I believe they use this pattern. They use some kind of thread shard index to order the diagnostics, LLD could do the same.
Yes, I remember that I refrained from using warn() in some parallel*
code, because warn() becomes error() in --fatal-warnings mode, and
error() behaves like fatal() once --error-limit is reached.
On Thu, Jun 10, 2021 at 8:57 PM Tom Stellard <tste...@redhat.com> wrote:
>
> On 6/10/21 3:28 PM, David Blaikie wrote:
> > On Thu, Jun 10, 2021 at 3:16 PM Tom Stellard via llvm-dev <llvm...@lists.llvm.org <mailto:llvm...@lists.llvm.org>> wrote:
> >
> > On 6/10/21 11:14 AM, Reid Kleckner via llvm-dev wrote:
> > > Hey all,
> > >
> > > Long ago, the LLD project contributors decided that they weren't going to design LLD as a library, which stands in opposition to the way that the rest of LLVM strives to be a reusable library. Part of the reasoning was that, at the time, LLD wasn't done yet, and the top priority was to finish making LLD a fast, useful, usable product. If sacrificing reusability helped LLD achieve its project goals, the contributors at the time felt that was the right tradeoff, and that carried the day.
> > >
> >
> > I think it would be great to move in this direction. It's a little bit unclear
> > at the moment whether or not the library use case is supported, because we
> > do include headers and libraries in the install targets.
> >
> > As a package maintainer my wish list for an LLD library is:
> >
> > 1. Single shared object: https://reviews.llvm.org/D85278 <https://reviews.llvm.org/D85278>
> > 2. All symbols are assumed private unless explicitly given library visibility.
> >
> >
> > Is this ^ consistent with how other parts of LLVM are handled? My understanding was generally LLVM's API is wide/unbounded and not stable. I'd hesitate to restrict future libraries in some way due to some benefits that provides - ease of refactoring (LLVM's ability to be changed frequently is very valuable).
> >
>
> The single shared object is consistent with clang and llvm, but
> not the private symbols by default. We have discussed changing
> this in clang and llvm, though, and for a library like lld that is
> smaller than the clang and llvm libraries, it seems like it would
> be an easier task and something that would be useful to do right
> from the start.
I support compiling the lld source files with -fvisibility=hidden on
ELF systems.
lld::*::link may be the only API which needs LLVM_EXTERNAL_VISIBILITY
(i.e. default visibility for ELF).
(LLVM_EXTERNAL_VISIBILITY is defined in llvm/include/llvm/Support/Compiler.h;
Windows doesn't customize it currently.)
> -Tom
>
> > 3. Some subset of the API that is stable across major releases.
> >
> >
> > A limited stable C API seems plausible to me, if there's need.
> >
> > - Dave
> >
> >
> > -Tom
> >
> >
> >
> >
>
--
宋方睿
One of the earliest discussions about the LLD-as-a-library design, at least after it had matured enough to be practical, was this rather long thread: https://lists.llvm.org/pipermail/llvm-dev/2016-December/107981.html
I don't have any objections to making LLD more usable as a library.
What I would say is that we should come up with a good idea of what functionality is needed from the library. For example, I can see one use case with a relatively coarse interface that is similar to running LLD from the command line, with object files passed in memory buffers; I can see that working as an extension of the existing design. A more ambitious use case, permitting finer-grained control of the pipeline or constructing LLD data structures directly rather than using object files, could require quite a bit more work. I think people are thinking along the lines of the former, but it is worth making sure.
I think one of the reasons the library use case faltered was that no one with a use case was able to spend enough time to make it happen. The existing maintainers had enough work to do catching up with Gold and BFD. Do we have someone willing to do the work?
Peter
To second what Jez said here: the new Mach-O backend is still pretty young (e.g. it wasn’t even the default Mach-O backend in LLVM 12). It’s being actively developed, and there’s still the possibility of fairly invasive changes being required to support new features or as we gain more implementation experience.
One of our main motivations for LLD for Mach-O was to improve incremental build speeds for our developers. We’ve found that link speed is a major component of incremental build times; LLD’s speed is a huge win there, and we care a lot about maintaining that speed. I’m CCing Nico, since he’s also been actively benchmarking LLD for Mach-O against Chromium builds (and reporting and fixing speed regressions).
Reid’s proposals (better error handling and eliminating globals) are completely reasonable. In general though, I really appreciate the LLD codebase’s simplicity and think it’s very valuable for understanding and maintaining the code. I haven’t worked too much with the rest of LLVM or Clang, so I’m not at all trying to compare LLD with them or cast aspersions on those codebases; I’m just speaking from my personal perspective. For example, having (mostly) separate codebases for each LLD port set off my “code duplication” spidey senses when I first started working with LLD, but while it does lead to some amount of duplication, there’s also a lot of subtle behavior differences between platforms, and having some amount of duplication is IMO a better tradeoff than e.g. having common functions that have a bunch of conditionals for each platform, or trying to come up with common abstract interfaces that are specialized for each platform, or so on. I really hope we can maintain that simplicity to whatever extent possible.
Thanks,
Shoaib
From: llvm-dev <llvm-dev...@lists.llvm.org> on behalf of Jez Ng via llvm-dev <llvm...@lists.llvm.org>
Reply-To: Jez Ng <je...@fb.com>
Date: Friday, June 11, 2021 at 2:14 PM
To: "llvm...@lists.llvm.org" <llvm...@lists.llvm.org>
Subject: Re: [llvm-dev] RFC: Revisiting LLD-as-a-library design
As one of the people working on the new Mach-O backend, my main concerns are:
_______________________________________________
Hello,
(David Blaikie)
> Reid - if you have any particular use case of your own in mind, or links to other discussions/users who are having friction with the current state of affairs, it would be handy to have them.
The topic came up last Thursday in the Windows call, please see notes in https://docs.google.com/document/d/1A-W0Sas_oHWTEl_x_djZYoRtzAdTONMW_6l1BH9G6Bo/
Speaking for Ubisoft, there’s a short term practical usage for us in llvm-buildozer. Neil Henning from Unity 3D raised a similar need for the Burst compiler.
Since I’ve already gone through these topics before, I had a list of practical things to achieve the LLD-as-a-lib goal:
The rationale here is to have the ability to call LLD-as-a-lib (or Clang-as-a-lib for instance, or any other LLVM tool) in the same way as we do on the command-line. Essentially calling into LLD main() but in-process.
Like mentioned in https://reviews.llvm.org/D86351 one of our objectives is to pass a CDB .json to a tool (llvm-buildozer) and build in-process.
While we’re here, we had some other adjacent objectives with this work:
One point that was raised recently is being able to compile LLVM components as DLLs on Windows. This is all adjacent to LLD-as-a-lib; perhaps it isn't desirable to always link LLD statically into the user's application.
Does all this sound sensible? It would be nice to split the work between us, if possible. On the short term (next few weeks) I can work on 1. and 2.
Best,
Alex.
De : Reid Kleckner <r...@google.com>
Envoyé : June 10, 2021 2:15 PM
À : llvm-dev <llvm...@lists.llvm.org>; Fangrui Song <mas...@google.com>; Sam Clegg <s...@chromium.org>; Shoaib Meenai <sme...@fb.com>; g...@fb.com; je...@fb.com; Alexandre Ganea <alexand...@ubisoft.com>; Martin Storsjö <mar...@martin.st>
Objet : RFC: Revisiting LLD-as-a-library design
The point of using LLVM for compiling WASM is to take advantage of ahead-of-time optimizations that could cause hitches in a JIT.
For example, it integrates polly to try to recover vectorization optimizations. The resulting DLL can then be cached and loaded instantly on every subsequent playthrough,
without any possibility of hitching. Microsoft Flight Simulator 2020 also ships pre-compiled plugin DLLs on Xbox, which does not allow JITing code, but because these are compiled on developer machines the linker problem doesn't really apply in that situation. If they wanted to JIT webassembly, there are plenty of JIT runtimes to do that.
Regardless, I think it's kind of silly to say that instead of using a perfectly functional linker that LLVM has, someone should JIT the code.
LLVM is a compiler backend - it should support using its own linker the same way people use LLVM, and if LLVM can be used as a library, then LLD should be usable as a library. Furthermore, there is no technical reason for LLD to not be a library. It's already almost all the way there, the maintainers simply don't bother testing to see when they forget to clean up one of the global caches.
Neil Henning
Senior Software Engineer, Compiler
Can I ask for more details on the difference between the workflow/scenario with 2000 spawns of the lld process and the 100-ish runs in threads for linking 100-ish DLLs? Is the same number of binaries built in each scenario?
Where do the 2000 spawns of lld come from? Are 2000 binaries built, or do you modify code, recompile, and relink the same binary in succession?
I have one more concern with "running lld in threads": it requires thread safety between runs. Is some data shared between those threads, or could we, in the worst case, duplicate it in each thread's context?