Hi all,

I was looking at the profile for a tool I’m working on, and noticed that it is spending 10% of its time doing locking-related stuff. The structure of the tool is that it reads in a ton of stuff (e.g. one moderate example I’m working with is 40M of input) into MLIR, then uses its multithreaded pass manager to do transformations.

As it happens, the structure of this is that the parsing pass is single threaded, because it is parsing through a linear file (the parser is simple and fast, so this is bound by IR construction). This means that none of the locking during IR construction is useful.
Historically, LLVM had a design where you could dynamically enable and disable multithreading support in a tool, which would be perfect for this use case, but it got removed by this patch: https://reviews.llvm.org/D4216. The rationale in the patch doesn’t make sense to me - this mode had nothing to do with the old LLVM global lock; it had to do with whether llvm::llvm_is_multithreaded() returned true or false … which all the locking stuff is guarded on.
Would it make sense to re-enable this, or am I missing something?
On Apr 12, 2020, at 11:27 AM, Mehdi AMINI <joke...@gmail.com> wrote:

Hey Chris,

On Sat, Apr 11, 2020 at 5:15 PM Chris Lattner via llvm-dev <llvm...@lists.llvm.org> wrote:

> Hi all,
>
> I was looking at the profile for a tool I’m working on, and noticed that it is spending 10% of its time doing locking-related stuff. The structure of the tool is that it reads in a ton of stuff (e.g. one moderate example I’m working with is 40M of input) into MLIR, then uses its multithreaded pass manager to do transformations.
>
> As it happens, the structure of this is that the parsing pass is single threaded, because it is parsing through a linear file (the parser is simple and fast, so this is bound by IR construction). This means that none of the locking during IR construction is useful.

I'm curious which are the places that show up on the profile? Do you have a few stacktraces to share?
> Historically, LLVM had a design where you could dynamically enable and disable multithreading support in a tool, which would be perfect for this use case, but it got removed by this patch: https://reviews.llvm.org/D4216. The rationale in the patch doesn’t make sense to me - this mode had nothing to do with the old LLVM global lock; it had to do with whether llvm::llvm_is_multithreaded() returned true or false … which all the locking stuff is guarded on.

It seems that at the time the assumption was that this flag was there to alleviate the cost of the global lock only, and removing the lock removed the motivation for the feature? Looks like you proved this wrong :)

+Zach, David, and Reid to make sure they don't miss this.
> Would it make sense to re-enable this, or am I missing something?

Finding a way to re-enable it seems interesting. I wonder how much it'll interact with the places inside the compiler that are threaded now; maybe it isn't much more than tracking and auditing the uses of LLVM_ENABLE_THREADS (like lib/Support/ThreadPool.cpp, for example). Have you already looked into it?
Yes, the llvm::Smart* family of locks still exists. But very few places are using them outside of MLIR; it’s more common to just use plain std::mutex.

That said, I don’t think it’s really a good idea to use them, even if they were fixed to work as designed. It’s not composable: the boolean “enabled” bit is process-wide, not local to whatever data structure you’re trying to build. So your single-threaded tool gets some benefit, but the benefit goes away as soon as the process starts using multiple threads, even if there is still only one thread using the MLIR context in question.
So probably I’d recommend two things:
- If locking uncontended locks is showing up on profiles as a performance bottleneck, it’s probably worth looking into ways to reduce that overhead in both single-threaded and multi-threaded contexts (reducing the number of locks taken in frequently called code, or using a better lock implementation).
- If you want some mechanism to disable MLIR locking, it should probably be a boolean attached to the MLIR context in question, not a global variable.

-Eli
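[To make Eli's second suggestion concrete, here is a minimal sketch of a context-local flag gating the locks. This is illustrative C++ only, not the actual MLIR API; the Context class, the method names, and the withUniquerLock helper are all hypothetical.]

  #include <mutex>

  // Sketch: the "is multithreading enabled" bit lives on the context object,
  // not in a process-wide global, so independent contexts can make
  // independent choices.
  class Context {
    bool MultithreadingEnabled = true; // per-context, set by the owning tool
    std::mutex UniquerMutex;           // guards e.g. type/attribute uniquing

  public:
    // The owning tool flips this while it knows only one thread is touching
    // the context (e.g. during single-threaded parsing / IR construction).
    void disableMultithreading(bool Disable = true) {
      MultithreadingEnabled = !Disable;
    }
    bool isMultithreadingEnabled() const { return MultithreadingEnabled; }

    // Conditional guard: only pays for the lock when the context may be
    // shared between threads.
    template <typename Fn> void withUniquerLock(Fn &&Body) {
      if (!MultithreadingEnabled) {
        Body();
        return;
      }
      std::lock_guard<std::mutex> Guard(UniquerMutex);
      Body();
    }
  };

[A tool shaped like the one in the first message could then disable multithreading on the context for the single-threaded parse / IR-construction phase and re-enable it before handing the context to the multithreaded pass manager.]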
(reply inline)
From: Chris Lattner <clat...@nondot.org>
Sent: Sunday, April 12, 2020 3:28 PM
To: Eli Friedman <efri...@quicinc.com>
Cc: llvm-dev <llvm...@lists.llvm.org>
Subject: [EXT] Re: [llvm-dev] LLVM multithreading support
On Apr 12, 2020, at 2:23 PM, Eli Friedman <efri...@quicinc.com> wrote:
So probably I’d recommend two things:
1. If locking uncontended locks is showing up on profiles as a performance bottleneck, it’s probably worth looking into ways to reduce that overhead in both single-threaded and multi-threaded contexts. (Reducing the number of locks taken in frequently called code, or using a better lock implementation).
2. If you want some mechanism to disable MLIR locking, it should probably be a boolean attached to the MLIR context in question, not a global variable.
Ok, but let me argue the other way. We currently have a cmake flag that sets LLVM_ENABLE_THREADS, and that flag enables an across-the-board speedup. That cmake flag is the *worst* possible thing for library composability :-). Are you suggesting that we remove it?
Yes, I would like to remove LLVM_ENABLE_THREADS. Assuming you’re not building for some exotic target that doesn’t have threads, there isn’t any reason to randomly shut off all thread-related functionality in the LLVM support libraries. There isn’t any significant performance or codesize gain to be had outside of MLIR, as far as I know, and it increases the number of configurations we have to worry about. I have no idea if turning it off even works on master; I don’t know of any buildbots or users using that configuration.
If you want to support some sort of lockless mode in MLIR, I think that burden should be carried as part of MLIR, instead of infecting the entire LLVM codebase.
-Eli
If this can become a performance problem, is there a way to tackle the
problem head-on to reduce that cost even in scenarios that really are
multithreaded? E.g., an inlined initial atomic lock that falls back to
the "real" lock implementation on failure?
As long as it's only a small number of hotspots (and attribute/type
uniquing seem like plausible candidates), it'd seem justified to do
such things.
Cheers,
Nicolai
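[As an illustration of the fast path Nicolai describes, here is a rough, self-contained C++ sketch - not LLVM code, and all names are made up. The uncontended acquire and release each touch a single atomic, and the std::mutex / condition_variable machinery is only reached under contention.]

  #include <atomic>
  #include <condition_variable>
  #include <mutex>

  class FastPathMutex {
    std::atomic<bool> Locked{false};   // the actual lock bit (fast path)
    std::atomic<unsigned> Waiters{0};  // threads in or entering the slow path
    std::mutex WaitMutex;              // used only for sleeping/waking
    std::condition_variable WaitCV;

  public:
    void lock() {
      bool Expected = false;
      // Fast path: a single inlined compare-exchange when uncontended.
      if (Locked.compare_exchange_strong(Expected, true))
        return;
      lockSlow();
    }

    void unlock() {
      Locked.store(false);
      // Only pay for the wakeup when someone may be blocked.
      if (Waiters.load() != 0) {
        std::lock_guard<std::mutex> Guard(WaitMutex);
        WaitCV.notify_one();
      }
    }

  private:
    void lockSlow() {
      Waiters.fetch_add(1);
      {
        std::unique_lock<std::mutex> Guard(WaitMutex);
        // Sleep until the lock bit can be grabbed; the predicate re-runs the
        // same compare-exchange used on the fast path.
        WaitCV.wait(Guard, [&] {
          bool Expected = false;
          return Locked.compare_exchange_strong(Expected, true);
        });
      }
      Waiters.fetch_sub(1);
    }
  };

[A real implementation would tune memory orders, probably spin briefly before sleeping, and might use futexes directly, but the shape is the point: uncontended callers pay only one atomic operation per acquire and release.]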
>
>> Historically, LLVM had a design where you could dynamically enable and disable multithreading support in a tool, which would be perfect for this use case, but it got removed by this patch: (xref https://reviews.llvm.org/D4216). The rationale in the patch doesn’t make sense to me - this mode had nothing to do with the old LLVM global lock, this had to do with whether llvm::llvm_is_multithreaded() returned true or false … which all the locking stuff is guarded on.
>
>
> It seems that at the time the assumption was that this flag was there to alleviate the cost of the global lock only and removing the lock removed the motivation for the feature? Looks like you proved this wrong :)
>
> +Zach, David, and Reid to make sure they don't miss this.
>
>
> Yeah, it was about not paying the cost for synchronization when it wasn’t worthwhile.
>
>> Would it make sense to re-enable this, or am I missing something?
>
>
> Finding a way to re-enable it seems interesting. I wonder how much it'll interact with the places inside the compiler that are threaded now, maybe it isn't much more than tracking and auditing the uses of LLVM_ENABLE_THREADS (like lib/Support/ThreadPool.cpp for example). Have you already looked into it?
>
>
> It is super-easy to reenable, because the entire codebase is still calling llvm::llvm_is_multithreaded(). We just need to add the global back, along with the methods to set and clear the global, and change llvm::llvm_is_multithreaded() to something like:
>
> bool llvm::llvm_is_multithreaded() {
> #if LLVM_ENABLE_THREADS != 0
>   return someGlobal;
> #else
>   return false;
> #endif
> }
>
> -Chris
--
Learn what the world really is,
but never forget how it ought to be.
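[To flesh out the re-enabling plan Chris sketches in the quoted message above - adding the global back plus methods to set and clear it - here is what that plumbing could look like. The setter name, the std::atomic choice, and the default value are guesses for illustration, not the historical LLVM API.]

  #include <atomic>

  namespace llvm {
  #if LLVM_ENABLE_THREADS != 0
  // The global Chris mentions; atomic so toggling it is itself thread-safe.
  static std::atomic<bool> MultithreadingEnabled{true};
  #endif

  // Hypothetical "set/clear the global" entry point.
  void llvm_set_multithreaded(bool Enable) {
  #if LLVM_ENABLE_THREADS != 0
    MultithreadingEnabled = Enable;
  #else
    (void)Enable;
  #endif
  }

  bool llvm_is_multithreaded() {
  #if LLVM_ENABLE_THREADS != 0
    return MultithreadingEnabled;
  #else
    return false;
  #endif
  }
  } // namespace llvm

[A tool like the one in the first message could then clear the flag for the single-threaded parsing phase and set it again before running the multithreaded pass manager.]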