It is my understanding that the implementation for jump-threading in LLVM is not presently able to effectively optimize code containing a state-machine implemented using a loop + switch. This is the case, for example, with the Coremark benchmark function core_state_transition(). Bug 42313 was filed to address this in 2019:
https://bugs.llvm.org/show_bug.cgi?id=42313
It appears that GCC improved support for jump threading in 2015 along the same lines:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54742
Is anyone aware of any plan to do improve LLVM jump-threading along the same lines for LLVM?
Thanks!
Alan Phipps
Nobody is currently working on this, as far as I know. If you’re interested in looking into it, I’ll try to answer any questions.
-Eli
No active work on my side, but I have given the topic of threaded
interpreters (which is what I think you're wanting to produce) a
good amount of thought.
I'm really not sure that switch is the right canonical form. The main reason being that having a loop over a large switch is very likely to encourage code motion which is generally profitable, but harmful in this particular context.
I had been thinking down the lines of representing the intepreter as a family of mutually recursive functions with a calling convention optimized for this case and using a musttail call through a lookup table for the dispatch.
I've played with the notion of extending clang with a custom attribute for guaranteed tail calls. I think this is pretty much the only extension needed to be able to natively write out a threaded interpreter as a set of mutually recursive functions.
This is all thought experiment from my side; I haven't had time to sit down and actually prototype any of this.
Philip
_______________________________________________ LLVM Developers mailing list llvm...@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
This is great to hear, thanks for the update! Can you provide a timeline for when you expect to have the jump-threading update in review?
The loop-flattening pass also looks very useful. I’m sure we could help with the review -- I don’t know about taking ownership of it, at least right now.
-Alan
_______________________________________________
No active work on my side, but I have given the topic of threaded interpreters (which is what I think you're wanting to produce) a good amount of thought.
I'm really not sure that switch is the right canonical form. The main reason being that having a loop over a large switch is very likely to encourage code motion which is generally profitable, but harmful in this particular context.
I had been thinking down the lines of representing the intepreter as a family of mutually recursive functions with a calling convention optimized for this case and using a musttail call through a lookup table for the dispatch.
On Wed, Sep 23, 2020 at 9:48 PM Philip Reames via llvm-dev <llvm...@lists.llvm.org> wrote:
No active work on my side, but I have given the topic of threaded interpreters (which is what I think you're wanting to produce) a good amount of thought.
I'm really not sure that switch is the right canonical form. The main reason being that having a loop over a large switch is very likely to encourage code motion which is generally profitable, but harmful in this particular context.
I had been thinking down the lines of representing the intepreter as a family of mutually recursive functions with a calling convention optimized for this case and using a musttail call through a lookup table for the dispatch.
I believe the Wasm3 project (https://github.com/wasm3/wasm3) which is a WebAssembly interpreter is using this dispatch technique described by Philip.I don't know how exactly it is guaranteed that the indirect calls are converted to tail calls (maybe it's not). But the performance is quite impressive.