The topic of whether or not LLVM allows for infinite loops has come up a lot recently (several times this week already). Regarding motivation, there are two important facts:
1. Some languages, such as Java, have well-defined infinite loops. See:
http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.9
and:
http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.2
and, as a community, it seems to be important for us to support such languages. That means that we must have a way, at the IR level, to support and model infinite loops.
2. Other languages, such a C and C++, allow us to assume that otherwise-side-effect-free loops terminate, specifically, for C++, 1.10p27 says:
The implementation may assume that any thread will eventually do one of the following:
- terminate
- make a call to a library I/O function
- access or modify a volatile object, or
- perform a synchronization operation or an atomic operation
[Note: This is intended to allow compiler transformations such as removal of empty loops, even
when termination cannot be proven. — end note ]
and taking advantage of these guarantees is part of providing a high-quality optimizer for C/C++ programs.
And this leaves us somewhat in a bind. To provide a high-quality C/C++ optimizer, we want to take advantage of this guarantee, but we can't do so in a generic sense without harming our ability to serve as a compiler for other languages.
In 2010, Nick proposed to add a 'halting' attribute that could be added to functions to indicate that they would not execute indefinitely (http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100705/103670.html). At the time that the patch was proposed, there were infrastructure problems with inferring the attribute for functions with loops (related to using function-level analysis passes from a CGSCC pass), but hopefully those will be fixed with the new pass manager. Regardless, however, such inference is much more powerful if it can take advantage of the guarantees that C/C++ provide.
Thus, while I would eventually like a 'halting' attribute, or some variant of that (including, for example, the lack of calls to longjmp), I think that a first step is to provide an attribute that Clang, and other frontends, can add when producing IR from sources where the language provides C/C++-like guarantees on loop termination. This attribute would indicate that the function will not execute indefinitely without producing some externally-observable side effect (calling an external function or executing a volatile/atomic memory access). I could name this attribute 'finite', but bikeshedding is welcome.
With such an attribute in place, we would be able to clarify our overall position on infinite loops, be in a stronger position to infer more specific function properties (like halting), and can put in place generally-correct fixes to outstanding bugs (PR24078, for example). I know there are some Clang users who want it to optimize while honoring infinite loops, and I think adding this attribute helps them as well (assuming we'd provide some non-default option to prevent Clang from adding it). Thoughts?
Thanks again,
Hal
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
My largest concern is that the frontend would need to add these things all over the place, not just before the loop backedges. For one thing, if the language has gotos, where should they be inserted? Before every branch will be conservatively correct, but I'm worried that will unnecessarily block optimizations. They'd also be needed at the entry to every function. On the other hand, maybe if we added an optimization that removed these things along any control-flow path that already had any other side effect, it might not be too bad?
>
>
> As for why I'm particularly interested in this being a property of
> the loop, consider if you were to have a mixture of Java and C++
> code, all compiled to LLVM. How do you inline between them?
>
You add the attribute to the caller. FWIW, I suspect I might care a lot about this particular case (because I believe that Fortran has defined behavior for infinite loops).
-Hal
Before every branch will be conservatively correct, but I'm worried that will unnecessarily block optimizations. They'd also be needed at the entry to every function.
On the other hand, maybe if we added an optimization that removed these things along any control-flow path that already had any other side effect, it might not be too bad?
>
>
> As for why I'm particularly interested in this being a property of
> the loop, consider if you were to have a mixture of Java and C++
> code, all compiled to LLVM. How do you inline between them?
>
You add the attribute to the caller.
FWIW, I suspect I might care a lot about this particular case (because I believe that Fortran has defined behavior for infinite loops).
+1
>
> In 2010, Nick proposed to add a 'halting' attribute that could be added to functions to indicate that they would not execute indefinitely (http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100705/103670.html). At the time that the patch was proposed, there were infrastructure problems with inferring the attribute for functions with loops (related to using function-level analysis passes from a CGSCC pass), but hopefully those will be fixed with the new pass manager. Regardless, however, such inference is much more powerful if it can take advantage of the guarantees that C/C++ provide.
>
> Thus, while I would eventually like a 'halting' attribute, or some variant of that (including, for example, the lack of calls to longjmp), I think that a first step is to provide an attribute that Clang, and other frontends, can add when producing IR from sources where the language provides C/C++-like guarantees on loop termination. This attribute would indicate that the function will not execute indefinitely without producing some externally-observable side effect (calling an external function or executing a volatile/atomic memory access). I could name this attribute 'finite', but bikeshedding is welcome.
I'd call this "productive" (inspired from terminology used to describe
co-inductive data types) with the connotation that a "productive"
function cannot loop indefinitely without producing side effects
(volatile reads/writes, IO etc.).
-- Sanjoy
I like this term better than "finite."
Thanks again,
Hal
>
> -- Sanjoy
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
I'm leaning toward agreeing with you, primarily because I think it will more-naturally fit into the optimizer than the attribute. We need to check loops for side effects anyway (if we otherwise default to C++-like rules), and so this intrinsic will do the right thing without any special logic.
-Hal
I’m curious why are you proposing an attribute to mark functions that will terminate instead of an attribute to mark functions that may not terminate? I.e. why isn’t the “default” the C/C++ behavior and introducing an opt-out for other languages?
(I haven’t thought too much about it)
Thanks,
—
Mehdi
I have no particular preference. Chandler's follow-up proposal effectively does this.
-Hal
LLVM is not just a C or C++ compiler. You have to assume the worst for
each function unless you know better. In this case that leads to not
knowing whether an unannotated function may or may not terminate while
one with an attribute on it is known to follow some guarantees.
The downside is that we'll put attributes on lots of functions in the
common case. I think this is less of a concern.
Nick
From: "Philip Reames" <list...@philipreames.com>
To: "Chandler Carruth" <chan...@google.com>, "Hal Finkel" <hfi...@anl.gov>
Cc: "LLVM Dev" <llv...@cs.uiuc.edu>
Sent: Friday, July 17, 2015 7:06:33 PM
Subject: Re: [LLVMdev] [RFC] Defining Infinite Loops
On 07/17/2015 02:03 AM, Chandler Carruth wrote:
On Thu, Jul 16, 2015 at 11:08 AM Hal Finkel <hfi...@anl.gov> wrote:
----- Original Message -----
> From: "Chandler Carruth" <chan...@google.com>
> To: "Hal Finkel" <hfi...@anl.gov>
> Cc: "LLVM Dev" <llv...@cs.uiuc.edu>
> Sent: Thursday, July 16, 2015 2:33:21 AM
> Subject: Re: [LLVMdev] [RFC] Defining Infinite Loops
>
>
>
>
> On Thu, Jul 16, 2015 at 12:27 AM Hal Finkel < hfi...@anl.gov >
> wrote:
>
>
>
Given the number of places in the optimizer which give up when encountering any write (EarlyCSE for one), that would be problematic for optimization effectiveness. The other approach would be to teach LLVM about non-memory side effects. I think this is solvable, but a large investment.
In practice, most of the contributors to LLVM care about C++. I worry we'd end up in a situation where languages which need infinite loops would become second class citizens and that a large number of optimizations wouldn't apply to them in practice. This is by far my biggest worry.
Now, I'm certainly biased, but I'd much rather see a scheme where a quality of implementation issue effects the performance of C++. These are far more likely to be actually fixed. :)
Earlier in this thread, the idea of using metadata on loops was mentioned. Chandler's point about generic handling for recursive loops is a good one, but in general, a metadata annotation of finiteness seems like a better starting point.
What if we introduced a piece of branch (really, control transfer instruction) metadata (not loop) called "productive" (to steal Sanjoy's term) whose semantics are such that it can be assumed to only execute a finite number of times between productive actions (volatile, synchronization, io, etc..). We then tag *all* branches emitted by Clang with this metadata. This gives us the benefit of the loop metadata in that a single tagged backedge branch implies a productive loop, but allows productive recursive functions to be converted into productive loops in a natural way.
The function attribute "productive" now has an easy inference rule in that if all branches in the function are productive and all callees are productive, so is the function. This would seem to allow us to perform DSE, LICM, and related optimizations without trouble.
Inlining now has a reasonable definition where you can inline between languages w/the semantics of each language preserved.
One cool thing here is that the set of operations which are "productive" could actually be encoded in the metadata. This could potentially support other languages than C++ w.r.t. the "no infinite loops except when" type rules.
Thoughts?
Philip
p.s. The current implementation of readonly is correct w.r.t. C++ rules only because we designate all atomic, volatile, or fencing operations as read/write. One thing we might end up with out of this discussion is the possibility of talking about functions which are readonly, but involved in synchronization. That seems like it might be useful for optimizing concurrent C++ programs in interesting ways. It would require a separate notion of a synchronization side effect independent of read/write though. This seems related, but slightly orthogonal to the assumptions about finiteness of loops.