[llvm-dev] Invalid llvm-ir and unreachable code

75 views
Skip to first unread message

Jeroen Dobbelaere via llvm-dev

unread,
Mar 12, 2021, 3:47:25 AM3/12/21
to llvm...@lists.llvm.org
Hi all,

I have found a case where an optimization pass is barfing on invalid code in an unreachable
basic block. A self referencing GEP '%x = getelementptr %x, 0, 1' is found inside a cycle of
two unreachable basic blocks)

The invalid code and the unreachable basic blocks were introduced by the function inliner.

I am wondering what is valid for these cases ?
(I did not find anything in LangRef, but I might as well have missed it)

- clearly 'dead code' (aka unreachable basic blocks) is valid in an IR ?
- I assume that the self reference in dead code is not valid ?
- should inlining do an extra effort of cleaning up such dead code blocks ?

Thanks,

Jeroen Dobbelaere

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Roman Lebedev via llvm-dev

unread,
Mar 12, 2021, 3:53:23 AM3/12/21
to Jeroen Dobbelaere, llvm...@lists.llvm.org
On Fri, Mar 12, 2021 at 11:47 AM Jeroen Dobbelaere via llvm-dev
<llvm...@lists.llvm.org> wrote:
>
> Hi all,
>
> I have found a case where an optimization pass is barfing on invalid code in an unreachable
> basic block. A self referencing GEP '%x = getelementptr %x, 0, 1' is found inside a cycle of
> two unreachable basic blocks)
>
> The invalid code and the unreachable basic blocks were introduced by the function inliner.
>
> I am wondering what is valid for these cases ?
> (I did not find anything in LangRef, but I might as well have missed it)
>
> - clearly 'dead code' (aka unreachable basic blocks) is valid in an IR ?
Yep.

> - I assume that the self reference in dead code is not valid ?

Unreachable code is permitted to take various weird shapes and forms.
So no, that is valid.

> - should inlining do an extra effort of cleaning up such dead code blocks ?
>

Which pass?
Generally, it (or the users of that code) should either use DominatoTree
to avoid handling unreachable blocks, or it should be hardened to deal
with such situations gracefully.

> Thanks,
>
> Jeroen Dobbelaere
Roman

Jeroen Dobbelaere via llvm-dev

unread,
Mar 12, 2021, 3:59:34 AM3/12/21
to Roman Lebedev, llvm...@lists.llvm.org
> > Hi all,
> >
> > I have found a case where an optimization pass is barfing on invalid code in
> an unreachable
> > basic block. A self referencing GEP '%x = getelementptr %x, 0, 1' is found
> inside a cycle of
> > two unreachable basic blocks)
> >
> > The invalid code and the unreachable basic blocks were introduced by the
> function inliner.
> >
> > I am wondering what is valid for these cases ?
> > (I did not find anything in LangRef, but I might as well have missed it)
> >
> > - clearly 'dead code' (aka unreachable basic blocks) is valid in an IR ?
> Yep.
>
> > - I assume that the self reference in dead code is not valid ?
> Unreachable code is permitted to take various weird shapes and forms.
> So no, that is valid.
>
> > - should inlining do an extra effort of cleaning up such dead code blocks ?
> >
>
> Which pass?

As far as I see, it is in SROA (and a combination of our custom changes, including the full restrict changes).
I am not sure yet if it is a generic SROA problem or not, but if it is, I should be able
to create a bugreport.

> Generally, it (or the users of that code) should either use DominatoTree
> to avoid handling unreachable blocks, or it should be hardened to deal
> with such situations gracefully.

Thanks,

Jeroen Dobbelaere




Mehdi AMINI via llvm-dev

unread,
Mar 13, 2021, 11:05:37 PM3/13/21
to Jeroen Dobbelaere, llvm...@lists.llvm.org
On Fri, Mar 12, 2021 at 12:47 AM Jeroen Dobbelaere via llvm-dev <llvm...@lists.llvm.org> wrote:
Hi all,

I have found a case where an optimization pass is barfing on invalid code in an unreachable
basic block. A self referencing GEP '%x = getelementptr %x, 0, 1' is found inside a cycle of
two unreachable basic blocks)

The invalid code and the unreachable basic blocks were introduced by the function inliner.

I am wondering what is valid for these cases ?
(I did not find anything in LangRef, but I might as well have missed it)

- clearly 'dead code' (aka unreachable basic blocks) is valid in an IR ?
- I assume that the self reference in dead code is not valid ?

Yes, self-references are valid in unreachable code, we have test cases for this, it'll pass the verifier. Passes should be resilient against this (even though a pass could always do a quick reachability check and delete all dead blocks as a pre-step).

-- 
Mehdi

Jeroen Dobbelaere via llvm-dev

unread,
Mar 15, 2021, 4:06:07 AM3/15/21
to Mehdi AMINI, llvm...@lists.llvm.org

Thanks for all the feedback !

 

The issue I encountered is likely not happening on the main llvm, but I believe the issue is interesting enough to

document:

 

- the problematic code construct consists of a cycle of 'dead' basic blocks that conditionally jumps into live blocks.

  That jump introduce a connection of a dead self-referring 'getelementptr' to live code through a PHI node.

- during SROA, the scoped noalias analysis (full restrict version) does an unbounded (MaxLookup == 0) 'getUnderlyingObject' call on a pointer.

- 'getUnderlyingObject' also follows the path to the dead blocks and ends up in the self-referring  'getelementpr'.

 

Does this mean that 'getUnderlyingObject' should also be prepared to be handling 'invalid instructions' in 'dead code' ?

 

Greetings,

 

Jeroen Dobbelaere

Roman Lebedev via llvm-dev

unread,
Mar 15, 2021, 4:55:52 AM3/15/21
to Jeroen Dobbelaere, llvm...@lists.llvm.org
On Mon, Mar 15, 2021 at 11:06 AM Jeroen Dobbelaere via llvm-dev
<llvm...@lists.llvm.org> wrote:
>
> Thanks for all the feedback !
>
>
>
> The issue I encountered is likely not happening on the main llvm, but I believe the issue is interesting enough to
>
> document:
>
>
>
> - the problematic code construct consists of a cycle of 'dead' basic blocks that conditionally jumps into live blocks.
>
> That jump introduce a connection of a dead self-referring 'getelementptr' to live code through a PHI node.
>
> - during SROA, the scoped noalias analysis (full restrict version) does an unbounded (MaxLookup == 0) 'getUnderlyingObject' call on a pointer.
>
> - 'getUnderlyingObject' also follows the path to the dead blocks and ends up in the self-referring 'getelementpr'.
>
>
>
> Does this mean that 'getUnderlyingObject' should also be prepared to be handling 'invalid instructions' in 'dead code' ?
I've just adjusted that function in 36f1c3db66f7268ea3183bcf0bbf05b3e1c570b4
to assert that no cycles are encountered instead of endlessly looping.

But i *think* the problem is in your SROA changes.
Can you link me to the particular patch in question?

> Greetings,
>
>
>
> Jeroen Dobbelaere
Roman

Jeroen Dobbelaere via llvm-dev

unread,
Mar 15, 2021, 5:53:45 AM3/15/21
to Roman Lebedev, Mehdi AMINI, llvm...@lists.llvm.org
Hi Roman,

[..]
> >
> > - the problematic code construct consists of a cycle of 'dead' basic blocks
> that conditionally jumps into live blocks.
> >
> > That jump introduce a connection of a dead self-referring 'getelementptr'
> to live code through a PHI node.
> >
> > - during SROA, the scoped noalias analysis (full restrict version) does an
> unbounded (MaxLookup == 0) 'getUnderlyingObject' call on a pointer.
> >
> > - 'getUnderlyingObject' also follows the path to the dead blocks and ends up
> in the self-referring 'getelementpr'.
> >
> >
> >
> > Does this mean that 'getUnderlyingObject' should also be prepared to be
> handling 'invalid instructions' in 'dead code' ?
> I've just adjusted that function in 36f1c3db66f7268ea3183bcf0bbf05b3e1c570b4
> to assert that no cycles are encountered instead of endlessly looping.
>
> But i *think* the problem is in your SROA changes.
> Can you link me to the particular patch in question?

This is the patch that triggers the unbounded lookup: https://reviews.llvm.org/D68507
(llvm/lib/Analysis/ScopedNoAliasAA.cpp, line 269)

I checked again the endless loop: It happens _right after SROA_, in the MemorySSA pass.

Apparently, your extra check also triggers cases without my patches ? A self-reference is
likely a mild version of an 'invalid instruction' but what other constructs can we encounter
in dead blocks ?

I am wondering if we should allow (analysis) passes to follow code into dead blocks at all ?

Greetings,

Jeroen

Florian Hahn via llvm-dev

unread,
Mar 15, 2021, 6:32:29 AM3/15/21
to Jeroen Dobbelaere, llvm...@lists.llvm.org

> On Mar 15, 2021, at 09:53, Jeroen Dobbelaere via llvm-dev <llvm...@lists.llvm.org> wrote:
>
> Hi Roman,
>
> [..]
>>>
>>> - the problematic code construct consists of a cycle of 'dead' basic blocks
>> that conditionally jumps into live blocks.
>>>
>>> That jump introduce a connection of a dead self-referring 'getelementptr'
>> to live code through a PHI node.
>>>
>>> - during SROA, the scoped noalias analysis (full restrict version) does an
>> unbounded (MaxLookup == 0) 'getUnderlyingObject' call on a pointer.
>>>
>>> - 'getUnderlyingObject' also follows the path to the dead blocks and ends up
>> in the self-referring 'getelementpr'.
>>>
>>>
>>>
>>> Does this mean that 'getUnderlyingObject' should also be prepared to be
>> handling 'invalid instructions' in 'dead code' ?
>> I've just adjusted that function in 36f1c3db66f7268ea3183bcf0bbf05b3e1c570b4
>> to assert that no cycles are encountered instead of endlessly looping.
>>
>> But i *think* the problem is in your SROA changes.
>> Can you link me to the particular patch in question?
>
> This is the patch that triggers the unbounded lookup: https://reviews.llvm.org/D68507
> (llvm/lib/Analysis/ScopedNoAliasAA.cpp, line 269)
>
> I checked again the endless loop: It happens _right after SROA_, in the MemorySSA pass.
>

Can you isolate the failure to just a run of MemorySSA? If so, it sounds like it would be a problem in your linked patch. I guess it would be good to know exactly where an infinite loop is triggered (and why).

Nikita Popov via llvm-dev

unread,
Mar 15, 2021, 6:37:52 AM3/15/21
to Florian Hahn, llvm...@lists.llvm.org
The infinite loop happens because getUnderlyingObject() is called with MaxLookup=0. getUnderlyingObject() relies on a finite depth limit to prevent infinite loops. The problem is specific to the full restrict patch for this reason -- normally we don't perform unlimited getUnderlyingObject() calls.

The correct fix here is imho to remove the ability to call getUnderlyingObject() with MaxLookup=0, which sounds like a bad idea for pathological cases anyway.

Nikita

Roman Lebedev via llvm-dev

unread,
Mar 15, 2021, 6:38:08 AM3/15/21
to Jeroen Dobbelaere, llvm...@lists.llvm.org
On Mon, Mar 15, 2021 at 12:53 PM Jeroen Dobbelaere
<Jeroen.D...@synopsys.com> wrote:
>
> Hi Roman,
>
> [..]
> > >
> > > - the problematic code construct consists of a cycle of 'dead' basic blocks
> > that conditionally jumps into live blocks.
> > >
> > > That jump introduce a connection of a dead self-referring 'getelementptr'
> > to live code through a PHI node.
> > >
> > > - during SROA, the scoped noalias analysis (full restrict version) does an
> > unbounded (MaxLookup == 0) 'getUnderlyingObject' call on a pointer.
> > >
> > > - 'getUnderlyingObject' also follows the path to the dead blocks and ends up
> > in the self-referring 'getelementpr'.
> > >
> > >
> > >
> > > Does this mean that 'getUnderlyingObject' should also be prepared to be
> > handling 'invalid instructions' in 'dead code' ?
> > I've just adjusted that function in 36f1c3db66f7268ea3183bcf0bbf05b3e1c570b4
> > to assert that no cycles are encountered instead of endlessly looping.
> >
> > But i *think* the problem is in your SROA changes.
> > Can you link me to the particular patch in question?
>
> This is the patch that triggers the unbounded lookup: https://reviews.llvm.org/D68507
> (llvm/lib/Analysis/ScopedNoAliasAA.cpp, line 269)
>
> I checked again the endless loop: It happens _right after SROA_, in the MemorySSA pass.

> Apparently, your extra check also triggers cases without my patches ? A self-reference is
> likely a mild version of an 'invalid instruction' but what other constructs can we encounter
> in dead blocks ?

Yeah, i fooled myself there.

> I am wondering if we should allow (analysis) passes to follow code into dead blocks at all ?

I wouldn't think so. It's a waste of time i would say.

Jeroen Dobbelaere via llvm-dev

unread,
Mar 15, 2021, 7:38:57 AM3/15/21
to Nikita Popov, Florian Hahn, llvm...@lists.llvm.org
>> [..]
>>>>
>>>> - the problematic code construct consists of a cycle of 'dead' basic blocks
>>> that conditionally jumps into live blocks.
>>>>
>>>>  That jump introduce a connection of a dead self-referring 'getelementptr'
>>> to live code through a PHI node.
>>>>
>>>> - during SROA, the scoped noalias analysis (full restrict version) does an
>>> unbounded (MaxLookup == 0) 'getUnderlyingObject' call on a pointer.
>>>>
>>>> - 'getUnderlyingObject' also follows the path to the dead blocks and ends up
>>> in the self-referring  'getelementpr'.
>>>>
>>>>
>>>>
>>>> Does this mean that 'getUnderlyingObject' should also be prepared to be
>>> handling 'invalid instructions' in 'dead code' ?
>>> I've just adjusted that function in 36f1c3db66f7268ea3183bcf0bbf05b3e1c570b4
>>> to assert that no cycles are encountered instead of endlessly looping.
>>>
>>> But i *think* the problem is in your SROA changes.
>>> Can you link me to the particular patch in question?
>>
>> This is the patch that triggers the unbounded lookup:  https://reviews.llvm.org/D68507
>> (llvm/lib/Analysis/ScopedNoAliasAA.cpp, line 269)
>>
>> I checked again the endless loop: It happens _right after SROA_, in the MemorySSA pass.
>>

>>Can you isolate the failure to just a run of MemorySSA? If so, it sounds like it would be a
> problem in your linked patch. I guess it would be good to know exactly where an infinite
> loop is triggered (and why).
>

>The infinite loop happens because getUnderlyingObject() is called with MaxLookup=0. getUnderlyingObject() relies on a finite depth limit to prevent >infinite loops. The problem is specific to the full restrict patch for this reason -- normally we don't perform unlimited getUnderlyingObject() >calls.
>
>The correct fix here is imho to remove the ability to call getUnderlyingObject() with MaxLookup=0, which sounds like a bad idea for pathological cases anyway.

I agree. IIRC this lookup originates from the original 'local restrict support'. I think we should be able to cope with a limited depth lookup.
At worst case, it should modify the result of a scoped noalias query to return 'MayAlias' over 'NoAlias'.

Greetings,

Jeroen

Jeroen Dobbelaere via llvm-dev

unread,
Mar 15, 2021, 7:40:55 AM3/15/21
to Roman Lebedev, Mehdi AMINI, llvm...@lists.llvm.org
[..]
> > I am wondering if we should allow (analysis) passes to follow code into dead
> blocks at all ?
> I wouldn't think so. It's a waste of time i would say.
>

Then, how can an analysis pass easily detect that it is following a PHI node into a dead basic block ?
Reply all
Reply to author
Forward
0 new messages