This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Fix PR17564: don't fold [zs]ext into phi.
AbandonedPublic

Authored by lebedev.ri on Mar 29 2018, 2:48 PM.

Details

Summary

Motivational case is highlighted in PR17564:

int f(int x) {
    return (x & 1) || (x & 2);
}

If the function return type is bool, then the IR is just and+icmp.
But if it is int, then it has select, and whatnot.
https://godbolt.org/g/uTJVVo

If one analyzes -print-after-all, thing appear to go bad when we fold zext into phi.
If we don't do that, then the behavior is the same as with i1, and PR17564 is fixed.

Also, the second part is needed, to get rid of select if possible,
e.g. https://godbolt.org/g/iAYRup, but that is another issue.

Now, i'm not quite sure whether *this* fix is correct?
Two other tests changed, but those changes do not look harmful?

Testing: ninja check-llvm

Fixes PR17564.

If this makes sense i'll commit test changes as NFC, and rebase.

Diff Detail

Repository
rL LLVM

Event Timeline

lebedev.ri created this revision.Mar 29 2018, 2:48 PM

D45108 will also solve PR17564, so if this is the wrong approach, it can be closed i suppose.

spatel added a comment.Apr 2 2018, 6:17 AM

Avoiding instcombine folds is fragile. There's no guarantee that the IR isn't already in the form we're trying to avoid when we start instcombine, so we would fail to optimize in that case. Also, some of the tests show the benefit of the existing transform - we eliminate an instruction by changing the type of the phi.

I think we should solve this using something like D45108 instead, but I need to look closer at how the case in the bug report escaped the matching that we already have in foldSelectInstWithICmp() and foldSelectICmpAnd().

Avoiding instcombine folds is fragile. There's no guarantee that the IR isn't already in the form we're trying to avoid when we start instcombine, so we would fail to optimize in that case.

Yes, i understand.

Also, some of the tests show the benefit of the existing transform - we eliminate an instruction by changing the type of the phi.

Uhm, which one specifically? As far as i can see, all tests here end up with the same instructions, just some are moved around.
But in @icmp_div2 we actually loose an instruction, and always return 1.

I think we should solve this using something like D45108 instead

but I need to look closer at how the case in the bug report escaped the matching that we already have in foldSelectInstWithICmp() and foldSelectICmpAnd().

Hmm, right, somehow i missed those when looking where to work on D45108

So, i should close this?

spatel added a comment.Apr 2 2018, 6:48 AM

Also, some of the tests show the benefit of the existing transform - we eliminate an instruction by changing the type of the phi.

Uhm, which one specifically? As far as i can see, all tests here end up with the same instructions, just some are moved around.

@zext_of_phi / @sext_of_phi - currently, in those tests we absorb the zext/sext into the phi removing an instruction, right?

But in @icmp_div2 we actually loose an instruction, and always return 1.

I think that's an interesting and independent case. It looks like we're losing information because of the order of the folds? Or maybe we're missing an instsimplify for phi with an undef operand?

Hmm, right, somehow i missed those when looking where to work on D45108

Yes, that code is a mess. Any preliminary cleanup patches to organize and comment it would be appreciated. I've been putting off refactoring decomposeBitTestICmp() to include the cases for ICMP_EQ / ICMP_NE, but this bug may be the needed motivation.

So, i should close this?

Yes.

lebedev.ri abandoned this revision.Apr 2 2018, 6:58 AM

Will continue in D45108

but I need to look closer at how the case in the bug report escaped the matching that we already have in foldSelectInstWithICmp() and foldSelectICmpAnd().

Uhm, you know, i have added return nullptr; or return false; as the first statement in those functions, and even then those folds (@just_select, @zext_of_select, @sext_of_select) in this diff still happen.
So i'm not quite sure where they are happening...

but I need to look closer at how the case in the bug report escaped the matching that we already have in foldSelectInstWithICmp() and foldSelectICmpAnd().

Uhm, you know, i have added return nullptr; or return false; as the first statement in those functions, and even then those folds (@just_select, @zext_of_select, @sext_of_select) in this diff still happen.
So i'm not quite sure where they are happening...

Right, of course, that is handled by https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/InstCombine/InstCombineSelect.cpp#L1418-L1452
Continuing to look...