This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Refactor optimization of zext(or(icmp, icmp)) to enable more aggressive cast-folding
ClosedPublic

Authored by mreisinger on Jul 27 2016, 7:09 AM.

Details

Summary

InstCombine unfolds expressions of the form zext(or(icmp, icmp)) to or(zext(icmp), zext(icmp)) such that in a later iteration of InstCombine the exposed zext(icmp) instructions can be optimized. We now combine this unfolding and the subsequent zext(icmp) optimization to be performed together. Since the unfolding doesn't happen separately anymore, we also again enable the folding of logic(cast(icmp), cast(icmp)) expressions to cast(logic(icmp, icmp)) which had been disabled due to its interference with the unfolding transformation.

Tested via make check and lnt.

Background

For a better understanding on how it came to this change we subsequently summarize its history. In commit r275989 we've already tried to enable the folding of logic(cast(icmp), cast(icmp)) to cast(logic(icmp, icmp)) which had to be reverted in r276106 because it could lead to an endless loop in InstCombine (also see http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160718/374347.html). The root of this problem is that in visitZExt() in InstCombineCasts.cpp there also exists a reverse of the above folding transformation, that unfolds zext(or(icmp, icmp)) to or(zext(icmp), zext(icmp)) in order to expose zext(icmp) operations which would then possibly be eliminated by subsequent iterations of InstCombine. However, before these zext(icmp) would be eliminated the folding from r275989 could kick in and cause InstCombine to endlessly switch back and forth between the folding and the unfolding transformation. This is the reason why we now combine the zext-unfolding and the elimination of the exposed zext(icmp) to happen at one go because this enables us to still allow the cast-folding in logic(cast(icmp), cast(icmp)) without entering an endless loop again.

Details on the submitted changes

  • In visitZExt() we combine the unfolding and optimization of zext instructions.
  • In transformZExtICmp() we have to use Builder->CreateIntCast() instead of CastInst::CreateIntegerCast() to make sure that the new CastInst is inserted in a BasicBlock. The new calls to transformZExtICmp() that we introduce in visitZExt() would otherwise cause according assertions to be triggered (in our case this happend, for example, with lnt for the MultiSource/Applications/sqlite3 and SingleSource/Regression/C++/EH/recursive-throw tests). The subsequent usage of replaceInstUsesWith() is necessary to ensure that the new CastInst replaces the ZExtInst accordingly.
  • In InstCombineAndOrXor.cpp we again allow the folding of casts on icmp instructions.
  • The instruction order in the optimized IR for the zext-or-icmp.ll test case is different with the introduced changes.
  • The test cases in zext.ll have been adopted from the reverted commits r275989 and r276105.

Diff Detail

Event Timeline

mreisinger updated this revision to Diff 65722.Jul 27 2016, 7:09 AM
mreisinger retitled this revision from to [InstCombine] Combine unfolding and optimization of casts in zext(or(icmp, icmp)).
mreisinger updated this object.
mreisinger added reviewers: grosser, spatel.
mreisinger added a subscriber: llvm-commits.
mreisinger updated this object.Jul 27 2016, 9:43 AM
spatel edited edge metadata.Aug 1 2016, 6:20 AM
spatel added subscribers: majnemer, eli.friedman.

I was out of the office last week, so now I'm just catching up. cc'ing @majnemer and @eli.friedman for their expertise.

Hi Sanjay, thank you for jumping in and for adding subscribers!

Best,
Matthias

majnemer accepted this revision.Aug 2 2016, 10:27 AM
majnemer added a reviewer: majnemer.

LGTM

This revision is now accepted and ready to land.Aug 2 2016, 10:27 AM

Cool, thank you for reviewing! Since I don't have commit privileges, somebody else may have to merge this. Should I therefore rebase and reupload the patch?

mreisinger updated this revision to Diff 66656.Aug 3 2016, 7:25 AM
mreisinger retitled this revision from [InstCombine] Combine unfolding and optimization of casts in zext(or(icmp, icmp)) to [InstCombine] Refactor optimization of zext(or(icmp, icmp)) to enable more aggressive cast-folding.
mreisinger updated this object.
mreisinger edited edge metadata.

I now rebased the patch (I thereby also replaced my introduced cast<>s by dyn_cast<>s which is cleaner now, sorry for this post-acceptance change). I also reran make check and lnt now and the tests pass - all but the MultiSource/Benchmarks/MiBench/consumer-typeset/consumer-typeset.execution_time in lnt, but this seems to be independent from my changes since it also fails when testing with a clean llvm build that doesn't include my patch (I ran lnt via myvirtualenv/bin/lnt runtest nt --sandbox /tmp --cc ~/opt/llvm_release+asserts/bin/clang --test-suite ~/repos/llvm-test-suite). I don't know if that's a known issue, but at least I could not find an according bugzilla entry - I'll try to further investigate this.

Best,
Matthias

Thank you for committing!

Best,
Matthias