[InstCombine] Refactor optimization of zext(or(icmp, icmp)) to enable more…

Authored by grosser on Aug 3 2016, 12:30 PM.


[InstCombine] Refactor optimization of zext(or(icmp, icmp)) to enable more aggressive cast-folding

InstCombine unfolds expressions of the form zext(or(icmp, icmp)) to or(zext(icmp), zext(icmp)) such that in a later iteration of InstCombine the exposed zext(icmp) instructions can be optimized. We now combine this unfolding and the subsequent zext(icmp) optimization to be performed together. Since the unfolding doesn't happen separately anymore, we also again enable the folding of logic(cast(icmp), cast(icmp)) expressions to cast(logic(icmp, icmp)) which had been disabled due to its interference with the unfolding transformation.

Tested via make check and lnt.


For a better understanding on how it came to this change we subsequently summarize its history. In commit r275989 we've already tried to enable the folding of logic(cast(icmp), cast(icmp)) to cast(logic(icmp, icmp)) which had to be reverted in r276106 because it could lead to an endless loop in InstCombine (also see http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160718/374347.html). The root of this problem is that in visitZExt() in InstCombineCasts.cpp there also exists a reverse of the above folding transformation, that unfolds zext(or(icmp, icmp)) to or(zext(icmp), zext(icmp)) in order to expose zext(icmp) operations which would then possibly be eliminated by subsequent iterations of InstCombine. However, before these zext(icmp) would be eliminated the folding from r275989 could kick in and cause InstCombine to endlessly switch back and forth between the folding and the unfolding transformation. This is the reason why we now combine the zext-unfolding and the elimination of the exposed zext(icmp) to happen at one go because this enables us to still allow the cast-folding in logic(cast(icmp), cast(icmp)) without entering an endless loop again.

Details on the submitted changes

  • In visitZExt() we combine the unfolding and optimization of zext instructions.
  • In transformZExtICmp() we have to use Builder->CreateIntCast() instead of CastInst::CreateIntegerCast() to make sure that the new CastInst is inserted in a BasicBlock. The new calls to transformZExtICmp() that we introduce in visitZExt() would otherwise cause according assertions to be triggered (in our case this happend, for example, with lnt for the MultiSource/Applications/sqlite3 and SingleSource/Regression/C++/EH/recursive-throw tests). The subsequent usage of replaceInstUsesWith() is necessary to ensure that the new CastInst replaces the ZExtInst accordingly.
  • In InstCombineAndOrXor.cpp we again allow the folding of casts on icmp instructions.
  • The instruction order in the optimized IR for the zext-or-icmp.ll test case is different with the introduced changes.
  • The test cases in zext.ll have been adopted from the reverted commits r275989 and r276105.

Reviewers: grosser, majnemer, spatel

Subscribers: eli.friedman, majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D22864

Contributed-by: Matthias Reisinger <d412vv1n@gmail.com>
llvm-svn: 277635