This is an archive of the discontinued LLVM Phabricator instance.

DAGCombiner: check isZExtFree before doing combine
AbandonedPublic

Authored by escha on Jul 31 2016, 11:21 AM.

Details

Reviewers
arsenm
Summary

No in-tree effect as of now (as far as I know), but it is logical, and it allows me to do certain combines that would otherwise loop infinitely.

What the change does: when the zext is free, it no longer converts zext(op(zext(X))) to op(zext(X)).

Why: the larger op may be more expensive, and if the zext is free, there is no purpose to such a transform.

Really why: I want a combine in our GPU tree that shrinks large arithmetic operations (and zero extends the result), because smaller operations take fewer registers and are often faster as well. This will loop infinitely if any combine on (zext (OP x)) makes OP bigger again.
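
To make the looping concern concrete, here is a minimal toy model in C++. It is not LLVM code: the `ZExtOfOp` struct, the `widenOp` and `shrinkOp` functions, and the `exampleZExtFree` hook are all hypothetical names invented for illustration, and the assumption that a 32->64 zext is free on the target is made up for the sketch. It only tracks the bit width of the inner op and shows how a widening combine and a shrinking combine can undo each other forever unless the widening one checks whether the zext is free.

```cpp
#include <cassert>

// Toy stand-in for a node of the form zext(op(...)). Only widths are tracked.
// Hypothetical illustration, not an LLVM data structure.
struct ZExtOfOp {
    unsigned opBits;   // width at which the arithmetic op is performed
    unsigned outBits;  // width of the final zero-extended result
};

// Stand-in for a target hook in the spirit of isZExtFree (assumed signature).
using ZExtFreeFn = bool (*)(unsigned fromBits, unsigned toBits);

// Assumption for this sketch: the target extends 32 -> 64 bits for free.
bool exampleZExtFree(unsigned fromBits, unsigned toBits) {
    return fromBits == 32 && toBits == 64;
}

// Widening combine: fold the outer zext into the op, making the op wide.
// With checkFree set, the fold is skipped when the zext costs nothing.
bool widenOp(ZExtOfOp &n, bool checkFree, ZExtFreeFn isFree) {
    if (n.opBits >= n.outBits) return false;                  // nothing to fold
    if (checkFree && isFree(n.opBits, n.outBits)) return false;
    n.opBits = n.outBits;
    return true;
}

// Shrinking combine (the out-of-tree one described above): perform the op
// at narrowBits and zero extend the result.
bool shrinkOp(ZExtOfOp &n, unsigned narrowBits) {
    if (n.opBits <= narrowBits) return false;                 // already narrow
    n.opBits = narrowBits;
    return true;
}

// Apply both combines until nothing changes. Returns the number of rounds
// that made a change, or -1 if the round limit was hit without converging.
int runCombines(ZExtOfOp n, bool checkFree, ZExtFreeFn isFree, int limit) {
    assert(limit > 0);
    for (int round = 0; round < limit; ++round) {
        bool changed = widenOp(n, checkFree, isFree);
        changed = shrinkOp(n, 32) || changed;
        if (!changed) return round;
    }
    return -1;
}
```

In this model, `runCombines(ZExtOfOp{32, 64}, false, exampleZExtFree, 100)` never settles, because each round widens the op and then shrinks it again, while the same call with `checkFree` set to `true` stops immediately.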

Diff Detail

Repository
rL LLVM

Event Timeline

escha updated this revision to Diff 66246. Jul 31 2016, 11:21 AM
escha retitled this revision to DAGCombiner: check isZExtFree before doing combine.
escha updated this object.
escha added a reviewer: arsenm.
escha set the repository for this revision to rL LLVM.
escha added a subscriber: llvm-commits.
escha added a comment. Aug 1 2016, 12:02 PM

Closing for now; it looks like this can interfere with address mode folding (e.g. on x86_64 where 32->64 zext is free), even though it makes logical sense.

escha abandoned this revision. Aug 1 2016, 12:02 PM
arsenm edited edge metadata. Aug 2 2016, 10:40 AM

AMDGPU wants this; it eliminates a 64-bit shift. This will make even more sense when i16 is added as a legal type for some subtargets.

define void @foo(i64 addrspace(1)* %out, i16 %x) {
  %zext0 = zext i16 %x to i32
  %shl = shl i32 %zext0, 11
  %zext1 = zext i32 %shl to i64
  store i64 %zext1, i64 addrspace(1)* %out
  ret void
}
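
As a sanity check on why the 32-bit shift is enough for this test case, the arithmetic can be mirrored in plain C++. This is only a sketch of the IR above, with hypothetical function names: a 16-bit value shifted left by 11 needs at most 27 bits, so nothing is lost in the 32-bit shift and zero extending afterwards gives the same result as a 64-bit shift.

```cpp
#include <cassert>   // for the check suggested in the note below
#include <cstdint>

// Mirrors @foo: zext i16 -> i32, shl by 11 in 32 bits, then zext to i64.
uint64_t narrowShiftThenZExt(uint16_t x) {
    uint32_t zext0 = x;            // zext i16 %x to i32
    uint32_t shl = zext0 << 11;    // shl i32 %zext0, 11
    return static_cast<uint64_t>(shl);  // zext i32 %shl to i64
}

// The widened form: extend first, then do the shift in 64 bits.
uint64_t wideShift(uint16_t x) {
    return static_cast<uint64_t>(x) << 11;
}

// Exhaustively compare both forms over every 16-bit input.
// Returns the number of inputs on which they disagree.
unsigned countMismatches() {
    unsigned mismatches = 0;
    for (uint32_t v = 0; v <= 0xFFFF; ++v) {
        uint16_t x = static_cast<uint16_t>(v);
        if (narrowShiftThenZExt(x) != wideShift(x)) ++mismatches;
    }
    return mismatches;
}
```

A caller could confirm the equivalence with `assert(countMismatches() == 0);`. The two forms compute the same value here, so the choice between them is purely a question of cost on the target.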