This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] extend (select X, C1, C2 --> ext X) to vectors
ClosedPublic

Authored by spatel on Jun 30 2016, 10:29 AM.

Details

Summary

The code change is hopefully straightforward: replace dyn_casts with m_APInt, and we get transforms for splat vectors.

But these transforms raise some questions:

  1. In the cases where we need a 'not', we're increasing the instruction count. Is this justified because ext/not is always assumed cheaper/more canonical than a select?
  1. In the not+zext case, notice that the zext gets moved ahead of the xor. This is because visitZext() has: // zext (xor i1 X, true) to i32 --> xor (zext i1 X to i32), 1

There's no code comment for the motivation. Assuming there is good reason to break the m_Not pattern should sext+xor do the same?

  1. If we're ok with increasing the instruction count for #1, is the scalar select of vectors example also a good transform? In the worst case, we'd need 4 instructions in place of the select: xor, zext, insertelement, shufflevector.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 62377.Jun 30 2016, 10:29 AM
spatel retitled this revision from to [InstCombine] extend (select X, C1, C2 --> ext X) to vectors.
spatel updated this object.
spatel added reviewers: eli.friedman, majnemer, hfinkel.
spatel added a subscriber: llvm-commits.
eli.friedman edited edge metadata.Jul 2 2016, 11:20 AM

Generally looks fine.

lib/Transforms/InstCombine/InstCombineSelect.cpp
965 ↗(On Diff #62377)

Extra newline.

968 ↗(On Diff #62377)

Won't this crash if SI.getType() is <4 x i1>?

Re: your questions about canonicalization:

  1. The canonical representation of a calculation is whatever instcombine declares it to be. We generally try to favor arithmetic over selects because selects tend to be more expensive, but there's a limit to where that's worthwhile.
  2. Not sure exactly why we prefer to move xor out of a zext rather than in... apparently the transform was added in r21713 to cover this testcase:
define i32 @test22(i1 %X) {
; CHECK-LABEL: @test22(
; CHECK-NEXT: %1 = zext i1 %X to i32
; CHECK-NEXT: ret i32 %1
	%Y = xor i1 %X, true		; <i1> [#uses=1]
	%Z = zext i1 %Y to i32		; <i32> [#uses=1]
	%Q = xor i32 %Z, 1		; <i32> [#uses=1]
	ret i32 %Q
}

Obviously, that doesn't really motivate the canonical representation either way. That said, moving the xor in rather than out in general raises awkward questions about what to do in cases like "(zext i8 x to i32) ^ 257". And changing the canonical representation probably involves some work to figure out what other transforms depend on the canonical representation of zext+xor.

spatel updated this revision to Diff 62898.Jul 6 2016, 10:08 AM
spatel edited edge metadata.

Patch updated:
Added assert for i1 types (vectors should be handled after rL274465).

Re: your questions about canonicalization:

Obviously, that doesn't really motivate the canonical representation either way. That said, moving the xor in rather than out in general raises awkward questions about what to do in cases like "(zext i8 x to i32) ^ 257". And changing the canonical representation probably involves some work to figure out what other transforms depend on the canonical representation of zext+xor.

Thanks for the answers! I don't think we need to worry about your ^257 example because we're only looking at i1 types for this transform.

lib/Transforms/InstCombine/InstCombineSelect.cpp
971 ↗(On Diff #62898)

Yes - scalar i1 selects were all handled above, but vectors were not. That should be fixed with rL274465, so now we can assert that no i1 types make it down here.

eli.friedman accepted this revision.Jul 6 2016, 11:16 AM
eli.friedman edited edge metadata.

LGTM.

This revision is now accepted and ready to land.Jul 6 2016, 11:16 AM
This revision was automatically updated to reflect the committed changes.