This is an archive of the discontinued LLVM Phabricator instance.

[TargetLowering] check for sign-bit comparisons in SimplifyDemandedBits
ClosedPublic

Authored by spatel on Feb 7 2017, 2:30 PM.

Details

Summary

I don't know if anything other than x86 vectors is affected by this change, but this may allow us to remove target-specific intrinsics for blendv* (vector selects). The simplification arises from the fact that blendv* instructions only use the sign-bit when deciding which vector element to choose for the destination vector. The mechanism to fold VSELECT into SHRUNKBLEND nodes already exists in x86 lowering; this demanded bits change just enables the transform to fire more often.

The original motivation starts with a bug for DSE of masked stores that seems completely unrelated, but I've explained the likely steps in this series here:
https://llvm.org/bugs/show_bug.cgi?id=11210

Diff Detail

Event Timeline

spatel created this revision.Feb 7 2017, 2:30 PM
spatel updated this revision to Diff 87530.Feb 7 2017, 2:58 PM

Patch updated:
I'm not sure how to expose this, but I think I was missing a check for BooleanContent::ZeroOrNegativeOneBooleanContent. If the target/type uses a different format for setcc, it probably wouldn't make sense to demand the top bit, but let's check that to be safe?

RKSimon added inline comments.Feb 8 2017, 3:39 AM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
770

Add test cases for these?

test/CodeGen/X86/vselect-pcmp.ll
4

Please can you add a avx512vl target to check more mask predicate cases?

spatel added inline comments.Feb 8 2017, 7:52 AM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
770

I'm not sure we want to bloat it up with those cases yet; that's why I made it a TODO? rather than a FIXME.
Instcombine already canonicalizes the X <= -1 variant to X < 0. It doesn't know to change X > -1 to X < 0 and swap the select operands, but I think that's just an IR canonicalization oversight, so I was planning to fix that.

So if we add codegen tests for those variants, it's only because a non-canonical select pattern has been created in the backend. I'd like to find evidence of that happening before adding code or tests for it.

test/CodeGen/X86/vselect-pcmp.ll
4

Sure - I'm still not clear on all of the avx512 variants, but I added a 'vl' RUN line here:
rL294462

I'll rebase with those diffs next.

spatel added inline comments.Feb 8 2017, 8:03 AM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
770

Also, there's a hint in the test variable names for this shift/trunc canonicalization to X < 0:

define <4 x i32> @signbit_sel_v4i32(<4 x i32> %x, <4 x i32> %y, <4 x i32> %mask) {
  %sh = lshr <4 x i32> %mask, <i32 31, i32 31, i32 31, i32 31>
  %tr = trunc <4 x i32> %sh to <4 x i1>
  %z = select <4 x i1> %tr, <4 x i32> %x, <4 x i32> %y
  ret <4 x i32> %z
}

$ opt -instcombine -S foo.ll

define <4 x i32> @signbit_sel_v4i32(<4 x i32> %x, <4 x i32> %y, <4 x i32> %mask) {
  %tr = icmp slt <4 x i32> %mask, zeroinitializer
  %z = select <4 x i1> %tr, <4 x i32> %x, <4 x i32> %y
  ret <4 x i32> %z
}
spatel updated this revision to Diff 87663.Feb 8 2017, 8:13 AM

Patch updated:
No code changes, but added a RUN for avx512vl in rL294462 to show that variant.
I think the first test shows a missed opportunity for avx512vl and avx512f; not sure about the rest.

RKSimon accepted this revision.Feb 10 2017, 8:08 AM

LGTM thanks Sanjay

test/CodeGen/X86/vselect-pcmp.ll
11–12

Add a FIXME for the AVX512 test case.

This revision is now accepted and ready to land.Feb 10 2017, 8:08 AM
This revision was automatically updated to reflect the committed changes.