This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] Support some SimplifySetCC cases for comparing against vector splats of constants.
ClosedPublic

Authored by craig.topper on Feb 5 2018, 11:11 PM.

Details

Summary

This supports things like

(setcc uge X, 0) -> true
(setcc ugt X, 0) -> (setcc ne X, 0)

The AArch64 test changes are to avoid using things that we can simplify now. I'll probably commit these separately.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Feb 5 2018, 11:11 PM
spatel added a comment.Feb 6 2018, 8:59 AM

Please check in the AArch64 (and MIPS?) test changes ahead of this.

I'd prefer to add some dedicated minimal tests, so we know exactly what the improvements are without digging through the more complicated diffs. That would be stuff like this I think:

define <4 x i32> @lt_min(<4 x i32> %x) {
  %cmp = icmp slt <4 x i32> %x, <i32 2147483648, i32 2147483648, i32 2147483648, i32 2147483648>
  %r = sext <4 x i1> %cmp to <4 x i32>
  ret <4 x i32> %r
}
lib/CodeGen/SelectionDAG/TargetLowering.cpp
1891 ↗(On Diff #132940)

Is this (and similar diffs below) currently wrong for ZeroOrNegativeOneBooleanContent targets?
I'm having a hard time seeing all the pre-conditions in this 900 line monster of a function...

Please check in the AArch64 (and MIPS?) test changes ahead of this.

We're a ZeroOrNegativeOneBooleanContent target for vectors, so this may be useful.

craig.topper added inline comments.Feb 6 2018, 4:04 PM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
1891 ↗(On Diff #132940)

I suspect it is. I believe there are other lines in here that I didn't touch that are probably also wrong.

I can split out the getBooleanTrueConstant function and fix this up for scalar types first if you want?

spatel added inline comments.Feb 6 2018, 5:22 PM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
1891 ↗(On Diff #132940)

If it's not too hard, that would be good. A quick check of in-tree targets says AMDGPU, NVPTX, and some flavors of MIPS could hit that bug via scalar ZeroOrNegativeOneBooleanContent.

spatel added inline comments.Feb 6 2018, 5:25 PM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
1891 ↗(On Diff #132940)

Oops - I read that wrong. Looks like all MIPS have ZeroOrOneBooleanContent for scalars.

craig.topper added inline comments.Feb 6 2018, 11:45 PM
lib/CodeGen/SelectionDAG/TargetLowering.cpp
1891 ↗(On Diff #132940)

In most cases we probably optimize the setcc before type legalization so the result type is i1 making the difference not meaningful. So to hit a bug you'd need to only see the simplifcation after the result type has been promoted.

Add directed tests. I believe there are some missing optimizations due to not enabling canonicalization of GE/LE to GT/LT for vectors.

Add directed tests. I believe there are some missing optimizations due to not enabling canonicalization of GE/LE to GT/LT for vectors.

We might have to be more careful with vector ISAs because they tend to leave out the full set of predicates. Ie, do we have to check for legal-or-custom if this transform can happen late, or can we assume that all predicates must be handled for all data types?

I checked in the baseline tests at rL325138 - can you rebase here, so we just see the improvements?

We might have to be more careful with vector ISAs because they tend to leave out the full set of predicates. Ie, do we have to check for legal-or-custom if this transform can happen late, or can we assume that all predicates must be handled for all data types?

Do you think we should override isCondCodeLegal?

We might have to be more careful with vector ISAs because they tend to leave out the full set of predicates. Ie, do we have to check for legal-or-custom if this transform can happen late, or can we assume that all predicates must be handled for all data types?

Do you think we should override isCondCodeLegal?

No - that looks exactly like what I was thinking of. I just couldn't tell from the diffs if we're checking that consistently...so this patch should be good as-is (just rebased).

Refactoring to better organize this blob can be a follow-up. :)

We're not checking isCondCodeLegal consistently. The only calls to it are in the two blocks that covered with !isVector. I'm not sure we can use isCondCodeLegal correctly with SSE/AVX and AVX512 having different behaviors.

RKSimon added inline comments.Feb 19 2018, 8:41 AM
test/CodeGen/X86/vector-compare-simplify.ll
2 ↗(On Diff #134852)

Do we need to add SSE/SSE42/AVX2/XOP/AVX512 test targets here to check different integer compare abilities?

spatel added inline comments.Feb 19 2018, 8:50 AM
test/CodeGen/X86/vector-compare-simplify.ll
29–31 ↗(On Diff #134852)

Should we account for this regression first (by adding x86-specific folds for vector compares with edge case constants)?

78–80 ↗(On Diff #134852)

Do we know why this didn't load-fold like the sgt_max test?

spatel added inline comments.Feb 19 2018, 8:55 AM
test/CodeGen/X86/vector-compare-simplify.ll
78–80 ↗(On Diff #134852)

Disregard...this is less-than, so we can't commute the memop. But this might still be viewed as a regression.

Change to only allow vector adjustments before legalize ops since the isCondCodeLegal is probably not correct for targets today. But this will at least give us a chance to make some simplifications.

spatel accepted this revision.Mar 1 2018, 1:24 PM

LGTM.

This revision is now accepted and ready to land.Mar 1 2018, 1:24 PM
This revision was automatically updated to reflect the committed changes.