This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG][X86] Enable SimplifySetCC CTPOP transforms for vector splats
ClosedPublic

Authored by craig.topper on Oct 13 2020, 3:12 PM.

Details

Summary

This enables these transforms for vectors:
(ctpop x) u< 2 -> (x & x-1) == 0
(ctpop x) u> 1 -> (x & x-1) != 0
(ctpop x) == 1 --> (x != 0) && ((x & x-1) == 0)
(ctpop x) != 1 --> (x == 0) || ((x & x-1) != 0)

All enabled if CTPOP isn't Legal. This differs from the scalar
behavior where the first two are done unconditionally and the
last two are done if CTPOP isn't Legal or Custom. The Legal
check produced better results for vectors based on X86's
custom handling. Might be worth re-visiting scalars here.

I disabled the looking through truncate for vectors. The
code that creates new setcc can use the same result VT as the
original setcc even if we truncated the input. That may work
work for most scalars, but definitely wouldn't work for vectors
unless it was a vector of i1.

I moved the Log2_32 check for the truncate up to the if
that looks through the truncate as that seemed more obvious to me
than trying to determine if we looked through a truncate and then
checking the types. The original code used Log2_32_Ceil, but
Log2_32 is more correct for non power of 2 types.

Fixes or at least improves PR47825

Diff Detail

Event Timeline

craig.topper created this revision.Oct 13 2020, 3:12 PM
Herald added a project: Restricted Project. · View Herald Transcript
craig.topper requested review of this revision.Oct 13 2020, 3:12 PM
spatel added a reviewer: bkramer.EditedOct 14 2020, 8:05 AM

For reference, the original transform was added with rL123621:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20110117/115345.html
It was intentionally not checking for legality on scalars because that was better for the x86 motivating case, but I'm not sure if that still holds after 9+ years. It seems unlikely that holds across other targets.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3514

It's hard to follow the vector vs. legal logic diffs here and the next set of folds. Make bools with explanatory names for these and/or add code comments to describe what we are allowing/restricting and why?

3514–3515

I think this blob of ctpop transforms is big enough to make into a helper function as a preliminary cleanup, and that + early exits might make it easier to understand/change the logic.

3516

Is there any way to expose the Log2 difference in a test?

ecnelises resigned from this revision.Oct 16 2020, 12:01 AM

I have no memory of adding that transformation 9 years ago. dec + test should be better than popcnt + cmp even on modern x86, but with no data to back that up it might not matter at all.

I have no memory of adding that transformation 9 years ago. dec + test should be better than popcnt + cmp even on modern x86, but with no data to back that up it might not matter at all.

Similar to discussion in D89479 - I think characterization of x86 BMI perf is not universal. Slow bit-manipulation instructions seem to be limited to Intel CPUs now; AMD has had full-speed popcnt/lzcnt for multiple generations according to Agner's docs.

Rebase after refactoring the ctpop code into a separate function.

spatel accepted this revision.Oct 19 2020, 6:42 AM
spatel added a reviewer: davezarzycki.

@davezarzycki - do you still want to add tests that would be affected by this patch (it might already be done, but I missed it)?

LGTM other than the code comment nits.

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3402

The comment doesn't explain the motivation for not using the more common isOperationLegalOrCustom() clause or ignoring legality for scalars.
Add something about the x86 motivation and possibly a TODO about changing that?

Grammar police: "if it is"

This revision is now accepted and ready to land.Oct 19 2020, 6:42 AM

The tests I want to add shouldn't be affected by this patch. They're for a subsequent patch that I want to contribute that refines this patch.

This revision was landed with ongoing or failed builds.Oct 19 2020, 12:57 PM
This revision was automatically updated to reflect the committed changes.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp