
[X86][SSE] Recognise vXi1 boolean anyof/allof reduction patterns
Closed · Public

Authored by RKSimon on Apr 12 2019, 5:26 AM.

Details

Summary

Currently combineHorizontalPredicateResult only handles anyof/allof reduction patterns of legal types, which can be tricky to match as type legalization of bools can introduce bitcasts/truncs/extensions.

This patch extends combineHorizontalPredicateResult to recognise vXi1 bool reductions as well, and uses the existing combineBitcastvxi1 helper to create the MOVMSK needed to then compare the signmask result.

This ensures the accuracy of the reduction costs added in D60403, which assume that the MOVMSK is generated.
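As a rough illustration of the summary, the following is a minimal scalar sketch of an anyof/allof reduction done through a sign mask. It is not code from the patch: `movmsk`, `any_of_lanes` and `all_of_lanes` are hypothetical names, and a 4 x i32 compare result (each lane all-ones or zero) is assumed purely for the example.

```cpp
#include <array>
#include <cstdint>

// Assumed shape for this sketch: a 4 x i32 compare result in which every
// lane is either all-ones (-1, "true") or zero ("false").
using V4i32 = std::array<int32_t, 4>;

// Hypothetical scalar model of MOVMSKPS: bit i of the result is the
// sign bit of lane i.
unsigned movmsk(const V4i32 &lanes) {
  unsigned mask = 0;
  for (unsigned i = 0; i < lanes.size(); ++i)
    if (lanes[i] < 0)
      mask |= 1u << i;
  return mask;
}

// anyof: at least one sign bit is set, so the mask is non-zero.
bool any_of_lanes(const V4i32 &cmp) { return movmsk(cmp) != 0; }

// allof: every sign bit is set, so the mask equals the all-lanes value (0xF
// for four lanes).
bool all_of_lanes(const V4i32 &cmp) { return movmsk(cmp) == 0xFu; }
```

The point of the sketch is only that, once a mask has been extracted, both reductions become a single scalar compare of that mask.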

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision. Apr 12 2019, 5:26 AM
Herald added a project: Restricted Project. Apr 12 2019, 5:27 AM
spatel accepted this revision. Apr 12 2019, 6:44 AM

LGTM - see inline for a minor potential follow-up.

If we can assume vXi1 IR, then I can probably abandon my recent movmsk efforts:
D59669
D59912

test/CodeGen/X86/vector-compare-all_of.ll
1295–1296 ↗(On Diff #194846)

I think the other path prefers to use packss and a 128-bit movmsk for AVX1 here, which could be a slight win since it avoids ymm?

This revision is now accepted and ready to land. Apr 12 2019, 6:44 AM

> LGTM - see inline for a minor potential follow-up.
>
> If we can assume vXi1 IR, then I can probably abandon my recent movmsk efforts:
> D59669
> D59912

Let's keep those open for now - both still improve code beyond the vector reductions which this patch handles.

test/CodeGen/X86/vector-compare-all_of.ll
1295–1296 ↗(On Diff #194846)

Yes, a single packss feeding a movmskps is better - I think we can take some of the code from D59912 to tweak this.
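To make the packss idea in this inline thread concrete, here is a hedged scalar sketch. It assumes, purely for illustration, a 256-bit compare result with 64-bit lanes that has already been split into two 128-bit halves, each viewed as 4 x i32 with every element 0 or -1; the test in question may use a different type. `Xmm32`, `Xmm16`, `packssdw_model`, `movmskps_model` and `all_of_256` are hypothetical names, not LLVM or intrinsic APIs.

```cpp
#include <array>
#include <cstdint>

using Xmm32 = std::array<int32_t, 4>; // one 128-bit register viewed as 4 x i32
using Xmm16 = std::array<int16_t, 8>; // one 128-bit register viewed as 8 x i16

// Signed saturation from 32 to 16 bits, as PACKSSDW applies per element.
static int16_t saturate16(int32_t v) {
  if (v > 32767) return 32767;
  if (v < -32768) return -32768;
  return static_cast<int16_t>(v);
}

// Scalar model of PACKSSDW: the four elements of `lo` fill the low half of
// the result and the four elements of `hi` fill the high half.
Xmm16 packssdw_model(const Xmm32 &lo, const Xmm32 &hi) {
  Xmm16 r{};
  for (int i = 0; i < 4; ++i) {
    r[i] = saturate16(lo[i]);
    r[i + 4] = saturate16(hi[i]);
  }
  return r;
}

// Scalar model of MOVMSKPS on the packed value viewed as 4 x i32: the sign
// bit of 32-bit lane i is the sign bit of its upper 16-bit element (2*i + 1).
unsigned movmskps_model(const Xmm16 &v) {
  unsigned mask = 0;
  for (int i = 0; i < 4; ++i)
    if (v[2 * i + 1] < 0)
      mask |= 1u << i;
  return mask;
}

// allof over the four 64-bit lanes using only 128-bit operations.
bool all_of_256(const Xmm32 &lo, const Xmm32 &hi) {
  return movmskps_model(packssdw_model(lo, hi)) == 0xFu;
}
```

Because all-ones and zero survive signed saturation unchanged, the pack narrows the mask without losing information, so one 128-bit mask extraction covers all four original lanes and no ymm-width mask instruction is needed in this model.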

This revision was automatically updated to reflect the committed changes.