This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Add instcombine for PTEST_ANY(X=OP(PG,...), X) -> PTEST_ANY(PG, X))
ClosedPublic

Authored by c-rhodes on Sep 30 2022, 3:01 AM.

Details

Summary

Given this is an OR reduction the two are equivalent and later
optimizations (AArch64InstrInfo::optimizePTestInstr) may rewrite the
sequence to use the flag-setting variant of instruction X, to remove the
PTEST altogether.

Diff Detail

Event Timeline

c-rhodes created this revision.Sep 30 2022, 3:01 AM
c-rhodes requested review of this revision.Sep 30 2022, 3:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 30 2022, 3:01 AM
Matt added a subscriber: Matt.Sep 30 2022, 8:57 PM
bsmith accepted this revision.Oct 3 2022, 2:27 AM

LGTM with minor nits

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
1335 ↗(On Diff #464198)

Extra parenthesis

1344–1346 ↗(On Diff #464198)

This comment is out of place now

This revision is now accepted and ready to land.Oct 3 2022, 2:27 AM

LGTM with minor nits

Thanks for review, I'll address those nits when landing unless there's any objections.

paulwalker-arm added inline comments.Oct 3 2022, 7:57 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
1335 ↗(On Diff #464198)

Are you sure this is universally true? The s variants do an implicitly ptest using the original instruction's general predicate operand:

brkb p0.b, p1/z, p2.b; ptest p1, p0.b ==> brkbs p0.b, p1/z, p2.b

If you perform this transformation when the general predicates of both instructions differ the implicit ptest will calculate a result by considering a different set of lanes.

Whilst the logic might be safe for your ptest.any case because it's essentially an OR reduction (and thus safe when pg & data == data), the same is unlikely for things like cc.first and cc.none where it's critical to test the exact lanes requested.

Assuming you agree and I've not misunderstood you probably want to do something specific for ptest.any, which is best done during isel lowering.

c-rhodes updated this revision to Diff 466077.Oct 7 2022, 7:56 AM
c-rhodes retitled this revision from [AArch64][SVE] Use flag setting variants of BRKB/BRKN/RDFFR to [AArch64][SVE] Add instcombine for PTEST_ANY(X, X) -> PTEST_ANY(PG, X).
c-rhodes edited the summary of this revision. (Show Details)

Restrict optimization to ptest.any and move to inst combine.

c-rhodes marked an inline comment as done and an inline comment as not done.Oct 7 2022, 7:58 AM
c-rhodes added inline comments.
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
1335 ↗(On Diff #464198)

Are you sure this is universally true? The s variants do an implicitly ptest using the original instruction's general predicate operand:

brkb p0.b, p1/z, p2.b; ptest p1, p0.b ==> brkbs p0.b, p1/z, p2.b

If you perform this transformation when the general predicates of both instructions differ the implicit ptest will calculate a result by considering a different set of lanes.

Whilst the logic might be safe for your ptest.any case because it's essentially an OR reduction (and thus safe when pg & data == data), the same is unlikely for things like cc.first and cc.none where it's critical to test the exact lanes requested.

Assuming you agree and I've not misunderstood you probably want to do something specific for ptest.any, which is best done during isel lowering.

You're right this isn't safe for first/last, I missed that. I've updated the patch as suggested and moved the transform to inst combine. Cheers!

paulwalker-arm added inline comments.Oct 10 2022, 2:36 PM
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
987

Perhaps PTEST_ANY(X=OP(PG,...), X) -> PTEST_ANY(PG, X)) to show where PG comes from.

993

Looking at the instruction pseudo code for brkns it works differently than the other instructions that set the flags in that its implicit predicate is an all active one. This likely means there's a bug in AArch64InstrInfo::optimizePTestInstr.

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsics-ptest.ll
45

Is the zext necessary? Same goes for the other tests.

c-rhodes updated this revision to Diff 466739.Oct 11 2022, 2:22 AM
c-rhodes marked an inline comment as done.
c-rhodes retitled this revision from [AArch64][SVE] Add instcombine for PTEST_ANY(X, X) -> PTEST_ANY(PG, X) to [AArch64][SVE] Add instcombine for PTEST_ANY(X=OP(PG,...), X) -> PTEST_ANY(PG, X)).
  • Don't transform brkn intrinsic.
  • Update comment.
c-rhodes updated this revision to Diff 466743.Oct 11 2022, 2:27 AM

Remove zext from tests.

c-rhodes marked 3 inline comments as done.Oct 11 2022, 2:30 AM
c-rhodes added inline comments.
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
993

Looking at the instruction pseudo code for brkns it works differently than the other instructions that set the flags in that its implicit predicate is an all active one. This likely means there's a bug in AArch64InstrInfo::optimizePTestInstr.

Good spot! I've removed brkn from this patch and will fix the bug in AArch64InstrInfo::optimizePTestInstr in a separate patch. I guess the optimization there can still be done as long as the ptest is called with all active mask rather than the mask from the BRKN instruction?

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsics-ptest.ll
45

Is the zext necessary? Same goes for the other tests.

Hangover from previous revision, fixed.

c-rhodes marked 2 inline comments as done.
c-rhodes added inline comments.
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
993

Looking at the instruction pseudo code for brkns it works differently than the other instructions that set the flags in that its implicit predicate is an all active one. This likely means there's a bug in AArch64InstrInfo::optimizePTestInstr.

Good spot! I've removed brkn from this patch and will fix the bug in AArch64InstrInfo::optimizePTestInstr in a separate patch. I guess the optimization there can still be done as long as the ptest is called with all active mask rather than the mask from the BRKN instruction?

Fixed in D135655.

paulwalker-arm accepted this revision.Oct 11 2022, 10:11 AM
This revision was landed with ongoing or failed builds.Oct 12 2022, 2:14 AM
This revision was automatically updated to reflect the committed changes.