Given this is an OR reduction the two are equivalent and later
optimizations (AArch64InstrInfo::optimizePTestInstr) may rewrite the
sequence to use the flag-setting variant of instruction X, to remove the
PTEST altogether.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
LGTM with minor nits
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp | ||
---|---|---|
1335 ↗ | (On Diff #464198) | Extra parenthesis |
1344–1346 ↗ | (On Diff #464198) | This comment is out of place now |
Thanks for review, I'll address those nits when landing unless there's any objections.
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp | ||
---|---|---|
1335 ↗ | (On Diff #464198) | Are you sure this is universally true? The s variants do an implicitly ptest using the original instruction's general predicate operand: brkb p0.b, p1/z, p2.b; ptest p1, p0.b ==> brkbs p0.b, p1/z, p2.b If you perform this transformation when the general predicates of both instructions differ the implicit ptest will calculate a result by considering a different set of lanes. Whilst the logic might be safe for your ptest.any case because it's essentially an OR reduction (and thus safe when pg & data == data), the same is unlikely for things like cc.first and cc.none where it's critical to test the exact lanes requested. Assuming you agree and I've not misunderstood you probably want to do something specific for ptest.any, which is best done during isel lowering. |
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp | ||
---|---|---|
1335 ↗ | (On Diff #464198) |
You're right this isn't safe for first/last, I missed that. I've updated the patch as suggested and moved the transform to inst combine. Cheers! |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
---|---|---|
987 | Perhaps PTEST_ANY(X=OP(PG,...), X) -> PTEST_ANY(PG, X)) to show where PG comes from. | |
993 | Looking at the instruction pseudo code for brkns it works differently than the other instructions that set the flags in that its implicit predicate is an all active one. This likely means there's a bug in AArch64InstrInfo::optimizePTestInstr. | |
llvm/test/Transforms/InstCombine/AArch64/sve-intrinsics-ptest.ll | ||
45 | Is the zext necessary? Same goes for the other tests. |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
---|---|---|
993 |
Good spot! I've removed brkn from this patch and will fix the bug in AArch64InstrInfo::optimizePTestInstr in a separate patch. I guess the optimization there can still be done as long as the ptest is called with all active mask rather than the mask from the BRKN instruction? | |
llvm/test/Transforms/InstCombine/AArch64/sve-intrinsics-ptest.ll | ||
45 |
Hangover from previous revision, fixed. |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
---|---|---|
993 |
Fixed in D135655. |
Perhaps PTEST_ANY(X=OP(PG,...), X) -> PTEST_ANY(PG, X)) to show where PG comes from.