While get_active_lane_mask lowering it uses WHILELO instruction, but for constant range suitable for PTRUE then we could issue PTRUE instruction instead.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/AArch64/active_lane_mask.ll | ||
---|---|---|
486 | Please can you add a test which shows what happens when the %n parameter of active.lane.mask goes out of the range that can be represented as a ptrue immediate. I expect this should fall back to the old codegen? |
This transformation is more subtle than you expect. See https://developer.arm.com/documentation/ddi0602/2022-09/SVE-Instructions/PTRUE--predicate---Initialise-predicate-from-named-constraint-, specifically the part that reads If the constraint specifies more elements than are available at the current vector length then all elements of the destination predicate are set to false..
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
17640 | Notwithstanding Paul's comment, I think something like this also needs to account for the mapping between PredPattern constants and the integer %n constant. Note that 1-8 have a one-to-one correspondance and above that they don't. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
17639–17640 | Unless I'm missing something, this condition doesn't look right. I suspect this combine could happen before legalization in which case you could have vscale x [big] x something, and the [big] you're testing against here could exceed the runtime vector length -- and then you're hitting the problem Paul pointed out that ptrue would return an empty predicate. I take to account for this properly you're going to need to query the minimum vscale via SubTarget->getMinSVEVectorSizeInBits(). |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4755 | unsigned | |
4755–4757 | std::max(Subtarget->getMinSVEVectorSizeInBits(), 128u) | |
4759 | NumActiveElems? | |
4759–4760 | Op.getConstantOperandVal(2) - Op.getConstantOperandVal(1); | |
4762 | this should be <= | |
4763 | this should be the encoded pattern returned by getSVEPredPatternFromNumElements, see elsewhere in ISelLowering for examples. | |
llvm/test/CodeGen/AArch64/active_lane_mask.ll | ||
478 | these tests aren't sufficient, it would be good to have tests for
| |
495 | active.lane.mask is an unsigned less-than comparison so this generates a predicate with 4 active lanes, which for a minimum vector width of 128 and esize of 32 can be represented with ptrue p0.s, vl4? |
llvm/test/CodeGen/AArch64/active_lane_mask.ll | ||
---|---|---|
495 | correct. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4758–4759 | This doesn't look correct because while can take 64-bit operation. This code can truncate the result in such a way that a ptrue might incorrectly look like a viable replacement. | |
4763 | It likely doesn't really matter but given getSVEPredPatternFromNumElements takes an unsigned (because it's linked to the interface used by getVectorMinNumElements) perhaps it's clean to perform the <= check before calling getSVEPredPatternFromNumElements, that way we know the truncation is not losing any precision. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4758–4759 | Sorry I meant "while can take 64-bit operands". |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4758–4759 | ok, as whilelo operands are i32, I think we have to zero extend those operands to unsigned by cast<ConstantSDNode>(Op)->getZExtValue(). |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
4758–4759 | whilelo operands are not restricted to i32, which can be seen by looking at IntrinsicsAArch64.td. Although the definition suggests any integer type is allowed, that's just a deficiency of how the td file is parsed. Realistically these intrinsics only accept either i32 or i64 operands, which are tested by sve-intrinsics-while.ll. Perhaps you can keep things as APInt up until the comparison, at which point you know an unsigned will be big enough to hold the length. |
unsigned