Splitting the operand of a scalable [S|U]INT_TO_FP results in a
concat_vectors operation where the operands are unpacked FP
scalable vectors (e.g. nxv2f32).
This patch adds custom lowering of concat_vectors which
checks that the number of operands is 2, and isel patterns
to match concat_vectors of scalable FP types with uzp1.
Details
Diff Detail
Event Timeline
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
9090–9092 | I imagine that CONCAT_VECTORS operand counts have been normalise to two by this point but given its restriction that all operands must be the same type I think you can protect against this and your VT == OpVT*2 requirement using if (getNumOperands() != 2). Can the isFloatingPoint check be made part of the assert? | |
9095–9096 | Not sure CONCAT_VECTORS have a left and right hand side. |
Changes made to the LowerCONCAT_VECTORS function:
- Replaced OpVT.getVectorElementCount()*2 condition with Op.getNumOperands() != 2
- Moved the isFloatingPoint() check into the assert
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
9095–9096 | Renamed these to Op0 & Op1 |
LGTM
Alternatively, we could make CONCAT_VECTOR "legal", and lower it using an isel pattern. But I'm not sure that's actually an improvement.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
9090–9092 |
Could this also work for predicates (e.g. nxv8i1 : nxv8i1)? |
@efriedma At this level do we need to care about CONCAT_VECTORS that do not have two operands? If so then we'll need custom lowering anyway to ensure those cases get expanded?
I think we could end up with a CONCAT_VECTORS with more than two operands. At least, I can't think of anything that would prevent it (as long as the operands have a legal type). You could still pattern-match that to a tree of uzp1, but I'd be more worried about blocking useful optimizations at that point; probably custom-lowering that makes sense.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
9090–9092 | We actually have patterns for i1 already. But would probably work, sure. |
- Simplified the LowerCONCAT_VECTORS function so that it returns Op if the number of operands is 2.
- Added ISel patterns for lowering floating-point concat_vectors, making this consistent with how we lower concats of predicate types.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
991 | Thinking about it a little more, we probably only want to mark CONCAT_VECTORS "Custom" for vectors with four or more elements. It's never legal if the result is a two-element vector. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
9086–9088 | This can be summed up as VT.isScalableVector() && isTypeLegal(VT) && isTypeLegal(Op.getOperand(0).getValueType()) but given Eli's comment you can just add the isTypeLegal(Op.getOperand(0).getValueType()) part to the if statement to cover both bases. | |
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td | ||
1736–1740 | I doubt you need these anymore. |
- Restricted the concat_vectors marked as 'custom' to those with result types of 4+ elements only
- Removed unused patterns for reinterpret_cast
- Moved additional isTypeLegal check to the if statement in LowerCONCAT_VECTORS
- Reverted previous change to restrict the concat_vectors marked as custom. The extra check added to LowerCONCAT_VECTORS (isTypeLegal(Op.getOperand(0)...) will cover cases where the result is a two element vector.
LGTM assuming the potential compiler warning is removed.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
9084–9086 | I'm pretty sure this will cause an "unused variable - VT" compile time warning when building without asserts and thus will need to be verbose (i.e. assert(Op.getValueType().isScalable() && ...) |
Thinking about it a little more, we probably only want to mark CONCAT_VECTORS "Custom" for vectors with four or more elements. It's never legal if the result is a two-element vector.