These are useful instructions when lowering fixed length vector
extends, so I've broken this patch out as kind of NFC like work.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AArch64/AArch64ISelLowering.h | ||
---|---|---|
101 | The signed versions are essentially predicated versions of ISD::SIGN_EXTEND_INREG? Can we use names that reflect that? |
llvm/lib/Target/AArch64/AArch64ISelLowering.h | ||
---|---|---|
101 | It is more than a rename but I can do that, it might even allow an easier path to lowering for scalable vector types. I do want the signed and unsigned variants to be consistent though. So, any objections to me creating a matching predicated zero_extend_inreg? |
llvm/lib/Target/AArch64/AArch64ISelLowering.h | ||
---|---|---|
101 | That's fine. We actually have a function on SelectionDAG called getZeroExtendInReg(), so that isn't completely new terminology. |
Changed new nodes to SIGN/ZERO_EXTEND_INREG_MERGE_PATHRU that closer resemble SIGN_EXTEND_INREG but using a predicate and passthru positioned as required by _MERGE_PATHRU style nodes.
These patterns might need attention. ISD::SIGN_EXTEND_INREG expects both the input and output registers to have the same type, extending the small values in place. I.e. the input is unpacked.
But the AArch64ISD::SIGN_EXTEND_INREG_MERGE_PASSTHRU patterns are expecting to explicitly change the element size. I.e. the input is packed.
I have another patch that's needed to show the problem, but legalize can produce something like this:
t78: v16i8 = BUILD_VECTOR t217, t216, t215, t214, t213, t212, t211, t210, t209, t208, t207, t206, t205, t204, t203, t202 t80: v16i16 = any_extend t78 t82: v16i16 = sign_extend_inreg t80, ValueType:ch:v16i1
I'm not sure I understand what you think the issue is here. ISD::SIGN_EXTEND_INREG, AArch64ISD::SIGN_EXTEND_INREG_MERGE_PASSTHRU, and the SVE sxtb instruction are all the same, as far as I know. The operand and result types are the same, and the operation in each lane is independent.
By contrast, ISD::ANY_EXTEND does change the element size.
Ah, you're right. I misread the register classes.
def : SVE_InReg_Extend<nxv8i16, op, nxv8i1, nxv8i8, !cast<Instruction>(NAME # _H)>;
I was reading nxv16i8 instead of nxv8i8 for the input operand. Since it's nxv8i8, they are unpacked.
I'll have to find a way to generate the nxv8i8 type, to cast the operand to it. Will take that to another Diff though...
Eh, thinking some more, it's still a little weird:
class SVE_InReg_Extend<ValueType vt, SDPatternOperator op, ValueType pt, ValueType inreg_vt, Instruction inst> : Pat<(vt (op pt:$Pg, vt:$Src, inreg_vt, vt:$PassThru)), (inst $PassThru, $Pg, $Src)>;
ISD::SIGN_EXTEND_INREG has the same type for the input and output reg, and an additional VT operand to encode the input type.
AArch64ISD::ZERO_EXTEND_INREG_MERGE_PASSTHRU encodes the input type in the register, and tosses the VT operand.
Maybe it would be better to lower based on the VT operand? That way we don't need to get into the half vector type business (e.g. nxv8i8).
Perhaps I've misunderstood but based on Cameron's original message I suspect he's hit a bug because I lowered all SIGN_EXTEND_INREG operations regardless of the inreg type. This is wrong because there's no patterns for non-power-of-2 non-byte-based inreg types and thus I guess Cameron has hit a selection failure?
If so then the fix is easy enough, we should expand those cases, which looking at the common code does the correct thing of replacing with shifts. I've create D86394, even if I have misunderstood it fixes a bug anyway.
The signed versions are essentially predicated versions of ISD::SIGN_EXTEND_INREG? Can we use names that reflect that?