As it seems difficult to mix in only some cases of legal i128 (for instance adding i128 as a type of GR128 immediately causes things to not even build), I did some experiments on how to handle this without doing that. This was (as expected) not as simple as one might have hoped - here is what I did so far to make the test case compile:
- Use a new SystemZISD node "SUBREG128" which is used when splitting/joining i128 inline asm values. Implement splitValueIntoRegisterParts() and joinRegisterPartsIntoValue() in SystemZTargetLowering using this new node to keep track of a GR64 reg and a hi64/low64 subreg index. This unfortunately also gets called for other things than inline-assembly, but it seems to work at least during experiments to check for an existing CallingConvention, but I am not really sure that is correct...
- Use a new psuedo instruction (also called SUBREG128) that lowers the SUBREG128 ISD node to an MI that uses a custom inserter hook. Here the INLINEASM instruction is found and replaced by a new one with GR128 operands (INLINEASM is a generic opcode without custom insertion).
- Return GR64 for i128 in SystemZTargetLowering::getRegForInlineAsmConstraint(). Currently GR128 is returned, but that is not really working either as two GR128 vregs were created. The i128 case is identified by an inline asm operand of GR64 that uses two (GR64) registers. This seems to be the only case where a register constraint could produce this, I think.
- Build a new INLINEASM with altered Flag operands and GR128 (tied) operands. This is a little tricky given the tied operands, updating of Flag operands, and making sure that the SUBREG128 instructions are there as expected...
I tried leaving the INLINEASM as it was all the way until the AsmPrinter, but that didn't seem workable with the tied operands as they were then two pairs of tied GR64 registers that didn't belong in a GR128 pair during regalloc...
The approach is sort of simple but yet quite intricate to implement. There are probably things that could be simplified by some change in common-code...
clang-format: please reformat the code
- virtual unsigned - getNumRegisters(LLVMContext &Context, EVT VT, - Optional<MVT> RegisterVT = None) const { + virtual unsigned getNumRegisters(LLVMContext &Context, EVT VT, + Optional<MVT> RegisterVT = None) const {