Export lq, stq, lqarx and stqcx. in preparation for implementing 16-byte lock free atomic operations on AIX.
Add a new register class g8prc for these instructions, since these instructions require even-odd register pair.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Although I am obviously not opposed to adding these instructions and the necessary register classes, we do have to devise a way to ensure RA doesn't allocate the same register to the base and result of LQ. An additional complication for this check is ensuring that if RA == 0, RTp != 0.
llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp | ||
---|---|---|
118 | This is likely problematic. What happens if we need to spill one of these at an offset from the stack pointer that doesn't fit in a DQ field? I imagine something breaks. These have to be handled differently in that we need to bump the base register rather than loading an immediate into a register and using it as an index. | |
llvm/lib/Target/PowerPC/PPCRegisterInfo.td | ||
23 | These names are odd, they seem to suggest that there are pairs that are subregisters, but it is actually the individual registers. Maybe: sub_gp8_x0 sub_gp8_x1 |
we do have to devise a way to ensure RA doesn't allocate the same register to the base and result of LQ
One way I can figure out now is to have an additional pseudo instruction after LQ which uses the base register of LQ so that liveranges of base register and result register overlap.
we do have to devise a way to ensure RA doesn't allocate the same register to the base and result of LQ
Another approach for this might be adding an additional out $ea_result operand which can be considered pseudo for LQ. But there might be redundant copy. For example,
%0, %1 = LQ 128, %2(tied-def 1) use %2
- Solve LQ register constraint RTp != RA by adding early-clobber on RTp
- Use doubleword load/store to perform quadword restore/spill
LQ and STQ are still needed for quadword atomic load/store.
llvm/lib/Target/PowerPC/PPCInstr64Bit.td | ||
---|---|---|
1259 | nit: misaligned - move right by one more space. | |
llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp | ||
1151 | Why use an X-Form for spilling/restoring rather than a D-Form the way all the other spilling code does? The FI eliminator should have what is needed to convert the D-Form to an X-Form (rather than relying on the pre-emit peephole which is what I assume is converting the X-Form to a D-Form in the attached test case). |
llvm/lib/Target/PowerPC/AsmParser/PPCAsmParser.cpp | ||
---|---|---|
297 | This is wrong and not needed , should be removed -- there is NO instruction using Regnum/2 as reg index. | |
440 | isEvenRegNumber is better, this is not specific to G8p, just a plain even regnum check. | |
441 | This is wrong. Should be something like: isRegNumber() && ((getImm() & 1) == 0) | |
llvm/lib/Target/PowerPC/PPCInstr64Bit.td | ||
286 | isRecordForm ? lqarx is not a recordform, stdcx. is a record form. | |
llvm/lib/Target/PowerPC/PPCRegisterInfo.td | ||
199 | This looks wrong. The encoding should be using original regnum, as in getG8pReg. eg: G8p1 is actually {X2,X3}, so the HW encoding should be 2/3, not 1. | |
201 | Shouldn't define Index here, should be <[-1, -1]> -- G8p is reg number in llvm only, there is NO regnum defined visible to debugger. Defining here will mess up mapping table. | |
486 | Add comments about why the allocation order is defined like this? And why we need a AltOrders for ELF? | |
llvm/lib/Target/PowerPC/PPCSchedule.td | ||
46 | nit: these are in alphabetic order , so these two should be moved down to after IIC_LdStLMW | |
llvm/test/CodeGen/PowerPC/ldst-16-byte.mir | ||
3 | How about powerpc32? |
llvm/test/CodeGen/PowerPC/ldst-16-byte.mir | ||
---|---|---|
3 | I guess it's not supported in powerpc32 mode, though I can't find where it is documented(It's documented in AIX assembly manual that ldarx and stdcx are not supported in powerpc32 mode). I've tried following on AIX-7.2 main: lq 2,128(4) stq 2,128(4) lqarx 2,3,4 lqarx 2,3,4,1 stqcx. 2,3,4 # as above code as -a64 -mpwr8 enc.s objdump -D a.out It gives correct dump result. 0000000000000000 <.text>: 0: e0 44 00 80 lq r2,128(r4) 4: f8 44 00 82 stq r2,128(r4) 8: 7c 43 22 28 lqarx r2,r3,r4 c: 7c 43 22 29 lqarx r2,r3,r4,1 10: 7c 43 21 6d stqcx. r2,r3,r4 But for as -a32 -mpwr8 enc.s, the result is not correct. 00000000 <.text>: 0: e0 44 00 80 lfq f2,128(r4) 4: f8 44 00 82 .long 0xf8440082 8: 7c 43 22 28 .long 0x7c432228 c: 7c 43 22 29 .long 0x7c432229 10: 7c 43 21 6d .long 0x7c43216d |
llvm/test/CodeGen/PowerPC/ldst-16-byte.mir | ||
---|---|---|
3 | Thanks. If so, there opcode should be guarded with isPPC64 pred. |
llvm/lib/Target/PowerPC/PPCInstr64Bit.td | ||
---|---|---|
286 | This should be a historical issue(IIUC, to avoid decoding conflict), I'll leave L(B|W|D|Q)ARXL as it is now and fix them in following patches. |
llvm/lib/Target/PowerPC/PPCInstr64Bit.td | ||
---|---|---|
286 | The L variant is a hinted variant and the hint sets the same bit as record-form instructions. So we just reuse that bit. There is no danger of PPCInstrInfo::optimizeCompareInstr() attempting to convert one to the other since this is not part of the RecFormRel relation. So I don't really think there is anything here that needs fixing. |
LGTM with some nits.
llvm/lib/Target/PowerPC/PPCInstr64Bit.td | ||
---|---|---|
286 | Thanks for the background, yes, it is functionality correct here, but I think it is misleading and causing confusion about the semantic of isRecordForm. | |
1258 | memrix16 ? We are restoring using two ld, do we still require DQ offset? | |
llvm/lib/Target/PowerPC/PPCRegisterInfo.td | ||
125 | nit: This doesn't enforce the relationship between SubReg0 and SubReg1 in Pair. How about something like? // GP8Pair - Paired GP8. class GP8Pair<string n, bits<5> EvenIndex>: PPCReg<n>{ assert !eq(EvenIndex{0},0), "Index should be even"; let HWEncoding{4-0} = EvenIndex ; let SubRegs = [!cast<GP8>("X"#EvenIndex), !cast<GP8>("X"#!add(EvenIndex, 1))]; let DwarfNumbers = [-1,-1] ; let SubRegIndices = [sub_gp8_x0, sub_gp8_x1] ; } then we can define pair safely and simply like: // 16 paired even-odd consecutive GP8s. foreach Index = { 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30 } in { def G8p#!srl(Index, 1) : GP8Pair<"r"#Index, Index>; } |
llvm/lib/Target/PowerPC/PPCRegisterInfo.td | ||
---|---|---|
125 | Nice one. |
This is wrong and not needed , should be removed -- there is NO instruction using Regnum/2 as reg index.