This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Export 16 byte load-store instructions
ClosedPublic

Authored by lkail on May 24 2021, 4:06 AM.

Details

Summary

Export lq, stq, lqarx and stqcx. in preparation for implementing 16-byte lock free atomic operations on AIX.
Add a new register class g8prc for these instructions, since these instructions require even-odd register pair.

Diff Detail

Event Timeline

lkail created this revision.May 24 2021, 4:06 AM
lkail requested review of this revision.May 24 2021, 4:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 24 2021, 4:06 AM
lkail edited the summary of this revision. (Show Details)May 24 2021, 4:20 AM
lkail updated this revision to Diff 347352.May 24 2021, 5:07 AM

Fixed failed pre-merge tests.

Although I am obviously not opposed to adding these instructions and the necessary register classes, we do have to devise a way to ensure RA doesn't allocate the same register to the base and result of LQ. An additional complication for this check is ensuring that if RA == 0, RTp != 0.

llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
118

This is likely problematic. What happens if we need to spill one of these at an offset from the stack pointer that doesn't fit in a DQ field? I imagine something breaks. These have to be handled differently in that we need to bump the base register rather than loading an immediate into a register and using it as an index.
See PPCRegisterInfo::eliminateFrameIndex()

llvm/lib/Target/PowerPC/PPCRegisterInfo.td
23

These names are odd, they seem to suggest that there are pairs that are subregisters, but it is actually the individual registers. Maybe:

sub_gp8_x0
sub_gp8_x1
lkail added a comment.EditedMay 25 2021, 8:23 PM

we do have to devise a way to ensure RA doesn't allocate the same register to the base and result of LQ

One way I can figure out now is to have an additional pseudo instruction after LQ which uses the base register of LQ so that liveranges of base register and result register overlap.

lkail added a comment.EditedMay 25 2021, 9:37 PM

we do have to devise a way to ensure RA doesn't allocate the same register to the base and result of LQ

Another approach for this might be adding an additional out $ea_result operand which can be considered pseudo for LQ. But there might be redundant copy. For example,

%0, %1 = LQ 128, %2(tied-def 1)
use %2
lkail updated this revision to Diff 348197.May 27 2021, 2:58 AM
  1. Solve LQ register constraint RTp != RA by adding early-clobber on RTp
  2. Use doubleword load/store to perform quadword restore/spill

LQ and STQ are still needed for quadword atomic load/store.

lkail updated this revision to Diff 348199.May 27 2021, 3:08 AM
nemanjai added inline comments.May 27 2021, 4:00 AM
llvm/lib/Target/PowerPC/PPCInstr64Bit.td
1261

nit: misaligned - move right by one more space.

llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp
1151

Why use an X-Form for spilling/restoring rather than a D-Form the way all the other spilling code does? The FI eliminator should have what is needed to convert the D-Form to an X-Form (rather than relying on the pre-emit peephole which is what I assume is converting the X-Form to a D-Form in the attached test case).

lkail updated this revision to Diff 348220.May 27 2021, 4:11 AM
  1. Use DForm to keep consistency
  2. Format fix
lkail marked 2 inline comments as done.May 27 2021, 4:13 AM
lkail updated this revision to Diff 348470.May 28 2021, 2:18 AM

Add encoding tests.

lkail updated this revision to Diff 348477.May 28 2021, 2:36 AM
jsji added inline comments.Jun 9 2021, 2:29 PM
llvm/lib/Target/PowerPC/AsmParser/PPCAsmParser.cpp
297

This is wrong and not needed , should be removed -- there is NO instruction using Regnum/2 as reg index.

440

isEvenRegNumber is better, this is not specific to G8p, just a plain even regnum check.

441

This is wrong.
We can have
lq 30, 128(4)

Should be something like:

isRegNumber() && ((getImm() & 1) == 0)

llvm/lib/Target/PowerPC/PPCInstr64Bit.td
286

isRecordForm ? lqarx is not a recordform, stdcx. is a record form.

llvm/lib/Target/PowerPC/PPCRegisterInfo.td
199

This looks wrong. The encoding should be using original regnum, as in getG8pReg.

eg: G8p1 is actually {X2,X3}, so the HW encoding should be 2/3, not 1.

201

Shouldn't define Index here, should be <[-1, -1]> -- G8p is reg number in llvm only, there is NO regnum defined visible to debugger.

Defining here will mess up mapping table.

486

Add comments about why the allocation order is defined like this? And why we need a AltOrders for ELF?

llvm/lib/Target/PowerPC/PPCSchedule.td
46

nit: these are in alphabetic order , so these two should be moved down to after IIC_LdStLMW

llvm/test/CodeGen/PowerPC/ldst-16-byte.mir
3

How about powerpc32?

lkail added inline comments.Jun 9 2021, 6:54 PM
llvm/test/CodeGen/PowerPC/ldst-16-byte.mir
3

I guess it's not supported in powerpc32 mode, though I can't find where it is documented(It's documented in AIX assembly manual that ldarx and stdcx are not supported in powerpc32 mode). I've tried following on AIX-7.2

main:
  lq 2,128(4)
  stq 2,128(4)
  lqarx 2,3,4
  lqarx 2,3,4,1
  stqcx. 2,3,4

# as above code
as -a64 -mpwr8 enc.s
objdump -D a.out

It gives correct dump result.

0000000000000000 <.text>:
   0:   e0 44 00 80     lq      r2,128(r4)
   4:   f8 44 00 82     stq     r2,128(r4)
   8:   7c 43 22 28     lqarx   r2,r3,r4
   c:   7c 43 22 29     lqarx   r2,r3,r4,1
  10:   7c 43 21 6d     stqcx.  r2,r3,r4

But for as -a32 -mpwr8 enc.s, the result is not correct.

00000000 <.text>:
   0:   e0 44 00 80     lfq     f2,128(r4)
   4:   f8 44 00 82     .long 0xf8440082
   8:   7c 43 22 28     .long 0x7c432228
   c:   7c 43 22 29     .long 0x7c432229
  10:   7c 43 21 6d     .long 0x7c43216d
jsji added inline comments.Jun 9 2021, 7:10 PM
llvm/test/CodeGen/PowerPC/ldst-16-byte.mir
3

Thanks. If so, there opcode should be guarded with isPPC64 pred.

lkail added inline comments.Jun 9 2021, 11:44 PM
llvm/lib/Target/PowerPC/PPCInstr64Bit.td
286

This should be a historical issue(IIUC, to avoid decoding conflict), I'll leave L(B|W|D|Q)ARXL as it is now and fix them in following patches.

lkail updated this revision to Diff 351076.Jun 9 2021, 11:46 PM

Address comments.

lkail marked 8 inline comments as done.Jun 9 2021, 11:49 PM
lkail updated this revision to Diff 351078.Jun 9 2021, 11:52 PM
lkail marked an inline comment as done.Jun 9 2021, 11:54 PM
nemanjai added inline comments.Jun 10 2021, 2:46 AM
llvm/lib/Target/PowerPC/PPCInstr64Bit.td
286

The L variant is a hinted variant and the hint sets the same bit as record-form instructions. So we just reuse that bit. There is no danger of PPCInstrInfo::optimizeCompareInstr() attempting to convert one to the other since this is not part of the RecFormRel relation.

So I don't really think there is anything here that needs fixing.

nemanjai accepted this revision.Jun 10 2021, 2:51 AM

LGTM.

This revision is now accepted and ready to land.Jun 10 2021, 2:51 AM
jsji accepted this revision.Jun 10 2021, 8:08 AM

LGTM with some nits.

llvm/lib/Target/PowerPC/PPCInstr64Bit.td
286

Thanks for the background, yes, it is functionality correct here, but I think it is misleading and causing confusion about the semantic of isRecordForm.
We should be able to achieve same goal about the encoding/decoding here without messing up with isRecordForm.
But yeah, this can be done in a follow up patch.

1260

memrix16 ? We are restoring using two ld, do we still require DQ offset?

llvm/lib/Target/PowerPC/PPCRegisterInfo.td
125

nit: This doesn't enforce the relationship between SubReg0 and SubReg1 in Pair.

How about something like?

// GP8Pair - Paired GP8.
class GP8Pair<string n, bits<5> EvenIndex>: PPCReg<n>{
  assert !eq(EvenIndex{0},0), "Index should be even";
  let HWEncoding{4-0} = EvenIndex ;
  let SubRegs = [!cast<GP8>("X"#EvenIndex), !cast<GP8>("X"#!add(EvenIndex, 1))];
  let DwarfNumbers = [-1,-1] ;
  let SubRegIndices = [sub_gp8_x0, sub_gp8_x1] ;
}

then we can define pair safely and simply like:

// 16 paired even-odd consecutive GP8s.
foreach Index = { 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30 } in {
    def G8p#!srl(Index, 1) : GP8Pair<"r"#Index, Index>;
}
lkail added inline comments.Jun 10 2021, 7:45 PM
llvm/lib/Target/PowerPC/PPCRegisterInfo.td
125

Nice one.

lkail updated this revision to Diff 351338.Jun 10 2021, 7:46 PM
This revision was landed with ongoing or failed builds.Jun 14 2021, 6:56 PM
This revision was automatically updated to reflect the committed changes.