Page MenuHomePhabricator

[AArch64] Codegen for FEAT_LSE128
ClosedPublic

Authored by tmatheson on Tue, Jan 10, 9:16 AM.

Details

Summary

Codegen support for 128-bit atomicrmw (and|or|xchg).

  • store atomic -> swpp
  • atomicrmw xchg -> swpp
  • atomicrmw and -> ldclrp
  • atomicrmw or -> ldsetp

Diff Detail

Event Timeline

tmatheson created this revision.Tue, Jan 10, 9:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptTue, Jan 10, 9:16 AM
tmatheson requested review of this revision.Tue, Jan 10, 9:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptTue, Jan 10, 9:16 AM
lenary added inline comments.Tue, Jan 10, 9:44 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
22098

I am also confused about this assert message.

22281

The message here doesn't correspond to the case?

llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-lse128.ll
119

I thought we tested O0 because of globalisel, but here both O0 and O1 are updated, despite not having changed globalisel. Why?

tmatheson added inline comments.Tue, Jan 10, 11:43 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
22098

See ReplaceATOMIC_LOAD_128Results which has more explanation. It will convert ATOMIC_LOAD_AND to 2x XOR + LDCLRP (determined by this function, getAtomicLoad128Opcode). This is in contrast to other sizes, which convert ATOMIC_LOAD_AND to ATOMIC_LOAD_CLR earlier, because i128 is not legal. The assert is checking that no ATOMIC_LOAD_CLR made it through, because for 128 we should see only ATOMIC_LOAD_AND here.

22281

This assert is also checking that no 128-bit ATOMIC_LOAD_CLR appeared, which would indicate that ATOMIC_LOAD_AND was not lowered directly to a MachineInstr.

llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-lse128.ll
119

GlobalISel does not support quite a lot of the atomics, so it is falling back to SelectionDAG. This is also the reason so many of the tests have identical output for O0 and O1. The idea with using optimisation levels rather than forcing -global-isel is that what we care about at the end of the day is the actual output, rather than how complete GlobalISel is.

lenary added inline comments.Wed, Jan 18, 9:44 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
22098

Ok, I see, and yet I worry I'd forget this again looking directly at the assert message, which never mentions _CLR (the link between _AND and _CLR seems implicit to me)

22281

I think it might be clearer if the assert message said "ATOMIC_LOAD_CLR should be legal" maybe?

llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-store-lse128.ll
119

Right, ok. This does seem like something we could improve, but it's fine to happen later.

If there are equivalent instructions in LLVM IR, you could touch the IR translator?

If there are equivalent instructions in LLVM IR, you could touch the IR translator?

Right now, GlobalISel falls back to SelectionDAG for many atomics, including those above (the -O0 case is the globalisel case, but it has fallen back). I think GlobalISel work for these atomics can come in a later patch, to be honest.

tmatheson updated this revision to Diff 491372.Mon, Jan 23, 7:31 AM

Add comments clarifying asserts

tmatheson edited the summary of this revision. (Show Details)Mon, Jan 23, 7:34 AM
lenary accepted this revision.Wed, Jan 25, 3:51 AM
This revision is now accepted and ready to land.Wed, Jan 25, 3:51 AM
This revision was landed with ongoing or failed builds.Wed, Jan 25, 4:02 AM
This revision was automatically updated to reflect the committed changes.

I'm a little confused by the monotonic store sequence; does lse128 not imply lse2?

No, FEAT_LSE128 only implies FEAT_LSE, not FEAT_LSE2.

No, FEAT_LSE128 only implies FEAT_LSE, not FEAT_LSE2.

In that case, can you add a few tests to show what happens on targets that have both lse2 and lse128? I assume lse2 "wins", but it would be nice to make that explicit.