This patch uses AtomicExpandPass to implement quadword lock free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress.
Details
- Reviewers
nemanjai jsji xingxue efriedma jfb asb jyknight hubert.reinterpretcast - Group Reviewers
Restricted Project - Commits
- rGb9c3941cd61d: [PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Looks like this is better than D103445, I would recommend go with this unless there is other comments. Thanks!
I wouldn't recommend using AtomicExpansionKind::LLSC for new code. It's been a source of problems on other targets that use/used it: most targets have a forward progress rule that imposes restrictions beyond the ll/sc instructions themselves, and normal code generation can violate them. For example, fast regalloc can insert spills inside the ll/sc loop, or the basic block layout could be rearranged. I think the only target that hasn't run into issues is Hexagon.
Follow @efriedma and @jsji 's suggestions, I adopt the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress.
Update:
- Add intrinsics in InstrinsicsPowerPC.td to lower LLVM IR via AtomicExpandPass's MaskedIntrinsic kind.
- Update PPCInstr64bit.td to pattern match intrinsics and generate pseudo instructions
- Add post RA pass ppc-atomic-expand to expand pseudo instructions
Fixed:
- Uninitialized PMV.ShiftAmt in AtomicExpandPass
More atomic operations are being added.
llvm/include/llvm/IR/IntrinsicsPowerPC.td | ||
---|---|---|
1620 | Adding IR intrinsics for this is a little weird. Is there any reason you can't just use SelectionDAG custom legalization for these operations? I mean, not that it doesn't work this way, but it seems more complicated overall. | |
llvm/lib/Target/PowerPC/PPCInstr64Bit.td | ||
318 | Do you need to specify the size of these instructions somewhere, if you're expanding them after branch relaxation? |
llvm/include/llvm/IR/IntrinsicsPowerPC.td | ||
---|---|---|
1620 | Exploit AtomicExpandPass looks easier from my side :). And what's more current PPC backend also has spilling issues with -O0(fastregalloc enabled), see https://bugs.llvm.org/show_bug.cgi?id=50780. Maybe we should unify PPC's instruction selection of atomic operations one day (PPC also has partword atomics which is feasible to use MaskedIntrinsic) which I need more feedback from @nemanjai and @jsji . | |
llvm/lib/Target/PowerPC/PPCInstr64Bit.td | ||
318 | Good point. There is PPCBranchSelector pass serves as branch relaxation and PowerPC hasn't implemented LLVM MC's branch relaxation. I should have put the expansion before PPCBranchSelector. |
llvm/include/llvm/IR/IntrinsicsPowerPC.td | ||
---|---|---|
1620 | Yes, I think we should refactor partword atomics to use AtomicExpandPass as well, although not necessary do it now. |
I had an internal discussion with @hubert.reinterpretcast and @jsji after reviewing https://reviews.llvm.org/D103501, considering AIX current doesn't support __int128_t which is required in compiler-rt's atomic implementation, we don't enable quadword lock-free atomics by default on AIX.
Updated:
- Add -ppc-quadword-atomics switch
- Add cmpxchg support
Adding IR intrinsics for this is a little weird. Is there any reason you can't just use SelectionDAG custom legalization for these operations? I mean, not that it doesn't work this way, but it seems more complicated overall.