Page MenuHomePhabricator

[PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand
ClosedPublic

Authored by lkail on Jun 3 2021, 6:37 AM.

Details

Summary

This patch uses AtomicExpandPass to implement quadword lock free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress.

Diff Detail

Event Timeline

lkail created this revision.Jun 3 2021, 6:37 AM
lkail requested review of this revision.Jun 3 2021, 6:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2021, 6:37 AM
jsji added a comment.Jun 3 2021, 6:44 AM

Looks like this is better than D103445, I would recommend go with this unless there is other comments. Thanks!

I wouldn't recommend using AtomicExpansionKind::LLSC for new code. It's been a source of problems on other targets that use/used it: most targets have a forward progress rule that imposes restrictions beyond the ll/sc instructions themselves, and normal code generation can violate them. For example, fast regalloc can insert spills inside the ll/sc loop, or the basic block layout could be rearranged. I think the only target that hasn't run into issues is Hexagon.

lkail planned changes to this revision.Jun 3 2021, 6:17 PM

I wouldn't recommend using AtomicExpansionKind::LLSC for new code. It's been a source of problems on other targets that use/used it: most targets have a forward progress rule that imposes restrictions beyond the ll/sc instructions themselves, and normal code generation can violate them. For example, fast regalloc can insert spills inside the ll/sc loop, or the basic block layout could be rearranged. I think the only target that hasn't run into issues is Hexagon.

Thanks for these information!

lkail updated this revision to Diff 353585.Jun 22 2021, 2:53 AM
lkail added reviewers: asb, jyknight.

Follow @efriedma and @jsji 's suggestions, I adopt the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress.

Update:

  • Add intrinsics in InstrinsicsPowerPC.td to lower LLVM IR via AtomicExpandPass's MaskedIntrinsic kind.
  • Update PPCInstr64bit.td to pattern match intrinsics and generate pseudo instructions
  • Add post RA pass ppc-atomic-expand to expand pseudo instructions

Fixed:

  • Uninitialized PMV.ShiftAmt in AtomicExpandPass

More atomic operations are being added.

lkail edited the summary of this revision. (Show Details)Jun 22 2021, 3:01 AM
lkail updated this revision to Diff 353608.Jun 22 2021, 4:40 AM
lkail updated this revision to Diff 353610.Jun 22 2021, 4:42 AM
lkail updated this revision to Diff 353623.Jun 22 2021, 6:06 AM
lkail updated this revision to Diff 353624.Jun 22 2021, 6:15 AM

Simplified td.

lkail updated this revision to Diff 353625.Jun 22 2021, 6:18 AM
lkail updated this revision to Diff 353626.
efriedma added inline comments.Jun 22 2021, 2:56 PM
llvm/include/llvm/IR/IntrinsicsPowerPC.td
1620

Adding IR intrinsics for this is a little weird. Is there any reason you can't just use SelectionDAG custom legalization for these operations? I mean, not that it doesn't work this way, but it seems more complicated overall.

llvm/lib/Target/PowerPC/PPCInstr64Bit.td
318

Do you need to specify the size of these instructions somewhere, if you're expanding them after branch relaxation?

lkail added inline comments.Jun 22 2021, 9:00 PM
llvm/include/llvm/IR/IntrinsicsPowerPC.td
1620

Exploit AtomicExpandPass looks easier from my side :). And what's more current PPC backend also has spilling issues with -O0(fastregalloc enabled), see https://bugs.llvm.org/show_bug.cgi?id=50780. Maybe we should unify PPC's instruction selection of atomic operations one day (PPC also has partword atomics which is feasible to use MaskedIntrinsic) which I need more feedback from @nemanjai and @jsji .

llvm/lib/Target/PowerPC/PPCInstr64Bit.td
318

Good point. There is PPCBranchSelector pass serves as branch relaxation and PowerPC hasn't implemented LLVM MC's branch relaxation. I should have put the expansion before PPCBranchSelector.

lkail updated this revision to Diff 353845.Jun 22 2021, 9:03 PM

Put ppc-atomic-expand before ppc-branch-select.

lkail retitled this revision from [PowerPC][AIX][RFC] Generate inlined quadword lock free atomic operations via AtomicExpand to [PowerPC][AIX] Generate inlined quadword lock free atomic operations via AtomicExpand.Jun 22 2021, 11:46 PM
jsji added inline comments.Jun 23 2021, 6:21 AM
llvm/include/llvm/IR/IntrinsicsPowerPC.td
1620

Yes, I think we should refactor partword atomics to use AtomicExpandPass as well, although not necessary do it now.

lkail updated this revision to Diff 356699.EditedTue, Jul 6, 6:32 AM
lkail retitled this revision from [PowerPC][AIX] Generate inlined quadword lock free atomic operations via AtomicExpand to [PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand.

I had an internal discussion with @hubert.reinterpretcast and @jsji after reviewing https://reviews.llvm.org/D103501, considering AIX current doesn't support __int128_t which is required in compiler-rt's atomic implementation, we don't enable quadword lock-free atomics by default on AIX.

Updated:

  • Add -ppc-quadword-atomics switch
  • Add cmpxchg support
lkail updated this revision to Diff 356707.Tue, Jul 6, 6:41 AM
jsji accepted this revision as: jsji.Wed, Jul 7, 8:32 AM

LGTM. Please hold a few days to see whether there are other comments. Thanks.

This revision is now accepted and ready to land.Wed, Jul 7, 8:32 AM
lkail updated this revision to Diff 358165.Mon, Jul 12, 11:09 PM

rebased.

lkail updated this revision to Diff 358794.Wed, Jul 14, 5:49 PM

Rebased

This revision was landed with ongoing or failed builds.Wed, Jul 14, 6:12 PM
This revision was automatically updated to reflect the committed changes.