Page MenuHomePhabricator

[PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand
ClosedPublic

Authored by lkail on Jun 3 2021, 6:37 AM.

Details

Summary

This patch uses AtomicExpandPass to implement quadword lock free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress.

Diff Detail

Unit TestsFailed

TimeTest
2,470 msx64 debian > libarcher.barrier::barrier.c
Script: -- : 'RUN: at line 14'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/barrier/barrier.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/barrier/Output/barrier.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/barrier/Output/barrier.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/barrier/Output/barrier.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/barrier/barrier.c
2,680 msx64 debian > libarcher.critical::critical.c
Script: -- : 'RUN: at line 15'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/critical.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/critical.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/critical.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/critical.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/critical.c
2,480 msx64 debian > libarcher.critical::lock.c
Script: -- : 'RUN: at line 15'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/lock.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/critical/Output/lock.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/critical/lock.c
2,550 msx64 debian > libarcher.parallel::parallel-firstprivate.c
Script: -- : 'RUN: at line 15'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/parallel/parallel-firstprivate.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-firstprivate.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-firstprivate.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-firstprivate.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/parallel/parallel-firstprivate.c
2,640 msx64 debian > libarcher.parallel::parallel-simple2.c
Script: -- : 'RUN: at line 15'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests -I /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/runtime/src -L /var/lib/buildkite-agent/builds/llvm-project/build/lib -Wl,-rpath,/var/lib/buildkite-agent/builds/llvm-project/build/lib /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/parallel/parallel-simple2.c -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-simple2.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-simple2.c.tmp 2>&1 | tee /var/lib/buildkite-agent/builds/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-simple2.c.tmp.log | /var/lib/buildkite-agent/builds/llvm-project/build/./bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/openmp/tools/archer/tests/parallel/parallel-simple2.c
View Full Test Results (21 Failed)

Event Timeline

lkail created this revision.Jun 3 2021, 6:37 AM
lkail requested review of this revision.Jun 3 2021, 6:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2021, 6:37 AM
jsji added a comment.Jun 3 2021, 6:44 AM

Looks like this is better than D103445, I would recommend go with this unless there is other comments. Thanks!

I wouldn't recommend using AtomicExpansionKind::LLSC for new code. It's been a source of problems on other targets that use/used it: most targets have a forward progress rule that imposes restrictions beyond the ll/sc instructions themselves, and normal code generation can violate them. For example, fast regalloc can insert spills inside the ll/sc loop, or the basic block layout could be rearranged. I think the only target that hasn't run into issues is Hexagon.

lkail planned changes to this revision.Jun 3 2021, 6:17 PM

I wouldn't recommend using AtomicExpansionKind::LLSC for new code. It's been a source of problems on other targets that use/used it: most targets have a forward progress rule that imposes restrictions beyond the ll/sc instructions themselves, and normal code generation can violate them. For example, fast regalloc can insert spills inside the ll/sc loop, or the basic block layout could be rearranged. I think the only target that hasn't run into issues is Hexagon.

Thanks for these information!

lkail updated this revision to Diff 353585.Jun 22 2021, 2:53 AM
lkail added reviewers: asb, jyknight.

Follow @efriedma and @jsji 's suggestions, I adopt the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress.

Update:

  • Add intrinsics in InstrinsicsPowerPC.td to lower LLVM IR via AtomicExpandPass's MaskedIntrinsic kind.
  • Update PPCInstr64bit.td to pattern match intrinsics and generate pseudo instructions
  • Add post RA pass ppc-atomic-expand to expand pseudo instructions

Fixed:

  • Uninitialized PMV.ShiftAmt in AtomicExpandPass

More atomic operations are being added.

lkail edited the summary of this revision. (Show Details)Jun 22 2021, 3:01 AM
lkail updated this revision to Diff 353608.Jun 22 2021, 4:40 AM
lkail updated this revision to Diff 353610.Jun 22 2021, 4:42 AM
lkail updated this revision to Diff 353623.Jun 22 2021, 6:06 AM
lkail updated this revision to Diff 353624.Jun 22 2021, 6:15 AM

Simplified td.

lkail updated this revision to Diff 353625.Jun 22 2021, 6:18 AM
lkail updated this revision to Diff 353626.
efriedma added inline comments.Jun 22 2021, 2:56 PM
llvm/include/llvm/IR/IntrinsicsPowerPC.td
1620

Adding IR intrinsics for this is a little weird. Is there any reason you can't just use SelectionDAG custom legalization for these operations? I mean, not that it doesn't work this way, but it seems more complicated overall.

llvm/lib/Target/PowerPC/PPCInstr64Bit.td
318

Do you need to specify the size of these instructions somewhere, if you're expanding them after branch relaxation?

lkail added inline comments.Jun 22 2021, 9:00 PM
llvm/include/llvm/IR/IntrinsicsPowerPC.td
1620

Exploit AtomicExpandPass looks easier from my side :). And what's more current PPC backend also has spilling issues with -O0(fastregalloc enabled), see https://bugs.llvm.org/show_bug.cgi?id=50780. Maybe we should unify PPC's instruction selection of atomic operations one day (PPC also has partword atomics which is feasible to use MaskedIntrinsic) which I need more feedback from @nemanjai and @jsji .

llvm/lib/Target/PowerPC/PPCInstr64Bit.td
318

Good point. There is PPCBranchSelector pass serves as branch relaxation and PowerPC hasn't implemented LLVM MC's branch relaxation. I should have put the expansion before PPCBranchSelector.

lkail updated this revision to Diff 353845.Jun 22 2021, 9:03 PM

Put ppc-atomic-expand before ppc-branch-select.

lkail retitled this revision from [PowerPC][AIX][RFC] Generate inlined quadword lock free atomic operations via AtomicExpand to [PowerPC][AIX] Generate inlined quadword lock free atomic operations via AtomicExpand.Jun 22 2021, 11:46 PM
jsji added inline comments.Jun 23 2021, 6:21 AM
llvm/include/llvm/IR/IntrinsicsPowerPC.td
1620

Yes, I think we should refactor partword atomics to use AtomicExpandPass as well, although not necessary do it now.

lkail updated this revision to Diff 356699.EditedJul 6 2021, 6:32 AM
lkail retitled this revision from [PowerPC][AIX] Generate inlined quadword lock free atomic operations via AtomicExpand to [PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand.

I had an internal discussion with @hubert.reinterpretcast and @jsji after reviewing https://reviews.llvm.org/D103501, considering AIX current doesn't support __int128_t which is required in compiler-rt's atomic implementation, we don't enable quadword lock-free atomics by default on AIX.

Updated:

  • Add -ppc-quadword-atomics switch
  • Add cmpxchg support
lkail updated this revision to Diff 356707.Jul 6 2021, 6:41 AM
jsji accepted this revision as: jsji.Jul 7 2021, 8:32 AM

LGTM. Please hold a few days to see whether there are other comments. Thanks.

This revision is now accepted and ready to land.Jul 7 2021, 8:32 AM
lkail updated this revision to Diff 358165.Jul 12 2021, 11:09 PM

rebased.

lkail updated this revision to Diff 358794.Jul 14 2021, 5:49 PM

Rebased

This revision was landed with ongoing or failed builds.Jul 14 2021, 6:12 PM
This revision was automatically updated to reflect the committed changes.