This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Add target feature to force-enable atomics
ClosedPublic

Authored by nikic on Jul 27 2022, 2:59 AM.

Details

Summary

This adds a +forced-atomics target feature with the same semantics as on ARM (D130480). For RISCV targets without the +a extension, this forces LLVM to assume that lock-free atomics (up to 32/64 bits for riscv32/64 respectively) are available.

This means that atomic load/store are lowered to a simple load/store (and fence as necessary), as these are guaranteed to be atomic (as long as they're aligned). Atomic RMW/CAS are lowered to __sync (rather than __atomic) libcalls. Responsibility for providing the __sync libcalls lies with the user (for privileged single-core code they can be implemented by disabling interrupts). Code using +forced-atomics and -forced-atomics are not ABI compatible if atomic variables cross the ABI boundary.

For context, the difference between __sync and __atomic is that the former are required to be lock-free, while the latter requires a shared global lock provided by a shared object library. See https://llvm.org/docs/Atomics.html#libcalls-atomic for a detailed discussion on the topic.

This target feature will be used by Rust's riscv32i target family to support the use of atomic load/store without atomic RMW/CAS.

Diff Detail

Event Timeline

nikic created this revision.Jul 27 2022, 2:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2022, 2:59 AM
nikic requested review of this revision.Jul 27 2022, 2:59 AM

Why atomics-32 and not something that generalises to include 64-bit atomics on RV64?

llvm/lib/Target/RISCV/RISCVInstrInfoA.td
111

Don’t duplicate patterns

nikic added inline comments.Jul 27 2022, 3:49 AM
llvm/lib/Target/RISCV/RISCVInstrInfoA.td
111

What's the right way to avoid the duplication here? Is there a way to use "or" inside Predicates?

llvm/lib/Target/RISCV/RISCVInstrInfoA.td
111

You can define an or predicate in RISCV.td like HasStdExtMOrZmmul, HasStdExtZfhOrZfhmin, etc.

nikic updated this revision to Diff 448017.Jul 27 2022, 5:55 AM
nikic retitled this revision from [RISCV] Add target feature to force 32-bit atomics to [RISCV] Add target feature to force-enable atomics.
nikic edited the summary of this revision. (Show Details)

Convert +atomics-32 into +forced-atomics, which also works correctly on riscv64. Use a single predicate for +forced-atomics and +a when defining tablegen patterns.

nikic added a comment.Jul 27 2022, 5:58 AM

Why atomics-32 and not something that generalises to include 64-bit atomics on RV64?

Good point. I've changed this to a +force-atomics feature that works for both riscv32 and riscv64. (If this patch lands, I'll also rename the ARM feature to match.)

llvm/lib/Target/RISCV/RISCVInstrInfoA.td
111

Thanks, I did that now!

reames added a subscriber: reames.Jul 27 2022, 7:14 AM

Can you spell out in the review comment the difference between sync and atomic and why there's a strong preference for one over the other? Links to details preferred, but enough context to send a reader in the right direction is appreciated.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
414

What MinCmpXchgSizeInBits do you want for the forced atomics?

Once you answer that, this code can probably be rearranged to have a common if body for both A and forced atomics.

Also, use XLenVT

933

I believe that at least a subset of these goes away if you set setMinCmpXchgSizeInBits properly. Haven't definitely confirmed, but at glance at the code makes that look likely.

llvm/test/CodeGen/RISCV/forced-atomics32.ll
5 ↗(On Diff #448017)

You need coverage here for non-seq_cst orderings. You don't need to repeat it for all sizes, but please make sure to at least add tests for all the orderings at the native width,

nikic edited the summary of this revision. (Show Details)Jul 27 2022, 7:45 AM
nikic updated this revision to Diff 448048.Jul 27 2022, 8:18 AM

Use XLen, test atomic orderings.

nikic added inline comments.Jul 27 2022, 8:22 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
414

I think we don't want a minimum for this case. It would save the need to support two __sync_val_compare_and_swap libcalls at the expense of generating more complex code for i8/i16 cmpxchg. Here's what the diff would look like: https://gist.github.com/nikic/9025e91bde139fbb75225f4fbd0539e9

asb added a comment.Aug 1 2022, 3:37 AM

Broadly speaking, this looks good to me (thanks!). I think in some previous discussions on this topic I hadn't appreciated the lock-free nature of the sync calls vs atomic.

My one request would be to flesh out the test coverage so the files cover all of the different atomicrmw operations - given they all had to be explicitly listed in the setOperationAction call to avoid their expansion, it would be worth verifying this (to provide some resilience against future refactorings).

asb added a comment.Aug 1 2022, 3:43 AM

It might also be worth patching the __sync section in the atomics docs to mention that some targets support this forced-atomics feature and provide a bit more insight on the use case (" for various reasons, it is not practical to emit the instructions inline." is a little vague).

That can be a separate patch, but I've found the LLVM atomics guide to be a really helpful reference so it would be great to keep it as up to date and informative as possible.

nikic updated this revision to Diff 449270.Aug 2 2022, 5:48 AM

Add tests for all rmw kinds, add note to atomics docs.

nikic added a comment.Aug 2 2022, 5:49 AM

Broadly speaking, this looks good to me (thanks!). I think in some previous discussions on this topic I hadn't appreciated the lock-free nature of the sync calls vs atomic.

My one request would be to flesh out the test coverage so the files cover all of the different atomicrmw operations - given they all had to be explicitly listed in the setOperationAction call to avoid their expansion, it would be worth verifying this (to provide some resilience against future refactorings).

Good call -- my implementation actually failed to handle the atomicrmw float operations -- those always need IR expansion.

asb added a comment.Aug 4 2022, 2:58 AM

Just one question about the tests - typically we combine RV32 and RV64 tests in one file (this makes it easier if e.g. another operation was added or a new corner case found, as it's less likely we'd add a new case for RV32 but not RV64 or vice versa). Is there a reason you can't merge these into forced-atomics.ll?

Otherwise, looks good to me - thanks!

nikic updated this revision to Diff 451074.Aug 9 2022, 2:37 AM
nikic added a reviewer: asb.

Combine tests for rv32 and rv64.

asb accepted this revision.Aug 9 2022, 4:40 AM

LGTM, thanks!

This revision is now accepted and ready to land.Aug 9 2022, 4:40 AM
This revision was automatically updated to reflect the committed changes.