This ensures that libcalls aren't generated when the target supports atomics. Atomics aren't in the base RV32I/RV64I instruction sets, so MaxAtomicInlineWidth and MaxAtomicPromoteWidth are set only when the atomics extension is being targeted. This must be done in setMaxAtomicWidth, as this should be done after handleTargetFeatures has been called.
Details
- Reviewers
jfb jyknight wmi asb - Commits
- rG67afb7cbf166: Merging r370073: --------------------------------------------------------------…
rL370181: Merging r370073:
rGf260630e8f4c: [RISCV] Set MaxAtomicInlineWidth and MaxAtomicPromoteWidth for RV32/RV64…
rC370073: [RISCV] Set MaxAtomicInlineWidth and MaxAtomicPromoteWidth for RV32/RV64…
rL370073: [RISCV] Set MaxAtomicInlineWidth and MaxAtomicPromoteWidth for RV32/RV64…
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 37365 Build 37364: arc lint + arc unit
Event Timeline
Looks sensible to me.
I'm just curious why we want to prevent emission of atomic LLVM instructions at this point. Won't LLVM's AtomicExpand perform a similar lowering already? Perhaps the goal is to save that pass some work?
There was some discussion about whether the frontend should lower to libcalls or not here https://bugs.llvm.org/show_bug.cgi?id=31620 and it seems the decided path was that it's better for the frontend to lower, even if AtomicExpandPass does have some support for introducing these libcalls. Additionally, setting the max atomic inline width should mean that __atomic_is_lock_free is evaluated correctly.
Also worth noting that the frontend needs to know the max width otherwise warn_atomic_op_misaligned will be triggered with the message large atomic operation may incur significant performance penalty (name is misleading since the %select{large|misaligned}0 means it's triggered for both large and misaligned atomic ops, and it's also in the atomic-alignment group). This is exactly what I did in my tree (other than the tests) and it seemed to work correctly for RV64A compiling FreeBSD.
lib/Basic/Targets/RISCV.h | ||
---|---|---|
94 ↗ | (On Diff #184293) | MaxAtomicPromoteWidth is an incompatible ABI-change kind of thing. We should set that to the maximum atomic size this target ABI will _EVER_ support with any hardware, since it changes the data layout for atomic types. MaxAtomicInlineWidth can change based on hardware, and be increased in the future if other hardware is introduced, but MaxAtomicPromoteWidth shouldn't. I think it should be set it to 64 for most 32-bit platforms, and 128 for most 64-bit platforms. |
@asb: Gentle ping on this.
I would really like to avoid atomics libcalls in order to make proper llvm benchmarks.
Given this is an ABI-compatibility issue, I've been looking at how GCC and Clang differ in how they deal with issues around size and alignment of atomic objects.
All types of size less than or equal to MaxAtomicPromoteWidth, when they are turned into their atomic variant, will: a) have their size rounded up to a power of two, and b) have their alignment changed to that same power of two.
The following Gist contains:
- a C program which will print the size and alignment of a variety of strangely sized and aligned objects, and their atomic variants.
- the output of this program when run through gcc and clang, specifying rv(32,64)ima(,fd) and (ilp32,lp64)(,f) (matching ABIs to architectures). I used riscv64-unknown-linux-gnu-gcc for the avoidance of doubt (this compiler allows you to compile for rv32 by specifying a 32-bit -march value).
- the diffs in those outputs for an single ABI, between the two different compilers.
The gist is here: https://gist.github.com/lenary/2e977a8af33876ba8e0605e98c4e1b0d
There are some differences between GCC and Clang, that seem not to be risc-v specific (I saw them when running the C program on my mac)
- Clang ensures zero-sized objects become char-sized when they become atomic. GCC leaves them zero-sized. This is documented in ASTContext.cpp line 2130, to ensure the atomic operation is generated.
- GCC seems not to round up the size and alignment of non-power-of-two-sized structures.
I think this patch needs to be updated to ensure there are no differences in objects of power-of-two size. To do this, both riscv64 and riscv32, should have a MaxAtomicPromoteWidth of 128.
@jyknight does this reasoning make sense for ABI compatibility?
Upon further thought, I realise that MaxAtomicPromoteWidth should be set to 128 regardless of whether a target HasA. I will be updating the patch with the new width and conditions early next week.
Cross linking to the relevant psABI pull request (still pending): https://github.com/riscv/riscv-elf-psabi-doc/pull/112
That GCC and Clang differ in handling of Atomics is a really unfortunate, longstanding issue.
See https://bugs.llvm.org/show_bug.cgi?id=26462
For RISCV, perhaps it's not yet too late to have the RISCV psABI actually specify a single ABI for C11 _Atomic which all compilers can implement, rather than having compilers do whatever they want to, separately...
@jyknight I hear where you're coming from. I'll see what I can do about the psABI document.
In that ticket, it's mentioned that the Darwin ABI explicitly says that non-power-of-two atomic types should be padded and realigned, but I cannot find any documentation explaining this. That would be useful, given presumably GCC does have to pad/align on Darwin.
Then the only outstanding question relates to zero-sized atomics, which GCC does not pad, but I think Clang has to pad to get the semantics correct, based on this comment: https://github.com/llvm/llvm-project/blob/master/clang/lib/AST/ASTContext.cpp#L2176
AFAIK, there is no such documentation, at least publicly. Possibly Apple has some internally, but I suspect it more likely just some in-person conversation or something.
GCC is not really supported on Darwin, so I suspect it just gets it wrong.
Then the only outstanding question relates to zero-sized atomics, which GCC does not pad, but I think Clang has to pad to get the semantics correct, based on this comment: https://github.com/llvm/llvm-project/blob/master/clang/lib/AST/ASTContext.cpp#L2176
The semantics in GCC are that you can create such an object, but any attempt to load or store it will result in a compile-time error. E.g., "error: argument 1 of ‘__atomic_load’ must be a pointer to a nonzero size object". So I don't think there's really an issue there.
To keep everyone updated, this has bubbled over into a thread on the GCC Mailing list: https://gcc.gnu.org/ml/gcc/2019-08/msg00191.html - Potentially this is a route to a resolution on non-RISC-V platforms (specifically x86).
Then the only outstanding question relates to zero-sized atomics, which GCC does not pad, but I think Clang has to pad to get the semantics correct, based on this comment: https://github.com/llvm/llvm-project/blob/master/clang/lib/AST/ASTContext.cpp#L2176
The semantics in GCC are that you can create such an object, but any attempt to load or store it will result in a compile-time error. E.g., "error: argument 1 of ‘__atomic_load’ must be a pointer to a nonzero size object". So I don't think there's really an issue there.
Neat, ok.
I think the feeling with this ticket and the RISC-V backend is:
- We match GCC for power-of-two-sized atomics
- We don't match for non-power-of-two sized atomics (as on any other platform)
This feels like that should be enough for this patch to be merged, and in a future patch we can address the fall-out of ABIs changing across GCC and clang on more platforms than just RISC-V?
This LGTM, but given how much discussion there has been about MaxPromoteWidth it would be great to get some test coverage for it.