This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Support 16-byte lock free atomics on pwr8 and up
ClosedPublic

Authored by lkail on Mar 24 2022, 3:13 AM.

Details

Summary

Make 16-byte atomic type aligned to 16-byte on PPC64, thus consistent with GCC. Also enable inlining 16-byte atomics on non-AIX targets on PPC64.

Diff Detail

Event Timeline

lkail created this revision.Mar 24 2022, 3:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 24 2022, 3:13 AM
lkail requested review of this revision.Mar 24 2022, 3:13 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMar 24 2022, 3:13 AM
lkail edited the summary of this revision. (Show Details)Mar 24 2022, 3:15 AM
lkail updated this revision to Diff 417868.Mar 24 2022, 3:22 AM
jsji added inline comments.Mar 24 2022, 8:45 AM
clang/test/CodeGen/PowerPC/atomic-alignment.c
7

How about AIX pwr8?

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
126 ↗(On Diff #417868)

So we enable it by default , *EVEN* on AIX?

MaskRay added inline comments.
clang/lib/Basic/Targets/PPC.h
449

Many isOSLinux() usage may be really isOSBinFormatELF() (to help FreeBSD, etc). Can you check whether that's the case?

lkail updated this revision to Diff 418142.EditedMar 25 2022, 12:02 AM
lkail added reviewers: dim, pkubaj.
lkail set the repository for this revision to rG LLVM Github Monorepo.
lkail added subscribers: pkubaj, dim.

@dim @pkubaj We are modifying layout of 16-byte atomic type on Linux to be consistent with GCC, I don't know if it also applies to freebsd as @MaskRay pointed out.
To be more specific, on Linux, for arch level supporting 16-byte lock free atomics, 16-byte atomic type should be properly aligned.

#include <stdatomic.h>

int printf(const char *, ...);

typedef struct A {
  char x[16];
} A;

typedef struct B {
  char q;
  _Atomic(A) a;
} B;

int main(void) {
  _Atomic(A) *p = 0;
  printf("aligned: %d\n", __builtin_offsetof(B, a) % 16 == 0);
#if __clang__
  printf("lock free (size built-in): %d\n", __c11_atomic_is_lock_free(sizeof(*p)));
#endif
  printf("lock free (type query using pointer): %d\n", atomic_is_lock_free(p));
}

Current clang gives (-mcpu=power8)

aligned: 0
lock free (size built-in): 1
lock free (type query using pointer): 1

GCC gives

aligned: 1
lock free (type query using pointer): 1

This patch also modifies the query of __atomic_always_lock_free(16, 0) to return true for supported arch level.

lkail marked an inline comment as done.Mar 25 2022, 12:03 AM
lkail marked an inline comment as done.

Removed alter of -ppc-quadword-atomcis in backend to decouple from frontend.

efriedma added inline comments.
clang/lib/Basic/Targets/PPC.h
451

MaxAtomicPromoteWidth should not depend on whether quadword-atomics is present, only the target OS. It determines the layout of _Atomic(__int128_t).

(MaxAtomicInlineWidth is allowed to adjust as necessary.)

clang/lib/Driver/ToolChains/Clang.cpp
2140

Please avoid using "-mllvm" options like this; among other issues, it doesn't work with LTO. If we need to indicate this, please use a function attribute in the IR.

(Yes, there are other cases where we translate flags to -mllvm options, but it's not something you should copy.)

lkail updated this revision to Diff 418537.EditedMar 28 2022, 4:08 AM

Thanks @efriedma for pointing out. Updated guard code. Removed -mllvm while adjusting backend to reflect ABI change.

lkail marked an inline comment as done.Mar 28 2022, 4:14 AM
efriedma added inline comments.Mar 28 2022, 10:13 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1329 ↗(On Diff #418537)

You probably want an "else" here to call setMaxAtomicSizeInBitsSupported(64) or something like that. But you can leave that to a followup patch if you want.

clang/test/CodeGen/PowerPC/atomic-alignment.c
44

Although it is a change that affects layout compatibility with the short history of 16-byte atomic types compiled with LLVM-based compilers on AIX, having this 16-byte atomic type aligned also on AIX for PPC64 would be reasonable. The consequences of not doing so is that operations on some types would be lock-free only on ELF-based platforms.

clang/test/CodeGen/PowerPC/atomic-alignment.c
33

Just noting that GCC increases the alignment even for ppc32:

typedef struct A8 { char x[8]; } A8;
typedef struct A16 { char x[16]; } A16;
extern char q8[_Alignof(_Atomic(A8))], q8[8]; // okay for GCC targeting ppc32
extern char q16[_Alignof(_Atomic(A16))], q16[16]; // okay for GCC targeting ppc32

Apparently, the change for i686 in GCC occurred with version 11.
https://godbolt.org/z/fTTGoqWW1

clang/lib/Basic/Targets/PPC.h
448

I believe this should be presented more as this override being implemented currently only for ELF targets. The current presentation seems to imply more design intent for non-ELF targets than there is consensus for.

For example:

if (!getTriple().isOSBinFormatELF())
  return PPCTargetInfo::setMaxAtomicWidth();
clang/test/Sema/atomic-ops.c
13

This will need a separate patch to cover ppc32 (likely with AIX).

lkail updated this revision to Diff 419618.Mar 31 2022, 9:28 PM
lkail retitled this revision from [PowerPC][Linux] Support 16-byte lock free atomics on pwr8 and up to [PowerPC] Support 16-byte lock free atomics on pwr8 and up.
lkail edited the summary of this revision. (Show Details)

Make 16-byte atomic type aligned to 16-byte on PPC64.

lkail marked an inline comment as done.Mar 31 2022, 9:32 PM
lkail added inline comments.
clang/lib/Basic/Targets/PPC.h
448

It looks a chance to adjust according to arch level to me(Considering we finally will make xcoff and elf targets consistent here). If you have strong opinion on this, I'll make a change here.

clang/test/CodeGen/PowerPC/atomic-alignment.c
33

I'll post another patch to address ppc32 issue.

clang/lib/Basic/Targets/PPC.h
448

This function currently modifies two different things. One that should be arch-level-sensitive (and, iiuc, is already so) and one that should not be arch-level-sensitive.

So, the isOSBinFormatELF checks are here mainly to scope the patch (temporarily).

I only suggested to use the call to the base class as a shorthand for "nothing special that this function is doing". The associated comment should read something like "Restrict 16-byte atomic changes to ELF targets for now. TODO: Consider other targets as well."

At the very least (for this patch), I do not believe the isOSBinFormatELF should be checked twice in terms of control flow.

clang/lib/Basic/Targets/PPC.h
448

The new version of the change already addresses this; thanks.

clang/lib/Basic/Targets/PPC.h
444

Clarify that the statement is not for PPC64 in general.

446

Clarify that the support up to 8 bytes is for "baseline" PPC64 (i.e., non-baseline implementations may support lock-free inline 16-byte atomic operations).

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
18033 ↗(On Diff #419618)

Can we have a comment here to explain the isOSBinFormatELF check?

I am not sure that the choice of isOSBinFormatELF to (afaik) primarily scope this change from affecting AIX (where we know the library calls are not implemented to be lock-free yet) is better than alternative where the condition is for little-endian mode or specifically for not AIX.

lkail added a comment.Apr 5 2022, 7:00 PM

I am not sure that the choice of isOSBinFormatELF to (afaik) primarily scope this change from affecting AIX (where we know the library calls are not implemented to be lock-free yet) is better than alternative where the condition is for little-endian mode or specifically for not AIX.

I agree with we should exclude AIX explicitly(use !isOSAIX() rather than isOSBinFormatELF()) for current situation where AIX's support is on progress as we have known. What is unknown at present is the impact on freebsd.

lkail updated this revision to Diff 420687.Apr 5 2022, 7:36 PM

Address comments.

lkail edited the summary of this revision. (Show Details)Apr 5 2022, 7:36 PM
adalava added a subscriber: adalava.Apr 6 2022, 1:43 PM
adalava added inline comments.
clang/lib/Basic/Targets/PPC.h
440

I don't understand what DataLayout does mean, but I'm wondering why FreeBSD doesn't append this as well. Is it expected to be AIX() and Linux only?

clang/test/CodeGen/PowerPC/atomic-alignment.c
13

I was recently informed that, instead of using expected-no-diagnostics and -verify, using -Werror achieves the same with less processing cost. Sorry for the churn.

Noting also that expected-no-diagnostics only needs to be added once the the file.

clang/test/CodeGen/PowerPC/quadword-atomics.c
6

Can we add an AIX run line for pwr7? It should match the PPC64 check lines.

43–45

The pointer cast is avoidable; let's avoid it.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
18031 ↗(On Diff #420687)

Typo fix; grammar fix.

llvm/test/CodeGen/PowerPC/atomics-i128.ll
17 ↗(On Diff #420687)

I think this case deserves an extra comment to explain the -ppc-quadword-atomics.

77 ↗(On Diff #420687)

What library is this expected to provide this symbol?

lkail added inline comments.Apr 6 2022, 9:03 PM
clang/lib/Basic/Targets/PPC.h
440

As maskray has pointed out, usually using isOSBinFormatELF rather than isOSLinux can cover freebsd as well. This snippet of code might be beyond this patch, we might post another one to address this issue.

lkail added inline comments.Apr 6 2022, 9:12 PM
llvm/test/CodeGen/PowerPC/atomics-i128.ll
77 ↗(On Diff #420687)

As far as I know, should be none on AIX. This issue is fixed in https://reviews.llvm.org/D122868.

lkail updated this revision to Diff 421088.Apr 6 2022, 9:49 PM

Address comments.

LGTM with minor comments.

clang/test/CodeGen/PowerPC/atomic-alignment.c
1–11

Use -Werror in place of -verify with no diagnostics.

clang/test/CodeGen/PowerPC/quadword-atomics.c
2–7

Same comment as for the other file.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
18057 ↗(On Diff #421088)

Minor nit: Use semicolon.

llvm/test/CodeGen/PowerPC/atomics-i128.ll
160 ↗(On Diff #421088)

I have verified that the registers in the pairs are used correctly for this case. I've skimmed the other cases for the instructions applied to the loaded value (or, for non-inline cases, the functions called).

I did not check that the set-up for the calls, etc.

I also haven't looked into the memory barrier instruction usage (but that should be common with other widths).

This revision is now accepted and ready to land.Apr 7 2022, 10:24 AM
lkail updated this revision to Diff 421384.Apr 7 2022, 6:40 PM
lkail updated this revision to Diff 421385.
lkail marked 7 inline comments as done.Apr 7 2022, 6:44 PM
This revision was landed with ongoing or failed builds.Apr 8 2022, 4:26 PM
This revision was automatically updated to reflect the committed changes.