This is an archive of the discontinued LLVM Phabricator instance.

Paths

clang/include/clang/Basic/DiagnosticFrontendKinds.td
clang/include/clang/Basic/DiagnosticSemaKinds.td
clang/include/clang/Basic/TargetInfo.h
clang/lib/Basic/TargetInfo.cpp
clang/lib/Basic/Targets/AMDGPU.cpp
clang/lib/CodeGen/CGAtomic.cpp
clang/lib/Sema/SemaChecking.cpp
clang/test/CodeGen/fp-atomic-ops.c
clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
clang/test/CodeGenOpenCL/atomic-ops.cl
clang/test/Sema/atomic-ops.c
clang/test/SemaCUDA/amdgpu-atomic-ops.cu
clang/test/SemaOpenCL/atomic-ops.cl

Differential D71726

Let clang atomic builtins fetch add/sub support floating point types
ClosedPublic

Authored by yaxunl on Dec 19 2019, 1:14 PM.

Details

Reviewers

rjmccall
b-sumner
arsenm
tra
jfb

Commits

rG61d065e21ff3: Let clang atomic builtins fetch add/sub support floating point types

Summary

Recently atomicrmw started to support fadd/fsub:

https://reviews.llvm.org/D53965

However, the clang atomic builtins fetch add/sub still do not support emitting atomicrmw fadd/fsub.

This patch adds that.

Event Timeline

yaxunl created this revision. Dec 19 2019, 1:14 PM

Herald added a subscriber: jfb. Dec 19 2019, 1:14 PM

This generally seems fine. Does it work on most backends? I want to make sure it doesn't fail in backends :)

Also, @ldionne / @EricWF / @mclow.lists do you need this in libc++ for floating-point atomic support?

jfb added a subscriber: __simt__. Dec 19 2019, 2:53 PM

In D71726#1791904, @jfb wrote:

This generally seems fine. Does it work on most backends? I want to make sure it doesn't fail in backends :)

For x86_64, amdgcn, aarch64, armv7, mips64, it is translated to cmpxchg by AtomicExpandPass and backends did codegen successfully.

For hexagon, riscv32, it is translated to call of __atomic_fetch_add_4 for fadd float. This is concerning. Probably we need to add __atomic_fetch_{add|sub}_{f16|f32|f64} ?
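One way to see why the sized libcall is a concern: `__atomic_fetch_add_4` is an integer routine, so a `float` operand routed through it would have its bit pattern added rather than its value. The sketch below imitates that with a plain, non-atomic integer add. It assumes an IEEE-754 binary32 `float`, and the helper names are hypothetical, not part of clang or libatomic.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Reinterpret the bits of a float as a 32-bit unsigned integer.
inline std::uint32_t floatBits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// Reinterpret a 32-bit unsigned integer as a float.
inline float bitsToFloat(std::uint32_t u) {
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

// What a 4-byte *integer* add would produce if handed two float operands:
// the sum of the two bit patterns, reinterpreted as a float.
inline float integerAddOnFloatBits(float a, float b) {
  return bitsToFloat(floatBits(a) + floatBits(b));
}
```

With IEEE-754 floats, `integerAddOnFloatBits(1.0f, 1.0f)` does not yield `2.0f`, which is presumably why dedicated floating-point entry points were floated as an option.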

arsenm added a subscriber: arsenm. Jan 2 2020, 7:53 AM

arsenm added inline comments.

clang/lib/CodeGen/CGAtomic.cpp
605–607	Should this really be based on the type, or should the builtin name be different for FP?

In D71726#1792852, @yaxunl wrote:

In D71726#1791904, @jfb wrote:

This generally seems fine. Does it work on most backends? I want to make sure it doesn't fail in backends :)

For x86_64, amdgcn, aarch64, armv7, mips64, it is translated to cmpxchg by AtomicExpandPass and backends did codegen successfully.

For hexagon, riscv32, it is translated to call of __atomic_fetch_add_4 for fadd float. This is concerning. Probably we need to add __atomic_fetch_{add|sub}_{f16|f32|f64} ?

For systems that have load-link/store-conditional architectures, like ARM / PPC / base RISC-V without extension, I would imagine that using a cmpxchg loop is much worse than simply doing the floating-point add/sub in the middle of the atomic mini-transaction. I'm sure that we want back-ends to be capable of implementing this better than what this pass is doing, even when they don't have "native" fp atomics.

You listed amdgcn... what does this do on nvptx?

In D71726#1801346, @__simt__ wrote:

In D71726#1792852, @yaxunl wrote:

In D71726#1791904, @jfb wrote:

This generally seems fine. Does it work on most backends? I want to make sure it doesn't fail in backends :)

For x86_64, amdgcn, aarch64, armv7, mips64, it is translated to cmpxchg by AtomicExpandPass and backends did codegen successfully.

For hexagon, riscv32, it is translated to call of __atomic_fetch_add_4 for fadd float. This is concerning. Probably we need to add __atomic_fetch_{add|sub}_{f16|f32|f64} ?

For systems that have load-link/store-conditional architectures, like ARM / PPC / base RISC-V without extension, I would imagine that using a cmpxchg loop is much worse than simply doing the floating-point add/sub in the middle of the atomic mini-transaction. I'm sure that we want back-ends to be capable of implementing this better than what this pass is doing, even when they don't have "native" fp atomics.

You listed amdgcn... what does this do on nvptx?

Targets can implement shouldExpandAtomicRMWInIR for the desired behavior, which NVPTX currently does not implement. Looking at AtomicExpandPass, it looks like either cmpxchg or LLSC expansions should work for the FP atomics already

rebase

Herald added a subscriber: wdng. May 20 2020, 12:38 PM

In D71726#2039319, @arsenm wrote:

In D71726#1801346, @__simt__ wrote:

In D71726#1792852, @yaxunl wrote:

In D71726#1791904, @jfb wrote:

This generally seems fine. Does it work on most backends? I want to make sure it doesn't fail in backends :)

For x86_64, amdgcn, aarch64, armv7, mips64, it is translated to cmpxchg by AtomicExpandPass and backends did codegen successfully.

For hexagon, riscv32, it is translated to call of __atomic_fetch_add_4 for fadd float. This is concerning. Probably we need to add __atomic_fetch_{add|sub}_{f16|f32|f64} ?

For systems that have load-link/store-conditional architectures, like ARM / PPC / base RISC-V without extension, I would imagine that using a cmpxchg loop is much worse than simply doing the floating-point add/sub in the middle of the atomic mini-transaction. I'm sure that we want back-ends to be capable of implementing this better than what this pass is doing, even when they don't have "native" fp atomics.

You listed amdgcn... what does this do on nvptx?

Targets can implement shouldExpandAtomicRMWInIR for the desired behavior, which NVPTX currently does not implement. Looking at AtomicExpandPass, it looks like either cmpxchg or LLSC expansions should work for the FP atomics already

nvptx is similar to hexagon and riscv32, where fp atomics are translated to a call to __atomic_fetch_add_4.

Since currently only amdgcn supports fp atomics, I am going to add a TargetInfo hook that reports whether fp atomics are supported, and only emit fp atomics for targets that support them.

clang/lib/CodeGen/CGAtomic.cpp
605–607	I think the original name is better. They are exactly what they are intended to be. They were not able to handle fp types therefore they used to emit diagnostics when fp types were passed to them. However now they are able to handle fp types.

In D71726#1791904, @jfb wrote:

This generally seems fine. Does it work on most backends? I want to make sure it doesn't fail in backends :)

Also, @ldionne / @EricWF / @mclow.lists do you need this in libc++ for floating-point atomic support?

Yes, I guess we do in order to implement fetch_add & friends on floating point types (https://wg21.link/P0020R6).

The builtins would need to work on float, double and long double. The code seems to suggest it does, however the tests only check for float. Does this support __atomic_fetch_{add,sub} on double and long double?

ldionne added inline comments. May 20 2020, 2:08 PM

clang/test/CodeGen/atomic-ops.c
296 ↗	(On Diff #265325)	Sorry if that's a dumb question, but I'm a bit confused: `p` is a `float`, but then we add a double `1.0` to it. Is that intended, or should that be `double p` instead (or `1.0f`)?

In D71726#2047566, @ldionne wrote:

In D71726#1791904, @jfb wrote:

This generally seems fine. Does it work on most backends? I want to make sure it doesn't fail in backends :)

Also, @ldionne / @EricWF / @mclow.lists do you need this in libc++ for floating-point atomic support?

Yes, I guess we do in order to implement fetch_add & friends on floating point types (https://wg21.link/P0020R6).

The builtins would need to work on float, double and long double. The code seems to suggest it does, however the tests only check for float. Does this support __atomic_fetch_{add,sub} on double and long double?

It depends on target. For x86_64, __atomic_fetch_{add,sub} on double and long double are translated to __atomic_fetch_sub_8 and __atomic_fetch_sub_16.
For amdgcn, __atomic_fetch_{add,sub} on double is translated to fp atomic insts. long double is the same as double on amdgcn.

clang/test/CodeGen/atomic-ops.c
296 ↗	(On Diff #265325)	In this case, the value type is converted to the pointee type of the pointer operand.

In D71726#2047566, @ldionne wrote:

In D71726#1791904, @jfb wrote:

This generally seems fine. Does it work on most backends? I want to make sure it doesn't fail in backends :)

Also, @ldionne / @EricWF / @mclow.lists do you need this in libc++ for floating-point atomic support?

Yes, I guess we do in order to implement fetch_add & friends on floating point types (https://wg21.link/P0020R6).

The builtins would need to work on float, double and long double. The code seems to suggest it does, however the tests only check for float. Does this support __atomic_fetch_{add,sub} on double and long double?

libc++ could implement atomic<float> using a cmpxchg loop with bit_cast and the FP instruction in most cases, and only use these builtins if available.
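A minimal sketch of the fallback mentioned above: a compare-exchange loop over the raw bits, with the floating-point add in the middle. It is an illustration, not libc++'s implementation. The function name is hypothetical, `std::memcpy` stands in for `bit_cast` so the code does not need C++20, and a 32-bit `float` is assumed.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: atomically adds `operand` to the float whose bits are
// held in `storage`, and returns the previous value (fetch_add semantics).
inline float fetchAddFloat(std::atomic<std::uint32_t> &storage, float operand) {
  std::uint32_t oldBits = storage.load(std::memory_order_relaxed);
  for (;;) {
    float oldValue;
    std::memcpy(&oldValue, &oldBits, sizeof oldValue);  // bits -> float
    float newValue = oldValue + operand;                // the FP add
    std::uint32_t newBits;
    std::memcpy(&newBits, &newValue, sizeof newBits);   // float -> bits
    // On failure, compare_exchange_weak refreshes oldBits with the current
    // contents of storage, so the loop recomputes the sum and retries.
    if (storage.compare_exchange_weak(oldBits, newBits,
                                      std::memory_order_seq_cst,
                                      std::memory_order_relaxed))
      return oldValue;
  }
}

// Hypothetical helpers to move a float in and out of the integer storage.
inline std::uint32_t toBits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

inline float fromBits(std::uint32_t u) {
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}
```

The comparison is done on bit patterns, so the loop terminates even when the stored value is a NaN, which would never compare equal to itself as a float.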

Added TargetInfo::isFPAtomicFetchAddSubSupported to guard fp atomic.

Herald added subscribers: kerbowa, nhaehnle, jvesely. May 21 2020, 5:24 AM

tra added a subscriber: tra. May 21 2020, 10:49 AM

tra added inline comments.

clang/include/clang/Basic/TargetInfo.h
1514	I think it should be predicated on specific type. E.g. NVPTX supports atomic ops on fp32 ~everywhere, but fp64 atomic add/sub is only supported on newer GPUs. And then there's fp16...

ldionne added inline comments. May 21 2020, 11:45 AM

clang/test/CodeGen/atomic-ops.c
296 ↗	(On Diff #265325)	Ok, thanks for the clarification. Yeah, it was a dumb question after all. I still think it should be made clearer by using `1.0f`.

yaxunl marked 3 inline comments as done. May 21 2020, 2:32 PM

yaxunl added inline comments.

clang/include/clang/Basic/TargetInfo.h
1514	will do and add tests for fp16
clang/test/CodeGen/atomic-ops.c
296 ↗	(On Diff #265325)	this test has been removed. the new tests do not have this issue.

check supported fp atomics by bits.

ldionne added inline comments. May 25 2020, 11:10 AM

clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
27	Nitpick, but this should be `1.0L` to be consistent.

tra added inline comments. May 26 2020, 9:58 AM

clang/include/clang/Basic/TargetInfo.h
1514	The number of bits alone may not be sufficient to differentiate the FP variants. E.g. 16-bit floats currently have 2 variants: IEEE FP16 and BFloat16 (supported by intel and newer NVIDIA GPUs). CUDA-11 has introduced TF32 FP format, so we're likely to have more than one 32-bit FP type, too. I think PPC has an odd `long double` variant represented as pair of 64-bit doubles.

yaxunl marked 4 inline comments as done. Jul 18 2020, 3:44 PM

yaxunl added inline comments.

clang/include/clang/Basic/TargetInfo.h
1514	will use llvm::fltSemantics for checking, which should cover different fp types.
clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
27	done

use llvm::fltSemantics for checking
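The difference between keying the check on bit width and keying it on the floating-point format can be sketched as follows. The enum is a hypothetical stand-in for `llvm::fltSemantics` (which in LLVM is obtained through accessors such as `llvm::APFloat::IEEEsingle()`), and the set of supported formats belongs to a made-up target chosen only to show the ambiguity.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::fltSemantics, for illustration only.
enum class FloatFormat { IEEEhalf, BFloat, IEEEsingle, IEEEdouble };

// Storage width of each format in bits.
inline unsigned storageBits(FloatFormat format) {
  switch (format) {
  case FloatFormat::IEEEhalf:
  case FloatFormat::BFloat:
    return 16;
  case FloatFormat::IEEEsingle:
    return 32;
  case FloatFormat::IEEEdouble:
    return 64;
  }
  return 0;
}

// Format-keyed query for a made-up target that handles IEEE half, single and
// double, but not bfloat16.
inline bool supportedByFormat(FloatFormat format) {
  return format != FloatFormat::BFloat;
}

// Width-keyed query for the same made-up target. It has to give one answer
// for all 16-bit formats, so it cannot match supportedByFormat for both
// IEEE half and bfloat16.
inline bool supportedByBits(unsigned bits) {
  return bits == 16 || bits == 32 || bits == 64;
}
```

Here the width-keyed query accepts bfloat16 because it shares 16 bits with IEEE half, while the format-keyed query can reject it.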

Why not have clang always emit atomicrmw for floats, and let AtomicExpandPass handle legalizing that into integer atomics if necessary, rather than adding a target hook in clang?

In D71726#2165424, @jyknight wrote:

Why not have clang always emit atomicrmw for floats, and let AtomicExpandPass handle legalizing that into integer atomics if necessary, rather than adding a target hook in clang?

Not all targets can legalize fp atomics by AtomicExpandPass. Some targets need library support.

In D71726#2165445, @yaxunl wrote:

In D71726#2165424, @jyknight wrote:

Why not have clang always emit atomicrmw for floats, and let AtomicExpandPass handle legalizing that into integer atomics if necessary, rather than adding a target hook in clang?

Not all targets can legalize fp atomics by AtomicExpandPass. Some targets need library support.

What are they missing? It can be expanded to a cmpxchg loop with bitcast to an integer type of the same size.

In D71726#2165445, @yaxunl wrote:

In D71726#2165424, @jyknight wrote:

Why not have clang always emit atomicrmw for floats, and let AtomicExpandPass handle legalizing that into integer atomics if necessary, rather than adding a target hook in clang?

Not all targets can legalize fp atomics by AtomicExpandPass. Some targets need library support.

That isn't true, because you can do so generically with a cmpxchg loop, assuming that size of atomic is supported by the target. This might not be the most efficient lowering choice, but it's always possible as a fallback. (And if the size is too large, then AtomicExpandPass will lower the cmpxchg to the libatomic call.)

If a target wants to tell AtomicExpandPass that fp add/sub are supported, and then lower the resulting ATOMIC_LOAD_FSUB sdag node into a libcall of its choice, that's also ok (as long as the libcall is lock-free).

In D71726#2165494, @jyknight wrote:

In D71726#2165445, @yaxunl wrote:

In D71726#2165424, @jyknight wrote:

Why not have clang always emit atomicrmw for floats, and let AtomicExpandPass handle legalizing that into integer atomics if necessary, rather than adding a target hook in clang?

Not all targets can legalize fp atomics by AtomicExpandPass. Some targets need library support.

That isn't true, because you can do so generically with a cmpxchg loop, assuming that size of atomic is supported by the target. This might not be the most efficient lowering choice, but it's always possible as a fallback. (And if the size is too large, then AtomicExpandPass will lower the cmpxchg to the libatomic call.)

If a target wants to tell AtomicExpandPass that fp add/sub are supported, and then lower the resulting ATOMIC_LOAD_FSUB sdag node into a libcall of its choice, that's also ok (as long as the libcall is lock-free).

how about other fp types e.g. bf16, half, long double? Do we need to diagnose them or not?

Make IEEE single and double type as supported for fp atomics in all targets by default. This is based on the assumption that AtomicExpandPass or its ongoing work is sufficient to support fp atomics for all targets. This is to facilitate middle end and backend end development to support fp atomics.

If a target would like to treat single and double fp atomics as unsupported, it can override the default behavior in its own TargetInfo.

ping. I think I have addressed all the issues in FE. I think issues in AtomicExpandPass should be addressed by separate patches. Can we land this? Thanks.

LGTM, modulo couple of nits.

@jyknight are you OK with this?

In D71726#2179428, @yaxunl wrote:

Make IEEE single and double type as supported for fp atomics in all targets by default. This is based on the assumption that AtomicExpandPass or its ongoing work is sufficient to support fp atomics for all targets. This is to facilitate middle end and backend end development to support fp atomics.

If a target would like to treat single and double fp atomics as unsupported, it can override the default behavior in its own TargetInfo.

Do we have sufficient test coverage on all platforms to make sure we're not generating something that LLVM can't handle everywhere?
If not, perhaps we should default to unsupported and only enable it for known working targets.

clang/lib/CodeGen/CGAtomic.cpp
915–917	`ShouldCastToIntPtrTy = !MemTy->isFloatingType();`
clang/test/Sema/atomic-ops.c
102–104	Rename arguments? d -> f, d2 -> d, d3 -> ld ?

revised by Artem's comments.

added tests for targets supporting fp atomics.

Herald added subscribers: atanasyan, sdardis. Aug 4 2020, 5:09 PM

In D71726#2182667, @tra wrote:

LGTM, modulo couple of nits.

@jyknight are you OK with this?

In D71726#2179428, @yaxunl wrote:

Make IEEE single and double type as supported for fp atomics in all targets by default. This is based on the assumption that AtomicExpandPass or its ongoing work is sufficient to support fp atomics for all targets. This is to facilitate middle end and backend end development to support fp atomics.

If a target would like to treat single and double fp atomics as unsupported, it can override the default behavior in its own TargetInfo.

Do we have sufficient test coverage on all platforms to make sure we're not generating something that LLVM can't handle everywhere?
If not, perhaps we should default to unsupported and only enable it for known working targets.

I updated TargetInfo for fp atomic support for common targets. Basically by default fp atomic support is now off. It is enabled only for targets which do not generate lib calls for fp atomics. This is because the availability of lib call depends on platform, so it is up to the Target owners to determine whether the support is available if lib call is needed. For those targets which are able to generate llvm fp atomic instructions, fp atomic support is enabled in clang, and tests are added to cover them.

clang/lib/CodeGen/CGAtomic.cpp
915–917	done
clang/test/Sema/atomic-ops.c
102–104	done
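The default-off arrangement described in this update can be pictured with a small model. Only the hook name `isFPAtomicFetchAddSubSupported` comes from the thread; the class names, the signature, and the format enum are assumptions made for the sketch and do not reproduce the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for the floating-point format being queried.
enum class FloatFormat { IEEEhalf, IEEEsingle, IEEEdouble };

// Hypothetical model of a TargetInfo-style base class.
struct TargetInfoModel {
  virtual ~TargetInfoModel() = default;
  // Off by default: a target has to opt in explicitly.
  virtual bool isFPAtomicFetchAddSubSupported(FloatFormat) const {
    return false;
  }
};

// Hypothetical target that opts in for IEEE single and double only.
struct FpAtomicTargetModel : TargetInfoModel {
  bool isFPAtomicFetchAddSubSupported(FloatFormat format) const override {
    return format == FloatFormat::IEEEsingle ||
           format == FloatFormat::IEEEdouble;
  }
};

// Front-end style check: true when a floating-point fetch add/sub on this
// target should be rejected with a diagnostic instead of being emitted.
inline bool shouldDiagnoseFPAtomic(const TargetInfoModel &target,
                                   FloatFormat format) {
  return !target.isFPAtomicFetchAddSubSupported(format);
}
```

In this model a target that never overrides the hook rejects every floating-point fetch add/sub, which matches the "off by default" behaviour the comment describes.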

yaxunl edited the summary of this revision. Aug 4 2020, 5:20 PM

ldionne removed a subscriber: ldionne. Aug 5 2020, 9:05 AM

ping

In D71726#2182667, @tra wrote:

If a target would like to treat single and double fp atomics as unsupported, it can override the default behavior in its own TargetInfo.

I really don't think this should be a target option at all. Every target can support the atomic fadd/fsub IR instruction (via lowering to a cmpxchg loop if nothing else). If it doesn't work, that's a bug in LLVM. We shouldn't be adding target hooks in Clang to workaround LLVM bugs, rather, we should fix them.

There is one nit -- atomicrmw doesn't (yet) support specifying alignment. There's work now to fix that, but until that's submitted, only naturally-aligned atomicrmw instructions can be created. So, for now, supporting only a naturally-aligned floating-point add would be a reasonable temporary measure.

Do we have sufficient test coverage on all platforms to make sure we're not generating something that LLVM can't handle everywhere?

Probably not.

If not, perhaps we should default to unsupported and only enable it for known working targets.

No, I don't think that's a good way to go. We should fix LLVM if it's broken.

jyknight added inline comments. Aug 10 2020, 9:11 AM

clang/lib/CodeGen/CGAtomic.cpp
961	convertToAtomicIntPointer does more than just cast to an int pointer, are you sure the rest is not necessary for fp types?
clang/lib/Sema/SemaChecking.cpp
5037–5039	This is confusing, and took me a bit to understand what you're doing. I'd suggest reordering the clauses, putting the pointer case first, e.g.:
  if (Form == Arithmetic && ValType->isPointerType())
    Ty = Context.getPointerDiffType();
  else if (Form == Init || Form == Arithmetic)
    Ty = ValType;
  else if (Form == Copy || Form == Xchg)
    .....
  else
    ......
  ...

Oh, one more note, C11 has -- and clang already supports -- _Atomic long double x; x += 4; via lowering to a cmpxchg loop. Now that we have an LLVM IR representation for atomicrmw fadd/fsub, clang should be lowering the _Atomic += to that, too. (Doesn't need to be in this patch, but it should be done.)

In D71726#2207148, @jyknight wrote:

In D71726#2182667, @tra wrote:

If a target would like to treat single and double fp atomics as unsupported, it can override the default behavior in its own TargetInfo.

I really don't think this should be a target option at all. Every target can support the atomic fadd/fsub IR instruction (via lowering to a cmpxchg loop if nothing else). If it doesn't work, that's a bug in LLVM. We shouldn't be adding target hooks in Clang to workaround LLVM bugs, rather, we should fix them.

There is one nit -- atomicrmw doesn't (yet) support specifying alignment. There's work now to fix that, but until that's submitted, only naturally-aligned atomicrmw instructions can be created. So, for now, supporting only a naturally-aligned floating-point add would be a reasonable temporary measure.

clang does not always emit atomic instructions for atomic builtins. Clang may emit lib calls for atomic builtins. Basically clang checks target info about max atomic inline width and if the desired atomic operation exceeds the supported atomic inline width, clang will emit lib calls for atomic builtins. The rationale is that the lib calls may be faster than the IR generated by the LLVM pass. This behavior has long existed and it also applies to fp atomics. I don't think emitting lib calls for atomic builtins is a bug. However, this does introduce the issue about whether the library functions for atomics are available for a specific target. As I said, only the target owners have the answer and therefore I introduced the target hook.

Do we have sufficient test coverage on all platforms to make sure we're not generating something that LLVM can't handle everywhere?

Probably not.

In clang, we only test IR generation, as is done for other atomic builtins. fp atomics do not have less coverage compared with other atomic builtins. Actually for other atomic builtins we do not even test them on different targets. The ISA generation of fp atomics should be done in llvm tests and should not be blocking clang change.

clang/lib/CodeGen/CGAtomic.cpp
961	it is not needed for fp types. If the value type does not match the pointer type, clang automatically inserts proper llvm instructions to convert the value type to a value type that matches the pointer type. Two codegen tests are added (atomic_fetch_add(double, float) and atomic_fetch_add(double, int)) to test such situations.
clang/lib/Sema/SemaChecking.cpp
5037–5039	done

Revised by James' comments.

ping

Herald added a subscriber: dexonsmith. Oct 22 2020, 8:31 AM

In D71726#2207700, @yaxunl wrote:

clang does not always emit atomic instructions for atomic builtins. Clang may emit lib calls for atomic builtins. Basically clang checks target info about max atomic inline width and if the desired atomic operation exceeds the supported atomic inline width, clang will emit lib calls for atomic builtins. The rationale is that the lib calls may be faster than the IR generated by the LLVM pass. This behavior has long existed and it also applies to fp atomics. I don't think emitting lib calls for atomic builtins is a bug. However, this does introduce the issue about whether the library functions for atomics are available for a specific target. As I said, only the target owners have the answer and therefore I introduced the target hook.

If we want the frontend to emit an error when the target doesn't support library-based atomics, that seems fine, but there's no reason to only do so for floating-point types. That is, we should have a TargetInfo method that asks whether atomics at a given size and alignment are supported at all, similar to what we have for "builtin" (lock-free) atomics, and we should check it for all the atomic types and operations.

Actually, maybe we should take the existing hook and have it return one of { LockFree, Library, Unsupported }.
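The three-way result suggested here could look roughly like the sketch below. The enumerator names follow the comment; the model struct, its fields, and the classification rule are assumptions made for illustration and are not clang's actual logic.

```cpp
#include <cassert>

// The three outcomes named in the comment above.
enum class AtomicSupportKind { LockFree, Library, Unsupported };

// Hypothetical, simplified description of a target.
struct AtomicTargetModel {
  unsigned maxAtomicInlineWidthBits; // widest atomic the target emits inline
  bool hasAtomicLibrary;             // whether atomic libcalls can be linked
};

// Hypothetical classification: naturally aligned operations within the inline
// width are lock-free; anything else falls back to the library if the target
// has one, and is unsupported otherwise.
inline AtomicSupportKind getAtomicSupport(const AtomicTargetModel &target,
                                          unsigned sizeBits,
                                          unsigned alignBits) {
  if (sizeBits <= alignBits && sizeBits <= target.maxAtomicInlineWidthBits)
    return AtomicSupportKind::LockFree;
  return target.hasAtomicLibrary ? AtomicSupportKind::Library
                                 : AtomicSupportKind::Unsupported;
}
```

A front end using such a hook would diagnose only the `Unsupported` case, and it would do so for every atomic type and operation rather than for floating-point ones alone.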

In D71726#2207700, @yaxunl wrote:

clang does not always emit atomic instructions for atomic builtins. Clang may emit lib calls for atomic builtins. Basically clang checks target info about max atomic inline width and if the desired atomic operation exceeds the supported atomic inline width, clang will emit lib calls for atomic builtins. The rationale is that the lib calls may be faster than the IR generated by the LLVM pass. This behavior has long existed and it also applies to fp atomics. I don't think emitting lib calls for atomic builtins is a bug. However, this does introduce the issue about whether the library functions for atomics are available for a specific target. As I said, only the target owners have the answer and therefore I introduced the target hook.

The LLVM AtomicExpandPass is _also_ introducing libcalls (or cmpxchg loops), as is appropriate for a given target. We currently, redundantly, support the same thing in two places. It's a long-term goal of mine to simplify the atomics code in clang, by deferring more of it to LLVM, but some prerequisites (e.g. supporting misaligned atomicrmw) are not yet in place. The intent is that it is always valid to emit the LLVM atomic IR, and it will be transformed into whatever is best on a given target. As such, there's no reason to restrict these clang intrinsics.

Yes, there are no generically available libcalls for atomic float math -- but that's okay -- let LLVM handle transform into a cmpxchg loop when required.

dexonsmith removed a subscriber: dexonsmith. Oct 23 2020, 11:53 AM

Yes, there are no generically available libcalls for atomic float math -- but that's okay -- let LLVM handle transform into a cmpxchg loop when required.

I suspect Yaxun's target cannot provide libcalls at all, which is why he wants to diagnose up-front. But I agree that we should be thinking about this uniformly, and that his target should be diagnosing *all* unsupported atomics.

revised by John's comments. Added target hook and diagnostics for generic atomic operations.

In D71726#2351069, @rjmccall wrote:

Yes, there are no generically available libcalls for atomic float math -- but that's okay -- let LLVM handle transform into a cmpxchg loop when required.

I suspect Yaxun's target cannot provide libcalls at all, which is why he wants to diagnose up-front. But I agree that we should be thinking about this uniformly, and that his target should be diagnosing *all* unsupported atomics.

amdgpu target currently does not support atomic libcalls. I added a target hook for atomic operation support and diagnostics for generic atomic operations by John's comments.

Clang has existing diagnostics for unsupported atomic load/store on some platforms, and functions about atomic support scattered across target info, AST context, and codegen. This change refactors that code and unifies it behind a target hook.

ping

@rjmccall I have addressed the comments about diagnostics. Could you please review it? Thanks.

rjmccall added inline comments. Jan 27 2021, 6:51 PM

clang/include/clang/Basic/TargetInfo.h
1515	This shouldn't be here; if you have places that don't always represent an atomic operation, queries for the kind should return an `Optional<AtomicOperationKind>` from the classification.
1516	`atomic_init` is not actually an atomic operation, so there's never an inherent reason it can't be supported. In general, I am torn about this list, because it's simultaneously rather fine-grained while not seeming nearly fine-grained enough to be truly general. What's actually going on on your target? You have ISA support for doing some specific operations atomically, but not a general atomic compare-and-swap operation? Which means that you then cannot support other operations? It is unfortunate that our layering prevents TargetInfo from simply being passed the appropriate expression.
1534	I think this reflects our current strategies for emitting atomics, but it's a somewhat misleading enum in general because this isn't an exhaustive list of the options — there are certainly possible inline expansions that aren't lock-free. (For example, you could have an inline spin-lock embedded in the atomic object.) The goal of this enum is so that TargetInfo only has to have one hook for checking atomic operations? I would be happier if you included an inline-but-not-lock-free alternative in this enum, even if it's never currently used, so that clients can do the right test.
1538	Why is this needed as a separate hook?
clang/lib/AST/ASTContext.cpp
11046 ↗	(On Diff #307206)	Should this be a method on `AtomicExpr`? It seems like an intrinsic, target-independent property of the expression.
clang/lib/Basic/TargetInfo.cpp
874	Darwin targets should all be subclasses of `DarwinTargetInfo` in OSTargets.h, so you should be able to just override this there instead of having it in the base case.
clang/lib/Basic/Targets/AArch64.h
143 ↗	(On Diff #307206)	Why can't targets reliably expand this to an atomic compare-and-exchange if they support that for the target width?

yaxunl marked 7 inline comments as done. Feb 1 2021, 8:16 AM

yaxunl added inline comments.

clang/include/clang/Basic/TargetInfo.h
1515	Removed.
1516	The target hook getAtomicSupport needs an argument for atomic operation. Since not all targets support fp add/sub, we need an enum for add/sub. Since certain release of iOS/macOS does not support C11 load/store, we need an enum for C11 load/store. We could define the enums as {AddSub, C11LoadStore, Other}. However, this would cause a difficulty for emitting diagnostic message for unsupported atomic operations since we map this enum to a string for the atomic operation and use it in the diagnostic message. 'Other' would be mapped to 'other atomic operation' which is not clear what it is.
1534	Added InlineWithLock
1538	Most target shares getAtomicSupport except FP atomic support, so define a virtual function for FP atomic support and let getAtomicSupport call it.
clang/lib/AST/ASTContext.cpp
11046 ↗	(On Diff #307206)	Yes. moved to AtomicExpr
clang/lib/Basic/TargetInfo.cpp
874	done
clang/lib/Basic/Targets/AArch64.h
143 ↗	(On Diff #307206)	There are some bugs in either the middle end or the backend that prevent this from working. For example, a half-type atomic fadd on amdgcn is not lowered to cmpxchg and the backend hits an isel failure; a bf16-type atomic fadd on arm is not lowered to cmpxchg and the backend hits an isel failure. The support for each fp type needs to be done case by case. So far no target supports atomic fadd/fsub with half and bf16 types.

revised by John's comments

Herald added a reviewer: jfb. Feb 1 2021, 8:19 AM

I still have the same fundamental objection as before to the parts of this patch for prohibiting FP add/sub on some targets.

If a particular LLVM target cannot handle transforming an FP add/sub (or any other RMW operations!) into the correct cmpxchg or LL/SC loop, that's a bug in the backend which should be fixed. I don't see why we ought to add a bunch of functionality in the frontend to workaround this?

(Some of the other changes, e.g. to diagnose lack of support for large atomics is useful, though.)

rjmccall added inline comments. Feb 1 2021, 12:09 PM

clang/include/clang/Basic/TargetInfo.h
1516	It's not obviously true that not all targets support FP add/sub, though. Any target that provides compare-and-swap at the width of an FP type can do an atomic FP add/sub at that width; it might be less efficient than it would be with specific ISA support, but that's true for a lot of atomic operations. Surely it's better to just fix whatever bugs LLVM has with lowering atomic FP add/sub than to add more abstraction to Clang to handle a special case that shouldn't exist. I don't know what issues Darwin has with C11 load/store; that might be a more compelling reason to have this abstraction, although again it seems strange that we're outlawing a specific operation when in principle we can just emit it less efficiently.
clang/lib/Basic/Targets/AArch64.h
143 ↗	(On Diff #307206)	Are we legalizing atomicrmw to cmpxchg loops in the backend instead of as LLVM IR pass? That seems like an architectural mistake. Regardless, this bug should just be fixed.

This patch focuses on the clang work for enabling fp atomics. There is a middle end pass for lowering fp atomics to cmpxchg, however not all targets enable it or enable it properly. From clang's point of view, those targets are not ready to say they support fp atomics, therefore it diagnoses those situations and lets clang fail gracefully instead of crashing with an isel failure or missing symbols in the linker.

I have limited resources to work on the middle end and backend for all targets. If a backend really cares about fp atomics, its owners should fix the atomic lowering pass and then enable fp atomics support in the clang target info.

This patch implements fp atomic support in clang. It does not make things worse regarding the bugs in middle ends and backends. I think it is not beneficial to block this clang change due to middle end and backend issues.

If the concern is that diagnosing fp atomics as unsupported hinders middle end and backend work on fixing fp atomic issues, how about adding a -fenable-fp-atomics option to clang which can override the target info about fp atomics support?

My concern is that this is treating a backend _bug_ as if it were just an optional feature. But it's not the case that it might be reasonable to either implement or not implement this in a backend -- it should be implemented, and those that don't are buggy.

I'd be happier with just having an ISEL failure when you try to use fp atomics on broken targets, rather than adding all this code and configuration to Clang in order to avoid that. (And, of course, the target maintainers should also fix them)

In D71726#2536966, @jyknight wrote:

My concern is that this is treating a backend _bug_ as if it were just an optional feature. But it's not the case that it might be reasonable to either implement or not implement this in a backend -- it should be implemented, and those that don't are buggy.

I'd be happier with just having an ISEL failure when you try to use fp atomics on broken targets, rather than adding all this code and configuration to Clang in order to avoid that. (And, of course, the target maintainers should also fix them)

+1. I agree with James.

Removing code is often harder than adding it. When you're adding things, you're the only user. Once things are in, they will start growing dependencies that will need to be dealt with if you ever want to remove the code.

A clean solution that works only for AMDGPU for now is better than a potentially permanent workaround.

In D71726#2537054, @tra wrote:

+1. I agree with James.

Removing code is often harder than adding it. When you're adding things, you're the only user. Once things are in, they will start growing dependencies that will need to be dealt with if you ever want to remove the code.

A clean solution that works only for AMDGPU for now is better than a potentially permanent workaround.

For the amdgpu target, we do need to diagnose unsupported atomics (not limited to fp atomics), since we do not support libcalls because ISA-level linking is not supported. This is something we cannot fix in a short time, and we would rather diagnose it than confuse users with missing symbols in lld.

For other targets, I can make changes to assume fp atomics are supported if the width is within the target's max inline atomic width. Basically, this will let fp atomics be emitted for these targets, assuming the middle end or backend will handle them properly.

In D71726#2537101, @yaxunl wrote:

For the amdgpu target, we do need to diagnose unsupported atomics (not limited to fp atomics), since we do not support libcalls because ISA-level linking is not supported. This is something we cannot fix in a short time, and we would rather diagnose it than confuse users with missing symbols in lld.

Diagnosing that you don't support atomics your target can't reasonably support is completely fine. (You could actually inline a locking approach if you really wanted to, though; Microsoft's std::atomic does that in the general case, although admittedly that's library code.) I would like to understand whether that's really type-specific or just size-specific, though, and I don't think we've gotten a plain answer about that. Is it true that amdgpu simply does not have a generic cmpxchg?

For other targets, I can make changes to assume fp atomics are supported if the width is within the target's max inline atomic width. Basically, this will let fp atomics be emitted for these targets, assuming the middle end or backend will handle them properly.

I think that's reasonable.

In D71726#2537101, @yaxunl wrote:

For the amdgpu target, we do need to diagnose unsupported atomics (not limited to fp atomics), since we do not support libcalls because ISA-level linking is not supported. This is something we cannot fix in a short time, and we would rather diagnose it than confuse users with missing symbols in lld.

If this is limited simply to not supporting oversized or misaligned atomics, I'd find that a lot less objectionable. At that point you just need a single boolean variable/accessor for whether the target can support atomic library calls. I note that we already have warning messages: warn_atomic_op_misaligned and warn_atomic_op_oversized. Maybe those can just be promoted to errors on AMDGPU.
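The suggestion above (a single boolean for library-call support, with the existing oversized/misaligned warnings promoted to errors when it is false) can be modeled roughly as follows. This is a sketch under assumptions: `Severity`, `AtomicDecision`, and `classifyAtomic` are hypothetical names for illustration, and sizes are given in bytes rather than through clang's real `TargetInfo` interface.

```cpp
#include <cassert>

// Hypothetical model of the diagnostic decision; all names are illustrative.
enum class Severity { None, Warning, Error };

struct AtomicDecision {
  Severity severity; // diagnostic to report, if any
  bool use_libcall;  // whether a runtime library call would be emitted
};

inline AtomicDecision classifyAtomic(unsigned sizeBytes, unsigned alignBytes,
                                     unsigned maxInlineBytes,
                                     bool libCallSupported) {
  assert(sizeBytes != 0 && "size must be nonzero");
  const bool oversized = sizeBytes > maxInlineBytes;
  const bool misaligned = (alignBytes % sizeBytes) != 0;
  if (!oversized && !misaligned)
    return {Severity::None, false}; // can be emitted inline
  if (libCallSupported)
    return {Severity::Warning, true}; // slow path through the runtime library
  return {Severity::Error, false}; // no library to fall back on
}
```

With this shape, a target that cannot link atomic runtime functions only has to flip one flag, and the type of the operand never enters the decision.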

In D71726#2537378, @jyknight wrote:

If this is limited simply to not supporting oversized or misaligned atomics, I'd find that a lot less objectionable. At that point you just need a single boolean variable/accessor for whether the target can support atomic library calls. I note that we already have warning messages: warn_atomic_op_misaligned and warn_atomic_op_oversized. Maybe those can just be promoted to errors on AMDGPU.

Good points. Will do.

Revised per James's, Artem's, and John's comments.

ping

@jyknight @rjmccall ping. diagnostic issue addressed.

@rjmccall @jyknight Ping. Any further concerns? Thanks.

Re-use existing warning instead of introducing new diagnostics.

Ping. Can someone help review this patch? I believe all comments have been addressed. Thanks.

yaxunl edited the summary of this revision. Mar 23 2021, 6:42 AM

Herald added a subscriber: tpr. Mar 23 2021, 6:42 AM

Harbormaster completed remote builds in B95259: Diff 332658. Mar 23 2021, 8:52 AM

@jyknight - James, do you have further concerns about the patch?

clang/lib/Driver/ToolChains/Clang.cpp
6454 ↗	(On Diff #332658)	If we rely on promoting the warnings to errors for correctness, I think we may need a more robust mechanism to enforce that than trying to guess the state based on provided options. E.g. can these diagnostics be enabled/disabled with a wider scope option like `-W[no-]extra` or `-W[no-]all`? Maybe we should add a cc1-only option `--enforce-atomic-alignment` and use that to determine if misalignment should be an error at the point where we issue the diagnostics?
6457 ↗	(On Diff #332658)	This should be `else if`, or maybe use `llvm::StringSwitch()` instead: `DiagAtomicLibCall = llvm::StringSwitch<bool>(A->getValue()).Case("no-error=atomic-alignment", false).Case("error=atomic-alignment", true).Default(DiagAtomicLibCall);`
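For readers without LLVM headers at hand, the behavior of the suggested `StringSwitch` chain can be sketched with the standard library alone. `updateDiagAtomicLibCall` is a hypothetical stand-in written for illustration; it is not the code from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the suggested llvm::StringSwitch chain, using
// only the standard library so it compiles without LLVM.
inline bool updateDiagAtomicLibCall(const std::string &value, bool current) {
  if (value == "no-error=atomic-alignment")
    return false;
  if (value == "error=atomic-alignment")
    return true;
  return current; // any other value leaves the flag unchanged
}
```

The key property, in either form, is that the two cases are mutually exclusive and unrelated option values fall through to the previous state.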

Separated diagnosing unaligned atomics for amdgpu into another review.

yaxunl added a child revision: D99201: [AMDGPU] Diagnose unaligned atomic for amdgpu. Mar 23 2021, 11:09 AM

In D71726#2645269, @tra wrote:

@jyknight - James, do you have further concerns about the patch?

I separated the change about diagnosing unaligned atomics for amdgpu into https://reviews.llvm.org/D99201 since these two changes are orthogonal.

Harbormaster completed remote builds in B95305: Diff 332730. Mar 23 2021, 3:02 PM

rjmccall added inline comments. Mar 29 2021, 10:29 PM

clang/lib/Sema/SemaChecking.cpp
4910	Does LLVM support atomics on all floating-point types?

yaxunl added inline comments. Apr 4 2021, 8:25 AM

clang/lib/Sema/SemaChecking.cpp
4910	The LLVM IR parser requires the atomicrmw value operand to have a size that is a power of 2, so LLVM does not support atomicrmw on x86_fp80, which has a size of 80 bits. LLVM supports atomicrmw on all other floating-point types (bfloat, half, float, double, fp128, ppc_fp128).
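The size rule described in this reply can be checked with a few lines of arithmetic. The sketch below is illustrative only: `sizeAllowedForAtomicRMW` is a hypothetical predicate over bit widths, not the check that was added to clang (the thread mentions using `llvm::fltSemantics` for that).

```cpp
#include <cassert>
#include <cstdint>

// True when n has exactly one bit set.
constexpr bool isPowerOfTwo(std::uint64_t n) {
  return n != 0 && (n & (n - 1)) == 0;
}

// Hypothetical predicate mirroring the power-of-2 size rule for atomicrmw
// value operands.
constexpr bool sizeAllowedForAtomicRMW(std::uint64_t bits) {
  return isPowerOfTwo(bits);
}

static_assert(sizeAllowedForAtomicRMW(32), "float is 32 bits");
static_assert(!sizeAllowedForAtomicRMW(80), "x86_fp80 is 80 bits");
```

Widths of 16 (half, bfloat), 32, 64, and 128 (fp128, ppc_fp128) pass, while 80 does not, which matches the list of supported types given above.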

rjmccall added inline comments. Apr 5 2021, 10:53 AM

clang/lib/Sema/SemaChecking.cpp
4910	Okay. So this needs to check the underlying FP semantics and disallow atomics on unsupported types.

yaxunl marked 2 inline comments as done. Apr 5 2021, 6:28 PM

yaxunl added inline comments.

clang/lib/Sema/SemaChecking.cpp
4910	will do

Revised per John's comments. Do not allow atomic fetch add with x86_fp80.

Harbormaster completed remote builds in B97205: Diff 335366. Apr 5 2021, 7:14 PM

Alright, mostly looks good.

clang/lib/Sema/SemaChecking.cpp
4910	Could you extract this whole condition into a function and make it a bit more readable?

Revised per John's comments.

Harbormaster completed remote builds in B97288: Diff 335489. Apr 6 2021, 6:55 AM

Thanks, LGTM

This revision is now accepted and ready to land. Apr 6 2021, 11:52 AM

This revision was landed with ongoing or failed builds. Apr 6 2021, 12:45 PM

Closed by commit rG61d065e21ff3: Let clang atomic builtins fetch add/sub support floating point types (authored by yaxunl).

This revision was automatically updated to reflect the committed changes.

yaxunl added a commit: rG61d065e21ff3: Let clang atomic builtins fetch add/sub support floating point types.

Herald added a project: Restricted Project. Apr 6 2021, 12:45 PM

kpet added a subscriber: kpet. May 26 2021, 1:12 AM

Revision Contents

Path	Size
clang/include/clang/Basic/DiagnosticFrontendKinds.td	6 lines
clang/include/clang/Basic/DiagnosticSemaKinds.td	3 lines
clang/include/clang/Basic/TargetInfo.h	3 lines
clang/lib/Basic/TargetInfo.cpp	1 line
clang/lib/Basic/Targets/AMDGPU.cpp	1 line
clang/lib/CodeGen/CGAtomic.cpp	64 lines
clang/lib/Sema/SemaChecking.cpp	19 lines
clang/test/CodeGen/fp-atomic-ops.c	44 lines
clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu	41 lines
clang/test/CodeGenOpenCL/atomic-ops.cl	23 lines
clang/test/Sema/atomic-ops.c	23 lines
clang/test/SemaCUDA/amdgpu-atomic-ops.cu	25 lines
clang/test/SemaOpenCL/atomic-ops.cl	26 lines

Diff 322005

clang/include/clang/Basic/DiagnosticFrontendKinds.td

[266 lines hidden]
  def err_ifunc_resolver_return : Error<
    "ifunc resolver function must return a pointer">;

  def warn_atomic_op_misaligned : Warning<
    "misaligned atomic operation may incur "
    "significant performance penalty"
    "; the expected alignment (%0 bytes) exceeds the actual alignment (%1 bytes)">,
    InGroup<AtomicAlignment>;
+ def err_atomic_op_misaligned : Error<
+   "misaligned atomic operation not supported"
+   "; the expected alignment (%0 bytes) exceeds the actual alignment (%1 bytes)">;

  def warn_atomic_op_oversized : Warning<
    "large atomic operation may incur "
    "significant performance penalty"
    "; the access size (%0 bytes) exceeds the max lock-free size (%1 bytes)">,
    InGroup<AtomicAlignment>;
+ def err_atomic_op_oversized : Error<
+   "large atomic operation not supported"
+   "; the access size (%0 bytes) exceeds the max lock-free size (%1 bytes)">;

  def warn_alias_with_section : Warning<
    "%select{alias|ifunc}1 will not be in section '%0' but in the same section "
    "as the %select{aliasee|resolver}2">,
    InGroup<IgnoredAttributes>;

  let CategoryName = "Instrumentation Issue" in {
  def warn_profile_data_out_of_date : Warning<
[14 lines hidden]

clang/include/clang/Basic/DiagnosticSemaKinds.td

[8,153 lines hidden]	def err_atomic_op_needs_non_const_atomic : Error<
    "address argument to atomic operation must be a pointer to non-%select{const|constant}0 _Atomic "
    "type (%1 invalid)">;
  def err_atomic_op_needs_non_const_pointer : Error<
    "address argument to atomic operation must be a pointer to non-const "
    "type (%0 invalid)">;
  def err_atomic_op_needs_trivial_copy : Error<
    "address argument to atomic operation must be a pointer to a "
    "trivially-copyable type (%0 invalid)">;
+ def err_atomic_op_needs_atomic_int_ptr_or_fp : Error<
+   "address argument to atomic operation must be a pointer to %select{|atomic }0"
+   "integer, pointer or supported floating point type (%1 invalid)">;
  def err_atomic_op_needs_atomic_int_or_ptr : Error<
    "address argument to atomic operation must be a pointer to %select{|atomic }0"
    "integer or pointer (%1 invalid)">;
  def err_atomic_op_needs_atomic_int : Error<
    "address argument to atomic operation must be a pointer to "
    "%select{|atomic }0integer (%1 invalid)">;
  def warn_atomic_op_has_invalid_memory_order : Warning<
    "memory order argument to atomic operation is invalid">,
[2,956 lines hidden]

clang/include/clang/Basic/TargetInfo.h

[180 lines hidden]	class TargetInfo : public virtual TransferrableTargetInfo,
    std::shared_ptr<TargetOptions> TargetOpts;
    llvm::Triple Triple;
  protected:
    // Target values set by the ctor of the actual target implementation. Default
    // values are specified by the TargetInfo constructor.
    bool BigEndian;
    bool TLSSupported;
    bool VLASupported;
+   bool AtomicLibCallSupported;
    bool NoAsmVariants; // True if {|} are normal characters.
    bool HasLegalHalfType; // True if the backend supports operations on the half
                           // LLVM IR type.
    bool HasFloat128;
    bool HasFloat16;
    bool HasBFloat16;
    bool HasStrictFP;

[493 lines hidden]	public:
    /// operations at the specified width and alignment.
    virtual bool hasBuiltinAtomic(uint64_t AtomicSizeInBits,
                                  uint64_t AlignmentInBits) const {
      return AtomicSizeInBits <= AlignmentInBits &&
             AtomicSizeInBits <= getMaxAtomicInlineWidth() &&
             (AtomicSizeInBits <= getCharWidth() ||
              llvm::isPowerOf2_64(AtomicSizeInBits / getCharWidth()));
    }
+   /// Return true if the target supports atomic runtime library functions.
+   bool supportsAtomicLibCall() const { return AtomicLibCallSupported; }

    /// Return the maximum vector alignment supported for the given target.
    unsigned getMaxVectorAlign() const { return MaxVectorAlign; }
    /// Return default simd alignment for the given target. Generally, this
    /// value is type-specific, but this alignment can be used for most of the
    /// types for the given target.
    unsigned getSimdDefaultAlign() const { return SimdDefaultAlign; }

[797 lines hidden]	#include "clang/Basic/OpenCLExtensions.def"
    }

    virtual void setAuxTarget(const TargetInfo *Aux) {}

    /// Whether target allows debuginfo types for decl only variables.
    virtual bool allowDebugInfoForExternalVar() const { return false; }

  protected:
    /// Copy type and layout related info.
tra (inline comment): I think it should be predicated on specific type. E.g. NVPTX supports atomic ops on fp32 ~everywhere, but fp64 atomic add/sub is only supported on newer GPUs. And then there's fp16...
yaxunl (author, inline comment): will do and add tests for fp16
tra (inline comment): The number of bits alone may not be sufficient to differentiate the FP variants. E.g. 16-bit floats currently have 2 variants: IEEE FP16 and BFloat16 (supported by intel and newer NVIDIA GPUs). CUDA-11 has introduced TF32 FP format, so we're likely to have more than one 32-bit FP type, too. I think PPC has an odd `long double` variant represented as pair of 64-bit doubles.
yaxunl (author, inline comment): will use llvm::fltSemantics for checking, which should cover different fp types.
    void copyAuxTarget(const TargetInfo *Aux);
rjmccall (inline comment): This shouldn't be here; if you have places that don't always represent an atomic operation, queries for the kind should return an `Optional<AtomicOperationKind>` from the classification.
yaxunl (author, inline comment): Removed.
    virtual uint64_t getPointerWidthV(unsigned AddrSpace) const {
rjmccall (inline comment): `atomic_init` is not actually an atomic operation, so there's never an inherent reason it can't be supported. In general, I am torn about this list, because it's simultaneously rather fine-grained while not seeming nearly fine-grained enough to be truly general. What's actually going on on your target? You have ISA support for doing some specific operations atomically, but not a general atomic compare-and-swap operation? Which means that you then cannot support other operations? It is unfortunate that our layering prevents TargetInfo from simply being passed the appropriate expression.
yaxunl (author, inline comment): The target hook getAtomicSupport needs an argument for the atomic operation. Since not all targets support fp add/sub, we need an enum for add/sub. Since certain releases of iOS/macOS do not support C11 load/store, we need an enum for C11 load/store. We could define the enums as {AddSub, C11LoadStore, Other}. However, this would make it difficult to emit diagnostic messages for unsupported atomic operations, since we map this enum to a string for the atomic operation and use it in the diagnostic message. 'Other' would be mapped to 'other atomic operation', which does not make clear what the operation is.
rjmccall (inline comment): It's not obviously true that not all targets support FP add/sub, though. Any target that provides compare-and-swap at the width of an FP type can do an atomic FP add/sub at that width; it might be less efficient than it would be with specific ISA support, but that's true for a lot of atomic operations. Surely it's better to just fix whatever bugs LLVM has with lowering atomic FP add/sub than to add more abstraction to Clang to handle a special case that shouldn't exist. I don't know what issues Darwin has with C11 load/store; that might be a more compelling reason to have this abstraction, although again it seems strange that we're outlawing a specific operation when in principle we can just emit it less efficiently.
      return PointerWidth;
    }
    virtual uint64_t getPointerAlignV(unsigned AddrSpace) const {
      return PointerAlign;
    }
    virtual enum IntType getPtrDiffTypeV(unsigned AddrSpace) const {
      return PtrDiffType;
    }
    virtual ArrayRef<const char *> getGCCRegNames() const = 0;
    virtual ArrayRef<GCCRegAlias> getGCCRegAliases() const = 0;
    virtual ArrayRef<AddlRegName> getGCCAddlRegNames() const {
      return None;
    }

  private:
    // Assert the values for the fractional and integral bits for each fixed point
    // type follow the restrictions given in clause 6.2.6.3 of N1169.
    void CheckFixedPointBits() const;
rjmccall (inline comment): I think this reflects our current strategies for emitting atomics, but it's a somewhat misleading enum in general because this isn't an exhaustive list of the options; there are certainly possible inline expansions that aren't lock-free. (For example, you could have an inline spin-lock embedded in the atomic object.) The goal of this enum is so that TargetInfo only has to have one hook for checking atomic operations? I would be happier if you included an inline-but-not-lock-free alternative in this enum, even if it's never currently used, so that clients can do the right test.
yaxunl (author, inline comment): Added InlineWithLock
  };

  } // end namespace clang

rjmccall (inline comment): Why is this needed as a separate hook?
yaxunl (author, inline comment): Most targets share getAtomicSupport except for FP atomic support, so define a virtual function for FP atomic support and let getAtomicSupport call it.
  #endif

clang/lib/Basic/TargetInfo.cpp

[26 lines hidden]

  // TargetInfo Constructor.
  TargetInfo::TargetInfo(const llvm::Triple &T) : TargetOpts(), Triple(T) {
    // Set defaults. Defaults are set for a 32-bit RISC platform, like PPC or
    // SPARC. These should be overridden by concrete targets as needed.
    BigEndian = !T.isLittleEndian();
    TLSSupported = true;
    VLASupported = true;
+   AtomicLibCallSupported = true;
    NoAsmVariants = false;
    HasLegalHalfType = false;
    HasFloat128 = false;
    HasFloat16 = false;
    HasBFloat16 = false;
    HasStrictFP = false;
    PointerWidth = PointerAlign = 32;
    BoolWidth = BoolAlign = 8;
[798 lines hidden]	void TargetInfo::CheckFixedPointBits() const {
    assert(getAccumIBits() >= getUnsignedAccumIBits());
    assert(getLongAccumIBits() >= getUnsignedLongAccumIBits());
  }

  void TargetInfo::copyAuxTarget(const TargetInfo *Aux) {
    auto *Target = static_cast<TransferrableTargetInfo *>(this);
    auto *Src = static_cast<const TransferrableTargetInfo *>(Aux);
    *Target = *Src;
  }
rjmccall (inline comment): Darwin targets should all be subclasses of `DarwinTargetInfo` in OSTargets.h, so you should be able to just override this there instead of having it in the base case.
yaxunl (author, inline comment): done

clang/lib/Basic/Targets/AMDGPU.cpp

[331 lines hidden]	AMDGPUTargetInfo::AMDGPUTargetInfo(const llvm::Triple &Triple,
    if (getMaxPointerWidth() == 64) {
      LongWidth = LongAlign = 64;
      SizeType = UnsignedLong;
      PtrDiffType = SignedLong;
      IntPtrType = SignedLong;
    }

    MaxAtomicPromoteWidth = MaxAtomicInlineWidth = 64;
+   AtomicLibCallSupported = false;
  }

  void AMDGPUTargetInfo::adjust(LangOptions &Opts) {
    TargetInfo::adjust(Opts);
    // ToDo: There are still a few places using default address space as private
    // address space in OpenCL, which needs to be cleaned up, then Opts.OpenCL
    // can be removed from the following line.
    setAddressSpaceMap(/*DefaultIsPrivate=*/Opts.OpenCL ||
[84 lines hidden]

clang/lib/CodeGen/CGAtomic.cpp

Show First 20 Lines • Show All 596 Lines • ▼ Show 20 Lines	static void EmitAtomicOp(CodeGenFunction &CGF, AtomicExpr *E, Address Dest,
case AtomicExpr::AO__c11_atomic_exchange:		case AtomicExpr::AO__c11_atomic_exchange:
case AtomicExpr::AO__opencl_atomic_exchange:		case AtomicExpr::AO__opencl_atomic_exchange:
case AtomicExpr::AO__atomic_exchange_n:		case AtomicExpr::AO__atomic_exchange_n:
case AtomicExpr::AO__atomic_exchange:		case AtomicExpr::AO__atomic_exchange:
Op = llvm::AtomicRMWInst::Xchg;		Op = llvm::AtomicRMWInst::Xchg;
break;		break;

case AtomicExpr::AO__atomic_add_fetch:		case AtomicExpr::AO__atomic_add_fetch:
PostOp = llvm::Instruction::Add;		PostOp = E->getValueType()->isFloatingType() ? llvm::Instruction::FAdd
		: llvm::Instruction::Add;
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
		arsenmUnsubmitted Done Reply Inline Actions Should this really be based on the type, or should the builtin name be different for FP? arsenm: Should this really be based on the type, or should the builtin name be different for FP?
		yaxunlAuthorUnsubmitted Done Reply Inline Actions I think the original name is better. They are exactly what they are intended to be. They were not able to handle fp types therefore they used to emit diagnostics when fp types were passed to them. However now they are able to handle fp types. yaxunl: I think the original name is better. They are exactly what they are intended to be. They were…
case AtomicExpr::AO__c11_atomic_fetch_add:		case AtomicExpr::AO__c11_atomic_fetch_add:
case AtomicExpr::AO__opencl_atomic_fetch_add:		case AtomicExpr::AO__opencl_atomic_fetch_add:
case AtomicExpr::AO__atomic_fetch_add:		case AtomicExpr::AO__atomic_fetch_add:
Op = llvm::AtomicRMWInst::Add;		Op = E->getValueType()->isFloatingType() ? llvm::AtomicRMWInst::FAdd
		: llvm::AtomicRMWInst::Add;
break;		break;

case AtomicExpr::AO__atomic_sub_fetch:		case AtomicExpr::AO__atomic_sub_fetch:
PostOp = llvm::Instruction::Sub;		PostOp = E->getValueType()->isFloatingType() ? llvm::Instruction::FSub
		: llvm::Instruction::Sub;
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
case AtomicExpr::AO__c11_atomic_fetch_sub:		case AtomicExpr::AO__c11_atomic_fetch_sub:
case AtomicExpr::AO__opencl_atomic_fetch_sub:		case AtomicExpr::AO__opencl_atomic_fetch_sub:
case AtomicExpr::AO__atomic_fetch_sub:		case AtomicExpr::AO__atomic_fetch_sub:
Op = llvm::AtomicRMWInst::Sub;		Op = E->getValueType()->isFloatingType() ? llvm::AtomicRMWInst::FSub
		: llvm::AtomicRMWInst::Sub;
break;		break;

case AtomicExpr::AO__atomic_min_fetch:		case AtomicExpr::AO__atomic_min_fetch:
PostOpMinMax = true;		PostOpMinMax = true;
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
case AtomicExpr::AO__c11_atomic_fetch_min:		case AtomicExpr::AO__c11_atomic_fetch_min:
case AtomicExpr::AO__opencl_atomic_fetch_min:		case AtomicExpr::AO__opencl_atomic_fetch_min:
case AtomicExpr::AO__atomic_fetch_min:		case AtomicExpr::AO__atomic_fetch_min:
▲ Show 20 Lines • Show All 180 Lines • ▼ Show 20 Lines	RValue CodeGenFunction::EmitAtomicExpr(AtomicExpr *E) {

auto TInfo = getContext().getTypeInfoInChars(AtomicTy);		auto TInfo = getContext().getTypeInfoInChars(AtomicTy);
uint64_t Size = TInfo.Width.getQuantity();		uint64_t Size = TInfo.Width.getQuantity();
unsigned MaxInlineWidthInBits = getTarget().getMaxAtomicInlineWidth();		unsigned MaxInlineWidthInBits = getTarget().getMaxAtomicInlineWidth();

bool Oversized = getContext().toBits(TInfo.Width) > MaxInlineWidthInBits;		bool Oversized = getContext().toBits(TInfo.Width) > MaxInlineWidthInBits;
bool Misaligned = (Ptr.getAlignment() % TInfo.Width) != 0;		bool Misaligned = (Ptr.getAlignment() % TInfo.Width) != 0;
bool UseLibcall = Misaligned | Oversized;		bool UseLibcall = Misaligned | Oversized;
		bool ShouldCastToIntPtrTy = true;
		bool LibCallSupported = getTarget().supportsAtomicLibCall();

CharUnits MaxInlineWidth =		CharUnits MaxInlineWidth =
getContext().toCharUnitsFromBits(MaxInlineWidthInBits);		getContext().toCharUnitsFromBits(MaxInlineWidthInBits);

DiagnosticsEngine &Diags = CGM.getDiags();		DiagnosticsEngine &Diags = CGM.getDiags();

		if (Oversized) {
		Diags.Report(E->getBeginLoc(), LibCallSupported
		? diag::warn_atomic_op_oversized
		: diag::err_atomic_op_oversized)
		<< (int)TInfo.Width.getQuantity() << (int)MaxInlineWidth.getQuantity();
		if (!LibCallSupported)
		return RValue::get(nullptr);
		}

if (Misaligned) {		if (Misaligned) {
Diags.Report(E->getBeginLoc(), diag::warn_atomic_op_misaligned)		Diags.Report(E->getBeginLoc(), LibCallSupported
		? diag::warn_atomic_op_misaligned
		: diag::err_atomic_op_misaligned)
<< (int)TInfo.Width.getQuantity()		<< (int)TInfo.Width.getQuantity()
<< (int)Ptr.getAlignment().getQuantity();		<< (int)Ptr.getAlignment().getQuantity();
}		if (!LibCallSupported)
		return RValue::get(nullptr);
if (Oversized) {
Diags.Report(E->getBeginLoc(), diag::warn_atomic_op_oversized)
<< (int)TInfo.Width.getQuantity() << (int)MaxInlineWidth.getQuantity();
}		}

llvm::Value *Order = EmitScalarExpr(E->getOrder());		llvm::Value *Order = EmitScalarExpr(E->getOrder());
llvm::Value *Scope =		llvm::Value *Scope =
E->getScopeModel() ? EmitScalarExpr(E->getScope()) : nullptr;		E->getScopeModel() ? EmitScalarExpr(E->getScope()) : nullptr;

switch (E->getOp()) {		switch (E->getOp()) {
case AtomicExpr::AO__c11_atomic_init:		case AtomicExpr::AO__c11_atomic_init:
[... 49 lines not shown ...]	if (MemTy->isPointerType()) {
CharUnits PointeeIncAmt =		CharUnits PointeeIncAmt =
getContext().getTypeSizeInChars(MemTy->getPointeeType());		getContext().getTypeSizeInChars(MemTy->getPointeeType());
Val1Scalar = Builder.CreateMul(Val1Scalar, CGM.getSize(PointeeIncAmt));		Val1Scalar = Builder.CreateMul(Val1Scalar, CGM.getSize(PointeeIncAmt));
auto Temp = CreateMemTemp(Val1Ty, ".atomictmp");		auto Temp = CreateMemTemp(Val1Ty, ".atomictmp");
Val1 = Temp;		Val1 = Temp;
EmitStoreOfScalar(Val1Scalar, MakeAddrLValue(Temp, Val1Ty));		EmitStoreOfScalar(Val1Scalar, MakeAddrLValue(Temp, Val1Ty));
break;		break;
}		}
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
case AtomicExpr::AO__atomic_fetch_add:		case AtomicExpr::AO__atomic_fetch_add:
case AtomicExpr::AO__atomic_fetch_sub:		case AtomicExpr::AO__atomic_fetch_sub:
case AtomicExpr::AO__atomic_add_fetch:		case AtomicExpr::AO__atomic_add_fetch:
case AtomicExpr::AO__atomic_sub_fetch:		case AtomicExpr::AO__atomic_sub_fetch:
		ShouldCastToIntPtrTy = !MemTy->isFloatingType();
		LLVM_FALLTHROUGH;

		tra: `ShouldCastToIntPtrTy = !MemTy->isFloatingType();`
		yaxunl (author): done
case AtomicExpr::AO__c11_atomic_store:		case AtomicExpr::AO__c11_atomic_store:
case AtomicExpr::AO__c11_atomic_exchange:		case AtomicExpr::AO__c11_atomic_exchange:
case AtomicExpr::AO__opencl_atomic_store:		case AtomicExpr::AO__opencl_atomic_store:
case AtomicExpr::AO__opencl_atomic_exchange:		case AtomicExpr::AO__opencl_atomic_exchange:
case AtomicExpr::AO__atomic_store_n:		case AtomicExpr::AO__atomic_store_n:
case AtomicExpr::AO__atomic_exchange_n:		case AtomicExpr::AO__atomic_exchange_n:
case AtomicExpr::AO__c11_atomic_fetch_and:		case AtomicExpr::AO__c11_atomic_fetch_and:
case AtomicExpr::AO__c11_atomic_fetch_or:		case AtomicExpr::AO__c11_atomic_fetch_or:
[... 24 lines not shown ...]	RValue CodeGenFunction::EmitAtomicExpr(AtomicExpr *E) {
QualType RValTy = E->getType().getUnqualifiedType();		QualType RValTy = E->getType().getUnqualifiedType();

// The inlined atomics only function on iN types, where N is a power of 2. We		// The inlined atomics only function on iN types, where N is a power of 2. We
// need to make sure (via temporaries if necessary) that all incoming values		// need to make sure (via temporaries if necessary) that all incoming values
// are compatible.		// are compatible.
LValue AtomicVal = MakeAddrLValue(Ptr, AtomicTy);		LValue AtomicVal = MakeAddrLValue(Ptr, AtomicTy);
AtomicInfo Atomics(*this, AtomicVal);		AtomicInfo Atomics(*this, AtomicVal);

		if (ShouldCastToIntPtrTy) {
Ptr = Atomics.emitCastToAtomicIntPointer(Ptr);		Ptr = Atomics.emitCastToAtomicIntPointer(Ptr);
if (Val1.isValid()) Val1 = Atomics.convertToAtomicIntPointer(Val1);		if (Val1.isValid())
if (Val2.isValid()) Val2 = Atomics.convertToAtomicIntPointer(Val2);		Val1 = Atomics.convertToAtomicIntPointer(Val1);
		jyknight: convertToAtomicIntPointer does more than just cast to an int pointer. Are you sure the rest is not necessary for fp types?
		yaxunl (author): It is not needed for fp types. If the value type does not match the pointer type, clang automatically inserts the proper LLVM instructions to convert the value to a type that matches the pointer type. Two codegen tests are added (atomic_fetch_add(double, float) and atomic_fetch_add(double, int)) to test such situations.
if (Dest.isValid())		if (Val2.isValid())
		Val2 = Atomics.convertToAtomicIntPointer(Val2);
		}
		if (Dest.isValid()) {
		if (ShouldCastToIntPtrTy)
Dest = Atomics.emitCastToAtomicIntPointer(Dest);		Dest = Atomics.emitCastToAtomicIntPointer(Dest);
else if (E->isCmpXChg())		} else if (E->isCmpXChg())
Dest = CreateMemTemp(RValTy, "cmpxchg.bool");		Dest = CreateMemTemp(RValTy, "cmpxchg.bool");
else if (!RValTy->isVoidType())		else if (!RValTy->isVoidType()) {
Dest = Atomics.emitCastToAtomicIntPointer(Atomics.CreateTempAlloca());		Dest = Atomics.CreateTempAlloca();
		if (ShouldCastToIntPtrTy)
		Dest = Atomics.emitCastToAtomicIntPointer(Dest);
		}

// Use a library call. See: http://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary .		// Use a library call. See: http://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary .
if (UseLibcall) {		if (UseLibcall) {
bool UseOptimizedLibcall = false;		bool UseOptimizedLibcall = false;
switch (E->getOp()) {		switch (E->getOp()) {
case AtomicExpr::AO__c11_atomic_init:		case AtomicExpr::AO__c11_atomic_init:
case AtomicExpr::AO__opencl_atomic_init:		case AtomicExpr::AO__opencl_atomic_init:
llvm_unreachable("Already handled above with EmitAtomicInit!");		llvm_unreachable("Already handled above with EmitAtomicInit!");
[... 1,165 lines not shown ...]
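
The oversized/misaligned handling in the EmitAtomicExpr hunk above can be read as a three-way decision. The following is a minimal standalone C++ sketch of that decision, not clang code: `Lowering`, `classifyAtomicAccess`, and its parameters are hypothetical names introduced for illustration, sizes are passed as plain integers instead of `CharUnits`, and 8-bit bytes are assumed.

```cpp
#include <cstdint>

// Hypothetical outcome of the checks; these names do not exist in clang.
enum class Lowering {
  Inline,              // neither oversized nor misaligned
  LibcallWithWarning,  // libcall needed and the target supports it
  Error                // libcall needed but the target has no atomic libcalls
};

// Models the Oversized / Misaligned / LibCallSupported logic from the hunk.
// Assumption: sizeInBytes is nonzero and a byte is 8 bits.
inline Lowering classifyAtomicAccess(std::uint64_t sizeInBytes,
                                     std::uint64_t alignInBytes,
                                     std::uint64_t maxInlineWidthInBits,
                                     bool libCallSupported) {
  const bool oversized = sizeInBytes * 8 > maxInlineWidthInBits;
  const bool misaligned = (alignInBytes % sizeInBytes) != 0;
  if (!oversized && !misaligned)
    return Lowering::Inline;
  // Either condition forces a library call; without libcall support the
  // patch reports an error instead of a warning and emits nothing.
  return libCallSupported ? Lowering::LibcallWithWarning : Lowering::Error;
}
```

For instance, a 16-byte access on a target whose maximum inline width is 64 bits would be classified as `LibcallWithWarning` when libcalls are available and as `Error` otherwise.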

clang/lib/Sema/SemaChecking.cpp


[... 4,788 lines not shown ...]	ExprResult Sema::BuildAtomicExpr(SourceRange CallRange, SourceRange ExprRange,
case AtomicExpr::AO__c11_atomic_fetch_sub:		case AtomicExpr::AO__c11_atomic_fetch_sub:
case AtomicExpr::AO__opencl_atomic_fetch_add:		case AtomicExpr::AO__opencl_atomic_fetch_add:
case AtomicExpr::AO__opencl_atomic_fetch_sub:		case AtomicExpr::AO__opencl_atomic_fetch_sub:
case AtomicExpr::AO__atomic_fetch_add:		case AtomicExpr::AO__atomic_fetch_add:
case AtomicExpr::AO__atomic_fetch_sub:		case AtomicExpr::AO__atomic_fetch_sub:
case AtomicExpr::AO__atomic_add_fetch:		case AtomicExpr::AO__atomic_add_fetch:
case AtomicExpr::AO__atomic_sub_fetch:		case AtomicExpr::AO__atomic_sub_fetch:
IsAddSub = true;		IsAddSub = true;
LLVM_FALLTHROUGH;		Form = Arithmetic;
		break;
case AtomicExpr::AO__c11_atomic_fetch_and:		case AtomicExpr::AO__c11_atomic_fetch_and:
case AtomicExpr::AO__c11_atomic_fetch_or:		case AtomicExpr::AO__c11_atomic_fetch_or:
case AtomicExpr::AO__c11_atomic_fetch_xor:		case AtomicExpr::AO__c11_atomic_fetch_xor:
case AtomicExpr::AO__opencl_atomic_fetch_and:		case AtomicExpr::AO__opencl_atomic_fetch_and:
case AtomicExpr::AO__opencl_atomic_fetch_or:		case AtomicExpr::AO__opencl_atomic_fetch_or:
case AtomicExpr::AO__opencl_atomic_fetch_xor:		case AtomicExpr::AO__opencl_atomic_fetch_xor:
case AtomicExpr::AO__atomic_fetch_and:		case AtomicExpr::AO__atomic_fetch_and:
case AtomicExpr::AO__atomic_fetch_or:		case AtomicExpr::AO__atomic_fetch_or:
case AtomicExpr::AO__atomic_fetch_xor:		case AtomicExpr::AO__atomic_fetch_xor:
case AtomicExpr::AO__atomic_fetch_nand:		case AtomicExpr::AO__atomic_fetch_nand:
case AtomicExpr::AO__atomic_and_fetch:		case AtomicExpr::AO__atomic_and_fetch:
case AtomicExpr::AO__atomic_or_fetch:		case AtomicExpr::AO__atomic_or_fetch:
case AtomicExpr::AO__atomic_xor_fetch:		case AtomicExpr::AO__atomic_xor_fetch:
case AtomicExpr::AO__atomic_nand_fetch:		case AtomicExpr::AO__atomic_nand_fetch:
		Form = Arithmetic;
		break;
case AtomicExpr::AO__c11_atomic_fetch_min:		case AtomicExpr::AO__c11_atomic_fetch_min:
case AtomicExpr::AO__c11_atomic_fetch_max:		case AtomicExpr::AO__c11_atomic_fetch_max:
case AtomicExpr::AO__opencl_atomic_fetch_min:		case AtomicExpr::AO__opencl_atomic_fetch_min:
case AtomicExpr::AO__opencl_atomic_fetch_max:		case AtomicExpr::AO__opencl_atomic_fetch_max:
case AtomicExpr::AO__atomic_min_fetch:		case AtomicExpr::AO__atomic_min_fetch:
case AtomicExpr::AO__atomic_max_fetch:		case AtomicExpr::AO__atomic_max_fetch:
case AtomicExpr::AO__atomic_fetch_min:		case AtomicExpr::AO__atomic_fetch_min:
case AtomicExpr::AO__atomic_fetch_max:		case AtomicExpr::AO__atomic_fetch_max:
[... 77 lines not shown ...]	if (ValType.isConstQualified()) {
<< Ptr->getType() << Ptr->getSourceRange();		<< Ptr->getType() << Ptr->getSourceRange();
return ExprError();		return ExprError();
}		}
}		}

// For an arithmetic operation, the implied arithmetic must be well-formed.		// For an arithmetic operation, the implied arithmetic must be well-formed.
if (Form == Arithmetic) {		if (Form == Arithmetic) {
// gcc does not enforce these rules for GNU atomics, but we do so for sanity.		// gcc does not enforce these rules for GNU atomics, but we do so for sanity.
if (IsAddSub && !ValType->isIntegerType()		if (IsAddSub && !ValType->isIntegerType() && !ValType->isPointerType() &&
&& !ValType->isPointerType()) {		!ValType->isFloatingType()) {
Diag(ExprRange.getBegin(), diag::err_atomic_op_needs_atomic_int_or_ptr)		Diag(ExprRange.getBegin(), diag::err_atomic_op_needs_atomic_int_ptr_or_fp)
		rjmccall: Does LLVM support atomics on all floating-point types?
		yaxunl (author): The LLVM IR parser requires the atomicrmw value operand to have a size that is a power of 2, so LLVM does not support atomicrmw on x86_fp80, which has a size of 80 bits. LLVM supports atomicrmw on all other floating-point types (bfloat, half, float, double, fp128, ppc_fp128).
		rjmccall: Okay. So this needs to check the underlying FP semantics and disallow atomics on unsupported types.
		yaxunl (author): will do
		rjmccall [not done]: Could you extract this whole condition into a function and make it a bit more readable?
<< IsC11 << Ptr->getType() << Ptr->getSourceRange();		<< IsC11 << Ptr->getType() << Ptr->getSourceRange();
return ExprError();		return ExprError();
}		}
if (!IsAddSub && !ValType->isIntegerType()) {		if (!IsAddSub && !ValType->isIntegerType()) {
Diag(ExprRange.getBegin(), diag::err_atomic_op_needs_atomic_int)		Diag(ExprRange.getBegin(), diag::err_atomic_op_needs_atomic_int)
<< IsC11 << Ptr->getType() << Ptr->getSourceRange();		<< IsC11 << Ptr->getType() << Ptr->getSourceRange();
return ExprError();		return ExprError();
}		}
[... 110 lines not shown ...]	if (i < NumVals[Form] + 1) {
// Nothing else to do: we already know all we want about this pointer.		// Nothing else to do: we already know all we want about this pointer.
continue;		continue;
case 1:		case 1:
// The second argument is the non-atomic operand. For arithmetic, this		// The second argument is the non-atomic operand. For arithmetic, this
// is always passed by value, and for a compare_exchange it is always		// is always passed by value, and for a compare_exchange it is always
// passed by address. For the rest, GNU uses by-address and C11 uses		// passed by address. For the rest, GNU uses by-address and C11 uses
// by-value.		// by-value.
assert(Form != Load);		assert(Form != Load);
if (Form == Init || (Form == Arithmetic && ValType->isIntegerType()))		if (Form == Arithmetic && ValType->isPointerType())
		Ty = Context.getPointerDiffType();
		else if (Form == Init || Form == Arithmetic)
		jyknight: This is confusing, and took me a bit to understand what you're doing. I'd suggest reordering the clauses, putting the pointer case first, e.g.:
		    if (Form == Arithmetic && ValType->isPointerType())
		      Ty = Context.getPointerDiffType();
		    else if (Form == Init || Form == Arithmetic)
		      Ty = ValType;
		    else if (Form == Copy || Form == Xchg)
		      .....
		    else
		      ......
		    ...
		yaxunl (author): done
Ty = ValType;		Ty = ValType;
else if (Form == Copy || Form == Xchg) {		else if (Form == Copy || Form == Xchg) {
if (IsPassedByAddress) {		if (IsPassedByAddress) {
// The value pointer is always dereferenced, a nullptr is undefined.		// The value pointer is always dereferenced, a nullptr is undefined.
CheckNonNullArgument(*this, APIOrderedArgs[i],		CheckNonNullArgument(*this, APIOrderedArgs[i],
ExprRange.getBegin());		ExprRange.getBegin());
}		}
Ty = ByValType;		Ty = ByValType;
} else if (Form == Arithmetic)		} else {
Ty = Context.getPointerDiffType();
else {
Expr *ValArg = APIOrderedArgs[i];		Expr *ValArg = APIOrderedArgs[i];
// The value pointer is always dereferenced, a nullptr is undefined.		// The value pointer is always dereferenced, a nullptr is undefined.
CheckNonNullArgument(*this, ValArg, ExprRange.getBegin());		CheckNonNullArgument(*this, ValArg, ExprRange.getBegin());
LangAS AS = LangAS::Default;		LangAS AS = LangAS::Default;
// Keep address space of non-atomic pointer type.		// Keep address space of non-atomic pointer type.
if (const PointerType *PtrTy =		if (const PointerType *PtrTy =
ValArg->getType()->getAs<PointerType>()) {		ValArg->getType()->getAs<PointerType>()) {
AS = PtrTy->getPointeeType().getAddressSpace();		AS = PtrTy->getPointeeType().getAddressSpace();
[... 9,991 lines not shown ...]
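
The power-of-two size constraint raised in the SemaChecking.cpp review thread above can be modeled in a few lines. This is a hedged standalone sketch, not the check that landed in clang: `FPKind`, `widthInBits`, and `supportsAtomicRMW` are hypothetical names, and the widths are the storage widths in bits of the LLVM types listed in that thread.

```cpp
#include <cstdint>

// Hypothetical model of the floating-point types named in the review thread.
enum class FPKind { BFloat, Half, Float, Double, X86FP80, FP128, PPCFP128 };

// Width of each modeled type in bits.
inline std::uint32_t widthInBits(FPKind kind) {
  switch (kind) {
  case FPKind::BFloat:   return 16;
  case FPKind::Half:     return 16;
  case FPKind::Float:    return 32;
  case FPKind::Double:   return 64;
  case FPKind::X86FP80:  return 80;
  case FPKind::FP128:    return 128;
  case FPKind::PPCFP128: return 128;
  }
  return 0;  // unreachable for valid enumerators
}

// True when `bits` is a nonzero power of two.
inline bool isPowerOfTwo(std::uint32_t bits) {
  return bits != 0 && (bits & (bits - 1)) == 0;
}

// Mirrors the rule stated in the thread: atomicrmw needs a power-of-two size.
inline bool supportsAtomicRMW(FPKind kind) {
  return isPowerOfTwo(widthInBits(kind));
}
```

Under this model only `FPKind::X86FP80` is rejected, which matches the author's statement that x86_fp80 is the one unsupported type.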

clang/test/CodeGen/fp-atomic-ops.c

This file was added.

				// RUN: %clang_cc1 %s -emit-llvm -DDOUBLE -O0 -o - -triple=amdgcn-amd-amdhsa \
				// RUN: | opt -instnamer -S | FileCheck -check-prefixes=FLOAT,DOUBLE %s

				// RUN: %clang_cc1 %s -emit-llvm -DDOUBLE -O0 -o - -triple=aarch64-linux-gnu \
				// RUN: | opt -instnamer -S | FileCheck -check-prefixes=FLOAT,DOUBLE %s

				// RUN: %clang_cc1 %s -emit-llvm -O0 -o - -triple=armv8-apple-ios7.0 \
				// RUN: | opt -instnamer -S | FileCheck -check-prefixes=FLOAT %s

				// RUN: %clang_cc1 %s -emit-llvm -DDOUBLE -O0 -o - -triple=hexagon \
				// RUN: | opt -instnamer -S | FileCheck -check-prefixes=FLOAT,DOUBLE %s

				// RUN: %clang_cc1 %s -emit-llvm -DDOUBLE -O0 -o - -triple=mips64-mti-linux-gnu \
				// RUN: | opt -instnamer -S | FileCheck -check-prefixes=FLOAT,DOUBLE %s

				// RUN: %clang_cc1 %s -emit-llvm -O0 -o - -triple=i686-linux-gnu \
				// RUN: | opt -instnamer -S | FileCheck -check-prefixes=FLOAT %s

				// RUN: %clang_cc1 %s -emit-llvm -DDOUBLE -O0 -o - -triple=x86_64-linux-gnu \
				// RUN: | opt -instnamer -S | FileCheck -check-prefixes=FLOAT,DOUBLE %s

				typedef enum memory_order {
				memory_order_relaxed = __ATOMIC_RELAXED,
				memory_order_acquire = __ATOMIC_ACQUIRE,
				memory_order_release = __ATOMIC_RELEASE,
				memory_order_acq_rel = __ATOMIC_ACQ_REL,
				memory_order_seq_cst = __ATOMIC_SEQ_CST
				} memory_order;

				void test(float *f, float ff, double *d, double dd) {
				// FLOAT: atomicrmw fadd float* {{.*}} monotonic
				__atomic_fetch_add(f, ff, memory_order_relaxed);

				// FLOAT: atomicrmw fsub float* {{.*}} monotonic
				__atomic_fetch_sub(f, ff, memory_order_relaxed);

				#ifdef DOUBLE
				// DOUBLE: atomicrmw fadd double* {{.*}} monotonic
				__atomic_fetch_add(d, dd, memory_order_relaxed);

				// DOUBLE: atomicrmw fsub double* {{.*}} monotonic
				__atomic_fetch_sub(d, dd, memory_order_relaxed);
				#endif
				}
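
On a compiler without this patch, the behavior that the FLOAT checks above expect from `atomicrmw fadd` (add atomically and return the previous value) can be approximated in portable C++ with a compare-exchange loop. This is only a sketch of the semantics: `fetch_add_float` is a hypothetical helper, and the code makes no claim about how clang or any backend actually lowers the instruction.

```cpp
#include <atomic>

// Hypothetical helper, not part of the patch: emulates an atomic
// floating-point fetch-add with a compare-exchange loop.
inline float fetch_add_float(std::atomic<float> &obj, float operand,
                             std::memory_order order = std::memory_order_relaxed) {
  float expected = obj.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak stores the currently held value into
  // `expected`, so the sum is recomputed from fresh data on each retry.
  while (!obj.compare_exchange_weak(expected, expected + operand, order,
                                    std::memory_order_relaxed)) {
  }
  return expected;  // the value held immediately before the addition
}
```

As with `__atomic_fetch_add`, the helper returns the old value, so a caller that wants the updated value has to add the operand to the result itself.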

clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu

This file was added.

				// RUN: %clang_cc1 %s -emit-llvm -o - -triple=amdgcn-amd-amdhsa \
				// RUN: -fcuda-is-device -target-cpu gfx906 -fnative-half-type \
				// RUN: -fnative-half-arguments-and-returns | FileCheck %s

				// REQUIRES: amdgpu-registered-target

				#include "Inputs/cuda.h"
				#include <stdatomic.h>

				__device__ float ffp1(float *p) {
				// CHECK-LABEL: @_Z4ffp1Pf
				// CHECK: atomicrmw fadd float* {{.*}} monotonic
				return __atomic_fetch_add(p, 1.0f, memory_order_relaxed);
				}

				__device__ double ffp2(double *p) {
				// CHECK-LABEL: @_Z4ffp2Pd
				// CHECK: atomicrmw fsub double* {{.*}} monotonic
				return __atomic_fetch_sub(p, 1.0, memory_order_relaxed);
				}

				// long double is the same as double for amdgcn.
				__device__ long double ffp3(long double *p) {
				// CHECK-LABEL: @_Z4ffp3Pe
				// CHECK: atomicrmw fsub double* {{.*}} monotonic
				return __atomic_fetch_sub(p, 1.0L, memory_order_relaxed);
				}
				ldionne: Nitpick, but this should be `1.0L` to be consistent.
				yaxunl (author): done

				__device__ double ffp4(double *p, float f) {
				// CHECK-LABEL: @_Z4ffp4Pdf
				// CHECK: fpext float {{.*}} to double
				// CHECK: atomicrmw fsub double* {{.*}} monotonic
				return __atomic_fetch_sub(p, f, memory_order_relaxed);
				}

				__device__ double ffp5(double *p, int i) {
				// CHECK-LABEL: @_Z4ffp5Pdi
				// CHECK: sitofp i32 {{.*}} to double
				// CHECK: atomicrmw fsub double* {{.*}} monotonic
				return __atomic_fetch_sub(p, i, memory_order_relaxed);
				}
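
The ffp4 and ffp5 cases above check that a `float` or `int` operand is converted to `double` (via `fpext` or `sitofp`) before the atomic operation runs. A rough portable analogue, assuming a compare-exchange loop in place of the builtin, is sketched below; `fetch_sub_converted` is a hypothetical name and is not part of the patch or of any library.

```cpp
#include <atomic>

// Hypothetical helper: converts the operand to the pointee type (double)
// first, then performs an atomic fetch-sub with a compare-exchange loop.
template <typename T>
double fetch_sub_converted(std::atomic<double> &obj, T operand) {
  // Plays the role of the fpext / sitofp that the tests above look for.
  const double converted = static_cast<double>(operand);
  double expected = obj.load(std::memory_order_relaxed);
  // On failure, `expected` is refreshed with the current value and the
  // difference is recomputed.
  while (!obj.compare_exchange_weak(expected, expected - converted,
                                    std::memory_order_relaxed,
                                    std::memory_order_relaxed)) {
  }
  return expected;  // the value held before the subtraction
}
```

The point of the sketch is ordering: the conversion happens once, outside the atomic update, so the atomic operation itself only ever sees values of the pointee type.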

clang/test/CodeGenOpenCL/atomic-ops.cl

	// RUN: %clang_cc1 %s -cl-std=CL2.0 -emit-llvm -O0 -o - -triple=amdgcn-amd-amdhsa-amdgizcl | opt -instnamer -S | FileCheck %s			// RUN: %clang_cc1 %s -cl-std=CL2.0 -emit-llvm -O0 -o - -triple=amdgcn-amd-amdhsa \
				// RUN: | opt -instnamer -S | FileCheck %s

	// Also test serialization of atomic operations here, to avoid duplicating the test.			// Also test serialization of atomic operations here, to avoid duplicating the test.
	// RUN: %clang_cc1 %s -cl-std=CL2.0 -emit-pch -O0 -o %t -triple=amdgcn-amd-amdhsa-amdgizcl			// RUN: %clang_cc1 %s -cl-std=CL2.0 -emit-pch -O0 -o %t -triple=amdgcn-amd-amdhsa
	// RUN: %clang_cc1 %s -cl-std=CL2.0 -include-pch %t -O0 -triple=amdgcn-amd-amdhsa-amdgizcl -emit-llvm -o - | opt -instnamer -S | FileCheck %s			// RUN: %clang_cc1 %s -cl-std=CL2.0 -include-pch %t -O0 -triple=amdgcn-amd-amdhsa \
				// RUN: -emit-llvm -o - | opt -instnamer -S | FileCheck %s

	#ifndef ALREADY_INCLUDED			#ifndef ALREADY_INCLUDED
	#define ALREADY_INCLUDED			#define ALREADY_INCLUDED

				#pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable
				#pragma OPENCL EXTENSION cl_khr_int64_extended_atomics : enable

	typedef __INTPTR_TYPE__ intptr_t;			typedef __INTPTR_TYPE__ intptr_t;
	typedef int int8 __attribute__((ext_vector_type(8)));			typedef int int8 __attribute__((ext_vector_type(8)));

	typedef enum memory_order {			typedef enum memory_order {
	memory_order_relaxed = __ATOMIC_RELAXED,			memory_order_relaxed = __ATOMIC_RELAXED,
	memory_order_acquire = __ATOMIC_ACQUIRE,			memory_order_acquire = __ATOMIC_ACQUIRE,
	memory_order_release = __ATOMIC_RELEASE,			memory_order_release = __ATOMIC_RELEASE,
	memory_order_acq_rel = __ATOMIC_ACQ_REL,			memory_order_acq_rel = __ATOMIC_ACQ_REL,
	[... 162 lines not shown ...]
	}			}

	float ff3(atomic_float *d) {			float ff3(atomic_float *d) {
	// CHECK-LABEL: @ff3			// CHECK-LABEL: @ff3
	// CHECK: atomicrmw xchg i32* {{.*}} syncscope("workgroup") seq_cst			// CHECK: atomicrmw xchg i32* {{.*}} syncscope("workgroup") seq_cst
	return __opencl_atomic_exchange(d, 2, memory_order_seq_cst, memory_scope_work_group);			return __opencl_atomic_exchange(d, 2, memory_order_seq_cst, memory_scope_work_group);
	}			}

				float ff4(global atomic_float *d, float a) {
				// CHECK-LABEL: @ff4
				// CHECK: atomicrmw fadd float addrspace(1)* {{.*}} syncscope("workgroup-one-as") monotonic
				return __opencl_atomic_fetch_add(d, a, memory_order_relaxed, memory_scope_work_group);
				}

				float ff5(global atomic_double *d, double a) {
				// CHECK-LABEL: @ff5
				// CHECK: atomicrmw fadd double addrspace(1)* {{.*}} syncscope("workgroup-one-as") monotonic
				return __opencl_atomic_fetch_add(d, a, memory_order_relaxed, memory_scope_work_group);
				}

	// CHECK-LABEL: @atomic_init_foo			// CHECK-LABEL: @atomic_init_foo
	void atomic_init_foo()			void atomic_init_foo()
	{			{
	// CHECK-NOT: atomic			// CHECK-NOT: atomic
	// CHECK: store			// CHECK: store
	__opencl_atomic_init(&j, 42);			__opencl_atomic_init(&j, 42);

	// CHECK-NOT: atomic			// CHECK-NOT: atomic
	[... 96 lines not shown ...]

clang/test/Sema/atomic-ops.c

[... 93 lines not shown ...]
_Static_assert(__atomic_always_lock_free(4, &i64), "");		_Static_assert(__atomic_always_lock_free(4, &i64), "");
_Static_assert(!__atomic_always_lock_free(8, &i32), "");		_Static_assert(!__atomic_always_lock_free(8, &i32), "");
_Static_assert(__atomic_always_lock_free(8, &i64), "");		_Static_assert(__atomic_always_lock_free(8, &i64), "");

#define _AS1 __attribute__((address_space(1)))		#define _AS1 __attribute__((address_space(1)))
#define _AS2 __attribute__((address_space(2)))		#define _AS2 __attribute__((address_space(2)))

void f(_Atomic(int) i, const _Atomic(int) ci,		void f(_Atomic(int) i, const _Atomic(int) ci,
_Atomic(int) p, _Atomic(float) *d,		_Atomic(int) p, _Atomic(float) f, _Atomic(double) d,
		_Atomic(long double) *ld,
int I, const int CI,		int I, const int CI,
		tra: Rename arguments? d -> f, d2 -> d, d3 -> ld?
		yaxunl (author): done
int *P, float D, struct S s1, struct S s2) {		int *P, float D, struct S s1, struct S s2) {
__c11_atomic_init(I, 5); // expected-error {{pointer to _Atomic}}		__c11_atomic_init(I, 5); // expected-error {{pointer to _Atomic}}
__c11_atomic_init(ci, 5); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const _Atomic(int) *' invalid)}}		__c11_atomic_init(ci, 5); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const _Atomic(int) *' invalid)}}

__c11_atomic_load(0); // expected-error {{too few arguments to function}}		__c11_atomic_load(0); // expected-error {{too few arguments to function}}
__c11_atomic_load(0,0,0); // expected-error {{too many arguments to function}}		__c11_atomic_load(0,0,0); // expected-error {{too many arguments to function}}
__c11_atomic_store(0,0,0); // expected-error {{address argument to atomic builtin must be a pointer}}		__c11_atomic_store(0,0,0); // expected-error {{address argument to atomic builtin must be a pointer}}
__c11_atomic_store((int*)0,0,0); // expected-error {{address argument to atomic operation must be a pointer to _Atomic}}		__c11_atomic_store((int*)0,0,0); // expected-error {{address argument to atomic operation must be a pointer to _Atomic}}
__c11_atomic_store(i, 0, memory_order_relaxed);		__c11_atomic_store(i, 0, memory_order_relaxed);
__c11_atomic_store(ci, 0, memory_order_relaxed); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const _Atomic(int) *' invalid)}}		__c11_atomic_store(ci, 0, memory_order_relaxed); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const _Atomic(int) *' invalid)}}

__c11_atomic_load(i, memory_order_seq_cst);		__c11_atomic_load(i, memory_order_seq_cst);
__c11_atomic_load(p, memory_order_seq_cst);		__c11_atomic_load(p, memory_order_seq_cst);
__c11_atomic_load(d, memory_order_seq_cst);		__c11_atomic_load(f, memory_order_seq_cst);
__c11_atomic_load(ci, memory_order_seq_cst);		__c11_atomic_load(ci, memory_order_seq_cst);

int load_n_1 = __atomic_load_n(I, memory_order_relaxed);		int load_n_1 = __atomic_load_n(I, memory_order_relaxed);
int *load_n_2 = __atomic_load_n(P, memory_order_relaxed);		int *load_n_2 = __atomic_load_n(P, memory_order_relaxed);
float load_n_3 = __atomic_load_n(D, memory_order_relaxed); // expected-error {{must be a pointer to integer or pointer}}		float load_n_3 = __atomic_load_n(D, memory_order_relaxed); // expected-error {{must be a pointer to integer or pointer}}
__atomic_load_n(s1, memory_order_relaxed); // expected-error {{must be a pointer to integer or pointer}}		__atomic_load_n(s1, memory_order_relaxed); // expected-error {{must be a pointer to integer or pointer}}
load_n_1 = __atomic_load_n(CI, memory_order_relaxed);		load_n_1 = __atomic_load_n(CI, memory_order_relaxed);

__atomic_load(i, I, memory_order_relaxed); // expected-error {{must be a pointer to a trivially-copyable type}}		__atomic_load(i, I, memory_order_relaxed); // expected-error {{must be a pointer to a trivially-copyable type}}
__atomic_load(CI, I, memory_order_relaxed);		__atomic_load(CI, I, memory_order_relaxed);

__atomic_load(I, i, memory_order_relaxed); // expected-warning {{passing '_Atomic(int) ' to parameter of type 'int '}}		__atomic_load(I, i, memory_order_relaxed); // expected-warning {{passing '_Atomic(int) ' to parameter of type 'int '}}
__atomic_load(I, *P, memory_order_relaxed);		__atomic_load(I, *P, memory_order_relaxed);
__atomic_load(I, *P, memory_order_relaxed, 42); // expected-error {{too many arguments}}		__atomic_load(I, *P, memory_order_relaxed, 42); // expected-error {{too many arguments}}
(int)__atomic_load(I, I, memory_order_seq_cst); // expected-error {{operand of type 'void'}}		(int)__atomic_load(I, I, memory_order_seq_cst); // expected-error {{operand of type 'void'}}
__atomic_load(s1, s2, memory_order_acquire);		__atomic_load(s1, s2, memory_order_acquire);
__atomic_load(CI, I, memory_order_relaxed);		__atomic_load(CI, I, memory_order_relaxed);
__atomic_load(I, CI, memory_order_relaxed); // expected-warning {{passing 'const int ' to parameter of type 'int ' discards qualifiers}}		__atomic_load(I, CI, memory_order_relaxed); // expected-warning {{passing 'const int ' to parameter of type 'int ' discards qualifiers}}
__atomic_load(CI, CI, memory_order_relaxed); // expected-warning {{passing 'const int ' to parameter of type 'int ' discards qualifiers}}		__atomic_load(CI, CI, memory_order_relaxed); // expected-warning {{passing 'const int ' to parameter of type 'int ' discards qualifiers}}

__c11_atomic_store(i, 1, memory_order_seq_cst);		__c11_atomic_store(i, 1, memory_order_seq_cst);
__c11_atomic_store(p, 1, memory_order_seq_cst); // expected-warning {{incompatible integer to pointer conversion}}		__c11_atomic_store(p, 1, memory_order_seq_cst); // expected-warning {{incompatible integer to pointer conversion}}
(int)__c11_atomic_store(d, 1, memory_order_seq_cst); // expected-error {{operand of type 'void'}}		(int)__c11_atomic_store(f, 1, memory_order_seq_cst); // expected-error {{operand of type 'void'}}

__atomic_store_n(I, 4, memory_order_release);		__atomic_store_n(I, 4, memory_order_release);
__atomic_store_n(I, 4.0, memory_order_release);		__atomic_store_n(I, 4.0, memory_order_release);
__atomic_store_n(CI, 4, memory_order_release); // expected-error {{address argument to atomic operation must be a pointer to non-const type ('const int *' invalid)}}		__atomic_store_n(CI, 4, memory_order_release); // expected-error {{address argument to atomic operation must be a pointer to non-const type ('const int *' invalid)}}
__atomic_store_n(I, P, memory_order_release); // expected-warning {{parameter of type 'int'}}		__atomic_store_n(I, P, memory_order_release); // expected-warning {{parameter of type 'int'}}
__atomic_store_n(i, 1, memory_order_release); // expected-error {{must be a pointer to integer or pointer}}		__atomic_store_n(i, 1, memory_order_release); // expected-error {{must be a pointer to integer or pointer}}
__atomic_store_n(s1, *s2, memory_order_release); // expected-error {{must be a pointer to integer or pointer}}		__atomic_store_n(s1, *s2, memory_order_release); // expected-error {{must be a pointer to integer or pointer}}
__atomic_store_n(I, I, memory_order_release); // expected-warning {{incompatible pointer to integer conversion passing 'int ' to parameter of type 'int'; dereference with }}		__atomic_store_n(I, I, memory_order_release); // expected-warning {{incompatible pointer to integer conversion passing 'int ' to parameter of type 'int'; dereference with }}
[... 12 lines not shown ...]	void f(_Atomic(int) i, const _Atomic(int) ci,
__atomic_exchange(s1, I, P, memory_order_seq_cst); // expected-warning 2{{parameter of type 'struct S *'}}		__atomic_exchange(s1, I, P, memory_order_seq_cst); // expected-warning 2{{parameter of type 'struct S *'}}
(int)__atomic_exchange(s1, s2, s2, memory_order_seq_cst); // expected-error {{operand of type 'void'}}		(int)__atomic_exchange(s1, s2, s2, memory_order_seq_cst); // expected-error {{operand of type 'void'}}
__atomic_exchange(I, I, I, memory_order_seq_cst);		__atomic_exchange(I, I, I, memory_order_seq_cst);
__atomic_exchange(CI, I, I, memory_order_seq_cst); // expected-error {{address argument to atomic operation must be a pointer to non-const type ('const int *' invalid)}}		__atomic_exchange(CI, I, I, memory_order_seq_cst); // expected-error {{address argument to atomic operation must be a pointer to non-const type ('const int *' invalid)}}
__atomic_exchange(I, I, CI, memory_order_seq_cst); // expected-warning {{passing 'const int ' to parameter of type 'int ' discards qualifiers}}		__atomic_exchange(I, I, CI, memory_order_seq_cst); // expected-warning {{passing 'const int ' to parameter of type 'int ' discards qualifiers}}

__c11_atomic_fetch_add(i, 1, memory_order_seq_cst);		__c11_atomic_fetch_add(i, 1, memory_order_seq_cst);
__c11_atomic_fetch_add(p, 1, memory_order_seq_cst);		__c11_atomic_fetch_add(p, 1, memory_order_seq_cst);
__c11_atomic_fetch_add(d, 1, memory_order_seq_cst); // expected-error {{must be a pointer to atomic integer or pointer}}		__c11_atomic_fetch_add(f, 1.0f, memory_order_seq_cst);
		__c11_atomic_fetch_add(d, 1.0, memory_order_seq_cst);
		__c11_atomic_fetch_add(ld, 1.0, memory_order_seq_cst);

__atomic_fetch_add(i, 3, memory_order_seq_cst); // expected-error {{pointer to integer or pointer}}		__atomic_fetch_add(i, 3, memory_order_seq_cst); // expected-error {{pointer to integer, pointer or supported floating point type}}
__atomic_fetch_sub(I, 3, memory_order_seq_cst);		__atomic_fetch_sub(I, 3, memory_order_seq_cst);
__atomic_fetch_sub(P, 3, memory_order_seq_cst);		__atomic_fetch_sub(P, 3, memory_order_seq_cst);
__atomic_fetch_sub(D, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer or pointer}}		__atomic_fetch_sub(D, 3, memory_order_seq_cst);
__atomic_fetch_sub(s1, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer or pointer}}		__atomic_fetch_sub(s1, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer, pointer or supported floating point type}}
__atomic_fetch_min(D, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}}		__atomic_fetch_min(D, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}}
__atomic_fetch_max(P, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}}		__atomic_fetch_max(P, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}}
__atomic_fetch_max(p, 3); // expected-error {{too few arguments to function call, expected 3, have 2}}		__atomic_fetch_max(p, 3); // expected-error {{too few arguments to function call, expected 3, have 2}}

__c11_atomic_fetch_and(i, 1, memory_order_seq_cst);		__c11_atomic_fetch_and(i, 1, memory_order_seq_cst);
__c11_atomic_fetch_and(p, 1, memory_order_seq_cst); // expected-error {{must be a pointer to atomic integer}}		__c11_atomic_fetch_and(p, 1, memory_order_seq_cst); // expected-error {{must be a pointer to atomic integer}}
__c11_atomic_fetch_and(d, 1, memory_order_seq_cst); // expected-error {{must be a pointer to atomic integer}}		__c11_atomic_fetch_and(f, 1, memory_order_seq_cst); // expected-error {{must be a pointer to atomic integer}}

__atomic_fetch_and(i, 3, memory_order_seq_cst); // expected-error {{pointer to integer}}		__atomic_fetch_and(i, 3, memory_order_seq_cst); // expected-error {{pointer to integer}}
__atomic_fetch_or(I, 3, memory_order_seq_cst);		__atomic_fetch_or(I, 3, memory_order_seq_cst);
__atomic_fetch_xor(P, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}}		__atomic_fetch_xor(P, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}}
__atomic_fetch_or(D, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}}		__atomic_fetch_or(D, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}}
__atomic_fetch_and(s1, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}}		__atomic_fetch_and(s1, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer}}

_Bool cmpexch_1 = __c11_atomic_compare_exchange_strong(i, I, 1, memory_order_seq_cst, memory_order_seq_cst);		_Bool cmpexch_1 = __c11_atomic_compare_exchange_strong(i, I, 1, memory_order_seq_cst, memory_order_seq_cst);
_Bool cmpexch_2 = __c11_atomic_compare_exchange_strong(p, P, (int*)1, memory_order_seq_cst, memory_order_seq_cst);		_Bool cmpexch_2 = __c11_atomic_compare_exchange_strong(p, P, (int*)1, memory_order_seq_cst, memory_order_seq_cst);
_Bool cmpexch_3 = __c11_atomic_compare_exchange_strong(d, I, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{incompatible pointer types}}		_Bool cmpexch_3 = __c11_atomic_compare_exchange_strong(f, I, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{incompatible pointer types}}
(void)__c11_atomic_compare_exchange_strong(i, CI, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}}		(void)__c11_atomic_compare_exchange_strong(i, CI, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}}

_Bool cmpexchw_1 = __c11_atomic_compare_exchange_weak(i, I, 1, memory_order_seq_cst, memory_order_seq_cst);		_Bool cmpexchw_1 = __c11_atomic_compare_exchange_weak(i, I, 1, memory_order_seq_cst, memory_order_seq_cst);
_Bool cmpexchw_2 = __c11_atomic_compare_exchange_weak(p, P, (int*)1, memory_order_seq_cst, memory_order_seq_cst);		_Bool cmpexchw_2 = __c11_atomic_compare_exchange_weak(p, P, (int*)1, memory_order_seq_cst, memory_order_seq_cst);
_Bool cmpexchw_3 = __c11_atomic_compare_exchange_weak(d, I, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{incompatible pointer types}}		_Bool cmpexchw_3 = __c11_atomic_compare_exchange_weak(f, I, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{incompatible pointer types}}
(void)__c11_atomic_compare_exchange_weak(i, CI, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}}		(void)__c11_atomic_compare_exchange_weak(i, CI, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}}

_Bool cmpexch_4 = __atomic_compare_exchange_n(I, I, 5, 1, memory_order_seq_cst, memory_order_seq_cst);		_Bool cmpexch_4 = __atomic_compare_exchange_n(I, I, 5, 1, memory_order_seq_cst, memory_order_seq_cst);
_Bool cmpexch_5 = __atomic_compare_exchange_n(I, P, 5, 0, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{; dereference with *}}		_Bool cmpexch_5 = __atomic_compare_exchange_n(I, P, 5, 0, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{; dereference with *}}
_Bool cmpexch_6 = __atomic_compare_exchange_n(I, I, P, 0, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'int **' to parameter of type 'int'}}		_Bool cmpexch_6 = __atomic_compare_exchange_n(I, I, P, 0, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'int **' to parameter of type 'int'}}
(void)__atomic_compare_exchange_n(CI, I, 5, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-error {{address argument to atomic operation must be a pointer to non-const type ('const int *' invalid)}}		(void)__atomic_compare_exchange_n(CI, I, 5, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-error {{address argument to atomic operation must be a pointer to non-const type ('const int *' invalid)}}
(void)__atomic_compare_exchange_n(I, CI, 5, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}}		(void)__atomic_compare_exchange_n(I, CI, 5, 1, memory_order_seq_cst, memory_order_seq_cst); // expected-warning {{passing 'const int *' to parameter of type 'int *' discards qualifiers}}


clang/test/SemaCUDA/amdgpu-atomic-ops.cu

This file was added.

				// REQUIRES: amdgpu-registered-target

				// RUN: %clang_cc1 %s -verify -emit-llvm - -triple=amdgcn-amd-amdhsa \
				// RUN: -fcuda-is-device -target-cpu gfx906 -fnative-half-type \
				// RUN: -fnative-half-arguments-and-returns

				#include "Inputs/cuda.h"
				#include <stdatomic.h>

				__device__ _Float16 test_Flot16(_Float16 *p) {
				return __atomic_fetch_sub(p, 1.0f16, memory_order_relaxed);
				}

				__device__ __fp16 test_fp16(__fp16 *p) {
				return __atomic_fetch_sub(p, 1.0f16, memory_order_relaxed);
				}

				struct BigStruct {
				int data[128];
				};

				__device__ void test_big(BigStruct *p1, BigStruct *p2) {
				__atomic_load(p1, p2, memory_order_relaxed);
				// expected-error@-1 {{large atomic operation not supported; the access size (512 bytes) exceeds the max lock-free size (8 bytes)}}
				}
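
The review discussion notes that on several backends the floating-point atomicrmw is expanded to a cmpxchg loop. As a rough illustration of what that means for a call such as __atomic_fetch_sub on a float, here is a minimal C sketch. It is not the code clang or AtomicExpandPass produces. The helper name fetch_add_float is hypothetical, and the sketch assumes a 4-byte float and a compiler that provides the generic __atomic_load and __atomic_compare_exchange builtins (GCC and Clang both do).

```c
/* Hypothetical helper, for illustration only: emulates an atomic
 * floating-point fetch-add with a compare-exchange loop. */
static float fetch_add_float(float *addr, float operand) {
  float old_val;
  float new_val;

  /* Read the current value once; a failed compare-exchange refreshes it. */
  __atomic_load(addr, &old_val, __ATOMIC_RELAXED);

  do {
    /* Compute the candidate result from the most recently observed value. */
    new_val = old_val + operand;
    /* The comparison is bitwise. On failure, old_val is overwritten with
     * the value currently stored at addr and the loop retries. */
  } while (!__atomic_compare_exchange(addr, &old_val, &new_val,
                                      1 /* weak */,
                                      __ATOMIC_SEQ_CST, __ATOMIC_RELAXED));

  /* Like the fetch_add builtins, return the value seen before the update. */
  return old_val;
}
```

A subtraction follows the same pattern with the operand negated, for example fetch_add_float(p, -1.0f).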

clang/test/SemaOpenCL/atomic-ops.cl

	// RUN: %clang_cc1 %s -cl-std=CL2.0 -verify -fsyntax-only -triple=spir64			// RUN: %clang_cc1 %s -cl-std=CL2.0 -verify=expected,spir \
	// RUN: %clang_cc1 %s -cl-std=CL2.0 -verify -fsyntax-only -triple=amdgcn-amdhsa-amd-opencl			// RUN: -fsyntax-only -triple=spir64
				// RUN: %clang_cc1 %s -cl-std=CL2.0 -verify -fsyntax-only \
				// RUN: -triple=amdgcn-amd-amdhsa

	// Basic parsing/Sema tests for __opencl_atomic_*			// Basic parsing/Sema tests for __opencl_atomic_*

	#pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable			#pragma OPENCL EXTENSION cl_khr_int64_base_atomics : enable
	#pragma OPENCL EXTENSION cl_khr_int64_extended_atomics : enable			#pragma OPENCL EXTENSION cl_khr_int64_extended_atomics : enable
				#pragma OPENCL EXTENSION cl_khr_fp16 : enable

	typedef __INTPTR_TYPE__ intptr_t;			typedef __INTPTR_TYPE__ intptr_t;
	typedef int int8 __attribute__((ext_vector_type(8)));			typedef int int8 __attribute__((ext_vector_type(8)));

	typedef enum memory_order {			typedef enum memory_order {
	memory_order_relaxed = __ATOMIC_RELAXED,			memory_order_relaxed = __ATOMIC_RELAXED,
	memory_order_acquire = __ATOMIC_ACQUIRE,			memory_order_acquire = __ATOMIC_ACQUIRE,
	memory_order_release = __ATOMIC_RELEASE,			memory_order_release = __ATOMIC_RELEASE,
	Show All 15 Lines

	char i8;			char i8;
	short i16;			short i16;
	int i32;			int i32;
	int8 i64;			int8 i64;

	atomic_int gn;			atomic_int gn;
	void f(atomic_int *i, const atomic_int *ci,			void f(atomic_int *i, const atomic_int *ci,
	atomic_intptr_t *p, atomic_float *d,			atomic_intptr_t *p, atomic_float *f, atomic_double *d, atomic_half *h, // expected-error {{unknown type name 'atomic_half'}}
	int *I, const int *CI,			int *I, const int *CI,
	intptr_t *P, float *D, struct S *s1, struct S *s2,			intptr_t *P, float *D, struct S *s1, struct S *s2,
	global atomic_int *i_g, local atomic_int *i_l, private atomic_int *i_p,			global atomic_int *i_g, local atomic_int *i_l, private atomic_int *i_p,
	constant atomic_int *i_c) {			constant atomic_int *i_c) {
	__opencl_atomic_init(I, 5); // expected-error {{address argument to atomic operation must be a pointer to _Atomic type ('__generic int *' invalid)}}			__opencl_atomic_init(I, 5); // expected-error {{address argument to atomic operation must be a pointer to _Atomic type ('__generic int *' invalid)}}
	__opencl_atomic_init(ci, 5); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const __generic atomic_int *' (aka 'const __generic _Atomic(int) *') invalid)}}			__opencl_atomic_init(ci, 5); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const __generic atomic_int *' (aka 'const __generic _Atomic(int) *') invalid)}}

	__opencl_atomic_load(0); // expected-error {{too few arguments to function call, expected 3, have 1}}			__opencl_atomic_load(0); // expected-error {{too few arguments to function call, expected 3, have 1}}
	__opencl_atomic_load(0, 0, 0, 0); // expected-error {{too many arguments to function call, expected 3, have 4}}			__opencl_atomic_load(0, 0, 0, 0); // expected-error {{too many arguments to function call, expected 3, have 4}}
	__opencl_atomic_store(0,0,0,0); // expected-error {{address argument to atomic builtin must be a pointer}}			__opencl_atomic_store(0,0,0,0); // expected-error {{address argument to atomic builtin must be a pointer}}
	__opencl_atomic_store((int *)0, 0, 0, 0); // expected-error {{address argument to atomic operation must be a pointer to _Atomic type ('__generic int *' invalid)}}			__opencl_atomic_store((int *)0, 0, 0, 0); // expected-error {{address argument to atomic operation must be a pointer to _Atomic type ('__generic int *' invalid)}}
	__opencl_atomic_store(i, 0, memory_order_relaxed, memory_scope_work_group);			__opencl_atomic_store(i, 0, memory_order_relaxed, memory_scope_work_group);
	__opencl_atomic_store(ci, 0, memory_order_relaxed, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const __generic atomic_int *' (aka 'const __generic _Atomic(int) *') invalid)}}			__opencl_atomic_store(ci, 0, memory_order_relaxed, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const __generic atomic_int *' (aka 'const __generic _Atomic(int) *') invalid)}}
	__opencl_atomic_store(i_g, 0, memory_order_relaxed, memory_scope_work_group);			__opencl_atomic_store(i_g, 0, memory_order_relaxed, memory_scope_work_group);
	__opencl_atomic_store(i_l, 0, memory_order_relaxed, memory_scope_work_group);			__opencl_atomic_store(i_l, 0, memory_order_relaxed, memory_scope_work_group);
	__opencl_atomic_store(i_p, 0, memory_order_relaxed, memory_scope_work_group);			__opencl_atomic_store(i_p, 0, memory_order_relaxed, memory_scope_work_group);
	__opencl_atomic_store(i_c, 0, memory_order_relaxed, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to non-constant _Atomic type ('__constant atomic_int *' (aka '__constant _Atomic(int) *') invalid)}}			__opencl_atomic_store(i_c, 0, memory_order_relaxed, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to non-constant _Atomic type ('__constant atomic_int *' (aka '__constant _Atomic(int) *') invalid)}}

	__opencl_atomic_load(i, memory_order_seq_cst, memory_scope_work_group);			__opencl_atomic_load(i, memory_order_seq_cst, memory_scope_work_group);
	__opencl_atomic_load(p, memory_order_seq_cst, memory_scope_work_group);			__opencl_atomic_load(p, memory_order_seq_cst, memory_scope_work_group);
	__opencl_atomic_load(d, memory_order_seq_cst, memory_scope_work_group);			__opencl_atomic_load(f, memory_order_seq_cst, memory_scope_work_group);
	__opencl_atomic_load(ci, memory_order_seq_cst, memory_scope_work_group);			__opencl_atomic_load(ci, memory_order_seq_cst, memory_scope_work_group);
	__opencl_atomic_load(i_c, memory_order_seq_cst, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to non-constant _Atomic type ('__constant atomic_int *' (aka '__constant _Atomic(int) *') invalid)}}			__opencl_atomic_load(i_c, memory_order_seq_cst, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to non-constant _Atomic type ('__constant atomic_int *' (aka '__constant _Atomic(int) *') invalid)}}

	__opencl_atomic_store(i, 1, memory_order_seq_cst, memory_scope_work_group);			__opencl_atomic_store(i, 1, memory_order_seq_cst, memory_scope_work_group);
	__opencl_atomic_store(p, 1, memory_order_seq_cst, memory_scope_work_group);			__opencl_atomic_store(p, 1, memory_order_seq_cst, memory_scope_work_group);
	(int)__opencl_atomic_store(d, 1, memory_order_seq_cst, memory_scope_work_group); // expected-error {{operand of type 'void' where arithmetic or pointer type is required}}			(int)__opencl_atomic_store(f, 1, memory_order_seq_cst, memory_scope_work_group); // expected-error {{operand of type 'void' where arithmetic or pointer type is required}}

	int exchange_1 = __opencl_atomic_exchange(i, 1, memory_order_seq_cst, memory_scope_work_group);			int exchange_1 = __opencl_atomic_exchange(i, 1, memory_order_seq_cst, memory_scope_work_group);
	int exchange_2 = __opencl_atomic_exchange(I, 1, memory_order_seq_cst, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to _Atomic}}			int exchange_2 = __opencl_atomic_exchange(I, 1, memory_order_seq_cst, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to _Atomic}}

	__opencl_atomic_fetch_add(i, 1, memory_order_seq_cst, memory_scope_work_group);			__opencl_atomic_fetch_add(i, 1, memory_order_seq_cst, memory_scope_work_group);
	__opencl_atomic_fetch_add(p, 1, memory_order_seq_cst, memory_scope_work_group);			__opencl_atomic_fetch_add(p, 1, memory_order_seq_cst, memory_scope_work_group);
	__opencl_atomic_fetch_add(d, 1, memory_order_seq_cst, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to atomic integer or pointer ('__generic atomic_float *' (aka '__generic _Atomic(float) *') invalid)}}			__opencl_atomic_fetch_add(f, 1.0f, memory_order_seq_cst, memory_scope_work_group);
				__opencl_atomic_fetch_add(d, 1.0, memory_order_seq_cst, memory_scope_work_group);
	__opencl_atomic_fetch_and(i, 1, memory_order_seq_cst, memory_scope_work_group);			__opencl_atomic_fetch_and(i, 1, memory_order_seq_cst, memory_scope_work_group);
	__opencl_atomic_fetch_and(p, 1, memory_order_seq_cst, memory_scope_work_group);			__opencl_atomic_fetch_and(p, 1, memory_order_seq_cst, memory_scope_work_group);
	__opencl_atomic_fetch_and(d, 1, memory_order_seq_cst, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to atomic integer ('__generic atomic_float *' (aka '__generic _Atomic(float) *') invalid)}}			__opencl_atomic_fetch_and(f, 1, memory_order_seq_cst, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to atomic integer ('__generic atomic_float *' (aka '__generic _Atomic(float) *') invalid)}}

	__opencl_atomic_fetch_min(i, 1, memory_order_seq_cst, memory_scope_work_group);			__opencl_atomic_fetch_min(i, 1, memory_order_seq_cst, memory_scope_work_group);
	__opencl_atomic_fetch_max(i, 1, memory_order_seq_cst, memory_scope_work_group);			__opencl_atomic_fetch_max(i, 1, memory_order_seq_cst, memory_scope_work_group);
	__opencl_atomic_fetch_min(d, 1, memory_order_seq_cst, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to atomic integer ('__generic atomic_float *' (aka '__generic _Atomic(float) *') invalid)}}			__opencl_atomic_fetch_min(f, 1, memory_order_seq_cst, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to atomic integer ('__generic atomic_float *' (aka '__generic _Atomic(float) *') invalid)}}
	__opencl_atomic_fetch_max(d, 1, memory_order_seq_cst, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to atomic integer ('__generic atomic_float *' (aka '__generic _Atomic(float) *') invalid)}}			__opencl_atomic_fetch_max(f, 1, memory_order_seq_cst, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to atomic integer ('__generic atomic_float *' (aka '__generic _Atomic(float) *') invalid)}}

	bool cmpexch_1 = __opencl_atomic_compare_exchange_strong(i, I, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group);			bool cmpexch_1 = __opencl_atomic_compare_exchange_strong(i, I, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group);
	bool cmpexch_2 = __opencl_atomic_compare_exchange_strong(p, P, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group);			bool cmpexch_2 = __opencl_atomic_compare_exchange_strong(p, P, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group);
	bool cmpexch_3 = __opencl_atomic_compare_exchange_strong(d, I, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group); // expected-warning {{incompatible pointer types passing '__generic int *__private' to parameter of type '__generic float *'}}			bool cmpexch_3 = __opencl_atomic_compare_exchange_strong(f, I, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group); // expected-warning {{incompatible pointer types passing '__generic int *__private' to parameter of type '__generic float *'}}
	(void)__opencl_atomic_compare_exchange_strong(i, CI, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group); // expected-warning {{passing 'const __generic int *__private' to parameter of type '__generic int *' discards qualifiers}}			(void)__opencl_atomic_compare_exchange_strong(i, CI, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group); // expected-warning {{passing 'const __generic int *__private' to parameter of type '__generic int *' discards qualifiers}}

	bool cmpexchw_1 = __opencl_atomic_compare_exchange_weak(i, I, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group);			bool cmpexchw_1 = __opencl_atomic_compare_exchange_weak(i, I, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group);
	bool cmpexchw_2 = __opencl_atomic_compare_exchange_weak(p, P, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group);			bool cmpexchw_2 = __opencl_atomic_compare_exchange_weak(p, P, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group);
	bool cmpexchw_3 = __opencl_atomic_compare_exchange_weak(d, I, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group); // expected-warning {{incompatible pointer types passing '__generic int *__private' to parameter of type '__generic float *'}}			bool cmpexchw_3 = __opencl_atomic_compare_exchange_weak(f, I, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group); // expected-warning {{incompatible pointer types passing '__generic int *__private' to parameter of type '__generic float *'}}
	(void)__opencl_atomic_compare_exchange_weak(i, CI, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group); // expected-warning {{passing 'const __generic int *__private' to parameter of type '__generic int *' discards qualifiers}}			(void)__opencl_atomic_compare_exchange_weak(i, CI, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group); // expected-warning {{passing 'const __generic int *__private' to parameter of type '__generic int *' discards qualifiers}}

	// Pointers to different address spaces are allowed.			// Pointers to different address spaces are allowed.
	bool cmpexch_10 = __opencl_atomic_compare_exchange_strong((global atomic_int *)0x308, (constant int *)0x309, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group);			bool cmpexch_10 = __opencl_atomic_compare_exchange_strong((global atomic_int *)0x308, (constant int *)0x309, 1, memory_order_seq_cst, memory_order_seq_cst, memory_scope_work_group);

	__opencl_atomic_init(ci, 0); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const __generic atomic_int *' (aka 'const __generic _Atomic(int) *') invalid)}}			__opencl_atomic_init(ci, 0); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const __generic atomic_int *' (aka 'const __generic _Atomic(int) *') invalid)}}
	__opencl_atomic_store(ci, 0, memory_order_release, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const __generic atomic_int *' (aka 'const __generic _Atomic(int) *') invalid)}}			__opencl_atomic_store(ci, 0, memory_order_release, memory_scope_work_group); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const __generic atomic_int *' (aka 'const __generic _Atomic(int) *') invalid)}}
	__opencl_atomic_load(ci, memory_order_acquire, memory_scope_work_group);			__opencl_atomic_load(ci, memory_order_acquire, memory_scope_work_group);


Diff 322005
