Page MenuHomePhabricator

Adding __atomic_fetch_min/max intrinsics to clang
ClosedPublic

Authored by delena on May 3 2018, 5:48 AM.

Details

Summary

Added atomic_fetch_min, max, umin, umax intrinsics to clang.
These intrinsics work exactly as all other
atomic_fetch_* intrinsics and allow to create *atomicrmw* with ordering.
The similar set __sync_fetch_and_min* sets the sequentially-consistent ordering.

We use them for OpenCL 1.2, which supports atomic operations with "relaxed" ordering.

Diff Detail

Repository
rL LLVM

Event Timeline

delena created this revision.May 3 2018, 5:48 AM
jfb added inline comments.May 3 2018, 11:43 AM
lib/Sema/SemaChecking.cpp
3098 ↗(On Diff #145006)

Should __sync_fetch_and_min and others also set IsMinMax?

delena added inline comments.May 3 2018, 12:42 PM
lib/Sema/SemaChecking.cpp
3098 ↗(On Diff #145006)

sync_fetch_and_min is not variadic and not overloaded. The types of arguments are defined with the builtin itself in the def file.
BUILTIN(
sync_fetch_and_min, "iiD*i", "n"). So it is checked automatically.

The other __sync_fetch* functions are overloaded and checked in SemaBuiltinAtomicOverloaded()

jfb accepted this revision.May 3 2018, 1:06 PM
This revision is now accepted and ready to land.May 3 2018, 1:06 PM

Given that these are overloaded builtins, why are there separate builtins for min and umin?

If this is a Clang extension, it needs to be documented in our extensions documentation.

The documentation should describe the exact ordering semantics, to the extent we want to guarantee them. For example, does __atomic_fetch_max order this operation with later operations even if the new value is in fact less than the current value of the variable, or does it use some treatment more like the failure case of a compare-exchange?

The documentation should describe the set of allowed types. If the builtins work on pointer types, the documentation should describe the semantics of the pointer comparison; for example, is it undefined behavior if an ordinary pointer comparison would be? Also, your test case should actually check the well-formedness conditions more completely, e.g. verifying that the value type is convertible to the atomic type.

rjmccall requested changes to this revision.May 3 2018, 4:33 PM
This revision now requires changes to proceed.May 3 2018, 4:33 PM

Is this some sort of a vendor extension then? OpenCL 1.2 atomic builtins don't have ordering parameter.

delena added a comment.May 4 2018, 4:54 AM

Is this some sort of a vendor extension then? OpenCL 1.2 atomic builtins don't have ordering parameter.

OpenCL 1.2 atomic builtins have relaxed semantics. Always, it is not parameter, it is defined behavior. I want to translate them to atomicrmw instruction and use one of clang intrinsics for this.
I can't use _sync_fetch_*, due to the different semantics. The __atomic_* allow to specify semantics, but min/max is missing in this set.

delena updated this revision to Diff 145646.May 8 2018, 1:34 AM

Removed the unsigned version of atomics. Enhanced semantics check.
Added more tests.
Added documentation.

rjmccall added inline comments.May 8 2018, 1:32 PM
docs/LanguageExtensions.rst
1998 ↗(On Diff #145646)

Thank you for adding this documentation. Please do clarify what the memory ordering semantics actually are when the atomic object does not need to be updated, though, and verify that target code generation actually obeys that ordering. For example, if the memory ordering makes this a release operation, __atomic_fetch_min must always store the result back to the atomic object, even if the new value was actually greater than the stored value; I believe that would not be required with a relaxed operation.

delena updated this revision to Diff 146080.May 10 2018, 12:33 AM

Given more clarification about memory model of atomic operations.

rjmccall added inline comments.May 10 2018, 8:01 AM
docs/LanguageExtensions.rst
1998 ↗(On Diff #145646)

Okay, that's not what I was asking for. It's okay to assume that people understand the basic memory orderings; you don't need to copy/paste generic descriptions of them here. There is something special about min/max vs. the rest of these atomic update operations, however, which is that min and max don't always change the value of the variable. (Technically this is true of corner cases of many of the other operations — for example, you could atomically add 0 to a variable or multiply a variable by 1 — but the basic operation of min/max makes this corner case a lot more important.) I am just asking you to state definitively whether the atomic operation is still appropriately ordered if the object's value does not change.

If you don't understand what I'm getting at here, maybe you should just add the relaxed ordering for now.

delena updated this revision to Diff 146462.May 11 2018, 10:07 PM

Added a line about *load-store* semantics of these two intrinsics.
Removed the common description of memory modeling.

rjmccall accepted this revision.May 11 2018, 11:12 PM

Thanks, that works for me.

The actual semantic parts of the diff seem to have disappeared from the patch posted to Phabricator, for what it's worth.

This revision is now accepted and ready to land.May 11 2018, 11:12 PM

The actual semantic parts of the diff seem to have disappeared from the patch posted to Phabricator, for what it's worth.

It is not disappeared by itself, I removed it. I understood that you don't see any added value in the entire memory model description inside.
Thank you.

This revision was automatically updated to reflect the committed changes.

There is a (stalled) proposal to add atomic integer max/min to C++: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0493r0.pdf . The proposal has memory semantics similar to these builtins, i.e. unconditional RMW. There is no limitation to 32-bit types though, what's the motivation for that? Is it a limitation of a particular target?