This is an archive of the discontinued LLVM Phabricator instance.

Implement is_always_lock_free
AbandonedPublic

Authored by jfb on Mar 7 2016, 6:18 PM.

Details

Summary

This was voted into C++17 at last week's Jacksonville meeting. The final P0152R1 paper will be in the upcoming post-Jacksonville mailing, and is also available here:

http://jfbastien.github.io/papers/P0152R1.html

The libc++ part of this implementation is in the following code review:

http://reviews.llvm.org/D17951

Diff Detail

Event Timeline

jfb updated this revision to Diff 50016.Mar 7 2016, 6:18 PM
jfb retitled this revision from to Implement is_always_lock_free.
jfb updated this object.
jfb added reviewers: mclow.lists, rsmith.
jfb added a subscriber: cfe-commits.
jfb updated this object.Mar 7 2016, 6:19 PM
jfb added a comment.Mar 7 2016, 6:22 PM

I'll check with GCC folks whether they want to share macros the same way we're already sharing __GCC_ATOMIC_*_LOCK_FREE. The new macros I propose are __LLVM__ATOMIC_*_BYTES_LOCK_FREE and __LLVM_LOCK_FREE_IS_SIZE_BASED.

Let's do a first sanity-check round of code reviews to make sure I have the LLVM-isms down right, then I'll add GCC folks.

jfb added inline comments.Mar 7 2016, 6:24 PM
lib/Frontend/InitPreprocessor.cpp
465

How do you pick these numbers before the standard is actually out? 201603 is the Jacksonville meeting, seems fine to me :)

838

Adding these which aren't in the C standard, but which is_always_lock_free would use. We're proposing that floating-point atomics be added so it's not crazy:

http://open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0020r1.html
test/Lexer/cxx-features.cpp
6

This was out of date, I'm updating the test to C++1z, testing C++14, and keeping c++1y for compatibility.

jfb added inline comments.Mar 8 2016, 2:04 PM
lib/Frontend/InitPreprocessor.cpp
465

Those come from a separate SG10 paper: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0096r1.html#recs.cpp17
A number hasn't been published yet, I'll update the number once the post-Jacksonville mailing is out (I sent Clark a ping).

This conflicts with http://reviews.llvm.org/D17933. Most of this change also seems unnecessary.

  • I think the is_always_lock_free function should be defined based on the existing __atomic_always_lock_free builtin, not on defines (just like is_lock_free uses __atomic_is_lock_free, or __c11_atomic_is_lock_free, which is effectively an alias).
  • Then, the new __GCC_ATOMIC_DOUBLE_LOCK_FREE macros are unnecessary, unless we need to actually define a ATOMIC_DOUBLE_LOCK_FREE macro.
  • __LLVM_ATOMIC_1_BYTES_LOCK_FREE effectively duplicates __GCC_HAVE_SYNC_COMPARE_AND_SWAP_1, so aren't needed.
bcraig added a subscriber: bcraig.Mar 16 2016, 7:21 AM
bcraig added inline comments.
lib/Frontend/InitPreprocessor.cpp
305

On some targets (like Hexagon), 4-byte values are cheap to inline, but 1-byte values are not. Clang is spotty about checking this, but TargetInfo::hasBuiltinAtomic seems like the right function to ask, if you have access to it.

jfb added a comment.Mar 16 2016, 4:09 PM

This conflicts with http://reviews.llvm.org/D17933. Most of this change also seems unnecessary.

  • I think the is_always_lock_free function should be defined based on the existing __atomic_always_lock_free builtin, not on defines (just like is_lock_free uses __atomic_is_lock_free, or __c11_atomic_is_lock_free, which is effectively an alias).
  • Then, the new __GCC_ATOMIC_DOUBLE_LOCK_FREE macros are unnecessary, unless we need to actually define a ATOMIC_DOUBLE_LOCK_FREE macro.
  • __LLVM_ATOMIC_1_BYTES_LOCK_FREE effectively duplicates __GCC_HAVE_SYNC_COMPARE_AND_SWAP_1, so aren't needed.

Hmm, when I originally wrote the paper I though I'd tried that. Can't remember why I went the other way, let me try out __atomic_always_lock_free. That would indeed be much simpler as it would be a pure libc++ change., thanks for raising the issue.

lib/Frontend/InitPreprocessor.cpp
305

You're commenting on:

if (TypeWidth == TypeAlign && (TypeWidth & (TypeWidth - 1)) == 0 &&
    TypeWidth <= InlineWidth)

?

Are you asking for a separate change, or for a change to my patch?

jyknight added inline comments.Mar 16 2016, 4:33 PM
lib/Frontend/InitPreprocessor.cpp
305

That issue in clang's purview in any case; if hexagon wants to lower a 1-byte cmpxchg to a compiler runtime function, it should do so in its backend.

bcraig added inline comments.Mar 17 2016, 7:08 AM
lib/Frontend/InitPreprocessor.cpp
305

I am asking for a change to this patch. Don't assume that all widths smaller than some value are inline-able and lock free, because that may not be the case. A 4 byte cmpxchg could be lock free, while a 1 byte cmpxchg may not be lock free.

jfb added inline comments.Mar 17 2016, 7:42 AM
lib/Frontend/InitPreprocessor.cpp
305

Are you asking me to change this line:

if (TypeWidth == TypeAlign && (TypeWidth & (TypeWidth - 1)) == 0 &&
    TypeWidth <= InlineWidth)

?

In what way?

Please be more specific.

A 4 byte cmpxchg could be lock free, while a 1 byte cmpxchg may not be

lock free.

That's not possible: if you have a 4-byte cmpxchg instruction, you can use
it to implement a 1-byte cmpxchg, too. Many targets do this already. (It
would be better if that was available generically so that code didn't need
to be written separately fit each target, but that's not the case at the
moment.)

I know that MIPS does that, and an out-of-tree implementation of hexagon
implements 1-byte cmpxchg in terms of the 4-byte version. The emulation
code isn't particularly small, and it seems reasonable to make it a
libcall. The emulation code seems sketchy from a correctness
perspective, as you end up generating unsolicited loads and stores on
adjacent bytes. Take a look at the thread on cfe-dev and llvm-dev named
"the as-if rule / perf vs. security" for some of the ramifications and
concerns surrounding unsolicited loads and stores.

bcraig added inline comments.Mar 17 2016, 10:13 AM
lib/Frontend/InitPreprocessor.cpp
305

Yes, please change this line...

if (TypeWidth == TypeAlign && (TypeWidth & (TypeWidth - 1)) == 0 &&

TypeWidth <= InlineWidth)

... to something more like ...
if(TI.hasBuiltinAtomic(TypeWidth, TypeAlign))

You will need to get the TargetInfo class into the method, and you should ensure that TypeWidth and TypeAlign are measured in terms of bits, because that is what TypeInfo::hasBuiltinAtomic is expecting.

Using TargetInfo::hasBuiltinAtomic will let Targets have explicit control over what is and is not always lock free. The default implementation also happens to be pretty reasonable.

jfb added a comment.Mar 17 2016, 10:56 AM

I know that MIPS does that, and an out-of-tree implementation of hexagon
implements 1-byte cmpxchg in terms of the 4-byte version. The emulation
code isn't particularly small, and it seems reasonable to make it a
libcall. The emulation code seems sketchy from a correctness
perspective, as you end up generating unsolicited loads and stores on
adjacent bytes. Take a look at the thread on cfe-dev and llvm-dev named
"the as-if rule / perf vs. security" for some of the ramifications and
concerns surrounding unsolicited loads and stores.

C++ atomics are explicitly designed to avoid problems with touching adjacent bytes: if atomic<T> where sizeof(T) == 1 requires a 4-byte cmpxchg then it's up to the frontend to make sure sizeof<atomic<T>> >= 4 (or something equivalent such as making it non-lock-free).

This doesn't fall under "as-if".

Whether clang does this correctly for the targets you have in mind is another issue.

Whether LLVM's atomic instructions should assume C++ semantics in one way or another, i.e. whether they should assume the frontend generated code where padding can be overriden, is also a separate issue.

In all of these cases I suggest filing a separate issue. This patch is the wrong place to address the issues you raise. They're not invalid issues, they're simply not related to D17950.

lib/Frontend/InitPreprocessor.cpp
305

That's entirely unrelated to my change. My change is designed to *support* the use case you allude to even though right now LLVM doesn't have targets which exercise this usecase.

C++ atomics are explicitly designed to avoid problems with touching adjacent bytes: if atomic<T> where sizeof(T) == 1 requires a 4-byte cmpxchg then it's up to the frontend to make sure sizeof<atomic<T>> >= 4 (or something equivalent such as making it non-lock-free).

I was aware of the C++11 wording on this, but for some reason, I was under the impression that the C11 wording didn't give the sizeof(T) latitude that C++11 did. A quick glance at the C spec tells me that I was under the wrong impression though. sizeof(_Atomic char) doesn't have to be the same as sizeof(char).

So assuming the front end does the right per-target thing, _Atomic char (and std::atomic<char>) on MIPS and Hexagon probably _should_ be 4 bytes, and the IR should only do a 4 byte (or larger) cmpxchg. All the loads and stores would be "solicited" loads and stores. The back-end shouldn't need to do any masking, because the extra bytes are basically just padding.

Long story short... continue as you were. Code looks fine. Ignore the crazy drive-by reviewer :)

lib/Frontend/InitPreprocessor.cpp
305

Since you aren't changing the logic here, I'm find dropping this issue from this review.

jfb updated this revision to Diff 51048.Mar 18 2016, 11:32 AM
  • Use __atomic_always_lock_free instead
jfb added a comment.Mar 18 2016, 11:34 AM
In D17950#376965, @jfb wrote:

This conflicts with http://reviews.llvm.org/D17933. Most of this change also seems unnecessary.

  • I think the is_always_lock_free function should be defined based on the existing __atomic_always_lock_free builtin, not on defines (just like is_lock_free uses __atomic_is_lock_free, or __c11_atomic_is_lock_free, which is effectively an alias).
  • Then, the new __GCC_ATOMIC_DOUBLE_LOCK_FREE macros are unnecessary, unless we need to actually define a ATOMIC_DOUBLE_LOCK_FREE macro.
  • __LLVM_ATOMIC_1_BYTES_LOCK_FREE effectively duplicates __GCC_HAVE_SYNC_COMPARE_AND_SWAP_1, so aren't needed.

Hmm, when I originally wrote the paper I though I'd tried that. Can't remember why I went the other way, let me try out __atomic_always_lock_free. That would indeed be much simpler as it would be a pure libc++ change., thanks for raising the issue.

Changed to what you suggested. Much nicer. I don't remember why I thought it was a bad idea.

Changed to what you suggested. Much nicer. I don't remember why I thought it was a bad idea.

Thanks, great! I don't have any opinion on what remains in this patch; someone else should review now.

rsmith added inline comments.Mar 21 2016, 4:37 PM
lib/Frontend/InitPreprocessor.cpp
465

This should be defined by the relevant library header, not by the compiler, to indicate the library actually provides the new symbol.

jfb added inline comments.Mar 21 2016, 4:46 PM
lib/Frontend/InitPreprocessor.cpp
465

Ah yes, makes sense. Looks like libc++ doesn't do this for other library features yet so I'll add it there.

This patch would then just fix test/Lexer/cxx-features.cpp for C++17 support and have nothing to do with is_always_lock_free. It's probably better if I close this patch and open a new one in that case.

jfb abandoned this revision.Mar 22 2016, 2:20 PM
jfb added inline comments.
lib/Frontend/InitPreprocessor.cpp
465

On the mailing list @rsmith said:

Feel free to commit that portion of the change.

Done in: http://reviews.llvm.org/rL264098

I'll abandon this change since it doesn't contain anything useful anymore (all in libc++ instead).

Thanks for the reviews!