Page MenuHomePhabricator

Add __atomic_is_lock_free to compiler-rt
Needs ReviewPublic

Authored by oontvoo on Jul 31 2020, 1:59 PM.



This was one of the last required functions missing from Clang's libatomic.

Changes in behaviours:

  • On x86-64, it'll now be able to detect and use cmpxchg16b
  • The rest should be unchanged, given it'd already been able to figure it out statically.

Diff Detail

Event Timeline

oontvoo created this revision.Jul 31 2020, 1:59 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added subscribers: Restricted Project, jfb. · View Herald Transcript
oontvoo updated this revision to Diff 282788.Aug 3 2020, 7:54 PM

Handled the remaining sizes

oontvoo retitled this revision from [WIP] __atomic_is_lock_free to __atomic_is_lock_free.Aug 3 2020, 7:59 PM
oontvoo edited reviewers, added: jyknight; removed: jfb.
oontvoo published this revision for review.Aug 3 2020, 7:59 PM

This is technically a behavior change, someone like @ldionne ought to chime in, and we probably want to synchronize with libstdc++ @jwakely.

ldionne requested changes to this revision.Aug 4 2020, 6:03 AM

Can we please get a description of this change in the commit message and the Phab review?

This revision now requires changes to proceed.Aug 4 2020, 6:03 AM
oontvoo retitled this revision from __atomic_is_lock_free to Add __atomic_is_lock_free to compiler-rt.Aug 4 2020, 9:13 AM
oontvoo edited the summary of this revision. (Show Details)
oontvoo updated this revision to Diff 282976.Aug 4 2020, 11:29 AM

Removed use of GNU case range extension
(Triggered a bunch of clang-tidys)

oontvoo updated this revision to Diff 285166.Aug 12 2020, 12:46 PM
  • Remove the checking of non-power-of-2 sizes from is_lock_free for now (because the atomic_* ops don't handle them yet)
  • Rename function to _c and add pragma redefine
jfb added a comment.Aug 12 2020, 1:10 PM

Does this x86 change actually match what the runtime does in the same file?

oontvoo updated this revision to Diff 285230.Aug 12 2020, 6:16 PM

Updated diff

In D85044#2213896, @jfb wrote:

Does this x86 change actually match what the runtime does in the same file?


jfb added a comment.Aug 13 2020, 9:22 AM

Can you please add tests for this?


These are OS specific and won't work everywhere.


Please document what you're looking up here. It's definitely not "have_cas".

oontvoo updated this revision to Diff 285512.Aug 13 2020, 3:28 PM
oontvoo marked 2 inline comments as done.

Updated diff

Herald added a project: Restricted Project. · View Herald TranscriptAug 13 2020, 3:28 PM
In D85044#2216023, @jfb wrote:

Can you please add tests for this?

I don't see this file being built anywhere. Can you point me to where to add the test?

oontvoo updated this revision to Diff 286408.Aug 18 2020, 3:06 PM

Simple test

There is going to be a bunch more complexity required here, I'm afraid.

Not only do we need to detect the CPU capabilities in order to implement atomic_is_lock_free, but we also need to ensure that the implementations of atomic_load_c / etc can actually use those atomic instructions as appropriate.

This means we need to compile the function with different flags, e.g. -mcx16 for x86-64, in order for the compiler to be able to emit cmpxchg16b, or -march=i586 on 32bit x86. But -- we can't compile the entire file that way. Only the functions called after the runtime detection has assured that we're running on the appropriate hardware can be built with those flags.

GCC handles this by compiling the source files multiple times, with different flags, adding some additional #defines which change the function names for each of those compiles. Then, it arranges to dispatch to the correct function depending on the runtime CPU detection. Ideally, we'd also be able to use the "target" attribute to do this on an individual function within the file, but it looks like that may not actually work properly.


Same check used twice? Also, I think we don't need a configure check for this, #if __has_include(<asm/hwcap.h>) would be ok.


This isn't threadsafe as written. It would be okay to use a relaxed atomic to load and store __have_atomic_cas (Relaxed is okay, because it's okay to potentially call check_x86_atomic_cas multiple times if this is reached simultaneously on multiple threads, and it will result in the same answer each time).


It's confusing to have "have_atomic_cas" sometimes mean "have CMPXCHG8b" and sometimes "have CMPXCHG16b".

And that confusion has led to a bug here -- on x86-64, cmpxchg8b is always available, but this code returns false for 8-byte atomics, unless CMPXCHG16b is also available.


AArch64 always supports atomics, the HWCAP_ATOMICS is needed only for the newer optimized instructions instead of the older LL/SC loop. (We may want to use those instructions for performance in the future, but it's not necessary for correctness, so we can leave out that complexity for now.


Let's leave aside ARM32 support for the first draft and just concentrate on x86 to get the overall framework in place.


At the end, it'll be better to fail to compile on platforms we haven't implemented, than to do the wrong thing.

But in order to allow implementing platforms piecemeal at the beginning, I'm ok to leave this fallback for now.


In C code (compiled with Clang), this can be spelled __attribute__((fallthrough)).


Given that the contract for __atomic_is_lock_free doesn't require actual pointers, I'd leave out the construction of real objects in this test, and just pass in constructed values e.g. (void*)~7.

Also, all these assertions will need to be platform-specific.

oontvoo updated this revision to Diff 287068.Aug 21 2020, 11:00 AM
oontvoo marked 5 inline comments as done.

[WIP] mov stuff to target-specific dirs