Page MenuHomePhabricator

decoupling Freestanding atomic<T> from libatomic.a
ClosedPublic

Authored by __simt__ on Jan 18 2019, 7:59 AM.

Details

Summary

Hello,

I volunteered to write a first patch to make use of a Freestanding configuration in libcxx. The goal of this patch is to decouple non-lock-free atomic<T> from platform libraries, like libatomic.a, when a Freestanding implementation is selected. This plays a big role in the accelerator-integrated library that we are working on.

This patch is based on the GitHub monorepo as of today.

Copy/pasting how this works from my email thread to -dev:

  1. Creates a single cxx_atomic_* layer which is an adaptation of either c11_atomic_*, preferentially, or else atomic_*. (For atomic_*, it’s a rename of the layer that’s there now. For __c11_atomic_*, it’s a new but boring wrapper that just forwards calls.)
  1. Creates a __cxx_atomic_type<> template that is an alias to what should wrap the _Tp inside of an atomic<T> / atomic_flag. Defaults to the same _Atomic(_Tp) as today, either from C11 or from the adaptation layer.
  1. Under #ifdef LIBCXX_FREESTANDING, introduce cxx_locked_atomic_type<> with an implementation of cxx_atomic_*, and redirect cxx_atomic_type<> based on always-lock-freedom. a. In C++17 mode, it uses atomic_is_always_lock_free. b. Otherwise, it uses the C lock-free macros as shown in wg21.link/p0152.

Thanks for looking this over,

Olivier

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
ldionne added inline comments.Feb 4 2019, 8:49 AM
libcxx/include/atomic
929 ↗(On Diff #184517)

It seems wrong to have those be named __c11_atomic_xxx since they have nothing to do with C11. Instead, the code in atomic and __atomic_base could make calls to functions called __atomic_xxx_impl(...), and that could be implemented as C11 atomics in one case and as your __cxx_locked_atomic_type in another case.

So basically, instead of introducing this new version of __c11_atomic_init, rename the existing one to __atomic_init_impl, and then add your version under the name __atomic_init_impl too. What do you think? If you agree this would yield more clarity, we could apply this rename in a separate patch prior to this one.

I don't think the rename would be an ABI break because all those functions are static inline today, so they have internal linkage anyway (and hence they exist in each TU where they're used).

1045 ↗(On Diff #184517)

Not as cute, but you could use a traditional metafunction instead of an alias to make this potentially work in C++03. Not that I'm attached to supporting atomics in C++03, but it seems like we do (looking at the tests).

1045 ↗(On Diff #184517)

Also, I think this layer is interesting because it marks the boundary between implementation and interface. Basically, atomic and __atomic_base are both used to provide the interface that's mandated by the standard. __atomic_type is where the actual implementation begins, be it through _Atomic or the __cxx_locked_atomic_type you're introducing.

I believe that making this explicit in the name of the type might add clarity, for example __atomic_impl. WDYT?

1059 ↗(On Diff #184517)

Can you explain what this scope is?

1063 ↗(On Diff #184517)

I don't think adding the _Sco template parameter here is an issue from an ABI stand point, because I don't think __atomic_base ever gets exposed at an ABI boundary. At the very least, we don't export any symbol in libc++ that contains this type. But I could be wrong so I'd like someone else to double-check this.

1265 ↗(On Diff #184517)

I'm confused as to why this is needed. It seems like just atomic() _NOEXCEPT _LIBCPP_DEFAULT should still work fine, as even in the case where _LIBCPP_DEFAULT expands to {}, we get:

atomic() _NOEXCEPT {}

But that should default-construct __base as expected. Do you mean that you tried building in C++03 mode without this change and the tests didn't pass? That sure is surprising. If that's the case, then leave as-is, but otherwise if it's just for consistency then I would stick with what's there.

__simt__ marked 5 inline comments as done.Feb 4 2019, 9:26 AM

Thanks for the comments, Louis, responses below.

I will get working on a new patch with the __cxx_atomic_..._impl layer back in. Shouldn't take long, I can mostly revive it from history.

libcxx/include/atomic
896 ↗(On Diff #184517)

In the current version it introduces overloads to some symbols that are in the global namespace, and need to be selected against. This being said this version works least-well on Clang and that's probably not good to go. I think the way to address this is the same as your next comment -- reintroduce the interposer layer I had in the first version of the patch.

929 ↗(On Diff #184517)

The code I have here is exactly how this file already works today, it defines __c11_atomic... when it's not defined. In my first patch version, I added an interposer layer that laundered the N different backend types through the same __cxx_atomic... interface. I will come back with a patch that works that way.

1045 ↗(On Diff #184517)

SGTM. I'll use impl in the name of the resurrected interposer layer.

1059 ↗(On Diff #184517)

When it's not short-circuited (like here) its a tag type to dispatch to the right atomic backend for a scoped memory model. i.e. "acquire @device-scope" or "release @system-scope".

This part of the patch is insufficiently-justified, I understand, feel free to ask me more questions.

Here I scoped out (pun intended) the amount of plumbing one would need and it's very little (this line above, and each occurrence of _Sco in the file). You can tell me to rip this out and come back separately but it will affect the non-lock-free adapter in this patch, more significantly than the rest (as you can see).

1265 ↗(On Diff #184517)

I haven't actually built this in C++03 mode, though I did get some mileage by piggybacking on this path in CUDA, it's not a good enough reason to keep this in the patch if you don't believe the consistency argument. I can just as well drop this part.

ldionne added inline comments.
libcxx/include/atomic
896 ↗(On Diff #184517)

Which symbols in the global namespace are you thinking about?

929 ↗(On Diff #184517)

Well, I was mostly suggesting this approach to see what you (and others) would think of it. It's really the same thing except we change the name from __c11_atomic_xxx to __atomic_xxx_impl or similar. From your reply I gather that you like this better too.

934 ↗(On Diff #184517)

You don't need those to be static inline. They are templates, so they are implicitly inline, and the linkage / visibility is controlled by _LIBCPP_INLINE_VISIBILITY (so no need for static).

1059 ↗(On Diff #184517)

I think I would do that in a separate patch, this way we can have a longer discussion about whether we want it and whether it's the right way of doing it.

1265 ↗(On Diff #184517)

It's just that I don't understand why it's needed. If I did, maybe the consistency argument would begin to make more sense. I'll add other reviewers who might know better.

1905 ↗(On Diff #184517)

What's the benefit of macro-izing this? I guess it's because you want to override it under some circumstances?

__simt__ updated this revision to Diff 185202.Feb 4 2019, 5:36 PM

In this version I've restored the __cxx_atomic_... layer to which both the GCC and C11 backends map. This addresses the comment about introducing more functions named __c11_atomic... which are not part of the C11 builtin set. I do not introduce any new versions of _Atomic anymore.

There are four new configuration macros used by this patch:

  • _LIBCPP_FREESTANDING, requests the mode decoupled from libatomic.a by introducing the non-lock-free specialization. I need to set this in the CUDA configuration.
  • _LIBCPP_HAS_EXTERNAL_ATOMIC_IMP, indicates that there is an implementation of __cxx_atomic_... functions outside this file. For the CUDA configuration, I supply a scoped implementation.
  • _LIBCPP_ATOMIC_FLAG_TYPE, allows the configuration to override the default type of atomic_flag and the non-lock-free specialization's type (the default is bool). I need to override this to be a 32-bit type, I don't have native 8-bit atomics.
  • _LIBCPP_ATOMIC_SCOPE_DEFAULT, is there to plug the plumbing parameter that passes statically typed scope information through __atomic_base to the __cxx_atomic_... layer. It now defaults to a vacuous tag type __cxx_atomic_scope.

There are no new symbols being introduced outside of std::v1::, as there is no longer any attempt to grow the overload set of pre-existing functions.

Sorry if this once-again patch update scrambles the comments a bit.

I like where this is going generally speaking, but there's a couple of things I'd like to understand. It would also be great if other maintainers with more experience could chime in regarding the rename of __c11_atomic_xxx to __cxx_atomic_xxx -- is this removing functionality that we're providing (but we shouldn't be providing :-)?

There are several things in this patch what will not work in C++03. I guess we have to either:

  • Have a discussion about whether <atomic> should support C++03. I vote NO, but I wasn't there when that choice was made and I don't know off the top of my head what the repercussions might be.
  • Rewrite all of this patch in a way that works in C++03. It's possible everywhere (sometimes we might need to use an extension like __typeof), but it's going to be a pain in the neck.
libcxx/include/atomic
651 ↗(On Diff #182520)

I have the same concern -- I like where this patch is going and TBH I don't think that defining _Atomic and __c11_atomic_xxx operations in a user-facing way in libc++ is a good idea. But I want to understand what we're breaking by removing that.

557 ↗(On Diff #185202)

IIUC, it doesn't make sense for libc++ to have _LIBCPP_HAS_EXTERNAL_ATOMIC_IMP since it's a purely non-libc++ thing. Am I wrong? If not, please remove this knob for now and we can re-add it later on when we have a reason to (e.g. when/if we introduce scoped atomics into libc++).

588 ↗(On Diff #185202)

Please get rid of scoped atomics for the moment since those are orthogonal to the rest of the patch.

658 ↗(On Diff #185202)

This rename of __gcc_atomic::__can_assign to __cxx_atomic_type::__can_assign looks wrong -- __cxx_atomic_type is not a namespace. I don't think this will compile. I think that just __can_assign is what you need? Same comment applies below.

1049 ↗(On Diff #185202)

I think the first parameter should be __cxx_atomic_type<_Tp*>, but then that function already exists above.

1054 ↗(On Diff #185202)

Ditto.

1092 ↗(On Diff #185202)

Please implement _LIBCPP_FREESTANDING in __config as

#ifndef __STDC_HOSTED__
#  define _LIBCPP_FREESTANDING
#endif

This way, _LIBCPP_FREESTANDING will be defined whenever -ffreestanding is passed to Clang, which is the goal IIUC.

1109 ↗(On Diff #185202)

Indentation looks weird here.

1135 ↗(On Diff #185202)

None of those functions is going to work in C++03.

__simt__ marked 19 inline comments as done.Feb 5 2019, 9:28 PM

I will come back with another patch that addresses these comments.

Also will start a thread to debate C++03 support.

libcxx/include/atomic
896 ↗(On Diff #184517)

It was __c11_... but now it's moot.

929 ↗(On Diff #184517)

On Clang it's the only approach that works, AFAICT. GCC was more lenient.

934 ↗(On Diff #184517)

Ok. I'll clean that up. I don't think this header is particularly good about minimizing the decorations.

1045 ↗(On Diff #184517)

We should have a top-level discussion about the need for C++03 support.

651 ↗(On Diff #182520)

Agreed. My code used to live under this backend side, I understand what it does.

But, as you found out just below, I didn't test this version on GCC w/o my primary path active.

557 ↗(On Diff #185202)

Actually this has been good for me.

First, there is _LIBCPP_HAS_THREAD_API_EXTERNAL, which I've had to set for CUDA in RTC mode. This is analogous to that, in that you don't know what the underlying support is, but something has vouched that it conforms to the contract with the header.

Second, this allows a back-end that isn't GCC or C11 to slide underneath <atomic>. In that case, <atomic> does nothing to help, it assumes outright that the __cxx_atomic... symbols exist. That's the contract.

I would like to keep this one, but like anything else, I can keep it in the downstream.

588 ↗(On Diff #185202)

Ok. Next patch won't have them.

658 ↗(On Diff #185202)

Whoops. Sorry. Will fix.

1049 ↗(On Diff #185202)

It exists in another #ifdef, right? I could probably use the using name here, if it mattered. This path is the C11 path, though, so _Atomic is what we're looking to match.

FWIW, template type deduction involving _Atomic(_Tp) is terrible.

1092 ↗(On Diff #185202)

Ok. Will do.

I will come back with another patch that addresses these comments.

Great! I'll be away until next Monday, so don't expect replies from me until then.

Also will start a thread to debate C++03 support.

Good, thanks!

Also, I'm still looking for other maintainers to take a look, especially at the part that removes __c11_atomic_xxx functions from the global namespace and removes the #define _Atomic(_Tp). I like those changes, just want to make sure it works for everybody.

libcxx/include/atomic
557 ↗(On Diff #185202)

Seeing the precedent for _LIBCPP_HAS_THREAD_API_EXTERNAL, I am comfortable with keeping this. Ideally we'd have a design document for <atomic> like we do for <thread> (docs/DesignDocs/ThreadingSupportAPI.rst), but I'm not putting that on you.

1049 ↗(On Diff #185202)

I don't understand this. You (rightly) removed all functions of the form __cxx_atomic_xxx(_Atomic(_Tp), ...) in favour of __cxx_atomic(__cxx_atomic_type<_Tp*>, ...) but you only left this one. I don't understand why this one is still here.

__simt__ updated this revision to Diff 185704.Feb 6 2019, 9:15 PM
__simt__ marked an inline comment as done.

In this version:

  1. Macros moved to __config.
  2. Redundant static/_LIBCPP_INLINE_VISIBILITY simplified.
  3. Redundant inline/template simplified.
  4. Applied _LIBCPP_DEFAULT where it is sufficient for c++03.
  5. Scopes removed.
  6. A few cut-n-paste mistakes fixed.

Potentially the last major question I see is whether the new code needs to support c++03 or not.

I need to test the GCC path better, it still has some bugs. Be right back.

__simt__ updated this revision to Diff 186053.Feb 8 2019, 2:59 PM

This version passes libcxx tests with each combination of path that can be used (force GCC, force C11, force freestanding+non-lock-free). There were quite a few problems around volatile that I had not addressed yet, apologies.

ldionne requested changes to this revision.Feb 11 2019, 10:47 AM

Here's a summary of what I understand the current state of the patch is (mostly for my own reference as a bird's eye view):

#if defined(_LIBCPP_HAS_GCC_ATOMIC_IMP)

    template <typename _Tp>
    struct __cxx_atomic_type {
        // Implementation using a raw _Tp and GCC's __atomic_xxx operations
    };

    template<typename _Tp>
    using __cxx_atomic_base_impl = __cxx_atomic_type<_Tp>;

#elif defined(_LIBCPP_HAS_C_ATOMIC_IMP)

    template <typename _Tp>
    struct __cxx_atomic_type {
        // Implementation using an _Atomic(_Tp) and the __c11_atomic_xxx operations
    };

    template<typename _Tp>
    using __cxx_atomic_base_impl = __cxx_atomic_type<_Tp>;

#endif


#if defined(_LIBCPP_FREESTANDING) && defined(__cpp_lib_atomic_is_always_lock_free)

    template<typename _Tp>
    struct __cxx_atomic_lock_impl {
        // Implementation using a raw _Tp and a lock implemented as a __cxx_atomic_base_impl
    };

    template <class _Tp>
    using __cxx_atomic_impl = _Tp is always lock free ? __cxx_atomic_base_impl<_Tp> : __cxx_atomic_lock_impl<_Tp>;

#else

    template <class _Tp>
    using __cxx_atomic_impl = __cxx_atomic_base_impl<_Tp>;

#endif

template <class _Tp>
struct __atomic_base {
    // Implements the interface of <atomic> and uses a __cxx_atomic_impl<_Tp> under the hood
};

Would it make sense to decide whether we want to use GCC's non-lockfree atomics or not based on a configuration macro that's not _LIBCPP_FREESTANDING?

libcxx/include/atomic
555 ↗(On Diff #186053)

Wouldn't it be simpler to say:

#if defined(_LIBCPP_HAS_NO_ATOMIC_HEADER)
#  error <atomic> is not implemented
#endif
592 ↗(On Diff #186053)

I think uses of this can be replaced by declval<_Tp const&>()?

602 ↗(On Diff #186053)

Would std::is_assignable<_Td&, _Tp const&> work too instead? Those traits are available even in C++03 mode.

641–642 ↗(On Diff #186053)

Can you use _VSTD::copy(to, end, from) instead? Also, you should mangle the names to __to, __end and __from.

I admit that having to include <algorithm> here sucks, but at least please lift this definition into a function you can reuse below. In the future we should simply put copy and other basic algorithms in a header like <__algorithm_base> that can be used in places where we don't need all of <algorithm>.

EDIT: Nevermind this: I just saw this was a pre-existing condition in the code. Keeping the comment here to point out this is not great, but you don't have to fix it in this patch.

718 ↗(On Diff #186053)

I'm not seeing the value of this type not being called __cxx_atomic_base_impl directly. Is there any reason for this?

973 ↗(On Diff #186053)

Likewise, this type could be called __cxx_atomic_base_impl directly?

1146 ↗(On Diff #186053)

This #endif doesn't look right anymore.

1199 ↗(On Diff #186053)

__reader & friends should be mangled.

1215 ↗(On Diff #186053)

Can you please put the semi-colon on its own line, like so:

while (1 == __cxx_atomic_exchange(...))
    /* spin */;

or something like that. Just to make it obvious we're not calling __f in the loop. This applies elsewhere too.

This revision now requires changes to proceed.Feb 11 2019, 10:47 AM
__simt__ marked 9 inline comments as done.Feb 11 2019, 11:52 AM

Would it make sense to decide whether we want to use GCC's non-lockfree atomics or not based on a configuration macro that's not _LIBCPP_FREESTANDING?

Sure it can make sense. That bridge is somewhere in my future, even. When I could do now is create another _LIBCPP... macro that gets set in the __config portion for atomics when _LIBCPP_FREESTANDING is set.

libcxx/include/atomic
555 ↗(On Diff #186053)

Yes. I'll do that for the next one.

592 ↗(On Diff #186053)

These are there for c++03 support, I think. Much of this section is borrowed from the previous version, but expanded to suit the additional uses in the file now.

602 ↗(On Diff #186053)

Are they? I can probably use them then. When I test in c++03 mode next, I'll experiment with the substitution.

641–642 ↗(On Diff #186053)

Thanks for the EDIT... because the patch is trending towards fixing-all-the-things.

Also, introducing a dependency on <algorithm> is a problem for me right now. I'd like to avoid it for a while longer, or else create the <__algorithm_base> at that time.

718 ↗(On Diff #186053)

This makes more sense in the scoped version. In this version I can do what you requested here, that's what I'll come back with.

973 ↗(On Diff #186053)

Likewise.

1146 ↗(On Diff #186053)

Right. Will fix.

1199 ↗(On Diff #186053)

Ok. But if we retain c++03 support then this is going to have to change. Or at least be guarded by another #if condition if we temporarily force c++11 support.

1215 ↗(On Diff #186053)

Will do.

__simt__ updated this revision to Diff 186508.Feb 12 2019, 10:19 AM

This version addresses the preceding comments and passes libcxx tests across c++03, 11, 14 and 17 modes.

ldionne accepted this revision.Feb 13 2019, 10:17 AM

LGTM, with some de-duplication suggested. I'm not an atomic wrangler, though, so I'd feel more comfortable if someone like @jfb could quickly double-check the implementation of the atomic operations.

libcxx/include/atomic
1198 ↗(On Diff #186508)

__cxx_atomic_compare_exchange_weak and __cxx_atomic_compare_exchange_strong seem the same. Should one just forward to the other?

This revision is now accepted and ready to land.Feb 13 2019, 10:17 AM

Dear other reviewers, what else needs to happen here?

Hey guys, can I get additional feedback or else proceed with this patch?

Given that I and JF have reviewed the patch and we're OK with it, I'll merge this at the end of the week unless other reviewers chime in. Changes can always be made as a followup, but I'd like to avoid delaying this too much (especially since this is a large change that will quickly become hard to rebase).

ldionne added inline comments.Feb 28 2019, 4:20 PM
libcxx/include/atomic
1337 ↗(On Diff #186508)

It just occurred to me that you define this function (and others) that are only used for the atomic<Integral> specialization. Is there a case in which an integral type would be a non-lockfree atomic? Currently, we do not test non-lockfree integral types in the test suite (at least not on my platform, where I don't think there are non-lockfree atomics for integral types).

__simt__ marked an inline comment as done.Mar 3 2019, 9:36 AM
__simt__ added inline comments.
libcxx/include/atomic
1337 ↗(On Diff #186508)

I'm not aware of any non-lock-free integral or pointer types on a platform that libcxx supports at this moment, but it's conceivable. The minimum HW requirement for C++ implementations is to have at least one of: byte test-and-set, or byte exchange. There exist real platforms that come close to this minimum, and would have non-lock-free atomic_int.

Also, I tested this code with an override to force almost everything to take the non-lock-free path, even when the lock-free path could have been used. So while it's not exercised via normal testing, I have lightly tested it. Maybe twice. :)

You could keep or delete them, either is OK with me, but personally I'd keep them.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 4 2019, 7:25 AM

This commit broke the atomic lldb data formatter.

http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/21037/

@shafik @jingham

It looks like LLDB is naming internal names here. Unfortunately, it now needs to learn the new names. On the plus side, there is now a single stable internal name for all compilation paths for this header, whereas it used to differ. So the final result ought to be seen as an improvement.

This commit broke the atomic lldb data formatter.

http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/21037/

@shafik @jingham

It looks like LLDB is naming internal names here. Unfortunately, it now needs to learn the new names. On the plus side, there is now a single stable internal name for all compilation paths for this header, whereas it used to differ. So the final result ought to be seen as an improvement.

Nobody disagrees but the formatter has to be fixed, hence the two people cc:ed.

This commit broke the atomic lldb data formatter.

http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/21037/

@shafik @jingham

It looks like LLDB is naming internal names here. Unfortunately, it now needs to learn the new names. On the plus side, there is now a single stable internal name for all compilation paths for this header, whereas it used to differ. So the final result ought to be seen as an improvement.

Nobody disagrees but the formatter has to be fixed, hence the two people cc:ed.

Sorry -- I should have run the tests for the LLDB formatters. This is not the first time it happens.

However, seeing how error-prone it is to change something in libc++ without running the LLDB tests, would it make sense to test the data formatters as part of libc++ itself? One way to see the data formatters is as something that libc++ provides to work nicely with LLDB instead of as something LLDB tries to provide for libc++. When seen that way, it makes more sense that the data formatters use private parts of libc++, and the tests should be part of libc++'s test suite. I don't know how big of a change this is, but I'd encourage LLDB folks to consider this as this would reduce a cause of semi-frequent mistakes.

shafik added a comment.Mar 4 2019, 4:08 PM

This commit broke the atomic lldb data formatter.

http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/21037/

@shafik @jingham

It looks like LLDB is naming internal names here. Unfortunately, it now needs to learn the new names. On the plus side, there is now a single stable internal name for all compilation paths for this header, whereas it used to differ. So the final result ought to be seen as an improvement.

Nobody disagrees but the formatter has to be fixed, hence the two people cc:ed.

Sorry -- I should have run the tests for the LLDB formatters. This is not the first time it happens.

However, seeing how error-prone it is to change something in libc++ without running the LLDB tests, would it make sense to test the data formatters as part of libc++ itself? One way to see the data formatters is as something that libc++ provides to work nicely with LLDB instead of as something LLDB tries to provide for libc++. When seen that way, it makes more sense that the data formatters use private parts of libc++, and the tests should be part of libc++'s test suite. I don't know how big of a change this is, but I'd encourage LLDB folks to consider this as this would reduce a cause of semi-frequent mistakes.

Some of us had a brief discussion about this and it would probably require us to move them to python as opposed to C++ and that would not be a trivial amount of work.

I have a fix that I am testing now, I would be happy to explain it to anyone interested in understanding how to work with the formatters once I commit it.