- User Since
- Dec 10 2018, 1:57 PM (260 w, 3 d)
Oct 14 2022
Jan 21 2022
I've mentioned this before, but NVIDIA defines _VSTD to something else in our product branch. This diff implies that we will need a matching, opposite diff on our side, which separates us further from upstream.
Nov 22 2021
I've just read up on the backstory of this.
Oct 20 2021
Hmm. I lost track of how/when this constructor got defined as constexpr. It doesn't mean that it's unimplementable with platform semaphores, but to do so would make the implementation a lot more complicated, slower, and the object even larger than it already is.
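To make the constexpr point concrete, here is a minimal sketch, assuming a semaphore whose only state is an atomic counter. The names `sketch_semaphore`, `g_sketch_sem` and `sketch_semaphore_demo` are hypothetical and this is not the libc++ implementation. It leaves out blocking entirely. A platform semaphore such as POSIX `sem_t` needs a runtime `sem_init` call, which is what makes a constexpr constructor awkward on that path.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; not the libc++ implementation.
class sketch_semaphore {
    std::atomic<std::ptrdiff_t> count_;

public:
    // constexpr is straightforward because the only state is an atomic
    // integer, whose own constructor is constexpr.
    constexpr explicit sketch_semaphore(std::ptrdiff_t initial) noexcept
        : count_(initial) {}

    sketch_semaphore(const sketch_semaphore&) = delete;
    sketch_semaphore& operator=(const sketch_semaphore&) = delete;

    // Make n more units available.
    void release(std::ptrdiff_t n = 1) noexcept {
        count_.fetch_add(n, std::memory_order_release);
    }

    // Non-blocking acquire: decrement only while the count is positive.
    bool try_acquire() noexcept {
        std::ptrdiff_t old = count_.load(std::memory_order_relaxed);
        while (old > 0) {
            if (count_.compare_exchange_weak(old, old - 1,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed))
                return true;
        }
        return false;
    }
};

// Static storage plus a constexpr constructor: no runtime init call needed.
sketch_semaphore g_sketch_sem{1};

// Small self-check on a local object so it can be called repeatedly.
inline bool sketch_semaphore_demo() {
    sketch_semaphore s{1};
    assert(s.try_acquire());   // one unit available
    assert(!s.try_acquire());  // now empty
    s.release();
    return s.try_acquire();    // available again
}
```

A real implementation also has to block and wake waiters, which is where the platform primitives and the extra size come in.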
Mar 10 2021
Jan 19 2021
This change looks good to me, thanks!
Dec 10 2020
I'm drawing a blank. Proceed.
Dec 7 2020
I think Jonathan is asking whether there is a match in the gray areas.
Nov 23 2020
I'm not sold on this but I'm not deeply opposed either.
Nov 20 2020
I've resumed looking at the library code.
Sep 16 2020
Sep 14 2020
We're looking at this patch in the context of the libcudacxx fork. I don't have more feedback right now but we probably will get there soon.
Sep 11 2020
Also adding a test case.
I will revert this part immediately, thanks for the heads up!
Sep 9 2020
Sep 8 2020
This version is now clean on Mac and Linux.
Removed spurious whitespace changes
I figured out I could abandon most of the noisy changes. Dropping them out.
Jun 4 2020
I'm not an expert in the details but this looks right to me.
Jun 1 2020
Apr 7 2020
Yes! I can attest that the old text wasted a bunch of my time and the new text is what works.
Mar 23 2020
What does the Linux syscall have to do with MUSL?
Mar 18 2020
Feb 28 2020
Feb 27 2020
(I assume I'm not seeing a code review being used to veto a C++ Standard feature, but actually the other points are the reason for the red flag.)
Feb 25 2020
Feb 17 2020
This revision addresses the outstanding comments, and incorporates Louis' recommended macros for ulock detection.
Feb 14 2020
Replies. Also noting that I will include an update to the implementation status HTML page.
Jan 27 2020
Jan 25 2020
A lot of changes in this patch, mostly simplifications based on the areas that got feedback about complexity.
Jan 9 2020
No problem. I just want to know what I need to do in the next patch in order to move forward.
Jan 6 2020
That's a creative use of __cxx_atomic_impl that I hadn't foreseen. Seems to work. Ok.
Jan 2 2020
Nov 18 2019
Nov 13 2019
Hi guys. Could I get some action items or other movement on this review? Would you like me to make it possible to merge it as disabled or as experimental, and send me DRs to fix afterwards?
Oct 29 2019
Closing out some obsolete comments after changes and/or testing.
This revision enables cross-ABI compatibility for building the dylib and the user application with different combinations of these options:
Oct 26 2019
In this version I moved the tree barrier's core algorithm into the dylib so that any functional or performance issue found in it later could still be fixed without breaking ABI. By the same token, it narrows the visibility of the definition of the thread_local symbol it uses.
Oct 24 2019
In this update I repaired the usefulness of <atomic> in C++03 mode. Other headers won't be supported there, but I'm avoiding a regression in <atomic>.
Oct 20 2019
Oct 18 2019
Oof, didn't attach the patch.
Added some documentation for the semaphore and barrier algorithms as comments.
Fixed an incompatibility between the _Atomic(T) backend of <atomic> and the Futex function SFINAE in <__threading_support>.
Simplified how the contention state is conditionally-used.
Oct 17 2019
Refactored the semaphores into layers that each do one clear thing, and dispensed with a lot of #ifdefs. Moved as much semaphore code as possible out of the header and into the dylib.
Oct 10 2019
Oct 9 2019
Was missing an edit to the header CMakeLists to install the new headers.
Oct 8 2019
Sorry for the delay posting the updated patch that addressed the recent comments, but I ran into some build issues.
I've processed a number of issues for my next patch.
Oct 6 2019
My Mac tests didn't catch some Linux issues when I added __ almost everywhere.
Closed some of the comments that have been addressed.
This new patch has more context, more __, fewer escapes from the CUDA port.
Oct 5 2019
Urg, sorry for struggling with UX here.
Oct 4 2019
Cleaned up a dead macro.
Removed some mentions of CUDA that slipped by.
Cleaned up the patch a bit.
Jul 26 2019
I think we want people to use -ffreestanding more for things like this, so the part where this is circumvented isn't clearly the right choice. The rest seems fine, and is exactly the reason why I created _LIBCPP_ATOMIC_ONLY_USE_BUILTINS.
Jun 13 2019
Jun 11 2019
I know some things about CUDA, volatile and C++. Let's see if I can understand the part of the proposed change that involves these. I don't understand the part about the test, but I don't need to, so I'll ignore that.
Mar 4 2019
Mar 3 2019
Feb 25 2019
Hey guys, can I get additional feedback or else proceed with this patch?
Feb 14 2019
Dear other reviewers, what else needs to happen here?
Feb 12 2019
This version addresses the preceding comments and passes libcxx tests across c++03, 11, 14 and 17 modes.
Feb 11 2019
Would it make sense to decide whether we want to use GCC's non-lockfree atomics or not based on a configuration macro that's not _LIBCPP_FREESTANDING?
Feb 8 2019
This version passes libcxx tests with each of the paths that can be used (force GCC, force C11, force freestanding+non-lock-free). There were quite a few problems around volatile that I had not addressed yet, apologies.
Feb 7 2019
I need to test the GCC path better, it still has some bugs. Be right back.
Feb 6 2019
In this version:
Feb 5 2019
I will come back with another patch that addresses these comments.
Feb 4 2019
In this version I've restored the __cxx_atomic_... layer to which both the GCC and C11 backends map. This addresses the comment about introducing more functions named __c11_atomic... which are not part of the C11 builtin set. I do not introduce any new versions of _Atomic anymore.
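As a rough sketch of the layering described here, assuming one neutral set of functions that both backends implement: every `sketch_` name below is hypothetical, and this is not the actual `__cxx_atomic_...` code. On Clang it maps to `_Atomic(T)` and the `__c11_atomic_*` builtins, otherwise to the GCC `__atomic_*` builtins, and it only handles sequentially consistent ordering to stay short.

```cpp
#include <cassert>

#if defined(__clang__)
// C11 backend: storage is _Atomic(T), operations are the __c11_atomic_* builtins.
template <class T>
struct sketch_atomic_base {
    _Atomic(T) value;
    constexpr explicit sketch_atomic_base(T v) : value(v) {}
};

template <class T>
T sketch_atomic_load(sketch_atomic_base<T>* a) {
    return __c11_atomic_load(&a->value, __ATOMIC_SEQ_CST);
}

template <class T>
void sketch_atomic_store(sketch_atomic_base<T>* a, T v) {
    __c11_atomic_store(&a->value, v, __ATOMIC_SEQ_CST);
}

template <class T>
T sketch_atomic_fetch_add(sketch_atomic_base<T>* a, T delta) {
    return __c11_atomic_fetch_add(&a->value, delta, __ATOMIC_SEQ_CST);
}

#elif defined(__GNUC__)
// GCC backend: storage is a plain T, operations are the __atomic_* builtins.
template <class T>
struct sketch_atomic_base {
    T value;
    constexpr explicit sketch_atomic_base(T v) : value(v) {}
};

template <class T>
T sketch_atomic_load(sketch_atomic_base<T>* a) {
    return __atomic_load_n(&a->value, __ATOMIC_SEQ_CST);
}

template <class T>
void sketch_atomic_store(sketch_atomic_base<T>* a, T v) {
    __atomic_store_n(&a->value, v, __ATOMIC_SEQ_CST);
}

template <class T>
T sketch_atomic_fetch_add(sketch_atomic_base<T>* a, T delta) {
    return __atomic_fetch_add(&a->value, delta, __ATOMIC_SEQ_CST);
}

#else
#error "This sketch assumes a Clang- or GCC-compatible compiler."
#endif

// Code above the layer only sees the neutral sketch_atomic_* names.
inline int sketch_atomic_demo() {
    sketch_atomic_base<int> a(5);
    sketch_atomic_store(&a, 7);
    int before = sketch_atomic_fetch_add(&a, 3);
    assert(before == 7);
    return sketch_atomic_load(&a);
}
```

The point of the neutral layer is that nothing outside it uses names that look like C11 builtins but are not part of the builtin set.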
Thanks for the comments, Louis, responses below.
Jan 31 2019
Removed an inadvertent #define left in there for testing.
Would be better if it passed the libcxx tests with the feature turned on. Like now.
Fixed some spurious whitespace changes I didn't intend.
Simplified the changes significantly. By switching my back-end to slide under the C11 side instead of the GCC/Clang side, I can live without the new interposer layer.
Jan 30 2019
Quick update before a longer update: I have a simpler patch on the way.
Jan 18 2019
Updated the patch with somewhat higher-quality and better-tested code than what I originally showed.
With apologies, I would just like to tack a note into this thread that the entire *field* of formal memory model proofs involving partially-overlapping atomics is a single paper (last I knew, https://www.cl.cam.ac.uk/~pes20/popl17/mixed-size.pdf). The paper says mixed-size, but the key point is not-perfectly-overlapping.
Just a clarification - please evaluate the design aspects first. There are nits that I know are wrong and am still working on.