Page MenuHomePhabricator

[libc++] Implement `stop_token`
Needs ReviewPublic

Authored by huixie90 on Thu, Mar 2, 1:24 PM.


Group Reviewers
Restricted Project

Diff Detail

Unit TestsFailed

4,610 mslibcxx CI ASAN > llvm-libc++-shared-cfg-in.std/thread/thread_stoptoken/stopcallback::dtor.pass.cpp
Script: -- : 'COMPILED WITH'; /usr/bin/clang++-17 /home/libcxx-builder/.buildkite-agent/builds/a8cf9f586a97-1/llvm-project/libcxx-ci/libcxx/test/std/thread/thread.stoptoken/stopcallback/dtor.pass.cpp --target=x86_64-unknown-linux-gnu -g -fno-omit-frame-pointer -fsanitize=address -nostdinc++ -I /home/libcxx-builder/.buildkite-agent/builds/a8cf9f586a97-1/llvm-project/libcxx-ci/build/generic-asan/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/a8cf9f586a97-1/llvm-project/libcxx-ci/build/generic-asan/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/a8cf9f586a97-1/llvm-project/libcxx-ci/libcxx/test/support -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -Werror=thread-safety -Wuser-defined-warnings -lc++experimental -nostdlib++ -L /home/libcxx-builder/.buildkite-agent/builds/a8cf9f586a97-1/llvm-project/libcxx-ci/build/generic-asan/lib -Wl,-rpath,/home/libcxx-builder/.buildkite-agent/builds/a8cf9f586a97-1/llvm-project/libcxx-ci/build/generic-asan/lib -lc++ -pthread -o /home/libcxx-builder/.buildkite-agent/builds/a8cf9f586a97-1/llvm-project/libcxx-ci/build/generic-asan/test/std/thread/thread.stoptoken/stopcallback/Output/dtor.pass.cpp.dir/t.tmp.exe
3,450 mslibcxx CI Apple back-deployment macosx10.15 >
Script: -- : 'RUN: at line 11'; /Applications/ -c /Users/libcxx-buildkite-agent/libcxx.buildkite-agent/builds/y10-8-macminivault-com/llvm-project/libcxx-ci/libcxx/test/libcxx/ -o /Users/libcxx-buildkite-agent/libcxx.buildkite-agent/builds/y10-8-macminivault-com/llvm-project/libcxx-ci/build/apple-system-backdeployment-10.15/test/libcxx/Output/ -isysroot /Applications/ --target=x86_64-apple-macosx10.15 -nostdinc++ -I /Users/libcxx-buildkite-agent/libcxx.buildkite-agent/builds/y10-8-macminivault-com/llvm-project/libcxx-ci/build/apple-system-backdeployment-10.15/include/c++/v1 -I /Users/libcxx-buildkite-agent/libcxx.buildkite-agent/builds/y10-8-macminivault-com/llvm-project/libcxx-ci/libcxx/test/support -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -Werror=thread-safety -Wuser-defined-warnings
6,150 mslibcxx CI Apple back-deployment macosx10.15 > apple-libc++-backdeployment-cfg-in.libcxx::min_max_macros.compile.pass.cpp
Script: -- : 'COMPILED WITH'; /Applications/ /Users/libcxx-buildkite-agent/libcxx.buildkite-agent/builds/y10-8-macminivault-com/llvm-project/libcxx-ci/libcxx/test/libcxx/min_max_macros.compile.pass.cpp -isysroot /Applications/ --target=x86_64-apple-macosx10.15 -nostdinc++ -I /Users/libcxx-buildkite-agent/libcxx.buildkite-agent/builds/y10-8-macminivault-com/llvm-project/libcxx-ci/build/apple-system-backdeployment-10.15/include/c++/v1 -I /Users/libcxx-buildkite-agent/libcxx.buildkite-agent/builds/y10-8-macminivault-com/llvm-project/libcxx-ci/libcxx/test/support -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -Werror=thread-safety -Wuser-defined-warnings -fsyntax-only
124,620 mslibcxx CI Apple back-deployment macosx10.15 >
Script: -- : 'RUN: at line 58'; echo "" > /Users/libcxx-buildkite-agent/libcxx.buildkite-agent/builds/y10-8-macminivault-com/llvm-project/libcxx-ci/build/apple-system-backdeployment-10.15/test/libcxx/Output/
6,800 mslibcxx CI Apple back-deployment macosx10.15 > apple-libc++-backdeployment-cfg-in.libcxx::nasty_macros.compile.pass.cpp
Script: -- : 'COMPILED WITH'; /Applications/ /Users/libcxx-buildkite-agent/libcxx.buildkite-agent/builds/y10-8-macminivault-com/llvm-project/libcxx-ci/libcxx/test/libcxx/nasty_macros.compile.pass.cpp -isysroot /Applications/ --target=x86_64-apple-macosx10.15 -nostdinc++ -I /Users/libcxx-buildkite-agent/libcxx.buildkite-agent/builds/y10-8-macminivault-com/llvm-project/libcxx-ci/build/apple-system-backdeployment-10.15/include/c++/v1 -I /Users/libcxx-buildkite-agent/libcxx.buildkite-agent/builds/y10-8-macminivault-com/llvm-project/libcxx-ci/libcxx/test/support -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -Werror=thread-safety -Wuser-defined-warnings -fsyntax-only
View Full Test Results (25 Failed)

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Herald added a project: Restricted Project. · View Herald TranscriptThu, Mar 2, 1:24 PM
Herald added a subscriber: arichardson. · View Herald Transcript
huixie90 requested review of this revision.Thu, Mar 2, 1:24 PM
Herald added a project: Restricted Project. · View Herald TranscriptThu, Mar 2, 1:24 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript

I mainly glanced over the patch and didn't look at the wording.


After this patch it's partial :-) Can you document which parts are still missing?

We have multiple partial done papers, without any information what is and what is not done. That makes it hard to finish these papers.


I miss _LIBCPP_HIDE_FROM_ABI on most (all?) functions.


maybe move constants __stop_requested_bit here too, that makes it clear they belong together.


Did you consider using RAII based locking?

huixie90 updated this revision to Diff 503119.Tue, Mar 7, 11:47 AM
huixie90 marked 2 inline comments as done.

add tests

huixie90 updated this revision to Diff 503161.Tue, Mar 7, 2:59 PM

add tests. addressed feedback

huixie90 retitled this revision from [libc++][In Progress] Implement `stop_token` to [libc++] Implement `stop_token`.Tue, Mar 7, 3:00 PM
huixie90 edited the summary of this revision. (Show Details)
huixie90 updated this revision to Diff 503163.Tue, Mar 7, 3:04 PM

hidden from abi

huixie90 marked 2 inline comments as done.Tue, Mar 7, 3:05 PM
huixie90 updated this revision to Diff 503467.Wed, Mar 8, 11:55 AM

try again

Geez, this is more complicated than I expected. Thanks a lot for working on this, we'll get through the whole thing.


Would it make sense to try to extract this into a more general utility? It's really tricky to review, as we both just discussed it for a while, and I think trying to extract it into its own utility would help us make gradual progress on the review.


Are we ever calling this with __is_locked == false? If so, it seems like this class is kind of like an optional<__lock_guard>. I would have expected that this constructor instead takes something like a __already_locked_t tag to express that the lock has already been taken.


Maybe we should look into extracting the intrusive linked-list related functionality into something reusable? Would that be doable, and if so would it make the code simpler?

Thanks a lot for working on this!

I looked at the header code and did some reviewing, but not completely.

This code is quite hard to understand and the Standard gives implementers a lot of freedom how to implement this. So I really would like more documentation regarding the design and some of the libc++ specific pre and post conditions. Without this information it will very hard to review and maintain this code.


Since you implement this feature using eelis instead of a paper you get
LWG3254 ( Strike stop_token's operator!=) for free, can you mark this issue as done too?


Can you use a not to mark the status. Look at the line above P0645 for an example.


We're already in the private section.


Can you add more documentation regarding what these fields mean.

Looking at the loop in
LIBCPP_HIDE_FROM_ABI bool __add_callback(__stop_callback_base* __cb) noexcept
I see several flag tested and the lock is the latest to be tested.

I would expect the lock the first to be tested.


Please don't use auto. This makes it hard to review the code or use it without and IDE. I need to look at load to see the type. (Note that the LLVM coding style also discourage auto.)


Can you explain why this relaxed it the right order?


How do you prevent calling the callback multiple times.


Note my atomics knowledge is a bit rusty.
Why is this memory_order_relaxed instead of memory_order_release?

AFAIK the current increment may not be visible in a different thread doing a decrement.
Which may cause use-after-free and double-free issues of the __state.


I assume the feature will get a FTM once done.

huixie90 added inline comments.Sun, Mar 12, 2:25 PM

yeah I will refactor/rename the code and add more comments. the lock bit is for locking the linked-list. The reason why it was not locked first is that the spec specified that if the requested bit was set, we won't go through the linked-list and instead directly invoke the callback on the current thread.


will add comments. all the atomic operations here won't need any memory ordering except the one actually does the lock.


once it is called, the next line would return.


Note that this is the same case for shared_ptr where increment is relaxed and decrement is acq_rel

The memory ordering controls the visibility of the write operations (including non-atomic operations) prio to current atomic operation, and read operations after the current atomic operation.

For increment, we don't need to do anything before and afterwards, so memory_order_relaxed is enough.
For decrement, we are potentially evaluating the destructor of underlying type T. We need to make sure that there are no other threads that are still poking around the underlying object through a different shared_ptr instance. i.e. any operations on the underlying object on any thread should be "finished" before evaluating the destructor.

We need release to tell other threads that we are done with the underlying object and all the writes to the underlying object should be visible now. (suppose you are writing to the underlying object before the shared_ptr goes out of scope, so that in case a different thread which has the last shared_ptr calls the destructor, it needs to see this write)
In case we are calling the destructor, we need acquire to see other threads write (see above).
So decrement needs both.

The way I understand is that:
acquire is git pull. All the read operations after it would see the latest.
release is git push. All the write operations before it would be visible by other threads which pulls.

please also note that shared_ptr (similarly here) does not protect you from incorrect use. When used properly, the copy operations would either copy from a nullptr or a shared_ptr which ref count is at least 1.. if you invoke a copy constructor and spawn a thread to assign the copied-from instance to nullptr. it is plain UB because you are writing and reading to the same shared_ptr object (data race)


yes there is only one FTM for stop_token and jthread. will get the FTM once jthread is done

huixie90 updated this revision to Diff 504487.Sun, Mar 12, 5:25 PM

spliting the lock and the linked list into separate classes

huixie90 added inline comments.Mon, Mar 13, 7:19 AM

Adding to my previous explanation. I think you might mix the visibility of other operations prior/after the atomic operations, with the visibility of the atomic operation itself. I think we never talk about the visibility of atomic operation itself when we talk about memory_order. The question of memory_order is, when atomic load sees another atomic store, what other operations are guaranteed to be seen. for example

int a = 0;
atomic<int> b = 0;

// thread A
a = 5;, std::memory_order_release);

// thread B
auto tmp = b.load(std::memory_order_acquire);
assert(tmp!=3 || a==5);

what the memory ordering concerns is that when thread B loads b and the value is 3, the release/acquire memory order ensures that thread B can also see that a has value 5.
Whether or not thread B sees b is 3 is not memory order problem. It is just a timing issue,

huixie90 updated this revision to Diff 504758.Mon, Mar 13, 11:02 AM
huixie90 marked 13 inline comments as done.

address comments

huixie90 added inline comments.Mon, Mar 13, 12:25 PM

extracted the locking logic into atomic_unique_lock.h


refactor the class. Instead of constructing with boolean, now the class has a constructor that takes early-fail-return predicate

huixie90 updated this revision to Diff 505767.Thu, Mar 16, 4:42 AM

try again

huixie90 updated this revision to Diff 506962.Tue, Mar 21, 6:51 AM

Fix typo as pointed out by Arthur

Thanks for the additional information and the additional documentation. I see the CI is still red, so I prefer to review it with a green CI. (Especially the thread sanitizer.) Feel free to ping me when the CI is green.


Thanks for the additional information. I also reread some parts of Anthony Williams book.