On Darwin, we need to track thread and tid as separate values.
This patch modifies suspended thread info to be more general, to
allow for this behavior.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/sanitizer_common/sanitizer_stoptheworld_mac.cc | ||
---|---|---|
32–40 ↗ | (On Diff #93391) | Does this return the same value as pthread_threadid_np? Can we use it instead? Or even better, use GetTid from sanitizer_mac.cc? Calling thread_info here sounds very expensive. Why not store the thread_id's in the array directly? If we really need to know both thread_t and thread_id, then we should have a struct that contains both and store the struct in the array. |
lib/sanitizer_common/sanitizer_stoptheworld_mac.cc | ||
---|---|---|
32–40 ↗ | (On Diff #93391) | I think it's reasonable to use a struct instead. There is a fair amount of existing code that requires the thread_id, but the code for accessing register info will require the thread_t, so there's definitely a need for both, especially if it's expensive to convert. |
lib/sanitizer_common/sanitizer_stoptheworld_mac.cc | ||
---|---|---|
32–40 ↗ | (On Diff #93391) | I don't think it's expensive to convert, but we should make sure to only do the conversion once per thread. |
lib/sanitizer_common/sanitizer_stoptheworld_mac.cc | ||
---|---|---|
32–40 ↗ | (On Diff #93391) | Does that go for pthread_threadid_np as well? I don't think there's a good way to cache the values and ensure it's only called once, other than using a struct to store both. |
lib/sanitizer_common/sanitizer_stoptheworld_mac.cc | ||
---|---|---|
32–40 ↗ | (On Diff #93391) | Looks like the pthread call takes a pthread_t and not a thread_t anyway. I'll try using the struct with a single call to thread_info. |
lib/sanitizer_common/sanitizer_stoptheworld_mac.cc | ||
---|---|---|
32–40 ↗ | (On Diff #93391) | Right. I'm suggesting to use the struct to store both. |
Use a struct to store both thread and tid
I'm not sure how great this patch is, I don't love that we store
the tid twice on Linux.
If it's helpful for the purposes of design/review, I can also upload the 3 followup patches that actually use this functionality. I've been trying to avoid overwhelming with too many patches at once.
lib/sanitizer_common/sanitizer_stoptheworld.h | ||
---|---|---|
38–39 ↗ | (On Diff #93411) | Can we use actual types here? The typedefs are just confusing. How about this: The contents of this struct will be ifdef'd and use actual types (pid_t on Linux, thread_t+uptr64_t on Darwin). The SuspendedThreadsList API below will only deal with the whole struct, i.e. GetThread returns the whole struct, GetThreadID will be removed, Contains will search for matches of the whole struct (or select one field to be the "primary id", which we'll match for). |
Besides the issue with "uptr tid", LGTM.
lib/sanitizer_common/sanitizer_stoptheworld.h | ||
---|---|---|
32 ↗ | (On Diff #93489) | uptr is not correct here; macOS has 64-bit thread IDs even on i386 |
lib/sanitizer_common/sanitizer_stoptheworld.h | ||
---|---|---|
20 ↗ | (On Diff #93489) | please avoid any #if/#ifdef in this file (and in most others). |
Hmm. In order to do this without using any #ifdef's, I think we'll need to move the struct into two new header files - sanitizer_stoptheworld_linux.h and sanitizer_stoptheworld_mac.h. Other than that, I think we should be fine removing them, if that sounds like an okay solution. I'm not sure that any existing plaform-specific headers really work for this.
lib/sanitizer_common/sanitizer_stoptheworld.h | ||
---|---|---|
21 ↗ | (On Diff #93489) | And more generally, please do not include *any* system headers in any sanitizer file, unless it already includes other system headers. |
Ping - this blocks several more Darwin LSAN patches I have waiting to send for review.
As an update to the concern raised by @kcc regarding platform-specific differences, the primary reason why we store two values (tid + thread_t) on Darwin and only one (tid) on linux is performance. We could remove platform-specific headers as well as the ifdef's if we only stored the tid on darwin as well, but that would require conversion between tid and thread_t in several cases (reading register state, etc). This is a relatively expensive conversion (requires a syscall), so we prefer to do the conversion only once and store the result, even though this does introduce some differences between the platforms.
I think that the current implementation, which uses platform-specific headers (but no system headers), and no ifdefs is a reasonable compromise. With this change: D31774, it could actually become even more straightforward, although platform-specific headers would still be required to store the thread_t. However, if we'd prefer to store only the tid on Darwin as well, and accept the cost of the conversion to thread_t each time it's required, I can upload a patch to do that.
Should we commit and merge with D31774 first?
lib/sanitizer_common/sanitizer_stoptheworld.h | ||
---|---|---|
30 ↗ | (On Diff #93538) | const SuspendedThreadInfo& GetThread |
39 ↗ | (On Diff #93538) | I think we should stick to Contains(SuspendedThreadID thread_id), it's semantics is to check whether the thread info for the given thread id is already appended to the list, not that the list contains the same info. Maybe even rename it to ContainsTid. |
You can probably feel free to add yourself as a reviewer to that diff, I assume it may not have been merged since we weren't sure whether it's a solution we want for this patch.
This breaks compilation of asan for windows unless we define a dummy struct there as well
Sorry for the push back, but the sanitizer_stoptheworld_win.h makes things really ugly. Now it feels like the entire SuspendedThreadsList implementation need to go into *_linux and *_mac specific files, leaving only the interface in sanitizer_stoptheworld.h:
class SuspendedThreadsList {
public:
virtual uptr GetThreadCount() const = 0; virtual SuspendedThreadID GetThreadID(uptr index) const = 0; virtual uptr RegisterCount() = 0; virtual PtraceRegistersStatus GetRegistersAndSP(uptr index, uptr *buffer, uptr *sp) const = 0;
};
The actual implementation SuspendedThreadsListLinux/Mac is free to store whatever it needs. Does it look cleaner?
Yeah, I was growing pretty dissatisfied with the current state as well - I like the abstract class idea. However, it will mean we need a windows implementation. The stoptheworld header is included by lsan_common, which is included by asan, even when lsan is disabled. I could write a dummy implementation for windows, with CHECK(0 && "not implemented") in all the functions, if that seems acceptable.
Split out platform-specific implementations of suspended threads list
Turns out windows isn't needed for compilation.
Yes, CHECK(0 && "not implemented") makes much more sense to the reader than dummy tid struct, it means exactly what it looks like, the functionality is not implemented.
It turns out that even the CHECK() isn't necessary for compilation on windows (I checked that asan+windows builds with this patch). Presumably if someone attempts to actually use stoptheworld on Windows, they'll hit a build error from the abstract class being unimplemented. I think this is exactly what we'd want (and even better than a CHECK).
Looks much better, thank you!
lib/sanitizer_common/sanitizer_stoptheworld.h | ||
---|---|---|
39 ↗ | (On Diff #95445) | Please rename it to ThreadCount. |
lib/sanitizer_common/sanitizer_stoptheworld_linux_libcdep.cc | ||
518 ↗ | (On Diff #95445) | Remove this comment. |
521 ↗ | (On Diff #95445) | Add empty line to separate functions. |
lib/sanitizer_common/sanitizer_stoptheworld_mac.cc | ||
55 ↗ | (On Diff #95445) | Remove this comment. |
56 ↗ | (On Diff #95445) | uptr SuspendedThreadsListMac::thread_count() const { return threads_.size(); } and add an empty line between functions. |
Looks like we can't use __cxa_pure_virtual in the sanitizers - replace with UNIMPLEMENTED