This is an archive of the discontinued LLVM Phabricator instance.

Fix time trace multi threaded support with LLVM_ENABLE_THREADS=OFF
AbandonedPublic

Authored by russell.gallop on Dec 16 2019, 6:56 AM.

Diff Detail

Event Timeline

russell.gallop created this revision.Dec 16 2019, 6:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 16 2019, 6:56 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
Kai added a comment.Dec 16 2019, 10:06 AM
This comment was removed by Kai.
Kai accepted this revision.Dec 16 2019, 10:07 AM

LGTM. Thanks!

This revision is now accepted and ready to land.Dec 16 2019, 10:07 AM
ruiu accepted this revision.Dec 16 2019, 8:12 PM
ruiu added a subscriber: ruiu.

LGTM

This revision was automatically updated to reflect the committed changes.

This broke a whole lot of bots like http://lab.llvm.org:8011/builders/clang-s390x-linux/builds/28924 or http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/30851:

[2/3225] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/TimeProfiler.cpp.o
FAILED: lib/Support/CMakeFiles/LLVMSupport.dir/TimeProfiler.cpp.o 
/usr/bin/c++  -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/Support -I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/llvm/lib/Support -Iinclude -I/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/llvm/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment -fdiagnostics-color -ffunction-sections -fdata-sections -O3    -UNDEBUG -std=c++14  -fno-exceptions -fno-rtti -MD -MT lib/Support/CMakeFiles/LLVMSupport.dir/TimeProfiler.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/TimeProfiler.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/TimeProfiler.cpp.o -c /home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/llvm/lib/Support/TimeProfiler.cpp
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/llvm/lib/Support/TimeProfiler.cpp:36:54: error: non-local variable 'llvm::TimeTraceProfilerInstance' declared '__thread' has a non-trivial destructor
 LLVM_THREAD_LOCAL std::unique_ptr<TimeTraceProfiler> TimeTraceProfilerInstance;
                                                      ^~~~~~~~~~~~~~~~~~~~~~~~~
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/llvm/lib/Support/TimeProfiler.cpp:36:54: note: C++11 'thread_local' allows dynamic initialization and destruction

I don't think LLVM_THREAD_LOCAL is correct here. unique_ptr isn't a POD so this doesn't really fit to the use case of` LLVM_THREAD_LOCAL`. Maybe we should have a LLVM_NONTRIVIAL_THREAD_LOCAL. that doesn't fall back to __thread. Or we just use raw pointer here instead of a global unique_ptr (it seems that we anyway use it here just as an annotation to describe memory ownership?).

I don't think LLVM_THREAD_LOCAL is correct here. unique_ptr isn't a POD so this doesn't really fit to the use case of` LLVM_THREAD_LOCAL`. Maybe we should have a LLVM_NONTRIVIAL_THREAD_LOCAL. that doesn't fall back to __thread. Or we just use raw pointer here instead of a global unique_ptr (it seems that we anyway use it here just as an annotation to describe memory ownership?).

I'd prefer a "thread_local if threads enabled" as the unique_ptr helps enforce ownership when https://reviews.llvm.org/D71060 lands.

Interestingly, thread_local was okay on all public bots, the only problem was with LLVM_ENABLE_THREADS=OFF so maybe LLVM_THREAD_LOCAL could be simplified to "thread_local if threads enabled" now (removing declspec(thread) and thread). I don't know enough about how that is used to know whether that would be okay.

russell.gallop updated this revision to Diff 234270.EditedDec 17 2019, 4:25 AM

More specific fix. Just avoid thread_local when LLVM_ENABLE_THREADS=OFF.

russell.gallop reopened this revision.Dec 17 2019, 4:25 AM
This revision is now accepted and ready to land.Dec 17 2019, 4:25 AM
russell.gallop requested review of this revision.Dec 17 2019, 4:26 AM

@ruiu or @Kai I would like this to go in with D71060. Are you okay with this revised fix?

Thanks

Ping! Okay to land this? (I moved it back to needs review as this was very different from before).

llvm/include/llvm/Support/TimeProfiler.h
18

This patch still doesn't unbreak the build on Mac OSX:

/llvm-project/llvm/include/llvm/Support/TimeProfiler.h:24:8: error: thread-local storage is not supported for the current target
extern THREAD_LOCAL_IF_THREADS_ENABLED std::unique_ptr<TimeTraceProfiler>
       ^

Personally, I'd like to see the original patch reverted until you can find a way to do it without thread_local. ...And it seems like you do have a way? If the code works without using thread_local at all, then what's the logic behind adding thread_local to it conditionally? Given that the version without thread_local works fine, can we just use that version on all platforms?

russell.gallop marked an inline comment as done.Dec 23 2019, 6:30 AM
russell.gallop added inline comments.
llvm/include/llvm/Support/TimeProfiler.h
18

Ah, I thought that the Mac OSX build was using LLVM_ENABLE_THREADS=OFF, but it has that on, but doesn't have C++11 thread_local. Does it use the gcc __thread (e.g. for LLVM_THREAD_LOCAL)?

And it seems like you do have a way?

I'm not sure what you're referring to here:

a) The original RFC for LLD time trace (https://reviews.llvm.org/D69043) had a lock heavy version which was massively inefficient.
b) This works without thread_local when LLVM_ENABLE_THREADS is OFF because there is only one thread. That doesn't achieve the desired effect when LLVM_ENABLE_THREADS is ON.
c) Using LLVM_THREAD_LOCAL again, and getting that working with __thread. I think this could be made to work.

Given that the version without thread_local works fine, can we just use that version on all platforms?

b) above doesn't work multi-threaded. a) above is very inefficient.

Apologies if I've misunderstood.

So I believe that you're suggesting is a version which is able to use another form of TLS rather than c++11 thread_local? If so, I'll try to get a version going which works like that (and revert in the meantime).

llvm/include/llvm/Support/TimeProfiler.h
18

The version you're proposing here, using THREAD_LOCAL_IF_THREADS_ENABLED, specifically falls back to a single (per-process) global variable if thread_local isn't supported. So clearly the version with a single per-process variable does work (or else this patch is broken). You may be right that the version with a single per-process variable is slow, though.

It looks to me as if there is code in llvm/include/llvm/Support/Compiler.h to do exactly this macro-ization. @teemperor said don't use that code because it can fall back to __thread which doesn't work for non-PODs. However, your solution here is to use the thread_local keyword even on platforms where __has_feature(cxx_thread_local) is false, which seems wrong.

How about you roll the original patch back for now, just to get the build back in a stable state, and then either

  • submit a new patch that adds a new LLVM_NONPOD_THREAD_LOCAL macro to Compiler.h so that you can use it with unique_ptr, or
  • rework this patch so that it doesn't rely on ~unique_ptr to do its cleanup, so that then you can simply use LLVM_THREAD_LOCAL?

However, in either case, it seems like the LLVM_THREAD_LOCAL macro can fall back to nothing, i.e., a single per-process global variable. So whatever you do with LLVM_THREAD_LOCAL, you need to make sure that it's a pure optimization and that the underlying global-variable version still works correctly (if slowly).

llvm/include/llvm/Support/TimeProfiler.h
18

Btw, on Mac OSX, LLVM_THREAD_LOCAL expands to __thread. (I just tested it locally.)

russell.gallop marked an inline comment as done.Dec 24 2019, 3:45 AM
russell.gallop added inline comments.
llvm/include/llvm/Support/TimeProfiler.h
18

Thanks for the info.

Multi-threaded time trace (D71059) reverted in 2e9bfa12ff. I'll take a look at this in the new year.

russell.gallop abandoned this revision.Jan 6 2020, 8:43 AM

Abandoning this review. Will roll fix for these issues into new version on D71059.