This patch wraps an external thread local storage variable inside of a getter function and makes it have internal linkage. This allows LLVM to be built with BUILD_SHARED_LIBS on windows with MinGW. Additionally it allows Clang versions prior to 10 to compile current trunk.
Details
Diff Detail
Event Timeline
Additionally it allows Clang versions prior to 10 to compile current trunk.
I have been using Clang prior to 10 to compile without this change. Do you mean with BUILD_SHARED_LIBS for this as well?
llvm/lib/Support/TimeProfiler.cpp | ||
---|---|---|
35 | You can remove the llvm:: prefix, as this is within an anonymous namespace, like ThreadTimeTraceProfilerInstances. |
He meant when targeting mingw. There were issues with code generation for mingw targets, when accessing one TLS variable from multiple object files but that's fixed in the 10.0 branch and master.
But one can't access a TLS variable directly from a different DLL (as before with extern LLVM_THREAD_LOCAL) at all, which would be necessary when building with BUILD_SHARED_LIBS.
I believe this only occurs when using clang with MinGW as it needs to generate extra wrapper symbols that previously didn't have some needed comdat attributes
llvm/lib/Support/TimeProfiler.cpp | ||
---|---|---|
35 | I don't fully understand. TimeTraceProfiler is in the namespace llvm so I'd get an unknown typename if I were to remove it |
llvm/include/llvm/Support/TimeProfiler.h | ||
---|---|---|
17 | Use camelCase() to match style guide and other code in this header. | |
65 | This puts a function call on the TimeTrace fast path (tracing disabled). Can you please see if this is at all measurable? Try doing three compilations of something big in clang, like SemaDeclAttr.cpp or X86ISelLowering.cpp, with and without this change. I would suggest doing it on Linux, where TLS access is optimized, so any difference would be more visible. | |
llvm/lib/Support/TimeProfiler.cpp | ||
35 | Typically LLVM uses using namespace llvm; in cpp files, but I see that this file lacks it. Perhaps it should be added. |
llvm/include/llvm/Support/TimeProfiler.h | ||
---|---|---|
65 | Worst case, one could have #ifdef _WIN32 #define TimeTraceProfilerInstance getTimeTraceProfilerInstance() #else extern LLVM_THREAD_LOCAL TimeTraceProfilerInstance; #endif, but if it's negligible, it's of course simpler without that. | |
llvm/lib/Support/TimeProfiler.cpp | ||
35 | Oh, sorry, I misread - I thought it was on the variable name - as type qualifier it's ok. |
llvm/include/llvm/Support/TimeProfiler.h | ||
---|---|---|
65 | I tried benchmarking it, and it's hard to say if there's any noticeable difference. On a clang built with GCC 7.3, building X86ISelLowering.cpp with this clang from current master takes 47.467s, 47.186s, 47.488s in three attempts, while it takes 47.675s, 47.287s, 47.336s with this patch. After a few more tests with both, the lowest measurement for the unpatched version is lower than the lowest for the patched version, but there's lots of overlap between the range for measurements for the two versions. If clang is built with LTO, there should't really be any difference at all, as far as I understand. |
I also just measured compiling X86ISelLowering. Once on trunk and once with the patch applied. Using GCC 9.2 as the compiler got me 31.572, 31.843 and 31.936 for the current master and 32.241, 32.370 and 32.306 with the patch applied. So it's measurable on my machine
I realized I had built my clang with asserts enabled, which might dilute the effect of the change. But despite disabling asserts, I can't measure any significant difference due to this change - it's almost the contrary effect, this change actually seems to make it marginally faster (both when built with GCC 7 and Clang 9).
My Machine (a Ryzen 1950X) is probably a rather exotic machine compared to others so I think it's safe to assume that your results are more representative of the real world
Honestly, the numbers from both runs seem within the noise. I'm not very good at stats, though. >_>
So, I think this is good to go.
Use camelCase() to match style guide and other code in this header.