This is an archive of the discontinued LLVM Phabricator instance.

[support] Prepare TimeProfiler for cross-thread support
ClosedPublic

Authored by mbs-modular on Sep 1 2022, 1:52 PM.

Details

Summary

This NFC prepares the TimeProfiler to support the construction
and completion of time profiling 'entries' across threads.

Add ClockType alias so we can change the clock in one place.
(trivial) Use c++ usings instead of typedefs
Rename Entry to TimeTraceProfilerEntry since this type will eventually become public.
Add an intro comment.
Add some smoke unit tests.

Diff Detail

Event Timeline

mbs-modular created this revision.Sep 1 2022, 1:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 1 2022, 1:52 PM
mbs-modular requested review of this revision.Sep 1 2022, 1:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 1 2022, 1:52 PM
mbs-modular edited the summary of this revision. (Show Details)
mbs-modular edited the summary of this revision. (Show Details)
  • woops, BeginningOfTime must always capture now according to the system clock, irrespective of what we use for the profiling clock.
  • Learning arc the hard way.

Thanks for splitting this out. Generally looks good with some comments, and thanks for adding the unit test.

llvm/include/llvm/Support/TimeProfiler.h
53

Thanks for adding some guidance on how this is intended to be used.

I think that this could be slightly clearer/less ambiguous. timeTraceProfilerInitialize and timeTraceProfilerFinishThread should wrap the code in the thread, they shouldn't wrap thread creation.

Likewise with the main process, Initialize/Write and Cleanup are still inside the process. Wrapping the process sounds, to me, like it should be somehow outside of the process. Maybe there is a better way to phrase this, something like wrapping the code rather than the process maybe.

62

I haven't tried it but which trace viewer is best seems subjective!

Note that clang/docs/ClangCommandLineReference.rst suggests chrome://tracing or Speedscope so this could be seen to be a bit inconsistent.

llvm/unittests/Support/TimeProfilerTest.cpp
1

Stale comment.

8

smoke tests?

36

Is there a reason not to use SetUp() and TearDown() (https://github.com/google/googletest/blob/main/docs/primer.md#writing-the-main-function)? They would then be run automatically rather than having to be in each test, making it easier to add new tests.

  • Russell's comments
llvm/include/llvm/Support/TimeProfiler.h
53

Done.

62

Done. However no luck getting Speedscope to work (stack overflow) so I'll mention it with qualification.

llvm/unittests/Support/TimeProfilerTest.cpp
36

Ah, because in the sequel I want to test the explicit entry creation and ending is a no-op if no profiling has been set up. Added a very dummy test here to set the stage for that.

Thanks again @russell.gallop, PTAL. BTW am I using arc right? arc diff HEAD~3 --update D133153 seems wrong.

Thanks again @russell.gallop, PTAL.

I noticed a warning:

F:\git\llvm-project\llvm\lib\Support\TimeProfiler.cpp(75,3): warning: explicitly defaulted default constructor is implicitly deleted [-Wdefaulted-function-deleted]
  TimeTraceProfilerEntry() = default;
  ^
F:\git\llvm-project\llvm\lib\Support\TimeProfiler.cpp(69,23): note: default constructor of 'TimeTraceProfilerEntry' is implicitly deleted because field 'Start' of const-qualified type 'const (anonymous namespace)::TimePointType' (aka 'const time_point<std::chrono::steady_clock>') would not be initialized
  const TimePointType Start;
                      ^
1 warning generated.

Please could you fix that?

BTW am I using arc right? arc diff HEAD~3 --update D133153 seems wrong.

I don't really know as I haven't used arc. I can't see the fix for a few of my comments (e.g. header line of unit test file) so maybe it didn't pick everything up.

I did some experiments compiling a test file. Without this patch there's a difference of < 1% with and without -ftime-trace. With this patch I see around 2% performance impact from adding -ftime-trace. I'm measuring the whole time with hyperfine on Windows. Maybe this is just in the noise. Do you see a similar effect?

  • woops, BeginningOfTime must always capture now according to the system
  • Russell's comments
  • Missed two of Russell's comments

Thx again @russell.gallop

I noticed a warning:

That's a strange one -- can't repro with gcc or stock clang (14) on linux, so perhaps a windows stdlib quirk? The default std::chrono::time_point ctor has been there since c++11, and we explicitly invoke it in a few places, so I'll assume this is a bug we need to work around. So reverted the triggering change.

I did some experiments compiling a test file.

I did introduce an extra std::move level for the overall entry so perhaps I'm being too optimistic with perfect forwarding? Let's avoid the issue entirely and go back to exactly the original impl so that any such cost is only paid for by those who want to use the explicit entry API I'll introduce in the sequel.

PTAL, thx.

mbs-modular marked 5 inline comments as done.Sep 6 2022, 10:53 AM
  • Project even the default StringRef ctor

Something is definitely up now. With the latest patch I see a 12% decrease in performance when using -ftime-trace! With 10 runs there is no overlap between the with and without timings. Are you able to reproduce this difference on a test file?

Some details in case they're important:
Applying patch on top of 0431eb59dfa8f77e460c282dd15006b363d5dbd7
Building with clang-cl+lld-link 12.0.1 on Windows based on VS2019
Building in RelWithDebInfo configuration

mbs-modular added a comment.EditedSep 7 2022, 11:20 AM

This is very disturbing!

On my ubuntu box, with Release build:

CMD="/home/mbs/github/mbs-llvm-project/build/bin/clang++ -DGTEST_HAS_RTTI=0 -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/mbs/github/mbs-llvm-project/build/lib/Support -I/home/mbs/github/mbs-llvm-project/llvm/lib/Support -I/home/mbs/github/mbs-llvm-project/build/include -I/home/mbs/github/mbs-llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Werror=global-constructors -O3 -DNDEBUG -std=c++17  -fno-exceptions -fno-rtti -MD -MT lib/Support/CMakeFiles/LLVMSupport.dir/TimeProfiler.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/TimeProfiler.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/TimeProfiler.cpp.o -c /home/mbs/github/mbs-llvm-project/llvm/lib/Support/TimeProfiler.cpp"

Then for i in {1..5}; do /usr/bin/time -f "%e" $CMD; done
vs for i in {1..5}; do /usr/bin/time -f "%e" $CMD -ftime-trace; done
shows a consistent x1.03 slowdown for clang-14, my main (94c6dfbaebbd5b9474794b2437757dfb6aedefc3) and this branch (d94302069f893f0183fc3053255e287b5e7cc070).

That's comparable to the variance I see between individual runs. So I think it's in the noise for me.

Possibilities:

  • Accidentally changed the clock.
  • Despite all the profile writing code not depending on raw_pwrite_stream statically, there's some magic that saves profile writing time
  • I'm not looking at a large enough run.
mbs-modular added a comment.EditedSep 7 2022, 8:55 PM

Afraid I'm out of ideas.

Accidentally changed the clock.

No, looks good.

Despite all the profile writing code not depending on raw_pwrite_stream statically, there's some magic that saves profile writing time

Can't detect a difference, see 'pwrite' columns below.

I'm not looking at a large enough run.

CLANG="/usr/bin/clang++-14"
CLANG="/home/mbs/github/mbs-llvm-project/build/bin/clang++"
CMD="$CLANG -DGTEST_HAS_RTTI=0 -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/mbs/github/mbs-llvm-project/build/lib/Analysis -I/home/mbs/github/mbs-llvm-project/llvm/lib/Analysis -I/home/mbs/github/mbs-llvm-project/build/include -I/home/mbs/github/mbs-llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG  -fno-exceptions -fno-rtti -std=c++17 -MD -MT lib/Analysis/CMakeFiles/LLVMAnalysis.dir/ValueTracking.cpp.o -MF lib/Analysis/CMakeFiles/LLVMAnalysis.dir/ValueTracking.cpp.o.d -o lib/Analysis/CMakeFiles/LLVMAnalysis.dir/ValueTracking.cpp.o -c /home/mbs/github/mbs-llvm-project/llvm/lib/Analysis/ValueTracking.cpp"

Seems in the noise to me:

main	main	branch	branch	pwrite	pwrite
none	flag	none	flag	none	flag
5.7	5.74	5.55	5.69	5.58	5.61
5.65	5.71	5.52	5.67	5.54	5.57
5.61	5.82	5.98	5.74	5.3	5.53
5.66	5.84	6.49	5.72	5.45	5.53
5.8	6.11	6.1	5.86	5.36	5.62
5.77	5.68	5.78	5.71	5.39	5.52
5.76	5.88	5.57	5.84	5.39	5.56
5.77	5.75	5.87	5.92	5.41	5.46
5.78	5.79	6.03	5.87	5.41	5.52
5.73	5.77	6.53	6.07	5.44	5.5
5.71	5.66	5.58	5.94	5.39	5.55
5.64	5.61	5.6	5.87	5.43	5.53
5.76	5.65	5.49	5.7	5.38	5.55
5.62	5.59	5.52	5.69	5.32	5.52
5.64	5.74	5.69	5.76	5.32	5.55

5.71	5.74	5.69	5.76	5.39	5.53   (medians)
	1.005		1.012		1.026  (factors)
  • woops, BeginningOfTime must always capture now according to the system
  • Russell's comments
  • Missed two of Russell's comments
  • Project even the default StringRef ctor
rriddle added a subscriber: rriddle.EditedSep 13 2022, 12:25 PM

Does the current diff contain all of the intended changes? All of the TimeProfiler changes look NFC. For the current diff (as displayed on phab), I'm struggling to see anything that would change compile time.

Thanks @rriddle

Does the current diff contain all of the intended changes? All of the TimeProfiler changes look NFC.

Yes, this is it, it's supposed to be NFC, and I can't explain @russell.gallop 's claimed regression.
The only mostly-NFC change is I raised the stream type from raw_pwrite_stream to raw_ostream to help the unit tests (trivial as they are).

rriddle accepted this revision.Sep 13 2022, 2:26 PM

Thanks @rriddle

Does the current diff contain all of the intended changes? All of the TimeProfiler changes look NFC.

Yes, this is it, it's supposed to be NFC, and I can't explain @russell.gallop 's claimed regression.
The only mostly-NFC change is I raised the stream type from raw_pwrite_stream to raw_ostream to help the unit tests (trivial as they are).

Can you revert that for now and just use raw_svector_ostream in the unittests? raw_svector_ostream implements raw_pwrite_stream, which should match the existing API. LGTM after that (though also wait if @russell.gallop has comments still).

@russell.gallop can you still repro a regression with the current patch?

llvm/include/llvm/Support/TimeProfiler.h
77

Why the extra include?

This revision is now accepted and ready to land.Sep 13 2022, 2:26 PM
  • woops, BeginningOfTime must always capture now according to the system
  • Russell's comments
  • Missed two of Russell's comments
  • Project even the default StringRef ctor
  • Revert raw_pwrite_steram change to remove any uncertainty about perf regression.
Herald added a project: Restricted Project. · View Herald TranscriptSep 13 2022, 5:12 PM
Herald added subscribers: Restricted Project, Enna1. · View Herald Transcript

Thx again, reverted raw_pwrite_stream change, timings look like +3% with -ftime-trace as usual.

llvm/include/llvm/Support/TimeProfiler.h
77

Just a IWYU force of habit I guess.

[support] Prepare TimeProfiler for cross-thread support

This NFC prepares the TimeProfiler to support the construction
and completion of time profiling 'entries' across threads.

  • Add ClockType alias so we can change the clock in one place.
  • (trivial) Use c++ usings instead of typedefs
  • Rename Entry to TimeTraceProfilerEntry since this type will eventually become public.
  • Add an intro comment.
  • Add some smoke unit tests.
mbs-modular edited the summary of this revision. (Show Details)Sep 14 2022, 6:17 AM
mbs-modular edited the summary of this revision. (Show Details)
lattner accepted this revision.Sep 14 2022, 8:14 AM
russell.gallop added a comment.EditedSep 15 2022, 7:03 AM

@russell.gallop can you still repro a regression with the current patch?

Sorry to take a while to reply.

I rebuilt with Diff 460060 applied to 40ab7875f8ccb8d1b0166bf91475fc4d662cffe9 and yes, I do still see a build performance impact with -ftime-trace. I reproduced this building SemaExpr.cpp from a stage2 build so this should be something that you can reproduce.

In spite of the warning about statistical outliers (it was a pretty quiet PC), there is no overlap between the range of times with and without -ftime-trace so it's not just one or two samples causing this.

f:\git\llvm-project\stage2>%USERPROFILE%\bin\hyperfine -w 1 -m 10 "F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp" "F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp -ftime-trace"
Benchmark #1: F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp
  Time (mean ± σ):     42.646 s ±  0.417 s    [User: 1.3 ms, System: 13.0 ms]
  Range (min … max):   41.783 s … 43.535 s    10 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark #2: F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp -ftime-trace
  Time (mean ± σ):     46.308 s ±  1.108 s    [User: 3.9 ms, System: 9.9 ms]
  Range (min … max):   44.930 s … 49.074 s    10 runs

Summary
  'F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp' ran
    1.09 ± 0.03 times faster than 'F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp -ftime-trace'

And without this patch it's just 1%.

f:\git\llvm-project\stage2>%USERPROFILE%\bin\hyperfine -w 1 -m 10 "F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp" "F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp -ftime-trace"
Benchmark #1: F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp
  Time (mean ± σ):     40.761 s ±  1.346 s    [User: 1.2 ms, System: 7.4 ms]
  Range (min … max):   38.188 s … 42.256 s    10 runs

Benchmark #2: F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp -ftime-trace
  Time (mean ± σ):     41.162 s ±  0.319 s    [User: 2.5 ms, System: 8.7 ms]
  Range (min … max):   40.750 s … 41.845 s    10 runs

Summary
  'F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp' ran
    1.01 ± 0.03 times faster than 'F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp -ftime-trace'

Sorry I don't have time to look into this in more depth, but it is a cause for concern that we might be losing a nice property of -ftime-trace.

Apart from this I'm happy with the patch.

(And many apologies in advance if it's something I'm doing wrong here! I'd be interested if anyone else can reproduce this on SemaExpr. If it's just me then I'd suggest not blocking this patch landing due to this.)

compiler-rt/test/fuzzer/cross_over.test
1 ↗(On Diff #459923)

This doesn't look related to me.

compiler-rt/test/fuzzer/merge-control-file.test
1 ↗(On Diff #459923)

This doesn't look related to me.

jloser accepted this revision.Sep 15 2022, 8:49 AM
jloser added a subscriber: jloser.

@russell.gallop can you still repro a regression with the current patch?

Sorry to take a while to reply.

I rebuilt with Diff 460060 applied to 40ab7875f8ccb8d1b0166bf91475fc4d662cffe9 and yes, I do still see a build performance impact with -ftime-trace. I reproduced this building SemaExpr.cpp from a stage2 build so this should be something that you can reproduce.

In spite of the warning about statistical outliers (it was a pretty quiet PC), there is no overlap between the range of times with and without -ftime-trace so it's not just one or two samples causing this.

f:\git\llvm-project\stage2>%USERPROFILE%\bin\hyperfine -w 1 -m 10 "F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp" "F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp -ftime-trace"
Benchmark #1: F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp
  Time (mean ± σ):     42.646 s ±  0.417 s    [User: 1.3 ms, System: 13.0 ms]
  Range (min … max):   41.783 s … 43.535 s    10 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet PC without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark #2: F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp -ftime-trace
  Time (mean ± σ):     46.308 s ±  1.108 s    [User: 3.9 ms, System: 9.9 ms]
  Range (min … max):   44.930 s … 49.074 s    10 runs

Summary
  'F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp' ran
    1.09 ± 0.03 times faster than 'F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp -ftime-trace'

And without this patch it's just 1%.

f:\git\llvm-project\stage2>%USERPROFILE%\bin\hyperfine -w 1 -m 10 "F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp" "F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp -ftime-trace"
Benchmark #1: F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp
  Time (mean ± σ):     40.761 s ±  1.346 s    [User: 1.2 ms, System: 7.4 ms]
  Range (min … max):   38.188 s … 42.256 s    10 runs

Benchmark #2: F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp -ftime-trace
  Time (mean ± σ):     41.162 s ±  0.319 s    [User: 2.5 ms, System: 8.7 ms]
  Range (min … max):   40.750 s … 41.845 s    10 runs

Summary
  'F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp' ran
    1.01 ± 0.03 times faster than 'F:\git\llvm-project\stage1\bin\clang-cl.exe  /nologo -TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_GNU_SOURCE -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\clang\lib\Sema -IF:\git\llvm-project\clang\lib\Sema -IF:\git\llvm-project\clang\include -Itools\clang\include -Iinclude -IF:\git\llvm-project\llvm\include -Wno-nonportable-include-path /Zc:inline /Zc:__cplusplus /Oi /Brepro /bigobj /permissive- /W4  -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported /Gw /MT /O2 /Ob2 /DNDEBUG    /EHs-c- /GR- -std:c++17 /Fotools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaExpr.cpp.obj /Fdtools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\ -c F:\git\llvm-project\clang\lib\Sema\SemaExpr.cpp -ftime-trace'

Sorry I don't have time to look into this in more depth, but it is a cause for concern that we might be losing a nice property of -ftime-trace.

Apart from this I'm happy with the patch.

(And many apologies in advance if it's something I'm doing wrong here! I'd be interested if anyone else can reproduce this on SemaExpr. If it's just me then I'd suggest not blocking this patch landing due to this.)

@mbs-modular - it'd be interesting to see if this is something you can repro on Linux with a stage2 build.

Other than that, this patch LGTM. If you need me or someone else to commit this on your behalf, can you please provide the email address you'd like to use for commit attribution?

git switch mbs/time_profiler_warmup
git rebase 40ab78
cmake -S llvm -B build -G Ninja -DLLVM_ENABLE_PROJECTS=clang -DLLVM_TARGETS_TO_BUILD=X86 -DCMAKE_BUILD_TYPE=Release -DLLVM_USE_LINKER=lld -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DCMAKE_C_COMPILER=/usr/bin/clang
cmake --build build
touch clang/lib/Sema/SemaExpr.cpp
cd build
ninja tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaExpr.cpp.o -v

> /usr/bin/clang++ -DGTEST_HAS_RTTI=0 -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/mbs/github/mbs-llvm-project/build/tools/clang/lib/Sema -I/home/mbs/github/mbs-llvm-project/clang/lib/Sema -I/home/mbs/github/mbs-llvm-project/clang/include -I/home/mbs/github/mbs-llvm-project/build/tools/clang/include -I/home/mbs/github/mbs-llvm-project/build/include -I/home/mbs/github/mbs-llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 -DNDEBUG  -fno-exceptions -fno-rtti -std=c++17 -MD -MT tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaExpr.cpp.o -MF tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaExpr.cpp.o.d -o tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaExpr.cpp.o -c /home/mbs/github/mbs-llvm-project/clang/lib/Sema/SemaExpr.cpp

CLANG="/home/mbs/github/mbs-llvm-project/build/bin/clang++"
CMD="$CLANG -DGTEST_HAS_RTTI=0 -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/mbs/github/mbs-llvm-project/build/tools/clang/lib/Sema -I/home/mbs/github/mbs-llvm-project/clang/lib/Sema -I/home/mbs/github/mbs-llvm-project/clang/include -I/home/mbs/github/mbs-llvm-project/build/tools/clang/include -I/home/mbs/github/mbs-llvm-project/build/include -I/home/mbs/github/mbs-llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 -DNDEBUG  -fno-exceptions -fno-rtti -std=c++17 -MD -MT tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaExpr.cpp.o -MF tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaExpr.cpp.o.d -o tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaExpr.cpp.o -c /home/mbs/github/mbs-llvm-project/clang/lib/Sema/SemaExpr.cpp"

for i in {1..15}; do /usr/bin/time -f "%e" $CMD; done

> 23.93
23.86
24.12
23.84
23.81
23.71
23.74
23.86
23.74
23.74
23.72
23.80
23.65
23.75
23.74


for i in {1..15}; do /usr/bin/time -f "%e" $CMD -ftime-trace; done

> 25.19
25.17
25.08
25.17
25.09
25.18
24.97
25.10
24.93
24.80
24.71
25.04
25.06
25.47
25.38

cd ..
git switch main
git checkout 40ab78
cmake --build build
cd build

for i in {1..15}; do /usr/bin/time -f "%e" $CMD; done

> 24.11
23.96
23.89
23.88
23.80
23.82
23.90
23.90
23.96
23.83
23.86
23.84
23.92
23.85
23.68

for i in {1..15}; do /usr/bin/time -f "%e" $CMD -ftime-trace; done

> 25.18
25.27
25.27
25.25
25.64
25.48
25.12
25.07
25.32
25.14
25.27
25.14
25.30
25.22
25.16

Ratio of with flag to without flag is 1.06 both with and without patch.
Ratio of without patch to with path is 1.01 both with and without flag.
I confirmed I'm indeed using the clang with the changes by emitting error output.
Given the disagreement on your measured times with and without the patch even without the -ftime-trace flag I'm inclined to write this off to very non-gaussian sampling noise on your setup. wdyt?

russell.gallop accepted this revision.Sep 16 2022, 12:48 AM

Given the disagreement on your measured times with and without the patch even without the -ftime-trace flag I'm inclined to write this off to very non-gaussian sampling noise on your setup. wdyt?

As you can't reproduce this I think that sounds okay. Thanks.

This revision was automatically updated to reflect the committed changes.