This is an archive of the discontinued LLVM Phabricator instance.

[clang] Adds `-ftime-trace` option to clang that produces Chrome `chrome://tracing` compatible JSON profiling output dumps
ClosedPublic

Authored by anton-afanasyev on Feb 26 2019, 7:38 AM.

Details

Summary

This is the first part of time tracing system, I have splitted them cause this part is mostly written by Aras Pranckevicius except of several minor fixes concerning formatting. I'm to commit this in behalf of Aras, we have an arrangment with him.
The second part extends this option adding terminal output making no need for profiling visualization. I can also add xray support though this need is arguable.
The third part is for cleaning up previous attempts of such implementations (like https://reviews.llvm.org/D45619, https://reviews.llvm.org/D43578 and https://reviews.llvm.org/D47196).

This is taken from GitHub PR: https://github.com/aras-p/llvm-project-20170507/pull/2
Here is cfe maillist subject discussion: http://lists.llvm.org/pipermail/cfe-dev/2019-January/060836.html

Diff Detail

Event Timeline

Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptFeb 26 2019, 7:39 AM

Ideally I think you need more clang/frontend experts reviewers - many of us on the reviewers list tend to work mainly in llvm.

Test cases would be good as well - even if its just basic sanity tests for command line args etc.

clang/lib/Sema/Sema.cpp
113

remove braces

clang/lib/Sema/SemaTemplateInstantiate.cpp
25

nfc change?

clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
24

nfc change?

clang/tools/driver/cc1_main.cpp
201

remove braces

anton-afanasyev marked 3 inline comments as done.Feb 28 2019, 12:26 PM

Ok, I'm to add tests and reviewers.

clang/lib/Sema/Sema.cpp
113

Ok

clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
24

What do you mean? This change is not NFC.

clang/tools/driver/cc1_main.cpp
201

Ok

RKSimon added inline comments.Feb 28 2019, 1:49 PM
clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
24

Moving the "clang/Sema/SemaInternal.h" include?

anton-afanasyev marked an inline comment as done.Feb 28 2019, 10:11 PM
anton-afanasyev added inline comments.
clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
24

Ok, I'm to move this line back. This was done by clang-format.

anton-afanasyev marked 5 inline comments as done.
aras-p added a subscriber: aras-p.Mar 6 2019, 4:18 AM

Ping!
Should I add more FE guys to review this?

@rsmith Any suggestions for good reviewers for this please?

zturner added a subscriber: zturner.

+reid and hans, as they might be interested in this.

rnk added a comment.Mar 13 2019, 2:35 PM

Very cool! I'll take a look, I wasn't aware this had been rebased and uploaded, I was thinking about doing it myself yesterday as a side project.

As I think I've said elsewhere, I'm really excited to give users the tools they need to analyze why their code compiles slowly. If we just give them the tools, maybe they will restructure their code on their own and they won't come asking, "why is the compiler so slow on my (many transitive includes | template metaprogram)?".

I'll take a look and try to get back with some thoughts.

rnk added inline comments.Mar 13 2019, 3:12 PM
clang/lib/Parse/ParseAST.cpp
172

I think you may want to move this to clang::EmitBackendOutput, which is closer to real "backend-y" actions. I don't think there's any other heavy lifting that happens in HandleTranslationUnit, it looks like diagnostic teardown and setup for calling EmitBackendOutput.

clang/lib/Parse/ParseDeclCXX.cpp
3115–3118

For example, this would be simplified by a TimeTraceScope callable overload.

clang/lib/Parse/ParseTemplate.cpp
237–238

I guess this would be simplified as well with a callable.

clang/lib/Sema/Sema.cpp
98

This doesn't match the LLVM naming and auto usage conventions: https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable

Yes, there is an active debate about changing the way we name variables, but please just match what we have for now.

llvm/lib/IR/LegacyPassManager.cpp
1632–1634

Someone is going to have to figure out where the equivalent annotations should live in the new passmanager. I wasn't able to find it just by looking myself, so I won't ask you to do it. I guess it's in llvm/include/llvm/IR/PassManager.h.

llvm/lib/Support/TimeProfiler.cpp
28

Here and else where things should be named the LLVM way for consistency:
https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly
escapeString, Src, etc. Tedious, I know.

70

I'm tempted to micro-optimize this with StringRef and StringSaver, but I think it's unnecessary. Calling malloc may affect the profile, but probably not much.

80

Comments in this file need to be complete sentences with a leading capital and full stop.

llvm/lib/Support/TimeProfiler.h
53

It'd be nice if these took StringRefs, it would avoid a fair amount of .data() and .c_str(), but it messes up TIME_TRACE_OR_NULL. Another way to delay the work would be to have an llvm::function_ref<StringRef()> detail overload. The callable also allows the user to bind variables like this:

TimeTraceScope timeScope("ParseTag", [&]() {
  if (auto *TD = dyn_cast_or_null<TagDecl>(D))
    return TD->getNameAsStr();
  return "";
});

Instead of the isa<TagDecl>(D) ? cast<TagDecl>(D)->getNameAsStr() : "" pattern that I see, which repeats TagDecl.

rnk added inline comments.Mar 14 2019, 11:16 AM
llvm/lib/Support/TimeProfiler.h
1

I applied this patch locally to try it, and I noticed this header should be in llvm/include/llvm/Support, not llvm/lib/Support.

rnk added inline comments.Mar 14 2019, 12:47 PM
llvm/lib/IR/LegacyPassManager.cpp
1686

I think these OptModule and OptFunction labels may need some improvement. For a backend-heavy compilation like LLVM's X86ISelLowering.cpp, these labels aren't as good as they could be:


I think it's fine to leave that for later, though.

anton-afanasyev marked 5 inline comments as done.Mar 14 2019, 12:55 PM
anton-afanasyev added inline comments.
clang/lib/Parse/ParseAST.cpp
172

Yes, thanks.

clang/lib/Sema/Sema.cpp
98

Ok

llvm/lib/IR/LegacyPassManager.cpp
1632–1634

Ok, I'm to look for it.

llvm/lib/Support/TimeProfiler.cpp
28

Ok

llvm/lib/Support/TimeProfiler.h
1

Oops, you're right, this is accidentally moved in patch! Thanks.

aganea added a subscriber: aganea.Mar 14 2019, 6:11 PM
anton-afanasyev marked 14 inline comments as done.Mar 22 2019, 1:05 PM
anton-afanasyev added inline comments.
llvm/lib/IR/LegacyPassManager.cpp
1632–1634

Looks like PassManager.h changing is enough to support new PassManager. But I prefer to make this change in subsequent commits.

1686

Yes, leaving this for later. The subsequent commits are planned.

llvm/lib/Support/TimeProfiler.cpp
70

Ok, I'm not changing this.

anton-afanasyev marked 2 inline comments as done.

Updated following @rnk review notes.

rnk added a comment.Mar 22 2019, 1:32 PM

Other than the lifetime issue, I think this is basically ready.

clang/lib/Sema/SemaTemplateInstantiate.cpp
2014

Shoot, I think the callable should probably return std::string instead of StringRef, otherwise this looks like it will be a use-after-free. You allocate a temporary std::string, get a pointer to the characters, return, the string is destroyed, and then UAF.

With a std::string return type, you won't need .c_str().

clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
4089

.c_str() is unnecessary

llvm/include/llvm/Support/TimeProfiler.h
54–58 ↗(On Diff #191934)

I think you can replace const char * in these prototypes with StringRef to save a few .data() calls at some call sites.

As mentioned before, I think the callable needs to return std::string, though.

takuto.ikuta added inline comments.Mar 23 2019, 4:13 AM
llvm/lib/Support/TimeProfiler.cpp
29

you can pass StringRef here and remove .data() or .c_str() from caller.

34
45
73

I prefer to write

Stack.push_back(std::move(E));

or

Stack.emplace_back(steady_clock::now(), {}, Name, Detail);
104
148

better to have one hash map so that we don't need to call 2 lookup in L92 and L93.
Also StringMap may be faster than unordered_map.

anton-afanasyev marked 8 inline comments as done.Mar 26 2019, 7:04 AM
anton-afanasyev added inline comments.
llvm/lib/Support/TimeProfiler.cpp
34

Yes, thanks!

anton-afanasyev marked an inline comment as done.
takuto.ikuta added inline comments.Mar 26 2019, 8:24 AM
clang/lib/CodeGen/BackendUtil.cpp
1352 ↗(On Diff #192268)

We don't need explicit StringRef constructor call here. Just passing "" is enough.

clang/lib/Parse/ParseAST.cpp
154

Remove StringRef?

clang/lib/Serialization/GlobalModuleIndex.cpp
131

Remove StringRef?

745

Remove StringRef?

clang/tools/driver/cc1_main.cpp
224

Remove StringRef?

llvm/lib/Support/TimeProfiler.cpp
45

include StringExtras.h for this?

anton-afanasyev marked 3 inline comments as done.Mar 27 2019, 12:43 AM
anton-afanasyev added inline comments.
clang/lib/Parse/ParseAST.cpp
154

I use StringRef("") to explicitly cast to one of overloaded TimeScope(StringRef, StringRef) and TimeScope(StringRef, fuction_ref(std::string())).

llvm/lib/Support/TimeProfiler.cpp
45

Should one do it? It's already implicitly included.

104

This implementation looks compact and fast. I've read proposal for this library https://docs.google.com/document/d/1OEF9IauWwNuSigZzvvbjc1cVS1uGHRyGTXaoy3DjqM4 , it's recent, so I'm not sure it's stable and speed optimized. Do you actually can advise it? I can do it in the next refactor commit as well.

takuto.ikuta added inline comments.Mar 27 2019, 6:45 AM
clang/lib/Parse/ParseAST.cpp
154

I think function_ref(std::string()) does not have constructor receiving StringRef, so I feel explicit cast is redundant.

llvm/lib/Support/TimeProfiler.cpp
45

I think it is better to do, because if someone removes the StringExtras.h include used for this file, they need to include header in this file at the same time. That may require to touch unrelated files in their change.

76

s/Begin/begin/

104

Hmm, I think using json library is preferred because we don't need to take care the correctness of json format.
Confirming correctness of json format with many escaped literals is bit hard and I'm afraid to miss json format error.

anton-afanasyev marked an inline comment as done.Mar 27 2019, 2:54 PM
anton-afanasyev added inline comments.
clang/lib/Parse/ParseAST.cpp
154

The compiler tries to use function_ref<..>(Callable&&, ...) constructor and instantiates function_ref<std::string()>::function_ref<char const (&)[1]>, so one gets error like:

error: call to constructor of 'llvm::TimeTraceScope' is ambiguous
    llvm::TimeTraceScope TimeScope("Frontend", "");
                         ^         ~~~~~~~~~~~~~~
.../include/llvm/Support/TimeProfiler.h:54:3: note: candidate constructor
  TimeTraceScope(StringRef Name, StringRef Detail) {
  ^
.../include/llvm/Support/TimeProfiler.h:58:3: note: candidate constructor
  TimeTraceScope(StringRef Name, llvm::function_ref<std::string()> Detail) {
  ^
anton-afanasyev marked 10 inline comments as done.
rnk accepted this revision.Mar 29 2019, 11:40 AM

lgtm

I think this is ready. We can adjust the overloads after the fact. I'd like to get the feature in so we can make improvements independently.

clang/lib/Parse/ParseAST.cpp
154

Oh no. We are too clever for ourselves. :(

So, we'd need to make the enable_if logic in the function_ref constructor more clever to prevent it from being instantiated for non-callables.

This revision is now accepted and ready to land.Mar 29 2019, 11:40 AM
In D58675#1448112, @rnk wrote:

lgtm

I think this is ready. We can adjust the overloads after the fact. I'd like to get the feature in so we can make improvements independently.

Ok, so I'm to commit this and fix three points in the next commit(s):

  1. Improve hashing (it's easy).
  2. Use json utility.
  3. Improve function_ref constructor and delete StringRef("...") explicit casting.
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Mar 30 2019, 4:11 PM

Looks like this landed without tests. Please add tests in a follow-up.

Also, it looks like the flag is currently a cc1 flag. This should probably be a CoreOption instead so that both the gcc-style and the cl-style drivers expose it. Right now, users can't really use this if I read the patch right (without using -Xclang, which is discouraged.)

ps: Hooray for landing this, and thanks for the cool feature!

Looks like this landed without tests. Please add tests in a follow-up.

Also, it looks like the flag is currently a cc1 flag. This should probably be a CoreOption instead so that both the gcc-style and the cl-style drivers expose it. Right now, users can't really use this if I read the patch right (without using -Xclang, which is discouraged.)

Ok, thanks, I'm to add tests in a follow-up.

It works like clang -ftime-trace main.cpp now. Not sure I know where CoreOption is, but I'm to figure out it, thanks.

ps: Hooray for landing this, and thanks for the cool feature!

Thanks! It's @aras_p accomplishment.

Use native llvm JSON library update: https://reviews.llvm.org/D60609

Some post-commit review (please submit a new review, don't replace this diff)
As usual, the incorrect license headers keep slipping through.

llvm/trunk/include/llvm/Support/TimeProfiler.h
5–6 ↗(On Diff #192963)

OOOPS, wrong license.

53 ↗(On Diff #192963)

Did you mean to explicitly prohibit all the default constructors / operator=?

llvm/trunk/lib/Support/TimeProfiler.cpp
5–6 ↗(On Diff #192963)

Wrong license.

29–30 ↗(On Diff #192963)

SmallString<32> ?
Also, it is safe to OS.reserve(Src.size())

59–60 ↗(On Diff #192963)

SmallString<32>?

70–71 ↗(On Diff #192963)

Why not either take StringRef arg, or at least std::move() it when creating Entry?

89 ↗(On Diff #192963)

llvm::find_if(llvm::reverse(), )

99 ↗(On Diff #192963)

Why pass std::unique_ptr ?
Just raw_pwrite_stream &OS

122 ↗(On Diff #192963)

Elide {}

123 ↗(On Diff #192963)

llvm::sort <- this is a correctness issue.

144–145 ↗(On Diff #192963)

Would it make sense to make these SmallVector?

146–147 ↗(On Diff #192963)
  1. Eww, std::unordered_map, that will have *horrible* perf.
  2. Eww, map with key = string. Use llvm::StringMap
162 ↗(On Diff #192963)

Just raw_pwrite_stream &OS?

lebedev.ri added inline comments.Apr 13 2019, 2:06 AM
llvm/trunk/lib/Support/TimeProfiler.cpp
29–30 ↗(On Diff #192963)

Also, you probably want to add raw_svector_ostream ontop, and << into it.

anton-afanasyev marked 23 inline comments as done.Apr 14 2019, 4:36 AM

Some post-commit review (please submit a new review, don't replace this diff)
As usual, the incorrect license headers keep slipping through.

Ok, I've made a separate review for this: https://reviews.llvm.org/D60663

llvm/trunk/include/llvm/Support/TimeProfiler.h
5–6 ↗(On Diff #192963)

Yes, thanks.

53 ↗(On Diff #192963)

Ok, I'm to add lines like TimeTraceScope() = delete; here.

llvm/trunk/lib/Support/TimeProfiler.cpp
5–6 ↗(On Diff #192963)

Yes, I'm to fix it.

29–30 ↗(On Diff #192963)

This function has been already dropped here https://reviews.llvm.org/D60609 . I'm to submit code without it if json library using are ok for performance.

59–60 ↗(On Diff #192963)

Hmm, would it make a sense for Detail? Entrys are heap-allocated, the actual size is ranging from several bytes to several hundreds. Also, getQualifiedNameAsString() which is usually used for Detail returns std::string.

70–71 ↗(On Diff #192963)

Do you mean std::move(Name)? Ok.

89 ↗(On Diff #192963)

Hmm, one need not [rbegin(), rend()], but [rbegin()++,rend()], so have to explicitly specify begin and end, llvm::reverse(Stack) is inappropriate here.

99 ↗(On Diff #192963)

Yes, thanks!
This was blindly copied from CompilerInstance::createOutputFile() return type.

122 ↗(On Diff #192963)

Ok, thanks.

123 ↗(On Diff #192963)

Yes, thanks!

144–145 ↗(On Diff #192963)

Ok, I'm to change it to

SmallVector<Entry, 16> Stack;  
SmallVector<Entry, 128> Entries;

Stack size may be enough for small sources while Entries usually amounts many thousands of Entrys.

146–147 ↗(On Diff #192963)

You are right, but this is already fixed and submitted in this follow-up: https://reviews.llvm.org/D60404

162 ↗(On Diff #192963)

Yes, thanks.

Some post-commit review (please submit a new review, don't replace this diff)
As usual, the incorrect license headers keep slipping through.

Ok, I've made a separate review for this: https://reviews.llvm.org/D60663

Thank you.

anton-afanasyev marked 10 inline comments as done.Apr 16 2019, 10:28 AM

Patch by @thakis for BE passes tracing: https://reviews.llvm.org/D60782

lebedev.ri added inline comments.Sep 19 2019, 3:22 PM
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431 ↗(On Diff #192963)

This isn't covered by any timer; if you look in BackendUtil.cpp,
EmitAssemblyHelper actually has CodeGenerationTime("codegen", "Code Generation Time") timer.

llvm/trunk/lib/IR/LegacyPassManager.cpp
1686 ↗(On Diff #192963)

I think this may be the wrong place for this.
This includes the entirety of the pipeline, including all of llvm back-end stuff.

anton-afanasyev marked 2 inline comments as done.Sep 24 2019, 5:33 AM
anton-afanasyev added inline comments.
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431 ↗(On Diff #192963)

Thanks, I'm to add it.

llvm/trunk/lib/IR/LegacyPassManager.cpp
1686 ↗(On Diff #192963)

Ok, I'm just to delete this line, this block is outputted by MPPassManager::runOnModule() as well. This is just a section to see module name rather than to cover proper part of pipeline.

anton-afanasyev marked 2 inline comments as done.Sep 24 2019, 8:03 AM
anton-afanasyev added inline comments.
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431 ↗(On Diff #192963)

Hmm, I've figured out this isn't needed: such new timer mostly coincides with "Backend" timer (above). Legacy Timer CodeGenerationTime(...) is bad example of doing right timing.

lebedev.ri added inline comments.Sep 24 2019, 8:19 AM
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431 ↗(On Diff #192963)

"Mostly coincides" may not be the best way to approach fine-grained timings, i think? :)

I have noticed this because when i looked at the produced time flame graph,
there's large section in the end that is covered only by "Backend" timer,
but nothing else. Now, i'm not going to say whether or not that extra section
should or should not be within "Backend" timer, but it certainly should *also*
be within "codegen" timer. Or is there no codegen time spent there?


anton-afanasyev marked 2 inline comments as done.Sep 24 2019, 8:30 AM
anton-afanasyev added inline comments.
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431 ↗(On Diff #192963)

"Mostly coincides" here means "identical" I believe, the difference is auxiliary stuff.
Please look at clang::EmitBackendOutput(), "Backend" timer is outer for "codegen" one.

lebedev.ri added inline comments.Sep 24 2019, 9:04 AM
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431 ↗(On Diff #192963)

Then we are talking about different things using the same name.
There are two distinct codegen steps:

  1. clang AST -> LLVM IR codegen

(after that, all the opt passes run)

  1. LLVM IR -> final assembly. This happens after all the opt middle-end passes.

Those are *different* codegen's.

anton-afanasyev marked 3 inline comments as done.Sep 24 2019, 9:23 AM
anton-afanasyev added inline comments.
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431 ↗(On Diff #192963)

Yes, and step 1 is named as "CodeGen Function" whereas step 2 is named just "Backend".

lebedev.ri added inline comments.Sep 24 2019, 10:12 AM
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431 ↗(On Diff #192963)

Aha. So this is what i expected to see, apparently:

diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 71ae8fd4fb0..d89d12612f8 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -910,6 +910,7 @@ void EmitAssemblyHelper::EmitAssembly(BackendAction Action,
 
   {
     PrettyStackTraceString CrashInfo("Code generation");
+    llvm::TimeTraceScope TimeScope("BackendCodegen", StringRef(""));
     CodeGenPasses.run(*TheModule);
   }

But that actually brings me to another question - what about PrettyStackTraceString CrashInfo("Per-function optimization");?
Should that be wrapped into some section, too?
I'm less certain here.

anton-afanasyev marked 2 inline comments as done.Sep 24 2019, 11:43 AM
anton-afanasyev added inline comments.
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431 ↗(On Diff #192963)

Ok, you've talked about Timer CodeGenerationTime, which corresponds to Backend scope.
As for this BackendCodegen, adding this I would prefer to add adjacent "Per-function optimization" and "Per-module optimization passes" together.

anton-afanasyev marked an inline comment as done.Sep 27 2019, 2:55 PM
anton-afanasyev added inline comments.
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431 ↗(On Diff #192963)