Page MenuHomePhabricator

[clang] Adds `-ftime-trace` option to clang that produces Chrome `chrome://tracing` compatible JSON profiling output dumps
ClosedPublic

Authored by anton-afanasyev on Feb 26 2019, 7:38 AM.

Details

Summary

This is the first part of time tracing system, I have splitted them cause this part is mostly written by Aras Pranckevicius except of several minor fixes concerning formatting. I'm to commit this in behalf of Aras, we have an arrangment with him.
The second part extends this option adding terminal output making no need for profiling visualization. I can also add xray support though this need is arguable.
The third part is for cleaning up previous attempts of such implementations (like https://reviews.llvm.org/D45619, https://reviews.llvm.org/D43578 and https://reviews.llvm.org/D47196).

This is taken from GitHub PR: https://github.com/aras-p/llvm-project-20170507/pull/2
Here is cfe maillist subject discussion: http://lists.llvm.org/pipermail/cfe-dev/2019-January/060836.html

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
rnk added a comment.Mar 22 2019, 1:32 PM

Other than the lifetime issue, I think this is basically ready.

clang/lib/Sema/SemaTemplateInstantiate.cpp
2014 ↗(On Diff #191934)

Shoot, I think the callable should probably return std::string instead of StringRef, otherwise this looks like it will be a use-after-free. You allocate a temporary std::string, get a pointer to the characters, return, the string is destroyed, and then UAF.

With a std::string return type, you won't need .c_str().

clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
4089 ↗(On Diff #191934)

.c_str() is unnecessary

llvm/include/llvm/Support/TimeProfiler.h
54–58 ↗(On Diff #191934)

I think you can replace const char * in these prototypes with StringRef to save a few .data() calls at some call sites.

As mentioned before, I think the callable needs to return std::string, though.

takuto.ikuta added inline comments.Mar 23 2019, 4:13 AM
llvm/lib/Support/TimeProfiler.cpp
28 ↗(On Diff #191934)

you can pass StringRef here and remove .data() or .c_str() from caller.

33 ↗(On Diff #191934)
44 ↗(On Diff #191934)
72 ↗(On Diff #191934)

I prefer to write

Stack.push_back(std::move(E));

or

Stack.emplace_back(steady_clock::now(), {}, Name, Detail);
103 ↗(On Diff #191934)
147 ↗(On Diff #191934)

better to have one hash map so that we don't need to call 2 lookup in L92 and L93.
Also StringMap may be faster than unordered_map.

anton-afanasyev marked 8 inline comments as done.Mar 26 2019, 7:04 AM
anton-afanasyev added inline comments.
llvm/lib/Support/TimeProfiler.cpp
33 ↗(On Diff #191934)

Yes, thanks!

anton-afanasyev marked an inline comment as done.
takuto.ikuta added inline comments.Mar 26 2019, 8:24 AM
clang/lib/CodeGen/BackendUtil.cpp
1352 ↗(On Diff #192268)

We don't need explicit StringRef constructor call here. Just passing "" is enough.

clang/lib/Parse/ParseAST.cpp
154 ↗(On Diff #192268)

Remove StringRef?

clang/lib/Serialization/GlobalModuleIndex.cpp
131 ↗(On Diff #192268)

Remove StringRef?

745 ↗(On Diff #192268)

Remove StringRef?

clang/tools/driver/cc1_main.cpp
224 ↗(On Diff #192268)

Remove StringRef?

llvm/lib/Support/TimeProfiler.cpp
44 ↗(On Diff #192268)

include StringExtras.h for this?

anton-afanasyev marked 3 inline comments as done.Mar 27 2019, 12:43 AM
anton-afanasyev added inline comments.
clang/lib/Parse/ParseAST.cpp
154 ↗(On Diff #192268)

I use StringRef("") to explicitly cast to one of overloaded TimeScope(StringRef, StringRef) and TimeScope(StringRef, fuction_ref(std::string())).

llvm/lib/Support/TimeProfiler.cpp
44 ↗(On Diff #192268)

Should one do it? It's already implicitly included.

103 ↗(On Diff #191934)

This implementation looks compact and fast. I've read proposal for this library https://docs.google.com/document/d/1OEF9IauWwNuSigZzvvbjc1cVS1uGHRyGTXaoy3DjqM4 , it's recent, so I'm not sure it's stable and speed optimized. Do you actually can advise it? I can do it in the next refactor commit as well.

takuto.ikuta added inline comments.Mar 27 2019, 6:45 AM
clang/lib/Parse/ParseAST.cpp
154 ↗(On Diff #192268)

I think function_ref(std::string()) does not have constructor receiving StringRef, so I feel explicit cast is redundant.

llvm/lib/Support/TimeProfiler.cpp
44 ↗(On Diff #192268)

I think it is better to do, because if someone removes the StringExtras.h include used for this file, they need to include header in this file at the same time. That may require to touch unrelated files in their change.

75 ↗(On Diff #192268)

s/Begin/begin/

103 ↗(On Diff #191934)

Hmm, I think using json library is preferred because we don't need to take care the correctness of json format.
Confirming correctness of json format with many escaped literals is bit hard and I'm afraid to miss json format error.

anton-afanasyev marked an inline comment as done.Mar 27 2019, 2:54 PM
anton-afanasyev added inline comments.
clang/lib/Parse/ParseAST.cpp
154 ↗(On Diff #192268)

The compiler tries to use function_ref<..>(Callable&&, ...) constructor and instantiates function_ref<std::string()>::function_ref<char const (&)[1]>, so one gets error like:

error: call to constructor of 'llvm::TimeTraceScope' is ambiguous
    llvm::TimeTraceScope TimeScope("Frontend", "");
                         ^         ~~~~~~~~~~~~~~
.../include/llvm/Support/TimeProfiler.h:54:3: note: candidate constructor
  TimeTraceScope(StringRef Name, StringRef Detail) {
  ^
.../include/llvm/Support/TimeProfiler.h:58:3: note: candidate constructor
  TimeTraceScope(StringRef Name, llvm::function_ref<std::string()> Detail) {
  ^
anton-afanasyev marked 10 inline comments as done.
rnk accepted this revision.Mar 29 2019, 11:40 AM

lgtm

I think this is ready. We can adjust the overloads after the fact. I'd like to get the feature in so we can make improvements independently.

clang/lib/Parse/ParseAST.cpp
154 ↗(On Diff #192268)

Oh no. We are too clever for ourselves. :(

So, we'd need to make the enable_if logic in the function_ref constructor more clever to prevent it from being instantiated for non-callables.

This revision is now accepted and ready to land.Mar 29 2019, 11:40 AM
In D58675#1448112, @rnk wrote:

lgtm

I think this is ready. We can adjust the overloads after the fact. I'd like to get the feature in so we can make improvements independently.

Ok, so I'm to commit this and fix three points in the next commit(s):

  1. Improve hashing (it's easy).
  2. Use json utility.
  3. Improve function_ref constructor and delete StringRef("...") explicit casting.
This revision was automatically updated to reflect the committed changes.

Looks like this landed without tests. Please add tests in a follow-up.

Also, it looks like the flag is currently a cc1 flag. This should probably be a CoreOption instead so that both the gcc-style and the cl-style drivers expose it. Right now, users can't really use this if I read the patch right (without using -Xclang, which is discouraged.)

ps: Hooray for landing this, and thanks for the cool feature!

Looks like this landed without tests. Please add tests in a follow-up.

Also, it looks like the flag is currently a cc1 flag. This should probably be a CoreOption instead so that both the gcc-style and the cl-style drivers expose it. Right now, users can't really use this if I read the patch right (without using -Xclang, which is discouraged.)

Ok, thanks, I'm to add tests in a follow-up.

It works like clang -ftime-trace main.cpp now. Not sure I know where CoreOption is, but I'm to figure out it, thanks.

ps: Hooray for landing this, and thanks for the cool feature!

Thanks! It's @aras_p accomplishment.

Use native llvm JSON library update: https://reviews.llvm.org/D60609

Some post-commit review (please submit a new review, don't replace this diff)
As usual, the incorrect license headers keep slipping through.

llvm/trunk/include/llvm/Support/TimeProfiler.h
5–6

OOOPS, wrong license.

53

Did you mean to explicitly prohibit all the default constructors / operator=?

llvm/trunk/lib/Support/TimeProfiler.cpp
5–6

Wrong license.

29–30

SmallString<32> ?
Also, it is safe to OS.reserve(Src.size())

59–60

SmallString<32>?

70–71

Why not either take StringRef arg, or at least std::move() it when creating Entry?

89

llvm::find_if(llvm::reverse(), )

99

Why pass std::unique_ptr ?
Just raw_pwrite_stream &OS

122

Elide {}

123

llvm::sort <- this is a correctness issue.

144–145

Would it make sense to make these SmallVector?

146–147
  1. Eww, std::unordered_map, that will have *horrible* perf.
  2. Eww, map with key = string. Use llvm::StringMap
162

Just raw_pwrite_stream &OS?

lebedev.ri added inline comments.Apr 13 2019, 2:06 AM
llvm/trunk/lib/Support/TimeProfiler.cpp
29–30

Also, you probably want to add raw_svector_ostream ontop, and << into it.

anton-afanasyev marked 23 inline comments as done.Apr 14 2019, 4:36 AM

Some post-commit review (please submit a new review, don't replace this diff)
As usual, the incorrect license headers keep slipping through.

Ok, I've made a separate review for this: https://reviews.llvm.org/D60663

llvm/trunk/include/llvm/Support/TimeProfiler.h
5–6

Yes, thanks.

53

Ok, I'm to add lines like TimeTraceScope() = delete; here.

llvm/trunk/lib/Support/TimeProfiler.cpp
5–6

Yes, I'm to fix it.

29–30

This function has been already dropped here https://reviews.llvm.org/D60609 . I'm to submit code without it if json library using are ok for performance.

59–60

Hmm, would it make a sense for Detail? Entrys are heap-allocated, the actual size is ranging from several bytes to several hundreds. Also, getQualifiedNameAsString() which is usually used for Detail returns std::string.

70–71

Do you mean std::move(Name)? Ok.

89

Hmm, one need not [rbegin(), rend()], but [rbegin()++,rend()], so have to explicitly specify begin and end, llvm::reverse(Stack) is inappropriate here.

99

Yes, thanks!
This was blindly copied from CompilerInstance::createOutputFile() return type.

122

Ok, thanks.

123

Yes, thanks!

144–145

Ok, I'm to change it to

SmallVector<Entry, 16> Stack;  
SmallVector<Entry, 128> Entries;

Stack size may be enough for small sources while Entries usually amounts many thousands of Entrys.

146–147

You are right, but this is already fixed and submitted in this follow-up: https://reviews.llvm.org/D60404

162

Yes, thanks.

Some post-commit review (please submit a new review, don't replace this diff)
As usual, the incorrect license headers keep slipping through.

Ok, I've made a separate review for this: https://reviews.llvm.org/D60663

Thank you.

anton-afanasyev marked 10 inline comments as done.Apr 16 2019, 10:28 AM

Patch by @thakis for BE passes tracing: https://reviews.llvm.org/D60782

lebedev.ri added inline comments.Sep 19 2019, 3:22 PM
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431

This isn't covered by any timer; if you look in BackendUtil.cpp,
EmitAssemblyHelper actually has CodeGenerationTime("codegen", "Code Generation Time") timer.

llvm/trunk/lib/IR/LegacyPassManager.cpp
1686

I think this may be the wrong place for this.
This includes the entirety of the pipeline, including all of llvm back-end stuff.

anton-afanasyev marked 2 inline comments as done.Sep 24 2019, 5:33 AM
anton-afanasyev added inline comments.
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431

Thanks, I'm to add it.

llvm/trunk/lib/IR/LegacyPassManager.cpp
1686

Ok, I'm just to delete this line, this block is outputted by MPPassManager::runOnModule() as well. This is just a section to see module name rather than to cover proper part of pipeline.

anton-afanasyev marked 2 inline comments as done.Sep 24 2019, 8:03 AM
anton-afanasyev added inline comments.
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431

Hmm, I've figured out this isn't needed: such new timer mostly coincides with "Backend" timer (above). Legacy Timer CodeGenerationTime(...) is bad example of doing right timing.

lebedev.ri added inline comments.Sep 24 2019, 8:19 AM
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431

"Mostly coincides" may not be the best way to approach fine-grained timings, i think? :)

I have noticed this because when i looked at the produced time flame graph,
there's large section in the end that is covered only by "Backend" timer,
but nothing else. Now, i'm not going to say whether or not that extra section
should or should not be within "Backend" timer, but it certainly should *also*
be within "codegen" timer. Or is there no codegen time spent there?


anton-afanasyev marked 2 inline comments as done.Sep 24 2019, 8:30 AM
anton-afanasyev added inline comments.
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431

"Mostly coincides" here means "identical" I believe, the difference is auxiliary stuff.
Please look at clang::EmitBackendOutput(), "Backend" timer is outer for "codegen" one.

lebedev.ri added inline comments.Sep 24 2019, 9:04 AM
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431

Then we are talking about different things using the same name.
There are two distinct codegen steps:

  1. clang AST -> LLVM IR codegen

(after that, all the opt passes run)

  1. LLVM IR -> final assembly. This happens after all the opt middle-end passes.

Those are *different* codegen's.

anton-afanasyev marked 3 inline comments as done.Sep 24 2019, 9:23 AM
anton-afanasyev added inline comments.
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431

Yes, and step 1 is named as "CodeGen Function" whereas step 2 is named just "Backend".

lebedev.ri added inline comments.Sep 24 2019, 10:12 AM
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431

Aha. So this is what i expected to see, apparently:

diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 71ae8fd4fb0..d89d12612f8 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -910,6 +910,7 @@ void EmitAssemblyHelper::EmitAssembly(BackendAction Action,
 
   {
     PrettyStackTraceString CrashInfo("Code generation");
+    llvm::TimeTraceScope TimeScope("BackendCodegen", StringRef(""));
     CodeGenPasses.run(*TheModule);
   }

But that actually brings me to another question - what about PrettyStackTraceString CrashInfo("Per-function optimization");?
Should that be wrapped into some section, too?
I'm less certain here.

anton-afanasyev marked 2 inline comments as done.Sep 24 2019, 11:43 AM
anton-afanasyev added inline comments.
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431

Ok, you've talked about Timer CodeGenerationTime, which corresponds to Backend scope.
As for this BackendCodegen, adding this I would prefer to add adjacent "Per-function optimization" and "Per-module optimization passes" together.

anton-afanasyev marked an inline comment as done.Sep 27 2019, 2:55 PM
anton-afanasyev added inline comments.
cfe/trunk/lib/CodeGen/BackendUtil.cpp
1426–1431