This is an archive of the discontinued LLVM Phabricator instance.

[lld][ELF] Add additional time trace categories
ClosedPublic

Authored by jhenderson on Nov 3 2020, 6:48 AM.

Details

Summary

I noticed when running a large link with the --time-trace option that there were several areas which were missing any specific time trace categories (aside from the generic link/ExecuteLinker categories). This patch adds new categories to fill most of the "gaps", or to provide more detail than was previously provided.

Diff Detail

Event Timeline

jhenderson created this revision.Nov 3 2020, 6:48 AM
jhenderson requested review of this revision.Nov 3 2020, 6:48 AM
jhenderson updated this revision to Diff 302593.Nov 3 2020, 8:30 AM

Add missing includes.

MaskRay added inline comments.Nov 3 2020, 10:32 AM
lld/ELF/DriverUtils.cpp
242

This is a pretty simple operation. Is the scope needed? (If file traversal IO turns out to be an issue, this still needs more information to locate the issue.)

lld/ELF/Writer.cpp
207

EH is probably more appropriate than exception.

.gcc_except_table is not handled here.

2014

This may be ambiguous.

Symbol resolution has been done. This finalizes .symtab and part of .dynsym . You'll need to include the next block to finalize the rest of .dynsym (there is still a horrible Haxagon hack below but that is just one symbol which can be ignored from the profiling)

2183

If we do something to support RISC-V -mrelax in the future (D77694), there is possibility that this needs to break down.

2240

finalizeAddressDependentContent is a special address assigning operation. It does not belong to "Finalize synthetic sections".

jhenderson added inline comments.Nov 4 2020, 12:53 AM
lld/ELF/DriverUtils.cpp
242

When digging into the new "Load files" scope for my test case, I noticed gaps between each library of fairly consistent sizes, which were caused by this function. They are relatively minor overall, but if digging into the area, it looks odd that such gaps exist. There's also a fairly easy way to avoid the cost too, by explicitly specifying the library as an input file rather than using -l/-L to find it. The scope states what library is being looked for, and a user can use --verbose and/or --trace to identify the specific found library, to allow them to rectify the situation.

lld/ELF/Writer.cpp
207

If by EH you mean specifically .eh_frame, then that would be misleading for those using ARM exidx instead? If it means something else, I'm not sure what it means.

2014

I can try "Add symbols to symtab", and include the block below as well then.

jhenderson added inline comments.Nov 4 2020, 1:01 AM
lld/ELF/Writer.cpp
2183

Yeah, I wasn't sure about this big block. One option would be to add a subscope in finalizeSynthetic that specifically names the section being finalized. We'll also want to split it into two parts, as suggested below, I think.

2240

Okay. I'll split the finalize synthetic sections block into two parts, and add an additional scope for finalizeAddressDependentContent.

jhenderson updated this revision to Diff 302775.Nov 4 2020, 1:27 AM
jhenderson marked 2 inline comments as done.

Address review comments.

grimar added inline comments.Nov 4 2020, 11:31 PM
lld/ELF/Writer.cpp
2185

Does it make sense to name this and the scope above as:

"Finalize synthetic sections (1/2)"
"Finalize synthetic sections (2/2)"

?

jhenderson added inline comments.Nov 5 2020, 12:13 AM
lld/ELF/Writer.cpp
2185

I'd say no, probably, but it depends on what you want the output to look like. By naming both scopes the same, they will appear in the same colour in the chrome://tracing output, and will be combined into a single "Total Finalize synthetic sections" block below the timeline, to give an idea of how long the combined amount takes. Depending on the time taken in the intermediate bit between the two blocks, and the time trace granularity, the two blocks might even be merged into a single block in the output, if I'm not mistaken. Of course, it might be that this "treating them as the same thing" bit is actually unhelpful. To me, they are doing the same thing from a high level, so I think they should be named the same, but I could see an argument that they are different blocks and therefore should be treated differently (this feeds into my thinking that perhaps we should add subscopes for each synthetic section).

grimar added inline comments.Nov 5 2020, 12:17 AM
lld/ELF/Writer.cpp
2185

Ah, I didn't know that they will be combined into a single block. My concern was about having 2 different blocks with the same name. This place looks fine to me then.

russell.gallop added inline comments.Nov 5 2020, 3:38 AM
lld/ELF/Writer.cpp
2185

Depending on the time taken in the intermediate bit between the two blocks, and the time trace granularity, the two blocks might even be merged into a single block in the output, if I'm not mistaken.

I don't think that they would be merged into a single block. The time trace granularity applies a minimum length to the blocks of time, not to gaps. It's not a sample rate.

Note for example that if each "Finalize synthetic sections" was "< granularity" but the sum was "> granularity" then I don't think it would be traced.

My concern was about having 2 different blocks with the same name

We ended up with something like this for the clang "FrontEnd" block as it was difficult to identify a single scope which covered all of that. It's not ideal but was the best option we could think of then (https://reviews.llvm.org/D63325).

One option may be to use the second parameter to say something about "which" sections.

jhenderson updated this revision to Diff 303445.Nov 6 2020, 7:13 AM

Added sub-scope for finalizing synthetic sections, using the section name. Also added "Resolve SHF_LINK_ORDER" scope as that was flagged up after as taking a substantial amount of time in my experimental package.

jhenderson marked 2 inline comments as done.Nov 6 2020, 7:14 AM
jhenderson added inline comments.
lld/ELF/Writer.cpp
2185

My mistake. I assumed they would be, but wasn't looking at the right thing. I've added sub-scopes naming the sections. In my experiment, the dynsym and symtab showed up, but nothing else did really.

jhenderson marked an inline comment as done.Nov 6 2020, 7:14 AM
grimar accepted this revision.Nov 9 2020, 12:26 AM

This LGTM, but please wait for others.

This revision is now accepted and ready to land.Nov 9 2020, 12:26 AM
russell.gallop added inline comments.Nov 9 2020, 2:50 AM
lld/ELF/Writer.cpp
2185

Note for example that if each "Finalize synthetic sections" was "< granularity" but the sum was "> granularity" then I don't think it would be traced.

To be more specific, scopes less than granularity are not traced as individual events in the flame graph, they are counted in the "Total" times.

russell.gallop resigned from this revision.Nov 9 2020, 2:57 AM

Resigning as reviewer as I also work for Sony.

MaskRay accepted this revision.Nov 9 2020, 9:49 AM

LGTM.

lld/ELF/Writer.cpp
207

EH can mean both .eh_frame and .ARM.exidx

The ARM ABI is called "Exception Handling ABI for the Arm Architecture". An alternative name is unwinding sections.

This revision was automatically updated to reflect the committed changes.