I noticed when running a large link with the --time-trace option that there were several areas which were missing any specific time trace categories (aside from the generic link/ExecuteLinker categories). This patch adds new categories to fill most of the "gaps", or to provide more detail than was previously provided.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
lld/ELF/DriverUtils.cpp | ||
---|---|---|
241 | This is a pretty simple operation. Is the scope needed? (If file traversal IO turns out to be an issue, this still needs more information to locate the issue.) | |
lld/ELF/Writer.cpp | ||
207 | EH is probably more appropriate than exception. .gcc_except_table is not handled here. | |
2010 | This may be ambiguous. Symbol resolution has been done. This finalizes .symtab and part of .dynsym . You'll need to include the next block to finalize the rest of .dynsym (there is still a horrible Haxagon hack below but that is just one symbol which can be ignored from the profiling) | |
2118 | If we do something to support RISC-V -mrelax in the future (D77694), there is possibility that this needs to break down. | |
2175 | finalizeAddressDependentContent is a special address assigning operation. It does not belong to "Finalize synthetic sections". |
lld/ELF/DriverUtils.cpp | ||
---|---|---|
241 | When digging into the new "Load files" scope for my test case, I noticed gaps between each library of fairly consistent sizes, which were caused by this function. They are relatively minor overall, but if digging into the area, it looks odd that such gaps exist. There's also a fairly easy way to avoid the cost too, by explicitly specifying the library as an input file rather than using -l/-L to find it. The scope states what library is being looked for, and a user can use --verbose and/or --trace to identify the specific found library, to allow them to rectify the situation. | |
lld/ELF/Writer.cpp | ||
207 | If by EH you mean specifically .eh_frame, then that would be misleading for those using ARM exidx instead? If it means something else, I'm not sure what it means. | |
2010 | I can try "Add symbols to symtab", and include the block below as well then. |
lld/ELF/Writer.cpp | ||
---|---|---|
2118 | Yeah, I wasn't sure about this big block. One option would be to add a subscope in finalizeSynthetic that specifically names the section being finalized. We'll also want to split it into two parts, as suggested below, I think. | |
2175 | Okay. I'll split the finalize synthetic sections block into two parts, and add an additional scope for finalizeAddressDependentContent. |
lld/ELF/Writer.cpp | ||
---|---|---|
2180 | Does it make sense to name this and the scope above as: "Finalize synthetic sections (1/2)" ? |
lld/ELF/Writer.cpp | ||
---|---|---|
2180 | I'd say no, probably, but it depends on what you want the output to look like. By naming both scopes the same, they will appear in the same colour in the chrome://tracing output, and will be combined into a single "Total Finalize synthetic sections" block below the timeline, to give an idea of how long the combined amount takes. Depending on the time taken in the intermediate bit between the two blocks, and the time trace granularity, the two blocks might even be merged into a single block in the output, if I'm not mistaken. Of course, it might be that this "treating them as the same thing" bit is actually unhelpful. To me, they are doing the same thing from a high level, so I think they should be named the same, but I could see an argument that they are different blocks and therefore should be treated differently (this feeds into my thinking that perhaps we should add subscopes for each synthetic section). |
lld/ELF/Writer.cpp | ||
---|---|---|
2180 | Ah, I didn't know that they will be combined into a single block. My concern was about having 2 different blocks with the same name. This place looks fine to me then. |
lld/ELF/Writer.cpp | ||
---|---|---|
2180 |
I don't think that they would be merged into a single block. The time trace granularity applies a minimum length to the blocks of time, not to gaps. It's not a sample rate. Note for example that if each "Finalize synthetic sections" was "< granularity" but the sum was "> granularity" then I don't think it would be traced.
We ended up with something like this for the clang "FrontEnd" block as it was difficult to identify a single scope which covered all of that. It's not ideal but was the best option we could think of then (https://reviews.llvm.org/D63325). One option may be to use the second parameter to say something about "which" sections. |
Added sub-scope for finalizing synthetic sections, using the section name. Also added "Resolve SHF_LINK_ORDER" scope as that was flagged up after as taking a substantial amount of time in my experimental package.
lld/ELF/Writer.cpp | ||
---|---|---|
2180 | My mistake. I assumed they would be, but wasn't looking at the right thing. I've added sub-scopes naming the sections. In my experiment, the dynsym and symtab showed up, but nothing else did really. |
lld/ELF/Writer.cpp | ||
---|---|---|
2180 |
To be more specific, scopes less than granularity are not traced as individual events in the flame graph, they are counted in the "Total" times. |
LGTM.
lld/ELF/Writer.cpp | ||
---|---|---|
207 | EH can mean both .eh_frame and .ARM.exidx The ARM ABI is called "Exception Handling ABI for the Arm Architecture". An alternative name is unwinding sections. |
clang-tidy: error: no member named 'TimeTraceScope' in namespace 'llvm' [clang-diagnostic-error]
not useful
clang-tidy: error: expected ';' after expression [clang-diagnostic-error]
not useful
clang-tidy: error: use of undeclared identifier 'timeScope' [clang-diagnostic-error]
not useful