- Fix bug 38821
- Right now, the counters are added in regards of the number of successors for a given BasicBlock: it's good when we've only 1 or 2 successors (at least with BranchInstr). But in the case of a switch statement, the BasicBlock after switch has several predecessors and we need know from which BB we're coming from.
- So the idea is to revert what we're doing: add a PHINode in each block which will select the counter according to the incoming BB.
- They're several pros for doing that:
- we fix the "switch" bug
- we remove the function call to "__llvm_gcov_indirect_counter_increment" and the lookup table stuff
- we replace by PHINodes, so the optimizer will probably makes a better job.
- Tests in compiler-rt are fixed with https://reviews.llvm.org/D51620
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Transforms/Instrumentation/GCOVProfiling.cpp | ||
---|---|---|
630 ↗ | (On Diff #163985) | Reading through the diff I noticed that this i is shadowed. But, what is the purpose of this outer loop? Can we just delete it (maybe in a follow-up)? |
646 ↗ | (On Diff #163985) | Nit, should be cleaner to write insertions as EdgeToCounter[{&BB, nullptr}] = Edges. |
649 ↗ | (On Diff #163985) | Nit, might be cleaner to write for (BasicBlock *Succ : successors(TI)) { ... }, to avoid nesting / dealing with indices. |
682 ↗ | (On Diff #163985) | I suggest using the find API and asserting that the returned iterator is valid any time you need to look up an edge index. |
689 ↗ | (On Diff #163985) | The optimizer can take care of eliminating phis with a single incoming value, so it's not essential to perform that optimization here. I recommend folding this into the EdgeCount >= 2 case above for simplicity. |
lib/Transforms/Instrumentation/GCOVProfiling.cpp | ||
---|---|---|
630 ↗ | (On Diff #163985) | I think the explanation is here: But you're right: the previous implementation was buggy since i and e were shadowed. |
Thanks, LGTM.
lib/Transforms/Instrumentation/GCOVProfiling.cpp | ||
---|---|---|
630 ↗ | (On Diff #163985) | Here's the inline comment: // Each compile unit gets its own .gcno file. This means that whether we run // this pass over the original .o's as they're produced, or run it after // LTO, we'll generate the same .gcno files. I somewhat follow that comment, but it's left me confused about the purpose of the outer loop in emitProfileArcs. The loop body doesn't appear to depend on i at all and doesn't inspect CU_Nodes. All of the compiler-rt/llvm tests pass when I remove the loop. Actually, doesn't it seem like the outer loop would actively do the wrong thing with LTO enabled (insert duplicate instrumentation)? |
@vsk I've no commit access, could you land this patch and this one https://reviews.llvm.org/D51620 please ?
This needed a small follow-up fix: see r341985, http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/12824/.