D25742 improved the precision of debug locations for PGO by removing debug locations from common tail when tail-merging. However, if identical insturctions that are merged into a common tail have the same debug locations, there's no need to remove them. This patch creates a merged debug location of identical instructions across SameTails and assign it to the instruction in the common tail, so that the debug locations are maintained if they are same across identical instructions.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
However, if identical insturctions that are merged into a common tail have the same debug locations, there's no need to remove them.
Out of curiosity, did you see this happening in real code?
In general, I am not against the idea of using getMergedLocation . However, as far as I understand, in practice you can only end up in a situation where common code has the same debug location as a result of macro expansion, and if we didn't run pass AddDiscriminators.
When building for PGO, we would end up running pass AddDiscriminators, and instructions from different basic blocks would end up having different discriminators.
Yes I observed the case from refresh_potential function in spec2006 429.mcf. The original C code snippet is below:
81 while( node != root ) 82 { 83 while( node ) 84 { 85 if( node->orientation == UP ) 86 node->potential = node->basic_arc->cost + node->pred->potential; 87 else /* == DOWN */ 88 { 89 node->potential = node->pred->potential - node->basic_arc->cost; 90 checksum++; 91 } ...
Before Control Flow Optimizer pass (which runs BranchFolder pass), instructions for line 83 are located in two places, one in the loop preheader and the other in the loop body:
BB#4: derived from LLVM BB %while.cond3.preheader TEST64rr %RDX, %RDX, %EFLAGS<imp-def>; dbg:mcfutil.c:83:9 JE_1 <BB#5>, %EFLAGS<imp-use>; dbg:mcfutil.c:83:9 JMP_1 <BB#6>; dbg:mcfutil.c:83:9 ... BB#6: derived from LLVM BB %while.body4 ... TEST64rr %RDX, %RDX, %EFLAGS<imp-def>; dbg:mcfutil.c:83:9 JE_1 <BB#5>, %EFLAGS<imp-use>; dbg:mcfutil.c:83:9 JMP_1 <BB#6>; dbg:mcfutil.c:83:9
And these two blocks are tail merged by BranchFolder pass, but without this patch, debug locations are gone in the common tail.
I confirmed that AddDiscriminator pass has been executed with '-Xclang -fdebug-info-for-profiling'. (By the way @andreadb, do we need to explicitly add the flag to have discriminators? As far as I remember I could have discriminators for expressions in the same source line even without the flag. Thanks!)
Thanks for the example!
I guess, part of the reason why we end up in that situation is because the loop is rotated, so the original loop check now exists in two places (and the debugloc for those instructions is the same).
I confirmed that AddDiscriminator pass has been executed with '-Xclang -fdebug-info-for-profiling'. (By the way @andreadb, do we need to explicitly add the flag to have discriminators? As far as I remember I could have discriminators for expressions in the same source line even without the flag. Thanks!)
You remember it right. I think that there is an hidden option to disable discriminators (but just for debugging purposes).
@aprantl Thanks for the review! I added inline questions asking for your opinion about consistency vs guideline.
lib/CodeGen/BranchFolding.cpp | ||
---|---|---|
758 ↗ | (On Diff #89283) | All other functions in BranchFolding.cpp file have doxygen comments inside .cpp file with a method name. I thought it is better to keep the consistency, and wonder what is your opinion on this. |
763 ↗ | (On Diff #89283) | I kept 'i' small here for the consistency as well. |