This is an archive of the discontinued LLVM Phabricator instance.

[BranchFolding] Merge debug locations from common tail instead of removing
ClosedPublic

Authored by twoh on Feb 21 2017, 2:39 PM.

Details

Summary

D25742 improved the precision of debug locations for PGO by removing debug locations from common tail when tail-merging. However, if identical insturctions that are merged into a common tail have the same debug locations, there's no need to remove them. This patch creates a merged debug location of identical instructions across SameTails and assign it to the instruction in the common tail, so that the debug locations are maintained if they are same across identical instructions.

Diff Detail

Repository
rL LLVM

Event Timeline

twoh created this revision.Feb 21 2017, 2:39 PM

However, if identical insturctions that are merged into a common tail have the same debug locations, there's no need to remove them.

Out of curiosity, did you see this happening in real code?

In general, I am not against the idea of using getMergedLocation . However, as far as I understand, in practice you can only end up in a situation where common code has the same debug location as a result of macro expansion, and if we didn't run pass AddDiscriminators.
When building for PGO, we would end up running pass AddDiscriminators, and instructions from different basic blocks would end up having different discriminators.

twoh added a comment.Feb 24 2017, 10:00 AM

Yes I observed the case from refresh_potential function in spec2006 429.mcf. The original C code snippet is below:

 81     while( node != root )
 82     {
 83         while( node )
 84         {
 85             if( node->orientation == UP )
 86                 node->potential = node->basic_arc->cost + node->pred->potential;
 87             else /* == DOWN */
 88             {
 89                 node->potential = node->pred->potential - node->basic_arc->cost;
 90                 checksum++;
 91             }
...

Before Control Flow Optimizer pass (which runs BranchFolder pass), instructions for line 83 are located in two places, one in the loop preheader and the other in the loop body:

BB#4: derived from LLVM BB %while.cond3.preheader
  TEST64rr %RDX, %RDX, %EFLAGS<imp-def>; dbg:mcfutil.c:83:9
  JE_1 <BB#5>, %EFLAGS<imp-use>; dbg:mcfutil.c:83:9
  JMP_1 <BB#6>; dbg:mcfutil.c:83:9

...

BB#6: derived from LLVM BB %while.body4
...
  TEST64rr %RDX, %RDX, %EFLAGS<imp-def>; dbg:mcfutil.c:83:9
  JE_1 <BB#5>, %EFLAGS<imp-use>; dbg:mcfutil.c:83:9
  JMP_1 <BB#6>; dbg:mcfutil.c:83:9

And these two blocks are tail merged by BranchFolder pass, but without this patch, debug locations are gone in the common tail.

I confirmed that AddDiscriminator pass has been executed with '-Xclang -fdebug-info-for-profiling'. (By the way @andreadb, do we need to explicitly add the flag to have discriminators? As far as I remember I could have discriminators for expressions in the same source line even without the flag. Thanks!)

In D30226#685947, @twoh wrote:

Yes I observed the case from refresh_potential function in spec2006 429.mcf. The original C code snippet is below:

 81     while( node != root )
 82     {
 83         while( node )
 84         {
 85             if( node->orientation == UP )
 86                 node->potential = node->basic_arc->cost + node->pred->potential;
 87             else /* == DOWN */
 88             {
 89                 node->potential = node->pred->potential - node->basic_arc->cost;
 90                 checksum++;
 91             }
...

Before Control Flow Optimizer pass (which runs BranchFolder pass), instructions for line 83 are located in two places, one in the loop preheader and the other in the loop body:

BB#4: derived from LLVM BB %while.cond3.preheader
  TEST64rr %RDX, %RDX, %EFLAGS<imp-def>; dbg:mcfutil.c:83:9
  JE_1 <BB#5>, %EFLAGS<imp-use>; dbg:mcfutil.c:83:9
  JMP_1 <BB#6>; dbg:mcfutil.c:83:9

...

BB#6: derived from LLVM BB %while.body4
...
  TEST64rr %RDX, %RDX, %EFLAGS<imp-def>; dbg:mcfutil.c:83:9
  JE_1 <BB#5>, %EFLAGS<imp-use>; dbg:mcfutil.c:83:9
  JMP_1 <BB#6>; dbg:mcfutil.c:83:9

And these two blocks are tail merged by BranchFolder pass, but without this patch, debug locations are gone in the common tail.

Thanks for the example!
I guess, part of the reason why we end up in that situation is because the loop is rotated, so the original loop check now exists in two places (and the debugloc for those instructions is the same).

I confirmed that AddDiscriminator pass has been executed with '-Xclang -fdebug-info-for-profiling'. (By the way @andreadb, do we need to explicitly add the flag to have discriminators? As far as I remember I could have discriminators for expressions in the same source line even without the flag. Thanks!)

You remember it right. I think that there is an hidden option to disable discriminators (but just for debugging purposes).

twoh added a comment.Mar 3 2017, 3:24 PM

Friendly Ping. Thanks!

aprantl added inline comments.Mar 3 2017, 3:37 PM
lib/CodeGen/BranchFolding.cpp
758 ↗(On Diff #89283)

doxygen comments should be in the header file and not repeat the method name.

763 ↗(On Diff #89283)

I think local variables should be captialized.

948 ↗(On Diff #89283)

.

aprantl accepted this revision.Mar 3 2017, 3:38 PM

Seems plausible. (inline comments pending, ...)

This revision is now accepted and ready to land.Mar 3 2017, 3:38 PM
twoh added a comment.Mar 6 2017, 2:29 PM

@aprantl Thanks for the review! I added inline questions asking for your opinion about consistency vs guideline.

lib/CodeGen/BranchFolding.cpp
758 ↗(On Diff #89283)

All other functions in BranchFolding.cpp file have doxygen comments inside .cpp file with a method name. I thought it is better to keep the consistency, and wonder what is your opinion on this.

763 ↗(On Diff #89283)

I kept 'i' small here for the consistency as well.

twoh added a comment.Mar 13 2017, 9:10 AM

@aprantl Could you please give your opinion on the inlined questions? Thanks!

Replied.

lib/CodeGen/BranchFolding.cpp
762 ↗(On Diff #89283)

I think the right way forward is to commit it like this and then follow-up by an NFC comment that reformats the comments according to the coding guidelines. (No need for a pre-commit review IMO)

763 ↗(On Diff #89283)

see above.

This revision was automatically updated to reflect the committed changes.