This is an archive of the discontinued LLVM Phabricator instance.

[CodeExtractor] Add debug locations for new call and branch instrs.
ClosedPublic

Authored by fhahn on Nov 24 2017, 2:54 AM.

Details

Summary

If a partially inlined function has debug info, we have to add debug
locations to the call instruction calling the outlined function.
We use the debug location of the first instruction in the outlined
function, as the introduced call transfers control to this statement and
there is no other equivalent line in the source code.

We also use the same debug location for the branch instruction added
to jump from artificial entry block for the outlined function, which just
jumps to the first actual basic block of the outlined function.

Diff Detail

Event Timeline

fhahn created this revision.Nov 24 2017, 2:54 AM
fhahn updated this revision to Diff 124202.Nov 24 2017, 6:44 AM

Improve test slightly.

aprantl edited edge metadata.Dec 1 2017, 8:59 AM

This is probably okay. Is there an introduction into partial inlining somewhere? I'd like to better understand what the transformation is doing, so I can give more helpful review feedback.

lib/Transforms/Utils/CodeExtractor.cpp
750

we usually don't spell out the != nullptr

1040

Could you please copy the entire text of the description you added to this phabricator review into the comment here? It is non-obvious what we are doing here, and I prefer having the complete story in the comment.

fhahn updated this revision to Diff 125361.Dec 4 2017, 9:58 AM
fhahn marked 2 inline comments as done.
fhahn added a comment.Dec 4 2017, 10:10 AM

This is probably okay. Is there an introduction into partial inlining somewhere? I'd like to better understand what the transformation is doing, so I can give more helpful review feedback.

Muth, Robert, and Saumra Debray. "Partial inlining." might be useful (although all PDF versions I found are not really good quality :(). The basic idea is to 1) create a clone of the original function f_cloned, 2) extract cold code from the cloned function into function f_extracted and 3) replace the cold code in f_cloned with a call to f_extracted. The smaller function f_cloned might be viable to inline where f was not. This is for example useful, if you have a function too big to inline directly, but with an early return at the beginning.

This patch adds debug info for the call instruction added in step 3) and the branch instruction from the artificial entry block added to f_extracted to ensure the function entry block does not have any predecessors.

lib/Transforms/Utils/CodeExtractor.cpp
1040

I've adjusted the text for the position.

Thanks! I found one actual problem with the assertion.

lib/Transforms/Utils/CodeExtractor.cpp
1047

This assertion will fail if you haved a nodebug function inlined into a function with debuginfo.

aprantl accepted this revision.Dec 4 2017, 1:32 PM
aprantl added a reviewer: danielcdh.

Otherwise this seems good to me. Someone who is interested in profile-based-PGO should probably also look at this, because they typically don't like it when debug info is moved from one basic block into another.

This revision is now accepted and ready to land.Dec 4 2017, 1:34 PM
danielcdh accepted this revision.Dec 4 2017, 2:12 PM

The patch LGTM. Notes about SamplePGO use cases below, which may be addressed by separate patch/discussion, if necessary.

test/Transforms/CodeExtractor/PartialInlineDebug.ll
31

I'm wondering if we inline callee.1_if.then back into caller, what would the inline stack be for this instruction. Looks like it will be:

test.c:10 (callee.1_if.then)
test.c:10 (callee)
test.c:5 (caller)

But from the PGO point of view, the correct inline stack should be:

test.c:10 (callee)
test.c:5 (caller)

Not sure if PGO-desired inline stack is possible with partial inlining with or without this patch.

Maybe we should just disable partial inlining when the binary is to be used to collect SamplePGO profile, unless it's designed for flattened profile. +wmi who's evaluating flattened profile.

fhahn added a comment.Dec 4 2017, 2:57 PM

Thanks for having a look! I'll address the assertion before committing tomorrow and also think a bit about the PGO case.

lib/Transforms/Utils/CodeExtractor.cpp
1047

Ah yes, I will look into how the inliner deals with that case and handle it appropriately.

test/Transforms/CodeExtractor/PartialInlineDebug.ll
31

Great, thanks for having a look! Currently, partial inlining is disabled by default, but it sounds like we have to handle this issue before enabling it or disable it with PGO for now.

Is the problem for PGO with splitting callee into 2 functions the following: all samples would be (incorrectly) attributed to callee and not split between callee and callee.1_if.then?

danielcdh added inline comments.Dec 4 2017, 9:15 PM
test/Transforms/CodeExtractor/PartialInlineDebug.ll
31

On the contrary, in the profile, we want to have all samples be attributed to callee instead of split between 2 parts. But as callee .1_if.then is in a separate function, this does not seem possible.

fhahn added inline comments.Dec 5 2017, 6:29 AM
lib/Transforms/Utils/CodeExtractor.cpp
1047

The basic block header is from a function with debug info (it comes from oldFunction). Isn't it save to assume that instructions in this basic block should have debug info? Do you mean when inlining a nodebug function in one with debug info, some basic blocks could be missing debug info which could trigger this assertion in the partial inliner? (Sorry my knowledge of the assumptions about debug info are quite light).

rriddle added inline comments.Dec 5 2017, 11:31 AM
lib/Transforms/Utils/CodeExtractor.cpp
1047

It's not safe to assume that the header block will have debug locations. So it should probably just find the first valid debug location, if there is one, and use that for the call instruction. It's only needed if there is a valid debug location in the abstracted blocks anyways.

For nodebug functions that are inlined, all of the instructions being inlined inherit the debug location of the call. If there are missing debug locations due to that, then the function was already missing debug info on some instructions.

fhahn added inline comments.Dec 5 2017, 1:39 PM
lib/Transforms/Utils/CodeExtractor.cpp
1047

Ah thanks for confirming. That makes things slightly more tricky. What is the suggested way to find the "first" valid debug location? BFS through the CFG? Could it happen that getSubprogram() is not null, but there are no debug locations?

aprantl added inline comments.Dec 5 2017, 1:47 PM
lib/Transforms/Utils/CodeExtractor.cpp
1047

I think so:

__attribute__(nodebug) void f() { /* actual code */ }
__attribute__(nodebug) void g() { f(); }
void h() { g() }

After inlining f into g into h, all the code of f is in h, but it won't have debug locations.

rriddle added inline comments.Dec 5 2017, 3:10 PM
lib/Transforms/Utils/CodeExtractor.cpp
1047

If the header doesn't have a valid debug location it's going to be a bit wonky in terms of step debugging. At that point it's really just satisfying the need to have a location on the call.

As for getting the location, you could either iterate the set of blocks being extracted or try and search the CFG. Searching the CFG will require logic for making sure that a particular BB is in the extracted set and not already checked. The extracted set could contain loops, multiple exits, etc. IMO it doesn't really seem worth the work for little to no benefit in the debugging experience.

For the last question, yes! CodeExtractor is a generic extraction utility, it needs to work in all potential scenarios. So it's possible that there could be a small set of block(s) in a function in which the instructions do not have debug locations. A more reasonable scenario for the partial inliner could be if a call to a nodebug function is inlined and the call has no debug location.

For the reverse, we may need to remove the check for the subprogram. For example this partial inline scenario:

void f() { /* Code */ }

__attribute__((nodebug)) void g() { 
   if( /* something */ ) {
     f();
     /* more code */
   }

   /* Other code */
}

void h() { g(); }

After inlining f into g the instructions still maintain their debug locations. If we extract the part containing the now inlined f, we will end up with debug locations and no subprogram. That creates a problem if we want to then inline the stub g into h.

fhahn updated this revision to Diff 125748.Dec 6 2017, 9:19 AM

Thanks for all the feedback, it's been really helpful. I have updated the code to look for a valid debug location in all extracted blocks and added a test case where there is no debug loc in the header.

fhahn updated this revision to Diff 126122.Dec 8 2017, 4:17 AM

Fix code to use the first debug location found, not the last! Updated the test case with multiple debug locations, to check we actually pick the first one, if there are multiple blocks with debug info.

It would be great if someone could have another quick look at this version of the patch, as it changed quite a bit since being accepted.

rriddle accepted this revision.Dec 8 2017, 5:11 AM

LGTM

fhahn closed this revision.Dec 8 2017, 1:49 PM