This is an archive of the discontinued LLVM Phabricator instance.

[GVN+LICM] Use line 0 locations for better crash attribution
ClosedPublic

Authored by vsk on Apr 19 2019, 12:05 PM.

Details

Summary

This is a follow-up to r291037+r291258, which used null debug locations
to prevent jumpy line tables.

Using line 0 locations achieves the same effect, but works better for
crash attribution because it preserves the right inline scope.

Diff Detail

Repository
rL LLVM

Event Timeline

vsk created this revision.Apr 19 2019, 12:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 19 2019, 12:05 PM
This revision was not accepted when it landed; it landed in state Needs Review.Apr 19 2019, 3:37 PM
This revision was automatically updated to reflect the committed changes.

Using zero on non-call locations might bloat the line table a fair bit. May be better to let those locations get flow-on from whatever other instructions are around?

At least that, I think, is the current policy of "merge debug locations", is it not?

Perhaps we need a similar utility that can be kept in the same place (as merge debug locations)/enforce the same policy for hoisted locations.

vsk added a comment.Aug 4 2020, 4:14 PM

Using zero on non-call locations might bloat the line table a fair bit. May be better to let those locations get flow-on from whatever other instructions are around?

At least that, I think, is the current policy of "merge debug locations", is it not?

The current policy recommends using a merged debug location here (ref), but that's not the same as not setting a debug location. The recommendation is to use the applyMergedLocation API. If there's a cheap way to keep track of the scopes of the instructions replaced by 'NewInsts', that would be a good fit here.

Perhaps we need a similar utility that can be kept in the same place (as merge debug locations)/enforce the same policy for hoisted locations.

Sorry, I don't follow. Is this some enforcement mechanism for debug location update rules?

In D60913#2194847, @vsk wrote:

Using zero on non-call locations might bloat the line table a fair bit. May be better to let those locations get flow-on from whatever other instructions are around?

At least that, I think, is the current policy of "merge debug locations", is it not?

The current policy recommends using a merged debug location here (ref),

I'm not sure it does - since this code isn't doing any merging - it's hoisting a single instruction, which I don't think is covered by that part of the documentation, is it?

but that's not the same as not setting a debug location. The recommendation is to use the applyMergedLocation API.

Specifically "applyMergedLocation" doesn't handle a single instruction - it handles two, and it's not for use when crossing a conditional boundary (taking code from somewhere that only executes under one condition and moving it to a place that might execute when that condition doesn't hold). Is that's what's happening here? (is code being hoisted across a basic block boundary?)

If there's a cheap way to keep track of the scopes of the instructions replaced by 'NewInsts', that would be a good fit here.

Perhaps we need a similar utility that can be kept in the same place (as merge debug locations)/enforce the same policy for hoisted locations.

Sorry, I don't follow. Is this some enforcement mechanism for debug location update rules?

No, sorry - I meant perhaps we should have a function like Instruction::applyMergedLocation that's for the case of hoisting code across conditionals without any merging - hmm, wait, no, now I've confused myself.

If the code is hoisting across a conditional, then there's nothing to preserve and we should be doing what the docs say - https://llvm.org/docs/HowToUpdateDebugInfo.html#id5 - drop the location.

If it's a call, even then we shouldn't preserve the inlinedAt information, because we could mislead users/profilers into believing the function was reached when it wasn't (because the condition may never be true). But we do need to set a location on calls (though this would only be for calls we know certain things about - pure/const, that sort of thing - that would allow us to hoist - is that the case here?) but it should probably be in the outer/concrete function, since we can't correctly attribute the instruction to any specific scope/inlined function, I believe.

vsk added a comment.Aug 6 2020, 4:09 PM
In D60913#2194847, @vsk wrote:

Using zero on non-call locations might bloat the line table a fair bit. May be better to let those locations get flow-on from whatever other instructions are around?

At least that, I think, is the current policy of "merge debug locations", is it not?

The current policy recommends using a merged debug location here (ref),

I'm not sure it does - since this code isn't doing any merging - it's hoisting a single instruction, which I don't think is covered by that part of the documentation, is it?

Ah! Ok, got it. I had this confused earlier: I assumed GVN 'merged' instructions together, but this is not true for PerformLoadPRE (or for the hoist routine in LICM) touched in this patch.

but that's not the same as not setting a debug location. The recommendation is to use the applyMergedLocation API.

Specifically "applyMergedLocation" doesn't handle a single instruction - it handles two, and it's not for use when crossing a conditional boundary (taking code from somewhere that only executes under one condition and moving it to a place that might execute when that condition doesn't hold). Is that's what's happening here? (is code being hoisted across a basic block boundary?)

Right, it looks like both of the functions modified in this patch move an instruction across a basic block boundary.

If there's a cheap way to keep track of the scopes of the instructions replaced by 'NewInsts', that would be a good fit here.

Perhaps we need a similar utility that can be kept in the same place (as merge debug locations)/enforce the same policy for hoisted locations.

Sorry, I don't follow. Is this some enforcement mechanism for debug location update rules?

No, sorry - I meant perhaps we should have a function like Instruction::applyMergedLocation that's for the case of hoisting code across conditionals without any merging - hmm, wait, no, now I've confused myself.

If the code is hoisting across a conditional, then there's nothing to preserve and we should be doing what the docs say - https://llvm.org/docs/HowToUpdateDebugInfo.html#id5 - drop the location.

Now I think we're on the same page: I agree that that's what the docs recommend. It would be helpful to have some utility ('Instruction::applyHoistedLocation'?) to simplify setting the right location.

If it's a call, even then we shouldn't preserve the inlinedAt information, because we could mislead users/profilers into believing the function was reached when it wasn't (because the condition may never be true). But we do need to set a location on calls (though this would only be for calls we know certain things about - pure/const, that sort of thing - that would allow us to hoist - is that the case here?) but it should probably be in the outer/concrete function, since we can't correctly attribute the instruction to any specific scope/inlined function, I believe.

I think the hoist function can move a call, and afaik PerformLoadPRE cannot. If we were to use a helper like 'Instruction::applyHoistedLocation' in both cases, what would the helper have to look like? What I have in mind is:

  • When hoisting an instruction that isn't a call, drop its location.
  • If it _is_ a call, and the parent function has no debug scope, drop its location.
  • Finally if it _is_ a call and the parent function has a debug scope, set its location to line 0 with the parent function's scope, and no inlinedAt location.

Does that seem reasonable?

In D60913#2201426, @vsk wrote:
In D60913#2194847, @vsk wrote:

Using zero on non-call locations might bloat the line table a fair bit. May be better to let those locations get flow-on from whatever other instructions are around?

At least that, I think, is the current policy of "merge debug locations", is it not?

The current policy recommends using a merged debug location here (ref),

I'm not sure it does - since this code isn't doing any merging - it's hoisting a single instruction, which I don't think is covered by that part of the documentation, is it?

Ah! Ok, got it. I had this confused earlier: I assumed GVN 'merged' instructions together, but this is not true for PerformLoadPRE (or for the hoist routine in LICM) touched in this patch.

but that's not the same as not setting a debug location. The recommendation is to use the applyMergedLocation API.

Specifically "applyMergedLocation" doesn't handle a single instruction - it handles two, and it's not for use when crossing a conditional boundary (taking code from somewhere that only executes under one condition and moving it to a place that might execute when that condition doesn't hold). Is that's what's happening here? (is code being hoisted across a basic block boundary?)

Right, it looks like both of the functions modified in this patch move an instruction across a basic block boundary.

If there's a cheap way to keep track of the scopes of the instructions replaced by 'NewInsts', that would be a good fit here.

Perhaps we need a similar utility that can be kept in the same place (as merge debug locations)/enforce the same policy for hoisted locations.

Sorry, I don't follow. Is this some enforcement mechanism for debug location update rules?

No, sorry - I meant perhaps we should have a function like Instruction::applyMergedLocation that's for the case of hoisting code across conditionals without any merging - hmm, wait, no, now I've confused myself.

If the code is hoisting across a conditional, then there's nothing to preserve and we should be doing what the docs say - https://llvm.org/docs/HowToUpdateDebugInfo.html#id5 - drop the location.

Now I think we're on the same page: I agree that that's what the docs recommend. It would be helpful to have some utility ('Instruction::applyHoistedLocation'?) to simplify setting the right location.

If it's a call, even then we shouldn't preserve the inlinedAt information, because we could mislead users/profilers into believing the function was reached when it wasn't (because the condition may never be true). But we do need to set a location on calls (though this would only be for calls we know certain things about - pure/const, that sort of thing - that would allow us to hoist - is that the case here?) but it should probably be in the outer/concrete function, since we can't correctly attribute the instruction to any specific scope/inlined function, I believe.

I think the hoist function can move a call, and afaik PerformLoadPRE cannot. If we were to use a helper like 'Instruction::applyHoistedLocation' in both cases, what would the helper have to look like? What I have in mind is:

  • When hoisting an instruction that isn't a call, drop its location.
  • If it _is_ a call, and the parent function has no debug scope, drop its location.
  • Finally if it _is_ a call and the parent function has a debug scope, set its location to line 0 with the parent function's scope, and no inlinedAt location.

Does that seem reasonable?

Yeah, that sounds like what I'd think would be good. Can't guarantee it (if this function has no debug info, but is inlinable - then the call might be problematic?, if the call itself is inlinable too - I forget how we deal with this in other cases) , but that's certainly the general direction I'm on board with.

vsk added a comment.Aug 6 2020, 4:27 PM

Sounds good, I've put this on my queue and hope to take a stab at it by Monday (8/10).