This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Remove call sites when eliminating unreachable blocks
ClosedPublic

Authored by dstenb on Jul 10 2019, 8:12 AM.

Details

Summary

When eliminating an unreachable block we must remove any call site
information for calls residing in the block.

This was originally found on a downstream target, and the attached x86
test case was produced by hand-modifying some MIR.

Diff Detail

Event Timeline

dstenb created this revision.Jul 10 2019, 8:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 10 2019, 8:12 AM
vsk added a subscriber: vsk.EditedJul 10 2019, 5:03 PM

Wdyt of addressing this problem in a more general way, e.g. in MachineInstruction::eraseFromParent itself?

edit: I meant to write MachineBasicBlock::eraseFromParent, but I think the point still stands.

@dstenb The patch looks good! LGTM! Thanks for working on this!

Wdyt of addressing this problem in a more general way, e.g. in MachineInstruction::eraseFromParent itself?
edit: I meant to write MachineBasicBlock::eraseFromParent, but I think the point still stands.

That would be great and we would not need to worry about validity of the CallSitesInfo. But the thing is that we have a problem with the MI range deletion from the MBB (see for example MachineOutliner::outline or version before this feature for TargetInstrInfo::ReplaceTailWithBranchTo). Such deletions cannot be generalized. For now, there are just two of such cases but in further we may stumble across more.
The whole idea behind this assertion and update is to evade generalization in the destructor of the MachineInst (which is the safest but requires certain overhead) and to use the update only when necessary. Similarly will be with the eraseFromParent, it will require certain overhead.
Please let us know if you have an idea how to overcome those constraints.

vsk added a comment.Jul 11 2019, 10:28 AM

@dstenb The patch looks good! LGTM! Thanks for working on this!

Wdyt of addressing this problem in a more general way, e.g. in MachineInstruction::eraseFromParent itself?
edit: I meant to write MachineBasicBlock::eraseFromParent, but I think the point still stands.

That would be great and we would not need to worry about validity of the CallSitesInfo. But the thing is that we have a problem with the MI range deletion from the MBB (see for example MachineOutliner::outline or version before this feature for TargetInstrInfo::ReplaceTailWithBranchTo). Such deletions cannot be generalized. For now, there are just two of such cases but in further we may stumble across more.
The whole idea behind this assertion and update is to evade generalization in the destructor of the MachineInst (which is the safest but requires certain overhead) and to use the update only when necessary. Similarly will be with the eraseFromParent, it will require certain overhead.
Please let us know if you have an idea how to overcome those constraints.

Thanks for humoring me :). I haven't actively worked in this area, so I may be missing important/subtle details here.

But ISTM that range deletion within MachineBasicBlock can be handled generically as well ('if any MI in the range is a call ...'). It doesn't seem too far-fetched to teach all APIs that unlink an MI from the instruction list to update call site info. Perhaps this would incur a little extra overhead if, say, an MI is unlinked and subsequently re-inserted somewhere else. In this case there might be two updates instead of one. However, I'd argue that the cost is likely justified if it makes the Machine* APIs safe to use from a debug info correctness perspective. Has this already been tried, and been shown to be too expensive in compile-time benchmarking?

I just want to add that I was a bit hesitant about generalizing the removal as I think it can be quite hard to tell when call site information has been removed, so I thought it would be better to have asserts trigger for each individual case, so that we can detect and assess what to do there, rather than dropping the information silently at the risk of false negatives.

The reason why I think it's hard to see that call site information has been removed is that, as far as I can tell there is no way to print the function attributes when doing something like -print-{before,after}-all, so it can be hard to to see where in the pass pipeline the callSites information was dropped. You could of course analyze the MIR and see where the information is likely to have been dropped, but I guess that can be difficult/annoying for larger programs.

However, perhaps there are no cases where we erase a block, and at the same would want to redirect call site information for calls inside that block so that it points to (newly created?) calls in other blocks. My concerns may be unfounded, and it should be fine to silently drop the call site information when erasing the block. I'm not sure though.

vsk accepted this revision.Jul 12 2019, 11:25 AM

I just want to add that I was a bit hesitant about generalizing the removal as I think it can be quite hard to tell when call site information has been removed, so I thought it would be better to have asserts trigger for each individual case, so that we can detect and assess what to do there, rather than dropping the information silently at the risk of false negatives.

I hadn't realized there was an assert in DeleteMachineInstr. That sounds like a reasonable/systematic way to find missed updates. That addresses my main concern :).

The reason why I think it's hard to see that call site information has been removed is that, as far as I can tell there is no way to print the function attributes when doing something like -print-{before,after}-all, so it can be hard to to see where in the pass pipeline the callSites information was dropped. You could of course analyze the MIR and see where the information is likely to have been dropped, but I guess that can be difficult/annoying for larger programs.

Oh, I thought r364506 (D61061) implemented MIR printer support for call site metadata. If not, it's probably worth adding.

However, perhaps there are no cases where we erase a block, and at the same would want to redirect call site information for calls inside that block so that it points to (newly created?) calls in other blocks. My concerns may be unfounded, and it should be fine to silently drop the call site information when erasing the block. I'm not sure though.

I think the current approach is fine. I don't yet see the need to pursue the alternative (updates in *::eraseFromParent).

This revision is now accepted and ready to land.Jul 12 2019, 11:25 AM
NikolaPrica accepted this revision.Jul 15 2019, 4:31 AM

The reason why I think it's hard to see that call site information has been removed is that, as far as I can tell there is no way to print the function attributes when doing something like -print-{before,after}-all, so it can be hard to to see where in the pass pipeline the callSites information was dropped. You could of course analyze the MIR and see where the information is likely to have been dropped, but I guess that can be difficult/annoying for larger programs.

Oh, I thought r364506 (D61061) implemented MIR printer support for call site metadata. If not, it's probably worth adding.

This dumps the CallSitesInfo when we producing a ".mir" file. The CallSitesInfo describes an internal state of the MF like the fixedStack, frameInfo etc. I'm not sure how we could make it recognizable in -print-{before,after}-all. Maybe like some metadata extension of call instruction. However, it should be part of another patch.

However, perhaps there are no cases where we erase a block, and at the same would want to redirect call site information for calls inside that block so that it points to (newly created?) calls in other blocks. My concerns may be unfounded, and it should be fine to silently drop the call site information when erasing the block. I'm not sure though.

We share your concerns. We were not sure either. That's why we requested one more reconsideration after the patch was already accepted at D61062 and perhaps for the reasons you mentioned it is decided to go with assertion & addressing the missing parts.

Thanks for contribution!

Thanks for the reviews, and sorry for taking some time to land this!

In D64500#1583215, @vsk wrote:

Oh, I thought r364506 (D61061) implemented MIR printer support for call site metadata. If not, it's probably worth adding.

Yes, that was a bit of a sloppy comment by me. I think that it would be easier if the information was visible when printing using the -print* options, but you can of course -stop-{before,after} your way through the passes to pinpoint where the information is dropped.

This revision was automatically updated to reflect the committed changes.