Page MenuHomePhabricator

[DebugInfo] Select debug intrinsic line-numbers more carefully when promoting dbg.declare
ClosedPublic

Authored by jmorse on Mar 12 2019, 1:20 PM.

Details

Summary

This patch changes the line-numbers given to dbg.value intrinsics when an alloca/dbg.declare is promoted to an SSA register. Currently we use the original line number of the dbg.declare, i.e. the variable declaration. I believe there is risk that well-meaning optimisations will accidentally acquire these line numbers from dbg.values, artificially introducing line-number changes that needlessly step back to the variable declaration. I wasn't able to immediately produce an input that demonstrated this, below is an explanation of the patch, followed by some evidence to demonstrate this is a desirable change.

As far as I understand it, the actual line number that a dbg.value gets is meaningless (although the scope / inlining information is important). The issue is that the line number could leak into surrounding instructions during optimisation.

~

What I've changed: In the three ConvertDebugDeclareToDebugValue functions changed, previously we assigned each created dbg.value intrinsic the DebugLoc from the dbg.declare it originates from. Now, if the store or PHI node that the dbg.value corresponds to has a line number in the same scope, the 'getDebugValueLoc' helper picks that, otherwise an unknown-location line number is used.

The added test tests three scenarios (explained in the file) of no-location, suitable location, and inlined location.

The formal_parameter.ll test that changes, is checking that the dbg.values for a lowered dbg.declare have the right scope. They now get different DILocations, so I've added extra CHECK lines for their scope fields.

~

To try and get some perspective on whether this is a good change to make, I built clang-3.4 with and without this patch and diff'd the line tables. Roughly 900 entries were altered, in some whole lines went away, in others they were replaced by unknown-locations.

Example 1: in "llvm::GPRGetRawAllocationOrder(llvm::MachineFunction const&)", some tablegen'd function for the ARM backend, this materializeFrameBaseRegister function [0] is inlined. Without the patch, the line table progression is:

Line 577 (conditional test)
Constructor for a DebugLoc
Line 578 (conditional assignment to DL)
This [1] variable declaration in BuildMI
MachineBasicBlock::getParent
MachineFunction::getTarget
MachineFunction::getRegInfo
Line 582

Most of that makes sense for lines 577 to 582 of the function [0], except for the early step into BuildMI [1], which seems to be part of the BuildMI call on line 586. That particular linetable entry disappears with this patch. It's unclear to me how it got there in the first place, but I don't believe it's helpful.

Example 2, in "llvm::ARMFrameLowering::hasFP", this [2] inlined call to emitRegPlusImmediate steps through the evaluation of the arguments, then to the declaration of one of the formal parameters in the target function [3], then back to evaluating the arguments again. This extra spurious step into the destination function isn't helpful IMO.

While I don't know where exactly instructions pick up line numbers from dbg.values, I believe these differences demonstrate that they do, and so it should be avoided.

~

tl;dr, if we don't spread the line numbers of variable declarations throughout instructions in the function, we'll have less jumpy line tables.

Credit to @CarlosAlbertoEnciso who found this originally but didn't get around to filing a patch.

[0] https://github.com/llvm-mirror/llvm/blob/release_34/lib/Target/ARM/ARMBaseRegisterInfo.cpp#L577
[1] https://github.com/llvm-mirror/llvm/blob/release_34/include/llvm/CodeGen/MachineInstrBuilder.h#L245
[2] https://github.com/llvm-mirror/llvm/blob/release_34/lib/Target/ARM/ARMFrameLowering.cpp#L285
[3] https://github.com/llvm-mirror/llvm/blob/release_34/lib/Target/ARM/ARMFrameLowering.cpp#L108

Diff Detail

Repository
rL LLVM

Event Timeline

jmorse created this revision.Mar 12 2019, 1:20 PM
rnk added a subscriber: rnk.Mar 12 2019, 3:15 PM

This makes me nervous. The pair of the DIVariable and DILocation is used to uniquely identify the variable that a dbg.value applies to. In practice, inlining is usually what creates multiple instances of the same variable that get updated by dbg.values. You can see how the location is used in DwarfDebug::collectVariableInfoFromMFTable.

I see other places in the code that seem to assume that the scope of the location of a dbg.value will get the scope in which the variable is declared. Your change breaks that invariant. I think we could update such code to get the scope from the DIVariable, but maybe that would be wrong since it would lack inlining information.

In D59272#1426904, @rnk wrote:

This makes me nervous. The pair of the DIVariable and DILocation is used to uniquely identify the variable that a dbg.value applies to. In practice, inlining is usually what creates multiple instances of the same variable that get updated by dbg.values. You can see how the location is used in DwarfDebug::collectVariableInfoFromMFTable.

Last time I checked, the file, line number and column of a dbg.value was completely unused. To uniquely identify and inlined instance of a variable you need to the combination of inlinedAt field of the dbg.value's DILocation and the dbg.value's DILocalVariable.

I see other places in the code that seem to assume that the scope of the location of a dbg.value will get the scope in which the variable is declared. Your change breaks that invariant. I think we could update such code to get the scope from the DIVariable, but maybe that would be wrong since it would lack inlining information.

Isn't the scope coming from the DILocalVariable though?

In D59272#1426904, @rnk wrote:

This makes me nervous. The pair of the DIVariable and DILocation is used to uniquely identify the variable that a dbg.value applies to. In practice, inlining is usually what creates multiple instances of the same variable that get updated by dbg.values. You can see how the location is used in DwarfDebug::collectVariableInfoFromMFTable.

Last time I checked, the file, line number and column of a dbg.value was completely unused. To uniquely identify and inlined instance of a variable you need to the combination of inlinedAt field of the dbg.value's DILocation and the dbg.value's DILocalVariable.

I see other places in the code that seem to assume that the scope of the location of a dbg.value will get the scope in which the variable is declared. Your change breaks that invariant. I think we could update such code to get the scope from the DIVariable, but maybe that would be wrong since it would lack inlining information.

Isn't the scope coming from the DILocalVariable though?

The scope, perhaps/probably - but as you say, not the "inlinedAt" part. So, yeah, picking the wrong DebugLoc for dbg.value instruction could still drop/break/mess up inlining info due to the inlinedAt part of the DebugLoc.

rnk added a comment.Mar 13 2019, 4:16 PM

@aprantl, I think you are correct for all the code that I can find today. The only thing that matters is the inlinedAt part of the location. But I did see this code that handles dbg.declare, which uses the scope from the location:
https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1042

I guess my issue is that using the insertion point just to get "smooth" line numbers seems fragile. If the insertion point isn't a store, I think we'd be better off using an unknown location (line 0). Using the location of the store, assuming it has the right inlinedAt info, as this does, seems good. That just leaves us to ask what should happen at the other dbg.value creation places, call sites, loads, etc.

jmorse updated this revision to Diff 190626.Mar 14 2019, 7:49 AM

Update to use unknown-locations for everything but promoted store instructions,

rnk wrote:

@aprantl, I think you are correct for all the code that I can find today. The only thing that matters is the inlinedAt part of the location. But I did see this code that handles dbg.declare, which uses the scope from the location:

So that I understand correctly: the concern was that dbg.values with different DILocations could be seen as different variable instances (bad), but actually only the inlinedAt field is used for that (which is co-incidentally is also a DILocation).

For the avoidance of doubt, the code in getDebugValueLoc always uses the scope and inlinedAt field from the dbg.declare.

I guess my issue is that using the insertion point just to get "smooth" line numbers seems fragile. If the insertion point isn't a store, I think we'd be better off using an unknown location (line 0). Using the location of the store, assuming it has the right inlinedAt info, as this does, seems good. That just leaves us to ask what should happen at the other dbg.value creation places, call sites, loads, etc.

This update now uses unknown-location for everything except stores (and covers call-sites which I missed before). Using unknown-location for everything would be perfectly fine too IMO, I'm not wed to either idea.

rnk added a comment.Mar 14 2019, 9:57 AM

Thanks! I haven't read the code in detail, but this all sounds good to me. (/me goes back to cave).

Isn't there a risk that this (partially) is hiding bugs in some other passes, where the DebugLoc is picked from the wrong place. I guess that either it should be picked from an adjacent non-meta instruction. Or it should be set to unknown.
With this patch we only eliminate the risk that the faulty passes are picking an "incorrect" DebugLoc from a dbg.value intrinsic (or DBG_VALUE instruction). But the "faulty" pass could still pick incorrect DebugLoc from other instructions, right? Or is it always correct to take the DebugLoc from all other non-dbg.value instructions, including all other kinds of meta-instructions?

Might be hard to find all such bugs. But with your known examples it shouldn't be too hard to find at least some places in opt/llc where the DebugLoc from the dbg.value is infecting some other instruction. Then perhaps we want to land this patch as well, just do avoid some not-yet-detected problems.

Hi,

Isn't there a risk that this (partially) is hiding bugs in some other passes, where the DebugLoc is picked from the wrong place. I guess that either it should be picked from an adjacent non-meta instruction. Or it should be set to unknown.
With this patch we only eliminate the risk that the faulty passes are picking an "incorrect" DebugLoc from a dbg.value intrinsic (or DBG_VALUE instruction). But the "faulty" pass could still pick incorrect DebugLoc from other instructions, right? Or is it always correct to take the DebugLoc from all other non-dbg.value instructions, including all other kinds of meta-instructions?

Might be hard to find all such bugs. But with your known examples it shouldn't be too hard to find at least some places in opt/llc where the DebugLoc from the dbg.value is infecting some other instruction. Then perhaps we want to land this patch as well, just do avoid some not-yet-detected problems.

All true -- this patch can be characterised as simply reducing the risk of leaking definitely-wrong line numbers into non-meta IR instructions, which I think has some value.

Quickly testing what happens if IRBuilder is adjusted to not pick take DebugLocs from debug-intrinsics [0], roughly 750 of the 900 entries this patch changes, are also changed, possibly to different lines though. (I'll upload this trivial change sometime soon too).

That's likely the bulk of the leaking line numbers; while the rest could be hunted down, they're likely to be scattered through many places, so IMO this patch is the most time-efficient solution going forwards.

[0] https://github.com/llvm-mirror/llvm/blob/e3696113b639c8bf0b72d6c27dd76d6fdd8ebf61/include/llvm/IR/IRBuilder.h#L137

bjope added a comment.Mar 25 2019, 8:03 AM

Hi,

Isn't there a risk that this (partially) is hiding bugs in some other passes, where the DebugLoc is picked from the wrong place. I guess that either it should be picked from an adjacent non-meta instruction. Or it should be set to unknown.
With this patch we only eliminate the risk that the faulty passes are picking an "incorrect" DebugLoc from a dbg.value intrinsic (or DBG_VALUE instruction). But the "faulty" pass could still pick incorrect DebugLoc from other instructions, right? Or is it always correct to take the DebugLoc from all other non-dbg.value instructions, including all other kinds of meta-instructions?

Might be hard to find all such bugs. But with your known examples it shouldn't be too hard to find at least some places in opt/llc where the DebugLoc from the dbg.value is infecting some other instruction. Then perhaps we want to land this patch as well, just do avoid some not-yet-detected problems.

All true -- this patch can be characterised as simply reducing the risk of leaking definitely-wrong line numbers into non-meta IR instructions, which I think has some value.

Quickly testing what happens if IRBuilder is adjusted to not pick take DebugLocs from debug-intrinsics [0], roughly 750 of the 900 entries this patch changes, are also changed, possibly to different lines though. (I'll upload this trivial change sometime soon too).

That's likely the bulk of the leaking line numbers; while the rest could be hunted down, they're likely to be scattered through many places, so IMO this patch is the most time-efficient solution going forwards.

[0] https://github.com/llvm-mirror/llvm/blob/e3696113b639c8bf0b72d6c27dd76d6fdd8ebf61/include/llvm/IR/IRBuilder.h#L137

Ok, great. I'm happy with that explanation.

Ping; I've uploaded D61198 which likely eliminates the lions share of DebugLocs-leaking-from-DbgIntrinsics. This patch would still be required however to reduce the damage from other scenarios where DebugLocs leak from debug intrinsics.

aprantl added inline comments.Apr 26 2019, 3:00 PM
lib/Transforms/Utils/Local.cpp
1287 ↗(On Diff #190626)

Should we add a comment that everything except the inlinedAt is basically unused and that we are only putting it there as a safety guard for other transformations?

jmorse updated this revision to Diff 198638.May 8 2019, 6:32 AM

Add comment explaining the choice of zero line numbers in getDebugValueLoc. After discussion in D61198, I've removed the clause preserving the line numbers of store instructions -- It's probably better to make it clear that the line number information in debug intrinsics is completely meaningless, and thus not try to preserve it at all. The selection of a zero line number is thus just a defence against leaky optimisations.

aprantl accepted this revision.May 8 2019, 9:39 AM
aprantl added inline comments.
lib/Transforms/Utils/Local.cpp
1290 ↗(On Diff #198638)

///

1294 ↗(On Diff #198638)

I think this is redundant with the access on the next line?

This revision is now accepted and ready to land.May 8 2019, 9:39 AM
bjope added inline comments.May 8 2019, 10:16 AM
lib/Transforms/Utils/Local.cpp
1311 ↗(On Diff #198638)

Current solution is fine with me (as it seems to be one step in the right direction).

Alternatives solutions could be:

  • Let DIBuilder::insertDbgValueIntrinsic strip the line number? (that would also cover the creation of dbg.value in LoopUtils, Debugify and LLVMDIBuilder*)
  • Strip line/col number already when creating dbg.declare? (or do we need it in dbg.declare)

Some additional thing in this patch could be (if this is a new rule):

  • Also fix calls to DIBuilder::insertDbgValueIntrinsic in LoopUtils, Debugify and LLVMDIBuilder* the same way.
  • Assert that line/col is zero in DIBuilder::insertDbgValueIntrinsic.
  • Add checks in Verifier and MachineVerifier that line/col is zero for dbg.value/DBG_VALUE.
This revision was automatically updated to reflect the committed changes.
jmorse marked 4 inline comments as done.May 10 2019, 3:15 AM
jmorse added inline comments.
lib/Transforms/Utils/Local.cpp
1311 ↗(On Diff #198638)

Bjorn wrote:

Some additional thing in this patch could be (if this is a new rule):

Also fix calls to DIBuilder::insertDbgValueIntrinsic in LoopUtils, Debugify and LLVMDIBuilder* the same way.
Assert that line/col is zero in DIBuilder::insertDbgValueIntrinsic.
Add checks in Verifier and MachineVerifier that line/col is zero for dbg.value/DBG_VALUE.

Sounds good -- I've dropped those details in this bug report [0] so that they don't get lost, the Verifier checks in particular should be good for detecting error states.

[0] https://bugs.llvm.org/show_bug.cgi?id=41827