Page MenuHomePhabricator

[DebugInfo] Select debug intrinsic line-numbers more carefully when promoting dbg.declare
Needs ReviewPublic

Authored by jmorse on Mar 12 2019, 1:20 PM.

Details

Summary

This patch changes the line-numbers given to dbg.value intrinsics when an alloca/dbg.declare is promoted to an SSA register. Currently we use the original line number of the dbg.declare, i.e. the variable declaration. I believe there is risk that well-meaning optimisations will accidentally acquire these line numbers from dbg.values, artificially introducing line-number changes that needlessly step back to the variable declaration. I wasn't able to immediately produce an input that demonstrated this, below is an explanation of the patch, followed by some evidence to demonstrate this is a desirable change.

As far as I understand it, the actual line number that a dbg.value gets is meaningless (although the scope / inlining information is important). The issue is that the line number could leak into surrounding instructions during optimisation.

~

What I've changed: In the three ConvertDebugDeclareToDebugValue functions changed, previously we assigned each created dbg.value intrinsic the DebugLoc from the dbg.declare it originates from. Now, if the store or PHI node that the dbg.value corresponds to has a line number in the same scope, the 'getDebugValueLoc' helper picks that, otherwise an unknown-location line number is used.

The added test tests three scenarios (explained in the file) of no-location, suitable location, and inlined location.

The formal_parameter.ll test that changes, is checking that the dbg.values for a lowered dbg.declare have the right scope. They now get different DILocations, so I've added extra CHECK lines for their scope fields.

~

To try and get some perspective on whether this is a good change to make, I built clang-3.4 with and without this patch and diff'd the line tables. Roughly 900 entries were altered, in some whole lines went away, in others they were replaced by unknown-locations.

Example 1: in "llvm::GPRGetRawAllocationOrder(llvm::MachineFunction const&)", some tablegen'd function for the ARM backend, this materializeFrameBaseRegister function [0] is inlined. Without the patch, the line table progression is:

Line 577 (conditional test)
Constructor for a DebugLoc
Line 578 (conditional assignment to DL)
This [1] variable declaration in BuildMI
MachineBasicBlock::getParent
MachineFunction::getTarget
MachineFunction::getRegInfo
Line 582

Most of that makes sense for lines 577 to 582 of the function [0], except for the early step into BuildMI [1], which seems to be part of the BuildMI call on line 586. That particular linetable entry disappears with this patch. It's unclear to me how it got there in the first place, but I don't believe it's helpful.

Example 2, in "llvm::ARMFrameLowering::hasFP", this [2] inlined call to emitRegPlusImmediate steps through the evaluation of the arguments, then to the declaration of one of the formal parameters in the target function [3], then back to evaluating the arguments again. This extra spurious step into the destination function isn't helpful IMO.

While I don't know where exactly instructions pick up line numbers from dbg.values, I believe these differences demonstrate that they do, and so it should be avoided.

~

tl;dr, if we don't spread the line numbers of variable declarations throughout instructions in the function, we'll have less jumpy line tables.

Credit to @CarlosAlbertoEnciso who found this originally but didn't get around to filing a patch.

[0] https://github.com/llvm-mirror/llvm/blob/release_34/lib/Target/ARM/ARMBaseRegisterInfo.cpp#L577
[1] https://github.com/llvm-mirror/llvm/blob/release_34/include/llvm/CodeGen/MachineInstrBuilder.h#L245
[2] https://github.com/llvm-mirror/llvm/blob/release_34/lib/Target/ARM/ARMFrameLowering.cpp#L285
[3] https://github.com/llvm-mirror/llvm/blob/release_34/lib/Target/ARM/ARMFrameLowering.cpp#L108

Diff Detail

Event Timeline

jmorse created this revision.Mar 12 2019, 1:20 PM
rnk added a subscriber: rnk.Mar 12 2019, 3:15 PM

This makes me nervous. The pair of the DIVariable and DILocation is used to uniquely identify the variable that a dbg.value applies to. In practice, inlining is usually what creates multiple instances of the same variable that get updated by dbg.values. You can see how the location is used in DwarfDebug::collectVariableInfoFromMFTable.

I see other places in the code that seem to assume that the scope of the location of a dbg.value will get the scope in which the variable is declared. Your change breaks that invariant. I think we could update such code to get the scope from the DIVariable, but maybe that would be wrong since it would lack inlining information.

In D59272#1426904, @rnk wrote:

This makes me nervous. The pair of the DIVariable and DILocation is used to uniquely identify the variable that a dbg.value applies to. In practice, inlining is usually what creates multiple instances of the same variable that get updated by dbg.values. You can see how the location is used in DwarfDebug::collectVariableInfoFromMFTable.

Last time I checked, the file, line number and column of a dbg.value was completely unused. To uniquely identify and inlined instance of a variable you need to the combination of inlinedAt field of the dbg.value's DILocation and the dbg.value's DILocalVariable.

I see other places in the code that seem to assume that the scope of the location of a dbg.value will get the scope in which the variable is declared. Your change breaks that invariant. I think we could update such code to get the scope from the DIVariable, but maybe that would be wrong since it would lack inlining information.

Isn't the scope coming from the DILocalVariable though?

In D59272#1426904, @rnk wrote:

This makes me nervous. The pair of the DIVariable and DILocation is used to uniquely identify the variable that a dbg.value applies to. In practice, inlining is usually what creates multiple instances of the same variable that get updated by dbg.values. You can see how the location is used in DwarfDebug::collectVariableInfoFromMFTable.

Last time I checked, the file, line number and column of a dbg.value was completely unused. To uniquely identify and inlined instance of a variable you need to the combination of inlinedAt field of the dbg.value's DILocation and the dbg.value's DILocalVariable.

I see other places in the code that seem to assume that the scope of the location of a dbg.value will get the scope in which the variable is declared. Your change breaks that invariant. I think we could update such code to get the scope from the DIVariable, but maybe that would be wrong since it would lack inlining information.

Isn't the scope coming from the DILocalVariable though?

The scope, perhaps/probably - but as you say, not the "inlinedAt" part. So, yeah, picking the wrong DebugLoc for dbg.value instruction could still drop/break/mess up inlining info due to the inlinedAt part of the DebugLoc.

rnk added a comment.Mar 13 2019, 4:16 PM

@aprantl, I think you are correct for all the code that I can find today. The only thing that matters is the inlinedAt part of the location. But I did see this code that handles dbg.declare, which uses the scope from the location:
https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1042

I guess my issue is that using the insertion point just to get "smooth" line numbers seems fragile. If the insertion point isn't a store, I think we'd be better off using an unknown location (line 0). Using the location of the store, assuming it has the right inlinedAt info, as this does, seems good. That just leaves us to ask what should happen at the other dbg.value creation places, call sites, loads, etc.

jmorse updated this revision to Diff 190626.Mar 14 2019, 7:49 AM

Update to use unknown-locations for everything but promoted store instructions,

rnk wrote:

@aprantl, I think you are correct for all the code that I can find today. The only thing that matters is the inlinedAt part of the location. But I did see this code that handles dbg.declare, which uses the scope from the location:

So that I understand correctly: the concern was that dbg.values with different DILocations could be seen as different variable instances (bad), but actually only the inlinedAt field is used for that (which is co-incidentally is also a DILocation).

For the avoidance of doubt, the code in getDebugValueLoc always uses the scope and inlinedAt field from the dbg.declare.

I guess my issue is that using the insertion point just to get "smooth" line numbers seems fragile. If the insertion point isn't a store, I think we'd be better off using an unknown location (line 0). Using the location of the store, assuming it has the right inlinedAt info, as this does, seems good. That just leaves us to ask what should happen at the other dbg.value creation places, call sites, loads, etc.

This update now uses unknown-location for everything except stores (and covers call-sites which I missed before). Using unknown-location for everything would be perfectly fine too IMO, I'm not wed to either idea.

rnk added a comment.Mar 14 2019, 9:57 AM

Thanks! I haven't read the code in detail, but this all sounds good to me. (/me goes back to cave).

Isn't there a risk that this (partially) is hiding bugs in some other passes, where the DebugLoc is picked from the wrong place. I guess that either it should be picked from an adjacent non-meta instruction. Or it should be set to unknown.
With this patch we only eliminate the risk that the faulty passes are picking an "incorrect" DebugLoc from a dbg.value intrinsic (or DBG_VALUE instruction). But the "faulty" pass could still pick incorrect DebugLoc from other instructions, right? Or is it always correct to take the DebugLoc from all other non-dbg.value instructions, including all other kinds of meta-instructions?

Might be hard to find all such bugs. But with your known examples it shouldn't be too hard to find at least some places in opt/llc where the DebugLoc from the dbg.value is infecting some other instruction. Then perhaps we want to land this patch as well, just do avoid some not-yet-detected problems.

Hi,

Isn't there a risk that this (partially) is hiding bugs in some other passes, where the DebugLoc is picked from the wrong place. I guess that either it should be picked from an adjacent non-meta instruction. Or it should be set to unknown.
With this patch we only eliminate the risk that the faulty passes are picking an "incorrect" DebugLoc from a dbg.value intrinsic (or DBG_VALUE instruction). But the "faulty" pass could still pick incorrect DebugLoc from other instructions, right? Or is it always correct to take the DebugLoc from all other non-dbg.value instructions, including all other kinds of meta-instructions?

Might be hard to find all such bugs. But with your known examples it shouldn't be too hard to find at least some places in opt/llc where the DebugLoc from the dbg.value is infecting some other instruction. Then perhaps we want to land this patch as well, just do avoid some not-yet-detected problems.

All true -- this patch can be characterised as simply reducing the risk of leaking definitely-wrong line numbers into non-meta IR instructions, which I think has some value.

Quickly testing what happens if IRBuilder is adjusted to not pick take DebugLocs from debug-intrinsics [0], roughly 750 of the 900 entries this patch changes, are also changed, possibly to different lines though. (I'll upload this trivial change sometime soon too).

That's likely the bulk of the leaking line numbers; while the rest could be hunted down, they're likely to be scattered through many places, so IMO this patch is the most time-efficient solution going forwards.

[0] https://github.com/llvm-mirror/llvm/blob/e3696113b639c8bf0b72d6c27dd76d6fdd8ebf61/include/llvm/IR/IRBuilder.h#L137

bjope added a comment.Mar 25 2019, 8:03 AM

Hi,

Isn't there a risk that this (partially) is hiding bugs in some other passes, where the DebugLoc is picked from the wrong place. I guess that either it should be picked from an adjacent non-meta instruction. Or it should be set to unknown.
With this patch we only eliminate the risk that the faulty passes are picking an "incorrect" DebugLoc from a dbg.value intrinsic (or DBG_VALUE instruction). But the "faulty" pass could still pick incorrect DebugLoc from other instructions, right? Or is it always correct to take the DebugLoc from all other non-dbg.value instructions, including all other kinds of meta-instructions?

Might be hard to find all such bugs. But with your known examples it shouldn't be too hard to find at least some places in opt/llc where the DebugLoc from the dbg.value is infecting some other instruction. Then perhaps we want to land this patch as well, just do avoid some not-yet-detected problems.

All true -- this patch can be characterised as simply reducing the risk of leaking definitely-wrong line numbers into non-meta IR instructions, which I think has some value.

Quickly testing what happens if IRBuilder is adjusted to not pick take DebugLocs from debug-intrinsics [0], roughly 750 of the 900 entries this patch changes, are also changed, possibly to different lines though. (I'll upload this trivial change sometime soon too).

That's likely the bulk of the leaking line numbers; while the rest could be hunted down, they're likely to be scattered through many places, so IMO this patch is the most time-efficient solution going forwards.

[0] https://github.com/llvm-mirror/llvm/blob/e3696113b639c8bf0b72d6c27dd76d6fdd8ebf61/include/llvm/IR/IRBuilder.h#L137

Ok, great. I'm happy with that explanation.