Page MenuHomePhabricator

[DebugInfo][CGP] Limit placeDbgValues movement of dbg.value intrinsics
Needs ReviewPublic

Authored by jmorse on Feb 20 2019, 8:22 AM.

Details

Reviewers
aprantl
bjope
vsk
Summary

This patch limits the activity of CodeGenPrepare::placeDbgValues, which originally helped preserve variable debug locations by placing them near Value definitions. Unfortunately, this now causes more problems than it solves -- see PRs 31878, 35194, 38754. One example problem is that conditional variable assignments can be hoisted and appear to debuggers unconditionally. This patch only allows dbg.value movement when the dbg.value would otherwise be a use-before-def: this is temporarily necessary because the rest of CodeGenPrepare makes no effort to put dbg.values in the right place. Eventually the whole placeDbgValues function should be deleted.

Here are some statistics from a clang-3.4 build with various options twiddled. Note that these were built with the other patches I've uploaded (which I'll add as dependencies) that soften the blow a bit. 'normal' means placeDbgValues is enabled, 'limited' is with this patch, 'deleted' is placeDbgValues completely removed.

normal placeDbgValues:  77.6% variable locations, 45.9% scope bytes
limited placeDbgValues: 76.1% variable locations, 46.4% scope bytes
deleted placeDbgValues: 75.8% variable locations, 45.8% scope bytes

Clearly there's a big decrease (1.8%) of variable coverage happening, and temporarily allowing non-dominated dbg.values to be moved doesn't make a large difference. I've now reached the point where I can't find other (CodeGen) bugs that make a serious impact on these statistics. My current belief, based on variable locations I've traced, is:

  • That coverage drop is almost entirely caused by DBG_VALUEs whose operands have gone out of liveness, and
  • LLVM previously hasn't paid this liveness penalty, because it's always moved dbg.values to be immediately after definitions.

I believe we *should* IMHO take this coverage hit. We only avoid it right now by cheating and forcefully bringing variables back into liveness, often with stupid effects.

An example of liveness misery is the "XRayThreshold" variable in this [0] function, ignoring the constant assignment on line 150. XRayThreshold gets a value assigned to it by the getAsInteger call on line 154. That function gets inlined and XRayThresholds use (to test early return) gets hoisted, the effect of which is that XRayThreshold has a location for two instructions, neither of which get a line number in the body of [0], and thus the variable is never visible. With placeDbgValues enabled we move the dbg.value for XRayThreshold so that it covers those two instructions -- with placeDbgValues disabled the calculation for XRayThreshold is dead by the time we reach its dbg.value. IMHO the location determined by placeDbgValues isn't useful to the developer, and skews statistics.

Another case I previously reported is [1] -- here the 'var' variable assignment has its dbg.value hoisted by placeDbgValues to above the switch, somewhere it hasn't been declared yet, and its location gets lost by the time it reaches the relevant portion of the switch statement in [1].

In summary, while the variable coverage statistic is going down here, what we don't measure (and possibly can't) is whether the variable locations are defined over instructions in the program that make sense. Which placeDbgValues actively damages IMO.

On to the three changing tests:

  • register-variables.ll: The first three deleted DEBUG_VALUE lines are caused by placeDbgValues hoisting locations to places they originally weren't, they're absent with this patch, and other DEBUG_VALUE lines covers the correct behaviour. The deleted 'DEBUG_VALUE' for 'b' drops due to the branch-folding pass (which it previously avoided through placeDbgValues hoisting). The DEBUG_VALUE line for 'c' has sunk back to its original (correct) location.
  • NVPTX test: a DBG_VALUE $noreg shifts one instruction down in this test, renumbering all the labels.
  • PR37234.ll: another case of hoisting over a block.

This patch effectively invalidates the behaviour of MachineInstr::collectDebugValues because DBG_VALUEs no longer come immediately after the vreg definition they refer to. The patches this one depends on kill off most of the users, but in the WebAssembly backend some target code calls collectDebugValues. I know nothing of WebAssembly -- I suspect this means mailing the backend owner, as it's not clear what their use case is.

I haven't added a test for the absence of this behaviour -- it doesn't strike me as something that anyone would think to re-introduce.

[0] https://github.com/llvm-mirror/llvm/blob/5926c6d6fe85f6623d2d14f473873bc72c72d482/lib/CodeGen/XRayInstrumentation.cpp#L154
[1] https://github.com/llvm-mirror/clang/blob/aa528ab4a083268edddd038d0f251afe792cfa36/lib/CodeGen/CGBlocks.cpp#L1858

Diff Detail

Event Timeline

jmorse created this revision.Feb 20 2019, 8:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 20 2019, 8:22 AM
aprantl added inline comments.Feb 21 2019, 8:24 AM
lib/CodeGen/CodeGenPrepare.cpp
7136

Please add a comment that explains why

There's a lot going on in the explanation, and partly I'm handicapped by lack of familiarity with the picky details of LLVM's location-list construction. Given all the attention to where dbg.value instructions are placed, it seems like placement is a significant factor in that construction. Intuitively, placing the dbg.value near the instruction that defines the value (assuming there is such an instruction) allows the location-list to better describe the instruction stream that is actually emitted. And, that's really what I want from a location list.
I would caution against focusing too much on the coverage statistics. As you say, what we really want to know is "whether the variable locations are defined over instructions in the program that make sense" and the statistics don't tell us that.
So, what instructions "make sense" in a location list? Well, the ones where the variable has an identifiable machine location or constant value. If optimization causes a value to be defined "before" the variable is declared, that does not really worry me. The goal here is to produce a correct description of the instructions-actually-emitted.
"One example problem is that conditional variable assignments can be hoisted and appear to debuggers unconditionally." Are the values actually defined unconditionally? Then it is correct to describe them that way to the debugger. I assume this is referring to PR38754. If the instructions to compute the value are hoisted, my assertion is that the location-list should describe the variable with that value when it actually occurs.
In my view, the real problem as stated in the PR appears to be that the value later gets *lost*, sadly before the instructions directly related to the switch-case where the variable is used; the real problem is *not* that the value's description starts too soon.

So given all that, how does this patch help address the real problem?

bjope added inline comments.Feb 21 2019, 3:30 PM
lib/CodeGen/CodeGenPrepare.cpp
7131–7132

I think that the special handling of alloca can be removed now (either the alloca dominates the DVI and everyhing is fine, or is actually is possible that the DVI dominates the alloca, then I think we can move it).
Although, no need to make such a change in this patch if you think it is risky/unrelated.

7136

I think this explanation can be added by updating the description of this function. Mentioning that the purpose is to resolve use-before-def problems (which really should be seen as bugs in this, as well as other, passes):

"This function only moves (sinks) DbgValueInst:s when there is a use-before-def situation."
jmorse updated this revision to Diff 187921.Feb 22 2019, 3:52 AM

Update comments in placeDbgValues to reflect its new purpose; remove code calculating the previous non-debug-instr, as we now use a dominator tree check instead.

jmorse marked 5 inline comments as done.Feb 22 2019, 3:55 AM
jmorse added inline comments.
lib/CodeGen/CodeGenPrepare.cpp
7131–7132

I agree and have removed it -- it's redundant and misleading to a future reader to be calculating the "previous non debug instruction" when there's a dominator tree check happening.

7136

Updated the function comment; I threw a "FIXME" in too indicating that placeDbgValues is a concession, rather than design feature.

jmorse marked 2 inline comments as done.Feb 22 2019, 4:59 AM

In my view, the real problem as stated in the PR appears to be that the value later gets *lost*, sadly before the instructions directly related to the switch-case where the variable is used; the real problem is *not* that the value's description starts too soon.

So given all that, how does this patch help address the real problem?

Ah, to untangle the True (TM) problem I should illustrate some IR that corresponds to the problem in PR38754:

someblock:
  %foo = add i32 %bar, %bar
  br i1 %somecond, label %qux, label %quux
qux:
  %xyzzy = sub i32 %foo, 12
  call void @llvm.dbg.value(metadata i32 %xyzzy, [...])
  br label %quux

In this code, if %xyzzy is optimised out then the 'sub' instruction is deleted, and its operation folded into the dbg.value intrinsics DIExpression, something like:

call void @llvm.dbg.value(metadata i32 %foo, metadata !1, metadata !DIExpression(DW_OP_constu, 12, DW_OP_minus, DW_OP_stack_value))

Which is good. However the dbg.values operand is now %foo rather than %xyzzy, and placeDbgValues moves the dbg.value call to immediately after its operands definition, moving it up past a conditional branch to the 'someblock' block, giving us:

someblock:
  %foo = add i32 %bar, %bar
  call void @llvm.dbg.value(metadata i32 %foo, [...])
  br i1 %somecond, label %qux, label %quux

I'm not sure what the correct nomenclature is at this point, but the variable location that the dbg.value is defining now appears on paths through the program where it did not before, which in PR38754 makes a variable appear in the debugger to take on a value that isn't actually computed. Many different code hoisting / sinking / CSE passes can trigger similar behaviour.

So, what instructions "make sense" in a location list? Well, the ones where the variable has an identifiable machine location or constant value. If optimization causes a value to be defined "before" the variable is declared, that does not really worry me. The goal here is to produce a correct description of the instructions-actually-emitted.
"One example problem is that conditional variable assignments can be hoisted and appear to debuggers unconditionally." Are the values actually defined unconditionally? Then it is correct to describe them that way to the debugger. I assume this is referring to PR38754. If the instructions to compute the value are hoisted, my assertion is that the location-list should describe the variable with that value when it actually occurs.

I think the key phrase here is "when it actually occurs": if in a program:

  • Down two code paths the same value is computed but assigned to different variables, and
  • The computation is hoisted to a parent block,

Should both variables appear to hold that computed value once it's defined in the parent block?

(NB, I'm aware that many people approach this with a different language to me, i.e. in terms of variable locations rather than values, so it's entirely possible that there's some higher level concept I'm missing).

bjope added a comment.Feb 22 2019, 5:14 AM

placeDbgValues has been bugging me (and probably others) for some time. So I'm happy with this patch.

Haven't done any recent tests myself, but hopefully we have solved most problems that has blocked this change by now. I know that @jmorse has done a splendid job in hunting down and fixing such problems.
As this patch is a compromising solution, we still sink the dbg.value when having use-before-def, the patch seem reasonable to me at this point in time. It might unveil a few not yet detected problems, but those would probably be hard to find unless we do this.
One idea is to have a command-line option to make it easier to revert to the old behavior (if we find some bigger problems), but I do not think it is worth the trouble.

So this LGTM now. But it looks like @probinson had some questions, so I'm not setting "ready to land" right now.

Hmmm. It appears that you are detaching the notion of here-the-value-is-computed from here-the-value-becomes-associated-with-this-variable. Usually these are concurrent in an "assignment" to the variable.
Separating these concerns means that if a value computation is hoisted, it might not be appropriate to hoist the association with the variable. That is, the "assignment" effect maybe should stay where it was, even if there's no specific instruction left in that position that implements that assignment.
I could see this situation arising in a variety of circumstances. CSE, various sorts of value-propagation (constant or not), passing a value to an inlined function parameter, yada yada.

So, this patch says, as long as the association (dbg.value) is dominated by the computation, assume the association is in the right place and don't move it. Dominance is certainly the correct prerequisite for dbg.value, so in that respect the patch seems reasonable, given the separation of concerns I mentioned previously. But...

I suspect there are a bunch of places in LLVM that encode knowledge of the "value computation is assignment" paradigm, which you are apparently trying to get rid of, and I suppose the other patches you have up for review are all along this same line. Rooting out all those places will take some doing. I am also somewhat concerned that you can leave a dbg.value lying around with no direct relationship to the rest of the block it's in, which makes the effect of other optimizations potentially vague and ill-defined.
Was there any llvm-dev discussion about this paradigm shift? I have been very distracted the past couple of months so it might have happened without my noticing or really paying attention to it. While I think it is probably a net positive in the evolution of debug info for optimized code, I do want to make sure the Usual Suspects are all attuned to it and supportive. It would also be well worthwhile putting together some words for the description of dbg.value in LangRef about this (which does not have to be part of this patch).

Hi Paul,

Hmmm. It appears that you are detaching the notion of here-the-value-is-computed from here-the-value-becomes-associated-with-this-variable. Usually these are concurrent in an "assignment" to the variable.
Separating these concerns means that if a value computation is hoisted, it might not be appropriate to hoist the association with the variable. That is, the "assignment" effect maybe should stay where it was, even if there's no specific instruction left in that position that implements that assignment.
I could see this situation arising in a variety of circumstances. CSE, various sorts of value-propagation (constant or not), passing a value to an inlined function parameter, yada yada.

So, this patch says, as long as the association (dbg.value) is dominated by the computation, assume the association is in the right place and don't move it. Dominance is certainly the correct prerequisite for dbg.value, so in that respect the patch seems reasonable, given the separation of concerns I mentioned previously. But...

I suspect there are a bunch of places in LLVM that encode knowledge of the "value computation is assignment" paradigm, which you are apparently trying to get rid of, and I suppose the other patches you have up for review are all along this same line. Rooting out all those places will take some doing. I am also somewhat concerned that you can leave a dbg.value lying around with no direct relationship to the rest of the block it's in, which makes the effect of other optimizations potentially vague and ill-defined.

Absolutely true -- and constant-valued dbg.values (including undef, technically) are frequently the source of bug reports (ex: PR40795) as their lifetime isn't associated with anything else in the IR (see also PR40628).

Was there any llvm-dev discussion about this paradigm shift? I have been very distracted the past couple of months so it might have happened without my noticing or really paying attention to it. While I think it is probably a net positive in the evolution of debug info for optimized code, I do want to make sure the Usual Suspects are all attuned to it and supportive. It would also be well worthwhile putting together some words for the description of dbg.value in LangRef about this (which does not have to be part of this patch).

Zero discussion sorry, I wasn't aware this is a change in interpretation -- however I'm also not aware of any Formal (TM) definition of the semantics of dbg.value intrinsics, particularly with respect to their location in the IR. I've assumed/inferred this is how it's supposed to work from existing code and review feedback.

Further discussion can't hurt (he said boldly) and getting some meaning written down in the langref would help everyone. I'll draft something and kick that off shortly.

probinson added a comment.EditedFeb 25 2019, 5:24 PM

Absolutely true -- and constant-valued dbg.values (including undef, technically) are frequently the source of bug reports (ex: PR40795) as their lifetime isn't associated with anything else in the IR (see also PR40628).

That's disheartening. Reading the PR it's apparent that the location list construction is failing to implement the dataflow computations required to handle control flow correctly. On the bright side, if we *do* implement them, I suspect it will solve a bunch of issues all at once.

Was there any llvm-dev discussion about this paradigm shift? I have been very distracted the past couple of months so it might have happened without my noticing or really paying attention to it. While I think it is probably a net positive in the evolution of debug info for optimized code, I do want to make sure the Usual Suspects are all attuned to it and supportive. It would also be well worthwhile putting together some words for the description of dbg.value in LangRef about this (which does not have to be part of this patch).

Zero discussion sorry, I wasn't aware this is a change in interpretation -- however I'm also not aware of any Formal (TM) definition of the semantics of dbg.value intrinsics, particularly with respect to their location in the IR. I've assumed/inferred this is how it's supposed to work from existing code and review feedback.

Further discussion can't hurt (he said boldly) and getting some meaning written down in the langref would help everyone. I'll draft something and kick that off shortly.

https://llvm.org/docs/SourceLevelDebugging.html#format-common-intrinsics
"This intrinsic provides information when a user source variable is set to a new value." Naively I read this "when" as meaning, the position of the intrinsic in the instruction stream indicates the starting point where the variable takes on the new value. But making that clearer would be a Good Thing.
And as various reviews and whatnot fly by, I'm inclined to think it's at least a common misperception that the dbg.value has to be adjacent to the instruction that produces the value operand. Stating that it's not required would surely help.

bjope added a comment.Feb 26 2019, 1:01 AM

It is definitely about time that we also refresh the documentation. One thing that is lacking is documentation of the backend parts (currently https://llvm.org/docs/SourceLevelDebugging.html talks about intrinsics like dbg.value etc, but shouldn't it also cover DBG_VALUE pseudo instruction). Afaict dbg.value and DBG_VALUE aren't exactly the same. And I think even DBG_VALUE might have some different properties depending on where in the llc pipeline we are (e.g. if the location only is valid up to the end of the BB or not).

https://llvm.org/docs/SourceLevelDebugging.html#format-common-intrinsics
"This intrinsic provides information when a user source variable is set to a new value." Naively I read this "when" as meaning, the position of the intrinsic in the instruction stream indicates the starting point where the variable takes on the new value. But making that clearer would be a Good Thing.

That is indeed the intended reading of that sentence. Perhaps we should include an example here that uses a dbg.value with a constant to start a new range and an undef to end it to really drive the point home.

And as various reviews and whatnot fly by, I'm inclined to think it's at least a common misperception that the dbg.value has to be adjacent to the instruction that produces the value operand. Stating that it's not required would surely help.

Agreed!

I've generated a short summary in https://reviews.llvm.org/D58726 which illustrates how I believe dbg.values are to be interpreted -- I'll chuck this at llvm-dev tomorrow if it's not obviously broken. As mentioned in the summary, it's likely that more could be said, but getting agreement on the location of a dbg.value being the location that an assignment ``happens'' would be good.

What is the status regarding this patch? Are we waiting for something?

What is the status regarding this patch? Are we waiting for something?

Good question -- I was going to gently start pinging the dependencies (one of them needs a revision, but the rest not) to get things in.

There's no technical reason why that's necessary though: given that we appear to agree on the meaning of the dbg.value intrinsics as per D58726, this could land right now. It's not like LLVM is branching any time soon.

this could land right now.

(... assuming it's approved.)

jmorse updated this revision to Diff 205851.Thu, Jun 20, 10:03 AM

Freshen up patch to make it merge cleanly; check that DVI isn't null... not clear whether something has changed or if I uploaded a completely broken patch before.

vsk added a comment.Fri, Jun 28, 1:33 PM

The high-level plan (limit the amount of movement done in placeDbgValues, check for any fallout, and (hopefully) eventually eliminate all of it) sgtm.

lib/CodeGen/CodeGenPrepare.cpp
7126

Use CGP::getDT()?

7135

Early exit on !VI for clarity?

@yurydelendik Would this change be OK for WebAssembly debug value handling?

yurydelendik added a comment.EditedMon, Jul 1, 6:48 AM

@yurydelendik Would this change be OK for WebAssembly debug value handling?

The change will be OK: we rely on collectDebugValues in WebAssembly to locate instruction's debug values. And we worry only about these atm.

jmorse added a comment.Mon, Jul 1, 9:44 AM

Hi,

The change will be OK: we rely on collectDebugValues in WebAssembly to locate instruction's debug values. And we worry only about these atm.

Awkwardly collectDebugValues relies on a behaviour that this patch will mostly undo, specifically, that DBG_VALUE insts are placed after the defining instruction. collectDebugValues will work sometimes, but will "miss" DBG_VALUEs much more often that it does now.

We can tailor something to suit WebAssemblys needs: would it be OK if collectDebugValues returned _all_ DBG_VALUE uses of the virtual register it's called on/with? That would include, for example, debug uses from different basic blocks, which doesn't occur with the current collectDebugValues implementation.

Hi,

Awkwardly collectDebugValues relies on a behaviour that this patch will mostly undo, specifically, that DBG_VALUE insts are placed after the defining instruction. collectDebugValues will work sometimes, but will "miss" DBG_VALUEs much more often that it does now.

I was counting on "collectDebugValues will work sometimes" part and still hoping it will perform with some degree to "finding" most of the related DBG_VALUEs in the current block.

We can tailor something to suit WebAssemblys needs: would it be OK if collectDebugValues returned _all_ DBG_VALUE uses of the virtual register it's called on/with? That would include, for example, debug uses from different basic blocks, which doesn't occur with the current collectDebugValues implementation.

Agree.

jmorse updated this revision to Diff 209508.Fri, Jul 12, 9:32 AM

I've added some changes to the patch that makes collectDebugValues behave in the way I described, returning all debug users of a VReg, optionally limited to only the basic block of the source MachineInst. This should be useful for testing; and it's immediately caused the test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir [0] test to fail. The output is:

%0:i64 = ARGUMENT_i64 0, implicit $arguments
dead %1:i32 = I32_WRAP_I64 %0, implicit-def dead $arguments
dead %1:i32 = CALL_i32 @bar, implicit-def dead $arguments, implicit $sp32, implicit $sp64
%2:i32 = CALL_i32 @bar, implicit-def dead $arguments, implicit $sp32, implicit $sp64, implicit-def $value_stack, implicit $value_stack
DBG_VALUE %2, $noreg, <0x5490d00>, !DIExpression(), debug-location !DILocation(line: 359, column: 12, scope: <0x548f5f0>)
DBG_VALUE %2, $noreg, <0x5490cb0>, !DIExpression(), debug-location !DILocation(line: 358, column: 12, scope: <0x548f5f0>)
DBG_VALUE %2, $noreg, <0x5490a50>, !DIExpression(), debug-location !DILocation(line: 357, column: 12, scope: <0x548f5f0>)
CALL_VOID @foo, %2, implicit-def dead $arguments, implicit $sp32, implicit $sp64, implicit-def $value_stack, implicit $value_stack
RETURN_VOID implicit-def dead $arguments

While I don't know webassembly, it seems clear that a) the test post-SSA, and b) the test expects collectDebugValues to only return DBG_VALUEs that refer to the same vreg def that the source instruction does.

Is that a fair assessment, or some kind of test artefact? That behaviour is almost certainly achievable after this patch lands, although it might require more plumbing somewhere.

[0] https://github.com/llvm/llvm-project/blob/a1d97a960e622ee21550d92809512cb0870be499/llvm/test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir

I've added some changes to the patch that makes collectDebugValues behave in the way I described, returning all debug users of a VReg, optionally limited to only the basic block of the source MachineInst. This should be useful for testing; and it's immediately caused the test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir [0] test to fail. The output is:

%0:i64 = ARGUMENT_i64 0, implicit $arguments
dead %1:i32 = I32_WRAP_I64 %0, implicit-def dead $arguments
dead %1:i32 = CALL_i32 @bar, implicit-def dead $arguments, implicit $sp32, implicit $sp64
%2:i32 = CALL_i32 @bar, implicit-def dead $arguments, implicit $sp32, implicit $sp64, implicit-def $value_stack, implicit $value_stack
DBG_VALUE %2, $noreg, <0x5490d00>, !DIExpression(), debug-location !DILocation(line: 359, column: 12, scope: <0x548f5f0>)
DBG_VALUE %2, $noreg, <0x5490cb0>, !DIExpression(), debug-location !DILocation(line: 358, column: 12, scope: <0x548f5f0>)
DBG_VALUE %2, $noreg, <0x5490a50>, !DIExpression(), debug-location !DILocation(line: 357, column: 12, scope: <0x548f5f0>)
CALL_VOID @foo, %2, implicit-def dead $arguments, implicit $sp32, implicit $sp64, implicit-def $value_stack, implicit $value_stack
RETURN_VOID implicit-def dead $arguments

While I don't know webassembly, it seems clear that a) the test post-SSA, and b) the test expects collectDebugValues to only return DBG_VALUEs that refer to the same vreg def that the source instruction does.

Is that a fair assessment, or some kind of test artefact? That behaviour is almost certainly achievable after this patch lands, although it might require more plumbing somewhere.

[0] https://github.com/llvm/llvm-project/blob/a1d97a960e622ee21550d92809512cb0870be499/llvm/test/DebugInfo/WebAssembly/dbg-value-move-reg-stackify.mir

Looks like I misunderstood meaning of *all*. I assumed that means DBG_VALUEs the refer the vreg def of the source instruction, excluding DBG_VALUEs that refer the same vreg def by the above instructions (in the same block). The test result above shows that collectDebugValues locates all DBG_VALUEs for the instruction even if it does not (re)define the value, but there are few instructions in the same block above that do. Is this the intended effect of this patch? If yes, we probably need to create more robust collectDebugValues-like method that will track lifetime of DBG_VALUEs.

! In D58453#1582991, @yurydelendik wrote:
Looks like I misunderstood meaning of *all*. I assumed that means DBG_VALUEs the refer the vreg def of the source instruction, excluding DBG_VALUEs that refer the same vreg def by the above instructions (in the same block). The test result above shows that collectDebugValues locates all DBG_VALUEs for the instruction even if it does not (re)define the value, but there are few instructions in the same block above that do. Is this the intended effect of this patch? If yes, we probably need to create more robust collectDebugValues-like method that will track lifetime of DBG_VALUEs.

Indeed, that's the intention. I guess a clearer way of saying this would have been that collectDebugValues will be much less useful after leaving SSA form, because from that point onwards one must consider register definitions, rather than just the register itself, to know what a DBG_VALUE refers to. While this is a pain, the behaviour the rest of this patch is trying to eliminate is destroying a lot of variable location information already.

I've had a quick look at the wasm-reg-stackify pass, and to me it looks like:

  • It's currently performed on a per-block basis
  • Domination and LiveInterval analyses are already available

Which means that the WebAssemblyDebugValueManager constructor could just apply an additional filter on collected DbgValues, that the LiveInterval/ValueNumber the DBG_VALUE refers to is the defining instruction.

One problem with that implementation would be that debug intrinsics outside the current basic block may still refer to a vreg that wasm-reg-stackify operates on, a situation that placeDbgValues currently masks. I don't know enough about wasm to know if this is a serious issue.

the WebAssemblyDebugValueManager constructor could just apply an additional filter on collected DbgValues, that the LiveInterval/ValueNumber the DBG_VALUE refers to is the defining instruction.

Sounds good. That was the one reason for existence of WebAssemblyDebugValueManager. I would like not to change the test case, but fix the WebAssemblyDebugValueManager's logic here or in a different patch. Let me know if I need to follow up if the latter.