This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Fix: Variables that have no non-empty values being emitted when they have a DBG_VALUE_LIST
ClosedPublic

Authored by StephenTozer on Sep 15 2022, 2:40 AM.

Details

Summary

This patch fixes a simple bug where DbgValueHistoryMap::hasNonEmptyLocation was incorrectly handling DBG_VALUE_LIST instructions, treating empty values as non-empty, causing empty variables to be emitted into DWARF.

Diff Detail

Event Timeline

StephenTozer created this revision.Sep 15 2022, 2:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 15 2022, 2:40 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
StephenTozer requested review of this revision.Sep 15 2022, 2:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 15 2022, 2:40 AM
Orlando accepted this revision.Nov 2 2022, 9:42 AM

Makes sense, LGTM.

This revision is now accepted and ready to land.Nov 2 2022, 9:42 AM
vitalybuka reopened this revision.Dec 23 2022, 9:58 PM
This revision is now accepted and ready to land.Dec 23 2022, 9:58 PM

Thanks @vitalybuka , looks like we'll get back to this after the holidays.

Relanded, I incorrectly identified the cause of the failure.

Hmm - shouldn't we still see localh emitted, but without a DW_AT_location - dropping the variable entirely seems incorrect? I guess just for this test case that happens because the variable isn't listed in the DISubprogram's variables list?

Though internally at Google we're seeing an increase in the llvm-dwarfdump --statistics result for #params with binary locations/#params and #local vars with binary locations/#local vars - which suggests maybe this change is having the effect of dropping some variables entirely, leading to an apparent increase in these statistics?

Hmm - shouldn't we still see localh emitted, but without a DW_AT_location - dropping the variable entirely seems incorrect? I guess just for this test case that happens because the variable isn't listed in the DISubprogram's variables list?

I recall there was some discussion about this a while ago but don't recall the conclusions - regardless of what should ideally be done, this patch doesn't change the way we handle always-undef variables in itself, it only prevents variadic debug values from behaving differently to normal debug values.

Though internally at Google we're seeing an increase in the llvm-dwarfdump --statistics result for #params with binary locations/#params and #local vars with binary locations/#local vars - which suggests maybe this change is having the effect of dropping some variables entirely, leading to an apparent increase in these statistics?

It's possible that this patch is causing that, but I think it's unlikely - the only change this patch makes is to make DBG_VALUE_LISTs behave the same way as normal DBG_VALUEs w.r.t. being dropped. DBG_VALUE_LIST instructions do not usually make up a high % of locations, so it seems unlikely you'd see a large change, and more importantly I think they would rarely be used for parameters since they are only produced by attempting to salvage debug info from dead instructions, which isn't as likely to happen to parameters - they could still be assigned values that get salvaged into DBG_VALUE_LISTs later, but at that point we'd be looking at a fraction of a fraction of parameters being affected.

Hmm - shouldn't we still see localh emitted, but without a DW_AT_location - dropping the variable entirely seems incorrect? I guess just for this test case that happens because the variable isn't listed in the DISubprogram's variables list?

I recall there was some discussion about this a while ago but don't recall the conclusions - regardless of what should ideally be done, this patch doesn't change the way we handle always-undef variables in itself, it only prevents variadic debug values from behaving differently to normal debug values.

Though internally at Google we're seeing an increase in the llvm-dwarfdump --statistics result for #params with binary locations/#params and #local vars with binary locations/#local vars - which suggests maybe this change is having the effect of dropping some variables entirely, leading to an apparent increase in these statistics?

It's possible that this patch is causing that, but I think it's unlikely - the only change this patch makes is to make DBG_VALUE_LISTs behave the same way as normal DBG_VALUEs w.r.t. being dropped. DBG_VALUE_LIST instructions do not usually make up a high % of locations, so it seems unlikely you'd see a large change, and more importantly I think they would rarely be used for parameters since they are only produced by attempting to salvage debug info from dead instructions, which isn't as likely to happen to parameters - they could still be assigned values that get salvaged into DBG_VALUE_LISTs later, but at that point we'd be looking at a fraction of a fraction of parameters being affected.

I'll see if I can come up with a reproducer. Do you have an example of this from clang, rather than from handcrafted IR?

I guess if this test was updated - the existing behavior isn't a zero-length location list, but no location at all. That existing behavior appears correct to me.

So, at least on the test update itself, I'd say this patch creates a bug - by dropping the local variable where it was previously correctly described as existing but without a location.

StephenTozer added a comment.EditedJan 11 2023, 1:47 AM

I'll see if I can come up with a reproducer. Do you have an example of this from clang, rather than from handcrafted IR?

The problem fixed by this was found while building in the LLVM test suite - I can find an organic reduced reproducer, but I think it's worth noting that this case is quite simple and could easily show up in any optimized debug info build, the basis for this patch is as follows: DbgEntityHistoryCalculator skips recording entries for variables that have no non-empty locations within their scope; prior to this patch, it recognized undef DBG_VALUEs as being undef, but did not recognize undef DBG_VALUE_LISTs as being undef, meaning that in any scenario where the DbgEntityHistoryCalculator would be responsible for determining whether a variable should be emitted or omitted, an undef DBG_VALUE would give a different output to an undef DBG_VALUE_LIST.

So, at least on the test update itself, I'd say this patch creates a bug - by dropping the local variable where it was previously correctly described as existing but without a location.

It is possible that this patch creates a bug by exposing an issue that exists elsewhere - I'd much prefer tackling that issue directly rather than reverting this patch however, as there is a fairly substantive patch stack[0] that has landed since this patch that relies on variadic and non-variadic debug values being treated the same as each other, and in particular on the assumption that undef DBG_VALUEs and DBG_VALUE_LISTs have identical meanings.

I think some investigation is warranted for the dropping of variables, but the dropping of the variable in this test case is due to the fact that I manually added the variable but did not add it to the retained nodes list. For us to be sure that there is a bug, we would need to see an organic example of a variable being dropped, and confirm that it is being dropped incorrectly. I mentioned before that there was a discussion about this, which my esteemed colleagues have helped me find[1], in which it was determined that there was a legitimate case for omitting variables in an inlined function with no DW_AT_location.

After reviewing that thread, I believe that insofar as this patch is responsible for changing the variable counts, it is most likely correct in doing so. The function that this patch modifies, hasNonEmptyLocation, is used explicitly for determining when it is desirable to drop a variable: when the variable has no DW_AT_location, and it is inlined, such that the variable entry contains only a DW_AT_abstract_origin. The function that this patch modifies is the function that was added by the patch[2] that implements the agreed-upon fix from that discussion - essentially, the fix in that patch is one for which there is a general agreement that it is doing the right thing, and this patch simply fixes an error in that patch, which was written without reference to variadic debug values.

In summary, I don't think there is a good reason right now to assume that the patch is causing variables to be erroneously dropped - if there is a reproducer or some other indication that variables are being incorrectly omitted then I'm happy to look into it; I'm also happy to update the test to insert localh into the retained nodes list, so that it is not dropped anymore.

[0] https://reviews.llvm.org/D129372
[1] https://lists.llvm.org/pipermail/llvm-dev/2021-January/148139.html
[2] https://reviews.llvm.org/D95617

I'll see if I can come up with a reproducer. Do you have an example of this from clang, rather than from handcrafted IR?

The problem fixed by this was found while building in the LLVM test suite - I can find an organic reduced reproducer, but I think it's worth noting that this case is quite simple and could easily show up in any optimized debug info build, the basis for this patch is as follows: DbgEntityHistoryCalculator skips recording entries for variables that have no non-empty locations within their scope; prior to this patch, it recognized undef DBG_VALUEs as being undef, but did not recognize undef DBG_VALUE_LISTs as being undef, meaning that in any scenario where the DbgEntityHistoryCalculator would be responsible for determining whether a variable should be emitted or omitted, an undef DBG_VALUE would give a different output to an undef DBG_VALUE_LIST.

So, at least on the test update itself, I'd say this patch creates a bug - by dropping the local variable where it was previously correctly described as existing but without a location.

It is possible that this patch creates a bug by exposing an issue that exists elsewhere - I'd much prefer tackling that issue directly rather than reverting this patch however, as there is a fairly substantive patch stack[0] that has landed since this patch that relies on variadic and non-variadic debug values being treated the same as each other, and in particular on the assumption that undef DBG_VALUEs and DBG_VALUE_LISTs have identical meanings.

I think some investigation is warranted for the dropping of variables, but the dropping of the variable in this test case is due to the fact that I manually added the variable but did not add it to the retained nodes list. For us to be sure that there is a bug, we would need to see an organic example of a variable being dropped, and confirm that it is being dropped incorrectly. I mentioned before that there was a discussion about this, which my esteemed colleagues have helped me find[1], in which it was determined that there was a legitimate case for omitting variables in an inlined function with no DW_AT_location.

After reviewing that thread, I believe that insofar as this patch is responsible for changing the variable counts, it is most likely correct in doing so. The function that this patch modifies, hasNonEmptyLocation, is used explicitly for determining when it is desirable to drop a variable: when the variable has no DW_AT_location, and it is inlined, such that the variable entry contains only a DW_AT_abstract_origin. The function that this patch modifies is the function that was added by the patch[2] that implements the agreed-upon fix from that discussion - essentially, the fix in that patch is one for which there is a general agreement that it is doing the right thing, and this patch simply fixes an error in that patch, which was written without reference to variadic debug values.

In summary, I don't think there is a good reason right now to assume that the patch is causing variables to be erroneously dropped - if there is a reproducer or some other indication that variables are being incorrectly omitted then I'm happy to look into it; I'm also happy to update the test to insert localh into the retained nodes list, so that it is not dropped anymore.

[0] https://reviews.llvm.org/D129372
[1] https://lists.llvm.org/pipermail/llvm-dev/2021-January/148139.html
[2] https://reviews.llvm.org/D95617

Yeah, thanks for all the context - from what I've seen so far in looking cases like https://groups.google.com/g/llvm-dev/c/qtlVVEK6gso/m/mEn3c1fFAwAJ?pli=1 seem plausible, it'd be great to get that llvm-dwarfdump --statistics bug fixed, so it counts variables even when they're only present in the abstract origin.

Though the size reports I was looking at included significant (3-4%) reduction in .debug_loclists size, which doesn't track with this hypothesis (if it were only trivial inlined variables with an abstract_origin and no location being removed (in favor of only having an abstract origin and no concrete version) then loclists shouldn't've changed size) - but it's possible the loclist changes were due to other source change- I don't think those have been tested on this commit alone. I'll try doing that.