This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] ValueMapper impl for DIArgList should respect RF_IgnoreMissingLocals
ClosedPublic

Authored by StephenTozer on Nov 22 2021, 3:26 AM.

Details

Summary

Fixes: https://bugs.llvm.org/show_bug.cgi?id=52552

This patch fixes an issue in which SSA value reference within a DIArgList would be unnecessarily dropped by llvm-link, even when invoking on a single file (which should be a no-op). The reason for the difference is that the ValueMapper does not refer to the RF_IgnoreMissingLocals flag for LocalAsMetadata contained within a DIArgList; this flag is used for direct LocalAsMetadata uses to preserve SSA references even when the ValueMapper does not have an explicit mapping for the referenced SSA value, which appears to always be the case when using llvm-link in this manner.

NB: There is another difference between the direct and DIArgList uses of LocalAsMetadata, which is that the former replaces the SSA value with an empty MDNode if RF_IgnoreMissingLocals is not set and there is no known mapping, while the latter replaces it with an undef value. The former approach is out-of-date, as we currently expect undef values to be used for any dead debug values, and anything else will not be handled correctly during ISel (and possible other places). I'm not sure whether this should be fixed up as part of this patch - thoughts?

Diff Detail

Event Timeline

StephenTozer created this revision.Nov 22 2021, 3:26 AM
StephenTozer requested review of this revision.Nov 22 2021, 3:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 22 2021, 3:26 AM
StephenTozer edited the summary of this revision. (Show Details)Nov 22 2021, 3:32 AM
StephenTozer added a project: debug-info.
StephenTozer added a subscriber: debug-info.
jmorse added a subscriber: jmorse.Nov 24 2021, 5:38 AM

Correct me if I'm wrong, but isn't the point of RF_IgnoreMissingLocals that missing mappings will cause references to be "dropped" without error if it's set, which in the DIArgList scenario corresponds to emitting Undef. The comment above in the unit test says "we also shouldn't reference the unmapped local", and then on the line I've dropped a comment on, the output refers to the unmapped local.

llvm/unittests/Transforms/Utils/ValueMapperTest.cpp
54

(Line I'm referring to in main comment)

StephenTozer added inline comments.Nov 24 2021, 7:05 AM
llvm/unittests/Transforms/Utils/ValueMapperTest.cpp
54

I believe the "we shouldn't reference the unmapped local" comment is in reference to the non-IgnoreMissingLocals case specifically. The return value of nullptr for the IgnoreMissingLocals case is, as far as I can tell, equivalent to returning the unmapped local - this is the only site at which we may call this on a LocalAsMetadata/DIArgList:

Value *V = mapValue(Op);
// If we aren't ignoring missing entries, assert that something happened.
if (V)
  Op = V;
else
  assert((Flags & RF_IgnoreMissingLocals) &&
         "Referenced value not in value map!");
StephenTozer added inline comments.Nov 24 2021, 7:17 AM
llvm/unittests/Transforms/Utils/ValueMapperTest.cpp
54

Just to expand on this answer - returning nullptr if no mapping can be found is meant to be the standard for mapValue, so that the caller can decide what to do about that; the specific difficulty with DIArgList is that there may be some elements with a mapping and some without. In this case, this patch makes the executive decision to keep the unmapped value in the returned DIArgList, rather than replacing it with a nullptr and expecting the caller to manually check for a DIArgList with nullptr elements. It may be worth adding a comment to this patch to explain that, however!

This patch fixes an issue in which SSA value reference within a DIArgList would be unnecessarily dropped by llvm-link, even when invoking on a single file (which should be a no-op).

Please add a couple of IR-based tests for llvm-link in addition to the unit test (I suggest at least one that's single file and one that's multi-file).

llvm/lib/Transforms/Utils/ValueMapper.cpp
6

It'd be great to have a comment here explaining why undef isn't correct, and what's happening instead. It's not obvious to me how identity could be valid, given that the local doesn't exist on the other side... to me it looks like this could cause a DIArgList from one llvm::Function to point at a local llvm::Value owned by a different function (or module). Is that possibility excluded somehow? How? (the test suggests this results in metadata !{}; should there be a comment explaining the mechanism for that here?)

From the commit message, the goal is to change LocalAsMetadata that's referenced from a DIArgList, but it looks like it changes behaviour for all LocalAsMetadata. Can you explain why that's okay/safe?

llvm/unittests/Transforms/Utils/ValueMapperTest.cpp
51–1

This change makes it hard for me to see what has changed and what hasn't in the test. Please do one of the following:

  • commit NFC churn separately as a prep commit and rebase
  • avoid churn by choosing a different naming scheme that leaves old code untouched
  • avoid churn by moving new test code to a separate test (maybe mapValueDIArgList)
51–1

This comment makes it look like the behaviour changed for DIArgList, but the code change appears to affect only LocalAsMetadata. Can you explain what I'm missing?

StephenTozer added inline comments.Dec 13 2021, 9:48 AM
llvm/lib/Transforms/Utils/ValueMapper.cpp
6

I might misunderstand the meaning of the RF_IgnoreMissingLocals flag, but I believe that the intention of the flag is to ensure that we don't delete any LocalAsMetadata that appear prior to their referenced local value or otherwise have no existing mapping in the ValueMap, by using an identity mapping; this is the existing behaviour of the flag for direct LocalAsMetadata, and this patch is updating the DIArgList case to match.

For a direct LocalAsMetadata, this function returns nullptr if a set of conditions hold: if mapValue(LAM->getValue()) returns nullptr and the RF_IgnoreMissingLocals flag is set. The only caller of this function then interprets the nullptr return value as an identity mapping if the RF_IgnoreMissingLocals flag is set:

Value *V = mapValue(Op);
// If we aren't ignoring missing entries, assert that something happened.
if (V)
  Op = V;
else
  assert((Flags & RF_IgnoreMissingLocals) &&
         "Referenced value not in value map!");

This patch updates DIArgList uses of a LocalAsMetadata to maintain an identity mapping iff the same conditions hold. Although in the direct case it is the caller that makes the decision to maintain an identity mapping, the DIArgList case would be more complicated to handle from the caller side, and the end result is the same. Given this, it was my assumption that we only set the RF_IgnoreMissingLocals flag when we know that it won't result in an invalid Value reference, e.g. when we're cloning basic blocks within a function.

From the commit message, the goal is to change LocalAsMetadata that's referenced from a DIArgList, but it looks like it changes behaviour for all LocalAsMetadata. Can you explain why that's okay/safe?

I'm fairly sure that this should only affect LocalAsMetadata that's referenced from a DIArgList - the code for handling direct LocalAsMetadata is above (lines 382-396), all the changes of this patch are within the handler for DIArgLists, which may contain LocalAsMetadata and ConstantAsMetadata.

The code change now LGTM; thanks for walking me through it. I can take another look at the tests once you update them (or reply to those comments).

llvm/lib/Transforms/Utils/ValueMapper.cpp
4–1

I'd missed this context before!

6

I might misunderstand the meaning of the RF_IgnoreMissingLocals flag, but I believe that the intention of the flag is to ensure that we don't delete any LocalAsMetadata that appear prior to their referenced local value or otherwise have no existing mapping in the ValueMap, by using an identity mapping; this is the existing behaviour of the flag for direct LocalAsMetadata, and this patch is updating the DIArgList case to match.

The original intention of RF_IgnoreMissingLocals is tied more to behaviour of local Values than of Metadata; specifically, that RemapInstruction should not assert if there's a local that hasn't been mapped. But for Metadata, I think the behaviour you outline sounds reasonable / consistent.

I'm fairly sure that this should only affect LocalAsMetadata that's referenced from a DIArgList - the code for handling direct LocalAsMetadata is above (lines 382-396), all the changes of this patch are within the handler for DIArgLists, which may contain LocalAsMetadata and ConstantAsMetadata.

Most of my concerns came from misreading the patch; I previously didn't notice this loop was in the context of updating DIArgList arguments.

llvm/unittests/Transforms/Utils/ValueMapperTest.cpp
51–1

(Now I know what I was missing!)

Add a lit test for llvm-link, move the unit test changes into a separate function (leaving the original unmodified). The long comment in the original test has been duplicated (with a minor change) in the new test - I figure it's better to have each test containing the relevant info directly, but if preferred I could change it to some kind of "see above" instead.

dexonsmith accepted this revision.Jan 11 2022, 4:47 PM

Thanks, LGTM! (Apologies for the long delay!)

This revision is now accepted and ready to land.Jan 11 2022, 4:47 PM
This revision was landed with ongoing or failed builds.Jan 17 2022, 9:18 AM
This revision was automatically updated to reflect the committed changes.