This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Process single-location debug values in variadic form when producing DWARF
ClosedPublic

Authored by StephenTozer on Aug 17 2023, 7:46 AM.

Details

Summary

Fixes issues raised on: https://reviews.llvm.org/D133926

The patch above enabled using variadic-form debug values to represent single-location, non-stack-value debug values, and a further patch made all DBG_INSTR_REFs use variadic form. Not all code paths were updated correctly to handle the new syntax however, with entry values in still expecting an expression that begins exactly DW_OP_LLVM_entry_value, 1.

A function already exists to select non-variadic-like expressions; this patch adds an extra function to cheaply simplify such cases to non-variadic form, which we use prior to any entry-value processing to put DBG_INSTR_REFs and DBG_VALUEs down the same code path. We also use it for a few DIExpression functions that check for whether the first element(s) of a DIExpression match a particular pattern, so that they will return the same result for DIExpression(DW_OP_LLVM_arg, 0, <ops>) as for DIExpression(<ops>).

Diff Detail

Event Timeline

StephenTozer created this revision.Aug 17 2023, 7:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 17 2023, 7:46 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
StephenTozer requested review of this revision.Aug 17 2023, 7:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 17 2023, 7:46 AM

Patch broke a test case, which tested that we don't emit locations for dbg.values with constant operands >64 bits as part of a normal DWARF expression, which was appearing in variadic dbg.values. This patch incidentally fixes the specific test by causing the example variadic expression to be interpreted as non-variadic, but slightly modifying the expression fixes that.

fdeazeve accepted this revision as: fdeazeve.Aug 17 2023, 11:33 AM

Thanks for looking into this! The code itself LGTM, but it might be worth waiting for the review of someone more familiar with this portion of code base

This revision is now accepted and ready to land.Aug 17 2023, 11:33 AM
Orlando accepted this revision.Aug 22 2023, 6:57 AM
Orlando added a subscriber: Orlando.

LGTM too with some inline nits/questions.

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
159 ↗(On Diff #551178)

style nit: outer if should have braces.

llvm/lib/IR/DebugInfoMetadata.cpp
1347

Can we have DIExpressions where there are two entry_value ops (applied to two DW_OP_LLVM_args)?

Perhaps that's a separate issue or maybe there is special case isEntryValue-equivalent code somewhere for variaidics?

1355–1358

Should these functions be asserting that they're not variadic?

1554

What's the purpose of the Elements[1] != 0 check? Could we ever end up with a variadic dbg intrinsic that has 2 location args, where only the second is referenced in the expression (e.g. DW_OP_LLVM_arg, 1, ... with arg 0 not referenced)?

fdeazeve added inline comments.Aug 24 2023, 6:26 AM
llvm/lib/IR/DebugInfoMetadata.cpp
1347

This is a tricky one, as I don't think any of the "isX" or "startsWithX" methods were designed with variadics in mind. These methods should only be called when we know we have a single expression?

I wasn't around when the variadic design was made, but my first thought is to think that instead of having variadic expressions it would be cleaner to have multiple expressions, one per value argument. But there are probably a lot of implications I can't see right now!

StephenTozer added inline comments.Aug 24 2023, 7:01 AM
llvm/lib/IR/DebugInfoMetadata.cpp
1347

my first thought is to think that instead of having variadic expressions it would be cleaner to have multiple expressions, one per value argument

The expression needs to calculate a single value - it's not being used in this case for fragments/DW_OP_piece expressions, but as a function to reproduce a variable's value from a set of input values. For example, if you try to translate %z = add %x, %y into a DIExpression, you can't do so using one expression for %x and one expression for %y - you just need a single expression for %z that takes %x and %y as arguments.

fdeazeve added inline comments.Aug 24 2023, 7:30 AM
llvm/lib/IR/DebugInfoMetadata.cpp
1347

Ah, this makes perfect sense, thanks! Knew I was missing something fundamental here :)

fdeazeve added a comment.EditedAug 24 2023, 8:40 AM

@StephenTozer have you received any other feedback on this? I've been experimenting with this patch in the swift fork (https://github.com/apple/llvm-project) and it seems to fix all the issues I had found.

I've been experimenting with this patch in the swift fork (https://github.com/apple/llvm-project) and it seems to fix all the issues I had found.

Good to hear! I'm working through a small backlog of patches right now, but should have this merged soon.

StephenTozer added inline comments.Sep 8 2023, 4:57 AM
llvm/lib/IR/DebugInfoMetadata.cpp
1355–1358

That wasn't the original intent in this patch; the idea is that there's no difference between DW_OP_LLVM_arg, 0, <ops> in a variadic expression and <ops> in a non-variadic expression, so the goal of this patch is to treat those two as the same even if the expression ultimately is "truly" variadic, because we weren't already checking for variadic-ness here (so for correctness' sake we assume this is already being checked elsewhere), and the variadic check is more expensive.

However, I've found the patch triggers some errors when running on the llvm test suite, and the variadic check really isn't that expensive I think, so I'm going to rewrite the patch to do a bit more.

However, I've found the patch triggers some errors when running on the llvm test suite, and the variadic check really isn't that expensive I think, so I'm going to rewrite the patch to do a bit more.

Hi @StephenTozer, could you elaborate on what these failures are? I am concerned because, ever since D133926 was merged, there are some methods that were left with an incorrect implementation, like startsWithDeref or isEntryValue. It would be nice if we could address those.
Is the proposed fix here revealing a larger problem that may cause D133926 to be rethought?

Hi @StephenTozer, could you elaborate on what these failures are? I am concerned because, ever since D133926 was merged, there are some methods that were left with an incorrect implementation, like startsWithDeref or isEntryValue. It would be nice if we could address those.
Is the proposed fix here revealing a larger problem that may cause D133926 to be rethought?

The failures were crashes caused by some expressions being updated incorrectly; the fix I'm uploading is just for this patch, making it a bit more defensive by changing getSingleLocationExpressionElements to actually check whether the expression is variadic, and return an optional, either the single-loc expression or std::nullopt if the expression isn't variadic. I've got the change, I'm running correctness tests again and will upload it shortly.

Check for isSingleLocationExpression in more places and make other checks (isEntryValue() and similar) contingent on that check.

Apologies for the delay on getting this up!

fdeazeve accepted this revision.Sep 14 2023, 10:21 AM

Thanks for the explanation, it makes sense!

This revision was landed with ongoing or failed builds.Sep 15 2023, 11:12 AM
This revision was automatically updated to reflect the committed changes.
fdeazeve added a comment.EditedSep 20 2023, 5:09 AM

I've just noticed that, for reasons I am still investigating, the optimizer produced this expression in some test:

call void @llvm.dbg.value(metadata !DIArgList(ptr %0, i32 %4), metadata !319, metadata !DIExpression(DW_OP_LLVM_entry_value, 1, DW_OP_deref, DW_OP_LLVM_arg, 0, DW_OP_plus_uconst, 16, DW_OP_LLVM_arg, 1, DW_OP_constu, 1, DW_OP_mul, DW_OP_plus, DW_OP_deref, DW_OP_stack_value)), !dbg !344

This is technically an entry value, as entry_value, 1 is referring to %0 here. However, because of the check if (auto singleLocElts = getSingleLocationExpressionElements()) we are going to return false.

I suspect a similar issue happens with the current implementation of the isDeref check. Any thoughts on this? I'm unsure what kind of semantics those isX methods should have in the presence of variadic arg lists

Actually, I think that expression might be invalid? That argument 1 is supposed to mimic the DWARF specification, i.e., it attempts to specify the length of the operations that are evaluated upon function entry. We require it to be 1 because we always want to be just a "register" operation. It's not clear what the meaning would be for a variadic expression

I've just noticed that, for reasons I am still investigating, the optimizer produced this expression in some test:

Are you able to post the test source or an unoptimized reproducer?

I suspect a similar issue happens with the current implementation of the isDeref check. Any thoughts on this? I'm unsure what kind of semantics those isX methods should have in the presence of variadic arg lists

The awkwardness of those functions in the context of a variadic debug value is that they're really meant to refer to a specific value, i.e. "is this expression dereferencing register x?", rather than being a property of the expression as a whole. Rewriting these to make this make sense is another piece of work imo - for the cases other than isEntryValue I don't think this is an issue because iirc they're only being used in cases where we only care about non-variadic debug values anyway. The entry value case should be investigated further (and as you note in your second comment, the expression as written is incorrect).

Are you able to post the test source or an unoptimized reproducer?

Working on it!

for the cases other than isEntryValue I don't think this is an issue because iirc they're only being used in cases where we only care about non-variadic debug values anyway.

The verifier does call "isEntryValue" in order to perform some checks. I'll probably change it so that it also checks the expression is not variadic.