This is an archive of the discontinued LLVM Phabricator instance.

Add a DIExpression const-folder to prevent silly expressions
ClosedPublic

Authored by probinson on Jul 27 2021, 1:45 PM.

Details

Summary

It's entirely possible (because it actually happened) for a bool
variable to end up with a 256-bit DW_AT_const_value. This came about
when a local bool variable was initialized from a bitfield in a
32-byte struct of bitfields, and after inlining and constant
propagation, the variable did have a constant value. The sequence of
optimizations had it carrying "i256" values around, but once the
constant made it into the llvm.dbg.value, no further IR changes could
affect it.

Technically the llvm.dbg.value did have a DIExpression to reduce it
back down to 8 bits, but the compiler is in no way ready to emit an
oversized constant *and* a DWARF expression to manipulate it.
Depending on the circumstances, we had either just the very fat bool
value, or an expression with no starting value.

The sequence of optimizations that led to this state did seem pretty
reasonable, so the solution I came up with was to invent a DWARF
constant expression folder. Currently it only does convert ops, but
there's no reason it couldn't do other ops if that became useful.

This broke three tests that depended on having convert ops survive
into the DWARF, so I added an operator that would abort the folder to
each of those tests.

Diff Detail

Event Timeline

probinson created this revision.Jul 27 2021, 1:45 PM
probinson requested review of this revision.Jul 27 2021, 1:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2021, 1:45 PM

Thanks for looking into this! Is there any advantage to return a Constant+DIExpression, or could we just return a DIExpression(DW_OP_constu, ...)? Or can we not do that because even after folding the constant is > 64bits?
I think this new API would be a great candidate to create a unit test harness for (perhaps in MetadataTest.cpp).

Thanks for looking into this! Is there any advantage to return a Constant+DIExpression, or could we just return a DIExpression(DW_OP_constu, ...)? Or can we not do that because even after folding the constant is > 64bits?

Looking at the rest of the infrastructure, I think the Constant is not optional, the DIExpression is the optional part. Most of the time there isn't an expression at all, of course, just the initial operand (which might or might not be a Constant). I suppose that could be redesigned but I think it goes beyond the scope of this patch.

I think this new API would be a great candidate to create a unit test harness for (perhaps in MetadataTest.cpp).

You point out something I was thinking after I posted the patch, i.e. that the testing was just the sort I always grump about--it verifies the right thing happens in the case that prompts the patch, without looking at the broader cases. Unit testing will be just the thing.

Looking at the rest of the infrastructure, I think the Constant is not optional, the DIExpression is the optional part. Most of the time there isn't an expression at all, of course, just the initial operand (which might or might not be a Constant). I suppose that could be redesigned but I think it goes beyond the scope of this patch.

I agree - currently we can't support this, but it can go on the list of potential improvements to DIExpressions.

Some boring nits in the test case, but the implementation looks good. It sounds like some more testing is due, but the rest of this patch LGTM.

llvm/test/DebugInfo/X86/DIExpr-const-folding.ll
32

Nit, attribute can be removed.

38–45

Nit, git reference can be removed here.

Added unittest. Addressed review nits.

probinson marked 2 inline comments as done.Jul 29 2021, 6:48 AM

Looking at the rest of the infrastructure, I think the Constant is not optional, the DIExpression is the optional part. Most of the time there isn't an expression at all, of course, just the initial operand (which might or might not be a Constant). I suppose that could be redesigned but I think it goes beyond the scope of this patch.

I agree - currently we can't support this, but it can go on the list of potential improvements to DIExpressions.

Hmmm currently the folder works on a single known-constant operand; if/when the non-variadic form of the intrinsic/DBG_VALUE is fully replaced by variadic intrinsic/DBG_VALUE_LIST, the operand could become a one-element operand list and we prepend a DW_OP_LLVM_arg to the expression. That would unify how it all looks, I expect. At that point the constant-folder could be modified to take the original operand list and expression (without the caller vetting it as a single constant int) and the folder would do the vetting itself. A minor problem there is that DIExpression (and DebugInfoMetadata.h in general) don't know about operand lists, which appear to be stored in different ways for the IR intrinsic and the SDAG/MachineInstr form. That would have to be unified, presumably within DebugInfoMetadata.

It wouldn't be a stretch, at that point, to say that the folder could import constant operands directly into the expression, and remove them from the operand list. Right now that responsibility lies with the DWARF emitter. We'd also have to look at how the Codeview side works, and make sure we don't break that or make it unreasonably complicated.

Looking at the rest of the infrastructure, I think the Constant is not optional, the DIExpression is the optional part. Most of the time there isn't an expression at all, of course, just the initial operand (which might or might not be a Constant). I suppose that could be redesigned but I think it goes beyond the scope of this patch.

I see, while we use DW_OP_const for DIGlobalVariableExpression(), there is no canonical way to express a llvm.dbg.value that doesn't bind an SSA value. llvm.dbg.value(undef) has a special meaning, and llvm.dbg.value(null) is treated like undef.

llvm/test/DebugInfo/X86/convert-linked.ll
22–25

This looks wrong: What is a DW_OP_deref of the constant 32 supposed to be?

aprantl added inline comments.Jul 30 2021, 11:16 AM
llvm/unittests/IR/MetadataTest.cpp
2912

Thanks! Assuming these are dwarf expression ops, could spell them out here using the Dwarf.def enums? Or does that mess with the macro? If so, maybe we could at least add a comment expanding the raw numbers.

probinson added inline comments.Aug 2 2021, 8:31 AM
llvm/test/DebugInfo/X86/convert-linked.ll
22–25

It is supposed to be a way to keep the constant folder from firing. It doesn't "mean" anything. I can add a comment that says so?

llvm/unittests/IR/MetadataTest.cpp
2912

The values are parameters to the APInt ctor, meaning 32-bits wide with value 117 as input, and 32-bits wide with value 117 as output, and 1 element remaining in the expression afterward. I don't pass the APInt directly because the macro expansion needs to know the bit-width of the result independently. I suppose it could become
EXPECT_EXPR_FOLD(APInt(32, 117), APInt(32, 117), 32, 1)
if that would be easier to read?

aprantl added inline comments.Aug 2 2021, 4:30 PM
llvm/test/DebugInfo/X86/convert-linked.ll
22–25

I see. A comment explaining the reason for the strange expression would definitely help someone who needs to maintain the testcase later. Maybe we could even use something more neutral like DW_OP_lit0 DW_OP_plus if that also works?

probinson updated this revision to Diff 363783.Aug 3 2021, 9:49 AM

Add commentary to help explain test fragility, and use operators that make more sense.
Rename the helper macro in the unit test to try to make the nature of its arguments clearer.

probinson marked 2 inline comments as done.Aug 3 2021, 9:52 AM
probinson added inline comments.
llvm/unittests/IR/MetadataTest.cpp
2912

Changed the macro name, in case that helps. Let me know if APInt arguments would make it clearer.

aprantl accepted this revision.Aug 4 2021, 3:45 PM
This revision is now accepted and ready to land.Aug 4 2021, 3:45 PM
This revision was landed with ongoing or failed builds.Aug 5 2021, 6:14 AM
This revision was automatically updated to reflect the committed changes.