This is an archive of the discontinued LLVM Phabricator instance.

[llvm] [testsuite] Fix invalid DW_AT_location DWARF expression in unattached-global.ll
Needs ReviewPublic

Authored by jankratochvil on Apr 5 2021, 2:41 PM.

Details

Reviewers
aprantl
Summary

I had a problem in unrelated D99850 with this testcase. IMO unattached-global.ll is wrong (since its inception in D20147):

!4 = !DIExpression(DW_OP_plus_uconst, 4)
                DW_AT_location	(DW_OP_plus_uconst 0x4)

That is not a valid DWARF expression for DW_TAG_variable. There should be an empty DWARF expression (or completely missing DW_AT_location).

Probably good to cleanup that test in a separate preparatory commit. & maybe you or @aprantl might be interested in suring up the LLVM IR debug info verifier to catch this?

Diff Detail

Event Timeline

jankratochvil requested review of this revision.Apr 5 2021, 2:41 PM
jankratochvil created this revision.

@aprantl @JDevlieghere do you folks want to update the verifier in any way to make this sort of thing invalid? (looks like it might be relatively easy to test for, judging by the code touched in D99850)

@aprantl @JDevlieghere do you folks want to update the verifier in any way to make this sort of thing invalid?

Before implementing it into DWARFExpression::verify I see two issues:

  • IIUC the verifier would be run by llvm-dwarfdump --verify which is currently not being run.
  • The goal of this testcase is to omit DW_AT_location when producing the binary. Therefore the invalid DWARF expression would not be written out at all. So it could not be verified as incorrect by llvm-dwarfdump.

@aprantl @JDevlieghere do you folks want to update the verifier in any way to make this sort of thing invalid?

Before implementing it into DWARFExpression::verify I see two issues:

  • IIUC the verifier would be run by llvm-dwarfdump --verify which is currently not being run.
  • The goal of this testcase is to omit DW_AT_location when producing the binary. Therefore the invalid DWARF expression would not be written out at all. So it could not be verified as incorrect by llvm-dwarfdump.

Sorry, I didn't mean dwarfdump verify (though that could be useful too), but the IR verifier (lib/IR/Verifier.cpp). If we're going to say that this IR is not good/supported by the backends, one thing we can do is make it invalid according to the verifier.

Maybe I'm missing something but it looks to me like this test is supposed to test that we don't attach DW_AT_location to a variable that has been optimized out.

This test is what you get after optimizing away

someglobal = i32 ..., !dbg !3

and I don't think it is invalid at all. Since it doesn't hold a constant, the DIExpression is meaningless without being attached to a global, but it's not invalid IR. It just shouldn't get output in DWARF.

This going to get less ambiguous with the ongoing work to make arguments in DIExpressions more explicit.

Maybe I'm missing something but it looks to me like this test is supposed to test that we don't attach DW_AT_location to a variable that has been optimized out.

This test is what you get after optimizing away

someglobal = i32 ..., !dbg !3

and I don't think it is invalid at all. Since it doesn't hold a constant, the DIExpression is meaningless without being attached to a global, but it's not invalid IR. It just shouldn't get output in DWARF.

This going to get less ambiguous with the ongoing work to make arguments in DIExpressions more explicit.

How'd this expression get in the "globals" list, then? Shouldn't the globals list only contain constants? Any DIGlobalVariableExpression would only be referenced via the dbg.value? Oh, it is referenced from both places. That seems unfortunate/wasn't my understanding - I'd have thought that we would either reference the DIGlobalVariableExpression from the dbg.value or from the globals list, not both at the same time (because one referenced from one place doesn't make sense when referenced from the other place - due to lack of the implicit parameter).

In any case, clearly my understanding is off from the current reality & yeah, I agree with your assessment @aprantl & so this change isn't correct & I guess we'll abandon this change and go back to the other review to discuss the issue there.

Yeah, my understanding is that all global variables start out as both !dbg attachements on LLVM globals *and* in the list of globals in the DICompileUnit. This way we don't loose track of them when the LLVM global symbol gets deleted by an optimization.

Yeah, my understanding is that all global variables start out as both !dbg attachements on LLVM globals *and* in the list of globals in the DICompileUnit. This way we don't loose track of them when the LLVM global symbol gets deleted by an optimization.

yeah - seems important that the variable is reachable through both paths, but seems wrong to me that the DIExpression that only makes sense in the context of the dbg attachment is reachable from anywhere else (so I'd expect maybe two DIGlobalVariableExpressions - one with the empty DIExpression, kept in the CU's globals list, and one from the intrinsic, mutated by optimizations and dropped if something is optimized out. But that's not how it is today, so we do need to tolerate these weird/meaningless/defunct location expressions in the CU's globals list for now at least.