This is an archive of the discontinued LLVM Phabricator instance.

Select the correct DWARF location list entry when looking for variable locations
ClosedPublic

Authored by jasonmolenda on Apr 28 2022, 12:21 AM.

Details

Summary

When lldb is looking for a location list entry for a variable, it is using the pc value -- or return-pc value when up the stack -- to pick the correct DWARF location list entry. Because it is using return-pc, the entry selected may be an entirely different basic block, or function. We should be using the "pc for symbolication" -- return-pc decremented in most cases -- to select the correct block we should be looking up. I added RegisterContext::GetPCForSymbolication and StackFrame::GetFrameCodeAddressForSymbolication last year in https://reviews.llvm.org/D97644 and this is a simple patch with those in place.

The bug that drove this is in the middle of a complex project, but I can elicit incorrect behavior with a small C file that calls abort() as its last instruction, with a different function immediately after that. It must be built at -O1 with clang (on x86_64, AArch64) so that the debug info tracks variables in different registers during the function lifetime, using a DWARF location list.

I constructed a test case that does this, but it depends on clang's current codegen at -O1 on these two architectures. If the codegen were to shift around, the test could pass without correctly exercising this codepath. One possibility would be to include a test binary, DWARF for it, and corefile; that would eliminate the possibility of codegen changing. But it's a lot of data, and I'm not sure how necessary it is. Given that I get the same codegen on two architectures with clang at -O1, maybe this is a small enough exposure to the optimizer that we can use it in the testsuite. Open to suggestions.

Without this patch, this test case will show argv as unavailable when you try to print it. In the actual bug that I was tracking down, the return-pc pointed to a different location list entry for the variable so lldb showed the wrong value for that variable. A much more subtle and confusing bug for the user.

Diff Detail

Event Timeline

jasonmolenda created this revision.Apr 28 2022, 12:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 28 2022, 12:21 AM
jasonmolenda requested review of this revision.Apr 28 2022, 12:21 AM
jingham accepted this revision.Apr 29 2022, 11:03 AM

The change seems correct. Since the test only requires -01, and the test code is quite straightforward, the test seems fine.

This revision is now accepted and ready to land.Apr 29 2022, 11:03 AM
clayborg accepted this revision.Apr 29 2022, 11:34 AM

I wonder if this will help optimized code debugging at all!? Good catch

I wonder if this will help optimized code debugging at all!? Good catch

It came up in a kernel debugging session iirc; it was the case of a function doing a noreturn call, and when you went up to that stack frame, the variable was in one register up through the noreturn call, and then it was in a different register after that noreturn -- so we would print the wrong register value. :/ It takes a bit to get this exact issue, but printing an incorrect value was a worst case failure mode. The API test case I came up with will only print that the variable is unavailable, instead of the wrong value.