This is an archive of the discontinued LLVM Phabricator instance.

[lldb/DWARF] Add support for pre-standard GNU call site attributes
ClosedPublic

Authored by labath on May 25 2020, 7:16 AM.

Details

Summary

The code changes are very straight-forward -- just handle both DW_AT_GNU
and DW_AT_call versions of all tags and attributes. There is just one
small gotcha: in the GNU version, DW_AT_low_pc was used both for the
"return pc" and the "call pc" values, depending on whether the tag was
describing a tail call, while the official scheme uses different
attributes for the two things.

Depends on D80518.

Diff Detail

Event Timeline

labath created this revision.May 25 2020, 7:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 25 2020, 7:16 AM
vsk added inline comments.May 26 2020, 12:44 PM
lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
3740

I think this needs to be call_inst_pc = low_pc - 1, see DwarfCompileUnit::constructCallSiteEntryDIE for the rationale, and StackFrameList::SynthesizeTailCallFrames for where we use this information. The relevant part of the comment from SynthesizeTailCallFrames is:

"We do not want to subtract 1 from this PC, as it's the actual address of the tail-calling branch instruction. This address is provided by the compiler via DW_AT_call_pc."

In GNU+Dwarf4 mode, that's no longer true, the DW_AT_low_pc is a fake "return address" for the tail call (really: the address of the instruction after the tail-calling jump).

On x86_64, this test doesn't seem to stress this case, but the test breaks on Darwin/arm64 without the adjustment.

vsk added a comment.May 26 2020, 1:39 PM

Also, I think it'd be good to enable most/all of the tests under test/API/functionalities/tail_call_frames under -ggdb along with this change.

labath updated this revision to Diff 267179.May 29 2020, 4:47 AM
labath marked an inline comment as done.
  • add -1
  • add -ggdb flavours for inline tail call tests, drop -glldb from the others. The choice is fairly arbitrary. I don't believe we should be doing combinatorial testing at this level, so I did not want to enable all of them. Enabling inline tests was easy because I had muscle memory for it. For the rest, dropping -glldb means that the gnu flavour will be tested on platforms that default to -ggdb.
lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
3740

Heh, you're right. I should've looked at what the code does instead of just trying to reverse engineer the logic from the output. I've now added the -1.
After looking that this code some more, I've come to realize that it's usage of the lack of DW_AT_call_return_pc to indicate a tail call is not correct -- I don't see anything preventing a producer from generating this attribute even for tail calls. I'm going to try refactoring this in another patch to store the tail-call-ness more explicitly. That should also make this part slightly cleaner.

vsk added inline comments.May 29 2020, 11:39 AM
lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
3740

The parsing code uses DW_AT_call_tail_call to determine whether or not there's a tail call, which is already explicit. However, the CallEdge representation does take an invalid return_pc address to mean there's a tail call. My $0.02 is that that's legit, and a producer that emits DW_AT_call_return_pc at a tail call site is behaving badly. If we care to support that, we could do it by changing the condition on line 3710 to attr == DW_AT_call_return_pc && !tail_call.

lldb/test/API/functionalities/tail_call_frames/disambiguate_paths_to_common_sink/main.cpp
8

Are these test updates necessary because lldb doesn't print '[opt]' and '[artificial]' next to frame descriptions in a consistent way across platforms? Or is it just that you don't think matching '[opt]' is relevant to the test?

labath marked 2 inline comments as done.Jun 1 2020, 6:03 AM
labath added inline comments.
lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
3740

You're right again. I confused a tail call with a (regular) call to a noreturn function.

The real reason why I wanted to do that change was to avoid the -1 thingy here. If we carried the information about a tail call and the information whether the address points before/after the call instruction explicitly, I believe that the -1 here wouldn't be necessary, and the code in SynthesizeTailCallFrames could just set behaves_like_frame_zero depending on what kind of address it gets. I have a feeling that would be cleaner, but I have to try it out...

lldb/test/API/functionalities/tail_call_frames/disambiguate_paths_to_common_sink/main.cpp
8

Right, I wanted to mention that as it's not very obvious, but I forgot...

The [opt] thingy is not printed at all with -ggdb because the attribute we get this information from -- DW_AT_APPLE_optimized -- is only emitted for -glldb. The optimization flag did not seem very relevant for these tests (I mean, technically the compiler could emit call site attributes even in non-optimized mode) so instead of forking the expectations I chose to simply remove it.

vsk accepted this revision.Jun 1 2020, 9:46 AM

Thanks, lgtm!

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
3740

I think you could do this by adding a flag to the CallEdge class, like:

bool have_one_after_call_inst_pc = ...;
union {
  addr_t call_inst_pc;
  addr_t one_after_call_inst_pc;
};

and then set behaves_like_frame_zero = !call_edge.have_one_after_call_inst_pc. But I'm not convinced this successfully separates the concerns re: parsing DWARF vs. adjusting PC values.

lldb/test/API/functionalities/tail_call_frames/disambiguate_paths_to_common_sink/main.cpp
8

Sounds good.

This revision is now accepted and ready to land.Jun 1 2020, 9:46 AM
dblaikie added inline comments.
lldb/test/API/functionalities/tail_call_frames/disambiguate_paths_to_common_sink/main.cpp
8

As an aside, now that lldb understands these attributes - perhaps we should emit them under -glldb as well as -ggdb? (@aprantl might be interested in making that call)

labath marked 5 inline comments as done.Jun 2 2020, 2:49 AM
labath added inline comments.
lldb/test/API/functionalities/tail_call_frames/disambiguate_paths_to_common_sink/main.cpp
8

FWIW, I think that would be great as it would reduce the effects of the debugger tuning argument, making the compiler output more "portable".

Though, we may want to wait with that until I look at the -1 issue. I believe that the way this is implemented now means we will end up pointing to the middle of a call instruction in an artificial frame, which would be a slight regression. It's not the end of the world, but I believe we can do something slightly better.

This revision was automatically updated to reflect the committed changes.
labath marked an inline comment as done.Jun 2 2020, 4:28 AM
labath added inline comments.
lldb/test/API/functionalities/tail_call_frames/disambiguate_paths_to_common_sink/main.cpp
8

Ok, I take that back. The instruction pointer handling is not terribly consistent right now anyway:

(lldb) up
frame #1: 0x0000000000401210 a.out`func12(...)
(lldb) register read rip
     rip = 0x0000000000401300

So, I wouldn't worry too much about preserving behavior here.

vsk added inline comments.Jun 2 2020, 9:31 AM
lldb/test/API/functionalities/tail_call_frames/disambiguate_paths_to_common_sink/main.cpp
8

I don't see any concrete benefit to supporting -ggdb on Darwin. Actually, changing llvm to emit the GNU opcodes on Darwin seems bad to me, as it could force Darwin tools authors to support two sets of call-site related opcodes.

dblaikie added inline comments.Jun 2 2020, 10:04 AM
lldb/test/API/functionalities/tail_call_frames/disambiguate_paths_to_common_sink/main.cpp
8

Not sure what it would mean to "support -ggdb on Darwin" - it's supported in that some peolpe might be using gdb on Darwin & compile that way. But I meant emitting these when targeting lldb - which someone might be doing on any platform & they might still want to use DWARFv4 for whatever reason - or does DWARFv4 + -glldb already use the DWARFv5 call site opcodes? If so, then, sure, that sounds OK to me & I don't suppose there's much reason to emit the GNU ones instead. If DWARFv4 + -glldb doesn't emit any call site info, it seems like an improvement to emit the GNU extension (consumers should always be written to ignore tags/attributes they don't know - and if so they'll be no worse off than if the call site info hadn't been emitted) or the v5 attributes maybe (though I don't know that that's quite as valid - I guess a consumer could rightly reject tags/attributes that aren't in the extension number space, or in the standard number space for the version being parsed)

vsk added inline comments.Jun 2 2020, 1:54 PM
lldb/test/API/functionalities/tail_call_frames/disambiguate_paths_to_common_sink/main.cpp
8

Yep, DWARFv4 + -glldb emits DWARFv5 call site opcodes -- and I should have written (as I meant), 'there's no need to change this to behave like -ggdb under -gdwarf-4 mode on Darwin'. It should (remain!) possible to use -ggdb on Darwin.

dblaikie added inline comments.Jun 2 2020, 2:20 PM
lldb/test/API/functionalities/tail_call_frames/disambiguate_paths_to_common_sink/main.cpp
8

Ah, fair enough - marginally questionable emitting future standard tags/attributes in previous versions - but I'm not too fussed so long as it's not new forms. As that's the direction chosen, yeah, no point emitting the DWARFv4 gdb extension tags/attributes under -glldb.

lldb/test/API/functionalities/tail_call_frames/thread_step_out_message/main.cpp