Page MenuHomePhabricator

[lldb][AArch64] Add UnwindPlan for Linux sigreturn
ClosedPublic

Authored by DavidSpickett on Oct 19 2021, 6:11 AM.

Details

Summary

This adds a specific unwind plan for AArch64 Linux sigreturn frames.
Previously we assumed that the fp would be valid here but it is not.

https://github.com/torvalds/linux/blob/master/arch/arm64/kernel/vdso/sigreturn.S

On Ubuntu Bionic it happened to point to an old frame info which meant
you got what looked like a correct backtrace. On Focal, the info is
completely invalid. (probably due to some code shuffling in libc)

This adds an UnwindPlan that knows that the sp in a sigreturn frame
points to an rt_sigframe from which we can offset to get saved
sp and pc values to backtrace correctly.

Based on LibUnwind's change: https://reviews.llvm.org/D90898

A new test is added that sets all compares the frames from the initial
signal catch to the handler break. Ensuring that the stack/frame pointer,
function name and register values match.
(this test is AArch64 Linux specific because it's the only one
with a specific unwind plan for this situation)

Fixes https://bugs.llvm.org/show_bug.cgi?id=52165

Diff Detail

Event Timeline

DavidSpickett created this revision.Oct 19 2021, 6:11 AM
DavidSpickett requested review of this revision.Oct 19 2021, 6:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 19 2021, 6:11 AM
DavidSpickett added inline comments.
lldb/source/Symbol/AArch64UnwindInfo.cpp
40 ↗(On Diff #380668)

This comment comes from the libunwind change.

58 ↗(On Diff #380668)

If this means "at all instructions within that function" this could be yes, but no one seems to read this field anyway.

omjavaid accepted this revision.Oct 20 2021, 2:23 PM

This look great. Thanks for doing the legit fix.

@mgorny This is something you might wanna implement/test on BSD.

This revision is now accepted and ready to land.Oct 20 2021, 2:23 PM
labath requested changes to this revision.Oct 21 2021, 1:24 AM
labath added a reviewer: jasonmolenda.
labath added a subscriber: labath.

I don't think this is a scalable way to fix this, and should get more discussion. Jason what do you make of this?

This revision now requires changes to proceed.Oct 21 2021, 1:24 AM

I'm sorry for terse response. I wanted to ensure this gets more attention, but I wasn't able to look at this straight away. Let me try to elaborate.

We generally try to avoid platform-specific code in the core libraries. The unwind code is one of the biggest offenders here, but I'd still want to avoid making it worse, particularly if there is a fairly simple way to avoid that. And I think there might be one. We already involve the Platform class in the unwinding logic (see Platform::GetTrapHandlerSymbolNames). Since it already knows about their names, it doesn't seem far-fetched to have it provide a way to unwind out of them (if necessary). Creating a new Platform entry point to provide the unwind plan (or somehow refactoring the GetTrapHandlerSymbolNames so that it can also provide an unwind plan) would enable us to put this code into PlatformLinux, which (although not ideal) seems much better than smacking it into the middle of the Symbol library. It also seems consistent with the intent in one of the comments in the code you modify.

lldb/source/Symbol/AArch64UnwindInfo.cpp
58 ↗(On Diff #380668)

The way I read it is "all locations within the range of this unwind plan" as given by UnwindPlan::GetAddressRange. compiler-generated unwind plans might cover the entire function, but only be valid (correct) at call sites.

lldb/source/Target/RegisterContextUnwind.cpp
895–896

this comment

Creating a new Platform entry point to provide the unwind plan (or somehow refactoring the GetTrapHandlerSymbolNames so that it can also provide an unwind plan)

Makes sense, I'll try that.

DavidSpickett added inline comments.Oct 22 2021, 5:39 AM
lldb/source/Symbol/AArch64UnwindInfo.cpp
58 ↗(On Diff #380668)

So this would be valid at all instructions because at any point in __kernel_rt_sigreturn the stack pointer is the same.

Set ValidAtAllInstructions.

Move building unwind plan into PlatformLinux and add a platform method
to lookup unwind plans for signal handlers. I'm still checking architecture
in PlatformLinux but it does move the code out of Symbol.

The only per arch thing on lldb's side is Architecture plugins but they're
not meant to know about operating system. So it seems like a reasonable
compromise to put it in PlatformLinux.

DavidSpickett added inline comments.Oct 26 2021, 6:44 AM
lldb/source/Plugins/Platform/Linux/PlatformLinux.cpp
258

PlatformLinux lists 3 handlers:

  • _sigtramp
  • __kernel_rt_sigreturn
  • __restore_rt

I think the first 2 are the same for AArch64 due to this line in the kernel:
VDSO_sigtramp = __kernel_rt_sigreturn;

The third is for real time signals which I'm only just learning about. So I'll see what that's about but probably not relevant for this change.

Set SP not FP based on the sigcontext.

Before, the SP of the frame before sigcontext would be incorrect.
(noticed while testing alt signal stacks)

This was already done in the libunwind change but I got them mixed up.

I'll come up with some addition to the tests to check this.

Thanks for doing this. Just a couple of comments inline.

lldb/source/Plugins/Platform/Linux/PlatformLinux.cpp
288

I have a feeling that this eRegisterKind enum should correspond with the register numbers used in the unwind plan. It'd be great if it did, as then you could replace arm64_dwarf::{pc,sp} with LLDB_REGNUM_GENERIC_{PC,SP}, and avoid including the arm64 header. Though if it does work as I remember, then I guess the question is how does this work in the first place.

303–305

If you passed in a llvm::Triple, you could replace this with triple.isAArch64().

And one could imagine that e.g. the environment field of the triple could be important in determining the precise unwind plan.

lldb/source/Target/RegisterContextUnwind.cpp
895–896

I guess we should also update this comment now. Maybe:

On some platforms the unwind information for signal handlers is not present or correct. Give the platform plugins a chance to provide substitute plan. Otherwise, use eh_frame.

?

labath added inline comments.Oct 27 2021, 2:04 AM
lldb/source/Plugins/Platform/Linux/PlatformLinux.cpp
295

Ah now I see that it worked because of this. Maybe you could try dropping this line and replace the dwarf numbers with generic regnums?

Turns out comparing thread.get_thread_frames() doesn't work because
you end up getting references to the same frames. Instead save the info
we want to compare.

One obvious reason this tactic should have failed is that the libc raise
frame will have a different ID between the two backtraces. This is why I've
ignored ID in the comparion.

Doing this showed me that sp and fp should be set, so I've got fp from
the sigcontext also.

Thanks for doing this. Just a couple of comments inline.

Will respond to these shortly.

Since the signal context contains a copy of all registers, maybe you should be setting all of them. The other registers will not be used for unwinding, but they will enable to user to inspect the registers in the stack frames above the signal handler.
One way to demonstrate (and test) this would be to have a bit of inline asm, which sets all registers to known values, and then raises a signal (SIGILL is probably the easiest to raise from asm). Then lldb breaks on the signal handler, goes up the stack, and observes the registers.

lldb/source/Plugins/Platform/Linux/PlatformLinux.cpp
295

Except of course, if you end up setting all of the registers.

Use generic register names and don't set type to DWARF.

Pass triple instead of machine and use isAArch64.

DavidSpickett marked 3 inline comments as done.Oct 27 2021, 3:36 AM

Update comment.

DavidSpickett marked 2 inline comments as done.Oct 27 2021, 3:40 AM

One way to demonstrate (and test) this would be to have a bit of inline asm, which sets all registers to known values, and then raises a signal (SIGILL is probably the easiest to raise from asm).

I will give that a go, probably as a separate test case.

  • Set locations for all general purpose registers
  • Set register type to DWARF due to that
  • Add a new test that sets known register values and compares them between signal catch and handler
  • Remove changes to handle abort test (though I could do those as a separate thing later)

One thing remains, the sigcontext can include floating point and SVE
registers. We'd need to read some memory to determine if it does:
https://github.com/torvalds/linux/blob/master/arch/arm64/include/uapi/asm/sigcontext.h#L39

Which I can do by passing the target and generating the plan based on
what's present.

For now let me know if the test case makes sense. Maybe this is ok
to go in as is and I can follow up with the other registers?

DavidSpickett edited the summary of this revision. (Show Details)Wed, Nov 10, 3:24 AM
DavidSpickett marked an inline comment as done.

I'll at least make the floating point/SVE changes children of this review since it'll add some complexity.

DavidSpickett added inline comments.Wed, Nov 10, 3:35 AM
lldb/test/API/linux/aarch64/unwind_signal/main.c
24

In my testing the kernel didn't go out of its way to change all the register values but it always changed a few so the test fails consistently if the unwind plan doesn't set the locations.

labath accepted this revision.Wed, Nov 10, 4:24 AM
  • Set locations for all general purpose registers
  • Set register type to DWARF due to that
  • Add a new test that sets known register values and compares them between signal catch and handler
  • Remove changes to handle abort test (though I could do those as a separate thing later)

One thing remains, the sigcontext can include floating point and SVE
registers. We'd need to read some memory to determine if it does:
https://github.com/torvalds/linux/blob/master/arch/arm64/include/uapi/asm/sigcontext.h#L39

Which I can do by passing the target and generating the plan based on
what's present.

For now let me know if the test case makes sense. Maybe this is ok
to go in as is and I can follow up with the other registers?

Yes, this is definitely fine. Thanks for taking your time to do that.

Dynamically generating unwind plans based on the contents of memory kinda goes against the concept of unwind plans, which were meant to be static (and cacheable) descriptions of how to unwind from a given point (PC) in a program. I guess the most "pure" solution would be to encode this logic into the dwarf expressions for the relevant registers. I think this should be doable (dwarf expressions being turing-complete and all), but the resulting expressions would probably be horrendous.

Overall, I'm not too worried about not being able to recover non-gpr registers so (unless you really want to work on this) I think that can stay unimplemented. (I'd also be fine with not recovering gpr registers either, though it's obviously better if they are -- the reason this came up was because I was looking at ways to simplify the eRegisterKind business).

lldb/source/Plugins/Platform/Linux/PlatformLinux.cpp
322

Change to eRegisterKindDWARF and remove the SetRegisterKind call below.

lldb/test/API/linux/aarch64/unwind_signal/TestUnwindSignal.py
82–83

I don't think this is bad, but you could just hardcode the expectations into the python file. Then you could skip this step and continue straight to the final breakpoint.

lldb/test/API/linux/aarch64/unwind_signal/main.c
24

One could also have a similar block (but with different values) in the signal handler, which would guarantee that all registers are being read from the stack.

This revision is now accepted and ready to land.Wed, Nov 10, 4:24 AM
  • Set x registers to a different pattern in the signal handler
  • Don't save frames before and after, just check for the right pattern when unwinding out of the handler to sigill()
  • Set register kind when we first make the unwind plan
DavidSpickett marked 3 inline comments as done.Thu, Nov 11, 3:14 AM
This revision was automatically updated to reflect the committed changes.

Thanks for the help, really helped get a high quality fix!