This is an archive of the discontinued LLVM Phabricator instance.

[libunwind] Fix unwind_leaffunction test
ClosedPublic

Authored by leonardchan on Nov 30 2021, 1:09 PM.

Details

Reviewers
danielkiss
ldionne
Group Reviewers
Restricted Project
Commits
rGf178a05f2204: [libunwind] Fix unwind_leaffunction test
Summary

It's possible for this test not to pass if the libc used does not provide unwind info for raise. We can replace it with __builtin_cast, which can lead to a SIGTRAP on x86_64 and a SIGILL on aarch64.

Using this alternative, a nop is needed before the __builtin_cast. This is because libunwind incorrectly decrements pc, which can cause pc to jump into the previous function and use the incorrect FDE.

Diff Detail

Event Timeline

ldionne created this revision.Nov 30 2021, 1:09 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 30 2021, 1:09 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
ldionne requested review of this revision.Nov 30 2021, 1:09 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 30 2021, 1:09 PM
leonardchan accepted this revision.Nov 30 2021, 1:17 PM

I haven't yet responded on D114385, but it looks like using __builtin_trap() still breaks the test for us because the unwinder doesn't iterate over the frame for main. Instead it goes from crashing_leaf_func to __libc_start_main, then runs out of frames. I'm still investigating why it does this. I don't think that should be a blocker for landing this though since this approach seems more correct than the raise, so LGTM.

I haven't yet responded on D114385, but it looks like using __builtin_trap() still breaks the test for us because the unwinder doesn't iterate over the frame for main. Instead it goes from crashing_leaf_func to __libc_start_main, then runs out of frames. I'm still investigating why it does this. I don't think that should be a blocker for landing this though since this approach seems more correct than the raise, so LGTM.

Thanks for chiming in. I'll wait for the CI to be green and ship this.

Hi @ldionne, @leonardchan

I ran these changes on Linux (Ubuntu 18.04)/Aarch64 board. The test gets failed because of missed 'main':

ubuntu@jetson8:/tmp$ ./t.tmp.exe 
info.dli_sname: _Z14signal_handleri
info.dli_sname: __kernel_rt_sigreturn
info.dli_sname: _Z18crashing_leaf_funcv
info.dli_sname: __libc_start_main

The changes from D114385 work fine on the same board.

Hi @ldionne, @leonardchan

I ran these changes on Linux (Ubuntu 18.04)/Aarch64 board. The test gets failed because of missed 'main':

ubuntu@jetson8:/tmp$ ./t.tmp.exe 
info.dli_sname: _Z14signal_handleri
info.dli_sname: __kernel_rt_sigreturn
info.dli_sname: _Z18crashing_leaf_funcv
info.dli_sname: __libc_start_main

The changes from D114385 work fine on the same board.

Do we need to apply __attribute__((noinline)) to main? Can you try that out and let me know if that helps?

Hi @ldionne, @leonardchan

I ran these changes on Linux (Ubuntu 18.04)/Aarch64 board. The test gets failed because of missed 'main':

ubuntu@jetson8:/tmp$ ./t.tmp.exe 
info.dli_sname: _Z14signal_handleri
info.dli_sname: __kernel_rt_sigreturn
info.dli_sname: _Z18crashing_leaf_funcv
info.dli_sname: __libc_start_main

This is what I'm running into also.

Do we need to apply __attribute__((noinline)) to main? Can you try that out and let me know if that helps?

Locally __attribute__((noinline)) on main doesn't seem to change anything for me. I checked the unwind info and made sure it's emitted correctly. The CIE for crashing_leaf_func is

[0x202918] CIE length=20
  version: 1
  augmentation: zR
  code_alignment_factor: 1
  data_alignment_factor: -4
  return_address_register: 30

  Program:
DW_CFA_def_cfa: reg31 +0
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:

and it's FDE is

[0x202970] FDE length=20 cie=[0x202918]
  initial_location: 0x2134ac
  address_range: 0x8 (end : 0x2134b4)

  Program:
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:
DW_CFA_nop:

which I think just indicates that libunwind should use x30 for finding the return address. If I stop right before the __builtin_trap(), it looks like x30 points back into main (right after the call to crashing_leaf_func):

x30            0x2134fc 2176252
...
(gdb) info symbol 0x2134fc
main + 60 in section .text of /home/leonardchan/llvm-monorepo/llvm-build-1-master-fuchsia-toolchain/runtimes/runtimes-aarch64-unknown-linux-gnu-bins/Output/unwind_leaffunction.pass.cpp.dir/t.tmp.exe
...
Leonard Chan, 9 min
(gdb) disas 0x2134fc
Dump of assembler code for function main:
...
   0x00000000002134f8 <+56>:    bl      0x2134ac <_Z18crashing_leaf_funcv>
   0x00000000002134fc <+60>:    mov     w0, #0xfffffffe                 // #-2
...

One thing I also noticed is that if we add another function call, something like:

void crashing_leaf_func() {
  __builtin_trap();
}

void func() {
  crashing_leaf_func();
}

int main() {
  signal(SIGTRAP, signal_handler);
  signal(SIGILL, signal_handler);
  func();
  return 2;
}

Then the test passes, but it looks like unwinder skips past func (it jumps from _Z18crashing_leaf_funcv to main). Perhaps all this hints at an underlying issue in libunwind?

Do we need to apply attribute((noinline)) to main? Can you try that out and let me know if that helps?

I tried to add __attribute__((noinline)) attribute, but it didn't help. I got the same dump.

danielkiss accepted this revision.Dec 1 2021, 2:02 AM

Do we need to apply __attribute__((noinline)) to main? Can you try that out and let me know if that helps?

Clang automatically adds noinline to main.

Hi @ldionne, @leonardchan

I ran these changes on Linux (Ubuntu 18.04)/Aarch64 board. The test gets failed because of missed 'main':

ubuntu@jetson8:/tmp$ ./t.tmp.exe 
info.dli_sname: _Z14signal_handleri
info.dli_sname: __kernel_rt_sigreturn
info.dli_sname: _Z18crashing_leaf_funcv
info.dli_sname: __libc_start_main

Interesting it works on Ubuntu20.04/Aarch64, I compile the code with ToT clang or gcc-10.
could you share an objdump of main?

void func() {
 crashing_leaf_func();
}

Then the test passes, but it looks like unwinder skips past func (it jumps from _Z18crashing_leaf_funcv to main). Perhaps all this hints at an underlying issue in libunwind?

here func will tail call crashing_leaf_func so it is expected to not be visible during the unwind because the return address will point to somewhere in main.

Then the test passes, but it looks like unwinder skips past func (it jumps from _Z18crashing_leaf_funcv to main). Perhaps all this hints at an underlying issue in libunwind?

Is this a case where a function ends with a call instruction, and the return address points at the address after the call instruction, which would be the first instruction of the next function?

Can you provide disassembly of the linked binary (of the relevant functions) plus corresponding unwind info dumps (from e.g. llvm-dwarfdump)?

And/or, is this an issue where a tail call optimization makes the unwind stack entirely skip past a specific function in the stack?

What happens if you add e.g. a __asm__ __volatile__("nop"); after the call to __builtin_trap(), and in func() where you call crashing_leaf_func()? That would add one extra instruction to the function (and should avoid any tail call optimizations) so that the return address (in the case of func()) or the program counter (in the case of the trap, if it points at the next instruction) points within the range of the current function, instead of pointing at the first instruction of the next function?

Interesting it works on Ubuntu20.04/Aarch64, I compile the code with ToT clang or gcc-10.
could you share an objdump of main?

00000000002134ac <_Z18crashing_leaf_funcv>:
; _Z18crashing_leaf_funcv():
  2134ac: 20 00 20 d4   brk     #0x1
  2134b0: c0 03 5f d6   ret

00000000002134b4 <main>:
; main():
  2134b4: ff c3 00 d1   sub     sp, sp, #48
  2134b8: fd 7b 02 a9   stp     x29, x30, [sp, #32]
  2134bc: fd 83 00 91   add     x29, sp, #32
  2134c0: bf c3 1f b8   stur    wzr, [x29, #-4]
  2134c4: a0 83 1f b8   stur    w0, [x29, #-8]
  2134c8: e1 0b 00 f9   str     x1, [sp, #16]
  2134cc: a0 00 80 52   mov     w0, #5
  2134d0: 01 00 00 90   adrp    x1, 0x213000 <main+0x1c>
  2134d4: 21 80 11 91   add     x1, x1, #1120
  2134d8: e1 07 00 f9   str     x1, [sp, #8]
  2134dc: f9 0f 00 94   bl      0x2174c0 <signal@plt>
  2134e0: e1 07 40 f9   ldr     x1, [sp, #8]
  2134e4: 80 00 80 52   mov     w0, #4
  2134e8: f6 0f 00 94   bl      0x2174c0 <signal@plt>
  2134ec: f0 ff ff 97   bl      0x2134ac <_Z18crashing_leaf_funcv>
  2134f0: 20 00 80 12   mov     w0, #-2
  2134f4: fd 7b 42 a9   ldp     x29, x30, [sp, #32]
  2134f8: ff c3 00 91   add     sp, sp, #48
  2134fc: c0 03 5f d6   ret
void func() {
 crashing_leaf_func();
}

Then the test passes, but it looks like unwinder skips past func (it jumps from _Z18crashing_leaf_funcv to main). Perhaps all this hints at an underlying issue in libunwind?

here func will tail call crashing_leaf_func so it is expected to not be visible during the unwind because the return address will point to somewhere in main.

Would this still be expected if I added another layer of function calls? That is:

void crashing_leaf_func() {
  __builtin_trap();
}

void func() {
  crashing_leaf_func();
}

void func2() {
  func();
}

int main() {
  signal(SIGTRAP, signal_handler);
  signal(SIGILL, signal_handler);
  func2();
  return 2;
}

In this example, I hit crashing_leaf_func, func2, and main, missing only func.

Then the test passes, but it looks like unwinder skips past func (it jumps from _Z18crashing_leaf_funcv to main). Perhaps all this hints at an underlying issue in libunwind?

Is this a case where a function ends with a call instruction, and the return address points at the address after the call instruction, which would be the first instruction of the next function?

Can you provide disassembly of the linked binary (of the relevant functions) plus corresponding unwind info dumps (from e.g. llvm-dwarfdump)?

00000000002134fc <_Z18crashing_leaf_funcv>:
; _Z18crashing_leaf_funcv():
  2134fc: 20 00 20 d4   brk     #0x1
  213500: c0 03 5f d6   ret

0000000000213504 <_Z4funcv>:
; _Z4funcv():
  213504: fd 7b bf a9   stp     x29, x30, [sp, #-16]!
  213508: fd 03 00 91   mov     x29, sp
  21350c: fc ff ff 97   bl      0x2134fc <_Z18crashing_leaf_funcv>
  213510: fd 7b c1 a8   ldp     x29, x30, [sp], #16
  213514: c0 03 5f d6   ret

0000000000213518 <main>:
; main():
  213518: ff c3 00 d1   sub     sp, sp, #48
  21351c: fd 7b 02 a9   stp     x29, x30, [sp, #32]
  213520: fd 83 00 91   add     x29, sp, #32
  213524: bf c3 1f b8   stur    wzr, [x29, #-4]
  213528: a0 83 1f b8   stur    w0, [x29, #-8]
  21352c: e1 0b 00 f9   str     x1, [sp, #16]
  213530: a0 00 80 52   mov     w0, #5
  213534: 01 00 00 90   adrp    x1, 0x213000 <main+0x1c>
  213538: 21 c0 12 91   add     x1, x1, #1200
  21353c: e1 07 00 f9   str     x1, [sp, #8]
  213540: f8 0f 00 94   bl      0x217520 <signal@plt>
  213544: e1 07 40 f9   ldr     x1, [sp, #8]
  213548: 80 00 80 52   mov     w0, #4
  21354c: f5 0f 00 94   bl      0x217520 <signal@plt>
  213550: ed ff ff 97   bl      0x213504 <_Z4funcv>
  213554: 20 00 80 12   mov     w0, #-2
  213558: fd 7b 42 a9   ldp     x29, x30, [sp, #32]
  21355c: ff c3 00 91   add     sp, sp, #48
  213560: c0 03 5f d6   ret

CIE (same for each of the following FDEs):

00000088 00000014 00000000 CIE
  Format:                DWARF32
  Version:               1
  Augmentation:          "zR"
  Code alignment factor: 1
  Data alignment factor: -4
  Return address column: 30
  Augmentation data:     1B

  DW_CFA_def_cfa: WSP +0
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:

  CFA=WSP

main FDE:

00000118 0000001c 00000094 FDE cie=00000088 pc=00213518...00213564
  Format:       DWARF32
  DW_CFA_advance_loc: 12
  DW_CFA_def_cfa: W29 +16
  DW_CFA_offset: W30 -8
  DW_CFA_offset: W29 -16
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:

  0x213518: CFA=WSP
  0x213524: CFA=W29+16: W29=[CFA-16], W30=[CFA-8]

func FDE:

000000f8 0000001c 00000074 FDE cie=00000088 pc=00213504...00213518
  Format:       DWARF32
  DW_CFA_advance_loc: 8
  DW_CFA_def_cfa: W29 +16
  DW_CFA_offset: W30 -8
  DW_CFA_offset: W29 -16
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:

  0x213504: CFA=WSP
  0x21350c: CFA=W29+16: W29=[CFA-16], W30=[CFA-8]

crashing_leaf_func FDE:

000000e0 00000014 0000005c FDE cie=00000088 pc=002134fc...00213504
  Format:       DWARF32
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:
  DW_CFA_nop:

  0x2134fc: CFA=WSP

And/or, is this an issue where a tail call optimization makes the unwind stack entirely skip past a specific function in the stack?

The test is compiled without optimizations, so I don't think functions should be skipped in this case.

What happens if you add e.g. a __asm__ __volatile__("nop"); after the call to __builtin_trap(), and in func() where you call crashing_leaf_func()? That would add one extra instruction to the function (and should avoid any tail call optimizations) so that the return address (in the case of func()) or the program counter (in the case of the trap, if it points at the next instruction) points within the range of the current function, instead of pointing at the first instruction of the next function?

Added, but no changes in the functions that the unwinder sees. It still jumps from _Z18crashing_leaf_funcv to main.

Does one of you want to commandeer the patch? I'm somewhat out of my depth here and I don't have a system to reproduce on.

danielkiss added inline comments.Dec 2 2021, 7:14 AM
libunwind/test/unwind_leaffunction.pass.cpp
42–47

craching_lead_func could become a function with only 1 or 2 instructions ( just brk or brk+ret).
The first instruction is where unwind restores the pc but later it adjusts so it will point to the previous function actually so wrong FDE will be used for the unwinding which causes the skipped frames.
https://github.com/llvm/llvm-project/blob/main/libunwind/src/UnwindCursor.hpp#L1933

this "nop" is just a workaround which IMHO fine for now.
unwind may need to differentiate between the sync and async unwind cases.

leonardchan added inline comments.Dec 2 2021, 10:58 AM
libunwind/test/unwind_leaffunction.pass.cpp
42–47

Can confirm just adding the nop there allows for this test to pass on arm. I'll confirm on x64 then commandeer this patch and file a bug for this. Thanks for helping investigate this.

vvereschaka added inline comments.Dec 2 2021, 11:12 AM
libunwind/test/unwind_leaffunction.pass.cpp
42–47

I also confirm that the 'nop' there fixes the test on Aarch64/Ubuntu Linux 18.04 board.

leonardchan commandeered this revision.Dec 2 2021, 7:07 PM
leonardchan updated this revision to Diff 391528.
leonardchan edited reviewers, added: ldionne; removed: leonardchan.
leonardchan edited the summary of this revision. (Show Details)
leonardchan marked 2 inline comments as done.

Will file a bug also once the bugzilla migration finishes.

danielkiss accepted this revision.Dec 3 2021, 12:56 AM

Thanks for the update. LGTM.

This revision was not accepted when it landed; it landed in state Needs Review.Dec 3 2021, 11:21 AM
This revision was automatically updated to reflect the committed changes.