Page MenuHomePhabricator

Fix for TestNoreturnUnwind.py on i386
ClosedPublic

Authored by ravitheja on Nov 27 2015, 7:42 AM.

Details

Summary

The testcase TestNoreturnUnwind.py was failing
because the unwind from the vdso library was not
successful for clang compiler while it was passing
for gcc. It was passing for gcc since the unwind plan
used was the assembly plan and the ebp register was
set by the main function in case of gcc and was not
used by the functions in the call flow to the vdso, whereas
clang did not emit assembly prologue for main and so
the assembly unwind was failing. Normally in case of
failure of assembly unwind, lldb switches to EH CFI frame
based unwinding, but this was not happening for
the first frame. This patch tries to fix this behaviour by
falling to EH CFI frame based unwinding in case of assembly
unwind failure even for the first frame.
The test is still marked as XFAIL since it relys on the fix
of another bug.

Diff Detail

Event Timeline

ravitheja updated this revision to Diff 41308.Nov 27 2015, 7:42 AM
ravitheja retitled this revision from to Fix for TestNoreturnUnwind.py on i386.
ravitheja updated this object.
clayborg resigned from this revision.Nov 30 2015, 8:56 AM
clayborg removed a reviewer: clayborg.

Jason Molenda should be sufficient to review this.

jasonmolenda edited edge metadata.Nov 30 2015, 9:13 PM

Hi Ravitheja, thanks for putting together a patch, I'm probably the best person to look at this.

I'm not sure I understand the failure mode. When built with clang, this vdso routine does not preserve the frame pointer register (rbp), right? If lldb knows the start address of the function, lldb should be able to use the assembly language inspection to get the correct backtrace, even for frameless code.

I'm afraid this change would (ultimately) make lldb fall back to the architectural default unwind plan - and if it's in a frameless function, that's not correct. When the arch default unwind plan is used in a frameless function, you'll usually end up skipping the caller stack frame when you show the backtrace.

Would it be possible to show the disassembly of the routine lldb is having problems with? Could you stop there and backtrace while

(lldb) log enable lldb unwind

is enabled? The logging is a little difficult to follow, but once you're used to how the unwinder works it makes pretty good sense. I don't have a system set up to get this information myself - I'd like to make sure I understand how the unwinder is failing here.

Thanks!

ravitheja added a comment.EditedDec 1 2015, 2:06 AM

Hi Jason,
Ok let me explain the scenario here, the application i.e the test function tries to abort which eventually ends up in a function present in the vdso
, on the way the call flow goes through some functions in libc. Now the backtrace is done from inside of the function in the vdso, which was successful
in the case of gcc and unsuccessful for clang. Before I tell you my findings, I would mention that the vdso is mapped by the kernel in to the inferiors
memory and would be the same for clang or gcc. Now I found that the ebp register was not touched by the functions in the libc and the assembly
unwind plan was used for the vdso for the first frame. This was successful for gcc because gcc emitted the prologue for the main function
whereas clang does not, in the case of GCC the assembly unwind was successful because the ebp was set by main and no other function touched it
on the way. What I see is that for the first frame, LLDB only tries to use the assembly unwind plan, and if it fails it just exits instead
of trying something else like EH CFI.

Here is the log without my patch -

(lldb) log enable lldb unwind
(lldb) run
Process 99000 launched: '/nfs/site/disks/idb_team/raddepal/LLDB/llvm/llvm/tools/lldb/packages/Python/lldbsuite/test/functionalities/unwind/noreturn/a.out' (i386)
th1/fr0 with pc value of 0x8048350, symbol name is ''
(i386) /nfs/site/disks/idb_team/raddepal/LLDB/llvm/llvm/tools/lldb/packages/Python/lldbsuite/test/functionalities/unwind/noreturn/a.out: Reading EH frame info
th1/fr0 0x0000000008048350: CFA=esp +4 => esp=CFA+0 eip=[CFA-4]

th1/fr0 CFA is 0xffffd4d4: Register esp (7) contents are 0xffffd4d0, offset is 4
th1/fr0 initialized frame current pc is 0x8048350 cfa is 0xffffd4d4 using assembly insn profiling UnwindPlan
th1/fr0 supplying caller's saved eip (8)'s location using assembly insn profiling UnwindPlan
th1/fr0 supplying caller's register eip (8) from the stack, saved at CFA plus offset -4 [saved at 0xffffd4d0]
th1/fr1 pc = 0x1
th1/fr0 supplying caller's register ebp (6) from the live RegisterContext at frame 0
th1/fr1 fp = 0x0
th1/fr0 supplying caller's saved esp (7)'s location using assembly insn profiling UnwindPlan
th1/fr0 supplying caller's register esp (7), value is CFA plus offset 0 [value is 0xffffd4d4]
th1/fr1 sp = 0xffffd4d4
th1/fr1 this frame has a pc of 0x0
th1/fr0 supplying caller's saved eip (8)'s location, cached
th1/fr0 Got an invalid CFA register value - reg ebp (6), value 0x0
th1/fr0 failed to get cfa with fallback unwindplan
Frame 1 invalid RegisterContext for this frame, stopping stack walk
th1 Unwind of this thread is complete.
th1/fr0 with pc value of 0xf7fd9d80, symbol name is '__kernel_vsyscall'
(i386) [vdso](0x00000000f7fd9000): Reading EH frame info
th1/fr0 0x00000000f7fd9d75: CFA=ebp+16 => ebp=[CFA-16] esp=CFA+0 eip=[CFA-4]

th1/fr0 Got an invalid CFA register value - reg ebp (6), value 0x0
th1/fr0 could not read CFA register for this frame.
th1 Unwind of this thread is complete.
Process 99000 stopped

  • thread #1: tid = 99000, 0xf7fd9d80 [vdso]__kernel_vsyscall + 16, name = 'a.out', stop reason = signal SIGABRT frame #0: 0xf7fd9d80 [vdso]__kernel_vsyscall + 16

[vdso]`__kernel_vsyscall:
-> 0xf7fd9d80 <+16>: popl %ebp

0xf7fd9d81 <+17>: popl   %edx
0xf7fd9d82 <+18>: popl   %ecx
0xf7fd9d83 <+19>: retl

(lldb) disassemble
[vdso]`__kernel_vsyscall:

0xf7fd9d70 <+0>:  pushl  %ecx
0xf7fd9d71 <+1>:  pushl  %edx
0xf7fd9d72 <+2>:  pushl  %ebp
0xf7fd9d73 <+3>:  movl   %esp, %ebp
0xf7fd9d75 <+5>:  sysenter 
0xf7fd9d77 <+7>:  nop    
0xf7fd9d78 <+8>:  nop    
0xf7fd9d79 <+9>:  nop    
0xf7fd9d7a <+10>: nop    
0xf7fd9d7b <+11>: nop    
0xf7fd9d7c <+12>: nop    
0xf7fd9d7d <+13>: nop    
0xf7fd9d7e <+14>: int    $0x80

-> 0xf7fd9d80 <+16>: popl %ebp

0xf7fd9d81 <+17>: popl   %edx
0xf7fd9d82 <+18>: popl   %ecx
0xf7fd9d83 <+19>: retl

After the patch ->

(lldb) log enable lldb unwind
(lldb) run
Process 99000 launched: '/nfs/site/disks/idb_team/raddepal/LLDB/llvm/llvm/tools/lldb/packages/Python/lldbsuite/test/functionalities/unwind/noreturn/a.out' (i386)
th1/fr0 with pc value of 0x8048350, symbol name is ''
(i386) /nfs/site/disks/idb_team/raddepal/LLDB/llvm/llvm/tools/lldb/packages/Python/lldbsuite/test/functionalities/unwind/noreturn/a.out: Reading EH frame info
th1/fr0 0x0000000008048350: CFA=esp +4 => esp=CFA+0 eip=[CFA-4]

th1/fr0 CFA is 0xffffd4d4: Register esp (7) contents are 0xffffd4d0, offset is 4
th1/fr0 initialized frame current pc is 0x8048350 cfa is 0xffffd4d4 using assembly insn profiling UnwindPlan
th1/fr0 supplying caller's saved eip (8)'s location using assembly insn profiling UnwindPlan
th1/fr0 supplying caller's register eip (8) from the stack, saved at CFA plus offset -4 [saved at 0xffffd4d0]
th1/fr1 pc = 0x1
th1/fr0 supplying caller's register ebp (6) from the live RegisterContext at frame 0
th1/fr1 fp = 0x0
th1/fr0 supplying caller's saved esp (7)'s location using assembly insn profiling UnwindPlan
th1/fr0 supplying caller's register esp (7), value is CFA plus offset 0 [value is 0xffffd4d4]
th1/fr1 sp = 0xffffd4d4
th1/fr1 this frame has a pc of 0x0
th1/fr0 supplying caller's saved eip (8)'s location, cached
th1/fr0 Got an invalid CFA register value - reg ebp (6), value 0x0
th1/fr0 failed to get cfa with fallback unwindplan
Frame 1 invalid RegisterContext for this frame, stopping stack walk
th1 Unwind of this thread is complete.
th1/fr0 with pc value of 0xf7fd9d80, symbol name is '__kernel_vsyscall'
(i386) [vdso](0x00000000f7fd9000): Reading EH frame info
th1/fr0 0x00000000f7fd9d75: CFA=ebp+16 => ebp=[CFA-16] esp=CFA+0 eip=[CFA-4]

th1/fr0 Got an invalid CFA register value - reg ebp (6), value 0x0
th1/fr0 could not read CFA register for this frame.
th1 Unwind of this thread is complete.
Process 99000 stopped

  • thread #1: tid = 99000, 0xf7fd9d80 [vdso]__kernel_vsyscall + 16, name = 'a.out', stop reason = signal SIGABRT frame #0: 0xf7fd9d80 [vdso]__kernel_vsyscall + 16

[vdso]`__kernel_vsyscall:
-> 0xf7fd9d80 <+16>: popl %ebp

0xf7fd9d81 <+17>: popl   %edx
0xf7fd9d82 <+18>: popl   %ecx
0xf7fd9d83 <+19>: retl

(lldb) disassemble
[vdso]`__kernel_vsyscall:

0xf7fd9d70 <+0>:  pushl  %ecx
0xf7fd9d71 <+1>:  pushl  %edx
0xf7fd9d72 <+2>:  pushl  %ebp
0xf7fd9d73 <+3>:  movl   %esp, %ebp
0xf7fd9d75 <+5>:  sysenter 
0xf7fd9d77 <+7>:  nop    
0xf7fd9d78 <+8>:  nop    
0xf7fd9d79 <+9>:  nop    
0xf7fd9d7a <+10>: nop    
0xf7fd9d7b <+11>: nop    
0xf7fd9d7c <+12>: nop    
0xf7fd9d7d <+13>: nop    
0xf7fd9d7e <+14>: int    $0x80

-> 0xf7fd9d80 <+16>: popl %ebp

0xf7fd9d81 <+17>: popl   %edx
0xf7fd9d82 <+18>: popl   %ecx
0xf7fd9d83 <+19>: retl   







(lldb) log enable lldb unwind

(lldb) run
Process 113600 launched: '/nfs/site/disks/idb_team/raddepal/LLDB/llvm/llvm/tools/lldb/packages/Python/lldbsuite/test/functionalities/unwind/noreturn/a.out' (i386)
th1/fr0 with pc value of 0x8048350, symbol name is ''
(i386) /nfs/site/disks/idb_team/raddepal/LLDB/llvm/llvm/tools/lldb/packages/Python/lldbsuite/test/functionalities/unwind/noreturn/a.out: Reading EH frame info
th1/fr0 0x0000000008048350: CFA=esp +4 => esp=CFA+0 eip=[CFA-4]

th1/fr0 CFA is 0xffffd4d4: Register esp (7) contents are 0xffffd4d0, offset is 4
th1/fr0 initialized frame current pc is 0x8048350 cfa is 0xffffd4d4 using assembly insn profiling UnwindPlan
th1/fr0 supplying caller's saved eip (8)'s location using assembly insn profiling UnwindPlan
th1/fr0 supplying caller's register eip (8) from the stack, saved at CFA plus offset -4 [saved at 0xffffd4d0]
th1/fr1 pc = 0x1
th1/fr0 supplying caller's register ebp (6) from the live RegisterContext at frame 0
th1/fr1 fp = 0x0
th1/fr0 supplying caller's saved esp (7)'s location using assembly insn profiling UnwindPlan
th1/fr0 supplying caller's register esp (7), value is CFA plus offset 0 [value is 0xffffd4d4]
th1/fr1 sp = 0xffffd4d4
th1/fr1 this frame has a pc of 0x0
th1/fr0 supplying caller's saved eip (8)'s location, cached
th1/fr0 Got an invalid CFA register value - reg ebp (6), value 0x0
th1/fr0 failed to get cfa with fallback unwindplan
Frame 1 invalid RegisterContext for this frame, stopping stack walk
th1 Unwind of this thread is complete.
th1/fr0 with pc value of 0xf7fd9d80, symbol name is '__kernel_vsyscall'
(i386) [vdso](0x00000000f7fd9000): Reading EH frame info
th1/fr0 0x00000000f7fd9d75: CFA=ebp+16 => ebp=[CFA-16] esp=CFA+0 eip=[CFA-4]

th1/fr0 Got an invalid CFA register value - reg ebp (6), value 0x0
th1/fr0 supplying caller's saved eip (8)'s location using assembly insn profiling UnwindPlan
th1/fr0 supplying caller's register eip (8) from the stack, saved at CFA plus offset -4 [saved at 0xfffffffffffffffb]
th1/fr0 CFA is 0xffffd2d4: Register esp (7) contents are 0xffffd2c4, offset is 16
th1/fr0 requested caller's saved PC but this UnwindPlan uses a RA reg; getting eip (8) instead
th1/fr0 supplying caller's saved eip (8)'s location using eh_frame CFI UnwindPlan
th1/fr0 supplying caller's register eip (8) from the stack, saved at CFA plus offset -4 [saved at 0xffffd2d0]
th1/fr0 trying to unwind from this function with the UnwindPlan 'eh_frame CFI' because UnwindPlan 'assembly insn profiling' failed.
th1/fr0 initialized frame current pc is 0xf7fd9d80 cfa is 0xffffd2d4 using eh_frame CFI UnwindPlan
th1/fr0 supplying caller's saved eip (8)'s location, cached
th1/fr1 pc = 0xf7e40607
th1/fr0 supplying caller's saved ebp (6)'s location using eh_frame CFI UnwindPlan
th1/fr0 supplying caller's register ebp (6) from the stack, saved at CFA plus offset -16 [saved at 0xffffd2c4]
th1/fr1 fp = 0x0
th1/fr0 supplying caller's stack pointer esp (7) value, computed from CFA
th1/fr1 sp = 0xffffd2d4
th1/fr1 with pc value of 0xf7e40607, symbol name is 'gsignal'
(i386) /lib/i386-linux-gnu/libc.so.6: Reading EH frame info
th1/fr1 active row: 0x00000000f7e405c2: CFA=esp+12 => ebx=[CFA-12] edi=[CFA-8] eip=[CFA-4]

th1/fr0 supplying caller's saved esp (7)'s location, cached
th1/fr1 CFA is 0xffffd2e0: Register esp (7) contents are 0xffffd2d4, offset is 12
th1/fr1 m_cfa = 0xffffd2e0
th1/fr1 initialized frame current pc is 0xf7e40607 cfa is 0xffffd2e0
th1/fr0 supplying caller's saved eip (8)'s location, cached
th1/fr1 requested caller's saved PC but this UnwindPlan uses a RA reg; getting eip (8) instead
th1/fr1 supplying caller's saved eip (8)'s location using eh_frame CFI UnwindPlan
th1/fr1 supplying caller's register eip (8) from the stack, saved at CFA plus offset -4 [saved at 0xffffd2dc]

th1/fr2 pc = 0xf7e43a33

th1/fr1 no save location for ebp (6) via 'eh_frame CFI'
th1/fr0 supplying caller's saved ebp (6)'s location, cached

th1/fr2 fp = 0x0

th1/fr1 supplying caller's stack pointer esp (7) value, computed from CFA

th1/fr2 sp = 0xffffd2e0
th1/fr2 with pc value of 0xf7e43a33, symbol name is 'abort'
th1/fr2 active row: 0x00000000f7e43903: CFA=esp+304 => ebx=[CFA-12] edi=[CFA-8] eip=[CFA-4]

th1/fr1 supplying caller's saved esp (7)'s location, cached

th1/fr2 CFA is 0xffffd410: Register esp (7) contents are 0xffffd2e0, offset is 304
th1/fr2 m_cfa = 0xffffd410
th1/fr2 initialized frame current pc is 0xf7e43a33 cfa is 0xffffd410

th1/fr1 supplying caller's saved eip (8)'s location, cached
Process 113600 stopped

  • thread #1: tid = 113600, 0xf7fd9d80 [vdso]__kernel_vsyscall + 16, name = 'a.out', stop reason = signal SIGABRT frame #0: 0xf7fd9d80 [vdso]__kernel_vsyscall + 16

[vdso]`__kernel_vsyscall:
-> 0xf7fd9d80 <+16>: popl %ebp

0xf7fd9d81 <+17>: popl   %edx
0xf7fd9d82 <+18>: popl   %ecx
0xf7fd9d83 <+19>: retl

(lldb) disassemble
[vdso]`__kernel_vsyscall:

0xf7fd9d70 <+0>:  pushl  %ecx
0xf7fd9d71 <+1>:  pushl  %edx
0xf7fd9d72 <+2>:  pushl  %ebp
0xf7fd9d73 <+3>:  movl   %esp, %ebp
0xf7fd9d75 <+5>:  sysenter 
0xf7fd9d77 <+7>:  nop    
0xf7fd9d78 <+8>:  nop    
0xf7fd9d79 <+9>:  nop    
0xf7fd9d7a <+10>: nop    
0xf7fd9d7b <+11>: nop    
0xf7fd9d7c <+12>: nop    
0xf7fd9d7d <+13>: nop    
0xf7fd9d7e <+14>: int    $0x80

-> 0xf7fd9d80 <+16>: popl %ebp

0xf7fd9d81 <+17>: popl   %edx
0xf7fd9d82 <+18>: popl   %ecx
0xf7fd9d83 <+19>: retl

Hi Ravitheja, thanks for the logging output. It does look like the assembly profiling is failing for some reason, and with your patch we fall back to using the eh_frame instructions and it succeeds. I'll look at the patch closely tomorrow (I meant to do it tonight but I'm running out of time before I need to head home) - but I'm just curious what lldb will report for

(lldb) image show-unwind -n __kernel_vsyscall

when you have the process running. Looking at the assembly instructions, it seems like the assembly language profile inspection should have done the right thing here. The rule "CFA=ebp+16 => ebp=[CFA-16] esp=CFA+0 eip=[CFA-4]" looks right to me at this point in the function.

Thanks!

Jason

Just to be clear - the change your proposing may be a good improvement for falling back to another unwind plan when the initial unwind plan fails. But I'm curious why the initial unwind plan (assembly) failed here to begin with. I think lldb should have profiled this function correctly.

The reason assembly unwind did not work is because if the ebp register is set to 0 at that time, so the CFA is not available. In the case of clang the frame pointer is not set by any function, which is why the formula does not work.

(lldb) image show-unwind -n kernel_vsyscall
UNWIND PLANS for [vdso]`
kernel_vsyscall (start addr 0xf7fd9d70)

Asynchronous (not restricted to call-sites) UnwindPlan is 'assembly insn profiling'
Synchronous (restricted to call-sites) UnwindPlan is 'eh_frame CFI'

Assembly language inspection UnwindPlan:
This UnwindPlan originally sourced from assembly insn profiling
This UnwindPlan is sourced from the compiler: no.
This UnwindPlan is valid at all instruction locations: yes.
Address range of this UnwindPlan: [[vdso]..text + 1600-0x00000654)
row[0]: 0: CFA=esp +4 => esp=CFA+0 eip=[CFA-4]
row[1]: 1: CFA=esp +8 => esp=CFA+0 eip=[CFA-4]
row[2]: 2: CFA=esp+12 => esp=CFA+0 eip=[CFA-4]
row[3]: 3: CFA=esp+16 => ebp=[CFA-16] esp=CFA+0 eip=[CFA-4]
row[4]: 5: CFA=ebp+16 => ebp=[CFA-16] esp=CFA+0 eip=[CFA-4]
row[5]: 17: CFA=esp+12 => esp=CFA+0 eip=[CFA-4]
row[6]: 18: CFA=esp +8 => esp=CFA+0 eip=[CFA-4]
row[7]: 19: CFA=esp +4 => esp=CFA+0 eip=[CFA-4]

eh_frame UnwindPlan:
This UnwindPlan originally sourced from eh_frame CFI
This UnwindPlan is sourced from the compiler: yes.
This UnwindPlan is valid at all instruction locations: no.
Address range of this UnwindPlan: [[vdso]..text + 1600-0x00000654)
row[0]: 0: CFA=esp +4 => eip=[CFA-4]
row[1]: 1: CFA=esp +8 => eip=[CFA-4]
row[2]: 2: CFA=esp+12 => eip=[CFA-4]
row[3]: 3: CFA=esp+16 => ebp=[CFA-16] eip=[CFA-4]
row[4]: 17: CFA=esp+12 => ebp=[CFA-16] eip=[CFA-4]
row[5]: 18: CFA=esp +8 => ebp=[CFA-16] eip=[CFA-4]
row[6]: 19: CFA=esp +4 => ebp=[CFA-16] eip=[CFA-4]

Arch default UnwindPlan:
This UnwindPlan originally sourced from i386 default unwind plan
This UnwindPlan is sourced from the compiler: no.
This UnwindPlan is valid at all instruction locations: no.
row[0]: 0: CFA=ebp +8 => esp=CFA+0 ebp=[CFA-8] eip=[CFA-4]

Arch default at entry point UnwindPlan:
This UnwindPlan originally sourced from i386 at-func-entry default
This UnwindPlan is sourced from the compiler: no.
This UnwindPlan is valid at all instruction locations: not specified.
row[0]: 0: CFA=esp +4 => esp=CFA+0 eip=[CFA-4]

Interesting. The routine is

0xf7fd9d70 <+0>: pushl %ecx
0xf7fd9d71 <+1>: pushl %edx
0xf7fd9d72 <+2>: pushl %ebp
0xf7fd9d73 <+3>: movl %esp, %ebp
0xf7fd9d75 <+5>: sysenter
[...]
0xf7fd9d7e <+14>: int $0x80
--> 0xf7fd9d80 <+16>: popl %ebp
0xf7fd9d81 <+17>: popl %edx
0xf7fd9d82 <+18>: popl %ecx
0xf7fd9d83 <+19>: retl

The only difference between the assembly profile v. eh_frame is that the assembly profile identifies the movl %esp, %esp instruction and says that the unwind should be done in terms of ebp. ebp and esp should be identical at this point, so it doesn't matter which one is used. But when we look at the unwind logging, ebp seems to have a value of 0 and esp seems to have a value of 0xffffd2d4. Does the sysenter / int instruction sequence mess with the contents of ebp? This routine immediately trashes the contents of ebp with the 'popl %ebp' so it's irrelevant what is stored there. In fact, I don't understand why this function saves the caller's ebp with the 'movl %esp, %ebp' insn, considering that it doesn't make use of ebp at all.

Well, it is clear that lldb needs to be able to fall back to an alternate unwind scheme here - I don't know exactly what's happening during this function, but for some reason an unwind that tries to use ebp will fail.

I'm also curious why lldb didn't try to use the eh_frame instructions to begin with. We generally try to do that if it looks like the instructions describe the prologue/epilogue (which this one does). I'll go look at the eh_frame augmenter and see what the rules are but maybe it has a sanity check that it wants the starting unwind rule to match the ending unwind rule, but the eh_frame unwind rules don't clear the ebp save location once ebp is restored.

row[0]: 0: CFA=esp +4 => eip=[CFA-4]
row[6]: 19: CFA=esp +4 => ebp=[CFA-16] eip=[CFA-4]

Hello, Any updates on this differential ?

jasonmolenda accepted this revision.Jan 11 2016, 9:07 PM
jasonmolenda edited edge metadata.

Hi Ravi, sorry for taking so long to get back to you on this. It looks like a reasonable change - let's give it a try. Sometimes with the unwinder, it can be hard to foresee problems across all the different architectures/environments that we work. I'm sure it won't cause problems, but I still get a little nervous every time we add another trick to get an unwind to work. :)

This revision is now accepted and ready to land.Jan 11 2016, 9:07 PM
ravitheja closed this revision.Jan 12 2016, 2:12 AM