This patch attempt to fix https://bugs.llvm.org/show_bug.cgi?id=51864. Which is related to debugging of a binary. This patch is actually reverted https://reviews.llvm.org/D69723 with a minor change.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
IIUC the issue with previous version that .cfi_def_cfa_offset 0 should be .cfi_def_cfa_offset 16? If that is still the case I will try to fix this.
I have one question: - By applying this patch 304 test cases are failing. I only corrected one test case. Is there any script to update other test cases or do they need to be manually updated?
As I told you in various bug reports, the issue is that LLVM's CFI epilogue directives are utterly broken when shrink-wrapping is used. Please go re-read https://reviews.llvm.org/D69723 for the rationale behind reverting the original commit.
Thanks for letting me know on various reports :). But I completely don't know what is shrink wrapping. I will read it more. My only understanding it as LLVM x86 is emitting these CFI directives so there should be no problem in RISCV. If shrink wrapping is the problem then we should revert that patch: https://reviews.llvm.org/D62190. If .cfi_restore ra is the problem then does that need to be fixed?
I only want watchpoints to work correctly in that binary. If it is not possible here, I think LLVM developers recommending me to "go" and add support for hardware watchpoint in RISCV gdb port. Is that correct?
Hi @HsiangKai, I want to know how you have written 12877+ lines of test cases such as in https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/RISCV/rvv/vluxseg-rv32.ll . I need to update them in this patch to pass check-llvm. It is some script? can you please tell me how to use that.
Read the very first line:
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
Almost every test in llvm/test/CodeGen/RISCV uses that, you should rarely have to update them manually (I hope you didn't do so for all these tests here... if you did, please re-run the script on all of them, because the script can include things a human might not, e.g. you can see comments on many of the lines that technically aren't needed, but using the script ensures that when someone re-runs it they don't see spurious diffs).
That is:
utils/update_llc_test_checks.py --llc-binary build/bin/llc -u `find llvm/test/CodeGen/RISCV \( -name '*.ll' -o -name '*.mir' \)` utils/update_mir_test_checks.py --llc-binary build/bin/llc -u `find llvm/test/CodeGen/RISCV \( -name '*.ll' -o -name '*.mir' \)`
Thanks, I think it is not mention in the docs, I will write something about it in TestingGuide.rst .
Hi @luismarques, Do you have any comments here? Or what happens(break) when this CFI directive is emitted and shrink wrapping is used, an example? Stack unwinding problem, still it is here, what it needed to be fixed, maybe some hint.
I think this problem needs to be solved soon, and I am not an expert here to fix it by myself.
Hello @luismarques,
can you please tell me how you have generated this assembly mentioned in the commit summary(.c file)?
define void @branch_and_tail_call(i1 %a) { ; RV32-LABEL: branch_and_tail_call: ; RV32: # %bb.0: ; RV32-NEXT: addi sp, sp, -16 ; RV32-NEXT: .cfi_def_cfa_offset 16 ; RV32-NEXT: sw ra, 12(sp) ; RV32-NEXT: .cfi_offset ra, -4 ; RV32-NEXT: andi a0, a0, 1 ; RV32-NEXT: beqz a0, .LBB2_2 ; RV32-NEXT: # %bb.1: # %blue_pill ; RV32-NEXT: lw ra, 12(sp) ; RV32-NEXT: .cfi_restore ra ; RV32-NEXT: addi sp, sp, 16 ; RV32-NEXT: .cfi_def_cfa_offset 0 ; RV32-NEXT: tail foo ; RV32-NEXT: .LBB2_2: # %red_pill ; RV32-NEXT: call bar ; RV32-NEXT: lw ra, 12(sp) ; RV32-NEXT: .cfi_restore ra ; RV32-NEXT: addi sp, sp, 16 ; RV32-NEXT: .cfi_def_cfa_offset 0 ; RV32-NEXT: ret }
For instance, without this patch the following program will not correctly unwind, and will get stuck (compile with clang, link with a gnu toolchain, test with qemu):
void three() { throw 7; } void two(void) { try { three(); } catch(int &c) { throw 42; } } int main() { try { two(); } catch(...) { } return 0; }
When you say it get stuck, can you please tell me what steps you have run?
PS I want to revert this patch to fix https://bugs.llvm.org/show_bug.cgi?id=51864 in https://reviews.llvm.org/D111519.
This is clearly the branch_and_tail_call test case in frame-info.ll that had the ; FIXME: fix use of .cfi_restore with wrong CFAs.
For instance, without this patch the following program will not correctly unwind, and will get stuck (compile with clang, link with a gnu toolchain, test with qemu):
void three() { throw 7; } void two(void) { try { three(); } catch(int &c) { throw 42; } } int main() { try { two(); } catch(...) { } return 0; }When you say it get stuck, can you please tell me what steps you have run?
I'm going to assume as he said: "compile with clang, link with a gnu toolchain, test with qemu". Presumably the bogus DWARF leads it to get stuck in an unwind loop, with an early prologue meaning the call site is incorrectly (by correctly decoding the DWARF) determined to be a leaf, but RA is the call site itself, not the caller, so every iteration does nothing.
PS I want to revert this patch to fix https://bugs.llvm.org/show_bug.cgi?id=51864 in https://reviews.llvm.org/D111519.
I'm getting tired of saying this, but the underlying target-independent deficiency in LLVM needs fixing before you can think about reverting this.
I am still looking for .cpp file so that I can test/diff it (clang riscv assembly) with gcc riscv generated assembly. Side question if someone knows why clang riscv emits .cfi_restore in epilogue (for frame-info.ll after applying this patch) while clang x86 does not. What is the use of .cf_restore in riscv that is not needed in x86? I have read .cfi_restore directive in https://sourceware.org/binutils/docs/as/CFI-directives.html but definition didn't give the answer.
I have taken this example from https://reviews.llvm.org/D69723 to understand the problem with stack unwinding.
cat test.cpp
void three() { throw 7; } void two(void) { try { three(); } catch(int &c) { throw 42; } } int main() { try { two(); } catch(...) { } return 0; }
If I see the output of the generated assembly in both GCC RISCV and Clang RISCV cases, differences can be seen in three directive lines sequence in epilogue.
clang RISCV epilogue:
.cfi_def_cfa sp, 48 ld ra, 40(sp) .cfi_restore ra .cfi_restore s0
vs
GCC RISCV epilogue where 1= ra and 8= s0:
.cfi_restore 1 ld s0,0(sp) .cfi_restore 8 .cfi_def_cfa 2, 16
So I will assume these three lines need to be corrected to match GCC and fix the issue.
No. That is -O0 code and won't exhibit the issue, because the issue happens due to optimisation. Nor is main the function I believe to be the problem, since once you've unwound to main (which depends on the callee to have correct CFI, not the caller) there's no more unwinding to do, either implicitly (since the exception has been caught) or explicitly (since main itself performs no further actions that will request unwinding).
See https://godbolt.org/z/5xvTY9EKf. Observe that two has had its epilogue scheduled before its landing pad (corresponding to the call site that is the call to three):
.Ltmp10: .Ltmp4: .loc 1 15 1 # example.cpp:15:1 ld s0, 0(sp) # 8-byte Folded Reload # after this patch: .cfi_def_cfa sp, 16 ld ra, 8(sp) # 8-byte Folded Reload # after this patch: .cfi_restore ra # after this patch: .cfi_restore s0 addi sp, sp, 16 # after this patch: .cfi_def_cfa_offset 0 ret .LBB1_2: .Ltmp5: .Ltmp11: .loc 1 15 1 is_stmt 0 # example.cpp:15:1 sext.w a1, a1 addi a2, zero, 1 mv s0, a0 .loc 1 12 5 is_stmt 1 # example.cpp:12:5 bne a1, a2, .LBB1_6 mv a0, s0 call __cxa_begin_catch ...
This means that, if you have CFI in the epilogue as with this patch (annotated above with what I believe would be emitted), it is bogusly applied to the call to __cxa_throw (for throwing the 42 in the catch).
Now let's consider how that unwinds (during the search phase). The runtime and unwinder will unwind fine all the way back to the call to __cxa_throw (specifically the byte after; all the range checks are start < x <= end not the normal start <= x < end since jalr writes the next instruction to the link register) during the search phase. Once there, the runtime will ask the unwinder to unwind another frame, as there is no exception handler there for an int. But the unwinder, trusting the CFI which says ra has already been restored (and s0), will leave ra (and s0) untouched, so after one iteration of unwinding ra still points at the byte after two's call to __cxa_throw. So it sees if there's an exception handler there for an int, and the answer is still no. So it unwinds again, trusting the CFI, and again leaves ra untouched, which again still points at the byte after the call to __cxa_throw. And so on and so on, forever trying to unwind out of two to find an exception handler but stuck in it with no escape. It's possible I've got the details slightly wrong, but they aren't really the point, the general idea of how this breaks is, and why it is crucial that all C++ call sites have CFI that will unwind accurately if followed.
GCC's code for the same sequence is:
.LVL2: .LEHE0: .loc 1 15 1 ld ra,8(sp) .cfi_remember_state .cfi_restore 1 ld s0,0(sp) .cfi_restore 8 addi sp,sp,16 .cfi_def_cfa_offset 0 jr ra .L7: .cfi_restore_state .loc 1 12 7 li a5,1 bne a1,a5,.L11 .LBB2: .loc 1 12 18 discriminator 1 call __cxa_begin_catch ...
This is almost the same CFI as what this patch produces (GCC interleaves the .cfi_restores with the corresponding loads but that doesn't really matter, so long as the stack slot and stack pointer aren't clobbered it's valid to defer a .cfi_restore) with the key difference that, because the epilogue has been scheduled before the landing pad, a .cfi_remember_state is emitted before any other CFI directives in the epilogue and a .cfi_restore_state is emitted as the first thing in the landing pad, restoring the state to the remembered state, namely that before the epilogue, and thus ensuring that the unwinder sees CFI that says ra is still stored on the stack (along with s0). That is the crucial part missing in LLVM, and the reason that we cannot emit CFI in epilogues.
(In case you're looking at main's assembly and notice that it too has its epilogue scheduled before its landing pad, yes that results in incorrect CFI for the landing pad, but no it doesn't matter in this case unless you're debugging, since all it does is call __cxa_begin_catch and __cxa_end_catch that do reference counting of the exception object behind the scenes, they don't need to unwind)
my command: riscv64-unknown-elf-g++ xxx.cpp -g -S xxx.s, but my assemble don't have .cfi_remember_state and .cfi_restore_state directives.
ToolChain down load from here:
I was wondering how your CFI directive above came about? Is there something wrong with my command?
I used the GCC 10.2.0 on godbolt.org. Basic block scheduling in newer GCCs may result in the directives being not needed for its code generation, I don't know; the necessary CFI is a function of what code is being generated, so if that changes then so does the CFI.