This patch adds two new macros to setjmp (STORE, STORE_FP) and two new
macros to longjmp (LOAD, LOAD_FP) that takes a register and a buff, then
select the correct asm instruction for rv32 and rv64.
Details
Diff Detail
Event Timeline
Uh, how is this implementation safe at all? You have no idea what the compiler's done to the registers outside the inline assembly.
For setjmp, the registers seen by the inline assembly may not have the same values as they did on entry to the function. For longjmp, the values seen by the compiler-generated return instruction may not be the same as those loaded up. In both cases, this is due to the compiler being free to use the registers however it sees fit. For example, it may choose to allocate a stack frame, adjusting sp on entry and exit, or spill and restore ra itself in longjmp, thereby undoing the load from the jmp_buf.
I see what you mean now. I actually tried adding attribute((naked)) in both setjmp and longjmp (in D145584), so the compiler wouldn't generate the function prologue/epilogue, however, clang generates only the code defined in the assembly, so I've removed it:
- rv32 version with clang 17:
LLVM_LIBC_FUNCTION(int, setjmp, (__jmp_buf * buf)) { STORE(ra, buf->__pc); 2: 00152023 sw ra,0(a0) STORE(s0, buf->__regs[0]); 6: c140 sw s0,4(a0) STORE(s1, buf->__regs[1]); 8: c504 sw s1,8(a0) STORE(s2, buf->__regs[2]); a: 01252623 sw s2,12(a0) STORE(s3, buf->__regs[3]); e: 01352823 sw s3,16(a0) STORE(s4, buf->__regs[4]); 12: 01452a23 sw s4,20(a0) STORE(s5, buf->__regs[5]); 16: 01552c23 sw s5,24(a0) STORE(s6, buf->__regs[6]); 1a: 01652e23 sw s6,28(a0) STORE(s7, buf->__regs[7]); 1e: 03752023 sw s7,32(a0) STORE(s8, buf->__regs[8]); 22: 03852223 sw s8,36(a0) STORE(s9, buf->__regs[9]); 26: 03952423 sw s9,40(a0) STORE(s10, buf->__regs[10]); 2a: 03a52623 sw s10,44(a0) STORE(s11, buf->__regs[11]); 2e: 03b52823 sw s11,48(a0) STORE(sp, buf->__sp); 32: 02252a23 sw sp,52(a0) #if __riscv_float_abi_double STORE_FP(fs0, buf->__fpregs[0]); 36: fd00 fsw fs0,56(a0) STORE_FP(fs1, buf->__fpregs[1]); 38: e124 fsw fs1,64(a0) STORE_FP(fs2, buf->__fpregs[2]); 3a: 05252427 fsw fs2,72(a0) STORE_FP(fs3, buf->__fpregs[3]); 3e: 05352827 fsw fs3,80(a0) STORE_FP(fs4, buf->__fpregs[4]); 42: 05452c27 fsw fs4,88(a0) STORE_FP(fs5, buf->__fpregs[5]); 46: 07552027 fsw fs5,96(a0) STORE_FP(fs6, buf->__fpregs[6]); 4a: 07652427 fsw fs6,104(a0) STORE_FP(fs7, buf->__fpregs[7]); 4e: 07752827 fsw fs7,112(a0) STORE_FP(fs8, buf->__fpregs[8]); 52: 07852c27 fsw fs8,120(a0) STORE_FP(fs9, buf->__fpregs[9]); 56: 09952027 fsw fs9,128(a0) STORE_FP(fs10, buf->__fpregs[10]); 5a: 09a52427 fsw fs10,136(a0) STORE_FP(fs11, buf->__fpregs[11]); 5e: 09b52827 fsw fs11,144(a0) #elif defined(__riscv_float_abi_single) #error "setjmp implementation not available for the target architecture." #endif return 0; 62: 4501 li a0,0 64: 8082 ret
- rv64 version with clang 16 (it doesn't have D146245, so we still have the unnecessary add instruction):
LLVM_LIBC_FUNCTION(int, setjmp, (__jmp_buf * buf)) { STORE(ra, buf->__pc); 2: 00153023 sd ra,0(a0) STORE(s0, buf->__regs[0]); 6: 00850593 add a1,a0,8 a: e180 sd s0,0(a1) STORE(s1, buf->__regs[1]); c: 01050593 add a1,a0,16 10: e184 sd s1,0(a1) STORE(s2, buf->__regs[2]); 12: 01850593 add a1,a0,24 16: 0125b023 sd s2,0(a1) STORE(s3, buf->__regs[3]); 1a: 02050593 add a1,a0,32 1e: 0135b023 sd s3,0(a1) STORE(s4, buf->__regs[4]); 22: 02850593 add a1,a0,40 26: 0145b023 sd s4,0(a1) STORE(s5, buf->__regs[5]); 2a: 03050593 add a1,a0,48 2e: 0155b023 sd s5,0(a1) STORE(s6, buf->__regs[6]); 32: 03850593 add a1,a0,56 36: 0165b023 sd s6,0(a1) STORE(s7, buf->__regs[7]); 3a: 04050593 add a1,a0,64 3e: 0175b023 sd s7,0(a1) STORE(s8, buf->__regs[8]); 42: 04850593 add a1,a0,72 46: 0185b023 sd s8,0(a1) STORE(s9, buf->__regs[9]); 4a: 05050593 add a1,a0,80 4e: 0195b023 sd s9,0(a1) STORE(s10, buf->__regs[10]); 52: 05850593 add a1,a0,88 56: 01a5b023 sd s10,0(a1) STORE(s11, buf->__regs[11]); 5a: 06050593 add a1,a0,96 5e: 01b5b023 sd s11,0(a1) STORE(sp, buf->__sp); 62: 06850593 add a1,a0,104 66: 0025b023 sd sp,0(a1) #if __riscv_float_abi_double STORE_FP(fs0, buf->__fpregs[0]); 6a: 07050593 add a1,a0,112 6e: a180 fsd fs0,0(a1) STORE_FP(fs1, buf->__fpregs[1]); 70: 07850593 add a1,a0,120 74: a184 fsd fs1,0(a1) STORE_FP(fs2, buf->__fpregs[2]); 76: 08050593 add a1,a0,128 7a: 0125b027 fsd fs2,0(a1) STORE_FP(fs3, buf->__fpregs[3]); 7e: 08850593 add a1,a0,136 82: 0135b027 fsd fs3,0(a1) STORE_FP(fs4, buf->__fpregs[4]); 86: 09050593 add a1,a0,144 8a: 0145b027 fsd fs4,0(a1) STORE_FP(fs5, buf->__fpregs[5]); 8e: 09850593 add a1,a0,152 92: 0155b027 fsd fs5,0(a1) STORE_FP(fs6, buf->__fpregs[6]); 96: 0a050593 add a1,a0,160 9a: 0165b027 fsd fs6,0(a1) STORE_FP(fs7, buf->__fpregs[7]); 9e: 0a850593 add a1,a0,168 a2: 0175b027 fsd fs7,0(a1) STORE_FP(fs8, buf->__fpregs[8]); a6: 0b050593 add a1,a0,176 aa: 0185b027 fsd fs8,0(a1) STORE_FP(fs9, buf->__fpregs[9]); ae: 0b850593 add a1,a0,184 b2: 0195b027 fsd fs9,0(a1) STORE_FP(fs10, buf->__fpregs[10]); b6: 0c050593 add a1,a0,192 ba: 01a5b027 fsd fs10,0(a1) STORE_FP(fs11, buf->__fpregs[11]); be: 0c850513 add a0,a0,200 c2: 01b53027 fsd fs11,0(a0) #elif defined(__riscv_float_abi_single) #error "setjmp implementation not available for the target architecture." #endif return 0; c6: 4501 li a0,0 c8: 8082 ret
Does this address your concern with the patch?
Just because building on your system today happens to work does not mean it will work when built on other systems or in the future. So no, this code is broken and there are zero guarantees from either GCC or Clang that this will work. You get whatever you get and if it's not what you want that's your problem, not the compiler's.
Got it, so what's your suggestion? Maybe implementing it in an assembly file? glibc implements it in assembly directly: https://github.com/bminor/glibc/blob/4290aed05135ae4c0272006442d147f2155e70d7/sysdeps/riscv/setjmp.S
@sivachandra do you have any comments?
Assembly file or naked function. Doing it in actual C, even with inline assembly, just is not doable, the implementation needs to poke at too much low-level detail.
Few points:
- What @jrtc27 is pointing out is a very valid problem in principle.
- Practically though, I am not sure if it is going to be a problem for these functions.
- This patch is not introducing the "questionable" code.
- We strictly do not allow hand-written assembly files in the libc project.
Given these points, I am OK with this patch. If we can include a test to ensure that the compilers are not doing anything unwanted, that will be even better. You can choose to work on such a test separately. If in future we see that the compiler code-gen is being unfriendly, we can revisit the approach taken here.
We explicitly build these functions with -O3 and -fomit-frame-pointer to prevent all interference.
You can likely get away with it then, but don't be surprised when it breaks... this will never be officially supported by the compiler
I'd like to second this -- it seems pretty fragile that you need to compile this with very specific optimization flags in order for it to work, especially when there is a solution but it's prevented due to a policy of not allowing hand-written assembly files.
I totally appreciate the problem here. As I have said earlier, we are doing this with full appreciation of what can go wrong rather than just doing it because it happens to work.
About the requirement of a specific optimization flag for correctness, LLVM's libc is not the only libc requiring them. I do not remember which exact parts now, but many years ago when I was working with glibc, it required higher optimization levels to ensure correctness. Even within the libc project, setjmp/longjmp are not the only ones having this requirement.
Given these points, I am OK with this patch. If we can include a test to ensure that the compilers are not doing anything unwanted, that will be even better. You can choose to work on such a test separately. If in future we see that the compiler code-gen is being unfriendly, we can revisit the approach taken here.
@sivachandra given this comment, should I assume the patch was accepted and it's OK to push?
I was away the past couple of weeks, but this appears fragile to me as well. I've posted https://discourse.llvm.org/t/hand-written-in-assembly-in-libc-setjmp-longjmp/73249 to continue the discussion.