This is an archive of the discontinued LLVM Phabricator instance.

[libc][riscv] Added support for rv32 in setjmp and longjmp
ClosedPublic

Authored by mikhail.ramalho on Aug 23 2023, 10:23 AM.

Details

Summary

This patch adds two new macros to setjmp (STORE, STORE_FP) and two new
macros to longjmp (LOAD, LOAD_FP) that takes a register and a buff, then
select the correct asm instruction for rv32 and rv64.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptAug 23 2023, 10:23 AM
mikhail.ramalho requested review of this revision.Aug 23 2023, 10:23 AM

Uh, how is this implementation safe at all? You have no idea what the compiler's done to the registers outside the inline assembly.

Uh, how is this implementation safe at all? You have no idea what the compiler's done to the registers outside the inline assembly.

Sorry @jrtc27, I don't get your comment, what do you mean?

Uh, how is this implementation safe at all? You have no idea what the compiler's done to the registers outside the inline assembly.

Sorry @jrtc27, I don't get your comment, what do you mean?

For setjmp, the registers seen by the inline assembly may not have the same values as they did on entry to the function. For longjmp, the values seen by the compiler-generated return instruction may not be the same as those loaded up. In both cases, this is due to the compiler being free to use the registers however it sees fit. For example, it may choose to allocate a stack frame, adjusting sp on entry and exit, or spill and restore ra itself in longjmp, thereby undoing the load from the jmp_buf.

Uh, how is this implementation safe at all? You have no idea what the compiler's done to the registers outside the inline assembly.

Sorry @jrtc27, I don't get your comment, what do you mean?

For setjmp, the registers seen by the inline assembly may not have the same values as they did on entry to the function. For longjmp, the values seen by the compiler-generated return instruction may not be the same as those loaded up. In both cases, this is due to the compiler being free to use the registers however it sees fit. For example, it may choose to allocate a stack frame, adjusting sp on entry and exit, or spill and restore ra itself in longjmp, thereby undoing the load from the jmp_buf.

I see what you mean now. I actually tried adding attribute((naked)) in both setjmp and longjmp (in D145584), so the compiler wouldn't generate the function prologue/epilogue, however, clang generates only the code defined in the assembly, so I've removed it:

  • rv32 version with clang 17:
LLVM_LIBC_FUNCTION(int, setjmp, (__jmp_buf * buf)) {
  STORE(ra, buf->__pc);
   2:	00152023          	sw	ra,0(a0)
  STORE(s0, buf->__regs[0]);
   6:	c140                	sw	s0,4(a0)
  STORE(s1, buf->__regs[1]);
   8:	c504                	sw	s1,8(a0)
  STORE(s2, buf->__regs[2]);
   a:	01252623          	sw	s2,12(a0)
  STORE(s3, buf->__regs[3]);
   e:	01352823          	sw	s3,16(a0)
  STORE(s4, buf->__regs[4]);
  12:	01452a23          	sw	s4,20(a0)
  STORE(s5, buf->__regs[5]);
  16:	01552c23          	sw	s5,24(a0)
  STORE(s6, buf->__regs[6]);
  1a:	01652e23          	sw	s6,28(a0)
  STORE(s7, buf->__regs[7]);
  1e:	03752023          	sw	s7,32(a0)
  STORE(s8, buf->__regs[8]);
  22:	03852223          	sw	s8,36(a0)
  STORE(s9, buf->__regs[9]);
  26:	03952423          	sw	s9,40(a0)
  STORE(s10, buf->__regs[10]);
  2a:	03a52623          	sw	s10,44(a0)
  STORE(s11, buf->__regs[11]);
  2e:	03b52823          	sw	s11,48(a0)
  STORE(sp, buf->__sp);
  32:	02252a23          	sw	sp,52(a0)

#if __riscv_float_abi_double
  STORE_FP(fs0, buf->__fpregs[0]);
  36:	fd00                	fsw	fs0,56(a0)
  STORE_FP(fs1, buf->__fpregs[1]);
  38:	e124                	fsw	fs1,64(a0)
  STORE_FP(fs2, buf->__fpregs[2]);
  3a:	05252427          	fsw	fs2,72(a0)
  STORE_FP(fs3, buf->__fpregs[3]);
  3e:	05352827          	fsw	fs3,80(a0)
  STORE_FP(fs4, buf->__fpregs[4]);
  42:	05452c27          	fsw	fs4,88(a0)
  STORE_FP(fs5, buf->__fpregs[5]);
  46:	07552027          	fsw	fs5,96(a0)
  STORE_FP(fs6, buf->__fpregs[6]);
  4a:	07652427          	fsw	fs6,104(a0)
  STORE_FP(fs7, buf->__fpregs[7]);
  4e:	07752827          	fsw	fs7,112(a0)
  STORE_FP(fs8, buf->__fpregs[8]);
  52:	07852c27          	fsw	fs8,120(a0)
  STORE_FP(fs9, buf->__fpregs[9]);
  56:	09952027          	fsw	fs9,128(a0)
  STORE_FP(fs10, buf->__fpregs[10]);
  5a:	09a52427          	fsw	fs10,136(a0)
  STORE_FP(fs11, buf->__fpregs[11]);
  5e:	09b52827          	fsw	fs11,144(a0)
#elif defined(__riscv_float_abi_single)
#error "setjmp implementation not available for the target architecture."
#endif

  return 0;
  62:	4501                	li	a0,0
  64:	8082                	ret
  • rv64 version with clang 16 (it doesn't have D146245, so we still have the unnecessary add instruction):
LLVM_LIBC_FUNCTION(int, setjmp, (__jmp_buf * buf)) {
  STORE(ra, buf->__pc);
   2:	00153023          	sd	ra,0(a0)
  STORE(s0, buf->__regs[0]);
   6:	00850593          	add	a1,a0,8
   a:	e180                	sd	s0,0(a1)
  STORE(s1, buf->__regs[1]);
   c:	01050593          	add	a1,a0,16
  10:	e184                	sd	s1,0(a1)
  STORE(s2, buf->__regs[2]);
  12:	01850593          	add	a1,a0,24
  16:	0125b023          	sd	s2,0(a1)
  STORE(s3, buf->__regs[3]);
  1a:	02050593          	add	a1,a0,32
  1e:	0135b023          	sd	s3,0(a1)
  STORE(s4, buf->__regs[4]);
  22:	02850593          	add	a1,a0,40
  26:	0145b023          	sd	s4,0(a1)
  STORE(s5, buf->__regs[5]);
  2a:	03050593          	add	a1,a0,48
  2e:	0155b023          	sd	s5,0(a1)
  STORE(s6, buf->__regs[6]);
  32:	03850593          	add	a1,a0,56
  36:	0165b023          	sd	s6,0(a1)
  STORE(s7, buf->__regs[7]);
  3a:	04050593          	add	a1,a0,64
  3e:	0175b023          	sd	s7,0(a1)
  STORE(s8, buf->__regs[8]);
  42:	04850593          	add	a1,a0,72
  46:	0185b023          	sd	s8,0(a1)
  STORE(s9, buf->__regs[9]);
  4a:	05050593          	add	a1,a0,80
  4e:	0195b023          	sd	s9,0(a1)
  STORE(s10, buf->__regs[10]);
  52:	05850593          	add	a1,a0,88
  56:	01a5b023          	sd	s10,0(a1)
  STORE(s11, buf->__regs[11]);
  5a:	06050593          	add	a1,a0,96
  5e:	01b5b023          	sd	s11,0(a1)
  STORE(sp, buf->__sp);
  62:	06850593          	add	a1,a0,104
  66:	0025b023          	sd	sp,0(a1)

#if __riscv_float_abi_double
  STORE_FP(fs0, buf->__fpregs[0]);
  6a:	07050593          	add	a1,a0,112
  6e:	a180                	fsd	fs0,0(a1)
  STORE_FP(fs1, buf->__fpregs[1]);
  70:	07850593          	add	a1,a0,120
  74:	a184                	fsd	fs1,0(a1)
  STORE_FP(fs2, buf->__fpregs[2]);
  76:	08050593          	add	a1,a0,128
  7a:	0125b027          	fsd	fs2,0(a1)
  STORE_FP(fs3, buf->__fpregs[3]);
  7e:	08850593          	add	a1,a0,136
  82:	0135b027          	fsd	fs3,0(a1)
  STORE_FP(fs4, buf->__fpregs[4]);
  86:	09050593          	add	a1,a0,144
  8a:	0145b027          	fsd	fs4,0(a1)
  STORE_FP(fs5, buf->__fpregs[5]);
  8e:	09850593          	add	a1,a0,152
  92:	0155b027          	fsd	fs5,0(a1)
  STORE_FP(fs6, buf->__fpregs[6]);
  96:	0a050593          	add	a1,a0,160
  9a:	0165b027          	fsd	fs6,0(a1)
  STORE_FP(fs7, buf->__fpregs[7]);
  9e:	0a850593          	add	a1,a0,168
  a2:	0175b027          	fsd	fs7,0(a1)
  STORE_FP(fs8, buf->__fpregs[8]);
  a6:	0b050593          	add	a1,a0,176
  aa:	0185b027          	fsd	fs8,0(a1)
  STORE_FP(fs9, buf->__fpregs[9]);
  ae:	0b850593          	add	a1,a0,184
  b2:	0195b027          	fsd	fs9,0(a1)
  STORE_FP(fs10, buf->__fpregs[10]);
  b6:	0c050593          	add	a1,a0,192
  ba:	01a5b027          	fsd	fs10,0(a1)
  STORE_FP(fs11, buf->__fpregs[11]);
  be:	0c850513          	add	a0,a0,200
  c2:	01b53027          	fsd	fs11,0(a0)
#elif defined(__riscv_float_abi_single)
#error "setjmp implementation not available for the target architecture."
#endif

  return 0;
  c6:	4501                	li	a0,0
  c8:	8082                	ret

Does this address your concern with the patch?

Just because building on your system today happens to work does not mean it will work when built on other systems or in the future. So no, this code is broken and there are zero guarantees from either GCC or Clang that this will work. You get whatever you get and if it's not what you want that's your problem, not the compiler's.

Just because building on your system today happens to work does not mean it will work when built on other systems or in the future. So no, this code is broken and there are zero guarantees from either GCC or Clang that this will work. You get whatever you get and if it's not what you want that's your problem, not the compiler's.

Got it, so what's your suggestion? Maybe implementing it in an assembly file? glibc implements it in assembly directly: https://github.com/bminor/glibc/blob/4290aed05135ae4c0272006442d147f2155e70d7/sysdeps/riscv/setjmp.S

@sivachandra do you have any comments?

Just because building on your system today happens to work does not mean it will work when built on other systems or in the future. So no, this code is broken and there are zero guarantees from either GCC or Clang that this will work. You get whatever you get and if it's not what you want that's your problem, not the compiler's.

Got it, so what's your suggestion? Maybe implementing it in an assembly file? glibc implements it in assembly directly: https://github.com/bminor/glibc/blob/4290aed05135ae4c0272006442d147f2155e70d7/sysdeps/riscv/setjmp.S

@sivachandra do you have any comments?

Assembly file or naked function. Doing it in actual C, even with inline assembly, just is not doable, the implementation needs to poke at too much low-level detail.

Few points:

  1. What @jrtc27 is pointing out is a very valid problem in principle.
  2. Practically though, I am not sure if it is going to be a problem for these functions.
  3. This patch is not introducing the "questionable" code.
  4. We strictly do not allow hand-written assembly files in the libc project.

Given these points, I am OK with this patch. If we can include a test to ensure that the compilers are not doing anything unwanted, that will be even better. You can choose to work on such a test separately. If in future we see that the compiler code-gen is being unfriendly, we can revisit the approach taken here.

-O0 absolutely will break this function, for example

We explicitly build these functions with -O3 and -fomit-frame-pointer to prevent all interference.

You can likely get away with it then, but don't be surprised when it breaks... this will never be officially supported by the compiler

You can likely get away with it then, but don't be surprised when it breaks... this will never be officially supported by the compiler

I'd like to second this -- it seems pretty fragile that you need to compile this with very specific optimization flags in order for it to work, especially when there is a solution but it's prevented due to a policy of not allowing hand-written assembly files.

Herald added a project: Restricted Project. · View Herald TranscriptAug 23 2023, 12:29 PM

You can likely get away with it then, but don't be surprised when it breaks... this will never be officially supported by the compiler

I'd like to second this -- it seems pretty fragile that you need to compile this with very specific optimization flags in order for it to work, especially when there is a solution but it's prevented due to a policy of not allowing hand-written assembly files.

I totally appreciate the problem here. As I have said earlier, we are doing this with full appreciation of what can go wrong rather than just doing it because it happens to work.

About the requirement of a specific optimization flag for correctness, LLVM's libc is not the only libc requiring them. I do not remember which exact parts now, but many years ago when I was working with glibc, it required higher optimization levels to ensure correctness. Even within the libc project, setjmp/longjmp are not the only ones having this requirement.

Given these points, I am OK with this patch. If we can include a test to ensure that the compilers are not doing anything unwanted, that will be even better. You can choose to work on such a test separately. If in future we see that the compiler code-gen is being unfriendly, we can revisit the approach taken here.

@sivachandra given this comment, should I assume the patch was accepted and it's OK to push?

sivachandra accepted this revision.Aug 30 2023, 8:09 AM
This revision is now accepted and ready to land.Aug 30 2023, 8:09 AM
This revision was landed with ongoing or failed builds.Aug 30 2023, 8:22 AM
This revision was automatically updated to reflect the committed changes.
asb added a comment.Sep 5 2023, 2:13 AM

You can likely get away with it then, but don't be surprised when it breaks... this will never be officially supported by the compiler

I'd like to second this -- it seems pretty fragile that you need to compile this with very specific optimization flags in order for it to work, especially when there is a solution but it's prevented due to a policy of not allowing hand-written assembly files.

I was away the past couple of weeks, but this appears fragile to me as well. I've posted https://discourse.llvm.org/t/hand-written-in-assembly-in-libc-setjmp-longjmp/73249 to continue the discussion.