This is an archive of the discontinued LLVM Phabricator instance.

[libc] Add support for setjmp and longjmp in riscv

Authored by mikhail.ramalho on Mar 8 2023, 5:44 AM.



This patch implements setjmp and longjmp in riscv using inline asm. The
following changes were required:

  • Omit frame pointer: otherwise gcc won't allow us to use s0
  • Use attribute((naked)): otherwise both gcc and clang will generate

function prologue and epilogue in both functions. This doesn't happen
in x86_64, so we guard it to only riscv

Furthermore, using attribute((naked)) causes two problems: we
can't use return 0 (both gcc and clang) and the function arguments in
the function body (clang only), so we had to use a0 and a1 directly.

Diff Detail

Event Timeline

mikhail.ramalho created this revision.Mar 8 2023, 5:44 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMar 8 2023, 5:44 AM
mikhail.ramalho requested review of this revision.Mar 8 2023, 5:44 AM
asb added inline comments.Mar 8 2023, 5:52 AM

This is going to fail to reload FP state for __riscv_float_abi_single. If it's not going to be supported in this version of the patch, perhaps best to #error? (and do the same in setjmp).

Abort when __riscv_float_abi_single is defined

mikhail.ramalho marked an inline comment as done.

Moved "return 0" to inside the LIBC_TARGET_ARCH_IS_X86_64 so we don't generate two "li a0,0" in riscv

michaelrj added inline comments.
16 ↗(On Diff #503415)

so far we've only been annotating functions with noexcept in the public headers. I'm not sure it's necessary here and below.

16 ↗(On Diff #503415)

Once I added the __attribute((naked)), gcc suddenly decided to be pickier and I keep getting the following warning. Adding the __NOEXCEPT makes it go away... Do you have any other suggestions?

In file included from /home/mgadelha/tools/llvm-project/libc/src/setjmp/setjmp.cpp:9:
/home/mgadelha/tools/llvm-project/libc/src/setjmp/setjmp.cpp:20:25: warning: ‘int __llvm_libc::setjmp(__jmp_buf*)’ specifies less restrictive attribute than its target ‘int __llvm_libc::__setjmp_impl__(__jmp_buf*)’: ‘nothrow’ [-Wmissing-attributes]
   20 | LLVM_LIBC_FUNCTION(int, setjmp, (__jmp_buf * buf)) {
      |                         ^~~~~~
/home/mgadelha/tools/llvm-project/libc/src/__support/common.h:30:31: note: in definition of macro ‘LLVM_LIBC_FUNCTION’
   30 |   decltype(__llvm_libc::name) name [[gnu::alias(#name)]];                      \
      |                               ^~~~
/home/mgadelha/tools/llvm-project/libc/src/__support/common.h:31:8: note: ‘int __llvm_libc::setjmp(__jmp_buf*)’ target declared here
   31 |   type __##name##_impl__ arglist
      |        ^~
/home/mgadelha/tools/llvm-project/libc/src/setjmp/setjmp.cpp:20:1: note: in expansion of macro ‘LLVM_LIBC_FUNCTION’
   20 | LLVM_LIBC_FUNCTION(int, setjmp, (__jmp_buf * buf)) {
      | ^~~~~~~~~~~~~~~~~~
[2/3] Building CXX object projects/libc/src/setjmp/CMakeFiles/libc.src.setjmp.longjmp.dir/longjmp.cpp.o
In file included from /home/mgadelha/tools/llvm-project/libc/src/setjmp/longjmp.cpp:10:
/home/mgadelha/tools/llvm-project/libc/src/setjmp/longjmp.cpp:20:26: warning: ‘void __llvm_libc::longjmp(__jmp_buf*, int)’ specifies less restrictive attribute than its target ‘void __llvm_libc::__longjmp_impl__(__jmp_buf*, int)’: ‘nothrow’ [-Wmissing-attributes]
   20 | LLVM_LIBC_FUNCTION(void, longjmp, (__jmp_buf * buf, int val)) {
      |                          ^~~~~~~
/home/mgadelha/tools/llvm-project/libc/src/__support/common.h:30:31: note: in definition of macro ‘LLVM_LIBC_FUNCTION’
   30 |   decltype(__llvm_libc::name) name [[gnu::alias(#name)]];                      \
      |                               ^~~~
/home/mgadelha/tools/llvm-project/libc/src/__support/common.h:31:8: note: ‘void __llvm_libc::longjmp(__jmp_buf*, int)’ target declared here
   31 |   type __##name##_impl__ arglist
      |        ^~
/home/mgadelha/tools/llvm-project/libc/src/setjmp/longjmp.cpp:20:1: note: in expansion of macro ‘LLVM_LIBC_FUNCTION’
   20 | LLVM_LIBC_FUNCTION(void, longjmp, (__jmp_buf * buf, int val)) {
      | ^~~~~~~~~~~~~~~~~~
michaelrj added inline comments.Mar 8 2023, 10:30 AM
16 ↗(On Diff #503415)

ah! That makes sense, in that case noexcept is fine.

Added missing ret

I have left comments in longjmp.cpp but same questions apply for setjmp.cpp as well.


Why is this required?


Can't we directly reference buf instead of using these fixed offsets, much like how we do for x86_64 case?


Same here, can we read val instead of reading a1?

Hi Siva, let me try to explain all the comments at once (I'll add it to the commit message too):

I've added the __attribute__((naked)) because both clang and GCC will add a function prologue and epilogue to these functions without it.

The consequence of using the naked attribute is that clang doesn't allow us to reference arguments in the function body:

/home/mgadelha/llvm-project/libc/src/setjmp/setjmp.cpp:75:52: error: parameter references not allowed in naked functions
  LIBC_INLINE_ASM("sd %0, %1\n\t" : "=r"(ra) : "m"(buf->__pc) :);

This is not a problem for GCC, but clang refuses to compile the code.

So, we either use the naked attribute to reduce the size of setjmp/longjm (which forces us to use a0 and a1, only because of clang), or we don't use the attribute and generate a huge setjmp/longjmp. I've decided to go with the former...

setjmp with the naked attribute:

0000000000000000 <_ZN11__llvm_libc6setjmpEP9__jmp_buf>:
   0:	00153023          	sd	ra,0(a0)
   4:	e500                	sd	s0,8(a0)
   6:	e904                	sd	s1,16(a0)
   8:	01253c23          	sd	s2,24(a0)
   c:	03353023          	sd	s3,32(a0)
  10:	03453423          	sd	s4,40(a0)
  14:	03553823          	sd	s5,48(a0)
  18:	03653c23          	sd	s6,56(a0)
  1c:	05753023          	sd	s7,64(a0)
  20:	05853423          	sd	s8,72(a0)
  24:	05953823          	sd	s9,80(a0)
  28:	05a53c23          	sd	s10,88(a0)
  2c:	07b53023          	sd	s11,96(a0)
  30:	06253423          	sd	sp,104(a0)
  34:	b920                	fsd	fs0,112(a0)
  36:	bd24                	fsd	fs1,120(a0)
  38:	09253027          	fsd	fs2,128(a0)
  3c:	09353427          	fsd	fs3,136(a0)
  40:	09453827          	fsd	fs4,144(a0)
  44:	09553c27          	fsd	fs5,152(a0)
  48:	0b653027          	fsd	fs6,160(a0)
  4c:	0b753427          	fsd	fs7,168(a0)
  50:	0b853827          	fsd	fs8,176(a0)
  54:	0b953c27          	fsd	fs9,184(a0)
  58:	0da53027          	fsd	fs10,192(a0)
  5c:	0db53427          	fsd	fs11,200(a0)
  60:	4501                	li	a0,0

0000000000000062 <.LVL1>:
  62:	8082                	ret

without the naked attribute:

0000000000000000 <_ZN11__llvm_libc6setjmpEP9__jmp_buf>:
   0:	7155                	add	sp,sp,-208
   2:	e586                	sd	ra,200(sp)
   4:	e1a2                	sd	s0,192(sp)
   6:	fd26                	sd	s1,184(sp)
   8:	f94a                	sd	s2,176(sp)
   a:	f54e                	sd	s3,168(sp)
   c:	f152                	sd	s4,160(sp)
   e:	ed56                	sd	s5,152(sp)
  10:	e95a                	sd	s6,144(sp)
  12:	e55e                	sd	s7,136(sp)
  14:	e162                	sd	s8,128(sp)
  16:	fce6                	sd	s9,120(sp)
  18:	f8ea                	sd	s10,112(sp)
  1a:	f4ee                	sd	s11,104(sp)
  1c:	b0a2                	fsd	fs0,96(sp)
  1e:	aca6                	fsd	fs1,88(sp)
  20:	a8ca                	fsd	fs2,80(sp)
  22:	a4ce                	fsd	fs3,72(sp)
  24:	a0d2                	fsd	fs4,64(sp)
  26:	bc56                	fsd	fs5,56(sp)
  28:	b85a                	fsd	fs6,48(sp)
  2a:	b45e                	fsd	fs7,40(sp)
  2c:	b062                	fsd	fs8,32(sp)
  2e:	ac66                	fsd	fs9,24(sp)
  30:	a86a                	fsd	fs10,16(sp)
  32:	a46e                	fsd	fs11,8(sp)
  34:	00153023          	sd	ra,0(a0)
  38:	00850593          	add	a1,a0,8
  3c:	e180                	sd	s0,0(a1)
  3e:	01050593          	add	a1,a0,16
  42:	e184                	sd	s1,0(a1)
  44:	01850593          	add	a1,a0,24
  48:	0125b023          	sd	s2,0(a1)
  4c:	02050593          	add	a1,a0,32
  50:	0135b023          	sd	s3,0(a1)
  54:	02850593          	add	a1,a0,40
  58:	0145b023          	sd	s4,0(a1)
  5c:	03050593          	add	a1,a0,48
  60:	0155b023          	sd	s5,0(a1)
  64:	03850593          	add	a1,a0,56
  68:	0165b023          	sd	s6,0(a1)
  6c:	04050593          	add	a1,a0,64
  70:	0175b023          	sd	s7,0(a1)
  74:	04850593          	add	a1,a0,72
  78:	0185b023          	sd	s8,0(a1)
  7c:	05050593          	add	a1,a0,80
  80:	0195b023          	sd	s9,0(a1)
  84:	05850593          	add	a1,a0,88
  88:	01a5b023          	sd	s10,0(a1)
  8c:	06050593          	add	a1,a0,96
  90:	01b5b023          	sd	s11,0(a1)
  94:	06850593          	add	a1,a0,104
  98:	0025b023          	sd	sp,0(a1)
  9c:	07050593          	add	a1,a0,112
  a0:	a180                	fsd	fs0,0(a1)
  a2:	07850593          	add	a1,a0,120
  a6:	a184                	fsd	fs1,0(a1)
  a8:	08050593          	add	a1,a0,128
  ac:	0125b027          	fsd	fs2,0(a1)
  b0:	08850593          	add	a1,a0,136
  b4:	0135b027          	fsd	fs3,0(a1)
  b8:	09050593          	add	a1,a0,144
  bc:	0145b027          	fsd	fs4,0(a1)
  c0:	09850593          	add	a1,a0,152
  c4:	0155b027          	fsd	fs5,0(a1)
  c8:	0a050593          	add	a1,a0,160
  cc:	0165b027          	fsd	fs6,0(a1)
  d0:	0a850593          	add	a1,a0,168
  d4:	0175b027          	fsd	fs7,0(a1)
  d8:	0b050593          	add	a1,a0,176
  dc:	0185b027          	fsd	fs8,0(a1)
  e0:	0b850593          	add	a1,a0,184
  e4:	0195b027          	fsd	fs9,0(a1)
  e8:	0c050593          	add	a1,a0,192
  ec:	01a5b027          	fsd	fs10,0(a1)
  f0:	0c850513          	add	a0,a0,200
  f4:	01b53027          	fsd	fs11,0(a0)
  f8:	4501                	li	a0,0
  fa:	60ae                	ld	ra,200(sp)
  fc:	640e                	ld	s0,192(sp)
  fe:	74ea                	ld	s1,184(sp)
 100:	794a                	ld	s2,176(sp)
 102:	79aa                	ld	s3,168(sp)
 104:	7a0a                	ld	s4,160(sp)
 106:	6aea                	ld	s5,152(sp)
 108:	6b4a                	ld	s6,144(sp)
 10a:	6baa                	ld	s7,136(sp)
 10c:	6c0a                	ld	s8,128(sp)
 10e:	7ce6                	ld	s9,120(sp)
 110:	7d46                	ld	s10,112(sp)
 112:	7da6                	ld	s11,104(sp)
 114:	3406                	fld	fs0,96(sp)
 116:	24e6                	fld	fs1,88(sp)
 118:	2946                	fld	fs2,80(sp)
 11a:	29a6                	fld	fs3,72(sp)
 11c:	2a06                	fld	fs4,64(sp)
 11e:	3ae2                	fld	fs5,56(sp)
 120:	3b42                	fld	fs6,48(sp)
 122:	3ba2                	fld	fs7,40(sp)
 124:	3c02                	fld	fs8,32(sp)
 126:	2ce2                	fld	fs9,24(sp)
 128:	2d42                	fld	fs10,16(sp)
 12a:	2da2                	fld	fs11,8(sp)
 12c:	6169                	add	sp,sp,208
 12e:	8082                	ret

I'm using clang 14.0.6 in Debian sid btw:

$ c++ --version
Debian clang version 14.0.6
Target: riscv64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Can you share the code that results in large number of instructions and register saving?

18 ↗(On Diff #504696)

Why not use the noexcept keyword instead of __NOEXCEPT?

Also, does it happen at -O3 as well?

Can you share the code that results in large number of instructions and register saving?

Sure! It's the previous version of this patch:

Also, does it happen at -O3 as well?

AFAICT, we already building it with -O3 (see the cmake file I changed in this patch, the one to omit the frame pointer).

Rebased and updated commit message

mikhail.ramalho edited the summary of this revision. (Show Details)Mar 15 2023, 7:10 AM
mikhail.ramalho edited the summary of this revision. (Show Details)Mar 15 2023, 7:10 AM
mikhail.ramalho added inline comments.
18 ↗(On Diff #504696)

I used the same macro used in setjmp_impl.h

Small change to the cmake file: we only need to set -fno-omit-frame-pointer it's not riscv

Sorry for the delay. Going by my godbolt experiments, the problem you seem to be addressing (you have already said it) is clang specific:

I would really not use manually calculated offsets. Also, in the libc project, we really want to focus on cleanliness of the source code - if a compiler is missing an optimization, then the compiler should be fixed. So, I prefer the solution which does not involve manually calculating offsets. About -fno-omit-frame-pointer, I have no strong opinions. Feel free to use it or do the other way round of using -fomit-frame-pointer (this has the benefit that it will not require the naked attribute). For the build structuring, I would suggest separating out the target specific implementations into different files and adding target independent aliases to the target dependent entrypoints. Much like how linux specific implementations are added at other places:

Thanks for the review, Siva, I think I've managed to address all your comments thanks to your code snippet (tests are running now, I'll update here if they pass).

The fix was actually not declaring the register variables and using them directly in the asm (+ adding -fomit-frame-pointer), now clang/gcc doesn't generate the function prologue/epilogue anymore.

The code generated by clang is still not great, but I'd say it's not that bad either, just a bunch of unnecessary adds:

0000000000000000 <_ZN11__llvm_libc7longjmpEP9__jmp_bufi>:
   0:	00053083          	ld	ra,0(a0)
   4:	00850613          	add	a2,a0,8
   8:	6200                	ld	s0,0(a2)
   a:	01050613          	add	a2,a0,16
   e:	6204                	ld	s1,0(a2)
  10:	01850613          	add	a2,a0,24
  14:	00063903          	ld	s2,0(a2)
  18:	02050613          	add	a2,a0,32
  1c:	00063983          	ld	s3,0(a2)
  20:	02850613          	add	a2,a0,40
  24:	00063a03          	ld	s4,0(a2)
  28:	03050613          	add	a2,a0,48
  2c:	00063a83          	ld	s5,0(a2)
  30:	03850613          	add	a2,a0,56
  34:	00063b03          	ld	s6,0(a2)
  38:	04050613          	add	a2,a0,64
  3c:	00063b83          	ld	s7,0(a2)
  40:	04850613          	add	a2,a0,72
  44:	00063c03          	ld	s8,0(a2)
  48:	05050613          	add	a2,a0,80
  4c:	00063c83          	ld	s9,0(a2)
  50:	05850613          	add	a2,a0,88
  54:	00063d03          	ld	s10,0(a2)
  58:	06050613          	add	a2,a0,96
  5c:	00063d83          	ld	s11,0(a2)
  60:	06850613          	add	a2,a0,104
  64:	00063103          	ld	sp,0(a2)
  68:	07050613          	add	a2,a0,112
  6c:	2200                	fld	fs0,0(a2)
  6e:	07850613          	add	a2,a0,120
  72:	2204                	fld	fs1,0(a2)
  74:	08050613          	add	a2,a0,128
  78:	00063907          	fld	fs2,0(a2)
  7c:	08850613          	add	a2,a0,136
  80:	00063987          	fld	fs3,0(a2)
  84:	09050613          	add	a2,a0,144
  88:	00063a07          	fld	fs4,0(a2)
  8c:	09850613          	add	a2,a0,152
  90:	00063a87          	fld	fs5,0(a2)
  94:	0a050613          	add	a2,a0,160
  98:	00063b07          	fld	fs6,0(a2)
  9c:	0a850613          	add	a2,a0,168
  a0:	00063b87          	fld	fs7,0(a2)
  a4:	0b050613          	add	a2,a0,176
  a8:	00063c07          	fld	fs8,0(a2)
  ac:	0b850613          	add	a2,a0,184
  b0:	00063c87          	fld	fs9,0(a2)
  b4:	0c050613          	add	a2,a0,192
  b8:	00063d07          	fld	fs10,0(a2)
  bc:	0c850613          	add	a2,a0,200
  c0:	00063d87          	fld	fs11,0(a2)
  c4:	0015b513          	seqz	a0,a1
  c8:	952e                	add	a0,a0,a1
  ca:	8082                	ret
0000000000000000 <_ZN11__llvm_libc6setjmpEP9__jmp_buf>:
   0:	00153023          	sd	ra,0(a0)
   4:	00850593          	add	a1,a0,8
   8:	e180                	sd	s0,0(a1)
   a:	01050593          	add	a1,a0,16
   e:	e184                	sd	s1,0(a1)
  10:	01850593          	add	a1,a0,24
  14:	0125b023          	sd	s2,0(a1)
  18:	02050593          	add	a1,a0,32
  1c:	0135b023          	sd	s3,0(a1)
  20:	02850593          	add	a1,a0,40
  24:	0145b023          	sd	s4,0(a1)
  28:	03050593          	add	a1,a0,48
  2c:	0155b023          	sd	s5,0(a1)
  30:	03850593          	add	a1,a0,56
  34:	0165b023          	sd	s6,0(a1)
  38:	04050593          	add	a1,a0,64
  3c:	0175b023          	sd	s7,0(a1)
  40:	04850593          	add	a1,a0,72
  44:	0185b023          	sd	s8,0(a1)
  48:	05050593          	add	a1,a0,80
  4c:	0195b023          	sd	s9,0(a1)
  50:	05850593          	add	a1,a0,88
  54:	01a5b023          	sd	s10,0(a1)
  58:	06050593          	add	a1,a0,96
  5c:	01b5b023          	sd	s11,0(a1)
  60:	06850593          	add	a1,a0,104
  64:	0025b023          	sd	sp,0(a1)
  68:	07050593          	add	a1,a0,112
  6c:	a180                	fsd	fs0,0(a1)
  6e:	07850593          	add	a1,a0,120
  72:	a184                	fsd	fs1,0(a1)
  74:	08050593          	add	a1,a0,128
  78:	0125b027          	fsd	fs2,0(a1)
  7c:	08850593          	add	a1,a0,136
  80:	0135b027          	fsd	fs3,0(a1)
  84:	09050593          	add	a1,a0,144
  88:	0145b027          	fsd	fs4,0(a1)
  8c:	09850593          	add	a1,a0,152
  90:	0155b027          	fsd	fs5,0(a1)
  94:	0a050593          	add	a1,a0,160
  98:	0165b027          	fsd	fs6,0(a1)
  9c:	0a850593          	add	a1,a0,168
  a0:	0175b027          	fsd	fs7,0(a1)
  a4:	0b050593          	add	a1,a0,176
  a8:	0185b027          	fsd	fs8,0(a1)
  ac:	0b850593          	add	a1,a0,184
  b0:	0195b027          	fsd	fs9,0(a1)
  b4:	0c050593          	add	a1,a0,192
  b8:	01a5b027          	fsd	fs10,0(a1)
  bc:	0c850593          	add	a1,a0,200
  c0:	01b5b027          	fsd	fs11,0(a1)
  c4:	4501                	li	a0,0
  c6:	8082                	ret
  1. Added -fomit-frame-pointer
  2. Removed attribute((naked)) from setjmp/longjmp
  3. Because of (2) we don't need the __NOEXCEPT anymore
  4. Rewrote the riscv functions to use the function arguments directly, instead of fixed offset
  5. Split the platform specific implementations into different directories/files

We submitted D146245, which should enable llvm to generate better code for setjmp/longjmp.

Mostly LGTM but I have left some questions inline. About the structuring, to avoid complicated conditionals in CMake and source code both, we should structure it this way:

   - CMakeLists.txt
   - setjmp.h
   - longjmp.h
   - riscv64/
        - CMakeLists.txt
        - setjmp.cpp
        - longjmp.cpp
   - x86_64/
       - CMakeLists.txt
       - setjmp.cpp
       - longjmp.cpp

In riscv64/CMakeLists.txt:

    ... # RISCV specific compile options

    ... # RISCV specific compile options

Similar CMakeLists.txt file should be added in the x86_64 directory also. Next, we should add ALIAS targets in src/setjmp/CMakeLists.txt:



22 ↗(On Diff #505766)

You shouldn't need this for the riscv flavor.

24 ↗(On Diff #505766)

Should this be "=m" or just "m"?

56 ↗(On Diff #505766)

Why is this required? Can we just copy val to a0 and return? Or, even better would be just return val; if we can make this a naked function?

57–58 ↗(On Diff #505766)

Can this be replaced with return 0;?

mikhail.ramalho marked 6 inline comments as done.

Address comments

Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2023, 5:06 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
56 ↗(On Diff #505766)

This is the fake return mentioned on the Linux man page:

Following  a  successful  longjmp(), execution continues as if setjmp()
had returned for a second time.  This  "fake"  return  can  be  distin-
guished from a true setjmp() call because the "fake" return returns the
value provided in val.  If the programmer mistakenly passes the value 0
in val, the "fake" return will instead return 1.

We can't return normally here, since longjmp has no return.

sivachandra accepted this revision.Mar 24 2023, 1:03 AM
sivachandra added inline comments.

Your comment is in the previous diff but thanks for the explanation. I think we have missed the val check for zero in the x86_64 case and should be fixed separately.

For the above two instructions, in the interest of reducing the amount of logic in inline assembly, can we do:

val = val == 0 ? 1 : val;
LIBC_INLINE_ASM("add a0, %0, zero\n\t" : : "r"(val) :);
This revision is now accepted and ready to land.Mar 24 2023, 1:03 AM

Your comment is in the previous diff but thanks for the explanation. I think we have missed the val check for zero in the x86_64 case and should be fixed separately.

no problem, do you want me to fix the x86_64 version?

For the above two instructions, in the interest of reducing the amount of logic in inline assembly, can we do:

val = val == 0 ? 1 : val;
LIBC_INLINE_ASM("add a0, %0, zero\n\t" : : "r"(val) :);

Done, I'll rerun the tests now.

This revision was landed with ongoing or failed builds.Mar 24 2023, 12:19 PM
This revision was automatically updated to reflect the committed changes.