This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Fix missing stack pointer recover
ClosedPublic

Authored by kito-cheng on Jun 2 2022, 2:11 AM.

Details

Summary

In order to make sure the stack point is right through the EH region,
we also need to restore stack pointer from the frame pointer if we
don't preserve stack space within prologue/epilogue for outgoing variables,
normally it's just checking the variable sized object is present or not
is enough, but we also don't preserve that at prologue/epilogue when
have vector objects in stack.

Example to show what happened:

try {
  sp adjust for outgoing args. // 1. Sp changed.
  func_call  // 2. Exception raised
  sp restore // Oh, not restored
} catch {
  // 3. And now we are here.
}

// 4. Prepare to return!, restore return address from stack, but...sp is wrong.
// 5. Screw up!

Diff Detail

Event Timeline

kito-cheng created this revision.Jun 2 2022, 2:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2022, 2:11 AM
kito-cheng requested review of this revision.Jun 2 2022, 2:11 AM
reames added a comment.Jun 2 2022, 9:40 AM

I'm missing something here.

If we have a frame pointer, shouldn't restoring SP simply be a register move? Why bother with all of the offset adjustments in this case at all? Shouldn't we be able to structure the code as:
if (hasFP) {

copy SP to FP

} else {

do it the hard way

}

I'm missing something here.

If we have a frame pointer, shouldn't restoring SP simply be a register move? Why bother with all of the offset adjustments in this case at all? Shouldn't we be able to structure the code as:
if (hasFP) {

copy SP to FP

} else {

do it the hard way

}

I think that might less intuitive on the sp recover code here, ideally we could restore sp by moving s0 to sp and addi sp, sp, 32 could be removed like this,:

mv sp, s0, -32
ld ra, -8(sp) # 8-byte Folded Reload
ld s0, -16(sp) # 8-byte Folded Reload
ld s1, -24(sp) # 8-byte Folded Reload

but we can't do that in practice since that is not signal-safe/interrupt-safe, when the sp changing means we already released the space of stack, and then anything could happen on those area.

Give a practical example here, we got a signal after execute mv sp, s0, -32, and then kernel start to prepare the arguments and put into stack, ok - you could imagine what happened: the stack is corrupt, ra, s0 and s1 in stack might be corrupt, but technically it's not corrupt since you already move the stack pointer that means we release that, and that's same for the interrupt situation.

mv sp, s0, -32
; Get signal or interrupt here!
ld ra, -8(sp) # 8-byte Folded Reload   ; Signal handler or interrupt handler might overwrite this area,
ld s0, -16(sp) # 8-byte Folded Reload ; because we already released the stack space!
ld s1, -24(sp) # 8-byte Folded Reload
reames added a comment.Jun 6 2022, 7:46 AM

I'm missing something here.

If we have a frame pointer, shouldn't restoring SP simply be a register move? Why bother with all of the offset adjustments in this case at all? Shouldn't we be able to structure the code as:
if (hasFP) {

copy SP to FP

} else {

do it the hard way

}

I think that might less intuitive on the sp recover code here, ideally we could restore sp by moving s0 to sp and addi sp, sp, 32 could be removed like this,:

mv sp, s0, -32
ld ra, -8(sp) # 8-byte Folded Reload
ld s0, -16(sp) # 8-byte Folded Reload
ld s1, -24(sp) # 8-byte Folded Reload

but we can't do that in practice since that is not signal-safe/interrupt-safe, when the sp changing means we already released the space of stack, and then anything could happen on those area.

Give a practical example here, we got a signal after execute mv sp, s0, -32, and then kernel start to prepare the arguments and put into stack, ok - you could imagine what happened: the stack is corrupt, ra, s0 and s1 in stack might be corrupt, but technically it's not corrupt since you already move the stack pointer that means we release that, and that's same for the interrupt situation.

mv sp, s0, -32
; Get signal or interrupt here!
ld ra, -8(sp) # 8-byte Folded Reload   ; Signal handler or interrupt handler might overwrite this area,
ld s0, -16(sp) # 8-byte Folded Reload ; because we already released the stack space!
ld s1, -24(sp) # 8-byte Folded Reload

Kito, I don't get your comment. I'd expect the CSR restores to be happening before stack adjustment. Are you saying that there's some other code which tries to insert restores in the middle of the existing stack adjustment sequence?

If so, that's an invariant which really should be documented in comments.

Kito, I don't get your comment. I'd expect the CSR restores to be happening before stack adjustment.

It's too late to do stack adjustment after CSR restore by moving fp to sp, because s0(fp) is CSR.

addi sp, s0, -32
ld ra, 24(sp) # 8-byte Folded Reload
ld s0, 16(sp) # 8-byte Folded Reload <----- s0 is restore here 
ld s1, 8(sp) # 8-byte Folded Reload
addi sp, sp, 32
ret

Are you saying that there's some other code which tries to insert restores in the middle of the existing stack adjustment sequence?

No, I mean signal and interrupt might raise during epilogue, and that will keep using sp, and adjust that in the handler, so sp must adjust after CSR restores.

rogfer01 added inline comments.Jun 8 2022, 11:09 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
648

I assume that the exceptional path doesn't execute the add sp, sp, 32 after call _Z3fooiiiiiiiiiiPi@plt, instead we directly land to .LBB0_2: # %lpad. If this is th case, it does leaves the stack unaligned.

However, the expansion of !hasReservedCallFrame(MF) results in checking hasFP(MF) && hasRVVFrameObject(MF).

Isn't this an issue that happens irrespective of vectors and it is caused by having a fp? E.g. when using something like -fno-omit-frame-pointer.

rogfer01 added inline comments.Jun 8 2022, 11:13 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
648

Answering myself here: when not using v then we account the callee arguments as part of the caller stack frame.

rogfer01 accepted this revision.Jun 8 2022, 11:13 PM

LGTM. Thanks @kito-cheng

This revision is now accepted and ready to land.Jun 8 2022, 11:13 PM
kito-cheng added inline comments.Jun 8 2022, 11:29 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
648

Yeah, your understating is right.

BTW, same issue could be happen without v in theory (without this patch), but that situation will fixed by clang, it will emit stack pointer store/restore intrinsic...use this case to demonstrate:

simple.cpp
$ clang simple.cpp -o - -S -O -emit-llvm -march=rv64gc -target riscv64-unknown-linux-gnu

void foo(int, int, int, int, int, int, int, int, int, int, int *);

bool check_frame_variant(int x) {
  int exception_value = 1;
  try {
      int yy[x];
      foo(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, &yy[0]);
  } catch (int i) {
    exception_value = i;
  }
  return exception_value;
}

Generated LLVM IR:

define dso_local noundef zeroext i1 @_Z19check_frame_varianti(i32 noundef signext %x) local_unnamed_addr #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
entry:
  %0 = zext i32 %x to i64
  %1 = call i8* @llvm.stacksave()   <--- clang emit a sp back up here
  %vla = alloca i32, i64 %0, align 8
  invoke void @_Z3fooiiiiiiiiiiPi(i32 noundef signext 0, i32 noundef signext 0, i32 noundef signext 0, i32 noundef signext 0, i32 noundef signext 0, i32 noundef signext 0, i32 noundef signext 0, i32 noundef signext 0, i32 noundef 0, i32 noundef 0, i32* noundef nonnull %vla)
          to label %invoke.cont unwind label %lpad

invoke.cont:
  call void @llvm.stackrestore(i8* %1) <-- and a sp restore here!!!!!
  br label %try.cont
This revision was landed with ongoing or failed builds.Jun 9 2022, 8:38 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Jun 9 2022, 8:55 AM

Looks like this breaks check-llvm: http://45.33.8.238/linux/78124/step_12.txt

Please take a look, and revert for now if it takes a while to fix.

Fixed via this commit, my bad, I didn't upload those testcase update...

thakis added a comment.Jun 9 2022, 9:25 AM

Thanks!

Please run tests locally before committing :)

I'm a bit concerned about the example given in the commit message. In general, if we catch an exception from a call, we should immediately free the memory allocated for the call arguments. Otherwise, if we throw exceptions in a loop, we can allocate an arbitrary amount of memory, which eventually leads to a stack overflow.

@efriedma Thanks, that's good catch, created another issue to tracking that[1]!

[1] https://github.com/llvm/llvm-project/issues/56008