Page MenuHomePhabricator

[RISCV] Support Shadow Call Stack
ClosedPublic

Authored by zzheng on Jul 23 2020, 7:47 AM.

Details

Summary
[RISCV] Support Shadow Call Stack

Currenlty assume x18 is used as pointer to shadow call stack. User shall pass
flags:

"-fsanitize=shadow-call-stack -ffixed-x18"

Runtime supported is needed to setup x18.

If SCS is desired, all parts of the program should be built with -ffixed-x18 to
maintain inter-operatability.

There's no particuluar reason that we must use x18 as SCS pointer. Any register
may be used, as long as it does not have designated purpose already, like RA or
passing call arguments.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJul 23 2020, 7:47 AM
aaron.ballman added inline comments.Jul 23 2020, 7:58 AM
clang/test/CodeGen/shadowcallstack-attr.c
22

Now might be a good opportunity to update this check prefix to a less loaded term.

jrtc27 requested changes to this revision.Jul 23 2020, 8:01 AM
jrtc27 added inline comments.
llvm/test/CodeGen/RISCV/shadowcallstack.ll
2–4

Please style these in the same way as the other RISC-V CodeGen tests, in terms of argument order, redirecting stdin rather than using -o - %s, and using riscv64 rather than riscv64-linux-gnu (unless needed?). Also use update_llc_test_checks.py rather than hand-writing this. And we generally use RV32I and RV64I (or other appropriate arch strings) instead of RISCV32 and RISCV64 prefixes.

This revision now requires changes to proceed.Jul 23 2020, 8:01 AM

Is there a reason for choosing X18? On AArch64 that's either a temporary or saved register depending on ABI, but determined to be the "platform register". Picking X18 on RISC-V because that's the same index as AArch64 seems a little arbitrary, but maybe it happens to make sense.

apazos added inline comments.Jul 23 2020, 9:05 AM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
95

There are thee things to observe here and other reviewers might have some additional comments:

  • RISC-V does not have a reserved platform register like AAch64. The patch uses one of the RISC-V callee saved registers, x18, which happens to coincide with AArch64's register. It is possible to select another register, and additional checks for the flag combo "-fsanitize=shadow-call-stack -ffixed-xxxx" will have to be added.
  • The return address is saved on both the SCS (whose location is protected/hidden) and also in the regular stack. But the return from a function uses the value saved on SCS. The understanding is that not saving it in the regular stack can impact debugging.
  • The SCS is ascending, while the regular stack, by RISC-V convention, is descending. The SCS is not used for passing parameters between calls like the regular stack, so it seems to be ok. But this can be changed too. AArch64 's SCS is also ascending.
zzheng updated this revision to Diff 280210.Jul 23 2020, 11:55 AM
zzheng marked 3 inline comments as done.
zzheng edited the summary of this revision. (Show Details)

Using 'BLOCKED' now.

clang-formated RISCVFrameLowering.cpp

updated style of test/CodeGen/RISCV/shadowcallstack.ll

llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
95

Thanks for the clarification, Ana.

jrtc27 requested changes to this revision.Jul 23 2020, 12:37 PM
  1. Please use local variables with meaningful names for RISCV::Xn rather than inlining them everywhere and making it harder at a glance to work out what's going on without knowing what the ABI register names are.
  1. Please make a RISCVABI::getSCSPReg or similar function to avoid hard-coding X18 in a bunch of places.
  1. I'm not convinced a callee-saved register makes any more sense than anything else. Whatever you pick, compatibility only works one way round. If foo uses an SCS and calls bar that doesn't, then being callee saved helps you, but if bar calls foo then (a) foo will try to dereference SCSP (to store then later load), likely faulting, perhaps instead "just" clobbering arbitrary memory (b) if foo manages to return back to bar without faulting then bar would expect the register to have been saved, but it hasn't, an ABI violation. If you use a caller-saved register instead then bar can call foo just fine, but foo can't call bar as it'll clobber its SCSP. Reserving any register like this is a fundamental ABI incompatibility, so picking x18 because it's callee-saved is meaningless, and picking it to avoid duplicating 6 lines of code (or fewer, if the existing 6 lines are refactored) isn't a good reason either. It's ultimately arbitrary, but it should be at least be based on some kind of valid reasoning if such reasoning exists. X18 may or may not be a right answer.

    (One might initially think that this incompatibility is fine if you're building an executable with SCSP that calls other libraries without SCSP, but that breaks down as soon as callbacks exist, as well as more niche things like symbol preemption.)
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
33–37
if (find(CSI, RISCV::X1) == CSI.end())
  return;

(using llvm::find as a convenient wrapper around std::find, ie shorthand for std::find(CSI.begin(), CSI.end(), RISCV::X1)). Though personally I'd prefer to see X1 come from RI.getRARegister() rather than be hard-coded; other functions in this file already hard-code it, but in our CHERI fork we need to replace RISCV::X1 with RISCV::C1 everywhere so have changed those. Having said that, CHERI renders a shadow call stack unnecessary, so I don't particularly care if it's broken there, personally. But I still think it's nicer code.

54

This is wrong for RV64.

69–73

As above.

94

Also wrong for RV64.

This revision now requires changes to proceed.Jul 23 2020, 12:37 PM
MaskRay added inline comments.Jul 23 2020, 12:42 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
33–37

!llvm::is_contained(CSI, RISCV::X1)

pcc added a subscriber: pcc.Jul 23 2020, 12:44 PM
pcc added inline comments.
llvm/test/CodeGen/RISCV/shadowcallstack.ll
13

Shouldn't it be looking for s2 since that's how x18 is spelled in assembly?

jrtc27 added inline comments.Jul 23 2020, 12:48 PM
llvm/test/CodeGen/RISCV/shadowcallstack.ll
3

As I said before, please just use -mtriple=riscv32. The -unknown-elf is implied, irrelevant and wastes space, so all the OS-independent CodeGen tests just specify the CPU.

4

Two extra spaces to indent the | is the predominant style.

5

Delete this blank line.

13

The -NOTs shouldn't even exist, this isn't how you use update_llc_test_checks.py. But yes, by default that's how it'll be printed unless you disable printing aliases.

pcc added a comment.Jul 23 2020, 12:48 PM

Any callee-saved register can be used.

In fact, any register may be used, as long as it cannot be used to pass arguments or return values. It may be better to select a temporary register, as x18 is on aarch64 when not being used as a platform register, so that problems are noticed more easily (i.e. more likely to be clobbered by incompatible code).

zzheng updated this revision to Diff 281038.Jul 27 2020, 1:23 PM
zzheng marked 7 inline comments as done.
zzheng edited the summary of this revision. (Show Details)

Addressed styling & code clarity issues.

Fixed wrong opcode for RV64.

Unlike x18 on AArch64, there's no register that should normally be reserved/not-used on RISCV. So using any eligible register would break ABI compatibility. x18 is neither a better nor a worse choice than other registers. Non-SCS code should be built with -ffixed-x18 to be compatible with SCS-enabled code.

zzheng marked 2 inline comments as done.Jul 27 2020, 1:24 PM
zzheng added inline comments.
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
33–37

Not sure how to make llvm::find or llvm::is_contained work in this scenario.

CSI is a std::vector<llvm::CalleeSavedInfo>. We need a getReg() for each element in it before comparing to a 'Register'

zzheng updated this revision to Diff 281062.Jul 27 2020, 3:15 PM

clang-formatted.

zzheng updated this revision to Diff 281377.Jul 28 2020, 3:17 PM

rebased..

There is a possibly-compelling argument against using x18: RV32E only gives x0-x15, so would not be able to support the current implementation.

There is a possibly-compelling argument against using x18: RV32E only gives x0-x15, so would not be able to support the current implementation.

We discussed this on the RISC-V LLVM Sync-up (both this time and on 6 August 2020 IIRC). The consensus view is: if you're on an rv32e implementation, you're potentially too constrained to use a shadow call stack anyway. Even then, LLVM doesn't implement the rest of rv32e yet (though there are plans to do so at some point, which means we can revisit this change).

Our feeling is that if users have a distinct need to use a different register at the moment, they can use a downstream change to LLVM. The fact that we use RISCVABI::getSCSPReg() should make this fairly easy.


If the register choice is the only concern about this work, then I think we can probably land it as-is and fixup the register choice if we see major drawbacks later. Yes, it's an ABI issue, but on the other hand the shadow call stack is not a standard ABI anyway.

zzheng updated this revision to Diff 287428.Aug 24 2020, 10:07 AM

Rebased & ping...

IMHO, the patch is in good shape. As we discussed in the bi-weekly meetings, RV32E only has 16 registers. Systems based on RV32E may have limited memory as well. Besides, LLVM does not have full support for RV32E yet. We can commit this patch as-is and change it later if RV32E needs SCS.

Why do we have to pass -ffixed-x18 when compiling? Is it enough to just reserve x18 whenever the function has the shadow call stack attribute?

Why do we have to pass -ffixed-x18 when compiling? Is it enough to just reserve x18 whenever the function has the shadow call stack attribute?

When SCS is on, x18 must be preserved across calls. Given it's a callee-saved, value in x18 is preserved by functions that do not have SCS attribute.

However, saving x18 on stack risks leaking SCS's location in memory, making the defense useless.

Why do we have to pass -ffixed-x18 when compiling? Is it enough to just reserve x18 whenever the function has the shadow call stack attribute?

When SCS is on, x18 must be preserved across calls. Given it's a callee-saved, value in x18 is preserved by functions that do not have SCS attribute.

However, saving x18 on stack risks leaking SCS's location in memory, making the defense useless.

Ok, so any compilation units without -fsanitize=shadow-call-stack should be compiled with -ffixed-x18 if you want to link those together. That is reasonable. My question was whether we can ensure that -fsanitize=shadow-call-stack can imply -ffixed-x18 rather than having to pass both options.

It is my understanding that all functions in a CU with -fsanitize=shadow-call-stack will get the SCS function attribute, so why can't we use that attribute to work out whether x18 should be reserved, rather than using -ffixed-x18? You'll see RISCVRegisterInfo::getReservedRegs reserves fp and bp only if they're needed by the function (which can be based on attributes or other info), rather than using isRegisterReservedByUser - which seems cleaner to me.

pcc added a comment.Aug 24 2020, 11:36 AM

FWIW, on aarch64 we decided to make -fsanitize=shadow-call-stack require the x18 reservation (instead of implying it) to try to avoid ABI mismatch problems. That is, it should be safe to mix and match code compiled with and without -fsanitize=shadow-call-stack. If we make -fsanitize=shadow-call-stack imply the x18 reservation, it makes it more likely that someone will accidentally build and link in incompatible code that does not reserve x18.

In D84414#2234327, @pcc wrote:

FWIW, on aarch64 we decided to make -fsanitize=shadow-call-stack require the x18 reservation (instead of implying it) to try to avoid ABI mismatch problems. That is, it should be safe to mix and match code compiled with and without -fsanitize=shadow-call-stack. If we make -fsanitize=shadow-call-stack imply the x18 reservation, it makes it more likely that someone will accidentally build and link in incompatible code that does not reserve x18.

Ok, that approach does also make sense.

zzheng added a comment.EditedAug 25 2020, 10:49 AM

Ok, so any compilation units without -fsanitize=shadow-call-stack should be compiled with -ffixed-x18 if you want to link those together. That is reasonable. My question was whether we can ensure that -fsanitize=shadow-call-stack can imply -ffixed-x18 rather than having to pass both options.

It is my understanding that all functions in a CU with -fsanitize=shadow-call-stack will get the SCS function attribute, so why can't we use that attribute to work out whether x18 should be reserved, rather than using -ffixed-x18? You'll see RISCVRegisterInfo::getReservedRegs reserves fp and bp only if they're needed by the function (which can be based on attributes or other info), rather than using isRegisterReservedByUser - which seems cleaner to me.

In D84414#2234327, @pcc wrote:

FWIW, on aarch64 we decided to make -fsanitize=shadow-call-stack require the x18 reservation (instead of implying it) to try to avoid ABI mismatch problems. That is, it should be safe to mix and match code compiled with and without -fsanitize=shadow-call-stack. If we make -fsanitize=shadow-call-stack imply the x18 reservation, it makes it more likely that someone will accidentally build and link in incompatible code that does not reserve x18.

Yes, binding -fsanitize=shadow-call-stack with -ffixed-x18 is to ensure user don't accidentally link SCS-enabled code with non-SCS code that uses/overwrites x18. We can always infer that x18 is reserved when SCS in on.

There's no need to check RISCVRegisterInfo::getReservedRegs for x18 by looking into function attr. It forms a circular chain: is x18 reserved for SCS? <--> SCS is enabled, so x18 should be reserved.

zzheng added a comment.EditedThu, Sep 3, 7:39 AM

ping..

@jrt27, @lenary, @asb, IMHO, the patch is in good shape now, all concerns raised in comments has been addressed/answered, is there any additional comments before we can land it?

zzheng updated this revision to Diff 292274.Wed, Sep 16, 10:58 AM

rebase & ping..

jrtc27 requested changes to this revision.EditedWed, Sep 16, 11:20 AM

This is currently incompatible with the save/restore libcalls, and I don't think there's any way to avoid that (the restore libcall both loads ra and jumps to it). We should ensure combining them gives an error.

llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
36–37
43

Pointless comment; remove

46

Pointless comment; remove

53

This should be passed in as an argument IMO (same for the epilogue) given the standard prologue/epilogue code already has a DebugLoc lying around.

59–60
60

Is it intended that the shadow call stack grows *up* unlike the normal stack?

79–80

No need to repeat ourselves.

102–103

Although in fact you have both a bug and a minor performance issue with this, and it should be:

// l[w|d] ra, [-4|-8](s2)
// addi   s2, s2, -[4|8]

Then there's no read-after-write dependency chain, which is better for out-of-order cores.

The bug is that, currently, you risk a signal handler clobbering your SCS slot in between the two instructions, since you deallocate the frame before you read from it. Will be rare in practice, but a possibility.

This revision now requires changes to proceed.Wed, Sep 16, 11:20 AM
jrtc27 added inline comments.Wed, Sep 16, 11:21 AM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
86

Pointless comment; remove

89

Pointless comment; remove

zzheng updated this revision to Diff 292372.Wed, Sep 16, 4:23 PM
zzheng marked 9 inline comments as done.

Addressed comments by @jrtc27

zzheng marked 2 inline comments as done.Wed, Sep 16, 4:25 PM
zzheng added inline comments.
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
60

No. Which direction the SCS grows on is trivial.

The memory area hosting SCS is independent of the regular stack; and it's provided by the runtime.
mmap/malloc returns the low address of newly mapped/allocated area. Making the SCS growing down requires the runtime to return upper bound of the SCS. On AArch64, the SCS grows up as well.

jrtc27 added inline comments.Wed, Sep 16, 4:29 PM
llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
38
60

Ok. Wasn't saying there was anything wrong with it, was just something that jumped out at me. Having it grow up makes more sense (the only real advantage to the normal stack growing down these days is that doing aligned allocations is slightly cheaper).

79–80

This still holds.

zzheng updated this revision to Diff 292378.Wed, Sep 16, 5:16 PM
zzheng marked 2 inline comments as done.

Fixed comment and lint

asb accepted this revision.Thu, Sep 17, 5:49 AM

I think once @jrtc27 confirms all her issues are addressed this is good to land.

jrtc27 accepted this revision.Thu, Sep 17, 5:51 AM

Yes I think everything's been addressed now (though if I keep looking over it I might start nit-picking even more :)).

This revision is now accepted and ready to land.Thu, Sep 17, 5:51 AM
This revision was landed with ongoing or failed builds.Thu, Sep 17, 4:02 PM
This revision was automatically updated to reflect the committed changes.