This is an archive of the discontinued LLVM Phabricator instance.

XOR the frame pointer with the stack cookie when protecting the stack
ClosedPublic

Authored by rnk on Nov 29 2017, 1:52 PM.

Event Timeline

rnk created this revision.Nov 29 2017, 1:52 PM
hans added inline comments.Nov 29 2017, 2:13 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2136

Is there some way we could assert the register we use here stays the same from entry to exit? This is the only part that sounds potentially scary, the rest looks pretty straight-forward.

llvm/lib/Target/X86/X86ISelLowering.cpp
1690

It doesn't really matter what the CRT does though, right? I mean, we could do this for non-MSVC targets too, it's just that it's not common practice?

rnk added a comment.Nov 29 2017, 2:59 PM

I discovered that this doesn't work quite right with fastisel, so I'll reupload with a fix for that.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2136

Perhaps instead I should test and commit this patch separately:

diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index cb37137d547..c6d43b04003 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -1358,6 +1358,10 @@ static void preassignSwiftErrorRegs(const TargetLowering *TLI,
 }

 void SelectionDAGISel::SelectAllBasicBlocks(const Function &Fn) {
+#ifndef NDEBUG
+  const TargetRegisterInfo &TRI = *MF->getSubtarget().getRegisterInfo();
+  unsigned OriginalFrameReg = TRI.getFrameRegister(MF);
+#endif
   FastISelFailed = false;
   // Initialize the Fast-ISel state, if needed.
   FastISel *FastIS = nullptr;
@@ -1621,6 +1625,9 @@ void SelectionDAGISel::SelectAllBasicBlocks(const Function &Fn) {
   delete FastIS;
   SDB->clearDanglingDebugInfo();
   SDB->SPDescriptor.resetPerFunctionState();
+
+  assert(OriginalFrameReg == TRI.getFrameRegister(MF) &&
+         "frame register changed during instruction selection");
 }

 /// Given that the input MI is before a partial terminator sequence TSeq, return
llvm/lib/Target/X86/X86ISelLowering.cpp
1690

Yeah, I structured it this way so that we could expose it as a flag or enable it everywhere if we want to make that policy change to -fstack-protector. I don't think it costs much and it should improve security, but that's a longer discussion we need to have about beefing up LLVM's -fstack-protector.

rnk updated this revision to Diff 124836.Nov 29 2017, 3:34 PM
  • fix fastisel
hans accepted this revision.Nov 29 2017, 3:48 PM

lgtm

I'm curious what the binary size impact will be.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2136

Doesn't need to be separate since this might be the first time we start depending on this invariant. Sounds like a good idea to me.

llvm/lib/CodeGen/StackProtector.cpp
390

s/it's/its/

This revision is now accepted and ready to land.Nov 29 2017, 3:48 PM
rnk added inline comments.Nov 29 2017, 3:51 PM
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2136

I discovered there's a reason we don't depend on this invariant: it doesn't hold at all. It would really be better if we had a way to assert if anyone calls getFrameRegister before ISel is finished...

rnk added a reviewer: MatzeB.Nov 29 2017, 5:30 PM
rnk requested review of this revision.Nov 29 2017, 5:33 PM
rnk edited edge metadata.

I had a possible idea for making this work. I thought about creating a virtual register that will ultimately be assigned to the frame register, which is calculated at the time of freezeReservedRegs. I prototyped something, but this virtual register breaks a lot of rules because it has no instructions that define it.

@MatzeB, does this seem like a good idea? It has the advantage that we'll go through instruction selection with a normal register, and then we'll rewrite it in the normal course of register allocation.

Alternatively, I can do things the usual way and add Yet Another custom pseudo-instruction that is X86-specific and we'll have to do this all over again when someone wants this on ARM.

rnk planned changes to this revision.Nov 29 2017, 5:34 PM
MatzeB edited edge metadata.Nov 29 2017, 6:31 PM
In D40622#940008, @rnk wrote:

I had a possible idea for making this work. I thought about creating a virtual register that will ultimately be assigned to the frame register, which is calculated at the time of freezeReservedRegs. I prototyped something, but this virtual register breaks a lot of rules because it has no instructions that define it.

@MatzeB, does this seem like a good idea? It has the advantage that we'll go through instruction selection with a normal register, and then we'll rewrite it in the normal course of register allocation.

I'm not sure I completely understand your plan. But some assumption that come to my mind:

  • We should only assign/decide on reserved registers once. (I know 1 or 2 targets currently violate this rule, but that should be fixed sooner or later).
  • The register allocators will generally not assign a reserved register to a virtual register.
  • The copy coalescer has some cases where copies from reserved regs to vregs get eliminated.
  • I'm not sure why you would want to use a vreg when you expect it to end up containing the frame pointer anyway. Instead just assign the reserved registers right away.

Looking around the code, I think the simplest solution would be to make useStackGuardXorFP() return the register that should be used or 0.

rnk added a comment.Nov 30 2017, 11:42 AM

Looking around the code, I think the simplest solution would be to make useStackGuardXorFP() return the register that should be used or 0.

The problem is that we don't know what register to use until freezeReservedRegs happens and hasFP stops changing value. I want to use a virtual register as a place-holder for the final frame register so that we can do normal instruction selection and then replace that virtual register with the frame register during virtual register rewriting. It really has no interaction with register allocation, it's just that virtual register rewriting seems like a convenient place to replace this place-holder. Is this a reasonable idea?

rnk added a comment.Nov 30 2017, 11:46 AM
In D40622#940850, @rnk wrote:

Looking around the code, I think the simplest solution would be to make useStackGuardXorFP() return the register that should be used or 0.

The problem is that we don't know what register to use until freezeReservedRegs happens and hasFP stops changing value. I want to use a virtual register as a place-holder for the final frame register so that we can do normal instruction selection and then replace that virtual register with the frame register during virtual register rewriting. It really has no interaction with register allocation, it's just that virtual register rewriting seems like a convenient place to replace this place-holder. Is this a reasonable idea?

Maybe a better way to think about it is it's a physical register that gets replaced with another physical register. Does this seem like something that would be useful? I'm surprised we don't have something like this already.

In D40622#940853, @rnk wrote:
In D40622#940850, @rnk wrote:

Looking around the code, I think the simplest solution would be to make useStackGuardXorFP() return the register that should be used or 0.

The problem is that we don't know what register to use until freezeReservedRegs happens and hasFP stops changing value. I want to use a virtual register as a place-holder for the final frame register so that we can do normal instruction selection and then replace that virtual register with the frame register during virtual register rewriting. It really has no interaction with register allocation, it's just that virtual register rewriting seems like a convenient place to replace this place-holder. Is this a reasonable idea?

Maybe a better way to think about it is it's a physical register that gets replaced with another physical register. Does this seem like something that would be useful? I'm surprised we don't have something like this already.

On a design/high level this stack protector stuff feels to me like it would be a better fit for the prolog epilogue insertion as it's basically inserting a few extra instructions there and people just happened to do it earlier because they wanted to use generic instructions so they don't have to reimplement it for every backend...

But short of redesigning the whole thing, using a vreg and replacing it in a custom pass is fine too.

rnk updated this revision to Diff 125020.Nov 30 2017, 2:20 PM
  • Redo the whole thing with a plain old x86 pseudo instruction =P
hans accepted this revision.Nov 30 2017, 2:32 PM

Nice. This seems like the most straight-forward solution to me.

llvm/lib/CodeGen/StackProtector.cpp
390

This one still applies (for the second it's)

llvm/lib/Target/X86/X86InstrCompiler.td
145

s/use/used/

This revision is now accepted and ready to land.Nov 30 2017, 2:32 PM
rnk marked an inline comment as done.Nov 30 2017, 2:37 PM

Thanks! I think this is the way to go for now. Having some kind of virtual frame register might be nice if we had a second use case for it, but I can't think of one now.

llvm/lib/CodeGen/StackProtector.cpp
390

Hm, actually that comment is wrong. The target's fast isel implementation doesn't have to do anything. I checked, and the protection check is inserted at a higher level after fast isel.

rnk updated this revision to Diff 125027.Nov 30 2017, 2:37 PM
  • fix comments
This revision was automatically updated to reflect the committed changes.