This is an archive of the discontinued LLVM Phabricator instance.

There's a problem in the previous iteration of this patch (https://reviews.llvm.org/D138774). It made clang hang while compiling the following short snippet reduced from an open-source library (compiled on x86-64, linux):

$ cat q.c
char *texsubexpr(char *expression, char *subexpr) {
  char *texchar();
  if (*expression) return texchar(expression, subexpr);
}

void __stack_chk_fail (void) {
  __builtin_trap ();
}

$ clang -O1  -w -fstack-protector-all   -c q.c  -o q.o

Please ensure this is solved here.

In D139254#3974001, @alexfh wrote:

There's a problem in the previous iteration of this patch (https://reviews.llvm.org/D138774). It made clang hang while compiling the following short snippet reduced from an open-source library (compiled on x86-64, linux):

I checked it: clang still hangs. Let me investigate and fix it. Thanks very much!

llvm/test/CodeGen/X86/stack-protector-no-return.ll
5	This test failed (dominator tree verification) with https://reviews.llvm.org/D138774. The current patch fixes that; to make the test clearer, I also simplified it with llvm-reduce. (Both the old and the new version of this test now pass.)

xiangzhangllvm added a reviewer: alexfh. Dec 6 2022, 6:01 PM

In D139254#3976724, @xiangzhangllvm wrote:

I checked it: clang still hangs. Let me investigate and fix it. Thanks very much!

I found the root cause; this is a very special case :)
The function __stack_chk_fail defined in the test case above is the same function that stack protection itself uses (the pass automatically generates calls to it).
The option -fstack-protector-all requires __stack_chk_fail itself to be checked as well, and that check in turn calls __stack_chk_fail.
This causes an infinite loop while instrumenting __stack_chk_fail, which makes clang hang.

This is a defect in StackProtector that the current patch exposed.

Let me fix it. Thanks again!
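
To make the loop concrete, here is a toy C++ model of the failure mode described above. It is not LLVM code: ToyBlock and instrument are hypothetical names, and a block is reduced to a single flag saying whether it contains a noreturn call. The sketch assumes only what this review states: the generated failure block calls __stack_chk_fail, and that call can itself be treated as a noreturn call.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Hypothetical stand-in for a basic block. Only one property is kept:
// does the block contain a call that never returns?
struct ToyBlock {
  std::string Name;
  bool HasNoReturnCall;
};

// Walks the blocks the way an instrumentation loop might, inserting one
// "check" per block that holds a noreturn call. Returns the number of checks
// inserted; a result equal to Cap means the walk was cut off, not finished.
std::size_t instrument(std::list<ToyBlock> &Fn, bool SkipFailBlock,
                       std::size_t Cap) {
  assert(Cap > 0 && "the cap is what keeps the non-skipping walk finite");
  std::size_t Checks = 0;
  const ToyBlock *FailBB = nullptr;
  for (auto It = Fn.begin(); It != Fn.end() && Checks < Cap; ++It) {
    // Skip the failure block that the instrumentation itself created.
    if (SkipFailBlock && &*It == FailBB)
      continue;
    if (!It->HasNoReturnCall)
      continue;
    ++Checks;
    // The failure block calls __stack_chk_fail, modeled here as a noreturn
    // call. With the skip enabled, a single failure block is shared.
    if (!SkipFailBlock || !FailBB) {
      Fn.push_back(ToyBlock{"CallStackCheckFailBlk", true});
      FailBB = &Fn.back();
    }
  }
  return Checks;
}
```

With SkipFailBlock set, a function whose only block contains a noreturn call gets exactly one check. Without it, every generated failure block triggers another check, and the walk only stops because of the cap.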

In D139254#3974001, @alexfh wrote:
Please ensure this is solved here.

Fix/Done

xiangzhangllvm added inline comments. Dec 6 2022, 10:53 PM

llvm/test/CodeGen/X86/stack-protector-weight.ll
13 ↗	(On Diff #480777)	The conditional branch gets its branch probability at StackProtector.cpp, line 572. The BB probability is then calculated in SelectionDAGBuilder::visitBr when it visits the br with the following metadata: !{!"branch_weights", i32 2147481600, i32 2048} (0x7ffff800 and 0x800). The previous data was incorrect (it should be the same as line 7).
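
As a quick check of the numbers in this comment: the two branch weights sum to 2^31, so the failure edge is taken with probability 2048 / 2^31 = 1 / 2^20. The sketch below only redoes that arithmetic in standalone C++. SuccessWeight, FailureWeight, failureOneIn, and checkWeights are illustrative names, not LLVM identifiers.

```cpp
#include <cassert>
#include <cstdint>

// Branch weights quoted in the comment above.
constexpr std::uint64_t SuccessWeight = 2147481600; // 0x7ffff800
constexpr std::uint64_t FailureWeight = 2048;       // 0x800

static_assert(SuccessWeight == 0x7ffff800, "decimal and hex forms agree");
static_assert(FailureWeight == 0x800, "decimal and hex forms agree");
static_assert(SuccessWeight + FailureWeight == (std::uint64_t{1} << 31),
              "the weights sum to 2^31");

// The failure edge is expected once in this many executions of the branch.
constexpr std::uint64_t failureOneIn() {
  return (SuccessWeight + FailureWeight) / FailureWeight;
}

// Runtime form of the same check, for use from a test or from main().
inline void checkWeights() {
  assert(failureOneIn() == (std::uint64_t{1} << 20));
}
```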

Harbormaster completed remote builds in B201615: Diff 480777. Dec 7 2022, 9:33 AM

ping, thanks

Could you add a test case for recursively handling __stack_chk_fail?

Added test stack-protector-recursively.ll to check the handling of the function that the stack protector auto-generates calls to (__stack_chk_fail).

In D139254#3980660, @LuoYuanke wrote:

Could you add a test case for recursively handling __stack_chk_fail?

Done

LuoYuanke added inline comments. Dec 8 2022, 1:14 AM

llvm/lib/CodeGen/StackProtector.cpp
581	This will likely increase compile time. It needs refining by moving the BB's children to NewBB.

xiangzhangllvm added inline comments. Dec 8 2022, 1:31 AM

llvm/lib/CodeGen/StackProtector.cpp
581	Yes, OK. Let me add more code here: first delete the related old DT edges, then add the new ones.

Harbormaster completed remote builds in B201894: Diff 481176. Dec 8 2022, 8:59 AM

xiangzhangllvm updated this revision to Diff 481594. Dec 9 2022, 3:38 AM

xiangzhangllvm added inline comments. Dec 9 2022, 3:43 AM

llvm/lib/CodeGen/StackProtector.cpp
581	Let me first move the DT->recalculate call out of the loop. I tried to do a faster incremental update for this change but ran into some problems, and I need time to look more deeply into the dominator tree. Please let me fix the problem itself first and do the optimization afterwards.

Harbormaster completed remote builds in B202195: Diff 481594. Dec 9 2022, 4:41 AM

LGTM.

This revision is now accepted and ready to land. Dec 9 2022, 6:12 PM

This revision was landed with ongoing or failed builds. Dec 11 2022, 4:44 PM

Closed by commit rGd656ae280957: Enhance stack protector (authored by xiangzhangllvm).

This revision was automatically updated to reflect the committed changes.

xiangzhangllvm added a commit: rGd656ae280957: Enhance stack protector.

This diff is a ~0.5% size regression under -Oz when building with -fstack-protector. Is this reasonably expected? And one that should be expected given that it's considered to be a correctness fix? cc @smeenai

In D139254#4019615, @lanza wrote:

This diff is a ~0.5% size regression under -Oz when building with -fstack-protector. Is this reasonably expected? And one that should be expected given that it's considered to be a correctness fix? cc @smeenai

I think the size increase is expected because the patch inserts more stack corruption checks. Did you find any unexpected checks?

In D139254#4019890, @LuoYuanke wrote:

I think the size increase is expected because the patch inserts more stack corruption checks. Did you find any unexpected checks?

That is right. Sorry for not seeing this comment in time.

Would there be any objections to adding a flag to restore the previous behavior? This is a fairly hefty size increase for us, and we don't think the additional checking is worth the size increase in our case. We also intend to look into the size increases more to determine how much is from guarding calls to e.g. __cxa_throw, where you will reuse the stack frame, vs. calls to e.g. abort or _Unwind_Resume, where the stack frame should never be reused and checking the stack cookie doesn't seem worthwhile (if my logic is sound, which I'm not 100% sure of).

In D139254#4029517, @smeenai wrote:

Would there be any objections to adding a flag to restore the previous behavior? This is a fairly hefty size increase for us, and we don't think the additional checking is worth the size increase in our case. We also intend to look into the size increases more to determine how much is from guarding calls to e.g. __cxa_throw, where you will reuse the stack frame, vs. calls to e.g. abort or _Unwind_Resume, where the stack frame should never be reused and checking the stack cookie doesn't seem worthwhile (if my logic is sound, which I'm not 100% sure of).

I agree with adding an option to restore the previous behavior (e.g. -fstack-protector-weak or -fstack-protector-return).
But this enhancement for no-return functions is necessary for security. We have met such cases in user code, which is why I did it.

In D139254#4030325, @xiangzhangllvm wrote:

I agree with adding an option to restore the previous behavior (e.g. -fstack-protector-weak or -fstack-protector-return).
But this enhancement for no-return functions is necessary for security. We have met such cases in user code, which is why I did it.

Yup, I'm not at all doubting the value of this work :) Adding the option will just let everyone make the appropriate trade-off between size and security for their use cases.

Hi @smeenai, do you plan to implement it? If not, I will do it, but maybe some days later (I have some other work on my hands).

In D139254#4034785, @xiangzhangllvm wrote:

Hi @smeenai, do you plan to implement it? If not, I will do it, but maybe some days later (I have some other work on my hands).

I'm going on an extended leave soon, and I won't have time to implement it before then. I've asked some coworkers to take a look, but if you have the time to add the option it'd be hugely appreciated :)

In D139254#4036649, @smeenai wrote:

I'm going on an extended leave soon, and I won't have time to implement it before then. I've asked some coworkers to take a look, but if you have the time to add the option it'd be hugely appreciated :)

OK, no problem, I'll take it. Thanks.

In D139254#4036649, @smeenai wrote:

I'm going on an extended leave soon, and I won't have time to implement it before then. I've asked some coworkers to take a look, but if you have the time to add the option it'd be hugely appreciated :)

Hi @smeenai, I have first implemented an LLVM option for your requirement; you can use -mllvm -option-name to quickly solve your code size problem.
Please refer to D141556.

t.p.northover mentioned this in D143637: StackProtector: add unwind cleanup paths for instrumentation. Feb 9 2023, 3:35 AM

In D139254#4045730, @xiangzhangllvm wrote:

Hi @smeenai, I have first implemented an LLVM option for your requirement; you can use -mllvm -option-name to quickly solve your code size problem.
Please refer to D141556.

Thank you!

Are there possible optimizations which can be implemented to reduce the code-size impact of this change? I'm wondering if this patch adds the checks conservatively to all instances where stack check could potentially, but not necessarily, be missed/useful?

In D139254#4190508, @hiraditya wrote:

Are there possible optimizations which can be implemented to reduce the code-size impact of this change? I'm wondering if this patch adds the checks conservatively to all instances where stack check could potentially, but not necessarily, be missed/useful?

That seems like a good idea, but it is hard to guess how an attacker will attack our program (that is, which no-return calls should be protected and which should not), and sorry, I am not a defense expert.
Any good suggestions are welcome, or add me directly as a reviewer or subscriber on a related patch.

t.p.northover mentioned this in rG203b6f31bb71: DwarfEHPrepare: insert extra unwind paths for stack protector to instrument. Mar 16 2023, 4:34 AM

t.p.northover mentioned this in rG2d690684f66f: Recommit DwarfEHPrepare: insert extra unwind paths for stack protector to…. Mar 16 2023, 6:43 AM

Allen added a subscriber: Allen. Mar 18 2023, 11:07 PM

In D139254#4191358, @xiangzhangllvm wrote:

That seems like a good idea, but it is hard to guess how an attacker will attack our program (that is, which no-return calls should be protected and which should not), and sorry, I am not a defense expert.
Any good suggestions are welcome, or add me directly as a reviewer or subscriber on a related patch.

I'm happy to see at least there's an option to turn this off.
-mllvm -disable-check-noreturn-call
https://reviews.llvm.org/D141556

That said, rereading https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58245#c6 I kind of agree with Jakub. What are you protecting exactly? Why do so only for noreturn calls? Why not prior to every call? Why not insert a stack check guard after every operation? If you're that paranoid, do no meaningful work and just keep checking the stack protector.

I think this would have been better implemented as a new level beyond -fstack-protector-strong rather than regressing the code size of every user of -fstack-protector-strong.

If GCC doesn't do this, is this something ICC did?

In D139254#4231890, @nickdesaulniers wrote:

I'm happy to see at least there's an option to turn this off.
-mllvm -disable-check-noreturn-call
https://reviews.llvm.org/D141556

That said, rereading https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58245#c6 I kind of agree with Jakub. What are you protecting exactly? Why do so only for noreturn calls? Why not prior to every call? Why not insert a stack check guard after every operation? If you're that paranoid, do no meaningful work and just keep checking the stack protector.

I think this would have been better implemented as a new level beyond -fstack-protector-strong rather than regressing the code size of every user of -fstack-protector-strong.

If GCC doesn't do this, is this something ICC did?

The full context that motivated this patch is now publicly available in https://bugs.chromium.org/p/llvm/issues/detail?id=30, and I have a much better understanding now.

The problem is that you have noreturn functions like abort, which are actually going to terminate your program, and there's no need to check the stack canary before calling them. However, you also have noreturn functions like __cxa_throw and _Unwind_Resume which will cause your program to resume in the unwinder, and the unwinder is relying on saved return addresses on the stack to determine where to go, so not checking the canary before entering the unwinder can cause control flow to be hijacked.

Note that this patch by itself is an incomplete solution. @t.p.northover put up D143637 recently to make all unwind paths visible so that the canary can be checked for them. Unfortunately, I've found that patch to basically negate any size benefit from -disable-check-noreturn-call. I haven't had the chance to dig into that yet, but maybe Tim has some ideas.

If you're also interested in this for size reasons, one specific potential enhancement would be distinguishing abort-like functions from unwind-like functions and avoiding the canary check before the former. I haven't been able to measure the ratio of the two in our code base yet.
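
To illustrate the two kinds of noreturn function contrasted above, here is a small standalone C++ sketch. failByThrowing, failByAborting, and resumesInCaller are made-up names, and marking the abort-like case with noexcept is an assumption made for illustration, not something this patch does.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>

// Unwind-like: never returns normally, but the unwinder walks the stack and
// resumes execution in a handler that belongs to some caller's frame.
[[noreturn]] void failByThrowing() { throw std::runtime_error("unwinding"); }

// Abort-like: terminates the process, so no caller frame is ever resumed.
[[noreturn]] void failByAborting() noexcept { std::abort(); }

// Returns true when control came back into this frame through a catch
// handler, which is the situation where a corrupted frame could matter.
bool resumesInCaller() {
  bool Resumed = false;
  try {
    failByThrowing();
  } catch (const std::runtime_error &) {
    Resumed = true;
  }
  return Resumed;
}
```

Calling resumesInCaller() shows the unwind-like case: after failByThrowing, execution continues in the caller's frame. failByAborting never hands control back to any frame, which is the argument made above for skipping the canary check before such calls.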

hans mentioned this in rG91beab69cdac: Revert "Recommit DwarfEHPrepare: insert extra unwind paths for stack protector…. Apr 4 2023, 9:10 AM

smeenai mentioned this in D147975: [StackProtector] don't check stack protector before calling nounwind functions. Apr 11 2023, 9:42 PM

Revision Contents

Path	Size
llvm/lib/CodeGen/StackProtector.cpp	69 lines
llvm/test/CodeGen/X86/stack-protector-2.ll	30 lines
llvm/test/CodeGen/X86/stack-protector-no-return.ll	165 lines
llvm/test/CodeGen/X86/stack-protector-recursively.ll	26 lines

Diff 481957

llvm/lib/CodeGen/StackProtector.cpp

///		///
/// entry:		/// entry:
/// StackGuardSlot = alloca i8*		/// StackGuardSlot = alloca i8*
/// StackGuard = <stack guard>		/// StackGuard = <stack guard>
/// call void @llvm.stackprotector(StackGuard, StackGuardSlot)		/// call void @llvm.stackprotector(StackGuard, StackGuardSlot)
///		///
/// Returns true if the platform/triple supports the stackprotectorcreate pseudo		/// Returns true if the platform/triple supports the stackprotectorcreate pseudo
/// node.		/// node.
static bool CreatePrologue(Function *F, Module *M, ReturnInst *RI,		static bool CreatePrologue(Function *F, Module *M, Instruction *CheckLoc,
const TargetLoweringBase *TLI, AllocaInst *&AI) {		const TargetLoweringBase *TLI, AllocaInst *&AI) {
bool SupportsSelectionDAGSP = false;		bool SupportsSelectionDAGSP = false;
IRBuilder<> B(&F->getEntryBlock().front());		IRBuilder<> B(&F->getEntryBlock().front());
PointerType *PtrTy = Type::getInt8PtrTy(RI->getContext());		PointerType *PtrTy = Type::getInt8PtrTy(CheckLoc->getContext());
AI = B.CreateAlloca(PtrTy, nullptr, "StackGuardSlot");		AI = B.CreateAlloca(PtrTy, nullptr, "StackGuardSlot");

Value *GuardSlot = getStackGuard(TLI, M, B, &SupportsSelectionDAGSP);		Value *GuardSlot = getStackGuard(TLI, M, B, &SupportsSelectionDAGSP);
B.CreateCall(Intrinsic::getDeclaration(M, Intrinsic::stackprotector),		B.CreateCall(Intrinsic::getDeclaration(M, Intrinsic::stackprotector),
{GuardSlot, AI});		{GuardSlot, AI});
return SupportsSelectionDAGSP;		return SupportsSelectionDAGSP;
}		}

/// InsertStackProtectors - Insert code into the prologue and epilogue of the		/// InsertStackProtectors - Insert code into the prologue and epilogue of the
/// function.		/// function.
///		///
/// - The prologue code loads and stores the stack guard onto the stack.		/// - The prologue code loads and stores the stack guard onto the stack.
/// - The epilogue checks the value stored in the prologue against the original		/// - The epilogue checks the value stored in the prologue against the original
/// value. It calls __stack_chk_fail if they differ.		/// value. It calls __stack_chk_fail if they differ.
bool StackProtector::InsertStackProtectors() {		bool StackProtector::InsertStackProtectors() {
// If the target wants to XOR the frame pointer into the guard value, it's		// If the target wants to XOR the frame pointer into the guard value, it's
// impossible to emit the check in IR, so the target must support stack		// impossible to emit the check in IR, so the target must support stack
// protection in SDAG.		// protection in SDAG.
bool SupportsSelectionDAGSP =		bool SupportsSelectionDAGSP =
TLI->useStackGuardXorFP() ||		TLI->useStackGuardXorFP() ||
(EnableSelectionDAGSP && !TM->Options.EnableFastISel);		(EnableSelectionDAGSP && !TM->Options.EnableFastISel);
AllocaInst *AI = nullptr; // Place on stack that stores the stack guard.		AllocaInst *AI = nullptr; // Place on stack that stores the stack guard.
		bool RecalculateDT = false;
		BasicBlock *FailBB = nullptr;

for (BasicBlock &BB : llvm::make_early_inc_range(*F)) {		for (BasicBlock &BB : llvm::make_early_inc_range(*F)) {
ReturnInst *RI = dyn_cast<ReturnInst>(BB.getTerminator());		// This is stack protector auto generated check BB, skip it.
if (!RI)		if (&BB == FailBB)
		continue;
		Instruction *CheckLoc = dyn_cast<ReturnInst>(BB.getTerminator());
		if (!CheckLoc) {
		for (auto &Inst : BB) {
		auto *CB = dyn_cast<CallBase>(&Inst);
		if (!CB)
		continue;
		if (!CB->doesNotReturn())
		continue;
		// Do stack check before non-return calls (e.g: __cxa_throw)
		CheckLoc = CB;
		break;
		}
		}

		if (!CheckLoc)
continue;		continue;

// Generate prologue instrumentation if not already generated.		// Generate prologue instrumentation if not already generated.
if (!HasPrologue) {		if (!HasPrologue) {
HasPrologue = true;		HasPrologue = true;
SupportsSelectionDAGSP &= CreatePrologue(F, M, RI, TLI, AI);		SupportsSelectionDAGSP &= CreatePrologue(F, M, CheckLoc, TLI, AI);
}		}

// SelectionDAG based code generation. Nothing else needs to be done here.		// SelectionDAG based code generation. Nothing else needs to be done here.
// The epilogue instrumentation is postponed to SelectionDAG.		// The epilogue instrumentation is postponed to SelectionDAG.
if (SupportsSelectionDAGSP)		if (SupportsSelectionDAGSP)
break;		break;

// Find the stack guard slot if the prologue was not created by this pass		// Find the stack guard slot if the prologue was not created by this pass
[9 unchanged lines not shown; enclosing context: for (BasicBlock &BB : llvm::make_early_inc_range(*F)) {]
// instrumentation has already been generated.		// instrumentation has already been generated.
HasIRCheck = true;		HasIRCheck = true;

// If we're instrumenting a block with a tail call, the check has to be		// If we're instrumenting a block with a tail call, the check has to be
// inserted before the call rather than between it and the return. The		// inserted before the call rather than between it and the return. The
// verifier guarantees that a tail call is either directly before the		// verifier guarantees that a tail call is either directly before the
// return or with a single correct bitcast of the return value in between so		// return or with a single correct bitcast of the return value in between so
// we don't need to worry about many situations here.		// we don't need to worry about many situations here.
Instruction *CheckLoc = RI;		Instruction *Prev = CheckLoc->getPrevNonDebugInstruction();
Instruction *Prev = RI->getPrevNonDebugInstruction();
if (Prev && isa<CallInst>(Prev) && cast<CallInst>(Prev)->isTailCall())		if (Prev && isa<CallInst>(Prev) && cast<CallInst>(Prev)->isTailCall())
CheckLoc = Prev;		CheckLoc = Prev;
else if (Prev) {		else if (Prev) {
Prev = Prev->getPrevNonDebugInstruction();		Prev = Prev->getPrevNonDebugInstruction();
if (Prev && isa<CallInst>(Prev) && cast<CallInst>(Prev)->isTailCall())		if (Prev && isa<CallInst>(Prev) && cast<CallInst>(Prev)->isTailCall())
CheckLoc = Prev;		CheckLoc = Prev;
}		}

[33 unchanged lines not shown; enclosing context: if (Function *GuardCheck = TLI->getSSPStackGuardCheck(M)) {]
//		//
// CallStackCheckFailBlk:		// CallStackCheckFailBlk:
// call void @__stack_chk_fail()		// call void @__stack_chk_fail()
// unreachable		// unreachable

// Create the FailBB. We duplicate the BB every time since the MI tail		// Create the FailBB. We duplicate the BB every time since the MI tail
// merge pass will merge together all of the various BB into one including		// merge pass will merge together all of the various BB into one including
// fail BB generated by the stack protector pseudo instruction.		// fail BB generated by the stack protector pseudo instruction.
BasicBlock *FailBB = CreateFailBB();		if (!FailBB)
		FailBB = CreateFailBB();

// Split the basic block before the return instruction.		// Split the basic block before the return instruction.
BasicBlock *NewBB =		BasicBlock *NewBB =
BB.splitBasicBlock(CheckLoc->getIterator(), "SP_return");		BB.splitBasicBlock(CheckLoc->getIterator(), "SP_return");

// Update the dominator tree if we need to.
if (DT && DT->isReachableFromEntry(&BB)) {
DT->addNewBlock(NewBB, &BB);
DT->addNewBlock(FailBB, &BB);
}

// Remove default branch instruction to the new BB.		// Remove default branch instruction to the new BB.
BB.getTerminator()->eraseFromParent();		BB.getTerminator()->eraseFromParent();

// Move the newly created basic block to the point right after the old		// Move the newly created basic block to the point right after the old
// basic block so that it's in the "fall through" position.		// basic block so that it's in the "fall through" position.
NewBB->moveAfter(&BB);		NewBB->moveAfter(&BB);

// Generate the stack protector instructions in the old basic block.		// Generate the stack protector instructions in the old basic block.
IRBuilder<> B(&BB);		IRBuilder<> B(&BB);
Value *Guard = getStackGuard(TLI, M, B);		Value *Guard = getStackGuard(TLI, M, B);
LoadInst *LI2 = B.CreateLoad(B.getInt8PtrTy(), AI, true);		LoadInst *LI2 = B.CreateLoad(B.getInt8PtrTy(), AI, true);
Value *Cmp = B.CreateICmpEQ(Guard, LI2);		Value *Cmp = B.CreateICmpEQ(Guard, LI2);
auto SuccessProb =		auto SuccessProb =
BranchProbabilityInfo::getBranchProbStackProtector(true);		BranchProbabilityInfo::getBranchProbStackProtector(true);
auto FailureProb =		auto FailureProb =
BranchProbabilityInfo::getBranchProbStackProtector(false);		BranchProbabilityInfo::getBranchProbStackProtector(false);
MDNode *Weights = MDBuilder(F->getContext())		MDNode *Weights = MDBuilder(F->getContext())
.createBranchWeights(SuccessProb.getNumerator(),		.createBranchWeights(SuccessProb.getNumerator(),
FailureProb.getNumerator());		FailureProb.getNumerator());
B.CreateCondBr(Cmp, NewBB, FailBB, Weights);		B.CreateCondBr(Cmp, NewBB, FailBB, Weights);

		// Update the dominator tree if we need to.
		if (DT && DT->isReachableFromEntry(&BB))
		RecalculateDT = true;
}		}
}		}

		// TODO: Refine me, use faster way to update DT.
		// Now we have spilt the BB, some like:
		// ===================================
		// BB:
		// RetOrNoReturnCall
		// ==>
		// BB:
		// CondBr
		// NewBB:
		// RetOrNoReturnCall
		// FailBB: (*)
		// HandleStackCheckFail
		// ===================================
		// The faster way should cover:
		// For NewBB, it should success the old BB's dominatees.
		// 1) return: it didn't have dominatee
		// 2) no-return call: there may has dominatees.
		//
		// For FailBB, it may be created before, So
		// 1) if it has 1 Predecessors, add it into DT.
		// 2) if it has 2 Predecessors, it should has no dominator, remove it from DT.
		// 3) if it has 3 or more Predecessors, DT has removed it, do nothing.
		if (RecalculateDT)
		DT->recalculate(*F);

// Return if we didn't modify any basic blocks. i.e., there are no return		// Return if we didn't modify any basic blocks. i.e., there are no return
// statements in the function.		// statements in the function.
return HasPrologue;		return HasPrologue;
}		}

/// CreateFailBB - Create a basic block to jump to when the stack protector		/// CreateFailBB - Create a basic block to jump to when the stack protector
/// check fails.		/// check fails.
BasicBlock *StackProtector::CreateFailBB() {		BasicBlock *StackProtector::CreateFailBB() {

llvm/test/CodeGen/X86/stack-protector-2.ll

[unchanged lines not shown; preceding context: ; CHECK-NEXT: %2 = alloca i64]
%2 = alloca i64, align 8		%2 = alloca i64, align 8
store i64 %0, i64* %2, align 8		store i64 %0, i64* %2, align 8
%3 = load i64, i64* %2, align 8		%3 = load i64, i64* %2, align 8
%4 = alloca i8, i64 %3, align 16		%4 = alloca i8, i64 %3, align 16
call void @foo(i8* %4)		call void @foo(i8* %4)
ret void		ret void
}		}

		; Check stack protect for noreturn call
		define dso_local i32 @foo_no_return(i32 %0) #1 {
		; CHECK-LABEL: @foo_no_return
		entry:
		%cmp = icmp sgt i32 %0, 4
		br i1 %cmp, label %if.then, label %if.end

		; CHECK: if.then: ; preds = %entry
		; CHECK-NEXT: %StackGuard1 = load volatile i8*, i8* addrspace(257)* inttoptr (i32 40 to i8* addrspace(257)*), align 8
		; CHECK-NEXT: %1 = load volatile i8*, i8** %StackGuardSlot, align 8
		; CHECK-NEXT: %2 = icmp eq i8* %StackGuard1, %1
		; CHECK-NEXT: br i1 %2, label %SP_return, label %CallStackCheckFailBlk
		; CHECK: SP_return: ; preds = %if.then
		; CHECK-NEXT: %call = call i32 @foo_no_return(i32 1)
		; CHECK-NEXT: br label %return
		; CHECK: if.end: ; preds = %entry
		; CHECK-NEXT: br label %return

		if.then: ; preds = %entry
		%call = call i32 @foo_no_return(i32 1)
		br label %return

		if.end: ; preds = %entry
		br label %return

		return: ; preds = %if.end, %if.then
		ret i32 0
		}

attributes #0 = { sspstrong }		attributes #0 = { sspstrong }
		attributes #1 = { noreturn sspreq}

llvm/test/CodeGen/X86/stack-protector-no-return.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc %s -mtriple=x86_64-unknown-linux-gnu -o - -verify-dom-info | FileCheck %s			; RUN: llc %s -mtriple=x86_64-unknown-linux-gnu -o - -verify-dom-info | FileCheck %s

	$__clang_call_terminate = comdat any			; Function Attrs: sspreq
				define void @_Z7catchesv() #0 personality i8* null {
	@_ZTIi = external dso_local constant i8*			; CHECK-LABEL: _Z7catchesv:
	@.str = private unnamed_addr constant [5 x i8] c"win\0A\00", align 1

	; Function Attrs: mustprogress noreturn sspreq uwtable
	define dso_local void @_Z7catchesv() local_unnamed_addr #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
	entry:
	%exception = tail call i8* @__cxa_allocate_exception(i64 4) #8
	%0 = bitcast i8* %exception to i32*
	store i32 1, i32* %0, align 16
	invoke void @__cxa_throw(i8* nonnull %exception, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #9
	to label %unreachable unwind label %lpad


	lpad: ; preds = %entry
	%1 = landingpad { i8*, i32 }
	catch i8* null
	%2 = extractvalue { i8*, i32 } %1, 0
	%3 = tail call i8* @__cxa_begin_catch(i8* %2) #8
	%call = invoke i64 @write(i32 noundef 1, i8* noundef getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0), i64 noundef 4)
	to label %invoke.cont unwind label %lpad1


	invoke.cont: ; preds = %lpad
	invoke void @_exit(i32 noundef 1) #9
	to label %invoke.cont2 unwind label %lpad1

	invoke.cont2: ; preds = %invoke.cont
	unreachable

	lpad1: ; preds = %invoke.cont, %lpad
	%4 = landingpad { i8*, i32 }
	cleanup
	invoke void @__cxa_end_catch()
	to label %eh.resume unwind label %terminate.lpad

	eh.resume: ; preds = %lpad1
	resume { i8*, i32 } %4

	terminate.lpad: ; preds = %lpad1
	%5 = landingpad { i8*, i32 }
	catch i8* null
	%6 = extractvalue { i8*, i32 } %5, 0
	tail call void @__clang_call_terminate(i8* %6) #10
	unreachable



	unreachable: ; preds = %entry
	unreachable
	}

	; Function Attrs: nofree
	declare dso_local noalias i8* @__cxa_allocate_exception(i64) local_unnamed_addr #1

	; Function Attrs: nofree noreturn
	declare dso_local void @__cxa_throw(i8*, i8*, i8*) local_unnamed_addr #2

	declare dso_local i32 @__gxx_personality_v0(...)

	; Function Attrs: nofree
	declare dso_local i8* @__cxa_begin_catch(i8*) local_unnamed_addr #1

	; Function Attrs: nofree
	declare dso_local noundef i64 @write(i32 noundef, i8* nocapture noundef readonly, i64 noundef) local_unnamed_addr #3

	; Function Attrs: nofree noreturn
	declare dso_local void @_exit(i32 noundef) local_unnamed_addr #4

	; Function Attrs: nofree
	declare dso_local void @__cxa_end_catch() local_unnamed_addr #1

	; Function Attrs: noinline noreturn nounwind
	define linkonce_odr hidden void @__clang_call_terminate(i8* %0) local_unnamed_addr #5 comdat {
	; CHECK-LABEL: __clang_call_terminate:
	; CHECK: # %bb.0:
	; CHECK-NEXT: pushq %rax
	; CHECK-NEXT: callq __cxa_begin_catch
	; CHECK-NEXT: callq _ZSt9terminatev
	%2 = tail call i8* @__cxa_begin_catch(i8* %0) #8
	tail call void @_ZSt9terminatev() #10
	unreachable
	}

	; Function Attrs: nofree noreturn nounwind
	declare dso_local void @_ZSt9terminatev() local_unnamed_addr #6

	; Function Attrs: mustprogress nofree sspreq uwtable
	define dso_local void @_Z4vulni(i32 noundef %op) local_unnamed_addr #7 {
	; CHECK-LABEL: _Z4vulni:
	; CHECK: # %bb.0: # %entry			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: pushq %rax			; CHECK-NEXT: pushq %rax
	; CHECK-NEXT: .cfi_def_cfa_offset 16			; CHECK-NEXT: .cfi_def_cfa_offset 16
	; CHECK-NEXT: movq %fs:40, %rax			; CHECK-NEXT: movq %fs:40, %rax
	; CHECK-NEXT: movq %rax, (%rsp)			; CHECK-NEXT: movq %rax, (%rsp)
	; CHECK-NEXT: cmpl $1, %edi			; CHECK-NEXT: .Ltmp0:
	; CHECK-NEXT: je .LBB2_3			; CHECK-NEXT: xorl %eax, %eax
	; CHECK-NEXT: # %bb.1: # %if.end			; CHECK-NEXT: xorl %edi, %edi
				; CHECK-NEXT: xorl %esi, %esi
				; CHECK-NEXT: xorl %edx, %edx
				; CHECK-NEXT: callq *%rax
				; CHECK-NEXT: .Ltmp1:
				; CHECK-NEXT: # %bb.1: # %invoke.cont
				; CHECK-NEXT: movq %fs:40, %rax
				; CHECK-NEXT: cmpq (%rsp), %rax
				; CHECK-NEXT: jne .LBB0_6
				; CHECK-NEXT: # %bb.2: # %SP_return
				; CHECK-NEXT: .Ltmp2:
				; CHECK-NEXT: xorl %eax, %eax
				; CHECK-NEXT: xorl %edi, %edi
				; CHECK-NEXT: callq *%rax
				; CHECK-NEXT: .Ltmp3:
				; CHECK-NEXT: # %bb.3: # %invoke.cont2
				; CHECK-NEXT: .LBB0_4: # %lpad1
				; CHECK-NEXT: .Ltmp4:
	; CHECK-NEXT: movq %fs:40, %rax			; CHECK-NEXT: movq %fs:40, %rax
	; CHECK-NEXT: cmpq (%rsp), %rax			; CHECK-NEXT: cmpq (%rsp), %rax
	; CHECK-NEXT: jne .LBB2_2			; CHECK-NEXT: jne .LBB0_6
	; CHECK-NEXT: # %bb.4: # %SP_return			; CHECK-NEXT: # %bb.5: # %SP_return2
	; CHECK-NEXT: popq %rax			; CHECK-NEXT: popq %rax
	; CHECK-NEXT: .cfi_def_cfa_offset 8			; CHECK-NEXT: .cfi_def_cfa_offset 8
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	; CHECK-NEXT: .LBB2_3: # %if.then			; CHECK-NEXT: .LBB0_6: # %CallStackCheckFailBlk
	; CHECK-NEXT: .cfi_def_cfa_offset 16			; CHECK-NEXT: .cfi_def_cfa_offset 16
	; CHECK-NEXT: movl $4, %edi
	; CHECK-NEXT: callq __cxa_allocate_exception
	; CHECK-NEXT: movl $1, (%rax)
	; CHECK-NEXT: movl $_ZTIi, %esi
	; CHECK-NEXT: movq %rax, %rdi
	; CHECK-NEXT: xorl %edx, %edx
	; CHECK-NEXT: callq __cxa_throw
	; CHECK-NEXT: .LBB2_2: # %CallStackCheckFailBlk
	; CHECK-NEXT: callq __stack_chk_fail@PLT			; CHECK-NEXT: callq __stack_chk_fail@PLT
	entry:			entry:
	%cmp = icmp eq i32 %op, 1			%call = invoke i64 null(i32 0, i8* null, i64 0)
	br i1 %cmp, label %if.then, label %if.end			to label %invoke.cont unwind label %lpad1

	if.then: ; preds = %entry			invoke.cont: ; preds = %entry
	%exception = tail call i8* @__cxa_allocate_exception(i64 4) #8			invoke void null(i32 0) #1
	%0 = bitcast i8* %exception to i32*			to label %invoke.cont2 unwind label %lpad1
	store i32 1, i32* %0, align 16
	tail call void @__cxa_throw(i8* %exception, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #9			invoke.cont2: ; preds = %invoke.cont
	unreachable			unreachable

	if.end: ; preds = %entry			lpad1: ; preds = %invoke.cont, %entry
				%0 = landingpad { i8*, i32 }
				cleanup
	ret void			ret void
	}			}

	attributes #0 = { mustprogress noreturn sspreq uwtable }			; uselistorder directives
	attributes #1 = { nofree }			uselistorder i8* null, { 1, 0 }
	attributes #2 = { nofree noreturn }
	attributes #3 = { nofree }			attributes #0 = { sspreq }
	attributes #4 = { nofree noreturn }			attributes #1 = { noreturn }
	attributes #5 = { noinline noreturn nounwind }
	attributes #6 = { nofree noreturn nounwind }
	attributes #7 = { mustprogress nofree sspreq uwtable }
	attributes #8 = { nounwind }
	attributes #9 = { noreturn }
	attributes #10 = { noreturn nounwind }

llvm/test/CodeGen/X86/stack-protector-recursively.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -mtriple=x86_64-pc-linux-gnu -o - < %s | FileCheck %s

				; Make sure the stack protector does not infinitely check __stack_chk_fail.
				define dso_local void @__stack_chk_fail() local_unnamed_addr #0 {
				; CHECK-LABEL: __stack_chk_fail:
				; CHECK: # %bb.0: # %entry
				; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: movq %fs:40, %rax
				; CHECK-NEXT: movq %rax, (%rsp)
				; CHECK-NEXT: movq %fs:40, %rax
				; CHECK-NEXT: cmpq (%rsp), %rax
				; CHECK-NEXT: jne .LBB0_2
				; CHECK-NEXT: # %bb.1: # %SP_return
				; CHECK-NEXT: ud2
				; CHECK-NEXT: .LBB0_2: # %CallStackCheckFailBlk
				; CHECK-NEXT: callq __stack_chk_fail
				entry:
				tail call void @llvm.trap()
				unreachable
				}

				declare void @llvm.trap() #1

				attributes #0 = { noreturn nounwind sspreq }
				attributes #1 = { noreturn nounwind }

Enhance stack protector (Closed, Public)

Details

Diff Detail