This is an archive of the discontinued LLVM Phabricator instance.

Enhance stack protector
ClosedPublic

Authored by xiangzhangllvm on Dec 3 2022, 3:11 AM.

Details

Summary

Enhance stack protector for calling no return function

I find current stack protector didn't work for calling no return function.
This is an defect for "stack protector", so let's refine it.

Diff Detail

Event Timeline

xiangzhangllvm created this revision.Dec 3 2022, 3:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 3 2022, 3:11 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
xiangzhangllvm requested review of this revision.Dec 3 2022, 3:11 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 3 2022, 3:11 AM
alexfh added a subscriber: alexfh.Dec 6 2022, 4:27 AM

There's a problem in the previous iteration of this patch (https://reviews.llvm.org/D138774). It made clang hang while compiling the following short snippet reduced from an open-source library (compiled on x86-64, linux):

$ cat q.c
char *texsubexpr(char *expression, char *subexpr) {
  char *texchar();
  if (*expression) return texchar(expression, subexpr);
}

void __stack_chk_fail (void) {
  __builtin_trap ();
}

$ clang -O1  -w -fstack-protector-all   -c q.c  -o q.o

Please ensure this is solved here.

There's a problem in the previous iteration of this patch (https://reviews.llvm.org/D138774). It made clang hang while compiling the following short snippet reduced from an open-source library (compiled on x86-64, linux):

I check it, the clang still hang, let me check and fix it. Thanks very much!

llvm/test/CodeGen/X86/stack-protector-no-return.ll
5

There is a fail (about dominate tree checking) for this test at https://reviews.llvm.org/D138774 ,current patch fix it and then, to let it be more clear, I use llvm-reduce simplify it.
(Now, both the old version and new version of this test are pass)

There's a problem in the previous iteration of this patch (https://reviews.llvm.org/D138774). It made clang hang while compiling the following short snippet reduced from an open-source library (compiled on x86-64, linux):

I check it, the clang still hang, let me check and fix it. Thanks very much!

I root cause it, this is a very special case : )
The function "stack_chk_fail" define in the upper case is also the Stack Protection will also use (and auto generate call for it).
The option "-fstack-protector-all" required "
stack_chk_fail" function self to be checked (by use "stack_chk_fail" self too.)
So this cause the Infinite loop for checking "
stack_chk_fail", and make clang hang.

This was a defect in StackProtector and exposed by current patch.

Let me fix it. Thanks again!

There's a problem in the previous iteration of this patch (https://reviews.llvm.org/D138774). It made clang hang while compiling the following short snippet reduced from an open-source library (compiled on x86-64, linux):

$ cat q.c
char *texsubexpr(char *expression, char *subexpr) {
  char *texchar();
  if (*expression) return texchar(expression, subexpr);
}

void __stack_chk_fail (void) {
  __builtin_trap ();
}

$ clang -O1  -w -fstack-protector-all   -c q.c  -o q.o

Please ensure this is solved here.

Fix/Done

xiangzhangllvm added inline comments.Dec 6 2022, 10:53 PM
llvm/test/CodeGen/X86/stack-protector-weight.ll
13 ↗(On Diff #480777)

The condition branch identify the branch probability at StackProtector.cpp: line 572
And BB calculate from SelectionDAGBuilder::visitBr by visit BR with following metedata:

!{!"branch_weights", i32 2147481600, i32 2048}
                                 (0x7ffff800)         (0x800)

The previous data is incorrect. (should be same with line 7)

Could you add a test case for recursively handling __stack_chk_fail?

Add test stack-protector-recursively.ll to check stack protector auto generated function (__stack_chk_fail).

Could you add a test case for recursively handling __stack_chk_fail?

Done

LuoYuanke added inline comments.Dec 8 2022, 1:14 AM
llvm/lib/CodeGen/StackProtector.cpp
581

This should increase compiling time. Need refine it by moving the BB child to NewBB.

xiangzhangllvm added inline comments.Dec 8 2022, 1:31 AM
llvm/lib/CodeGen/StackProtector.cpp
581

Yes, ok, let me add more code here, first delete the related old DT Edges and add back to the new ones.

xiangzhangllvm added inline comments.Dec 9 2022, 3:43 AM
llvm/lib/CodeGen/StackProtector.cpp
581

Let me first move the DT->recalculate out of loop.
I try do more faster update for the change, but meet some problems.
I need time to deeply look into the dominator tree.
Pls let me first fix the problem self and then do optimization.

LuoYuanke accepted this revision.Dec 9 2022, 6:12 PM

LGTM.

This revision is now accepted and ready to land.Dec 9 2022, 6:12 PM
This revision was landed with ongoing or failed builds.Dec 11 2022, 4:44 PM
This revision was automatically updated to reflect the committed changes.

This diff is a ~0.5% size regression under -Oz when building with -fstack-protector. Is this reasonably expected? And one that should be expected given that it's considered to be a correctness fix? cc @smeenai

This diff is a ~0.5% size regression under -Oz when building with -fstack-protector. Is this reasonably expected? And one that should be expected given that it's considered to be a correctness fix? cc @smeenai

I think the size increasement is expected because it inserts more stack corruption checks. Do you find any unexpected check?

This diff is a ~0.5% size regression under -Oz when building with -fstack-protector. Is this reasonably expected? And one that should be expected given that it's considered to be a correctness fix? cc @smeenai

I think the size increasement is expected because it inserts more stack corruption checks. Do you find any unexpected check?

That is right, sorry for not see this comment in time.

Would there be any objections to adding a flag to restore the previous behavior? This is a fairly hefty size increase for us, and we don't think the additional checking is worth the size increase in our case. We also intend to look into the size increases more to determine how much is from guarding calls to e.g. __cxa_throw, where you will reuse the stack frame, vs. calls to e.g. abort or _Unwind_Resume, where the stack frame should never be reused and checking the stack cookie doesn't seem worthwhile (if my logic is sound, which I'm not 100% sure of).

Would there be any objections to adding a flag to restore the previous behavior? This is a fairly hefty size increase for us, and we don't think the additional checking is worth the size increase in our case. We also intend to look into the size increases more to determine how much is from guarding calls to e.g. __cxa_throw, where you will reuse the stack frame, vs. calls to e.g. abort or _Unwind_Resume, where the stack frame should never be reused and checking the stack cookie doesn't seem worthwhile (if my logic is sound, which I'm not 100% sure of).

I agree to add a option to go in previous behavior. (e.g -fstack-protector-weak or -fstack-protector-return)
But this enhance for no-return function is necessary for a security. We meet such cases in user code. That is why I did it.

Would there be any objections to adding a flag to restore the previous behavior? This is a fairly hefty size increase for us, and we don't think the additional checking is worth the size increase in our case. We also intend to look into the size increases more to determine how much is from guarding calls to e.g. __cxa_throw, where you will reuse the stack frame, vs. calls to e.g. abort or _Unwind_Resume, where the stack frame should never be reused and checking the stack cookie doesn't seem worthwhile (if my logic is sound, which I'm not 100% sure of).

I agree to add a option to go in previous behavior. (e.g -fstack-protector-weak or -fstack-protector-return)
But this enhance for no-return function is necessary for a security. We meet such cases in user code. That is why I did it.

Yup, I'm not at all doubting the value of this work :) Adding the option will just let everyone make the appropriate trade-off between size and security for their use cases.

Hi @smeenai , do you have plan to implement it ? If not, I will do it, but may some days later (due to some other jobs in my hands).

Hi @smeenai , do you have plan to implement it ? If not, I will do it, but may some days later (due to some other jobs in my hands).

I'm going on an extended leave soon, and I won't have time to implement it before then. I've asked some coworkers to take a look, but if you have the time to add the option it'd be hugely appreciated :)

Hi @smeenai , do you have plan to implement it ? If not, I will do it, but may some days later (due to some other jobs in my hands).

I'm going on an extended leave soon, and I won't have time to implement it before then. I've asked some coworkers to take a look, but if you have the time to add the option it'd be hugely appreciated :)

OK, no problem, I'll take it. Thanks.

! In D139254#4036649, @smeenai wrote:
I'm going on an extended leave soon, and I won't have time to implement it before then. I've asked some coworkers to take a look, but if you have the time to add the option it'd be hugely appreciated :)

Hi @smeenai I first implement at an llvm option for your requirement, you can use -mllvm -option-name to quick solve your code size problem.
PLS refer D141556

! In D139254#4036649, @smeenai wrote:
I'm going on an extended leave soon, and I won't have time to implement it before then. I've asked some coworkers to take a look, but if you have the time to add the option it'd be hugely appreciated :)

Hi @smeenai I first implement at an llvm option for your requirement, you can use -mllvm -option-name to quick solve your code size problem.
PLS refer D141556

Thank you!

Are there possible optimizations which can be implemented to reduce the code-size impact of this change? I'm wondering if this patch adds the checks conservatively to all instances where stack check could potentially, but not necessarily, be missed/useful?

Are there possible optimizations which can be implemented to reduce the code-size impact of this change? I'm wondering if this patch adds the checks conservatively to all instances where stack check could potentially, but not necessarily, be missed/useful?

Seems good idea, but it is hard to guess how the attacker will attack our program. (which no-return should protect or not), and sorry for I am not a defense expert.
Welcome any good suggestion, or directly add me to related patch's reviewer or subsciber.

Allen added a subscriber: Allen.Mar 18 2023, 11:07 PM

Are there possible optimizations which can be implemented to reduce the code-size impact of this change? I'm wondering if this patch adds the checks conservatively to all instances where stack check could potentially, but not necessarily, be missed/useful?

Seems good idea, but it is hard to guess how the attacker will attack our program. (which no-return should protect or not), and sorry for I am not a defense expert.
Welcome any good suggestion, or directly add me to related patch's reviewer or subsciber.

I'm happy to see at least there's an option to turn this off.
-mllvm -disable-check-noreturn-call
https://reviews.llvm.org/D141556

That said, rereading https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58245#c6 I kind of agree with Jakub. What are you protecting exactly? Why do so only for noreturn calls? Why not prior to every call? Why not insert a stack check guard after every operation? If you're that paranoid, do no meaningful work and just keep checking the stack protector.

I think this would have been better implemented as a new level beyond -fstack-protector-strong rather than regressing the code size of every user of -fstack-protector-strong.

If GCC doesn't do this, is this something ICC did?

Are there possible optimizations which can be implemented to reduce the code-size impact of this change? I'm wondering if this patch adds the checks conservatively to all instances where stack check could potentially, but not necessarily, be missed/useful?

Seems good idea, but it is hard to guess how the attacker will attack our program. (which no-return should protect or not), and sorry for I am not a defense expert.
Welcome any good suggestion, or directly add me to related patch's reviewer or subsciber.

I'm happy to see at least there's an option to turn this off.
-mllvm -disable-check-noreturn-call
https://reviews.llvm.org/D141556

That said, rereading https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58245#c6 I kind of agree with Jakub. What are you protecting exactly? Why do so only for noreturn calls? Why not prior to every call? Why not insert a stack check guard after every operation? If you're that paranoid, do no meaningful work and just keep checking the stack protector.

I think this would have been better implemented as a new level beyond -fstack-protector-strong rather than regressing the code size of every user of -fstack-protector-strong.

If GCC doesn't do this, is this something ICC did?

The full context that motivated this patch is now publicly available in https://bugs.chromium.org/p/llvm/issues/detail?id=30, and I have a much better understanding now.

The problem is that you have noreturn functions like abort, which are actually going to terminate your program, and there's no need to check the stack canary before calling them. However, you also have noreturn functions like __cxa_throw and _Unwind_Resume which will cause your program to resume in the unwinder, and the unwinder is relying on saved return addresses on the stack to determine where to go, so not checking the canary before entering the unwinder can cause control flow to be hijacked.

Note that this patch by itself is an incomplete solution. @t.p.northover put up D143637 recently to make all unwind paths visible so that the canary can be checked for them. Unfortunately, I've found that patch to basically negate any size benefit from -disable-check-no-return-call. I haven't had the chance to dig into that yet, but maybe Tim has some ideas.

If you're also interested in this for size reasons, one specific potential enhancement would be distinguishing abort-like functions from unwind-like functions and avoiding the canary check before the former. I haven't been able to measure the ratio of the two in our code base yet.