This is an archive of the discontinued LLVM Phabricator instance.

[WIP] Use dynamic alloca for stack variables in ASan.
ClosedPublic

Authored by samsonov on Dec 1 2014, 11:56 AM.

Details

Reviewers
kcc
Summary

This is an experimental patch that changes the way we
instrument stack variables: instead of merging all static alloca
instructions into one large static alloca, we merge them into
one *dynamic* alloca. There are two main reasons for that:

  1. We significantly reduce the stack usage in "use-after-return"

mode. Currently, large static alloca is just unused in use-after-return
mode, as we actually use "fake stack" allocated by the ASan runtime.

  1. It can improve debug info for local variables, as their location

will now be calculated via %rbx register, not %rsp, that is assumed
to be clobbered by function calls.

I'm still running a SPEC for this patch to see if it slows down ASan
on benchmarks. I've also ran it on a large internal codebase, and it
unveiled numerous problems. I see problems with inline assembly
(e.g. inline assembly code for fetching cpuid doesn't properly
save/restore %rbx, or compile errors of the form "inline assembly requires
more registers than available"). I also see several runtime errors
on -O2 level, which I haven't debugged yet.

Sadly, deployment of this change will be hard (if possible). Still,
you're welcome to experiment with this patch and see if it solves
some problems you observe (esp. debug info issues).

Diff Detail

Event Timeline

samsonov updated this revision to Diff 16778.Dec 1 2014, 11:56 AM
samsonov retitled this revision from to [WIP] Use dynamic alloca for stack variables in ASan..
samsonov updated this object.
samsonov edited the test plan for this revision. (Show Details)
samsonov added subscribers: kcc, kubamracek, friss, Unknown Object (MLST).

Hi,
I've tried this patch to see if it solves the missing debugging information issues that we're struggling with. I can confirm that most of the "regular" cases (regular local variables) are solved by this patch. However, there's still a few cases that don't work, for example __block variables:

#import <Foundation/Foundation.h>
int main(int argc, const char * argv[]) {
    __block int x = 7;
    printf("%d\n", x); // break here, cannot read value of 'x'
    return 0;
}

... or accessing variables from within a block:

#import <Foundation/Foundation.h>
int main(int argc, const char * argv[]) {
    int x = 7;
    ^{
        printf("%d\n", x); // break here, 'x' shows value of 0 instead of 7
    }();
    return 0;
}

Note that these two are broken in all other proposed solutions as well.

Anyway, I'm glad to see the progress, and we can surely focus on the regular (not blocks related) cases first.

samsonov updated this revision to Diff 16895.Dec 3 2014, 4:05 PM
  • Add a commandline flag to switch between new and old behavior
  • Don't use dynamic alloca if a function has inline assembly
  • Remove unnecessary argument in __asan_stack_free() function.

This is stil pretty much a WIP, I need to update the tests and check
performance. However, in this form the change seems to be pretty
stable on the codebases I've tested.

samsonov updated this revision to Diff 16896.Dec 3 2014, 4:06 PM

Try to upload the correct diff.

samsonov updated this revision to Diff 16974.Dec 4 2014, 6:10 PM

More changes.

  • Disable dynamic-alloca by default for now.
  • Slightly refactor the code and update the tests.
  • Bump __asan_init version.

I've tested this change (with and without enabling dynamic alloca) on a
lot of code for now, and it seems to be pretty stable (workaround to
disable dynamic alloca if there is inline assembly helps a lot). I've verified
that it helps with debug info in simple cases (reported here:
https://code.google.com/p/address-sanitizer/issues/detail?id=235), and I believe
Kuba that it helps his non-ObjC-blocks cases as well.

I've also ran a SPEC benchmarks, and observed no regression (although I'm not
sure that the results are robust/stable enough on my machine). Just this change,
which keeps static alloca, but changes the way we handle fake stack, shows
2.3% speedup on average, and if we enable dynamic alloca average speedup
goes up to 6.8% (for -O2/ASan).

Compiler-rt changes are trivial, I will send them for review in a sepate revision.

Please take a look.

samsonov added a reviewer: kcc.Dec 4 2014, 6:13 PM
kcc edited edge metadata.Dec 5 2014, 4:52 PM

Please update the commit message.

lib/Transforms/Instrumentation/AddressSanitizer.cpp
501

HasNonEmptyInlineAsm?

samsonov updated this revision to Diff 17014.Dec 5 2014, 6:00 PM
samsonov edited edge metadata.
  • Remove the extra argument of __asan_stack_malloc function: instead, calculate SP inside ASan runtime.
  • Compare FakeStack to null to check if we need to call __asan_stack_free.
  • Update the test cases accordingly.
  • Address review comment.
lib/Transforms/Instrumentation/AddressSanitizer.cpp
501

Done

samsonov updated this revision to Diff 17016.Dec 5 2014, 6:04 PM

Fixup the typo.

I've re-run the SPEC after the most recent changes, and still see a minor improvement in average results: this change as-is gives 4.5% improvement against the current code, and if/when we enable asan-stack-dynamic-alloca by default, the improvement against the current code is 3.5%.

In D6465#16, @samsonov wrote:

I've re-run the SPEC after the most recent changes, and still see a minor improvement in average results: this change as-is gives 4.5% improvement against the current code, and if/when we enable asan-stack-dynamic-alloca by default, the improvement against the current code is 3.5%.

Awesome!

Note that r223799 (Correctly handle complex locations expressions in replaceDbgDeclareForAlloca) by Frederic (@friss) fixes the Obj-C cases. Together with this patch and with -mllvm -asan-stack-dynamic-alloca, I don't see any more missing or invalid debug info.

Disable dynamic-alloca by default for now.

Alexey, what is your plan about having this on/off by default? Is it off by default just for now?

Thanks,
Kuba

In D6465#19, @kubabrecka wrote:

Disable dynamic-alloca by default for now.

Alexey, what is your plan about having this on/off by default? Is it off by default just for now?

I plan to make this on by default, after more testing.

Kostya, do you have more comments about the current version of the patch?

Thanks,
Kuba

kcc added a comment.Dec 10 2014, 2:12 PM

First comment: please update the commit message

kcc accepted this revision.Dec 10 2014, 2:14 PM
kcc edited edge metadata.

LGTM otherwise

This revision is now accepted and ready to land.Dec 10 2014, 2:14 PM
samsonov closed this revision.Dec 11 2014, 1:59 PM

r224062