This patch implements variable-sized alloca instrumentation (https://code.google.com/p/address-sanitizer/issues/detail?id=138).
Diff Detail
Event Timeline
This does not handle stack-use-after-return for alloca, right?
I don't insist you implement that now, but consider for the next patch.
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
681 | Please do it under a flag, off by default for now. | |
lib/asan/asan_interface_internal.h | ||
181 ↗ | (On Diff #15616) | indent |
lib/asan/asan_internal.h | ||
140 | I think kAsanAllocaPartialMagic is redundant, just use kAsanAllocaRightMagic | |
lib/asan/asan_report.cc | ||
990 | Maybe dynamic-stack-buffer-overflow (for both left and right cases)? |
lib/asan/asan_report.cc | ||
---|---|---|
990 | I think Max did underflow to match messages for ordinary stack. |
lib/asan/asan_report.cc | ||
---|---|---|
990 | Yea... I think the "underflow" was not very useful. |
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
157 | Please add a test for this flag in test/Instrumentation/AddressSanitizer/ | |
536 | This will not create a left red zone, right? I would prefer to create both left and right redzones and [un]poison them inline with one 4-byte store for the left rz and one or two 4-byte stores for the right one. | |
546 | do you really need to always align this by 32? | |
575 | A cleaner way is to create a new AllocaInst, just like we do in another place in this file. | |
lib/asan/asan_fake_stack.cc | ||
240 ↗ | (On Diff #15842) | We probably don't need these at all if we inline the poisoning and unpoisoning. |
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
536 | I think Max's idea was to create thricely left, right and partial. As for inlining, it would be a mess for partial redzone - it's size is unknown until runtime so we won't be able to use 4-byte stores and will have to use an ugly loop instead. |
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
536 | A loop? Come on, I am sure you can construct the appropriate 32-bit constant to poison the partial 32-byte zon just using arithmetic (masks and shifts) |
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
536 | I wonder if this can be less ugly? We don't want to emit this mess in codegen, do we? Tail = OldSize & 5; // Get length of 32-byte unaligned part sh = 24; shadow_word = 0; } else if (Tail >= 16) { sh = 16; shadow_word = 0xcb000000; // 0xcb is right magic } else if (Tail >= 8) { sh = 8; shadow_word = 0xcbcb0000; // 0xcb is right magic } else { sh = 0; shadow_word = 0xcbcbcb00; // 0xcb is right magic } |
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
536 | Hm, perhaps something like this would be preferable: padding = OldSize & (Align - 1) // get padding if (padding) { shift = padding & ~7; // the number of bits we need to shift to access first chunk in shadow memory, containing nonzero bytes // Example: // padding = 21 padding = 16 // Shadow: |00|00|05|cb| Shadow: |00|00|cb|cb| // ^ ^ // | | // shift = 21 & ~7 = 16 shift = 16 & ~7 = 16 val1 = 0xcbcbcbcb << (shift + 8); partialBits = padding & 7; if (!partialBits) partialBits = 0xcb; val2 = partialBits << shift; result = val1 | val2; } if (!partialBits) partialBits = 0xcb; looks ugly, but right now I don't see any convenient way to avoid it. |
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
536 | Oh, Align is 32, of course. |
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
536 | Yep, something along these lines. Cool. Please make sure to have a test for all 32 values of padding. Don't forget about little- vs big- endian
This should be fine actually, you can use SelectInst to avoid creating new BB |
Updated patch. Major changes are:
- (Un)poisoning is inline now.
- Added test to check instrumentation (with/without alloca instrumentation, default behavior).
- Add test to check that all 32 values of padding are handled correctly.
- Big-endian is now "supported", but I unable to test it, sorry.
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
472 | what is the middle rz? | |
476 | add a comment describing the 3 redzones. I think you should store the pointers to the shadow here, not the pointers to the app memory. | |
480 | space before = | |
559 | no constants here, please. | |
585 | why long? | |
650 | no constants here. | |
656 | instead of creating a new BB for partial RZ, I would do this: | |
test/Instrumentation/AddressSanitizer/instrument-dynamic-allocas.ll | ||
6 | use CHECK-NOALLOCA here | |
test/asan/TestCases/alloca_instruments_all_paddings.cc | ||
11 | Can you replace "str+index" with &str[0]? | |
16 | run this loop twice to ensure that we properly unpoison the stack |
Updated according to last review, major changes are:
- Removed redundant new BB creation.
- Added small fixes for tests.
- Defined new constants on top of the AddressSanitizer.cpp.
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
656 | Yes, this is a good idea to avoid new BB creation. But we still need both PartialRz and RightRz, because in case of PartialSize == 32 PartialRz points the same address as RightRz, otherwise RightRz == PartialSize + 32, isn't it? Or maybe I've misunderstood something? |
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
656 |
There are two ways to implement this:
In this case, we need to poison both PartialRz and RightRz and we need to keep this value in the register for the entire function.
In this case we always unpoison RightRz and RightRz-32 and so we don't need to keep PartialRz around. It's hard to tell which is better w/o measuring on a good benchmark, also there is a memory-vs-cpu tradeoff (larger redzones vs larger register pressure). |
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
108 | static | |
531 | Constant::getNullValue | |
550 | This deserves a comment: what exactly you are computing. | |
574 | These new functions are probably too big and should be placed outside of the class decl. | |
576 | write a function-level comment with the expression you are computing | |
582 | Constant::getNullValue | |
592 | two lines should be enough here |
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
656 |
Not really, we can do the same as in your case 2 i.e. zero out RightRz - 32. Worst case (when PartialRz = RightRz) we'll do a redundant write. As for benchmarking I'm not sure it matters that much - allocas are quite rare anyway. |
Updated according to last review.
Now we don't keep PartialRzAddr until the end of function, we can just unpoison RightRzAddr and RightRzAddr - 32 with two stores into RightRzAddrShadow and RightRzAddrShadow - 4.
If PartialRzAddr == RightRzAddr we will perform one redundant store into user's memory shadow during unpoisoning, but perhaps this should be actually fine.
lib/Transforms/Instrumentation/AddressSanitizer.cpp | ||
---|---|---|
472 | remove "partial" from comment | |
555 | rephrase somehow, e.g. ... it would contain the value that we will use to poison the partial redzone | |
1912 | please add a comment similar to the comment before handleDynamicAllocaCall | |
1933 | I tried hard, but I don't understand this :( | |
2010 | two lines, please. |
LGTM
Thanks for working on this!
If you don't yet have commit access ask Yuri to commit (mentioning you as the author).
Next steps would be to enable the flag by default (we'll do our part of testing too).
And then it may be interesting to enable use-after-return for allocas.
Next steps would be to enable the flag by default (we'll do our part of testing too).
And then it may be interesting to enable use-after-return for allocas.
And we also need to store metadata (probably just variable name) in redzone for user-friendlier reports.
Indeed so, the diagnostics could be improved.
Let's do it before enabling the feature by default
I've run the new feature on the chromium sources and it produced a compiler failure:
reduced test:
% cat a.c
int a;
int b;
int c;
void fn3(int *, int);
void fn1 () {
int d = b && c; int e[a]; int f; if (d) fn3 (&f, sizeof 0 * (&c - e));
}
% clang -fsanitize=address -mllvm -asan-instrument-allocas -O2 a.c
Instruction does not dominate all uses!
%54 = add i64 %53, 2147450880 %68 = sub i64 %54, 4
Instruction does not dominate all uses!
%35 = add i64 %34, 2147450880 %69 = inttoptr i64 %35 to i32*
Instruction does not dominate all uses!
%54 = add i64 %53, 2147450880 %71 = inttoptr i64 %54 to i32*
fatal error: error in backend: Broken function found, compilation aborted!
clang-3.6: error: clang frontend command failed with exit code 70 (use -v to see invocation)
clang version 3.6.0 (trunk 222567)
Target: x86_64-unknown-linux-gnu
Thread model: posix
clang-3.6: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script.
clang-3.6: note: diagnostic msg:
PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-3.6: note: diagnostic msg: /tmp/a-2b96e4.c
clang-3.6: note: diagnostic msg: /tmp/a-2b96e4.sh
clang-3.6: note: diagnostic msg:
Interesting.
The dynamic alloca in this case is moved to a basic block outside of main path and hence it does not dominate
all exits. So, we can not unpoison it at the RET statements, instead we should do it in the end of the alloca's scope.
This is starting to resemble use-after-scope... Alexey, any comments?
So we basically need to find all blocks dominated by alloca, then all exits from those to non-dominated blocks and then insert unpoison calls prior to these exits?
Or maybe use dominance frontiers to make whole process more efficient: insert unpoisons at those predecessors of blocks from alloca dominance frontier which are dominated by alloca.
I am afraid this is not that simple.
It is legal to call alloca manually inside an if-statement and then use it until the function exit.
/me pondering...
(And will be OOO most of this week, don't expect prompt replies until next Mon)
Right. Frankly I'm mostly interested in VLAs so worst case we can simply remove such pathological cases. I think they can be detected by checking if alloca is argument of some phi?
I think they can be detected by checking if alloca is argument of some phi?
Hm, not really - alloca result could also escape.
We may start from checking that alloca dominates all exits (the most common case, probably).
Handling nontrivial cases may be quite tricky. Perhaps we can implement something like linked list of "bad " allocas, storing the address and size of next/previous alloca in the left redzone and marking the last/first one with some magic value. Then, before each ret instruction, we can iterate over this list and unpoison these allocas.
Or we could just memset shadow for dynamic part of stack to 0. This wouldn't work with use-after-return though.
I see you've committed r222991 which checks that alloca dominates the exits.
Let's polish this thing first and enable it by default, then we may return to more complicated cases.
Actually exit domination wouldn't save the day. E.g.
void f() { char *p; ... do { p = alloca(100); g(p); } while(whatever); ... }
static