This patch is related to Issue 346:	moar string interceptors: strstr, strcasestr, strcspn, strpbrk
As was suggested in original review http://reviews.llvm.org/D6056 a new "strict_string_checks" run-time flag introduced.
The flag support applied for existing common, asan, msan and tsan interceptors. New asan tests added.
Details
Diff Detail
Event Timeline
| test/asan/Unit/lit.site.cfg.in | ||
|---|---|---|
| 33 ↗ | (On Diff #18597) | Some tests relied on non-strict semantics. | 
Here is a list of failed tests for one configuration:
AddressSanitizer-Unit :: Asan-i386-inline-Test/AddressSanitizer.AtoiAndFriendsOOBTest AddressSanitizer-Unit :: Asan-i386-inline-Test/AddressSanitizer.StrCaseCmpOOBTest AddressSanitizer-Unit :: Asan-i386-inline-Test/AddressSanitizer.StrCatOOBTest AddressSanitizer-Unit :: Asan-i386-inline-Test/AddressSanitizer.StrCmpOOBTest AddressSanitizer-Unit :: Asan-i386-inline-Test/AddressSanitizer.StrNCatOOBTest AddressSanitizer-Unit :: Asan-i386-inline-Test/AddressSanitizer.StrtolOOBTest AddressSanitizer-Unit :: Asan-i386-inline-Test/AddressSanitizer.StrtollOOBTest
These are test cases when string args were not correct but a result still may be calculated correctly, see:
AtoiAndFriendsOOBTest unexpected READ error: arg's last symbol is not digit, 
StrCaseCmpOOBTest unexpected READ error: s1 and s2 last bytes are not null but differs
StrCatOOBTest unexpected READ error: dest not adressable/not null-terminated, WRITE error is expected
StrCmpOOBTest unexpected READ error: s1 and s2 last bytes are not null but differs
StrNCatOOBTest unexpected READ error: dest not adressable/not null-terminated, WRITE error is expected
StrtolOOBTest unexpected READ error: arg's last symbol is not valid digit
StrtollOOBTest unexpected READ error: arg's last symbol is not valid digit
And I see only first failures but there are several test cases with such special arguments for each function. I believe they are intentional and based on expected behaviour when strings are not checked strictly.
A couple of ideas how we can proceed with the tests:
- make UnitTests extract |strict_string_checks| from ASAN_OPTIONS and fix the checks to behave based on that setting;
- remove the failing checks from the test suite and write the corresponding lit tests that will test both strict_string_checks modes (or drop them if they're covered by the tests you've added in this CL);
- fix the failing checks assuming that strict_string_checks is always on.
I've taken a look at the failing tests and it didn't occur to me that we had ever wanted to "rely" on non-strict semantics.
Those corner cases indeed need to be carefully checked (especially now, when there're two modes of string interceptors operation), but that doesn't mean we have to stick with the non-default behavior.
| test/asan/Unit/lit.site.cfg.in | ||
|---|---|---|
| 33 ↗ | (On Diff #18597) | I think we must fix the tests. It's not a good idea to silently change the default string checking mode in a bunch of tests. | 
Maria, have you tried running check-msan? I think there's a couple of tests that need to be fixed there as well.
I would prefer the second variant. If no objections or other ideas I'll prepare corresponding update for this patch.
| test/asan/TestCases/atol_strict.c | ||
|---|---|---|
| 4 | Isn't it in first line after compile or do you mean something else? | |
- remove the failing checks from the test suite and write the corresponding lit tests that will test both strict_string_checks modes (or drop them if they're covered by the tests you've added in this CL);
I would prefer the second variant. If no objections or other ideas I'll prepare corresponding update for this patch.
Yes, please!
| test/asan/TestCases/atol_strict.c | ||
|---|---|---|
| 4 | Oh, yes, sorry. | |
| test/asan/Unit/lit.site.cfg.in | ||
| 33 ↗ | (On Diff #18597) | No, unit test should not rely on the flags, we need to move such tests to lit tests. | 
Hi guys,
I've finally prepared the updated patch for "strict_string_checks" flag. The asan unit test modified to pass with strict checks enabled by default. Removed test cases are covered in lit tests. Please take a look.
Nice!
Once we are done with me I'll ask a second reviewer to check tsan/msan part.
| lib/asan/asan_interceptors.cc | ||
|---|---|---|
| 80 | Interesting. | |
| lib/sanitizer_common/sanitizer_flags.inc | ||
| 154 | I afraid we should make it false at the first step.  | |
| test/asan/TestCases/atoi_strict.c | ||
| 3 | Good. But if you use a run-time parameter instead of -DTEST_N you will not need to recompile the test multiple times. int main(int argc, char **argv) { if (argc != 2) return 1; if (!strcmp(argv[1], "foo")) test_foo(); if (!strcmp(argv[1], "bar")) test_bar(); } | |
| lib/asan/asan_interceptors.cc | ||
|---|---|---|
| 80 | Ok, in new patch I've introduced more general macro ASAN_READ_STRING_OF_LEN in order to not repeat strlen calls. The same done for tsan, msan and common interceptors. | |
| lib/sanitizer_common/sanitizer_flags.inc | ||
| 154 | Ok, changed in new patch. Must be kept in mind that tests check default behaviour as well so they have to be modified when this value is changed. | |
| lib/asan/asan_interceptors.cc | ||
|---|---|---|
| 80 | I've changed internal_strlen to REAL(strlen) only in asan/common part but probably I should fix tsan and msan as well. But I see that in other tsan interceptors internal_strlen is used. Is it intentional there? | |
I like the patch. At this point I'd ask Dmitry to review the tsan part and comment on internal_str* vs REAL(str*) in tsan.
In parallel, given that the patch is potentially very intrusive, I'd ask you to test it with asan bootstrap:
- build clang as usual
- build another clang using clang from #1 with cmake flag -DLLVM_USE_SANITIZER=Address and run "ninja check-llvm check-clang"
- Repeat #2 with ASAN_OPTIONS=strict_string_checks=true
If new errors appear in LLVM, you don't have to fix them, but at least check that they are not false positives.
| test/asan/TestCases/atol_strict.c | ||
|---|---|---|
| 14 | remove extra vertical space, leave just one line (here and below) | |
Hi,
I've run bootstrap testing, no regressions occured.
Do you have any feedback regarding tsan/msan part?
LGTM with a nit
Sorry for the delay, I was on a vacation.
| lib/tsan/rtl/tsan_interceptors.cc | ||
|---|---|---|
| 280 | Don't prefix the macros with TSAN_. The TSAN_INTERCEPT macros above are prefixed with TSAN_ for historical reasons and should be renamed. Don't look at them. | |
Attached patch with minor update due to recent feedback:
- removed extra vertical space in tests
- don't prefix new tsan macros with TSAN_
If it's okay now, can anyone commit it please?
Interesting.
This code is going to be hot, and internal_strlen(s) is not optimized for speed (and is not expected to be).
Near some of the places where we now call ASAN_READ_RANGE there are calls to REAL(strlen)(to), so you essentially repeat the call (but using slow internal_strlen).
Try not to repeat the strlen calls and try to rely on REAL(strlen), which is ~16x faster.