Diff Detail
Event Timeline
Tests, please.
We don't have a framework to run the same test under asan, tsan and msan,
so I guess we may have a full test in asan/lit_tests/TestCases/ and two tiny tests in
msan/lit_tests/ and tsan/lit_tests/
(FYI: I don't promise continued quick reviews until ~ Jan 10)
asan/asan_intercepted_functions.h | ||
---|---|---|
79 ↗ | (On Diff #6284) | Why scanf? |
asan/asan_interceptors.cc | ||
678 ↗ | (On Diff #6284) | yes, please |
sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
377 | This interceptor will cause many new bugs to be discovered |
Tests, please.
Yup, I forgot to do `svn add' when generating diff. Will do.
We don't have a framework to run the same test under asan, tsan and msan
Note that current version only intercepts printf for asan (interceptors are implemented in asan_interceptors.cc). I don't have much experience with msan so I'll need your advice on how to proceed there.
FYI I don't promise continued quick reviews until ~ Jan 10
Same here.
asan/asan_intercepted_functions.h | ||
---|---|---|
79 ↗ | (On Diff #6284) | I felt cynical and simply grabbed SANITIZER_INTERCEPT_SCANF from sanitizer_common/sanitizer_platform_interceptors.h in hope that printf and scanf always go together. What about renaming SANITIZER_INTERCEPT_SCANF to SANITIZER_INTERCEPT_FORMAT? |
asan/asan_interceptors.cc | ||
678 ↗ | (On Diff #6284) | I tried this initially but realized that Msan already intercepts printf stuff. What's the general policy about what goes into sanitizer_common (e.g. strcmp) and what gets re-implemented in each sanitizer (e.g. memset)? |
sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
377 | Makes sense, will do. |
asan/asan_intercepted_functions.h | ||
---|---|---|
79 ↗ | (On Diff #6284) | I would have a separate SANITIZER_INTERCEPT_PRINTF |
asan/asan_interceptors.cc | ||
678 ↗ | (On Diff #6284) | msan interceptors are rudimentary and will have to be replaced with these *eventually*. |
asan/asan_interceptors.cc | ||
---|---|---|
678 ↗ | (On Diff #6284) | We can enable it in MSan from the start (killing the old msan interceptors), just make sure that the new interceptors emulate a WRITE to the target buffer (for sprintf and friends). |
asan/asan_interceptors.cc | ||
---|---|---|
678 ↗ | (On Diff #6284) | Is current emulation fine or do you want to check each specifier separately? |
asan/asan_interceptors.cc | ||
---|---|---|
678 ↗ | (On Diff #6284) | I mean, when this is moved to common interceptors, you'll need to do COMMON_INTERCEPTOR_WRITE_RANGE() for the sprintf target buffer, like the current msan interceptors do. |
Updates based on prev review.
Missing stuff:
- MSan still using old interceptors
- no tests for TSan and MSan
Plus a bunch of questions inlined (see TODO markers).
Let others look at this change too.
We won't commit it before the end of our holidays (~Jan 10) anyway.
Until we don't have tsan tests, I suggest to disable this for tsan.
msan/msan_interceptors.cc | ||
---|---|---|
1294 | we can do it as a separate commit (I'd even prefer that) | |
sanitizer_common/sanitizer_common_interceptors.inc | ||
33 | can you avoid this? | |
sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
281 | good idea. feel free to implement or to leave the TODO. |
Let others look at this change too.
We won't commit it before the end of our holidays (~Jan 10) anyway.
Sure, there's no push from my side.
Until we don't have tsan tests, I suggest to disable this for tsan.
k
msan/msan_interceptors.cc | ||
---|---|---|
1294 | Sure. | |
sanitizer_common/sanitizer_common_interceptors.inc | ||
33 | Well, I need size_t for snprintf. | |
sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
281 |
Got it. Anyway, this was more a question for reviewers than real TODO. I'll add format check in next rev. |
sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
33 | please use SIZE_T and don't include any additional system headers |
Until we don't have tsan tests, I suggest to disable this for tsan.
Will making a copy of printf-1.c for TSan be enough? Or you want a test with an actual race?
sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
33 | Agreed, my bad. |
sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
---|---|---|
10 | Please update this comment. | |
281 | We sure need to check |format|. Since you've already touched scanf_common in this CL, why not just add COMMON_INTERCEPTOR_READ_RANGE(ctx, format, internal_strlen(format) + 1); ? | |
422 | Two spaces before the comment, please. | |
sanitizer_common/sanitizer_platform_interceptors.h | ||
75–76 | I wonder why the lines above have this strange indentation, but let us fix that in a separate commit. |
sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
---|---|---|
10 |
Will do.
But won't this make diff unreadable? Perhaps this should be a separate patch? | |
281 | Yup, already added this in new version of the patch. I didn't resubmit it because I wanted to gather reviews from all team members. The bigger question is whether we want to check size of input buffer in sscanf and friends. BTW what's the general rule: should I resubmit patch as often as possible or wait till everyone comments? | |
422 | Do you think we need a warning here? If yes, I'll add it, if not - I'll remove the comment alltogether. | |
sanitizer_common/sanitizer_platform_interceptors.h | ||
75–76 | Looks like something lint should be able to handle. |
sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
---|---|---|
10 |
SGTM | |
281 |
I think it's better to resubmit once you've addressed a round of comments. | |
281 |
Yes, that's nice to have, too. | |
422 | We can't get an invalid size from format_get_value_size(), so it's ok to have a CHECK here. | |
sanitizer_common/sanitizer_platform_interceptors.h | ||
75–76 | I've removed these. There's no clear policy about these spaces anyway. |
Looks mostly good.
asan/lit_tests/TestCases/printf-4.c | ||
---|---|---|
6 | Please sort the includes alphabetically (here and in other tests) | |
sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
11 | How about this line? Is there a printf specification we're following? | |
323 | Please swap this line with the above declaration and remove the empty line below. | |
337 | How does this comment relate to the above one? | |
350 | Good point, probably worth doing. |
asan/lit_tests/TestCases/printf-4.c | ||
---|---|---|
6 | Will do. | |
sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
11 | Oh, this is a good one. I've blindly reused scanf parser but it seems that printf format has richer syntax: it has flags and precision + asterisk semantics is different. I'll fix this in next rev. | |
323 | +1 | |
337 | Sorry, not quite clear. I believe we don't want to bother with positional args neither in scanf, nor in printf. | |
350 | Ok, I can do this. Note that there are actually many more checks of this sort: intersection of format with target string, intersection of format with %n destination, intersection of format with scanf argument, etc., etc. I guess errors like this should already be handled by things like FORTIFY_SOURCE pretty well. |
sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
698 | I'd remove check_printf from this condition, otherwise we'd get an MSan regression when check_printf is disabled. Also, these writes don't depend on format string parsing, and are most certainly correct. The same goes to other v*printf interceptors. | |
sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
11 | Yes, this is kind of scary. |
sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
698 | Note that check_printf is true by default and MSan will still use it's own interceptors for now. | |
sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
11 |
Are we talking about these: $ find glibc-master -name tst\* -o -name test\* | grep printf
I probably should... |
asan/lit_tests/TestCases/printf-2.c | ||
---|---|---|
2 | please make these tests run in 3 modes: ASAN_OPTIONS=check_printf=0 ASAN_OPTIONS=check_printf=1 ASAN_OPTIONS="" we need to test that the flag affects the behavior and that it's default value is what we expect it to be. (use FileCheck --check-prefix=CHECK-ON and --check-prefix=CHECK-OFF) | |
sanitizer_common/sanitizer_flags.h | ||
60 | I actually changed my mind. |
asan/lit_tests/TestCases/printf-2.c | ||
---|---|---|
2 | Ok. |
asan/lit_tests/TestCases/printf-2.c | ||
---|---|---|
2 | BTW how do you suggest the check_printf=0 case? In my understanding we'll have UB with absolutely unpredictable behavior... |
asan/lit_tests/TestCases/printf-2.c | ||
---|---|---|
2 |
I tend to agree but e.g. GCC team tends to reject tests that trigger UB. Also things like this tend to be rather unstable with respect to glibc/kernel versions. I'll do this if you're adamant though. |
New revision based on comments from glider, eugenis and kcc.
Main changes:
- fixed sanitizer_scanf_interceptor_test.cc
- update printf path based on Open Group std
- fixed several bugs based on glibc printf tests (manifested in segvs and false alarms)
- check_printf is now false by default
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
---|---|---|
90 ↗ | (On Diff #6394) | According to the manual, field width can be "*", too. The number-or-star parsing logic could be factored out in a separate function. |
98 ↗ | (On Diff #6394) | This does not look right. %m in printf is a complete conversion specifier, it should end parsing of the current directive. And it seems like "m" could go into convSpecifier, and printErrno could be removed. |
168 ↗ | (On Diff #6394) | At this point allowGnuMalloc is true. It does not make sense for output to be true. Please remove this and instead check this invariant at the beginning of the functions (!(allowGnuMalloc && output)). |
293 ↗ | (On Diff #6394) | Are these guys ([...]) even allowed in printf? |
360 ↗ | (On Diff #6394) | Does it really need to be a macro? |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
---|---|---|
90 ↗ | (On Diff #6394) |
Yup, this is handled by dir->suppressed above)
Agreed. |
98 ↗ | (On Diff #6394) |
I agree that we are currently parsing a wider format language than really necessary. But same is true for %n in original scanf implementation and we didn't bother handling it. |
360 ↗ | (On Diff #6394) | I was under impression that pointers to va_list are not portable C but http://gcc.gnu.org/ml/gcc-help/1999-q3n/msg00283.html suggests that they're sane. I'll give it a try tomorrow. |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
---|---|---|
98 ↗ | (On Diff #6394) |
Agreed to this one, will fix. |
360 ↗ | (On Diff #6394) | Ok, even though standard kind of allows this, GCC has a long and famous history of pointers to va_list not working (see e.g. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14557). Looks like passing va_list by reference is unportable. |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
---|---|---|
90 ↗ | (On Diff #6394) | Oh, right. This is a bit confusing, could you add a separate field to FormatDirective instead of reusing "suppressed" here? |
98 ↗ | (On Diff #6394) | What's wrong with %n? We parse it as convSpecifier which can not have anything going after it. if (*p == 'm' && !output) { dir->allocate=true; ++p; } would do the trick. |
293 ↗ | (On Diff #6394) | Then it would make sense to handle this case earlier, in format_parse_next. Probably not a big deal. |
Major changes:
- split format_parse_next to 2 functions
- same for scanf_get_value_size
- safer check of buffer size in sprintf
Issues found by eugenis should be fixed now.
Hey guys, I've fallen ill so won't be able to respond quickly.
Meanwhile, are there any other problems that you want fixed?
lib/sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
697 ↗ | (On Diff #6410) | Unless I'm missing something, you could use REAL(strlen)(str) here instead of calling *printf twice. |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
161 ↗ | (On Diff #6410) | Should it be lengthModifier[0]? |
lib/sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
697 ↗ | (On Diff #6410) | Hm but str isn't necessarily a string - it's just plain array of chars. I don't think we can call strlen on it. |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
161 ↗ | (On Diff #6410) |
|
lib/sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
697 ↗ | (On Diff #6410) | I think *printf always end the output with a null character, and never print null in the middle, but I don't see a clear indication of that in the spec. |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
161 ↗ | (On Diff #6410) | Me :) |
lib/sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
697 ↗ | (On Diff #6410) | Sorry, I must be missing something... My intent with REAL(vname2) (which is actually REAL(vsnprintf)) is to estimate number of chars that will be written. 0 tells vsnprintf that it should not _write_ anything, only estimate number of chars. We could simply pass NULL instead of str and it wouldn't change anything. So why do you bother about str? |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
161 ↗ | (On Diff #6410) | I'll update the code and see if tests seize to work then. |
lib/sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
697 ↗ | (On Diff #6410) | I'm just worried that it makes *printf quite a bit heavier - we now parse the format string 3 times (2 in libc *printf and 1 in our code). Alternatively, we could do 1 + 1, and then figure out the length of the resulting string with strlen() after the REAL *printf. |
lib/sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
697 ↗ | (On Diff #6410) | Agreed but then we may get stack overflow before Asan is able to react. Looks like a trade-off between speed and security. Does libsanitizer have general preference for any of those? IMHO sprintf is already so slow that we may not need to bother so much. |
And sprintf usage is discouraged by all coding standards and static analyzers anyway :)
lib/sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
697 ↗ | (On Diff #6410) | Now I must be missing something. What would be the cause of stack overflow? That would slow down sprintf (and snprintf, etc) by a factor of 1.5 (roughly speaking). That may be a big hit exactly because sprintf is so slow. I'm assuming that calculation the length of the output by passing 0 as size takes about the same time as actually formatting the output. |
lib/sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
697 ↗ | (On Diff #6410) |
A call to REAL(sprintf) with unchecked target buffer size.
Yup, as I said I agree that this adds a penalty. Let's see if we can now come to conclusion on buffer overflow part and see which one is more important. |
lib/sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
697 ↗ | (On Diff #6410) | I see. But that's the same call that the application does, i.e. the overflow would have happened without sanitizer, too. And we will most likely catch it, but after sprintf corrupts some memory. I think its common to use the strlen way. Interceptors like strcat, strcpy call the real function without checking that the string is actually null-terminated, and examine the result afterwards. |
lib/sanitizer_common/sanitizer_common_interceptors.inc | ||
---|---|---|
697 ↗ | (On Diff #6410) |
After some internal fighting I have to agree with that. The claim would be wrong on platforms with ascending stacks but I'm unaware of any living instance (last I heard Aarch64 failed at that).
I see, so we favor perf indeed. I'll remove the first sprintf call then. BTW do we mention somewhere on our wiki that our interceptors are not 100% secure? |
Could you rebuild the new patch from projects/compiler-rt directory
instead of projects/compiler-rt/lib? Right now the incremental diff
between two last revisions is completely broken.
lib/asan/lit_tests/TestCases/printf-1.c | ||
---|---|---|
4 ↗ | (On Diff #6502) | Just use "%t 2>&1" to test it with default ASAN_OPTIONS. |
6 ↗ | (On Diff #6502) | I'd suggest to merge all these test cases into one, as we're really testing one thing here. That would save some lines of code and would make modifying and extending the test case easier, and the tests will run faster (you will compile/link a single binary instead of 5 binaries). You can control the printf(...) call you're testing with an argv. See TestCases/stack-buffer-overflow-with-position.cc |
lib/asan/lit_tests/TestCases/printf-2.c | ||
20 ↗ | (On Diff #6502) | Could you match a file/line in the stack trace in error report? E.g. see TestCases/null_deref.cc |
lib/sanitizer_common/sanitizer_common_interceptors.inc | ||
573 ↗ | (On Diff #6502) | Just "#include", we don't seem to use "# " indentation in another places in this file. |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
25 ↗ | (On Diff #6502) | Quick check: is it OK you will consume the first "-" in "--" string? |
30 ↗ | (On Diff #6502) | Please clarify in function name or comment why "0" or "-0" are unexpected. |
61 ↗ | (On Diff #6502) | remove empty line |
233 ↗ | (On Diff #6502) | Is it ok the code originally parsed only unsigned integers as the field width, but now it will treat "-9" as a field width "9"? |
348 ↗ | (On Diff #6502) | I don't like this diagnostics: we don't even print what this unknown specifier is. We should either do this, or silently exit. |
508 ↗ | (On Diff #6502) | same here. |
lib/sanitizer_common/tests/sanitizer_scanf_interceptor_test.cc | ||
61 ↗ | (On Diff #6502) | This one is also const. |
lib/asan/lit_tests/TestCases/printf-1.c | ||
---|---|---|
4 ↗ | (On Diff #6502) | Ok. |
6 ↗ | (On Diff #6502) | I don't quite agree. Different tests trigger different types of memory errors so CHECKs will be different. Also printed messages in some tests differ. |
lib/asan/lit_tests/TestCases/printf-2.c | ||
20 ↗ | (On Diff #6502) | Well, ok although I'd consider this an overkill. |
lib/sanitizer_common/sanitizer_common_interceptors.inc | ||
573 ↗ | (On Diff #6502) | +1 |
691 ↗ | (On Diff #6502) | +1 |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
25 ↗ | (On Diff #6502) | %-- is invalid syntax anyway so I think it's ok. |
30 ↗ | (On Diff #6502) | This is Evegeny's code so I have no idea. |
61 ↗ | (On Diff #6502) | Ok. |
233 ↗ | (On Diff #6502) | Well, the spec does not mention that width in scanf must be positive but GNU scanf rejects negatives. I'll replace with parse_number. |
348 ↗ | (On Diff #6502) | This warning is mainly to inform us that we failed to parse some format spec and I think it's rather important for debugging purposes. What if I replace it with VReport(1, ...) ? |
lib/sanitizer_common/tests/sanitizer_scanf_interceptor_test.cc | ||
61 ↗ | (On Diff #6502) | +1 |
lib/asan/lit_tests/TestCases/printf-1.c | ||
---|---|---|
6 ↗ | (On Diff #6502) | Well, ok, leaving this to you. If you think that merging the cases would complicate things too much, leave it as is. |
lib/asan/lit_tests/TestCases/printf-2.c | ||
20 ↗ | (On Diff #6502) | We generally don't want many internal frames from ASan runtime to appear in the error report. So it's nice to verify that error report looks somewhat like this: |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
25 ↗ | (On Diff #6502) | Alright, I agree that %-- is just invalid scanf/printf syntax and we may not necessarily parse it correctly, but now we move this to function "maybe_parse_signed", and its name doesn't tell much about this hidden assumptions ("-" consumed from "--", "0" results in NULL returned), so one could use it in a wrong way somewhere else. IMO It's better to specify this assumptions explicitly here. |
30 ↗ | (On Diff #6502) | See above |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
---|---|---|
30 ↗ | (On Diff #6502) | Closer study of scanf/printf spec reveals that width/precision are unsigned both for printf and scanf. But printf may have '-' or '+' flag in front of width. Anyway I don't know why you check for <= 0 here. |
348 ↗ | (On Diff #6502) | The best solution would be to remember pointers to begin and end of specifier in format string. We can then use them to print specifier here. |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
---|---|---|
30 ↗ | (On Diff #6502) | Ok, I got it: scanf is only allowed to have a non-zero width so this check needs to be moved to scanf_parse_next. All this syntax peculiarities are going to drive me crazy. |
Updates based on feedback from Evgeny&Alex.
What's missing:
- I haven't added the @LINE directives; currently ASan reports #0 0x431593 in printf_common #1 0x431a94 in __interceptor_vprintf #2 0x42b2fd in printf #3 0x47a897 in main
which probably is not what we want but I don't think that we should fix it in this patch.
Regarding glibc tests: I've ran them and verified that ASan does not cause segfaults or test failures (it does print couple of warnings on invalid format strings). Is this what you guys wanted?
A couple of nits.
lib/sanitizer_common/tests/sanitizer_scanf_interceptor_test.cc | ||
---|---|---|
37 ↗ | (On Diff #6503) | Can you please add a comment above describing why NOLINT is necessary here? |
43 ↗ | (On Diff #6503) | At least two spaces between the code and the comment, please (you'll need to reformat the rest of declarations) |
Only a few minor nits (I've also committed part of this patch already). I think it's fine for submission (once you're re-upload the diff addressing the comments).
lib/asan/lit_tests/TestCases/printf-5.c | ||
---|---|---|
4 ↗ | (On Diff #6503) | Once again, what's the point of testing the default configuration? In this way you'll have to update the tests on a flag flip, is this really what we want? Can we remove it, or at least have a single test for default configuration? |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
48 ↗ | (On Diff #6503) | q is always non-NULL |
451 ↗ | (On Diff #6503) | Please either make it a function (I see, va_list might be a problem), or name it as SKIP_SCALAR_ARG |
lib/sanitizer_common/sanitizer_printf.cc | ||
98 ↗ | (On Diff #6503) | I've submitted this part in r199724 (it's not directly related to printf interceptors). |
lib/asan/lit_tests/TestCases/printf-5.c | ||
---|---|---|
4 ↗ | (On Diff #6503) | This was my suggestion, I think this *is* what we want. |
lib/asan/lit_tests/TestCases/printf-5.c | ||
---|---|---|
4 ↗ | (On Diff #6503) | Yeah, I guess Kostya wants to explicitly test that printf checks are disabled by default for now. |
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc | ||
48 ↗ | (On Diff #6503) | +1 |
451 ↗ | (On Diff #6503) | Will do. |
lib/sanitizer_common/tests/sanitizer_scanf_interceptor_test.cc | ||
37 ↗ | (On Diff #6503) | I copy-pasted this from TEST(SanitizerCommonInterceptors, Scanf). Seems to work with NOLINT removed so I'll update the patch. |
43 ↗ | (On Diff #6503) | Likewise. |
please make these tests run in 3 modes:
we need to test that the flag affects the behavior and that it's default value is what we expect it to be.
(use FileCheck --check-prefix=CHECK-ON and --check-prefix=CHECK-OFF)