This is an archive of the discontinued LLVM Phabricator instance.

Sanitize printf functions
ClosedPublic

Authored by ygribov on Dec 27 2013, 1:07 AM.

Details

Reviewers
samsonov
eugenis

Diff Detail

Event Timeline

kcc added a comment.Dec 27 2013, 1:16 AM

Tests, please.
We don't have a framework to run the same test under asan, tsan and msan,
so I guess we may have a full test in asan/lit_tests/TestCases/ and two tiny tests in
msan/lit_tests/ and tsan/lit_tests/

(FYI: I don't promise continued quick reviews until ~ Jan 10)

asan/asan_intercepted_functions.h
79 ↗(On Diff #6284)

Why scanf?

asan/asan_interceptors.cc
678 ↗(On Diff #6284)

yes, please

sanitizer_common/sanitizer_common_interceptors_scanf.inc
315 ↗(On Diff #6284)

This interceptor will cause many new bugs to be discovered
and the deployment in many cases will take time.
Please add a flag "check_printf" to sanitizer_common/sanitizer_flags.h (true by default)
and if this flag is false, don't do any checking.

kcc added a comment.Dec 27 2013, 1:18 AM

also, please CC llvm-commits

Tests, please.

Yup, I forgot to do `svn add' when generating diff. Will do.

We don't have a framework to run the same test under asan, tsan and msan

Note that current version only intercepts printf for asan (interceptors are implemented in asan_interceptors.cc). I don't have much experience with msan so I'll need your advice on how to proceed there.

FYI I don't promise continued quick reviews until ~ Jan 10

Same here.

asan/asan_intercepted_functions.h
79 ↗(On Diff #6284)

I felt cynical and simply grabbed SANITIZER_INTERCEPT_SCANF from sanitizer_common/sanitizer_platform_interceptors.h in hope that printf and scanf always go together. What about renaming SANITIZER_INTERCEPT_SCANF to SANITIZER_INTERCEPT_FORMAT?

asan/asan_interceptors.cc
678 ↗(On Diff #6284)

I tried this initially but realized that Msan already intercepts printf stuff. What's the general policy about what goes into sanitizer_common (e.g. strcmp) and what gets re-implemented in each sanitizer (e.g. memset)?

sanitizer_common/sanitizer_common_interceptors_scanf.inc
315 ↗(On Diff #6284)

Makes sense, will do.

also, please CC llvm-commits

Done!

kcc added inline comments.Dec 27 2013, 1:54 AM
asan/asan_intercepted_functions.h
79 ↗(On Diff #6284)

I would have a separate SANITIZER_INTERCEPT_PRINTF

asan/asan_interceptors.cc
678 ↗(On Diff #6284)

msan interceptors are rudimentary and will have to be replaced with these *eventually*.
Here is a suggestion: move the code to sanitizer_common_interceptors.inc
but define SANITIZER_INTERCEPT_PRINTF to 1 only in asan.
tsan and msan will follow later.

eugenis added inline comments.Dec 27 2013, 3:51 AM
asan/asan_interceptors.cc
678 ↗(On Diff #6284)

We can enable it in MSan from the start (killing the old msan interceptors), just make sure that the new interceptors emulate a WRITE to the target buffer (for sprintf and friends).

ygribov added inline comments.Dec 27 2013, 3:54 AM
asan/asan_interceptors.cc
678 ↗(On Diff #6284)

Is current emulation fine or do you want to check each specifier separately?

eugenis added inline comments.Dec 27 2013, 4:00 AM
asan/asan_interceptors.cc
678 ↗(On Diff #6284)

I mean, when this is moved to common interceptors, you'll need to do COMMON_INTERCEPTOR_WRITE_RANGE() for the sprintf target buffer, like the current msan interceptors do.
That is in addition to reads and writes that you do already.

ygribov updated this revision to Unknown Object (????).Dec 27 2013, 5:04 AM

Updates based on prev review.

Missing stuff:

  • MSan still using old interceptors
  • no tests for TSan and MSan

Plus a bunch of questions inlined (see TODO markers).

@Evgeny: let me know if this looks sane to you wrt MSan.

kcc added a comment.Dec 28 2013, 1:37 AM

Let others look at this change too.
We won't commit it before the end of our holidays (~Jan 10) anyway.

Until we don't have tsan tests, I suggest to disable this for tsan.

msan/msan_interceptors.cc
1294 ↗(On Diff #6286)

we can do it as a separate commit (I'd even prefer that)

sanitizer_common/sanitizer_common_interceptors.inc
33 ↗(On Diff #6286)

can you avoid this?

sanitizer_common/sanitizer_common_interceptors_scanf.inc
281 ↗(On Diff #6286)

good idea. feel free to implement or to leave the TODO.
Actually this code base uses FIXME (w/o name) more often.

Let others look at this change too.
We won't commit it before the end of our holidays (~Jan 10) anyway.

Sure, there's no push from my side.

Until we don't have tsan tests, I suggest to disable this for tsan.

k

msan/msan_interceptors.cc
1294 ↗(On Diff #6286)

Sure.

sanitizer_common/sanitizer_common_interceptors.inc
33 ↗(On Diff #6286)

Well, I need size_t for snprintf.

sanitizer_common/sanitizer_common_interceptors_scanf.inc
281 ↗(On Diff #6286)

Actually this code base uses FIXME (w/o name) more often.

Got it. Anyway, this was more a question for reviewers than real TODO. I'll add format check in next rev.

kcc added inline comments.Dec 28 2013, 8:50 AM
sanitizer_common/sanitizer_common_interceptors.inc
33 ↗(On Diff #6286)

please use SIZE_T and don't include any additional system headers

Until we don't have tsan tests, I suggest to disable this for tsan.

Will making a copy of printf-1.c for TSan be enough? Or you want a test with an actual race?

sanitizer_common/sanitizer_common_interceptors.inc
33 ↗(On Diff #6286)

Agreed, my bad.

glider added inline comments.Dec 30 2013, 12:43 AM
sanitizer_common/sanitizer_common_interceptors_scanf.inc
10 ↗(On Diff #6286)

Please update this comment.
Also, how about renaming the file to something like "sanitizer_common_interceptors_printf_scanf.inc" or "sanitizer_common_interceptors_format.inc"?

281 ↗(On Diff #6286)

We sure need to check |format|. Since you've already touched scanf_common in this CL, why not just add

COMMON_INTERCEPTOR_READ_RANGE(ctx, format, internal_strlen(format) + 1);

?

362 ↗(On Diff #6286)

Two spaces before the comment, please.

sanitizer_common/sanitizer_platform_interceptors.h
76 ↗(On Diff #6286)

I wonder why the lines above have this strange indentation, but let us fix that in a separate commit.

ygribov added inline comments.Dec 30 2013, 1:25 AM
sanitizer_common/sanitizer_common_interceptors_scanf.inc
10 ↗(On Diff #6286)

Please update this comment.

Will do.

Also, how about renaming the file to something like "sanitizer_common_interceptors_printf_scanf.inc"

But won't this make diff unreadable? Perhaps this should be a separate patch?

281 ↗(On Diff #6286)

Yup, already added this in new version of the patch. I didn't resubmit it because I wanted to gather reviews from all team members.

The bigger question is whether we want to check size of input buffer in sscanf and friends.

BTW what's the general rule: should I resubmit patch as often as possible or wait till everyone comments?

362 ↗(On Diff #6286)

Do you think we need a warning here? If yes, I'll add it, if not - I'll remove the comment alltogether.

sanitizer_common/sanitizer_platform_interceptors.h
76 ↗(On Diff #6286)

Looks like something lint should be able to handle.

glider added inline comments.Dec 30 2013, 2:12 AM
sanitizer_common/sanitizer_common_interceptors_scanf.inc
10 ↗(On Diff #6286)

But won't this make diff unreadable? Perhaps this should be a separate patch?

SGTM

281 ↗(On Diff #6286)

BTW what's the general rule: should I resubmit patch as often as possible or wait till everyone comments?

I think it's better to resubmit once you've addressed a round of comments.
Most of the team is on holidays now, so you'll never know whether everyone is going to comment.

281 ↗(On Diff #6286)

The bigger question is whether we want to check size of input buffer in sscanf and friends.

Yes, that's nice to have, too.

362 ↗(On Diff #6286)

We can't get an invalid size from format_get_value_size(), so it's ok to have a CHECK here.

sanitizer_common/sanitizer_platform_interceptors.h
76 ↗(On Diff #6286)

I've removed these. There's no clear policy about these spaces anyway.

ygribov added inline comments.Dec 30 2013, 3:26 AM
sanitizer_common/sanitizer_common_interceptors_scanf.inc
281 ↗(On Diff #6286)

Any idea how to achieve this? Scanfs won't tell us number of bytes read and I can't make use of %n in interceptor...

362 ↗(On Diff #6286)

Will do.

ygribov updated this revision to Unknown Object (????).Dec 30 2013, 11:42 PM

Updated based on comments by glider and kcc.

Will be in standby till Jan 9.

Looks mostly good.

asan/lit_tests/TestCases/printf-4.c
6 ↗(On Diff #6321)

Please sort the includes alphabetically (here and in other tests)

sanitizer_common/sanitizer_common_interceptors_scanf.inc
11 ↗(On Diff #6321)

How about this line? Is there a printf specification we're following?

323 ↗(On Diff #6321)

Please swap this line with the above declaration and remove the empty line below.

337 ↗(On Diff #6321)

How does this comment relate to the above one?

350 ↗(On Diff #6321)

Good point, probably worth doing.

ygribov added inline comments.Jan 1 2014, 9:15 AM
asan/lit_tests/TestCases/printf-4.c
6 ↗(On Diff #6321)

Will do.

sanitizer_common/sanitizer_common_interceptors_scanf.inc
11 ↗(On Diff #6321)

Oh, this is a good one. I've blindly reused scanf parser but it seems that printf format has richer syntax: it has flags and precision + asterisk semantics is different. I'll fix this in next rev.

323 ↗(On Diff #6321)

+1

337 ↗(On Diff #6321)

Sorry, not quite clear. I believe we don't want to bother with positional args neither in scanf, nor in printf.

350 ↗(On Diff #6321)

Ok, I can do this. Note that there are actually many more checks of this sort: intersection of format with target string, intersection of format with %n destination, intersection of format with scanf argument, etc., etc. I guess errors like this should already be handled by things like FORTIFY_SOURCE pretty well.

eugenis added inline comments.Jan 9 2014, 12:07 AM
sanitizer_common/sanitizer_common_interceptors.inc
698 ↗(On Diff #6321)

I'd remove check_printf from this condition, otherwise we'd get an MSan regression when check_printf is disabled. Also, these writes don't depend on format string parsing, and are most certainly correct.

The same goes to other v*printf interceptors.

sanitizer_common/sanitizer_common_interceptors_scanf.inc
11 ↗(On Diff #6321)

Yes, this is kind of scary.
AFAIK, glibc has a comprehensive set of tests for scanf format string parsing. Jakub Jelinek ran this code on them and found several tricky bugs in the parser. Could you look into doing the same for printf?

ygribov added inline comments.Jan 9 2014, 12:37 AM
sanitizer_common/sanitizer_common_interceptors.inc
698 ↗(On Diff #6321)

Note that check_printf is true by default and MSan will still use it's own interceptors for now.

sanitizer_common/sanitizer_common_interceptors_scanf.inc
11 ↗(On Diff #6321)

AFAIK, glibc has a comprehensive set of tests for scanf format string parsing.

Are we talking about these:

$ find glibc-master -name tst\* -o -name test\* | grep printf
glibc-master/stdio-common/tst-swprintf.c
glibc-master/stdio-common/test-vfprintf.c
glibc-master/stdio-common/tst-printf-round.c
glibc-master/stdio-common/tst-printf.c
glibc-master/stdio-common/tst-sprintf3.c
glibc-master/stdio-common/tst-printf.sh
glibc-master/stdio-common/tst-sprintf2.c
glibc-master/stdio-common/tst-sprintf.c
glibc-master/stdio-common/tst-obprintf.c
glibc-master/stdio-common/tst-wc-printf.c
glibc-master/stdio-common/tst-printfsz.c

Could you look into doing the same for printf?

I probably should...

kcc added inline comments.Jan 9 2014, 12:43 AM
asan/lit_tests/TestCases/printf-2.c
2 ↗(On Diff #6321)

please make these tests run in 3 modes:

ASAN_OPTIONS=check_printf=0
ASAN_OPTIONS=check_printf=1
ASAN_OPTIONS=""

we need to test that the flag affects the behavior and that it's default value is what we expect it to be.

(use FileCheck --check-prefix=CHECK-ON and --check-prefix=CHECK-OFF)

sanitizer_common/sanitizer_flags.h
60 ↗(On Diff #6321)

I actually changed my mind.
Please make if false by default for now.
After this patch is committed we'll test it a bit more before enabling by default.

ygribov added inline comments.Jan 9 2014, 12:48 AM
asan/lit_tests/TestCases/printf-2.c
2 ↗(On Diff #6321)

Ok.

ygribov added inline comments.Jan 9 2014, 12:52 AM
asan/lit_tests/TestCases/printf-2.c
2 ↗(On Diff #6321)

BTW how do you suggest the check_printf=0 case? In my understanding we'll have UB with absolutely unpredictable behavior...

ygribov added inline comments.Jan 9 2014, 1:09 AM
asan/lit_tests/TestCases/printf-2.c
2 ↗(On Diff #6321)

That's not that bad.
If you allocate a heap block using asan's allocator,
the behavior of an uncheck-ed out-of-bounds read is quite predictable

I tend to agree but e.g. GCC team tends to reject tests that trigger UB. Also things like this tend to be rather unstable with respect to glibc/kernel versions.

I'll do this if you're adamant though.

ygribov updated this revision to Unknown Object (????).Jan 9 2014, 4:35 AM

New revision based on comments from glider, eugenis and kcc.

Main changes:

  • fixed sanitizer_scanf_interceptor_test.cc
  • update printf path based on Open Group std
  • fixed several bugs based on glibc printf tests (manifested in segvs and false alarms)
  • check_printf is now false by default
eugenis added inline comments.Jan 9 2014, 5:40 AM
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
214

According to the manual, field width can be "*", too. The number-or-star parsing logic could be factored out in a separate function.

222

This does not look right. %m in printf is a complete conversion specifier, it should end parsing of the current directive. And it seems like "m" could go into convSpecifier, and printErrno could be removed.

266

At this point allowGnuMalloc is true. It does not make sense for output to be true. Please remove this and instead check this invariant at the beginning of the functions (!(allowGnuMalloc && output)).

312

Are these guys ([...]) even allowed in printf?

363

Does it really need to be a macro?

ygribov added inline comments.Jan 9 2014, 7:46 AM
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
214

According to the manual, field width can be "*", too

Yup, this is handled by dir->suppressed above)

The number-or-star parsing logic could be factored out in a separate function.

Agreed.

222

%m in printf is a complete conversion specifier, it should end parsing of the current directive

I agree that we are currently parsing a wider format language than really necessary. But same is true for %n in original scanf implementation and we didn't bother handling it.

363

I was under impression that pointers to va_list are not portable C but http://gcc.gnu.org/ml/gcc-help/1999-q3n/msg00283.html suggests that they're sane. I'll give it a try tomorrow.

ygribov added inline comments.Jan 9 2014, 7:46 AM
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
266

Agreed.

312

No, they aren't. But why do you ask the question here?

ygribov added inline comments.Jan 9 2014, 10:10 PM
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
222

And it seems like "m" could go into convSpecifier, and printErrno could be removed.

Agreed to this one, will fix.

363

Ok, even though standard kind of allows this, GCC has a long and famous history of pointers to va_list not working (see e.g. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14557). Looks like passing va_list by reference is unportable.

eugenis added inline comments.Jan 10 2014, 1:13 AM
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
214

Oh, right. This is a bit confusing, could you add a separate field to FormatDirective instead of reusing "suppressed" here?

222

What's wrong with %n? We parse it as convSpecifier which can not have anything going after it.
It seems like simply changing the code here to

if (*p == 'm' && !output) {

dir->allocate=true;
++p;

}

would do the trick.

312

Then it would make sense to handle this case earlier, in format_parse_next. Probably not a big deal.

ygribov added inline comments.Jan 10 2014, 1:32 AM
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
214

Already done. I've also decided to split format_parse_next to {printf,scanf}_parse_next because they are different enough.

222

Ok, should be fixed by splitting format_parse_next mentioned above.

ygribov updated this revision to Unknown Object (????).Jan 10 2014, 4:30 AM

Major changes:

  • split format_parse_next to 2 functions
  • same for scanf_get_value_size
  • safer check of buffer size in sprintf

Issues found by eugenis should be fixed now.

eugenis added inline comments.Jan 10 2014, 5:07 AM
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
204

It never returns null pointer.

212

maybe_parse_signed ?

420

Code duplication between scanf and printf parsers. Please factor this out.

ygribov added inline comments.Jan 10 2014, 5:50 AM
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
204

True, just trying to be safe here.

212

My bad.

420

Will do.

Hey guys, I've fallen ill so won't be able to respond quickly.

Meanwhile, are there any other problems that you want fixed?

I'll do another review pass tomorrow morning.
Get well!

eugenis added inline comments.Jan 15 2014, 12:58 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
697

Unless I'm missing something, you could use REAL(strlen)(str) here instead of calling *printf twice.

lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
161

Should it be lengthModifier[0]?
Looks like a bug in the original code.

ygribov added inline comments.Jan 15 2014, 2:36 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
697

Hm but str isn't necessarily a string - it's just plain array of chars. I don't think we can call strlen on it.

lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
161
  1. Bug or some weird GNU scanf (mis)feature. Who is the author of this piece?
  2. How do you guys normally treat situations when both change code and move it to a different place for clarity? Do you make two separate commits to make reviewers life easier?
eugenis added inline comments.Jan 15 2014, 4:44 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
697

I think *printf always end the output with a null character, and never print null in the middle, but I don't see a clear indication of that in the spec.

lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
161

Me :)
I think it's a bug, it's meant to check that %p has no length modifier.
Two commits are good, but people often don't bother.
In this case it's ok to make the change in the same commit, or whatever is more convenient for you.

ygribov added inline comments.Jan 15 2014, 5:25 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
697

Sorry, I must be missing something... My intent with REAL(vname2) (which is actually REAL(vsnprintf)) is to estimate number of chars that will be written. 0 tells vsnprintf that it should not _write_ anything, only estimate number of chars. We could simply pass NULL instead of str and it wouldn't change anything. So why do you bother about str?

lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
161

I'll update the code and see if tests seize to work then.

eugenis added inline comments.Jan 15 2014, 5:36 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
697

I'm just worried that it makes *printf quite a bit heavier - we now parse the format string 3 times (2 in libc *printf and 1 in our code). Alternatively, we could do 1 + 1, and then figure out the length of the resulting string with strlen() after the REAL *printf.

ygribov added inline comments.Jan 15 2014, 5:48 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
697

Agreed but then we may get stack overflow before Asan is able to react. Looks like a trade-off between speed and security. Does libsanitizer have general preference for any of those?

IMHO sprintf is already so slow that we may not need to bother so much.

And sprintf usage is discouraged by all coding standards and static analyzers anyway :)

eugenis added inline comments.Jan 15 2014, 5:57 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
697

Now I must be missing something. What would be the cause of stack overflow?

That would slow down sprintf (and snprintf, etc) by a factor of 1.5 (roughly speaking). That may be a big hit exactly because sprintf is so slow. I'm assuming that calculation the length of the output by passing 0 as size takes about the same time as actually formatting the output.

ygribov added inline comments.Jan 15 2014, 6:26 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
697

Now I must be missing something. What would be the cause of stack overflow?

A call to REAL(sprintf) with unchecked target buffer size.

That would slow down sprintf (and snprintf, etc) by a factor of 1.5 (roughly speaking).

Yup, as I said I agree that this adds a penalty. Let's see if we can now come to conclusion on buffer overflow part and see which one is more important.

eugenis added inline comments.Jan 15 2014, 7:01 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
697

I see. But that's the same call that the application does, i.e. the overflow would have happened without sanitizer, too. And we will most likely catch it, but after sprintf corrupts some memory.

I think its common to use the strlen way. Interceptors like strcat, strcpy call the real function without checking that the string is actually null-terminated, and examine the result afterwards.

ygribov added inline comments.Jan 15 2014, 9:46 AM
lib/sanitizer_common/sanitizer_common_interceptors.inc
697

we will most likely catch it, but after sprintf corrupts some memory.

After some internal fighting I have to agree with that. The claim would be wrong on platforms with ascending stacks but I'm unaware of any living instance (last I heard Aarch64 failed at that).

I think its common to use the strlen way.
Interceptors like strcat, strcpy call the real function
without checking

I see, so we favor perf indeed. I'll remove the first sprintf call then.

BTW do we mention somewhere on our wiki that our interceptors are not 100% secure?

ygribov updated this revision to Unknown Object (????).Jan 17 2014, 12:10 AM

New diff based on Evgeny's comments.

Could you rebuild the new patch from projects/compiler-rt directory
instead of projects/compiler-rt/lib? Right now the incremental diff
between two last revisions is completely broken.

ygribov updated this revision to Unknown Object (????).Jan 17 2014, 1:15 AM

Rebased diff to compiler-rt root.

Frankly I'd expect Phabricator to catch silly errors like this...

LGTM w/ nit
But please wait for samsonov's review.

lib/sanitizer_common/sanitizer_common_interceptors.inc
691

vname2 can be removed

697

I don't think we claim anywhere that ASan is 100% reliable, or that it catches all the bugs ;)

samsonov added inline comments.Jan 17 2014, 2:17 AM
lib/asan/lit_tests/TestCases/printf-1.c
5

Just use "%t 2>&1" to test it with default ASAN_OPTIONS.

7

I'd suggest to merge all these test cases into one, as we're really testing one thing here. That would save some lines of code and would make modifying and extending the test case easier, and the tests will run faster (you will compile/link a single binary instead of 5 binaries). You can control the printf(...) call you're testing with an argv. See TestCases/stack-buffer-overflow-with-position.cc

lib/asan/lit_tests/TestCases/printf-2.c
21

Could you match a file/line in the stack trace in error report? E.g. see TestCases/null_deref.cc

lib/sanitizer_common/sanitizer_common_interceptors.inc
573

Just "#include", we don't seem to use "# " indentation in another places in this file.

lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
25

Quick check: is it OK you will consume the first "-" in "--" string?

30

Please clarify in function name or comment why "0" or "-0" are unexpected.

61

remove empty line

218

Is it ok the code originally parsed only unsigned integers as the field width, but now it will treat "-9" as a field width "9"?

348

I don't like this diagnostics: we don't even print what this unknown specifier is. We should either do this, or silently exit.

508

same here.

lib/sanitizer_common/tests/sanitizer_scanf_interceptor_test.cc
61

This one is also const.

ygribov added inline comments.Jan 17 2014, 3:02 AM
lib/asan/lit_tests/TestCases/printf-1.c
5

Ok.

7

I don't quite agree. Different tests trigger different types of memory errors so CHECKs will be different. Also printed messages in some tests differ.

lib/asan/lit_tests/TestCases/printf-2.c
21

Well, ok although I'd consider this an overkill.

lib/sanitizer_common/sanitizer_common_interceptors.inc
573

+1

691

+1

lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
25

%-- is invalid syntax anyway so I think it's ok.

30

This is Evegeny's code so I have no idea.

61

Ok.

218

Well, the spec does not mention that width in scanf must be positive but GNU scanf rejects negatives. I'll replace with parse_number.

348

This warning is mainly to inform us that we failed to parse some format spec and I think it's rather important for debugging purposes. What if I replace it with VReport(1, ...) ?

lib/sanitizer_common/tests/sanitizer_scanf_interceptor_test.cc
61

+1

eugenis added inline comments.Jan 17 2014, 3:06 AM
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
30

Could it be different for printf vs scanf?

348

The warning is good, but to be useful it should print a bit more info. Otherwise in a big program it is not at all clear what to do about it.

samsonov added inline comments.Jan 17 2014, 3:23 AM
lib/asan/lit_tests/TestCases/printf-1.c
7

Well, ok, leaving this to you. If you think that merging the cases would complicate things too much, leave it as is.

lib/asan/lit_tests/TestCases/printf-2.c
21

We generally don't want many internal frames from ASan runtime to appear in the error report. So it's nice to verify that error report looks somewhat like this:
#0 __interceptor_printf ....
#1 main printf-2.c:16

lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
25

Alright, I agree that %-- is just invalid scanf/printf syntax and we may not necessarily parse it correctly, but now we move this to function "maybe_parse_signed", and its name doesn't tell much about this hidden assumptions ("-" consumed from "--", "0" results in NULL returned), so one could use it in a wrong way somewhere else. IMO It's better to specify this assumptions explicitly here.

30

See above

ygribov added inline comments.Jan 17 2014, 3:26 AM
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
30

Closer study of scanf/printf spec reveals that width/precision are unsigned both for printf and scanf. But printf may have '-' or '+' flag in front of width.

Anyway I don't know why you check for <= 0 here.

348

The best solution would be to remember pointers to begin and end of specifier in format string. We can then use them to print specifier here.

ygribov added inline comments.Jan 17 2014, 3:28 AM
lib/asan/lit_tests/TestCases/printf-2.c
21

Got it.

lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
25

I agree that whole parse_number/parse_signed thing is a bit messy. Let me see if I can prettify it with next commit.

ygribov added inline comments.Jan 17 2014, 3:50 AM
lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
30

Ok, I got it: scanf is only allowed to have a non-zero width so this check needs to be moved to scanf_parse_next. All this syntax peculiarities are going to drive me crazy.

ygribov updated this revision to Unknown Object (????).Jan 17 2014, 5:53 AM

Updates based on feedback from Evgeny&Alex.

What's missing:

  • I haven't added the @LINE directives; currently ASan reports #0 0x431593 in printf_common #1 0x431a94 in __interceptor_vprintf #2 0x42b2fd in printf #3 0x47a897 in main

which probably is not what we want but I don't think that we should fix it in this patch.

Regarding glibc tests: I've ran them and verified that ASan does not cause segfaults or test failures (it does print couple of warnings on invalid format strings). Is this what you guys wanted?

Alex, Evgeny,

Do you guys want anything else to go into this revision?

-Y

I will look at it now. Sorry for delay.

A couple of nits.

lib/sanitizer_common/tests/sanitizer_scanf_interceptor_test.cc
37–62

Can you please add a comment above describing why NOLINT is necessary here?

43

At least two spaces between the code and the comment, please (you'll need to reformat the rest of declarations)

Only a few minor nits (I've also committed part of this patch already). I think it's fine for submission (once you're re-upload the diff addressing the comments).

lib/asan/lit_tests/TestCases/printf-5.c
5

Once again, what's the point of testing the default configuration? In this way you'll have to update the tests on a flag flip, is this really what we want? Can we remove it, or at least have a single test for default configuration?

lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
48

q is always non-NULL

449

Please either make it a function (I see, va_list might be a problem), or name it as SKIP_SCALAR_ARG

lib/sanitizer_common/sanitizer_printf.cc
98 ↗(On Diff #6503)

I've submitted this part in r199724 (it's not directly related to printf interceptors).

kcc added inline comments.Jan 21 2014, 3:29 AM
lib/asan/lit_tests/TestCases/printf-5.c
5

This was my suggestion, I think this *is* what we want.

ygribov added inline comments.Jan 21 2014, 3:43 AM
lib/asan/lit_tests/TestCases/printf-5.c
5

Yeah, I guess Kostya wants to explicitly test that printf checks are disabled by default for now.

lib/sanitizer_common/sanitizer_common_interceptors_scanf.inc
48

+1

449

Will do.

lib/sanitizer_common/tests/sanitizer_scanf_interceptor_test.cc
37–62

I copy-pasted this from TEST(SanitizerCommonInterceptors, Scanf). Seems to work with NOLINT removed so I'll update the patch.

43

Likewise.

ygribov updated this revision to Unknown Object (????).Jan 21 2014, 3:47 AM

Minor updates based on review.

samsonov accepted this revision.Jan 21 2014, 4:04 AM

LGTM.

Submitted as r199729. Huge thanks for working on this!

Np, waiting for real-world feedback from you.

ygribov closed this revision.Jan 27 2015, 12:09 AM
Via Old World