This is an archive of the discontinued LLVM Phabricator instance.

New checks for fortified sprintf
ClosedPublic

Authored by serge-sans-paille on Dec 16 2019, 1:19 PM.

Details

Summary

Implement a pessimistic evaluator of the minimal required size for a buffer based on the format string, and couple that with the fortified version to emit a warning when the buffer size is lower than the lower bound computed from the format string.

See the test cases for examples of warnings, and https://github.com/serge-sans-paille/llvm-project/pull/6/checks for the cross-platform validation.

Note: The lower bound could be improved, but I'd rather do that in an incremental commit, if that's okay with the reviewers.

Edit: thanks to @aaron.ballman interest, I've been able to support al(most|l) the scanf specification, ending up with a rather complete support.

Diff Detail

Event Timeline

Thanks for working on this! I think this would be a really useful diagnostic to have.

clang/lib/Sema/SemaChecking.cpp
392

nit: namespace {

580–581

Can you delete this comment now?

596

Optional<APSInt> might be a better way of representing this?

601

Can't you do this for Builtin::sprintf as well? The diagnostic would still be worth issuing when _FORTIFY_SOURCE is disabled.

605

Will this assert on: sprintf(buf, L"foo");? Not that that makes any sense, but we shouldn't crash.

612

I'm not sure why you're using min here, can you add a comment?

Take remarks into account:

  • support sprintf when the buffer has a known static size
  • use llvm::Optional
  • some extra minor tweaks.
serge-sans-paille marked 7 inline comments as done.Dec 17 2019, 3:47 AM
serge-sans-paille added inline comments.
clang/lib/Sema/SemaChecking.cpp
580–581

I only deleted the one related to sprintf

605

Still need to check that.

serge-sans-paille marked an inline comment as done.
  • Prevent assert upon wide string format
  • More test cases
serge-sans-paille marked 2 inline comments as done.Dec 17 2019, 5:41 AM
serge-sans-paille added inline comments.
clang/lib/Sema/SemaChecking.cpp
605

Checked and fixed!

Support %% and %c

erik.pilkington added inline comments.
clang/lib/Sema/SemaChecking.cpp
621

Wait, does this actually return a smaller length if there is a null-terminator embedded in the string? It looks like Format->getString() returns a StringRef with embedded nul bytes and .size() is returning the length. You might have to do something like: FormatStrRef.find('0');. Can you add a test for this?

Update size computation + test case involving null byte inside the format string.

Unit tests: fail. 61975 tests passed, 1 failed and 783 were skipped.

failed: LLVM.Bindings/Go/go.test

clang-tidy: unknown.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

aaron.ballman added inline comments.Jan 20 2020, 8:29 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
745–746

I'm fine with the current wording because it's consistent with the existing diagnostics, but I wish we would quantify the size with units in these diagnostics. (e.g., mention that this is measured in bytes as opposed to anything else).

clang/lib/Sema/SemaChecking.cpp
405–406

Naming nits: StartSpecifier and SpecifierLen. It looks like startSpecifier isn't used and the name can probably just be dropped.

436–437

Why is size incremented by 3 here instead of 2 as above with a leading 0x?

448

Guard against wraparound?

607

Spurious whitespace.

608

We can handle UTF-8 format strings even when they contain non-ASCII characters?

8355

Spurious whitespace change?

serge-sans-paille marked an inline comment as done.

Finer grain lower bound computations, and simpler control-flow for the checking function.

Unit tests: pass. 62028 tests passed, 0 failed and 783 were skipped.

clang-tidy: unknown.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Unit tests: pass. 62043 tests passed, 0 failed and 783 were skipped.

clang-tidy: unknown.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

(There are still some minor whitespace nits to resolve as well.)

clang/lib/Sema/SemaChecking.cpp
743

I think this assertion can be triggered in the case where no UsedSize was specified but then UsedSizeArg evaluates to 0, can't it? Or does something catch that case and diagnose?

Remove useless assert

(There are still some minor whitespace nits to resolve as well.)

Strange, everything is passed through clang-format-diff :-/

Unit tests: pass. 62112 tests passed, 0 failed and 808 were skipped.

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

aaron.ballman accepted this revision.Jan 23 2020, 5:59 AM

(There are still some minor whitespace nits to resolve as well.)

Strange, everything is passed through clang-format-diff :-/

They may have been manually inserted by accident? It's newlines in a few places, I added phab review comments at them.

On the whole, I think this LGTM, assuming the requested test cases don't discover issues.

clang/test/Sema/warn-fortify-source.c
127

I'd like to see some additional tests for things like the + and flags, length modifiers like ll, escape characters, etc. Basically, we should be testing most of the conversion specifiers to verify our conservative length calculations.

This revision is now accepted and ready to land.Jan 23 2020, 5:59 AM

Remove extra new lines.
Add more test cases.

serge-sans-paille edited the summary of this revision. (Show Details)Jan 23 2020, 8:35 AM

Unit tests: pass. 62135 tests passed, 0 failed and 812 were skipped.

clang-tidy: pass.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

clang-format update.

Unit tests: fail. 62134 tests passed, 9 failed and 811 were skipped.

failed: LLVM.MC/AArch64/ete-sysregs.s
failed: LLVM.MC/AArch64/trace-regs.s
failed: LLVM.MC/Disassembler/AArch64/ete.txt
failed: LLVM.MC/Disassembler/AArch64/trace-regs.txt
failed: libc++.std/language_support/cmp/cmp_partialord/partialord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongeq/cmp.strongeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_strongord/strongord.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakeq/cmp.weakeq.pass.cpp
failed: libc++.std/language_support/cmp/cmp_weakord/weakord.pass.cpp

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Pre-merge checks is in beta. Report issue. Please join beta or enable it for your project.

Continues to LG, do you need me to commit on your behalf?

Continues to LG, do you need me to commit on your behalf?

I was waiting for a last LG as I did some changes, thanks for all the quick iteration, I really enjoyed working with you on that patch! I'll submit it tomorrow morning.

This revision was automatically updated to reflect the committed changes.