This is an archive of the discontinued LLVM Phabricator instance.

[Sema] Add some compile time _FORTIFY_SOURCE diagnostics
ClosedPublic

Authored by erik.pilkington on Feb 28 2019, 3:03 PM.

Details

Summary

These diagnose overflowing calls to subset of fortifiable functions. Some functions, like sprintf or strcpy aren't supported right not, but we should probably support these in the future. We previously supported this kind of functionality with -Wbuiltin-memcpy-chk-size, but that diagnose doesn't work with _FORTIFY implementations that use wrapper functions. Also unlike that diagnostic, we emit these warnings regardless of whether _FORTIFY_SOURCE is actually enabled, which is nice for programs that don't enable the runtime checks.

Why not just use diagnose_if, like Bionic does? We can get better diagnostics in the compiler (i.e. mention the sizes), and we have the potential to diagnose sprintf and strcpy which is impossible with diagnose_if (at least, in languages that don't support C++14 constexpr). This approach also saves standard libraries from having to add diagnose_if.

rdar://48006655

Thanks for taking a look!
Erik

Diff Detail

Repository
rC Clang

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptFeb 28 2019, 3:04 PM

Thanks for working on this!

I hope to take an in-depth look at this patch next week (if someone else doesn't beat me to it...), but wanted to note that I think enabling clang to emit these warnings on its own is a good thing. diagnose_if is great for potentially more targeted/implementation-defined things that standard libraries want to diagnose, but IMO clang should be able to catch trivially broken code like this regardless of the stdlib it's building against.

Looks solid to me overall; just a few nits.

I don't think I have actual ownership over any of this code, so I'll defer to either Aaron or Richard for the final LGTM

Thanks again!

clang/lib/Sema/SemaChecking.cpp
317 ↗(On Diff #188793)

nit: I think the prevailing preference is to name lambdas as you'd name variables, rather than as you'd name methods/functions

321 ↗(On Diff #188793)

Should this also try to consider fortify_stdlib?

367 ↗(On Diff #188793)

nit: All of these cases (and the two lambdas above) look super similar. Might it be clearer to just set SizeIndex and ObjectIndex variables from here, and actually evaluate them before UsedSize.ule(ComputedSize)?

If not, I'm OK with this as-is.

clang/lib/Sema/SemaExpr.cpp
5929 ↗(On Diff #188793)

Why isn't this in CheckBuiltinFunctionCall?

aaron.ballman added inline comments.Mar 8 2019, 1:28 PM
clang/lib/AST/Decl.cpp
2994 ↗(On Diff #188793)

If we should -> If true, we should

2995 ↗(On Diff #188793)

its -> it's

clang/lib/Sema/SemaChecking.cpp
319 ↗(On Diff #188793)

it's -> its

367 ↗(On Diff #188793)

Formatting looks off here -- run the patch through clang-format?

388 ↗(On Diff #188793)

depend -> depends

423 ↗(On Diff #188793)

Why don't we want to do this for the new fortify diagnostics?

erik.pilkington marked 16 inline comments as done.

Address review comments. Thanks!

clang/lib/Sema/SemaChecking.cpp
319 ↗(On Diff #188793)

The opposite mistake as above, lol.

321 ↗(On Diff #188793)

I reverted fortify_stdlib in r356103 (although you couldn't possible know this, since that was after you made this comment ;)). We're going to use your pass_object_size attribute instead.

367 ↗(On Diff #188793)

Sure, I guess it's a bit cleaner to do that.

423 ↗(On Diff #188793)

No reason, just was trying to avoid changing the output of the _chk diagnostic. We might as as well though, it cleans up the diagnostic.

clang/lib/Sema/SemaExpr.cpp
5929 ↗(On Diff #188793)

For the same reason I added the bool parameter to getBuiltinCallee, we don't usually consider calls to __attribute__((overloadable)) be builtins, so we never reach CheckBuiltinFunctionCall for them. We're planning on using that attribute though:

int sprintf(__attribute__((pass_object_size(_FORTIFY_LEVEL))) char *buffer, const char *format, ...) 
          __attribute__((overloadable)) __asm__("___sprintf_pass_object_size_chk");

This LGTM; feel free to submit after Aaron stamps this.

Thanks again!

clang/lib/Sema/SemaExpr.cpp
5929 ↗(On Diff #188793)

SGTM

aaron.ballman accepted this revision.Mar 15 2019, 11:15 AM

LGTM aside from some minor nits.

clang/lib/Sema/SemaChecking.cpp
338 ↗(On Diff #190549)

I feel like this diagnostic name should be updated -- snprintf() and memcpy() are pretty distinct things. Maybe warn_builtin_chk_overflow?

400 ↗(On Diff #190549)

const auto *

This revision is now accepted and ready to land.Mar 15 2019, 11:15 AM
This revision was automatically updated to reflect the committed changes.

This is causing false positive warnings for the Linux kernel:
https://github.com/ClangBuiltLinux/linux/issues/423
:(

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/statfs.c#n128
Consider this untested case (when the condition is false):

	if (sizeof(buf) == sizeof(*st))
		memcpy(&buf, st, sizeof(*st));

fs/statfs.c:129:3: warning: 'memcpy' will always overflow; destination buffer has size 64, but size argument is 88 [-Wfortify-source]

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/statfs.c#n169, too.

erik.pilkington marked 4 inline comments as done.Mar 22 2019, 12:14 PM

This is causing false positive warnings for the Linux kernel:
https://github.com/ClangBuiltLinux/linux/issues/423
:(

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/statfs.c#n128
Consider this untested case (when the condition is false):

	if (sizeof(buf) == sizeof(*st))
		memcpy(&buf, st, sizeof(*st));

fs/statfs.c:129:3: warning: 'memcpy' will always overflow; destination buffer has size 64, but size argument is 88 [-Wfortify-source]

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/statfs.c#n169, too.

Ah, I didn't consider that case. Presumably st is configured to have different sizes based on the target? I agree that this is a false-positive, but it seems like a pretty narrow edge case, and there is a pretty obvious source workaround that doesn't affect readability: memcpy(&buf, st, sizeof(buf)). @aaron.ballman/@rsmith Any thoughts here? IMO keeping this diagnostic is worth it.

Ah, I didn't consider that case. Presumably st is configured to have different sizes based on the target?

Yes; sorry I was not clear about that in my example.

I agree that this is a false-positive, but it seems like a pretty narrow edge case, and there is a pretty obvious source workaround that doesn't affect readability: memcpy(&buf, st, sizeof(buf)).

Oh, yeah, we could make that source level change.

@aaron.ballman/@rsmith Any thoughts here? IMO keeping this diagnostic is worth it.

I'm all for keeping it, I'm curious if it can be extended to NOT warn in the case provided?

Ah, I didn't consider that case. Presumably st is configured to have different sizes based on the target? I agree that this is a false-positive, but it seems like a pretty narrow edge case, and there is a pretty obvious source workaround that doesn't affect readability: memcpy(&buf, st, sizeof(buf)). @aaron.ballman/@rsmith Any thoughts here? IMO keeping this diagnostic is worth it.

Yes, I think we should keep this diagnostic. However, if we can find a way to silence it for this particular false-positive pattern, that would be great!

For that particular case, I think we could suppress the false positive using DiagRuntimeBehavior. How many total false positives are we talking about, and how many can the compiler statically prove are unreachable?

We have warnings like -Wdivision-by-zero that take reachability into account: https://godbolt.org/z/3PD-eF. I don't immediately know how it's all done (e.g. in batch because CFG construction is expensive? ...), but looking there for inspiration may be useful.

(Forgot to refresh before pressing 'Submit', so maybe efriedma's comment answers all of the questions in mine ;) )

For that particular case, I think we could suppress the false positive using DiagRuntimeBehavior. How many total false positives are we talking about, and how many can the compiler statically prove are unreachable?

Sure, that seems like a perfect fit, I added it in r357041. There are only two false-positives IIUC, and they're both immediately guarded by an if (sizeof(x) == sizeof(y)).

nickdesaulniers added a comment.EditedMar 27 2019, 9:46 AM

I added it in r357041.

LGTM, thanks!

phosek added a subscriber: phosek.Mar 28 2019, 6:50 PM

This is triggering a Clang assertion failure in Fuchsia build:

clang/lib/AST/ExprConstant.cpp:5032: bool (anonymous namespace)::ExprEvaluatorBase<(anonymous namespace)::PointerExprEvaluator>::VisitMemberExpr(const clang::MemberExpr *) [Derived = (anonymous namespace)::PointerExprEvaluator]: Assertion `!E->isArrow() && "missing call to bound member function?"' failed.

I've filed PR41286 which also has a reproducer.

Ah, looks like the problem is we're sending a dependent expression to the constant evaluator, we should just bail out in that case. I'll fix this tomorrow morning, thanks for tracking this down!

This is triggering a Clang assertion failure in Fuchsia build:

clang/lib/AST/ExprConstant.cpp:5032: bool (anonymous namespace)::ExprEvaluatorBase<(anonymous namespace)::PointerExprEvaluator>::VisitMemberExpr(const clang::MemberExpr *) [Derived = (anonymous namespace)::PointerExprEvaluator]: Assertion `!E->isArrow() && "missing call to bound member function?"' failed.

I've filed PR41286 which also has a reproducer.

Fixed in r357304, thanks.