This is an archive of the discontinued LLVM Phabricator instance.

Add new warning -Walloca for use of builtin alloca function
ClosedPublic

Authored by ziyig on Jul 17 2019, 2:15 PM.

Details

Summary

Add new warning -Walloca for use of builtin alloca function.

Also warns the use of __builtin_alloca_with_align. GCC has this warning, and we'd like to have this for compatibility.

Diff Detail

Event Timeline

ziyig created this revision.Jul 17 2019, 2:15 PM
cmtice added a subscriber: cmtice.Jul 17 2019, 2:17 PM

Thanks for this!

Mostly just nitpicking the warning's wording. :)

clang/include/clang/Basic/DiagnosticSemaKinds.td
2776

nit: I'd just say "use of function '%0'" here; "builtin" doesn't really add much.

I also wonder if we should be saying anything more than "we found a use of this function." Looks like GCC doesn't (https://godbolt.org/z/sYs_8G), but since this warning is sort of opinionated in itself, might it be better to expand this to "use of '%0' is discouraged"?

WDYT, Aaron?

aaron.ballman added inline comments.Jul 20 2019, 6:45 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
2776

What is the purpose to this diagnostic, aside from GCC compatibility? What does it protect against?

If there's a reason users should not use alloc(), it would be better for the diagnostic to spell it out.

Btw, I'm okay with this being default-off because the GCC warning is as well. I'm mostly hoping we can do better with our diagnostic wording.

clang/lib/Sema/SemaChecking.cpp
1172

Do we want to warn on all uses of alloca(), or just the ones that get past the call to SemaBuiltinAllocaWithAlign()?

clang/test/Sema/warn-alloca.c
1

I'd appreciate a test demonstrating no warnings if -Wall is passed without an explicit -Walloca.

clang/include/clang/Basic/DiagnosticSemaKinds.td
2776

I'm mostly hoping we can do better with our diagnostic wording

+1

What is the purpose to this diagnostic?

I think the intent boils down to that alloca is easily misused, and leads to e.g., https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt . Since its use often boils down to nothing but a tiny micro-optimization, some parties would like to discourage its use.

Both glibc and bionic recommend against the use of alloca in their documentation, though glibc's docs are less assertive than bionic's, and explicitly call out "[alloca's] use can improve efficiency compared to the use of malloc plus free."

Greping a codebase and investigating the first 15 results:

  • all of them look like microoptimizations; many of them also sit close to other malloc/new ops, so allocating on these paths presumably isn't prohibitively expensive
  • all but two of the uses of alloca have no logic to fall back to the heap malloc if the array they want to allocate passes a certain threshold. Some of the uses make it look *really* easy for the array to grow very large.
  • one of the uses compares the result of alloca to NULL. Since alloca's behavior is undefined if it fails, ...

I'm having trouble putting this into a concise and actionable diagnostic message, though. The best I can come up with at the moment is something along the lines of "use of function %0 is subtle; consider using heap allocation instead."

aaron.ballman added inline comments.Jul 22 2019, 3:51 PM
clang/include/clang/Basic/DiagnosticSemaKinds.td
2776

Okay, that's along the lines of what I was thinking.

Part of me thinks that this should not diagnose calls to alloca() that are given a constant value that we can test for sanity at compile time. e.g., calling alloca(10) is highly unlikely to be a problem, but calling alloca(1000000) certainly could be, while alloca(x) is a different class of problem without good static analysis.

That said, perhaps we could get away with use of function %0 is discouraged; there is no way to check for failure but failure may still occur, resulting in a possibly exploitable security vulnerability or something along those lines?

clang/lib/Sema/SemaChecking.cpp
1175

I think this should see how the user spelled the builtin call. It would be a bit strange for a user who wrote alloca() in their code to get a diagnostic about not calling __builtin_alloca().

clang/include/clang/Basic/DiagnosticSemaKinds.td
2776

Yeah, GCC has a similar -Walloca-larger-than=N that does roughly what you described. The icky part is exactly what you said. GCC will warn about unknown values, but considers control flow when doing so: https://godbolt.org/z/J3pGiT

It looks like it's the same "we apply optimizations and then see what happens" behavior as similar diagnostics: https://godbolt.org/z/lLyteP

WRT the diag we emit here, maybe we could use notes to break it up and imply things? e.g.

warning: use of function %0 is discouraged, due to its security implications
note: 'malloc' or 'new' are suggested alternatives, since they have well-defined behavior on failure

...not sold on the idea, but it's a thought.

If we don't want to break it to pieces, I'm fine with your suggestion

aaron.ballman added inline comments.Jul 23 2019, 8:08 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
2776

I'm not certain the note adds value because it will always be printed on the same line as the warning. A note would make sense if we had a secondary location to annotate though.

clang/include/clang/Basic/DiagnosticSemaKinds.td
2776

SGTM

ziyig updated this revision to Diff 211785.Jul 25 2019, 9:57 AM
ziyig marked 9 inline comments as done.

Updated the warning message and the test cases.

ziyig added inline comments.Jul 25 2019, 9:59 AM
clang/lib/Sema/SemaChecking.cpp
1172

I don't have strong opinion about this. Which one do you think is better?

aaron.ballman marked an inline comment as done.Jul 25 2019, 1:35 PM
aaron.ballman added inline comments.
clang/lib/Sema/SemaChecking.cpp
1172

I think the code is fine as-is. SemaBuiltinAllocaWithAlign() returns true when there's an error with the call, and so users will not get the warning only if the call is erroneous, which seems fine given that the code didn't compile. It turns out this matches GCC's behavior as well.

1175

You should pass in TheCall->getDirectCallee() and not try to get the name directly; the diagnostics engine will do the right thing automatically.

ziyig updated this revision to Diff 211816.Jul 25 2019, 1:55 PM
ziyig marked an inline comment as done.
aaron.ballman accepted this revision.Jul 25 2019, 2:21 PM

LGTM, thank you for the patch!

This revision is now accepted and ready to land.Jul 25 2019, 2:21 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2019, 3:26 PM