This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Valist checkers.
ClosedPublic

Authored by xazax.hun on Dec 4 2015, 5:14 AM.

Details

Summary

This patch introduces two new checks for incorrect usage of va_lists. They finds usages of uninitialized va_lists and leaked/unterminated va_lists. They can also find va_lists that are copied into itself.

This check was developed by Donat Nagy (m1nagdon@gmail.com) and myself.

Diff Detail

Repository
rL LLVM

Event Timeline

xazax.hun updated this revision to Diff 41865.Dec 4 2015, 5:14 AM
xazax.hun retitled this revision from to [analyzer] Valist checkers..
xazax.hun updated this object.
xazax.hun added subscribers: dkrupp, cfe-commits.
zaks.anna edited edge metadata.Dec 15 2015, 10:10 PM

Looks good overall; comments below.

Please, provide more information on real world code evaluation. Which codebases this has been tested on? What was the false positive rate? How many real bugs were found/fixed?

What is the criteria for taking it out of alpha?

Please, write the commit message that will be used.

lib/StaticAnalyzer/Checkers/ValistChecker.cpp
30 ↗(On Diff #41865)

This looks like a general purpose utility that could be used by other checkers.
(Though, this code might not support all languages as is.)

54 ↗(On Diff #41865)

What are "likes" ?

82 ↗(On Diff #41865)

Maybe these could be called "checkVAListEndCall" or something similar; as is they are too similar to "checkPreCall/checkPostCall".

116 ↗(On Diff #41865)

Explain what's in the pair or make it into a struct. (I suspect the second argument is the index of the argument taking the va_arg.)

204 ↗(On Diff #41865)

This should be a general-purpose utility. It's not a good idea for each checker to roll their own implementation:) See 'variableName' in http://reviews.llvm.org/D12761.

There could be other places in the analyzer where we print the names.

217 ↗(On Diff #41865)

Please, add a comment explaining what this does and roughly how.

Looks like the main difference with Malloc checker's getAllocationSite is that this does not look for the ReferenceRegion. That might not matter for va lists since the usage is not likely to span many inlined functions, is that correct?

361 ↗(On Diff #41865)

Please add tests for this, preferably the plist output.

test/Analysis/Inputs/system-header-simulator-for-valist.h
4 ↗(On Diff #41865)

Does the second part of the comment refer to something you test for here?

test/Analysis/Inputs/system-header-simulator-valist.h
3 ↗(On Diff #41865)

Is this an unused duplicate file?

test/Analysis/valist-uninitialized.c
64 ↗(On Diff #41865)
98 ↗(On Diff #41865)

The majority of functions end with "//no-warning" comment. Why is that?

134 ↗(On Diff #41865)

Is this something we would want to warn on or not?

146 ↗(On Diff #41865)

You could call this "inlined_uses_arg".

xazax.hun marked 9 inline comments as done.Dec 22 2015, 2:15 AM

It was tested on gcc and rAthena (https://github.com/rathena/rathena). It did not find any issues in those projects, but I was able to find some issues that I artificially put into those projects.

lib/StaticAnalyzer/Checkers/ValistChecker.cpp
30 ↗(On Diff #41865)

I agree.
Right now it does not support namespaces, overloading on the types of the argument, overloading on CV qualifiers etc. Do you have any other language features that is not supported in mind?
Supporting all of that might be useful but I am a bit afraid of the result being bloated. What do you think, what would be the most appropriate place to put such utility? It could be a helper class in CallEvent.cpp/h.

54 ↗(On Diff #41865)

va_list accepting functions.

82 ↗(On Diff #41865)

You are right, I will rename this to better reflect their role.

204 ↗(On Diff #41865)

I agree that it would be great to have this as a general utility. I think I should wait until the implementation from the other checker gets commited.

217 ↗(On Diff #41865)

Right. Clarified this in the comment.

test/Analysis/Inputs/system-header-simulator-for-valist.h
4 ↗(On Diff #41865)

Added a test for it.

test/Analysis/Inputs/system-header-simulator-valist.h
3 ↗(On Diff #41865)

Yes, it is.

test/Analysis/valist-uninitialized.c
98 ↗(On Diff #41865)

To indicate that, no leak related warning there. Do you prefer to remove those comments?

134 ↗(On Diff #41865)

We should definitely wanr on that, but I think it should be ok to add this feature later.

xazax.hun updated this revision to Diff 43435.Dec 22 2015, 2:16 AM
xazax.hun edited edge metadata.
xazax.hun marked 5 inline comments as done.
  • Addressed review comments.
  • Updated to latest trunk.
xazax.hun updated this object.Dec 22 2015, 2:19 AM
zaks.anna added inline comments.Jan 5 2016, 4:54 PM
lib/StaticAnalyzer/Checkers/ValistChecker.cpp
31 ↗(On Diff #43435)

It does not support ObjC methods.

I think this is most useful to checker writes, so how about placing it in CheckerContext? (I am fine with CallEven as well if you think it has wider applicability.)

205 ↗(On Diff #43435)

How about we commit your implementation as a general utility and point the other review to it? You can make it as a separate commit for clarity.

test/Analysis/valist-uninitialized.c
135 ↗(On Diff #43435)

Please, make that clear in the comment, for example, you can say "We should produce a warning here..".

xazax.hun updated this revision to Diff 53700.Apr 14 2016, 6:20 AM
  • Updated to latest trunk.
  • Migrated to use the new CallDescription interface.
  • Use http://reviews.llvm.org/D16044 to extract variable name from regions.
xazax.hun updated this revision to Diff 53899.Apr 15 2016, 9:02 AM
xazax.hun updated this revision to Diff 60497.Jun 13 2016, 2:12 AM
  • Updated to latest trunk.
  • All prerequisites are now committed.
szepet added a subscriber: szepet.Jul 26 2016, 5:43 AM
NoQ added a subscriber: NoQ.Aug 1 2016, 5:13 AM
xazax.hun marked 10 inline comments as done.Aug 1 2016, 5:14 AM
dcoughlin added inline comments.Aug 1 2016, 12:57 PM
lib/StaticAnalyzer/Checkers/ValistChecker.cpp
177 ↗(On Diff #60497)

Are there tests for this diagnostic? I don't see any.

NoQ added a comment.Aug 11 2016, 8:41 AM

The checker's in great shape! I see a few minor things, but that's it.

The checks are split into two sections ("uninitialized" and "unterminated"), but there seem to be more auxiliary checks provided (eg. "copies into itself") that are on for both checkers, do you think you need to add more flags to make the checker more flexible, or maybe just merge everything into a single checker?

Because plist tests are so heavy: if all you wanted to test was how your bug reporter visitor works, did you consider using -analyzer-output=text instead? (this is not the default output mode; it has note: diagnostics displaying the path, which you can catch with expected-note{{}}). Also, you could probably write special tests for the visitor, instead of testing the visitors on all tests, regardless of text/plist. But i'm merely sharing this hint, i don't think it's worth rewriting once we already have plist tests.

lib/StaticAnalyzer/Checkers/ValistChecker.cpp
177 ↗(On Diff #60497)

In the other file, first few tests in valist-unterminated.c.

185 ↗(On Diff #60497)

At first, i thought that you're trying to strip casts from symbolic pointers here. But then i figured out that you are stripping an implementation detail here - as even VarRegion-based VAs reach here in the form of "element{va,0 S64b,struct __va_list_tag}". I've a feeling this deserves a comment.

254 ↗(On Diff #60497)

Does static const Stmt *PathDiagnosticLocation::getStmt(const ExplodedNode *) work for you? Because i guess it was specifically designed for that purpose. Here and in one more place where this code is duplicated.

test/Analysis/valist-uninitialized.c
64 ↗(On Diff #60497)

Trailing whitespace :)

98 ↗(On Diff #60497)

A bit more trailing whitespace here and below.

178 ↗(On Diff #60497)

Like the compiler, the static analyzer treats some functions differently if
they come from a system header -- for example, it is assumed that system
functions do not arbitrarily free() their parameters//, and that some bugs
found in system headers cannot be fixed by the user and should be
suppressed.

Perhaps library functions can be assumed to never va_end() va_lists either. The idea behind the trick, which is also used in, say, SimpleStreamChecker, is that we know all library functions, we can list all library functions that va_end() va_lists, so it's safe to assume that all other library functions do not va_end() va_lists (completely unlike other, non-standard functions with unavailable bodies).

So i think this test should warn, and it shouldn't be hard to fix. That said, some checkers (eg. PthreadLockChecker) do not implement this trick, so it's not critical, but it may give you slightly more positives.

test/Analysis/valist-unterminated.c
42 ↗(On Diff #60497)

Wow, these tests are mind-breaking :))

Is it ok if i ask to put the FIXME above the test or above no-warning or into function name? Because it took some time for me to figure out that f7 is not FIXME, being used to the traditional positioning of the FIXME comment><

xazax.hun updated this revision to Diff 68157.Aug 16 2016, 4:22 AM
xazax.hun marked 5 inline comments as done.
  • Improvements according to review comments.
NoQ accepted this revision.Aug 17 2016, 8:19 AM
NoQ added a reviewer: NoQ.
This revision is now accepted and ready to land.Aug 17 2016, 8:19 AM
This revision was automatically updated to reflect the committed changes.
xazax.hun reopened this revision.Aug 18 2016, 4:08 AM

It looks like it broke some of the build bots.

Error from the windows build bots:

error: 'note' diagnostics expected but not seen: 
  File C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\tools\clang\test\Analysis\valist-uninitialized.c Line 96: va_list 'va' is copied onto itself
  File C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\tools\clang\test\Analysis\valist-uninitialized.c Line 102: va_list 'va' is copied onto itself
  File C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\tools\clang\test\Analysis\valist-uninitialized.c Line 108: Initialized va_list 'va' is overwritten by an uninitialized one
error: 'note' diagnostics seen but not expected: 
  Line 96: va_list va' is copied onto itself
  Line 102: va_list va' is copied onto itself
  Line 108: Initialized va_list va' is overwritten by an uninitialized one
12 errors generated.

Somehow the beginning single quote is missing from the generated message. It is very strange.

Error from other architectures:

error: 'warning' diagnostics expected but not seen: 
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 27: Initialized va_list is leaked
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 39: Initialized va_list is leaked
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 110: Initialized va_list 'va_array[3]' is leaked
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 125: Initialized va_list 'mem[0]' is leaked
error: 'warning' diagnostics seen but not expected: 
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 33: Initialized va_list 'fst' is leaked
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 110: Initialized va_list 'va_array' is leaked
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 125: Initialized va_list 'mem' is leaked
error: 'note' diagnostics expected but not seen: 
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 26: Initialized va_list
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 27: Initialized va_list is leaked
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 36: Initialized va_list
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 39: Initialized va_list is leaked
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 110: Initialized va_list 'va_array[3]' is leaked
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 125: Initialized va_list 'mem[0]' is leaked
error: 'note' diagnostics seen but not expected: 
  Line 31: Initialized va_list
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 33: Initialized va_list 'fst' is leaked
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 110: Initialized va_list 'va_array' is leaked
  File /var/lib/buildbot/slaves/hexagon-build-03/clang-hexagon-elf/llvm/tools/clang/test/Analysis/valist-unterminated.c Line 125: Initialized va_list 'mem' is leaked
17 errors generated.

I suspect that slightly different AST is generated for those architectures that cause the different behavior. I will further investigate those problems.

This revision is now accepted and ready to land.Aug 18 2016, 4:08 AM
NoQ added a comment.Aug 21 2016, 4:02 AM

I suspect that slightly different AST is generated for those architectures that cause the different behavior. I will further investigate those problems.

Seems so, because on my machine when i append -triple hexagon-unknown-linux to the test run line, a bunch of things start failing. You may also consider hardcoding the runline in the current test and commit (which is so often a good idea), and fix other stuff in later commits. A proper investigation would be great, of course :)

P.S. Just curious - do you plan to eventually model va_arg(), as we discussed with Michael Tandy in http://lists.llvm.org/pipermail/cfe-dev/2016-August/050202.html ?

This revision was automatically updated to reflect the committed changes.
In D15227#521781, @NoQ wrote:

I suspect that slightly different AST is generated for those architectures that cause the different behavior. I will further investigate those problems.

Seems so, because on my machine when i append -triple hexagon-unknown-linux to the test run line, a bunch of things start failing. You may also consider hardcoding the runline in the current test and commit (which is so often a good idea), and fix other stuff in later commits. A proper investigation would be great, of course :)

I committed with the hardcoded triple in the runline for now.

P.S. Just curious - do you plan to eventually model va_arg(), as we discussed with Michael Tandy in http://lists.llvm.org/pipermail/cfe-dev/2016-August/050202.html ?

It is not on my TODO list at the moment, but I might look into it in the distant future.

@xazax.hun,

Can we move this out of alpha?

Have this checkers been tested on a large codebase? What are false positive rates?

Thanks!
Anna

@xazax.hun,

Can we move this out of alpha?

Have this checkers been tested on a large codebase? What are false positive rates?

I have tested it on a few ~200k LOC C codebase and I did not see any major problem. There is one problem left though, when I committed this check some of the build bots broke, I suspect that slightly different AST is generated on some of the platforms. I temporarily fixed the issue by hardcoding the triple into the tests and I did not have time yet to investigate the source of the problems. But as far as I remember, this produced false negatives in the tests not false positives.

But as far as I remember, this produced false negatives in the tests not false positives.

Could you double check that? Maybe you still have some notes in your mail box or just by looking at the code.

Did none of the checks work or just some of them?

Also, which platforms did this not work on?

It would be great to move all of the useful checks out of alpha since alpha essentially means "unsupported" and we do not recommend turning those checkers on.

Thanks!

But as far as I remember, this produced false negatives in the tests not false positives.

Could you double check that? Maybe you still have some notes in your mail box or just by looking at the code.

Did none of the checks work or just some of them?

Also, which platforms did this not work on?

It would be great to move all of the useful checks out of alpha since alpha essentially means "unsupported" and we do not recommend turning those checkers on.

Thanks!

I think I managed to solve the problems due to the different AST variants. In the future this should be abstracted away by the analyzer core. You can see the patch here: https://reviews.llvm.org/D30157

Early next week I will rerun this check on a larger open source project. If it does not give me too much false positives or crash, I think it is a good candidate to move out from alpha.