This is an archive of the discontinued LLVM Phabricator instance.

[clang][analyzer] Make messages of StdCLibraryFunctionsChecker user-friendly
ClosedPublic

Authored by balazske on Feb 2 2023, 8:34 AM.

Details

Summary

Warnings and notes of checker alpha.unix.StdLibraryFunctionArgs are
improved. Previously one warning and one note was emitted for every
finding, now one warning is emitted only that contains a detailed
description of the found issue.

Diff Detail

Event Timeline

balazske created this revision.Feb 2 2023, 8:34 AM
Herald added a reviewer: NoQ. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
balazske requested review of this revision.Feb 2 2023, 8:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 2 2023, 8:34 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Awesome, been a long time coming!!

Other than the minor observation of changing "of function" to "to", I'm inclined to accept this patch. We definitely should describe what the value IS, not just what is should be (aside from the cases concerning nullness, I think they are fine as-is), but seems to be another can of worms deserving of its own patch.

The reason why I'd solve the description of the argument separately is that other checkers also suffer from this problem (alpha.security.ArrayBoundV2, hello), and it opens conversations such as "What if we can't sensibly describe that value?", and on occasions, we have felt that the go-to solution should be just suppressing the warning. But then again, its an intentional false negative.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
915–917

Looking at the tests, the word 'function' may be redundant and unnecessary.

932–953

Here too.

966–969

Here too.

980

We are saying what the value should be, but neglect to say what it is instead. Something like this would be good:

"The value of the 1st argument to _nonnull should be equal to or less then 10, but is 11".

This is what I (and many of our users IIRC) miss in checker messages, like in those where we are supposed to describe what the value of the index is (besides it causing a buffer overflow).

I had the plan to add a note that tells the value that is out of range. In the most simple case it can work if we have a fixed value. But this may be the most obvious from the code too. Otherwise a new function can be added that can print the assumed constraint ranges.

balazske updated this revision to Diff 495042.Feb 6 2023, 2:08 AM

Replaced "of function" with "to" in checker messages.

Probably it is not always useful to explain why the argument is wrong. In cases when we can find out that the value is exactly between two consecutive valid ranges we can display a note, or when the exact value is known. Otherwise it may end up in something like "the value should be between 1 and 4 or between 6 and 10, but it is less than 1 or exactly 5 or greater than 10". The "good" cases are (when more information is available): "the value is less than 1", "the value is 5", "the value is greater than 10". If two bad ranges are known it may be OK too: "the value is less than 1 or it is exactly 5". The explanation may be possible to implement by test assumes for the negated ranges.

A small nit, otherwise LGTM.

Probably it is not always useful to explain why the argument is wrong. In cases when we can find out that the value is exactly between two consecutive valid ranges we can display a note, or when the exact value is known. Otherwise it may end up in something like "the value should be between 1 and 4 or between 6 and 10, but it is less than 1 or exactly 5 or greater than 10".

You're totally right. I'm not sure how to approach tricky cases like that either.

The "good" cases are (when more information is available): "the value is less than 1", "the value is 5", "the value is greater than 10". If two bad ranges are known it may be OK too: "the value is less than 1 or it is exactly 5". The explanation may be possible to implement by test assumes for the negated ranges.

Maybe we can start out with the easier cases, but we should check whether there is a checker that solves a different problem, and should try to aim at a somewhat general, reusable solution.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
95

Using a raw_ostream as a parameter sounds more elegant than a SmallString with a precise stack buffer length. Not to mention that you could call this function with llvm::errs() for easy debugging.

100

Here as well.

clang/test/Analysis/std-c-library-functions-arg-constraints-notes.cpp
51–79

Please clang-format the changes in this test file.

balazske updated this revision to Diff 496471.Feb 10 2023, 7:45 AM

change of range descriptions, align comments

balazske updated this revision to Diff 496474.Feb 10 2023, 7:48 AM

fixed another comment format problem

balazske updated this revision to Diff 496476.Feb 10 2023, 7:50 AM

fixed another comment format problem

I changed the text descriptions to relational operators at some places, this is similar to how other clang analyzer messages look (for example notes at if statements), it is shorter, and probably more faster to understand ("<= 0" instead of "non-positive").

balazske marked an inline comment as done.Feb 10 2023, 7:56 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
95

I want to improve this in a later patch. The change does involve not new code too (getArgDesc should be changed too and other places where the messages are generated).

balazske marked 4 inline comments as done.Feb 10 2023, 7:58 AM
Szelethus accepted this revision.Feb 14 2023, 7:42 AM

LGTM!

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
95

Fair enough.

This revision is now accepted and ready to land.Feb 14 2023, 7:42 AM
This revision was landed with ongoing or failed builds.Feb 15 2023, 12:23 AM
This revision was automatically updated to reflect the committed changes.