This is an archive of the discontinued LLVM Phabricator instance.

[clang][analyzer] Improve bug reports of StdLibraryFunctionsChecker.
ClosedPublic

Authored by balazske on Feb 14 2023, 6:21 AM.

Details

Summary

Add an additional explanation of what is wrong if a constraint is
not satisfied, in some cases.
Additionally the bug report generation is changed to use raw_ostream.

Diff Detail

Event Timeline

balazske created this revision.Feb 14 2023, 6:21 AM
Herald added a reviewer: NoQ. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
balazske requested review of this revision.Feb 14 2023, 6:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 14 2023, 6:21 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
NoQ accepted this revision.Feb 16 2023, 5:05 PM

Looks like a massive improvement. Yes, the warning text traditionally focuses on what exactly is wrong. Describing why it's wrong is important as well, but usually less important. We're making an extraordinary statement that the program is wrong, so warning with path notes should look like a proof of that:

"Indeed, let's assume 'x' is greater than 0, Then the program takes true branch on this if-statement. Then it assigns value '0' to pointer 'p'. Which brings us to the statement '*p = 1' three lines below. Pointer 'p' is null, so the program will crash. // Therefore your code is broken, Q.E.D.

It should not end with a general recommendation:

"Indeed, let's assume 'x' is greater than 0. Then the program takes true branch on this if-statement. Then it assigns value '0' to pointer 'p'. Which brings us to the statement '*p = 1' three lines below. Null pointers should not be dereferenced." // I already know that, so what?

This is especially important because people read reports from bottom to top: "Ok, if this happens it's indeed bad, now why do you think this happens?". So if they don't understand what happens (in your case, what value *is* there), they can't even start asking this question. So this is why I think this patch is amazing and it should have been like this from the start.

So if I was to come up with a full warning text myself, I'd probably suggest something along the lines of

Value 3 passed as 1st parameter of '__range_1_2__4_6', which expects 1 or 2, or 4, 5 or 6
Negative value passed as 1st parameter of '__range_1_2__4_6', which expects 1 or 2, or 4, 5 or 6

or maybe even

Variable 'x' passed as 1st parameter to '__range_1_2__4_6' takes values in range [7, 10] whereas expected values for the parameter are 1 or 2, or 4, 5 or 6

so that to explain what happens first, then explain why this is bad later. But even without such further improvements, this patch is great.

clang/test/Analysis/std-c-library-functions-arg-constraints-notes.cpp
210

I'm definitely not an expert on Oxford commas but adding one between individual ranges looks valuable as it helps the reader understand what makes 5 so special.

This revision is now accepted and ready to land.Feb 16 2023, 5:05 PM

The plan is to bring the checker out of the alpha package, the bug reports should be good enough for that. To modify bug report text generation a new patch would be better, after the current is finished. Maybe all single numbers in all allowed ranges that are 1, 2 (or 3?) long (like the 1,2,4,5,6 in the example) can be collected, this makes report generation more complicated than the current way.

balazske updated this revision to Diff 504185.Mar 10 2023, 9:39 AM

Change format of bug reports.
Now the problem is shown first, then the acceptable values.
Sometimes the messages can become too verbose (in case of
"should be NULL") or grammatically not totally correct,
I can not tell if this is acceptable but this was the most
simple implementation.

Please run this on open source projects and upload the results.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
179–181

I know what you mean, but this could use an example.

183–185

This sentence makes so sense, unfortunately :( Could you rephrase, and, again, support it with an example?

820

Looking at the tests, this adds very little how about this:
" that is out of the accepted range; It should be " -> ", but should be "

Do you agree? I won't die on this hill, and am willing to accept this is good as-is if you really think that it is.

The other case is fine.

824

Did you mean to leave this here?

clang/test/Analysis/std-c-library-functions-arg-constraints-notes.cpp
18

This english is broken, but more importantly, this is the one scenario where the original message was just good enough -- in fact, better. How about either

  1. Restoring the original message (by somehow excluding NotNullConstraint in the new describe())
  2. Or saying something like "The 1st argument to '__not_null' is NULL, but should be non-NULL"
balazske updated this revision to Diff 510816.Apr 4 2023, 8:33 AM
balazske marked 3 inline comments as done.

I decided to add back DescriptionKind and make possible to use a message like
"should not be NULL". Now all generated strings in functions describe and
describeArgumentValue start with "should be" or "is" to make this consistent.
The messages are now in form "is ... but should be ..." which sounds at some times
too trivial but acceptable as generated message.

Szelethus accepted this revision.EditedApr 11 2023, 1:44 AM

LGTM! Most of the diagnostic messages are short but precise. I like this very much. Well done! Please commit.

Balázs actually tested the change on open source projects, but accidentally uploaded it to our internal server, so I have seen them, and they look good from the diagnostic message standpoint.