Title says it, but there is still place for further improvements.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Sure, this is an improvement because we display more information, but I'd argue that it isn't really a more readable warning message :) How about
<argno>th argument to the call to <function name>
- cannot be represented with a character
- is a null pointer
- ...
, which violates the function's preconditions.
WDYT?
+1 Warning messages should be as clear as possible. Although our users are programmers, they do not necessarily understand a message "Range constraint not satisfied.". Maybe the error messages could be stored together with the function summaries.
| clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
|---|---|---|
| 323 | Instead of std::string we usually use llvm::SmallString and an llvm::raw_svector_ostream to print into it. | |
| clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
|---|---|---|
| 323 | This would look better? | |
Could you add some tests demonstrating the refined diagnostics of the checker?
It would help to validate the quality of the emitted diagnostics.
| clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
|---|---|---|
| 88 | Shouldn't we use using instead? | |
| 323 | Maybe llvm::Twine would be a better choice to llvm::raw_svector_ostream for string concatenations. 
 | |
I'd rather prefer @balazske's approach (see below) as that is similarly expressive but sparser.
| clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
|---|---|---|
| 88 | Yes, we should. But it's legacy code from the times long before. | |
| 323 | We can use a Twine to create the owning std::string without creating interim objects. But then we must pass an owning object to the ctor of PathSensitiveBugReport, otherwise we might end up with dangling pointers.  | |
- Use Twine
| clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
|---|---|---|
| 323 | Yes, I like this. | |
Tests c:
I'm still not a huge fan of the warning message. Now it describes what kind of constraint was violated, but not how (too large? too small?). Also, while I respect that you didn't prefer my suggestion, I think the warning message there is still a lot more welcoming, even if it doesn't convey more information. But, I don't necessarily insist on it.
There is a TODO comment at the error message, it needs improvement. The current form is still something for developers, not for end users. For final version I would accept textual descriptions (as mentioned above), not names like "BufferSize" and words like "ArgN" inside it.
Ah, okay, silly me. Didn't catch that. Then the message is OK for now.
edit: Describing how the violation happened might be a valuable for development purposes as well.
What if we'd add a toString method to the constraints and we'd add this to Msg? This way we'd know the contents of the constraint, thus we we'd know how the constraint is violated.
I mean we'd know what is not satisfied. But, to know why exactly that is not satisfied we should dump the whole State but that's obviously not an option. Perhaps we could track which symbols and expressions are participating in the assumption related to the constraint and we could dump only those, but this seems to be a very complex approach.
I realize that the how and why phrases in this context a bit too vague :) What do you mean under having to dump the whole State? I didn't mean to compress a bug path into a warning message, only what I mentioned in D79431#2020951. In any case, I think its okay to just move on with this patch. LGTM!
First, thanks for the review!
Second, "dumping the whole State" is an overstatement. What I meant is this: to properly explain to the user how the constraint is failed is a hard task. Because we should dig up all symbols and expressions that are related to the constraint and we should dump their values. E.g. consider the violation of a range constraint here,
ssize_t send(int socket, const void *buffer, size_t length, int flags);
void foo() {
  if (socket < 0)
    send(socket, buf, buflen, flags); // constraint violation for 'socket' it should be in [0, IntMax]
}We should print something like this: 'socket' it should be in [0, IntMax] but it is in [IntMin, -1]
In this example, the related symbol is only socket but this could be more complex, e.g. if the constraint is related to a buffer size ...
Shouldn't we use using instead?