Page MenuHomePhabricator

[analyzer][StdLibraryFunctionsChecker] Add NoteTags for applied arg constraints
Needs ReviewPublic

Authored by martong on Apr 29 2021, 5:45 AM.



In this patch I add a new NoteTag for each applied argument constraint.
This way, any other checker that reports a bug - where the applied
constraint is relevant - will display the corresponding note. With this
change we provide more information for the users to understand some
bug reports easier.

Diff Detail

Unit TestsFailed

380 msx64 windows > lld.MachO::reproduce.s
Script: -- : 'RUN: at line 3'; rm -rf C:\ws\w32-1\llvm-project\premerge-checks\build\tools\lld\test\MachO\Output\reproduce.s.tmp.dir

Event Timeline

martong created this revision.Apr 29 2021, 5:45 AM
martong requested review of this revision.Apr 29 2021, 5:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 29 2021, 5:45 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
steakhal added inline comments.Apr 29 2021, 7:43 AM

This way each and every applied constraint will be displayed even if the given argument does not constitute to the bug condition.
I recommend you branching within the lambda, on the interestingness of the given argument constraint.


BTW I raised my related concerns about this elsewhere.


I was puzzled for a moment on why do you have two notes here.
By checking the definition of __arg_constrained_twice(), I can see that it has two ArgConstraints.

Although, it shouldn't be a problem as on the UI we would visualize notes for only a single bugreport. So it is probably clear that both of the notes are valid and correspond to the given statement. It might look clunky in the LIT test, but should be 'somewhat' readable in real life.

You should take no action here. I leave this comment just for the record.

NoQ added inline comments.Apr 29 2021, 4:45 PM

This has to be a user-friendly message.

  • "Constraints" is compiler jargon.
  • We cannot afford shortening "argument" to "arg".
  • Generally, the less machine-generated it looks the better (":" is definitely robotic).

This isn't part of this patch but what do you think about {-1} U [0, 255]? Or, you know, [-1, 255].

martong planned changes to this revision.Apr 30 2021, 2:46 AM
martong marked 4 inline comments as done.
martong added inline comments.

Okay, good point, thanks for the feedback! I am planning to change to this direction.


Okay, thanks for your comment. I can make it to be more similar to the other notes we already have. What about this?

Assuming the 1st argument is within the range [1, 1]

We cannot afford shortening "argument" to "arg".

I'd like to address this in another following patch if you don't mind.


Yeah, good idea, {-1} U [0, 255] would be indeed nicer and shorter. However, [-1, 255] is hard(er) so I am going to start with the former in a follow up patch.

NoQ added inline comments.May 2 2021, 6:31 PM

Excellent catch @steakhal!

I think you can always emit the note but only mark it as unprunable when the argument is interesting. This way it'd work identically to our normal "Assuming..." notes.


This sounds good for a generic message. I still think that most of the time these messages should be part of the summary. Eg.,

Assuming the 1st argument is within range [33, 47] U [58, 64] U [91, 96] U [123, 125]

ideally should be rephrased as

Assuming the argument is a punctuation character

in the summary of ispunct().

martong marked 4 inline comments as done.May 3 2021, 2:14 AM
martong added inline comments.

I think you can always emit the note but only mark it as unprunable when the argument is interesting. This way it'd work identically to our normal "Assuming..." notes.

IsPrunable is a const member in NoteTag. So, we have to decide about prunability when we call getNoteTag. To follow your suggestion, we should decide the prunability dynamically in TagVisitor::VisitNode. This would require some infrastructural changes in NoteTag. We could add e.g. another Callback member that would be able to decide the prunability with the help of a BugReport&. I am okay to go into that direction, but it should definitely be separated from this patch (follow-up). I am not sure if it is an absolutely needed dependency for this change, is it? (If yes then I am going to create the dependent patch first).

martong marked 2 inline comments as done.May 3 2021, 2:20 AM
martong added inline comments.

Yes, absolutely, good idea. It makes sense to provide another member for the Summary that could specifically describe the function specific assumptions (or violations). However, before we would be able to go through all functions manually to create these specific messages we need a generic solution to have something that is more descriptive than the current solution.

martong updated this revision to Diff 342342.May 3 2021, 2:58 AM
martong marked an inline comment as done.
  • Add the description to the note tag only if the SVal is interesting
  • Use 'Assuming the nth arg is in ...' form for the descriptions
steakhal added inline comments.May 3 2021, 4:28 AM

I don't know. Report message construction always seemed clunky.
Clang's or ClangTidy's approach seems superior in this regard.

Do we have anything better for this @NoQ?
Maybe llvm::format() could be an option.

Regarding this patch: It's fine. Better than it was before!


Ah, there is a slight issue.
You should mark some stuff interesting here, to make this interestingness propagate back transitively.

Let's say ArgSVal is x + y which is considered to be out of range [42,52]. We should mark both x and y interesting because they themselves could have been constrained by the StdLibChecker previously. So, they must be interesting as well.

On the same token, IMO PathSensitiveBugReport::markInteresting(symbol) should be transitive. So that all SymbolData in that symbolic expression tree are considered interesting. What do you think @NoQ?
If we were doing this, @martong - you could simply acquire the assumption you constructed for the given ValueConstraint, and mark that interesting. Then all SymbolDatas on both sides of the logic operator would become implicitly interesting.