Page MenuHomePhabricator

[analyzer] NFC: Move IssueHash to libAnalysis.

Authored by NoQ on Sep 10 2019, 4:49 PM.



IssueHash is an attempt to introduce stable warning identifiers that won't change when code around them gets moved around. Path diagnostic consumers print issue hashes for the emitted diagnostics.

This move will allow us to ultimately move path diagnostic consumers to libAnalysis.

Diff Detail

Event Timeline

NoQ created this revision.Sep 10 2019, 4:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 10 2019, 4:49 PM

The "bugtype" in IssueHash is specific to Static Analyzer (or at least not documented sufficiently abstractly).

NoQ added a comment.Sep 11 2019, 11:23 AM

The "bugtype" in IssueHash is specific to Static Analyzer (or at least not documented sufficiently abstractly).

Yeah, i'll need to document this.

That's basically the warning message; it doesn't have to have anything to do with the Analyzer's BugType structure. But if the warning message contains fragile stuff (eg., mentions a name of a variable or (why??) a line number), the user may want to wipe it out of the hash, and in this case Analyzer BugTypes turn out to be fairly handy.

NoQ updated this revision to Diff 219824.Sep 11 2019, 4:00 PM

Rename BugType to WarningMessage, add comments.

Get rid of an unnecessary SourceManager parameter.

gribozavr added inline comments.Sep 12 2019, 2:52 AM

Returns an opaque identifier for a diagnostic.

This opaque identifier is intended to be stable even when the source code is changed. It allows to track diagnostics in the long term, for example, find which diagnostics are "new", maintain a database of suppressed diagnostics etc.

The identifier includes the following information:

  • the normalized source text...
  • ...

The identifier does not include the following information:

  • the name of the file,
  • the line and column number in the file,
  • ...

I don't understand what this paragraph is talking about at all. What does it mean for a new hash to be introduced? As in, when this hashing algorithm changes?


Why isn't IssueLoc const?


The doc comment suggests that this function returns a hash as a string, but that's not what it does. It returns some sort of identifier that is then hashed.

Szelethus added inline comments.Jul 14 2020, 6:15 AM

It might refer to the extreme headaches issue hash changes can cause. In D77866,a block of comments in MallocChecker.cpp talks about this, and some other discussion (D77866#2068394) in the patch as well.

NoQ updated this revision to Diff 281797.Jul 29 2020, 9:45 PM
NoQ marked 5 inline comments as done.


Re-do the comments. Rename GetIssueHash() to getIssueHashV1 because there's a high chance we'll want more different kinds of hashes (also starts with a small letter, while we're at it).


This paragraph was saying that if a different hashing algorithm is introduced (eg., GetIssueHashV2()), the old one should still be maintained for backwards compatibility.

Other then the naming issue, I don't see any problems with this change!


I'm not a big fan of things like this in names. First of all, numbers in names look bad. Second, it is not descriptive at all! I know, I know, it is hard to come up with a name that can capture how this hash or string algorithm is different from the other ones, when there are no other ones yet. And this actually brings up the third concern, why do even need to have this suffix now, when we don't have other options?

Let's maybe postpone it till we have the actual motivation to have another method, and give them good distinctive names then. What do you think?

NoQ updated this revision to Diff 283108.Aug 4 2020, 7:16 PM

Get rid of the "V1" suffix. Indeed, we'll just rename the API.

NoQ marked an inline comment as done.Aug 4 2020, 7:16 PM
vsavchenko accepted this revision.Aug 4 2020, 10:58 PM

Awesome, thanks!

This revision is now accepted and ready to land.Aug 4 2020, 10:58 PM
This revision was automatically updated to reflect the committed changes.