This is an archive of the discontinued LLVM Phabricator instance.

Include checker name in Static Analyzer PLIST output
ClosedPublic

Authored by xazax.hun on Jan 5 2015, 5:55 AM.

Details

Summary

Include checker name in Static Analyzer PLIST output. If the checker_name is not a suitable key, search and replace in the diff should work. It is useful to include the checker name in the plist output of the Static Analyzer. It makes it easier for 3rd party tools to consume the plist

Diff Detail

Repository
rL LLVM

Event Timeline

xazax.hun updated this revision to Diff 17793.Jan 5 2015, 5:55 AM
xazax.hun retitled this revision from to Include checker name in Static Analyzer PLIST output.
xazax.hun updated this object.
xazax.hun edited the test plan for this revision. (Show Details)
xazax.hun added a reviewer: zaks.anna.
xazax.hun added a subscriber: Unknown Object (MLST).
xazax.hun added a subscriber: xazax.hun.
zaks.anna edited edge metadata.Jan 5 2015, 11:00 AM

Could you provide more information on how these would be used? The idea is that the "category" and "type" should provide sufficient information to categorize the bugs.

Hi Anna,

I was working on tools, that made it more user friendly to use the static
analyzer. One of the features was to mark what are the reports that are new
in a revision. To identify reports, we calculated a hash of the environment
of the final position of the report and the name of the checker. We found
it to be more robust to use the checker names, for that reason, because
they are less likely to change. (Changing either type or category would
mean that all of the hashes would change for that checker and our tool
would report all of its output as new.) We also stored the reports in a
database, and the name of the checker was a natural filtering/sorting
option.

Regards,
Gábor

Another example: there is a web frontend to view the reports that are stored in the database. The user can jump to the documentatcion of a specific checker, and the frontend generate the link to the documentation based on the name of the checker that generated to report.

I don't believe the checker name should be used for bug identification. The checker names are implementation detail. The bug message/name and category are better for this. If we think that we might be changing the names of categories, we might come up with some kind of a stable ID.

For example, this patch, which is currently in review (http://reviews.llvm.org/D6178), will move the implementation of a set of warnings from one place/checker to another.

I'd suggest to only use the bug message and it's location. (We should already do that in the CmpRuns script.)

What do you think?

The checker names are not purely implementation details, because the user refers to them, when determining which checkers she wants to run. I also think that, the plist itself is also an implementation detail: I think the users are not supposed to read them, but consume them with tools.

When you mention location, you mean the environment? In a new snapshot of the source code the locations may differ, and we do not want to show those reports in the comparision that are just relocated.

If we identify bugs using messages, once a message changes (because some rewording, or just adding some more information to it), the comparisons will show the reports that have that message as new. I think the checker names are still a bit more stable.

I was working on a web viewer for reports generated by the static analyzer. One of the features of the web viewer was, to be able to jump to the documentation of the checker that reported a specific issue. We had several design rules that we wanted to validate, and the checker documentation containd the design rule, to make it easier for developers to mark false positives. Of course it is possible, to generate the links to the documentation based on the message of the report, but it involves more work to both implement and maintain, since one checker can report multiple messages.

All in all, we found it useful to have this information in the Plist output, but it is also possible that information is redundant for others. We needed a unique id for the checkers and we used their names. You think that categories should be used as unique IDs?

It seems wrong to me to associate the bug descriptions with checker names. I think BugType is more appropriate. Also, I don't think the checker name should be part of identifying the issue.

Said that, we do not have a very good design for issue tracking and we do not guarantee locking either category, bug type, the message, or the checker name. I'll talk to Jordan and Ted to see what they think about this.

Looks like our issue identification is only referring to the issue location (see utils/analyzer/CmpRuns.py).

Hi Gabor,

Jordan has pointed out that a similar discussion has occurred when the getCheckName method has been added to the diagnostic. There are several write-ups from Ted and Jordan about issue tracking there as well. (See the "Future directions for the analyzer" thread on cfe-dev as well as the review comments for the patch that added the getCheckName method. It's about a year old.)

Note that we do not guarantee that CheckName will never change. (At some point, we will want to clean up our naming of the checks.) We also do not guarantee that the CheckName will be the same as the checker name (these should be the names of bug types, not of the implementation module). On the other hand, we do not expect to continuously change these.

Since the getCheckName is already part of the diagnostic, it's fine to add it to the plist output. The name of the field does need to change to check_name. (This highlights that these are not guaranteed to be checker names.)

What do you think?

If you are ok with this, could you rename the field and I'll commit the patch.

Thanks!
Anna.

xazax.hun updated this revision to Diff 18078.Jan 13 2015, 1:26 AM
xazax.hun edited edge metadata.

Hi Anna,

The diff have been updated to use 'check_name' instead of 'checker_name'. I am ok with the change, and thank you very much for your help.

Gábor

Hi Anna,

If you approve the patch I can commit it too,

Thanks,
Gábor

Please, commit!
Anna.

This revision was automatically updated to reflect the committed changes.