This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer] Fix bug report source locations in minimal output.
Needs ReviewPublic

Authored by balazske on Jul 16 2020, 9:12 AM.

Details

Summary

Fix of the following problem:
If the bug report equivalence class contains multiple
reports and no (minimal) analyzer output was requested
it can happen that the wrong location is used for the warning.

Diff Detail

Event Timeline

balazske created this revision.Jul 16 2020, 9:12 AM

Lot of tests are failing because changed warning locations.
These will be updated if the code change looks good.

NoQ added a comment.Jul 16 2020, 10:11 AM

These will be updated if the code change looks good.

Hard to tell; this entire code is too convoluted, i'd rather look at the changes. Can you update at least some tests?

balazske updated this revision to Diff 278742.Jul 17 2020, 5:51 AM
balazske marked an inline comment as done.

Fixed some source location changes in tests.
Re-added check for empty path.

Big part of the test failures is with the osx.cocoa.RetainCount checker. Only a small part of the errors is fixed now.

clang/lib/Analysis/PathDiagnostic.cpp
369

This change is needed to eliminate crash in tests at assert(b.hasValue());.

I find the summary here a bit lacking. We detected this issue in D83120, where a lot of discussion can be found, so its worth linking in. On its own, I'm not immediately sold on whether this is the correct solution, and even if it is, how does it solve the problem. I bet you had to struggle a bit to understand the related machinery to fix this, it'd be invaluable to share the knowledge you gained here as well.

I took a look myself, and the issue fixed here seems to be that PathDiagnostic's constructor doesn't set the associated PathDiagnosticLocation itself, and generateEmptyDiagnosticForReport pretty much resolves to that. The thing I'm still struggling with, is that I'm not terribly sure whether this the same issue raised in the previous patch.

Fix of the following problem:
If the bug report equivalence class contains multiple
reports and no (minimal) analyzer output was requested
it can happen that the wrong location is used for the warning.

I have two burning questions about this:

  • Are we sure that we used the correct (with the shortest bug path) bug report, but associated it with the wrong location? Mind that everything you touched here, as I understand it, affects sorting, not uniqueing.
  • If so, some (even if incorrect) location must've been used, where did that come from?

Big part of the test failures is with the osx.cocoa.RetainCount checker.

Every time :^)

clang/lib/StaticAnalyzer/Core/BugReporter.cpp
3007

Is this the *only* instance when that happens? How about the text-minimal output?

balazske marked an inline comment as done.Jul 20 2020, 6:04 AM

The problem in D83120 is fixed by this patch.
What I figured out from the code is that BugReporter::FlushReport calls findReportInEquivalenceClass that returns a report that has not necessary the shortest path. That report is get from the passed bug report class and a different instance is returned if the uniqueing is turned on or off. This report is used to fill the last location in the bug path, if the bug path is empty. So the location in the bug path changes dependent on the setting of uniqueing. The bug path here is empty if the analyzer-output is set to minimal (or not specified). Even in the minimal output case, function PathDiagnosticBuilder::generate is called if it is a path sensitive case. In that function the shortest path is used. So instead of returning an empty path from that function (what is filled later by FlushReport) the last part of the path can be filled here (and use the correct source location from the shortest path).
The problem can happen if not the report with shortest path is returned from findReportInEquivalenceClass, and analyzer-output is set to minimal or default. If the output is not minimal, the bug path is filled by the PathDiagnosticBuilder::generate with correct locations. So changing the output type can result in change of the warning location.
Other way to get the problem is change only the uniqueing setting in bug type and use minimal (or default) output. Then the example report can change and because this example report is used to get the bug location the location can change too (if the equivalence class contains paths with different end locations).

clang/lib/StaticAnalyzer/Core/BugReporter.cpp
3007

I am not totally sure on this but inserted an assert and all the tests passed without triggering it.
Is the text-minimal different from the default (when to --analyzer-output is given at all)?
I think it works like this: Function PathSensitiveBugReporter::generateDiagnosticForConsumerMap is called from here, then if it is not a BasicBugReport the function PathDiagnosticBuilder::generate will be called that is fixed now to return always with non-empty path.

NoQ added inline comments.Jul 21 2020, 12:44 PM
clang/test/Analysis/CGColorSpace.c
8–11

This change doesn't look expected to me. It's not "we've found a report on a different path" (there's only one path here), it's "we're reporting the same report in a different location" (previously it was the uniqueing location, now it's the end-of-path location).

That said, RetainCountChecker is special because it has its own PathSensitiveBugReport sub-class with custom behavior. I think we should get regular checkers like MallocChecker right first. Are there more changes on tests on regular checkers? Or is the one in malloc-plist.c the only one?

clang/test/Analysis/malloc-plist.c
137–140

This sounds like an expected change: we're now displaying the same report on a different path. Except it's the longer path rather than the shorter path, so it still looks suspicious.

balazske marked an inline comment as done.Jul 22 2020, 6:53 AM
balazske added inline comments.
clang/test/Analysis/malloc-plist.c
137–140

This location may be wrong but it is then another problem. The important thing is that after this patch the same location is used for a warning in text-minimal mode as it is in text mode. Without the patch in text-minimal mode a different location is used for this warning, the one at the end of the function. But still in text mode (without the patch) the location at y++ is shown. (The warning in function function_with_leak4 in the same test is already at a similar location, not at the end of function.)

Every other test failure comes from RetainCount checker except malloc-plist.c.

balazske updated this revision to Diff 280149.Jul 23 2020, 8:50 AM

Change column in malloc-plist test because code was reformatted.

NoQ added a comment.Jul 24 2020, 9:44 AM

Every other test failure comes from RetainCount checker except malloc-plist.c.

Aha, ok. So, anyway, for retain count checker we ultimately only care about plist and html reports, not about text reports. It's also probably easier to write more precise tests after your patch: test functions are unlikely to have multiple possible uniqueing locations, and even if there are they can be discriminated between by warning message text.

But i'd still rather have it explained why does your patch affect the location of RefLeakReports in order to make sure we understand what we're doing. This consequence of the patch wasn't intended - what other unintended consequences does it have? Are we even sure that plist/html reports don't change? Is RefLeakReport even implemented correctly from our current point of view? Or, regardless of correctness, do we want its current implementation to have the old behavior or the new behavior?

I should probably investigate this myself from the fairness/justice point of view but you'll probably land your patch faster if you don't wait on me to get un-busy with other stuff :/ As a bonus, you might be able to get away without updating all the tests if you find out that this change is accidental and not really intended :) Note that you don't need to know Objective-C or know much about the checker to investigate these things; say, CGColorSpace.c is a pure C test with a straightforward resource leak bug. The customizations in RefLeakReport aren't really checker-specific either - we simply didn't ever make up our mind on whether they should apply to all leak reports or to none; they have visual consequences on plist reports though.

clang/test/Analysis/malloc-plist.c
137–140

Oh, wait, the other path isn't in fact shorter. In both cases it leaks immediately after the if-statement, and the path with y++ is in fact a bit shorter (it immediately hits PreStmtPurgeDeadSymbols for the DeclRefExpr whereas the other path hits CallExitBegin first, because the call is inlined).

So it's a good and expected change!

balazske added a comment.EditedJul 27 2020, 7:43 AM

From the beginning on in this patch I assumed that the location of the bug report should be the end of the bug path. But this is probably a wrong approach, specially if the bug report has uniqueing location. In that case the uniqueing location may be a better place for the bug report. Specially for resource leak it is not bad if the report is located at the allocation site (still there is a bug path that shows the full path and place of leak). So another approach can be to allow only bug equivalence classes where the reports have same locations. If this is not true the checker should be fixed that generates the reports.

The behavior with RefLeakReport is correct if the indicated location in minimal output mode should be the end of the bug path. Otherwise it is not correct, the report has a different location (allocation site) than the end of the path (return point) and we want to see the allocation site in minimal mode (or not?). Still there is problem, in text output mode the summary line indicates not the correct place but the end of the bug path:
before the patch:

$ clang -cc1 -internal-isystem ~/clang/llvm1/build/Debug/lib/clang/11.0.0/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,osx.cocoa.RetainCount -analyzer-store=region ~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c
~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c:9:23: warning: Potential leak of an object stored into 'X' [osx.cocoa.RetainCount]
  CGColorSpaceRef X = CGColorSpaceCreateDeviceRGB(); // expected-warning{{leak}}
                      ^
1 warning generated.

$ clang -cc1 -internal-isystem ~/clang/llvm1/build/Debug/lib/clang/11.0.0/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,osx.cocoa.RetainCount -analyzer-store=region ~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c -analyzer-output text
~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c:11:1: warning: Potential leak of an object stored into 'X' [osx.cocoa.RetainCount]
}
^
~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c:9:23: note: Call to function 'CGColorSpaceCreateDeviceRGB' returns a Core Foundation object of type 'CGColorSpaceRef' with a +1 retain count
  CGColorSpaceRef X = CGColorSpaceCreateDeviceRGB(); // expected-warning{{leak}}
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c:10:3: note: Reference count incremented. The object now has a +2 retain count
  CGColorSpaceRetain(X);
  ^~~~~~~~~~~~~~~~~~~~~
~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c:11:1: note: Object leaked: object allocated and stored into 'X' is not referenced later in this execution path and has a retain count of +2
}
^
1 warning generated.

after the patch:

$ clang -cc1 -internal-isystem ~/clang/llvm1/build/Debug/lib/clang/11.0.0/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,osx.cocoa.RetainCount -analyzer-store=region ~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c
~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c:11:1: warning: Potential leak of an object stored into 'X' [osx.cocoa.RetainCount]
}
^
1 warning generated.

$ clang -cc1 -internal-isystem ~/clang/llvm1/build/Debug/lib/clang/11.0.0/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,osx.cocoa.RetainCount -analyzer-store=region ~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c -analyzer-output text
~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c:11:1: warning: Potential leak of an object stored into 'X' [osx.cocoa.RetainCount]
}
^
~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c:9:23: note: Call to function 'CGColorSpaceCreateDeviceRGB' returns a Core Foundation object of type 'CGColorSpaceRef' with a +1 retain count
  CGColorSpaceRef X = CGColorSpaceCreateDeviceRGB(); // expected-warning{{leak}}
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c:10:3: note: Reference count incremented. The object now has a +2 retain count
  CGColorSpaceRetain(X);
  ^~~~~~~~~~~~~~~~~~~~~
~/clang/llvm1/llvm-project/clang/test/Analysis/CGColorSpace.c:11:1: note: Object leaked: object allocated and stored into 'X' is not referenced later in this execution path and has a retain count of +2
}
^
1 warning generated.

So it should be clarified how this should work, have the end of the bug path or the "location" of the bug report as display location of the bug report.

NoQ added a comment.Jul 28 2020, 1:17 AM

I believe i clarified above how this checker should work. For testing purposes (which is the only reason to ever use the minimal output) i prefer to have it at the end of path, because that tells more about the bug path and therefore more important to test.

The question i'm asking is more about API contracts of the PathSensitiveBugReport class over its virtual methods. Right now i don't care whether the entire system behaves correctly or not. I want you to explain how the change in one component caused a change in another component, so that to avoid introducing pairs of mutually cancelling bugs. In other words, this is a unit-test question, not an integration-test question.

Then the current solution is good, print always end of the bug path.
A change to the bug reporting component was made that caused reported position of bugs to change. New is the end of the path, old is the location set by the checker (BugReport::getLocation value). The location is often same as end of the path. The RefLeakReport has a special getLocation function and there is a difference between this location (that is the allocation site) and the end of the path.