This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer] Fix Bug 25609 - Assertion UNREACHABLE: 'Unexpected ProgramPoint' with widen-loops=true
ClosedPublic

Authored by MTC on Aug 26 2017, 8:42 PM.

Details

Summary

In the original design of the analyzer, it is assumed that the BlockEntrance doesn't create a new binding on the Store, But this assumption isn't true when 'widen-loops' sets true.

Regards,

Henry Wong

Diff Detail

Repository
rL LLVM

Event Timeline

MTC created this revision.Aug 26 2017, 8:42 PM
MTC edited the summary of this revision. (Show Details)Aug 26 2017, 8:51 PM
MTC updated this revision to Diff 112814.Aug 27 2017, 4:03 AM
This comment was removed by MTC.
szepet added a subscriber: szepet.Aug 27 2017, 8:48 PM
MTC updated this revision to Diff 112892.Aug 28 2017, 6:40 AM

Sorry, remove irrelevant code updates.

NoQ added inline comments.Aug 28 2017, 7:27 AM
lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
651–652 ↗(On Diff #112892)

This is user-facing text, and users shouldn't know about conjured symbols, and "max" shouldn't be shortened, and i'm not sure what else. I'd probably suggest something along the lines of "Contents of <...> are wiped", but this is still not good enough.

Also could you add a test that displays this note? I.e. with -analyzer-output=text.

MTC updated this revision to Diff 112969.Aug 28 2017, 2:21 PM

Add the test file and modify the description in the analyzer output.

MTC marked an inline comment as done.Aug 28 2017, 6:57 PM
MTC added inline comments.
lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
651–652 ↗(On Diff #112892)

Thanks for your review.

You are right, whether this information should be displayed to the user is a question worth discussing.

szepet edited subscribers, added: cfe-commits; removed: szepet.

Hello MTC,

Thanks for working on this! I planned to add a change like this in D36690 but it worths an individual patch (did not know it was a reported bug).
Just some thoughts:

  • I think the current display information is ambiguous. If I did not know the code then I would not understand what this stands for. (That is just my opinon.) I think Artem's "Contents of <...> are wiped" idea is better but I would go with something like this: "Invalidated previously known information on <...>". (Maybe remove the word 'previously' if its too long.) It still can be weird for the user that why this happened but at least in case of a false positive he/she can see that why this was even considered by the analyzer.
  • Right now there is a "race condition" between your patch and D36690. So in order to avoid "conflicts" I'd ask you to add some variable changing effect to the body of the loop. For example in the last test case the variable 'num' is the one which you use to show the effect of widening. In this case a line like num++ or num = i would ensure that the more precise widening invalidates it as well. Other thing is that handling loops which contains pointer operation or nested loops will be a little bit more conservative, so you these will not be widened like now. (However, test_for_bug_25609()) still should be valid.
  • Please upload the diff with context (git flag: -U99999) since it is really helpful for the review.
  • +1 inline comment below.
lib/StaticAnalyzer/Core/PathDiagnostic.cpp
694 ↗(On Diff #112969)

I think it would be more correct to use the location what is used in case of the BlockEdge. (So on the entranced block terminator condition.) The reason is because the BlockEntrance display message will be displayed before the message of the BlockEdge (since it is an "earlier" node in the ExplodedGraph). So it would result that if check these notes in a viewer then the earlier note would belong to the later location which could be confusing.

MTC marked an inline comment as done.Aug 30 2017, 7:13 PM

Hi peter,

Thank you very much for your help.

  • I think the current display information is ambiguous. If I did not know the code then I would not understand what this stands for. (That is just my opinon.) I think Artem's "Contents of <...> are wiped" idea is better but I would go with something like this: "Invalidated previously known information on <...>". (Maybe remove the word 'previously' if its too long.) It still can be weird for the user that why this happened but at least in case of a false positive he/she can see that why this was even considered by the analyzer.

Yes, that's the problem. The information displayed to the user should be clear, but not difficult to understand.

  • Right now there is a "race condition" between your patch and D36690. So in order to avoid "conflicts" I'd ask you to add some variable changing effect to the body of the loop. For example in the last test case the variable 'num' is the one which you use to show the effect of widening. In this case a line like num++ or num = i would ensure that the more precise widening invalidates it as well. Other thing is that handling loops which contains pointer operation or nested loops will be a little bit more conservative, so you these will not be widened like now. (However, test_for_bug_25609()) still should be valid.

I'll update the test file,:D.

  • Please upload the diff with context (git flag: -U99999) since it is really helpful for the review.

About ‘git diff’, that's my mistake, and I'll add this option next time. Thank you very much for pointing it out!

  • +1 inline comment below.
MTC updated this revision to Diff 113427.Aug 31 2017, 10:09 AM

(1) Modify the description of the bug report
(2) Update loop-widening-notes.c
(3) PathDiagnosticLocation::create() - Use the location of TerminatorCondition.

MTC marked an inline comment as done.Aug 31 2017, 10:13 AM
MTC added inline comments.
lib/StaticAnalyzer/Core/PathDiagnostic.cpp
694 ↗(On Diff #112969)

Yes, it would be better to use the location of the TerminatorCondition :D.

szepet edited edge metadata.Aug 31 2017, 10:28 AM

Thanks for the update!
It looks good to me! (Probably somebody else (most likely NoQ) will have some additional comment but I think it's great!)

MTC marked 3 inline comments as done.Sep 5 2017, 2:18 AM
zaks.anna added inline comments.Oct 19 2017, 11:05 PM
lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
651–652 ↗(On Diff #112892)

I am not convinced that we need to print this information to the user. The problem here is that it leaks internal implementation details about the analyzer. The users should not know about "loop limits" and "invalidation" and most of the users would not even know what this means. I can see how this is useful to the analyzer developers for debugging the analyzer, but not to the end user.

xazax.hun added inline comments.Oct 20 2017, 4:35 AM
lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
651–652 ↗(On Diff #112892)

While we might not want to expose this to the user this can be really useful to understand what the analyzer is doing when we debugging a false positive finding. Maybe it would be worth to introduce a developer mode or verbose mode for those purposes. What do you think?

zaks.anna added inline comments.Oct 20 2017, 9:25 AM
lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
651–652 ↗(On Diff #112892)

I'd be fine with that in theory, though the downside is that it would pollute the code a bit. One trick that's often used to better understand a report when debugging is to remove the path note pruning (by passing a flag). I am not sure if that approach can be extended for this case. Ex: maybe we could have special markers on the notes saying that they are for debug purposes only and have the pruning remove them.

By the way, is this change related to the other change from this patch?

xazax.hun added inline comments.Oct 25 2017, 2:36 AM
lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
651–652 ↗(On Diff #112892)

I agree that we should just commit this without the message to fix the assert. The "debug message" could be addressed in a separate patch.

MTC updated this revision to Diff 120265.Oct 25 2017, 8:50 AM
MTC marked 3 inline comments as done.

The message about invalidate variable values is temporarily not printed. This work can be done with separate patch.

MTC updated this revision to Diff 120401.Oct 26 2017, 6:13 AM
MTC marked an inline comment as done.

Split the long "expected" line into multiple lines.

MTC added a comment.Nov 1 2017, 7:33 AM

ping?

lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
651–652 ↗(On Diff #112892)

Thank you for your review!
Agree with you, and Artem also recommends not exposing too much internal implementation to the user.

651–652 ↗(On Diff #112892)

Thank you, Gábor!
That's a good point. I think this idea requires a little bit of work to be done, and it should be done by someone more familiar with BugReport or PathDiagnostic, :).

651–652 ↗(On Diff #112892)

Thanks, anna!
The output here is just to make it better for the user to figure out what's going on here. Delete the output information does not affect the other part, but the output information is changed from "Reach the maximum loop limit. Invalidate the value of 'num'" becomes the original output information "value assigned to 'p'".

651–652 ↗(On Diff #112892)

Done!

xazax.hun accepted this revision.Nov 14 2017, 7:17 AM

LGTM! But wait for @dcoughlin, @zaks.anna , or @NoQ before commit.

This revision is now accepted and ready to land.Nov 14 2017, 7:17 AM
dcoughlin added inline comments.Nov 18 2017, 3:41 PM
lib/StaticAnalyzer/Core/PathDiagnostic.cpp
694 ↗(On Diff #112969)

Thanks for looking into fixing this.

I don't think using the terminator condition of the entered block is the right thing to do. The terminator is at the *end* of a basic block and this program point represents the entrance to the block. I think it is better to use the location corresponding to the first element in in the entered block.

MTC updated this revision to Diff 123783.Nov 21 2017, 7:06 AM

Update diff, use the SourceLocation of the first element of the entered block as the argument of PathDiagnosticLocation.

MTC marked an inline comment as done.Nov 21 2017, 7:18 AM
MTC added inline comments.
lib/StaticAnalyzer/Core/PathDiagnostic.cpp
694 ↗(On Diff #112969)

Thank you, dcoughlin!

I have updated the diff. The first element kind I can think of is only Stmt and NewAllocator. I don't know if it's enough?

dcoughlin added inline comments.Nov 24 2017, 6:57 AM
lib/StaticAnalyzer/Core/PathDiagnostic.cpp
695 ↗(On Diff #123783)

You can use getAs<> in the if guard condition to make this more concise:

if (auto StmtElt = BlockFront.getAs<CFGStmt>()) {
  return PathDiagnosticLocation(StmtElt->getStmt()->getLocStart(), SMng);
}
694 ↗(On Diff #112969)

It would be good to add an llvm_unreachable() assertion expressing this expectation so that we'll get an assertion failure if this turns out not to be enough. (Or if we add a new CFGElement kind in the future).

MTC updated this revision to Diff 124248.EditedNov 25 2017, 1:21 AM
MTC marked an inline comment as done.

1.Use the getAs<> in the if condition.
2.Add an "Unexpected ProgramPoint" assertion to make this patch more complete.

MTC marked 2 inline comments as done.Nov 25 2017, 1:23 AM
dcoughlin accepted this revision.Nov 27 2017, 8:33 AM

LGTM, but as noted inline you should update the new llvm_unreachable() to have a more descriptive error message.

Do you have commit access or do you need someone to commit this for you?

lib/StaticAnalyzer/Core/PathDiagnostic.cpp
701 ↗(On Diff #124248)

This assertion should probably say something like "Unexpected CFG element at front of block".

MTC updated this revision to Diff 124549.Nov 28 2017, 5:25 AM

Update the llvm_unreachable's description of the BlockEntrance-branch from "Unexpected ProgramPoint" to "Unexpected CFG element at front of block".

MTC marked an inline comment as done.Nov 28 2017, 5:29 AM

Hi dcoughlin,

Thank you very much for your help. I have no commit access and hope someone can commit it on my behalf. Thanks a lot!

This revision was automatically updated to reflect the committed changes.