This is an archive of the discontinued LLVM Phabricator instance.

[StaticAnalyzer] LoopWidening: Invalidate only the possibly changed regions
AcceptedPublic

Authored by szepet on Aug 14 2017, 9:15 AM.

Details

Summary

The current widening option in the Clang Static Analyzer invalidates (almost) all of the MemRegions when a loop reach its last visit (see: maxBlockVisitOnPath config) and then continue the analysis.

My aim is to create a solution where widening only invalidates the MemRegions which is possibly to affected by the loop. So in case of pointers it requires more effort to track the possible values which can be changed. In this initial patch only specific loops will be invalidated which does not contains (complex, therefore) unhandled statements. (e.g. pointer operations).

In the invalidation process we check the possibly changed variables via ASTMatchers.

In general case the difference between the two approach is:

widen-loops-oldInvalidate everything
widen-loops-newOnly invalidate modified variables

But there is another difference when there are pointers (more precisely if it encounters a statement which can result a modified variable but it is not handled yet):

widen-loops-oldInvalidate everything
widen-loops-newInvalidate nothing (don't widen)

Diff Detail

Event Timeline

szepet created this revision.Aug 14 2017, 9:15 AM
NoQ edited edge metadata.Aug 14 2017, 9:29 AM

Yay!

Do i understand correctly that your code does widening for simple loops but leaves complex loops intact, unless the old widening is also turned on?

Hmm, i guess some tests got lost on their way here.

Can matchers from LoopUnrolling.cpp be re-used anyhow?

include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
582

Not sure, but i suspect that your approach is actually less conservative than the old one(?) Like, it drops less information.

I think we should try to express more here - these approaches are very different.

lib/StaticAnalyzer/Core/LoopWidening.cpp
45–46

Do we actually put statements that aren't loops into this function?

szepet marked an inline comment as done.Aug 14 2017, 9:35 AM
szepet added inline comments.
include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
582

Yepp, I meant to say its more conservative from the aspect that it only widen specific loops. I will update it more clearly but if you have any suggestion on the name of the flag don't hesitate to say! :)

lib/StaticAnalyzer/Core/LoopWidening.cpp
45–46

Not really, just wanted to be consistent with the getLoopCondition function above. (I accidently marked it as Done and cant undo it. Would you like to remove this case?)

szepet updated this revision to Diff 111008.Aug 14 2017, 9:35 AM
szepet marked an inline comment as done.

Test cases added.

seaneveson edited edge metadata.Aug 14 2017, 10:28 AM

Nice! Thanks again for working on this.

Personally I would prefer modifying or replacing widen-loops, so we don't end up with two different approaches (and options) at the same time. Certainly that should be the end goal, but I think it might be worth doing now.

At a glance I would say that the behavior difference when there are no pointers is:

widen-loopsInvalidate everything
widen-loops-conservativeOnly invalidate modified variables

And when there are pointers the difference is:

widen-loopsInvalidate everything
widen-loops-conservativeInvalidate nothing

As far as I can tell widen-loops-conservative is better (in every sense) in the no pointer case. If we can agree on what to do when there are pointers (Invalidate everything or nothing), then we would only need one option. I would guess we want to invalidate nothing, which will probably give less false positives, rather than everything which probably gives more coverage.

lib/StaticAnalyzer/Core/LoopWidening.cpp
89

There are some more compound assignment operators:

a %= b
a &= b
a |= b
a ^= b
a <<= b
a >>= b

104

I think you could ignore const pointers/arrays here without too much effort, or would checking for casts make that a pain (and thus work for a future patch)?

110

Style nit, you could return Matches.empty();.

szepet edited the summary of this revision. (Show Details)Aug 24 2017, 4:26 AM
szepet added a comment.EditedAug 24 2017, 4:35 AM

Nice! Thanks again for working on this.

Personally I would prefer modifying or replacing widen-loops, so we don't end up with two different approaches (and options) at the same time. Certainly that should be the end goal, but I think it might be worth doing now.

Thanks for the comments and the summary of the differences! If you don't mind I've added it to the description. I believe its not the best to replace it since (as you pointed out) it behaves differently enough in cases which is not handled in the 'conservative' solution. So I am not sure about the users of the current widening option but it could result a drawback on its performance (eg.: on the coverage %).
But when further measurements are done (yepp, its on me) and you and the community in general is still OK with that change, then it will happen.

szepet updated this revision to Diff 112541.EditedAug 24 2017, 6:15 AM

Added a more general approach for collecting changed variables.

The main problem could be nested loops where the coverage is suboptimal. We do not invalidate the outer loop information so (in case of known bound) the execution path will be aborted because of the maxBlockVisitOnPath check. It results in not analyzing the code after the outer loop.

szepet marked 3 inline comments as done.Aug 24 2017, 6:16 AM
szepet updated this revision to Diff 112551.Aug 24 2017, 6:57 AM

Format test file.

I would still personally rather replace widen-loops, but if its just me I'm happy for it to be a separate option for now.

Otherwise I just have a few minor comments.

lib/StaticAnalyzer/Core/ExprEngine.cpp
1555–1559

The behavior when giving giving both flags is not obvious to the user. You might want to add something to the help or disallow doing both at the same time.

lib/StaticAnalyzer/Core/LoopWidening.cpp
63

Naming the function something like changedByIncrementOrDecrement or changedByUnaryOperator would make it clearer that this checks for ++ AND --.

test/Analysis/loop-widening-conservative.cpp
270

You should also test for modifications of the value being pointed at e.g. *p = 1; *p++ etc.

szepet updated this revision to Diff 112870.Aug 28 2017, 3:17 AM
szepet marked 2 inline comments as done.

Addressed comments.

szepet updated this revision to Diff 113004.Aug 28 2017, 6:22 PM
szepet retitled this revision from [StaticAnalyzer] Adding initial conservative loop widening config option to [StaticAnalyzer] LoopWidening: Invalidate only the possibly changed regions.
szepet edited the summary of this revision. (Show Details)

Update patch to not add a new flag but change the functionality of the old one. (If Sean still agrees.)

szepet edited the summary of this revision. (Show Details)Aug 28 2017, 6:27 PM

LGTM, Thanks :)

seaneveson accepted this revision.Sep 13 2017, 2:25 AM

I didn't mark this as accepted in case anyone else wanted/needed to review this before you put it in, but I'm happy with it.

This revision is now accepted and ready to land.Sep 13 2017, 2:25 AM
zaks.anna edited edge metadata.Sep 14 2017, 9:43 AM

Hi Peter,

The loop widening is a fundamental change to the analyzer and we'd like to discuss the overall design for this feature before committing patches. Can we start with design discussions on the list?

If I recall correctly, Devin thought that we need to build widening on top of better loop modeling in the CFG, which has not happened yet.

In your design, please include the details on how nested loops will be handled. Another particular issue that we anticipate is that the loop guard statement's location is arbitrary with respect to the place where the block count gets incremented. (These are some of the issues that would be addressed if this work was based on the proper loop modeling in the CFG.)