This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][IDF] Add a control dependency calculator + a new debug checker
ClosedPublic

Authored by Szelethus on May 29 2019, 12:17 PM.

Details

Summary

I intend to improve the analyzer's bug reports by tracking condition expressions.

01 bool b = messyComputation();
02 int i = 0;
03 if (b) // control dependency of the bug site, let's explain why we assume val to be true
04   10 / i; // warn: division by zero

I'll detail this heuristic in the followup patch, strictly related to this one however:

  • Create the new ControlDependencyCalculator class that uses llvm::IDFCalculator to (lazily) calculate control dependencies for Clang's CFG.
  • A new debug checker debug.DumpControlDependencies is added for lit tests
  • Add unittests

Diff Detail

Repository
rL LLVM

Event Timeline

Szelethus created this revision.May 29 2019, 12:17 PM

You can easily get CD by calculating the PostDominanceFrontier. LLVM implements a templated IDF (Iterated Dominance Frontier) construction.
A native implementation for llvm ir for reference, if you need:

Paper: R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck. Efficiently
computing static single assignment form and the control dependence graph. ACM
Trans. Program. Lang. Syst., 13(4):451–490, Oct. 1991.

Szelethus edited the summary of this revision. (Show Details)May 29 2019, 12:40 PM

You can easily get CD by calculating the PostDominanceFrontier. LLVM implements a templated IDF (Iterated Dominance Frontier) construction.
A native implementation for llvm ir for reference, if you need:

Paper: R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck. Efficiently
computing static single assignment form and the control dependence graph. ACM
Trans. Program. Lang. Syst., 13(4):451–490, Oct. 1991.

Wow, thank you so much! This looks like a far superior but not a much more complicated solution then my own :) And I came to realize it may not even be correct.

Szelethus planned changes to this revision.May 29 2019, 12:44 PM

I seem to have made good progress on this, although it did require a lot of code changes on LLVM side (basically turning BasicBlock * to template arguments). Here's a sample:

//                               <----------------------------------
//                              /              <-----------------   \
//                             /              /                  \   \
// [B12 (ENTRY)] -> [B11] -> [B10]-> [B9] -> [B8] ---> [B7] -> [B6]  |
//                             |      \        \                     /
//                             |       \        -----> [B2] --------/
//                             |        \      /
//                             |          -> [B5] -> [B4] -> [B3]
//                             |               \              /
//                             |                <------------
//                              \
//                               -> [B1] -> [B0 (EXIT)]

Control dependencies  (Node, Dependency):
(2,10)
(3,5)
(3,9)
(3,10)
(4,5)
(4,9)
(4,10)
(5,9)
(5,5)
(5,10)
(6,8)
(6,9)
(6,10)
(7,8)
(7,9)
(7,10)
(8,9)
(8,8)
(8,10)
(9,10)
(10,10)

My solution is inspired by

, and I really like where this is going. I am yet to figure out how to deal with Clang's CFG containing null pointers in a not-too-invasive way (will probably end up doing something similar to ChildrenGetter), as it currently crashes.

Also, I read some of the article you showed as well as the one I found on the dominance frontier file documentation[1], and I feel a lot more enlightened about the subject, thanks! I'll spend more time on them before wrapping this up.

[1]
Sreedhar and Gao. A linear time algorithm for placing phi-nodes.
In Proceedings of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
POPL '95. ACM, New York, NY, 62-73.

I did not check the algorithm as you planned changes to that. I only did a quick review of the interface which might also be rendered obsolete once you update this patch.

clang/include/clang/Analysis/Analyses/Dominators.h
191 ↗(On Diff #202017)

Is it sensible to have a non-const reference to the DomTree? Why would the user want to modify this? I think do not really do transformations on the CFG once it is built.

193 ↗(On Diff #202017)

Same as above.

While I managed to create a version of llvm::IDFCalculator that both compiles and works well with clang::CFGBlock *, I'm a little unsure about the specifics. I sent out a mail to reach a wider audience about the next steps: http://lists.llvm.org/pipermail/llvm-dev/2019-June/132786.html

Szelethus updated this revision to Diff 204972.Jun 16 2019, 3:35 PM
Szelethus retitled this revision from [analyzer][Dominators] Add a control dependency tree builder + a new debug checker to [analyzer][Dominators] Add a control dependency calculator + a new debug checker.
Szelethus edited the summary of this revision. (Show Details)

Now using llvm::IDFCalculator.

Szelethus retitled this revision from [analyzer][Dominators] Add a control dependency calculator + a new debug checker to [analyzer][IDF] Add a control dependency calculator + a new debug checker.Jun 16 2019, 3:44 PM
Szelethus edited the summary of this revision. (Show Details)Jun 17 2019, 3:06 AM
kuhar added inline comments.Jun 17 2019, 8:10 AM
clang/include/clang/Analysis/Analyses/Dominators.h
51 ↗(On Diff #204972)

DomTree has a constructor that runs the builder -- why not use it directly?

160 ↗(On Diff #204972)

Do you need it at all? I understand it's a wrapper around dominators, but this API is virtually impossible to use safely.

225 ↗(On Diff #204972)

I thinks it should be A is control dependent on B

Szelethus marked 7 inline comments as done.

Fixes according to reviewer comments.

clang/include/clang/Analysis/Analyses/Dominators.h
51 ↗(On Diff #204972)

That is correct, but DomTreeBase does not. :/

160 ↗(On Diff #204972)

I agree 100%, I'll get rid of this in another patch.

Szelethus updated this revision to Diff 205917.Jun 20 2019, 3:50 PM

Rebase on previous patches.

Gentle ping :^)

kuhar added a comment.Jul 2 2019, 7:43 AM

The non-static-analyzer bits look good to me, I added a few nits.

clang/include/clang/Analysis/Analyses/Dominators.h
230 ↗(On Diff #205917)

If the virtual function is introduced by ManagesAnalysis, isn't it better to use override here?

235 ↗(On Diff #205917)

Can it be const?

252 ↗(On Diff #205917)

Can it be const?

257 ↗(On Diff #205917)

Can dump be const?

257 ↗(On Diff #205917)

In LLVM, dumps are usually annotated with the LLVM_DUMP_METHOD attribute and not compiled in release builds. Is the convention different in the static analyzer?

NoQ added a comment.Jul 2 2019, 10:55 AM

Static Analyzer bits look great to me as well!

clang/include/clang/Analysis/Analyses/Dominators.h
257 ↗(On Diff #205917)

LLVM_DUMP_METHOD is great. Hiding dump methods under #ifndef NDEBUG is something i've seen very rarely. It's fairly annoying to me that exploded graph dumps are unavailable in release builds, but apart from that i don't have any immediate opinion, so this sounds like a global LLVM policy that we're historically not paying much attention to, but i don't mind complying.

clang/lib/StaticAnalyzer/Checkers/DebugCheckers.cpp
90 ↗(On Diff #205917)

CaPiTaLiZe VaRiAbLeS.

(*doesn't really care*)

Can you formally accept then? :) (I'll address inlines before commiting of course!)

The non-static-analyzer bits look good to me, I added a few nits.

Thank you! This part of the project was fear of mine, and I'm super grateful for all the guidance you have given!

NoQ accepted this revision.Jul 2 2019, 1:09 PM

Sure!

This revision is now accepted and ready to land.Jul 2 2019, 1:09 PM
This revision was automatically updated to reflect the committed changes.
Szelethus marked 11 inline comments as done.
Herald added a project: Restricted Project. · View Herald TranscriptJul 5 2019, 5:17 AM
Szelethus added inline comments.Jul 6 2019, 12:07 PM
clang/include/clang/Analysis/Analyses/Dominators.h
230 ↗(On Diff #205917)

I admit to have copy-pasted this from the class above, and, well, it isn't introduced by ManagedAnalysis. Which begs the question why it was made virtual at all.

Nice catch!

235 ↗(On Diff #205917)

IDFCalculator takes it's arguments by a non-const pointer. I guess I could fix that too on LLVM's side eventually. The method itself retrieves control dependencies lazily, and might modify the state of this class.

252 ↗(On Diff #205917)

Same.

257 ↗(On Diff #205917)

Cant be const for the same reasons.