This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][IDF] Add a control dependency calculator + a new debug checker
ClosedPublic

Authored by Szelethus on May 29 2019, 12:17 PM.

Details

Summary

I intend to improve the analyzer's bug reports by tracking condition expressions.

01 bool b = messyComputation();
02 int i = 0;
03 if (b) // control dependency of the bug site, let's explain why we assume val to be true
04   10 / i; // warn: division by zero

I'll detail this heuristic in the followup patch, strictly related to this one however:

  • Create the new ControlDependencyCalculator class that uses llvm::IDFCalculator to (lazily) calculate control dependencies for Clang's CFG.
  • A new debug checker debug.DumpControlDependencies is added for lit tests
  • Add unittests

Diff Detail

Event Timeline

Szelethus created this revision.May 29 2019, 12:17 PM

You can easily get CD by calculating the PostDominanceFrontier. LLVM implements a templated IDF (Iterated Dominance Frontier) construction.
A native implementation for llvm ir for reference, if you need:

Paper: R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck. Efficiently
computing static single assignment form and the control dependence graph. ACM
Trans. Program. Lang. Syst., 13(4):451โ€“490, Oct. 1991.

Szelethus edited the summary of this revision. (Show Details)May 29 2019, 12:40 PM

You can easily get CD by calculating the PostDominanceFrontier. LLVM implements a templated IDF (Iterated Dominance Frontier) construction.
A native implementation for llvm ir for reference, if you need:

Paper: R. Cytron, J. Ferrante, B. K. Rosen, M. N. Wegman, and F. K. Zadeck. Efficiently
computing static single assignment form and the control dependence graph. ACM
Trans. Program. Lang. Syst., 13(4):451โ€“490, Oct. 1991.

Wow, thank you so much! This looks like a far superior but not a much more complicated solution then my own :) And I came to realize it may not even be correct.

Szelethus planned changes to this revision.May 29 2019, 12:44 PM

I seem to have made good progress on this, although it did require a lot of code changes on LLVM side (basically turning BasicBlock * to template arguments). Here's a sample:

//                               <----------------------------------
//                              /              <-----------------   \
//                             /              /                  \   \
// [B12 (ENTRY)] -> [B11] -> [B10]-> [B9] -> [B8] ---> [B7] -> [B6]  |
//                             |      \        \                     /
//                             |       \        -----> [B2] --------/
//                             |        \      /
//                             |          -> [B5] -> [B4] -> [B3]
//                             |               \              /
//                             |                <------------
//                              \
//                               -> [B1] -> [B0 (EXIT)]

Control dependencies  (Node, Dependency):
(2,10)
(3,5)
(3,9)
(3,10)
(4,5)
(4,9)
(4,10)
(5,9)
(5,5)
(5,10)
(6,8)
(6,9)
(6,10)
(7,8)
(7,9)
(7,10)
(8,9)
(8,8)
(8,10)
(9,10)
(10,10)

My solution is inspired by

, and I really like where this is going. I am yet to figure out how to deal with Clang's CFG containing null pointers in a not-too-invasive way (will probably end up doing something similar to ChildrenGetter), as it currently crashes.

Also, I read some of the article you showed as well as the one I found on the dominance frontier file documentation[1], and I feel a lot more enlightened about the subject, thanks! I'll spend more time on them before wrapping this up.

[1]
Sreedhar and Gao. A linear time algorithm for placing phi-nodes.
In Proceedings of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
POPL '95. ACM, New York, NY, 62-73.

I did not check the algorithm as you planned changes to that. I only did a quick review of the interface which might also be rendered obsolete once you update this patch.

clang/include/clang/Analysis/Analyses/Dominators.h
197

Is it sensible to have a non-const reference to the DomTree? Why would the user want to modify this? I think do not really do transformations on the CFG once it is built.

199

Same as above.

While I managed to create a version of llvm::IDFCalculator that both compiles and works well with clang::CFGBlock *, I'm a little unsure about the specifics. I sent out a mail to reach a wider audience about the next steps: http://lists.llvm.org/pipermail/llvm-dev/2019-June/132786.html

Szelethus updated this revision to Diff 204972.Jun 16 2019, 3:35 PM
Szelethus retitled this revision from [analyzer][Dominators] Add a control dependency tree builder + a new debug checker to [analyzer][Dominators] Add a control dependency calculator + a new debug checker.
Szelethus edited the summary of this revision. (Show Details)

Now using llvm::IDFCalculator.

Szelethus retitled this revision from [analyzer][Dominators] Add a control dependency calculator + a new debug checker to [analyzer][IDF] Add a control dependency calculator + a new debug checker.Jun 16 2019, 3:44 PM
Szelethus edited the summary of this revision. (Show Details)Jun 17 2019, 3:06 AM
kuhar added inline comments.Jun 17 2019, 8:10 AM
clang/include/clang/Analysis/Analyses/Dominators.h
51

DomTree has a constructor that runs the builder -- why not use it directly?

160

Do you need it at all? I understand it's a wrapper around dominators, but this API is virtually impossible to use safely.

225

I thinks it should be A is control dependent on B

Szelethus marked 7 inline comments as done.

Fixes according to reviewer comments.

clang/include/clang/Analysis/Analyses/Dominators.h
51

That is correct, but DomTreeBase does not. :/

160

I agree 100%, I'll get rid of this in another patch.

Szelethus updated this revision to Diff 205917.Jun 20 2019, 3:50 PM

Rebase on previous patches.

Gentle ping :^)

kuhar added a comment.Jul 2 2019, 7:43 AM

The non-static-analyzer bits look good to me, I added a few nits.

clang/include/clang/Analysis/Analyses/Dominators.h
230

If the virtual function is introduced by ManagesAnalysis, isn't it better to use override here?

235

Can it be const?

252

Can it be const?

257

Can dump be const?

257

In LLVM, dumps are usually annotated with the LLVM_DUMP_METHOD attribute and not compiled in release builds. Is the convention different in the static analyzer?

NoQ added a comment.Jul 2 2019, 10:55 AM

Static Analyzer bits look great to me as well!

clang/include/clang/Analysis/Analyses/Dominators.h
257

LLVM_DUMP_METHOD is great. Hiding dump methods under #ifndef NDEBUG is something i've seen very rarely. It's fairly annoying to me that exploded graph dumps are unavailable in release builds, but apart from that i don't have any immediate opinion, so this sounds like a global LLVM policy that we're historically not paying much attention to, but i don't mind complying.

clang/lib/StaticAnalyzer/Checkers/DebugCheckers.cpp
90

CaPiTaLiZe VaRiAbLeS.

(*doesn't really care*)

Can you formally accept then? :) (I'll address inlines before commiting of course!)

The non-static-analyzer bits look good to me, I added a few nits.

Thank you! This part of the project was fear of mine, and I'm super grateful for all the guidance you have given!

NoQ accepted this revision.Jul 2 2019, 1:09 PM

Sure!

This revision is now accepted and ready to land.Jul 2 2019, 1:09 PM
This revision was automatically updated to reflect the committed changes.
Szelethus marked 11 inline comments as done.
Herald added a project: Restricted Project. ยท View Herald TranscriptJul 5 2019, 5:17 AM
Herald added a subscriber: llvm-commits. ยท View Herald Transcript
Szelethus added inline comments.Jul 6 2019, 12:07 PM
clang/include/clang/Analysis/Analyses/Dominators.h
230

I admit to have copy-pasted this from the class above, and, well, it isn't introduced by ManagedAnalysis. Which begs the question why it was made virtual at all.

Nice catch!

235

IDFCalculator takes it's arguments by a non-const pointer. I guess I could fix that too on LLVM's side eventually. The method itself retrieves control dependencies lazily, and might modify the state of this class.

252

Same.

257

Cant be const for the same reasons.