Page MenuHomePhabricator

[analyzer][RFC] Get info from the LLVM IR for precision
Needs ReviewPublic

Authored by martong on Aug 5 2020, 8:27 AM.

Details

Summary

This is a prototype to access the IR from the components of the Clang Static Analyzer.
This involves architectural changes and there is a related discussion here on cfe-dev: http://lists.llvm.org/pipermail/cfe-dev/2020-August/066426.html.

There are many important and useful analyses in the LLVM layer that we can use during the path
sensitive analysis. Most notably, the "readnone" and "readonly" function
attributes which can be used to identify "pure" functions (those without side
effects). Here, I am using the pureness info from the IR to avoid invalidation
of any variables during conservative evaluation (when we evaluate a pure
function).

Diff Detail

Event Timeline

martong created this revision.Aug 5 2020, 8:27 AM
martong requested review of this revision.Aug 5 2020, 8:27 AM
martong updated this revision to Diff 283254.Aug 5 2020, 8:32 AM
  • Remove dummy test file

What will happen with the ability to analyse a translation unit whose target was not part of LLVM_TARGETS_TO_BUILD of the current clang binary? Will it break, because the binary lacks the information on how to generate for the given target?

clang/include/clang/StaticAnalyzer/Core/IRContext.h
40–42

Aren't documentation comments in LLVM ///?

clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
127

Where's this CG defined?

clang/lib/StaticAnalyzer/Frontend/FrontendActions.cpp
52–56

Isn't there a lifetime issue here? LLVMCtx is given as a raw reference to the BuildCodeGen and the shared pointer leaves scope at the end of the branch.

clang/test/Analysis/ircontext.cpp
18–19

What are we testing here? Without the ability to read the pureness of foo from the IR, the knowledge about g's value would be scrapped at the call? foo is defined in the current TU.

martong marked 3 inline comments as done.Aug 5 2020, 9:14 AM
martong added inline comments.
clang/include/clang/StaticAnalyzer/Core/IRContext.h
40–42

Yep, I forgot about that.

clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
127

In AnalysisConsumer.h. And set in FrontendActions.cpp AConsumer->setCodeGen(CGConsumer.get());

clang/lib/StaticAnalyzer/Frontend/FrontendActions.cpp
52–56

Nope, LLVMCtx is a member of AnalysisAction.

clang/test/Analysis/ircontext.cpp
18–19

We test here that with no-inlining (i.e. always doing conservative evaluation) we do not invalidate the state after a call to a pure function.

martong marked 3 inline comments as done.Aug 5 2020, 9:17 AM

What will happen with the ability to analyse a translation unit whose target was not part of LLVM_TARGETS_TO_BUILD of the current clang binary? Will it break, because the binary lacks the information on how to generate for the given target?

We get the target info from the CompilerInstance that is used for the command line of the analysis. So if clang cannot handle that then we'll probably get a diagnostics anyway by the AnalysisASTConsumers diag engine.

rjmccall requested changes to this revision.Aug 5 2020, 11:36 AM

This seems a huge architectural change that we need to talk about.

This revision now requires changes to proceed.Aug 5 2020, 11:36 AM
NoQ added inline comments.Aug 5 2020, 11:45 AM
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
720

Before i forget: you still need to conjure the return value.

NoQ added a comment.Aug 5 2020, 11:47 AM

This seems a huge architectural change that we need to talk about.

Absolutely.

I believe the relevant discussion is at http://lists.llvm.org/pipermail/cfe-dev/2020-August/066426.html.

Thanks. It'd be a good idea to mention that this is contingent on that discussion in the patch summary.

This seems a huge architectural change that we need to talk about.

Yes, sure. I am open to and welcome any discussions and suggestions. This patch is just a prototype to demonstrate the usability and feasibility. As @NoQ indicated there is a relevant discussion in cfe-dev, maybe it's better to discuss there because of the bigger visibility.

martong retitled this revision from [analyzer] Get info from the LLVM IR for precision to [analyzer][RFC] Get info from the LLVM IR for precision.Aug 5 2020, 12:04 PM
martong edited the summary of this revision. (Show Details)
martong added inline comments.Aug 5 2020, 12:17 PM
clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
720

Good catch, thanks!

martong added inline comments.Aug 7 2020, 1:30 AM
clang/lib/StaticAnalyzer/Frontend/FrontendActions.cpp
30

TODO overwrite ALL optimization related config.

Artem:

we should not be taking -O flags into account at all, but pick some default -O2 regardless of flags; and ideally all flags should be ignored by default, to ensure experience as consistent as possible.

xazax.hun added inline comments.Aug 7 2020, 1:38 AM
clang/lib/StaticAnalyzer/Frontend/FrontendActions.cpp
30

Additional bit of info from the mailing list: relying on the set of optimizations from O2 might not suffice as it might contain passes with bad side effects. One example is removing a static function that was inlined to all of the call sites which would make us unable to query the analysis results for that function. Overall, we might not want to make the analysis dependent on inlining heuristics. (Or we might not care. This is up for discussion.)

martong updated this revision to Diff 284738.Aug 11 2020, 8:14 AM
  • Use custom, CSA specific pipeline
  • Rename runOptimizerPipeline to runPipeline
  • Add unittest for IRContext
    • Add BuildCodeGen to CSA's FrontendAction.h
martong marked 3 inline comments as done.Aug 11 2020, 8:38 AM
martong added inline comments.
clang/lib/StaticAnalyzer/Frontend/FrontendActions.cpp
30

CodeGenOptions CGO = CodeGenOptions(); will create an options object with all value set to the default.

martong updated this revision to Diff 284760.Aug 11 2020, 8:38 AM
  • Use /// comments in IRContext.h
  • Conjure the return value even if the funciton is pure
  • Set all CodeGen options to the default (except the opt level)