This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Add support for WebKit "unified sources".
ClosedPublic

Authored by NoQ on Apr 19 2018, 2:59 PM.

Details

Summary

Since recently, WebKit uses a peculiar build system that compiles multiple translation units at once by automatically joining them into a bigger file via #include. Because none of the functions end up in the main file, we disable all our path-sensitive checks on such unified sources. Try to work around that.

Diff Detail

Repository
rL LLVM

Event Timeline

NoQ created this revision.Apr 19 2018, 2:59 PM
NoQ updated this revision to Diff 143195.Apr 19 2018, 4:54 PM

Hopefully a more portable way of testing this stuff.

include/clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h
148 ↗(On Diff #143195)

Although not very common, but .cxx is also a possibly extension for C++ source files.

lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
837 ↗(On Diff #143195)

Maybe we should include a test for container methods as well.

include/clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h
144 ↗(On Diff #143195)

Is this if really necessary? This logic has too much overfitting, and it seems that if someone decides to include .cc files, we should analyze them in any case, right? We also would prefer to not stop working if webkit decides on using a different naming for those.

lib/StaticAnalyzer/Core/CallEvent.cpp
975 ↗(On Diff #143195)

You would shave off a bit of redundancy and verbosity by saving AnalysisManager in a local variable.

lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp
837 ↗(On Diff #143195)

+1

lib/StaticAnalyzer/Core/PathDiagnostic.cpp
153 ↗(On Diff #143195)

Assert error message is not quite technically correct now

test/Analysis/unified-sources/source1.cpp
8 ↗(On Diff #143195)

Wow, expected-* directives work across multiple files?? This is really cool!

NoQ updated this revision to Diff 143396.Apr 20 2018, 2:43 PM
NoQ marked 3 inline comments as done.

Address most comments.

include/clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h
144 ↗(On Diff #143195)

This is indeed an act of overfitting. But also there are very few reasons to include a non-header file, and all of them are pretty exotic. I'm not sure we want to analyze these files in all cases. So i want to play safe until we gather more data.

148 ↗(On Diff #143195)

Yup, thanks!

test/Analysis/unified-sources/source1.cpp
8 ↗(On Diff #143195)

-verify works over preprocessed files, so yeah, these directives respect #ifs (which we regularly abuse) and #includes (which we rarely abuse). This test also acts as a normal test, but the UnifiedSource-1.cpp test is the one that'll fail if we break the newly added functionality.

majnemer added inline comments.
include/clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h
145–147 ↗(On Diff #143396)

C++ source code is also found in files which end in .C, this code will match against strange file endings like .cXx and .mM

I think the above logic should be changed to match https://github.com/llvm-mirror/clang/blob/master/lib/Frontend/FrontendOptions.cpp#L27

NoQ updated this revision to Diff 143416.Apr 20 2018, 5:12 PM
NoQ marked an inline comment as done.

A more accurate extension check.

include/clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h
145–147 ↗(On Diff #143396)

Aha, yeah, thanks, that's the place i was looking for.

include/clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h
145–147 ↗(On Diff #143396)

Why not just use the included function then? It's static.

144 ↗(On Diff #143195)

I would still say that just analyzing included c++ files is a lesser evil.

NoQ marked 2 inline comments as done.Apr 23 2018, 12:52 PM
NoQ added inline comments.
include/clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h
145–147 ↗(On Diff #143396)

Because it has a different purpose:

for now it doesn't discriminate between code and header files

probinson added inline comments.
include/clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h
144 ↗(On Diff #143195)

Agreed. WebKit is not the only project that does this kind of thing.

NoQ added inline comments.Apr 23 2018, 2:31 PM
include/clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h
144 ↗(On Diff #143195)

The question is, are all such projects in fact interested in having the analyzer analyze the respective code?

For instance, if the included code is autogenerated by an external tool, users are probably not interested in finding bugs in such code because they will be unable to fix them.

Are you interested in providing a list of examples of projects that do this kind of thing, explain why are they doing it, and whether they will be interested in having warnings in included files?

NoQ added inline comments.Apr 23 2018, 2:47 PM
include/clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h
144 ↗(On Diff #143195)

Note that we're not talking about warnings that are *emitted* in the included file. We're talking about warnings that *originate* within the included file (and as such most likely stay within the included file and its includes). This makes a difference because we are doing inter-procedural analysis. I.e., if we find that passing a null pointer in the main file into a header function causes a null dereference in the header function, the report that originates within the main file will be emitted in the header with auxiliary path notes in the main file.

So if a small chunk of code is included but the main file still contains reasonable stuff, we'll be able to find most bugs in the included code as it is being used from the rest of the file. The problem only arises when all code is included.

Hi Artem. Cool patch!

Maybe we should create an analyzer option for this? So, all users who want webkit-like behaviour will just enable (or disable, don't know what's better) the option and get the feature. I know that introducing a new option is undesirable but it seems to be a good compromise here.

probinson added inline comments.Apr 24 2018, 6:44 AM
include/clang/StaticAnalyzer/Core/PathSensitive/AnalysisManager.h
144 ↗(On Diff #143195)

Game teams are known to use "unity" builds, developing their code in separate files then having a master .cpp that will #include everything in a given library/component. This separates source into reasonable sized and logically coherent units, but the unity build improves build time for these large projects. If a team is using static analysis, they will want it on the entire compilation.

NoQ added a comment.Apr 24 2018, 12:39 PM

Aha, ok, yeah, that sounds like a lot, thank you. I think i'll follow up with a separate commit that will enable first-level-code-file-include analysis in all files under an on-by-default -analyzer-config flag, would that make sense?

In D45839#1077258, @NoQ wrote:

Aha, ok, yeah, that sounds like a lot, thank you. I think i'll follow up with a separate commit that will enable first-level-code-file-include analysis in all files under an on-by-default -analyzer-config flag, would that make sense?

Works for me, thanks!

This revision was not accepted when it landed; it landed in state Needs Review.Apr 25 2018, 2:56 PM
This revision was not accepted when it landed; it landed in state Needs Review.
This revision was automatically updated to reflect the committed changes.
This revision was automatically updated to reflect the committed changes.