This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer] Include typedef statements in CFG build.
ClosedPublic

Authored by balazske on Apr 9 2020, 8:52 AM.

Details

Summary

Array size expressions in typedef statements with a VLA
(variable-length array) are handled from now as in plain
(non-typedef) VLA declarations.
Type-aliases with VLA are handled too
(but main focus is on C code).

The fix is useful to have the values used at VLA sizes
available for the analyzer and checkers.
The code works similar as CodeGenFunction::emitDecl
that processes typedef and type-alias VLA size expressions
at place of the typedef or type-alias statement.

Diff Detail

Event Timeline

balazske created this revision.Apr 9 2020, 8:52 AM

There must be a way to dump the CFG. Perhaps you could add a test that dumps the CFG of a given source and then by using FileCheck we could see the exact feature that is missing.

There must be a way to dump the CFG. Perhaps you could add a test that dumps the CFG of a given source and then by using FileCheck we could see the exact feature that is missing.

debug.DumpCFG to the rescue!

$ clang -cc1 -analyzer-checker-help-developer | grep "debug.DumpCFG"
  debug.DumpCFG                 Display Control-Flow Graphs
efriedma added a subscriber: efriedma.

Looks roughly fine to me, but I'm not an appropriate reviewer for static analyzer code.

clang/lib/Analysis/CFG.cpp
2859

Fix this comment?

I looked at the graphical CFG dump and observed that the added code makes the same parts appear that are there in a VLA declaration but missing if a VLA typedef is used.

I looked at the graphical CFG dump and observed that the added code makes the same parts appear that are there in a VLA declaration but missing if a VLA typedef is used.

Could add a test case those with the CFG dumper debug checker? clang/test/Analysis/cfg.cpp is a great example.

balazske updated this revision to Diff 257330.Apr 14 2020, 7:40 AM
  • Added tests.
  • Updated a comment.
balazske marked an inline comment as done.Apr 14 2020, 7:43 AM

Without the new code the CFG dump for vla_typedef contains:

[B1]
   1: VLA vla;
Szelethus added a subscriber: NoQ.

Adding @NoQ and @xazax.hun , I don't have much experience building CFGs.

clang/lib/Analysis/CFG.cpp
2855

How about using? How about some other shenanigans that obscure the size of the VLA? Can't we just retrieve the size from VariableArrayType::getSizeExpr after seeing this VarDecl, and add that to the block?

xazax.hun requested changes to this revision.Apr 15 2020, 6:37 AM

I am not an expert when it comes to VLAs but I do see some problems here.

First of all, we do not want to include typedef statements in the CFG as they are noops in terms of the execution. We definitely want to include the evaluation of array size expressions for VLAs. In case the evaluation of that expression is not included in the CFG that is definitely a bug but the name of the revision should reflect this.

My questions so far:

  • Where is the size expression actually evaluated? Is it evaluated at the point of the typedef or at the point of the variable definition? It is possible to construct code that behaves differently in those cases. In case the behavior is the latter the currently produced CFG is wrong.
  • You mentioned that type aliases are not handled. Why? Is there any technical reason for that? Is it a language restriction? Does it behave differently? I think, unless it is really hard to support it well, I would prefer to introduce full support in one patch. Otherwise rewriting a code to use type aliases instead of typedefs would change the behavior of the analyzer on that code which is quite surprising for the users.

I think this needs more testing, i.e. what about more complex size expressions and multiple typedefs/type aliases? Can we have multidimensional VLAs?

clang/lib/Analysis/CFG.cpp
2843

Can't we have the same situation with type aliases? In that case, maybe we should check for TypedefNameDecl instead.

This revision now requires changes to proceed.Apr 15 2020, 6:37 AM

Where is the size expression actually evaluated? Is it evaluated at the point of the typedef or at the point of the variable definition?

At the point of the typedef.

NoQ added a comment.EditedApr 15 2020, 4:11 PM

Where is the size expression actually evaluated? Is it evaluated at the point of the typedef or at the point of the variable definition?

At the point of the typedef.

Whoa, indeed. Nice language.

I also see that the typedef constitutes a DeclStmt in the AST. Can we include that in the CFG as well? In the static analyzer that'll be the place where we actually start tracking that typedef, because without it the size expression value will be quickly cleaned up.

I also see that the typedef constitutes a DeclStmt in the AST. Can we include that in the CFG as well? In the static analyzer that'll be the place where we actually start tracking that typedef, because without it the size expression value will be quickly cleaned up.

This is the point where I am not sure, how to add the DeclStmt itself? Simply call addStmt or call autoCreateBlock and appendStmt as it is done later in the function?

balazske marked an inline comment as done.Apr 16 2020, 12:04 AM
balazske added inline comments.
clang/lib/Analysis/CFG.cpp
2855

The current code does that, it gets the size with getSizeExpr and adds it. The loop is needed to support multiple array dimensions.

To support every other type that can contain VLA a similar code is needed as in CodeGenFunction::EmitVariablyModifiedType (go over the type chain and look not only for VariableArrayType). The current code works only if there is typedef (or type alias) with a plain VLA, maybe embedded into another.
If support for every other case is needed, it is a change on its own because the code that generates CFG for VLA declaration is wrong too (later in the same function, the new code is copied from that). For example void (*vla)(int[x]); does not work now (x is not included in CFG). This is better to do in a later patch, for now this more restricted functionality is better than nothing (it is not wrong, only do not support all, probably seldom used, cases).

xazax.hun added inline comments.Apr 16 2020, 1:41 AM
clang/lib/Analysis/CFG.cpp
2855

I'd prefer to have some FIXME comments in this patch to document what is missing. Moreover, it would be great to have a test case for void (*vla)(int[x]); with a similar FIXME comment of the expected behavior.

balazske updated this revision to Diff 258011.Apr 16 2020, 4:10 AM
  • Adding tests.
  • Type alias is supported.
  • Updated comments.
aaron.ballman added inline comments.Apr 16 2020, 4:45 AM
clang/test/Analysis/cfg.cpp
570 ↗(On Diff #258011)

Can you also add tests for:

int vla_unevaluated(int x) {
  // Evaluates the ++x
  sizeof(int[++x]);
  // Does not evaluate the ++x
  _Alignof(int[++x]);
  _Generic((int(*)[++x])0, default : 0);
  
  return x;
}

Also, it's a bit strange to find these VLA tests in a .cpp file. VLAs are a C++ extension and we should have some testing for them for constructs that are C++-specific, but I would have expected many of these tests to be in a C file instead.

balazske marked an inline comment as done.Apr 17 2020, 12:05 AM
balazske added inline comments.
clang/test/Analysis/cfg.cpp
570 ↗(On Diff #258011)

Could be there platforms (on buildbot?) where the VLA code (in cpp file) does not compile? Is it worth to have a test with VLA type alias then?

balazske marked an inline comment as done.Apr 17 2020, 12:28 AM
balazske added inline comments.
clang/test/Analysis/cfg.cpp
570 ↗(On Diff #258011)

I have this output for the test above (without the new code), is it correct?
(The ++ seems to be evaluated for the alignof statement too.)

int vla_unevaluated(int x)
 [B2 (ENTRY)]
   Succs (1): B1

 [B1]
   1: x
   2: ++[B1.1]
   3: [B1.2] (ImplicitCastExpr, LValueToRValue, int)
   4: sizeof(int [++x])
   5: x
   6: ++[B1.5]
   7: [B1.6] (ImplicitCastExpr, LValueToRValue, int)
   8: alignof(int [++x])
   9: 0
  10: x
  11: [B1.10] (ImplicitCastExpr, LValueToRValue, int)
  12: return [B1.11];
   Preds (1): B2
   Succs (1): B0

 [B0 (EXIT)]
   Preds (1): B1
NoQ added inline comments.Apr 17 2020, 2:49 AM
clang/test/Analysis/cfg.cpp
570 ↗(On Diff #258011)

Could be there platforms (on buildbot?) where the VLA code (in cpp file) does not compile? Is it worth to have a test with VLA type alias then?

That's unlikely; the problems with VLAs in C++ are IIRC about the bureaucracy of the Standard (i.e., they couldn't solve enough cornercases, like whether T<int[x]> and T<int[y]> the same template when x == y at run-time?), not about platform support.

aaron.ballman added inline comments.Apr 17 2020, 5:08 AM
clang/test/Analysis/cfg.cpp
570 ↗(On Diff #258011)

Could be there platforms (on buildbot?) where the VLA code (in cpp file) does not compile? Is it worth to have a test with VLA type alias then?

I think the type alias test is a good idea (as are any other C++-specific tests) because we support VLAs in C++ mode as an extension, my concern was more that this is a C feature and so if I was hunting for tests for VLAs, I'd look in a .c file first. It's not a huge concern, more like a style nit.

(The ++ seems to be evaluated for the alignof statement too.)

That seems wrong per C17 6.5.3.4p3. It also doesn't seem to match the behavior I'm seeing: https://godbolt.org/z/rVaGnb

balazske updated this revision to Diff 259513.Apr 23 2020, 3:52 AM
balazske marked 5 inline comments as done.

Updated tests.

xazax.hun accepted this revision.Apr 23 2020, 4:29 AM

Thanks, this looks much better now.
Could you also update the description of the revision to match the current status? (E.g. type aliases are now supported.)
If you do not plan to solve the alignof issue in the near future, maybe it would be worth to also open a bug report.
Please wait for one more approval before commiting.

This revision is now accepted and ready to land.Apr 23 2020, 4:29 AM
balazske added inline comments.Apr 23 2020, 5:29 AM
clang/test/Analysis/cfg.cpp
570 ↗(On Diff #258011)

There is a cfg.cpp test file but not a cfg.c so the new tests are added to the cpp file. I do not like to make a new cfg.c because it will contain a part of the VLA tests only because the type alias cases. It is better to have all VLA tests in the same file.

The case in vla_unevaluated is probably wrong but an independent problem. This change would be only about adding the VLA related parts.

balazske edited the summary of this revision. (Show Details)Apr 23 2020, 5:35 AM
aaron.ballman added inline comments.Apr 23 2020, 6:00 AM
clang/test/Analysis/cfg.cpp
570 ↗(On Diff #258011)

There is a cfg.cpp test file but not a cfg.c so the new tests are added to the cpp file. I do not like to make a new cfg.c because it will contain a part of the VLA tests only because the type alias cases. It is better to have all VLA tests in the same file.

I don't see why it matters having all VLA tests in the same file when the file name has nothing to do with VLAs, but if you insist, I can hold my nose.

The case in vla_unevaluated is probably wrong but an independent problem. This change would be only about adding the VLA related parts.

I'm not super familiar with the analyzer parts of thing, but I'm not certain why this is an independent problem. I thought the purpose to this patch was to ensure VLAs are evaluated when necessary in otherwise unevaluated contexts. The VLA within _Alignof should not be evaluated, so it seems related to this patch. That said, I'm fine if it's handled in a follow-up patch if that's easier.

Actually, sorry. I just realized that the alignof problem is introduced by this patch. I'd love to see the solution committed together with this patch (it could be a separate patch but preferably, they should be committed together.)

Actually, sorry. I just realized that the alignof problem is introduced by this patch. I'd love to see the solution committed together with this patch (it could be a separate patch but preferably, they should be committed together.)

I do not see that the alignof problem is introduced here. If this patch is ignored the alignof problem still exists. I have found the problematic code, for every kind of UnaryExprOrTypeTraitExpr the VLA size is evaluated if there is a VLA. It looks easy to fix, just add a few lines and tests for it. Still it is a not related problem.

balazske updated this revision to Diff 259584.Apr 23 2020, 8:47 AM
  • Separating test files, fixing alignof problem.
aaron.ballman accepted this revision.Apr 23 2020, 9:14 AM

LGTM with a minor nit, thank you!

clang/lib/Analysis/CFG.cpp
2855

newBlock -> NewBlock per coding standard.

efriedma added inline comments.Apr 23 2020, 12:51 PM
clang/lib/Analysis/CFG.cpp
2922

You probably want a different example in the FIXME: we don't look in at function arguments when determining if a type is variably modified. (The rule is written in a sort of weird way in the standard, but essentially, we only need to look at the element type of an array, the pointee of a pointer, and the return type of a function.)

xazax.hun accepted this revision.Apr 23 2020, 2:19 PM

Overall looks good for me, thanks for tackling this problem! I think this should be good to go once Eli's comment is fixed.

balazske updated this revision to Diff 259820.Apr 24 2020, 12:27 AM
balazske marked an inline comment as done.
  • Separating test files, fixing alignof problem.
balazske added inline comments.Apr 24 2020, 2:14 AM
clang/lib/Analysis/CFG.cpp
2855

Another thing is to follow the style of the source code. In the same function newBlock is used at other places, it would look even worse (or not?) to change just that single new occurrence to NewBlock.

aaron.ballman added inline comments.Apr 24 2020, 4:22 AM
clang/lib/Analysis/CFG.cpp
2855

I don't insist on the change, but when the local code is already inconsistent with its naming conventions like this is, I think new code should follow the coding standard unless there's a strong reason to deviate. This way, over time, the file converges more and more on following the coding standard instead of getting more and more inconsistent.

This revision was automatically updated to reflect the committed changes.