This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] ArrayInitLoopExpr with array of non-POD type.
ClosedPublic

Authored by isuckatcs on Jul 11 2022, 9:02 AM.

Details

Summary

At the moment we are unable to handle ArrayInitLoopExpr if the array it initializes is an array of non-POD types.
The expression is supposed to create a new array, and invoke the copy constructor for each element. The CFG only
contains the copy constructor, which is supposed to be invoked for each corresponding element, however it doesn't
contain the index of each element or a construction context.

The idea is to add a construction context to the copy ctor, so that after we evaluate it, the elements are placed in the
proper region of the store. Besides that in handleConstructor, we manually select which element of the array is going
to be passed to the ctor as parameter. Everything else is the same as it happens with a simple non-POD array construction,
in another patch, which this one depends on.

Depends on: https://reviews.llvm.org/D127973

Diff Detail

Event Timeline

isuckatcs created this revision.Jul 11 2022, 9:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 11 2022, 9:02 AM
isuckatcs requested review of this revision.Jul 11 2022, 9:02 AM
NoQ added a comment.Jul 11 2022, 9:10 PM

Once again very clean patch, thanks!! I have a few ideas on how to make it even neater.

clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
507

Maybe turn this function into a member function of VariableConstructionContext? That'll make the caller code a bit more verbose but both the problem and the solution seems to be generic enough to be put there.

I wouldn't even mind subclassing ConstructionContext to handle this special case with a dedicated subclass, then the ArrayInitLoopExpr could become an entire layer. But that's probably an overkill since everything fits together nicely without it.

526

OpaqueValueExpr is often an indication that its sub-expression is duplicated in the AST (i.e., pointed-to from multiple different parents). I'm curious if the DeclRefExpr on the next line is the same DeclRefExpr that you're looking for. In this case we can simplify this function dramatically by consulting the ArrayInitLoopExpr instead (which is available in this function's caller).

isuckatcs updated this revision to Diff 443932.Jul 12 2022, 6:10 AM
  • ArrayInitLoopExpr with non-POD type array in implicit copy/move ctor is now handled properly.

In the third case, when a lambda captures a non-POD type array by value, there is no construction context,
so the objects would be constructed into the same temporary region. At the moment this case is ignored, and
I added a FIXME about it near the appropriate test case.

isuckatcs marked an inline comment as done.Jul 12 2022, 6:17 AM
isuckatcs added inline comments.
clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
507

Sadly we can't create 1 subclass that handles ArrayInitLoopExpr, as in all 3 different use cases we have a different ConstructionContext.

  • in the implicit copy/move constructor for a class with an array member - ConstructorInitializerConstructionContext
  • when a lambda-expression captures an array by value - <<NULL>>
  • when a decomposition declaration decomposes an array - VariableConstructionContext
isuckatcs updated this revision to Diff 444197.Jul 13 2022, 2:51 AM
  • Added a virtual method to ConstructionContext that returns ArrayInitLoopExpr.
NoQ added a comment.Jul 13 2022, 8:34 PM

Aha this looks great! Nitpicking time.

clang/include/clang/Analysis/ConstructionContext.h
332

I think or_null can be dropped because we can be certain that this variable's initializer contains, at the very least, the constructor to which the context is provided.

403

Same here, or_null can be dropped.

clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
546–549

We should probably assert() that there's no else here. If this assert is ever hit, we'll know that we're missing something and change it to a FIXME.

Nice work, thanks!

clang/test/Analysis/array-init-loop.cpp
164

I think, the matching should be more precise; it is a.i that is garbage, isn't it?
Perhaps this would also make sense (unless the previous warning is a sink):

clang_analyzer_eval(a.i); // expected-warning{{UNDEFINED}}
202–204

I think, It would be useful to emphase more explicitly what we want from the S5 type. Same above with S2.

Note, is_pod is basically is_trivial && is_standard_layout from C++20 (the PODType type requirement is deprecated since then).

isuckatcs updated this revision to Diff 444664.Jul 14 2022, 7:58 AM
  • Added a new ConstructionContext for lambas, which will be polished later.
  • ArrayInitLoopExpr is evaluated properly in case of lambda captures.
NoQ added a comment.Jul 14 2022, 11:40 AM

This looks awesome!

I think the lambda code should go into a separate patch, this one's already quite big and they're logically quite separate.

clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
301–302

This tiny piece of code currently mimics/duplicates ExprEngine::VisitLambdaExpr.

I'm not sure if that tiny piece of code represents the correct solution in general, I suspect that eg. if lambda is put into a variable then it needs to be constructed directly in the variable, through copy elision (?) So eventually we'll probably need to assign a construction context to LambdaExpr itself, and then use both of them here, maybe recursively. But this is what we have for now.

So for this patch, I think the right thing to do is to have ExprEngine::VisitLambdaExpr() lookup the region created here instead, from objects-under-construction. This way future improvements could be done in one place, and the infrastructure will already be there.

1173

Should we clean up the state on the else-branch as well?

isuckatcs updated this revision to Diff 444822.Jul 14 2022, 3:47 PM

Rebased onto latest relevant changes.

xazax.hun added inline comments.Jul 14 2022, 6:29 PM
clang/include/clang/Analysis/CFG.h
205–207

Nit: is this how clang format formatted this expression? Don't we miss a bit of indentation?

clang/include/clang/Analysis/ConstructionContext.h
705–710

We have many types of captures and iterators for them. Captures, capture inits, explicit captures, implicit captures. Are you sure that fields and capture inits can be paired up this way?
Also, I think capture iterators might be random access. you might be able to get to the right element with a single arithmetic. Unfortunately, that might not be the case for fields. But you could use std::advance for that.

isuckatcs marked an inline comment as done.Jul 15 2022, 2:38 AM
isuckatcs added inline comments.
clang/include/clang/Analysis/CFG.h
205–207

Yes, clang format really formatted it this way. We miss the indentation, but if I add it and run clang format, it outputs the same result you see here.

isuckatcs marked an inline comment as done.Jul 15 2022, 2:39 AM
isuckatcs marked an inline comment as done.Jul 15 2022, 4:59 AM
isuckatcs added inline comments.
clang/include/clang/Analysis/ConstructionContext.h
705–710

Are you sure that fields and capture inits can be paired up this way?

Technically this solution is copied from VisitLamdaExpr, so I assumed it to be correct.

Since you created the original solution, I assume you already know the answer too, but I did
some digging regarding the question. The answer comes from Sema::BuildLambdaExpr.
They way the function works is that it iterates over the captures in the LambdaScopeInfo
and creates a FieldDecl and an initializer for each capture.

ExprResult Sema::BuildLambdaExpr(..., LambdaScopeInfo *LSI) {
  ...
  for (unsigned I = 0, N = LSI->Captures.size(); I != N; ++I) {
    const Capture &From = LSI->Captures[I];     
    ...
    // Form the initializer for the capture field.
    ExprResult Init = BuildCaptureInit(From, ImplicitCaptureLoc);

    // FIXME: Skip this capture if the capture is not used, the initializer
    // has no side-effects, the type of the capture is trivial, and the
    // lambda is not externally visible.

    // Add a FieldDecl for the capture and form its initializer.
    BuildCaptureField(Class, From);
    Captures.push_back(Capture);
    CaptureInits.push_back(Init.get());
    ...
  }
  ...
  Class->setCaptures(Context, Captures);
  ...
  LambdaExpr *Lambda = LambdaExpr::Create(..., CaptureInits, ...);
  ...
}

The captures and the initializers are put into 2 separate lists in the same order, so they
have the same indices, while the fields are directly constructed into the class.

FieldDecl *Sema::BuildCaptureField(RecordDecl *RD, const sema::Capture &Capture) {
  ...
  FieldDecl *Field = FieldDecl::Create(..., RD, ...);
  ...
  RD->addDecl(Field);
  ...
}

void DeclContext::addDecl(Decl *D) {
  addHiddenDecl(D);
  ...
}

void DeclContext::addHiddenDecl(Decl *D) {
  ...
  if (FirstDecl) {
    LastDecl->NextInContextAndBits.setPointer(D);
    LastDecl = D;
  } else {
    FirstDecl = LastDecl = D;
  }
  ...
}

The FieldDecls are stored in a linked list, where the new Decl is inserted to the end.
That means, the captures, the initializers and the FieldDecls have the same order and
the same index at this point.

Later the capture list is passed to the class, and the initializer list is passed to
LambdaExpr::Create(). In both cases the same simple linear copy is performed, so the
indices keep their original order. The fields are already in their final place and order.

void CXXRecordDecl::setCaptures(ASTContext &Context,
                                ArrayRef<LambdaCapture> Captures) {
  CXXRecordDecl::LambdaDefinitionData &Data = getLambdaData();

  // Copy captures.
  Data.NumCaptures = Captures.size();
  Data.NumExplicitCaptures = 0;
  Data.Captures = (LambdaCapture *)Context.Allocate(sizeof(LambdaCapture) *
                                                    Captures.size());
  LambdaCapture *ToCapture = Data.Captures;
  for (unsigned I = 0, N = Captures.size(); I != N; ++I) {
    if (Captures[I].isExplicit())
      ++Data.NumExplicitCaptures;

    *ToCapture++ = Captures[I];
  }

  if (!lambdaIsDefaultConstructibleAndAssignable())
    Data.DefaultedCopyAssignmentIsDeleted = true;
}
LambdaExpr::LambdaExpr(..., ArrayRef<Expr *> CaptureInits, ...){
  LambdaExprBits.NumCaptures = CaptureInits.size();
  LambdaExprBits.CaptureDefault = CaptureDefault; 
  ...
  CXXRecordDecl *Class = getLambdaClass();
  (void)Class;
  assert(capture_size() == Class->capture_size() && "Wrong number of captures");
  assert(getCaptureDefault() == Class->getLambdaCaptureDefault());
  ...
  // Copy initialization expressions for the non-static data members.
  Stmt **Stored = getStoredStmts();
  for (unsigned I = 0, N = CaptureInits.size(); I != N; ++I)
    *Stored++ = CaptureInits[I];  
  ...
}

Hence the capture and initializer indices kept their original orders, and the fields were also
created in the same order, we know that they still have the same indices. As a result the
iterators can be paired.

isuckatcs updated this revision to Diff 444983.Jul 15 2022, 7:57 AM
isuckatcs marked 9 inline comments as done.
  • Renamed LambdaConstructionContext to LambdaCaptureConstructionContext.
  • Addressed comments.
  • Polished existing solutions.
clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
1173

Capturing a VLA by value is a compile time error. If the VLA is captured by reference, there is no CXXConstructExpr we need to evaluate, so that case is not affected by this patch. I think there's nothing to clean up.

clang/test/Analysis/array-init-loop.cpp
164

Yes a.i is the Undefined variable, but the warning we get is a sink, so there's no way to confirm it (unless we look at the egraph).

NoQ added inline comments.Jul 15 2022, 6:58 PM
clang/include/clang/Analysis/CFG.h
205–207

Hmm looks like there's extra paretheses after isa<ArgumentConstructionContext>(C).

NoQ added a comment.Jul 15 2022, 8:32 PM

Aside from these nits and the gut feeling that the patch needs to be split, I don't have any other comments. The patch looks great, let's test it a bit and land it.

clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
1173

Ok maybe assert() then, to make sure there's nothing to clean up(?)

isuckatcs updated this revision to Diff 445226.Jul 16 2022, 4:29 AM
isuckatcs marked 2 inline comments as done.

Addressed comments.

isuckatcs updated this revision to Diff 445327.EditedJul 17 2022, 10:14 AM

Extracted the part about lambdas to a separate patch, D129967.

NoQ accepted this revision.Jul 17 2022, 11:39 AM

Awesome, thanks, I'm happy with the code! Let's see how it performs in real life and then commit.

This revision is now accepted and ready to land.Jul 17 2022, 11:39 AM

Added testcases for structured binding syntax 3. Syntax 2 is not applicable for arrays.

This revision was landed with ongoing or failed builds.Jul 26 2022, 12:07 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 26 2022, 12:07 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript