This is an archive of the discontinued LLVM Phabricator instance.

[clang][dataflow] Fix Record initialization with InitListExpr and inheritances
ClosedPublic

Authored by kinu on Aug 31 2023, 7:25 AM.

Details

Summary

Usually RecordValues for record objects (e.g. struct) are initialized with
Environment::createValue() which internally calls getObjectFields() to
collects all fields from the current and base classes, and then filter them
with ModeledValues via DACtx::getModeledFields() so that the fields that
are actually referenced are modeled.

The consistent set of fields should be initialized when a record is initialized
with an initializer list (InitListExpr), however the existing code's behavior
was different.

Before this patch:

  • When a struct is initialized with InitListExpr, its fields are initialized based on what is returned by getFieldsForInitListExpr(), which only collects the direct fields in the current class, but not from the base classes. Moreover, if the base classes have their own InitListExpr, values that are initialized by their InitListExpr's weren't merged into the child objects.

After this patch:

  • When a struct is initialized with InitListExpr, it collects and merges the fields in the base classes that were initialized by their InitListExpr's. The code also asserts that the consistent set of fields are initialized with the ModeledFields.

Diff Detail

Event Timeline

kinu created this revision.Aug 31 2023, 7:25 AM
Herald added a project: Restricted Project. · View Herald Transcript
kinu requested review of this revision.Aug 31 2023, 7:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 31 2023, 7:25 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
kinu retitled this revision from Fix Record initialization with InitListExpr and inheritances to [clang][dataflow] Fix Record initialization with InitListExpr and inheritances.Aug 31 2023, 7:26 AM
kinu added inline comments.
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
131 ↗(On Diff #555037)

Note, this hits regardless of my change, but not very often. I haven't looked into details yet

kinu updated this revision to Diff 555166.Aug 31 2023, 2:34 PM

Exclude lambdas

kinu updated this revision to Diff 555676.Sep 4 2023, 1:13 AM

reverted wrong changes

mboehme added inline comments.Sep 4 2023, 8:01 AM
clang/lib/Analysis/FlowSensitive/Transfer.cpp
631–632

This adds a fair bit of clutter. Can you just use Type::getCanonicalTypeUnqualified()?

638

InitListExprs isn't accurate here, as the entries in S->inits() are not necessarily of that type. Also note typo in S->inits().

641

Nit: Make this a size_t for consistency with the type of Inits.size()?

643

I feel it would be useful to give a summary of what this large-ish block of code does (as it's not obvious from the first line).

644–645

Again, this is the prvalue case, so I think you should simply be able to use QualType::getCanonicalType().

645

This [[maybe_unused]] seems unnecesary, as Base is definitely used in the isLambda() check?

648–649

I believe this can be

assert(Base.getType().getCanonicalType() ==
       Init->getType().getCanonicalType());
  • The initializer for a base class should always be a prvalue
  • And I believe the type of a prvalue can't have const or volatile qualifiers (neither of them really makes sense)
650
  • Can this actually happen? (How do you make a lambda a base class?)
  • Why does this need to be handled separately? (Isn't a lambda just a normal class, modulo sugar? What is different about lambdas that means we can't handle them like other classes?)

Depending on the answers to the above, would also be useful to capture them as comments in the source code.

I assume you encountered this scenario somewhere. Can you add a test (with incorrect behavior marked as a FIXME) that exercises this?

654–655
  • Instead of doing dyn_cast, then asserting non-null, one can just do a cast, as that asserts that the argument is actually of the required type
  • However, I believe it's possible that getValue() can return null in this case, so we should use cast_or_null, then invent a value if we didn't get one.
  • Also, suggest BaseVal instead of RecordVal to emphasize that this is the RecordValue for the base class
656–657

I feel it would be useful to add an explanatory comment.

663–666

Rewrote this a bit with the aim of making it clearer -- WDYT?

667–669

SmallSetVector has a contains() method, so I think there is no need to convert to DenseSet?

clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
1451–1466

Can you make the class names more descriptive?

Suggestion:

Foo => MostDerived
Base2 => Intermediate
Base3 => Base1
Base => Base2

Also, I think it would be useful to tie the member variable names to the class names -- see suggested edit.

1470–1472

Is there a reason we're doing different operations on these fields?

IIUC, the only thing we want to do is to ensure they're modeled -- so maybe do the same for each? (Otherwise, it looks as if we're doing things differently so that it triggers different behavior, even when that's not the case.)

1492

findValueDecl() asserts internally that its return value is non-null, so this assertion is unnecessary.

1495–1496
1497–1511
1536–1538

These lines seem unnecessary. The test only wants to check whether the fields are modeled, not what their values are.

If anything, leaving out these operations makes the test stronger: It then tests whether the fields are modeled even if they are only initialized with an empty {}.

Then, once we leave out these operations, it's probably sufficient to give each class in this test just a single member variable (which is another nice simplification).

1564–1565
2148

IIUC, the crash was happening because copyRecord() assumes that the source and destination have the same fields. More specifically, there was an unspoken invariant in the framework that a RecordStorageLocation should always contain exactly the fields returned by DataflowAnalysisContext::getModeledFields()), but our treatment of InitListExpr was violating this invariant.

IOW, this wasn't really an issue with the way we treat copy/move operations but with InitlistExpr.

I understand that we want to have a test the reproduces the exact circumstances that led to the crash, but maybe it's enough to have one of these, and have the focus of the testing be on testing the actual invariant that we want to maintain? IOW, maybe keep this test but delete the additional tests below?

2165–2177
2431

I don't think this is true -- the return S1 in target() will use NRVO.

But I also don't think it's relevant -- I believe the crash this is reproducing was triggered by the InitListExpr, not the return statement?

Given the other tests added in this patch, do we need this test?

kinu updated this revision to Diff 555777.Sep 4 2023, 2:32 PM
kinu marked 19 inline comments as done.

Addressed comments

kinu added a comment.Sep 4 2023, 2:33 PM

Thanks! Updated the patch.

clang/lib/Analysis/FlowSensitive/Transfer.cpp
645

I removed the isLambda check below from this patch, let me keep this line for now

650

I hit this case in one of real code so added this afterwards...

but actually can I drop this part from this patch but come back in a separate patch once I clarify the behavior? I somehow lost the note about in which file this happened, and I need to confirm a few things before I can add a test :(

663–666

Much better, applied!

clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
2148

Dropped the additional tests below.

2165–2177

(Now this part is removed)

2431

It will use NRVO but in AST this still seems to generate CXXConstructExpr with isCopyOrMoveConstructor() == true because omitting the ctor is not mandatory in compilers.

I can drop this one, but I'm also a bit torn because this was the original crash repro that puzzled me a bit.

I refined the comment to explain it a bit better; how does it look now?

mboehme added inline comments.Sep 5 2023, 12:45 AM
clang/lib/Analysis/FlowSensitive/Transfer.cpp
644–645

Thinking about this again -- the field can, of course, have qualifiers, while the Init as a prvalue should not, so I think this should read

Field->getType().getCanonicalType().getUnqualifiedType() == Init->getType().getCanonicalType()

Sorry for the back and forth!

648–649

Do you actually need the Unqualified (see also comment above)? Do you have an example of a specific scenario where we need to drop the qualifiers?

650

Sounds good!

672–674
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
1487–1489
1536–1538

Reopening the comment to discuss this line:

Then, once we leave out these operations, it's probably sufficient to give each class in this test just a single member variable (which is another nice simplification).

I see that giving each class two member variables makes this test consistent with the test above -- but OTOH, we're testing exactly the same thing for both member variables, so maybe it's better on the whole to simplify the test by putting just one member variable in each class?

2148

I still see more tests below? Suggest dropping these.

2431

It will use NRVO but in AST this still seems to generate CXXConstructExpr with isCopyOrMoveConstructor() == true

Ah, true, I see this now:

https://godbolt.org/z/z9enG8cW7

because omitting the ctor is not mandatory in compilers.

TIL that NRVO isn't guaranteed. (I always thought it was!)

I'm still pretty sure though that the CXXConstructExpr will have isElidable() == true, and in this case we [don't call copyRecord()](https://github.com/llvm/llvm-project/blob/6c6a2d3445671ada6a58b9ab5ce4a1e11e3dd610/clang/lib/Analysis/FlowSensitive/Transfer.cpp#L474):

if (S->isElidable()) {
  if (Value *Val = Env.getValue(*ArgLoc))
    Env.setValue(*S, *Val);
} else {
  auto &Val = *cast<RecordValue>(Env.createValue(S->getType()));
  Env.setValue(*S, Val);
  copyRecord(*ArgLoc, Val.getLoc(), Env);
}

So I'm not sure that this repro actually triggers the crash? (Can you verify? If it does trigger the crash, where am I going wrong in my thinking above?)

I can drop this one, but I'm also a bit torn because this was the original crash repro that puzzled me a bit.

OK, good to know that this is the original scenario that triggered the crash.

I still think it would be OK to keep only AssignmentOperatorWithInitAndInheritance because it also triggers a call to copyRecord() but does so in a more obvious fashion. And I think for a test it's actually useful if it's obvious what is happening.

kinu updated this revision to Diff 555838.Sep 5 2023, 3:12 AM
kinu marked an inline comment as done and an inline comment as not done.

Updated

kinu updated this revision to Diff 555840.Sep 5 2023, 3:18 AM
kinu marked 2 inline comments as done.

More fixes

kinu updated this revision to Diff 555842.Sep 5 2023, 3:38 AM
kinu marked 5 inline comments as done.

Updated again... very easy to overlook unaddressed comments!

kinu added a comment.Sep 5 2023, 3:39 AM

Thanks, addressed the comments, PTAL~

clang/lib/Analysis/FlowSensitive/Transfer.cpp
644–645

No problem, thanks for clarifying the details, done.

672–674

Does it work even if assert() is dropped? ... looks like it does. Ok, applied.

clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
1536–1538

That's true for this test... now I made each class have only one member.

2148

Ah I see what you mean. I dropped more!

2431

I'm still pretty sure though that the CXXConstructExpr will have isElidable() == true, and in this case we [don't call copyRecord()](https://github.com/llvm/llvm-project/blob/6c6a2d3445671ada6a58b9ab5ce4a1e11e3dd610/clang/lib/Analysis/FlowSensitive/Transfer.cpp#L474):

Interesting, so I looked into it:

Looks like after C++17 isElidable is no longer used in AST:
https://github.com/llvm/llvm-project/blob/2a603deec49cb6a27f3a29480ed8a133eef31cee/clang/include/clang/ASTMatchers/ASTMatchers.h#L8351

(So this test code doesn't take the branch)

NRVO seems to be handled around ReturnStmt, not in each Expr:
https://github.com/llvm/llvm-project/blob/2a603deec49cb6a27f3a29480ed8a133eef31cee/clang/lib/AST/Stmt.cpp#L1202

[...] because it also triggers a call to copyRecord() but does so in a more obvious fashion. And I think for a test it's actually useful if it's obvious what is happening.

Makes sense, now I dropped this.

mboehme accepted this revision.Sep 6 2023, 1:02 AM
mboehme added inline comments.
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
2431

I'm still pretty sure though that the CXXConstructExpr will have isElidable() == true, and in this case we [don't call copyRecord()](https://github.com/llvm/llvm-project/blob/6c6a2d3445671ada6a58b9ab5ce4a1e11e3dd610/clang/lib/Analysis/FlowSensitive/Transfer.cpp#L474):

Interesting, so I looked into it:

Looks like after C++17 isElidable is no longer used in AST:
https://github.com/llvm/llvm-project/blob/2a603deec49cb6a27f3a29480ed8a133eef31cee/clang/include/clang/ASTMatchers/ASTMatchers.h#L8351

(So this test code doesn't take the branch)

Interesting -- wasn't aware!

NRVO seems to be handled around ReturnStmt, not in each Expr:
https://github.com/llvm/llvm-project/blob/2a603deec49cb6a27f3a29480ed8a133eef31cee/clang/lib/AST/Stmt.cpp#L1202

Thanks for the pointer, makes sese!

This revision is now accepted and ready to land.Sep 6 2023, 1:02 AM