This is an archive of the discontinued LLVM Phabricator instance.

[clang][dataflow] Analyze constructor bodies
ClosedPublic

Authored by samestep on Aug 8 2022, 2:01 PM.

Details

Summary

This patch adds the ability to context-sensitively analyze constructor bodies, by changing pushCall to allow both CallExpr and CXXConstructExpr, and extracting the main context-sensitive logic out of VisitCallExpr into a new transferInlineCall method which is now also called at the end of VisitCXXConstructExpr.

Diff Detail

Event Timeline

samestep created this revision.Aug 8 2022, 2:01 PM
Herald added a project: Restricted Project. · View Herald Transcript
samestep requested review of this revision.Aug 8 2022, 2:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 8 2022, 2:01 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
samestep edited the summary of this revision. (Show Details)Aug 8 2022, 2:07 PM

Looks good overall. I want to think a bit more about potential additional tests...

clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
230

i think that llvm_unreachable is preferable for this purpose.

256

same.

clang/lib/Analysis/FlowSensitive/Transfer.cpp
678–684

Why can't this stay in VisitCallExpr?

sgatev added inline comments.Aug 9 2022, 12:32 AM
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
138

How about we define overloads that take these types instead of taking an Expr here? This should remove the need for type-checking and guarding against bad input in the implementation. transferInlineCall can be a template if necessary.

ymandel accepted this revision.Aug 9 2022, 6:14 AM
ymandel added inline comments.
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
4371

What about a default constructor, including when there are field initializers like:

class MyClass {
   public:
     MyClass() = default;

     bool MyField = true;
   };

Should we expect to handle that correctly? If so, can you add some tests?

This revision is now accepted and ready to land.Aug 9 2022, 6:14 AM
samestep added inline comments.Aug 9 2022, 6:39 AM
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
138

Hmm I guess we could; is there much of a benefit to doing this templated?

clang/lib/Analysis/FlowSensitive/Transfer.cpp
678–684

I had it there in an earlier version of this patch; it was causing tests to fail (the SelfReferential* ones, if I remember correctly).

clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
4371

Good idea! I'll add a test for this.

samestep updated this revision to Diff 451134.Aug 9 2022, 6:41 AM

Address Yitzie's comments

samestep updated this revision to Diff 451233.Aug 9 2022, 12:13 PM

Narrow public signature of pushCall

I feel like this is a repeated pattern. The CSA solved a very similar issue by introducing the CallEvent class hierarchy. I also remember seeing many disparate code snippets littered throughout the clang codebase that tries to deal with the problem of not having facilities to treat call-like nodes uniformly. At some point, I believe there were even some AST changes or supporting structures proposed to ameliorate this problem, but I can't find those at the moment. While I think it might be OK to introduce yet another workaround here, this is a cleanup that is long overdue, and I hope someone will have the time to actually improve the situation. Sorry for the rant, I will actually look at the code but had to vent this.

xazax.hun added inline comments.Aug 9 2022, 1:18 PM
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
247–252

I think the more idiomatic solution in Clang is to create an ArrayRef for the arguments from the ConstructExpr and the CallExpr. The args should be stored in a continuous memory area in both cases so you should be able to create the ArrayRef in constant time and you would no longer need to have these pesky if statements in your loop.

samestep added inline comments.Aug 9 2022, 1:26 PM
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
247–252

Ah OK, thanks! How should I do that? I was previously using iterators but it looked like the types were different between CallExpr and CXXConstructExpr, so I switched away from iterators in this patch; is there an easy way to make the ArrayRef?

xazax.hun added inline comments.Aug 9 2022, 1:29 PM
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
247–252

I feel like this is a repeated pattern. The CSA solved a very similar issue by introducing the CallEvent class hierarchy. I also remember seeing many disparate code snippets littered throughout the clang codebase that tries to deal with the problem of not having facilities to treat call-like nodes uniformly. At some point, I believe there were even some AST changes or supporting structures proposed to ameliorate this problem, but I can't find those at the moment. While I think it might be OK to introduce yet another workaround here, this is a cleanup that is long overdue, and I hope someone will have the time to actually improve the situation. Sorry for the rant, I will actually look at the code but had to vent this.

No worries at all for the rant, I appreciate the broader context. I was definitely surprised to learn that the two types don't share a common ancestor.

samestep updated this revision to Diff 451259.Aug 9 2022, 1:37 PM

Use ArrayRef

ymandel added inline comments.Aug 9 2022, 1:41 PM
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
207–243

I think you can push all of this logic to the type-specialized pushCalls now and just have pushCallInternal take the FunctionDecl, ArrayRef, etc.. WDYT?

xazax.hun accepted this revision.Aug 9 2022, 1:45 PM

Thanks!

clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
207–243

+1, I like that.

samestep added inline comments.Aug 9 2022, 1:48 PM
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
207–243

Great idea, thanks! Doing that now.

samestep updated this revision to Diff 451265.Aug 9 2022, 1:52 PM

Pull type-specific logic into overloads

xazax.hun accepted this revision.Aug 9 2022, 1:56 PM

Nice! We went from branches in the loop to branches before the loop first. And now the branches are at compile time during overload resolution. I love it.

ymandel added inline comments.Aug 9 2022, 1:57 PM
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
207–208

I'd go farther: just duplicate this line in both pushCalls and then the two if that follow can be moved out as well. at that point, I think Call is no longer needed.

samestep updated this revision to Diff 451267.Aug 9 2022, 2:04 PM

Remove Call parameter from pushCallInternal method

clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
207–208

Smart! Done.

sgatev accepted this revision.Aug 10 2022, 1:43 AM
sgatev added inline comments.
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
379

Let's add a note that unlike pushCall, this member is invoked on the environment of the callee.

380

#include "llvm/ADT/ArrayRef.h"

clang/lib/Analysis/FlowSensitive/Transfer.cpp
667–670
samestep added inline comments.Aug 10 2022, 6:54 AM
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
379

Will do, thanks!

380

When I add this, I get a warning: "Included header ArrayRef.h is not used directly"

clang/lib/Analysis/FlowSensitive/Transfer.cpp
667–670

👍

samestep updated this revision to Diff 451445.Aug 10 2022, 6:57 AM

Address Stanislav's comments

This revision was landed with ongoing or failed builds.Aug 10 2022, 7:02 AM
This revision was automatically updated to reflect the committed changes.

Sorry but I had to revert this because of the conflicts with another revert: https://reviews.llvm.org/D131065