This is an archive of the discontinued LLVM Phabricator instance.

[clang][dataflow] Eliminate intermediate `ReferenceValue`s from `Environment::DeclToLoc`.
ClosedPublic

Authored by mboehme on Apr 25 2023, 3:24 AM.

Details

Summary

For the wider context of this change, see the RFC at
https://discourse.llvm.org/t/70086.

After this change, global and local variables of reference type are associated
directly with the StorageLocation of the referenced object instead of the
StorageLocation of a ReferenceValue.

Some tests that explicitly check for an existence of ReferenceValue for a
variable of reference type have been modified accordingly.

As discussed in the RFC, I have added an assertion to Environment::join() to
check that if both environments contain an entry for the same declaration in
DeclToLoc, they both map to the same StorageLocation. As discussed in
https://discourse.llvm.org/t/70086/5, this also necessitates removing
declarations from DeclToLoc when they go out of scope.

In the RFC, I proposed a gradual migration for this change, but it appears
that all of the callers of `Environment::setStorageLocation(const ValueDecl &,
SkipPast` are in the dataflow framework itself, and that there are only a few of
them.

As this is the function whose semantics are changing in a way that callers
potentially need to adapt to, I've decided to change the semantics of the
function directly.

The semantics of getStorageLocation(const ValueDecl &, SkipPast SP now no
longer depend on the behavior of the SP parameter. (There don't appear to be
any callers that use SkipPast::ReferenceThenPointer, so I've added an
assertion that forbids this usage.)

This patch adds a default argument for the SP parameter and removes the
explicit SP argument at the callsites that are touched by this change. A
followup patch will remove the argument from the remaining callsites,
allowing the SkipPast parameter to be removed entirely. (I don't want to do
that in this patch so that semantics-changing changes can be reviewed separately
from semantics-neutral changes.)

Diff Detail

Event Timeline

mboehme created this revision.Apr 25 2023, 3:24 AM
Herald added a project: Restricted Project. · View Herald Transcript
mboehme requested review of this revision.Apr 25 2023, 3:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 25 2023, 3:24 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
mboehme added inline comments.Apr 25 2023, 3:28 AM
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
43

There were a number of pre-existing declarations that were ODR violation risks. I'm adding a new file-local function below, so instead of making it static, I decided to put everything that's supposed to be file-local in an anonymous namespace.

clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
506

We're creating a slightly deeper object hierarchy here because we now call createValue() with the non-reference type in a number of places. I believe this slightly deeper hierarchy shouldn't be a problem in practice.

2663

The changes in this test are a result of the fact that we now remove declarations from Environment::DeclToLoc when they go out of scope.

mboehme updated this revision to Diff 516745.Apr 25 2023, 4:20 AM

Add two assertions that I forgot to add in the first upload.

gribozavr2 accepted this revision.Apr 25 2023, 2:10 PM
gribozavr2 added a subscriber: gribozavr2.
gribozavr2 added inline comments.
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
43

https://llvm.org/docs/CodingStandards.html#anonymous-namespaces

"Avoid putting declarations other than classes into anonymous namespaces"

This revision is now accepted and ready to land.Apr 25 2023, 2:10 PM
mboehme updated this revision to Diff 517258.Apr 26 2023, 11:59 AM

Use anonymous namespace only for classes, static for functions.

mboehme marked an inline comment as done.Apr 26 2023, 12:01 PM
mboehme added inline comments.
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
43

Thanks for the pointer -- done!

ymandel accepted this revision.Apr 28 2023, 5:17 AM

Really nice patch. Thanks for this improvement!

clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
294–304

Maybe comment as to why we're not use the VarDecl overloads of createStorageLocation and why we're specifically using getNonReferenceType for createValue.

604
618

would isa_and_nonnull work?

clang/lib/Analysis/FlowSensitive/Transfer.cpp
279–297
328

why the recursive call rather than relying on what we know about their structure?

xazax.hun added inline comments.
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
610

Would we dump this in release builds? Do we want to limit this to debug/assert builds?

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
330

I think one question is whether we are interested in ScopeEnd or LifetimeEnd, see the discussions in https://reviews.llvm.org/D15031 and https://reviews.llvm.org/D16403, specifically Devin's comment:

@dcoughlin As a reviewer of both patches - could you tell us what's the difference between them? And how are we going to resolve this issue?

These two patches are emitting markers in the CFG for different things.

Here is how I would characterize the difference between the two patches.

  • Despite its name, https://reviews.llvm.org/D15031, is really emitting markers for when the lifetime of a C++ object in an automatic variable ends. For C++ objects with non-trivial destructors, this point is when the destructor is called. At this point the storage for the variable still exists, but what you can do with that storage is very restricted by the language because its contents have been destroyed. Note that even with the contents of the object have been destroyed, it is still meaningful to, say, take the address of the variable and construct a new object into it with a placement new. In other words, the same variable can have different objects, with different lifetimes, in it at different program points.
  • In contrast, the purpose of this patch (https://reviews.llvm.org/D16403) is to add markers for when the storage duration for the variable begins and ends (this is, when the storage exists). Once the storage duration has ended, you can't placement new into the variables address, because another variable may already be at that address.

I don't think there is an "issue" to resolve here. We should make sure the two patches play nicely with each other, though. In particular, we should make sure that the markers for when lifetime ends and when storage duration ends are ordered correctly.

What I wanted to add, I wonder if we might not get ScopeEnd event for temporaries and there might be other differences. The Clang implementation of P1179 used LifetimeEnd instead of ScopeEnd, and I believe probably most clients are more interested in LifetimeEnd events rather than ScopeEnd.

I think I understand why you went with ScopeEnd for this specific problem, but to avoid the confusion from having both in the Cfg (because I anticipate someone will need LifetimeEnd at some point), I wonder if we can make this work with LifetimeEnd.

mboehme updated this revision to Diff 518709.May 2 2023, 7:06 AM
mboehme marked an inline comment as done.

Changes in response to review comments

mboehme marked 5 inline comments as done.May 2 2023, 7:21 AM
mboehme added inline comments.
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
294–304

Added comment on the getNonReferenceType().

As to why we're not using the VarDecl overload -- it's really just because we don't need the "stable" location behavior here. Wasn't sure if this rises to the level of needing to be addressed in a comment, so I've left it out for now. WDYT?

610

True, we'd want to do this only in debug builds. For simplicity, I've taken out the dump entirely -- it's easy enough to add back if we do actually encounter assertion failures here.

618

Good point, done!

clang/lib/Analysis/FlowSensitive/Transfer.cpp
328

Not sure exactly what you're asking?

If we don't want to make a recursive call here, we'd need to duplicate behavior that ProcessVarDecl() already contains and inline that code here. It seems preferable to just call ProcessVarDecl() instead? (Or to put it differently, what is a reason against making this call?)

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
330

Hm, thanks for bringing it up.

Conincidentally, I realized while chasing another issue that CFGScopeEnd isn't the right construct here. I assumed that we would get a CFGScopeEnd for every variable, but I now realize that we only get a CFGScopeEnd for the first variable in the scope.

So CFGLifetimeEnds definitely looks like the right construct to use here, and indeed it's what I originally tried to use. Unfortuately, CFG::BuildOptions::AddLifetime cannot be used at the same time as AddImplicitDtors, which we already use. We don't actually need the CFGElements added by AddImplicitDtors, but we do need AddTemporaryDtors, and that in turn can only be turned on if AddImplicitDtors is turned on.

It looks as if I will need to break one of these constraints. It looks as if the constraint that is easiest to break is that AddLifetime currently cannot be used at the same time as AddImplicitDtors. I'm not sure why this constraint currently exists (I'd appreciate any insights you or others may have), but I suspect it's because it's hard to ensure that the CFGElements added by AddImplicitDtors are ordered correctly with respect to the CFGLifetimeEnds elements.

Anyway, this requires a change to CFG one way or another. I'll work on that next. I'll then make the required changes to this patch and will ask for another look before submitting.

ymandel accepted this revision.May 2 2023, 8:07 AM
ymandel added inline comments.
clang/lib/Analysis/FlowSensitive/Transfer.cpp
328

At least previously, given what we know of their structure, the inlining was trivial. But, I think your second point is key -- why not recursively call -- it's a clean solution. I'm fine with that, thx.

mboehme updated this revision to Diff 519373.May 4 2023, 12:44 AM
mboehme marked 3 inline comments as done.
  • Removed usage of CFGScopeEnd
  • Removed the assertion that two DeclToLoc to be joined don't contain conflicting entries
mboehme added inline comments.May 4 2023, 12:51 AM
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
330

I've taken a look at what it would take to make AddLifetime coexist with AddImplicitDtors, and it's hard (because we need to get the ordering right, and that's non-trivial).

So I've instead decided to remove the behavior from this patch that removes declarations from Environment::DeclToLoc when they go out of scope and have instead added FIXMEs in various places.

It would be nice to add this behavior, but all that it was used for in this patch was the assertion I added that the two DeclToLocs to be joined don't have entries for the same declaration but with different storage locations. Instead, if we now encounter a scenario where we have such conflicting entries for a declaration (as discussed in https://discourse.llvm.org/t//70086/5), we use the existing behavior of intersectDenseMaps that removes the corresponding declaration from the joined map.

It would be nice to be able to retain the assertion, but I don't really see any plausible failure modes that it would guard against. Also, when I tested various existing clients of the dataflow framework with this assertion in place, I didn't see any assertion failures.

WDYT?

xazax.hun accepted this revision.May 4 2023, 10:32 AM
xazax.hun added inline comments.
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
330

Why do you need AddImplicitDtors? I have the impression if we have the LifetimeEnd markers, we can sort of simulate implicit dtors. This is more like a note for the future if we cannot make both work at the same time.

For this patch, I am OK with the current solution.

mboehme added inline comments.May 4 2023, 1:33 PM
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
330

We don't need AddImplicitDtors per se, but we do need AddTemporaryDtors, and to get that we have to set AddImplicitDtors as well. (And it's hard to disentangle both of those too.)

We need AddTemporaryDtors so that we get a CFGTerminator::TemporaryDtorsBranch, which is used here. I'm not sure that we can get something similar with CFGLifetimeEnds -- this would at the least require some more research.

xazax.hun added inline comments.May 4 2023, 3:21 PM
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
330

Oh, I see! Sorry, somehow my mind skipped over AddTemporaryDtors. Indeed, I don't think this is a simple problem to solve. I'd love to see AddScopes to be removed from the CFG, it does not look useful in its current form. And hopefully AddLifetime, AddImplicitDtors, and AddTemporaryDtors could all be unified as they have some overlapping.

mboehme added inline comments.May 7 2023, 11:35 PM
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
330

Yes, I would love that too -- it would simplify life for clients as well as make the code cleaner I think.