This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Rework support for CFGScopeBegin, CFGScopeEnd, CFGLifetime elements
ClosedPublic

Authored by tomasz-kaminski-sonarsource on Jun 19 2023, 6:51 AM.

Details

Summary

This patch reworks generation for the CFGScopeBegin, CFGScopeEnd,
and CFGLiftimeEnd, in a way that they are now compatible with each
other and CFGAutomaticObjDtor. All of the above elements are now
generated by a single code path, that conditionally inserts elements if
they are requested.

In addition, the handling of goto statements is improved.
The goto statement may leave multiple scopes (and trigger destruction
and lifetime end for the affected variables) and enter multiple scopes,
for example:

{
  int s1;
  {
    int s2;
    goto label; // leaves s1, s2, and enters t1 t1
  }
}
{
  int t1;
  {
    int t2;
label:
  }
}

This is performed by first determining the shared parent scope of the
source and destination. And then emitting elements for exiting each
scope between the source and the parent, and entering each scope
between the parent and destination. All such elements are appended
to the source block, as one label may be reached from multiple scopes.

Finally, the approach for handling backward jumps is changed. When
connecting a source block to a destination block that requires the
insertion of additional elements, we put this element into a new block,
which is then linked between the source and the destination block.
For example:

{
  int t;
label:
  // Destination block referred to as 'DB'
}
{
  // Source block referred to as 'SB'
  Obj s;
  goto label;
}

The jump between SB with terminator T: goto and DB should be
coupled with the following CFG elements:

CFGAutomaticObjDtor(s)
CFGLifetimeEnd(s)
CFGScopeEnd(s)
CFGScopeBegin(t)

To handle such situations, we create a new link (LB) that is linked as
the predecessor of DB, to which we transfer the terminator (goto
statement) of SB. Then LB is handled in the same manner as the
source block in the case of forward jumps.
This produces CFG that looks like this:

SB -> LB (T: goto) -> DB

Finally, the resulting block is linked as the successor of SB. Such an
approach uses existing handling of the noreturn destructors.
As a reminder, for each destructor of an automatic object that is
marked as noreturn, a new noreturn block (marked NBn) is
created, at the destructor is inserted at the end of it.
To illustrate, given two noreturn destructors, we will have:

SB -> NB1 (noreturn)
NB2 (noreturn)
LB (T:goto) -> DB

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald Transcript
tomasz-kaminski-sonarsource requested review of this revision.Jun 19 2023, 6:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 19 2023, 6:51 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
tomasz-kaminski-sonarsource retitled this revision from CPP-4465 Rework support for CFGScopeBegin, CFGScopeEnd, CFGLifetime elements to [analyzer] Rework support for CFGScopeBegin, CFGScopeEnd, CFGLifetime elements.
steakhal accepted this revision.EditedJul 4 2023, 8:13 AM

I'd appreciate some review on this, given that a lot of you would be affected by the changes of CFG.
By changes I mean, fixes for goto statements, properly calling destructors and stuff.

It's already in production for our CSA fork, and the results look good.

In two weeks we plan to land this unless anyone objects.

This revision is now accepted and ready to land.Jul 4 2023, 8:13 AM

I'd appreciate some review on this, given that a lot of you would be affected by the changes of CFG.
By changes I mean, fixes for goto statements, properly calling destructors and stuff.

It's already in production for our CSA fork, and the results look good.

In two weeks we plan to land this unless anyone objects.

Thanks! We're planning to take a look in the next few days.

ymandel accepted this revision.Jul 7 2023, 9:46 AM
ymandel added inline comments.
clang/lib/Analysis/CFG.cpp
772

Consider adding comments describing these methods, either here or at the function definitions.

1840

if LocalScopeEndMarkers isn't supposed to change inside the body of the loop, then consider making this call once
for (std::size_t idx = 1, N = LocalScopeEndMarkers.size(); idx < N; ...

1903–1905
1920–1927

Remove extra braces?

tomasz-kaminski-sonarsource marked 3 inline comments as done.

Apply review suggestion and replaced make_scope_exit with SaveAndRestore that was already used
in file.

steakhal added inline comments.Jul 12 2023, 4:24 AM
clang/lib/Analysis/CFG.cpp
1840–1847

How about using a "pairwise" range here?
I believe, we don't have that just yet, but we can use zip instead.

1892–1894

We could use llvm::make_range(B, E) here, I believe.

1898–1900

Ranges here too.

1934–1936

Probably, we could also use a llvm::make_range(DstPos, BasePos) here, but I'm not sure if that's more readable or not.

clang/lib/Analysis/CFG.cpp
1934–1936

We need to call iterator member function I here, so we cannot replace it with the range.

tomasz-kaminski-sonarsource marked 5 inline comments as done.

Switched to ranges when applicable.

xazax.hun added inline comments.Jul 17 2023, 5:55 AM
clang/test/Analysis/lifetime-cfg-output.cpp
760

Would it be possible to add the variable name to the lifetime ends dumps? It would make interpreting these tests somewhat easier. (It is OK as a separate/follow-up patch.)

clang/test/Analysis/nonreturn-destructors-cfg-output.cpp
99

B3 and B5 look almost identical. Do we need both? If not, would there be a way to deduplicate?

tomasz-kaminski-sonarsource marked an inline comment as done.Jul 17 2023, 6:04 AM
tomasz-kaminski-sonarsource added inline comments.
clang/test/Analysis/lifetime-cfg-output.cpp
760

I think that would indeed be helpful. We could do the same for the destructors:

[B3.3].~A() (Implicit destructor)

But to be explicit, I do not have any idea when I would be able to work on it.

tomasz-kaminski-sonarsource marked 2 inline comments as done.Jul 17 2023, 7:19 AM
tomasz-kaminski-sonarsource added inline comments.
clang/test/Analysis/nonreturn-destructors-cfg-output.cpp
99

B2 is created when we visit the statement in reverse order. This includes all statement until we reach goto. So if the code would look like:

{
  A a;
  goto label;
  int b;
    foo();
}
label:
   foo();

Then B2 will include two calls to foo and lifetime of int b. The B5 only includes destructor of a that happens after goto.

So the similarly here only happens by accident, or in other words is by product of the code. In this case any deduplication mechanism for blocks will need to be content based, and I haven't seen anything like that happening in the code.

tomasz-kaminski-sonarsource marked an inline comment as done.Jul 17 2023, 7:22 AM
xazax.hun accepted this revision.Jul 17 2023, 6:33 PM

Thanks! Looks good to me.

This revision was landed with ongoing or failed builds.Jul 17 2023, 10:12 PM
This revision was automatically updated to reflect the committed changes.