This is an archive of the discontinued LLVM Phabricator instance.

[clang][Interp] Fix local variable (destructor) management in loop bodies
ClosedPublic

Authored by tbaeder on Mar 7 2023, 10:22 PM.

Details

Summary

Ok, let me explain.

PREVIOUSLY:

When generating bytecode for a loop, we visitStmt(Body), which will create its own local scope (that makes sense) and at the end of said scope, it will emit Destroy ops for all local variables, as well as emit code for the destructors of those local variables. However, the Destroy op calls the destructor for all data in the Block, and converts the blocks to DeadBlocks. That means the contents aren't usable anymore after it, and when we jump to the beginning of the loop and try to assign something to the block, we might end up trying to use the unusable data. This happens in the copy constructor of our Pointer class for example.

In pseudo code:

loop {
   create locals...

   local destructors
   Destroy Scope
   jump to beginning
}

NOW:

I've split emitting the destructors for locals and emitting the Destroy op for a scope up into two separate parts. They are mostly still the same, i.e.when using a AutoScope or any of its subclasses, which emits the destructors and the Destroy op in its own destructor.

When visiting a loop, we create our own LocalScope to manage the lifetimes of locals created within the body and then call emitDestructors of that scope right before the jmp to the beginning of the loop. The Destroy op for the scope is emitted in the LocalScope destructor, so will only be emitted once.

In pseudo code:

loop {
    create locals...

   local destructors
   jump to beginning
}
Destroy Scope

The changes in this patch are on top of https://reviews.llvm.org/D137070, but the patch needs to go in before that or be merged with it.

Diff Detail

Event Timeline

tbaeder created this revision.Mar 7 2023, 10:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 7 2023, 10:22 PM
tbaeder requested review of this revision.Mar 7 2023, 10:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 7 2023, 10:22 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
tbaeder edited the summary of this revision. (Show Details)Mar 8 2023, 1:02 AM
tbaeder updated this revision to Diff 505084.Mar 14 2023, 7:18 AM

Ping. Please prioritize this patch over the other interpreter patches since the others depend on it.

aaron.ballman added inline comments.Mar 27 2023, 12:15 PM
clang/lib/AST/Interp/ByteCodeExprGen.h
338

Should we be setting Idx = std::nullopt; after this so that the LocalScope destructor does not also emit a destroy for the same Idx?

clang/lib/AST/Interp/ByteCodeStmtGen.cpp
199–200

Errr, I'm surprised it isn't UB to call this with anything but a CompoundStmt given the function name.

209

It's a bit of a surprise that we visit the entire body of the compound statement before we visit the compound statement itself. I was thinking the compound statement could potentially have prologue work it needs to do, but now I'm second-guessing that. But this design still feels a little bit off... it's basically doing a post-order traversal just for this one node, and I wonder if we want something more general like a preVisitFoo() followed by visitFoo() followed by postVisitFoo() that applies to any AST node.

325–333

AutoScope and some curly braces to delimit the scope object lifetime?

349

Similar question here and elsewhere. The concern I have with this form is that it's pretty easy to accidentally miss that we've emitted the destructors in later code, whereas using the RAII object makes that impossible.

tbaeder added inline comments.Mar 27 2023, 11:31 PM
clang/lib/AST/Interp/ByteCodeStmtGen.cpp
199–200

Yeah, I'm not too happy about that either. I'll see what I can do.

209

I think you're overthinking this, this just for the !isa<CompoundStmt>(S) case (which IIRC happens at least for comma operator bin ops?)

325–333

The problem is that we want to emit the destructors before the jump, but not destroy the scope. That should happen after the end label, when the loop is finished altogether so an AutoScope doesn't work.

aaron.ballman added inline comments.Mar 28 2023, 12:29 PM
clang/lib/AST/Interp/ByteCodeStmtGen.cpp
209

Oh! Yeah, I totally overthought that. I got tripped up by this accepting things other than a CompoundStmt because of the function name. Sorry for the noise!

325–333

Ahh yeah, that's a good point... but this still worries me a bit because of how easy it is to misuse the scope after emitting the destructors. Notionally, we want two scopes, right? One scope for the loop construct guts and a second (inner) scope for the loop body. That's how the construct is modeled in the standard: http://eel.is/c++draft/stmt.while#2

tbaeder updated this revision to Diff 509264.Mar 29 2023, 2:24 AM
tbaeder marked 3 inline comments as done.Mar 29 2023, 2:25 AM
tbaeder added inline comments.
clang/lib/AST/Interp/ByteCodeStmtGen.cpp
325–333

Add a DestructorScope that just emits the destructors of a given scope.

tbaeder updated this revision to Diff 509279.Mar 29 2023, 3:13 AM
tbaeder updated this revision to Diff 509281.Mar 29 2023, 3:15 AM
tbaeder marked an inline comment as done.Mar 29 2023, 3:17 AM
tbaeder added inline comments.
clang/lib/AST/Interp/ByteCodeExprGen.h
338

Yeah I think that makes sense, thanks.

clang/lib/AST/Interp/ByteCodeStmtGen.cpp
199–200

Renamed it to visitLoopBody and added a doc comment explaining its purpose.

aaron.ballman accepted this revision.Mar 29 2023, 7:07 AM

LGTM!

clang/lib/AST/Interp/ByteCodeStmtGen.cpp
325–333

Thank you, I like this!

This revision is now accepted and ready to land.Mar 29 2023, 7:07 AM
tbaeder marked an inline comment as done.Mar 29 2023, 7:37 AM

Thanks, I'll verify the patch once again with an msan build before I push, just to be sure.

This revision was landed with ongoing or failed builds.Mar 30 2023, 4:17 AM
This revision was automatically updated to reflect the committed changes.