This is an archive of the discontinued LLVM Phabricator instance.

[clang] Fix 2 bugs with parenthesized aggregate initialization
ClosedPublic

Authored by ayzhao on Mar 20 2023, 3:20 PM.

Details

Summary
  • Fix an issue where temporaries initialized via parenthesized aggregate initialization don't get destroyed.
  • Fix an issue where aggregate initialization omits calls to class members' move constructors after a TreeTransform. This occurs because the CXXConstructExpr wrapping the call to the move constructor gets unboxed during a TreeTransform of the wrapping FunctionalCastExpr (as with a InitListExpr), but unlike InitListExpr, we dont reperform the InitializationSequence for the list's expressions to regenerate the CXXConstructExpr. This patch fixes this bug by treating CXXParenListInitExpr identically to InitListExpr in this regard.

Fixes #61145

Diff Detail

Event Timeline

ayzhao created this revision.Mar 20 2023, 3:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 20 2023, 3:20 PM
ayzhao requested review of this revision.Mar 20 2023, 3:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 20 2023, 3:20 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
ayzhao updated this revision to Diff 506754.Mar 20 2023, 3:22 PM

clean up test

ayzhao updated this revision to Diff 506755.Mar 20 2023, 3:26 PM

remove extra semicolon

rsmith added a subscriber: rsmith.Mar 20 2023, 3:42 PM
rsmith added inline comments.
clang/lib/Sema/SemaExprCXX.cpp
1582–1596 ↗(On Diff #506755)

The input to this function should be a syntactic initializer such as a ParenListExpr, not an already-type-checked semantic initializer such as a CXXParenListInitExpr. The right place to do this unwrapping is in TreeTransform::TransformInitializer, where we unwrap various other kinds of semantic initializer.

shafik added reviewers: Restricted Project, rsmith, erichkeane, aaron.ballman.Mar 20 2023, 5:42 PM
ayzhao added inline comments.Mar 20 2023, 6:34 PM
clang/lib/Sema/SemaExprCXX.cpp
1582–1596 ↗(On Diff #506755)

The right place to do this unwrapping is in TreeTransform::TransformInitializer, where we unwrap various other kinds of semantic initializer.

So the CXXParenListInitExpr for S never gets passed to TreeTransform::TransformInitializer - rather, TransformCXXParenListInitExpr(...) calls TransformExprs(...) on its subexpressions. If the subexpression happens to be a CXXConstructExpr, then TransformCXXConstructExpr will call TransformInitializer(...), which is why we saw the CXXConstructExpr getting deleted

Interestingly enough, if we initialize S with a braced init-list (i.e. S{(V&&) Vec}), we never end up calling TransformCXXConstructExpr(...) since we revert to the syntatic form..

The input to this function should be a syntactic initializer such as a ParenListExpr, not an already-type-checked semantic initializer such as a CXXParenListInitExpr.

We can probably do this by returning a ParenListExpr in TransformCXXParenListInitExpr(...). This does cause some problems though. If we use a ParenListExpr, how would we know when to bail out in line 1551 (!isa<InitListExpr>(Exprs[0]) && !isa<CXXParenListInitExpr>(Exprs[0]))? Surely, not every instance of a ParenListExpr would be caused by a CXXParenListInitExpr, and if this were the case, that would be a pretty brittle assumption. Furthermore, we never create a ParenListExpr for a corresponding CXXParenListInitExpr.

As an alternative, we could probably do something similar to InitListExpr, which has separate semantic and syntactic forms. Would that work?

Also, in general, are the TransformFooExpr(...) functions supposed to return semantic or syntactic results? I seem to see a mix of both...

rsmith added inline comments.Mar 21 2023, 12:33 PM
clang/lib/Sema/SemaExprCXX.cpp
1582–1596 ↗(On Diff #506755)

OK, I've looked at this a bit more. It looks to me like there are problems in two places:

  1. TreeTransform::RebuildCXXParenListInitExpr. When transforming the subexpressions of a CXXParenListInitExpr, we call TransformExprs passing IsCall=true, which means we call TransformInitializer on each of them, which reverts them to syntactic form. So we need to redo the semantic analysis for forming a CXXParenListInitExpr here, and can't just create a new CXXParenListInitExpr node directly. Probably the easiest way to handle that would be to produce a ParenListExpr here instead.
  1. Expr::getSubExprAsWritten (reached via here) is not properly handling CXXParenListExpr -- it should be stepping over it to the single argument of the expression, which was the as-written subexpression.

But looking more into (2), it seems like there's a representational oddity here, too. We use a CXXFunctionalCastExpr wrapping a CXXParenListExpr to model both the case of Aggregate(a) (which is a functional cast, equivalent by definition to (Aggregate)a) and the case of Aggregate(a, b) (which is not a functional cast). I don't think that's necessarily a problem -- we also incorrectly use CXXFunctionalCastExpr wrapping an InitListExpr for Aggregate{a, b} -- but it means that getSubExprAsWritten will need to be careful to only unwrap in the case where the parenthesized aggregate initialization expression had exactly one element, and it's not ideal from a source fidelity perspective.

In the long term, I think we should split CXXTemporaryObjectExpr into two -- an outer wrapper representing an explicit type construction expression of the form T(...) or T{...} (excluding the case T(a) which is a functional cast expression) and an inner CXXConstructExpr representing the actual constructor call. And then we can represent parenthesized aggregate initialization as either a CXXTemporaryObjectExpr wrapping a CXXParenListInitExpr (when the number of arguments is not equal to 1) or a CXXFunctionalCastExpr wrapping a CXXParenListInitExpr (when the number of arguments is equal to 1) -- and similarly we can represent the case of (Aggregate)a as a CastExpr wrapping a CXXParenListInitExpr. But for now I think we can leave the representation as-is and make TreeTranform handle it.

(Incidentally, the name CXXParenListInitExpr seems confusing here -- there aren't necessarily any parentheses involved, and the name really ought to mention aggregate initialization. CXXAggregateInitExpr would make more sense to me.)

If we use a ParenListExpr, how would we know when to bail out in line 1551

We should never see a ParenListExpr there (that would represent T((a, b)), not T(a, b), which would result in a comma expression not a ParenListExpr). Instead, the caller of BuildCXXTypeConstructExpr should convert the ParenListExpr into a list of expressions passed as Exprs, and pass the locations of the parentheses as LParenOrBraceLoc and RParenOrBraceLoc.

In particular, I think it would make sense for TransformCXXFunctionalCastExpr to detect when it's in this special case where the thing it's transforming isn't a functional cast expression, and call a different Rebuild... overload that takes a list of expressions instead of a single expression. (It would make some sense to call RebuildCXXTemporaryObjectExpr in this case.)

Also, in general, are the TransformFooExpr(...) functions supposed to return semantic or syntactic results? I seem to see a mix of both...

The intent is that TreeTransform on an expression removes all implicit nodes -- implicit conversions, semantic forms for initializers, and so on -- and returns the node to how it would have been before it was given to Sema as a subexpression. That means that the Sema functionality for building an expression with a given set of arguments doesn't need to handle the case where they've already been implicitly converted in some way, which gives us consistent behavior between the initial parse and template instantiation.

That means that when we see an expression being used as an initializer, we want to strip off all the purely semantic pieces and revert it back to a syntactic form -- that happens when transforming the initializer of a variable, an argument to a function, etc -- and also when we transform an expression that performed some kinds of conversions on its operands, we need to redo those conversions when transforming it.

ayzhao updated this revision to Diff 509158.Mar 28 2023, 3:38 PM

use ParenListExpr and fix getSubExprAsWritten(...)

ayzhao marked 2 inline comments as done.Mar 28 2023, 3:40 PM
ayzhao added inline comments.
clang/lib/Sema/SemaExprCXX.cpp
1582–1596 ↗(On Diff #506755)

Patch has been updated per this comment. PTAL.

ayzhao updated this revision to Diff 509160.Mar 28 2023, 3:41 PM
ayzhao marked an inline comment as done.

fix stray whitespace change

rsmith accepted this revision.Mar 30 2023, 1:28 PM

LGTM, thanks!

This revision is now accepted and ready to land.Mar 30 2023, 1:28 PM