This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Implement P2718R0 "Lifetime extension in range-based for loops"
Needs ReviewPublic

Authored by yronglin on Jun 24 2023, 7:17 AM.

Details

Summary

Implement P2718R0 "Lifetime extension in range-based for loops" (https://wg21.link/P2718R0)

Diff Detail

Event Timeline

yronglin created this revision.Jun 24 2023, 7:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 24 2023, 7:17 AM
yronglin requested review of this revision.Jun 24 2023, 7:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 24 2023, 7:17 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
yronglin edited the summary of this revision. (Show Details)Jun 24 2023, 7:27 AM
yronglin added a reviewer: shafik.
yronglin edited the summary of this revision. (Show Details)Jun 24 2023, 7:29 AM
yronglin updated this revision to Diff 534212.Jun 24 2023, 7:36 AM

Update c++ status

yronglin added a reviewer: Restricted Project.Jun 24 2023, 7:51 AM
yronglin updated this revision to Diff 534559.Jun 26 2023, 7:52 AM

Format and rebase

yronglin retitled this revision from [clang] Implement P2718R0 "Lifetime extension in range-based for loops" to [Clang] Implement P2718R0 "Lifetime extension in range-based for loops".Jun 26 2023, 9:18 AM

friendly ping~

Sorry, I missed the ping on Discord.
Thanks for working on this

I don't feel qualified to review this, but I don't think there are nearly enough tests.
FYI there is a previous attempt at this here https://reviews.llvm.org/D139586 - with some test suggestions

Sorry, I missed the ping on Discord.
Thanks for working on this

I don't feel qualified to review this, but I don't think there are nearly enough tests.
FYI there is a previous attempt at this here https://reviews.llvm.org/D139586 - with some test suggestions

Thank you for take a look! I'm sorry I have not found D139586 already exists, should I close this ticket? D139586 seems more complete.

Sorry, I missed the ping on Discord.
Thanks for working on this

I don't feel qualified to review this, but I don't think there are nearly enough tests.
FYI there is a previous attempt at this here https://reviews.llvm.org/D139586 - with some test suggestions

If you are no bandwidth to working on D139586, I'd be happy to do take over.

rsmith added inline comments.Jul 6 2023, 12:20 PM
clang/lib/Sema/SemaExprCXX.cpp
8909–8922

This isn't the right way to model the behavior here -- the presence or absence of an ExprWithCleanups is just a convenience to tell consumers of the AST whether they should expect to see cleanups later or not, and doesn't carry an implication of affecting the actual temporary lifetimes and storage durations.

The outcome that we should be aiming to reach is that all MaterializeTemporaryExprs created as part of processing the for-range-initializer are marked as being lifetime-extended by the for-range variable. Probably the simplest way to handle that would be to track the current enclosing for-range-initializer variable in the ExpressionEvaluationContextRecord, and whenever a MaterializeTemporaryExpr is created, if there is a current enclosing for-range-initializer, mark that MaterializeTemporaryExpr as being lifetime-extended by it.

yronglin marked an inline comment as done.Jul 7 2023, 5:07 AM
yronglin added inline comments.
clang/lib/Sema/SemaExprCXX.cpp
8909–8922

Awesome! Thanks a lot for your advice, this is very helpful! I want to take a longer look at it.

yronglin retitled this revision from [Clang] Implement P2718R0 "Lifetime extension in range-based for loops" to [WIP][Clang] Implement P2718R0 "Lifetime extension in range-based for loops".Jul 11 2023, 4:06 AM
hubert.reinterpretcast added inline comments.
clang/lib/Sema/SemaExprCXX.cpp
8909–8922

As mentioned in D139586, CXXDefaultArgExprs may need additional handling. Similarly for CXXDefaultInitExprs.

yronglin marked 2 inline comments as done.Jul 12 2023, 4:31 AM
yronglin added inline comments.
clang/lib/Sema/SemaExprCXX.cpp
8909–8922

Thanks for your tips! I have a question that what's the correct way to extent the lifetime of CXXBindTemporaryExpr? Can I just materialize the temporary? It may replaced by MaterializeTemporaryExpr, and then I can mark it as being lifetime-extended by the for-range variable.

yronglin marked an inline comment as done.Jul 12 2023, 4:38 AM
yronglin added inline comments.
clang/lib/Sema/SemaExprCXX.cpp
8909–8922

Eg.

void f() {
  int v[] = {42, 17, 13};
  Mutex m;
  for (int x : static_cast<void>(LockGuard(m)), v) // lock released in C++ 2020
  {
    LockGuard guard(m); // OK in C++ 2020, now deadlocks
  }
}
BinaryOperator 0x135036220 'int[3]' lvalue ','
|-CXXStaticCastExpr 0x1350361d0 'void' static_cast<void> <ToVoid>
| `-CXXFunctionalCastExpr 0x135036198 'LockGuard':'struct LockGuard' functional cast to LockGuard <ConstructorConversion>
|   `-CXXBindTemporaryExpr 0x135036178 'LockGuard':'struct LockGuard' (CXXTemporary 0x135036178)
|     `-CXXConstructExpr 0x135036140 'LockGuard':'struct LockGuard' 'void (Mutex &)'
|       `-DeclRefExpr 0x135035e18 'Mutex':'class Mutex' lvalue Var 0x135035b40 'm' 'Mutex':'class Mutex'
`-DeclRefExpr 0x135036200 'int[3]' lvalue Var 0x135035928 'v' 'int[3]'
cor3ntin requested changes to this revision.Jul 12 2023, 4:50 AM

Marking that as needing revision while the author explores the direction outlined by Richard :)

This revision now requires changes to proceed.Jul 12 2023, 4:50 AM
clang/lib/Sema/SemaExprCXX.cpp
8909–8922

If MaterializeTemporaryExpr represents a "temporary materialization conversion", then the above should already have one just under the static_cast to void (since the cast operand would be a discarded-value expression).

There may be unfortunate effects from materializing temporaries for discarded-value expressions though: Technically, temporaries are also created for objects having scalar type.

Currently, MaterializeTemporaryExpr is documented as being tied to reference binding, but that is not correct: for example, MaterializeTemporaryExpr also appears when a member access is made on a temporary of class type.

yronglin added inline comments.Jul 13 2023, 5:46 AM
clang/lib/Sema/SemaExprCXX.cpp
8909–8922

http://eel.is/c++draft/class.temporary says:

[Note 3: Temporary objects are materialized:
.......
(2.6)
when a prvalue that has type other than cv void appears as a discarded-value expression ([expr.context]).
— end note]

Seems we should materialized the discard-value expression in this case, WDYT?

clang/lib/Sema/SemaExprCXX.cpp
8909–8922

I think we should, but what is the codegen fallout? Would no-opt builds start writing 42 into allocated memory for static_cast<void>(42)?

yronglin marked 2 inline comments as done.Jul 13 2023, 9:40 AM
yronglin added inline comments.
clang/lib/Sema/SemaExprCXX.cpp
8909–8922

Thanks for your confirm @hubert.reinterpretcast !

I have tried locally, the generated IR of void f() is:

define void @f()() {
entry:
  %v = alloca [3 x i32], align 4
  %m = alloca %class.Mutex, align 8
  %__range1 = alloca ptr, align 8
  %ref.tmp = alloca %struct.LockGuard, align 8
  %__begin1 = alloca ptr, align 8
  %__end1 = alloca ptr, align 8
  %x = alloca i32, align 4
  %guard = alloca %struct.LockGuard, align 8
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %v, ptr align 4 @__const.f().v, i64 12, i1 false)
  %call = call noundef ptr @Mutex::Mutex()(ptr noundef nonnull align 8 dereferenceable(64) %m)
  %call1 = call noundef ptr @LockGuard::LockGuard(Mutex&)(ptr noundef nonnull align 8 dereferenceable(8) %ref.tmp, ptr noundef nonnull align 8 dereferenceable(64) %m)
  store ptr %v, ptr %__range1, align 8
  %0 = load ptr, ptr %__range1, align 8
  %arraydecay = getelementptr inbounds [3 x i32], ptr %0, i64 0, i64 0
  store ptr %arraydecay, ptr %__begin1, align 8
  %1 = load ptr, ptr %__range1, align 8
  %arraydecay2 = getelementptr inbounds [3 x i32], ptr %1, i64 0, i64 0
  %add.ptr = getelementptr inbounds i32, ptr %arraydecay2, i64 3
  store ptr %add.ptr, ptr %__end1, align 8
  br label %for.cond

for.cond:                                         ; preds = %for.inc, %entry
  %2 = load ptr, ptr %__begin1, align 8
  %3 = load ptr, ptr %__end1, align 8
  %cmp = icmp ne ptr %2, %3
  br i1 %cmp, label %for.body, label %for.cond.cleanup

for.cond.cleanup:                                 ; preds = %for.cond
  %call5 = call noundef ptr @LockGuard::~LockGuard()(ptr noundef nonnull align 8 dereferenceable(8) %ref.tmp) #6
  br label %for.end

for.body:                                         ; preds = %for.cond
  %4 = load ptr, ptr %__begin1, align 8
  %5 = load i32, ptr %4, align 4
  store i32 %5, ptr %x, align 4
  %call3 = call noundef ptr @LockGuard::LockGuard(Mutex&)(ptr noundef nonnull align 8 dereferenceable(8) %guard, ptr noundef nonnull align 8 dereferenceable(64) %m)
  %call4 = call noundef ptr @LockGuard::~LockGuard()(ptr noundef nonnull align 8 dereferenceable(8) %guard) #6
  br label %for.inc

for.inc:                                          ; preds = %for.body
  %6 = load ptr, ptr %__begin1, align 8
  %incdec.ptr = getelementptr inbounds i32, ptr %6, i32 1
  store ptr %incdec.ptr, ptr %__begin1, align 8
  br label %for.cond

for.end:                                          ; preds = %for.cond.cleanup
  %call6 = call noundef ptr @Mutex::~Mutex()(ptr noundef nonnull align 8 dereferenceable(64) %m) #6
  ret void
}

The AST of void f() is:

|-FunctionDecl 0x14311d408 <line:62:1, line:69:1> line:62:6 used f 'void ()'
| `-CompoundStmt 0x143808898 <col:10, line:69:1>
|   |-DeclStmt 0x14311d710 <line:63:3, col:25>
|   | `-VarDecl 0x14311d528 <col:3, col:24> col:7 used v 'int[3]' cinit
|   |   `-InitListExpr 0x14311d648 <col:13, col:24> 'int[3]'
|   |     |-IntegerLiteral 0x14311d590 <col:14> 'int' 42
|   |     |-IntegerLiteral 0x14311d5b0 <col:18> 'int' 17
|   |     `-IntegerLiteral 0x14311d5d0 <col:22> 'int' 13
|   |-DeclStmt 0x14311d9f0 <line:64:3, col:10>
|   | `-VarDecl 0x14311d740 <col:3, col:9> col:9 used m 'Mutex':'Mutex' callinit destroyed
|   |   `-CXXConstructExpr 0x14311d9b0 <col:9> 'Mutex':'Mutex' 'void ()'
|   `-CXXForRangeStmt 0x143808700 <line:65:3, line:68:3>
|     |-<<<NULL>>>
|     |-DeclStmt 0x14311e2c0 <line:65:16>
|     | `-VarDecl 0x14311dfa8 <col:16, col:49> col:16 implicit used __range1 'int (&)[3]' cinit
|     |   `-ExprWithCleanups 0x14311e238 <col:16, col:49> 'int[3]' lvalue
|     |     `-BinaryOperator 0x14311de38 <col:16, col:49> 'int[3]' lvalue ','
|     |       |-CXXStaticCastExpr 0x14311dde8 <col:16, col:46> 'void' static_cast<void> <ToVoid>
|     |       | `-MaterializeTemporaryExpr 0x14311ddd0 <col:34, col:45> 'LockGuard':'LockGuard' xvalue extended by Var 0x14311dfa8 '__range1' 'int (&)[3]'
|     |       |   `-CXXFunctionalCastExpr 0x14311dd98 <col:34, col:45> 'LockGuard':'LockGuard' functional cast to LockGuard <ConstructorConversion>
|     |       |     `-CXXBindTemporaryExpr 0x14311dd78 <col:34, col:45> 'LockGuard':'LockGuard' (CXXTemporary 0x14311dd78)
|     |       |       `-CXXConstructExpr 0x14311dd40 <col:34, col:45> 'LockGuard':'LockGuard' 'void (Mutex &)'
|     |       |         `-DeclRefExpr 0x14311da18 <col:44> 'Mutex':'Mutex' lvalue Var 0x14311d740 'm' 'Mutex':'Mutex'
|     |       `-DeclRefExpr 0x14311de18 <col:49> 'int[3]' lvalue Var 0x14311d528 'v' 'int[3]'
|     |-DeclStmt 0x143808588 <col:14>
|     | `-VarDecl 0x14311e360 <col:14> col:14 implicit used __begin1 'int *':'int *' cinit
|     |   `-ImplicitCastExpr 0x143808400 <col:14> 'int *' <ArrayToPointerDecay>
|     |     `-DeclRefExpr 0x14311e2d8 <col:14> 'int[3]' lvalue Var 0x14311dfa8 '__range1' 'int (&)[3]'
|     |-DeclStmt 0x1438085a0 <col:14>
|     | `-VarDecl 0x143808248 <col:14, col:16> col:14 implicit used __end1 'int *':'int *' cinit
|     |   `-BinaryOperator 0x143808468 <col:14, col:16> 'int *' '+'
|     |     |-ImplicitCastExpr 0x143808450 <col:14> 'int *' <ArrayToPointerDecay>
|     |     | `-DeclRefExpr 0x14311e2f8 <col:14> 'int[3]' lvalue Var 0x14311dfa8 '__range1' 'int (&)[3]'
|     |     `-IntegerLiteral 0x143808430 <col:16> 'long' 3
|     |-BinaryOperator 0x143808628 <col:14> 'bool' '!='
|     | |-ImplicitCastExpr 0x1438085f8 <col:14> 'int *':'int *' <LValueToRValue>
|     | | `-DeclRefExpr 0x1438085b8 <col:14> 'int *':'int *' lvalue Var 0x14311e360 '__begin1' 'int *':'int *'
|     | `-ImplicitCastExpr 0x143808610 <col:14> 'int *':'int *' <LValueToRValue>
|     |   `-DeclRefExpr 0x1438085d8 <col:14> 'int *':'int *' lvalue Var 0x143808248 '__end1' 'int *':'int *'
|     |-UnaryOperator 0x143808668 <col:14> 'int *':'int *' lvalue prefix '++'
|     | `-DeclRefExpr 0x143808648 <col:14> 'int *':'int *' lvalue Var 0x14311e360 '__begin1' 'int *':'int *'
|     |-DeclStmt 0x14311dee0 <col:8, col:50>
|     | `-VarDecl 0x14311de78 <col:8, col:14> col:12 x 'int' cinit
|     |   `-ImplicitCastExpr 0x1438086d0 <col:14> 'int' <LValueToRValue>
|     |     `-UnaryOperator 0x1438086b8 <col:14> 'int' lvalue prefix '*' cannot overflow
|     |       `-ImplicitCastExpr 0x1438086a0 <col:14> 'int *':'int *' <LValueToRValue>
|     |         `-DeclRefExpr 0x143808680 <col:14> 'int *':'int *' lvalue Var 0x14311e360 '__begin1' 'int *':'int *'
|     `-CompoundStmt 0x143808880 <line:66:3, line:68:3>
|       `-DeclStmt 0x143808868 <line:67:5, col:23>
|         `-VarDecl 0x143808778 <col:5, col:22> col:15 guard 'LockGuard':'LockGuard' callinit destroyed
|           `-CXXConstructExpr 0x143808820 <col:15, col:22> 'LockGuard':'LockGuard' 'void (Mutex &)'
|             `-DeclRefExpr 0x1438087e0 <col:21> 'Mutex':'Mutex' lvalue Var 0x14311d740 'm' 'Mutex':'Mutex'

Would no-opt builds start writing 42 into allocated memory for static_cast<void>(42)?

I have got this result with my locally build clang

define noundef i32 @main() {
entry:
  %retval = alloca i32, align 4
  %ref.tmp = alloca i32, align 4
  store i32 0, ptr %retval, align 4
  store i32 42, ptr %ref.tmp, align 4
  call void @f()()
  ret i32 0
}
clang/lib/Sema/SemaExprCXX.cpp
8909–8922

Would no-opt builds start writing 42 into allocated memory for static_cast<void>(42)?

I have got this result with my locally build clang

define noundef i32 @main() {
entry:
  %retval = alloca i32, align 4
  %ref.tmp = alloca i32, align 4
  store i32 0, ptr %retval, align 4
  store i32 42, ptr %ref.tmp, align 4
  call void @f()()
  ret i32 0
}

@rsmith, in terms of the AST, do you have advice on the approach we should take in generating MaterializeTemporaryExprs for discarded-value expressions? Do we go with a language-specification purest direction and generate these for all prvalue object-typed discarded-value expressions? Do we limit generating them to range-for initializers? Do we further specialize and limit to non-scalar types or, even more specifically, to cases involving constructors or destructors?

yronglin marked an inline comment as done.
rsmith added inline comments.Jul 21 2023, 3:48 PM
clang/lib/Sema/SemaExprCXX.cpp
8909–8922

The cleanest solution seems to be that we create all the language-specified temporary materialization conversions; that approach will minimize the chance of subtle bugs in other areas of language semantics, and give a more faithful AST to external AST consumers. I think I have two questions:

  1. How hard is it to teach CodeGen to still not create storage for a discarded temporary? I think we know when we're emitting a discarded expression, so we should have a reasonable place to hook into and skip the materialization.
  2. What is the memory impact of creating these additional materializations? Eg, for a handful of fairly normal translation units, how much more space does the AST take?

I think the first step should be to add the code to create the new MTEs. Then we can explore points (1) and (2) to decide if we do so only within for-range initializers or everywhere.

yronglin updated this revision to Diff 544386.Jul 26 2023, 8:47 AM

Try implement

Thanks for your comments and sorry for the very late reply, I have been investigating how to achieve it these days.

clang/lib/Sema/SemaExprCXX.cpp
8909–8922

Seems it was ignored before: https://github.com/llvm/llvm-project/blob/cb63fa00d1f723e3b4e2fb5e6645595eb0a6e84c/clang/lib/Sema/SemaExprCXX.cpp#L8219-L8228

How hard is it to teach CodeGen to still not create storage for a discarded temporary?

I try to implement as wording by the standard and add a new flags variable CanIgnore to identify that the MaterializeTemporaryExpr can be ignored, but it broken 300+ tests, and I will try to fix these.

What is the memory impact of creating these additional materializations?

Can I just dump the status of BumpPtrAllocator? or does clang has any fancy option to do this?

This is a very early implement, please give me some time to add and fix tests.

yronglin added a comment.EditedAug 6 2023, 6:47 AM

Sorry for the late reply. I tried to investigate the memory impact of creating these additional materializations by build the whole llvm-project and compare the number of MaterializedTemporaryExpr created during parsing.

Steps:

  1. Add a public member uint64_t NumMetarilizedTemporaryExpr = 0; in ASTContext .
  2. Increment the value of NumMetarilizedTemporaryExpr in Sema::CreateMaterializeTemporaryExpr.
  3. Write the NumMetarilizedTemporaryExpr into a text file when ASTContext destruction, each translation unit will append a line to the file to record the value of NumMetarilizedTemporaryExpr .
  4. Build the entire llvm-project separately using the compiler that creates addational materializations and the compiler that doesn't.
  5. Sum the numbers produced by each translation unit.

The result is:

ItemCount
Addational Materialized Temporarys Total50655585
Clang Trunk Total18346347
Diff32309238

The more detail result in https://yronglin.github.io

The gap between these two numbers is very large. So I'think we can create additional materializations only within for-range initializers. I'm not sure if I can find a way to only create materializes for temporaries that need to have an extended lifetime, WDYT?

The gap between these two numbers is very large. So I'think we can create additional materializations only within for-range initializers. I'm not sure if I can find a way to only create materializes for temporaries that need to have an extended lifetime, WDYT?

@rsmith asked for something slightly different (but I am not sure how to collect that info): He wanted the AST size delta.

For what we have, I think the detailed results table would be more useful if we add some size metric for the files and the increase expressed as a percentage. Perhaps sorting by file size would help contextualize the significance of the percentage change.

The gap between these two numbers is very large. So I'think we can create additional materializations only within for-range initializers. I'm not sure if I can find a way to only create materializes for temporaries that need to have an extended lifetime, WDYT?

@rsmith asked for something slightly different (but I am not sure how to collect that info): He wanted the AST size delta.

For what we have, I think the detailed results table would be more useful if we add some size metric for the files and the increase expressed as a percentage. Perhaps sorting by file size would help contextualize the significance of the percentage change.

Thanks to Aaron's suggestion in Discard, the initial idea was to count the number of MaterializedTemporaryExpr created in the process of building llvm-project, and then calculate the precise memory increment data, without being affected by the operating system state. Maybe I can refine this table, add a column to show the memory increment percentage change of each file.

yronglin updated this revision to Diff 549449.Aug 11 2023, 9:58 AM

Only create addational metarialized temporary in for-range-init.
FIXME: Need to handle function default argument, and add more test to make sure the generated LLVM IR is correct.

yronglin updated this revision to Diff 549715.Aug 13 2023, 10:20 AM

Handle default argument and dependent context.

yronglin retitled this revision from [WIP][Clang] Implement P2718R0 "Lifetime extension in range-based for loops" to [Clang] Implement P2718R0 "Lifetime extension in range-based for loops".Aug 13 2023, 10:20 AM

FIXME: Need add test in clang/test/CXX/special/class.temporary/p6.cpp to check generated LLVM IR.

Should we dump the real default argument AST but not only CXXDefaultArgExpr if CXXDefaultArgExpr has been rewritted?

yronglin updated this revision to Diff 549896.Aug 14 2023, 5:53 AM

Add test to check generated LLVM IR, and fix crash.

@cor3ntin I have reused EnsureImmediateInvocationInDefaultArgs to rewrite CXXDefaultArgExpr, does this is a correct approach?

Sorry for the late reply. I tried to investigate the memory impact of creating these additional materializations by build the whole llvm-project and compare the number of MaterializedTemporaryExpr created during parsing.

Steps:

  1. Add a public member uint64_t NumMetarilizedTemporaryExpr = 0; in ASTContext .
  2. Increment the value of NumMetarilizedTemporaryExpr in Sema::CreateMaterializeTemporaryExpr.
  3. Write the NumMetarilizedTemporaryExpr into a text file when ASTContext destruction, each translation unit will append a line to the file to record the value of NumMetarilizedTemporaryExpr .
  4. Build the entire llvm-project separately using the compiler that creates addational materializations and the compiler that doesn't.
  5. Sum the numbers produced by each translation unit.

The result is:

ItemCount
Addational Materialized Temporarys Total50655585
Clang Trunk Total18346347
Diff32309238

The more detail result in https://yronglin.github.io

The gap between these two numbers is very large. So I'think we can create additional materializations only within for-range initializers. I'm not sure if I can find a way to only create materializes for temporaries that need to have an extended lifetime, WDYT?

I have updated the table, and it was sorted by Count Inc col.

Num files: 7445

ItemCountMem
Addational Materialized Temporarys Total506555851187240.2734 (KB)
Clang Trunk Total18346347429992.5078 (KB)
Diff32309238757247.7656 (KB)
Avg4339.72304.2380 (KB/file)

friendly ping~

yronglin updated this revision to Diff 551845.Aug 20 2023, 8:35 AM

Fix ci failure, and introduce an variable in ExpressionEvaluationContextRecord to rewrite default argument.

yronglin updated this revision to Diff 551871.Aug 20 2023, 4:45 PM

Do not spass MaterializePRValueInDiscardStatement in PushExpressionEvaluationContext.

I'm really not qualified to give you meaningful feedback on design here, but I spotted a few minor things that can be improved, I hope it helps!

clang/include/clang/Sema/Sema.h
1357

Can IsRewriteDefaultArgument and MaterializePRValueInDiscardStatement have different values?
Maybe MaterializePRValueInDiscardStatement is sufficient

1359–1361
1361

Either IsInLifetimeExtendingContext or IsInCXXForRangeInitializer seem like better names

1363–1372
9866
clang/lib/Parse/ParseDecl.cpp
2330–2333

We need to decide whether we want to backport thgis behavior to older language mode, i think that would make sense.
@hubert.reinterpretcast wdyt?

2340–2344

A discarded statement is the non-taken branch of an if constexpr, so using discarded expression (or discarded value expression) will lead to less confusion.

clang/lib/Sema/SemaExpr.cpp
18168

Whitespace only change

clang/lib/Sema/SemaExprCXX.cpp
8206–8209

Correct me if I'm wrong but the thing we want to avoid here is to avoid creating unnecessary AST nodes, right?

clang/lib/Sema/SemaInit.cpp
7660
yronglin marked 4 inline comments as done.Aug 21 2023, 6:04 AM

@cor3ntin Thanks for your review!

clang/include/clang/Sema/Sema.h
1357

Currently they have the same value, but they mean different things, so I use two separated variables.

1361

Thanks for your suggestion, IsInLifetimeExtendingContext looks good to me, I can use this var name later.

clang/lib/Sema/SemaExpr.cpp
6250

@cor3ntin Does EnsureImmediateInvocationInDefaultArgs need to be renamed to a more generic name?

18168

Thanks, I'll remove this.

clang/lib/Sema/SemaExprCXX.cpp
8206–8209

Yeah, ignore unnecessary AST nodes and runtime memory allocation. The redundant alloca/store instructions may be removed during the optimization phase.

Eg.

static_cast<void>(42);
define noundef i32 @main() {
entry:
  %retval = alloca i32, align 4
  %ref.tmp = alloca i32, align 4
  store i32 0, ptr %retval, align 4
  store i32 42, ptr %ref.tmp, align 4
  call void @f()()
  ret i32 0
}
yronglin updated this revision to Diff 552009.Aug 21 2023, 7:02 AM
yronglin marked 7 inline comments as done.

Address comments

friendly ping~

The changes in SemaInit.cpp don't look correct to me. Trying to recurse through the initializer and find every nested temporary is highly error-prone; I can't tell you which expression nodes you forgot to recurse through, but I'm sure there are some.

I think the approach being taken here is not really maintainable. We don't want the initialization code to need to know how to recurse through an expression after the fact and find all the temporaries that might need to be lifetime-extended, and we don't need it to do that either. Instead, we should make the expression evaluation context track the current for-range variable, and in Sema::CreateMaterializeTemporaryExpr, we should create a temporary that is already set to be lifetime-extended by the loop variable.

clang/include/clang/Sema/Sema.h
1356–1357

Can you expand this comment to more fully describe what this flag governs? Which default argument? How would it be rewritten?

clang/lib/Parse/ParseDecl.cpp
2330–2333

Definitely not. This is a visible behavior change to a rule that was not in any way incorrect or broken, and for which there were no implementation concerns. The change was not voted into the standard as a DR (see https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4929.html). Applying this behavior in older standard versions would be a conformance bug.

yronglin marked 2 inline comments as done.Oct 14 2023, 9:27 AM

@rsmith Thanks a lot for your comments and sorry for the very late reply, I was on vacation some time ago.

The changes in SemaInit.cpp don't look correct to me. Trying to recurse through the initializer and find every nested temporary is highly error-prone; I can't tell you which expression nodes you forgot to recurse through, but I'm sure there are some.

I think the approach being taken here is not really maintainable. We don't want the initialization code to need to know how to recurse through an expression after the fact and find all the temporaries that might need to be lifetime-extended, and we don't need it to do that either. Instead, we should make the expression evaluation context track the current for-range variable, and in Sema::CreateMaterializeTemporaryExpr, we should create a temporary that is already set to be lifetime-extended by the loop variable.

I agree, I have reverted the changes in SemaInit.cpp. I have ever tried to do lifetime-extend in Sema::CreateMaterializeTemporaryExpr, but I fall into some trouble, the MaterializeTemporaryExpr was created before the for-range VarDecl, so we don't have a VarDecl for MaterializeTemporaryExpr::setExtendingDecl in Sema::CreateMaterializeTemporaryExpr, I have a possible solution here, Eg. we allocate a memory block which has same size of VarDecl, and pass this pointer to MaterializeTemporaryExpr::setExtendingDecl as a placeholder VarDecl, when we build the for-range statement, we just construct a VarDecl in the memory block that we allocated before. But this approach doesn't look very good and not maintainable. So I use ForRangeInitTemporaryLifetimeExtensionVisitor to visit every temporaries in the initializer and extend the lifetime. And ForRangeInitTemporaryLifetimeExtensionVisitor was derived from RecursiveASTVisitor, maybe this approach be able to handle all temporaries. WDYT?

clang/include/clang/Sema/Sema.h
1356–1357

Yeah, this variable has been removed. By default, all CallExprs in Clang share the same CXXDefaultArgExpr from parameters, but in some contexts (such as lifetime extension), the lifetime of the temporaries in the current default parameters needs to be extended, so the CXXDefaultArgExpr needs to be copied and the lifetime of temporaries in the copy need to be extended.

yronglin marked an inline comment as done.Oct 24 2023, 3:28 AM

ping~

friendly ping~

Endill added a subscriber: Endill.Nov 16 2023, 8:33 AM

@yronglin We are sorry it takes so much time to get feedback. Richard and Hubert have little bandwidth for reviews. Others, including me, don't feel qualified to provide good feedback.

@yronglin We are sorry it takes so much time to get feedback. Richard and Hubert have little bandwidth for reviews. Others, including me, don't feel qualified to provide good feedback.

@Endill Thanks for your reply, let's wait for them have time. You are clang experts, please feel free to comments.

I'm also not an authority here, but would like to help get this unstuck. It seems like how to tie the MaterializeTemporaryExprs to the lifetime-extending decl is the biggest outstanding question?

I think the approach being taken here is not really maintainable. We don't want the initialization code to need to know how to recurse through an expression after the fact and find all the temporaries that might need to be lifetime-extended, and we don't need it to do that either. Instead, we should make the expression evaluation context track the current for-range variable, and in Sema::CreateMaterializeTemporaryExpr, we should create a temporary that is already set to be lifetime-extended by the loop variable.

I agree, I have reverted the changes in SemaInit.cpp. I have ever tried to do lifetime-extend in Sema::CreateMaterializeTemporaryExpr, but I fall into some trouble, the MaterializeTemporaryExpr was created before the for-range VarDecl, so we don't have a VarDecl for MaterializeTemporaryExpr::setExtendingDecl in Sema::CreateMaterializeTemporaryExpr

I think the RecursiveASTVisitor solution is still fragile.
Looking at BuildForRangeVarDecl, it seems it doesn't need any inputs that would preclude building it earlier, before parsing the range expression.
I haven't looked at *exactly* how to restructure the interaction between Parser/Sema to enable this, maybe it's a little ugly, but it seems justified in order to be able to use ExpressionEvaluationContextRecord as intended.

clang/lib/Sema/SemaStmt.cpp
2741

I think this will walk into the body of lambdas and extend the lifetime of temporaries found there.

You could explicitly prevent this, but this looks like the same sort of fragility Richard was concerned with: Expression evaluation context seems to be the right way to control the propagation, and it'd be nice not to reinvent it.

yronglin updated this revision to Diff 558171.Sun, Nov 26, 6:53 AM

Apply comments

I'm also not an authority here, but would like to help get this unstuck. It seems like how to tie the MaterializeTemporaryExprs to the lifetime-extending decl is the biggest outstanding question?

I think the approach being taken here is not really maintainable. We don't want the initialization code to need to know how to recurse through an expression after the fact and find all the temporaries that might need to be lifetime-extended, and we don't need it to do that either. Instead, we should make the expression evaluation context track the current for-range variable, and in Sema::CreateMaterializeTemporaryExpr, we should create a temporary that is already set to be lifetime-extended by the loop variable.

I agree, I have reverted the changes in SemaInit.cpp. I have ever tried to do lifetime-extend in Sema::CreateMaterializeTemporaryExpr, but I fall into some trouble, the MaterializeTemporaryExpr was created before the for-range VarDecl, so we don't have a VarDecl for MaterializeTemporaryExpr::setExtendingDecl in Sema::CreateMaterializeTemporaryExpr

I think the RecursiveASTVisitor solution is still fragile.
Looking at BuildForRangeVarDecl, it seems it doesn't need any inputs that would preclude building it earlier, before parsing the range expression.
I haven't looked at *exactly* how to restructure the interaction between Parser/Sema to enable this, maybe it's a little ugly, but it seems justified in order to be able to use ExpressionEvaluationContextRecord as intended.

Many thanks for your review!
I have canceled RecursiveASTVisitor plan. I try to use ExpressionEvaluationContextRecord and collect temporaries in CreateMaterializeTemporaryExpr when parsing for-range-init, and then extend lifetime of the collected temporaries in ActOnCXXForRangeStmt. WDYT?

friendly ping~