Page MenuHomePhabricator

[clang] Implement P2266 Simpler implicit move
Needs ReviewPublic

Authored by mizvekov on Mar 19 2021, 8:24 PM.

Details

Summary

This Implements P2266 Simpler implicit move.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Diff Detail

Event Timeline

mizvekov created this revision.Mar 19 2021, 8:24 PM
mizvekov requested review of this revision.Mar 19 2021, 8:24 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 19 2021, 8:24 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

For coroutines I have D68845. The current code is wrong, and not trivial to fix. We have never properly implemented P1825 and will probably just go with P2266 unconditionally, because that two-phase lookup is very cumbersome to imlement.

For coroutines I have D68845. The current code is wrong, and not trivial to fix. We have never properly implemented P1825 and will probably just go with P2266 unconditionally, because that two-phase lookup is very cumbersome to imlement.

Okay, once this is properly tested, we can spin a separate patch from this, just taking the coroutine bits unconditionally, and propose a new DR.

mizvekov updated this revision to Diff 332140.Mar 20 2021, 7:15 PM
  • Removed special flag, now is enabled on -std=c++2b
  • Always enabled for coroutines
  • Ran test suite
  • Some relevant test files updated to also run c++2b
  • Some new tests added
mizvekov edited the summary of this revision. (Show Details)Mar 20 2021, 7:19 PM

Conspicuously missing: tests for decltype(auto) return types, tests for co_return.

Personally I'd rather see these new p2266-related tests go in a separate test file, not appended to the existing file, so as to keep each test file relatively simpler and shorter. (But I don't know that those-in-charge would agree.)

clang/test/CXX/class/class.init/class.copy.elision/p3.cpp
4

These labels seem to be getting unwieldy. I propose that you rename cxx11_14_17 to cxx17 and cxx11_14_17_20 to cxx17_20.

(There are no cases where C++11, '14, '17's behavior differ.)

So then we'll have cxx17, cxx17_20, cxx20, cxx20_2b, and cxx2b.
IMHO it might even be nice to eliminate the overlapping labels; just, when a message is expected in two out of three modes, write two expectations.

B1(B1 &&) = delete;
  // cxx20-note@-1 {{'B1' has been explicitly marked deleted here}}
  // cxx2b-note@-2 {{'B1' has been explicitly marked deleted here}}

What do you think? (This could also be severed into its own earlier commit.)

332

FWIW here I would prefer

template MoveOnly& test_3a<MoveOnly&>(MoveOnly&);
template MoveOnly&& test_3a<MoveOnly>(MoveOnly&&);

with an expected error on the second line. The parameter-ness of w seems like a red herring in your version.

342

I believe this is incorrect; throw x here shouldn't implicit-move because x's scope exceeds the nearest enclosing try-block.

clang/test/CXX/special/class.copy/p33-0x.cpp
3 ↗(On Diff #332140)

Could you please ensure that this test is updated to be tested in all modes? My interpretation of the old line 1 is it's been locked to run only in '11 mode (not '14 or '17 or '20), which is just bizarre. There's nothing '11-specific about this file that I can see. (And again that change can be severed from the rest of this PR and landed earlier, I hope.)

41 ↗(On Diff #332140)

I believe this is incorrect; throw x here shouldn't implicit-move (not even in C++2b) because x's scope exceeds the nearest enclosing try-block.

With my previous comment I meant that it's better if you leave out the co_return bits for now because it's wrong anyway. We can't use PerformMoveOrCopyInitialization. It would just generate merge conflicts.

I'll probably just update my change once the committee made up its mind. I can rebase it on top of this if it needs to be dependent on C++20 vs C++2b.

Conspicuously missing: tests for decltype(auto) return types, tests for co_return.

Though the first comment warns you that this is a work in progress and I just didn't get around to doing it yet ;)

Though decltype I had tested manually, co_return I have just begun testing at all.

I believe the decltype tests should go into the other file relevant to the pertinent section. One problem with leaving them here is that it would be extra noise for the tests for the older standards which don't even support the keyword at all.

Personally I'd rather see these new p2266-related tests go in a separate test file, not appended to the existing file, so as to keep each test file relatively simpler and shorter. (But I don't know that those-in-charge would agree.)

They go into files named after section / paragraph, and I haven't come across examples where they were split yet, but I haven't looked specifically for this.

With my previous comment I meant that it's better if you leave out the co_return bits for now because it's wrong anyway. We can't use PerformMoveOrCopyInitialization. It would just generate merge conflicts.

I am using PerformMoveOrCopyInitialization with AllowNRVO always false for co_return, which effectively makes it a PerformCopyInitialization (just replacing them instead was one of the WIP cleanups I mentioned).
I am new with this part of the code here, so maybe this is obvious, but why is it broken? Do you have any examples of test cases where this would fail?
Would this patch here cause any regressions?
Right now this is passing the test suite, and seems to be reporting the correct results for the tests you created for D68845.
Remember, here we just want to evaluate the solution proposed for P2266 by testing this at large, in order to build support for it to be accepted into the standard.
Just waiting for it to be accepted first before implementing it would be too late :)

clang/test/CXX/class/class.init/class.copy.elision/p3.cpp
4

I like the idea of shortening the labels. Duplicating the warnings if it can be avoided easily, not so much. DRY is my mantra :D

Though what you said about severing this into an earlier commit, I think something may need to be done about it anyway.
So right now the tests are all over the place about which standard they are running, and none are testing c++2b at all...
I think we might need to gather input on this. Just changing every test to run every standard might be too much, so we could need a less obvious criteria.

332

Sure, sounds good.

342

Err, you are right, my bad, this is an easy fix...

clang/test/CXX/special/class.copy/p33-0x.cpp
3 ↗(On Diff #332140)

See my earlier comment about doing this systematically.

With my previous comment I meant that it's better if you leave out the co_return bits for now because it's wrong anyway. We can't use PerformMoveOrCopyInitialization. It would just generate merge conflicts.

Okay, I think I see now what you mean.
For example this:

struct task {                                                             
  struct promise_type {
    ...
    void return_value(T &&value) {}
  };                                                                      
};
task<NoCopyNoMove> local2val() {
  NoCopyNoMove value;
  co_return value;
 }

We should expect the test above to work, by binding value to the rvalue reference in task's promise, right?
Hmm, my natural course of action here would have been to figure out what is wrong and fix it.
I thought this would be good to have implemented before the committee decides on it.
@Quuxplusone thoughts? Is coroutine support optional here?

We should expect the test above to work, by binding value to the rvalue reference in task's promise, right?

Precisely, we don't want to do any initialization at all. (There will be an initialization for the parameters in BuildCallExpr.)

Implicit move just means that we're converting the lvalue to an xvalue.

On second thought, since @Quuxplusone's proposal makes things so simple you can just do this right yourself: instead of the initialization just do the cast, as D68845 does it. That's it.

Then let's see what we can get for C++20. If the proposal is accepted and gets DR status for co_return we might not need my change at all.

mizvekov updated this revision to Diff 332184.EditedMar 21 2021, 5:56 PM

Changed:

  • Implemented co_return as per Aaron's suggestion.
  • Imported coroutine tests from D68845.
  • Test tags shortened: cxxYY_cxxZZ means every version from YY to ZZ, instead of cxxAA_BB_CC_DD.
  • Existing test on decltype(auto) updated to also run on c++2b.
  • Change one of the test cases to just instantiate the relevant template, as per Arthur.

Still missing:

  • Making this run on all tests:
    • A lot of tests are running in only one old standard. Changing everything here would generate too much noise.
  • Some simplifications:
    • Aggregate common code for adding the implicit cast.
    • PerformMoveOrCopyInitialization can be changed to PerformCopyInitialization in the cases where AllowNRVO is false.
    • Can easily skip adding the implicit cast if expr was already a xvalue.
    • getCopyElisionCandidate needs refactoring IMHO. It is doing too much in one function, and this patch makes it worse. There are a lot of calls to it sprinkled around in the same sequence. Some problems:
      • On the current code, we want two things out of it: "Can we perform copy elision" AND "Can we perform implicit move". We end up calling it twice for that, with two parameters, instead of once returning both answers.
      • With this patch, now we need to call it even before we have the function return type available, then decide this type based on it, and then call it again to figure out if we can do copy elision. I will try to propose some way to improve that.
mizvekov updated this revision to Diff 332186.Mar 21 2021, 7:05 PM

small cleanup.

mizvekov updated this revision to Diff 332187.Mar 21 2021, 7:08 PM

small cleanup.

mizvekov edited the summary of this revision. (Show Details)Mar 22 2021, 10:28 AM

Splitting off the test changes into: https://reviews.llvm.org/D99225

mizvekov updated this revision to Diff 333158.Mar 24 2021, 3:43 PM

Rebased on top of D99225, running on a bunch of extra tests.

mizvekov edited the summary of this revision. (Show Details)Mar 24 2021, 3:45 PM
mizvekov marked 3 inline comments as done.

Data points: I've run this new Clang in -std=c++2b mode over my employer's (medium-sized, not too crazy) C++17 codebase, and also built Clang with Clang. Neither one required any source-code changes to deal with p2266. (The C++17 codebase did need about 10 identical changes to fix up ambiguous operator== overloads for C++20.)

Could we get someone to check out the Bloomberg codebase?

mizvekov updated this revision to Diff 333707.Mar 27 2021, 8:58 PM

Cleaned up implementation, not WIP anymore, ready for review.

mizvekov edited the summary of this revision. (Show Details)Mar 27 2021, 9:06 PM
mizvekov retitled this revision from [clang] WIP: Implement P2266 to [clang] Implement P2266 Simpler implicit move.
mizvekov updated this revision to Diff 333745.Mar 28 2021, 4:19 PM

So it turns out there was a pre-existing bug where ObjC++ blocks
with dependent return type were not getting copy elision.

My latest refactoring accidentally fixed this in the return statement
sema action, but unfortunately the template instantiator still expects
only FunctionDecl and crashes otherwise.

I am not sure there is an easy way to fix this, as BlockDecl
doesn't even carry a type unlike FunctionDecl, so there is no easy way
to get the return type.

This might require some surgery to fix, so for now I am just
skipping it and putting a FIXME.

mizvekov added inline comments.
clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
1059 ↗(On Diff #333745)

@rjmccall Hello! Do you have any tips for fixing this one?

mizvekov updated this revision to Diff 333749.Mar 28 2021, 4:57 PM

small fix

mizvekov updated this revision to Diff 333756.Mar 28 2021, 7:44 PM

some further simplifications

rjmccall added inline comments.Mar 28 2021, 11:37 PM
clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
1059 ↗(On Diff #333745)

I've always found it annoying that we only store the block type on the BlockExpr and not the BlockDecl, so if you've got a good reason to change that, I'm not opposed.

That said, while I'm not an expert in how we do NRVO variable tracking, the code must have some ability to handle an unknown return type when just processing declarations by delaying some of the bookkeeping/analysis until the end of the function. It's possible but quite uncommon to declare a return type explicitly in a block, so that's the dominant case here. I'm actually surprised we do any of this tracking when we're just instantiating a variable.

mizvekov updated this revision to Diff 333925.Mar 29 2021, 10:32 AM

Small update making some invariants more clear, with some simplifications.

mizvekov added inline comments.Mar 29 2021, 3:57 PM
clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
1059 ↗(On Diff #333745)

Thanks, your observations have been extremely helpful.

So this whole NRVO tracking thing looks to be in bad shape. This tracker here fails to handle 'auto' types. I don't think this gets fixed up anywhere else.

This is also badly covered by the test suite. The test included with the merge request that introduced this tracking (CodeGenCXX/nrvo.cpp) still passes if you either delete this whole block here, or if you just do Var->setNRVOVariable(D->isNRVOVariable()).

In fact, almost the whole test suite passes with those changes, the only tests which fail are CodeGenCXX/matrix-type-builtins.cpp and CodeGenCXX/matrix-type-operators.cpp respectively.

rjmccall added a subscriber: fhahn.Mar 29 2021, 6:54 PM
rjmccall added inline comments.
clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
1059 ↗(On Diff #333745)

I'm surprised that this has any impact on matrix types; pinging @fhahn in case he understands what the interaction might be.

It looks like we're assuming that the parse-time analysis of NRVO candidates is correct except for possibly needing adjustment for dependent expressions / return types. The parse-time analysis seems to work by keeping track of NRVO candidates on Scope; since we don't push meaningful Scopes during template instantiation, that won't work, so we have to trust the parse-time analysis. But the parse-time analysis can also be overly conservative, mostly because of if constexpr; that might be enough of a corner case to not merit something better, though. @rsmith is the expert here, though.

At any rate, it should be *correct* to handle non-function cases as if the return type were dependent, the same way it presumably works for functions with deduced return types.

mizvekov added inline comments.Mar 30 2021, 3:14 PM
clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
1059 ↗(On Diff #333745)

The main thing here is that the NRVO criteria for function return type is ignored if the type is dependent.
But it is not ignored for a non-dependent auto type, "because auto is not a record type", so this evaluates as not a NRVO candidate.

It is strange that we make this distinction between dependent and non-dependent auto here.

This is actually the only case where this function is called where the FnRetType is possibly auto, so it's an understandable oversight if that is the case.

I see we don't have the type deduced here even for functions with a single return statement and no if constexpr or other indirection, where it seems it would have been possible to have just deduced it to some dependent type during parsing.

That said, my (pretty uninformed) guess here is that it should be safe to treat auto the same as dependent, and it is probably the most reasonable choice.
That we are (again my guess) just trying here to weed out some candidates to get better results in some simple greedy algorithm later.

mizvekov updated this revision to Diff 335542.Apr 6 2021, 9:10 AM
mizvekov updated this revision to Diff 335604.Apr 6 2021, 11:12 AM
  • test/SemaCXX/coroutine-rvo: Make MoveOnly trivial, since the temporary is not created anymore and there is no special reason to test a non-trivial type here.