This is an archive of the discontinued LLVM Phabricator instance.

Diagnose cases of "return x" that should be "return std::move(x)" for efficiency
ClosedPublic

Authored by Quuxplusone on Feb 14 2018, 3:19 PM.

Details

Summary

This patch adds two new diagnostics, which are off by default:

-Wreturn-std-move

This diagnostic is enabled by -Wreturn-std-move, -Wmove, or -Wall.
Diagnose cases of return x or throw x, where x is the name of a local variable or parameter, in which a copy operation is performed when a move operation would have been available. The user probably expected a move, but they're not getting a move, perhaps because the type of "x" is different from the return type of the function.
A place where this comes up in the wild is stdext::inplace_function<Sig, N> which implements conversion via a conversion operator rather than a converting constructor; see https://github.com/WG21-SG14/SG14/issues/125#issue-297201412
Another place where this has come up in the wild, but where the fix ended up being different, was

try { ... } catch (ExceptionType ex) {
    throw ex;
}

where the appropriate fix in that case was to replace throw ex; with throw;, and incidentally to catch by reference instead of by value. (But one could contrive a scenario where the slicing was intentional, in which case throw-by-move would have been the appropriate fix after all.)
Another example (intentional slicing to a base class) is dissected in https://github.com/accuBayArea/Slides/blob/master/slides/2018-03-07.pdf

-Wreturn-std-move-in-c++11

This diagnostic is enabled only by the exact spelling -Wreturn-std-move-in-c++11.
Diagnose cases of "return x;" or "throw x;" which in this version of Clang *do* produce moves, but which prior to Clang 3.9 / GCC 5.1 produced copies instead. This is useful in codebases which care about portability to those older compilers.
The name "-in-c++11" is not technically correct; what caused the version-to-version change in behavior here was actually CWG 1579, not C++14. I think it's likely that codebases that need portability to GCC 4.9-and-earlier may understand "C++11" as a colloquialism for "older compilers." The wording of this diagnostic is based on feedback from @rsmith.

Discussion

Notice that this patch is kind of a negative-space version of Richard Trieu's -Wpessimizing-move. That diagnostic warns about cases of return std::move(x) that should be return x for speed. These diagnostics warn about cases of return x that should be return std::move(x) for speed. (The two diagnostics' bailiwicks do not overlap: we don't have to worry about a return statement flipping between the two states indefinitely.)

I propose to write a paper for San Diego that would relax the implicit-move rules so that in C++2a the user would see the moves they expect, and the diagnostic could be re-worded in a later version of Clang to suggest explicit std::move only "in C++17 and earlier." But in the meantime (and/or forever if that proposal is not well received), this diagnostic will be useful to detect accidental copy operations.

Diff Detail

Repository
rC Clang

Event Timeline

rsmith added a subscriber: rsmith.Feb 14 2018, 3:54 PM
rsmith added inline comments.
include/clang/Basic/DiagnosticSemaKinds.td
23

Can we say something like "local variable '%0' will be copied despite being %select{returned|thrown}1 by name; call 'std::move' explicitly to avoid the copy"? (Would that be accurate, given the implementation of the warning?)

Ideally, we'd move the "call 'std::move' explicitly" hint to a separate note diagnostic and include a FixItHint on that diagnostic to insert the call to std::move.

27

I think the wording of this diagnostic could be improved too. It's very language-lawyer-friendly right now. Here's one possible starting point for alternative wording (though I think maybe we can do better):

"ISO C++11, prior to the resolution of defect reports, would have made a copy of local variable '%0' despite it being returned by name due to a type mismatch with the return value%diff{ ($ vs $)|}1, 2"

... plus a "note: call 'std::move' explicitly to avoid a copy when using older compilers" or similar, with a fixit.

Quuxplusone marked an inline comment as done.Feb 15 2018, 1:36 PM
Quuxplusone added inline comments.
include/clang/Basic/DiagnosticSemaKinds.td
23

SGTM, except that I don't know how to find out whether we're in the context of a return or a throw from this deep in the guts of Sema. Could someone give me a pointer on that?

I also had trouble figuring out how to generate a fixit from x to std::move(x) — it keeps coming out as std::move( ) — but I expect I'll be able to solve that one even without help by banging my head against it a bit.

include/clang/Sema/Sema.h
3827

Q: Is it appropriate for me to be changing the signature of this public method at all? I don't know libclang's rules regarding this kind of change.

lib/Sema/SemaExprCXX.cpp
731

Q: I'm not sure what concept is being represented by "CES_Strict" here. If there is a more standardese-appropriate name, I'd like to use it. And if there are multiple concepts here that just happen to be identical *coincidentally*, I'd probably prefer to use multiple enumerators that all just happen to have value 0. But because there are so many call-sites and because this is not directly related to my patch, I have not investigated much.

Reword the diagnostic per rsmith's suggestions.
Still TODO:

  • figure out how to detect whether this is a return-stmt or throw-expression
  • figure out how to generate the appropriate fixit
Quuxplusone edited the summary of this revision. (Show Details)

@rsmith and/or @rtrieu, please take another look? All my TODOs are done now: there are fixits, and the wording of the diagnostic changes if it's a "throw" instead of a "return", and the wording has been updated per Richard Smith's suggestions.

I have one very minor nit that I don't know how to fix:

warn-return-std-move.cpp:220:12: warning: local variable 'd' will be copied
      despite being returned by name [-Wreturn-std-move]
    return (d);  // e17
           ^
warn-return-std-move.cpp:220:12: note: call 'std::move' explicitly to avoid copying
    return (d);  // e17
           ^~~
           std::move(d)

The warning places a caret directly under the (, when I wish it would place ^~~ under the entire expression, the way the fixit does.

I also spent a little time looking into whether I could/should diagnose

auto [x, y] = std::make_pair(Derived(), Derived());
return x;  // 'x' will be copied despite being returned by name

but I have decided that this is probably too far afield to be rolled into this patch, even if I could figure out how to detect it, which to a first approximation I cannot. So I am deliberately punting on that one.

Removed a redundant check for LValueReferenceType in the CWG1579 codepath. (In that branch, we know that standard C++ *did* perform the copy-to-move transformation, so obviously we can't have had an lvalue reference type originally.) I've verified that the check is still needed (not redundant) along the other codepath.

Quuxplusone added a reviewer: rsmith.
Quuxplusone added a subscriber: Rakete1111.

Eliminate a couple of auto per comment by @Rakete1111.

Somewhat related thing i have noticed: https://godbolt.org/g/ow58JV
I wonder whether that also should be diagnosed? Where though, as clang-tidy check?

Somewhat related thing i have noticed: https://godbolt.org/g/ow58JV

IIUC, you're asking whether it would be possible to detect instances of

return take(mysharedptr);

and rewrite them into

return take(std::move(mysharedptr));

(Notice that the "return" context is relatively important, because if mysharedptr is used again after this use, then the move-from is unsafe and should not be suggested. Of course it's always possible for mysharedptr to be used again after a return or throw as well, e.g. during stack unwinding/destructors; but there's precedent there for C++ saying we'll take that chance.)

If you start treating names as rvalues no matter where they appear in larger expressions, then you run into trouble with

std::string hello = ...;
std::string &hello2 = hello;
return concatenate(hello, hello2);

where the first use of hello on this line moves-out-of it, and then the second use reads the moved-from value. My best guess is that things like that occur frequently enough in real-world code that (A) the Standard can't specify move behavior there because it would quietly break code, and (B) a Clang diagnostic would produce mostly false positives (in the sense that accepting the fixits would quietly break the user's code). I admit it would be interesting to write the code and find out empirically.

Whereas in this specific limited case of return x; and throw x;, programmers are already trained to expect a move, and there are no (new) aliasing issues.

Where though, as clang-tidy check?

My experience with this patch suggests that when the thing you need to do involves "run a hypothetical overload-resolution," it is basically impossible to do with clang-tidy, and relatively easy to do in clang proper. (I tried writing this check into clang-tidy first, and couldn't figure out how to do the overload resolution part. Whereas it was painless in clang, especially with -Wpessimizing-move to copy off of.)

I have one very minor nit that I don't know how to fix:

warn-return-std-move.cpp:220:12: warning: local variable 'd' will be copied
      despite being returned by name [-Wreturn-std-move]
    return (d);  // e17
           ^
warn-return-std-move.cpp:220:12: note: call 'std::move' explicitly to avoid copying
    return (d);  // e17
           ^~~
           std::move(d)

The warning places a caret directly under the (, when I wish it would place ^~~ under the entire expression, the way the fixit does.

You can add extra tildes to any diagnostic by passing a SourceRange to Diag() calls the same way you pass FixitHints.

lib/Sema/SemaStmt.cpp
2937

I have a few concerns about this function. The code isn't a straight move from the old location to this one, so it is hard to follow along the changes, especially the change to the complex if conditional. The function should be commented, especially to behavior difference for setting IsFake.

It looks like when IsFake is set, then the result is discarded and not used, but it is still possible on success for AsRvalue to be copied to the heap. This is wasteful when it is never used again.

Another issue is the Value in the original code is used again towards the end of the function on line #3013. In the old code, Value can be updated while in the new code, it does.

It may be better to split this change in two, the first adding this function and the CopyElisionSemanticsKind enum and the second adding the diagnostic itself.

Rename some functions and parameters. Rebase onto D43898.

lib/Sema/SemaStmt.cpp
2937

Hi @rtrieu, and thanks!

I have split out the first half of the patch into a new revision D43898, and updated this one with the full patch (both halves together). Is there an easy way for me to make "just the second half" reviewable on its own, before the first half has been merged to master?

It looks like when IsFake is set, then the result is discarded and not used, but it is still possible on success for AsRvalue to be copied to the heap. This is wasteful when it is never used again.

I believe you are correct. But I'm not sure if it's safe to use AsRvalue as input to Res (which *is* used outside this function) if it's not moved like this; I don't know much about the internals here. You or anyone have a suggestion for how to fix that issue?

In the old code, Value can be updated while in the new code, it does.

I can't parse this, sorry.

FYI, in the patch I'm about to upload, I have renamed !IsFake to ConvertingConstructorsOnly, which should be more pleasing to the eye. :)

include/clang/Basic/DiagnosticSemaKinds.td
5639

I would like some guidance on whether it would be appropriate to turn this warning on as part of -Wmove.

lib/Sema/SemaStmt.cpp
3083

This source range does the right thing; thanks @rtrieu!

Quuxplusone marked an inline comment as done.Feb 28 2018, 2:16 PM
Quuxplusone added a comment.EditedMar 5 2018, 1:34 PM

@rtrieu (and perhaps @rsmith) ping?

The action items I need help with are:

  • Review and land the refactoring patch D43898 (I don't have commit privs)
  • Ideally, test compiling a bunch of (e.g. Google) code with D43322, see if there are any rough edges
  • Decide whether -Wmove should imply -Wreturn-std-move
  • Review and land the new-diagnostic patch D43322
  • Ideally, some discussion of whether I should write a paper for San Diego proposing that C++ *should* move rather than copy in these cases, and whether that possibility should change anything about this patch

Speaking of rough edges, I did find that in my employer's codebase the far-and-away-most-common trigger is return my_shared_ptr_to_derived_class; in a function that is declared to return shared_ptr<base_class>. [EDIT: oops, I forgot that this hits only -Wreturn-std-move-in-c++11, which shouldn't be included in -Wmove anyway.] This does indeed generate an extra atomic increment/decrement compared to the optimal code, but I could imagine someone considering that a "false positive."
Other than that one questionable scenario, everything else I've found with it so far has been a true positive. There are no positives (either true or false) in LLVM's codebase.

Rebased on D43898 after addressing @rtrieu's latest comments over there.

Boldly add -Wreturn-std-move to -Wmove (and thus also to -Wall).
The backward-compatibility-concerned diagnostic, -Wreturn-std-move-in-c++11, is *not* appropriate for -Wmove; but I believe this main diagnostic is appropriate.

The backward-compatibility-concerned diagnostic, -Wreturn-std-move-in-c++11, is *not* appropriate for -Wmove;

Have you evaluated possibility of adding -Wreturn-std-move-in-c++11 to one of CXX..Compat groups?

include/clang/Basic/DiagnosticGroups.td
385

The backward-compatibility-concerned diagnostic, -Wreturn-std-move-in-c++11, is *not* appropriate for -Wmove;

Have you evaluated possibility of adding -Wreturn-std-move-in-c++11 to one of CXX..Compat groups?

Hmm. I've considered it, but I don't know what the appropriate action is...
I suspect that it should *not* be "CXX11Compat" because technically the change was a DR against C++11 — sadly this is about portability to "older" compilers, not portability to "lesser" C++ standards.
Of course it can't be "CXX98Compat", for several reasons. :)
I'm not sure of the purpose of the "CXX..CompatPedantic" groups. They are all identical to the "CXX..Compat" groups, except that CXX98CompatPedantic includes one extra warning about binding to lifetime-extended temporaries which don't have copy constructors.

Rebase on master, now that the refactoring D43898 has been pushed (thanks @rtrieu!)

Quuxplusone marked 3 inline comments as done.Mar 15 2018, 11:50 AM
Quuxplusone marked an inline comment as done.

@rtrieu ping! I've rebased this on the pure refactoring you committed for me last week.

@rtrieu gentle ping!

Action items I need help with, cut-and-pasted from above:

  • Ideally, test compiling a bunch of (e.g. Google) code with D43322, see if there are any rough edges
  • Decide if -Wmove should imply -Wreturn-std-move (I have preemptively done this)
  • Review and land D43322
  • Ideally, some discussion of whether I should write a paper for San Diego proposing that C++ should move rather than copy in these cases (my default plan is to do this)
Quuxplusone edited the summary of this revision. (Show Details)

Rebased on master. AFAIK this is ready to go and I just need someone to land it for me... @rtrieu ping?

I believe the PR summary would be suitable as a commit message.

Quuxplusone edited the summary of this revision. (Show Details)Mar 27 2018, 11:19 AM
rsmith accepted this revision.Mar 27 2018, 5:45 PM
rsmith added inline comments.
include/clang/Sema/Sema.h
3806

Please wrap to 80 columns.

3827

For reference, changes here are fine. We provide no API or ABI stability guarantees for our C++ ABI; libclang provides an ABI-stable C ABI layered on top of our internal API for this reason.

lib/Sema/SemaStmt.cpp
3075–3076

I've seen this a lot more for returning unique_ptr<Derived> from a function with return type unique_ptr<Base>. Maybe just give this Expected<T> case as an example rather than saying it's the most common example?

This revision is now accepted and ready to land.Mar 27 2018, 5:45 PM

Addressed review comments from @rsmith and rebased on master.
I still need someone else to land this for me, though.

Quuxplusone marked 3 inline comments as done.Mar 28 2018, 10:50 AM

Finally learned how to "make check-clang" and actually run the test in the correct environment. Had to add -fcxx-exceptions to the command lines at the top of that file, because the code uses throw.

@rsmith PTAL?

Tuesday ping! I just need someone to commit this for me.

This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Apr 25 2018, 4:51 AM

This is a cool warning, thanks for adding it. We ran into one thing while enabling this in Chromium that I'd like to mention here. We have code that basically does:

struct Foo {
  using passwords_iterator = std::map<base::string16,
                                      std::set<std::string>,
                                      ReverseStringLess>::const_iterator;
  std::map<base::string16, std::set<std::string>, ReverseStringLess> passwords_;
  passwords_iterator get(const base::string16& in) {
    auto it = passwords_.lower_bound(in);
    return it;
  }
};

Here, the warning gets emitted because auto it becomes a non-const iterator, and passwords_iterator is a const_iterator. Maybe the warning could suggest something like "cv qualifiers don't match, make them match" on a note in addition or instead of std::move() for this case?

And then someone else pointed out that http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1579 might mean that the code-as is should be fine too in C++14 – I don't know if that's true, but maybe you could comment on that too :-)

(references:
https://cs.chromium.org/chromium/src/components/password_manager/core/browser/password_reuse_detector.h?type=cs&q=passwords_iterator&sq=package:chromium&l=64
https://chromium-review.googlesource.com/c/chromium/src/+/1025435/6/components/password_manager/core/browser/password_reuse_detector.cc
https://chromium-review.googlesource.com/c/chromium/src/+/1025435/8/components/password_manager/core/browser/password_reuse_detector.cc
)

Sorry, I responded via email but I guess Phabricator didn't pick it up for
some reason.
See below.

Sorry, I responded via email but I guess Phabricator didn't pick it up for
some reason.
See below.

And then Phab *still* didn't pick up the important part... Okay, I'll try pasting it here.

Can you post the diagnostic exactly as it appears in the compiler output? I am surprised that it would appear here. It should appear here only if the standard library's implicit conversion from std::map<...>::iterator to std::map<...>::const_iterator were implemented as a conversion operator instead of as a converting constructor. I have verified that both libstdc++ trunk and libc++ trunk implement it "correctly" as a converting constructor, and I have verified on Wandbox/Godbolt that no warning is emitted on your sample code when using either of those standard libraries.

Is it possible that you are using a standard library that is neither libc++ or libstdc++?
Is it possible that that standard library implements the iterator-to-const_iterator conversion as a conversion operator instead of a converting constructor?

And then someone else pointed out that http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1579 might mean that the code-as is should be fine too in C++14 – I don't know if that's true, but maybe you could comment on that too :-)

CWG1579 is the defect report against ISO C++11 which is mentioned in the wording of the off-by-default -Wreturn-std-move-in-c++11. :) Anything that -Wreturn-std-move itself complains about, is stuff that was NOT fixed by CWG1579.
Alternative hypothesis: Is it possible that you have manually turned on -Wreturn-std-move-in-c++11 (which is deliberately omitted from -Wmove/-Wall)? In that case, the warning is absolutely correct, but you are probably wrong to care about fixing it in this case. :)

Maybe the warning could suggest something like "cv qualifiers don't match, make them match" on a note in addition or instead of std::move() for this case?

I would love to see this, but I'm not comfortable trying to add it myself. Remember, the problem here isn't core-language cv-qualifiers ("iterator" vs. "const iterator"); the problem is library-level type mismatch ("iterator" vs. "const_iterator"). You'd have to somehow look at the human-readable name of the typedef and determine that it was a match for the regex /const_(.*)/ and the target type was a match for /\1/, and then you'd have to generate a meaningful note.

Can you post the diagnostic exactly as it appears in the compiler output? I am surprised that it would appear here. It should appear here only if the standard library's implicit conversion from std::map<...>::iterator to std::map<...>::const_iterator were implemented as a conversion operator instead of as a converting constructor. I have verified that both libstdc++ trunk and libc++ trunk implement it "correctly" as a converting constructor, and I have verified on Wandbox/Godbolt that no warning is emitted on your sample code when using either of those standard libraries.

Is it possible that you are using a standard library that is neither libc++ or libstdc++?

Is it possible that that standard library implements the iterator-to-const_iterator conversion as a conversion operator instead of a converting constructor?

@thakis ping — did you ever resolve this issue to your satisfaction?