Page MenuHomePhabricator

[clang][diag] warn if function returns class type by-const-value
Needs ReviewPublic

Authored by nlee on May 11 2022, 10:15 AM.

Details

Reviewers
rsmith
aaron.ballman
Group Reviewers
Restricted Project
Summary

Returning class or struct by value with const qualifier should be avoided since it prevents move semantics.

clang-tidy has readability-const-return-type that catches this case, but it also impacts move semantics, not just readability.

Diff Detail

Event Timeline

nlee created this revision.May 11 2022, 10:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 11 2022, 10:15 AM
nlee requested review of this revision.May 11 2022, 10:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 11 2022, 10:15 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
lattner edited reviewers, added: rsmith; removed: lattner.May 11 2022, 12:53 PM

I'm not a competent reviewer for this, Richard can you recommend someone?

aaron.ballman added reviewers: aaron.ballman, Restricted Project.May 12 2022, 6:22 AM
aaron.ballman added a subscriber: aaron.ballman.

Thank you for working on this new diagnostic! We don't typically add new diagnostics that are off by default unless they're for pedantic diagnostics or are reasonably expected to be enabled by default in the future. This is because we've had evidence that suggests such diagnostics aren't enabled often enough to warrant the maintenance cost for them.

I'm not convinced this diagnostic can be enabled by default. Yes, it prevents move semantics, but that's not typically an issue with the correctness of the code. IIRC, we decided to put this diagnostic into clang-tidy because we thought it would be too chatty in practice, especially on older code bases.

Perhaps there are ways we can improve the diagnostic behavior to allow it to be enabled by default though. One possibility would be to look at the return type to see if there's a user-provided move constructor (either = default or explicitly written) and only diagnose if one is found. But I think we'd want some evidence that this actually reduces the false positive rate in practice, which means trying to compile some large C++ projects (of various ages/code quality) to see. Would you be willing to run your patch against some large C++ projects to see what kind of true and false positives it generates? From there, we can decide whether we need additional heuristics or not.

erichkeane added inline comments.
clang/lib/Sema/SemaType.cpp
5203 ↗(On Diff #428688)

I wonder if this is same concern applies to Unions? Should this just be T->isRecordType()? Should we do more analysis to ensure that this is a movable type? I THINK CXXRecordDecl::defaultedMoveConstructorIsDeleted() should be enough to test for t his.

nlee added a comment.May 16 2022, 8:21 PM

Thank you for working on this new diagnostic! We don't typically add new diagnostics that are off by default unless they're for pedantic diagnostics or are reasonably expected to be enabled by default in the future. This is because we've had evidence that suggests such diagnostics aren't enabled often enough to warrant the maintenance cost for them.

I'm not convinced this diagnostic can be enabled by default. Yes, it prevents move semantics, but that's not typically an issue with the correctness of the code. IIRC, we decided to put this diagnostic into clang-tidy because we thought it would be too chatty in practice, especially on older code bases.

Perhaps there are ways we can improve the diagnostic behavior to allow it to be enabled by default though. One possibility would be to look at the return type to see if there's a user-provided move constructor (either = default or explicitly written) and only diagnose if one is found. But I think we'd want some evidence that this actually reduces the false positive rate in practice, which means trying to compile some large C++ projects (of various ages/code quality) to see. Would you be willing to run your patch against some large C++ projects to see what kind of true and false positives it generates? From there, we can decide whether we need additional heuristics or not.

How about adding CXXRecordDecl::hasMoveConstructor() to ensure the type is movable? I ran a test with OpenCV(c++11), and it returned some useful warnings:

In file included from /Users/nlee/Documents/opencv/modules/core/include/opencv2/core/base.hpp:662:
/Users/nlee/Documents/opencv/modules/core/include/opencv2/core/check.hpp:16:12: warning: returning by const value prevents move semantics [-Wreturn-by-const-value]
CV_EXPORTS const String typeToString(int type);
           ^~~~~~
/Users/nlee/Documents/opencv/modules/core/include/opencv2/core/check.hpp:26:12: warning: returning by const value prevents move semantics [-Wreturn-by-const-value]
CV_EXPORTS const cv::String typeToString_(int type);
           ^~~~~~

@aaron.ballman Can you suggest some other projects to test? I'm thinking of these: protobuf, pytorch

I wonder if this is same concern applies to Unions? Should this just be T->isRecordType()? Should we do more analysis to ensure that this is a movable type? I THINK CXXRecordDecl::defaultedMoveConstructorIsDeleted() should be enough to test for t his.

@erichkeane Does adding CXXRecordDecl::hasMoveConstructor() suffices to ensure the type is movable? I left out unions because they may alert false positives. An example of such a case is:

union U { int i; std::string s; };
const U f() { return U{239}; }
clang/lib/Sema/SemaType.cpp
5203 ↗(On Diff #428688)

How about adding CXXRecordDecl::hasMoveConstructor() to ensure the type is movable? I left out unions because they may alert false positives. An example of such a case is:

union U { int i; std::string s; };
const U f() { return U{239}; }
5203 ↗(On Diff #428688)

I wonder if this is same concern applies to Unions? Should this just be T->isRecordType()? Should we do more analysis to ensure that this is a movable type? I THINK CXXRecordDecl::defaultedMoveConstructorIsDeleted() should be enough to test for t his.

nlee added a comment.May 17 2022, 7:19 AM

Thank you for your reviews, and sorry for the mess in my last comment. I'm new to clang/Phabricator and trying to get used to it :)

clang/lib/Sema/SemaType.cpp
5203 ↗(On Diff #428688)

It seems it is not always possible to call T->getAsCXXRecordDecl()->hasMoveConstructor() here. CXXRecordDecl::DefinitionData is not populated if definition is not provided in the translation unit, such as in the following case:

extern struct no_def s;
const no_def f();

assert fails with message: "queried property of class with no definition"

How about adding CXXRecordDecl::hasMoveConstructor() to ensure the type is movable? I ran a test with OpenCV(c++11), and it returned some useful warnings:

I don't think that will be quite correct -- IIRC, that would still return true if the move constructor was deleted. hasSimpleMoveConstructor() and hasSimpleMoveAssignment() might be a better approach.

@aaron.ballman Can you suggest some other projects to test? I'm thinking of these: protobuf, pytorch

Those are good; I'd also run it over LLVM itself. Boost would also be a good thing to test against.

Thank you for your reviews, and sorry for the mess in my last comment. I'm new to clang/Phabricator and trying to get used to it :)

No problem, Phabricator takes some getting used to and the Clang code base is pretty large. :-)

clang/lib/Sema/SemaType.cpp
5203 ↗(On Diff #428688)

Yeah, we calculate move/copy constructor/assignment (triviality, etc) while building up the type for the class as we add declarations to it. So if we've not seen the definition yet, we won't be able to check that property.

That said, it might make sense to only trigger this diagnostic when *defining* the function because the parameter and return types must be complete by that point.

clang/test/SemaCXX/warn-return-by-const-value.cpp
33 ↗(On Diff #428688)

You should add the newline to the end of the file.

How about adding CXXRecordDecl::hasMoveConstructor() to ensure the type is movable? I ran a test with OpenCV(c++11), and it returned some useful warnings:

I don't think that will be quite correct -- IIRC, that would still return true if the move constructor was deleted. hasSimpleMoveConstructor() and hasSimpleMoveAssignment() might be a better approach.

The 'Simple' version might not be quite right... That is implemented as:

bool hasSimpleMoveConstructor() const {
   return !hasUserDeclaredMoveConstructor() && hasMoveConstructor() &&
          !data().DefaultedMoveConstructorIsDeleted;
 }

So this would still warn about user-defined move constructors.

HOWEVER, I might suggest hasMoveConstructor() && !defaultedMoveConstructorIsDeleted() for the ctor test. There is similar storage for the 'DefaultedMoveAssignmentIsDeleted`, but it isn't exposed, so you might need to add a function to expose it in DeclCXX.h.

How about adding CXXRecordDecl::hasMoveConstructor() to ensure the type is movable? I ran a test with OpenCV(c++11), and it returned some useful warnings:

I don't think that will be quite correct -- IIRC, that would still return true if the move constructor was deleted. hasSimpleMoveConstructor() and hasSimpleMoveAssignment() might be a better approach.

The 'Simple' version might not be quite right... That is implemented as:

bool hasSimpleMoveConstructor() const {
   return !hasUserDeclaredMoveConstructor() && hasMoveConstructor() &&
          !data().DefaultedMoveConstructorIsDeleted;
 }

So this would still warn about user-defined move constructors.

Good catch!

HOWEVER, I might suggest hasMoveConstructor() && !defaultedMoveConstructorIsDeleted() for the ctor test. There is similar storage for the 'DefaultedMoveAssignmentIsDeleted`, but it isn't exposed, so you might need to add a function to expose it in DeclCXX.h.

I think that might still not be correct because there could be a user-provided move constructor that's explicitly deleted. (So it has a move constructor, but the defaulted one is not deleted because it's user provided.)

How about adding CXXRecordDecl::hasMoveConstructor() to ensure the type is movable? I ran a test with OpenCV(c++11), and it returned some useful warnings:

I don't think that will be quite correct -- IIRC, that would still return true if the move constructor was deleted. hasSimpleMoveConstructor() and hasSimpleMoveAssignment() might be a better approach.

The 'Simple' version might not be quite right... That is implemented as:

bool hasSimpleMoveConstructor() const {
   return !hasUserDeclaredMoveConstructor() && hasMoveConstructor() &&
          !data().DefaultedMoveConstructorIsDeleted;
 }

So this would still warn about user-defined move constructors.

Good catch!

HOWEVER, I might suggest hasMoveConstructor() && !defaultedMoveConstructorIsDeleted() for the ctor test. There is similar storage for the 'DefaultedMoveAssignmentIsDeleted`, but it isn't exposed, so you might need to add a function to expose it in DeclCXX.h.

I think that might still not be correct because there could be a user-provided move constructor that's explicitly deleted. (So it has a move constructor, but the defaulted one is not deleted because it's user provided.)

I don't think that is possible? I think 'defaultedMoveConstructor' includes the user typing StructName(StructName&&) = delete;.

Note that out of line deleted implementations are prohibited: https://godbolt.org/z/MTK4jGa57 So we don't have to worry about that.

How about adding CXXRecordDecl::hasMoveConstructor() to ensure the type is movable? I ran a test with OpenCV(c++11), and it returned some useful warnings:

I don't think that will be quite correct -- IIRC, that would still return true if the move constructor was deleted. hasSimpleMoveConstructor() and hasSimpleMoveAssignment() might be a better approach.

The 'Simple' version might not be quite right... That is implemented as:

bool hasSimpleMoveConstructor() const {
   return !hasUserDeclaredMoveConstructor() && hasMoveConstructor() &&
          !data().DefaultedMoveConstructorIsDeleted;
 }

So this would still warn about user-defined move constructors.

Good catch!

HOWEVER, I might suggest hasMoveConstructor() && !defaultedMoveConstructorIsDeleted() for the ctor test. There is similar storage for the 'DefaultedMoveAssignmentIsDeleted`, but it isn't exposed, so you might need to add a function to expose it in DeclCXX.h.

I think that might still not be correct because there could be a user-provided move constructor that's explicitly deleted. (So it has a move constructor, but the defaulted one is not deleted because it's user provided.)

I don't think that is possible? I think 'defaultedMoveConstructor' includes the user typing StructName(StructName&&) = delete;.

Note that out of line deleted implementations are prohibited: https://godbolt.org/z/MTK4jGa57 So we don't have to worry about that.

I might be wrong about this above:
https://clang.llvm.org/doxygen/classclang_1_1CXXRecordDecl.html#a8ddbc71ffd71a6f7d580fc7a19452f30
Comment says: "
/// Set that we attempted to declare an implicit move

/// constructor, but overload resolution failed so we deleted it."

So perhaps more work to figure out non-deleted move ctor, it might require a lookup. So if we do that, we need a bunch of tests to validate all these conditions I believe (manually deleted, inherited deleted, member-cauesd-deleted, and 'defaulted' versions of the last 2, user provided but works, etc).

nlee updated this revision to Diff 434842.Tue, Jun 7, 9:42 AM

After some investigation, I think it is not always explicit that removing const makes any improvements when we are "constructing." For example, in this case, the two codes which differ in the return type of function f (S f() vs. const S f()) produce the same assembly output. I assume this is due to various copy elisions stepping in.

However, when we are "assigning," it makes a difference. In this case, the existence of const in the return type of f decides whether the assignment invokes a move or not.

So I'm suggesting a different policy. Warn if:

  • It is an assignment expression AND
  • RHS is a function call expression AND
  • The function returns by-const-value of a class or struct type AND
  • The class or struct is move assignable

The following is a part of diagnosed warnings when applied to llvm:

/Users/namgoolee/Documents/llvm-project/llvm/include/llvm/Option/Option.h:103:9: warning: const qualifying the return value prevents move semantics [-Wpessimizing-return-by-const]
  const Option getGroup() const {
        ^
/Users/namgoolee/Documents/llvm-project/llvm/lib/Option/ArgList.cpp:38:10: note: copy assignment is invoked here even if move assignment is supported for type 'llvm::opt::Option'
       O = O.getGroup()) {
         ^
--
/Users/namgoolee/Documents/llvm-project/clang/include/clang/Lex/MacroInfo.h:421:9: warning: const qualifying the return value prevents move semantics [-Wpessimizing-return-by-const]
  const DefInfo findDirectiveAtLoc(SourceLocation L,
        ^
/Users/namgoolee/Documents/llvm-project/clang/include/clang/Lex/Preprocessor.h:1167:10: note: copy assignment is invoked here even if move assignment is supported for type 'clang::MacroDirective::DefInfo'
      DI = MD->findDirectiveAtLoc(Loc, getSourceManager());
         ^
nlee updated this revision to Diff 435113.Wed, Jun 8, 5:29 AM

Updated diff to pass clang-format test.

aaron.ballman added inline comments.Thu, Jun 9, 10:53 AM
clang/include/clang/Basic/DiagnosticSemaKinds.td
6632

Given that this is a DefaultIgnore, I'm still not quite sure this belongs in Clang rather than continuing to be covered by clang-tidy.

If you leave it on by default, how many Clang tests break as a result? Do the new warnings all look like true positives? (Basically, I'm trying to see how far away we are from being able to enable this diagnostic by default.)

clang/test/SemaCXX/warn-pessimizing-return-by-const.cpp
51

We should also have a test for an explicitly deleted move assignment, and another one where a member of the class is not movable (and thus the move assignment should be deleted that way as well).

nlee added inline comments.Sun, Jun 12, 5:54 PM
clang/include/clang/Basic/DiagnosticSemaKinds.td
6632

how many Clang tests break as a result?

All Clang tests pass without breaking.

Do the new warnings all look like true positives?

When applied to Clang source code, it does catch some cases as programmed. One problem I see is that it also catches for classes with only primitive member variables where move assignment(which is copy) has no advantage over copy assignment.

clang/test/SemaCXX/warn-pessimizing-return-by-const.cpp
51

I'll add those cases.

nlee updated this revision to Diff 437011.EditedTue, Jun 14, 8:21 PM

Warn if:

  • It is an assignment expression AND
  • RHS is a function call expression AND
  • The function returns by-const-value of a class or struct type AND
  • The class or struct has non-trivial move assignment operator

Removed DefaultIgnore flag and make check-clang passes.