This is an archive of the discontinued LLVM Phabricator instance.

[clang] Add a fixit for warn-self-assign if LHS is a field with the same name as parameter on RHS
ClosedPublic

Authored by njames93 on Jul 6 2022, 7:08 AM.

Details

Summary

Add a fix-it for the common case of setters/constructors using parameters with the same name as fields

struct A{
  int X;
  A(int X) { /*this->*/X = X; }
  void setX(int X) { /*this->*/X = X;
};

Diff Detail

Event Timeline

njames93 created this revision.Jul 6 2022, 7:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 6 2022, 7:08 AM
njames93 requested review of this revision.Jul 6 2022, 7:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 6 2022, 7:08 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
njames93 updated this revision to Diff 442559.Jul 6 2022, 7:10 AM

Remove unnecessary extra include.

Thank you for working on this, I think it's a nice new diagnostic. You should add a release note to describe it. One question I have is about how chatty this is over real world code, especially older code bases. There used to be two prevailing ways to silence the "unused variable" warnings you would get when a variable is used in an assert: (void)whatever; and whatever = whatever;; I'm worried the latter case is still prevalent enough that having this warning on by default would be a problem, but I'm also optimistic my worry is unfounded.

clang/include/clang/Basic/DiagnosticSemaKinds.td
6620

We don't normally have trailing punctuation... except for question marks which we will happily use when asking the user a question.

clang/lib/Sema/SemaExpr.cpp
14676–14681

Removing the llvm:: from those mostly because we already call cast<> without the prefix. Oddly, I find myself liking the llvm:: in llvm::find_if though, so take or leave the suggestions about it.

14687

Why wouldn't we want to look through base classes for a member there? Considering a case like:

struct Base {
  int X;
};

struct Derived : Base {
  Derived(int X) { X = X; } // Oooooops
};

it seems like we would want to use a real lookup operation instead of the fields of the class. Or did you try that and found it was a performance concern?

14692–14695

Use of a note like this isn't awful, but I'm not super happy that the note always shows up on the same line as the warning. It seems like it would be cleaner (in terms of the user experience) to use a %select on the warning diagnostic to decide whether to add the "did you mean" information, and associate the fix directly with the warning. However, there's a slight functional difference with that approach which may matter to you: fixits attached to a note are not automatically applied when in fixit mode, but they are when attached to a warning or error. I don't know a reason why applying the fix-it automatically would be a bad thing in this case, so use of a note loses a bit of useful functionality.

clang/test/SemaCXX/warn-self-assign-builtin.cpp
69–73

An additional test case I'd like to see is:

struct S {
  int X;

  S(int X) : X(X) {} // Don't warn about this
};
njames93 marked 3 inline comments as done.Jul 6 2022, 11:21 PM

Thank you for working on this, I think it's a nice new diagnostic. You should add a release note to describe it. One question I have is about how chatty this is over real world code, especially older code bases. There used to be two prevailing ways to silence the "unused variable" warnings you would get when a variable is used in an assert: (void)whatever; and whatever = whatever;; I'm worried the latter case is still prevalent enough that having this warning on by default would be a problem, but I'm also optimistic my worry is unfounded.

It's not adding a new warning, so chattiness of this patch shouldn't really be much of an issue.

clang/lib/Sema/SemaExpr.cpp
14687

Once you start getting into deep class hierarchies we start to lose confidence that the fix is what the user intended. Then there is also the issue of access where the field may not even be accessible in Derived and it just adds complexity that I'm not sure is really warranted.

14692–14695

That is exactly the reason for the note. Because we aren't fixing invalid code(self assignment is suspicious but not an error) and the fix has a noticeable change of behaviour, it doesn't seem wise to automatically apply the fix-it.
I don't have a plan down the line to implement this, but what if the intent of the programmer was actually a near miss, like in the, heavily contrived, example below.

struct Foo {
  int Bar;
  void Stuff(int Bar) {
    int bar;
    // ...
    Bar = Bar; // Did they intend bar = Bar or this->Bar = Bar?
  }
};
clang/test/SemaCXX/warn-self-assign-builtin.cpp
69–73

This patch isn't adding a warning, it's just adding a fix to an existing warning. However there is currently no test cases for that particular instance so I can see value for that use case

njames93 updated this revision to Diff 442788.Jul 6 2022, 11:21 PM
njames93 marked an inline comment as done.

Address comments.

Thank you for working on this, I think it's a nice new diagnostic. You should add a release note to describe it. One question I have is about how chatty this is over real world code, especially older code bases. There used to be two prevailing ways to silence the "unused variable" warnings you would get when a variable is used in an assert: (void)whatever; and whatever = whatever;; I'm worried the latter case is still prevalent enough that having this warning on by default would be a problem, but I'm also optimistic my worry is unfounded.

It's not adding a new warning, so chattiness of this patch shouldn't really be much of an issue.

It's not triggering a diagnostic in a situation we didn't used to diagnose, but it is doubling the number of diagnostics emitted in some circumstances. So it's not a new warning, but chattiness is still a valid concern. We don't want to swamp the user with diagnostic messages.

clang/lib/Sema/SemaExpr.cpp
14687

Once you start getting into deep class hierarchies we start to lose confidence that the fix is what the user intended.

Why? If what we find is a variable (or data member),

Then there is also the issue of access where the field may not even be accessible in Derived and it just adds complexity that I'm not sure is really warranted.

The complexity question is certainly something we should keep in mind, but I'm not really sure that using a Lookup method on Sema adds a significant amount of complexity, either.

14692–14695

That is exactly the reason for the note. Because we aren't fixing invalid code(self assignment is suspicious but not an error) and the fix has a noticeable change of behaviour, it doesn't seem wise to automatically apply the fix-it.

Hmmm, I agree with you that self assignment is suspicious but not an error. However, using a K&R C function declaration is also suspicious but not an error; we still attach a fix-it to the warning for it: https://godbolt.org/z/afPr45Pd7. Self-assignment warnings are off-by-default (same as strict prototype warnings), so it seems reasonable to me that if a user opts into the diagnostic, they get the fixes on the warnings.

njames93 updated this revision to Diff 443180.Jul 8 2022, 2:48 AM

Attach fix directly to the warning.
Extend checking to also work with move assignments.

njames93 marked 2 inline comments as done.Jul 8 2022, 2:49 AM
aaron.ballman accepted this revision.Jul 8 2022, 4:59 AM

LGTM with some minor stuff you can fix when landing. Thank you for this!

clang/include/clang/Sema/Sema.h
5170–5171

Should this return a const pointer, and have some comments?

This revision is now accepted and ready to land.Jul 8 2022, 4:59 AM
njames93 updated this revision to Diff 443417.Jul 8 2022, 11:50 PM

Address comment.

njames93 marked an inline comment as done.Jul 8 2022, 11:58 PM
This revision was landed with ongoing or failed builds.Jul 9 2022, 12:28 AM
This revision was automatically updated to reflect the committed changes.