This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy][analyzer] Fix false-positive in IdenticalExprChecker and misc-redundant-expression
Needs ReviewPublic

Authored by steakhal on Nov 26 2021, 2:50 AM.

Details

Summary

Consider this example:

#include <type_traits>
using std::is_same;
bool top() {
  return is_same<char, int>::value || is_same<char, long>::value;
}

We can see that the value static members are different, but in fact,
they refer to the same VarDecl entity. It is because both is_same
class instances inherit from the common false_type class, which
actually owns the value member.

The alpha.core.IdenticalExpr Static Analyzer checker actually warns
for this, since it only checked if the referred Decls are the same.
Interestingly, the clang-tidy misc-redundant-expression also reports
this is for the exact same reason, so I'm fixing both of them at the same
time.

This patch fixes this by checking if the Decls are the same and if they
are it will try to acquire the nested namespace specifier as a type and
compare them as well. If they are inequal, that means that the
spelling of the nested namespace specifiers actually differed, this
it's likely to be a false-positive.

I'd like to note that the checker/check was actually right about that
the expressions were semantically equal in that given context, however,
we still don't want these reports in general.

We could introduce checker options to finetune this behavior if needed.

PS: According to the code coverage of the test, all touched parts are
completely covered.

Thanks David Rector (@davrec) for the idea on the cfe-dev forum:
https://lists.llvm.org/pipermail/cfe-dev/2021-November/069389.html

Diff Detail

Unit TestsFailed

Event Timeline

steakhal created this revision.Nov 26 2021, 2:50 AM
steakhal requested review of this revision.Nov 26 2021, 2:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 26 2021, 2:51 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
whisperity added a comment.EditedNov 26 2021, 4:32 AM

I haven't looked at the patch in detail yet, but I know I've run into the issue this aims to fix when developing a Tidy check!

There is a // NOLINTNEXTLINE(misc-redundant-expression) directive in the source file clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp right before a static_assert that checks some type_traits-y stuff. Should be at line 510 -- according to my local checkout. Could you please remove that line and have that change be part of this patch too? I did a grep on the project and that's the only "user-side" mention to the misc-redundant-... check. Running Tidy on Tidy itself afterwards would turn out to be a nice real-world test that the fix indeed works. 🙂

whisperity added inline comments.Nov 26 2021, 4:55 AM
clang-tools-extra/clang-tidy/misc/RedundantExpressionCheck.cpp
60–133

(Wrt. to the comment about not typing stuff out directly, I guess L and R here will be X<char> and X<long> and thus not the same.)

60–133

(Style nit to lessen the indent depth, until C++17...)

187–189

(Style nit. Or LeftDR, RightDR.)

clang-tools-extra/test/clang-tidy/checkers/misc-redundant-expression.cpp
822

Are there any more ways we could trick this check to cause false positives? Something through which it doesn't look the same but still refers to the same thing?

At the top of my head I'm thinking about something along the lines of this:

template <typename T>
struct X {
  template <typename U>
  static constexpr bool same = std::is_same_v<T, U>;
};

X<char> ch() { return X<char>{}; }
X<int> i() { return X<int>{}; }

void test() {
  coin ? ch().same<long> : i().same<long>;
}

So the "type name" where we want to look up isn't typed out directly.
Or would these still be caught (and thus fixed) by looking at a NestedNameSpecifier?

I haven't looked at the patch in detail yet, but I know I've run into the issue this aims to fix when developing a Tidy check!

There is a // NOLINTNEXTLINE(misc-redundant-expression) directive in the source file clang-tools-extra/clang-tidy/bugprone/EasilySwappableParametersCheck.cpp right before a static_assert that checks some type_traits-y stuff. Should be at line 510 -- according to my local checkout. Could you please remove that line and have that change be part of this patch too? I did a grep on the project and that's the only "user-side" mention to the misc-redundant-... check. Running Tidy on Tidy itself afterwards would turn out to be a nice real-world test that the fix indeed works. 🙂

Yup, it works. I removed your NOLINT comment.

steakhal updated this revision to Diff 390023.Nov 26 2021, 5:08 AM
steakhal updated this revision to Diff 390032.Nov 26 2021, 5:31 AM
steakhal marked 3 inline comments as done.

Fixing nits: renaming variables, etc.

clang-tools-extra/clang-tidy/misc/RedundantExpressionCheck.cpp
60–133

(Wrt. to the comment about not typing stuff out directly, I guess L and R here will be X<char> and X<long> and thus not the same.)

I don't quite get what do you mean by this.

187–189

DeclRef1 -> LeftDR

clang-tools-extra/test/clang-tidy/checkers/misc-redundant-expression.cpp
822

For the example:

template <typename T, T V> struct integral_constant {
    static constexpr T value = V;
    typedef T value_type;
    typedef integral_constant<T, V> type;
};
typedef integral_constant<bool, true> true_type;
typedef integral_constant<bool, false> false_type;
template <typename T, typename U> struct is_same : false_type {};
template <typename T> struct is_same<T, T> : true_type {};

template <typename T>
struct X {
  template <typename U>
  static constexpr bool same = is_same<T, U>::value;
};

X<char> ch();
X<int> i();

void test(bool coin) {
  coin ? ch().same<long> : i().same<long>;
}

We would have no warnings for this code, nor we had previously:
https://godbolt.org/z/9xo4adfbP

Should I demonstrate this in the test file?
I wouldn't pollute it TBH, since it requires some template machinery which is quite verbose.
However, if you insist I will add it.

A couple thoughts/cases to consider...

clang-tools-extra/clang-tidy/misc/RedundantExpressionCheck.cpp
58

This function should probably either be made into a lambda private to areEquivalentDeclRefs(), or renamed to make clear it does not apply generically to all NNS's.

Part of the reason is that this is only implemented for type NNS's. But the more important reason is that, when either a) !Left || !Right, or b) !Left->getAsType() || !Right->getAsType(), this returns true, since (presumably*) this gives us the desired behavior within areEquivalentDeclRefs(), despite that in general a null NNS should probably not be considered the same as a nonnull NNS.

(*Is this indeed the desired behavior? Probably should add some tests that check qualified DeclRefExprs against unqualified DeclRefExprs, to be sure.)

59–60

Suggest you move the null check from areEquivalentDeclRefs() here, i.e.

if (!Left || !Right)
  return true;

mainly since this is needs to be done in recursive calls (see below), but also since you do the same logic on LTy and RTy subsequently.

65

It occurs to me this needs to be recursive, to check a sequence of qualifiers, i.e.

const Type *LTy = Left->getAsType();
const Type *RTy = Right->getAsType();
if (!LTy || !RTy) 
  return true;
if (LTy->getCanonicalTypeUnqualified() !=
     RTy->getCanonicalTypeUnqualified())
  return false;
return areEquivalentNameSpecifier(Left->getPrefix(), Right->getPrefix());

The reason is, if there is a prefix qualifier to this qualifier, we run into the original problem, e.g.:

struct Base {
  struct Inner { 
    static const bool value = true;
  };
};
struct Outer1 : Base {};
struct Outer2 : Base {};

// We do not want the following to warn, but without checking the prefix of Inner, 
// I believe these will be interpreted as equivalent DeclRefs and will warn:
auto sink = Outer1::Inner::value || Outer2::Inner::value;
78

Suggest you move this null check into areEquivalentNameSpecifier, see above

Sorry for my late response, I'm busy with other things right now. I plan to come back to this in a couple of weeks.

clang-tools-extra/clang-tidy/misc/RedundantExpressionCheck.cpp
58

This function should probably either be made into a lambda private to areEquivalentDeclRefs(), or renamed to make clear it does not apply generically to all NNS's.
Part of the reason is that this is only implemented for type NNS's. But the more important reason is that, when either a) !Left || !Right, or b) !Left->getAsType() || !Right->getAsType(), this returns true, since (presumably*) this gives us the desired behavior within areEquivalentDeclRefs(), despite that in general a null NNS should probably not be considered the same as a nonnull NNS.

I agree. I just wanted to preserve as much code as I could and aim for a minimal change to implement my intention.
I'll update this accordingly.

(*Is this indeed the desired behavior? Probably should add some tests that check qualified DeclRefExprs against unqualified DeclRefExprs, to be sure.)

You are right, not perfect. But it's already closer than it previously was. I'll add the test anyway.

65

You are right, but that would require me to implement all the possible kinds of NNS, in addition to the types.
And it's not entirely clear to me (yet) what the desired behavior should be for implementing this.

steakhal updated this revision to Diff 395150.Dec 17 2021, 9:13 AM

Sorry for the late update, but here we are:

Now, I properly handle all kinds of NestedNameSpecifiers.
Basically, I verify if the decls are matching, then I use the nested name specifiers for finding mismatches. I look through namespace aliases and I use the canonical types.

NoQ added a comment.Dec 17 2021, 12:51 PM

These checks are almost 2000 lines of code each and it looks like all they do is figure out if two statements are the same and we have a very flexible reusable solution for this sort of stuff - CloneDetector, but none of them use it. Your patch demonstrates that they all have the same bugs that need to be fixed in each of them separately, so reusability seems really valuable. If I was to work on these checks, the first thing I'd try would be to throw out their machines and plug in a reusable solution.

NoQ added a comment.Dec 17 2021, 1:28 PM

P.S. bugprone-branch-clone seems to have attempted to use CloneDetector in the past but now it's no more than a dead #include. I wonder what happened there.

steakhal marked an inline comment as done.Dec 20 2021, 2:46 AM

These checks are almost 2000 lines of code each and it looks like all they do is figure out if two statements are the same and we have a very flexible reusable solution for this sort of stuff - CloneDetector, but none of them use it. Your patch demonstrates that they all have the same bugs that need to be fixed in each of them separately, so reusability seems really valuable. If I was to work on these checks, the first thing I'd try would be to throw out their machines and plug in a reusable solution.

Well, yes. Ideally, we should remove probably the ClangSA implementation, since it does basically the same thing. Right now I'm focusing on fixing this FP, and I don't really want to tip my toe into anything bigger than that.

About the CloneDetector, well in February this year we observed that it basically halted an analysis, due to some unfortunate situation. I saw areSequencesClones() stringifying QualTypes for hours if not days.
We had other priorities to focus, thus we could not get there reproducing and fixing the issue with that, but that could be a potential reason why people don't want to use that. We should really dig into that at some point.
For the record, at the time it was analyzing llvm itself, the AMDGPUAsmParser.cpp file to be precise.

davrec added inline comments.Dec 21 2021, 5:57 AM
clang-tools-extra/clang-tidy/misc/RedundantExpressionCheck.cpp
65

I agree it's not entirely clear what should be the desired behavior in that rare case; I suppose it might be appropriate to introduce a bool RecurseToPrefix param that determines whether you do the while loop or finish after one iteration, mostly so that others using this function can ponder the decision for themselves.

clang/lib/StaticAnalyzer/Checkers/IdenticalExprChecker.cpp
372

I would strongly support turning these functions into methods of DeclRefExpr and NestedNameSpecifier, not just to avoid the code duplication here but because they sure seem to be of general utility to AST users.

You just need to clearly motivate these methods in the documentation, which your description for this patch largely does, e.g. something like.:

class DeclRefExpr : ... {
  ...
  /// Returns whether this \c DeclRefExpr 
  ///   a. refers to the same canonical declaration as \c other and
  ///   b. if it has a qualifier, that qualifier refers to the same canonical 
  ///       declarations as those of \c other .
  ///
  /// Consider these two DeclRefExprs:
  /// \code
  ///   std::is_same<char, int>::value
  ///   std::is_same<char, long>::value;
  /// \endcode
  ///
  /// We can see that the value static members are different, but in fact
  /// the \c getDecl() of these two returns the same VarDecl. (...etc)
  bool isSemanticallyEquivalentTo(const DeclRefExpr *Other) { ... }
};
...
class NestedNameSpecifier {
  ...
  /// Returns whether this refers to the same sequence of identifiers/canonical declarations
  /// as \c Other.  (Then I would repeat the std::is_same example here, since that
  /// really makes the issue clear.  And also describe what this returns when other is nullptr or
  /// when getPrefix() is nullptr while other->getPrefix() is non-null -- maybe add a parameter if
  /// its not clear in general what the behavior should be.)
  bool isSemanticallyEquivalentTo(const NestedNameSpecifier *Other, bool RecurseToPrefix = true) { ... }
};
davrec added inline comments.Dec 21 2021, 8:27 AM
clang/lib/StaticAnalyzer/Checkers/IdenticalExprChecker.cpp
372

Also if doing this it might be nice to add isSyntacticallyEquivalentTo(other) methods alongside the semantic versions, defined the same except they don't desugar the type nor get the underlying NamespaceDecl from a NamespaceAliasDecl. This would be simple and, given that you found that two NestedNameSpecifiers representing the exact same syntax are nonetheless different pointers, it could be very useful for AST users concerned with syntax rather than semantics.

(Btw note the DeclRefExpr's syntactic method should still use getDecl()->getCanonicalDecl(), since, confusingly, getCanonicalDecl simply fetches the FirstDecl from the various redeclarations, rather than doing any kind of desugaring as getCanonicalType does. And fortunately I don't think it is necessary to manually "desugar" the canonical decl in the semantic version, since a DeclRefExpr's getDecl is always a ValueDecl, which I don't think can ever be "sugar" for some underlying decl, as say a NamespaceAliasDecl is for a NamespaceDecl -- so that part should be fine.)

These checks are almost 2000 lines of code each and it looks like all they do is figure out if two statements are the same and we have a very flexible reusable solution for this sort of stuff - CloneDetector, but none of them use it. Your patch demonstrates that they all have the same bugs that need to be fixed in each of them separately, so reusability seems really valuable. If I was to work on these checks, the first thing I'd try would be to throw out their machines and plug in a reusable solution.

There are already two way more sophisticated (forgive me my bias) implementations in Clang that are for checking if two statements or decls are the same.

  1. ODRHash, used in modules to discover ODR violations
  2. ASTStructuralEquivalenceContext, used in ASTImporter to discover if two AST nodes are the same or not (as a side effect we diagnose ODR violations as well).

It is not the first time, when such a similarity check is needed (see https://reviews.llvm.org/D75041). Of course reusing the before mentioned components would require some architectural changes, but it might be beneficial.

There are already two way more sophisticated (forgive me my bias) implementations in Clang that are for checking if two statements or decls are the same.

  1. ODRHash, used in modules to discover ODR violations
  2. ASTStructuralEquivalenceContext, used in ASTImporter to discover if two AST nodes are the same or not (as a side effect we diagnose ODR violations as well).

It is not the first time, when such a similarity check is needed (see https://reviews.llvm.org/D75041). Of course reusing the before mentioned components would require some architectural changes, but it might be beneficial.

I do not quite see the overlap. This patch addresses the structural equivalence of DeclRefExprs: as the std::is_same (or any type trait example) demonstrates, two declarations may be the "same" (e.g. they are both std::false_type::value), but two DeclRefExprs referring to those declarations should not necessarily be considered the "same": the qualifier, specifying the path that was taken to look them up, can matter to a user. It's not a matter of the sophistication of the similarity check, it's a matter of what we mean by similarity.

I do not see DeclRefExprs handled in ODRHash or ASTStructuralEquivalence. I do see NestedNameSpecifiers handled in both, but I don't think the implementation quite matches what is needed here (e.g. in ASTStructuralEquivalence check, if one NNS is a NamespaceAlias, the other one is assumed to be a NamespaceAlias: not what we want). It's probably not worth the trouble to factor something common out of those; though they should certainly be used as a guide to make sure no cases have been missed.