This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Fixed the false-positives caused by macro generated code.
ClosedPublic

Authored by teemperor on Aug 9 2016, 7:03 AM.

Details

Summary

So far macro-generated code was treated by the CloneDetector as normal code. This caused that some macros where reported as false-positive clones because their generated code was detected as a clone.

This patch ensures that macros are treated in the same way as literals/function calls. This prevents that macros that expand into multiple statements are reported as clones.

Diff Detail

Repository
rL LLVM

Event Timeline

teemperor updated this revision to Diff 67333.Aug 9 2016, 7:03 AM
teemperor retitled this revision from to [analyzer] Fixed the false-positives caused by macro generated code..
teemperor updated this object.
teemperor added reviewers: v.g.vassilev, NoQ, zaks.anna.
teemperor added a subscriber: cfe-commits.
v.g.vassilev added inline comments.Aug 9 2016, 1:31 PM
lib/Analysis/CloneDetection.cpp
431 ↗(On Diff #67333)

What are complex macros? Could you clarify?

test/Analysis/copypaste/macro-complexity.cpp
19 ↗(On Diff #67333)

I am not sure I understand this comment. Could you reword?

test/Analysis/copypaste/macros.cpp
11 ↗(On Diff #67333)

Wouldn't it be a good idea to have a fixit hint, saying "Did you mean to use ABS(a,b)". If the suggestion is applied, it would make the code more consistent, however it would encourage using preprocessor tricks (which is not always considered as good practice).

omtcyfz added inline comments.
lib/Analysis/CloneDetection.cpp
436 ↗(On Diff #67333)

Do I understand correctly that a code generated by a macro doesn't affect "complexity" at all then?

TEST_F(QueryParserTest, Complete) {
  std::vector<llvm::LineEditor::Completion> Comps =
      QueryParser::complete("", 0, QS);
  ASSERT_EQ(6u, Comps.size());
  EXPECT_EQ("help ", Comps[0].TypedText);
  EXPECT_EQ("help", Comps[0].DisplayText);
  EXPECT_EQ("let ", Comps[1].TypedText);
  EXPECT_EQ("let", Comps[1].DisplayText);
  EXPECT_EQ("match ", Comps[2].TypedText);
  EXPECT_EQ("match", Comps[2].DisplayText);
  EXPECT_EQ("set ", Comps[3].TypedText);
  EXPECT_EQ("set", Comps[3].DisplayText);
  EXPECT_EQ("unlet ", Comps[4].TypedText);
  EXPECT_EQ("unlet", Comps[4].DisplayText);
  EXPECT_EQ("quit", Comps[5].DisplayText);
  EXPECT_EQ("quit ", Comps[5].TypedText);

  Comps = QueryParser::complete("set o", 5, QS);
  ASSERT_EQ(1u, Comps.size());
  EXPECT_EQ("utput ", Comps[0].TypedText);
  EXPECT_EQ("output", Comps[0].DisplayText);

  Comps = QueryParser::complete("match while", 11, QS);
  ASSERT_EQ(1u, Comps.size());
  EXPECT_EQ("Stmt(", Comps[0].TypedText);
  EXPECT_EQ("Matcher<Stmt> whileStmt(Matcher<WhileStmt>...)",
            Comps[0].DisplayText);
}

This is an actual piece of code from extra/unittests/clang-query/QueryParserTest.cpp. Yes, it is a test, but it still is a nice example of how many macros can be found in code (especially if we are talking about pure C or some weird C++).

Thus, I think it is reasonable to treat macro invocation as a 1-"complexity" node.

NoQ added inline comments.Aug 11 2016, 6:16 AM
lib/Analysis/CloneDetection.cpp
436 ↗(On Diff #67333)

This "0" is not for the macro itself, but for the statements into which it expands. Macro itself is not a statement. If we put "1" here, it would produce a lot more complexity than you want.

That said, it's a good idea to treat every macro as a "complexity-1" statement, just need to figure out how to implement that correctly :)

Perhaps scan the source range of the sequence for how many different macro expansions are included, and add that number to complexity(?)

omtcyfz added inline comments.Aug 11 2016, 6:21 AM
lib/Analysis/CloneDetection.cpp
436 ↗(On Diff #67333)

This "0" is not for the macro itself, but for the statements into which it expands. Macro itself is not a statement. If we put "1" here, it would produce a lot more complexity than you want.

Sure, I understand that, this is why I didn't suggest putting 1 there.

Perhaps scan the source range of the sequence for how many different macro expansions are included, and add that number to complexity(?)

Yes, this is exactly the solution that would work. Since macros aren't in the AST we'd need to get through SourceRange anyway.

omtcyfz added inline comments.Aug 11 2016, 6:26 AM
lib/Analysis/CloneDetection.cpp
436 ↗(On Diff #67333)

Though, it has to be optimized in order to prevent parsing a SourceLocation multiple times.

omtcyfz added inline comments.Aug 11 2016, 6:28 AM
lib/Analysis/CloneDetection.cpp
436 ↗(On Diff #67333)

*visiting each SourceLocation

NoQ edited edge metadata.Aug 11 2016, 10:11 AM

No comments of my own, the patch looks good :)

lib/Analysis/CloneDetection.cpp
436 ↗(On Diff #67333)

Yeah, as a rough approximation we could count macro expansions within the current statement's children...

test/Analysis/copypaste/macro-complexity.cpp
19 ↗(On Diff #67333)

Without macros, the same code would constitute a complex clone.
Wrapping code into macros reduces complexity of the code.
This tests the test above.

^(tried out some reword-ings)

teemperor updated this revision to Diff 68556.Aug 18 2016, 10:25 AM
teemperor edited edge metadata.
  • Added more documentation to the CloneSignature::Complexity field.
  • Macros now have a complexity value of 1 + sum(ChildComplexityValues).
  • Tests should be less cryptic now.
teemperor marked 10 inline comments as done.Aug 18 2016, 10:34 AM
teemperor added inline comments.
lib/Analysis/CloneDetection.cpp
436 ↗(On Diff #67333)

I'm now checking all expanded macros of the start/end locations. This should handle everything if I see that correctly (beside empty non-function macros which I marked as false-positives - not sure how we best handle them).

test/Analysis/copypaste/macros.cpp
11 ↗(On Diff #67333)

I don't think detecting clones between macro definitions and normal code is easily possible. Doing the same for functions however is certainly possible (i.e. "did you meant to call max(a, b)). I added that to the open points list.

teemperor updated this revision to Diff 68568.Aug 18 2016, 10:35 AM
teemperor marked 2 inline comments as done.
  • Added false-positives note for empty macros to the test suite.
teemperor updated this object.Aug 18 2016, 10:36 AM
v.g.vassilev accepted this revision.Aug 19 2016, 4:52 AM
v.g.vassilev edited edge metadata.
This revision is now accepted and ready to land.Aug 19 2016, 4:52 AM
teemperor updated this revision to Diff 68684.Aug 19 2016, 7:20 AM
teemperor edited edge metadata.
  • Fixed typo.
This revision was automatically updated to reflect the committed changes.