This is an archive of the discontinued LLVM Phabricator instance.

[include-cleaner] Testing helpers for ast walking
Needs ReviewPublic

Authored by kadircet on Nov 2 2022, 6:43 AM.

Details

Reviewers
hokein
sammccall
Summary

Address comments around verbosity of tests.

Depends on D135859

Diff Detail

Event Timeline

kadircet created this revision.Nov 2 2022, 6:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 2 2022, 6:43 AM
Herald added a subscriber: mgrang. · View Herald Transcript
kadircet requested review of this revision.Nov 2 2022, 6:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 2 2022, 6:43 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
hokein added a comment.Nov 4 2022, 2:49 AM

Just share my experience when reading the code, overall I have a feeling that we're regressing the code readability of the tests (left some comments on regressing samples):

  1. with the sugared macros, the syntax of TargetCode and ReferenceCode are identical -- both using the ^ without any names, it is hard to distinguish them (we could use a different syntax [[]] in the ReferenceCode to identify the point);
  2. the code of using EXPECT_REFERENCES_WITH_TYPES is not easy to read (especially with multiple RefType arguments), we're losing the relationship between RefType and ^ point, this is my biggest concern;
  3. the EXPECT_{EXPLICIT, IMPLICIT, EXPLICIT}_REFERENCES is designed for a single reference point. When we have > 1 reference points, we have to use the fallback EXPECT_REFERENCES_WITH_TYPES even if all ref points are the same type (not sure it is generic enough, maybe it is fine as single-ref-point is the most common case);

That's being said, making the tests less verbose is good, but we seem to trade too much code readability for that. I'm fine with the original annotation version even though the annotations are somewhat noisy, another alternative -- if we use abbreviations like ($E, $I, $A) in tests, would it be better?

clang-tools-extra/include-cleaner/unittests/WalkASTTest.cpp
156

a bug in existing code? by looking at this example, we should use EXPECT_AMBIGUOUS_REFERENCES(Code, Code, RefType::Ambiguous, RefType::Ambiguous, RefType::Ambiguous)

234

I find this case is hard, and not easy for human to follow&verify -- we have 3 points in the target code, and their RefTypes are from the macro arguments.

279

When reading this code, the implicit vs explicit in these 4 consecutive statements is less obvious than before (each of them has the same length, same indentation and is a single-line)

289

It feels more natural to use EXPECT_AMBIGUOUS_REFERENCES(TargetCode, ReferenceCode, Ambiguous) for this case where all refs are the same type.

kadircet marked an inline comment as done.Nov 4 2022, 4:01 AM

Just share my experience when reading the code, overall I have a feeling that we're regressing the code readability of the tests (left some comments on regressing samples):

  1. with the sugared macros, the syntax of TargetCode and ReferenceCode are identical -- both using the ^ without any names, it is hard to distinguish them (we could use a different syntax [[]] in the ReferenceCode to identify the point);

I see your point but this is not really a regression, right? that's the way the tests were written in the previous version as well. i am not sure how we can make this more explicit without introducing a bunch of boilerplate into each test case.

  1. the code of using EXPECT_REFERENCES_WITH_TYPES is not easy to read (especially with multiple RefType arguments), we're losing the relationship between RefType and ^ point, this is my biggest concern;

That is one of the points I disliked the most as well, maybe we should keep the old test syntax for these cases (with abbreviations), i.e. rather than putting the types at the end, we'll put them into the annotations. WDYT?

  1. the EXPECT_{EXPLICIT, IMPLICIT, EXPLICIT}_REFERENCES is designed for a single reference point. When we have > 1 reference points, we have to use the fallback EXPECT_REFERENCES_WITH_TYPES even if all ref points are the same type (not sure it is generic enough, maybe it is fine as single-ref-point is the most common case);

These macros are not for single references, but rather for single reference types. e.g. if all the references are of same kind.

clang-tools-extra/include-cleaner/unittests/WalkASTTest.cpp
156

no, that's actually a change in behaviour (in the base patch) but wasn't reflected here.

234

i agree. just to make my proposal in the main discussion more concrete, i am suggesting this:

EXPECT_REFERENCES_WITH_TYPES(
      R"cpp(
    namespace ns {
      void $A^x(); void $E^x(int); void $A^x(char);
    })cpp",
      "// c++-header\n using ns::^x; void foo() { x(3); }");
279

i am not sure if there's much difference between having them in the code vs in the macro names here actually.

the case on the left hand side of the diff is exactly the same, you've got implicit vs explicit right next to points, each with same length. IMO the author/reviewer needs to pay ~the same level of attention in either case.

289

right, that one should've been just EXPECT_AMBIGUOUS_REFERENCES, a residue left behind while i was tweaking the code

hokein added a comment.Nov 7 2022, 1:20 AM

Just share my experience when reading the code, overall I have a feeling that we're regressing the code readability of the tests (left some comments on regressing samples):

  1. with the sugared macros, the syntax of TargetCode and ReferenceCode are identical -- both using the ^ without any names, it is hard to distinguish them (we could use a different syntax [[]] in the ReferenceCode to identify the point);

I see your point but this is not really a regression, right? that's the way the tests were written in the previous version as well. i am not sure how we can make this more explicit without introducing a bunch of boilerplate into each test case.

In the previous version, we can easily distinguish the TargetCode and ReferenceCode by looking at annotation $name in the code, while in the current version, we have to memorize the function argument order.

One alternative will be to merge TargetCode and ReferenceCode together:

int ^x; // using ^ as a reference point to the target symbol

int y = [[]]x; // using [[]]] as a ref point to the reference.
  1. the code of using EXPECT_REFERENCES_WITH_TYPES is not easy to read (especially with multiple RefType arguments), we're losing the relationship between RefType and ^ point, this is my biggest concern;

That is one of the points I disliked the most as well, maybe we should keep the old test syntax for these cases (with abbreviations), i.e. rather than putting the types at the end, we'll put them into the annotations. WDYT?

It can work, probably it is ok, it looks like -- we keep the original annotation approach (with shorter names), and provide some syntax sugar for common cases. (though I personally prefer to use the shortened-annotation approach everywhere). We might want a third opinion @sammccall.

  1. the EXPECT_{EXPLICIT, IMPLICIT, EXPLICIT}_REFERENCES is designed for a single reference point. When we have > 1 reference points, we have to use the fallback EXPECT_REFERENCES_WITH_TYPES even if all ref points are the same type (not sure it is generic enough, maybe it is fine as single-ref-point is the most common case);

These macros are not for single references, but rather for single reference types. e.g. if all the references are of same kind.

I see, I got confusing when reading the EXPECT_REFERENCES_WITH_TYPES("", "", Ambiguous, Ambiguous) code.

clang-tools-extra/include-cleaner/unittests/WalkASTTest.cpp
156

I don't get it. My understanding is

  1. this patch is basically a NFC change on top of the https://reviews.llvm.org/D135859 and in https://reviews.llvm.org/D135859, we use three ambiguous ref types
  2. and here, we are changing one of ambiguous ref types to an explicit type.

1 and 2 is conflicted, do you mean in the base version, it should be ambiguous, explicit, ambiguous?

279

I think there is a difference -- in the previous version, you don't need to read the function name, you can see the reference type from the ^ point in the reference code snippet, while in the current version, you first identify the ^ point, and *move* our eyeball to the macro name (extra step).

Maybe we can use shorter names for these macros (e.g. EXPECT_IMPLICIT, EXPECT_EXPLICIT?).

Agree there are readability concerns here (long macro names, out-of-line kinds, complicated macros).

My suggestion would be to treat RefKind as an optional feature. Most tests would just assert the set of marked locations and test the cases where kind is interesting separately.

string markReferences(string TargetCode, string ReferencingCode, bool WithKind);
// markReferences("int x;", "int y = ^x;", false) => "int ^x;"
// markReferences("int x;", "int y = ^x;", true) => "int $explicit^x;"

#define EXPECT_REFS(T, R) EXPECT_EQ(T, markReferences(Annotations(T).code(), R, false)) << R;
#define EXPECT_TYPED_REFS(T, R) EXPECT_EQ(T, markReferences(Annotations(T).code(), R, true)) << R;
clang-tools-extra/include-cleaner/unittests/WalkASTTest.cpp
248

It's probably OK to use raw-strings inside macros these days...
Some versions of GCC had a problem with it, but the c++17 bump may have taken care of those.