This is an archive of the discontinued LLVM Phabricator instance.

[include-cleaner] Allow multiple strategies for spelling includes.
ClosedPublic

Authored by VitaNuo on May 9 2023, 2:14 AM.

Diff Detail

Event Timeline

VitaNuo created this revision.May 9 2023, 2:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2023, 2:14 AM
VitaNuo updated this revision to Diff 525560.May 25 2023, 5:47 AM

Instantiate the strategies early. Prevent instantiation on each spelling request.

VitaNuo updated this revision to Diff 525566.May 25 2023, 5:55 AM

Minor update.

VitaNuo updated this revision to Diff 525574.May 25 2023, 6:21 AM

Remove extra comment.

VitaNuo published this revision for review.May 25 2023, 6:22 AM
VitaNuo retitled this revision from [WIP][include-cleaner] Allow multiple strategies for spelling includes. to [include-cleaner] Allow multiple strategies for spelling includes..
VitaNuo added a reviewer: kadircet.
Herald added a project: Restricted Project. · View Herald TranscriptMay 25 2023, 6:22 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
VitaNuo updated this revision to Diff 525581.May 25 2023, 6:40 AM

Move the speller out of the loop.

thanks a lot! outline seems good. mostly some comments on implementation details.

clang-tools-extra/include-cleaner/include/clang-include-cleaner/Analysis.h
93

we should have some comments explaining the semantics and rationale. maybe something like:

An extension point to let applications introduce custom spelling strategies for physical headers.
It takes in absolute path to a source file and should return a verbatim include spelling (with angles/quotes) or an empty string to indicate no customizations are needed.
98

we can change the interface here to just a functor now, e.g. llvm::function_ref<std::string(llvm::StringRef /*AbsPath*/)> MapHeader.

we should also add some comments to explain the contract:

Generates a spelling for `H` that can be directly included in `Main`.
When `H` is a physical header, prefers the spelling provided by `MapHeader` if any, otherwise uses header search info to generate shortest spelling.
100

let's move this closer to interface definition.

102

let's move this into the source file and maybe expose with something like:

/// A header mapper that iterates over all registered include spelling strategies.
/// Note that when there are multiple strategies iteration order is not specified.
std::function<std::string(llvm::StringRef)> defaultHeaderMapper();

you can also make this the default for mapHeader parameter in spellHeader.

105

instead of a constructor you can have:

static auto *Strategies = []{
  auto *Result = new llvm::SmallVector<std::unique_ptr<include_cleaner::IncludeSpeller>>;
  for(auto &Strategy: include_cleaner::IncludeSpellingStrategy::entries()) {
   Result->push_back(Strategy.instantiate());
  }
}();

in the functor implementation.

clang-tools-extra/include-cleaner/lib/Analysis.cpp
92

let's keep the switch here, as it'll trigger a warning when there's a non-matched kind during compiles.

120

you can use FinalSpelling here

130

we should instantiate this outside the callback instead (to make sure we do it once). it would become obsolete if you're to use a static variable to store the strategies though.

VitaNuo updated this revision to Diff 526610.May 30 2023, 7:41 AM
VitaNuo marked 8 inline comments as done.

Address review comments.

Thanks for the comments!

VitaNuo added inline comments.May 30 2023, 7:41 AM
clang-tools-extra/include-cleaner/include/clang-include-cleaner/Analysis.h
105

Ok, in that case I'll get rid of the class altogether.

clang-tools-extra/include-cleaner/lib/Analysis.cpp
130

Using a static variable now.

VitaNuo updated this revision to Diff 526615.May 30 2023, 7:45 AM

Remove extra braces.

It looks good overall. I left some comments around the interfaces. Let me know if I miss/misunderstand something.

clang-tools-extra/include-cleaner/include/clang-include-cleaner/Analysis.h
89

I think this is an important API (we will create a subclass for our internal use), probably worth a dedicated IncludeSpeller.h/.cpp file.

89

I think it would be nice to have a unittest for it.

You can create a subclass TestIncludeSpeller in the unittest, which implements a dummy include spelling for a particular absolute file path (e.g. a file path starting with /include-cleaner-test/), and verify spellHeader API return expected results.

97

nit: maybe name the parameter AbsolutePath? I think it is better to mention the absolute file path in the name.

100

nit: consider using using IncludeSpellingStrategy = llvm::Registry<IncludeSpeller>;, which is a more modernized way.

105

It is unclear to me why we need defaultHeaderMapper and the parameter MapHeader in spellHeader in the header.

Do we want the caller of spellHeader to provide a different HeaderMapper? I don't see a usecase for that -- the current strategy of is to iterate all extension points, if we find the first available one, we just return it; otherwise we use the default fallback (suggestPathToFileForDiagnostics). I believe it is enough for spellHeader to cover all our cases.

VitaNuo updated this revision to Diff 527314.Jun 1 2023, 12:54 AM
VitaNuo marked 5 inline comments as done.

Address review comments.

VitaNuo added inline comments.Jun 1 2023, 1:09 AM
clang-tools-extra/include-cleaner/include/clang-include-cleaner/Analysis.h
89

Ok, created the test that tests the spellHeader API with a dummy speller. I do have to pass the speller object manually, though. If I try to link it via llvm::Registry, it creates side effects for other test, e.g., AnalysisTest.cpp, etc., which is undesirable.

105

Plugins might need extra information, e.g. clangd-configs for remapping quotes to > angles (or full path re-writes)
Reason to push registry to applications and rather take in a functor in >include_cleaner (or just let it be handled by applications completely?)

This is a quote from our sync notes. I believe the idea was that applications might want to parametrize mapping somehow. When linking via a strategy, you can only provide a class that has a parameterless constructor, though. At least that's my understanding.

VitaNuo updated this revision to Diff 527330.Jun 1 2023, 2:32 AM

Update test name.

hokein added a comment.Jun 1 2023, 5:29 AM

thanks, left some comments per our offline discussion.

clang-tools-extra/include-cleaner/include/clang-include-cleaner/Analysis.h
105

Right.

As discussed, for extra parameters, we can pass the input parameters via a virtual method call. we could pack all parameters into a struct, we could refine the interface like :

class IncludeSpeller {
public:
   struct Input {
      const Header& HeaderToInsert;
      HeaderSearch &HS;
      const FileEntry* MainFile;
   };
   
   virtual std::string spell(const Input&) = 0;
};
using IncludeSpellerRegistry = llvm::Registry<IncludeSpeller>;
std::string spellHeader(const IncludeSpeller::Input& Input);

Now for each IncludeSpeller, Input provides enough information to implement their strategy.
e.g. for our internal one which needs to resolve all symlinks, we can get the FileManager from the HeaderSearch.

clang-tools-extra/include-cleaner/lib/Analysis.cpp
96

as discussed, tryGetRealPathName doesn't guarantee it returns an absolute file path, and it is not enough to support our internal use case where we need to resolve the symlink (which requires a FileManager).

clang-tools-extra/include-cleaner/unittests/IncludeSpellerTest.cpp
1

nit: IncludeSpellerTest.cpp

60

we miss to check the return value here -- if it is false (the AbsolutePath doesn't start with the testRoot()), then we returns an empty string.

The mental model is that each IncludeSpeller subclass only implements their strategy for a particular interesting path pattern, if the absolute path doesn't match the pattern, we should return an empty string, claiming that there is no customization for this IncludeSpeller.

VitaNuo updated this revision to Diff 527412.Jun 1 2023, 7:19 AM

Address review comments.

VitaNuo marked 4 inline comments as done.Jun 1 2023, 7:24 AM

Thanks for the comments.

VitaNuo updated this revision to Diff 527420.Jun 1 2023, 7:49 AM

Remove extra path handling.

VitaNuo updated this revision to Diff 527446.Jun 1 2023, 9:05 AM

Fix windows build.

VitaNuo updated this revision to Diff 527447.Jun 1 2023, 9:06 AM

Add newline at file end.

VitaNuo updated this revision to Diff 527448.Jun 1 2023, 9:08 AM

Add newline at file end.

VitaNuo updated this revision to Diff 527451.Jun 1 2023, 9:09 AM

Remove extra include.

VitaNuo updated this revision to Diff 527452.Jun 1 2023, 9:11 AM

Remove newline at file end.

VitaNuo updated this revision to Diff 527466.Jun 1 2023, 9:46 AM

Fix windows test.

hokein added a comment.Jun 2 2023, 1:07 AM

Thanks, a few comments to simplify the code/test further.

clang-tools-extra/include-cleaner/include/clang-include-cleaner/Analysis.h
95

can you move it to the IncludeSpeller.h file? I think it is a more suitable place.

clang-tools-extra/include-cleaner/include/clang-include-cleaner/IncludeSpeller.h
23

This struct is the input for the IncludeSpeller interface, thus I prefer to move the structure to IncludeSpeller class, and call it Input.

clang-tools-extra/include-cleaner/lib/Analysis.cpp
86–88

similar to .h file, I think we should move this implementation to the IncludeSpeller.cpp file.

100

I think it may be clearer to make this fallback spelling implementation as a DefaultIncludeSpeller class, and append it to the end of Spellers (but not register it to make sure it always the last one of the spelling IncludeSpellingStrategy).

The code looks like

class DefaultIncludeSpeller : IncludeSpeller {
   ...
};
std::string mapHeader(const IncludeSpellerInput &Input) {
  static auto Spellers = [] {
    auto Result =
        llvm::SmallVector<std::unique_ptr<include_cleaner::IncludeSpeller>>{};
    for (const auto &Strategy :
         include_cleaner::IncludeSpellingStrategy::entries())
      Result.push_back(Strategy.instantiate());
    Result.push_back(std::make_unique<DefaultIncludeSpeller>());
    return Result;
  }();
  ....
}

std::string spellerHeader(...) {
   ...
   case Header::Physical:
      return mapHeader(Input);
}
clang-tools-extra/include-cleaner/unittests/IncludeSpellerTest.cpp
3

line: line 1 and line 2 should be merged into a single line.

61

nit: inline the Result.

if (!AbsolutePath.consume_front())
  return "";
return "\"" + AbsolutePath.str() + "\"";
82

I think there should be "" around the foo.h, the spellHeader API returns the spelling contains the ""<> characters, so we need to add "" to the result in the DummyIncludeSpeller::operator() method.

91

I think the unittest is mainly for testing our llvm-registry mechanism work by verifying DummyIncludeSpeller is used for file path starting with clangd-test, and fallback one is used otherwise.

we're less interested in the the different file separator per platform. Let's try to simplify the unittest to avoid subtle platform issues, something like below should cover the thing we want:

Inputs.ExtraFiles["/include-cleaner-test/foo.h"] = "";
Inputs.ExtraFiles["system_header.h"] = "";

Inputs.ExtraFlags = {"-isystem."};
...

EXPECT_EQ("\"foo.h\"",
            spellHeader({Header{*FM.getFile("foo.h")}, HS, MainFile})); // spelled by the dummerIncludeSpeller.
EXPECT_EQ("<system_header.h>",
            spellHeader({Header{*FM.getFile("system_header.h")}, HS, MainFile})); // spelled by the fallback one.
VitaNuo updated this revision to Diff 527872.Jun 2 2023, 8:19 AM
VitaNuo marked 6 inline comments as done.

Address review comments.

Thanks for the comments.

hokein added a comment.Jun 5 2023, 1:02 AM

thanks, looks good overall, a few more comments

clang-tools-extra/clangd/Hover.cpp
1225–1227

we probably need to add a IncludeSpeller.h #include insertion for this file, and probably for clangd/IncludeCleaner.cpp as well

clang-tools-extra/include-cleaner/include/clang-include-cleaner/IncludeSpeller.h
29

nit: we can use const HeaderSearch& now if you rebase the patch to main branch.

36

the doc is stale now, please update it accordingly,

43

and here as well

46

nit: can you move this statement immediately right after the IncludeSpeller class?

clang-tools-extra/include-cleaner/lib/IncludeSpeller.cpp
22 ↗(On Diff #527872)

nit: can you move this and the mapHeader to an anonymous namespace? these symbols are only used for the spellHeader implementation, we should keep it internal to avoid potential ODR violation.

33 ↗(On Diff #527872)

we need a static key word before the auto -- this is to make sure we only initiate all spellers once (otherwise, we will create all spellers once per spellHeader call, this can be expensive).

59 ↗(On Diff #527872)

nit: remove the debugging statement.

62 ↗(On Diff #527872)

no need to check the empty here otherwise we will hit the unreachable statement below (it is fine to return empty string)

VitaNuo updated this revision to Diff 528315.Jun 5 2023, 1:46 AM
VitaNuo marked 11 inline comments as done.

Address review comments.

Thanks for the comments!

clang-tools-extra/clangd/Hover.cpp
1225–1227

Yeah also to the IncludeCleanerCheck.cpp now that I've rebased to main.

hokein accepted this revision.Jun 5 2023, 2:32 AM

Thanks, looks great!

clang-tools-extra/include-cleaner/include/clang-include-cleaner/IncludeSpeller.h
23

would be nice to add some documentation for this interface, something like

IncludeSpeller provides an extension point to allow clients implement custom include spelling strategies applications for physical headers.
24

nit: remove the extra blank line.

clang-tools-extra/include-cleaner/lib/IncludeSpeller.cpp
24 ↗(On Diff #528315)

add a public:.

33 ↗(On Diff #528315)

nit: maybe name it spellPhysicalHeader( I'd avoid using map, it reminds me too much on the stdlib mapping).

This revision is now accepted and ready to land.Jun 5 2023, 2:32 AM
VitaNuo updated this revision to Diff 528341.Jun 5 2023, 2:46 AM
VitaNuo marked 4 inline comments as done.

Address the comments.

This revision was landed with ongoing or failed builds.Jun 5 2023, 2:47 AM
This revision was automatically updated to reflect the committed changes.