This is an archive of the discontinued LLVM Phabricator instance.

[include-cleaner] Add self-contained file support for PragmaIncludes.
ClosedPublic

Authored by hokein on Nov 9 2022, 1:14 AM.

Diff Detail

Event Timeline

hokein created this revision.Nov 9 2022, 1:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 9 2022, 1:14 AM
hokein requested review of this revision.Nov 9 2022, 1:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 9 2022, 1:14 AM

i guess this patch needs a rebase for other pieces as well? but LG in general

clang-tools-extra/include-cleaner/lib/Analysis.cpp
47 ↗(On Diff #474189)

with the API i suggested in the base revision this should look like:

while(FID != SM.getMainFileID() && !PI.isSelfContained(FID))
  FID = SM.getFileID(SM.getIncludeLoc(FID));
return SM.getFileEntryForID(FID);

i think feels more natural

clang-tools-extra/include-cleaner/lib/Record.cpp
138

FE->getUniqueID(), same below

clang-tools-extra/include-cleaner/unittests/AnalysisTest.cpp
77

nit: rather than introducing a 3rd header, i'd just include it in header2.h.

hokein updated this revision to Diff 475067.Nov 14 2022, 1:01 AM
hokein marked 3 inline comments as done.

rebase and address comments.

kadircet added inline comments.Nov 15 2022, 7:26 AM
clang-tools-extra/include-cleaner/include/clang-include-cleaner/Record.h
100

s/during/detected during/

clang-tools-extra/include-cleaner/lib/FindHeaders.cpp
18

what about selfContainedHeader instead of getResponsibleHeader, responsibility is a concept that we rather define between which header should include what headers (e.g. a header providing return types of its APIs)

24

FID = SM.getDecomposedIncludedLoc(FID).first;

which does a cached lookup, so might be faster than getFileID, especially when performing repeated lookups (e.g. a single file importing multiple tablegen'd headers).

37

drop fixme

38

can you drop FID (as it isn't used elsewhere) and just pass Loc.physical() into helper?

39

i am not sure if sequencing of this and rest of the IWYU pragma handling is the right actually.

e.g. consider a scenario in which:
private.h:

// IWYU private, include "public.h"
void foo();

private-impl.h:

#pragma once
#include "private.h"

we'll actually now use private-impl.h as provider for foo rather than public.h, a similar argument goes for export pragma. i think intent of the developer is a lot more explicit when we have these pragmas around, so we should prioritize them and then pick the self contained one, if it's still needed. WDYT?

clang-tools-extra/include-cleaner/lib/Record.cpp
306

s/ID/File

clang-tools-extra/include-cleaner/unittests/AnalysisTest.cpp
31

drop the const

clang-tools-extra/include-cleaner/unittests/FindHeadersTest.cpp
20

again const

hokein marked an inline comment as done.Nov 17 2022, 4:02 AM
hokein added inline comments.
clang-tools-extra/include-cleaner/lib/FindHeaders.cpp
39

It looks reasonable to me. Thanks for raising this case, this is a good point. I mostly considered the case where the self-contained file has the IWYU private mapping.

My current thought is:

Overall algorithm: we check the original FE to see whether we can get any IWYU mapping (private, export) headers, if there is any, we use these mapping headers as final results; if not, we use the selfContainedHeader(FE) and its IWYU mapping headers;

Below are the cases I considered -- 1), 2), 4) works fine, but the 3) is not perfect (as we prioritize the IWYU mapping headers), but probably ok.

// Case1: Private
// findHeaders("private1.inc") => "path/public1.h"
Inputs.ExtraFiles["private1.h"] = R"cpp(
  #pragma once
  #include "private1.inc"
)cpp");
Inputs.ExtraFiles["private1.inc"] = R"cpp(
  // IWYU pragma: private, include "path/public1.h"
)cpp";

// Case2: Private
// findHeaders("private2.inc") => "path/public2.h"
Inputs.ExtraFiles["private2.h"] = R"cpp(
  #pragma once
  // IWYU pragma: private, include "path/public2.h"
  #include "private2.inc"
)cpp";
Inputs.ExtraFiles["private2.inc"] = "";

// Case3: Export
// findHeaders("details1.inc") => "details1.inc", "export1.h"
Inputs.ExtraFiles["export1.h"] = R"cpp(
  #include "details1.inc" // IWYU pragma: export
)cpp";
Inputs.ExtraFiles["details1.inc"] = "";

// Case4: Export
// findHeaders("details2.inc") => "export2.h", "export2_internal.h"
Inputs.ExtraFiles["export2.h"] =R"cpp(
  #pragma once
  #include "export2_internal.h" // IWYU pragma: export
)cpp";
Inputs.ExtraFiles["export2_internal.h"] = R"cpp(
  #pragma once
  #include "details2.inc"
)cpp";
Inputs.ExtraFiles["details2.inc"] = "";

Let me know what you think about it.

hokein updated this revision to Diff 476372.Nov 18 2022, 12:43 AM
hokein marked 4 inline comments as done.

rebase and update based on the offline discussion

hokein marked 2 inline comments as done.Nov 18 2022, 12:45 AM
hokein added inline comments.
clang-tools-extra/include-cleaner/lib/FindHeaders.cpp
39

based on our offline discussion, we're going to preserve all header candidates here (and with proper private/export signals attached, which will be in in a followup patch), and let the later step to select proper headers.

mstorsjo added inline comments.
clang-tools-extra/include-cleaner/lib/CMakeLists.txt
12

Note that this file changed a bit further in 68294afa0836bb62be921e2143d147cdfdc8ba70, so you may want to rebase again.

(I tested this patch after rebasing, in a mingw+dylib build configuration, and it still builds correctly there.)

thanks, mostly LG a bunch of suggestions for tests

clang-tools-extra/include-cleaner/include/clang-include-cleaner/Record.h
74

we also need tests for this

clang-tools-extra/include-cleaner/lib/FindHeaders.cpp
42–43

nit: Results.append(PI->getExporters(FE, SM.getFileManager()))

43

can we rather write this as:

if (!PI)
  return FE ? {Header(FE)} : {};

i think it makes it more clear that in the absence of PI we're just terminating the traversal no matter what (rather than getting into the code below with possibly a null PI, because FE also happened to be null).

clang-tools-extra/include-cleaner/unittests/FindHeadersTest.cpp
72

tests and assertions here are getting a little bit hard to follow. do you mind splitting them into smaller chunks instead? each testing different parts of the traversal behaviour. one for private->public mappings, another for exports, another for non-self contained traversal. having another big integration test for testing each might also be fine, but it's really hard to reason about so i don't think that should be the main test in which we're testing the behaviour.

apart from that this might be easier to follow if we did some renaming:

  • header1.h -> private1.h
  • header2.h -> exporter.h (also probably move non-exported includes to a different header)
  • detailx.h -> exportedx.h
  • impx.inc -> fragmentx.inc
86

Verify that we emit exporters for each header on the path.

96

i think it would be better to check we stop traversal once we hit the pragma (i.e. have a non-self contained file included by another non-self contained file that also has a IWYU private mapping).

hokein updated this revision to Diff 476410.Nov 18 2022, 3:12 AM
hokein marked 2 inline comments as done.

address comment: split tests, add a smoke test for PragmaInclude::isSelfContained.

clang-tools-extra/include-cleaner/lib/CMakeLists.txt
12

sure, and thank you very much for taking care of the mingw+dylib build!

clang-tools-extra/include-cleaner/lib/FindHeaders.cpp
42–43

the type of two container is different (Header vs FileEntry, I updated to explicitly spell the Header type in the push_back) -- if we want to get rid of the for-loop, we could use the llvm::for_each.

43

I agree, we can do that, but but the sad bit is that we need to explicitly spell the return type in the conditional operator.

clang-tools-extra/include-cleaner/unittests/FindHeadersTest.cpp
72

that sounds fair enough, split into small tests.

96

I think this is covered by this case (though this case is quite simple) -- we started the traversing from the imp_private.inc, at the beginning we hit the the IWYU private mapping, we stop immediately.

kadircet accepted this revision.Nov 18 2022, 4:23 AM

thanks, lgtm!

clang-tools-extra/include-cleaner/unittests/FindHeadersTest.cpp
39

nit: i'd actually rename this to astWithIncludes and take in a set of filenames, then set up Inputs.Code here, eg:

astWithIncludes({"foo.h", "bar.h"}); -> Inputs.Code = `#include "foo.h"\n#include "bar.h"`; AST = TestAST(Inputs);
72

also we should assert that there's nothing funky going on with exporter.h itself

81

s/priviate/private

clang-tools-extra/include-cleaner/unittests/RecordTest.cpp
400 ↗(On Diff #476410)

nit: might be easier to just have #pragma once

This revision is now accepted and ready to land.Nov 18 2022, 4:23 AM
hokein updated this revision to Diff 476425.Nov 18 2022, 4:51 AM
hokein marked 3 inline comments as done.

update

clang-tools-extra/include-cleaner/unittests/FindHeadersTest.cpp
39

I think hiding and synthesizing the code of main file inside a function will hurt the code readability (though the main-file code is boilerplate). As a reader, I will prefer much that the main code and the all #includes are defined in a single place which is inside the TEST_F.

This revision was landed with ongoing or failed builds.Nov 18 2022, 4:52 AM
This revision was automatically updated to reflect the committed changes.