This is an archive of the discontinued LLVM Phabricator instance.

[include-cleaner] Add export IWYU pragma support.
ClosedPublic

Authored by hokein on Nov 3 2022, 2:18 AM.

Details

Summary
  • add support to PragmaIncludes to handle IWYU export/begin_exports/end_exports pragma;
  • implement an API to retrieve the direct exporter headers;

Diff Detail

Event Timeline

hokein created this revision.Nov 3 2022, 2:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 3 2022, 2:18 AM
hokein requested review of this revision.Nov 3 2022, 2:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 3 2022, 2:18 AM
kadircet added inline comments.Nov 3 2022, 4:09 AM
clang-tools-extra/include-cleaner/include/clang-include-cleaner/Record.h
90

what about a smallvector, instead of a denseset here?

90

nit: instead of UniqueIDs, you can directly store StringRefs in the values, if you use an Arena/StringSaver for storing full paths. that should save you from doing one extra lookup at query time.

94

i think it's important to mention why we're storing a string to the full path here:
There's no way to get a path for a UniqueID, especially when it hasn't been opened before. So store the full path and convert it to a FileEntry by opening the file again through a FileManager.

clang-tools-extra/include-cleaner/lib/Record.cpp
151

this has the nasty downside of:

// IWYU pragma: export
void foo();
#include "bar.h"

now bar.h will be considered exported :/ i think we have to store the line-number for the pragma as well and compare it (we can store 0 for block comments).

156

well, i've just noticed we're not actually doing anything specific to the include here (or for the keep equivalent) as we're just storing the line number we've seen the comment on.
so what about moving this logic (and the pragma keep one) into the HandleComment callback instead?

223

can you also assert that Stack top and current Range are from the same file id?

247

maybe call this ExportPragma instead of State?

249

what about just Location or SeenAt?

286

i think this is also a harsh assert. the file might get deleted from disk between indexing (e.g. clangd's preamble build) and query (e.g. clangd's AST built). so maybe log this occurrence instead and continue?

clang-tools-extra/include-cleaner/unittests/RecordTest.cpp
287–289

nit: indentation looks weird, maybe:

for(llvm::StringRef File : {"keep1.h", ...}) {
  Inputs.ExtraFiles[File] = "";
}
297

i think we should also EXPECT_TRUE for line 6 and 9.

346

can you also assert that for export1, export2 and mainfile?

371–373

nit: same indentation comment as above

hokein updated this revision to Diff 473174.Nov 4 2022, 2:25 AM
hokein marked 7 inline comments as done.

address review comments

hokein added inline comments.Nov 4 2022, 2:26 AM
clang-tools-extra/include-cleaner/include/clang-include-cleaner/Record.h
90

this is a good point. Thinking more about it, I think we don't even need RealPathNamesByUIDat all, if we store the Path as the value.

clang-tools-extra/include-cleaner/lib/Record.cpp
151

yeah, this is an invalid case, I think we should not put too much effort on handling all invalid cases, but this one is trivial to support, added line-number to the State.

156

right, but with the new implementation, we use the # line number to make sure the #include is within the export range, this logic needs to be here, otherwise the semantic of ShouldKeep will change to track all lines that have IWYU pragmas, even for lines without #include directive.

223

this assert makes sense for matching begin/end_exports case, but it is too strong for invalid cases, as the current code doesn't proper recover the unmatched begin/end exports cases at the moment (we might want to add some basic recovery mechanisms, but I think it is not a high priority).

e.g. for the case

good.h

// IWYU pragma: begin_exports
#include "bad.h"
// IWYU pragma: end_exports

bad.h

// IWYU pragma: end_exports
247

ExportPragma seems more confusing than State, it doesn't express that we're tracking the state of the export pragma.

249

renamed to SeenAt.

clang-tools-extra/include-cleaner/unittests/RecordTest.cpp
297

for these two cases it is false, based on the semantic of ShouldKeep, there is no #include directive on Line 6&9, so line 6&9 is not in the set, it will return false.

If there are #include directives on these lines, these are corner cases, multiple combinations:

  • /* IWYU Pragma: begin_exports*/ #include "keep.h"
  • #include "nokeep.h" /*IWYU Pragma: begin_exports*/
  • /* IWYU Pragma: end_exports*/ #include "nokeep.h"
  • #include "keep.h" /* IWYU Pragma: end_exports*/

Fully supporting these requires some effort and code complexity, and I believe these are rare cases, I would ignore these cases for now.

kadircet added inline comments.Nov 10 2022, 2:42 AM
clang-tools-extra/include-cleaner/lib/Record.cpp
150

do you mind extracting this into a method and re-flow in a early-exit friendly way, it's getting a little bit hard to follow. e.g:

void checkForExport(FileID IncludingFile, FileEntryRef IncludedFile) {
  if (ExportStack.etmpy()) return;
  auto &Top = ExportStack.top();
  if (Top.SeenAtFile != IncludingFile) return;
  ...
}
151

nit: auto &Top = ... to prevent the copy

154

can you move the comment to be above the if statement instead? nesting seems weird here. we can say make sure current include is covered by the export pragma.

155

unfortunately File is Optional.

156

track all lines that have IWYU pragmas, even for lines without #include directive

i am not sure what's so bad about this. i think it'll make the code less complicated, as inclusion directive doesn't need to care about mutating ShouldKeep anymore, and public api for the class doesn't actually care about include-directives for shouldKeep operation, it just operates on lines. so saying "yes" to preserving lines that have // IWYU keep/export but no associated inclusion directive, doesn't seem so confusing (i agree, it'd be better not to have, but i don't think is an important enough use case, i.e. people trying to make decisions for lines that don't have includes via include-cleaner is unlikely and possibly wrong enough that we shouldn't care).

up to you though, especially if you extract the export handling into a separate method, this shouldn't be too much of an issue.

156

nit: i'd actually make FullPath part of State and use it directly from there, rather than possibly calling tryGetRealPathName and save multiple times here.

195

nit: s/Handle/Record

247

i think State doesn't emphasize the fact that it's about exports specifically, hence i was trying to suggest including Export in the name somehow.

moreover, i don't think we're tracking the state of the export pragma, we're literally recording occurrence of an export pragma, the state is rather tracked via ExportStack itself. hence i'd still suggest renaming this to ExportPragma and if you want change the ExportStack to ExportState, but i think ExportStack is fine as is.

clang-tools-extra/include-cleaner/unittests/RecordTest.cpp
349

nit: re-order export3 and export2

hokein updated this revision to Diff 474505.Nov 10 2022, 4:29 AM
hokein marked 8 inline comments as done.

address remaining comments.

kadircet accepted this revision.Nov 10 2022, 4:56 AM

thanks for bearing with me, LG!

clang-tools-extra/include-cleaner/lib/Record.cpp
155

FileSeenHash feels a little bit confusing (sounds like hash of an entity), why not directly call it as the IncludingFile? (similarly instead of HashLine, IncludeLine or DirectiveLine ?)

167

i guess we should just use Top.FullPath here instead?

This revision is now accepted and ready to land.Nov 10 2022, 4:56 AM
hokein marked 2 inline comments as done.Nov 10 2022, 6:42 AM
hokein added inline comments.
clang-tools-extra/include-cleaner/lib/Record.cpp
167

oops, I added the FullPath, but forgot to update the usage here.

This revision was landed with ongoing or failed builds.Nov 10 2022, 6:43 AM
This revision was automatically updated to reflect the committed changes.
hokein marked an inline comment as done.