This is an archive of the discontinued LLVM Phabricator instance.

[include-cleaner] Make use of locateSymbol in WalkUsed
ClosedPublic

Authored by kadircet on Nov 17 2022, 4:39 AM.

Diff Detail

Event Timeline

kadircet created this revision.Nov 17 2022, 4:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 17 2022, 4:39 AM
kadircet requested review of this revision.Nov 17 2022, 4:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 17 2022, 4:39 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
hokein added a comment.Dec 5 2022, 2:43 AM

sorry for the delay, just realize it is sitting in the review list, left some comments.

clang-tools-extra/include-cleaner/lib/Analysis.cpp
28

nit: this name is similar to findHeaders (not sure this is intended), I'd rather use a different name (something like headersForSymbol)

44

I agree the place here might be on a performance-critical path, but I think the idea of using cache probably needs some bits of design, the current implementation seems half baked (there is also a FIXME in the findAllHeaders saying we should implement another cache for it, so it is unclear to me what's the whole picture).

Can we just leave out all cache in this patch with a FIXME? And having a follow-up patch for that?

46

any reason not using llvm::DenseMap?

50

nit: consider using structure binding to get rid of the Insert.first, Insert.second magic names.

clang-tools-extra/include-cleaner/unittests/AnalysisTest.cpp
115–116

it feels weird that all other tests are using WalkUsedTest, but not this one.

The logic are mostly the same except that this test doesn't need to build the AST nodes, if we change offsetToProviders method to offset(llvm::ArrayRef<Decl*> TopLevelDecl, llvm::ArrayRef<SymbolReference> MacroRef, SourceManager &SM), I think we can use it in this test.

kadircet updated this revision to Diff 480340.Dec 5 2022, 11:13 PM
kadircet marked 3 inline comments as done.
  • Address comments
clang-tools-extra/include-cleaner/lib/Analysis.cpp
44

i agree that the cache for findHeaders is a bit more nuanced, as it'll need to have assumptions about a given SymbolLocation, no matter how/where it was acquired will return the same results.
but i am not sure what's complicated/unclear about the cache for decl to headers. The benefit is clear as explained in the comment around saving computations every time a foo is referenced inside the file.
Since both the benefit is clear and the assumption around "a decl having the same set of providers" sounds simple and sane enough, i'd actually keep this one in.

sammccall added inline comments.Dec 6 2022, 2:00 AM
clang-tools-extra/include-cleaner/lib/Analysis.cpp
44

I also think this needs a bit more examination. It seems cheap, but there are several overlapping concerns and addressing the first is always cheap but makes the others more expensive.

e.g. here, it's not obvious which of these is most desirable and therefore should probably be done first:

  • caching symbol => header
  • caching (part of) location => header
  • caching/optimizing location => fileid
  • propagating signals across these steps
  • exposing location from the API
  • keeping the implementation simple

I don't personally have a strong intuition of which of these steps should be cached for performance, if any of them will really matter, and whether we're better off caching end-to-end (cache a longer computation, lower hit-rate) vs individual steps (small computation, higher hit-rate).

To me this seems premature without *at least* measuring that it improves performance, but ideally showing that it's better than caching at a different level.

clang-tools-extra/include-cleaner/test/html.cpp
7 ↗(On Diff #480340)

I'd like to keep a very simple smoke test of the HTML report.

If you're aiming to test something specific here about the HTML report, I'd suggest creating a new test describing what it is. (Though generally speaking, the HTML test has poor test coverage of its details, which I think is unfortunately the right tradeoff)

If you're aiming to provide an integration test (these locations are taken into account in making overall decisions), I think this should be a test using -print=changes rather than -html.

clang-tools-extra/include-cleaner/unittests/AnalysisTest.cpp
61

shouldn't we have a location filter for the decls, to simulate RecordAST?

kadircet marked 5 inline comments as done.Dec 6 2022, 3:10 AM
kadircet added inline comments.
clang-tools-extra/include-cleaner/test/html.cpp
7 ↗(On Diff #480340)

thanks makes sense. print-changes didn't exist back in the day when i was putting together this patch and i forgot to update this piece.

i wasn't trying to test HTMLReport behaviour specifically, and it's probably not worth adding that anyways.

kadircet updated this revision to Diff 480419.Dec 6 2022, 3:10 AM
kadircet marked an inline comment as done.
  • Drop the cache
  • Use -print=changes in lit test
  • Update unittest helper to limit decls to main file
kadircet updated this revision to Diff 480420.Dec 6 2022, 3:11 AM
  • Fix RUN: line in lit test
hokein accepted this revision.Dec 6 2022, 3:32 AM

looks good from my side.

This revision is now accepted and ready to land.Dec 6 2022, 3:32 AM
sammccall accepted this revision.Dec 6 2022, 4:19 AM

LG, thanks!

clang-tools-extra/include-cleaner/lib/Analysis.cpp
33

nit: I think this would be better as a vaguer comment at a higher level (in walkUsed) like "it might be useful to cache some parts of this computation to save repeated work"

clang-tools-extra/include-cleaner/test/multiple-providers.cpp
2

nit: splitting RUN lines over lines makes copy/pasting/debugging lit tests harder and ends up being net-negative IMO. I've added a .clang-format file so that >80 col lines won't get split

kadircet marked 2 inline comments as done.Dec 6 2022, 6:01 AM
kadircet added inline comments.
clang-tools-extra/include-cleaner/test/multiple-providers.cpp
2

i also have some extra enforcement by my editor for line breaks :P. i didn't know copy pasting run lines were a workflow (because they usually have place holders one needs to change afterwards), i've always used the command line printed with the failing test. makes sense to keep it in a single line for people with such a workflow though.

kadircet updated this revision to Diff 480454.Dec 6 2022, 6:01 AM
kadircet marked an inline comment as done.
  • Address comments
This revision was landed with ongoing or failed builds.Dec 6 2022, 6:03 AM
This revision was automatically updated to reflect the committed changes.