This is an archive of the discontinued LLVM Phabricator instance.

[include-cleaner] clang-include-cleaner can print/apply edits
ClosedPublic

Authored by sammccall on Nov 30 2022, 7:10 AM.

Details

Summary

This adds command-line flags to the tool:
+ -print: prints changed source code
+ -print=changes: prints headers added/removed
+ -edit: rewrites code in place
+ -insert=0/-remove=0: disables additions/deletions for the above

These are supported by a couple of new functions dumped into Analysis:
analyze() sits on top of walkUsed and makes used/unused decisions for
Includes. fixIncludes() applies those results to source code.

Diff Detail

Event Timeline

sammccall created this revision.Nov 30 2022, 7:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 30 2022, 7:10 AM
Herald added a subscriber: mgrang. · View Herald Transcript
sammccall requested review of this revision.Nov 30 2022, 7:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 30 2022, 7:10 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
kadircet accepted this revision.Dec 1 2022, 6:47 AM

thanks!

clang-tools-extra/include-cleaner/include/clang-include-cleaner/Analysis.h
66

nit: name the Includes parameter. preferably MainFileIncludes (or ProvidedIncludes)?

clang-tools-extra/include-cleaner/lib/Analysis.cpp
77

nit: maybe leave a comment here for skipping Headers we've seen before. there's a quite good chance that we'll see same provider showing up multiple times, skipping processing might be helpful (later on we'll probably need to cache the analysis results for diagnostics purposes).

111

this logic makes me feel a little bit uneasy, as we're relying on alphabetical sorting of Results.Missing, which isn't just the filenames but also contains < or " at the beginning.
clang-format has weird include categories but I think it never mixes " includes with < includes. so we're probably safe here.
but it might still be safer to just keep a map of offsets to edits, up to you. (having a comment at the very least would be nice)

clang-tools-extra/include-cleaner/unittests/AnalysisTest.cpp
147

nit: you can call guard(Code) here and elsewhere

This revision is now accepted and ready to land.Dec 1 2022, 6:47 AM
sammccall updated this revision to Diff 479574.Dec 2 2022, 2:54 AM
sammccall marked an inline comment as done.

replace fixIncludes impl with clang-format, tweak tests

Herald added a project: Restricted Project. · View Herald TranscriptDec 2 2022, 2:54 AM
sammccall added inline comments.Dec 2 2022, 2:56 AM
clang-tools-extra/include-cleaner/include/clang-include-cleaner/Analysis.h
66

Added a name for consistency, but I don't think the name is actually useful here.

("MainFile" is inconsistent with the other parameters, and "Provided" is redundant for an input parameter)

clang-tools-extra/include-cleaner/lib/Analysis.cpp
77

I think you're talking about a performance optimization using a cache?

We still need to process each header to compute Satisfied. So at most we're replacing a trivial comparison + 2 hashtable lookups (Inc.match and Used.insert) with one cache lookup. (The inner loop count is ~always <=1).

Happy with such a change if it improves a benchmark of course but my expectation is that it *wouldn't* (small constant factor on a fairly fast operation) so a FIXME seems premature here.

111

TL;DR: replaced implementation with clang-format, tweaked signature


Good catch, this is completely wrong, I should capture all the edits, stable_sort them, and then fold together.

But this got me thinking that actually this doesn't give the correct relative order of inserted headers: they'll just be in whatever order I start with. I switched to using libFormat instead - I can't really understand from the implementation how it solves this problem, but it probably does, and if not it's the right place to fix it.

This had some knock-on effects on the API: no more messing with MemoryBuffer etc.

sammccall updated this revision to Diff 479575.Dec 2 2022, 2:56 AM

delete dead code, add comment

kadircet accepted this revision.Dec 2 2022, 3:23 AM

thanks!

clang-tools-extra/include-cleaner/lib/Analysis.cpp
77

We still need to process each header to compute Satisfied

but we need to do this only once per unique Providers across all the callbacks (e.g. we'll get Foo.h as a provider for every reference of Foo and createFoo in the main file). From header analysis point of view, i don't think we need to treat each of these Providers specially based on the nature of the reference (neither the symbol nor the reference location/type).

So at most we're replacing a trivial comparison + 2 hashtable lookups (Inc.match and Used.insert) with one cache lookup. (The inner loop count is ~always <=1).

Yeah I guess you're right. I was treating the inner-loop as going over all the includes in the main file, but in practice it's just a bunch of lookups into a hashtable and operating over a single include.

Happy with such a change if it improves a benchmark of course but my expectation is that it *wouldn't* (small constant factor on a fairly fast operation) so a FIXME seems premature here.

no need. agreed that it is unlikely to make a difference. and if it would, it's probably better to write this in a way that'll de-duplicate Providers across references with some bucketing over reftypes and process all of them once.

sammccall added inline comments.Dec 2 2022, 4:16 AM
clang-tools-extra/include-cleaner/lib/Analysis.cpp
77

but we need to do this only once per unique Providers across all the callbacks (e.g. we'll get Foo.h as a provider for every reference of Foo and createFoo in the main file)

By "process" I mean we need to at least look up into the cache.
That's pretty trivial, but we're not doing much more at the moment.

but in practice it's just a bunch of lookups into a hashtable

in fact just one lookup!

This revision was landed with ongoing or failed builds.Dec 2 2022, 4:17 AM
This revision was automatically updated to reflect the committed changes.