This is an archive of the discontinued LLVM Phabricator instance.

[include-fixer] Add usage count to find-all-symbols.
ClosedPublic

Authored by sammccall on Feb 21 2017, 9:28 AM.

Details

Summary

Add usage count to find-all-symbols.

FindAllSymbols now finds (most!) main-file usages of the discovered symbols.
The per-TU map output provides Signals: Seen (was NumOccurrences) and Used.
These are no longer part of the SymbolInfo. They are provided in batches, deduplicated.
The reducer aggregates these to find the number of files that use a symbol.

The idea here is to use Signals.Used for ranking: intuitively number of files that
use a symbol is more meaningful than number of files that include the header.

Event Timeline

sammccall created this revision.Feb 21 2017, 9:28 AM
sammccall updated this revision to Diff 89236.Feb 21 2017, 9:34 AM

Remove obsolete comment.

hokein edited edge metadata.Feb 22 2017, 1:19 AM

Thanks for the contributions!

include-fixer/find-all-symbols/FindAllSymbols.cpp
244–253

The way seems too tricky to me. I'd use a flag variable like isUsedDecl to distinguish and handle use and decl cases.

250

Is the preceding ! intended?

include-fixer/find-all-symbols/SymbolInfo.h
99

A blank between class members.

unittests/include-fixer/find-all-symbols/FindAllSymbolsTests.cpp
45–51

nit: You can use llvm::is_contained here.

49

The same.

63

nit: a blank line between methods.

527

I'd put a FIXME comment here.

554

The same.

sammccall updated this revision to Diff 89337.Feb 22 2017, 1:39 AM
sammccall marked 7 inline comments as done.

Address review comments.

Thanks for the review, still learning the style :)

include-fixer/find-all-symbols/FindAllSymbols.cpp
250

This does the right thing (!"string" is always false) but generates a warning.
Changed to assert(false && ...)

klimek added a subscriber: klimek.Feb 22 2017, 1:48 AM
klimek added inline comments.
include-fixer/find-all-symbols/FindAllSymbols.cpp
261

You can use llvm_unreachable instead. Or do you actually want to fall through in release mode?

include-fixer/find-all-symbols/SymbolInfo.h
95–96

So we map from Symbol to NumOccurences in SymbolInfoMain, but duplicate the info in the key? That seems somewhat weird.

sammccall planned changes to this revision.Feb 22 2017, 2:28 AM

OK, some changes to come here:

  • klimek thinks using a set as a pseudo-map with mutable is evil, which is a fair point
  • the current SymbolReporter interface doesn't provide enough information to deduplicate efficiently, so we'll move the dedup to FindAllSymbols instead.
include-fixer/find-all-symbols/SymbolInfo.h
95–96

We no longer do that mapping.

sammccall updated this revision to Diff 89360.Feb 22 2017, 7:31 AM

Changes based on offline discussion:

  • mutable SignalInfo data split into SignalInfo::Signals
  • yaml serialized data is SignalInfo::WithSignals, which is currently a pair<SignalInfo, Signals>
  • FindAllSymbols/FindAllMacros now report in batches, with symbols already deduplicated. This makes it easy to count each (file,symbol) only once.
  • implemented macro usages because I had to touch that anyway
sammccall edited the summary of this revision. (Show Details)Feb 22 2017, 7:34 AM
sammccall updated this revision to Diff 89365.Feb 22 2017, 8:07 AM

Switch from SymbolInfo::WithSignals as a typedef for std::pair, to a struct
SymbolAndSignals, to have clearer semantics.

sammccall updated this revision to Diff 89366.Feb 22 2017, 8:09 AM

git-clang-format

hokein added inline comments.Feb 23 2017, 12:42 AM
include-fixer/InMemorySymbolIndex.h
27

There are many places using std::vector<clang::find_all_symbols::SymbolAndSignals>. Maybe we can use a type alias for it, so that we can type less.

include-fixer/SymbolIndexManager.cpp
105

I think you miss & here.

include-fixer/find-all-symbols/FindAllMacros.cpp
36–39

Nit: No {} for single statement in if. The same below.

include-fixer/find-all-symbols/FindAllMacros.h
39

We are missing tests for these macro usages.

include-fixer/find-all-symbols/FindAllSymbols.h
52

s/on EndSourceFileAction/onEndOfTranslationUnit.

include-fixer/find-all-symbols/SymbolInfo.h
50

I think we can make a standalone class instead of making it a nested class of SymbolInfo because I don't see strong relationship between them. Maybe name it FindingSignals or FindingInfo.

72

We don't need this method since we have SymbolAndSignals?

100

I'd put this statement inside SymbolAndSignals.

128–130

Not much better idea on names, how about SymbolFinding?

struct SymbolFinding {
   SymbolInfo Symbol;
   FindingInfo Finding;
};
sammccall updated this revision to Diff 89484.Feb 23 2017, 2:20 AM
sammccall marked 4 inline comments as done.

Address review comments.

include-fixer/InMemorySymbolIndex.h
27

I guess? It's the namespaces that are the problem (vector<SymbolAndSignals> is fine) and most of the namespace noise wouldn't go away here.

is clang::find_all_symbols::SymbolsSignalsList better enough to obscure what the actual type is? It's 45 chars vs 54.

IMO it's not worth it here, though clang::find_all_symbols::SymbolInfo::SignalMap vs std::map<clang::find_all_symbols::SymbolInfo, clang::find_all_symbols::SymbolInfo::Signals> is.

include-fixer/find-all-symbols/FindAllMacros.h
39

These are covered by FindAllSymbolsTests, which (despite the name) tests the whole FindAllSymbolsAction.

Specifically, MacroTest and MacroTestWithIWYU cover these.

include-fixer/find-all-symbols/SymbolInfo.h
50

The relationship between them is a strong scoping one: signals only make sense in the context of a particular SymbolInfo.

If it was a parallel top-level class, it needs a name that communicates this relationship, most likely SymbolSignals. I don't think that's significantly better than SymbolInfo::Signals.

(If I had my druthers, these would probably be Symbol and Symbol::Signals - the "info" is the main reason that SymbolInfo::Signals is noisy. But not worth the churn I think)

100

That won't compile: it's the members of SymbolInfo that are private, not the members of SymbolAndSignals.

128–130

I don't think SymbolFinding is better:

  • it can be misinterpreted as finding *for* a signal, not findings *and* a signal. I think the And is important
  • "finding" is vague while "signal" is more specific. I changed this from finding -> signal already based on a discussion with Ben, if you do want to change this we should sync up offline :)
hokein added inline comments.Feb 23 2017, 7:24 AM
include-fixer/InMemorySymbolIndex.h
27

If we put the type alias under clang::include_fixer namespace, it will shorten the name more. Agree it is not worth the effect as the full name only happens in headers.

We could save a few characters by getting rid of clang because we are always in clang namespace. So std::vector<find_all_symbols::SymbolAndSignals> should work, this looks slightly better. :)

include-fixer/find-all-symbols/FindAllMacros.cpp
37

code style: use prefix ++. The same below.

include-fixer/find-all-symbols/FindAllMacros.h
39

Acked. You combined them in current tests (I originally thought there should be some separate tests for these).

include-fixer/find-all-symbols/SymbolInfo.cpp
79

You forgot to update this?

include-fixer/find-all-symbols/SymbolInfo.h
50

You convinced me. Please keep the comment of Signals updated.

If I had my druthers, these would probably be Symbol and Symbol::Signals - the "info" is the main reason that SymbolInfo::Signals is noisy. But not worth the churn I think)

In the initial design, SymbolInfo merely represents a cpp symbol. But as more features developed, SymbolInfo might be not a good name at the moment. Symbol seems not a better name as LLVM already has many classes named Symbol. We can leave it here.

100

Acked. Thanks for explanations. Sorry for misleading.

128–130

Fair enough. I don't have strong opinion about the name.

sammccall updated this revision to Diff 89506.Feb 23 2017, 7:44 AM
sammccall marked 3 inline comments as done.

Address review comments; remove redundant namespace qualifiers; format.

Thanks for bearing with me here :)

include-fixer/InMemorySymbolIndex.h
27

Done, also cleaned up other redundant namespaces in touched files.

hokein accepted this revision.Feb 23 2017, 7:58 AM

Thanks! Looks good from my side.

I'd wait to see whether @bkramer has more comments before commit it.

include-fixer/find-all-symbols/SymbolInfo.h
50

keep the comment of Signals updated.

You are missing this.

This revision is now accepted and ready to land.Feb 23 2017, 7:58 AM
sammccall updated this revision to Diff 89510.Feb 23 2017, 8:00 AM

Update comment - oops!

hokein added inline comments.Feb 23 2017, 11:48 AM
unittests/include-fixer/find-all-symbols/FindAllSymbolsTests.cpp
40

A new catch: NewSymbols should be passed by reference, otherwise a copy will be generated.

sammccall updated this revision to Diff 89632.Feb 24 2017, 2:37 AM

Report symbols by reference.

Thanks, still LGTM with one nit.

include-fixer/find-all-symbols/FindAllSymbols.cpp
263

We don't need std::move right now.

sammccall updated this revision to Diff 89636.Feb 24 2017, 3:09 AM

Remove redundant std::move

bkramer accepted this revision.Feb 27 2017, 1:45 PM

lg

include-fixer/SymbolIndexManager.cpp
157

std::move

include-fixer/find-all-symbols/FindAllMacros.cpp
69

Call me oldschool, but I still prefer FileSymbols.clear()

sammccall updated this revision to Diff 89982.Feb 28 2017, 12:24 AM
sammccall marked an inline comment as done.

Review comments

This revision was automatically updated to reflect the committed changes.