This is an archive of the discontinued LLVM Phabricator instance.

clang-tools-extra/clangd/Headers.h
137	Hey, I recognize this code :-)
I think the key ideas here are that:
- we're using opaque identifiers for the files: nothing is interesting about a file other than its (include) edges and (directly-referenced) color
- the identity of headers is an impl detail of Headers.h rather than being something like a FileID; this allows us to hide messy details of files not having stable identity across preamble->main file
I think the second idea is important, but the first one might be a bit naive. I worry it's going to lead to certain rules being hard to implement, or being bundled into Headers.cpp instead of IncludeCleaner. For example:
- if a file is not self-contained, how does this affect the algorithm? (There's a FIXME for this, but it's in the wrong file!)
- if a file is a standard library entrypoint?
- if a file is a standard library impl detail?
I think the facts (is a file self-contained) should be part of IncludeStructure, but that we should expose them for IncludeCleaner to deal with, rather than trying to hide them in `markUsed`. This means giving IncludeStructure a wider interface, which we need to be careful about. It makes it harder to substitute algorithms by swapping out the UsedFunc, but I think this is only a cute trick and not actually important.
Concretely, I think I'd suggest just extending the public API to expose the "file index" concept:
- expose the type `using IncludeStructure::File = unsigned` or so
- add the file id to `Inclusion`
- add `File getFile(const FileEntry)`
- add `ArrayRef<File> getIncludedFiles(File)`
- in future, we can add e.g. `const char isStandardLibraryEntrypoint(File)` or whatever
And implement all of markUsed in IncludeCleaner. (Not 100% sure if we actually still need the `Used` ivar in Inclusion, come to think of it, maybe we just run this code in the diagnostic cycle)
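To make the "file index" suggestion concrete, here is a minimal standalone sketch of such an interface. It uses only the standard library, and every name in it (`IncludeGraphSketch`, `getOrCreateFile`, `setSelfContained`, `isSelfContained`) is hypothetical: it is not clangd's `IncludeStructure`, and it interns file names rather than taking a `FileEntry`.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an include graph with a public "file index".
// Not the real clangd class; names and signatures are assumptions.
class IncludeGraphSketch {
public:
  using File = unsigned; // Opaque, stable index of a file in this graph.

  // Interns Name and returns its index, creating one on first use.
  File getOrCreateFile(const std::string &Name) {
    auto It = NameToIndex.find(Name);
    if (It != NameToIndex.end())
      return It->second;
    File ID = static_cast<File>(Children.size());
    NameToIndex.emplace(Name, ID);
    Children.emplace_back();
    SelfContained.push_back(true); // Assume self-contained until told otherwise.
    return ID;
  }

  // Records a direct include edge Including -> Included.
  void recordInclude(const std::string &Including,
                     const std::string &Included) {
    File From = getOrCreateFile(Including);
    File To = getOrCreateFile(Included);
    Children[From].push_back(To);
  }

  // Direct includes of F, in the order they were recorded.
  const std::vector<File> &getIncludedFiles(File F) const {
    return Children.at(F);
  }

  // A per-file fact exposed to the caller instead of hidden in the graph.
  void setSelfContained(File F, bool Value) { SelfContained.at(F) = Value; }
  bool isSelfContained(File F) const { return SelfContained.at(F); }

private:
  std::unordered_map<std::string, File> NameToIndex;
  std::vector<std::vector<File>> Children; // Indexed by File.
  std::vector<bool> SelfContained;         // Indexed by File.
};
```

With an interface of this shape, the "used" analysis can live entirely in the caller: it walks `getIncludedFiles()` and consults facts such as `isSelfContained()` itself.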
clang-tools-extra/clangd/IncludeCleaner.cpp
144	nit: move this to be a line comment on the sort() call? I think it's sufficiently nonobvious that sorting groups by file ID that it becomes nonobvious exactly what code the comment refers to!
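The reason sorting groups locations by file can be shown with a toy model, assuming each file occupies one contiguous range of location offsets. This is a sketch only: `FileRange`, `findFile` and `referencedFiles` are hypothetical names, and plain integers stand in for `SourceLocation` and `FileID`.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Toy model: a file owns the half-open offset range [Begin, End).
struct FileRange {
  unsigned Begin;
  unsigned End;
  int ID;
};

// Stand-in for the comparatively expensive location -> file lookup.
inline const FileRange *findFile(const std::vector<FileRange> &Files,
                                 unsigned Loc) {
  for (const FileRange &F : Files)
    if (Loc >= F.Begin && Loc < F.End)
      return &F;
  return nullptr;
}

// Returns the IDs of all files containing at least one location.
// If Lookups is non-null, it counts how many file lookups were performed.
inline std::set<int> referencedFiles(std::vector<unsigned> Locs,
                                     const std::vector<FileRange> &Files,
                                     unsigned *Lookups = nullptr) {
  // Sorting by offset places all locations of one file next to each other.
  std::sort(Locs.begin(), Locs.end());
  std::set<int> Result;
  for (auto It = Locs.begin(); It != Locs.end();) {
    const FileRange *F = findFile(Files, *It);
    if (Lookups)
      ++*Lookups;
    if (!F) { // Location outside every known file: skip it.
      ++It;
      continue;
    }
    Result.insert(F->ID);
    // Skip the remaining locations of the same file without another lookup.
    do
      ++It;
    while (It != Locs.end() && *It < F->End);
  }
  return Result;
}
```

In this model, five locations spread over two files need only two lookups, which is the saving the skip loop is after.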
clang-tools-extra/clangd/IncludeCleaner.h
50	so when do we perform this expansion? Seems like you've wired this up end-to-end in this patch and we're just going to hit the elog case. I think it's reasonable to put this expansion into findReferencedFiles fwiw, it's a fairly simple second pass and in practice combining them won't interfere with reasonable tests
51	comment just says spelling, not spelling/expansion, not sure if this is significant. Expansion is actually the more obvious, but I do think we need both.

Improve structure, address review comments.

Hey, sorry for the gigantic turnaround. I still need to cover the code with a few tests and polish it a bit more, but I've updated the majority of it and pushed to get some early feedback before I do that. Please let me know if you have any concerns or see any problems with the approach I went for!

Harbormaster completed remote builds in B125082: Diff 374174.Sep 22 2021, 4:04 AM

Populate Inclusion.ID, add a test (failing for now).

Make sure FileEntry* is not nullptr

Harbormaster completed remote builds in B125122: Diff 374226.Sep 22 2021, 7:52 AM

sammccall added inline comments.Sep 22 2021, 11:39 PM

clang-tools-extra/clangd/Headers.h
61	Most includes are part of the preamble, so there are two relevant parse actions (preamble, and mainfile-using-preamble). Each has its own SourceManager and therefore namespace of FileIDs. There's no rule that says a header gets the same FileID when a preamble is used.
As written, RecordHeaders is assigning Inclusion::ID based on the preamble, and then we end up comparing it to FileIDs from compareUnusedIncludes().
From reading the ASTReader code, I believe that there's a simple offset between the two: e.g. that if a preamble uses FileIDs from 1-100, then these might be mapped to FileIDs 1501-1600 when that preamble is reused. We could go down the path of exploiting this. (Though we need to investigate the details and think a little about how it works with modules).
The somewhat less-coupled alternative we use today is to use the FileEntry::Name as documented in the private section of IncludeStructure. There are a few ways to build on top of this - basically we're either going to do most calculations in FileID space, or expose a "stable file index" from IncludeStructure and do most calculations in that space...
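The problem and the name-based alternative can be illustrated with a small standalone sketch. It assumes each parse is modelled as a table from FileID to file name. `FileIDTable`, `StableFileIndex` and `toStableIndex` are hypothetical names, and the FileID values simply mirror the example numbers from the comment above.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical model of one parse: FileID -> file name.
using FileIDTable = std::map<int, std::string>;

// Interns file names into indices that do not depend on any one parse.
class StableFileIndex {
public:
  // Returns the index for Name, assigning the next free one on first use.
  unsigned intern(const std::string &Name) {
    auto It =
        Index.try_emplace(Name, static_cast<unsigned>(Index.size())).first;
    return It->second;
  }

  // Returns the index for Name if it was interned before.
  std::optional<unsigned> lookup(const std::string &Name) const {
    auto It = Index.find(Name);
    if (It == Index.end())
      return std::nullopt;
    return It->second;
  }

private:
  std::map<std::string, unsigned> Index;
};

// Translates a parse-local FileID into the stable index space via its name.
inline std::optional<unsigned> toStableIndex(const FileIDTable &Parse, int FID,
                                             const StableFileIndex &Index) {
  auto It = Parse.find(FID);
  if (It == Parse.end())
    return std::nullopt;
  return Index.lookup(It->second);
}
```

The same header can carry FileID 1 in the preamble parse and 1501 in the main-file parse, yet both translate to the same stable index, so comparisons are made in that space rather than between raw FileIDs.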

Perform the computation in the IncludeStructure::File space.

kbobyrev marked an inline comment as done.Sep 23 2021, 1:23 AM

Harbormaster completed remote builds in B125291: Diff 374469.Sep 23 2021, 1:30 AM

kbobyrev mentioned this in D110386: [clangd] Refactor IncludeStructure: use File (unsigned) for most computations.Sep 24 2021, 12:09 AM

Prepare for rebase: revert Headers.cpp and Headers.h

Harbormaster completed remote builds in B125799: Diff 375153.Sep 26 2021, 11:45 PM

Rebase on top of D110386.

Harbormaster completed remote builds in B125800: Diff 375154.Sep 26 2021, 11:56 PM

kbobyrev mentioned this in rG0b1eff1bc5d0: [clangd] Refactor IncludeStructure: use File (unsigned) for most computations.Sep 27 2021, 8:51 AM

kbobyrev mentioned this in rG1bcd6b51a982: [clangd] Refactor IncludeStructure: use File (unsigned) for most computations.Sep 27 2021, 10:51 PM

Rebase on top of main. Now ready for a review.

Harbormaster completed remote builds in B126671: Diff 376425.Sep 30 2021, 11:21 PM

Fix the rebase

Harbormaster completed remote builds in B126674: Diff 376430.Oct 1 2021, 12:00 AM

Tiny refactoring.

Harbormaster completed remote builds in B126675: Diff 376431.Oct 1 2021, 12:06 AM

Rebase on top of landed patches.

Ping, @sammccall

Sorry, I thought I'd sent these comments...

clang-tools-extra/clangd/IncludeCleaner.cpp
158	Why are we passing around Inclusions by value?
clang-tools-extra/clangd/IncludeCleaner.h
50	This says FIXME but IIUC it's fixed.
55	this function is undocumented, unused and untested :-) What's it for? Why does it not return a set?
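For comparison, a set-returning variant of the helper under discussion might look like the sketch below. It uses standard containers in place of `llvm::DenseMap` and `llvm::DenseSet`, and `directlyReferencedIncludes` is a hypothetical name, not a function from the patch.

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using File = unsigned;
// Maps including files (from) to included files (to).
using IncludeGraph = std::unordered_map<File, std::vector<File>>;

// Returns the direct includes of EntryPoint that are in Referenced.
// Includes absent from the result are the "unused" candidates.
inline std::unordered_set<File>
directlyReferencedIncludes(const IncludeGraph &Graph,
                           const std::unordered_set<File> &Referenced,
                           File EntryPoint) {
  std::unordered_set<File> Used;
  auto It = Graph.find(EntryPoint);
  if (It == Graph.end())
    return Used; // Unknown entry point: nothing is included.
  for (File Included : It->second)
    if (Referenced.count(Included))
      Used.insert(Included);
  return Used;
}
```

A set carries the same information as the `DenseMap<unsigned, bool>` in the patch, because the caller already knows the full list of direct includes.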

Harbormaster completed remote builds in B126991: Diff 377112.Oct 5 2021, 1:43 AM

Resolve review comments.

Harbormaster completed remote builds in B127001: Diff 377125.Oct 5 2021, 2:44 AM

Refactor FileID -> IncludeStructure::HeaderID into a separate function.

Harbormaster completed remote builds in B127019: Diff 377158.Oct 5 2021, 5:36 AM

sammccall added inline comments.Oct 5 2021, 5:47 AM

clang-tools-extra/clangd/IncludeCleaner.cpp
158	Sorry, should have thought this through more before leaving the comment. There are a couple of questions really:
How should we store the information about which inclusions are unused?
- not at all, generate ReferencedFiles and compute "is this header unused" on the fly when generating diagnostics
- store ReferencedFiles but call "is this header unused" on the fly
- store a boolean or something in each Inclusion
- store a list of the inclusions that are unused
IMO this is mostly a question of what's the lifecycle of the info, and what's the simplest representation - seems like we should prefer stuff higher on the list if we have a choice.
What should the signature of the function be?
There doesn't seem to be any work saved here by processing all the includes as a batch - why not simplify by just making this `bool isUnused(...)` and let the caller/test choose what data structures to use?
159	EntryPoint is unused
166	Handle unresolved case somehow? (Or assert if you're sure it can't happen - I think it can for e.g. pp_file_not_found)
166	doesn't seem like there's any need to go through filenames for this. Can't we just store the HeaderID in the Inclusion? (Blech, as an `unsigned` to avoid a circular dependency)
168	elog says that:
a) this might happen
b) logging for the user is the best thing we can do
Can this actually happen? My suspicion is no. In which case maybe it should be an assert?
173	I'm fairly (more) certain this one should be an assert
193	assert?
196	this is get, not getOrCreate, so you don't need the mutable reference
198	I'm not totally sure whether this is safe to assert or not. WDYT? In any case, please fix the message (FE -> HeaderID, add more context)

sammccall added inline comments.Oct 5 2021, 6:11 AM

clang-tools-extra/clangd/IncludeCleaner.cpp
158	OK I was confused, nothing is getting stored, but ParsedAST::getUnused() function creates/destroys the analysis data so it needs to run as a batch. I think probably:
- this function should just be a simple `bool isUnused(...)` and the loop lives in the caller
- `ParsedAST::getUnused()` should become `getUnused(const ParsedAST&)` and live in this file
We have some circularity between ParsedAST and IncludeCleaner, but I think we're going to have that in any case due to `findReferencedLocations()`
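The split proposed here, a per-include predicate plus a loop in the caller, could be sketched as follows. It is standalone: `InclusionSketch` is a hypothetical two-field stand-in for `Inclusion`, and `getUnused` takes plain containers instead of a `ParsedAST`.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, reduced stand-in for clangd's Inclusion.
struct InclusionSketch {
  std::string Written; // Include name as written, e.g. <vector>.
  unsigned HeaderID;   // Index of the included file in the include graph.
};

// Predicate for a single include: no batching, no stored state.
inline bool isUnused(const InclusionSketch &Inc,
                     const std::unordered_set<unsigned> &ReferencedFiles) {
  return ReferencedFiles.count(Inc.HeaderID) == 0;
}

// The loop lives in the caller, which also picks the result representation.
inline std::vector<const InclusionSketch *>
getUnused(const std::vector<InclusionSketch> &MainFileIncludes,
          const std::unordered_set<unsigned> &ReferencedFiles) {
  std::vector<const InclusionSketch *> Unused;
  for (const InclusionSketch &Inc : MainFileIncludes)
    if (isUnused(Inc, ReferencedFiles))
      Unused.push_back(&Inc);
  return Unused;
}
```

Returning pointers avoids copying inclusions, which also addresses the earlier question about passing them by value. The pointers stay valid only as long as the include list does.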

Address review comments.

Harbormaster completed remote builds in B127045: Diff 377197.Oct 5 2021, 7:01 AM

sammccall accepted this revision.Oct 5 2021, 8:20 AM

sammccall added inline comments.

clang-tools-extra/clangd/Headers.h
61	I don't think we're under any size pressure here - `Optional<unsigned>`?
61	call the member HeaderID, rather than have this as a comment only?
clang-tools-extra/clangd/IncludeCleaner.cpp
161	SM is unused
165	this can just be a check whether MFI.ID is valid or not
171	ReferencedFiles.contains(IncludeID) and inline into the if?
clang-tools-extra/clangd/ParsedAST.h
159	I think this function belongs in IncludeCleaner.h. (There's a circularity problem between ParsedAST and IncludeCleaner, but putting the function here doesn't fix it)
clang-tools-extra/clangd/unittests/IncludeCleanerTests.cpp
165 ↗	(On Diff #377197)	Don't bother doing this stripping just for the test IMO, it obscures the assertion (more than escaping quotes would)
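Several of the comments above (an optional `HeaderID` member, checking whether the ID is valid, inlining the `contains` check) combine into something like the sketch below. It is an assumption-laden illustration, not the committed code: `std::optional` stands in for `llvm::Optional`, `OptionalIDInclusion` and `isUnusedIfResolved` are hypothetical names, and treating an unresolved include as "not unused" is a choice made here for the example only.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_set>

// Hypothetical, reduced stand-in for Inclusion with an optional header ID.
struct OptionalIDInclusion {
  std::string Written;              // Include name as written.
  std::optional<unsigned> HeaderID; // Empty if the include did not resolve.
};

// Reports an include as unused only when it resolved to a known header
// and that header is not among the referenced files.
inline bool
isUnusedIfResolved(const OptionalIDInclusion &Inc,
                   const std::unordered_set<unsigned> &ReferencedFiles) {
  if (!Inc.HeaderID)
    return false; // Assumption: never flag includes that failed to resolve.
  return ReferencedFiles.count(*Inc.HeaderID) == 0;
}
```

An unresolved include (for example after a file-not-found error) then falls out of the analysis without logging or asserting.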

This revision is now accepted and ready to land.Oct 5 2021, 8:20 AM

Thank you for the review! Looks much better now.

Harbormaster completed remote builds in B127094: Diff 377264.Oct 5 2021, 8:56 AM

Closed by commit rGebfcd06d4222: [clangd] IncludeCleaner: Mark used headers (authored by kbobyrev). Oct 5 2021, 9:08 AM

This revision was automatically updated to reflect the committed changes.

kbobyrev added a commit: rGebfcd06d4222: [clangd] IncludeCleaner: Mark used headers.

kbobyrev marked an inline comment as done.Oct 5 2021, 9:43 AM

kbobyrev mentioned this in rG0c14e279c729: [clangd] Revert unwanted change from D108194.Oct 5 2021, 9:45 AM

kbobyrev mentioned this in rGb1309a1ed99d: [clangd] Revert unwanted change from D108194.Oct 8 2021, 1:42 AM

Diff 366835

clang-tools-extra/clangd/Headers.h

@@ ... @@
 // An #include directive that we found in the main file.
 struct Inclusion {
   tok::PPKeywordKind Directive; // Directive used for inclusion, e.g. import
   std::string Written; // Inclusion name as written e.g. <vector>.
   Path Resolved; // Resolved path of included file. Empty if not resolved.
   unsigned HashOffset = 0; // Byte offset from start of file to #.
   int HashLine = 0; // Line number containing the directive, 0-indexed.
   SrcMgr::CharacteristicKind FileKind = SrcMgr::C_User;
+  bool Used = false; // Contains symbols used in the main file.
 };
 llvm::raw_ostream &operator<<(llvm::raw_ostream &, const Inclusion &);
 bool operator==(const Inclusion &LHS, const Inclusion &RHS);

 // Contains information about one file in the build grpah and its direct
 // dependencies. Doesn't own the strings it references (IncludeGraph is
 // self-contained).
 struct IncludeGraphNode {
@@ ... @@ public:
   // Usually it should be SM.getFileEntryForID(SM.getMainFileID())->getName().
   llvm::StringMap<unsigned> includeDepth(llvm::StringRef Root) const;

   // This updates IncludeDepth(), but not MainFileIncludes.
   void recordInclude(llvm::StringRef IncludingName,
                      llvm::StringRef IncludedName,
                      llvm::StringRef IncludedRealName);

+  // Classifying the main-file includes as "used" or "unused" is subtle
+  // (consider transitive includes), so we inject the algorithm.
+
+  // Maps including files (from) to included files (to).
+  using AbstractIncludeGraph = llvm::DenseMap<unsigned, SmallVector<unsigned>>;
+  // Decides usage for each file included by EntryPoint based on the set of
+  // files that contain some referenced symbol.
+  using UsedFunc = llvm::DenseMap<unsigned, bool>(
+      const AbstractIncludeGraph &, llvm::DenseSet<unsigned> Referenced,
+      unsigned EntryPoint);
+  // Produce decisions for all files included from \p EntryPoint (usually the
+  // main file).
+  void markUsed(llvm::StringRef EntryPoint,
+                llvm::ArrayRef<StringRef> ReferencedFiles,
+                llvm::function_ref<UsedFunc>);
+
 private:
   // Identifying files in a way that persists from preamble build to subsequent
   // builds is surprisingly hard. FileID is unavailable in InclusionDirective(),
   // and RealPathName and UniqueID are not preserved in the preamble.
   // We use the FileEntry::Name, which is stable, interned into a "file index".
   // The paths we want to expose are the RealPathName, so store those too.
   std::vector<std::string> RealPathNames; // In file index order.
   unsigned fileIndex(llvm::StringRef Name);
@@ ... @@

clang-tools-extra/clangd/Headers.cpp

@@ ... @@ for (const auto &Parent : PreviousLevel) {
             Result[Name] = Level;
         }
       }
     }
   }
   return Result;
 }

+void IncludeStructure::markUsed(llvm::StringRef EntryPoint,
+                                llvm::ArrayRef<StringRef> ReferencedFiles,
+                                llvm::function_ref<UsedFunc> Algorithm) {
+  auto Root = NameToIndex.find(EntryPoint);
+  if (Root == NameToIndex.end()) {
+    elog("IncludeCleaner: EntryPoint {0} not found in include graph",
+         EntryPoint);
+    return;
+  }
+
+  llvm::DenseSet<unsigned> Referenced;
+  Referenced.reserve(ReferencedFiles.size());
+  for (llvm::StringRef RefName : ReferencedFiles) {
+    dlog("{0} is REFERENCED", RefName);
+    auto It = NameToIndex.find(RefName);
+    if (It != NameToIndex.end())
+      Referenced.insert(It->second);
+  }
+
+  auto Decisions =
+      Algorithm(IncludeChildren, std::move(Referenced), Root->second);
+  auto RootChildren = IncludeChildren.find(Root->second);
+  assert(RootChildren != IncludeChildren.end());
+  llvm::DenseMap</*RealPath*/ StringRef, /*Index*/ unsigned> IncludeIndex;
+  for (const auto Index : RootChildren->second) {
+    if (!RealPathNames[Index].empty())
+      IncludeIndex[RealPathNames[Index]] = Index;
+  }
+  for (auto &MFI : MainFileIncludes) {
+    // FIXME: Skip includes that are not self-contained.
+    auto It = IncludeIndex.find(MFI.Resolved);
+    if (It != IncludeIndex.end()) {
+      auto DIt = Decisions.find(It->second);
+      if (DIt != Decisions.end()) {
+        MFI.Used = DIt->second;
+        dlog("{0} is {1}", MFI.Written, MFI.Used ? "USED" : "UNUSED");
+      }
+    }
+  }
+}
+
 void IncludeInserter::addExisting(const Inclusion &Inc) {
   IncludedHeaders.insert(Inc.Written);
   if (!Inc.Resolved.empty())
     IncludedHeaders.insert(Inc.Resolved);
 }

 /// FIXME(ioeric): we might not want to insert an absolute include path if the
 /// path is not shortened.
@@ ... @@

clang-tools-extra/clangd/IncludeCleaner.h

@@ ... @@
 /// We use this to compute unused headers, so we:
 ///
 /// - cover the whole file in a single traversal for efficiency
 /// - don't attempt to describe where symbols were referenced from in
 ///   ambiguous cases (e.g. implicitly used symbols, multiple declarations)
 /// - err on the side of reporting all possible locations
 ReferencedLocations findReferencedLocations(ParsedAST &AST);

+/// Retrieves IDs of all files containing SourceLocations from \p Locs. Those
+/// locations could be within macro expansions and are not resolved to their
+/// spelling locations.
+llvm::DenseSet<FileID> findReferencedFiles(const ReferencedLocations &Locs,
+                                           const SourceManager &SM);
+
+inline llvm::DenseMap<unsigned, bool>
+directlyReferencedFiles(const IncludeStructure::AbstractIncludeGraph &Graph,
+                        const llvm::DenseSet<unsigned> &Referenced,
+                        unsigned EntryPoint) {
+  llvm::DenseMap<unsigned, bool> Result;
+  for (unsigned Inclusion : Graph.lookup(EntryPoint))
+    Result.try_emplace(Inclusion, Referenced.contains(Inclusion));
+  return Result;
+}
+
 } // namespace clangd
 } // namespace clang

 #endif // LLVM_CLANG_TOOLS_EXTRA_CLANGD_INCLUDE_CLEANER_H

clang-tools-extra/clangd/IncludeCleaner.cpp

@@ ... @@
 ReferencedLocations findReferencedLocations(ParsedAST &AST) {
   ReferencedLocations Result;
   ReferencedLocationCrawler Crawler(Result);
   Crawler.TraverseAST(AST.getASTContext());
   // FIXME(kirillbobyrev): Handle macros.
   return Result;
 }

+llvm::DenseSet<FileID>
+findReferencedFiles(const llvm::DenseSet<SourceLocation> &Locs,
+                    const SourceManager &SM) {
+  std::vector<SourceLocation> Sorted{Locs.begin(), Locs.end()};
+  // Group by FileID.
+  llvm::sort(Sorted);
+  ReferencedFiles Result(SM);
+  for (auto It = Sorted.begin(); It < Sorted.end();) {
+    FileID FID = SM.getFileID(*It);
+    Result.add(FID, *It);
+    // Cheaply skip over all the other locations from the same FileID.
+    // This avoids lots of redundant Loc->File lookups for the same file.
+    do
+      ++It;
+    while (It != Sorted.end() && SM.isInFileID(*It, FID));
+  }
+  return std::move(Result.Files);
+}
+
 } // namespace clangd
 } // namespace clang

clang-tools-extra/clangd/ParsedAST.h

@@ ... @@ public:
   /// Returns the version of the ParseInputs used to build Preamble part of this
   /// AST. Might be None if no Preamble is used.
   llvm::Optional<llvm::StringRef> preambleVersion() const;

   const HeuristicResolver *getHeuristicResolver() const {
     return Resolver.get();
   }

+  void computeUsedIncludes();
+
 private:
   ParsedAST(llvm::StringRef Version,
             std::shared_ptr<const PreambleData> Preamble,
             std::unique_ptr<CompilerInstance> Clang,
             std::unique_ptr<FrontendAction> Action, syntax::TokenBuffer Tokens,
             MainFileMacros Macros, std::vector<Decl *> LocalTopLevelDecls,
             llvm::Optional<std::vector<Diag>> Diags, IncludeStructure Includes,
             CanonicalIncludes CanonIncludes);
@@ ... @@ private:
   // Top-level decls inside the current file. Not that this does not include
   // top-level decls from the preamble.
   std::vector<Decl *> LocalTopLevelDecls;
   IncludeStructure Includes;
   CanonicalIncludes CanonIncludes;
   std::unique_ptr<HeuristicResolver> Resolver;
 };

 } // namespace clangd
 } // namespace clang

 #endif // LLVM_CLANG_TOOLS_EXTRA_CLANGD_PARSEDAST_H

clang-tools-extra/clangd/ParsedAST.cpp

@@ ... @@
 #include "AST.h"
 #include "Compiler.h"
 #include "Config.h"
 #include "Diagnostics.h"
 #include "FeatureModule.h"
 #include "Features.h"
 #include "Headers.h"
 #include "HeuristicResolver.h"
+#include "IncludeCleaner.h"
 #include "IncludeFixer.h"
 #include "Preamble.h"
 #include "SourceCode.h"
 #include "TidyProvider.h"
 #include "index/CanonicalIncludes.h"
 #include "index/Index.h"
 #include "support/Logger.h"
 #include "support/Trace.h"
@@ ... @@ ParsedAST::ParsedAST(llvm::StringRef Version,
   assert(this->Action);
 }

 llvm::Optional<llvm::StringRef> ParsedAST::preambleVersion() const {
   if (!Preamble)
     return llvm::None;
   return llvm::StringRef(Preamble->Version);
 }

+void ParsedAST::computeUsedIncludes() {
+  const auto &SM = getSourceManager();
+
+  auto Refs = findReferencedLocations(*this);
+  auto ReferencedFileIDs = findReferencedFiles(Refs, SM);
+  std::vector<llvm::StringRef> ReferencedFilenames;
+  ReferencedFilenames.reserve(ReferencedFileIDs.size());
+  for (FileID FID : ReferencedFileIDs) {
+    const FileEntry *FE = SM.getFileEntryForID(FID);
+    if (!FE) {
+      elog("Missing FE for {0}", SM.getComposedLoc(FID, 0).printToString(SM));
+      continue;
+    }
+    ReferencedFilenames.push_back(SM.getFileEntryForID(FID)->getName());
+  }
+  Includes.markUsed(SM.getFileEntryForID(SM.getMainFileID())->getName(),
+                    ReferencedFilenames, directlyReferencedFiles);
+}
+
 } // namespace clangd
 } // namespace clang

[clangd] IncludeCleaner: Mark used headers · Closed · Public
