Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang-tools-extra/clangd/index/SymbolCollector.cpp | ||
---|---|---|
393–395 | sorry i don't understand what this logic is trying to achieve. we seem to be prefering "first" exporting header, over the headers that directly provide the symbol. Also comment says it's done for objc, but there's nothing limiting this logic to ObjC. any chance this is unintended? | |
791 | s/ASTCtx->getSourceManager()/SM | |
903 | this is using legacy mappings for non-objc symbols too | |
915 | getIncludeHeader might return an empty string | |
928 | you're iterating over multiple headers, and only keeping references for the last one alive. instead of storing these in the class state, you can have a llvm::DenseMap<Header, std::string> HeaderToSpelling; inside this function and later on just do: for(const auto& H: It->second) { auto [SpellingIt, Inserted] = HeaderToSpelling.try_emplace(H); if(Inserted) { switch (...) { /* Update *SpellingIt */ } } NewSym.IncludeHeaders.push_back(*SpellingIt, 1, Directives); | |
934–936 | we actually only want to use HeaderFileURIs->toURI here and not the getIncludeHeader, because:
the way we create a FileID here is a little bit iffy, by using toURI, hopefully we can avoid that conversion. the only remaining issue is, we actually still want to use system header mappings inside CanonicalIncludes until we have better support for them in include-cleaner library itself. So I think we should still call Includes->mapHeader(H.physical()) before using toURI. | |
clang-tools-extra/clangd/index/SymbolCollector.h | ||
131 | unfortunately this alternative is called after we've parsed the file, from clangd/index/FileIndex.cpp, indexSymbols. hence attaching PI to PPCallbacks won't have any effect. This is working unintentionally because we're still populating CanonicalIncludes with legacy IWYUPragmaHandler in clangd, and using that to perform private -> public mappings. We should instead propagate the PragmaIncludes we have in ParsedAST and in preamble builds here. You should be able to test the new behaviour clang-tools-extra/clangd/unittests/FileIndexTests.cpp to make sure pragma handling through dynamic indexing code paths keep working as expected using an export pragma. |
clang-tools-extra/clangd/index/SymbolCollector.cpp | ||
---|---|---|
837 | you can take this by value and std::move into the map |
Thanks for the comments!
clang-tools-extra/clangd/index/SymbolCollector.cpp | ||
---|---|---|
393–395 | No, this was intended, but possibly I didn't get it right.
AFAIU this whole code path will now only be reachable for objc symbols. I have removed pragma includes from the canonical include mapping now, leaving the pragma handling to include cleaner. However, that only triggers for C/C++, so in case we need the export and private pragmas for objc, we need to re-insert the handling for them somewhere.
Isn't that correct if there's an IWYU export pragma involved? The snippet comes from include_cleaner::findHeaders, with the only difference that it stops at the first exporter (since the canonical include mapping also just stored one mapping for the header). Let me know how to do it better, or maybe if this is necessary at all. Honestly, I am not sure about this, since, as mentioned, there are no export or private/public pragmas in objc files in the codebase atm. | |
903 | Removed this whole passage. | |
934–936 | Ok SGTM. It seems, though, that we still need the "iffy" FID for the canonical mapping.. Let me know if you know a better way. | |
clang-tools-extra/clangd/index/SymbolCollector.h | ||
131 | Ok thank you. I'm using PragmaIncludes from AST and preamble builds for the dynamic index now. For the background index, I've finally managed to move pragma recording to the IndexAction now. I've also removed the redundant (to include cleaner) comment handlers from the AST build and preamble building logic. Also removed the mapSymbol method from CanonicalIncludes, since it had one usage which should now be covered by the include cleaner, I believe. |
clang-tools-extra/clangd/ParsedAST.cpp | ||
---|---|---|
628 ↗ | (On Diff #532964) | can you also add a FIXME here saying that we should attach PragmaInclude callbacks to main file ast build too? this is not a recent regression as we were not doing it properly in previous version either. IWYU handler we attach only cares about private pragmas, which don't mean much inside the main file. but we actually need it for other pragmas like keep/export/no_include etc. (no need to do it in this patch as it's already getting quite big, let's do that as a follow up) |
clang-tools-extra/clangd/Preamble.h | ||
83 ↗ | (On Diff #532964) | you need to rebase as we had some big changes around this logic recently. we should pass the PragmaIncludes also as a shared_ptr (and store as one inside the preambledata) |
clang-tools-extra/clangd/index/CanonicalIncludes.h | ||
40 ↗ | (On Diff #532964) | AFAICT, only users of this endpoint were in IWYUCommentHandler (and some tests). we can also get rid of this one now. More importantly now this turns into a mapper used only by symbolcollector, that can be cheaply created at use time (we just copy around a pointer every time we want to create it). so you can drop CanonicalIncludes from all of the interfaces, including SymbolCollector::Options and create one on demand in HeaderFileURICache. all we need is langopts, and it's available through preprocessor (not necessarily on construction, but at the time when we want to do the mappings). As you're already touching all of the interfaces that propagate CanonicalIncludes around, hopefully this change should only make things simpler (and you already need to touch them when you rebase), but if that feels like too much churn in this patch feel free to do that in a follow up as well. |
43 ↗ | (On Diff #532964) | let's mention that this will always return a verbatim spelling, e.g. with quotes or angles. |
57 ↗ | (On Diff #532964) | you can drop this one too, only addMapping would introduce mappings here, but that's going away too. |
clang-tools-extra/clangd/index/SymbolCollector.cpp | ||
250 | let's change the interface here to take in a FileEntry instead. | |
251 | nit: early exits if (!SysHeaderMapping) return ""; auto Canonical = ...; if(Canonical.empty()) return ""; ... | |
258 | we should never hit this code path anymore. so feel free to convert the if above to an assert | |
393–395 | existing IWYUCommentHandler only handled private pragmas, so export pragmas were already dropped on the floor, we shouldn't introduce handling for them in here now. the mappings you perform for private/public pragmas and system headers below is enough. (P.S. using exporters as "alternatives" vs "sole provider" are quite different, policy in include_cleaner does the former whereas logic in here does the latter) | |
417–418 | let's move this logic into mapCanonical too. | |
420 | nit: if (auto Canonical = ...; !Canonical.empty()) return Canonical; | |
836 | what about merging this one and setIncludeLocation ? e.g: void SymbolCollector::setIncludeLocation(const Symbol &IdxS, SourceLocation DefLoc, const include_cleaner::Symbol &Sym) { if (!Opts.CollectIncludePath || shouldCollectIncludePath(S.SymInfo.Kind) == Symbol::Invalid) return; IncludeFiles[SID] = ...; SymbolProviders[SID] = headersForSymbol(...); } | |
839 | because you're taking in Headers as const, this move is not actually moving. | |
868 | let's iterate over SymbolProviders instead, as IncludeFiles is a legacy struct now. | |
869–880 | inside of this branch is getting crowded, maybe early exit instead? const Symbol *S = Symbols.find(SID); if(!S) continue; ... | |
919 | no need to call insert if we didn't modify IncludeHeaders, so maybe reflow to something like: if (Directives & Import) { if(auto IncHeader = ..; !IncHeader.empty()) { auto NewSym = *S; NewSym.IncludeHeaders....; Symbols.insert(NewSym); } // FIXME: Use providers from include-cleaner once it's polished for ObjC too. continue; } | |
925 | s/if/assert | |
941 | no need for the getOrCreateFileID after you change parameter for mapCanonical to be a file entry. | |
943 | nit: if(auto Canonical = mapCanonical(H.physical()); !Canonical.empty()) SpellingIt->second = Canonical; | |
945 | this should be else if. if we performed a mapping, we no longer care about self-containedness of original header. | |
946 | s/PP->getSourceManager()/SM | |
949 | s/if (SpellingIt->second.empty())/else | |
clang-tools-extra/clangd/index/SymbolCollector.h | ||
176 | we seem to be populating this map for all kinds of symbols, as long as shouldCollectIncludePath returns non-invalid. what's up with distinguishing imports from includes? | |
187 | no need to have this in the class state, you can just inline this into ::finish method instead. | |
192 | let's move this closer to the SymbolProviders map | |
205 | we already store this in Opts, can we just use it from there instead of creating an extra name for it? |
clang-tools-extra/clangd/index/SymbolCollector.cpp | ||
---|---|---|
836 | also sorry for missing this so far, but i think we should limit this to first provider initially. as otherwise we'll have providers that just accidentally declare certain symbols and we don't want them to be inserted (unless they're the only provider and end up at the top) just because our ranking heuristics fail or something. |
Thanks for comments!
clang-tools-extra/clangd/index/CanonicalIncludes.h | ||
---|---|---|
40 ↗ | (On Diff #532964) |
This time sounds a little late, since we don't want to rebuild the system header mapping every time we request a mapping for a particular header. |
clang-tools-extra/clangd/index/SymbolCollector.cpp | ||
250 | I believe it has to be FileEntryRef instead. This is what the CanonicalIncludes::mapHeader expects, and this is also seems right (somewhere in the Clang docs I've read "everything that needs a name should use FileEntryRef", and name is exactly what CanonicalIncludes::mapHeader needs). For the case of a physical header (i.e., FileEntry instance) I've switched to using getLastRef now. There's also another API FileManager::getFileRef where I could try passing the result of tryGetRealPathName. I am not sure I have enough information to judge if any of these options is distinctly better. | |
417–418 | This would deliver wrong results for C++. We would basically get every IWYU-public header twice: once through the verbatim branch of the include_cleaner results, and then once again when mapping the physical private file to a canonical version (e.g., for system headers). | |
839 | Ok, this is not relevant anymore, but I am not sure I understand why. Am I not allowed to say that I don't need this variable/memory anymore, even though it's const? | |
868 | Not sure I'm following. Iterating over SymbolProviders means retrieving include_cleaner::Headers. How would I handle the entire Obj-C part then, without the FID of the include header? | |
941 | Still need the getLastRef or FileManager::getFileRef, if I understand correctly. | |
clang-tools-extra/clangd/index/SymbolCollector.h | ||
176 | Yeah that's right. I've been trying to express that (after populating this map) we are not using this FID to compute the IncludeHeaders anymore (only for the ObjC case), but rather use it to determine if the Directives should contain Symbol::Import (ie., if this FID contains Obj-C constructs/imports). I'll try to be a bit more verbose and precise. |
clang-tools-extra/clangd/ClangdServer.cpp | ||
---|---|---|
85 ↗ | (On Diff #535740) | nit: PI(std::move(PI)) |
clang-tools-extra/clangd/TUScheduler.cpp | ||
1086 ↗ | (On Diff #535740) | nit: s/PI/std::move(PI) |
clang-tools-extra/clangd/index/CanonicalIncludes.cpp | ||
672 ↗ | (On Diff #535740) | we're no longer using any FileEntry pieces of this parameter you can take in just a llvm::StringRef HeaderPath instead. |
clang-tools-extra/clangd/index/CanonicalIncludes.h | ||
40 ↗ | (On Diff #532964) |
System header mapping is a singleton, we build it statically once on first request throughout the whole process and just use that for the rest of the execution. |
clang-tools-extra/clangd/index/FileIndex.cpp | ||
49 ↗ | (On Diff #535740) | again you can have a const include_cleaner::PragmaIncludes & without shared_ptr here. |
clang-tools-extra/clangd/index/FileIndex.h | ||
118 ↗ | (On Diff #535740) | you can have a const include_cleaner::PragmaIncludes& PI directly both in here and in indexHeaderSymbols. after building a preamble we always have a non-null PragmaIncludes and these functions are always executed synchronously, so no need to extend ownership and make the code indirect. |
clang-tools-extra/clangd/index/IndexAction.cpp | ||
233 | you can have a unique_ptr here instead (once we store a non-owning pointer in Opts.PragmaIncludes) | |
clang-tools-extra/clangd/index/SymbolCollector.cpp | ||
250 | yeah getLastRef is fine, i was mostly trying to avoid dodgy call to getOrCreateFileID. you can also pass in just a llvm::StringRef HeaderPath now, as we're not using anything but file path. which you can retreive with getName from both FileEntry and FileEntryRef | |
412 | s/SM.getFileEntryForID(FID)/FE | |
790 | you can actually change include_cleaner::Macro to have a const IdentifierInfo* instead of the const_cast here. it's mostly an oversight to store a non-const IdentifierInfo in the first place. | |
837 | once we have the optional<Header> as the value type you can also rewrite this as: auto [It, Inserted] = SymbolProviders.try_emplace(S.ID); if (Inserted) { auto Providers = include_cleaner::headersForSymbol(Sym, SM, Opts.PragmaIncludes.get()); if(!Providers.empty()) It->second = Providers.front(); } that way we can get rid of some redundant calls to headersForSymbol | |
839 | std::move is unfortunately a little bit misleading at times. it is just a cast that returns an rvalue reference. hence on its own it doesn't actually move anything. but it enables picking rvalue versions of constructors/assignment operators. (e.g. Foo(Foo&&) or operator=(Foo&&)) which can be implemented efficiently with the assumption that RHS object is being "moved from" and will no longer be used. that way certain types implement a move semantics by just consuming the RHS object. unfortunately that is not possbile when RHS is const, and expressions use regular copy semantics instead. | |
868 | we're joining IncludeFiles and SymbolProviders, as they have the same keys. right now you're iterating over the keys in IncludeFiles and doing a lookup into SymbolProviders using that key to get providers. | |
957 | FWIW, i think we should assert(It != SymbolProviders.end()) (or IncludeFiles.end() when you change the outer loop). as we're inserting into these maps simultaneously. | |
960 | we're not really getting any caching benefits as this map is inside the loop. can you put it outside the loop body? | |
963 | we shouldn't continue here, it just means we can directly use SpellingIt to figure out what to put into NewSym.IncludeHeaders. maybe something like: auto [SpellingIt, Inserted] = HeaderSpelling.try_emplace(H); if (Inserted) SpellingIt-> second = getSpelling(H, *PP); if(!SpellingIt->second.empty()) { Symbol NewSym = *S; NewSym.IncludeHeaders.push_back({SpellingIt->second, 1, Directives}); Symbols.insert(NewSym); } std::string getSpelling(const include_cleaner::Header &H, const Preprocessor &PP) { if(H.kind() == Physical) { // Check if we have a mapping for the header in system includes. // FIXME: get rid of this once stdlib mapping covers these system headers too. CanonicalIncludes CI; CI.addSystemHeadersMapping(PP.getLangOpts()); if (auto Canonical = CI.mapHeader(H.physical()->getLastRef()); !Canonical.empty()) return Canonical; if (!tooling::isSelfContainedHeader(...)) return ""; } // Otherwise use default spelling strategies in include_cleaner. return include_cleaner::spellHeader(H); } | |
972 | you can move the copy into this branch to make sure we're only doing that when necessary, e.g: if(auto IncludeHeader = ...) { Symbol NewSym = *S; NewSym.IncludeHeaders...; Symbols.insert(NewSym); } | |
clang-tools-extra/clangd/index/SymbolCollector.h | ||
65 | no need to have ownership here, we can have just a const include_cleaner::PragmaIncludes* PI. SymbolCollector shouldn't need to extend lifetimes | |
129 | as mentioned above addSystemHeadersMapping is cheap after the first call in the process, we initialize the mappings only once. so you can just have it as a member in HeaderFileURICache, and drop it completely from the public interface of SymbolCollector | |
183 | can we have a llvm::DenseMap<SymbolID, std::optional<include_cleaner::Header>>, to indicate there are no providers for a symbol, rather than missing the symbol completely from the mapping? |
Thank you!
clang-tools-extra/clangd/index/SymbolCollector.cpp | ||
---|---|---|
837 | Sure, although I'm not sure where the redundant calls would be coming from. I've been under the impression that this function is supposed to be called once for each symbol. | |
868 | Ok, makes sense now, thanks. | |
963 | As agreed offline, it is not easy to split out a spelling function, since HeaderFileURIs is a private member of the symbol collector, and we seem to still need to generate URIs for physical headers going forward. |
thanks a lot for bearing with me, let's ship it!
clang-tools-extra/clangd/IncludeCleaner.cpp | ||
---|---|---|
91 ↗ | (On Diff #539908) | nit: just const include_cleaner::PragmaIncludes* PI |
clang-tools-extra/clangd/index/SymbolCollector.cpp | ||
255 | SysHeaderMapping doesn't need to be part of class state, just define it here completely. (also gets rid of the need for extra explanation and any possible questions around what would happen when we add system headers multiple times) | |
421 | nit: can you move this above the call to mapCanonical and use it there too? | |
827–828 | nit: early exits might be more readable, as we've more nesting now. (if (!Opts.Collect.. || shouldCollect..() == Symbol::Invalid) return;) | |
837 | yeah unfortunately addDeclaration can be called multiple times (with increasing declaration quality, e.g. once for the forward declaration, then for a complete definition of a class) | |
949–952 | nit: auto FID = IncludeFiles.at(SID); | |
949–952 | if (!SpellingIt->second.empty()) { auto NewSym = *S; NewSym.IncludeHeaders.push_back({SpellingIt->second, 1, Directives}); Symbols.insert(NewSym); } | |
970 | i believe this should be just H.physical()->getLastRef() ? as we're passing in a fileentryref into toURI | |
971 | nit: braces, as we've multiple lines (despite being a single statement) |
you can have a unique_ptr here instead (once we store a non-owning pointer in Opts.PragmaIncludes)