Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
So I've stared at this refactoring for a while, and I still don't totally get it.
It seems like a class that really wants to be a function. It's a complicated function though - maybe a separate cpp file does make sense.
We discussed moving ShardVersions and Writes into the loader, splitting ShardVersions from IndexSymbols seems pretty odd though.
What's the advantage of making this stateful?
- As discussed offline, moved the "staleness" detection back to background index
- A renaming within Rebuilder, "LoadedTUs -> LoadedShards"
Refactoring generally looks good.
You're replacing a lot of documented code with new undocumented code, can we add some high-level comments?
clang-tools-extra/clangd/index/Background.cpp | ||
---|---|---|
130 ↗ | (On Diff #210303) | nit: unclear what's changed with respect to what. I'd suggest shardIsStale() and swap the param order? |
133 ↗ | (On Diff #210303) | While here: this message lacks context and is unnecessarily jargony: "Background-indexing: couldn't read {0} to validate stored index"? |
455 ↗ | (On Diff #210303) | This function now has almost no comments, can you explain the major steps? |
481 ↗ | (On Diff #210303) | move this outside the loop and use a counter? it's an extra lock/unlock for no reason |
490 ↗ | (On Diff #210303) | nit: again, this patch is adding Path/PathRef aliases to parts of the code that don't use them |
clang-tools-extra/clangd/index/Background.h | ||
177 ↗ | (On Diff #210303) | and their dependencies |
clang-tools-extra/clangd/index/BackgroundIndexLoader.cpp | ||
77 ↗ | (On Diff #210303) | you can't call this an (opaque?) ShardIdentifier and then use it as a path - maybe SourceFile? |
clang-tools-extra/clangd/index/BackgroundIndexLoader.h | ||
1 ↗ | (On Diff #210303) | header is out of date. The new file/structs/functions need documentation |
25 ↗ | (On Diff #210303) | LoadedShard? |
30 ↗ | (On Diff #210303) | IndexFileIn is just a struct - optional<>? (And need a comment explaining when it can be null) |
36 ↗ | (On Diff #210303) | no, they're not :-) we talked about making the values refs to ShardInfo.AbsolutePath. |
37 ↗ | (On Diff #210303) | why are these outside the ShardInfo? |
42 ↗ | (On Diff #210303) | clangd::load seems like an insufficiently precise name for this function loadIndexShards? |
clang-tools-extra/clangd/index/Background.cpp | ||
---|---|---|
490 ↗ | (On Diff #210303) | I believe they represent the intention better then string/StringRefthat was the reason I was using them. |
clang-tools-extra/clangd/index/Background.cpp | ||
---|---|---|
133 ↗ | (On Diff #210511) | {0 is missing its } |
497 ↗ | (On Diff #210511) | // We'll accept data from stale shards, but ensure the files get reindexed soon. |
clang-tools-extra/clangd/index/BackgroundIndexLoader.cpp | ||
93 ↗ | (On Diff #210511) | This handles only one main file, I can't see where you skip loading shards for headers that were loaded for a previous file |
104 ↗ | (On Diff #210511) | ShardIdentifier still here |
126 ↗ | (On Diff #210511) | I don't think this works, the storage for these StringRefs is the LoadedShards you just moved from. |
- Address comments
clang-tools-extra/clangd/index/BackgroundIndexLoader.cpp | ||
---|---|---|
93 ↗ | (On Diff #210511) | The real loading happens down the line at const CachedShard &CS = loadShard(ShardIdentifier, Storage);, which returns shard from cache if it was already loaded. |
126 ↗ | (On Diff #210511) | Yes you are right, I don't see a good way out of it with current data model. I suppose we can keep an "intern table :)" for the strings, but then LoadedShards would be meaningless without that intern table so we would end up returning another struct from loadIndexShards that will wrap std::vector<LoadedShards> and the intern table(but we tried to mitigate from that struct in previous iteration). So instead I am changing the model to return only the first translation unit and actually this makes a lot of sense when we think about shards for standard library, in a code base like chromium vector could easily have ~40k dependent TUs and even if we just returned a vector<StringRef> it would still be around 640kB, and we have a lot of standard library headers. this can quite easily add hundreds of MB to clangd's memory footprint while loading shards. |
clang-tools-extra/clangd/index/BackgroundIndexLoader.cpp | ||
---|---|---|
47 ↗ | (On Diff #210541) | I think we only need to traverse this when *missing* the cache, so it can be a return value of loadShard rather than stored in CachedShard (so CachedShard becomes just LoadedShard) |
clang-tools-extra/clangd/index/BackgroundIndexLoader.cpp | ||
---|---|---|
93 ↗ | (On Diff #210511) | OK, then we still traverse the whole subtree but hit every node. I think it'd be nicer to avoid the traversal (as noted above) if that works well, up to you. |