Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 34981 Build 34980: arc lint + arc unit
Event Timeline
So I've stared at this refactoring for a while, and I still don't totally get it.
It seems like a class that really wants to be a function. It's a complicated function though - maybe a separate cpp file does make sense.
We discussed moving ShardVersions and Writes into the loader, splitting ShardVersions from IndexSymbols seems pretty odd though.
What's the advantage of making this stateful?
- As discussed offline, moved the "staleness" detection back to background index
- A renaming within Rebuilder, "LoadedTUs -> LoadedShards"
Refactoring generally looks good.
You're replacing a lot of documented code with new undocumented code, can we add some high-level comments?
clang-tools-extra/clangd/index/Background.cpp | ||
---|---|---|
127 | nit: unclear what's changed with respect to what. I'd suggest shardIsStale() and swap the param order? | |
130 | While here: this message lacks context and is unnecessarily jargony: "Background-indexing: couldn't read {0} to validate stored index"? | |
424 | move this outside the loop and use a counter? it's an extra lock/unlock for no reason | |
425 | This function now has almost no comments, can you explain the major steps? | |
443 | nit: again, this patch is adding Path/PathRef aliases to parts of the code that don't use them | |
clang-tools-extra/clangd/index/Background.h | ||
169 | and their dependencies | |
clang-tools-extra/clangd/index/BackgroundIndexLoader.cpp | ||
78 | you can't call this an (opaque?) ShardIdentifier and then use it as a path - maybe SourceFile? | |
clang-tools-extra/clangd/index/BackgroundIndexLoader.h | ||
2 | header is out of date. The new file/structs/functions need documentation | |
26 | LoadedShard? | |
31 | IndexFileIn is just a struct - optional<>? (And need a comment explaining when it can be null) | |
37 | no, they're not :-) we talked about making the values refs to ShardInfo.AbsolutePath. | |
38 | why are these outside the ShardInfo? | |
43 | clangd::load seems like an insufficiently precise name for this function loadIndexShards? |
clang-tools-extra/clangd/index/Background.cpp | ||
---|---|---|
443 | I believe they represent the intention better then string/StringRefthat was the reason I was using them. |
clang-tools-extra/clangd/index/Background.cpp | ||
---|---|---|
130 | {0 is missing its } | |
425 | // We'll accept data from stale shards, but ensure the files get reindexed soon. | |
clang-tools-extra/clangd/index/BackgroundIndexLoader.cpp | ||
94 | This handles only one main file, I can't see where you skip loading shards for headers that were loaded for a previous file | |
105 | ShardIdentifier still here | |
127 | I don't think this works, the storage for these StringRefs is the LoadedShards you just moved from. |
- Address comments
clang-tools-extra/clangd/index/BackgroundIndexLoader.cpp | ||
---|---|---|
94 | The real loading happens down the line at const CachedShard &CS = loadShard(ShardIdentifier, Storage);, which returns shard from cache if it was already loaded. | |
127 | Yes you are right, I don't see a good way out of it with current data model. I suppose we can keep an "intern table :)" for the strings, but then LoadedShards would be meaningless without that intern table so we would end up returning another struct from loadIndexShards that will wrap std::vector<LoadedShards> and the intern table(but we tried to mitigate from that struct in previous iteration). So instead I am changing the model to return only the first translation unit and actually this makes a lot of sense when we think about shards for standard library, in a code base like chromium vector could easily have ~40k dependent TUs and even if we just returned a vector<StringRef> it would still be around 640kB, and we have a lot of standard library headers. this can quite easily add hundreds of MB to clangd's memory footprint while loading shards. |
clang-tools-extra/clangd/index/BackgroundIndexLoader.cpp | ||
---|---|---|
48 | I think we only need to traverse this when *missing* the cache, so it can be a return value of loadShard rather than stored in CachedShard (so CachedShard becomes just LoadedShard) |
clang-tools-extra/clangd/index/BackgroundIndexLoader.cpp | ||
---|---|---|
94 | OK, then we still traverse the whole subtree but hit every node. I think it'd be nicer to avoid the traversal (as noted above) if that works well, up to you. |
and their dependencies