Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 33445 Build 33444: arc lint + arc unit
Event Timeline
Removing Sam as reviewer as he's on parental leave.
Kadir, if you have other reviewers to suggest to provide feedback on the API changes here, please feel free to add them. Thanks!
For starters, what about just introducing a new field Bases into the Symbol. Which can store symbol IDs for the parents of the mentioned symbol, then during index build we can simply add same relations from base to this symbol. This should help us get rid of RelationSlab logic.
If there wasn't a specific reason that I missed for introducing RelationSlab logic.
Sorry I didn't notice the mailing thread beforehand, looks like Sam had a good point regarding performing operations on types rather than symbols(http://lists.llvm.org/pipermail/clangd-dev/2019-January/000241.html).
Does current implementation take this into account by either pointing at template specializations and/or base template declarations?
I thought about this, and I believe that having the relationship be between symbols is sufficient.
The main utility in a type hierarchy view is to allow the user to navigate to base/derived classes. If a base or derived class is an implicit specialization, the desired navigation is to the corresponding template definition; if it's an explicit specialization, the desired navigation is to the explicit specialization's definition.
A class template will obviously have its own symbol, so that part's fine. Explicit specializations are not currently stored in the index, but I think it would make sense for them to be, for other reasons as well (e.g. a user may want to navigate to an explicit specialization definition via "workspace/symbol"), in which case they too will get their own symbols. So, in either case, the desired type hierarchy item can be obtained from a symbol.
The updated patch has tests covering these scenarios, specifically Subtypes.ClassTemplate and Subtypes.DependentBase. Two additional tests, Subtypes.TemplateSpec1 and Subtypes.TemplateSpec2 are disabled until we start storing explicit specializations in the index.
Haven't looked at the patch. I think we can extend the existing Ref to support it, so that most of the stuff could be reused, rather than implementing a new slab:
- introduce a new RefKind, like BaseOf
- add a new field SymbolID in Ref
and clangIndex library has already implemented relationship functionality (we could use handleDeclOccurence to store the symbols relationships).
I had considered this approach as well, but figured that it would be wasteful in terms of storage space.
My understanding is that the storage space taken up for Refs is currently 8 bytes per Ref (4 each for the start and end positions), plus filename strings which are deduplicated across all refs. If we add a SymbolID, that adds an additional 8 bytes to each Ref. Given that Refs are numerous, and most of them won't use the SymbolID, that seems wasteful.
That said, I do appreciate that this is a simpler approach in terms of complexity, so if folks feel the right tradeoff is to take the Refs approach, I am open to doing that.
My understanding is that the storage space taken up for Refs is currently 8 bytes per Ref (4 each for the start and end positions), plus filename strings which are deduplicated across all refs. If we add a SymbolID, that adds an additional 8 bytes to each Ref. Given that Refs are numerous, and most of them won't use the SymbolID, that seems wasteful.
I see. This is a very good observation, we didn't consider increase in memory usage when we discussed your patch. Sorry about that! We discussed your patch again and we agree with the approach of adding a RelationsSlab.
Here are the options we considered, and the trade-offs we identified.
Option 1: Add SymbolID to Ref. Advantage: simpler API, fewer types in the API. Disadvantage: increased memory usage, as you noted.
Option 2: Add Relation struct, but store them in RefSlab, using a compact representation that skips storing SymbolID where it is not present. Advantages: we avoid adding a type that's almost an exact duplicate of RefSlab, we don't unnecessarily increase memory usage. Disadvantages: tricky code to avoid storing SymbolID, and RefSlab now stores both Refs and Relations, which is confusing from the API point of view.
Option 3 (what your patch implements): Add Relation struct and RelationSlab. Advantage: API matches the data model, we don't unnecessarily increase memory usage. Disadvantage: have to plumb the new slab everywhere in the code.
Option 4: Add a Relation struct. Change RefSlab to a template so that it can store either Refs, or Relations. Advantage: API matches the data model, we don't unnecessarily increase memory usage, we avoid duplicating RefSlab type. Disadvantage: possibly complex template in a in important clangd API, we avoided duplicating at most 100 lines, but we still have to plumb the new slab everywhere in the code.
clang-tools-extra/clangd/index/FileIndex.h | ||
---|---|---|
125 ↗ | (On Diff #189275) | AllSlabs? And even better, a struct? |
Thanks for the detailed analysis!
For completeness, I think another advantage of option 3 is that it allows us to make the key for relation lookups be a (SymbolID, RelationKind) pair rather than just a SymbolID.
It sounds like I should continue implementing the current approach. Marking the revision as "changes planned" accordingly.
Hi Nathan,
I would also suggest splitting up current changes so that we can start reviewing them, which might result in other changes in your planned changes and help reduce duplicate work both on our and your side.
Also please have a look at D59083, and make sure it helps implement what you have in might and let me know if there's anything missing.
As for splitting changes I would suggest something like:
- RelationSlab
- Serialization/Deserialization of the RelationSlab
- Integrating RelationSlab with SymbolCollector
- Adding queries to index structures(memindex, dex, mergedindex etc.)
- Surfacing those queries through typehierarchy request
Thanks. The proposed split looks good to me. I've posted the first patch here: https://reviews.llvm.org/D59407
mostly LG, thanks!
clang-tools-extra/clangd/XRefs.cpp | ||
---|---|---|
1076 | nit: could we directly use THI.range.start.line (same for the following 3 lines) | |
1080 | a few different approaches that comes to my mind:
Could you create a bug report in https://github.com/clangd/clangd/issues ? | |
1083 | yeah it would be great to have a Location symbolToLocation(const Symbol& S); | |
1090 | you do not need WorkspaceRoot, you can pass the translationunit getTypeHierarchy was triggered on. Which is available in File parameter. Also it always exists whereas WorkspaceRoot might be missing depending on the editor. | |
1218 | nit: braces(just the inner one, keep the outer one) | |
clang-tools-extra/clangd/unittests/TypeHierarchyTests.cpp | ||
459 | bool GotResult = false; | |
462 | EXPECT_FALSE(GotResult); GotResult = true; | |
466 | EXPECT_TRUE(GotResult); | |
470 | Maybe call it Subject instead of Type ? | |
478 | could you add another child that derives Parent to check multiple children case? | |
479 | could you get rid of member fields and put definitions into single lines to make test smaller? | |
505 | same suggestion for member fields and single line definitions |
clang-tools-extra/clangd/XRefs.cpp | ||
---|---|---|
1080 |
LGTM with a few small comments.
Thanks for implementing this awesome feature!
clang-tools-extra/clangd/FindSymbols.cpp | ||
---|---|---|
42 | let's rather return an llvm::Expected<Location> and propagate the error messages in the error then log in the caller site | |
clang-tools-extra/clangd/FindSymbols.h | ||
22 | let's rather #include "Symbol.h" | |
clang-tools-extra/clangd/XRefs.cpp | ||
1071 | this part looks a little messed up ? I suppose you wanted to propagate TUPath to symbolToLocation right? | |
clang-tools-extra/clangd/XRefs.h | ||
137–139 | I believe TUPath is not used if Index is nullptr, maybe replace those two? |
Address remaining review comments
clang-tools-extra/clangd/XRefs.cpp | ||
---|---|---|
1071 | Good catch! Fixed. |
let's rather #include "Symbol.h"