Details
- Reviewers
sammccall
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This is POC, I need to deal with forward decls and definitions from D112707 firs. Otherwise, this introduces too much noise (LLVM really has *lots* of forward decls that make almost no sense).
This is cool!
I think it will interact with other features (stdlib, exported headers), and so whatever lands second is going to have to eat the complexity of that interaction.
I think both stdlib and exported headers are more critical and more essentially complex, so should probably land first to drive the design.
clang-tools-extra/clangd/IncludeCleaner.cpp | ||
---|---|---|
298 | It seems like this is going to include any files that are transitively reachable through multiple paths and directly reachable through none. These are going to seem like false positives. Sure, if you want the whole file to be IWYU-clean you should remove <unused.h> and add <stddef.h>, but they're not *really* related. | |
311 | This is going to have problems with private/exported headers. Consider includes main -> foo -> bar -> private. However you still want to traverse children of private in case main uses those. | |
376 | We should offer both fixes as alternatives, maybe? | |
381 | exactly how to spell an include and where to place it is tricky. See IncludeInserter.h in headers. | |
clang-tools-extra/clangd/IncludeCleaner.h | ||
69 | why are we back to passing around strings? |
Oh, sorry, I marked it as "changes intended" so that you could know it's not review-ready yet :) I was just playing around with it and previewing to see how it would feel to have a somewhat working solution.
I didn't really implement the feature (add the proper header spelling that we have in IncludeFixer as you mentioned in the comments, check if the replacement header is already included or not, etc), but there are two important problems I was facing that look rather tricky
- This actually steps into the "missing includes" land: we can be conservative when deciding whether a header is used or not. It is both system headers _and_ (currently) a lot of noise in the form of headers with forward decls. E.g. for StringRef the replacement set includes SHA1 and a bunch of very surprising headers just because we have lots of forward decls in unexpected places. I originally had the idea that we can do unused includes without missing includes but it seems that they are closer in terms of at least this usecase. I wanted to see if we would need something more but maybe the follow-up on the mentioned patch (narrow down the "used" header property set conditions) would be enough.
- Performance: providing fix-its for individual headers is not really performance-optimal: we need a graph traversal from each MainFileInclusion and it's a linear BFS, so the complexity is O(unused headers * graph size). This _might_ be unfortunate for large graphs and (surprisingly) it's just faster to collect the data for a "batch replace" of unused headers with used ones. But this brings us back into the realm of missing headers. Maybe the performance is fine, but I want to estimate it in a better way. So far I've seen somewhat small graphs when editing LLVM (<1k headers) but it might be different for different parts so I want to check the performance to make sure it's really fine.
Anyway, I'd be happy to talk once I'm back but as you mentioned this patch would probably need some QOL improvements before it can be landed (needless to say, it should be complete, too :) )!
why are we back to passing around strings?