Page MenuHomePhabricator

[WIP] Index-while-building
Needs ReviewPublic

Authored by jkorous on Jul 8 2019, 8:20 PM.

Details

Summary

The entry points are EmitIndexAction and GenerateIndexAction classes.

Diff Detail

Event Timeline

jkorous created this revision.Jul 8 2019, 8:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2019, 8:20 PM

Thanks for sharing this! I hope I will have time to dig into more detail.

At a high level, my main question is around layering.
Until now, lib/Index has mostly been a pretty abstract framework: it gathers information from ASTs and exposes it to callers, who then decide what to do with it.
This has served clangd pretty well: we've been able to hook into it, use our own storage, fix the occasional bug, and avoid a bunch of custom tree traversals.
My understanding is that historically XCode has used it in similar ways by out-of-tree consumers (possibly via libclang?)

This patch introduces another concrete consumer in-tree, which is great - making uses more visible helps everyone understand and change the API.
It's also well-layered! But much of it's in the Index library, which makes the layering more fragile:

  • it's unclear to me whether strict layering is intended here - there seem to be a couple of violations (though this is preview code)
  • it's not going to be clear to people arriving at the code which dependencies are OK, and whether IWB is special or one client of many
  • as it stands, there's no technical enforcement of layering (so it may rot), nor a way to depend on the abstract part but not the concrete part. This makes changes to the concrete part scarier.

If the currently-abstract parts Index becomes coupled to IWB over time, it'll be a regression for how clangd is using the library. Maybe we can converge on the concrete consumers too, but carefully (and not necessarily!)

Can we move the concrete parts to IndexStore directories (both header and impls)? (EmitIndexAction, DeclOccurrenceGenerator, IndexDataFormat, IndexDataStore, IndexRecordWriter, PathStorage, half of IndexOptions, BitstreamUtil, FSUtil, FileIndexRecord)

clang/include/clang/Index/GenerateIndexAction.h
34 ↗(On Diff #208580)

(this seems like suspicious layering, but also seems to be unused)

clang/include/clang/Index/IndexOptions.h
40

This file mixes abstract indexing stuff (IndexingOptions) with concrete (RecordingOptions)

clang/lib/Index/GenerateIndexAction.cpp
11 ↗(On Diff #208580)

I can't tell where this is used - it's suspicious if required

MaskRay removed a subscriber: MaskRay.
jkorous marked 4 inline comments as done.Jul 22 2019, 3:06 PM

Thanks for taking a look Sam!

I finished porting of all the changes on top of llvm.org master in the meantime. The updated patch contains all the code we'd like to land. Build (on macOS) and tests are fine.
I'll leave it as a single patch for now so we can finish discussion about layering and I'll split it into couple smaller patches once this is done.

I've put the actual index-while-building part into separate subdirectories for now. For now I went with IndexWhileBuilding name but I'm open to discussion.

clang/include/clang/Index/GenerateIndexAction.h
34 ↗(On Diff #208580)

Right. Removed.

clang/lib/Index/GenerateIndexAction.cpp
11 ↗(On Diff #208580)

Removed. Thanks.

jkorous updated this revision to Diff 211205.Jul 22 2019, 3:16 PM
jkorous marked 2 inline comments as done.
jkorous edited the summary of this revision. (Show Details)

Uploaded the full diff.

For now I went with IndexWhileBuilding name but I'm open to discussion.

I'd prefer 'IndexSerialization'.

jkorous updated this revision to Diff 211321.Jul 23 2019, 9:45 AM

clang-format

Not a full review, just some comments that would help reduce the size of the patch, plus code organization concerns.

Could we avoid nested libraries and rename clang/include/clang/Index/IndexWhileBuilding to clang/include/clang/IndexWhileBuilding ?

clang-tools-extra/clangd/ClangdUnit.cpp
32

Please submit changes like these this #include removal separately to reduce the size of the patch.

clang-tools-extra/clangd/FindSymbols.cpp
20

Please submit changes like these this #include removal separately to reduce the size of the patch.

clang-tools-extra/clangd/index/IndexAction.cpp
169

Could you move index action into a separate header and inline index::createIndexingAction into its callers in a separate commit?

clang/include/clang/Index/IndexAction.h
23

+1. Instead of WrapperFrontendAction, could you implement a general frontend action multiplexer in clang/include/clang/Frontend/FrontendAction.h ? You can do that as a separate commit.

clang/include/clang/Index/IndexDataConsumer.h
64

Changes to this file can be submitted as a separate patch.

clang/include/clang/Index/IndexOptions.h
20

Are these the declarations moved from clang/include/clang/Index/IndexingAction.h? I'd suggest to commit this header splitting separately to reduce the size of the patch.

clang/include/clang/Index/IndexWhileBuilding/PathStorage.h
74 ↗(On Diff #211321)

Please add newlines (everywhere in the patch).

clang/include/clang/Index/USRGeneration.h
100

Unintended edit?

gribozavr added inline comments.Tue, Aug 6, 8:42 AM
clang/include/clang/Driver/Options.td
331

Could you add a more detailed description? Even as a compiler engineer, I can't really tell what index-record-codegen-name really does or why I would want to turn it on.

clang/include/clang/Frontend/CompilerInstance.h
180 ↗(On Diff #211321)

No need to write "\brief", it is the default.

180 ↗(On Diff #211321)

Please don't write what something is used for, write what something is.

"A factory for wrappers around FrontendActions that compile imported modules."

810 ↗(On Diff #211321)

I don't see anything actually using GenModuleActionWrapper.

clang/lib/Frontend/CompilerInstance.cpp
1113

Please end comments with a period.

gribozavr added inline comments.Fri, Aug 9, 6:23 AM
clang/include/clang/Frontend/FrontendOptions.h
370

Could we change "ignore system symbols" into a positive option ("record system symbols")? Throughout the patch please.

clang/include/clang/Index/CodegenNameGenerator.h
0

Why change to a raw pointer?

We can use a unique_ptr to an incomplete type, as long as the destructor ~CodegenNameGenerator is defined in a file where the type is complete.

clang/lib/Index/CodegenNameGenerator.cpp
9

This code looks like a straight copy from clang/lib/AST/Mangle.cpp -- can it be de-duplicated?

clang/unittests/Index/IndexTests.cpp
74

This change can be a separate patch.

106

Unclear why this test overrides these methods, if the implementation is identical to the base class.

112

No need to add virtual when we already have override.

jkorous marked 2 inline comments as done.Mon, Aug 12, 6:53 PM
jkorous added inline comments.
clang/include/clang/Index/IndexAction.h
23

Sent for review: https://reviews.llvm.org/D66125
I'll rebase this patch once/if that lands.

jkorous marked 10 inline comments as done.Tue, Aug 13, 6:55 PM

Thanks for the review! I addressed just some minor details so far.

clang/include/clang/Frontend/CompilerInstance.h
810 ↗(On Diff #211321)

IIUC it's both used (lambda on CompilerInstance.cpp:L1136) and passed (CompilerInstance.cpp:L1112) to CompilerInstance of module dependencies in compileModuleImpl.

clang/unittests/Index/IndexTests.cpp
106

Removed these overrides.

jkorous updated this revision to Diff 215009.Tue, Aug 13, 6:58 PM
jkorous marked an inline comment as done.

Addressed some of the comments.

gribozavr added inline comments.Wed, Aug 14, 3:08 AM
clang/lib/Frontend/CompilerInstance.cpp
1131

My suggestion would be to change GenerateModuleAction::CreateASTConsumer to perform the indexing. Doing so will avoid the problem of having to compose FrontendActions (which we found in another patch to be tricky), and also will make the code a lot more straightforward, I think.

WDYT?

jkorous updated this revision to Diff 215660.Fri, Aug 16, 1:12 PM

Added ASTConsumerInjector and changed `Emit|Generate-IndexAction to -ASTConsumer.

jkorous marked 3 inline comments as done.Wed, Aug 21, 2:38 PM
jkorous added inline comments.
clang/lib/Index/CodegenNameGenerator.cpp
9

Seems like you're right: https://reviews.llvm.org/D63535

jkorous updated this revision to Diff 216519.Wed, Aug 21, 4:27 PM
jkorous marked an inline comment as done.
  • addressed more comments
  • rebased
  • clang-format
jkorous marked 3 inline comments as done.Wed, Aug 21, 5:32 PM
jkorous added inline comments.
clang/include/clang/Frontend/CompilerInstance.h
810 ↗(On Diff #211321)

I was wrong! I thought we are passing some data in the wrapper. Turned out we aren't and I was able to completely remove it. Thanks!

jkorous updated this revision to Diff 216533.Wed, Aug 21, 5:33 PM
  • addressed some comments
  • removed GenModuleActionWrapper
jkorous marked an inline comment as done.Wed, Aug 21, 6:08 PM
jkorous added inline comments.
clang/include/clang/Frontend/FrontendOptions.h
370

Surprisingly it seems that the name is actually mostly determined by the fact that we want to index system symbols by default and that natural CLI for boolean argument with default positive value is optional flag with negative semantics.

Specifically - the semantics/name of the flag OPT_index_ignore_system_symbols is pretty much given (or ?).

I could rename just IndexIgnoreSystemSymbols and keep OPT_index_ignore_system_symbols but it feels less clear than current state.

WDYT?

jkorous updated this revision to Diff 216726.Thu, Aug 22, 3:36 PM
  • renamed "codegen name generator" -> "mangled name generator"
  • ranemed flag index-record-codegen-name -> index-record-mangled-name
  • removed wrapper over ASTNameGenerator from libIndex
jkorous marked an inline comment as done.Thu, Aug 22, 3:41 PM

I addressed the rest of the comments (excluding comments about splitting the patch).
Next step is to follow up on the discussion about FrontendActions/ASTConsumers from https://reviews.llvm.org/D66125 and then I'd like to start splitting off separate commits.
I'll keep this patch rebased on top of what has already landed to have some sense what still needs to be done.