This is an archive of the discontinued LLVM Phabricator instance.

[RFC][clangd] Move preamble index out of document open critical path
ClosedPublic

Authored by kuganv on Apr 12 2023, 2:24 AM.

Details

Summary

We would like to move the preamble index out of the critical path.
This patch is an RFC to get feedback on the correct implementation and potential pitfalls to keep into consideration.

I am not entirely sure if the lazy AST initialisation would create using Preamble AST in parallel. I tried with tsan enabled clangd but it seems to work OK (at least for the cases I tried)

Diff Detail

Event Timeline

kuganv created this revision.Apr 12 2023, 2:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 12 2023, 2:24 AM
nridge added a subscriber: nridge.Apr 15 2023, 4:37 PM
kuganv updated this revision to Diff 517147.Apr 26 2023, 6:37 AM

Removed wrong assert

kuganv edited the summary of this revision. (Show Details)Apr 26 2023, 9:21 AM
kuganv published this revision for review.Apr 26 2023, 9:23 AM
kuganv added subscribers: DmitryPolukhin, ivanmurashko.
Herald added a project: Restricted Project. · View Herald TranscriptApr 26 2023, 9:24 AM

We would like to move the preamble index out of the critical path

Could you provide motivation why you need to do it? What is the underlying problem that this change is trying to solve?
We rely on preamble being indexed at that particular time for correct results in future steps, it's definitely not a no-op change even if the threading issues are resolved (see the other comment).

clang-tools-extra/clangd/ClangdServer.cpp
88

This definitely does not work. After onPreambleAST returns, the AST will be destroyed and async tasks may access it afterwards.

We would like to move the preamble index out of the critical path

Could you provide motivation why you need to do it? What is the underlying problem that this change is trying to solve?
We rely on preamble being indexed at that particular time for correct results in future steps, it's definitely not a no-op change even if the threading issues are resolved (see the other comment).

Thanks for your comment. Motivation for this is:

  1. We see preamble indexing taking as much as 18% of the time for some files. Moving preamble indexing out of the critical path helps there.
  2. We are also experimenting with preamble caching with clang modules. Early results from this also shows that preamble indexing outside the critical path improves performance.
ilya-biryukov added a comment.EditedApr 27 2023, 8:43 AM
  1. We see preamble indexing taking as much as 18% of the time for some files. Moving preamble indexing out of the critical path helps there.

Which operation are you measuring? 18% of first diagnostic latency?
Preamble indexing populates necessary information for many of the Clangd features. Some examples of the top of my head:

  • code completion will not show declarations from preamble if they were not deserialized from the preamble (most symbols).
  • diagnostics from preamble will be missed,
  • multi-file rename will miss references inside function bodies.

Doing indexing of the preamble in parallel to other operations is probably an improvement as PCH gets available earlier for operations that need it, e.g. code completion can start operating faster.
However, if we want to ensure the returned results are the same as they were before the patch (at least for the first run of Clangd on each file), we have to synchronize reads with the operations that populates the index.

For any form of this optimization, I think the implementation is going to be a little tricky and we definitely want @kadircet to take a look at it. It's hard to list all things in Clangd that rely on preamble indexing
finishing before they run, this change definitely requires a careful design review.

  1. We are also experimenting with preamble caching with clang modules. Early results from this also shows that preamble indexing outside the critical path improves performance.

Could you elaborate a bit more on what is being cached with modules and how this patch would affect it?

ivanmurashko added a comment.EditedApr 27 2023, 12:57 PM

Could you elaborate a bit more on what is being cached with modules and how this patch would affect it?

I hope that I can provide some info here. We are going to use implicit modules to cache the preamble i.e. the corresponding modulemap file will contain headers from the preamble. The idea is to provide a faster goto-definition (core navigation) functionality that requires AST be built. The modules can be loaded lazily and therefore it can give performance boost more than 10x depending on the source file. The problem is the preamble indexing. That is because the indexing process forces the module load and kills the performance wins i.e. it will be ~20% instead of 10x.

We switched off the indexing for tests and could see the core navigation functionality working significantly faster than usual but that can be considered as a temp solution only. The better approach, provided at the diff, is to move the indexing into a separate thread with a possible delay for some functionality but with faster navigation.

Any other ideas will also be interesting here.

ivanmurashko edited reviewers, added: sammccall; removed: sam.Apr 27 2023, 1:49 PM
sammccall added inline comments.Apr 27 2023, 2:05 PM
clang-tools-extra/clangd/ClangdServer.cpp
88

https://reviews.llvm.org/D115768 is a very rough prototype from a couple of years ago of how to keep the preamble AST alive in a separate object that I believe would be safe to index from even after returning from onPreambleAST.
(Though the patch doesn't actually do that, that was the idea)

kuganv marked an inline comment as done.May 2 2023, 12:48 PM
kuganv added inline comments.
clang-tools-extra/clangd/ClangdServer.cpp
88

Thanks for the review and suggestion. Let me rework the prototype to address this.

Thanks for the patch, this is a topic we've discussed a couple times in the past as well as it has far reaching implications. I believe we're actually in a much better situation nowadays.

To summarise some concerns (thanks to @sammccall for useful discussion);

Features

There are certain features that make use of index, ~all of them assume a "stale" index already, so this shouldn't effect them as severely.
The most impacted feature will be code completion, as we only deserialize decls needed to parse the main file (until code completion point) from the preamble and rely on the index for the rest. So we'll have partial results until indexing finishes.
I believe this is still better than what we've today. As in the presence of a global index, we'll still have some results but they'll be stale. Today we're not showing any results (well, to be more precise we're showing the identifier based ones) until preamble indexing finishes.
Another big concern that doesn't hold today is preamble-changed callbacks. In theory clangd notifies clients that a new preamble is available, in case they want "fresh" information. Today it's only used for providing semantic highlights, which doesn't use index. But in the future, if we let clients fetch other information based on this notification, we should make sure freshness of preamble index is not required.
There's also going to be some impact on documentations for code completion items/hover, but i don't think that's unacceptable either (we can already have that as we make use of stale preambles after the first build).

RAM/CPU usage due to extra concurrency

This means we'll be running more tasks in parallel, so clangd's peak memory usage will increase. Without this patch we've 1 live (not serialized) AST & 1 serialized AST in memory at peak as all the other tasks would be blocked until live AST is destroyed. After this patch we'll also have some tasks that're working on the serialized AST. But this isn't something new either, after first preamble build we're already hitting this peak usage when using a stale preamble and building a new one.
As for extra concurrency, again this is pretty similar to what we do after the first preamble build, we're only doing more work if the preamble gets invalidated multiple times while indexing is running. Without this patch we would only build the "last" version of the preamble, but after this patch we can trigger some extra intermediate builds. Since preamble updates are rare, I don't think this one is a big concern either.

Safety concerns

Taking a closer look at the code, we're already triggering indexing callback after preamble serialization is complete. Hence it seems like there are no other users of the live AST and other structs needed for indexing. These structs are also part of compiler instance as ref-counted objects, hence keeping them alive in the indexing task (with a similar approach to Sam's patch) should hopefully be safe, as the code that resets/destroys the compiler instance isn't putting those into an invalid state (AFAICT).
Only remaining concern is whether clangd itself has any objects that it passes into preamble builds with sole-ownership and invalidates/destroys them after the build finishes. I don't think that's the case at hindsight either.
Hopefully most of these concerns can also be verified in practice by running using an ASAN build of clangd for a while.

Implementation concerns

As pointed out, we should store ref-counted objects properly and pass them to the preamble callbacks, so that they can be kept-alive by the indexing task. This implies some changes to the interface of IndexingCallbacks.
As for AsyncTaskRunner to use, since this indexing task only depends on the file-index, which is owned by ClangdServer, I don't think there's any need to introduce a new taskrunner into TUScheduler and block its destruction. We can just re-use the existing TaskRunner inside parsingcallbacks, in which we run stdlib indexing tasks.

Rollout strategy

As mentioned above I think landing the patch after some local testing with an ASAN, behind a ClangdServer option is fine. Afterwards we can test out the behaviour by gradually turning on this option in our production and once it looks safe we can turn it on by default (and just get rid of the option).
Does that strategy works for you folks too?


So all in all, I think the direction makes sense, we're trading off some "freshness" in certain places, but this tradeoff only effects the behaviour until first preamble and in return we get some considerable savings (for a large codebase preamble indexing takes ~10 secs @95th%, which is ~13% of preamble build latency).
LMK if you've any further concerns/questions and thanks for doing this!

kuganv added a comment.May 3 2023, 4:03 AM

Thanks @kadircet , @sammccall and @ilya-biryukov for the super detailed explanation. As mentioned, we will test clangd with an ASAN build to test.

I also would like to get your feedback on the implementation. After we claim the AST as per @sammccall's patch into LiveAST, we seem to have at least two options for delayed indexing.

Option 1:

  • Change onPreambleAST to get LiveAST to pass it to PreambleThread
  • In PreambleThread::build, as part of the scope exit
if (!ReusedPreamble)
   // Index Preamble

Option 2:

  • Change onPreambleAST to get LiveAST as value
  • Start an Indexing thread
  • TUScheduler can manage another thread for indexing similar to how it does preamble and AST thread.

Have you had any thoughts about this. Thanks.

Sorry I was trying to give some brief idea about what it might look like in Implementation Concerns section above, but let me give some more details;

I think we can just change the signature for PreambleParsedCallback to pass along refcounted objects. forgot to mention in the first comment, but we should also change the CanonicalIncludes to be a shared_ptr so that it can outlive the PreambleData. We should invoke the callback inside buildPreamble after a successful build. Afterwards we should also change the signature for onPreambleAST to take AST, PP and CanonicalIncludes as ref-counted objects again and PreambleThread::build should just forward objects received from PreambleParsedCallback. Afterwards inside the UpdateIndexCallbacks::onPreambleAST we can just invoke indexing on Tasks if it's present or synchronously in the absence of it.

Does that make sense?

Sorry I was trying to give some brief idea about what it might look like in Implementation Concerns section above, but let me give some more details;

I think we can just change the signature for PreambleParsedCallback to pass along refcounted objects. forgot to mention in the first comment, but we should also change the CanonicalIncludes to be a shared_ptr so that it can outlive the PreambleData. We should invoke the callback inside buildPreamble after a successful build. Afterwards we should also change the signature for onPreambleAST to take AST, PP and CanonicalIncludes as ref-counted objects again and PreambleThread::build should just forward objects received from PreambleParsedCallback. Afterwards inside the UpdateIndexCallbacks::onPreambleAST we can just invoke indexing on Tasks if it's present or synchronously in the absence of it.

Thanks @kadircet for the details.

Does that make sense?

It does make sense. Let me work on this and get back to you.

kuganv updated this revision to Diff 521936.May 13 2023, 12:10 PM
kuganv edited the summary of this revision. (Show Details)

Re-implemented based on review

Herald added a project: Restricted Project. · View Herald TranscriptMay 13 2023, 12:10 PM
kuganv retitled this revision from [RFC][clangd] Move preamble index task to a seperate task to [RFC][clangd] Move preamble index out of document open critical path.May 13 2023, 12:11 PM
kuganv planned changes to this revision.May 15 2023, 12:19 AM
kuganv updated this revision to Diff 522083.May 15 2023, 1:51 AM

Fixed tests
clangd/unittests/TUSchedulerTests.cpp require re-warite due to delayed onPreambleAST callback.

i think something went wrong with the diff, you don't seem to update PreambleCallbacks to trigger indexing on a different thread at all (and also there are certain lifetime issues). is this the final version of the patch or did I bump into a WIP version unknowingly ?

clang-tools-extra/clangd/Preamble.cpp
106–113 ↗(On Diff #522083)

why are we triggering destructors for all of these objects eagerly? if this is deliberate to "fix" some issue, could you mention that in comments?

694 ↗(On Diff #522083)

what about just keeping the callback (with a different signature) and calling it here? e.g.:

PreambleCallback(CapturedInfo.takeLife());

that way we don't need to change the return type and also control if we received a valid object on every call site (since callback will only be invoked on success)

clang-tools-extra/clangd/TUScheduler.cpp
1062–1070

you can just execute this block around the call to reportPreambleBuild right? is there any particular reason to make this part of scope cleanup ? especially call it after the call to updatePreamble ?

1065–1067

these are all references to CapturedCtx which will be destroyed after the call. i guess we should be passing shared_ptrs/IntrusiveRefCntPtrs instead, right?

also I don't think it's enough to just pass these 3. we need to make sure rest of the fields in CapturedCtx is also kept alive (e.g. Preprocessor has raw pointers to Target/AuxTarget, which would be dangling after CapturedCtx goes out of scope).

kuganv added inline comments.May 15 2023, 6:27 AM
clang-tools-extra/clangd/Preamble.cpp
106–113 ↗(On Diff #522083)

why are we triggering destructors for all of these objects eagerly? if this is deliberate to "fix" some issue, could you mention that in comments?

Thanks a lot for the review.
If we don't destruct and set it to null, CapturedASTCtx will also have to keep instances such as ASTConsumer including other related callbacks such PreambleCallbacks. This was making the CapturedASTCtx interface and implementation complex.

694 ↗(On Diff #522083)

what about just keeping the callback (with a different signature) and calling it here? e.g.:

PreambleCallback(CapturedInfo.takeLife());

that way we don't need to change the return type and also control if we received a valid object on every call site (since callback will only be invoked on success)

Apologies for the misunderstanding. Just to be clear, you prefer indexing in UpdateIndexCallbacks using the thread Tasks that also indexes in indexStdlib? In this implementation, I am calling the index action in preamble thread. I will revise it.

clang-tools-extra/clangd/TUScheduler.cpp
1062–1070

you can just execute this block around the call to reportPreambleBuild right? is there any particular reason to make this part of scope cleanup ? especially call it after the call to updatePreamble ?

Yes, we can do it around reportPreambleBuild. However, based on your previous comment, I got the impression that you prefer it be a callback to buildPreamble which indexes in the tasks. We could do it either way.

1065–1067

these are all references to CapturedCtx which will be destroyed after the call. i guess we should be passing shared_ptrs/IntrusiveRefCntPtrs instead, right?

also I don't think it's enough to just pass these 3. we need to make sure rest of the fields in CapturedCtx is also kept alive (e.g. Preprocessor has raw pointers to Target/AuxTarget, which would be dangling after CapturedCtx goes out of scope).

In this implementation, I am using these in the same scope. But I will revise it as per previous comment.

kadircet added inline comments.May 15 2023, 7:51 AM
clang-tools-extra/clangd/Preamble.cpp
694 ↗(On Diff #522083)

Apologies for the misunderstanding. Just to be clear, you prefer indexing in UpdateIndexCallbacks using the thread Tasks that also indexes in indexStdlib? In this implementation, I am calling the index action in preamble thread. I will revise it.

Yes, i guess that's the reason why I was confused while looking at the code. Sorry if I give the impression that suggests doing the indexing on PreambleThread, but I think both in my initial comment:

As for AsyncTaskRunner to use, since this indexing task only depends on the file-index, which is owned by ClangdServer, I don't think there's any need to introduce a new taskrunner into TUScheduler and block its destruction. We can just re-use the existing TaskRunner inside parsingcallbacks, in which we run stdlib indexing tasks.

and in the follow up;

I think we can just change the signature for PreambleParsedCallback to pass along refcounted objects. forgot to mention in the first comment, but we should also change the CanonicalIncludes to be a shared_ptr so that it can outlive the PreambleData. We should invoke the callback inside buildPreamble after a successful build. Afterwards we should also change the signature for onPreambleAST to take AST, PP and CanonicalIncludes as ref-counted objects again and PreambleThread::build should just forward objects received from PreambleParsedCallback. Afterwards inside the UpdateIndexCallbacks::onPreambleAST we can just invoke indexing on Tasks if it's present or synchronously in the absence of it.

I was pointing towards running this inside the Tasks in UpdateIndexCallbacks.


There's definitely some upsides to running that indexing on the preamble thread as well (which is what we do today) but I think the extra sequencing requirements (make sure to first notify the ASTPeer and then issue preamble callbacks) we put into TUScheduler (which is already quite complex) is probably not worth it.

kuganv updated this revision to Diff 522471.May 16 2023, 12:44 AM

As per review, moved Preamble indexing into ClangdServer's IndexTasks.

kuganv added inline comments.May 16 2023, 12:49 AM
clang-tools-extra/clangd/Preamble.cpp
694 ↗(On Diff #522083)

Apologies for the misunderstanding. Just to be clear, you prefer indexing in UpdateIndexCallbacks using the thread Tasks that also indexes in indexStdlib? In this implementation, I am calling the index action in preamble thread. I will revise it.

Yes, i guess that's the reason why I was confused while looking at the code. Sorry if I give the impression that suggests doing the indexing on PreambleThread, but I think both in my initial comment:

As for AsyncTaskRunner to use, since this indexing task only depends on the file-index, which is owned by ClangdServer, I don't think there's any need to introduce a new taskrunner into TUScheduler and block its destruction. We can just re-use the existing TaskRunner inside parsingcallbacks, in which we run stdlib indexing tasks.

and in the follow up;

I think we can just change the signature for PreambleParsedCallback to pass along refcounted objects. forgot to mention in the first comment, but we should also change the CanonicalIncludes to be a shared_ptr so that it can outlive the PreambleData. We should invoke the callback inside buildPreamble after a successful build. Afterwards we should also change the signature for onPreambleAST to take AST, PP and CanonicalIncludes as ref-counted objects again and PreambleThread::build should just forward objects received from PreambleParsedCallback. Afterwards inside the UpdateIndexCallbacks::onPreambleAST we can just invoke indexing on Tasks if it's present or synchronously in the absence of it.

I was pointing towards running this inside the Tasks in UpdateIndexCallbacks.


There's definitely some upsides to running that indexing on the preamble thread as well (which is what we do today) but I think the extra sequencing requirements (make sure to first notify the ASTPeer and then issue preamble callbacks) we put into TUScheduler (which is already quite complex) is probably not worth it.

Thanks again for the review. Updated the diff such that indexing is now in IndexTasks. AFIK, there will be a single thread that will now manage indexing of preamble for all the opened TUs. Please let me know if you see any issues there.

DmitryPolukhin added inline comments.May 16 2023, 2:00 AM
clang-tools-extra/clangd/Preamble.cpp
694 ↗(On Diff #522083)

There's definitely some upsides to running that indexing on the preamble thread as well (which is what we do today) but I think the extra sequencing requirements (make sure to first notify the ASTPeer and then issue preamble callbacks) we put into TUScheduler (which is already quite complex) is probably not worth it.

@kadircet I'm sorry for interfering with the review but could you please elaborate what are the downsides of current approach when preamble indexing happening off critical path but on the same thread? The benefits of such approach that preamble marked available earlier and compilation of the main file starts earlier in parallel with indexing preamble. Also such approach eliminates potential race condition between preamble thread and indexing thread. Code in PrecompiledPreamble::Build accesses AST for preamble and if indexing is running in separate thread it will be race condition. If we schedule indexing after PrecompiledPreamble::Build e still will need to be very careful to don't access anything in preamble AST from PreambleThread after the task is scheduled. The only downside of running indexing on PreambleThread is that we cannot start building new preamble while previous one is indexing but I think it might be positive thing to throttle preamble building.

kuganv updated this revision to Diff 522523.May 16 2023, 3:35 AM

Fixed missing merge that caused build error.

kadircet added inline comments.May 17 2023, 3:19 AM
clang-tools-extra/clangd/ClangdServer.cpp
82

what's the point of this semaphore, if we're only going to try_to_lock? this should be done in the Task that we're passing to the Tasks.

but more importantly, this is going to be a change in behavior, before this patch we had one preamble thread per open file and they'll run indexing (in addition to preamble builds) without any throttling.

if we limit number of indexig tasks to 1, we'll need a queue to ensure progress for every file. semaphore won't be fair especially in the face of contention, we'll also likely handle requests out-of-order (which is redundant but OK correctness-wise as our file-index already accounts for that).

hence I don't think there's any reason to change behaviour here until we see some issues with contention (which I believe is unlikely as we can still only issue preamble indexing requests as quick as we can build preambles, and the latter is slower than the former).

83

s/task:/Preamble indexing for:/

114

same as above

clang-tools-extra/clangd/Preamble.cpp
101 ↗(On Diff #522523)

unlike other takeXXX methods, this will actually leave the optional inside still in a valid state.
can you instead return a std::optional<CapturedASTCtx> here?

698–700 ↗(On Diff #522523)

we use a "producingfs" when building the preamble, hence there shouldn't be any references to statcache inside the FM&VFS stored in the compiler instance here. have you noticed something to the contrary ?

106–113 ↗(On Diff #522083)

If we don't destruct and set it to null, CapturedASTCtx will also have to keep instances such as ASTConsumer

Sorry if I am being dense but I can't actually see any references to those objects in the structs we preseve. AFAICT, Sema, ASTConsumer and ASTReader are members of CompilerInstance, which we don't keep around. Can you point me towards any references to the rest of the objects you're setting to null in case I am missing any?

ASTMutationListener seems to be the only one that's relevant. It's indeed exposed through ASTContext and our indexing operations can trigger various callbacks to fire. Can you set only that one to nullptr and leave a comment explaining the situation?

694 ↗(On Diff #522083)

but could you please elaborate what are the downsides of current approach when preamble indexing happening off critical path but on the same thread?

As I mentioned above this makes TUScheduler more complicated now. We need to make sure we first send the preamble to astpeer and then invoke parsed callbacks, the sequencing is subtle and ties our hands if we want to refactor this code in the future as it needs to be preserved.

the benefits of such approach that preamble marked available earlier and compilation of the main file starts earlier in parallel with indexing preamble.

that's the case with both approaches, right? surely directly sending the preamble to ASTPeer might be couple milliseconds faster, but we're still indexing preamble in parallel. We might loose couple milliseconds while putting the new task into the queue, but considering the preamble/ast build times i don't think that really matters.

Also such approach eliminates potential race condition between preamble thread and indexing thread. Code in PrecompiledPreamble::Build accesses AST for preamble and if indexing is running in separate thread it will be race condition. If we schedule indexing after PrecompiledPreamble::Build e still will need to be very careful to don't access anything in preamble AST from PreambleThread after the task is scheduled.

sorry I don't fully understand the argument here. in both alternatives we're running indexing after PrecompiledPreamble::Build finishes hence they cannot access "live" ast concurrently.
Maybe you mean ParsedAST::build, i.e. the main ast build? If that's the case, it uses the serialized preamble, not "live" astcontext (and serialized preamble is already thread-safe to read concurrently).

clang-tools-extra/clangd/Preamble.h
49–50 ↗(On Diff #522523)

Having the use case mentioned here is nice, but I think we should also document "what/how" this struct operates.

maybe something like:

Keeps necessary structs for an ASTContext and Preprocessor alive.

This enables consuming them after context that produced the AST is gone. (e.g. indexing a preamble ast on a separate thread).
ASTContext stored inside is still not thread-safe.
51 ↗(On Diff #522523)

can you also make this struct move-only by deleting copy constructor/assignment? as we don't want to have multiple references to astcontext by mistake

105 ↗(On Diff #522523)

s/CanonicalIncludes/const CanonicalIncludes/ (to make sure we don't accidentally modify it concurrently) this would also imply changes on the callback signatures

DmitryPolukhin added inline comments.May 17 2023, 5:33 AM
clang-tools-extra/clangd/ClangdServer.cpp
82

we can still only issue preamble indexing requests as quick as we can build preambles, and the latter is slower than the former

Preamble compliation is not always slower than indexing. For example, if modules are actively used, compilation itself can be very fast but indexing is slow. Said that, I think we can eliminate the semaphore if there is no good reason to keep it.

clang-tools-extra/clangd/Preamble.cpp
694 ↗(On Diff #522083)

We need to make sure we first send the preamble to astpeer and then invoke parsed callbacks, the sequencing is subtle and ties our hands if we want to refactor this code in the future as it needs to be preserved.

I completely agree that clangd threading has lots of subtle things that makes it easy to violate expected sequence of events. My comment about potential race condition was about very similar thing, it is very easy to keep pointer to ASTContext/FileManager/etc. in a callback and call it after the task was scheduled and it will be race condition. That is why I thought that indexing on the same thread that was used for building preamble AST is safer. I'm not aware of any existing race conditions with this diff. My example was hypothetical what would happen if indexing task was scheduled from AfterExecute, for example, as it was from one of the version of this diff. Just to make it clear, I'm fine with current approach too.

kuganv marked 2 inline comments as done.May 17 2023, 9:45 AM
kuganv added inline comments.
clang-tools-extra/clangd/Preamble.cpp
698–700 ↗(On Diff #522523)

we use a "producingfs" when building the preamble, hence there shouldn't be any references to statcache inside the FM&VFS stored in the compiler instance here. have you noticed something to the contrary ?

In PrecompiledPreamble::Build, we use StatCacheFS for FileMgr creation. Shouldn’t we keep this Alive? Without this, I am seeing:

==1509807==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000464c00 at pc 0x55f558fd7de2 bp 0x7f18a73ff010 sp 0x7f18a73ff008
READ of size 8 at 0x606000464c00 thread T48 (NodeServer.cpp1)
V[23:14:41.667] Ignored diagnostic. /data/users/kugan/fbsource/fbcode/synapse_dev/prod/NodeServer.cpp:2:2:Mandatory header <vector> not found in standard library!
I[23:14:41.669] Indexed c++20 standard library (incomplete due to errors): 0 symbols, 0 filtered
    #0 0x55f558fd7de1 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_data() const /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/basic_string.h:234:28
    #1 0x55f558fd7de1 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::data() const /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../include/c++/12/bits/basic_string.h:2567:16
    #2 0x55f558fd7de1 in llvm::StringRef::StringRef(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) /home/kugan/local/llvm-project/llvm/include/llvm/ADT/StringRef.h:101:18
    #3 0x55f558fd7de1 in clang::clangd::PreambleFileStatusCache::update(llvm::vfs::FileSystem const&, llvm::vfs::Status) /home/kugan/local/llvm-project/clang-tools-extra/clangd/FS.cpp:33:20
    #4 0x55f558fd925f in clang::clangd::PreambleFileStatusCache::getProducingFS(llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>)::CollectFS::status(llvm::Twine const&) /home/kugan/local/llvm-project/clang-tools-extra/clangd/FS.cpp:82:19
    #5 0x55f55917ddee in clang::clangd::(anonymous namespace)::TimerFS::status(llvm::Twine const&) /home/kugan/local/llvm-project/clang-tools-extra/clangd/Preamble.cpp:492:30
    #6 0x55f55670dcff in clang::FileSystemStatCache::get(llvm::StringRef, llvm::vfs::Status&, bool, std::unique_ptr<llvm::vfs::File, std::default_delete<llvm::vfs::File>>*, clang::FileSystemStatCache*, llvm::vfs::FileSystem&) /home/kugan/local/llvm-project/clang/lib/Basic/FileSystemStatCache.cpp:47:55
    #7 0x55f556701ab4 in clang::FileManager::getStatValue(llvm::StringRef, llvm::vfs::Status&, bool, std::unique_ptr<llvm::vfs::File, std::default_delete<llvm::vfs::File>>*) /home/kugan/local/llvm-project/clang/lib/Basic/FileManager.cpp:590:12
    #8 0x55f556700eed in clang::FileManager::getDirectoryRef(llvm::StringRef, bool) /home/kugan/local/llvm-project/clang/lib/Basic/FileManager.cpp:161:20
    #9 0x55f556701d3e in clang::FileManager::getDirectory(llvm::StringRef, bool) /home/kugan/local/llvm-project/clang/lib/Basic/FileManager.cpp:191:17
    #10 0x55f55929a800 in clang::clangd::getCanonicalPath[abi:cxx11](clang::FileEntryRef, clang::SourceManager const&) /home/kugan/local/llvm-project/clang-tools-extra/clangd/SourceCode.cpp:540:45
    #11 0x55f559627681 in clang::clangd::SymbolCollector::HeaderFileURICache::toURI[abi:cxx11](clang::FileEntryRef) /home/kugan/local/llvm-project/clang-tools-extra/clangd/index/SymbolCollector.cpp:211:24
    #12 0x55f55961926e in clang::clangd::SymbolCollector::getTokenLocation(clang::SourceLocation) /home/kugan/local/llvm-project/clang-tools-extra/clangd/index/SymbolCollector.cpp:425:36
    #13 0x55f5596233a9 in clang::clangd::SymbolCollector::handleMacroOccurrence(clang::IdentifierInfo const*, clang::MacroInfo const*, unsigned int, clang::SourceLocation) /home/kugan/local/llvm-project/clang-tools-extra/clangd/index/SymbolCollector.cpp:753:22
    #14 0x55f55c1c7e3d in indexPreprocessorMacro(clang::IdentifierInfo const*, clang::MacroInfo const*, clang::MacroDirective::Kind, clang::SourceLocation, clang::index::IndexDataConsumer&) /home/kugan/local/llvm-project/clang/lib/Index/IndexingAction.cpp:229:16
    #15 0x55f55c1c7e3d in indexPreprocessorMacros(clang::Preprocessor&, clang::index::IndexDataConsumer&) /home/kugan/local/llvm-project/clang/lib/Index/IndexingAction.cpp:236:7
    #16 0x55f55c1c8290 in clang::index::indexTopLevelDecls(clang::ASTContext&, clang::Preprocessor&, llvm::ArrayRef<clang::Decl const*>, clang::index::IndexDataConsumer&, clang::index::IndexingOptions) /home/kugan/local/llvm-project/clang/lib/Index/IndexingAction.cpp:283:5
    #17 0x55f55959d350 in clang::clangd::(anonymous namespace)::indexSymbols(clang::ASTContext&, clang::Preprocessor&, llvm::ArrayRef<clang::Decl*>, clang::clangd::MainFileMacros const*, clang::clangd::CanonicalIncludes const&, bool, llvm::StringRef, bool) /home/kugan/local/llvm-project/clang-tools-extra/clangd/index/FileIndex.cpp:82:3
    #18 0x55f55959f040 in clang::clangd::indexHeaderSymbols(llvm::StringRef, clang::ASTContext&, clang::Preprocessor&, clang::clangd::CanonicalIncludes const&) /home/kugan/local/llvm-project/clang-tools-extra/clangd/index/FileIndex.cpp:235:10
    #19 0x55f5595a8119 in clang::clangd::FileIndex::updatePreamble(llvm::StringRef, llvm::StringRef, clang::ASTContext&, clang::Preprocessor&, clang::clangd::CanonicalIncludes const&) /home/kugan/local/llvm-project/clang-tools-extra/clangd/index/FileIndex.cpp:464:7
106–113 ↗(On Diff #522083)

If we don't destruct and set it to null, CapturedASTCtx will also have to keep instances such as ASTConsumer

Sorry if I am being dense but I can't actually see any references to those objects in the structs we preseve. AFAICT, Sema, ASTConsumer and ASTReader are members of CompilerInstance, which we don't keep around. Can you point me towards any references to the rest of the objects you're setting to null in case I am missing any?

ASTMutationListener seems to be the only one that's relevant. It's indeed exposed through ASTContext and our indexing operations can trigger various callbacks to fire. Can you set only that one to nullptr and leave a comment explaining the situation?

Thanks for the review. I will revise based on the feedback.

In buildPreamble, we create CppFilePreambleCallbacks in stack to be used in PrecompiledPreamble::Build which becomes part of PrecompilePreambleConsumer. I think this this need to be reset. Probably this has to be done in a different way. If I dont reset, I am seeing crash in some cases. See:

85c0c79f in clang::ASTReader::DecodeIdentifierInfo(unsigned int) /home/kugan/local/llvm-project/clang/lib/Serialization/ASTReader.cpp:8695:32
    #1 0x558285c75266 in clang::ASTReader::readIdentifier(clang::serialization::ModuleFile&, llvm::SmallVector<unsigned long, 64u> const&, unsigned int&) /home/kugan/local/llvm-project/clang/include/clang/Seriali
zation/ASTReader.h:2126:12
    #2 0x558285c75266 in clang::ASTRecordReader::readIdentifier() /home/kugan/local/llvm-project/clang/include/clang/Serialization/ASTRecordReader.h:211:20
    #3 0x558285c75266 in clang::serialization::BasicReaderBase<clang::ASTRecordReader>::readDeclarationName() /home/kugan/local/llvm-project/build/tools/clang/include/clang/AST/AbstractBasicReader.inc:780:58
    #4 0x558285ce10bc in clang::ASTDeclReader::VisitNamedDecl(clang::NamedDecl*) /home/kugan/local/llvm-project/clang/lib/Serialization/ASTReaderDecl.cpp:694:26
    #5 0x558285ce10bc in clang::ASTDeclReader::VisitNamespaceDecl(clang::NamespaceDecl*) /home/kugan/local/llvm-project/clang/lib/Serialization/ASTReaderDecl.cpp:1785:3
    #6 0x558285cbc6e8 in clang::ASTDeclReader::Visit(clang::Decl*) /home/kugan/local/llvm-project/clang/lib/Serialization/ASTReaderDecl.cpp:543:37
    #7 0x558285d5a133 in clang::ASTReader::ReadDeclRecord(unsigned int) /home/kugan/local/llvm-project/clang/lib/Serialization/ASTReaderDecl.cpp:4052:10
    #8 0x558285c73677 in clang::ASTReader::GetDecl(unsigned int) /home/kugan/local/llvm-project/clang/lib/Serialization/ASTReader.cpp:7607:5
    #9 0x558285c73677 in clang::ASTReader::GetLocalDecl(clang::serialization::ModuleFile&, unsigned int) /home/kugan/local/llvm-project/clang/include/clang/Serialization/ASTReader.h:1913:12
    #10 0x558285bfd741 in clang::ASTReader::FindExternalLexicalDecls(clang::DeclContext const*, llvm::function_ref<bool (clang::Decl::Kind)>, llvm::SmallVectorImpl<clang::Decl*>&)::$_9::operator()(clang::serializ
ation::ModuleFile*, llvm::ArrayRef<llvm::support::detail::packed_endian_specific_integral<unsigned int, (llvm::support::endianness)2, 1ul, 1ul>>) const /home/kugan/local/llvm-project/clang/lib/Serialization/ASTRe
ader.cpp:7687:21
    #11 0x558285bfd741 in clang::ASTReader::FindExternalLexicalDecls(clang::DeclContext const*, llvm::function_ref<bool (clang::Decl::Kind)>, llvm::SmallVectorImpl<clang::Decl*>&) /home/kugan/local/llvm-project/c
lang/lib/Serialization/ASTReader.cpp:7697:7
    #12 0x55827e942de7 in clang::ExternalASTSource::FindExternalLexicalDecls(clang::DeclContext const*, llvm::SmallVectorImpl<clang::Decl*>&) /home/kugan/local/llvm-project/clang/include/clang/AST/ExternalASTSour
ce.h:185:5
    #13 0x55827e942de7 in clang::DeclContext::LoadLexicalDeclsFromExternalStorage() const /home/kugan/local/llvm-project/clang/lib/AST/DeclBase.cpp:1421:11
    #14 0x55827e94436b in clang::DeclContext::decls_begin() const /home/kugan/local/llvm-project/clang/lib/AST/DeclBase.cpp:1477:5
    #15 0x5582822787ab in clang::DeclContext::decls() const /home/kugan/local/llvm-project/clang/include/clang/AST/DeclBase.h:2188:48
    #16 0x5582822787ab in clang::clangd::indexHeaderSymbols(llvm::StringRef, clang::ASTContext&, clang::Preprocessor&, clang::clangd::CanonicalIncludes const&) /home/kugan/local/llvm-project/clang-tools-extra/cla
ngd/index/FileIndex.cpp:233:37
    #17 0x558282281be9 in clang::clangd::FileIndex::updatePreamble(llvm::StringRef, llvm::StringRef, clang::ASTContext&, clang::Preprocessor&, clang::clangd::CanonicalIncludes const&) /home/kugan/local/llvm-proje
ct/clang-tools-extra/clangd/index/FileIndex.cpp:464:7
kuganv updated this revision to Diff 523784.May 19 2023, 8:09 AM

Revised Except for:

  1. Explicitly destructing in AfterExecute
  2. setStatCache to take refence

Removing Both of which is causing crash.

kadircet added inline comments.May 19 2023, 8:39 AM
clang-tools-extra/clangd/ClangdServer.cpp
82

For example, if modules are actively used, compilation itself can be very fast but indexing is slow.

You're right, but that's a mode of operation that incidentally works in clangd as we discussed before.

I think we can eliminate the semaphore if there is no good reason to keep it.

As mentioned above I don't think we have good enough reasons to have some throttling here, and even if we did, I don't think we can achieve it with a semaphore cheaply, we probably need a queue and multiple consumers to at least ensure forward progress on every file.

clang-tools-extra/clangd/Preamble.cpp
698–700 ↗(On Diff #522523)

oh wow, i completely missed that. this actually points to a different data-race issue unfortunately, and we were suppressing it by keeping it alive here.

PreambleStatCache is thread-safe only after writes to it completely finishes (i.e. there are no more producingfs instances lying around). Previously, because we were not announcing statcache until indexing was over, we didn't notice this. but now we're actually sending statcache to consumers while indexing still has a producingfs alive.

Hence, in addition to the life extension of the statcache, we actually need to change FS inside filemanagers to be a cosumingfs.

106–113 ↗(On Diff #522083)

ah, this one is unfortunately a little bit more involved, and triggers only in the cases where we have modules enabled (which is incidentally working in clangd today, but effect is wider as -std=c++20 implicitly turns on -fmodules).

tl;dr; astcontext is keeping references to pchgenerator in module-enabled builds.

TBH the safest option here seems like clearing external source in astcontext completely. As we can't really risk callbacks to PCHGenerator, even if we fixed all the life-time concerns (ast consumer keeps references to frontendaction alive, so both the callbacks we pass here and the action itself needs to outlive the consumer).

But that's likely going to break preamble-indexing completely for codebases that use modules (we'll likely only index top-level decls from the current modules, but I am not sure). The alternative suggested here strikes some middle ground (there's still going to be behavior change, previously all the modules accessed during indexing would effect the final serialized preamble, they won't anymore) but I am afraid it's too fragile. The dependency between these clang-internals were so subtle that it took ~hours for me to analyze, despite me specifically looking for them, hence I don't think anyone that's unaware can preserve these invariants (i.e. make sure we don't have references to astconsumer in other places.).

I am not sure how these work for you today, but can you check if just setting external source on the astcontext to null is good enough?

694 ↗(On Diff #522083)

it is very easy to keep pointer to ASTContext/FileManager/etc. in a callback and call it after the task was scheduled and it will be race condition.

You're right, but I don't think we're making it worse here with either of the approaches. Every time we create new references to those objects, we already have that risk, but IMO it's easier to raise some eyebrows when a new reference is created to a non-threadsafe object vs just re-ordering of some calls. Hence I was more inclined towards this alternative that at least won't make sequencing of operations more constrained.

kuganv updated this revision to Diff 524836.May 23 2023, 12:08 PM
kuganv marked an inline comment as done.

Set just PrecompilePreambleConsumer/PCHGenerator and ASTMutationListener to null.
Set the FileManager VFS to consuming FS so that it is thread safe.

clang-tools-extra/clangd/Preamble.cpp
106–113 ↗(On Diff #522083)

Thanks for checking. Unfortunately, just setting external source on the astcontext to null is not working. Setting PrecompilePreambleConsumer/PCHGenerator and ASTMutationListener to null seem to be sufficient.

I checked the Preamble index size with and without the patch for some files and they are the same. Updating to the revised patch.

thanks, mostly LG. some nits around comments and request for a knob :)

clang-tools-extra/clangd/ClangdServer.cpp
81–92

can you also introduce a bool AsyncPreambleIndexing = false; into ClangdServer::Options struct and pass it into UpdateIndexCallbacks and make this conditional on that? after testing it for couple weeks, we can make it the default.

90

let's leave a comment here saying that FIndex outlives the UpdateIndexCallbacks.

93

can you add a trace::Span Tracer("PreambleIndexing"); here

213

nit: let's revert this change

clang-tools-extra/clangd/Preamble.cpp
105 ↗(On Diff #524836)

the comments are still just talking about what the code is already doing, and not "why".

// ASTContext and CompilerInstance can keep references to ASTConsumer (PCHGenerator in case of preamble builds).
// These references will become dangling after preamble build finishes, even if they didn't we don't want operations on the preamble to mutate PCH afterwards.
// So clear those references explicitly here.
696 ↗(On Diff #524836)

again comment is talking about "what" the code does rather than "why", what about:

// Stat cache is thread safe only when there are no producers. Hence change the VFS underneath to a consuming fs.
698 ↗(On Diff #524836)

nit: std::move(StatCacheFS) or just inline the initializer

kuganv updated this revision to Diff 530328.Jun 11 2023, 12:54 PM
kuganv marked 6 inline comments as done.

Updated based on the review comments.

kadircet accepted this revision.Jun 12 2023, 12:51 AM

thanks a lot for bearing with me, LGTM!

let me know if i should land this for you.

clang-tools-extra/clangd/ClangdServer.cpp
156

nit: no need to store a reference to whole Options struct, you can have single boolean.

This revision is now accepted and ready to land.Jun 12 2023, 12:51 AM
This revision was landed with ongoing or failed builds.Jun 12 2023, 4:57 AM
This revision was automatically updated to reflect the committed changes.