This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Use different FS in PreambleThread
AbandonedPublic

Authored by kadircet on May 31 2020, 2:45 PM.

Details

Reviewers
sammccall
Summary

As vfs::FileSystem is not threadsafe, we've introduced a data race when
PreambleThread started to run in async mode. As both ASTWorker and
PreambleWorker can concurrently work on the same FS.

This patch fixes that data race by propagating FSProvider into TUScheduler and
ASTWorker and making use of a new FS when issuing update to the PreambleThread.

This doesn't break our contract on FSProvider::getFileSystem, as it says the
active context will be client's context at the time of request and
ASTWorker::update is run with the context we've captured on
ClangdServer::AddDocument.

As for the operations, this implies preamble and ast threads can see different
versions of the filesystems. For example in presence of a snapshotted filesystem

  • client makes an AddDocument request with FS at snapshot v1
  • client then updates the FS to v2
  • ASTWorker starts handling the update, caches the FS v1 in FileInputs, which is used for serving reads.
  • Schedules a preamble update, this fetches a new FS from provider, which might yield v2. This says might as clients can handle this case by storing snapshot version in context.
  • Preamble build finishes and clangd publishes diagnostics for FS v2.
  • Client issues a read (e.g. go-to-def), clangd uses FS v1 for serving the request.

This can be overcome by 2 filesystems into TUScheduler in
ClangdServer::AddDocument, but I don't think it is an issue in practice
currently. As our beloved snapshotted filesystem provider stores the snapshot
version in context, and should always return the same FS for requests coming
from the same context.

Diff Detail

Event Timeline

kadircet created this revision.May 31 2020, 2:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 31 2020, 2:45 PM

TL;DR: I think there are three viable paths that we should consider:
A) minimal change: put the FSProvider in ParseInputs
B) this patch, but with more documentation and safety
C) fight to clean up VFS multithreading semantics

Assuming we don't want to block on C, I'm not sure whether A or B is conceptually better. But A is a smaller change and brings side benefits, so wondering if you see advantages in B.


You're right about the race condition of course and I'm sure this patch is correct.
But this rubs me the wrong way a bit - the API change doesn't seem clearly motivated either from a high level (we conceptually *want* a single FS to build a single version of code) or from a low level (what are we actually doing in parallel that isn't threadsafe?).

Looking at the low level perspective first. VFS has three classes threadsafety issues:

  • it contains the "working directory" as mutable state, which isn't multi-threading friendly. We do set working directory concurrently with other work, but always to the same thing. We could factor our way around this by setting the working directory prior to any concurrency (to CDB.Directory) and ensuring we never set it again.
  • the RealFilesystem in real-working-directory mode is a singleton so its use is thread-hostile. This isn't really relevant to us because we use emulated-working-directory mode and have distinct instances. But it probably informed VFS's lack of useful threading semantics.
  • read operations aren't const or documented as threadsafe even though the canonical RealFilesystem is. In-tree implementations seem to be threadsafe, some out-of-tree ones are not :-( This definitely bites us here.

I think it would be fundamentally possible to use a single VFS here. We need to modify the VFS interface to make concurrent reads safe (and const, while here) and fix any implementations. This isn't trivial. I think it's worthwhile, but experience tells me different projects want different things from the VFS interface...

The high-level view: IMO FSProvider is basically a hack around VFS's thread-safety issues, rather than a fundamental part of the application - it's a filesystem that doesn't have an attached working directory and whose read operations (e.g. getFileSystem()->status()) are threadsafe. We could rename FSProvider -> ThreadsafeFS and getFileSystem() -> newCursor(WD) or something, and maybe we should.
From this perspective, this patch is changing two things that have little to do with each other: it's replacing FS -> ThreadsafeFS and it's moving the FS ownership from one revision of the file to the TUScheduler itself.

I'm not really sure whether the latter is a good or a bad idea. If we want to do it, the use of ParseInputs as the interface to TUScheduler is a bit confusing - we should at least document that FS isn't used (as we do for CompileCommand) and probably assert that it's null, too.

But maybe simpler would be to "just" change the VFS member in ParseInputs to a const FSProvider*. This keeps the design approximately what it is now, but simplifies the ParseInputs concept: it's now truly readonly data, and functions that consume it won't have effects that escape.

clang-tools-extra/clangd/TUScheduler.cpp
629

Yikes, ISTM we're getting the FS from Inputs here, and from this->FSProvider below. We should use one or the other (i.e. FSProvider only, following this patch) unless we're trying to do something really subtle...

TL;DR: I think there are three viable paths that we should consider:
A) minimal change: put the FSProvider in ParseInputs
B) this patch, but with more documentation and safety
C) fight to clean up VFS multithreading semantics

Assuming we don't want to block on C, I'm not sure whether A or B is conceptually better. But A is a smaller change and brings side benefits, so wondering if you see advantages in B.

I didn't want to go for A) as I was afraid of satisfying the contract on FileSystemProvider::getFileSystem

/// Context::current() will be the context passed to the clang entrypoint,
/// such as addDocument(), and will also be propagated to result callbacks.

As this patch keep the FileSystemProvider inside TUScheduler.cpp and we need to ensure context is always the one we received in the entrypoint. Whereas ParseInputs is widely
passed around, and ensuring that assumption holds might be hard.

You're right about the race condition of course and I'm sure this patch is correct.
But this rubs me the wrong way a bit - the API change doesn't seem clearly motivated either from a high level (we conceptually *want* a single FS to build a single version of code) or from a low level (what are we actually doing in parallel that isn't threadsafe?).

Looking at the low level perspective first. VFS has three classes threadsafety issues:

  • it contains the "working directory" as mutable state, which isn't multi-threading friendly. We do set working directory concurrently with other work, but always to the same thing. We could factor our way around this by setting the working directory prior to any concurrency (to CDB.Directory) and ensuring we never set it again.
  • the RealFilesystem in real-working-directory mode is a singleton so its use is thread-hostile. This isn't really relevant to us because we use emulated-working-directory mode and have distinct instances. But it probably informed VFS's lack of useful threading semantics.
  • read operations aren't const or documented as threadsafe even though the canonical RealFilesystem is. In-tree implementations seem to be threadsafe, some out-of-tree ones are not :-( This definitely bites us here.

I think it would be fundamentally possible to use a single VFS here. We need to modify the VFS interface to make concurrent reads safe (and const, while here) and fix any implementations. This isn't trivial. I think it's worthwhile, but experience tells me different projects want different things from the VFS interface...

I totally agree, this was actually my initial motivation. But I wasn't sure if it is worth the hassle, as this might receive some push-back from downstream users (VFS interface is not just in clang, but in llvm) and coming to a consensus is likely to take time.
It would be nice to clear that technical debt, but I don't think it provides any other practical benefits.

The high-level view: IMO FSProvider is basically a hack around VFS's thread-safety issues, rather than a fundamental part of the application - it's a filesystem that doesn't have an attached working directory and whose read operations (e.g. getFileSystem()->status()) are threadsafe. We could rename FSProvider -> ThreadsafeFS and getFileSystem() -> newCursor(WD) or something, and maybe we should.
From this perspective, this patch is changing two things that have little to do with each other: it's replacing FS -> ThreadsafeFS and it's moving the FS ownership from one revision of the file to the TUScheduler itself.

I'm not really sure whether the latter is a good or a bad idea. If we want to do it, the use of ParseInputs as the interface to TUScheduler is a bit confusing - we should at least document that FS isn't used (as we do for CompileCommand) and probably assert that it's null, too.

Right, the latter was a consequence of not storing FSProvider in ParseInputs, rather than a deliberate one. We can also push it around as a parameter to update, rather than a member if that looks more clear.
I didn't do that as it required more plumbing and change was actually local to TUScheduler.

But maybe simpler would be to "just" change the VFS member in ParseInputs to a const FSProvider*. This keeps the design approximately what it is now, but simplifies the ParseInputs concept: it's now truly readonly data, and functions that consume it won't have effects that escape.

Isn't this just what you proposed in A) ? My concern above applies to this one too. It is not a practical concern as of today, as most uses of ParseInputs is through callbacks in ASTWorker, hence the context should be implicitly set. Unless, someone attaches another async callback.

TL;DR: I think there are three viable paths that we should consider:
A) minimal change: put the FSProvider in ParseInputs
B) this patch, but with more documentation and safety
C) fight to clean up VFS multithreading semantics

Assuming we don't want to block on C, I'm not sure whether A or B is conceptually better. But A is a smaller change and brings side benefits, so wondering if you see advantages in B.

I didn't want to go for A) as I was afraid of satisfying the contract on FileSystemProvider::getFileSystem

/// Context::current() will be the context passed to the clang entrypoint,
/// such as addDocument(), and will also be propagated to result callbacks.

As this patch keep the FileSystemProvider inside TUScheduler.cpp and we need to ensure context is always the one we received in the entrypoint. Whereas ParseInputs is widely
passed around, and ensuring that assumption holds might be hard.

That makes sense. However I don't think that contract is conceptually that great, having the ideas of "threadsafe FS" and "maybe-context-aware FS" coupled together is... weird. The benefit is that it's an easier contract to satisfy.

It's already the case that the context could/should be plumbed through to the actual FS operations and other extension points. Things like tracing rely on it. I think saying "we always plumb context through to everything" is a better contract even if it's a bit stronger than we need and harder to prove satisfied. This frees up FileSystemProvider to be the conceptually-simple ThreadsafeFS that could be lifted from clangd/support to llvm/support if we want.

WDYT?

That makes sense. However I don't think that contract is conceptually that great, having the ideas of "threadsafe FS" and "maybe-context-aware FS" coupled together is... weird. The benefit is that it's an easier contract to satisfy.

It's already the case that the context could/should be plumbed through to the actual FS operations and other extension points. Things like tracing rely on it. I think saying "we always plumb context through to everything" is a better contract even if it's a bit stronger than we need and harder to prove satisfied. This frees up FileSystemProvider to be the conceptually-simple ThreadsafeFS that could be lifted from clangd/support to llvm/support if we want.

WDYT?

Did that in D81173, there were two issues:

  • Sometimes we create ParseInputs without an access to an FSProvider, overcome this one with a helper that wraps a VFS and returns an FSProvider for it.
  • TestTU::inputs() needs to return a ParseInputs, but it is unclear which FSProvider it should point to :/ make it a parameter to the function. Since it needs to out-live the returned ParseInputs.
kadircet abandoned this revision.Jun 9 2020, 4:29 AM

we've went with D81173 instead.