This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Keep only a limited number of idle ASTs in memory
ClosedPublic

Authored by ilya-biryukov on May 18 2018, 6:57 AM.

Details

Summary

After this commit, clangd will only keep the last 3 accessed ASTs in
memory. Preambles for each of the opened files are still kept in
memory to make completion and AST rebuilds fast.

AST rebuilds are usually fast enough, but having the last ASTs in
memory still considerably improves latency of operations like
findDefinition and documeneHighlight, which are often sent multiple
times a second when moving around the code. So keeping some of the last
accessed ASTs in memory seems like a reasonable tradeoff.

Event Timeline

ilya-biryukov created this revision.May 18 2018, 6:57 AM

Another alternative that I've considered was evicting the ASTs from memory after a certain period of time, e.g. after 30 seconds of inactivity for some file. This might be simpler and also cover the use-case of speeding up multiple code navigation requests (findDefinition/documentHighlight) in a row.
Yet another thing that came to mind: walking over all of the AST to find reference under the cursor is terribly inefficient, we should be able to build a cheap data structure that speeds up this operation. Maybe we can even store enough information to not need the AST for navigation at all and build it only for features like refactorings.
@sammccall, let me know what are your thoughts on all of this.

Maybe we can even store enough information to not need the AST for navigation at all and build it only for features like refactorings.
@sammccall, let me know what are your thoughts on all of this.

That's what I was thinking. I think I also had similar discussion with @arphaman. I think storing a limited number of ASTs is a good interim solution.

(Haven't reviewed the code yet, just design stuff)
I'm tempted to push back on the idea that we need to store any - "if it's fast enough for code complete, it's fast enough for GTD". But that's probably oversimplifying - we don't force reuse of a stable preamble for GTD I think. So we probably do want some cache.

Maybe we can even store enough information to not need the AST for navigation at all and build it only for features like refactorings.
@sammccall, let me know what are your thoughts on all of this.

That's what I was thinking. I think I also had similar discussion with @arphaman. I think storing a limited number of ASTs is a good interim solution.

Agree - it's a good idea but we shouldn't block on it. A general-purpose xrefs index may solve this problem (and others) but requires a bunch of design. A narrower structure risks building a bunch of complexity for one feature we can't reuse for the next.

Another alternative that I've considered was evicting the ASTs from memory after a certain period of time, e.g. after 30 seconds of inactivity for some file.

We discussed this a bit more offline. A time-based limit reduces idle RAM usage, and a weight limit (or just count) reduces peak RAM.
Relatively speaking, idle seems to be more important to our hosted/multiplexed use cases, and peak is more important when running on a developer machine.
Weight is probably easier to tune. Time is easier to implement as the TUs are independent.

So this gives us some options (assuming we want some caching, but not unlimited):

  • Evict if old - this helps hosted a lot, and is simple to implement.
  • Evict if full (this patch) - this helps standalone, and is more complex.
  • Evict if full AND old - this can be tuned nicely for hosted and standalone. Most complex, but only a little more than the previous option.
  • Evict if full OR old - this puts a hard limit on resource usage and controls idle, but doesn't seem useful for hosted as it can't drive idle to zero.

I think the main design decision is whether the cache logic requires TUs to interact (vs simple time eviction). If we pay that complexity cost we get a lot of flexibility to tweak the cache. It's the hosted stuff that's driving this (for Google) right now, but maybe that's just because we're not really measuring impact on workstations yet :)
So I think I like this policy as a starting point, but we should plan to bolt on time-based-expiry in the near future.

Having taken a closer look, I think the cache can be simplified/separated a bit more cleanly by returning shared pointers and not allowing lookups, instead restoring limited ownership in CppFile...

Happy to discuss more, esp if you might disagree :)

clangd/ClangdUnit.h
132–134

This may be change aversion, but I'm not sure this class does enough after this change - it doesn't store the inputs or the outputs/cache, so it kind of seems like it wants to be a function.

I guess the motivation here is that storing the outputs means dealing with the cache, and the cache is local to TUScheduler.
But CppFile is only used in TUScheduler, so we could move this there too? It feels like expanding the scope more than I'd like.

The upside is that I think it's a more natural division of responsibility: CppFile could continue to be the "main" holder of the shared_ptr<Preamble> (which we don't limit, but share), and instead of Optional<ParsedAST> it'd have a weak_ptr<ParsedAST> which is controlled and can be refreshed through the cache.

As discussed offline, the cache could look something like:

class Cache {
   shared_ptr<ParsedAST> put(ParsedAST);
   void hintUsed(ParsedAST*); // optional, bumps LRU when client reads
   void hintUnused(ParsedAST*); // optional, releases when client abandons
}

shared_ptr<ParsedAST> CppFile::getAST() {
  shared_ptr<ParsedAST> AST = WeakAST.lock();
  if (AST)
    Cache.hintUsed(AST.get());
  else
    WeakAST = AST = Cache.put(build...);
  return AST;
}
  • Reimplemented LRU cache with shared_ptr and weak_ptr.
ilya-biryukov added inline comments.May 24 2018, 2:01 AM
clangd/ClangdUnit.h
132–134

I've reimplemented the cache with weak_ptr/shared_ptr.
With a slightly different interface to hide more stuff in the cache API. Please take a look and let me know what you think.

Nevertheless, I still find CppFile refactoring useful. It unties preambles from the ASTs and I believe this is a good thing, given that their lifetimes are different.

Thanks, this looks a lot better!
There do seem to be a lot of little classes that exist exactly one-per-TU (ASTWorker, ASTBuilder, CachedAST, to a lesser extent ParsedAST) and I'm not sure the balance of responsibilities is quite right. Some comments below.

clangd/ClangdUnit.h
132–134

I'm not sure how much they were tangled before, they were computed in the same place, but accessed separately, and it's not sure you ever *want* to compute them separately? (possibly in unit tests?)

I do think making ASTWorker maintain the old preamble and pass it in is confusing. The remaining members are trivial and unrelated enough that I do think if the references to the preamble/ast are removed, then moving the remaining members to ASTWorker or to a dumb struct, and making these free functions would make it easier to navigate the class structure.

clangd/TUScheduler.cpp
71

I think I understand this more as "updates the value" but the value is lazy...

I find this API somewhat hard to follow, maybe just because it's unfamiliar.
I've mostly seen cache APIs look like one of:

  1. Cache(function<Value(Input)> Compute), Value Cache::get(Input)
  2. void Cache::put(Key, Value), Optional<Value> Cache::get(Key)
  3. Handle Cache::put(Value), Optional<Value> Handle::get()

This one is Slot Cache::create(), void Slot::update(function<Value()> LazyV), Value Slot::get().

It's similar-ish to 3, but has 3 nontrivial operations instead of 2, and the slot object is externally mutable instead of immutable, so it seems more complex. What does it buy us in exchange?

(1 & 2 work well with a natural key or inputs that are easy to compare, which we don't particularly have)

91

naming: IdleASTs doesn't mention the function of this class, which is a cache.
I'd suggest swapping the names: call the class ASTCache and the *instance* IdleASTs as that reflects its role within the TUScheduler.

For CachedAST, I think the relationship would be well-exposed by nesting it as ASTCache::Entry. This also gives you the friend for free, which seems like a hint that it's an appropriate structure.
(Though I'm not sure CachedAST is that useful)

153

as discussed offline, using a CachedAST* as a key shouldn't be necessary, the ParsedAST* should be enough I think.

157

document why

clangd/TUScheduler.h
48

nit: Policy would be more specific than Params

  • Rebase, fix merge conflicts
  • Simpler implemenataion of the Cache
  • s/IdleASTs/ASTCache

I have addressed the comments regarding the cache implementation. ASTBuilder ones are still pending, but I would appreciate the feedback on how TUScheduler.cpp looks like.

clangd/ClangdUnit.h
132–134

it's not sure you ever *want* to compute them separately?

I think that's exactly what we're doing now. The ASTs can now get built separately from the preamble, because they can be evicted from the cache and need to be rebuild while preamble is not changed.

The remaining members are trivial and unrelated enough that I do think if the references to the preamble/ast are removed, then moving the remaining members to ASTWorker or to a dumb struct, and making these free functions would make it easier to navigate the class structure.

Dumb struct SG, it's essentially what ASTWorker is right now.

clangd/TUScheduler.cpp
71

As discussed offline, now we have a simpler version that keeps unique_ptrs to idle ASTs and the clients are responsible for building the ASTs.
Note that it's not a "cache" per se, so we might want a different name for that.
@sammccall, you suggested to call it a pool, I find it reasonable. Should we name it ASTPool instead of ASTCache?

The cache looks way simpler now, thank you!
As discussed offline, flattening ASTBuilder right into ASTWorker still seems like a good idea to me, but happy with what you come up with there.

clangd/TUScheduler.cpp
66

I'd say a little more about the interaction here. e.g.

/// An LRU cache of idle ASTs.
/// Because we want to limit the overall number of these we retain, the cache
/// owns ASTs (and may evict them) while their workers are idle.
/// Workers borrow them when active, and return them when done.
71

I think the name is actually fine, it's still mostly a cache.
It does have things in common with a pool, but unrelated consumers can't share a resource, so I think that name is at least as misleading.

84

consider assert(findByKey(K) == LRU.end()) as a precondition

92

Just "run the expensive destructor outside the lock"?
the "may not" case seems unimportant and slightly confusing here

94

this line isn't actually needed right?

346

This failure doesn't get cached, correct? That's bad for performance.

But if we think this is always a clangd bug, it's probably fine. (and certainly simplifies things)

ilya-biryukov marked 2 inline comments as done.
  • Address review comments.
  • Remove ASTBuilder, make everything a helper function
ilya-biryukov added inline comments.May 28 2018, 9:06 AM
clangd/TUScheduler.cpp
71

SG, let's leave as is.

94

It isn't. But it makes an important operation (destructor of the AST) explicit, so I'd still keep it.

346

Thanks, good catch! I somehow missed it, since at some point the failure was cached.
I think we should cache it.

Failing ASTs is a clangd bug, of course.
However, they might be hard to fix if it's something inside clang, so I believe we should handle the failures gracefully in that case.

ilya-biryukov marked an inline comment as done.
  • s/Params/Policy

There are still a few nits I haven't addressed, but the other big change is now there: removing ASTBuilder, replacing it with free-standing functions. I'd say I liked the previous code better, since the use sites were a bit smaller and the functions have ridiculously many params now. But all of that seems manageable at this point.
@sammccall, let me know what you think. And I'll address the nits that are left tomorrow.

Impl looks good.
Is there a way we can reasonably test this? (specifically that we don't just store all the ASTs forever)

clangd/ClangdUnit.h
181

nit: i think filename here is only used for logging, just use Inputs.CompileCommand.Filename?

clangd/TUScheduler.cpp
103

please also explicitly mention what nullptr means (even though that's really up to the caller of put()). None vs nullptr leaves room for confusion.

clangd/TUScheduler.h
66

does this actually have more than one caller? what's the plan for exposing this option to embedders/CLI users (not saying we necessarily need the latter)?

ilya-biryukov marked 2 inline comments as done.
  • Added a unit test
  • Address review comments
  • Add ASTRetentionPolicy param to ClangdServer
clangd/ClangdUnit.h
181

Tried doing that, but the filename parameter is actually passed to PreambleCallback that updates the dynamic index. Using filename from compile command there seems fragile, so I kept the parameter for now.

clangd/TUScheduler.h
66

Yes, just one caller outside the tests.
The plan was to expose it only in ClangdServer for now. Giving this knob in CLI might be useful, if we have good reasons for that, but I hope that we could pick the default that work for everyone instead.
Added that as a parameter of ClangdServer.

Maybe we should move the default value of 3 to ClangdServer? WDYT?

  • Fixed formatting
sammccall accepted this revision.May 30 2018, 7:22 AM
sammccall added inline comments.
clangd/TUScheduler.h
66

CLI when needed SG.
I think we want to give our cloud folks the chance to set it to zero.
So maybe put the default in ClangdLSPServer? (or make it a param there too and move the value to main)

This revision is now accepted and ready to land.May 30 2018, 7:22 AM
ilya-biryukov added inline comments.Jun 1 2018, 3:12 AM
clangd/TUScheduler.h
66

After an online chat we've decided to keep the default inside the ASTRetentionPolicy struct.
If we add the CLI parameter, we can pass custom values from main to ClangdLSPServer.
And other clients already have a knob to set it in ClangdServer.

This revision was automatically updated to reflect the committed changes.