This is an archive of the discontinued LLVM Phabricator instance.

[Frontend] Cache preamble-related data
Needs ReviewPublic

Authored by yvvan on Jun 19 2018, 1:09 AM.

Details

Summary

In case two translation units are created for the same file - reuse preamble data to reduce memory and save time on extra preamble generation.

Diff Detail

Event Timeline

yvvan created this revision.Jun 19 2018, 1:09 AM

I would argue this should be handled by the clients instead. Adding global state and locking is complicated. (And ASTUnit is complicated enough).

What are the use-cases of creating multiple ASTUnit inside the same process for the same file? Which clients do that and why they can't have a single ASTUnit per file?

yvvan added a comment.Jul 4 2018, 4:19 AM

@ilya-biryukov Sorry. I didn't have time to post comments here.
The usecase that we have is a supportive translation unit for code completion. Probably you use something similar in clangd not to wait for the TU to be reparsed after a change?
The gain from this change is both performance and memory but I don't have measurements under my hand right now. And of course they are only valid for this usecase.

I agree that the optimization is compelling, we do also share preamble in clangd and run code completion in parallel with other actions that need an AST.
However, I would argue that a proper way to introduce the optimization would be to change interface of ASTUnit that would somehow allow reusing the PCHs between different ASTUnits.

Having the cache is fine, but I suggest we:

  1. Make it explicit in the interface, e.g. ASTUnit ctor will accept an extra PreambleCache parameter.
  2. Have it off by default.

Whenever ASTUnit needs to rebuild a preamble, it can look into the cache and check there's a cached preamble and whether it matches the one that we're building.
After the rebuild is finished, it will try to push the built preamble into the cache.
ASTUnit can store the shared reference to a const preamble, i.e. std::shared_ptr<const PrecompiledPreamble>. The cache itself will store the weak_ptr<const PrecompiledPreamble>, thus allowing the preambles to be removed when they're not used anymore.

It would allow to:

  1. avoid global state in ASTUnit, the cache will have to be created somewhere up the stack and passed to ASTUnit explicitly.
  2. avoid complicating ASTUnit with synchronization code, it can be moved to preamble cache.
  3. keep changes to the ASTUnit minimal. The only function that needs to be changed is the one building the preamble. Given how complicated ASTUnit is already, I think that's a big win.