This is an archive of the discontinued LLVM Phabricator instance.

ioeric retitled this revision from "[clangd] Load YAML static index asyncrhonously." to "[clangd] Load YAML static index asynchronously." Aug 30 2018, 1:00 AM

revert unintended change

Harbormaster completed remote builds in B22085: Diff 163276. Aug 30 2018, 1:03 AM

ilya-biryukov added inline comments. Aug 30 2018, 6:06 AM

clangd/tool/ClangdMain.cpp
86	I believe this is a data race (multiple threads may run this line concurrently). You would want some synchronization around this; `std::shared_future` could be a good fit.

Any reason to not just wait for the index to load? Is this a UX concern or a problem when experimenting?

Fix data race.

Harbormaster completed remote builds in B22089: Diff 163321. Aug 30 2018, 6:52 AM

In D51475#1219133, @ilya-biryukov wrote:

Any reason to not just wait for the index to load? Is this a UX concern or a problem when experimenting?

The index loading can be slow. When using the LLVM YAML index, I need to wait for >10s before clangd starts giving me anything useful. We could potentially speed up loading (e.g. by replacing YAML), but the index can be arbitrarily large. I think it's an improvement in general to be able to get clangd running before the index is loaded.

clangd/tool/ClangdMain.cpp
86	Nice catch. I am a bit hesitant about using `shared_future` though. It could potentially get rid of the mutex but would require much more careful use. For example, `Index` can be set to the same value repeatedly, which makes me a bit nervous.
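
For reference, the mutex-based alternative mentioned here can be sketched as follows. This is a minimal standalone illustration, not the code from the updated diff: `FakeIndex` and `LazyLoadedIndex` are hypothetical names, and `FakeIndex` merely stands in for clangd's `SymbolIndex`. The lock makes the "poll the future, then move the result out" sequence atomic, so `std::future::get()` and the write to `Index` can no longer run on two threads at once.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <memory>
#include <mutex>

// Hypothetical stand-in for clangd's SymbolIndex, for illustration only.
struct FakeIndex {
  int NumSymbols = 0;
};

// Caches the result of an asynchronous load behind a mutex.
class LazyLoadedIndex {
public:
  explicit LazyLoadedIndex(std::future<std::unique_ptr<FakeIndex>> Load)
      : AsyncLoad(std::move(Load)) {
    assert(AsyncLoad.valid() && "expected a future with shared state");
  }

  // Returns nullptr while loading is in progress, or if loading produced
  // a null index.
  const FakeIndex *index() const {
    std::lock_guard<std::mutex> Lock(Mu);
    if (Index)
      return Index.get();
    if (!AsyncLoad.valid()) // Result already consumed, and it was null.
      return nullptr;
    if (AsyncLoad.wait_for(std::chrono::seconds(0)) !=
        std::future_status::ready)
      return nullptr;
    Index = AsyncLoad.get(); // Runs at most once, always under the lock.
    return Index.get();
  }

private:
  mutable std::mutex Mu;
  mutable std::unique_ptr<FakeIndex> Index;
  mutable std::future<std::unique_ptr<FakeIndex>> AsyncLoad;
};
```

The `valid()` check is an addition of this sketch: once `get()` has been called the future no longer has shared state, and polling it again would be undefined behavior.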

In D51475#1219184, @ioeric wrote:

The index loading can be slow. When using the LLVM YAML index, I need to wait for >10s before clangd starts giving me anything useful. We could potentially speed up loading (e.g. by replacing YAML), but the index can be arbitrarily large. I think it's an improvement in general to be able to get clangd running before the index is loaded.

I would trade off those 10 seconds for a consistent experience (i.e. avoiding confusing the users with different modes of completion, with and without the index, etc.).
But not terribly opposed to that.

clangd/tool/ClangdMain.cpp

The following pattern should give proper results:

class AsyncIndex : public SymbolIndex {
public:
  AsyncIndex(std::shared_future<std::unique_ptr<SymbolIndex>> IndexFut);
  // ...
private:
  const SymbolIndex *index() const {
    // Non-blocking poll: no index is available until loading has finished.
    if (IndexFut.wait_for(std::chrono::seconds(0)) != std::future_status::ready)
      return nullptr;
    return IndexFut.get().get();
  }

  std::shared_future<std::unique_ptr<SymbolIndex>> IndexFut;
};


AsyncIndex AI(std::async([]() { /* load the index here */ return Index; }));
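
A self-contained version of the pattern above, for readers who want to try it outside clangd. It is only a sketch under stated assumptions: `FakeIndex` is a hypothetical stand-in for `SymbolIndex`, and the class does not derive from any index interface. The point it illustrates is that `std::shared_future::get()` is a const member function that returns a reference to the stored value and can be called repeatedly, so no separate cached pointer (and no write to shared state) is needed.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <memory>

// Hypothetical stand-in for clangd's SymbolIndex, for illustration only.
struct FakeIndex {
  int NumSymbols = 0;
};

class AsyncIndex {
public:
  explicit AsyncIndex(std::shared_future<std::unique_ptr<FakeIndex>> Fut)
      : IndexFut(std::move(Fut)) {
    assert(IndexFut.valid() && "expected a future with shared state");
  }

  // Returns nullptr until the load has finished.
  const FakeIndex *index() const {
    if (IndexFut.wait_for(std::chrono::seconds(0)) !=
        std::future_status::ready)
      return nullptr;
    // get() hands back a const reference to the stored unique_ptr; the
    // second get() extracts the raw pointer without transferring ownership.
    return IndexFut.get().get();
  }

private:
  std::shared_future<std::unique_ptr<FakeIndex>> IndexFut;
};
```

A caller would typically build the future with `std::async(std::launch::async, ...)`; a `std::future` converts to a `std::shared_future` implicitly or via `share()`.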

In D51475#1219197, @ilya-biryukov wrote:

In D51475#1219184, @ioeric wrote:

The index loading can be slow. When using the LLVM YAML index, I need to wait for >10s before clangd starts giving me anything useful. We could potentially speed up loading (e.g. by replacing YAML), but the index can be arbitrarily large. I think it's an improvement in general to be able to get clangd running before the index is loaded.

I would trade off those 10 seconds for a consistent experience (i.e. avoiding confusing the users with different modes of completion, with and without the index, etc.).
But not terribly opposed to that.

I think the inconsistency is benign and should be worth 10 seconds (which for me personally is long). Besides, this is just the LLVM index; loading could take arbitrarily long depending on the corpus size.

clangd/tool/ClangdMain.cpp
86	My concern with this approach is that we are still calling `wait` even though it's not necessary once Index has been loaded.

ilya-biryukov added inline comments. Aug 30 2018, 7:55 AM

clangd/tool/ClangdMain.cpp
86	Maybe not bother before we know it causes actual performance issues? My bet is that it would never become a bottleneck, given the amount of work we're doing in addition to every call here.

I don't have a strong opinion on async vs sync - startup time is important and we shouldn't block simple AST-based functionality on the index, but this introduces some slightly confusing UX for that speed.

However I think this should be based on D51422 which extracts most of what you need out of MemIndex into the new SwapIndex.
So static index would just be initialized as an empty SwapIndex, then spawn a thread that loads the YAML and calls SwapIndex::reset.
This will avoid adding nontrivial threading stuff to the main file.
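
The shape of that suggestion can be sketched in isolation. This is not the `SwapIndex` from D51422; `SwappableIndex`, `FakeIndex`, and `loadInBackground` are hypothetical names used only to illustrate the idea: start with an empty index, let a background thread build the real one, and swap it in under a lock while readers keep whatever snapshot they already hold.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for clangd's SymbolIndex, for illustration only.
struct FakeIndex {
  int NumSymbols = 0;
};

// Holds the current index and allows it to be replaced at any time.
class SwappableIndex {
public:
  // Starts out with an empty index, so readers never see a null pointer.
  SwappableIndex() : Current(std::make_shared<FakeIndex>()) {}

  // Replaces the current index; safe to call from a loader thread.
  void reset(std::shared_ptr<const FakeIndex> New) {
    assert(New && "an index must always be present");
    std::lock_guard<std::mutex> Lock(Mu);
    Current = std::move(New);
  }

  // Returns a snapshot that stays alive even if reset() runs afterwards.
  std::shared_ptr<const FakeIndex> snapshot() const {
    std::lock_guard<std::mutex> Lock(Mu);
    return Current;
  }

private:
  mutable std::mutex Mu;
  std::shared_ptr<const FakeIndex> Current;
};

// Starts a background load and returns the thread so the caller can join it.
// LoadedSymbols stands in for the result of parsing the YAML file.
std::thread loadInBackground(SwappableIndex &Target, int LoadedSymbols) {
  return std::thread([&Target, LoadedSymbols] {
    auto Loaded = std::make_shared<FakeIndex>();
    Loaded->NumSymbols = LoadedSymbols;
    Target.reset(std::move(Loaded));
  });
}
```

With this shape the main file only needs to create the holder and start the loader; all locking lives inside the holder.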

(Sorry for not catching this earlier, and that the patch hasn't landed yet. Feel free to pick up the review, else @kbobyrev will take a first pass tomorrow I think.)

ioeric added inline comments. Aug 30 2018, 8:12 AM

clangd/tool/ClangdMain.cpp
86	Sure, the performance overhead is small in both cases. But I would still try to avoid `wait`; e.g. the OS can decide to `yield` even if the timeout is 0.
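
One way to avoid polling a future on every request, sketched here as an assumption rather than anything proposed in this review, is to have the loader publish the finished index through an atomic pointer. Readers then perform a single atomic load and never call `wait`. `PublishedIndex` and `FakeIndex` are hypothetical names, and the sketch assumes `publish()` is called at most once, by a single loader thread.

```cpp
#include <atomic>
#include <cassert>
#include <memory>

// Hypothetical stand-in for clangd's SymbolIndex, for illustration only.
struct FakeIndex {
  int NumSymbols = 0;
};

class PublishedIndex {
public:
  // Called at most once, by the loader thread, when loading has finished.
  void publish(std::unique_ptr<FakeIndex> Loaded) {
    assert(!Ready.load(std::memory_order_relaxed) && "publish() called twice");
    Owned = std::move(Loaded);
    // The release store makes the fully built index visible to any reader
    // whose acquire load observes the non-null pointer.
    Ready.store(Owned.get(), std::memory_order_release);
  }

  // Returns nullptr until publish() has run; never blocks or waits.
  const FakeIndex *index() const {
    return Ready.load(std::memory_order_acquire);
  }

private:
  std::unique_ptr<FakeIndex> Owned;
  std::atomic<const FakeIndex *> Ready{nullptr};
};
```

The trade-off is that the holder must outlive the loader thread and the single-publisher rule has to be enforced by the caller.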

In D51475#1219291, @sammccall wrote:

I don't have a strong opinion on async vs sync - startup time is important and we shouldn't block simple AST-based functionality on the index, but this introduces some slightly confusing UX for that speed.

However I think this should be based on D51422 which extracts most of what you need out of MemIndex into the new SwapIndex.
So static index would just be initialized as an empty SwapIndex, then spawn a thread that loads the YAML and calls SwapIndex::reset.
This will avoid adding nontrivial threading stuff to the main file.

Oops, didn't realize that. That sounds great. Thanks!

kbobyrev added inline comments. Aug 31 2018, 2:43 AM

clangd/tool/ClangdMain.cpp
47	Also, do we want only static index to be built asynchronously? Do we want it to be used only in our Clangd tool driver? Loading dynamic index might also be useful if we have this kind of behavior for the static one. I'm not completely sure about that, though.
61–117	Shouldn't this be `index()->estimateMemoryUsage()` when the index is built?
85	Do we want to be more explicit about loading the index asynchronously? If we're not, that fact might not be obvious to the user (e.g. "my Clangd is not frozen, but global completion doesn't work"); a CLI flag or documentation entry might solve that issue. Introducing tons of not-so-useful flags is not optimal, though, and I'm also not sure about that. What do you think?

sammccall mentioned this in D51638: [clangd] Load static index asynchronously, add tracing. Sep 4 2018, 8:54 AM

Dropping this in favor of D51638

sammccall mentioned this in rL341376: [clangd] Load static index asynchronously, add tracing. Sep 4 2018, 9:21 AM

sammccall mentioned this in rCTE341376: [clangd] Load static index asynchronously, add tracing.

Revision Contents

Path                          Size
clangd/tool/CMakeLists.txt    2 lines
clangd/tool/ClangdMain.cpp    88 lines

Diff 163275

clangd/tool/CMakeLists.txt

-include_directories(${CMAKE_CURRENT_SOURCE_DIR}/..)
+include_directories(${CMAKE_CURRENT_SOURCE_DIR}/../)

 add_clang_tool(clangd
   ClangdMain.cpp
   )

 set(LLVM_LINK_COMPONENTS
   support
   )
 [11 unchanged lines not shown]

clangd/tool/ClangdMain.cpp

 //===--- ClangdMain.cpp - clangd server loop ------------------------------===//
 //
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
 //===----------------------------------------------------------------------===//

 #include "ClangdLSPServer.h"
 #include "JSONRPCDispatcher.h"
 #include "Path.h"
 #include "Trace.h"
 #include "index/SymbolYAML.h"
 #include "index/dex/DexIndex.h"
 #include "clang/Basic/Version.h"
+#include "clang/Basic/Version.h"
+#include "llvm/ADT/STLExtras.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/Program.h"
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/raw_ostream.h"
 #include <cstdlib>
 #include <iostream>
 [9 unchanged lines not shown]
 static llvm::cl::opt<bool>
     UseDex("use-dex-index",
            llvm::cl::desc("Use experimental Dex static index."),
            llvm::cl::init(true), llvm::cl::Hidden);

 namespace {

 enum class PCHStorageFlag { Disk, Memory };

-// Build an in-memory static index for global symbols from a YAML-format file.
-// The size of global symbols should be relatively small, so that all symbols
-// can be managed in memory.
+// Loads the index asynchronously. This acts like an empty index before
+// finishing loading and proxies index requests to the loaded index after
+// loading.
+class AsyncLoadIndex : public SymbolIndex {
+public:
+  AsyncLoadIndex(
+      llvm::unique_function<std::unique_ptr<SymbolIndex>()> LoadIndex)
+      : AsyncLoad(runAsync(std::move(LoadIndex))) {}
+
+  bool
+  fuzzyFind(const FuzzyFindRequest &Req,
+            llvm::function_ref<void(const Symbol &)> Callback) const override {
+    if (!index())
+      return false; // More
+    return index()->fuzzyFind(Req, Callback);
+  }
+
+  void
+  lookup(const LookupRequest &Req,
+         llvm::function_ref<void(const Symbol &)> Callback) const override {
+    if (!index())
+      return;
+    return index()->lookup(Req, Callback);
+  }
+
+  void findOccurrences(const OccurrencesRequest &Req,
+                       llvm::function_ref<void(const SymbolOccurrence &)>
+                           Callback) const override {
+    if (!index())
+      return;
+    return index()->findOccurrences(Req, Callback);
+  }
+
+  size_t estimateMemoryUsage() const override { return 0; }
+
+private:
+  const SymbolIndex *index() const {
+    if (Index)
+      return Index.get();
+    if (AsyncLoad.wait_for(std::chrono::seconds(0)) !=
+        std::future_status::ready)
+      return nullptr;
+    Index = AsyncLoad.get();
+    return Index.get();
+  }
+  mutable std::unique_ptr<SymbolIndex> Index;
+  mutable std::future<std::unique_ptr<SymbolIndex>> AsyncLoad;
+};
+// Asynchronously build an in-memory static index for global symbols from a
+// YAML-format file. The size of global symbols should be relatively small, so
+// that all symbols can be managed in memory.
 std::unique_ptr<SymbolIndex> buildStaticIndex(llvm::StringRef YamlSymbolFile) {
-  auto Buffer = llvm::MemoryBuffer::getFile(YamlSymbolFile);
-  if (!Buffer) {
-    llvm::errs() << "Can't open " << YamlSymbolFile << "\n";
-    return nullptr;
-  }
-  auto Slab = symbolsFromYAML(Buffer.get()->getBuffer());
-  SymbolSlab::Builder SymsBuilder;
-  for (auto Sym : Slab)
-    SymsBuilder.insert(Sym);
-
-  return UseDex ? dex::DexIndex::build(std::move(SymsBuilder).build())
-                : MemIndex::build(std::move(SymsBuilder).build());
+  return llvm::make_unique<AsyncLoadIndex>(
+      [YamlSymbolFile]() -> std::unique_ptr<SymbolIndex> {
+        trace::Span Tracer("Build static index");
+
+        auto Buffer = llvm::MemoryBuffer::getFile(YamlSymbolFile);
+        if (!Buffer) {
+          llvm::errs() << "Can't open " << YamlSymbolFile << "\n";
+          return nullptr;
+        }
+        SymbolSlab::Builder SymsBuilder;
+        {
+          trace::Span Tracer("YAML to symbols");
+          auto Slab = symbolsFromYAML(Buffer.get()->getBuffer());
+          for (auto Sym : Slab)
+            SymsBuilder.insert(Sym);
+        }
+
+        trace::Span Build("Build index");
+        return UseDex ? dex::DexIndex::build(std::move(SymsBuilder).build())
+                      : MemIndex::build(std::move(SymsBuilder).build());
+      });
 }

 } // namespace

 static llvm::cl::opt<Path> CompileCommandsDir(
     "compile-commands-dir",
     llvm::cl::desc("Specify a path to look for compile_commands.json. If path "
                    "is invalid, clangd will look in the current directory and "
 [261 unchanged lines not shown]

[clangd] Load YAML static index asynchronously. (Abandoned, Public)
