This is an archive of the discontinued LLVM Phabricator instance.

[ORC] Fix the move-captured std::unique_ptr vs. std::function dilemma
Abandoned · Public

Authored by sgraenitz on Jan 6 2020, 1:16 PM.

Details

Reviewers

lhames
bkramer
pree-jackie
shafik

Summary

C++14 has been available for a while now and the FIXME is getting confusing. There was an earlier attempt to fix this the way (I think) it was originally intended: https://github.com/llvm/llvm-project/commit/ce74c3b19f5b#diff-f27b9f96303108227f0119d85d6c3ddf (it was rolled back here: https://github.com/llvm/llvm-project/commit/6baaa4be7831#diff-f27b9f96303108227f0119d85d6c3ddf).

  ES->setDispatchMaterialization(
      [this](JITDylib &JD, std::unique_ptr<MaterializationUnit> MU) {
+       auto Work = [MU = std::move(MU), &JD] { MU->doMaterialize(JD); };
-       // FIXME: Switch to move capture once we have c++14.
-       auto SharedMU = std::shared_ptr<MaterializationUnit>(std::move(MU));
-       auto Work = [SharedMU, &JD]() { SharedMU->doMaterialize(JD); };
        CompileThreads->async(std::move(Work));
      });

Unfortunately this doesn't work, because the lambda that captures the std::unique_ptr is passed to ThreadPool::async() as a std::function. Because the lambda holds a std::unique_ptr, its copy constructor is implicitly deleted, but the standard requires the callable stored in a std::function to be copy-constructible.
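
The constraint can be checked in isolation. The following is a minimal sketch, not code from this patch: `Unit` is a hypothetical stand-in for MaterializationUnit, and the helper function names are invented for illustration. It only shows that a closure move-capturing a std::unique_ptr is not copy-constructible, while one capturing a std::shared_ptr is and can therefore be stored in a std::function.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for MaterializationUnit (assumption, not the ORC class).
struct Unit {
  int Runs = 0;
  void run() { ++Runs; }
};

// A closure that move-captures a std::unique_ptr has a deleted copy constructor.
inline bool moveCaptureIsCopyable() {
  auto Work = [U = std::make_unique<Unit>()] { U->run(); };
  Work();
  return std::is_copy_constructible<decltype(Work)>::value;
}

// A closure that captures a std::shared_ptr by value remains copyable.
inline bool sharedCaptureIsCopyable() {
  auto Work = [U = std::make_shared<Unit>()] { U->run(); };
  Work();
  return std::is_copy_constructible<decltype(Work)>::value;
}

// The copyable closure can be stored in and invoked through std::function.
inline int runThroughStdFunction() {
  auto U = std::make_shared<Unit>();
  std::function<void()> Task = [U] { U->run(); };
  Task();
  return U->Runs;
}
```

Assigning the first closure to a `std::function<void()>` is what fails to compile, which matches the error the summary describes.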

What are the options? We certainly want to use ThreadPool, so we are bound to std::function. I don't think we want to capture a raw pointer by value here, especially seeing that we pass it on as a std::function next.

The current workaround creates and copies a std::shared_ptr. That's quite expensive. As a first step we can move-capture it in the Work lambda to avoid the copy. I don't think we can avoid creating it (assuming we keep a smart pointer), so why not receive it as a std::shared_ptr directly? The only disproportionate overhead I see is in the materializeOnCurrentThread case. I changed that case to not call the dispatch function at all.
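
To make the copy versus move-capture difference concrete, here is a small sketch under stated assumptions: `Unit` is a hypothetical placeholder type and both helper functions are invented for this illustration. It compares the reference count after copy-capturing a std::shared_ptr with the count after move-capturing it.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical placeholder for the unit of work (assumption).
struct Unit {
  void run() {}
};

// Copy capture: the local pointer and the stored closure both own the unit,
// so the reference count was incremented once for the capture.
inline long useCountWithCopyCapture() {
  auto SharedU = std::make_shared<Unit>();
  std::function<void()> Work = [SharedU] { SharedU->run(); };
  Work();
  return SharedU.use_count();
}

// Move capture: ownership is transferred into the closure, so only the
// closure owns the unit. A weak_ptr observes the count without owning.
inline long useCountWithMoveCapture() {
  auto SharedU = std::make_shared<Unit>();
  std::weak_ptr<Unit> Observer = SharedU;
  std::function<void()> Work = [U = std::move(SharedU)] { U->run(); };
  Work();
  return Observer.use_count();
}
```

The closure stays copy-constructible in both variants, so either one is accepted by std::function; the move capture just skips the extra atomic increment and decrement.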

What do you think?

Diff Detail

Repository

rG LLVM Github Monorepo

Build Status

Buildable 43369
Build 44191: arc lint + arc unit

Event Timeline

sgraenitz created this revision. Jan 6 2020, 1:16 PM

Herald added a project: Restricted Project. Jan 6 2020, 1:16 PM

Herald added a subscriber: hiraditya.

Harbormaster completed remote builds in B43369: Diff 236446. Jan 6 2020, 1:20 PM

The current workaround creates and copies a std::shared_ptr. That's quite expensive. As a first step we can move-capture it in the Work lambda to avoid the copy. I don't think we can avoid creating it (assuming we keep a smart pointer), so why not receive it as a std::shared_ptr directly? The only disproportionate overhead I see is in the materializeOnCurrentThread case. I changed that case to not call the dispatch function at all.

In the context of running a Materializer I don’t think we need to be worried about the shared_ptr construction overhead, but it would still be nice to tidy this up.

For our purposes it would be ideal if ThreadPool::async took an llvm::unique_function rather than a std::function. I wonder if that would work for other clients too? I think 99% of ThreadPool use cases would work equally well or better with a unique_function, and it looks like unique_function should be constructible from a std::function (with minor runtime overhead for execution) for the 1% of cases that are not supported.
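
To illustrate why a move-only function wrapper would remove the problem, here is a minimal sketch. `MoveOnlyTask` is a hypothetical, deliberately tiny type written for this note; it is not llvm::unique_function and does not reflect its implementation. `Unit` and the helper functions are likewise assumptions. The wrapper only requires its callable to be movable, so a closure owning a std::unique_ptr fits, and an existing std::function can still be wrapped.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical move-only task wrapper (assumption; NOT llvm::unique_function).
class MoveOnlyTask {
  struct Concept {
    virtual ~Concept() = default;
    virtual void call() = 0;
  };
  template <typename F> struct Model final : Concept {
    F Fn;
    explicit Model(F Fn) : Fn(std::move(Fn)) {}
    void call() override { Fn(); }
  };
  std::unique_ptr<Concept> Impl;

public:
  // Only requires F to be move-constructible, unlike std::function.
  template <typename F>
  MoveOnlyTask(F Fn) : Impl(std::make_unique<Model<F>>(std::move(Fn))) {}
  MoveOnlyTask(MoveOnlyTask &&) = default;
  MoveOnlyTask &operator=(MoveOnlyTask &&) = default;
  void operator()() { Impl->call(); }
};

// Hypothetical stand-in for MaterializationUnit (assumption).
struct Unit {
  int Runs = 0;
  void run() { ++Runs; }
};

// The closure owns the unit exclusively; no shared_ptr is needed.
inline int runMoveCapturedTask() {
  int Observed = 0;
  auto U = std::make_unique<Unit>();
  MoveOnlyTask Task([U = std::move(U), &Observed] {
    U->run();
    Observed = U->Runs;
  });
  MoveOnlyTask Moved = std::move(Task); // moving works; copying would not compile
  Moved();
  return Observed;
}

// A copyable std::function can still be wrapped, at the cost of one more
// level of indirection per call.
inline int runWrappedStdFunction() {
  int Calls = 0;
  std::function<void()> Legacy = [&Calls] { ++Calls; };
  MoveOnlyTask Task(Legacy);
  Task();
  return Calls;
}
```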

Indeed, that sounds like a better solution.

sgraenitz mentioned this in D72486: Add ThinLtoJIT example. Jan 19 2020, 12:12 PM

Revision Contents

Path                                                   Size
llvm/examples/SpeculativeJIT/SpeculativeJIT.cpp        9 lines
llvm/include/llvm/ExecutionEngine/Orc/Core.h           20 lines
llvm/lib/ExecutionEngine/Orc/LLJIT.cpp                 8 lines
llvm/unittests/ExecutionEngine/Orc/CoreAPIsTest.cpp    4 lines

Diff 236446

llvm/examples/SpeculativeJIT/SpeculativeJIT.cpp

@@ 106 lines not shown @@ SpeculativeJIT(
         ConcurrentIRCompiler(std::move(JTMB))),
         S(Imps, *this->ES),
         SpeculateLayer(*this->ES, CompileLayer, S, Mangle, BlockFreqQuery()),
         CODLayer(this->ES, SpeculateLayer, this->LCTMgr,
                  std::move(ISMBuilder)) {
     MainJD.addGenerator(std::move(ProcessSymbolsGenerator));
     this->CODLayer.setImplMap(&Imps);
     this->ES->setDispatchMaterialization(
-        [this](JITDylib &JD, std::unique_ptr<MaterializationUnit> MU) {
-          // FIXME: Switch to move capture once we have C++14.
-          auto SharedMU = std::shared_ptr<MaterializationUnit>(std::move(MU));
-          auto Work = [SharedMU, &JD]() { SharedMU->doMaterialize(JD); };
-          CompileThreads.async(std::move(Work));
+        [this](JITDylib &JD, std::shared_ptr<MaterializationUnit> MU) {
+          CompileThreads.async(
+              [MU = std::move(MU), &JD]() { MU->doMaterialize(JD); });
         });
     ExitOnErr(S.addSpeculationRuntime(MainJD, Mangle));
     LocalCXXRuntimeOverrides CXXRuntimeoverrides;
     ExitOnErr(CXXRuntimeoverrides.enable(MainJD, Mangle));
   }
 
   static std::unique_ptr<SectionMemoryManager> createMemMgr() {
     return std::make_unique<SectionMemoryManager>();
@@ 59 lines not shown @@

llvm/include/llvm/ExecutionEngine/Orc/Core.h

@@ 1,054 lines not shown @@ class ExecutionSession {
   friend class JITDylib;
 
 public:
   /// For reporting errors.
   using ErrorReporter = std::function<void(Error)>;
 
   /// For dispatching MaterializationUnit::materialize calls.
   using DispatchMaterializationFunction = std::function<void(
-      JITDylib &JD, std::unique_ptr<MaterializationUnit> MU)>;
+      JITDylib &JD, std::shared_ptr<MaterializationUnit> MU)>;
 
   /// Construct an ExecutionSession.
   ///
   /// SymbolStringPools may be shared between ExecutionSessions.
   ExecutionSession(std::shared_ptr<SymbolStringPool> SSP = nullptr);
 
   /// Add a symbol name to the SymbolStringPool and return a pointer to it.
   SymbolStringPtr intern(StringRef SymName) { return SSP->intern(SymName); }
@@ 35 lines not shown @@ ExecutionSession &setErrorReporter(ErrorReporter ReportError) {
     return *this;
   }
 
   /// Report a error for this execution session.
   ///
   /// Unhandled errors can be sent here to log them.
   void reportError(Error Err) { ReportError(std::move(Err)); }
 
-  /// Set the materialization dispatch function.
+  /// Set a materialization dispatch function. It can be used to delegate
+  /// compilation to different threads.
   ExecutionSession &setDispatchMaterialization(
       DispatchMaterializationFunction DispatchMaterialization) {
     this->DispatchMaterialization = std::move(DispatchMaterialization);
     return *this;
   }
 
   void legacyFailQuery(AsynchronousSymbolQuery &Q, Error Err);
 
@@ 66 lines not shown @@ public:
   /// Materialize the given unit.
   void dispatchMaterialization(JITDylib &JD,
                                std::unique_ptr<MaterializationUnit> MU) {
     LLVM_DEBUG({
       runSessionLocked([&]() {
         dbgs() << "Dispatching " << *MU << " for " << JD.getName() << "\n";
       });
     });
-    DispatchMaterialization(JD, std::move(MU));
+    if (DispatchMaterialization) {
+      DispatchMaterialization(JD, std::move(MU));
+    } else {
+      MU->doMaterialize(JD);
+    }
   }
 
   /// Dump the state of all the JITDylibs in this session.
   void dump(raw_ostream &OS);
 
 private:
   static void logErrorsToStdErr(Error Err) {
     logAllUnhandledErrors(std::move(Err), errs(), "JIT session error: ");
   }
 
-  static void
-  materializeOnCurrentThread(JITDylib &JD,
-                             std::unique_ptr<MaterializationUnit> MU) {
-    MU->doMaterialize(JD);
-  }
-
   void runOutstandingMUs();
 
   mutable std::recursive_mutex SessionMutex;
   std::shared_ptr<SymbolStringPool> SSP;
   VModuleKey LastKey = 0;
   ErrorReporter ReportError = logErrorsToStdErr;
-  DispatchMaterializationFunction DispatchMaterialization =
-      materializeOnCurrentThread;
+  DispatchMaterializationFunction DispatchMaterialization = nullptr;
 
   std::vector<std::unique_ptr<JITDylib>> JDs;
 
   // FIXME: Remove this (and runOutstandingMUs) once the linking layer works
   // with callbacks from asynchronous queries.
   mutable std::recursive_mutex OutstandingMUsMutex;
   std::vector<std::pair<JITDylib *, std::unique_ptr<MaterializationUnit>>>
       OutstandingMUs;
@@ 90 lines not shown @@
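
The Core.h change follows a general pattern: leave the dispatch callback empty by default and run the work inline when none is set, so the conversion from std::unique_ptr to std::shared_ptr only happens when a dispatcher is installed. The sketch below shows that pattern in isolation. `Session`, `Unit`, and the helper functions are hypothetical names invented for this note and are not the ORC classes.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical unit of work that bumps an external counter (assumption).
struct Unit {
  int *Counter;
  void run() { ++*Counter; }
};

// Hypothetical session with an optional dispatch callback (assumption).
class Session {
  using DispatchFn = std::function<void(std::shared_ptr<Unit>)>;
  DispatchFn Dispatch; // empty by default: run on the current thread

public:
  void setDispatch(DispatchFn D) { Dispatch = std::move(D); }

  void dispatch(std::unique_ptr<Unit> U) {
    if (Dispatch)
      Dispatch(std::move(U)); // unique_ptr converts to shared_ptr only here
    else
      U->run(); // no callback installed, so no shared_ptr is created
  }
};

// Without a dispatcher the unit runs immediately.
inline int inlineRunCount() {
  Session S;
  int C = 0;
  S.dispatch(std::make_unique<Unit>(Unit{&C}));
  return C;
}

// With a dispatcher the unit is handed over and runs only when the
// dispatcher's owner decides to run it (here: a simple queue).
inline bool deferredRunIsDelayed() {
  Session S;
  std::vector<std::shared_ptr<Unit>> Queue;
  S.setDispatch(
      [&Queue](std::shared_ptr<Unit> U) { Queue.push_back(std::move(U)); });
  int C = 0;
  S.dispatch(std::make_unique<Unit>(Unit{&C}));
  int Before = C;
  for (auto &U : Queue)
    U->run();
  return Before == 0 && C == 1;
}
```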

llvm/lib/ExecutionEngine/Orc/LLJIT.cpp

@@ 149 lines not shown @@ else {
     CompileLayer = std::make_unique<IRCompileLayer>(
         ES, ObjTransformLayer, std::move(CompileFunction));
   }
 
   if (S.NumCompileThreads > 0) {
     CompileLayer->setCloneToNewContextOnEmit(true);
     CompileThreads = std::make_unique<ThreadPool>(S.NumCompileThreads);
     ES->setDispatchMaterialization(
-        [this](JITDylib &JD, std::unique_ptr<MaterializationUnit> MU) {
-          // FIXME: Switch to move capture once we have c++14.
-          auto SharedMU = std::shared_ptr<MaterializationUnit>(std::move(MU));
-          auto Work = [SharedMU, &JD]() { SharedMU->doMaterialize(JD); };
-          CompileThreads->async(std::move(Work));
+        [this](JITDylib &JD, std::shared_ptr<MaterializationUnit> MU) {
+          CompileThreads->async(
+              [MU = std::move(MU), &JD]() { MU->doMaterialize(JD); });
         });
   }
 }
 
 std::string LLJIT::mangle(StringRef UnmangledName) {
   std::string MangledName;
   {
     raw_string_ostream MangledNameStream(MangledName);
@@ 94 lines not shown @@

llvm/unittests/ExecutionEngine/Orc/CoreAPIsTest.cpp

@@ 942 lines not shown @@ TEST_F(CoreAPIsStandardTest, TestBasicWeakSymbolMaterialization) {
   EXPECT_TRUE(BarMaterialized) << "Bar was not materialized at all";
   EXPECT_TRUE(DuplicateBarDiscarded)
       << "Duplicate bar definition not discarded";
 }
 
 TEST_F(CoreAPIsStandardTest, DefineMaterializingSymbol) {
   bool ExpectNoMoreMaterialization = false;
   ES.setDispatchMaterialization(
-      [&](JITDylib &JD, std::unique_ptr<MaterializationUnit> MU) {
+      [&](JITDylib &JD, std::shared_ptr<MaterializationUnit> MU) {
         if (ExpectNoMoreMaterialization)
           ADD_FAILURE() << "Unexpected materialization";
         MU->doMaterialize(JD);
       });
 
   auto MU = std::make_unique<SimpleMaterializationUnit>(
       SymbolFlagsMap({{Foo, FooSym.getFlags()}}),
       [&](MaterializationResponsibility R) {
@@ 161 lines not shown @@ EXPECT_EQ(FooLookupResult.getFlags(), FooSym.getFlags())
       << "lookup returned incorrect flags";
 }
 
 TEST_F(CoreAPIsStandardTest, TestLookupWithThreadedMaterialization) {
 #if LLVM_ENABLE_THREADS
 
   std::thread MaterializationThread;
   ES.setDispatchMaterialization(
-      [&](JITDylib &JD, std::unique_ptr<MaterializationUnit> MU) {
+      [&](JITDylib &JD, std::shared_ptr<MaterializationUnit> MU) {
         MaterializationThread =
             std::thread([MU = std::move(MU), &JD] { MU->doMaterialize(JD); });
       });
 
   cantFail(JD.define(absoluteSymbols({{Foo, FooSym}})));
 
   auto FooLookupResult = cantFail(ES.lookup(makeJITDylibSearchOrder(&JD), Foo));
 
@@ 126 lines not shown @@
