The lambda that handles finalization in the memory manager should run on a separate thread; otherwise it might block the executor process control from processing incoming messages in listenLoop().
This happened in the DebugObjectManagerPlugin: in response to the finalization of a module, the plugin transfers the debug object and calls a registration routine on the executor. The plugin must block materialization for the module until both transfer and registration have finished, to guarantee that the debugger (attached on the executor side) has had the chance to instrument the newly emitted code.
Excerpt from the deadlocked call stack before this patch:
    __psynch_cvwait
    _pthread_cond_wait
    std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&)
    ...
    llvm::orc::EPCGenericJITLinkMemoryManager::allocate(llvm::jitlink::JITLinkDylib const*, ...
    llvm::orc::ELFDebugObject::finalizeWorkingMemory(llvm::jitlink::JITLinkContext&)
--> llvm::orc::DebugObject::finalizeAsync(std::__1::function<void (llvm::Expected<llvm::sys::MemoryBlock>)>)
    llvm::orc::DebugObjectManagerPlugin::notifyEmitted(llvm::orc::MaterializationResponsibility&)
    llvm::orc::ObjectLinkingLayer::notifyEmitted(llvm::orc::MaterializationResponsibility&, ...
    llvm::orc::ObjectLinkingLayerJITLinkContext::notifyFinalized(...
    ...
--> llvm::orc::EPCGenericJITLinkMemoryManager::Alloc::finalizeAsync(std::__1::function<void (llvm::Error)>)::'lambda'(llvm::Error, llvm::Error)::operator()(llvm::Error, llvm::Error) const
    ...
    llvm::orc::SimpleRemoteEPC::handleResult(unsigned long long, ...
    llvm::orc::SimpleRemoteEPC::handleMessage(llvm::orc::SimpleRemoteEPCOpcode, ...
    llvm::orc::FDSimpleRemoteEPCTransport::listenLoop()
    ...
The two arrows mark asynchronous finalization handlers that are candidates for new thread entry points. Using the plugin's own DebugObject::finalizeAsync() is not sufficient, though, because DebugObjectManagerPlugin::notifyEmitted() explicitly waits for transfer and registration, so the thread that calls it would still block:
https://github.com/llvm/llvm-project/blob/6e60bb6883178cf14e6fd47a6789495636e4322f/llvm/lib/ExecutionEngine/Orc/DebugObjectManagerPlugin.cpp#L442
It doesn't appear there is a thread pool that could accommodate this task. Is a detached thread ok?
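For discussion, here is a minimal standalone sketch of the pattern in question. It does not use any LLVM APIs: `Listener` and `runDetachedFinalizeDemo` are hypothetical names, and the single-threaded message queue only stands in for listenLoop(). The handler that arrives on the listener thread starts a detached thread for the blocking work, so the listener stays free to deliver the reply that the work is waiting for.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the transport's listen loop: a single thread
// that runs queued message handlers strictly one after another.
class Listener {
public:
  void post(std::function<void()> Msg) {
    {
      std::lock_guard<std::mutex> Lock(M);
      Queue.push_back(std::move(Msg));
    }
    CV.notify_one();
  }

  void run() {
    for (;;) {
      std::function<void()> Msg;
      {
        std::unique_lock<std::mutex> Lock(M);
        CV.wait(Lock, [this] { return !Queue.empty(); });
        Msg = std::move(Queue.front());
        Queue.pop_front();
      }
      if (!Msg)
        return; // An empty handler is the shutdown signal.
      Msg();
    }
  }

private:
  std::mutex M;
  std::condition_variable CV;
  std::deque<std::function<void()>> Queue;
};

// Assumption: models "finalization result arrives, handler needs a second
// round trip through the same listener before it can complete".
int runDetachedFinalizeDemo() {
  Listener L;
  std::thread ListenThread([&L] { L.run(); });

  std::promise<int> Done;
  std::future<int> Result = Done.get_future();

  // This handler runs on the listener thread.
  L.post([&L, &Done] {
    // Waiting for the reply right here would block the only thread that can
    // deliver it. The blocking part moves to a detached thread instead.
    std::thread(
        [&L](std::promise<int> Finished) {
          auto Reply = std::make_shared<std::promise<int>>();
          std::future<int> ReplyValue = Reply->get_future();
          // The reply is delivered by the (now unblocked) listener thread.
          L.post([Reply] { Reply->set_value(42); });
          Finished.set_value(ReplyValue.get());
        },
        std::move(Done))
        .detach();
  });

  int Value = Result.get();
  L.post(nullptr); // Shut the listener down.
  ListenThread.join();
  return Value;
}
```

The detached thread owns everything it touches after signalling completion (the moved-in promise and the shared reply state), which is the main lifetime concern with detaching. A thread pool would make that ownership easier to reason about, if one were available.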