This is an archive of the discontinued LLVM Phabricator instance.

Differential D78207

[MLIR] Allow for multiple gpu modules during translation.
ClosedPublic

Authored by herhut on Apr 15 2020, 7:03 AM.

Details

Reviewers

ftynse
jdoerfert
csigg

Commits

rG69040d5b0bfa: [MLIR] Allow for multiple gpu modules during translation.

Summary

This change makes the ModuleTranslation thread-safe by locking on the
LLVMContext. Furthermore, we now clone the LLVM module into a new
context when compiling to PTX, similar to what the OrcJit does.
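
As a rough illustration of the locking half of this change, the sketch below guards a non-thread-safe shared object with a mutex exposed through an accessor, in the spirit of `getLLVMContextMutex()`. `SharedContext`, `internUnlocked`, `internName`, and `getMutex` are hypothetical stand-ins written against the C++ standard library only; they are not the MLIR or LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a context whose internal tables are not
// thread-safe (which is how this revision treats the LLVMContext).
class SharedContext {
public:
  // Accessor for the mutex that callers must hold while using the context.
  std::recursive_mutex &getMutex() { return mutex; }

  // Returns a stable id for `name`. The caller must hold getMutex().
  std::size_t internUnlocked(const std::string &name) {
    auto it = ids.find(name);
    if (it != ids.end())
      return it->second;
    std::size_t id = ids.size();
    ids.emplace(name, id);
    return id;
  }

private:
  std::recursive_mutex mutex;
  std::unordered_map<std::string, std::size_t> ids;
};

// Holds the lock for the duration of the access, similar to the scoped lock
// taken at the top of convertGlobals() and convertFunctions() in this diff.
std::size_t internName(SharedContext &ctx, const std::string &name) {
  assert(!name.empty() && "names must be non-empty");
  std::lock_guard<std::recursive_mutex> lock(ctx.getMutex());
  return ctx.internUnlocked(name);
}
```

Callers on different threads can then share one `SharedContext` as long as every access goes through `internName` or otherwise takes `getMutex()` first.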

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

herhut created this revision. · Apr 15 2020, 7:03 AM

Herald added a reviewer: jdoerfert. · Apr 15 2020, 7:03 AM

Herald added a project: Restricted Project.

Herald added subscribers: llvm-commits, frgossen, grosul1 and 15 others.

Harbormaster failed remote builds in B53356: Diff 257707! · Apr 15 2020, 7:06 AM

csigg accepted this revision. · Apr 15 2020, 8:03 AM

csigg added inline comments.

mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h
29	Is this needed?

This revision is now accepted and ready to land. · Apr 15 2020, 8:03 AM

Rebase and mild cleanup.

herhut marked an inline comment as done. · Apr 15 2020, 8:27 AM

herhut added inline comments.

mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h
29	No, a leftover from using `std::mutex` in an earlier version. Thanks!

Harbormaster failed remote builds in B53372: Diff 257731! · Apr 15 2020, 8:44 AM

I don't suppose there is a way to make this method only visible to ModuleTranslation...

mehdi_amini added inline comments. · Apr 15 2020, 10:15 PM

mlir/lib/Conversion/GPUToCUDA/ConvertKernelFuncToCubin.cpp

114

Why don't you use the high level API the same way this is done in the ExecutionEngine?

// TODO(zinenko): Reevaluate model of ownership of LLVMContext in LLVMDialect.
SmallVector<char, 1> buffer;
{
  llvm::raw_svector_ostream os(buffer);
  WriteBitcodeToFile(*llvmModule, os);
}
llvm::MemoryBufferRef bufferRef(StringRef(buffer.data(), buffer.size()),
                                "cloned module buffer");
auto expectedModule = parseBitcodeFile(bufferRef, *ctx);
if (!expectedModule)
  return expectedModule.takeError();
std::unique_ptr<Module> deserModule = std::move(*expectedModule);
auto dataLayout = deserModule->getDataLayout();

I'd also like a TODO also here for Alex to actually fix: the fact that we have a LLVMContext tied to the LLVM dialect is really something we need to fix.

114

(It may be even worth extracting this logic in a helper by the way)
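
The helper discussed here clones a module by writing it to a buffer and parsing that buffer in another context. The following is a minimal sketch of that round-trip idea using only the C++ standard library. `ToyContext`, `ToyModule`, `writeToBuffer`, and `parseFromBuffer` are invented for illustration and only loosely mirror `WriteBitcodeToFile` and `parseBitcodeFile`; the toy format assumes symbols contain no newlines.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Toy context that owns uniqued strings, loosely analogous to how an
// LLVMContext owns the types and constants of its modules.
struct ToyContext {
  std::set<std::string> pool;
  const std::string *intern(const std::string &s) {
    return &*pool.insert(s).first;
  }
};

// Toy module whose contents point into the context that created it.
struct ToyModule {
  ToyContext *context;
  std::vector<const std::string *> symbols;
};

// Serializes the module to a context-independent buffer, one symbol per line.
std::string writeToBuffer(const ToyModule &module) {
  std::ostringstream os;
  for (const std::string *sym : module.symbols)
    os << *sym << '\n';
  return os.str();
}

// Rebuilds a module from the buffer, interning everything in `context`.
std::unique_ptr<ToyModule> parseFromBuffer(const std::string &buffer,
                                           ToyContext &context) {
  auto module = std::make_unique<ToyModule>();
  module->context = &context;
  std::istringstream is(buffer);
  for (std::string line; std::getline(is, line);)
    module->symbols.push_back(context.intern(line));
  return module;
}

// Clones by round-tripping through the serialized form, so the result shares
// no storage with the source context.
std::unique_ptr<ToyModule> cloneIntoNewContext(ToyContext &context,
                                               const ToyModule &module) {
  assert(&context != module.context && "expected a different context");
  return parseFromBuffer(writeToBuffer(module), context);
}
```

Because the clone owns nothing from the original context, a thread can keep working on it without holding the lock that protects the shared context.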

Extract helper and rebase.

Harbormaster failed remote builds in B53544: Diff 258012! · Apr 16 2020, 4:30 AM

ftynse accepted this revision. · Apr 16 2020, 4:36 AM

ftynse added inline comments.

mlir/lib/Conversion/GPUToCUDA/ConvertKernelFuncToCubin.cpp
114	> I'd also like a TODO also here for Alex to actually fix: the fact that we have a LLVMContext tied to the LLVM dialect is really something we need to fix.

Yeah, this is the next big thing on my todo list. Let's see how many discussions we can have in parallel :)

Closed by commit rG69040d5b0bfa: [MLIR] Allow for multiple gpu modules during translation. (authored by herhut). · Apr 16 2020, 5:37 AM

This revision was automatically updated to reflect the committed changes.

This is causing TSAN failures. Looks like ConvertKernelFuncToCubin isn't thread safe. More specifically the call to LLVMInitializeNVPTXTargetInfo.

mlir/lib/Conversion/GPUToCUDA/ConvertKernelFuncToCubin.cpp:62:5
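
One common way to keep a process-wide initializer from racing is to route every caller through a function-local static, which C++11 and later initialize exactly once even under concurrent calls. The sketch below shows only that general pattern; `runTargetInitializer` and `ensureTargetInitialized` are hypothetical names, and nothing here is claimed to be the fix that was eventually applied upstream.

```cpp
#include <atomic>
#include <cassert>

// Counts how often the (hypothetical) global initializer actually ran.
static std::atomic<int> initRuns{0};

// Stand-in for a registration routine that must not run concurrently,
// such as a target-info initializer.
static bool runTargetInitializer() {
  initRuns.fetch_add(1);
  return true;
}

// Every caller goes through this function. Initialization of a function-local
// static happens exactly once; threads that arrive while it is in progress
// wait until it has completed.
void ensureTargetInitialized() {
  static const bool initialized = runTargetInitializer();
  assert(initialized);
  (void)initialized;
}
```

A pass would call `ensureTargetInitialized()` before touching the target registry instead of calling the initializer directly.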

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
500	Can we just lock once at beginning?
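
A sketch of what locking once at the beginning could look like: the entry point takes the lock and the individual conversion steps assume it is already held. `ToyTranslation` and its members are hypothetical and use only the standard library; this is not the actual `ModuleTranslation` interface.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical translator. `contextMutex` plays the role of the mutex
// returned by getLLVMContextMutex().
class ToyTranslation {
public:
  explicit ToyTranslation(std::recursive_mutex &contextMutex)
      : contextMutex(contextMutex) {}

  // Entry point: acquires the lock once and holds it across all phases.
  std::vector<std::string> translate() {
    assert(log.empty() && "translate() is meant to be called once");
    std::lock_guard<std::recursive_mutex> lock(contextMutex);
    convertGlobals();
    convertFunctions();
    return log;
  }

private:
  // The phases do not lock; they rely on translate() holding the lock.
  void convertGlobals() { log.push_back("globals"); }
  void convertFunctions() { log.push_back("functions"); }

  std::recursive_mutex &contextMutex;
  std::vector<std::string> log;
};
```

Compared with one lock per phase, this keeps other threads from using the shared context between the two phases, at the cost of holding the lock for longer.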

Revision Contents

Path	Size
mlir/include/mlir/Dialect/LLVMIR/LLVMDialect.h	10 lines
mlir/include/mlir/Dialect/LLVMIR/LLVMOpBase.td	1 line
mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h	3 lines
mlir/lib/Conversion/GPUToCUDA/ConvertKernelFuncToCubin.cpp	10 lines
mlir/lib/Conversion/GPUToCUDA/ConvertLaunchFuncToCudaCalls.cpp	12 lines
mlir/lib/Dialect/LLVMIR/CMakeLists.txt	2 lines
mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp	18 lines
mlir/lib/ExecutionEngine/CMakeLists.txt	2 lines
mlir/lib/ExecutionEngine/ExecutionEngine.cpp	16 lines
mlir/lib/Target/LLVMIR/ModuleTranslation.cpp	11 lines
mlir/test/mlir-cuda-runner/two-modules.mlir	28 lines

Diff 258027

mlir/include/mlir/Dialect/LLVMIR/LLVMDialect.h

	[26 lines not shown]
	#include "llvm/IR/Module.h"			#include "llvm/IR/Module.h"
	#include "llvm/IR/Type.h"			#include "llvm/IR/Type.h"

	#include "mlir/Dialect/LLVMIR/LLVMOpsEnums.h.inc"			#include "mlir/Dialect/LLVMIR/LLVMOpsEnums.h.inc"

	namespace llvm {			namespace llvm {
	class Type;			class Type;
	class LLVMContext;			class LLVMContext;
				namespace sys {
				template <bool mt_only>
				class SmartMutex;
				} // end namespace sys
	} // end namespace llvm			} // end namespace llvm

	namespace mlir {			namespace mlir {
	namespace LLVM {			namespace LLVM {
	class LLVMDialect;			class LLVMDialect;

	namespace detail {			namespace detail {
	struct LLVMTypeStorage;			struct LLVMTypeStorage;
	[168 lines not shown]
	Value createGlobalString(Location loc, OpBuilder &builder, StringRef name,			Value createGlobalString(Location loc, OpBuilder &builder, StringRef name,
	StringRef value, LLVM::Linkage linkage,			StringRef value, LLVM::Linkage linkage,
	LLVM::LLVMDialect *llvmDialect);			LLVM::LLVMDialect *llvmDialect);

	/// LLVM requires some operations to be inside of a Module operation. This			/// LLVM requires some operations to be inside of a Module operation. This
	/// function confirms that the Operation has the desired properties.			/// function confirms that the Operation has the desired properties.
	bool satisfiesLLVMModule(Operation *op);			bool satisfiesLLVMModule(Operation *op);

				/// Clones the given module into the provided context. This is implemented by
				/// transforming the module into bitcode and then reparsing the bitcode in the
				/// provided context.
				std::unique_ptr<llvm::Module>
				cloneModuleIntoNewContext(llvm::LLVMContext *context, llvm::Module *module);

	} // end namespace LLVM			} // end namespace LLVM
	} // end namespace mlir			} // end namespace mlir

	#endif // MLIR_DIALECT_LLVMIR_LLVMDIALECT_H_			#endif // MLIR_DIALECT_LLVMIR_LLVMDIALECT_H_

mlir/include/mlir/Dialect/LLVMIR/LLVMOpBase.td

	[18 lines not shown]
	def LLVM_Dialect : Dialect {			def LLVM_Dialect : Dialect {
	let name = "llvm";			let name = "llvm";
	let cppNamespace = "LLVM";			let cppNamespace = "LLVM";
	let hasRegionArgAttrVerify = 1;			let hasRegionArgAttrVerify = 1;
	let extraClassDeclaration = [{			let extraClassDeclaration = [{
	~LLVMDialect();			~LLVMDialect();
	llvm::LLVMContext &getLLVMContext();			llvm::LLVMContext &getLLVMContext();
	llvm::Module &getLLVMModule();			llvm::Module &getLLVMModule();
				llvm::sys::SmartMutex<true> &getLLVMContextMutex();

	private:			private:
	friend LLVMType;			friend LLVMType;

	std::unique_ptr<detail::LLVMDialectImpl> impl;			std::unique_ptr<detail::LLVMDialectImpl> impl;
	}];			}];
	}			}

	[188 lines not shown]

mlir/include/mlir/Target/LLVMIR/ModuleTranslation.h

[20 lines not shown]
#include "mlir/IR/Value.h"		#include "mlir/IR/Value.h"

#include "llvm/Frontend/OpenMP/OMPIRBuilder.h"		#include "llvm/Frontend/OpenMP/OMPIRBuilder.h"
#include "llvm/IR/BasicBlock.h"		#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/MatrixBuilder.h"		#include "llvm/IR/MatrixBuilder.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"

		csigg: Is this needed?
		herhut: No, a leftover from using `std::mutex` in an earlier version. Thanks!
namespace mlir {		namespace mlir {
class Attribute;		class Attribute;
class Location;		class Location;
class ModuleOp;		class ModuleOp;
class Operation;		class Operation;

namespace LLVM {		namespace LLVM {

[63 lines not shown]	private:
LogicalResult convertBlock(Block &bb, bool ignoreArguments);		LogicalResult convertBlock(Block &bb, bool ignoreArguments);

llvm::Constant *getLLVMConstant(llvm::Type *llvmType, Attribute attr,		llvm::Constant *getLLVMConstant(llvm::Type *llvmType, Attribute attr,
Location loc);		Location loc);

/// Original and translated module.		/// Original and translated module.
Operation *mlirModule;		Operation *mlirModule;
std::unique_ptr<llvm::Module> llvmModule;		std::unique_ptr<llvm::Module> llvmModule;

/// A converter for translating debug information.		/// A converter for translating debug information.
std::unique_ptr<detail::DebugTranslation> debugTranslation;		std::unique_ptr<detail::DebugTranslation> debugTranslation;

/// Builder for LLVM IR generation of OpenMP constructs.		/// Builder for LLVM IR generation of OpenMP constructs.
std::unique_ptr<llvm::OpenMPIRBuilder> ompBuilder;		std::unique_ptr<llvm::OpenMPIRBuilder> ompBuilder;
/// Precomputed pointer to OpenMP dialect.		/// Precomputed pointer to OpenMP dialect.
const Dialect *ompDialect;		const Dialect *ompDialect;
		/// Pointer to the llvmDialect;
		LLVMDialect *llvmDialect;

/// Mappings between llvm.mlir.global definitions and corresponding globals.		/// Mappings between llvm.mlir.global definitions and corresponding globals.
DenseMap<Operation *, llvm::GlobalValue *> globalsMapping;		DenseMap<Operation *, llvm::GlobalValue *> globalsMapping;

protected:		protected:
/// Mappings between original and translated values, used for lookups.		/// Mappings between original and translated values, used for lookups.
llvm::StringMap<llvm::Function *> functionMapping;		llvm::StringMap<llvm::Function *> functionMapping;
DenseMap<Value, llvm::Value *> valueMapping;		DenseMap<Value, llvm::Value *> valueMapping;
DenseMap<Block *, llvm::BasicBlock *> blockMapping;		DenseMap<Block *, llvm::BasicBlock *> blockMapping;
};		};

} // namespace LLVM		} // namespace LLVM
} // namespace mlir		} // namespace mlir

#endif // MLIR_TARGET_LLVMIR_MODULETRANSLATION_H		#endif // MLIR_TARGET_LLVMIR_MODULETRANSLATION_H

mlir/lib/Conversion/GPUToCUDA/ConvertKernelFuncToCubin.cpp

	[9 lines not shown]
	// corresponding binary blob that can be executed on a CUDA GPU. Currently			// corresponding binary blob that can be executed on a CUDA GPU. Currently
	// only translates the function itself but no dependencies.			// only translates the function itself but no dependencies.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "mlir/Conversion/GPUToCUDA/GPUToCUDAPass.h"			#include "mlir/Conversion/GPUToCUDA/GPUToCUDAPass.h"

	#include "mlir/Dialect/GPU/GPUDialect.h"			#include "mlir/Dialect/GPU/GPUDialect.h"
				#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
	#include "mlir/IR/Attributes.h"			#include "mlir/IR/Attributes.h"
	#include "mlir/IR/Builders.h"			#include "mlir/IR/Builders.h"
	#include "mlir/IR/Function.h"			#include "mlir/IR/Function.h"
	#include "mlir/IR/Module.h"			#include "mlir/IR/Module.h"
	#include "mlir/Pass/Pass.h"			#include "mlir/Pass/Pass.h"
	#include "mlir/Pass/PassRegistry.h"			#include "mlir/Pass/PassRegistry.h"
	#include "mlir/Support/LogicalResult.h"			#include "mlir/Support/LogicalResult.h"
	#include "mlir/Target/NVVMIR.h"			#include "mlir/Target/NVVMIR.h"
	[67 lines not shown]
	};			};

	} // anonymous namespace			} // anonymous namespace

	std::string GpuKernelToCubinPass::translateModuleToPtx(			std::string GpuKernelToCubinPass::translateModuleToPtx(
	llvm::Module &module, llvm::TargetMachine &target_machine) {			llvm::Module &module, llvm::TargetMachine &target_machine) {
	std::string ptx;			std::string ptx;
	{			{
				// Clone the llvm module into a new context to enable concurrent compilation
				// with multiple threads.
				// TODO(zinenko): Reevaluate model of ownership of LLVMContext in
				// LLVMDialect.
				llvm::LLVMContext llvmContext;
				auto clone = LLVM::cloneModuleIntoNewContext(&llvmContext, &module);

	llvm::raw_string_ostream stream(ptx);			llvm::raw_string_ostream stream(ptx);
	llvm::buffer_ostream pstream(stream);			llvm::buffer_ostream pstream(stream);
	llvm::legacy::PassManager codegen_passes;			llvm::legacy::PassManager codegen_passes;
	target_machine.addPassesToEmitFile(codegen_passes, pstream, nullptr,			target_machine.addPassesToEmitFile(codegen_passes, pstream, nullptr,
	llvm::CGFT_AssemblyFile);			llvm::CGFT_AssemblyFile);
	codegen_passes.run(module);			codegen_passes.run(*clone);
				mehdi_amini: Why don't you use the high level API the same way this is done in the ExecutionEngine? I'd also like a TODO also here for Alex to actually fix: the fact that we have a LLVMContext tied to the LLVM dialect is really something we need to fix.
				mehdi_amini: (It may be even worth extracting this logic in a helper by the way)
				ftynse: > I'd also like a TODO also here for Alex to actually fix: the fact that we have a LLVMContext… Yeah, this is the next big thing on my todo list. Let's see how many discussion we can have in parallel :)
	}			}

	return ptx;			return ptx;
	}			}

	OwnedCubin GpuKernelToCubinPass::convertModuleToCubin(llvm::Module &llvmModule,			OwnedCubin GpuKernelToCubinPass::convertModuleToCubin(llvm::Module &llvmModule,
	Location loc,			Location loc,
	StringRef name) {			StringRef name) {
	[36 lines not shown]

mlir/lib/Conversion/GPUToCUDA/ConvertLaunchFuncToCudaCalls.cpp

[110 lines not shown]	Value allocatePointer(OpBuilder &builder, Location loc) {
return builder.create<LLVM::AllocaOp>(loc, getPointerPointerType(), one,		return builder.create<LLVM::AllocaOp>(loc, getPointerPointerType(), one,
/*alignment=*/0);		/*alignment=*/0);
}		}

void declareCudaFunctions(Location loc);		void declareCudaFunctions(Location loc);
void addParamToList(OpBuilder &builder, Location loc, Value param, Value list,		void addParamToList(OpBuilder &builder, Location loc, Value param, Value list,
unsigned pos, Value one);		unsigned pos, Value one);
Value setupParamsArray(gpu::LaunchFuncOp launchOp, OpBuilder &builder);		Value setupParamsArray(gpu::LaunchFuncOp launchOp, OpBuilder &builder);
Value generateKernelNameConstant(StringRef name, Location loc,		Value generateKernelNameConstant(StringRef moduleName, StringRef name,
OpBuilder &builder);		Location loc, OpBuilder &builder);
void translateGpuLaunchCalls(mlir::gpu::LaunchFuncOp launchOp);		void translateGpuLaunchCalls(mlir::gpu::LaunchFuncOp launchOp);

public:		public:
// Run the dialect converter on the module.		// Run the dialect converter on the module.
void runOnOperation() override {		void runOnOperation() override {
// Cache the LLVMDialect for the current module.		// Cache the LLVMDialect for the current module.
llvmDialect = getContext().getRegisteredDialect<LLVM::LLVMDialect>();		llvmDialect = getContext().getRegisteredDialect<LLVM::LLVMDialect>();
// Cache the used LLVM types.		// Cache the used LLVM types.
[211 lines not shown]
//		//
// llvm.global constant @kernel_name("function_name\00")		// llvm.global constant @kernel_name("function_name\00")
// func(...) {		// func(...) {
// %0 = llvm.addressof @kernel_name		// %0 = llvm.addressof @kernel_name
// %1 = llvm.constant (0 : index)		// %1 = llvm.constant (0 : index)
// %2 = llvm.getelementptr %0[%1, %1] : !llvm<"i8*">		// %2 = llvm.getelementptr %0[%1, %1] : !llvm<"i8*">
// }		// }
Value GpuLaunchFuncToCudaCallsPass::generateKernelNameConstant(		Value GpuLaunchFuncToCudaCallsPass::generateKernelNameConstant(
StringRef name, Location loc, OpBuilder &builder) {		StringRef moduleName, StringRef name, Location loc, OpBuilder &builder) {
// Make sure the trailing zero is included in the constant.		// Make sure the trailing zero is included in the constant.
std::vector<char> kernelName(name.begin(), name.end());		std::vector<char> kernelName(name.begin(), name.end());
kernelName.push_back('\0');		kernelName.push_back('\0');

std::string globalName = std::string(llvm::formatv("{0}_kernel_name", name));		std::string globalName =
		std::string(llvm::formatv("{0}_{1}_kernel_name", moduleName, name));
return LLVM::createGlobalString(		return LLVM::createGlobalString(
loc, builder, globalName, StringRef(kernelName.data(), kernelName.size()),		loc, builder, globalName, StringRef(kernelName.data(), kernelName.size()),
LLVM::Linkage::Internal, llvmDialect);		LLVM::Linkage::Internal, llvmDialect);
}		}

// Emits LLVM IR to launch a kernel function. Expects the module that contains		// Emits LLVM IR to launch a kernel function. Expects the module that contains
// the compiled kernel function as a cubin in the 'nvvm.cubin' attribute of the		// the compiled kernel function as a cubin in the 'nvvm.cubin' attribute of the
// kernel function in the IR.		// kernel function in the IR.
[48 lines not shown]	auto cuModuleLoad =
getOperation().lookupSymbol<LLVM::LLVMFuncOp>(cuModuleLoadName);		getOperation().lookupSymbol<LLVM::LLVMFuncOp>(cuModuleLoadName);
builder.create<LLVM::CallOp>(loc, ArrayRef<Type>{getCUResultType()},		builder.create<LLVM::CallOp>(loc, ArrayRef<Type>{getCUResultType()},
builder.getSymbolRefAttr(cuModuleLoad),		builder.getSymbolRefAttr(cuModuleLoad),
ArrayRef<Value>{cuModule, data});		ArrayRef<Value>{cuModule, data});
// Get the function from the module. The name corresponds to the name of		// Get the function from the module. The name corresponds to the name of
// the kernel function.		// the kernel function.
auto cuOwningModuleRef =		auto cuOwningModuleRef =
builder.create<LLVM::LoadOp>(loc, getPointerType(), cuModule);		builder.create<LLVM::LoadOp>(loc, getPointerType(), cuModule);
auto kernelName = generateKernelNameConstant(launchOp.kernel(), loc, builder);		auto kernelName = generateKernelNameConstant(launchOp.getKernelModuleName(),
		launchOp.kernel(), loc, builder);
auto cuFunction = allocatePointer(builder, loc);		auto cuFunction = allocatePointer(builder, loc);
auto cuModuleGetFunction =		auto cuModuleGetFunction =
getOperation().lookupSymbol<LLVM::LLVMFuncOp>(cuModuleGetFunctionName);		getOperation().lookupSymbol<LLVM::LLVMFuncOp>(cuModuleGetFunctionName);
builder.create<LLVM::CallOp>(		builder.create<LLVM::CallOp>(
loc, ArrayRef<Type>{getCUResultType()},		loc, ArrayRef<Type>{getCUResultType()},
builder.getSymbolRefAttr(cuModuleGetFunction),		builder.getSymbolRefAttr(cuModuleGetFunction),
ArrayRef<Value>{cuFunction, cuOwningModuleRef, kernelName});		ArrayRef<Value>{cuFunction, cuOwningModuleRef, kernelName});
// Grab the global stream needed for execution.		// Grab the global stream needed for execution.
[40 lines not shown]
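
The change to `generateKernelNameConstant` above prefixes the global's name with the GPU module name, which matters once two modules contain a kernel with the same name. A minimal standard-library sketch of that naming scheme follows. `kernelNameGlobal` is a hypothetical helper that mirrors the `"{0}_{1}_kernel_name"` format string from the diff; it is not part of the pass.

```cpp
#include <cassert>
#include <string>

// Builds the symbol name of the global that stores a kernel's name string.
// Including the module name keeps same-named kernels from different GPU
// modules from mapping to the same global.
std::string kernelNameGlobal(const std::string &moduleName,
                             const std::string &kernelName) {
  assert(!kernelName.empty() && "kernel name must not be empty");
  return moduleName + "_" + kernelName + "_kernel_name";
}
```

With the old single-argument scheme, kernels named `kernel` in two different modules would both have produced `kernel_kernel_name`.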

mlir/lib/Dialect/LLVMIR/CMakeLists.txt

	add_subdirectory(Transforms)			add_subdirectory(Transforms)

	add_mlir_dialect_library(MLIRLLVMIR			add_mlir_dialect_library(MLIRLLVMIR
	IR/LLVMDialect.cpp			IR/LLVMDialect.cpp

	ADDITIONAL_HEADER_DIRS			ADDITIONAL_HEADER_DIRS
	${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/LLVMIR			${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/LLVMIR

	DEPENDS			DEPENDS
	MLIRLLVMOpsIncGen			MLIRLLVMOpsIncGen
	MLIRLLVMConversionsIncGen			MLIRLLVMConversionsIncGen
	)			)
	target_link_libraries(MLIRLLVMIR			target_link_libraries(MLIRLLVMIR
	PUBLIC			PUBLIC
	LLVMAsmParser			LLVMAsmParser
				LLVMBitReader
				LLVMBitWriter
	LLVMCore			LLVMCore
	LLVMSupport			LLVMSupport
	LLVMFrontendOpenMP			LLVMFrontendOpenMP
	MLIRCallInterfaces			MLIRCallInterfaces
	MLIRControlFlowInterfaces			MLIRControlFlowInterfaces
	MLIROpenMP			MLIROpenMP
	MLIRIR			MLIRIR
	MLIRSideEffects			MLIRSideEffects
	[61 lines not shown]

mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp

[14 lines not shown]
#include "mlir/IR/DialectImplementation.h"		#include "mlir/IR/DialectImplementation.h"
#include "mlir/IR/FunctionImplementation.h"		#include "mlir/IR/FunctionImplementation.h"
#include "mlir/IR/MLIRContext.h"		#include "mlir/IR/MLIRContext.h"
#include "mlir/IR/Module.h"		#include "mlir/IR/Module.h"
#include "mlir/IR/StandardTypes.h"		#include "mlir/IR/StandardTypes.h"

#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include "llvm/AsmParser/Parser.h"		#include "llvm/AsmParser/Parser.h"
		#include "llvm/Bitcode/BitcodeReader.h"
		#include "llvm/Bitcode/BitcodeWriter.h"
#include "llvm/IR/Attributes.h"		#include "llvm/IR/Attributes.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
#include "llvm/IR/Type.h"		#include "llvm/IR/Type.h"
#include "llvm/Support/Mutex.h"		#include "llvm/Support/Mutex.h"
#include "llvm/Support/SourceMgr.h"		#include "llvm/Support/SourceMgr.h"

using namespace mlir;		using namespace mlir;
using namespace mlir::LLVM;		using namespace mlir::LLVM;
[1,646 lines not shown]

LLVMDialect::~LLVMDialect() {}		LLVMDialect::~LLVMDialect() {}

#define GET_OP_CLASSES		#define GET_OP_CLASSES
#include "mlir/Dialect/LLVMIR/LLVMOps.cpp.inc"		#include "mlir/Dialect/LLVMIR/LLVMOps.cpp.inc"

llvm::LLVMContext &LLVMDialect::getLLVMContext() { return impl->llvmContext; }		llvm::LLVMContext &LLVMDialect::getLLVMContext() { return impl->llvmContext; }
llvm::Module &LLVMDialect::getLLVMModule() { return impl->module; }		llvm::Module &LLVMDialect::getLLVMModule() { return impl->module; }
		llvm::sys::SmartMutex<true> &LLVMDialect::getLLVMContextMutex() {
		return impl->mutex;
		}

/// Parse a type registered to this dialect.		/// Parse a type registered to this dialect.
Type LLVMDialect::parseType(DialectAsmParser &parser) const {		Type LLVMDialect::parseType(DialectAsmParser &parser) const {
StringRef tyData = parser.getFullSymbolSpec();		StringRef tyData = parser.getFullSymbolSpec();

// LLVM is not thread-safe, so lock access to it.		// LLVM is not thread-safe, so lock access to it.
llvm::sys::SmartScopedLock<true> lock(impl->mutex);		llvm::sys::SmartScopedLock<true> lock(impl->mutex);

[273 lines not shown]	return builder.create<LLVM::GEPOp>(loc,
LLVM::LLVMType::getInt8PtrTy(llvmDialect),		LLVM::LLVMType::getInt8PtrTy(llvmDialect),
globalPtr, ArrayRef<Value>({cst0, cst0}));		globalPtr, ArrayRef<Value>({cst0, cst0}));
}		}

bool mlir::LLVM::satisfiesLLVMModule(Operation *op) {		bool mlir::LLVM::satisfiesLLVMModule(Operation *op) {
return op->hasTrait<OpTrait::SymbolTable>() &&		return op->hasTrait<OpTrait::SymbolTable>() &&
op->hasTrait<OpTrait::IsIsolatedFromAbove>();		op->hasTrait<OpTrait::IsIsolatedFromAbove>();
}		}

		std::unique_ptr<llvm::Module>
		mlir::LLVM::cloneModuleIntoNewContext(llvm::LLVMContext *context,
		llvm::Module *module) {
		SmallVector<char, 1> buffer;
		{
		llvm::raw_svector_ostream os(buffer);
		WriteBitcodeToFile(*module, os);
		}
		llvm::MemoryBufferRef bufferRef(StringRef(buffer.data(), buffer.size()),
		"cloned module buffer");
		return cantFail(parseBitcodeFile(bufferRef, *context));
		}

mlir/lib/ExecutionEngine/CMakeLists.txt

[11 lines not shown]	add_mlir_library(MLIRExecutionEngine

ADDITIONAL_HEADER_DIRS		ADDITIONAL_HEADER_DIRS
${MLIR_MAIN_INCLUDE_DIR}/mlir/ExecutionEngine		${MLIR_MAIN_INCLUDE_DIR}/mlir/ExecutionEngine
)		)
target_link_libraries(MLIRExecutionEngine		target_link_libraries(MLIRExecutionEngine
PUBLIC		PUBLIC
MLIRLLVMIR		MLIRLLVMIR
MLIRTargetLLVMIR		MLIRTargetLLVMIR
LLVMBitReader
LLVMBitWriter
LLVMExecutionEngine		LLVMExecutionEngine
LLVMObject		LLVMObject
LLVMOrcJIT		LLVMOrcJIT
LLVMJITLink		LLVMJITLink
LLVMSupport		LLVMSupport
LLVMAnalysis		LLVMAnalysis
LLVMAggressiveInstCombine		LLVMAggressiveInstCombine
LLVMInstCombine		LLVMInstCombine
[20 lines not shown]

mlir/lib/ExecutionEngine/ExecutionEngine.cpp

//===- ExecutionEngine.cpp - MLIR Execution engine and utils --------------===//		//===- ExecutionEngine.cpp - MLIR Execution engine and utils --------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file implements the execution engine for MLIR modules based on LLVM Orc		// This file implements the execution engine for MLIR modules based on LLVM Orc
// JIT engine.		// JIT engine.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
#include "mlir/ExecutionEngine/ExecutionEngine.h"		#include "mlir/ExecutionEngine/ExecutionEngine.h"
		#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
#include "mlir/IR/Function.h"		#include "mlir/IR/Function.h"
#include "mlir/IR/Module.h"		#include "mlir/IR/Module.h"
#include "mlir/Support/FileUtilities.h"		#include "mlir/Support/FileUtilities.h"
#include "mlir/Target/LLVMIR.h"		#include "mlir/Target/LLVMIR.h"

#include "llvm/Bitcode/BitcodeReader.h"
#include "llvm/Bitcode/BitcodeWriter.h"
#include "llvm/ExecutionEngine/JITEventListener.h"		#include "llvm/ExecutionEngine/JITEventListener.h"
#include "llvm/ExecutionEngine/ObjectCache.h"		#include "llvm/ExecutionEngine/ObjectCache.h"
#include "llvm/ExecutionEngine/Orc/CompileUtils.h"		#include "llvm/ExecutionEngine/Orc/CompileUtils.h"
#include "llvm/ExecutionEngine/Orc/ExecutionUtils.h"		#include "llvm/ExecutionEngine/Orc/ExecutionUtils.h"
#include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"		#include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
#include "llvm/ExecutionEngine/Orc/IRTransformLayer.h"		#include "llvm/ExecutionEngine/Orc/IRTransformLayer.h"
#include "llvm/ExecutionEngine/Orc/JITTargetMachineBuilder.h"		#include "llvm/ExecutionEngine/Orc/JITTargetMachineBuilder.h"
#include "llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h"		#include "llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h"
[177 lines not shown]	Expected<std::unique_ptr<ExecutionEngine>> ExecutionEngine::create(
// instead of this. Currently, the LLVM module created above has no triple		// instead of this. Currently, the LLVM module created above has no triple
// associated with it.		// associated with it.
setupTargetTriple(llvmModule.get());		setupTargetTriple(llvmModule.get());
packFunctionArguments(llvmModule.get());		packFunctionArguments(llvmModule.get());

// Clone module in a new LLVMContext since translateModuleToLLVMIR buries		// Clone module in a new LLVMContext since translateModuleToLLVMIR buries
// ownership too deeply.		// ownership too deeply.
// TODO(zinenko): Reevaluate model of ownership of LLVMContext in LLVMDialect.		// TODO(zinenko): Reevaluate model of ownership of LLVMContext in LLVMDialect.
SmallVector<char, 1> buffer;		std::unique_ptr<Module> deserModule =
{		LLVM::cloneModuleIntoNewContext(ctx.get(), llvmModule.get());
llvm::raw_svector_ostream os(buffer);
WriteBitcodeToFile(*llvmModule, os);
}
llvm::MemoryBufferRef bufferRef(StringRef(buffer.data(), buffer.size()),
"cloned module buffer");
auto expectedModule = parseBitcodeFile(bufferRef, *ctx);
if (!expectedModule)
return expectedModule.takeError();
std::unique_ptr<Module> deserModule = std::move(*expectedModule);
auto dataLayout = deserModule->getDataLayout();		auto dataLayout = deserModule->getDataLayout();

// Callback to create the object layer with symbol resolution to current		// Callback to create the object layer with symbol resolution to current
// process and dynamically linked libraries.		// process and dynamically linked libraries.
auto objectLinkingLayerCreator = [&](ExecutionSession &session,		auto objectLinkingLayerCreator = [&](ExecutionSession &session,
const Triple &TT) {		const Triple &TT) {
auto objectLayer = std::make_unique<RTDyldObjectLinkingLayer>(		auto objectLayer = std::make_unique<RTDyldObjectLinkingLayer>(
session, []() { return std::make_unique<SectionMemoryManager>(); });		session, []() { return std::make_unique<SectionMemoryManager>(); });
[104 lines not shown]

mlir/lib/Target/LLVMIR/ModuleTranslation.cpp

[295 lines not shown]
}		}

ModuleTranslation::ModuleTranslation(Operation *module,		ModuleTranslation::ModuleTranslation(Operation *module,
std::unique_ptr<llvm::Module> llvmModule)		std::unique_ptr<llvm::Module> llvmModule)
: mlirModule(module), llvmModule(std::move(llvmModule)),		: mlirModule(module), llvmModule(std::move(llvmModule)),
debugTranslation(		debugTranslation(
std::make_unique<DebugTranslation>(module, *this->llvmModule)),		std::make_unique<DebugTranslation>(module, *this->llvmModule)),
ompDialect(		ompDialect(
module->getContext()->getRegisteredDialect<omp::OpenMPDialect>()) {		module->getContext()->getRegisteredDialect<omp::OpenMPDialect>()),
		llvmDialect(module->getContext()->getRegisteredDialect<LLVMDialect>()) {
assert(satisfiesLLVMModule(mlirModule) &&		assert(satisfiesLLVMModule(mlirModule) &&
"mlirModule should honor LLVM's module semantics.");		"mlirModule should honor LLVM's module semantics.");
}		}
ModuleTranslation::~ModuleTranslation() {}		ModuleTranslation::~ModuleTranslation() {}

/// Given an OpenMP MLIR operation, create the corresponding LLVM IR		/// Given an OpenMP MLIR operation, create the corresponding LLVM IR
/// (including OpenMP runtime calls).		/// (including OpenMP runtime calls).
LogicalResult		LogicalResult
[177 lines not shown]	LogicalResult ModuleTranslation::convertBlock(Block &bb, bool ignoreArguments) {
}		}

return success();		return success();
}		}

/// Create named global variables that correspond to llvm.mlir.global		/// Create named global variables that correspond to llvm.mlir.global
/// definitions.		/// definitions.
LogicalResult ModuleTranslation::convertGlobals() {		LogicalResult ModuleTranslation::convertGlobals() {
		// Lock access to the llvm context.
		llvm::sys::SmartScopedLock<true> scopedLock(
		llvmDialect->getLLVMContextMutex());
		rriddle: Can we just lock once at beginning?
for (auto op : getModuleBody(mlirModule).getOps<LLVM::GlobalOp>()) {		for (auto op : getModuleBody(mlirModule).getOps<LLVM::GlobalOp>()) {
llvm::Type *type = op.getType().getUnderlyingType();		llvm::Type *type = op.getType().getUnderlyingType();
llvm::Constant *cst = llvm::UndefValue::get(type);		llvm::Constant *cst = llvm::UndefValue::get(type);
if (op.getValueOrNull()) {		if (op.getValueOrNull()) {
// String attributes are treated separately because they cannot appear as		// String attributes are treated separately because they cannot appear as
// in-function constants and are thus not supported by getLLVMConstant.		// in-function constants and are thus not supported by getLLVMConstant.
if (auto strAttr = op.getValueOrNull().dyn_cast_or_null<StringAttr>()) {		if (auto strAttr = op.getValueOrNull().dyn_cast_or_null<StringAttr>()) {
cst = llvm::ConstantDataArray::getString(		cst = llvm::ConstantDataArray::getString(
[243 lines not shown]	LogicalResult ModuleTranslation::checkSupportedModuleOps(Operation *m) {
for (Operation &o : getModuleBody(m).getOperations())		for (Operation &o : getModuleBody(m).getOperations())
if (!isa<LLVM::LLVMFuncOp>(&o) && !isa<LLVM::GlobalOp>(&o) &&		if (!isa<LLVM::LLVMFuncOp>(&o) && !isa<LLVM::GlobalOp>(&o) &&
!o.isKnownTerminator())		!o.isKnownTerminator())
return o.emitOpError("unsupported module-level operation");		return o.emitOpError("unsupported module-level operation");
return success();		return success();
}		}

LogicalResult ModuleTranslation::convertFunctions() {		LogicalResult ModuleTranslation::convertFunctions() {
		// Lock access to the llvm context.
		llvm::sys::SmartScopedLock<true> scopedLock(
		llvmDialect->getLLVMContextMutex());
// Declare all functions first because there may be function calls that form a		// Declare all functions first because there may be function calls that form a
// call graph with cycles.		// call graph with cycles.
for (auto function : getModuleBody(mlirModule).getOps<LLVMFuncOp>()) {		for (auto function : getModuleBody(mlirModule).getOps<LLVMFuncOp>()) {
llvm::FunctionCallee llvmFuncCst = llvmModule->getOrInsertFunction(		llvm::FunctionCallee llvmFuncCst = llvmModule->getOrInsertFunction(
function.getName(),		function.getName(),
cast<llvm::FunctionType>(function.getType().getUnderlyingType()));		cast<llvm::FunctionType>(function.getType().getUnderlyingType()));
llvm::Function *llvmFunc = cast<llvm::Function>(llvmFuncCst.getCallee());		llvm::Function *llvmFunc = cast<llvm::Function>(llvmFuncCst.getCallee());
functionMapping[function.getName()] = llvmFunc;		functionMapping[function.getName()] = llvmFunc;
[28 lines not shown]	ModuleTranslation::lookupValues(ValueRange values) {
}		}
return remapped;		return remapped;
}		}

std::unique_ptr<llvm::Module>		std::unique_ptr<llvm::Module>
ModuleTranslation::prepareLLVMModule(Operation *m) {		ModuleTranslation::prepareLLVMModule(Operation *m) {
auto *dialect = m->getContext()->getRegisteredDialect<LLVM::LLVMDialect>();		auto *dialect = m->getContext()->getRegisteredDialect<LLVM::LLVMDialect>();
assert(dialect && "LLVM dialect must be registered");		assert(dialect && "LLVM dialect must be registered");
		// Lock the LLVM context as we might create new types here.
		llvm::sys::SmartScopedLock<true> scopedLock(dialect->getLLVMContextMutex());

auto llvmModule = llvm::CloneModule(dialect->getLLVMModule());		auto llvmModule = llvm::CloneModule(dialect->getLLVMModule());
if (!llvmModule)		if (!llvmModule)
return nullptr;		return nullptr;

llvm::LLVMContext &llvmContext = llvmModule->getContext();		llvm::LLVMContext &llvmContext = llvmModule->getContext();
llvm::IRBuilder<> builder(llvmContext);		llvm::IRBuilder<> builder(llvmContext);

[9 lines not shown]
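
As a rough illustration of the locking pattern in the diff above, and of the inline question about locking once, here is a standalone sketch that uses only the C++ standard library. `SharedContext`, `translateLockedOnce`, and `runTranslations` are hypothetical names that do not appear in the patch, and `std::recursive_mutex` stands in for the mutex returned by `getLLVMContextMutex()`. The sketch assumes that mutex permits recursive locking, which this diff does not show.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for state guarded by the LLVM context mutex.
struct SharedContext {
  std::recursive_mutex mutex;
  int entitiesCreated = 0; // plain int: only modified while `mutex` is held
};

// Per-phase locking, as in the patch: each phase holds the lock for its
// own duration.
void convertGlobals(SharedContext &ctx) {
  std::lock_guard<std::recursive_mutex> lock(ctx.mutex);
  ++ctx.entitiesCreated;
}

void convertFunctions(SharedContext &ctx) {
  std::lock_guard<std::recursive_mutex> lock(ctx.mutex);
  ++ctx.entitiesCreated;
}

// Alternative raised in review: take the lock once around the whole
// translation. The inner lock_guards then re-acquire recursively, so no
// other thread can interleave between the two phases.
void translateLockedOnce(SharedContext &ctx) {
  std::lock_guard<std::recursive_mutex> lock(ctx.mutex);
  convertGlobals(ctx);
  convertFunctions(ctx);
}

// Runs several translations concurrently against one shared context and
// returns how many entities were created in total.
int runTranslations(int numThreads) {
  SharedContext ctx;
  std::vector<std::thread> workers;
  for (int i = 0; i < numThreads; ++i)
    workers.emplace_back([&ctx] { translateLockedOnce(ctx); });
  for (auto &worker : workers)
    worker.join();
  return ctx.entitiesCreated;
}
```

Calling `runTranslations(4)` runs four translations in parallel; each contributes two increments under the lock. With per-phase locking alone, another thread may run between `convertGlobals` and `convertFunctions`; locking once rules that out at the cost of holding the mutex longer.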

mlir/test/mlir-cuda-runner/two-modules.mlir

This file was added.

				// RUN: mlir-cuda-runner %s --print-ir-after-all --shared-libs=%cuda_wrapper_library_dir/libcuda-runtime-wrappers%shlibext,%linalg_test_lib_dir/libmlir_runner_utils%shlibext --entry-point-result=void | FileCheck %s --dump-input=always

				// CHECK: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
				func @main() {
				%arg = alloc() : memref<13xi32>
				%dst = memref_cast %arg : memref<13xi32> to memref<?xi32>
				%one = constant 1 : index
				%sx = dim %dst, 0 : memref<?xi32>
				call @mcuMemHostRegisterMemRef1dInt32(%dst) : (memref<?xi32>) -> ()
				gpu.launch blocks(%bx, %by, %bz) in (%grid_x = %one, %grid_y = %one, %grid_z = %one)
				threads(%tx, %ty, %tz) in (%block_x = %sx, %block_y = %one, %block_z = %one) {
				%t0 = index_cast %tx : index to i32
				store %t0, %dst[%tx] : memref<?xi32>
				gpu.terminator
				}
				gpu.launch blocks(%bx, %by, %bz) in (%grid_x = %one, %grid_y = %one, %grid_z = %one)
				threads(%tx, %ty, %tz) in (%block_x = %sx, %block_y = %one, %block_z = %one) {
				%t0 = index_cast %tx : index to i32
				store %t0, %dst[%tx] : memref<?xi32>
				gpu.terminator
				}
				%U = memref_cast %dst : memref<?xi32> to memref<*xi32>
				call @print_memref_i32(%U) : (memref<*xi32>) -> ()
				return
				}

				func @mcuMemHostRegisterMemRef1dInt32(%ptr : memref<?xi32>)
				func @print_memref_i32(%ptr : memref<*xi32>)


Revision Contents

Diff 258027
