This is an archive of the discontinued LLVM Phabricator instance.

Differential D154153

[mlir][gpu] Update GPU translation to accept binaries.
Closed, Public

Authored by fmorac on Jun 29 2023, 2:12 PM.

Details

Reviewers

ftynse
ThomasRaoux
dcaballe
mehdi_amini
stellaraccident
krzysz00
nicolasvasilache
herhut

Commits

rGb43068e8707d: [mlir][gpu] Update GPU translation to accept binaries.

Summary

Commit message

Modifies GPU translation to accept GPU binaries, embedding them using the
object manager interface method embedBinary, and to accept kernel launch
operations, translating them using the interface method launchKernel.

Depends on D154152

Explanation

Summary:
These patches aim to replace the current GPU compilation infrastructure, with extensibility and minimal future disruption as the primary goals.
The biggest updates performed by these patches are:

The introduction of Target attributes. These attributes handle the compilation of GPU modules into binary strings and can be implemented by any dialect, leaving downstream users the option to implement their own serializations.
The introduction of the GPU binary operation. This operation stores GPU objects for different targets and can be invoked by gpu.launch_func.
Making gpu.binary & gpu.launch_func translatable to LLVM IR, with the translation controlled by Object Manager attributes.
The introduction of the gpu-module-to-binary pass. This pass serializes GPU modules into GPU binaries, using the GPU targets available in the module.
The introduction of the #gpu.select_object object manager as the default object manager. It selects a single object for embedding in the IR; by default, it selects the first object.

These patches leave the current infrastructure in place, allowing for a migration period for downstream users.

Examples:

GPU modules using target attributes:

gpu.module @my_module [#gpu.nvptx<chip = "sm_90">, #gpu.amdgpu, #gpu.amdgpu<chip = "gfx90a">] {
...
}

Applying the gpu-module-to-binary pass:

gpu.module @my_module [#gpu.nvptx<chip = "sm_90">, #gpu.amdgpu] {
...
}
; mlir-opt --gpu-module-to-binary
gpu.binary @my_module [#gpu.object<#gpu.nvptx<chip = "sm_90">, "BINARY DATA">, #gpu.object<#gpu.amdgpu, "BINARY DATA">]

Choosing the #gpu.amdgpu object for embedding:

gpu.binary @my_module <#gpu.select_object<#gpu.amdgpu>> [#gpu.object<#gpu.nvptx<chip = "sm_90">, "BINARY DATA">, #gpu.object<#gpu.amdgpu, "BINARY DATA">]
; It's also valid to pass the index of the object.
gpu.binary @my_module <#gpu.select_object<1>> [#gpu.object<#gpu.nvptx<chip = "sm_90">, "BINARY DATA">, #gpu.object<#gpu.amdgpu, "BINARY DATA">]
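
The selection rule in the examples above (first object by default, otherwise an index or a target) can be modeled with a small standalone C++ sketch. This is a toy illustration only, not the MLIR implementation: GpuObject, Selector and selectObject are hypothetical names, targets are reduced to plain strings, and returning an empty result when nothing matches is an assumption rather than documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a #gpu.object entry: a target name and its blob.
struct GpuObject {
  std::string target;
  std::string blob;
};

// Hypothetical selector: either an index into the object list or a target name.
using Selector = std::variant<std::size_t, std::string>;

// Picks the object to embed. With no selector, the first object is chosen,
// as the summary describes. Returning std::nullopt on a miss is an assumption.
std::optional<GpuObject>
selectObject(const std::vector<GpuObject> &objects,
             const std::optional<Selector> &selector = std::nullopt) {
  if (objects.empty())
    return std::nullopt;
  if (!selector)
    return objects.front();
  if (const auto *index = std::get_if<std::size_t>(&*selector)) {
    if (*index >= objects.size())
      return std::nullopt;
    return objects[*index];
  }
  const std::string &target = std::get<std::string>(*selector);
  for (const GpuObject &object : objects)
    if (object.target == target)
      return object;
  return std::nullopt;
}
```

With the two objects from the example, calling selectObject with index 1 or with the target name "amdgpu" yields the same object, mirroring the two equivalent gpu.binary forms shown above.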

Testing:
This infrastructure was tested on two systems, one with an NVIDIA V100 and the other with an AMD MI250X; in both cases the tests completed successfully.

Input files:

test.cpp (1 KB)
test_nvvm.mlir (2 KB)
test_rocdl.mlir (2 KB)

Steps for assembling the test for the NVIDIA system:

mlir-opt --gpu-to-llvm --gpu-module-to-binary test_nvvm.mlir | mlir-translate --mlir-to-llvmir -o test_nvptx.ll
clang++ test_nvptx.ll test.cpp -l

Output file: test_nvptx.ll

test_nvptx.ll (12 KB)

Steps for assembling the test for the AMD system:

mlir-opt --gpu-to-llvm --gpu-module-to-binary test_rocdl.mlir | mlir-translate --mlir-to-llvmir -o test_amdgpu.ll
clang++ test_amdgpu.ll test.cpp -l

Output file: test_amdgpu.ll

test_amdgpu.ll (30 KB)

Diff list

The following patches implement the proposal described in: https://discourse.llvm.org/t/rfc-extending-mlir-gpu-device-codegen-pipeline/70199/54 :

D154098: Add a GlobalSymbol trait.
D154097: Add a parameter for passing default values to StringRefParameter
D154100: Adds an utility class for serializing operations to binary strings.
D154104: Add GPU target attribute interface.
D154113: Add target attribute to GPU modules.
D154117: Adds the NVPTX target attribute.
D154129: Adds the AMDGPU target attribute.
D154108: Add the GPU object manager attribute interface.
D154132: Add gpu.binary op and #gpu.object attribute.
D154137: Modifies gpu.launch_func to allow lowering it after gpu-to-llvm.
D154147: Add the Select Object compilation attribute.
D154149: Add the gpu-module-to-binary pass.
D154152: Add GPU target support to gpu-to-llvm.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fmorac created this revision. Jun 29 2023, 2:12 PM

Herald added a reviewer: ftynse. Jun 29 2023, 2:12 PM

Herald added a reviewer: ThomasRaoux.

Herald added a reviewer: dcaballe.

Herald added a project: Restricted Project.

Herald added subscribers: gysit, Dinistro, bviyer and 25 others.

Harbormaster completed remote builds in B242239: Diff 535995. Jun 29 2023, 4:36 PM

fmorac mentioned this in D154098: [mlir] Add a `GlobalSymbol` trait. Jun 29 2023, 6:10 PM

fmorac mentioned this in D154097: [mlir] Add a parameter for passing default values to `StringRefParameter`.

Rebasing.

fmorac mentioned this in D154100: [mlir][Target][LLVM] Adds an utility class for serializing operations to binary strings. Jun 29 2023, 6:15 PM

fmorac mentioned this in D154104: [mlir][gpu] Add GPU target attribute interface.

fmorac mentioned this in D154113: [mlir][gpu] Add target attribute to GPU modules.

fmorac mentioned this in D154117: [mlir][NVVM] Adds the NVVM target attribute. Jun 29 2023, 6:18 PM

fmorac mentioned this in D154129: [mlir][ROCDL] Adds the ROCDL target attribute.

fmorac mentioned this in D154108: [mlir][gpu] Add the GPU offloading handler attribute interface.

fmorac mentioned this in D154132: [mlir][gpu] Add `gpu.binary` op and `#gpu.object` attribute.

fmorac mentioned this in D154137: [mlir][gpu] Modifies `gpu.launch_func` to allow lowering it after gpu-to-llvm.

fmorac mentioned this in D154147: [mlir][gpu] Add the Select Object compilation attribute.

fmorac mentioned this in D154149: [mlir][gpu] Add the `gpu-module-to-binary` pass. Jun 29 2023, 6:20 PM

fmorac edited the summary of this revision. Jun 29 2023, 7:03 PM

fmorac added reviewers: mehdi_amini, stellaraccident, krzysz00.

Harbormaster completed remote builds in B242306: Diff 536082. Jun 29 2023, 10:06 PM

fmorac edited the summary of this revision. Jun 30 2023, 6:25 AM

Herald added a subscriber: tpr. Jun 30 2023, 6:25 AM

fmorac published this revision for review. Jun 30 2023, 6:30 AM

Herald added a reviewer: nicolasvasilache. Jun 30 2023, 6:30 AM

Herald added a reviewer: herhut.

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

Matt added a subscriber: Matt. Jun 30 2023, 4:36 PM

fmorac mentioned this in rGa042a6502c17: [mlir] Add a parameter for passing default values to `StringRefParameter`. Jul 10 2023, 1:02 PM

Rebasing.

Herald added a subscriber: wangpc. Jul 22 2023, 12:12 PM

Harbormaster completed remote builds in B247435: Diff 543218. Jul 22 2023, 12:34 PM

Something that would be welcome here is to create a new entry in docs/ to explain the GPU lowering/translation/codegen/embedding flow, can you add that?

mlir/test/Target/LLVMIR/gpu.mlir
4	Is this attribute necessary here?
10	STURCT-> STRUCT?
10	Also can you add a line before the block of CHECK saying what is checked here?

In D154153#4526959, @mehdi_amini wrote:

Something that would be welcome here is to create a new entry in docs/ to explain the GPU lowering/translation/codegen/embedding flow, can you add that?

Yes, I'll create a new patch just for docs. There are actually a couple more patches not in this series; the idea behind this series was to agree on the base concept & implementation. But I'll add the docs and then modify them if necessary.

mlir/test/Target/LLVMIR/gpu.mlir
10	I'll add it.
10	I don't understand, what do you mean?

mehdi_amini mentioned this in D155563: [mlir][gpu] Improving Cubin Serialization with ptxas Compiler. Jul 24 2023, 10:34 AM

mehdi_amini added inline comments. Jul 24 2023, 11:07 PM

mlir/test/Target/LLVMIR/gpu.mlir
10	Maybe STURCT is intentional (I thought it was a typo), but I don't know what it means?

fmorac added inline comments. Jul 25 2023, 5:09 PM

mlir/test/Target/LLVMIR/gpu.mlir
10	Oh, yes, struct is the variable holding the struct with the args. I'll change the name & document.

fmorac mentioned this in D154152: [mlir][gpu] Add GPU target support to `gpu-to-llvm`. Aug 7 2023, 5:49 AM

Changed the name of the arguments & added comments indicating the purpose of the tests.

Harbormaster completed remote builds in B250762: Diff 547752. Aug 7 2023, 7:59 AM

fmorac added a child revision: D157351: [mlir][gpu] Add passes to attach (NVVM|ROCDL) target attributes to GPU Modules. Aug 7 2023, 7:25 PM

fmorac mentioned this in rGc8e0364a4336: [mlir][Target][LLVM] Adds an utility class for serializing operations to binary…. Aug 8 2023, 6:09 AM

fmorac mentioned this in rG86c4dfa209b5: [mlir][gpu] Add GPU target attribute interface.

fmorac mentioned this in rG9fa7b9ef21c4: [mlir][gpu] Add target attribute to GPU modules. Aug 8 2023, 6:20 AM

fmorac mentioned this in rG895c4ac33fc8: [mlir][Target][LLVM] Adds an utility class for serializing operations to binary…. Aug 8 2023, 7:49 AM

fmorac mentioned this in rG211c9752c820: [mlir][NVVM] Adds the NVVM target attribute. Aug 8 2023, 12:21 PM

fmorac added a child revision: D157461: [mlir][gpu] Add documentation for the new GPU compilation mechanism. Aug 8 2023, 5:23 PM

fmorac removed a child revision: D157461: [mlir][gpu] Add documentation for the new GPU compilation mechanism. Aug 8 2023, 5:24 PM

mehdi_amini accepted this revision. Aug 8 2023, 10:25 PM

mehdi_amini added inline comments.

mlir/lib/Target/LLVMIR/Dialect/GPU/CMakeLists.txt
15 ↗	(On Diff #547752)	Why private here?
mlir/lib/Target/LLVMIR/Dialect/GPU/GPUToLLVMIRTranslation.cpp
47	Why is gpu.module a no-op success?

This revision is now accepted and ready to land. Aug 8 2023, 10:25 PM

fmorac mentioned this in D157461: [mlir][gpu] Add documentation for the new GPU compilation mechanism. Aug 9 2023, 5:56 AM

fmorac added inline comments. Aug 9 2023, 6:02 PM

mlir/lib/Target/LLVMIR/Dialect/GPU/CMakeLists.txt
15 ↗	(On Diff #547752)	I'll change it; it's a leftover from a previous version.
mlir/lib/Target/LLVMIR/Dialect/GPU/GPUToLLVMIRTranslation.cpp
47	In trunk it's also a no-op success; I presume the reason is that there's nothing to translate.

mehdi_amini added inline comments. Aug 9 2023, 10:17 PM

mlir/lib/Target/LLVMIR/Dialect/GPU/GPUToLLVMIRTranslation.cpp
47	Not clear to me why: module { gpu.module @foo { } } shouldn't return a failure instead?

Removed private lib.

gpu.module is no-op success because the translation mechanism is not aware of its context.

For example, when converting gpu.module, it's not possible to distinguish between translating:

module {
 gpu.module @foo {
 }
}

and translating:

gpu.module @foo {
}

They both will invoke the same function with the same information.
It should be possible to distinguish the context by checking if the ops inside the module
have been converted or not, however that's an expensive check.
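
The dispatch under discussion can be modeled with a standalone C++ sketch. This is a toy illustration and not the MLIR code: OpKind, Result and the free function convertOperation below are hypothetical stand-ins for the dialect interface hook, which in the patch uses llvm::TypeSwitch over real operation types. The point it shows is that the hook sees only the operation it is handed, so gpu.module is accepted without doing any work.

```cpp
#include <cassert>
#include <string>

// Hypothetical operation kinds standing in for the GPU dialect ops in the patch.
enum class OpKind { GpuModule, Binary, LaunchFunc, Other };

// Hypothetical result type: success flag plus the action that was taken.
struct Result {
  bool ok;
  std::string action;
};

// Toy version of the per-operation hook. It receives only the operation kind,
// with no information about where the operation sits in the surrounding IR.
Result convertOperation(OpKind kind) {
  switch (kind) {
  case OpKind::GpuModule:
    return {true, "no-op"}; // nothing to translate, reported as success
  case OpKind::Binary:
    return {true, "embedBinary"}; // delegated to the offloading handler
  case OpKind::LaunchFunc:
    return {true, "launchKernel"}; // delegated to the offloading handler
  case OpKind::Other:
    break;
  }
  return {false, "unsupported GPU operation"};
}
```

In this model, convertOperation(OpKind::GpuModule) returns the same result whether the module is nested or at the top level, which is the ambiguity the comment above describes.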

Harbormaster completed remote builds in B251919: Diff 549338. Aug 11 2023, 6:09 AM

fmorac mentioned this in rG6a0feb1503e2: [mlir][ROCDL] Adds the ROCDL target attribute. Aug 11 2023, 12:44 PM

fmorac mentioned this in rG4760ea029a94: [mlir][gpu] Add the GPU offloading handler attribute interface. Aug 11 2023, 12:46 PM

fmorac mentioned this in rGbf24fb81acf4: [mlir][gpu] Add `gpu.binary` op and `#gpu.object` attribute.

fmorac mentioned this in rG068213130dc5: [mlir][ROCDL] Adds the ROCDL target attribute. Aug 11 2023, 2:48 PM

fmorac mentioned this in rGa63db3f5f5dc: [mlir][gpu] Modifies `gpu.launch_func` to allow lowering it after gpu-to-llvm. Aug 11 2023, 2:56 PM

fmorac mentioned this in rG8ae074b19597: [mlir][gpu] Add the Select Object compilation attribute. Aug 11 2023, 3:01 PM

fmorac mentioned this in rG43752a2aa31a: [mlir][gpu] Add the `gpu-module-to-binary` pass. Aug 11 2023, 5:25 PM

Closed by commit rGb43068e8707d: [mlir][gpu] Update GPU translation to accept binaries. (authored by fmorac). Aug 11 2023, 5:27 PM

This revision was automatically updated to reflect the committed changes.

fmorac mentioned this in rGfcfeb1e5b3cd: [mlir][gpu] Add GPU target support to `gpu-to-llvm`.

fmorac added a commit: rGb43068e8707d: [mlir][gpu] Update GPU translation to accept binaries.

fmorac mentioned this in rGa7cdea70095f: [mlir][gpu] Add documentation for the new GPU compilation mechanism. Aug 11 2023, 5:32 PM

Revision Contents

Path	Size
mlir/lib/Target/LLVMIR/CMakeLists.txt	2 lines
mlir/lib/Target/LLVMIR/ConvertToLLVMIR.cpp	4 lines
mlir/lib/Target/LLVMIR/Dialect/GPU/GPUToLLVMIRTranslation.cpp	36 lines
mlir/test/Target/LLVMIR/gpu.mlir	77 lines

Diff 549552

mlir/lib/Target/LLVMIR/CMakeLists.txt

[51 lines not shown; context: add_mlir_translation_library(MLIRToLLVMIRTranslationRegistration]
    MLIRBuiltinToLLVMIRTranslation
    MLIRGPUToLLVMIRTranslation
    MLIRX86VectorToLLVMIRTranslation
    MLIRLLVMToLLVMIRTranslation
    MLIRNVVMToLLVMIRTranslation
    MLIROpenACCToLLVMIRTranslation
    MLIROpenMPToLLVMIRTranslation
    MLIRROCDLToLLVMIRTranslation
+   MLIRNVVMTarget
+   MLIRROCDLTarget
    )

  add_mlir_translation_library(MLIRTargetLLVMIRImport
    DataLayoutImporter.cpp
    DebugImporter.cpp
    LoopAnnotationImporter.cpp
    ModuleImport.cpp
    TypeFromLLVM.cpp
[20 lines not shown]

mlir/lib/Target/LLVMIR/ConvertToLLVMIR.cpp

  //===- ConvertToLLVMIR.cpp - MLIR to LLVM IR conversion -------------------===//
  //
  // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
  // See https://llvm.org/LICENSE.txt for license information.
  // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
  //
  //===----------------------------------------------------------------------===//
  //
  // This file implements a translation between the MLIR LLVM dialect and LLVM IR.
  //
  //===----------------------------------------------------------------------===//

  #include "mlir/Dialect/DLTI/DLTI.h"
  #include "mlir/Dialect/Func/IR/FuncOps.h"
  #include "mlir/IR/BuiltinOps.h"
+ #include "mlir/Target/LLVM/NVVM/Target.h"
+ #include "mlir/Target/LLVM/ROCDL/Target.h"
  #include "mlir/Target/LLVMIR/Dialect/All.h"
  #include "mlir/Target/LLVMIR/Export.h"
  #include "mlir/Tools/mlir-translate/Translation.h"
  #include "llvm/IR/LLVMContext.h"
  #include "llvm/IR/Module.h"

  using namespace mlir;

  namespace mlir {
  void registerToLLVMIRTranslation() {
    TranslateFromMLIRRegistration registration(
        "mlir-to-llvmir", "Translate MLIR to LLVMIR",
        [](Operation *op, raw_ostream &output) {
          llvm::LLVMContext llvmContext;
          auto llvmModule = translateModuleToLLVMIR(op, llvmContext);
          if (!llvmModule)
            return failure();

          llvmModule->print(output, nullptr);
          return success();
        },
        [](DialectRegistry &registry) {
          registry.insert<DLTIDialect, func::FuncDialect>();
+         registerNVVMTarget(registry);
+         registerROCDLTarget(registry);
          registerAllToLLVMIRTranslations(registry);
        });
  }
  } // namespace mlir

mlir/lib/Target/LLVMIR/Dialect/GPU/GPUToLLVMIRTranslation.cpp

  //===- GPUToLLVMIRTranslation.cpp - Translate GPU dialect to LLVM IR ------===//
  //
  // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
  // See https://llvm.org/LICENSE.txt for license information.
  // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
  //
  //===----------------------------------------------------------------------===//
  //
  // This file implements a translation between the MLIR GPU dialect and LLVM IR.
  //
  //===----------------------------------------------------------------------===//
  #include "mlir/Target/LLVMIR/Dialect/GPU/GPUToLLVMIRTranslation.h"
  #include "mlir/Dialect/GPU/IR/GPUDialect.h"
  #include "mlir/Target/LLVMIR/LLVMTranslationInterface.h"
+ #include "llvm/ADT/TypeSwitch.h"

  using namespace mlir;

  namespace {
+ LogicalResult launchKernel(gpu::LaunchFuncOp launchOp,
+                            llvm::IRBuilderBase &builder,
+                            LLVM::ModuleTranslation &moduleTranslation) {
+   auto kernelBinary = SymbolTable::lookupNearestSymbolFrom<gpu::BinaryOp>(
+       launchOp, launchOp.getKernelModuleName());
+   if (!kernelBinary) {
+     launchOp.emitError("Couldn't find the binary holding the kernel: ")
+         << launchOp.getKernelModuleName();
+     return failure();
+   }
+   auto offloadingHandler =
+       dyn_cast<gpu::OffloadingLLVMTranslationAttrInterface>(
+           kernelBinary.getOffloadingHandlerAttr());
+   assert(offloadingHandler && "Invalid offloading handler.");
+   return offloadingHandler.launchKernel(launchOp, kernelBinary, builder,
+                                         moduleTranslation);
+ }
+
  class GPUDialectLLVMIRTranslationInterface
      : public LLVMTranslationDialectInterface {
  public:
    using LLVMTranslationDialectInterface::LLVMTranslationDialectInterface;

    LogicalResult
-   convertOperation(Operation *op, llvm::IRBuilderBase &builder,
+   convertOperation(Operation *operation, llvm::IRBuilderBase &builder,
                     LLVM::ModuleTranslation &moduleTranslation) const override {
-     return isa<gpu::GPUModuleOp>(op) ? success() : failure();
+     return llvm::TypeSwitch<Operation *, LogicalResult>(operation)
+         .Case([&](gpu::GPUModuleOp) { return success(); })
+         .Case([&](gpu::BinaryOp op) {
+           auto offloadingHandler =
+               dyn_cast<gpu::OffloadingLLVMTranslationAttrInterface>(
+                   op.getOffloadingHandlerAttr());
+           assert(offloadingHandler && "Invalid offloading handler.");
+           return offloadingHandler.embedBinary(op, builder, moduleTranslation);
+         })
+         .Case([&](gpu::LaunchFuncOp op) {
+           return launchKernel(op, builder, moduleTranslation);
+         })
+         .Default([&](Operation *op) {
+           return op->emitError("unsupported GPU operation: ") << op->getName();
+         });
    }
  };

  } // namespace

  void mlir::registerGPUDialectTranslation(DialectRegistry &registry) {
    registry.insert<gpu::GPUDialect>();
    registry.addExtension(+[](MLIRContext *ctx, gpu::GPUDialect *dialect) {
[10 lines not shown]

mlir/test/Target/LLVMIR/gpu.mlir

This file was added.

				// RUN: mlir-translate -mlir-to-llvmir -split-input-file %s | FileCheck %s

				// Checking the translation of the `gpu.binary` & `gpu.launch_func` ops.
				module attributes {gpu.container_module} {
				// CHECK: [[ARGS_TY:%.*]] = type { i32, i32 }
				// CHECK: @kernel_module_bin_cst = internal constant [4 x i8] c"BLOB", align 8
				// CHECK: @kernel_module_kernel_kernel_name = private unnamed_addr constant [7 x i8] c"kernel\00", align 1
				gpu.binary @kernel_module [#gpu.object<#nvvm.target, "BLOB">]
				llvm.func @foo() {
				// CHECK: [[ARGS:%.*]] = alloca %{{.*}}, align 8
				// CHECK: [[ARGS_ARRAY:%.*]] = alloca ptr, i64 2, align 8
				// CHECK: [[ARG0:%.*]] = getelementptr inbounds [[ARGS_TY]], ptr [[ARGS]], i32 0, i32 0
				// CHECK: store i32 32, ptr [[ARG0]], align 4
				// CHECK: %{{.*}} = getelementptr ptr, ptr [[ARGS_ARRAY]], i32 0
				// CHECK: store ptr [[ARG0]], ptr %{{.*}}, align 8
				// CHECK: [[ARG1:%.*]] = getelementptr inbounds [[ARGS_TY]], ptr [[ARGS]], i32 0, i32 1
				// CHECK: store i32 32, ptr [[ARG1]], align 4
				// CHECK: %{{.*}} = getelementptr ptr, ptr [[ARGS_ARRAY]], i32 1
				// CHECK: store ptr [[ARG1]], ptr %{{.*}}, align 8
				// CHECK: [[MODULE:%.*]] = call ptr @mgpuModuleLoad(ptr @kernel_module_bin_cst)
				// CHECK: [[FUNC:%.*]] = call ptr @mgpuModuleGetFunction(ptr [[MODULE]], ptr @kernel_module_kernel_kernel_name)
				// CHECK: [[STREAM:%.*]] = call ptr @mgpuStreamCreate()
				// CHECK: call void @mgpuLaunchKernel(ptr [[FUNC]], i64 8, i64 8, i64 8, i64 8, i64 8, i64 8, i32 256, ptr [[STREAM]], ptr [[ARGS_ARRAY]], ptr null)
				// CHECK: call void @mgpuStreamSynchronize(ptr [[STREAM]])
				// CHECK: call void @mgpuStreamDestroy(ptr [[STREAM]])
				// CHECK: call void @mgpuModuleUnload(ptr [[MODULE]])
				%0 = llvm.mlir.constant(8 : index) : i64
				%1 = llvm.mlir.constant(32 : i32) : i32
				%2 = llvm.mlir.constant(256 : i32) : i32
				gpu.launch_func @kernel_module::@kernel blocks in (%0, %0, %0) threads in (%0, %0, %0) : i64 dynamic_shared_memory_size %2 args(%1 : i32, %1 : i32)
				llvm.return
				}
				}

				// -----

				// Checking the correct selection of the second object using an index as a selector.
				module {
				// CHECK: @kernel_module_bin_cst = internal constant [1 x i8] c"1", align 8
				gpu.binary @kernel_module <#gpu.select_object<1>> [#gpu.object<#nvvm.target, "0">, #gpu.object<#nvvm.target, "1">]
				}

				// -----

				// Checking the correct selection of the second object using a target as a selector.
				module {
				// CHECK: @kernel_module_bin_cst = internal constant [6 x i8] c"AMDGPU", align 8
				gpu.binary @kernel_module <#gpu.select_object<#rocdl.target>> [#gpu.object<#nvvm.target, "NVPTX">, #gpu.object<#rocdl.target, "AMDGPU">]
				}

				// -----

				// Checking the translation of `gpu.launch_func` with an async dependency.
				module attributes {gpu.container_module} {
				// CHECK: @kernel_module_bin_cst = internal constant [4 x i8] c"BLOB", align 8
				gpu.binary @kernel_module [#gpu.object<#rocdl.target, "BLOB">]
				llvm.func @foo() {
				%0 = llvm.mlir.constant(8 : index) : i64
				// CHECK: = call ptr @mgpuStreamCreate()
				// CHECK-NEXT: = alloca {{.*}}, align 8
				// CHECK-NEXT: [[ARGS:%.*]] = alloca ptr, i64 0, align 8
				// CHECK-NEXT: [[MODULE:%.*]] = call ptr @mgpuModuleLoad(ptr @kernel_module_bin_cst)
				// CHECK-NEXT: [[FUNC:%.*]] = call ptr @mgpuModuleGetFunction(ptr [[MODULE]], ptr @kernel_module_kernel_kernel_name)
				// CHECK-NEXT: call void @mgpuLaunchKernel(ptr [[FUNC]], i64 8, i64 8, i64 8, i64 8, i64 8, i64 8, i32 0, ptr {{.*}}, ptr [[ARGS]], ptr null)
				// CHECK-NEXT: call void @mgpuModuleUnload(ptr [[MODULE]])
				// CHECK-NEXT: call void @mgpuStreamSynchronize(ptr %{{.*}})
				// CHECK-NEXT: call void @mgpuStreamDestroy(ptr %{{.*}})
				%1 = llvm.call @mgpuStreamCreate() : () -> !llvm.ptr
				gpu.launch_func <%1 : !llvm.ptr> @kernel_module::@kernel blocks in (%0, %0, %0) threads in (%0, %0, %0) : i64
				llvm.call @mgpuStreamSynchronize(%1) : (!llvm.ptr) -> ()
				llvm.call @mgpuStreamDestroy(%1) : (!llvm.ptr) -> ()
				llvm.return
				}
				llvm.func @mgpuStreamCreate() -> !llvm.ptr
				llvm.func @mgpuStreamSynchronize(!llvm.ptr)
				llvm.func @mgpuStreamDestroy(!llvm.ptr)
				}
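
The runtime call order that the CHECK lines in the first and last test cases assert can be summarized with a standalone C++ sketch. It is a toy model only: launchCallSequence is a hypothetical helper that returns the names of the mgpu* calls emitted for a single gpu.launch_func, taken from the CHECK lines above, and it neither calls nor reproduces the real runtime wrappers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: lists, in order, the runtime calls the translation emits
// for one launch, as asserted by the CHECK lines in gpu.mlir.
// When the launch carries an async dependency, the stream comes from the
// caller, so no stream is created, synchronized or destroyed by the launch.
std::vector<std::string> launchCallSequence(bool hasAsyncStream) {
  std::vector<std::string> calls;
  calls.push_back("mgpuModuleLoad");
  calls.push_back("mgpuModuleGetFunction");
  if (!hasAsyncStream)
    calls.push_back("mgpuStreamCreate");
  calls.push_back("mgpuLaunchKernel");
  if (!hasAsyncStream) {
    calls.push_back("mgpuStreamSynchronize");
    calls.push_back("mgpuStreamDestroy");
  }
  calls.push_back("mgpuModuleUnload");
  return calls;
}
```

In the async test case the stream create, synchronize and destroy calls still appear in the output, but they originate from the llvm.call ops in the input rather than from the launch translation.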