Details

Reviewers

bondhugula
ThomasRaoux
mehdi_amini
nicolasvasilache
herhut
ftynse
dcaballe

Commits

rG43752a2aa31a: [mlir][gpu] Add the `gpu-module-to-binary` pass.

Summary

For an explanation of these patches see D154153.

Commit message:
This pass converts GPU modules into GPU binaries, serializing all targets present
in a GPU module by invoking the serializeToObject target attribute method.

Depends on D154147
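
To make the commit message concrete, here is a minimal Python model of what the pass does to one GPU module. It is only a sketch: `GpuModule`, `GpuBinary`, `serialize_module`, and `fake_serializer` are hypothetical stand-ins for `gpu.module`, `gpu.binary`, and the `serializeToObject` target attribute method, not MLIR APIs.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class GpuModule:
    """Hypothetical stand-in for a `gpu.module` op with its target attributes."""
    name: str
    targets: List[str]


@dataclass
class GpuBinary:
    """Hypothetical stand-in for a `gpu.binary` op: one object per target."""
    name: str
    objects: List[Tuple[str, bytes]]


def serialize_module(
    module: GpuModule,
    serialize_to_object: Callable[[GpuModule, str], Optional[bytes]],
) -> Optional[GpuBinary]:
    """Serialize every target of `module`; give up if any target fails."""
    objects = []
    for target in module.targets:
        blob = serialize_to_object(module, target)
        if blob is None:
            # Models the pass reporting an error and returning failure.
            return None
        objects.append((target, blob))
    return GpuBinary(module.name, objects)


def fake_serializer(module: GpuModule, target: str) -> Optional[bytes]:
    """Toy serializer: pretends only NVVM-like targets can be compiled."""
    if not target.startswith("nvvm"):
        return None
    return f"{module.name}:{target}".encode()


binary = serialize_module(
    GpuModule("kernel_module1", ["nvvm-sm_70", "nvvm"]), fake_serializer
)
failed = serialize_module(GpuModule("kernel_module2", ["unknown"]), fake_serializer)
```

The point of the model is the shape of the result: the binary keeps the module's name and carries exactly one object per target, and a single failing target aborts the whole module.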

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fmorac created this revision. · Jun 29 2023, 1:53 PM

Herald added a reviewer: bondhugula. · Jun 29 2023, 1:53 PM

Herald added a reviewer: ThomasRaoux.

Herald added a project: Restricted Project.

Herald added subscribers: bviyer, Moerafaat, zero9178 and 22 others.

fmorac added a child revision: D154152: [mlir][gpu] Add GPU target support to `gpu-to-llvm`. · Jun 29 2023, 2:04 PM

Harbormaster completed remote builds in B242231: Diff 535986. · Jun 29 2023, 4:31 PM

Rebasing.

fmorac edited the summary of this revision. · Jun 29 2023, 6:20 PM

fmorac added a reviewer: mehdi_amini.

fmorac mentioned this in D154153: [mlir][gpu] Update GPU translation to accept binaries. · Jun 29 2023, 7:03 PM

Harbormaster completed remote builds in B242304: Diff 536080. · Jun 29 2023, 10:13 PM

fmorac published this revision for review. · Jun 30 2023, 6:30 AM

Herald added a reviewer: nicolasvasilache. · Jun 30 2023, 6:30 AM

Herald added a reviewer: herhut.

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

krzysz00 added a subscriber: krzysz00. · Jul 13 2023, 12:32 PM

krzysz00 added inline comments.

mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
66	Are these new `#define`s that I missed in the patch series?

Rebasing.

Harbormaster completed remote builds in B247414: Diff 543194. · Jul 22 2023, 9:19 AM

mehdi_amini added inline comments. · Jul 24 2023, 12:56 AM

mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
74	Same as other patches: can this specify the interface instead of "Attribute"?
mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
41	The restriction on `ModuleOp` does not seem necessary to me. I would let it be an OperationPass and specify that this finds all the GPUModule ops immediately present in the attached regions and converts them to GPUBinaryOps holding a serialization of the result of the codegen of the GPUModule based on its target information.
43	It's not useful to repeat the summary, please flesh this out.
44	If the pass runs on a GPUModuleOp present in its input, then this shouldn't be needed.
48	Document please. Do you have tests for it as well?
mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
97	Nit: dyn_cast without check for nullptr
121	Patterns aren't obviously appropriate to me. I would do something a bit more lightweight:
	void GpuModuleToBinaryPass::runOnOperation() {
	  for (Region &region : getOperation()->getRegion()) {
	    for (Block &block : region.getBlocks()) {
	      for (auto gpuModuleOP : make_early_inc(block.getOperations<GPUModuleOp>()))
	        process(gpuModuleOP);
	    }
	  }
	}
	If we need a public API, it can be one that takes a GPUModuleOp and an OpBuilder and builds a GPUBinaryOp.
mlir/test/Dialect/GPU/module-to-binary.mlir
1 ↗	(On Diff #543194)	There are no diagnostics to verify here?
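
The lightweight traversal suggested in the comment on line 121 can be modelled in a few lines of Python. This is a sketch under assumptions, not MLIR code: `Op`, its `regions` (lists of blocks, each a list of ops), `convert`, and `modules_to_binaries` are hypothetical. Iterating over a copy of each block plays the role of an early-increment range, so the block can be mutated during the walk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """Hypothetical op: a kind, a name, and regions -> blocks -> ops."""
    kind: str
    name: str = ""
    regions: List[List[List["Op"]]] = field(default_factory=list)


def convert(module: Op) -> Op:
    """Hypothetical conversion of one GPU module into a GPU binary."""
    return Op("gpu.binary", module.name)


def modules_to_binaries(root: Op) -> int:
    """Replace GPU modules found directly in the blocks of `root`'s regions.

    Only ops immediately nested under `root` are visited; there is no
    recursion into their own regions.
    """
    replaced = 0
    for region in root.regions:
        for block in region:
            # Walk a snapshot of the block so that writing into `block`
            # cannot disturb the iteration.
            for index, op in enumerate(list(block)):
                if op.kind == "gpu.module":
                    block[index] = convert(op)
                    replaced += 1
    return replaced


inner = Op("gpu.module", "nested")
func = Op("func.func", "host", regions=[[[inner]]])
top = Op(
    "builtin.module",
    regions=[[[Op("gpu.module", "k1"), func, Op("gpu.module", "k2")]]],
)
count = modules_to_binaries(top)
```

In this model the module nested inside `func` is left alone, which matches the "immediately present in the attached regions" wording of the review comment.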

fmorac added inline comments. · Jul 24 2023, 5:53 AM

mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
74	I'll change it.
mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
41	I'll change it.
43	I'll update it.
44	I'll remove it.
48	I'll add better documentation. Tests: currently no. I think I need to add a method for parsing attributes as options for this to work as a command-line option; it was added here to have the option available from C++. I can create a patch for that (command-line parsing of attributes) and then modify this.
mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
66	These are introduced in the CMakeLists.txt; I added them so as not to reuse the ones from the existing pipeline.
97	I'll change it.
121	Yeah, the reason for doing patterns was just to expose it to be used in other passes. But I can change it.
mlir/test/Dialect/GPU/module-to-binary.mlir
1 ↗	(On Diff #543194)	You're right, I'll remove it.

Updated tests, changed the Pass to an OperationPass and switched from patterns to looking for nested Modules.

Harbormaster completed remote builds in B250755: Diff 547745. · Aug 7 2023, 7:37 AM

mehdi_amini accepted this revision. · Aug 8 2023, 10:21 PM

mehdi_amini added inline comments.

mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
97	Still present

This revision is now accepted and ready to land. · Aug 8 2023, 10:21 PM

Added an assert check for the attribute.
Note: The verifier will also verify that the dyn_cast is valid, as it's a precondition on the elements of the array.

Harbormaster completed remote builds in B251915: Diff 549334. · Aug 11 2023, 6:12 AM

Separated gpu-module-to-binary test into 2 tests: nvvm & rocdl.
Also updated lit.cfg.py with:

if config.run_cuda_tests:
    config.available_features.add("host-supports-nvptx")

if config.run_rocm_tests:
    config.available_features.add("host-supports-amdgpu")

This allows tests to use `// REQUIRES: host-supports-nvptx` (or `host-supports-amdgpu`) so that they are disabled on hosts without the corresponding support.
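
As a rough illustration of how such feature flags gate tests, the following Python sketch mimics the decision in miniature. It does not use lit itself: `available_features`, `required_features`, and `should_run` are hypothetical helpers, and real `REQUIRES:` lines also accept boolean expressions that this sketch ignores.

```python
import re
from typing import List, Set


def available_features(run_cuda_tests: bool, run_rocm_tests: bool) -> Set[str]:
    """Build the feature set the same way the lit.cfg.py snippet does."""
    features = set()
    if run_cuda_tests:
        features.add("host-supports-nvptx")
    if run_rocm_tests:
        features.add("host-supports-amdgpu")
    return features


def required_features(test_source: str) -> List[str]:
    """Collect comma-separated feature names from `// REQUIRES:` lines."""
    names = []
    for line in test_source.splitlines():
        match = re.match(r"\s*//\s*REQUIRES:\s*(.*)", line)
        if match:
            names += [n.strip() for n in match.group(1).split(",") if n.strip()]
    return names


def should_run(test_source: str, features: Set[str]) -> bool:
    """A test runs only if every required feature is available."""
    return all(name in features for name in required_features(test_source))


nvvm_test = "// REQUIRES: host-supports-nvptx\n// RUN: mlir-opt %s\n"
cuda_only = available_features(run_cuda_tests=True, run_rocm_tests=False)
```

With `cuda_only`, the NVVM test would run while a test requiring `host-supports-amdgpu` would be skipped, which is why the original test was split into one file per target.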

Herald added a reviewer: ftynse. · Aug 11 2023, 4:07 PM

Herald added a reviewer: dcaballe.

Herald added subscribers: gysit, Dinistro, mattd, awarzynski.

Harbormaster completed remote builds in B252061: Diff 549542. · Aug 11 2023, 4:08 PM

Fix patch application.

Harbormaster completed remote builds in B252063: Diff 549543. · Aug 11 2023, 5:18 PM

Closed by commit rG43752a2aa31a: [mlir][gpu] Add the `gpu-module-to-binary` pass. (authored by fmorac). · Aug 11 2023, 5:25 PM

This revision was automatically updated to reflect the committed changes.

fmorac added a commit: rG43752a2aa31a: [mlir][gpu] Add the `gpu-module-to-binary` pass.

fmorac mentioned this in rGb43068e8707d: [mlir][gpu] Update GPU translation to accept binaries. · Aug 11 2023, 5:29 PM

Diff 549549

mlir/include/mlir/Dialect/GPU/Transforms/Passes.h

(64 lines not shown)
 /// Collect all patterns to rewrite ops within the GPU dialect.
 inline void populateGpuRewritePatterns(RewritePatternSet &patterns) {
   populateGpuAllReducePatterns(patterns);
   populateGpuGlobalIdPatterns(patterns);
   populateGpuShufflePatterns(patterns);
 }

 namespace gpu {
+/// Searches for all GPU modules in `op` and transforms them into GPU binary
+/// operations. The resulting `gpu.binary` has `handler` as its offloading
+/// handler attribute.
+LogicalResult transformGpuModulesToBinaries(
+    Operation *op, OffloadingLLVMTranslationAttrInterface handler = nullptr,
+    const gpu::TargetOptions &options = {});
+
 /// Base pass class to serialize kernel functions through LLVM into
 /// user-specified IR and add the resulting blob as module attribute.
 class SerializeToBlobPass : public OperationPass<gpu::GPUModuleOp> {
 public:
   SerializeToBlobPass(TypeID passID);
   SerializeToBlobPass(const SerializeToBlobPass &other);

   void runOnOperation() final;
(85 lines not shown)

mlir/include/mlir/Dialect/GPU/Transforms/Passes.td

(32 lines not shown; enclosing context: def GpuMapParallelLoopsPass)
     : Pass<"gpu-map-parallel-loops", "mlir::func::FuncOp"> {
   let summary = "Greedily maps loops to GPU hardware dimensions.";
   let constructor = "mlir::createGpuMapParallelLoopsPass()";
   let description = "Greedily maps loops to GPU hardware dimensions.";
   let dependentDialects = ["mlir::gpu::GPUDialect"];
 }

 def GpuDecomposeMemrefsPass : Pass<"gpu-decompose-memrefs"> {
   let summary = "Decomposes memref index computation into explicit ops.";
   let description = [{
     This pass decomposes memref index computation into explicit computations on
     sizes/strides, obtained from `memref.extract_memref_metadata` which it tries
     to place outside of `gpu.launch` body. Memrefs are then reconstructed using
     `memref.reinterpret_cast`.
     This is needed for as some targets (SPIR-V) lower memrefs to bare pointers
     and sizes/strides for dynamically-sized memrefs are not available inside
     `gpu.launch`.
   }];
   let constructor = "mlir::createGpuDecomposeMemrefsPass()";
   let dependentDialects = [
     "mlir::gpu::GPUDialect", "mlir::memref::MemRefDialect",
     "mlir::affine::AffineDialect"
   ];
 }

+def GpuModuleToBinaryPass
+    : Pass<"gpu-module-to-binary", ""> {
+  let summary = "Transforms a GPU module into a GPU binary.";
+  let description = [{
+    This pass searches for all nested GPU modules and serializes the module
+    using the target attributes attached to the module, producing a GPU binary
+    with an object for every target.
+
+    The `format` argument can have the following values:
+    1. `offloading`, `llvm`: producing an offloading representation.
+    2. `assembly`, `isa`: producing assembly code.
+    3. `binary`, `bin`: producing binaries.
+  }];
+  let options = [
+    Option<"offloadingHandler", "handler", "Attribute", "nullptr",
+           "Offloading handler to be attached to the resulting binary op.">,
+    Option<"toolkitPath", "toolkit", "std::string", [{""}],
+           "Toolkit path.">,
+    ListOption<"linkFiles", "l", "std::string",
+           "Extra files to link to.">,
+    Option<"cmdOptions", "opts", "std::string", [{""}],
+           "Command line options to pass to the tools.">,
+    Option<"compilationTarget", "format", "std::string", [{"bin"}],
+           "The target representation of the compilation process.">
+  ];
+}
+
 #endif // MLIR_DIALECT_GPU_PASSES
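
The `format` aliases listed in the pass description can be summarised as a small lookup. This Python sketch only illustrates the mapping: the `CompilationTarget` enum and `parse_format` are hypothetical names, and returning `None` for unknown strings is a choice made here for the example.

```python
from enum import Enum
from typing import Optional


class CompilationTarget(Enum):
    """Hypothetical mirror of the three output kinds named in the description."""
    OFFLOAD = "offload"
    ASSEMBLY = "assembly"
    BINARY = "binary"


# Each kind accepts two spellings, as in the pass description.
_ALIASES = {
    "offloading": CompilationTarget.OFFLOAD,
    "llvm": CompilationTarget.OFFLOAD,
    "assembly": CompilationTarget.ASSEMBLY,
    "isa": CompilationTarget.ASSEMBLY,
    "binary": CompilationTarget.BINARY,
    "bin": CompilationTarget.BINARY,
}


def parse_format(name: str) -> Optional[CompilationTarget]:
    """Return the target kind for `name`, or None if it is not recognised."""
    return _ALIASES.get(name)


default_target = parse_format("bin")  # "bin" is the option's default value
```

A caller built on this sketch would treat a `None` result as a hard error before constructing any target options.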

mlir/lib/Dialect/GPU/CMakeLists.txt

(45 lines not shown)

 add_mlir_dialect_library(MLIRGPUTransforms
   Transforms/AllReduceLowering.cpp
   Transforms/AsyncRegionRewriter.cpp
   Transforms/DecomposeMemrefs.cpp
   Transforms/GlobalIdRewriter.cpp
   Transforms/KernelOutlining.cpp
   Transforms/MemoryPromotion.cpp
+  Transforms/ModuleToBinary.cpp
   Transforms/ParallelLoopMapper.cpp
   Transforms/SerializeToBlob.cpp
   Transforms/SerializeToCubin.cpp
   Transforms/SerializeToHsaco.cpp
   Transforms/ShuffleRewriter.cpp

   ADDITIONAL_HEADER_DIRS
   ${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/GPU
(18 lines not shown; enclosing context: add_mlir_dialect_library(MLIRGPUTransforms)
   MLIRExecutionEngineUtils
   MLIRGPUDialect
   MLIRIR
   MLIRIndexDialect
   MLIRLLVMDialect
   MLIRGPUToLLVMIRTranslation
   MLIRLLVMToLLVMIRTranslation
   MLIRMemRefDialect
+  MLIRNVVMTarget
   MLIRPass
   MLIRSCFDialect
   MLIRSideEffectInterfaces
   MLIRSupport
+  MLIRROCDLTarget
   MLIRTransformUtils
   )

 add_subdirectory(TransformOps)

 if(MLIR_ENABLE_CUDA_RUNNER)
   if(NOT MLIR_ENABLE_CUDA_CONVERSIONS)
     message(SEND_ERROR
(53 lines not shown)

mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp

This file was added.

//===- ModuleToBinary.cpp - Transforms GPU modules to GPU binaries ----------=//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file implements the `GpuModuleToBinaryPass` pass, transforming GPU
// modules into GPU binaries.
//
//===----------------------------------------------------------------------===//

#include "mlir/Dialect/GPU/Transforms/Passes.h"

#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/Dialect/GPU/IR/GPUDialect.h"
#include "mlir/IR/BuiltinOps.h"
#include "mlir/Target/LLVM/NVVM/Target.h"
#include "mlir/Target/LLVM/ROCDL/Target.h"
#include "mlir/Target/LLVMIR/Dialect/GPU/GPUToLLVMIRTranslation.h"
#include "mlir/Target/LLVMIR/Dialect/LLVMIR/LLVMToLLVMIRTranslation.h"
#include "mlir/Transforms/GreedyPatternRewriteDriver.h"

#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringSwitch.h"

using namespace mlir;
using namespace mlir::gpu;

namespace mlir {
#define GEN_PASS_DEF_GPUMODULETOBINARYPASS
#include "mlir/Dialect/GPU/Transforms/Passes.h.inc"
} // namespace mlir

namespace {
class GpuModuleToBinaryPass
    : public impl::GpuModuleToBinaryPassBase<GpuModuleToBinaryPass> {
public:
  using Base::Base;
  void getDependentDialects(DialectRegistry &registry) const override;
  void runOnOperation() final;
};
} // namespace

void GpuModuleToBinaryPass::getDependentDialects(
    DialectRegistry &registry) const {
  // Register all GPU related translations.
  registerLLVMDialectTranslation(registry);
  registerGPUDialectTranslation(registry);
#if MLIR_CUDA_CONVERSIONS_ENABLED == 1
  registerNVVMTarget(registry);
#endif
#if MLIR_ROCM_CONVERSIONS_ENABLED == 1
  registerROCDLTarget(registry);
#endif
}

void GpuModuleToBinaryPass::runOnOperation() {
  RewritePatternSet patterns(&getContext());
  int targetFormat = llvm::StringSwitch<int>(compilationTarget)
                         .Cases("offloading", "llvm", TargetOptions::offload)
                         .Cases("assembly", "isa", TargetOptions::assembly)
                         .Cases("binary", "bin", TargetOptions::binary)
                         .Default(-1);
  if (targetFormat == -1)
    getOperation()->emitError() << "Invalid format specified.";
  TargetOptions targetOptions(
      toolkitPath, linkFiles, cmdOptions,
      static_cast<TargetOptions::CompilationTarget>(targetFormat));
  if (failed(transformGpuModulesToBinaries(
          getOperation(),
          offloadingHandler ? dyn_cast<OffloadingLLVMTranslationAttrInterface>(
                                  offloadingHandler.getValue())
                            : OffloadingLLVMTranslationAttrInterface(nullptr),
          targetOptions)))
    return signalPassFailure();
}

namespace {
LogicalResult moduleSerializer(GPUModuleOp op,
                               OffloadingLLVMTranslationAttrInterface handler,
                               const TargetOptions &targetOptions) {
  OpBuilder builder(op->getContext());
  SmallVector<Attribute> objects;
  // Serialize all targets.
  for (auto targetAttr : op.getTargetsAttr()) {
    assert(targetAttr && "Target attribute cannot be null.");
    auto target = dyn_cast<gpu::TargetAttrInterface>(targetAttr);
    assert(target &&
           "Target attribute doesn't implements `TargetAttrInterface`.");
    std::optional<SmallVector<char, 0>> object =
        target.serializeToObject(op, targetOptions);

    if (!object) {
      op.emitError("An error happened while serializing the module.");
      return failure();
    }

    objects.push_back(builder.getAttr<gpu::ObjectAttr>(
        target,
        builder.getStringAttr(StringRef(object->data(), object->size()))));
  }
  builder.setInsertionPointAfter(op);
  builder.create<gpu::BinaryOp>(op.getLoc(), op.getName(), handler,
                                builder.getArrayAttr(objects));
  op->erase();
  return success();
}
} // namespace

LogicalResult mlir::gpu::transformGpuModulesToBinaries(
    Operation *op, OffloadingLLVMTranslationAttrInterface handler,
    const gpu::TargetOptions &targetOptions) {
  for (Region &region : op->getRegions())
    for (Block &block : region.getBlocks())
      for (auto module :
           llvm::make_early_inc_range(block.getOps<GPUModuleOp>()))
        if (failed(moduleSerializer(module, handler, targetOptions)))
          return failure();
  return success();
}

mlir/test/Dialect/GPU/module-to-binary-nvvm.mlir

This file was added.

// REQUIRES: host-supports-nvptx
// RUN: mlir-opt %s --gpu-module-to-binary="format=llvm" | FileCheck %s
// RUN: mlir-opt %s --gpu-module-to-binary="format=isa" | FileCheck %s

module attributes {gpu.container_module} {
  // CHECK-LABEL:gpu.binary @kernel_module1
  // CHECK:[#gpu.object<#nvvm.target<chip = "sm_70">, "{{.*}}">]
  gpu.module @kernel_module1 [#nvvm.target<chip = "sm_70">] {
    llvm.func @kernel(%arg0: i32, %arg1: !llvm.ptr<f32>,
        %arg2: !llvm.ptr<f32>, %arg3: i64, %arg4: i64,
        %arg5: i64) attributes {gpu.kernel} {
      llvm.return
    }
  }

  // CHECK-LABEL:gpu.binary @kernel_module2
  // CHECK:[#gpu.object<#nvvm.target<flags = {fast}>, "{{.*}}">, #gpu.object<#nvvm.target, "{{.*}}">]
  gpu.module @kernel_module2 [#nvvm.target<flags = {fast}>, #nvvm.target] {
    llvm.func @kernel(%arg0: i32, %arg1: !llvm.ptr<f32>,
        %arg2: !llvm.ptr<f32>, %arg3: i64, %arg4: i64,
        %arg5: i64) attributes {gpu.kernel} {
      llvm.return
    }
  }
}

mlir/test/Dialect/GPU/module-to-binary-rocdl.mlir

This file was added.

// REQUIRES: host-supports-amdgpu
// RUN: mlir-opt %s --gpu-module-to-binary="format=llvm" | FileCheck %s
// RUN: mlir-opt %s --gpu-module-to-binary="format=isa" | FileCheck %s

module attributes {gpu.container_module} {
  // CHECK-LABEL:gpu.binary @kernel_module1
  // CHECK:[#gpu.object<#rocdl.target<chip = "gfx90a">, "{{.*}}">]
  gpu.module @kernel_module1 [#rocdl.target<chip = "gfx90a">] {
    llvm.func @kernel(%arg0: i32, %arg1: !llvm.ptr<f32>,
        %arg2: !llvm.ptr<f32>, %arg3: i64, %arg4: i64,
        %arg5: i64) attributes {gpu.kernel} {
      llvm.return
    }
  }

  // CHECK-LABEL:gpu.binary @kernel_module2
  // CHECK:[#gpu.object<#rocdl.target<flags = {fast}>, "{{.*}}">, #gpu.object<#rocdl.target, "{{.*}}">]
  gpu.module @kernel_module2 [#rocdl.target<flags = {fast}>, #rocdl.target] {
    llvm.func @kernel(%arg0: i32, %arg1: !llvm.ptr<f32>,
        %arg2: !llvm.ptr<f32>, %arg3: i64, %arg4: i64,
        %arg5: i64) attributes {gpu.kernel} {
      llvm.return
    }
  }
}

mlir/test/lit.cfg.py

(204 lines not shown; enclosing context: def have_host_jit_feature_support(feature_name):)
     mlir_cpu_runner_out = mlir_cpu_runner_cmd.stdout.read().decode("ascii")
     mlir_cpu_runner_cmd.wait()

     return "true" in mlir_cpu_runner_out


 if have_host_jit_feature_support("jit"):
     config.available_features.add("host-supports-jit")
+
+if config.run_cuda_tests:
+    config.available_features.add("host-supports-nvptx")
+
+if config.run_rocm_tests:
+    config.available_features.add("host-supports-amdgpu")

This is an archive of the discontinued LLVM Phabricator instance.

[mlir][gpu] Add the `gpu-module-to-binary` pass.
Closed · Public
