Details

Reviewers

bondhugula
ThomasRaoux
mehdi_amini
nicolasvasilache
herhut
ftynse
dcaballe

Commits

rG43752a2aa31a: [mlir][gpu] Add the `gpu-module-to-binary` pass.

Summary

For an explanation of these patches see D154153.

Commit message:
This pass converts GPU modules into GPU binaries, serializing all targets present
in a GPU module by invoking the serializeToObject target attribute method.

Depends on D154147

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fmorac created this revision. · Jun 29 2023, 1:53 PM

Herald added a reviewer: bondhugula. · Jun 29 2023, 1:53 PM

Herald added a reviewer: ThomasRaoux.

Herald added a project: Restricted Project.

Herald added subscribers: bviyer, Moerafaat, zero9178 and 22 others.

fmorac added a child revision: D154152: [mlir][gpu] Add GPU target support to `gpu-to-llvm`. · Jun 29 2023, 2:04 PM

Harbormaster completed remote builds in B242231: Diff 535986. · Jun 29 2023, 4:31 PM

Rebasing.

fmorac edited the summary of this revision. · Jun 29 2023, 6:20 PM

fmorac added a reviewer: mehdi_amini.

fmorac mentioned this in D154153: [mlir][gpu] Update GPU translation to accept binaries. · Jun 29 2023, 7:03 PM

Harbormaster completed remote builds in B242304: Diff 536080. · Jun 29 2023, 10:13 PM

fmorac published this revision for review. · Jun 30 2023, 6:30 AM

Herald added a reviewer: nicolasvasilache. · Jun 30 2023, 6:30 AM

Herald added a reviewer: herhut.

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

krzysz00 added a subscriber: krzysz00. · Jul 13 2023, 12:32 PM

krzysz00 added inline comments.

mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
66	Are these new #defines I missed in the patch series?

Rebasing.

Harbormaster completed remote builds in B247414: Diff 543194. · Jul 22 2023, 9:19 AM

mehdi_amini added inline comments. · Jul 24 2023, 12:56 AM

mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
74	Same as other patches: can this specify the interface instead of "Attribute"?
mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
41	The restriction on `ModuleOp` does not seem necessary to me. I would let it be an OperationPass and specify that it finds all the GPUModule ops immediately present in the attached regions and converts them to GPUBinaryOps, each holding a serialization of the result of the codegen of the GPUModule based on its target information.
43	It's not useful to repeat the summary, please flesh this out.
44	If the pass runs on a GPUModuleOp present in its input, then this shouldn't be needed.
48	Document please. Do you have tests for it as well?
mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
97	Nit: dyn_cast without check for nullptr
121	Patterns aren't obviously appropriate to me. I would do something a bit more lightweight:
	void GpuModuleToBinaryPass::runOnOperation() {
	  for (Region &region : getOperation()->getRegion()) {
	    for (Block &block : region.getBlocks()) {
	      for (auto gpuModuleOP : make_early_inc(block.getOperations<GPUModuleOp>()))
	        process(gpuModuleOP);
	    }
	  }
	}
	If we need a public API, it can be one that takes a GPUModuleOp and an OpBuilder and builds a GPUBinaryOp.
mlir/test/Dialect/GPU/module-to-binary.mlir
2	Are there no diagnostics to verify here?

fmorac added inline comments. · Jul 24 2023, 5:53 AM

mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
74	I'll change it.
mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
41	I'll change it.
43	I'll update it.
44	I'll remove it.
48	I'll add better documentation. Tests: currently no. I think I need to add a method for parsing attributes as options for this to work as a command-line option; it was added here to have the option available from C++. I can create a patch for that (command-line parsing of attributes) and then modify this.
mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
66	These are introduced in the CMakeLists.txt; I added them to avoid reusing the ones from the existing pipeline.
97	I'll change it.
121	Yeah, the reason for doing patterns was just to expose it to be used in other passes. But I can change it.
mlir/test/Dialect/GPU/module-to-binary.mlir
2	You're right, I'll remove it.

Updated tests, changed the Pass to an OperationPass and switched from patterns to looking for nested Modules.
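
The switch from patterns to a direct walk depends on early-increment iteration: each `gpu.module` is erased while its enclosing block is still being traversed, so the iterator has to move on before the current op is removed. A minimal sketch of that idea using only the standard library follows; `Op` and `convertModules` are hypothetical stand-ins rather than MLIR APIs, and `std::list` only approximates a block's operation list.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for an operation stored in a block's operation list.
struct Op {
  std::string name;
  bool isGpuModule;
};

// Replaces every GPU-module entry with a "gpu.binary" entry and erases the
// original. The iterator is advanced before the current element is erased,
// which is the idea behind llvm::make_early_inc_range.
inline int convertModules(std::list<Op> &block) {
  int converted = 0;
  for (auto it = block.begin(); it != block.end();) {
    auto current = it++; // early increment: `it` no longer refers to `current`
    if (!current->isGpuModule)
      continue;
    // Insert the replacement right after `current`, then erase the original.
    block.insert(it, Op{"gpu.binary @" + current->name, false});
    block.erase(current);
    ++converted;
  }
  return converted;
}
```

With `llvm::make_early_inc_range`, the same advance-before-erase step happens inside the range adaptor, which is why a loop body can erase the module it is currently visiting.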

Harbormaster completed remote builds in B250755: Diff 547745. · Aug 7 2023, 7:37 AM

mehdi_amini accepted this revision. · Aug 8 2023, 10:21 PM

mehdi_amini added inline comments.

mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
97	Still present

This revision is now accepted and ready to land. · Aug 8 2023, 10:21 PM

Added an assert check for the attribute.
Note: The verifier will also verify that the dyn_cast is valid, as it's a precondition on the elements of the array.
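
For context on the nit being addressed here: LLVM's `dyn_cast` yields a null value when the cast fails, so its result needs a check before use, while `cast` asserts instead. A minimal sketch of the checked form, using `dynamic_cast` as a stand-in, is below. `Attribute`, `TargetAttr`, `UnitAttr`, and `collectChips` are hypothetical names for illustration only and do not mirror the MLIR classes.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical attribute hierarchy, for illustration only.
struct Attribute {
  virtual ~Attribute() = default;
};
struct TargetAttr : Attribute {
  std::string chip;
  explicit TargetAttr(std::string c) : chip(std::move(c)) {}
};
struct UnitAttr : Attribute {};

// Collects the chip of every target. Returns std::nullopt as soon as an entry
// is null or is not a TargetAttr, instead of dereferencing a failed cast.
inline std::optional<std::vector<std::string>>
collectChips(const std::vector<const Attribute *> &targets) {
  std::vector<std::string> chips;
  for (const Attribute *attr : targets) {
    const auto *target = dynamic_cast<const TargetAttr *>(attr);
    if (!target)
      return std::nullopt; // explicit check on the cast result
    chips.push_back(target->chip);
  }
  return chips;
}
```

When a verifier already guarantees the element type, as the note above says, an asserting cast states that precondition directly; the explicit check is the defensive alternative.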

Harbormaster completed remote builds in B251915: Diff 549334. · Aug 11 2023, 6:12 AM

Separated gpu-module-to-binary test into 2 tests: nvvm & rocdl.
Also updated lit.cfg.py with:

if config.run_cuda_tests:
    config.available_features.add("host-supports-nvptx")

if config.run_rocm_tests:
    config.available_features.add("host-supports-amdgpu")

Allowing the usage of // REQUIRES: host-supports-nvptx for disabling tests.

Herald added a reviewer: ftynse. · Aug 11 2023, 4:07 PM

Herald added a reviewer: dcaballe.

Herald added subscribers: gysit, Dinistro, mattd, awarzynski.

Harbormaster completed remote builds in B252061: Diff 549542. · Aug 11 2023, 4:08 PM

Fix patch application.

Harbormaster completed remote builds in B252063: Diff 549543. · Aug 11 2023, 5:18 PM

Closed by commit rG43752a2aa31a: [mlir][gpu] Add the `gpu-module-to-binary` pass. (authored by fmorac). · Aug 11 2023, 5:25 PM

This revision was automatically updated to reflect the committed changes.

fmorac added a commit: rG43752a2aa31a: [mlir][gpu] Add the `gpu-module-to-binary` pass.

fmorac mentioned this in rGb43068e8707d: [mlir][gpu] Update GPU translation to accept binaries. · Aug 11 2023, 5:29 PM

Diff 547745

mlir/include/mlir/Dialect/GPU/Transforms/Passes.h

	[64 lines hidden]
	/// Collect all patterns to rewrite ops within the GPU dialect.
	inline void populateGpuRewritePatterns(RewritePatternSet &patterns) {
	  populateGpuAllReducePatterns(patterns);
	  populateGpuGlobalIdPatterns(patterns);
	  populateGpuShufflePatterns(patterns);
	}

	namespace gpu {
				/// Searches for all GPU modules in `op` and transforms them into GPU binary
				/// operations. The resulting `gpu.binary` has `handler` as its offloading
				mehdi_amini (Done): Same as other patches: can this specify the interface instead of "Attribute"?
				fmorac (Author, Done): I'll change it.
				/// handler attribute.
				LogicalResult transformGpuModulesToBinaries(
				    Operation *op, OffloadingLLVMTranslationAttrInterface handler = nullptr,
				    const gpu::TargetOptions &options = {});

	/// Base pass class to serialize kernel functions through LLVM into
	/// user-specified IR and add the resulting blob as module attribute.
	class SerializeToBlobPass : public OperationPass<gpu::GPUModuleOp> {
	public:
	  SerializeToBlobPass(TypeID passID);
	  SerializeToBlobPass(const SerializeToBlobPass &other);

	  void runOnOperation() final;
	[79 lines hidden]

mlir/include/mlir/Dialect/GPU/Transforms/Passes.td

	[31 lines hidden]
	def GpuMapParallelLoopsPass
	    : Pass<"gpu-map-parallel-loops", "mlir::func::FuncOp"> {
	  let summary = "Greedily maps loops to GPU hardware dimensions.";
	  let constructor = "mlir::createGpuMapParallelLoopsPass()";
	  let description = "Greedily maps loops to GPU hardware dimensions.";
	  let dependentDialects = ["mlir::gpu::GPUDialect"];
	}

				def GpuModuleToBinaryPass
				    : Pass<"gpu-module-to-binary", ""> {
				mehdi_amini (Done): The restriction on `ModuleOp` does not seem necessary to me. I would let it be an OperationPass and specify that it finds all the GPUModule ops immediately present in the attached regions and converts them to GPUBinaryOps, each holding a serialization of the result of the codegen of the GPUModule based on its target information.
				fmorac (Author, Done): I'll change it.
				  let summary = "Transforms a GPU module into a GPU binary.";
				  let options = [
				mehdi_amini (Done): It's not useful to repeat the summary, please flesh this out.
				fmorac (Author, Done): I'll update it.
				    Option<"offloadingHandler", "handler", "Attribute", "nullptr",
				mehdi_amini (Done): If the pass runs on a GPUModuleOp present in its input, then this shouldn't be needed.
				fmorac (Author, Done): I'll remove it.
				           "Offloading handler to be attached to the resulting binary op.">,
				    Option<"toolkitPath", "toolkit", "std::string", [{""}],
				           "Toolkit path.">,
				    ListOption<"linkFiles", "l", "std::string",
				mehdi_amini (Not Done): Document please. Do you have tests for it as well?
				fmorac (Author, Done): I'll add better documentation. Tests: currently no. I think I need to add a method for parsing attributes as options for this to work as a command-line option; it was added here to have the option available from C++. I can create a patch for that (command-line parsing of attributes) and then modify this.
				               "Extra files to link to.">,
				    Option<"cmdOptions", "opts", "std::string", [{""}],
				           "Command line options to pass to the tools.">,
				    Option<"compilationTarget", "format", "std::string", [{"bin"}],
				           "The target representation of the compilation process.">
				  ];
				}

	#endif // MLIR_DIALECT_GPU_PASSES

mlir/lib/Dialect/GPU/CMakeLists.txt

[44 lines hidden, inside add_mlir_dialect_library(MLIRGPUDialect]
)

add_mlir_dialect_library(MLIRGPUTransforms
Transforms/AllReduceLowering.cpp
Transforms/AsyncRegionRewriter.cpp
Transforms/GlobalIdRewriter.cpp
Transforms/KernelOutlining.cpp
Transforms/MemoryPromotion.cpp
		Transforms/ModuleToBinary.cpp
Transforms/ParallelLoopMapper.cpp
Transforms/ShuffleRewriter.cpp
Transforms/SerializeToBlob.cpp
Transforms/SerializeToCubin.cpp
Transforms/SerializeToHsaco.cpp

ADDITIONAL_HEADER_DIRS
${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/GPU
[91 lines hidden]

mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp

This file was added.

				//===- ModuleToBinary.cpp - Transforms GPU modules to GPU binaries ----------=//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements the `GpuModuleToBinaryPass` pass, transforming GPU
				// modules into GPU binaries.
				//
				//===----------------------------------------------------------------------===//

				#include "mlir/Dialect/GPU/Transforms/Passes.h"

				#include "mlir/Dialect/Func/IR/FuncOps.h"
				#include "mlir/Dialect/GPU/IR/GPUDialect.h"
				#include "mlir/IR/BuiltinOps.h"
				#include "mlir/Target/LLVM/NVVM/Target.h"
				#include "mlir/Target/LLVM/ROCDL/Target.h"
				#include "mlir/Target/LLVMIR/Dialect/GPU/GPUToLLVMIRTranslation.h"
				#include "mlir/Target/LLVMIR/Dialect/LLVMIR/LLVMToLLVMIRTranslation.h"
				#include "mlir/Transforms/GreedyPatternRewriteDriver.h"

				#include "llvm/ADT/STLExtras.h"
				#include "llvm/ADT/StringSwitch.h"

				using namespace mlir;
				using namespace mlir::gpu;

				namespace mlir {
				#define GEN_PASS_DEF_GPUMODULETOBINARYPASS
				#include "mlir/Dialect/GPU/Transforms/Passes.h.inc"
				} // namespace mlir

				namespace {
				class GpuModuleToBinaryPass
				    : public impl::GpuModuleToBinaryPassBase<GpuModuleToBinaryPass> {
				public:
				  using Base::Base;
				  void getDependentDialects(DialectRegistry &registry) const override;
				  void runOnOperation() final;
				};
				} // namespace

				void GpuModuleToBinaryPass::getDependentDialects(
				    DialectRegistry &registry) const {
				  // Register all GPU related translations.
				  registerLLVMDialectTranslation(registry);
				  registerGPUDialectTranslation(registry);
				#if MLIR_CUDA_CONVERSIONS_ENABLED == 1
				  registerNVVMTarget(registry);
				#endif
				#if MLIR_ROCM_CONVERSIONS_ENABLED == 1
				  registerROCDLTarget(registry);
				#endif
				}

				void GpuModuleToBinaryPass::runOnOperation() {
				  RewritePatternSet patterns(&getContext());
				  int targetFormat = llvm::StringSwitch<int>(compilationTarget)
				                         .Cases("offloading", "llvm", TargetOptions::offload)
				                         .Cases("assembly", "isa", TargetOptions::assembly)
				                         .Cases("binary", "bin", TargetOptions::binary)
				                         .Default(-1);
				  if (targetFormat == -1)
				krzysz00 (Done): Are these new #defines I missed in the patch series?
				fmorac (Author, Done): These are introduced in the CMakeLists.txt; I added them to avoid reusing the ones from the existing pipeline.
				    getOperation()->emitError() << "Invalid format specified.";
				  TargetOptions targetOptions(
				      toolkitPath, linkFiles, cmdOptions,
				      static_cast<TargetOptions::CompilationTarget>(targetFormat));
				  if (failed(transformGpuModulesToBinaries(
				          getOperation(),
				          offloadingHandler ? dyn_cast<OffloadingLLVMTranslationAttrInterface>(
				                                  offloadingHandler.getValue())
				                            : OffloadingLLVMTranslationAttrInterface(nullptr),
				          targetOptions)))
				    return signalPassFailure();
				}

				namespace {
				LogicalResult moduleSerializer(GPUModuleOp op,
				                               OffloadingLLVMTranslationAttrInterface handler,
				                               const TargetOptions &targetOptions) {
				  OpBuilder builder(op->getContext());
				  SmallVector<Attribute> objects;
				  // Serialize all targets.
				  for (auto targetAttr : op.getTargetsAttr()) {
				    assert(targetAttr && "Target attribute cannot be null.");
				    auto target = dyn_cast<gpu::TargetAttrInterface>(targetAttr);
				    std::optional<SmallVector<char, 0>> object =
				        target.serializeToObject(op, targetOptions);

				    if (!object) {
				      op.emitError("An error happened while serializing the module.");
				      return failure();
				    }

				mehdi_amini (Done): Nit: dyn_cast without check for nullptr
				fmorac (Author, Done): I'll change it.
				mehdi_amini (Not Done): Still present
				    objects.push_back(builder.getAttr<gpu::ObjectAttr>(
				        target,
				        builder.getStringAttr(StringRef(object->data(), object->size()))));
				  }
				  builder.setInsertionPointAfter(op);
				  builder.create<gpu::BinaryOp>(op.getLoc(), op.getName(), handler,
				                                builder.getArrayAttr(objects));
				  op->erase();
				  return success();
				}
				} // namespace

				LogicalResult mlir::gpu::transformGpuModulesToBinaries(
				    Operation *op, OffloadingLLVMTranslationAttrInterface handler,
				    const gpu::TargetOptions &targetOptions) {
				  for (Region &region : op->getRegions())
				    for (Block &block : region.getBlocks())
				      for (auto module :
				           llvm::make_early_inc_range(block.getOps<GPUModuleOp>()))
				        if (failed(moduleSerializer(module, handler, targetOptions)))
				          return failure();
				  return success();
				}
				mehdi_amini (Done): Patterns aren't obviously appropriate to me. I would do something a bit more lightweight:
				  void GpuModuleToBinaryPass::runOnOperation() {
				    for (Region &region : getOperation()->getRegion()) {
				      for (Block &block : region.getBlocks()) {
				        for (auto gpuModuleOP : make_early_inc(block.getOperations<GPUModuleOp>()))
				          process(gpuModuleOP);
				      }
				    }
				  }
				If we need a public API, it can be one that takes a GPUModuleOp and an OpBuilder and builds a GPUBinaryOp.
				fmorac (Author, Done): Yeah, the reason for doing patterns was just to expose it to be used in other passes. But I can change it.
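
In `runOnOperation` as shown in this diff, the `targetFormat == -1` branch emits an error but does not return, so the `static_cast` that follows still runs with -1. One possible way to make the failure explicit is to have the lookup return an optional. The sketch below uses only the standard library instead of `llvm::StringSwitch`; `CompilationTarget` and `parseFormat` are hypothetical names, not the pass's API, and the accepted strings are copied from the diff.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical enum standing in for the compilation target kinds.
enum class CompilationTarget { Offload, Assembly, Binary };

// Maps a format string to a compilation target. Unknown strings yield
// std::nullopt so that the caller has to handle the failure before any cast.
inline std::optional<CompilationTarget> parseFormat(std::string_view format) {
  if (format == "offloading" || format == "llvm")
    return CompilationTarget::Offload;
  if (format == "assembly" || format == "isa")
    return CompilationTarget::Assembly;
  if (format == "binary" || format == "bin")
    return CompilationTarget::Binary;
  return std::nullopt; // no sentinel value such as -1 escapes this function
}
```

A caller would check the result and signal failure before constructing any target options.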

mlir/test/Dialect/GPU/module-to-binary.mlir

This file was added.

				// RUN: mlir-opt %s --gpu-module-to-binary="format=llvm" | FileCheck %s
				// RUN: mlir-opt %s --gpu-module-to-binary="format=isa" | FileCheck %s
				mehdi_amini (Done): Are there no diagnostics to verify here?
				fmorac (Author, Done): You're right, I'll remove it.

				module attributes {gpu.container_module} {
				  // CHECK-LABEL:gpu.binary @kernel_module1
				  // CHECK:[#gpu.object<#nvvm.target<chip = "sm_70">, "{{.*}}">]
				  gpu.module @kernel_module1 [#nvvm.target<chip = "sm_70">] {
				    llvm.func @kernel(%arg0: i32, %arg1: !llvm.ptr<f32>,
				        %arg2: !llvm.ptr<f32>, %arg3: i64, %arg4: i64,
				        %arg5: i64) attributes {gpu.kernel} {
				      llvm.return
				    }
				  }

				  // CHECK-LABEL:gpu.binary @kernel_module2
				  // CHECK:[#gpu.object<#nvvm.target<flags = {fast}>, "{{.*}}">, #gpu.object<#nvvm.target, "{{.*}}">]
				  gpu.module @kernel_module2 [#nvvm.target<flags = {fast}>, #nvvm.target] {
				    llvm.func @kernel(%arg0: i32, %arg1: !llvm.ptr<f32>,
				        %arg2: !llvm.ptr<f32>, %arg3: i64, %arg4: i64,
				        %arg5: i64) attributes {gpu.kernel} {
				      llvm.return
				    }
				  }
				}

This is an archive of the discontinued LLVM Phabricator instance.

[mlir][gpu] Add the `gpu-module-to-binary` pass.
Closed · Public
