Details

Reviewers

bondhugula
ThomasRaoux
mehdi_amini
nicolasvasilache
herhut
ftynse
dcaballe

Commits

rG43752a2aa31a: [mlir][gpu] Add the `gpu-module-to-binary` pass.

Summary

For an explanation of these patches see D154153.

Commit message:
This pass converts GPU modules into GPU binaries, serializing all targets present
in a GPU module by invoking the serializeToObject target attribute method.

Depends on D154147
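
As a rough illustration of the behavior the commit message describes, the sketch below models the conversion with plain standard-library types rather than MLIR classes. `Target`, `Object`, `Module`, and `serializeAllTargets` are hypothetical stand-ins and do not appear in the patch. The only behavior taken from the patch is that every target of a module is serialized in order and that a single failed serialization aborts the whole conversion.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a GPU target attribute: a name plus a
// serialization callback that may fail by returning std::nullopt.
struct Target {
  std::string name;
  std::function<std::optional<std::string>(const std::string &)>
      serializeToObject;
};

// Hypothetical stand-in for one serialized object: the target it was built
// for and the resulting blob.
struct Object {
  std::string targetName;
  std::string blob;
};

// Hypothetical stand-in for a GPU module: a symbol name, a body, and the
// list of targets attached to it.
struct Module {
  std::string name;
  std::string body;
  std::vector<Target> targets;
};

// Serializes the module once per target. Returns std::nullopt as soon as one
// target fails, so a partial list of objects is never produced.
std::optional<std::vector<Object>> serializeAllTargets(const Module &module) {
  std::vector<Object> objects;
  objects.reserve(module.targets.size());
  for (const Target &target : module.targets) {
    std::optional<std::string> blob = target.serializeToObject(module.body);
    if (!blob)
      return std::nullopt;
    objects.push_back(Object{target.name, *blob});
  }
  return objects;
}
```

In this model a caller would build the binary from the returned objects only when the result has a value, which corresponds to the pass replacing the module only after all targets have serialized successfully.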

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fmorac created this revision. · Jun 29 2023, 1:53 PM

Herald added a reviewer: bondhugula. · Jun 29 2023, 1:53 PM

Herald added a reviewer: ThomasRaoux.

Herald added a project: Restricted Project.

Herald added subscribers: bviyer, Moerafaat, zero9178 and 22 others.

fmorac added a child revision: D154152: [mlir][gpu] Add GPU target support to `gpu-to-llvm`. · Jun 29 2023, 2:04 PM

Harbormaster completed remote builds in B242231: Diff 535986. · Jun 29 2023, 4:31 PM

Rebasing.

fmorac edited the summary of this revision. · Jun 29 2023, 6:20 PM

fmorac added a reviewer: mehdi_amini.

fmorac mentioned this in D154153: [mlir][gpu] Update GPU translation to accept binaries. · Jun 29 2023, 7:03 PM

Harbormaster completed remote builds in B242304: Diff 536080. · Jun 29 2023, 10:13 PM

fmorac published this revision for review. · Jun 30 2023, 6:30 AM

Herald added a reviewer: nicolasvasilache. · Jun 30 2023, 6:30 AM

Herald added a reviewer: herhut.

Herald added a project: Restricted Project.

Herald added subscribers: stephenneuendorffer, nicolasvasilache.

krzysz00 added a subscriber: krzysz00. · Jul 13 2023, 12:32 PM

krzysz00 added inline comments.

mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
66	Are these new #defines that I missed in the patch series?

Rebasing.

Harbormaster completed remote builds in B247414: Diff 543194. · Jul 22 2023, 9:19 AM

mehdi_amini added inline comments. · Jul 24 2023, 12:56 AM

mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
74	Same as other patches: can this specify the interface instead of "Attribute"?
mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
41	The restriction on `ModuleOp` does not seem necessary to me. I would let it be an OperationPass and specify that it finds all the GPUModule ops immediately present in the attached regions and converts each of them to a GPUBinaryOp holding a serialization of the result of the codegen of the GPUModule based on its target information.
43	It's not useful to repeat the summary, please flesh this out.
44	If the pass runs on a GPUModuleOp present in its input, then this shouldn't be needed.
48	Document please. Do you have tests for it as well?
mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
96	Nit: dyn_cast without check for nullptr
120	Patterns aren't obviously appropriate to me. I would do something a bit more lightweight:
	void GpuModuleToBinaryPass::runOnOperation() {
	  for (Region &region : getOperation()->getRegion()) {
	    for (Block &block : region.getBlocks()) {
	      for (auto gpuModuleOP : make_early_inc(block.getOperations<GPUModuleOp>()))
	        process(gpuModuleOP);
	    }
	  }
	}
	If we need a public API, it can be one that takes a GPUModuleOp and an OpBuilder and builds a GPUBinaryOp.
mlir/test/Dialect/GPU/module-to-binary.mlir
1	There are no diagnostics to verify here?
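
The lightweight traversal suggested for line 120 relies on early-increment iteration: the loop steps past an element before that element is replaced, so erasing it cannot invalidate the loop. In LLVM the helper for this is `llvm::make_early_inc_range` from `llvm/ADT/STLExtras.h`. The sketch below shows the same idea on a `std::list` with no MLIR dependency; `Op` and `convertModules` are hypothetical names used only for illustration, and the string-based `kind` field is an assumption standing in for real operation types.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for an operation inside a block.
struct Op {
  std::string kind; // for example "gpu.module", "gpu.binary", "func.func"
  std::string name;
};

// Replaces every "gpu.module" in the block with a "gpu.binary" of the same
// name and returns how many operations were converted. The iterator is
// advanced before the current element is touched, so erasing that element
// does not affect the rest of the loop.
int convertModules(std::list<Op> &block) {
  int converted = 0;
  for (auto it = block.begin(); it != block.end();) {
    auto current = it++; // early increment
    if (current->kind != "gpu.module")
      continue;
    // Insert the replacement in front of the module, then drop the module.
    block.insert(current, Op{"gpu.binary", current->name});
    block.erase(current);
    ++converted;
  }
  return converted;
}
```

Only operations that sit directly in the block are visited, which matches the reviewer's wording that the pass should handle the GPU modules immediately present in the attached regions rather than recursing into nested ones.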

fmorac added inline comments. · Jul 24 2023, 5:53 AM

mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
74	I'll change it.
mlir/include/mlir/Dialect/GPU/Transforms/Passes.td
41	I'll change it.
43	I'll update it.
44	I'll remove it.
48	I'll add better documentation. Tests: currently none. I think I need to add a method for parsing attributes as options for this to work as a command-line option; it was added here to make the option available from C++. I can create a patch for that (command-line parsing of attributes) and then modify this.
mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
66	These are introduced in the CMakeLists.txt; I added them so as not to reuse the ones from the existing pipeline.
96	I'll change it.
120	Yeah, the reason for using patterns was just to expose it for use in other passes. But I can change it.
mlir/test/Dialect/GPU/module-to-binary.mlir
1	You're right, I'll remove it.

Updated tests, changed the Pass to an OperationPass and switched from patterns to looking for nested Modules.

Harbormaster completed remote builds in B250755: Diff 547745. · Aug 7 2023, 7:37 AM

mehdi_amini accepted this revision. · Aug 8 2023, 10:21 PM

mehdi_amini added inline comments.

mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp
96	Still present

This revision is now accepted and ready to land. · Aug 8 2023, 10:21 PM

Added an assert check for the attribute.
Note: The verifier will also verify that the dyn_cast is valid, as it's a precondition on the elements of the array.
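
The "dyn_cast without check for nullptr" nit comes down to the fact that a checked cast returns null on a type mismatch, so its result has to be tested (or asserted) before use. The sketch below illustrates this with `dynamic_cast` standing in for `dyn_cast`. `Attribute`, `TargetAttr`, and `collectChips` are hypothetical stand-ins rather than MLIR classes, and the sketch returns an error value instead of asserting, which is an alternative to the assert added in this update and not what the patch itself does.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a generic attribute.
struct Attribute {
  virtual ~Attribute() = default;
};

// Hypothetical stand-in for an attribute that implements the target
// interface.
struct TargetAttr : Attribute {
  explicit TargetAttr(std::string chip) : chip(std::move(chip)) {}
  std::string chip;
};

// Returns the chip of every attribute, or std::nullopt if any attribute is
// not a TargetAttr. The cast result is checked before it is dereferenced.
std::optional<std::vector<std::string>>
collectChips(const std::vector<std::unique_ptr<Attribute>> &attrs) {
  std::vector<std::string> chips;
  for (const auto &attr : attrs) {
    const auto *target = dynamic_cast<const TargetAttr *>(attr.get());
    if (!target)
      return std::nullopt;
    chips.push_back(target->chip);
  }
  return chips;
}
```

When an earlier verification step already guarantees the element type, as the note above says the verifier does, an assert on the cast result documents that precondition without adding an error path.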

Harbormaster completed remote builds in B251915: Diff 549334. · Aug 11 2023, 6:12 AM

Separated gpu-module-to-binary test into 2 tests: nvvm & rocdl.
Also updated lit.cfg.py with:

if config.run_cuda_tests:
    config.available_features.add("host-supports-nvptx")

if config.run_rocm_tests:
    config.available_features.add("host-supports-amdgpu")

This allows tests to use `// REQUIRES: host-supports-nvptx` (or `// REQUIRES: host-supports-amdgpu`) so that they are skipped when the corresponding feature is not available.

Herald added a reviewer: ftynse. · Aug 11 2023, 4:07 PM

Herald added a reviewer: dcaballe.

Herald added subscribers: gysit, Dinistro, mattd, awarzynski.

Harbormaster completed remote builds in B252061: Diff 549542. · Aug 11 2023, 4:08 PM

Fix patch application.

Harbormaster completed remote builds in B252063: Diff 549543. · Aug 11 2023, 5:18 PM

Closed by commit rG43752a2aa31a: [mlir][gpu] Add the `gpu-module-to-binary` pass. (authored by fmorac). · Aug 11 2023, 5:25 PM

This revision was automatically updated to reflect the committed changes.

fmorac added a commit: rG43752a2aa31a: [mlir][gpu] Add the `gpu-module-to-binary` pass.

fmorac mentioned this in rGb43068e8707d: [mlir][gpu] Update GPU translation to accept binaries. · Aug 11 2023, 5:29 PM

Diff 543194

mlir/include/mlir/Dialect/GPU/Transforms/Passes.h

(63 lines not shown)

  /// Collect all patterns to rewrite ops within the GPU dialect.
  inline void populateGpuRewritePatterns(RewritePatternSet &patterns) {
    populateGpuAllReducePatterns(patterns);
    populateGpuGlobalIdPatterns(patterns);
    populateGpuShufflePatterns(patterns);
  }

+ /// Collect a set of patterns to rewrite GPU modules into GPU binary operations.
+ void populateGpuModuleToBinaryPatterns(RewritePatternSet &patterns,
+                                        Attribute objectManager = nullptr,
+                                        const gpu::TargetOptions &options = {});
+
  namespace gpu {
  /// Base pass class to serialize kernel functions through LLVM into
  /// user-specified IR and add the resulting blob as module attribute.
  class SerializeToBlobPass : public OperationPass<gpu::GPUModuleOp> {
  public:
    SerializeToBlobPass(TypeID passID);
    SerializeToBlobPass(const SerializeToBlobPass &other);

(80 lines not shown)

mlir/include/mlir/Dialect/GPU/Transforms/Passes.td

(31 lines not shown)

  def GpuMapParallelLoopsPass
      : Pass<"gpu-map-parallel-loops", "mlir::func::FuncOp"> {
    let summary = "Greedily maps loops to GPU hardware dimensions.";
    let constructor = "mlir::createGpuMapParallelLoopsPass()";
    let description = "Greedily maps loops to GPU hardware dimensions.";
    let dependentDialects = ["mlir::gpu::GPUDialect"];
  }

+ def GpuModuleToBinaryPass
+     : Pass<"gpu-module-to-binary", "mlir::ModuleOp"> {
+   let summary = "Transforms a GPU module into a GPU binary.";
+   let description = "Transforms a GPU module into a GPU binary.";
+   let dependentDialects = ["mlir::gpu::GPUDialect"];
+   let options = [
+     Option<"objectManager", "object-manager", "Attribute",
+            /*default=*/"nullptr",
+            "Object manager attribute.">,
+     Option<"toolkitPath", "toolkit", "std::string",
+            /*default=*/"\"\"",
+            "Toolkit path.">,
+     ListOption<"bitcodeFiles", "l", "std::string",
+                "Extra bitcode files to link to.">,
+   ];
+ }
+
  #endif // MLIR_DIALECT_GPU_PASSES

mlir/lib/Dialect/GPU/CMakeLists.txt

(73 lines not shown)

  )

  add_mlir_dialect_library(MLIRGPUTransforms
    Transforms/AllReduceLowering.cpp
    Transforms/AsyncRegionRewriter.cpp
    Transforms/GlobalIdRewriter.cpp
    Transforms/KernelOutlining.cpp
    Transforms/MemoryPromotion.cpp
+   Transforms/ModuleToBinary.cpp
    Transforms/ParallelLoopMapper.cpp
    Transforms/ShuffleRewriter.cpp
    Transforms/SerializeToBlob.cpp
    Transforms/SerializeToCubin.cpp
    Transforms/SerializeToHsaco.cpp

    ADDITIONAL_HEADER_DIRS
    ${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/GPU

(23 lines not shown)

    MLIRGPUToLLVMIRTranslation
    MLIRLLVMToLLVMIRTranslation
    MLIRMemRefDialect
    MLIRPass
    MLIRSCFDialect
    MLIRSideEffectInterfaces
    MLIRSupport
    MLIRTransformUtils
+
+   PRIVATE
+   MLIRGPUTargets
  )

  add_subdirectory(TransformOps)

  if(MLIR_ENABLE_CUDA_RUNNER)
    if(NOT MLIR_ENABLE_CUDA_CONVERSIONS)
      message(SEND_ERROR
        "Building mlir with cuda support requires the NVPTX backend")

(100 lines not shown)

mlir/lib/Dialect/GPU/Transforms/ModuleToBinary.cpp

This file was added.

//===- ModuleToBinary.cpp - Transforms GPU modules to GPU binaries ----------=//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file implements the `GpuModuleToBinaryPass` pass, transforming GPU
// modules into GPU binaries.
//
//===----------------------------------------------------------------------===//

#include "mlir/Dialect/GPU/Transforms/Passes.h"

#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/Dialect/GPU/IR/GPUDialect.h"
#include "mlir/IR/BuiltinOps.h"
#include "mlir/Target/LLVMIR/Dialect/GPU/GPUToLLVMIRTranslation.h"
#include "mlir/Target/LLVMIR/Dialect/LLVMIR/LLVMToLLVMIRTranslation.h"
#include "mlir/Target/LLVMIR/Dialect/NVVM/NVVMToLLVMIRTranslation.h"
#include "mlir/Target/LLVMIR/Dialect/ROCDL/ROCDLToLLVMIRTranslation.h"
#include "mlir/Transforms/GreedyPatternRewriteDriver.h"

#include "llvm/ADT/STLExtras.h"

using namespace mlir;

namespace mlir {
#define GEN_PASS_DEF_GPUMODULETOBINARYPASS
#include "mlir/Dialect/GPU/Transforms/Passes.h.inc"
} // namespace mlir

namespace {
class GpuModuleToBinaryPass
    : public impl::GpuModuleToBinaryPassBase<GpuModuleToBinaryPass> {
public:
  using Base::Base;
  void getDependentDialects(DialectRegistry &registry) const override;
  void runOnOperation() final;
};

// Rewriter for transforming GPU modules to GPU binaries.
class GPUModuleToBinaryRewriter : public OpRewritePattern<gpu::GPUModuleOp> {
public:
  GPUModuleToBinaryRewriter(Attribute objectManager,
                            const gpu::TargetOptions &targetOptions,
                            MLIRContext *context, PatternBenefit benefit = 1,
                            ArrayRef<StringRef> generatedNames = {});

  LogicalResult matchAndRewrite(gpu::GPUModuleOp op,
                                PatternRewriter &rewriter) const override;

private:
  Attribute objectManager;
  const gpu::TargetOptions &targetOptions;
};
} // namespace

void GpuModuleToBinaryPass::getDependentDialects(
    DialectRegistry &registry) const {
  // Register all GPU related translations.
  registerLLVMDialectTranslation(registry);
  registerGPUDialectTranslation(registry);
#ifdef MLIR_GPU_NVPTX_TARGET_ENABLED
  registerNVVMDialectTranslation(registry);
#endif
#ifdef MLIR_GPU_AMDGPU_TARGET_ENABLED
  registerROCDLDialectTranslation(registry);
#endif
}

void GpuModuleToBinaryPass::runOnOperation() {
  RewritePatternSet patterns(&getContext());
  gpu::TargetOptions targetOptions(toolkitPath, bitcodeFiles);
  populateGpuModuleToBinaryPatterns(patterns, objectManager, targetOptions);
  FrozenRewritePatternSet patternSet(std::move(patterns));
  if (failed(applyPatternsAndFoldGreedily(getOperation(), patternSet)))
    return signalPassFailure();
}

GPUModuleToBinaryRewriter::GPUModuleToBinaryRewriter(
    Attribute objectManager, const gpu::TargetOptions &targetOptions,
    MLIRContext *context, PatternBenefit benefit,
    ArrayRef<StringRef> generatedNames)
    : OpRewritePattern(context, benefit, generatedNames),
      objectManager(objectManager), targetOptions(targetOptions) {}

LogicalResult
GPUModuleToBinaryRewriter::matchAndRewrite(gpu::GPUModuleOp op,
                                           PatternRewriter &rewriter) const {
  SmallVector<Attribute> objects;

  // Serialize all targets.
  for (auto targetAttr : op.getTargetsAttr()) {
    auto target = dyn_cast<gpu::TargetAttrInterface>(targetAttr);
    std::optional<SmallVector<char, 0>> object =
        target.serializeToObject(op, targetOptions);

    if (!object) {
      op.emitError("An error happened while serializing the module.");
      return failure();
    }

    objects.push_back(rewriter.getAttr<gpu::ObjectAttr>(
        targetAttr,
        rewriter.getStringAttr(StringRef(object->data(), object->size()))));
  }

  rewriter.replaceOpWithNewOp<gpu::BinaryOp>(op, op.getName(), objectManager,
                                             rewriter.getArrayAttr(objects));
  return success();
}

void ::mlir::populateGpuModuleToBinaryPatterns(
    RewritePatternSet &patterns, Attribute objectManager,
    const gpu::TargetOptions &options) {
  patterns.add<GPUModuleToBinaryRewriter>(objectManager, options,
                                          patterns.getContext());
}

mlir/test/Dialect/GPU/module-to-binary.mlir

This file was added.

// RUN: mlir-opt %s --gpu-module-to-binary -verify-diagnostics | FileCheck %s

module attributes {gpu.container_module} {
  // CHECK-LABEL:gpu.binary @kernel_module1
  // CHECK:[#gpu.object<#gpu.nvptx<chip = "sm_70">, "{{.*}}">]
  gpu.module @kernel_module1 [#gpu.nvptx<chip = "sm_70">] {
    llvm.func @kernel(%arg0: i32, %arg1: !llvm.ptr<f32>,
        %arg2: !llvm.ptr<f32>, %arg3: i64, %arg4: i64,
        %arg5: i64) attributes {gpu.kernel} {
      llvm.return
    }
  }

  // CHECK-LABEL:gpu.binary @kernel_module2
  // CHECK:[#gpu.object<#gpu.nvptx<flags = {fast}>, "{{.*}}">, #gpu.object<#gpu.nvptx, "{{.*}}">]
  gpu.module @kernel_module2 [#gpu.nvptx<flags = {fast}>, #gpu.nvptx] {
    llvm.func @kernel(%arg0: i32, %arg1: !llvm.ptr<f32>,
        %arg2: !llvm.ptr<f32>, %arg3: i64, %arg4: i64,
        %arg5: i64) attributes {gpu.kernel} {
      llvm.return
    }
  }
}

This is an archive of the discontinued LLVM Phabricator instance.

[mlir][gpu] Add the `gpu-module-to-binary` pass.
ClosedPublic
