This is an archive of the discontinued LLVM Phabricator instance.

[mlir][spirv] Use spv.entry_point_abi in GPU to SPIR-V conversions
ClosedPublic

Authored by antiagainst on Feb 4 2020, 6:02 PM.

Details

Summary

We have spv.entry_point_abi for specifying the local workgroup size.
It should be decorated onto input gpu.func ops to drive the SPIR-V
CodeGen to generate the proper SPIR-V module execution mode. Compared
to using command-line options for specifying the configuration, using
attributes also has the benefits that 1) we are now able to use
different local workgroup sizes for different entry points and 2) the
tests contain the configuration directly.
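
For illustration only, a minimal sketch of what such a decorated kernel could look like (the module and function names, argument, and the [32, 4, 1] size are made up; the attribute follows the spv.entry_point_abi = {local_size = ...} form this patch consumes):

  // A GPU kernel annotated with the SPIR-V entry point ABI attribute.
  // The GPU-to-SPIR-V conversion reads local_size from it and emits the
  // corresponding LocalSize execution mode for the generated entry point.
  gpu.module @kernels {
    gpu.func @example_kernel(%arg0: memref<12xf32>) kernel
        attributes {spv.entry_point_abi = {local_size = dense<[32, 4, 1]> : vector<3xi32>}} {
      gpu.return
    }
  }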

Diff Detail

Event Timeline

antiagainst created this revision.Feb 4 2020, 6:02 PM

Unit tests: unknown.

clang-tidy: pass.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt

I understand what the intent is here, but the input already has an attribute that belongs to the SPIR-V dialect before lowering. That makes things a bit non-composable. In cases where someone lowers to the GPU dialect and then conditionally decides to lower to the SPIR-V dialect or the NVVM dialect, with this change on the SPIR-V side, a separate pass will be needed to add this attribute. Ideally the input should be only in the GPU dialect, whereas here it isn't.
Is it possible instead to add an attribute to the GPU dialect itself which contains information about the workgroup size? Then while lowering we can convert one attribute to another.
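
To make the suggestion concrete, a hypothetical sketch (gpu.workgroup_size below is not an existing attribute; it is a placeholder name for the proposed GPU-level attribute, and the kernel is a made-up example that would sit inside a gpu.module):

  // Hypothetical input: the workgroup size expressed with a GPU-dialect attribute.
  gpu.module @kernels {
    gpu.func @example_kernel() kernel
        attributes {gpu.workgroup_size = dense<[32, 4, 1]> : vector<3xi32>} {
      gpu.return
    }
  }
  // The GPU-to-SPIR-V lowering would then rewrite this attribute into
  // spv.entry_point_abi = {local_size = dense<[32, 4, 1]> : vector<3xi32>}.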

mravishankar requested changes to this revision.Feb 4 2020, 10:55 PM
This revision now requires changes to proceed.Feb 4 2020, 10:55 PM

I understand what the intent is here, but the input already has an attribute that belongs to the SPIR-V dialect before lowering.

Generally I think it is inevitable that we'll have attributes belonging to lower layers attached to the source input at a higher layer; for example, we will have spv.target_env and spv.interface_var_abi attributes attached to dispatch regions isolated by IREE at the HLO level. The root problem is that we cannot infer everything a lower dialect needs purely from higher dialects, and it does not make sense to create duplicates all the way up through every layer. But with that said,

That makes things a bit non-composable. In cases where someone lowers to the GPU dialect and then conditionally decides to lower to the SPIR-V dialect or the NVVM dialect, with this change on the SPIR-V side, a separate pass will be needed to add this attribute. Ideally the input should be only in the GPU dialect, whereas here it isn't.

Here we are at the boundary between the GPU dialect and the SPIR-V dialect, so it should be fine to have SPIR-V-specific information attached to the input to drive conversions towards SPIR-V. But I get your point here regarding non-composability. If the input is, say, loops, it's better to have a proper GPU dialect attribute instead of spv.entry_point_abi attached to loops for driving further conversion.

Is it possible instead to add an attribute to the GPU dialect itself which contains information about the workgroup size? Then while lowering we can convert one attribute to another.

Yeah, that makes sense to me. At the GPU level we also have such concepts, so we can have similar attributes; they are just contracts for different layers. Right now we are using a bunch of command-line options for that job; I'd love to see us switch to using attributes there too. I've created https://llvm.discourse.group/t/using-attributes-to-specify-workgroup-configuration-when-lowering-to-gpu/496 as an RFC. I view that as a layer above SPIR-V, so it's a bit separate from the changes here IMHO.

Lets go with this for now. We can clean this up when the attribute story gets fixed up.

mlir/include/mlir/Conversion/GPUToSPIRV/ConvertGPUToSPIRV.h
26

Can we leave the workgroup size as optional? If provided, it will be used to override the default of reading it from the attribute.

mlir/include/mlir/Conversion/GPUToSPIRV/ConvertGPUToSPIRVPass.h
27

Same here, make the workgroup size optional.

mlir/lib/Dialect/SPIRV/TargetAndABI.cpp
162

Can't you just do op->getParentOfType<FuncOp>() / op->getParentOfType<gpu::FuncOp>()?

herhut added a comment.Feb 6 2020, 2:41 AM

Maybe I am missing something here, but from the GPU dialect, the sizes are passed to the gpu.launch, so you can take them from there. If you want to specialize a kernel for specific sizes, you need to ensure compatible call sites, like in other function specialization. Is this more about driving upper layers of code generation so that you end up with a gpu.launch that has sizes you want? Or do you want to make gpu.func usable independent of the gpu.launch?

I think you hit upon the core issue here. We would like gpu.func to be usable independent of gpu.launch. The way I see it, gpu.func allows an arbitrary workgroup size, but for cases where the workgroup size is fixed, it is specified as an attribute on the function. For GPU to SPIR-V it is a prerequisite that the workgroup size be a constant, so the change here makes it a requirement to have the spv.entry_point_abi attribute on the gpu.func for the conversion to succeed (earlier this was implemented by passing the workgroup size as an argument at pattern construction time).
Coming to the gpu.launch issue, I think the gpu.launch semantics should require the workgroup size to be constant values if the gpu.func used has the attribute set. It would actually be better to go a bit further and make the workgroup size arguments optional: if unspecified, the function uses its constant workgroup size, and it would be illegal to specify them if the target function has a constant workgroup size. This would be fairly easy to enforce in the verification.
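
A rough sketch of the launch-side constraint being discussed (the host function, constants, and [32, 4, 1] size are made up; the point is that the thread dimensions here are compile-time constants that would have to match a callee with a fixed workgroup size):

  func @host() {
    %c1  = constant 1 : index
    %c4  = constant 4 : index
    %c32 = constant 32 : index
    // Block (workgroup) dimensions are compile-time constants here.
    gpu.launch blocks(%bx, %by, %bz) in (%gx = %c1, %gy = %c1, %gz = %c1)
               threads(%tx, %ty, %tz) in (%sx = %c32, %sy = %c4, %sz = %c1) {
      // kernel body
      gpu.terminator
    }
    return
  }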

mravishankar accepted this revision.Feb 10 2020, 10:13 AM

Thanks Lei! The more I think about it, the better this approach looks.

This revision is now accepted and ready to land.Feb 10 2020, 10:13 AM

This is really helpful for mlir-vulkan-runner as well; it would be nice to rebase on it once this patch is committed.

antiagainst marked 5 inline comments as done.Feb 10 2020, 1:01 PM

@herhut: +1 to what Mahesh said. Additionally, I'd like to tighten the SPIR-V side to use attributes in general for passing in pattern configurations.

There is a mismatch between the GPU dialect and the SPIR-V side. For this one (spv.entry_point_abi = {local_size = ...}) we can probably push one layer upwards and propose a gpu.workgroup_size at the GPU dialect level; then the SPIR-V lowering can convert gpu.workgroup_size into spv.entry_point_abi = {local_size = ...}. But as explained in https://llvm.discourse.group/t/using-attributes-to-specify-workgroup-configuration-when-lowering-to-gpu/496/20, for gpu.launch with non-constant workgroup sizes we need to specify the SpecIds for them, which does not really make sense at the GPU dialect level; so likely we need the SPIR-V lowering path user (say, IREE) to attach something like spv.entry_point_abi = {local_size_spec_id = ...} before going to the GPU dialect level and pass it all the way down the stack. (You can think of this as part of the SPIR-V target in lieu of a proper target mechanism for SPIR-V. We have a SPIR-V conversion target on the SPIR-V side, but that covers different things than the ABI here. Complexities. ;-P) Again, we need to attach a SPIR-V-specific attribute to the gpu.func eventually. I'd like to have consistency between the normal constant case and the spec constant case.
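
For concreteness, the two flavors being contrasted might look as follows (the first form is what this patch consumes; the second, spec-id-based form is only the proposal sketched above, its local_size_spec_id field name is hypothetical, and the kernel names and sizes are made up):

  gpu.module @kernels {
    // Constant workgroup size: the lowering can emit the LocalSize execution mode directly.
    gpu.func @fixed_size_kernel() kernel
        attributes {spv.entry_point_abi = {local_size = dense<[32, 4, 1]> : vector<3xi32>}} {
      gpu.return
    }

    // Hypothetical spec-constant flavor: the workgroup size would be supplied
    // later via specialization constants decorated with these SpecIds.
    gpu.func @dynamic_size_kernel() kernel
        attributes {spv.entry_point_abi = {local_size_spec_id = dense<[0, 1, 2]> : vector<3xi32>}} {
      gpu.return
    }
  }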

mlir/include/mlir/Conversion/GPUToSPIRV/ConvertGPUToSPIRV.h
26

We probably don't want to do that. One of the purposes is to tighten the contract on SPIR-V lowerings and make them consistent. Having two ways would cause more confusion IMO.

mlir/lib/Dialect/SPIRV/TargetAndABI.cpp
162

I think this is simpler and more composable given this is a utility for writing SPIR-V lowerings. There may be other func ops that one would like to lower towards SPIR-V, so I think we can be a bit flexible here.

This revision was automatically updated to reflect the committed changes.
antiagainst marked 2 inline comments as done.