Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
| mlir/lib/Dialect/GPU/Targets/NVPTXTarget.cpp | | |
|---|---|---|
| 213 ↗ | (On Diff #543179) | We should have options to stop after emitting LLVM IR and PTX; it seems like we have to go to cubin right now? I would also think that going to PTX should be possible without requiring any dependency on the CUDA toolkit? |
Not really? I added some in the final patches, where you have all the infra ready. The best test I can think of at this point is a unit test for this serializer?
| mlir/include/mlir/Dialect/GPU/IR/GPUCompilationAttr.td | | |
|---|---|---|
| 82 ↗ | (On Diff #536066) | I'll change it. |
| 23 ↗ | (On Diff #543179) | I initially added these attributes (NVPTX & AMDGPU) to nvvm & rocdl; however, I decided against that in the final patch because it added GPU dependencies to those dialects (includes & libs) and created more libs instead of a single one (GPUTargets). I'm open to changing it. |
| mlir/lib/Dialect/GPU/Targets/NVPTXTarget.cpp | | |
|---|---|---|
| 213 ↗ | (On Diff #543179) | This series of patches is intended to be a full replacement of the current pipeline; upon approval we should immediately deprecate the current pipeline and then remove it after a deprecation period. The idea is that in future patches we change this behavior and stop at LLVM IR, but that requires having the LLVM Offload work done. The benefit of this approach is that those future changes would be invisible to users, and at the same time it makes them adopt this new mechanism now; that's why I go straight to cubin. |
| mlir/include/mlir/Dialect/GPU/IR/GPUCompilationAttr.td | | |
|---|---|---|
| 23 ↗ | (On Diff #543179) | Where will the dependency come from? The GPUTargetAttrInterface? Maybe we should revisit the way the interface is specified. Worst case, we're just one level of indirection away :) Here, if the issue is just the gpu::TargetOptions, maybe it can be declared with the interface definition, which makes it an isolated library? |
| mlir/lib/Dialect/GPU/Targets/NVPTXTarget.cpp | | |
|---|---|---|
| 213 ↗ | (On Diff #543179) | OK, thanks for explaining. Can you add a TODO with this info in the code? |
Moved the target attribute to NVVM and added it as an external model that is promised.
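For context, the external-model pattern being referred to looks roughly like the sketch below. It is a sketch only, assuming the names discussed in this thread (NVVMTargetAttr, gpu::TargetAttrInterface) and an illustrative registration helper; the actual patch may differ in details.

```cpp
#include "mlir/Dialect/GPU/IR/CompilationInterfaces.h"
#include "mlir/Dialect/LLVMIR/NVVMDialect.h"
#include "mlir/IR/DialectRegistry.h"
#include <optional>

namespace {
// External model attaching the GPU target interface to NVVMTargetAttr
// without making the NVVM dialect itself depend on GPU serialization libs.
// The serialization hook is only declared here; its body is elided.
struct NVVMTargetAttrImpl
    : public mlir::gpu::TargetAttrInterface::FallbackModel<NVVMTargetAttrImpl> {
  std::optional<llvm::SmallVector<char, 0>>
  serializeToObject(mlir::Attribute attribute, mlir::Operation *module,
                    const mlir::gpu::TargetOptions &options) const;
};
} // namespace

// Registered as a dialect extension: the interface is attached only when a
// client links this library, while the dialect merely "promises" it exists.
void registerNVVMTargetInterfaceExternalModel(mlir::DialectRegistry &registry) {
  registry.addExtension(+[](mlir::MLIRContext *ctx,
                            mlir::NVVM::NVVMDialect *dialect) {
    mlir::NVVM::NVVMTargetAttr::attachInterface<NVVMTargetAttrImpl>(*ctx);
  });
}
```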
| mlir/lib/Target/LLVM/NVVM/Target.cpp | | |
|---|---|---|
| 56 | Do we need to do the init in the registration? Can we do the init when the serialization is called instead? | |
| mlir/unittests/Target/LLVM/SerializeNVVMTarget.cpp | | |
|---|---|---|
| 34 | Not clear to me why we need the native target? | |
| 47 | Can we tone this down? There is no reason to link all of MLIR into this unit test. | |
Switched to ptxas, added options for stopping compilation at LLVM IR & PTX, as well as unit tests for the added functionality.
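To illustrate the control flow these stop-at options enable, here is a minimal sketch; the enum and the helper functions are illustrative stand-ins, not the patch's actual API (the real patch threads this through the GPU target options):

```cpp
#include <string>

// Illustrative stand-ins for the three stages of the serializer.
enum class CompilationTarget { LLVMIR, PTX, Binary };

std::string emitLLVMIR();                      // translate the GPU module to LLVM IR
std::string emitPTX(const std::string &ir);    // run the NVPTX backend on the IR
std::string emitCubin(const std::string &ptx); // invoke ptxas on the PTX

std::string serialize(CompilationTarget target) {
  std::string ir = emitLLVMIR();
  if (target == CompilationTarget::LLVMIR)
    return ir;            // stop after LLVM IR
  std::string ptx = emitPTX(ir);
  if (target == CompilationTarget::PTX)
    return ptx;           // stop after PTX
  return emitCubin(ptx);  // full compilation to cubin
}
```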
Bringing @tra in here to get feedback on the pros/cons of invoking ptxas through a temp file vs. using the library APIs exposed by nvptxcompiler.
One immediate benefit is that we no longer have a dependency on the toolkit to build this. You can let it detect ptxas by setting the env variable CUDA_ROOT to the location of the toolkit or by adding it to PATH.
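A minimal sketch of that lookup order using LLVM's support APIs; the helper name findPtxas is ours, not the patch's:

```cpp
#include "llvm/ADT/SmallString.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/Path.h"
#include "llvm/Support/Process.h"
#include "llvm/Support/Program.h"
#include <optional>
#include <string>

std::optional<std::string> findPtxas() {
  // First honor CUDA_ROOT if it is set and points at a toolkit install.
  if (std::optional<std::string> root =
          llvm::sys::Process::GetEnv("CUDA_ROOT")) {
    llvm::SmallString<128> candidate(*root);
    llvm::sys::path::append(candidate, "bin", "ptxas");
    if (llvm::sys::fs::can_execute(candidate))
      return std::string(candidate);
  }
  // Otherwise fall back to searching the PATH.
  if (llvm::ErrorOr<std::string> path = llvm::sys::findProgramByName("ptxas"))
    return *path;
  return std::nullopt;
}
```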
| mlir/lib/Target/LLVM/NVVM/Target.cpp | | |
|---|---|---|
| 56 | Yes, it's possible; I'll change it. | |
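The deferred-initialization suggestion amounts to something like the following sketch, using LLVM's call_once machinery; the function name ensureNVPTXIsInitialized is illustrative:

```cpp
#include "llvm/Support/Threading.h"

// These are normally pulled in from the NVPTX backend's headers; declared
// here so the sketch is self-contained.
extern "C" {
void LLVMInitializeNVPTXTargetInfo();
void LLVMInitializeNVPTXTarget();
void LLVMInitializeNVPTXTargetMC();
void LLVMInitializeNVPTXAsmPrinter();
}

// Run backend initialization lazily, on the first serialization call,
// instead of eagerly at attribute registration time.
static void ensureNVPTXIsInitialized() {
  static llvm::once_flag initializeBackendOnce;
  llvm::call_once(initializeBackendOnce, []() {
    LLVMInitializeNVPTXTargetInfo();
    LLVMInitializeNVPTXTarget();
    LLVMInitializeNVPTXTargetMC();
    LLVMInitializeNVPTXAsmPrinter();
  });
}
```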
| mlir/unittests/Target/LLVM/SerializeNVVMTarget.cpp | | |
|---|---|---|
| 34 | Not needed, removed. | |
| 47 | Yes, I've removed it. | |
I think compiling with ptxas has merits. One can use a different version of ptxas in case of a performance regression.
| mlir/lib/Target/LLVM/NVVM/Target.cpp | | |
|---|---|---|
| 287 | Can we keep the PTX file? After codegen it is very natural to look at the PTX; if we could keep the file via a flag, I think it would be great. I implemented the dump-ptx flag for the gpu-to-cubin pass earlier; I guess we cannot use that flag here. | |
| mlir/lib/Target/LLVM/NVVM/Target.cpp | | |
|---|---|---|
| 287 | If you pass -debug-only=serialize-to-ptx (I'm changing it to serialize-to-isa) to mlir-opt, then the PTX will be printed to stdout. This mechanism also allows emitting PTX instead of a binary, so you have multiple ways of obtaining the PTX. | |
| 291 | You can; the args are passed through the targetOptions variable. Below you'll find that I add them to the ptxas invocation. | |
| 326 | We could consider adding a dedicated variable, but as it stands you can pass that variable through the cmdline and use --debug-only=serialize-to-binary to dump that output to stdout. | |
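For reference, this dump mechanism is just LLVM's standard debug machinery: output guarded by a debug type and enabled with --debug-only. A minimal sketch, where the function name dumpPtx is ours:

```cpp
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"

// Only prints when the tool is run with `--debug-only=serialize-to-isa`
// (and the build has debug output enabled).
#define DEBUG_TYPE "serialize-to-isa"

static void dumpPtx(llvm::StringRef ptx) {
  LLVM_DEBUG(llvm::dbgs() << "PTX:\n" << ptx << "\n");
}
```

Invoked from the command line as, e.g., `mlir-opt --debug-only=serialize-to-isa input.mlir`.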
> I think compiling with ptxas has merits. One can use a different version of ptxas in case of a performance regression.

nvptxcompiler provides the same facility, I believe, doesn't it?
Though I haven't personally worked with nvptxcompiler, it appears to be bundled with the toolkit. If one can use a different toolkit version for PTX compilation, then yes, it gives the same facility.
Recently, I used the driver (the current MLIR compilation path) and ptxas for PTX compilation. I noticed that even if I feed in the same PTX code, the driver occasionally generates different SASS than ptxas. Maybe I hit a corner case, but it was really hard to find the reason.
I believe it's crucial for us to know the potential disparities between nvptxcompiler and ptxas, if there are any.
My 2 cents: as far as I know, nvptxcompiler and ptxas should produce the same code for a given release. However, unlike ptxas, the only way to change the version of nvptxcompiler is to recompile the NVVMTarget library against a different version.
Adding support for nvptxcompiler is about an extra 20 lines of code in this patch, plus a couple in CMake. How do we feel about having both?
If nvptxcompiler is detected during the build then nvptxcompiler is used; if not, then ptxas is used. And we also give the user the ability to choose in CMake?
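Those ~20 lines would look roughly like this sketch against the nvPTXCompiler C API; the wrapper name compilePtxToCubin and the hard-coded sm_70 target are illustrative, not the patch's actual code:

```cpp
#include <nvPTXCompiler.h>
#include <cstring>
#include <vector>

// Compile a PTX string to cubin in-process instead of spawning ptxas.
// Returns an empty vector on failure.
std::vector<char> compilePtxToCubin(const char *ptx) {
  std::vector<char> cubin;
  nvPTXCompilerHandle compiler = nullptr;
  if (nvPTXCompilerCreate(&compiler, std::strlen(ptx), ptx) !=
      NVPTXCOMPILE_SUCCESS)
    return cubin;
  // nvPTXCompiler accepts ptxas-style options.
  const char *options[] = {"--gpu-name=sm_70", "-O3"};
  if (nvPTXCompilerCompile(compiler, 2, options) == NVPTXCOMPILE_SUCCESS) {
    size_t size = 0;
    nvPTXCompilerGetCompiledProgramSize(compiler, &size);
    cubin.resize(size);
    nvPTXCompilerGetCompiledProgram(compiler, cubin.data());
  }
  nvPTXCompilerDestroy(&compiler);
  return cubin;
}
```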
> If nvptxcompiler is detected during the build then nvptxcompiler is used; if not, then ptxas is used. And we also give the user the ability to choose in CMake?

Adding a CMake option looks good to me; I would pick a default and stick to it (that is, fail the CMake configuration with a helpful message).
I don't like "magic" fallbacks, because detection is fragile and it makes the behavior dependent on the user's environment. It's harder to know what you built, and when getting bug reports it also makes it harder to know what's happening.
Added NVPTX compiler support & removed unnecessary forward declarations. The CMake variable MLIR_ENABLE_NVPTXCOMPILER controls the usage of the NVPTXCompiler library and is disabled by default, as CMake 3.20 doesn't support CUDA::nvptxcompiler_static for finding the library.
| mlir/lib/Target/LLVM/NVVM/Target.cpp | | |
|---|---|---|
| 124 | A word of caution -- some Linux distributions scatter the CUDA SDK across the 'standard' Linux filesystem paths, so a single getToolkitPath() would not be able to find all the necessary bits, as libdevice and the binaries will be in different places. You may need additional heuristics along the lines of what the clang driver does. | |
| mlir/lib/Target/LLVM/NVVM/Target.cpp | | |
|---|---|---|
| 124 | Thank you, that's good to know. I'll see how to rework it or add docs indicating how to make it work. For ptxas, this mechanism searches several places in order, so it would work on Debian & Ubuntu, but the mechanism could be overburdening the user. | |
Reapplied clang-format.
Adds a couple of syntax tests & fixes a bug where the link option in the attribute was not displayed.
Thanks Fabian!
| mlir/CMakeLists.txt | | |
|---|---|---|
| 122 | The description isn't super clear, because it conflicts with what "the NVPTX backend" is in LLVM. | |
Changed the description of the CMake variable MLIR_ENABLE_NVPTXCOMPILER. Applied clang-format again.
> The description isn't super clear, because it conflicts with what "the NVPTX backend" is in LLVM.

What about: "Statically link the nvptxcompiler library instead of calling ptxas as a subprocess for compiling PTX to cubin"?