This is an archive of the discontinued LLVM Phabricator instance.

I'm potentially adding an extra field in gpu::TargetOptions for passing extra cmd arguments to tools, for example passing --max-reg-count to ptxas on NVIDIA. The question here is, what would be the AMD tool where I should pass those extra flags here?

arsenm added inline comments. Aug 1 2023, 11:00 AM

mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
453–459 (On Diff #545884): I'm in the process of deleting the control libraries. Everything except wave64 currently has a clear path to deletion.

fmorac added inline comments. Aug 1 2023, 11:07 AM

mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
453–459 (On Diff #545884): What would be the new mechanism? Also, any ideas on how to address compatibility with current ROCm installs? Should we ship this now and then update?

arsenm added inline comments. Aug 1 2023, 11:09 AM

mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
453–459 (On Diff #545884): For the most part, 'work correctly' as if it were a normal library. I would just not bother adding configuration points for the optional ones and just use the conservative default setting. Something will likely still be required for specific subtarget handling (and wave64 vs. wave32 may end up as completely separate builds).

fmorac added inline comments. Aug 1 2023, 11:17 AM

mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
453–459 (On Diff #545884): Ok, but in the new mechanism without the `amdgcn/bitcode` control libraries, is there an option for requesting fast math and all of these other options? I mean, this attribute only specifies that compilation should use fast math. For example, `rocdl.target<flags = {fast, wave64}>` only says that the compilation mechanism should use fast math. So could we leave those here? The actual implementation, which uses those control libraries, is at line 205 of mlir/lib/Target/LLVM/ROCDL/Target.cpp.
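
For context, the control libraries being discussed are picked purely by file name. The following is a minimal standalone sketch of that naming scheme, modelled on `getCommonBitcodeLibs` from the AMDGPUTarget.cpp diff in this archive. `ControlFlags` and `controlLibraryNames` are hypothetical names introduced only for illustration, and the sketch builds names without checking that the files exist.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the boolean options the diff reads from the
// target attribute (wave64, daz, finite_only, unsafe_math, fast, unsafe_sqrt).
struct ControlFlags {
  bool wave64 = true;
  bool daz = false;
  bool finiteOnly = false;
  bool unsafeMath = false;
  bool fastMath = false;
  bool correctSqrt = true;
};

// Returns the bitcode file names in the order the diff adds them.
// `isaVersion` is the chip name without the "gfx" prefix, e.g. "90a".
std::vector<std::string> controlLibraryNames(const ControlFlags &f,
                                             const std::string &isaVersion,
                                             const std::string &abiVersion) {
  // Each optional library exists in an "_on" and an "_off" variant.
  auto optLib = [](const std::string &name, bool on) -> std::string {
    return name + (on ? "_on" : "_off") + ".bc";
  };
  std::vector<std::string> libs = {"ocml.bc", "ockl.bc"};
  libs.push_back(optLib("oclc_daz_opt", f.daz));
  // `fast` implies both unsafe math and finite-only math.
  libs.push_back(optLib("oclc_unsafe_math", f.unsafeMath || f.fastMath));
  libs.push_back(optLib("oclc_finite_only", f.finiteOnly || f.fastMath));
  libs.push_back(optLib("oclc_correctly_rounded_sqrt", f.correctSqrt));
  libs.push_back(optLib("oclc_wavefrontsize64", f.wave64));
  libs.push_back("oclc_isa_version_" + isaVersion + ".bc");
  if (!abiVersion.empty())
    libs.push_back("oclc_abi_version_" + abiVersion + ".bc");
  return libs;
}
```

With `fastMath` set and `wave64` cleared for chip `gfx90a`, the list would contain `oclc_unsafe_math_on.bc`, `oclc_finite_only_on.bc` and `oclc_wavefrontsize64_off.bc`, which is why the flags in the attribute map so directly onto the libraries.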

arsenm added inline comments. Aug 1 2023, 12:17 PM

mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
453–459 (On Diff #545884): Ideally, the optimization of the libraries happens based on the fast math flags and attributes of the call sites. For example, the callers should mark their outgoing arguments with nofpclass(nan inf) for finite only. Attributor will then be able to propagate that into the library functions, assuming all callers also use fast math. For sqrt, the default will be correct; the backend will look for llvm.sqrt calls with appropriate !fpmath metadata for the not correctly rounded case. For general unsafe/approximate functions, it will be driven off some combination of afn or contract flags on the call. DAZ checks can also be folded out by Attributor based on the known denormal mode.

fmorac added inline comments. Aug 1 2023, 12:25 PM

mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
453–459 (On Diff #545884): Ok, I see. What about existing ROCm installations? Again, should we ship this as it is and change it in the future? I mean, I can just remove those control libs. Also, how about we keep these in the target attribute, as they could be used by a target-aware pass that swaps non-fast math to fast math?

arsenm added inline comments. Aug 1 2023, 12:29 PM

mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
453–459 (On Diff #545884): I don't understand what the "target" attribute means here. The "libraries" are kind of fake; D130096 switches to just injecting the global constants directly (not sure this is the solution for the target-cpu/isa version).

mehdi_amini added inline comments. Aug 1 2023, 12:39 PM

mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
453–459 (On Diff #545884): Here the "target" attribute is basically the "driver" of the GPU kernel compiler. It defines "how to invoke LLVM and the assembler to get a binary back".

fmorac added inline comments. Aug 1 2023, 12:42 PM

mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
453–459 (On Diff #545884): Ok, just to be clear, you are removing the libraries but keeping the control variables? For example, you plan to remove `oclc_daz_opt.bc` but keep the respective control variable? Which libraries are going to be removed? Are `oclc_isa_version_`, `oclc_abi_version_`, `oclc_wavefrontsize64` & `ockl` also going to be removed?

Target attributes are the new mechanism in MLIR for handling compilation of GPU modules; however, they also carry information about the target. For example, `#rocdl.target<chip="gfx90a", flags = {fast, noWave64, unsafe_math}>` says that the GPU module should be compiled for chip `gfx90a`, using the `fast`, `unsafe_math` and `wave32` flags.

arsenm added inline comments. Aug 1 2023, 2:20 PM

mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
453–459 (On Diff #545884): There are a few moving parts here. The first set of optional variables (unsafe math, DAZ, sqrt) should be completely gone in short order. I have a few more pieces to implement, but I think they are fully deletable within 1-2 months. These conceptually do not belong to the target, so I don't think you want to handle them here. The control library for the ABI version will likely be gone soon, replaced with clang directly emitting the global definition. The target (isa version and wave32/wave64) case is trickier. Long term I have my eye on deleting them, but it's not close enough to have a concrete idea of exactly what it looks like.

fmorac added inline comments. Aug 1 2023, 2:40 PM

mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
453–459 (On Diff #545884): Ok, so a target here doesn't map 1-1 to an LLVM target; the options inside a target are closer to the command-line options that would be passed to clang. Ok, I'll change these to emit global definitions and leave linking for `wave64`.
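
As a rough illustration of what "emit global definitions" could look like, the sketch below maps the math flags to (name, value) pairs that a compiler might define as constants in the module instead of linking an `_on`/`_off` library. Everything here is an assumption for illustration: `MathFlags` and `controlVariables` are hypothetical, and the `__oclc_*` global names are quoted from memory of the ROCm device libraries, so they should be checked against an actual installation. It is not the code that landed.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the flags carried by the target attribute.
struct MathFlags {
  bool daz = false;
  bool finiteOnly = false;
  bool unsafeMath = false;
  bool fastMath = false;
  bool correctSqrt = true;
};

using ControlVariable = std::pair<std::string, std::uint32_t>;

// Returns (global name, value) pairs to define directly in the module.
// The global names are assumptions, not taken from this review.
std::vector<ControlVariable> controlVariables(const MathFlags &f,
                                              std::uint32_t abiVersion) {
  std::vector<ControlVariable> vars;
  auto add = [&vars](const std::string &name, std::uint32_t value) {
    vars.emplace_back(name, value);
  };
  // `fast` implies finite-only and unsafe math, as in the library selection.
  add("__oclc_finite_only_opt", f.finiteOnly || f.fastMath);
  add("__oclc_unsafe_math_opt", f.unsafeMath || f.fastMath);
  add("__oclc_daz_opt", f.daz);
  add("__oclc_correctly_rounded_sqrt32", f.correctSqrt);
  add("__oclc_ABI_version", abiVersion);
  return vars;
}
```

The wave64 setting is deliberately left out of the sketch, since the comment above keeps it as a linked library.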

Switched to control variables & added more unit tests.

Harbormaster completed remote builds in B249831: Diff 546538. Aug 2 2023, 2:18 PM

Removed unnecessary fwd declarations.

Harbormaster completed remote builds in B250745: Diff 547734. Aug 7 2023, 6:51 AM

Adds a couple of syntax tests & fixes a bug not displaying the link option in the attribute.

Harbormaster completed remote builds in B250969: Diff 548031. Aug 7 2023, 9:21 PM

Ping for review. Solved merge conflicts.

Harbormaster completed remote builds in B251379: Diff 548613. Aug 9 2023, 8:44 AM

So, this code looks reasonable to me (and, as a bonus, I can tell how to invoke this when I'm generating binaries as a library), but I don't want to hand out a green check without @arsenm or someone else who's more involved in the intricacies of how LLVM gets called for GPU compilations signing off, unless we can't get a hold of anyone.

Rebasing.

Harbormaster completed remote builds in B251908: Diff 549326. Aug 11 2023, 5:54 AM

We're not committing to a stable API here; it's a first version that we need to plumb together and learn from.
If we want to be conservative, we can just remove every flag and add them later when @arsenm can provide more guidance; otherwise it seems fine to me to land and iterate in tree.

This revision is now accepted and ready to land. Aug 11 2023, 10:50 AM

@krzysz00 works for you?

Works for me

arsenm added inline comments. Aug 11 2023, 10:58 AM

mlir/lib/Target/LLVM/ROCDL/Target.cpp
434 (On Diff #549326): don't need the flushes and can use one dbgs()?

mlir/unittests/Target/LLVM/SerializeROCDLTarget.cpp
156 (On Diff #549326): ASSERT_FALSE(object->empty())?

Addressed the comments.

This revision was landed with ongoing or failed builds. Aug 11 2023, 12:44 PM

Closed by commit rG6a0feb1503e2: [mlir][ROCDL] Adds the ROCDL target attribute. (authored by fmorac).

This revision was automatically updated to reflect the committed changes.

fmorac added a commit: rG6a0feb1503e2: [mlir][ROCDL] Adds the ROCDL target attribute.

fmorac added a reverting change: rG1e77536e1d14: Revert "[mlir][ROCDL] Adds the ROCDL target attribute.". Aug 11 2023, 12:51 PM

Harbormaster completed remote builds in B252030: Diff 549492. Aug 11 2023, 1:48 PM

fmorac added a commit: rG068213130dc5: [mlir][ROCDL] Adds the ROCDL target attribute. Aug 11 2023, 2:48 PM

fmorac mentioned this in rGb43068e8707d: [mlir][gpu] Update GPU translation to accept binaries. Aug 11 2023, 5:29 PM

Revision Contents

Path	Size
mlir/include/mlir/Dialect/GPU/IR/GPUCompilationAttr.td	93 lines
mlir/lib/Dialect/GPU/CMakeLists.txt	23 lines
mlir/lib/Dialect/GPU/Targets/AMDGPUTarget.cpp	410 lines

Diff 536068

mlir/include/mlir/Dialect/GPU/IR/GPUCompilationAttr.td

//===-- GPUTargetAttr.td - GPU compilation attributes ------- tablegen --===//		//===-- GPUTargetAttr.td - GPU compilation attributes ------- tablegen --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file defines the GPU NVPTX target attribute.		// This file defines the GPU NVPTX & AMGDPU target attributes.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef GPU_COMPILATIONATTR		#ifndef GPU_COMPILATIONATTR
#define GPU_COMPILATIONATTR		#define GPU_COMPILATIONATTR

include "mlir/Dialect/GPU/IR/GPUBase.td"		include "mlir/Dialect/GPU/IR/GPUBase.td"
include "mlir/Dialect/GPU/IR/CompilationAttrInterfaces.td"		include "mlir/Dialect/GPU/IR/CompilationAttrInterfaces.td"
[... 65 lines hidden; enclosing context: bool $cppClass::getFastMath() const { ...]
return hasFlag("fast");		return hasFlag("fast");
}		}
bool $cppClass::getFtz() const {		bool $cppClass::getFtz() const {
return hasFlag("ftz");		return hasFlag("ftz");
}		}
}];		}];
}		}

		//===----------------------------------------------------------------------===//
		// GPU AMDGPU target attribute.
		//===----------------------------------------------------------------------===//

		def GPU_AMDGPUTargetAttr : GPU_Attr<"AMDGPUTarget", "amdgpu", [
		DeclareAttrInterfaceMethods<GPUTargetAttrInterface, [
		"serializeToObject"
		]>
		]> {
		let description = [{
		AMDGPU target attribute for controlling compilation of AMDGPU targets. All
		parameters decay into default values if not present.

		Examples:

		1. Target with default values.
		```
		gpu.module @mymodule [#gpu.amdgpu] attributes {...} {
		...
		}
		```

		2. Target with `gfx90a` chip and fast math.
		```
		gpu.module @mymodule [#gpu.amdgpu<chip = "gfx90a", flags = {fast, no_wave64}>] {
		...
		}
		```
		}];
		let parameters = (ins
		DefaultValuedParameter<"int", "2", "Optimization level to apply.">:$O,
		StringRefParameter<"Target triple.", "\"amdgcn-amd-amdhsa\"">:$triple,
		StringRefParameter<"Target chip.", "\"gfx900\"">:$chip,
		StringRefParameter<"Target chip features.", "\"\"">:$features,
		StringRefParameter<"ABI version.", "\"500\"">:$abi,
		OptionalParameter<"DictionaryAttr", "Target specific flags.">:$flags,
		OptionalParameter<"ArrayAttr", "Files to link to the LLVM module.">:$link
		);
		let assemblyFormat = [{
		(`<` struct($O, $triple, $chip, $features, $abi, $flags)^ `>`)?
		}];
		let builders = [
		AttrBuilder<(ins CArg<"int", "2">:$optLevel,
		CArg<"StringRef", "\"amdgcn-amd-amdhsa\"">:$triple,
		CArg<"StringRef", "\"gfx900\"">:$chip,
		CArg<"StringRef", "\"\"">:$features,
		CArg<"StringRef", "\"400\"">:$abiVersion,
		CArg<"DictionaryAttr", "nullptr">:$targetFlags,
		CArg<"ArrayAttr", "nullptr">:$linkFiles), [{
		return Base::get($_ctxt, optLevel, triple, chip, features, abiVersion,
		targetFlags, linkFiles);
		}]>
		];
		let skipDefaultBuilders = 1;
		let genVerifyDecl = 1;
		let extraClassDeclaration = [{
		bool hasFlag(StringRef flag) const;
		bool getWave64() const;
		bool getFastMath() const;
		bool getDaz() const;
		bool getFiniteOnly() const;
		bool getUnsafeMath() const;
		bool getCorrectSqrt() const;
		}];
		let extraClassDefinition = [{
		bool $cppClass::hasFlag(StringRef flag) const {
		if (DictionaryAttr flags = getFlags())
		return flags.get(flag) != nullptr;
		return false;
		}
		bool $cppClass::getWave64() const {
		return hasFlag("wave64") || !hasFlag("no_wave64");
		}
		bool $cppClass::getFastMath() const {
		return hasFlag("fast");
		}
		bool $cppClass::getDaz() const {
		return hasFlag("daz");
		}
		bool $cppClass::getFiniteOnly() const {
		return hasFlag("finite_only");
		}
		bool $cppClass::getUnsafeMath() const {
		return hasFlag("unsafe_math");
		}
		bool $cppClass::getCorrectSqrt() const {
		return !hasFlag("unsafe_sqrt");
		}
		}];
		}

#endif // GPU_COMPILATIONATTR		#endif // GPU_COMPILATIONATTR
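
The flag accessors in the attribute above have defaults that are easy to misread, so here is a minimal standalone sketch of their truth table. It assumes a plain `std::set` of flag names in place of the `DictionaryAttr` used by the real attribute; `FlagSet` is a hypothetical type introduced for illustration.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the `flags` dictionary of the attribute.
using FlagSet = std::set<std::string>;

bool hasFlag(const FlagSet &flags, const std::string &flag) {
  return flags.count(flag) != 0;
}

// Mirrors getWave64(): wave64 is the default and is only disabled when
// `no_wave64` is present without `wave64`.
bool getWave64(const FlagSet &flags) {
  return hasFlag(flags, "wave64") || !hasFlag(flags, "no_wave64");
}

// Mirrors getCorrectSqrt(): correctly rounded sqrt is on unless the
// `unsafe_sqrt` flag is given.
bool getCorrectSqrt(const FlagSet &flags) {
  return !hasFlag(flags, "unsafe_sqrt");
}
```

One consequence of this formulation is that specifying both `wave64` and `no_wave64` resolves to wave64 rather than being rejected.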

mlir/lib/Dialect/GPU/CMakeLists.txt

[... 42 lines hidden; enclosing context: add_mlir_dialect_library(MLIRGPUDialect ...]
MLIRSideEffectInterfaces		MLIRSideEffectInterfaces
MLIRSupport		MLIRSupport

PRIVATE		PRIVATE
MLIRGPUTargets		MLIRGPUTargets
)		)

add_mlir_dialect_library(MLIRGPUTargets		add_mlir_dialect_library(MLIRGPUTargets
		Targets/AMDGPUTarget.cpp
Targets/NVPTXTarget.cpp		Targets/NVPTXTarget.cpp

ADDITIONAL_HEADER_DIRS		ADDITIONAL_HEADER_DIRS
${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/GPU		${MLIR_MAIN_INCLUDE_DIR}/mlir/Dialect/GPU

LINK_COMPONENTS		LINK_COMPONENTS
Core		Core
MC		MC
Target		Target
${NVPTX_LIBS}		${NVPTX_LIBS}
		${AMDGPU_LIBS}

LINK_LIBS PUBLIC		LINK_LIBS PUBLIC
MLIRIR		MLIRIR
MLIRExecutionEngineUtils		MLIRExecutionEngineUtils
MLIRSupport		MLIRSupport
MLIRTargetLLVMIRExport		MLIRTargetLLVMIRExport

PRIVATE		PRIVATE
[... 117 lines hidden ...]
endif()		endif()

if(MLIR_ENABLE_ROCM_CONVERSIONS)		if(MLIR_ENABLE_ROCM_CONVERSIONS)
if (NOT ("AMDGPU" IN_LIST LLVM_TARGETS_TO_BUILD))		if (NOT ("AMDGPU" IN_LIST LLVM_TARGETS_TO_BUILD))
message(SEND_ERROR		message(SEND_ERROR
"Building mlir with ROCm support requires the AMDGPU backend")		"Building mlir with ROCm support requires the AMDGPU backend")
endif()		endif()

		if (DEFINED ROCM_PATH)
		set(DEFAULT_ROCM_PATH "${ROCM_PATH}" CACHE PATH "Fallback path to search for ROCm installs")
		elseif(DEFINED ENV{ROCM_PATH})
		set(DEFAULT_ROCM_PATH "$ENV{ROCM_PATH}" CACHE PATH "Fallback path to search for ROCm installs")
		else()
set(DEFAULT_ROCM_PATH "/opt/rocm" CACHE PATH "Fallback path to search for ROCm installs")		set(DEFAULT_ROCM_PATH "/opt/rocm" CACHE PATH "Fallback path to search for ROCm installs")
		endif()
		message(VERBOSE "MLIR Default ROCM toolkit path: ${DEFAULT_ROCM_PATH}")

target_compile_definitions(obj.MLIRGPUTransforms		target_compile_definitions(obj.MLIRGPUTransforms
PRIVATE		PRIVATE
__DEFAULT_ROCM_PATH__="${DEFAULT_ROCM_PATH}"		__DEFAULT_ROCM_PATH__="${DEFAULT_ROCM_PATH}"
MLIR_GPU_TO_HSACO_PASS_ENABLE=1		MLIR_GPU_TO_HSACO_PASS_ENABLE=1
)		)

		# Enable the gpu to amdgpu target.
		target_compile_definitions(obj.MLIRGPUTargets
		PRIVATE
		MLIR_GPU_AMDGPU_TARGET_ENABLED=1
		__DEFAULT_ROCM_PATH__="${DEFAULT_ROCM_PATH}"
		)
		target_compile_definitions(obj.MLIRGPUTransforms
		PRIVATE
		MLIR_GPU_AMDGPU_TARGET_ENABLED=1
		)

target_link_libraries(MLIRGPUTransforms		target_link_libraries(MLIRGPUTransforms
PRIVATE		PRIVATE
MLIRROCDLToLLVMIRTranslation		MLIRROCDLToLLVMIRTranslation
)		)
endif()		endif()
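
The ROCm path lookup is split between configure time (the CMake block above) and run time (the serializer falls back to `__DEFAULT_ROCM_PATH__` when no toolkit path is supplied). A small sketch of the combined precedence, written in C++ purely as a model of that logic; both function names are hypothetical, and `std::optional` stands in for CMake's `DEFINED` test, which is true even for an empty value.

```cpp
#include <optional>
#include <string>

// Configure-time choice: the ROCM_PATH CMake variable wins, then the
// ROCM_PATH environment variable, then the hard-coded /opt/rocm.
std::string defaultRocmPath(const std::optional<std::string> &cmakeVar,
                            const std::optional<std::string> &envVar) {
  if (cmakeVar)
    return *cmakeVar;
  if (envVar)
    return *envVar;
  return "/opt/rocm";
}

// Run-time choice: an explicitly requested toolkit path wins; an empty
// request falls back to the default baked in at build time.
std::string effectiveToolkitPath(const std::string &requested,
                                 const std::string &buildDefault) {
  return requested.empty() ? buildDefault : requested;
}
```

The sketch ignores CMake cache behaviour, which can keep an earlier `DEFAULT_ROCM_PATH` value across reconfigures.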

mlir/lib/Dialect/GPU/Targets/AMDGPUTarget.cpp

This file was added.

				//===- AMDGPUTarget.cpp - MLIR GPU Dialect AMDGPU target attribute --------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This files implements the AMDGPU target attribute.
				//
				//===----------------------------------------------------------------------===//

				#include "mlir/Dialect/GPU/IR/GPUDialect.h"

				using namespace mlir;
				using namespace mlir::gpu;

				#ifdef MLIR_GPU_AMDGPU_TARGET_ENABLED
				#include "mlir/ExecutionEngine/ModuleToObject.h"
				#include "mlir/Target/LLVMIR/Dialect/GPU/GPUToLLVMIRTranslation.h"
				#include "mlir/Target/LLVMIR/Dialect/LLVMIR/LLVMToLLVMIRTranslation.h"
				#include "mlir/Target/LLVMIR/Dialect/ROCDL/ROCDLToLLVMIRTranslation.h"
				#include "mlir/Target/LLVMIR/Export.h"

				#include "llvm/Support/FileSystem.h"
				#include "llvm/Support/Path.h"
				#include "llvm/Support/SourceMgr.h"
				#include "llvm/Support/TargetSelect.h"
				#include "llvm/TargetParser/TargetParser.h"

				#ifndef __DEFAULT_ROCM_PATH__
				#define __DEFAULT_ROCM_PATH__ ""
				#endif

				#define DEBUG_TYPE "serialize-to-object"

				namespace {
				struct InitTarget {
				InitTarget() {
				LLVMInitializeAMDGPUTarget();
				// [Inline comment, krzysz00, unsubmitted] Are these meant to be called before main() like this?
				LLVMInitializeAMDGPUTargetInfo();
				LLVMInitializeAMDGPUTargetMC();
				LLVMInitializeAMDGPUAsmParser();
				LLVMInitializeAMDGPUAsmPrinter();
				}
				};

				class SerializeToHSA : public ModuleToObject {
				public:
				SerializeToHSA(Operation &module, AMDGPUTargetAttr target,
				TargetOptions targetOptions = {});

				// Init the target.
				static void init();

				// Get the paths of ROCm device libraries. Function adapted from:
				// https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/AMDGPU.cpp
				void getCommonBitcodeLibs(llvm::SmallVector<std::string> &libs,
				SmallVector<char, 256> &libPath,
				StringRef isaVersion, bool wave64, bool daz,
				bool finiteOnly, bool unsafeMath, bool fastMath,
				bool correctSqrt, StringRef abiVer);

				// Removes unnecessary metadata from the loaded bitcode files.
				LogicalResult handleBitcodeFile(llvm::Module &module,
				llvm::TargetMachine &targetMachine) override;
				// Assembles the object.
				std::optional<SmallVector<char, 0>> assembleIsa(StringRef isa);

				// Create the HSACO object.
				std::optional<SmallVector<char, 0>> createHsaco(SmallVector<char, 0> &&ptx);

				std::optional<SmallVector<std::unique_ptr<llvm::Module>>>
				loadBitcodeFiles(llvm::Module &module,
				llvm::TargetMachine &targetMachine) override;

				std::optional<SmallVector<char, 0>>
				moduleToObject(llvm::Module &llvmModule,
				llvm::TargetMachine &targetMachine) override;

				private:
				AMDGPUTargetAttr target;
				StringRef toolkitPath;
				SmallVector<std::string> fileList;
				};
				} // namespace

				SerializeToHSA::SerializeToHSA(Operation &module, AMDGPUTargetAttr target,
				TargetOptions targetOptions)
				: ModuleToObject(module, target.getTriple(), target.getChip(),
				target.getFeatures(), target.getO()),
				target(target), toolkitPath(targetOptions.getToolkitPath()),
				fileList(targetOptions.getBitcodeFiles()) {
				if (toolkitPath.empty())
				toolkitPath = __DEFAULT_ROCM_PATH__;

				if (ArrayAttr files = target.getLink())
				for (Attribute attr : files.getValue())
				if (auto file = dyn_cast<StringAttr>(attr))
				fileList.push_back(file.str());
				}

				void SerializeToHSA::init() { static InitTarget target = InitTarget(); }

				void SerializeToHSA::getCommonBitcodeLibs(llvm::SmallVector<std::string> &libs,
				SmallVector<char, 256> &libPath,
				StringRef isaVersion, bool wave64,
				bool daz, bool finiteOnly,
				bool unsafeMath, bool fastMath,
				bool correctSqrt, StringRef abiVer) {
				auto addLib = [&](StringRef path) {
				if (!llvm::sys::fs::is_regular_file(path)) {
				getOperation().emitRemark() << "Bitcode library path: " << path
				<< " does not exist or is not a file.\n";
				return;
				}
				libs.push_back(path.str());
				};
				auto optLib = [](StringRef name, bool on) -> Twine {
				return name + (on ? "_on" : "_off");
				};
				auto getLibPath = [&libPath](Twine lib) {
				auto baseSize = libPath.size();
				llvm::sys::path::append(libPath, lib + ".bc");
				std::string path(StringRef(libPath.data(), libPath.size()).str());
				libPath.truncate(baseSize);
				return path;
				};

				// Add ROCm device libraries.
				addLib(getLibPath("ocml"));
				addLib(getLibPath("ockl"));
				addLib(getLibPath(optLib("oclc_daz_opt", daz)));
				addLib(getLibPath(optLib("oclc_unsafe_math", unsafeMath || fastMath)));
				addLib(getLibPath(optLib("oclc_finite_only", finiteOnly || fastMath)));
				addLib(getLibPath(optLib("oclc_correctly_rounded_sqrt", correctSqrt)));
				addLib(getLibPath(optLib("oclc_wavefrontsize64", wave64)));
				addLib(getLibPath("oclc_isa_version_" + isaVersion));
				if (abiVer.size())
				addLib(getLibPath("oclc_abi_version_" + abiVer));
				}

				std::optional<SmallVector<std::unique_ptr<llvm::Module>>>
				SerializeToHSA::loadBitcodeFiles(llvm::Module &module,
				llvm::TargetMachine &targetMachine) {
				// Try loading device libraries from the ROCm toolkit installation.
				StringRef pathRef = toolkitPath;
				if (pathRef.size()) {
				SmallVector<char, 256> path;
				path.insert(path.begin(), pathRef.begin(), pathRef.end());
				llvm::sys::path::append(path, "amdgcn", "bitcode");
				pathRef = StringRef(path.data(), path.size());
				if (!llvm::sys::fs::is_directory(pathRef)) {
				getOperation().emitRemark() << "ROCm amdgcn bitcode path: " << pathRef
				<< " does not exist or is not a directory.";
				return std::nullopt;
				}
				StringRef isaVersion =
				llvm::AMDGPU::getArchNameAMDGCN(llvm::AMDGPU::parseArchAMDGCN(chip));
				isaVersion.consume_front("gfx");
				getCommonBitcodeLibs(fileList, path, isaVersion, target.getWave64(),
				target.getDaz(), target.getFiniteOnly(),
				target.getUnsafeMath(), target.getFastMath(),
				target.getCorrectSqrt(), target.getAbi());
				}

				SmallVector<std::unique_ptr<llvm::Module>> bcFiles;
				if (failed(loadBitcodeFilesFromList(module.getContext(), targetMachine,
				fileList, bcFiles, true)))
				return std::nullopt;
				return bcFiles;
				}

				LogicalResult
				SerializeToHSA::handleBitcodeFile(llvm::Module &module,
				llvm::TargetMachine &targetMachine) {
				// Some ROCM builds don't strip this like they should
				if (auto *openclVersion = module.getNamedMetadata("opencl.ocl.version"))
				module.eraseNamedMetadata(openclVersion);
				// Stop spamming us with clang version numbers
				if (auto *ident = module.getNamedMetadata("llvm.ident"))
				module.eraseNamedMetadata(ident);
				return success();
				}

				//===----------------------------------------------------------------------===//
				// AMDGPU pipeline methods.
				//===----------------------------------------------------------------------===//
				#include "mlir/Support/FileUtilities.h"
				// [Inline comment, krzysz00, unsubmitted] These split-up blocks of `#include`s are vaguely bugging me but that's a nit
				#include "llvm/MC/MCAsmBackend.h"
				#include "llvm/MC/MCAsmInfo.h"
				#include "llvm/MC/MCCodeEmitter.h"
				#include "llvm/MC/MCContext.h"
				#include "llvm/MC/MCInstrInfo.h"
				#include "llvm/MC/MCObjectFileInfo.h"
				#include "llvm/MC/MCObjectWriter.h"
				#include "llvm/MC/MCParser/MCTargetAsmParser.h"
				#include "llvm/MC/MCRegisterInfo.h"
				#include "llvm/MC/MCStreamer.h"
				#include "llvm/MC/MCSubtargetInfo.h"
				#include "llvm/MC/TargetRegistry.h"
				#include "llvm/Support/FileUtilities.h"
				#include "llvm/Support/Program.h"

				std::optional<SmallVector<char, 0>> SerializeToHSA::assembleIsa(StringRef isa) {
				auto loc = getOperation().getLoc();

				StringRef targetTriple = this->triple;

				SmallVector<char, 0> result;
				llvm::raw_svector_ostream os(result);

				llvm::Triple triple(llvm::Triple::normalize(targetTriple));
				std::string error;
				const llvm::Target *target =
				llvm::TargetRegistry::lookupTarget(triple.normalize(), error);
				if (!target) {
				emitError(loc, Twine("failed to lookup target: ") + error);
				return std::nullopt;
				}

				llvm::SourceMgr srcMgr;
				srcMgr.AddNewSourceBuffer(llvm::MemoryBuffer::getMemBuffer(isa), SMLoc());

				const llvm::MCTargetOptions mcOptions;
				std::unique_ptr<llvm::MCRegisterInfo> mri(
				target->createMCRegInfo(targetTriple));
				std::unique_ptr<llvm::MCAsmInfo> mai(
				target->createMCAsmInfo(*mri, targetTriple, mcOptions));
				mai->setRelaxELFRelocations(true);
				std::unique_ptr<llvm::MCSubtargetInfo> sti(
				target->createMCSubtargetInfo(targetTriple, chip, features));

				llvm::MCContext ctx(triple, mai.get(), mri.get(), sti.get(), &srcMgr,
				&mcOptions);
				std::unique_ptr<llvm::MCObjectFileInfo> mofi(target->createMCObjectFileInfo(
				ctx, /*PIC=*/false, /*LargeCodeModel=*/false));
				ctx.setObjectFileInfo(mofi.get());

				SmallString<128> cwd;
				if (!llvm::sys::fs::current_path(cwd))
				ctx.setCompilationDir(cwd);

				std::unique_ptr<llvm::MCStreamer> mcStreamer;
				std::unique_ptr<llvm::MCInstrInfo> mcii(target->createMCInstrInfo());

				llvm::MCCodeEmitter *ce = target->createMCCodeEmitter(*mcii, ctx);
				llvm::MCAsmBackend *mab = target->createMCAsmBackend(*sti, *mri, mcOptions);
				mcStreamer.reset(target->createMCObjectStreamer(
				triple, ctx, std::unique_ptr<llvm::MCAsmBackend>(mab),
				mab->createObjectWriter(os), std::unique_ptr<llvm::MCCodeEmitter>(ce),
				*sti, mcOptions.MCRelaxAll, mcOptions.MCIncrementalLinkerCompatible,
				/*DWARFMustBeAtTheEnd*/ false));
				mcStreamer->setUseAssemblerInfoForParsing(true);

				std::unique_ptr<llvm::MCAsmParser> parser(
				createMCAsmParser(srcMgr, ctx, *mcStreamer, *mai));
				std::unique_ptr<llvm::MCTargetAsmParser> tap(
				target->createMCAsmParser(*sti, *parser, *mcii, mcOptions));

				if (!tap) {
				emitError(loc, "assembler initialization error");
				return {};
				}

				parser->setTargetParser(*tap);
				parser->Run(false);

				return result;
				}

				std::optional<SmallVector<char, 0>>
				SerializeToHSA::createHsaco(SmallVector<char, 0> &&ptx) {
				SmallVector<char, 0> isaBinary = std::move(ptx);
				auto loc = getOperation().getLoc();

				// Save the ISA binary to a temp file.
				int tempIsaBinaryFd = -1;
				SmallString<128> tempIsaBinaryFilename;
				if (llvm::sys::fs::createTemporaryFile("kernel", "o", tempIsaBinaryFd,
				tempIsaBinaryFilename)) {
				emitError(loc, "temporary file for ISA binary creation error");
				return {};
				}
				llvm::FileRemover cleanupIsaBinary(tempIsaBinaryFilename);
				llvm::raw_fd_ostream tempIsaBinaryOs(tempIsaBinaryFd, true);
				tempIsaBinaryOs << StringRef(isaBinary.data(), isaBinary.size());
				tempIsaBinaryOs.close();

				// Create a temp file for HSA code object.
				int tempHsacoFD = -1;
				SmallString<128> tempHsacoFilename;
				if (llvm::sys::fs::createTemporaryFile("kernel", "hsaco", tempHsacoFD,
				tempHsacoFilename)) {
				emitError(loc, "temporary file for HSA code object creation error");
				return {};
				}
				llvm::FileRemover cleanupHsaco(tempHsacoFilename);

				llvm::SmallString<32> lldPath(toolkitPath);
				llvm::sys::path::append(lldPath, "llvm", "bin", "ld.lld");
				int lldResult = llvm::sys::ExecuteAndWait(
				lldPath,
				{"ld.lld", "-shared", tempIsaBinaryFilename, "-o", tempHsacoFilename});
				if (lldResult != 0) {
				emitError(loc, "lld invocation error");
				return {};
				}

				// Load the HSA code object.
				auto hsacoFile = openInputFile(tempHsacoFilename);
				if (!hsacoFile) {
				emitError(loc, "read HSA code object from temp file error");
				return {};
				}

				StringRef buffer = hsacoFile->getBuffer();

				return SmallVector<char, 0>(buffer.begin(), buffer.end());
				}

				std::optional<SmallVector<char, 0>>
				SerializeToHSA::moduleToObject(llvm::Module &llvmModule,
				llvm::TargetMachine &targetMachine) {
				std::optional<std::string> serializedISA =
				translateToISA(llvmModule, targetMachine);
				if (!serializedISA) {
				getOperation().emitError() << "Failed translating the module to ISA.";
				return std::nullopt;
				}

				LLVM_DEBUG({
				llvm::dbgs() << "ISA for module: "
				<< dyn_cast<GPUModuleOp>(&getOperation()).getNameAttr()
				<< "\n";
				llvm::dbgs() << *serializedISA << "\n";
				llvm::dbgs().flush();
				});

				std::optional<SmallVector<char, 0>> assembledIsa =
				assembleIsa(serializedISA.value());

				if (!assembledIsa) {
				getOperation().emitError() << "Failed during ISA assembling.";
				return std::nullopt;
				}

				return createHsaco(std::move(assembledIsa.value()));
				}

				std::optional<SmallVector<char, 0>>
				AMDGPUTargetAttr::serializeToObject(Operation *module,
				const TargetOptions &options) const {
				assert(module && "The module must be non null.");
				if (!module)
				return std::nullopt;
				if (!mlir::isa<GPUModuleOp>(module)) {
				module->emitError("Module must be a GPU module.");
				return std::nullopt;
				}
				SerializeToHSA::init();
				SerializeToHSA serializer(*module, *this, options);
				return serializer.run();
				}

				#else
				// Provide a null vector for testing purposes.
				std::optional<SmallVector<char, 0>>
				AMDGPUTargetAttr::serializeToObject(Operation *module,
				const TargetOptions &options) const {
				assert(module && "The module must be non null.");
				if (!module)
				return std::nullopt;
				if (!mlir::isa<GPUModuleOp>(module)) {
				module->emitError("Module must be a GPU module.");
				return std::nullopt;
				}
				return SmallVector<char, 0>{};
				}
				#endif // MLIR_GPU_AMDGPU_TARGET_ENABLED

				LogicalResult
				AMDGPUTargetAttr::verify(function_ref<InFlightDiagnostic()> emitError,
				int optLevel, StringRef triple, StringRef chip,
				StringRef features, StringRef abiVersion,
				DictionaryAttr flags, ArrayAttr files) {
				if (optLevel < 0 || optLevel > 3) {
				emitError() << "The optimization level must be a number between 0 and 3.";
				return failure();
				}
				if (triple.empty()) {
				emitError() << "The target triple cannot be empty.";
				return failure();
				}
				if (chip.empty()) {
				emitError() << "The target chip cannot be empty.";
				return failure();
				}
				if (abiVersion != "400" && abiVersion != "500") {
				emitError() << "Invalid ABI version, it must be either `400` or `500`.";
				return failure();
				}
				if (files && llvm::all_of(files, [](::mlir::Attribute attr) {
				return attr && mlir::isa<StringAttr>(attr);
				})) {
				emitError() << "All the elements in the `link` array must be strings.";
				return failure();
				}
				return success();
				}
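
The checks in `AMDGPUTargetAttr::verify` can be exercised in isolation with a standalone sketch such as the one below. It is an approximation under stated assumptions: `Attr` is a hypothetical `std::variant` standing in for `mlir::Attribute`, and the function returns the error text instead of emitting a diagnostic. Note the `!` in front of `std::all_of`: the sketch rejects a `link` array when some element is not a string, which is what the error message describes, whereas the condition as written in this diff fires when every element is a string and so looks inverted.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for an attribute that may or may not be a string.
using Attr = std::variant<std::string, long>;

// Returns an error message, or std::nullopt when the parameters are valid.
// `files` may be null, modelling an absent `link` array.
std::optional<std::string> verifyTarget(int optLevel,
                                        const std::string &triple,
                                        const std::string &chip,
                                        const std::string &abiVersion,
                                        const std::vector<Attr> *files) {
  if (optLevel < 0 || optLevel > 3)
    return std::string(
        "The optimization level must be a number between 0 and 3.");
  if (triple.empty())
    return std::string("The target triple cannot be empty.");
  if (chip.empty())
    return std::string("The target chip cannot be empty.");
  if (abiVersion != "400" && abiVersion != "500")
    return std::string(
        "Invalid ABI version, it must be either `400` or `500`.");
  // Reject the array only if at least one element is not a string.
  if (files && !std::all_of(files->begin(), files->end(), [](const Attr &a) {
        return std::holds_alternative<std::string>(a);
      }))
    return std::string(
        "All the elements in the `link` array must be strings.");
  return std::nullopt;
}
```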
