This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Remove mlir-rocm-runner
ClosedPublic

Authored by csigg on Mar 11 2021, 12:36 PM.

Details

Summary

This change does for ROCm what D97463, D98203, D98360, and D98396 did for CUDA.

I did not try to compile SerializeToHsaco.cpp or test mlir/test/Integration/GPU/ROCM because I don't have an AMD card. I fixed the things that had obvious bit-rot though.

Diff Detail

Event Timeline

csigg created this revision.Mar 11 2021, 12:36 PM
csigg requested review of this revision.Mar 11 2021, 12:36 PM
csigg updated this revision to Diff 330164.Mar 12 2021, 12:10 AM

Do not depend on D98396.

csigg edited the summary of this revision.
csigg updated this revision to Diff 330168.Mar 12 2021, 12:17 AM
csigg edited the summary of this revision.

Rebase.

csigg updated this revision to Diff 330210.Mar 12 2021, 5:03 AM

Various fixes.

I gave this change a spin on genesiscloud.com (Radeon Instinct™ MI25, ROCm HIP version: 3.5.20214-a2917cd).

The code now compiles, but the integration tests all fail with an assembler initialization error.

I also had to remove the 'code-object-v3' feature because LLVM wouldn't recognize it.

For future reference, here is how I tested:

sudo apt-get install ninja-build
sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
wget -qO - https://apt.kitware.com/keys/kitware-archive-latest.asc | sudo apt-key add -
sudo apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main'
sudo apt-get update
sudo apt-get install cmake

wget https://reviews.llvm.org/D98447?download=true -O rocm.patch
git clone https://github.com/llvm/llvm-project.git
cd llvm-project
git apply ../rocm.patch
mkdir build
cd build

CC=clang-12 CXX=clang++-12 cmake ../llvm -GNinja \
    -DCMAKE_CUDA_COMPILER=/media/samsung_ssd_850_pro/cuda/cuda-11.1/bin/nvcc \
    -DLLVM_BUILD_EXAMPLES=ON \
    -DLLVM_TARGETS_TO_BUILD="host;AMDGPU" \
    -DLLVM_ENABLE_PROJECTS="clang;mlir;lld" \
    -DMLIR_INCLUDE_INTEGRATION_TESTS=ON \
    -DMLIR_ROCM_RUNNER_ENABLED=1 \
    -DBUILD_SHARED_LIBS=ON \
    -DLLVM_CCACHE_BUILD=OFF \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DLLVM_LIT_ARGS="-v -vv"

ninja check-mlir

@csigg May I ask whether the goal of this changeset is to merge everything under mlir-cpu-runner?

I'll check this changeset on a ROCm 4.0 machine and provide feedback.

csigg updated this revision to Diff 330234.Mar 12 2021, 7:02 AM

Another fix.

gpu-to-hsaco.mlir now passes. The other three integration tests still fail, but that seems unrelated to this change and should be dealt with separately.

Test output:

$ env LIT_FILTER=GPU/ROCM ninja check-mlir
-- Testing: 4 of 897 tests, 4 workers --
FAIL: MLIR :: Integration/GPU/ROCM/two-modules.mlir (1 of 4)
******************** TEST 'MLIR :: Integration/GPU/ROCM/two-modules.mlir' FAILED ********************
Script:
--
: 'RUN: at line 1'; /home/ubuntu/llvm-project/build/bin/mlir-opt /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir -gpu-kernel-outlining -pass-pipeline='gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm | /home/ubuntu/llvm-project/build/bin/mlir-cpu-runner --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_rocm_runtime.so --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void | /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir
--
Exit Code: 2

Command Output (stderr):
--
+ : 'RUN: at line 1'
+ /home/ubuntu/llvm-project/build/bin/mlir-opt /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir -gpu-kernel-outlining '-pass-pipeline=gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm
+ /home/ubuntu/llvm-project/build/bin/mlir-cpu-runner --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_rocm_runtime.so --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void
+ /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir
/home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir:37:1: error: 'func' op symbol declaration cannot have public visibility
func @mgpuMemGetDeviceMemRef1dInt32(%ptr : memref<?xi32>) -> (memref<?xi32>)
^
/home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir:37:1: note: see current operation: "func"() ( {
}) {sym_name = "mgpuMemGetDeviceMemRef1dInt32", type = (memref<?xi32>) -> memref<?xi32>} : () -> ()
Error: entry point not found
FileCheck error: '<stdin>' is empty.
FileCheck command line: /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir

--

********************
FAIL: MLIR :: Integration/GPU/ROCM/vector-transferops.mlir (2 of 4)
******************** TEST 'MLIR :: Integration/GPU/ROCM/vector-transferops.mlir' FAILED ********************
Script:
--
: 'RUN: at line 1'; /home/ubuntu/llvm-project/build/bin/mlir-opt /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir -gpu-kernel-outlining -pass-pipeline='gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm | /home/ubuntu/llvm-project/build/bin/mlir-cpu-runner --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_rocm_runtime.so --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void | /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir
--
Exit Code: 2

Command Output (stderr):
--
+ : 'RUN: at line 1'
+ /home/ubuntu/llvm-project/build/bin/mlir-cpu-runner --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_rocm_runtime.so --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void
+ /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir
+ /home/ubuntu/llvm-project/build/bin/mlir-opt /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir -gpu-kernel-outlining '-pass-pipeline=gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm
/home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir:67:3: error: 'scf.for' op operand #0 must be index, but got 'i64'
 scf.for %i = %c0 to %c4 step %c1 {
 ^
/home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir:67:3: note: see current operation: "scf.for"(%0, %2, %1) ( {
^bb0(%arg0: index): // no predecessors
 "std.store"(%4, %18, %arg0) : (f32, !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>, index) -> ()
 "std.store"(%4, %32, %arg0) : (f32, !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>, index) -> ()
 "scf.yield"() : () -> ()
}) : (i64, i64, i64) -> ()
Error: entry point not found
FileCheck error: '<stdin>' is empty.
FileCheck command line: /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir

--

********************
FAIL: MLIR :: Integration/GPU/ROCM/vecadd.mlir (3 of 4)
******************** TEST 'MLIR :: Integration/GPU/ROCM/vecadd.mlir' FAILED ********************
Script:
--
: 'RUN: at line 1'; /home/ubuntu/llvm-project/build/bin/mlir-opt /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir -gpu-kernel-outlining -pass-pipeline='gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm | /home/ubuntu/llvm-project/build/bin/mlir-cpu-runner --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_rocm_runtime.so --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void | /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir
--
Exit Code: 2

Command Output (stderr):
--
+ : 'RUN: at line 1'
+ /home/ubuntu/llvm-project/build/bin/mlir-cpu-runner --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_rocm_runtime.so --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void
+ /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir
+ /home/ubuntu/llvm-project/build/bin/mlir-opt /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir -gpu-kernel-outlining '-pass-pipeline=gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm
/home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir:38:3: error: 'scf.for' op operand #0 must be index, but got 'i64'
 scf.for %i = %c0 to %c5 step %c1 {
 ^
/home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir:38:3: note: see current operation: "scf.for"(%0, %2, %1) ( {
^bb0(%arg0: index): // no predecessors
 "std.store"(%3, %17, %arg0) : (f32, !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>, index) -> ()
 "std.store"(%3, %31, %arg0) : (f32, !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>, index) -> ()
 "scf.yield"() : () -> ()
}) : (i64, i64, i64) -> ()
Error: entry point not found
FileCheck error: '<stdin>' is empty.
FileCheck command line: /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir

--

********************
PASS: MLIR :: Integration/GPU/ROCM/gpu-to-hsaco.mlir (4 of 4)
********************
Failed Tests (3):
 MLIR :: Integration/GPU/ROCM/two-modules.mlir
 MLIR :: Integration/GPU/ROCM/vecadd.mlir
 MLIR :: Integration/GPU/ROCM/vector-transferops.mlir


Testing Time: 0.91s
 Excluded: 893
 Passed : 1
 Failed : 3
FAILED: tools/mlir/test/CMakeFiles/check-mlir
csigg added a comment.Mar 12 2021, 7:08 AM

@csigg May I ask whether the goal of this changeset is to merge everything under mlir-cpu-runner?

The diffs referenced in the description contain more context, but basically:

  • Move the GPU runner tests to the 'Integration' directory (file move).
  • Move gpuKernelsToBlob to Dialect/GPU/Transforms (file move).
  • Move the lowering to a GPU blob from mlir-rocm-runner into mlir-opt (code move).
  • Change the integration tests to run mlir-opt + mlir-cpu-runner instead of mlir-rocm-runner.
  • Remove mlir-rocm-runner.

I'll check this changeset on a ROCm 4.0 machine and provide feedback.

Thanks! Please use the latest revision ;-)

@csigg May I ask how you configured the build?

With this cmake command:

cmake -G Ninja ../llvm \
    -DLLVM_ENABLE_PROJECTS="mlir;lld" \
    -DLLVM_BUILD_EXAMPLES=ON \
    -DLLVM_TARGETS_TO_BUILD="X86;AMDGPU" \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DBUILD_SHARED_LIBS=ON \
    -DLLVM_BUILD_LLVM_DYLIB=ON \
    -DMLIR_ROCM_RUNNER_ENABLED=1

I'm getting:

tools/mlir/lib/Dialect/GPU/CMakeFiles/obj.MLIRGPU.dir/Transforms/SerializeToHsaco.cpp.o: In function `std::_Function_handler<std::unique_ptr<mlir::Pass, std::default_delete<mlir::Pass> > (), mlir::registerGpuSerializeToHsacoPass()::{lambda()#1}>::_M_invoke(std::_Any_data const&)':
SerializeToHsaco.cpp:(.text._ZNSt17_Function_handlerIFSt10unique_ptrIN4mlir4PassESt14default_deleteIS2_EEvEZNS1_31registerGpuSerializeToHsacoPassEvEUlvE_E9_M_invokeERKSt9_Any_data+0x34): undefined reference to `LLVMInitializeAMDGPUAsmParser'
tools/mlir/lib/Dialect/GPU/CMakeFiles/obj.MLIRGPU.dir/Transforms/SerializeToHsaco.cpp.o: In function `(anonymous namespace)::SerializeToHsacoPass::createHsaco(llvm::SmallVectorImpl<char> const&)':
SerializeToHsaco.cpp:(.text._ZN12_GLOBAL__N_120SerializeToHsacoPass11createHsacoERKN4llvm15SmallVectorImplIcEE+0x51c): undefined reference to `lld::elf::link(llvm::ArrayRef<char const*>, bool, llvm::raw_ostream&, llvm::raw_ostream&)'
mlir/test/lib/Transforms/TestConvertGPUKernelToHsaco.cpp
28

PTX -> LLVM IR?

@csigg I would need your help with the proper way to configure the build so I can test the patch. With the cmake command I use downstream, I run into the link errors mentioned above.

Also, in our downstream work, we did a rebase recently and we found we have to disable multi-threading when we run MLIR passes.

static LogicalResult runMLIRPasses(ModuleOp m) {

  // TODO(sjw): fix multi-threading race condition
  m.getContext()->disableMultithreading();

  PassManager pm(m.getContext());
  applyPassManagerCLOptions(pm);

To emit HSACO for the AMD GPU platform, lld is used, and lld itself does not seem to be thread-safe.

mlir/test/lib/Transforms/TestConvertGPUKernelToHsaco.cpp
37

"code-object-v3" is dropped in recent LLVM AMDGPU backend so this line is not needed.

csigg updated this revision to Diff 330431.Mar 13 2021, 1:26 AM

Simplify dependencies.

csigg updated this revision to Diff 330461.Mar 13 2021, 11:24 AM

Fix dependencies for libMLIR.so, fix tests.

csigg marked 2 inline comments as done.Mar 13 2021, 11:30 AM

@csigg May I ask how you configured the build?

With this cmake command:

cmake -G Ninja ../llvm \
    -DLLVM_ENABLE_PROJECTS="mlir;lld" \
    -DLLVM_BUILD_EXAMPLES=ON \
    -DLLVM_TARGETS_TO_BUILD="X86;AMDGPU" \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DBUILD_SHARED_LIBS=ON \
    -DLLVM_BUILD_LLVM_DYLIB=ON \
    -DMLIR_ROCM_RUNNER_ENABLED=1

My cmake command is above. Yours seems to be missing -DMLIR_INCLUDE_INTEGRATION_TESTS=ON.

Unrelated, but note that SHARED_LIBS + DYLIB is not supported.

I'm getting: ... `undefined reference to `lld::elf::link`

This should be fixed now. lldELF is not an LLVM target, so this needed a 'creative' solution.

I also managed to fix the tests.

csigg updated this revision to Diff 330466.Mar 13 2021, 12:13 PM

Another fix.

csigg added a comment.Mar 16 2021, 3:10 AM

Also, in our downstream work, we did a rebase recently and we found we have to disable multi-threading when we run MLIR passes.
To emit HSACO for the AMD GPU platform, lld is used, and lld itself does not seem to be thread-safe.

There is a lock around the lld call. I think it should be fine.
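For reference, such a lock could look like the following minimal C++ sketch. It assumes the lld::elf::link signature shown in the linker error above; the serializeWithLld helper is hypothetical and not the actual code in SerializeToHsaco.cpp.

// Minimal sketch (not the patch's code): serialize every call into lld
// behind a global mutex, since in-process lld is not known to be thread-safe.
#include <mutex>

#include "lld/Common/Driver.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/Support/raw_ostream.h"

// Hypothetical helper; 'args' mirrors an ld.lld command line (argv[0] first).
static bool serializeWithLld(llvm::ArrayRef<const char *> args) {
  // One lock shared by every thread that may reach the linker.
  static std::mutex lldMutex;
  std::lock_guard<std::mutex> lock(lldMutex);
  // canExitEarly=false keeps lld from calling exit() on failure.
  return lld::elf::link(args, /*canExitEarly=*/false,
                        /*stdoutOS=*/llvm::outs(),
                        /*stderrOS=*/llvm::errs());
}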

Thanks for cleaning this up! Looks good in general but I also cannot test this. It would be nice if we had a builder for this. @whchung do you have a setup that could test this continuously or at least periodically to make sure it does not break?

mlir/lib/Dialect/GPU/CMakeLists.txt
103

nit: which rocm runner?

123

Same here.

149

What does this do? I do not understand cmake enough to make sense of this.

csigg updated this revision to Diff 331201.Mar 17 2021, 3:18 AM

Update error messages.

csigg marked 2 inline comments as done.Mar 17 2021, 3:25 AM
csigg added inline comments.
mlir/lib/Dialect/GPU/CMakeLists.txt
149

MLIR_LLVM_LINK_COMPONENTS is the list of LLVM libraries that are linked into libMLIR.so. Normally it is populated by the LINK_COMPONENTS argument of add_mlir_library(), but here that call has already happened. We need an alias (really just a different name for an existing target) because the LINK_COMPONENTS names are implicitly prefixed with 'LLVM' when translated to target names.

whchung accepted this revision.Mar 18 2021, 3:44 PM

The patch has been verified on ROCm 4.0.

This revision is now accepted and ready to land.Mar 18 2021, 3:44 PM

Thanks for cleaning this up! Looks good in general but I also cannot test this. It would be nice if we had a builder for this. @whchung do you have a setup that could test this continuously or at least periodically to make sure it does not break?

@herhut We have CI nodes checking our work downstream. We used to have a job checking the tip of LLVM, but it has been offline for some time. I'll work with my colleague to bring it back online.

This revision was landed with ongoing or failed builds.Mar 19 2021, 12:24 AM
This revision was automatically updated to reflect the committed changes.
MaskRay added a subscriber: MaskRay.EditedJan 28 2022, 1:46 PM

Coming from https://reviews.llvm.org/D108850#3280778

I am not comfortable that mlir has such an API dependency on lld::elf::link.
lld::elf::link (the library usage) is still not recommended.
It also feels odd that mlir needs to use lld.

Library usage may not control the concurrency well.
You may switch to spawning an ld.lld process if LLVM_ENABLE_PROJECTS includes lld.
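As a rough illustration of that suggestion (not code from this patch), a serializer could shell out to an ld.lld binary via LLVM's process utilities; the linkWithLldProcess helper and its argument list are hypothetical.

// Minimal sketch: spawn ld.lld as a separate process instead of calling
// lld::elf::link in-process. Helper name and arguments are hypothetical.
#include <string>

#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Program.h"
#include "llvm/Support/raw_ostream.h"

static bool linkWithLldProcess(const std::string &inputObj,
                               const std::string &outputHsaco) {
  // Look up ld.lld on PATH; a build could instead use a configured path.
  llvm::ErrorOr<std::string> lldPath = llvm::sys::findProgramByName("ld.lld");
  if (!lldPath) {
    llvm::errs() << "ld.lld not found\n";
    return false;
  }
  // argv[0] is the program name, followed by a typical shared-object link.
  llvm::SmallVector<llvm::StringRef, 8> args = {*lldPath, "-shared", inputObj,
                                                "-o", outputHsaco};
  // Returns the child's exit code; non-zero means the link failed.
  int result = llvm::sys::ExecuteAndWait(*lldPath, args);
  if (result != 0) {
    llvm::errs() << "ld.lld exited with code " << result << "\n";
    return false;
  }
  return true;
}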

mlir/tools/mlir-rocm-runner/CMakeLists.txt