Details
- Reviewers
- aartbik, herhut, whchung, bondhugula
- Commits
- rGa825fb2c0733: [mlir] Remove mlir-rocm-runner
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Various fixes.
I gave this change a spin on genesiscloud.com (Radeon Instinct™ MI25, ROCm HIP version: 3.5.20214-a2917cd).
The code now compiles, but the integration tests all fail with an assembler initialization error.
I also had to remove the 'code-object-v3' feature because LLVM wouldn't recognize it.
For future reference, here is how I tested:
sudo apt-get install ninja-build
sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
wget -qO - https://apt.kitware.com/keys/kitware-archive-latest.asc | sudo apt-key add -
sudo apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main'
sudo apt-get update
sudo apt-get install cmake
wget https://reviews.llvm.org/D98447?download=true -O rocm.patch
git clone https://github.com/llvm/llvm-project.git
cd llvm-project
git apply ../rocm.patch
mkdir build
cd build
CC=clang-12 CXX=clang++-12 cmake ../llvm \
  '-DCMAKE_CUDA_COMPILER=/media/samsung_ssd_850_pro/cuda/cuda-11.1/bin/nvcc' \
  -DLLVM_BUILD_EXAMPLES=ON \
  '-DLLVM_TARGETS_TO_BUILD=host;AMDGPU' \
  -DLLVM_ENABLE_PROJECTS="clang;mlir;lld" \
  '-DMLIR_INCLUDE_INTEGRATION_TESTS=ON' \
  '-DMLIR_ROCM_RUNNER_ENABLED=1' \
  -DBUILD_SHARED_LIBS=ON \
  -DLLVM_CCACHE_BUILD=OFF \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  '-DLLVM_LIT_ARGS=-v -vv' \
  -GNinja
ninja check-mlir
@csigg Am I right that the goal of this changeset is to merge everything under mlir-cpu-runner?
I'll check this changeset on a ROCm 4.0 machine and provide feedback.
Another fix.
gpu-to-hsaco.mlir now passes. The other 3 integration tests still fail but that seems unrelated to this change and should be dealt with separately.
Test output:
$ env LIT_FILTER=GPU/ROCM ninja check-mlir
-- Testing: 4 of 897 tests, 4 workers --
FAIL: MLIR :: Integration/GPU/ROCM/two-modules.mlir (1 of 4)
******************** TEST 'MLIR :: Integration/GPU/ROCM/two-modules.mlir' FAILED ********************
Script:
--
: 'RUN: at line 1'; /home/ubuntu/llvm-project/build/bin/mlir-opt /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir -gpu-kernel-outlining -pass-pipeline='gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm | /home/ubuntu/llvm-project/build/bin/mlir-cpu-runner --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_rocm_runtime.so --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void | /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir
--
Exit Code: 2
Command Output (stderr):
--
+ : 'RUN: at line 1'
+ /home/ubuntu/llvm-project/build/bin/mlir-opt /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir -gpu-kernel-outlining '-pass-pipeline=gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm
+ /home/ubuntu/llvm-project/build/bin/mlir-cpu-runner --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_rocm_runtime.so --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void
+ /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir
/home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir:37:1: error: 'func' op symbol declaration cannot have public visibility
func @mgpuMemGetDeviceMemRef1dInt32(%ptr : memref<?xi32>) -> (memref<?xi32>)
^
/home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir:37:1: note: see current operation: "func"() ( {
}) {sym_name = "mgpuMemGetDeviceMemRef1dInt32", type = (memref<?xi32>) -> memref<?xi32>} : () -> ()
Error: entry point not found
FileCheck error: '<stdin>' is empty.
FileCheck command line: /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/two-modules.mlir
--
********************
FAIL: MLIR :: Integration/GPU/ROCM/vector-transferops.mlir (2 of 4)
******************** TEST 'MLIR :: Integration/GPU/ROCM/vector-transferops.mlir' FAILED ********************
Script:
--
: 'RUN: at line 1'; /home/ubuntu/llvm-project/build/bin/mlir-opt /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir -gpu-kernel-outlining -pass-pipeline='gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm | /home/ubuntu/llvm-project/build/bin/mlir-cpu-runner --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_rocm_runtime.so --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void | /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir
--
Exit Code: 2
Command Output (stderr):
--
+ : 'RUN: at line 1'
+ /home/ubuntu/llvm-project/build/bin/mlir-cpu-runner --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_rocm_runtime.so --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void
+ /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir
+ /home/ubuntu/llvm-project/build/bin/mlir-opt /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir -gpu-kernel-outlining '-pass-pipeline=gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm
/home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir:67:3: error: 'scf.for' op operand #0 must be index, but got 'i64'
  scf.for %i = %c0 to %c4 step %c1 {
  ^
/home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir:67:3: note: see current operation: "scf.for"(%0, %2, %1) ( {
^bb0(%arg0: index):  // no predecessors
  "std.store"(%4, %18, %arg0) : (f32, !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>, index) -> ()
  "std.store"(%4, %32, %arg0) : (f32, !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>, index) -> ()
  "scf.yield"() : () -> ()
}) : (i64, i64, i64) -> ()
Error: entry point not found
FileCheck error: '<stdin>' is empty.
FileCheck command line: /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vector-transferops.mlir
--
********************
FAIL: MLIR :: Integration/GPU/ROCM/vecadd.mlir (3 of 4)
******************** TEST 'MLIR :: Integration/GPU/ROCM/vecadd.mlir' FAILED ********************
Script:
--
: 'RUN: at line 1'; /home/ubuntu/llvm-project/build/bin/mlir-opt /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir -gpu-kernel-outlining -pass-pipeline='gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm | /home/ubuntu/llvm-project/build/bin/mlir-cpu-runner --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_rocm_runtime.so --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void | /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir
--
Exit Code: 2
Command Output (stderr):
--
+ : 'RUN: at line 1'
+ /home/ubuntu/llvm-project/build/bin/mlir-cpu-runner --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_rocm_runtime.so --shared-libs=/home/ubuntu/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void
+ /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir
+ /home/ubuntu/llvm-project/build/bin/mlir-opt /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir -gpu-kernel-outlining '-pass-pipeline=gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm
/home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir:38:3: error: 'scf.for' op operand #0 must be index, but got 'i64'
  scf.for %i = %c0 to %c5 step %c1 {
  ^
/home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir:38:3: note: see current operation: "scf.for"(%0, %2, %1) ( {
^bb0(%arg0: index):  // no predecessors
  "std.store"(%3, %17, %arg0) : (f32, !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>, index) -> ()
  "std.store"(%3, %31, %arg0) : (f32, !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>, index) -> ()
  "scf.yield"() : () -> ()
}) : (i64, i64, i64) -> ()
Error: entry point not found
FileCheck error: '<stdin>' is empty.
FileCheck command line: /home/ubuntu/llvm-project/build/bin/FileCheck /home/ubuntu/llvm-project/mlir/test/Integration/GPU/ROCM/vecadd.mlir
--
********************
PASS: MLIR :: Integration/GPU/ROCM/gpu-to-hsaco.mlir (4 of 4)
********************
Failed Tests (3):
  MLIR :: Integration/GPU/ROCM/two-modules.mlir
  MLIR :: Integration/GPU/ROCM/vecadd.mlir
  MLIR :: Integration/GPU/ROCM/vector-transferops.mlir
Testing Time: 0.91s
  Excluded: 893
  Passed  : 1
  Failed  : 3
FAILED: tools/mlir/test/CMakeFiles/check-mlir
The diffs referenced in the description contain more context, but basically:
- file-move gpu runner tests to 'Integration' directory.
- file-move gpuKernelsToBlob to Dialect/GPU/Transform
- code-move lowering to GPU blob from mlir-rocm-runner to mlir-opt
- change integration tests from mlir-rocm-runner to mlir-opt + cpu-runner
- remove mlir-rocm-runner
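For readers unfamiliar with the new flow, the "mlir-opt + cpu-runner" step in the list above can be sketched as a shell pipeline. The pass flags are copied from the RUN line quoted in the test output in this thread; `BUILD_DIR` and the function name `rocm_pipeline` are placeholders introduced for this sketch, and the function only prints the command instead of running it, so it does not need a ROCm build.

```shell
#!/bin/sh
# Sketch only: print the mlir-opt | mlir-cpu-runner | FileCheck pipeline
# that replaces mlir-rocm-runner for a single integration test.
# BUILD_DIR and rocm_pipeline are illustrative names, not part of the patch.
BUILD_DIR="${BUILD_DIR:-build}"

rocm_pipeline() {
  test_file="$1"
  # In an unquoted here-document, "\\" prints one literal backslash.
  cat <<EOF
$BUILD_DIR/bin/mlir-opt $test_file -gpu-kernel-outlining \\
  -pass-pipeline='gpu.module(strip-debuginfo,convert-gpu-to-rocdl,gpu-to-hsaco)' -gpu-to-llvm \\
| $BUILD_DIR/bin/mlir-cpu-runner \\
  --shared-libs=$BUILD_DIR/lib/libmlir_rocm_runtime.so \\
  --shared-libs=$BUILD_DIR/lib/libmlir_runner_utils.so \\
  --entry-point-result=void \\
| $BUILD_DIR/bin/FileCheck $test_file
EOF
}

# Example: show the command for one of the ROCm integration tests.
rocm_pipeline mlir/test/Integration/GPU/ROCM/vecadd.mlir
```

The GPU lowering happens entirely inside mlir-opt (the gpu-to-hsaco pass), and mlir-cpu-runner only needs the ROCm runtime wrapper library loaded through --shared-libs.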
I'll check this changeset on a ROCm 4.0 machine and provide feedback.
Thanks! Please use the latest revision ;-)
@csigg Could you share how you configure the build?
With this cmake command:
cmake -G Ninja ../llvm \
  -DLLVM_ENABLE_PROJECTS="mlir;lld" \
  -DLLVM_BUILD_EXAMPLES=ON \
  -DLLVM_TARGETS_TO_BUILD="X86;AMDGPU" \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DBUILD_SHARED_LIBS=ON \
  -DLLVM_BUILD_LLVM_DYLIB=ON \
  -DMLIR_ROCM_RUNNER_ENABLED=1
I'm getting:
tools/mlir/lib/Dialect/GPU/CMakeFiles/obj.MLIRGPU.dir/Transforms/SerializeToHsaco.cpp.o: In function `std::_Function_handler<std::unique_ptr<mlir::Pass, std::default_delete<mlir::Pass> > (), mlir::registerGpuSerializeToHsacoPass()::{lambda()#1}>::_M_invoke(std::_Any_data const&)':
SerializeToHsaco.cpp:(.text._ZNSt17_Function_handlerIFSt10unique_ptrIN4mlir4PassESt14default_deleteIS2_EEvEZNS1_31registerGpuSerializeToHsacoPassEvEUlvE_E9_M_invokeERKSt9_Any_data+0x34): undefined reference to `LLVMInitializeAMDGPUAsmParser'
tools/mlir/lib/Dialect/GPU/CMakeFiles/obj.MLIRGPU.dir/Transforms/SerializeToHsaco.cpp.o: In function `(anonymous namespace)::SerializeToHsacoPass::createHsaco(llvm::SmallVectorImpl<char> const&)':
SerializeToHsaco.cpp:(.text._ZN12_GLOBAL__N_120SerializeToHsacoPass11createHsacoERKN4llvm15SmallVectorImplIcEE+0x51c): undefined reference to `lld::elf::link(llvm::ArrayRef<char const*>, bool, llvm::raw_ostream&, llvm::raw_ostream&)'
mlir/test/lib/Transforms/TestConvertGPUKernelToHsaco.cpp | ||
---|---|---|
28 | PTX -> LLVM IR? |
@csigg Could you advise on the proper way to configure the build so that I can test the patch? With the cmake command I use downstream, I run into the link errors mentioned above.
Also, after a recent rebase of our downstream work, we found that we have to disable multi-threading when running MLIR passes:
static LogicalResult runMLIRPasses(ModuleOp m) {
  // TODO(sjw): fix multi-threading race condition
  m.getContext()->disableMultithreading();
  PassManager pm(m.getContext());
  applyPassManagerCLOptions(pm);
To emit HSACO for AMD GPUs, lld is used, and lld itself does not seem to be thread-safe.
mlir/test/lib/Transforms/TestConvertGPUKernelToHsaco.cpp | ||
---|---|---|
37 | "code-object-v3" is dropped in recent LLVM AMDGPU backend so this line is not needed. |
My cmake command is above. Yours seems to be missing -DMLIR_INCLUDE_INTEGRATION_TESTS=ON.
Unrelated, but note that SHARED_LIBS + DYLIB is not supported.
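Applying both remarks to the configure command quoted earlier in this thread gives roughly the following. This is an untested sketch: it drops -DLLVM_BUILD_LLVM_DYLIB=ON, adds -DMLIR_INCLUDE_INTEGRATION_TESTS=ON, and assumes it is run from a build directory next to ../llvm. The function name `configure_rocm_build` and the `RUN_CMAKE` switch are hypothetical names used only here.

```shell
#!/bin/sh
# Sketch: the downstream configure command with the two suggested changes.
# configure_rocm_build and RUN_CMAKE are hypothetical names for this example.
configure_rocm_build() {
  # Inside a function, "set --" only replaces the function's own arguments.
  set -- -G Ninja ../llvm \
    -DLLVM_ENABLE_PROJECTS="mlir;lld" \
    -DLLVM_BUILD_EXAMPLES=ON \
    -DLLVM_TARGETS_TO_BUILD="X86;AMDGPU" \
    -DCMAKE_BUILD_TYPE=Release \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DBUILD_SHARED_LIBS=ON \
    -DMLIR_ROCM_RUNNER_ENABLED=1 \
    -DMLIR_INCLUDE_INTEGRATION_TESTS=ON

  if [ "${RUN_CMAKE:-0}" = 1 ]; then
    cmake "$@"
  else
    # Dry run: print the command (shell quoting is not reproduced).
    printf 'cmake'
    printf ' %s' "$@"
    printf '\n'
  fi
}

configure_rocm_build
```

Setting RUN_CMAKE=1 runs the configure step for real; by default the script only prints the command so the flags can be reviewed first.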
I'm getting: ... undefined reference to `lld::elf::link`
This should be fixed now. lldELF is not an LLVM target, so this needed a 'creative' solution.
I also managed to fix the tests.
Thanks for cleaning this up! Looks good in general but I also cannot test this. It would be nice if we had a builder for this. @whchung do you have a setup that could test this continuously or at least periodically to make sure it does not break?
mlir/lib/Dialect/GPU/CMakeLists.txt | ||
---|---|---|
100 | nit: which rocm runner? | |
120 | Same here. | |
146 | What does this do? I do not understand cmake enough to make sense of this. |
mlir/lib/Dialect/GPU/CMakeLists.txt | ||
---|---|---|
146 | MLIR_LLVM_LINK_COMPONENTS is the list of LLVM libraries that are linked to libmlir.so. Normally, it is populated by the LINK_COMPONENTS argument of add_mlir_library(), but here this call has already happened. We need an alias (really just a different name for an existing target) because the LINK_COMPONENTS names are implicitly prefixed with 'LLVM' when translated to target names. |
@herhut We have CI nodes checking our work downstream. We used to have a job checking the tip of LLVM, but it has been offline for some time. I'll work with my colleague to bring it back online.
Coming from https://reviews.llvm.org/D108850#3280778
I am not comfortable that mlir has such an API dependency on lld::elf::link.
lld::elf::link (the library usage) is still not recommended.
It also feels odd that mlir needs to use lld.
Library usage may not control the concurrency well.
You may switch to spawning an ld.lld process if LLVM_ENABLE_PROJECTS includes lld.
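As a rough illustration of the process-based alternative, the link step could look like the sketch below. It assumes the in-process call is equivalent to `ld.lld -shared <object> -o <hsaco>`, which should be checked against SerializeToHsaco.cpp; the function name `link_hsaco` and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: produce the HSACO by running ld.lld as a separate process instead
# of calling lld::elf::link() in-process. link_hsaco is a hypothetical helper.
link_hsaco() {
  obj="$1"   # relocatable AMDGPU object emitted by the assembler step
  out="$2"   # resulting shared object (the HSACO blob)
  if ! command -v ld.lld >/dev/null 2>&1; then
    echo "ld.lld not found; is lld listed in LLVM_ENABLE_PROJECTS?" >&2
    return 127
  fi
  # Assumed to mirror the arguments of the in-process link call.
  ld.lld -shared "$obj" -o "$out"
}

# Usage (file names are placeholders):
#   link_hsaco kernel.o kernel.hsaco
```

Each invocation gets its own process, so lld's global state is not shared between concurrently running passes.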