Creates rocdl.lane_id op with llvm conversion to:
__device__ static unsigned int __lane_id() { return __builtin_amdgcn_mbcnt_hi( -1, __builtin_amdgcn_mbcnt_lo(-1, 0)); }
Paths
| Differential D154666
[MLIR][ROCDL] Add conversion for gpu.lane_id to ROCDL ClosedPublic Authored by sjw36 on Jul 6 2023, 4:25 PM.
Details Summary Creates rocdl.lane_id op with llvm conversion to: __device__ static unsigned int __lane_id() { return __builtin_amdgcn_mbcnt_hi( -1, __builtin_amdgcn_mbcnt_lo(-1, 0)); }
Diff Detail
Event TimelineComment Actions As a high-level note, wouldn't it make more sense to define a rocdl wrapper around the mbcnt intrinsic and then rewrite to that so than we're not hiding a substantial bit of translation in the LLVM IR builder? I've generally seen the rocdl dialect as the place for 1:1 wrappers around LLVM functionality. Second, as a minor note, when you have the final lane_id number, would it be possible to put range metadata on it - probably a conservative [0, 63] value, but, still, that'll allow for optimizations. Comment Actions
Let's not keep doing that. Just call intrinsics directly. All of those straight intrinsic wrappers introduce more trouble Comment Actions
I missed the "dialect" part here. Just don't call the ockl wrapper for this Comment Actions The ROCDL dialect is meant to directly represent the AMDGPU-specific intrinsics in LLVM IR within MLIR, and to work with the LLVM dialect (which is LLVM-IR-in-MLIR that you can run through a simple translation layer) Comment Actions
Comment Actions This overall design works, but I've got minor nitpicks
Comment Actions Overall, looks good
This revision is now accepted and ready to land.Jul 25 2023, 10:57 AM This revision was landed with ongoing or failed builds.Jul 26 2023, 8:13 AM Closed by commit rGcdf7ca6db76b: [MLIR][ROCDL] Add conversion for gpu.lane_id to ROCDL (authored by SJW <swaters@amd.com>, committed by krzysz00). · Explain Why This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 543104 mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
mlir/lib/Conversion/GPUToROCDL/LowerGpuOpsToROCDLOps.cpp
mlir/test/Conversion/GPUToROCDL/gpu-to-rocdl.mlir
mlir/test/Target/LLVMIR/rocdl.mlir
|
Would it be possible to instead explicitly spell out the arguments and the result type, given that they're known from the LLVM?