Creates rocdl.lane_id op with llvm conversion to:
__device__ static unsigned int __lane_id() { return __builtin_amdgcn_mbcnt_hi( -1, __builtin_amdgcn_mbcnt_lo(-1, 0)); }
Differential D154666
[MLIR][ROCDL] Add conversion for gpu.lane_id to ROCDL sjw36 on Jul 6 2023, 4:25 PM. Authored by
Details Creates rocdl.lane_id op with llvm conversion to: __device__ static unsigned int __lane_id() { return __builtin_amdgcn_mbcnt_hi( -1, __builtin_amdgcn_mbcnt_lo(-1, 0)); }
Diff Detail
Event TimelineComment Actions As a high-level note, wouldn't it make more sense to define a rocdl wrapper around the mbcnt intrinsic and then rewrite to that so than we're not hiding a substantial bit of translation in the LLVM IR builder? I've generally seen the rocdl dialect as the place for 1:1 wrappers around LLVM functionality. Second, as a minor note, when you have the final lane_id number, would it be possible to put range metadata on it - probably a conservative [0, 63] value, but, still, that'll allow for optimizations. Comment Actions Let's not keep doing that. Just call intrinsics directly. All of those straight intrinsic wrappers introduce more trouble Comment Actions The ROCDL dialect is meant to directly represent the AMDGPU-specific intrinsics in LLVM IR within MLIR, and to work with the LLVM dialect (which is LLVM-IR-in-MLIR that you can run through a simple translation layer) Comment Actions
Comment Actions This overall design works, but I've got minor nitpicks
Comment Actions Overall, looks good
|
Would it be possible to instead explicitly spell out the arguments and the result type, given that they're known from the LLVM?