Until now, the GPU translation to NVVM or ROCDL intrinsics relied on the
presence of the generic gpu.kernel attribute to attach additional LLVM IR
metadata to the relevant functions. This would be problematic if each dialect
were to handle the conversion of its own options, which is the intended
direction for the translation infrastructure. Introduce nvvm.kernel and
rocdl.kernel in addition to gpu.kernel and base translation on these new
attributes instead.
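
As a rough illustration of the intended result, the sketch below shows how lowered kernel functions might look once they carry the dialect-specific attributes. The function names are hypothetical, and the exact textual syntax is an assumption that depends on the MLIR version in use. The comments describe the translation behaviour one would expect for each target.

```mlir
// NVVM path: the function keeps the generic gpu.kernel marker and gains
// nvvm.kernel. Translation to LLVM IR keys on nvvm.kernel and is expected
// to register the function as a kernel under !nvvm.annotations.
llvm.func @nvvm_kernel_example() attributes {gpu.kernel, nvvm.kernel} {
  llvm.return
}

// ROCDL path: the function keeps gpu.kernel and gains rocdl.kernel.
// Translation keys on rocdl.kernel and is expected to emit the function
// with the amdgpu_kernel calling convention.
llvm.func @rocdl_kernel_example() attributes {gpu.kernel, rocdl.kernel} {
  llvm.return
}
```

With this split, each dialect's translation only needs to recognize its own attribute and never has to interpret gpu.kernel.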
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo